# Detecting objects unique to your business using Amazon Rekognition Custom Labels and sending predictions for human review using Amazon A2I

Developing a custom model to analyze images is a significant undertaking that requires time, expertise, and resources, and it often takes months to complete. It can also require thousands or tens of thousands of hand-labeled images to give the model enough data to make accurate decisions. Gathering this data can take months and can require large teams of labelers to prepare it for use in machine learning. In addition, setting up a workflow for auditing or reviewing model predictions to validate adherence to your requirements can further add to the overall complexity.

With Amazon Rekognition Custom Labels, you can identify the objects and scenes in images that are specific to your business needs. For example, you can find your logo in social media posts, identify your products on store shelves, classify machine parts in an assembly line, distinguish healthy and infected plants, or detect animated characters in videos. Amazon Rekognition Custom Labels builds on Amazon Rekognition's existing capabilities, which are already trained on tens of millions of images across many categories. Instead of thousands of images, you can upload a small set of training images (typically a few hundred images or fewer) that are specific to your use case. Predictions from Amazon Rekognition Custom Labels can be sent to Amazon Augmented AI (Amazon A2I), which makes it easy to integrate human review into your machine learning workflow. This allows you to automatically have humans step into your ML pipeline to review results below a confidence threshold, to set up review and auditing workflows, and to augment the prediction results to improve model accuracy.

In this post, we show you how to build a custom object detection model trained to detect pepperoni slices on a pizza using Amazon Rekognition Custom Labels, with a dataset labeled using Amazon SageMaker Ground Truth. We then show how to create your own private workforce and set up an Amazon A2I workflow definition to conditionally trigger human loops for review and augmentation tasks. You can use the annotations created by Amazon A2I for model re-training.

## Contents

1. [Prerequisites](#Prerequisites)
2. [Step 1 - Train an Amazon Rekognition custom model](#Step-1---Train-an-Amazon-Rekognition-custom-model)
3. [Step 2 - Setup an Amazon A2I Flow Definition](#Step-2---Setup-an-Amazon-A2I-Flow-Definition)
4. [Step 3 - Start Human Loops](#Step-3---Start-Human-Loops)
5. [Step 4 - Evaluate results and re-train the model](#Step-4---Evaluate-results-and-re-train-the-model)
6. [Cleanup](#Cleanup)

## Prerequisites

Before getting started, you need to create your human workforce, set up your Amazon SageMaker Studio notebook and download the datasets we will use in this post.

### Creating your human workforce
This step requires you to use the AWS Console. We will create a private workteam and add only one user (you) to it. To create a private team:

1. Go to AWS Console > Amazon SageMaker > Labeling workforces
1. Click "Private" and then "Create private team".
1. Enter the desired name for your private workteam.
1. Enter your own email address in the "Email addresses" section.
1. Enter the name of your organization and a contact email to administer the private workteam.
1. Click "Create Private Team".
1. The AWS Console should now return to AWS Console > Amazon SageMaker > Labeling workforces. Your newly created team should be visible under "Private teams". Next to it you will see an ARN, which is a long string that looks like arn:aws:sagemaker:region-name:123456789012:workteam/private-crowd/team-name. Enter this ARN as the value of `WORKTEAM_ARN` in the Initialize Variables cell below and execute the cell.
1. You should get an email from no-reply@verificationemail.com that contains your workforce username and password.
1. In AWS Console > Amazon SageMaker > Labeling workforces, click on the URL in Labeling portal sign-in URL. Use the email/password combination from Step 8 to log in (you will be asked to create a new, non-default password).
1. This is your private worker's interface. When we start human loops in Step 3 below, your tasks should appear in this window. You can invite your colleagues to participate in the review tasks by clicking the "Invite new workers" button.


### Initialize Variables

Use the following cell to specify:
* `WORKTEAM_ARN` : the Amazon Resource Name (ARN) of the private work team you want to use for this walkthrough.
* `BUCKET` : The Amazon S3 bucket you want to use to store input and output data. This bucket must have CORS enabled (see note below).
*  `REGION` : The AWS Region your notebook instance, `BUCKET` and work team are located in. Note that these resources must be located in the same AWS Region.

**Important**: The bucket you specify for `BUCKET` must have CORS enabled. You can enable CORS by adding a policy similar to the following to your Amazon S3 bucket. To learn how to add CORS to an S3 bucket, see [CORS Permission Requirement](https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-permissions-security.html#a2i-cors-update) in the Amazon A2I documentation. 


```
[{
   "AllowedHeaders": [],
   "AllowedMethods": ["GET"],
   "AllowedOrigins": ["*"],
   "ExposeHeaders": []
}]
```

If you do not add a CORS configuration to the S3 bucket that contains your image input data, human review tasks for those input data objects will fail.
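
If you would rather apply the CORS rule from the notebook than from the console, the following is a minimal sketch. It assumes the boto3 S3 client call `put_bucket_cors` and a role that is allowed to change the bucket's CORS configuration. `build_cors_configuration` is a hypothetical helper name used only for illustration, and the AWS call itself is left commented out so that nothing changes until you opt in.

```python
def build_cors_configuration():
    """Return the CORS rule shown above, wrapped in the shape boto3 expects."""
    return {
        "CORSRules": [
            {
                "AllowedHeaders": [],
                "AllowedMethods": ["GET"],
                "AllowedOrigins": ["*"],
                "ExposeHeaders": [],
            }
        ]
    }


cors_configuration = build_cors_configuration()
print(cors_configuration)

# Assumption: BUCKET is already set and your credentials allow s3:PutBucketCORS.
# import boto3
# boto3.client("s3").put_bucket_cors(Bucket=BUCKET, CORSConfiguration=cors_configuration)
```

Note that applying a new configuration this way replaces any CORS rules already set on the bucket.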



In [None]:
# initialization
import os
import io
import boto3
import botocore
from time import gmtime, strftime 

REGION = 'us-east-1'
WORKTEAM_ARN= 'your-workteam-arn'
BUCKET = 'your-s3-bucket-name'
PREFIX = 'a2i-rekogCL-demo-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
s3 = boto3.client('s3')
s3r = boto3.resource('s3')

from sagemaker import get_execution_role
# Setting Role to the default SageMaker Execution Role
ROLE = get_execution_role()
display(ROLE)

### Upload the Amazon SageMaker Ground Truth object detection dataset and manifest to Amazon S3
For our post we will use a pre-labeled object detection dataset created using the Amazon SageMaker Ground Truth bounding box labeling job task type. The image files from this dataset are available in the **data/images** folder and the manifest file is available in the **data/manifest** folder in this repo. Let's upload the images and the manifest to an S3 bucket that we will use for training our Amazon Rekognition custom model.

In [None]:
# upload images
folderpath = r"data/images" # make sure to put the 'r' in front and provide the folder where your files are
filepaths  = [os.path.join(folderpath, name) for name in os.listdir(folderpath) if not name.startswith('.')] # do not select hidden directories
for path in filepaths:
    s3r.meta.client.upload_file(path, BUCKET, PREFIX+'/'+path)
#upload test image in s3
folderpath = r"data/testimages" # make sure to put the 'r' in front and provide the folder where your files are
filepaths  = [os.path.join(folderpath, name) for name in os.listdir(folderpath) if not name.startswith('.')] # do not select hidden directories
for path in filepaths:
    s3r.meta.client.upload_file(path, BUCKET, PREFIX+'/'+path)

# replace bucket, prefix entries from the template and upload the manifest file
tempname = ("bucket","prefix")
realname = (BUCKET,PREFIX)
# 'with' closes both files even if an error occurs while rewriting the lines
with open('./data/manifest/output-change.manifest', 'r') as f1, open('./data/manifest/output.manifest', 'w') as f2:
    for line in f1:
        for check, rep in zip(tempname, realname):
            line = line.replace(check, rep)
        f2.write(line)

s3r.meta.client.upload_file('./data/manifest/output.manifest', BUCKET, PREFIX+'/'+'data/manifest/output.manifest')
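
Before moving on, it can help to confirm that every line of the rewritten manifest is valid JSON and points at the bucket and prefix you just uploaded to. The sketch below is an optional check, not part of the original walkthrough. `find_bad_manifest_lines` is a hypothetical helper, the sample lines are made up, and it assumes each manifest line carries a `source-ref` key of the form `s3://<bucket>/<prefix>/...`.

```python
import json


def find_bad_manifest_lines(lines, bucket, prefix):
    """Return (line_number, reason) pairs for manifest lines that look wrong."""
    expected_start = f"s3://{bucket}/{prefix}/"
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        source_ref = entry.get("source-ref", "")
        if not source_ref.startswith(expected_start):
            problems.append((number, f"unexpected source-ref: {source_ref!r}"))
    return problems


# Made-up sample: the second line still contains the template placeholders.
sample_lines = [
    '{"source-ref": "s3://my-bucket/my-prefix/data/images/a.jpg"}',
    '{"source-ref": "s3://bucket/prefix/data/images/b.jpg"}',
]
problems = find_bad_manifest_lines(sample_lines, "my-bucket", "my-prefix")
print(problems)
```

In the notebook you could pass the open `output.manifest` file object together with `BUCKET` and `PREFIX`; an empty list means no problems were found.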

## Step 1 - Train an Amazon Rekognition custom model

We will perform the steps to train the custom object detection model using the AWS console. Let's begin by selecting the [Amazon Rekognition custom labels in the AWS Console](https://console.aws.amazon.com/rekognition/custom-labels#/). The Amazon Rekognition Custom Labels console is where you create and manage your models. The first time you use the console, Amazon Rekognition Custom Labels asks to create an Amazon S3 bucket in your account. The bucket is used to store your Amazon Rekognition Custom Labels projects, datasets, and models. You can't use the Amazon Rekognition Custom Labels console unless the bucket is created. **Please note** that this bucket is not the same as the S3 bucket you defined for this notebook in the Prerequisites section. 

### Create a Project

Within Amazon Rekognition Custom Labels, you use projects to manage the models that you create. A project manages the input images and labels, datasets, model training, model evaluation, and running the model. For more information, see Creating an Amazon Rekognition Custom Labels Project (https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cp-create-project.html).

### Create a dataset

In an Amazon Rekognition Custom Labels project, datasets contain the images, assigned labels, and bounding boxes that you use to train and test a custom model. You can create and manage datasets by using the Amazon Rekognition Custom Labels console. You can't create a dataset with the Amazon Rekognition Custom Labels API. For our model training, we already have the pre-labeled images from Amazon SageMaker GroundTruth that we need to create our training dataset. We will use the images and the manifest file we uploaded to the S3 bucket in the Prerequisites section. For Image Location, please select "Import Images labeled by Amazon SageMaker GroundTruth" and provide the S3 path to the manifest file you uploaded in the cell above. 

**Note:** You should get a prompt to provide the S3 bucket policy when you provide the S3 path above as shown here. Please copy the bucket policy as requested:
* Make sure that your S3 bucket is correctly configured
* You've specified an external S3 bucket: your bucket name
* To use the images in this bucket, copy the policy below (to copy, choose the preceding link text). 
* Paste the policy into the "Bucket Policy" section of your bucket.

### Train an Amazon Rekognition Custom Model

Training a model requires labeled images and algorithms to train the model. Amazon Rekognition Custom Labels simplifies this process by choosing the algorithm for you. Accurately labeled images are very important to the process because they influence the algorithm that is chosen to train the model. After training a model, you can evaluate its performance. For more information, see Evaluating a Trained Amazon Rekognition Custom Labels Model (https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tr-train-results.html).

To train an Amazon Rekognition Custom Model:
* Choose Train Model.
* For Choose project, choose your newly created project.
* For Choose training dataset, choose your newly created dataset.
* For Choose a test dataset, provide the training dataset you created above.
* Click Train at the bottom right of the page.
    
Training should take approximately 45 minutes to complete. When training completes, you can evaluate the performance of the model. To help you, Amazon Rekognition Custom Labels provides summary metrics and evaluation metrics for each label. For information about the available metrics, see [Metrics for Evaluating Your Model](https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tr-metrics-use.html). To improve your model using metrics, see [Improving an Amazon Rekognition Custom Labels Model](https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tr-improve-model.html). For more details, refer to the [Amazon Rekognition Custom Labels documentation](https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/tm-train-model.html).

### Let's check the status of our Amazon Rekognition custom model

In [None]:
!aws rekognition describe-projects

In [None]:
#Replace the project-arn below with the project-arn of your project from the describe-projects output above
!aws rekognition describe-project-versions --project-arn 'your-project-arn'

In [None]:
# Start the Project so we can run object detection using this model
# Copy/paste the ProjectVersionArn for your model from the describe-project-versions cell output above to the --project-version-arn parameter here
!aws rekognition start-project-version \
  --project-version-arn 'your-project-version-arn' \
  --min-inference-units 1 \
  --region us-east-1
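
Starting a model is asynchronous, so detection calls made before the model is running will fail. The following is a minimal polling sketch, assuming the status values reported by DescribeProjectVersions include `STARTING`, `RUNNING`, and `FAILED`. `wait_for_status` is a hypothetical helper, and the demonstration uses simulated statuses so that it runs without any AWS call.

```python
import time


def wait_for_status(fetch_status, target="RUNNING",
                    failed=("FAILED", "TRAINING_FAILED"),
                    delay_seconds=30, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until it returns target; raise on failure or timeout."""
    status = None
    for _ in range(max_attempts):
        status = fetch_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"Model entered status {status}")
        sleep(delay_seconds)
    raise TimeoutError(f"Status was still {status} after {max_attempts} checks")


# Offline demonstration with simulated statuses (no AWS call, no real waiting).
simulated = iter(["STARTING", "STARTING", "RUNNING"])
final_status = wait_for_status(lambda: next(simulated), sleep=lambda seconds: None)
print(final_status)

# In the notebook, fetch_status could wrap DescribeProjectVersions, for example
# (assuming the project has a single model version):
# def fetch_status():
#     rekognition = boto3.client('rekognition')
#     response = rekognition.describe_project_versions(ProjectArn='your-project-arn')
#     return response['ProjectVersionDescriptions'][0]['Status']
```

Passing the status lookup in as a function keeps the polling logic independent of boto3, which also makes it easy to try out with simulated values as shown.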

### Let's run object detection with our Amazon Rekognition custom model using one of the images from our dataset
Copy the ProjectVersionArn for your model from the result of the cell execution above and provide it as the value of `model_arn` below. Let's select a sample image from our dataset.

In [None]:
from PIL import Image, ImageDraw, ExifTags, ImageColor, ImageFont
image=Image.open('./data/testimages/pexels-polina-tankilevitch-4109078.jpg')
display(image)


Now let's send this image to our Amazon Rekognition custom model for detection.

In [None]:
import io
import json


test_photo = '/data/testimages/pexels-polina-tankilevitch-4109078.jpg'
#s3_connection = boto3.resource('s3')
client = boto3.client('rekognition')
s3_object = s3r.Object(BUCKET, PREFIX + test_photo)
s3_response = s3_object.get()

stream = io.BytesIO(s3_response['Body'].read())
image = Image.open(stream)
model_arn = 'your-project-version-arn'
min_confidence=50    
#Call DetectCustomLabels 
response = client.detect_custom_labels(Image={'S3Object': {'Bucket': BUCKET, 'Name': PREFIX + test_photo}},
    MinConfidence=min_confidence,
    ProjectVersionArn=model_arn)
#print("Response from Rekog is: " + str(response))   
imgWidth, imgHeight = image.size  
draw = ImageDraw.Draw(image)  
       
# calculate and display bounding boxes for each detected custom label       
   
for customLabel in response['CustomLabels']:
    print('Label ' + str(customLabel['Name'])) 
    print('Confidence ' + str(customLabel['Confidence'])) 
    if 'Geometry' in customLabel:
        box = customLabel['Geometry']['BoundingBox']
        left = imgWidth * box['Left']
        top = imgHeight * box['Top']
        width = imgWidth * box['Width']
        height = imgHeight * box['Height']

        fnt = ImageFont.load_default()
        draw.text((left,top), customLabel['Name'], fill='#00d400', font=fnt) 

        print('Left: ' + '{0:.0f}'.format(left))
        print('Top: ' + '{0:.0f}'.format(top))
        print('Label Width: ' + "{0:.0f}".format(width))
        print('Label Height: ' + "{0:.0f}".format(height))
        print("Label %s" % json.dumps(customLabel['Name']))  # '%' applies the format; a comma would print the literal '%s'

        points = (
            (left,top),
            (left + width, top),
            (left + width, top + height),
            (left , top + height),
            (left, top))
        draw.line(points, fill='#00d400', width=5)

display(image)        
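
The bounding box values returned by DetectCustomLabels are ratios of the image width and height, which is why the cell above multiplies them by the image size. The conversion on its own can be sketched as follows; `to_pixel_box` is a hypothetical helper and the sample box is made up.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a relative BoundingBox into (left, top, right, bottom) pixels."""
    left = image_width * bounding_box["Left"]
    top = image_height * bounding_box["Top"]
    right = left + image_width * bounding_box["Width"]
    bottom = top + image_height * bounding_box["Height"]
    return (round(left), round(top), round(right), round(bottom))


# Made-up box covering the middle half of the width of an 800x600 image.
sample_box = {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25}
pixel_box = to_pixel_box(sample_box, 800, 600)
print(pixel_box)  # (200, 300, 600, 450)
```

A tuple in this form can be handed to Pillow's `ImageDraw.rectangle` as an alternative to drawing the five-point line used above.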


## Step 2 - Setup an Amazon A2I Flow Definition

Amazon Augmented AI (Amazon A2I) makes it easy to build the workflows required for human review of ML predictions. Amazon A2I brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems or managing large numbers of human reviewers.

To incorporate Amazon A2I into your human review workflows you need:

- A worker task template to create a worker UI. The worker UI displays your input data, such as documents or images, and instructions to workers. It also provides interactive tools that the worker uses to complete your tasks. For more information, see A2I instructions overview

- A human review workflow, also referred to as a flow definition. You use the flow definition to configure your human workforce and provide information about how to accomplish the human review task. To learn more see create flow definition

- When using a custom task type, you start a human loop using the Amazon Augmented AI Runtime API. When you call StartHumanLoop in your custom application, a task is sent to human reviewers.

In this section, you set up a human review loop for low-confidence detections in Amazon A2I. It includes the following steps:

- Create a human task UI
- Create the flow definition

Let's now initialize some variables that we need in the subsequent steps.

In [None]:
import time

timestamp = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
# Amazon SageMaker client
sagemaker_client = boto3.client('sagemaker')

# Amazon Augmented AI (A2I) runtime client
a2i = boto3.client('sagemaker-a2i-runtime')

# Flow definition name - this value is unique per account and region. You can also provide your own value here.
flowDefinitionName = 'fd-rekog-custom-' + timestamp

# Task UI name - this value is unique per account and region. You can also provide your own value here.
taskUIName = 'ui-rekog-custom-' + timestamp

# Flow definition outputs
OUTPUT_PATH = f's3://{BUCKET}/{PREFIX}/output'

### Create Human Task UI

Create a human task UI resource by providing a UI template in Liquid HTML. This template is rendered to the human workers whenever a human loop is required.

For over 70 pre-built UIs, see https://github.com/aws-samples/amazon-a2i-sample-task-uis.

We will be taking an [object detection UI](https://github.com/aws-samples/amazon-a2i-sample-task-uis/blob/master/images/bounding-box.liquid.html) and filling in the object categories in the `labels` variable in the template.

In [None]:
template = r"""
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

<crowd-form>
  <crowd-bounding-box
    name="annotatedResult"
    src="{{ task.input.taskObject | grant_read_access }}"
    header="Identify Pepperoni Pizza slices in the image, select the label from the right and draw bounding boxes depicting them. You need one bounding box per pizza slice you want to label."
    labels="['pepperoni pizza slice']"
  >
    <full-instructions header="Bounding Box Instructions" >
      <p>Use the bounding box tool to draw boxes around the requested target of interest:</p>
      <ol>
        <li>Draw a rectangle using your mouse over each instance of the target.</li>
        <li>Make sure the box does not cut into the target, leave a 2 - 3 pixel margin</li>
        <li>
          When targets are overlapping, draw a box around each object,
          include all contiguous parts of the target in the box.
          Do not include parts that are completely overlapped by another object.
        </li>
        <li>
          Do not include parts of the target that cannot be seen,
          even though you think you can interpolate the whole shape of the target.
        </li>
        <li>Avoid shadows, they're not considered as a part of the target.</li>
        <li>If the target goes off the screen, label up to the edge of the image.</li>
      </ol>
    </full-instructions>

    <short-instructions>
      Draw boxes around the requested target of interest.
    </short-instructions>
  </crowd-bounding-box>
</crowd-form>
"""

def create_task_ui():
    '''
    Creates a Human Task UI resource.

    Returns:
    struct: HumanTaskUiArn
    '''
    response = sagemaker_client.create_human_task_ui(
        HumanTaskUiName=taskUIName,
        UiTemplate={'Content': template})
    return response

In [None]:
# Create task UI
humanTaskUiResponse = create_task_ui()
humanTaskUiArn = humanTaskUiResponse['HumanTaskUiArn']
print(humanTaskUiArn)

### Creating the Flow Definition

In this section, we're going to create a flow definition. Flow definitions allow us to specify:

* The workforce that your tasks will be sent to.
* The instructions that your workforce will receive. This is called a worker task template.
* The configuration of your worker tasks, including the number of workers that receive a task and time limits to complete tasks.
* Where your output data will be stored.

This notebook is going to use the API, but you can optionally create this workflow definition in the console as well. 

For more details and instructions, see: https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-create-flow-definition.html.

In [None]:
create_workflow_definition_response = sagemaker_client.create_flow_definition(
        FlowDefinitionName= flowDefinitionName,
        RoleArn= ROLE,
        HumanLoopConfig= {
            "WorkteamArn": WORKTEAM_ARN,
            "HumanTaskUiArn": humanTaskUiArn,
            "TaskCount": 1,
            "TaskDescription": "Identify custom labels in the image",
            "TaskTitle": "Identify custom image"
        },
        OutputConfig={
            "S3OutputPath" : OUTPUT_PATH
        }
    )
flowDefinitionArn = create_workflow_definition_response['FlowDefinitionArn'] # let's save this ARN for future use

In [None]:
# Describe flow definition - status should be active
for x in range(60):
    describeFlowDefinitionResponse = sagemaker_client.describe_flow_definition(FlowDefinitionName=flowDefinitionName)
    status = describeFlowDefinitionResponse['FlowDefinitionStatus']
    print(status)
    if status == 'Active':
        print("Flow Definition is active")
        break
    if status == 'Failed':
        # Stop polling early; FailureReason explains what went wrong
        print("Flow Definition failed: " + str(describeFlowDefinitionResponse.get('FailureReason')))
        break
    time.sleep(2)

## Step 3 - Start Human Loops

In this step, we send our test dataset to our Amazon Rekognition Custom Labels model for predictions and conditionally send the prediction results to the private workforce we set up for human review. We use the confidence score in the `CustomLabels` list returned by the DetectCustomLabels API to send all images with a confidence score of less than 60% to the human loop. We execute the following tasks:

- Trigger conditions for human loop activation
- Check the human loop status and wait for reviewers to complete the task

### Run model predictions and trigger human loop
We first ask the Amazon Rekognition Custom Labels model to return every label it detects, with no minimum confidence cutoff (`MinConfidence=0`). For each image that returns at least one label, we then send the image to a human loop if the confidence of the first returned label is below 60%.

In [None]:
import uuid

human_loops_started = []
SCORE_THRESHOLD = 60

folderpath = r"data/testimages" # make sure to put the 'r' in front and provide the folder where your files are
filepaths  = [os.path.join(folderpath, name) for name in os.listdir(folderpath) if not name.startswith('.')] # do not select hidden directories
for path in filepaths:
    # Call the custom labels model with MinConfidence=0 so that every detected label is returned, however low its confidence
    response = client.detect_custom_labels(Image={'S3Object': {'Bucket': BUCKET, 'Name': PREFIX+'/'+path}},
        MinConfidence=0,
        ProjectVersionArn=model_arn)    
  
    #Get the custom labels
    labels=response['CustomLabels']
    if labels and labels[0]['Confidence'] < SCORE_THRESHOLD: 
        s3_fname='s3://%s/%s' % (BUCKET, PREFIX+'/'+path)
        print("Images with labels less than 60% confidence score: " + s3_fname)
        humanLoopName = str(uuid.uuid4())
        inputContent = {
            "initialValue": labels[0]['Confidence'],
            "taskObject": s3_fname
        }
        start_loop_response = a2i.start_human_loop(
            HumanLoopName=humanLoopName,
            FlowDefinitionArn=flowDefinitionArn,
            HumanLoopInput={
                "InputContent": json.dumps(inputContent)
            }
        )
        human_loops_started.append(humanLoopName)
        print(f'Starting human loop with name: {humanLoopName}  \n')
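
The cell above bases its decision on the first label in the response only. As a hedged variant (not the notebook's own logic), the sketch below routes an image to review when the lowest confidence among all returned labels falls below the threshold. `lowest_confidence` and `needs_human_review` are hypothetical helpers, and the sample labels are made up.

```python
def lowest_confidence(labels):
    """Return the smallest Confidence among the labels, or None if there are none."""
    if not labels:
        return None
    return min(label["Confidence"] for label in labels)


def needs_human_review(labels, threshold):
    """True when at least one label exists and the weakest one is below threshold."""
    lowest = lowest_confidence(labels)
    return lowest is not None and lowest < threshold


# Made-up response fragment with one confident and one uncertain detection.
sample_labels = [
    {"Name": "pepperoni pizza slice", "Confidence": 91.2},
    {"Name": "pepperoni pizza slice", "Confidence": 48.7},
]
print(needs_human_review(sample_labels, 60))  # True
print(needs_human_review([], 60))             # False
```

Whether the first label or the weakest label is the better trigger depends on your use case; images with no labels at all are skipped in both versions.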

### Check Status of Human Loop

In [None]:
completed_human_loops = []
for human_loop_name in human_loops_started:
    resp = a2i.describe_human_loop(HumanLoopName=human_loop_name)
    print(f'HumanLoop Name: {human_loop_name}')
    print(f'HumanLoop Status: {resp["HumanLoopStatus"]}')
    print(f'HumanLoop Output Destination: {resp["HumanLoopOutput"]}')
    print('\n')
    
    if resp["HumanLoopStatus"] == "Completed":
        completed_human_loops.append(resp)

### Wait For Workers to Complete Task

In [None]:
workteamName = WORKTEAM_ARN[WORKTEAM_ARN.rfind('/') + 1:]
print("Navigate to the private worker portal and do the tasks. Make sure you've invited yourself to your workteam!")
print('https://' + sagemaker_client.describe_workteam(WorkteamName=workteamName)['Workteam']['SubDomain'])

### Check Status of Human Loop Again

In [None]:
completed_human_loops = []
for human_loop_name in human_loops_started:
    resp = a2i.describe_human_loop(HumanLoopName=human_loop_name)
    print(f'HumanLoop Name: {human_loop_name}')
    print(f'HumanLoop Status: {resp["HumanLoopStatus"]}')
    print(f'HumanLoop Output Destination: {resp["HumanLoopOutput"]}')
    print('\n')
    
    if resp["HumanLoopStatus"] == "Completed":
        completed_human_loops.append(resp)

## Step 4 - Evaluate results and re-train the model 

Once our human workforce has completed the tasks, Amazon A2I stores the annotation results in an S3 bucket. You can use these annotation results to update your original labeling annotations, update your training dataset, and re-train your models to improve their accuracy. You can also send the Amazon A2I annotations to downstream analytics of prediction results, use them for auditing purposes, or use them for conditional activation of application logic as required.

In [None]:
import re
import pprint

pp = pprint.PrettyPrinter(indent=4)

for resp in completed_human_loops:
    splitted_string = re.split('s3://' +  BUCKET + '/', resp['HumanLoopOutput']['OutputS3Uri'])
    output_bucket_key = splitted_string[1]

    response = s3.get_object(Bucket=BUCKET, Key=output_bucket_key)
    content = response["Body"].read()
    json_output = json.loads(content)
    pp.pprint(json_output)
    print('\n')
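
The cell above recovers the object key by splitting `OutputS3Uri` on the bucket name with `re.split`, which treats the bucket name as a regular expression. A plain string-based alternative is sketched below; `split_s3_uri` is a hypothetical helper and the example URI is made up.

```python
def split_s3_uri(uri):
    """Split 's3://bucket/key/parts' into (bucket, key) without using regex."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"Not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"Missing bucket name in: {uri!r}")
    return bucket, key


# Made-up URI in the general shape of an A2I output location.
output_bucket, output_key = split_s3_uri("s3://my-bucket/a2i-demo/output/loop-1/output.json")
print(output_bucket)  # my-bucket
print(output_key)     # a2i-demo/output/loop-1/output.json
```

With this helper, the bucket and key could be passed straight to `s3.get_object(Bucket=..., Key=...)`.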

### Create an augmented manifest for re-training

We will now take the output from Amazon A2I and convert it to an **augmented manifest file** (similar to the **`output.manifest`** file we originally used for training our Amazon Rekognition Custom Labels model). Replace the value of the **`dsname`** variable below with the name of the dataset that you created originally.

In [None]:
object_categories = ['pepperoni pizza slice','cheese slice'] # if you have more labels, add them here
object_categories_dict = {str(i): j for i, j in enumerate(object_categories)}

dsname = 'pepperoni_pizza'

def convert_a2i_to_augmented_manifest(a2i_output):
    annotations = []
    confidence = []
    for i, bbox in enumerate(a2i_output['humanAnswers'][0]['answerContent']['annotatedResult']['boundingBoxes']):
        object_class_key = [key for (key, value) in object_categories_dict.items() if value == bbox['label']][0]
        obj = {'class_id': int(object_class_key), 
               'width': bbox['width'],
               'top': bbox['top'],
               'height': bbox['height'],
               'left': bbox['left']}
        annotations.append(obj)
        confidence.append({'confidence': 1})

    # Change the attribute name to the dataset-name_BB for this dataset. This will later be used in setting the training data
    augmented_manifest={'source-ref': a2i_output['inputContent']['taskObject'],
                        dsname+'_BB': {'annotations': annotations,
                                           'image_size': [{'width': a2i_output['humanAnswers'][0]['answerContent']['annotatedResult']['inputImageProperties']['width'],
                                                           'depth':3,
                                                           'height': a2i_output['humanAnswers'][0]['answerContent']['annotatedResult']['inputImageProperties']['height']}]},
                        dsname+'_BB-metadata': {'job-name': 'a2i/%s' % a2i_output['humanLoopName'],
                                                    'class-map': object_categories_dict,
                                                    'human-annotated':'yes',
                                                    'objects': confidence,
                                                    'creation-date': a2i_output['humanAnswers'][0]['submissionTime'],
                                                    'type':'groundtruth/object-detection'}}
    return augmented_manifest

This function takes an Amazon A2I output JSON and returns a JSON object in the format that Amazon SageMaker Ground Truth produces and that Amazon Rekognition Custom Labels expects as input. To build a set of training images from all the images re-labeled by human reviewers in Amazon A2I, you can loop through all the A2I output files, convert each one, and concatenate the results into a JSON Lines file in which each line represents the result for one image.
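
To make the bounding box part of the conversion concrete, the sketch below maps A2I-style boxes to annotation entries using a label-to-id dictionary instead of searching the class map for every box. It is an illustration only: `boxes_to_annotations` is a hypothetical helper and the sample box is made up, although its field names follow the ones used in the function above.

```python
object_categories = ["pepperoni pizza slice", "cheese slice"]

# Reverse lookup from label text to numeric class id.
class_id_by_label = {label: index for index, label in enumerate(object_categories)}


def boxes_to_annotations(bounding_boxes):
    """Convert A2I bounding boxes into Ground Truth style annotation dicts."""
    annotations = []
    for box in bounding_boxes:
        annotations.append({
            "class_id": class_id_by_label[box["label"]],
            "width": box["width"],
            "top": box["top"],
            "height": box["height"],
            "left": box["left"],
        })
    return annotations


# Made-up reviewer answer with a single box.
sample_boxes = [
    {"label": "pepperoni pizza slice", "left": 10, "top": 20, "width": 30, "height": 40},
]
annotations = boxes_to_annotations(sample_boxes)
print(annotations)
```

A label that is missing from `object_categories` raises a `KeyError` here, which surfaces a mismatch between the task UI labels and the class map early.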

In [None]:
output=[]
with open('augmented-temp.manifest', 'w') as outfile:
    # convert the a2i json to augmented manifest for each human loop output
    for resp in completed_human_loops:
        splitted_string = re.split('s3://' +  BUCKET + '/', resp['HumanLoopOutput']['OutputS3Uri'])
        output_bucket_key = splitted_string[1]

        response = s3.get_object(Bucket=BUCKET, Key=output_bucket_key)
        content = response["Body"].read()
        json_output = json.loads(content)
        
        # convert using the function
        augmented_manifest = convert_a2i_to_augmented_manifest(json_output)
        print(json.dumps(augmented_manifest))
        json.dump(augmented_manifest, outfile)
        outfile.write('\n')
        output.append(augmented_manifest)
        print('\n')

Let's now use the contents of the original output.manifest file to intentionally replace the image references whose annotations were augmented by Amazon A2I. This will create a new full copy of the manifest to use to train a 2nd model that trains on annotations from A2I.

In [None]:
# Index the A2I-derived entries by image reference
a2i_entries = {}
with open('./augmented-temp.manifest', 'r') as f4:
    for lin1 in f4:
        z_json = json.loads(lin1)
        a2i_entries[z_json['source-ref']] = z_json

replaced = set()
with open('./data/manifest/output.manifest', 'r') as f3, open('augmented.manifest', 'w') as outfl:
    # Copy the original manifest once, swapping in the A2I annotations where the image matches
    for lin2 in f3:
        x_json = json.loads(lin2)
        z_json = a2i_entries.get(x_json['source-ref'])
        if z_json is not None:
            print("replacing the annotations")
            x_json[dsname+'_BB'] = z_json[dsname+'_BB']
            x_json[dsname+'_BB-metadata'] = z_json[dsname+'_BB-metadata']
            replaced.add(x_json['source-ref'])
        json.dump(x_json, outfl)
        outfl.write('\n')
    # Append images reviewed in A2I that were not part of the original manifest
    for ref, z_json in a2i_entries.items():
        if ref not in replaced:
            print("This is a net new annotation to augmented file")
            json.dump(z_json, outfl)
            outfl.write('\n')

In [None]:
# take a look at what the JSON Lines output looks like
!head -n2 augmented.manifest

### Train a new model using the augmented manifest

In [None]:
# upload the manifest file to S3
s3r.meta.client.upload_file('./augmented.manifest', BUCKET, PREFIX+'/'+'data/manifest/augmented.manifest')

Now that we have uploaded the augmented manifest file from Amazon A2I to the S3 bucket, you can train a new model by using this augmented manifest as an input dataset. The steps are listed below:

**`Create a new Amazon Rekognition Custom Labels dataset`**
- Go to AWS Console --> Amazon Rekognition Custom Labels --> Datasets
- Choose **Create dataset**, enter a new Dataset name and select *Import images labeled by Amazon Sagemaker Ground Truth*. Provide the S3 location of the augmented.manifest file you created above in `<S3 bucket location of your .manifest file>`
- Create the dataset

**`Train a new model by selecting this newly created dataset`**
- Go to AWS Console --> Amazon Rekognition Custom Labels --> Projects
- Click on the project you used when training your model the first time
- Click Train Model and select the dataset you created in the step above
- Click Start Training

## Cleanup
To avoid incurring unnecessary charges, delete the resources used in this walkthrough
when not in use, including the following:

* [Amazon Rekognition Custom Label Project](https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/cp-delete.html)
* [Amazon S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/delete-bucket.html)
* [Amazon A2I Flow definition](https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-delete-flow-definition.html)
* [Amazon SageMaker notebook instance](https://sagemaker-workshop.com/cleanup/sagemaker.html)



In [None]:
!aws rekognition stop-project-version --project-version-arn 'your-project-version-arn'

In [None]:
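# A project can only be deleted after all of its models are deleted. If this call fails, first run:
# aws rekognition delete-project-version --project-version-arn 'your-project-version-arn'
!aws rekognition delete-project --project-arn 'your-project-arn'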

In [None]:
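# {flowDefinitionName} is expanded by IPython to the value of the Python variable defined in Step 2
!aws sagemaker delete-flow-definition --flow-definition-name {flowDefinitionName}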

## Conclusion
This notebook demonstrated how you can use Amazon Rekognition Custom Labels and Amazon A2I to train models to detect objects and scenes unique to your business and define conditions to send the predictions to a Human Workflow with labelers to review and update the results. The human labeled output can be used to augment the training dataset for re-training, improving model accuracy or it can be sent to downstream applications for analytics and insights.