## Amazon Lookout for Vision Lab

To help you learn about creating a model, Amazon Lookout for Vision provides example images of circuit boards (circuit_board) that you can use. These images are taken from https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/su-prepare-example-images.html.

### Environment variables

As a first step, define the two global variables that this notebook needs:

- bucket: the S3 bucket that you will create and then use as your source for Amazon Lookout for Vision
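 - Note: Replace MMDDYY in the bucket name with today's date. S3 bucket names must be globally unique and must not contain uppercase letters. The command that creates the bucket depends on your region; the bucket-creation cell further below selects the correct one.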
- project: the project name you want to use in Amazon Lookout for Vision

In [None]:
import os
import boto3

bucket = "lfv-s3-bucket--MMDDYY"
project = "circuitproject"
os.environ["BUCKET"] = bucket
os.environ["REGION"] = boto3.session.Session().region_name

client = boto3.client('lookoutvision')
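
The `MMDDYY` placeholder in the bucket name can also be filled in programmatically. The following is a minimal sketch, not part of the original lab: `make_bucket_name` is a hypothetical helper, and its regular expression is only a simplified check of the S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit).

```python
import re
from datetime import date

# Simplified check of the S3 bucket naming rules (assumption: it does not
# cover every rule, for example names formatted like IP addresses).
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def make_bucket_name(prefix="lfv-s3-bucket", today=None):
    """Return '<prefix>-MMDDYY'; raise ValueError if the name looks invalid."""
    today = today or date.today()
    name = "{}-{}".format(prefix, today.strftime("%m%d%y"))
    if not _BUCKET_RE.match(name):
        raise ValueError("Invalid S3 bucket name: " + name)
    return name


print(make_bucket_name(today=date(2021, 3, 15)))  # lfv-s3-bucket-031521
```

The result could be assigned to `bucket` in place of the hard-coded string. The name still has to be globally unique, which this sketch cannot verify.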

You can check your region with:

In [None]:
# Check your region:
print(boto3.session.Session().region_name)

The next cell creates the bucket with the command that matches your region:

In [None]:
# If you are not using an existing S3 bucket, create one:

if boto3.session.Session().region_name == 'us-east-1':
    !aws s3api create-bucket --bucket $BUCKET
else:
    !aws s3api create-bucket --bucket $BUCKET --create-bucket-configuration LocationConstraint=$REGION
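
The same region-dependent logic can be expressed with boto3 instead of the CLI. This is a sketch under assumptions: `create_bucket_kwargs` and `create_bucket` are hypothetical helpers, and `s3_client` is expected to be a boto3 S3 client such as `boto3.client('s3')`. Only the argument-building part is exercised here, so nothing is created in your account.

```python
def create_bucket_kwargs(bucket, region):
    """Build the keyword arguments for the S3 CreateBucket call."""
    kwargs = {"Bucket": bucket}
    # us-east-1 is the default location, so no LocationConstraint is passed
    # for it (this mirrors the if/else in the CLI cell above).
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(s3_client, bucket, region):
    """Create the bucket using a boto3 S3 client passed in by the caller."""
    return s3_client.create_bucket(**create_bucket_kwargs(bucket, region))


print(create_bucket_kwargs("my-example-bucket", "eu-west-1"))
```

In the notebook, the call would look like `create_bucket(boto3.client('s3'), bucket, os.environ["REGION"])`.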

## Image Preparation and EDA

If you already have pre-labeled images, as is the case in this example, you can set up a folder structure that separates training and validation data. Images are labeled for Amazon Lookout for Vision via the folder they are stored in (normal=good, anomaly=bad). For background on the service, see also:
- https://aws.amazon.com/lookout-for-vision/
- https://aws.amazon.com/blogs/aws/amazon-lookout-for-vision-new-machine-learning-service-that-simplifies-defect-detection-for-manufacturing/

We will import the sample images provided by Amazon Lookout for Vision. If you're importing your own images, you will prepare them at this stage.
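
Before generating manifests, it can help to check how many images each folder holds. The sketch below assumes the sample images are unpacked under `./circuitboard/<dataset>/<label>` (the layout used by the code later in this notebook); `count_images` is a hypothetical helper name.

```python
import os


def count_images(root="./circuitboard", datasets=("train", "test")):
    """Return {(dataset, label_folder): number_of_files} for the folder layout."""
    counts = {}
    for ds in datasets:
        ds_dir = os.path.join(root, ds)
        if not os.path.isdir(ds_dir):
            continue  # dataset folder is not available locally
        for folder in sorted(os.listdir(ds_dir)):
            folder_dir = os.path.join(ds_dir, folder)
            if os.path.isdir(folder_dir):
                counts[(ds, folder)] = len(os.listdir(folder_dir))
    return counts


# Print one line per dataset/label combination that exists locally.
for (ds, folder), n in count_images().items():
    print("{:5s} {:8s} {:4d} images".format(ds, folder, n))
```

A strongly imbalanced or empty folder is easier to spot here than after the upload.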

### Generate the *manifest* files

You might be familiar with manifest files if you have ever used Amazon SageMaker Ground Truth. If you are not, don't worry about this section too much.

If you are still interested in what's happening, you can continue reading:

Each dataset (train/ as well as test/) needs its own manifest file. Amazon Lookout for Vision uses this file to determine where to look for the images. The manifest follows a fixed structure with one JSON object per line. The most important keys are *source-ref*, the S3 location of each file (its path also shows whether the image belongs to the training or the test dataset); *auto-label*, the value of each label (0=bad, 1=good); *class-name*, the folder the image came from (normal or anomaly); and *creation-date*, which lets you know when an image was put in place. All other fields are pre-set for you.

Each manifest file itself contains N JSON objects, where N is the number of images that are used in this dataset.

In [None]:
# datetime to generate the creation timestamp and json to dump each JSON object
# to the corresponding manifest file:
from datetime import datetime
import json

# Current date and time in manifest file format:
now = datetime.now()
dttm = now.strftime("%Y-%m-%dT%H:%M:%S.%f")

# The two datasets used: train and test
datasets = ["train", "test"]

# For each dataset...
for ds in datasets:
    # ...list the folders available (normal or anomaly).
    #print(ds)
    folders = os.listdir("./circuitboard/{}".format(ds))
    # Then open the manifest file for this dataset...
    with open("{}.manifest".format(ds), "w") as f:
        for folder in folders:
            filecount = 0
            #print(folder)
            # ...and iterate through both folders by first listing
            # the corresponding files and setting the appropriate label
            # (as noted above: 1 = good, 0 = bad):
            files = os.listdir("./circuitboard/{}/{}".format(ds, folder))
            label = 1
            if folder == "anomaly":
                label = 0
            # For each file in the folder...
            for file in files:
                filecount += 1
                #print(filecount)
                # Comment out the following two lines to use the entire dataset
                # (they cap each folder at 20 images):
                if filecount > 20:
                    break
                # ...generate a manifest JSON object and save it to the manifest
                # file. Don't forget to add '\n' to start a new line:
                manifest = {
                    "source-ref": "s3://{}/{}/{}/{}/{}".format(bucket, project, ds, folder, file),
                    "auto-label": label,
                    "auto-label-metadata": {
                        "confidence": 1,
                        "job-name": "labeling-job/auto-label",
                        "class-name": folder,
                        "human-annotated": "yes",
                        "creation-date": dttm,
                        "type": "groundtruth/image-classification"
                    }
                }
                f.write(json.dumps(manifest) + "\n")
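
A quick sanity check of the generated files can catch problems before anything is uploaded. The following sketch reads a manifest line by line, verifies that each line is valid JSON with an S3 `source-ref`, and counts the entries per `class-name`. `summarize_manifest` is a hypothetical helper, not part of the Lookout for Vision SDK.

```python
import json
from collections import Counter


def summarize_manifest(path):
    """Count manifest entries per class name; raise on malformed lines."""
    counts = Counter()
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            entry = json.loads(line)  # raises ValueError on invalid JSON
            if not entry["source-ref"].startswith("s3://"):
                raise ValueError("line {}: source-ref is not an S3 URI".format(line_no))
            counts[entry["auto-label-metadata"]["class-name"]] += 1
    return dict(counts)


# Example usage once the manifests exist:
# print(summarize_manifest("train.manifest"))
```

With the 20-image cap in the cell above, each class should report at most 20 entries per manifest.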

### Upload manifest files and images to S3

Now it's time to upload all the images and the manifest files:

In [None]:
# Upload manifest files to S3 bucket:
!aws s3 cp train.manifest s3://{bucket}/{project}/train.manifest
!aws s3 cp test.manifest s3://{bucket}/{project}/test.manifest

In [None]:
# Upload images to S3 bucket:
!aws s3 cp circuitboard/train/normal s3://{bucket}/{project}/train/normal --recursive
!aws s3 cp circuitboard/train/anomaly s3://{bucket}/{project}/train/anomaly --recursive

!aws s3 cp circuitboard/test/normal s3://{bucket}/{project}/test/normal --recursive
!aws s3 cp circuitboard/test/anomaly s3://{bucket}/{project}/test/anomaly --recursive
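
If you prefer to stay in Python, the recursive copy can be sketched with boto3. This is an illustration under assumptions: `iter_uploads` and `upload_folder` are hypothetical helpers, and `s3_client` is expected to be a boto3 S3 client (for example `boto3.client('s3')`) whose `upload_file(Filename, Bucket, Key)` method performs the transfer.

```python
import os


def iter_uploads(local_root, key_prefix):
    """Yield (local_path, s3_key) for every file below local_root."""
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in sorted(filenames):
            local_path = os.path.join(dirpath, name)
            rel = os.path.relpath(local_path, local_root)
            # S3 keys use forward slashes regardless of the local OS.
            yield local_path, "/".join([key_prefix] + rel.split(os.sep))


def upload_folder(s3_client, bucket, local_root, key_prefix):
    """Upload all files below local_root and return the number of uploads."""
    count = 0
    for local_path, key in iter_uploads(local_root, key_prefix):
        s3_client.upload_file(local_path, bucket, key)
        count += 1
    return count
```

For example, `upload_folder(boto3.client('s3'), bucket, 'circuitboard/train', project + '/train')` would mirror the two training `aws s3 cp` commands above.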

## Amazon Lookout for Vision

We are almost done. You can create your Amazon Lookout for Vision project in the console, with the CLI, or with boto3. This example uses the boto3 SDK. We highly recommend checking out the console, too: it makes creating a project and training a model very simple. This is what we should show to our customers, too!

The steps we take with the SDK are:

1. Create a project (the name has been set right at the beginning)
2. Tell your project where to find your training dataset. This is done via the manifest file for training.
3. Tell your project where to find your test dataset. This is done via the manifest file for test.
 - Note: This step is optional. In general, all 'test'-related code is optional: Amazon Lookout for Vision also works with a training dataset only. We use both because separate training and test datasets are a common (best) practice when training AI/ML models, and we should always let our customers know this to help them get to the next level.
4. Create a model. This command will trigger the model training and validation.

**Note**: Training a model can take a few hours because it uses deep learning in the background. Once your model is trained, you can continue with this notebook to make predictions.

### Creating Project

In [None]:
#Creating project
print('Creating project: ' + project)
response=client.create_project(ProjectName=project)
print('project ARN: ' + response['ProjectMetadata']['ProjectArn'])
print('Done!')

### Creating Training Dataset

In [None]:
#Creating training dataset
dataset_type ='train'
manifest_file = project+'/train.manifest'

print('Creating dataset...')
dataset = {"GroundTruthManifest": {"S3Object": {"Bucket": bucket, "Key": manifest_file}}}

response=client.create_dataset(ProjectName=project, DatasetType=dataset_type, DatasetSource=dataset)
print('Dataset Status: ' + response['DatasetMetadata']['Status'])
print('Dataset Status Message: ' + response['DatasetMetadata']['StatusMessage'])
print('Dataset Type: ' + response['DatasetMetadata']['DatasetType'])
print('Done!')

### Creating Test Dataset

In [None]:
#Creating test dataset
dataset_type ='test'
manifest_file = project+'/test.manifest'

print('Creating dataset...')
dataset = {"GroundTruthManifest": {"S3Object": {"Bucket": bucket, "Key": manifest_file}}}

response=client.create_dataset(ProjectName=project, DatasetType=dataset_type, DatasetSource=dataset)
print('Dataset Status: ' + response['DatasetMetadata']['Status'])
print('Dataset Status Message: ' + response['DatasetMetadata']['StatusMessage'])
print('Dataset Type: ' + response['DatasetMetadata']['DatasetType'])
print('Done!')
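
Dataset creation runs asynchronously, so the status printed above may still report that creation is in progress. A minimal sketch of a polling helper follows. `wait_for_status` and `dataset_status` are hypothetical names, `client` is assumed to be the `lookoutvision` boto3 client from the beginning of this notebook, and the status values `CREATE_COMPLETE` and `CREATE_FAILED` are taken from the DescribeDataset API as I understand it.

```python
import time


def wait_for_status(get_status, target, failed=(), delay=5, timeout=3600, sleep=time.sleep):
    """Poll get_status() until it returns target; raise on failure or timeout."""
    waited = 0
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError("Reached failure status: " + status)
        if waited >= timeout:
            raise TimeoutError("Still {} after {} seconds".format(status, waited))
        sleep(delay)
        waited += delay


def dataset_status(client, project, dataset_type):
    """Read the current dataset status via DescribeDataset."""
    response = client.describe_dataset(ProjectName=project, DatasetType=dataset_type)
    return response["DatasetDescription"]["Status"]
```

Usage for the training dataset: `wait_for_status(lambda: dataset_status(client, project, 'train'), 'CREATE_COMPLETE', failed=('CREATE_FAILED',))`. Unlike a bare `while` loop, this stops when creation fails or takes longer than the timeout.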

### Creating/training Model

In [None]:
#Creating/training model
output_bucket = bucket
output_folder = project+'/model/'

 
print('Creating model...')
output_config = {"S3Location": {"Bucket": output_bucket, "Prefix": output_folder}}

response=client.create_model(ProjectName=project, OutputConfig=output_config)
print('ARN: ' + response['ModelMetadata']['ModelArn'])
print('Version: ' + response['ModelMetadata']['ModelVersion'])
print('Status: ' + response['ModelMetadata']['Status'])
print('Message: ' + response['ModelMetadata']['StatusMessage'])
print('Done!')

### Model Deployment

Putting the model into operation is as easy as telling it to "start". This process also takes a few minutes, so please be patient. You can again check the status of the model in the console (or via the CLI).

#### Wait for the model training to complete

In [None]:
import time

# Poll the model status every 5 seconds until training has finished:
while client.describe_model(ProjectName=project, ModelVersion='1')['ModelDescription']['Status'] != 'TRAINED':
    print('.', end='')
    time.sleep(5)
print('Done!')
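
Once training has finished, the same `describe_model` response also carries evaluation metrics. The sketch below formats them and assumes that the `ModelDescription` contains a `Performance` entry with `Precision`, `Recall`, and `F1Score`; `format_performance` is a hypothetical helper.

```python
def format_performance(model_description):
    """Format the metrics found in a DescribeModel 'ModelDescription' dict."""
    perf = model_description.get("Performance", {})
    return ", ".join(
        "{}: {:.3f}".format(name, perf[name])
        for name in ("Precision", "Recall", "F1Score")
        if name in perf  # skip metrics that are not reported
    )


# Illustrative input only; these numbers are not real results.
example = {"Performance": {"Precision": 0.9, "Recall": 0.8, "F1Score": 0.847}}
print(format_performance(example))
```

In the notebook, pass in `client.describe_model(ProjectName=project, ModelVersion='1')['ModelDescription']` to see how the trained model performed before deciding whether to host it.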

#### Hosting the trained model

In [None]:
model_version='1'
min_inference_units=1 
 
print('Starting model version ' + model_version + ' for project ' + project )
response=client.start_model(ProjectName=project,
 ModelVersion=model_version,
 MinInferenceUnits=min_inference_units)
print('Status: ' + response['Status'])

#### Wait for model hosting to complete

In [None]:
# Poll the model status every 5 seconds until the model is hosted:
while client.describe_model(ProjectName=project, ModelVersion='1')['ModelDescription']['Status'] != 'HOSTED':
    print('.', end='')
    time.sleep(5)
print('Done!')

### Make Predictions


Making predictions via the boto3 SDK requires the project name, the model version, the content type, and a sample image. We use images stored locally on the SageMaker notebook instance.

If you would like to use a GUI-based solution to make predictions, refer to this demo: https://github.com/aws-samples/amazon-lookout-for-vision-demo

In [None]:
# Picking an anomalous image from the extra images
photo = 'circuitboard/extra_images/extra_images-anomaly_3.jpg'
model_version = '1'

with open(photo, 'rb') as image:
    response = client.detect_anomalies(ProjectName=project,
                                       ContentType='image/jpeg',
                                       Body=image.read(),
                                       ModelVersion=model_version)
print('Anomalous?: ' + str(response['DetectAnomalyResult']['IsAnomalous']))
print('Confidence: ' + str(response['DetectAnomalyResult']['Confidence']))

In [None]:
# Picking a normal image from the extra images
photo = 'circuitboard/extra_images/extra_images-normal_1.jpg'
model_version = '1'

with open(photo, 'rb') as image:
    response = client.detect_anomalies(ProjectName=project,
                                       ContentType='image/jpeg',
                                       Body=image.read(),
                                       ModelVersion=model_version)
print('Anomalous?: ' + str(response['DetectAnomalyResult']['IsAnomalous']))
print('Confidence: ' + str(response['DetectAnomalyResult']['Confidence']))
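
To score more than one image, the two cells above can be generalized into a loop. The following sketch runs `detect_anomalies` for every JPEG or PNG file in a folder. `detect_folder` and `CONTENT_TYPES` are hypothetical names, and `client` is assumed to be the `lookoutvision` boto3 client created at the start of this notebook.

```python
import os

# Map file extensions to the content type sent with each request.
CONTENT_TYPES = {".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".png": "image/png"}


def detect_folder(client, project, model_version, folder):
    """Call DetectAnomalies for each image in folder; return a list of dicts."""
    results = []
    for name in sorted(os.listdir(folder)):
        content_type = CONTENT_TYPES.get(os.path.splitext(name)[1].lower())
        if content_type is None:
            continue  # skip files that are not JPEG or PNG images
        with open(os.path.join(folder, name), "rb") as image:
            response = client.detect_anomalies(
                ProjectName=project,
                ModelVersion=model_version,
                ContentType=content_type,
                Body=image.read(),
            )
        result = response["DetectAnomalyResult"]
        results.append({
            "file": name,
            "anomalous": result["IsAnomalous"],
            "confidence": result["Confidence"],
        })
    return results
```

For example, `detect_folder(client, project, '1', 'circuitboard/extra_images')` would score all extra images in one pass. Each call is billed like a single prediction, so keep the folder small while experimenting.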

## Be frugal: stop the model

If you don't need your model anymore, please stop it to save costs!

In [None]:
# If you are not using the model, stop it to save costs!
model_version='1'

print('Stopping model version ' + model_version + ' for project ' + project )
response=client.stop_model(ProjectName=project,
 ModelVersion=model_version)
print('Status: ' + response['Status'])