# Counting dice with Computer Vision


![](https://user-images.githubusercontent.com/3716307/70001073-3ec5d000-1511-11ea-9b4f-42e14b6af1b7.png)

## SageMaker Ground Truth

You'll shortly step through the process of setting up a SageMaker Ground Truth labeling job, but first we need to upload our images to Amazon S3.

We have already loaded the images onto this SageMaker Notebook instance, and you can find them at `./data`.

Using the Amazon SageMaker SDK we can upload these images to the default bucket. See `session.upload_data` below.

In [None]:
import sagemaker

session = sagemaker.session.Session()
default_s3_bucket = 's3://{}'.format(session.default_bucket())
print('default_s3_bucket: {}'.format(default_s3_bucket))

In [None]:
training_images = session.upload_data('./data', key_prefix='vegas-dice-images')
print("Bucket for ground-truth Labeling: {}/images/".format(training_images))

You should now open the [AWS Management Console](https://console.aws.amazon.com/sagemaker/groundtruth?region=us-east-1#/labeling-jobs) and set up your SageMaker Ground Truth labeling job.

## Exploratory Data Analysis

In this section we are going to download and explore the data. The dataset has been labeled with Ground Truth and contains the bounding boxes for the dice present in each picture of the training dataset.

At the beginning of any ML project, it is recommended to get well acquainted with the format and type of data you are working with, both qualitatively and quantitatively.

In [None]:
!pip install gluoncv --pre -q

In [None]:
import glob
import json
import math
import os
import random
import time
import zipfile

import cv2
import gluoncv as gcv
import matplotlib.pyplot as plt
import mxnet as mx
import numpy as np

### Visualize the images

In [None]:
data_dir = 'data'
images_dir = os.path.join(data_dir, 'images')
train_images = glob.glob(images_dir + "/*")

In [None]:
print("We have {} images".format(len(train_images)))

Let's see what they look like. We use matplotlib to plot 36 images from the dataset to get a feel for them.

In [None]:
n_images = 36
cols = int(math.sqrt(n_images))
rows = int(math.ceil(n_images / cols))  # add_subplot expects integer grid sizes
fig = plt.figure(figsize=(15, 10))
for n, image_path in enumerate(train_images[:n_images]):
    image = plt.imread(image_path)
    ax = fig.add_subplot(rows, cols, n + 1)
    ax.imshow(image)
    ax.axis('off')
plt.subplots_adjust(wspace=0.06, hspace=0.06)
plt.show()

### Bounding boxes

We've included the `output.manifest` file from our completed SageMaker Ground Truth labeling job on all images.

Let's dig into the information in the manifest file. Each image contains one or more dice, and each line of `output.manifest` describes one image as a JSON record.

In [None]:
image_info = []
with open(os.path.join(data_dir, 'manifest', 'output.manifest')) as f:
    for line in f:
        line = line.strip()
        if line:  # skip blank lines; each remaining line is one JSON record
            image_info.append(json.loads(line))

For each image, we have the following information:

In [None]:
info = image_info[10]
task = 'dice-labeling'
info
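As a rough sketch of the record structure, the example below parses a single manifest line. The record is made up for illustration: the bucket name, file name, box values and class names are all assumptions, and only the keys that this notebook reads are included (a real Ground Truth record carries more metadata).

```python
import json

# Hypothetical manifest line: every value below is invented for illustration.
sample_line = json.dumps({
    "source-ref": "s3://example-bucket/vegas-dice-images/images/dice_001.jpg",
    "dice-labeling": {
        "annotations": [
            {"class_id": 2, "left": 120, "top": 80, "width": 60, "height": 58},
            {"class_id": 5, "left": 300, "top": 210, "width": 62, "height": 61},
        ]
    },
    "dice-labeling-metadata": {
        # Assumed class names; the real ones come from the labeling job.
        "class-map": {"0": "1", "1": "2", "2": "3", "3": "4", "4": "5", "5": "6"}
    },
}) + "\n"

sample_task = "dice-labeling"
record = json.loads(sample_line)  # json.loads tolerates the trailing newline

# The local file name is the last path component of the S3 URI.
file_name = record["source-ref"].split("/")[-1]
annotations = record[sample_task]["annotations"]
sample_class_map = record[sample_task + "-metadata"]["class-map"]

for box in annotations:
    print(file_name, sample_class_map[str(box["class_id"])], box["left"], box["top"])
```
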

We can look up the class name for each class index, as assigned by the SageMaker Ground Truth labeling job.

In [None]:
class_map = info[task+'-metadata']['class-map']
classes = [class_map[str(i)] for i in range(len(class_map))]
classes

We can read the data from this dictionary and use it to draw a bounding box around each die with the OpenCV library.

In [None]:
info = image_info[random.randint(0, len(image_info)-1)]
# Copy the array so OpenCV can draw on it even if imread returns a read-only buffer
image = plt.imread(os.path.join(images_dir, info['source-ref'].split('/')[-1])).copy()

In [None]:
boxes = info[task]['annotations']
for box in boxes:
    top_left = (int(box['left']), int(box['top']))
    bottom_right = (int(box['left'] + box['width']), int(box['top'] + box['height']))
    cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 3)
    # class_id is zero-based; the label drawn next to the box is class_id + 1
    cv2.putText(image, str(box['class_id'] + 1), bottom_right, cv2.FONT_HERSHEY_PLAIN, 3, (255, 0, 0), 3)

In [None]:
o = plt.imshow(image)
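The manifest stores each box as `left`, `top`, `width` and `height`, while drawing calls such as `cv2.rectangle` take two corner points. A minimal sketch of that conversion follows. The helper name `to_corners` is hypothetical and not part of the notebook's code.

```python
def to_corners(box):
    """Convert a {left, top, width, height} box to (xmin, ymin, xmax, ymax)."""
    xmin = box["left"]
    ymin = box["top"]
    xmax = box["left"] + box["width"]
    ymax = box["top"] + box["height"]
    return (xmin, ymin, xmax, ymax)


# Invented example box in the manifest's annotation format.
example_box = {"class_id": 0, "left": 10, "top": 20, "width": 30, "height": 40}
print(to_corners(example_box))
```
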

## Fine-tuning an object detection model

Now that we have explored the dataset, let's run a training job on SageMaker

In [None]:
import glob
import os
import re
import subprocess
import sys
import time
from time import gmtime, strftime

import boto3
import sagemaker
import numpy as np
from utils import get_execution_role

## 1) Running the job locally

It is good practice to first run the training job in local mode to make sure that training completes successfully. This gives a much faster feedback cycle than waiting for remote instances to be created.

### Configuring the environment

In [None]:
s3_output_path = '{}/'.format(default_s3_bucket)
print("S3 bucket path: {}".format(s3_output_path))

We first run it locally

In [None]:
instance_type = 'local' if mx.context.num_gpus() == 0 else 'local_gpu'

Make sure Docker is set up properly

In [None]:
!/bin/bash ./setup.sh

We get the SageMaker execution role

In [None]:
try:
    role = sagemaker.get_execution_role()
except Exception:
    # Fall back to the helper in utils.py if the SDK lookup fails
    role = get_execution_role()

print("Using IAM role arn: {}".format(role))

### Job definition

In [None]:
# create a descriptive job name 
job_name_prefix = 'hpo-dice-yolo'

Static hyperparameters

In [None]:
static_hyperparameters = {
    'epochs': 2
}

### Estimator

In [None]:
from sagemaker.mxnet.estimator import MXNet
estimator = MXNet(entry_point="src/train_yolo.py",
                  role=role,
                  train_instance_type=instance_type,
                  train_instance_count=1,
                  output_path=s3_output_path,
                  framework_version="1.4.1",
                  py_version='py3',
                  base_job_name=job_name_prefix,
                  hyperparameters=static_hyperparameters
                 )

Let's first run in local mode to make sure it is running properly, then we can run it remotely

In [None]:
estimator.fit({"train": training_images})

## 2) Hyperparameter tuning job

We are going to run a hyperparameter tuning job, which uses Gaussian processes to estimate the best combination of parameters. Try picking some ranges based on what you know about ML, and let the system find the best candidates for you.

We now pick a cloud instance and create a new estimator.

In [None]:
static_hyperparameters = {
    'epochs' : 50
}

In [None]:
instance_type = "ml.p3.2xlarge"
estimator = MXNet(entry_point="src/train_yolo.py",
                  role=role,
                  train_instance_type=instance_type,
                  train_instance_count=1,
                  output_path=s3_output_path,
                  framework_version="1.4.1",
                  py_version='py3',
                  base_job_name=job_name_prefix,
                  hyperparameters=static_hyperparameters
                 )

### Metrics
We define the metrics that we are going to track:
- the current running validation Mean Average Precision: `run_validation_mAP`
- the final best validation Mean Average Precision: `validation_mAP`

<img src="https://www.pyimagesearch.com/wp-content/uploads/2016/09/iou_equation.png"  width=400>
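Mean Average Precision builds on the Intersection over Union (IoU) shown above: a predicted box typically counts as a match only when its IoU with a ground-truth box exceeds a chosen threshold. The sketch below computes IoU for two boxes in `(xmin, ymin, xmax, ymax)` form. The function name and box format are illustrative assumptions, not the code used by the training script.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    inter_xmin = max(box_a[0], box_b[0])
    inter_ymin = max(box_a[1], box_b[1])
    inter_xmax = min(box_a[2], box_b[2])
    inter_ymax = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, inter_xmax - inter_xmin)
    inter_h = max(0.0, inter_ymax - inter_ymin)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap area 1, union area 7
```
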

In [None]:
metric_definitions = [
    {'Name': 'validation_mAP', 'Regex': 'best mAP ([-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?)'},
    {'Name': 'run_validation_mAP', 'Regex': 'running mAP ([-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?)'}]
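Each `Regex` is matched against the training job's log output, and the parenthesized number is the value recorded for the metric. The patterns can be checked locally before launching a job. The sample log lines below are hypothetical, since the exact lines printed by `train_yolo.py` are not shown here.

```python
import re

# Same number pattern as in metric_definitions: optional sign, digits,
# optional decimal point and optional exponent.
number = r'([-+]?[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?)'
patterns = {
    'validation_mAP': 'best mAP ' + number,
    'run_validation_mAP': 'running mAP ' + number,
}

# Hypothetical log lines, assumed only for this check.
sample_log = [
    "epoch 3 running mAP 0.5120",
    "epoch 3 best mAP 0.8731",
]

extracted = {}
for line in sample_log:
    for name, pattern in patterns.items():
        match = re.search(pattern, line)
        if match:
            extracted[name] = float(match.group(1))

print(extracted)
```
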

### HPO job parameters

In [None]:
from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner

# The hyperparameters we're going to tune
hyperparameter_ranges = {
    'lr': ContinuousParameter(0.0001, 0.002), # learning rate: how much the model learns from the current iteration ( < 0.01 )
    'batch_size': IntegerParameter(2, 8), # batch size: how many pictures in each learning iteration ( > 1 )
    'lr_factor': ContinuousParameter(0.3, 1), # learning rate factor: how much to multiply the learning rate by after 2/3 of training ( 0 < x < 1 )
    'wd': ContinuousParameter(0.00001, 0.00005), # weight decay: regularization to force small weights ( < 0.001 )
    'class_factor': ContinuousParameter(2, 8), # class factor: how much to weigh getting the right class vs finding objects ( > 1 )
    'model': CategoricalParameter(["yolo3_darknet53_coco", "yolo3_mobilenet1.0_coco"]),
}
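To make `lr` and `lr_factor` concrete, here is one way a step schedule matching the description "multiply the learning rate after 2/3 of training" could look. This is an assumption about the behavior, not the actual logic in `src/train_yolo.py`, and the function name is hypothetical.

```python
def learning_rate_at(epoch, total_epochs, lr, lr_factor):
    """Return lr for the first 2/3 of training, then lr * lr_factor."""
    decay_epoch = (2 * total_epochs) // 3  # first epoch that uses the reduced rate
    return lr if epoch < decay_epoch else lr * lr_factor


# With 50 epochs, the rate drops from epoch 33 onwards.
for epoch in (0, 32, 33, 49):
    print(epoch, learning_rate_at(epoch, total_epochs=50, lr=0.001, lr_factor=0.5))
```
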

We are running a total of 4 jobs, 2 in parallel. This is too few to really appreciate the power of Bayesian sampling of the hyperparameters, but given the time and budget constraints, it is an acceptable compromise.

In [None]:
max_jobs = 4
max_parallel_jobs = 2

We create the tuner object, which will `Maximize` the `validation_mAP` metric across training runs by picking candidate parameter sets from the ranges provided.

In [None]:
tuner = HyperparameterTuner(estimator,
                            objective_metric_name='validation_mAP',
                            objective_type='Maximize',
                            hyperparameter_ranges=hyperparameter_ranges,
                            metric_definitions=metric_definitions,
                            max_jobs=max_jobs,
                            max_parallel_jobs=max_parallel_jobs,
                            base_tuning_job_name=job_name_prefix
                           )
tuner.fit({"train":training_images})

In [None]:
job_name = tuner.latest_tuning_job.job_name
print("Tuning job: %s" % job_name)

In [None]:
print("You can monitor the progress of your jobs here: https://us-east-1.console.aws.amazon.com/sagemaker/home?region=us-east-1#/hyper-tuning-jobs/{}".format(job_name))

In [None]:
!pygmentize src/train_yolo.py

<h1 style="color: red">Stop Here!</h1> 

Now go back up to <a href="#Estimator">here</a> and we'll explain what just happened :)

**Continue after this point when at least one job has completed in the HPO job**

Now that you have a good handle on what happened, let's see if we can deploy the best model from our HPO job

In [None]:
best_job = tuner.best_training_job()
best_job

## Deployment

Deploy the best tuning job

In [None]:
estimator_best_job = estimator.attach(best_job, session)

We deploy the best tuning job on a cluster of one CPU instance

In [None]:
deployed_model = estimator_best_job.deploy(1, 'ml.c5.4xlarge')

Predict bounding boxes

In [None]:
x, image = gcv.data.transforms.presets.yolo.load_test('test.jpg', short=384)
output = deployed_model.predict(image)

Visualize the result

In [None]:
cid = np.array(output['cid'])
scores = np.array(output['score'])
bbox = np.array(output['bbox'])


o = gcv.utils.viz.plot_bbox(image, bbox, scores, cid, class_names=classes)

## Running inference with the webcam

We are going to send images from the browser to the endpoint and back.

This works **ONLY** in Jupyter Notebook, **NOT** in JupyterLab.


1) START: Capture an image in JavaScript through the webcam in the browser

2) Convert that image to base64 and send it over to the Python kernel

3) Convert the image back to numpy and send it over to the SageMaker endpoint

4) Get the predicted bounding boxes and paint them over the numpy image

5) Convert the annotated image back to base64 and send it to the JavaScript

6) Display the annotated frame

7) GOTO START

In [None]:
import base64
from io import BytesIO
from PIL import Image
from utils import show_webcam
def get_annotated_image(input_image_b64):
    prefix, input_image_b64 = input_image_b64.split(',')
    input_image_binary = BytesIO(base64.b64decode(input_image_b64))
    input_image_np = np.asarray(Image.open(input_image_binary))
    input_image_np, _ = mx.image.center_crop(mx.nd.array(input_image_np), (512,384))
    _, input_image_loaded = gcv.data.transforms.presets.yolo.transform_test(input_image_np, short=384)
    output = deployed_model.predict(input_image_loaded)
    cid = np.array(output['cid'])
    scores = np.array(output['score'])
    bbox = np.array(output['bbox'])
    output_image_np = gcv.utils.viz.cv_plot_bbox(input_image_loaded, bbox, scores, cid, class_names=classes)
    output_image_PIL = Image.fromarray(output_image_np)
    output_buffer = BytesIO()
    output_image_PIL.save(output_buffer, format="JPEG")
    output_image_b64 = 'data:image/jpeg;base64,'+base64.b64encode(output_buffer.getvalue()).decode("utf-8")
    print(output_image_b64)

show_webcam()
#display() #uncomment this command if you want to remove the webcam
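Steps 2, 3 and 5 above hinge on moving image bytes through a base64 data URL. That round trip can be sketched with the standard library alone. The byte string below is a stand-in for real JPEG data, and the helper names are hypothetical.

```python
import base64

DATA_URL_PREFIX = "data:image/jpeg;base64"


def to_data_url(image_bytes):
    """Encode raw image bytes as a data URL string (step 5)."""
    return DATA_URL_PREFIX + "," + base64.b64encode(image_bytes).decode("utf-8")


def from_data_url(data_url):
    """Drop the 'data:...;base64' prefix and decode the payload (steps 2 and 3)."""
    _prefix, payload = data_url.split(",", 1)
    return base64.b64decode(payload)


# Placeholder bytes, not a decodable image.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image \xff\xd9"
data_url = to_data_url(fake_jpeg)
round_tripped = from_data_url(data_url)
print(data_url[:30], round_tripped == fake_jpeg)
```
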

## Compilation

We can compile the model with SageMaker Neo for faster runtime on specific hardware platforms, in the cloud or at the edge.

In [None]:
compiled_model = estimator_best_job.compile_model('ml_c5', {'data' : (1, 3, 384, 512)}, s3_output_path, framework='mxnet', framework_version='1.4.1')
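The `data` shape passed to the compiler is in NCHW order (batch, channels, height, width), which is why the 512 x 384 (width x height) crop used for the webcam frames appears as `(1, 3, 384, 512)`. A tiny sketch of that mapping, using a hypothetical helper name:

```python
def nchw_shape(width, height, channels=3, batch=1):
    """Build an NCHW input shape tuple from image dimensions."""
    return (batch, channels, height, width)


input_shape = {"data": nchw_shape(width=512, height=384)}
print(input_shape)
```
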

You can find the compiled model on S3. Refer to the [SageMaker documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html) for deployment instructions.