# GluonCV YoloV3 training and optimizing using Neo


---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

---


1. [Introduction](#Introduction)
2. [Setup](#Setup)
3. [Data Preparation](#Data-Preparation)
    1. [Download data](#Download-data)
    2. [Convert data into RecordIO](#Convert-data-into-RecordIO)
    3. [Upload data to S3](#Upload-data-to-S3)
4. [Training](#Training)
5. [Compile the trained model using SageMaker Neo](#Compile-the-trained-model-using-SageMaker-Neo)
6. [Deploy the compiled model and request Inferences](#Deploy-the-compiled-model-and-request-Inferences)
7. [Delete the Endpoint](#Delete-the-Endpoint)

## Introduction

This is an end-to-end example of training a GluonCV YoloV3 model in an Amazon SageMaker notebook and then compiling the trained model with SageMaker Neo. In this demo, we will demonstrate how to train and host an MXNet model on the [Pascal VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) using the YoloV3 algorithm. We will also demonstrate how to optimize the trained model using Neo.

***This notebook is for demonstration purposes only. Please fine-tune the training parameters based on your own dataset.***

## Setup

To train the YoloV3 MXNet model on Amazon SageMaker, we need to set up and authenticate the use of AWS services.

To start, we need to upgrade the [SageMaker SDK for Python](https://sagemaker.readthedocs.io/en/stable/v2.html) to v2.33.0 or greater and restart the kernel.

In [None]:
!~/anaconda3/envs/mxnet_p36/bin/pip install --upgrade 'sagemaker>=2.33.0'
!~/anaconda3/envs/mxnet_p36/bin/pip install --upgrade opencv-python
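
After restarting the kernel, it can be useful to confirm that the upgrade took effect. The helper below is a minimal sketch, not part of the SageMaker SDK: `version_tuple` and `meets_minimum` are hypothetical names, and the comparison only looks at the first three numeric components of the version string.

```python
import re


def version_tuple(version, width=3):
    """Turn a dotted version string such as '2.33.0' into a tuple of ints."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:width]]
    # Pad short versions such as '2.33' with zeros so tuples compare cleanly.
    return tuple(numbers + [0] * (width - len(numbers)))


def meets_minimum(installed, minimum="2.33.0"):
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


# Numeric comparison avoids the pitfall of comparing version strings directly,
# where "2.9.2" would sort after "2.33.0".
print(meets_minimum("2.9.2"))
print(meets_minimum("2.100.1"))
```

In the notebook, this could be called as `meets_minimum(sagemaker.__version__)` once `sagemaker` has been imported.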

Then we need an AWS account role with SageMaker access. This role is used to give SageMaker access to your data in S3. We also create a session.

In [None]:
import sagemaker
from sagemaker import get_execution_role

role = get_execution_role()
sess = sagemaker.Session()

We also need the S3 bucket that is used for the training data and for storing the trained model artifacts.

In [None]:
bucket = sess.default_bucket()
folder = "DEMO-ObjectDetection-YOLOv3-MXNet"
custom_code_sub_folder = folder + "/custom-code"
training_data_sub_folder = folder + "/training-data"
validation_data_sub_folder = folder + "/validation-data"
training_output_sub_folder = folder + "/training-output"
compilation_output_sub_folder = folder + "/compilation-output"

To easily visualize the detection outputs, we also define the following function. It draws a bounding box for each high-confidence prediction and skips low-confidence detections.

In [None]:
%matplotlib inline
def visualize_detection(img_file, dets, classes=(), thresh=0.6):
    """
    visualize detections in one image
    Parameters:
    ----------
    img_file : str
        path to the image file
    dets : sequence of three arrays
        yolo detections as (class ids, scores, bounding boxes), each with a
        leading batch dimension; box coordinates are in the 320x320 input space
    classes : tuple or list of str
        class names
    thresh : float
        score threshold
    """
    import random
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    from matplotlib.patches import Rectangle

    img = mpimg.imread(img_file)
    plt.imshow(img)
    height = img.shape[0]
    width = img.shape[1]
    colors = dict()
    # Take the first (and only) item of the batch from each output
    klasses = dets[0][0]
    scores = dets[1][0]
    bbox = dets[2][0]
    # Iterate over the detections, not over the class names
    for i in range(len(klasses)):
        klass = klasses[i][0]
        score = scores[i][0]
        x0, y0, x1, y1 = bbox[i]
        if score < thresh:
            continue
        cls_id = int(klass)
        if cls_id not in colors:
            colors[cls_id] = (random.random(), random.random(), random.random())
        # Rescale the box from the 320x320 model input to the original image size
        xmin = int(x0 * width / 320)
        ymin = int(y0 * height / 320)
        xmax = int(x1 * width / 320)
        ymax = int(y1 * height / 320)
        rect = Rectangle(
            (xmin, ymin),
            xmax - xmin,
            ymax - ymin,
            fill=False,
            edgecolor=colors[cls_id],
            linewidth=3.5,
        )
        plt.gca().add_patch(rect)
        class_name = str(cls_id)
        if classes and len(classes) > cls_id:
            class_name = classes[cls_id]
        plt.gca().text(
            xmin,
            ymin - 2,
            "{:s} {:.3f}".format(class_name, score),
            bbox=dict(facecolor=colors[cls_id], alpha=0.5),
            fontsize=12,
            color="white",
        )
    plt.tight_layout(rect=[0, 0, 2, 2])
    plt.show()
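
The visualization function above maps box coordinates from the 320x320 model input back to the size of the original image. The following is a small standalone sketch of that rescaling step; `rescale_box` is a hypothetical helper introduced only for illustration, and the 640x480 image size is made up.

```python
def rescale_box(box, image_width, image_height, model_size=320):
    """Map a box from model-input coordinates to original-image pixel coordinates."""
    x0, y0, x1, y1 = box
    return (
        int(x0 * image_width / model_size),
        int(y0 * image_height / model_size),
        int(x1 * image_width / model_size),
        int(y1 * image_height / model_size),
    )


# A box covering the lower-right region of the 320x320 input,
# mapped onto a hypothetical 640x480 image.
print(rescale_box((80, 160, 240, 320), image_width=640, image_height=480))
```

Because width and height are scaled independently, a box that is square in model space is generally not square in the original image.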

In [None]:
# Initializing object categories
object_categories = [
 "aeroplane",
 "bicycle",
 "bird",
 "boat",
 "bottle",
 "bus",
 "car",
 "cat",
 "chair",
 "cow",
 "diningtable",
 "dog",
 "horse",
 "motorbike",
 "person",
 "pottedplant",
 "sheep",
 "sofa",
 "train",
 "tvmonitor",
]

# Setting a threshold 0.02 will only plot detection results that have a confidence score greater than 0.02
threshold = 0.02
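
To see what the threshold does without drawing anything, the sketch below filters a made-up set of detections with NumPy. It assumes the same layout that `visualize_detection` indexes into (three arrays for class ids, scores, and boxes, each with a leading batch dimension); the values and the `filter_detections` helper are hypothetical.

```python
import numpy as np

# Hypothetical detections for one image: three candidate objects.
class_ids = np.array([[[14.0], [11.0], [6.0]]])  # shape (1, 3, 1)
scores = np.array([[[0.91], [0.40], [0.01]]])  # shape (1, 3, 1)
boxes = np.array(
    [[[10, 20, 110, 220], [50, 60, 150, 160], [0, 0, 5, 5]]], dtype=float
)  # shape (1, 3, 4)


def filter_detections(dets, thresh):
    """Keep only the detections whose score is at least `thresh`."""
    ids, confidences, bboxes = (np.asarray(a)[0] for a in dets)
    keep = confidences[:, 0] >= thresh
    return ids[keep, 0].astype(int), confidences[keep, 0], bboxes[keep]


ids, confidences, bboxes = filter_detections([class_ids, scores, boxes], thresh=0.02)
print(ids, confidences, bboxes.shape)
```

With a threshold as low as 0.02 most candidates are kept; raising it toward the function's default of 0.6 leaves only the most confident boxes.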

Finally, we load the test image into memory. The test image used in this notebook is from [PEXELS](https://www.pexels.com/) and remains unseen by the model until prediction time.

In [None]:
import PIL.Image
import numpy as np

test_file = "test.jpg"
test_image = PIL.Image.open(test_file)
test_image = np.asarray(test_image.resize((320, 320)))

## Data Preparation
[Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) was a popular computer vision challenge that released annual datasets for object detection from 2005 to 2012. In this notebook, we use the datasets from 2007 and 2012, referred to as VOC07 and VOC12 respectively. Together they contain more than 20,000 images with about 50,000 annotated objects, grouped into 20 categories.

***Notes:***
1. While using the Pascal VOC dataset, please be aware of the database usage rights. The VOC data includes images obtained from Flickr's website. Use of these images must respect the corresponding terms of use: https://www.flickr.com/help/terms
2. The default EBS volume size for SageMaker Notebook instances is 5GB. If you run out of storage while performing this step, consider increasing the volume size. One way to do so is by using the AWS CLI as documented [here](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/sagemaker/update-notebook-instance.html).

### Download data
Download the Pascal VOC 2007 and 2012 datasets. The cell below fetches a single archive into `/tmp` and extracts it to `VOCdevkit`, skipping each step if its output already exists.

In [None]:
%%time
# Download and extract the datasets
![ ! -f /tmp/pascal-voc.tgz ] && { wget -P /tmp https://s3.amazonaws.com/fast-ai-imagelocal/pascal-voc.tgz; }
![ ! -d VOCdevkit ] && { tar -xf /tmp/pascal-voc.tgz --no-same-owner; mv pascal-voc VOCdevkit; }
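
The shell guards above make the cell safe to re-run. For readers who prefer Python, the extraction guard can be sketched with the standard library as follows. This is an illustration only: `extract_once` is a hypothetical helper, it covers the extract-and-rename step but not the download, and the example builds a tiny throwaway archive instead of using the real dataset.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_once(archive, extracted_name, target_name, workdir="."):
    """Extract `archive` into `workdir` and rename the result, unless the target exists."""
    workdir = Path(workdir)
    target = workdir / target_name
    if target.is_dir():
        return False  # already extracted, nothing to do
    with tarfile.open(archive) as tar:
        tar.extractall(workdir)
    (workdir / extracted_name).rename(target)
    return True


# Demonstrate on a small archive created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "src" / "pascal-voc"
    source.mkdir(parents=True)
    (source / "readme.txt").write_text("demo")
    archive = Path(tmp) / "pascal-voc.tgz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname="pascal-voc")

    out = Path(tmp) / "out"
    out.mkdir()
    print(extract_once(archive, "pascal-voc", "VOCdevkit", out))  # first call extracts
    print(extract_once(archive, "pascal-voc", "VOCdevkit", out))  # second call is skipped
```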

### Convert data into RecordIO
[RecordIO](https://mxnet.incubator.apache.org/architecture/note_data_loading.html) is a highly efficient binary data format from [MXNet](https://mxnet.incubator.apache.org/). With this format, the dataset is simple to prepare and to transfer to the instance that will run the training job.

In [None]:
!python tools/prepare_dataset.py --dataset pascal --year 2007,2012 --set trainval --target VOCdevkit/train.lst
!python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target VOCdevkit/val.lst --no-shuffle
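
Before uploading, it can be worth checking that the conversion produced the files that the upload step expects (`train.rec`, `train.idx`, `val.rec`, `val.idx`). The following is a minimal sketch using only the standard library; `missing_outputs` is a hypothetical helper, and the example runs against a temporary directory rather than `VOCdevkit`.

```python
import tempfile
from pathlib import Path


def missing_outputs(directory, stems=("train", "val"), suffixes=(".rec", ".idx")):
    """Return the expected RecordIO files that are not present in `directory`."""
    directory = Path(directory)
    expected = [directory / (stem + suffix) for stem in stems for suffix in suffixes]
    return [str(path) for path in expected if not path.is_file()]


# Simulate a partially completed conversion: only the training files exist.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train.rec").touch()
    (Path(tmp) / "train.idx").touch()
    print([Path(p).name for p in missing_outputs(tmp)])
```

In the notebook, `missing_outputs("VOCdevkit")` returning an empty list would indicate that all four files are in place.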

### Upload data to S3
Upload the data to the S3 bucket. 

In [None]:
# Upload the RecordIO files to train and validation channels
sess.upload_data(path="VOCdevkit/train.rec", bucket=bucket, key_prefix=training_data_sub_folder)
sess.upload_data(path="VOCdevkit/train.idx", bucket=bucket, key_prefix=training_data_sub_folder)

sess.upload_data(path="VOCdevkit/val.rec", bucket=bucket, key_prefix=validation_data_sub_folder)
sess.upload_data(path="VOCdevkit/val.idx", bucket=bucket, key_prefix=validation_data_sub_folder)

Next, we need to set up the training and compilation output locations in S3, where the respective model artifacts will be saved. We also set up the S3 locations for the training data, the validation data, and the custom code.

In [None]:
# S3 Location where the training data is stored in the previous step
s3_training_data_location = "s3://{}/{}".format(bucket, training_data_sub_folder)

# S3 Location to save the model artifact after training
s3_training_output_location = "s3://{}/{}".format(bucket, training_output_sub_folder)

# S3 Location where the validation data is stored in the previous step
s3_validation_data_location = "s3://{}/{}".format(bucket, validation_data_sub_folder)

# S3 Location to save the model artifact after compilation
s3_compilation_output_location = "s3://{}/{}".format(bucket, compilation_output_sub_folder)

# S3 Location to save your custom code in tar.gz format
s3_custom_code_upload_location = "s3://{}/{}".format(bucket, custom_code_sub_folder)

## Training
Now that the setup is complete, we are ready to train our object detector. To begin, let us create a ``sagemaker.mxnet.MXNet`` estimator object, which will launch the training job. The training job may take some time to complete; to make it faster, `num-epochs` can be reduced.

In [None]:
from sagemaker.mxnet import MXNet

yolo_estimator = MXNet(
 entry_point="train_yolo.py",
 role=role,
 output_path=s3_training_output_location,
 code_location=s3_custom_code_upload_location,
 instance_count=1,
 instance_type="ml.p3.16xlarge",
 framework_version="1.8.0",
 py_version="py37",
 hyperparameters={
 "num-epochs": 10,
 "data-shape": 320,
 "gpus": "0,1,2,3,4,5,6,7",
 "network": "mobilenet1.0",
 },
)
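
The `gpus` hyperparameter above lists eight device indices, which matches the eight GPUs of an `ml.p3.16xlarge` instance. If you switch to a smaller instance type, the list has to shrink accordingly. The sketch below derives the string from the instance type; the `GPUS_PER_INSTANCE` table and `gpu_list` helper are hypothetical, and it assumes that `train_yolo.py` reads `gpus` as a comma-separated list of device indices, as the value above suggests.

```python
# GPU counts for the P3 instance sizes (illustrative lookup table).
GPUS_PER_INSTANCE = {
    "ml.p3.2xlarge": 1,
    "ml.p3.8xlarge": 4,
    "ml.p3.16xlarge": 8,
}


def gpu_list(instance_type):
    """Build a comma-separated list of GPU indices for the given instance type."""
    count = GPUS_PER_INSTANCE[instance_type]
    return ",".join(str(index) for index in range(count))


print(gpu_list("ml.p3.16xlarge"))
print(gpu_list("ml.p3.2xlarge"))
```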

In [None]:
yolo_estimator.fit({"train": s3_training_data_location, "val": s3_validation_data_location})

## Compile the trained model using SageMaker Neo

After training, we can call the estimator's ``compile_model()`` method to compile the trained model with SageMaker Neo. The caller must provide the correct input shapes required by the model for compilation to succeed. We also specify the target instance family, our IAM execution role, and the S3 location where the compiled model will be stored, and we set the ``MMS_DEFAULT_RESPONSE_TIMEOUT`` environment variable to 500.

For this example, we will choose `ml_p3` as the target instance family while compiling the trained model. 

In [None]:
%%time
compiled_model = yolo_estimator.compile_model(
 target_instance_family="ml_p3",
 input_shape={"data": [1, 3, 320, 320]},
 role=role,
 output_path=s3_compilation_output_location,
 framework="mxnet",
 framework_version="1.8",
 env={"MMS_DEFAULT_RESPONSE_TIMEOUT": "500"},
)
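
The `input_shape` of `[1, 3, 320, 320]` describes a batch of one image in channels-first (NCHW) layout, whereas an image loaded with PIL and converted with `np.asarray` is height x width x channels. The sketch below only illustrates how the two layouts relate; `to_nchw` is a hypothetical helper, and whether this conversion happens on the client or inside the endpoint's inference code is not shown in this notebook.

```python
import numpy as np


def to_nchw(image_hwc):
    """Convert one HWC image into a batch of one image in NCHW layout."""
    chw = np.transpose(image_hwc, (2, 0, 1))  # channels first
    return np.expand_dims(chw, axis=0)  # add the batch dimension


# Stand-in for a resized 320x320 RGB image.
image = np.zeros((320, 320, 3), dtype=np.uint8)
image[5, 7, 2] = 9  # row 5, column 7, third channel

batch = to_nchw(image)
print(batch.shape)
print(batch[0, 2, 5, 7])
```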

## Deploy the compiled model and request Inferences

We have to deploy the compiled model on an instance type from the family it was compiled for. Since we compiled for `ml_p3`, we can deploy to any `ml.p3` instance type. For this example we choose `ml.p3.2xlarge`.

In [None]:
%%time
neo_object_detector = compiled_model.deploy(initial_instance_count=1, instance_type="ml.p3.2xlarge")

In [None]:
%%time
response = neo_object_detector.predict(test_image)

In [None]:
# Visualize the detections.
visualize_detection(test_file, response, object_categories, threshold)

## Delete the Endpoint
A running endpoint incurs costs. Therefore, as a clean-up step, we delete the endpoint.

In [None]:
print("Endpoint name: " + neo_object_detector.endpoint_name)
neo_object_detector.delete_endpoint()
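
If a prediction cell fails, it is easy to forget this clean-up step and leave the endpoint running. One way to guard against that in a script is a `try`/`finally` block, sketched below. `FakePredictor` is a hypothetical stand-in that makes no AWS calls; it only mimics the `predict`, `delete_endpoint`, and `endpoint_name` members used in this notebook.

```python
class FakePredictor:
    """Hypothetical stand-in for a deployed predictor; it makes no AWS calls."""

    def __init__(self):
        self.endpoint_name = "demo-endpoint"
        self.deleted = False

    def predict(self, data):
        raise RuntimeError("simulated inference failure")

    def delete_endpoint(self):
        self.deleted = True


def predict_then_cleanup(predictor, payload):
    """Run one prediction and always delete the endpoint afterwards."""
    try:
        return predictor.predict(payload)
    finally:
        # Runs whether predict() returned normally or raised.
        predictor.delete_endpoint()


predictor = FakePredictor()
try:
    predict_then_cleanup(predictor, payload=None)
except RuntimeError as err:
    print("prediction failed:", err)
print("endpoint deleted:", predictor.deleted)
```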

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker_neo_compilation_jobs|gluoncv_yolo|gluoncv_yolo_neo.ipynb)
