{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Amazon SageMaker Object Detection using the RecordIO format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "Object detection is the process of identifying and localizing objects in an image. A typical object detection solution takes in an image as input and provides a bounding box on the image where a object of interest is along with identifying what object the box encapsulates. But before we have this solution, we need to acquire and process a traning dataset, create and setup a training job for the alorithm so that the aglorithm can learn about the dataset and then host the algorithm as an endpoint, to which we can supply the query image.\n",
    "\n",
    "This notebook is an end-to-end example introducing the Amazon SageMaker Object Detection algorithm. In this demo, we will demonstrate how to train and to host an object detection model on the [Pascal VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/) using the Single Shot multibox Detector ([SSD](https://arxiv.org/abs/1512.02325)) algorithm. In doing so, we will also demonstrate how to construct a training dataset using the RecordIO format as this is the format that the training job will consume. We will also demonstrate how to host and validate this trained model. Amazon SageMaker Object Detection also allow training with the image and JSON format, which is illustrated in the [image and JSON Notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_image_json_format.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "from IPython.display import clear_output\n",
    "clear_output()"
   ],
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   }
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "To train the Object Detection algorithm on Amazon SageMaker, we need to setup and authenticate the use of AWS services. To begin with we need an AWS account role with SageMaker access. This role is used to give SageMaker access to your data in S3 will automatically be obtained from the role used to start the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "import boto3\n",
    "import time\n",
    "import string\n",
    "import sagemaker\n",
    "import random\n",
    "from sagemaker import get_execution_role\n",
    "\n",
    "role = get_execution_role()\n",
    "print(role)\n",
    "sess = sagemaker.Session()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also need the S3 bucket that you want to use for training and to store the tranied model artifacts. In this notebook, we require a custom bucket that exists so as to keep the naming clean. You can end up using a default bucket that SageMaker comes with as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bucket = '<Specify the bucket to use model training>'\n",
    "prefix = 'object-detection-training'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lastly, we need the Amazon SageMaker Object Detection docker image, which is static and need not be changed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.amazon.amazon_estimator import get_image_uri\n",
    "\n",
    "training_image = get_image_uri(sess.boto_region_name, 'object-detection', repo_version=\"latest\")\n",
    "print (training_image)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Upload\n",
    "\n",
    "Let's list the content of the generated dataset RecordIO file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!head -n 3 RecordIO/train.lst > example.lst\n",
    "f = open('example.lst','r')\n",
    "lst_content = f.read()\n",
    "print(lst_content)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As can be seen that each line in the .lst file represents the annotations for a image. A .lst file is a **tab**-delimited file with multiple columns. The rows of the file are annotations of the of image files. The first column specifies a unique image index. The second column specifies the header size of the current row. In the above example .lst file, 2 from the second column means the second and third columns are header information, which will not be considered as label and bounding box information of the image specified by the current row.\n",
    "\n",
    "The third column specifies the label width of a single object. In the first row of above sample .lst file, 5 from the third row means each object within an image will have 5 numbers to describe its label information, including class index, and bounding box coordinates. If there are multiple objects within one image, all the label information should be listed in one line. The annotation information for each object is represented as ``[class_index, xmin, ymin, xmax, ymax]``. \n",
    "\n",
    "The classes should be labeled with successive numbers and start with 0. The bounding box coordinates are ratios of its top-left (xmin, ymin) and bottom-right (xmax, ymax) corner indices to the overall image size. Note that the top-left corner of the entire image is the origin (0, 0). The last column specifies the relative path of the image file.\n",
    "\n",
    "After generating the .lst file, the RecordIO can be created by running the following command:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Upload data to S3\n",
    "Upload the data to the S3 bucket. We do this in multiple channels. Channels are simply directories in the bucket that differentiate between training and validation data. Let us simply call these directories `train` and `validation`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "# Upload the RecordIO files to train and validation channels\n",
    "train_channel = prefix + '/train'\n",
    "validation_channel = prefix + '/validation'\n",
    "\n",
    "sess.upload_data(path='RecordIO/train.rec', bucket=bucket, key_prefix=train_channel)\n",
    "sess.upload_data(path='RecordIO/val.rec', bucket=bucket, key_prefix=validation_channel)\n",
    "\n",
    "s3_train_data = 's3://{}/{}'.format(bucket, train_channel)\n",
    "s3_validation_data = 's3://{}/{}'.format(bucket, validation_channel)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we need to setup an output location at S3, where the model artifact will be dumped. These artifacts are also the output of the algorithm's traning job."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "s3_input_train = 's3://{}/{}'.format(bucket, train_channel)\n",
    "s3_input_validation = 's3://{}/{}'.format(bucket, validation_channel)\n",
    "s3_output_location = 's3://{}/{}/output'.format(bucket, prefix)\n",
    "print(s3_input_train)\n",
    "print(s3_input_validation)\n",
    "print(s3_output_location)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hyperparameter Tuning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The object detection algorithm at its core is the [Single-Shot Multi-Box detection algorithm (SSD)](https://arxiv.org/abs/1512.02325). This algorithm uses a `base_network`, which is typically a [VGG](https://arxiv.org/abs/1409.1556) or a [ResNet](https://arxiv.org/abs/1512.03385). The Amazon SageMaker object detection algorithm supports VGG-16 and ResNet-50 now. It also has a lot of options for hyperparameters that help configure the training job. The next step in our training, is to setup these hyperparameters and data channels for training the model. Consider the following example definition of hyperparameters. See the SageMaker Object Detection [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html) for more details on the hyperparameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Parameters\n",
    "\n",
    "## Number of Classes\n",
    "num_classes = 2\n",
    "\n",
    "## Number of training samples (80% total)\n",
    "num_training_samples = 1600\n",
    "\n",
    "## Instance type\n",
    "instance_type = 'ml.p3.8xlarge'\n",
    "\n",
    "## Instance count per job\n",
    "instance_count_per_job = 1\n",
    "\n",
    "## Learn Scheduler Step\n",
    "lr_scheduler_step = '20,40'\n",
    "\n",
    "## Max HyperParameter Training Epochs\n",
    "ht_epochs = 60\n",
    "\n",
    "## Max Final Training Epochs\n",
    "final_epochs = 240\n",
    "\n",
    "## Max HyperParameter Training jobs\n",
    "max_ht_training_jobs = 10\n",
    "\n",
    "## Max Parallel jobs\n",
    "max_parallel_jobs = 1\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we are done with all the setup that is needed, and also uploaded the files to S3, we are ready to start training, but the objective is get the best train from this dataset. To do it we have to run a Hyperparameter Tunning Job.\n",
    "\n",
    "Follow the next steps to do it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Data channels\n",
    "train_data = sagemaker.session.s3_input(s3_input_train, distribution='FullyReplicated', \n",
    "                        content_type='application/x-recordio', s3_data_type='S3Prefix')\n",
    "validation_data = sagemaker.session.s3_input(s3_input_validation, distribution='FullyReplicated', \n",
    "                             content_type='application/x-recordio', s3_data_type='S3Prefix')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner\n",
    "\n",
    "# Base training estimator\n",
    "od = sagemaker.estimator.Estimator(training_image,\n",
    "                                   role, \n",
    "                                   train_instance_count = instance_count_per_job, \n",
    "                                   train_instance_type = instance_type,\n",
    "                                   train_volume_size = 50,\n",
    "                                   train_max_run = 360000,\n",
    "                                   input_mode= 'File',\n",
    "                                   output_path = s3_output_location,\n",
    "                                   sagemaker_session = sess,\n",
    "                                   base_job_name = 'smart-cooler')\n",
    "# Fixed hyperparameters\n",
    "od.set_hyperparameters(base_network = 'resnet-50',\n",
    "                       use_pretrained_model = 1,\n",
    "                       early_stopping = True,\n",
    "                       num_classes = num_classes,\n",
    "                       num_training_samples = num_training_samples,\n",
    "                       epochs = ht_epochs,\n",
    "                       lr_scheduler_step = lr_scheduler_step,\n",
    "                       lr_scheduler_factor = 0.1,\n",
    "                       overlap_threshold = 0.5,\n",
    "                       nms_threshold = 0.45,\n",
    "                       image_shape = 512,\n",
    "                       label_width = 350)\n",
    "\n",
    "# Run tuning job\n",
    "tuning_job_name = \"smart-cooler-tuning-{}\".format(''.join(random.choices(string.ascii_letters + string.digits, k=8)))\n",
    "\n",
    "hyperparameter_ranges = {'learning_rate': ContinuousParameter(0.001, 0.100),\n",
    "                         'mini_batch_size': IntegerParameter(16, 64),\n",
    "                         'momentum': ContinuousParameter(0.80, 0.99),\n",
    "                         'weight_decay': ContinuousParameter(0.001, 0.100),\n",
    "                         'optimizer': CategoricalParameter(['sgd', 'adam', 'rmsprop', 'adadelta'])}\n",
    "\n",
    "tuner = HyperparameterTuner(od, \n",
    "                           'validation:mAP', \n",
    "                           hyperparameter_ranges,\n",
    "                           objective_type = 'Maximize', \n",
    "                           max_jobs = max_ht_training_jobs, \n",
    "                           max_parallel_jobs = max_parallel_jobs, \n",
    "                           early_stopping_type = 'Auto')\n",
    "\n",
    "tuner.fit({'train': train_data, 'validation': validation_data}, \n",
    "          job_name = tuning_job_name, include_cls_metadata = False)\n",
    "tuner.wait()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After the tuning job is finished, the top 5 performing hyperparameters can be listed below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get best jobs\n",
    "tuner_metrics = sagemaker.HyperparameterTuningJobAnalytics(tuning_job_name)\n",
    "best_jobs = tuner_metrics.dataframe().sort_values(['FinalObjectiveValue'], ascending=False).head(5)\n",
    "\n",
    "best_jobs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training\n",
    "Now that we are done with the Hyperparameter Tuning job we will take the best hyperparameter results and mature the model, training with more epochs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Train with best job hyperparameters\n",
    "best_job = best_jobs.head(1)\n",
    "i = best_job.index[0]\n",
    "\n",
    "hyperparams = od.hyperparameters()\n",
    "hyperparams['learning_rate'] = best_job.at[i, 'learning_rate']\n",
    "hyperparams['mini_batch_size'] = int(best_job.at[i, 'mini_batch_size'])\n",
    "hyperparams['momentum'] = best_job.at[i, 'momentum']\n",
    "hyperparams['weight_decay'] = best_job.at[i, 'weight_decay']\n",
    "hyperparams['optimizer'] = best_job.at[i, 'optimizer']\n",
    "\n",
    "hyperparams['epochs'] = final_epochs\n",
    "\n",
    "print(hyperparams)\n",
    "\n",
    "od.set_hyperparameters(**hyperparams)\n",
    "\n",
    "x = ''.join(random.choices(string.ascii_letters + string.digits, k=8))\n",
    "od.fit({'train': train_data, 'validation': validation_data}, \n",
    "          job_name='smart-cooler-training-' + x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hosting\n",
    "Once the training is done, we can deploy the trained model as an Amazon SageMaker real-time hosted endpoint. This will allow us to make predictions (or inference) from the model. Note that we don't have to host on the same instance type that we used to train. Training is a prolonged and compute-intensive job that require different compute and memory requirements than hosting. We can choose any type of instance we want to host the model. In our case, we chose the `ml.p3.2xlarge` instance type to train, but we will host the model on a less expensive instance type, `ml.m5.xlarge`. \n",
    "The endpoint deployment can be accomplished as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Deploy model to endpoint\n",
    "od_endpoint = od.deploy(initial_instance_count = 1,\n",
    "                        instance_type = 'ml.m5.2xlarge',\n",
    "                        endpoint_name='smart-cooler-endpoint-' + x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Inference\n",
    "Now that the trained model is deployed at an endpoint that is up-and-running, we can use this endpoint for inference.\n",
    "The following code blocks define some functions that will be used for processing and visualizing inference results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install wget pillow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import glob\n",
    "import wget\n",
    "import random\n",
    "import numpy as np\n",
    "from PIL import Image\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.image as mpimg\n",
    "from PIL import Image\n",
    "from math import floor, ceil\n",
    "import itertools\n",
    "import boto3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_object_boundary_box(img, obj_boundaries):\n",
    "    if (len(obj_boundaries) != 4):\n",
    "        raise Exception(\"Sagemaker boundaries are not of size 4\")\n",
    "    # Find size of boundary box \n",
    "    img_boundaries = img.getbbox()\n",
    "    x_size = img_boundaries[2] - img_boundaries[0]\n",
    "    y_size = img_boundaries[3] - img_boundaries[1]\n",
    "\n",
    "    # Generate tuple of pixel boundaries using the boundaries generated from model. \n",
    "    x_min = floor(x_size * obj_boundaries[0])\n",
    "    y_min = floor(y_size * obj_boundaries[1])\n",
    "    x_max = ceil(x_size * obj_boundaries[2])\n",
    "    y_max = ceil(y_size * obj_boundaries[3])\n",
    "    return tuple(map(int, [x_min, y_min, x_max, y_max]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def visualize_detection(img_file, dets, classes=[], colors=[], thresh=0.6):\n",
    "    import random\n",
    "    import matplotlib.pyplot as plt\n",
    "    import matplotlib.image as mpimg\n",
    "\n",
    "    fig_size = plt.rcParams[\"figure.figsize\"]\n",
    "    fig_size[0] = 9\n",
    "    fig_size[1] = 6\n",
    "    plt.rcParams[\"figure.figsize\"] = fig_size\n",
    "    \n",
    "    img=mpimg.imread(img_file)\n",
    "    plt.imshow(img)\n",
    "    height = img.shape[0]\n",
    "    width = img.shape[1]\n",
    "    for det in dets:\n",
    "        (klass, score, x0, y0, x1, y1) = det\n",
    "        if score < thresh:\n",
    "            continue\n",
    "        cls_id = int(klass)\n",
    "        xmin = int(x0 * width)\n",
    "        ymin = int(y0 * height)\n",
    "        xmax = int(x1 * width)\n",
    "        ymax = int(y1 * height)\n",
    "        rect = plt.Rectangle((xmin, ymin), xmax - xmin,\n",
    "                             ymax - ymin, fill=False,\n",
    "                             edgecolor=colors[cls_id],\n",
    "                             linewidth=3.5)\n",
    "        plt.gca().add_patch(rect)\n",
    "        class_name = str(cls_id)\n",
    "        if classes and len(classes) > cls_id:\n",
    "            class_name = classes[cls_id]\n",
    "        plt.gca().text(xmin, ymin - 2,\n",
    "                        '{:s} {:.2f}'.format(class_name, score),\n",
    "                        bbox=dict(facecolor=colors[cls_id], alpha=0.5),\n",
    "                                fontsize=12, color='white')\n",
    "        \n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_iou(bb1, bb2):\n",
    "    \"\"\"\n",
    "    Calculate the Intersection over Union (IoU) of two bounding boxes.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    bb1 : dict\n",
    "        Keys: {'x1', 'x2', 'y1', 'y2'}\n",
    "        The (x1, y1) position is at the top left corner,\n",
    "        the (x2, y2) position is at the bottom right corner\n",
    "    bb2 : dict\n",
    "        Keys: {'x1', 'x2', 'y1', 'y2'}\n",
    "        The (x, y) position is at the top left corner,\n",
    "        the (x2, y2) position is at the bottom right corner\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    float\n",
    "        in [0, 1]\n",
    "    \"\"\"\n",
    "    \n",
    "    #print(bb1[0])\n",
    "    #bb1 = float(bb1)\n",
    "    #bb2 = float(bb2)\n",
    "    #print(bb1)\n",
    "    \n",
    "    assert bb1[0] < bb1[2]\n",
    "    assert bb1[1] < bb1[3]\n",
    "    assert bb2[0] < bb2[2]\n",
    "    assert bb2[1] < bb2[3]\n",
    "\n",
    "    # determine the coordinates of the intersection rectangle\n",
    "    x_left = max(bb1[0], bb2[0])\n",
    "    y_top = max(bb1[1], bb2[1])\n",
    "    x_right = min(bb1[2], bb2[2])\n",
    "    y_bottom = min(bb1[3], bb2[3])\n",
    "\n",
    "    if x_right < x_left or y_bottom < y_top:\n",
    "        return 0.0\n",
    "    \n",
    "    # The intersection of two axis-aligned bounding boxes is always an\n",
    "    # axis-aligned bounding box\n",
    "    intersection_area = (x_right - x_left) * (y_bottom - y_top)\n",
    "    \n",
    "    # compute the area of both AABBs\n",
    "    bb1_area = (bb1[2] - bb1[0]) * (bb1[3] - bb1[1])\n",
    "    bb2_area = (bb2[2] - bb2[0]) * (bb2[3] - bb2[1])\n",
    "    \n",
    "    if intersection_area == bb1_area or intersection_area == bb1_area:\n",
    "        return 1.0\n",
    "    \n",
    "    if intersection_area / bb1_area > 0.5  or intersection_area / bb2_area > 0.5:\n",
    "        return 0.5\n",
    "\n",
    "    # compute the intersection over union by taking the intersection\n",
    "    # area and dividing it by the sum of prediction + ground-truth\n",
    "    # areas - the interesection area\n",
    "    iou = intersection_area / float(bb1_area + bb2_area - intersection_area)\n",
    "    assert iou >= 0.0\n",
    "    assert iou <= 1.0\n",
    "    return iou"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_bounding_boxes(result, treshold_ssd, treshold_iou):\n",
    "    # Array of bounding boxes\n",
    "    bounding_boxes = []\n",
    "\n",
    "    print(\"Prediction quantity: {}\".format(len(result[\"prediction\"])))\n",
    "\n",
    "    # Remove bounding boxes above the confidence ssd treshold \n",
    "    for prediction in result[\"prediction\"]:\n",
    "        if prediction[1] >= treshold_ssd:\n",
    "            bounding_boxes.append(prediction)\n",
    "\n",
    "    print(\"Bounding box quantity before NMS: {}\".format(len(bounding_boxes)))\n",
    "\n",
    "    i=0\n",
    "    while i < len(bounding_boxes):\n",
    "        boundbox = bounding_boxes[i]\n",
    "        #print(str(i) + \" BB: \" + str(boundbox[1]))\n",
    "        i+=1\n",
    "\n",
    "        if boundbox[5] - boundbox[3] < 0.05:\n",
    "            print(\"Size Removing, Y small: \" + str(boundbox[5] - boundbox[3]))\n",
    "            bounding_boxes.remove(boundbox)\n",
    "            i=0\n",
    "            continue\n",
    "\n",
    "        if boundbox[4] - boundbox[2] < 0.05:\n",
    "            print(\"Size Removing, x small: \" + str(boundbox[4] - boundbox[2]))\n",
    "            bounding_boxes.remove(boundbox)\n",
    "            i=0\n",
    "            continue  \n",
    "\n",
    "        if boundbox[5] - boundbox[3] > 0.20:\n",
    "            print(\"Size Removing, Y large: \" + str(boundbox[5] - boundbox[3]))\n",
    "            bounding_boxes.remove(boundbox)\n",
    "            i=0\n",
    "            continue\n",
    "\n",
    "        if boundbox[4] - boundbox[2] > 0.20:\n",
    "            print(\"Size Removing, x large: \" + str(boundbox[4] - boundbox[2]))\n",
    "            bounding_boxes.remove(boundbox)\n",
    "            i=0\n",
    "            continue   \n",
    "\n",
    "        count = 0\n",
    "        len_bounding_boxes = len(bounding_boxes)\n",
    "        while count < len_bounding_boxes:\n",
    "\n",
    "            if boundbox != bounding_boxes[count]:\n",
    "                check = get_iou(boundbox[2:],bounding_boxes[count][2:])\n",
    "                if check >= treshold_iou:\n",
    "\n",
    "                    # Best Score\n",
    "                    if boundbox[1] < bounding_boxes[count][1]:\n",
    "                        print(\"Removing this: \" + str(boundbox[1]))\n",
    "                        if boundbox in bounding_boxes:\n",
    "                            bounding_boxes.remove(boundbox)\n",
    "                        \n",
    "\n",
    "                    else:\n",
    "                        print(\"Removing last: \" + str(bounding_boxes[count][1]))\n",
    "                        if bounding_boxes[count] in bounding_boxes:\n",
    "                            bounding_boxes.remove(bounding_boxes[count])\n",
    "\n",
    "                    print(\"Item Deleted: {}\".format(count))\n",
    "                    len_bounding_boxes -= 1\n",
    "\n",
    "            # Raise Counter\n",
    "            count += 1\n",
    "    \n",
    "    print(\"Bounding box quantity after NMS: {}\".format(len(bounding_boxes)))\n",
    "\n",
    "    return bounding_boxes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us download an image to test. Please add the URL of the image you want test. Better if this image is one from the fridge with the products trained."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "file_name = 'test.jpg'\n",
    "\n",
    "with open(file_name, 'rb') as image:\n",
    "    f = image.read()\n",
    "    b = bytearray(f)\n",
    "    ne = open('n.txt','wb')\n",
    "    ne.write(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us use our endpoint to try to detect objects within this image. Since the image is jpeg, we use the appropriate content_type to run the prediction job. The endpoint returns a JSON file that we can simply load and peek into."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "import json\n",
    "file_name = 'test.jpg'\n",
    "\n",
    "with open(file_name, 'rb') as f:\n",
    "    payload = f.read()\n",
    "    payload = bytearray(payload)\n",
    "    \n",
    "od_endpoint.content_type = 'image/jpeg'\n",
    "result = json.loads(od_endpoint.predict(payload))\n",
    "\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results are in a format that is similar to the input .lst file (See [RecordIO Notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb) for more details on the .lst file definition. )with an addition of a confidence score for each detected object. The format of the output can be represented as `[class_index, confidence_score, xmin, ymin, xmax, ymax]`. Typically, we don't consider low-confidence predictions.\n",
    "\n",
    "We have provided additional script to easily visualize the detection outputs. You can visulize the high-confidence preditions with bounding box by filtering out low-confidence detections following next steps:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Setting the threshold to see results according to these limits."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Treshold definition\n",
    "threshold_ssd = 0.25\n",
    "threshold_iou = 0.10\n",
    "\n",
    "# Define Object categories\n",
    "object_categories = ['red',\n",
    "                     'blue']\n",
    "\n",
    "object_colors = [(0.9,0.4,0.4),\n",
    "                 (0.1,0.1,0.9)]\n",
    "\n",
    "# Visualize the detections.\n",
    "visualize_detection(file_name, \n",
    "                    get_bounding_boxes(result, threshold_ssd, threshold_iou),\n",
    "                    object_categories, \n",
    "                    object_colors, \n",
    "                    threshold_ssd)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_mxnet_p36",
   "language": "python",
   "name": "conda_mxnet_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  },
  "notice": "Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.  Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
 },
 "nbformat": 4,
 "nbformat_minor": 4
}