{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Workshop - Human in the Loop for SageMaker Models - Module 1\n", "\n", "In this firt module you are going to train and deploy an object detection model utilizing [Object Detection](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html) built-in algorithm.\n", "\n", "If it is your first time running a notebook, check out the menus to get a feel for it. To execute the cell click into the cell and click run icon in the toolbar above or press `Shift + Enter`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "This notebook is developed and tested in a SageMaker Notebook Instance with a `ml.t3.medium` instance with SageMaker Python SDK v2. It is recommended to execute the notebook in the same environment for best experience." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install opencv-python-headless" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import time\n", "import json\n", "import random\n", "import cv2\n", "import boto3\n", "import botocore\n", "import sagemaker\n", "from sagemaker import get_execution_role\n", "from sagemaker import image_uris\n", "import matplotlib.pyplot as plt\n", "import matplotlib.patches as patches \n", "import matplotlib.image as mpimg" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "role = get_execution_role()\n", "sess = sagemaker.Session()\n", "bucket = sess.default_bucket()\n", "prefix = \"DEMO-ObjectDetection\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Dataset and Prepare Data\n", "\n", "_Can take up ~5-10 minutes_" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#%%time\n", "\n", "# Download the dataset\n", "!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar\n", "!wget -P /tmp http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar\n", "\n", "# Extract the data.\n", "!tar -xf /tmp/VOCtrainval_06-Nov-2007.tar && rm /tmp/VOCtrainval_06-Nov-2007.tar\n", "!tar -xf /tmp/VOCtest_06-Nov-2007.tar && rm /tmp/VOCtest_06-Nov-2007.tar" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Convert data into RecordIO**\n", "\n", "RecordIO is a highly efficient binary data format from MXNet that makes it easy and simple to prepare the dataset and transfer to the instance that will run the training job. To generate a RecordIO file, we will use the tools from MXNet. The provided tools will first generate a list file and then use the im2rec tool to create the RecordIO file." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!python tools/prepare_dataset.py --dataset pascal --year 2007 --set trainval --target VOCdevkit/train.lst\n", "!python tools/prepare_dataset.py --dataset pascal --year 2007 --set test --target VOCdevkit/val.lst --no-shuffle" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Upload the RecordIO files to train and validation channels\n", "train_channel = prefix + \"/train\"\n", "validation_channel = prefix + \"/validation\"\n", "\n", "sess.upload_data(path=\"VOCdevkit/train.rec\", bucket=bucket, key_prefix=train_channel)\n", "sess.upload_data(path=\"VOCdevkit/val.rec\", bucket=bucket, key_prefix=validation_channel)\n", "\n", "s3_train_data = \"s3://{}/{}\".format(bucket, train_channel)\n", "s3_validation_data = \"s3://{}/{}\".format(bucket, validation_channel)\n", "s3_output_location = \"s3://{}/{}/output\".format(bucket, prefix)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train with SageMaker Built-in Algorithm" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "training_image = image_uris.retrieve(\n", " region=sess.boto_region_name, framework=\"object-detection\", version=\"latest\"\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "od_model = sagemaker.estimator.Estimator(\n", " training_image,\n", " role,\n", " instance_count=1,\n", " instance_type=\"ml.p3.2xlarge\",\n", " volume_size=50,\n", " max_run=360000,\n", " input_mode=\"File\",\n", " output_path=s3_output_location,\n", " sagemaker_session=sess,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "od_model.set_hyperparameters(\n", " base_network=\"resnet-50\",\n", " use_pretrained_model=1,\n", " num_classes=20,\n", " mini_batch_size=32,\n", " epochs=1,\n", " learning_rate=0.001,\n", " lr_scheduler_step=\"3,6\",\n", " lr_scheduler_factor=0.1,\n", " optimizer=\"sgd\",\n", " momentum=0.9,\n", " weight_decay=0.0005,\n", " overlap_threshold=0.5,\n", " nms_threshold=0.45,\n", " image_shape=300,\n", " label_width=350,\n", " num_training_samples=16551,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_data = sagemaker.inputs.TrainingInput(\n", " s3_train_data,\n", " distribution=\"FullyReplicated\",\n", " content_type=\"application/x-recordio\",\n", " s3_data_type=\"S3Prefix\",\n", ")\n", "validation_data = sagemaker.inputs.TrainingInput(\n", " s3_validation_data,\n", " distribution=\"FullyReplicated\",\n", " content_type=\"application/x-recordio\",\n", " s3_data_type=\"S3Prefix\",\n", ")\n", "data_channels = {\"train\": train_data, \"validation\": validation_data}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "od_model.fit(inputs=data_channels, logs=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy Endpoint for Inferences and test" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "endpoint_name = 'DEMO-ObjectDetection-endpoint'\n", "\n", "object_detector = od_model.deploy(\n", " initial_instance_count=1,\n", " endpoint_name = endpoint_name,\n", " instance_type=\"ml.m5.xlarge\",\n", " serializer = sagemaker.serializers.IdentitySerializer('image/jpeg')\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": 
[], "source": [ "def visualize_detection(img_file, dets, classes=[], thresh=0.6):\n", " \"\"\"\n", " visualize detections in one image\n", " Parameters:\n", " ----------\n", " img : numpy.array\n", " image, in bgr format\n", " dets : numpy.array\n", " ssd detections, numpy.array([[id, score, x1, y1, x2, y2]...])\n", " each row is one object\n", " classes : tuple or list of str\n", " class names\n", " thresh : float\n", " score threshold\n", " \"\"\"\n", " img=mpimg.imread(img_file)\n", " f, ax = plt.subplots(1, 1)\n", " ax.imshow(img)\n", " height = img.shape[0]\n", " width = img.shape[1]\n", " colors = dict()\n", " output = []\n", " for det in dets:\n", " (klass, score, x0, y0, x1, y1) = det\n", " cls_id = int(klass)\n", " class_name = str(cls_id)\n", " if classes and len(classes) > cls_id:\n", " class_name = classes[cls_id]\n", " output.append([class_name, score])\n", " if score < thresh:\n", " continue\n", " if cls_id not in colors:\n", " colors[cls_id] = (random.random(), random.random(), random.random())\n", " xmin = int(x0 * width)\n", " ymin = int(y0 * height)\n", " xmax = int(x1 * width)\n", " ymax = int(y1 * height)\n", " rect = patches.Rectangle((xmin, ymin), xmax - xmin,\n", " ymax - ymin, fill=False,\n", " edgecolor=colors[cls_id],\n", " linewidth=3.5)\n", " ax.add_patch(rect)\n", "\n", "\n", " ax.text(xmin, ymin - 2,\n", " '{:s} {:.3f}'.format(class_name, score),\n", " bbox=dict(facecolor=colors[cls_id], alpha=0.5),\n", " fontsize=12, color='white')\n", "\n", " return f, output\n", " \n", "def load_and_predict(file_name, predictor, threshold=0.5):\n", " \"\"\"\n", " load an image, make object detection to an predictor, and visualize detections\n", " Parameters:\n", " ----------\n", " file_name : str\n", " image file location, in str format\n", " predictor : sagemaker.predictor.RealTimePredictor\n", " a predictor loaded from hosted endpoint\n", " threshold : float\n", " score threshold for bounding box display\n", " \"\"\"\n", " with open(file_name, 'rb') as image:\n", " f = image.read()\n", " b = bytearray(f)\n", " results = predictor.predict(b)\n", " detections = json.loads(results)\n", " \n", " fig, detection_filtered = visualize_detection(file_name, detections['prediction'], \n", " object_categories, threshold)\n", " \n", " return results, detection_filtered, fig" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "object_categories = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', \n", " 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', \n", " 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "test_photos_index = ['980382', '276517', '1571457']\n", "\n", "if not os.path.isdir('sample-a2i-images'):\n", " os.mkdir('sample-a2i-images')\n", " \n", "for ind in test_photos_index:\n", " !curl https://images.pexels.com/photos/{ind}/pexels-photo-{ind}.jpeg > sample-a2i-images/pexels-photo-{ind}.jpeg" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "test_photos = ['sample-a2i-images/pexels-photo-980382.jpeg', # motorcycle\n", " 'sample-a2i-images/pexels-photo-276517.jpeg', # bicycle\n", " 'sample-a2i-images/pexels-photo-1571457.jpeg'] # sofa" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "results = load_and_predict(test_photos[2], object_detector, threshold=0.2)[0]\n", "dict_results = 
, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "visualize_detection(test_photos[2], dict_results, object_categories, thresh=0.4)[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Clean Up (finish Modules 2 and 3 first)\n", "\n", "Uncomment and run the cell below to delete the endpoint, but only after you have finished Modules 2 and 3." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# object_detector.delete_endpoint()" ] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (MXNet 1.6 Python 3.6 CPU Optimized)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/mxnet-1.6-cpu-py36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.13" } }, "nbformat": 4, "nbformat_minor": 4 }