{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Amazon SageMaker オブジェクト検出" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import clear_output\n", "clear_output()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 設定\n", "\n", "SagaMaker上でオブジェクト検出アルゴリズムをトレーニングするには、AWSサービスの認証をセットアップする必要があります。SageMakerにアクセスできるAWSアカウントのロールが必要です。このロールは、SagaMakerがS3のデータにアクセスできるようにするために使用され、ノートブックの起動に使用されたロールから自動的に取得されます。\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "import boto3\n", "import time\n", "import string\n", "import sagemaker\n", "import random\n", "from sagemaker import get_execution_role\n", "\n", "role = get_execution_role()\n", "print(role)\n", "sess = sagemaker.Session()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bucket = '<モデルに使用するバケットを指定>'\n", "prefix = 'object-detection-training'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SageMaker物体検出のdockerイメージが必要です。これは静的なので変更する必要はありません。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.amazon.amazon_estimator import get_image_uri\n", "\n", "training_image = get_image_uri(sess.boto_region_name, 'object-detection', repo_version=\"latest\")\n", "print (training_image)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## データアップロード" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### S3へのデータアップロード\n", "S3へデータをアップロードします。この時に2つのディレクトリが作成されます。1つはトレーニング用、2つ目はバリデーション用になります。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "# Upload the RecordIO files to train and validation channels\n", "train_channel = prefix + '/train'\n", "validation_channel = prefix + '/validation'\n", "\n", "sess.upload_data(path='RecordIO/train.rec', bucket=bucket, key_prefix=train_channel)\n", "sess.upload_data(path='RecordIO/val.rec', bucket=bucket, key_prefix=validation_channel)\n", "\n", "s3_train_data = 's3://{}/{}'.format(bucket, train_channel)\n", "s3_validation_data = 's3://{}/{}'.format(bucket, validation_channel)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "学習モデルのアウトプット先を定義します。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s3_input_train = 's3://{}/{}'.format(bucket, train_channel)\n", "s3_input_validation = 's3://{}/{}'.format(bucket, validation_channel)\n", "s3_output_location = 's3://{}/{}/output'.format(bucket, prefix)\n", "print(s3_input_train)\n", "print(s3_input_validation)\n", "print(s3_output_location)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ハイパーパラメータのチューニング" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Parameters\n", "\n", "## Number of Classes\n", "num_classes = 2\n", "\n", "## Number of training samples (80% total)\n", "num_training_samples = 1600\n", "\n", "## Instance type\n", "instance_type = 'ml.p3.8xlarge'\n", "\n", "## Instance count per job\n", "instance_count_per_job = 1\n", "\n", "## Learn Scheduler Step\n", "lr_scheduler_step = '20,40'\n", "\n", "## Max HyperParameter Training Epochs\n", "ht_epochs = 60\n", "\n", "## Max Final Training Epochs\n", "final_epochs = 240\n", "\n", "## Max HyperParameter Training jobs\n", "max_ht_training_jobs = 1\n", "\n", "## Max Parallel jobs\n", "max_parallel_jobs = 1\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "すべての必要な準備が完了しました。これでトレーニングを開始することが可能です。ここで課題となる箇所が、データセットからベストな学習をすることです。これに対応するために、ハイパーパラメータチューニングジョブを実行します。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Data channels\n", "train_data = sagemaker.session.s3_input(s3_input_train, distribution='FullyReplicated', \n", " content_type='application/x-recordio', s3_data_type='S3Prefix')\n", "validation_data = sagemaker.session.s3_input(s3_input_validation, distribution='FullyReplicated', \n", " content_type='application/x-recordio', s3_data_type='S3Prefix')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner\n", "\n", "# Base training estimator\n", "od = sagemaker.estimator.Estimator(training_image,\n", " role, \n", " train_instance_count = instance_count_per_job, \n", " train_instance_type = instance_type,\n", " train_volume_size = 50,\n", " train_max_run = 360000,\n", " input_mode= 'File',\n", " output_path = s3_output_location,\n", " sagemaker_session = sess,\n", " base_job_name = 'fridge20')\n", "# Fixed hyperparameters\n", "od.set_hyperparameters(base_network = 'resnet-50',\n", " use_pretrained_model = 1,\n", " early_stopping = True,\n", " num_classes = num_classes,\n", " num_training_samples = num_training_samples,\n", " epochs = ht_epochs,\n", " lr_scheduler_step = lr_scheduler_step,\n", " lr_scheduler_factor = 0.1,\n", " overlap_threshold = 0.5,\n", " nms_threshold = 0.45,\n", " image_shape = 512,\n", " label_width = 350)\n", "\n", "# Run tuning job\n", "tuning_job_name = \"smart-cooler-tuning-{}\".format(''.join(random.choices(string.ascii_letters + string.digits, k=8)))\n", "\n", "hyperparameter_ranges = {'learning_rate': ContinuousParameter(0.001, 0.100),\n", " 'mini_batch_size': IntegerParameter(16, 64),\n", " 'momentum': ContinuousParameter(0.80, 0.99),\n", " 'weight_decay': ContinuousParameter(0.001, 0.100),\n", " 'optimizer': CategoricalParameter(['sgd', 'adam', 'rmsprop', 'adadelta'])}\n", "\n", "tuner = HyperparameterTuner(od, \n", " 'validation:mAP', \n", " hyperparameter_ranges,\n", " objective_type = 'Maximize', \n", " max_jobs = max_ht_training_jobs, \n", " max_parallel_jobs = max_parallel_jobs, \n", " early_stopping_type = 'Auto')\n", "\n", "tuner.fit({'train': train_data, 'validation': validation_data}, \n", " job_name = tuning_job_name, include_cls_metadata = False)\n", "tuner.wait()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "チューニングジョブが完了後に、上位5つのハイパーパラメータがリストされます。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get best jobs\n", "tuner_metrics = sagemaker.HyperparameterTuningJobAnalytics(tuning_job_name)\n", "best_jobs = tuner_metrics.dataframe().sort_values(['FinalObjectiveValue'], ascending=False).head(5)\n", "\n", "best_jobs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## トレーニング\n", "ここまでの過程で、ハイパーパラメータチューニングジョブが完了しました。このチューニングジョブによって、ベストなハイパーパラメータの結果を得ましたので、さらに繰り返し学習させます。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Train with best job hyperparameters\n", "best_job = best_jobs.head(1)\n", "i = best_job.index[0]\n", "\n", "hyperparams = od.hyperparameters()\n", "hyperparams['learning_rate'] = best_job.at[i, 'learning_rate']\n", "hyperparams['mini_batch_size'] = int(best_job.at[i, 'mini_batch_size'])\n", "hyperparams['momentum'] = best_job.at[i, 'momentum']\n", "hyperparams['weight_decay'] = best_job.at[i, 'weight_decay']\n", "hyperparams['optimizer'] = best_job.at[i, 'optimizer']\n", "\n", "hyperparams['epochs'] = final_epochs\n", "\n", "print(hyperparams)\n", "\n", "od.set_hyperparameters(**hyperparams)\n", "\n", "x = ''.join(random.choices(string.ascii_letters + string.digits, k=8))\n", "od.fit({'train': train_data, 'validation': validation_data}, \n", " job_name='smart-cooler-training-' + x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ホスティング\n", "トレーニングが完了したら、学習されたモデルをSageMakerのエンドポイントにデプロイさせます。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Deploy model to endpoint\n", "od_endpoint = od.deploy(initial_instance_count = 1,\n", " instance_type = 'ml.m5.2xlarge',\n", " endpoint_name='smart-cooler-endpoint-' + x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 推論\n", "学習されたモデルがエンドポイントにデプロイされたので、エンドポイントを使用して推論をすることが可能です。\n", "では、このエンドポイントを使用して画像を推論させます。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install wget pillow" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "import glob\n", "import wget\n", "import random\n", "import numpy as np\n", "from PIL import Image\n", "import matplotlib.pyplot as plt\n", "import matplotlib.image as mpimg\n", "from PIL import Image\n", "from math import floor, ceil\n", "import itertools\n", "import boto3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_object_boundary_box(img, obj_boundaries):\n", " if (len(obj_boundaries) != 4):\n", " raise Exception(\"Sagemaker boundaries are not of size 4\")\n", " # Find size of boundary box \n", " img_boundaries = img.getbbox()\n", " x_size = img_boundaries[2] - img_boundaries[0]\n", " y_size = img_boundaries[3] - img_boundaries[1]\n", "\n", " # Generate tuple of pixel boundaries using the boundaries generated from model. \n", " x_min = floor(x_size * obj_boundaries[0])\n", " y_min = floor(y_size * obj_boundaries[1])\n", " x_max = ceil(x_size * obj_boundaries[2])\n", " y_max = ceil(y_size * obj_boundaries[3])\n", " return tuple(map(int, [x_min, y_min, x_max, y_max]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def visualize_detection(img_file, dets, classes=[], colors=[], thresh=0.6):\n", " import random\n", " import matplotlib.pyplot as plt\n", " import matplotlib.image as mpimg\n", "\n", " fig_size = plt.rcParams[\"figure.figsize\"]\n", " fig_size[0] = 9\n", " fig_size[1] = 6\n", " plt.rcParams[\"figure.figsize\"] = fig_size\n", " \n", " img=mpimg.imread(img_file)\n", " plt.imshow(img)\n", " height = img.shape[0]\n", " width = img.shape[1]\n", " for det in dets:\n", " (klass, score, x0, y0, x1, y1) = det\n", " if score < thresh:\n", " continue\n", " cls_id = int(klass)\n", " xmin = int(x0 * width)\n", " ymin = int(y0 * height)\n", " xmax = int(x1 * width)\n", " ymax = int(y1 * height)\n", " rect = plt.Rectangle((xmin, ymin), xmax - xmin,\n", " ymax - ymin, fill=False,\n", " edgecolor=colors[cls_id],\n", " linewidth=3.5)\n", " plt.gca().add_patch(rect)\n", " class_name = str(cls_id)\n", " if classes and len(classes) > cls_id:\n", " class_name = classes[cls_id]\n", " plt.gca().text(xmin, ymin - 2,\n", " '{:s} {:.2f}'.format(class_name, score),\n", " bbox=dict(facecolor=colors[cls_id], alpha=0.5),\n", " fontsize=12, color='white')\n", " \n", " plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_iou(bb1, bb2):\n", " \"\"\"\n", " Calculate the Intersection over Union (IoU) of two bounding boxes.\n", "\n", " Parameters\n", " ----------\n", " bb1 : dict\n", " Keys: {'x1', 'x2', 'y1', 'y2'}\n", " The (x1, y1) position is at the top left corner,\n", " the (x2, y2) position is at the bottom right corner\n", " bb2 : dict\n", " Keys: {'x1', 'x2', 'y1', 'y2'}\n", " The (x, y) position is at the top left corner,\n", " the (x2, y2) position is at the bottom right corner\n", "\n", " Returns\n", " -------\n", " float\n", " in [0, 1]\n", " \"\"\"\n", " \n", " #print(bb1[0])\n", " #bb1 = float(bb1)\n", " #bb2 = float(bb2)\n", " #print(bb1)\n", " \n", " assert bb1[0] < bb1[2]\n", " assert bb1[1] < bb1[3]\n", " assert bb2[0] < bb2[2]\n", " assert bb2[1] < bb2[3]\n", "\n", " # determine the coordinates of the intersection rectangle\n", " x_left = max(bb1[0], bb2[0])\n", " y_top = max(bb1[1], bb2[1])\n", " x_right = min(bb1[2], bb2[2])\n", " y_bottom = min(bb1[3], bb2[3])\n", "\n", " if x_right < x_left or y_bottom < y_top:\n", " return 0.0\n", " \n", " # The intersection of two axis-aligned bounding boxes is always an\n", " # axis-aligned bounding box\n", " intersection_area = (x_right - x_left) * (y_bottom - y_top)\n", " \n", " # compute the area of both AABBs\n", " bb1_area = (bb1[2] - bb1[0]) * (bb1[3] - bb1[1])\n", " bb2_area = (bb2[2] - bb2[0]) * (bb2[3] - bb2[1])\n", " \n", " if intersection_area == bb1_area or intersection_area == bb1_area:\n", " return 1.0\n", " \n", " if intersection_area / bb1_area > 0.5 or intersection_area / bb2_area > 0.5:\n", " return 0.5\n", "\n", " # compute the intersection over union by taking the intersection\n", " # area and dividing it by the sum of prediction + ground-truth\n", " # areas - the interesection area\n", " iou = intersection_area / float(bb1_area + bb2_area - intersection_area)\n", " assert iou >= 0.0\n", " assert iou <= 1.0\n", " return iou" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_bounding_boxes(result, treshold_ssd, treshold_iou):\n", " # Array of bounding boxes\n", " bounding_boxes = []\n", "\n", " print(\"Prediction quantity: {}\".format(len(result[\"prediction\"])))\n", "\n", " # Remove bounding boxes above the confidence ssd treshold \n", " for prediction in result[\"prediction\"]:\n", " if prediction[1] >= treshold_ssd:\n", " bounding_boxes.append(prediction)\n", "\n", " print(\"Bounding box quantity before NMS: {}\".format(len(bounding_boxes)))\n", "\n", " i=0\n", " while i < len(bounding_boxes):\n", " boundbox = bounding_boxes[i]\n", " #print(str(i) + \" BB: \" + str(boundbox[1]))\n", " i+=1\n", "\n", " if boundbox[5] - boundbox[3] < 0.05:\n", " print(\"Size Removing, Y small: \" + str(boundbox[5] - boundbox[3]))\n", " bounding_boxes.remove(boundbox)\n", " i=0\n", " continue\n", "\n", " if boundbox[4] - boundbox[2] < 0.05:\n", " print(\"Size Removing, x small: \" + str(boundbox[4] - boundbox[2]))\n", " bounding_boxes.remove(boundbox)\n", " i=0\n", " continue \n", "\n", " if boundbox[5] - boundbox[3] > 0.20:\n", " print(\"Size Removing, Y large: \" + str(boundbox[5] - boundbox[3]))\n", " bounding_boxes.remove(boundbox)\n", " i=0\n", " continue\n", "\n", " if boundbox[4] - boundbox[2] > 0.20:\n", " print(\"Size Removing, x large: \" + str(boundbox[4] - boundbox[2]))\n", " bounding_boxes.remove(boundbox)\n", " i=0\n", " continue \n", "\n", " count = 0\n", " len_bounding_boxes = len(bounding_boxes)\n", " while count < len_bounding_boxes:\n", "\n", " if boundbox != bounding_boxes[count]:\n", " check = get_iou(boundbox[2:],bounding_boxes[count][2:])\n", " if check >= treshold_iou:\n", "\n", " # Best Score\n", " if boundbox[1] < bounding_boxes[count][1]:\n", " print(\"Removing this: \" + str(boundbox[1]))\n", " if boundbox in bounding_boxes:\n", " bounding_boxes.remove(boundbox)\n", " \n", "\n", " else:\n", " print(\"Removing last: \" + str(bounding_boxes[count][1]))\n", " if bounding_boxes[count] in bounding_boxes:\n", " bounding_boxes.remove(bounding_boxes[count])\n", "\n", " print(\"Item Deleted: {}\".format(count))\n", " len_bounding_boxes -= 1\n", "\n", " # Raise Counter\n", " count += 1\n", " \n", " print(\"Bounding box quantity after NMS: {}\".format(len(bounding_boxes)))\n", "\n", " return bounding_boxes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "推論を確認するために実際の冷蔵庫内の写真を指定してする。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "file_name = 'test.jpg'\n", "\n", "with open(file_name, 'rb') as image:\n", " f = image.read()\n", " b = bytearray(f)\n", " ne = open('n.txt','wb')\n", " ne.write(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "推論を実施" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "import json\n", "file_name = 'test.jpg'\n", "\n", "with open(file_name, 'rb') as f:\n", " payload = f.read()\n", " payload = bytearray(payload)\n", " \n", "od_endpoint.content_type = 'image/jpeg'\n", "result = json.loads(od_endpoint.predict(payload))\n", "\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results are in a format that is similar to the input .lst file (See [RecordIO Notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb) for more details on the .lst file definition. )with an addition of a confidence score for each detected object. The format of the output can be represented as `[class_index, confidence_score, xmin, ymin, xmax, ymax]`. Typically, we don't consider low-confidence predictions.\n", "\n", "次のスクリプトで、推論結果を可視化します。高く信頼された推論のみ可視化されて、信頼が低い推論はフィルタリングされさます。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Setting the treshould to see results according to this limits." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Treshold definition\n", "treshold_ssd = 0.25\n", "treshold_iou = 0.10\n", "\n", "# Define Object categories\n", "object_categories = ['red',\n", " 'blue']\n", "\n", "object_colors = [(0.9,0.4,0.4),(0.1,0.1,0.9)]\n", "\n", "# Visualize the detections.\n", "visualize_detection(file_name, \n", " get_bounding_boxes(result, treshold_ssd, treshold_iou), \n", " object_categories, \n", " object_colors, \n", " treshold_ssd)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "conda_mxnet_p36", "language": "python", "name": "conda_mxnet_p36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" }, "notice": "Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License." }, "nbformat": 4, "nbformat_minor": 4 }