{ "cells": [ { "cell_type": "markdown", "id": "d89c3d67", "metadata": {}, "source": [ "# Deploy Yolov7 to SageMaker + Inferentia\n", "\n", "\n", "We'll create a SageMaker real-time endpoint with a Yolov7 model capable of detecting people and predicting the pose of each person. For that purpose, we need to get the model and prepare it to be deployed to AWS Inferentia." ] }, { "cell_type": "markdown", "id": "58b1061d", "metadata": {}, "source": [ "## 1) Install dependencies" ] }, { "cell_type": "code", "execution_count": null, "id": "9e6e0ad9", "metadata": {}, "outputs": [], "source": [ "# with this library we can build docker images and push them to ECR\n", "%pip install sagemaker-studio-image-build" ] }, { "cell_type": "markdown", "id": "efec75fd", "metadata": {}, "source": [ "## 2) Compile a pre-trained model\n", "When you deploy a model to a SageMaker Endpoint/inf1 instance (AWS Inferentia), you first need compile the model with NeuronSDK. We'll use a sample provided by the official AWS Neuron SDK + Inferentia Samples.\n", "\n", "- Clone the repo: https://github.com/aws-neuron/aws-neuron-samples\n", "- Load the jupyter notebook for Yolov7: https://github.com/aws-neuron/aws-neuron-samples/tree/master/torch-neuron/inference/yolov7\n", "- Start running the notebook, but enable Dynamic Batch and also Neuron Core Pipelines for 4 Neuron Cores,in model compilation section, as following:\n", "\n", "```python\n", "import torch\n", "import torch.neuron\n", "\n", "model_neuron = torch.neuron.trace(\n", " model, example_inputs=x,\n", " dynamic_batch_size=True,\n", " compiler_args['--neuron-core-pipeline', '4']\n", ")\n", "\n", "## Export to saved model\n", "model_neuron.save(\"yolov7_neuron.pt\")\n", "```" ] }, { "cell_type": "markdown", "id": "275e2ea1", "metadata": {}, "source": [ "## 3) Pack and upload the model to S3\n", "After compiling the model with the instructions above, **copy** the model to the same directory of this Notebook" ] }, { "cell_type": "code", "execution_count": null, "id": "2d5a89a8", "metadata": {}, "outputs": [], "source": [ "import os\n", "import io\n", "import tarfile\n", "import sagemaker\n", "\n", "sagemaker_session = sagemaker.Session()\n", "bucket = sagemaker_session.default_bucket()\n", "image_name='pytorch-inference-neuron'\n", "image_tag=\"1.10.2h-neuron-py37-sdk1.19.0-ubuntu18.04\"\n", "model_s3_path=\"models/yolov7-pose/model.tar.gz\"" ] }, { "cell_type": "code", "execution_count": null, "id": "70a2c9cd", "metadata": {}, "outputs": [], "source": [ "with io.BytesIO() as tar_file:\n", " with tarfile.open(fileobj=tar_file, mode='w:gz') as tar:\n", " tar.add('yolov7_neuron.pt', 'model.pt')\n", " tar.list()\n", " tar_file.seek(0)\n", " s3_uri = sagemaker_session.upload_string_as_file_body(\n", " tar_file.read(), bucket=bucket, key=model_s3_path\n", " )\n", " print(s3_uri)" ] }, { "cell_type": "markdown", "id": "715815db", "metadata": {}, "source": [ "## 3) Build a custom docker container with additional libraries\n", "**YOU DON\"T NEED TO RUN** this section if you already did that before" ] }, { "cell_type": "markdown", "id": "93b49329", "metadata": {}, "source": [ "We'll extend a pythorch-inference container to apply a patch that allow us to pass CustomAttributes to our code and also to install required libraries like libJPEG Turbo." 
] }, { "cell_type": "code", "execution_count": null, "id": "d68a0f10", "metadata": {}, "outputs": [], "source": [ "!pygmentize container_01/Dockerfile" ] }, { "cell_type": "code", "execution_count": null, "id": "59a6698f", "metadata": {}, "outputs": [], "source": [ "!sm-docker build container_01/ --repository $image_name:$image_tag" ] }, { "cell_type": "markdown", "id": "b35e5738", "metadata": {}, "source": [ "## 4) Inference Code executed by SageMaker Endpoint\n", "We need to create a custom inference file to pass to SageMaker. This code has the mechanisms to invoke the model and also pre/post process the input jpeg image & predictions.\n", "\n", "- **input_fn()**: Will receive the bytes of a .jpeg file. This file needs to be a mosaic, composed of multiple frames in just one image. By using **CustomAttributes** we share some metadata about the mosaic to the endpoint. With tile_width and tile_height we can compute how many images does the mosaic have, parse it and build a batch.\n", "- **output_fn()**: Gets the predictions and converts them to a numpy blob" ] }, { "cell_type": "code", "execution_count": null, "id": "6a28879e", "metadata": {}, "outputs": [], "source": [ "!pygmentize code_01/inference.py" ] }, { "cell_type": "markdown", "id": "80aae4ec", "metadata": {}, "source": [ "## 5) Deploy our model to SageMaker" ] }, { "cell_type": "code", "execution_count": null, "id": "d3598f0c", "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import logging\n", "from sagemaker.pytorch.model import PyTorchModel\n", "from sagemaker.predictor import Predictor\n", "\n", "sagemaker_session = sagemaker.Session()\n", "\n", "account_id = boto3.client('sts').get_caller_identity().get('Account')\n", "region_name = sagemaker_session.boto_session.region_name\n", "bucket = sagemaker_session.default_bucket()\n", "s3_uri=f\"s3://{bucket}/{model_s3_path}\"\n", "role=sagemaker.get_execution_role()\n", "print(f\"Bucket: {bucket}\\nAWS AccountID: {account_id}\\nRegion: {region_name}\")\n", "\n", "# https://github.com/aws/deep-learning-containers/blob/master/available_images.md#neuron-containers\n", "image_uri=f\"{account_id}.dkr.ecr.{region_name}.amazonaws.com/{image_name}:{image_tag}\"\n", "\n", "print(image_uri)\n", "sagemaker_model = PyTorchModel(\n", " image_uri=image_uri,\n", " model_data=s3_uri, \n", " role=role, \n", " name=\"yolov7-pose-inferentia\",\n", " sagemaker_session=sagemaker_session,\n", " entry_point=\"code_01/inference.py\",\n", " container_log_level=logging.DEBUG,\n", " model_server_workers=4, # keep 4 workers\n", " framework_version=\"1.10.0\",\n", " # for production it is important to define vpc_config and use a vpc_endpoint\n", " #vpc_config={\n", " # 'Subnets': ['', ''],\n", " # 'SecurityGroupIds': ['', '']\n", " #}\n", ")\n", "sagemaker_model._is_compiled_model = True" ] }, { "cell_type": "code", "execution_count": null, "id": "bad27169", "metadata": {}, "outputs": [], "source": [ "predictor = sagemaker_model.deploy(\n", " endpoint_name=\"yolov7-pose-inferentia\",\n", " instance_type=\"ml.inf1.6xlarge\",\n", " initial_instance_count=1\n", ")" ] }, { "cell_type": "markdown", "id": "dd2e18d4", "metadata": {}, "source": [ "## 6) Test the endpoint" ] }, { "cell_type": "code", "execution_count": null, "id": "fb939242", "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import os\n", "import cv2\n", "import numpy as np\n", "import urllib.request\n", "import matplotlib.pyplot as plt\n", "\n", "if not os.path.isfile('zidane.jpg'):\n", " urllib.request.urlretrieve(\n", " 
, { "cell_type": "code", "execution_count": null, "id": "4759e9f7", "metadata": {}, "outputs": [], "source": [ "import json\n", "import time\n", "import sagemaker\n", "import numpy as np\n", "from sagemaker.predictor import Predictor\n", "from sagemaker.serializers import DataSerializer\n", "from sagemaker.deserializers import NumpyDeserializer\n", "\n", "sagemaker_session = sagemaker.Session()\n", "\n", "predictor = Predictor(endpoint_name=\"yolov7-pose-inferentia\", sagemaker_session=sagemaker_session)\n", "predictor.serializer = DataSerializer(content_type='image/jpeg')\n", "predictor.deserializer = NumpyDeserializer()\n", "\n", "mosaic_size = 2\n", "custom_attributes = {\n", "    'CustomAttributes': json.dumps({\n", "        \"tile_width\": 960,\n", "        \"tile_height\": 540,\n", "        \"conf_thres\": 0.15,\n", "        \"iou_thres\": 0.45\n", "    })\n", "}\n", "with open(f'mosaic{mosaic_size*mosaic_size}.jpg', 'rb') as f:\n", "    data = f.read()\n", "t = time.time()\n", "y = predictor.predict(data, initial_args=custom_attributes)\n", "elapsed = (time.time() - t) * 1000\n", "print(f\"Elapsed: {elapsed:.1f}ms, Latency per image: {elapsed / (mosaic_size ** 2):.1f}ms\")\n", "y.shape" ] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.13" } }, "nbformat": 4, "nbformat_minor": 5 }