# YoloV8 Model Inference in Amazon SageMaker
This notebook demonstrates how to create an endpoint for real-time inference with a trained YoloV8 model (see [1] and [2]).

References:
----
[1] https://docs.ultralytics.com/ <br>
[2] https://github.com/ultralytics/ultralytics

## 1. SageMaker Initialization
First, we upgrade the SageMaker Python SDK to the latest version. If your notebook already uses the latest SageMaker 2.x SDK, you may skip the next cell.

In [None]:
! pip install --upgrade pip
! python3 -m pip install --upgrade sagemaker

In [None]:
import boto3
import sagemaker
from sagemaker import get_execution_role

role = (
    get_execution_role()
)  # provide a pre-existing role ARN as an alternative to creating a new role
print(f"SageMaker Execution Role:{role}")

client = boto3.client('sts')
account = client.get_caller_identity()['Account']
print(f'AWS account:{account}')

session = boto3.session.Session()
aws_region = session.region_name
print(f"AWS region:{aws_region}")

container_name = "inference-container"

## 2. Build, Test and Push Amazon SageMaker Serving Container Images
For this step, the [IAM Role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) attached to this Studio notebook needs full access to the [Amazon ECR service](https://aws.amazon.com/ecr/) and access to the [Amazon EC2 service](https://aws.amazon.com/ec2/). Check this [page](https://github.com/aws-samples/sagemaker-studio-docker-cli-extension#prerequsites) for the prerequisites for using this notebook on SageMaker Studio. We use the prebuilt [YoloV8 Docker image](https://hub.docker.com/r/ultralytics/ultralytics/tags) as a base and configure it to work with SageMaker.

### 2.1. Use Local Mode to Develop and Test Our Code

#### 2.1.1. Docker Environment Preparation on SageMaker Studio Notebooks
By default, SageMaker Studio does not support Docker operations. This [GitHub repo](https://github.com/aws-samples/sagemaker-studio-docker-cli-extension) enables us to build, test, and push images on Studio. You can skip this step if you are running this notebook on your local machine or on an Amazon SageMaker Notebook Instance.
<br><br>
Run the cell below to clone the [SageMaker Studio Docker CLI extension](https://github.com/aws-samples/sagemaker-studio-docker-cli-extension) and install the dependencies missing from the **Data Science** kernel.

In [None]:
!cd ~ && git clone https://github.com/aws-samples/sagemaker-studio-docker-cli-extension

# fix dependencies
!conda update --force -y conda
!conda install -y pyyaml==5.4.1
#!apt-get install -y procps

# setup the extension
!cd ~/sagemaker-studio-docker-cli-extension && ./setup.sh

We will use an **m5.xlarge** instance to build and test the YoloV8 SageMaker Docker image.

In [None]:
!sdocker create-host --instance-type m5.xlarge

#### 2.1.2. Build docker image
We first build the Docker image so that we can test it in local mode before pushing it to ECR.

In [None]:
!cd inference-container && docker build . -t yolov8-sagemaker-inference:latest

#### 2.2.1. Create Local Inference Endpoint

##### 2.2.1.1. Define Amazon SageMaker Model
We first download the pretrained model, then we define the SageMaker model.

In [None]:
!mkdir -p inference-container/model && cd inference-container/model && wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-pose.pt

The container can have the following environment variables:

| Env           | Value                     |
|---------------|---------------------------|
| SM_MODEL_NAME | Name of YoloV8 model      |
| SM_TRACKING   | `Enabled` or `Disabled`   |
| SM_HP_CLASSES | List of classes ex. `[0]` |
| SM_HP_CONF    | Default `0.25`            |
| SM_HP_IOU     | Default `0.7`             |
| SM_HP_HALF    | Default `False`           |
| SM_HP_TRACKER | `botsort` or `bytetrack`  |
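
As an illustration of how a serving script might consume these variables, here is a minimal sketch. The helper `read_inference_settings` is hypothetical and is not taken from the container's code. The variable names and the defaults for `SM_HP_CONF`, `SM_HP_IOU`, and `SM_HP_HALF` come from the table above; the parsing rules (a JSON list for `SM_HP_CLASSES`, case-insensitive flags, tracking off when `SM_TRACKING` is unset) are assumptions.

```python
import json


def read_inference_settings(env):
    """Parse container settings from a mapping such as os.environ.

    Hypothetical helper: names and numeric defaults follow the table,
    the parsing rules themselves are assumptions.
    """
    classes_raw = env.get("SM_HP_CLASSES")
    return {
        "model_name": env.get("SM_MODEL_NAME"),
        # Assumption: tracking is off unless explicitly set to "Enabled".
        "tracking": env.get("SM_TRACKING", "Disabled").lower() == "enabled",
        # Assumption: the class list is passed as a JSON array, e.g. "[0]".
        "classes": json.loads(classes_raw) if classes_raw else None,
        "conf": float(env.get("SM_HP_CONF", "0.25")),
        "iou": float(env.get("SM_HP_IOU", "0.7")),
        "half": env.get("SM_HP_HALF", "False").lower() == "true",
        "tracker": env.get("SM_HP_TRACKER"),
    }


settings = read_inference_settings(
    {"SM_MODEL_NAME": "yolov8n-pose.pt", "SM_HP_CLASSES": "[0]"}
)
print(settings)
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the parsing easy to check outside the container.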

In [None]:
from sagemaker.local import LocalSession

sagemaker_session = LocalSession()
sagemaker_session.config = {'local': {'local_code': True}}

image_uri = "yolov8-sagemaker-inference:latest"
model_name = "yolov8-model-1" # set the name of the model

model_uri = "file://inference-container/model/yolov8n-pose.pt" # define the local pretrained model URI manually.

serving_container_def = {
    'Image': image_uri,
    'ModelDataUrl': model_uri,
    'Mode': 'SingleModel',
    'Environment': {
                    'SM_MODEL_NAME' : 'yolov8n-pose.pt'
                   }
}

create_model_response = sagemaker_session.create_model(name=model_name, 
                                                       role=role, 
                                                       container_defs=serving_container_def)

##### 2.2.1.2. Create Endpoint Configuration
Next, we set the name of the Amazon SageMaker hosted service endpoint configuration.


In [None]:
endpoint_config_name = f"{model_name}-endpoint-config"
print(endpoint_config_name)

Then we create the local Amazon SageMaker hosted service endpoint configuration, which uses a single local endpoint container for testing purposes.

In [None]:
epc = sagemaker_session.create_endpoint_config(
    name=endpoint_config_name,
    model_name=model_name,
    initial_instance_count=1,
    instance_type="local",
)
print(epc)

Next we specify the Amazon SageMaker endpoint name for the endpoint used to serve the model.

In [None]:
endpoint_name = f"{model_name}-endpoint"
print(endpoint_name)

##### 2.2.1.3. Create Endpoint
In this step, we create the Amazon SageMaker endpoint using the endpoint configuration we created above.

In [None]:
ep = sagemaker_session.create_endpoint(
    endpoint_name=endpoint_name, config_name=endpoint_config_name, wait=True
)
print(ep)

### 2.2.2. Test Local Endpoint

#### 2.2.2.1. Visualization Helper Functions
Draw the bounding box, pose skeleton, and ID for each tracked object in the raw frames.

In [None]:
!apt-get update && apt-get install ffmpeg libsm6 libxext6  -y
!pip install opencv-python

In [None]:
import cv2
import numpy as np

pose_palette = np.array([[255, 128, 0], [255, 153, 51], [255, 178, 102], [230, 230, 0], [255, 153, 255],
                                      [153, 204, 255], [255, 102, 255], [255, 51, 255], [102, 178, 255], [51, 153, 255],
                                      [255, 153, 153], [255, 102, 102], [255, 51, 51], [153, 255, 153], [102, 255, 102],
                                      [51, 255, 51], [0, 255, 0], [0, 0, 255], [255, 0, 0], [255, 255, 255]],
                                     dtype=np.uint8)

skeleton = [[16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13], [6, 7], [6, 8], [7, 9],
                         [8, 10], [9, 11], [2, 3], [1, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7]]

limb_color = pose_palette[[9, 9, 9, 9, 7, 7, 7, 0, 0, 0, 0, 0, 16, 16, 16, 16, 16, 16, 16]]
kpt_color = pose_palette[[16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 9, 9, 9, 9, 9, 9]]


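def get_color(idx):
    # Every track currently gets the same fixed BGR color; idx is accepted so
    # that per-track colors can be added later without changing the callers.
    return (75, 95, 230)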

def draw_res(det_boxes, frame, frame_id, image_w):
    text_scale = max(1, image_w / 1600.)
    line_thickness = max(3, int(image_w / 500.))
    for det_box in det_boxes:
        # Positional unpacking relies on the key order of each detection dict.
        name, class_id, conf, box, track_id, keypoints = det_box.values()
        x1, y1, x2, y2 = box.values()
        intbox = tuple(map(int, (x1, y1, x2, y2)))
        textbox = tuple(map(int, (x1 - line_thickness, y1, x2 + line_thickness, y1 - 15)))
        color = get_color(abs(int(track_id)))
        # Filled label background above the box, then the box outline.
        cv2.rectangle(frame, textbox[0:2], textbox[2:4], color=color, thickness=-1)
        cv2.rectangle(frame, intbox[0:2], intbox[2:4], color=color, thickness=line_thickness)
        kx = keypoints["x"]
        ky = keypoints["y"]
        # The skeleton has 19 limbs but the loop below runs once per keypoint (17),
        # so the last two limbs are drawn here.
        cv2.line(frame, (int(kx[skeleton[-2][0] - 1]), int(ky[skeleton[-2][0] - 1])), (int(kx[skeleton[-2][1] - 1]), int(ky[skeleton[-2][1] - 1])), tuple(limb_color[-2].tolist()), 2)
        cv2.line(frame, (int(kx[skeleton[-1][0] - 1]), int(ky[skeleton[-1][0] - 1])), (int(kx[skeleton[-1][1] - 1]), int(ky[skeleton[-1][1] - 1])), tuple(limb_color[-1].tolist()), 2)
        for i, (x, y) in enumerate(zip(kx, ky)):
            # Skeleton indices are 1-based, hence the "- 1".
            cv2.line(frame, (int(kx[skeleton[i][0] - 1]), int(ky[skeleton[i][0] - 1])), (int(kx[skeleton[i][1] - 1]), int(ky[skeleton[i][1] - 1])), tuple(limb_color[i].tolist()), 2)
            cv2.circle(frame, (int(x), int(y)), 3, tuple(kpt_color[i].tolist()), -1)
        cv2.putText(frame, f"ID: {str(track_id)} - {str(round(conf,4))}", (intbox[0], intbox[1]), cv2.FONT_HERSHEY_PLAIN, text_scale, (255, 255, 255),thickness=2)
        cv2.putText(frame, 'frame:{}'.format(frame_id), (int(25), int(25)),0, text_scale, (230,95,75),3)
    return frame
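
`draw_res` unpacks each detection positionally with `.values()`, so it depends on the key order of the response. The sketch below shows a key-based alternative. The helper `unpack_detection` is hypothetical, and the top-level key names (`name`, `class`, `confidence`, `box`, `track_id`) are assumptions inferred from the unpacking order; only `keypoints["x"]`, `keypoints["y"]`, and the four box coordinates appear in the code above.

```python
def unpack_detection(det):
    """Return (track_id, confidence, integer box, keypoint pairs) looked up by key.

    Hypothetical helper; the key names are assumptions about the endpoint response.
    """
    box = det["box"]
    return (
        det.get("track_id", -1),  # -1 when no tracker ID is present
        float(det["confidence"]),
        (int(box["x1"]), int(box["y1"]), int(box["x2"]), int(box["y2"])),
        list(zip(det["keypoints"]["x"], det["keypoints"]["y"])),
    )


# Made-up detection used only to exercise the helper.
sample = {
    "name": "person",
    "class": 0,
    "confidence": 0.91,
    "box": {"x1": 10.2, "y1": 20.7, "x2": 110.0, "y2": 220.9},
    "track_id": 3,
    "keypoints": {"x": [50.0, 55.0], "y": [40.0, 42.0]},
}
track_id, conf, box, points = unpack_detection(sample)
print(track_id, conf, box, points)
```

Looking fields up by key means a reordered or extended response does not silently shift values into the wrong variables.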

#### 2.2.2.2. Invoke endpoint

Next, we download a [video](https://motchallenge.net/sequenceVideos/MOT17-09-FRCNN-raw.mp4) from the MOT17 dataset to test our endpoint. We create an `input` directory for the test video and an `output` directory for the processed result, then download the video to the `input` directory in MP4 format.

In [None]:
!mkdir -p input
!mkdir -p output
!cd input && wget "https://motchallenge.net/sequenceVideos/MOT17-09-FRCNN-raw.mp4" -O test.mp4

After preparing the test data, we invoke the endpoint to run real-time inference on the test video.

In [None]:
import os
import cv2
import json
import time
import base64

sm_runtime = sagemaker_session.sagemaker_runtime_client

data_path = "input/test.mp4" 
cap = cv2.VideoCapture(data_path)
frame_w  = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
resize_factor = 1

fourcc = cv2.VideoWriter_fourcc(*'MP4V')
file_path = f"output/out-t-{time.localtime().tm_min}-{time.localtime().tm_sec}.mp4" 
out = cv2.VideoWriter(file_path, fourcc, 25, (int(frame_w / resize_factor), int(frame_h / resize_factor)))

processing_time = 0
frame_id = 0


i = 0
while True:
    ret, frame = cap.read()
    if ret != True:
        break
    
    if resize_factor == 1:
        res = frame
    else:
        res = cv2.resize(frame, dsize=(int(frame_w / resize_factor), int(frame_h / resize_factor)), interpolation=cv2.INTER_CUBIC)
    Body = {"frame_id": frame_id}
    Body["frame_w"] = int(frame_w / resize_factor)
    Body["frame_h"] = int(frame_h / resize_factor)
    Body["frame_data"] = base64.b64encode(res).decode("utf-8")
    
    request_time=time.time()
    body = json.dumps(Body).encode("utf-8")
    response = sm_runtime.invoke_endpoint(EndpointName=ep, Body=body, ContentType="application/json")

    if frame_id > 0:
        processing_time += (time.time() - request_time)
    print(f'frame-{frame_id} Processing time: {(time.time() - request_time)}')
    body = response["Body"].read()
    msg = body.decode("utf-8")
    data = json.loads(msg)
    frame_res = draw_res(data[0], res, frame_id, int(frame_w / resize_factor))
    out.write(frame_res)
    frame_id += 1

out.release()
cap.release()
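# Frame 0 is left out of processing_time as warm-up, so average over the remaining frames.
print('average processing time: ', processing_time / max(frame_id - 1, 1))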
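
The request body built in the loop above carries the frame as base64-encoded raw pixels plus its width and height. The sketch below shows that round trip with the standard library only. `encode_frame` and `decode_frame` are hypothetical helpers, and the layout of 3 bytes per pixel (BGR, `uint8`) is an assumption based on what `cv2.VideoCapture` normally returns; the container's actual decoding code is not shown in this notebook.

```python
import base64
import json


def encode_frame(frame_id, width, height, raw_bgr):
    """Build the JSON request body from raw pixel bytes (assumed 3 bytes per pixel)."""
    if len(raw_bgr) != width * height * 3:
        raise ValueError("raw_bgr does not match width * height * 3")
    body = {
        "frame_id": frame_id,
        "frame_w": width,
        "frame_h": height,
        "frame_data": base64.b64encode(raw_bgr).decode("utf-8"),
    }
    return json.dumps(body).encode("utf-8")


def decode_frame(body):
    """Hypothetical server-side counterpart: recover the raw pixel bytes."""
    payload = json.loads(body)
    raw = base64.b64decode(payload["frame_data"])
    expected = payload["frame_w"] * payload["frame_h"] * 3
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    # With NumPy, raw could then be viewed as an array of shape (frame_h, frame_w, 3).
    return payload["frame_id"], payload["frame_w"], payload["frame_h"], raw


pixels = bytes(range(24))  # dummy 4x2 frame: 4 * 2 * 3 = 24 bytes
frame_id, w, h, raw = decode_frame(encode_frame(7, 4, 2, pixels))
print(frame_id, w, h, len(raw))
```

Sending the width and height alongside the data is what lets the receiver rebuild the image, since raw pixel bytes carry no shape information of their own.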

#### 2.2.3. Cleanup resources

In [None]:
# Delete in reverse order of creation: endpoint, endpoint configuration, model.
sagemaker_session.delete_endpoint(ep)
sagemaker_session.delete_endpoint_config(epc)
sagemaker_session.delete_model(model_name)

### 2.3. Push tested image to ECR
Now that we have built and tested our image, we can push it to ECR so that it can be used with SageMaker endpoints.

In [None]:
%%bash

set -x
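# This script tags the locally built Docker image and pushes it to ECR so that
# SageMaker can use it.

image="yolov8-sagemaker-inference"
tag="latest"
# Use REGION_NAME if set; otherwise fall back to the standard AWS region variables.
region=${REGION_NAME:-${AWS_REGION:-${AWS_DEFAULT_REGION}}}
if [ -z "${region}" ]; then
    echo "Error: no AWS region found; set REGION_NAME"
    exit 1
fi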

# Get the account number associated with the current IAM credentials
account=$(aws sts get-caller-identity --query Account --output text)

if [ $? -ne 0 ]
then
    exit 255
fi


fullname="${account}.dkr.ecr.${region}.amazonaws.com/${image}:${tag}"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --region ${region} --repository-names "${image}" > /dev/null 2>&1
if [ $? -ne 0 ]; then
    aws ecr create-repository --region ${region} --repository-name "${image}" > /dev/null
fi


# Tag the locally built image with the full ECR name so that it can be pushed.

docker tag ${image}:${tag} ${fullname}

# Log Docker in to ECR (`aws ecr get-login` was removed in AWS CLI v2)
aws ecr get-login-password --region ${region} | docker login --username AWS --password-stdin "${account}.dkr.ecr.${region}.amazonaws.com"
docker push ${fullname}
if [ $? -eq 0 ]; then
	echo "Amazon ECR URI: ${fullname}"
else
	echo "Error: Image push failed"
	exit 1
fi
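
The same image URI is needed again in Python when the SageMaker model is defined. A minimal sketch of a helper that builds it and fails early on a missing value follows. The function name `ecr_image_uri` is hypothetical, and the account ID in the example is a placeholder, not a real account.

```python
def ecr_image_uri(account, region, image, tag="latest"):
    """Build the ECR image URI in the same format as the shell script above.

    Hypothetical helper: raises instead of producing a URI containing "None".
    """
    for label, value in (("account", account), ("region", region), ("image", image)):
        if not value:
            raise ValueError(f"{label} must be a non-empty string")
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{image}:{tag}"


# Placeholder account ID, for illustration only.
uri = ecr_image_uri("123456789012", "us-east-1", "yolov8-sagemaker-inference")
print(uri)
```

Validating the parts up front matters because an unset region would otherwise be formatted into the string and only surface later as a failed image pull.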

#### 2.3.1. Delete Docker Host
Make sure to delete the Docker host after you finish with local mode to avoid extra charges.

In [None]:
!sdocker terminate-current-host

### 2.4. Deploy on SageMaker Endpoint
Now that we have pushed the inference image to ECR, we can test it on SageMaker. First, we package the pretrained model into a **.tar.gz** archive and upload it to S3.

In [None]:
sagemaker_session = sagemaker.session.Session(boto_session=session)
s3_bucket_name = sagemaker_session.default_bucket()

In [None]:
!cd inference-container/model  && tar -czvf model.tar.gz yolov8n-pose.pt && rm yolov8n-pose.pt

In [None]:
!aws s3 cp inference-container/model/model.tar.gz s3://{s3_bucket_name}/yolov8/model/model.tar.gz
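
The packaging step can also be done from Python with the standard `tarfile` module. The sketch below is an alternative to the `tar` command above, not part of the original workflow; `package_model` is a hypothetical helper, and the example runs on a placeholder file in a temporary directory rather than on the real weights.

```python
import tarfile
import tempfile
from pathlib import Path


def package_model(weights_path, archive_path):
    """Write weights_path into a gzip-compressed tar archive and list its contents."""
    weights_path = Path(weights_path)
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname drops the directory part so the file sits at the archive root,
        # matching what `tar -czvf model.tar.gz yolov8n-pose.pt` produces.
        archive.add(weights_path, arcname=weights_path.name)
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()


with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "yolov8n-pose.pt"
    weights.write_bytes(b"placeholder weights")  # stand-in for the real file
    names = package_model(weights, Path(tmp) / "model.tar.gz")
print(names)
```

Unlike the shell command, this sketch leaves the original weights file in place.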

#### 2.4.1. Define SageMaker Model

In [None]:
image = "yolov8-sagemaker-inference"
tag = "latest"
region = os.getenv("AWS_REGION", aws_region)  # fall back to the boto3 session region if AWS_REGION is unset

sagemaker_session = sagemaker.session.Session(boto_session=session)

s3_bucket_name = sagemaker_session.default_bucket()

image_uri = f"{account}.dkr.ecr.{region}.amazonaws.com/{image}:{tag}"
model_name = "yolov8-model-1" # set the name of the model

model_uri = f"s3://{s3_bucket_name}/yolov8/model/model.tar.gz" # S3 URI of the packaged model archive uploaded above

serving_container_def = {
    'Image': image_uri,
    'ModelDataUrl': model_uri,
    'Mode': 'SingleModel',
    'Environment': {
                    'SM_MODEL_NAME' : 'yolov8n-pose.pt',
                   }
}

create_model_response = sagemaker_session.create_model(name=model_name, 
                                                       role=role, 
                                                       container_defs=serving_container_def)

#### 2.4.2. Define SageMaker Endpoint Configuration

In [None]:
endpoint_config_name = f"{model_name}-endpoint-config"
epc = sagemaker_session.create_endpoint_config(
    name=endpoint_config_name,
    model_name=model_name,
    initial_instance_count=1,
    instance_type="ml.m5.2xlarge",
)
print(epc)

#### 2.4.3. Create SageMaker Endpoint

In [None]:
endpoint_name = f"{model_name}-endpoint"
ep = sagemaker_session.create_endpoint(
    endpoint_name=endpoint_name, config_name=endpoint_config_name, wait=True
)
print(ep)

#### 2.4.4. Test Endpoint

In [None]:
import os
import cv2
import json
import time
import base64

sm_runtime = sagemaker_session.sagemaker_runtime_client

data_path = "input/test.mp4" 
cap = cv2.VideoCapture(data_path)
frame_w  = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
resize_factor = 1

fourcc = cv2.VideoWriter_fourcc(*'MP4V')
file_path = f"output/out-t-{time.localtime().tm_min}-{time.localtime().tm_sec}.mp4" 
out = cv2.VideoWriter(file_path, fourcc, 25, (int(frame_w / resize_factor), int(frame_h / resize_factor)))

processing_time = 0
frame_id = 0


i = 0
while True:
    ret, frame = cap.read()
    if ret != True:
        break
        
    if resize_factor == 1:
        res = frame
    else:
        res = cv2.resize(frame, dsize=(int(frame_w / resize_factor), int(frame_h / resize_factor)), interpolation=cv2.INTER_CUBIC)
    Body = {"frame_id": frame_id}
    Body["frame_w"] = int(frame_w / resize_factor)
    Body["frame_h"] = int(frame_h / resize_factor)
    Body["frame_data"] = base64.b64encode(res).decode("utf-8")
    
    request_time=time.time()
    body = json.dumps(Body).encode("utf-8")
    response = sm_runtime.invoke_endpoint(EndpointName=ep, Body=body, ContentType="application/json")

    if frame_id > 0:
        processing_time += (time.time() - request_time)
    print(f'frame-{frame_id} Processing time: {(time.time() - request_time)}')
    body = response["Body"].read()
    msg = body.decode("utf-8")
    data = json.loads(msg)
    frame_res = draw_res(data[0], res, frame_id, int(frame_w / resize_factor))
    out.write(frame_res)
    frame_id += 1

out.release()
cap.release()
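# Frame 0 is left out of processing_time as warm-up, so average over the remaining frames.
print('average processing time: ', processing_time / max(frame_id - 1, 1))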
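
The loop above skips frame 0 when accumulating `processing_time`, which keeps the first, typically slower, request out of the average. The same idea as a small standalone sketch: `average_latency` is a hypothetical helper and the sample values are made up.

```python
def average_latency(latencies, warmup=1):
    """Average the per-request latencies after dropping the first `warmup` samples.

    Returns None when no samples remain, instead of dividing by zero.
    """
    steady = latencies[warmup:]
    if not steady:
        return None
    return sum(steady) / len(steady)


samples = [2.0, 0.25, 0.5, 0.75]  # made-up latencies in seconds; first call is warm-up
avg = average_latency(samples)
print(avg)
```

Collecting the per-frame latencies in a list, rather than only a running total, would also make it easy to report percentiles later.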

#### 2.4.5. Cleanup resources

In [None]:
# Delete in reverse order of creation: endpoint, endpoint configuration, model.
sagemaker_session.delete_endpoint(ep)
sagemaker_session.delete_endpoint_config(epc)
sagemaker_session.delete_model(model_name)