# Bring your own container


---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

---


This notebook shows how to bring your own container (BYOC) to SageMaker. The example uses Multi Model Server to host a multi-model endpoint (MME), and you can modify and adapt it to fit your needs.


# Multi-Model Endpoint - CatBoost

This example notebook also showcases how to use a custom container to host multiple CatBoost models on a SageMaker multi-model endpoint. The model this notebook deploys is taken from this [CatBoost tutorial](https://github.com/catboost/tutorials/blob/master/python_tutorial_with_tasks.ipynb). 

CatBoost serves here as an example framework for demonstrating deployment and serving on a multi-model endpoint. The same approach can be extended to other frameworks.

CatBoost is gaining in popularity and is not yet supported as a framework for SageMaker multi-model endpoints. This example therefore also demonstrates how to bring your own container to a multi-model endpoint.

In this notebook we upload identical copies of one model to simulate loading and invoking multiple models.

## Prerequisites
### Packages and Permissions
The SageMaker SDK uses the SageMaker default S3 bucket when needed. If `get_execution_role` does not return a role with the appropriate permissions, you'll need to specify the ARN of an IAM role that does. Make sure the `AmazonSageMakerFullAccess` policy is attached to the execution role you are using.
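
One way to handle the fallback described above is to try `get_execution_role` first and use an explicit ARN only when that call fails. The sketch below is illustrative: `resolve_role` is a hypothetical helper, not part of the SageMaker SDK, and the role getter is passed in as an argument so the logic can be exercised outside SageMaker. It only checks that an ARN is available and well formed; it cannot verify which policies are attached.

```python
import re

# Shape of an IAM role ARN; also matches the aws-cn and aws-us-gov partitions.
_ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def resolve_role(get_role, fallback_arn=None):
    """Return the role from get_role(), or fallback_arn if that call fails."""
    try:
        role = get_role()
    except Exception:
        # Outside SageMaker there may be no execution role to look up.
        role = fallback_arn
    if not role or not _ROLE_ARN.match(role):
        raise ValueError("No valid IAM role ARN available: {!r}".format(role))
    return role
```

In this notebook the call might look like `resolve_role(sagemaker.get_execution_role, "<your-role-arn>")`, where the second argument is a placeholder for your own role ARN.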

## Load model and test local inference
First, install `catboost` to check that we can load the model locally and run inference. 

We load the model locally using `CatBoostClassifier()`. `test_data.csv` contains a single row of test inference data.

In [None]:
# Cell 01

!pip install catboost

In [None]:
# Cell 02
from catboost import CatBoostClassifier, Pool as CatboostPool, cv
import os
import pandas

model_file = CatBoostClassifier()
model_file = model_file.load_model("./models/mme_catboost/catboost_model.bin")
df = pandas.read_csv("./data/mme_catboost/test_data.csv")
df.head(2)

In [None]:
# Cell 03
import pandas as pd
import io
import json

out = model_file.predict_proba(df)
print(out)

## Upload the model tarball to S3


### Create a model tarball

SageMaker requires the model to be packaged in a `tar.gz` file.

In [None]:
# Cell 04
!cd models/mme_catboost && tar -czvf catboost-model.tar.gz catboost_model.bin
!ls models/mme_catboost
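
The same archive can also be built from Python with the standard-library `tarfile` module, which helps when the packaging step needs to run outside a notebook shell. This is a minimal sketch: `package_model` and `list_archive` are hypothetical helper names. The model file is stored under its base name so that it sits at the archive root, as it does with the `tar` command above.

```python
import os
import tarfile


def package_model(model_path, archive_path):
    """Write model_path into a gzip-compressed tar archive at archive_path."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname drops the directory part so the file lands at the archive root.
        tar.add(model_path, arcname=os.path.basename(model_path))
    return archive_path


def list_archive(archive_path):
    """Return the member names of a tar.gz archive, for a quick sanity check."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

For this notebook's layout, the call would be `package_model("models/mme_catboost/catboost_model.bin", "models/mme_catboost/catboost-model.tar.gz")`.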

### Upload 5 copies of the model to S3

Multi-model endpoints require all models to be stored under a single S3 prefix. Here we upload 5 copies of the model to our default bucket. 

This simulates having several different models to predict with. In practice, each of these models would probably be trained separately.

In [None]:
# Cell 05
import sagemaker

sess = sagemaker.Session()
s3_bucket = sess.default_bucket() # Replace with your own bucket name if needed
print(s3_bucket)

### Upload the model tarballs using boto3, each with a unique name

In [None]:
# Cell 06
import boto3

s3 = boto3.client("s3")
for i in range(0, 5):
    with open("models/mme_catboost/catboost-model.tar.gz", "rb") as f:
        s3.upload_fileobj(f, s3_bucket, "catboost/catboost-model-{}.tar.gz".format(i))

print("Models:uploaded and ready for use")
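
The object keys chosen here matter later: the model definition points `ModelDataUrl` at the `catboost/` prefix, and each invocation names a model through `TargetModel` using the key relative to that prefix. The sketch below spells out that relationship. `model_keys` and `target_model_name` are hypothetical helpers written for illustration only.

```python
def model_keys(prefix, count, basename="catboost-model"):
    """Build the S3 object keys for `count` model archives under `prefix`."""
    prefix = prefix.strip("/")
    return ["{}/{}-{}.tar.gz".format(prefix, basename, i) for i in range(count)]


def target_model_name(key, prefix):
    """Return the key relative to the prefix, as passed in TargetModel."""
    prefix = prefix.strip("/") + "/"
    if not key.startswith(prefix):
        raise ValueError("{} is not under {}".format(key, prefix))
    return key[len(prefix):]


keys = model_keys("catboost", 5)
print(keys[0])                                 # catboost/catboost-model-0.tar.gz
print(target_model_name(keys[0], "catboost"))  # catboost-model-0.tar.gz
```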

### List all models in the S3 prefix we will use for our Multi-Model Endpoint

In [None]:
# Cell 07
!aws s3 ls s3://$s3_bucket/catboost/

## Building the custom container

The container folder in this example contains 3 files:
```
├── container
│ ├── dockerd-entrypoint.py
│ ├── Dockerfile
│ └── model_handler.py
```

- `dockerd-entrypoint.py` is the entry point script that will start the multi model server.
- `Dockerfile` contains the container definition that will be used to assemble the image. This includes the packages that need to be installed.
- `model_handler.py` is the script that will contain the logic to load up the model and make inference.

Take a look through the files to see whether there is any customization you would like to make.
The cells below highlight the main parts of these files. 


### Install catboost in the `Dockerfile`

In [None]:
# Cell 08
! sed -n '26,30p' container/mme_catboost/Dockerfile

### Update `initialize` function in `model_handler.py` with logic to load up the model
In this case we are using `CatBoostClassifier()`. Feel free to update the loading logic in this function to your needs.

In [None]:
# Cell 09
! sed -n '22,40p' ./container/mme_catboost/model_handler.py

### Update `handle` function in `model_handler.py` with logic to run inference

In [None]:
# Cell 10
! sed -n '70,85p' ./container/mme_catboost/model_handler.py
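
To make the division of work between `initialize` and `handle` concrete, here is a simplified skeleton of such a handler. It is a sketch under stated assumptions, not the contents of this example's `model_handler.py`: the model directory is assumed to come from `context.system_properties`, each request item is assumed to carry a CSV payload with a header row under the `body` key, and `loader` is a hypothetical hook that keeps the skeleton independent of any one framework.

```python
import csv
import io
import os


class ModelHandler:
    """Skeleton of a custom handler with `initialize` and `handle` steps."""

    def __init__(self, loader):
        # `loader` (hypothetical) takes a file path and returns an object
        # that exposes predict_proba(rows).
        self.loader = loader
        self.model = None
        self.initialized = False

    def initialize(self, context):
        # Assumption: the server exposes the directory holding the extracted
        # model archive as the "model_dir" system property.
        model_dir = context.system_properties.get("model_dir")
        self.model = self.loader(os.path.join(model_dir, "catboost_model.bin"))
        self.initialized = True

    def preprocess(self, request):
        # Assumption: a single request item whose body is CSV with a header.
        body = request[0].get("body")
        if isinstance(body, (bytes, bytearray)):
            body = body.decode("utf-8")
        rows = list(csv.reader(io.StringIO(body)))
        return rows[1:]  # drop the header row

    def handle(self, data, context):
        # Load the model lazily on the first request for it.
        if not self.initialized:
            self.initialize(context)
        rows = self.preprocess(data)
        predictions = self.model.predict_proba(rows)
        # Return one JSON-serializable entry for the single request item.
        return [[list(p) for p in predictions]]
```

With CatBoost, the loader could be `lambda path: CatBoostClassifier().load_model(path)`, mirroring Cell 02. The rows produced by `csv.reader` are lists of strings, so a real handler would likely convert them to typed columns (for example with `pandas.read_csv`) before predicting.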

### Build and Push the custom image to ECR

**This step takes at least 5-6 minutes, so please be patient and ignore any warnings.**

In [None]:
%%sh
# Cell 11

echo "Starting Docker Build"

# The name of our algorithm
algorithm_name=catboost-sagemaker-multimodel

cd container/mme_catboost

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-east-1 if none defined)
region=$(aws configure get region)
region=${region:-us-east-1}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

echo "fullname:image=${fullname}"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

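# Log Docker in to ECR. `aws ecr get-login` was removed in AWS CLI v2;
# `get-login-password` works with v2 and with recent v1 releases.
aws ecr get-login-password --region ${region} | docker login --username AWS --password-stdin "${account}.dkr.ecr.${region}.amazonaws.com"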

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

echo "Starting the Docker Build with ${algorithm_name}"
docker build -q -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

echo "Pushing Docker image ${fullname} to ECR "
docker push ${fullname}
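
The image name pushed above is rebuilt in Python when the model is created, so both places must resolve the same account, region, repository, and tag. Note that the shell script falls back to `us-east-1` when no region is configured, which may differ from the region the Python session reports. The sketch below shows the URI pattern in one place; `ecr_image_uri` is a hypothetical helper, and the account ID is a placeholder.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the ECR image URI for a repository in the given account and region."""
    # Regions in the China partition use the amazonaws.com.cn domain suffix.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return "{}.dkr.ecr.{}.{}/{}:{}".format(account_id, region, suffix, repository, tag)


print(ecr_image_uri("123456789012", "us-west-2", "catboost-sagemaker-multimodel"))
# 123456789012.dkr.ecr.us-west-2.amazonaws.com/catboost-sagemaker-multimodel:latest
```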

### Deploy Multi Model Endpoint

In [None]:
# Cell 12
from sagemaker import get_execution_role

sm_client = boto3.client(service_name="sagemaker")
runtime_sm_client = boto3.client(service_name="sagemaker-runtime")

account_id = boto3.client("sts").get_caller_identity()["Account"]
region = boto3.Session().region_name

role = get_execution_role()
print(role)

### Create the SageMaker Multi-Model

In [None]:
# Cell 13
from time import gmtime, strftime

model_name = "catboost-multimodel-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
model_url = "s3://{}/catboost/".format(s3_bucket) ## MODEL S3 URL
container = "{}.dkr.ecr.{}.amazonaws.com/catboost-sagemaker-multimodel:latest".format(
    account_id, region
)
instance_type = "ml.m5.xlarge"

print("Model name: " + model_name)
print("Model data Url: " + model_url)
print("Container image: " + container)

container = {"Image": container, "ModelDataUrl": model_url, "Mode": "MultiModel"}

create_model_response = sm_client.create_model(
    ModelName=model_name, ExecutionRoleArn=role, Containers=[container]
)

print("Model ARN: " + create_model_response["ModelArn"])

### Create the SageMaker Endpoint Configuration


In [None]:
# Cell 14
endpoint_config_name = "catboost-multimodel-config-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print("Endpoint config name: " + endpoint_config_name)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

print("Endpoint config ARN: " + create_endpoint_config_response["EndpointConfigArn"])

### Create the SageMaker Multi-Model Endpoint

**This step will take a couple of minutes**

In [None]:
%%time
# Cell 15

import time

endpoint_name = "catboost-multimodel-endpoint-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print("Endpoint name: " + endpoint_name)

create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)
print("Endpoint Arn: " + create_endpoint_response["EndpointArn"])

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Endpoint Status: " + status)

print("Waiting for {} endpoint to be in service...".format(endpoint_name))
waiter = sm_client.get_waiter("endpoint_in_service")
waiter.wait(EndpointName=endpoint_name)

print("Created {} endpoint is in service and ready to invoke ...".format(endpoint_name))
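
The waiter hides the polling loop. When you want to log progress or stop early on failure, the same wait can be written by hand. The following is a minimal sketch rather than a replacement for the boto3 waiter: `wait_until_in_service` is a hypothetical helper, the `describe` argument stands in for `sm_client.describe_endpoint`, and the delay and attempt limits are arbitrary defaults.

```python
import time


def wait_until_in_service(describe, endpoint_name, delay=30, max_attempts=60, sleep=time.sleep):
    """Poll describe(EndpointName=...) until the endpoint reports InService.

    Returns the number of polls used. Raises RuntimeError if the endpoint
    reports Failed, and TimeoutError if max_attempts polls are exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        status = describe(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return attempt
        if status == "Failed":
            raise RuntimeError("Endpoint {} failed to deploy".format(endpoint_name))
        sleep(delay)
    raise TimeoutError("Endpoint {} not in service after {} polls".format(endpoint_name, max_attempts))
```

In the notebook this would be called as `wait_until_in_service(sm_client.describe_endpoint, endpoint_name)`.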

### Invoke each of the 5 models
We have identical models here to simulate multiple models belonging to the same framework

In [None]:
# Cell 16
from datetime import datetime
import time

for i in range(0, 5):
    start_time = datetime.now()
    response = runtime_sm_client.invoke_endpoint(
        EndpointName=endpoint_name,
        TargetModel="catboost-model-{}.tar.gz".format(i),
        Body=df.to_csv(index=False),
    )
    time_delta = (datetime.now() - start_time).total_seconds() * 1000
    time_delta = "{:.2f}".format(time_delta)

    print(f'Time={time_delta} --- > ::{json.loads(response["Body"].read().decode("utf-8"))}')

### Invoke just one of the models 1000 times 
Since this model is already loaded in memory, these invocations should not incur the model-loading (cold start) latency of a first call. 


In [None]:
# Cell 17
import numpy as np

print("Starting invocation for model::catboost-model-1.tar.gz, please wait ...")
results = []
for i in range(0, 1000):
    start = time.time()
    response = runtime_sm_client.invoke_endpoint(
        EndpointName=endpoint_name,
        TargetModel="catboost-model-1.tar.gz",
        Body=df.to_csv(index=False),
    )
    results.append((time.time() - start) * 1000)
print("\nPredictions for model latency: \n")
print("\nP95: " + str(np.percentile(results, 95)) + " ms\n")
print("P90: " + str(np.percentile(results, 90)) + " ms\n")
print("Average: " + str(np.average(results)) + " ms\n")
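
On a multi-model endpoint, the first call to a model that is not yet loaded also pays for fetching and loading it, so it can help to report that call separately from the warm ones. The sketch below uses only the standard library. `time_calls` and `summarize` are hypothetical helpers, `invoke` stands in for any zero-argument callable that performs one invocation, and `statistics.quantiles` interpolates slightly differently from `np.percentile`, so the figures may not match exactly.

```python
import statistics
import time


def time_calls(invoke, count):
    """Call invoke() `count` times and return each latency in milliseconds."""
    latencies = []
    for _ in range(count):
        start = time.perf_counter()
        invoke()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def summarize(latencies_ms):
    """Report the first (possibly cold) call apart from the remaining calls."""
    first, rest = latencies_ms[0], latencies_ms[1:]
    cuts = statistics.quantiles(rest, n=100)  # 99 percentile cut points
    return {
        "first_ms": first,
        "warm_avg_ms": statistics.mean(rest),
        "warm_p90_ms": cuts[89],
        "warm_p95_ms": cuts[94],
    }
```

`summarize` needs at least three samples, since `statistics.quantiles` requires two or more warm measurements.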

## Optional cleanup
Clean up by deleting the endpoint.

In [None]:
# Delete the endpoint
# Cell 18

sm_client.delete_endpoint(EndpointName=endpoint_name)
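
Deleting the endpoint leaves the endpoint configuration and the model definition in place, along with the S3 objects and the ECR repository. If you also want to remove the SageMaker resources created in Cells 13 and 14, a sketch like the following could be used instead of the single call above. `delete_endpoint_resources` is a hypothetical helper, and the client is passed in as an argument.

```python
def delete_endpoint_resources(sm_client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, then its configuration, then the model definition."""
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    sm_client.delete_model(ModelName=model_name)
```

With the names used in this notebook, the call would be `delete_endpoint_resources(sm_client, endpoint_name, endpoint_config_name, model_name)`.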

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/inference|nlp|realtime|byoc|multi_model_catboost.ipynb)
