{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Serve a Pytorch model trained on SageMaker"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The model for this example was trained using this sample notebook on sagemaker - https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/pytorch_mnist/pytorch_mnist.ipynb\n",
"\n",
"It is certainly easiler to do estimator.deploy() using the standard Sagemaker SDK if you are following that example, but cinsider this one if you have a pytorch model (or two) on S3 and you are looking for an easy way to test and deploy this model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install torch"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip show sagemaker"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1 : Write a model transform script\n",
"\n",
"#### Make sure you have a ...\n",
"\n",
"- \"load_model\" function\n",
" - input args are model path\n",
" - returns loaded model object\n",
" - model name is the same as what you saved the model file as (see above step)\n",
"
\n",
"- \"predict\" function\n",
" - input args are the loaded model object and a payload\n",
" - returns the result of model.predict\n",
" - make sure you format it as a single (or multiple) string return inside a list for real time (for mini batch)\n",
" - from a client, a list or string or np.array that is sent for prediction is interpreted as bytes. Do what you have to for converting back to list or string or np.array\n",
" - return the error for debugging\n"
]
},
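{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, here is a minimal sketch of the bytes round-trip mentioned above (the `(1, 1, 28, 28)` shape is an assumption matching the MNIST model below): the client serializes an np.array with tobytes(), and the model script reconstructs it with np.frombuffer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Client side: serialize a float32 array to raw bytes before sending\n",
"original = np.zeros((1, 1, 28, 28), dtype=np.float32)\n",
"payload_bytes = original.tobytes()\n",
"\n",
"# Server side: reconstruct the array from the received bytes\n",
"restored = np.frombuffer(payload_bytes, dtype=np.float32).reshape(1, 1, 28, 28)\n",
"assert np.array_equal(original, restored)"
]
},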
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile modelscript_pytorch.py\n",
"import torch\n",
"import torch.distributed as dist\n",
"import torch.nn as nn\n",
"import torch.nn.functional as F\n",
"import torch.optim as optim\n",
"import torch.utils.data\n",
"import torch.utils.data.distributed\n",
"from joblib import load\n",
"import numpy as np\n",
"import os\n",
"import json\n",
"from six import BytesIO\n",
"\n",
"class Net(nn.Module):\n",
" def __init__(self):\n",
" super(Net, self).__init__()\n",
" self.conv1 = nn.Conv2d(1, 10, kernel_size=5)\n",
" self.conv2 = nn.Conv2d(10, 20, kernel_size=5)\n",
" self.conv2_drop = nn.Dropout2d()\n",
" self.fc1 = nn.Linear(320, 50)\n",
" self.fc2 = nn.Linear(50, 10)\n",
"\n",
" def forward(self, x):\n",
" x = F.relu(F.max_pool2d(self.conv1(x), 2))\n",
" x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))\n",
" x = x.view(-1, 320)\n",
" x = F.relu(self.fc1(x))\n",
" x = F.dropout(x, training=self.training)\n",
" x = self.fc2(x)\n",
" return F.log_softmax(x, dim=1)\n",
" \n",
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"\n",
"#Return loaded model\n",
"def load_model(modelpath):\n",
" model = torch.nn.DataParallel(Net())\n",
" with open(os.path.join(modelpath, 'model.pth'), 'rb') as f:\n",
" model.load_state_dict(torch.load(f))\n",
" print(\"loaded\")\n",
" return model.to(device)\n",
"\n",
"# return prediction based on loaded model (from the step above) and an input payload\n",
"def predict(model, payload):\n",
" \n",
" if type(payload) == list:\n",
" data = np.frombuffer(payload[0]['body'],dtype=np.float32).reshape(1,1,28,28)\n",
" elif type(payload) == np.ndarray:\n",
" data = payload \n",
" try:\n",
" print(type(data))\n",
" input_data = torch.Tensor(data)\n",
" model.eval()\n",
" with torch.no_grad():\n",
" out = model(input_data.to(device)).argmax(axis=1)[0].tolist()\n",
" except Exception as e:\n",
" out = str(e)\n",
" return [out]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download model locally"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!aws s3 cp s3://ezsmdeploy/pytorchmnist/input.html ./\n",
"!aws s3 cp s3://ezsmdeploy/pytorchmnist/model.tar.gz ./\n",
"!tar xvf model.tar.gz"
]
},
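{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, verify that extracting the tarball produced the model.pth file that load_model above expects (this assumes the archive extracts model.pth into the current directory):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# load_model() looks for 'model.pth' at the model path, so it must exist here\n",
"os.path.exists('model.pth')"
]
},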
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Input data for prediction\n",
"\n",
"Draw a number from 0 - 9 in the box that appears when you run the next cell"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"import numpy as np\n",
"HTML(open(\"input.html\").read())"
]
},
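{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the drawing widget does not render in your environment, the cell below is a hedged fallback: it defines data as an all-zeros 28x28 canvas so the rest of the notebook still runs (the prediction on a blank canvas will of course be arbitrary)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"# Fallback only: if the widget above did not populate 'data', use a blank canvas\n",
"try:\n",
"    data\n",
"except NameError:\n",
"    data = np.zeros((28, 28), dtype=np.float32)"
]
},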
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Does this work locally? (not \"_in a container locally_\", but _actually_ in local)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"image = np.array([data], dtype=np.float32)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from modelscript_pytorch import *\n",
"model = load_model('./') # "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predict(model,image)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### ok great! Now let's install ezsmdeploy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install ezsmdeploy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import ezsmdeploy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### If you have been running other inference containers in local mode, stop existing containers to avoid conflict"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!docker container stop $(docker container ls -aq) >/dev/null"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Upload to your S3 bucket"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sagemaker\n",
"modelpath = sagemaker.session.Session().upload_data('./model.tar.gz')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy locally"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ez = ezsmdeploy.Deploy(model = [modelpath], #loading pretrained MNIST model\n",
" script = 'modelscript_pytorch.py',\n",
" requirements = ['numpy','torch','joblib'], #or pass in the path to requirements.txt\n",
" instance_type = 'local',\n",
" wait = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test containerized version locally"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since you are downloading this model from a hub, the first time you invoke it will be slow, so invoke again to get an inference without all of the container logs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"out = ez.predictor.predict(image.tobytes()).decode()\n",
"out"
]
},
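{
"cell_type": "markdown",
"metadata": {},
"source": [
"The decoded response is a string. Here is a minimal, hedged sketch for turning it back into a Python object, assuming the serving stack returns a JSON-style list such as [7]; adjust if your response format differs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Parse the decoded string back into a Python object if it is valid JSON\n",
"try:\n",
"    prediction = json.loads(out)\n",
"except (ValueError, TypeError):\n",
"    # Fall back to the raw string if the response is not JSON\n",
"    prediction = out\n",
"prediction"
]
},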
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy on SageMaker"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ezonsm = ezsmdeploy.Deploy(model = [modelpath],\n",
" script = 'modelscript_pytorch.py',\n",
" requirements = ['numpy','torch','joblib'], #or pass in the path to requirements.txt\n",
" wait = True,\n",
" ei = 'ml.eia2.medium') # Add a GPU accelerator"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"out = ezonsm.predictor.predict(image.tobytes(), target_model='model1.tar.gz').decode() \n",
"out"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ezonsm.predictor.delete_endpoint()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_python3",
"language": "python",
"name": "conda_python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}