{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Part 2: Deploy a model trained using SageMaker distributed data parallel\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n",
    "\n",
    "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Use this notebook after you have completed **Part 1: Distributed data parallel MNIST training with PyTorch and SageMaker's distributed data parallel library** in the notebook pytorch_smdataparallel_mnist_demo.ipynb. To deploy the model you previously trained, you need to create a Sagemaker Endpoint. This is a hosted prediction service that you can use to perform inference.\n",
    "\n",
    "## Finding the model\n",
    "\n",
    "This notebook uses a stored model if it exists. If you recently ran a training example that use the `%store%` magic, it will be restored in the next cell.\n",
    "\n",
    "Otherwise, you can pass the URI to the model file (a .tar.gz file) in the `model_data` variable.\n",
    "\n",
    "To find the location of model files in the [SageMaker console](https://console.aws.amazon.com/sagemaker/home), do the following: \n",
    "\n",
    "1. Go to the SageMaker console: https://console.aws.amazon.com/sagemaker/home.\n",
    "1. Select **Training** in the left navigation pane and then Select **Training jobs**. \n",
    "1. Find your recent training job and choose it.\n",
    "1. In the **Output** section, you should see an S3 URI under **S3 model artifact**. Copy this S3 URI.\n",
    "1. Uncomment the `model_data` line in the next cell that manually sets the model's URI and replace the placeholder value with that S3 URI."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Retrieve a saved model from a previous notebook run's stored variable\n",
    "%store -r model_data\n",
    "\n",
    "# If no model was found, set it manually here.\n",
    "# model_data = 's3://sagemaker-us-west-2-XXX/pytorch-smdataparallel-mnist-2020-10-16-17-15-16-419/output/model.tar.gz'\n",
    "\n",
    "print(\"Using this model: {}\".format(model_data))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a model object\n",
    "\n",
    "You define the model object by using the SageMaker Python SDK's `PyTorchModel` and pass in the model from the `estimator` and the `entry_point`. The endpoint's entry point for inference is defined by `model_fn` as seen in the following code block that prints out `inference.py`. The function loads the model and sets it to use a GPU, if available."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pygmentize code/inference.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sagemaker\n",
    "\n",
    "role = sagemaker.get_execution_role()\n",
    "\n",
    "from sagemaker.pytorch import PyTorchModel\n",
    "\n",
    "model = PyTorchModel(\n",
    "    model_data=model_data,\n",
    "    source_dir=\"code\",\n",
    "    entry_point=\"inference.py\",\n",
    "    role=role,\n",
    "    framework_version=\"1.6.0\",\n",
    "    py_version=\"py3\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deploy the model on an endpoint\n",
    "\n",
    "You create a `predictor` by using the `model.deploy` function. You can optionally change both the instance count and instance type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor = model.deploy(initial_instance_count=1, instance_type=\"ml.m4.xlarge\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Test the model\n",
    "You can test the depolyed model using samples from the test set.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Download the test set\n",
    "import torchvision\n",
    "from torchvision import datasets, transforms\n",
    "from torch.utils.data import DataLoader\n",
    "from packaging.version import Version\n",
    "\n",
    "# Set the source to download MNIST data from\n",
    "TORCHVISION_VERSION = \"0.9.1\"\n",
    "if Version(torchvision.__version__) < Version(TORCHVISION_VERSION):\n",
    "    # Set path to data source and include checksum key to make sure data isn't corrupted\n",
    "    datasets.MNIST.resources = [\n",
    "        (\n",
    "            \"https://sagemaker-sample-files.s3.amazonaws.com/datasets/image/MNIST/train-images-idx3-ubyte.gz\",\n",
    "            \"f68b3c2dcbeaaa9fbdd348bbdeb94873\",\n",
    "        ),\n",
    "        (\n",
    "            \"https://sagemaker-sample-files.s3.amazonaws.com/datasets/image/MNIST/train-labels-idx1-ubyte.gz\",\n",
    "            \"d53e105ee54ea40749a09fcbcd1e9432\",\n",
    "        ),\n",
    "        (\n",
    "            \"https://sagemaker-sample-files.s3.amazonaws.com/datasets/image/MNIST/t10k-images-idx3-ubyte.gz\",\n",
    "            \"9fb629c4189551a2d022fa330f9573f3\",\n",
    "        ),\n",
    "        (\n",
    "            \"https://sagemaker-sample-files.s3.amazonaws.com/datasets/image/MNIST/t10k-labels-idx1-ubyte.gz\",\n",
    "            \"ec29112dd5afa0611ce80d1b7f02629c\",\n",
    "        ),\n",
    "    ]\n",
    "else:\n",
    "    # Set path to data source\n",
    "    datasets.MNIST.mirrors = [\n",
    "        \"https://sagemaker-sample-files.s3.amazonaws.com/datasets/image/MNIST/\"\n",
    "    ]\n",
    "\n",
    "\n",
    "test_set = datasets.MNIST(\n",
    "    \"data\",\n",
    "    download=True,\n",
    "    train=False,\n",
    "    transform=transforms.Compose(\n",
    "        [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]\n",
    "    ),\n",
    ")\n",
    "\n",
    "\n",
    "# Randomly sample 16 images from the test set\n",
    "test_loader = DataLoader(test_set, shuffle=True, batch_size=16)\n",
    "test_images, _ = iter(test_loader).next()\n",
    "\n",
    "# inspect the images\n",
    "import torchvision\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "%matplotlib inline\n",
    "\n",
    "\n",
    "def imshow(img):\n",
    "    img = img.numpy()\n",
    "    img = np.transpose(img, (1, 2, 0))\n",
    "    plt.imshow(img)\n",
    "    return\n",
    "\n",
    "\n",
    "# unnormalize the test images for displaying\n",
    "unnorm_images = (test_images * 0.3081) + 0.1307\n",
    "\n",
    "print(\"Sampled test images: \")\n",
    "imshow(torchvision.utils.make_grid(unnorm_images))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Send the sampled images to endpoint for inference\n",
    "outputs = predictor.predict(test_images.numpy())\n",
    "predicted = np.argmax(outputs, axis=1)\n",
    "\n",
    "print(\"Predictions: \")\n",
    "print(predicted.tolist())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cleanup\n",
    "\n",
    "If you don't intend on trying out inference or to do anything else with the endpoint, you should delete it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor.delete_endpoint()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Notebook CI Test Results\n",
    "\n",
    "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n",
    "\n",
    "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n",
    "\n",
    "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/training|distributed_training|pytorch|data_parallel|mnist|infer_pytorch.ipynb)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_pytorch_p36",
   "language": "python",
   "name": "conda_pytorch_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}