{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deploying pre-trained NGC PyTorch models with Amazon SageMaker Neo" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Amazon SageMaker Neo is API to compile machine learning models to optimize them for our choice of hardward targets. In addition to the pre-trained models from TorchVision demonstrated in this SageMaker Neo example [notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker_neo_compilation_jobs/pytorch_torchvision/pytorch_torchvision_neo.ipynb), NGC pre-trained models can also be deployed in Neo, as demonstrated in this notebook.\n", "\n", "Before you get started, make sure you have downloaded the [NGC ResNet50 model](https://ngc.nvidia.com/catalog/models/nvidia:rnpyt_fp16) to the `NGC_assets` directory. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# download the NGC model weights. We will later load the weights into a ResNet50.\n", "!wget https://api.ngc.nvidia.com/v2/models/nvidia/rnpyt_fp16/versions/1/files/NVIDIA_ResNet50v15_FP16_PyT_20190225.pth -O NGC_assets/NVIDIA_ResNet50v15_FP16_PyT_20190225.pth" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Environment Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll import appropriate Python packages as well as a python script from [NVIDIA Github](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/Classification/ConvNets/image_classification/resnet.py)/[NGC model script](https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch).\n", "\n", "For this notebook, you can use the kernel conda_pytorch_p36." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!mkdir -p \"NGC_assets\"\n", "\n", "import torch\n", "import tarfile\n", "\n", "# a Python file from DeepLearningExamples or NGC model scripts, used to build the ResNet model\n", "import NGC_assets.image_classification_resnet as models" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "torch.__version__\n", "# 1.4.0 and 1.2.0 are both okay" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# make sure the following shapes match\n", "input_shape = [1,3,224,224]\n", "data_shape = '{\"input0\":[1,3,224,224]}'\n", "\n", "# make sure the following instance types match\n", "target_device = 'ml_p3' \n", "endpoint_instance_type = 'ml.p3.2xlarge' \n", "# target_device options: https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-compilation-job.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build ResNet50 from NGC\n", "The next section builds a ResNet50 from NGC_assets/image_classification_resnet.py and loads the downloaded weights into it. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# First, let's define the parameters of the model to build. \n", "args_arch = \"resnet50\"\n", "args_model_config = \"fanin\"\n", "args_weights = \"NGC_assets/NVIDIA_ResNet50v15_FP16_PyT_20190225.pth\"\n", "args_precision = \"FP16\"\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The weights downloaded from NGC can then be loaded into the reconstructed model. For more details of the network, see NGC_assets/image_classification_resnet.py, downloaded from [NVIDIA Github](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/Classification/ConvNets/image_classification/resnet.py)/[NGC model script](https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch)."
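] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an aside, the `data_shape` string defined in the setup cell is the JSON `DataInputConfig` passed to Neo later, and it must agree with `input_shape` used for tracing. An illustrative check (restating the two values, not part of the original sample):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "# restating the values from the setup cell above, for an illustrative check\n", "input_shape = [1, 3, 224, 224]\n", "data_shape = '{\"input0\":[1,3,224,224]}'\n", "\n", "# Neo's DataInputConfig maps the traced model's input name to its shape,\n", "# and must match the shape used for torch.jit.trace\n", "assert json.loads(data_shape)['input0'] == input_shape\n", "print('data_shape and input_shape agree')"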
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# build the model \n", "model = models.build_resnet(args_arch, args_model_config, verbose=False)\n", "\n", "# load the weights downloaded from NGC \n", "weights = torch.load(args_weights)\n", "model.load_state_dict(weights)\n", "\n", "model = model.cuda()\n", "\n", "# if we want FP16 precision\n", "if args_precision == \"FP16\":\n", "    model = model.half()\n", "\n", "model.eval()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Trace the reconstructed model, save it to a `.pth` file, and package it into a tar archive." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# trace with an FP32 input (cast the model back to float for tracing)\n", "trace = torch.jit.trace(model.float().eval(), torch.zeros(input_shape).float().cuda())\n", "trace.save('NGC_assets/ngc_model.pth')\n", "\n", "with tarfile.open('NGC_assets/NGC_model.tar.gz', 'w:gz') as f:\n", "    f.add('NGC_assets/ngc_model.pth')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Invoke Neo Compilation API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, forward the model artifact to the Neo Compilation API:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import sagemaker\n", "import time\n", "from sagemaker.utils import name_from_base\n", "\n", "role = sagemaker.get_execution_role()\n", "sess = sagemaker.Session()\n", "region = sess.boto_region_name\n", "bucket = sess.default_bucket()\n", "\n", "compilation_job_name = name_from_base('NGC-ResNet50-Neo')\n", "\n", "model_key = '{}/model/NGC_model.tar.gz'.format(compilation_job_name)\n", "model_path = 's3://{}/{}'.format(bucket, model_key)\n", "boto3.resource('s3').Bucket(bucket).upload_file('NGC_assets/NGC_model.tar.gz', model_key)\n", "\n", "sm_client = boto3.client('sagemaker')\n", "\n", "framework = 'PYTORCH'\n", "compiled_model_path = 's3://{}/{}/output'.format(bucket, compilation_job_name)" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "response = sm_client.create_compilation_job(\n", " CompilationJobName=compilation_job_name,\n", " RoleArn=role,\n", " InputConfig={\n", " 'S3Uri': model_path,\n", " 'DataInputConfig': data_shape,\n", " 'Framework': framework\n", " },\n", " OutputConfig={\n", " 'S3OutputLocation': compiled_model_path,\n", " 'TargetDevice': target_device\n", " },\n", " StoppingCondition={\n", " 'MaxRuntimeInSeconds': 300\n", " }\n", ")\n", "print(response)\n", "\n", "# Poll every 30 sec\n", "while True:\n", " response = sm_client.describe_compilation_job(CompilationJobName=compilation_job_name)\n", " if response['CompilationJobStatus'] == 'COMPLETED':\n", " break\n", " elif response['CompilationJobStatus'] == 'FAILED':\n", " print(response)\n", " raise RuntimeError('Compilation failed')\n", " print('Compiling ...')\n", " time.sleep(30)\n", "print('Done!')\n", "\n", "# Extract compiled model artifact\n", "compiled_model_path = response['ModelArtifacts']['S3ModelArtifacts']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create prediction endpoint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create a prediction endpoint, we first specify two additional functions, to be used with Neo Deep Learning Runtime:\n", "\n", "* `neo_preprocess(payload, content_type)`: Function that takes in the payload and Content-Type of each incoming request and returns a NumPy array. Here, the payload is byte-encoded NumPy array, so the function simply decodes the bytes to obtain the NumPy array.\n", "* `neo_postprocess(result)`: Function that takes the prediction results produced by Deep Learining Runtime and returns the response body\n", "\n", "Note: this file is reused from the sample notebook which runs a torchvision ResNet18 model." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pygmentize resnet18.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Upload the Python script containing the two functions to S3:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "source_key = '{}/source/sourcedir.tar.gz'.format(compilation_job_name)\n", "source_path = 's3://{}/{}'.format(bucket, source_key)\n", "\n", "with tarfile.open('sourcedir.tar.gz', 'w:gz') as f:\n", "    f.add('resnet18.py')\n", "\n", "boto3.resource('s3').Bucket(bucket).upload_file('sourcedir.tar.gz', source_key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We then create a SageMaker model record:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.model import NEO_IMAGE_ACCOUNT\n", "from sagemaker.fw_utils import create_image_uri\n", "\n", "model_name = name_from_base('NGC-ResNet50-Neo')\n", "\n", "framework_version = \"0.4.0\"\n", "image_uri = create_image_uri(region, 'neo-' + framework.lower(), target_device.replace('_', '.'),\n", "                             framework_version, py_version='py3', account=NEO_IMAGE_ACCOUNT[region])\n", "\n", "response = sm_client.create_model(\n", "    ModelName=model_name,\n", "    PrimaryContainer={\n", "        'Image': image_uri,\n", "        'ModelDataUrl': compiled_model_path,\n", "        'Environment': { 'SAGEMAKER_SUBMIT_DIRECTORY': source_path }\n", "    },\n", "    ExecutionRoleArn=role\n", ")\n", "print(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we create an Endpoint Configuration:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config_name = model_name\n", "\n", "response = sm_client.create_endpoint_config(\n", "    EndpointConfigName=config_name,\n", "    ProductionVariants=[\n", "        {\n", "            'VariantName': 'default-variant-name',\n", "            'ModelName': model_name,\n", "            'InitialInstanceCount': 
1,\n", "            'InstanceType': endpoint_instance_type,\n", "            'InitialVariantWeight': 1.0\n", "        },\n", "    ],\n", ")\n", "print(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we create an Endpoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "endpoint_name = model_name + '-Endpoint'\n", "\n", "response = sm_client.create_endpoint(\n", "    EndpointName=endpoint_name,\n", "    EndpointConfigName=config_name,\n", ")\n", "print(response)\n", "\n", "print('Creating endpoint ...')\n", "waiter = sm_client.get_waiter('endpoint_in_service')\n", "waiter.wait(EndpointName=endpoint_name)\n", "\n", "response = sm_client.describe_endpoint(EndpointName=endpoint_name)\n", "print(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Send requests" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's send a picture of a cat.\n", "\n", "![title](cat.jpg)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "import numpy as np\n", "\n", "sm_runtime = boto3.Session().client('sagemaker-runtime')\n", "\n", "with open('cat.jpg', 'rb') as f:\n", "    payload = f.read()\n", "\n", "response = sm_runtime.invoke_endpoint(EndpointName=endpoint_name,\n", "                                      ContentType='application/x-image',\n", "                                      Body=payload)\n", "print(response)\n", "result = json.loads(response['Body'].read().decode())\n", "\n", "print('Most likely class: {}'.format(np.argmax(result)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Clean up\n", "Don't forget to delete the endpoint when you no longer need it. To check the status of your endpoints, go to the SageMaker console and choose Inference -> Endpoints."
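] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to the endpoint, the endpoint configuration and model record created above can also be deleted. Below is an illustrative helper (assuming the `sm_client`, `endpoint_name`, `config_name`, and `model_name` variables defined earlier); the original sample only deletes the endpoint itself, as shown in the next cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# illustrative helper: delete every SageMaker resource created by this notebook\n", "def cleanup(client, endpoint, config, model):\n", "    client.delete_endpoint(EndpointName=endpoint)\n", "    client.delete_endpoint_config(EndpointConfigName=config)\n", "    client.delete_model(ModelName=model)\n", "\n", "# cleanup(sm_client, endpoint_name, config_name, model_name)"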
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess.delete_endpoint(endpoint_name)" ] } ], "metadata": { "kernelspec": { "display_name": "conda_pytorch_p36", "language": "python", "name": "conda_pytorch_p36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }