{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Compile and Deploy the pretrained PyTorch model from model zoo with SageMaker Neo\n",
    "\n",
    "---\n",
    "\n",
    "이 노트북에서는 사전 훈련된 MnasNet 기반 이미지 분류(Image classification) 모델을 SageMaker Neo로 컴파일하여 배포합니다. SageMaker Neo는 머신 러닝 모델을 하드웨어에 맞게 최적화하는 API로, Neo로 컴파일한 모델은 클라우드와 엣지 디바이스 어디에서나 실행할 수 있습니다.\n",
    "\n",
    "SageMaker Neo에서 지원하는 인스턴스 유형, 하드웨어 및 딥러닝 프레임워크는 아래 링크를 참조하세요.\n",
    "(본 예제 코드는 2021년 2월 기준으로 작성되었으며, 작성 시점에서 PyTroch 1.8.0까지 지원하고 있습니다. 단, AWS Inferentia 기반 인스턴스로 배포 시에는 PyTorch 1.7.1까지 지원합니다.)\n",
    "\n",
    "SageMaker Neo가 지원하는 인스턴스 타입, 하드웨어 및 딥러닝 프레임워크는 아래 링크를 참조해 주세요.\n",
    "- 클라우드 인스턴스: https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-cloud.html\n",
    "- 엣지 디바이스: https://docs.aws.amazon.com/sagemaker/latest/dg/neo-supported-devices-edge.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys, sagemaker, random\n",
    "print(sagemaker.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## 1. Inference script\n",
    "---\n",
    "\n",
    "아래 코드 셀은 `src` 디렉토리에 SageMaker 추론 스크립트를 저장합니다.\n",
    "\n",
    "DLR에서 자동으로 모델을 로드하기 때문에 별도의 `model_fn()`을 정의할 필요가 없습니다. 물론 커스텀 로직이 포함되었다면 `model_fn()`을 재정의할 수 있습니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile src/infer_pytorch_neo.py\n",
    "\n",
    "import io\n",
    "import json\n",
    "import logging\n",
    "import os\n",
    "import pickle\n",
    "\n",
    "import numpy as np\n",
    "import torch\n",
    "import torchvision.transforms as transforms\n",
    "from PIL import Image  # Training container doesn't have this package\n",
    "\n",
    "logger = logging.getLogger(__name__)\n",
    "logger.setLevel(logging.DEBUG)\n",
    "    \n",
    "    \n",
    "def transform_fn(model, payload, request_content_type='application/octet-stream', \n",
    "                 response_content_type='application/json'):\n",
    "\n",
    "    logger.info('Invoking user-defined transform function')\n",
    "\n",
    "    if request_content_type != 'application/octet-stream':\n",
    "        raise RuntimeError(\n",
    "            'Content type must be application/octet-stream. Provided: {0}'.format(request_content_type))\n",
    "\n",
    "    # preprocess\n",
    "    decoded = Image.open(io.BytesIO(payload))\n",
    "    preprocess = transforms.Compose([\n",
    "        transforms.Resize(256),\n",
    "        transforms.CenterCrop(224),\n",
    "        transforms.ToTensor(),\n",
    "        transforms.Normalize(\n",
    "            mean=[\n",
    "                0.485, 0.456, 0.406], std=[\n",
    "                0.229, 0.224, 0.225]),\n",
    "    ])\n",
    "    normalized = preprocess(decoded)\n",
    "    batchified = normalized.unsqueeze(0)\n",
    "\n",
    "    # predict\n",
    "    device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "    batchified = batchified.to(device)\n",
    "    result = model.forward(batchified)\n",
    "\n",
    "    # Softmax (assumes batch size 1)\n",
    "    result = np.squeeze(result.detach().cpu().numpy())\n",
    "    result_exp = np.exp(result - np.max(result))\n",
    "    result = result_exp / np.sum(result_exp)\n",
    "\n",
    "    response_body = json.dumps(result.tolist())\n",
    "\n",
    "    return response_body, response_content_type"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## 2. Import pre-trained model from TorchVision\n",
    "---\n",
    "본 예제는 TorchVision의 pre-trained 모델 중 MnasNet을 사용합니다.\n",
    "MnasNet은 정확도(accuracy)와 모바일 디바이스의 latency를 모두 고려한 강화학습 기반 NAS(neural architecture search)이며, TorchVision은 image classification에 최적화된 MNasNet-B1을 내장하고 있습니다. \n",
    "(참조 논문: https://arxiv.org/pdf/1807.11626.pdf)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torchvision.models as models\n",
    "import tarfile\n",
    "\n",
    "model = models.mnasnet1_0(pretrained=True)\n",
    "\n",
    "input_shape = [1,3,224,224]\n",
    "traced_model = torch.jit.trace(model.float().eval(), torch.zeros(input_shape).float())\n",
    "torch.jit.save(traced_model, 'model.pth')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Local Inference without Endpoint\n",
    "\n",
    "충분한 검증 및 테스트 없이 훈련된 모델을 곧바로 실제 운영 환경에 배포하기에는 많은 위험 요소들이 있기 때문에, 로컬 환경 상에서 추론을 수행하면서 디버깅하는 것을 권장합니다. 아래 코드 셀의 코드를 예시로 참조해 주세요."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_inference(img_path, predictor, show_img=True):\n",
    "    with open(img_path, mode='rb') as file:\n",
    "        payload = bytearray(file.read())\n",
    "\n",
    "    response = predictor.predict(payload)\n",
    "    result = json.loads(response.decode())\n",
    "    pred_cls_idx, pred_cls_str, prob = parse_result(result, img_path, show_img)\n",
    "    \n",
    "    return pred_cls_idx, pred_cls_str, prob \n",
    "\n",
    "\n",
    "def parse_result(result, img_path, show_img=True):\n",
    "    pred_cls_idx = np.argmax(result)\n",
    "    pred_cls_str = label_map[str(pred_cls_idx)]\n",
    "    prob = np.amax(result)*100\n",
    "    \n",
    "    if show_img:\n",
    "        import matplotlib.pyplot as plt\n",
    "        img = Image.open(img_path)\n",
    "        plt.figure()\n",
    "        fig, ax = plt.subplots(1, figsize=(10,10))\n",
    "        ax.imshow(img)\n",
    "        overlay_text = f'{pred_cls_str} {prob:.2f}%'\n",
    "        ax.text(20, 40, overlay_text, style='italic',\n",
    "                bbox={'facecolor': 'yellow', 'alpha': 0.5, 'pad': 10}, fontsize=20)\n",
    "\n",
    "    return pred_cls_idx, pred_cls_str, prob"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "모델 배포가 완료되었으면, 저자가 직접 준비한 샘플 이미지들로 추론을 수행해 보겠습니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import json\n",
    "import numpy as np\n",
    "from io import BytesIO\n",
    "from PIL import Image\n",
    "from src.infer_pytorch_neo import transform_fn\n",
    "\n",
    "path = \"./samples\"\n",
    "img_list = os.listdir(path)\n",
    "img_path_list = [os.path.join(path, img) for img in img_list]\n",
    "\n",
    "#test_idx = random.randint(0, len(img_list)-1)\n",
    "test_idx = 0\n",
    "img_path = img_path_list[test_idx]\n",
    "\n",
    "with open(img_path, mode='rb') as file:\n",
    "    payload = bytearray(file.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "클래스 인덱스에 대응하는 클래스명을 매핑하기 위한 딕셔너리를 생성합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from src.utils import get_label_map_imagenet\n",
    "label_file = 'metadata/imagenet1000_clsidx_to_labels.txt'\n",
    "label_map = get_label_map_imagenet(label_file)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "#model = torch.jit.load('model.pth')\n",
    "#model = model.to(device)\n",
    "\n",
    "response_body, _ = transform_fn(traced_model, payload)\n",
    "result = json.loads(response_body)\n",
    "parse_result(result, img_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## 3. Compile the Model\n",
    "---\n",
    "\n",
    "### 모델 압축"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tarfile.open('model.tar.gz', 'w:gz') as f:\n",
    "    f.add('model.pth')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import boto3\n",
    "import sagemaker\n",
    "import time\n",
    "from sagemaker.utils import name_from_base\n",
    "\n",
    "role = sagemaker.get_execution_role()\n",
    "sm_sess = sagemaker.Session()\n",
    "region = sm_sess.boto_region_name\n",
    "bucket = sm_sess.default_bucket()\n",
    "\n",
    "compilation_job_name = name_from_base('TorchVision-MNasNet-Neo')\n",
    "prefix = compilation_job_name+'/model'\n",
    "\n",
    "model_path = sm_sess.upload_data(path='model.tar.gz', key_prefix=prefix)\n",
    "\n",
    "data_shape = '{\"input0\":[1,3,224,224]}'\n",
    "target_device = 'ml_c5'\n",
    "framework = 'PYTORCH'\n",
    "framework_version = '1.8'\n",
    "compiled_model_path = 's3://{}/{}/output'.format(bucket, compilation_job_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 모델 컴파일\n",
    "SageMaker Neo로 모델을 컴파일합니다. 컴파일된 모델은 s3에 저장되며, 타겟 디바이스 나 타겟 인스턴스에 곧바로 배포할 수 있습니다. 타겟 디바이스 배포 시에는 Neo-DLR 패키지를 이용해 컴파일된 모델을 추론할 수 있습니다. 컴파일된 모델 아티팩트의 경로는 AWS 웹사이트의 SageMaker UI에서도 확인할 수 있고 `compiled_model.model_data`로 가져올 수도 있습니다.\n",
    "<br>\n",
    "\n",
    "참조: https://github.com/neo-ai/neo-ai-dlr\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.pytorch.model import PyTorchModel\n",
    "from sagemaker.predictor import Predictor\n",
    "\n",
    "pytorch_model = PyTorchModel(\n",
    "    model_data=model_path,\n",
    "    predictor_cls=Predictor,\n",
    "    framework_version = framework_version,\n",
    "    role=role,\n",
    "    sagemaker_session=sm_sess,\n",
    "    source_dir='src',\n",
    "    entry_point='infer_pytorch_neo.py',\n",
    "    py_version='py3',\n",
    "    env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'}\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "compiled_model = pytorch_model.compile(\n",
    "    target_instance_family=target_device, \n",
    "    input_shape=data_shape,\n",
    "    job_name=compilation_job_name,\n",
    "    role=role,\n",
    "    compile_max_run=600,\n",
    "    framework=framework.lower(),\n",
    "    framework_version=framework_version,\n",
    "    output_path=compiled_model_path\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 모델 배포\n",
    "\n",
    "SageMaker가 관리하는 배포 클러스터를 프로비저닝하고 추론 컨테이너를 배포하기 때문에, 추론 서비스를 시작하는 데에는 약 5~10분 정도 소요됩니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "# account_map = {\n",
    "#     'us-east-1': '785573368785',\n",
    "#     'us-east-2': '007439368137',\n",
    "#     'us-west-1': '710691900526',\n",
    "#     'us-west-2': '301217895009',\n",
    "#     'ap-northeast-1': '941853720454',\n",
    "#     'ap-northeast-2': '151534178276'\n",
    "# }\n",
    "# image_uri = f'{account_map[region]}.dkr.ecr.{region}.amazonaws.com/sagemaker-inference-pytorch:1.8-cpu-py3'\n",
    "\n",
    "predictor = compiled_model.deploy(\n",
    "    initial_instance_count = 1,\n",
    "    #image_uri=image_uri,\n",
    "    instance_type = 'ml.c5.xlarge',\n",
    "    wait=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Wait for the endpoint jobs to complete\n",
    "\n",
    "엔드포인트가 생성될 때까지 기다립니다. 약 5-10분의 시간이 소요됩니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.core.display import display, HTML\n",
    "\n",
    "def make_endpoint_link(region, endpoint_name, endpoint_task):\n",
    "    \n",
    "    endpoint_link = f'<b><a target=\"blank\" href=\"https://console.aws.amazon.com/sagemaker/home?region={region}#/endpoints/{endpoint_name}\">{endpoint_task} Review Endpoint</a></b>'   \n",
    "    return endpoint_link \n",
    "        \n",
    "endpoint_link = make_endpoint_link(region, predictor.endpoint_name, '[Deploy Neo model]')\n",
    "\n",
    "display(HTML(endpoint_link))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sm_sess.wait_for_endpoint(predictor.endpoint_name, poll=5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## 4. Inference\n",
    "---\n",
    "\n",
    "모델 배포가 완료되었으면, 추론을 수행합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import json\n",
    "import numpy as np\n",
    "from io import BytesIO\n",
    "from PIL import Image\n",
    "\n",
    "img_list = os.listdir(path)\n",
    "img_path_list = [os.path.join(path, img) for img in img_list]\n",
    "print(img_path_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_idx = random.randint(0, len(img_list)-1)\n",
    "img_path = img_path_list[test_idx]\n",
    "pred_cls_idx, pred_cls_str, prob = get_inference(img_path, predictor)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "마지막으로 latency를 측정합니다. 본 예제는 CPU만으로 추론을 수행해도 충분합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "start_time = time.time()\n",
    "num_tests = 20\n",
    "for _ in range(num_tests):\n",
    "    response = predictor.predict(payload)\n",
    "end_time = (time.time()-start_time)\n",
    "print(f'Neo optimized inference time is {(end_time/num_tests)*1000:.4f} ms (avg)')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Endpoint Clean-up\n",
    "\n",
    "SageMaker Endpoint로 인한 과금을 막기 위해, 본 핸즈온이 끝나면 반드시 Endpoint를 삭제해 주시기 바랍니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor.delete_endpoint()\n",
    "pytorch_model.delete_model()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_pytorch_latest_p37",
   "language": "python",
   "name": "conda_pytorch_latest_p37"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}