{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hello Tensorflow Serving \n",
    "\n",
    "본 노트북에서는 이전 단계에서 생성한 모델을 추론 단계에서 어떻게 Tensorflow Serving을 이용하여 서비스형태로 실행할 수 있는지 살펴봅니다. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 모델 준비\n",
    "\n",
    "이전 단계에서 생성한 모델(S3경로)을 로드합니다. (해당 파일은 이전단계에서 실행한 `code/train.py`에서 165라인의 `model.save(ckpt_dir)`명령을 통해 학습을 실행한 컨테이너 내부에 저장된 후 SageMaker에 의해 S3로 Export 된 파일입니다.)\n",
    "- 참고 : https://www.tensorflow.org/tutorials/keras/save_and_load#save_the_entire_model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'s3://sagemaker-ap-northeast-1-308961792850/tensorflow/mnist/tensorflow-training-2021-07-16-11-57-03-562/model.tar.gz'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%store -r tf_mnist_model_data\n",
    "tf_mnist_model_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "S3에 있는 모델(`model.tar.gz`)을 로컬 환경으로 복사(다운로드)합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "download: s3://sagemaker-ap-northeast-1-308961792850/tensorflow/mnist/tensorflow-training-2021-07-16-11-57-03-562/model.tar.gz to model/model.tar.gz\n",
      "00000000/\n",
      "00000000/saved_model.pb\n",
      "00000000/variables/\n",
      "00000000/variables/variables.index\n",
      "00000000/variables/variables.data-00000-of-00001\n",
      "00000000/assets/\n"
     ]
    }
   ],
   "source": [
    "!mkdir -p model\n",
    "!aws s3 cp {tf_mnist_model_data} model/\n",
    "!tar -zxvf model/model.tar.gz -C model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tensorflow에서 제공하는 `saved_model_cli` 유틸리티 tool을 이용하여 모델의 signatuure를 살펴보겠습니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:\n",
      "\n",
      "signature_def['__saved_model_init_op']:\n",
      "  The given SavedModel SignatureDef contains the following input(s):\n",
      "  The given SavedModel SignatureDef contains the following output(s):\n",
      "    outputs['__saved_model_init_op'] tensor_info:\n",
      "        dtype: DT_INVALID\n",
      "        shape: unknown_rank\n",
      "        name: NoOp\n",
      "  Method name is: \n",
      "\n",
      "signature_def['serving_default']:\n",
      "  The given SavedModel SignatureDef contains the following input(s):\n",
      "    inputs['input_1'] tensor_info:\n",
      "        dtype: DT_FLOAT\n",
      "        shape: (-1, 28, 28, 1)\n",
      "        name: serving_default_input_1:0\n",
      "  The given SavedModel SignatureDef contains the following output(s):\n",
      "    outputs['output_1'] tensor_info:\n",
      "        dtype: DT_FLOAT\n",
      "        shape: (-1, 10)\n",
      "        name: StatefulPartitionedCall:0\n",
      "  Method name is: tensorflow/serving/predict\n",
      "WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "If using Keras pass *_constraint arguments to layers.\n",
      "\n",
      "Defined Functions:\n",
      "  Function Name: '__call__'\n",
      "    Option #1\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          input_1: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='input_1')\n",
      "\n",
      "  Function Name: '_default_save_signature'\n",
      "    Option #1\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          input_1: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='input_1')\n",
      "\n",
      "  Function Name: 'call_and_return_all_conditional_losses'\n",
      "    Option #1\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          input_1: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='input_1')\n"
     ]
    }
   ],
   "source": [
    "!saved_model_cli show --dir model/00000000 --all"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tensorflow Serving 설치\n",
    "\n",
    "Tensorflow Serving docker 이미지를 가져옵니다. (Tensorflow Serving을 설치하는 방법은 `apt-get install`을 이용하거나 오픈소스로부터 직접 빌드하는 등 여러가지가 있지만 그 중 Docker를 사용하는 것이 가장 간단합니다.)\n",
    "\n",
    "- TF Serving 설치가이드 : https://www.tensorflow.org/tfx/serving/setup\n",
    "- Docker hub tensoflow/serving : https://hub.docker.com/r/tensorflow/serving"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Using default tag: latest\n",
      "latest: Pulling from tensorflow/serving\n",
      "Digest: sha256:6651f4839e1124dbde75ee531825112af0a6b8ef082c88ab14ca53eb69a2e4bb\n",
      "Status: Image is up to date for tensorflow/serving:latest\n",
      "docker.io/tensorflow/serving:latest\n",
      "REPOSITORY                                                                   TAG                 IMAGE ID            CREATED             SIZE\n",
      "308961792850.dkr.ecr.ap-northeast-1.amazonaws.com/sagemaker-decision-trees   latest              ffd2e91dd0a4        2 days ago          379MB\n",
      "sagemaker-decision-trees                                                     latest              ffd2e91dd0a4        2 days ago          379MB\n",
      "308961792850.dkr.ecr.ap-northeast-1.amazonaws.com/xgboost-byoc               latest              8a6b4b686bb6        2 days ago          1.28GB\n",
      "xgboost-byoc                                                                 latest              8a6b4b686bb6        2 days ago          1.28GB\n",
      "308961792850.dkr.ecr.ap-northeast-1.amazonaws.com/iris-model                 latest              d476edee8a68        3 days ago          1.84GB\n",
      "iris-model                                                                   latest              d476edee8a68        3 days ago          1.84GB\n",
      "iris_model                                                                   1.0                 d476edee8a68        3 days ago          1.84GB\n",
      "308961792850.dkr.ecr.ap-northeast-1.amazonaws.com/iris-model                 <none>              f7c2ca933075        6 days ago          1.84GB\n",
      "<none>                                                                       <none>              d54709177015        6 days ago          1.84GB\n",
      "308961792850.dkr.ecr.ap-northeast-1.amazonaws.com/xgboost-byoc               <none>              4f4bd22f6d30        7 days ago          1.28GB\n",
      "308961792850.dkr.ecr.ap-northeast-1.amazonaws.com/xgboost-byoc               <none>              8792752e6bba        7 days ago          1.28GB\n",
      "python                                                                       3.7-buster          9569e8192573        2 weeks ago         877MB\n",
      "ubuntu                                                                       16.04               065cf14a189c        4 weeks ago         135MB\n",
      "ubuntu                                                                       18.04               7d0d8fa37224        4 weeks ago         63.1MB\n",
      "763104351884.dkr.ecr.ap-northeast-1.amazonaws.com/tensorflow-training        2.3-cpu-py37        86acf6174637        5 weeks ago         3.3GB\n",
      "763104351884.dkr.ecr.ap-northeast-1.amazonaws.com/tensorflow-inference       2.3-cpu             7109bafe0044        5 weeks ago         1.34GB\n",
      "tensorflow/serving                                                           latest              e874bf5e4700        2 months ago        406MB\n",
      "763104351884.dkr.ecr.ap-northeast-1.amazonaws.com/tensorflow-inference       2.4-cpu             f75bff2fa69d        2 months ago        1.37GB\n",
      "763104351884.dkr.ecr.ap-northeast-1.amazonaws.com/mxnet-training             1.8.0-cpu-py37      3edc2cf3a707        3 months ago        1.96GB\n",
      "763104351884.dkr.ecr.ap-northeast-1.amazonaws.com/tensorflow-training        2.1-cpu-py3         7c2807ec5000        4 months ago        2.31GB\n",
      "763104351884.dkr.ecr.ap-northeast-1.amazonaws.com/tensorflow-inference       2.1-cpu             3490d3243118        4 months ago        1.28GB\n"
     ]
    }
   ],
   "source": [
    "!docker pull tensorflow/serving\n",
    "!docker images"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tensorflow serving 실행 \n",
    "\n",
    "터미널을 열고 아래 명령을 실행합니다.\n",
    "\n",
    "```\n",
    "cd ~/SageMaker/tf-deploy/\n",
    "docker run -t --rm -p 8501:8501 \\\n",
    "    -v `pwd`/model/:/models/MNIST \\\n",
    "    -e MODEL_NAME=MNIST \\\n",
    "    tensorflow/serving &\n",
    "```\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "c418d47e4736        tensorflow/serving   \"/usr/bin/tf_serving…\"   11 seconds ago      Up 10 seconds       8500/tcp, 0.0.0.0:8501->8501/tcp   nice_wing\n"
     ]
    }
   ],
   "source": [
    "!docker ps "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tensorflow Serving은 관리용 API를 함께 제공합니다. `metadata` API를 이용하여 실행중인 모델의 IO 정보를 확인합니다.\n",
    "\n",
    "- 참고 : https://www.tensorflow.org/tfx/serving/api_rest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'model_spec': {'name': 'MNIST', 'signature_name': '', 'version': '0'},\n",
       " 'metadata': {'signature_def': {'signature_def': {'serving_default': {'inputs': {'input_1': {'dtype': 'DT_FLOAT',\n",
       "       'tensor_shape': {'dim': [{'size': '-1', 'name': ''},\n",
       "         {'size': '28', 'name': ''},\n",
       "         {'size': '28', 'name': ''},\n",
       "         {'size': '1', 'name': ''}],\n",
       "        'unknown_rank': False},\n",
       "       'name': 'serving_default_input_1:0'}},\n",
       "     'outputs': {'output_1': {'dtype': 'DT_FLOAT',\n",
       "       'tensor_shape': {'dim': [{'size': '-1', 'name': ''},\n",
       "         {'size': '10', 'name': ''}],\n",
       "        'unknown_rank': False},\n",
       "       'name': 'StatefulPartitionedCall:0'}},\n",
       "     'method_name': 'tensorflow/serving/predict'},\n",
       "    '__saved_model_init_op': {'inputs': {},\n",
       "     'outputs': {'__saved_model_init_op': {'dtype': 'DT_INVALID',\n",
       "       'tensor_shape': {'dim': [], 'unknown_rank': True},\n",
       "       'name': 'NoOp'}},\n",
       "     'method_name': ''}}}}}"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import requests\n",
    "\n",
    "base_url = 'http://localhost:8501'\n",
    "r = requests.get(f'{base_url}/v1/models/MNIST/metadata')\n",
    "r.json()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 추론 실행\n",
    "\n",
    "임의의 테스트이미지(28x28)를 생성하고 local에서 실행중인 Docker의 `predict` API를 호출합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'predictions': [[-3.97380495,\n",
       "   1.04965734,\n",
       "   4.18941164,\n",
       "   5.63644314,\n",
       "   -2.85667372,\n",
       "   4.06895208,\n",
       "   2.45907688,\n",
       "   0.194627672,\n",
       "   1.70868921,\n",
       "   -0.908244908]]}"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "import json\n",
    "\n",
    "test_data = np.random.rand(1,28,28,1).tolist()\n",
    "payload = json.dumps({\"instances\":test_data})\n",
    "\n",
    "r = requests.post(\"%s/v1/models/MNIST:predict\" % base_url, data=payload)\n",
    "r.json()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tensorflow Serving을 이용하여 학습된 모델을 추론서비스로 만들수 있다는 것을 확인하였습니다. 단, 현재까지 확인한 코드는 기본 기능을 확인하는 수준이었으며 실제 production 환경에서는 보다 복잡한 구성이 필요할 것입니다. (예를 들어 병렬처리, 로깅, 모니터링, Auto-scaling, A/B 테스트 등 성능, 안정성, 운영, 보안 등의 고려 필요)\n",
    "\n",
    "- 참고 : https://www.tensorflow.org/tfx/serving/serving_config"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "c418d47e4736\n"
     ]
    }
   ],
   "source": [
    "!docker ps|grep tensorflow/serving | awk '{print $1}'| xargs docker stop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_tensorflow2_p36",
   "language": "python",
   "name": "conda_tensorflow2_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}