{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Finetuning PyTorch BERT with NGC\n", "The BERT family of models are a powerful set of natural language understanding models based on the transformer architecture originating from the paper Attention Is All You Need, which you can find here: https://arxiv.org/abs/1706.03762\n", "\n", "These models work by running unsupervised pre-training on massive sets of text data. This process requires an enormous amount of time and compute. Luckily for us, BERT models are built for transfer learning. BERT models are able to be finetuned to perform many different NLU tasks like question answering, sentiment analysis, document summarization, and more.\n", "\n", "For this tutorial, we are going to download a BERT base model and finetune this model on the Stanford Question Answering Dataset and walk through the steps necessary to deploy it to a Sagemaker endpoint." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "!wget https://api.ngc.nvidia.com/v2/models/nvidia/bert_base_pyt_amp_ckpt_pretraining_lamb/versions/1/files/bert_base.pt -O bert_base.pt" ] }, { "cell_type": "code", "execution_count": 167, "metadata": {}, "outputs": [], "source": [ "import collections\n", "import math\n", "import random\n", "import torch\n", "import os, tarfile, json\n", "import time, datetime\n", "from io import StringIO\n", "import numpy as np\n", "import boto3\n", "import sagemaker\n", "from sagemaker.pytorch import estimator, PyTorchModel, PyTorchPredictor, PyTorch\n", "from sagemaker.utils import name_from_base\n", "from model_utils.file_utils import PYTORCH_PRETRAINED_BERT_CACHE\n", "from model_utils.modeling import BertForQuestionAnswering, BertConfig, WEIGHTS_NAME, CONFIG_NAME\n", "from model_utils.tokenization import (BasicTokenizer, BertTokenizer, whitespace_tokenize)\n", "from types import SimpleNamespace\n", "from helper_funcs import *\n", "\n", "sagemaker_session = sagemaker.Session()\n", "role = sagemaker.get_execution_role()\n", "bucket = sagemaker_session.default_bucket() # can replace with your own S3 bucket 'privisaa-bucket-virginia'\n", "prefix = 'bert_pytorch_ngc'\n", "runtime_client = boto3.client('runtime.sagemaker')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create our training docker container\n", "\n", "Now we are going to create a custom docker container based on the NGC Bert container and push it to AWS Elastic Container Registry (ECR). In order to perform this operation, our sagemaker execution role needs to have access to ECR, this can be configured in IAM." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%sh\n", "\n", "# The name of our algorithm\n", "algorithm_name=bert-ngc-torch-train\n", "\n", "chmod +x train\n", "chmod +x serve\n", "\n", "account=$(aws sts get-caller-identity --query Account --output text)\n", "\n", "# Get the region defined in the current configuration (default to us-west-2 if none defined)\n", "region=$(aws configure get region)\n", "region=${region:-us-east-1}\n", "\n", "fullname=\"${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest\"\n", "\n", "# If the repository doesn't exist in ECR, create it.\n", "\n", "aws ecr describe-repositories --repository-names \"${algorithm_name}\" > /dev/null 2>&1\n", "\n", "if [ $? 
"if [ $? -ne 0 ]\n", "then\n", "    aws ecr create-repository --repository-name \"${algorithm_name}\" > /dev/null\n", "fi\n", "\n", "# Get the login command from ECR and execute it directly\n", "$(aws ecr get-login --region ${region} --no-include-email)\n", "\n", "# Build the docker image locally with the image name and then push it to ECR\n", "# with the full name. Pushing requires that your role has the ECR push\n", "# permissions, including ecr:InitiateLayerUpload.\n", "\n", "docker build -t ${algorithm_name} .\n", "docker tag ${algorithm_name} ${fullname}\n", "\n", "docker push ${fullname}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download and inspect the data\n", "\n", "To get an idea of what the Stanford Question Answering Dataset contains, let's download it locally and take a look." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "!cd data/squad/ && bash squad_download.sh\n" ] }, { "cell_type": "code", "execution_count": 156, "metadata": {}, "outputs": [], "source": [ "# load the v2.0 dev set\n", "\n", "with open('data/squad/v2.0/dev-v2.0.json', 'r') as f:\n", "    squad_data = json.load(f)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we've loaded the dev set, you can use the block below to look at a random context, question, and answer." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the v2.0 dev set contains 35 articles; pick one at random\n", "ind = random.randint(0, 34)\n", "sq = squad_data['data'][ind]\n", "print('Paragraph title: ', sq['title'], '\\n')\n", "print(sq['paragraphs'][0]['context'], '\\n')\n", "print('Question:', sq['paragraphs'][0]['qas'][0]['question'])\n", "\n", "# some v2.0 questions are unanswerable and have an empty answers list\n", "ans = sq['paragraphs'][0]['qas'][0]['answers']\n", "print('Answer:', ans[0]['text'] if ans else '(unanswerable)')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## View BERT input\n", "\n", "BERT requires us to transform our text data into a numeric representation known as tokens. There are a variety of tokenizers available; we are going to use one specially designed for BERT, which we instantiate with our vocabulary file. Let's take a look at the transformed question and context that we will supply to BERT for inference."
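] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before running the full preprocessing, here is a minimal sketch of what the WordPiece tokenizer does to a single sentence on its own. This is purely illustrative: it assumes the vocab file shipped with this example and that the tokenizer exposes convert_tokens_to_ids, as the original pytorch-pretrained-bert tokenizer it derives from does." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# illustrative only: tokenize one sentence and map the tokens to vocabulary ids\n", "demo_tokenizer = BertTokenizer('vocab', do_lower_case=True, max_len=512)\n", "demo_tokens = demo_tokenizer.tokenize('NGC containers make finetuning BERT straightforward.')\n", "print(demo_tokens)\n", "print(demo_tokenizer.convert_tokens_to_ids(demo_tokens))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Words outside the vocabulary are split into sub-word pieces (marked with ##), which is how BERT keeps its vocabulary small while still covering arbitrary text. Now let's run the full preprocessing on a real question and context from the dataset."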
] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "namespace(input_ids=[101, 2054, 4145, 2106, 17586, 1999, 16433, 2224, 2000, 2817, 3722, 6681, 1029, 102, 17586, 1999, 16433, 2109, 1996, 4145, 1997, 2486, 1999, 1996, 2817, 1997, 17337, 1998, 3048, 5200, 1998, 3722, 6681, 1010, 2021, 24762, 2107, 2004, 17484, 1998, 7905, 14428, 6155, 6025, 8050, 10697, 1999, 4824, 2486, 1012, 1999, 2112, 2023, 2001, 2349, 2000, 2019, 12958, 4824, 1997, 1996, 2823, 2512, 1011, 5793, 2486, 1997, 15012, 1010, 1998, 1037, 8821, 14710, 3193, 1997, 1996, 3267, 1997, 3019, 4367, 1012, 1037, 8050, 7561, 2001, 1996, 6772, 2008, 1037, 2486, 2003, 3223, 2000, 5441, 4367, 1010, 2130, 2012, 1037, 5377, 10146, 1012, 2087, 1997, 1996, 3025, 24216, 2015, 2055, 4367, 1998, 2486, 2020, 2776, 13371, 2011, 21514, 14891, 9463, 2072, 1998, 2909, 7527, 8446, 1012, 2007, 2010, 8045, 12369, 1010, 2909, 7527, 8446, 19788, 4277, 1997, 4367, 2008, 2020, 2025, 5301, 1011, 2006, 2005, 3053, 2093, 3634, 2086, 1012, 2011, 1996, 2220, 3983, 2301, 1010, 15313, 2764, 1037, 3399, 1997, 20805, 2008, 11178, 10173, 1996, 2895, 1997, 2749, 2006, 5200, 2007, 4852, 2617, 2050, 2379, 1996, 3177, 1997, 2422, 1010, 1998, 2036, 3024, 12369, 2046, 1996, 2749, 2550, 2011, 24665, 18891, 12516, 1998, 1999, 8743, 2401, 1012, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], input_mask=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], segment_ids=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "doc_tokens = sq['paragraphs'][0]['context'].split()\n", "tokenizer = BertTokenizer('vocab', do_lower_case=True, max_len=512)\n", "query_tokens = tokenizer.tokenize(sq['paragraphs'][0]['qas'][0]['question'])\n", "\n", "feature = preprocess_tokenized_text(doc_tokens,\n", "                                    query_tokens,\n", "                                    tokenizer,\n", "                                    max_seq_length=384,\n", "                                    max_query_length=64)\n", "\n", "tensors_for_inference, tokens_for_postprocessing = feature\n", "\n", "# display the token ids, input mask, and segment ids that BERT consumes\n", "tensors_for_inference" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Send data to S3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!aws s3 cp --recursive data/squad s3://{bucket}/{prefix}" ] }, { "cell_type": "code", "execution_count": 133, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "'s3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.\n" ] } ], "source": [ "s3train = f's3://{bucket}/{prefix}/v1.1/train-v1.1.json'\n", "\n", "train = sagemaker.session.s3_input(s3train, distribution='FullyReplicated',\n", "                                   content_type=None, s3_data_type='S3Prefix')\n", "\n", "data_channels = {'train': train}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Instantiate the model\n", "\n", "Now we are going to instantiate our model. Here we specify the hyperparameters for training as well as the number of GPUs we want to use. The ml.p3dn.24xlarge instances contain eight V100 GPUs, making them ideal for heavy-duty deep learning training.\n", "\n", "Once we have set our hyperparameters, we will instantiate a SageMaker Estimator that we will use to run our training job. We specify the Docker image we just pushed to ECR as well as an entry point giving instructions for what our container should do when it starts up. Our Docker container supports two commands, train and serve. When we start a training job, SageMaker runs our Docker container behind the scenes and tells it to run the train command."
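] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a rough sketch of that contract (this is not the actual NGC container code), a train entrypoint reads the hyperparameters that SageMaker mounts at /opt/ml/input/config/hyperparameters.json, finds the training data under /opt/ml/input/data/<channel>/, and writes its artifacts to /opt/ml/model, which SageMaker tars up and uploads to S3 when the job finishes. The cell below only assembles and prints the kind of launch command such an entrypoint might run; the script name run_squad.py and its flags are assumptions based on the NGC BERT container and are not executed here." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# illustrative sketch only: the launch command a 'train' entrypoint might build.\n", "# The /opt/ml/* paths are the standard SageMaker container mount points;\n", "# 'run_squad.py' and its flags are assumptions, shown for orientation.\n", "example_hps = {'num_gpus': '8',\n", "               'train_file': '/opt/ml/input/data/train/train-v1.1.json',\n", "               'output_dir': '/opt/ml/model'}\n", "\n", "cmd = ['python', '-m', 'torch.distributed.launch',\n", "       '--nproc_per_node', example_hps['num_gpus'],\n", "       'run_squad.py', '--do_train',\n", "       '--train_file', example_hps['train_file'],\n", "       '--output_dir', example_hps['output_dir']]\n", "\n", "print(' '.join(cmd))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's rebuild the full ECR image name so the Estimator knows which image to pull."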
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "account=!aws sts get-caller-identity --query Account --output text\n", "\n", "# Get the region defined in the current configuration (default to us-west-2 if none defined)\n", "region=!aws configure get region\n", "\n", "algoname = 'bert-ngc-torch-train'\n", "\n", "fullname=\"{}.dkr.ecr.{}.amazonaws.com/{}\".format(account[0], region[0], algoname)\n", "\n", "fullname" ] }, { "cell_type": "code", "execution_count": 149, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Parameter image_name will be renamed to image_uri in SageMaker Python SDK v2.\n" ] } ], "source": [ "# set our hyperparameters\n", "hyperparameters = {'bert_model': 'bert-base-uncased', 'num_train_epochs': 2, \n", " 'vocab_file': '/workspace/bert/data/bert_vocab.txt',\n", " 'config_file':'/workspace/bert/bert_config.json', \n", " 'output_dir': '/opt/ml/model',\n", " 'train_file': '/opt/ml/input/data/train/train-v1.1.json', #'/workspace/bert/data/squad/v1.1/train-v1.1.json',\n", " 'num_gpus':8, 'train_batch_size':32, 'max_seq_length':512, 'doc_stride':128, 'seed':1,\n", " 'learning_rate':3e-5,\n", " 'save_to_s3':bucket}\n", "\n", "# instantiate model\n", "torch_model = PyTorch( role=role,\n", " train_instance_count=1,\n", " train_instance_type='ml.p3dn.24xlarge',\n", " entry_point='transform_script.py',\n", " image_name=fullname,\n", " framework_version='1.5.0',\n", " hyperparameters=hyperparameters\n", " )\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fine-tune the model\n", "\n", "If you use an instance with 4 GPUs and a batch size of 4 this process will take ~15 minutes to complete for this particular finetuning task with 2 epochs. Each additional epoch will add another 7 or so minutes. It's recommended to at minimum use a training instance with 4 GPUs, although you will likely get better performance with one of the ml.p3.16xlarge or ml.p3dn.24xlarge instances. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "torch_model.fit(data_channels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy our trained model\n", "\n", "Now that we've finetuned our base BERT model, what now? Let's deploy our trained model to an endpoint and ask it some questions!" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Parameter image will be renamed to image_uri in SageMaker Python SDK v2.\n", "'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "-----------------!" 
] } ], "source": [ "endpoint_name = 'bert-endpoint-byoc-150'\n", "\n", "# if deploying from a model you trained in the same session \n", "# bert_end = torch_model.deploy(instance_type='ml.g4dn.4xlarge', initial_instance_count=1, \n", "# endpoint_name=endpoint_name)\n", "\n", "model_data = f's3://{bucket}/model.tar.gz'\n", "\n", "# We are going to use a SageMaker serving container\n", "torch_model = PyTorchModel(model_data=model_data,\n", " role=role,\n", " entry_point='transform_script.py',\n", " framework_version='1.5.0')\n", "bert_end = torch_model.deploy(instance_type='ml.g4dn.4xlarge', initial_instance_count=1, \n", " endpoint_name=endpoint_name)" ] }, { "cell_type": "code", "execution_count": 173, "metadata": {}, "outputs": [], "source": [ "context = \"\"\"\n", "NGC Containers are designed to enable a software platform centered around minimal OS requirements, \n", "Docker and driver installation on the server or workstation, and provisioning of all application and SDK software \n", "in the NGC containers through the NGC container registry. NGC manages a catalog of fully integrated and optimized \n", "deep learning framework containers that take full advantage of NVIDIA GPUs in both single GPU and \n", "multi-GPU configurations. \n", "\"\"\"\n", "\n", "amzn_context = \"\"\"\n", "Amazon SageMaker is a fully managed service that provides every developer and data scientist with the ability to build, train, and deploy machine learning (ML) models quickly. SageMaker removes the heavy lifting from each step of the machine learning process to make it easier to develop high quality models.\n", "Traditional ML development is a complex, expensive, iterative process made even harder because there are no integrated \n", "tools for the entire machine learning workflow. You need to stitch together tools and workflows, which is time-consuming\n", "and error-prone. SageMaker solves this challenge by providing all of the components used for machine learning in a single \n", "toolset so models get to production faster with much less effort and at lower cost.\n", "\"\"\"\n", "\n", "question = \"What configurations can NGC containers work with?\"\n", "question = \"What do NGC containers take advantage of?\"\n", "\n", "amzn_question = \"How does SageMaker solve traditional ML Development?\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that our endpoint has been deployed, let's send it some requests! " ] }, { "cell_type": "code", "execution_count": 172, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------------\n", "In what country is Normandy located? : France\n", "-------------------------------------------------------------------------------------\n", "CPU times: user 74.4 ms, sys: 3.91 ms, total: 78.3 ms\n", "Wall time: 176 ms\n" ] } ], "source": [ "%%time\n", "\n", "# context='Danielle is a girl who really loves her cat, Steve. Steve is a large cat with a very furry belly. He gets very excited by the prospect of eating chicken covered in gravy.'\n", "# question='who loves Steve?' 
# 'What kind of food does Steve like?'\n", "\n", "pass_in_data = {'context':context, 'question':question}\n", "json_data = json.dumps(pass_in_data)\n", "\n", "\n", "# if model_data:\n", "response = runtime_client.invoke_endpoint(EndpointName=endpoint_name,\n", " ContentType='application/json',\n", " Body=json_data)\n", "response = eval(response['Body'].read().decode('utf-8'))\n", "doc_tokens = context.split()\n", "tokenizer = BertTokenizer('vocab', do_lower_case=True, max_len=512)\n", "query_tokens = tokenizer.tokenize(question)\n", "feature = preprocess_tokenized_text(doc_tokens, \n", " query_tokens, \n", " tokenizer, \n", " max_seq_length=384, \n", " max_query_length=64)\n", "tensors_for_inference, tokens_for_postprocessing = feature\n", "response = get_predictions(doc_tokens, tokens_for_postprocessing, \n", " response[0], response[1], n_best_size=1, \n", " max_answer_length=64, do_lower_case=True, \n", " can_give_negative_answer=True, \n", " null_score_diff_threshold=-11.0)\n", "\n", "#response = bert_end.predict(json.dumps(pass_in_data), initial_args={'ContentType':'application/json'}) \n", "\n", "# print result\n", "print('-------------------------------------------------------------------------------------')\n", "print(f'{question} : {response[0][\"text\"]}')\n", "print('-------------------------------------------------------------------------------------')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cleanup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm bert_base.pt\n", "!rm s3_bucket.txt\n", "bert_end.delete_endpoint()" ] } ], "metadata": { "kernelspec": { "display_name": "conda_pytorch_p36", "language": "python", "name": "conda_pytorch_p36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.10" } }, "nbformat": 4, "nbformat_minor": 4 }