{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# SageMaker Inference Recommender" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Introduction\n", "\n", "SageMaker Inference Recommender is a capability of SageMaker that reduces the time required to get machine learning (ML) models into production by automating performance benchmarking and load testing of models across SageMaker ML instances. You can use Inference Recommender to deploy your model to a real-time inference endpoint that delivers the best performance at the lowest cost. \n", "\n", "With Inference Recommender, you can get instance selection guidance in minutes and an optimized endpoint configuration in hours, eliminating weeks of manual testing and tuning.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Setup \n", "\n", "Note that we are using the `conda_tensorflow2_p36` kernel in SageMaker Notebook Instances. This kernel runs Python 3.6 and TensorFlow 2.1.3. If you'd like to use the same setup, in the AWS Management Console, go to the Amazon SageMaker console. Choose Notebook Instances, and then create a new notebook instance. Upload the current notebook and set the kernel. You can also run this in SageMaker Studio Notebooks with the `TensorFlow 2.6 Python 3.8 CPU Optimized` kernel.\n", "\n", "In the next steps, you'll import standard methods and libraries as well as set variables that will be used in this notebook. The `get_execution_role` function retrieves the AWS Identity and Access Management (IAM) role you created when you created your notebook instance." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install --upgrade pip awscli botocore boto3 --quiet" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker import get_execution_role, Session, image_uris\n", "import boto3\n", "import time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "region = boto3.Session().region_name\n", "role = get_execution_role()\n", "sm_client = boto3.client(\"sagemaker\", region_name=region)\n", "sagemaker_session = Session()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Machine learning model details\n", "\n", "Inference Recommender uses metadata about your ML model to recommend the best instance types and endpoint configurations for deployment.
You can provide as much or as little information as you'd like, but the more information you provide, the better your recommendations will be.\n", "\n", "ML Frameworks: `TENSORFLOW, PYTORCH, XGBOOST, SAGEMAKER-SCIKIT-LEARN`\n", "\n", "ML Domains: `COMPUTER_VISION, NATURAL_LANGUAGE_PROCESSING, MACHINE_LEARNING`\n", "\n", "Example ML Tasks: `CLASSIFICATION, REGRESSION, IMAGE_CLASSIFICATION, OBJECT_DETECTION, SEGMENTATION, FILL_MASK, TEXT_CLASSIFICATION, TEXT_GENERATION, OTHER`\n", "\n", "Note: Select the task that is the closest match to your model. Choose `OTHER` if none apply." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "\n", "# ML framework details\n", "framework = \"tensorflow\"\n", "# Note that only the framework major and minor version is supported for Neo compilation\n", "framework_version = \".\".join(tf.__version__.split(\".\")[:-1])\n", "\n", "# model name as standardized by model zoos or a similar open source model\n", "model_name = \"resnet50\"\n", "\n", "# ML model details\n", "ml_domain = \"COMPUTER_VISION\"\n", "ml_task = \"IMAGE_CLASSIFICATION\"\n", "\n", "print(\"TF Version\", framework_version)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Create a model archive\n", "\n", "SageMaker models need to be packaged in `.tar.gz` files. When your SageMaker Endpoint is provisioned, the files in the archive will be extracted and put in `/opt/ml/model/` on the Endpoint. \n", "\n", "In this step, there are two optional tasks:\n", "\n", " (1) Download a pretrained model from Keras applications\n", " \n", " (2) Write a sample inference script (inference.py)\n", " \n", "These tasks are provided as a sample reference, but they can and should be modified when using your own trained models with Inference Recommender. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional: Download model from Keras applications\n", "\n", "Let's download the model from Keras applications. By setting the variable `download_the_model = False`, you can skip the download and provide your own model archive."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "download_the_model = True" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import tensorflow as tf\n", "from tensorflow.keras.applications.resnet50 import ResNet50\n", "from tensorflow.keras import backend" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if download_the_model:\n", "    tf.keras.backend.set_learning_phase(0)\n", "    input_tensor = tf.keras.Input(name=\"input_1\", shape=(224, 224, 3))\n", "    model = tf.keras.applications.resnet50.ResNet50(input_tensor=input_tensor)\n", "\n", "    # Creating the directory structure\n", "    model_version = \"1\"\n", "    export_dir = \"./model/\" + model_version\n", "    if not os.path.exists(export_dir):\n", "        os.makedirs(export_dir)\n", "        print(\"Directory \", export_dir, \" Created \")\n", "    else:\n", "        print(\"Directory \", export_dir, \" already exists\")\n", "\n", "    # Save to SavedModel\n", "    model.save(export_dir, save_format=\"tf\", include_optimizer=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "os.makedirs(\"code\", exist_ok=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%writefile code/inference.py\n", "\n", "import io\n", "import json\n", "import numpy as np\n", "from PIL import Image\n", "\n", "IMAGE_SIZE = (224, 224)\n", "\n", "\n", "def input_handler(data, context):\n", "    \"\"\"Pre-process request input before it is sent to the TensorFlow Serving REST API\n", "    https://github.com/aws/amazon-sagemaker-examples/blob/0e57a288f54910a50dcbe3dfe2acb8d62e3b3409/sagemaker-python-sdk/tensorflow_serving_container/sample_utils.py#L61\n", "\n", "    Args:\n", "        data (obj): the request data stream\n", "        context (Context): an object containing request and configuration details\n", "\n", "    Returns:\n", "        (dict): a JSON-serializable dict that contains request body and headers\n", "    \"\"\"\n", "\n", "    if context.request_content_type == \"application/x-image\":\n", "        # np.fromstring is deprecated for binary data; use np.frombuffer instead\n", "        buf = np.frombuffer(data.read(), np.uint8)\n", "        image = Image.open(io.BytesIO(buf)).resize(IMAGE_SIZE)\n", "        image = np.array(image)\n", "        image = np.expand_dims(image, axis=0)\n", "        return json.dumps({\"instances\": image.tolist()})\n", "    else:\n", "        _return_error(\n", "            415, 'Unsupported content type \"{}\"'.format(context.request_content_type or \"Unknown\")\n", "        )\n", "\n", "\n", "def output_handler(response, context):\n", "    \"\"\"Post-process TensorFlow Serving output before it is returned to the client.\n", "\n", "    Args:\n", "        response (obj): the TensorFlow Serving response\n", "        context (Context): an object containing request and configuration details\n", "\n", "    Returns:\n", "        (bytes, string): data to return to client, response content type\n", "    \"\"\"\n", "    if response.status_code != 200:\n", "        _return_error(response.status_code, response.content.decode(\"utf-8\"))\n", "    response_content_type = context.accept_header\n", "    prediction = response.content\n", "    return prediction, response_content_type\n", "\n", "\n", "def _return_error(code, message):\n", "    raise ValueError(\"Error: {}, {}\".format(str(code), message))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%writefile code/requirements.txt\n", "\n", "numpy\n", "pillow" ] }
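, { "cell_type": "markdown", "metadata": {}, "source": [ "Before packaging everything up, you can smoke-test `input_handler` locally. The cell below is a minimal sketch: it loads `code/inference.py` with `importlib`, generates a synthetic JPEG with Pillow, and passes it through the handler using a small `StubContext` class, a hypothetical stand-in for the serving container's `Context` object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import importlib.util\n", "import io\n", "import json\n", "\n", "import numpy as np\n", "from PIL import Image\n", "\n", "# Load the handler module we just wrote, without relying on package imports\n", "spec = importlib.util.spec_from_file_location(\"inference\", \"code/inference.py\")\n", "inference = importlib.util.module_from_spec(spec)\n", "spec.loader.exec_module(inference)\n", "\n", "\n", "class StubContext:\n", "    # Hypothetical stand-in for the serving container's Context object\n", "    request_content_type = \"application/x-image\"\n", "    accept_header = \"application/json\"\n", "\n", "\n", "# Generate an in-memory JPEG and run it through the pre-processing handler\n", "buf = io.BytesIO()\n", "Image.new(\"RGB\", (300, 300), color=(120, 30, 200)).save(buf, format=\"JPEG\")\n", "buf.seek(0)\n", "\n", "payload = inference.input_handler(buf, StubContext())\n", "batch = np.array(json.loads(payload)[\"instances\"])\n", "print(\"Batch shape:\", batch.shape)  # expect (1, 224, 224, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a tarball\n", "\n", "To bring your own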
TensorFlow model, SageMaker expects a single archive file in .tar.gz format, containing a model file (\*.pb) in TF SavedModel format and the script (\*.py) for inference." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_archive_name = \"tfmodel.tar.gz\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!tar -cvpzf {model_archive_name} ./model ./code" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Upload to S3\n", "\n", "We now have a model archive ready. We need to upload it to S3 before we can use it with Inference Recommender. We'll use the SageMaker Python SDK to handle the upload." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# model package tarball (model artifact + inference code)\n", "model_url = sagemaker_session.upload_data(path=model_archive_name, key_prefix=\"tfmodel\")\n", "print(\"model uploaded to: {}\".format(model_url))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Create a sample payload archive\n", "\n", "We need to create an archive that contains individual files that Inference Recommender can send to your Endpoint. Inference Recommender will randomly sample files from this archive, so make sure it contains a distribution of payloads similar to what you'd expect in production. Note that your inference code must be able to read in the file formats from the sample payload.\n", "\n", "*Here we are only adding four images for the example. For your own use case(s), it's recommended to add a variety of samples that are representative of your payloads.* " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "payload_archive_name = \"tf_payload.tar.gz\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## optional: download sample images\n", "SAMPLES_BUCKET = f\"sagemaker-example-files-prod-{region}\"\n", "PREFIX = \"datasets/image/pets/\"\n", "payload_location = \"./sample-payload/\"\n", "\n", "if not os.path.exists(payload_location):\n", "    os.makedirs(payload_location)\n", "    print(\"Directory \", payload_location, \" Created \")\n", "else:\n", "    print(\"Directory \", payload_location, \" already exists\")\n", "\n", "sagemaker_session.download_data(payload_location, SAMPLES_BUCKET, PREFIX)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tar the payload" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!cd ./sample-payload/ && tar czvf ../{payload_archive_name} *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Upload to S3\n", "\n", "Next, we'll upload the packaged payload examples (tf_payload.tar.gz) that we created above to S3. The S3 location will be used as input to our Inference Recommender job later in this notebook. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sample_payload_url = sagemaker_session.upload_data(\n", "    path=payload_archive_name, key_prefix=\"tf_payload\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Register model in Model Registry\n", "\n", "In order to use Inference Recommender, you must have a versioned model in SageMaker Model Registry. To register a model in the Model Registry, you must have a model artifact packaged in a tarball and an inference container image.
Registering a model includes the following steps:\n", "\n", "\n", "1) **Create Model Group:** This is a one-time task per machine learning use case. A Model Group contains one or more versions of your packaged model. \n", "\n", "2) **Register Model Version/Package:** This task is performed for each new packaged model version. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Container image URL\n", "\n", "If you don’t have an inference container image, you can use one of the open source AWS [Deep Learning Containers (DLCs)](https://github.com/aws/deep-learning-containers) provided by AWS to serve your ML model. The code below retrieves a DLC based on your ML framework, framework version, Python version, and instance type." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "instance_type = \"ml.c5.xlarge\"  # Note: you can use any CPU-based instance type here; this is just to get a CPU-tagged image\n", "dlc_uri = image_uris.retrieve(\n", "    framework,\n", "    region,\n", "    version=framework_version,\n", "    py_version=\"py3\",\n", "    instance_type=instance_type,\n", "    image_scope=\"inference\",\n", ")\n", "dlc_uri" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Model Group" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_package_group_name = \"{}-cpu-models-\".format(framework) + str(round(time.time()))\n", "model_package_group_description = \"{} models\".format(ml_task.lower())\n", "\n", "model_package_group_input_dict = {\n", "    \"ModelPackageGroupName\": model_package_group_name,\n", "    \"ModelPackageGroupDescription\": model_package_group_description,\n", "}\n", "\n", "create_model_package_group_response = sm_client.create_model_package_group(\n", "    **model_package_group_input_dict\n", ")\n", "print(\n", "    \"ModelPackageGroup Arn : {}\".format(create_model_package_group_response[\"ModelPackageGroupArn\"])\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Register Model Version/Package\n", "\n", "In this step, you'll register the pretrained model that was packaged in the prior steps as a new version in SageMaker Model Registry. First, you'll configure the model package/version, identifying which model package group this new model should be registered within, as well as the initial approval status. You'll also identify the domain and task for your model. These values were set earlier in the notebook, \n", "where `ml_domain = 'COMPUTER_VISION'` and `ml_task = 'IMAGE_CLASSIFICATION'`.\n", "\n", "*Note: ModelApprovalStatus is a configuration parameter that can be used in conjunction with SageMaker Projects to trigger an automated deployment pipeline.* " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_package_description = \"{} {} inference recommender\".format(framework, model_name)\n", "\n", "model_approval_status = \"PendingManualApproval\"\n", "\n", "create_model_package_input_dict = {\n", "    \"ModelPackageGroupName\": model_package_group_name,\n", "    \"Domain\": ml_domain.upper(),\n", "    \"Task\": ml_task.upper(),\n", "    \"SamplePayloadUrl\": sample_payload_url,\n", "    \"ModelPackageDescription\": model_package_description,\n", "    \"ModelApprovalStatus\": model_approval_status,\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up inference specification\n", "\n", "You'll now set up the inference specification configuration for your model version.
This contains information on how the model should be hosted.\n", "\n", "Inference Recommender expects a single input MIME type for sending requests. Learn more about [common inference data formats on SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-inference.html). This MIME type will be sent in the Content-Type header when invoking your endpoint." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "input_mime_types = [\"application/x-image\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you specify a set of instance types below (i.e., a non-empty list), then Inference Recommender will only consider instance types from that set in its recommendations. For this example, we provide a list of common instance types used for image classification algorithms." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "supported_realtime_inference_types = [\"ml.c5.xlarge\", \"ml.m5.large\", \"ml.inf1.xlarge\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional: Model optimization\n", "\n", "[Amazon SageMaker Neo](https://aws.amazon.com/sagemaker/neo) is a capability of SageMaker that automatically optimizes your ML models for any target instance type. With Neo, you don’t need to set up third-party or framework-specific compiler software, or tune the model manually to optimize inference performance. \n", "\n", "Inference Recommender compiles your model using SageMaker Neo if the `ModelInput` field is provided. To prepare the inputs for model compilation, specify the input layer name and shape (NHWC format for TF) for your trained model. The dictionary format required is as follows:\n", "\n", "    For one input: {'input':[1,224,224,3]}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_input_configuration = '{\"input_1\":[1,224,224,3]}'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you don't know your input layer name or shape, you can use the `saved_model_cli`. [Learn more](https://www.tensorflow.org/guide/saved_model)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!saved_model_cli show --dir {export_dir} --all" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that you've collected all the ModelPackage details, the next step is to create the Model Version in Model Registry."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "modelpackage_inference_specification = {\n", "    \"InferenceSpecification\": {\n", "        \"Containers\": [\n", "            {\n", "                \"Image\": dlc_uri,\n", "                \"Framework\": framework.upper(),\n", "                \"FrameworkVersion\": framework_version,\n", "                \"NearestModelName\": model_name,\n", "                \"ModelInput\": {\"DataInputConfig\": data_input_configuration},\n", "            }\n", "        ],\n", "        \"SupportedContentTypes\": input_mime_types,  # required, must be non-null\n", "        \"SupportedResponseMIMETypes\": [],\n", "        \"SupportedRealtimeInferenceInstanceTypes\": supported_realtime_inference_types,  # optional\n", "    }\n", "}\n", "\n", "# Specify the model data\n", "modelpackage_inference_specification[\"InferenceSpecification\"][\"Containers\"][0][\n", "    \"ModelDataUrl\"\n", "] = model_url" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "create_model_package_input_dict.update(modelpackage_inference_specification)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "create_model_package_response = sm_client.create_model_package(**create_model_package_input_dict)\n", "model_package_arn = create_model_package_response[\"ModelPackageArn\"]\n", "print(\"ModelPackage Version ARN : {}\".format(model_package_arn))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sm_client.describe_model_package(ModelPackageName=model_package_arn)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Alternative Option: ContainerConfig\n", "\n", "If your model package version is missing fields that are mandatory for creating an Inference Recommender job, as in the following example (this `create_model_package_input_dict` is missing `Domain`, `Task`, and `SamplePayloadUrl`):\n", "\n", "```\n", "create_model_package_input_dict = {\n", "    \"ModelPackageGroupName\": model_package_group_name,\n", "    \"ModelPackageDescription\": model_package_description,\n", "    \"ModelApprovalStatus\": model_approval_status,\n", "}\n", "```\n", "\n", "you may define the fields `Domain`, `Task`, and `SamplePayloadUrl` in the optional field `ContainerConfig` like so:\n", "\n", "```\n", "payload_config = {\n", "    \"SamplePayloadUrl\": sample_payload_url,\n", "}\n", "\n", "container_config = {\n", "    \"Domain\": ml_domain.upper(),\n", "    \"Task\": ml_task.upper(),\n", "    \"PayloadConfig\": payload_config,\n", "}\n", "```\n", "\n", "and then provide it directly to the `create_inference_recommendations_job()` API like so:\n", "\n", "```\n", "default_response = client.create_inference_recommendations_job(\n", "    JobName=str(default_job),\n", "    JobDescription=\"\",\n", "    JobType=\"Default\",\n", "    RoleArn=role,\n", "    InputConfig={\n", "        \"ModelPackageVersionArn\": model_package_arn,\n", "        \"ContainerConfig\": container_config\n", "    },\n", ")\n", "```\n", "\n", "For more information on what else can be provided via `ContainerConfig`, please refer to the `CreateInferenceRecommendationsJob` documentation: [CreateInferenceRecommendationsJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateInferenceRecommendationsJob.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 7. Create a SageMaker Inference Recommender Default Job\n", "\n", "Now with your model in Model Registry, you can kick off a 'Default' job to get instance recommendations. This only requires your `ModelPackageVersionArn` and comes back with recommendations within an hour.
\n", "\n", "The output is a list of instance type recommendations with associated environment variables, cost, throughput, and latency metrics." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import uuid\n", "from sagemaker import get_execution_role\n", "\n", "inference_client = boto3.client(\"sagemaker\", region)\n", "\n", "role = get_execution_role()\n", "default_job = uuid.uuid1()\n", "default_response = inference_client.create_inference_recommendations_job(\n", "    JobName=str(default_job),\n", "    JobDescription=\"\",\n", "    JobType=\"Default\",\n", "    RoleArn=role,\n", "    InputConfig={\"ModelPackageVersionArn\": model_package_arn},\n", ")\n", "\n", "print(default_response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 8. Instance Recommendation Results\n", "\n", "Each inference recommendation includes `InstanceType`, `InitialInstanceCount`, and `EnvironmentParameters`, which are environment variables tuned for better performance. Each recommendation also includes performance and cost metrics such as `MaxInvocations`, `ModelLatency`, `CostPerHour`, and `CostPerInference`. These metrics can help you narrow down to a specific endpoint configuration that suits your use case. \n", "\n", "Example: \n", "\n", "If your motivation is overall price-performance with an emphasis on throughput, then you should focus on the `CostPerInference` metric. \n", "If your motivation is a balance between latency and throughput, then you should focus on the `ModelLatency` / `MaxInvocations` metrics.\n", "\n", "| Metric | Description |\n", "| --- | --- |\n", "| ModelLatency | The interval of time taken by a model to respond as viewed from SageMaker. This interval includes the local communication times taken to send the request and to fetch the response from the container of a model and the time taken to complete the inference in the container.
Units: Milliseconds |\n", "| MaxInvocations | The maximum number of InvokeEndpoint requests sent to an endpoint per minute.
Units: None |\n", "| CostPerHour | The estimated cost per hour for your real-time endpoint.
Units: US Dollars |\n", "| CostPerInference | The estimated cost per inference for your real-time endpoint.
Units: US Dollars |" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pprint\n", "import pandas as pd\n", "\n", "inference_client = boto3.client(\"sagemaker\", region)\n", "\n", "stopped = False\n", "while not stopped:\n", "    inference_recommender_job = inference_client.describe_inference_recommendations_job(\n", "        JobName=str(default_job)\n", "    )\n", "    if inference_recommender_job[\"Status\"] in [\"COMPLETED\", \"STOPPED\", \"FAILED\"]:\n", "        stopped = True\n", "    else:\n", "        print(\"Inference recommender job in progress\")\n", "        time.sleep(600)\n", "\n", "if inference_recommender_job[\"Status\"] == \"FAILED\":\n", "    print(\"Inference recommender job failed\")\n", "    print(\"Failed Reason: {}\".format(inference_recommender_job[\"FailureReason\"]))\n", "else:\n", "    print(\"Inference recommender job completed\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Detailing out the result" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = [\n", "    {**x[\"EndpointConfiguration\"], **x[\"ModelConfiguration\"], **x[\"Metrics\"]}\n", "    for x in inference_recommender_job[\"InferenceRecommendations\"]\n", "]\n", "df = pd.DataFrame(data)\n", "df.drop(columns=[\"VariantName\"], inplace=True, errors=\"ignore\")\n", "pd.set_option(\"display.max_colwidth\", 400)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional: ListInferenceRecommendationsJobSteps\n", "To see the list of subtasks for an Inference Recommender job, simply provide the `JobName` to the `ListInferenceRecommendationsJobSteps` API. \n", "\n", "For more information on the API, please refer to the documentation: [ListInferenceRecommendationsJobSteps](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ListInferenceRecommendationsJobSteps.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list_job_steps_response = inference_client.list_inference_recommendations_job_steps(\n", "    JobName=str(default_job)\n", ")\n", "print(list_job_steps_response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9. Custom Load Test\n", "\n", "With an 'Advanced' job, you can provide your production requirements, select instance types, tune environment variables, and perform more extensive load tests. This typically takes about 2 hours depending on your traffic pattern and the number of instance types. \n", "\n", "The output is a list of endpoint configuration recommendations (instance type, instance count, environment variables) with associated cost, throughput, and latency metrics.\n", "\n", "In the example below, we tune the endpoint against the environment variable `OMP_NUM_THREADS` with values `[1, 2, 4]`, and we set a P95 latency threshold of `500` ms. The goal is to find which value of `OMP_NUM_THREADS` provides the best performance. \n", "\n", "For some context, Python ML stacks commonly use OpenMP to implement multithreading within processes. The default value of `OMP_NUM_THREADS` is equal to the number of CPU cores. However, when implemented on top of Simultaneous Multi-Threading (SMT), such as Intel’s Hyper-Threading, a process might oversubscribe a particular core by spawning twice as many threads as the number of physical CPU cores. In certain cases, a Python binary might end up spawning up to four times as many threads as there are physical processor cores.
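", "\n", "As a quick check, the cell below is a minimal sketch that prints the logical processor count visible to Python (`os.cpu_count()` counts logical processors, so on an SMT-enabled host it is typically twice the physical core count) along with candidate `OMP_NUM_THREADS` values derived from it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# os.cpu_count() reports logical processors (2x physical cores with SMT enabled)\n", "logical_processors = os.cpu_count() or 1\n", "print(\"Logical processors:\", logical_processors)\n", "\n", "# Candidate OMP_NUM_THREADS values to sweep: 1 and roughly the physical core count\n", "print(\"Candidate OMP_NUM_THREADS values:\", [1, max(1, logical_processors // 2)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "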
Therefore, an ideal setting for this parameter, if you have oversubscribed the available cores with worker threads, is 1, or half the number of CPU cores on an SMT-enabled CPU." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "instance_type = \"ml.c5.xlarge\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import uuid\n", "\n", "inference_client = boto3.client(\"sagemaker\", region)\n", "\n", "role = get_execution_role()\n", "advanced_job = uuid.uuid1()\n", "advanced_response = inference_client.create_inference_recommendations_job(\n", "    JobName=str(advanced_job),\n", "    JobDescription=\"\",\n", "    JobType=\"Advanced\",\n", "    RoleArn=role,\n", "    InputConfig={\n", "        \"ModelPackageVersionArn\": model_package_arn,\n", "        \"JobDurationInSeconds\": 7200,\n", "        \"EndpointConfigurations\": [\n", "            {\n", "                \"InstanceType\": instance_type,\n", "                \"EnvironmentParameterRanges\": {\n", "                    \"CategoricalParameterRanges\": [\n", "                        {\"Name\": \"OMP_NUM_THREADS\", \"Value\": [\"1\", \"2\", \"4\"]}\n", "                    ]\n", "                },\n", "            }\n", "        ],\n", "        \"ResourceLimit\": {\"MaxNumberOfTests\": 3, \"MaxParallelOfTests\": 1},\n", "        \"TrafficPattern\": {\n", "            \"TrafficType\": \"PHASES\",\n", "            \"Phases\": [{\"InitialNumberOfUsers\": 1, \"SpawnRate\": 1, \"DurationInSeconds\": 120}],\n", "        },\n", "    },\n", "    StoppingConditions={\n", "        \"MaxInvocations\": 1000,\n", "        \"ModelLatencyThresholds\": [{\"Percentile\": \"P95\", \"ValueInMilliseconds\": 500}],\n", "    },\n", ")\n", "\n", "print(advanced_response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 10. Custom Load Test Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import uuid\n", "import pprint\n", "import pandas as pd\n", "\n", "inference_client = boto3.client(\"sagemaker\", region)\n", "\n", "stopped = False\n", "while not stopped:\n", "    inference_recommender_job = inference_client.describe_inference_recommendations_job(\n", "        JobName=str(advanced_job)\n", "    )\n", "    if inference_recommender_job[\"Status\"] in [\"COMPLETED\", \"STOPPED\", \"FAILED\"]:\n", "        stopped = True\n", "    else:\n", "        print(\"Inference recommender job in progress\")\n", "        time.sleep(600)\n", "\n", "if inference_recommender_job[\"Status\"] == \"FAILED\":\n", "    print(\"Inference recommender job failed\")\n", "    print(\"Failed Reason: {}\".format(inference_recommender_job[\"FailureReason\"]))\n", "else:\n", "    print(\"Inference recommender job completed\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Detailing out the result" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = [\n", "    {**x[\"EndpointConfiguration\"], **x[\"ModelConfiguration\"], **x[\"Metrics\"]}\n", "    for x in inference_recommender_job[\"InferenceRecommendations\"]\n", "]\n", "df = pd.DataFrame(data)\n", "df.drop(columns=[\"VariantName\"], inplace=True, errors=\"ignore\")\n", "pd.set_option(\"display.max_colwidth\", 400)\n", "df.head()" ] }
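, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional: Deploy the top recommendation\n", "\n", "Once you've picked a winning configuration, you can create an endpoint from it. The cell below is a minimal sketch, not a full deployment workflow: it ranks the recommendations by `CostPerInference`, approves the model package (deploying a versioned package requires an `Approved` status), and creates an endpoint with the recommended instance type and count. The `ir-demo-*` names are just illustrative choices." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "best = df.sort_values(\"CostPerInference\").iloc[0]  # cheapest per inference\n", "\n", "suffix = str(round(time.time()))\n", "endpoint_model_name = \"ir-demo-model-\" + suffix\n", "endpoint_config_name = \"ir-demo-config-\" + suffix\n", "endpoint_name = \"ir-demo-endpoint-\" + suffix\n", "\n", "# Deploying a versioned model package requires it to be approved\n", "sm_client.update_model_package(\n", "    ModelPackageArn=model_package_arn, ModelApprovalStatus=\"Approved\"\n", ")\n", "\n", "sm_client.create_model(\n", "    ModelName=endpoint_model_name,\n", "    ExecutionRoleArn=role,\n", "    Containers=[{\"ModelPackageName\": model_package_arn}],\n", ")\n", "\n", "sm_client.create_endpoint_config(\n", "    EndpointConfigName=endpoint_config_name,\n", "    ProductionVariants=[\n", "        {\n", "            \"VariantName\": \"AllTraffic\",\n", "            \"ModelName\": endpoint_model_name,\n", "            \"InstanceType\": best[\"InstanceType\"],\n", "            \"InitialInstanceCount\": int(best[\"InitialInstanceCount\"]),\n", "        }\n", "    ],\n", ")\n", "\n", "sm_client.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load.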
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker-inference-recommender|inference-recommender.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker-inference-recommender|inference-recommender.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, 
"category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", 
"gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", 
"vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "instance_type": "ml.g4dn.xlarge", "kernelspec": { "display_name": "Python 3 (TensorFlow 2.10.0 Python 3.9 GPU Optimized)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/tensorflow-2.10.1-gpu-py39-cu112-ubuntu20.04-sagemaker-v1.2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 4 }