{ "cells": [ { "cell_type": "markdown", "id": "9e61d282", "metadata": {}, "source": [ "# Running multi-container endpoints on Amazon SageMaker\n" ] }, { "cell_type": "markdown", "id": "cef4f82c", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "b49beb16", "metadata": {}, "source": [ "\n", "SageMaker multi-container endpoints enable customers to deploy multiple containers to deploy different models on a SageMaker endpoint. The containers can be run in a sequence as an inference pipeline, or each container can be accessed individually by using direct invocation to improve endpoint utilization and optimize costs.\n", "\n", "\n", "This notebook shows how to create a multi-container endpoint which will host both the PyTorch(>=1.5) model and a TensorFlow(>=2.0) model, on a single endpoint. Here, `Direct` invocation behavior of multi-container endpoints is showcased where each model container can be invoked directly rather than being called in a sequence.\n", "\n", "This notebook is divided in the following sections:\n", "\n", "1. **Pre-requisites**\n", "1. **Train a TensorFlow Model in SageMaker**\n", "1. **Train a PyTorch Model in SageMaker**\n", "1. **Setup Multi-container Endpoint with Direct Invocation**\n", "1. **Inference**\n", "1. **Clean up**" ] }, { "cell_type": "markdown", "id": "5f9c3c07", "metadata": {}, "source": [ "## Section 1: Pre-requisites" ] }, { "cell_type": "markdown", "id": "071ed4e6", "metadata": {}, "source": [ "First, import some necessary libraries and variables. This is the place where the output paths for the models are defined." ] }, { "cell_type": "code", "execution_count": null, "id": "01773f8c", "metadata": { "tags": [] }, "outputs": [], "source": [ "import os\n", "import json\n", "import time\n", "import random\n", "import numpy as np\n", "from utils.mnist import mnist_to_numpy, normalize\n", "import random\n", "import matplotlib.pyplot as plt\n", "\n", "import boto3\n", "import sagemaker\n", "from sagemaker.tensorflow import TensorFlow\n", "from sagemaker.pytorch import PyTorch\n", "from sagemaker import get_execution_role\n", "from sagemaker.s3 import S3Downloader\n", "from sagemaker.s3 import S3Uploader\n", "\n", "sess = sagemaker.Session()\n", "\n", "role = get_execution_role()\n", "\n", "bucket = sess.default_bucket()\n", "\n", "output_prefix = \"/multi-container-endpoint/output\"\n", "output_path = \"s3://\" + bucket + output_prefix\n", "\n", "region = sess.boto_region_name\n", "\n", "sm_client = sess.sagemaker_client\n", "runtime_sm_client = sess.sagemaker_runtime_client\n", "s3_client = boto3.client(\"s3\")" ] }, { "cell_type": "markdown", "id": "770c7165", "metadata": {}, "source": [ "### Dataset" ] }, { "cell_type": "markdown", "id": "1bc9333f", "metadata": {}, "source": [ "This notebook uses `MNIST` dataset. `MNIST` is a widely used dataset for handwritten digit classification. It consists of 70,000 labeled `28x28` pixel grayscale images of hand-written digits. The dataset is split into 60,000 training images and 10,000 test images. 
There are 10 classes (one for each of the 10 digits). " ] }, { "cell_type": "markdown", "id": "b162a1cf", "metadata": {}, "source": [ "### Set up channels for training and testing data\n", "\n", "Next, the framework Estimator needs to know where to find the training and testing data. This can be an S3 location, or a path on the local file system if local mode is used. For this notebook, download the `MNIST` data from a public S3 bucket and upload it to the default bucket created in the first cell. \n", "\n", "__NOTE: Local mode is not supported in Studio.__" ] }, { "cell_type": "code", "execution_count": null, "id": "75a536fc", "metadata": { "tags": [] }, "outputs": [], "source": [ "import logging\n", "import boto3\n", "from botocore.exceptions import ClientError\n", "\n", "# Download training and testing data from a public S3 bucket\n", "\n", "\n", "def download_from_s3(data_dir=\"/tmp/data\", train=True):\n", "    \"\"\"Download the MNIST dataset files from a public S3 bucket\n", "\n", "    Args:\n", "        data_dir (str): directory to save the data\n", "        train (bool): download training set\n", "\n", "    Returns:\n", "        None\n", "    \"\"\"\n", "\n", "    if not os.path.exists(data_dir):\n", "        os.makedirs(data_dir)\n", "\n", "    if train:\n", "        images_file = \"train-images-idx3-ubyte.gz\"\n", "        labels_file = \"train-labels-idx1-ubyte.gz\"\n", "    else:\n", "        images_file = \"t10k-images-idx3-ubyte.gz\"\n", "        labels_file = \"t10k-labels-idx1-ubyte.gz\"\n", "\n", "    # download objects\n", "    s3 = boto3.client(\"s3\")\n", "    bucket = f\"sagemaker-example-files-prod-{region}\"\n", "    for obj in [images_file, labels_file]:\n", "        key = os.path.join(\"datasets/image/MNIST\", obj)\n", "        dest = os.path.join(data_dir, obj)\n", "        if not os.path.exists(dest):\n", "            s3.download_file(bucket, key, dest)\n", "    return\n", "\n", "\n", "download_from_s3(\"/tmp/data\", True)\n", "download_from_s3(\"/tmp/data\", False)" ] }, { "cell_type": "markdown", "id": "93225db4", "metadata": {}, "source": [ "### Create channels for SageMaker Training\n", "\n", "The keys of the dictionary `channels` are passed to the training image, and each key creates an environment variable of the form `SM_CHANNEL_<CHANNEL_NAME>`. \n", "\n", "In this example, `SM_CHANNEL_TRAINING` and `SM_CHANNEL_TESTING` are created in the training image (check `tensorflow/train.py` or `pytorch/train.py` to see how these variables are accessed). For more information, see: [SM_CHANNEL_{channel_name}](https://github.com/aws/sagemaker-training-toolkit/blob/master/ENVIRONMENT_VARIABLES.md#sm_channel_channel_name)" ] }, { "cell_type": "code", "execution_count": null, "id": "d54edb5d", "metadata": { "tags": [] }, "outputs": [], "source": [ "# upload to the default bucket\n", "\n", "dataset_prefix = \"multi-container-endpoint/dataset\"\n", "\n", "loc = sess.upload_data(path=\"/tmp/data\", bucket=bucket, key_prefix=dataset_prefix)\n", "\n", "channels = {\"training\": loc, \"testing\": loc}" ] }, { "cell_type": "markdown", "id": "5a5b61f6", "metadata": {}, "source": [ "\n", "Now that all the pre-requisites are set up, it is time to train the models. In the following section, a TensorFlow model is trained on the `MNIST` dataset.\n" ] }, { "cell_type": "markdown", "id": "1175c38a", "metadata": {}, "source": [ "## Section 2: Train a TensorFlow model in SageMaker using the TensorFlow Estimator\n", "\n", "The `TensorFlow` class allows you to run a training script on SageMaker infrastructure in a containerized environment. 
\n", "\n", "It needs the following parameters to set up the environment:\n", "\n", "- `entry_point`: A user defined python file to be used by the training container as the instructions for training. This file is further discussed in the next subsection\n", "\n", "- `role`: An IAM role to make AWS service requests\n", "\n", "- `instance_type`: The type of SageMaker instance to run the training script. \n", "\n", "- `model_dir`: S3 bucket URI where the checkpoint data and models can be exported to during training (default: None). To disable having `model_dir` passed to the training script, set `model_dir`=False\n", "\n", "- `instance_count`: The number of instances needed to run the training job. Multiple instances are needed for distributed training\n", "\n", "- `output_path`: S3 bucket URI to save training output (model artifacts and output files)\n", "\n", "- `framework_version`: The version of TensorFlow to use.\n", "\n", "- `py_version`: The python version to use\n", "\n", "For more information, see [the API reference](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/sagemaker.tensorflow.html)\n", "\n" ] }, { "cell_type": "markdown", "id": "663baba0", "metadata": {}, "source": [ "### Implement the entry point for training\n", "\n", "The entry point for training is a python script that provides all the code for training a TensorFlow model. It is used by the SageMaker TensorFlow Estimator (`TensorFlow` class) as the entry point for running the training job.\n", "\n", "Under the hood, SageMaker TensorFlow Estimator downloads a docker image with runtime environments specified by the parameters you used to initiate the estimator class, and it injects the training script into the docker image to be used as the entry point to run the container.\n", "\n", "In the rest of the notebook, *training image* refers to the docker image specified by the Estimator and *training container* refers to the container that runs the training image. \n", "\n", "This means the training script is very similar to a training script that might run outside Amazon SageMaker, but it can access the useful environment variables provided by the training image. Checkout [the complete list of environment variables](https://github.com/aws/sagemaker-training-toolkit/blob/master/ENVIRONMENT_VARIABLES.md) for a complete description of all environment variables your training script can access to. \n", "\n", "In this example, the training script at `tensorflow/code/train.py` is used as the entry point for the TensorFlow Estimator. " ] }, { "cell_type": "code", "execution_count": null, "id": "b4751387", "metadata": { "tags": [] }, "outputs": [], "source": [ "!pygmentize 'tensorflow/code/train.py'" ] }, { "cell_type": "markdown", "id": "1f39aef8", "metadata": {}, "source": [ "### Set hyperparameters\n", "\n", "In addition, TensorFlow Estimator allows parsing command line arguments to your training script via `hyperparameters`. Note that TensorFlow 2.3.1 version is used for training, the same should be used for inference to avoid any errors." 
] }, { "cell_type": "code", "execution_count": null, "id": "05c6b109", "metadata": { "tags": [] }, "outputs": [], "source": [ "tf_output_path = output_path + \"/tensorflow\"\n", "\n", "tf_estimator = TensorFlow(\n", " entry_point=\"train.py\",\n", " source_dir=\"tensorflow/code\", # directory of training script\n", " role=role,\n", " framework_version=\"2.3.1\",\n", " model_dir=False, # don't pass --model_dir to training script\n", " py_version=\"py37\",\n", " instance_type=\"ml.c4.xlarge\",\n", " instance_count=1,\n", " output_path=tf_output_path,\n", " hyperparameters={\n", " \"batch-size\": 512,\n", " \"epochs\": 1,\n", " \"learning-rate\": 1e-3,\n", " \"beta_1\": 0.9,\n", " \"beta_2\": 0.999,\n", " },\n", ")" ] }, { "cell_type": "markdown", "id": "5952abb9", "metadata": {}, "source": [ "### Run the training script on SageMaker\n", "Now, the TensorFlow training container has everything to execute the training script, model training can be started by calling `fit` method." ] }, { "cell_type": "code", "execution_count": null, "id": "a5b4b0b2", "metadata": { "tags": [] }, "outputs": [], "source": [ "tf_estimator.fit(inputs=channels)" ] }, { "cell_type": "markdown", "id": "416e365f", "metadata": {}, "source": [ "### Inspect and store model data\n", "\n", "Now, the training is finished, the model artifact has been saved in the `output_path`." ] }, { "cell_type": "code", "execution_count": null, "id": "8e3f74d3", "metadata": { "tags": [] }, "outputs": [], "source": [ "tf_mnist_model_data = tf_estimator.model_data\n", "print(\"Model artifact saved at:\\n\", tf_mnist_model_data)" ] }, { "cell_type": "markdown", "id": "c0bb4758", "metadata": {}, "source": [ "## Section 3: Train a PyTorch model in SageMaker using PyTorch Estimator\n", "\n", "In this section, A PyTorch model is trained on the same `MNIST` dataset. \n", "\n", "### PyTorch Estimator\n", "\n", "The `PyTorch` class allows to run the training script on SageMaker infrastructure in a containerized environment.\n", "\n", "It needs to have the following parameters to set up the environment:\n", "\n", "- `entry_point`: A user defined python file to be used by the training container as the instructions for training. This file is further discussed in the next subsection.\n", "\n", "- `role`: An IAM role to make AWS service requests\n", "\n", "- `instance_type`: The type of SageMaker instance to run the training script. \n", "\n", "- `instance_count`: The number of instances needed to run the training job. Multiple instances are needed for distributed training.\n", "\n", "- `output_path`: S3 bucket URI to save training output (model artifacts and output files)\n", "\n", "- `framework_version`: The version of PyTorch to use.\n", "\n", "- `py_version`: The python version to use\n", "\n", "For more information, see [the API reference](https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/sagemaker.pytorch.html)\n", "\n", "\n", "### Implement the entry point for training\n", "\n", "The entry point for training is a python script that provides all the code for training a PyTorch model. It is used by the SageMaker PyTorch Estimator (`PyTorch` class above) as the entry point for running the training job.\n", "\n", "Under the hood, SageMaker PyTorch Estimator creates a docker image with runtime environments specified by the parameters used to initiate the Estimator class, and it injects the training script into the docker image to be used as the entry point to run the container. 
Here as well, the training script can access all the useful environment variables provided by the training image, as described in **Section 2**. The training script at `pytorch/code/train.py` is used as the entry point for the PyTorch Estimator." ] }, { "cell_type": "code", "execution_count": null, "id": "1a2b432e", "metadata": { "tags": [] }, "outputs": [], "source": [ "!pygmentize 'pytorch/code/train.py'" ] }, { "cell_type": "markdown", "id": "34823e16", "metadata": {}, "source": [ "### Set hyperparameters\n", "\n", "In addition, the PyTorch Estimator allows passing command-line arguments to your training script via `hyperparameters`. Note that PyTorch 1.8.1 is used for training; the same version should be used for inference to avoid errors.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "4bc22ac3", "metadata": { "tags": [] }, "outputs": [], "source": [ "pytorch_est = PyTorch(\n", "    entry_point=\"train.py\",\n", "    source_dir=\"pytorch/code\",  # directory of your training script\n", "    role=role,\n", "    framework_version=\"1.8.1\",\n", "    py_version=\"py3\",\n", "    instance_type=\"ml.c4.xlarge\",\n", "    instance_count=1,\n", "    output_path=output_path + \"/pytorch\",\n", "    hyperparameters={\"batch-size\": 128, \"epochs\": 1, \"learning-rate\": 1e-3, \"log-interval\": 100},\n", ")" ] }, { "cell_type": "markdown", "id": "a6152c36", "metadata": {}, "source": [ "### Run the training script on SageMaker\n", "Now that the PyTorch training container has everything it needs to execute the training script, training can be started by calling the `fit` method." ] }, { "cell_type": "code", "execution_count": null, "id": "6b20de29", "metadata": { "tags": [] }, "outputs": [], "source": [ "pytorch_est.fit(inputs=channels)" ] }, { "cell_type": "markdown", "id": "37ec02e1", "metadata": {}, "source": [ "### Inspect and store model data\n", "\n", "Now that training is finished, the model artifact has been saved to the `output_path`." ] }, { "cell_type": "code", "execution_count": null, "id": "00a247ea", "metadata": { "tags": [] }, "outputs": [], "source": [ "pt_mnist_model_data = pytorch_est.model_data\n", "print(\"Model artifact saved at:\\n\", pt_mnist_model_data)" ] }, { "cell_type": "markdown", "id": "e636ca2a", "metadata": {}, "source": [ "\n", "## Section 4: Set up Multi-container endpoint with Direct Invocation" ] }, { "cell_type": "markdown", "id": "2c7e9721", "metadata": {}, "source": [ "In this section, a multi-container endpoint is set up.\n", "\n", "SageMaker multi-container endpoints enable customers to deploy multiple containers, serving different models, on the same SageMaker endpoint. The containers can be run in a sequence as an inference pipeline, or each container can be accessed individually by using `direct` invocation to improve endpoint utilization and optimize costs.\n", "\n", "The TensorFlow and PyTorch models trained in the earlier sections are deployed to a single SageMaker endpoint using this multi-container capability. This section uses the `boto3` APIs.\n", "\n", "Setting up a multi-container endpoint is a multi-step process, which looks like the following:\n", "- Create inference container definitions for all the containers needed to deploy\n", "- Create a SageMaker model using the `create_model` API. 
Use the `Containers` parameter instead of `PrimaryContainer`, and include more than one container in the `Containers` parameter.\n", "- Create a SageMaker Endpoint Configuration using the `create_endpoint_config` API\n", "- Create a SageMaker Endpoint using the `create_endpoint` API which uses the model and endpoint configuration created in the earlier steps.\n", "\n" ] }, { "cell_type": "markdown", "id": "eddba257", "metadata": {}, "source": [ "### Create inference container definition for TensorFlow model\n", "\n", "To create a container definition, following must be defined :\n", "\n", "- `ContainerHostname`: The value of the parameter uniquely identifies the container for the purposes of logging and metrics. The `ContainerHostname` parameter is required for each container in a multi-container endpoint with `direct` invocation. Though it can be skipped, in case of serial inference pipeline as the inference pipeline will assign a unique name automatically.\n", "\n", "- `Image`: It is the path where inference code is stored. This can be either in Amazon EC2 Container Registry or in a Docker registry that is accessible from the same VPC that is configured for the endpoint. If custom algorithm is used instead of an algorithm provided by Amazon SageMaker, the inference code must meet Amazon SageMaker requirements.\n", "\n", "- `ModelDataUrl`: The S3 path where the model artifacts, which result from model training, are stored. This path must point to a single GZIP compressed tar archive (`.tar.gz` suffix). The S3 path is required for Amazon SageMaker built-in algorithms/frameworks, but not if a custom algorithm (not provided by sagemaker) is used.\n", "\n", "For the Image argument, supply the ECR path of the TensorFlow 2.3.1 inference image. For deep learning images available in SageMaker, refer to [Available Deep Learning Containers Images](https://github.com/aws/deep-learning-containers/blob/master/available_images.md).\n" ] }, { "cell_type": "code", "execution_count": null, "id": "991e8729", "metadata": { "tags": [] }, "outputs": [], "source": [ "tf_ecr_image_uri = sagemaker.image_uris.retrieve(\n", " framework=\"tensorflow\",\n", " region=region,\n", " version=\"2.3.1\",\n", " py_version=\"py37\",\n", " instance_type=\"ml.c5.4xlarge\",\n", " image_scope=\"inference\",\n", ")\n", "\n", "tensorflow_container = {\n", " \"ContainerHostname\": \"tensorflow-mnist\",\n", " \"Image\": tf_ecr_image_uri,\n", " \"ModelDataUrl\": tf_mnist_model_data,\n", "}" ] }, { "cell_type": "markdown", "id": "dc2f0713", "metadata": {}, "source": [ "### Create inference container definition for PyTorch model\n", "\n", "Now similarly, create the container definition for PyTorch model. \n", "\n", "Here in addition to the arguments defined for TensorFlow container, one more additional argument needs to be defined which is `Environment`. This is because, the PyTorch model server needs to know how to load the model and make the predictions. 
This is explained below.\n", "\n", "\n", "To tell the inference image how to load the model checkpoint, it needs to implement:\n", "\n", "- How to parse the incoming request\n", "- How to use the trained model to run inference\n", "- How to return the prediction to the caller of the service\n", "\n", "\n", "To achieve this, it needs to:\n", "\n", "- implement a function called `model_fn` which returns a PyTorch model\n", "- implement a function called `input_fn` which handles data decoding and returns an object that can be passed to `predict_fn`\n", "- implement a function called `predict_fn` which performs the prediction and returns an object that can be passed to `output_fn`\n", "- implement a function called `output_fn` which performs the serialization of the output given by `predict_fn`\n", "\n", "\n", "To achieve this, an `inference.py` file is created which provides the implementation of all the above functions. This file must be supplied via the environment variable `SAGEMAKER_PROGRAM`.\n", "\n", "The model and `inference.py` also need to be wrapped together in a single `tar.gz`. The following steps are performed to package the inference script and model file together:\n", "\n", "- Download the `model.tar.gz` containing the trained PyTorch model\n", "- Unzip the `model.tar.gz`. The `model.pth` file is visible after unzipping.\n", "- GZIP the model file (`model.pth`) and `inference.py` together in a new `tar.gz`\n", "- Upload the new `tar.gz` to an S3 location, to be referenced in the model container definition later\n" ] }, { "cell_type": "code", "execution_count": null, "id": "a57531c1", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Download the model.tar.gz containing the PyTorch model, to current dir\n", "S3Downloader.download(pt_mnist_model_data, \".\")\n", "\n", "# unzip the tar.gz\n", "!tar -xvf model.tar.gz\n", "\n", "# after unzipping, remove the model.tar.gz\n", "!rm model.tar.gz\n", "\n", "# copy the pytorch inference script to current dir\n", "!cp pytorch/code/inference.py .\n", "\n", "# gzip the inference.py and model file together in a new model.tar.gz\n", "!tar -czvf model.tar.gz model.pth inference.py\n", "\n", "# remove the residual files\n", "!rm inference.py model.pth\n", "\n", "# upload the new tar.gz to s3\n", "updated_pt_model_key = \"multi-container-endpoint/output/pytorch/updated\"\n", "pt_updated_model_uri = S3Uploader.upload(\n", "    \"model.tar.gz\", \"s3://{}/{}\".format(bucket, updated_pt_model_key)\n", ")\n", "\n", "# remove the new model.tar.gz from the current dir\n", "!rm model.tar.gz" ] }, { "cell_type": "markdown", "id": "7bc87625", "metadata": {}, "source": [ "\n", "Now, everything is ready to create a container definition for the PyTorch container.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "aded05d4", "metadata": { "tags": [] }, "outputs": [], "source": [ "pt_ecr_image_uri = sagemaker.image_uris.retrieve(\n", "    framework=\"pytorch\",\n", "    region=region,\n", "    version=\"1.8.1\",\n", "    py_version=\"py36\",\n", "    instance_type=\"ml.c5.4xlarge\",\n", "    image_scope=\"inference\",\n", ")\n", "\n", "pytorch_container = {\n", "    \"ContainerHostname\": \"pytorch-mnist\",\n", "    \"Image\": pt_ecr_image_uri,\n", "    \"ModelDataUrl\": pt_updated_model_uri,\n", "    \"Environment\": {\n", "        \"SAGEMAKER_PROGRAM\": \"inference.py\",\n", "        \"SAGEMAKER_SUBMIT_DIRECTORY\": pt_updated_model_uri,\n", "    },\n", "}" ] }, { "cell_type": "markdown", "id": "ef8e2d25", "metadata": {}, "source": [ 
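"Before wiring the two containers into a SageMaker model, here is a rough, hypothetical sketch of the kind of handler script that `SAGEMAKER_PROGRAM` points at. It is not the actual `pytorch/code/inference.py` used in this example (which, as noted later in this notebook, implements a `transform_fn` instead), and the way the checkpoint is loaded is an assumption:\n", "\n", "```python\n", "import json\n", "import os\n", "\n", "import torch\n", "\n", "\n", "def model_fn(model_dir):\n", "    # Assumption: training saved the full module as model.pth; if it saved a\n", "    # state_dict instead, rebuild the network class here and call load_state_dict.\n", "    model = torch.load(os.path.join(model_dir, \"model.pth\"), map_location=\"cpu\")\n", "    model.eval()\n", "    return model\n", "\n", "\n", "def input_fn(request_body, content_type=\"application/json\"):\n", "    # decode the JSON request into a float tensor\n", "    data = json.loads(request_body)[\"inputs\"]\n", "    return torch.tensor(data, dtype=torch.float32)\n", "\n", "\n", "def predict_fn(input_object, model):\n", "    # run the forward pass without tracking gradients\n", "    with torch.no_grad():\n", "        return model(input_object)\n", "\n", "\n", "def output_fn(prediction, accept=\"application/json\"):\n", "    # serialize the prediction for the caller\n", "    return json.dumps(prediction.tolist())\n", "```" ] }, { "cell_type": "markdown", "id": "ef8e2d26", "metadata": {}, "source": [ 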
"### Create a SageMaker Model\n", "\n", "In the below cell, call the `create_model` API to create a model which contains the definitions of both the PyTorch and TensorFlow containers created above. It needs to supply both the containers under the `Containers` argument. Also set the `Mode` parameter of the `InferenceExecutionConfig` field to `Direct` for direct invocation of each container, or `Serial` to use containers as an inference pipeline. The default mode is `Serial`. For more details, check out [Deploy multi-container endpoints](https://docs.aws.amazon.com/sagemaker/latest/dg/multi-container-endpoints.html)\n", "\n", "\n", "Since this notebook focuses on the Direct invocation behavior, hence set the value as `Direct`.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "8348bdaf", "metadata": { "tags": [] }, "outputs": [], "source": [ "create_model_response = sm_client.create_model(\n", " ModelName=\"mnist-multi-container\",\n", " Containers=[pytorch_container, tensorflow_container],\n", " InferenceExecutionConfig={\"Mode\": \"Direct\"},\n", " ExecutionRoleArn=role,\n", ")" ] }, { "cell_type": "markdown", "id": "36cbde45", "metadata": {}, "source": [ "### Create Endpoint Configuration\n", "\n", "Now, create an endpoint configuration by calling the `create_endpoint_config` API. Here, supply the same `ModelName` used in the `create_model` API call." ] }, { "cell_type": "code", "execution_count": null, "id": "6dcf179d", "metadata": { "tags": [] }, "outputs": [], "source": [ "endpoint_config = sm_client.create_endpoint_config(\n", " EndpointConfigName=\"mnist-multi-container-ep-config\",\n", " ProductionVariants=[\n", " {\n", " \"VariantName\": \"prod\",\n", " \"ModelName\": \"mnist-multi-container\",\n", " \"InitialInstanceCount\": 1,\n", " \"InstanceType\": \"ml.c5.4xlarge\",\n", " },\n", " ],\n", ")" ] }, { "cell_type": "markdown", "id": "cbeca28b", "metadata": {}, "source": [ "### Create a SageMaker Multi-container endpoint\n", "\n", "Now, the last step is to create a SageMaker multi-container endpoint. The `create_endpoint` API is used for this. The API behavior has no change compared to how a single container/model endpoint is deployed." ] }, { "cell_type": "code", "execution_count": null, "id": "dc38d5ae", "metadata": { "tags": [] }, "outputs": [], "source": [ "endpoint = sm_client.create_endpoint(\n", " EndpointName=\"mnist-multi-container-ep\", EndpointConfigName=\"mnist-multi-container-ep-config\"\n", ")" ] }, { "cell_type": "markdown", "id": "c6fa8103", "metadata": {}, "source": [ "The `create_endpoint` API is synchronous in nature and returns an immediate response with the endpoint status being in`Creating` state. It takes around ~8-10 minutes for multi-container endpoint to be `InService`.\n", "\n", "In the below cell, use the `describe_endpoint` API to check the status of endpoint creation. 
The next cell runs a simple polling loop that calls the `describe_endpoint` API until the endpoint status is `InService`." ] }, { "cell_type": "code", "execution_count": null, "id": "95cacf69", "metadata": { "tags": [] }, "outputs": [], "source": [ "describe_endpoint = sm_client.describe_endpoint(EndpointName=\"mnist-multi-container-ep\")\n", "\n", "endpoint_status = describe_endpoint[\"EndpointStatus\"]\n", "\n", "while endpoint_status != \"InService\":\n", "    print(\"Current endpoint status is: {}, Trying again...\".format(endpoint_status))\n", "    time.sleep(60)\n", "    resp = sm_client.describe_endpoint(EndpointName=\"mnist-multi-container-ep\")\n", "    endpoint_status = resp[\"EndpointStatus\"]\n", "\n", "print(\"Endpoint status changed to 'InService'\")" ] }, { "cell_type": "markdown", "id": "e3ad9ec2", "metadata": {}, "source": [ "## Section 5: Inference" ] }, { "cell_type": "markdown", "id": "379d0fe3", "metadata": {}, "source": [ "Now that the endpoint is up, it is time to perform inference against it by specifying one of the container hostnames. First, download the `MNIST` data and select a random sample of images. \n", "\n", "Use the helper functions defined in `utils.mnist` to download the `MNIST` dataset and normalize the input data.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "86ff8bd6", "metadata": { "tags": [] }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "data_dir = \"/tmp/data\"\n", "X, _ = mnist_to_numpy(data_dir, train=False)\n", "\n", "# randomly sample 16 images to inspect\n", "mask = random.sample(range(X.shape[0]), 16)\n", "samples = X[mask]\n", "\n", "# plot the images\n", "fig, axs = plt.subplots(nrows=1, ncols=16, figsize=(16, 1))\n", "\n", "for i, splt in enumerate(axs):\n", "    splt.imshow(samples[i])" ] }, { "cell_type": "code", "execution_count": null, "id": "819130cf", "metadata": { "tags": [] }, "outputs": [], "source": [ "print(samples.shape, samples.dtype)" ] }, { "cell_type": "markdown", "id": "8f89fa02", "metadata": {}, "source": [ "### Invoking the TensorFlow container\n", "\n", "Now invoke the TensorFlow container on the same endpoint. First normalize the selected samples, then pass them to the `invoke_endpoint` API." ] }, { "cell_type": "code", "execution_count": null, "id": "b0392f2a", "metadata": { "tags": [] }, "outputs": [], "source": [ "tf_samples = normalize(samples, axis=(1, 2))\n", "\n", "tf_result = runtime_sm_client.invoke_endpoint(\n", "    EndpointName=\"mnist-multi-container-ep\",\n", "    ContentType=\"application/json\",\n", "    Accept=\"application/json\",\n", "    TargetContainerHostname=\"tensorflow-mnist\",\n", "    Body=json.dumps({\"instances\": np.expand_dims(tf_samples, 3).tolist()}),\n", ")\n", "\n", "tf_body = tf_result[\"Body\"].read().decode(\"utf-8\")\n", "\n", "tf_json_predictions = json.loads(tf_body)[\"predictions\"]\n", "\n", "\n", "# predicted classes: argmax over the returned class probabilities\n", "tf_predictions = np.argmax(np.array(tf_json_predictions, dtype=np.float32), axis=1)" ] }, { "cell_type": "code", "execution_count": null, "id": "49c6bf6a", "metadata": { "tags": [] }, "outputs": [], "source": [ "print(\"Predictions: \", tf_predictions.tolist())" ] }, { "cell_type": "markdown", "id": "032df496", "metadata": {}, "source": [ "### Invoke PyTorch container\n", "\n", "Now, invoke the PyTorch container. The `transform_fn` in `inference.py` expects the parsed data to be a Python dictionary with the key `inputs`, and its value should be a 1D array of length 784. 
Hence, create a sample inference data in the cell below." ] }, { "cell_type": "markdown", "id": "3eba89b4", "metadata": {}, "source": [ "Before we invoke the SageMaker PyTorch model server with `samples`, we need to do some pre-processing\n", "- convert its data type to 32 bit floating point\n", "- normalize each channel (only one channel for `MNIST`)\n", "- add a channel dimension" ] }, { "cell_type": "code", "execution_count": null, "id": "2182e32e", "metadata": { "tags": [] }, "outputs": [], "source": [ "pt_samples = normalize(samples.astype(np.float32), axis=(1, 2))\n", "\n", "pt_result = runtime_sm_client.invoke_endpoint(\n", " EndpointName=\"mnist-multi-container-ep\",\n", " ContentType=\"application/json\",\n", " Accept=\"application/json\",\n", " TargetContainerHostname=\"pytorch-mnist\",\n", " Body=json.dumps({\"inputs\": np.expand_dims(pt_samples, axis=1).tolist()}),\n", ")\n", "\n", "pt_body = pt_result[\"Body\"].read().decode(\"utf-8\")\n", "\n", "pt_predictions = np.argmax(np.array(json.loads(pt_body), dtype=np.float32), axis=1).tolist()\n", "print(\"Predicted digits: \", pt_predictions)" ] }, { "cell_type": "markdown", "id": "bb9a7205", "metadata": {}, "source": [ "\n", "## Section 6: clean up\n" ] }, { "cell_type": "markdown", "id": "7ab901f3", "metadata": {}, "source": [ "Before leaving this exercise, it is a good practice to delete the resources created." ] }, { "cell_type": "code", "execution_count": null, "id": "56bd7445", "metadata": { "tags": [] }, "outputs": [], "source": [ "sm_client.delete_endpoint(EndpointName=\"mnist-multi-container-ep\")\n", "sm_client.delete_endpoint_config(EndpointConfigName=\"mnist-multi-container-ep-config\")\n", "sm_client.delete_model(ModelName=\"mnist-multi-container\")" ] }, { "cell_type": "markdown", "id": "96ddf399", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/advanced_functionality|multi-container-endpoint|direct-invocation|multi-container-direct-invocation.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": 
false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": 
"Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 
1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }