{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deploy Autopilot models to serverless inference endpoints" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Amazon SageMaker Serverless Inference is a purpose-built inference option that makes it easy for customers to deploy and scale ML models. Serverless Inference is ideal for workloads which have idle periods between traffic spurts and can tolerate cold starts. Serverless endpoints also automatically launch compute resources and scale them in and out depending on traffic, eliminating the need to choose instance types or manage scaling policies.\n", "\n", "In this notebook we use models generated with Amazon SageMaker Autopilot and then deploy these models to serverless endpoints.\n", "\n", "We will be using the public [UCI Direct Marketing](https://archive.ics.uci.edu/ml/datasets/bank+marketing) dataset for this example.\n", "\n", "**Notebook Settings:**\n", "\n", "- **SageMaker Classic Notebook Instance:** `ml.t3.xlarge` Notebook Instance & `conda_python3` Kernel\n", "- **SageMaker Studio:** `Python 3 (Data Science 2.0) Kernel`\n", "- **Regions Available:** SageMaker Serverless Inference is currently available in the following regions: \n", " US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), Asia Pacific (Tokyo) and Asia Pacific (Sydney)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "Let's ensure we have the latest packages installed. For this notebook, we need the below versions for `sagemaker` and `boto3` packages\n", "1. sagemaker >= `2.110.0`\n", "2. 
boto3 >= `1.24.84`\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -U awscli sagemaker boto3 --quiet" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import sagemaker\n", "import sys\n", "\n", "print(f\"SageMaker Version: {sagemaker.__version__}\")\n", "print(f\"Boto3 Version: {boto3.__version__}\")" ] }, 
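{ "cell_type": "markdown", "metadata": {}, "source": [ "Optionally, assert that the installed versions meet the minimums above. This is a minimal sanity-check sketch; it assumes the `packaging` library is available in the kernel (it ships with recent `pip`), so skip it if the import fails." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional sanity check (assumes the packaging library is available in this kernel)\n", "from packaging import version\n", "\n", "assert version.parse(sagemaker.__version__) >= version.parse(\"2.110.0\"), \"Please upgrade sagemaker\"\n", "assert version.parse(boto3.__version__) >= version.parse(\"1.24.84\"), \"Please upgrade boto3\"\n", "print(\"Package versions OK\")" ] }, 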
{ "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "Import packages, establish a session, and generate a unique ID to use as a job-name suffix." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Import required libraries\n", "import os\n", "import json\n", "import itertools\n", "import numpy as np\n", "import pandas as pd\n", "\n", "from datetime import datetime\n", "from time import gmtime, strftime, sleep\n", "from uuid import uuid4\n", "from IPython import display\n", "\n", "\n", "# Define region, bucket\n", "session = sagemaker.Session()\n", "region = boto3.Session().region_name\n", "bucket = session.default_bucket()\n", "# Use the default SageMaker execution role, or replace it with your own IAM role ARN\n", "role = sagemaker.get_execution_role()\n", "\n", "prefix = \"autopilot/bankadditional\"\n", "\n", "today = datetime.now().strftime(\"%d%b%Y\")\n", "timestamp_suffix = f\"{str(uuid4())[:6]}-{today}\"\n", "\n", "# Define a SageMaker client object to invoke SageMaker services\n", "sm_client = boto3.client(\"sagemaker\", region_name=region)\n", "\n", "# Set a short prefix for AutoML job names; the suffix above distinguishes individual jobs\n", "automl_job_prefix = \"bankmrkt\"  # 6-8 chars max\n", "model_prefix = automl_job_prefix\n", "\n", "print(f\"Bucket: s3://{bucket}/{prefix}\")\n", "print(f\"Region: {region}\")\n", "print(f\"Role: {role}\")\n", "print(f\"Job and model prefix string: {automl_job_prefix}\")\n", "print(f\"Suffix string: {timestamp_suffix}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dataset\n", "This example uses the [UCI direct marketing dataset](https://archive.ics.uci.edu/ml/datasets/Bank+Marketing):\n", "\n", "[Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014\n", "\n", "Download the dataset from the `sagemaker-example-files-prod-{region}` S3 bucket:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.s3 import S3Downloader\n", "\n", "s3uri = f\"s3://sagemaker-example-files-prod-{region}/datasets/tabular/uci_bank_marketing/bank-additional-full.csv\"\n", "\n", "if not os.path.exists(\"data/bank-additional/bank-additional-full.csv\"):\n", "    print(\"Downloading bank-additional-full.csv...\")\n", "    !mkdir -p data/bank-additional\n", "    S3Downloader.download(s3_uri=s3uri, local_path=\"data/bank-additional\", sagemaker_session=session)\n", "    print(\"Done\")\n", "else:\n", "    print(\"Skipping download... dataset exists at ./data/bank-additional\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualize dataset\n", "The data is related to direct marketing campaigns (phone calls) of a Portuguese banking institution. The classification goal is to predict whether the client will subscribe to a term deposit (variable `y`).\n", "\n", "Problem Type: **Binary Classification**\n", "\n", "Ref: https://archive.ics.uci.edu/ml/datasets/bank+marketing\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_data = pd.read_csv(\"./data/bank-additional/bank-additional-full.csv\")\n", "\n", "pd.set_option(\"display.max_columns\", 10)  # Display up to 10 columns\n", "df_data  # show the first 5 and last 5 rows of the dataframe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Upload dataset to S3\n", "We upload `bank-additional-full.csv` to S3." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set this flag to False for subsequent runs of this notebook\n", "upload_dataset = True" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DATA_FILE = \"data/bank-additional/bank-additional-full.csv\"\n", "\n", "if upload_dataset:\n", "    print(\"Uploading data to S3...\")\n", "    dataset_s3uri = session.upload_data(DATA_FILE, key_prefix=f\"{prefix}/raw\")\n", "    print(f\"Data uploaded to:\\n {dataset_s3uri}\")\n", "else:\n", "    dataset_s3uri = f\"s3://{bucket}/{prefix}/raw/bank-additional-full.csv\"\n", "    print(f\"Skipping upload... dataset is at: {dataset_s3uri}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Launch Autopilot jobs in `ENSEMBLING` and `HPO` modes\n", "\n", "\n", "First, we specify the AutoML job config constants:\n", "- `TargetAttributeName` (target column `y` for your dataset)\n", "- `Training Mode` - `Valid values: AUTO | ENSEMBLING | HYPERPARAMETER_TUNING`\n", "- `ProblemType` (Optional) `Valid values: BinaryClassification | MulticlassClassification | Regression`\n", "- `ObjectiveMetric` (Optional) Valid values: `Accuracy | F1 | MSE` [`AutoMLJobObjective`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLJobObjective.html)\n", "- `MaxCandidates` (Optional; set only for HPO jobs)\n", "- `OutputDataConfig` (Optional; set if you need to specify an output location for generated artifacts)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Autopilot job params\n", "target_column = \"y\"\n", "training_mode = \"ENSEMBLING\"\n", "\n", "# Optional Parameters\n", "problem_type = \"BinaryClassification\"\n", "objective_metric = \"F1\"\n", "max_job_runtime_seconds = 3600\n", "max_runtime_per_job_seconds = 1200\n", "max_candidates = 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we define the Autopilot job config values:\n", "- [`AutoMLJobConfig`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLJobConfig.html) (`Mode` = `AUTO | ENSEMBLING | HYPERPARAMETER_TUNING`)\n", "- [`InputDataConfig`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateAutoMLJob.html#sagemaker-CreateAutoMLJob-request-InputDataConfig)\n", "- [`AutoMLJobObjective`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLJobObjective.html) (Optional. 
`Accuracy | MSE | F1 | F1macro | AUC`)\n", "- [`OutputDataConfig`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_AutoMLOutputDataConfig.html) (Optional)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Define Autopilot job config values" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "automl_job_config = {\n", "    \"CompletionCriteria\": {\n", "        \"MaxRuntimePerTrainingJobInSeconds\": max_runtime_per_job_seconds,\n", "        \"MaxAutoMLJobRuntimeInSeconds\": max_job_runtime_seconds,\n", "    },\n", "    \"Mode\": training_mode,\n", "}\n", "\n", "automl_job_objective = {\"MetricName\": objective_metric}\n", "\n", "input_data_config = [\n", "    {\n", "        \"DataSource\": {\"S3DataSource\": {\"S3DataType\": \"S3Prefix\", \"S3Uri\": dataset_s3uri}},\n", "        \"TargetAttributeName\": target_column,\n", "    }\n", "]\n", "\n", "output_data_config = {\"S3OutputPath\": f\"s3://{bucket}/{prefix}/output\"}\n", "\n", "# Optional: Define a Tag\n", "tags_config = [{\"Key\": \"Project\", \"Value\": \"Autopilot-serverless\"}]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Launch Autopilot job with training mode set to `ENSEMBLING`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", "    ens_automl_job_name = f\"{model_prefix}-ENS-{timestamp_suffix}\"\n", "    print(f\"Launching AutoMLJob → {ens_automl_job_name} with mode set to {training_mode}\")\n", "    response = sm_client.create_auto_ml_job(\n", "        AutoMLJobName=ens_automl_job_name,\n", "        InputDataConfig=input_data_config,\n", "        OutputDataConfig=output_data_config,\n", "        AutoMLJobConfig=automl_job_config,\n", "        ProblemType=problem_type,\n", "        AutoMLJobObjective=automl_job_objective,\n", "        RoleArn=role,\n", "        Tags=tags_config,\n", "    )\n", "    print(response)\n", "except Exception as e:\n", "    print(f\"Error launching ENSEMBLING Autopilot Job: {ens_automl_job_name}\")\n", "    print(f\"{e}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Launch Autopilot job with training mode set to `HYPERPARAMETER_TUNING`\n", "\n", "We update the `automl_job_config` dict, setting `Mode` to `HYPERPARAMETER_TUNING` and `MaxCandidates` to 15.\n", "\n", ">NOTE: In `HPO` mode the best model is derived by tuning various hyperparameters. The default setting for `MaxCandidates` is 250, but for demonstration purposes we set it to 15." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We use the defined job prefix to construct model name(s) and later to construct endpoint config and endpoint names.\n", "try:\n", "    training_mode = \"HYPERPARAMETER_TUNING\"\n", "    automl_job_config[\"Mode\"] = training_mode\n", "    automl_job_config[\"CompletionCriteria\"][\"MaxCandidates\"] = 15\n", "    hpo_automl_job_name = f\"{model_prefix}-HPO-{timestamp_suffix}\"\n", "    print(f\"Launching AutoMLJob → {hpo_automl_job_name} with mode set to {training_mode}\")\n", "    response = sm_client.create_auto_ml_job(\n", "        AutoMLJobName=hpo_automl_job_name,\n", "        InputDataConfig=input_data_config,\n", "        OutputDataConfig=output_data_config,\n", "        AutoMLJobConfig=automl_job_config,\n", "        ProblemType=problem_type,\n", "        AutoMLJobObjective=automl_job_objective,\n", "        RoleArn=role,\n", "        Tags=tags_config,\n", "    )\n", "    print(response)\n", "except Exception as e:\n", "    print(f\"Error launching HPO Autopilot Job: {hpo_automl_job_name}\")\n", "    print(f\"{e}\")" ] }, { 
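"cell_type": "markdown", "metadata": {}, "source": [ "Optionally, confirm that both jobs were created before monitoring them. The cell below is a minimal sketch using the `list_auto_ml_jobs` API; it assumes the `model_prefix` defined above matches only this notebook's jobs." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: list AutoML jobs whose names contain our prefix (assumes the prefix is unique to this notebook)\n", "for summary in sm_client.list_auto_ml_jobs(NameContains=model_prefix)[\"AutoMLJobSummaries\"]:\n", "    print(f\"{summary['AutoMLJobName']} | {summary['AutoMLJobStatus']}\")" ] }, { 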
"cell_type": "markdown", "metadata": {}, "source": [ "### Monitor AutoML job completion status\n", "\n", ">**NOTE:** Jobs with `ENSEMBLING` mode finishes faster" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_job_status(sm_client, job_name):\n", " resp = sm_client.describe_auto_ml_job(AutoMLJobName=job_name)\n", " p_status = resp[\"AutoMLJobStatus\"]\n", " s_status = resp[\"AutoMLJobSecondaryStatus\"]\n", " desc = f\"{job_name}: {p_status} | {s_status} ...\"\n", " return (p_status, desc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# monitor job status launched in ensembling mode\n", "(p_status, desc) = get_job_status(sm_client, ens_automl_job_name)\n", "\n", "while p_status not in (\"Completed\", \"Failed\"):\n", " (p_status, desc) = get_job_status(sm_client, ens_automl_job_name)\n", " if p_status not in (\"Completed\", \"Failed\"):\n", " print(desc)\n", " sleep(60)\n", " continue\n", " else:\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create model from the best candidate generated by Autopilot\n", "- In `Ensemble` training mode Autopilot generates a single Inference container.\n", "\n", "![](./images/ap-jobprofile-ens-04Oct2022.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### helper functions to create model(s), serverless endpoint config and endpoint" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def create_autopilot_model(sm_client, model_name, role, model_container, index):\n", " try:\n", " transform_mode = model_container[\"Environment\"][\"AUTOML_TRANSFORM_MODE\"]\n", " if transform_mode:\n", " model_name = f\"{model_name}-datamodel-{index}\"\n", " except:\n", " model_name = f\"{model_name}-Inf-{index}\"\n", "\n", " if len(model_name) <= 63:\n", " print(f\"Creating Model {index}: {model_name} ...\")\n", " model_response = sm_client.create_model(\n", " ModelName=model_name, ExecutionRoleArn=role, Containers=[model_container]\n", " )\n", " status_code = model_response[\"ResponseMetadata\"][\"HTTPStatusCode\"]\n", " model_arn = model_response[\"ModelArn\"]\n", " return (status_code, model_arn)\n", " else:\n", " print(f\"Model Name: {model_name} length exceeds max. 
allowed length of 63 characters\")\n", "        raise ValueError(\"Model name cannot exceed 63 chars.\")\n", "\n", "\n", "def create_serverless_endpoint_config(\n", "    sm_client, endpoint_config_name, model_name, memory: int = 2048, max_concurrency: int = 20\n", "):\n", "    if len(endpoint_config_name) <= 63:\n", "        print(f\"Creating Endpoint Config: {endpoint_config_name} ...\")\n", "        try:\n", "            epc_response = sm_client.create_endpoint_config(\n", "                EndpointConfigName=endpoint_config_name,\n", "                ProductionVariants=[\n", "                    {\n", "                        \"ModelName\": model_name,\n", "                        \"VariantName\": \"AllTraffic\",\n", "                        \"ServerlessConfig\": {\n", "                            \"MemorySizeInMB\": memory,\n", "                            \"MaxConcurrency\": max_concurrency,\n", "                        },\n", "                    }\n", "                ],\n", "            )\n", "            status_code = epc_response[\"ResponseMetadata\"][\"HTTPStatusCode\"]\n", "            epc_arn = epc_response[\"EndpointConfigArn\"]\n", "            return (status_code, epc_arn)\n", "        except Exception as e:\n", "            print(f\"Error creating EndpointConfig: {endpoint_config_name}\")\n", "            print(f\"{e}\")\n", "    else:\n", "        print(\"EndpointConfig name exceeds the allowed 63-character limit\")\n", "        raise ValueError(\"EndpointConfig name cannot exceed 63 chars.\")\n", "\n", "\n", "def create_serverless_endpoint(sm_client, endpoint_name, endpoint_config_name):\n", "    if len(endpoint_name) <= 63:\n", "        print(f\"Creating Serverless Endpoint: {endpoint_name} ...\")\n", "        try:\n", "            ep_response = sm_client.create_endpoint(\n", "                EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name\n", "            )\n", "            status_code = ep_response[\"ResponseMetadata\"][\"HTTPStatusCode\"]\n", "            return status_code\n", "        except Exception as e:\n", "            print(f\"Error creating Endpoint: {endpoint_name}\")\n", "            print(f\"{e}\")\n", "    else:\n", "        print(\"Endpoint name exceeds the allowed 63-character limit\")\n", "        raise ValueError(\"Endpoint name cannot exceed 63 chars.\")\n", "\n", "\n", "def get_s3_objsize_in_MB(bucket, key):\n", "    s3 = boto3.client(\"s3\")\n", "    resp = s3.head_object(Bucket=bucket, Key=key)[\"ContentLength\"]\n", "    size = round(resp / (1024 * 1024))\n", "    if size < 1:\n", "        print(f\"Model Size: ~ {round(resp / 1024)} KB\")\n", "    else:\n", "        print(f\"Model Size: ~ {size} MB\")\n", "\n", "    return size\n", "\n", "\n", "def set_serverless_endpoint_memory(model_size: int):\n", "    # Return the smallest allowed serverless memory tier (in MB) that fits the model\n", "    for tier in (1024, 2048, 3072, 4096, 5120, 6144):\n", "        if model_size <= tier:\n", "            return tier\n", "    raise ValueError(\"Model size is greater than 6GB\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Verify model size and create serverless endpoint configuration accordingly\n", "\n", ">Serverless Inference auto-assigns compute resources proportional to the memory you select. \n", "If you choose a larger memory size, your container has access to more `vCPUs`. Choose your endpoint’s memory size according to your model size. \n", "Generally, the memory size should be at least as large as your model size. 
\n", "\n", "Ref: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "response = sm_client.describe_auto_ml_job(AutoMLJobName=ens_automl_job_name)\n", "inference_container = response[\"BestCandidate\"][\"InferenceContainers\"][0]\n", "print(f\"Inference Container for AutoML job: {ens_automl_job_name}\")\n", "print(inference_container)\n", "\n", "# Verify generated model size before creating endpoint config.\n", "# Extract s3 Key from ModelDataUrl\n", "model_dataurl_key = inference_container[\"ModelDataUrl\"].split(f\"{bucket}\")[1][1:]\n", "ens_model_size = get_s3_objsize_in_MB(bucket, model_dataurl_key)\n", "print(f\"Ensemble Model Size: ~ {ens_model_size}MB\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set serverless endpoint config `MemorySize` and `MaxConcurrency`. Generally, the memory size should be **at least** as large as your model size. \n", "\n", "Set endpoint memory size to `4096` (4 GB) and `MaxConcurrency` to 10.\n", "\n", "Your serverless endpoint has a minimum RAM size of **1024 MB (1 GB)**, and the maximum RAM size you can choose is **6144 MB (6 GB)**\n", "\n", "If you don't specify any Memory `2048` (2 GB) is chosen as default. The memory sizes you can choose are 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.\n", "\n", "Ref: \n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "models = list()\n", "# create model\n", "(status, model_arn) = create_autopilot_model(\n", " sm_client, ens_automl_job_name, role, inference_container, 0\n", ")\n", "model_name = model_arn.split(\"/\")[1]\n", "models.append(model_name)\n", "\n", "endpoint_configs = list()\n", "endpoint_config_name = f\"epc-{model_name}\"\n", "memory = 4096\n", "# create endpoint config\n", "(status, epc_arn) = create_serverless_endpoint_config(\n", " sm_client, endpoint_config_name, model_name, memory=memory, max_concurrency=10\n", ")\n", "endpoint_configs.append(endpoint_config_name)\n", "\n", "endpoints = list()\n", "endpoint_name = endpoint_config_name.replace(\"epc-\", \"ep-\")\n", "# create serverless endpoint\n", "create_serverless_endpoint(sm_client, endpoint_name, endpoint_config_name)\n", "endpoints.append(endpoint_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wait for endpoint status to be `InService`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_endpoint_status(sm_client, endpoint_name):\n", " resp = sm_client.describe_endpoint(EndpointName=endpoint_name)\n", " status = resp[\"EndpointStatus\"]\n", " desc = f\"{endpoint_name} | {status} ...\"\n", " return (status, desc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# monitor endpoint status\n", "(status, desc) = get_endpoint_status(sm_client, endpoint_name)\n", "print(desc)\n", "while status not in (\"InService\", \"Failed\"):\n", " (status, desc) = get_endpoint_status(sm_client, endpoint_name)\n", " if status not in (\"InService\", \"Failed\"):\n", " print(desc)\n", " sleep(60)\n", " continue\n", " else:\n", " print(desc)\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Send Inference request to serverless endpoint with ENSEMBLE model\n", "\n", ">**NOTE:** Serverless endpoints, being fully-managed, provision compute resources on demand, as a result your endpoint may experience cold starts. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "Set the serverless endpoint config `MemorySize` and `MaxConcurrency`. Generally, the memory size should be **at least** as large as your model size. \n", "\n", "Here we set the endpoint memory size to `4096` (4 GB) and `MaxConcurrency` to 10.\n", "\n", "Your serverless endpoint has a minimum RAM size of **1024 MB (1 GB)**, and the maximum RAM size you can choose is **6144 MB (6 GB)**.\n", "\n", "If you don't specify a memory size, `2048` (2 GB) is chosen as the default. The memory sizes you can choose are 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.\n", "\n", "Ref: https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "models = list()\n", "# create model\n", "(status, model_arn) = create_autopilot_model(\n", "    sm_client, ens_automl_job_name, role, inference_container, 0\n", ")\n", "model_name = model_arn.split(\"/\")[1]\n", "models.append(model_name)\n", "\n", "endpoint_configs = list()\n", "endpoint_config_name = f\"epc-{model_name}\"\n", "memory = 4096\n", "# create endpoint config\n", "(status, epc_arn) = create_serverless_endpoint_config(\n", "    sm_client, endpoint_config_name, model_name, memory=memory, max_concurrency=10\n", ")\n", "endpoint_configs.append(endpoint_config_name)\n", "\n", "endpoints = list()\n", "endpoint_name = endpoint_config_name.replace(\"epc-\", \"ep-\")\n", "# create serverless endpoint\n", "create_serverless_endpoint(sm_client, endpoint_name, endpoint_config_name)\n", "endpoints.append(endpoint_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wait for endpoint status to be `InService`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_endpoint_status(sm_client, endpoint_name):\n", "    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)\n", "    status = resp[\"EndpointStatus\"]\n", "    desc = f\"{endpoint_name} | {status} ...\"\n", "    return (status, desc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Poll the endpoint every 60 seconds until it is InService (or has failed)\n", "(status, desc) = get_endpoint_status(sm_client, endpoint_name)\n", "print(desc)\n", "while status not in (\"InService\", \"Failed\"):\n", "    sleep(60)\n", "    (status, desc) = get_endpoint_status(sm_client, endpoint_name)\n", "    print(desc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Send Inference request to serverless endpoint with ENSEMBLE model\n", "\n", ">**NOTE:** Serverless endpoints, being fully managed, provision compute resources on demand; as a result, your endpoint may experience cold starts. Typically, you'll experience a cold start during the first inference request and after a brief period of inactivity.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.predictor import Predictor\n", "from sagemaker.serializers import CSVSerializer\n", "from sagemaker.deserializers import CSVDeserializer\n", "\n", "endpoint = endpoints[0]\n", "\n", "payload = \"51,technician,married,professional.course,no,yes,no,cellular,apr,thu,687,1,0,1,success,-1.8,93.075,-47.1,1.365,5099.1\"\n", "# payload = \"42,services,married,professional.course,no,yes,no,telephone,may,thu,813,1,999,0,nonexistent,1.1,93.994,-36.4,4.855,5191.0\"\n", "# payload = \"37,services,married,high.school,no,yes,no,telephone,may,mon,226,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0\"\n", "# payload = \"55,admin.,married,high.school,no,no,no,telephone,may,thu,94,1,999,0,nonexistent,1.1,93.994,-36.4,4.855,5191.0\"\n", "# payload = \"34,blue-collar,married,basic.4y,no,no,no,telephone,may,tue,800,4,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0\"\n", "\n", "try:\n", "    print(f\"Invoking endpoint: {endpoint} with payload ...\\n\")\n", "    print(payload)\n", "    predictor = Predictor(\n", "        endpoint_name=endpoint,\n", "        sagemaker_session=session,\n", "        serializer=CSVSerializer(),\n", "        deserializer=CSVDeserializer(),\n", "    )\n", "    prediction = predictor.predict(payload)\n", "    print(f\"Predicted Label: {prediction[0][0]}\")\n", "except Exception as e:\n", "    print(f\"Error invoking Endpoint: {endpoint}\")\n", "    print(f\"{e}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cleanup (ensemble endpoint)\n", "Delete the endpoint, endpoint config, and model, in that order" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "epc_name = endpoint.replace(\"ep-\", \"epc-\")\n", "model_name = endpoint.replace(\"ep-\", \"\")\n", "\n", "print(f\"Deleting endpoint : {endpoint}\")\n", "try:\n", "    sm_client.delete_endpoint(EndpointName=endpoint)\n", "except Exception as e:\n", "    print(f\"{e}\")\n", "\n", "print(f\"Deleting EndpointConfig : {epc_name}\")\n", "try:\n", "    sm_client.delete_endpoint_config(EndpointConfigName=epc_name)\n", "except Exception as e:\n", "    print(f\"{e}\")\n", "\n", "print(f\"Deleting Model : {model_name}\")\n", "try:\n", "    sm_client.delete_model(ModelName=model_name)\n", "except Exception as e:\n", "    print(f\"{e}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy HPO models to serverless endpoints\n", "\n", "Autopilot in HYPERPARAMETER_TUNING mode generates 3 inference containers for binary classification problem types.\n", "\n", "Ref: https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-container-output.html\n", "\n", "![](./images/ap-jobprofile-hpo-04Oct2022.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Monitor HPO AutoML job completion status" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Poll the job launched in HPO mode every 60 seconds until it reaches a terminal status\n", "(p_status, desc) = get_job_status(sm_client, hpo_automl_job_name)\n", "print(desc)\n", "while p_status not in (\"Completed\", \"Failed\"):\n", "    sleep(60)\n", "    (p_status, desc) = get_job_status(sm_client, hpo_automl_job_name)\n", "    print(desc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We enumerate the `InferenceContainers` list from the `BestCandidate` of the HPO job and 
create endpoints accordingly:\n", "\n", "- Step 1 : Create Model\n", "- Step 2 : Create Endpoint Config with Model Name\n", "- Step 3 : Create Endpoint with Endpoint Config" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "job_response = sm_client.describe_auto_ml_job(AutoMLJobName=hpo_automl_job_name)\n", "inference_containers = job_response[\"BestCandidate\"][\"InferenceContainers\"]\n", "print(f\"Inference Containers for AutoML job: {hpo_automl_job_name}\")\n", "print(inference_containers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get model sizes of the generated inference containers" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for idx, container in enumerate(inference_containers):\n", "    print(f\"Calculating generated model_{idx} size\")\n", "    # Extract the S3 key from ModelDataUrl\n", "    model_dataurl_key = container[\"ModelDataUrl\"].split(bucket)[1][1:]\n", "    model_size = get_s3_objsize_in_MB(bucket, model_dataurl_key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All generated models are less than 1 MB. \n", "Let's set `MemorySize` to **2048 MB** and `MaxConcurrency` to **10**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "models = list()\n", "endpoint_configs = list()\n", "endpoints = list()\n", "\n", "memory = 2048\n", "max_concurrency = 10\n", "\n", "# Create model, endpoint_config, endpoint and store them in lists for easier access\n", "for idx, container in enumerate(inference_containers):\n", "    (status, model_arn) = create_autopilot_model(\n", "        sm_client, hpo_automl_job_name, role, container, idx\n", "    )\n", "    model_name = model_arn.split(\"/\")[1]\n", "    print(f\"\\tcreated model: {model_name}...\")\n", "    models.append(model_name)\n", "\n", "    endpoint_config_name = f\"epc-{model_name}\"\n", "    endpoint_name = f\"ep-{model_name}\"\n", "\n", "    (status, epc_arn) = create_serverless_endpoint_config(\n", "        sm_client, endpoint_config_name, model_name, memory=memory, max_concurrency=max_concurrency\n", "    )\n", "    print(f\"\\tcreated epc: {endpoint_config_name}\")\n", "    endpoint_configs.append(endpoint_config_name)\n", "\n", "    res = create_serverless_endpoint(sm_client, endpoint_name, endpoint_config_name)\n", "    print(f\"\\tcreated ep: {endpoint_name}\")\n", "    endpoints.append(endpoint_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Monitor Endpoint creation status\n", "Wait until all endpoints are in `InService` status" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Poll all endpoints every 60 seconds until each is InService (or has failed)\n", "statuses = [get_endpoint_status(sm_client, ep)[0] for ep in endpoints]\n", "print(statuses)\n", "\n", "while any(s not in (\"InService\", \"Failed\") for s in statuses):\n", "    sleep(60)\n", "    statuses = [get_endpoint_status(sm_client, ep)[0] for ep in endpoints]\n", "    print(statuses)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Send inference request to get predictions from each endpoint\n", "\n", "Inference request flow:\n", "\n", "![](./images/ap-hpo-serverless-payloadflow.png)\n" ] }, 
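{ "cell_type": "markdown", "metadata": {}, "source": [ "To connect the diagram to the objects we just created, the cell below prints the container chain in order. This is a minimal sketch; it assumes the `inference_containers` list retrieved from the HPO job above and simply echoes each container's image and model artifact." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Print the HPO container chain in invocation order (feature-transform -> algorithm -> inverse-label-transform)\n", "for idx, container in enumerate(inference_containers):\n", "    print(f\"Container {idx}: {container['Image']}\")\n", "    print(f\"\\tModelDataUrl: {container['ModelDataUrl']}\")" ] }, 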
\n", "\n", "**Note** : In `HPO` mode, the `feature-transform` container response is of type `application/x-recordio-protobuf`. Therefore, we use an [IdentitySerializer](https://sagemaker.readthedocs.io/en/stable/api/inference/serializers.html#sagemaker.serializers.IdentitySerializer) to serialize the response from `feature-transform` container to the `Inference Container` i.e. container #2, without any modification. We then deserialize the output of the `Inference Container` using a `CSVDeserializer`.\n", "\n", "\n", "Serverless endpoints, being fully-managed, provision compute resources on demand, as a result the endpoints may experience cold starts. Typically, you'll experience a cold start during the first inference request and after a brief period of inactivity.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Inference request with `Predicted Label` as output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.predictor import Predictor\n", "from sagemaker.serializers import CSVSerializer, IdentitySerializer\n", "from sagemaker.deserializers import CSVDeserializer\n", "\n", "payload = \"51,technician,married,professional.course,no,yes,no,cellular,apr,thu,687,1,0,1,success,-1.8,93.075,-47.1,1.365,5099.1\"\n", "# payload = \"42,services,married,professional.course,no,yes,no,telephone,may,thu,813,1,999,0,nonexistent,1.1,93.994,-36.4,4.855,5191.0\"\n", "# payload = \"37,services,married,high.school,no,yes,no,telephone,may,mon,226,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0\"\n", "# payload = \"55,admin,married,high.school,no,no,no,telephone,may,thu,94,1,999,0,nonexistent,1.1,93.994,-36.4,4.855,5191.0\"\n", "# payload = \"34,blue-collar,married,basic.4y,no,no,no,telephone,may,tue,800,4,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0\"\n", "# payload = \"100,services,married,high.school,no,yes,no,cellular,apr,thu,483,2,999,0,nonexistent,-1.8,93.075,-47.1,1.41,5099.1\"\n", "\n", "for _, ep in enumerate(endpoints):\n", " try:\n", " print(f\"Payload: {payload}\")\n", " if _ == 1:\n", " predictor = Predictor(\n", " endpoint_name=ep,\n", " sagemaker_session=session,\n", " serializer=IdentitySerializer(content_type=\"application/x-recordio-protobuf\"),\n", " deserializer=CSVDeserializer(),\n", " )\n", " else:\n", " predictor = Predictor(\n", " endpoint_name=ep, sagemaker_session=session, serializer=CSVSerializer()\n", " )\n", " prediction = predictor.predict(payload)\n", " print(f\"Prediction: \\n{prediction}\")\n", " print(\"--\" * 20)\n", " payload = prediction\n", " except Exception as e:\n", " print(f\"Error invoking Endpoint: {ep} \\n {e}\")\n", " break\n", "\n", "print(f\"Final Prediction: {payload.decode('utf-8')}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NOTE (Optional)**: For capturing `probabilities` and `labels` along with `predicted_label` we could update the `inference_containers` to configure the required inputs and outputs.\n", "\n", "For e.g. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cleanup (HPO endpoints)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Deleting endpoints...\")\n", "for ep in endpoints:\n", "    try:\n", "        print(f\"\\tDeleting {ep}...\")\n", "        sm_client.delete_endpoint(EndpointName=ep)\n", "    except Exception as e:\n", "        print(f\"{e}\")\n", "        continue\n", "print(\"--\" * 15)\n", "print(\"Deleting endpoint configs...\")\n", "for epc in endpoint_configs:\n", "    try:\n", "        print(f\"\\tDeleting {epc} ...\")\n", "        sm_client.delete_endpoint_config(EndpointConfigName=epc)\n", "    except Exception as e:\n", "        print(f\"{e}\")\n", "        continue\n", "print(\"--\" * 15)\n", "print(\"Deleting models...\")\n", "for mdl in models:\n", "    try:\n", "        print(f\"\\tDeleting {mdl}...\")\n", "        sm_client.delete_model(ModelName=mdl)\n", "    except Exception as e:\n", "        print(f\"{e}\")\n", "        continue\n", "\n", "print(\"Done\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/autopilot|autopilot-serverless-inference|autopilot-models-serverless-inference.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": 
"General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", 
"gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": 
"ml.p4de.24xlarge", "vcpuNum": 96 } ], "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "vscode": { "interpreter": { "hash": "8c4c57eb149902836539fe532ea353cbda55dc8653105f24c3221170071603b8" } } }, "nbformat": 4, "nbformat_minor": 4 }