{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Integrate Inference Recommender With Your Existing Python SDK Model Workflows\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "* SDK https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#sagemaker.model.Model.right_size\n", "* boto3 https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#client\n", "\n", "In this notebook we will demonstrate the following Python SDK workflow:\n", "1. Download the Model & Payload\n", "2. Create the Python SDK model\n", "3. (Optional) Register the Model with Model Registry\n", "4. Run an Inference Recommendations Job\n", "5. Inspect the contents of the Inference Recommendations\n", "6. Review detailed CloudWatch metrics\n", "7. Deploy the model \n", "8. Perform inference off of the deployed model\n", "9. 
Clean up\n", "\n", "Throughout this Python SDK workflow, we use a scikit-learn (SKLearn) model.\n", "\n", "**This sample is provided for demonstration purposes. Make sure to conduct appropriate testing before adapting this code for your own use cases!**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 0: Make sure that you're using the latest Python SDK and Boto3 (skip this step if you are already on the latest versions)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "!pip install --upgrade pip awscli botocore boto3 sagemaker --quiet" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up Client and Session" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sagemaker\n", "import boto3\n", "import time\n", "\n", "region = boto3.Session().region_name\n", "role = sagemaker.get_execution_role()\n", "sm_client = boto3.client(\"sagemaker\", region_name=region)\n", "sagemaker_session = sagemaker.Session()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: Download the Model & Payload\n", "\n", "In this example, we use a pre-trained scikit-learn model that was trained on the California Housing dataset, which ships with scikit-learn: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html. The California Housing dataset was originally published in:\n", "\n", "> Pace, R. Kelley, and Ronald Barry. \"Sparse spatial auto-regressions.\" Statistics & Probability Letters 33.3 (1997): 291-297."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download the Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "export_dir = \"./model/\"\n", "\n", "if not os.path.exists(export_dir):\n", " os.makedirs(export_dir)\n", " print(\"Directory \", export_dir, \" Created \")\n", "else:\n", " print(\"Directory \", export_dir, \" already exists\")\n", "\n", "model_archive_name = \"sk-model.tar.gz\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s3 = boto3.client(\"s3\")\n", "s3.download_file(\n", " f\"sagemaker-example-files-prod-{region}\",\n", " \"models/california-housing/model.joblib\",\n", " export_dir + \"model.joblib\",\n", ")" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Tar the model and code" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!tar -cvpzf {model_archive_name} -C ./model \"model.joblib\" -C ../code \"inference.py\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download the payload " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "payload_location = \"./sample-payload/\"\n", "\n", "if not os.path.exists(payload_location):\n", " os.makedirs(payload_location)\n", " print(\"Directory \", payload_location, \" Created \")\n", "else:\n", " print(\"Directory \", payload_location, \" already exists\")\n", "\n", "payload_archive_name = \"sk_payload.tar.gz\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import fetch_california_housing\n", "from sklearn.model_selection import train_test_split\n", "import pandas as pd\n", "\n", "data = fetch_california_housing()\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(\n", " data.data, data.target, test_size=0.25, random_state=42\n", ")\n", "\n", "# 
we don't train a model, so we will need only the testing data\n", "testX = pd.DataFrame(X_test, columns=data.feature_names)\n", "# Save testing data to CSV\n", "testX[data.feature_names].head(10).to_csv(\n", " os.path.join(payload_location, \"test_data.csv\"), header=False, index=False\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tar the payload" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!cd ./sample-payload/ && tar czvf ../{payload_archive_name} *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Upload Your model and payload to S3\n", "\n", "We will be uploading the pretrained model and corresponding test set as `sk-model.tar.gz` and as `sk_payload.tar.gz` to S3." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "prefix = \"sagemaker/scikit-learn-inference-recommender\"\n", "\n", "model_url = sagemaker_session.upload_data(model_archive_name, key_prefix=prefix)\n", "sample_payload_url = sagemaker_session.upload_data(payload_archive_name, key_prefix=prefix)\n", "\n", "print(\"model uploaded to: {}\".format(model_url))\n", "print(\"sample payload uploaded to: {}\".format(sample_payload_url))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Create the Python SDK model\n", "\n", "In this example we will be using a Python SDK `SKLearnModel` model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.sklearn.model import SKLearnModel\n", "from sagemaker import image_uris\n", "\n", "model = SKLearnModel(\n", " model_data=model_url,\n", " role=role,\n", " image_uri=image_uris.retrieve(\n", " framework=\"sklearn\", region=region, version=\"1.0-1\", image_scope=\"inference\"\n", " ),\n", " entry_point=\"./code/inference.py\",\n", " framework_version=\"1.0-1\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 
(Optional) Step 3: Register the Model with Model Registry\n", "\n", "We call the Python SDK `.register()` method to create a `model_package` and a `model_package_group` for the `SKLearnModel`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Uncomment the code below to register your model with Model Registry,\n", "# then replace \"model\" with \"model_package\" in the following code blocks\n", "\n", "# model_package_group_name = \"sample-notebook-package-group-\" + str(round(time.time()))\n", "# model_package = model.register(\n", "#     content_types=[\"text/csv\"],\n", "#     response_types=[\"text/csv\"],\n", "#     model_package_group_name=model_package_group_name,\n", "#     image_uri=model.image_uri,\n", "#     approval_status=\"Approved\",\n", "# )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: Run an Inference Recommendations Job\n", "\n", "The Python SDK method for Inference Recommender is `.right_size()`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let us run a simple Default Inference Recommendations Job" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "job_name = \"sample-notebook-inference-recommendation-job\" + str(round(time.time()))\n", "\n", "model.right_size(\n", "    job_name=job_name,\n", "    sample_payload_url=sample_payload_url,\n", "    supported_content_types=[\"text/csv\"],\n", "    supported_instance_types=[\"ml.c5.large\"],\n", "    framework=\"SAGEMAKER-SCIKIT-LEARN\",\n", ")\n", "\n", "# If you registered the model in Step 3, you can also run the Inference Recommender\n", "# job with the model package.\n", "\n", "# model_package.right_size(\n", "#     job_name=job_name,\n", "#     sample_payload_url=sample_payload_url,\n", "#     supported_content_types=[\"text/csv\"],\n", "#     supported_instance_types=[\"ml.c5.large\"],\n", "#     framework=\"SAGEMAKER-SCIKIT-LEARN\",\n", "# )" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": 
[ "### To run an Advanced Inference Recommendations Job, here is a basic load test that shows how:\n", "```\n", "from sagemaker.parameter import CategoricalParameter\n", "from sagemaker.inference_recommender.inference_recommender_mixin import (\n", "    Phase,\n", "    ModelLatencyThreshold,\n", ")\n", "\n", "advanced_job_name = (\n", "    \"inference-recommendation-job\" + \"-load-test-\" + str(round(time.time()))\n", ")\n", "\n", "hyperparameter_ranges = [\n", "    {\n", "        \"instance_types\": CategoricalParameter([\"ml.m5.xlarge\", \"ml.g4dn.xlarge\"]),\n", "        \"OMP_NUM_THREADS\": CategoricalParameter([\"1\", \"2\"]),\n", "    }\n", "]\n", "\n", "phases = [\n", "    Phase(duration_in_seconds=300, initial_number_of_users=2, spawn_rate=2),\n", "    Phase(duration_in_seconds=300, initial_number_of_users=14, spawn_rate=2),\n", "]\n", "\n", "model_latency_thresholds = [\n", "    ModelLatencyThreshold(percentile=\"P95\", value_in_milliseconds=100)\n", "]\n", "\n", "# Uses the model package registered in Step 3\n", "model_package.right_size(\n", "    job_name=advanced_job_name,\n", "    sample_payload_url=sample_payload_url,\n", "    supported_content_types=[\"text/csv\"],\n", "    framework=\"SAGEMAKER-SCIKIT-LEARN\",\n", "    job_duration_in_seconds=3600,\n", "    hyperparameter_ranges=hyperparameter_ranges,\n", "    phases=phases,  # TrafficPattern\n", "    max_invocations=100,  # StoppingConditions\n", "    model_latency_thresholds=model_latency_thresholds,\n", "    max_tests=5,  # ResourceLimit\n", "    max_parallel_tests=5,\n", ")\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5: Inspect the contents of the Inference Recommendations\n", "- Use `.inference_recommender_job_results` to get the full response JSON.\n", "- Invoke the `describe_inference_recommendations_job` API to get the full response JSON.\n", "- Use `.inference_recommendations` to get just the default recommendation's response JSON." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model.inference_recommender_job_results" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "inference_recommendation_res = sm_client.describe_inference_recommendations_job(JobName=job_name)\n", "print(inference_recommendation_res)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model.inference_recommendations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 6: Review detailed CloudWatch metrics\n", "\n", "Below we've provided a helper script cloudwatch.py that you can import to get access to functions for reviewing CW metrics. Just by providing your region and Inference Recommender job name you can get detailed charts of your endpoints during the load test.\n", "\n", "| Metric | Description |\n", "| --- | --- |\n", "| **OverheadLatency** | The interval of time added to the time taken to respond to a client request by SageMaker overheads. This interval is measured from the time SageMaker receives the request until it returns a response to the client, minus the ModelLatency. Overhead latency can vary depending on multiple factors, including request and response payload sizes, request frequency, and authentication/authorization of the request.
**Units**: Microseconds |\n", "| **CPUUtilization** | The sum of each individual CPU core's utilization. The CPU utilization of each core range is 0–100. For example, if there are four CPUs, the CPUUtilization range is 0%–400%. For processing jobs, the value is the CPU utilization of the processing container on the instance.
For training jobs, the value is the CPU utilization of the algorithm container on the instance.
For batch transform jobs, the value is the CPU utilization of the transform container on the instance.
For endpoint variants, the value is the sum of the CPU utilization of the primary and supplementary containers on the instance.
Note:
For multi-instance jobs, each instance reports CPU utilization metrics. However, the default view in CloudWatch shows the average CPU utilization across all instances.
**Units**: Percent |\n", "| **MemoryUtilization** | The percentage of memory that is used by the containers on an instance. This value range is 0%–100%.
For processing jobs, the value is the memory utilization of the processing container on the instance.
For training jobs, the value is the memory utilization of the algorithm container on the instance.
For batch transform jobs, the value is the memory utilization of the transform container on the instance.
For endpoint variants, the value is the sum of the memory utilization of the primary and supplementary containers on the instance.
**Units**: Percent |\n", "\n", "Learn more about CloudWatch metrics for SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/monitoring-cloudwatch.html" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from cloudwatch import get_endpoint_metrics\n", "\n", "cw_client = boto3.client(\"cloudwatch\", region_name=region)\n", "df_cw = get_endpoint_metrics(sm_client, cw_client, region, job_name, include_plots=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### View Inference Recommender CloudWatch metrics (Advanced Job)\n", "You can also observe load test metrics as seen by Inference Recommender. Inference Recommender publishes the following metrics to the `/aws/sagemaker/InferenceRecommendationsJobs` CloudWatch namespace.\n", "\n", "| Metric | Description |\n", "| --- | --- |\n", "| **ClientInvocations** | The number of `InvokeEndpoint` requests sent to a model endpoint as seen by Inference Recommender. To get the total number of requests sent to a model endpoint, use the Sum statistic.
**Units**: None
**Valid statistics**: Sum | \n", "| **ClientInvocationErrors** | The number of `InvokeEndpoint` requests that failed as seen by Inference Recommender.
**Units**: None
**Valid statistics**: Sum | \n", "| **ClientLatency** | The interval of time taken by a model to respond as seen by Inference Recommender. Note that this time is in milliseconds, whereas the `ModelLatency` endpoint invocation metric is in microseconds.
**Units**: Milliseconds
**Valid statistics**: Average, Sum, Min, Max, Sample Count, percentiles | \n", "| **NumberOfUsers** | The number of users sending `InvokeEndpoint` requests to a model endpoint.
**Units**: None
**Valid statistics**: Max, Min, Average |\n", "\n", "As an example, we will graph the \"ClientInvocations\" metric of the first recommendation in the job result." ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "```\n", "from PIL import Image\n", "from io import BytesIO\n", "import json\n", "\n", "cloudwatch_client = boto3.client(\"cloudwatch\")\n", "\n", "# Describe the advanced job so that the endpoint name matches advanced_job_name\n", "advanced_job_res = sm_client.describe_inference_recommendations_job(JobName=advanced_job_name)\n", "endpoint_name = advanced_job_res[\"InferenceRecommendations\"][0][\n", "    \"EndpointConfiguration\"\n", "][\"EndpointName\"]\n", "widget = {\n", "    \"metrics\": [\n", "        [\n", "            \"/aws/sagemaker/InferenceRecommendationsJobs\",\n", "            \"ClientInvocations\",\n", "            \"JobName\",\n", "            advanced_job_name,\n", "            \"EndpointName\",\n", "            endpoint_name,\n", "        ]\n", "    ],\n", "    \"title\": \"ClientInvocations\",\n", "    \"view\": \"timeSeries\",\n", "    \"start\": \"-PT1H\",\n", "    \"end\": \"P0D\",\n", "    \"stat\": \"Sum\",\n", "    \"period\": 60,\n", "}\n", "\n", "widget_string = json.dumps(widget)\n", "\n", "response = cloudwatch_client.get_metric_widget_image(MetricWidget=widget_string)\n", "\n", "Image.open(BytesIO(response[\"MetricWidgetImage\"]))\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 7: Deploy the model to an endpoint\n", "By default, calling `deploy` after `right_size` deploys the first recommendation in the recommendation list.\n", "To deploy a specific recommendation instead, pass its recommendation ID as shown in the optional section below."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.sklearn.model import SKLearnPredictor\n", "import uuid\n", "\n", "\n", "model.predictor_cls = SKLearnPredictor\n", "\n", "endpoint_name = \"notebook-test-\" + str(uuid.uuid4())\n", "predictor = model.deploy(endpoint_name=endpoint_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional: deploy a specific recommendation with its recommendation ID" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```\n", "from sagemaker.sklearn.model import SKLearnPredictor\n", "import uuid\n", "\n", "# Substitute recommendation_id with the one you want to deploy\n", "recommendation_id = \"\"\n", "model_package.predictor_cls = SKLearnPredictor\n", "\n", "endpoint_name = \"notebook-test-\" + str(uuid.uuid4())\n", "predictor = model_package.deploy(\n", "    inference_recommendation_id=recommendation_id, endpoint_name=endpoint_name\n", ")\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 8: Produce Inference" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# The payload CSV was written without a header row, so read it with header=None\n", "payload = pd.read_csv(\"./sample-payload/test_data.csv\", header=None)\n", "\n", "inference = predictor.predict(payload)\n", "\n", "print(inference)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 9: Clean up\n", "\n", "Clean up the resources if needed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Delete model and endpoint\n", "predictor.delete_model()\n", "predictor.delete_endpoint()\n", "\n", "\n", "# Delete the model package and its group if you registered your model in Step 3\n", "# sagemaker_session.sagemaker_client.delete_model_package(\n", "#     ModelPackageName=model_package.model_package_arn\n", "# )\n", "# sagemaker_session.sagemaker_client.delete_model_package_group(\n", "#     ModelPackageGroupName=model_package_group_name\n", "# )" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "## Conclusion\n", "This notebook illustrates how to use the SageMaker Python SDK `right_size` API to run an Inference Recommender job, and the `inference_recommendation_id` parameter of the `deploy` method to deploy a specific recommendation. The notebook walks you through downloading a pre-trained scikit-learn model, optionally registering the model in Model Registry, creating a Default or Advanced Inference Recommendations job, inspecting and deploying the recommendation results, invoking the endpoint to produce inferences, and cleaning up the resources created." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker-inference-recommender|python-sdk|inference-recommender-with-python-sdk.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, 
"memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": 
"ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 
1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory 
Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, 
"category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "interpreter": { "hash": "2817208bf8c8750f5dab5d6cc6872c9cd2a5e267f466766610f3ce054b996ce3" }, "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "vscode": { "interpreter": { "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" } } }, "nbformat": 4, "nbformat_minor": 4 }