{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "82fce004-0da0-4337-becb-a7ff859502ca", "metadata": {}, "source": [ "# Serve Falcon 7B model with Amazon SageMaker Hosting" ] }, { "attachments": {}, "cell_type": "markdown", "id": "d6a17284-4f19-42ac-8caf-bc76a3d1f061", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.\n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "---" ] }, { "attachments": {}, "cell_type": "markdown", "id": "20b832e6-641a-443c-b533-2396e19b01bc", "metadata": {}, "source": [ "In this example we walk through how to deploy and perform inference on the **Falcon 7B model** using the **Large Model Inference (LMI)** container provided by AWS, which uses **DJL Serving** and **DeepSpeed**. The **Falcon 7B model** is a causal decoder model similar to the larger **Falcon 40B model**.\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "9aad5b38-cb9c-4df9-ada8-84d0bed0c91a", "metadata": {}, "source": [ "## Setup" ] }, { "attachments": {}, "cell_type": "markdown", "id": "4abbea2a-b6c8-4106-ae69-17c877a00f88", "metadata": {}, "source": [ "Install the dependencies required to package the model and run inference using Amazon SageMaker. 
Update SageMaker, boto3 etc" ] }, { "cell_type": "code", "execution_count": null, "id": "bd50d8cb-1e06-480a-afb9-a5cc770e3c5f", "metadata": { "tags": [] }, "outputs": [], "source": [ "!pip install sagemaker boto3 --upgrade --quiet" ] }, { "cell_type": "code", "execution_count": null, "id": "67b43806-e267-4259-8dd8-617d4feb839d", "metadata": { "tags": [] }, "outputs": [], "source": [ "import sagemaker\n", "import jinja2\n", "from sagemaker import image_uris\n", "import boto3\n", "import os\n", "import time\n", "import json\n", "from pathlib import Path\n", "from sagemaker.utils import name_from_base" ] }, { "attachments": {}, "cell_type": "markdown", "id": "dc80a142-7f21-48ca-9a44-1462cd82004c", "metadata": {}, "source": [ "## Imports and variables" ] }, { "cell_type": "code", "execution_count": null, "id": "ac2cb0e7-e646-4a8b-b9b5-f25c08ec6b9c", "metadata": { "tags": [] }, "outputs": [], "source": [ "role = sagemaker.get_execution_role() # execution role for the endpoint\n", "sess = sagemaker.session.Session() # sagemaker session for interacting with different AWS APIs\n", "bucket = sess.default_bucket() # bucket to house artifacts\n", "model_bucket = sess.default_bucket() # bucket to house artifacts\n", "s3_code_prefix_deepspeed = \"hf-large-model-djl-/code_falcon7b/deepspeed\" # folder within bucket where code artifact will go\n", "\n", "region = sess._region_name\n", "account_id = sess.account_id()\n", "\n", "s3_client = boto3.client(\"s3\")\n", "sm_client = boto3.client(\"sagemaker\")\n", "smr_client = boto3.client(\"sagemaker-runtime\")\n", "\n", "jinja_env = jinja2.Environment()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8a6bdc38-8552-4d2a-b1fe-aeacc229b85c", "metadata": { "tags": [] }, "source": [ "### 1. 
Create SageMaker compatible model artifacts" ] }, { "attachments": {}, "cell_type": "markdown", "id": "1a7f3f1f-d82e-403c-b966-a1a495a26d5d", "metadata": {}, "source": [ "In order to prepare our model for deployment to a SageMaker Endpoint for hosting, we will need to prepare a few things for SageMaker and our container. We will use a local folder to hold these files, including **serving.properties**, which defines parameters for the LMI container, and **requirements.txt**, which lists the dependencies to install." ] }, { "cell_type": "code", "execution_count": null, "id": "ac8d9604", "metadata": { "tags": [] }, "outputs": [], "source": [ "!mkdir -p code_falcon7b_deepspeed" ] }, { "cell_type": "code", "execution_count": null, "id": "e9f79b23-fd1d-4923-8179-50ad4ee60b52", "metadata": { "tags": [] }, "outputs": [], "source": [ "# define a variable to contain the s3url of the location that has the model\n", "pretrained_model_location = f\"s3://{bucket}/models/falcon40b/\"\n", "print(f\"Pretrained model will be downloaded from ---- > {pretrained_model_location}\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "466c1b6f-213b-4540-b65c-c2c2e029e283", "metadata": {}, "source": [ "In the **serving.properties** file, define the **engine** to use and the **model** to host. Note the **tensor_parallel_degree** parameter, which is set to **1** in this scenario. Since the entire model can fit on a single GPU, we do not have to divide the model into multiple parts. In this case we will use an `ml.g5.2xlarge` instance, which provides **1** GPU. Be careful not to specify a value larger than the number of GPUs the instance provides, or your deployment will fail. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "212382fc", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%writefile ./code_falcon7b_deepspeed/serving.properties\n", "engine=DeepSpeed\n", "option.model_id=tiiuae/falcon-7b\n", "option.tensor_parallel_degree=1\n", "#option.s3url = {{s3url}}" ] }, { "cell_type": "code", "execution_count": null, "id": "8db0e5dc", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%writefile ./code_falcon7b_deepspeed/requirements.txt\n", "einops\n", "torch==2.0.1\n", "git+https://github.com/lanking520/DeepSpeed.git@falcon" ] }, { "attachments": {}, "cell_type": "markdown", "id": "ba7a2900-ef24-4426-bc5c-fd85c5ad269a", "metadata": {}, "source": [ "### 2. Create a model.py with custom inference code" ] }, { "attachments": {}, "cell_type": "markdown", "id": "62b8fbf7-608d-4223-b943-83dcd87ef335", "metadata": {}, "source": [ "SageMaker allows you to bring your own script for inference. Here we create our **model.py** file with the appropriate code for the Falcon 7B model." 
] }, { "cell_type": "code", "execution_count": null, "id": "b7b228ee", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%writefile ./code_falcon7b_deepspeed/model.py\n", "from djl_python import Input, Output\n", "import os\n", "import torch\n", "from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer\n", "from typing import Optional\n", "import deepspeed\n", "import warnings\n", "\n", "predictor = None\n", "\n", "\n", "def get_model(properties):\n", "    # Load the model named in serving.properties and wrap it with DeepSpeed inference\n", "    model_name = properties[\"model_id\"]\n", "    local_rank = int(os.getenv(\"LOCAL_RANK\", \"0\"))\n", "    model = AutoModelForCausalLM.from_pretrained(\n", "        model_name, low_cpu_mem_usage=True, trust_remote_code=True, torch_dtype=torch.bfloat16\n", "    )\n", "    # Property values arrive as strings, so cast the parallel degree to int\n", "    model = deepspeed.init_inference(model, mp_size=int(properties[\"tensor_parallel_degree\"]))\n", "    tokenizer = AutoTokenizer.from_pretrained(model_name)\n", "    generator = pipeline(\n", "        task=\"text-generation\", model=model, tokenizer=tokenizer, device=local_rank\n", "    )\n", "    return generator\n", "\n", "\n", "def handle(inputs: Input) -> Optional[Output]:\n", "    global predictor\n", "    if not predictor:\n", "        # Lazily load the model on the first request\n", "        predictor = get_model(inputs.get_properties())\n", "\n", "    if inputs.is_empty():\n", "        # Model server makes an empty call to warmup the model on startup\n", "        return None\n", "    data = inputs.get_as_json()\n", "    text = data[\"text\"]\n", "    text_length = data[\"text_length\"]\n", "    result = predictor(text, do_sample=True, min_length=text_length, max_length=text_length)\n", "    return Output().add(result)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c53f3ac7-dd27-4e4b-b5a5-d71cb8bc0733", "metadata": {}, "source": [ "### 3. 
Create the tarball and upload it to S3\n", "Next, we will package our artifacts as a `model.tar.gz` file and upload it to S3 for SageMaker to use during deployment." ] }, { "cell_type": "code", "execution_count": null, "id": "0ae6f030", "metadata": { "tags": [] }, "outputs": [], "source": [ "!rm -f model.tar.gz\n", "!rm -rf code_falcon7b_deepspeed/.ipynb_checkpoints\n", "!tar czvf model.tar.gz -C code_falcon7b_deepspeed .\n", "s3_code_artifact_deepspeed = sess.upload_data(\"model.tar.gz\", bucket, s3_code_prefix_deepspeed)\n", "print(f\"S3 Code or Model tar for deepspeed uploaded to --- > {s3_code_artifact_deepspeed}\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c9cbc98f-2991-4fed-ae7b-5d801b15ac18", "metadata": {}, "source": [ "### 4. Define a serving container, SageMaker Model and SageMaker endpoint\n", "Now that we have uploaded the model artifacts to S3, we can create a SageMaker endpoint.\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "81489ba8-c6cc-4c44-a9d7-eda16f49394b", "metadata": { "tags": [] }, "source": [ "#### Define the serving container\n", "Here we define the container used to serve the model. We will use SageMaker's Large Model Inference (LMI) container with DeepSpeed. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "ae731378-f75d-41bc-ba67-cabdf4ca6cde", "metadata": { "tags": [] }, "outputs": [], "source": [ "# inference_image_uri = f\"{account_id}.dkr.ecr.{region}.amazonaws.com/djl-ds:latest\"\n", "inference_image_uri = (\n", " f\"763104351884.dkr.ecr.{region}.amazonaws.com/djl-inference:0.22.1-deepspeed0.9.2-cu118\"\n", ")\n", "print(f\"Image going to be used is ---- > {inference_image_uri}\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "7648de8f-351d-4f1b-96a7-899b413ff8f4", "metadata": {}, "source": [ "#### Create SageMaker model, endpoint configuration and endpoint.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "327a864e", "metadata": { "tags": [] }, "outputs": [], "source": [ "model_name_ds = name_from_base(f\"falcon7b-model-ds\")\n", "print(model_name_ds)" ] }, { "cell_type": "code", "execution_count": null, "id": "178c74e9", "metadata": { "tags": [] }, "outputs": [], "source": [ "create_model_response = sm_client.create_model(\n", " ModelName=model_name_ds,\n", " ExecutionRoleArn=role,\n", " PrimaryContainer={\"Image\": inference_image_uri, \"ModelDataUrl\": s3_code_artifact_deepspeed},\n", ")\n", "model_arn = create_model_response[\"ModelArn\"]\n", "\n", "print(f\"Created Model: {model_arn}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "f37baaaa", "metadata": { "tags": [] }, "outputs": [], "source": [ "model_name = model_name_ds\n", "print(f\"Building EndpointConfig and Endpoint for: {model_name}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "3a6a0a9c-d2be-4548-a78c-4ad2cbab9087", "metadata": { "tags": [] }, "outputs": [], "source": [ "endpoint_config_name = f\"{model_name}-config\"\n", "endpoint_name = f\"{model_name}-endpoint\"\n", "\n", "endpoint_config_response = sm_client.create_endpoint_config(\n", " EndpointConfigName=endpoint_config_name,\n", " ProductionVariants=[\n", " {\n", " \"VariantName\": \"variant1\",\n", " \"ModelName\": 
model_name,\n", " \"InstanceType\": \"ml.g5.2xlarge\",\n", " \"InitialInstanceCount\": 1,\n", " \"ModelDataDownloadTimeoutInSeconds\": 3600,\n", " \"ContainerStartupHealthCheckTimeoutInSeconds\": 3600,\n", " # \"VolumeSizeInGB\": 512\n", " },\n", " ],\n", ")\n", "endpoint_config_response" ] }, { "cell_type": "code", "execution_count": null, "id": "f34c6563-75bc-4441-b14a-0ea3694ce61f", "metadata": { "tags": [] }, "outputs": [], "source": [ "create_endpoint_response = sm_client.create_endpoint(\n", " EndpointName=f\"{endpoint_name}\", EndpointConfigName=endpoint_config_name\n", ")\n", "print(f\"Created Endpoint: {create_endpoint_response['EndpointArn']}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "11e348ac-b9b4-4998-b9bd-51d0d07f51c6", "metadata": { "tags": [] }, "outputs": [], "source": [ "import time\n", "\n", "resp = sm_client.describe_endpoint(EndpointName=endpoint_name)\n", "status = resp[\"EndpointStatus\"]\n", "print(\"Status: \" + status)\n", "\n", "while status == \"Creating\":\n", " time.sleep(60)\n", " resp = sm_client.describe_endpoint(EndpointName=endpoint_name)\n", " status = resp[\"EndpointStatus\"]\n", " print(\"Status: \" + status)\n", "\n", "print(\"Arn: \" + resp[\"EndpointArn\"])\n", "print(\"Status: \" + status)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "e5e6750d-d680-42f3-b879-005c4c006add", "metadata": {}, "source": [ "### Run Inference" ] }, { "cell_type": "code", "execution_count": null, "id": "401387f9-1f8e-440d-bd4c-11aa6714b975", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%time\n", "\n", "response_model = smr_client.invoke_endpoint(\n", " EndpointName=endpoint_name,\n", " Body=json.dumps({\"text\": \"What is the purpose of life?\", \"text_length\": 150}),\n", " ContentType=\"application/json\",\n", ")\n", "\n", "response_model[\"Body\"].read().decode(\"utf8\")" ] }, { "cell_type": "code", "execution_count": null, "id": "c4484728", "metadata": {}, "outputs": [], "source": [ "start_time = 
time.time()\n", "\n", "while (time.time() - start_time) < 300: # 300 seconds = 5 minutes\n", " response_model = smr_client.invoke_endpoint(\n", " EndpointName=endpoint_name,\n", " Body=json.dumps({\"text\": \"What is Love?\", \"text_length\": 150}),\n", " ContentType=\"application/json\",\n", " )\n", "\n", " print(\"Loop restarting - answer: \" + response_model[\"Body\"].read().decode(\"utf8\"))" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2aeec2f9-51f5-406b-b8cb-615ac67a709c", "metadata": {}, "source": [ "### Clean Up" ] }, { "cell_type": "code", "execution_count": null, "id": "b3744ca9-d6cf-4aa4-8693-4fd43a71e4fe", "metadata": { "tags": [] }, "outputs": [], "source": [ "# - Delete the end point\n", "sm_client.delete_endpoint(EndpointName=endpoint_name)" ] }, { "cell_type": "code", "execution_count": null, "id": "c05cfff2-a56d-4fe4-8915-74e44b68715e", "metadata": { "tags": [] }, "outputs": [], "source": [ "# - In case the end point failed we still want to delete the model\n", "sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)\n", "sm_client.delete_model(ModelName=model_name)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "6552973c-a89b-4a9f-8378-54f2f54b15f4", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This us-east-2 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/inference%7Cgenerativeai%7Cllm-workshop%7Clab10-falcon-40b-and-7b%7Cfalcon-7b-deepspeed.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": 
"ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 
}, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 
16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, 
"hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, 
"hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 } ], "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science 2.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" } }, "nbformat": 4, "nbformat_minor": 5 }