{ "cells": [ { "cell_type": "markdown", "id": "14920c76-c5b3-451b-b59e-ae8f5d4216fd", "metadata": {}, "source": [ "# Optimized Stable Diffusion Deployments" ] }, { "cell_type": "markdown", "id": "c114e0e3", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.\n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "---\n", "\n", "\n", "This notebook will show you how to deploy an [AITemplate](https://github.com/facebookincubator/AITemplate/tree/main/examples/05_stable_diffusion) Optimized version of Stable Diffusion which delivers 2X performance gain versus a standard version without sacrificing the quality of the generated images. \n", "\n", "Additionally, this notebook will demonstrate how to deploy an endpoint with pagination capabilities that would allow the API caller to display intermediate de-noising steps and reducing the initial latency to subsecond range. This enhances the end-user expereince by providing more immediate results and showing a smooth animation of the end to end image generation process. However this comes at an additional compute cost of decoding intermediate latent outputs. " ] }, { "cell_type": "code", "execution_count": null, "id": "738d30d2", "metadata": {}, "outputs": [], "source": [ "%pip install -Uq sagemaker" ] }, { "cell_type": "code", "execution_count": null, "id": "662a2f5e", "metadata": {}, "outputs": [], "source": [ "import sagemaker\n", "from sagemaker.model import Model\n", "from sagemaker import serializers, deserializers\n", "from sagemaker import image_uris\n", "import boto3\n", "import os\n", "import time\n", "import json\n", "from pathlib import Path\n", "\n", "from io import BytesIO\n", "from PIL import Image\n", "import base64\n", "from concurrent.futures import ThreadPoolExecutor\n", "from copy import deepcopy\n", "import time\n", "\n", "import matplotlib.pyplot as plt\n", "from IPython import display\n", "from IPython.display import clear_output\n", "from IPython.core.display import HTML\n", "\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "id": "3a87252c", "metadata": {}, "outputs": [], "source": [ "role = sagemaker.get_execution_role() # execution role for the endpoint\n", "sess = sagemaker.session.Session() # sagemaker session for interacting with different AWS APIs\n", "bucket = sess.default_bucket() # bucket to house artifacts\n", "model_bucket = sess.default_bucket() # bucket to house artifacts\n", "s3_code_prefix = \"stable-diffusion-2/code\" # folder within bucket where code artifact will go\n", "s3_model_prefix = \"stable-diffusion-2/model\" # folder where model checkpoint will go\n", "\n", "region = sess._region_name\n", "account_id = sess.account_id()" ] }, { "cell_type": "code", "execution_count": null, "id": "e3bb2b15", "metadata": {}, "outputs": [], "source": [ "inference_image_uri = (\n", " f\"763104351884.dkr.ecr.{region}.amazonaws.com/djl-inference:0.21.0-deepspeed0.8.3-cu117\"\n", ")\n", "print(f\"Image going to be used is ---- > {inference_image_uri}\")" ] }, { "cell_type": "markdown", "id": "f3348a10", "metadata": {}, "source": [ "We have provided compiled AITemplate weights for the `ml.g5` class of instances. You can compile these on your own using the instructions [here](https://github.com/facebookincubator/AITemplate/tree/main/examples/05_stable_diffusion)." ] }, { "cell_type": "code", "execution_count": null, "id": "ad7b65d9", "metadata": {}, "outputs": [], "source": [ "ait_compiled_weight_uri = (\n", " f\"s3://sagemaker-example-files-prod-{region}/models/aitemplate_compiled/g5hw/\"\n", ")\n", "print(f\"Compiled weights to be used is ---- > {ait_compiled_weight_uri}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "d180475d", "metadata": {}, "outputs": [], "source": [ "def deploy_model(image_uri, model_data, role, endpoint_name, instance_type, env, sagemaker_session):\n", " \"\"\"Helper function to create the SageMaker Endpoint resources and return a predictor\"\"\"\n", "\n", " model = Model(image_uri=image_uri, model_data=model_data, role=role, env=env)\n", "\n", " model.deploy(initial_instance_count=1, instance_type=instance_type, endpoint_name=endpoint_name)\n", "\n", " predictor = sagemaker.Predictor(\n", " endpoint_name=endpoint_name,\n", " sagemaker_session=sagemaker_session,\n", " serializer=serializers.JSONSerializer(),\n", " deserializer=deserializers.JSONDeserializer(),\n", " )\n", "\n", " return predictor" ] }, { "cell_type": "markdown", "id": "2055779a", "metadata": {}, "source": [ "## Deploy Model\n", "In this section we will package the model configuration and inference code and deploy it to a SageMaker Endpoint. The following are the steps to deploy the endpoint:\n", "1. Update the `serving.properties` configuration file with the location of the compiled model artifacts. More information on the supported configurations can be found [here](https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints-large-model-configuration.html)\n", "2. Package the inference code along with the configuration file into a `model.tar.gz`\n", "3. Upload the `model.tar.gz` to an S3 bucket\n", "4. Deploy the model using the `deploy_model` helper function" ] }, { "cell_type": "code", "execution_count": null, "id": "5eb39dbc", "metadata": {}, "outputs": [], "source": [ "!sed -i 's@option.s3url=.*@option.s3url={ait_compiled_weight_uri}@g' model/serving.properties" ] }, { "cell_type": "code", "execution_count": null, "id": "b522b495", "metadata": {}, "outputs": [], "source": [ "!pygmentize model/serving.properties | cat -n" ] }, { "cell_type": "markdown", "id": "3a17b5ad", "metadata": {}, "source": [ "The inference code is contained within the [model.py](model/model.py) file in the `model` source directory. We use an environment variable `PAGINATION` to indicate whether to use the standard pipeline which will only return the final image, or a pagination based pipeline which will return intermediate results of each de-noising step. The code for each pipeline is contained within it's own python module:\n", "- [pipeline_stable_diffusion_ait.py](model/pipeline_stable_diffusion_ait.py) - Code for the standard pipeline\n", "- [pipeline_stable_diffusion_pagination_ait.py](model/pipeline_stable_diffusion_pagination_ait.py) - Code for the paginated pipeline\n", "\n", "The pipelines require AITemplate to be installed in the inference container. As of 4/2023 AITemplate is not available from PyPi and must be installed by building from source code as per the instructions in the [git repo](https://github.com/facebookincubator/AITemplate). For convinience, we've included a pre-compiled python wheel `model/ait/aitemplate-0.3.dev0-py3-none-any.whl` that will be installed when the endpoint is launched" ] }, { "cell_type": "code", "execution_count": null, "id": "7f3cea83", "metadata": {}, "outputs": [], "source": [ "!tar czvf sd_model.tar.gz model/" ] }, { "cell_type": "code", "execution_count": null, "id": "7bbb1778", "metadata": {}, "outputs": [], "source": [ "sd_s3_code_artifact = sess.upload_data(\"sd_model.tar.gz\", bucket, s3_code_prefix)\n", "print(f\"S3 Code or Model tar ball uploaded to --- > {sd_s3_code_artifact}\")" ] }, { "cell_type": "markdown", "id": "0910ee1b", "metadata": {}, "source": [ "The inference code supports both paginated and non paginated responses which is controlled by the `PAGINATION` environment variable\n", "\n", "Here we will deploy the endpoint without pagination by setting the environment variable to `false`" ] }, { "cell_type": "code", "execution_count": null, "id": "508a2e08", "metadata": {}, "outputs": [], "source": [ "sd_endpoint_name = sagemaker.utils.name_from_base(\"stable-diffusion\")\n", "sd_predictor = deploy_model(\n", " image_uri=inference_image_uri,\n", " model_data=sd_s3_code_artifact,\n", " role=role,\n", " env={\"PAGINATION\": \"false\"},\n", " endpoint_name=sd_endpoint_name,\n", " instance_type=\"ml.g5.xlarge\",\n", " sagemaker_session=sess,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "bf96809b", "metadata": {}, "outputs": [], "source": [ "def invoke_endpoint(predictor, payload):\n", " \"helper function to invoke endpoint\"\n", " result = predictor.predict(payload)\n", " return result" ] }, { "cell_type": "code", "execution_count": null, "id": "de863671", "metadata": {}, "outputs": [], "source": [ "def decode_image(img):\n", " \"decodes the base64 encoded image that is returned by the endpoint\"\n", " buff = BytesIO(base64.b64decode(img.encode(\"utf8\")))\n", " image = Image.open(buff)\n", " return image" ] }, { "cell_type": "code", "execution_count": null, "id": "8ead772d", "metadata": {}, "outputs": [], "source": [ "prompt = \"\"\"60s cartoon style photo of a Panda bear wearing underground clothes in far off galaxy warner brothers, trending pixiv fanbox, acrylic palette knife, 8k, vibrant colors, devinart, trending on artstation, low details, smooth \n", "\"\"\"\n", "negative_prompt = \"ugly, tiling, blurred, watermark, grainy, signature, cut off, draft, amateur, multiple, text, poor, low, basic, worst, unprofessional\"\n", "payload = {\n", " \"parameters\": {\n", " \"num_inference_steps\": 50,\n", " \"guidance_scale\": 9,\n", " \"negative_prompt\": negative_prompt,\n", " \"num_images_per_prompt\": 1,\n", " # the next 2 parameters will only be utilized by the pagination enabled endpoint\n", " \"starting_step\": 0,\n", " \"num_interim_images\": 5,\n", " },\n", " \"prompt\": prompt,\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "6be0c54f", "metadata": {}, "outputs": [], "source": [ "t1 = time.perf_counter()\n", "response = invoke_endpoint(sd_predictor, payload)\n", "response_time = time.perf_counter() - t1\n", "[decode_image(img) for img in response[\"images\"]][0]" ] }, { "cell_type": "code", "execution_count": null, "id": "640cb6be", "metadata": {}, "outputs": [], "source": [ "print(f\"Response returned in {response_time:.2f}s\")" ] }, { "cell_type": "code", "execution_count": null, "id": "b531370a", "metadata": {}, "outputs": [], "source": [ "sd_predictor.delete_endpoint()" ] }, { "cell_type": "markdown", "id": "6af1bf07", "metadata": {}, "source": [ "## Enable Pagination\n", "To enable pagination of intermediate results, we set the `PAGINATION` environment variable to `true` and redeploy the endpoint. Rather than just a single image within its response, the paginated endpoint contains 3 values in its response:\n", "1. Batch of intermediate images encoded as base64 encoded JPEGs\n", "2. A [safetensor](https://github.com/huggingface/safetensors) value for the last latent in the generation pipeline encoded as base64\n", "3. The last step number in the generation pipeline\n", "\n", "Items 2 and 3 enable the pagination. By providing a latent tensor and the step number, we can bypass the completed steps and pick up the image generation from the last completed step. Essentially after receiving the initial batch of intermediate images, we invoke the endpoint again this time providing the latent input and the step number. This process repeats until the specified number of denoising steps are completed. This allows for a next batch of images to be fetched concurrently while intermediate frames are displayed to the user." ] }, { "cell_type": "code", "execution_count": null, "id": "00c03f6c", "metadata": {}, "outputs": [], "source": [ "sd_endpoint_name = sagemaker.utils.name_from_base(\"stable-diffusion\")\n", "sd_predictor = deploy_model(\n", " image_uri=inference_image_uri,\n", " model_data=sd_s3_code_artifact,\n", " role=role,\n", " env={\"PAGINATION\": \"true\"},\n", " endpoint_name=sd_endpoint_name,\n", " instance_type=\"ml.g5.xlarge\",\n", " sagemaker_session=sess,\n", ")" ] }, { "cell_type": "markdown", "id": "a132852c", "metadata": {}, "source": [ "The function bellow encapsulates the process for querying the pagination endpoint. It provides a python iterator than we can iterate through to display the intermediate images. A background thread is used to fetch susbsequent batches of images to simulate the concurrentcy aspect " ] }, { "cell_type": "code", "execution_count": null, "id": "2afdec83", "metadata": {}, "outputs": [], "source": [ "def run_paginated_inference(predictor, initial_payload):\n", " \"creates an iterator for intermediate images\"\n", "\n", " payload = deepcopy(initial_payload) # make a deep copy to not mutate the initial payload\n", " num_inference_steps = payload[\"parameters\"][\"num_inference_steps\"]\n", " steps_completed = 0\n", " while steps_completed < num_inference_steps:\n", " if steps_completed == 0:\n", " payload[\"parameters\"][\"starting_step\"] = 0\n", " result = invoke_endpoint(predictor, payload)\n", " else:\n", " while not future.done():\n", " time.sleep(0.1)\n", " result = future.result()\n", "\n", " images = result[\"images\"]\n", " steps_completed = result[\"step\"]\n", "\n", " payload[\"parameters\"][\"starting_step\"] = result[\"step\"]\n", " payload[\"parameters\"][\"latents\"] = result[\"latents\"]\n", "\n", " # use a single background thread to fetch next set images while curernt batch is displayed\n", " with ThreadPoolExecutor(max_workers=1) as e:\n", " future = e.submit(predictor.predict, payload)\n", "\n", " for img in images:\n", " yield img" ] }, { "cell_type": "code", "execution_count": null, "id": "69e1d086", "metadata": {}, "outputs": [], "source": [ "images_it = run_paginated_inference(sd_predictor, payload) # create the image iterator\n", "t1 = time.perf_counter()\n", "for n, img in enumerate(images_it):\n", " if n == 0:\n", " inital_response_time = time.perf_counter() - t1\n", "\n", " html = f\"\"\"
\n", " \n", "
\"\"\"\n", " display.display(HTML(html))\n", " clear_output(wait=True) # comment this line to see individual outputs" ] }, { "cell_type": "code", "execution_count": null, "id": "efa76270", "metadata": {}, "outputs": [], "source": [ "print(f\"first batch delivered in {inital_response_time} seconds\")" ] }, { "cell_type": "markdown", "id": "c7ce5761", "metadata": {}, "source": [ "We can see from above that the first batch was delivered in under one second. This provides a more immediate response to the user at the expense of additional compute cost of having to decode intermediate images. It also doubles the time to generate the final image. " ] }, { "cell_type": "code", "execution_count": null, "id": "e1aa8e4e", "metadata": {}, "outputs": [], "source": [ "sd_predictor.delete_endpoint()" ] }, { "cell_type": "markdown", "id": "ab0d4630", "metadata": {}, "source": [ "## Conclusion\n", "In this notebook we saw how we can deploy an AITemplate optimized Stable Diffusion model which offers a 2X peformance increase without sacrificing quality of the generated image. We also saw how we can provide a beter User Experience by returning intermediate results which provides a faster initial response time and a look into the image generation process. " ] }, { "cell_type": "markdown", "id": "6c3fece4-cf34-4e4a-8679-443ae05d1b95", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/inference|generativeai|llm-workshop|lab2-stable-diffusion|option2-aitemplate|sd_txt2img.ipynb)" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }