{ "cells": [ { "cell_type": "markdown", "id": "c6c03e23", "metadata": {}, "source": [ "\n", "# Serve GPT4ALL-J-6B on SageMaker with transformers-neuronx using the DJL container\n", "\n", "This notebook was run on an ml.m5.xlarge instance.\n", "\n", "In this notebook, we explore how to host a large language model on SageMaker using the Large Model Inference (LMI) container, which packages the DJL model server and transformers-neuronx, an open source library for running transformer inference workflows with the AWS Neuron SDK. The Neuron SDK is required to run deep learning workloads on AWS Inferentia and AWS Trainium based instances. We use DJLServing as the model serving solution in this example.\n", "\n", "\n", "We deploy the open source GPT4ALL-J-6B (https://huggingface.co/nomic-ai/gpt4all-j) model across NeuronCores on an ml.inf2.8xlarge instance. The model is downloaded, converted to Inf2 format, partitioned, and loaded onto the Inferentia chips.\n" ] }, { "cell_type": "markdown", "id": "3d7e159c", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.\n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "0bcfa65e", "metadata": {}, "source": [ "## Licence agreement\n", " - View license information https://huggingface.co/nomic-ai/gpt4all-j before using the model.\n", " - This notebook is a sample notebook and not intended for production use. 
Please refer to the licence at https://github.com/aws/mit-0.\n", "\n", " \n" ] }, { "cell_type": "markdown", "id": "eb63b6c1", "metadata": {}, "source": [ "#### Import the relevant libraries and configure several global variables using boto3" ] }, { "cell_type": "code", "execution_count": null, "id": "a78aa656-a254-4037-8736-2c3ab0a9ef7e", "metadata": { "tags": [] }, "outputs": [], "source": [ "%pip install sagemaker boto3 awscli huggingface_hub --upgrade --quiet" ] }, { "cell_type": "code", "execution_count": null, "id": "456e483a", "metadata": { "tags": [] }, "outputs": [], "source": [ "import boto3\n", "import sagemaker\n", "from sagemaker import Model, serializers, deserializers, image_uris\n", "import jinja2\n", "import os\n", "import time\n", "import json\n", "from pathlib import Path\n", "\n", "role = sagemaker.get_execution_role() # execution role for the endpoint\n", "sess = sagemaker.session.Session() # sagemaker session for interacting with different AWS APIs\n", "region = sess.boto_region_name # region name of the current SageMaker Studio environment\n", "account_id = sess.account_id() # account_id of the current SageMaker Studio environment" ] }, { "cell_type": "markdown", "id": "3b6c6a87", "metadata": {}, "source": [ "## Create a SageMaker-compatible model artifact and upload the model to S3\n", "\n", "SageMaker Large Model Inference containers can be used to host models without providing your own inference code. You can also provide a '*model.py*' file if you would like to customize the model loading and inference process.\n", "SageMaker needs the model artifacts to be in tarball format. 
In this example, we provide only the `serving.properties` file.\n", "\n", "The tarball has the following layout:\n", "\n", "```\n", "code\n", "├──── \n", "│ └── serving.properties\n", "\n", "```\n", "\n", "- `serving.properties` is the configuration file that can be used to configure the model server.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "2ee6789b-dc80-48e9-a228-9dba4e1f8ee3", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Set up the S3 bucket to store model artifacts\n", "bucket = sess.default_bucket() # bucket to house artifacts\n", "model_bucket = sess.default_bucket() # bucket to house the model weights (same default bucket)\n", "s3_model_prefix = \"inf2-llm-djl/gpt4jall\" # folder within bucket where model artifact will go" ] }, { "cell_type": "code", "execution_count": null, "id": "d788caa5-0827-4e13-8e52-6f54d475f8c3", "metadata": { "tags": [] }, "outputs": [], "source": [ "from huggingface_hub import snapshot_download\n", "from pathlib import Path\n", "import os\n", "\n", "# - This will download the model into the current directory, wherever the Jupyter notebook is running\n", "local_model_path = Path(\".\")\n", "local_model_path.mkdir(exist_ok=True)\n", "model_name = \"nomic-ai/gpt4all-j\"\n", "# Only download pytorch checkpoint files\n", "allow_patterns = [\"*.json\", \"*.pt\", \"*.bin\", \"*.txt\", \"*.model\"]\n", "\n", "# - Use snapshot_download to download the model, since the model is stored in the repository using LFS\n", "model_download_path = snapshot_download(\n", "    repo_id=model_name,\n", "    cache_dir=local_model_path,\n", "    allow_patterns=allow_patterns,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "8ef8adfe-c956-455f-a8e0-285390912f61", "metadata": { "tags": [] }, "outputs": [], "source": [ "# define a variable to contain the s3url of the location that has the model\n", "pretrained_model_location = f\"s3://{model_bucket}/{s3_model_prefix}/\"\n", "print(f\"Pretrained model will be uploaded to ---- > 
{pretrained_model_location}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "e65aae71-58f9-480d-b861-eaef2aee9ee8", "metadata": { "tags": [] }, "outputs": [], "source": [ "model_artifact = sess.upload_data(path=model_download_path, key_prefix=s3_model_prefix)\n", "print(f\"Model uploaded to --- > {model_artifact}\")\n", "print(f\"We will set option.s3url={model_artifact}\")" ] }, { "cell_type": "markdown", "id": "f715510e-5093-4bd8-9198-db172ce14248", "metadata": {}, "source": [ "## Start preparing model artifacts\n", "The LMI container expects some artifacts to help set up the model:\n", "\n", "- `serving.properties` (required): Defines the model server settings\n", "- `model.py` (optional): A Python file that defines the core inference logic\n", "- `requirements.txt` (optional): Any additional pip packages that need to be installed" ] }, { "cell_type": "markdown", "id": "5a507efc", "metadata": { "tags": [] }, "source": [ "#### Create serving.properties \n", "This configuration file tells DJL Serving which model and settings you want to use.\n", "\n", "Here is a list of the settings that we use in this configuration file:\n", "- `engine`: The engine for DJL to use. In this case, we intend to use transformers-neuronx and hence set it to **Python**. \n", "- `option.entryPoint`: The entry point Python file or module. Since we are using the default handler here, we configure it to use transformers-neuronx.\n", "- `option.model_id`: The model id of a pretrained model hosted inside a model repository on huggingface.co (https://huggingface.co/models). The container uses this model id to download the corresponding model repository on huggingface.co. \n", "- `option.n_positions`: The number of tokens the model can accept.\n", "\n", "\n", "If you want to download the model from S3, you can instead set the parameter `option.s3url` to the Amazon S3 location that contains the model. 
When this is set, the container leverages [s5cmd](https://github.com/peak/s5cmd) to download the model from S3. This is extremely fast and useful when downloading large models.\n", "\n", "\n", "- `option.tensor_parallel_degree`: This property sets the number of devices over which the tensor parallel modules are distributed. For example, if you have 12 NeuronCores and set a tensor parallel degree of 4, LMI loads 3 copies of the model, each of which takes 4 cores.\n", "\n", "For more details on the configuration options and an exhaustive list, you can refer to the documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints-large-model-configuration.html.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "e18abd82", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%writefile serving.properties\n", "# Start writing content here\n", "engine = Python\n", "option.entryPoint = djl_python.transformers-neuronx\n", "# option.model_id=nomic-ai/gpt4all-j\n", "option.s3url = {{s3url}}\n", "option.tensor_parallel_degree = 2\n", "option.model_loading_timeout = 2400\n", "option.n_positions = 512" ] }, { "cell_type": "markdown", "id": "f3a1437b", "metadata": {}, "source": [ "Create the model tarball and upload it to an S3 bucket " ] }, { "cell_type": "code", "execution_count": null, "id": "786a02ed", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%sh\n", "mkdir mymodel\n", "mv serving.properties mymodel/\n" ] }, { "cell_type": "code", "execution_count": null, "id": "fc440528-8b0e-42e7-97a5-b54e4d50143e", "metadata": { "tags": [] }, "outputs": [], "source": [ "# we plug in the appropriate model location into our `serving.properties` file based on the region in which this notebook is running\n", "jinja_env = jinja2.Environment()\n", "template = jinja_env.from_string(Path(\"mymodel/serving.properties\").open().read())\n", 
"Path(\"mymodel/serving.properties\").open(\"w\").write(template.render(s3url=pretrained_model_location))\n", "!pygmentize mymodel/serving.properties | cat -n" ] }, { "cell_type": "code", "execution_count": null, "id": "ee55d87d-557d-4bd4-9bcb-c53b94c458dd", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%sh\n", "tar czvf mymodel.tar.gz mymodel/\n", "rm -rf mymodel" ] }, { "cell_type": "code", "execution_count": null, "id": "03d9203a", "metadata": { "tags": [] }, "outputs": [], "source": [ "s3_code_prefix = \"large-model-lmi/code\"\n", "bucket = sess.default_bucket() # bucket to house artifacts\n", "code_artifact = sess.upload_data(\"mymodel.tar.gz\", bucket, s3_code_prefix)\n", "print(f\"S3 Code or Model tar ball uploaded to --- > {code_artifact}\")\n", "\n", "\n", "env = {\"HUGGINGFACE_HUB_CACHE\": \"/tmp\", \"TRANSFORMERS_CACHE\": \"/tmp\"}\n", "##Retrieve djl container image\n", "image_uri = image_uris.retrieve(\n", " framework=\"djl-deepspeed\", region=sess.boto_session.region_name, version=\"0.21.0\"\n", ")\n", "image_uri = image_uri.split(\":\")[0] + \":\" + \"0.22.1-neuronx-sdk2.9.0\"\n", "\n", "model = Model(image_uri=image_uri, model_data=code_artifact, env=env, role=role)" ] }, { "cell_type": "markdown", "id": "ed435a7c", "metadata": {}, "source": [ "#### Create SageMaker endpoint\n", "1. Create the Model using the Image container and the Model Tarball uploaded earlier\n", "2. Create the endpoint config using the following key parameters\n", "\n", " a) Instance Type is ml.inf2.8xlarge\n", " \n", " b) ContainerStartupHealthCheckTimeoutInSeconds is 3600 to ensure health check starts after the model is ready \n", "3. The container downloads the model into the /tmp space on the container because SageMaker maps the /tmp to the Amazon Elastic Block Store (Amazon EBS). We need to add a volume_size parameter to ensure /tmp directory has enough space to download and compile the model.\n", "4. 
Create the endpoint using the endpoint config created.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "a25a6aa6", "metadata": { "tags": [] }, "outputs": [], "source": [ "instance_type = \"ml.inf2.8xlarge\"\n", "endpoint_name = sagemaker.utils.name_from_base(\"lmi-model\")\n", "\n", "\n", "model.deploy(\n", "    initial_instance_count=1,\n", "    instance_type=instance_type,\n", "    endpoint_name=endpoint_name,\n", "    container_startup_health_check_timeout=3600,\n", "    volume_size=256,\n", ")\n", "\n", "# our requests and responses will be in json format so we specify the serializer and the deserializer\n", "predictor = sagemaker.Predictor(\n", "    endpoint_name=endpoint_name,\n", "    sagemaker_session=sess,\n", "    serializer=serializers.JSONSerializer(),\n", "    deserializer=deserializers.JSONDeserializer(),\n", ")" ] }, { "cell_type": "markdown", "id": "a5516aca", "metadata": {}, "source": [ "## Test the endpoint \n", "This is a generative model, so we pass in text as a prompt. " ] }, { "cell_type": "code", "execution_count": null, "id": "36729749", "metadata": { "tags": [] }, "outputs": [], "source": [ "predictor.predict({\"inputs\": \"What is deep learning\", \"parameters\": {\"max_length\": 500}})" ] }, { "cell_type": "markdown", "id": "8c5a7651", "metadata": {}, "source": [ "## Conclusion\n", "In this notebook, we demonstrated how to use SageMaker large model inference containers to host GPT4ALL-J-6B. \n", "\n" ] }, { "cell_type": "markdown", "id": "03a86317", "metadata": {}, "source": [ "## Clean Up" ] }, { "cell_type": "code", "execution_count": null, "id": "2defcfef", "metadata": { "tags": [] }, "outputs": [], "source": [ "# - Delete the endpoint, its endpoint config, and the model\n", "sess.delete_endpoint(endpoint_name)\n", "sess.delete_endpoint_config(endpoint_name)\n", "model.delete_model()" ] }, { "cell_type": "markdown", "id": "cd80abd9", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. 
The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/inference|generativeai|llm-workshop|lab8-Inferentia2-gpt4all-j|inferentia2-llm-GPT4allJ.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, 
"_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General 
purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, 
"_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", 
"vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", 
"vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", 
"mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "vscode": { "interpreter": { "hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1" } } }, "nbformat": 4, "nbformat_minor": 5 }