{ "cells": [ { "cell_type": "markdown", "id": "2054403f", "metadata": {}, "source": [ "# Serve GPT-J on SageMaker with DJLServing using PySDK\n", "\n", "In this notebook, we explore how to host a fine-tuned GPT-J parameter model on SageMaker using [Deep Java Library (DJL) on Amazon SageMaker](https://sagemaker.readthedocs.io/en/stable/frameworks/djl/index.html).\n", "\n", "Language models have recently exploded in both size and popularity. In 2018, BERT-large entered the scene and, with its 340M parameters and novel transformer architecture, set the standard on NLP task accuracy. Within just a few years, state-of-the-art NLP model size has grown by more than 500x with models such as OpenAI’s 175 billion parameter GPT-3 and similarly sized open source Bloom 176B raising the bar on NLP accuracy. This increase in the number of parameters is driven by the simple and empirically-demonstrated positive relationship between model size and accuracy: more is better. With easy access from models zoos such as Hugging Face and improved accuracy in NLP tasks such as classification and text generation, practitioners are increasingly reaching for these large models. However, deploying them can be a challenge because of their size.\n", "\n", "Model parallelism can help deploy large models that would normally be too large for a single GPU. With model parallelism, we partition and distribute a model across multiple GPUs. Each GPU holds a different part of the model, resolving the memory capacity issue for the largest deep learning models with billions of parameters. This notebook uses tensor parallelism techniques which allow GPUs to work simultaneously on the same layer of a model and achieve low latency inference relative to a pipeline parallel solution.\n", "\n", "SageMaker has rolled out DeepSpeed container which now provides users with the ability to leverage the managed serving capabilities and help to provide the un-differentiated heavy lifting.\n", "\n", "In this notebook, we deploy the fine tuned GPT-J model. DeepSpeed is used for tensor parallelism inference while DJLServing handles inference requests and the distributed workers. For further reading on DeepSpeed you can refer to https://arxiv.org/pdf/2207.00032.pdf \n" ] }, { "cell_type": "code", "execution_count": null, "id": "c780d5fc", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Instal boto3 library to create model and run inference workloads\n", "%pip install -Uqq boto3 awscli" ] }, { "cell_type": "markdown", "id": "d0d4c85a", "metadata": {}, "source": [ "## Section to Download Model from S3\n", "\n", "In this section we download the model archive from S3. We will decompress the file and inspect the artifacts. " ] }, { "cell_type": "code", "execution_count": null, "id": "4f5a06ca", "metadata": { "tags": [] }, "outputs": [], "source": [ "import sagemaker\n", "import boto3\n", "\n", "bucket = sagemaker.session.Session().default_bucket()\n", "print(bucket)\n", "\n", "from sagemaker import get_execution_role\n", "\n", "role = sagemaker.get_execution_role() # execution role for the endpoint\n", "session = (\n", " sagemaker.session.Session()\n", ") # sagemaker session for interacting with different AWS APIs\n", "region = session._region_name\n", "\n", "sm_client = boto3.client(\"sagemaker\")" ] }, { "cell_type": "markdown", "id": "f37ba706-b12d-4427-928e-dadfdd02a811", "metadata": {}, "source": [ "Next cell controls which local path to use for fetching the model" ] }, { "cell_type": "code", "execution_count": null, "id": "4fddb70e-ac30-4d03-bfad-c73155bb58b3", "metadata": { "tags": [] }, "outputs": [], "source": [ "local_model_path = \"./model\"" ] }, { "cell_type": "markdown", "id": "b625f99d-2591-4617-94ef-428324a7b4ac", "metadata": {}, "source": [ "
\n", "Model Location
\n", "Set the model_s3_uri variable to the S3 location of the fine-tuned `model.tar.gz` file. \n", "It should be of the form `s3://{bucket}/smp-tensorparallel-outputdir/{training-job-name}/output/model.tar.gz`" ] }, { "cell_type": "code", "execution_count": null, "id": "04d9f02b-9ea8-4d81-899b-7f16206f5b59", "metadata": { "tags": [] }, "outputs": [], "source": [ "# model_s3_uri = #\"\"" ] }, { "cell_type": "markdown", "id": "9493490c-a006-4fd9-8a4a-f1e7cdb87020", "metadata": {}, "source": [ "The following cell downloads the model from S3 and places it in the local model path we defined above." ] }, { "cell_type": "code", "execution_count": null, "id": "2857301c-96cd-45c2-b5cc-8b32cf53045a", "metadata": { "tags": [] }, "outputs": [], "source": [ "! rm -rf $local_model_path\n", "! mkdir -p $local_model_path || echo \"Dir exists\"\n", "! aws s3 cp $model_s3_uri $local_model_path" ] }, { "cell_type": "markdown", "id": "82ce889a-8b40-4762-9ea5-671d73a506d5", "metadata": {}, "source": [ "The following commands, extract the binary from the archive, clean up our local directory and rename the model to `gptj.pt`" ] }, { "cell_type": "code", "execution_count": null, "id": "9495df7b-e20d-4e4f-9253-88c3abd22c2f", "metadata": { "tags": [] }, "outputs": [], "source": [ "! tar -xvf $local_model_path/model.tar.gz fullmodel.pt" ] }, { "cell_type": "code", "execution_count": null, "id": "8d48d50f-d790-4448-ad7f-ec609363ff2c", "metadata": { "tags": [] }, "outputs": [], "source": [ "! rm $local_model_path/model.tar.gz\n", "! mv fullmodel.pt $local_model_path/gptj.pt" ] }, { "cell_type": "markdown", "id": "c467b82a", "metadata": {}, "source": [ "## Upload to model artifacts to S3 \n", "In this step, we will download the tokenizer and model from Hugging Face hub, load the model weights we extracted in the previous step and upload the uncompressed artifacts to a S3 bucket." ] }, { "cell_type": "code", "execution_count": null, "id": "8015b99b-99cc-4448-ae49-d7c65ee54210", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%bash\n", "pip install -U pip --quiet\n", "pip install -U sagemaker --quiet\n", "pip install -U boto3 --quiet\n", "\n", "pip install -U transformers --quiet\n", "pip install smart_open" ] }, { "cell_type": "code", "execution_count": null, "id": "f8e3cdf0-b291-4c4b-b4a3-3722e28574ae", "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.s3 import S3Uploader\n", "from transformers import AutoModel, AutoTokenizer, AutoModelForCausalLM\n", "from smart_open import open as smart_open\n", "from transformers import GPTJModel, GPTJConfig\n", "import io\n", "import torch" ] }, { "cell_type": "code", "execution_count": null, "id": "34b7ce14-e9e6-41a1-8a34-f5a1f81c0bd8", "metadata": { "tags": [] }, "outputs": [], "source": [ "model_config = \"gpt-j-xl\"\n", "\n", "if model_config == \"gpt-j-6B\":\n", " model_id = \"EleutherAI/gpt-j-6B\"\n", " model = AutoModelForCausalLM.from_pretrained(\n", " model_id,\n", " revision=\"float16\",\n", " torch_dtype=torch.float16,\n", " )\n", "elif model_config == \"gpt-j-xl\":\n", " model_id = \"EleutherAI/gpt-j-6B\"\n", " model_config = GPTJConfig(\n", " vocab_size=50400,\n", " n_positions=1024,\n", " n_embd=1600,\n", " n_layer=48,\n", " n_head=25,\n", " n_inner=None,\n", " activation_function=\"gelu_new\",\n", " resid_pdrop=0.1,\n", " embd_pdrop=0.1,\n", " attn_pdrop=0.1,\n", " layer_norm_epsilon=1e-05,\n", " initializer_range=0.02,\n", " summary_type=\"cls_index\",\n", " summary_use_proj=True,\n", " summary_activation=None,\n", " summary_proj_to_labels=True,\n", " summary_first_dropout=0.1,\n", " # gradient_checkpointing=args.gradient_checkpointing > 0,\n", " use_cache=False,\n", " bos_token_id=50256,\n", " eos_token_id=50256,\n", " return_dict=True,\n", " )\n", " model = AutoModelForCausalLM.from_config(model_config, torch_dtype=torch.float16)\n", "\n", "\n", "load_path = f\"{local_model_path}/gptj.pt\"\n", "with smart_open(load_path, \"rb\") as f:\n", " buffer = io.BytesIO(f.read())\n", " model.load_state_dict(torch.load(buffer))\n", "\n", "tokenizer = AutoTokenizer.from_pretrained(model_id)\n", "\n", "model.save_pretrained(\"tmp/gptj-finetuned\")\n", "\n", "tokenizer.save_pretrained(\"tmp/gptj-finetuned\")\n", "\n", "s3_location = f\"s3://{bucket}/djl-serving/gptj-finetuned\"\n", "S3Uploader.upload(\"tmp/gptj-finetuned\", s3_location)\n", "\n", "print(f\"Pretrained model will be downloaded from ---- > {s3_location}\")" ] }, { "cell_type": "markdown", "id": "7c3fdb49-7342-4c35-90ce-458f4608ccef", "metadata": {}, "source": [ "## Deploy the model to SageMaker" ] }, { "cell_type": "code", "execution_count": null, "id": "4e3b8cd5-b4a2-4a5a-ab33-722ded6d175e", "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.djl_inference import DJLModel\n", "\n", "model = DJLModel(\n", " s3_location,\n", " role,\n", " task=\"text-generation\",\n", " number_of_partitions=1,\n", " data_type=\"fp16\",\n", ")\n", "\n", "predictor = model.deploy(initial_instance_count=1, instance_type=\"ml.g4dn.4xlarge\")" ] }, { "cell_type": "markdown", "id": "dcde45f2-3bb0-4d3b-8152-acae3f2ba3a8", "metadata": {}, "source": [ "## Run inference using your endpoint" ] }, { "cell_type": "code", "execution_count": null, "id": "b3c201db-c57e-43db-a9a5-fb0e72371255", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%time\n", "data = {\n", " \"inputs\": [\n", " \"The ability to spread butter on toast is\",\n", " \"Video games are truly the\",\n", " ],\n", "}\n", "outputs = predictor.predict(data)\n", "for output in outputs:\n", " print(output)" ] }, { "cell_type": "markdown", "id": "4d97af6b", "metadata": {}, "source": [ "## Conclusion\n", "In this notebook, we demonstrated how to use SageMaker large model inference containers to host a finetuned GTP-J. We used Deep Java Library (DJL) on Amazon SageMaker. For more details about Amazon SageMaker and its large model inference capabilities, refer to the following:\n", "\n", "* Amazon SageMaker now supports deploying large models through configurable volume size and timeout quotas (https://aws.amazon.com/about-aws/whats-new/2022/09/amazon-sagemaker-deploying-large-models-volume-size-timeout-quotas/)\n", "* Real-time inference – Amazon SageMake (https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html)\n", "\n" ] }, { "cell_type": "markdown", "id": "ce7bf02c", "metadata": {}, "source": [ "## Clean Up" ] }, { "cell_type": "code", "execution_count": null, "id": "5c65d444", "metadata": {}, "outputs": [], "source": [ "# - Delete the end point\n", "sm_client.delete_endpoint(EndpointName=predictor.endpoint_name)" ] }, { "cell_type": "code", "execution_count": null, "id": "bca9e21c", "metadata": {}, "outputs": [], "source": [ "# - In case the end point failed we still want to delete the model\n", "# sm_client.delete_endpoint_config(EndpointConfigName=predictor.endpoint_name)\n", "sm_client.delete_model(ModelName=predictor.endpoint_name)" ] }, { "cell_type": "code", "execution_count": null, "id": "f4660925-51b2-4f57-b9c7-c9f547f4dcf6", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 } ], "instance_type": "ml.m5.8xlarge", "kernelspec": { "display_name": "Python 3 (PyTorch 1.12 Python 3.8 CPU Optimized)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/pytorch-1.12-cpu-py38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.16" }, "vscode": { "interpreter": { "hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1" } } }, "nbformat": 4, "nbformat_minor": 5 }