{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "edeebade-eaaf-4935-8203-cc59bd2d9d12",
   "metadata": {},
   "outputs": [],
   "source": [
    "### Kernel and SageMaker Setup\n",
    "Please use the ml.t3.medium instance for this notebook. The Kernel is 'Data Science - Python3'."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f6b8b81-aee6-4e38-9473-481ee76692e8",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Text Summarization of Consumer Health Questions\n",
    "## Part 2 Fine tuning Flan-t5 via SageMaker SDK\n",
    "In the previous notebook we fine-tuned the MeQSum dataset on a local notebook instance. In this notebook we will learn how to use the SageMaker SDK to spin up training instances for fine-tuning the Flan-T5-base model on a medical summary task. \n",
    "### MeQSum Dataset\n",
    "\"On the Summarization of Cealth Questions\". Asma Ben Abacha and Dina Demner-Fushman. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019.  \n",
    "#### Citation Information\n",
    "@Inproceedings{MeQSum,\n",
    "author = {Asma {Ben Abacha} and Dina Demner-Fushman},\n",
    "title = {On the Summarization of Consumer Health Questions},\n",
    "booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28th - August 2},\n",
    "year = {2019},\n",
    "abstract = {Question understanding is one of the main challenges in question answering. In real world applications, users often submit natural language questions that are longer than needed and include peripheral information that increases the complexity of the question, leading to substantially more false positives in answer retrieval. In this paper, we study neural abstractive models for medical question summarization. We introduce the MeQSum corpus of 1,000 summarized consumer health questions. We explore data augmentation methods and evaluate state-of-the-art neural abstractive models on this new task. In particular, we show that semantic augmentation from question datasets improves the overall performance, and that pointer-generator networks outperform sequence-to-sequence attentional models on this task, with a ROUGE-1 score of 44.16%. We also present a detailed error analysis and discuss directions for improvement that are specific to question summarization. }}\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85189347-5009-4b11-b662-dddb23c5ec8b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!pip -q install transformers==4.28.0 datasets==2.12.0 sagemaker==2.156.0 --upgrade"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84cd8dcd-73c3-44ca-8d4b-a979cbaa2446",
   "metadata": {},
   "source": [
    "## 1. Data Preparation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8901bbb6-66bf-497c-a3c1-2ebe6db9f0ea",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "# dataset from https://github.com/abachaa/MeQSum\n",
    "\n",
    "df = pd.read_excel('MeQSum_ACL2019_BenAbacha_Demner-Fushman.xlsx')\n",
    "df = df.drop('File', axis=1)\n",
    "df = df.rename(columns={'CHQ':'Text'})\n",
    "df = df.dropna()\n",
    "df['Text']= df['Text'].apply(lambda x: x.lower())\n",
    "df['Summary'] = df['Summary'].apply(lambda x: x.lower())\n",
    "df['Id'] = range(0, len(df.index))\n",
    "df = df[['Id', 'Text', 'Summary']]\n",
    "# df = df.sample(frac=1).reset_index(drop=True) # to shuffle\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39dc384f-72cc-4c8f-a494-f66a340bd651",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Import libraries required for modelling & create a SageMaker Session\n",
    "\n",
    "import datasets\n",
    "from datasets import Dataset\n",
    "from datasets import load_metric\n",
    "from datasets import concatenate_datasets\n",
    "from datasets.filesystems import S3FileSystem\n",
    "\n",
    "import transformers\n",
    "from transformers import AutoTokenizer\n",
    "\n",
    "import sagemaker\n",
    "from sagemaker.huggingface import HuggingFace\n",
    "\n",
    "sess = sagemaker.Session()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2aea70ac-4fab-4368-9d8c-f530db178ef1",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "model_checkpoint = 'google/flan-t5-base' # 'google/flan-t5-small' for quick training.\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3da11596-f097-4fc2-83a6-65779e01a4f3",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "#  Similar to any ML model, we split the data into a train and test set\n",
    "\n",
    "train = df[:700]\n",
    "val = df[700:900]\n",
    "test = df[900:]\n",
    "print('train: {}, val: {}, test: {}'.format(train.shape, val.shape, test.shape))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84c09d05-7f3c-494a-a362-1e4a92e9eda5",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Metadata and dataset objects\n",
    "\n",
    "train_dataset = Dataset.from_pandas(train)\n",
    "val_dataset = Dataset.from_pandas(val)\n",
    "test_dataset = Dataset.from_pandas(test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f6278893-356b-471a-a571-b487433e1462",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Deterimine the max input length and max target length based on the number of rows in the dataset\n",
    "\n",
    "tokenized_inputs = concatenate_datasets([train_dataset, val_dataset, test_dataset]).map(lambda x: tokenizer(x[\"Text\"], truncation=True), batched=True, remove_columns=[\"Text\", \"Summary\"])\n",
    "max_input_length = max([len(x) for x in tokenized_inputs[\"input_ids\"]])\n",
    "print(f\"Max input length: {max_input_length}\")\n",
    "\n",
    "tokenized_targets = concatenate_datasets([train_dataset, val_dataset, test_dataset]).map(lambda x: tokenizer(x[\"Summary\"], truncation=True), batched=True, remove_columns=[\"Text\", \"Summary\"])\n",
    "max_target_length = max([len(x) for x in tokenized_targets[\"input_ids\"]])\n",
    "print(f\"Max target length: {max_target_length}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff31ceb7-4809-494d-9da4-4c24afc9f74a",
   "metadata": {},
   "source": [
    "#### Create a function to tokenise inputs to the model & ensure vectors are the same length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "526692e1-9e6e-4ee0-9079-a52139035b40",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def preprocess_function(sample,padding=\"max_length\"):\n",
    "    inputs = [\"summarize: \" + item for item in sample[\"Text\"]]\n",
    "    model_inputs = tokenizer(inputs, max_length=max_input_length, padding=padding, truncation=True)\n",
    "\n",
    "    labels = tokenizer(text_target=sample[\"Summary\"], max_length=max_target_length, padding=padding, truncation=True)\n",
    "\n",
    "    if padding == \"max_length\":\n",
    "        labels[\"input_ids\"] = [\n",
    "            [(l if l != tokenizer.pad_token_id else -100) for l in label] for label in labels[\"input_ids\"]\n",
    "        ]\n",
    "\n",
    "    model_inputs[\"labels\"] = labels[\"input_ids\"]\n",
    "    return model_inputs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "54658c51-eb8d-4bc4-84f0-7cdd3fee58b8",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Apply the tokenisation function\n",
    "\n",
    "tokenized_train = train_dataset.map(preprocess_function, batched=True)\n",
    "tokenized_val = val_dataset.map(preprocess_function, batched=True)\n",
    "\n",
    "print(f\"Keys of tokenized dataset: {tokenized_train.features}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1893351d-2a65-46a6-afbf-a4aef4af5461",
   "metadata": {},
   "source": [
    "#### We will upload the data to S3, which integrates natively with Amazon SageMaker"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a862d8d9-9935-443b-bf86-4bff7be8ce51",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Uploading dataset to S3\n",
    "s3 = S3FileSystem()\n",
    "\n",
    "bucket = sess.default_bucket()\n",
    "s3_prefix = \"huggingface/meqsum-flan-t5-summarization\"\n",
    "\n",
    "base_job_name=\"huggingface-meqsum-flan-t5-summarization\"\n",
    "checkpoint_in_bucket=\"checkpoints\"\n",
    "\n",
    "# The S3 URI to store the checkpoints\n",
    "checkpoint_s3_bucket=\"s3://{}/{}/{}\".format(bucket, base_job_name, checkpoint_in_bucket)\n",
    "\n",
    "# The local path where the model will save its checkpoints in the training container\n",
    "checkpoint_local_path=\"/opt/ml/checkpoints\"\n",
    "\n",
    "dataset_input_path = \"s3://{}/{}\".format(bucket, s3_prefix)\n",
    "train_input_path = \"{}/train\".format(dataset_input_path)\n",
    "valid_input_path = \"{}/validation\".format(dataset_input_path)\n",
    "\n",
    "print(dataset_input_path)\n",
    "print(train_input_path)\n",
    "print(valid_input_path)\n",
    "print(checkpoint_s3_bucket)\n",
    "\n",
    "tokenized_train.save_to_disk(train_input_path, fs=s3)\n",
    "tokenized_val.save_to_disk(valid_input_path, fs=s3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "855d2e86-c42b-4526-a7af-58462ca60f14",
   "metadata": {
    "tags": []
   },
   "source": [
    "## 2. Training the model using SageMaker Training"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "76d9bac4-847d-4372-8f8c-a9409af1e80a",
   "metadata": {},
   "source": [
    "#### Define the objects & parameters for model training\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a4de95f0-0c1e-4b99-a98c-06154b1f7947",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Define hyperparameters\n",
    "\n",
    "hyperparameters = {\n",
    "    \"epochs\": 10,\n",
    "    \"learning-rate\": 2e-5,\n",
    "    \"train-batch-size\": 4,\n",
    "    \"eval-batch-size\": 4,\n",
    "    \"model-name\": model_checkpoint,\n",
    "    'output_dir': checkpoint_local_path\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc91da16-6549-4a87-b9f9-554816c85365",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Define the loss metrics that will be used to measure model performance\n",
    "\n",
    "metric_definitions=[\n",
    "    {'Name': 'loss', 'Regex': \"'loss': ([0-9]+(.|e\\-)[0-9]+),?\"},\n",
    "    {'Name': 'learning_rate', 'Regex': \"'learning_rate': ([0-9]+(.|e\\-)[0-9]+),?\"},\n",
    "    {'Name': 'eval_loss', 'Regex': \"'eval_loss': ([0-9]+(.|e\\-)[0-9]+),?\"},\n",
    "    {'Name': 'epoch', 'Regex': \"'epoch': ([0-9]+(.|e\\-)[0-9]+),?\"}\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f8764c34-521e-4540-bc8c-68989b4a6c02",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Here we call the Hugging Face container and add parameters to it such as dataset locations, and the training instance size\n",
    "\n",
    "huggingface_estimator = HuggingFace(\n",
    "    role=sagemaker.get_execution_role(),\n",
    "    entry_point=\"train.py\",\n",
    "    dependencies=[\"requirements.txt\"],\n",
    "    hyperparameters=hyperparameters,\n",
    "    base_job_name=base_job_name,\n",
    "    checkpoint_s3_uri=checkpoint_s3_bucket,\n",
    "    checkpoint_local_path=checkpoint_local_path,\n",
    "    transformers_version=\"4.26.0\",\n",
    "    pytorch_version=\"1.13.1\",\n",
    "    py_version=\"py39\",\n",
    "    instance_type=\"ml.p3.2xlarge\",\n",
    "    instance_count=1,\n",
    "    metric_definitions=metric_definitions\n",
    "    # distribution={\"smdistributed\": {\"dataparallel\": {\"enabled\": True}}}, # For distributed training.\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3c54380b-22d6-4943-b813-e837d81001d4",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "huggingface_estimator.fit({\"train\": train_input_path, \"valid\": valid_input_path})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "687e4b1c-bda3-49a5-bfd1-edcf3bd12d68",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "huggingface_estimator.model_data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81a66587-cb13-4e18-b694-f10abf459ff5",
   "metadata": {
    "tags": []
   },
   "source": [
    "## 3. Perform inferencing on the test dataset with a SageMaker endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b9ecc1f-5ab6-4e24-abb4-f9a64439c8c9",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Deploy the model endpoint\n",
    "\n",
    "huggingface_predictor = huggingface_estimator.deploy(\n",
    "    initial_instance_count=1, instance_type=\"ml.p3.2xlarge\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de9ce3b2-8d90-4d09-9eda-2fea2dfe73a9",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Obtain the model predictions\n",
    "\n",
    "predictions = []\n",
    "for test_data in test_dataset: \n",
    "    prediction = huggingface_predictor.predict({\"inputs\": f\"summarize: {test_data['Text']}\"})\n",
    "    predictions.append(prediction[0]['generated_text'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "015d733a-fa7b-4dac-b3fa-f21c99272f16",
   "metadata": {},
   "source": [
    "#### The `Predicted Summary` column is the model's output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fb3c0df9-80b5-4a5d-a0b5-0c7b3c044dfa",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Model summarisation performance can be inspected by evaluation metrics and spot checks\n",
    "\n",
    "test['Predicted Summary'] = predictions\n",
    "pd.set_option('display.max_colwidth', 1024)\n",
    "test"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88575a2f-cb63-42a8-a6c1-50567e420ee8",
   "metadata": {},
   "source": [
    "## 4. Clean up of resources so they aren't left runnning"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "51dbe7aa-8fc2-47c7-91f9-c1b673ea370e",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Remember to delete your endpoint after use as you will be charged for the instances it uses\n",
    "\n",
    "huggingface_predictor.delete_model()\n",
    "huggingface_predictor.delete_endpoint()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b272691-aca8-4219-bcc9-8f9d0b6cdc94",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "availableInstances": [
   {
    "_defaultOrder": 0,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.t3.medium",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 1,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.t3.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 2,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.t3.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 3,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.t3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 4,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 5,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 6,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 7,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 8,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 9,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 10,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 11,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 12,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5d.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 13,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5d.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 14,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5d.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 15,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5d.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 16,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5d.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 17,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5d.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 18,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5d.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 19,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 20,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": true,
    "memoryGiB": 0,
    "name": "ml.geospatial.interactive",
    "supportedImageNames": [
     "sagemaker-geospatial-v1-0"
    ],
    "vcpuNum": 0
   },
   {
    "_defaultOrder": 21,
    "_isFastLaunch": true,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.c5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 22,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.c5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 23,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.c5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 24,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.c5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 25,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 72,
    "name": "ml.c5.9xlarge",
    "vcpuNum": 36
   },
   {
    "_defaultOrder": 26,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 96,
    "name": "ml.c5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 27,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 144,
    "name": "ml.c5.18xlarge",
    "vcpuNum": 72
   },
   {
    "_defaultOrder": 28,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.c5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 29,
    "_isFastLaunch": true,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g4dn.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 30,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g4dn.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 31,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g4dn.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 32,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g4dn.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 33,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g4dn.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 34,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g4dn.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 35,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 61,
    "name": "ml.p3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 36,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 244,
    "name": "ml.p3.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 37,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 488,
    "name": "ml.p3.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 38,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.p3dn.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 39,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.r5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 40,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.r5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 41,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.r5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 42,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.r5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 43,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.r5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 44,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.r5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 45,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 512,
    "name": "ml.r5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 46,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.r5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 47,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 48,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 49,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 50,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 51,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 52,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 53,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.g5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 54,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.g5.48xlarge",
    "vcpuNum": 192
   },
   {
    "_defaultOrder": 55,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 56,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4de.24xlarge",
    "vcpuNum": 96
   }
  ],
  "kernelspec": {
   "display_name": "Python 3 (Data Science)",
   "language": "python",
   "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/datascience-1.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}