{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# End-to-End NLP: News Headline Classifier (Local Version)\n", "\n", "_**Train a PyTorch-based model to classify news headlines between four domains**_\n", "\n", "This notebook works well with the `Python 3 (PyTorch 1.13 Python 3.9 CPU Optimized)` kernel on SageMaker Studio, or `conda_pytorch_p38` on classic SageMaker Notebook Instances.\n", "\n", "---\n", "\n", "In this version, the model is trained and evaluated here on the notebook instance itself. We'll show in the follow-on notebook how to take advantage of Amazon SageMaker to separate these infrastructure needs.\n", "\n", "Note that you can safely ignore the WARNING about the pip version.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# First install some libraries which might not be available across all kernels (e.g. in Studio):\n", "!pip install \"ipywidgets<8\" torchtext==0.6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download News Aggregator Dataset\n", "\n", "We will download **FastAi AG News** dataset from the [Registry of Open Data on AWS](https://registry.opendata.aws/fast-ai-nlp/) public repository. 
This dataset contains a table of news headlines and their corresponding classes.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%%time\n", "local_dir = \"data\"\n", "# Download the AG News data from the Registry of Open Data on AWS.\n", "!mkdir -p {local_dir}\n", "!aws s3 cp s3://fast-ai-nlp/ag_news_csv.tgz {local_dir} --no-sign-request\n", "\n", "# Un-tar the AG News data.\n", "!tar zxf {local_dir}/ag_news_csv.tgz -C {local_dir}/ --strip-components=1 --no-same-owner\n", "print(\"Done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's visualize the dataset\n", "\n", "We will load the extracted `train.csv` file into a pandas DataFrame for our data processing work." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "\n", "import os\n", "import re\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import util.preprocessing" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "column_names = [\"CATEGORY\", \"TITLE\", \"CONTENT\"]\n", "# we use the train.csv only\n", "df = pd.read_csv(f\"{local_dir}/train.csv\", names=column_names, header=None, delimiter=\",\")\n", "# shuffle the DataFrame rows\n", "df = df.sample(frac=1, random_state=1337)\n", "# make the category classes more readable\n", "mapping = {1: \"World\", 2: \"Sports\", 3: \"Business\", 4: \"Sci/Tech\"}\n", "df = df.replace({\"CATEGORY\": mapping})\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this exercise we'll **only use**:\n", "\n", "- The **title** (headline) of the news story, as our input\n", "- The **category**, as our target variable\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "df[\"CATEGORY\"].value_counts()" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "The dataset has **four article categories** with equal weighting:\n", "\n", "- Business\n", "- Sci/Tech\n", "- Sports\n", "- World\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Natural Language Pre-Processing\n", "\n", "We'll do some basic processing of the text data to convert it into numerical form that the algorithm will be able to consume to create a model.\n", "\n", "We will do typical pre processing for NLP workloads such as: dummy encoding the labels, tokenizing the documents and set fixed sequence lengths for input feature dimension, padding documents to have fixed length input vectors.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dummy Encode the Labels\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "encoded_y, labels = util.preprocessing.dummy_encode_labels(df, \"CATEGORY\")\n", "print(labels)\n", "print(encoded_y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For example, looking at the first record in our (shuffled) dataframe:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "df[\"CATEGORY\"].iloc[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "encoded_y[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tokenize and Set Fixed Sequence Lengths\n", "\n", "We want to describe our inputs at the more meaningful word level (rather than individual characters), and ensure a fixed length of the input feature dimension.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "_cell_guid": "7bcf422f-0e75-4d49-b3b1-12553fcaf4ff", "_uuid": "46b7fc9aef5a519f96a295e980ba15deee781e97", "tags": [] }, "outputs": [], "source": [ "processed_docs, tokenizer = util.preprocessing.tokenize_and_pad_docs(df, \"TITLE\")" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": { "tags": [] }, "outputs": [], "source": [ "df[\"TITLE\"].iloc[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "processed_docs[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import Word Embeddings\n", "\n", "To represent our words in numeric form, we'll use pre-trained vector representations for each word in the vocabulary: In this case we'll be using [pre-trained word embeddings from FastText](https://fasttext.cc/docs/en/crawl-vectors.html), which are also available for a broad range of languages other than English.\n", "\n", "You could also explore training custom, domain-specific word embeddings using SageMaker's built-in [BlazingText algorithm](https://docs.aws.amazon.com/sagemaker/latest/dg/blazingtext.html). See the official [blazingtext_word2vec_text8 sample](https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms/blazingtext_word2vec_text8) for an example notebook showing how.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%%time\n", "embedding_matrix = util.preprocessing.get_word_embeddings(tokenizer, f\"{local_dir}/embeddings\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "np.save(\n", " file=f\"{local_dir}/embeddings/docs-embedding-matrix\",\n", " arr=embedding_matrix,\n", " allow_pickle=False,\n", ")\n", "vocab_size = embedding_matrix.shape[0]\n", "print(embedding_matrix.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Split Train and Test Sets\n", "\n", "Finally we need to divide our data into model training and evaluation sets:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", "x_train, x_test, y_train, y_test = train_test_split(\n", " 
processed_docs, encoded_y, test_size=0.2, random_state=42\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Do you always remember to save your datasets for traceability when experimenting locally? ;-)\n", "os.makedirs(f\"{local_dir}/train\", exist_ok=True)\n", "np.save(f\"{local_dir}/train/train_X.npy\", x_train)\n", "np.save(f\"{local_dir}/train/train_Y.npy\", y_train)\n", "os.makedirs(f\"{local_dir}/test\", exist_ok=True)\n", "np.save(f\"{local_dir}/test/test_X.npy\", x_test)\n", "np.save(f\"{local_dir}/test/test_Y.npy\", y_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define the Model\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import torch.optim as optim\n", "from torch.utils.data import DataLoader\n", "\n", "seed = 42\n", "np.random.seed(seed)\n", "num_classes = len(labels)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "class Net(nn.Module):\n", "    def __init__(self, vocab_size=400000, emb_dim=300, num_classes=4):\n", "        super(Net, self).__init__()\n", "        self.embedding = nn.Embedding(vocab_size, emb_dim)\n", "        self.conv1 = nn.Conv1d(emb_dim, 128, kernel_size=3)\n", "        self.max_pool1d = nn.MaxPool1d(5)\n", "        self.flatten1 = nn.Flatten()\n", "        self.dropout1 = nn.Dropout(p=0.3)\n", "        # 896 = 128 conv channels x 7 pooled positions, assuming titles are\n", "        # padded to 40 tokens: conv -> 40-3+1 = 38, max-pool(5) -> 38//5 = 7\n", "        self.fc1 = nn.Linear(896, 128)\n", "        self.fc2 = nn.Linear(128, num_classes)\n", "\n", "    def forward(self, x):\n", "        x = self.embedding(x)\n", "        # (batch, seq, emb) -> (batch, emb, seq), as Conv1d expects channels first\n", "        x = torch.transpose(x, 1, 2)\n", "        x = self.flatten1(self.max_pool1d(self.conv1(x)))\n", "        x = self.dropout1(x)\n", "        x = F.relu(self.fc1(x))\n", "        x = self.fc2(x)\n", "        return F.softmax(x, dim=-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Train and Helper Functions\n" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "def test(model, test_loader, device):\n", " model.eval()\n", " test_loss = 0.0\n", " correct = 0\n", " with torch.no_grad():\n", " for data, target in test_loader:\n", " data, target = data.to(device), target.to(device)\n", " output = model(data)\n", " test_loss += F.binary_cross_entropy(output, target, reduction=\"sum\").item()\n", " pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability\n", " target_index = target.max(1, keepdim=True)[1]\n", " correct += pred.eq(target_index).sum().item()\n", "\n", " test_loss /= len(test_loader.dataset) # Average loss over dataset samples\n", " print(f\"val_loss: {test_loss:.4f}, val_acc: {correct/len(test_loader.dataset):.4f}\")\n", "\n", "\n", "def train(\n", " train_loader, test_loader, embedding_matrix, num_classes=4, epochs=12, learning_rate=0.001\n", "):\n", " ###### Setup model architecture ############\n", " model = Net(\n", " vocab_size=embedding_matrix.shape[0],\n", " emb_dim=embedding_matrix.shape[1],\n", " num_classes=num_classes,\n", " )\n", " model.embedding.weight = torch.nn.parameter.Parameter(\n", " torch.FloatTensor(embedding_matrix), False\n", " )\n", " device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", " model.to(device)\n", " optimizer = optim.RMSprop(model.parameters(), lr=learning_rate)\n", "\n", " for epoch in range(1, epochs + 1):\n", " model.train()\n", " running_loss = 0.0\n", " n_batches = 0\n", " for batch_idx, (X_train, y_train) in enumerate(train_loader, 1):\n", " data, target = X_train.to(device), y_train.to(device)\n", " optimizer.zero_grad()\n", " output = model(data)\n", " loss = F.binary_cross_entropy(output, target)\n", " loss.backward()\n", " optimizer.step()\n", " running_loss += loss.item()\n", " n_batches += 1\n", " print(f\"epoch: {epoch}, train_loss: {running_loss / n_batches:.6f}\") # (Avg over batches)\n", " print(\"Evaluating model\")\n", " test(model, 
test_loader, device)\n", " return model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "class Dataset(torch.utils.data.Dataset):\n", " def __init__(self, data, labels):\n", " \"\"\"Initialization\"\"\"\n", " self.labels = labels\n", " self.data = data\n", "\n", " def __len__(self):\n", " \"\"\"Denotes the total number of samples\"\"\"\n", " return len(self.data)\n", "\n", " def __getitem__(self, index):\n", " # Load data and get label\n", " X = torch.as_tensor(self.data[index]).long()\n", " y = torch.as_tensor(self.labels[index])\n", " return X, y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fit (Train) and Evaluate the Model\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%%time\n", "# fit the model here in the notebook:\n", "epochs = 5\n", "learning_rate = 0.001\n", "model_dir = \"model/\"\n", "trainloader = torch.utils.data.DataLoader(Dataset(x_train, y_train), batch_size=16, shuffle=True)\n", "testloader = torch.utils.data.DataLoader(Dataset(x_test, y_test), batch_size=32, shuffle=True)\n", "\n", "print(\"Training model\")\n", "model = train(\n", " trainloader,\n", " testloader,\n", " embedding_matrix,\n", " num_classes=num_classes,\n", " epochs=epochs,\n", " learning_rate=learning_rate,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use the Model (Locally)\n", "\n", "Let's evaluate our model with some example headlines...\n", "\n", "If you struggle with the widget, you can always simply call the `classify()` function from Python. 
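\n", "\n", "Under the hood, the `classify()` helper in the next cell maps each token to its vocabulary index via `tokenizer.vocab.stoi` before building the input tensor. A toy illustration of that lookup, using a made-up mini-vocabulary (not the real torchtext one):\n", "\n", "```python\n", "# Hypothetical mini-vocabulary standing in for tokenizer.vocab.stoi\n", "stoi = {\"<pad>\": 0, \"markets\": 5, \"bullish\": 9}\n", "tokens = [\"markets\", \"bullish\", \"<pad>\"]\n", "print([stoi.get(t, 0) for t in tokens])  # -> [5, 9, 0]\n", "```\n", "\n", "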
You can be creative with your headlines!\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import ipywidgets as widgets\n", "from IPython import display\n", "\n", "\n", "def classify(text):\n", "    \"\"\"Classify a headline and print the results\"\"\"\n", "    processed = tokenizer.preprocess(text)\n", "    padded = tokenizer.pad([processed])\n", "    final_text = []\n", "    for w in padded[0]:\n", "        final_text.append(tokenizer.vocab.stoi[w])\n", "    final_text = torch.tensor([final_text])\n", "    model.cpu()\n", "    model.eval()\n", "    with torch.no_grad():\n", "        result = model(final_text)\n", "    print(result)\n", "    ix = np.argmax(result.detach())\n", "    print(f\"Predicted class: '{labels[ix]}' with confidence {result[0][ix]:.2%}\")\n", "\n", "\n", "# Either try out the interactive widget:\n", "interaction = widgets.interact_manual(\n", "    classify,\n", "    text=widgets.Text(\n", "        value=\"The markets were bullish after news of the merger\",\n", "        placeholder=\"Type a news headline...\",\n", "        description=\"Headline:\",\n", "        layout=widgets.Layout(width=\"99%\"),\n", "    ),\n", ")\n", "interaction.widget.children[1].description = \"Classify!\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Or just use the function to classify your own headline:\n", "classify(\"Retailers are expanding after the recent economic growth\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Review\n", "\n", "In this notebook we pre-processed publicly downloadable data and trained a neural news headline classifier model, much as a data scientist might do when working on a local machine.\n", "\n", "...But can we use the cloud more effectively, to allocate high-performance resources on demand and easily deploy our trained models for use by other applications?\n", "\n", "Head on over to the next notebook, [Headline Classifier SageMaker.ipynb](Headline%20Classifier%20SageMaker.ipynb), 
where we'll show how the same model can be trained and then deployed on specific target infrastructure with Amazon SageMaker.\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General 
purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, 
"_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, 
"_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { 
"_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", 
"vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 } ], "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (PyTorch 1.13 Python 3.9 CPU Optimized)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/pytorch-1.13-cpu-py39" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 4 }