{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# SageMaker Serverless Inference\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## HuggingFace Text Classification example\n", "\n", "Amazon SageMaker Serverless Inference is a purpose-built inference option that makes it easy for you to deploy and scale ML models. Serverless Inference is ideal for workloads which have idle periods between traffic spurts and can tolerate cold starts. Serverless endpoints automatically launch compute resources and scale them in and out depending on traffic, eliminating the need to choose instance types or manage scaling policies. This takes away the undifferentiated heavy lifting of selecting and managing servers. Serverless Inference integrates with AWS Lambda to offer you high availability, built-in fault tolerance and automatic scaling.\n", "\n", "Serverless Inference is a great choice for customers that have intermittent or unpredictable prediction traffic. For example, a document processing service used to extract and analyze data on a periodic basis. Customers that choose Serverless Inference should make sure that their workloads can tolerate cold starts. A cold start can occur when your endpoint doesn’t receive traffic for a period of time. It can also occur when your concurrent requests exceed the current request usage. The cold start time will depend on your model size, how long it takes to download, and your container startup time.\n", "\n", "## Introduction\n", "\n", "Text Classification can be used to solve various use-cases like sentiment analysis, spam detection, hashtag prediction etc. \n", "\n", "\n", "This notebook demonstrates the use of the [HuggingFace `transformers` library](https://huggingface.co/transformers/) together with a custom Amazon sagemaker-sdk extension to fine-tune a pre-trained transformer on multi class text classification. In particular, the pre-trained model will be fine-tuned using the [`20 newsgroups dataset`](http://qwone.com/~jason/20Newsgroups/). 
To get started, we need to set up the environment with a few prerequisite steps for permissions, configurations, and so on.\n", "\n", "Notebook settings:\n", "- SageMaker Classic Notebook Instance: `ml.m5.xlarge` Notebook Instance & `conda_pytorch_p36` kernel\n", "- SageMaker Studio: `Python 3 (PyTorch 1.6 Python 3.6 CPU Optimized)`\n", "- Regions Available: SageMaker Serverless Inference is currently available in the following regions: US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), Asia Pacific (Tokyo) and Asia Pacific (Sydney)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Table of Contents\n", "- Setup\n", "- Data Preparation\n", "- Data Preprocessing\n", "- Model Training\n", "- Deployment\n", " - Serverless Configuration\n", " - Serverless Endpoint Creation\n", " - Endpoint Invocation\n", "- Cleanup\n", "- Conclusion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Development Environment and Permissions " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "If you run this notebook in SageMaker Studio, you need to make sure `ipywidgets` is installed and restart the kernel, so please uncomment the code in the next cell and run it." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%%capture\n", "import IPython\n", "import sys\n", "\n", "!{sys.executable} -m pip install ipywidgets\n", "# IPython.Application.instance().kernel.do_shutdown(True)  # restart the kernel so the changes are picked up" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's install the required packages from HuggingFace and SageMaker." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import sys\n", "\n", "!{sys.executable} -m pip install \"scikit_learn==1.2.0\" \"sagemaker>=2.86.1\" \"transformers==4.6.1\" \"datasets==1.6.2\" \"nltk==3.8\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make sure the SageMaker SDK version is >= 2.86.1." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import sagemaker\n", "import boto3\n", "\n", "print(sagemaker.__version__)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "sess = sagemaker.Session()\n", "# sagemaker session bucket -> used for uploading data, models and logs\n", "# sagemaker will automatically create this bucket if it does not exist\n", "sagemaker_session_bucket = None\n", "if sagemaker_session_bucket is None and sess is not None:\n", "    # set to default bucket if a bucket name is not given\n", "    sagemaker_session_bucket = sess.default_bucket()\n", "\n", "role = sagemaker.get_execution_role()\n", "sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)\n", "region = sess.boto_region_name\n", "\n", "s3_prefix = \"huggingface_serverless/20_newsgroups\"\n", "\n", "print(f\"sagemaker role arn: {role}\")\n", "print(f\"sagemaker bucket: {sess.default_bucket()}\")\n", "print(f\"sagemaker session region: {sess.boto_region_name}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Preparation\n", "\n", "Now we'll download a dataset from the web on which we want to train the text classification model.\n", "\n", "In this example, let us train the text classification model on the [`20 newsgroups dataset`](http://qwone.com/~jason/20Newsgroups/). 
The `20 newsgroups dataset` consists of 20,000 messages taken from 20 Usenet newsgroups." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import os\n", "import shutil\n", "\n", "data_dir = \"20_newsgroups_bulk\"\n", "if os.path.exists(data_dir):  # clean up an existing data folder\n", "    shutil.rmtree(data_dir)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "s3 = boto3.client(\"s3\")\n", "s3.download_file(\n", "    f\"sagemaker-example-files-prod-{region}\",\n", "    \"datasets/text/20_newsgroups/20_newsgroups_bulk.tar.gz\",\n", "    \"20_newsgroups_bulk.tar.gz\",\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "!tar xzf 20_newsgroups_bulk.tar.gz\n", "!ls 20_newsgroups_bulk" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "file_list = [os.path.join(data_dir, f) for f in os.listdir(data_dir)]\n", "print(\"Number of files:\", len(file_list))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "documents_count = 0\n", "for file in file_list:\n", "    df = pd.read_csv(file, header=None, names=[\"text\"])\n", "    documents_count += df.shape[0]\n", "print(\"Number of documents:\", documents_count)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's inspect the dataset files and analyze the categories." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "categories_list = [f.split(\"/\")[1] for f in file_list]\n", "categories_list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that the dataset consists of 20 topics, each in a different file.\n", "\n", "Let us inspect the dataset to understand how the data and the labels are provided. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "df = pd.read_csv(\"./20_newsgroups_bulk/rec.motorcycles\", header=None, names=[\"text\"])\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "df[\"text\"][0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "df = pd.read_csv(\"./20_newsgroups_bulk/comp.sys.mac.hardware\", header=None, names=[\"text\"])\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "df[\"text\"][0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see from the above, there is a single file for each class in the dataset. Each record is a plain-text message with a header, body, footer, and quotes. We will need to process them into a suitable data format." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Preprocessing\n", "\n", "We need to preprocess the dataset to remove the header, footer, quotes, leading/trailing whitespace, extra spaces, tabs, and HTML tags/markups.\n",
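"\n", "As a rough illustration (using a made-up snippet rather than an actual record from the dataset), the cleaning implemented below turns text like this:\n", "\n", "```python\n", "raw = \"Visit <b>www.example.com</b> for the FAQ!\"\n", "# after lowercasing and stripping URLs, HTML tags, and punctuation:\n", "# -> \"visit for the faq\"\n", "```\n",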
\n", "\n", "Download the `nltk` tokenizer and other libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import nltk\n", "from nltk.tokenize import word_tokenize\n", "import re\n", "import string\n", "\n", "nltk.download(\"punkt\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sklearn.datasets._twenty_newsgroups import (\n", " strip_newsgroup_header,\n", " strip_newsgroup_quoting,\n", " strip_newsgroup_footer,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This following function will remove the header, footer and quotes (of earlier messages in each text)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "def strip_newsgroup_item(item):\n", " item = strip_newsgroup_header(item)\n", " item = strip_newsgroup_quoting(item)\n", " item = strip_newsgroup_footer(item)\n", " return item" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function will take care of removing leading/trailing whitespace, extra spaces, tabs, and HTML tags/markups." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "def process_text(texts):\n", " final_text_list = []\n", " for text in texts:\n", " # Check if the sentence is a missing value\n", " if isinstance(text, str) == False:\n", " text = \"\"\n", "\n", " filtered_sentence = []\n", "\n", " # Lowercase\n", " text = text.lower()\n", "\n", " # Remove leading/trailing whitespace, extra space, tabs, and HTML tags/markups\n", " text = text.strip()\n", " text = re.sub(\"\\[.*?\\]\", \"\", text)\n", " text = re.sub(\"https?://\\S+|www\\.\\S+\", \"\", text)\n", " text = re.sub(\"<.*?>+\", \"\", text)\n", " text = re.sub(\"[%s]\" % re.escape(string.punctuation), \"\", text)\n", " text = re.sub(\"\\n\", \"\", text)\n", " text = re.sub(\"\\w*\\d\\w*\", \"\", text)\n", "\n", " for w in word_tokenize(text):\n", " # We are applying some custom filtering here, feel free to try different things\n", " # Check if it is not numeric\n", " if not w.isnumeric():\n", " filtered_sentence.append(w)\n", " final_string = \" \".join(filtered_sentence) # final string of cleaned words\n", "\n", " final_text_list.append(final_string)\n", "\n", " return final_text_list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we will read each of the `20_newsgroups` dataset files, call `strip_newsgroup_item` and `process_text` functions we defined earlier, and then aggregate all data into one dataframe." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df = pd.DataFrame()\n", "\n", "for file in file_list:\n", " print(f\"Processing {file}\")\n", " label = file.split(\"/\")[1]\n", " df = pd.read_csv(file, header=None, names=[\"text\"])\n", " df[\"text\"] = df[\"text\"].apply(strip_newsgroup_item)\n", " df[\"text\"] = process_text(df[\"text\"].tolist())\n", " df[\"label\"] = label\n", " all_categories_df = all_categories_df.append(df, ignore_index=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's inspect how many categories there are in our dataset." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"label\"].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In our dataset there are 20 categories which is too much, so we will combine the sub-categories." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# replace to politics\n", "all_categories_df[\"label\"].replace(\n", " {\n", " \"talk.politics.misc\": \"politics\",\n", " \"talk.politics.guns\": \"politics\",\n", " \"talk.politics.mideast\": \"politics\",\n", " },\n", " inplace=True,\n", ")\n", "\n", "# replace to recreational\n", "all_categories_df[\"label\"].replace(\n", " {\n", " \"rec.sport.hockey\": \"recreational\",\n", " \"rec.sport.baseball\": \"recreational\",\n", " \"rec.autos\": \"recreational\",\n", " \"rec.motorcycles\": \"recreational\",\n", " },\n", " inplace=True,\n", ")\n", "\n", "# replace to religion\n", "all_categories_df[\"label\"].replace(\n", " {\n", " \"soc.religion.christian\": \"religion\",\n", " \"talk.religion.misc\": \"religion\",\n", " \"alt.atheism\": \"religion\",\n", " },\n", " inplace=True,\n", ")\n", "\n", "# replace to computer\n", "all_categories_df[\"label\"].replace(\n", " {\n", " \"comp.windows.x\": \"computer\",\n", " \"comp.sys.ibm.pc.hardware\": \"computer\",\n", " \"comp.os.ms-windows.misc\": \"computer\",\n", " \"comp.graphics\": \"computer\",\n", " \"comp.sys.mac.hardware\": \"computer\",\n", " },\n", " inplace=True,\n", ")\n", "# replace to sales\n", "all_categories_df[\"label\"].replace({\"misc.forsale\": \"sales\"}, inplace=True)\n", "\n", "# replace to science\n", "all_categories_df[\"label\"].replace(\n", " {\n", " \"sci.crypt\": \"science\",\n", " \"sci.electronics\": \"science\",\n", " \"sci.med\": \"science\",\n", " \"sci.space\": \"science\",\n", " },\n", " inplace=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we are left with 6 categories, which is much better." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"label\"].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's calculate number of words for each row." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"word_count\"] = all_categories_df[\"text\"].apply(lambda x: len(str(x).split()))\n", "all_categories_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get basic statistics about the dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"word_count\"].describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that the mean value is around 159 words. However, there are outliers, such as a text with 11351 words. This can make it harder for the model to result in good performance. We will take care to drop those rows.\n", "\n", "Let's drop empty rows first." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "no_text = all_categories_df[all_categories_df[\"word_count\"] == 0]\n", "print(len(no_text))\n", "\n", "# drop these rows\n", "all_categories_df.drop(no_text.index, inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's drop the rows that are longer than 256 words, as it is a length close to the mean value of the word count. This is done to make it easy for the model to train without outliers. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "long_text = all_categories_df[all_categories_df[\"word_count\"] > 256]\n", "print(len(long_text))\n", "\n", "# drop these rows\n", "all_categories_df.drop(long_text.index, inplace=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"label\"].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get basic statistics about the dataset after our outliers fixes." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"word_count\"].describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This looks much more balanced.\n", "\n", "Now we drop the `word_count` columns as we will not need it anymore." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df.drop(columns=\"word_count\", axis=1, inplace=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's convert categorical label to integer number, in order to prepare the dataset for training." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "categories = all_categories_df[\"label\"].unique().tolist()\n", "categories" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "categories.index(\"recreational\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"label\"] = all_categories_df[\"label\"].apply(lambda x: categories.index(x))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "all_categories_df[\"label\"].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We partition the dataset into 80% training and 20% validation set and save to `csv` files." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", "train_df, test_df = train_test_split(all_categories_df, test_size=0.2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "train_df.to_csv(\"train.csv\", index=None)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "test_df.to_csv(\"test.csv\", index=None)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's inspect the label distribution in the training dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "train_df[\"label\"].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's inspect the label distribution in the test dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "test_df[\"label\"].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tokenization \n", "\n", "A tokenizer is in charge of preparing the inputs for a model. The `transformers` library contains tokenizers for all of its models. Most of the tokenizers are available in two flavors: a full Python implementation and a “Fast” implementation based on the Rust library [tokenizers](https://github.com/huggingface/tokenizers). The “Fast” implementations allow:\n", "\n", " - A significant speed-up, in particular when doing batched tokenization.\n", " - Additional methods to map between the original string (characters and words) and the token space (e.g. getting the index of the token comprising a given character or the span of characters corresponding to a given token). " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from datasets import load_dataset\n", "from transformers import AutoTokenizer\n", "\n", "# tokenizer used in preprocessing\n", "tokenizer_name = \"distilbert-base-uncased\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# download tokenizer\n", "tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load train and test datasets\n", "\n", "Let's create a [Dataset](https://huggingface.co/docs/datasets/loading_datasets.html) from the local train and test `csv` files we saved earlier."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "dataset = load_dataset(\"csv\", data_files={\"train\": \"train.csv\", \"test\": \"test.csv\"})" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "dataset[\"train\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "dataset[\"train\"][0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "dataset[\"test\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "dataset[\"test\"][0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# tokenizer helper function\n", "def tokenize(batch):\n", "    return tokenizer(batch[\"text\"], padding=\"max_length\", truncation=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "train_dataset = dataset[\"train\"]\n", "test_dataset = dataset[\"test\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tokenize train and test datasets\n", "\n", "Let's tokenize the train dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "train_dataset = train_dataset.map(tokenize, batched=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's tokenize the test dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "test_dataset = test_dataset.map(tokenize, batched=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set format for PyTorch" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "train_dataset = train_dataset.rename_column(\"label\", \"labels\")\n", "train_dataset.set_format(\"torch\", columns=[\"input_ids\", \"attention_mask\", \"labels\"])\n", "test_dataset = test_dataset.rename_column(\"label\", \"labels\")\n", "test_dataset.set_format(\"torch\", columns=[\"input_ids\", \"attention_mask\", \"labels\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Uploading data to `sagemaker_session_bucket`\n", "\n", "Now that we have processed the datasets, we will upload them to S3." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from datasets.filesystems import S3FileSystem\n", "\n", "s3 = S3FileSystem()\n", "\n", "# save train_dataset to s3\n", "training_input_path = f\"s3://{sess.default_bucket()}/{s3_prefix}/train\"\n", "train_dataset.save_to_disk(training_input_path, fs=s3)\n", "\n", "# save test_dataset to s3\n", "test_input_path = f\"s3://{sess.default_bucket()}/{s3_prefix}/test\"\n", "test_dataset.save_to_disk(test_input_path, fs=s3)\n", "\n", "print(training_input_path)\n", "print(test_input_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training the HuggingFace model for supervised text classification\n", "\n", "In order to create a SageMaker training job, we need a `HuggingFace` Estimator. The Estimator handles end-to-end Amazon SageMaker training and deployment tasks. 
In the Estimator we define which fine-tuning script should be used as `entry_point`, which `instance_type` should be used, which `hyperparameters` are passed in, and so on:\n", "\n", "```python\n", "huggingface_estimator = HuggingFace(entry_point='train.py',\n", "                                    source_dir='./code',\n", "                                    instance_type='ml.p3.2xlarge',\n", "                                    instance_count=1,\n", "                                    volume_size=256,\n", "                                    role=role,\n", "                                    transformers_version='4.6',\n", "                                    pytorch_version='1.7',\n", "                                    py_version='py36',\n", "                                    hyperparameters = {'epochs': 1,\n", "                                                       'model_name':'distilbert-base-uncased',\n", "                                                       'num_labels': 6\n", "                                                       })\n", "```\n", "\n", "When we create a SageMaker training job, SageMaker takes care of starting and managing all the required EC2 instances for us with the `huggingface` container, uploads the provided fine-tuning script `train.py`, and downloads the data from our `sagemaker_session_bucket` into the container at `/opt/ml/input/data`. Then, it starts the training job by running:\n", "\n", "```bash\n", "/opt/conda/bin/python train.py --epochs 1 --model_name distilbert-base-uncased --num_labels 6\n", "```\n", "\n", "The `hyperparameters` you define in the `HuggingFace` estimator are passed in as named arguments.\n", "\n", "SageMaker provides useful properties about the training environment through various environment variables, including the following:\n", "\n", "* `SM_MODEL_DIR`: A string that represents the path where the training job writes the model artifacts to. After training, artifacts in this directory are uploaded to S3 for model hosting.\n", "\n", "* `SM_NUM_GPUS`: An integer representing the number of GPUs available to the host.\n", "\n", "* `SM_CHANNEL_XXXX`: A string that represents the path to the directory that contains the input data for the specified channel. For example, if you specify two input channels in the HuggingFace estimator’s fit call, named `train` and `test`, the environment variables `SM_CHANNEL_TRAIN` and `SM_CHANNEL_TEST` are set.\n", "\n", "
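As a rough sketch (hypothetical, for illustration only; the actual training script used by this notebook ships in `./code/train.py`), a training script can read these hyperparameters and paths like so:\n", "\n", "```python\n", "# hypothetical excerpt of a train.py entry point (the real script is in ./code)\n", "import argparse\n", "import os\n", "\n", "parser = argparse.ArgumentParser()\n", "# hyperparameters from the Estimator arrive as named arguments\n", "parser.add_argument(\"--epochs\", type=int, default=1)\n", "parser.add_argument(\"--model_name\", type=str)\n", "parser.add_argument(\"--num_labels\", type=int)\n", "args, _ = parser.parse_known_args()\n", "\n", "# data and output locations injected by SageMaker\n", "model_dir = os.environ[\"SM_MODEL_DIR\"]\n", "train_dir = os.environ[\"SM_CHANNEL_TRAIN\"]\n", "test_dir = os.environ[\"SM_CHANNEL_TEST\"]\n", "```\n", "\n", "To run your training job locally you can define `instance_type='local'` or `instance_type='local-gpu'` for `gpu` usage. 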
_Note: this does not work within SageMaker Studio_\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We create a `metric_definitions` list of regex-based definitions that will be used to parse the job logs and extract metrics." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "metric_definitions = [\n", "    {\"Name\": \"loss\", \"Regex\": \"'loss': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\"Name\": \"learning_rate\", \"Regex\": \"'learning_rate': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\"Name\": \"eval_loss\", \"Regex\": \"'eval_loss': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\"Name\": \"eval_accuracy\", \"Regex\": \"'eval_accuracy': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\"Name\": \"eval_f1\", \"Regex\": \"'eval_f1': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\"Name\": \"eval_precision\", \"Regex\": \"'eval_precision': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\"Name\": \"eval_recall\", \"Regex\": \"'eval_recall': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\"Name\": \"eval_runtime\", \"Regex\": \"'eval_runtime': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "    {\n", "        \"Name\": \"eval_samples_per_second\",\n", "        \"Regex\": \"'eval_samples_per_second': ([0-9]+(.|e\-)[0-9]+),?\",\n", "    },\n", "    {\"Name\": \"epoch\", \"Regex\": \"'epoch': ([0-9]+(.|e\-)[0-9]+),?\"},\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating an Estimator and starting a training job" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.huggingface import HuggingFace\n", "\n", "# hyperparameters, which are passed into the training job\n", "hyperparameters = {\"epochs\": 1, \"model_name\": \"distilbert-base-uncased\", \"num_labels\": 6}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's define the SageMaker `HuggingFace` estimator with resource configurations and hyperparameters to train a text classification model on the `20 newsgroups` dataset, running on a `p3.2xlarge` instance." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "huggingface_estimator = HuggingFace(\n", "    entry_point=\"train.py\",\n", "    source_dir=\"./code\",\n", "    instance_type=\"ml.p3.2xlarge\",\n", "    instance_count=1,\n", "    volume_size=256,\n", "    role=role,\n", "    transformers_version=\"4.6\",\n", "    pytorch_version=\"1.7\",\n", "    py_version=\"py36\",\n", "    hyperparameters=hyperparameters,\n", "    metric_definitions=metric_definitions,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# starting the train job with our uploaded datasets as input\n", "huggingface_estimator.fit({\"train\": training_input_path, \"test\": test_input_path})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deployment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Serverless Configuration\n", "\n", "#### Memory size - `memory_size_in_mb`\n", "\n", "Your serverless endpoint has a minimum RAM size of 1024 MB (1 GB), and the maximum RAM size you can choose is 6144 MB (6 GB). The memory sizes you can select are 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB. Serverless Inference auto-assigns compute resources proportional to the memory you select. If you select a larger memory size, your container has access to more `vCPUs`. Select your endpoint’s memory size according to your model size. Generally, the memory size should be at least as large as your model size. 
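One rough way to gauge this after training (a hypothetical check, not part of the original notebook flow) is to inspect the size of the model artifact that the training job uploaded to S3:\n", "\n", "```python\n", "# hypothetical: check the size of the trained model artifact on S3\n", "from urllib.parse import urlparse\n", "\n", "u = urlparse(huggingface_estimator.model_data)  # s3://bucket/.../model.tar.gz\n", "head = boto3.client(\"s3\").head_object(Bucket=u.netloc, Key=u.path.lstrip(\"/\"))\n", "print(f\"model artifact: {head['ContentLength'] / 1024 ** 2:.0f} MiB\")\n", "```\n", "\n", "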
You may need to benchmark in order to select the right memory size for your model based on your latency SLAs. The memory size increments have different pricing; see the Amazon SageMaker pricing page for more information.\n", "\n", "#### Concurrent invocations - `max_concurrency`\n", "\n", "Serverless Inference manages predefined scaling policies and quotas for the capacity of your endpoint. Serverless endpoints have a quota for how many concurrent invocations can be processed at the same time. If the endpoint is invoked before it finishes processing the first request, then it handles the second request concurrently. You can set the maximum concurrency for a single endpoint up to 200, and the total number of serverless endpoint variants you can host in a Region is 50. The total concurrency you can share between all serverless endpoints per Region in your account is 200. The maximum concurrency for an individual endpoint prevents that endpoint from taking up all the invocations allowed for your account, and any endpoint invocations beyond the maximum are throttled." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.serverless.serverless_inference_config import ServerlessInferenceConfig\n", "\n", "serverless_config = ServerlessInferenceConfig(\n", "    memory_size_in_mb=6144,\n", "    max_concurrency=1,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Serverless Endpoint Creation\n", "\n", "Now that we have a `ServerlessInferenceConfig`, we can create a serverless endpoint and deploy our model to it." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%%time\n", "\n", "predictor = huggingface_estimator.deploy(serverless_inference_config=serverless_config)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Endpoint Invocation\n", "\n", "Using a few samples, you can now invoke the SageMaker endpoint to get predictions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "def predict_sentence(sentence):\n", "    result = predictor.predict({\"inputs\": sentence})\n", "    index = int(result[0][\"label\"].split(\"LABEL_\")[1])\n", "    print(categories[index])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "sentences = [\n", "    \"The modem is an internal AT/(E)ISA 8-bit card (just a little longer than a half-card).\",\n", "    \"In the cage I usually wave to bikers. They usually don't wave back. My wife thinks it's strange but I don't care.\",\n", "    \"Voyager has the unusual luck to be on a stable trajectory out of the solar system.\",\n", "]\n", "\n", "# using the same processing logic that we used during data preparation for training\n", "processed_sentences = process_text(sentences)\n", "\n", "for sentence in processed_sentences:\n", "    predict_sentence(sentence)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Clean up\n", "\n", "Endpoints should be deleted when no longer in use, since (per the [SageMaker pricing page](https://aws.amazon.com/sagemaker/pricing/)) they're billed by time deployed."
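, "\n", "Besides the endpoint itself (and its endpoint configuration, which `delete_endpoint` also removes), you can optionally delete the model resource that was registered during deployment:\n", "\n", "```python\n", "# optional: also remove the SageMaker model resource\n", "predictor.delete_model()\n", "```"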
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "predictor.delete_endpoint()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "\n", "In this notebook you successfully ran a SageMaker Training Job with the HuggingFace framework to fine-tune a pre-trained transformer on text classification using the `20 newsgroups` dataset.\n", "Then, you prepared the required serverless configuration and deployed your model to a SageMaker Serverless Endpoint. Finally, you invoked the serverless endpoint with sample data and got the prediction results.\n", "\n", "As next steps, you can try running SageMaker Training Jobs with your own algorithm and your own data, and deploy the model to a SageMaker Serverless Endpoint." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2, which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/serverless-inference|huggingface-serverless-inference|huggingface-text-classification-serverless-inference.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, 
"_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, 
"category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 } ], "interpreter": { "hash": "c281c456f1b8161c8906f4af2c08ed2c40c50136979eaae69688b01f70e9f4a9" }, "kernelspec": { "display_name": "Python 3 (PyTorch 1.12 Python 3.8 CPU Optimized)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/pytorch-1.12-cpu-py38" }, 
"language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.16" } }, "nbformat": 4, "nbformat_minor": 4 }