{ "cells": [ { "cell_type": "markdown", "id": "f1b70f55", "metadata": {}, "source": [ "# Compile and Train the GPT2 Model using the Transformers Trainer API with the SST2 Dataset for Single-Node Multi-GPU Training" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "6a28d54c", "metadata": {}, "source": [ "1. [Introduction](#Introduction) \n", "2. [Development Environment and Permissions](#Development-Environment-and-Permissions)\n", " 1. [Installation](#Installation) \n", " 2. [Permissions](#Permissions)\n", "3. [SageMaker Training Job](#SageMaker-Training-Job) \n", " 1. [Training with Native TensorFlow](#NativeTF) \n", " 2. [Training with Optimized TensorFlow](#OptimizedTF) \n", " 3. [Analysis](#Analysis) " ] }, { "cell_type": "markdown", "id": "0290f0a8", "metadata": {}, "source": [ "## SageMaker Training Compiler Overview\n", "\n", "SageMaker Training Compiler is a capability of SageMaker that makes these hard-to-implement optimizations to reduce training time on GPU instances. The compiler optimizes Deep Learning (DL) models to accelerate training by more efficiently using SageMaker machine learning (ML) GPU instances. SageMaker Training Compiler is available at no additional charge within SageMaker and can help reduce total billable time as it accelerates training. \n", "\n", "SageMaker Training Compiler is integrated into the AWS Deep Learning Containers (DLCs). Using the SageMaker Training Compiler enabled AWS DLCs, you can compile and optimize training jobs on GPU instances with minimal changes to your code. Bring your deep learning models to SageMaker and enable SageMaker Training Compiler to accelerate the speed of your training job on SageMaker ML instances for accelerated computing. \n", "\n", "For more information, see [SageMaker Training Compiler](https://docs.aws.amazon.com/sagemaker/latest/dg/training-compiler.html) in the *Amazon SageMaker Developer Guide*.\n", "\n", "## Introduction\n", "\n", "In this demo, you'll use Hugging Face's `transformers` and `datasets` libraries with Amazon SageMaker Training Compiler to train the `gpt-2` model on the `Stanford Sentiment Treebank v2 (SST2)` dataset. Please note that by using this notebook you will be downloading SST2 from https://huggingface.co/datasets/sst2 and can check dataset information and terms there. To get started, we need to set up the environment with a few prerequisite steps, for permissions, configurations, and so on. \n", "\n", "**NOTE:** You can run this demo in SageMaker Studio, SageMaker notebook instances, or your local machine with AWS CLI set up. If using SageMaker Studio or SageMaker notebook instances, make sure you choose one of the TensorFlow-based kernels, `Python 3 (TensorFlow x.y Python 3.x CPU Optimized)` or `conda_tensorflow_p36` respectively.\n", "\n", "**NOTE:** This notebook uses two `ml.p3.8xlarge` instances that have multiple GPUs. If you don't have enough quota, see [Request a service quota increase for SageMaker resources](https://docs.aws.amazon.com/sagemaker/latest/dg/regions-quotas.html#service-limit-increase-request-procedure). " ] }, { "cell_type": "markdown", "id": "221ab379", "metadata": {}, "source": [ "# Development Environment " ] }, { "cell_type": "markdown", "id": "3225cd60", "metadata": {}, "source": [ "## Installation\n", "\n", "This example notebook requires the **SageMaker Python SDK v2.115.0** and **transformers v4.21**." ] }, { "cell_type": "code", "execution_count": null, "id": "81ab31b0", "metadata": { "scrolled": true }, "outputs": [], "source": [ "!pip install \"sagemaker>=2.115\" --upgrade\n", "!pip install botocore boto3 awscli --upgrade" ] }, { "cell_type": "code", "execution_count": null, "id": "839fbb8b", "metadata": {}, "outputs": [], "source": [ "!pip install transformers==4.21" ] }, { "cell_type": "code", "execution_count": null, "id": "02f4b0c2", "metadata": {}, "outputs": [], "source": [ "import botocore\n", "import boto3\n", "import sagemaker\n", "import transformers\n", "import pandas as pd\n", "\n", "print(f\"sagemaker: {sagemaker.__version__}\")\n", "print(f\"transformers: {transformers.__version__}\")" ] }, { "cell_type": "markdown", "id": "43f0e97d", "metadata": {}, "source": [ "### SageMaker environment \n" ] }, { "cell_type": "code", "execution_count": null, "id": "0cfe4100", "metadata": {}, "outputs": [], "source": [ "import sagemaker\n", "\n", "sess = sagemaker.Session()\n", "\n", "# SageMaker session bucket -> used for uploading data, models and logs\n", "# SageMaker will automatically create this bucket if it does not exist\n", "sagemaker_session_bucket = None\n", "if sagemaker_session_bucket is None and sess is not None:\n", " # set to default bucket if a bucket name is not given\n", " sagemaker_session_bucket = sess.default_bucket()\n", "\n", "role = sagemaker.get_execution_role()\n", "sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)\n", "\n", "print(f\"sagemaker role arn: {role}\")\n", "print(f\"sagemaker bucket: {sess.default_bucket()}\")\n", "print(f\"sagemaker session region: {sess.boto_region_name}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "512fd53d", "metadata": {}, "outputs": [], "source": [ "from sagemaker.huggingface import HuggingFace, TrainingCompilerConfig" ] }, { "cell_type": "markdown", "id": "fc2c552e", "metadata": {}, "source": [ "## SageMaker Training Job\n", "\n", "To create a SageMaker training job, we use a `HuggingFace` estimator. Using the estimator, you can define which fine-tuning script should SageMaker use through `entry_point`, which `instance_type` to use for training, which `hyperparameters` to pass, and so on.\n", "\n", "When a SageMaker training job starts, SageMaker takes care of starting and managing all the required machine learning instances, picks up the `HuggingFace` Deep Learning Container, uploads your training script, and downloads the data from `sagemaker_session_bucket` into the container at `/opt/ml/input/data`.\n", "\n", "In the following section, you learn how to set up two versions of the SageMaker `HuggingFace` estimator, a native one without the compiler and an optimized one with the compiler." ] }, { "cell_type": "markdown", "id": "0831b533", "metadata": {}, "source": [ "### Training Setup" ] }, { "cell_type": "code", "execution_count": null, "id": "17a19e6a", "metadata": {}, "outputs": [], "source": [ "# Here we configure the training job. Please configure the appropriate options below:\n", "\n", "EPOCHS = 100\n", "\n", "# For more information about the options, please look into the training scripts\n", "\n", "# SageMaker Training Compiler currently only supports training on GPU\n", "# Select Instance type for training\n", "INSTANCE_TYPE = \"ml.p3.8xlarge\" # ml.p3.8xlarge is easily available. However, p3.16xlarge provides better performance.\n", "NUM_GPUS_PER_INSTANCE = 4" ] }, { "cell_type": "markdown", "id": "f525bfc2", "metadata": {}, "source": [ "First, we define some basic parameters common to all estimators.\n", "\n", "This example uses a modified version of HuggingFace training script `run_clm.py`, which you can find inside the `scripts` folder. \n", "\n", "**Note:** We recommend you to turn the SageMaker Debugger's profiling and debugging tools off to avoid additional overheads." ] }, { "cell_type": "code", "execution_count": null, "id": "536373fe", "metadata": {}, "outputs": [], "source": [ "estimator_args = dict(\n", " entry_point=f\"run_clm.py\",\n", " source_dir=\"./scripts\",\n", " instance_type=INSTANCE_TYPE,\n", " base_job_name=\"trcomp-tf-example\",\n", " instance_count=1,\n", " role=role,\n", " volume_size=100,\n", " model_dir=False,\n", " py_version=\"py39\",\n", " framework_version=\"2.10.0\",\n", " disable_profiler=True, # Disabling SageMaker Profiler to avoid overheads during benchmarking\n", " debugger_hook_config=False, # Disabling SageMaker Debugger to avoid overheads during benchmarking\n", ")" ] }, { "cell_type": "markdown", "id": "6c1ecbc2", "metadata": {}, "source": [ "Next, we define some basic arguments to be passed to the training script." ] }, { "cell_type": "code", "execution_count": null, "id": "e0b56d7d", "metadata": {}, "outputs": [], "source": [ "# hyperparameters are passed to the training entrypoint as arguments\n", "hyperparameters = {\n", " \"model_type\": \"gpt2\",\n", " \"tokenizer_name\": \"gpt2\",\n", " \"output_dir\": \"/opt/ml/model\",\n", " \"dataset_name\": \"glue\",\n", " \"dataset_config_name\": \"sst2\",\n", " \"do_train\": True,\n", " \"do_eval\": True,\n", " \"block_size\": 128,\n", " \"num_train_epochs\": EPOCHS,\n", " \"overwrite_output_dir\": True,\n", " \"save_strategy\": \"no\",\n", " \"evaluation_strategy\": \"no\",\n", " \"logging_strategy\": \"epoch\",\n", " \"output_dir\": \"/opt/ml/model\",\n", " \"dataloader_drop_last\": True,\n", "}" ] }, { "cell_type": "markdown", "id": "29d8204d", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "0074b92b", "metadata": {}, "source": [ "### Training with Native TensorFlow" ] }, { "cell_type": "markdown", "id": "ab5f5cf1", "metadata": {}, "source": [ "The batch size below is the maximum batch we could fit into the memory of an Nvidia V100 GPU (P3). If you change the model, instance type or sequence length etc., please experiment to find the largest batch size that will fit into memory. " ] }, { "cell_type": "code", "execution_count": null, "id": "2dc840c2", "metadata": {}, "outputs": [], "source": [ "from sagemaker.tensorflow import TensorFlow\n", "\n", "native_batch_size = 48\n", "learning_rate_native = float(\"1.25e-5\") * NUM_GPUS_PER_INSTANCE\n", "\n", "hyperparameters[\"per_device_train_batch_size\"]: native_batch_size\n", "hyperparameters[\"learning_rate\"]: learning_rate_native\n", "\n", "# configure the training job\n", "native_estimator = TensorFlow(\n", " **estimator_args,\n", " hyperparameters=hyperparameters,\n", ")\n", "\n", "# start the training job\n", "native_estimator.fit(wait=False)\n", "native_estimator.latest_training_job.name" ] }, { "cell_type": "markdown", "id": "72b75b1f", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "01fd04a7", "metadata": {}, "source": [ "### Training with Optimized TensorFlow" ] }, { "cell_type": "markdown", "id": "b2eb56e5", "metadata": {}, "source": [ "Compilation through Training Compiler changes the memory footprint of the model. Most commonly, this manifests as a reduction in memory utilization and a consequent increase in the largest batch size that can fit on the GPU. But in some cases the compiler intelligently promotes caching which leads to a decrease in the largest batch size that can fit on the GPU. Note that if you want to change the batch size, you must adjust the learning rate appropriately." ] }, { "cell_type": "code", "execution_count": null, "id": "d00c0b52", "metadata": {}, "outputs": [], "source": [ "from sagemaker.tensorflow import TensorFlow, TrainingCompilerConfig\n", "\n", "# SageMaker Training Compiler has increased the memory consumption of the model leading to decrease in batch size.\n", "hyperparameters[\"per_device_train_batch_size\"] = 32\n", "\n", "# The original LR was set for a batch of 48. Here we are scaling learning rate with batch size.\n", "hyperparameters[\"learning_rate\"] = (\n", " learning_rate_native\n", " / 48\n", " * hyperparameters[\"per_device_train_batch_size\"]\n", " * NUM_GPUS_PER_INSTANCE\n", ")\n", "\n", "# configure the training job\n", "optimized_estimator = TensorFlow(\n", " compiler_config=TrainingCompilerConfig(),\n", " hyperparameters=hyperparameters,\n", " **estimator_args,\n", ")\n", "\n", "# start the training job\n", "optimized_estimator.fit(wait=False)\n", "optimized_estimator.latest_training_job.name" ] }, { "cell_type": "markdown", "id": "26688483", "metadata": {}, "source": [ "### Wait for training jobs to complete\n" ] }, { "cell_type": "code", "execution_count": null, "id": "20cc74af", "metadata": {}, "outputs": [], "source": [ "waiter = native_estimator.sagemaker_session.sagemaker_client.get_waiter(\n", " \"training_job_completed_or_stopped\"\n", ")\n", "waiter.wait(TrainingJobName=native_estimator.latest_training_job.name)\n", "waiter.wait(TrainingJobName=optimized_estimator.latest_training_job.name)" ] }, { "cell_type": "code", "execution_count": null, "id": "5267ff10", "metadata": {}, "outputs": [], "source": [ "native_estimator = TensorFlow.attach(native_estimator.latest_training_job.name)\n", "optimized_estimator = TensorFlow.attach(optimized_estimator.latest_training_job.name)" ] }, { "cell_type": "markdown", "id": "5dcce59d", "metadata": {}, "source": [ "## Analysis" ] }, { "cell_type": "markdown", "id": "b42a93db", "metadata": {}, "source": [ "**Note:** If the estimator object is no longer available due to a kernel break or refresh, you need to directly use the training job name and manually attach the training job to a new HuggingFace estimator. For example:\n", "\n", "```python\n", "huggingface_estimator = HuggingFace.attach(\"\")\n", "```" ] }, { "cell_type": "markdown", "id": "5cc75f0a", "metadata": {}, "source": [ "### Load logs of the training job *with* SageMaker Training Compiler" ] }, { "cell_type": "code", "execution_count": null, "id": "aec64f95", "metadata": {}, "outputs": [], "source": [ "%%capture optimized\n", "\n", "# access the logs of the optimized training job\n", "optimized_estimator.sagemaker_session.logs_for_job(optimized_estimator.latest_training_job.name)" ] }, { "cell_type": "markdown", "id": "bb090301", "metadata": {}, "source": [ "### Load logs of the training job *without* SageMaker Training Compiler" ] }, { "cell_type": "code", "execution_count": null, "id": "9b7b1016", "metadata": {}, "outputs": [], "source": [ "%%capture native\n", "\n", "# access the logs of the native training job\n", "native_estimator.sagemaker_session.logs_for_job(native_estimator.latest_training_job.name)" ] }, { "cell_type": "markdown", "id": "3a66fa80", "metadata": {}, "source": [ "### Create helper functions for analysis" ] }, { "cell_type": "code", "execution_count": null, "id": "7efc1234", "metadata": {}, "outputs": [], "source": [ "from ast import literal_eval\n", "from collections import defaultdict\n", "from matplotlib import pyplot as plt\n", "\n", "\n", "def _summarize(captured):\n", " final = []\n", " for line in captured.stdout.replace(\"#010\", \"\").replace(\"]\", \"\\n\").replace(\"-\", \"\").split(\"\\n\"):\n", " cleaned = line.strip()\n", " if (\n", " cleaned.startswith(\"ETA\")\n", " or \"*\" * 5 in cleaned\n", " or \"ms/step\" in cleaned\n", " or \"Epoch\" in cleaned\n", " or (\"INFO\" in cleaned and \"=\" in cleaned)\n", " ):\n", " final.append(cleaned)\n", " return final\n", "\n", "\n", "def make_sense(string):\n", " try:\n", " return literal_eval(string.split(chr(27))[0])\n", " except:\n", " pass\n", "\n", "\n", "def summarize(summary):\n", " final = {\"train\": {}}\n", " phase = \"train\"\n", " for line in summary:\n", " if \"Epoch\" in line:\n", " epoch = literal_eval(line.split(\" \")[1].split(\"/\")[0])\n", " if epoch not in final[phase]:\n", " final[phase][epoch] = {\"loss\": [], \"ms/step\": -1}\n", " elif line.startswith(\"ETA\"):\n", " try:\n", " extract = line[line.index(\"loss:\") : line.index(\"loss:\") + 49]\n", " values = [i for i in extract.split(\" \") if make_sense(i)]\n", " loss = values[0].split(chr(27))[0]\n", " final[phase][epoch][\"loss\"].append(loss)\n", " except:\n", " pass\n", " elif \"ms/step\" in line:\n", " avg_step_latency = make_sense(\n", " [i for i in line.split(\" \") if \"ms/step\" in i][0].replace(\"ms/step\", \"\")\n", " )\n", " final[phase][epoch][\"ms/step\"] = avg_step_latency\n", " return final" ] }, { "cell_type": "markdown", "id": "3aa7fe3c", "metadata": {}, "source": [ "### Convergence of Training Loss\n", "\n", "SageMaker Training Compiler does not affect the model convergence behavior. Here, we see the decrease in training loss is similar with and without SageMaker Training Compiler\n" ] }, { "cell_type": "code", "execution_count": null, "id": "cb39ef3f", "metadata": {}, "outputs": [], "source": [ "def get_loss_across_epochs(data_summary):\n", " loss = []\n", " epochs = []\n", " for epoch, value in data_summary[\"train\"].items():\n", " if value[\"loss\"]:\n", " epochs.append(epoch)\n", " loss.append(literal_eval(value[\"loss\"][0]))\n", " return epochs, loss\n", "\n", "\n", "optimized_epochs, optimized_losses = get_loss_across_epochs(summarize(_summarize(optimized)))\n", "native_epochs, native_losses = get_loss_across_epochs(summarize(_summarize(native)))\n", "\n", "%matplotlib inline\n", "\n", "plt.title(\"Plot of Training Loss\")\n", "plt.ylabel(\"Training Loss\")\n", "plt.xlabel(\"Epoch\")\n", "plt.plot(native_epochs, native_losses, label=\"Baseline TF\")\n", "\n", "plt.plot(optimized_epochs, optimized_losses, label=\"SM Training Compiler Enhanced TF\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "id": "222bd985", "metadata": {}, "source": [ "### Total Billable Time\n", "\n", "Finally, the decrease in total training time results in a decrease in the billable seconds from SageMaker" ] }, { "cell_type": "code", "execution_count": null, "id": "52c79ad9", "metadata": {}, "outputs": [], "source": [ "def BillableTimeInSeconds(name):\n", " describe_training_job = (\n", " optimized_estimator.sagemaker_session.sagemaker_client.describe_training_job\n", " )\n", " details = describe_training_job(TrainingJobName=name)\n", " return details[\"BillableTimeInSeconds\"]" ] }, { "cell_type": "code", "execution_count": null, "id": "c9cd5c72", "metadata": {}, "outputs": [], "source": [ "Billable = {}\n", "Billable[\"Native\"] = BillableTimeInSeconds(native_estimator.latest_training_job.name)\n", "Billable[\"Optimized\"] = BillableTimeInSeconds(optimized_estimator.latest_training_job.name)\n", "pd.DataFrame(Billable, index=[\"BillableSecs\"])" ] }, { "cell_type": "code", "execution_count": null, "id": "dcba1bb4", "metadata": {}, "outputs": [], "source": [ "speedup = (Billable[\"Native\"] - Billable[\"Optimized\"]) * 100 / Billable[\"Native\"]\n", "print(f\"SageMaker Training Compiler integrated TensorFlow was {int(speedup)}% faster in summary.\")" ] }, { "cell_type": "markdown", "id": "1d7b4ec1", "metadata": {}, "source": [ "## Clean up\n", "\n", "Stop all training jobs launched if the jobs are still running." ] }, { "cell_type": "code", "execution_count": null, "id": "a87c34ce", "metadata": {}, "outputs": [], "source": [ "import boto3\n", "\n", "sm = boto3.client(\"sagemaker\")\n", "\n", "\n", "def stop_training_job(name):\n", " status = sm.describe_training_job(TrainingJobName=name)[\"TrainingJobStatus\"]\n", " if status == \"InProgress\":\n", " sm.stop_training_job(TrainingJobName=name)\n", "\n", "\n", "stop_training_job(native_estimator.latest_training_job.name)\n", "stop_training_job(optimized_estimator.latest_training_job.name)" ] }, { "cell_type": "markdown", "id": "cf0b0258", "metadata": {}, "source": [ "Also, to find instructions on cleaning up resources, see [Clean Up](https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-cleanup.html) in the *Amazon SageMaker Developer Guide*." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker-training-compiler|huggingface|tensorflow_multiple_gpu_single_node|language-modeling.ipynb)\n" ] } ], "metadata": { "kernelspec": { "display_name": "conda_tensorflow2_p38", "language": "python", "name": "conda_tensorflow2_p38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 5 }