{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Automate Model Retraining & Deployment Using the AWS Step Functions Data Science SDK\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "1. [Introduction](#Introduction)\n", "1. [Setup](#Setup)\n", "1. [Create Resources](#Create-Resources)\n", "1. [Build a Machine Learning Workflow](#Build-a-Machine-Learning-Workflow)\n", "1. [Run the Workflow](#Run-the-Workflow)\n", "1. [Clean Up](#Clean-Up)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "This notebook describes how to use the AWS Step Functions Data Science SDK to create a machine learning model retraining workflow. The Step Functions SDK is an open source library that allows data scientists to easily create and execute machine learning workflows using AWS Step Functions and Amazon SageMaker. For more information, please see the following resources:\n", "* [AWS Step Functions](https://aws.amazon.com/step-functions/)\n", "* [AWS Step Functions Developer Guide](https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html)\n", "* [AWS Step Functions Data Science SDK](https://aws-step-functions-data-science-sdk.readthedocs.io)\n", "\n", "In this notebook, we will use the SDK to create steps that capture and transform data using AWS Glue, encorporate this data into the training of a machine learning model, deploy the model to a SageMaker endpoint, link these steps together to create a workflow, and then execute the workflow in AWS Step Functions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "First, we'll need to install and load all the required modules. Then we'll create fine-grained IAM roles for the Lambda, Glue, and Step Functions resources that we will create. The IAM roles grant the services permissions within your AWS environment." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import sys\n", "\n", "!{sys.executable} -m pip install --upgrade stepfunctions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import the Required Modules" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import uuid\n", "import logging\n", "import stepfunctions\n", "import boto3\n", "import sagemaker\n", "\n", "from sagemaker.amazon.amazon_estimator import image_uris\n", "from sagemaker.inputs import TrainingInput\n", "from sagemaker.s3 import S3Uploader\n", "from stepfunctions import steps\n", "from stepfunctions.steps import TrainingStep, ModelStep\n", "from stepfunctions.inputs import ExecutionInput\n", "from stepfunctions.workflow import Workflow\n", "\n", "session = sagemaker.Session()\n", "stepfunctions.set_stream_logger(level=logging.INFO)\n", "\n", "region = boto3.Session().region_name\n", "bucket = session.default_bucket()\n", "id = uuid.uuid4().hex\n", "\n", "# Create a unique name for the AWS Glue job to be created. If you change the\n", "# default name, you may need to change the Step Functions execution role.\n", "job_name = \"glue-customer-churn-etl-{}\".format(id)\n", "\n", "# Create a unique name for the AWS Lambda function to be created. If you change\n", "# the default name, you may need to change the Step Functions execution role.\n", "function_name = \"query-training-status-{}\".format(id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we'll create fine-grained IAM roles for the Lambda, Glue, and Step Functions resources. The IAM roles grant the services permissions within your AWS environment.\n", "\n", "### Add permissions to your notebook role in IAM\n", "\n", "The IAM role assumed by your notebook requires permission to create and run workflows in AWS Step Functions. If this notebook is running on a SageMaker notebook instance, do the following to provide IAM permissions to the notebook:\n", "\n", "1. Open the Amazon [SageMaker console](https://console.aws.amazon.com/sagemaker/). \n", "2. Select **Notebook instances** and choose the name of your notebook instance.\n", "3. Under **Permissions and encryption** select the role ARN to view the role on the IAM console.\n", "4. Copy and save the IAM role ARN for later use. \n", "5. Choose **Attach policies** and search for `AWSStepFunctionsFullAccess`.\n", "6. Select the check box next to `AWSStepFunctionsFullAccess` and choose **Attach policy**.\n", "\n", "We also need to provide permissions that allow the notebook instance the ability to create an AWS Lambda function and AWS Glue job. We will edit the managed policy attached to our role directly to encorporate these specific permissions:\n", "\n", "1. Under **Permisions policies** expand the AmazonSageMaker-ExecutionPolicy-******** policy and choose **Edit policy**.\n", "2. Select **Add additional permissions**. Choose **IAM** for Service and **PassRole** for Actions.\n", "3. Under Resources, choose **Specific**. Select **Add ARN** and enter `query_training_status-role` for **Role name with path*** and choose **Add**. You will create this role later on in this notebook.\n", "4. Select **Add additional permissions** a second time. Choose **Lambda** for Service, **Write** for Access level, and **All resources** for Resources.\n", "5. Select **Add additional permissions** a final time. 
Choose **Glue** for Service, **Write** for Access level, and **All resources** for Resources.\n", "6. Choose **Review policy** and then **Save changes**.\n", "\n", "If you are running this notebook outside of SageMaker, the SDK will use your configured AWS CLI configuration. For more information, see [Configuring the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, let's create an execution role in IAM for Step Functions. \n", "\n", "### Create an Execution Role for Step Functions\n", "\n", "Your Step Functions workflow requires an IAM role to interact with other services in your AWS environment. \n", "\n", "1. Go to the [IAM console](https://console.aws.amazon.com/iam/).\n", "2. Select **Roles** and then **Create role**.\n", "3. Under **Choose the service that will use this role** select **Step Functions**.\n", "4. Choose **Next** until you can enter a **Role name**.\n", "5. Enter a name such as `AmazonSageMaker-StepFunctionsWorkflowExecutionRole` and then select **Create role**.\n", "\n", "Next, create and attach a policy to the role you created. As a best practice, the following steps will attach a policy that only provides access to the specific resources and actions needed for this solution.\n", "\n", "1. Under the **Permissions** tab, click **Attach policies** and then **Create policy**.\n", "2. Enter the following in the **JSON** tab:\n", "\n", "```json\n", "{\n", " \"Version\": \"2012-10-17\",\n", " \"Statement\": [\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Action\": \"iam:PassRole\",\n", " \"Resource\": \"NOTEBOOK_ROLE_ARN\",\n", " \"Condition\": {\n", " \"StringEquals\": {\n", " \"iam:PassedToService\": \"sagemaker.amazonaws.com\"\n", " }\n", " }\n", " },\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\n", " \"sagemaker:CreateModel\",\n", " \"sagemaker:DeleteEndpointConfig\",\n", " \"sagemaker:DescribeTrainingJob\",\n", " \"sagemaker:CreateEndpoint\",\n", " \"sagemaker:StopTrainingJob\",\n", " \"sagemaker:CreateTrainingJob\",\n", " \"sagemaker:UpdateEndpoint\",\n", " \"sagemaker:CreateEndpointConfig\",\n", " \"sagemaker:DeleteEndpoint\"\n", " ],\n", " \"Resource\": [\n", " \"arn:aws:sagemaker:*:*:*\"\n", " ]\n", " },\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\n", " \"events:DescribeRule\",\n", " \"events:PutRule\",\n", " \"events:PutTargets\"\n", " ],\n", " \"Resource\": [\n", " \"arn:aws:events:*:*:rule/StepFunctionsGetEventsForSageMakerTrainingJobsRule\"\n", " ]\n", " },\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\n", " \"lambda:InvokeFunction\"\n", " ],\n", " \"Resource\": [\n", " \"arn:aws:lambda:*:*:function:query-training-status*\"\n", " ]\n", " },\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\n", " \"glue:StartJobRun\",\n", " \"glue:GetJobRun\",\n", " \"glue:BatchStopJobRun\",\n", " \"glue:GetJobRuns\"\n", " ],\n", " \"Resource\": \"arn:aws:glue:*:*:job/glue-customer-churn-etl*\"\n", " }\n", " ]\n", "}\n", "```\n", "\n", "3. Replace **NOTEBOOK_ROLE_ARN** with the ARN for your notebook that you created in the previous step.\n", "4. Choose **Review policy** and give the policy a name such as `AmazonSageMaker-StepFunctionsWorkflowExecutionPolicy`.\n", "5. Choose **Create policy**.\n", "6. Select **Roles** and search for your `AmazonSageMaker-StepFunctionsWorkflowExecutionRole` role.\n", "7. Under the **Permissions** tab, click **Attach policies**.\n", "8. 
Search for your newly created `AmazonSageMaker-StepFunctionsWorkflowExecutionPolicy` policy and select the check box next to it.\n", "9. Choose **Attach policy**. You will then be redirected to the details page for the role.\n", "10. Copy the AmazonSageMaker-StepFunctionsWorkflowExecutionRole **Role ARN** at the top of the Summary." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure Execution Roles" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# paste the AmazonSageMaker-StepFunctionsWorkflowExecutionRole ARN from above\n", "workflow_execution_role = \"\"\n", "\n", "# SageMaker Execution Role\n", "# You can use sagemaker.get_execution_role() if running inside sagemaker's notebook instance\n", "sagemaker_execution_role = (\n", " sagemaker.get_execution_role()\n", ") # Replace with ARN if not in an AWS SageMaker notebook" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Create a Glue IAM Role\n", "You need to create an IAM role so that you can create and execute an AWS Glue Job on your data in Amazon S3.\n", "\n", "1. Go to the [IAM console](https://console.aws.amazon.com/iam/).\n", "2. Select **Roles** and then **Create role**.\n", "3. Under **Choose the service that will use this role** select **Glue**.\n", "4. Choose **Next** until you can enter a **Role name**.\n", "5. Enter a name such as `AWS-Glue-S3-Bucket-Access` and then select **Create role**.\n", "\n", "Next, create and attach a policy to the role you created. The following steps attach a managed policy that provides Glue access to the specific S3 bucket holding your data.\n", "\n", "1. Under the **Permissions** tab, click **Attach policies** and then **Create policy**.\n", "2. Enter the following in the **JSON** tab:\n", "\n", "```json\n", "{\n", " \"Version\": \"2012-10-17\",\n", " \"Statement\": [\n", " {\n", " \"Sid\": \"ListObjectsInBucket\",\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\"s3:ListBucket\"],\n", " \"Resource\": [\"arn:aws:s3:::BUCKET-NAME\"]\n", " },\n", " {\n", " \"Sid\": \"AllObjectActions\",\n", " \"Effect\": \"Allow\",\n", " \"Action\": \"s3:*Object\",\n", " \"Resource\": [\"arn:aws:s3:::BUCKET-NAME/*\"]\n", " }\n", " ]\n", "}\n", "```\n", "\n", "3. Run the next cell (below) to retrieve the specific **S3 bucket name** that we will grant permissions to." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "session = sagemaker.Session()\n", "bucket = session.default_bucket()\n", "print(bucket)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. Copy the output of the above cell and replace the **two occurrences** of **BUCKET-NAME** in the JSON text that you entered.\n", "5. Choose **Review policy** and give the policy a name such as `S3BucketAccessPolicy`.\n", "6. Choose **Create policy**.\n", "7. Select **Roles**, then search for and select your `AWS-Glue-S3-Bucket-Access` role.\n", "8. Under the **Permissions** tab, click **Attach policies**.\n", "9. Search for your newly created `S3BucketAccessPolicy` policy and select the check box next to it.\n", "10. Choose **Attach policy**. You will then be redirected to the details page for the role.\n", "11. Copy the **Role ARN** at the top of the Summary tab."
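] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you prefer not to copy the ARN from the console, you can also look it up programmatically. A minimal sketch, assuming you used the suggested role name `AWS-Glue-S3-Bucket-Access` (adjust the name if you chose a different one):\n", "\n", "```python\n", "import boto3\n", "\n", "iam_client = boto3.client(\"iam\")\n", "\n", "# Look up the ARN of the Glue role created in the console steps above.\n", "# The role name here is an assumption based on the suggested name.\n", "glue_role_arn = iam_client.get_role(RoleName=\"AWS-Glue-S3-Bucket-Access\")[\"Role\"][\"Arn\"]\n", "print(glue_role_arn)\n", "```\n", "\n", "Either way, paste the role ARN into the next cell."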
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# paste the AWS-Glue-S3-Bucket-Access role ARN from above\n", "glue_role = \"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Create a Lambda IAM Role\n", "You also need to create an IAM role so that you can create and execute an AWS Lambda function stored in Amazon S3.\n", "\n", "1. Go to the [IAM console](https://console.aws.amazon.com/iam/).\n", "2. Select **Roles** and then **Create role**.\n", "3. Under **Choose the service that will use this role** select **Lambda**.\n", "4. Choose **Next** until you can enter a **Role name**.\n", "5. Enter a name such as `query_training_status-role` and then select **Create role**.\n", "\n", "Next, attach policies to the role you created. The following steps attach policies that provides Lambda access to S3 and read-only access to SageMaker.\n", "\n", "1. Under the **Permissions** tab, click **Attach Policies**.\n", "2. In the search box, type **SageMaker** and select **AmazonSageMakerReadOnly** from the populated list.\n", "3. In the search box type **AWSLambda** and select **AWSLambdaBasicExecutionRole** from the populated list.\n", "4. Choose **Attach policy**. You will then be redirected to the details page for the role.\n", "5. Copy the **Role ARN** at the top of the **Summary**.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# paste the query_training_status-role role ARN from above\n", "lambda_role = \"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prepare the Dataset\n", "This notebook uses the XGBoost algorithm to automate the classification of unhappy customers for telecommunication service providers. The goal is to identify customers who may cancel their service soon so that you can entice them to stay. This is known as customer churn prediction.\n", "\n", "The dataset we use is publicly available and was mentioned in the book [Discovering Knowledge in Data](https://www.amazon.com/dp/0470908742/) by Daniel T. Larose. It is attributed by the author to the University of California Irvine Repository of Machine Learning Datasets." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "project_name = \"ml_deploy\"\n", "\n", "data_source = S3Uploader.upload(\n", " local_path=\"./data/customer-churn.csv\",\n", " desired_s3_uri=\"s3://{}/{}\".format(bucket, project_name),\n", " sagemaker_session=session,\n", ")\n", "\n", "train_prefix = \"train\"\n", "val_prefix = \"validation\"\n", "\n", "train_data = \"s3://{}/{}/{}/\".format(bucket, project_name, train_prefix)\n", "validation_data = \"s3://{}/{}/{}/\".format(bucket, project_name, val_prefix)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create Resources\n", "In the following steps we'll create the Glue job and Lambda function that are called from the Step Functions workflow." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create the AWS Glue Job" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "glue_script_location = S3Uploader.upload(\n", " local_path=\"./code/glue_etl.py\",\n", " desired_s3_uri=\"s3://{}/{}\".format(bucket, project_name),\n", " sagemaker_session=session,\n", ")\n", "glue_client = boto3.client(\"glue\")\n", "\n", "response = glue_client.create_job(\n", " Name=job_name,\n", " Description=\"PySpark job to extract the data and split in to training and validation data sets\",\n", " Role=glue_role, # you can pass your existing AWS Glue role here if you have used Glue before\n", " ExecutionProperty={\"MaxConcurrentRuns\": 2},\n", " Command={\"Name\": \"glueetl\", \"ScriptLocation\": glue_script_location, \"PythonVersion\": \"3\"},\n", " DefaultArguments={\"--job-language\": \"python\"},\n", " GlueVersion=\"1.0\",\n", " WorkerType=\"Standard\",\n", " NumberOfWorkers=2,\n", " Timeout=60,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create the AWS Lambda Function" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import zipfile\n", "\n", "zip_name = \"query_training_status.zip\"\n", "lambda_source_code = \"./code/query_training_status.py\"\n", "\n", "zf = zipfile.ZipFile(zip_name, mode=\"w\")\n", "zf.write(lambda_source_code, arcname=lambda_source_code.split(\"/\")[-1])\n", "zf.close()\n", "\n", "\n", "S3Uploader.upload(\n", " local_path=zip_name,\n", " desired_s3_uri=\"s3://{}/{}\".format(bucket, project_name),\n", " sagemaker_session=session,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "lambda_client = boto3.client(\"lambda\")\n", "\n", "response = lambda_client.create_function(\n", " FunctionName=function_name,\n", " Runtime=\"python3.7\",\n", " Role=lambda_role,\n", " Handler=\"query_training_status.lambda_handler\",\n", " Code={\"S3Bucket\": bucket, \"S3Key\": \"{}/{}\".format(project_name, zip_name)},\n", " Description=\"Queries a SageMaker training job and return the results.\",\n", " Timeout=15,\n", " MemorySize=128,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure the AWS SageMaker Estimator" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "container = sagemaker.image_uris.retrieve(\"xgboost\", region, \"1.2-1\")\n", "\n", "xgb = sagemaker.estimator.Estimator(\n", " container,\n", " sagemaker_execution_role,\n", " train_instance_count=1,\n", " train_instance_type=\"ml.m4.xlarge\",\n", " output_path=\"s3://{}/{}/output\".format(bucket, project_name),\n", ")\n", "\n", "xgb.set_hyperparameters(\n", " max_depth=5,\n", " eta=0.2,\n", " gamma=4,\n", " min_child_weight=6,\n", " subsample=0.8,\n", " objective=\"binary:logistic\",\n", " eval_metric=\"error\",\n", " num_round=100,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Build a Machine Learning Workflow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use a state machine workflow to create a model retraining pipeline. The AWS Data Science Workflows SDK provides several AWS SageMaker workflow steps that you can use to construct an ML pipeline. 
In this tutorial you will create the following steps:\n", "\n", "* [**ETLStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/compute.html#stepfunctions.steps.compute.GlueStartJobRunStep) - Starts an AWS Glue job to extract the latest data from our source database and prepare our data.\n", "* [**TrainingStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.TrainingStep) - Creates the training step and passes the defined estimator.\n", "* [**ModelStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.ModelStep) - Creates a model in SageMaker using the artifacts created during the TrainingStep.\n", "* [**LambdaStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/compute.html#stepfunctions.steps.compute.LambdaStep) - Creates the task state step within our workflow that calls a Lambda function.\n", "* [**ChoiceStateStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/states.html#stepfunctions.steps.states.Choice) - Creates the choice state step within our workflow.\n", "* [**EndpointConfigStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.EndpointConfigStep) - Creates the endpoint config step to define the new configuration for our endpoint.\n", "* [**EndpointStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.EndpointStep) - Creates the endpoint step to update our model endpoint.\n", "* [**FailStateStep**](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/states.html#stepfunctions.steps.states.Fail) - Creates a fail state step within our workflow." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# SageMaker expects unique names for each job, model and endpoint.\n", "# If these names are not unique the execution will fail.\n", "execution_input = ExecutionInput(\n", " schema={\n", " \"TrainingJobName\": str,\n", " \"GlueJobName\": str,\n", " \"ModelName\": str,\n", " \"EndpointName\": str,\n", " \"LambdaFunctionName\": str,\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an ETL step with AWS Glue\n", "In the following cell, we create a Glue step that runs an AWS Glue job. The Glue job extracts the latest data from our source database, removes unnecessary columns, splits the data into training and validation sets, and saves the data to CSV format in S3. Glue performs this extract, transform, and load (ETL) work in a serverless fashion, so there are no compute resources to configure and manage. See the [GlueStartJobRunStep](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/compute.html#stepfunctions.steps.compute.GlueStartJobRunStep) Compute step in the AWS Step Functions Data Science SDK documentation."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "etl_step = steps.GlueStartJobRunStep(\n", " \"Extract, Transform, Load\",\n", " parameters={\n", " \"JobName\": execution_input[\"GlueJobName\"],\n", " \"Arguments\": {\n", " \"--S3_SOURCE\": data_source,\n", " \"--S3_DEST\": \"s3a://{}/{}/\".format(bucket, project_name),\n", " \"--TRAIN_KEY\": train_prefix + \"/\",\n", " \"--VAL_KEY\": val_prefix + \"/\",\n", " },\n", " },\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a SageMaker Training Step \n", "\n", "In the following cell, we create the training step and pass the estimator we defined above. See [TrainingStep](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.TrainingStep) in the AWS Step Functions Data Science SDK documentation to learn more." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "training_step = steps.TrainingStep(\n", " \"Model Training\",\n", " estimator=xgb,\n", " data={\n", " \"train\": TrainingInput(train_data, content_type=\"text/csv\"),\n", " \"validation\": TrainingInput(validation_data, content_type=\"text/csv\"),\n", " },\n", " job_name=execution_input[\"TrainingJobName\"],\n", " wait_for_completion=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a Model Step \n", "\n", "In the following cell, we define a model step that will create a model in Amazon SageMaker using the artifacts created during the TrainingStep. See [ModelStep](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.ModelStep) in the AWS Step Functions Data Science SDK documentation to learn more.\n", "\n", "The model creation step typically follows the training step. The Step Functions SDK provides the [get_expected_model](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.TrainingStep.get_expected_model) method in the TrainingStep class to provide a reference for the trained model artifacts. Please note that this method is only useful when the ModelStep directly follows the TrainingStep." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "model_step = steps.ModelStep(\n", " \"Save Model\",\n", " model=training_step.get_expected_model(),\n", " model_name=execution_input[\"ModelName\"],\n", " result_path=\"$.ModelStepResults\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a Lambda Step\n", "In the following cell, we define a lambda step that will invoke the previously created lambda function as part of our Step Function workflow. See [LambdaStep](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/compute.html#stepfunctions.steps.compute.LambdaStep) in the AWS Step Functions Data Science SDK documentation to learn more." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "lambda_step = steps.compute.LambdaStep(\n", " \"Query Training Results\",\n", " parameters={\n", " \"FunctionName\": execution_input[\"LambdaFunctionName\"],\n", " \"Payload\": {\"TrainingJobName.$\": \"$.TrainingJobName\"},\n", " },\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a Choice State Step \n", "In the following cell, we create a choice step in order to build a dynamic workflow. This choice step branches based off of the results of our SageMaker training step: did the training job fail or should the model be saved and the endpoint be updated? We will add specfic rules to this choice step later on in section 8 of this notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "check_accuracy_step = steps.states.Choice(\"Accuracy > 90%\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an Endpoint Configuration Step\n", "In the following cell we create an endpoint configuration step. See [EndpointConfigStep](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.sagemaker.EndpointConfigStep) in the AWS Step Functions Data Science SDK documentation to learn more." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "endpoint_config_step = steps.EndpointConfigStep(\n", " \"Create Model Endpoint Config\",\n", " endpoint_config_name=execution_input[\"ModelName\"],\n", " model_name=execution_input[\"ModelName\"],\n", " initial_instance_count=1,\n", " instance_type=\"ml.m4.xlarge\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Update the Model Endpoint Step\n", "In the following cell, we create the Endpoint step to deploy the new model as a managed API endpoint, updating an existing SageMaker endpoint if our choice state is sucessful." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "endpoint_step = steps.EndpointStep(\n", " \"Update Model Endpoint\",\n", " endpoint_name=execution_input[\"EndpointName\"],\n", " endpoint_config_name=execution_input[\"ModelName\"],\n", " update=False,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create the Fail State Step\n", "In addition, we create a Fail step which proceeds from our choice state if the validation accuracy of our model is lower than the threshold we define. See [FailStateStep](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/states.html#stepfunctions.steps.states.Fail) in the AWS Step Functions Data Science SDK documentation to learn more." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "fail_step = steps.states.Fail(\n", " \"Model Accuracy Too Low\", comment=\"Validation accuracy lower than threshold\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add Rules to Choice State\n", "In the cells below, we add a threshold rule to our choice state. Therefore, if the validation accuracy of our model is below 0.90, we move to the Fail State. If the validation accuracy of our model is above 0.90, we move to the save model step with proceeding endpoint update. 
See [here](https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst) for more information on how XGBoost calculates classification error.\n", "\n", "For binary classification problems, the XGBoost algorithm defines the model error as:\n", "\n", "\begin{equation*}\n", "\frac{incorrect\:predictions}{total\:number\:of\:predictions}\n", "\end{equation*}\n", "\n", "To achieve an accuracy of 90%, the validation error must be less than 0.10." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "threshold_rule = steps.choice_rule.ChoiceRule.NumericLessThan(\n", " variable=lambda_step.output()[\"Payload\"][\"trainingMetrics\"][0][\"Value\"], value=0.1\n", ")\n", "\n", "check_accuracy_step.add_choice(rule=threshold_rule, next_step=endpoint_config_step)\n", "check_accuracy_step.default_choice(next_step=fail_step)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Link all the Steps Together\n", "Finally, create your workflow definition by chaining together all of the steps that we've created. See [Chain](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/sagemaker.html#stepfunctions.steps.states.Chain) in the AWS Step Functions Data Science SDK documentation to learn more." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "endpoint_config_step.next(endpoint_step)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "workflow_definition = steps.Chain(\n", " [etl_step, training_step, model_step, lambda_step, check_accuracy_step]\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run the Workflow\n", "Create your workflow using the workflow definition above, and render the graph with [render_graph](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/workflow.html#stepfunctions.workflow.Workflow.render_graph):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "workflow = Workflow(\n", " name=\"MyInferenceRoutine_{}\".format(id),\n", " definition=workflow_definition,\n", " role=workflow_execution_role,\n", " execution_input=execution_input,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "workflow.render_graph()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create the workflow in AWS Step Functions with [create](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/workflow.html#stepfunctions.workflow.Workflow.create):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "workflow.create()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the workflow with [execute](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/workflow.html#stepfunctions.workflow.Workflow.execute):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "execution = workflow.execute(\n", " inputs={\n", " \"TrainingJobName\": \"regression-{}\".format(id), # Each SageMaker training job requires a unique name\n", " \"GlueJobName\": job_name,\n", " \"ModelName\": \"CustomerChurn-{}\".format(id), # Each model requires a unique name\n", " \"EndpointName\": \"CustomerChurn\", # Each endpoint requires a unique name\n", " \"LambdaFunctionName\": function_name,\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "Render workflow progress with the [render_progress](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/workflow.html#stepfunctions.workflow.Execution.render_progress). This generates a snapshot of the current state of your workflow as it executes. This is a static image therefore you must run the cell again to check progress:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "execution.render_progress()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use [list_events](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/workflow.html#stepfunctions.workflow.Execution.list_events) to list all events in the workflow execution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "execution.list_events()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use [list_executions](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/workflow.html#stepfunctions.workflow.Workflow.list_executions) to list all executions for a specific workflow:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "workflow.list_executions(html=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use [list_workflows](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/workflow.html#stepfunctions.workflow.Workflow.list_workflows) to list all workflows in your AWS account:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "Workflow.list_workflows(html=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Clean Up\n", "When you are done, make sure to clean up your AWS account by deleting resources you won't be reusing. Uncomment the code below and run the cell to delete the Glue job, Lambda function, and Step Function." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "# lambda_client.delete_function(FunctionName=function_name)\n", "# glue_client.delete_job(JobName=job_name)\n", "# workflow.delete()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This us-west-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/step-functions-data-science-sdk|automate_model_retraining_workflow|automate_model_retraining_workflow.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 
0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, 
"name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": 
false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 4 }