{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Ingest data with Athena\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook demonstrates how to set up a database with Athena and query data with it.\n", "\n", "Amazon Athena is a serverless interactive query service that makes it easy to analyze your S3 data with standard SQL. It uses S3 as its underlying data store, and uses Presto with ANSI SQL support, and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet. Athena is ideal for quick, ad-hoc querying but it can also handle complex analysis, including large joins, window functions, and arrays. \n", "\n", "To get started, you can point to your data in Amazon S3, define the schema, and start querying using the built-in query editor. Amazon Athena allows you to tap into all your data in S3 without the need to set up complex processes to extract, transform, and load the data (ETL).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set up Athena\n", "First, we are going to make sure we have the necessary policies attached to the role that we used to create this notebook to access Athena. You can do this through an IAM client as shown below, or through the AWS console. \n", "\n", "**Note: You would need IAMFullAccess to attach policies to the role.**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Attach IAMFullAccess Policy from Console\n", "\n", "**1.** Go to **SageMaker Console**, choose **Notebook instances** in the navigation panel, then select your notebook instance to view the details. Then under **Permissions and Encryption**, click on the **IAM role ARN** link and it will take you to your role summary in the **IAM Console**. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**2.** Click on **Create Policy** under **Permissions**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**3.** In the **Attach Permissions** page, search for **IAMFullAccess**. It will show up in the policy search results if it has not been attached to your role yet. Select the checkbox for the **IAMFullAccess** Policy, then click **Attach Policy**. You now have the policy successfully attached to your role." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install -qU 'sagemaker>=2.15.0' 'PyAthena==1.10.7' 'awswrangler==2.14.0'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import io\n", "import boto3\n", "import sagemaker\n", "import json\n", "from sagemaker import get_execution_role\n", "import os\n", "import sys\n", "from sklearn.datasets import fetch_california_housing\n", "import pandas as pd\n", "from botocore.exceptions import ClientError\n", "\n", "# Get region\n", "session = boto3.session.Session()\n", "region_name = session.region_name\n", "\n", "# Get SageMaker session & default S3 bucket\n", "sagemaker_session = sagemaker.Session()\n", "bucket = sagemaker_session.default_bucket() # replace with your own bucket name if you have one\n", "iam = boto3.client(\"iam\")\n", "s3 = sagemaker_session.boto_session.resource(\"s3\")\n", "role = sagemaker.get_execution_role()\n", "role_name = role.split(\"/\")[-1]\n", "prefix = \"data/tabular/california_housing\"\n", "filename = \"california_housing.csv\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download data from online resources and write data to S3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example uses the California Housing dataset, which was originally published in:\n", "\n", "> Pace, R. Kelley, and Ronald Barry. \"Sparse spatial autoregressions.\" Statistics & Probability Letters 33.3 (1997): 291-297." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# helper functions to upload data to s3\n", "def write_to_s3(filename, bucket, prefix):\n", " # put one file in a separate folder. This is helpful if you read and prepare data with Athena\n", " filename_key = filename.split(\".\")[0]\n", " key = \"{}/{}/{}\".format(prefix, filename_key, filename)\n", " return s3.Bucket(bucket).upload_file(filename, key)\n", "\n", "\n", "def upload_to_s3(bucket, prefix, filename):\n", " url = \"s3://{}/{}/{}\".format(bucket, prefix, filename)\n", " print(\"Writing to {}\".format(url))\n", " write_to_s3(filename, bucket, prefix)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tabular_data = fetch_california_housing()\n", "tabular_data_full = pd.DataFrame(tabular_data.data, columns=tabular_data.feature_names)\n", "tabular_data_full[\"target\"] = pd.DataFrame(tabular_data.target)\n", "tabular_data_full.to_csv(\"california_housing.csv\", index=False)\n", "\n", "upload_to_s3(bucket, \"data/tabular\", filename)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up IAM roles and policies" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you run the following command, you will see an error that you cannot list policies if `IAMFullAccess` policy is not attached to your role. Please follow the steps above to attach the IAMFullAccess policy to your role if you see an error." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# check if IAM policy is attached\n", "try:\n", " existing_policies = iam.list_attached_role_policies(RoleName=role_name)[\"AttachedPolicies\"]\n", " if \"IAMFullAccess\" not in [po[\"PolicyName\"] for po in existing_policies]:\n", " print(\n", " \"ERROR: You need to attach the IAMFullAccess policy in order to attach policy to the role\"\n", " )\n", " else:\n", " print(\"IAMFullAccessPolicy Already Attached\")\n", "except ClientError as e:\n", " if e.response[\"Error\"][\"Code\"] == \"AccessDenied\":\n", " print(\n", " \"ERROR: You need to attach the IAMFullAccess policy in order to attach policy to the role.\"\n", " )\n", " else:\n", " print(\"Unexpected error: %s\" % e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Policy Document\n", "We will create policies we used to access S3 and Athena. The two policies we will create here are: \n", "* S3FullAccess: `arn:aws:iam::aws:policy/AmazonS3FullAccess`\n", "* AthenaFullAccess: `arn:aws:iam::aws:policy/AmazonAthenaFullAccess`\n", "\n", "You can check the policy document in the IAM console and copy the policy file here." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "athena_access_role_policy_doc = {\n", " \"Version\": \"2012-10-17\",\n", " \"Statement\": [\n", " {\"Effect\": \"Allow\", \"Action\": [\"athena:*\"], \"Resource\": [\"*\"]},\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\n", " \"glue:CreateDatabase\",\n", " \"glue:DeleteDatabase\",\n", " \"glue:GetDatabase\",\n", " \"glue:GetDatabases\",\n", " \"glue:UpdateDatabase\",\n", " \"glue:CreateTable\",\n", " \"glue:DeleteTable\",\n", " \"glue:BatchDeleteTable\",\n", " \"glue:UpdateTable\",\n", " \"glue:GetTable\",\n", " \"glue:GetTables\",\n", " \"glue:BatchCreatePartition\",\n", " \"glue:CreatePartition\",\n", " \"glue:DeletePartition\",\n", " \"glue:BatchDeletePartition\",\n", " \"glue:UpdatePartition\",\n", " \"glue:GetPartition\",\n", " \"glue:GetPartitions\",\n", " \"glue:BatchGetPartition\",\n", " ],\n", " \"Resource\": [\"*\"],\n", " },\n", " {\"Effect\": \"Allow\", \"Action\": [\"lakeformation:GetDataAccess\"], \"Resource\": [\"*\"]},\n", " ],\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create IAM client\n", "iam = boto3.client(\"iam\")\n", "# create a policy\n", "try:\n", " response = iam.create_policy(\n", " PolicyName=\"myAthenaPolicy\", PolicyDocument=json.dumps(athena_access_role_policy_doc)\n", " )\n", "except ClientError as e:\n", " if e.response[\"Error\"][\"Code\"] == \"EntityAlreadyExists\":\n", " print(\"Policy already created.\")\n", " else:\n", " print(\"Unexpected error: %s\" % e)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# get policy ARN\n", "sts = boto3.client(\"sts\")\n", "account_id = sts.get_caller_identity()[\"Account\"]\n", "policy_athena_arn = f\"arn:aws:iam::{account_id}:policy/myAthenaPolicy\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Attach Policy to Role" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Attach a role policy\n", "try:\n", " response = iam.attach_role_policy(PolicyArn=policy_athena_arn, RoleName=role_name)\n", "except ClientError as e:\n", " if e.response[\"Error\"][\"Code\"] == \"EntityAlreadyExists\":\n", " print(\"Policy is already attached to your role.\")\n", " else:\n", " print(\"Unexpected error: %s\" % e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Intro to PyAthena \n", "\n", "We are going to leverage [PyAthena](https://pypi.org/project/PyAthena/) to connect and run Athena queries. PyAthena is a Python DB API 2.0 (PEP 249) compliant client for Amazon Athena. **Note that you will need to specify the region in which you created the database/table in Athena, making sure your catalog in the specified region that contains the database.** " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pyathena import connect\n", "from pyathena.pandas_cursor import PandasCursor\n", "from pyathena.util import as_pandas" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set Athena database name\n", "database_name = \"tabular_california_housing\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set S3 staging directory -- this is a temporary directory used for Athena queries\n", "s3_staging_dir = \"s3://{0}/athena/staging\".format(bucket)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# write the SQL statement to execute\n", "statement = \"CREATE DATABASE IF NOT EXISTS {}\".format(database_name)\n", "print(statement)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# connect to s3 using PyAthena\n", "cursor = connect(region_name=region_name, s3_staging_dir=s3_staging_dir).cursor()\n", "cursor.execute(statement)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Register Table with Athena \n", "When you run a CREATE TABLE query in Athena, you register your table with the AWS Glue Data Catalog. \n", "\n", "To specify the path to your data in Amazon S3, use the LOCATION property, as shown in the following example: `LOCATION s3://bucketname/folder/`\n", "\n", "The LOCATION in Amazon S3 specifies all of the files representing your table. Athena reads all data stored in `s3://bucketname/folder/`. If you have data that you do not want Athena to read, do not store that data in the same Amazon S3 folder as the data you want Athena to read. If you are leveraging partitioning, to ensure Athena scans data within a partition, your WHERE filter must include the partition. For more information, see [Table Location and Partitions](https://docs.aws.amazon.com/athena/latest/ug/tables-location-format.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "prefix = \"data/tabular\"\n", "filename_key = \"california_housing\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_s3_location = \"s3://{}/{}/{}/\".format(bucket, prefix, filename_key)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "table_name_csv = \"california_housing_athena\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# SQL statement to execute\n", "\n", "statement = \"\"\"CREATE EXTERNAL TABLE IF NOT EXISTS {}.{}(\n", " MedInc double,\n", " HouseAge double,\n", " AveRooms double,\n", " AveBedrms double,\n", " Population double,\n", " AveOccup double,\n", " Latitude double,\n", " Longitude double, \n", " MedValue double\n", "\n", ") ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\\\n' LOCATION '{}'\n", "TBLPROPERTIES ('skip.header.line.count'='1')\"\"\".format(\n", " database_name, table_name_csv, data_s3_location\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Execute statement using connection cursor\n", "cursor = connect(region_name=region_name, s3_staging_dir=s3_staging_dir).cursor()\n", "cursor.execute(statement)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# verify the table has been created\n", "statement = \"SHOW TABLES in {}\".format(database_name)\n", "cursor.execute(statement)\n", "\n", "df_show = as_pandas(cursor)\n", "df_show.head(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# run a sample query\n", "statement = \"\"\"SELECT * FROM {}.{}\n", "LIMIT 100\"\"\".format(\n", " database_name, table_name_csv\n", ")\n", "# Execute statement using connection cursor\n", "cursor = connect(region_name=region_name, s3_staging_dir=s3_staging_dir).cursor()\n", "cursor.execute(statement)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = as_pandas(cursor)\n", "df.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alternatives: Use AWS Data Wrangler to query data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import awswrangler as wr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Glue Catalog" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for table in wr.catalog.get_tables(database=database_name):\n", " print(table[\"Name\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Athena" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "df = wr.athena.read_sql_query(\n", " sql=\"SELECT * FROM {} LIMIT 100\".format(table_name_csv), database=database_name\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Citation\n", "\n", "Data Science On AWS workshops, Chris Fregly, Antje Barth, https://www.datascienceonaws.com/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/ingest_data|ingest-with-aws-services|ingest_data_with_Athena.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 4 }