{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Security controls demo notebook\n", "This notebook guides through a demo of different security controls." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Infrastructure walk-through\n", "Take a look to the following components and services created by the deployment:\n", "\n", "- VPC setup\n", "- Subnets - all private\n", "- Network devices (NAT Gateway, Network Firewall) and the route tables\n", "- Deployed security groups (SageMaker and VPC Endpoints security groups) and their ingress and egress rules\n", "- S3 VPC endpoint setup with the endpoint policy\n", "- S3 VPC interface endpoints for AWS public services\n", "- S3 buckets named `---models` and `---data` with the bucket policy. You can try and see that there is no AWS console access to the solution buckets (data and models). When you try to list the bucket content in AWS console you will get `AccessDenied` exception\n", "- AWS Network Firewall routing setup. You might want to read through [Deployment models for AWS Network Firewall](https://aws.amazon.com/blogs/networking-and-content-delivery/deployment-models-for-aws-network-firewall/) to familiarize yourself with different types of AWS Network Firewall deployment\n", "- Firewall policy with a stateful rule group with an allow domain list. There is a single domain `.kaggle.com` on the list\n", "- SageMaker IAM execution role\n", "- KMS CMKs for EBS and S3 bucket encryption" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## S3 access \n", "Create a file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_bucket = '---data'\n", "model_bucket = '/---models'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!touch test-file.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Copy file to `data` S3 bucket. The operation must be successful:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!aws s3 cp test-file.txt s3://{data_bucket}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try to copy the file to any other bucket or list any other bucket. You'll get an `AccessDenied` exception: `An error occurred (AccessDenied) when calling the PutObject operation: Access Denied`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!aws s3 cp test-file.txt s3://" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try to list `-data` or `-models` S3 buckets. The operation must be successful." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ " !aws s3 ls s3://{data_bucket}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try to list any other bucket in our account. You'll get an `AccessDenied` exception: `An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!aws s3 ls s3://" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try to list the `---data` bucket from a command line outside of Studio. You'll get an `AccessDenied` error." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SageMaker Studio has access to the designated S3 buckets (`-models` and `-data`) and to these S3 buckets only. The access to S3 buckets is controlled by a combination of the S3 VPC endpoint policy and the S3 bucket policy.\n", "\n", "❗ Note, you are not able to use [SageMaker JumpStart](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html) or any other SageMaker Studio functionality which requires access to other Amazon S3 buckets. To enable access to other S3 buckets you have to change the S3 VPC endpoint policy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Enable access to JumpStart S3 buckets\n", "Now we are going to change the S3 VPC endpoint policy and to allow access to additional S3 resources.\n", "\n", "First, try to open SageMaker Studio JumpStart:\n", "![SageMaker Studio JumpStart](../design/sm-studio-jumpstart.png)\n", "\n", "The access is denied because the S3 VPC endpoint policy doesn't allow access to any S3 buckets except for `models` and `data` as configured in the endpoint policy:\n", "![S3 access denied](../design/jumpstart-no-access-s3.png)\n", "\n", "Now add the following statement to the S3 VPC endpoint policy:\n", "```json\n", "{\n", " \"Effect\": \"Allow\",\n", " \"Principal\": \"*\",\n", " \"Action\": [\n", " \"s3:GetObject\"\n", " ],\n", " \"Resource\": \"*\",\n", " \"Condition\": {\n", " \"StringEqualsIgnoreCase\": {\n", " \"s3:ExistingObjectTag/SageMaker\": \"true\"\n", " }\n", " }\n", "}\n", "```\n", "\n", "Command line:\n", "```sh\n", "cat <s3-vpce-policy.json\n", "{\n", " \"Effect\": \"Allow\",\n", " \"Principal\": \"*\",\n", " \"Action\": [\n", " \"s3:GetObject\"\n", " ],\n", " \"Resource\": \"*\",\n", " \"Condition\": {\n", " \"StringEqualsIgnoreCase\": {\n", " \"s3:ExistingObjectTag/SageMaker\": \"true\"\n", " }\n", " }\n", "}\n", "EoF\n", "```\n", "\n", "```sh\n", "VPCE_ID=# VPC Id from the stack output\n", "\n", "aws ec2 modify-vpc-endpoint \\\n", " --vpc-endpoint-id $VPCE_ID \\\n", " --policy-document file://s3-vpce-policy.json\n", "```\n", "\n", "Refresh the JumpPage start page - now you have access to all JumpStart resources\n", "\n", "We have seen now, that you can control access to S3 buckets via combination of S3 bucket policy and S3 Endpoint policy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Control internet traffic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This operation succeeds because the domain `.kaggle.com` is allowed by the firewall policy. You can connect to this and only to this domain from any Studio notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!wget https://kaggle.com" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This operation times out after 5 minutes because any internet traffic except to and from the `.kaggle.com` domain isn’t allowed and is dropped by Network Firewall." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!git clone https://github.com/aws-samples/amazon-sagemaker-studio-vpc-networkfirewall.git" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add `.github.com` domain to the allowd domain list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!git clone https://github.com/aws-samples/amazon-sagemaker-studio-vpc-networkfirewall.git" ] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3.9.9 64-bit", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" }, "vscode": { "interpreter": { "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49" } } }, "nbformat": 4, "nbformat_minor": 4 }