AWSTemplateFormatVersion: '2010-09-09' Description: An example service that deploys an Jupyter Lab notebook with AWS Neuron support for machine learning. Parameters: VpcId: Type: String Description: The VPC that the service is running inside of PublicSubnetIds: Type: List Description: List of public facing subnets PrivateSubnetIds: Type: List Description: List of private subnets ClusterName: Type: String Description: The name of the ECS cluster into which to launch capacity. ECSTaskExecutionRole: Type: String Description: The role used to start up an ECS task CapacityProvider: Type: String Description: The cluster capacity provider that the service should use to request capacity when it wants to start up a task ServiceName: Type: String Default: jupyter Description: A name for the service ImageUrl: Type: String Description: The URL of a Juypter notebook container to run ContainerCpu: Type: Number Default: 10240 Description: How much CPU to give the container. 1024 is 1 CPU ContainerMemory: Type: Number Default: 32768 Description: How much memory in megabytes to give the container MyIp: Type: String Default: 0.0.0.0/0 Description: The IP addresses that you want to accept traffic from. Default accepts traffic from anywhere on the internet. Resources: # The task definition. This is a simple metadata description of what # container to run, and what resource requirements it has. TaskDefinition: Type: AWS::ECS::TaskDefinition DependsOn: - TaskAccessToJupyterToken Properties: Family: !Ref ServiceName NetworkMode: awsvpc RequiresCompatibilities: - EC2 ExecutionRoleArn: !Ref ECSTaskExecutionRole ContainerDefinitions: - Name: jupyter Cpu: !Ref ContainerCpu MemoryReservation: !Ref ContainerMemory Image: !Ref ImageUrl EntryPoint: - '/bin/sh' - '-c' Command: - '/opt/conda/bin/jupyter-lab --allow-root --ServerApp.token=${JUPYTER_TOKEN} --ServerApp.ip=*' Secrets: - Name: JUPYTER_TOKEN ValueFrom: !Ref JupyterToken PortMappings: - ContainerPort: 8888 HostPort: 8888 MountPoints: - SourceVolume: efs-volume ContainerPath: /home LogConfiguration: LogDriver: 'awslogs' Options: awslogs-group: !Ref LogGroup awslogs-region: !Ref AWS::Region awslogs-stream-prefix: !Ref ServiceName LinuxParameters: Devices: # Ensure that the AWS Neuron SDK inside the container # can access the underlying host device provided by AWS Inferentia - ContainerPath: /dev/neuron0 HostPath: /dev/neuron0 Permissions: - read - write Capabilities: Add: - "IPC_LOCK" Volumes: - Name: efs-volume EFSVolumeConfiguration: FilesystemId: !Ref EFSFileSystem RootDirectory: / TransitEncryption: ENABLED # The secret token used to protect the Jupyter notebook JupyterToken: Type: AWS::SecretsManager::Secret Properties: GenerateSecretString: PasswordLength: 30 ExcludePunctuation: true # Attach a policy to the task execution role, which grants # the ECS agent the ability to fetch the Jupyter notebook # secret token on behalf of the task. TaskAccessToJupyterToken: Type: AWS::IAM::Policy Properties: Roles: - !Ref ECSTaskExecutionRole PolicyName: AccessJupyterToken PolicyDocument: Version: '2012-10-17' Statement: - Effect: Allow Action: - secretsmanager:DescribeSecret - secretsmanager:GetSecretValue Resource: !Ref JupyterToken # Attach a policy to the task execution role which allows the # task to access the EFS filesystem that provides durable storage # for the task. TaskAccessToFilesystem: Type: AWS::IAM::Policy Properties: Roles: - !Ref ECSTaskExecutionRole PolicyName: EFSAccess PolicyDocument: Version: '2012-10-17' Statement: - Effect: Allow Action: - elasticfilesystem:ClientMount - elasticfilesystem:ClientWrite - elasticfilesystem:DescribeMountTargets - elasticfilesystem:DescribeFileSystems Resource: !GetAtt EFSFileSystem.Arn # The service. The service is a resource which allows you to run multiple # copies of a type of task, and gather up their logs and metrics, as well # as monitor the number of running tasks and replace any that have crashed Service: Type: AWS::ECS::Service DependsOn: PublicLoadBalancerListener Properties: ServiceName: !Ref ServiceName Cluster: !Ref ClusterName PlacementStrategies: - Field: attribute:ecs.availability-zone Type: spread - Field: cpu Type: binpack CapacityProviderStrategy: - Base: 0 CapacityProvider: !Ref CapacityProvider Weight: 1 DeploymentConfiguration: MaximumPercent: 200 MinimumHealthyPercent: 75 DesiredCount: 1 NetworkConfiguration: AwsvpcConfiguration: SecurityGroups: - !Ref ServiceSecurityGroup Subnets: - !Select [ 0, !Ref PrivateSubnetIds ] - !Select [ 1, !Ref PrivateSubnetIds ] TaskDefinition: !Ref TaskDefinition LoadBalancers: - ContainerName: jupyter ContainerPort: 8888 TargetGroupArn: !Ref JupyterTargetGroup # Because we are launching tasks in AWS VPC networking mode # the tasks themselves also have an extra security group that is unique # to them. This is a unique security group just for this service, # to control which things it can talk to, and who can talk to it ServiceSecurityGroup: Type: AWS::EC2::SecurityGroup Properties: GroupDescription: !Sub Access to service ${ServiceName} VpcId: !Ref VpcId # This log group stores the stdout logs from this service's containers LogGroup: Type: AWS::Logs::LogGroup # Keeps track of the list of tasks running on EC2 instances JupyterTargetGroup: Type: AWS::ElasticLoadBalancingV2::TargetGroup Properties: HealthCheckIntervalSeconds: 6 HealthCheckPath: /api HealthCheckProtocol: HTTP HealthCheckTimeoutSeconds: 5 HealthyThresholdCount: 2 TargetType: ip Port: 8888 Protocol: HTTP UnhealthyThresholdCount: 10 VpcId: !Ref VpcId TargetGroupAttributes: - Key: deregistration_delay.timeout_seconds Value: 0 # A public facing load balancer, this is used as ingress for # public facing internet traffic. The traffic is forwarded # down to the Juypter notebook where ever it is currently hosted # on whichever machine Amazon ECS placed it on. PublicLoadBalancerSG: Type: AWS::EC2::SecurityGroup Properties: GroupDescription: Access to the public facing load balancer VpcId: !Ref VpcId SecurityGroupIngress: # Allow access to ALB from the specified IP address - CidrIp: !Ref MyIp IpProtocol: -1 PublicLoadBalancer: Type: AWS::ElasticLoadBalancingV2::LoadBalancer Properties: Scheme: internet-facing LoadBalancerAttributes: - Key: idle_timeout.timeout_seconds Value: '30' Subnets: # The load balancer is placed into the public subnets, so that traffic # from the internet can reach the load balancer directly via the internet gateway - !Select [ 0, !Ref PublicSubnetIds ] - !Select [ 1, !Ref PublicSubnetIds ] SecurityGroups: - !Ref PublicLoadBalancerSG PublicLoadBalancerListener: Type: AWS::ElasticLoadBalancingV2::Listener DependsOn: - PublicLoadBalancer Properties: DefaultActions: - Type: 'forward' ForwardConfig: TargetGroups: - TargetGroupArn: !Ref JupyterTargetGroup Weight: 100 LoadBalancerArn: !Ref 'PublicLoadBalancer' Port: 80 Protocol: HTTP # The Jupyter services' security group allows inbound # traffic from the public facing ALB JupyterIngressFromPublicALB: Type: AWS::EC2::SecurityGroupIngress Properties: Description: Ingress from the public ALB GroupId: !Ref 'ServiceSecurityGroup' IpProtocol: -1 SourceSecurityGroupId: !Ref 'PublicLoadBalancerSG' # Filesystem that provides durable storage for the notebook EFSFileSystem: Type: AWS::EFS::FileSystem Properties: Encrypted: true PerformanceMode: generalPurpose ThroughputMode: bursting # Mount target allows usage of the EFS inside of subnet one EFSMountTargetOne: Type: AWS::EFS::MountTarget Properties: FileSystemId: !Ref EFSFileSystem SubnetId: !Select [ 0, !Ref PrivateSubnetIds ] SecurityGroups: - !Ref EFSFileSystemSecurityGroup # Mount target allows usage of the EFS inside of subnet two EFSMountTargetTwo: Type: AWS::EFS::MountTarget Properties: FileSystemId: !Ref EFSFileSystem SubnetId: !Select [ 1, !Ref PrivateSubnetIds ] SecurityGroups: - !Ref EFSFileSystemSecurityGroup # This security group is used by the mount targets so # that they will allow inbound NFS connections from # the ECS tasks that we launch EFSFileSystemSecurityGroup: Type: AWS::EC2::SecurityGroup Properties: GroupDescription: Security group for EFS file system VpcId: !Ref VpcId SecurityGroupIngress: - IpProtocol: tcp FromPort: 2049 ToPort: 2049 SourceSecurityGroupId: !Ref ServiceSecurityGroup Outputs: LoadBalancerUrl: Description: The URL at which you can access the application Value: !GetAtt PublicLoadBalancer.DNSName Secret: Description: The ARN of the secret that was created to protect the Juypter Lab Value: !Ref JupyterToken