# Appendixes --- ## Appendix A ![](../img/ai-ml-roles.png) Source: [Increase your machine learning success with AWS ML services and AWS Machine Learning Embark](https://aws.amazon.com/blogs/machine-learning/increase-your-machine-learning-success-with-aws-ml-services-and-aws-ml-embark/) ## Appendix B [Deployment into an existing VPC and with pre-provisioned IAM resources](base-vpc-deployment.md) # Appendix C ## Solution CI/CD pipelines The solution is tested end-to-end for all possible deployment options using [AWS CodePipeline](https://aws.amazon.com/codepipeline/) and AWS developer tools. ## Setup CI/CD pipelines ### Create CodePipeline artifact Amazon S3 buckets The CodePipeline pipelines used in the solution deploy stack in different regions. You must create an Amazon S3 CodePipeline bucket per region following the naming convention: `codepipeline--`: ``` PROJECT_NAME=sm-mlops aws s3 mb s3://codepipeline-${PROJECT_NAME}-us-east-2 --region us-east-2 aws s3 mb s3://codepipeline-${PROJECT_NAME}-eu-central-1 --region eu-central-1 aws s3 mb s3://codepipeline-${PROJECT_NAME}-eu-west-1 --region eu-west-1 aws s3 mb s3://codepipeline-${PROJECT_NAME}-eu-west-2 --region eu-west-2 ``` ### Setup CodeCommit repository and notifications To use CI/CD pipelines you must setup CodeCommit repository and configure notifications on pipeline status changes. + Setup CodeCommit repository + Create SNS topic to receive notifications ### Setup pipelines To setup all CI/CD pipelines run the following command from the solution directory: ```bash aws cloudformation deploy \ --template-file test/cfn_templates/create-base-infra-pipeline.yaml \ --stack-name base-infra-$AWS_DEFAULT_REGION \ --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \ --parameter-overrides \ CodeCommitRepositoryArn= \ NotificationArn= ``` # Appendix D ## Terraform considerations [AWS Service Catalog Terraform Reference Architecture GitHub](https://github.com/aws-samples/aws-service-catalog-terraform-reference-architecture) [AWS Service Catalog FAQ](https://aws.amazon.com/servicecatalog/faqs/): > Q: Can I use Terraform with AWS Service Catalog? > You can leverage the [AWS Service Catalog Terraform Reference Architecture](https://d1.awsstatic.com/whitepapers/DevOps/TerraformReferenceArchitecture-instructions.pdf). This reference architecture provides an example for using AWS Service Catalog products, an AWS CloudFormation custom resource, and Terraform to provision resources on AWS. # Appendix E How to get an IAM snapshot from the account: ``` aws iam get-account-authorization-details > iam_snapshot.json ``` ## Enabling SageMaker projects programmatically To enable SageMaker projects you need first to enable SageMaker AWS Service Catalog portfolio and then to associate the Studio execution role with the portfolio using https://docs.aws.amazon.com/cli/latest/reference/servicecatalog/associate-principal-with-portfolio.html. In addition you need to make sure to create two roles (which otherwise get created through the console): `AmazonSageMakerServiceCatalogProductsUseRole` and `AmazonSageMakerServiceCatalogProductsLaunchRole`. Below a sample code_snippet for boto3 for the full workflow: + `studio_role_arn` is the role which is associated with Studio + `sc_client` is AWS Service Catalog boto3 client + `client`: is SageMaker boto3 client ```python def enable_projects(studio_role_arn): # enable Project on account level (accepts portfolio share) response = client.enable_sagemaker_servicecatalog_portfolio() # associate studio role with portfolio response = sc_client.list_accepted_portfolio_shares() portfolio_id = '' for portfolio in response['PortfolioDetails']: if portfolio['ProviderName'] == 'Amazon SageMaker': portfolio_id = portfolio['Id'] response = sc_client.associate_principal_with_portfolio( PortfolioId=portfolio_id, PrincipalARN=studio_role_arn, PrincipalType='IAM' ) ``` # Appendix F # Alternative architectural options for MLOps The following architectural options for implementing MLOps pipeline are available: + SageMaker Projects + CodePipeline (implemented by this solutions) + [Step Functions](https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/readmelink.html#getting-started-with-sample-jupyter-notebooks) + [Apache AirFlow](https://airflow.apache.org/), [SageMaker Operators for AirFlow](https://sagemaker.readthedocs.io/en/stable/using_workflow.html) + [SageMaker Operators for Kubernetes](https://aws.amazon.com/blogs/machine-learning/introducing-amazon-sagemaker-operators-for-kubernetes/) # Appendix G # Develop and deploy seed code You can develop and evolve the seed code for your own needs. To deliver the new version of the seed code **in form of the project template**, please follow the steps: + Update existing or create your own version of the seed code + Zip all files that should go into a project CodeCommit repository into a single `.zip` file + Upload this `.zip` file to an Amazon S3 bucket of your choice. You must specify this S3 bucket name when you create a new project in Studio + Set a special tag `servicecatalog:provisioning` on the uploaded file. This tag enables access to the object by `AmazonSageMakerServiceCatalogProductsLaunchRole` IAM role: ```bash aws s3api put-object-tagging \ --bucket \ --key /seed-code/ \ --tagging 'TagSet=[{Key=servicecatalog:provisioning,Value=true}]' ``` + Update the `AWS::CodeCommit::Repository` resource in the CloudFormation template with the CI/CD pipeline for the corresponding MLOps project ([`project-model-build-train.yaml`](https://github.com/aws-samples/amazon-sagemaker-secure-mlops/blob/master/cfn_templates/project-model-build-train.yaml) or [`project-model-deploy.yaml`](https://github.com/aws-samples/amazon-sagemaker-secure-mlops/blob/master/cfn_templates/project-model-deploy.yaml)) with the new zip-file name. Model deploy project `project-model-deploy.yaml`: ```yaml ModelDeployCodeCommitRepository: Type: AWS::CodeCommit::Repository Properties: # Max allowed length: 100 chars RepositoryName: !Sub sagemaker-${SageMakerProjectName}-${SageMakerProjectId}-model-deploy # max: 10+33+15+12=70 RepositoryDescription: !Sub SageMaker Endpoint deployment infrastructure as code for the project ${SageMakerProjectName} Code: S3: Bucket: !Ref SeedCodeS3BucketName Key: /seed-code/ BranchName: main ``` Model build, train, validate project `project-model-build-train.yaml`: ```yaml ModelBuildCodeCommitRepository: Type: AWS::CodeCommit::Repository Properties: # Max allowed length: 100 chars RepositoryName: !Sub sagemaker-${SageMakerProjectName}-${SageMakerProjectId}-model-build-train # max: 10+33+15+18=76 RepositoryDescription: !Sub SageMaker Model building infrastructure as code for the project ${SageMakerProjectName} Code: S3: Bucket: !Ref SeedCodeS3BucketName Key: sagemaker-mlops/seed-code/mlops-model-build-train-v1.0.zip BranchName: main ``` + Update the CloudFormation template file `env-sc-portfolio.yaml` with a new version of Service Catalog Product: ```yaml DataScienceMLOpsModelBuildTrainProduct: Type: 'AWS::ServiceCatalog::CloudFormationProduct' Properties: Name: !Sub '${EnvName}-${EnvType} MLOps Model Build Train ' Description: 'This template creates a CI/CD MLOps project which implements ML build-train-validate pipeline' Owner: 'data science Administration Team' ProvisioningArtifactParameters: - Name: 'MLOps Model Build Train ' ``` + Package and upload all changed CloudFormation template to the distribution S3 bucket + Update Service Catalog CloudFormation stack with the updated templates: - Package the CloudFormation templates and upload everything to the Amazon S3 bucket: ```bash S3_BUCKET_NAME= make package CFN_BUCKET_NAME=$S3_BUCKET_NAME ``` - Update the Service Catalog portfolio and product stack: ```bash STACK_NAME= # PRINCIPAL_ROLE_ARN= # LAUNCH_ROLE_ARN= # aws cloudformation update-stack \ --template-url https://s3.$AWS_DEFAULT_REGION.amazonaws.com/$S3_BUCKET_NAME/sagemaker-mlops/env-sc-portfolio.yaml \ --region $AWS_DEFAULT_REGION \ --stack-name $STACK_NAME \ --parameters \ ParameterKey=EnvName,ParameterValue=sm-mlops \ ParameterKey=EnvType,ParameterValue=dev \ ParameterKey=SCMLOpsPortfolioPrincipalRoleArn,ParameterValue=$PRINCIPAL_ROLE_ARN \ ParameterKey=SCMLOpsProductLaunchRoleArn,ParameterValue=$LAUNCH_ROLE_ARN ``` + Restart Studio (close the browser window with Studio and open again via AWS console) # Appendix H ## Amazon SageMaker features ![SageMaker Features](../img/sagemaker-features.png) # Appendix I [Use `CloudFormation` provider instead of `CloudFormationStackSet` in CodePipeline deploy action](use-cfn-stack-instead-of-stacksets.md) --- [Back to README](../README.md) --- Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: MIT-0