# Bring your own container to MLOps with SageMaker Project In this project, we demonstrate how to bring your own container to MLOps(machine learning operations) with [SageMaker Project](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-whatis.html) by using the library of text classification task from [Hugging Face](https://huggingface.co/) ecosystem. Besides ML pipeline automation with SageMaker Pipelines, we also use [AWS CDK](https://docs.aws.amazon.com/cdk/v2/guide/home.html) to implement CI/CD with [AWS CodePipeline](https://docs.aws.amazon.com/codepipeline/latest/userguide/welcome.html). ## Overview
### Region By default, we use the region of `us-east-1`. If you perfer other region, you can modify the following code in `infra/cicd_construct.py`. More regions and corresponding accounts can be found in [Available Deep Learning Containers Images](https://github.com/aws/deep-learning-containers/blob/master/available_images.md). ```python deploy_spec = codebuild.BuildSpec.from_object( dict( version="0.2", phases=dict( install={ "runtime-versions": { "nodejs": "12", }, "commands": [ "npm install -g aws-cdk@latest", ], }, build=dict( commands=[ #login_cli_cmd(cdk.Aws.REGION), "aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com",# Setup your preferable region: {account_id}.dkr.ecr.{region}.amazonaws.com "cdk -a . deploy --all --require-approval=never --verbose", ] ), ), ) ) ``` Meanwhile in the `repos/build_pipeline/containers/batch_transform/Dockerfile`, `repos/build_pipeline/containers/training/Dockerfile` and `repos/build_pipeline/containers/serving/Dockerfile`, correspondingly you need to modify `account` and `region`. ```bash FROM {account}.dkr.ecr.{region}.amazonaws.com/huggingface-tensorflow-training:2.5.1-transformers4.12.3-gpu-py37-cu112-ubuntu18.04 COPY resources/train.py /opt/program/train.py ENV SAGEMAKER_PROGRAM /opt/program/train.py ``` ## Solution - [Amazon SageMaker Feature Store](https://aws.amazon.com/sagemaker/feature-store/) is often used in storing features of structured data. In this solution, we store labeled data of text classification in Amazon SageMaker Feature Store. - When using the custom algorithms or the state-of-the-art algorithms in a machine learning project, custom containers need to be provided. This solution demonstrates how to bring your own container to MLOps with CI/CD. You can refer to [End to End Pipeline: Bring your own container to SageMaker Pipelines](https://github.com/aws-samples/aws-sagemaker-byoc-end2end) to learn how to implement ML pipeline for BYOC with [SageMaker Pipelines](https://aws.amazon.com/sagemaker/pipelines/) and [AWS SDK](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html). - [Amazon SageMaker Asynchronous Inference](https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference.html) is a new capability in SageMaker that queues incoming requests and processes them asynchronously. We deploy the asynchronous inference endpoint in model serving phase. ## Setup ### CDK environment We can follow [Operationalize a Machine Learning model with Amazon SageMaker Featurestore and Amazon SageMaker DataWrangler Using CDK](https://github.com/aws-samples/amazon-sagemaker-mlops-with-featurestore-and-datawrangler) to set up the CDK development environment. ### Build and run a SageMaker project Once running `cdk deploy` in CDK environment successfully, a SageMaker template will be generated in `AWS Service Catalog`. we can create a SageMaker project from the SageMaker project template.
We can clone the demo repository from repositories on the `Sagemaker project console` after creating SageMaker project.
Next step, run the script `walkthrough.ipynb` in the repository `sagemaker-{project-name}-demo` step by step. ## Security See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information. ## License This library is licensed under the MIT-0 License. See the [LICENSE](LICENSE) file.