Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: CC-BY-SA-4.0

Amazon SageMaker Containers: a Library to Create Docker Containers

Amazon SageMaker Containers is a library that implements the functionality that you need to create containers to run scripts, train algorithms, or deploy models that are compatible with Amazon SageMaker. To install this library, use a RUN pip install sagemaker-containers command in your Dockerfile. When installed, the library defines the locations for storing code and other resources. Your Dockerfile must also copy the code to be run into the location that an Amazon SageMaker-compatible container expects, and define the entry point containing the code to run when the container is started. The library also defines other information that a container needs to manage deployments for training and inference. After you build a Docker image with the docker build command, you can push it to Amazon Elastic Container Registry (Amazon ECR). Amazon SageMaker pulls the image from Amazon ECR and runs it as a container when you start a training or hosting job.
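
For example, a minimal Dockerfile for such a container might look like the following sketch. The base image and the train.py script name are placeholder assumptions; the /opt/ml/code location and the SAGEMAKER_PROGRAM environment variable are the conventions that the sagemaker-containers library uses to find and run your code.

# A minimal sketch; the base image and the script name (train.py)
# are placeholder assumptions for illustration.
FROM python:3.8

# Install the library that implements the Amazon SageMaker contract.
RUN pip install sagemaker-containers

# Copy the code to run into the location the library expects.
COPY train.py /opt/ml/code/train.py

# Define train.py as the script to run when the container starts.
ENV SAGEMAKER_PROGRAM train.py

You would then build the image locally with docker build and push it to a repository in Amazon ECR with docker push.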

The following high-level schematic shows how the files are organized in an Amazon SageMaker-compatible container created with the Amazon SageMaker Containers library.

[Files and resources in an Amazon SageMaker-compatible container.]

When Amazon SageMaker trains a model, it creates a number of files in the container’s /opt/ml directory.

/opt/ml
├── input
│   ├── config
│   │   ├── hyperparameters.json
│   │   └── resourceConfig.json
│   └── data
│       └── <channel_name>
│           └── <input data>
├── model
│ 
├── code
│   └── <script files>
│
└── output
    └── failure

When you run a model training job, the Amazon SageMaker container has a /opt/ml/input/ directory that contains JSON files that configure the hyperparameters for the algorithm and the network layout used for distributed training. The directory also contains files that specify the channels through which Amazon SageMaker accesses the data in Amazon Simple Storage Service (Amazon S3). Place the scripts to run in the /opt/ml/code/ directory. Your algorithm writes the model that it generates to the /opt/ml/model/ directory, either as a single file or as an entire directory tree, in any format. Amazon SageMaker packages any files in the /opt/ml/model/ directory into a compressed tar archive. You can also write information about why a training job failed to the /opt/ml/output/ directory.
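
The following is a minimal sketch of a train script that uses this layout. The hyperparameter name (max_depth), the channel name (training), and the model file name (model.json) are hypothetical examples; the /opt/ml paths are the ones described above.

#!/usr/bin/env python
# Minimal training sketch. The hyperparameter name ("max_depth"), the
# channel name ("training"), and the model file ("model.json") are
# hypothetical examples chosen for illustration.
import json
import os
import sys
import traceback

PREFIX = "/opt/ml"
HYPERPARAMS_PATH = os.path.join(PREFIX, "input/config/hyperparameters.json")
TRAIN_CHANNEL = os.path.join(PREFIX, "input/data/training")
MODEL_DIR = os.path.join(PREFIX, "model")
FAILURE_PATH = os.path.join(PREFIX, "output/failure")

def train():
    # Hyperparameters are passed as a JSON dictionary of strings.
    with open(HYPERPARAMS_PATH) as f:
        hyperparams = json.load(f)
    max_depth = int(hyperparams.get("max_depth", "5"))

    # Read the input data that Amazon SageMaker copied from Amazon S3.
    files = [os.path.join(TRAIN_CHANNEL, name)
             for name in os.listdir(TRAIN_CHANNEL)]

    # ... fit a model on the data; elided in this sketch ...
    model = {"max_depth": max_depth, "num_files": len(files)}

    # Write the model; SageMaker packages this directory into a tar archive.
    with open(os.path.join(MODEL_DIR, "model.json"), "w") as f:
        json.dump(model, f)

if __name__ == "__main__":
    try:
        train()
        sys.exit(0)
    except Exception:
        # Describe the failure; SageMaker surfaces this file's contents
        # as the training job's failure reason.
        with open(FAILURE_PATH, "w") as f:
            f.write(traceback.format_exc())
        sys.exit(1)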

When you host a trained model on Amazon SageMaker to make inferences, you deploy the model to an HTTP endpoint. The model makes real-time predictions in response to inference requests. The container must contain a serving stack to process these requests. The five files used in the standard Python serving stack by Amazon SageMaker are installed in the container's WORKDIR. You can choose a different toolset to deploy an HTTP endpoint and, therefore, could have a different layout. If you're writing in a programming language other than Python, you will have a different layout, the nature of which will depend on the frameworks and tools that you choose. The Python serving stack in the WORKDIR directory contains the following files:

+ nginx.conf – The configuration file for the nginx front end.
+ predictor.py – The program that implements the Flask web server and the decision tree predictions for this application. You need to customize the code that performs prediction for your application.
+ serve – The program started when the container is started for hosting. This file simply launches the Gunicorn server, which runs multiple instances of the Flask application defined in predictor.py.
+ train – The program that is invoked when you run the container for training. To implement your training algorithm, you modify this program.
+ wsgi.py – A small wrapper used to invoke the Flask application.
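
As a sketch of what predictor.py might contain, the following minimal Flask application implements the two routes that the Amazon SageMaker hosting service calls: GET /ping for health checks and POST /invocations for inference requests. The model file name and the scoring logic are hypothetical placeholders.

# Minimal sketch of a predictor.py, assuming JSON input and output; the
# model format ("model.json") and the scoring logic are hypothetical.
import json
import os

import flask

MODEL_PATH = "/opt/ml/model/model.json"
model = None
app = flask.Flask(__name__)

def get_model():
    # Lazily load whatever the training job wrote to /opt/ml/model.
    global model
    if model is None:
        with open(MODEL_PATH) as f:
            model = json.load(f)
    return model

@app.route("/ping", methods=["GET"])
def ping():
    # Health check: SageMaker calls GET /ping to verify that the
    # container is up and able to serve requests.
    try:
        get_model()
        status = 200
    except Exception:
        status = 404
    return flask.Response(response="\n", status=status,
                          mimetype="application/json")

@app.route("/invocations", methods=["POST"])
def invocations():
    # Inference: SageMaker forwards each request to POST /invocations.
    payload = json.loads(flask.request.get_data().decode("utf-8"))
    prediction = {"num_fields": len(payload)}  # hypothetical scoring logic
    return flask.Response(response=json.dumps(prediction), status=200,
                          mimetype="application/json")

The wsgi.py wrapper then only needs to expose app from this module so that Gunicorn, launched by serve, can run it.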

In the container, the model files are in the same place that they were written to during training.

/opt/ml
└── model
    └── <model files>

For more information, see Use Your Own Inference Code.

You can provide separate Docker images for the training algorithm and inference code, or you can use a single Docker image for both. When creating Docker images for use with Amazon SageMaker, consider the following:

+ Providing two Docker images can increase storage requirements and cost because common libraries might be duplicated.
+ In general, smaller containers start faster for both training and hosting. Models train faster, and the hosting service can react to increases in traffic by automatically scaling more quickly.
+ You might be able to write an inference container that is significantly smaller than the training container. This is especially common when you use GPUs for training, but your inference code is optimized for CPUs.
+ Amazon SageMaker requires that Docker containers run without privileged access.
+ Docker containers might send messages to the stdout and stderr streams. Amazon SageMaker sends these messages to Amazon CloudWatch Logs in your AWS account.