# Introduction

Amazon SageMaker Autopilot currently allows deploying generated models to real-time inference endpoints by default. In this repository, we show how to deploy Autopilot models trained with the `ENSEMBLING` and `HYPERPARAMETER OPTIMIZATION` (HPO) training modes to serverless endpoints. The notebook in this folder is the solution described in this [blog post](https://aws.amazon.com/blogs/machine-learning/deploy-amazon-sagemaker-autopilot-models-to-serverless-inference-endpoints/).

## Dataset

In this example, we use the [UCI Bank Marketing](https://archive.ics.uci.edu/ml/datasets/Bank+Marketing) dataset to predict whether a client will subscribe to a term deposit offered by the bank. This is a binary classification problem type.

## Solution Overview

In the first part of the notebook, we launch two Autopilot jobs: one with the training mode set to `ENSEMBLING` and the other with `HYPERPARAMETER OPTIMIZATION` (HPO).

### Autopilot ensembling model to serverless endpoint

Autopilot generates a single model in `ENSEMBLING` training mode. We deploy this single model to a serverless endpoint and then send an inference request with test data to that endpoint.

![Deploying Autopilot Ensembling Models to Serverless Endpoints](images/deploying-autopilot-to-serverless-endpoints-ENS.png)

### Autopilot HPO models to serverless endpoints

In the second part of the notebook, we extract the three inference containers generated by Autopilot in `HPO` training mode, deploy these models to three separate serverless endpoints, and send inference requests in sequence.

![Deploying Autopilot HPO Models to Serverless Endpoints](images/deploying-autopilot-to-serverless-endpoints-HPO.png)

---

## Additional References

- If you’re new to Autopilot, we encourage you to refer to [Get started with Amazon SageMaker Autopilot](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-get-started.html).
- To determine the optimal configuration for your serverless endpoint from a cost and performance perspective, we encourage you to explore our Serverless Inference Benchmarking Toolkit. For more information, refer to [Introducing the Amazon SageMaker Serverless Inference Benchmarking Toolkit](https://aws.amazon.com/blogs/machine-learning/introducing-the-amazon-sagemaker-serverless-inference-benchmarking-toolkit/).
- To learn more about Autopilot training modes, refer to [Amazon SageMaker Autopilot is up to eight times faster with new ensemble training mode powered by AutoGluon](https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-autopilot-is-up-to-eight-times-faster-with-new-ensemble-training-mode-powered-by-autogluon/).
- Refer to [Inference container definitions for regression and classification problem types](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-container-output.html#autopilot-problem-type-container-output).
- Refer to [Configure inference output in generated containers](https://docs.aws.amazon.com/sagemaker/latest/dg/autopilot-automate-model-development-container-output.html#autopilot-problem-type-container-output).
- For an overview of how to deploy an XGBoost model to a serverless inference endpoint, we encourage you to refer to this [example notebook](https://github.com/aws/amazon-sagemaker-examples/blob/main/serverless-inference/Serverless-Inference-Walkthrough.ipynb).

## Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License

This project is licensed under the MIT License.
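
## Appendix: Serverless Deployment Sketch

As a rough illustration of the deployment flow in the Solution Overview, the sketch below builds the request shape used to put a model behind a serverless endpoint. All names are hypothetical placeholders, and the memory size and concurrency values are examples to tune for your workload; the full end-to-end walkthrough lives in the notebook.

```python
# Hypothetical names -- substitute the model name produced by your Autopilot job.
model_name = "autopilot-ensemble-model"
endpoint_config_name = "autopilot-serverless-config"
endpoint_name = "autopilot-serverless-ep"

# A serverless endpoint is configured with ServerlessConfig instead of an
# instance type and count. MemorySizeInMB must be 1024-6144 in 1 GB
# increments; both values here are examples, not recommendations.
endpoint_config_request = {
    "EndpointConfigName": endpoint_config_name,
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "ServerlessConfig": {
                "MemorySizeInMB": 2048,
                "MaxConcurrency": 5,
            },
        }
    ],
}

# The actual AWS calls are commented out since they need credentials:
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_endpoint_config(**endpoint_config_request)
# sm.create_endpoint(EndpointName=endpoint_name,
#                    EndpointConfigName=endpoint_config_name)
#
# Once the endpoint is InService, send one CSV row of features matching the
# training schema:
# smr = boto3.client("sagemaker-runtime")
# response = smr.invoke_endpoint(
#     EndpointName=endpoint_name,
#     ContentType="text/csv",
#     Body="56,housemaid,married,basic.4y,...",  # hypothetical test row
# )
# print(response["Body"].read().decode())
```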