{ "cells": [ { "cell_type": "markdown", "id": "c24fc73e", "metadata": {}, "source": [ "# Run multiple deep learning models on GPUs with Amazon SageMaker Multi-model endpoints (MME)\n" ] }, { "cell_type": "markdown", "id": "c528238f", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.\n", "\n", "\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "33cc860f", "metadata": {}, "source": [ "\n", "[Amazon SageMaker](https://aws.amazon.com/sagemaker/) multi-model endpoints (MME) provide a scalable and cost-effective way to deploy large numbers of deep learning models. Previously, customers had limited options for deploying hundreds of deep learning models that need accelerated compute with GPUs. Now, customers can deploy thousands of deep learning models behind one SageMaker endpoint. MME runs multiple models on a GPU core, shares GPU instances behind an endpoint across multiple models, and dynamically loads and unloads models based on incoming traffic. With this, customers can significantly reduce cost and achieve the best price performance.\n", "\n", "