{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Amazon SageMaker Workshop\n", "## _**Deployment**_\n", "\n", "---\n", "\n", "In this part of the workshop we will deploy our model created in the previous lab in an endpoint for real-time inferences to Predict Mobile Customer Departure.\n", "\n", "---\n", "\n", "## Contents\n", "\n", "1. [Model hosting](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html)\n", " * Set up a persistent endpoint to get predictions from your model\n", " \n", "2. [Exercise - You turn to an endpoint and customize inference](#Exercise)\n", " \n", "---\n", "\n", "## Background\n", "\n", "In the previous labs [Modeling](../../2-Modeling/modeling.ipynb) and [Evaluation](../../3-Evaluation/evaluation.ipynb) we trained multiple models with multiple SageMaker training jobs and evaluated them .\n", "\n", "Let's import the libraries for this lab:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Supress default INFO loggingd\n", "import logging\n", "logger = logging.getLogger()\n", "logger.setLevel(logging.ERROR)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "import json\n", "from time import strftime, gmtime\n", "\n", "import boto3\n", "\n", "import sagemaker\n", "from sagemaker import get_execution_role\n", "from sagemaker.predictor import csv_serializer\n", "from sagemaker.model_monitor import DataCaptureConfig, DatasetFormat, DefaultModelMonitor\n", "from sagemaker.s3 import S3Uploader, S3Downloader" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = boto3.Session()\n", "sm = sess.client('sagemaker')\n", "role = sagemaker.get_execution_role()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%store -r bucket\n", "%store -r prefix\n", "%store -r region\n", "%store -r docker_image_name\n", "%store -r framework_version" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bucket, prefix, region, docker_image_name, framework_version" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "### - if you _**skipped**_ the lab `2-Modeling/` follow instructions:\n", "\n", " - **run this:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# # Uncomment if you have not done Lab 2-Modeling\n", "\n", "# from config.solution_lab2 import get_estimator_from_lab2\n", "# xgb = get_estimator_from_lab2(docker_image_name, framework_version)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "### - if you _**have done**_ the lab `2-Modeling/` follow instructions:\n", "\n", " - **run this:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# # Uncomment if you've done Lab 2-Modeling\n", "\n", "#%store -r training_job_name\n", "#xgb = sagemaker.estimator.Estimator.attach(training_job_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Host the model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we've trained the model, let's deploy it to a hosted endpoint. To monitor the model after it's hosted and serving requests, we'll also add configurations to capture data that is being sent to the endpoint." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_capture_prefix = '{}/datacapture'.format(prefix)\n", "\n", "endpoint_name = \"workshop-xgboost-customer-churn-\" + strftime(\"%Y-%m-%d-%H-%M-%S\", gmtime())\n", "print(\"EndpointName = {}\".format(endpoint_name))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xgb_predictor = xgb.deploy(initial_instance_count=1, \n", " instance_type='ml.m4.xlarge',\n", " endpoint_name=endpoint_name,\n", " data_capture_config=DataCaptureConfig(enable_capture=True,\n", " sampling_percentage=100,\n", " destination_s3_uri=f's3://{bucket}/{data_capture_prefix}'\n", " )\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, we just trained a model with SageMaker and then used deployed it in a managed SageMaker endpoint. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.core.display import display, HTML\n", "sm_ep_placeholder = \"https://us-east-2.console.aws.amazon.com/sagemaker/home?region={}#/endpoints\"\n", "\n", "display(HTML(f\"Look at your endpoints here\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or go to the left tab here, inside the Studio UI, and select \"Endpoints\":\n", "\n", "![endpoints.png](media/endpoints.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's save the endpoint name for later (Monitoring lab)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%store endpoint_name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Invoke the deployed model\n", "\n", "Now that we have a hosted endpoint running, we can make real-time predictions from our model by making an http POST request. But first, we need to set up serializers and deserializers for passing our `test_data` NumPy arrays to the model behind the endpoint." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.serializers import CSVSerializer\n", "from sagemaker.deserializers import CSVDeserializer\n", "\n", "xgb_predictor.serializer = CSVSerializer()\n", "xgb_predictor.deserializer = CSVDeserializer()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we'll loop over our test dataset and collect predictions by invoking the XGBoost endpoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Sending test traffic to the endpoint {}. \\nPlease wait for a minute...\".format(endpoint_name))\n", "\n", "count = 0\n", "\n", "with open('config/test_sample.csv', 'r') as f:\n", " for row in f:\n", " if count == 10: break # only evaluate 10 first items\n", " payload = row.rstrip('\\n')\n", " response = xgb_predictor.predict(data=payload)\n", " print(response)\n", " time.sleep(0.5)\n", " count+= 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Verify that data is captured in Amazon S3\n", "\n", "When we made some real-time predictions by sending data to our endpoint, we should have also captured that data for monitoring purposes. \n", "\n", "Let's list the data capture files stored in Amazon S3. Expect to see different files from different time periods organized based on the hour in which the invocation occurred. 
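, { "cell_type": "markdown", "metadata": {}, "source": [ "The predictor above is a convenience wrapper around an HTTPS request. As a minimal sketch (reusing the last `payload` from the loop above), the same prediction can be made through the low-level SageMaker Runtime API:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Same prediction through the low-level SageMaker Runtime API\n", "# (this is what the Predictor does under the hood).\n", "runtime = boto3.client('sagemaker-runtime')\n", "\n", "runtime_response = runtime.invoke_endpoint(EndpointName=endpoint_name,\n", "                                           ContentType='text/csv',\n", "                                           Body=payload)\n", "print(runtime_response['Body'].read().decode('utf-8'))" ] }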
, { "cell_type": "markdown", "metadata": {}, "source": [ "### Verify that data is captured in Amazon S3\n", "\n", "When we made some real-time predictions by sending data to our endpoint, we should have also captured that data for monitoring purposes.\n", "\n", "Let's list the data capture files stored in Amazon S3. Expect to see different files from different time periods, organized based on the hour in which the invocation occurred. The format of the Amazon S3 path is:\n", "\n", "`s3://{destination-bucket-prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from time import sleep\n", "\n", "current_endpoint_capture_prefix = '{}/{}'.format(data_capture_prefix, endpoint_name)\n", "for _ in range(12): # wait up to a minute to see captures in S3\n", " capture_files = S3Downloader.list(\"s3://{}/{}\".format(bucket, current_endpoint_capture_prefix))\n", " if capture_files:\n", " break\n", " sleep(5)\n", "\n", "print(\"Found Data Capture Files:\")\n", "print(capture_files)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the captured data is stored in a SageMaker-specific JSON Lines file. Next, let's take a quick peek at the contents of a single line as pretty-printed JSON so that we can observe the format a little better." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "capture_file = S3Downloader.read_file(capture_files[-1])\n", "\n", "print(\"=====Single Data Capture====\")\n", "print(json.dumps(json.loads(capture_file.split('\\n')[0]), indent=2)[:2000])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, each inference request is captured on one line of the JSONL file. The line contains both the input and the output, merged together. In our example, we provided the ContentType as `text/csv`, which is reflected in the `observedContentType` value. We also expose the encoding used for the input and output payloads in the capture format with the `encoding` value.\n", "\n", "To recap, we have observed how you can enable capturing the input and/or output payloads to an endpoint with a new parameter, and what the captured format looks like in S3. Let's continue exploring how SageMaker helps with monitoring the data collected in S3." ] }
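, { "cell_type": "markdown", "metadata": {}, "source": [ "As a minimal sketch (assuming the standard capture schema, with the payloads nested under a top-level `captureData` key), here is how to pull the input and output back out of the record we just read:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Extract the captured request/response payloads from one capture record.\n", "# Assumes the standard schema: captureData -> endpointInput / endpointOutput.\n", "record = json.loads(capture_file.split('\\n')[0])\n", "\n", "print('Input :', record['captureData']['endpointInput']['data'][:200])\n", "print('Output:', record['captureData']['endpointOutput']['data'][:200])" ] }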
, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## _Alternative deployment_\n", "\n", "OK, nice! We can train with SageMaker and then deploy to a managed endpoint with monitoring enabled.\n", "\n", "But:\n", "\n", "#### - What if I already have a model that was trained outside of SageMaker? How do I deploy it in SageMaker without training it there first?\n", "\n", "#### - What if I need to preprocess the request before performing inference, and then post-process what my model just predicted? How can I customize the inference logic with a custom inference script?\n", "\n", "# Exercise\n", "### _[Challenge] Your turn!_\n", "\n", "Deploy another model in SageMaker. Remember that the output of each training job was an artifact (a tar.gz file with the model and other configurations) that was saved in S3.\n", "\n", "1. Pick one of these models in S3, or upload another one from your laptop to S3. Then deploy it.\n", "(If you haven't trained a model, pick the `model.tar.gz` in the `config` directory).\n", "\n", "2. Add a custom inference script to your endpoint.\n", "\n", "To make things easier, you can add a simple post-processing function that appends a new value, `\"hello from post-processing\"`, to the response.\n", "\n", "So, if we send to our endpoint: \n", "```\n", "186,0.1,137.8,97,187.7,118,146.4,85,8.7,6,1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.10,0.11,0.12,0.13,0.14,0.15,0.16,0.17,1.1,0.18,0.19,0.20,0.21,0.22,0.23,0.24,0.25,0.26,0.27,0.28,0.29,0.30,0.31,0.32,0.33,0.34,0.35,0.36,0.37,0.38,0.39,0.40,0.41,0.42,0.43,0.44,0.45,0.46,0.47,0.48,0.49,0.50,0.51,0.52,0.53,1.2,1.3,0.54,1.4,0.55\n", "``` \n", "\n", "The output will be something like:\n", "```\n", "0.014719205908477306,\"hello from post-processing\"\n", "```\n", "\n", "Want a hint? [Look here](./solutions/b-hint1.md)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# YOUR SOLUTION HERE\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "# [You can now go to the lab 5-Monitoring](../../5-Monitoring/monitoring.ipynb)" ] } ], "metadata": { "celltoolbar": "Tags", "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/datascience-1.0" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" }, "notice": "Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License." }, "nbformat": 4, "nbformat_minor": 4 }