{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deploy your model for inference\n", "\n", "In this lab you will walk through the process of deploying an XGBoost model that has been approved in the Model Registry. We will create a SageMaker serverless endpoint. For more information on deployment options on SageMaker, visit the [SageMaker documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install \"sagemaker>=2.123.0\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import sagemaker\n", "import boto3\n", "import numpy as np \n", "import pandas as pd \n", "import os \n", "from sagemaker import get_execution_role\n", "from datetime import datetime\n", "\n", "# Get default bucket\n", "bucket = sagemaker.Session().default_bucket()\n", "prefix = 'sagemaker/mlops-workshop'\n", "\n", "# Get SageMaker Execution Role\n", "role = get_execution_role()\n", "region = boto3.Session().region_name\n", "\n", "# SageMaker Session\n", "sagemaker_session = sagemaker.session.Session()\n", "\n", "# SageMaker client\n", "sm_client = boto3.client('sagemaker')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Retrieve variables" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "%store -r" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(model_package_arn)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.model import ModelPackage\n", "\n", "model_package = ModelPackage(\n", " model_package_arn = model_package_arn, \n", " role = role,\n", " sagemaker_session = sagemaker_session\n", ")" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Deploy the model\n", "\n", "Since we are still experimenting with our model, we will create a Serverless Endpoint to save on cost. \n", "For more information on Serverless Endpoints, visit the SageMaker documentation [here](https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html)\n", "\n", "Ensure the model is approved before deploying it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Approve the Model\n", "\n", "\n", "This can be done via the UI as shown in the following image or through SageMaker APIs. In this notebook, we will use the API to approve the model.\n", "\n", "![](./imgs/mr-approval.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "model_package_update_input_dict = {\n", " \"ModelPackageArn\" : model_package_arn,\n", " \"ModelApprovalStatus\" : \"Approved\"\n", "}\n", "\n", "model_package_update_response = sm_client.update_model_package(**model_package_update_input_dict)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from datetime import datetime\n", "current_time = datetime.now().strftime(\"%d-%m-%Y-%H-%M-%S\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a Serverless endpoint\n", "\n", "This step should take 3-5 minutes to complete. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.serverless import ServerlessInferenceConfig\n", "\n", "endpoint_name = 'xgb-model-' + current_time\n", "\n", "try:\n", " model_package.deploy(\n", " endpoint_name = endpoint_name,\n", " serverless_inference_config = ServerlessInferenceConfig(\n", " memory_size_in_mb = 4096,\n", " max_concurrency = 1\n", " )\n", " )\n", "except Exception as e:\n", " print(\"Make sure model is in an Approved state. Navigate to the model registry UI to approve the model\\n\\n\", e)\n", " raise Exception(\"Make sure model is in an Approved state. Navigate to the model registry UI to approve the model\\n\\n\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.predictor import Predictor\n", "xgb_predictor = Predictor(\n", " endpoint_name = endpoint_name, \n", " serializer = sagemaker.serializers.CSVSerializer(),\n", " sagemaker_session = sagemaker_session\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation\n", "Let us evaluate our model against the test dataset.\n", "\n", "As our data is currently stored as NumPy arrays in memory of our notebook instance. To send it in an HTTP POST request, we'll serialize it as a CSV string and then decode the resulting CSV.\n", "\n", "*Note: For inference with CSV format, SageMaker XGBoost requires that the data does NOT include the target variable.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The helper method below allows us to pass in our test data and make predictions against it. The following steps are performed in this helper method. \n", "1. Loop over our test dataset\n", "1. Split it into mini-batches of rows \n", "1. Convert those mini-batches to CSV string payloads (notice, we drop the target variable from our dataset first)\n", "1. Retrieve mini-batch predictions by invoking the XGBoost endpoint\n", "1. Collect predictions and convert from the CSV output our model provides into a NumPy array" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "!aws s3 cp {test_uri}/test.csv test.csv" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "test_data = pd.read_csv('test.csv', header = None)\n", "test_data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "y_true = test_data[0]\n", "data = test_data.drop(0, axis = 1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "data.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "def predict(data, predictor, rows=500 ):\n", " split_array = np.array_split(data, int(data.shape[0] / float(rows) + 1))\n", " predictions = ''\n", " for array in split_array:\n", " predictions = ','.join([predictions, predictor.predict(array).decode('utf-8')])\n", "\n", " return np.fromstring(predictions[1:], sep=',')\n", "\n", "predictions = predict(data.to_numpy(), xgb_predictor)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A confusion matrix is a table that is often used to describe the performance of a classification model. Below we will check our confusion matrix to see how well we predicted versus actuals." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(predictions)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "pd.crosstab(index=y_true, columns=np.round(predictions), rownames=['actuals'], colnames=['predictions'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (Optional) Clean-up\n", "\n", "If you are done with this notebook, please run the cell below. This will remove the hosted endpoint you created and avoid any charges from a stray instance being left on." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "xgb_predictor.delete_endpoint(delete_endpoint_config=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Now move on the Module 2 in the workshop" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-2:429704687514:image/datascience-1.0" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }