{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Grab data" ] }, { "source": [ "Commentary:\n", "\n", "The popular [Abalone](https://archive.ics.uci.edu/ml/datasets/Abalone) data set originally from the UCI data repository \\[1\\] will be used.\n", "\n", "> \\[1\\] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "import boto3\n", "\n", "for p in ['raw_data', 'training_data', 'validation_data']:\n", " Path(p).mkdir(exist_ok=True)\n", "\n", "s3 = boto3.client('s3')\n", "s3.download_file('sagemaker-sample-files', 'datasets/tabular/uci_abalone/abalone.libsvm', 'raw_data/abalone')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prepare training and validation data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import load_svmlight_file, dump_svmlight_file\n", "from sklearn.model_selection import train_test_split\n", "\n", "X, y = load_svmlight_file('raw_data/abalone')\n", "x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1984, shuffle=True)\n", "\n", "dump_svmlight_file(x_train, y_train, 'training_data/abalone.train')\n", "dump_svmlight_file(x_test, y_test, 'validation_data/abalone.test')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Train model" ] }, { "source": [ "Commentary:\n", "\n", "Notice that the [SageMaker XGBoost container](https://github.com/aws/sagemaker-xgboost-container) framework version is set to be `1.2-1`. This is extremely important – the older `0.90-2` version will NOT work with SageMaker Neo out of the box. This is because in February of 2021, the SageMaker Neo team updated their XGBoost library version to `1.2` and backwards compatibility was not kept.\n", "\n", "Moreover, notice that we are using the open source XGBoost algorithm version, so we must provide our own training script and model loading function. These two required components are defined in `entrypoint.py`, which is part of the `neo-blog` repository. The training script is very basic, and the inspiration was taken from another sample notebook [here](https://github.com/aws/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb). Please note also that for `instance_count` and `instance_type`, the values are `1` and `local`, respectively, which means that the training job will run locally on our notebook instance. This is beneficial because it eliminates the startup time of training instances when a job runs remotely instead.\n", "\n", "Finally, notice that the number of boosting rounds has been set to 10,000. This means that the model will consist of 10,000 individual trees and will be computationally expensive to run, which we want for load testing purposes. A side effect will be that the model will severely overfit on the training data, but that is okay since accuracy is not a priority here. A computationally expensive model could have also been achieved by increasing the `max_depth` parameter as well.\n" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sagemaker\n", "from sagemaker.xgboost.estimator import XGBoost\n", "from sagemaker.session import Session\n", "from sagemaker.inputs import TrainingInput\n", "\n", "bucket = Session().default_bucket()\n", "role = sagemaker.get_execution_role()\n", "\n", "# initialize hyperparameters\n", "hyperparameters = {\n", " \"max_depth\":\"5\",\n", " \"eta\":\"0.2\",\n", " \"gamma\":\"4\",\n", " \"min_child_weight\":\"6\",\n", " \"subsample\":\"0.7\",\n", " \"verbosity\":\"1\",\n", " \"objective\":\"reg:squarederror\",\n", " \"num_round\":\"10000\"\n", "}\n", "\n", "# construct a SageMaker XGBoost estimator\n", "# specify the entry_point to your xgboost training script\n", "estimator = XGBoost(entry_point = \"entrypoint.py\", \n", " framework_version='1.2-1', # 1.x MUST be used \n", " hyperparameters=hyperparameters,\n", " role=role,\n", " instance_count=1,\n", " instance_type='local',\n", " output_path=f's3://{bucket}/neo-demo') # gets saved in bucket/neo-demo/job_name/model.tar.gz\n", "\n", "# define the data type and paths to the training and validation datasets\n", "content_type = \"libsvm\"\n", "train_input = TrainingInput('file://training_data', content_type=content_type)\n", "validation_input = TrainingInput('file://validation_data', content_type=content_type)\n", "\n", "# execute the XGBoost training job\n", "estimator.fit({'train': train_input, 'validation': validation_input}, logs=['Training'])\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deploy unoptimized model" ] }, { "source": [ "Commentary:\n", "\n", "There are two interesting things to note here. The first of which is that although the training job was local, the model artifact was still set up to be stored in [Amazon S3](https://aws.amazon.com/s3/) upon job completion. The other peculiarity here is that we must create an `XGBoostModel` object and use its `deploy` method, rather than calling the `deploy` method of the estimator itself. This is due to the fact that we ran the training job in local mode, so the estimator is not aware of any “official” training job that is viewable in the SageMaker console and associable with the model artifact. Because of this, the estimator will error out if its own `deploy` method is used, and the `XGBoostModel` object must be constructed first instead. \n", "\n", "Notice also that we will be hosting the model on a c5 (compute-optimized) instance type. This instance will be particularly well suited for hosting the XGBoost model, since XGBoost by default runs on CPU and it’s a CPU-bound algorithm for inference (on the other hand, during training XGBoost is a memory bound algorithm). The c5.large instance type is also marginally cheaper to run in the us-east-1 region at $0.119 per hour compared to a t2.large at $0.1299 per hour.\n" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.xgboost.model import XGBoostModel\n", "\n", "# grab the model artifact that was written out by the local training job\n", "s3_model_artifact = estimator.latest_training_job.describe()['ModelArtifacts']['S3ModelArtifacts']\n", "\n", "# we have to switch from local mode to remote mode\n", "xgboost_model = XGBoostModel(\n", " model_data=s3_model_artifact,\n", " role=role,\n", " entry_point=\"entrypoint.py\",\n", " framework_version='1.2-1',\n", ")\n", "\n", "unoptimized_endpoint_name = 'unoptimized-c5'\n", "\n", "xgboost_model.deploy(\n", " initial_instance_count = 1, \n", " instance_type='ml.c5.large',\n", " endpoint_name=unoptimized_endpoint_name\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optimize model with SageMaker Neo" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "job_name = s3_model_artifact.split(\"/\")[-2]\n", "neo_model = xgboost_model.compile(\n", " target_instance_family=\"ml_c5\",\n", " role=role,\n", " input_shape =f'{{\"data\": [1, {X.shape[1]}]}}',\n", " output_path =f's3://{bucket}/neo-demo/{job_name}', # gets saved in bucket/neo-demo/model-ml_c5.tar.gz\n", " framework = \"xgboost\",\n", " job_name=job_name # what it shows up as in console\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deploy Neo model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "optimized_endpoint_name = 'neo-optimized-c5'\n", "\n", "neo_model.deploy(\n", " initial_instance_count = 1, \n", " instance_type='ml.c5.large',\n", " endpoint_name=optimized_endpoint_name\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Validate that endpoints are working" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "\n", "smr = boto3.client('sagemaker-runtime')\n", "\n", "resp = smr.invoke_endpoint(EndpointName='neo-optimized-c5', Body=b'2,0.675,0.55,0.175,1.689,0.694,0.371,0.474', ContentType='text/csv')\n", "print('neo-optimized model response: ', resp['Body'].read())\n", "resp = smr.invoke_endpoint(EndpointName='unoptimized-c5', Body=b'2,0.675,0.55,0.175,1.689,0.694,0.371,0.474', ContentType='text/csv')\n", "print('unoptimized model response: ', resp['Body'].read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create CloudWatch dashboard for monitoring performance" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "cw = boto3.client('cloudwatch')\n", "\n", "dashboard_name = 'NeoDemo'\n", "region = Session().boto_region_name # get region we're currently in\n", "\n", "body = {\n", " \"widgets\": [\n", " {\n", " \"type\": \"metric\",\n", " \"x\": 0,\n", " \"y\": 0,\n", " \"width\": 24,\n", " \"height\": 12,\n", " \"properties\": {\n", " \"metrics\": [\n", " [ \"AWS/SageMaker\", \"Invocations\", \"EndpointName\", optimized_endpoint_name, \"VariantName\", \"AllTraffic\", { \"stat\": \"Sum\", \"yAxis\": \"left\" } ],\n", " [ \"...\", unoptimized_endpoint_name, \".\", \".\", { \"stat\": \"Sum\", \"yAxis\": \"left\" } ],\n", " [ \".\", \"ModelLatency\", \".\", \".\", \".\", \".\" ],\n", " [ \"...\", optimized_endpoint_name, \".\", \".\" ],\n", " [ \"/aws/sagemaker/Endpoints\", \"CPUUtilization\", \".\", \".\", \".\", \".\", { \"yAxis\": \"right\" } ],\n", " [ \"...\", unoptimized_endpoint_name, \".\", \".\", { \"yAxis\": \"right\" } ]\n", " ],\n", " \"view\": \"timeSeries\",\n", " \"stacked\": False,\n", " \"region\": region,\n", " \"stat\": \"Average\",\n", " \"period\": 60,\n", " \"title\": \"Performance Metrics\",\n", " \"start\": \"-PT1H\",\n", " \"end\": \"P0D\"\n", " }\n", " }\n", " ]\n", "}\n", "\n", "cw.put_dashboard(DashboardName=dashboard_name, DashboardBody=json.dumps(body))\n", "\n", "print('link to dashboard:')\n", "print(f'https://console.aws.amazon.com/cloudwatch/home?region={region}#dashboards:name={dashboard_name}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Install node.js" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%conda install -c conda-forge nodejs " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Validate successful installation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!node --version" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Install Serverless framework and Serverless Artillery" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!npm install -g serverless@1.80.0 serverless-artillery@0.4.9" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Validate successful installations" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!serverless --version" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!slsart --version" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deploy Serverless Artillery" ] }, { "source": [ "Commentary:\n", "\n", "The most important file that makes up part of the load generating function under the `serverless_artillery` directory is `processor.js`, which is responsible for generating the payload body and signed headers of each request that gets sent to the SageMaker endpoints. Please take a moment to review the file’s contents. In it, you’ll see that we’re manually signing our requests using the AWS Signature Version 4 algorithm. When you use any AWS SDK like [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html), your requests are automatically signed for you by the library. Here, however, we are directly interacting with AWS’s SageMaker API endpoints, so we must sign requests ourselves. The access keys and session token of the load-generating lambda function’s role are used to sign the request, and the role is given permissions to invoke SageMaker endpoints in its role statements (defined in serverless.yml on line 18). When a request is sent, AWS will first validate the signed headers, then validate that the assumed role has permission to invoke endpoints, and then finally let the request from the Lambda to pass through. \n" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!cd serverless_artillery && npm install && slsart deploy --stage dev" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Serverless Artillery load test script" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.core.magic import register_line_cell_magic\n", "\n", "@register_line_cell_magic\n", "def writefilewithvariables(line, cell):\n", " with open(line, 'w') as f:\n", " f.write(cell.format(**globals()))\n", "\n", "# Get region that we're currently in\n", "region = Session().boto_region_name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%writefilewithvariables script.yaml\n", "\n", "config:\n", " variables:\n", " unoptimizedEndpointName: {unoptimized_endpoint_name} # the xgboost model has 10000 trees\n", " optimizedEndpointName: {optimized_endpoint_name} # the xgboost model has 10000 trees\n", " numRowsInRequest: 125 # Each request to the endpoint contains 125 rows \n", " target: 'https://runtime.sagemaker.{region}.amazonaws.com'\n", " phases:\n", " - duration: 120\n", " arrivalRate: 20 # 1200 total invocations per minute (600 per endpoint)\n", " - duration: 120\n", " arrivalRate: 40 # 2400 total invocations per minute (1200 per endpoint)\n", " - duration: 120\n", " arrivalRate: 60 # 3600 total invocations per minute (1800 per endpoint)\n", " - duration: 120\n", " arrivalRate: 80 # 4800 invocations per minute (2400 per endpoint... this is the max of the unoptimized endpoint)\n", " - duration: 120\n", " arrivalRate: 120 # only the neo endpoint can handle this load...\n", " - duration: 120\n", " arrivalRate: 160\n", " \n", " processor: './processor.js'\n", " \n", "scenarios:\n", " - flow:\n", " - post:\n", " url: '/endpoints/{{{{ unoptimizedEndpointName }}}}/invocations'\n", " beforeRequest: 'setRequest'\n", " - flow:\n", " - post:\n", " url: '/endpoints/{{{{ optimizedEndpointName }}}}/invocations'\n", " beforeRequest: 'setRequest'\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Perform load tests" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!slsart invoke --stage dev --path script.yaml" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Here's the link to the dashboard again:\")\n", "print(f'https://console.aws.amazon.com/cloudwatch/home?region={region}#dashboards:name={dashboard_name}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Clean up resources" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "# delete endpoints and endpoint configurations\n", "\n", "sm = boto3.client('sagemaker')\n", "\n", "for name in [unoptimized_endpoint_name, optimized_endpoint_name]:\n", " sm.delete_endpoint(EndpointName=name)\n", " sm.delete_endpoint_config(EndpointConfigName=name)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "# remove serverless artillery resources\n", "\n", "!slsart remove --stage dev\n" ] } ], "metadata": { "kernelspec": { "name": "python385jvsc74a57bd01abf8de982abe96e657c85b87b33ad381284ca2f54fc0f95d7e60df1292fbfaf", "display_name": "Python 3.8.5 64-bit ('.env': venv)" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" }, "metadata": { "interpreter": { "hash": "7e85ec9bf098c5427e45e2f632dcd4eeff803b007e1abd287d600879388709c1" } } }, "nbformat": 4, "nbformat_minor": 4 }