{ "cells": [ { "cell_type": "markdown", "id": "1b84c609", "metadata": {}, "source": [ "# Use SKLearn and Amazon SageMaker Clarify\n", "_**Run Amazon SageMaker Clarify processing after you have trained a model**_\n", "\n", "---\n", "\n", "Take introduction from here:\n", "\n", "https://github.com/aws/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/linear_learner_abalone/Linear_Learner_Regression_csv_format.ipynb\n", "\n", "## Contents\n", "1. [Introduction](#Introduction)\n", "2. [Setup](#Setup)\n", "    1. [Source the libraries](#Source-the-libraries)\n", "    2. [Set S3 bucket and data prefix](#Set-S3-bucket-and-data-prefix)\n", "    3. [Set role and global vars](#Set-role-and-global-vars)\n", "3. [Load the data](#Load-the-data)\n", "4. [Upload the data to S3](#Upload-the-data-to-S3)\n", "5. [Train a SKLearn estimator](#Train-a-SKLearn-estimator)\n", "6. [Amazon SageMaker Clarify](#Amazon-SageMaker-Clarify)\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "4e96395a", "metadata": {}, "source": [ "## Introduction\n", "\n", "This notebook demonstrates how to train a regression model with the Amazon SageMaker SKLearn estimator and how to explain its predictions with Amazon SageMaker Clarify.\n", "\n", "We use the [Abalone data](https://datahub.io/machine-learning/abalone), originally from the [UCI data repository](https://archive.ics.uci.edu/ml/datasets/abalone). More details about the original dataset can be found [here](https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.names).\n", "\n", "---\n", "## Setup\n", "\n", "This notebook was tested in an Amazon SageMaker notebook on an ml.t3.medium instance with the Python 3 (conda_python3) kernel.\n", "\n", "Let's start by setting up the following:\n", "1. The libraries to import\n", "2. The S3 bucket and prefix that you want to use for training and model data. This should be within the same region as the notebook instance, training, and hosting.\n", "3. The IAM role ARN used to give training and hosting access to your data. See the documentation for how to create these. 
Note: if more than one role is required for notebook instances, training, and/or hosting, replace the `get_execution_role()` call with the appropriate full IAM role ARN string(s).\n", "4. The global variables used later for training the model" ] }, { "cell_type": "markdown", "id": "ad4680fa", "metadata": {}, "source": [ "### Source the libraries" ] }, { "cell_type": "code", "execution_count": 1, "id": "ee42db9f", "metadata": {}, "outputs": [], "source": [ "import sagemaker\n", "from sagemaker import get_execution_role\n", "from sagemaker.sklearn.estimator import SKLearn\n", "import pandas as pd\n", "import numpy as np\n", "import urllib\n", "import boto3\n", "import json\n", "import os\n", "import time\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "d1c2010f", "metadata": {}, "source": [ "### Set S3 bucket and data prefix" ] }, { "cell_type": "code", "execution_count": 2, "id": "ccc37199", "metadata": {}, "outputs": [], "source": [ "# Specify where the training and validation data will be uploaded\n", "S3_BUCKET = 'sagemaker-clarify-demo' # YOUR_S3_BUCKET\n", "PREFIX = 'abalone-clarify-notebook'\n", "DATA_PREFIX = f'{PREFIX}/prepared_data'" ] }, { "cell_type": "markdown", "id": "e64455c5", "metadata": {}, "source": [ "### Set role and global vars" ] }, { "cell_type": "code", "execution_count": 3, "id": "f8fa69b3", "metadata": {}, "outputs": [], "source": [ "# Create the SageMaker session and get the execution role used for training and Clarify\n", "sagemaker_session = sagemaker.Session()\n", "region = sagemaker_session.boto_region_name\n", "role = get_execution_role()\n", "\n", "# Set the framework version, instance type and instance count\n", "framework_version = '0.23-1'\n", "instance_type = 'ml.m5.xlarge'\n", "instance_count = 1\n", "\n", "# Set your code folder and entry point script\n", "source_dir = 'model/'\n", "entry_point = 'predictor.py'" ] }, { "cell_type": "markdown", "id": "646a34e6", "metadata": {}, "source": [ "## Load the data" ] }, { "cell_type": "markdown", "id": 
"aa0f8d52", "metadata": {}, "source": [ "Read the dataset from the public DataHub URL and rename the target column to `Rings`" ] }, { "cell_type": "code", "execution_count": 4, "id": "98d2b703", "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv('https://datahub.io/machine-learning/abalone/r/abalone.csv')\n", "cols = [x if \"rings\" not in x else \"Rings\" for x in df.columns]\n", "df.columns = cols" ] }, { "cell_type": "code", "execution_count": 5, "id": "eba9d64f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>Sex</th><th>Length</th><th>Diameter</th><th>Height</th><th>Whole_weight</th><th>Shucked_weight</th><th>Viscera_weight</th><th>Shell_weight</th><th>Rings</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>M</td><td>0.455</td><td>0.365</td><td>0.095</td><td>0.5140</td><td>0.2245</td><td>0.1010</td><td>0.150</td><td>15</td></tr>\n",
"<tr><th>1</th><td>M</td><td>0.350</td><td>0.265</td><td>0.090</td><td>0.2255</td><td>0.0995</td><td>0.0485</td><td>0.070</td><td>7</td></tr>\n",
"<tr><th>2</th><td>F</td><td>0.530</td><td>0.420</td><td>0.135</td><td>0.6770</td><td>0.2565</td><td>0.1415</td><td>0.210</td><td>9</td></tr>\n",
"<tr><th>3</th><td>M</td><td>0.440</td><td>0.365</td><td>0.125</td><td>0.5160</td><td>0.2155</td><td>0.1140</td><td>0.155</td><td>10</td></tr>\n",
"<tr><th>4</th><td>I</td><td>0.330</td><td>0.255</td><td>0.080</td><td>0.2050</td><td>0.0895</td><td>0.0395</td><td>0.055</td><td>7</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "  Sex  Length  Diameter  Height  Whole_weight  Shucked_weight  Viscera_weight  \\\n", "0   M   0.455     0.365   0.095        0.5140          0.2245          0.1010   \n", "1   M   0.350     0.265   0.090        0.2255          0.0995          0.0485   \n", "2   F   0.530     0.420   0.135        0.6770          0.2565          0.1415   \n", "3   M   0.440     0.365   0.125        0.5160          0.2155          0.1140   \n", "4   I   0.330     0.255   0.080        0.2050          0.0895          0.0395   \n", "\n", "   Shell_weight  Rings  \n", "0         0.150     15  \n", "1         0.070      7  \n", "2         0.210      9  \n", "3         0.155     10  \n", "4         0.055      7  " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "id": "ec695dc8", "metadata": {}, "source": [ "Define the target column of the dataset for later reference, including in Clarify" ] }, { "cell_type": "code", "execution_count": 6, "id": "4dca11ad", "metadata": {}, "outputs": [], "source": [ "target_col = 'Rings'" ] }, { "cell_type": "code", "execution_count": 7, "id": "2b246751", "metadata": {}, "outputs": [], "source": [ "# X: input features - this is what your algorithm takes for learning\n", "# y: target - this is what your algorithm will predict\n", "X = df.drop(target_col, axis=1)\n", "y = df[target_col]" ] }, { "cell_type": "markdown", "id": "2d808ef6", "metadata": {}, "source": [ "Use the `train_test_split` function to split your data. Don't forget to set a `random_state` to ensure reproducibility!" 
] }, { "cell_type": "code", "execution_count": 8, "id": "6fc7ddb0", "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "# Split the data into 70% training and 30% validation data\n", "X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=42)" ] }, { "cell_type": "markdown", "id": "8376786e", "metadata": {}, "source": [ "SageMaker algorithms and features require the target column to come first" ] }, { "cell_type": "code", "execution_count": 9, "id": "b6810676", "metadata": {}, "outputs": [], "source": [ "train = pd.concat([y_train, X_train], axis=1)\n", "val = pd.concat([y_val, X_val], axis=1)" ] }, { "cell_type": "markdown", "id": "89e46f5a", "metadata": {}, "source": [ "## Upload the data to S3\n", "The data must reside in S3 so that the SKLearn container and Clarify can consume it." ] }, { "cell_type": "markdown", "id": "19986c69", "metadata": {}, "source": [ "Save your data locally first: the training data, the validation data, and a baseline for Clarify (the mode of `Sex` and the mean of each numeric feature, computed on the training data)" ] }, { "cell_type": "code", "execution_count": 15, "id": "5f9abcc9", "metadata": {}, "outputs": [], "source": [ "train.to_csv('train_data.csv', index=False) # training data\n", "val.to_csv('val_data.csv', index=False) # validation data\n", "baseline = train.agg({'Sex': 'mode',\n", "                      'Length': 'mean',\n", "                      'Diameter': 'mean',\n", "                      'Height': 'mean',\n", "                      'Whole_weight': 'mean',\n", "                      'Shucked_weight': 'mean',\n", "                      'Viscera_weight': 'mean',\n", "                      'Shell_weight': 'mean'}) # baseline for SageMaker Clarify: features only, no target\n", "baseline.to_csv('baseline.csv', index=False, header=None)" ] }, { "cell_type": "code", "execution_count": 22, "id": "a9734364", "metadata": {}, "outputs": [], "source": [ "# Write the columns to be one-hot encoded and the column names (in the same order as in the training data) to a config file\n", "# This config file will be read during training and prediction 
to one-hot encode the columns\n", "config_data = {\n", "    'one_hot_encoding': ['Sex'],\n", "    'numeric': ['Length', 'Diameter', 'Height', 'Whole_weight', 'Shucked_weight', 'Viscera_weight', 'Shell_weight'],\n", "    'header': train.columns.tolist()\n", "}\n", "with open('config_data.json', 'w') as outfile:\n", "    json.dump(config_data, outfile)" ] }, { "cell_type": "markdown", "id": "3073ee90", "metadata": {}, "source": [ "Upload the data to S3 so it can be consumed by the SageMaker training job and Clarify" ] }, { "cell_type": "code", "execution_count": 23, "id": "78689eb1", "metadata": {}, "outputs": [], "source": [ "from sagemaker.s3 import S3Uploader\n", "from sagemaker.inputs import TrainingInput\n", "train_uri = S3Uploader.upload('train_data.csv', f's3://{S3_BUCKET}/{DATA_PREFIX}')\n", "val_uri = S3Uploader.upload('val_data.csv', f's3://{S3_BUCKET}/{DATA_PREFIX}')\n", "baseline_uri = S3Uploader.upload('baseline.csv', f's3://{S3_BUCKET}/{DATA_PREFIX}')\n", "config_uri = S3Uploader.upload('config_data.json', f's3://{S3_BUCKET}/{DATA_PREFIX}')" ] }, { "cell_type": "code", "execution_count": 24, "id": "d2ce6d41", "metadata": {}, "outputs": [], "source": [ "# Set the training inputs for the SKLearn estimator (not necessary but recommended)\n", "train_data = TrainingInput(train_uri, content_type='csv')\n", "validation_data = TrainingInput(val_uri, content_type='csv')\n", "config_data = TrainingInput(config_uri, content_type='json')" ] }, { "cell_type": "markdown", "id": "a14f4d15", "metadata": {}, "source": [ "## Train a SKLearn estimator\n", "\n", "Define an estimator to create a model for which Clarify provides explainability. This could be any estimator from the built-in algorithms (https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html) or a framework estimator such as SKLearn.\n", "\n", "The model can be either a classification or a regression model. 
\n", "\n", "As an example, the following uses the SKLearn estimator to train a RandomForestRegressor on the data. See the file model/predictor.py for the entry point script that the estimator runs.\n", "\n", "For more information about the SKLearn container, see: https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/sagemaker.sklearn.html#scikit-learn-estimator" ] }, { "cell_type": "code", "execution_count": 25, "id": "877ec42d", "metadata": {}, "outputs": [], "source": [ "# Set the hyperparameters for this RandomForestRegressor. In this simple example only the hyperparameters below are set\n", "hyperparameters = {\n", "    'n_estimators': 100,\n", "    'max_depth': 10,\n", "    'max_features': 'sqrt',\n", "    'random_state': 42\n", "}\n", "sklearn = SKLearn(\n", "    entry_point=entry_point,\n", "    source_dir=source_dir,\n", "    hyperparameters=hyperparameters,\n", "    role=role,\n", "    instance_count=instance_count,\n", "    instance_type=instance_type,\n", "    framework_version=framework_version,\n", "    sagemaker_session=sagemaker_session,\n", "    #code_location=f's3://{S3_BUCKET}/{PREFIX}/model/',\n", "    output_path=f's3://{S3_BUCKET}/{PREFIX}/model/',\n", "    enable_sagemaker_metrics=True,\n", "    metric_definitions=[\n", "        {\n", "            'Name': 'train:mae',\n", "            'Regex': 'Train_mae=(.*?);'\n", "        },\n", "        {\n", "            'Name': 'validation:mae',\n", "            'Regex': 'Validation_mae=(.*?);'\n", "        }\n", "    ])" ] }, { "cell_type": "code", "execution_count": 26, "id": "d4c6d0e4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2021-05-06 16:46:00 Starting - Starting the training job...\n", "2021-05-06 16:46:24 Starting - Launching requested ML instancesProfilerReport-1620319560: InProgress\n", "......\n", "2021-05-06 16:47:24 Starting - Preparing the instances for training......\n", "2021-05-06 16:48:25 Downloading - Downloading input data\n", "2021-05-06 16:48:25 Training - Downloading the training 
image.....\u001b[34m2021-05-06 16:49:12,230 sagemaker-containers INFO Imported framework sagemaker_sklearn_container.training\u001b[0m\n", "\u001b[34m2021-05-06 16:49:12,233 sagemaker-training-toolkit INFO No GPUs detected (normal if no gpus installed)\u001b[0m\n", "\u001b[34m2021-05-06 16:49:12,248 sagemaker_sklearn_container.training INFO Invoking user training script.\u001b[0m\n", "\u001b[34m2021-05-06 16:49:12,546 sagemaker-training-toolkit INFO No GPUs detected (normal if no gpus installed)\u001b[0m\n", "\u001b[34m2021-05-06 16:49:13,167 sagemaker-training-toolkit INFO No GPUs detected (normal if no gpus installed)\u001b[0m\n", "\u001b[34m2021-05-06 16:49:13,178 sagemaker-training-toolkit INFO No GPUs detected (normal if no gpus installed)\u001b[0m\n", "\u001b[34m2021-05-06 16:49:13,187 sagemaker-training-toolkit INFO Invoking user script\n", "\u001b[0m\n", "\u001b[34mTraining Env:\n", "\u001b[0m\n", "\u001b[34m{\n", " \"additional_framework_parameters\": {},\n", " \"channel_input_dirs\": {\n", " \"config\": \"/opt/ml/input/data/config\",\n", " \"validation\": \"/opt/ml/input/data/validation\",\n", " \"train\": \"/opt/ml/input/data/train\"\n", " },\n", " \"current_host\": \"algo-1\",\n", " \"framework_module\": \"sagemaker_sklearn_container.training:main\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ],\n", " \"hyperparameters\": {\n", " \"max_features\": \"sqrt\",\n", " \"max_depth\": 10,\n", " \"n_estimators\": 100,\n", " \"random_state\": 42\n", " },\n", " \"input_config_dir\": \"/opt/ml/input/config\",\n", " \"input_data_config\": {\n", " \"config\": {\n", " \"ContentType\": \"json\",\n", " \"TrainingInputMode\": \"File\",\n", " \"S3DistributionType\": \"FullyReplicated\",\n", " \"RecordWrapperType\": \"None\"\n", " },\n", " \"validation\": {\n", " \"ContentType\": \"csv\",\n", " \"TrainingInputMode\": \"File\",\n", " \"S3DistributionType\": \"FullyReplicated\",\n", " \"RecordWrapperType\": \"None\"\n", " },\n", " \"train\": {\n", " \"ContentType\": 
\"csv\",\n", " \"TrainingInputMode\": \"File\",\n", " \"S3DistributionType\": \"FullyReplicated\",\n", " \"RecordWrapperType\": \"None\"\n", " }\n", " },\n", " \"input_dir\": \"/opt/ml/input\",\n", " \"is_master\": true,\n", " \"job_name\": \"sagemaker-scikit-learn-2021-05-06-16-46-00-049\",\n", " \"log_level\": 20,\n", " \"master_hostname\": \"algo-1\",\n", " \"model_dir\": \"/opt/ml/model\",\n", " \"module_dir\": \"s3://sagemaker-clarify-demo-sabina/sagemaker-scikit-learn-2021-05-06-16-46-00-049/source/sourcedir.tar.gz\",\n", " \"module_name\": \"predictor\",\n", " \"network_interface_name\": \"eth0\",\n", " \"num_cpus\": 4,\n", " \"num_gpus\": 0,\n", " \"output_data_dir\": \"/opt/ml/output/data\",\n", " \"output_dir\": \"/opt/ml/output\",\n", " \"output_intermediate_dir\": \"/opt/ml/output/intermediate\",\n", " \"resource_config\": {\n", " \"current_host\": \"algo-1\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ],\n", " \"network_interface_name\": \"eth0\"\n", " },\n", " \"user_entry_point\": \"predictor.py\"\u001b[0m\n", "\u001b[34m}\n", "\u001b[0m\n", "\u001b[34mEnvironment variables:\n", "\u001b[0m\n", "\u001b[34mSM_HOSTS=[\"algo-1\"]\u001b[0m\n", "\u001b[34mSM_NETWORK_INTERFACE_NAME=eth0\u001b[0m\n", "\u001b[34mSM_HPS={\"max_depth\":10,\"max_features\":\"sqrt\",\"n_estimators\":100,\"random_state\":42}\u001b[0m\n", "\u001b[34mSM_USER_ENTRY_POINT=predictor.py\u001b[0m\n", "\u001b[34mSM_FRAMEWORK_PARAMS={}\u001b[0m\n", "\u001b[34mSM_RESOURCE_CONFIG={\"current_host\":\"algo-1\",\"hosts\":[\"algo-1\"],\"network_interface_name\":\"eth0\"}\u001b[0m\n", 
"\u001b[34mSM_INPUT_DATA_CONFIG={\"config\":{\"ContentType\":\"json\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"train\":{\"ContentType\":\"csv\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"validation\":{\"ContentType\":\"csv\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}}\u001b[0m\n", "\u001b[34mSM_OUTPUT_DATA_DIR=/opt/ml/output/data\u001b[0m\n", "\u001b[34mSM_CHANNELS=[\"config\",\"train\",\"validation\"]\u001b[0m\n", "\u001b[34mSM_CURRENT_HOST=algo-1\u001b[0m\n", "\u001b[34mSM_MODULE_NAME=predictor\u001b[0m\n", "\u001b[34mSM_LOG_LEVEL=20\u001b[0m\n", "\u001b[34mSM_FRAMEWORK_MODULE=sagemaker_sklearn_container.training:main\u001b[0m\n", "\u001b[34mSM_INPUT_DIR=/opt/ml/input\u001b[0m\n", "\u001b[34mSM_INPUT_CONFIG_DIR=/opt/ml/input/config\u001b[0m\n", "\u001b[34mSM_OUTPUT_DIR=/opt/ml/output\u001b[0m\n", "\u001b[34mSM_NUM_CPUS=4\u001b[0m\n", "\u001b[34mSM_NUM_GPUS=0\u001b[0m\n", "\u001b[34mSM_MODEL_DIR=/opt/ml/model\u001b[0m\n", "\u001b[34mSM_MODULE_DIR=s3://sagemaker-clarify-demo-sabina/sagemaker-scikit-learn-2021-05-06-16-46-00-049/source/sourcedir.tar.gz\u001b[0m\n", 
"\u001b[34mSM_TRAINING_ENV={\"additional_framework_parameters\":{},\"channel_input_dirs\":{\"config\":\"/opt/ml/input/data/config\",\"train\":\"/opt/ml/input/data/train\",\"validation\":\"/opt/ml/input/data/validation\"},\"current_host\":\"algo-1\",\"framework_module\":\"sagemaker_sklearn_container.training:main\",\"hosts\":[\"algo-1\"],\"hyperparameters\":{\"max_depth\":10,\"max_features\":\"sqrt\",\"n_estimators\":100,\"random_state\":42},\"input_config_dir\":\"/opt/ml/input/config\",\"input_data_config\":{\"config\":{\"ContentType\":\"json\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"train\":{\"ContentType\":\"csv\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"validation\":{\"ContentType\":\"csv\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}},\"input_dir\":\"/opt/ml/input\",\"is_master\":true,\"job_name\":\"sagemaker-scikit-learn-2021-05-06-16-46-00-049\",\"log_level\":20,\"master_hostname\":\"algo-1\",\"model_dir\":\"/opt/ml/model\",\"module_dir\":\"s3://sagemaker-clarify-demo-sabina/sagemaker-scikit-learn-2021-05-06-16-46-00-049/source/sourcedir.tar.gz\",\"module_name\":\"predictor\",\"network_interface_name\":\"eth0\",\"num_cpus\":4,\"num_gpus\":0,\"output_data_dir\":\"/opt/ml/output/data\",\"output_dir\":\"/opt/ml/output\",\"output_intermediate_dir\":\"/opt/ml/output/intermediate\",\"resource_config\":{\"current_host\":\"algo-1\",\"hosts\":[\"algo-1\"],\"network_interface_name\":\"eth0\"},\"user_entry_point\":\"predictor.py\"}\u001b[0m\n", "\u001b[34mSM_USER_ARGS=[\"--max_depth\",\"10\",\"--max_features\",\"sqrt\",\"--n_estimators\",\"100\",\"--random_state\",\"42\"]\u001b[0m\n", "\u001b[34mSM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate\u001b[0m\n", "\u001b[34mSM_CHANNEL_CONFIG=/opt/ml/input/data/config\u001b[0m\n", 
"\u001b[34mSM_CHANNEL_VALIDATION=/opt/ml/input/data/validation\u001b[0m\n", "\u001b[34mSM_CHANNEL_TRAIN=/opt/ml/input/data/train\u001b[0m\n", "\u001b[34mSM_HP_MAX_FEATURES=sqrt\u001b[0m\n", "\u001b[34mSM_HP_MAX_DEPTH=10\u001b[0m\n", "\u001b[34mSM_HP_N_ESTIMATORS=100\u001b[0m\n", "\u001b[34mSM_HP_RANDOM_STATE=42\u001b[0m\n", "\u001b[34mPYTHONPATH=/opt/ml/code:/miniconda3/bin:/miniconda3/lib/python37.zip:/miniconda3/lib/python3.7:/miniconda3/lib/python3.7/lib-dynload:/miniconda3/lib/python3.7/site-packages\n", "\u001b[0m\n", "\u001b[34mInvoking script with the following command:\n", "\u001b[0m\n", "\u001b[34m/miniconda3/bin/python predictor.py --max_depth 10 --max_features sqrt --n_estimators 100 --random_state 42\n", "\n", "\u001b[0m\n", "\u001b[34mTrain_mae=1.0292093132963624;\u001b[0m\n", "\u001b[34mValidation_mae=1.5265188347826182;\u001b[0m\n", "\u001b[34mSaved model!\u001b[0m\n", "\u001b[34m2021-05-06 16:49:14,854 sagemaker-containers INFO Reporting training SUCCESS\u001b[0m\n", "\n", "2021-05-06 16:49:26 Uploading - Uploading generated training model\n", "2021-05-06 16:49:26 Completed - Training job completed\n", "Training seconds: 81\n", "Billable seconds: 81\n" ] } ], "source": [ "# Train the model\n", "sklearn.fit({\n", " 'train': train_data,\n", " 'validation': validation_data,\n", " 'config': config_data}, wait=True)" ] }, { "cell_type": "markdown", "id": "5c64cb72", "metadata": {}, "source": [ "### Create a SageMaker model for Amazon SageMaker Clarify\n", "Create an SageMaker model from the SKLearn estimator" ] }, { "cell_type": "code", "execution_count": 27, "id": "11148a8a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'sagemaker-sklearn-clarify-2021-05-06-17-33-47'" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from time import gmtime, strftime\n", "\n", "# generate time stamp\n", "uid = strftime('%Y-%m-%d-%H-%M-%S', gmtime())\n", "\n", "# Create model with model name\n", "model_name = 
'sagemaker-sklearn-clarify-{}'.format(uid)\n", "model = sklearn.create_model(name=model_name)\n", "container_def = model.prepare_container_def()\n", "sagemaker_session.create_model(model.name, role, container_def)" ] }, { "cell_type": "markdown", "id": "d2d528ff", "metadata": {}, "source": [ "## Amazon SageMaker Clarify\n", "Apply Clarify to the trained SKLearn model to explain its predictions" ] }, { "cell_type": "markdown", "id": "919d01f2", "metadata": {}, "source": [ "The next section is taken from\n", "\n", "https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_processing/fairness_and_explainability/fairness_and_explainability.html\n", "\n", "and adapted to this use case.\n", "\n", "The following steps are necessary to explain predictions:\n", "1. Create a Clarify processor, which will generate the explanations\n", "1. Configure the processor with the input data, the output location, and detailed information about the dataset columns\n", "1. Configure the processor with the model whose predictions are to be explained\n", "1. Configure the processor to use SHAP values to explain predictions, and configure the SHAP value computation\n", "1. Explain the predictions" ] }, { "cell_type": "code", "execution_count": 28, "id": "01442392", "metadata": {}, "outputs": [], "source": [ "from sagemaker import clarify\n", "clarify_processor = clarify.SageMakerClarifyProcessor(role=role,\n", " instance_count=1,\n", " instance_type='ml.m5.xlarge',\n", " sagemaker_session=sagemaker_session)" ] }, { "cell_type": "markdown", "id": "1a5e1464", "metadata": {}, "source": [ "A `DataConfig` object communicates some basic information about data I/O to SageMaker Clarify. We specify where to find the input dataset, where to store the output, the target column (label), the header names, and the dataset type." 
] }, { "cell_type": "code", "execution_count": 29, "id": "8b1089eb", "metadata": {}, "outputs": [], "source": [ "explainability_output_path = f's3://{S3_BUCKET}/{PREFIX}/clarify-explainability'\n", "explainability_data_config = clarify.DataConfig(s3_data_input_path=train_uri,\n", " s3_output_path=explainability_output_path,\n", " label=target_col, # our target column\n", " headers=train.columns.tolist(), # the columns of the training DataFrame\n", " dataset_type='text/csv')" ] }, { "cell_type": "markdown", "id": "9b3d331d", "metadata": {}, "source": [ "A `ModelConfig` object communicates information about your trained model. To avoid additional traffic to your production models, SageMaker Clarify sets up and tears down a dedicated endpoint when processing.\n", "\n", "`accept_type` denotes the endpoint response payload format, and `content_type` denotes the payload format of the request to the endpoint." ] }, { "cell_type": "code", "execution_count": 30, "id": "32f33033", "metadata": {}, "outputs": [], "source": [ "model_config = clarify.ModelConfig(model_name=model_name,\n", " instance_type='ml.m5.xlarge',\n", " instance_count=1,\n", " accept_type='text/csv',\n", " content_type='text/csv')" ] }, { "cell_type": "markdown", "id": "ed6d591a", "metadata": {}, "source": [ "The Kernel SHAP algorithm requires a baseline (also known as a background dataset). The baseline dataset type must be the same as the `dataset_type` of `DataConfig`, and baseline samples must include only features. The baseline is either an S3 URI to the baseline dataset file or an in-place list of samples. In this case we chose the former and pass the S3 URI of the `baseline.csv` file uploaded earlier." 
] }, { "cell_type": "code", "execution_count": 31, "id": "18516a83", "metadata": {}, "outputs": [], "source": [ "shap_config = clarify.SHAPConfig(baseline=baseline_uri,\n", " num_samples=20,\n", " agg_method='mean_abs',\n", " save_local_shap_values=True)" ] }, { "cell_type": "code", "execution_count": 32, "id": "bd2adc05", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Job Name: Clarify-Explainability-2021-05-06-17-33-51-824\n", "Inputs: [{'InputName': 'dataset', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-clarify-demo-sabina/abalone-clarify-notebook/prepared_data/train_data.csv', 'LocalPath': '/opt/ml/processing/input/data', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'analysis_config', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-clarify-demo-sabina/abalone-clarify-notebook/clarify-explainability/analysis_config.json', 'LocalPath': '/opt/ml/processing/input/config', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]\n", "Outputs: [{'OutputName': 'analysis_result', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-clarify-demo-sabina/abalone-clarify-notebook/clarify-explainability', 'LocalPath': '/opt/ml/processing/output', 'S3UploadMode': 'EndOfJob'}}]\n", ".............................\u001b[34mINFO:sagemaker-clarify-processing:Starting SageMaker Clarify Processing job\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Analysis config path: /opt/ml/processing/input/config\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Analysis result path: /opt/ml/processing/output\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:This host is algo-1.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:This host is the leader.\u001b[0m\n", 
"\u001b[34mINFO:analyzer.data_loading.data_loader_util:Number of hosts in the cluster is 1.\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Running Python / Pandas based analyzer.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Dataset uri: /opt/ml/processing/input/data\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Dataset type: text/csv\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Spinning up shadow endpoint\u001b[0m\n", "\u001b[34mINFO:sagemaker:Creating endpoint-config with name sagemaker-clarify-endpoint-config-1620322708-5ab6\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Creating endpoint sagemaker-clarify-endpoint-1620322708-7850\u001b[0m\n", "\u001b[34mINFO:explainers.shap.kernel_shap:SHAP n_samples 20\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Checking endpoint status\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Endpoint is in service after 421 seconds\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:=====================================================\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:Shap analyzer: explaining 2923 rows, 8 columns...\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:=====================================================\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Prediction batch size is initialized with 139810\u001b[0m\n", "\u001b[34mN/A% (0 of 2923) | | Elapsed Time: 0:00:00 ETA: --:--:--\u001b[0m\n", "\u001b[34m 8% (251 of 2923) |# | Elapsed Time: 0:00:30 ETA: 0:05:20\u001b[0m\n", "\u001b[34m 16% (486 of 2923) |### | Elapsed Time: 0:01:00 ETA: 0:05:11\u001b[0m\n", "\u001b[34m 25% (741 of 2923) |##### | Elapsed Time: 0:01:30 ETA: 0:04:16\u001b[0m\n", "\u001b[34m 34% (997 of 2923) |####### | Elapsed Time: 0:02:00 ETA: 0:03:46\u001b[0m\n", "\u001b[34m 42% (1253 of 2923) |######## | Elapsed Time: 0:02:30 ETA: 0:03:16\u001b[0m\n", "\u001b[34m 51% (1510 of 2923) |########## | Elapsed Time: 0:03:00 ETA: 0:02:45\u001b[0m\n", "\u001b[34m 60% (1767 of 
2923) |############ | Elapsed Time: 0:03:30 ETA: 0:02:15\u001b[0m\n", "\u001b[34m 69% (2024 of 2923) |############# | Elapsed Time: 0:04:00 ETA: 0:01:45\u001b[0m\n", "\u001b[34m 78% (2281 of 2923) |############### | Elapsed Time: 0:04:30 ETA: 0:01:15\u001b[0m\n", "\u001b[34m 86% (2538 of 2923) |################# | Elapsed Time: 0:05:00 ETA: 0:00:45\u001b[0m\n", "\u001b[34m 95% (2795 of 2923) |################### | Elapsed Time: 0:05:30 ETA: 0:00:14\u001b[0m\n", "\u001b[34m100% (2923 of 2923) |####################| Elapsed Time: 0:05:45 Time: 0:05:45\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:getting explanations took 345.97 seconds.\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:===================================================\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:converting explanations to tabular took 0.19 seconds.\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:===================================================\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:Wrote 2923 local explanations to: /opt/ml/processing/output/explanations_shap/out.csv, baseline used compute them to: /opt/ml/processing/output/explanations_shap/baseline.csv\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:writing local explanations took 0.04 seconds.\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:===================================================\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:aggregating local explanations took 0.00 seconds.\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:===================================================\u001b[0m\n", "\u001b[34mINFO:analyzer.shap_analyzer:Shap analysis finished.\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Stop using endpoint: sagemaker-clarify-endpoint-1620322708-7850\u001b[0m\n", "\u001b[34mINFO:sagemaker:Deleting endpoint configuration with name: sagemaker-clarify-endpoint-config-1620322708-5ab6\u001b[0m\n", "\u001b[34mINFO:sagemaker:Deleting endpoint with name: 
sagemaker-clarify-endpoint-1620322708-7850\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Model endpoint delivered 8.44052 requests per second and a total of 2924 requests over 346 seconds\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Delivered 2924 predict calls with a total of 61384 examples.\u001b[0m\n", "\u001b[34m[NbConvertApp] Converting notebook /opt/ml/processing/output/report.ipynb to html\u001b[0m\n", "\u001b[34m[NbConvertApp] Writing 307013 bytes to /opt/ml/processing/output/report.html\u001b[0m\n", "\u001b[34mINFO:analyzer.report:HTML report '/opt/ml/processing/output/report.html' generated successfully.\u001b[0m\n", "\u001b[34mINFO:analyzer.report:PDF report '/opt/ml/processing/output/report.pdf' generated successfully.\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Collected analyses: \u001b[0m\n", "\u001b[34m{\n", " \"version\": \"1.0\",\n", " \"explanations\": {\n", " \"kernel_shap\": {\n", " \"label0\": {\n", " \"global_shap_values\": {\n", " \"Sex\": 0.48190629356040343,\n", " \"Length\": 0.4746253099340384,\n", " \"Diameter\": 0.5857106212129074,\n", " \"Height\": 0.6034382366520307,\n", " \"Whole_weight\": 0.7601108411731745,\n", " \"Shucked_weight\": 1.5231065213913553,\n", " \"Viscera_weight\": 0.4879189335991356,\n", " \"Shell_weight\": 1.642456032363892\n", " },\n", " \"expected_value\": 10.37101383667798\n", " }\n", " }\n", " }\u001b[0m\n", "\u001b[34m}\u001b[0m\n", "\u001b[34mINFO:analyzer.system_util:exit_message: Completed: SageMaker XAI Analyzer ran successfully\u001b[0m\n", "\u001b[34m-------!\u001b[0m\n", "\n", "Time taken to run explainability: 17.95\n" ] } ], "source": [ "s = time.time()\n", "clarify_processor.run_explainability(data_config=explainability_data_config,\n", " model_config=model_config,\n", " explainability_config=shap_config)\n", "print(f\"Time taken to run explainability: {(time.time()-s)/60:.2f}\")" ] }, { "cell_type": "markdown", "id": "b845f737", "metadata": {}, "source": [ "### View explained 
predictions" ] }, { "cell_type": "code", "execution_count": 34, "id": "ade84a0e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(2923, 8)\n" ] }, { "data": { "text/plain": [ "Shell_weight 1227\n", "Shucked_weight 1199\n", "Diameter 131\n", "Whole_weight 127\n", "Height 99\n", "Sex 53\n", "Viscera_weight 47\n", "Length 40\n", "dtype: int64" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Load the local (per-record) SHAP values written by the Clarify job\n", "local_explanations_out = pd.read_csv(explainability_output_path+'/explanations_shap/out.csv')\n", "print(local_explanations_out.shape)\n", "# Strip the \"_label0\" suffix so the columns match the feature names\n", "feature_names = [c.replace(\"_label0\", \"\") for c in local_explanations_out.columns]\n", "local_explanations_out.columns = feature_names\n", "# Count how often each feature has the largest absolute SHAP value for a prediction\n", "local_explanations_out.abs().idxmax(axis=1).value_counts()" ] }, { "cell_type": "code", "execution_count": 35, "id": "a6a1b081", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Example number: 100\n", "\n", "Feature values -- Label: Rings 9\n", "Sex M\n", "Length 0.63\n", "Diameter 0.505\n", "Height 0.17\n", "Whole_weight 1.0915\n", "Shucked_weight 0.4615\n", "Viscera_weight 0.266\n", "Shell_weight 0.3\n", "Name: 1694, dtype: object\n" ] }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAmIAAAHzCAYAAACDhHGdAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAz4UlEQVR4nO3deZhkZX3+//fNDMguoiOyDSiBGFAhOOK+IoogwS2K2xeNiiYaxSWKmkRi9CcxiUvccFSUGJeoiKASAYmCSwjMIAqoCCIKQtiURVER/Pz+OKehpqdmunu6Z57qqffruurqOlvVp86c6br7Oc95TqoKSZIkrXsbtC5AkiRpXBnEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmDQHkjwqyeXr+D13TlJJFq7L9+3f+9lJTllLr/2XSa5K8qskd10Lr99sv7UwXz5vkiOT/EfrOqR1zSCm9VaSS5M8tnUd892wL/Kq+kRVPW4tvNeGwDuAx1XV5lV13Ry8pseBSPLoJF9LckOSS4cs37lffnOSH04+ZpI8K8lPk/w6yReSbL3Oitd6zSAmaZRsA2wMXDDTDdPxd5pYRevfr4FjgL9ZxWafAr4D3BV4I/C5JIv619sD+CDwXLpj9Gbg/XNctsaUv7Q0dpLcKcm7klzRP96V5E4Dyw9Ocm6SG5P8OMn+/fznJ/lBkpuSXJLkxTN4z3snOTXJL5JcmOTp/fxd+nl799PbJbk2yaP66a8neVuSs/q/5E9Y1V/iq6tv4tRpklcnuTrJlUmeP7D8wCTf6T/zZUmOHHjpM/qf1/enCx+c5HlJvjmw/UOSnN3XeHaShwws+3qSf0zyrb62U5LcbUj9uwEXDrzXf0/ztd+a5Ft0X473mvSaHwcWA1/sa3/twOJnJ/lZv7/fOLDNBkmO6P/tr0vymdW1fiR5Yn+8XJ/k20nu189/Rv/vsGU//YQk/zfw5f7ufl/fmGR5kocPvOaRST6b5D/6fXZekt2SvL7/97ssyeMG1p/JcXLnJB/pj4GfJ3lLkgWrWPfI/vP/e1/HBUmWDCyvJH80MP2xJG/pn08cc68dOOaelOSAJD/qj/s3THrLjZP8Z/9e5yTZc+C1t0tyXJJrkvwkycsn1fm5fn/dCDxv8mepqrOq6uPAJUM+527A3sCbquo3VXUccB7w1H6VZwNfrKozqupXwN8BT0myxbD9Js1IVfnwsV4+gEuBxw6Z/2bgTODuwCLg28A/9sv2AW4A9qP7Q2V74N79sgOBXYAAj6T74t+7X/Yo4PJV1LEZcBnwfGAh3S/8a4E9+uUvAn4AbAqcDPzLwLZfB34O3Kd/neOA/+iX7QwUsHCa9d3af/YNgQP65XcZWH7f/jPfD7gKeNKw9+nnPQ/4Zv98a+CXdK0FC4Fn9tN3HfgMPwZ2Azbpp49axb6a/Jmm89o/A/bol2841XEw8B4f6uvZE/gd8Cf98sPpjo8dgDvRtYR8ahX17g1cDTwQWAAc2r/fnfrlnwA+RtfKcgXwxIFtn9PPXwi8Gvg/YON+2ZHAb4HH98v/HfgJXUvNhnTHzE/W8Dj5Qv+ZNqP7P3AW8OJVfL6JOg7oP9/bgDMHlhfwRwPTHwPeMumY+/uBmq8BPgls0f+b/Ra418B7/R54Wr/+a/rPvCHdcbm8f62N6AL3JcDjJ237pH7dTVbze+GxwKWT5j0Z+MGkee8F3tM/PwF43aTlvwLu3/r3nI/5/2hegA8fa+vBqoPYj4EDBqYfP/GLuf+Ceuc0X/8LwCv6549i1UHsGcA3Js37IN1f3xPTJ9L9Bf49+i/xfv7XGQgtwO7ALf2X4gpfsNOo7zesGKauBh60im3fNbEfhr0PKwax5wJnTdr+f4DnDXyGvx1Y9lfAV1bxviu81zRf+80zOQ4G3mOHgXlnAYf0z38A7DuwbFu6L/mV9jPwAfoQPzDvQuCR/fOt6ILiecAHp6jzl8Ce/fMjgVMHlh1E98W/oJ/eov8
MW83kOKE7rfY7BoIKXbj92ipqOhL46qTX/c3A9FRB7DdDan7gwPrLuSPwH8mKIW8D4Erg4XRB92eTans98NGBbc+Y5v/bYUHsuYPv3c97K/Cx/vlpwEsmLf858KjpvKcPH6t7jPRVNNJash3w04Hpn/bzAHYEThq2UZInAG+ia9nZgK4F67xpvN9OwAOTXD8wbyHw8YHpD9GFscOq6neTtr9sUq0bAsNO7U1V33VVdevA9M3A5v22DwSOomtR2YiuJeiz0/hssPL+nKhz+4Hp/xv2vnP02pexZlZV007A8Un+MLD8NroQ8/NJr7ETcGiSvx6Yt1FfN1V1fZLPAq/ijtNcACR5NfDCft0CtmTFf9erBp7/Bri2qm4bmKav+fr++XSOk536+VcmmZi3Aavfh5P308ZJFk46llbluiE1T/5cg8fC7XVU1R/SXYk8sX+2m/R/aAHwjWHbroFf0e3/QVsCN01zubTG7COmcXQF3RfShMX9POh+me8yeYN0fciOA/4F2KaqtqILbJm87hCXAadX1VYDj82r6i/7196crgXqI8CRQ/r27Dip1t/Tndqcq/qgO110IrBjVd0ZOHpg25pi28n7c6LOyaFlTUzntaeqb6rlk10GPGHSv9fGVTXs81wGvHXSuptW1acAkuwF/AVdR/B/m9io7w/2OuDpdKeHt6I7JT7df69hpjxO+np/B9xtoN4tq2qPNXzPm+kC/4R7rOHrTLj9M6S78GIHumPgMrpTsYP7eYuqOmBg25n+Ow+6ALjXpD5fe3LHRSMX9NMTtd2L7o+VH83iPSXAIKb134ZJNh54LKT7UvzbJIv6TuN/D0yMX/QR4PlJ9u07bW+f5N7c0Up0DXBr3/o03eEbvgTsluS5STbsHw9I8if98ncDy6vqhcCX6ULQoOck2T3JpnR9vD430MowYTb1QXfa6BdV9dsk+wDPGlh2DfAHJnWEH3BS//melWRhkmfQncL60gzef1Xm4rWvYtW1D3M08NYkOwH0x8nBq1j3Q8BLkjwwnc3SXfiwRZKN6Y6rN9D1D9w+yV/1221B13/qGmBhkr9n5RaXmZryOKmqK4FTgH9NsmV/jO+S5JFr+J7nAs9KsiDdRS1r+joT7p/kKf3/08PpQuOZdKeOb0zyuiSb9O93nyQPmO4L9591Y7oWwfS/DzYCqKof9Z/lTf38J9P1lTyu3/wTwEFJHp5kM7r9+/mqskVMs2YQ0/ruJLrTHxOPI4G3AMvo+mOdB5zTz6OqzqL70nwnXQvF6cBO/S/clwOfoevL8yy6FqQp9ds+DjiE7q/7/wP+CbhT/wW/P/CSfvVXAXsnefbAS3ycru/N/9EN7fByJplNfb2/At6c5Ca6YPqZgde+ma6/zLfSXRn4oEnvfR3wRLoO59cBr6XrlD65NWbG5ui130YXvK9P8ppprP9uun13Sr8/zqTrozSsvmV0ndDfS7ffL+aOK/beRtdv8AP96ebnAG9JsivdRRn/Rdei8lO6TuuzObUG0zhOev+PLrh/v6/5c3T94NbEK+j6r11Pd2XhF9bwdSacQNencuICjadU1e/7QHkQsBddB/5rgQ8Dd57Baz+C7nfASXQthr+hC6UTDgGW9O99FPC0qroGoKouoPs/+gm6vpVb0P2fkWYtVbNpzZW0NiX5Ot3Vbx9uXYtGl8eJNH/ZIiZJktSIQUySJKkRT01KkiQ1YouYJElSIwYxSZKkRublyPp3u9vdauedd25dhiRJ0pSWL19+bVUtGrZsXgaxnXfemWXLlrUuQ5IkaUpJJt+q7XaempQkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGlnYugBJkqRBOx/x5dYl3O7Sow5cq69vi5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCA
mSZLUiEFMkiSpEYOYJElSI46sL2mNjdPo15K0NtgiJkmS1MicBLEk+ye5MMnFSY4YsvxvkpzbP85PcluSrftllyY5r1+2bC7qkSRJmg9mfWoyyQLgfcB+wOXA2UlOrKrvT6xTVf8M/HO//kHAK6vqFwMv8+iquna2tUiSJM0nc9Eitg9wcVVdUlW3AJ8GDl7N+s8EPjUH7ytJkjSvzUUQ2x64bGD68n7eSpJsCuwPHDcwu4BTkixPctgc1CNJkjQvzMVVkxkyr1ax7kHAtyadlnxoVV2R5O7AqUl+WFVnrPQmXUg7DGDx4sWzrVmSJKm5uWgRuxzYcWB6B+CKVax7CJNOS1bVFf3Pq4Hj6U51rqSqllbVkqpasmjRolkXLUmS1NpcBLGzgV2T3DPJRnRh68TJKyW5M/BI4ISBeZsl2WLiOfA44Pw5qEmSJGnkzfrUZFXdmuRlwMnAAuCYqrogyUv65Uf3qz4ZOKWqfj2w+TbA8UkmavlkVX1ltjVJkiTNB3Mysn5VnQScNGne0ZOmPwZ8bNK8S4A956IGSZKk+caR9SVJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSI3MSxJLsn+TCJBcnOWLI8kcluSHJuf3j76e7rSRJ0vpq4WxfIMkC4H3AfsDlwNlJTqyq709a9RtV9cQ13FaSJGm9MxctYvsAF1fVJVV1C/Bp4OB1sK0kSdK8NhdBbHvgsoHpy/t5kz04yXeT/FeSPWa4rSRJ0npn1qcmgQyZV5OmzwF2qqpfJTkA+AKw6zS37d4kOQw4DGDx4sVrXKwkSdKomIsWscuBHQemdwCuGFyhqm6sql/1z08CNkxyt+lsO/AaS6tqSVUtWbRo0RyULUmS1NZcBLGzgV2T3DPJRsAhwImDKyS5R5L0z/fp3/e66WwrSZK0vpr1qcmqujXJy4CTgQXAMVV1QZKX9MuPBp4G/GWSW4HfAIdUVQFDt51tTZIkSfPBXPQRmzjdeNKkeUcPPH8v8N7pbitJkjQOHFlfkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqZGFrQuQpPXJzkd8uXUJt7v0qANblyBpCraISZIkNWIQkyRJasQgJkmS1MicBLEk+ye5MMnFSY4YsvzZSb7XP76dZM+BZZcmOS/JuUmWzUU9kiRJ88GsO+snWQC8D9gPuBw4O8mJVfX9gdV+Ajyyqn6Z5AnAUuCBA8sfXVXXzrYWSZKk+WQuWsT2AS6uqkuq6hbg08DBgytU1ber6pf95JnADnPwvpIkSfPaXASx7YHLBqYv7+etyguA/xqYLuCUJMuTHDYH9UiSJM0LczGOWIbMq6ErJo+mC2IPG5j90Kq6IsndgVOT/LCqzhiy7WHAYQCLFy+efdWSJEmNzUWL2OXAjgPTOwBXTF4pyf2ADwMHV9V1E/Or6or+59XA8XSnOldSVUuraklVLVm0aNEclC1JktTWXASxs4Fdk9wzyUbAIcCJgyskWQx8Hnh
uVf1oYP5mSbaYeA48Djh/DmqSJEkaebM+NVlVtyZ5GXAysAA4pqouSPKSfvnRwN8DdwXenwTg1qpaAmwDHN/PWwh8sqq+MtuaJEmjxVs/ScPNyb0mq+ok4KRJ844eeP5C4IVDtrsE2HPyfEmSpHHgyPqSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqZE5CWJJ9k9yYZKLkxwxZHmS/Fu//HtJ9p7utpIkSeurWQexJAuA9wFPAHYHnplk90mrPQHYtX8cBnxgBttKkiStl+aiRWwf4OKquqSqbgE+DRw8aZ2DgX+vzpnAVkm2nea2kiRJ66W5CGLbA5cNTF/ez5vOOtPZVpIkab20cA5eI0Pm1TTXmc623Qskh9Gd1mTx4sUzqW+onY/48qxfY65cetSBrUu4nftlOPfLcKNUy6hwnwznfhnO3y3DjVIta9tctIhdDuw4ML0DcMU015nOtgBU1dKqWlJVSxYtWjTroiVJklqbiyB2NrBrknsm2Qg4BDhx0jonAv+vv3ryQcANVXXlNLeVJElaL8361GRV3ZrkZcDJwALgmKq6IMlL+uVHAycBBwAXAzcDz1/dtrOtSZIkaT6Yiz5iVNVJdGFrcN7RA88LeOl0t5UkSRoHjqwvSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhqZVRBLsnWSU5Nc1P+8y5B1dkzytSQ/SHJBklcMLDsyyc+TnNs/DphNPZIkSfPJbFvEjgBOq6pdgdP66cluBV5dVX8CPAh4aZLdB5a/s6r26h8nzbIeSZKkeWO2Qexg4Nj++bHAkyavUFVXVtU5/fObgB8A28/yfSVJkua92QaxbarqSugCF3D31a2cZGfgT4H/HZj9siTfS3LMsFObkiRJ66spg1iSryY5f8jj4Jm8UZLNgeOAw6vqxn72B4BdgL2AK4F/Xc32hyVZlmTZNddcM5O3liRJGkkLp1qhqh67qmVJrkqybVVdmWRb4OpVrLchXQj7RFV9fuC1rxpY50PAl1ZTx1JgKcCSJUtqqrolSZJG3WxPTZ4IHNo/PxQ4YfIKSQJ8BPhBVb1j0rJtByafDJw/y3okSZLmjdkGsaOA/ZJcBOzXT5NkuyQTV0A+FHgu8Jghw1S8Pcl5Sb4HPBp45SzrkSRJmjemPDW5OlV1HbDvkPlXAAf0z78JZBXbP3c27y9JkjSfObK+JElSI7NqEZvPLj3qwNYlSJKkMWeLmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJamRh6wI0Wi496sDWJUiSNDZsEZMkSWrEICZJktSIQUySJKkRg5gkSVI
jBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSI7MKYkm2TnJqkov6n3dZxXqXJjkvyblJls10e0mSpPXRbFvEjgBOq6pdgdP66VV5dFXtVVVL1nB7SZKk9cpsg9jBwLH982OBJ63j7SVJkuat2QaxbarqSoD+591XsV4BpyRZnuSwNdhekiRpvbNwqhWSfBW4x5BFb5zB+zy0qq5Icnfg1CQ/rKozZrA9fYA7DGDx4sUz2VSSJGkkTRnEquqxq1qW5Kok21bVlUm2Ba5exWtc0f+8OsnxwD7AGcC0tu+3XQosBViyZElNVbckSdKom+2pyROBQ/vnhwInTF4hyWZJtph4DjwOOH+620uSJK2vZhvEjgL2S3IRsF8/TZLtkpzUr7MN8M0k3wXOAr5cVV9Z3faSJEnjYMpTk6tTVdcB+w6ZfwVwQP/8EmDPmWwvSZI0DhxZX5IkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGlnYugBJksbVpUcd2LoENWaLmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpkVkFsSRbJzk1yUX9z7sMWeePk5w78LgxyeH9siOT/Hxg2QGzqUeSJGk+mW2L2BHAaVW1K3BaP72Cqrqwqvaqqr2A+wM3A8cPrPLOieVVddIs65EkSZo3ZhvEDgaO7Z8fCzxpivX3BX5cVT+d5ftKkiTNe7MNYttU1ZUA/c+7T7H+IcCnJs17WZLvJTlm2KlNSZKk9dWUQSzJV5OcP+Rx8EzeKMlGwJ8Bnx2Y/QFgF2Av4ErgX1ez/WFJliVZds0118zkrSVJkkbSwqlWqKrHrmpZkquSbFtVVybZFrh6NS/1BOCcqrpq4LVvf57kQ8CXVlPHUmApwJIlS2qquiVJkkbdbE9Nnggc2j8/FDhhNes+k0mnJfvwNuHJwPmzrEeSJGnemG0QOwrYL8lFwH79NEm2S3L7FZBJNu2Xf37S9m9Pcl6S7wGPBl45y3okSZLmjSlPTa5OVV1HdyXk5PlXAAcMTN8M3HXIes+dzftLkiTNZ46sL0mS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY0YxCRJkhoxiEmSJDViEJMkSWrEICZJktSIQUySJKkRg5gkSVIjBjFJkqRGDGKSJEmNGMQkSZIamVUQS/LnSS5I8ockS1az3v5JLkxycZIjBuZvneTUJBf1P+8ym3okSZLmk9m2iJ0PPAU4Y1UrJFkAvA94ArA78Mwku/eLjwBOq6pdgdP6aUmSpLEwqyBWVT+oqgunWG0f4OKquqSqbgE+DRzcLzsYOLZ/fizwpNnUI0mSNJ+siz5i2wOXDUxf3s8D2KaqrgTof959HdQjSZI0EhZOtUKSrwL3GLLojVV1wjTeI0Pm1TS2m1zHYcBhAIsXL57p5pIkSSNnyiBWVY+d5XtcDuw4ML0DcEX//Kok21bVlUm2Ba5eTR1LgaUAS5YsmXGQkyRJGjXr4tTk2cCuSe6ZZCPgEODEftmJwKH980OB6bSwSZIkrRembBFbnSRPBt4DLAK+nOTcqnp8ku2
AD1fVAVV1a5KXAScDC4BjquqC/iWOAj6T5AXAz4A/n0090tpy6VEHti5BkrQeStX8O8u3ZMmSWrZsWesyJEmSppRkeVUNHW/VkfUlSZIaMYhJkiQ1YhCTJElqxCAmSZLUiEFMkiSpEYOYJElSIwYxSZKkRgxikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1IhBTJIkqRGDmCRJUiMGMUmSpEYMYpIkSY2kqlrXMGNJrgF+2rqO3t2Aa1sXMYLcLytznwznfhnO/TKc+2Vl7pPhRmm/7FRVi4YtmJdBbJQkWVZVS1rXMWrcLytznwznfhnO/TKc+2Vl7pPh5st+8dSkJElSIwYxSZKkRgxis7e0dQEjyv2yMvfJcO6X4dwvw7lfVuY+GW5e7Bf7iEmSJDVii5gkSVIjBjFJkqRGDGLSWpJkQZJ/bl2HpPVPkntOZ964SXKn6cwbJQaxGUrygknTC5K8qVU9oyTJoiRvSLI0yTETj9Z1tVJVtwH3T5LWtYyaJKdNZ964SfLx6cwbNx4vQx03ZN7n1nkVo+d/pjlvZCxsXcA8tG+SpwIvAO4KfBQ4vW1JI+ME4BvAV4HbGtcyKr4DnJDks8CvJ2ZW1efbldROko2BTYG7JbkLMBFStwS2a1bY6NhjcCLJAuD+jWppzuNlZUnuTXec3DnJUwYWbQls3Kaq9pLcA9ge2CTJn7LisbJps8KmwSA2Q1X1rCTPAM4DbgaeWVXfalzWqNi0ql7XuogRszVwHfCYgXkFjGUQA14MHE73JbqcO35Z3gi8r1FNzSV5PfAGui+RGydmA7cwTy7BX0s8Xlb2x8ATga2Agwbm3wS8qEVBI+LxwPOAHYB3DMy/ie7/1shy+IoZSrIrcCxdEPsT4PvAq6rq5qaFjYAkbwG+XVUnta5Foy3JX1fVe1rXMWqSvK2qXt+6jlHj8bKyJA+uqpE+5dZCkqdW1bDTtiPLIDZDSX4IvKyqvtr3/XkV8BdVtccUm663ktxE18oTYDPgd8Dv++mqqi0bltdUkt2ADwDbVNV9ktwP+LOqekvj0ppL8hBgZwZa5qvq35sVNCKSbA/sxIr75Yx2FY0Gj5cVJVlE1wK2Myvuk79oVdMo6DvmP5WV98ubW9U0FYPYDCXZsqpunDRv16q6qFVNGl1JTgf+BvhgVf1pP+/8qrpP28ra6jug7wKcyx39CauqXt6sqBGQ5CjgELqW9sH98mftqmrP42VlSb5N1yd3OQN9cudba9BcS/IV4AZW3i//2qyoKdhHbOY2SfJOYPuq2j/J7sCDgbEPYklOq6p9p5o3ZjatqrMmXTh5a6tiRsgSYPfyL8HJngz8cVX9rnUhI8bjZWX2yR1uh6rav3URM+HwFTP3MeBkYNt++kd0nUnHVpKNk9yV/sqmJFv3j50Z0yubBlybZBe6U7ckeRpwZduSRsL5wD1aFzGCLgE2bF3ECPJ4WdmXkhzQuogR9O0k921dxEx4anKGkpxdVQ9I8p2BU03nVtVejUtrJskruOPKpisGFt0IfKiq3tuirlGQ5F50V709BPgl8BPg2VX106aFNZLki3ShdAtgL+Asuj6FAIzrKbgk76HbL9sDewKnseJ+GctTcB4vK7NP7nBJzqPbLwuBXen+qPkdd+yX+zUsb7U8NTlzv+5bfyZaOB5Edz56bFXVu4F3e2XTUFVVj02yGbBBVd005qNf/0vrAkbUsv7ncuDEloWMGI+XSapqi9Y1jKgnti5gTdkiNkNJ9gbeA9yHrrl8EfC0qvpe08JGwKTBBSfcAJxXVVev63pGQZJzqmrvSfOWV9XYDtIpafb676LJbgB+WlVj2w81ydZDZt9UVb9f58VMky1i05TkAcBlVXVOkkfSDTT4VOAU4PKmxY2OF9BduPC1fvpRwJnAbkneXFVjc6sWR79evYHTK4NuoGsZenVVXbLuq2pv4PTKoIn98paqum7dV9Wex8tQ7wf2phvTEuC+wHeBuyZ5SVWd0qy
yts4BdqTrChK6gW+vTHI18KKqWt6wtqEMYtP3QeCx/fOHAG8E/pqu38JS4GltyhopfwD+pKquAkiyDd0YWg8EzgDGJojh6NdTeQddf8JP0v2yPISuM/aFwDF0IX4c/RfdJfef7KcPods/N9BdKHTQ8M3Wex4vK7sUeEFVXQDQX8H/N8A/0t25Y1yD2FeA46vqZIAkjwP2Bz5DF14f2LC2oTw1OU1JvltVe/bP3wdcU1VH9tNj3Vl/QpLzquq+A9OhOy15n8GLG8aJo18Pl+R/q+qBk+adWVUPGvy/Nm6SfKuqHjps3uT/X+PE42Vlw753JuaN83dSkmVVtWTYvFHdLw5fMX0Lkky0IO4L/PfAMlsWO99I8qUkhyY5lO4m4Gf0HdWvb1taM9clOS3J+QBJ7pfkb1sXNQL+kOTpSTboH08fWDbOfx1unuT2wJFkH2DzfnJs+/3g8TLMhUk+kOSR/eP9wI/6keVHtj/UOvCLJK9LslP/eC3wyyQL6M7ajBxbxKYpyRuBA4BrgcXA3lVVSf4IOHbyX7HjqG8BeyrwULrTB98EjhvnQRgdWX+4fliPd9P1KSy6voSvBH4O3L+qvtmwvGb6vqjH0IWv0A0B80LgAuDAqvpMw/Ka8XhZWZJNgL8CHsYdv2/fD/yWbrDXXzUsr5kkdwPexIr75R/oTu8vrqqLG5Y3lEFsBvqhKrYFTqmqX/fzdgM2r6pzmhankeS4c1oTSe5M9/v5+ta1SFq7PKU2A1V15pB5P2pRyyjqrw78J+DudH+JjPUAgz1H1h+Q5LVV9faBAUxXMMYDlz6nqv4jyasmzQegqt7RpLDGPF5WluQzVfX0VVxhyygPXLo2JXlXVR0+MAjwCkZ58F+DmObS24GDquoHrQsZIS+lu6r23kl+Tjey/nPaltTUxLGxbLVrjZ/N+p8O1rkij5eVvaL/OW8HMF1LJq7Kn3eDAHtqUnNm2BVf6gyOrN+6llGSZLOJ0/zSVDxeVpRkJ2DXqvpq32dsob9jbu8/t7iqLmxdy3R41aTm0rIk/5nkmUmeMvFoXVRLSbZK8nK6sX3emuTfkvxb67paS/LgJN+nb/FIsmd/1ddYS7KbV9muzONlZUleBHyOboxLgB2ALzQraEQkOQg4l248MZLslWSkbxtmENNc2hK4GXgc3cCTB2Hz+UnAznSjXy8feIy7dwGPB64DqKrvAo9oWdCI+BDwevrhB/pbpx3StKLR8C48XiZ7Kd0V6jcCVNVFdP1zx92RwD70QyZV1bl0v4NHln3ENGeq6vmtaxhBG1fVq6ZebfxU1WUTndF7t7WqZYRsWlVnTdov4zx+2O08Xlbyu6q6ZWKf9ONc2tcIbq2qGyYdKyPNFjHNGU+rDPXxJC9Ksm2SrScerYsaAZcleQhQSTZK8hru6Jg9zrzKdjiPl5WdnuQNwCZJ9gM+C3yxcU2j4Pwkz6IbhH3X/orbb7cuanXsrK854+ClK0vyUuCtdM3kE//Zqqru1ayoEdAPuvhuuvu3hu6+eK8Y15taT+gHLl1Kdz/bX9JdZfvsqvpp08Ia83hZWZINgBfQdQUJcDLw4XEeQBsgyaZ094Ke2C9fAd5SVb9tWthqGMQ0Zxy8dGVJfgw8sKqubV2L5g+vsl1Rko1H+Yu0hSSPAc6sqptb1zJKktyrqi5pXcdM2EdMc8nTKiu7gO4CBgGrGphzwjgO0DmoD+5nAt8AzgC+37aikXF+kqu4Y798q6puaFxTa88Djk5yHd1++Qbwzar6ZdOq2vtYku2Bs+mOlW9U1XmNa1otW8Q0ZzytsrIkxwN7AF8Dfjcxf1wDR7qbwU/4B7p7wt2uqo5dtxWNlv6GzQ8EHk53Rdy9ge9W1ZObFjYCkizmjv1yAHD9OLe2T0iyHfA04DXAdlU19g0sSTYCHgA8Cngx3W0IR7Zv7tj/g2nu9M3Bjx08rZLkcLpLz8fVF3Bsn9sNBq0kh4978BriNrqhK24D/gBcBVzdtKIRkGQ
HugD2cGBPupbmsbvR96Akz6HbH/cFrgXeS9cqNtaSPIxuvzwc2Ar4EiO+X2wR01qV5GdVtbh1HRo9Sc6pqr1b1zFKktxMN+bcO4CvjnNn9EFJ/kB3qun/q6oTWtczCpJcC/wYOBr4WlVd2rai0ZDkNrpbYr0NOKmqbmlc0pQMYlqrklxWVTu2rqOVJLvS/ULYHdh4Yv64XzUJBrFhkhwMPIxuQMpb6C67P6OqTmtaWGNJ9qTbL48AFgMXAadX1UeaFtZYkj3o9snDgF2BC6vquW2raivJVnStp4+gOz35B+B/qurvWta1Op6a1No27kn/o3T9oN4JPBp4Pt0l1WMpyU3ccUxsmuTGiUV0w3ps2aay0dC39pyQ5N7AE4DDgdcCm7Ssq7Wq+m5/IcOP6U45PYfui3Zsg1iSLelC6U50I8ffmS50jLWquj7JJcCOdLd9egiwYduqVs8WMc3apC/XFRYBm4xz59Eky6vq/knOq6r79vO+UVUPb12bRk+S44C9gIu540q4/x33oRuSLAPuRNdC+E26VsKxvQgIIMn36PbFxP64vHFJI6EP7BfSX0VK9/9npE9Pju0XpOZOVW3RuoYR9tt+4MWLkrwM+DneD06rdhRwTlUNvX1Pkv2q6tR1XNMoeEJVXbOqhUkOHbcLP6rqfqtbnuQ9VfXX66qeEbJrVa2yZTDJ66vqbeuyoKnYIiatRUkeQHcrlq2Af6Q7ffD2qjqzZV2an+xXN5z7ZWXuk+FGcb/YIiatRVV1dv/0V3T9w6TZGNv+hVNwv2i6Ru5YMYhJa0GSd1XV4Um+yJD+c1X1Zw3K0vznKYzh3C+arpE7Vgxi0trx8f7nvzStQhoPI9fKMQLcJ8ON3H4xiElrQVUt73+enmRR/3yVnY2labq0dQEj6lutCxhB725dwIj6bOsCJrOzvrQWJAnd+GEvo/sLbAPgVuA9VfXmlrVpdCXZFHg1sLiqXtQPCPzHVfWlxqU1keRVq1teVe9YV7WMmv4PvNex8mDRj2lWVENJ3sNqTjuO8v19bRGT1o7D6UZ3fkBV/QRuvyn6B5K8sqre2bI4jayPAsuBB/fTl9P9BT+WQQyYGBrnj+lGST+xnz4IOKNJRaPjE8B/AgcCLwEOBca51X1Z6wLWlC1i0lqQ5DvAflV17aT5i4BTqupP21SmUZZkWVUtSfKdiWMkyXeras/WtbWU5BTgqVV1Uz+9BfDZqtq/bWXtDAwW/b2JMcWSnF5Vj2xdm2bGFjFp7dhwcgiDrp9YkpG+3YaauiXJJvSnWJLsAvyubUkjYTHdvTcn3EJ3W59x9vv+55VJDgSuoLulz1ha1RXqE0b5SnWDmLR2rO6WGiN9uw019SbgK8COST5Bd3r7eU0rGg0fB85Kcjzdl+2TgX9vW1Jzb0lyZ7o+he8BtgRe2bakpubtFeqempTWgiS3Ab8etgjYuKpsFdNQSe4KPIjuWDlzWMvqOEqyN90Nv6G7t+J3WtbTUpIFwMvtazpc36q8uKoubF3LdBjEJKmxPmSsUlWds65qGVVJHkZ3H8GP9n0tN5+4EGYcJflaVT26dR2jJslBdK1jG1XVPZPsBbx5lE9NGsQkqbEkX1vN4hrXIQkmJHkTsIRuKI/dkmxH11n/oY1LaybJW+nuXfufDLS+j3toT7IceAzw9YELXm6/oGEU2UdMkhqzZWNKTwb+FDgHoKqu6K+cHGcP6X8OjktYdCFknN1aVTd0QznODwYxSRoR/RW1fwk8op/1deCDVfX7VW40Hm6pqkoycTXpZq0Las3wvkrnJ3kWsKAfEPnlwLcb17RaG7QuQJJ0uw8A9wfe3z/u388bd59J8kFgqyQvAr4KfLhxTU0l2SbJR5L8Vz+9e5IXtK5rBPw1sAfdsC+fAm6kG2B7ZNlHTJJGxLDBWx3QtZNkP+BxdFeTnkx35eTYjrHWB7CPAm+sqj2TLAS+U1X3bVyaZshTk5I0Om5LsktV/Rhuvy3WbY1rai7JMVX
1F8Cp/fTmwEnAvk0La+tuVfWZJK8HqKpb+2FzxlqS3YDX0A34e3vGGeULXgxikjQ6/gb4WpJL6Fp+dgKe37akkfDzJB+oqr9Mchfgy8CHWhfV2K/7Mecm+s09CLihbUkj4bPA0XSnrudFMPXUpCSNkCR3orvJdYAfjvPpt0FJ/oluuIb7A0dV1XGNS2qqH3vuPcB9gPOBRcDTqup7TQtrbOIenK3rmAmDmCSNkCQPYeXTKmN5O58kTxmcBP4OOIvuNlBU1edb1DUq+n5hE6H9wnG+ujbJ1v3TlwNXA8czcJ/WqvpFi7qmwyAmSSMiyceBXYBzueO0SlXVy5sV1VCSj65mcfX9xsZSkpcCn6iq6/vpuwDPrKr3Ny2skSQ/oTtNOzGA2Arhpqrutc6LmiaDmCSNiCQ/AHYvfzFrCknOraq9Js37zsRo8uMmyT7AZVV1ZT99KPBU4FLgyFFuEXMcMUkaHecD92hdxKhJcmySrQam75LkmIYljYINMjB8fH8j8I0a1tPa0fSnIpM8AngbcCzdBQxLG9Y1Ja+alKTGknyR7lTKFsD3k5zFiv1bRvaGxevI/SZOwQFU1S+TjGXLz4CT6Qa6PZru2HkJfd+5MbVgoNXrGcDS/oKO45Kc266sqRnEJKm9/6b7ffwdYGw7XK/GBknuUlW/hNs7Zo/799frgMPobokV4BTG+24DC5IsrKpb6caXO2xg2UgfKyNdnCSNie3pbuL8BuC7dPfG+xbwP6Pct2Ud+lfg20k+10//OfDWhvU0V1V/oDsdd3QfTHeoqnkxbtZa8ing9CTXAr8BvgGQ5I8Y8fHV7KwvSSMiyUbAErpQ9uD+cX1V7d60sBGQZHfgMXStP6dV1fcbl9RUkq8Df0bXoHIucA1welW9qmFZTfWD2m4LnFJVv+7n7QZsXlXnNC1uNWwRk6TRsQmwJd3ApXcGrgDOa1rRCEiyGPgVcOLgvKr6WbuqmrtzVd2Y5IXAR6vqTUnGejDXqjpzyLwftahlJgxiktRYkqXAHsBNwP/SnZp8x0SfKPFl7hgXahPgnsCFdPtsXC1Msi3wdOCNrYvRmjOISVJ7i4E7ARcBPwcuB65vWdAoqar7Dk73t/d5caNyRsWb6a6c/GZVnd3fIP6ixjVpDdhHTJJGQD8m1B50/cMeQncPwV/Qddh/U8vaRlGSc6pq79Z1SLNlEJOkEZJkB+ChdGHsicBdq2qrpkU1lmSwA/oGwN50++XxjUpqJslrq+rtSd7DpNv4AIzr7bDmM09NSlJjSV5OF7weSjeO2LeA/wGOwc760A10O+FWuj5jxzWqpbUXJ/kWsKx1IZobtohJUmNJ3kE/dtjEvfKkYZIcTjdy/LbAfwKfqqpzW9ak2TGISZJGWj8W1GuAnRk4k1NVj2lVU2tJdgIO6R8b0w1o+qmqssP+PGMQkySNtCTfpRtFfjlw++jxVbW8WVEjpL/v5jF09+Rc0LoezYx9xCRJo+7WqvpA6yJGSZINgf3pWsT2BU4H/qFpUVojtohJkkZSfw9FgJcDVwPHA7+bWD6O9+FMsh/wTOBA4Czg08AXJm7po/nHICZJGklJfkI3REP6WSt8YVXVvdZ5UY0l+RrwSeC4cQyi6yODmCRpJCXZB7hs4krSJIcCTwUuBY40iGh9sEHrAiRJWoWj6U9FJnkE8DbgWOAGYGnDuqQ5Y2d9SdKoWjDQ6vUMYGlVHQccl+TcdmVJc8cWMUnSqFqQZKLBYF/gvweW2ZCg9YIHsiRpVH0KOD3JtcBvgG8AJPkjutOT0rxnZ31J0shK8iC62/mcMjFEQz/S/uZVdU7T4qQ5YBCTJElqxD5ikiRJjRjEJEmSGjGISZIkNWIQkyRJasQgJkmS1Mj/Dyp63aS1d0tMAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(10, 7))\n", "selected_example = 100\n", "print(f\"Example number: {selected_example}\")\n", "print(f\"\\nFeature values -- Label: {train.iloc[selected_example]}\")\n", "local_explanations_out.iloc[selected_example][baseline.columns.tolist()].plot(kind='bar', title='Local explanation for the example number '+str(selected_example), rot=90)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "ba0064dc", "metadata": {}, "source": [ "Is the impact of the feature positive or negative with regards to the target variable" ] }, { "cell_type": "code", "execution_count": 36, "id": "bbce64f8", "metadata": {}, "outputs": [], "source": [ "local_explanations_out['max_predictor_column'] = local_explanations_out[baseline.columns.tolist()].abs().idxmax(axis=1)\n", "local_explanations_out['max_predictor_column_shap_value'] = local_explanations_out.apply(lambda x: np.sign(x[x['max_predictor_column']]), axis=1)" ] }, { "cell_type": "code", "execution_count": 37, "id": "29f931c3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "max_predictor_column max_predictor_column_shap_value\n", "Shell_weight 1.0 631\n", "Shucked_weight 1.0 604\n", "Shell_weight -1.0 596\n", "Shucked_weight -1.0 595\n", "Diameter -1.0 124\n", "Whole_weight -1.0 87\n", "Height 1.0 63\n", "Sex -1.0 46\n", "Whole_weight 1.0 40\n", "Viscera_weight -1.0 36\n", "Height -1.0 36\n", "Length -1.0 32\n", "Viscera_weight 1.0 11\n", "Length 1.0 8\n", "Sex 1.0 7\n", "Diameter 1.0 7\n", "dtype: int64" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "local_explanations_out[['max_predictor_column', 'max_predictor_column_shap_value']].value_counts()" ] }, { "cell_type": "code", "execution_count": null, "id": "99a9bb81", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", 
"name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.13" } }, "nbformat": 4, "nbformat_minor": 5 }