{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Fairness and Explainability with SageMaker Clarify using AWS SDK for Python (Boto3)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "---" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Runtime\n", "\n", "This notebook takes approximately 30 minutes to run.\n", "\n", "## Contents\n", "\n", "1. [Overview](#Overview)\n", "1. [Prerequisites and Data](#Prerequisites-and-Data)\n", " 1. [Import libraries](#Import-libraries)\n", " 1. [Set configurations](#Set-configurations)\n", " 1. [Download data](#Download-data)\n", " 1. [Loading the data: Adult Dataset](#Loading-the-data:-Adult-Dataset) \n", " 1. [Data inspection](#Data-inspection) \n", " 1. [Encode and Upload the Dataset](#Encode-and-Upload-the-Dataset) \n", "1. [Train and Deploy XGBoost Model](#Train-XGBoost-Model)\n", " 1. [Train Model](#Train-Model)\n", " 1. [Create Model](#Create-Model)\n", "1. [Amazon SageMaker Clarify](#Amazon-SageMaker-Clarify)\n", " 1. [Set Configurations](#Set-Configurations)\n", " 1. [Detecting Bias](#Detecting-Bias)\n", " 1. [Get Started with a SageMaker Clarify Container](#Get-Started-with-a-SageMaker-Clarify-Container)\n", " 1. [Configure a SageMaker Clarify Processing Job Container's Input and Output Parameters ](#Configure-a-SageMaker-Clarify-Processing-Job-Container's-Input-and-Output-Parameters)\n", " 1. [Configure Analysis Config](#Configure-Analysis-Config)\n", " 1. 
[Run SageMaker Clarify Processing Job](#Run-SageMaker-Clarify-Processing-Job)\n", " 1. [Viewing the Bias Report](#Viewing-the-Bias-Report)\n", " 1. [Explaining Predictions](#Explaining-Predictions)\n", " 1. [Configure a SageMaker Clarify Processing Job Container's Input and Output Parameters ](#Configure-a-SageMaker-Clarify-Processing-Job-Container's-input-and-output-parameters)\n", " 1. [Configure Analysis Config](#Configure-analysis-config)\n", " 1. [Run SageMaker Clarify Processing Job](#Run-SageMaker-Clarify-Processing-job)\n", " 1. [Viewing the Explainability Report](#Viewing-the-Explainability-Report)\n", " 1. [Analysis of local explanations](#Analysis-of-local-explanations)\n", "1. [Clean Up](#Clean-Up)\n", "\n", "## Overview\n", "Amazon SageMaker Clarify helps improve your machine learning models by detecting potential bias and helping explain how these models make predictions. The fairness and explainability functionality provided by SageMaker Clarify takes a step towards enabling AWS customers to build trustworthy and understandable machine learning models. The product comes with the tools to help you with the following tasks.\n", "\n", "* Measure biases that can occur during each stage of the ML lifecycle (data collection, model training and tuning, and monitoring of ML models deployed for inference).\n", "* Generate model governance reports targeting risk and compliance teams and external regulators.\n", "* Provide explanations of the data, models, and monitoring used to assess predictions.\n", "\n", "This sample notebook walks you through: \n", "1. Key terms and concepts needed to understand SageMaker Clarify\n", "1. Measuring the pre-training bias of a dataset and post-training bias of a model\n", "1. Explaining the importance of the various input features on the model's decision\n", "1. 
Accessing the reports through SageMaker Studio if you have an instance set up.\n", "\n", "In doing so, the notebook first trains a [SageMaker XGBoost](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html) model using the training dataset, then uses the [AWS SDK for Python](https://aws.amazon.com/sdk-for-python/) to launch SageMaker Clarify jobs to analyze an example dataset in CSV format. Peer examples are also available that use the [SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/) to launch SageMaker Clarify jobs to analyze data [in CSV format](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-clarify/fairness_and_explainability/fairness_and_explainability.ipynb) and [in JSON Lines format](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-clarify/fairness_and_explainability/fairness_and_explainability_jsonlines_format.ipynb)." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites and Data" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Import libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [] }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import os\n", "import boto3\n", "import time\n", "from datetime import datetime\n", "from sagemaker import get_execution_role, session" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Set configurations" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Region: ap-south-1\n", "Role: arn:aws:iam::000000000000:role/service-role/SMClarifySageMaker-ExecutionRole\n" ] } ], "source": [ "# Initialize sagemaker session\n", "sagemaker_session = session.Session()\n", "\n", "region = sagemaker_session.boto_region_name\n", "print(f\"Region: {region}\")\n", "\n", "role = 
get_execution_role()\n", "print(f\"Role: {role}\")\n", "\n", "bucket = sagemaker_session.default_bucket()\n", "\n", "prefix = \"sagemaker/DEMO-sagemaker-clarify-boto3\"" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Download data\n", "Data Source: [https://archive.ics.uci.edu/ml/machine-learning-databases/adult/](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/)\n", "\n", "Let's __download__ the data and save it in the local folder with the name adult.data and adult.test from UCI repository$^{[2]}$.\n", "\n", "$^{[2]}$Dua Dheeru, and Efi Karra Taniskidou. \"[UCI Machine Learning Repository](http://archive.ics.uci.edu/ml)\". Irvine, CA: University of California, School of Information and Computer Science (2017)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "adult.data saved!\n", "adult.test saved!\n" ] } ], "source": [ "from sagemaker.s3 import S3Downloader\n", "\n", "adult_columns = [\n", " \"Age\",\n", " \"Workclass\",\n", " \"fnlwgt\",\n", " \"Education\",\n", " \"Education-Num\",\n", " \"Marital Status\",\n", " \"Occupation\",\n", " \"Relationship\",\n", " \"Ethnic group\",\n", " \"Sex\",\n", " \"Capital Gain\",\n", " \"Capital Loss\",\n", " \"Hours per week\",\n", " \"Country\",\n", " \"Target\",\n", "]\n", "if not os.path.isfile(\"adult.data\"):\n", " S3Downloader.download(\n", " s3_uri=\"s3://{}/{}\".format(\n", " f\"sagemaker-example-files-prod-{region}\", \"datasets/tabular/uci_adult/adult.data\"\n", " ),\n", " local_path=\"./\",\n", " sagemaker_session=sagemaker_session,\n", " )\n", " print(\"adult.data saved!\")\n", "else:\n", " print(\"adult.data already on disk.\")\n", "\n", "if not os.path.isfile(\"adult.test\"):\n", " S3Downloader.download(\n", " s3_uri=\"s3://{}/{}\".format(\n", " f\"sagemaker-example-files-prod-{region}\", \"datasets/tabular/uci_adult/adult.test\"\n", " ),\n", " 
local_path=\"./\",\n", " sagemaker_session=sagemaker_session,\n", " )\n", " print(\"adult.test saved!\")\n", "else:\n", " print(\"adult.test already on disk.\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Loading the data: Adult Dataset\n", "This dataset, from the UCI repository of machine learning datasets, contains 14 features describing demographic characteristics. After rows with missing values are removed, 45,222 rows remain (30,162 for training and 15,060 for testing). The task is to predict whether a person has a yearly income that is more or less than $50,000.\n", "\n", "Here are the features and their possible values:\n", "\n", "1. **Age**: continuous.\n", "1. **Workclass**: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.\n", "1. **Fnlwgt**: continuous (the number of people the census takers believe the observation represents).\n", "1. **Education**: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.\n", "1. **Education-num**: continuous.\n", "1. **Marital-status**: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.\n", "1. **Occupation**: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.\n", "1. **Relationship**: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.\n", "1. **Ethnic group**: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.\n", "1. **Sex**: Female, Male.\n", " * **Note**: this data is extracted from the 1994 Census and enforces a binary option on Sex\n", "1. **Capital-gain**: continuous.\n", "1. **Capital-loss**: continuous.\n", "1. **Hours-per-week**: continuous.\n", "1. 
**Native-country**: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.\n", "\n", "Next, we specify our binary prediction task: \n", "\n", "15. **Target**: <=50,000, >$50,000." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Age</th><th>Workclass</th><th>fnlwgt</th><th>Education</th><th>Education-Num</th><th>Marital Status</th><th>Occupation</th><th>Relationship</th><th>Ethnic group</th><th>Sex</th><th>Capital Gain</th><th>Capital Loss</th><th>Hours per week</th><th>Country</th><th>Target</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>39</td><td>State-gov</td><td>77516</td><td>Bachelors</td><td>13</td><td>Never-married</td><td>Adm-clerical</td><td>Not-in-family</td><td>White</td><td>Male</td><td>2174</td><td>0</td><td>40</td><td>United-States</td><td>&lt;=50K</td></tr>\n",
"    <tr><th>1</th><td>50</td><td>Self-emp-not-inc</td><td>83311</td><td>Bachelors</td><td>13</td><td>Married-civ-spouse</td><td>Exec-managerial</td><td>Husband</td><td>White</td><td>Male</td><td>0</td><td>0</td><td>13</td><td>United-States</td><td>&lt;=50K</td></tr>\n",
"    <tr><th>2</th><td>38</td><td>Private</td><td>215646</td><td>HS-grad</td><td>9</td><td>Divorced</td><td>Handlers-cleaners</td><td>Not-in-family</td><td>White</td><td>Male</td><td>0</td><td>0</td><td>40</td><td>United-States</td><td>&lt;=50K</td></tr>\n",
"    <tr><th>3</th><td>53</td><td>Private</td><td>234721</td><td>11th</td><td>7</td><td>Married-civ-spouse</td><td>Handlers-cleaners</td><td>Husband</td><td>Black</td><td>Male</td><td>0</td><td>0</td><td>40</td><td>United-States</td><td>&lt;=50K</td></tr>\n",
"    <tr><th>4</th><td>28</td><td>Private</td><td>338409</td><td>Bachelors</td><td>13</td><td>Married-civ-spouse</td><td>Prof-specialty</td><td>Wife</td><td>Black</td><td>Female</td><td>0</td><td>0</td><td>40</td><td>Cuba</td><td>&lt;=50K</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Age Workclass fnlwgt Education Education-Num \\\n", "0 39 State-gov 77516 Bachelors 13 \n", "1 50 Self-emp-not-inc 83311 Bachelors 13 \n", "2 38 Private 215646 HS-grad 9 \n", "3 53 Private 234721 11th 7 \n", "4 28 Private 338409 Bachelors 13 \n", "\n", " Marital Status Occupation Relationship Ethnic group Sex \\\n", "0 Never-married Adm-clerical Not-in-family White Male \n", "1 Married-civ-spouse Exec-managerial Husband White Male \n", "2 Divorced Handlers-cleaners Not-in-family White Male \n", "3 Married-civ-spouse Handlers-cleaners Husband Black Male \n", "4 Married-civ-spouse Prof-specialty Wife Black Female \n", "\n", " Capital Gain Capital Loss Hours per week Country Target \n", "0 2174 0 40 United-States <=50K \n", "1 0 0 13 United-States <=50K \n", "2 0 0 40 United-States <=50K \n", "3 0 0 40 United-States <=50K \n", "4 0 0 40 Cuba <=50K " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "training_data = pd.read_csv(\n", " \"adult.data\", names=adult_columns, sep=r\"\\s*,\\s*\", engine=\"python\", na_values=\"?\"\n", ").dropna()\n", "\n", "testing_data = pd.read_csv(\n", " \"adult.test\", names=adult_columns, sep=r\"\\s*,\\s*\", engine=\"python\", na_values=\"?\", skiprows=1\n", ").dropna()\n", "\n", "training_data.head()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Data inspection\n", "Plotting histograms for the distribution of the different features is a good way to visualize the data. Let's plot a few of the features that can be considered _sensitive_. \n", "Let's take a look specifically at the Sex feature of a census respondent. In the first plot we see that there are fewer Female respondents as a whole but especially in the positive outcomes, where they form ~$\\frac{1}{7}$th of respondents." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAEICAYAAABfz4NwAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8/fFQqAAAACXBIWXMAAAsTAAALEwEAmpwYAAAY2UlEQVR4nO3de7RedX3n8fenhCJqoVyONE2iQYm2kKlhJWZQq4uu2BIvFZwBDdMKtKwVZeFYl3ZmQG11tJkRFWmZJdg4MARGuYyIsCpUEarUEcGDRsK1hItyTAYOiBiqRBO/88fzO/pw8uTcc04u79dae539fPfvt/fv4XI+z/7t/ZydqkKSpF+b6QFIknYOBoIkCTAQJEmNgSBJAgwESVJjIEiSAANB2ikkOSTJTUk2JTl7psejPZOBoF1Skv+QpD/JU0k2Jrkuye9Pw3EryWE7YNcrgceA/arqPT2OOzfJlUkeS/JkknVJTtkB49AebNZMD0AaryTvBs4A3g58CfgZsBw4Fvj6DA5tMl4A3FXb/6boJcB3W7vNwL8BfmuaxqY9RVW5uOwyC7A/8BRwwght9gH+FtjQlr8F9mnbTgG+Pqx9AYe19YuATwJfBDYBtwAvattuam3/tY3hLcDBwD8APwJ+CPwz8GvbGdcrgG8BT7afr+g65s/pBNtTwGt69H0KWDTCez4K+EYbx3eBo7uO+Rgwr71+aWvzOzP979Jl51ucMtKu5uXAs4CrRmjzPjq/IBfR+QW4FHj/OI5xIvBfgQOA9cAqgKp6ddv+0qp6blVdDrwHGAD6gEOA99IJjWdIciCdkDkXOAj4BPDFJAdV1SnAZ4CPtv1+pceYvgl8MsmKJM8ftu85bd9/AxwI/CVwZZK+qvoG8PfAmiT70jnTeH9V3TOOfx7aQxgI2tUcBDxWVVtGaPMnwIeq6tGqGqTzy/2t4zjG56vq1naMz9AJlu35OTAbeEFV/byq/rmqek37vB64r6ouqaotVXUpcA/wx2Mc0wl0zj7+CngwydokL2vb/hS4tqqurapfVNX1QD/wurb9g3TOrG6lc8b0yTEeU3sYA0G7mseBg5OMdP3rt4Hvdb3+XquN1f/rWv8J8NwR2n6MzlnEl5M8kOSMMY5paFxzxjKgqnqiqs6oqiPonImsBb6QJHSuK5yQ5EdDC/D7dIKKqvo5nWmphcDZ2wksyUDQLudm4GnguBHabKDzS3LI81sNOvP/zx7akGRSF2aralNVvaeqXkjn0/67kywbw5iGxvWDCRzzMeDjdELmQOBh4JKq+s2u5TlV9RH45ZTSB4D/BZydZJ/xHlN7BgNBu5SqehL4azrz6ccleXaSvZO8NslHW7NLgfcn6UtycGv/v9u27wJHJFmU5Fl0plPG4xHghUMvkrwhyWHtk/qPga1tGe5a4MXtdtlZSd4CHE7ngvSokpyVZGHr+xvAacD6qnq8vbc/TnJMkr2SPCvJ0e1W1dA5O7gAOBXYCHx4nO9ZewgDQbucqvoE8G46F4oH6XxCfgfwhdbkb+jMod8OrAO+3WpU1b8AHwK+AtzH+G9T/SCdC7Q/SvJmYEHb11N0zl7Oq6qv9hjz48Ab6FyEfhz4z8Ab2qf9sXg2nQvpPwIeoHO28ca274fp3HL7Xn71z+M/0fn/+510ppj+qk0V/RnwZ0leNc73rT1AnE6UJIFnCJKkxkCQJAEGgiSpMRAkScAu/MftDj744Jo/f/5MD0OSdim33XbbY1XV12vbLhsI8+fP
p7+/f6aHIUm7lCTDvzH/S04ZSZIAA0GS1BgIkiTAQJAkNQaCJAkwECRJjYEgSQIMBElSYyBIkoAxfFM5yTzgYuC3gF8Aq6vq75IcCFwOzAceAt5cVU+0PmfSeTrTVuCdVfWlVl9M5+lN+9J5gtRfVFW1R/pdDCym8/CQt1TVQ1P2LiVtY/4ZX5zpIexWHvrI62d6CJM2ljOELcB7qup3gaOA05McDpwB3FBVC4Ab2mvathXAEcBy4Lwke7V9nQ+spPOUqQVtO3TC44mqOgw4BzhrCt6bJGkcRg2EqtpYVd9u65uAu4E5dB7Zt6Y1W8OvHnp+LHBZVW2uqgeB9cDSJLOB/arq5vYov4uH9Rna1+eAZe1ZsJKkaTKuawhJ5gNHArcAh1TVRuiEBvC81mwOnWe6DhlotTltfXj9GX2qagvwJHDQeMYmSZqcMQdCkucCVwLvqqofj9S0R61GqI/UZ/gYVibpT9I/ODg42pAlSeMwpkBIsjedMPhMVX2+lR9p00C0n4+2+gAwr6v7XGBDq8/tUX9GnySzgP2BHw4fR1WtrqolVbWkr6/nn/OWJE3QqIHQ5vIvAO6uqk90bboGOLmtnwxc3VVfkWSfJIfSuXh8a5tW2pTkqLbPk4b1GdrX8cCN7TqDJGmajOUBOa8E3gqsS7K21d4LfAS4IsmpwPeBEwCq6s4kVwB30blD6fSq2tr6ncavbju9ri3QCZxLkqync2awYnJvS5I0XqMGQlV9nd5z/ADLttNnFbCqR70fWNij/jQtUCRJM8NvKkuSAANBktQYCJIkwECQJDUGgiQJMBAkSY2BIEkCDARJUmMgSJIAA0GS1BgIkiTAQJAkNQaCJAkwECRJjYEgSQIMBElSYyBIkoCxPVP5wiSPJrmjq3Z5krVteWjo0ZpJ5if5ade2T3X1WZxkXZL1Sc5tz1WmPXv58la/Jcn8qX+bkqTRjOUM4SJgeXehqt5SVYuqahFwJfD5rs33D22rqrd31c8HVgIL2jK0z1OBJ6rqMOAc4KyJvBFJ0uSMGghVdROdB99vo33KfzNw6Uj7SDIb2K+qbq6qAi4GjmubjwXWtPXPAcuGzh4kSdNnstcQXgU8UlX3ddUOTfKdJF9L8qpWmwMMdLUZaLWhbQ8DVNUW4EngoF4HS7IySX+S/sHBwUkOXZLUbbKBcCLPPDvYCDy/qo4E3g18Nsl+QK9P/NV+jrTtmcWq1VW1pKqW9PX1TWLYkqThZk20Y5JZwL8DFg/VqmozsLmt35bkfuDFdM4I5nZ1nwtsaOsDwDxgoO1zf7YzRSVJ2nEmc4bwGuCeqvrlVFCSviR7tfUX0rl4/EBVbQQ2JTmqXR84Cbi6dbsGOLmtHw/c2K4zSJKm0VhuO70UuBl4SZKBJKe2TSvY9mLyq4Hbk3yXzgXit1fV0Kf904D/CawH7geua/ULgIOSrKczzXTGJN6PJGmCRp0yqqoTt1M/pUftSjq3ofZq3w8s7FF/GjhhtHFIknYsv6ksSQIMBElSYyBIkgADQZLUGAiSJMBAkCQ1BoIkCTAQJEmNgSBJAgwESVJjIEiSAANBktQYCJIkwECQJDUGgiQJMBAkSc1Ynph2YZJHk9zRVftgkh8kWduW13VtOzPJ+iT3Jjmmq744ybq27dz2KE2S7JPk8la/Jcn8KX6PkqQxGMsZwkXA8h71c6pqUVuuBUhyOJ1Hax7R+pw39Ixl4HxgJZ3nLC/o2uepwBNVdRhwDnDWBN+LJGkSRg2EqroJ+OFo7ZpjgcuqanNVPUjn+clLk8wG9quqm6uqgIuB47r6rGnrnwOWDZ09SJKmz2SuIbwjye1tSumAVpsDPNzVZqDV5rT14fVn9KmqLcCTwEG9DphkZZL+JP2Dg4OTGLokabiJBsL5wIuARcBG4OxW7/XJvkaoj9Rn22LV6qpaUlVL+vr6xjVgSdLIJhQIVfVIVW2tql8AnwaWtk0DwLyupnOBDa0+t0f9GX2SzAL2Z+xTVJKk
KTKhQGjXBIa8CRi6A+kaYEW7c+hQOhePb62qjcCmJEe16wMnAVd39Tm5rR8P3NiuM0iSptGs0RokuRQ4Gjg4yQDwAeDoJIvoTO08BLwNoKruTHIFcBewBTi9qra2XZ1G546lfYHr2gJwAXBJkvV0zgxWTMH7kiSN06iBUFUn9ihfMEL7VcCqHvV+YGGP+tPACaONQ5K0Y/lNZUkSYCBIkhoDQZIEGAiSpMZAkCQBBoIkqTEQJEmAgSBJagwESRJgIEiSGgNBkgQYCJKkxkCQJAEGgiSpMRAkSYCBIElqDARJEjCGQEhyYZJHk9zRVftYknuS3J7kqiS/2erzk/w0ydq2fKqrz+Ik65KsT3Jue7Yy7fnLl7f6LUnmT/3blCSNZixnCBcBy4fVrgcWVtXvAf8CnNm17f6qWtSWt3fVzwdWAgvaMrTPU4Enquow4BzgrHG/C0nSpI0aCFV1E/DDYbUvV9WW9vKbwNyR9pFkNrBfVd1cVQVcDBzXNh8LrGnrnwOWDZ09SJKmz1RcQ/hz4Lqu14cm+U6SryV5VavNAQa62gy02tC2hwFayDwJHNTrQElWJulP0j84ODgFQ5ckDZlUICR5H7AF+EwrbQSeX1VHAu8GPptkP6DXJ/4a2s0I255ZrFpdVUuqaklfX99khi5JGmbWRDsmORl4A7CsTQNRVZuBzW39tiT3Ay+mc0bQPa00F9jQ1geAecBAklnA/gybopIk7XgTOkNIshz4L8Abq+onXfW+JHu19RfSuXj8QFVtBDYlOapdHzgJuLp1uwY4ua0fD9w4FDCSpOkz6hlCkkuBo4GDkwwAH6BzV9E+wPXt+u832x1FrwY+lGQLsBV4e1UNfdo/jc4dS/vSueYwdN3hAuCSJOvpnBmsmJJ3Jkkal1EDoapO7FG+YDttrwSu3M62fmBhj/rTwAmjjUOStGP5TWVJEmAgSJIaA0GSBBgIkqTGQJAkAQaCJKkxECRJgIEgSWoMBEkSYCBIkhoDQZIEGAiSpMZAkCQBBoIkqTEQJEmAgSBJakYNhCQXJnk0yR1dtQOTXJ/kvvbzgK5tZyZZn+TeJMd01RcnWde2ndsepUmSfZJc3uq3JJk/xe9RkjQGYzlDuAhYPqx2BnBDVS0AbmivSXI4nUdgHtH6nDf0jGXgfGAlnecsL+ja56nAE1V1GHAOcNZE34wkaeJGDYSquonOs467HQusaetrgOO66pdV1eaqehBYDyxNMhvYr6purqoCLh7WZ2hfnwOWDZ09SJKmz0SvIRxSVRsB2s/ntfoc4OGudgOtNqetD68/o09VbQGeBA7qddAkK5P0J+kfHByc4NAlSb1M9UXlXp/sa4T6SH22LVatrqolVbWkr69vgkOUJPUya4L9Hkkyu6o2tumgR1t9AJjX1W4usKHV5/aod/cZSDIL2J9tp6h2WfPP+OJMD2G38tBHXj/TQ5B2WxM9Q7gGOLmtnwxc3VVf0e4cOpTOxeNb27TSpiRHtesDJw3rM7Sv44Eb23UGSdI0GvUMIcmlwNHAwUkGgA8AHwGuSHIq8H3gBICqujPJFcBdwBbg9Kra2nZ1Gp07lvYFrmsLwAXAJUnW0zkzWDEl70ySNC6jBkJVnbidTcu2034VsKpHvR9Y2KP+NC1QJEkzx28qS5IAA0GS1BgIkiTAQJAkNQaCJAkwECRJjYEgSQIMBElSYyBIkgADQZLUGAiSJMBAkCQ1BoIkCTAQJEmNgSBJAgwESVIz4UBI8pIka7uWHyd5V5IPJvlBV/11XX3OTLI+yb1JjumqL06yrm07tz1mU5I0jSYcCFV1b1UtqqpFwGLgJ8BVbfM5Q9uq6lqAJIfTeTzmEcBy4Lwke7X25wMr6TyDeUHbLkmaRlM1ZbQMuL+qvjdCm2OBy6pqc1U9CKwHliaZDexXVTdXVQEXA8dN0bgkSWM0VYGwAri06/U7ktye5MIkB7TaHODhrjYDrTanrQ+vS5Km0aQDIcmvA28E/k8rnQ+8
CFgEbATOHmrao3uNUO91rJVJ+pP0Dw4OTmbYkqRhpuIM4bXAt6vqEYCqeqSqtlbVL4BPA0tbuwFgXle/ucCGVp/bo76NqlpdVUuqaklfX98UDF2SNGQqAuFEuqaL2jWBIW8C7mjr1wArkuyT5FA6F49vraqNwKYkR7W7i04Crp6CcUmSxmHWZDoneTbwh8DbusofTbKIzrTPQ0PbqurOJFcAdwFbgNOramvrcxpwEbAvcF1bJEnTaFKBUFU/AQ4aVnvrCO1XAat61PuBhZMZiyRpcvymsiQJMBAkSY2BIEkCDARJUmMgSJIAA0GS1BgIkiTAQJAkNQaCJAkwECRJjYEgSQIMBElSYyBIkgADQZLUGAiSJMBAkCQ1BoIkCZhkICR5KMm6JGuT9LfagUmuT3Jf+3lAV/szk6xPcm+SY7rqi9t+1ic5tz1bWZI0jabiDOEPqmpRVS1pr88AbqiqBcAN7TVJDgdWAEcAy4HzkuzV+pwPrAQWtGX5FIxLkjQOO2LK6FhgTVtfAxzXVb+sqjZX1YPAemBpktnAflV1c1UVcHFXH0nSNJlsIBTw5SS3JVnZaodU1UaA9vN5rT4HeLir70CrzWnrw+vbSLIySX+S/sHBwUkOXZLUbdYk+7+yqjYkeR5wfZJ7Rmjb67pAjVDftli1GlgNsGTJkp5tJEkTM6kzhKra0H4+ClwFLAUeadNAtJ+PtuYDwLyu7nOBDa0+t0ddkjSNJhwISZ6T5DeG1oE/Au4ArgFObs1OBq5u69cAK5Lsk+RQOhePb23TSpuSHNXuLjqpq48kaZpMZsroEOCqdofoLOCzVfWPSb4FXJHkVOD7wAkAVXVnkiuAu4AtwOlVtbXt6zTgImBf4Lq2SJKm0YQDoaoeAF7ao/44sGw7fVYBq3rU+4GFEx2LJGny/KayJAkwECRJjYEgSQIMBElSYyBIkgADQZLUGAiSJMBAkCQ1BoIkCTAQJEmNgSBJAgwESVJjIEiSAANBktQYCJIkwECQJDWTeYTmvCT/lOTuJHcm+YtW/2CSHyRZ25bXdfU5M8n6JPcmOaarvjjJurbt3PYoTUnSNJrMIzS3AO+pqm+3ZyvfluT6tu2cqvp4d+MkhwMrgCOA3wa+kuTF7TGa5wMrgW8C1wLL8TGakjStJnyGUFUbq+rbbX0TcDcwZ4QuxwKXVdXmqnoQWA8sTTIb2K+qbq6qAi4GjpvouCRJEzMl1xCSzAeOBG5ppXckuT3JhUkOaLU5wMNd3QZabU5bH17vdZyVSfqT9A8ODk7F0CVJzaQDIclzgSuBd1XVj+lM/7wIWARsBM4eatqje41Q37ZYtbqqllTVkr6+vskOXZLUZVKBkGRvOmHwmar6PEBVPVJVW6vqF8CngaWt+QAwr6v7XGBDq8/tUZckTaPJ3GUU4ALg7qr6RFd9dlezNwF3tPVrgBVJ9klyKLAAuLWqNgKbkhzV9nkScPVExyVJmpjJ3GX0SuCtwLoka1vtvcCJSRbRmfZ5CHgbQFXdmeQK4C46dyid3u4wAjgNuAjYl87dRd5hJEnTbMKBUFVfp/f8/7Uj9FkFrOpR7wcWTnQskqTJ85vKkiTAQJAkNQaCJAkwECRJjYEgSQIMBElSYyBIkgADQZLUGAiSJMBAkCQ1BoIkCTAQJEmNgSBJAgwESVJjIEiSAANBktQYCJIkYCcKhCTLk9ybZH2SM2Z6PJK0p9kpAiHJXsAngdcCh9N5LvPhMzsqSdqz7BSBACwF1lfVA1X1M+Ay4NgZHpMk7VFmzfQAmjnAw12vB4B/O7xRkpXAyvbyqST3TsPY9hQHA4/N9CBGk7NmegSaAf63ObVesL0NO0sgpEettilUrQZW7/jh7HmS9FfVkpkehzSc/21On51lymgAmNf1ei6wYYbGIkl7pJ0lEL4FLEhyaJJfB1YA18zwmCRpj7JTTBlV1ZYk7wC+BOwFXFhVd87wsPY0TsVpZ+V/m9MkVdtM1UuS9kA7y5SRJGmG
GQiSJMBA2C0k2Zpkbdcyfwce66EkB++o/WvPkKSSXNL1elaSwST/MEq/o0dro4nbKS4qa9J+WlWLZnoQ0jj8K7Awyb5V9VPgD4EfzPCY9nieIeymkixO8rUktyX5UpLZrf7VJOckuSnJ3UleluTzSe5L8jdd/b/Q+t7ZviHe6xh/muTWdlby9+1vUkljdR3w+rZ+InDp0IYkS5N8I8l32s+XDO+c5DlJLkzyrdbOP3czSQbC7mHfrumiq5LsDfwP4PiqWgxcCKzqav+zqno18CngauB0YCFwSpKDWps/b32XAO/sqgOQ5HeBtwCvbGcnW4E/2XFvUbuhy4AVSZ4F/B5wS9e2e4BXV9WRwF8D/61H//cBN1bVy4A/AD6W5Dk7eMy7NaeMdg/PmDJKspDOL/jrk0Dnux0bu9oPfelvHXBnVW1s/R6g843xx+mEwJtau3nAglYfsgxYDHyrHWNf4NEpfVfarVXV7e1614nAtcM27w+sSbKAzp+x2bvHLv4IeGOSv2yvnwU8H7h7x4x492cg7J5C5xf9y7ezfXP7+Yuu9aHXs5IcDbwGeHlV/STJV+n8zzb8GGuq6sypGrT2SNcAHweOBrrPQj8M/FNVvamFxld79A3w76vKP3I5RZwy2j3dC/QleTlAkr2THDGO/vsDT7Qw+B3gqB5tbgCOT/K8dowDk2z3ryhK23Eh8KGqWjesvj+/ush8ynb6fgn4j2mnqEmO3CEj3IMYCLuh9kyJ44GzknwXWAu8Yhy7+Ec6Zwq30/mk9s0ex7gLeD/w5dbuemD2JIeuPUxVDVTV3/XY9FHgvyf5v3SmPHv5MJ2ppNuT3NFeaxL80xWSJMAzBElSYyBIkgADQZLUGAiSJMBAkCQ1BoIkCTAQJEnN/wcqlWaXRFP9cQAAAABJRU5ErkJggg==", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "training_data[\"Sex\"].value_counts().sort_values().plot(kind=\"bar\", title=\"Counts of Sex\", rot=0)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "$50K'}>" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAEICAYAAACzliQjAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8/fFQqAAAACXBIWXMAAAsTAAALEwEAmpwYAAAYxElEQVR4nO3dfZRdVX3G8e9jEkIEAsRMYpwJDmAUklSgGUJQa7FBCaImtkaDLySWmmUWrbalqyupreJLWnxrFQvYVDCDL8Qs30iRqDEYFY2EQV7SJKTM4i3TRDKAYEAbSPz1j7NHjzd37txJZu6Q2c9nrbPuOfvsfc4+d26ee+4+594oIjAzszw8Z6g7YGZmjePQNzPLiEPfzCwjDn0zs4w49M3MMuLQNzPLiEPfDiuSJkr6oaQ9kj451P1pBElrJS0c6n4ASFo51H2wQ+PQH+YkvVVSh6QnJe1KAfKKBuw3JL1oEDa9GHgEGBsRl1bZb4ukr0l6RNITkjZLWjQI/WiYiDg/ItqHuh+9kXSZpGfSa6xnOqm0vlXS9yX9StI9ks4trVsk6ZbS8lhJP05/w1GNPpYcOPSHMUl/C3wK+GdgInACcBUwdwi7daheCGyN3r9V+AVgR6r3POAi4OEG9a3fJI0cwn2PlnRsnXWXSXoQuFBSl6QPVFT5SkQcXZruK627HriD4u/xPuCrkpqq7ON44HvAg8BbIuKZgzku60NEeBqGE3As8CQwv0ad0RRvCjvT9ClgdFq3CLilon4AL0rzK4ErgW8Be4BbgZPTuh+muk+lPrwFGA/cCDwOPAb8CHhOL/16GXAb8ER6fFlpn88AT6ftnlul7ZPA6TWOeRbwk9SPu4BzSvt8BJiclk9LdU7pZTunAOvSsWwH3lxadwFFyP2S4g3ostK61vTcXAw8lJ6rRcAtwCeAXwD3A+eX2mwA/qL8d6lR98S0zT0UAXol8MVejqE59fFLwLk1/h5np+M8Jf0Nmnqet7T+shr7eDGwFzimVPYj4N0VxzM+PWef760fngYoG4a6A54G6Q8Lc4B9wMgadT4E/BSYkP4h/wT4cFq3iL5D/zFgJjAyBceqanXT8r8AnwVGpemPAFXp07gUZu9I270wLT+vtN+P1Dim7wE/BhYAJ1SsawYeBV5L8Sn31Wm5Ka1fDtwMjAHuBv6yl30cRRHm70x9/EOKN4xpaf05wB+kfbyU4pPGvLSuNT0316XtjEnP9TPAu4ARwBKKN2GlNhv4/dCvVXcjxRvCEcArKEK9aiCn+s8HLk3H+2B6TZxUUWce8D+AgJVVtnEZxRv0Y8AWYElp3RuBbRX1/x34TOl4tqZ2n632mvA0sJOHd4av5wGPRMS+GnXeBnwoInZHRDfwQYqwrdfXI2JT2seXgNNr1H0GmAS8MCKeiYgfRfpXX+EC4N6I+EJE7IuI64F7gNfX2af5FGeS/wTcL+lOSWemdW8HboqImyLiNxGxDuigeBOAIryOBTZRBOmVvezjdcADEfH51MefAV8D3gQQE
RsiYnPax90Uwxt/XLGNyyLiqYj4dVp+MCL+MyL2A+3puZrYy/6r1pV0AnAm8P6IeDoibgHW1HqyIuLnEfHJiHgpRUAfB/xU0gZJp6Vq6yg+OfwMOEPSBZJGlDazGjiV4sThXcD7JV2Y1h1N8YZQ9gRwTGl5MsUngs/38pqwAeTQH74eBcb3MWb8Aoqzux4PprJ6/bw0/yuKf+C9+TjQCXxX0n2SltbZp55+NdfToYj4RUQsjYhpFKF5J/BNSaIY558v6fGeieJseFJq+wzFJ4npwCdrBNALgbMqtvM2irNmJJ2VLlx2S3oCeDfF8EXZjorl3z6XEfGrNNvb89lb3RcAj5XKqu2nlk6KIa9OiqGc49I+nqL4RHcpxdn+R4ENPa+tiNgaETsjYn9E/AT4NOkNkGK4bWzFfsZSvIn0uAv4O2CtpDP60V87CA794Wsj8H8UH817s5MiwHqckMqgGI9/bs8KSc8/lM5ExJ6IuDQiTqI4a/9bSbPr6FNPv/73IPb5CMVQxwsoho12AF+IiONK01ERcTmApGbgAxTjyp+UNLqXTe8AflCxnaMjYkla/2WKM+zJEXEsadiisnv9PZ467ALGSXpuqWxyrQaSRkiaI+l6imsMF1AMxbVExA9+29ki0G+mONtvoxi2Or2XzQa/O94twEmSymf2p6Xy3zWI+DRwObBO0vSaR2mHxKE/TEXEE8D7gSslzZP0XEmjJJ0v6WOp2vXAP0pqkjQ+1f9iWncXME3S6ZKOpBj66I+HgfJte6+T9KJ0xv1LYH+aKt0EvDjdajpS0luAqRQXgfsk6aOSpqe2x1CMeXdGxKPp2F4v6bwUdkdKOifd5imKs/xrKC6y7gI+3Mtubkx9fEd6TkdJOlPSqWn9MRRn3P8naSbw1nr6fqgi4kGK4arLJB0h6WxqDItJmgB0UYT8TymuwfxpRPxXeVhQUpuks0pNT6a4CaA7rZ8r6XgVZgLvAW5Iffofik9bH0jP9xsp3jC+VqX/H6P4lPA9SS856CfCahvqiwqeBneiGHbooDhz/znF3TY9d8McCVxBEXC70vyRpbbvo7hAuYNiPLzyQu5HSnXPAbpKy+9O23wceDPwN8ADqR9dwD/V6PMrgNspxn5vB15RWvd7+63S9jPAvRTDCt0UAX1qaf1ZwA8oLjp2p+fjBOC9FBczj0j1XpDW/1Ev+3lJattNMZR2M+muIYqhjQcphjBupLhw+cW0rjU9jyNL21pE7YvmG6i4e6dG3ZMprmnsAdYDK4BrejmGo4HT6ngNnQF8P70OnqT4RFC+WHt9eg6epLj+8p6K9q3pGH5NcafTuX0c+0fSa+Tkof73Mxynniv+ZjYMSfoKcE9EVN5Xf7DbWxkRiwZiWzY0PLxjNoykYaaTJT1H0hyKL+J9c4i7Zc8iQ/ZtQDMbFM8Hvk5xy24XxTDMHQO1cZ/lH/48vGNmlhEP75iZZeRZP7wzfvz4aG1tHepumJkdVm6//fZHIuKAH7Z71od+a2srHR0dQ90NM7PDSvpV1AN4eMfMLCMOfTOzjDj0zcwy4tA3M8uIQ9/MLCMOfTOzjDj0zcwy4tA3M8uIQ9/MLCPP+m/kmtmhaV36raHuwrDywOUXDHUXDonP9M3MMuLQNzPLiEPfzCwjDn0zs4w49M3MMuLQNzPLiEPfzCwjDn0zs4zUFfqSjpP0VUn3SNom6WxJ4yStk3Rvejy+VH+ZpE5J2yWdVyqfIWlzWneFJA3GQZmZWXX1nul/Gvh2RJwCnAZsA5YC6yNiCrA+LSNpKrAAmAbMAa6SNCJt52pgMTAlTXMG6DjMzKwOfYa+pLHAK4FrACLi6Yh4HJgLtKdq7cC8ND8XWBUReyPifqATmClpEjA2IjZGRADXldqYmVkD1HOmfxLQDXxe0h2SPifpKGBiROwCSI8TUv1mYEepfVcqa07zleVmZtYg9YT+SOAPgasj4gzgKdJQTi+qjdNHjfIDNyAtltQhqaO7u7uOLpqZW
T3qCf0uoCsibk3LX6V4E3g4DdmQHneX6k8utW8BdqbylirlB4iIFRHRFhFtTU1N9R6LmZn1oc/Qj4ifAzskvSQVzQa2AmuAhalsIXBDml8DLJA0WtKJFBdsN6UhoD2SZqW7di4qtTEzswao9/f0/wr4kqQjgPuAd1K8YayWdDHwEDAfICK2SFpN8cawD7gkIvan7SwBVgJjgLVpMjOzBqkr9CPiTqCtyqrZvdRfDiyvUt4BTO9H/8zMbAD5G7lmZhlx6JuZZcShb2aWEYe+mVlGHPpmZhlx6JuZZcShb2aWEYe+mVlGHPpmZhlx6JuZZcShb2aWEYe+mVlGHPpmZhlx6JuZZcShb2aWEYe+mVlGHPpmZhlx6JuZZcShb2aWEYe+mVlGHPpmZhlx6JuZZcShb2aWEYe+mVlG6gp9SQ9I2izpTkkdqWycpHWS7k2Px5fqL5PUKWm7pPNK5TPSdjolXSFJA39IZmbWm/6c6b8qIk6PiLa0vBRYHxFTgPVpGUlTgQXANGAOcJWkEanN1cBiYEqa5hz6IZiZWb0OZXhnLtCe5tuBeaXyVRGxNyLuBzqBmZImAWMjYmNEBHBdqY2ZmTVAvaEfwHcl3S5pcSqbGBG7ANLjhFTeDOwote1KZc1pvrL8AJIWS+qQ1NHd3V1nF83MrC8j66z38ojYKWkCsE7SPTXqVhunjxrlBxZGrABWALS1tVWtY2Zm/VfXmX5E7EyPu4FvADOBh9OQDelxd6reBUwuNW8BdqbylirlZmbWIH2GvqSjJB3TMw+8BvhvYA2wMFVbCNyQ5tcACySNlnQixQXbTWkIaI+kWemunYtKbczMrAHqGd6ZCHwj3V05EvhyRHxb0m3AakkXAw8B8wEiYouk1cBWYB9wSUTsT9taAqwExgBr02RmZg3SZ+hHxH3AaVXKHwVm99JmObC8SnkHML3/3TQzs4Hgb+SamWXEoW9mlhGHvplZRhz6ZmYZceibmWXEoW9mlhGHvplZRhz6ZmYZceibmWXEoW9mlhGHvplZRhz6ZmYZceibmWXEoW9mlhGHvplZRhz6ZmYZceibmWXEoW9mlhGHvplZRhz6ZmYZceibmWXEoW9mlhGHvplZRuoOfUkjJN0h6ca0PE7SOkn3psfjS3WXSeqUtF3SeaXyGZI2p3VXSNLAHo6ZmdXSnzP99wLbSstLgfURMQVYn5aRNBVYAEwD5gBXSRqR2lwNLAampGnOIfXezMz6pa7Ql9QCXAB8rlQ8F2hP8+3AvFL5qojYGxH3A53ATEmTgLERsTEiAriu1MbMzBqg3jP9TwF/D/ymVDYxInYBpMcJqbwZ2FGq15XKmtN8ZfkBJC2W1CGpo7u7u84umplZX/oMfUmvA3ZHxO11brPaOH3UKD+wMGJFRLRFRFtTU1OduzUzs76MrKPOy4E3SHotcCQwVtIXgYclTYqIXWnoZneq3wVMLrVvAXam8pYq5WZm1iB9nulHxLKIaImIVooLtDdHxNuBNcDCVG0hcEOaXwMskDRa0okUF2w3pSGgPZJmpbt2Liq1MTOzBqjnTL83lwOrJV0MPATMB4iILZJWA1uBfcAlEbE/tVkCrATGAGvTZGZmDdKv0I+IDcCGNP8oMLuXesuB5VXKO4Dp/e2kmZkNDH8j18wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy0ifoS/pSEmbJN0laYukD6bycZLWSbo3PR5farNMUqek7ZLOK5XPkLQ5rbtCkgbnsMzMrJp6zvT3An8SEacBpwNzJM0ClgLrI2IKsD4tI2kqsACYBswBrpI0Im3ramAxMCVNcwbuUMzMrC99hn4UnkyLo9IUwFygPZW3A/PS/FxgVUTsjYj7gU5gp
qRJwNiI2BgRAVxXamNmZg1Q15i+pBGS7gR2A+si4lZgYkTsAkiPE1L1ZmBHqXlXKmtO85Xl1fa3WFKHpI7u7u5+HI6ZmdVSV+hHxP6IOB1ooThrn16jerVx+qhRXm1/KyKiLSLampqa6umimZnVoV9370TE48AGirH4h9OQDelxd6rWBUwuNWsBdqbylirlZmbWIPXcvdMk6bg0PwY4F7gHWAMsTNUWAjek+TXAAkmjJZ1IccF2UxoC2iNpVrpr56JSGzMza4CRddSZBLSnO3CeA6yOiBslbQRWS7oYeAiYDxARWyStBrYC+4BLImJ/2tYSYCUwBlibJjMza5A+Qz8i7gbOqFL+KDC7lzbLgeVVyjuAWtcDzMxsEPkbuWZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpaRPkNf0mRJ35e0TdIWSe9N5eMkrZN0b3o8vtRmmaROSdslnVcqnyFpc1p3hSQNzmGZmVk19Zzp7wMujYhTgVnAJZKmAkuB9RExBViflknrFgDTgDnAVZJGpG1dDSwGpqRpzgAei5mZ9aHP0I+IXRHxszS/B9gGNANzgfZUrR2Yl+bnAqsiYm9E3A90AjMlTQLGRsTGiAjgulIbMzNrgH6N6UtqBc4AbgUmRsQuKN4YgAmpWjOwo9SsK5U1p/nK8mr7WSypQ1JHd3d3f7poZmY11B36ko4Gvgb8dUT8slbVKmVRo/zAwogVEdEWEW1NTU31dtHMzPpQV+hLGkUR+F+KiK+n4ofTkA3pcXcq7wIml5q3ADtTeUuVcjMza5B67t4RcA2wLSL+tbRqDbAwzS8EbiiVL5A0WtKJFBdsN6UhoD2SZqVtXlRqY2ZmDTCyjjovB94BbJZ0Zyr7B+ByYLWki4GHgPkAEbFF0mpgK8WdP5dExP7UbgmwEhgDrE2TmZk1SJ+hHxG3UH08HmB2L22WA8urlHcA0/vTQTMzGzj+Rq6ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llxKFvZpYRh76ZWUYc+mZmGXHom5llZGRfFSRdC7wO2B0R01PZOOArQCvwAPDmiPhFWrcMuBjYD7wnIr6TymcAK4ExwE3AeyMiBvZwhk7r0m8NdReGjQcuv2Cou2A2bNVzpr8SmFNRthRYHxFTgPVpGUlTgQXAtNTmKkkjUpurgcXAlDRVbtPMzAZZn6EfET8EHqsongu0p/l2YF6pfFVE7I2I+4FOYKakScDYiNiYzu6vK7UxM7MGOdgx/YkRsQsgPU5I5c3AjlK9rlTWnOYry6uStFhSh6SO7u7ug+yimZlVGugLuapSFjXKq4qIFRHRFhFtTU1NA9Y5M7PcHWzoP5yGbEiPu1N5FzC5VK8F2JnKW6qUm5lZAx1s6K8BFqb5hcANpfIFkkZLOpHigu2mNAS0R9IsSQIuKrUxM7MGqeeWzeuBc4DxkrqADwCXA6slXQw8BMwHiIgtklYDW4F9wCURsT9tagm/u2VzbZrMzKyB+gz9iLiwl1Wze6m/HFhepbwDmN6v3pmZ2YDyN3LNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMOPTNzDLi0Dczy4hD38wsIw59M7OMNDz0Jc2RtF1Sp6Sljd6/mVnOGhr6kkYAVwLnA1OBCyVNbWQfzMxy1ugz/ZlAZ0TcFxFPA6uAuQ3ug5lZt
kY2eH/NwI7SchdwVmUlSYuBxWnxSUnbG9C3HIwHHhnqTvRFHx3qHtgQ8etzYL2wWmGjQ19VyuKAgogVwIrB705eJHVERNtQ98OsGr8+G6PRwztdwOTScguws8F9MDPLVqND/zZgiqQTJR0BLADWNLgPZmbZaujwTkTsk/SXwHeAEcC1EbGlkX3InIfM7NnMr88GUMQBQ+pmZjZM+Ru5ZmYZceibmWXEoX+YkLRf0p2lqXUQ9/WApPGDtX3Lh6SQ9IXS8khJ3ZJu7KPdOX3VsYPT6Pv07eD9OiJOH+pOmPXTU8B0SWMi4tfAq4H/HeI+Zc1n+ocxSTMk/UDS7ZK+I2lSKt8g6d8k/VDSNklnSvq6pHslfaTU/pup7Zb0Lehq+3i7pE3p08V/pN9PMuuPtcAFaf5C4PqeFZJmSvqJpDvS40sqG0s6StK1km5L9fzTLYfAoX/4GFMa2vmGpFHAZ4A3RcQM4Fpgean+0xHxSuCzwA3AJcB0YJGk56U6f57atgHvKZUDIOlU4C3Ay9OnjP3A2wbvEG2YWgUskHQk8FLg1tK6e4BXRsQZwPuBf67S/n3AzRFxJvAq4OOSjhrkPg9bHt45fPze8I6k6RQhvk4SFN972FWq3/Olt83AlojYldrdR/Gt6Ecpgv6Nqd5kYEoq7zEbmAHclvYxBtg9oEdlw15E3J2uQV0I3FSx+ligXdIUip9kGVVlE68B3iDp79LykcAJwLbB6fHw5tA/fIkizM/uZf3e9Pib0nzP8khJ5wDnAmdHxK8kbaD4x1S5j/aIWDZQnbZsrQE+AZwDlD9Rfhj4fkS8Mb0xbKjSVsCfRYR/eHEAeHjn8LUdaJJ0NoCkUZKm9aP9scAvUuCfAsyqUmc98CZJE9I+xkmq+st9Zn24FvhQRGyuKD+W313YXdRL2+8Af6X0cVPSGYPSw0w49A9T6f8jeBPwUUl3AXcCL+vHJr5NccZ/N8XZ1k+r7GMr8I/Ad1O9dcCkQ+y6ZSgiuiLi01VWfQz4F0k/phiirObDFMM+d0v677RsB8k/w2BmlhGf6ZuZZcShb2aWEYe+mVlGHPpmZhlx6JuZZcShb2aWEYe+mVlG/h9PcIaPfLaQ0gAAAABJRU5ErkJggg==", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "training_data[\"Sex\"].where(training_data[\"Target\"] == \">50K\").value_counts().sort_values().plot(\n", " kind=\"bar\", title=\"Counts of Sex earning >$50K\", rot=0\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Encode and Upload the Dataset\n", "Here we encode the training and test data. Encoding input data is not necessary for SageMaker Clarify, but is necessary for the model." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sklearn import preprocessing\n", "\n", "\n", "def number_encode_features(df):\n", " result = df.copy()\n", " encoders = {}\n", " for column in result.columns:\n", " if result.dtypes[column] == np.object:\n", " encoders[column] = preprocessing.LabelEncoder()\n", " result[column] = encoders[column].fit_transform(result[column].fillna(\"None\"))\n", " return result, encoders\n", "\n", "\n", "training_data = pd.concat([training_data[\"Target\"], training_data.drop([\"Target\"], axis=1)], axis=1)\n", "training_data, _ = number_encode_features(training_data)\n", "training_data.to_csv(\"train_data.csv\", index=False, header=False)\n", "\n", "testing_data, _ = number_encode_features(testing_data)\n", "test_features = testing_data.drop([\"Target\"], axis=1)\n", "test_target = testing_data[\"Target\"]\n", "test_features.to_csv(\"test_features.csv\", index=False, header=False)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "A quick note about our encoding: the \"Female\" Sex value has been encoded as 0 and \"Male\" as 1." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Target</th><th>Age</th><th>Workclass</th><th>fnlwgt</th><th>Education</th><th>Education-Num</th><th>Marital Status</th><th>Occupation</th><th>Relationship</th><th>Ethnic group</th><th>Sex</th><th>Capital Gain</th><th>Capital Loss</th><th>Hours per week</th><th>Country</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>0</td><td>39</td><td>5</td><td>77516</td><td>9</td><td>13</td><td>4</td><td>0</td><td>1</td><td>4</td><td>1</td><td>2174</td><td>0</td><td>40</td><td>38</td></tr>\n",
"    <tr><th>1</th><td>0</td><td>50</td><td>4</td><td>83311</td><td>9</td><td>13</td><td>2</td><td>3</td><td>0</td><td>4</td><td>1</td><td>0</td><td>0</td><td>13</td><td>38</td></tr>\n",
"    <tr><th>2</th><td>0</td><td>38</td><td>2</td><td>215646</td><td>11</td><td>9</td><td>0</td><td>5</td><td>1</td><td>4</td><td>1</td><td>0</td><td>0</td><td>40</td><td>38</td></tr>\n",
"    <tr><th>3</th><td>0</td><td>53</td><td>2</td><td>234721</td><td>1</td><td>7</td><td>2</td><td>5</td><td>0</td><td>2</td><td>1</td><td>0</td><td>0</td><td>40</td><td>38</td></tr>\n",
"    <tr><th>4</th><td>0</td><td>28</td><td>2</td><td>338409</td><td>9</td><td>13</td><td>2</td><td>9</td><td>5</td><td>2</td><td>0</td><td>0</td><td>0</td><td>40</td><td>4</td></tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " Target Age Workclass fnlwgt Education Education-Num Marital Status \\\n", "0 0 39 5 77516 9 13 4 \n", "1 0 50 4 83311 9 13 2 \n", "2 0 38 2 215646 11 9 0 \n", "3 0 53 2 234721 1 7 2 \n", "4 0 28 2 338409 9 13 2 \n", "\n", " Occupation Relationship Ethnic group Sex Capital Gain Capital Loss \\\n", "0 0 1 4 1 2174 0 \n", "1 3 0 4 1 0 0 \n", "2 5 1 4 1 0 0 \n", "3 5 0 2 1 0 0 \n", "4 9 5 2 0 0 0 \n", "\n", " Hours per week Country \n", "0 40 38 \n", "1 13 38 \n", "2 40 38 \n", "3 40 38 \n", "4 40 4 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "training_data.head()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Lastly, let's upload the data to S3." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "tags": [] }, "outputs": [], "source": [ "from sagemaker.s3 import S3Uploader\n", "from sagemaker.inputs import TrainingInput\n", "\n", "train_uri = S3Uploader.upload(\n", " local_path=\"train_data.csv\",\n", " desired_s3_uri=\"s3://{}/{}\".format(bucket, prefix),\n", " sagemaker_session=sagemaker_session,\n", ")\n", "train_input = TrainingInput(train_uri, content_type=\"csv\")\n", "test_uri = S3Uploader.upload(\n", " local_path=\"test_features.csv\",\n", " desired_s3_uri=\"s3://{}/{}\".format(bucket, prefix),\n", " sagemaker_session=sagemaker_session,\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Train XGBoost Model\n", "#### Train Model\n", "Since our focus is on understanding how to use SageMaker Clarify, we keep it simple by using a standard XGBoost model. For this section we will be using Amazon SageMaker Python SDK for simplicity.\n", "\n", "It takes about 5 minutes for the model to be trained." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:sagemaker:Creating training-job with name: sagemaker-xgboost-2023-01-19-01-01-28-728\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "2023-01-19 01:01:28 Starting - Starting the training job...\n", "2023-01-19 01:01:46 Starting - Preparing the instances for training......\n", "2023-01-19 01:02:23 Downloading - Downloading input data....\n", "2023-01-19 01:02:48 Training - Downloading the training image...\n", "2023-01-19 01:03:08 Training - Training image download completed. Training in progress.......\n", "2023-01-19 01:03:44 Uploading - Uploading generated training model.\n", "2023-01-19 01:03:55 Completed - Training job completed\n" ] } ], "source": [ "from sagemaker.image_uris import retrieve\n", "from sagemaker.estimator import Estimator\n", "\n", "# This references the AWS-managed XGBoost container\n", "xgboost_image_uri = retrieve(region=region, framework=\"xgboost\", version=\"1.5-1\")\n", "\n", "xgb = Estimator(\n", "    xgboost_image_uri,\n", "    role,\n", "    instance_count=1,\n", "    instance_type=\"ml.m5.xlarge\",\n", "    disable_profiler=True,\n", "    sagemaker_session=sagemaker_session,\n", ")\n", "\n", "xgb.set_hyperparameters(\n", "    max_depth=5,\n", "    eta=0.2,\n", "    gamma=4,\n", "    min_child_weight=6,\n", "    subsample=0.8,\n", "    objective=\"binary:logistic\",\n", "    num_round=800,\n", ")\n", "\n", "xgb.fit({\"train\": train_input}, logs=False)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Create Model\n", "Here we create the SageMaker model."
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:sagemaker:Creating model with name: DEMO-clarify-xgboost-model\n" ] }, { "data": { "text/plain": [ "'DEMO-clarify-xgboost-model'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_name = \"DEMO-clarify-xgboost-model\"\n", "model = xgb.create_model(name=model_name)\n", "container_def = model.prepare_container_def()\n", "sagemaker_session.create_model(model_name, role, container_def)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Amazon SageMaker Clarify\n", "With your model set up, it's time to explore SageMaker Clarify. For a general overview of how SageMaker Clarify processing jobs work, refer to [the provided link](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-processing-job-configure-how-it-works.html). This section will demonstrate how to use the AWS SDK for Python (Boto3) to launch SageMaker Clarify processing jobs." 
] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Set Configurations" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Role: arn:aws:iam::000000000000:role/service-role/SMClarifySageMaker-ExecutionRole\n", "Region: ap-south-1\n" ] } ], "source": [ "# Initialise SageMaker boto3 client\n", "sagemaker_client = boto3.Session().client(\"sagemaker\")\n", "\n", "# Note: We use the role fetched in section 1 for convenience/demo purposes so that the notebook\n", "# can be easily executed in SageMaker Studio or a SageMaker Notebook Instance.\n", "# You can use your own execution role for your project.\n", "print(f\"Role: {role}\")\n", "\n", "print(f\"Region: {region}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Get Started with a SageMaker Clarify Container\n", "Amazon SageMaker provides prebuilt SageMaker Clarify container images that include the libraries and other dependencies needed to compute bias metrics and feature attributions for explainability. These images are enabled to run SageMaker Clarify processing jobs in your account.\n", "\n", "The following code uses the SageMaker Python SDK to retrieve the image URI. If you are unable to use the SageMaker Python SDK, you can find the image URI by referring to [the regional image URI page](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-processing-job-configure-container.html)."
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Clarify Image URI: 452307495513.dkr.ecr.ap-south-1.amazonaws.com/sagemaker-clarify-processing:1.0\n" ] } ], "source": [ "clarify_image_uri = retrieve(region=region, framework=\"clarify\", version=\"1.0\")\n", "print(f\"Clarify Image URI: {clarify_image_uri}\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Detecting Bias\n", "SageMaker Clarify helps you detect possible [pre-training](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-detect-data-bias.html) and [post-training](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-detect-post-training-bias.html) biases using a variety of metrics." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Configure a SageMaker Clarify Processing Job Container's Input and Output Parameters \n", "The processing job requires that you specify the following input parameters: a dataset file (or files) with the input name \"dataset\" as an Amazon S3 object or prefix, and an analysis configuration file with the input name \"analysis_config\" as an Amazon S3 object. The job also requires an output parameter: the output location as an Amazon S3 prefix." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "tags": [] }, "outputs": [], "source": [ "bias_analysis_config_path = \"s3://{}/{}/bias_analysis_config.json\".format(bucket, prefix)\n", "bias_analysis_result_path = \"s3://{}/{}/bias_analysis_output\".format(bucket, prefix)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Configure Analysis Config\n", "The inputs for the analysis are configured by the parameters of the ProcessingInput API.
The \"analysis_config\" value of `InputName` identifies the JSON file that contains the configuration values. The path to the JSON file is provided in the `S3Uri` field of that input. For more details on configuring an analysis config, see [the documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-processing-job-configure-analysis.html).\n", "\n", "For our example use case we will be using the following analysis config:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "{\n", " \"dataset_type\": \"text/csv\",\n", " \"headers\": [\"Target\", \"Age\", \"Workclass\", \"fnlwgt\", \"Education\", \"Education-Num\", \"Marital Status\", \"Occupation\", \"Relationship\", \"Ethnic group\", \"Sex\", \"Capital Gain\", \"Capital Loss\", \"Hours per week\", \"Country\"],\n", " \"label\": \"Target\",\n", " \"label_values_or_threshold\": [1],\n", " \"facet\": [{\n", " \"name_or_index\": \"Sex\",\n", " \"value_or_threshold\": [0]\n", " }],\n", " \"group_variable\": \"Age\",\n", " \"probability_threshold\": 0.8,\n", " \"methods\": {\n", " \"pre_training_bias\": {\n", " \"methods\": \"all\"\n", " },\n", " \"post_training_bias\": {\n", " \"methods\": \"all\"\n", " },\n", " \"report\": {\n", " \"name\": \"report\",\n", " \"title\": \"Analysis Report\"\n", " }\n", " },\n", " \"predictor\": {\n", " \"model_name\": \"DEMO-clarify-xgboost-model\",\n", " \"instance_type\": \"ml.m5.xlarge\",\n", " \"initial_instance_count\": 1,\n", " \"accept_type\": \"text/csv\",\n", " \"content_type\": \"text/csv\"\n", " }\n", "}" ] } ], "source": [ "!echo\n", "!cat analysis_config/bias_analysis_config.json" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "`bias_analysis_config.json` here contains configuration values for detecting bias using a SageMaker Clarify container:\n", "\n", "* `dataset_type` specifies the format of your dataset; for this example, as we
are using a CSV dataset, this will be `text/csv`.\n", "* `headers` is the list of column names in the dataset.\n", "* `label` specifies the ground truth label, which is also known as the observed label or target attribute. It is used for many bias metrics. In this example, the \"Target\" column has the ground truth label.\n", "* `facet`: SageMaker Clarify also needs information on what the sensitive columns (`facet: name_or_index`) are, what the sensitive values (`facet: value_or_threshold`) may be, and what the desirable outcomes are (`label_values_or_threshold`).\n", "SageMaker Clarify can handle both categorical and continuous data for `facet: value_or_threshold` and for `label_values_or_threshold`. In this case we are using categorical data. This example has selected the sensitive attribute `Sex` to analyze, with the facet value `0` (the encoded \"Female\" value). The results will show if the model has a preference for records of one sex over the other.\n", "* `group_variable`: This example has selected the \"Age\" column, which is used to form subgroups for measuring the bias metrics [Conditional Demographic Disparity (CDD)](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-cddl.html) and [Conditional Demographic Disparity in Predicted Labels (CDDPL)](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-cddpl.html).\n", "* `probability_threshold` indicates the threshold used to select the binary label in the case of binary classification. The XGBoost model outputs probabilities, so SageMaker Clarify invokes the endpoint and then uses `probability_threshold` to convert each probability to a binary label for bias analysis. A prediction above the threshold is interpreted as label value `1`, and a prediction below or equal to it as label value `0`.\n", "* `methods` is the list of methods and their parameters for the analyses and reports. If any section is omitted, then it is not computed.\n", "    * `pre_training_bias`: Pre-training bias metrics to be computed.
The detailed description of the metrics can be found in [Measure Pre-training Bias](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-measure-data-bias.html). This example sets `methods` to \"all\" to compute all the pre-training bias metrics.\n", "    * `post_training_bias`: Post-training bias metrics to be computed. The detailed description of the metrics can be found in [Measure Post-training Bias](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-detect-post-training-bias.html). This example sets `methods` to \"all\" to compute all the post-training bias metrics.\n", "* `predictor` includes the model configuration; this section is required if the analysis requires predictions from the model.\n", "    * `model_name`: the name of the model to analyze, here the XGBoost model created earlier, `DEMO-clarify-xgboost-model`.\n", "    * `instance_type` and `initial_instance_count` specify the instance type and instance count used to host your model during SageMaker Clarify's processing. The example dataset is small, so a single standard instance is good enough to run this example.\n", "    * `accept_type` denotes the endpoint response payload format, and `content_type` denotes the payload format of the request to the endpoint.
For the example model we created above, both of these will be `text/csv`." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole\n" ] }, { "data": { "text/plain": [ "'s3://sagemaker-ap-south-1-000000000000/sagemaker/DEMO-sagemaker-clarify-boto3/bias_analysis_config.json'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Upload the analysis config to its S3 path.\n", "S3Uploader.upload(\"analysis_config/bias_analysis_config.json\", \"s3://{}/{}\".format(bucket, prefix))" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Run SageMaker Clarify Processing Job\n", "Refer to this documentation to [configure a SageMaker Clarify processing job](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-processing-job-configure-parameters.html) for your use case."
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "tags": [] }, "outputs": [], "source": [ "def create_processing_job(analysis_config_path, analysis_result_path):\n", "    processing_job_name = \"DEMO-clarify-job-{}\".format(datetime.now().strftime(\"%d-%m-%Y-%H-%M-%S\"))\n", "\n", "    response = sagemaker_client.create_processing_job(\n", "        ProcessingJobName=processing_job_name,\n", "        AppSpecification={\"ImageUri\": clarify_image_uri},\n", "        ProcessingInputs=[\n", "            {\n", "                \"InputName\": \"analysis_config\",\n", "                \"S3Input\": {\n", "                    \"S3DataType\": \"S3Prefix\",\n", "                    \"S3InputMode\": \"File\",\n", "                    \"S3Uri\": analysis_config_path,\n", "                    \"LocalPath\": \"/opt/ml/processing/input/config\",\n", "                },\n", "            },\n", "            {\n", "                \"InputName\": \"dataset\",\n", "                \"S3Input\": {\n", "                    \"S3DataType\": \"S3Prefix\",\n", "                    \"S3InputMode\": \"File\",\n", "                    \"S3Uri\": train_uri,\n", "                    \"LocalPath\": \"/opt/ml/processing/input/data\",\n", "                },\n", "            },\n", "        ],\n", "        ProcessingOutputConfig={\n", "            \"Outputs\": [\n", "                {\n", "                    \"OutputName\": \"analysis_result\",\n", "                    \"S3Output\": {\n", "                        \"S3Uri\": analysis_result_path,\n", "                        \"LocalPath\": \"/opt/ml/processing/output\",\n", "                        \"S3UploadMode\": \"EndOfJob\",\n", "                    },\n", "                }\n", "            ]\n", "        },\n", "        ProcessingResources={\n", "            \"ClusterConfig\": {\n", "                \"InstanceCount\": 1,\n", "                \"InstanceType\": \"ml.m5.xlarge\",\n", "                \"VolumeSizeInGB\": 30,\n", "            }\n", "        },\n", "        StoppingCondition={\n", "            \"MaxRuntimeInSeconds\": 3600,\n", "        },\n", "        RoleArn=role,\n", "    )\n", "\n", "    return processing_job_name" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Here is a brief explanation of the inputs used above; for detailed documentation, see the [CreateProcessingJob API reference](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateProcessingJob.html):\n", "\n", "* `AppSpecification`: Here we provide the region-specific SageMaker Clarify image URI we fetched earlier.\n", "* `ProcessingInputs`:
A SageMaker Clarify job requires that you provide two ProcessingInput parameters.\n", "    * `InputName: analysis_config`: The analysis configuration JSON file for a SageMaker Clarify job must be specified as an Amazon S3 object with the InputName \"analysis_config\". We provide the example analysis configs that ship with this notebook.\n", "    * `InputName: dataset`: The dataset uploaded earlier, provided here as an Amazon S3 object.\n", "* `ProcessingOutputConfig`: The job also requires an output parameter: the output location as an Amazon S3 prefix with the OutputName \"analysis_result\". The S3UploadMode should be set to \"EndOfJob\", because the analysis results are generated at the end of the job. Here we provide the `analysis_result_path` that we configured earlier.\n", "* `ProcessingResources` contains the ClusterConfig specifying the ML compute instance type we want to use and the instance count. SageMaker Clarify SHAP analysis is CPU-intensive; to speed up the analysis, use a larger instance type or add more instances to enable Spark parallelization. The SageMaker Clarify job doesn’t use GPUs.\n", "* `StoppingCondition`: This example uses a maximum runtime of 60 minutes (3600 seconds). You can set the MaxRuntimeInSeconds of a SageMaker Clarify job to up to 7 days (604800 seconds). If the job cannot be completed within this time limit, it will be force-stopped and no analysis results are provided."
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# Wait for processing job to complete\n", "def wait_for_job(job_name):\n", "    while (\n", "        sagemaker_client.describe_processing_job(ProcessingJobName=job_name)[\"ProcessingJobStatus\"]\n", "        == \"InProgress\"\n", "    ):\n", "        print(\".\", end=\"\")\n", "        time.sleep(60)\n", "    print()" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ".......\n" ] } ], "source": [ "# The job takes about 10 minutes to run\n", "processing_job_name = create_processing_job(bias_analysis_config_path, bias_analysis_result_path)\n", "wait_for_job(processing_job_name)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Viewing the Bias Report\n", "In Studio, you can view the results under the experiments tab.\n", "\n", "Each bias metric has detailed explanations with examples that you can explore.\n", "\n", "You could also summarize the results in a handy table!\n", "\n", "If you're not a Studio user yet, you can access the complete analysis report at the following S3 location.\n" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'s3://sagemaker-ap-south-1-000000000000/sagemaker/DEMO-sagemaker-clarify-boto3/bias_analysis_output'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bias_analysis_result_path" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Explaining Predictions\n", "There are expanding business needs and legislative regulations that require explanations of _why_ a model made the decision it did. SageMaker Clarify uses Kernel SHAP to explain the contribution that each input feature makes to the final decision."
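, "\n", "\n", "As a hedged sketch in standard SHAP notation (the symbols below are not defined elsewhere in this notebook): for one record $x$ with $M$ feature columns, the attributions $\\phi_i(x)$ are intended to account for the gap between the model output $f(x)$ and a base value $\\phi_0$, the model output expected over the baseline data:\n", "\n", "$$f(x) \\approx \\phi_0 + \\sum_{i=1}^{M} \\phi_i(x)$$\n", "\n", "Kernel SHAP estimates each $\\phi_i(x)$ from a finite number of perturbed samples, so the values are approximations. Assuming global importance is taken as the mean absolute attribution over the $N$ explained records (the `mean_abs` aggregation), it can be written as:\n", "\n", "$$I_i = \\frac{1}{N} \\sum_{n=1}^{N} \\left| \\phi_i\\left(x^{(n)}\\right) \\right|$$"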
] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Configure a SageMaker Clarify Processing Job Container's input and output parameters " ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "tags": [] }, "outputs": [], "source": [ "explainability_analysis_config_path = \"s3://{}/{}/explainability_analysis_config.json\".format(\n", "    bucket, prefix\n", ")\n", "explainability_analysis_result_path = \"s3://{}/{}/explainability_analysis_output\".format(\n", "    bucket, prefix\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Configure analysis config\n", "For our example use case we will be using the following analysis config:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "{\n", " \"dataset_type\": \"text/csv\",\n", " \"headers\": [\"Target\", \"Age\", \"Workclass\", \"fnlwgt\", \"Education\", \"Education-Num\", \"Marital Status\", \"Occupation\", \"Relationship\", \"Ethnic group\", \"Sex\", \"Capital Gain\", \"Capital Loss\", \"Hours per week\", \"Country\"],\n", " \"label\": \"Target\",\n", " \"methods\": {\n", " \"shap\": {\n", " \"baseline\": [\n", " [38, 2, 189794, 10, 10, 3, 6, 1, 4, 1, 1092, 88, 41, 36]\n", " ],\n", " \"num_samples\": 15,\n", " \"agg_method\": \"mean_abs\",\n", " \"use_logit\": false,\n", " \"save_local_shap_values\": true\n", " },\n", " \"report\": {\n", " \"name\": \"report\",\n", " \"title\": \"Analysis Report\"\n", " }\n", " },\n", " \"predictor\": {\n", " \"model_name\": \"DEMO-clarify-xgboost-model\",\n", " \"instance_type\": \"ml.m5.xlarge\",\n", " \"initial_instance_count\": 1,\n", " \"accept_type\": \"text/csv\",\n", " \"content_type\": \"text/csv\"\n", " }\n", "}" ] } ], "source": [ "!echo\n", "!cat analysis_config/explainability_analysis_config.json" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [
"`explainability_analysis_config.json` here contains configuration values for computing feature attributions using a SageMaker Clarify job:\n", "\n", "* `dataset_type` specifies the format of your dataset; for this example, as we are using a CSV dataset, this will be `text/csv`.\n", "* `headers` is the list of column names in the dataset.\n", "* `label` specifies the ground truth label, in this example the \"Target\" column. The SageMaker Clarify job will drop this column and use the remaining feature columns for explainability analysis.\n", "* `methods` is the list of methods and their parameters for the analyses and reports.\n", "    * `shap`: This section has the parameters for SHAP analysis. The Kernel SHAP algorithm requires a baseline (also known as a background dataset). If not provided, a baseline is calculated automatically by SageMaker Clarify using K-means or K-prototypes on the input dataset. The baseline dataset type must be the same as `dataset_type`, and baseline samples must include only features. `baseline` should be either an S3 URI to the baseline dataset file or an in-place list of samples. In this case we chose the latter, and put the mean of the training dataset in the list. The remaining parameters set the number of samples used by Kernel SHAP (`num_samples`), how local values are aggregated into global ones (`agg_method`), whether the logit function is applied to model predictions (`use_logit`), and whether local SHAP values are saved to the output location (`save_local_shap_values`). For more details on baseline selection, please [refer to this documentation](https://docs.aws.amazon.com/en_us/sagemaker/latest/dg/clarify-feature-attribute-shap-baselines.html).\n", "* `predictor` includes the model configuration; this section is required if the analysis requires predictions from the model.\n", "    * `model_name`: the name of the model to analyze, here the XGBoost model created earlier, `DEMO-clarify-xgboost-model`.\n", "    * `instance_type` and `initial_instance_count` specify the instance type and instance count used to host your model during SageMaker Clarify's processing.
The example dataset is small, so a single standard instance is good enough to run this example.\n", "    * `accept_type` denotes the endpoint response payload format, and `content_type` denotes the payload format of the request to the endpoint. For the example model we created above, both of these will be `text/csv`." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole\n" ] }, { "data": { "text/plain": [ "'s3://sagemaker-ap-south-1-000000000000/sagemaker/DEMO-sagemaker-clarify-boto3/explainability_analysis_config.json'" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Upload the analysis config to its S3 path.\n", "S3Uploader.upload(\n", "    \"analysis_config/explainability_analysis_config.json\", \"s3://{}/{}\".format(bucket, prefix)\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Run SageMaker Clarify Processing job" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ".............\n" ] } ], "source": [ "# The job takes about 10 minutes to run\n", "processing_job_name = create_processing_job(\n", "    explainability_analysis_config_path, explainability_analysis_result_path\n", ")\n", "wait_for_job(processing_job_name)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Viewing the Explainability Report\n", "As with the bias report, you can view the explainability report in Studio under the experiments tab.\n", "\n", "The Model Insights tab contains direct links to the report and model insights.\n", "\n", "If you're not a Studio user yet, you can access the complete analysis report at the following S3 location."
] }, { "cell_type": "code", "execution_count": 25, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'s3://sagemaker-ap-south-1-000000000000/sagemaker/DEMO-sagemaker-clarify-boto3/explainability_analysis_output'" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "explainability_analysis_result_path" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Analysis of local explanations\n", "It is possible to visualize the local explanations for single examples in your dataset. You can use the obtained results from running Kernel SHAP algorithm for global explanations.\n", "\n", "You can simply load the local explanations stored in your output path, and visualize the explanation (i.e., the impact that the single features have on the prediction of your model) for any single example." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:botocore.credentials:Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Example number: 111 \n", "with model prediction: False\n", "\n", "Feature values -- Label Target 0\n", "Age 21\n", "Workclass 2\n", "fnlwgt 199915\n", "Education 15\n", "Education-Num 10\n", "Marital Status 4\n", "Occupation 7\n", "Relationship 3\n", "Ethnic group 4\n", "Sex 0\n", "Capital Gain 0\n", "Capital Loss 0\n", "Hours per week 40\n", "Country 38\n", "Name: 120, dtype: int64\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAY8AAAFMCAYAAAAz5jD9AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8/fFQqAAAACXBIWXMAAAsTAAALEwEAmpwYAAA/J0lEQVR4nO3deZwcVbn/8c83CYsCYY0Q1iAGNXgVMbK4I4sBUXBBQTZXREVR3EAuykW8ouL1qhfBsGhABUF+StQgQmQXhEQQCIiERQkECCgQZTPw/P44p0lN0z0zNXV6MgPf9+s1r+6q6nqquqe7nqpTZ1FEYGZmVseYZb0DZmY2+jh5mJlZbU4eZmZWm5OHmZnV5uRhZma1OXmYmVltTh6jnKQ3SFowzNucJCkkjRvO7eZt7yXptz2K/RFJ90j6p6Q1exB/mX1uy8Joeb+SjpD0o2W9H6ONk0cPSLpd0vbLej9Gu04Hn4j4cUTs2INtLQf8D7BjRKwcEfcXiOnvgSFpW0kXSHpQ0u0dln9Z0nWSlkg6om3ZREkzJd2VfwuThmm3B+TkYZasDawIzKu7ohL/lowuV1n/Ak4GPttltfnA54Bfd1j2JPAb4B1FdrAgf+GHkaQVJP1vPou4Kz9fobJ8V0nXSHpI0i2SpuX575N0o6TFkm6V9OEa23yRpPMk/V3STZLeledvkudtkafXlXSfpDfk6QslfVXSlfmM6WxJa3TZRtf9axWrSfq0pHslLZT0vsryN0u6Or/nO9rOvC7Ojw/koqRtJL1X0qWV9V8l6aq8j1dJelVl2YX5rO6yvG+/lbRWh/3fFLipsq3fDTL2VyRdBjwMPL8t5qnAhsAv875/rrJ4L0l/y5/3YZV1xkg6JP/v75d0RrfPPL9+l/x9eUDS7yW9NM9/d/4/jM/TO0m6W9KEPP3t/Fk/JGmupNdWYh4h6UxJP8qf2XWSNpV0aP7/3SFpx8rr63xPVpV0Uv4O3CnpKElju7z2iPz+T8n7MU/S1MrykPSCyvQPJR2Vn7e+c5+rfOd2k7SzpL/k7/0X2ja5oqSf5m39UdLLKrHXlXSWpEWSbpP0ibb9/Fn+vB4C3tv+XiLiyog4Fbi103uNiBkRcQ6wuMOyeyLie8BVndZdpiLCf4X/gNuB7TvMPxK4AngeMAH4PfDlvGxL4EFgB1JSXw94UV72ZmATQMDrSQerLfKyNwALuuzHSsAdwPuAccAWwH3AZnn5h4AbgecC5wLHVNa9ELgTeEmOcxbwo7xsEhDAuEHu35L83pcDds7LV68s/4/8nl8K3APs1mk7ed57gUvz8zWAfwD75Pe3Z55es/IebgE2BZ6Tp4/u8lm1v6fBxP4bsFlevtxA34PKNk7I+/My4DHgxXn5J0nfj/WBFYDvA6d12d8tgHuBrYCxwH55eyvk5T8GfgisCdwF7FJZd+88fxzwaeBuYMW87AjgUeBNefkpwG3AYfn/9yHgtiF+T36R39NKpN/AlcCHu7y/1n7snN/fV4ErKssDeEFl+ofAUW3fuS9W9nkR8BNglfw/exR4fmVb/wbemV//mfyelyN9L+fmWMuTThJuBd7Utu5u+bXP6ee4sD1wez/LfwQc0WXZuPyeJy3r49tT+7Ssd+CZ+Ef35HELsHNl+k2tL1P+UX1rkPF/ARyUn7+B7snj3cAlbfO+D3ypMj0TuA64lnzgyfMvpHKgBaYAj+cfcp+DwiD27xH6JoB7ga27rPu/rc+h03bomzz2Aa5sW/9y4L2V9/CflWUfBX7TZbt9tjXI2EfW+R5UtrF+Zd6VwB75+Y3AdpVlE0kHpqd9zsBx5BOPyrybgNfn56uRktt1wPcH2M9/AC/Lz48AzqssewvwT2Bsnl4lv4fV6nxPSMWCj1E5uJIS8gVd9ukI4Py2uI9UpgdKHo902OetKq+fy9KTlCPom5jGAAuB15KS89/a9u1Q4AeVdS8e5O/2GZU8RnQtiGegdYG/Vqb/mucBbADM6rSSpJ2
AL5HOoMeQrhSuG8T2NgK2kvRAZd444NTK9AmkBLJ/RDzWtv4dbfu6HNCp2Geg/bs/IpZUph8GVs7rbgUcTTpzXZ50xn3mIN4bPP3zbO3nepXpuzttt1DsOxiabvu0EfBzSU9Wlj9BOvDe2RZjI2A/SR+vzFs+7zcR8YCkM4GDaSsvl/Rp4IP5tQGMp+//9Z7K80eA+yLiico0eZ8fyM8H8z3ZKM9fKKk1bwz9f4btn9OKksa1fZe6ub/DPre/r+p34an9iIgnlWowtj6fddt+Q2OBSzqt+2ziex7D6y7Sj6hlwzwP0hdwk/YVlO6JnAUcA6wdEauRkozaX9vBHcBFEbFa5W/liPhIjr0y6Uz/JOCIDmXVG7Tt679JxV6l9g9SUcJMYIOIWBU4vrJuDLBu++fZ2s/2A+1QDCb2QPs30PJ2dwA7tf2/VoyITu/nDuArba99bkScBiBpc+D9wGnAd1or5fsbnwfeRSo6XI1UXDrY/1cnA35P8v4+BqxV2d/xEbHZELf5MOkkpWWdIcZpeeo9KFV+WJ/0HbiDVExX/ZxXiYidK+vW/T8/Izh59M5yklas/I0j/ZD/U9KEfOP2i6RLVUgH8PdJ2i7fOF1P0otYeja+CFiSz/IHW1X1V8CmkvaRtFz+e6WkF+fl3wbmRsQHSTU9jm9bf29JUyQ9l3TP4meVs7mWJvsHqUjh7xHxqKQtgfdUli0i1TZ5fsc1U5LaVNJ7JI2T9G5S8cavamy/mxKx76H7vndyPPAVSRsB5O/Jrl1eewJwgKStlKykVPlgFUkrkr5XXyDd71pP0kfzequQ7gcsAsZJ+iLpyqOJAb8nEbEQ+C3wTUnj83d8E0mvH+I2rwHeI2msUsWSocZpeYWkt+ff6SdJie4KUrHiQ5I+L+k5eXsvkfTKwQbO73VF0pWX8vFg+cry5fLyMaT/yYrVigR5WatizQp5eplz8uidWaRL49bfEcBRwBzS/YXrgD/meUTElaQf+rdIZ4IXARtFxGLgE8AZpLLp95DO1AeU190R2IN0FnU38DXSF3BXYBpwQH75wcAWkvaqhDiVVJZ8N6ka6ydo02T/so8CR0paTEqmZ1RiPwx8BbhMqUbR1m3bvh/YhXTT935SdcddIqL9rLe2QrG/SjpZeEDSZwbx+m+TPrvf5s/jClKZe6f9m0O6Efx/pM99Pktr+nyVdB/suFwUuTdwlKTJpIoR5wB/IRUxPUrzYpcBvyfZvqSTjRvyPv+MdF9nKA4i3Y95ANiLdJ+tibNJ9whblSTeHhH/zknwLcDmpJvo9wEnAqvWiP060jFgFunK7BFSIm05Ic/bk1Qx4ZG8Dy2PkO47AfyZpcVwy5TyzRizPiRdSKo1c+Ky3hcbufw9efbylYeZmdXm5GFmZrW52MrMzGorcuUhaZpS1xfzJR3SYbkkfScvv1a5S4z+1pW0u1KXBE+q0i1BXnZofv1Nkt5U4j2YmdngNW4kmKuUHUvqVmMBcJWkmRFxQ+VlOwGT899WpNaxWw2w7vXA20ktoqvbm0KqPbQZqRHP+ZI27VCFtI+11lorJk2a1PTtmpk9q8ydO/e+iJjQPr9EC/MtgfkRcSuApNOBXUnV8Vp2BU6JVEZ2haTVJE0kdV/Qcd2IuDHPa9/ersDpuQribZLm5324vL+dnDRpEnPmzGn0Rs3Mnm0ktfe0AJQptlqPvvXEF9C3C4f+XjOYdYeyPQAk7S9pjqQ5ixYtGiCsmZkNVonk0albg/a78N1eM5h1h7K9NDNiekRMjYipEyY87arLzMyGqESx1QL69m3T6hNmMK9ZfhDrDmV7ZmbWQyWuPK4CJkvaOPfXsgdP755iJrBvrnW1NfBg7utmMOu2mwnsoTSw0sakm/BXFngfZmY2SI2vPCJiiaQDSX3mjAVOjoh5kg7Iy48n9emyM6n/nYdJfTh1XRdA0tuA75IGTfq1pGsi4k059hmkG/JLgI8NVNPKzMzKetY0Epw6dWq4tpWZWT2
S5kbE1Pb57p7EzMxqc/IwM7PaPAxtj0065Ne1Xn/70W/u0Z6MPP5srAl/f5YtX3mYmVltTh5mZlabk4eZmdXm5GFmZrU5eZiZWW1OHmZmVpuTh5mZ1eZ2HmZmy8Bob6fiKw8zM6vNycPMzGpz8jAzs9qcPMzMrDYnDzMzq83Jw8zMaiuSPCRNk3STpPmSDumwXJK+k5dfK2mLgdaVtIak8yTdnB9Xz/MnSXpE0jX57/gS78HMzAavcfKQNBY4FtgJmALsKWlK28t2Aibnv/2B4wax7iHA7IiYDMzO0y23RMTm+e+Apu/BzMzqKXHlsSUwPyJujYjHgdOBXdtesytwSiRXAKtJmjjAursCM/LzGcBuBfbVzMwKKJE81gPuqEwvyPMG85r+1l07IhYC5MfnVV63saSrJV0k6bXddkzS/pLmSJqzaNGiOu/JzMz6USJ5qMO8GORrBrNuu4XAhhHxcuBg4CeSxnd6YURMj4ipETF1woQJA4Q1M7PBKpE8FgAbVKbXB+4a5Gv6W/eeXLRFfrwXICIei4j78/O5wC3ApgXeh5mZDVKJ5HEVMFnSxpKWB/YAZra9Ziawb651tTXwYC6K6m/dmcB++fl+wNkAkibkG+1Iej7pJvytBd6HmZkNUuNedSNiiaQDgXOBscDJETFP0gF5+fHALGBnYD7wMPC+/tbNoY8GzpD0AeBvwO55/uuAIyUtAZ4ADoiIvzd9H2ZmNnhFumSPiFmkBFGdd3zleQAfG+y6ef79wHYd5p8FnNVwl83MrAG3MDczs9qcPMzMrDYnDzMzq83Jw8zMavMY5mZmHYz2McZ7zcnDuvKPx8y6cbGVmZnV5uRhZma1OXmYmVltTh5mZlabk4eZmdXm5GFmZrU5eZiZWW1OHmZmVpuTh5mZ1ebkYWZmtRVJHpKmSbpJ0nxJh3RYLknfycuvlbTFQOtKWkPSeZJuzo+rV5Ydml9/k6Q3lXgPZmY2eI2TRx5P/FhgJ2AKsKekKW0v24k01vhkYH/guEGsewgwOyImA7PzNHn5HsBmwDTge60xzc3MbHiUuPLYEpgfEbdGxOPA6cCuba/ZFTglkiuA1SRNHGDdXYEZ+fkMYLfK/NMj4rGIuI00LvqWBd6HmZkNUoleddcD7qhMLwC2GsRr1htg3bUjYiFARCyU9LxKrCs6xHoaSfuTrnTYcMMNO+58r3uOHc09zfZ633sdv9f/29Eev9dG+29rtMfv9edf4spDHebFIF8zmHWHsr00M2J6REyNiKkTJkwYIKyZmQ1WieSxANigMr0+cNcgX9Pfuvfkoi3y4701tmdmZj1UInlcBUyWtLGk5Uk3s2e2vWYmsG+udbU18GAukupv3ZnAfvn5fsDZlfl7SFpB0sakm/BXFngfZmY2SI3veUTEEkkHAucCY4GTI2KepAPy8uOBWcDOpJvbDwPv62/dHPpo4AxJHwD+Buye15kn6QzgBmAJ8LGIeKLp+zAzs8ErMgxtRMwiJYjqvOMrzwP42GDXzfPvB7brss5XgK802GUzM2vALczNzKw2Jw8zM6vNycPMzGpz8jAzs9qK3DA3M2s30lq8W1m+8jAzs9qcPMzMrDYnDzMzq83Jw8zManPyMDOz2pw8zMysNicPMzOrzcnDzMxqc/IwM7Pa3MLcbIRyC20byXzlYWZmtTl5mJlZbY2Sh6Q1JJ0n6eb8uHqX102TdJOk+ZIOGcz6kg7Nr79J0psq8y/M867Jf89r8h7MzKy+plcehwCzI2IyMDtP9yFpLHAssBMwBdhT0pT+1s/L9wA2A6YB38txWvaKiM3z370N34OZmdXUNHnsCszIz2cAu3V4zZbA/Ii4NSIeB07P6/W3/q7A6RHxWETcBszPcczMbARomjzWjoiFAPmxUxHSesAdlekFeV5/6/e3DsAPcpHV4ZLUbeck7S9pjqQ5ixYtqvO+zMysHwNW1ZV0PrBOh0WHDXIbnQ7u0WCdvSLiTkmrAGcB+wCndAoSEdO
B6QBTp04daJtmZjZIAyaPiNi+2zJJ90iaGBELJU0EOt1/WABsUJleH7grP++2ftd1IuLO/LhY0k9IxVkdk4eZmfVG02KrmcB++fl+wNkdXnMVMFnSxpKWJ90InznA+jOBPSStIGljYDJwpaRxktYCkLQcsAtwfcP3YGZmNTVtYX40cIakDwB/A3YHkLQucGJE7BwRSyQdCJwLjAVOjoh5/a0fEfMknQHcACwBPhYRT0haCTg3J46xwPnACQ3fg5mZ1dQoeUTE/cB2HebfBexcmZ4FzBrs+nnZV4CvtM37F/CKJvtsZmbNuYW5mZnV5uRhZma1OXmYmVltTh5mZlabx/MwM3sG6vV4ML7yMDOz2nzlYTZEHunPns185WFmZrX5ysOesXxlYNY7vvIwM7PanDzMzKw2Jw8zM6vNycPMzGpz8jAzs9qcPMzMrDYnDzMzq61R8pC0hqTzJN2cH1fv8rppkm6SNF/SIQOtL2lNSRdI+qek/2uL9QpJ1+VY35GkJu/BzMzqa3rlcQgwOyImA7PzdB+SxgLHAjsBU4A9JU0ZYP1HgcOBz3TY5nHA/qRxzScD0xq+BzMzq6lp8tgVmJGfzwB26/CaLYH5EXFrRDwOnJ7X67p+RPwrIi4lJZGnSJoIjI+IyyMigFO6bNPMzHqoafJYOyIWAuTH53V4zXrAHZXpBXneYNdvj7WgS6ynkbS/pDmS5ixatGiA0GZmNlgD9m0l6XxgnQ6LDhvkNjrdk4hBrtsoVkRMB6YDTJ06dajbNDOzNgMmj4jYvtsySfdImhgRC3OR0r0dXrYA2KAyvT5wV34+mPXbY63fJZaZmQ2TpsVWM4H98vP9gLM7vOYqYLKkjSUtD+yR1xvs+k/JRVuLJW2da1ntO9A6ZmZWXtPkcTSwg6SbgR3yNJLWlTQLICKWAAcC5wI3AmdExLz+1s8xbgf+B3ivpAWVGlofAU4E5gO3AOc0fA9mZlZTo/E8IuJ+YLsO8+8Cdq5MzwJmDXb9vGxSl/lzgJcMbY/NzKwEtzA3M7PanDzMzKw2Jw8zM6vNycPMzGpz8jAzs9qcPMzMrDYnDzMzq83Jw8zManPyMDOz2pw8zMysNicPMzOrzcnDzMxqc/IwM7PanDzMzKw2Jw8zM6vNycPMzGprlDwkrSHpPEk358fVu7xumqSbJM2XdMhA60taU9IFkv4p6f/aYl2YY12T/57X5D2YmVl9Ta88DgFmR8RkYHae7kPSWOBYYCdgCrBnZUjZbus/ChwOfKbLdveKiM3z370N34OZmdXUNHnsCszIz2cAu3V4zZbA/Ii4NSIeB07P63VdPyL+FRGXkpKImZmNME2Tx9oRsRAgP3YqQloPuKMyvSDPG+z6nfwgF1kdLkndXiRpf0lzJM1ZtGjRIEObmdlAxg30AknnA+t0WHTYILfR6eAeg1y3k70i4k5JqwBnAfsAp3R6YURMB6YDTJ06tck2zcysYsDkERHbd1sm6R5JEyNioaSJQKf7DwuADSrT6wN35eeDWb99f+7Mj4sl/YRULNYxeZiZWW80LbaaCeyXn+8HnN3hNVcBkyVtLGl5YI+83mDXf4qkcZLWys+XA3YBrm/0DszMrLYBrzwGcDRwhqQPAH8DdgeQtC5wYkTsHBFLJB0InAuMBU6OiHn9rZ9j3A6MB5aXtBuwI/BX4NycOMYC5wMnNHwPZmZWU6PkERH3A9t1mH8XsHNlehYwa7Dr52WTumz2FUPZVzMzK8ctzM3MrDYnDzMzq83Jw8zManPyMDOz2pw8zMysNicPMzOrzcnDzMxqc/IwM7PanDzMzKw2Jw8zM6vNycPMzGpz8jAzs9qcPMzMrDYnDzMzq83Jw8zManPyMDOz2pw8zMystkbJQ9Iaks6TdHN+XL3L66ZJuknSfEmHDLS+pB0kzZV0XX58Y2WdV+T58yV9R5KavAczM6uv6ZXHIcDsiJgMzM7TfUgaCxwL7ARMAfaUNGWA9e8D3hIR/wHsB5x
aCXkcsD8wOf9Na/gezMyspqbJY1dgRn4+A9itw2u2BOZHxK0R8Thwel6v6/oRcXUeBx1gHrCipBUkTQTGR8TlERHAKV22aWZmPdQ0eawdEQsB8uPzOrxmPeCOyvSCPG+w678DuDoiHsvrLegS62kk7S9pjqQ5ixYtGuRbMjOzgYwb6AWSzgfW6bDosEFuo9M9iRjUitJmwNeAHYcSKyKmA9MBpk6dOqhtmpnZwAZMHhGxfbdlku6RNDEiFuYipXs7vGwBsEFlen2gVSTVdX1J6wM/B/aNiFsqsdbvEsvMzIZJ02KrmaQb2uTHszu85ipgsqSNJS0P7JHX67q+pNWAXwOHRsRlrUC5aGuxpK1zLat9u2zTzMx6qGnyOBrYQdLNwA55GknrSpoFEBFLgAOBc4EbgTMiYl5/6+fXvwA4XNI1+a91P+QjwInAfOAW4JyG78HMzGoasNiqPxFxP7Bdh/l3ATtXpmcBs2qsfxRwVJdtzgFeMvS9NjOzptzC3MzManPyMDOz2pw8zMysNicPMzOrzcnDzMxqc/IwM7PanDzMzKw2Jw8zM6vNycPMzGpz8jAzs9qcPMzMrDYnDzMzq83Jw8zManPyMDOz2pw8zMysNicPMzOrrVHykLSGpPMk3ZwfV+/yummSbpI0X9IhA60vaQdJcyVdlx/fWFnnwhyrfYRBMzMbJk2vPA4BZkfEZGB2nu5D0ljgWGAnYAqwp6QpA6x/H/CWiPgP0tjmp7aF3SsiNs9/9zZ8D2ZmVlPT5LErMCM/nwHs1uE1WwLzI+LWiHgcOD2v13X9iLg6D2ULMA9YUdIKDffVzMwKaZo81o6IhQD5sVMR0nrAHZXpBXneYNd/B3B1RDxWmfeDXGR1uCR12zlJ+0uaI2nOokWLBv+uzMysX+MGeoGk84F1Oiw6bJDb6HRwj0GtKG0GfA3YsTJ7r4i4U9IqwFnAPsApndaPiOnAdICpU6cOaptmZjawAZNHRGzfbZmkeyRNjIiFkiYCne4/LAA2qEyvD7SKpLquL2l94OfAvhFxS2V/7syPiyX9hFQs1jF5mJlZbzQttppJuqFNfjy7w2uuAiZL2ljS8sAeeb2u60taDfg1cGhEXNYKJGmcpLXy8+WAXYDrG74HMzOrqWnyOBrYQdLNwA55GknrSpoFEBFLgAOBc4EbgTMiYl5/6+fXvwA4vK1K7grAuZKuBa4B7gROaPgezMyspgGLrfoTEfcD23WYfxewc2V6FjCrxvpHAUd12ewrhrq/ZmZWhluYm5lZbU4eZmZWm5OHmZnV5uRhZma1OXmYmVltTh5mZlabk4eZmdXm5GFmZrU5eZiZWW1OHmZmVpuTh5mZ1ebkYWZmtTl5mJlZbU4eZmZWm5OHmZnV5uRhZma1OXmYmVltjZKHpDUknSfp5vy4epfXTZN0k6T5kg4ZaH1JW1aGn/2TpLdV1nmFpOtyrO9IUpP3YGZm9TW98jgEmB0Rk4HZeboPSWOBY4GdgCnAnpKmDLD+9cDUiNgcmAZ8X1JryNzjgP2ByflvWsP3YGZmNTVNHrsCM/LzGcBuHV6zJTA/Im6NiMeB0/N6XdePiIcjYkmevyIQAJImAuMj4vKICOCULts0M7Meapo81o6IhQD58XkdXrMecEdlekGe1+/6kraSNA+4DjggJ5P18vqdYj2NpP0lzZE0Z9GiRbXfnJmZdTZuoBdIOh9Yp8Oiwwa5jU73JGKglSLiD8Bmkl4MzJB0Tt1YETEdmA4wderUAbdpZmaDM2DyiIjtuy2TdI+kiRGxMBcp3dvhZQuADSrT6wN35ecDrh8RN0r6F/CSHGv9LrGG5Paj39xkdTOzZ6WmxVYzgf3y8/2Aszu85ipgsqSNJS0P7JHX67p+fu24/Hwj4IXA7bloa7GkrXMtq327bNPMzHqoafI4GthB0s3ADnkaSetKmgWQ71UcCJwL3AicERHz+lsfeA3wJ0nXAD8HPhoR9+VlHwFOBOYDtwDnNHwPZmZ
Wk1KlpWe+qVOnxpw5c5b1bpiZjSqS5kbE1Pb5bmFuZma1OXmYmVltTh5mZlabk4eZmdXm5GFmZrU5eZiZWW3Pmqq6khYBf62xylrAfQO+auh6GX8077vjO77jj6z4G0XEhPaZz5rkUZekOZ3qNo+G+KN53x3f8R1/dMR3sZWZmdXm5GFmZrU5eXQ3fRTHH8377viO7/ijIL7veZiZWW2+8jAzs9qcPMzMrDYnDzMzq83J4xlC0gqDmfdsJOnrksZLWk7SbEn3Sdp7We/XSCFpRUkHS/p/ks6S9ClJKy7r/apD0qskvUfSvq2/grE3af2WJL1B0ickrVYw/pFt02Ml/bhg/DVKxapy8sgkrS3pJEnn5Okpkj5QMP6rJa2Un+8t6X/yELulXD7IeUOS9/88SX+RdKuk2yTdWip+ZTvjJa3R+isUdseIeAjYBVgAbAp8tlBsACRtKukESb+V9LvWX8H4a0r6rqQ/Spor6duS1iwU/hRgM+C7wP8BLwZOLRQbgPbfUj5AfqlQ7FOBY0gjkL4y/5VsZHcW8ISkFwAnARsDPykYf0NJh8JTJ3w/B24uGP8Pks6UtHMevruIcaUCPQP8EPgBcFie/gvwU9KXpYTjgJdJehnwuRz3FOD1TYJKWgdYD3iOpJcDrS/HeOC5TWK3OQn4FDAXeKJgXAAkfRg4EngEaFUBDOD5BcIvlx93Bk6LiL8X/A21nAkcD5xADz4f4HTgYuAdeXov0vdz+wKxXxgRL6tMXyDpTwXiVm0n6R3AB4A1Sb+1iwrFngpMid5VHX0yIpZIehvwvxHxXUlXF4z/PuDHOYFsC5wTEd8qGH9T0vfk/cB3Jf0U+GFE/KVR1IjwX/rOXZUfr67Mu6Zg/D/mxy8CH6jOaxh3P+ACYHF+bP3NBN5ecP//0OPP/2ZgrR7FPhr4M3A1KZFMKP1+gLk9/nyeFh+YUyj2D4GtK9NbAd/rwXt4N6lPpb8Bry4Y90xgYg8/+z8AewLXAxvnedcXiLtF5W8r4Brg2Na8Hr2XbYE7gQdIyXubocZyO49M0oWks7rzImILSVsDX4uIRlcGlfgXAb8hnWW8DlhESk7/USj+OyLirBKxusQ/GhgL/D/gsdb8iPhjofi/ISW7h0vE6xB/deChiHhC0nOB8RFxd8H4RwD3koocqp/P3wvFPwaYA5yRZ70T2CwiGhf9SLoReCHpoA6wIXAj8CQQEfHSAtuYDMwAriMVi90AHFzi/y3pAmBz4Er6fvZvbRo7x58CHABcHhGnSdoYeHdEHN0w7gX9LI6IeGOT+JXtrAnsDewD3EMqRZhJ+szOjIiNhxTXySORtAWpzPclpDOMCcA7I+LaQvHXAd5DusK5RNKGwBsi4pRC8Q/uMPtB0hnrNQXid/qil/yCv5xUlPEH+h4APlEgdsebp6U++7yN2zpvIkoUuyFpMbASqUhMpPuV/6psZ3yD2P3ee4uIOr1Rd9vGn4EDI+L8XO5+MPD+iNisQOyOJ3gRUapYrLqt1YENSh0XhoOkv5DuYf0gIha0Lft8RHxtSHGdPJaSNI50Bibgpoj4d8HYKwGP5jPfTYEXkco2i2xD0k9IZb+/zLPeDFyVt3NmRHy9xHZ6RdKVwKWkM9MnW/MjYkaB2N+tTK4IbEcqMnxn09jPBPlE5mki4m+d5g9xG+MjVVqozpscESVvDPdELpV4K+ke8TWkUoOLIqLTCdtQ4q8N/DewbkTslK90tomIxvdbJY0FvlFqX/vEdvJIJL29w+wHgesi4t4C8ecCrwVWB64gFUE8HBF7NY2d458LvCMi/pmnVwZ+BryNdPUxpWH8VYEvkYrcIJWXHhkRDzaJW4n/+4h4VYlYg9jWqsCppYo1cszlgI+w9PO5EPh+05MDSS+KiD/nK+OnKVFsKOk6UuUEkZLrxqSTp8ZXBZVttA6Q60XEtBIHSEmXRsRr8lVZ9UAmGl6NtW3n6oh4uaQPkq46viTp2hLFeTn+OeTKOhHxsnw
Se3XBIu3ZEbFdiVhVrm211AeAbUg3mwHeQDrIbyrpyIhoWnVREfFwrrL43Yj4uqRrGsas2hB4vDL9b9IgLo9IeqzLOnWcTCrOe1ee3of0he+UdIfiAkn7k66cit8zaPMwMLlwzONIN+O/l6f3yfM+2DDuwcD+wDc7LAugcbFh+0EqJ6oPN43b5ocUrs0YEa/Jj6s03bkBjJM0kfTdP2ygFw/BWhFxRqu6bqSaXSVr7F0jaSapYkGrqJOI+H9Ngjp5LPUk8OKIuAeeOlM6jlQL4mKa13uXpG1IVSxbdd7HNoxZ9RPgCkln5+m3AKfl4rIbCsTfJCLeUZn+r8LJ7z358dDKvCJVdSX9kqVnpmOAKaQfUkmvjL7VXX9XorprROyfH7dtGqvGNv8o6ZWFw/b6AImk55GunMjbKFXsdiRwLnBZRFwl6fmUbYfxr3xTOwByZZ0iV/TZGsD99D3RCFLllyFz8lhqUitxZPcCm0ZqE1DivsRBpAPjzyNiXv4C9lfbopaI+LKkWaSGUgIOiIg5eXGJorFHJL0mIi6F1GiQ1CajiKHW+BikYyrPlwB/bb9xWMATkjaJiFsA8v+39MHxVcAkKr/bEjf92ypbjCFVFV3UNG6bnh0gJb2VdGW2Lul3uxGptliRYreIOJPKyUZE3MrS9jYlHEyq/bSJpMvIlXUKxj8xIi6rzsi/30Z8zyOT9D1S0U/rS/IOUmvkzwK/Gs4zv6FQ6uLgEuD3EfGvgV4/hPibk6parkpKTn8H3hsRRRqT9bJGlKSvRcTnB5rXcBvbkYplbiV9PhsB74uIIicISq2oNyHdsG0lpShUG61a3XcJcDtwVkQ82jR2ZRs9q82Yr/DeCJyf701sC+zZumorEH990r6/mpT8LgUOKnkC0uPKOn+MiC0Gmlc7rpNHkqsPvp105g7pMm9iRHysUPwJpJblm9H30rpUVdf3k/Z9G1KDwUuAiyPi7H5XrL+d8QDtNWcKxO1ZjaguP55iNzwrMVdg6QHgzxFR4l5TK/aN9LYVNZJWISWkfxaM+Urgjoi4Ox8gP0w6MbsB+GKJe1rKY3LnJPLyiHhS0pURsWXT2Dn+eaRi4VbR9d7AXhGxQ6H4zyVdfWwUER9SahPzwoj4VcO42wCvAj4JVFusjwfe1lbMWpuLrbKICEm3kO5xvAu4jdSnTSk/Jt0g3IXU4Gg/ChYNRMTJwMlK7UneBXyGdKO10c1ESXtHxI/aijZQ7t4jIv6nSfyWiPh4W/xVaXifSdJHgI8Cz5dUPcNdBbis81q1t/HGiPhdh9p6m0hqfFOy4npgHWBhoXhPkfQS0me9Rp6+D9gvIq4vEP77LO1C5VWkG84fJzVQm06Z4pkHcu3Ci0ndfNxLuoIqZUJE/KAy/UNJnywY/wekbn+2ydMLSCUgjZIHsDywMuk4Xz0OPESBz/1Znzxym4s9SN0P3E86wKsHxVRrRsRJkg6K1HjpIqVW50VIOpF0I/ge0lXHO4ESrb9Xyo+dklAvL1tL1Ij6CXAO8FXgkMr8xQVrcb0e+B2pgkK7xjclKzf7VwFuUGoPU7oV9XRSa+8L8jbfkOeVqDo9tvJZvxuYHqknhLMKVrjYlXT/7VOk+3urkm5yl9Lqhfm0PN06VpSySUS8W9KeALmGZOPO1yrHmR9GgYae7Z71yYPU59ElwFsiYj6ApE/1YDutMsyFkt4M3AWsXzD+mqTaWw+Q7kfcFxGNz74i4vv56fm9uOlWidWpRtQZ3dcYWG6D8iDpx16tjbOypJVL1MaJpd2DHBkRfVqZK3Vj0dQxA7+ksZWq92Yi4sJcS6+EsZLG5e/idqSr4ZYix5/KPb4nSfflSns/qbfhb5G+o78ndTNUyuOSnsPSygSbUDlBKGAFSdN5emWLZkXm0aPOxEbLH6kR3U+BO0g9om4H3NaD7exCOiN6CamW1VzgrT3YzotJZZx/BRYUjPu0Thw7zWsQ//WVv1cD6xeM/RZS1cp/kYojnwTmFf7
cO30+xTpLJF0BjsnPNyW1eF6uUOyfA4fng8sk4D+BXxSKfRipiPBsUseUrfusLyBVfW0S+wPAZyvTd5KKZBYDHyn5/+2w7WMKxtqR1Oh2Eal4+3ZS10Wl4v+J1IB1S+AVrb+mcX3DPMtnWruRzlLfSDqD+XlE/HZZ7tdgSdqF1IL9daRW7JcDl0S6F9Ikbk9vug2HXtbGkfQiUiWIr9N3jJDxpANbkeqi6mEPBUr9Nf0XSyuLXAz8V0T8o2nsHH9rYCLw28hXCbm4eOVo0EJe0lXAtIi4P09fnf+/K+Ztva7/CEMn6W8R0bFblyHGWxPYmlTZ4oqIuK9g7LkR8YpS8VpcbJXlL/WPSTfc1gB2J5WTN0oeuRZR1wwdBapaZjuRfvTfjoi7CsWEHt90U+pQsNvnExGxSdNtAP+OiPsljZE0JiIukDSkzuA6eCHpqnI1+t73WAx8qNA2oEc9FCj1fXRmRJQYF6SjiLiiw7xmY0kkY1qJIzszx340FwP1UrEBYXI17ItJJ3t/LhW34peSPkrhHp995dFjkvbrb3kU6PhvOEjaKHpw001PHw1vDEtri/0x+rZqH+o2ziddVX4VWIvUkOyVUbAvLUnbRESxkRs7xL+aVHPsW6TxYOZJui4K9H+k1HXFPlGon7LhIml+RLygw/wxwPxo2KOxuo9kKeBPEVHknqWkN5Ku+l5L6lHhGlI1+28Xit+THp+dPEY5Pb1TuKcWUbZzuF63UxlD6g/qs6Qfz39HRIluVVpFko+QElOrNs6Pmp55tW1jRVIZfPvn8/5C8V9HSqiXRcTXlFqwf7LElaukM0hFJufRt++jUlfFPaHUsPfvEfGfbfOPInWHckDD+K2r4k5XGY0Pvm3bGksaPndbUlX+RyLiRaXi94KTxzDJDY12j4gH8vTqwOkR8aZlumODJOm3pIoFn6HSTiUattJW6o32/aRqlpcCX43cxUcpGp4W5meSau69h1RNdC/gxog4qNQ2eqXb1fFIvyrOJwUnkg66rZ4OXka6H/TBKNjYsZckzSZViLicVPPz0ijQk3clfk96b3DyGCaSromIzdvmXR0RLy+4jbHA2vStjlekc7jWTbdqy2xJF0XDkRYlLSA16Ppflo5k95Qo0MhOw9DCvHKz9tqIeGlOiucWvDLblJS4J1GyuuUzQL4Ka1VMuKH0yUevSfoWqQbUY6SaaReTRi0s0necetR7g2+YD58nJG3YOpgrjd5WLHNL+jhpvI17WDqYUgClDpC9aqdyPmk/X5b/qho1suvSwlykCgBFWphXtD6fB3KL7btJB/pSzgSOJ51pl+5wsTWeR9WDpDP4o9puSo84kToqvHVZ78dQRcSnAHIr+feRWpyvA6xQKH7x3hvAVx7DRtI0UqvdVqvy1wEfjojfFIo/H9iqVz/0XBX4EmADUidx40nVOWf2Ynsl5B/J6vS2hXlrWx8kdWfzUtKPf2Xg8FjayLJp/J5Ut8yxv05KSD/Js/YgJdkHgddERKfW81aIpANJN8tfQWqf1ap59bsebW854NqIeHGjOE4ew0fSWiyty3154brcFwA7RIFW5cuapF9FxC4F4jyXVE3333n6hcDOpC7ZS/U5NSwkHUGqJVa0umWOfVlEvLrTvFI1ukajfmpbAeUGKpP0WVLCmNuL329b7w1jSQ2Jz4iIQ7qvNYi4Th7DQ2k0wi9WpseQhkItNQztSaQ2B7+m78GlSMeFkmaQuqF+IE+vDnyzVG2itm0VuRck6WJStdabJb0AuJLUlmcKcFXTH0/bttYEjmBpt92XAF8udSXYq+qWOfafgP0j4g95ekvghEhDoha9L1dSrw/uw1nbqpckVe9LFhvPxvc8hs+Gkg6NiK8qdd19JgU6LpR0akTsQ+rm+lukRn3LN43bwUtbiQMgIv4hqVcHlasLxVk9Ilojvu0HnBYRH5e0PKl7mGLJAziddPbYapeyF6l2WpHGd9HbwbI+SOqReeU8vRj4QK7N9NUebre
pufRzcKfhKJQ9/syHTURcpDQyamt0yCKjIPrKY5hIEums9zpSXe5zIuJb/a81qLg3kFqX/5I07nofBS+t/0Tqb+cfeXoN4KKRXKTRVjPsMuAbEfGLPP2nKNi1Sqd7EsrjTBSKvxypf6JWlxsXAt+PsoMGrUo6JjxQKuYzRb7SnkzfNjwXL7s9GjxJ7wK+QfrOiHR/5bMR8bMmcX3l0WNKI6i1fJs0vsFlpK6St4gGfftkxwO/ATYm1Y55atMUGgM8+ybwe0mtL9zuwFcKxUaph94jSCPwjWNpI8cm+3+tpGNIHea9gNzVjKTVGu1sZxdI2oOlPQG/k1SEWMpxwHLA9/L0PnneB0ttIEZZC/OqXh7cc2WIg0i1C68h3be8nL5jgg819lhSle6edQ9D6pzyla22I7nB7/lAo+ThK48eyzeyu4mC7QCOi4iPlIjVzzamkH4wAmaXagGeY/+Z1FBwLpWqqE3uGSj1b3QQqVO+kyMPmas0FvgmEdG4umJlW4tJDb1a1aTHsLS1dkTDlv6drpRKXz2NVt0O7gV/W9eRinyuiIjNlTrD/K+IeHeh+D3tHqa90kO+3/qnpqUGvvLosYjYNv+zdo+In/ZwO71OHBsC/wRmVueVaoQIPBgR5xSKBaRBdYCjO8z/PWlMhpLbajRi4yA8IWmTVgO43DCuaHuPUewglh7ct20d3AvGfzRSZ4tIWiEi/pxr7hWLD1yn1AtFL7qH+Y2kc1k6mNW7gVlNgzp5DINIYyp/jHQDdbT6NUur+z2HVEx2E0tb9jZ1gaRvkBoFVmuLlRgNseeU+p56moLl4p8lfUa3kq78NqLQgET5u/njtpp0e0bE9/pdceTo9cF9QS7q/AVwnqR/kBrJlvJryhZxApBrGK4dEZ9VGib5NeRmAqT7r83iu9hqeEg6nNQ530/pe3ZRtLHacMn3cj4cER8uFK9T8V6xYr1ey3XpW1YkDbwzt+T+51p6LyQdAP4cEUVGm9MwdJ3TS5J+TkqknyQVq/6DNFDWzj3Y1utJHWueU7iywnOADSPipoIxfwV8ISKubZs/FfhS08afTh7DpJf19JcVdegzyhJJGwBfj4g9G8Z5Y0T8Lp85Pk2Jxo5KXbe8LPLBIN/EvTYKDWQ1nHpxcK9Uh+93XoP4byENN7x8RGwsaXPSsMaNxqeXdH1EvKTLssaNP11sNUxGe51xSQdXJscAW5CGzSwVf1VS31yt4p+LSD+gId9EbGtZ+zRNf5wDWEAacrip1wO/o+9AUy2N+v6qOBc4Q9LxOeYBpBp8o0L1QB4RF7XmkWqkldAniebkWrKrmCNIV6oXAkTENZJKHC9W7GdZ48GynDyGyXDU0++x6g3hJaQy2rMKxj8ZuJ40EBSkH/4PgI5n3IN0TNOdGiz1HTFyDLA5S7sJH7KI+FJ+emRE9Ll6LXSAAfg88GHS91OkKs0nFoo9HHpycJd0KPAF4DmSHmrNBh4n9VNXypKIeDA1BXtKiSKhqyR9KCJOqM5UGo1ybtPgLrYaJpJOJNXTb42RsA/wREQUq6c/mnUpd3/avJFKfcfEWALcHhHFeu7tVETYqWHis0n14A483JpNPrhHxKGFtvPVUrG6xD8JmE3q8eAdwCdI92yaDma1NqkvtMdZmiymknqgeFtE3N0ovpPH8Bit9fSHq+hH0uWkVq+X5ulXA8dExDYFYk8mdbMxhb6NyEqOBLcSqdbPE3l6LLBCRDzc/5oDxn0R6cz666QaVy3jSZ/XkO9LSDojIt6lzl2yEwXHO+mlXh3cJb0o19zqeF+vVE1ApQ48DwN2JCW/c0n9oj1aKP62LC1CnReFeut18hgmkv5IautRraf/s5F+w1lLO1V7O2mMgR/l6T1JZ9dfKLSdzUlXZauSfkB/B97batjXMPalpPsp3yLdO3gf6bv/pX5XrLeNK4DtI49ep9RP1G+j4TjpknYljb/+ViptbEj9T52e26wMNfbEiFioNLbM00QPxqwvqdcHd0n
TI2L/4aoJKGl8jru4ZNxecfLoMUmfJHVHsjpwAtAqt54EvL/UWUCvSbo4Il430LwC2xkPEBEPDfTaGjFboyA+VcNE0iUR8dqC2+hpsZukbSLi8hKxnimG++DeK5JeSbrn17qv+CDp2ND4vkQv+YZ5761P6tPqxcBfSGfUc4EfRETJhka9NkHS8yON2ta6WTuhaVBJe0fEj9pqc9G6eRhlupR/VKmV/81KA+/cCTyvQNyqf6nSV5mkV5Da9ZRydW7Mtxl9i94ad4mfqwF/jfSZKP9FNOxSpdciYv/8uG0vtyNpRdKIlK9haXf7x5cqVgJOAj4aEZfk7b2GVFlkRBcbOnn0WER8BkCpG/CpwKuAbYCPSXogIqYsy/2r4VPAhbmFM6QrpxINBFfKj5269yh1WfxJ4LmkG5FfBrYF9i0Uu7qNMyW1TggmkrqBKOVU4M/Am4AjSV2+31go9teBt0REqXjDahgO7qeQiglbY4HvSfp/7F4o/uJW4gCIiEuV+kob0VxsNUxyO4ZtSIMFbQOsBlwXEUW6mBgOuYXzi/JksRbOOfar22sndZo3xNi7R8SZA80rsJ3l6NsCvGQL5Ksj4uXK3cznbZ1bomhGHUYSHE0knUE6uFfvx60eEUUO7r2u7CLpW6STm9NIye/dpFbyZ8HI7aLHyaPHJE0nFTUsBv4AXEHqwO0fy3THBknS5yLi6/l5nwOupP8ueMO8U1XUIi3Yexm7Eq+n/UNJujIitlQaHfGjwN3AlSVqjEn6NqkyxC/o26/YqBiqdxgO7j8kXclckae3AvaLiI8Wij8sPW+X5mKr3tsQWIE0etedpJbHDyzLHappD1KxBsChpBEQW6aR6tkPmaRtSEV5E9rue4wnjbfcJPZOpDHL15P0nbbYpceK/lBEHNuaiDTS4odYOv5GU9NzQjqcVOtqZeCL/a8yaONJ7SR2rMwr1Xp9OFwtaeu2g3uxNjbAVsC+klo9SG8I3Niq4ty0SnOv79n0ipNHj0XENKW7v5uRDpKfBl4i6e+kMQeKVRftEXV53ml6KJYnHQjH0fe+x0OkAZWauIs0QNZb6duidjHpHk5JYyQpok//UMWGA46IVovviyg3wFcr9qgpOu2ipwd30kmStXGx1TCStD7pnsergF2ANSNitWW6UwOoFu+0F/WULPqRtFGv2hVIGhcRpa802rdxDKmb9Fb/UB8B/hYRn24Y9+D+lpeojaY0styHSJUgnjqhLFGTazh0a6fSUup7Jel59K3pVmosm1HJVx49JukTpGTxauDfpMvpy0n1uq9bhrs2WC9T6tdHPL2Pn/46XqvrYaXxPNqrog65vLfVgppUrNHrFtSHkw7AB7C0f6iTCsTt9SBTAGeTaiidzygcYKqVHHp1cJf0VtIwzOsC95JOEm6k3Fg2o5KTR+9NIo0V/KmIWLiM96W2iGh036GGH5PGOtmFdADej+a99h6UH3dpGKcrSeOA/ya1Wr+DlDg2IDUGHUPDg3FElBwRr5vnRsTnh2E7PTEMB/cvk4a2PT/XeNuWVKOrCEm7A7+JiMWS/pPUY/VRI7WWVcuYZb0Dz3QRcXBE/Gw0Jo5htmZEnAT8OyIuykUmWzcJmLveGAucFBF/bf8rstfwDWAN4PkRsUWkAZQ2JnWzUqxXX0mbSpot6fo8/dJ8oCnhV5KKD5w0jFoH979EGvpgO8reMP93RNxPuq81JiIuIPWaXMrhOXG8htSOZwZwXMH4PeHkYSNFq03EQklvlvRyUuv8RiJ1VPhwbmfTC7uQalo91agrP/8IqaZXKSeQarv9O2/jWlJNuCGTtDgXQx5ESiCPSHqoMn+06PXB/QGlvsouBn6cqzaXvIfWujp9M3BcRJxNwcoWveJiKxspjsoH+E+TWvKOp1yNqEeB6ySdR98hgD9RIHa0ali1zXyi032WBp4bEVeq75gPjQ5gETEc91OGQ/vB/V7KHtx3JXU18ylSy/5VSa38S7lT0veB7YGv5ca4I/7
E3snDRoSI+FV++iCp+5CSfp3/euEGSftGxCnVmZL2JnUnUsp9kjYhd9ki6Z1AkaJQSbMjYruB5o1gPTm4S3oBsHall4MngRmSXkfqIeL+ptvI3kWqDnxMRDwgaSJ9u98fkVxV10YESTOAg9paaH9zpFcXlbQeqTHdI6S2JAG8kjRA0dsi4s5C23k+afS6V5G6rrgN2KvJvZvcJ9RKpGFu38DSdjvjSWOAv7jJPvdah4N7a/7rgDsjD3/QIP6vgC/kIsLq/KnAlyKi09DAdbcxhjRefIkhi4eVrzxspHhpK3HAUy20X14isHo4GFRODltJeiOpdo9IB97ZTWO3bedWYHulQafGkJLVu4EmN/4/TOrQcV2gWrPnIeDYTiuMMP9L5x4OHs7Lmh7cJ7UnDoCImCNpUsPYrVhPSvqTpA1HW7sRJw8bKcZIWr3V55ekNSj3/fwBSweD2pY8GFSh2ABEGpel+NgsSuObfAxYj9Qe4/w8/RnSGOk/HmrsiPg28G1JH4+I7w64wsjT64N7f+2YnlMgfstEYJ6kK+l7T67IKJ294uRhI8U3gd9L+lme3h34SqHYz4mI2bn7kL8CR0i6hJRQRrpTScVUl5MaIX6OVBNnt4i4pkngVqeXEfHdXnZ62UO9PrhfJelDEXFCdaakD9C3u5umhqMtT3G+52EjhqQpwBtJVwWzI+KGQnEvA15Laqz5O1IHlUdHxAtLxO8l9R39cCxwH7BhFBiqdLi6nukVSacBv+tycN8xIhqNpyJpbeDnwOMsTRZTScn7bRFxd5P4o52Th40IkjbsNL9EObDSMJ83kmrIfJlUG+frrV5YR7Ie9yd2dW7U2Od5p+mRaLgO7rlFeeuG9rwoPHS00sBPrQPx8sBywL9ihI/k6ORhI0KrB9Q8+RxSK+2bIuJZ3X+QpCdYWg4u0mfzMAWGih3tVx4tvT64DzdJuwFbjvRiQycPG5EkbQF8OCKGPNStpJn9LR/pNyR7rZKYqkmJPL1iRCy3rPbt2U7SFRHRqHueXvMNcxuRIuKPubipiW1InRWeRhrFsWgNq9FuGDu9tH5Ientlcgyp6G3En9U7ediI0DZuxRhSz6JNe9VdB9iB1APqe0itzE+LiHkN45qVVG2PsgS4ndRqfkRzsZWNCJKq1WZbP6CzIuLRQvFXICWRbwBHjtJ2DWYjhpOHPaPlpPFmUuKYRBr/++RS3YaYNaU0wuh3SQPGBXApqaueBct0xwbg5GHLlKRf0k/5bpOb2rm/rJcA5wCnR8T1Q41l1iu5t+efkBqEAuxN6rdsh2W3VwNz8rBlStLr89O3k+5R/ChP7wnc3qS6oqQnWVrNtfpFb1zN1awUSddExOYDzRtpfMPclqmIuAhA0pcj4nWVRb+UdHHD2CN+TAQzUnf7e5NqBUI6cSrV3XvP+MdlI8WE3O04AJI2BiYsw/0xGy7vJ43pcTdpjJZ35nkjmoutbESQNI00XsWtedYkUiPBc5fZTplZV04eNmLkmlEvypN/jojHluX+mPWSpO/Sf2WREsMk94yLrWyZkvS5yuRbI+JP+e8xSf+9zHbMrPfmkDp0nAu8tfK89Tei+crDlqlnSud8Zk2Mhl6M2/nKw5Y1dXneadrsmWrUncU7ediyFl2ed5o2sxHCxVa2TLlbcHu2ahsE6rn0/e6P+EasTh5mZlabi63MzKw2Jw8zM6vNycPMzGpz8jAzs9qcPMzMrLb/D+2iVhEYqmv2AAAAAElFTkSuQmCC", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "local_explanations_out = pd.read_csv(\n", " explainability_analysis_result_path + \"/explanations_shap/out.csv\"\n", ")\n", "feature_names = [c.replace(\"_label0\", \"\") for c in local_explanations_out.columns]\n", "local_explanations_out.columns = feature_names\n", "\n", "selected_example = 111\n", "print(\n", " \"Example number:\",\n", " selected_example,\n", " \"\\nwith model prediction:\",\n", " sum(local_explanations_out.iloc[selected_example]) > 0,\n", ")\n", "print(\"\\nFeature values -- Label\", training_data.iloc[selected_example])\n", "local_explanations_out.iloc[selected_example].plot(\n", " kind=\"bar\", title=\"Local explanation for the example number \" + str(selected_example), rot=90\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Clean Up\n", "Finally, don't forget to clean up the resources we set up and used for this demo!" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'ResponseMetadata': {'RequestId': '00748684-073e-4201-8186-f9603a8ebe3b',\n", " 'HTTPStatusCode': 200,\n", " 'HTTPHeaders': {'x-amzn-requestid': '00748684-073e-4201-8186-f9603a8ebe3b',\n", " 'content-type': 'application/x-amz-json-1.1',\n", " 'content-length': '0',\n", " 'date': 'Thu, 19 Jan 2023 01:29:06 GMT'},\n", " 'RetryAttempts': 0}}" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sagemaker_client.delete_model(ModelName=model_name)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2, which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load.
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability_boto3.ipynb)\n" ] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 4 }