{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Amazon SageMaker Workshop\n", "### _**Evaluation**_\n", "\n", "---\n", "In this part of the workshop we will get the previous model we trained to Predict Mobile Customer Departure and evaluate its performance with a test dataset.\n", "\n", "---\n", "\n", "## Contents\n", "\n", "1. [Background](#Background) - Getting the model trained in the previous lab.\n", "2. [Evaluate](#Evaluate)\n", " * Creating a script to evaluate model\n", " * Using [SageMaker Processing](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job.html) jobs to automate evaluation of models\n", "3. [Exercise](#a_Exercise) - customizing metrics and evaluation reports\n", "4. [Wrap-up - end of Evaluation Lab](#Wrap-up)\n", "\n", "\n", "---\n", "\n", "## Background\n", "\n", "In the previous [Modeling](../2-Modeling/modeling.ipynb) lab we used SageMaker trained models by creating multiple SageMaker training jobs.\n", "\n", "Install and import some packages we'll need for this lab:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import sagemaker\n", "from sagemaker.s3 import S3Uploader, S3Downloader" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "sm_sess = sagemaker.session.Session()\n", "role = sagemaker.get_execution_role()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the variables from initial setup:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "%store -r bucket\n", "%store -r prefix\n", "%store -r region\n", "%store -r docker_image_name" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('sagemaker-studio-us-east-2-309044321483',\n", " 'xgboost-churn',\n", " 'us-east-2',\n", " '257758044811.dkr.ecr.us-east-2.amazonaws.com/sagemaker-xgboost:1.3-1')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bucket, prefix, region, docker_image_name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "### - if you _**skipped**_ the data preparation lab follow instructions:\n", "\n", " - **run this [notebook](./config/pre_setup.ipynb)**\n", " - load the model S3 URI:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# # Uncomment if you skipped the data preparation lab\n", "\n", "# %store -r s3uri_model\n", "# !cp config/model.tar.gz ./\n", "\n", "# %store -r s3uri_test" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "### - if you _**have done**_ the previous labs\n", "\n", "Download the model and test data from S3:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'workshop-framework-xgboost-customer-chu-2021-06-10-00-06-45-384'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Uncomment if you have done the previous labs\n", "\n", "# Get name of training job and other variables\n", "%store -r training_job_name\n", "training_job_name" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "2021-06-10 00:16:15 Starting - Preparing the instances for training\n", "2021-06-10 00:16:15 Downloading - Downloading input data\n", "2021-06-10 00:16:15 Training - Training image download completed. 
Training in progress.\n", "2021-06-10 00:16:15 Uploading - Uploading generated training model\n", "2021-06-10 00:16:15 Completed - Training job completed\n", "\n", "s3uri_model = s3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/output/workshop-framework-xgboost-customer-chu-2021-06-10-00-06-45-384/output/model.tar.gz\n" ] } ], "source": [ "# Uncomment if you have done the previous labs\n", "estimator = sagemaker.estimator.Estimator.attach(training_job_name)\n", "s3uri_model = estimator.model_data\n", "print(\"\\ns3uri_model =\", s3uri_model)\n", "\n", "S3Downloader.download(s3uri_model, \".\")" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "%store -r s3uri_test\n", "S3Downloader.download(s3uri_test, \".\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you should have the `model.tar.gz` file in the 3-Evaluation directory.\n", "\n", "(click the refresh button)\n", "\n", "![refresh_dir.png](./media/refresh_dir.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Evaluate model\n", "\n", "Let's create a simple evaluation with some scikit-learn metrics, such as [Area Under the Curve (AUC)](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.auc.html) and [Accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html)." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading xgboost model.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import json\n", "import os\n", "import tarfile\n", "import logging\n", "import pickle\n", "\n", "import pandas as pd\n", "import xgboost\n", "\n", "from sklearn.metrics import classification_report, roc_auc_score, accuracy_score\n", "\n", "\n", "model_path = \"model.tar.gz\"\n", "with tarfile.open(model_path) as tar:\n", " tar.extractall(path=\".\")\n", "\n", "print(\"Loading xgboost model.\")\n", "model = pickle.load(open(\"xgboost-model\", \"rb\"))\n", "model" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading test input data\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
0123456789...60616263646566676869
001860137.897187.7118146.4858.7...0000011010
1013225113.296269.9107229.1877.1...0000101001
2011217183.295252.8125156.7959.7...0000101001
30912493.5112183.4128240.71339.9...0000010101
40220110.3107166.593202.3969.5...0001001010
..................................................................
3290590166.395239.387123.210810.0...0000011010
330012714143.299169.991221.67711.6...0000101001
3310860166.2112255.381228.1975.4...0000101010
3320364329.9123129.1117325.91058.6...0001001001
33301230163.1119249.451168.2779.0...0000101010
\n", "

334 rows × 70 columns

\n", "
" ], "text/plain": [ " 0 1 2 3 4 5 6 7 8 9 ... 60 61 62 \\\n", "0 0 186 0 137.8 97 187.7 118 146.4 85 8.7 ... 0 0 0 \n", "1 0 132 25 113.2 96 269.9 107 229.1 87 7.1 ... 0 0 0 \n", "2 0 112 17 183.2 95 252.8 125 156.7 95 9.7 ... 0 0 0 \n", "3 0 91 24 93.5 112 183.4 128 240.7 133 9.9 ... 0 0 0 \n", "4 0 22 0 110.3 107 166.5 93 202.3 96 9.5 ... 0 0 0 \n", ".. .. ... .. ... ... ... ... ... ... ... ... .. .. .. \n", "329 0 59 0 166.3 95 239.3 87 123.2 108 10.0 ... 0 0 0 \n", "330 0 127 14 143.2 99 169.9 91 221.6 77 11.6 ... 0 0 0 \n", "331 0 86 0 166.2 112 255.3 81 228.1 97 5.4 ... 0 0 0 \n", "332 0 36 43 29.9 123 129.1 117 325.9 105 8.6 ... 0 0 0 \n", "333 0 123 0 163.1 119 249.4 51 168.2 77 9.0 ... 0 0 0 \n", "\n", " 63 64 65 66 67 68 69 \n", "0 0 0 1 1 0 1 0 \n", "1 0 1 0 1 0 0 1 \n", "2 0 1 0 1 0 0 1 \n", "3 0 0 1 0 1 0 1 \n", "4 1 0 0 1 0 1 0 \n", ".. .. .. .. .. .. .. .. \n", "329 0 0 1 1 0 1 0 \n", "330 0 1 0 1 0 0 1 \n", "331 0 1 0 1 0 1 0 \n", "332 1 0 0 1 0 0 1 \n", "333 0 1 0 1 0 1 0 \n", "\n", "[334 rows x 70 columns]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(\"Loading test input data\")\n", "test_path = \"config/test-dataset.csv\"\n", "df = pd.read_csv(test_path, header=None)\n", "df" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Reading test data. We should get an `DMatrix` object...\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(\"Reading test data. We should get an `DMatrix` object...\")\n", "y_test = df.iloc[:, 0].to_numpy()\n", "df.drop(df.columns[0], axis=1, inplace=True)\n", "X_test = xgboost.DMatrix(df.values)\n", "X_test" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Performing predictions against test data.\n", "Creating classification evaluation report\n", "Accuracy = 0.9550898203592815\n", "AUC = 0.9146998834498835\n" ] } ], "source": [ "print(\"Performing predictions against test data.\")\n", "predictions_probs = model.predict(X_test)\n", "predictions = predictions_probs.round()\n", "\n", "print(\"Creating classification evaluation report\")\n", "acc = accuracy_score(y_test, predictions)\n", "auc = roc_auc_score(y_test, predictions_probs)\n", "\n", "print(\"Accuracy =\", acc)\n", "print(\"AUC =\", auc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating a classification report\n", "\n", "Now, let's save the results in a JSON file, following the structure defined in SageMaker docs:\n", "https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html\n", "\n", "We'll use this logic later in [Lab 6-Pipelines](../6-Pipelines/pipelines.ipynb):" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Classification report:\n", "{'binary_classification_metrics': {'accuracy': {'standard_deviation': 'NaN',\n", " 'value': 0.9550898203592815},\n", " 'auc': {'standard_deviation': 'NaN',\n", " 'value': 0.9146998834498835}}}\n" ] } ], "source": [ "import pprint\n", "# The metrics reported can change based on the model used - check the link for the documentation \n", "report_dict = {\n", " \"binary_classification_metrics\": {\n", " \"accuracy\": {\n", " \"value\": acc,\n", " \"standard_deviation\": \"NaN\",\n", " },\n", " \"auc\": 
{\"value\": auc, \"standard_deviation\": \"NaN\"},\n", " },\n", "}\n", "\n", "print(\"Classification report:\")\n", "pprint.pprint(report_dict)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving classification report to ./evaluation.json\n" ] } ], "source": [ "evaluation_output_path = os.path.join(\n", " \".\", \"evaluation.json\"\n", ")\n", "print(\"Saving classification report to {}\".format(evaluation_output_path))\n", "\n", "with open(evaluation_output_path, \"w\") as f:\n", " f.write(json.dumps(report_dict))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## Ok, now we have working code. Let's put it in a Python Script" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing evaluate.py\n" ] } ], "source": [ "%%writefile evaluate.py\n", "\"\"\"Evaluation script for measuring model accuracy.\"\"\"\n", "\n", "import json\n", "import os\n", "import tarfile\n", "import logging\n", "import pickle\n", "\n", "import pandas as pd\n", "import xgboost\n", "\n", "logger = logging.getLogger()\n", "logger.setLevel(logging.INFO)\n", "logger.addHandler(logging.StreamHandler())\n", "\n", "# May need to import additional metrics depending on what you are measuring.\n", "# See https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html\n", "from sklearn.metrics import classification_report, roc_auc_score, accuracy_score\n", "\n", "def get_dataset(dir_path, dataset_name) -> pd.DataFrame:\n", " files = [ os.path.join(dir_path, file) for file in os.listdir(dir_path) ]\n", " if len(files) == 0:\n", " raise ValueError(('There are no files in {}.\\n' +\n", " 'This usually indicates that the channel ({}) was incorrectly specified,\\n' +\n", " 'the data specification in S3 was incorrectly specified or the role specified\\n' +\n", " 'does not have permission to access the data.').format(files, dataset_name))\n", " raw_data = [ pd.read_csv(file, header=None) for file in files ]\n", " df = pd.concat(raw_data)\n", " return df\n", "\n", "if __name__ == \"__main__\":\n", " model_path = \"/opt/ml/processing/model/model.tar.gz\"\n", " with tarfile.open(model_path) as tar:\n", " tar.extractall(path=\"..\")\n", "\n", " logger.debug(\"Loading xgboost model.\")\n", " model = pickle.load(open(\"xgboost-model\", \"rb\"))\n", "\n", " logger.info(\"Loading test input data\")\n", " test_path = \"/opt/ml/processing/test\"\n", " df = get_dataset(test_path, \"test_set\")\n", "\n", " logger.debug(\"Reading test data.\")\n", " y_test = df.iloc[:, 0].to_numpy()\n", " df.drop(df.columns[0], axis=1, inplace=True)\n", " X_test = xgboost.DMatrix(df.values)\n", "\n", " logger.info(\"Performing predictions against test data.\")\n", " predictions_probs = model.predict(X_test)\n", " predictions = predictions_probs.round()\n", "\n", " logger.info(\"Creating classification evaluation report\")\n", " acc = accuracy_score(y_test, predictions)\n", " auc = roc_auc_score(y_test, predictions_probs)\n", "\n", " # The metrics reported can change based on the model used, but it must be a specific name per (https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html)\n", " report_dict = {\n", " \"binary_classification_metrics\": {\n", " \"accuracy\": {\n", " \"value\": acc,\n", " \"standard_deviation\": \"NaN\",\n", " },\n", " \"auc\": {\"value\": auc, \"standard_deviation\": \"NaN\"},\n", 
" },\n", " }\n", "\n", " logger.info(\"Classification report:\\n{}\".format(report_dict))\n", "\n", " evaluation_output_path = os.path.join(\n", " \"/opt/ml/processing/evaluation\", \"evaluation.json\"\n", " )\n", " logger.info(\"Saving classification report to {}\".format(evaluation_output_path))\n", "\n", " with open(evaluation_output_path, \"w\") as f:\n", " f.write(json.dumps(report_dict))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## Ok, now we are finally running this script with a simple call to SageMaker Processing!" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "from sagemaker.processing import (\n", " ProcessingInput,\n", " ProcessingOutput,\n", " ScriptProcessor,\n", ")" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# Processing step for evaluation\n", "processor = ScriptProcessor(\n", " image_uri=docker_image_name,\n", " command=[\"python3\"],\n", " instance_type=\"ml.m5.xlarge\",\n", " instance_count=1,\n", " base_job_name=\"CustomerChurn/eval-script\",\n", " sagemaker_session=sm_sess,\n", " role=role,\n", ")" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "entrypoint = \"evaluate.py\"" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "from time import strftime, gmtime\n", "# Helper to create timestamps\n", "create_date = lambda: strftime(\"%Y-%m-%d-%H-%M-%S\", gmtime())" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Job Name: CustomerChurnEval-2021-06-10-00-18-17\n", "Inputs: [{'InputName': 'input-1', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/output/workshop-framework-xgboost-customer-chu-2021-06-10-00-06-45-384/output/model.tar.gz', 'LocalPath': '/opt/ml/processing/model', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'input-2', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/data/test/test-dataset.csv', 'LocalPath': '/opt/ml/processing/test', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-2-309044321483/CustomerChurnEval-2021-06-10-00-18-17/input/code/evaluate.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]\n", "Outputs: [{'OutputName': 'evaluation', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-us-east-2-309044321483/CustomerChurnEval-2021-06-10-00-18-17/output/evaluation', 'LocalPath': '/opt/ml/processing/evaluation', 'S3UploadMode': 'EndOfJob'}}]\n", ".......................\u001b[34mLoading test input data\u001b[0m\n", "\u001b[34mPerforming predictions against test data.\u001b[0m\n", "\u001b[34mCreating classification evaluation report\u001b[0m\n", "\u001b[34mClassification report:\u001b[0m\n", "\u001b[34m{'binary_classification_metrics': {'accuracy': {'value': 0.9550898203592815, 'standard_deviation': 'NaN'}, 'auc': {'value': 0.9146998834498835, 'standard_deviation': 'NaN'}}}\u001b[0m\n", "\u001b[34mSaving classification report to 
/opt/ml/processing/evaluation/evaluation.json\u001b[0m\n", "\n" ] } ], "source": [ "processor.run(\n", " code=entrypoint,\n", " inputs=[\n", " sagemaker.processing.ProcessingInput(\n", " source=s3uri_model,\n", " destination=\"/opt/ml/processing/model\",\n", " ),\n", " sagemaker.processing.ProcessingInput(\n", " source=s3uri_test,\n", " destination=\"/opt/ml/processing/test\",\n", " ),\n", " ],\n", " outputs=[\n", " sagemaker.processing.ProcessingOutput(\n", " output_name=\"evaluation\", source=\"/opt/ml/processing/evaluation\"\n", " ),\n", " ],\n", " job_name=f\"CustomerChurnEval-{create_date()}\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If everything went well, the SageMaker Processing job will have created the JSON evaluation report for our model and saved it in S3.\n", "\n", "In addition, under the hood, SageMaker Processing has uploaded our `evaluate.py` script to S3. Let's check where the script was saved:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'s3://sagemaker-us-east-2-309044321483/CustomerChurnEval-2021-06-10-00-18-17/input/code/evaluate.py'" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for proc_in in processor.latest_job.inputs:\n", " if proc_in.input_name == \"code\":\n", " s3_evaluation_code_uri = proc_in.source\n", "\n", "s3_evaluation_code_uri" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's store the S3 URI where our evaluation script was saved for later" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Stored 's3_evaluation_code_uri' (str)\n" ] } ], "source": [ "%store s3_evaluation_code_uri" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's check out the evaluation report from S3!" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'s3://sagemaker-us-east-2-309044321483/CustomerChurnEval-2021-06-10-00-18-17/output/evaluation'" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "out_s3_report_uri = processor.latest_job.outputs[0].destination\n", "out_s3_report_uri" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['s3://sagemaker-us-east-2-309044321483/CustomerChurnEval-2021-06-10-00-18-17/output/evaluation/evaluation.json']" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reports_list = S3Downloader.list(out_s3_report_uri)\n", "reports_list" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=====Model Report====\n", "{\n", " \"binary_classification_metrics\": {\n", " \"accuracy\": {\n", " \"value\": 0.9550898203592815,\n", " \"standard_deviation\": \"NaN\"\n", " },\n", " \"auc\": {\n", " \"value\": 0.9146998834498835,\n", " \"standard_deviation\": \"NaN\"\n", " }\n", " }\n", "}\n" ] } ], "source": [ "report = S3Downloader.read_file(reports_list[0])\n", "\n", "print(\"=====Model Report====\")\n", "print(json.dumps(json.loads(report.split('\\n')[0]), indent=2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Wrap-up\n", "\n", "Now that we've finished the **evaluation lab**, let's make everything here reusable. It may come in handy later (spoiler alert - when creating Pipelines)..." 
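] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we do, here is a minimal sketch of how the report we just generated could be consumed downstream, e.g. as a quality gate that fails if accuracy drops below a chosen threshold (the `MIN_ACCURACY` value here is an assumption for illustration - pick one that fits your use case):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hypothetical quality gate: parse the evaluation report produced above\n", "# and fail fast if the model does not meet a minimum accuracy.\n", "MIN_ACCURACY = 0.9  # assumption: threshold chosen for illustration only\n", "\n", "eval_report = json.loads(S3Downloader.read_file(reports_list[0]))\n", "acc_value = eval_report[\"binary_classification_metrics\"][\"accuracy\"][\"value\"]\n", "if acc_value < MIN_ACCURACY:\n", "    raise ValueError(f\"Accuracy {acc_value:.3f} is below the minimum {MIN_ACCURACY}\")\n", "print(f\"Model passed the quality gate with accuracy {acc_value:.3f}\")" 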
] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing ../6-Pipelines/my_labs_solutions/evaluation_solution.py\n" ] } ], "source": [ "%%writefile ../6-Pipelines/my_labs_solutions/evaluation_solution.py\n", "import sagemaker\n", "from sagemaker.processing import (\n", " ProcessingInput,\n", " ProcessingOutput,\n", " ScriptProcessor,\n", ")\n", "\n", "def get_evaluation_processor(docker_image_name) -> ScriptProcessor:\n", " \n", " role = sagemaker.get_execution_role()\n", " sm_sess = sagemaker.session.Session()\n", "\n", " # Processing step for evaluation\n", " processor = ScriptProcessor(\n", " image_uri=docker_image_name,\n", " command=[\"python3\"],\n", " instance_type=\"ml.m5.xlarge\",\n", " instance_count=1,\n", " base_job_name=\"CustomerChurn/eval-script\",\n", " sagemaker_session=sm_sess,\n", " role=role,\n", " )\n", " \n", " return processor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "# [Extra] Fairness and Explainability with SageMaker Clarify" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "Amazon SageMaker Clarify helps improve your machine learning models by detecting potential bias and helping explain how these models make predictions. The fairness and explainability functionality provided by SageMaker Clarify takes a step towards enabling AWS customers to build trustworthy and understandable machine learning models. The product comes with the tools to help you with the following tasks.\n", "\n", "* Measure biases that can occur during each stage of the ML lifecycle (data collection, model training and tuning, and monitoring of ML models deployed for inference).\n", "* Generate model governance reports targeting risk and compliance teams and external regulators.\n", "* Provide explanations of the data, models, and monitoring used to assess predictions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### - if you _**have done**_ the previous labs\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using already existing model: clarify-model\n" ] }, { "data": { "text/plain": [ "'clarify-model'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session = sagemaker.session.Session()\n", "model_name = 'clarify-model'\n", "\n", "model = estimator.create_model(name=model_name)\n", "container_def = model.prepare_container_def()\n", "session.create_model(model_name,\n", " role,\n", " container_def)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that you have your model set up. Let's say hello to SageMaker Clarify!" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "from sagemaker import clarify\n", "clarify_processor = clarify.SageMakerClarifyProcessor(role=role,\n", " instance_count=1,\n", " instance_type='ml.m5.xlarge',\n", " sagemaker_session=session)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Detecting Bias\n", "SageMaker Clarify helps you detect possible pre- and post-training biases using a variety of metrics.\n", "#### Writing DataConfig and ModelConfig\n", "A `DataConfig` object communicates some basic information about data I/O to SageMaker Clarify. We specify where to find the input dataset, where to store the output, the target column (`label`), the header names, and the dataset type." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "%store -r bucket\n", "%store -r prefix\n", "%store -r s3uri_train" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true } }, "outputs": [ { "data": { "text/plain": [ "['Churn',\n", " 'Account Length',\n", " 'VMail Message',\n", " 'Day Mins',\n", " 'Day Calls',\n", " 'Eve Mins',\n", " 'Eve Calls',\n", " 'Night Mins',\n", " 'Night Calls',\n", " 'Intl Mins',\n", " 'Intl Calls',\n", " 'CustServ Calls',\n", " 'State_AK',\n", " 'State_AL',\n", " 'State_AR',\n", " 'State_AZ',\n", " 'State_CA',\n", " 'State_CO',\n", " 'State_CT',\n", " 'State_DC',\n", " 'State_DE',\n", " 'State_FL',\n", " 'State_GA',\n", " 'State_HI',\n", " 'State_IA',\n", " 'State_ID',\n", " 'State_IL',\n", " 'State_IN',\n", " 'State_KS',\n", " 'State_KY',\n", " 'State_LA',\n", " 'State_MA',\n", " 'State_MD',\n", " 'State_ME',\n", " 'State_MI',\n", " 'State_MN',\n", " 'State_MO',\n", " 'State_MS',\n", " 'State_MT',\n", " 'State_NC',\n", " 'State_ND',\n", " 'State_NE',\n", " 'State_NH',\n", " 'State_NJ',\n", " 'State_NM',\n", " 'State_NV',\n", " 'State_NY',\n", " 'State_OH',\n", " 'State_OK',\n", " 'State_OR',\n", " 'State_PA',\n", " 'State_RI',\n", " 'State_SC',\n", " 'State_SD',\n", " 'State_TN',\n", " 'State_TX',\n", " 'State_UT',\n", " 'State_VA',\n", " 'State_VT',\n", " 'State_WA',\n", " 'State_WI',\n", " 'State_WV',\n", " 'State_WY',\n", " 'Area Code_408',\n", " 'Area Code_415',\n", " 'Area Code_510',\n", " \"Int'l Plan_no\",\n", " \"Int'l Plan_yes\",\n", " 'VMail Plan_no',\n", " 'VMail Plan_yes']" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "col_names = pd.read_csv(\"/root//amazon-sagemaker-workshop/2-Modeling/config/training-dataset-with-header.csv\").columns.to_list()\n", "col_names" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [], "source": [ "bias_report_output_path = f's3://{bucket}/{prefix}/clarify-bias-report'\n", "bias_data_config = clarify.DataConfig(s3_data_input_path=s3uri_train,\n", " s3_output_path=bias_report_output_path,\n", " label='Churn',\n", " headers=col_names,\n", " dataset_type='text/csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A `ModelConfig` object communicates information about your trained model. To avoid additional traffic to your production models, SageMaker Clarify sets up and tears down a dedicated endpoint when processing.\n", "* `instance_type` and `instance_count` specify your preferred instance type and instance count used to run your model on during SageMaker Clarify's processing. The testing dataset is small so a single standard instance is good enough to run this example. If your have a large complex dataset, you may want to use a better instance type to speed up, or add more instances to enable Spark parallelization.\n", "* `accept_type` denotes the endpoint response payload format, and `content_type` denotes the payload format of request to the endpoint." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "model_config = clarify.ModelConfig(model_name=model_name,\n", " instance_type='ml.m5.xlarge',\n", " instance_count=1,\n", " accept_type='text/csv',\n", " content_type='text/csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A `ModelPredictedLabelConfig` provides information on the format of your predictions. 
The XGBoost model outputs probabilities for each sample, so SageMaker Clarify invokes the endpoint and then uses `probability_threshold` to convert the probabilities into binary labels for bias analysis. A prediction above the threshold is interpreted as label value `1`, and one below or equal to the threshold as label value `0`." ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "predictions_config = clarify.ModelPredictedLabelConfig(probability_threshold=0.8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Writing BiasConfig\n", "SageMaker Clarify also needs information on what the sensitive columns (`facets`) are, what the sensitive features (`facet_values_or_threshold`) may be, and what the desirable outcomes are (`label_values_or_threshold`).\n", "SageMaker Clarify can handle both categorical and continuous data for `facet_values_or_threshold` and for `label_values_or_threshold`. In this case we are using categorical data.\n", "\n", "We specify this information in the `BiasConfig` API. Here the positive outcome is churn (`Churn = 1`). We will group users by whether they have an International Plan (`Int'l Plan_no`), which could be associated with customers from different segments/behaviors. `group_name` is used to form subgroups for the measurement of Conditional Demographic Disparity in Labels (CDDL) and Conditional Demographic Disparity in Predicted Labels (CDDPL) with regards to Simpson’s paradox.\n", "\n", "(Note: `facet_values_or_threshold=[100]` below does not match any value in the one-hot `Int'l Plan_no` column, which only contains `0` and `1` - this is why several bias metrics in the job output below report `Facet set is empty`.)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "bias_config = clarify.BiasConfig(label_values_or_threshold=[1],\n", " facet_name=\"Int'l Plan_no\",\n", " facet_values_or_threshold=[100],\n", " group_name=\"Account Length\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Pre-training Bias\n", "Bias can be present in your data before any model training occurs. Inspecting your data for bias before training begins can help detect any data collection gaps, inform your feature engineering, and help you understand what societal biases the data may reflect.\n", "\n", "Computing pre-training bias metrics does not require a trained model.\n", "\n", "#### Post-training Bias\n", "Computing post-training bias metrics does require a trained model.\n", "\n", "Unbiased training data (as determined by the concepts of fairness measured by bias metrics) may still result in biased model predictions after training. Whether this occurs depends on several factors including hyperparameter choices.\n", "\n", "\n", "You can run these options separately with `run_pre_training_bias()` and `run_post_training_bias()` or at the same time with `run_bias()` as shown below." 
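] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For reference, a sketch of the pre-training-only variant follows (commented out so it doesn't launch an extra job). Since no `ModelConfig` is involved, Clarify analyzes only the data and does not spin up a shadow endpoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# [Optional] Sketch: compute only the pre-training (data) bias metrics.\n", "# Uses the same data and bias configs as the full run_bias() call below.\n", "\n", "# clarify_processor.run_pre_training_bias(bias_data_config, bias_config, methods='all')" 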
] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Job Name: Clarify-Bias-2021-06-10-00-41-43-637\n", "Inputs: [{'InputName': 'dataset', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/data/train/train.csv', 'LocalPath': '/opt/ml/processing/input/data', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'analysis_config', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/clarify-bias-report/analysis_config.json', 'LocalPath': '/opt/ml/processing/input/config', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]\n", "Outputs: [{'OutputName': 'analysis_result', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/clarify-bias-report', 'LocalPath': '/opt/ml/processing/output', 'S3UploadMode': 'EndOfJob'}}]\n", "..........................\u001b[34mINFO:sagemaker-clarify-processing:Starting SageMaker Clarify Processing job\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Analysis config path: /opt/ml/processing/input/config/analysis_config.json\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Analysis result path: /opt/ml/processing/output\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:This host is algo-1.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:This host is the leader.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Number of hosts in the cluster is 1.\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Running Python / Pandas based analyzer.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Dataset uri: /opt/ml/processing/input/data\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Dataset type: text/csv\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Spinning up shadow endpoint\u001b[0m\n", "\u001b[34mINFO:sagemaker:Creating endpoint-config with name sagemaker-clarify-endpoint-config-1623285953-9917\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Creating endpoint sagemaker-clarify-endpoint-1623285953-d74a\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:======================================\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Calculating post-training bias metrics\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:======================================\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Getting predictions from the endpoint\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Checking endpoint status\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Endpoint is in service after 721 seconds\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Prediction batch size is initialized with 21845\u001b[0m\n", "\u001b[34mINFO:analyzer.prediction_util:We assume a prediction above 0.800 indicates 1 and below or equal indicates 0.\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 
0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:AD metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 224, in AD\n", " raise ValueError(\"Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: Facet set is empty\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:DAR metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 179, in DAR\n", " dar, _ = common.DLR(feature, sensitive_facet_index, positive_label_index, positive_predicted_label_index)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/common.py\", line 220, in DLR\n", " raise ValueError(\"DLR: Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: DLR: Facet set is empty\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:DCA metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 96, in DCA\n", " dca, _ = common.DCO(feature, sensitive_facet_index, positive_label_index, positive_predicted_label_index)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/common.py\", line 166, in DCO\n", " raise ValueError(\"DCO: Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: DCO: Facet set is empty\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:DCR metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 116, in DCR\n", " _, dcr = common.DCO(feature, sensitive_facet_index, positive_label_index, 
positive_predicted_label_index)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/common.py\", line 166, in DCO\n", " raise ValueError(\"DCO: Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: DCO: Facet set is empty\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:DI metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 75, in DI\n", " raise ValueError(\"Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: Facet set is empty\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:DPPL metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 41, in DPPL\n", " return common.DPL(feature, sensitive_facet_index, positive_predicted_label_index)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/common.py\", line 62, in DPL\n", " raise ValueError(\"Facet set is empty.\")\u001b[0m\n", "\u001b[34mValueError: Facet set is empty.\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:DRR metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 199, in DRR\n", " _, drr = common.DLR(feature, sensitive_facet_index, positive_label_index, positive_predicted_label_index)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/common.py\", line 220, in DLR\n", " raise ValueError(\"DLR: Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: DLR: Facet set is empty\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:FT metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 355, in FT\n", " raise ValueError(\"Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: Facet set is empty\u001b[0m\n", 
"\u001b[34mERROR:smclarify.bias.report:RD metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 141, in RD\n", " raise ValueError(\"Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: Facet set is empty\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:TE metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/posttraining.py\", line 302, in TE\n", " raise ValueError(\"Facet set is empty\")\u001b[0m\n", "\u001b[34mValueError: Facet set is empty\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Stop using endpoint: sagemaker-clarify-endpoint-1623285953-d74a\u001b[0m\n", "\u001b[34mINFO:sagemaker:Deleting endpoint configuration with name: sagemaker-clarify-endpoint-config-1623285953-9917\u001b[0m\n", "\u001b[34mINFO:sagemaker:Deleting endpoint with name: sagemaker-clarify-endpoint-1623285953-d74a\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Model endpoint delivered 1.24143 requests per second and a total of 1 requests over 1 seconds\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Delivered 1 predict calls with a total of 2333 examples.\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Stop using endpoint: None\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Delivered 1 predict calls with a total of 2333 examples.\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:=====================================\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Calculating pre-training bias metrics\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:=====================================\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:CI metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 53, in CI\n", " raise ValueError(\"CI: facet set is empty. 
Check that x[facet] has non-zero length.\")\u001b[0m\n", "\u001b[34mValueError: CI: facet set is empty. Check that x[facet] has non-zero length.\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:DPL metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 71, in DPL\n", " return common.DPL(feature, sensitive_facet_index, positive_label_index)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/common.py\", line 62, in DPL\n", " raise ValueError(\"Facet set is empty.\")\u001b[0m\n", "\u001b[34mValueError: Facet set is empty.\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:JS metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 114, in JS\n", " raise ValueError(\"No instance of common facet found, dataset may be too small\")\u001b[0m\n", "\u001b[34mValueError: No instance of common facet found, dataset may be too small\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:KL metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 91, in KL\n", " raise ValueError(\"No instance of common facet found, dataset may be too small\")\u001b[0m\n", "\u001b[34mValueError: No instance of common facet found, dataset may be too small\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:KS metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 175, in KS\n", " return LP_norm(label, sensitive_facet_index, np.inf)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 141, in LP_norm\n", " raise ValueError(\"No instance of common facet found, dataset may be too small\")\u001b[0m\n", "\u001b[34mValueError: No 
instance of common facet found, dataset may be too small\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:LP metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 132, in LP\n", " return LP_norm(label, sensitive_facet_index, 2)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 141, in LP_norm\n", " raise ValueError(\"No instance of common facet found, dataset may be too small\")\u001b[0m\n", "\u001b[34mValueError: No instance of common facet found, dataset may be too small\u001b[0m\n", "\u001b[34mERROR:smclarify.bias.report:TVD metrics failed\u001b[0m\n", "\u001b[34mTraceback (most recent call last):\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/report.py\", line 292, in _categorical_metric_call_wrapper\n", " group_variable=group_variable,\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/__init__.py\", line 27, in call_metric\n", " return metric(**{key: kwargs[key] for key in inspect.signature(metric).parameters.keys()})\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 158, in TVD\n", " Lp_res = LP_norm(label, sensitive_facet_index, 1)\n", " File \"/usr/local/lib/python3.7/site-packages/smclarify/bias/metrics/pretraining.py\", line 141, in LP_norm\n", " raise ValueError(\"No instance of common facet found, dataset may be too small\")\u001b[0m\n", "\u001b[34mValueError: No instance of common facet found, dataset may be too small\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:======================================\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Calculating bias statistics for report\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:======================================\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:smclarify.bias.metrics.common:data uniqueness fraction: 0.0008572653236176596\u001b[0m\n", "\u001b[34mINFO:analyzer.report:jupyter nbconvert --to html --output /opt/ml/processing/output/report.html /opt/ml/processing/output/report.ipynb --template sagemaker-xai\u001b[0m\n", "\u001b[34m[NbConvertApp] Converting notebook /opt/ml/processing/output/report.ipynb to html\u001b[0m\n", "\u001b[34m[NbConvertApp] Writing 345423 bytes to /opt/ml/processing/output/report.html\u001b[0m\n", "\u001b[34mINFO:analyzer.report:HTML report '/opt/ml/processing/output/report.html' generated successfully.\u001b[0m\n", "\u001b[34mINFO:analyzer.report:wkhtmltopdf -q /opt/ml/processing/output/report.html /opt/ml/processing/output/report.pdf\u001b[0m\n", "\u001b[34mINFO:analyzer.report:PDF report 
'/opt/ml/processing/output/report.pdf' generated successfully.\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Collected analyses: \u001b[0m\n", "\u001b[34m{\n", " \"version\": \"1.0\",\n", " \"post_training_bias_metrics\": {\n", " \"label\": \"Churn\",\n", " \"facets\": {\n", " \"Int'l Plan_no\": [\n", " {\n", " \"value_or_threshold\": \"100\",\n", " \"metrics\": [\n", " {\n", " \"name\": \"AD\",\n", " \"description\": \"Accuracy Difference (AD)\",\n", " \"value\": null,\n", " \"error\": \"Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"CDDPL\",\n", " \"description\": \"Conditional Demographic Disparity in Predicted Labels (CDDPL)\",\n", " \"value\": 0.0\n", " },\n", " {\n", " \"name\": \"DAR\",\n", " \"description\": \"Difference in Acceptance Rates (DAR)\",\n", " \"value\": null,\n", " \"error\": \"DLR: Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"DCA\",\n", " \"description\": \"Difference in Conditional Acceptance (DCA)\",\n", " \"value\": null,\n", " \"error\": \"DCO: Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"DCR\",\n", " \"description\": \"Difference in Conditional Rejection (DCR)\",\n", " \"value\": null,\n", " \"error\": \"DCO: Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"DI\",\n", " \"description\": \"Disparate Impact (DI)\",\n", " \"value\": null,\n", " \"error\": \"Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"DPPL\",\n", " \"description\": \"Difference in Positive Proportions in Predicted Labels (DPPL)\",\n", " \"value\": null,\n", " \"error\": \"Facet set is empty.\"\n", " },\n", " {\n", " \"name\": \"DRR\",\n", " \"description\": \"Difference in Rejection Rates (DRR)\",\n", " \"value\": null,\n", " \"error\": \"DLR: Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"FT\",\n", " \"description\": \"Flip Test (FT)\",\n", " \"value\": null,\n", " \"error\": \"Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"RD\",\n", " \"description\": \"Recall Difference (RD)\",\n", " \"value\": null,\n", " \"error\": \"Facet set is empty\"\n", " },\n", " {\n", " \"name\": \"TE\",\n", " \"description\": \"Treatment Equality (TE)\",\n", " \"value\": null,\n", " \"error\": \"Facet set is empty\"\n", " }\n", " ]\n", " }\n", " ]\n", " },\n", " \"label_value_or_threshold\": \"1\"\n", " },\n", " \"pre_training_bias_metrics\": {\n", " \"label\": \"Churn\",\n", " \"facets\": {\n", " \"Int'l Plan_no\": [\n", " {\n", " \"value_or_threshold\": \"100\",\n", " \"metrics\": [\n", " {\n", " \"name\": \"CDDL\",\n", " \"description\": \"Conditional Demographic Disparity in Labels (CDDL)\",\n", " \"value\": 0.0\n", " },\n", " {\n", " \"name\": \"CI\",\n", " \"description\": \"Class Imbalance (CI)\",\n", " \"value\": null,\n", " \"error\": \"CI: facet set is empty. 
Check that x[facet] has non-zero length.\"\n", " },\n", " {\n", " \"name\": \"DPL\",\n", " \"description\": \"Difference in Positive Proportions in Labels (DPL)\",\n", " \"value\": null,\n", " \"error\": \"Facet set is empty.\"\n", " },\n", " {\n", " \"name\": \"JS\",\n", " \"description\": \"Jensen-Shannon Divergence (JS)\",\n", " \"value\": null,\n", " \"error\": \"No instance of common facet found, dataset may be too small\"\n", " },\n", " {\n", " \"name\": \"KL\",\n", " \"description\": \"Kullback-Liebler Divergence (KL)\",\n", " \"value\": null,\n", " \"error\": \"No instance of common facet found, dataset may be too small\"\n", " },\n", " {\n", " \"name\": \"KS\",\n", " \"description\": \"Kolmogorov-Smirnov Distance (KS)\",\n", " \"value\": null,\n", " \"error\": \"No instance of common facet found, dataset may be too small\"\n", " },\n", " {\n", " \"name\": \"LP\",\n", " \"description\": \"L-p Norm (LP)\",\n", " \"value\": null,\n", " \"error\": \"No instance of common facet found, dataset may be too small\"\n", " },\n", " {\n", " \"name\": \"TVD\",\n", " \"description\": \"Total Variation Distance (TVD)\",\n", " \"value\": null,\n", " \"error\": \"No instance of common facet found, dataset may be too small\"\n", " }\n", " ]\n", " }\n", " ]\n", " },\n", " \"label_value_or_threshold\": \"1\"\n", " }\u001b[0m\n", "\u001b[34m}\u001b[0m\n", "\u001b[34mINFO:analyzer.system_util:exit_message: Completed: SageMaker XAI Analyzer ran successfully\u001b[0m\n", "\u001b[34m------------!\u001b[0m\n", "\n" ] } ], "source": [ "clarify_processor.run_bias(data_config=bias_data_config,\n", " bias_config=bias_config,\n", " model_config=model_config,\n", " model_predicted_label_config=predictions_config,\n", " pre_training_methods='all',\n", " post_training_methods='all')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Viewing the Bias Report\n", "In Studio, you can view the results under the experiments tab:\n", "\n", "\n", "\n", "Right click on the unassigned trials:\n", "\n", "\n", "\n", "Each bias metric has detailed explanations with examples that you can explore.\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you're not a Studio user yet, you can access the bias report in PDF, HTML, and ipynb formats at the following S3 location:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'s3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/clarify-bias-report'" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bias_report_output_path" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Explaining Predictions\n", "There are expanding business needs and legislative regulations that require explanations of _why_ a model made the decision it did. SageMaker Clarify uses SHAP to explain the contribution that each input feature makes to the final decision." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Kernel SHAP algorithm requires a baseline (also known as a background dataset). The baseline dataset type must be the same as the `dataset_type` of `DataConfig`, and the baseline samples must include only features. `baseline` should be either an S3 URI pointing to the baseline dataset file or an in-place list of samples. In this case we choose the latter, and put the first sample of the test dataset in the list." 
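] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an alternative to a single sample, you could use a summary baseline such as the feature-wise means of the test set. A minimal sketch (note this is an assumption about what makes a sensible baseline here - averaging one-hot columns yields fractional values, which may or may not be acceptable for your model):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch: a mean baseline as an alternative to a single test sample.\n", "# `df` holds the test features (the label column was dropped earlier).\n", "mean_baseline = [df.mean().tolist()]  # one row of per-feature means" 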
] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "# The baseline samples must contain features only, so drop the label column\n", "cn = col_names.copy()\n", "cn.remove(\"Churn\")" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Account LengthVMail MessageDay MinsDay CallsEve MinsEve CallsNight MinsNight CallsIntl MinsIntl Calls...State_WIState_WVState_WYArea Code_408Area Code_415Area Code_510Int'l Plan_noInt'l Plan_yesVMail Plan_noVMail Plan_yes
0186.00.0137.897.0187.7118.0146.485.08.76.0...0.00.00.00.00.01.01.00.01.00.0
\n", "

1 rows × 69 columns

\n", "
" ], "text/plain": [ " Account Length VMail Message Day Mins Day Calls Eve Mins Eve Calls \\\n", "0 186.0 0.0 137.8 97.0 187.7 118.0 \n", "\n", " Night Mins Night Calls Intl Mins Intl Calls ... State_WI State_WV \\\n", "0 146.4 85.0 8.7 6.0 ... 0.0 0.0 \n", "\n", " State_WY Area Code_408 Area Code_415 Area Code_510 Int'l Plan_no \\\n", "0 0.0 0.0 0.0 1.0 1.0 \n", "\n", " Int'l Plan_yes VMail Plan_no VMail Plan_yes \n", "0 0.0 1.0 0.0 \n", "\n", "[1 rows x 69 columns]" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_sample = df.iloc[0].values.tolist()\n", "pd.DataFrame(columns=cn, data=[test_sample])" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "shap_config = clarify.SHAPConfig(baseline=[test_sample],\n", " num_samples=15,\n", " agg_method='mean_abs',\n", " save_local_shap_values=False)\n", "\n", "explainability_output_path = f's3://{bucket}/{prefix}/clarify-explainability'\n", "explainability_data_config = clarify.DataConfig(s3_data_input_path=s3uri_train,\n", " s3_output_path=explainability_output_path,\n", " label='Churn',\n", " headers=col_names,\n", " dataset_type='text/csv')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Job Name: Clarify-Explainability-2021-06-10-01-18-19-063\n", "Inputs: [{'InputName': 'dataset', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/data/train/train.csv', 'LocalPath': '/opt/ml/processing/input/data', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'analysis_config', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/clarify-explainability/analysis_config.json', 'LocalPath': '/opt/ml/processing/input/config', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]\n", "Outputs: [{'OutputName': 'analysis_result', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-studio-us-east-2-309044321483/xgboost-churn/clarify-explainability', 'LocalPath': '/opt/ml/processing/output', 'S3UploadMode': 'EndOfJob'}}]\n", "...........................\u001b[34mINFO:sagemaker-clarify-processing:Starting SageMaker Clarify Processing job\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Analysis config path: /opt/ml/processing/input/config/analysis_config.json\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Analysis result path: /opt/ml/processing/output\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:This host is algo-1.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:This host is the leader.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Number of hosts in the cluster is 1.\u001b[0m\n", "\u001b[34mINFO:sagemaker-clarify-processing:Running Python / Pandas based analyzer.\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Dataset uri: /opt/ml/processing/input/data\u001b[0m\n", "\u001b[34mINFO:analyzer.data_loading.data_loader_util:Dataset type: text/csv\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Spinning up shadow endpoint\u001b[0m\n", "\u001b[34mINFO:sagemaker:Creating endpoint-config with name sagemaker-clarify-endpoint-config-1623288151-929c\u001b[0m\n", 
"\u001b[34mINFO:analyzer.predictor:Creating endpoint sagemaker-clarify-endpoint-1623288151-76b0\u001b[0m\n", "\u001b[34mWARNING:explainers.shap.kernel_shap:The number of requested samples 15 is too small given the number of features 69\u001b[0m\n", "\u001b[34mINFO:explainers.shap.kernel_shap:SHAP n_samples 15\u001b[0m\n", "\u001b[34mINFO:analyzer.predictor:Checking endpoint status\u001b[0m\n" ] } ], "source": [ "clarify_processor.run_explainability(data_config=explainability_data_config,\n", " model_config=model_config,\n", " explainability_config=shap_config)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Viewing the Explainability Report\n", "As with the bias report, you can view the explainability report in Studio under the experiments tab\n", "\n", "\n", "\n", "\n", "The Model Insights tab contains direct links to the report and model insights.\n", "\n", "If you're not a Studio user yet, as with the Bias Report, you can access this report at the following S3 bucket." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "explainability_output_path" ] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-2:429704687514:image/datascience-1.0" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }