{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Amazon SageMaker Workshop\n",
    "## _**Modeling**_\n",
    "\n",
    "---\n",
    "In this part of the workshop we will use **train** multiple Gradient Boosted Trees model to Predict Mobile Customer Departure.\n",
    "\n",
    "---\n",
    "\n",
    "## Contents\n",
    "\n",
    "1. [Background](#Background) - Getting the data prepared in the previous lab.\n",
    "2. [Train](#Train) - Train with the Amazon SageMaker XGBoost algorithm\n",
    "    * [Creating Amazon SageMaker Training Jobs](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-training.html)\n",
    "      * [Algorithm implementations](https://docs.aws.amazon.com/sagemaker/latest/dg/algorithms-choose.html#algorithms-choose-implementation) - Built-in and Framework mode (pre-made Docker images)\n",
    "    * [Amazon SageMaker Experiments](https://docs.aws.amazon.com/sagemaker/latest/dg/experiments.html)\n",
    "      * Manage multiple Runs\n",
    "      * Experiment with hyperparameters and charting\n",
    "    * [Amazon SageMaker Debugger](https://docs.aws.amazon.com/sagemaker/latest/dg/train-debugger.html)\n",
    "      * Debug your model \n",
    "      \n",
    "3. [Exercise](#a_Exercise) - You turn to create multiple training Jobs]\n",
    "4. [Wrap-up - end of Modeling Lab](#Wrap-up)\n",
    "  \n",
    "\n",
    "---\n",
    "\n",
    "## Background\n",
    "\n",
    "In the previous [Data Preparation](../1-DataPrep/data_preparation.ipynb) lab we used SageMaker notebook (with Pandas library) and/or SageMaker Data Wrangler.\n",
    "\n",
    "Let's import the libraries for this lab:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "#Supress default INFO logging\n",
    "import logging\n",
    "logger = logging.getLogger()\n",
    "logger.setLevel(logging.ERROR)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import pprint\n",
    "from time import strftime, gmtime\n",
    "\n",
    "import boto3\n",
    "import sagemaker\n",
    "from sagemaker import get_execution_role\n",
    "from sagemaker.debugger import rule_configs, Rule, DebuggerHookConfig\n",
    "\n",
    "from sagemaker.experiments.run import Run, load_run"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "role = sagemaker.get_execution_role()\n",
    "sm_sess = sagemaker.session.Session()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "### - if you _**skipped**_ the lab `1-DataPrep` follow instructions:\n",
    "\n",
    "   - **run this [notebook](./config/pre_setup.ipynb)**\n",
    "---\n",
    "Now, load variables for this lab:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%store -r bucket\n",
    "%store -r prefix\n",
    "%store -r docker_image_name\n",
    "%store -r framework_version\n",
    "%store -r s3uri_train\n",
    "%store -r s3uri_validation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "bucket, prefix, docker_image_name, framework_version, s3uri_train, s3uri_validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Train\n",
    "\n",
    "We'll use the XGBoost library to train a class of models known as gradient boosted decision trees on the data that we just uploaded. \n",
    "\n",
    "Because we're using XGBoost, we first need to specify the locations of the XGBoost algorithm containers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, because we're training with the CSV file format, we'll create `s3_input`s that our training function can use as a pointer to the files in S3."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.inputs import TrainingInput"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "s3_input_train = TrainingInput(s3_data=s3uri_train, content_type='csv')\n",
    "s3_input_validation = TrainingInput(s3_data=s3uri_validation, content_type='csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "pprint.pprint(s3_input_train.config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "pprint.pprint(s3_input_validation.config)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Amazon SageMaker Experiments\n",
    "\n",
    "Amazon SageMaker Experiments allows us to keep track of model training; organize related models together and log model configuration, parameters, and metrics to reproduce and iterate on previous models and compare models. We'll create a single experiment (during training job) to keep track of the different approaches we'll try to train the model.\n",
    "\n",
    "Each approach or block of training code that we run will be an experiment run. Later, we'll be able to compare different runs in Amazon SageMaker Studio.\n",
    "\n",
    "Let's choose a name to the experiment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Helper to create timestamps\n",
    "create_date = lambda: strftime(\"%Y-%m-%d-%H-%M-%S\", gmtime())\n",
    "experiment_name=f\"customer-churn-prediction-xgboost-{create_date()}\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Hyperparameters\n",
    "Now we can specify our XGBoost hyperparameters.  Among them are these key hyperparameters:\n",
    "- `max_depth` Controls how deep each tree within the algorithm can be built.  Deeper trees can lead to better fit, but are more computationally expensive and can lead to overfitting.  Typically, you need to explore some trade-offs in model performance between a large number of shallow trees and a smaller number of deeper trees.\n",
    "- `subsample` Controls sampling of the training data.  This hyperparameter can help reduce overfitting, but setting it too low can also starve the model of data.\n",
    "- `num_round` Controls the number of boosting rounds.  This value specifies the models that are subsequently trained using the residuals of previous iterations.  Again, more rounds should produce a better fit on the training data, but can be computationally expensive or lead to overfitting.\n",
    "- `eta` Controls how aggressive each round of boosting is.  Larger values lead to more conservative boosting.\n",
    "- `gamma` Controls how aggressively trees are grown.  Larger values lead to more conservative models.\n",
    "- `min_child_weight` Also controls how aggresively trees are grown. Large values lead to a more conservative model.\n",
    "\n",
    "For more information about these hyperparameters, see [XGBoost's hyperparameters GitHub page](https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "hyperparams = {\"max_depth\":5,\n",
    "               \"subsample\":0.8,\n",
    "               \"num_round\":600,\n",
    "               \"eta\":0.2,\n",
    "               \"gamma\":4,\n",
    "               \"min_child_weight\":6,\n",
    "               \"objective\":'binary:logistic',\n",
    "               \"verbosity\": 0\n",
    "              }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Run 1 - XGBoost in algorithm mode\n",
    "\n",
    "For our first run, we'll use the built-in XGBoost algorithm to train a model without supplying any additional code. This way, we can use XGBoost to train and deploy a model as we would with other Amazon SageMaker built-in algorithms.\n",
    "\n",
    "We'll create a new `Run` object. To train the model, we'll create an estimator and specify a few parameters, like the type of training instances we'd like to use and how many, and where the artifacts of the trained model should be stored. \n",
    "\n",
    "We'll also associate the training job with the experiment run that we just created when we call the `fit` method of the `estimator`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "xgb = sagemaker.estimator.Estimator(image_uri=docker_image_name,\n",
    "                                    role=role,\n",
    "                                    hyperparameters=hyperparams,\n",
    "                                    instance_count=1, \n",
    "                                    instance_type='ml.m4.xlarge',\n",
    "                                    output_path=f's3://{bucket}/{prefix}/output',\n",
    "                                    base_job_name='workshop-xgboost-customer-churn',\n",
    "                                    sagemaker_session=sm_sess)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "with Run(\n",
    "    experiment_name=experiment_name,\n",
    "    run_name=f\"algorithm-mode-run-{create_date()}\",\n",
    "    sagemaker_session=sm_sess,\n",
    ") as run:\n",
    "    xgb.fit(inputs={\n",
    "                      'train': s3_input_train,\n",
    "                      'validation': s3_input_validation\n",
    "                    }\n",
    "           )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Review the results\n",
    "\n",
    "After the training job completes successfully, you can view metrics, logs, and graphs related to the run on the **SageMaker resources** tab in Amazon SageMaker Studio. \n",
    "\n",
    "To see the experiments, choose the **Experiments** on left menu, and select experiment with the name you've created on previous step.\n",
    "\n",
    "![experiments_button](media/10-experiments.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Too see detailed metrics about your training job, click on respective run name:\n",
    "    \n",
    "![Run_Detail](media/20-run.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Download the model\n",
    "\n",
    "You can find and download the model in Amazon SageMaker Studio. To find the model, choose the **Output Artifacts** tab in the left, and check the S3 URI.\n",
    "\n",
    "![Run_Detail](media/30-download.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Trying other hyperparameter values\n",
    "\n",
    "To improve a model, you typically try other hyperparameter values to see if they affect the final validation error. Let's vary the `min_child_weight` parameter and start other training jobs with those different values to see how they affect the validation error. For each value, we'll create a separate run so that we can compare the results in Amazon SageMaker Studio later."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='a_Exercise'></a>\n",
    "\n",
    "# 2a. Exercise\n",
    "\n",
    "### >>> Your turn! \n",
    "\n",
    "Run 5 **- simultaneous -** training jobs with different values for `min_child_weight` in XGBoost.\n",
    "\n",
    "If you'd like to try with another hyperparameters, choose other ones from here:\n",
    "\n",
    "https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost_hyperparameters.html "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# YOUR SOLUTION HERE\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Want a hint? [Look here](./solutions/a-hint.md)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Visualize and compare multiple training jobs\n",
    "\n",
    "#### Create charts\n",
    "\n",
    "To create a chart, multi-select the components. Because this is a sample training run and the data is sparse, there's not much to chart in a time series. However, we can create a scatter plot for the parameter sweep. \n",
    "\n",
    "On your experiment, select runs that you would like to compare. The following image is an example.\n",
    "\n",
    "![Run_Detail](media/40-compare.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On following screen, click on \"Add Graph\" button and then choose \"Scatter plot\" type:\n",
    "\n",
    "![Run_Detail](media/50-add-graph.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On graph screen, choose:\n",
    "\n",
    "* \"Wall time\" on dimension\n",
    "* \"validation:logloss\" on both axis, and click on \"Update\" button:\n",
    "\n",
    "![scatter plot example](media/60-graph-detail.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Amazon SageMaker Debugger\n",
    "\n",
    "Amazon SageMaker Debugger lets you debug a model during training. As you train, Debugger periodicially saves tensors, which fully specify the state of the model at that point in time. Debugger saves the tensors to an Amazon S3 bucket. You can then use Amazon SageMaker Studio for analysis and visualization to diagnose training issues.\n",
    "\n",
    "#### Specify SageMaker Debugger Rules\n",
    "\n",
    "To enable automated detection of common issues during training, Amazon SageMaker Debugger also allows you to attach a list of rules to evaluate the training job against.\n",
    "\n",
    "Some rules that apply to XGBoost include `AllZero`, `ClassImbalance`, `Confusion`, `LossNotDecreasing`, `Overfit`, `Overtraining`, `SimilarAcrossRuns`, `TensorVariance`, `UnchangedTensor`, and `TreeDepth`. \n",
    "\n",
    "[More information here](https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-built-in-rules.html#debugger-built-in-rules-Rule)\n",
    "\n",
    "We'll use the `LossNotDecreasing` rule--which is triggered if the loss doesn't decrease monotonically at any point during training--the `Overtraining` rule, and the `Overfit` rule. Let's create the rules."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "debug_rules = [Rule.sagemaker(rule_configs.loss_not_decreasing()),\n",
    "               Rule.sagemaker(rule_configs.overtraining()),\n",
    "               Rule.sagemaker(rule_configs.overfit())\n",
    "              ]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Run 2 - XGBoost in framework mode\n",
    "\n",
    "For the next run, we'll train a similar model, but use XGBoost in framework mode. If you've worked with the open source XGBoost, using XGBoost this way will be familiar to you. \n",
    "\n",
    "Using XGBoost as a framework provides more flexibility than using it as a built-in algorithm because it enables more advanced scenarios that **allow incorporating custom pre-processing, training and post-processing scripts.**\n",
    "\n",
    "In addition, we'll be able to specify a list of rules that we want Amazon SageMaker Debugger to evaluate our training  against."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Fit estimator\n",
    "\n",
    "To use XGBoost as a framework, you need to specify an entry-point script that can incorporate additional processing into your training jobs.\n",
    "\n",
    "We've made a couple of simple changes to enable the Amazon SageMaker Debugger `smdebug` library. We created a `SessionHook`, which we pass as a callback function when creating a `Booster`. We passed a `SaveConfig` object that tells the hook to save the evaluation metrics, feature importances, and SHAP values at regular intervals. (Debugger is highly configurable. You can choose exactly what to save.) We describe the changes in more detail after we train this example. For even more detail, see the [Developer Guide for XGBoost](https://github.com/awslabs/sagemaker-debugger/tree/master/docs/xgboost)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%%writefile xgboost_customer_churn.py\n",
    "import argparse\n",
    "import json\n",
    "import os\n",
    "import pickle\n",
    "import random\n",
    "import tempfile\n",
    "import urllib.request\n",
    "\n",
    "import xgboost\n",
    "from smdebug import SaveConfig\n",
    "from smdebug.xgboost import Hook\n",
    "\n",
    "\n",
    "def parse_args():\n",
    "\n",
    "    parser = argparse.ArgumentParser()\n",
    "\n",
    "    parser.add_argument(\"--max_depth\", type=int, default=5)\n",
    "    parser.add_argument(\"--eta\", type=float, default=0.2)\n",
    "    parser.add_argument(\"--gamma\", type=int, default=4)\n",
    "    parser.add_argument(\"--min_child_weight\", type=int, default=6)\n",
    "    parser.add_argument(\"--subsample\", type=float, default=0.8)\n",
    "    parser.add_argument(\"--verbosity\", type=int, default=0)\n",
    "    parser.add_argument(\"--objective\", type=str, default=\"binary:logistic\")\n",
    "    parser.add_argument(\"--num_round\", type=int, default=50)\n",
    "    parser.add_argument(\"--smdebug_path\", type=str, default=None)\n",
    "    parser.add_argument(\"--smdebug_frequency\", type=int, default=1)\n",
    "    parser.add_argument(\"--smdebug_collections\", type=str, default='metrics')\n",
    "    parser.add_argument(\"--output_uri\", type=str, default=\"/opt/ml/output/tensors\",\n",
    "                        help=\"S3 URI of the bucket where tensor data will be stored.\")\n",
    "\n",
    "    parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))\n",
    "    parser.add_argument('--validation', type=str, default=os.environ.get('SM_CHANNEL_VALIDATION'))\n",
    "    parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])\n",
    "    \n",
    "    args = parser.parse_args()\n",
    "\n",
    "    return args\n",
    "\n",
    "\n",
    "def create_smdebug_hook(out_dir, train_data=None, validation_data=None, frequency=1, collections=None,):\n",
    "\n",
    "    save_config = SaveConfig(save_interval=frequency)\n",
    "    hook = Hook(\n",
    "        out_dir=out_dir,\n",
    "        train_data=train_data,\n",
    "        validation_data=validation_data,\n",
    "        save_config=save_config,\n",
    "        include_collections=collections,\n",
    "    )\n",
    "\n",
    "    return hook\n",
    "\n",
    "\n",
    "def main():\n",
    "    \n",
    "    args = parse_args()\n",
    "\n",
    "    train, validation = args.train, args.validation\n",
    "    parse_csv = \"?format=csv&label_column=0\"\n",
    "    dtrain = xgboost.DMatrix(train+parse_csv)\n",
    "    dval = xgboost.DMatrix(validation+parse_csv)\n",
    "\n",
    "    watchlist = [(dtrain, \"train\"), (dval, \"validation\")]\n",
    "\n",
    "    params = {\n",
    "        \"max_depth\": args.max_depth,\n",
    "        \"eta\": args.eta,\n",
    "        \"gamma\": args.gamma,\n",
    "        \"min_child_weight\": args.min_child_weight,\n",
    "        \"subsample\": args.subsample,\n",
    "        \"verbosity\": args.verbosity,\n",
    "        \"objective\": args.objective}\n",
    "\n",
    "    # The output_uri is a the URI for the s3 bucket where the metrics will be\n",
    "    # saved.\n",
    "    output_uri = (\n",
    "        args.smdebug_path\n",
    "        if args.smdebug_path is not None\n",
    "        else args.output_uri\n",
    "    )\n",
    "\n",
    "    collections = (\n",
    "        args.smdebug_collections.split(',')\n",
    "        if args.smdebug_collections is not None\n",
    "        else None\n",
    "    )\n",
    "\n",
    "    hook = create_smdebug_hook(\n",
    "        out_dir=output_uri,\n",
    "        frequency=args.smdebug_frequency,\n",
    "        collections=collections,\n",
    "        train_data=dtrain,\n",
    "        validation_data=dval,\n",
    "    )\n",
    "\n",
    "    bst = xgboost.train(\n",
    "        params=params,\n",
    "        dtrain=dtrain,\n",
    "        evals=watchlist,\n",
    "        num_boost_round=args.num_round,\n",
    "        callbacks=[hook])\n",
    "    \n",
    "    if not os.path.exists(args.model_dir):\n",
    "        os.makedirs(args.model_dir)\n",
    "\n",
    "    model_location = os.path.join(args.model_dir, 'xgboost-model')\n",
    "    pickle.dump(bst, open(model_location, 'wb'))\n",
    "\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "\n",
    "    main()\n",
    "\n",
    "\n",
    "def model_fn(model_dir):\n",
    "    \"\"\"Load a model. For XGBoost Framework, a default function to load a model is not provided.\n",
    "    Users should provide customized model_fn() in script.\n",
    "    Args:\n",
    "        model_dir: a directory where model is saved.\n",
    "    Returns:\n",
    "        A XGBoost model.\n",
    "        XGBoost model format type.\n",
    "    \"\"\"\n",
    "    model_files = (file for file in os.listdir(model_dir) if os.path.isfile(os.path.join(model_dir, file)))\n",
    "    model_file = next(model_files)\n",
    "    try:\n",
    "        booster = pickle.load(open(os.path.join(model_dir, model_file), 'rb'))\n",
    "        format = 'pkl_format'\n",
    "    except Exception as exp_pkl:\n",
    "        try:\n",
    "            booster = xgboost.Booster()\n",
    "            booster.load_model(os.path.join(model_dir, model_file))\n",
    "            format = 'xgb_format'\n",
    "        except Exception as exp_xgb:\n",
    "            raise ModelLoadInferenceError(\"Unable to load model: {} {}\".format(str(exp_pkl), str(exp_xgb)))\n",
    "    booster.set_param('nthread', 1)\n",
    "    return booster, format\n",
    "\n",
    "\n",
    "def predict_fn(input_object, model):\n",
    "    \"\"\"\n",
    "    Perform prediction on the deserialized object, with the loaded model.\n",
    "    \"\"\"\n",
    "    X_test = xgboost.DMatrix(input_object.values)\n",
    "    predictions_probs = model.predict(X_test)\n",
    "    predictions = predictions_probs.round()\n",
    "    return {\"predictions\": predictions}\n",
    "\n",
    "\n",
    "def input_fn(request_body, content_type):\n",
    "    \"\"\"\n",
    "    Perform preprocessing task on inference dataset.\n",
    "    \"\"\"\n",
    "    if content_type == \"text/csv\":\n",
    "        df = pd.read_csv(StringIO(request_body), header=None)\n",
    "        return df\n",
    "    else:\n",
    "        raise ValueError(\"{} not supported by script!\".format(content_type))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tips about predict_fn & input_fn on code\n",
    "Although we are developing the training script, we can add functions for the inference logic. SageMaker provides a simple way for users to customize not only the training logic, but also the inference logic. All you have to do is create functions called model_fn, input_fn, predict_fn and output_fn [more details here](https://sagemaker.readthedocs.io/en/stable/frameworks/xgboost/using_xgboost.html#serve-a-model)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create our framwork estimator and call `fit` to start the training job. As before, we'll create a separate run for this run so that we can use Amazon SageMaker Studio to compare it with other runs later. Because we are running in framework mode, we also need to pass additional parameters, like the entry point script and the framework version, to the estimator. \n",
    "\n",
    "As training progresses, you'll be able to see Amazon SageMaker Debugger logs that evaluate the rule against the training job."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.xgboost.estimator import XGBoost"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "train_script_name = 'xgboost_customer_churn.py'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "framework_xgb = XGBoost(image_uri=docker_image_name,\n",
    "                        entry_point=train_script_name,\n",
    "                        role=role,\n",
    "                        framework_version=framework_version,\n",
    "                        py_version=\"py3\",\n",
    "                        hyperparameters=hyperparams,\n",
    "                        instance_count=1, \n",
    "                        instance_type='ml.m4.xlarge',\n",
    "                        output_path=f's3://{bucket}/{prefix}/output',\n",
    "                        base_job_name='workshop-framework-xgboost-customer-churn',\n",
    "                        sagemaker_session=sm_sess,\n",
    "                        rules=debug_rules\n",
    "                        )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "with Run(\n",
    "    experiment_name=experiment_name,\n",
    "    run_name=f\"framework-mode-run-{create_date()}\",\n",
    "    sagemaker_session=sm_sess,\n",
    ") as run:\n",
    "    framework_xgb.fit(inputs={\n",
    "                          'train': s3_input_train,\n",
    "                          'validation': s3_input_validation\n",
    "                             }\n",
    "                     )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After the training has been running for a while you can view debug info in the Debugger panel. To get to this panel you must click through the experiment, respective run, and then click on left \"debugs\" option.\n",
    "\n",
    "![debugger panel](media/70-debug.png)\n",
    "\n",
    "It's also possible to open details about debugging, clicking on the name inside \"Debug Insights\" (on the top down of the run):\n",
    "\n",
    "![view_debugger panel](media/71-debug-insights.png)\n",
    "\n",
    "On this page, it's possible to see detailed metrics about Debug process:\n",
    "\n",
    "![Debugger detail](media/72-debug-det.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Let's save `training_job_name` for later (Lab4-Deployment)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "training_job_name = framework_xgb.latest_training_job.job_name\n",
    "training_job_name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store training_job_name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Let's also store the S3 URI and training script name where our training script was saved for later\n",
    "\n",
    "(Observe that SageMaker in Framework mode saves the training script in S3 and puts the the S3 URI as a hyperparameter called `sagemaker_submit_directory` and also the `sagemaker_program` key point to the training script)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "framework_xgb.hyperparameters()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "s3_modeling_code_uri = eval(framework_xgb.hyperparameters()[\"sagemaker_submit_directory\"])\n",
    "s3_modeling_code_uri"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store s3_modeling_code_uri"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store train_script_name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Wrap-up\n",
    "\n",
    "Now that we finished the **modeling lab**, let's make everything here re-usable. It may come in handy later (spoiler alert - when creating Pipelines)...\n",
    "\n",
    "**Obs.:** We will use the `sagemaker.estimator.Estimator` class, but we will also pass our custom training script as a hyperparameter.\n",
    "\n",
    "(yes, it is possible to do that, like we saw when inspecting the hyperparameters and seeing the `sagemaker_program` and `sagemaker_submit_directory` keys!)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile ../6-Pipelines/my_labs_solutions/modeling_solution.py\n",
    "\n",
    "import boto3\n",
    "import sagemaker\n",
    "from sagemaker.inputs import TrainingInput\n",
    "\n",
    "from sagemaker.debugger import rule_configs, Rule, DebuggerHookConfig\n",
    "\n",
    "def get_modeling_estimator(bucket,\n",
    "                           prefix,\n",
    "                           s3_modeling_code_uri,\n",
    "                           docker_image_name,\n",
    "                           entry_point_script = 'xgboost_customer_churn.py') -> sagemaker.estimator.Estimator:\n",
    "    \n",
    "    role = sagemaker.get_execution_role()\n",
    "    sm_sess = sagemaker.session.Session()\n",
    "\n",
    "    # Input configs\n",
    "    hyperparams = {\"sagemaker_program\": entry_point_script,\n",
    "                   \"sagemaker_submit_directory\": s3_modeling_code_uri,\n",
    "                   \"max_depth\": 5,\n",
    "                   \"subsample\": 0.8,\n",
    "                   \"num_round\": 600,\n",
    "                   \"eta\": 0.2,\n",
    "                   \"gamma\": 4,\n",
    "                   \"min_child_weight\": 6,\n",
    "                   \"objective\": 'binary:logistic',\n",
    "                   \"verbosity\": 0\n",
    "                  }\n",
    "\n",
    "    # Debugger configs\n",
    "    debug_rules = [\n",
    "        Rule.sagemaker(rule_configs.loss_not_decreasing()),\n",
    "        Rule.sagemaker(rule_configs.overtraining()),\n",
    "        Rule.sagemaker(rule_configs.overfit())\n",
    "    ]\n",
    "\n",
    "    # Estimator configs\n",
    "    xgb = sagemaker.estimator.Estimator(image_uri=docker_image_name,\n",
    "                                        role=role,\n",
    "                                        hyperparameters=hyperparams,\n",
    "                                        instance_count=1, \n",
    "                                        instance_type='ml.m4.xlarge',\n",
    "                                        output_path=f's3://{bucket}/{prefix}/output',\n",
    "                                        base_job_name='pipeline-xgboost-customer-churn',\n",
    "                                        sagemaker_session=sm_sess,\n",
    "                                        rules=debug_rules)\n",
    "    \n",
    "    return xgb\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "# [You can now go to the lab 3-Evaluation](../3-Evaluation/evaluation.ipynb)"
   ]
  }
 ],
 "metadata": {
  "availableInstances": [
   {
    "_defaultOrder": 0,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 4,
    "name": "ml.t3.medium",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 1,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 8,
    "name": "ml.t3.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 2,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 16,
    "name": "ml.t3.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 3,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 32,
    "name": "ml.t3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 4,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 8,
    "name": "ml.m5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 5,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 16,
    "name": "ml.m5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 6,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 32,
    "name": "ml.m5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 7,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 64,
    "name": "ml.m5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 8,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 128,
    "name": "ml.m5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 9,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 192,
    "name": "ml.m5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 10,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 256,
    "name": "ml.m5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 11,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 384,
    "name": "ml.m5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 12,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 8,
    "name": "ml.m5d.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 13,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 16,
    "name": "ml.m5d.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 14,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 32,
    "name": "ml.m5d.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 15,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 64,
    "name": "ml.m5d.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 16,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 128,
    "name": "ml.m5d.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 17,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 192,
    "name": "ml.m5d.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 18,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 256,
    "name": "ml.m5d.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 19,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "memoryGiB": 384,
    "name": "ml.m5d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 20,
    "_isFastLaunch": true,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 4,
    "name": "ml.c5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 21,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 8,
    "name": "ml.c5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 22,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 16,
    "name": "ml.c5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 23,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 32,
    "name": "ml.c5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 24,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 72,
    "name": "ml.c5.9xlarge",
    "vcpuNum": 36
   },
   {
    "_defaultOrder": 25,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 96,
    "name": "ml.c5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 26,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 144,
    "name": "ml.c5.18xlarge",
    "vcpuNum": 72
   },
   {
    "_defaultOrder": 27,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "memoryGiB": 192,
    "name": "ml.c5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 28,
    "_isFastLaunch": true,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 16,
    "name": "ml.g4dn.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 29,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 32,
    "name": "ml.g4dn.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 30,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 64,
    "name": "ml.g4dn.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 31,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 128,
    "name": "ml.g4dn.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 32,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "memoryGiB": 192,
    "name": "ml.g4dn.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 33,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 256,
    "name": "ml.g4dn.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 34,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 61,
    "name": "ml.p3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 35,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "memoryGiB": 244,
    "name": "ml.p3.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 36,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "memoryGiB": 488,
    "name": "ml.p3.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 37,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "memoryGiB": 768,
    "name": "ml.p3dn.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 38,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 16,
    "name": "ml.r5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 39,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 32,
    "name": "ml.r5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 40,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 64,
    "name": "ml.r5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 41,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 128,
    "name": "ml.r5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 42,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 256,
    "name": "ml.r5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 43,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 384,
    "name": "ml.r5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 44,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 512,
    "name": "ml.r5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 45,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "memoryGiB": 768,
    "name": "ml.r5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 46,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 16,
    "name": "ml.g5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 47,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 32,
    "name": "ml.g5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 48,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 64,
    "name": "ml.g5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 49,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 128,
    "name": "ml.g5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 50,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "memoryGiB": 256,
    "name": "ml.g5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 51,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "memoryGiB": 192,
    "name": "ml.g5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 52,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "memoryGiB": 384,
    "name": "ml.g5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 53,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "memoryGiB": 768,
    "name": "ml.g5.48xlarge",
    "vcpuNum": 192
   }
  ],
  "celltoolbar": "Tags",
  "instance_type": "ml.t3.medium",
  "kernelspec": {
   "display_name": "Python 3 (Data Science)",
   "language": "python",
   "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/datascience-1.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  },
  "notice": "Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.  Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
 },
 "nbformat": 4,
 "nbformat_minor": 4
}