THIS EXERCISE IS BEING DEVELOPED (will be added before Wrap-up Section)
<a id='a_Exercise'></a>
# 3a. Exercise

### >>> Your turn (challenge)! 

- Add some other metrics to the evaluation report (from [Scikit-Learn](https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics)) 

- Create plots and save them in reports to be viewed inside Studio!

We will use these plots to show the metrics inside the Pipeline.

The result will look something like this (but for the Processing job):

<img src="media/studio-plots.png" alt="studio-plots.png"  width="50%">

https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-studio-view-execution.html
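As a starting point for the first bullet, here is a minimal sketch of computing a few extra scikit-learn metrics and arranging them in the same nested layout as the evaluation report. The toy labels and probabilities are made up for illustration, and the key names `precision`, `recall`, and `f1` are assumed to match the SageMaker model quality metric names, so please verify them against the documentation before relying on them.

```python
import json

from sklearn.metrics import f1_score, precision_score, recall_score

# Toy data standing in for `y_test` and `predictions_probs` (illustrative values only).
y_test = [0, 0, 1, 1, 1, 0, 1, 0]
predictions_probs = [0.1, 0.6, 0.8, 0.4, 0.9, 0.2, 0.7, 0.3]

# Turn probabilities into hard labels with a 0.5 threshold.
predictions = [int(p >= 0.5) for p in predictions_probs]

# Same {"value": ..., "standard_deviation": ...} layout as the existing report.
# The key names below are assumptions; check them against the metric-name docs.
extra_metrics = {
    "precision": {
        "value": float(precision_score(y_test, predictions)),
        "standard_deviation": "NaN",
    },
    "recall": {
        "value": float(recall_score(y_test, predictions)),
        "standard_deviation": "NaN",
    },
    "f1": {
        "value": float(f1_score(y_test, predictions)),
        "standard_deviation": "NaN",
    },
}

report_dict = {"binary_classification_metrics": extra_metrics}
report_json = json.dumps(report_dict, indent=2)
print(report_json)
```

In an evaluation script, these entries would be merged into the existing `binary_classification_metrics` dictionary rather than replacing it.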

In [None]:
# YOUR SOLUTION HERE








If you've completed the challenge exercise above, let's load and update the evaluation code in S3:

If you'd like to save some plots, [check the hints and solution here!](./solutions/a-hint.md)

In [None]:
%store -r s3_evaluation_code_uri_with_experiments

In [None]:
# Update the `s3_evaluation_code_uri` variable to point to the new script
s3_evaluation_code_uri = s3_evaluation_code_uri_with_experiments

#### Let's store the S3 URI where our evaluation script was saved for later

In [None]:
%store s3_evaluation_code_uri

# Solution - Exercise 3a.

If you want, just run the cells below:

In [None]:
%%writefile evaluate_with_experiments.py
"""Evaluation script for measuring model accuracy."""
import json
import logging
import os
import pickle
import subprocess
import sys
import tarfile

import pandas as pd
import xgboost


logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

# May need to import additional metrics depending on what you are measuring.
# See https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html
from sklearn.metrics import classification_report, roc_auc_score, accuracy_score


def pip_install(package):
    """Install `package` into the running interpreter; raise if pip fails."""
    logger.info(f"Pip installing `{package}`")
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


if __name__ == "__main__":
    pip_install("sagemaker-experiments==0.1.31")
    
    # Instantiate SM Experiment Tracker
    from smexperiments.tracker import Tracker
    tracker = Tracker.load()
    
    
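    model_path = "/opt/ml/processing/model/model.tar.gz"
    with tarfile.open(model_path) as tar:
        # Extract into the working directory so the relative path below resolves.
        tar.extractall(path=".")

    logger.debug("Loading xgboost model.")
    with open("xgboost-model", "rb") as model_file:
        model = pickle.load(model_file)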

    logger.info("Loading test input data")
    test_path = "/opt/ml/processing/test/test-dataset.csv"
    df = pd.read_csv(test_path, header=None)

    logger.debug("Reading test data.")
    y_test = df.iloc[:, 0].to_numpy()
    df.drop(df.columns[0], axis=1, inplace=True)
    X_test = xgboost.DMatrix(df.values)

    logger.info("Performing predictions against test data.")
    predictions_probs = model.predict(X_test)
    predictions = predictions_probs.round()

    logger.info("Creating classification evaluation report")
    acc = accuracy_score(y_test, predictions)
    auc = roc_auc_score(y_test, predictions_probs)

    # The metrics reported can vary with the model type, but each one must use the
    # exact name listed at https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html
    report_dict = {
        "binary_classification_metrics": {
            "accuracy": {
                "value": acc,
                "standard_deviation": "NaN",
            },
            "auc": {"value": auc, "standard_deviation": "NaN"},
        },
    }

    logger.info("Classification report:\n{}".format(report_dict))

    evaluation_output_path = os.path.join(
        "/opt/ml/processing/evaluation", "evaluation.json"
    )
    logger.info("Saving classification report to {}".format(evaluation_output_path))

    with open(evaluation_output_path, "w") as f:
        f.write(json.dumps(report_dict))
    
    logger.info("Creating and logging plots to Studio")
    tracker.log_precision_recall(y_test, predictions_probs, title="Precision-recall for predicting Churn", output_artifact=True)
    tracker.log_roc_curve(y_test, predictions_probs, title="ROC Curve for predicting Churn", output_artifact=True)
    tracker.log_confusion_matrix(y_test, predictions, title="Confusion matrix for predicting Churn", output_artifact=True)


Observe that we called the following methods after instantiating a `tracker` object:
- `tracker.log_precision_recall(...)`
- `tracker.log_roc_curve(...)`
- `tracker.log_confusion_matrix(...)`
    
These plots will be visible in the 6-Pipelines lab! 
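For intuition about what the confusion matrix and ROC plots summarize, here is a minimal pure-Python sketch. It is not how `smexperiments` computes or renders the plots; `confusion_counts` and the toy data are hypothetical and introduced only for illustration.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count (tn, fp, fn, tp) after thresholding predicted probabilities."""
    tn = fp = fn = tp = 0
    for label, prob in zip(y_true, y_prob):
        predicted = int(prob >= threshold)
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Toy labels and probabilities (illustrative values only).
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_prob = [0.1, 0.6, 0.8, 0.4, 0.9, 0.2, 0.7, 0.3]

# Each threshold gives one confusion matrix, and therefore one point
# (false positive rate, true positive rate) on the ROC curve.
for threshold in (0.35, 0.5, 0.75):
    tn, fp, fn, tp = confusion_counts(y_true, y_prob, threshold)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    print(f"threshold={threshold}: tn={tn} fp={fp} fn={fn} tp={tp} "
          f"tpr={tpr:.2f} fpr={fpr:.2f}")
```

The confusion matrix uses a single threshold (the solution script rounds the probabilities), while the ROC and precision-recall curves sweep the threshold, which is why they take the raw probabilities as input.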

In [None]:
%store -r docker_image_name
%store -r s3uri_model
%store -r s3url_test

In [None]:
entrypoint = "evaluate_with_experiments.py"

In [None]:
from sagemaker.processing import (
    ProcessingInput,
    ProcessingOutput,
    ScriptProcessor,
)

In [None]:
import sagemaker
sm_sess = sagemaker.session.Session()
role = sagemaker.get_execution_role()

In [None]:
# Processing step for evaluation
processor = ScriptProcessor(
    image_uri=docker_image_name,
    command=["python3"],
    instance_type="ml.m5.xlarge",
    instance_count=1,
    base_job_name="CustomerChurn/eval-script",
    sagemaker_session=sm_sess,
    role=role,
)

In [None]:
from time import gmtime, strftime


def create_date():
    """Return the current UTC time as a timestamp string for job names."""
    return strftime("%Y-%m-%d-%H-%M-%S", gmtime())

In [None]:
processor.run(
    code=entrypoint,
    inputs=[
        ProcessingInput(
            source=s3uri_model,
            destination="/opt/ml/processing/model",
        ),
        ProcessingInput(
            source=s3url_test,
            destination="/opt/ml/processing/test",
        ),
    ],
    outputs=[
        ProcessingOutput(
            output_name="evaluation", source="/opt/ml/processing/evaluation"
        ),
    ],
    job_name=f"Experiments-CustomerChurnEval-{create_date()}",
)
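The timestamp suffix keeps each job name unique. As a small sketch, a name can also be checked locally before it is submitted. The pattern below reflects my understanding of the processing job name constraint (up to 63 alphanumeric characters or hyphens, starting and ending with an alphanumeric); treat it as an assumption and confirm it against the API reference. `make_job_name` is a hypothetical helper, not part of the SageMaker SDK.

```python
import re
from time import gmtime, strftime

# Assumed naming constraint; verify against the CreateProcessingJob API reference.
JOB_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")


def make_job_name(prefix, when=None):
    """Build '<prefix>-<UTC timestamp>' and validate it against the assumed pattern."""
    moment = gmtime() if when is None else when
    name = f"{prefix}-{strftime('%Y-%m-%d-%H-%M-%S', moment)}"
    if not JOB_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"Invalid job name: {name!r}")
    return name


# Fixed time (the Unix epoch) so the example is reproducible.
example_name = make_job_name("Experiments-CustomerChurnEval", gmtime(0))
print(example_name)
```

Under this assumed pattern, a prefix containing a slash would be rejected locally instead of failing when the job is created.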

Let's get the S3 URI of the new evaluation Python script:

In [None]:
for proc_in in processor.latest_job.inputs:
    if proc_in.input_name == "code":
        s3_evaluation_code_uri_with_experiments = proc_in.source 
        
s3_evaluation_code_uri_with_experiments
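The loop above leaves the variable undefined if no input is named `code`. A slightly more defensive variant, sketched here with stand-in objects in place of `processor.latest_job.inputs`, uses `next` with a default. `find_code_uri`, the fake inputs, and the example bucket paths are hypothetical.

```python
from types import SimpleNamespace


def find_code_uri(inputs):
    """Return the source of the input named 'code', or None if there is none."""
    return next(
        (proc_in.source for proc_in in inputs if proc_in.input_name == "code"),
        None,
    )


# Stand-ins exposing the same two attributes the loop above reads.
fake_inputs = [
    SimpleNamespace(
        input_name="input-1",
        source="s3://example-bucket/model/model.tar.gz",
    ),
    SimpleNamespace(
        input_name="code",
        source="s3://example-bucket/code/evaluate_with_experiments.py",
    ),
]

code_uri = find_code_uri(fake_inputs)
if code_uri is None:
    raise RuntimeError("No processing input named 'code' was found")
print(code_uri)
```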

Save this location for later:

In [None]:
%store s3_evaluation_code_uri_with_experiments

We can now go back to [the main notebook](../evaluation.ipynb).