## Predictor Monitoring enables you to monitor effectiveness over time

We are excited to announce that Amazon Forecast now offers a new feature called Predictor Monitoring that enables customers to automatically evaluate their trained predictor's performance over time. Think of an [Amazon Forecast Predictor](https://docs.aws.amazon.com/forecast/latest/dg/howitworks-predictor.html) as a saved machine learning model used to generate predictions based on a set of training data. Once a predictor is created, it can be used for days, weeks or potentially months to generate new forecasted data points without change -- every customer's data and use case is different.

Over time, a variety of factors can cause a predictor's performance to change, such as external conditions (for example, supply chain disruptions) or shifts in consumer preferences. New products, items, and services may be introduced, and the distribution of the data may change too. Eventually, a new predictor is needed to ensure predictions remain high quality.

Once you enable monitoring for a predictor and then import new data and produce a new forecast, the monitor will collect statistics automatically. You may use these statistics to decide when it's the right time to train a new predictor.

This notebook walks through an illustrative series of steps showing what may happen in the real world as multiple datasets are imported and forecasted over time. The notebook is saved in an executed state, so you can review the outputs without running each cell, unless you choose to do so.

This notebook introduces you to the Predictor Monitoring concept and does not require a complete set of data. For this exercise, we will use a very small slice of the yellow taxi trip records from [NYC Taxi and Limousine Commission (TLC)](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page).


## Table of Contents
* [Initial Setup](#setup)
* Step 1: [Prepare a set of time-sliced sample data](#prepare)
* Step 2: [Create an initial predictor & forecast](#initial)
* Step 3: [Demonstrate the Predictor Monitoring Lifecycle](#lifecycle)
* Step 4: [View the Predictor Monitor Evaluation](#evaluation)
* Step 5: [Cleanup](#cleanup)


# Initial Setup

### Upgrade boto3

Before proceeding, ensure you have upgraded boto3.

In [None]:
!pip install boto3 --upgrade

### Setup Imports

In [1]:
import boto3
from time import sleep
import subprocess
import sys
import os
import pandas as pd

sys.path.insert( 0, os.path.abspath("../../common") )

import json
import util

### Function to suppress printing account numbers

In [2]:
import re

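def mask_arn(input_string):

    # Match the 12-digit account ID together with its surrounding colons
    mask_regex = re.compile(':[0-9]{12}:')
    mask = mask_regex.search(input_string)

    # Replace each match (colons included) with 12 X characters
    while mask:
        input_string = input_string.replace(mask.group(),'X'*12)
        mask = mask_regex.search(input_string)

    return input_string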
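
Note that `mask_arn` replaces the colons around the account ID as well, which is why masked ARNs in the saved output read `us-west-2XXXXXXXXXXXXdataset/...`. If you would rather keep the ARN structure readable, a variant along the following lines may help. This is only a sketch; the name `mask_arn_keep_colons` and the example ARN are hypothetical and not part of the notebook.

```python
import re

# Hypothetical variant of mask_arn that keeps the colon delimiters intact.
_ACCOUNT_ID = re.compile(r':[0-9]{12}:')

def mask_arn_keep_colons(input_string):
    """Replace every 12-digit account ID field with X's, preserving the colons."""
    return _ACCOUNT_ID.sub(':' + 'X' * 12 + ':', input_string)

# Made-up ARN used purely for illustration
example = 'arn:aws:forecast:us-east-1:123456789012:dataset/EXAMPLE'
print(mask_arn_keep_colons(example))
```

Because the substitution is done in a single `re.sub` call, no loop is needed.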

### Create an instance of AWS SDK client for Amazon Forecast

In [3]:
# Set your region accordingly, us-east-1 as shown
region = 'us-east-1'
session = boto3.Session(region_name=region) 
forecast = session.client(service_name='forecast')

# Checking to make sure we can communicate with Amazon Forecast
assert forecast.list_monitors()

### Setup IAM Role used by Amazon Forecast to access your data

In [5]:
role_name = "ForecastNotebookRole-Basic"
print(f"Creating Role {mask_arn(role_name)}...")
role_arn = util.get_or_create_iam_role( role_name = role_name )

# echo user inputs without account
print(f"Success! Created role = {mask_arn(role_arn).split('/')[1]}")

# Step 1: Prepare a set of time-sliced sample data

In this step, a small dataset is available in the file taxi_sample_data.csv. 

The dataset has the following 3 columns:
- timestamp: Timestamp at which pick-ups are requested.
- item_id: Pick-up location ID.
- target_value: Number of pick-ups requested around the timestamp at the pick-up location.

First, the routine below uses a single input file to create a small seed file of 100k rows, representing something you might use to train an initial predictor.

Next, the routine creates four additional data files, t1 through t4. Each file contains 25k more rows than the prior file, simulating what happens in the real world: as time passes, more ground truth target time series data becomes available. Later in this notebook, we'll import and forecast these files and see the metrics produced by predictor monitoring.

Note: As delivered, this uses the sample file in the data folder relative to this notebook. Please take care to ensure this file is available to your notebook.

### Create cumulative time-sliced files to demonstrate predictor monitoring over time

In [6]:
!head -100000 ./data/taxi_sample_data.csv > ./data/TAXI_TTS_seed.csv
!head -125000 ./data/taxi_sample_data.csv > ./data/TAXI_TTS_t1.csv
!head -150000 ./data/taxi_sample_data.csv > ./data/TAXI_TTS_t2.csv
!head -175000 ./data/taxi_sample_data.csv > ./data/TAXI_TTS_t3.csv
!head -200000 ./data/taxi_sample_data.csv > ./data/TAXI_TTS_t4.csv
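
The `head` commands above assume a Unix-like shell. On other platforms, roughly the same cumulative slices could be produced in plain Python. The sketch below is an assumption about how one might do that; the helper names `write_head` and `write_cumulative_slices` are hypothetical, while the file names and row counts mirror the cell above.

```python
from itertools import islice
from pathlib import Path

def write_head(source, destination, n_lines):
    """Copy the first n_lines lines of source to destination, like `head -n`."""
    with open(source, 'r') as src, open(destination, 'w') as dst:
        dst.writelines(islice(src, n_lines))

def write_cumulative_slices(source, data_dir='./data'):
    """Write the seed file and the four cumulative files t1 to t4."""
    sizes = {'seed': 100000, 't1': 125000, 't2': 150000, 't3': 175000, 't4': 200000}
    for suffix, n_lines in sizes.items():
        write_head(source, Path(data_dir) / f'TAXI_TTS_{suffix}.csv', n_lines)

# Only run when the sample file is actually present next to the notebook
sample_file = Path('./data/taxi_sample_data.csv')
if sample_file.exists():
    write_cumulative_slices(sample_file)
```

Like `head`, `islice` stops early without error if the source has fewer lines than requested.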

### Upload files to S3

In [8]:
bucket_name = input("\nEnter S3 bucket name for uploading the data and hit ENTER key:")

s3 = boto3.resource('s3')
for file in os.listdir('./data'):
    if file.endswith(".csv") and file.startswith('TAXI_TTS_'):
        print(file)
        s3.meta.client.upload_file('./data/'+file, bucket_name, file)


Enter S3 bucket name for uploading the data and hit ENTER key:forecast-us-east-1-XXXXXXXXXXXX
TAXI_TTS_seed.csv
TAXI_TTS_t2.csv
TAXI_TTS_t4.csv
TAXI_TTS_t1.csv
TAXI_TTS_t3.csv


# Step 2: Create an initial predictor & forecast

In step 2, we will 
- create a new dataset
- import an initial seed data file into the dataset
- create a new dataset group, which is the container for the new dataset
- create a new auto predictor model, with predictor monitoring enabled
- create a new forecast

Together, these steps represent the initial state of a production system. From here, time moves on and new data is recorded, allowing future forecasts to predict further out as the ground truth horizon advances.

To prepare you for what's ahead, Step 3 below simulates processing real-world data as it arrives, so you can review how the Predictor Monitoring results evolve.

### Create Dataset

In [9]:
DATASET_FREQUENCY = "H"
TS_DATASET_NAME = "TAXI_PREDICTOR_MONITOR_DEMO"
TS_SCHEMA = {
 "Attributes":[
 {
 "AttributeName":"timestamp",
 "AttributeType":"timestamp"
 },
 {
 "AttributeName":"item_id",
 "AttributeType":"string"
 },
 {
 "AttributeName":"target_value",
 "AttributeType":"integer"
 }
 ]
} 

create_dataset_response = forecast.create_dataset(Domain="CUSTOM",
 DatasetType='TARGET_TIME_SERIES',
 DatasetName=TS_DATASET_NAME,
 DataFrequency=DATASET_FREQUENCY,
 Schema=TS_SCHEMA)

ts_dataset_arn = create_dataset_response['DatasetArn']
describe_dataset_response = forecast.describe_dataset(DatasetArn=ts_dataset_arn)

print(f"Dataset ARN {mask_arn(ts_dataset_arn)} is now {describe_dataset_response['Status']}.")

Dataset ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset/TAXI_PREDICTOR_MONITOR_DEMO is now ACTIVE.


### Import the initial seed data file

In [10]:
TS_IMPORT_JOB_NAME = 'TAXI_TTS_seed'
TIMESTAMP_FORMAT = "yyyy-MM-dd hh:mm:ss"
ts_s3_path = f"s3://{bucket_name}/{TS_IMPORT_JOB_NAME}.csv"
TIMEZONE = "EST"

#frequency of poll event from API to check status of tasks
sleep_duration=300


ts_dataset_import_job_response = \
 forecast.create_dataset_import_job(DatasetImportJobName=TS_IMPORT_JOB_NAME,
 DatasetArn=ts_dataset_arn,
 DataSource= {
 "S3Config" : {
 "Path": ts_s3_path,
 "RoleArn": role_arn
 } 
 },
 TimestampFormat=TIMESTAMP_FORMAT,
 TimeZone = TIMEZONE)

ts_dataset_import_job_arn = ts_dataset_import_job_response['DatasetImportJobArn']

print(f"Waiting for Dataset Import Job with ARN {mask_arn(ts_dataset_import_job_arn)} to become ACTIVE.\n\nCurrent Status:\n")

status = util.wait(lambda: forecast.describe_dataset_import_job(DatasetImportJobArn=ts_dataset_import_job_arn), sleep_duration)
 

Waiting for Dataset Import Job with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset-import-job/TAXI_PREDICTOR_MONITOR_DEMO/TAXI_TTS_seed to become ACTIVE.

Current Status:

CREATE_PENDING 
CREATE_IN_PROGRESS 
ACTIVE 
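
The `util.wait` helper used above comes from the shared `common` folder and its source is not shown in this notebook. As a rough illustration of the polling pattern it appears to follow (call a describe function, print the status, sleep, repeat), here is a minimal sketch. The name `wait_until_done` and its `max_polls` and `sleeper` parameters are hypothetical and are not the actual `util.wait` implementation.

```python
import time

def wait_until_done(describe, sleep_duration=300, max_polls=None, sleeper=time.sleep):
    """Poll describe() until its 'Status' is ACTIVE or ends in FAILED.

    describe is any zero-argument callable returning a dict with a 'Status'
    key, such as a lambda around forecast.describe_dataset_import_job(...).
    """
    polls = 0
    while True:
        status = describe()['Status']
        print(status)
        if status == 'ACTIVE' or status.endswith('FAILED'):
            return status
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f'Still {status} after {polls} polls')
        sleeper(sleep_duration)

# Demonstration with canned statuses and a no-op sleeper, so nothing blocks
fake_statuses = iter(['CREATE_PENDING', 'CREATE_IN_PROGRESS', 'ACTIVE'])
final = wait_until_done(lambda: {'Status': next(fake_statuses)},
                        sleeper=lambda seconds: None)
```

Passing the sleep function as a parameter keeps the helper easy to exercise without waiting on a real job.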


### Create a dataset group

In [11]:
DATASET_GROUP_NAME = "TAXI_PREDICTOR_MONITOR_DEMO"
DATASET_ARNS = [ts_dataset_arn]

create_dataset_group_response = \
 forecast.create_dataset_group(Domain="CUSTOM",
 DatasetGroupName=DATASET_GROUP_NAME,
 DatasetArns=DATASET_ARNS)

dataset_group_arn = create_dataset_group_response['DatasetGroupArn']
describe_dataset_group_response = forecast.describe_dataset_group(DatasetGroupArn=dataset_group_arn)

print(f"The DatasetGroup with ARN {mask_arn(dataset_group_arn)} is now {describe_dataset_group_response['Status']}.")

The DatasetGroup with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset-group/TAXI_PREDICTOR_MONITOR_DEMO is now ACTIVE.


### Create a new auto predictor with MonitorConfig defined

Observe the new parameter, MonitorConfig, in the create_auto_predictor() function.

In [16]:
PREDICTOR_NAME = "TAXI_PREDICTOR_MONITOR_DEMO"
FORECAST_HORIZON = 4
FORECAST_FREQUENCY = "D"

create_auto_predictor_response = \
 forecast.create_auto_predictor(PredictorName = PREDICTOR_NAME,
 ForecastHorizon = FORECAST_HORIZON,
 ForecastFrequency = FORECAST_FREQUENCY,
 DataConfig = {
 'DatasetGroupArn': dataset_group_arn
 },
 MonitorConfig={"MonitorName": "TAXI_PREDICTOR_MONITOR_DEMO"},
 ForecastTypes=["0.5"],
 OptimizationMetric="MAPE"
 )

predictor_arn = create_auto_predictor_response['PredictorArn']
print(f"Waiting for Predictor with ARN {mask_arn(predictor_arn)} to become ACTIVE. Depending on data size and predictor setting,it can take several hours to be ACTIVE.\n\nCurrent Status:\n")

status = util.wait(lambda: forecast.describe_auto_predictor(PredictorArn=predictor_arn), sleep_duration)

print(f"Predictor with ARN {mask_arn(predictor_arn)} is ACTIVE.")

# retrieve the monitor ARN for future inspection as monitor_arn variable
response = forecast.list_monitors()
for i in response['Monitors']:
    if i['ResourceArn']==predictor_arn:
        monitor_arn = i['MonitorArn']

Waiting for Predictor with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXpredictor/TAXI_PREDICTOR_MONITOR_DEMO_01G3RQ4MCPS4YD26G4YB5FNYRR to become ACTIVE. Depending on data size and predictor setting,it can take several hours to be ACTIVE.

Current Status:

CREATE_PENDING 
CREATE_IN_PROGRESS ..........
ACTIVE 
Predictor with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXpredictor/TAXI_PREDICTOR_MONITOR_DEMO_01G3RQ4MCPS4YD26G4YB5FNYRR is ACTIVE.


### Create a new forecast based on the newly created predictor

This predictor and forecast are based on the initial seed dataset only, the first 100k rows.

In [17]:
FORECAST_NAME = "TAXI_FORECAST_seed"
 
create_forecast_response = \
 forecast.create_forecast(ForecastName=FORECAST_NAME,
 PredictorArn=predictor_arn)

forecast_arn = create_forecast_response['ForecastArn']

print(f"Waiting for Forecast with ARN {mask_arn(forecast_arn)} to become ACTIVE. Depending on data size and predictor settings,it can take several hours to be ACTIVE.\n\nCurrent Status:\n")

status = util.wait(lambda: forecast.describe_forecast(ForecastArn=forecast_arn), sleep_duration)


Waiting for Forecast with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXforecast/TAXI_FORECAST_seed to become ACTIVE. Depending on data size and predictor settings,it can take several hours to be ACTIVE.

Current Status:

CREATE_PENDING 
CREATE_IN_PROGRESS .
ACTIVE 


# Step 3: Demonstrate the Predictor Monitoring Lifecycle

In Step 3, we will import the four additional data files created from Step 1. Each data file successively adds an additional 25k rows of new ground truth target time series data.

You are not expected to run this workflow; you may simply read the saved output. The workflow is as follows.

1. import t1 csv file
2. produce a new forecast which is based on t1 imported TTS state
3. import t2 csv file
4. produce a new forecast which is based on t2 imported TTS state
5. import t3 csv file
6. retrain the original auto-predictor based on the t3 imported TTS state
7. produce a new forecast which is based on t3 imported TTS state
8. import t4 csv file
9. produce a new forecast which is based on t4 imported TTS state
10. finally, review the monitor performance results in this notebook.

If you elect to run this in your account, you can also view the resulting chart in the console.

In [None]:
TIMESTAMP_FORMAT = "yyyy-MM-dd hh:mm:ss"
TIMEZONE = "EST"

for i in range(1,5):

    TS_IMPORT_JOB_NAME = 'TAXI_TTS_t'+str(i)
    ts_s3_path = f"s3://{bucket_name}/{TS_IMPORT_JOB_NAME}.csv"

    print('\n\nProcessing incremental TTS dataset file '+str(i)+'\n')

    # Invoke import job of file i
    ts_dataset_import_job_response = \
        forecast.create_dataset_import_job(DatasetImportJobName=TS_IMPORT_JOB_NAME,
                                           DatasetArn=ts_dataset_arn,
                                           DataSource= {
                                               "S3Config" : {
                                                   "Path": ts_s3_path,
                                                   "RoleArn": role_arn
                                               }
                                           },
                                           TimestampFormat=TIMESTAMP_FORMAT,
                                           TimeZone = TIMEZONE)

    ts_dataset_import_job_arn = ts_dataset_import_job_response['DatasetImportJobArn']

    # Wait on import to complete
    print(f"Waiting for Dataset Import Job with ARN {mask_arn(ts_dataset_import_job_arn)} to become ACTIVE.\n")
    status = util.wait(lambda: forecast.describe_dataset_import_job(DatasetImportJobArn=ts_dataset_import_job_arn), sleep_duration)

    # Wait on dataset to become active
    print(f"Waiting for Dataset ARN {mask_arn(ts_dataset_arn)} to become ACTIVE.\n")
    status = util.wait(lambda: forecast.describe_dataset(DatasetArn=ts_dataset_arn), sleep_duration)

    # only after importing third of four new datasets, retrain original predictor

    if i==3:

        PREDICTOR_NAME = "TAXI_PREDICTOR_MONITOR_DEMO_RETRAIN1"

        create_auto_predictor_response = \
            forecast.create_auto_predictor(PredictorName = PREDICTOR_NAME,
                                           ReferencePredictorArn=predictor_arn
                                           )

        predictor_arn = create_auto_predictor_response['PredictorArn']

        # wait on retrained predictor to become active
        print(f"Waiting for Predictor with ARN {mask_arn(predictor_arn)} to become ACTIVE.\n")
        status = util.wait(lambda: forecast.describe_auto_predictor(PredictorArn=predictor_arn), sleep_duration)


    # Generate a new forecast based on latest file import

    FORECAST_NAME = "TAXI_FORECAST_" + str(i)

    create_forecast_response = \
        forecast.create_forecast(ForecastName=FORECAST_NAME,
                                 PredictorArn=predictor_arn)

    forecast_arn = create_forecast_response['ForecastArn']

    # Wait on forecast to complete
    print(f"Waiting for Forecast with ARN {mask_arn(forecast_arn)} to become ACTIVE.\n")
    status = util.wait(lambda: forecast.describe_forecast(ForecastArn=forecast_arn), sleep_duration)



Processing incremental TTS dataset file 1

Waiting for Dataset Import Job with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset-import-job/TAXI_PREDICTOR_MONITOR_DEMO/TAXI_TTS_t1 to become ACTIVE.

CREATE_PENDING 
ACTIVE 
Waiting for Dataset ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset/TAXI_PREDICTOR_MONITOR_DEMO to become ACTIVE.

ACTIVE 
Waiting for Forecast with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXforecast/TAXI_FORECAST_1 to become ACTIVE.

CREATE_PENDING 
CREATE_IN_PROGRESS ...
ACTIVE 


Processing incremental TTS dataset file 2

Waiting for Dataset Import Job with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset-import-job/TAXI_PREDICTOR_MONITOR_DEMO/TAXI_TTS_t2 to become ACTIVE.

CREATE_PENDING 
ACTIVE 
Waiting for Dataset ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset/TAXI_PREDICTOR_MONITOR_DEMO to become ACTIVE.

ACTIVE 
Waiting for Forecast with ARN arn:aws:forecast:us-west-2XXXXXXXXXXXXforecast/TAXI_FORECAST_2 to become ACTIVE.

CREATE_PENDING 
CREATE_IN_PRO

# Step 4: View the Predictor Monitor Evaluation

This step shows the predictor's performance over time, as recorded by the monitor after the cumulative files t1, t2, t3, and t4 were imported. While this example shows how to read the data from the JSON response produced by list_monitor_evaluations(), you might also consider using the built-in visualization available inside the AWS Console.

Please review the output below. The model metrics degrade slightly after input files t1 and t2 are imported. After t2, once t3 has been imported, the predictor is retrained, leading to better loss metrics for t3 and t4. The main goal is to use the monitor to know when metrics have degraded beyond your required threshold. At that point, you can train a new predictor and use it as the basis for generating forecasts.
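
One way to act on such a threshold is to compare each new evaluation against a baseline and flag any metric that has grown by more than a tolerance you choose. The sketch below is an assumption about how that check might look, not something the monitor provides. The function name `needs_retraining`, the 10% default tolerance, and the example numbers are all hypothetical placeholders.

```python
def needs_retraining(baseline, latest, max_relative_increase=0.10):
    """Return the metrics whose latest value exceeds the baseline by more than the tolerance.

    baseline and latest map metric names (for example 'WAPE') to values where
    lower is better. The 10% default is an arbitrary placeholder.
    """
    degraded = {}
    for name, base_value in baseline.items():
        # Skip metrics that are missing or cannot be compared proportionally
        if name not in latest or base_value <= 0:
            continue
        relative_change = (latest[name] - base_value) / base_value
        if relative_change > max_relative_increase:
            degraded[name] = relative_change
    return degraded

# Made-up numbers purely for illustration, not taken from the monitor output
baseline_metrics = {'WAPE': 0.30, 'MAPE': 0.30}
latest_metrics = {'WAPE': 0.36, 'MAPE': 0.31}
flagged = needs_retraining(baseline_metrics, latest_metrics)
print(sorted(flagged))
```

An empty result means every metric is still within tolerance; which metrics and tolerances matter depends on your use case.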


In [19]:
list_monitor_response = forecast.list_monitors()

for i in list_monitor_response['Monitors']:
    if i['ResourceArn'] == predictor_arn:
        monitor_arn = i['MonitorArn']

#for display purposes
monitor_evaluations = forecast.list_monitor_evaluations(
 MonitorArn=monitor_arn,
 MaxResults=10
)

print('PredictorARN:\n',mask_arn(predictor_arn))
print('MonitorARN:\n',mask_arn(monitor_arn),'\n\n')

for i in monitor_evaluations['PredictorMonitorEvaluations']:

    print ('Dataset ARN:\n',mask_arn(i['MonitorDataSource']['DatasetImportJobArn']))
    print ('Evaluation State: ',i['EvaluationState'])
    print ('Evaluation EvaluationTime: ',i['EvaluationTime'])
    print ('Evaluation Window: ',i['WindowStartDatetime'],i['WindowEndDatetime'])

    if i['EvaluationState'] !='FAILURE':

        for m in i['MetricResults']:
            print (m['MetricName'],': ',m['MetricValue'])

    print ('\n\n')

PredictorARN:
 arn:aws:forecast:us-west-2XXXXXXXXXXXXpredictor/TAXI_PREDICTOR_MONITOR_DEMO_RETRAIN1_01G3RZNZRZXBQP0EF81NC4CETK
MonitorARN:
 arn:aws:forecast:us-west-2XXXXXXXXXXXXmonitor/TAXI_PREDICTOR_MONITOR_DEMO_RETRAIN1_01G3S1G876GRBAV467GCM9H53Q 


Dataset ARN:
 arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset-import-job/TAXI_PREDICTOR_MONITOR_DEMO/TAXI_TTS_t4
Evaluation State: SUCCESS
Evaluation EvaluationTime: 2022-05-23 18:26:37.953000+00:00
Evaluation Window: 2018-02-01 00:00:00+00:00 2018-02-04 00:00:00+00:00
AverageWeightedQuantileLoss : 0.3553749003728468
MAPE : 0.34043327644304344
MASE : 1.38765530970562
RMSE : 1120.6850296685875
WAPE : 0.36202280281927385



Dataset ARN:
 arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset-import-job/TAXI_PREDICTOR_MONITOR_DEMO/TAXI_TTS_t3
Evaluation State: SUCCESS
Evaluation EvaluationTime: 2022-05-23 17:31:12.771000+00:00
Evaluation Window: 2018-01-23 00:00:00+00:00 2018-01-26 00:00:00+00:00
AverageWeightedQuantileLoss : 0.3991355863665969
MAPE :

Below is an example of a chart that helps conceptualize a model's performance over time, available inside the AWS Console.

![chart](./images/predictor_monitor.png)
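
To compare evaluations side by side inside the notebook rather than in the console, the nested response can be flattened into one row per evaluation. The following is a minimal sketch under the assumption that the response has the shape used in Step 4. The helper name `flatten_evaluations` is hypothetical, and the sample dictionary reuses the masked ARN and two metric values from the saved t4 output above.

```python
def flatten_evaluations(response):
    """Turn a list_monitor_evaluations-style response into one flat dict per evaluation."""
    rows = []
    for evaluation in response.get('PredictorMonitorEvaluations', []):
        row = {
            # The last path segment of the import job ARN is the job name
            'ImportJob': evaluation['MonitorDataSource']['DatasetImportJobArn'].split('/')[-1],
            'State': evaluation['EvaluationState'],
        }
        # Step 4 skips metrics for FAILURE evaluations, so treat them as optional
        for metric in evaluation.get('MetricResults', []):
            row[metric['MetricName']] = metric['MetricValue']
        rows.append(row)
    return rows

# Sample built from the saved t4 output in this notebook
sample_response = {
    'PredictorMonitorEvaluations': [
        {
            'MonitorDataSource': {
                'DatasetImportJobArn': 'arn:aws:forecast:us-west-2XXXXXXXXXXXXdataset-import-job/TAXI_PREDICTOR_MONITOR_DEMO/TAXI_TTS_t4'
            },
            'EvaluationState': 'SUCCESS',
            'MetricResults': [
                {'MetricName': 'MAPE', 'MetricValue': 0.34043327644304344},
                {'MetricName': 'WAPE', 'MetricValue': 0.36202280281927385},
            ],
        }
    ]
}
rows = flatten_evaluations(sample_response)
for row in rows:
    print(row)
```

The resulting list of dictionaries can be passed to `pd.DataFrame(rows)` if you want a table, since pandas is already imported in this notebook.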

# Step 5: Cleanup

You will need to allow a few minutes for each of these steps to complete.


In [None]:
forecast.delete_resource_tree(ResourceArn=dataset_group_arn)

Once the dataset group has been deleted (allow a few minutes), you may proceed. The following cell lets you check whether the deletion has finished. When you first run it, you may still see your dataset group listed; wait a couple of minutes and try again. Once your dataset group no longer appears, you may proceed to the next step.

In [None]:
forecast.list_dataset_groups()
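
Rather than scanning the full listing by eye, you could check for the specific ARN. The helper below is a sketch that works on the dictionary returned by the cell above; the name `dataset_group_exists` and the example ARNs are hypothetical. It inspects only the page of results it is given, so accounts with many dataset groups may need to follow `NextToken` as well.

```python
def dataset_group_exists(list_response, dataset_group_arn):
    """Return True if dataset_group_arn appears in a list_dataset_groups-style response."""
    return any(
        group.get('DatasetGroupArn') == dataset_group_arn
        for group in list_response.get('DatasetGroups', [])
    )

# In the notebook this would be called as:
#   dataset_group_exists(forecast.list_dataset_groups(), dataset_group_arn)
# Illustrated here with a made-up response so it runs without AWS access.
fake_response = {'DatasetGroups': [{'DatasetGroupArn': 'arn:example:dataset-group/OTHER'}]}
still_there = dataset_group_exists(fake_response, 'arn:example:dataset-group/TAXI_PREDICTOR_MONITOR_DEMO')
print(still_there)
```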

Delete dataset import jobs with TAXI_PREDICTOR_MONITOR_DEMO in the job name.

In [None]:
response = forecast.list_dataset_import_jobs()

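for i in response['DatasetImportJobs']:

    # The import job ARN embeds the dataset name, so match on that
    if 'TAXI_PREDICTOR_MONITOR_DEMO' in i['DatasetImportJobArn']:
        try:
            print('Deleting',i['DatasetImportJobName'])
            forecast.delete_dataset_import_job(DatasetImportJobArn=i['DatasetImportJobArn'])
        except Exception as e:
            # Keep going if one job cannot be deleted, but report why
            print('Skipped',i['DatasetImportJobName'],e)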

It will take a few minutes to delete the dataset import jobs. Once that is complete, the dataset can be deleted as follows in the next cell.

In [None]:
forecast.delete_dataset(DatasetArn=ts_dataset_arn)