{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# SageMaker/DeepAR demo on household electricity consumption dataset\n",
"\n",
"This notebook complements the following two notebooks:\n",
"* [DeepAR introduction notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/deepar_synthetic/deepar_synthetic.ipynb). \n",
"* [Individual household electric power consumption dataset](https://github.com/amirrezaeian/Individual-household-electric-power-consumption-Data-Set-/blob/master/data_e_power.ipynb).\n",
"\n",
"The household electric power consumption dataset is available at: \n",
"http://archive.ics.uci.edu/ml/datasets/Individual+household+electric+power+consumption\n",
"\n",
"In summary, the dataset consists of the following attributes:\n",
"* `date`: date (dd/mm/yyyy)\n",
"* `time`: time (hh:mm:ss)\n",
"* `global_active_power`: household global minute-averaged active power (in Kilowatt) \n",
"* `global_reactive_power`: household global minute-averaged reactive power (in Kilowatt)\n",
"* `voltage`: minute-averaged voltage (in Volt)\n",
"* `global_intensity`: household global minute-averaged current intensity (in Ampere) \n",
"* `sub_metering_1`: energy sub-metering No.1 (in Watt-per-hour of active energy). It corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (hot plates are not electric but gas powered). \n",
"* `sub_metering_2`: energy sub-metering No.2 (in Watt-per-hour of active energy). It corresponds to the laundry room, containing a washing-machine, a tumble-drier, a refrigerator and a light. \n",
"* `sub_metering_3`: energy sub-metering No.3 (in Watt-per-hour of active energy). It corresponds to an electric water-heater and an air-conditioner. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In particular, we will see how to:\n",
"* Prepare the dataset\n",
"* Use the SageMaker Python SDK to train a DeepAR model and deploy it\n",
"* Make requests to the deployed model to obtain forecasts interactively\n",
"* Illustrate advanced features of DeepAR: missing values, additional time features, non-regular frequencies and category information\n",
"\n",
"For more information about DeepAR, see the [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html) "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sklearn/cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.\n",
" \"This module will be removed in 0.20.\", DeprecationWarning)\n",
"Using MXNet backend.\n"
]
}
],
"source": [
"%matplotlib inline\n",
"\n",
"import sys\n",
"import os\n",
"import json\n",
"import zipfile\n",
"import random\n",
"import datetime\n",
"from urllib.request import urlretrieve\n",
"from dateutil.parser import parse\n",
"from random import shuffle\n",
"\n",
"import boto3\n",
"import s3fs\n",
"import sagemaker\n",
"import numpy as np\n",
"import pandas as pd\n",
"import statsmodels.api as sm\n",
"import matplotlib.pyplot as plt\n",
"from scipy.stats import randint\n",
"import seaborn as sns # used for plot interactive graph. \n",
"\n",
"from __future__ import print_function\n",
"from ipywidgets import interact, interactive, fixed, interact_manual\n",
"import ipywidgets as widgets\n",
"from ipywidgets import IntSlider, FloatSlider, Checkbox\n",
"\n",
"from sklearn.model_selection import train_test_split # to split the data into two parts\n",
"from sklearn.cross_validation import KFold # use for cross validation\n",
"from sklearn.preprocessing import StandardScaler # for normalization\n",
"from sklearn.preprocessing import MinMaxScaler\n",
"from sklearn.pipeline import Pipeline # pipeline making\n",
"from sklearn.model_selection import cross_val_score\n",
"from sklearn.feature_selection import SelectFromModel\n",
"from sklearn import metrics # for the check the error and accuracy of the model\n",
"from sklearn.metrics import mean_squared_error,r2_score\n",
"\n",
"## for Deep-learing:\n",
"import keras\n",
"from keras.layers import Dense\n",
"from keras.models import Sequential\n",
"# from keras.utils import to_categorical\n",
"from keras.optimizers import SGD \n",
"from keras.callbacks import EarlyStopping\n",
"from keras.utils import np_utils\n",
"import itertools\n",
"from keras.layers import LSTM\n",
"from keras.layers.convolutional import Conv1D\n",
"from keras.layers.convolutional import MaxPooling1D\n",
"from keras.layers import Dropout\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# set random seeds for reproducibility\n",
"np.random.seed(42)\n",
"random.seed(42)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"sagemaker_session = sagemaker.Session()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before starting, we can override the default values for the following: \n",
"- The S3 bucket and prefix that you want to use for training and model data. This should be within the same region as the Notebook Instance, training, and hosting.\n",
"- The IAM role arn used to give training and hosting access to your data. See the documentation for how to create these."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"s3_bucket = sagemaker.Session().default_bucket() # replace with an existing bucket if needed\n",
"s3_prefix = 'deepar-household-electricity-notebook' # prefix used for all data stored within the bucket\n",
"\n",
"role = sagemaker.get_execution_role() # IAM role to use by SageMaker"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"region = sagemaker_session.boto_region_name\n",
"\n",
"s3_data_path = \"s3://{}/{}/data\".format(s3_bucket, s3_prefix)\n",
"s3_output_path = \"s3://{}/{}/output\".format(s3_bucket, s3_prefix)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we configure the container image to be used for the region that we are running in."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"image_name = sagemaker.amazon.amazon_estimator.get_image_uri(region, \"forecasting-deepar\", \"latest\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Importing household electricity consumption dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After downloading the original dataset from the UCI ML repository, we load and parse the dataset. In addition, we modify dataset into a time-series formation supported by Pandas, so that we can utilize many features to handle time-series dataset (e.g. datetime, resampling, aggregation, basic statistics, etc.)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* https://www.allaboutcircuits.com/textbook/alternating-current/chpt-11/true-reactive-and-apparent-power/\n",
"* https://circuitglobe.com/what-is-power-triangle.html\n",
"* https://en.wikipedia.org/wiki/AC_power\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### White noise\n",
"In discrete time, white noise is a discrete signal whose samples are regarded as a sequence of serially uncorrelated random variables with zero mean and finite variance. Depending on the context, one may also require that the samples be independent and having identical probability distribution (a.k.a. _i.i.d_). In particular, if each sample has a normal distribution with zero mean, the signal is said to be **Gaussian white noise**. \n",
"\n",
"Some properties of white noise:\n",
"* White noise is the simplest example of a **stationary process**. \n",
"* if the lag is 0, auto-covariance will be a variance of probability distribution. Otherwise, auto-covariance will be 0. That is:\n",
"
\n",
"\\begin{equation}\n",
" \\gamma_l = \n",
" \\begin{cases}\n",
" Var[e_t] & \\text{for $l = 0$} \\\\\n",
" 0 & \\text{for $l \\neq 0$} \n",
" \\end{cases}\n",
"\\end{equation}\n",
" \n",
"* if the lag is 0, auto-correlation will be 1. Otherwise, auto-correlation will be 0. That is:\n",
"
\n",
"\\begin{equation}\n",
"e_t \\sim \\text{ $i.i.d$ } N(\\mu,\\sigma^2) \\text{ for all $t$}\n",
"\\end{equation}\n",
"\n",
"**Prewhitening:**\n",
"A technique to process a time series data to make it behave statistically like white noise. (The 'pre' means that whitening precedes some other analytical approaches enabling to work better if the additive noise is white).\n",
"\n",
"https://datascienceschool.net/view-notebook/6b963e771dc54f8c8cb23437274a86d6/ \n",
"http://hosting.astro.cornell.edu/~cordes/A6523/Prewhitening.pdf"
]
},
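{
"cell_type": "markdown",
"metadata": {},
"source": [
"The auto-correlation property of white noise can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper `sample_autocorrelation`, the seed and the sample size are illustrative assumptions. It draws Gaussian white noise with NumPy and estimates the auto-correlation at the first few lags.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def sample_autocorrelation(x, max_lag):\n",
"    # Biased sample estimate of rho_l = gamma_l / gamma_0 for l = 0..max_lag\n",
"    x = np.asarray(x, dtype=float)\n",
"    x = x - x.mean()\n",
"    n = len(x)\n",
"    gamma_0 = np.dot(x, x) / n\n",
"    return np.array([np.dot(x[:n - lag], x[lag:]) / n / gamma_0\n",
"                     for lag in range(max_lag + 1)])\n",
"\n",
"\n",
"rng = np.random.RandomState(42)  # arbitrary fixed seed (assumption)\n",
"noise = rng.normal(loc=0.0, scale=1.0, size=10000)  # Gaussian white noise\n",
"rho = sample_autocorrelation(noise, max_lag=5)\n",
"print(rho)\n",
"```\n",
"\n",
"The estimate at lag 0 is exactly 1, while the estimates at the other lags should be close to 0 (not exactly 0, because the sample is finite)."
]
},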
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"FILE_NAME = './household_power_consumption.csv'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Note that dataset includes `nan` and `?` as a `string`. They need to be converted to numpy `nan` in importing stage and treated both of them the same.\n",
"* Two columns `Date` and `Time` can be merged into one column `Date_Time`.\n",
"* The index of dataset need to be reset (with `Date_Time`)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Data can be downloaded from: http://archive.ics.uci.edu/ml/machine-learning-databases/00235/\n",
"* Just open the zip file and grab the file 'household_power_consumption.txt' put it in the directory that you would like to run the code.\n",
"* `infer_datetime_format`: to allow speedups for homogeneously formatted datetimes. `pd.read_csv` and `pd.to_datetime` learned a new `infer_datetime_format` keyword which greatly improves parsing perf in many cases. (http://pandas.pydata.org/pandas-docs/version/0.17.1/whatsnew.html#id55)\n",
"* `low_memory`: Please refer the following link: https://stackoverflow.com/questions/24251219/pandas-read-csv-low-memory-and-dtype-options)"
]
},
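{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible way to turn the semicolon-delimited UCI text file into comma-delimited text is sketched below. This is an assumption about the preprocessing, not the author's confirmed procedure. To stay self-contained it reads three rows from an in-memory string instead of the real file; the second row is made up to show how `?` marks missing measurements.\n",
"\n",
"```python\n",
"import io\n",
"\n",
"import pandas as pd\n",
"\n",
"# Three rows in the layout of the raw UCI file (semicolon-delimited, dd/mm/yyyy dates).\n",
"raw_text = (\n",
"    'Date;Time;Global_active_power;Global_reactive_power;Voltage;'\n",
"    'Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3\\n'\n",
"    '16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000\\n'\n",
"    '16/12/2006;17:25:00;?;?;?;?;?;?;\\n'\n",
"    '16/12/2006;17:26:00;5.374;0.498;233.290;23.000;0.000;2.000;17.000\\n'\n",
")\n",
"\n",
"# '?' is parsed as NaN, so the measurement columns become float64\n",
"raw = pd.read_csv(io.StringIO(raw_text), sep=';', na_values=['?'])\n",
"\n",
"# Comma-delimited text; pass a file path to to_csv() to write a file instead\n",
"csv_text = raw.to_csv(index=False)\n",
"print(csv_text)\n",
"```\n",
"\n",
"With the real data, `io.StringIO(raw_text)` would be replaced by the path to `household_power_consumption.txt`."
]
},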
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Global_active_power
\n",
"
Global_reactive_power
\n",
"
Voltage
\n",
"
Global_intensity
\n",
"
Sub_metering_1
\n",
"
Sub_metering_2
\n",
"
Sub_metering_3
\n",
"
\n",
"
\n",
"
Date_Time
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n",
" \n",
" \n",
"
\n",
"
2006-12-16 17:24:00
\n",
"
4.216
\n",
"
0.418
\n",
"
234.84
\n",
"
18.4
\n",
"
0.0
\n",
"
1.0
\n",
"
17.0
\n",
"
\n",
"
\n",
"
2006-12-16 17:25:00
\n",
"
5.360
\n",
"
0.436
\n",
"
233.63
\n",
"
23.0
\n",
"
0.0
\n",
"
1.0
\n",
"
16.0
\n",
"
\n",
"
\n",
"
2006-12-16 17:26:00
\n",
"
5.374
\n",
"
0.498
\n",
"
233.29
\n",
"
23.0
\n",
"
0.0
\n",
"
2.0
\n",
"
17.0
\n",
"
\n",
"
\n",
"
2006-12-16 17:27:00
\n",
"
5.388
\n",
"
0.502
\n",
"
233.74
\n",
"
23.0
\n",
"
0.0
\n",
"
1.0
\n",
"
17.0
\n",
"
\n",
"
\n",
"
2006-12-16 17:28:00
\n",
"
3.666
\n",
"
0.528
\n",
"
235.68
\n",
"
15.8
\n",
"
0.0
\n",
"
1.0
\n",
"
17.0
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Global_active_power Global_reactive_power Voltage \\\n",
"Date_Time \n",
"2006-12-16 17:24:00 4.216 0.418 234.84 \n",
"2006-12-16 17:25:00 5.360 0.436 233.63 \n",
"2006-12-16 17:26:00 5.374 0.498 233.29 \n",
"2006-12-16 17:27:00 5.388 0.502 233.74 \n",
"2006-12-16 17:28:00 3.666 0.528 235.68 \n",
"\n",
" Global_intensity Sub_metering_1 Sub_metering_2 \\\n",
"Date_Time \n",
"2006-12-16 17:24:00 18.4 0.0 1.0 \n",
"2006-12-16 17:25:00 23.0 0.0 1.0 \n",
"2006-12-16 17:26:00 23.0 0.0 2.0 \n",
"2006-12-16 17:27:00 23.0 0.0 1.0 \n",
"2006-12-16 17:28:00 15.8 0.0 1.0 \n",
"\n",
" Sub_metering_3 \n",
"Date_Time \n",
"2006-12-16 17:24:00 17.0 \n",
"2006-12-16 17:25:00 16.0 \n",
"2006-12-16 17:26:00 17.0 \n",
"2006-12-16 17:27:00 17.0 \n",
"2006-12-16 17:28:00 17.0 "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = pd.read_csv(FILE_NAME, sep=\",\", parse_dates={'Date_Time': ['Date', 'Time']}, \n",
" infer_datetime_format=True, na_values=['nan','?'], \n",
" low_memory=False, index_col='Date_Time')\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We need to check the type of data for each column. If some of them has `object` type, they should be converted into the numerical format (e.g. `float64`, `int64`).\n",
"For example, we can use the following codes for the above tasks: \n",
"```\n",
"data = data.convert_objects(convert_numeric=True)\n",
"data.info()\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"DatetimeIndex: 2075259 entries, 2006-12-16 17:24:00 to 2010-11-26 21:02:00\n",
"Data columns (total 7 columns):\n",
"Global_active_power float64\n",
"Global_reactive_power float64\n",
"Voltage float64\n",
"Global_intensity float64\n",
"Sub_metering_1 float64\n",
"Sub_metering_2 float64\n",
"Sub_metering_3 float64\n",
"dtypes: float64(7)\n",
"memory usage: 126.7 MB\n"
]
}
],
"source": [
"data.info()"
]
},
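{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once `?` and `nan` have been parsed as NaN, the missing values can be counted and, if desired, filled before computing statistics. The sketch below uses a small made-up frame (the values and the two fill strategies are illustrative assumptions, not the procedure used in this notebook) to show a column-mean fill and a time-based interpolation.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Toy minute-level frame with a few gaps (values are illustrative)\n",
"index = pd.date_range('2006-12-16 17:24:00', periods=5, freq='min')\n",
"frame = pd.DataFrame({'Global_active_power': [4.216, np.nan, 5.374, np.nan, 3.666],\n",
"                      'Voltage': [234.84, 233.63, np.nan, 233.74, 235.68]},\n",
"                     index=index)\n",
"\n",
"print(frame.isnull().sum())  # NaN count per column before filling\n",
"\n",
"# Option 1: replace each NaN with the mean of its column\n",
"filled_mean = frame.fillna(frame.mean())\n",
"\n",
"# Option 2: interpolate linearly along the datetime index\n",
"filled_time = frame.interpolate(method='time')\n",
"print(filled_time)\n",
"```\n",
"\n",
"Which strategy is appropriate depends on the downstream model. As noted at the top of this notebook, DeepAR itself has support for missing values, so filling is mainly relevant for the exploratory statistics and plots."
]
},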
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When we want to build an ML model, we need to understand dataset in the following perspectives:\n",
"* The meaning of data for each feature (column)\n",
"* Summarized information from the basic statistics\n",
"* Relaionship or Association between features\n",
"* Features having similar patterns\n",
"* Trend or Periodicity\n",
"* Outliers, Noisy data, Missing values\n",
"* Data type (categorical data, numerical data, etc.)\n",
"... "
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
" Global_active_power Global_reactive_power Voltage \\\n",
"count 2.075259e+06 2.075259e+06 2.075259e+06 \n",
"mean 1.091615e+00 1.237145e-01 2.408399e+02 \n",
"std 1.050655e+00 1.120142e-01 3.219643e+00 \n",
"min 7.600000e-02 0.000000e+00 2.232000e+02 \n",
"25% 3.100000e-01 4.800000e-02 2.390200e+02 \n",
"50% 6.300000e-01 1.020000e-01 2.409600e+02 \n",
"75% 1.520000e+00 1.920000e-01 2.428600e+02 \n",
"max 1.112200e+01 1.390000e+00 2.541500e+02 \n",
"\n",
" Global_intensity Sub_metering_1 Sub_metering_2 Sub_metering_3 \n",
"count 2.075259e+06 2.075259e+06 2.075259e+06 2.075259e+06 \n",
"mean 4.627759e+00 1.121923e+00 1.298520e+00 6.458447e+00 \n",
"std 4.416490e+00 6.114397e+00 5.785470e+00 8.384178e+00 \n",
"min 2.000000e-01 0.000000e+00 0.000000e+00 0.000000e+00 \n",
"25% 1.400000e+00 0.000000e+00 0.000000e+00 0.000000e+00 \n",
"50% 2.800000e+00 0.000000e+00 0.000000e+00 1.000000e+00 \n",
"75% 6.400000e+00 0.000000e+00 1.000000e+00 1.700000e+01 \n",
"max 4.840000e+01 8.800000e+01 8.000000e+01 3.100000e+01 "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# another sanity check to make sure that there are not more any nan\n",
"data.isnull().sum()\n",
"data.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can make a transformed dataset with different frequency by using `resample()`. \n",
"**down-sampling:** \n",
"* To transform the original dataset to a lower frequencey\n",
"* Summarize or aggregate the higher frequency data points (i.e. original dataset)\n",
"* Example: 1 minute-based timestamps → 5 minute-based timestamps \n",
"\n",
"**up-sampling:** \n",
"* To transform the original dataset to a higher frequencey\n",
"* the lower frequency data points (i.e. original dataset)\n",
"* Example: 1 minute-based timestamps → 0.5 minute-based (i.e. 30 second-based) timestamps \n",
"\n",
"https://machinelearningmastery.com/resample-interpolate-time-series-data-python/ \n",
"**Resampling** \n",
"Resampling involves changing the frequency of your time series observations.\n",
"\n",
"Two types of resampling are:\n",
"\n",
"1. Upsampling: Where you increase the frequency of the samples, such as from minutes to seconds. For upsampling, `ffill()` (i.e. forward filling) or `bfill()` (i.e. backward filling) can be required to fill the newly created data points that was not available. \n",
"1. Downsampling: Where you decrease the frequency of the samples, such as from days to months. For downsampling, some kind of aggregation operation can be needed. (e.g. `mean()`, `first()`, etc.)\n",
"\n",
"In both cases, data must be invented.\n",
"\n",
"In the case of upsampling, care may be needed in determining how the fine-grained observations are calculated using interpolation. In the case of downsampling, care may be needed in selecting the summary statistics used to calculate the new aggregated values.\n",
"\n",
"There are perhaps two main reasons why you may be interested in resampling your time series data:\n",
"\n",
"1. Problem Framing: Resampling may be required if your data is available at the same frequency that you want to make predictions.\n",
"1. Feature Engineering: Resampling can also be used to provide additional structure or insight into the learning problem for supervised learning models.\n",
"There is a lot of overlap between these two cases.\n",
"\n",
"For example, you may have daily data and want to predict a monthly problem. You could use the daily data directly or you could downsample it to monthly data and develop your model.\n",
"\n",
"A feature engineering perspective may use observations and summaries of observations from both time scales and more in developing a model.\n"
]
},
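{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two directions of resampling can be tried on a tiny series before applying them to the full dataset. The sketch below uses a made-up ten-minute series (not the household data) to show down-sampling with an aggregation and up-sampling with forward filling or linear interpolation.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Ten one-minute observations with values 0.0 .. 9.0 (illustrative data)\n",
"index = pd.date_range('2006-12-16 17:00:00', periods=10, freq='min')\n",
"series = pd.Series(range(10), index=index, dtype=float)\n",
"\n",
"# Down-sampling: 1-minute -> 5-minute, each bin needs an aggregation\n",
"down_mean = series.resample('5min').mean()\n",
"down_sum = series.resample('5min').sum()\n",
"\n",
"# Up-sampling: 1-minute -> 30-second, new points must be filled in\n",
"up_ffill = series.resample('30s').ffill()  # repeat the last known value\n",
"up_interp = series.resample('30s').interpolate(method='linear')  # straight line between neighbours\n",
"\n",
"print(down_mean)\n",
"print(up_interp.head())\n",
"```\n",
"\n",
"Forward filling keeps only values that were actually observed, whereas interpolation creates new intermediate values, so the choice affects any statistic computed afterwards."
]
},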
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Date_Time\n",
"2006-12-31 41817.648460\n",
"2007-01-31 69014.045230\n",
"2007-02-28 56491.069230\n",
"2007-03-31 58863.283615\n",
"2007-04-30 39245.548781\n",
"2007-05-31 44008.872000\n",
"2007-06-30 35729.767447\n",
"2007-07-31 29846.831570\n",
"2007-08-31 34120.475531\n",
"2007-09-30 41874.789230\n",
"2007-10-31 49278.553230\n",
"2007-11-30 55920.827230\n",
"2007-12-31 72605.261615\n",
"2008-01-31 65170.473615\n",
"2008-02-29 49334.346845\n",
"2008-03-31 55591.685615\n",
"2008-04-30 48209.992000\n",
"2008-05-31 45724.043230\n",
"2008-06-30 42945.063615\n",
"2008-07-31 35479.601230\n",
"2008-08-31 12344.063230\n",
"2008-09-30 42667.792000\n",
"2008-10-31 50743.399447\n",
"2008-11-30 59918.584535\n",
"2008-12-31 56911.416668\n",
"2009-01-31 62951.099615\n",
"2009-02-28 50291.953362\n",
"2009-03-31 54761.169230\n",
"2009-04-30 49277.707230\n",
"2009-05-31 45214.196460\n",
"2009-06-30 37149.767696\n",
"2009-07-31 27594.810460\n",
"2009-08-31 30049.032998\n",
"2009-09-30 42631.838845\n",
"2009-10-31 51089.811615\n",
"2009-11-30 55068.733615\n",
"2009-12-31 60907.189230\n",
"2010-01-31 62797.504679\n",
"2010-02-28 55473.889230\n",
"2010-03-31 50368.601679\n",
"2010-04-30 44379.215615\n",
"2010-05-31 48893.491615\n",
"2010-06-30 41887.607230\n",
"2010-07-31 32188.843615\n",
"2010-08-31 29991.384254\n",
"2010-09-30 42026.211946\n",
"2010-10-31 51934.045615\n",
"2010-11-30 44598.388000\n",
"Freq: M, Name: Global_active_power, dtype: float64"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data['Global_active_power'].resample('M').sum()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Visualization"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below I resample over day, and show the sum and mean of Global_active_power. It is seen that mean and sum of resampled data set, have similar structure."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Sum of 'Global_active_power' resampled over month\n",
"data['Global_active_power'].resample('M').mean().plot(kind='bar')\n",
"plt.xticks(rotation=60)\n",
"plt.ylabel('Global_active_power')\n",
"plt.title('Global_active_power per month (averaged over month)')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below I show mean of 'global_active_power' resampled over day."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"data['Global_active_power'].resample('Q').mean().plot(kind='bar')\n",
"plt.xticks(rotation=60)\n",
"plt.ylabel('Global_active_power')\n",
"plt.title('Global_active_power per quarter (averaged over quarter)')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is very important to note from above two plots that resampling over larger time inteval, will diminish the periodicity of system as we expect. This is important for machine learning feature engineering."
]
},
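{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loss of periodicity under coarse resampling can be illustrated with a synthetic signal. The sketch below is an assumption-laden toy example (a pure daily sine cycle, not the household data): averaging over hours keeps the daily cycle, while averaging over whole days removes it completely.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# 28 days of minute-level data with a pure 24-hour cycle (synthetic)\n",
"index = pd.date_range('2007-01-01', periods=28 * 24 * 60, freq='min')\n",
"minutes_of_day = np.asarray(index.hour * 60 + index.minute)\n",
"load = pd.Series(1.0 + 0.5 * np.sin(2 * np.pi * minutes_of_day / (24 * 60)),\n",
"                 index=index)\n",
"\n",
"# Spread of the resampled means: large if the cycle survives, near 0 if it is averaged out\n",
"hourly_std = load.resample('h').mean().std()\n",
"daily_std = load.resample('D').mean().std()\n",
"print(hourly_std, daily_std)\n",
"```\n",
"\n",
"Real consumption data also contains weekly and seasonal cycles, so a coarse resampling interval removes the shorter cycles while the longer ones remain visible."
]
},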
{
"cell_type": "markdown",
"metadata": {},
"source": [
"mean of 'Voltage' resampled over month"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Below I show hist plot of the mean of different feature resampled over month \n",
"data.Global_active_power.resample('M').mean().plot(kind='hist', alpha=0.3, legend=True )\n",
"data.Global_reactive_power.resample('M').mean().plot(kind='hist', alpha=0.3, legend=True)\n",
"#df.Voltage.resample('M').sum().plot(kind='hist',color='g', legend=True)\n",
"data.Global_intensity.resample('M').mean().plot(kind='hist', alpha=0.3, legend=True)\n",
"data.Sub_metering_1.resample('M').mean().plot(kind='hist', alpha=0.3, legend=True)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Below I show hist plot of the mean of different feature resampled over month \n",
"data.Global_active_power.resample('M').mean().plot(kind='hist', color='r', legend=True )\n",
"#from pyqt_fit import kde\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Comparison of the mean of different features resampled over day"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAaUAAAGoCAYAAADmTPpwAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzt3X+YlXWd//Hne8ZDDKSOGJr8Wvx1oSEwJgo2riItqamoqKFCSe7i1lX+iCIxKSDXdGOX3Da7TNN0NzSU6PgzDTUyCVF0sIGUzGKRo99AkUJFHeH9/eOcg8MwwDlnzv3zvB7XNRdz7nPPfX9uHM+Lz+f+3J+3uTsiIiJxUBd1A0RERIoUSiIiEhsKJRERiQ2FkoiIxIZCSUREYkOhJCIisaFQEhGR2FAoiYhIbCiUREQkNvaIugFl0NITIpJ0FnUD4k49JRERiY0k9ZSq5o6lazrdfsGIASG3RERE2lNPSUREYkOhJCIisaFQEhGR2FAoiYhIbCiUREQkNhRKIiISGwolERGJDYWSiIjEhkJJRERiQ6EkIiKxoVASEZHYUCiJiEhsKJRERCQ2FEoiIhIbCiUREYkNhZKIiMSGQklERGJDoSQiIrGhUBIRkdhQKImISGwolEREJDYUSiIiEht7RN2AoN2xdE3UTRARkRKppyQiIrGhUBIRkdhQKImISGwolEREJDYUSiIiEhsKJRERiQ2FkoiIxIZCSUREYkOhJCIisZH6FR3CtLPVIy4YMSDkloiIJJN6SiIiEhvqKYVAPSgRkdKopyQiIrGhUBIRkdjQ8F07nQ2zaYhNRCQ86imJiEhsKJRERCQ2FEoiIhIbCiUREYkNTXSIkCZWiIhsT6FUoZ09ECsiIpVTKO2GwkdEJDwKpZjRkkQiUssUSglRTo9NASYiSaXZdyIiEhvqKaWQelUiklQKpRoX9kQOhaCI7Iq5e9RtKImZPQR8pMwf+wjwWgDNiZNauEbQdaZJLVwjdH6dr7n7yVE0JikSE0qVMLNl7j486nYEqRauEXSdaVIL1wi1c53VpokOIiISGwolERGJjbSH0k1RNyAEtXCNoOtMk1q4Rqid66yqVN9TEhGRZEl7T0lERBJEoSQiIrGhUBIRkdhQKImISGwolEREJDYSE0onn3yyA/rSl770leSvkqT0864kiQml116rhaWyRERq+/MuMaEkIiLpp1ASEZHYUCiJiEhsJLrIX1tbG2vXruWdd96JuikSI927d6dfv35kMpmomyIiZUp0KK1du5Y999yTgQMHYmZRN0diwN15/fXXWbt2LQceeGDUzRGRMiV6+O6dd95h3333VSDJNmbGvvvuq96zSEIlOpQABZLsQL8TIsmV+FASEZH0UChJyS666CL2228/jjjiiMja4O5ceumlHHLIIQwdOpRnn302sraISPUplGLo/fffD+S4W7Zs6dLPT5o0iYceeqhKranML3/5S1588UVefPFFbrrpJr74xS9G2h4RqS6FUhesXr2aww47jAsvvJChQ4dyzjnn8PbbbwPwzDPPcMIJJ3DUUUdx0kkn8eqrrwJw8803c/TRRzNs2DDOPvvsbftPmjSJKVOmcOKJJ3LFFVfwm9/8hqamJpqamjjyyCPZtGkT7s7UqVM54ogjGDJkCPPmzQNg0aJFjBo1inPOOYfDDjuMCRMmUKwoPHDgQL797W9z3HHHcffdd3fpeo8//nh69eq1w/Ybb7yRG2+8cYftt912G2eccQYnn3wygwYNYtasWV06P8A999zD5z73OcyMkSNHsnHjxm1/tyJBy7bkaL7uMQ6c9gDN1z1GtiUXdZNSJ9FTwuNg1apV3HLLLTQ3N3PRRRfxwx/+kMsuu4xLLrmEe+65h969ezNv3jyuuuoqbr31VsaNG8fkyZMBmD59OrfccguXXHIJAH/84x955JFHqK+v5/TTT+eGG26gubmZN998k+7du7NgwQ
KWL1/Oc889x2uvvcbRRx/N8ccfD0BLSwsrV66kT58+NDc3s3jxYo477jgg/9zOE088sUPb586dy+zZs3fYfsghhzB//vyS/w6+8IUv7PS9p556ihUrVtCjRw+OPvpoTj31VIYPH77dPuPHj2fVqlU7/OyUKVP43Oc+t922XC5H//79t73u168fuVyOAw44oOT2ilQi25LjygWtbG7LjzjkNm7mygWtAJx5ZN8om5YqCqUu6t+/P83NzQBMnDiR73//+5x88smsWLGCMWPGAPlhs+KH5ooVK5g+fTobN27kzTff5KSTTtp2rHPPPZf6+noAmpubmTJlChMmTGDcuHH069ePJ554gvPPP5/6+nr2339/TjjhBJ5++mn22msvjjnmGPr16wdAU1MTq1ev3hZK48eP77TtEyZMYMKECcH8xRSMGTOGfffdF4Bx48bxxBNP7BBKxR5fKYo9wPY0207CMPvhVdsCqWhz2xZmP7xKoVRFCqUu6viBaGa4O4MHD2bJkiU77D9p0iSy2SzDhg3jtttuY9GiRdve69mz57bvp02bxqmnnsqDDz7IyJEjeeSRRzr9QC760Ic+tO37+vr67e5LtT9ue9XqKe1KZ38/HZXTU+rXrx8vv/zyttdr166lT58+VWmryK68snFzWdulMgqlLlqzZg1Llizh2GOP5c477+S4445j0KBBrF+/ftv2trY2/vjHPzJ48GA2bdrEAQccQFtbG3PnzqVv387/hfXSSy8xZMgQhgwZwpIlS3jhhRc4/vjj+dGPfsSFF17Ihg0bePzxx5k9ezYvvPBCRW2vVk/pBz/4AQBf/vKXd3hv4cKFbNiwgYaGBrLZLLfeeusO+5TTUxo7diw/+MEPOO+881i6dCl77723hu4kFH0aG8h1EkB9GhsiaE16aaJDFx1++OHcfvvtDB06lA0bNvDFL36Rbt26MX/+fK644gqGDRtGU1MTv/vd7wC4+uqrGTFiBGPGjOGwww7b6XGvv/56jjjiCIYNG0ZDQwOnnHIKZ511FkOHDmXYsGGMHj2a7373u3z0ox8N61I5//zzOfbYY1m1ahX9+vXjlltuAeCFF17YNkTX0XHHHcdnP/tZmpqaOPvss3cYuivXpz/9aQ466CAOOeQQJk+ezA9/+MMuHU+kVFNPGkRDpn67bQ2ZeqaeNCiiFqWT7WpIKE6GDx/uy5Yt227b888/z+GHHx5Ri/Kz70477TRWrFgRWRvi4LTTTmPBggV069Ztu+233XYby5Yt29aTClPUvxuSTtmWHLMfXsUrGzfTp7GBqScNKvd+Ukk3QDv7vEuBkq5dw3fSZffff3/UTRAJxZlH9tWkhoAplLpg4MCBNd9L2pVJkyYxadKkqJshIgmS+HtKSRl+lPDod0IkuRIdSt27d+f111/Xh5BsU6yn1L1796ibIiIVSPTwXb9+/Vi7di3r16+PuikSI8XKsyK7U4WJC1JliQ6lTCaj6qIiUpFsS46p85+jbUt+pCW3cTNT5z8HaNmgKCV6+E5EpFKz7lu5LZCK2rY4s+5bGVGLBBRKIlKj3ni7raztEo5ED9+JiJSreB9J4kmhJCI1o2P5ic40NmRCbJF0pOE7EakZnZWfaC9TZ8wcOzjEFklH6imJSM3YVZmJvpoSHgsKJRFJnZ09f7Sz8hN9GxtYPG10BC2VjjR8JyKpUrxvlNu4GeeDsuXZlpzKTySAQklEUmV3ZcuvHTeEvo0NGPke0rXjhmjILkY0fCciqdLZ8Fz77So/EW/qKYlIqtRb57XkdrZd4kU9JRFJheLkhi07qRqws+0SLwolEUm8Uh6K7dvYEGKLpFIKJRFJpPbTvuvMdtkTMtAMu4RQKIlI4nQsO7G7oTlH5SiSItBQMrNbgdOAde5+RGFbL2AeMBBYDXzG3d8Ish0ikg7F3tHOZtjtjIbukiPo2Xe3ASd32DYNeNTdDwUeLbwWEdml9g/FlkMPxyZLoKHk7o8DGzpsPgO4vfD97c
CZQbZBRNJhd4uptlec/q2HY5MnintK+7v7qwDu/qqZ7bezHc3sYuBigAEDBoTUPBGJo10tplrUkKnj+atPCaE11afPu7xYPzzr7je5+3B3H967d++omyMiEepTwn2h7h3WtUsSfd7lRRFKfzWzAwAKf66LoA0ikjCdLaba0UaVMk+8KELpXuDCwvcXAvdE0AYRSZjiYqq7Wi6olN6UxFvQU8LvBEYBHzGztcAM4DrgLjP7Z2ANcG6QbRCR5Mm25Jgybzlb2207dL+efOnEQ9mrYQ/e6KRHlKk3zbJLgUBDyd3P38lbnwzyvCKSXNmWHJfPW77D9hfXvdXpdoB9emSYcfpgzbJLAa3oICKxMvvhVWXtv0+PDC3f+lRArZGwxXr2nYjUnlKmfrfX2VCeJJd6SiISqWxLjln3rdwWLmbkF6uTmqRQEpHIdFxYFaDcskeNDZkqt0qipOE7EYnMVb9o3S6QKjFz7OAqtUbiQKEkIpGYnm3lrfdKW8tuZ7rVm2bcpYxCSUQicefSl7v083UG3z1nWJVaI3Ghe0oiEondFebr6ND9evL2e1t5ZeNm+jQ2MPWkQeolpZBCSUQiUb+bEuYdvf3eVhZPGx1giyQOFEoiEopi1dhiT+eg3j14cd1bJf98ucX9JJkUSiISuGLV2GKRvkoCZlcLsUp6KJREJFDZlhxT7lrO1i4+EFvuPShJJs2+E5HAFB+O7WogQb60uaSfekoiEoidrfZdiYZMvcpS1AiFkohURfuJDI09MlVZKNVA079rjEJJRLqs40SGagTSofv1ZOGUUV0+jiSL7imJSJfNfnjVtkCqBgVS7VJPSUS6rNwaSB311RCdFCiURKTLunIPaZ8eGa3UINto+E5EKpZtydE061cVB1Km3phxukpPyAfUUxKRimRbcky9+znaKnwISUN20hmFkoiUbXq2lZ8+uaain20+uBdzJx9b5RZJWiiURKQsE25ewuKXNlT0swok2R2FkoiUbHq2teJAun58k4bqZLcUSiKyW9OzrcxduoZK10RVIEmpFEoisktdGa4z4HsKJCmDQklEdirbkqs4kAD+ct2pVWyN1AKFkohsJ9uSY+a9K9m4uevr14mUS6EkItt0Zap3R6p/JJXQig4iAuR7SNUKJED1j6QiCiURAeCrd1WnIF+RJjdIJRRKIsL0bCtbqlCyvEhDd1Ip3VMSqXFdmfLdmUy9aehOKqZQEqlhXZ3y3dE+PTLMOH2whu6kYgolkRqTbckx676VVSlZDgoiqS6FkkgNybbkuHxedSc0tHzrU1U9ntQ2TXQQqSFXLvh9VY/Xs1t9VY8nolASqSGb27ZW7Vj1dcY1Zw2p2vFEQMN3IjVjera1y8cwwFHVWAmOQkkk5bItOb5293O8X2HZ8va04rcETaEkkmLVnNjQkKlTIEngFEoiKVXNxVUBrh03tGrHEtmZyELJzL4C/Av5IepW4PPu/k5U7RFJg2xLjtkPryK3cXPVjtnYkGHmWD2HJOGIJJTMrC9wKfAxd99sZncB5wG3RdEekTQYM2cRL657q6rHXK0ifRKyKKeE7wE0mNkeQA/glQjbIpJoE25eUvVAEolCJKHk7jngP4A1wKvA39z9Vx33M7OLzWyZmS1bv3592M0USYxqrl8n0dDnXV4koWRm+wBnAAcCfYCeZjax437ufpO7D3f34b179w67mSKxlm3J0XzdYwyc9kAgx9+nRyaQ40rn9HmXF9Xw3T8Bf3H39e7eBiwAPhFRW0QSJ9uS48oFrVWd0NDRjNMHB3ZskZ2JKpTWACPNrIeZGfBJ4PmI2iKSOLMfXsXmti2BHb+HnkmSiER1T2kpMB94lvx08DrgpijaIpJEQfaQMnXGd/RMkkQksueU3H0GMCOq84skURDTviH/LNLfNrfRR2vaScS0ooNIQoy4ZiF/3fReIMdePkM1kSQeVLpCJAGyLbnAAqlvY0MgxxWphHpKIjFUXC7olY2b6dPYwFvvvh/IeR
oy9Uw9aVAgxxaphEJJJGaK072Ls+uCmtSgmkgSRwolkZiZdd/KQKd71xnM+YzqIkk8KZREYiTbkuONt9sCO756RxJ3CiWRGJn98KpAj7942uhAjy/SVSXNvjOzejN7JOjGiNS6VwJ8KLaxQWvZSfyVFEruvgV428z2Drg9IjWteya4pzRmjtVadhJ/5QzfvQO0mtlCYNsj5e5+adVbJVKDpmdb2dy2NZBjXz9eExskGcoJpQcKXyJSZdmWHD99ck1gx1cgSVKUHErufruZNQAD3D3Yu7EiNSDbkmPWfSsDnW0HWrFBkqXkAWwzOx1YDjxUeN1kZvcG1TCRNMu25Pjq3c8FHkiZOtOKDZIo5dxVnQkcA2wEcPfl5CvHikiZZt23ki1bPdBzNDZkmH3uMA3dSaKUc0/pfXf/W74m3zbB/l8lkkJBPyC7+rpTAzu2SNDKCaUVZnYBUG9mhwKXAr8Lplki6ZRtyXH5vOWBHb9HgFPKRcJQzm/wJcBg4F3gTuBvwOVBNEokrYIMpDpDFWMl8crpKX3U3a8CrgqqMSJp074Ehe1+94oZWmRV0qGcULrNzPoCTwOPA79199ZgmiWSfB1LUAR1A7YOmKOHYyUlynlO6Xgz6wYcDYwCHjCzD7t7r6AaJ5Jksx9eFUgJijqDvbpn+NvmNvpo1W9JmZJDycyOA/6x8NUI3A/8NqB2iSReEMX5GjL1XDtuiEJIUquc4bvfAMuAa4EH3f29YJokknzTs8GMbCuQJO3KCaV9gWbgeOBSM9sKLHH3bwbSMpGECmodu4kjByiQJPXKuae00cz+DPQH+gGfAFSgRaSDbyz4fdWP2XxwL/7tzCFVP65I3JRzT+klYBXwBHAj8HkN4Yl8INuS46pftPJ2lctP7FFnzJ18bFWPKRJX5QzfHeruwRR7EUm44gKrQaxn9x/nDqv6MUXiqpxQ6mNm/03+vpKT7zFd5u5rA2mZSAIUH44NYqYdQPd6030kqSnlLDP0E+BeoA/QF7ivsE2kJhUfjg0qkABeuObTgR1bJI7KCaXe7v4Td3+/8HUb0DugdonEXlAPxwIcul9PrfYtNamc4bvXzGwi+cVYAc4HXq9+k0SSIYge0v57dmPpVWOqflyRpCgnlC4CfgB8r/B6cWGbSM2Ynm3lzqUvs8WDWclOgSS1rpznlNYAYwNsi0isTc+2BvJQbFFdkMuIiyREOc8pHQT8FzCS/Oy7JcBX3P3PAbVNJBaCnmFXdMGIAYEeXyQJypnocAdwF3AA+Rl4d/PB/SWRVCpWig06kHp2q9eKDSKUF0rm7v/bbvbdTwmuRIxILARZKba9a85SIIlAeRMdfm1m04CfkQ+j8eRrKvUCcPcNAbRPJDITbl4Synl6ZOr0gKxIQTmhNL7w57922H4R+ZA6qCotEomBbEuOxS8F/++sTL3xnXFDAz+PSFKUM/vuwF29b2Zj3H1h15skEr1Z960M7NjFSXaqGiuyo3J6Srvz74BCSRIv25LjjbfbAjv+X7RSg8hOlTPRYXf0lIWkwpS7gpvcMHGkpn2L7Eo1Q0kz8STxsi05Aqg+AeTXs9O0b5Fdq2YoiSTeVb9oDezYC6eMCuzYImlRzVBaXc7OZtZoZvPN7AUze97MVFpTIjM928rAaQ/w1nvBrPrdt7EhkOOKpE3JoWRmPczsm2Z2c+H1oWZ2WvF9dx9X5rn/C3jI3Q8DhgHPl/nzIlUR9Jp2AFNPGhTo8UXSopzZdz8BngGKPZq15Jcaur/ck5rZXsDxwCQAd38PeK/c44h0RVhr2jUf3EvTvkVKVM7w3cHu/l2gDcDdN1P5jLuDgPXAT8ysxcx+bGY9O+5kZheb2TIzW7Z+/foKTyWyozCqxtabMXHkAOZO1si07J4+7/LK6Sm9Z2YNFGbZmdnBwLtdOO/HgUvcfamZ/RcwDfhm+53c/SbgJoDhw4drdp9UTZBVY1UxViqhz7u8cnpKM4GHgP5mNhd4FPh6heddC6x196
WF1/PJh5RIKIIeshORypSzzNCvzOwZ8vWUDLjM3V+r5KTu/v/M7GUzG+Tuq4BPAn+o5Fgi5cq25AI7tmbZiXRNOUX+7iVfP+led3+rCue+BJhrZt2APwOfr8IxRXYrqHIUmXrTLDuRLirnntJ/kl8p/DozewqYB9zv7u9UcmJ3Xw4Mr+RnRcoVRinz2ecM0yw7kS4qZ/juN8BvzKweGA1MBm4F9gqobSJVEXQgNWTquXbcEAWSSBWUtUp4Yfbd6eR7TB8Hbg+iUSLVFFQgGSo/IVJt5dxTmgeMID8D7wZgkbtvDaphIl2VbckFWhdJJShEqq/cFR0ucPdgHu4QqaJsS46p85+jbUvNPu4hkki7DSUzG+3ujwE9gDPMtl/Ewd0XBNQ2kYpkW3J89a7n2OLBBZKmfosEo5Se0gnAY+TvJXXkgEJJYiPbkmPq3cEGUkOmXlO/RQKy21By9xmFb7/t7n9p/56ZHRhIq0QqNPPelbQFVaUPaGzIMHPsYE1sEAlIOfeUfs6OSwHNB46qXnNEKjfh5iVs3NwWyLF7ZOr4zrihCiORgJVyT+kwYDCwt5m1r5m0F9A9qIaJlGPojIf4+7vBzME5dL+eqhorEpJSekqDgNOARra/r7SJ/AO0IpEacc3CwAJp4sgB/NuZQwI5tojsqJR7SvcA95jZse6+JIQ2iZQk25Ljql+0BlrCXIEkEq5ySld8wcwaiy/MbB8zuzWANonsVnGWXVCBBCphLhKFciY6DHX3jcUX7v6GmR0ZQJtEdmvKvOUEuZxIt3rTpAaRCJQTSnVmto+7vwFgZr3K/HmRLhszZxEvrqtG5ZRd++45wwI/h4jsqNzSFb8zs/mF1+cC11S/SSKdCyuQmg/upV6SSETKKV3xP4XKsyeSXyB5nLurWqyEJoxAAlj9ukqli0SlrOE3d19pZuspPJ9kZgPcPbhCNSIFY+YsCu1cr2xUKEm0Nrz1XtRNiEzJs+/MbKyZvQj8BfgNsBr4ZUDtEtlmws1LQuslQb5GkohEo5wp4VcDI4E/uvuBwCeBxYG0SqQg25Jj8UsbQjufFlsViVY5odTm7q+Tn4VX5+6/BpoCapcI2ZYcl89bHvh5isVY+jY2qKy5SMTKuae00cw+DDwOzDWzdcD7wTRLal3QgdSQqeOdtq0qZy4SM+WE0hnAZuArwARgb+DbQTRKJMhAun58k0JIJKbKmRJevNO8Fbi94/tmtsTdj61Ww6Q2Tc+2cufSlwM9hwJJJL6quSKDylhIl0zPtvLTJ4N9wkBlzEXirZyJDrsTXLlPqQlBBxJokVWRuKtmKIlUJNuSY+C0BwI/T2NDRkN3IjFXzVCy3e8isr1sS44pdwU/7bshU8/MsYMDP4+IdE017yl9torHkhrxtbufY2vAA799Ne1bJDF2G0pmtonO7xcZ4O6+F/lvVlS5bZJyh131IO8HmEgqZS6SPKWUQ98zjIZI7SgO2QXZQ1IgiSRT2cN3ZrYf7aZ/a5VwKUcY074BBZJIQmmVcAlNWIGkZ5FEkkurhEsosi25UAJJq3yLJJtWCZdQTL07uGnf9ZZ/GkGrfIskXyWrhP8WrRIuZRgzZxFtW4M5tgEvXfvpYA4uIqErp6dUXCX8cuAh4CXg9CAaJekxPdsaaNVYVYkVSZeyVgk3s48CxwAbgIcLw3kiOxXkfSTdPxJJn3Jm3/0L8BQwDjgHeNLMLgqqYZJ82ZZcYMc2Q/ePRFKonHtKU4Eji70jM9sX+B1waxANk+QaOuMh/v7ulmBP4qqLJJJG5YTSWmBTu9ebgGCrsUniHHLlA7wfQhET3UsSSadS1r6bUvg2Byw1s3vIr4V3BvnhPBEAJty8JJRA0r0kkfQqpadUXPvupcJX0T3Vb44kVbYlx+KXNoRyLt1LEkmvUhZkndX+tZntmd/sb3b15GZWDywDcu5+WlePJ9G5fF7wNZEg/4CsAkkkvcqZfXeEmbUAK4CVZvaMmXW1atplwPNdPIZELI
yqsQCZOtOwnUjKlfPw7E3AFHf/B3f/B+CrwM2VntjM+gGnAj+u9BgSvbACqbEhw+xzh6mXJJJy5cy+61lY7w4Ad19kZj27cO7rga/zwT2rHZjZxcDFAAMGDOjCqaTaJty8JJR7SJl6Y/Y5CiNJv/afdx/5aO3+vpfTU/qzmX3TzAYWvqaTL2NRNjM7DVjn7s/saj93v8ndh7v78N69e1dyKglAWIEEKJCkZrT/vNuzsVfUzYlMOaF0EdAbWAD8ovD95ys8bzMw1sxWAz8DRpvZTys8loQsrECyUM4iInFScii5+xvufqm7f9zdj3T3y9z9jUpO6u5Xuns/dx8InAc85u4TKzmWhCfbkgvtHhLkH4ab/fCq0M4nItEr5eHZ+8h/PnTK3cdWtUUSS9mWXGjTvtt7ZePm0M8pItEpZaLDf3SyrRhSXR5hcfdFwKKuHkeC9dW7wg8k0HJCUrvuWLqGC0bU3gSvUkKpEejn7jcAmNlT5O8nOXBFgG2TmJiebWVLCMsHdaTlhERqTyn3lL4O3NvudTdgODAK+EIAbZIYybbkAq2J1FFjQwZDpc1FalUpPaVu7t5+NfAnCuUrXu/ic0qSAGEuHzT1pEEKIZEaV0oo7dP+hbt/ud1LPTyUYhNuXhL4Oa4f36QgEpFtSgmlpWY22d23W1LIzP4Vla5IrbCmfiuQRKS9UkLpK0DWzC4Ani1sOwr4EHBmUA2T6IS5np2ISHullK5YB3zCzEYDxVXBH3D3xwJtmUQirECqA2aO7eoi8yKSNiUvyFoIIQVRioUVSGYw5zO6lyQiOypn7TtJsQPDqolUb3xPgSQiO1FO6QpJqRHXLNz5OlJVtE+PDDNOH6xAEpGdUijVuOnZVv666b1Az5GpMxXoE5GSKJRq2IhrFgYeSIACSURKplCqUUNnPMTf390SyrkUSCJSKk10qEEjrlkYWiDVm0r1iUjpFEo1ZsycRaEM2RWdP6J/aOcSkeRTKNWYF9e9Fer5/u3MIaGeT0SSTaFUQ8IsZQ75lb9FRMqhUKoRYQeSCvSJSCU0+64GRNFDUm0kEamEQinFxsxZFNo9pIZMHc9ffUoo5xKR9FIopVTYvaPumfpQzyci6aR7SikUdiABbHy7LfRPyk7YAAAOQUlEQVRziqTdHUvXRN2E0CmUUibbkovkvH00005EqkChlDKXz1se+jk1005EqkX3lFIkimE7zbQTkWpSKKXEmDmLQj9nY0OGxdNGh35eEUkvDd+lRNjLB9UZzBw7ONRzikj6KZRSIIphuzkqaS4SilqbgafhuwSLIowg30tSIIlIENRTSqioAglgq0d2ahFJOYVSAkUZSKDVv0UkOAqlhInq4dgiPZMkIkFSKCVMFA/HFu3TI8O144bofpJIyGppsoMmOiRIVMN2ekBWRMKiUEqIKAJp4sgBKmcuIqFSKMXciGsW8tdN74V+3saGjAJJREKne0oxFlUgZepMqzWISCQUSjEWRSCZwexzh+n+kYhEQqEUU1HcQ8rUGxNGDGD2w6s4cNoDNF/3WORT0EWktuieUgxFEUj79Mhw6tAD+PkzOTa3bQEgt3EzVy5oBbSskIiEQz2lmIkikAxo+dan+PUL67cFUtHmti3MfnhV6G0SkdqkUBImjBwAwCsbN3f6/s62i4hUWyShZGb9zezXZva8ma00s8uiaEfcRNFLaj6417ap3312sqbdzraLiFRbVD2l94GvuvvhwEjgS2b2sYjaEgtRPRw7d/Kx215PPWkQDZn67fbRWnciEqZIJjq4+6vAq4XvN5nZ80Bf4A9RtCdKUT2L1LNb/Q4PxxYnM8x+eBWvbNxMHy0vJBIbdyxdwwUjBkTdjMBFPvvOzAYCRwJLO3nvYuBigAED0vcfI8oSFGd9vPOgOfPIvgohkQi0/7z7yEdr9//BSCc6mNmHgZ8Dl7v73zu+7+43uftwdx/eu3fv8BsYoKEzHor0/L9+YX2k5xeR7bX/vNuzsVfUzYlMZK
FkZhnygTTX3RdE1Y6o/P3dLbvfKUCaUScicRTV7DsDbgGed/c5UbQhSlFXjgXNqBOReIrqnlIz8Fmg1cyKVeu+4e4PRtSe0MQhkDSjTiSZOhb7S+PEh6hm3z1BfiGBmhJlIBX/sjWjTkTiLPLZd7Ui6h7S98Y3KYhEJPa0zFAIog6kiSMHKJBEJBHUUwpYlIHUI1PHd8YNVSCJSGIolAIUdQ/pD1efEun5RUTKpVAKSNSB1FdTvkVSr+NsvJ1J0iw93VMKwGFXRTuzXVO+RSSpFEpVlm3J8c4Wj+z8jQ0Zrh03RPeRRCSRNHxXZZfPW777nQJg5Iv1dVz5W0QkSdRTqqIo7yM5WmRVRJJPoVQl07OtUTdBi6yKSOJp+K4Kop5pV6RFVkWkoyTNvAP1lLosikCqM8jUbb90oGbciUgaqKfUBVH1kOZ8pglQ2XIRSR+FUoWyLblIztt+HTuFkIikjUKpQmFP/W5syDBz7GAFkYikmkKpAlEM2y2f8anQzyki8ZS0yQvl0ESHMkURSPVWc/UQRaRGqadUounZVn76ZGmLH1bb+SP6R3JeEZGwqadUgigDaaKWDhKRGqKeUgmiCKSGTL0WVhWRmqNQ2o0w7yHVGWz1fC0kPXckIrVIobQLYQaShulEpBRpnnkHuqe0UwokEZHwKZQ6EfZqDQokEZE8hVInoirUJyJS6xRKHYT9cGxjQybU84mIxJkmOhREsVJDps6YOXZw6OcVEYkrhRLRlaCYfe4wTfsWkbLcsXT75ybTNhuv5ofvxsxZFMl5+zY2KJBERDqo+VB6cd1boZ9TVWJFRDpX06EUxrBd88G9uH58E30bGzDyPSQtHyQi0rmavacUdCDVm3H+iP7bnkFSCImI7F5NhlKQgZSpgxe/c2pgxxcRaa/jxIc4qWQSRk0P3wVh9rlNUTdBRCSxaqqnFGQPSSt7i4h0Xc2EUpCB1CNTx+JpowM7vohIraiJ4bsgA6nO4DvjhgZ2fBGRWpL6nlKQgbRPjwwzTh+sITsRkSpJfSgFRTWQRER21NVlj1I9fBdUL0mBJCISjNT2lIIIpA/tUce/nz1Uw3UiIgFJZSgFEUjNB/di7uRjq35cERH5QGTDd2Z2spmtMrM/mdm0qNqxOwZcP75JgSQiEoJIekpmVg/cAIwB1gJPm9m97v6HKNrTGd03EhEJX1TDd8cAf3L3PwOY2c+AM4DIQ6mxIcPMsZrmLSLR6dWzW+qK95UqqlDqC7zc7vVaYETHnczsYuBigAEDqv8faPV1WjhVROIh6M+7pIjqnpJ1ss132OB+k7sPd/fhvXv37tLBO1IgiUicVPp5lzZR9ZTWAv3bve4HvFKtg//lulM5cNoD26WcFbaLiEh8RRVKTwOHmtmBQA44D7igmidQAImIJE8koeTu75vZl4GHgXrgVndfGUVbREQkPiJ7eNbdHwQejOr8IiISP6le+05ERJJFoSQiIrGhUBIRkdhQKImISGwolEREJDYUSiIiEhsKJRERiQ1z32HJuVgys/XA/5X5Yx8BXgugOXFSC9cIus40qYVrhM6v8zV3P3l3P2hmD5WyXxolJpQqYWbL3H141O0IUi1cI+g606QWrhFq5zqrTcN3IiISGwolERGJjbSH0k1RNyAEtXCNoOtMk1q4Rqid66yqVN9TEhGRZEl7T0lERBJEoSQiIrGRylAys5PNbJWZ/cnMpkXdnmoxs1vNbJ2ZrWi3rZeZLTSzFwt/7hNlG7vKzPqb2a/N7HkzW2lmlxW2p+06u5vZU2b2XOE6ZxW2H2hmSwvXOc/MukXd1q4ys3ozazGz+wuv03iNq82s1cyWm9mywrZU/c6GJXWhZGb1wA3AKcDHgPPN7GPRtqpqbgM6PlA3DXjU3Q8FHi28TrL3ga+6++HASOBLhf9+abvOd4HR7j4MaAJONrORwL8D3ytc5xvAP0fYxmq5DHi+3es0XiPAie7e1O7ZpLT9zoYidaEEHAP8yd3/7O7vAT
8Dzoi4TVXh7o8DGzpsPgO4vfD97cCZoTaqytz9VXd/tvD9JvIfZn1J33W6u79ZeJkpfDkwGphf2J746zSzfsCpwI8Lr42UXeMupOp3NixpDKW+wMvtXq8tbEur/d39Vch/oAP7RdyeqjGzgcCRwFJSeJ2FYa3lwDpgIfASsNHd3y/skobf3euBrwNbC6/3JX3XCPl/UPzKzJ4xs4sL21L3OxuGPaJuQACsk22a954wZvZh4OfA5e7+9/w/sNPF3bcATWbWCPwCOLyz3cJtVfWY2WnAOnd/xsxGFTd3smtir7GdZnd/xcz2Axaa2QtRNyip0thTWgv0b/e6H/BKRG0Jw1/N7ACAwp/rIm5Pl5lZhnwgzXX3BYXNqbvOInffCCwifw+t0cyK/1hM+u9uMzDWzFaTH0YfTb7nlKZrBMDdXyn8uY78PzCOIcW/s0FKYyg9DRxamOHTDTgPuDfiNgXpXuDCwvcXAvdE2JYuK9xzuAV43t3ntHsrbdfZu9BDwswagH8if//s18A5hd0SfZ3ufqW793P3geT/P3zM3SeQomsEMLOeZrZn8XvgU8AKUvY7G5ZUruhgZp8m/y+yeuBWd78m4iZVhZndCYwivyT+X4EZQBa4CxgArAHOdfeOkyESw8yOA34LtPLBfYhvkL+vlKbrHEr+5nc9+X8c3uXu3zazg8j3KnoBLcBEd383upZWR2H47mvuflrarrFwPb8ovNwDuMPdrzGzfUnR72xYUhlKIiKSTGkcvhMRkYRSKImISGwolEREJDYUSiIiEhsKJRERiQ2FkoiIxIZCSUJlZvub2R1m9ufCOmFLzOwsMxtVLG2wi5+daWZfK/N8b+7ivT5mNn9n77fb7xvlnLMUZjbczL5f+H6UmX2i2ucQSSKFkoSmsFpDFnjc3Q9y96PIP+nfL4r2uPsr7n7O7vek6qHk7svc/dLCy1GAQkkEhZKEazTwnrvfWNzg7v/n7v/dfqdCcbSsmf3ezJ4srH5QNMzMHisUTptc2P/DZvaomT1bKLRWUqkSMxtYLJhoZpPMbIGZPVQ49ncL268DGgrF2+YWtk0sFOhbbmY/KtTwwszeNLNrCoX7njSz/QvbzzWzFYXtjxe2jTKz+wsroX8B+ErheP9oZn8prP+Hme1VKCCXKftvWySBFEoSpsHAsyXsNwtocfeh5Hsp/9PuvaHk6/McC3zLzPoA7wBnufvHgROB/7TKlhVvAsYDQ4DxZtbf3acBmwvF2yaY2eGFfZrdvQnYAkwo/HxP4MlC4b7HgcmF7d8CTipsH9v+hO6+GriRfNG7Jnf/LfnFWU8t7HIe8HN3b6vgekQSR6EkkTGzGwq9h6c7vHUc8L8A7v4YsK+Z7V147x533+zur5Ff2PMY8uUQvmNmvwceIV+fZ/8KmvSou//N3d8B/gD8Qyf7fBI4Cni6UAvpk8BBhffeA4r3xZ4BBha+XwzcVujZ1ZfQjh8Dny98/3ngJ2Veh0hipbGeksTXSuDs4gt3/5KZfQRY1mG/XdXc6bhYo5PvqfQGjnL3tkKphO4VtK/9oqBb6Pz/DwNud/crO3mvzT9YTHLbz7v7F8xsBPnez3Iza9pVI9x9cWFo8QSg3t1XlHshIkmlnpKE6TGgu5l9sd22Hp3s9ziFIbHC6tKvufvfC++dYWbdCyswjyJfqmRv8sXk2szsRDrv4XRFW7t7Oo8C5xSKuRXvf+3yfGZ2sLsvdfdvAa+xfb0vgE3Anh22/Q9wJ+olSY1RKEloCr2IM4ETCjfznyJfvuGKDrvOBIYXhuOu44OaNABPAQ8ATwJXF4qrzS3sv4x8mFW76udNwO/NbK67/wGYTr709e/JlzE/YDc/P7swAWMF+cB9rsP79wFnFSc6FLbNBfYhH0wiNUOlK0RiyMzOAc5w989G3RaRMOmekkjMmNl/A6cAn466LSJhU09JUs/MhlCYzdfOu+4+Ior2iMjOKZRERCQ2NNFBRERiQ6EkIiKxoVASEZHYUC
iJiEhs/H+YZxPd5o8RQwAAAABJRU5ErkJggg==\n",
"text/plain": [
"
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Joint distribution of 'Global_intensity' and 'Global_active_power'\n",
"sns.jointplot(x='Global_intensity', y='Global_active_power', data=data)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.\n",
" warnings.warn(\"The 'normed' kwarg is deprecated, and has been \"\n",
"/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/matplotlib/axes/_axes.py:6462: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.\n",
" warnings.warn(\"The 'normed' kwarg is deprecated, and has been \"\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAaUAAAGoCAYAAADmTPpwAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzt3X+Yk+WZL/DvPSFABpVBnW0lQEFtQRGYqaPSHdcKrWKL4ixqqcW2bvfo1tO1Bd2p45ZWtO7K7pxWt1t7vPTU1R6pO4I0otjSH6A9UKEFZ6ZAhVorRaK7ohCrTHQymfv8kbwhybxv8r759b5Jvp/rmstJJj8e4kzuPM9zP/ctqgoiIiIvaHB7AERERAYGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8oxRbg/AAZaeIKJqJ24PwOs4UyIiIs+oppkSkaf9cPsB0+s/c96UCo+EqHpxpkRERJ7BoERERJ7BoERERJ7BoERERJ7BoERERJ7BoERERJ7BlHCiMjNLFWeaOJE5zpSIiMgzOFMicgEP2hKZY1AiKoBVUCGi4nD5joiIPINBiYiIPINBiYiIPIN7SkQewgQIqncMSkQ5MKGBqLK4fEdERJ7BoERERJ7BoERERJ7BoERERJ7BoERERJ7B7DuiKsBK41QvGJSIkpj+TeQ+Lt8REZFnMCgREZFnMCgREZFnMCgREZFniKq6PQa7qmag5G21ntDArDxPE7cH4HWcKRERkWcwKBERkWcwKBERkWcwKBERkWewogPVtFpPaiCqNQxKRDWGdfKomnH5joiIPINBiYiIPIPLd1QTuHdEVBs4UyIiIs/gTImoDljNJJkAQV7DoERVhct0RLWNy3dEROQZnCkR1TGeaSKvYVAiz+JSHVH94fIdERF5Bpv8kes4I6oOXNYrCTb5y4PLd1RRDEBElAuDEhHZwqQIqgQGJSoLzoiIqBDcU6KiMQBRNs6gLHFPKQ8GJbKNwYeKxWDFoJQPg1KdY6Aht9VZoGJQyoNBqQL4xk9UeR4NdgxKeVRNUBKRnwA42YWnPhnAGy48bzE45sqotjFX23iB2hvzG6p6SSUHU22qJii5RUR2qGqb2+NwgmOujGobc7WNF+CY6xHLDBERkWcwKBERkWcwKOV3v9sDKADHXBnVNuZqGy/AMdcd7ikREZFncKZERESewaBERESewaBERESewaBERESeUTVB6ZJLLlEkSg3xi1/84le1ftlSo+93tlRNUHrjjWqrNEJEVJh6fr+rmqBERES1j0GJiIg8g0GJiIg8Y5TbAyDyilgshoMHD+Ldd991eyhU5caOHYtJkybB7/e7PZSqw6BElHTw4EEcf/zxmDp1KkTYi40Ko6p48803cfDgQUybNs3t4VQdLt8RJb377rs46aSTGJCoKCKCk046iTPuAjEoEaVhQKJS4O9R4RiUiIjIMxiUiMh1qoovf/nLOP300zF79mw8//zzOW+/aNEinHXWWRUaXSanYyVnGJSIatzQ0FBZHjcej5fssX784x/jxRdfxIsvvoj7778fN9xwg+Vt161bh+OOO65kz+2Uk7GScwxKNSDUG0b7qk2Y1rUB7as2IdQbdntIVID9+/djxowZ+PznP4/Zs2fjyiuvxMDAAABg586d+OhHP4qzzz4bCxYswGuvvQYAeOCBB3DOOedgzpw5uOKKK1K3v/baa3HTTTdh3rx5uOWWW/Dss8+ipaUFLS0taG1txdtvvw1VRWdnJ8466yzMmjULPT09AIBnnnkGF154Ia688krMmDEDS5cuhdEMdOrUqbjjjjtw/vnnY82aNSX7tz/xxBP43Oc+BxHB3LlzEYlEUv/GdO+88w6+/e1vY8WKFRnX33fffbjvvvtG3P6hhx7C5ZdfjksuuQTTp0/H7bffXrGxUmGYEl7lQr1h3LpuF6KxxKfWcCSKW9
ftAgB0tAbdHBoVYN++ffj+97+P9vZ2fOELX8D3vvc9fOUrX8GNN96IJ554As3Nzejp6cHXvvY1PPjgg1i8eDGuu+46AMCKFSvw/e9/HzfeeCMA4Pe//z1+/vOfw+fz4bLLLsO9996L9vZ2vPPOOxg7dizWrVuHvr4+9Pf344033sA555yDCy64AADQ29uLPXv2YOLEiWhvb8fWrVtx/vnnA0icwdmyZcuIsa9evRrd3d0jrj/99NOxdu3anP/ucDiMyZMnpy5PmjQJ4XAYp5xySsbtvv71r+Pmm29GY2NjxvVf/OIXLR/717/+NXbv3o3Gxkacc845WLhwIdra2jJus2TJEuzbt2/EfW+66SZ87nOfK2isVBgGpSrXvXFfKiAZorE4ujfuY1CqQpMnT0Z7ezsA4JprrsF3vvMdXHLJJdi9ezcuuugiAIllM+MNcPfu3VixYgUikQjeeecdLFiwIPVYV111FXw+HwCgvb0dN910E5YuXYrFixdj0qRJ2LJlC66++mr4fD68733vw0c/+lH85je/wQknnIBzzz0XkyZNAgC0tLRg//79qaC0ZMkS07EvXboUS5cuLejfbczE0mVnsPX19eEPf/gD7r77buzfv9/2Y1900UU46aSTAACLFy/Gli1bRgQlY5ZYqrFS4RiUqtyrkaij68nbst/cRASqipkzZ+K5554bcftrr70WoVAIc+bMwUMPPYRnnnkm9bNx48alvu/q6sLChQvx9NNPY+7cufj5z39u+uZqGDNmTOp7n8+XsS+V/rjpnMyU7r33XjzwwAMAgKeffhqTJk3CK6+8kvr5wYMHMXHixIz7PPfcc9i5cyemTp2KoaEhvP7667jwwgsz/s1mzF7TbE5mSnbGSoXjnlKVm9gUcHQ9eduBAwdSwefRRx/F+eefj+nTp+PQoUOp62OxGPbs2QMAePvtt3HKKacgFoth9erVlo/70ksvYdasWbjlllvQ1taGvXv34oILLkBPTw/i8TgOHTqEX/7ylzj33HMLHvvSpUvR19c34sts6e5LX/pS6ucTJ07EokWL8IMf/ACqim3btmH8+PEjlsNuuOEGvPrqq9i/fz+2bNmCD33oQ6mA9N3vfhff/e53Tcf1s5/9DIcPH0Y0GkUoFErNRNP19PSYjj07IAGwNVYqHINSletcMB0Bvy/juoDfh84F010aERXjjDPOwMMPP4zZs2fj8OHDuOGGGzB69GisXbsWt9xyC+bMmYOWlhb86le/AgB885vfxHnnnYeLLroIM2bMsHzce+65B2eddRbmzJmDQCCAT3ziE/jrv/5rzJ49G3PmzMH8+fPxr//6r3j/+99fqX9qhk9+8pM49dRTcfrpp+O6667D9773vdTPWlpa8t5/7969qSW6bOeffz4++9nPoqWlBVdcccWIpbtSjpWKJ7mm8F7S1tamO3bscHsYnhTqDaN74z68GoliYlMAnQumcz+pAC+88ALOOOMM155///79uPTSS7F7927XxlCtLr30Uqxbtw6jR4/OuP6hhx7Cjh07LGdR5WTx+2Rr86lG3+9s/du5p1QDOlqDDEJU15566im3h0AlwqBE5BFTp07lLKnErr32Wlx77bVuD4Mc4J4SUZpqWc4mb+PvUeEYlIiSxo4dizfffJNvKFQUo5/S2LFj3R5KVeLyXQ1i4kNhJk2ahIMHD+LQoUNuD4WqnNF5lpxjUKohod4wVq7fg0g0lrqOZYfs8/v97BRK5DIu39UIowZeekAyGGWHiIi8jkGpRpjVwEvHskNEVA0YlGpEvqDDskNEVA0YlGpEvqAzb0ZzhUZCRFQ4BqUaYVYDL93mvcwoIyLvY/ZdjTAy65b19Jn+nHtKRFQNOFOqIR2tQQTZyoKIqhiDUo0ppJVFqDeM9lWbMK1rA9pXbUKoN1zuYRIRmeLyXY0xlvHMKjqYVXoAgFvX7Uqlk/OwLRG5if2U6oRxuDb9LFPA78NYfwOODIw8cBtsCmBr1/xKDp
GoHrCfUh6cKdUJs8O10Vjc8sAtEyOIyA0MSlWkmEKrToMMEyOIyA0MSlUie/nN2PvZ8afD2Lz3UN5ANbEpgLBJYGoK+PHe0PCIZb1ciRFEROXC7LsqYbX8tnrbAYQjUSiOBSqz7DmrrLyVi2birsWzEGwKQJDYS7pr8SwmORCRKzhTqhJWy2/ZaSpGRfDsoGKWlTdvRjNuf3JPKtGhKeBn7yUiclVZg5KIPAjgUgCvq+pZyetOBNADYCqA/QA+papHyjmOWmC1/GbGKoB1tAZTASfUG0bn2n7E4sfCWiQaQ+ea/tRtiYgqrdzLdw8BuCTrui4Av1DVDwL4RfIy5WG2/GaVX2knSaF7476MgGSIDStufqyfB2iJyBVlnSmp6i9FZGrW1ZcDuDD5/cMAngFwSznHUW1yZdllL789vjOcsdfkbxAMDA5hWteGnIkPubLx4qo8QEtErnBjT+l9qvoaAKjqayLyF1Y3FJHrAVwPAFOmTKnQ8NxllWUHZC6/Gdo+cGIqUI0P+HF0cCi1R5SrOkO+5UCrvSkiKo96fL8z4+nsO1W9X1XbVLWtubk++gFZZdlZtTPvaA1ia9d8vLxqIcaNGTViSc7qvp0LpsPvy33AmgdoiSqnHt/vzLgRlP5bRE4BgOR/X3dhDJ5lFQjsBAgn9+1oDaL7yjmY0Oi3fDweoCWiSnMjKK0H8Pnk958H8IQLY/Asq0BgJ0A4vW9HaxC937gY9yxpcVxZnIioHMoalETkUQDPAZguIgdF5G8BrAJwkYi8COCi5GVKKqT1RLH3XbPjQFayBHiAlohcUe7su6stfvSxcj5vpWRnyc2b0Wyr5E8uuVpPlOK+K0K78Oj2VxDPUR0+NpzoYNu9cR8P0xJRRbF1hU3pb+Y+Ecw9dQKeP/CWZZVtIDFLMWYcxRRTzcXJ464I7cIj2w44evz0fwMRFY2tK/JgmSEblj7wHLa+dDh1Oa6acdlKeuZbqRvphXrDGSWC7Dzuo9tfcfw8TA0nokrydEq4F4R6w7YCkJVXI1HHad52xtS5pt+0OV+ux821ZJdLOBJ11Cad7dWJqFCcKeVRaOAwTGwKFJXmbWbl+j2IDVsHGKvH9YkUFZjszO7yHf4lIsqFM6U8ijlAamS+FZPmbSYSHTlDsvO4V583uaDnM9iZ3ZV6VkhE9YVBKY9cgaP9tBMz+hBdM3eKaV+iYtK8ncr1uHd2zMI1c6fAJ8f2G43vfCJoP+1ENAWsD9MCiSCda3mu1LNCIqovXL7Lo3PB9IzlKEP7aSdi9XUfsfUYuVK1C8nKm9DoN91PEsl/vujOjlm4s2PWiOuzl92sjA/4cy7PWdXUY3UIIrKjroOSnYBg99xQdjZcU8CPlYtmpm5nVky10P2X2y6bOaIXkt8n6L5yTsH7NmbLbtkCfh9EYLk8Z8wKs4Mbq0MQkV01H5SsAo+TgGAWULKfo5CGebn2X3I9XzEHbK3kWl4TIPUcy3v6ct6/HGMjovpR00EpV+ApNCCYydcwDzAPTFaBIByJouX2n6YSGiY0+nHbZTNTz1WON3urZbdgUwBbu+anLndv3Jd3eS5fECcislLTQSlX4MkVENpXbXL0xl9ow7xcPY3SM+yODMRw02N9SM8CL3Wqda5lt/TZ5viAH36fZARhLs8RUanUdPZdrkywXBvv4UgUimNv/PkOf+bbxM/V08hui3OzY0lWj1vI4dWO1iDuWjwrI3vwirODWLl+D5b19KVek0g0Bmhi9padZUhEVKyaninlygSbN6PZVh04O0t6nQumj9hTymbV0wjIXJLL1Q3WTPbt7e6VWe21GbfJlY0XG1Y0jh6F3m9cbDmuctX6I6LaVtMzpVzngzbvPWT7cfKdsbHTMK9BxHTGkt45dmvXfAQdpk6nnzkC7B1eNQJOrhlhvmy8XK+J2eMv7+nDVJYdIqI8ajoomS1JGUtNTg5z2jlj09EaxG2XzbQ8fGrsLeV7Q3a6NxNXzVims3N49fYn9+
QMXKHecN4Z2/gch2zNApoxh7S7JEpE9amml+8A60wwu0tldjfx7Rw+tZvuvXL9nrylhNKlz3bGB/ym91UAU7s2oNHfgIHYsOnjhJPVGoxU9lyODg4h1Bs2PXuV73UtpvI4lwWJalvNByUrnQumo3NNf87CpkEHb3p2Dp8C9srtrFw001Z1hWzRWBxj/Q0I+H2W97UKSIbbn8xd7NUQi2tqZmUEiaZGP955d8jWWAspO8Rir0S1r26DkvEmlj0rya7EYJfdN1mzpUCzT/93LZ5leSYol8hADHcvaSnovgBMyxdZMYKCESSc3LeQskOlPFtGRN5Ut0EJKO0hTzvLgWZLgVaf/u9aPAtbu+ajfdUmR8FFkWhl7jRholBOZ3NA4eeaWOyVqPbVdKJDJZll+vl9gqZA7vM8+bLlCn3DDUei9noPV0BTwG+abOJUqVuAEJH31PVMqZQKrfmW79N/IWeXDFY7Qw0CjB1lnfBgZBCmL2uOG+2D39fgKAEDSHzq+fO7MUSiMfhEMG9Gc8GzUxZ7Jap9DEolVMhyYL5WD3YP+eaS3uoifc9sRWgXVm87kBG8An4fVi4yr7MHIG8Chq9BcPyYUXgrGsOoBiA2jFR0jKvikW0H8Mi2A46SSAws9kpU+xiUXJbv07+TQ75mmgJ+08oLod4wHt8ZzghIAuCKsxNv8Nn7XMt7+ixnXuniw4pINIZxo304OmgdvArNnGOxV6Laxj0ll+U64AuMLCPklFhsLFkdcN2891DOw6925QpIBrZJJ6JsnCmVmVm6NzByCSq9PUT6fQXOA0K6iEWadq4q6ZVUSFV2IqpdnCmVkVkNuJvX9GdU3TaWxlaEdo24f/fGfUUFJCAR0MzqzXkpY83Oa0FE9YEzpRIxmxGZLYPFTaolKIDV2w6g7QMnZswSSnX+xmz/xmwvywusXguAJYaI6oGoFvtZvDLa2tp0x44dbg/DlFndu1ylfqxkd3l1enA2H58IhlUxsSmAqScF8NwfD5v2afKC7NfC6jW+a/EsAMzIo6ph6/igl9/vimDr386gVAKlCh4C4OVVC1OX7RR5rVUCpMolGTNGs9/UCY1+vBsbzniNjH24QtLOicqMQSkPLt+VQKlmM9n7PNnncqrj40NpNDX68xbMBczr7WW3yQBYsJWoWjAolYBPBPEiZ5xWlQnSz+WUejnPy5wUd80lGotj5fo9XN4jqhLMviuBYgMSAIz15/9f0blgOvy+3DNg46dNAX/OTrj1JBKN5eyyS0TewZlSCQQtSgU5OWN0ZCCWd6nJuH5ZT5/pz7knZQ/bXRB5F2dKJWBWITzg92Hp3CkZlRoa88yGjDfLUG8Y7as2ZbQ5N3S0Bi3bUmTvSdltPFiP2O6CyJs4UyoBu4VCp3VtyPtY2Y3z7J4xMtuTKsUbb8DvQ4PYKxvkJYJEkB4YHDLdn2ri0iaRJzEolYidQqF221CY9Ve6/ck9qce3GwSLaXsBJN7YPzxlPJ774+GCH8MN6WecQr1hdK7tRyyeuZB6ZCCG1jt+itsuM6+IzqU9IncwKFXQvBnNI1pF2HVkIIZQbzgjMOV64wz1hnH0vaECR5qgALa+VF0BSZB4nYFjFSCyA5LhyEAMnWv7AUUq9Zxp5ETuYlCqELNWEU7Z3ZwP9YZtnfGpRQrg8Z2JPbjHd4bz7qmZBSwmQhC5h0GpQkqRdBCORHHarU8jrpqzWsHK9XvqMiAZorE4Ht3+SlGp+kyEIHIHs+8qpFRvcsYbba7zNk5blteiYs+OeamKOlE9YVCqEKs3uWBTAPcsabFXFCpLNBbHzY/18yBoEfw+gb8h89W3qq5BROXHoFQhVmeZjCW4pXOnjAhMdgJVXHXEjCnfeShKCDYF0H3lHHRfNcey8y8RVZZr714islxE9ojIbhF5VETGujUWu3Idas31MyB/2/M7O2bh7iUtGT+/e0kLfFb9zNOktxUP9Ybrej/JrvQPBG7I9/tCVK9caV0hIkEAWwCcqa
pREXkMwNOq+pDVfdwu5W5VsmdCox8LZ58yItPL6PVT7JveVBsHbg3jRvuq7pCrm5oCfqxcNNOyT1OpAlaoN4yV6/fk3Osr9XOSZ7F1RR5urvOMAhAQkVEAGgG86uJYAOT+9GqVPXdkIIbV2w6YHng1Zi/FsCopZIYByZlINIZlPX1l+38HHEvPz5d8UsrnJKpmrqSEq2pYRP4XgAMAogB+qqo/dWMshuyZUPYhylzZc1ZzzWIy7oyDn/XSqsJrSpUt2b1xn+3lVKahE7kUlERkAoDLAUwDEAGwRkSuUdVHsm53PYDrAWDKlCllHZPZTCj9EGUhJXvspBUbwSe9xA0AVvd2WbEp4YV8qGAaen2r5Pudl7l1ePbjAF5W1UMAICLrAPwlgIygpKr3A7gfSKyxlnNAVp9SjevNiqCmy25TYZVWnB6Emhr9eOfdoRElbsaMamBAcpEARaWEF9IyhGnoVMn3Oy9zKygdADBXRBqRWL77GABXd/WsZkLGp1djA9pswzrg9+GKs4PYvPfQiKKeuYKQWfXqaCzOgOQyRXF175xW75jQ6Mdtl81kkgMR3NtT2i4iawE8D2AIQC+SnxDcYqcdhFEE1WzJzewNJfsTc6lafFN5Fdux1+neUOPoUQxIREmu1b5T1dsA3ObW82ez2w7CuK2dNxE22fO2CY1+0w8Kb0UzK7I75XT/MRyJon3VJrbMIAILsmawG2zsYjaVdxk9l878+o8xEBvO+NmwIqN/lR3ps+fxAT/8PrFsmWGGLTOIEliPpozGB9jd1KuMZdnsgGRwstRqLNOGI1EoEuefnAQkA88qEXGmVBJWad1HB0c22WsQQNX6bBOVX1Pyw0L7qk0lebxSLtNydk31jkEpS/YyjAgQGYhZ7jGZHbpd1tOHBkksA2UL+H0YYOUF1/gbBJfOOSVvynbAQVHbUgaSBhFM69pQsbbsdpN2iCqFQSlNdoBJT/02gs2OPx3GnR2zUtdbfUq2OsR/dDCOpoCfPY9cEhtWPLLtQN7bjc2q6J5LIQerrWT3ywJKs8dk55A297XIC1wpyFqIShQobF+1ydGbS6EFUMeN9mFYwcw8DxMAL69aaOu2hRyWtctIyCiG2fgCfh/G+htM985K8ZxkiQVZ82CiQxqnyzCFFkA9OhjHXYtnFX0ehsrHScmf9LYkQO6/vIC/AeNGZ87CmnIkxJRiadCqhJZVMgf3tchNtpbvRMQHYKOqfrzM43FVKZdh8rn9yT14592RiRDkvkJK/qQfJ8hVxSMaG0bA78M9S1oylsisZumlqIfnNMiwBh+5ydZMSVXjAAZEZHyZx+Mqs+6w5XJkIMZmfB4UbArgirOD6N64r+AGfB2tQWztmo+XVy1E4+hRI/4/m6V+5+pMXCyrINMU8JftOYkK5STR4V0Au0TkZwCOGleq6pdLPiqXpFd1YMuI+hNMJgBkb/4vN0lwsZKdUGD1exSORDOqRjipKOKUVQmtlYtmlu05iQplO9FBRD5vdr2qPlzSEVlwY+PPTsdQqi3BHIEke8ktm1lCQXb1+GxG91umftcNJjrku5GT7DsRCQCYoqoVP3bu5v+kaV0beNiVMKHRj95vXGz5c6t9oXyByW4rdAaWmsCglIft7DsRuQxAH4CfJC+3iMj6wsZWXZxs/Np61akq5Ss9ZJVQkO8DjZ3yQtmljIwzRfn2u0K9YbSv2lTw/hhRpTlJCV8J4FwkOsVCVfuQ6Bxb88w2of0NAr8vMwQF/D7OqOpYMVlr4Ug0Z+DI1RnZSqGBjMhNThIdhlT1LZGMN+K6eA+22oQ2u45JErXLrPTQitAuPLr9lVQlhmKkBw4gs6pCvs7IZnIFMi77kVc5CUq7ReQzAHwi8kEAXwbwq/IMy3us2lpkX7fjT4dtlbGh6jMY14yMuRWhXWX5f20WOPJ1RjZTSCAjcpuT5bsbAcwE8B6ARwG8BWBZOQZVzTbvPeT2EKhM4sOasVz26PZXyvZc2QHI6gzdwO
CQ5XKcVcDi4VjyMiczpfer6tcAfK1cg6kFxX4KtaouTt6Q/v+30CW7fNl4ZoxZU/YRhSMDMcsiqlbnk/IdjmWWH7nJyUzpIRF5SUT+U0T+p4jkP0lYh3J9Cp3Q6EdjjpYIfp/gM+dNKcewqESa0uoVFpppWehnjo7WIMaNGfk50irhIb0mnyBxBitf6jmTI8httmdKqnqBiIwGcA6ACwFsEJHjVPXEcg2uGll9Ok1/Mwj1hrH8sT5kf9COxRWb9x7KeYCT3GX8Pwv1htHQIIiXaVorSJx7yp6tON0nstoLtcLkCHKb7aAkIucD+KvkVxOApwD8vzKNq2qZZerNm9GM7o37sLynL3XZauUnHImi/bQTGZQ86q3k0ln3xn0FB6SmgB9HB4dytkxvaJDU70B6Rl4hCQ9OMDmC3OZkT+lZADsA3AXgaVUdLM+Qql92xejsWmqr82RsbX3pcNnHSIVpEEGoN1zwm7Qg0Twy19LfmFENeG9oOOM6Y7ZS6D6RXeUOekT5ONlTOgnAHQA+AuAnIvJzEflmeYZVO8yWQ5jHUL3iqlje05ext+SEZv3XTHZAMrwaiRa0T+REOauVE9nhZE8pIiJ/BDAZwCQAfwmAXery4LJH7VEAf47GEPD7Kto92JitON0ncqKc1cqJ7HCyp/QSgH0AtgC4D8DfcAkvv0o2DqTKiStw1+JZFa3gMW9Gc1H3t5vqXc6gR5SPkz2lD6qq+bpCHTP+0MORKHwiiKum+vJ0tAbRuWA6lvf0ccmuRlWytFTPr19B2wdOLChgmO1tWp1vInKTk6A0UUT+HUA7EisYWwB8RVUPlmVkHpX+aXOsvwHR2LE4bRymzP6DX9bT58pYqbzK9f/V6gB1bFixcv2egoKI3VRvHpwltzlJdPgPAOsBTAQQBPBk8rq6kX2wMD0gZYvG4rj5sX5M69oAn7ChBdkzZlQDvv2pFsufF9pw0k6qNw/Okhc4CUrNqvofqjqU/HoIQHGL3FXG7NNmLnFVKAovR0P1Z3BouCwzEzt18Appj0FUak6C0hsico2I+JJf1wB4s1wD8yJm0lG5TWwKINQbhtXkekKBqeh2Ur15cJa8wElQ+gKATwH4r+TXlcnr6kYpDhAa50uIsgWS7ChuAAAgAElEQVT8Pkw9KZBIjDGZXPt9gtsum1nQY9s538Sq4uQFTs4pHQCwqIxj8bx5M5pt9c8xsvDMpGfpEaWLxuKW1Tx8Iui+ck5GpRCnCQn5Ur3nzWjG6m0HMjJFA34f5s1oNq3DR1QOtmdKInKqiDwpIodE5HUReUJETi3n4LwmV6+kYFMA9yxpwf5VC/GtT83JORtiQCKnhlVHlK4qZUJCqDeMx3eGMwKSAPjwlPF4fGeYyQ9UMaI23yBFZBuAe5Fo8AcAnwZwo6qeV6axZWhra9MdO3ZU4qksTevaYHreSAC8vGphxnXl6kpK9S3YFMDR94ZMs/CCTQFs7Zpf0OO2r9rk+KxVkLOmQthavffC+10Z2Pq3O9lTElX9v2nZd4+gzsq42V1zD/WG2YGWyiIciVqmhReTkFDIfTlronJwEpQ2i0iXiEwVkQ+IyFeR6Kl0oojURU8lswwmQbLdxKpNCPWGM5ZWiCrJyNxrX7UJ07o2pH4n7d63EEwZp1JzUtFhSfK/f5d1/ReQmDHV/P5SerHKcCSa0dba+NSYqPJQuSKdRMCxhIRCSwl1LpiOm3r6UEgdMaaMUynZnimp6rQcX6eKyEXlHKhXdLQGsbVrPoJNgRFrl9FYHEcGCjtxT+TEmFENI9K7N+89VPDh147WYMH93ZkyTqXkZPkun38p4WN5Hj8dkpveGxpOZcS9+lYUy3r6LJeM7f6uFtJIl72WqNRKGZTq6kwoPx2SV+RLoLX7u+q0RmOpGwwSAaUNSnWViWeV9EDkJU5mMlefN9n24xrljpb39DlKqCDKx0miA6XJ7tA5PuAvuIIzUTn4RHDF2fYb9t3ZMQsARlR1MHNkIJbaP2
VvJiqlUs6U9pfwsaqCkfRw95IWvDfE/ofkLXFVPL4z7GgWc2fHLLy8aiHuWdLiqPgrU8OpVJyUGWoUka+LyAPJyx8UkUuNn6vqYidPLCJNIrJWRPaKyAsi8hEn9/cSpy0tiCql0GDR0RpE7zcuxj1LWkYsU1th8g+VgtMmf+8BMILHQQB3FvHc/wbgJ6o6A8AcAC8U8Viu4h8jeVkxv59m1cWbAuYzqPEW1xM54WRP6TRVXSIiVwOAqkZFCmupKiInALgAwLXJxxoEMFjIY3nBxKYAKzhQxRg15wBkVAofGBwyPSdXbKZodnXxUG8YnWv6EcvKIT86OIRQb5j7SlQUJzOlQREJIJllJyKnITFzKsSpAA4B+A8R6RWR/yMi47JvJCLXi8gOEdlx6JB3a8mZZeIRlYNRdNUIFFu75uPlVQuxtWs+brtsZt5GfqXQ0RrEcWNHfp6NxZX7SkWolve7cnMSlFYC+AmAySKyGsAvAHy1wOcdBeDDAP63qrYCOAqgK/tGqnq/qrapaltzs3c7r5stcbSfVhflAKnCwpGoZeKCnUZ+pRKxqFzCpezCVcv7XbnZbl0BACJyEoC5SBzJ2aaqbxT0pCLvT95/avLyXwHoUtWFVvepxlLuSx94zrJpG1ExGv0NiMaGXWu6Z9Xqopj2GXWCrSvycJJ9tx7AxQCeUdWnCg1IAKCq/wXgFREx1hU+BuB3hT6eF4V6w/j1y0fcHgbVqIHYsKtN98yWrFlyiErByfLdtwD8FYDficgaEblSRMYW8dw3AlgtIr8F0ALgn4t4LM9ZuX7PiI1gonIo1xmhXG0wKrlUSPXFdvadqj4L4FkR8QGYD+A6AA8COKGQJ1bVPgBthdzXq0K94VQ2FMMRVdKryb2m25/ck8rAawr4sXLRzIIChdEXLFcbjOysPKJScFRmKJl9dxkSvZU+DODhcgyqmhiBKLu/ElEljQ/40bm2H7H4sd/ASDSGzjX9AJyX/zE7EG7MyBiIqJyc7Cn1IHHAdT6Ae5E4t3RjuQZWDUK9YdyU1jKAAYncIoKMgGSIDReWpm2VRcfsOio3JzOl/wDwGVWtq3o6uZZEbl3324I6dRKVmlWKNpBYemtftSmVpZe+zGyVvWd1IJwtW6jc8qaEi8h8Vd0kIqa17VR1XVlGlsWNFMlQb3jEkohhQqOfXWbJM3wiiOf5W/b7BEvOmYzHd4YzluYCft+IJIXsPSWr25FjTAnPw87y3UeT/73M5OtSqzvVgu6N+0wDEgAGJPKUfAEJSCzv/XD7AVst07Oz65oCfoz1N7B/EpVd3uU7Vb0t+e0dqvpy+s9EZFpZRuUBod4w69lRzbE6pWC2V2Rk19nJxCMqFSfnlB43uW5tqQbiJcYfIVG9yLVXlCsTj6jU8s6URGQGgJkAxmftK50AoJjDs56SvvnbYGN9nqhW5KvEwEw8qiQ72XfTkdg7akJiH8nwNhIHaKte9vKE3YAU8PvQIMDRwbpKSKQaIYCt2nm5MvHsZPIROWFnT+kJAE+IyEdU9bkKjKniCu0cy26zVK2cFE7tXDDdNBNv3ozmEXtNy3v6sONPh3Fnx6yyjJtqn5M9pS+KSJNxQUQmiMiDZRhTxXEZguqJ08KpVnXuNu89NOKDmQJYve0As/OoYE6C0mxVjRgXVPUIgNbSD6nyeCCQ6sWERj/GjHKe2p3eULBzwfRUaS0zCjAJggrmpKJDg4hMSAYjiMiJDu/vWWbLE0S1xt8geOe9odTZu3Akis41/bj9yT2IDMQwelQD3hs6VqNktE8wGNfUwdxgUwDzZjSPOHxrhqsPVCgnQeVbAH4lIkYa+FUA/qn0Q6o8Y2PW2LBtYrUGqkFmrVRiw5r6XU8PSAAwmAxeRuJPOBLF6m0HbNV45OoDFcpJ64ofiMhOAPOQSNxZrKo105gvuwz/1K4NLo6GyJvsBCQ2+6NiOFp+U9U9InIIyfNJIjJFVQ+UZWQuC1qkwRLRSD4RDKsyLbxEDh
8ddHsIrrEdlERkERJLeBMBvA7gA0i0sphZnqG5a96MZjyyLX+8FQF4zpbq3dXnTWYaOJWEk+y7bwKYC+D3qjoNwMcAbC3LqDxg895Dtm639LwpuGdJC4JcQ6c6YFXmecNvX6voOKh2OQlKMVV9E4ksvAZV3QygpUzjcp3d7KHHdyZSajsXTIffZ6syO1FVCjYFLPeUjgzEeDaJSsJJUIqIyHEAfglgtYj8G4Ch8gzLfXazh6KxOG56rA/Levos21wQVTsBsLVrfs4VAZ5NolJwEpQuBzAAYDmAnwB4CZm18GpK54LpCPh9tm5r1Q6AqFY0NfoBoKDCrURO2A5KqnpUVYdVdUhVH1bV7ySX8wAAIlJTdfHSS6sANlsmEtWod94dQqg3jI7WIJoCftPb8GwSlYKTmVI+NdPGwmCUVtm/aiHuXtJi+cdIVOtiw5panps58XjT2xw++h73lahooiXKZxaR51X1wyV5MBNe6Vkf6g3j5sf62W+JyMI1c6cwPdyarUWXU8+YrX984bflHkul2fq310TtukoyDgUu6+lzeSREpdUgwNhRDRiIDee/cQ6rk+f7Nu89xD5L5Fgpl+/qZtulozUIfylfOSKXpLej+Mx5U6Al+DNWAI9sO4BwJApFombesp4+tN7xUy7vUV6lfGv9bAkfy7NCvWGc+fUfo8gPk0Se8F9vvYulc6dga9d80/5IpXRkIIZb1+1iYKKc8i7ficjbMK/DKABUVU9A4pvdJR6b5xxrm86IRLUhropHth3Aj54P4+hg+Vu3RGNxdG/cx6U8smSnHbp5qk0dKrRtOpHXVSIgGXieiXJxvHwnIn8hIlOMr3IMyqv4x0SUaf+qhZjQ6OyoBM8zUS62g5KILBKRFwG8DOBZAPsB/LhM4/Iku39Mfp+UdLOOyIt8kkiKuO2ymSOqn/garBMm5s1oLuu4qLqxSrgDVqWHBECjvyGVxdR95Rx8e0kLAkzRoxoWV0X7qk0AkFH9xCeCeI7aW3Yr8FN9cnJOKaaqb4pIqkq4iPxL2UbmQdlt0/Odv+hoDaLl9p8iEmVrdapN4UgUnWv60X3VHHQumJ5MBMq9P8VlcMrFSVAyqoT/PySqhL+OGq4SbiW7bXo+bzEgUY2LDSuW9fQl0nFt3J57SpSL0yrhUQDLUAdVwkuFf4BUL+wEJL9PclYaJ3JUJRxAM4BPAjgM4LH0KuFkzkkLDKJaNqHRj+4r5/CMEuVke/lORP4HgG8A2ITE3v6/i8gdqvpguQZXC4w/wJXr93BviepOwO/DXYtnMRCRbU72lDoBtBqzIxE5CcCvADAo5WHsQ4V6w+jeuA9hbvRSHfCJmAYk4++AxVrJjO3WFSLyCwCfUNXB5OXRAJ5W1Y+XcXwpXmldUQrHyhWxOgR5iwhQiq4sRtJDU8APESAyEMPEpgDmzWjG4zvDGb/7dTabYuuKPOzUvrsp+W0YwHYReQKJ37fLAfy64OHVkOxPfvNmNI8o2w9kppJfcXYwdZsGEfZnIk8o1a+h8TDpS9bhSBSrtx0YkRDBeniUzs7ynVH77qXkl+GJ0g+ndCq1RJA96wlHongk2U/GuNy5th/QROqscd3jO8OpT4dLH3gOW186XPKxEXmNVczj2SUy2CnIenv6ZRE5PnG1vlO2URXJLFDcum4XAJQ8MNkp0hqLj/xTTP90uO2PR0o6JqJqw6MTZHCSfXcWgP8L4MTk5TcAfE5V9xT65CLiA7ADQFhVLy30cbKZBYpSLRFkz8CKSVoIR6KY2rWhqPEQVZvsQ7YBv49nlyjFyeHZ+wHcpKofUNUPALgZwANFPv9XALxQ5GOMYLUUUOwSgTEDS++oWTftdolKZOncKRkdb+soyYFscJISPk5VNxsXVPUZERlX6BOLyCQACwH8E4Cb8tzcEasZTLFLBGYzMKYnEDnT9oETcWfHLLeHQR7lZKb0RxH5uohMTX6tQKKNRaHuAfBVAJZtXEXkehHZISI7Dh2yX1nYrI
pCKZYIuBlLVLzOtf1siW4i/f3u7Uj9Jj45CUpfQKLM0DoAP0p+/zeFPKmIXArgdVXdmet2qnq/qrapaltzs/0eLB2twVQp/VIuEXAzlqh4sbiie+M+t4fhOenvd8c3nej2cFxje/lOVY8A+HKJnrcdwCIR+SSAsQBOEJFHVPWaEj2+42redtgtzU9EuYUjUYR6w9xLohHsHJ59Ejm2TlR1kdMnVdVbAdyafPwLAfxDKQNSuRh/QMt6+lweCVH1K9cxDapudmZK/8vkOiNI1V3yWUdrkPXriEqAlRxy++H2A/jMeVPcHkbF2dlTagJwlqo+q6rPAugG8DCAhwD8RbEDUNVnSnlGqRLYjoKoNJg8RNnsBKWvAlifdnk0gDYAFwL4YhnG5HnpiRREVDgmD1E2O8t3o1X1lbTLW5LtK94s5pxSNcjVamLcaB8GBuMINgUwMDiEIwPslUTkhACs5EAj2JkpTUi/oKp/n3bRfp52lQn1htG5pt9y7+joYDxV1YEBici5xtE+LO/pQ/uqTTy3RCl2gtJ2Ebku+0oR+TvUcOuKlev3pKp6E1HpNAX88Psk44Pdret2MTARAHvLd8sBhETkMwCeT153NoAxADrKNbBysdvSgq3LiUpPAIwbM2rE3xcz8chgp3XF6wD+UkTmA5iZvHqDqm4q68jKoJItLYhopIlNgbIVTKbaYLvMkKpuUtV/T35VXUACcre0SBfqDdffASyiCpg3o9ky446ZeAQ4q31X9ex8QjNmU9xNIiq9p/pfs1UwOdQbRvuqTZjWtYGJEHXGSeuKqmenpYWdTrJEVJhINIblPX0YH/BjrL8BkYHYiL1dJ8vsdveIqXrU1UzJzic0O+vajf4GTGj0j7i+/bQTsX/VQtyzpIUVH4gsKBLB6d3YMO5e0oKtXfMzAomTZfbsppvM4qt+dTVTMn7xc32yytfifEKjH73fuNjx87BWHlGmaCyOmx/rx/Kevoy/RbuJELmCF2dL1auughKQv6VFrvYUfp/gtssSCYi5lg3SK0H4RBiQiCzENbF7m75EZ7dzNLP4alPdBaV80mc5RlCJqyKYFnhyrXkDyPiZ8UdHRLkZsxyzD4ZmnaPtBi+qLgxKJvLNpqyWDZb19KWCGBE592okamuZHTBf1TALXlRdGJQKkGt5gAGJqHDGLMdO52i7wYuqC4NSAZi4QFScpoAf7w0Nj1hxOPrekKM26XaCF1WXukoJLxU2+SMqXMDvw8pFM3HX4lkjjlZEorGMtG4eoq0/nCkVIDsZgojsEQCTJozFzY/1Wy51G/uzK9fvwdHBIcTiIzP0ODuqXQxKDpilgQMw3WxlVQiikRTAi68ftXVbs0r9PIdU+7h8Z5PV6XEAGa3RRcCARFRGPIdU2xiUbMp3erxzwXT4GwRMviMqL55Dqm0MSjblOz3evXEfO9USVcC8Gc1uD6Fifrj9gNtDqDgGJZvy9YDhkgJRZWzee8jtIVAZMSjZlK/COJcUiCqDGa+1jUHJpo7WYCqhQQAEmwK4a/GsVBaQsadEROXlk2N/ZzzHVHsYlBzoaA1ia9d83L2kBQCwvKcv9YfQ0RpE91Vz0BQY2WeJiErHON/Efkq1ieeUHLLTFTPXwUAiys/fIJaJQ8bxC/ZTqk0MSjmYHZbN9YewZscBbH3psEujJaodsWGFIHHYNl36Pm699FP64fYD+Mx5U9weRsUwKFmwmhFZHYwNR6LcgCUqIQXgaxAcP2YU3orGRlQBZz+l2sQ9JQtWM6L0TVYiKq/4sCISjaGp0T+iLUW+jFiqTpwpWbBaAoir+bICEZXPkYEYlvX0YVlPX0YXaID9lGoNg5KFXD2TGJCI3BOORLHcIkBR9ePynQX2TCLyLuODYb2kgddTuSEGJQvph2WJyLuM7FeqDQxKORiHZYvVFPCP6LBJRKUTjkRrfrZUL7inVEYC4OVVCwEA07o2uDsYohrHrrS1gTMlGwpNA08/LzHWz5eaqBB2l9C5jF
cb+E5pw9XnTS7ofpGBwdSSwntDw6UcElHdcHIonQfYqx+Dkg13dszCNXOnwOl86ehgHMt6+jC1awPY/4+oNHKtXAjAvaUqx6Bk050ds3D3khYmLBC5LFexYwUsl/DY5qI6MNHBpuxaeETkTWbVWOxU9ydv4EzJJrNaeETkPWYFWXNV9ydvcSUoichkEdksIi+IyB4R+Yob43Ci1srhE9Uiq4Ks9dLmoha4NVMaAnCzqp4BYC6AL4nImS6NxZZSlMNnu3Si8rpr8SzT5Tirv1+2ufAeV/aUVPU1AK8lv39bRF4AEATwOzfGY0fngulF7ymNHtWA2CCXAInKZVlPHzrX9MHva8BALHEMY0KjHwtnn4LHd4Yz/n6rrc1FvTT7cz3RQUSmAmgFsN3kZ9cDuB4Apkxx939Gepn8fGchrFpbHGVAIiq72DAQGz52LvDIQAyPbMssaNoU8GPlopmeSnJIf787+f3eGVeluZroICLHAXgcwDJV/XP2z1X1flVtU9W25ubmyg8wS0dr0Fb1cB5JIvK2o4NDbg9hhPT3u+ObTnR7OK5xLSiJiB+JgLRaVde5NQ6nmIVHVP1icWXmnUe5lX0nAL4P4AVV/bYbYygUs3WIagP/lr3JrT2ldgCfBbBLRPqS1/2jqj7t0nhsa2r048hAzO1hEFGRqjHzLrvZXy0mPriVfbcFcFxKznWh3jDeYkAiqgnhSBTTbt0Ao2qRkaDkE0Fcla3WXcKKDg50b9wH1vomqh3pZfSMb43aevXSat1rXE8J94JQbziV6p3rUxLL4hPVF6MUEWdLlVOXQckIQq9Gohgf8OPo4BBi8cSno/RPSZ1r+wEcO6NkBCwiqh9MiKisulu+M6oFhyNRKIBINJYKSNliccXtT+5JXWZAIqo/1ZgQUc3qbqbk9JxReqZdsCnAJTyiOuL1UkTZ2XhWqilLr+5mSsVMxTsXTK++lEEiKohPxLLAK5VP3QUlp1PxpsCxTrMdrUEsnVs9nziIqDANAL71qTkMSC6ou6BkVrvO3yAYM2rkS+FvEKxcNDN1OdQbxlP9r5V9jETkrmEAO/502O1h1KW6C0odrUHctXgWgk0BCBL7REvOnYwGyVyYEwBLzp2c+qQU6g2jc00/IlEeniWqB49uf8XtIdSlukt0ABKBKX1a3r5q04jkBwXwyLYDeHT7Kzj5OD/+++3BCo+SiNxUK9m21ZTkANThTMlMruSHuCoDElEd8gnTmtzAoASeQyCika4+b7LbQ6hLDEowT34govrkE8E1c6fgzo5Zbg+lLtXlnlI2J63Oiag2BZsC2No13+1h1D3OlJI6WoP8hSSqU16v3FBPOFPKwlJCRPVhtE8QiysmVmHfpGrLqHOCQSlL54LpWNbTl/+GRORpXI6rTly+y9LRGsQ9S1rg5ytDVLW4HFe9OFMykX24NtQbxlfX9mPQosUFEXkH25hXNwYlG4wgFeoN4x/X/RYDMTZFJ/KaRn8DfvfNT7g9DCoSg5ID6cFp5fo9BdfBC/h9qZL4K0K7sHrbAXAORlScgdgwpt26AUZ1oAmNftx2WaKgstFpuhqTGuqNaJXUd2pra9MdO3a4PYwMod4wvvajXTg6aL9poCF7E9Zo0e4k808ABjOiHBoE8DVIRnfp9A+FLrBVu+jUM2brnQ89ZfqzKs68s/Vv53Z+ETpag9hzxyW4Z0mL44oQ2cHHOCd1z5IW24+hSHwaZDUKInPDioyABADRWBzLevrQvmoTQr1hl0ZGVhiUSsBoh5HeEDAfq2KPHa1BR49zZCBm2guKiHILR6K4dd0uBiaP4btZiXS0BtF328W2b5+rLP7KRTMdzX7Y44moMNFYHDc/1s/A5CFMdCgxuxUhgsnK5MZekrEJO29GMzbvPYRoLA6fCOKqaAr48fa7MTAjnaj04qq4dd0uAGAChAcwKJVY54LpuHXdrhFNA9MZB/tCveGM24YjUTyy7UDqdsZsKhqLMyARlVE0Fkf3xn1VEZ
R+uP1AxuUqTnwwxeW7EjNrt37N3CkZl43Mn+6N+3IGL8N7QzwXRVRurHnpDZwplUF2RQgzod4w/wiIPCbUG66K2VItY1ByQag3jM61/W4Pg4iy3P7kntQBeR64dQeDkgu6N+4bcXaCiNx3ZCBmutfLRIjKYVAqM7PsOi7bEXmXWeuaakqEqHYMSmWUL7uOiKrHqx79MJmdjeclhWQGMig5kF6fTgSpwo/jRvvg9zXwECtRDWsQYSJEBTAo2ZRdzTu9IEOiIKvzoqxEVD14yLYyGJSAoltREFF9MMoSLe/pY1ZemdR9UAr1htG5ph+xYWbDEVF+RqUVZuWVR91XdOjeuI8BiYgKYmTlUenU/UzJqxk1RFQd+B6SqdhafHU/U5qYrNZNRFQIvoeUVt0Hpc4F0/kiEFFBjIr/VDp1/37c0RrE+Eb7nV6JiABgQqM/VfGfSse1PSURuQTAvwHwAfg/qrrKrbFEBpgKTkTWJjT6oQq8FY0xFbzMXAlKIuIDcC+AiwAcBPAbEVmvqr9zYzwTbXaLJaL6cs3cKbizY1bFn/fEcaNrrnmfXW7NlM4F8AdV/SMAiMh/ArgcgCtByaxbrN8ngILp4kQl0H7aiVh93UfcHgZVAbf2lIIAXkm7fDB5XQYRuV5EdojIjkOHDpVtMGbdYruvnIPuq+aM6CDbFDi2/9Tob4C/7nfliHJjQLKnUu93XieqlZ8JiMhVABao6v9IXv4sgHNV9Uar+7S1temOHTsqNcSqtSK0C49ufwVxVfhEcGpzI146dBSc8FE5+AT41qdauL9in9i5UY2+39n6t7u1fHcQwOS0y5MAvOrSWGrKnR2zXFkDJyIqBbcWn34D4IMiMk1ERgP4NID1Lo2FiIg8wpWZkqoOicjfA9iIREr4g6q6x42xEBGRd7h2TklVnwbwtFvPT0RE3sPcMSIi8gwGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8gwGJSIi8gxXat8VQkQOAfiTC099MoA3XHjeYnDMlVFtY6628QK1N+Y3VPWSfA8gIj+xc7taVDVByS0iskNV29wehxMcc2VU25irbbwAx1yPuHxHRESewaBERESewaCU3/1uD6AAHHNlVNuYq228AMdcd7inREREnsGZEhEReQaDEhEReUZdByURmSwim0XkBRHZIyJfSV7fLSJ7ReS3IvIjEWlKXj9VRKIi0pf8us9DY/5mcrx9IvJTEZmYvF5E5Dsi8ofkzz9cBWO+UETeSnudv+GVMaf9/B9EREXk5ORlz77OOcbs2ddZRFaKSDhtbJ9Mu8+tydd5n4gs8PJ4vfCeUXVUtW6/AJwC4MPJ748H8HsAZwK4GMCo5PX/AuBfkt9PBbDbo2M+Ie02XwZwX/L7TwL4MQABMBfA9ioY84UAnvLi65y8PBmJrsl/AnCy11/nHGP27OsMYCWAfzC5/ZkA+gGMATANwEsAfB4er+vvGdX2VdczJVV9TVWfT37/NoAXAARV9aeqOpS82TYAk9waY7YcY/5z2s3GATAyWC4H8ANN2AagSURO8fiYXWc15uSP7wbwVWSO17Ovc44xuy7PmM1cDuA/VfU9VX0ZwB8AnFv+kSYUMF5yqK6DUjoRmQqgFcD2rB99AYlPwIZpItIrIs+KyF9VaHimsscsIv8kIq8AWArAWIoJAngl7W4H4eIfkc0xA8BHRKRfRH4sIjMrPtA06WMWkUUAwqran3Uzz77OOcYMePR1Tl7198ml0AdFZELyOs+8zjbHC3joPaMquD1V88IXgOMA7ASwOOv6rwH4EY6lzo8BcFLy+7OR+OM4oZJjzTfm5M9uBXB78vsNAM5P+9kvAJzt8TGfAOC45PefBPCiF343ADQi8QY0Pvmz/Ti2FObJ1znPmD35Oicvvw+AD4kPzv8E4MHk9fcCuCbtft8HcIWHx+uZ94xq+ar7mZKI+AE8DmC1qq5Lu/7zAC4FsFSTv1GaWDJ4M/n9TiTWsz/klTGn+S
GAK5LfH0RiP8EwCcCr5R3hSE7GrKp/VtV3kt8/DcBvbM5XksmYT0NiH6NfRPYj8Vo+LyLvh0ZeklMAAAO5SURBVHdfZ8sxe/h1hqr+t6rGVXUYwAM4tkTn+uvsZLxeec+oJnUdlEREkPik9YKqfjvt+ksA3AJgkaoOpF3fLCK+5PenAvgggD96ZMwfTLvZIgB7k9+vB/C5ZHbYXABvqeprFRswnI9ZRN6fvA9E5Fwkfk/frNyIzcesqrtU9S9UdaqqTkXiDfLDqvpf8OjrnGvMXn2dk9en78f9NYDdye/XA/i0iIwRkWlI/A3+2qvj9cJ7RrUZ5fYAXNYO4LMAdolIX/K6fwTwHSSm3T9L/s1uU9UvArgAwB0iMgQgDuCLqnrYI2P+WxGZDmAYiQyrLyZ/9jQSSzN/ADAA4G8qO1wAzsd8JYAbkq9zFMCnjdmq22NOzijMePZ1zjFmz77OAK4WkRYkEjP2A/g7AFDVPSLyGIDfARgC8CVVjXt1vPDGe0ZVYZkhIiLyjLpeviMiIm9hUCIiIs9gUCIiIs9gUCIiIs9gUCIiIs9gUKKqJiLPSFalaBFZJiLfs7j9VBExzpC0SFr1aSJyH4MSVbtHAXw667pPJ6/PpwWJs0VE5BEMSlTt1gK4VETGAKkimRMBbJFEX6zdIrJLRJak30lERgO4A8ASSfS5WSIi54rIr5LFM3+VPNgLEWkUkceSxTZ7RGS7iLQlf3axiDwnIs+LyBoROa6C/3aimlPvFR2oyqnqmyLyawCXAHgCiVlSDxLFSFsAzAFwMoDfiMgv0+43KImmdm2q+vcAICInALhAVYdE5OMA/hmJenz/E8ARVZ0tImcB6Eve/mQAKwB8XFWPisgtAG5CItgRUQEYlKgWGEt4RlD6AhKlYB5NlqD5bxF5FsA5AH6b43HGA3g4WZNPAfiT158P4N8AQFV3i4jxGHORaPC2NVmOajSA50r47yKqO1y+o1oQAvAxSbQgD2iiCZsU8DjfBLBZVc8CcBmAscnrrR5LAPxMVVuSX2eq6t8W8LxElMSgRFUv2X7hGQAP4liCwy+R2C/yiUgzEoUxs6tJv41ES2vDeADh5PfXpl2/BcCnAEBEzgQwK3n9NgDtInJ68meNIsK2BERFYFCiWvEoEvtH/5m8/CMklur6AWwC8NVki4l0mwGcaSQ6APhXAHeJyFYkGrYZvgegOblsd0vycd9S1UNIBK9Hkz/bBmBGOf5xRPWCVcKJ8kj2w/Gr6rsichoSXWU/pKqDLg+NqOYw0YEov0YAm5MdRwXADQxIROXBmRIREXkG95SIiMgzGJSIiMgzGJSIiMgzGJSIiMgzGJSIiMgz/j8FN7EgzmEjyQAAAABJRU5ErkJggg==\n",
"text/plain": [
"
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Joint distribution of 'Voltage' and 'Global_active_power'\n",
"sns.jointplot(x='Voltage', y='Global_active_power', data=data)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two plots above show that `Global_intensity` and `Global_active_power` are strongly correlated, whereas `Voltage` and `Global_active_power` are much less correlated. This is an important observation when selecting features for machine learning."
]
},
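{
"cell_type": "markdown",
"metadata": {},
"source": [
"To put numbers on the visual impression from the joint plots, the cell below is a minimal sketch that computes the Pearson and Spearman correlations of `Global_active_power` with `Global_intensity` and with `Voltage`. It assumes `data` is the same DataFrame used in the plots above; the names `target` and `features` are introduced here for illustration only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Assumption: `data` is the DataFrame used in the joint plots above.\n",
"target = 'Global_active_power'\n",
"features = ['Global_intensity', 'Voltage']\n",
"\n",
"for method in ['pearson', 'spearman']:\n",
"    # DataFrame.corr returns the full matrix; keep only the target column.\n",
"    corr = data[features + [target]].corr(method=method)[target]\n",
"    print(method, corr[features].round(3).to_dict())"
]
},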
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Comparison among features"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Spearman correlations of the feature means, resampled by month\n",
"plt.matshow(data.resample('M').mean().corr(method='spearman'), vmax=1, vmin=-1, cmap='RdBu')\n",
"plt.title('resampled over month', size=15)\n",
"plt.colorbar()\n",
"plt.margins(0.02)\n",
"# Spearman correlations of the feature means, resampled by year\n",
"plt.matshow(data.resample('A').mean().corr(method='spearman'), vmax=1, vmin=-1, cmap='RdBu')\n",
"plt.title('resampled over year', size=15)\n",
"plt.colorbar()\n",
"plt.show()"
]
},
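{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two matrices above show that the correlations among features depend on the resampling frequency: the monthly means and the yearly means do not give the same correlation structure. This is worth keeping in mind for feature engineering, since the choice of aggregation window affects which features appear related."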
]
},
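{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of this effect, the cell below prints the Spearman correlation between `Global_active_power` and `Voltage` at several resampling frequencies. It assumes `data` is the DataFrame with a `DatetimeIndex` used in the cells above. The helper name `corr_at_frequency` and the list of frequencies are illustrative choices, not part of the original analysis."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical helper: correlation of two columns after resampling to `freq`.\n",
"def corr_at_frequency(df, col_a, col_b, freq, method='spearman'):\n",
"    resampled = df[[col_a, col_b]].resample(freq).mean()\n",
"    return resampled.corr(method=method).loc[col_a, col_b]\n",
"\n",
"# Assumption: `data` is the DatetimeIndex-ed DataFrame loaded earlier in this notebook.\n",
"for freq in ['H', 'D', 'W', 'M']:\n",
"    rho = corr_at_frequency(data, 'Global_active_power', 'Voltage', freq)\n",
"    print('{:>2}: {:+.3f}'.format(freq, rho))"
]
},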
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
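{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Global_active_power</th>\n",
"      <th>Global_reactive_power</th>\n",
"      <th>Voltage</th>\n",
"      <th>Global_intensity</th>\n",
"      <th>Sub_metering_1</th>\n",
"      <th>Sub_metering_2</th>\n",
"      <th>Sub_metering_3</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>Global_active_power</th>\n",
"      <td>1.000000</td>\n",
"      <td>0.247017</td>\n",
"      <td>-0.399762</td>\n",
"      <td>0.998889</td>\n",
"      <td>0.484401</td>\n",
"      <td>0.434569</td>\n",
"      <td>0.638555</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Global_reactive_power</th>\n",
"      <td>0.247017</td>\n",
"      <td>1.000000</td>\n",
"      <td>-0.112246</td>\n",
"      <td>0.266120</td>\n",
"      <td>0.123111</td>\n",
"      <td>0.139231</td>\n",
"      <td>0.089617</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Voltage</th>\n",
"      <td>-0.399762</td>\n",
"      <td>-0.112246</td>\n",
"      <td>1.000000</td>\n",
"      <td>-0.411363</td>\n",
"      <td>-0.195976</td>\n",
"      <td>-0.167405</td>\n",
"      <td>-0.268172</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Global_intensity</th>\n",
"      <td>0.998889</td>\n",
"      <td>0.266120</td>\n",
"      <td>-0.411363</td>\n",
"      <td>1.000000</td>\n",
"      <td>0.489298</td>\n",
"      <td>0.440347</td>\n",
"      <td>0.626543</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Sub_metering_1</th>\n",
"      <td>0.484401</td>\n",
"      <td>0.123111</td>\n",
"      <td>-0.195976</td>\n",
"      <td>0.489298</td>\n",
"      <td>1.000000</td>\n",
"      <td>0.054721</td>\n",
"      <td>0.102571</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Sub_metering_2</th>\n",
"      <td>0.434569</td>\n",
"      <td>0.139231</td>\n",
"      <td>-0.167405</td>\n",
"      <td>0.440347</td>\n",
"      <td>0.054721</td>\n",
"      <td>1.000000</td>\n",
"      <td>0.080872</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Sub_metering_3</th>\n",
"      <td>0.638555</td>\n",
"      <td>0.089617</td>\n",
"      <td>-0.268172</td>\n",
"      <td>0.626543</td>\n",
"      <td>0.102571</td>\n",
"      <td>0.080872</td>\n",
"      <td>1.000000</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"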
],
"text/plain": [
" Global_active_power Global_reactive_power Voltage \\\n",
"Global_active_power 1.000000 0.247017 -0.399762 \n",
"Global_reactive_power 0.247017 1.000000 -0.112246 \n",
"Voltage -0.399762 -0.112246 1.000000 \n",
"Global_intensity 0.998889 0.266120 -0.411363 \n",
"Sub_metering_1 0.484401 0.123111 -0.195976 \n",
"Sub_metering_2 0.434569 0.139231 -0.167405 \n",
"Sub_metering_3 0.638555 0.089617 -0.268172 \n",
"\n",
" Global_intensity Sub_metering_1 Sub_metering_2 \\\n",
"Global_active_power 0.998889 0.484401 0.434569 \n",
"Global_reactive_power 0.266120 0.123111 0.139231 \n",
"Voltage -0.411363 -0.195976 -0.167405 \n",
"Global_intensity 1.000000 0.489298 0.440347 \n",
"Sub_metering_1 0.489298 1.000000 0.054721 \n",
"Sub_metering_2 0.440347 0.054721 1.000000 \n",
"Sub_metering_3 0.626543 0.102571 0.080872 \n",
"\n",
" Sub_metering_3 \n",
"Global_active_power 0.638555 \n",
"Global_reactive_power 0.089617 \n",
"Voltage -0.268172 \n",
"Global_intensity 0.626543 \n",
"Sub_metering_1 0.102571 \n",
"Sub_metering_2 0.080872 \n",
"Sub_metering_3 1.000000 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(data.corr())\n",
"pd.plotting.scatter_matrix(data, figsize=(12, 12))\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Please check your scatter_matrix as below: \n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### StatsModels\n",
"\n",
"`statsmodels` is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open source Modified BSD (3-clause) license. The online documentation is hosted at [statsmodels.org](http://www.statsmodels.org/).\n",
"\n",
"1. Statistics `stats`\n",
"    * statistical tests\n",
"    * kernel density estimation\n",
"    * generalized method of moments\n",
"\n",
"1. Linear regression\n",
"    * Linear model\n",
"    * Generalized Linear Model (GLM)\n",
"    * Robust Linear Model\n",
"    * Linear Mixed Effects Model\n",
"    * ANOVA (Analysis of Variance)\n",
"    * Discrete Dependent Variable\n",
"\n",
"1. Time-Series analysis\n",
"    * ARMA/ARIMA process\n",
"    * Vector ARMA process"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [],
"source": [
"timeseries = data.Global_active_power.resample('D').mean()\n",
"# for i in range(num_timeseries):\n",
"# timeseries.append(np.trim_zeros(data.iloc[:,i], trim='f'))"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Date_Time\n",
"2006-12-16 3.053475\n",
"2006-12-17 2.354486\n",
"2006-12-18 1.530435\n",
"2006-12-19 1.157079\n",
"2006-12-20 1.545658\n",
"Freq: D, Name: Global_active_power, dtype: float64"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"timeseries.head()"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1442"
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(timeseries)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train and Test splits\n",
"\n",
"Often one is interested in evaluating the model or tuning its hyperparameters by looking at error metrics on a hold-out test set. Here we split the available data into train and test sets for evaluating the trained model. For standard machine learning tasks such as classification and regression, one typically obtains this split by randomly separating examples into train and test sets. In forecasting, however, it is important to do this train/test split based on time rather than by time series, so that the model is evaluated on values that come after everything it saw during training.\n",
"\n",
"In this example, we reserve the last section of the time series for evaluation and use only the first part as training data."
]
},
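{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of a time-based split, the following uses a hypothetical ten-day series `toy_series` and an assumed hold-out size `holdout`; neither name comes from this notebook. The point is only that every training timestamp precedes every test timestamp.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"toy_series = pd.Series(\n",
"    range(10),\n",
"    index=pd.date_range('2020-01-01', periods=10, freq='D'),\n",
")\n",
"holdout = 3  # hypothetical number of final points kept for evaluation\n",
"\n",
"train_part = toy_series.iloc[:-holdout]  # earliest observations only\n",
"test_part = toy_series.iloc[-holdout:]   # most recent observations\n",
"\n",
"# The latest training timestamp is earlier than the first test timestamp.\n",
"print(train_part.index.max(), test_part.index.min())\n",
"```\n",
"\n",
"A random shuffle would instead mix future values into the training data, which is what the time-based split avoids."
]
},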
{
"cell_type": "code",
"execution_count": 199,
"metadata": {},
"outputs": [],
"source": [
"# we use daily frequency for the time series\n",
"freq = 'D'\n",
"\n",
"# we predict for 60 days \n",
"prediction_length = 60\n",
"\n",
"# we also use 60 days as context length, \n",
"# this is the number of state updates accomplished before making predictions\n",
"context_length = 60"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We specify here the portion of the data that is used for training: the model sees data from 2006-12-16 to 2010-07-10 (the day before `end_training`) for training."
]
},
{
"cell_type": "code",
"execution_count": 201,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Date_Time\n",
"2006-12-16 3.053475\n",
"2006-12-17 2.354486\n",
"2006-12-18 1.530435\n",
"2006-12-19 1.157079\n",
"2006-12-20 1.545658\n",
"Freq: D, Name: Global_active_power, dtype: float64"
]
},
"execution_count": 201,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"timeseries.head()"
]
},
{
"cell_type": "code",
"execution_count": 202,
"metadata": {},
"outputs": [],
"source": [
"start_dataset = pd.Timestamp(\"2006-12-16\", freq=freq)\n",
"end_training = pd.Timestamp(\"2010-07-11\", freq=freq)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The DeepAR JSON input format represents each time series as a JSON object. In the simplest case each time series just consists of a start time stamp (``start``) and a list of values (``target``). For more complex cases, DeepAR also supports the fields ``dynamic_feat`` for time-series features and ``cat`` for categorical features, which we will use later."
]
},
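{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustration of the simplest case, the sketch below builds one such JSON object from a small pandas series. The series `toy_series` is hypothetical; its three values are the first daily means shown earlier, rounded to two decimals.\n",
"\n",
"```python\n",
"import json\n",
"import pandas as pd\n",
"\n",
"toy_series = pd.Series(\n",
"    [3.05, 2.35, 1.53],\n",
"    index=pd.date_range('2006-12-16', periods=3, freq='D'),\n",
")\n",
"\n",
"record = {\n",
"    'start': str(toy_series.index[0]),  # timestamp of the first value in target\n",
"    'target': toy_series.tolist(),      # plain Python floats, in time order\n",
"}\n",
"print(json.dumps(record))\n",
"```\n",
"\n",
"The `start` field only makes sense if it is the timestamp of the first entry in `target`, since the remaining timestamps are implied by the frequency."
]
},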
{
"cell_type": "code",
"execution_count": 203,
"metadata": {},
"outputs": [],
"source": [
"training_data = [\n",
" {\n",
" \"start\": str(start_dataset),\n",
" \"target\": timeseries[start_dataset:end_training - 1].tolist() # We use -1, because pandas indexing includes the upper bound \n",
" # \"target\": ts[start_dataset:end_training - 1].tolist() # We use -1, because pandas indexing includes the upper bound \n",
" }\n",
" #for ts in timeseries\n",
"]\n",
"# print(len(training_data))"
]
},
{
"cell_type": "code",
"execution_count": 204,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1303"
]
},
"execution_count": 204,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(timeseries[start_dataset:end_training -1])"
]
},
{
"cell_type": "code",
"execution_count": 205,
"metadata": {},
"outputs": [],
"source": [
"test_data = [\n",
"    {\n",
"        \"start\": str(end_training),  # the start timestamp must match the first value in target\n",
"        \"target\": timeseries[end_training:end_training + 10 * prediction_length].tolist()  # the slice stops at the end of the series when the upper bound lies beyond it\n",
"    }\n",
"    #for ts in timeseries\n",
"]\n",
"# print(len(test_data))"
]
},
{
"cell_type": "code",
"execution_count": 206,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"139"
]
},
"execution_count": 206,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(timeseries[end_training:end_training + 10 * prediction_length])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's now write the dictionaries to the `jsonlines` file format that DeepAR understands (it also supports gzipped jsonlines and parquet)."
]
},
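{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `jsonlines` layout is simply one JSON object per line. A minimal sketch of a write and read round trip, using an in-memory buffer in place of a file and two made-up records (`records` and `buffer` are illustrative names, not part of this notebook):\n",
"\n",
"```python\n",
"import io\n",
"import json\n",
"\n",
"records = [\n",
"    {'start': '2020-01-01 00:00:00', 'target': [1.0, 2.0]},\n",
"    {'start': '2020-01-05 00:00:00', 'target': [3.0]},\n",
"]\n",
"\n",
"buffer = io.StringIO()  # stands in for a file on disk\n",
"for rec in records:\n",
"    print(json.dumps(rec), file=buffer)  # print appends the line break\n",
"\n",
"# Each line parses on its own, independently of the others.\n",
"parsed = [json.loads(line) for line in buffer.getvalue().splitlines()]\n",
"print(parsed == records)\n",
"```\n",
"\n",
"Because every line is a complete JSON document, a reader can process the file one time series at a time."
]
},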
{
"cell_type": "code",
"execution_count": 207,
"metadata": {},
"outputs": [],
"source": [
"def write_dicts_to_file(path, data):\n",
"    # write one JSON object per line (jsonlines)\n",
"    with open(path, 'wb') as fp:\n",
"        for d in data:\n",
"            fp.write(json.dumps(d).encode(\"utf-8\"))\n",
"            fp.write(\"\\n\".encode('utf-8'))"
]
},
{
"cell_type": "code",
"execution_count": 208,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 4 ms, sys: 0 ns, total: 4 ms\n",
"Wall time: 2.18 ms\n"
]
}
],
"source": [
"%%time\n",
"write_dicts_to_file(\"train.json\", training_data)\n",
"write_dicts_to_file(\"test.json\", test_data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have the data files locally, let us copy them to S3 where DeepAR can access them. Depending on your connection, this may take a couple of minutes."
]
},
{
"cell_type": "code",
"execution_count": 209,
"metadata": {},
"outputs": [],
"source": [
"s3 = boto3.resource('s3')\n",
"def copy_to_s3(local_file, s3_path, override=False):\n",
"    assert s3_path.startswith('s3://')\n",
"    # split 's3://bucket/key' into the bucket name and the object key\n",
"    split = s3_path.split('/')\n",
"    bucket = split[2]\n",
"    path = '/'.join(split[3:])\n",
"    buk = s3.Bucket(bucket)\n",
"    \n",
"    if len(list(buk.objects.filter(Prefix=path))) > 0:\n",
"        if not override:\n",
"            print('File s3://{}/{} already exists.\\nSet override to upload anyway.\\n'.format(bucket, path))\n",
"            return\n",
"        else:\n",
"            print('Overwriting existing file')\n",
"    with open(local_file, 'rb') as data:\n",
"        print('Uploading file to {}'.format(s3_path))\n",
"        buk.put_object(Key=path, Body=data)"
]
},
{
"cell_type": "code",
"execution_count": 210,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Uploading file to s3://sagemaker-us-west-2-742595989409/deepar-household-electricity-notebook/data/train/train.json\n",
"Uploading file to s3://sagemaker-us-west-2-742595989409/deepar-household-electricity-notebook/data/test/test.json\n",
"CPU times: user 28 ms, sys: 0 ns, total: 28 ms\n",
"Wall time: 122 ms\n"
]
}
],
"source": [
"%%time\n",
"copy_to_s3(\"train.json\", s3_data_path + \"/train/train.json\")\n",
"copy_to_s3(\"test.json\", s3_data_path + \"/test/test.json\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's have a look at what we just wrote to S3."
]
},
{
"cell_type": "code",
"execution_count": 211,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\"start\": \"2006-12-16 00:00:00\", \"target\": [3.0534747474747492, 2.354486111111111, 1.530434722222219...\n"
]
}
],
"source": [
"s3filesystem = s3fs.S3FileSystem()\n",
"with s3filesystem.open(s3_data_path + \"/train/train.json\", 'rb') as fp:\n",
" print(fp.readline().decode(\"utf-8\")[:100] + \"...\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are all set with our dataset processing; we can now call DeepAR to train a model and generate predictions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train a model\n",
"\n",
"Here we define the estimator that will launch the training job."
]
},
{
"cell_type": "code",
"execution_count": 212,
"metadata": {},
"outputs": [],
"source": [
"estimator = sagemaker.estimator.Estimator(\n",
" sagemaker_session=sagemaker_session,\n",
" image_name=image_name,\n",
" role=role,\n",
" train_instance_count=1,\n",
" train_instance_type='ml.m4.xlarge',\n",
" base_job_name='deepar-home-electricity-demo',\n",
" output_path=s3_output_path\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we need to set the hyperparameters for the training job: for example, the frequency of the time series, the number of data points the model looks at in the past (`context_length`), and the number of data points to predict (`prediction_length`). The other hyperparameters concern the model to train (number of layers, number of cells per layer, likelihood function) and the training options (number of epochs, batch size, learning rate...). Here we set several of these optional parameters explicitly instead of relying on their defaults (you can always use [Sagemaker Automated Model Tuning](https://aws.amazon.com/blogs/aws/sagemaker-automatic-model-tuning/) to tune them)."
]
},
{
"cell_type": "code",
"execution_count": 213,
"metadata": {},
"outputs": [],
"source": [
"# hyperparameters = {\n",
"# \"time_freq\": freq,\n",
"# \"epochs\": \"5\",\n",
"# \"early_stopping_patience\": \"10\",\n",
"# \"mini_batch_size\": \"20\",\n",
"# \"learning_rate\": \"0.001\",\n",
"# \"context_length\": str(context_length),\n",
"# \"prediction_length\": str(prediction_length)\n",
"# }\n",
"hyperparameters = {\n",
" \"time_freq\": freq,\n",
" \"context_length\": str(context_length),\n",
" \"prediction_length\": str(prediction_length),\n",
" \"num_cells\": \"40\",\n",
" \"num_layers\": \"3\",\n",
" \"likelihood\": \"gaussian\",\n",
" \"epochs\": \"20\",\n",
" \"mini_batch_size\": \"32\",\n",
" \"learning_rate\": \"0.001\",\n",
" \"dropout_rate\": \"0.05\",\n",
" \"early_stopping_patience\": \"10\"\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 214,
"metadata": {},
"outputs": [],
"source": [
"estimator.set_hyperparameters(**hyperparameters)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are ready to launch the training job. SageMaker will start an EC2 instance, download the data from S3, start training the model and save the trained model.\n",
"\n",
"If you provide the `test` data channel as we do in this example, DeepAR will also calculate accuracy metrics for the trained model on this test set. This is done by predicting the last `prediction_length` points of each time series in the test set and comparing them to the actual values of the time series. \n",
"\n",
"**Note:** the next cell may take a few minutes to complete, depending on data size, model complexity, training options."
]
},
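{
"cell_type": "markdown",
"metadata": {},
"source": [
"The evaluation idea described above can be sketched on a toy array. The naive last-value forecast below is only a stand-in for illustration and is not how DeepAR produces forecasts; `horizon` plays the role of `prediction_length`, and RMSE is used here as one example of an error metric.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # made-up values\n",
"horizon = 2                                        # plays the role of prediction_length\n",
"\n",
"# Hide the last `horizon` points and keep them as the ground truth.\n",
"history, actual = series[:-horizon], series[-horizon:]\n",
"\n",
"# Naive stand-in forecast: repeat the last observed value.\n",
"forecast = np.repeat(history[-1], horizon)\n",
"\n",
"rmse = np.sqrt(np.mean((forecast - actual) ** 2))\n",
"print(rmse)\n",
"```\n",
"\n",
"The same comparison applies to any forecaster: only the way `forecast` is produced changes."
]
},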
{
"cell_type": "code",
"execution_count": 215,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:sagemaker:Creating training-job with name: deepar-home-electricity-demo-2018-07-27-03-15-18-550\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"......................\n",
"\u001b[31mArguments: train\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Reading default configuration from /opt/amazon/lib/python2.7/site-packages/algorithm/default-input.json: {u'num_dynamic_feat': u'auto', u'dropout_rate': u'0.10', u'mini_batch_size': u'128', u'test_quantiles': u'[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]', u'_tuning_objective_metric': u'', u'_num_gpus': u'auto', u'num_eval_samples': u'100', u'learning_rate': u'0.001', u'num_cells': u'40', u'num_layers': u'2', u'embedding_dimension': u'10', u'_kvstore': u'auto', u'_num_kv_servers': u'auto', u'cardinality': u'auto', u'likelihood': u'student-t', u'early_stopping_patience': u''}\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Reading provided configuration from /opt/ml/input/config/hyperparameters.json: {u'dropout_rate': u'0.05', u'learning_rate': u'0.001', u'num_cells': u'40', u'prediction_length': u'60', u'epochs': u'20', u'time_freq': u'D', u'context_length': u'60', u'num_layers': u'3', u'mini_batch_size': u'32', u'likelihood': u'gaussian', u'early_stopping_patience': u'10'}\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Final configuration: {u'dropout_rate': u'0.05', u'test_quantiles': u'[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]', u'_tuning_objective_metric': u'', u'num_eval_samples': u'100', u'learning_rate': u'0.001', u'num_layers': u'3', u'epochs': u'20', u'embedding_dimension': u'10', u'num_cells': u'40', u'_num_kv_servers': u'auto', u'mini_batch_size': u'32', u'likelihood': u'gaussian', u'num_dynamic_feat': u'auto', u'cardinality': u'auto', u'_num_gpus': u'auto', u'prediction_length': u'60', u'time_freq': u'D', u'context_length': u'60', u'_kvstore': u'auto', u'early_stopping_patience': u'10'}\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Detected entry point for worker worker\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Using early stopping with patience 10\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] [cardinality=auto] `cat` field was NOT found in the file `/opt/ml/input/data/train/train.json` and will NOT be used for training.\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] [num_dynamic_feat=auto] `dynamic_feat` field was NOT found in the file `/opt/ml/input/data/train/train.json` and will NOT be used for training.\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Training set statistics:\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Real time series\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] number of time series: 1\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] number of observations: 1303\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] mean target length: 1303\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] min/mean/max target: 0.173818051815/1.10822587149/3.31485128403\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] mean abs(target): 1.10822587149\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] contains missing values: no\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Small number of time series. Doing 10 number of passes over dataset per epoch.\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Test set statistics:\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Real time series\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] number of time series: 1\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] number of observations: 139\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] mean target length: 139\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] min/mean/max target: 0.364406943321/0.94621287833/1.88464164734\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] mean abs(target): 0.94621287833\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] contains missing values: no\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] nvidia-smi took: 0.0251779556274 secs to identify 0 gpus\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Number of GPUs being used: 0\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Create Store: local\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"get_graph.time\": {\"count\": 1, \"max\": 469.3880081176758, \"sum\": 469.3880081176758, \"min\": 469.3880081176758}}, \"EndTime\": 1532661526.713893, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661526.242651}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:46 INFO 140074042410816] Number of GPUs being used: 0\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"initialize.time\": {\"count\": 1, \"max\": 971.4839458465576, \"sum\": 971.4839458465576, \"min\": 971.4839458465576}}, \"EndTime\": 1532661527.214226, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661526.71398}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:47 INFO 140074042410816] Epoch[0] Batch[0] avg_epoch_loss=1.272068\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:48 INFO 140074042410816] Epoch[0] Batch[5] avg_epoch_loss=0.845399\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:48 INFO 140074042410816] Epoch[0] Batch [5]#011Speed: 218.46 samples/sec#011loss=0.845399\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:48 INFO 140074042410816] processed a total of 319 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"epochs\": {\"count\": 1, \"max\": 20, \"sum\": 20.0, \"min\": 20}, \"update.time\": {\"count\": 1, \"max\": 1658.9720249176025, \"sum\": 1658.9720249176025, \"min\": 1658.9720249176025}}, \"EndTime\": 1532661528.873359, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661527.214291}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:48 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=192.268629997 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:48 INFO 140074042410816] #progress_metric: host=algo-1, completed 5 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:48 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:48 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_c0f18e48-f8ab-4af3-abf5-86b6b3a4b369-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 47.096967697143555, \"sum\": 47.096967697143555, \"min\": 47.096967697143555}}, \"EndTime\": 1532661528.921088, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661528.873479}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:49 INFO 140074042410816] Epoch[1] Batch[0] avg_epoch_loss=0.450175\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:49 INFO 140074042410816] Epoch[1] Batch[5] avg_epoch_loss=0.410062\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:49 INFO 140074042410816] Epoch[1] Batch [5]#011Speed: 230.24 samples/sec#011loss=0.410062\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:50 INFO 140074042410816] processed a total of 319 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1497.1671104431152, \"sum\": 1497.1671104431152, \"min\": 1497.1671104431152}}, \"EndTime\": 1532661530.41839, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661528.921164}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:50 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=213.052985582 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:50 INFO 140074042410816] #progress_metric: host=algo-1, completed 10 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:50 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:50 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_de513dee-4530-4948-b19a-39bb8395c93e-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 46.9210147857666, \"sum\": 46.9210147857666, \"min\": 46.9210147857666}}, \"EndTime\": 1532661530.465844, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661530.418465}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:50 INFO 140074042410816] Epoch[2] Batch[0] avg_epoch_loss=0.380706\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:51 INFO 140074042410816] Epoch[2] Batch[5] avg_epoch_loss=0.315473\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:51 INFO 140074042410816] Epoch[2] Batch [5]#011Speed: 221.60 samples/sec#011loss=0.315473\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] Epoch[2] Batch[10] avg_epoch_loss=0.326351\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] Epoch[2] Batch [10]#011Speed: 223.90 samples/sec#011loss=0.339405\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] processed a total of 338 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1643.8491344451904, \"sum\": 1643.8491344451904, \"min\": 1643.8491344451904}}, \"EndTime\": 1532661532.109826, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661530.465917}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=205.592850456 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] #progress_metric: host=algo-1, completed 15 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_1858a189-e242-4797-867e-ab3feb7b666d-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 56.439876556396484, \"sum\": 56.439876556396484, \"min\": 56.439876556396484}}, \"EndTime\": 1532661532.166809, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661532.109963}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:52 INFO 140074042410816] Epoch[3] Batch[0] avg_epoch_loss=0.309939\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] Epoch[3] Batch[5] avg_epoch_loss=0.259952\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] Epoch[3] Batch [5]#011Speed: 219.63 samples/sec#011loss=0.259952\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] Epoch[3] Batch[10] avg_epoch_loss=0.223093\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] Epoch[3] Batch [10]#011Speed: 227.20 samples/sec#011loss=0.178861\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] processed a total of 331 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1633.991003036499, \"sum\": 1633.991003036499, \"min\": 1633.991003036499}}, \"EndTime\": 1532661533.800934, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661532.166879}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=202.557603709 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] #progress_metric: host=algo-1, completed 20 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:53 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_0701de60-24fb-439e-945d-3dedfa47034d-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 48.83313179016113, \"sum\": 48.83313179016113, \"min\": 48.83313179016113}}, \"EndTime\": 1532661533.850233, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661533.801004}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:54 INFO 140074042410816] Epoch[4] Batch[0] avg_epoch_loss=0.124813\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:54 INFO 140074042410816] Epoch[4] Batch[5] avg_epoch_loss=0.172714\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:54 INFO 140074042410816] Epoch[4] Batch [5]#011Speed: 225.45 samples/sec#011loss=0.172714\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:55 INFO 140074042410816] processed a total of 310 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1531.6309928894043, \"sum\": 1531.6309928894043, \"min\": 1531.6309928894043}}, \"EndTime\": 1532661535.382007, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661533.850316}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:55 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=202.384097219 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:55 INFO 140074042410816] #progress_metric: host=algo-1, completed 25 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:55 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:55 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_36d3075b-c282-486d-871b-d399ee9f1e44-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 45.822858810424805, \"sum\": 45.822858810424805, \"min\": 45.822858810424805}}, \"EndTime\": 1532661535.428317, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661535.382078}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:55 INFO 140074042410816] Epoch[5] Batch[0] avg_epoch_loss=0.178906\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:56 INFO 140074042410816] Epoch[5] Batch[5] avg_epoch_loss=0.161604\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:56 INFO 140074042410816] Epoch[5] Batch [5]#011Speed: 221.68 samples/sec#011loss=0.161604\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m[07/27/2018 03:18:56 INFO 140074042410816] processed a total of 302 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1514.8308277130127, \"sum\": 1514.8308277130127, \"min\": 1514.8308277130127}}, \"EndTime\": 1532661536.943273, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661535.428385}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:56 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=199.347825875 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:56 INFO 140074042410816] #progress_metric: host=algo-1, completed 30 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:56 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:56 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_3f6d3df6-d554-4ddc-8d21-74d95b98ed84-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 45.50004005432129, \"sum\": 45.50004005432129, \"min\": 45.50004005432129}}, \"EndTime\": 1532661536.989284, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661536.943343}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:57 INFO 140074042410816] Epoch[6] Batch[0] avg_epoch_loss=0.135446\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:57 INFO 140074042410816] Epoch[6] Batch[5] avg_epoch_loss=0.114501\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:57 INFO 140074042410816] Epoch[6] Batch [5]#011Speed: 222.89 samples/sec#011loss=0.114501\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:58 INFO 140074042410816] processed a total of 316 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1476.783037185669, \"sum\": 1476.783037185669, \"min\": 1476.783037185669}}, \"EndTime\": 1532661538.466206, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661536.989359}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:58 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=213.962974431 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:58 INFO 140074042410816] #progress_metric: host=algo-1, completed 35 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:58 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:58 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_d13c9e29-891f-454b-95ff-d389dd213fef-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 44.64602470397949, \"sum\": 44.64602470397949, \"min\": 44.64602470397949}}, \"EndTime\": 1532661538.511351, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661538.466273}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:58 INFO 140074042410816] Epoch[7] Batch[0] avg_epoch_loss=0.189694\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:59 INFO 140074042410816] Epoch[7] Batch[5] avg_epoch_loss=0.141834\u001b[0m\n",
"\u001b[31m[07/27/2018 03:18:59 INFO 140074042410816] Epoch[7] Batch [5]#011Speed: 223.07 samples/sec#011loss=0.141834\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:00 INFO 140074042410816] processed a total of 312 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1551.4729022979736, \"sum\": 1551.4729022979736, \"min\": 1551.4729022979736}}, \"EndTime\": 1532661540.06295, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661538.51142}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:00 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=201.084053695 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:00 INFO 140074042410816] #progress_metric: host=algo-1, completed 40 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:00 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:00 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_dd0f6ada-f55e-4089-b999-2c57c062ebe1-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 74.94807243347168, \"sum\": 74.94807243347168, \"min\": 74.94807243347168}}, \"EndTime\": 1532661540.138372, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661540.063028}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:00 INFO 140074042410816] Epoch[8] Batch[0] avg_epoch_loss=0.041331\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] Epoch[8] Batch[5] avg_epoch_loss=0.101388\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] Epoch[8] Batch [5]#011Speed: 221.49 samples/sec#011loss=0.101388\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] Epoch[8] Batch[10] avg_epoch_loss=0.103547\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] Epoch[8] Batch [10]#011Speed: 222.41 samples/sec#011loss=0.106138\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] processed a total of 333 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1634.108066558838, \"sum\": 1634.108066558838, \"min\": 1634.108066558838}}, \"EndTime\": 1532661541.772616, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661540.138446}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=203.765076203 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] #progress_metric: host=algo-1, completed 45 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:01 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_5186df37-c53f-4195-a4b6-d70c1b17804c-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 63.11511993408203, \"sum\": 63.11511993408203, \"min\": 63.11511993408203}}, \"EndTime\": 1532661541.836224, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661541.772702}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:02 INFO 140074042410816] Epoch[9] Batch[0] avg_epoch_loss=0.125514\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:02 INFO 140074042410816] Epoch[9] Batch[5] avg_epoch_loss=0.074293\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:02 INFO 140074042410816] Epoch[9] Batch [5]#011Speed: 226.06 samples/sec#011loss=0.074293\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] Epoch[9] Batch[10] avg_epoch_loss=0.035307\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] Epoch[9] Batch [10]#011Speed: 221.50 samples/sec#011loss=-0.011475\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] processed a total of 324 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1648.5710144042969, \"sum\": 1648.5710144042969, \"min\": 1648.5710144042969}}, \"EndTime\": 1532661543.48493, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661541.836296}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=196.518469233 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] #progress_metric: host=algo-1, completed 50 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_40df461c-4580-46f5-8987-d1d28505c79a-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 49.41511154174805, \"sum\": 49.41511154174805, \"min\": 49.41511154174805}}, \"EndTime\": 1532661543.534826, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661543.485018}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:03 INFO 140074042410816] Epoch[10] Batch[0] avg_epoch_loss=0.083117\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:04 INFO 140074042410816] Epoch[10] Batch[5] avg_epoch_loss=0.073549\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:04 INFO 140074042410816] Epoch[10] Batch [5]#011Speed: 224.92 samples/sec#011loss=0.073549\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:05 INFO 140074042410816] Epoch[10] Batch[10] avg_epoch_loss=0.068213\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:05 INFO 140074042410816] Epoch[10] Batch [10]#011Speed: 214.43 samples/sec#011loss=0.061811\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:05 INFO 140074042410816] processed a total of 349 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1659.013032913208, \"sum\": 1659.013032913208, \"min\": 1659.013032913208}}, \"EndTime\": 1532661545.193972, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661543.534898}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:05 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=210.350440248 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:05 INFO 140074042410816] #progress_metric: host=algo-1, completed 55 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:05 INFO 140074042410816] loss did not improve for 1 epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:05 INFO 140074042410816] Epoch[11] Batch[0] avg_epoch_loss=0.033785\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:06 INFO 140074042410816] Epoch[11] Batch[5] avg_epoch_loss=0.054952\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:06 INFO 140074042410816] Epoch[11] Batch [5]#011Speed: 215.06 samples/sec#011loss=0.054952\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:06 INFO 140074042410816] processed a total of 309 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1523.3738422393799, \"sum\": 1523.3738422393799, \"min\": 1523.3738422393799}}, \"EndTime\": 1532661546.717783, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661545.194053}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:06 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=202.821246749 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:06 INFO 140074042410816] #progress_metric: host=algo-1, completed 60 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:06 INFO 140074042410816] loss did not improve for 2 epochs\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m[07/27/2018 03:19:06 INFO 140074042410816] Epoch[12] Batch[0] avg_epoch_loss=0.125290\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:07 INFO 140074042410816] Epoch[12] Batch[5] avg_epoch_loss=0.090269\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:07 INFO 140074042410816] Epoch[12] Batch [5]#011Speed: 225.58 samples/sec#011loss=0.090269\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:08 INFO 140074042410816] Epoch[12] Batch[10] avg_epoch_loss=0.041725\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:08 INFO 140074042410816] Epoch[12] Batch [10]#011Speed: 222.48 samples/sec#011loss=-0.016528\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:08 INFO 140074042410816] processed a total of 326 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1660.499095916748, \"sum\": 1660.499095916748, \"min\": 1660.499095916748}}, \"EndTime\": 1532661548.378843, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661546.717875}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:08 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=196.311152772 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:08 INFO 140074042410816] #progress_metric: host=algo-1, completed 65 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:08 INFO 140074042410816] loss did not improve for 3 epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:08 INFO 140074042410816] Epoch[13] Batch[0] avg_epoch_loss=0.038031\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:09 INFO 140074042410816] Epoch[13] Batch[5] avg_epoch_loss=0.054937\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:09 INFO 140074042410816] Epoch[13] Batch [5]#011Speed: 207.99 samples/sec#011loss=0.054937\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:10 INFO 140074042410816] Epoch[13] Batch[10] avg_epoch_loss=0.036359\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:10 INFO 140074042410816] Epoch[13] Batch [10]#011Speed: 221.30 samples/sec#011loss=0.014066\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:10 INFO 140074042410816] processed a total of 342 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1699.0950107574463, \"sum\": 1699.0950107574463, \"min\": 1699.0950107574463}}, \"EndTime\": 1532661550.078365, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661548.378929}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:10 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=201.272015504 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:10 INFO 140074042410816] #progress_metric: host=algo-1, completed 70 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:10 INFO 140074042410816] loss did not improve for 4 epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:10 INFO 140074042410816] Epoch[14] Batch[0] avg_epoch_loss=0.075549\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] Epoch[14] Batch[5] avg_epoch_loss=0.037262\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] Epoch[14] Batch [5]#011Speed: 219.56 samples/sec#011loss=0.037262\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] Epoch[14] Batch[10] avg_epoch_loss=0.039390\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] Epoch[14] Batch [10]#011Speed: 224.12 samples/sec#011loss=0.041944\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] processed a total of 330 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1684.4029426574707, \"sum\": 1684.4029426574707, \"min\": 1684.4029426574707}}, \"EndTime\": 1532661551.76323, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661550.078429}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=195.900116044 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] #progress_metric: host=algo-1, completed 75 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:11 INFO 140074042410816] loss did not improve for 5 epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:12 INFO 140074042410816] Epoch[15] Batch[0] avg_epoch_loss=0.021203\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:12 INFO 140074042410816] Epoch[15] Batch[5] avg_epoch_loss=0.019402\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:12 INFO 140074042410816] Epoch[15] Batch [5]#011Speed: 225.40 samples/sec#011loss=0.019402\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:13 INFO 140074042410816] Epoch[15] Batch[10] avg_epoch_loss=0.040792\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:13 INFO 140074042410816] Epoch[15] Batch [10]#011Speed: 217.36 samples/sec#011loss=0.066460\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:13 INFO 140074042410816] processed a total of 325 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1688.2209777832031, \"sum\": 1688.2209777832031, \"min\": 1688.2209777832031}}, \"EndTime\": 1532661553.451881, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661551.763318}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:13 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=192.498137011 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:13 INFO 140074042410816] #progress_metric: host=algo-1, completed 80 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:13 INFO 140074042410816] loss did not improve for 6 epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:13 INFO 140074042410816] Epoch[16] Batch[0] avg_epoch_loss=-0.006675\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:14 INFO 140074042410816] Epoch[16] Batch[5] avg_epoch_loss=0.010563\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:14 INFO 140074042410816] Epoch[16] Batch [5]#011Speed: 219.53 samples/sec#011loss=0.010563\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:14 INFO 140074042410816] processed a total of 299 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1507.7879428863525, \"sum\": 1507.7879428863525, \"min\": 1507.7879428863525}}, \"EndTime\": 1532661554.9601, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661553.451944}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:14 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=198.287036097 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:14 INFO 140074042410816] #progress_metric: host=algo-1, completed 85 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:14 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:15 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_e27fd8e1-f54e-4ab5-ab91-87fec4f72123-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 55.8319091796875, \"sum\": 55.8319091796875, \"min\": 55.8319091796875}}, \"EndTime\": 1532661555.016606, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661554.960187}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:15 INFO 140074042410816] Epoch[17] Batch[0] avg_epoch_loss=-0.001955\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:15 INFO 140074042410816] Epoch[17] Batch[5] avg_epoch_loss=-0.004937\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:15 INFO 140074042410816] Epoch[17] Batch [5]#011Speed: 216.02 samples/sec#011loss=-0.004937\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] Epoch[17] Batch[10] avg_epoch_loss=0.002667\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] Epoch[17] Batch [10]#011Speed: 225.72 samples/sec#011loss=0.011793\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] processed a total of 329 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1667.4549579620361, \"sum\": 1667.4549579620361, \"min\": 1667.4549579620361}}, \"EndTime\": 1532661556.684201, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661555.016683}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=197.291643279 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] #progress_metric: host=algo-1, completed 90 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_cb82d8d0-d122-4842-80a4-8f69191b4d46-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 49.47805404663086, \"sum\": 49.47805404663086, \"min\": 49.47805404663086}}, \"EndTime\": 1532661556.734169, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661556.684287}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:16 INFO 140074042410816] Epoch[18] Batch[0] avg_epoch_loss=-0.043327\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:17 INFO 140074042410816] Epoch[18] Batch[5] avg_epoch_loss=0.008369\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:17 INFO 140074042410816] Epoch[18] Batch [5]#011Speed: 225.63 samples/sec#011loss=0.008369\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:18 INFO 140074042410816] processed a total of 292 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1511.1079216003418, \"sum\": 1511.1079216003418, \"min\": 1511.1079216003418}}, \"EndTime\": 1532661558.245413, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661556.734244}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:18 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=193.22237955 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:18 INFO 140074042410816] #progress_metric: host=algo-1, completed 95 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:18 INFO 140074042410816] loss did not improve for 1 epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:18 INFO 140074042410816] Epoch[19] Batch[0] avg_epoch_loss=0.002925\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] Epoch[19] Batch[5] avg_epoch_loss=-0.027517\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] Epoch[19] Batch [5]#011Speed: 218.34 samples/sec#011loss=-0.027517\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] processed a total of 320 examples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"update.time\": {\"count\": 1, \"max\": 1495.6309795379639, \"sum\": 1495.6309795379639, \"min\": 1495.6309795379639}}, \"EndTime\": 1532661559.741463, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661558.24548}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] #throughput_metric: host=algo-1, train throughput=213.939775141 records/second\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] #progress_metric: host=algo-1, completed 100 % of epochs\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] best epoch loss so far\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/state_bdd9a716-e304-4987-a6ad-8ff21edbd8dc-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.serialize.time\": {\"count\": 1, \"max\": 45.989990234375, \"sum\": 45.989990234375, \"min\": 45.989990234375}}, \"EndTime\": 1532661559.78796, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661559.741538}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] Loading parameters from best epoch (19)\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"state.deserialize.time\": {\"count\": 1, \"max\": 18.826007843017578, \"sum\": 18.826007843017578, \"min\": 18.826007843017578}}, \"EndTime\": 1532661559.806988, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661559.788032}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] Final loss: -0.0369148883969 (occurred at epoch 19)\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] #quality_metric: host=algo-1, train final_loss =-0.0369148883969\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] Worker algo-1 finished training.\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 WARNING 140074042410816] wait_for_all_workers will not sync workers since the kv store is not running distributed\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:19 INFO 140074042410816] All workers finished. Serializing model for prediction.\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"get_graph.time\": {\"count\": 1, \"max\": 2714.4081592559814, \"sum\": 2714.4081592559814, \"min\": 2714.4081592559814}}, \"EndTime\": 1532661562.522, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661559.807052}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:25 INFO 140074042410816] Number of GPUs being used: 0\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"finalize.time\": {\"count\": 1, \"max\": 5285.989046096802, \"sum\": 5285.989046096802, \"min\": 5285.989046096802}}, \"EndTime\": 1532661565.093545, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661562.522086}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:25 INFO 140074042410816] Serializing to /opt/ml/model/model_algo-1\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:25 INFO 140074042410816] Saved checkpoint to \"/opt/ml/model/model_algo-1-0000.params\"\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"model.serialize.time\": {\"count\": 1, \"max\": 420.93801498413086, \"sum\": 420.93801498413086, \"min\": 420.93801498413086}}, \"EndTime\": 1532661565.514611, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661565.093624}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:25 INFO 140074042410816] Successfully serialized the model for prediction.\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:25 INFO 140074042410816] Evaluating model accuracy on testset using 100 samples\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"model.bind.time\": {\"count\": 1, \"max\": 0.03886222839355469, \"sum\": 0.03886222839355469, \"min\": 0.03886222839355469}}, \"EndTime\": 1532661565.515471, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661565.514672}\n",
"\u001b[0m\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m#metrics {\"Metrics\": {\"model.score.time\": {\"count\": 1, \"max\": 3564.686059951782, \"sum\": 3564.686059951782, \"min\": 3564.686059951782}}, \"EndTime\": 1532661569.080128, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661565.515531}\n",
"\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, RMSE): 0.38677016793\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, mean_wQuantileLoss): 0.203314\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.1]): 0.119702\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.2]): 0.186436\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.3]): 0.232395\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.4]): 0.260534\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.5]): 0.270927\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.6]): 0.259778\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.7]): 0.228279\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.8]): 0.17313\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #test_score (algo-1, wQuantileLoss[0.9]): 0.0986483\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #quality_metric: host=algo-1, test RMSE =0.38677016793\u001b[0m\n",
"\u001b[31m[07/27/2018 03:19:29 INFO 140074042410816] #quality_metric: host=algo-1, test mean_wQuantileLoss =0.203314334154\u001b[0m\n",
"\u001b[31m#metrics {\"Metrics\": {\"totaltime\": {\"count\": 1, \"max\": 43657.426834106445, \"sum\": 43657.426834106445, \"min\": 43657.426834106445}, \"setuptime\": {\"count\": 1, \"max\": 10.553836822509766, \"sum\": 10.553836822509766, \"min\": 10.553836822509766}}, \"EndTime\": 1532661569.716042, \"Dimensions\": {\"Host\": \"algo-1\", \"Operation\": \"training\", \"Algorithm\": \"AWS/DeepAR\"}, \"StartTime\": 1532661569.080218}\n",
"\u001b[0m\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/session.py:782: DeprecationWarning: generator 'multi_stream_iter' raised StopIteration\n",
" for idx, event in sagemaker.logs.multi_stream_iter(client, log_group, stream_names, positions):\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"===== Job Complete =====\n",
"Billable seconds: 168\n",
"CPU times: user 468 ms, sys: 48 ms, total: 516 ms\n",
"Wall time: 4min 42s\n"
]
}
],
"source": [
"%%time\n",
"data_channels = {\n",
" \"train\": \"{}/train/\".format(s3_data_path),\n",
" \"test\": \"{}/test/\".format(s3_data_path)\n",
"}\n",
"\n",
"estimator.fit(inputs=data_channels)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since you pass a test set in this example, accuracy metrics for the forecast are computed and logged (see bottom of the log).\n",
"You can find the definition of these metrics from [our documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html). You can use these to optimize the parameters and tune your model or use SageMaker's [Automated Model Tuning service](https://aws.amazon.com/blogs/aws/sagemaker-automatic-model-tuning/) to tune the model for you."
]
},
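  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of how the logged scores relate to each other, the sketch below implements RMSE and the weighted quantile loss following the definitions in the DeepAR documentation, then checks that the nine per-quantile losses in the log average to the logged `mean_wQuantileLoss`. The function names are our own and are not part of the SageMaker SDK; the training container computes its scores internally, so treat this as an approximation of the idea rather than the exact implementation.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def weighted_quantile_loss(y_true, y_quantile, tau):\n",
    "    # Pinball loss summed over all points, normalised by the total absolute target.\n",
    "    pinball = sum(\n",
    "        tau * max(y - q, 0.0) + (1.0 - tau) * max(q - y, 0.0)\n",
    "        for y, q in zip(y_true, y_quantile)\n",
    "    )\n",
    "    return 2.0 * pinball / sum(abs(y) for y in y_true)\n",
    "\n",
    "\n",
    "def rmse(y_true, y_pred):\n",
    "    # Root of the mean squared difference between forecast and target.\n",
    "    squared_errors = [(y - p) ** 2 for y, p in zip(y_true, y_pred)]\n",
    "    return math.sqrt(sum(squared_errors) / len(squared_errors))\n",
    "\n",
    "\n",
    "# Per-quantile test scores copied from the training log (quantiles 0.1 to 0.9).\n",
    "logged_losses = [0.119702, 0.186436, 0.232395, 0.260534, 0.270927,\n",
    "                 0.259778, 0.228279, 0.17313, 0.0986483]\n",
    "mean_loss = sum(logged_losses) / len(logged_losses)\n",
    "print(round(mean_loss, 6))\n",
    "```\n",
    "\n",
    "A lower weighted quantile loss at a given quantile means the predicted quantile tracks the observed values more closely, relative to the overall magnitude of the series."
   ]
  },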
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create endpoint and predictor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have a trained model, we can use it to perform predictions by deploying it to an endpoint.\n",
"\n",
"**Note: Remember to delete the endpoint after running this experiment. A cell at the very bottom of this notebook will do that: make sure you run it at the end.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To query the endpoint and perform predictions, we can define the following utility class: this allows making requests using `pandas.Series` objects rather than raw JSON strings."
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [],
"source": [
"class DeepARPredictor(sagemaker.predictor.RealTimePredictor):\n",
" \n",
" def __init__(self, *args, **kwargs):\n",
" super().__init__(*args, content_type=sagemaker.content_types.CONTENT_TYPE_JSON, **kwargs)\n",
" \n",
" def predict(self, ts, cat=None, dynamic_feat=None, \n",
" num_samples=100, return_samples=False, quantiles=[\"0.1\", \"0.5\", \"0.9\"]):\n",
" \"\"\"Requests the prediction of for the time series listed in `ts`, each with the (optional)\n",
" corresponding category listed in `cat`.\n",
" \n",
" ts -- `pandas.Series` object, the time series to predict\n",
" cat -- integer, the group associated to the time series (default: None)\n",
" num_samples -- integer, number of samples to compute at prediction time (default: 100)\n",
" return_samples -- boolean indicating whether to include samples in the response (default: False)\n",
" quantiles -- list of strings specifying the quantiles to compute (default: [\"0.1\", \"0.5\", \"0.9\"])\n",
" \n",
" Return value: list of `pandas.DataFrame` objects, each containing the predictions\n",
" \"\"\"\n",
" prediction_time = ts.index[-1] + 1\n",
" quantiles = [str(q) for q in quantiles]\n",
" req = self.__encode_request(ts, cat, dynamic_feat, num_samples, return_samples, quantiles)\n",
" res = super(DeepARPredictor, self).predict(req)\n",
" return self.__decode_response(res, ts.index.freq, prediction_time, return_samples)\n",
" \n",
" def __encode_request(self, ts, cat, dynamic_feat, num_samples, return_samples, quantiles):\n",
" instance = series_to_dict(ts, cat if cat is not None else None, dynamic_feat if dynamic_feat else None)\n",
"\n",
" configuration = {\n",
" \"num_samples\": num_samples,\n",
" \"output_types\": [\"quantiles\", \"samples\"] if return_samples else [\"quantiles\"],\n",
" \"quantiles\": quantiles\n",
" }\n",
" \n",
" http_request_data = {\n",
" \"instances\": [instance],\n",
" \"configuration\": configuration\n",
" }\n",
" \n",
" return json.dumps(http_request_data).encode('utf-8')\n",
" \n",
" def __decode_response(self, response, freq, prediction_time, return_samples):\n",
" # we only sent one time series so we only receive one in return\n",
" # however, if possible one will pass multiple time series as predictions will then be faster\n",
" predictions = json.loads(response.decode('utf-8'))['predictions'][0]\n",
" prediction_length = len(next(iter(predictions['quantiles'].values())))\n",
" prediction_index = pd.DatetimeIndex(start=prediction_time, freq=freq, periods=prediction_length) \n",
" if return_samples:\n",
" dict_of_samples = {'sample_' + str(i): s for i, s in enumerate(predictions['samples'])}\n",
" else:\n",
" dict_of_samples = {}\n",
" return pd.DataFrame(data={**predictions['quantiles'], **dict_of_samples}, index=prediction_index)\n",
"\n",
" def set_frequency(self, freq):\n",
" self.freq = freq\n",
" \n",
"def encode_target(ts):\n",
" return [x if np.isfinite(x) else \"NaN\" for x in ts] \n",
"\n",
"def series_to_dict(ts, cat=None, dynamic_feat=None):\n",
" \"\"\"Given a pandas.Series object, returns a dictionary encoding the time series.\n",
"\n",
" ts -- a pands.Series object with the target time series\n",
" cat -- an integer indicating the time series category\n",
"\n",
" Return value: a dictionary\n",
" \"\"\"\n",
" obj = {\"start\": str(ts.index[0]), \"target\": encode_target(ts)}\n",
" if cat is not None:\n",
" obj[\"cat\"] = cat\n",
" if dynamic_feat is not None:\n",
" obj[\"dynamic_feat\"] = dynamic_feat \n",
" return obj"
]
},
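  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the wire format concrete, here is a minimal sketch of the JSON body that the class above assembles for a single time series, written without pandas or the SageMaker SDK. The helper name `build_request`, the start timestamp and the target values are made up for illustration; only the field names (`instances`, `configuration`, `start`, `target`, `num_samples`, `output_types`, `quantiles`) are taken from the code above.\n",
    "\n",
    "```python\n",
    "import json\n",
    "import math\n",
    "\n",
    "\n",
    "def build_request(start, target, num_samples=100, quantiles=('0.1', '0.5', '0.9')):\n",
    "    # Missing observations are sent as the string 'NaN', mirroring encode_target.\n",
    "    instance = {\n",
    "        'start': start,\n",
    "        'target': [x if math.isfinite(x) else 'NaN' for x in target],\n",
    "    }\n",
    "    configuration = {\n",
    "        'num_samples': num_samples,\n",
    "        'output_types': ['quantiles'],\n",
    "        'quantiles': [str(q) for q in quantiles],\n",
    "    }\n",
    "    return json.dumps({'instances': [instance], 'configuration': configuration})\n",
    "\n",
    "\n",
    "# Hypothetical three-point series with one missing value.\n",
    "payload = build_request('2010-09-01 00:00:00', [1.2, float('nan'), 0.9])\n",
    "print(payload)\n",
    "```\n",
    "\n",
    "Encoding missing values as a string keeps the payload valid JSON, since a bare `NaN` token is not part of the JSON standard."
   ]
  },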
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can deploy the model and create and endpoint that can be queried using our custom DeepARPredictor class."
]
},
{
"cell_type": "code",
"execution_count": 216,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:sagemaker:Creating model with name: forecasting-deepar-2018-07-27-03-22-49-329\n",
"INFO:sagemaker:Creating endpoint with name deepar-home-electricity-demo-2018-07-27-03-15-18-550\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"--------------------------------------------------------------!"
]
}
],
"source": [
"predictor = estimator.deploy(\n",
" initial_instance_count=1,\n",
" instance_type='ml.m4.xlarge',\n",
" predictor_cls=DeepARPredictor)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Make predictions and plot results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can use the `predictor` object to generate predictions."
]
},
{
"cell_type": "code",
"execution_count": 217,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" 0.1 0.5 0.9\n",
"2010-09-28 0.790016 0.968579 1.102380\n",
"2010-09-29 0.744394 0.925553 1.131745\n",
"2010-09-30 0.780791 0.999924 1.199642\n",
"2010-10-01 0.853651 1.013829 1.180801\n",
"2010-10-02 0.792879 0.994975 1.229360\n",
"2010-10-03 0.775380 0.958632 1.198079\n",
"2010-10-04 0.762203 0.960841 1.154595\n",
"2010-10-05 0.770622 1.005522 1.173487\n",
"2010-10-06 0.773807 0.982053 1.176765\n",
"2010-10-07 0.899876 1.072179 1.279244\n",
"2010-10-08 0.815179 1.030676 1.242005\n",
"2010-10-09 0.846799 1.075428 1.238595\n",
"2010-10-10 0.825716 1.065997 1.253097\n",
"2010-10-11 0.884659 1.077818 1.245890\n",
"2010-10-12 0.920730 1.073006 1.231640\n",
"2010-10-13 0.901258 1.109938 1.298712\n",
"2010-10-14 0.921772 1.064525 1.289613\n",
"2010-10-15 0.920833 1.111202 1.273061\n",
"2010-10-16 1.029230 1.188139 1.344286\n",
"2010-10-17 0.911581 1.111804 1.263128\n",
"2010-10-18 0.900885 1.073449 1.198113\n",
"2010-10-19 0.981259 1.093404 1.269331\n",
"2010-10-20 0.936951 1.104434 1.282887\n",
"2010-10-21 0.936540 1.088976 1.319242\n",
"2010-10-22 0.942080 1.122903 1.362140\n",
"2010-10-23 0.932080 1.119267 1.325692\n",
"2010-10-24 0.895720 1.140803 1.406739\n",
"2010-10-25 0.897375 1.112538 1.328852\n",
"2010-10-26 0.924637 1.168070 1.364797\n",
"2010-10-27 0.957276 1.175334 1.382472\n",
"2010-10-28 0.922212 1.157789 1.396143\n",
"2010-10-29 0.967077 1.199347 1.414492\n",
"2010-10-30 0.830030 1.195158 1.476995\n",
"2010-10-31 0.951630 1.270762 1.669064\n",
"2010-11-01 0.809570 1.112877 1.352301\n",
"2010-11-02 1.001853 1.168526 1.374803\n",
"2010-11-03 0.885834 1.084763 1.243589\n",
"2010-11-04 0.782202 1.071941 1.283330\n",
"2010-11-05 0.933213 1.149291 1.367097\n",
"2010-11-06 0.951722 1.190081 1.398906\n",
"2010-11-07 0.967498 1.161788 1.393864\n",
"2010-11-08 0.915501 1.155778 1.372192\n",
"2010-11-09 0.971213 1.198470 1.409445\n",
"2010-11-10 0.904380 1.075270 1.346587\n",
"2010-11-11 0.962840 1.147342 1.325590\n",
"2010-11-12 1.091086 1.283538 1.502536\n",
"2010-11-13 0.985451 1.207216 1.435151\n",
"2010-11-14 0.873823 1.155347 1.407026\n",
"2010-11-15 0.907594 1.095161 1.333333\n",
"2010-11-16 1.021516 1.178446 1.405428\n",
"2010-11-17 1.005996 1.200050 1.414176\n",
"2010-11-18 0.972625 1.208396 1.411030\n",
"2010-11-19 1.036822 1.217191 1.393905\n",
"2010-11-20 1.032539 1.240849 1.465980\n",
"2010-11-21 1.046555 1.229090 1.448083\n",
"2010-11-22 1.013634 1.188815 1.428163\n",
"2010-11-23 0.997702 1.177898 1.392498\n",
"2010-11-24 0.859340 1.073107 1.274164\n",
"2010-11-25 0.938809 1.146115 1.351901\n",
"2010-11-26 0.996817 1.178803 1.396186"
]
},
"execution_count": 217,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictor.predict(timeseries[:-prediction_length], quantiles=[0.10, 0.5, 0.90])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below we define a plotting function that queries the model and displays the forecast."
]
},
{
"cell_type": "code",
"execution_count": 218,
"metadata": {},
"outputs": [],
"source": [
"def plot(\n",
"    predictor,\n",
"    target_ts,\n",
"    cat=None,\n",
"    dynamic_feat=None,\n",
"    forecast_date=end_training,\n",
"    show_samples=False,\n",
"    plot_history=7 * 12,\n",
"    confidence=80\n",
"):\n",
"    print(\"calling served model to generate predictions starting from {}\".format(str(forecast_date)))\n",
"    assert 50 < confidence < 100\n",
"    low_quantile = 0.5 - confidence * 0.005\n",
"    up_quantile = confidence * 0.005 + 0.5\n",
"\n",
"    # we first construct the argument to call our model\n",
"    args = {\n",
"        \"ts\": target_ts[:forecast_date],\n",
"        \"return_samples\": show_samples,\n",
"        \"quantiles\": [low_quantile, 0.5, up_quantile],\n",
"        \"num_samples\": 100\n",
"    }\n",
"\n",
"    # use a taller figure when dynamic features are plotted below the forecast\n",
"    if dynamic_feat is not None:\n",
"        args[\"dynamic_feat\"] = dynamic_feat\n",
"        fig = plt.figure(figsize=(20, 6))\n",
"        ax = plt.subplot(2, 1, 1)\n",
"    else:\n",
"        fig = plt.figure(figsize=(20, 3))\n",
"        ax = plt.subplot(1, 1, 1)\n",
"\n",
"    if cat is not None:\n",
"        args[\"cat\"] = cat\n",
"        ax.text(0.9, 0.9, 'cat = {}'.format(cat), transform=ax.transAxes)\n",
"\n",
"    # call the endpoint to get the prediction\n",
"    prediction = predictor.predict(**args)\n",
"\n",
"    # plot the samples\n",
"    if show_samples:\n",
"        for key in prediction.keys():\n",
"            if \"sample\" in key:\n",
"                prediction[key].plot(color='lightskyblue', alpha=0.2, label='_nolegend_')\n",
"\n",
"    # plot the target\n",
"    target_section = target_ts[forecast_date-plot_history:forecast_date+prediction_length]\n",
"    target_section.plot(color=\"black\", label='target')\n",
"\n",
"    # plot the confidence interval and the predicted median\n",
"    ax.fill_between(\n",
"        prediction[str(low_quantile)].index,\n",
"        prediction[str(low_quantile)].values,\n",
"        prediction[str(up_quantile)].values,\n",
"        color=\"b\", alpha=0.3, label='{}% confidence interval'.format(confidence)\n",
"    )\n",
"    prediction[\"0.5\"].plot(color=\"b\", label='P50')\n",
"    ax.legend(loc=2)\n",
"\n",
"    # fix the scale as the samples may change it\n",
"    ax.set_ylim(target_section.min() * 0.5, target_section.max() * 1.5)\n",
"\n",
"    # plot each dynamic feature in its own subplot, sharing the x axis with the forecast\n",
"    if dynamic_feat is not None:\n",
"        for i, f in enumerate(dynamic_feat, start=1):\n",
"            ax = plt.subplot(len(dynamic_feat) * 2, 1, len(dynamic_feat) + i, sharex=ax)\n",
"            feat_ts = pd.Series(\n",
"                index=pd.date_range(start=target_ts.index[0], freq=target_ts.index.freq, periods=len(f)),\n",
"                data=f\n",
"            )\n",
"            feat_ts[forecast_date-plot_history:forecast_date+prediction_length].plot(ax=ax, color='g')"
]
},
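{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `plot` function above turns a central confidence level into a lower and an upper quantile. The cell below is a minimal sketch of that mapping. The helper name `quantile_bounds` and the rounding step are assumptions added here for illustration; they are not part of the original function. Rounding is included because `0.5 - 0.4` evaluates to `0.09999999999999998` in floating point, which matters when a quantile is later converted to a string and used as a column key."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def quantile_bounds(confidence, ndigits=4):\n",
"    \"\"\"Map a central confidence level in percent to (lower, upper) quantiles.\"\"\"\n",
"    if not 50 < confidence < 100:\n",
"        raise ValueError('confidence must be strictly between 50 and 100')\n",
"    half_width = confidence / 200.0\n",
"    # Rounding is an addition of this sketch: it keeps 0.1 from becoming 0.09999999999999998.\n",
"    return round(0.5 - half_width, ndigits), round(0.5 + half_width, ndigits)\n",
"\n",
"\n",
"low, up = quantile_bounds(80)\n",
"print(low, up)"
]
},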
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can use the function defined above to look at the forecast of any time series at any point in (future) time.\n",
"\n",
"For each request, the predictions are obtained by calling our served model on the fly.\n",
"\n",
"Here we forecast the consumption of an office after a weekend (note the lower weekend consumption).\n",
"You can select any time series and any forecast date; click `Run Interact` to generate the predictions from our served endpoint and see the plot."
]
},
{
"cell_type": "code",
"execution_count": 219,
"metadata": {},
"outputs": [],
"source": [
"style = {'description_width': 'initial'}"
]
},
{
"cell_type": "code",
"execution_count": 241,
"metadata": {},
"outputs": [],
"source": [
"list_of_df = predictor.predict(timeseries[:-prediction_length])\n",
"actual_data = timeseries[-prediction_length:]"
]
},
{
"cell_type": "code",
"execution_count": 242,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"60"
]
},
"execution_count": 242,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(actual_data)"
]
},
{
"cell_type": "code",
"execution_count": 243,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" 0.1 0.5 0.9\n",
"2010-09-28 0.790016 0.968579 1.102380\n",
"2010-09-29 0.744394 0.925553 1.131745\n",
"2010-09-30 0.780791 0.999924 1.199642\n",
"2010-10-01 0.853651 1.013829 1.180801\n",
"2010-10-02 0.792879 0.994975 1.229360\n",
"2010-10-03 0.775380 0.958632 1.198079\n",
"2010-10-04 0.762203 0.960841 1.154595\n",
"2010-10-05 0.770622 1.005522 1.173487\n",
"2010-10-06 0.773807 0.982053 1.176765\n",
"2010-10-07 0.899876 1.072179 1.279244\n",
"2010-10-08 0.815179 1.030676 1.242005\n",
"2010-10-09 0.846799 1.075428 1.238595\n",
"2010-10-10 0.825716 1.065997 1.253097\n",
"2010-10-11 0.884659 1.077818 1.245890\n",
"2010-10-12 0.920730 1.073006 1.231640\n",
"2010-10-13 0.901258 1.109938 1.298712\n",
"2010-10-14 0.921772 1.064525 1.289613\n",
"2010-10-15 0.920833 1.111202 1.273061\n",
"2010-10-16 1.029230 1.188139 1.344286\n",
"2010-10-17 0.911581 1.111804 1.263128\n",
"2010-10-18 0.900885 1.073449 1.198113\n",
"2010-10-19 0.981259 1.093404 1.269331\n",
"2010-10-20 0.936951 1.104434 1.282887\n",
"2010-10-21 0.936540 1.088976 1.319242\n",
"2010-10-22 0.942080 1.122903 1.362140\n",
"2010-10-23 0.932080 1.119267 1.325692\n",
"2010-10-24 0.895720 1.140803 1.406739\n",
"2010-10-25 0.897375 1.112538 1.328852\n",
"2010-10-26 0.924637 1.168070 1.364797\n",
"2010-10-27 0.957276 1.175334 1.382472\n",
"2010-10-28 0.922212 1.157789 1.396143\n",
"2010-10-29 0.967077 1.199347 1.414492\n",
"2010-10-30 0.830030 1.195158 1.476995\n",
"2010-10-31 0.951630 1.270762 1.669064\n",
"2010-11-01 0.809570 1.112877 1.352301\n",
"2010-11-02 1.001853 1.168526 1.374803\n",
"2010-11-03 0.885834 1.084763 1.243589\n",
"2010-11-04 0.782202 1.071941 1.283330\n",
"2010-11-05 0.933213 1.149291 1.367097\n",
"2010-11-06 0.951722 1.190081 1.398906\n",
"2010-11-07 0.967498 1.161788 1.393864\n",
"2010-11-08 0.915501 1.155778 1.372192\n",
"2010-11-09 0.971213 1.198470 1.409445\n",
"2010-11-10 0.904380 1.075270 1.346587\n",
"2010-11-11 0.962840 1.147342 1.325590\n",
"2010-11-12 1.091086 1.283538 1.502536\n",
"2010-11-13 0.985451 1.207216 1.435151\n",
"2010-11-14 0.873823 1.155347 1.407026\n",
"2010-11-15 0.907594 1.095161 1.333333\n",
"2010-11-16 1.021516 1.178446 1.405428\n",
"2010-11-17 1.005996 1.200050 1.414176\n",
"2010-11-18 0.972625 1.208396 1.411030\n",
"2010-11-19 1.036822 1.217191 1.393905\n",
"2010-11-20 1.032539 1.240849 1.465980\n",
"2010-11-21 1.046555 1.229090 1.448083\n",
"2010-11-22 1.013634 1.188815 1.428163\n",
"2010-11-23 0.997702 1.177898 1.392498\n",
"2010-11-24 0.859340 1.073107 1.274164\n",
"2010-11-25 0.938809 1.146115 1.351901\n",
"2010-11-26 0.996817 1.178803 1.396186"
]
},
"execution_count": 243,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list_of_df"
]
},
{
"cell_type": "code",
"execution_count": 244,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Date_Time\n",
"2010-11-22 1.417733\n",
"2010-11-23 1.095511\n",
"2010-11-24 1.247394\n",
"2010-11-25 0.993864\n",
"2010-11-26 1.178230\n",
"Freq: D, Name: Global_active_power, dtype: float64"
]
},
"execution_count": 244,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"timeseries.tail()"
]
},
{
"cell_type": "code",
"execution_count": 264,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(20,6))\n",
"timeseries[800:-len(list_of_df)].plot(color='#C3C8C4', linewidth=1.0)\n",
"p10 = list_of_df['0.1']\n",
"p90 = list_of_df['0.9']\n",
"plt.fill_between(p10.index, p10, p90, color='#C5F7AB', alpha=0.5, label='80% confidence interval')\n",
"actual_data.plot(color='#FCE08F', label='target')\n",
"list_of_df['0.5'].plot(marker='^', linewidth=3.0, label='prediction median')\n",
"plt.legend()\n",
"plt.show()"
]
},
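{
"cell_type": "markdown",
"metadata": {},
"source": [
"One rough way to sanity-check the 80% interval plotted above is to count how often the held-out values fall between the 0.1 and 0.9 quantile forecasts. The cell below is a minimal sketch with made-up numbers, not the notebook's results, and the helper name `interval_coverage` is hypothetical. With the objects defined earlier, one could pass `actual_data`, `list_of_df['0.1']` and `list_of_df['0.9']`, assuming the three series cover the same dates in the same order."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def interval_coverage(actual, lower, upper):\n",
"    \"\"\"Fraction of actual values that fall inside [lower, upper], compared position by position.\"\"\"\n",
"    triples = list(zip(actual, lower, upper))\n",
"    inside = sum(1 for a, lo, hi in triples if lo <= a <= hi)\n",
"    return inside / len(triples)\n",
"\n",
"\n",
"# Made-up values for illustration only.\n",
"actual = [1.0, 1.2, 0.7, 1.5, 1.1]\n",
"lower = [0.8, 0.8, 0.8, 0.9, 0.9]\n",
"upper = [1.2, 1.2, 1.2, 1.3, 1.3]\n",
"print(interval_coverage(actual, lower, upper))"
]
},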
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Additional features\n",
"\n",
"DeepAR supports several additional features:\n",
"\n",
"* Missing values: DeepAR can handle missing values in the time series during training as well as for inference.\n",
"* Additional time features: DeepAR provides a set of default time series features, such as hour of day. In addition, you can provide your own feature time series via the `dynamic_feat` field.\n",
"* Generalized frequencies: any integer multiple of the previously supported base frequencies (minutes `min`, hours `H`, days `D`, weeks `W`, months `M`) is now allowed, e.g., `15min`. We already demonstrated this above by using `2H` frequency.\n",
"* Categories: if your time series belong to different groups (e.g., types of product, regions), this information can be encoded as one or more categorical features using the `cat` field.\n"
]
},
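{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of the fields listed above, the cell below builds a single JSON record that combines a missing value, a `cat` entry and a `dynamic_feat` series. It mirrors the `encode_target` and `series_to_dict` helpers defined earlier but uses only the standard library. The helper name `to_record`, the sample values, and the choice to pass `cat` as a one-element list are assumptions made for this sketch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import math\n",
"\n",
"\n",
"def to_record(start, target, cat=None, dynamic_feat=None):\n",
"    \"\"\"Build one record; non-finite target values are encoded as the string 'NaN'.\"\"\"\n",
"    record = {\n",
"        'start': start,\n",
"        'target': [x if math.isfinite(x) else 'NaN' for x in target],\n",
"    }\n",
"    if cat is not None:\n",
"        record['cat'] = cat\n",
"    if dynamic_feat is not None:\n",
"        record['dynamic_feat'] = dynamic_feat\n",
"    return record\n",
"\n",
"\n",
"# Made-up values for illustration only.\n",
"record = to_record(\n",
"    start='2010-01-01 00:00:00',\n",
"    target=[1.2, float('nan'), 0.9],\n",
"    cat=[0],\n",
"    dynamic_feat=[[0, 1, 0]],\n",
")\n",
"print(json.dumps(record))"
]
},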
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Building on the results above, we can implement more advanced models that make use of these features."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Delete endpoints"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictor.delete_endpoint()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_mxnet_p36",
"language": "python",
"name": "conda_mxnet_p36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
},
"notice": "Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
},
"nbformat": 4,
"nbformat_minor": 2
}