\n",
"\n",
"This notebook demonstrates how to train an RL agent for Energy Storage System (ESS) arbitrage. The simulated energy environment is based on the paper [Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning](https://arxiv.org/abs/1904.12232) and uses [this sample dataset](https://aemo.com.au/en/energy-systems/electricity/national-electricity-market-nem/data-nem/aggregated-data).\n",
"\n",
"Ensure that your Python virtual environment has the required Python packages installed, as specified in `GITROOT/setup.py`."
]
},
{
"cell_type": "markdown",
"id": "fbdbe4b2",
"metadata": {},
"source": [
"# Global config\n",
"\n",
"**NOTE: for pedagogical purposes, this notebook fixes the random seed by calling\n",
"`np.random.seed(1)` in every cell that trains an agent. A single `np.random.seed(1)` call at the top of the\n",
"notebook is NOT sufficient, because each training run advances the global random state; re-seeding before every\n",
"run keeps each result reproducible regardless of which cells were executed earlier.**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e45c416-7e97-4759-8a72-9eb3a2cce610",
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"%config InlineBackend.figure_format = 'retina'\n",
"%load_ext autoreload\n",
"%autoreload 2\n",
"\n",
"import numpy as np\n",
"import os\n",
"import pandas as pd\n",
"from pathlib import Path\n",
"from typing import Any, Dict, List, Optional, Union\n",
"\n",
"from energy_storage_system.agents import Agent, MovingAveragePriceAgent, PriceVsCostAgent, RandomAgent\n",
"from energy_storage_system.envs import SimpleBattery\n",
"from energy_storage_system.utils import ReportIO, evaluate_episode, plot_reward, plot_analysis, train\n",
"\n",
"\n",
"# Pre-create GITROOT/data and its sub-directories (NOTE: GITROOT/data is not versioned).\n",
"data_dir = Path('../data')\n",
"%set_env DATA_DIR=$data_dir\n",
"!mkdir -p $DATA_DIR/agent_input $DATA_DIR/agent_output $DATA_DIR/bokeh_output $DATA_DIR/streamlit_input\n",
"\n",
"env_config = {\n",
" \"MAX_STEPS_PER_EPISODE\": 168,\n",
" \"LOCAL\": True, # True means to use data from local src folder instead of S3.\n",
" \"FILEPATH\": data_dir / 'agent_input' / 'sample-data.csv'\n",
"}\n",
"\n",
"# True means to download year 2020.\n",
"# False means to download only March 2021.\n",
"yearly_data = False"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c594901c",
"metadata": {},
"outputs": [],
"source": [
"# Execute this cell to download the sample data to a local file called ../data/agent_input/sample-data.csv\n",
"if not yearly_data:\n",
"    !curl https://aemo.com.au/aemo/data/nem/priceanddemand/PRICE_AND_DEMAND_202103_NSW1.csv > $DATA_DIR/agent_input/sample-data.csv\n",
"else:\n",
"    # Download the 12 monthly files of year 2020.\n",
"    !bash ../bin/download_data.sh $DATA_DIR/price_demand_data &> /dev/null\n",
"\n",
"    # Combine those 12 .csv files into ../data/agent_input/sample-data.csv\n",
"    # (Path.glob is used here, so the glob module is not needed).\n",
"    files = (data_dir / 'price_demand_data').glob('PRICE_AND_DEMAND*.csv')\n",
"    df = pd.concat([pd.read_csv(f) for f in files], axis=0, ignore_index=True)\n",
"    df.sort_values('SETTLEMENTDATE', inplace=True)\n",
"    df.to_csv(data_dir / 'agent_input' / 'sample-data.csv', index=False)"
]
},
{
"cell_type": "markdown",
"id": "c4044060-5ec6-4766-9324-90b77338342c",
"metadata": {},
"source": [
"# Battery Environment\n",
"\n",
"With the sample data ready, instantiate a new gym environment for the energy storage system. This notebook defaults to\n",
"just 10 training episodes to keep training time short. To explore the agents' behavior further, increase the number\n",
"of training episodes (e.g., to 3000)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62986bad-c1fb-4af8-a523-1a02dcfcf148",
"metadata": {},
"outputs": [],
"source": [
"env = SimpleBattery(env_config)\n",
"\n",
"# More episodes means a longer training time.\n",
"episodes = 10"
]
},
{
"cell_type": "markdown",
"id": "c7b41a51",
"metadata": {},
"source": [
"The next cell defines a helper function `train_eval_save()` that trains, evaluates, and plots an agent. This function will be used to evaluate three baseline agents:\n",
"\n",
"1. a random agent\n",
"2. an agent that considers market price vs cost\n",
"3. an agent that considers the moving average of market price"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e418bd9f",
"metadata": {},
"outputs": [],
"source": [
"def train_eval_save(\n",
"    env: SimpleBattery,\n",
"    agent: Agent,\n",
"    episodes: int,\n",
"    np_seed: Optional[int] = None,\n",
"    bokeh_dir: Optional[Union[str, os.PathLike]] = None,\n",
"    streamlit_csv: Optional[Union[str, os.PathLike]] = None,\n",
") -> pd.DataFrame:\n",
"    \"\"\"Helper function to train, evaluate, inline plot, and save outputs.\n",
"\n",
"    Args:\n",
"        env (SimpleBattery): the energy gym environment.\n",
"        agent (Agent): a baseline agent.\n",
"        episodes (int): Number of training episodes.\n",
"        np_seed (Optional[int], optional): Random seed (None means do not fix it). Defaults to None.\n",
"        bokeh_dir (Optional[Union[str, os.PathLike]], optional): Where to save interactive .html\n",
"            reports (None means do not generate these files). Defaults to None.\n",
"        streamlit_csv (Optional[Union[str, os.PathLike]], optional): Where to save the .csv file\n",
"            that the Streamlit demo can visualize (None means do not generate this file).\n",
"            Defaults to None.\n",
"\n",
"    Returns:\n",
"        pd.DataFrame: exploitation results.\n",
"    \"\"\"\n",
"    if np_seed is not None:\n",
"        np.random.seed(np_seed)\n",
"\n",
"    # Training\n",
"    train_results = train(env, agent, episodes)\n",
"    plot_reward(train_results.rewards_list)  # Jupyter autoplots the returned fig\n",
"    print(\"Average rewards across training episodes:\", train_results.mean_rewards)\n",
"\n",
"    # Evaluation\n",
"    df_eval = evaluate_episode(agent, env)\n",
"    plot_analysis(df_eval)  # Jupyter autoplots the returned fig\n",
"\n",
"    # Generate bokeh input\n",
"    if bokeh_dir is not None:\n",
"        # close_fig=True prevents Jupyter from auto-displaying the generated figures,\n",
"        # which are identical to those produced by the calls above.\n",
"        ReportIO(bokeh_dir).save2(train_results.rewards_list, df_eval, close_fig=True)\n",
"\n",
"    # Generate streamlit input\n",
"    if streamlit_csv is not None:\n",
"        df_eval.to_csv(streamlit_csv, index=False)\n",
"\n",
"    return df_eval"
]
},
{
"cell_type": "markdown",
"id": "5ecbf84f-a087-4862-aa5b-cb93b6bdc391",
"metadata": {},
"source": [
"# Train and evaluate baseline agents"
]
},
{
"cell_type": "markdown",
"id": "9ca85df4",
"metadata": {},
"source": [
"## Random Agent\n",
"\n",
"Train an agent that acts randomly. This purely demonstrates how to use the `energy_storage_system` module,\n",
"so its outputs are not saved.\n",
"\n",
"**Policy evaluation and observation**: the agent's actions are entirely random, regardless of price and cost."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b001681",
"metadata": {},
"outputs": [],
"source": [
"df_eval_random = train_eval_save(env, RandomAgent(), episodes, np_seed=1)"
]
},
{
"cell_type": "markdown",
"id": "b4ebf816",
"metadata": {},
"source": [
"## Market price vs cost agent\n",
"\n",
"This agent behaves as follows:\n",
"\n",
"- SELL: when market price is higher than cost\n",
"- BUY: when market price is lower than cost\n",
"- HOLD: otherwise\n",
"\n",
"**Policy evaluation and observation**: the agent discharges (sell: 1) when price is higher than cost, and charges (buy: 0) when price is lower than cost. The action codes are:\n",
"\n",
"    CHARGE = 0\n",
"    DISCHARGE = 1\n",
"    HOLD = 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d83c5e8e",
"metadata": {},
"outputs": [],
"source": [
"df_eval_price_vs_cost = train_eval_save(\n",
" env,\n",
" PriceVsCostAgent(),\n",
" episodes,\n",
" np_seed=1,\n",
" # Save the output for downstream tasks (i.e., bokeh and streamlit).\n",
" bokeh_dir=data_dir / 'agent_output' / 'agent-price-vs-cost',\n",
" streamlit_csv=data_dir / 'streamlit_input' / 'result_price_vs_cost_agent.csv',\n",
")"
]
},
{
"cell_type": "markdown",
"id": "a581e4cc",
"metadata": {},
"source": [
"## Market Price vs Historical price Agent\n",
"\n",
"This agent behaves as follows:\n",
"\n",
"- SELL: when market price is higher than the average price of the past 5 days\n",
"- BUY: when market price is lower than the average price of the past 5 days\n",
"- HOLD: otherwise\n",
"\n",
"**Policy evaluation and observation**: the agent starts selling when the market price is increasing (higher than the average of the last 5 days), and buys when the market price is dropping. The action codes are:\n",
"\n",
"    CHARGE = 0\n",
"    DISCHARGE = 1\n",
"    HOLD = 2\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bbf0be58",
"metadata": {},
"outputs": [],
"source": [
"df_eval_ma = train_eval_save(\n",
" env,\n",
" MovingAveragePriceAgent(),\n",
" episodes,\n",
" np_seed=1,\n",
" # Save the output for downstream tasks (i.e., bokeh and streamlit).\n",
" bokeh_dir=data_dir / 'agent_output' / 'agent-hist-price',\n",
" streamlit_csv=data_dir / 'streamlit_input' / 'result_hist_price_agent.csv',\n",
")"
]
},
{
"cell_type": "markdown",
"id": "bdc3ee03-3924-4306-bba1-008c20a95872",
"metadata": {},
"source": [
"## SageMaker RL - DQN\n",
"\n",
"The next step is to use the DQN algorithm running on SageMaker RL. Please refer to [01_battery_sim_on_sm.ipynb](01_battery_sim_on_sm.ipynb) and\n",
"[02_battery_sim_on_sm-eval.ipynb](02_battery_sim_on_sm-eval.ipynb)."
]
},
{
"cell_type": "markdown",
"id": "e4e2c544-72d5-4bc7-9bd5-dc4fb7aac63c",
"metadata": {},
"source": [
"# Generate interactive reports for baseline agents\n",
"\n",
"The next cell uses `ipython` to generate interactive reports (i.e., `.html` files). Once the cell completes, feel free\n",
"to open and inspect the generated `.html` files. The cell uses `ipython` rather than `python` so that the\n",
"`src/energy_storage_system` path defined in `ipython_config.py` is recognized.\n",
"\n",
"To execute the script from a terminal as `python -m xxx.yyy -o bokeh_output/abcd agent_output/xyz`, either:\n",
"\n",
"1. `pip install` this repo, or\n",
"2. modify `PYTHONPATH` environment variable accordingly."
]
},
{
"cell_type": "markdown",
"id": "64d4b058-c534-45e0-bda0-1f5fef4a358b",
"metadata": {},
"source": [
"## Per-agent reports"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbee6ecf-d90a-4f1c-93c8-aec6c3800f6d",
"metadata": {},
"outputs": [],
"source": [
"# ipython consumes every argument before --. Anything after -- is passed to the energy_storage_system.bokeh_report CLI script.\n",
"# When using python from terminal, remove the --.\n",
"!ipython -m energy_storage_system.bokeh_report -- -o $DATA_DIR/bokeh_output/agent-price-vs-cost $DATA_DIR/agent_output/agent-price-vs-cost\n",
"!ipython -m energy_storage_system.bokeh_report -- -o $DATA_DIR/bokeh_output/agent-hist-price $DATA_DIR/agent_output/agent-hist-price\n",
"\n",
"!echo \"Generated Bokeh reports:\"\n",
"!find $DATA_DIR/bokeh_output"
]
},
{
"cell_type": "markdown",
"id": "716197f2-63c6-4ba9-bb9b-dd19445f21c3",
"metadata": {},
"source": [
"## Comparing across agents\n",
"\n",
"Compare how each agent maintains the energy-inventory level across time."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "68b6c2e8-ff13-4542-bbcc-4db575efd412",
"metadata": {},
"outputs": [],
"source": [
"# ipython consumes every argument before --. Anything after -- is passed to the energy_storage_system.bokeh_energy_inventory CLI script.\n",
"# When using python from terminal, remove the --.\n",
"!ipython -m energy_storage_system.bokeh_energy_inventory -- -o $DATA_DIR/bokeh_output/energy-inventory $DATA_DIR/agent_output/\n",
"\n",
"!echo \"Generated Bokeh reports:\"\n",
"!find $DATA_DIR/bokeh_output/energy-inventory"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df62de0f-0938-4924-8c66-397442d35bda",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_python3",
"language": "python",
"name": "conda_python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.13"
},
"toc-autonumbering": true,
"toc-showcode": false,
"toc-showmarkdowntxt": false
},
"nbformat": 4,
"nbformat_minor": 5
}