{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Unity Cloud Simulation and Automated Playtest with Amazon SageMaker RL\n", "\n", "---\n", "## Introduction\n", "\n", "[Unity](https://unity.com/) is currently the most popular gaming engine used by game developers around the world. Unity engine can be used to create 3D, 2D, virtual reality, and augmented reality games, as well as simulations and other experiences. [ML-Agents](https://github.com/Unity-Technologies/ml-agents) is an open-sourced toolkit developed by Unity to enable games and simulations to serve as environments for training intelligent agents. It provides capabilities on how to interact with Unity executables as well as how to train a RL agent. For how to use ML-Agents to train a Unity game agent on SageMaker, please refer to this [notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/unity_ml_agents/unity_mlagents_learn.ipynb).\n", "\n", "This sample notebook walks you through how to train the RL agents and automate playtests against a Unity game by using cloud simulation with Amazon SageMaker RL. Once you create your new game with Unity, you can automate playtests for your game by running the RL agent on Amazon SageMaker RL. After the playtests, you can also evaluate and visualize the result from game logs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bring your own custom Unity environment\n", "\n", "For this sample, you can use your custom Unity executable for training.\n", "1. [Create a Unity executable](https://github.com/Unity-Technologies/ml-agents/blob/742c2fbf01188fbf27e82d5a7d9b5fd42f0de67a/docs/Learning-Environment-Executable.md) of your environment for Linux. Please make sure your environment only contains one single agent.\n", "2. Upload the executable file, dependency data files and library files to s3.\n", "3. Specify the s3 path in the form of `S3:///` in the following variable. \n", "ex) `s3_binary_path = 's3://unity-environment-1113/RollerBall'` \n", "It is used later for the training job and the evaluation step." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s3_binary_path = 's3://sagemaker-unity-mlagents/RollerBall'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pre-requisites\n", "\n", "### Imports\n", "\n", "To get started, we'll import the Python libraries we need, set up the environment with a few prerequisites for permissions and configurations." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "import sagemaker\n", "import boto3\n", "import sys\n", "import os\n", "import glob\n", "import re\n", "import subprocess\n", "import numpy as np\n", "from IPython.display import HTML\n", "import time\n", "from time import gmtime, strftime\n", "sys.path.append(\"common\")\n", "from misc import get_execution_role, wait_for_s3_object\n", "from docker_utils import build_and_push_docker_image\n", "from sagemaker.rl import RLEstimator, RLToolkit, RLFramework" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup S3 bucket\n", "\n", "Set up the linkage and authentication to the S3 bucket that you want to use for checkpoint and the metadata. Note that the following sets the S3 output path to a default S3 bucket of your account." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sage_session = sagemaker.session.Session()\n", "s3_bucket = sage_session.default_bucket()\n", "s3_output_path = 's3://{}/'.format(s3_bucket)\n", "print(\"S3 bucket path: {}\".format(s3_output_path))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define Variables \n", "\n", "We define variables such as the job prefix for the training jobs *and the image path for the container (only needed when bringing your own container).*" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create a descriptive job name\n", "job_name_prefix = 'rl-unity-ray'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure where training happens\n", "\n", "You can run your RL training jobs on a SageMaker notebook instance or on your local machine. In both scenarios, you can run the following in either local or SageMaker mode. Local mode uses the SageMaker Python SDK to run your code in a local container before deploying to SageMaker. This can speed up iterative testing and debugging while using the same familiar Python SDK interface. You just need to set `local_mode = True`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# run in local_mode on this machine, or as a SageMaker TrainingJob?\n", "local_mode = False\n", "\n", "if local_mode:\n", "    instance_type = 'local'\n", "else:\n", "    # If on SageMaker, pick the instance type\n", "    instance_type = \"ml.c5.2xlarge\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an IAM role\n", "\n", "Either get the execution role when running from a SageMaker notebook instance with `role = sagemaker.get_execution_role()` or, when running from a local notebook instance, use the utils method `role = get_execution_role()` to create an execution role." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", "    role = sagemaker.get_execution_role()\n", "except:\n", "    role = get_execution_role()\n", "\n", "print(\"Using IAM role arn: {}\".format(role))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Install docker for `local` mode\n", "\n", "In order to work in `local` mode, you need to have docker installed. When running from your local machine, please make sure that you have docker and docker-compose (for local CPU machines) or nvidia-docker (for local GPU machines) installed. Alternatively, when running from a SageMaker notebook instance, you can simply run the following script to install the dependencies.\n", "\n", "Note that you can run only a single local notebook at a time." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# only run from SageMaker notebook instance\n", "if local_mode:\n", "    !/bin/bash ./common/setup.sh" ] }
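, { "cell_type": "markdown", "metadata": {}, "source": [ "If you plan to use `local` mode and are unsure whether this machine already has a working docker setup, the next cell is a quick, purely illustrative check; it is safe to skip." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: verify that docker and docker-compose are available (illustrative check).\n", "# Only relevant when local_mode = True.\n", "if local_mode:\n", "    !docker --version\n", "    !docker-compose --version" ] }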
, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build docker container\n", "\n", "We must build a custom docker container with the Unity ML-Agents dependencies installed. This takes care of everything:\n", "\n", "1. Fetching the base container image\n", "2. Installing the Unity ML-Agents toolkit and its dependencies\n", "3. Uploading the new container image to ECR\n", "\n", "This step can take a long time if you are running on a machine with a slow internet connection. If your notebook instance is in SageMaker or EC2, it should take 3-10 minutes depending on the instance type.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "\n", "cpu_or_gpu = 'gpu' if instance_type.startswith('ml.p') else 'cpu'\n", "repository_short_name = \"sagemaker-unity-ray-%s\" % cpu_or_gpu\n", "docker_build_args = {\n", "    'CPU_OR_GPU': cpu_or_gpu,\n", "    'AWS_REGION': boto3.Session().region_name,\n", "}\n", "custom_image_name = build_and_push_docker_image(repository_short_name, build_args=docker_build_args)\n", "print(\"Using ECR image %s\" % custom_image_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Write the Training Code\n", "\n", "The training code is written in the file `train-unity.py`, which is uploaded to the `/src` directory. \n", "It first imports the environment and preset files, and then defines the `main()` function.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pygmentize src/train-unity.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train the RL model using the Python SDK Script mode with Unity executables\n", "\n", "If you are using local mode, the training will run on the notebook instance. When using SageMaker for training, you can select a GPU or CPU instance. The RLEstimator is used for training RL jobs. \n", "\n", "1. Specify the source directory where the environment, presets, and training code are uploaded.\n", "2. Specify the entry point as the training code. \n", "3. Specify the choice of RL toolkit and framework. This automatically resolves to the ECR path for the RL Container. \n", "4. Define the training parameters such as the instance count, base job name, and S3 path for output. \n", "5. Specify the hyperparameters for the RL agent algorithm. \n", "6. Define the metric definitions that you are interested in capturing in your logs. These can also be visualized in CloudWatch and SageMaker Notebooks. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "metric_definitions = RLEstimator.default_metric_definitions(RLToolkit.RAY)\n", "\n", "estimator = RLEstimator(entry_point=\"train-unity.py\",\n", "                        source_dir='src',\n", "                        dependencies=[\"common/sagemaker_rl\"],\n", "                        image_uri=custom_image_name,\n", "                        role=role,\n", "                        instance_type=instance_type,\n", "                        instance_count=1,\n", "                        output_path=s3_output_path,\n", "                        base_job_name=job_name_prefix,\n", "                        model_dir=s3_output_path + 'output',\n", "                        metric_definitions=metric_definitions,\n", "                        hyperparameters={\n", "                            # Attention scientists! You can override any Ray algorithm parameter here:\n", "                            \"rl.training.config.env_config.env_name\": \"RollerBall.x86_64\",\n", "                            \"rl.training.stop.timesteps_total\": 50000,\n", "                            # \"rl.training.config.train_batch_size\": 100,\n", "                            # \"rl.training.config.num_sgd_iter\": 10,\n", "                        })\n", "\n", "estimator.fit({'train': s3_binary_path}, wait=local_mode)\n", "job_name = estimator.latest_training_job.job_name\n", "print(\"Training job: %s\" % job_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualization\n", "### Plot metrics for training job\n", "We can see the reward metric of the training as it's running, using the algorithm metrics recorded in CloudWatch. We can plot this to see the performance of the model over time."
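, "\n", "\nCloudWatch metrics can take a few minutes to appear after a job starts. As a small illustrative check (the plotting cell below also handles this in its own polling loop), you can first query the job's current status:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: check the current status of the training job (illustrative sketch).\n", "# job_name was captured from the estimator in the training cell above.\n", "status = boto3.client('sagemaker').describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']\n", "print('Training job status:', status)"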
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "from sagemaker.analytics import TrainingJobAnalytics\n", "import time\n", "\n", "sm_client = boto3.client(service_name='sagemaker')\n", "state = sm_client.describe_training_job(\n", "    TrainingJobName=job_name).get('SecondaryStatus')\n", "\n", "if not local_mode:\n", "    while state == 'Starting':\n", "        state = sm_client.describe_training_job(\n", "            TrainingJobName=job_name).get('SecondaryStatus')\n", "        print(\"Training job starting...\")\n", "        time.sleep(30)\n", "    print(\"Training job started. Waiting for algorithm metric...\")\n", "    df = TrainingJobAnalytics(job_name, ['episode_reward_mean']).dataframe()\n", "    num_metrics = len(df)\n", "    if num_metrics == 0:\n", "        print(\"No algorithm metrics found in CloudWatch, please check later.\")\n", "    else:\n", "        # df.plot returns a matplotlib Axes object; name it ax to avoid shadowing pyplot's usual alias\n", "        ax = df.plot(x='timestamp', y='value', figsize=(12,5), legend=True, style='b-')\n", "        ax.set_ylabel('Mean reward per episode')\n", "        ax.set_xlabel('Training time (s)')\n", "else:\n", "    print(\"Can't plot metrics in local mode.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation of RL models\n", "\n", "We use the last checkpointed model to run evaluation of the RL agent. \n", "\n", "### Load checkpointed model for evaluation\n", "\n", "Checkpointed data from the previously trained models will be passed on for evaluation / inference in the checkpoint channel. In local mode, we can simply use the local directory, whereas in SageMaker mode, it needs to be moved to S3 first." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmp_dir = \"/tmp/{}\".format(job_name)\n", "os.system(\"mkdir -p {}\".format(tmp_dir))\n", "print(\"Created local folder {}\".format(tmp_dir))\n", "\n", "if local_mode:\n", "    model_tar_key = \"{}/model.tar.gz\".format(job_name)\n", "else:\n", "    model_tar_key = \"{}/output/model.tar.gz\".format(job_name)\n", "\n", "local_checkpoint_dir = \"{}/model\".format(tmp_dir)\n", "\n", "wait_for_s3_object(s3_bucket, model_tar_key, tmp_dir, training_job_name=job_name)\n", "\n", "if not os.path.isfile(\"{}/model.tar.gz\".format(tmp_dir)):\n", "    raise FileNotFoundError(\"File model.tar.gz not found\")\n", "\n", "os.system(\"mkdir -p {}\".format(local_checkpoint_dir))\n", "os.system(\"tar -xvzf {}/model.tar.gz -C {}\".format(tmp_dir, local_checkpoint_dir))\n", "\n", "print(\"Checkpoint directory {}\".format(local_checkpoint_dir))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if local_mode:\n", "    checkpoint_path = 'file://{}'.format(local_checkpoint_dir)\n", "    print(\"Local checkpoint file path: {}\".format(local_checkpoint_dir))\n", "else:\n", "    checkpoint_path = \"s3://{}/{}/checkpoint/\".format(s3_bucket, job_name)\n", "    if not os.listdir(local_checkpoint_dir):\n", "        raise FileNotFoundError(\"Checkpoint files not found under the path\")\n", "    os.system(\"aws s3 cp --recursive {} {}\".format(local_checkpoint_dir, checkpoint_path))\n", "    print(\"S3 checkpoint file path: {}\".format(checkpoint_path))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pygmentize src/evaluate-unity.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Run the evaluation step for a Unity environment using the trained RL model\n", "\n", "Use the checkpointed model to run the evaluation step."
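, "\n", "\nBefore launching the evaluation job, you can optionally peek at what the extracted checkpoint directory contains; the directory walk below is a minimal illustrative sketch." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: inspect the extracted checkpoint files before evaluation (illustrative).\n", "for root, _, files in os.walk(local_checkpoint_dir):\n", "    for name in files:\n", "        print(os.path.join(root, name))"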
] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%time\n", "\n", "estimator_eval = RLEstimator(entry_point=\"evaluate-unity.py\",\n", "                             source_dir='src',\n", "                             dependencies=[\"common/sagemaker_rl\"],\n", "                             image_uri=custom_image_name,\n", "                             role=role,\n", "                             instance_type=instance_type,\n", "                             instance_count=1,\n", "                             output_path=s3_output_path,\n", "                             base_job_name=job_name_prefix + \"-evaluation\",\n", "                             hyperparameters={\n", "                                 \"evaluate_episodes\": 2000,\n", "                                 \"algorithm\": \"PPO\",\n", "                                 \"env\": \"RollerBall.x86_64\"\n", "                             })\n", "\n", "# specify a train channel for the custom Unity executable option\n", "estimator_eval.fit({'model': checkpoint_path, 'train': s3_binary_path}, wait=local_mode)\n", "job_name = estimator_eval.latest_training_job.job_name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize the output of the evaluation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmp_dir = \"/tmp/{}\".format(job_name)\n", "summary_logs_dir = \"{}/Logs\".format(tmp_dir)\n", "os.system(\"mkdir -p {}\".format(summary_logs_dir))\n", "print(\"Created local directory for summary logs: {}\".format(summary_logs_dir))\n", "\n", "if local_mode:\n", "    summary_log_key = \"{}/intermediate/Logs\".format(job_name)\n", "else:\n", "    summary_log_key = \"{}/output/intermediate/Logs\".format(job_name)\n", "\n", "wait_for_s3_object(s3_bucket, summary_log_key, summary_logs_dir, training_job_name=job_name)\n", "\n", "if not os.path.isdir(summary_logs_dir):\n", "    raise FileNotFoundError(\"Directory Logs not found\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "from glob import glob\n", "import pandas as pd\n", "\n", "files = glob(\"{}/*.csv\".format(summary_logs_dir), recursive=True)\n", "for file in files:\n", "    print(file)\n", "    print(os.path.getsize(file))\n", "    result = pd.read_csv(file)\n", "    display(result.drop(['play_id', 'message', 'created_at'], axis=1))\n", "    number_of_obstacles = result['number_of_obstacles'].unique()\n", "    summary = []\n", "    for item in number_of_obstacles:\n", "        cleared_count = ((result['number_of_obstacles'] == item) & (result['cleared?'] == True)).sum()\n", "        total_count = (result['number_of_obstacles'] == item).sum()\n", "        success_rate = cleared_count / total_count\n", "        summary.append([item, success_rate, total_count])\n", "    if len(summary) != 0:\n", "        df = pd.DataFrame(summary, columns=['number_of_obstacles', 'success_rate', 'total_count']).sort_values('number_of_obstacles')\n", "        display(df)\n", "        df = df.astype(float)\n", "        df.plot(x='number_of_obstacles', y='success_rate')" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "conda_tensorflow_p36", "language": "python", "name": "conda_tensorflow_p36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.10" }, "notice": "Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. 
This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License." }, "nbformat": 4, "nbformat_minor": 4 }