{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b4c5d7a5",
   "metadata": {},
   "source": [
    "# Building Your First E-Commerce Recommender\n",
    "\n",
    "This notebook will walk you through the steps to build a Domain dataset group and a recommender that returns product recommendations based on data generated for our fictitious retail store data set. The goal is to recommend products that are relevant based on a particular user.\n",
    "\n",
    "This synthetic data comes from the [Retail Demo Store project](https://github.com/aws-samples/retail-demo-store). Follow the link to learn more about the data and potential uses.\n",
    "\n",
    "# How to Use the Notebook\n",
    "\n",
    "The code is broken up into cells like the one below. There's a triangular Run button at the top of this page that you can click to execute each cell and move onto the next, or you can press `Shift` + `Enter` while in the cell to execute it and move onto the next one.\n",
    "\n",
    "As a cell is executing you'll notice a line to the side showcase an `*` while the cell is running or it will update to a number to indicate the last cell that completed executing after it has finished exectuting all the code within a cell.\n",
    "\n",
    "Simply follow the instructions below and execute the cells to get started with Amazon Personalize using case optimized recommenders."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38adcd1c",
   "metadata": {},
   "source": [
    "## Imports\n",
    "Python ships with a broad collection of libraries and we need to import those as well as the ones installed to help us like [boto3](https://aws.amazon.com/sdk-for-python/) (AWS SDK for python) and [Pandas](https://pandas.pydata.org/)/[Numpy](https://numpy.org/) which are core data science tools."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ec417d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Imports\n",
    "import boto3\n",
    "import json\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import time\n",
    "import datetime"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e687bca",
   "metadata": {},
   "source": [
    "Next you will want to validate that your environment can communicate successfully with Amazon Personalize, the lines below do just that."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fb363f04",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configure the SDK to Personalize:\n",
    "personalize = boto3.client('personalize')\n",
    "personalize_runtime = boto3.client('personalize-runtime')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11143552",
   "metadata": {},
   "source": [
    "## Specify an S3 Bucket and Data Output Location\n",
    "\n",
    "Amazon Personalize will need an S3 bucket to act as the source of your data. The code bellow will create a bucket with a unique `bucket_name`.\n",
    "\n",
    "The Amazon S3 bucket needs to be in the same region as the Amazon Personalize resources. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eca438fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sets the same region as current Amazon SageMaker Notebook\n",
    "with open('/opt/ml/metadata/resource-metadata.json') as notebook_info:\n",
    "    data = json.load(notebook_info)\n",
    "    resource_arn = data['ResourceArn']\n",
    "    region = resource_arn.split(':')[3]\n",
    "print('region:', region)\n",
    "\n",
    "# Or you can specify the region where your bucket and model will be domiciled\n",
    "# region = \"us-east-1\" \n",
    "\n",
    "s3 = boto3.client('s3')\n",
    "account_id = boto3.client('sts').get_caller_identity().get('Account')\n",
    "bucket_name = account_id + \"-\" + region + \"-\" + \"personalizemanagedretailers\"\n",
    "print('bucket_name:', bucket_name)\n",
    "\n",
    "try: \n",
    "    if region == \"us-east-1\":\n",
    "        s3.create_bucket(Bucket=bucket_name)\n",
    "    else:\n",
    "        s3.create_bucket(\n",
    "            Bucket = bucket_name,\n",
    "            CreateBucketConfiguration={'LocationConstraint': region}\n",
    "            )\n",
    "except s3.exceptions.BucketAlreadyOwnedByYou:\n",
    "    print(\"Bucket already exists. Using bucket\", bucket_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4a4e634",
   "metadata": {},
   "source": [
    "## Download, Prepare, and Upload Training Data\n",
    "\n",
    "First we need to download the data (training data). In this tutorial we'll use the Purchase history from a retail store  dataset. The dataset contains the user id,items id,the interaction between customers and items and the time this interaction took place(Timestamp) \n",
    "\n",
    "### Download and Explore the Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3f47c955",
   "metadata": {},
   "outputs": [],
   "source": [
    "!aws s3 cp s3://retail-demo-store-us-east-1/csvs/items.csv .\n",
    "!aws s3 cp s3://retail-demo-store-us-east-1/csvs/interactions.csv ."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5e7e372",
   "metadata": {},
   "source": [
    "The dataset has been successfully downloaded as Electronics_Store_purchase_history.csv\n",
    "\n",
    "Lets learn more about the dataset by viewing its charateristics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e2be2e5",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('./interactions.csv')\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0a6eaf51",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.EVENT_TYPE.value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5490ee16",
   "metadata": {},
   "outputs": [],
   "source": [
    "def convert_event_type(event_type_in_some_format):\n",
    "    if(event_type_in_some_format == \"ProductViewed\"):\n",
    "        return \"View\"\n",
    "    if(event_type_in_some_format == \"OrderCompleted\"):\n",
    "        return \"Purchase\"\n",
    "    else:\n",
    "        return event_type_in_some_format\n",
    "\n",
    "df['EVENT_TYPE'] = df['EVENT_TYPE'].apply(convert_event_type)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3579b30",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.EVENT_TYPE.value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ce52eb1d",
   "metadata": {},
   "source": [
    "The ECOMMERCE recommenders require you to provide specific EVENT_TYPE values in order to understand the context of an interaction, therefore we are going to modify our interactions EVENTYPE column"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc08e9c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8b65a5a2",
   "metadata": {},
   "source": [
    "From the 2 cells above, we've learned that our data has has 9 columns, 675004 rows and the headers are: ITEM_ID, USER_ID, EVENT_TYPE, TIMESTAMP and DISCOUNT.\n",
    "\n",
    "To be compatible with an Amazon Personalize interactions schema, this dataset requires column headings compatible with Amazon Personalize default column names (read about column names [here](https://docs.aws.amazon.com/personalize/latest/dg/how-it-works-dataset-schema.html) )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da205516",
   "metadata": {},
   "source": [
    "## Prepare the Interactions Data\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "903560a2",
   "metadata": {},
   "source": [
    "### Drop Columns\n",
    "\n",
    "Some columns in this dataset would not add value to our model and as such need to be dropped from this dataset. Columns such as *discount*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba254bf7",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "test=df.drop(columns=['DISCOUNT'])\n",
    "df=test\n",
    "df.sample(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2eb1daf6",
   "metadata": {},
   "source": [
    "In the cells below, we will write our cleaned data to a file named \"final_training_data.csv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "86926551",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_csv(\"cleaned_training_data.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f536ce21",
   "metadata": {},
   "source": [
    "### Upload to S3\n",
    "Now that our training data is ready for Amazon Personalize,the next step is to upload it to the s3 bucket created earlier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "949fa812",
   "metadata": {},
   "outputs": [],
   "source": [
    "interactions_file_path = 'cleaned_training_data.csv'\n",
    "boto3.Session().resource('s3').Bucket(bucket_name).Object(interactions_file_path).upload_file(interactions_file_path)\n",
    "interactions_s3DataPath = \"s3://\"+bucket_name+\"/\"+interactions_file_path\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e946aa93",
   "metadata": {},
   "source": [
    "## Configure an S3 bucket and an IAM role\n",
    "\n",
    "So far, we have downloaded, manipulated, and saved the data onto the Amazon EBS instance attached to instance running this Jupyter notebook. However, Amazon Personalize will need an S3 bucket to act as the source of your data, as well as IAM roles for accessing that bucket. Let's set all of that up.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1761a177",
   "metadata": {},
   "source": [
    "## Set the S3 bucket policy\n",
    "Amazon Personalize needs to be able to read the contents of your S3 bucket. So add a bucket policy which allows that.\n",
    "\n",
    "Note: Make sure the role you are using to run the code in this notebook has the necessary permissions to modify the S3 bucket policy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c21fbd03",
   "metadata": {},
   "outputs": [],
   "source": [
    "s3 = boto3.client(\"s3\")\n",
    "policy = {\n",
    "    \"Version\": \"2012-10-17\",\n",
    "    \"Id\": \"PersonalizeS3BucketAccessPolicy\",\n",
    "    \"Statement\": [\n",
    "        {\n",
    "            \"Sid\": \"PersonalizeS3BucketAccessPolicy\",\n",
    "            \"Effect\": \"Allow\",\n",
    "            \"Principal\": {\n",
    "                \"Service\": \"personalize.amazonaws.com\"\n",
    "            },\n",
    "            \"Action\": [\n",
    "                \"s3:GetObject\",\n",
    "                \"s3:ListBucket\"\n",
    "            ],\n",
    "            \"Resource\": [\n",
    "                \"arn:aws:s3:::{}\".format(bucket_name),\n",
    "                \"arn:aws:s3:::{}/*\".format(bucket_name)\n",
    "            ]\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "s3.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4fb89bad",
   "metadata": {},
   "source": [
    "## Create and Wait for Dataset Group\n",
    "The largest grouping in Personalize is a Dataset Group, this will isolate your data, event trackers, solutions, Recommenders, and campaigns. Grouping things together that share a common collection of data. Feel free to alter the name below if you'd like.\n",
    "\n",
    "### Create Dataset Group"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c482c4ab",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "response = personalize.create_dataset_group(\n",
    "    name='personalize_ecomemerce_ds_group',\n",
    "    domain='ECOMMERCE'\n",
    ")\n",
    "\n",
    "dataset_group_arn = response['datasetGroupArn']\n",
    "print(json.dumps(response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ed79a29",
   "metadata": {},
   "source": [
    "Wait for Dataset Group to Have ACTIVE Status\n",
    "Before we can use the Dataset Group in any items below it must be active, execute the cell below and wait for it to show active."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78eb7af7",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "max_time = time.time() + 3*60*60 # 3 hours\n",
    "while time.time() < max_time:\n",
    "    describe_dataset_group_response = personalize.describe_dataset_group(\n",
    "        datasetGroupArn = dataset_group_arn\n",
    "    )\n",
    "    status = describe_dataset_group_response[\"datasetGroup\"][\"status\"]\n",
    "    print(\"DatasetGroup: {}\".format(status))\n",
    "    \n",
    "    if status == \"ACTIVE\" or status == \"CREATE FAILED\":\n",
    "        break\n",
    "        \n",
    "    time.sleep(60)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f233906",
   "metadata": {},
   "source": [
    "## Create Interactions Schema\n",
    "A core component of how Personalize understands your data comes from the Schema that is defined below. This configuration tells the service how to digest the data provided via your CSV file. Note the columns and types align to what was in the file you created above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "982ee9ce",
   "metadata": {},
   "outputs": [],
   "source": [
    "interactions_schema = schema = {\n",
    "    \"type\": \"record\",\n",
    "    \"name\": \"Interactions\",\n",
    "    \"namespace\": \"com.amazonaws.personalize.schema\",\n",
    "    \"fields\": [\n",
    "        {\n",
    "            \"name\": \"USER_ID\",\n",
    "            \"type\": \"string\"\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"ITEM_ID\",\n",
    "            \"type\": \"string\"\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"TIMESTAMP\",\n",
    "            \"type\": \"long\"\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"EVENT_TYPE\",\n",
    "            \"type\": \"string\"\n",
    "            \n",
    "        }\n",
    "    ],\n",
    "    \"version\": \"1.0\"\n",
    "}\n",
    "\n",
    "create_schema_response = personalize.create_schema(\n",
    "    name = \"personalize-ecommerce-interatn_group\",\n",
    "    domain = \"ECOMMERCE\",\n",
    "    schema = json.dumps(interactions_schema)\n",
    ")\n",
    "\n",
    "interaction_schema_arn = create_schema_response['schemaArn']\n",
    "print(json.dumps(create_schema_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "162e1b61",
   "metadata": {},
   "source": [
    "## Create Datasets\n",
    "After the group, the next thing to create is the actual datasets."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ae8f503",
   "metadata": {},
   "source": [
    "### Create Interactions Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8386293d",
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset_type = \"INTERACTIONS\"\n",
    "\n",
    "create_dataset_response = personalize.create_dataset(\n",
    "    name = \"personalize_ecommerce_demo_interactions\",\n",
    "    datasetType = dataset_type,\n",
    "    datasetGroupArn = dataset_group_arn,\n",
    "    schemaArn = interaction_schema_arn\n",
    ")\n",
    "\n",
    "interactions_dataset_arn = create_dataset_response['datasetArn']\n",
    "print(json.dumps(create_dataset_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "451b362d",
   "metadata": {},
   "source": [
    "## Create Personalize Role\n",
    "Also Amazon Personalize needs the ability to assume Roles in AWS in order to have the permissions to execute certain tasks, the lines below grant that.\n",
    "\n",
    "Note: Make sure the role you are using to run the code in this notebook has the necessary permissions to create a role."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b972b39",
   "metadata": {},
   "outputs": [],
   "source": [
    "iam = boto3.client(\"iam\")\n",
    "\n",
    "role_name = \"PersonalizeRoleEcommerceDemoRecommender\"\n",
    "assume_role_policy_document = {\n",
    "    \"Version\": \"2012-10-17\",\n",
    "    \"Statement\": [\n",
    "        {\n",
    "          \"Effect\": \"Allow\",\n",
    "          \"Principal\": {\n",
    "            \"Service\": \"personalize.amazonaws.com\"\n",
    "          },\n",
    "          \"Action\": \"sts:AssumeRole\"\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "create_role_response = iam.create_role(\n",
    "    RoleName = role_name,\n",
    "    AssumeRolePolicyDocument = json.dumps(assume_role_policy_document)\n",
    ")\n",
    "\n",
    "# AmazonPersonalizeFullAccess provides access to any S3 bucket with a name that includes \"personalize\" or \"Personalize\" \n",
    "# if you would like to use a bucket with a different name, please consider creating and attaching a new policy\n",
    "# that provides read access to your bucket or attaching the AmazonS3ReadOnlyAccess policy to the role\n",
    "policy_arn = \"arn:aws:iam::aws:policy/service-role/AmazonPersonalizeFullAccess\"\n",
    "iam.attach_role_policy(\n",
    "    RoleName = role_name,\n",
    "    PolicyArn = policy_arn\n",
    ")\n",
    "\n",
    "# Now add S3 support\n",
    "iam.attach_role_policy(\n",
    "    PolicyArn='arn:aws:iam::aws:policy/AmazonS3FullAccess',\n",
    "    RoleName=role_name\n",
    ")\n",
    "time.sleep(60) # wait for a minute to allow IAM role policy attachment to propagate\n",
    "\n",
    "role_arn = create_role_response[\"Role\"][\"Arn\"]\n",
    "print(role_arn)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "77d738fa",
   "metadata": {},
   "source": [
    "## Import the data\n",
    "Earlier you created the DatasetGroup and Dataset to house your information, now you will execute an import job that will load the data from S3 into Amazon Personalize for usage building your model.\n",
    "### Create Interactions Dataset Import Job"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f03f8fb6",
   "metadata": {},
   "outputs": [],
   "source": [
    "create_interactions_dataset_import_job_response = personalize.create_dataset_import_job(\n",
    "    jobName = \"personalize_ecommerce_demo_interactions_import\",\n",
    "    datasetArn = interactions_dataset_arn,\n",
    "    dataSource = {\n",
    "        \"dataLocation\": \"s3://{}/{}\".format(bucket_name, interactions_file_path)\n",
    "    },\n",
    "    roleArn = role_arn\n",
    ")\n",
    "\n",
    "dataset_interactions_import_job_arn = create_interactions_dataset_import_job_response['datasetImportJobArn']\n",
    "print(json.dumps(create_interactions_dataset_import_job_response, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b2d88bd",
   "metadata": {},
   "source": [
    "Wait for Dataset Import Job to Have ACTIVE Status\n",
    "It can take a while before the import job completes, please wait until you see that it is active below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e81c95d9",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "max_time = time.time() + 3*60*60 # 3 hours\n",
    "while time.time() < max_time:\n",
    "    describe_dataset_import_job_response = personalize.describe_dataset_import_job(\n",
    "        datasetImportJobArn = dataset_interactions_import_job_arn\n",
    "    )\n",
    "    status = describe_dataset_import_job_response[\"datasetImportJob\"]['status']\n",
    "    print(\"DatasetImportJob: {}\".format(status))\n",
    "    \n",
    "    if status == \"ACTIVE\" or status == \"CREATE FAILED\":\n",
    "        break\n",
    "        \n",
    "    time.sleep(60)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa7d323b",
   "metadata": {},
   "source": [
    "## Choose a recommender use cases\n",
    "\n",
    "Each domain has different use cases. When you create a recommender you create it for a specific use case, and each use case has different requirements for getting recommendations.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cf3ec520",
   "metadata": {},
   "outputs": [],
   "source": [
    "available_recipes = personalize.list_recipes(domain='ECOMMERCE') # See a list of recommenders for the domain. \n",
    "display (available_recipes['recipes'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61617777",
   "metadata": {},
   "source": [
    "We are going to create a recommender of the type \"Customers who viewed X also viewed\". This recommender gives recommendations for items that customers also viewed based on an item that you specify. With this use case, Amazon Personalize automatically filters items the user purchased based on the userId that you specify and `Purchase` events."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3ffab210",
   "metadata": {},
   "outputs": [],
   "source": [
    "create_recommender_response = personalize.create_recommender(\n",
    "  name = 'viewed_x_also_viewed_demo',\n",
    "  recipeArn = 'arn:aws:personalize:::recipe/aws-ecomm-customers-who-viewed-x-also-viewed',\n",
    "  datasetGroupArn = dataset_group_arn\n",
    ")\n",
    "viewed_x_also_viewed_arn = create_recommender_response[\"recommenderArn\"]\n",
    "print (json.dumps(create_recommender_response))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b3b66c38",
   "metadata": {},
   "source": [
    "We are going to create a second recommender of the type \"Recommended For You\". This type of recommender offers personalized recommendations for items based on a user that you specify. With this use case, Amazon Personalize automatically filters items the user purchased based on the userId that you specify and `Purchase` events.\n",
    "\n",
    "[More use cases per domain](https://docs.aws.amazon.com/personalize/latest/dg/domain-use-cases.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "54a2096a",
   "metadata": {},
   "outputs": [],
   "source": [
    "create_recommender_response = personalize.create_recommender(\n",
    "  name = 'recommended_for_you_demo',\n",
    "  recipeArn = 'arn:aws:personalize:::recipe/aws-ecomm-recommended-for-you',\n",
    "  datasetGroupArn = dataset_group_arn\n",
    ")\n",
    "recommended_for_you_arn = create_recommender_response[\"recommenderArn\"]\n",
    "print (json.dumps(create_recommender_response))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a868817",
   "metadata": {},
   "source": [
    "We wait until the recomenders have finished creating and have status `ACTIVE`. We check periodically on the status of the recommender"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "285fc2f4",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "max_time = time.time() + 10*60*60 # 10 hours\n",
    "\n",
    "while time.time() < max_time:\n",
    "\n",
    "    version_response = personalize.describe_recommender(\n",
    "        recommenderArn = viewed_x_also_viewed_arn\n",
    "    )\n",
    "    status = version_response[\"recommender\"][\"status\"]\n",
    "\n",
    "    if status == \"ACTIVE\":\n",
    "        print(\"Build succeeded for {}\".format(viewed_x_also_viewed_arn))\n",
    "        \n",
    "    elif status == \"CREATE FAILED\":\n",
    "        print(\"Build failed for {}\".format(viewed_x_also_viewed_arn))\n",
    "        \n",
    "\n",
    "    if status == \"ACTIVE\" or status == \"CREATE FAILED\":\n",
    "        break\n",
    "    else:\n",
    "        print('The \"Customers who viewed X also viewed\" Recommender build is still in progress')\n",
    "        \n",
    "    time.sleep(60)\n",
    "    \n",
    "while time.time() < max_time:\n",
    "\n",
    "    version_response = personalize.describe_recommender(\n",
    "        recommenderArn = recommended_for_you_arn\n",
    "    )\n",
    "    status = version_response[\"recommender\"][\"status\"]\n",
    "\n",
    "    if status == \"ACTIVE\":\n",
    "        print(\"Build succeeded for {}\".format(recommended_for_you_arn))\n",
    "        \n",
    "    elif status == \"CREATE FAILED\":\n",
    "        print(\"Build failed for {}\".format(recommended_for_you_arn))\n",
    "        break\n",
    "\n",
    "    if status == \"ACTIVE\" or status == \"CREATE FAILED\":\n",
    "        break\n",
    "    else:\n",
    "        print('The \"Recommended for you\" Recommender build is still in progress')\n",
    "        \n",
    "    time.sleep(60)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "487a8f5b",
   "metadata": {},
   "source": [
    "## Getting recommendations with a recommender\n",
    "Now that the recommenders have been trained, lets have a look at the recommendations we can get for our users!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fdfa3488",
   "metadata": {},
   "outputs": [],
   "source": [
    "# reading the original data in order to have a dataframe that has both item_ids \n",
    "# and the corresponding titles to make out recommendations easier to read.\n",
    "items_df = pd.read_csv('./items.csv')\n",
    "items_df.sample(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dda9fce8",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_item_by_id(item_id, item_df):\n",
    "    \"\"\"\n",
    "    This takes in an item_id from a recommendation in string format,\n",
    "    converts it to an int, and then does a lookup in a default or specified\n",
    "    dataframe and returns the item description.\n",
    "    \n",
    "    A really broad try/except clause was added in case anything goes wrong.\n",
    "    \n",
    "    Feel free to add more debugging or filtering here to improve results if\n",
    "    you hit an error.\n",
    "    \"\"\"\n",
    "    try:\n",
    "        return items_df.loc[items_df[\"ITEM_ID\"]==str(item_id)]['PRODUCT_DESCRIPTION'].values[0]\n",
    "    except:\n",
    "        print (item_id)\n",
    "        return \"Error obtaining item description\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d38943e1",
   "metadata": {},
   "source": [
    "Let us get some recommendations using the \"Customers who viewed X also viewed\" Recommender:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3838f305",
   "metadata": {},
   "outputs": [],
   "source": [
    "# use a random valid id for a quick sanity check, modify the line of code bellow to a valid id in your dataset\n",
    "get_item_by_id(\"c72257d4-430b-4eb7-9de3-28396e593381\", items_df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d49f2cd2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# First pick a user\n",
    "test_user_id = \"777\"\n",
    "\n",
    "# Select a random item\n",
    "test_item_id = \"8fbe091c-f73c-4727-8fe7-d27eabd17bea\" # a random item: 8fbe091c-f73c-4727-8fe7-d27eabd17bea\n",
    "\n",
    "# Get recommendations for the user for this item\n",
    "get_recommendations_response = personalize_runtime.get_recommendations(\n",
    "    recommenderArn = viewed_x_also_viewed_arn,\n",
    "    itemId = test_item_id,\n",
    "    userId = test_user_id,\n",
    "    numResults = 10\n",
    ")\n",
    "\n",
    "# Build a new dataframe for the recommendations\n",
    "item_list = get_recommendations_response['itemList']\n",
    "recommendation_list = []\n",
    "\n",
    "for item in item_list:\n",
    "    item = get_item_by_id(item['itemId'], items_df)\n",
    "    recommendation_list.append(item)\n",
    "\n",
    "user_recommendations_df = pd.DataFrame(recommendation_list, columns = [get_item_by_id(test_item_id, items_df)])\n",
    "\n",
    "pd.options.display.max_rows =10\n",
    "display(user_recommendations_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7fbe0d3e",
   "metadata": {},
   "source": [
    "Get recommendations from the recommender returning \"Recommended for you\":"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53476ab1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# First pick a user\n",
    "test_user_id = \"777\" \n",
    "\n",
    "# Get recommendations for the user\n",
    "get_recommendations_response = personalize_runtime.get_recommendations(\n",
    "    recommenderArn = recommended_for_you_arn,\n",
    "    userId = test_user_id,\n",
    "    numResults = 20\n",
    ")\n",
    "\n",
    "# Build a new dataframe for the recommendations\n",
    "item_list = get_recommendations_response['itemList']\n",
    "recommendation_list = []\n",
    "for item in item_list:\n",
    "    item = get_item_by_id(item['itemId'], items_df)\n",
    "    recommendation_list.append(item)\n",
    "\n",
    "\n",
    "user_recommendations_df = pd.DataFrame(recommendation_list, columns = [test_user_id])\n",
    "\n",
    "pd.options.display.max_rows =20\n",
    "display(user_recommendations_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1a3a3e2",
   "metadata": {},
   "source": [
    "## Review\n",
    "Using the codes above you have successfully trained a deep learning model to generate item recommendations based on prior user behavior. You have created two recommenders for two foundational use cases. \n",
    "Going forward, you can adapt this code to create other recommenders."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8a27602",
   "metadata": {},
   "source": [
    "## Notes for the Next Notebook:\n",
    "There are a few values you will need for the next notebook, execute the cell below to store them so they can be used in the `Clean_Up_Resources.ipynb` notebook.\n",
    "\n",
    "This will overwite any data stored for those variables and set them to the values specified in this notebook. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fba02ff2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# store for cleanup\n",
    "%store dataset_group_arn\n",
    "%store role_name\n",
    "%store region"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95a14669",
   "metadata": {},
   "source": [
    "If you have run the `Building_Your_First_Recommender_Video_On_Demand.ipynb` notebook, please make sure you re-run the previous step in the `Building_Your_First_Recommender_Video_On_Demand.ipynb` notebook and re-run the `Clean_Up_Resources.ipynb` to remove the resources created in that notebook after you run the `Clean_Up_Resources.ipynb` with the resources created here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "669af678",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}