{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Retail Demo Store - Personalization Workshop - Lab 4\n",
"\n",
"In this lab we are going to build on the [prior lab](./Lab-2-Prepare-Personalize-and-import-data.ipynb) by creating Amazon Personalize domain recommenders and custom solutions for additional use cases.\n",
"\n",
"## Lab 4 Objectives\n",
"\n",
"In this lab we will accomplish the following steps.\n",
"\n",
"- Evaluate the recommendations from the e-commerce recommenders created in the last lab.\n",
"- Evaluate the recommendations from the custom solutions and campaigns created in the last lab.\n",
"- Activate the recommenders and campaigns in the Retail Demo Store storefront by setting their ARNs in the AWS Systems Manager Parameter Store.\n",
"- Real-time events:\n",
"    - Create an Amazon Personalize Event Tracker that streams real-time events from the storefront to Personalize so that Personalize can learn from user behavior in real time.\n",
"    - Evaluate the effect of the event tracker on real-time recommendations.\n",
"    - Configure and deploy the Retail Demo Store web app to pick up the event tracker so it can start streaming events.\n",
"- Create filters and evaluate how they apply business rules to recommendations and promote a specific set of items while maintaining relevance.\n",
"\n",
"This lab should take 30-45 minutes to complete."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Just as in the previous labs, we have to prepare our environment by importing dependencies and creating clients.\n",
"\n",
"### Import dependencies\n",
"\n",
"The following libraries are needed for this lab."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"import json\n",
"import time\n",
"import requests\n",
"import random\n",
"import uuid\n",
"import pandas as pd\n",
"from IPython.display import Image, HTML\n",
"\n",
"from botocore.exceptions import ClientError"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create clients\n",
"\n",
"We will need the following AWS service clients in this lab. Notice that we are creating two new Personalize clients with the service names `personalize-runtime` and `personalize-events`. We'll use these clients to get recommendations from our recommenders and campaigns and to send events to Personalize."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"personalize = boto3.client('personalize')\n",
"personalize_runtime = boto3.client('personalize-runtime')\n",
"personalize_events = boto3.client('personalize-events')\n",
"servicediscovery = boto3.client('servicediscovery')\n",
"ssm = boto3.client('ssm')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load variables saved in prior labs\n",
"\n",
"At the end of Lab 1 we saved some variables that we'll need in this lab. The following cell will load those variables into this lab environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%store -r"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Look up IP addresses of Products and Users microservices\n",
"\n",
"In this lab we will need to look up details on recommended products and users. We'll do this by making RESTful API calls to the Products and Users microservices. In the cells below, we will look up the IP addresses of these microservices using [AWS Cloud Map](https://aws.amazon.com/cloud-map/)'s Service Discovery."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = servicediscovery.discover_instances(\n",
"    NamespaceName='retaildemostore.local',\n",
"    ServiceName='products',\n",
"    MaxResults=1,\n",
"    HealthStatus='HEALTHY'\n",
")\n",
"\n",
"assert len(response['Instances']) > 0, 'Products service instance not found; check ECS to ensure it launched cleanly'\n",
"\n",
"products_service_instance = response['Instances'][0]['Attributes']['AWS_INSTANCE_IPV4']\n",
"print('Products Service Instance IP: {}'.format(products_service_instance))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = requests.get('http://{}/products/all'.format(products_service_instance))\n",
"products = response.json()\n",
"products_df = pd.DataFrame(products)\n",
"products_df.head(5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = servicediscovery.discover_instances(\n",
"    NamespaceName='retaildemostore.local',\n",
"    ServiceName='users',\n",
"    MaxResults=1,\n",
"    HealthStatus='HEALTHY'\n",
")\n",
"\n",
"assert len(response['Instances']) > 0, 'Users service instance not found; check ECS to ensure it launched cleanly'\n",
"\n",
"users_service_instance = response['Instances'][0]['Attributes']['AWS_INSTANCE_IPV4']\n",
"print('Users Service Instance IP: {}'.format(users_service_instance))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = requests.get('http://{}/users/all?count=10000'.format(users_service_instance))\n",
"users = response.json()\n",
"users_df = pd.DataFrame(users)\n",
"users_df.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load interactions dataset\n",
"\n",
"Next let's load the interaction dataset (the CSV created in Lab 1) so we can query it to see what historical interactions were used to train the model for each user. This will help us better understand why certain products are being recommended."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"interactions_df = pd.read_csv(interactions_filename)\n",
"interactions_df['USER_ID'] = interactions_df.USER_ID.astype(str)\n",
"interactions_df['TIMESTAMP'] = pd.to_datetime(interactions_df['TIMESTAMP'],unit='s')\n",
"interactions_df.head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next let's create a couple of utility functions that we can use later in the notebook to look up recent interactions and product details for past interactions.\n",
"\n",
"The first function will look up the most recent interactions for a user and return them in a dataframe."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Update DF rendering\n",
"pd.set_option('display.max_rows', 30)\n",
"pd.set_option('display.max_colwidth', None)\n",
"\n",
"def lookup_historical_interactions(user_id, max_count = 10):\n",
"    recent_df = interactions_df.loc[interactions_df['USER_ID'] == str(user_id)]\n",
"    recent_df = recent_df.sort_values(by = 'TIMESTAMP', ascending = False)\n",
"    recent_df = recent_df[:max_count]\n",
"\n",
"    rows = []\n",
"    columns_to_keep = ['id', 'name', 'category', 'style', 'price', 'image']\n",
"    for index, interaction in recent_df.iterrows():\n",
"        product = products_df.loc[products_df['id'] == interaction['ITEM_ID']]\n",
"        if product.empty:\n",
"            continue\n",
"        product = product.iloc[0]\n",
"        row = {}\n",
"        row['TIMESTAMP'] = interaction['TIMESTAMP']\n",
"        row['EVENT_TYPE'] = interaction['EVENT_TYPE']\n",
"        for col in columns_to_keep:\n",
"            if col == 'image':\n",
"                row[col] = ''\n",
"            elif col == 'name':\n",
"                row[col] = '' + product[col] + ''\n",
"            else:\n",
"                row[col] = product[col]\n",
"        rows.append(row)\n",
"\n",
"    return pd.DataFrame(rows)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, let's test the interaction history lookup function for a random user."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"# Randomly select a user.\n",
"user = users_df.sample(1).iloc[0]\n",
"user_id = user['id']\n",
"# Look up recent interactions and product details for the user.\n",
"df = lookup_historical_interactions(user_id, 20)\n",
"# Display info on user and recent interactions\n",
"header = f'