{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Sending Events to S3\n", "\n", "This notebook is pretty simple, just enough code to send events as discrete JSON files to S3 based on user interactions for Amazon Personalize." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Imports\n", "import boto3\n", "import json\n", "import numpy as np\n", "import pandas as pd\n", "import time\n", "import uuid" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Be sure to update the Campaign and Dataset ARNs below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Setup and Config\n", "# Recommendations from Event data\n", "personalize = boto3.client('personalize')\n", "personalize_runtime = boto3.client('personalize-runtime')\n", "\n", "# Campaign ARN\n", "Campaign_ARN = \"arn:aws:personalize:us-east-1:059124553121:campaign/personalize-demo-camp\"\n", "\n", "# Dataset Group Arn:\n", "datasetGroupArn = \"arn:aws:personalize:us-east-1:059124553121:dataset-group/personalize-lambda-demo\"\n", "\n", "# Establish a connection to Personalize's Event Streaming\n", "personalize_events = boto3.client(service_name='personalize-events')\n", "\n", "\n", "# S3 Configuration\n", "bucket = \"personalizedemo9000chris\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Faking Events\n", "\n", "This code will rely on 3 users that we happen to know exist in the MovieLens dataset:\n", "\n", "1. 196\n", "2. 13\n", "3. 198\n", "\n", "They were chosen at random, also we are going to now select 3 movies at random for the user to interact with:\n", "\n", "1. 225\n", "2. 203\n", "3. 242\n", "\n", "Each of these users will interact with each movie, and each interaction will be written to S3 as a distinct JSON file." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "users = [\"196\", \"13\", \"198\"]\n", "items = [\"225\", \"203\", \"242\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def write_interaction_to_JSON(user, item):\n", " # An event needs a user, session, time, type, and a properties dictionary.\n", " time_of_interaction = int(time.time())\n", " # This is set for simplification\n", " session_id = user\n", " user_id = user\n", " event_type = \"watched\"\n", " properties = {\n", " 'itemId': str(item)\n", " }\n", " properties = json.dumps(properties)\n", " \n", " event_headers = {\n", " 'userId': user,\n", " 'sessionId': session_id\n", " }\n", " \n", " event_content = {\n", " 'sentAt': time_of_interaction,\n", " 'eventType': event_type,\n", " 'properties': properties\n", " }\n", " event_list = [event_content]\n", " json_struct = [event_headers, event_content]\n", " json_filename = str(time_of_interaction) + \".json\"\n", " \n", " with open(json_filename, 'w') as outfile:\n", " json.dump(json_struct, outfile)\n", " \n", " return json_filename\n", "\n", "def send_interactions_to_s3():\n", " for user in users:\n", " for item in items:\n", " filename = write_interaction_to_JSON(user, item)\n", " boto3.Session().resource('s3').Bucket(bucket).Object(filename).upload_file(filename)\n", " print(filename)\n", " # Sleep needed to make sure everyone gets a nice new timestamp\n", " time.sleep(2)\n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "send_interactions_to_s3()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The code below is just to explore what to put in our Lambda:\n", "\n", "with open(\"1568076414.json\", 'r') as input_file:\n", " content = json.loads(input_file.read())\n", " # assume tracking ID was provided via an environment variable and replace\n", " trackingId = \"a68baf70-f0f2-4c66-af2d-e11047a0372f\"\n", " print(content[0]['userId'])\n", " \n", " personalize_events.put_events(\n", " trackingId = trackingId,\n", " userId = content[0]['userId'],\n", " sessionId = content[0]['sessionId'],\n", " eventList = [content[1]]\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 4 }