# Retail Demo Store - Search Workshop

Welcome to the Retail Demo Store Search Workshop. In this module we'll be configuring the Retail Demo Store Search service to allow searching for product data via [Amazon OpenSearch Service](https://aws.amazon.com/opensearch-service/) (formerly Amazon Elasticsearch Service). An Amazon OpenSearch domain should already be provisioned for you in your AWS environment as part of the Retail Demo Store deployment.

Recommended Time: 20 Minutes

## Setup

To get started, we need to perform a bit of setup. Walk through each of the following steps to configure your environment to interact with Amazon OpenSearch Service.

### Import dependencies and set up Boto3 Python clients

Throughout this workshop we will need access to some common libraries and clients for connecting to AWS services.

In [None]:
# Import Dependencies

import sys
import boto3
import botocore
import json
import pandas as pd
import requests

from packaging import version
from random import randint


Next, we will create the clients for the AWS services needed in this workshop.

In [None]:
# Setup Clients

servicediscovery = boto3.client('servicediscovery')
ssm = boto3.client('ssm')
opensearch_service = boto3.client('opensearch')

## Create index and bulk index product data

### Get Products Service instance

We will create a new OpenSearch index and index our product data so that our users can search for products. To do this, we first pull our product data from the [Products Service](https://github.com/aws-samples/retail-demo-store/tree/master/src/products) that is deployed as part of the Retail Demo Store. To connect to the Products Service, we use Service Discovery to find a healthy instance of the service and then connect directly to that instance to access our data.

In [None]:
response = servicediscovery.discover_instances(
    NamespaceName='retaildemostore.local',
    ServiceName='products',
    MaxResults=1,
    HealthStatus='HEALTHY'
)

assert len(response['Instances']) > 0, 'Products service instance not found; check ECS to ensure it launched cleanly'

products_service_instance = response['Instances'][0]['Attributes']['AWS_INSTANCE_IPV4']
print('Service Instance IP: {}'.format(products_service_instance))
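
The same lookup is repeated later in this notebook for the Search service, so it can be convenient to wrap it in a small helper. The following is a minimal sketch, not part of the Retail Demo Store code: `discover_service_ip` is a hypothetical name, and it assumes the client passed in behaves like the `servicediscovery` client created above.

```python
def discover_service_ip(sd_client, service_name, namespace='retaildemostore.local'):
    """Return the IPv4 address of one healthy instance of a service.

    Hypothetical helper: `sd_client` is assumed to be a boto3
    'servicediscovery' client (or any object with the same method).
    """
    response = sd_client.discover_instances(
        NamespaceName=namespace,
        ServiceName=service_name,
        MaxResults=1,
        HealthStatus='HEALTHY',
    )
    instances = response.get('Instances', [])
    if not instances:
        # Mirror the assertion message used in the cell above
        raise RuntimeError(
            '{} service instance not found; check ECS to ensure it launched cleanly'.format(service_name)
        )
    return instances[0]['Attributes']['AWS_INSTANCE_IPV4']
```

With this in place, the cell above could be reduced to `products_service_instance = discover_service_ip(servicediscovery, 'products')`.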

#### Download and explore the Products dataset

Now that we have the IP address of one of our Products Service instances, we can connect to it and fetch our product catalog. To explore the data more easily, we will convert the JSON response from the Products Service into a Pandas dataframe and print it as a table.

In [None]:
response = requests.get('http://{}/products/all'.format(products_service_instance))
products = response.json()
products_df = pd.DataFrame(products)
pd.set_option('display.max_rows', 5)

products_df

### Install OpenSearch python library

We will use the Python OpenSearch library to connect to our Amazon OpenSearch cluster, create a new index, and then bulk index our product data. First, we need to install the OpenSearch library into the local notebook environment. We'll ensure pip is updated as well.

In [None]:
!{sys.executable} -m pip install --upgrade pip
!{sys.executable} -m pip install opensearch-py

### Discover OpenSearch domain endpoint

Before we can configure the OpenSearch client, we need to determine the endpoint for the OpenSearch domain created in your AWS environment. We will accomplish this by looking for the OpenSearch domain with tag key of `Name` and tag value of `retaildemostore`. This tag was associated with the Amazon OpenSearch domain that was created when the project was deployed to your AWS account using CloudFormation.

In [None]:
opensearch_domain_endpoint = None

domains_response = opensearch_service.list_domain_names()

for domain_name in domains_response['DomainNames']:
    describe_response = opensearch_service.describe_domain(
        DomainName=domain_name['DomainName']
    )
    
    tags_response = opensearch_service.list_tags(ARN=describe_response['DomainStatus']['ARN'])

    domain_match = False
    for tag in tags_response['TagList']:
        if tag['Key'] == 'Name' and tag['Value'] == 'retaildemostore':
            domain_match = True
            break
            
    if domain_match:
        opensearch_domain_endpoint = describe_response['DomainStatus']['Endpoints']['vpc']
        break

print('OpenSearch domain endpoint: ' + str(opensearch_domain_endpoint))

assert opensearch_domain_endpoint, 'OpenSearch domain endpoint could not be determined. Ensure Amazon OpenSearch domain has been successfully created and has "retaildemostore" tag before continuing.'
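
The lookup above reads `Endpoints['vpc']`, which is populated when the domain is deployed inside a VPC. As a hedged sketch, a helper like the hypothetical `resolve_domain_endpoint` below could also fall back to the top-level `Endpoint` field that the `describe_domain` response carries for domains with a public endpoint. Whether that fallback matters depends on how the domain was deployed in your account.

```python
def resolve_domain_endpoint(domain_status):
    """Pick an endpoint hostname out of a describe_domain 'DomainStatus' dict.

    Hypothetical helper: prefers the VPC endpoint and falls back to the
    public 'Endpoint' field. Returns None when neither is present, for
    example while the domain is still being created.
    """
    vpc_endpoint = (domain_status.get('Endpoints') or {}).get('vpc')
    return vpc_endpoint or domain_status.get('Endpoint')
```

Inside the loop above it would be called as `resolve_domain_endpoint(describe_response['DomainStatus'])`.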

### Configure and create OpenSearch python client

In [None]:
from opensearchpy import OpenSearch

SEARCH_HOST = {
    'host' : opensearch_domain_endpoint,
    'port' : 443,
    'scheme' : 'https',
}

client = OpenSearch(hosts = [SEARCH_HOST])

# These variables will be used throughout the rest of the notebook
INDEX_NAME = 'products'
ID_FIELD = 'id'

### Prepare Product data for indexing

Batch products into chunks that will be used for batch indexing below.

In [None]:
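bulk_datas = []   # list of batches; each batch is one bulk request body
bulk_data = []    # the batch currently being filled

bulk_datas.append(bulk_data)

# Maximum entries per batch. Each product adds two entries (an action
# line and the document itself), so 100 entries is 50 products.
max_data_len = 100

for product in products:
    data_dict = product

    # Action line telling OpenSearch which index and document ID to use
    op_dict = {
        "index": {
            "_index": INDEX_NAME,
            "_id": data_dict[ID_FIELD]
        }
    }
    bulk_data.append(op_dict)
    bulk_data.append(data_dict)

    # Start a new batch once the current one is full
    if len(bulk_data) >= max_data_len:
        bulk_data = []
        bulk_datas.append(bulk_data)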
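
Note that the loop above can leave an empty list at the end of `bulk_datas` when a batch fills exactly on the last product. A generator-based variant avoids that. This is a sketch under the same assumptions as the cell above (each product is a dict carrying an ID field); `build_bulk_batches` and its parameters are hypothetical names.

```python
def build_bulk_batches(products, index_name, id_field='id', docs_per_batch=50):
    """Yield bulk request bodies holding at most `docs_per_batch` documents.

    Each document contributes two entries to a batch: the action line
    and the document source.
    """
    batch = []
    for product in products:
        batch.append({'index': {'_index': index_name, '_id': product[id_field]}})
        batch.append(product)
        if len(batch) >= docs_per_batch * 2:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch; never an empty one
        yield batch
```

For example, `bulk_datas = list(build_bulk_batches(products, INDEX_NAME, ID_FIELD))` would produce batches equivalent to the ones built above, minus any empty trailing batch.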

### Check for and delete existing indexes

If the products index already exists, we'll delete it so everything gets rebuilt from scratch.

In [None]:
if client.indices.exists(index = INDEX_NAME):
    print("Deleting '%s' index..." % (INDEX_NAME))
    res = client.indices.delete(index = INDEX_NAME)
    print(" response: '%s'" % (res))
else:
    print('Index does not exist. Nothing to delete.')

### Create index

In [None]:
request_body = {
    "settings" : {
        "number_of_shards": 1,
        "number_of_replicas": 0
    }
}
print("Creating '%s' index..." % (INDEX_NAME))
res = client.indices.create(index = INDEX_NAME, body = request_body)
print(" response: '%s'" % (res))

### Perform bulk indexing

In [None]:
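print("Bulk indexing...")
for bulk_data in bulk_datas:
    if not bulk_data:
        continue  # skip an empty trailing batch
    res = client.bulk(index = INDEX_NAME, body = bulk_data, refresh = True)
    if res.get('errors'):
        print("Some documents failed to index; inspect res['items']")

print("Done")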
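
The bulk API reports per-document failures in the response body rather than by raising an exception. To list which documents failed, something like the following could be applied to each `res`. The sketch assumes the standard bulk response shape (a top-level `errors` flag and an `items` list keyed by action type); `bulk_failures` is a hypothetical helper name.

```python
def bulk_failures(bulk_response):
    """Return (doc_id, error) pairs for items that failed in a bulk response."""
    if not bulk_response.get('errors'):
        return []
    failures = []
    for item in bulk_response.get('items', []):
        # Each item has a single key naming the action, e.g. 'index'
        for result in item.values():
            if 'error' in result:
                failures.append((result.get('_id'), result['error']))
    return failures
```

An empty list means every document in that batch was accepted.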

### Validate results through OpenSearch

To verify that the products have been successfully indexed, let's perform a wildcard search for `brush*` directly against the OpenSearch index.

In [None]:
res = client.search(index = INDEX_NAME, body={"query": {"wildcard": { "name": "brush*"}}})
print(json.dumps(res, indent=2))
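
The full JSON response is verbose. To compare results with other queries, it can help to pull out just the matching IDs and names. The following sketch assumes the standard search response layout (`hits.hits[]` entries with `_id` and `_source`) and the `name` field queried above; `summarize_hits` is a hypothetical helper.

```python
def summarize_hits(search_response, field='name'):
    """Return (document ID, field value) pairs from a search response."""
    hits = search_response.get('hits', {}).get('hits', [])
    return [(hit['_id'], hit.get('_source', {}).get(field)) for hit in hits]
```

Calling `summarize_hits(res)` on the response above gives a compact list that is easier to scan than the raw JSON.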

## Validate results through Search Service

Finally, let's verify that the Retail Demo Store's [Search service](https://github.com/aws-samples/retail-demo-store/tree/master/src/search) can successfully query the OpenSearch index as well.

### Discover Search Service

First, we need to get the address of the [Search service](https://github.com/aws-samples/retail-demo-store/tree/master/src/search).

In [None]:
response = servicediscovery.discover_instances(
    NamespaceName='retaildemostore.local',
    ServiceName='search',
    MaxResults=1,
    HealthStatus='HEALTHY'
)

assert len(response['Instances']) > 0, 'Search service instance not found; check ECS to ensure it launched cleanly'

search_service_instance = response['Instances'][0]['Attributes']['AWS_INSTANCE_IPV4']
print('Service Instance IP: {}'.format(search_service_instance))

### Call Search Service

Let's call the service's index page, which simply echoes the service name.

In [None]:
!curl {search_service_instance}

Finally, let's do the same `brush` search through the Search service. We should get back the same item IDs as the direct OpenSearch query above.

In [None]:
!curl "{search_service_instance}/search/products?searchTerm=brush"
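
Search terms that contain spaces or other special characters need to be URL-encoded before they are placed in the query string. As a sketch, the standard library can build the URL; `build_search_url` is a hypothetical helper, and the path and `searchTerm` parameter are taken from the curl command above.

```python
from urllib.parse import urlencode

def build_search_url(host, search_term):
    """Build the Search service URL with a safely encoded search term."""
    query = urlencode({'searchTerm': search_term})
    return 'http://{}/search/products?{}'.format(host, query)
```

The result could then be fetched from Python with `requests.get(build_search_url(search_service_instance, 'brush')).json()` instead of shelling out to curl.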

## Workshop complete

**Congratulations!** You have completed the first Retail Demo Store workshop where we indexed the products from the Retail Demo Store's Products microservice in an OpenSearch domain index. This domain is used by the Retail Demo Store's Search microservice to process search queries from the Web user interface. To see this in action, open the Retail Demo Store's web UI in a new browser tab/window and enter a value in the search field at the top of the page.

### Next step

Move on to the **[1-Personalization](../1-Personalization/Lab-1-Introduction-and-data-preparation.ipynb)** workshop where we will learn how to train machine learning models using Amazon Personalize to produce personalized product recommendations to users and add the ability to provide personalized reranking of products.