# Full Stack Semantic Search Web Application

In the previous modules, we demonstrated both keyword and semantic search with Amazon OpenSearch Service. In this module, we will create a search-enabled application using a SageMaker endpoint and a serverless web application.

By the end of this module, the architecture will look as follows:

![full stack semantic search](semantic_search_fullstack.jpg)

### 1. Import PyTorch and check version

As in the previous modules, let's import PyTorch and confirm that we have a recent enough version. The version should already be 1.10.2 or higher. If not, please run the previous labs first to get everything set up.

In [None]:
import torch
print(torch.__version__)
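
The minimum-version requirement above can also be checked programmatically. The helper below is a minimal sketch: version_tuple and meets_minimum are hypothetical names, not part of PyTorch, and the code assumes version strings shaped like 1.13.1 or 1.13.1+cu117.

```python
import re

def version_tuple(version: str) -> tuple:
    """Convert '1.13.1+cu117' to (1, 13, 1), dropping any local build suffix."""
    core = version.split("+")[0]
    return tuple(int(part) for part in re.findall(r"\d+", core)[:3])

def meets_minimum(version: str, minimum: str = "1.10.2") -> bool:
    """Return True when version is at least the minimum release."""
    return version_tuple(version) >= version_tuple(minimum)

# Example strings; in the notebook you would pass torch.__version__ instead.
print(meets_minimum("1.13.1+cu117"))
print(meets_minimum("1.9.0"))
```

In the notebook, calling meets_minimum(torch.__version__) gives a single True or False answer instead of a version string to read by eye.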

### 2. Retrieve notebook variables

The line below will retrieve your shared variables from the previous notebook.

In [None]:
%store -r

### 3. Initialize boto3

We will use boto3 to interact with other AWS services.

Note: You can ignore any PythonDeprecationWarning warnings.

In [None]:
import boto3
import re
import time
import sagemaker
from sagemaker import get_execution_role

s3_resource = boto3.resource("s3")
s3 = boto3.client('s3')


### 4. Save pre-trained BERT model to S3

First, we will host a pretrained BERT model in a SageMaker PyTorch model server. It generates fixed-length, 768-dimensional sentence embeddings with a [sentence-transformers](https://github.com/UKPLab/sentence-transformers) model loaded through [HuggingFace Transformers](https://huggingface.co/sentence-transformers/distilbert-base-nli-stsb-mean-tokens).

The application will call this SageMaker endpoint to generate a vector for each search query. First, we'll get a pre-trained model and save it locally so that it can be uploaded to S3.

In [None]:
import os

import torch
from transformers import AutoTokenizer, AutoModel
from transformers import DistilBertTokenizer, DistilBertModel

model_name = "sentence-transformers/distilbert-base-nli-stsb-mean-tokens"
saved_model_dir = 'transformer'
os.makedirs(saved_model_dir, exist_ok=True)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name) 

tokenizer.save_pretrained(saved_model_dir)
model.save_pretrained(saved_model_dir)
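
The model name ends in mean-tokens, which indicates that a sentence embedding is obtained by averaging the token embeddings. The sketch below illustrates that pooling step on plain Python lists, assuming an attention mask that marks padding tokens with 0. mean_pool is a hypothetical helper, and the entry point script used by this workshop is not shown here and may implement pooling differently.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask value is non-zero."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                totals[i] += value
    kept = max(kept, 1)  # avoid dividing by zero when every token is masked
    return [total / kept for total in totals]

# Three made-up 2-dimensional token vectors; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

Whatever the sentence length, the result has the same number of dimensions as a single token vector, which is what makes the embedding fixed-length.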

Create a SageMaker session and get the execution role to be used later.

In [None]:
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()


Package the model files into a model.tar.gz archive.

In [None]:
!cd transformer && tar czvf ../model.tar.gz *
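
The shell command changes into the transformer directory before archiving, so the files sit at the top level of the archive rather than under a transformer/ prefix. A rough Python equivalent using the standard tarfile module is sketched below. pack_model is a hypothetical helper, and the dry run uses placeholder files instead of the real model.

```python
import os
import tarfile
import tempfile

def pack_model(model_dir: str, archive_path: str) -> list:
    """Archive the visible entries of model_dir at the root of a gzip tarball."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(model_dir)):
            if name.startswith("."):
                continue  # the shell glob * also skips hidden files
            tar.add(os.path.join(model_dir, name), arcname=name)
    # Re-open the archive and report what it contains.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()

# Dry run on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as workdir:
    model_dir = os.path.join(workdir, "transformer")
    os.makedirs(model_dir)
    for filename in ("config.json", "vocab.txt"):
        with open(os.path.join(model_dir, filename), "w") as handle:
            handle.write("{}")
    names = pack_model(model_dir, os.path.join(workdir, "model.tar.gz"))

print(names)
```

Listing the archive members afterwards is a quick way to confirm that no directory prefix slipped in.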

And finally upload the model to S3.

In [None]:
inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='sentence-transformers-model')
inputs

### 5. Create PyTorch Model Object

Next, we need to create a PyTorchModel object. The deploy() method on the model object creates an endpoint which serves prediction requests in real time. If instance_type is set to a SageMaker instance type (e.g. ml.m5.large), the model is deployed on SageMaker. If instance_type is set to local, the model is deployed locally as a Docker container, ready for local testing.

We need to create a Predictor class to accept TEXT as input and output JSON. The default behaviour is to accept a numpy array.

In [None]:
from sagemaker.pytorch import PyTorch, PyTorchModel
from sagemaker.predictor import Predictor
from sagemaker import get_execution_role

class StringPredictor(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super(StringPredictor, self).__init__(endpoint_name, sagemaker_session, content_type='text/plain')
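
The SageMaker PyTorch model server lets the entry point script define input_fn and output_fn hooks to decode the request and encode the response. The workshop's inference.py is not reproduced in this notebook, so the following is only a sketch of what text-in, JSON-out handlers could look like, not the actual script.

```python
import json

def input_fn(request_body, request_content_type):
    """Decode a text/plain request body into a Python string."""
    if request_content_type != "text/plain":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    return request_body.strip()

def output_fn(prediction, accept="application/json"):
    """Serialize an embedding (any iterable of numbers) as a JSON array."""
    return json.dumps([float(value) for value in prediction])

# Dry run with a made-up 3-dimensional embedding.
text = input_fn(b"Does this work with xbox?\n", "text/plain")
body = output_fn([0.1, 0.2, 0.3])
print(text)
print(body)
```

With handlers of this shape, the StringPredictor above sends plain text and receives a JSON string that json.loads can turn back into a list.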

### 6. Deploy the BERT model to SageMaker Endpoint
Now that we have the predictor class, let's deploy a SageMaker endpoint for our application to invoke.

#### Note: This process will take about 5 minutes to complete.

You can ignore the "content_type is a no-op in sagemaker>=2" warning.

In [None]:
pytorch_model = PyTorchModel(model_data = inputs, 
                             role=role, 
                             entry_point ='inference.py',
                             source_dir = './code',
                             py_version = 'py39', 
                             framework_version = '1.13.1',
                             predictor_cls=StringPredictor)

predictor = pytorch_model.deploy(instance_type='ml.m5d.large', 
                                 initial_instance_count=1, 
                                 endpoint_name = f'semantic-search-model-{int(time.time())}')

### 7. Test the SageMaker Endpoint.

Now that the endpoint is created, let's quickly test it out.

In [None]:
import json
original_payload = 'Does this work with xbox?'
features = predictor.predict(original_payload)
vector_data = json.loads(features)

vector_data
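
Semantic search ranks documents by how close their embedding vectors are to the query vector. Cosine similarity is one common closeness measure. The sketch below assumes it for illustration only, and the distance metric configured in the OpenSearch index may differ. The short vectors are made-up stand-ins for real 768-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # points the same way as the query
orthogonal = [0.0, 3.0]       # unrelated to the query

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

Vectors pointing in the same direction score close to 1 regardless of their length, while unrelated vectors score close to 0.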

----

# Deploying a full-stack semantic search application

We are now ready to build a real-world full-stack ML-powered web app. The Serverless Application Model (SAM) template we create below will deploy an Amazon API Gateway and AWS Lambda function. The Lambda function runs your code in response to HTTP requests that are sent to the API Gateway.
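
To make the request flow concrete, here is a minimal sketch of a Lambda handler behind an API Gateway proxy integration. It is not the workshop's actual function, which lives in backend/lambda and is not shown here. The query parameter name q and the make_handler factory are hypothetical, and the embed callable stands in for the call to the SageMaker endpoint.

```python
import json

def make_handler(embed):
    """Build a Lambda handler; embed is any callable mapping text to a list of floats."""
    def handler(event, context):
        # API Gateway sends None when the request has no query string.
        params = event.get("queryStringParameters") or {}
        query = (params.get("q") or "").strip()
        if not query:
            return {
                "statusCode": 400,
                "body": json.dumps({"error": "missing query parameter q"}),
            }
        vector = embed(query)
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"query": query, "dimensions": len(vector)}),
        }
    return handler

# Dry run with a stub embedding function instead of a real endpoint.
handler = make_handler(lambda text: [0.0] * 768)
ok_response = handler({"queryStringParameters": {"q": "xbox"}}, None)
bad_response = handler({"queryStringParameters": None}, None)
print(ok_response["statusCode"], bad_response["statusCode"])
```

Passing the embedding function in as an argument keeps the handler testable without AWS credentials.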

### 8. Build Lambda zip file

First, we need to package our Lambda function for deployment.

In [None]:
%cd backend/lambda
!sh build-lambda.sh
!unzip -l lambda.zip
%cd /home/ec2-user/SageMaker/semantic-search-with-amazon-opensearch

### Get account information

In [None]:
account_id = boto3.client('sts').get_caller_identity().get('Account')
my_account = f'{account_id}'
print("account id: " + my_account)

### Disable S3 account level public access block

Please refer to the S3 documentation for more information: https://docs.aws.amazon.com/AmazonS3/latest/userguide/configuring-block-public-access-account.html

In [None]:
client = boto3.client('s3control')
response = client.put_public_access_block(
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': False,
        'IgnorePublicAcls': False,
        'BlockPublicPolicy': False,
        'RestrictPublicBuckets': False
    },
    AccountId=my_account
)
print(response)

### Disable S3 bucket level public access block

Please refer to the S3 documentation for more information: https://docs.aws.amazon.com/AmazonS3/latest/userguide/configuring-block-public-access-bucket.html


In [None]:
print(bucket)
s3 = boto3.client('s3')
response = s3.put_public_access_block(
    Bucket=f'{bucket}',
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': False,
        'IgnorePublicAcls': False,
        'BlockPublicPolicy': False,
        'RestrictPublicBuckets': False
    }
)
print(response)
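
The same four flags are switched off here and switched back on in the cleanup section. A small helper, sketched below under the hypothetical name public_access_block_config, builds either payload so the two steps cannot drift apart.

```python
def public_access_block_config(blocked: bool) -> dict:
    """Return a PublicAccessBlockConfiguration with all four flags set alike."""
    return {
        "BlockPublicAcls": blocked,
        "IgnorePublicAcls": blocked,
        "BlockPublicPolicy": blocked,
        "RestrictPublicBuckets": blocked,
    }

open_config = public_access_block_config(False)    # used while the lab runs
locked_config = public_access_block_config(True)   # used during cleanup
print(open_config)
print(locked_config)
```

Either dictionary can be passed as the PublicAccessBlockConfiguration argument of the put_public_access_block calls in this notebook.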

### Change S3 bucket ownership settings

Please refer to the S3 documentation for more information: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-ownership-existing-bucket.html

In [None]:
response = s3.put_bucket_ownership_controls(
    Bucket=f'{bucket}',
    OwnershipControls={
        'Rules': [
            {
                'ObjectOwnership': 'BucketOwnerPreferred'
            },
        ]
    }
)
print(response)

Upload the packaged Lambda zip file to S3

In [None]:
s3_resource.Object(bucket, 'lambda/lambda.zip').upload_file('./backend/lambda/lambda.zip',ExtraArgs={'ACL':'public-read'})
# Despite its name, this variable holds the bucket name, not a full URL.
lambda_zip_url = f'{bucket}'
print("lambda zip file bucket: " + lambda_zip_url)

### 9. Deploy a CloudFormation stack to create API Gateway and Lambda function

Next, we'll create a link to deploy a CloudFormation stack for our SAM application. Execute the following code block to generate a web link.

### Note: Click the generated link to deploy the new CloudFormation template for Lambda and API Gateway. Mark all the checkboxes at the end of the form and click "Create Stack".

In [None]:
s3_resource.Object(bucket, 'backend/template.yaml').upload_file('./backend/template.yaml', ExtraArgs={'ACL':'public-read'})


sam_template_url = f'https://{bucket}.s3.amazonaws.com/backend/template.yaml'
print("cloudformation template url: " + sam_template_url)


# Generate the CloudFormation Quick Create Link

print("Click the URL below to create the backend API for semantic search:\n")
print((
    'https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review'
    f'?templateURL={sam_template_url}'
    '&stackName=semantic-search-api'
    f'&param_BucketName={outputs["s3BucketTraining"]}'
    f'&param_DomainName={outputs["OpenSearchDomainName"]}'
    f'&param_ElasticSearchURL={outputs["OpenSearchDomainEndpoint"]}'
    f'&param_SagemakerEndpoint={predictor.endpoint_name}'
    f'&param_LambdaZipFile={lambda_zip_url}'
))
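
The link above concatenates raw values into the URL. When a value contains characters such as : or /, it is safer to percent-encode it. The sketch below builds the same kind of quick-create link with urllib.parse.urlencode. quick_create_link is a hypothetical helper, and the bucket name in the dry run is a placeholder.

```python
from urllib.parse import urlencode

def quick_create_link(region, template_url, stack_name, parameters):
    """Build a CloudFormation quick-create link with percent-encoded values."""
    query = {"templateURL": template_url, "stackName": stack_name}
    # Template parameters are passed with a param_ prefix.
    query.update({f"param_{name}": value for name, value in parameters.items()})
    return (
        f"https://console.aws.amazon.com/cloudformation/home?region={region}"
        f"#/stacks/create/review?{urlencode(query)}"
    )

link = quick_create_link(
    "us-east-1",
    "https://example-bucket.s3.amazonaws.com/backend/template.yaml",
    "semantic-search-api",
    {"BucketName": "example-bucket"},
)
print(link)
```

Taking the region as an argument also avoids hard-coding us-east-1 when the notebook runs elsewhere.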

### 10. Wait for the CloudFormation stack to complete.
Before proceeding further, wait for the CloudFormation stack to become complete. The status should change to "CREATE_COMPLETE".
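
Instead of refreshing the console, the stack status can be polled from the notebook. The sketch below assumes a boto3 CloudFormation client and reads StackStatus from describe_stacks. wait_for_stack is a hypothetical helper, and FakeCloudFormation is a stub that only exists so the loop can be exercised without an AWS account.

```python
import time

def wait_for_stack(cfn_client, stack_name, poll_seconds=15, timeout_seconds=1800):
    """Poll describe_stacks until the stack reaches CREATE_COMPLETE."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
        status = stack["StackStatus"]
        if status == "CREATE_COMPLETE":
            return status
        if status.endswith("FAILED") or "ROLLBACK" in status:
            raise RuntimeError(f"Stack {stack_name} ended in {status}")
        if time.monotonic() > deadline:
            raise TimeoutError(f"Stack {stack_name} still in {status}")
        time.sleep(poll_seconds)

class FakeCloudFormation:
    """Stub client that replays a fixed sequence of statuses."""
    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def describe_stacks(self, StackName):
        return {"Stacks": [{"StackName": StackName, "StackStatus": next(self._statuses)}]}

fake = FakeCloudFormation(["CREATE_IN_PROGRESS", "CREATE_IN_PROGRESS", "CREATE_COMPLETE"])
result = wait_for_stack(fake, "semantic-search-api", poll_seconds=0)
print(result)
```

In the notebook you would pass boto3.client('cloudformation') instead of the stub. boto3 also ships a built-in stack_create_complete waiter that serves the same purpose.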

### 11. Update the front end config

Next, we need to update the config of the front end with the API values.

In [None]:
import json

cfn = boto3.client('cloudformation')

def get_cfn_outputs(stackname):
    outputs = {}
    for output in cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']:
        outputs[output['OutputKey']] = output['OutputValue']
    return outputs

api_endpoint = get_cfn_outputs('semantic-search-api')['TextSimilarityApi']

with open('./frontend/src/config/config.json', 'w') as outfile:
    json.dump({'apiEndpoint': api_endpoint}, outfile)

### 12. Deploy frontend services

Now that we've updated the configuration, we need to build and deploy our front end.

In [None]:
# add NPM to the path so we can assemble the web frontend from our notebook code

from os import environ

npm_path = ':/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin'

if npm_path not in environ['PATH']:
    ADD_NPM_PATH = environ['PATH']
    ADD_NPM_PATH = ADD_NPM_PATH + npm_path
else:
    ADD_NPM_PATH = environ['PATH']
    
%set_env PATH=$ADD_NPM_PATH
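
The cell above appends the npm directory only when it is not already present. The same idea can be expressed as a small pure function that compares whole PATH entries rather than substrings. append_to_path is a hypothetical helper and the directories in the dry run are placeholders.

```python
import os

def append_to_path(path_value: str, new_dir: str) -> str:
    """Return path_value with new_dir appended unless it is already an entry."""
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    if new_dir not in entries:
        entries.append(new_dir)
    return os.pathsep.join(entries)

once = append_to_path("/usr/bin", "/opt/npm/bin")
twice = append_to_path(once, "/opt/npm/bin")  # second call changes nothing
print(once)
print(twice)
```

Assigning the result to os.environ['PATH'] would have a similar effect to the %set_env line, without relying on IPython magics.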

In [None]:
%cd ./frontend/

!npm install

In [None]:
!npm run-script build

### Disable web page S3 bucket level public access block

In [None]:
host_bucket = f"{outputs['s3BucketHostingBucketName']}"
print(host_bucket)
response = s3.put_public_access_block(
    Bucket=host_bucket,
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': False,
        'IgnorePublicAcls': False,
        'BlockPublicPolicy': False,
        'RestrictPublicBuckets': False
    }
)
print(response)

### Change web page S3 bucket ownership settings

In [None]:
response = s3.put_bucket_ownership_controls(
    Bucket=host_bucket,
    OwnershipControls={
        'Rules': [
            {
                'ObjectOwnership': 'BucketOwnerPreferred'
            },
        ]
    }
)
print(response)

In [None]:
hosting_bucket = f"s3://{outputs['s3BucketHostingBucketName']}"

!aws s3 sync ./build/ $hosting_bucket --acl public-read

### 13. Browse to the application

Now that the application is deployed, let's browse to the front end and test it out. 

### Note: Execute the following and click on the link generated.

In [None]:
print('Click the URL below:\n')
print(outputs['S3BucketSecureURL'] + '/index.html')

You can search for a question, for example "does this work with xbox?", and compare the results. You will see the difference between keyword search and semantic search.

![full stack semantic search](full-stack-semantic-search-ui.jpg)

In keyword search, questions such as "Does this work for a switch?" and "does this work with pc" are returned because they contain the words "does this work", but their meaning is quite different from the query.

In semantic search, questions such as "Do I need to buy anything extra to used in xbox one s controller?" and "How do these headphones connect to the Xbox360 controller?" are returned. Their meaning is very close to the query.
![full stack semantic search](full-stack-semantic-search-ui-2.jpg)

## Cleanup

### Restore S3 bucket ownership

In [None]:
response = s3.put_bucket_ownership_controls(
    Bucket=f'{bucket}',
    OwnershipControls={
        'Rules': [
            {
                'ObjectOwnership': 'BucketOwnerEnforced'
            },
        ]
    }
)
print(response)

response = s3.put_bucket_ownership_controls(
    Bucket=f'{host_bucket}',
    OwnershipControls={
        'Rules': [
            {
                'ObjectOwnership': 'BucketOwnerEnforced'
            },
        ]
    }
)
print(response)

### Block S3 public access

In [None]:
response = s3.put_public_access_block(
    Bucket=f'{bucket}',
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': True,
        'IgnorePublicAcls': True,
        'BlockPublicPolicy': True,
        'RestrictPublicBuckets': True
    }
)
response = s3.put_public_access_block(
    Bucket=f'{host_bucket}',
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': True,
        'IgnorePublicAcls': True,
        'BlockPublicPolicy': True,
        'RestrictPublicBuckets': True
    }
)

response = client.put_public_access_block(
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': True,
        'IgnorePublicAcls': True,
        'BlockPublicPolicy': True,
        'RestrictPublicBuckets': True
    },
    AccountId=my_account
)


