# Module 4: Inference Patterns - Inference pipeline based feature lookup
**This notebook uses the feature groups created in `module 1` and `module 2` and the model trained in `module 3` to show how to look up features from the online feature store in real time from within an endpoint.**


**Note:** Please set the kernel to `Python 3 (Data Science)` and the instance type to `ml.t3.medium`.


## Contents

1. [Background](#Background)
2. [Setup](#Setup)
3. [Loading feature group names](#Loading-feature-group-names)
4. [Prepare a script to look up features from the featurestore](#Prepare-a-script-to-look-up-features-from-the-featurestore)
5. [Load pre-trained xgboost model](#Load-pre-trained-xgboost-model)
6. [Create and deploy an inference pipeline](#Create-and-deploy-an-inference-pipline)
7. [Make inference using the inference pipeline](#Make-inference-using-the-inference-pipeline)
8. [Cleanup](#Cleanup)


## Background
In this notebook, we demonstrate how to retrieve features from two online feature groups within an endpoint. We use the feature sets derived in Modules 1 and 2 together with the model trained in Module 3, a SageMaker XGBoost model that predicts which product a user will add to their basket.

Starting from the already trained model, we create an [inference pipeline](https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html). An inference pipeline lets you chain a sequence of 2 to 15 containers and deploy them on the same endpoint. Inference pipelines are well suited to real-time inference where a sequence of models feed into one another to generate the final prediction, or where results must be pre-processed or post-processed in real time.

We use an XGBoost container as the first container in the inference pipeline to look up features from the online feature store and feed the retrieved features into a second XGBoost container for model inference. You will also see how to deploy these two containers onto the same endpoint using a `PipelineModel`.

The first XGBoost container receives a request body containing a customer ID and a product ID, and uses them to retrieve the associated features from two online feature groups (the customers and products feature groups created in Module 2).
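
The request flow described above can be sketched locally without any AWS calls. This is only an illustration, assuming in-memory dictionaries in place of the two online feature groups and a placeholder scoring function in place of the trained XGBoost model. All names and values (`CUSTOMER_FEATURES`, `PRODUCT_FEATURES`, `lookup_container`, `model_container`, `invoke_pipeline`) are hypothetical and are not part of the workshop code.

```python
# Hypothetical in-memory stand-ins for the customers and products online feature groups.
CUSTOMER_FEATURES = {"C50": [0.25, 1.0]}
PRODUCT_FEATURES = {"P2": [0.75, 0.0]}


def lookup_container(request_body):
    """First container: turn 'customer_id,product_id' into one row of features."""
    customer_id, product_id = [part.strip() for part in request_body.split(",")]
    return [CUSTOMER_FEATURES[customer_id] + PRODUCT_FEATURES[product_id]]


def model_container(rows):
    """Second container: placeholder scorer standing in for the trained model."""
    return [sum(row) / len(row) for row in rows]


def invoke_pipeline(request_body):
    """Chain the containers: the output of the first is the input of the second."""
    return model_container(lookup_container(request_body))


print(invoke_pipeline("C50, P2"))
```

The caller only ever sends the two identifiers; the feature values never leave the endpoint, which is the point of doing the lookup inside the pipeline.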

Take a few minutes to review the following architecture, which shows an example of an inference pipeline with multiple containers.

![Inference endpoint lookup](../images/m4_nb3_inference_pattern.png "Inference Pipeline endpoint feature look up")


## Setup

In [None]:
import json
import logging
import os
import sys

import boto3
import numpy as np
import pandas as pd
import sagemaker
from sagemaker import get_execution_role
from sagemaker.inputs import TrainingInput
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

sys.path.append('..')



### Essentials

In [None]:
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

In [None]:
sagemaker_execution_role = get_execution_role()
logger.info(f'Role = {sagemaker_execution_role}')
session = boto3.Session()
sagemaker_session = sagemaker.Session()
default_bucket = sagemaker_session.default_bucket()
prefix = 'sagemaker-featurestore-workshop'
s3 = session.resource('s3')
sagemaker_client = session.client(service_name="sagemaker")

## Loading feature group names

We use the data ingested into the feature groups created in Modules 1 and 2, so we first restore the feature group names.

In [None]:
%store -r customers_feature_group_name
%store -r products_feature_group_name

## Prepare a script to look up features from the featurestore
Within an inference pipeline we can deploy multiple containers in sequence. The first container can, for example, be a pre-processing container built with scikit-learn or any other framework of your choice. For this demonstration we create an XGBoost model object that does nothing but look up features from the feature store. To do so, we prepare a customised inference script to use when creating the model object. Note that `model_fn` returns `None` as the model, because this container does not serve a trained model.


In [None]:
%%writefile custom_library/inference_get_features.py

import ast
import os
import time

import boto3
import helper

boto_session = boto3.Session()
region = boto_session.region_name

# The feature list is defined by the client and passed to the script as an environment variable.
feature_list = ast.literal_eval(os.environ['feature_list'])



def model_fn(model_dir):
    # No model artifact is loaded: this container only looks up features.
    print('processing - in model_fn')
    return None



def input_fn(request_body, request_content_type):
    """
    The SageMaker XGBoost model server receives the request body and the content type,
    and invokes `input_fn`.
    Expects a CSV body of the form "customer_id,product_id" and returns the looked-up
    feature values as a single-row list (the object passed on to `predict_fn`).
    """
    print(request_content_type)
    if request_content_type == "text/csv":
        params = request_body.split(',')
        id_dict = {'customer_id': params[0].strip(), 'product_id': params[1].strip()}
        start = time.time()
        recs = helper.get_latest_featureset_values(id_dict, feature_list)
        duration = time.time() - start
        print("time to look up features from the two feature groups:", duration)
        records = list(recs.values())
        return [records]
    else:
        raise ValueError("{} not supported by script!".format(request_content_type))
 

def predict_fn(input_data, model):
    """
    The SageMaker XGBoost model server invokes `predict_fn` on the return value of `input_fn`.
    No model is loaded in this container, so the looked-up features are returned unchanged
    and passed on to the next container in the pipeline.
    """
    return input_data
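
The `input_fn` above assumes a well-formed request body. A stricter parsing step could look like the sketch below, under the assumption that the body may arrive as either `str` or `bytes` and must contain exactly two comma-separated identifiers. `parse_lookup_request` is a hypothetical helper, not part of the workshop's `helper` module.

```python
def parse_lookup_request(request_body):
    """Split a 'customer_id,product_id' CSV body into an id dictionary.

    Hypothetical helper: accepts str or bytes and rejects malformed bodies.
    """
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    parts = [part.strip() for part in request_body.split(",")]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'customer_id,product_id', got {request_body!r}")
    return {"customer_id": parts[0], "product_id": parts[1]}


print(parse_lookup_request("C50, P2"))
```

Failing early with a clear message keeps a malformed request from reaching the feature store lookup, where the resulting error would be harder to trace.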


### Prepare the feature group names and the list of features to be retrieved from the online feature store, defined by the client and passed to the script as an environment variable

In [None]:
%store -r customers_feature_group_name
%store -r products_feature_group_name

customers_fg = sagemaker_client.describe_feature_group(
 FeatureGroupName=customers_feature_group_name)

products_fg = sagemaker_client.describe_feature_group(
 FeatureGroupName=products_feature_group_name)


# Select all features of a feature group with '*':
#     customers_feats = '*'
#     products_feats = '*'
# or build a comma-separated list from the feature definitions (edit it to keep a subset):
#     customers_feats = ','.join(i['FeatureName'] for i in customers_fg['FeatureDefinitions'])
#     products_feats = ','.join(i['FeatureName'] for i in products_fg['FeatureDefinitions'])

customers_feats='*'
products_feats='*'

customer_feats_desc=customers_fg["FeatureGroupName"]+ ":"+customers_feats
products_feats_desc=products_fg["FeatureGroupName"]+ ":"+products_feats

feature_list=str([customer_feats_desc,products_feats_desc])
print(feature_list)
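
Environment variables can only carry strings, which is why the list is converted with `str()` here and recovered with `ast.literal_eval` in the inference script. The round trip can be checked locally with the short sketch below. The feature group names and the `price,category` feature names are hypothetical placeholders, and the split on `:` only mirrors how the entries are built in the cell above, not necessarily how `helper` interprets them.

```python
import ast

# Hypothetical feature group names; the notebook restores the real ones with %store -r.
feature_list = str(["customers-fg:*", "products-fg:price,category"])

# This is what the inference script does with the environment variable value.
recovered = ast.literal_eval(feature_list)

for entry in recovered:
    # Each entry has the form "<feature group name>:<comma-separated features or *>".
    group_name, features = entry.split(":", 1)
    print(group_name, features.split(","))
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice than `eval` for a value that comes from the container environment.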


### Create the XGBoost model object and pass it the customised inference script. Take note of the environment variables we define, in particular `feature_list`, which the client passes to the script

In [None]:
from sagemaker.xgboost.model import XGBoostModel


#env={"feature_list": feature_list}

fs_lookup_model = XGBoostModel(
 model_data=None,
 role=sagemaker_execution_role,
 source_dir= './custom_library',
 entry_point="inference_get_features.py",
 framework_version="1.3-1",
 sagemaker_session=sagemaker_session,
)

fs_lookup_model.env = {"SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT":"text/csv", "feature_list": feature_list}

## Load pre-trained xgboost model
The model that actually performs the inference is our already trained XGBoost model, which we deploy as the second container in the inference pipeline. We do this by loading the trained model from Module 3 and creating a model object to pass to the inference pipeline.

In [None]:
%store -r training_jobName
sagemaker_client = session.client(service_name="sagemaker")
from sagemaker.xgboost.model import XGBoostModel

training_job_info = sagemaker_client.describe_training_job(
 TrainingJobName=training_jobName
)
xgb_model_data = training_job_info["ModelArtifacts"]["S3ModelArtifacts"]
print(xgb_model_data)

container_uri = training_job_info['AlgorithmSpecification']['TrainingImage']

In [None]:
from time import gmtime, strftime
from sagemaker.utils import name_from_base
from sagemaker.model import Model

xgb_model = Model(
 image_uri=container_uri,
 model_data=xgb_model_data,
 role=sagemaker_execution_role,
 name=name_from_base("fs-workshop-xgboost-model"),
 sagemaker_session=sagemaker_session,
)


## Create and deploy an inference pipline
As shown in the following code, we pass the two models as a sequence to a `PipelineModel` and deploy it to an endpoint like any other model.

In [None]:
from sagemaker.pipeline import PipelineModel

instance_type = "ml.m5.2xlarge"
 
model_name = name_from_base("inference-pipeline")
endpoint_name = name_from_base("inference-pipeline-ep")

sm_model = PipelineModel(name=model_name, role=sagemaker_execution_role, models=[fs_lookup_model, xgb_model])

sm_model.deploy(initial_instance_count=1, instance_type=instance_type, endpoint_name=endpoint_name)
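
If the endpoint status needs to be checked from another session, a polling loop along the following lines is one option. This is a sketch, not part of the workshop: `wait_for_endpoint` is a hypothetical helper that takes any client exposing `describe_endpoint` (such as the `sagemaker_client` created in Setup), and the polling interval and retry count are arbitrary assumptions.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, max_polls=40):
    """Poll describe_endpoint until the endpoint leaves the 'Creating' state.

    Returns the final status string, for example 'InService' or 'Failed'.
    """
    for _ in range(max_polls):
        response = client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status != "Creating":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"endpoint {endpoint_name} is still being created")


# Example call in the notebook (not executed here):
# wait_for_endpoint(sagemaker_client, endpoint_name)
```

Taking the client as a parameter keeps the helper easy to exercise with a stub object before pointing it at a real endpoint.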

## Make inference using the inference pipeline

In [None]:
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

cust_id='C50'
prod_id='P2'
test_data= f'{cust_id},{prod_id}'
print(test_data)

predictor = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sagemaker_session,
    serializer=CSVSerializer(),  # sends the request with Content-Type "text/csv"
)
print(predictor.predict(test_data))
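
With the default deserializer, `predict` returns the raw response body. Assuming the endpoint answers with CSV text containing one or more numeric scores, the response can be turned into floats as sketched below. `parse_csv_prediction` is a hypothetical helper and the sample payload is made up for illustration.

```python
def parse_csv_prediction(raw):
    """Convert a CSV response body (bytes or str) into a list of floats."""
    text = raw.decode("utf-8") if isinstance(raw, (bytes, bytearray)) else str(raw)
    # Treat newlines like commas so single-row and multi-row bodies both work.
    values = text.replace("\n", ",").split(",")
    return [float(value) for value in values if value.strip()]


# Made-up payload standing in for predictor.predict(test_data).
print(parse_csv_prediction(b"0.25"))
```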

## Cleanup

In [None]:
endpoint_name = sm_model.endpoint_name
print(endpoint_name)

In [None]:
response = sagemaker_client.describe_endpoint_config(EndpointConfigName=endpoint_name)
model_name = response['ProductionVariants'][0]['ModelName']
model_name

In [None]:
sagemaker_client.delete_model(ModelName=model_name) 
sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
sagemaker_client.delete_endpoint_config(EndpointConfigName=endpoint_name)