# Solution b.

Create an inference script and call it `inference.py`.

The script defines the `model_fn`, `input_fn`, `predict_fn`, and `output_fn` functions.

Copy the cells below and paste them into [the main notebook](../deployment_hosting.ipynb).

In [None]:
%%writefile inference.py

import os
import pickle

import xgboost
import sagemaker_xgboost_container.encoder as xgb_encoders

# Same as in the training script
def model_fn(model_dir):
    """Load a model. For XGBoost Framework, a default function to load a model is not provided.
    Users should provide customized model_fn() in script.
    Args:
        model_dir: a directory where model is saved.
    Returns:
        An XGBoost Booster.
    """
    model_files = (file for file in os.listdir(model_dir) if os.path.isfile(os.path.join(model_dir, file)))
    model_file = next(model_files)
    model_path = os.path.join(model_dir, model_file)
    try:
        # First try to load the artifact as a pickled Booster
        with open(model_path, 'rb') as f:
            booster = pickle.load(f)
    except Exception as exp_pkl:
        try:
            # Fall back to the native XGBoost model format
            booster = xgboost.Booster()
            booster.load_model(model_path)
        except Exception as exp_xgb:
            raise RuntimeError("Unable to load model: {} {}".format(str(exp_pkl), str(exp_xgb)))
    booster.set_param('nthread', 1)
    return booster


def input_fn(request_body, request_content_type):
    """
    The SageMaker XGBoost model server receives the request data body and the content type,
    and invokes the `input_fn`.
    This input_fn just validates request_content_type, prints a message,
    and converts the CSV payload to a DMatrix.
    """

    print("Hello from the PRE-processing function!!!")

    if request_content_type == "text/csv":
        return xgb_encoders.csv_to_dmatrix(request_body)
    else:
        raise ValueError(
            "Content type {} is not supported.".format(request_content_type)
        )

def predict_fn(input_object, model):
    """
    SageMaker XGBoost model server invokes `predict_fn` on the return value of `input_fn`.
    """
    # Return only the prediction for the first row of the request
    return model.predict(input_object)[0]


def output_fn(prediction, response_content_type):
    """
    After invoking predict_fn, the model server invokes `output_fn`.
    An output_fn that just adds a column to the output and validates response_content_type
    """
    print("Hello from the POST-processing function!!!")

    appended_output = "hello from post-processing function!!!"
    predictions = [prediction, appended_output]

    if response_content_type == "text/csv":
        return ','.join(str(x) for x in predictions)
    else:
        raise ValueError("Content type {} is not supported.".format(response_content_type))
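
To make the role of each function easier to see, here is a minimal local sketch of how the handlers are chained for one request. It is a simplified stand-in, not the actual container code: `StubModel` and `handle_request` are hypothetical names, and the stub `input_fn` parses CSV into plain lists instead of building a DMatrix, so it runs without XGBoost installed.

```python
# Simplified illustration of the handler chain; not the real model server code.

class StubModel:
    """Hypothetical model that returns one fixed score per input row."""

    def predict(self, rows):
        return [0.5 for _ in rows]


def model_fn(model_dir):
    # The real script loads a Booster from model_dir; here we return a stub.
    return StubModel()


def input_fn(request_body, request_content_type):
    if request_content_type != "text/csv":
        raise ValueError("Content type {} is not supported.".format(request_content_type))
    # Parse each CSV line into a list of floats (the real script builds a DMatrix).
    return [[float(v) for v in line.split(",")] for line in request_body.splitlines() if line]


def predict_fn(input_object, model):
    return model.predict(input_object)[0]


def output_fn(prediction, response_content_type):
    if response_content_type != "text/csv":
        raise ValueError("Content type {} is not supported.".format(response_content_type))
    return ",".join(str(x) for x in [prediction, "hello from post-processing function!!!"])


def handle_request(body, content_type, accept, model):
    """Chain the three per-request handlers in the order described above."""
    data = input_fn(body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept)


model = model_fn("/opt/ml/model")  # loaded once, reused for every request
response = handle_request("1.0,2.0,3.0", "text/csv", "text/csv", model)
print(response)
```

Running the chain locally like this is a quick way to check content-type handling before deploying the real script.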
 

Deploy the new model with the inference script:

- Find the S3 location where the model artifact is stored (you can create a tarball and upload it to S3, or reuse a model that was previously trained in SageMaker).
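
If you choose the tarball route, the packaging step could look roughly like the sketch below. The helper names `make_model_tarball` and `upload_tarball` are hypothetical, as are the file name, bucket, and key in the usage comment. The sketch assumes the model file should sit at the root of `model.tar.gz`.

```python
import os
import tarfile


def make_model_tarball(model_path, tarball_path):
    """Package a single model file at the root of a gzip-compressed tar archive."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        tar.add(model_path, arcname=os.path.basename(model_path))
    return tarball_path


def upload_tarball(tarball_path, bucket, key):
    """Upload the archive and return its S3 URI (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the packaging step works without boto3

    boto3.client("s3").upload_file(tarball_path, bucket, key)
    return "s3://{}/{}".format(bucket, key)


# Usage (hypothetical names):
# make_model_tarball("xgboost-model", "model.tar.gz")
# s3_artifact = upload_tarball("model.tar.gz", "my-bucket", "xgboost-churn/model.tar.gz")
```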

#### Finding a previously trained model:

Go to the Experiments tab in Studio again: 
![experiments_s3_artifact.png](../media/experiments_s3_artifact.png)

Choose another trained model, such as the one trained with Framework mode (right-click and choose `Open in trial details`):

![trial_s3_artifact.png](../media/trial_s3_artifact.png)

Click on `Artifacts` and look at the `Output artifacts`:
![trial_uri_s3_artifact.png](../media/trial_uri_s3_artifact.png)

Copy the S3 URI listed under `SageMaker.ModelArtifact`, which is where the model is saved, and paste it into the cell below.

In this example:
```
s3_artifact="s3://sagemaker-studio-us-east-2-/xgboost-churn/output/demo-xgboost-customer-churn-2021-04-13-18-51-56-144/output/model.tar.gz"
```
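
Before deploying, it can help to confirm that the URI you pasted is complete. The following is a minimal sketch using only the standard library; `split_s3_uri` is a hypothetical helper and the example bucket name is made up. A URI with a missing bucket or key is rejected.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an S3 URI into (bucket, key), raising ValueError if either part is missing."""
    parsed = urlparse(uri)
    bucket_name, key = parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not bucket_name or not key:
        raise ValueError("Not a complete S3 URI: {!r}".format(uri))
    return bucket_name, key


# Hypothetical URI, for illustration only
example_bucket, example_key = split_s3_uri(
    "s3://example-bucket/xgboost-churn/output/model.tar.gz"
)
print(example_bucket, example_key)
```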

In [None]:
s3_artifact="s3:///PATH/TO/model.tar.gz"

In [None]:
%store -r docker_image_name
%store -r framework_version

**Deploy it:**

In [None]:
from sagemaker.xgboost.model import XGBoostModel

xgb_inference_model = XGBoostModel(
    entry_point="inference.py",
    model_data=s3_artifact,
    role=role,
    image_uri=docker_image_name,
    framework_version=framework_version,
    py_version="py3",
)

In [None]:
from time import gmtime, strftime

data_capture_prefix = '{}/datacapture'.format(prefix)

# A timestamp suffix keeps the endpoint name unique across runs
endpoint_name = "model-xgboost-customer-churn-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print("EndpointName = {}".format(endpoint_name))
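
SageMaker endpoint names are limited to 63 characters and may contain only alphanumeric characters and hyphens. If you change the base name, a small check like the sketch below can catch an invalid name before deployment; `make_endpoint_name` is a hypothetical helper, not part of the workshop code.

```python
import re
from time import gmtime, strftime

# Alphanumeric characters separated by hyphens; must start and end with an alphanumeric
ENDPOINT_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")
MAX_ENDPOINT_NAME_LENGTH = 63


def make_endpoint_name(base):
    """Append a UTC timestamp to base and validate the resulting endpoint name."""
    name = base + "-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
    if len(name) > MAX_ENDPOINT_NAME_LENGTH or not ENDPOINT_NAME_PATTERN.match(name):
        raise ValueError("Invalid endpoint name: {!r}".format(name))
    return name


print(make_endpoint_name("model-xgboost-customer-churn"))
```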

In [None]:
from sagemaker.model_monitor import DataCaptureConfig

predictor = xgb_inference_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m4.xlarge',
    endpoint_name=endpoint_name,
    # Capture 100% of requests and responses to S3
    data_capture_config=DataCaptureConfig(
        enable_capture=True,
        sampling_percentage=100,
        destination_s3_uri='s3://{}/{}'.format(bucket, data_capture_prefix)
    )
)

In [None]:
## Updating an existing endpoint
# model_name = xgb_inference_model.name

# from sagemaker.predictor import Predictor
# predictor = Predictor(endpoint_name=endpoint_name)
# predictor.update_endpoint(instance_type='ml.m4.xlarge', 
# initial_instance_count=1, 
# model_name=model_name,
# data_capture_config=DataCaptureConfig(
# enable_capture=True,
# sampling_percentage=100,
# destination_s3_uri='s3://{}/{}'.format(bucket, data_capture_prefix)
# )
# )

**Send some requests:**

In [None]:
import boto3

runtime_client = boto3.client("sagemaker-runtime")

In [None]:
with open('/root/amazon-sagemaker-workshop/4-Deployment/RealTime/config/test_sample.csv', 'r') as f:
    for row in f:
        payload = row.rstrip('\n')
        print(f"Sending: {payload}")
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType='text/csv',
            Accept='text/csv',
            Body=payload,
        )

        print(f"\nReceived: {response['Body'].read()}")
        break  # Send only the first row
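
The response body arrives as bytes. Because `output_fn` joins the prediction and the appended message with a comma, the body can be split back into its two parts. A minimal sketch follows; `parse_response` is a hypothetical helper and the sample bytes are made up for illustration, not real endpoint output.

```python
def parse_response(raw_body):
    """Decode the CSV response and split it into (score, appended message)."""
    text = raw_body.decode("utf-8")
    score, message = text.split(",", 1)
    return float(score), message


# Hypothetical response body, shaped like the output of output_fn
sample_body = b"0.0123,hello from post-processing function!!!"
score, message = parse_response(sample_body)
print(score, message)
```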

Go to CloudWatch Logs and check that the messages printed by `input_fn` and `output_fn` appear in the endpoint's log group:

[Link to CloudWatch Logs](https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#logsV2:log-groups$3FlogGroupNameFilter$3D$252Faws$252Fsagemaker$252FEndpoints$252F)
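
As an alternative to the console, the same messages can be searched programmatically. The sketch below assumes the endpoint logs to the log group `/aws/sagemaker/Endpoints/<endpoint name>` and reads only the first page of results; `find_handler_messages` is a hypothetical helper that takes a CloudWatch Logs client as an argument.

```python
def find_handler_messages(logs_client, endpoint_name, phrase="Hello from"):
    """Return log messages in the endpoint's log group that contain the given phrase."""
    response = logs_client.filter_log_events(
        logGroupName="/aws/sagemaker/Endpoints/{}".format(endpoint_name),
        filterPattern='"{}"'.format(phrase),  # a quoted term matches the exact phrase
    )
    return [event["message"] for event in response.get("events", [])]


# Usage (requires boto3 and AWS credentials):
# import boto3
# for line in find_handler_messages(boto3.client("logs"), endpoint_name):
#     print(line)
```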