# SageMaker model deployment as CI/CD pipeline
This notebook demonstrates how to use a SageMaker project template for CI/CD model deployment. In this notebook, you will:<br/>
1. Load the data for the iris multi-class classification problem<br/>
2. Use a SageMaker built-in estimator [XGBoost](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html) to train the model on the dataset<br/>
3. Create a [SageMaker pipeline](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines.html) to train and register the model<br/>
4. Select the latest model package from the model package group, set its status to `Approved`, and launch the model deployment CI/CD pipeline

## Load packages and get environment configuration 

In [None]:
if False:
    !pip install -U sagemaker

In [None]:
%matplotlib inline
import pandas as pd
import numpy as np
import sagemaker
import json
import boto3
from sagemaker import get_execution_role
import sagemaker.session
from sklearn.model_selection import train_test_split
from sklearn import datasets

sm = boto3.client("sagemaker")
ssm = boto3.client("ssm")

def get_environment(project_name, ssm_params):
    r = sm.describe_domain(
            DomainId=sm.describe_project(
                ProjectName=project_name
                )["CreatedBy"]["DomainId"]
        )
    del r["ResponseMetadata"]
    del r["CreationTime"]
    del r["LastModifiedTime"]
    r = {**r, **r["DefaultUserSettings"]}
    del r["DefaultUserSettings"]

    i = {
        **r,
        **{t["Key"]:t["Value"] 
            for t in sm.list_tags(ResourceArn=r["DomainArn"])["Tags"] 
            if t["Key"] in ["EnvironmentName", "EnvironmentType"]}
    }

    for p in ssm_params:
        try:
            i[p["VariableName"]] = ssm.get_parameter(Name=f"{i['EnvironmentName']}-{i['EnvironmentType']}-{p['ParameterName']}")["Parameter"]["Value"]
        except Exception:
            # Fall back to an empty value if the parameter is missing or not accessible
            i[p["VariableName"]] = ""

    return i

def get_session(region, default_bucket):
    """Gets the sagemaker session based on the region.

    Args:
        region: the aws region to start the session
        default_bucket: the bucket to use for storing the artifacts

    Returns:
        sagemaker.session.Session instance
    """

    boto_session = boto3.Session(region_name=region)

    sagemaker_client = boto_session.client("sagemaker")
    runtime_client = boto_session.client("sagemaker-runtime")
    return sagemaker.session.Session(
        boto_session=boto_session,
        sagemaker_client=sagemaker_client,
        sagemaker_runtime_client=runtime_client,
        default_bucket=default_bucket,
    )

<div class="alert alert-info"> 💡 <strong> Get environment variables </strong>

Set the <b>`project_name`</b> to the name of the current SageMaker project.
Various environment data is loaded and shown:
</div>
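
The `get_environment` helper above lifts the domain's `DefaultUserSettings` to the top level, keeps the `EnvironmentName` and `EnvironmentType` tags, and builds each SSM parameter name as `<EnvironmentName>-<EnvironmentType>-<ParameterName>`. The sketch below replays that logic offline. All ARNs, IDs, and tag values are made up for illustration, and `ssm_parameter_name` is a hypothetical helper, not part of the notebook.

```python
# Made-up stand-in for the describe_domain response used by get_environment.
domain = {
    "DomainArn": "arn:aws:sagemaker:eu-west-1:111122223333:domain/d-example",
    "SubnetIds": ["subnet-aaa", "subnet-bbb"],
    "DefaultUserSettings": {
        "ExecutionRole": "arn:aws:iam::111122223333:role/example-role",
        "SecurityGroups": ["sg-123"],
    },
}

# Made-up domain tags; only two of them describe the environment.
tags = [
    {"Key": "EnvironmentName", "Value": "sm-mlops"},
    {"Key": "EnvironmentType", "Value": "dev"},
    {"Key": "Owner", "Value": "someone"},
]

# Lift the default user settings to the top level, as get_environment does.
env = {k: v for k, v in domain.items() if k != "DefaultUserSettings"}
env.update(domain["DefaultUserSettings"])

# Keep only the tags that identify the environment.
env.update(
    {t["Key"]: t["Value"] for t in tags if t["Key"] in ("EnvironmentName", "EnvironmentType")}
)


def ssm_parameter_name(env, parameter_name):
    """Build the SSM parameter name the same way get_environment does."""
    return f"{env['EnvironmentName']}-{env['EnvironmentType']}-{parameter_name}"


print(ssm_parameter_name(env, "data-bucket-name"))  # sm-mlops-dev-data-bucket-name
```

This is why later cells can read `env_data["ExecutionRole"]`, `env_data["SecurityGroups"]`, and `env_data["SubnetIds"]` directly from the flattened dictionary.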

In [None]:
# Set to the name of your SageMaker project
project_name = "<PROJECT NAME>"

# Dynamically load environmental SSM parameters - provide the list of the variables to load from SSM parameter store
ssm_parameters = [
    {"VariableName":"DataBucketName", "ParameterName":"data-bucket-name"},
    {"VariableName":"ModelBucketName", "ParameterName":"model-bucket-name"},
    {"VariableName":"S3VPCEId", "ParameterName":"s3-vpce-id"},
    {"VariableName":"S3KmsKeyId", "ParameterName":"kms-s3-key-arn"},
    {"VariableName":"EbsKmsKeyArn", "ParameterName":"kms-ebs-key-arn"},
    {"VariableName":"PipelineExecutionRole", "ParameterName":"sm-pipeline-execution-role-arn"},
    {"VariableName":"ModelExecutionRole", "ParameterName":"sm-model-execution-role-name"},
    {"VariableName":"StackSetExecutionRole", "ParameterName":"stackset-execution-role-name"},
    {"VariableName":"StackSetAdministrationRole", "ParameterName":"stackset-administration-role-arn"},
    {"VariableName":"StagingAccountList", "ParameterName":"staging-account-list"},
    {"VariableName":"ProdAccountList", "ParameterName":"production-account-list"},
    {"VariableName":"EnvTypeStagingName", "ParameterName":"env-type-staging-name"},
    {"VariableName":"EnvTypeProdName", "ParameterName":"env-type-prod-name"},
]

env_data = get_environment(project_name=project_name, ssm_params=ssm_parameters)
print(f"Environment data:\n{json.dumps(env_data, indent=2)}")

In [None]:
# Create SageMaker session
sagemaker_session = get_session(boto3.Session().region_name, env_data["DataBucketName"])

region = boto3.Session().region_name
pipeline_role = env_data["PipelineExecutionRole"]
processing_role = env_data["ExecutionRole"]
model_execution_role = env_data["ModelExecutionRole"]
training_role = env_data["ExecutionRole"]
data_bucket = sagemaker_session.default_bucket()
model_bucket = env_data["ModelBucketName"]

print(f"SageMaker version: {sagemaker.__version__}")
print(f"Region: {region}")
print(f"Pipeline execution role: {pipeline_role}")
print(f"Processing role: {processing_role}")
print(f"Training role: {training_role}")
print(f"Model execution role: {model_execution_role}")
print(f"Pipeline data bucket: {data_bucket}")
print(f"Pipeline model bucket: {model_bucket}")


project_id = sm.describe_project(ProjectName=project_name)['ProjectId']
# The model package group name must be the same as specified at project creation time in ModelPackageGroupName parameter
model_package_group_name = f"{project_name}-{project_id}"
print(f"Model package group name: {model_package_group_name}")

assert len(project_name) <= 15, "The project name must not exceed 15 characters"

# Prefix for S3 objects
prefix=f"{project_name}-{project_id}"

## Set up the network configuration
You must provide the network configuration, such as subnet IDs and security group IDs, for the SageMaker training and model registration jobs. The security controls in the IAM policy of the SageMaker execution role prevent any SageMaker job from starting without a VPC attachment.

In [None]:
from sagemaker.network import NetworkConfig

network_config = NetworkConfig(
        enable_network_isolation=False, 
        security_group_ids=env_data["SecurityGroups"],
        subnets=env_data["SubnetIds"],
        encrypt_inter_container_traffic=True)

## Load the dataset

### Load from scikit-learn
Load the [iris dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html) from the `sklearn` module. The iris dataset is a classic and very easy multi-class classification dataset.

In [None]:
iris = datasets.load_iris()
dataset = np.insert(iris.data, 0, iris.target, axis=1)

df = pd.DataFrame(data=dataset, columns=['iris_id'] + iris.feature_names)
df['species'] = df['iris_id'].map(lambda x: 'setosa' if x == 0 else 'versicolor' if x == 1 else 'virginica')

df.head()

### Upload the dataset to an S3 bucket
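
For CSV input, the built-in XGBoost algorithm expects the label in the first column and no header row, which is why the next cell stacks `y` in front of `X` before saving. A minimal sketch of that layout, using two made-up rows and only the standard library:

```python
import csv
import io

# Made-up rows: one label and four features each (0 = setosa, 2 = virginica).
labels = [0, 2]
features = [[5.1, 3.5, 1.4, 0.2], [6.3, 3.3, 6.0, 2.5]]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
for label, row in zip(labels, features):
    # The label goes first; no header row is written.
    writer.writerow([f"{label:.3f}"] + [f"{value:.3f}" for value in row])

csv_text = buffer.getvalue()
print(csv_text)
```

The `%0.3f` format used with `np.savetxt` in the notebook produces the same three-decimal layout.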

In [None]:
X=iris.data
y=iris.target

# Split the dataset into train and test parts
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42, stratify=y)
yX_train = np.column_stack((y_train, X_train))
yX_test = np.column_stack((y_test, X_test))
np.savetxt("iris_train.csv", yX_train, delimiter=",", fmt='%0.3f')
np.savetxt("iris_test.csv", yX_test, delimiter=",", fmt='%0.3f')

# Upload the dataset to an S3 bucket
input_train = sagemaker_session.upload_data(path='iris_train.csv', key_prefix=f'{prefix}/datasets/iris/data')
input_test = sagemaker_session.upload_data(path='iris_test.csv', key_prefix=f'{prefix}/datasets/iris/data')

print(input_train)
print(input_test)

### Create the ML Pipeline

#### Pipeline input parameters

In [None]:
from sagemaker.workflow.parameters import (
    ParameterInteger,
    ParameterString,
)

training_instance_type = ParameterString(
    name="TrainingInstanceType",
    default_value="ml.m5.xlarge"
)
training_instance_count = ParameterInteger(
    name="TrainingInstanceCount",
    default_value=1
)
input_train_data = ParameterString(
    name="InputDataTrain",
    default_value=input_train,
)
input_test_data = ParameterString(
    name="InputDataTest",
    default_value=input_test,
)
model_approval_status = ParameterString(
    name="ModelApprovalStatus", default_value="PendingManualApproval"
)

#### Set up an estimator to run the training process

In [None]:
from sagemaker.estimator import Estimator
import time

base_job_prefix = f"{prefix}/iris-{time.strftime('%Y-%m-%d-%H-%M-%S')}"
model_path = f"s3://{model_bucket}/{base_job_prefix}"

image_uri = sagemaker.image_uris.retrieve(
    framework="xgboost", 
    region=region, 
    version="1.0-1", 
    py_version="py3", 
    instance_type=training_instance_type,
)
xgb_train = Estimator(
    image_uri=image_uri,
    instance_type=training_instance_type,
    instance_count=training_instance_count,
    output_path=model_path,
    base_job_name=f"{base_job_prefix}/train",
    sagemaker_session=sagemaker_session,
    role=training_role,
    subnets=network_config.subnets,
    security_group_ids=network_config.security_group_ids,
    encrypt_inter_container_traffic=True,
    enable_network_isolation=False,
    volume_kms_key=env_data["EbsKmsKeyArn"],
    output_kms_key=env_data["S3KmsKeyId"]
)
xgb_train.set_hyperparameters(
    eta=0.1,
    max_depth=10,
    gamma=4,
    num_class=len(np.unique(y)),
    alpha=10,
    min_child_weight=6,
    silent=0,
    objective='multi:softmax',
    num_round=30
)

### Training step

In [None]:
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.steps import TrainingStep

step_train = TrainingStep(
    name="IrisTrain",
    estimator=xgb_train,
    inputs={
        "train": TrainingInput(s3_data=input_train_data, content_type="text/csv"),
        "validation": TrainingInput(s3_data=input_test_data, content_type="text/csv"),
    },
)

### Model register step

In [None]:
vpc_config = {
    "Subnets":network_config.subnets,
    "SecurityGroupIds":network_config.security_group_ids
}

In [None]:
from sagemaker.workflow.step_collections import RegisterModel

# NOTE: model_approval_status is not available as arg in service dsl currently
step_register = RegisterModel(
    name="IrisRegisterModel",
    estimator=xgb_train,
    model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.t2.medium", "ml.m5.xlarge"],
    transform_instances=["ml.m5.xlarge"],
    model_package_group_name=model_package_group_name,
    approval_status=model_approval_status,
    vpc_config_override=vpc_config
)

### Create a pipeline
For the sake of simplicity, this pipeline is limited to the train and register steps. In a real-life production setting you would typically add data processing, model evaluation, and conditional model registration steps. That extended example is covered by the `MLOps Model Build Train` SageMaker project template.
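
A conditional model register step boils down to comparing an evaluation metric with a threshold before registering. The plain-Python sketch below illustrates only that decision. The function name `should_register`, the accuracy metric, and the threshold values are assumptions for illustration; in a SageMaker pipeline such a check would typically be expressed with a `ConditionStep` rather than local code.

```python
def should_register(y_true, y_pred, min_accuracy=0.9):
    """Return True when accuracy on the held-out set reaches the threshold."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true) >= min_accuracy


# Made-up labels and predictions: 4 of 5 are correct, so accuracy is 0.8.
print(should_register([0, 1, 2, 1, 0], [0, 1, 2, 2, 0], min_accuracy=0.75))  # True
print(should_register([0, 1, 2, 1, 0], [0, 1, 2, 2, 0], min_accuracy=0.9))   # False
```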

In [None]:
from botocore.exceptions import ClientError, ValidationError
from sagemaker.workflow.pipeline import Pipeline


pipeline_name = f"{prefix}-IrisPipeline"

pipeline = Pipeline(
    name=pipeline_name,
    parameters=[
        training_instance_type,
        training_instance_count,        
        input_train_data,
        model_approval_status,
        input_test_data
    ],
    steps=[step_train, step_register],
    sagemaker_session=sagemaker_session,
)

response = pipeline.upsert(role_arn=pipeline_role)

pipeline_arn = response["PipelineArn"]
sm.add_tags(
    ResourceArn=pipeline_arn,
    Tags=[
        {'Key': 'sagemaker:project-name', 'Value': project_name },
        {'Key': 'sagemaker:project-id', 'Value': project_id },
        {'Key': 'EnvironmentName', 'Value': env_data['EnvironmentName'] },
        {'Key': 'EnvironmentType', 'Value': env_data['EnvironmentType'] },
    ]
)

print(response)

### Execute the pipeline

In [None]:
execution = pipeline.start()

In [None]:
execution.describe()

### Wait until the pipeline completes

In [None]:
execution.wait()

### Finally, approve the model to launch the model deployment process

In [None]:
# list all model packages and select the latest one
model_packages = []

for p in sm.get_paginator('list_model_packages').paginate(
        ModelPackageGroupName=model_package_group_name,
        SortBy="CreationTime",
        SortOrder="Descending",
    ):
    model_packages.extend(p["ModelPackageSummaryList"])

if len(model_packages) == 0:
    raise Exception(f"No model package is found for {model_package_group_name} model package group")
    
latest_model_package_arn = model_packages[0]["ModelPackageArn"]
print(latest_model_package_arn)

The following statement sets the `ModelApprovalStatus` of the model package to `Approved`. The state change of the model package triggers an EventBridge rule, which in turn starts the CodePipeline CI/CD pipeline that deploys the model.
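
To make the trigger more concrete, the sketch below shows what an EventBridge event pattern for model package state changes could look like. The rule itself is created by the project template, so the exact pattern is an assumption here, the group name is made up, and `matches` is a simplified, hypothetical matcher that covers only this kind of flat pattern.

```python
import json

# Hypothetical model package group name, for illustration only.
group_name = "my-project-p-abc123"

# Assumed shape of the event pattern; the real rule is created by the project template.
event_pattern = {
    "source": ["aws.sagemaker"],
    "detail-type": ["SageMaker Model Package State Change"],
    "detail": {"ModelPackageGroupName": [group_name]},
}


def matches(pattern, event):
    """Simplified matcher: every field in the pattern must hold one of the allowed values."""
    for key, allowed in pattern.items():
        if isinstance(allowed, dict):
            if not matches(allowed, event.get(key, {})):
                return False
        elif event.get(key) not in allowed:
            return False
    return True


# Made-up event resembling the one emitted after the approval status changes.
event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Model Package State Change",
    "detail": {"ModelPackageGroupName": group_name, "ModelApprovalStatus": "Approved"},
}

print(json.dumps(event_pattern, indent=2))
print(matches(event_pattern, event))  # True
```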

In [None]:
model_package_update_response = sm.update_model_package(
    ModelPackageArn=latest_model_package_arn,
    ModelApprovalStatus="Approved",
)

The model deployment CI/CD pipeline performs the following actions:<br/>
1. Create a SageMaker endpoint in the staging account (or a `*-staging` endpoint in the current account in the case of a single-account deployment)<br/>
2. Run the test script on the staging endpoint<br/>
3. Wait until the test result is manually approved in [AWS CodePipeline console](https://console.aws.amazon.com/codesuite/codepipeline)<br/>
4. Create a SageMaker endpoint in the production account (or `*-prod` endpoint in the current account in case of single-account deployment)<br/>

After successful completion of the CI/CD pipeline, you will see two endpoints in status `InService` in SageMaker Studio Components->Endpoints widget.
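
Step 2 of the list above runs a test script against the staging endpoint. A minimal sketch of what such a check might do is shown below: it sends one CSV row through `invoke_endpoint` and parses the predicted class. The helper name `predict_class` and the endpoint name are hypothetical, and `FakeRuntime` stands in for `boto3.client("sagemaker-runtime")` so that the sketch runs without AWS access. The actual test script is provided by the project template.

```python
import io


def predict_class(runtime_client, endpoint_name, features):
    """Send one CSV row to the endpoint and return the predicted class index."""
    payload = ",".join(f"{value:.3f}" for value in features)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    # With the multi:softmax objective the model returns the class index as text.
    return int(float(response["Body"].read().decode("utf-8").strip()))


class FakeRuntime:
    """Offline stand-in for the sagemaker-runtime client; always predicts class 2."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        return {"Body": io.BytesIO(b"2.0")}


# Hypothetical endpoint name and a made-up feature row.
print(predict_class(FakeRuntime(), "my-project-staging", [6.3, 3.3, 6.0, 2.5]))  # 2
```

To call a real endpoint, pass `boto3.client("sagemaker-runtime")` instead of `FakeRuntime()`.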

### CI/CD pipeline execution
You can follow the execution of the model deployment pipeline, including its stages and actions:
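
The response of `get_pipeline_state` nests action states inside stage states. The hypothetical helper below, `get_action_status`, pulls the latest status of a single action out of that structure. The sample response is trimmed and made up; only the keys used elsewhere in this notebook are assumed.

```python
def get_action_status(pipeline_state, stage_name, action_name):
    """Return the latest execution status of an action, or None if it has not run yet."""
    for stage in pipeline_state.get("stageStates", []):
        if stage.get("stageName") != stage_name:
            continue
        for action in stage.get("actionStates", []):
            if action.get("actionName") == action_name:
                return action.get("latestExecution", {}).get("status")
    return None


# Trimmed, made-up response in the shape returned by get_pipeline_state.
state = {
    "stageStates": [
        {
            "stageName": "DeployModelStaging",
            "actionStates": [
                {"actionName": "ApproveStagingDeployment",
                 "latestExecution": {"status": "InProgress"}},
            ],
        },
        {
            "stageName": "DeployModelProd",
            "actionStates": [{"actionName": "DeployProd"}],
        },
    ]
}

print(get_action_status(state, "DeployModelStaging", "ApproveStagingDeployment"))  # InProgress
print(get_action_status(state, "DeployModelProd", "DeployProd"))  # None
```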

In [None]:
cp = boto3.client("codepipeline")

code_pipeline_name = f"sagemaker-{project_name}-{project_id}-modeldeploy"

r = cp.get_pipeline_state(name=code_pipeline_name)

r

Wait about 15 minutes until the staging endpoint is deployed and the pipeline stops at the manual approval stage:

In [None]:
import time
from IPython.display import display, HTML

print(f"waiting till the pipeline stops at the manual approval stage")

while len([a for a in [s for s in cp.get_pipeline_state(
    name=code_pipeline_name
    )["stageStates"] if s["stageName"] == "DeployModelStaging"][0]["actionStates"]
           if a["actionName"]=="ApproveStagingDeployment" and a.get("latestExecution") and a.get("latestExecution")["status"]=="InProgress"])==0:
    print("waiting...")
    time.sleep(20)

print(f"staging deployment completed.")

display(
    HTML(
        '<b>Please approve the manual step in <a target="top" href="https://console.aws.amazon.com/codesuite/codepipeline/pipelines/{}/view?region={}">AWS CodePipeline</a></b>'.format(
            code_pipeline_name, region)
    )
)

Click the link above to approve the production deployment.

After completion of the previous code snippet, you will have a staging endpoint deployed in the SageMaker environment in the staging account or in the same account in the case of single-account setup. Approve the production deployment by clicking on the provided link above and approving the CodePipeline manual approval stage.<br/>
The model deployment pipeline continues and deploys the production endpoint into the production account.
You can check the status and details of the SageMaker endpoint in the `Component and registries`->`Endpoints` widget:

![endpoints](img/endpoints.png)

Keep in mind that the deployed staging and production SageMaker endpoints are visible in Studio only in the case of a single-account deployment. If you deploy the model to separate staging and production accounts, you have to log in to the AWS console of the corresponding account.

### Production deployment
Wait another 15 minutes until the model has been deployed to production.

In [None]:
print(f"waiting for production endpoint deployment")

while len([a for a in [s for s in cp.get_pipeline_state(
    name=code_pipeline_name
    )["stageStates"] if s["stageName"] == "DeployModelProd"][0]["actionStates"]
           if a["actionName"]=="DeployProd" and a.get("latestExecution") and a.get("latestExecution")["status"]=="Succeeded"])==0:
    print("waiting...")
    time.sleep(20)

print(f"production deployment completed.")


## Clean up
After you have finished testing and experimenting with model deployment, you should clean up the provisioned resources to avoid charges for the SageMaker inference instances.<br/>
The code in this section deletes the SageMaker staging and production endpoints. The corresponding CloudFormation stack set instances and stack sets are also deleted.

### Delete CloudFormation stack sets
This deletes the provisioned SageMaker endpoints and associated resources.
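
The cell below waits a fixed 180 seconds between deleting the stack instances and deleting the stack set. As an alternative, you could poll the operation returned by `delete_stack_instances` until it finishes. The sketch below assumes the `OperationId` from that response and uses `describe_stack_set_operation`. The helper name, the polling defaults, and `FakeCloudFormation` (an offline stand-in for `boto3.client("cloudformation")`) are hypothetical.

```python
import time


def wait_for_stack_set_operation(cf_client, stack_set_name, operation_id,
                                 poll_seconds=15, max_polls=40, sleep=time.sleep):
    """Poll a stack set operation until it reaches a terminal status."""
    for _ in range(max_polls):
        status = cf_client.describe_stack_set_operation(
            StackSetName=stack_set_name, OperationId=operation_id
        )["StackSetOperation"]["Status"]
        # QUEUED, RUNNING, and STOPPING mean the operation is still in flight.
        if status not in ("QUEUED", "RUNNING", "STOPPING"):
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Operation {operation_id} did not finish in time")


class FakeCloudFormation:
    """Offline stand-in that replays a fixed sequence of operation statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def describe_stack_set_operation(self, StackSetName, OperationId):
        return {"StackSetOperation": {"Status": next(self._statuses)}}


fake = FakeCloudFormation(["RUNNING", "RUNNING", "SUCCEEDED"])
# The sleep is replaced with a no-op so the sketch finishes immediately.
print(wait_for_stack_set_operation(fake, "example-stack-set", "op-1", sleep=lambda s: None))
```

With a real client, the call would take the place of `time.sleep(180)`, using `r["OperationId"]` from the `delete_stack_instances` response.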

In [None]:
import time

cf = boto3.client("cloudformation")

for ss in [
        f"sagemaker-{project_name}-{project_id}-deploy-{env_data['EnvTypeStagingName']}",
        f"sagemaker-{project_name}-{project_id}-deploy-{env_data['EnvTypeProdName']}"
        ]:
    accounts = [a["Account"] for a in cf.list_stack_instances(StackSetName=ss)["Summaries"]]
    print(f"delete stack set instances for {ss} stack set for the accounts {accounts}")
    r = cf.delete_stack_instances(
        StackSetName=ss,
        Accounts=accounts,
        Regions=[boto3.session.Session().region_name],
        RetainStacks=False,
    )
    print(r)

    time.sleep(180)

    print(f"delete stack set {ss}")
    r = cf.delete_stack_set(
        StackSetName=ss
    )

### Delete SageMaker project
This deletes the associated CloudFormation stack and CodeCommit repository.

In [None]:
print(f"Deleting project {project_name}:{sm.delete_project(ProjectName=project_name)}")

### Delete project S3 bucket
This removes all files and the S3 bucket itself.

In [None]:
!aws s3 rb s3://sm-mlops-cp-{project_name}-{project_id} --force

## Release resources

In [None]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}    
</script>