# End-to-End Example: Managing the Lifecycle of ML Models Deployed on the Edge Using SageMaker Edge Manager

**SageMaker Studio Kernel**: Data Science


## Contents 

* Use Case
* Workflow
* Setup
* Building and Deploying the ML Model
* Running the fleet of Virtual Wind Turbines and Edge Devices
* Cleanup


## Use Case

The challenge we're trying to address here is to detect anomalies in the components of a wind turbine. Each wind turbine has many sensors that read data such as:
 - Internal & external temperature
 - Wind speed
 - Rotor speed
 - Air pressure
 - Voltage (or current) in the generator
 - Vibration in the GearBox (using an IMU -> Accelerometer + Gyroscope)

So, depending on the types of the anomalies we want to detect, we need to select one or more features and then prepare a dataset that 'explains' the anomalies. We are interested in three types of anomalies:
 - Rotor speed (when the rotor is not turning at the expected speed)
 - Produced voltage (when the generator is not producing the expected voltage)
 - Gearbox vibration (when the vibration of the gearbox is far from the expected level)
 
All three anomalies (or violations) depend on many variables while the turbine is working. To address that, let's use an ML model called an [Autoencoder](https://en.wikipedia.org/wiki/Autoencoder), trained on correlated features. This model is unsupervised. It learns a latent representation of the dataset and tries to predict (regression) the same tensor it receives as input. The strategy, then, is to use a dataset collected from a normal turbine (without anomalies), so that the model learns **'what is a normal turbine'**. When the sensor readings of a malfunctioning turbine are used as input, the model will not be able to rebuild the input: the prediction will have a high error, and that high error is what flags the reading as an anomaly.
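
The reconstruction-error idea can be illustrated with a small, self-contained sketch. It does not use the workshop's model: a linear projection (PCA computed with SVD) stands in for the autoencoder, and the data, feature meanings, and the "mean + 3 standard deviations" threshold are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" readings: 3 correlated features driven by one hidden factor
# (hypothetical stand-ins for wind speed, rotor speed and voltage).
wind = rng.uniform(5.0, 15.0, size=(1000, 1))
normal = np.hstack([wind, 2.0 * wind, 0.5 * wind]) + rng.normal(0.0, 0.05, size=(1000, 3))

# "Train" on normal data only: learn a 1-dimensional latent space.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:1]  # shape (1, 3)

def reconstruction_error(x):
    """Encode to the latent space, decode back, return the mean absolute error per row."""
    latent = (x - mean) @ components.T
    rebuilt = latent @ components + mean
    return np.abs(x - rebuilt).mean(axis=1)

# Threshold derived from the errors observed on normal data (an assumed rule of thumb).
errors = reconstruction_error(normal)
threshold = errors.mean() + 3.0 * errors.std()

healthy = np.array([[10.0, 20.0, 5.0]])  # consistent with the learned relationship
faulty = np.array([[10.0, 20.0, 1.0]])   # voltage far below what the wind speed implies

print('healthy flagged:', reconstruction_error(healthy)[0] > threshold)
print('faulty flagged:', reconstruction_error(faulty)[0] > threshold)
```

The faulty sample breaks the correlation between the features, so it cannot be rebuilt from the latent space and its error exceeds the threshold. A real autoencoder works the same way but can capture non-linear relationships.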


## Workflow


In this example, you will create a robust end-to-end solution that manages the lifecycle of ML models deployed to a wind turbine fleet to detect the anomalies in the operation using SageMaker Edge Manager.

 - Prepare a ML model
     - download a pre-trained model;
     - compile the ML model with SageMaker Neo for Linux x86_64;
     - create a deployment package using SageMaker Edge Manager;
     - download/unpack the deployment package;
 - Download/unpack a package with the IoT certificates, required by the agent; 
 - Download/unpack **SageMaker Edge Agent** for Linux x86_64;
 - Generate the protobuf/grpc stubs (.py scripts) - with these files we will send requests via unix:// sockets to the agent; 
 - Using some helper functions, we're going to interact with the agent and do some tests.

The following diagram shows the resources required to run this experiment. It also helps explain how the agent works and how to interact with it.  
![Pipeline](../imgs/EdgeManagerWorkshop_MinimalistArchitecture.png)

## Step 1 - Setup 

### Installing some required libraries

In [None]:
!apt-get -y update && apt-get -y install build-essential procps
!pip install --quiet -U numpy sysv_ipc boto3 grpcio-tools grpcio protobuf sagemaker
!pip install --quiet -U matplotlib==3.4.1 seaborn==0.11.1
!pip install --quiet -U grpcio-tools grpcio protobuf
!pip install --quiet paho-mqtt
!pip install --quiet ipywidgets

In [None]:
import boto3
import tarfile
import os
import stat
import io
import time
import sagemaker
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
import numpy as np
import glob

### Let's take a look at the dataset and its features
Download the dataset 

In [None]:
%matplotlib inline
%config InlineBackend.figure_format='retina'

!mkdir -p data
!curl https://aws-ml-blog.s3.amazonaws.com/artifacts/monitor-manage-anomaly-detection-model-wind-turbine-fleet-sagemaker-neo/dataset_wind_turbine.csv.gz -o data/dataset_wind.csv.gz
    
parser = lambda date: datetime.strptime(date, '%Y-%m-%dT%H:%M:%S.%f+00:00')
df = pd.read_csv('data/dataset_wind.csv.gz', compression="gzip", sep=',', low_memory=False, parse_dates=[ 'eventTime'], date_parser=parser)

df.head()
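
As a quick check of the timestamp format used by the parser above, the sketch below applies the same format string to a single value. The sample timestamp is made up for illustration. Note that the trailing `+00:00` is matched as literal text, so the resulting `datetime` carries no timezone information.

```python
from datetime import datetime

# Sample value in the format the parser expects (made up for illustration).
sample = '2021-03-15T10:20:30.123456+00:00'
parsed = datetime.strptime(sample, '%Y-%m-%dT%H:%M:%S.%f+00:00')

print(parsed)
print('timezone-aware:', parsed.tzinfo is not None)
```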

Features:
  - **nanoId**: id of the edge device that collected the data
  - **turbineId**: id of the turbine that produced this data
  - **arduino_timestamp**: timestamp of the arduino that was operating this turbine
  - **nanoFreemem**: amount of free memory in bytes
  - **eventTime**: timestamp of the row
  - **rps**: rotation of the rotor in Rotations Per Second
  - **voltage**: voltage produced by the generator in millivolts
  - **qw, qx, qy, qz**: quaternion angular acceleration
  - **gx, gy, gz**: gravity acceleration
  - **ax, ay, az**: linear acceleration
  - **gearboxtemp**: internal temperature
  - **ambtemp**: external temperature
  - **humidity**: air humidity
  - **pressure**: air pressure
  - **gas**: air quality
  - **wind_speed_rps**: wind speed in Rotations Per Second

## Step 2 - Deploying the pre-built ML Model


In this section you will:

 - Compile/optimize your pre-trained model for your edge device (Linux x86_64) using [SageMaker Neo](https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html)
 - Create a deployment package with a signed model + the runtime used by SageMaker Edge Agent to load and invoke the optimized model
 - Deploy the package using IoT Jobs


In [None]:
project_name='wind-turbine-farm'

s3_client = boto3.client('s3')
sm_client = boto3.client('sagemaker')

project_id = sm_client.describe_project(ProjectName=project_name)['ProjectId']
bucket_name = 'sagemaker-wind-turbine-farm-%s' % project_id

prefix='wind_turbine_anomaly'
sagemaker_session=sagemaker.Session(default_bucket=bucket_name)
role = sagemaker.get_execution_role()
print('Project name: %s' % project_name)
print('Project id: %s' % project_id)
print('Bucket name: %s' % bucket_name)

## Compiling/Packaging/Deploying our ML model to our edge devices

Here you will invoke SageMaker Neo to compile the pre-trained model. To learn how this model was trained, please refer to the training notebook [here](https://github.com/aws-samples/amazon-sagemaker-edge-manager-workshop/tree/main/lab/02-Training). 

First, upload the pre-trained model to the S3 bucket.

In [None]:
# Use a context manager so the file handle is closed after the upload
with open("model/model.tar.gz", "rb") as model_file:
    boto3.Session().resource("s3").Bucket(bucket_name).Object('model/model.tar.gz').upload_fileobj(model_file)
print("Model successfully uploaded!")

The next cell compiles the model for the target hardware and OS with the SageMaker Neo service. The compiled model package also includes the [deep learning runtime](https://github.com/neo-ai/neo-ai-dlr).

In [None]:
compilation_job_name = 'wind-turbine-anomaly-%d' % int(time.time()*1000)
sm_client.create_compilation_job(
    CompilationJobName=compilation_job_name,
    RoleArn=role,
    InputConfig={
        'S3Uri': 's3://%s/model/model.tar.gz' % sagemaker_session.default_bucket(),
        'DataInputConfig': '{"input0":[1,6,10,10]}',
        'Framework': 'PYTORCH'
    },
    OutputConfig={
        'S3OutputLocation': 's3://%s/wind_turbine/optimized/' % sagemaker_session.default_bucket(),        
        'TargetPlatform': { 'Os': 'LINUX', 'Arch': 'X86_64' }
    },
    StoppingCondition={ 'MaxRuntimeInSeconds': 900 }
)
while True:
    resp = sm_client.describe_compilation_job(CompilationJobName=compilation_job_name)    
    if resp['CompilationJobStatus'] in ['STARTING', 'INPROGRESS']:
        print('Running...')
    else:
        print(resp['CompilationJobStatus'], compilation_job_name)
        break
    time.sleep(5)
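
The `DataInputConfig` above declares an input tensor of shape `[1,6,10,10]`. The sketch below shows one way a window of sensor readings could be arranged into that shape. It assumes that the second dimension holds 6 features and that the two trailing dimensions hold 100 consecutive time steps folded into a 10x10 grid; the notebook itself does not state this, so treat the layout as a hypothesis and check the training notebook for the actual preprocessing.

```python
import numpy as np

N_FEATURES = 6   # matches the second dimension of DataInputConfig
N_STEPS = 100    # assumption: 10 x 10 = 100 consecutive readings per window

# Hypothetical window of readings: one row per time step, one column per feature.
window = np.arange(N_STEPS * N_FEATURES, dtype=np.float32).reshape(N_STEPS, N_FEATURES)

# Put the features first, then fold the 100 time steps of each feature into a 10x10 grid.
tensor = window.T.reshape(1, N_FEATURES, 10, 10)

print(tensor.shape)
```

With this layout, `tensor[0, f, i, j]` holds feature `f` at time step `i * 10 + j` of the window.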

### Building the Deployment Package with SageMaker Edge Manager
The packaging job will sign the model and create a deployment package with:
 - The optimized model
 - Model Metadata

In [None]:
import time
model_version = '1.0'
model_name = 'WindTurbineAnomalyDetection'
edge_packaging_job_name='wind-turbine-anomaly-%d' % int(time.time()*1000)
resp = sm_client.create_edge_packaging_job(
    EdgePackagingJobName=edge_packaging_job_name,
    CompilationJobName=compilation_job_name,
    ModelName=model_name,
    ModelVersion=model_version,
    RoleArn=role,
    OutputConfig={
        'S3OutputLocation': 's3://%s/%s/model/' % (bucket_name, prefix)
    }
)
while True:
    resp = sm_client.describe_edge_packaging_job(EdgePackagingJobName=edge_packaging_job_name)    
    if resp['EdgePackagingJobStatus'] in ['STARTING', 'INPROGRESS']:
        print('Running...')
    else:
        print(resp['EdgePackagingJobStatus'], edge_packaging_job_name)
        break
    time.sleep(5)
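
The compilation and packaging cells both poll a `describe_*` call until the job leaves the `STARTING`/`INPROGRESS` states. A minimal sketch of how that loop could be factored into a helper with a timeout follows. The helper name `wait_for_job` and its parameters are hypothetical, and a fake describe function replaces the AWS call so the sketch runs anywhere.

```python
import time

def wait_for_job(describe, status_key, poll_seconds=5, timeout_seconds=900):
    """Poll `describe()` until the job leaves STARTING/INPROGRESS or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = describe()[status_key]
        if status not in ('STARTING', 'INPROGRESS'):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError('job still %s after %d seconds' % (status, timeout_seconds))
        time.sleep(poll_seconds)

# Stand-in for a describe_* call so the sketch runs without AWS access.
_responses = iter(['STARTING', 'INPROGRESS', 'COMPLETED'])

def fake_describe():
    return {'EdgePackagingJobStatus': next(_responses)}

final_status = wait_for_job(fake_describe, 'EdgePackagingJobStatus', poll_seconds=0)
print(final_status)
```

With the real client, the first argument could be a lambda such as `lambda: sm_client.describe_edge_packaging_job(EdgePackagingJobName=edge_packaging_job_name)`. The timeout keeps a notebook cell from spinning forever if a job gets stuck.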

### Deploy the package
Using IoT Jobs, we will notify the Python application in the edge devices. The application will:
 - Download the deployment package
 - Unpack it
 - Load the new model (unloading previous versions, if any)

In [None]:
import boto3
import json
import sagemaker
import uuid

iot_client = boto3.client('iot')
sts_client = boto3.client('sts')

model_version = '1.0'
model_name = 'WindTurbineAnomalyDetection'
sagemaker_session=sagemaker.Session()
region_name = sagemaker_session.boto_session.region_name
account_id = sts_client.get_caller_identity()["Account"]

In [None]:
resp = iot_client.create_job(
    jobId=str(uuid.uuid4()),
    targets=[
        'arn:aws:iot:%s:%s:thinggroup/WindTurbineFarm-%s' % (region_name, account_id, project_id),        
    ],
    document=json.dumps({
        'type': 'new_model',
        'model_version': model_version,
        'model_name': model_name,
        'model_package_bucket': bucket_name,
        'model_package_key': "%s/model/%s-%s.tar.gz" % (prefix, model_name, model_version)        
    }),
    targetSelection='SNAPSHOT'
)

Alright! Now, the deployment process will start on the connected edge devices!
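
On the device side, the application has to turn the job document into a download location. The sketch below shows one way to validate such a document and derive the S3 URI of the package. It is not the code in `app/ota.py`: the function name `parse_new_model_job` and the bucket name are hypothetical, and only the field names are taken from the job document created above.

```python
import json

REQUIRED_FIELDS = ('type', 'model_version', 'model_name',
                   'model_package_bucket', 'model_package_key')

def parse_new_model_job(document):
    """Validate a job document (JSON string) and return where the package lives."""
    doc = json.loads(document)
    missing = [field for field in REQUIRED_FIELDS if field not in doc]
    if missing:
        raise ValueError('job document is missing: %s' % ', '.join(missing))
    if doc['type'] != 'new_model':
        raise ValueError('unsupported job type: %s' % doc['type'])
    return {
        'name': doc['model_name'],
        'version': doc['model_version'],
        's3_uri': 's3://%s/%s' % (doc['model_package_bucket'], doc['model_package_key']),
    }

# Example document with the same fields as the IoT job (the bucket name is made up).
example = json.dumps({
    'type': 'new_model',
    'model_version': '1.0',
    'model_name': 'WindTurbineAnomalyDetection',
    'model_package_bucket': 'example-bucket',
    'model_package_key': 'wind_turbine_anomaly/model/WindTurbineAnomalyDetection-1.0.tar.gz',
})
info = parse_new_model_job(example)
print(info['s3_uri'])
```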

## Step 3 - Running the fleet of Virtual Wind Turbines and Edge Devices

In this section you will run a local application written in Python3 that simulates 5 Wind Turbines and 5 edge devices. The SageMaker Edge Agent is deployed on the edge devices.

Here you'll be the **Wind Turbine Farm Operator**. It's possible to visualize the data flowing from the sensors to the ML Model and analyze the anomalies. Also, you'll be able to inject noise (pressing some buttons) in the data to simulate potential anomalies with the equipment.

<table border="0" cellpading="0">
    <tr>
        <td align="center"><b>ARCHITECTURE</b></td>
        <td align="center"><b>PYTHON CLASS STRUCTURE in DEMO</b></td>
    </tr>
    <tr>
        <td><img src="../imgs/EdgeManagerWorkshop_Macro.png" width="500px"></img></td>
        <td><img src="../imgs/EdgeManagerWorkshop_App.png"  width="500px"></img></td>
    </tr>
</table>  

The components of the application are:
  - Simulator:
      - [Simulator](app/simulator.py): Program that launches the virtual wind turbines and the edge devices. It uses Python threads to run all 10 processes
      - [Wind Farm](app/windfarm.py): This is the application that runs on the edge device. It is responsible for reading the sensors, invoking the ML model and analyzing the anomalies  
  - Edge Application:
      - [Turbine](app/turbine.py): Virtual Wind Turbine. It reads the raw data collected from the 3D-printed mini turbine and streams it as a circular buffer. It also has a graphical representation in **IPython Widgets** that is rendered by the Simulator/Dashboard.
      - [Over The Air](app/ota.py): This is a module integrated with **IoT Jobs**. In the previous exercise you created an IoT job to deploy the model. This module gets the job document, processes it, deploys the model to each edge device, and loads it via SageMaker Edge Manager.
      - [Edge client](app/edgeagentclient.py): An abstraction layer on top of the **generated stubs** (proto compilation). It makes it easy to integrate **Wind Farm** with the SageMaker Edge Agent
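
To make "streams it as a circular buffer" concrete, here is a minimal sketch of replaying a finite recording forever in fixed-size batches. It is not the code in `app/turbine.py`; the function name `stream_circular` and the sample values are hypothetical.

```python
import itertools

def stream_circular(rows, batch_size):
    """Yield batches of `batch_size` rows forever, wrapping around at the end of `rows`."""
    source = itertools.cycle(rows)
    while True:
        yield list(itertools.islice(source, batch_size))

# Five recorded readings, replayed three at a time.
readings = [10, 11, 12, 13, 14]
stream = stream_circular(readings, batch_size=3)

first = next(stream)
second = next(stream)
print(first, second)
```

The second batch wraps around to the start of the recording, which is what lets a short dataset drive a simulation that runs indefinitely.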

In [None]:
agent_config_package_prefix = 'wind_turbine_agent/config.tgz'
agent_version = '1.20210512.96da6cc'
agent_pkg_bucket = 'sagemaker-edge-release-store-us-west-2-linux-x64'

### Prepare the edge devices
 1. First download the deployment package that contains the IoT + CA certificates and the configuration file of the SageMaker Edge Agent. 
 2. Then, download the SageMaker Edge Manager package and complete the deployment process.
 
 > You can see all the artifacts that will be loaded/executed by the virtual Edge Device in **agent/**

In [None]:
if not os.path.isdir('agent'):
    s3_client = boto3.client('s3')

    # Get the configuration package with certificates and config files
    with io.BytesIO() as file:
        s3_client.download_fileobj(bucket_name, agent_config_package_prefix, file)
        file.seek(0)
        # Extract the files
        tar = tarfile.open(fileobj=file)
        tar.extractall('.')
        tar.close()    

    # Download and install SageMaker Edge Manager
    agent_pkg_key = 'Releases/%s/%s.tgz' % (agent_version, agent_version)
    # get the agent package
    with io.BytesIO() as file:
        s3_client.download_fileobj(agent_pkg_bucket, agent_pkg_key, file)
        file.seek(0)
        # Extract the files
        tar = tarfile.open(fileobj=file)
        tar.extractall('agent')
        tar.close()
        # Adjust the permissions
        os.chmod('agent/bin/sagemaker_edge_agent_binary', stat.S_IXUSR|stat.S_IWUSR|stat.S_IXGRP|stat.S_IWGRP)
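
The cell above uses the same pattern twice: download an archive into an in-memory buffer, rewind it, and extract it. The sketch below reproduces that pattern without AWS access. An archive built in memory stands in for the bytes downloaded from S3, the file name inside it is made up, and `tarfile.open` is used as a context manager so the archive is closed even if extraction fails.

```python
import io
import os
import tarfile
import tempfile

# Build a small .tgz in memory; it stands in for the bytes downloaded from S3.
payload = b'{"example": true}'
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode='w:gz') as tar:
    member = tarfile.TarInfo(name='conf/config_example.json')
    member.size = len(payload)
    tar.addfile(member, io.BytesIO(payload))

# Rewind the buffer and extract it, as the cell above does after download_fileobj.
archive.seek(0)
target_dir = tempfile.mkdtemp()
with tarfile.open(fileobj=archive) as tar:
    tar.extractall(target_dir)

extracted = os.path.join(target_dir, 'conf', 'config_example.json')
print(os.path.isfile(extracted))
```

Forgetting the `seek(0)` is the usual failure mode here: the buffer position sits at the end after the download, so `tarfile` finds nothing to read.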

### Finally, create the SageMaker Edge Agent client stubs, using the protobuf compiler

SageMaker Edge Manager exposes a [gRPC API](https://grpc.io/docs/what-is-grpc/introduction/) to processes running on the device. To use this API in the language of your choice, use the protobuf file `agent.proto` (the definition file for the gRPC interface) to generate stubs in that language. This example is written in Python, so the cell below generates the Python gRPC stubs for the Edge Agent.

In [None]:
!python3 -m grpc_tools.protoc --proto_path=agent/docs/api --python_out=app/ --grpc_python_out=app/ agent/docs/api/agent.proto

### SageMaker Edge Agent - local directory structure
```
agent
└───certificates
│   └───root
│   │       <<aws_region>>.pem # CA certificate used by Edge Manager to sign the model
│   │
│   └───iot
│           edge_device_<<device_id>>_cert.pem # IoT certificate
│           edge_device_<<device_id>>_key.pem # IoT private key
│           edge_device_<<device_id>>_pub.pem # IoT public key
│           ...
│       
└───conf
│       config_edge_device_<<device_id>>.json # Edge Manager config file
│       ...
│
└───model    
│   └───<<device_id>>
│       └───<<model_name>>
│           └───<<model_version>> # Artifacts from the Edge Manager model package
│                   sagemaker_edge_manifest
│                   ...
│
└───logs
│       agent<<device_id>>.log # Logs collected by the local application
│       ...
app
    agent_pb2_grpc.py # grpc stubs generated by protoc
    agent_pb2.py # agent stubs generated by protoc
    ...
```
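
The layout above follows a regular naming scheme, so the per-device paths can be derived from a device id. The sketch below builds them with `pathlib`. The helper name `device_paths` and the device id `0` are hypothetical; the file name patterns are copied from the tree above.

```python
from pathlib import PurePosixPath

def device_paths(device_id, model_name, model_version, root='agent'):
    """Return the per-device artifact paths implied by the directory tree."""
    base = PurePosixPath(root)
    iot = base / 'certificates' / 'iot'
    return {
        'cert': iot / ('edge_device_%s_cert.pem' % device_id),
        'key': iot / ('edge_device_%s_key.pem' % device_id),
        'config': base / 'conf' / ('config_edge_device_%s.json' % device_id),
        'model': base / 'model' / str(device_id) / model_name / model_version,
        'log': base / 'logs' / ('agent%s.log' % device_id),
    }

paths = device_paths(0, 'WindTurbineAnomalyDetection', '1.0')
for name, path in paths.items():
    print(name, path)
```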

##  Simulating The Wind Turbine Farm
Now it's time to run our simulator and start playing with the turbines, the agents and the anomalies.
 > After clicking on **Start**, each turbine will start buffering some data. It takes a few seconds but after completing this process, the application runs in real-time   
 > Try to press some buttons while the simulation is running, to inject noise in the data and see some anomalies  


In [None]:
import sys
sys.path.insert(1, 'app')
import windfarm
import edgeagentclient
import turbine
import simulator
import ota
import boto3
from importlib import reload

reload(simulator)
reload(turbine)
reload(edgeagentclient)
reload(windfarm)
reload(ota)

# If there is an existing simulator running, halt it
try:
    farm.halt()
except:
    pass

iot_client = boto3.client('iot')

mqtt_host=iot_client.describe_endpoint(endpointType='iot:Data-ATS')['endpointAddress']
mqtt_port=8883

!mkdir -p agent/logs && rm -f agent/logs/*
simulator = simulator.WindTurbineFarmSimulator(5)
simulator.start()

farm = windfarm.WindTurbineFarm(simulator, mqtt_host, mqtt_port)
farm.start()

simulator.show()

 > If you want to experiment with the deployment process while the wind farm is running, go back to Step 2 and, in the JSON document used by the IoT Job, replace the variable **model_version** with the string constant '2.0'. Then create a new IoT Job to simulate deploying a new version of the model, and come back to this exercise to see the results.

In [None]:
try:
    farm.halt()
except:
    pass

print("Done")

## Cleanup
Run the next cell only if you have finished exploring/hacking the content of the workshop.  
This code will delete all the resources created so far, including the **SageMaker Project** you've created.

In [None]:
# import boto3
# import time
# from shutil import rmtree

# iot_client = boto3.client('iot')
# sm_client = boto3.client('sagemaker')
# s3_resource = boto3.resource('s3')

# policy_name='WindTurbineFarmPolicy-%s' % project_id
# thing_group_name='WindTurbineFarm-%s' % project_id
# fleet_name='wind-turbine-farm-%s' % project_id

# # Delete all files from the S3 Bucket
# s3_resource.Bucket(bucket_name).objects.all().delete()

# # now deregister the devices from the fleet
# resp = sm_client.list_devices(DeviceFleetName=fleet_name)
# devices = [d['DeviceName'] for d in resp['DeviceSummaries']]
# if len(devices) > 0:
#     sm_client.deregister_devices(DeviceFleetName=fleet_name, DeviceNames=devices)

# # now detach the IoT certificates from their things and from the policy
# for i,cert_arn in enumerate(iot_client.list_targets_for_policy(policyName=policy_name)['targets']):
#     for t in iot_client.list_principal_things(principal=cert_arn)['things']:
#         iot_client.detach_thing_principal(thingName=t, principal=cert_arn)
#     iot_client.detach_policy(policyName=policy_name, target=cert_arn)
#     certificateId = cert_arn.split('/')[-1]

# iot_client.delete_role_alias(roleAlias='SageMakerEdge-%s' % fleet_name)
# iot_client.delete_thing_group(thingGroupName=thing_group_name)

# if os.path.isdir('agent'): rmtree('agent')
# sm_client.delete_project(ProjectName=project_name)

Mission Complete! 