## Model Preparation for Grafana/InfluxDB demo
In this notebook, we take a basic ResNet-18 PyTorch model and prepare it for compilation with Amazon SageMaker Neo. As a first step, we import the following libraries.

Note that this notebook should be run with the `conda_pytorch_latest_p36` kernel.

In [1]:
import torch
import tarfile
import boto3
import sagemaker
import time
from sagemaker.utils import name_from_base
import urllib.request

[Optional] Check that your PyTorch version is at least 1.6.

In [2]:
print(torch.__version__)

1.7.1
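
The printed version can also be checked programmatically. The sketch below is a minimal illustration, not part of the original notebook: `meets_minimum` is a hypothetical helper that compares the major and minor components numerically, since a plain string comparison would rank `1.10` below `1.6`.

```python
import re

def meets_minimum(version, minimum=(1, 6)):
    """Return True if a version string such as '1.7.1+cu110' is >= minimum."""
    # Read only the leading major.minor numbers; suffixes like '+cu110' are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)

# In the notebook this would be called as meets_minimum(torch.__version__).
print(meets_minimum("1.7.1"))
```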


Download the sample [PyTorch ResNet-18 model](https://pytorch.org/hub/pytorch_vision_resnet/), pretrained on the 1000-class [ImageNet](https://image-net.org/) dataset.

In [3]:
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)

Using cache found in /home/ec2-user/.cache/torch/hub/pytorch_vision_v0.10.0


Download a sample image for testing the model's validity later.

In [4]:
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
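# urllib.URLopener only exists in Python 2; fall back to the Python 3 API when it is missing.
try:
    urllib.URLopener().retrieve(url, filename)
except AttributeError:
    urllib.request.urlretrieve(url, filename)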

The model preparation step differs slightly between frameworks. Refer to [this guide for more information on the pre-processing steps](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-compilation-preparing-model.html) for MXNet, TensorFlow, PyTorch, and other frameworks. For PyTorch, we trace the model with a dummy input of the expected shape, save the traced module as `model.pth`, and package it in a `.tar.gz` archive.

In [5]:
# PyTorch ResNet classification models expect input of shape [batch, channels, height, width]
input_shape = [1, 3, 224, 224]
trace = torch.jit.trace(model.float().eval(), torch.zeros(input_shape).float())
trace.save("model.pth")
with tarfile.open("model.tar.gz", "w:gz") as f:
    f.add("model.pth")
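
Before uploading, it can help to confirm that the archive holds the model file at its top level. The sketch below assumes, without confirmation from the notebook, that the compiler expects a `.pt` or `.pth` file at the archive root. `root_level_models` is a hypothetical helper name.

```python
import tarfile

def root_level_models(archive_path, suffixes=(".pt", ".pth")):
    """List model files stored at the top level of a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
    # A member sits at the root when its name has no directory component.
    return [name for name in names if "/" not in name and name.endswith(suffixes)]
```

For the archive created above, `root_level_models("model.tar.gz")` should list `model.pth`.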

In this step, we prepare for the compilation job. SageMaker provides a default bucket for each account (its name starts with `sagemaker-`) in the same region as this notebook instance. We store all artifacts and compiled models in that default bucket. Change the bucket location if you need to.

In [6]:
role = sagemaker.get_execution_role()
print(role)
sess = sagemaker.Session()
region = sess.boto_region_name
bucket = sess.default_bucket()

compilation_job_name = name_from_base("TorchVision-ResNet18-Neo")
prefix = compilation_job_name + "/model"
model_path = sess.upload_data(path="model.tar.gz", key_prefix=prefix)

data_shape = '{"input0":[1,3,224,224]}'
target_platform = {'Os': 'LINUX','Arch': 'X86_64'}
framework = "PYTORCH"
framework_version = "1.6"
compiled_model_path = "s3://{}/{}/output".format(bucket, compilation_job_name)


arn:aws:iam::593512547852:role/service-role/AmazonSageMaker-ExecutionRole-20200723T133094
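
The `data_shape` string above repeats the shape used when tracing the model. As a sketch, it could instead be derived from a single shape list so the two cannot drift apart. `data_input_config` is a hypothetical helper, and the input name `input0` is taken from the cell above.

```python
import json

def data_input_config(input_shape, input_name="input0"):
    """Serialize an input shape as compact JSON mapping the input name to its shape."""
    # Compact separators reproduce the hand-written string without extra spaces.
    return json.dumps({input_name: list(input_shape)}, separators=(",", ":"))

input_shape = [1, 3, 224, 224]
data_shape = data_input_config(input_shape)
print(data_shape)  # {"input0":[1,3,224,224]}
```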


Start the compilation job, then poll its status until it completes. Compilation can take around 5 minutes.

In [7]:
# Create a SageMaker client so you can submit a compilation job
# Use the session's region so the job runs in the same region as the default bucket
sagemaker_client = boto3.client('sagemaker', region_name=region)

response = sagemaker_client.create_compilation_job(
    CompilationJobName=compilation_job_name,
    RoleArn=role,
    InputConfig={
        'S3Uri': model_path,
        'DataInputConfig': data_shape,
        'Framework': framework.upper()
    },
    OutputConfig={
        'S3OutputLocation': compiled_model_path,
        'TargetPlatform': target_platform 
    },
    StoppingCondition={
        'MaxRuntimeInSeconds': 900
    }
)
while True:
    response = sagemaker_client.describe_compilation_job(CompilationJobName=compilation_job_name)
    if response['CompilationJobStatus'] == 'COMPLETED':
        break
    elif response['CompilationJobStatus'] == 'FAILED':
        raise RuntimeError('Compilation failed')
    print('Compiling ...')
    time.sleep(30)
print('Done!')

Compiling ...
Compiling ...
Compiling ...
Compiling ...
Compiling ...
Compiling ...
Compiling ...
Compiling ...
Compiling ...
Done!
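
The polling loop above keeps running if the job ends in a status other than `COMPLETED` or `FAILED`. The sketch below shows one possible variant that also stops on `STOPPED`, reports the `FailureReason` field when present, and gives up after a timeout. `wait_for_compilation` and its parameters are hypothetical, and the `describe` argument is any callable that returns a `describe_compilation_job`-style response.

```python
import time

def wait_for_compilation(describe, poll_seconds=30, timeout_seconds=900, sleep=time.sleep):
    """Poll `describe()` until the job reaches a terminal status or the timeout passes."""
    waited = 0
    while True:
        response = describe()
        status = response["CompilationJobStatus"]
        if status == "COMPLETED":
            return response
        if status in ("FAILED", "STOPPED"):
            reason = response.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Compilation ended with status {status}: {reason}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Compilation still {status} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
```

In this notebook, `describe` could be `lambda: sagemaker_client.describe_compilation_job(CompilationJobName=compilation_job_name)`. Passing the callable in keeps the helper independent of boto3, which makes it easy to exercise with a fake response.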


Neo writes its compilation output as a `.tar.gz` archive, but AWS IoT Greengrass only accepts `.zip`. The following step repackages the compiled model as a `.zip` file so that the Greengrass service can download and unpack it.

In [8]:
s3_client = boto3.client('s3')
object_path = '{}/output/model-{}_{}.tar.gz'.format(compilation_job_name, target_platform['Os'], target_platform['Arch'])
neo_compiled_model = 'compiled-model.tar.gz'
s3_client.download_file(bucket, object_path, neo_compiled_model)
!mkdir -p model
!tar -xzvf compiled-model.tar.gz -C model/
!zip compiled-model.zip model/*
s3_client.upload_file('compiled-model.zip', bucket, '{}/output/model-{}_{}.zip'.format(compilation_job_name, target_platform['Os'], target_platform['Arch']))


compiled.params
compiled.meta
compiled_model.json
compiled.so
libdlr.so
dlr.h
manifest
  adding: model/compiled.meta (deflated 66%)
  adding: model/compiled_model.json (deflated 93%)
  adding: model/compiled.params (deflated 7%)
  adding: model/compiled.so (deflated 77%)
  adding: model/dlr.h (deflated 83%)
  adding: model/libdlr.so (deflated 60%)
  adding: model/manifest (deflated 45%)
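
The same conversion can be sketched in pure Python, which avoids depending on the `tar` and `zip` command-line tools. `tar_gz_to_zip` is a hypothetical helper, and the `model/` prefix is an assumption chosen to match the zip listing above.

```python
import posixpath
import tarfile
import zipfile

def tar_gz_to_zip(tar_path, zip_path, prefix="model/"):
    """Copy every regular file from a .tar.gz archive into a .zip under `prefix`."""
    written = []
    with tarfile.open(tar_path, "r:gz") as tar, \
            zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            # Normalize names such as './compiled.so' to 'compiled.so'.
            name = prefix + posixpath.normpath(member.name)
            zf.writestr(name, tar.extractfile(member).read())
            written.append(name)
    return written
```

In this notebook, the call would be `tar_gz_to_zip('compiled-model.tar.gz', 'compiled-model.zip')`, followed by the same `upload_file` call as above.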


This is the end of this brief notebook on how to prepare an Amazon SageMaker Neo compiled model. Refer to the [developer guide](https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html) for more information on the Amazon SageMaker Neo service.