{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Orchestrating Gender Prediction on SageMaker \n", "\n", "Amazon SageMaker provides a powerful orchestration framework that you can use to productionize any of your own machine learning algorithm, using any machine learning framework and programming languages.
\n", "This is possible because, as a manager of containers, SageMaker have standarized ways interacting with your code running inside a Docker container. Since you are free to build a docker container using whatever code and depndency you like, this gives you freedom to bring your own machinery.
\n", "A key take away of this workshop is the boilerplate code necessary to package your code in specific format as required by Sagemaker.
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note in the beginning that we do not need to import any SageMaker specific API, or any of your machine learning library API in order to run this notebook. This is because the actual work of model training and inference generation would happen inside the docker containers, not within the Jupyter runtime." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import time\n", "import boto3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use some parameters to uniquely identify the production pipeline, and set some hyperparameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "run_type='cpu'\n", "instance_class = \"p2\" if run_type.lower()=='gpu' else \"c4\"\n", "instance_type = \"ml.{}.8xlarge\".format(instance_class)\n", "\n", "pipeline_name = 'gender-classifier'\n", "run='01'\n", "\n", "run_name = pipeline_name+\"-\"+run\n", "\n", "epochs = '10'\n", "\n", "print(\"Using instance type - \" + instance_type)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Fetch name of the S3 bucket to which this Notebook instance have access to, if not mention your own bucket name\n", "\n", "sts = boto3.client('sts')\n", "iam = boto3.client('iam')\n", "\n", "\n", "caller = sts.get_caller_identity()\n", "account = caller['Account']\n", "arn = caller['Arn']\n", "role = arn[arn.find(\"/AmazonSageMaker\")+1:arn.find(\"/SageMaker\")]\n", "timestamp = role[role.find(\"Role-\")+5:]\n", "policyarn = \"arn:aws:iam::{}:policy/service-role/AmazonSageMaker-ExecutionPolicy-{}\".format(account, timestamp)\n", "\n", "s3bucketname = \"\"\n", "policystatements = []\n", "\n", "try:\n", " policy = iam.get_policy(\n", " PolicyArn=policyarn\n", " )['Policy']\n", " policyversion = policy['DefaultVersionId']\n", " policystatements = iam.get_policy_version(\n", " PolicyArn = policyarn, \n", " VersionId = policyversion\n", " )['PolicyVersion']['Document']['Statement']\n", "except Exception as e:\n", " s3bucketname=input(\"Which S3 bucket do you want to use to host training data and model? \")\n", " \n", "for stmt in policystatements:\n", " action = \"\"\n", " actions = stmt['Action']\n", " for act in actions:\n", " if act == \"s3:ListBucket\":\n", " action = act\n", " break\n", " if action == \"s3:ListBucket\":\n", " resource = stmt['Resource'][0]\n", " s3bucketname = resource[resource.find(\":::\")+3:]\n", "\n", " print(s3bucketname)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare instance\n", "\n", "One advantage of using SageMaker hosted notebooks is that, we can access the underlying instance, in the same way as we would from an ssh session, using the Jupyter magic shell command.
\n",
"The boilerplate code, which we affectionately call the `Dockerizer` framework, was made available on this Notebook instance by the Lifecycle Configuration that you used. Just look into the folder and ensure the necessary files are available."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -Rl ../container"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Container structure\n",
"\n",
"Notice that the artefacts obtained from the repsitory follows the structure as shown:\n",
"\n",
" \n",
"* The code can either fetch the data directly from an S3 bucket location, or access it from the location `/opt/ml/input/data/ \n",
"The benefit of having a separate preparation notebook, as we followed in the previous step, is that feature formatting, model architecture, and fitment are all well tested. Therefore we don't need to tweak things around in containers, which becomes cumbersome and time-consuming. "
]
},
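{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an aside, it helps to see how the container contract supplies configuration to the training script. The sketch below is illustrative only, and is not part of the `Dockerizer` files; the paths follow the documented SageMaker container contract, under which hyperparameters passed to a training job arrive as a JSON file of strings, and each input channel (here, `train`) is mounted under `/opt/ml/input/data`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch of the SageMaker container contract (runs inside the\n",
"# training container, not on this notebook instance)\n",
"import json\n",
"import os\n",
"\n",
"prefix = '/opt/ml/'\n",
"param_path = os.path.join(prefix, 'input/config/hyperparameters.json')\n",
"data_path = os.path.join(prefix, 'input/data/train')\n",
"\n",
"with open(param_path, 'r') as f:\n",
"    params = json.load(f)\n",
"\n",
"# All hyperparameter values arrive as strings, so cast explicitly\n",
"num_epochs = int(params.get('num_epochs', '10'))\n",
"print(num_epochs, os.listdir(data_path))"
]
},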
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a byoa/train \n",
"\n",
" #Read training data from CSV and load into a data frame\n",
" data=pd.read_csv(data_filename, sep=',', names = [\"Name\", \"Gender\"])\n",
" data = shuffle(data)\n",
" print(\"Training data loaded\")\n",
"\n",
" #number of names\n",
" num_names = data.shape[0]\n",
"\n",
" # length of longest name\n",
" max_name_length = (data['Name'].map(len).max())\n",
"\n",
" #Separate data and label\n",
" names = data['Name'].values\n",
" genders = data['Gender']\n",
"\n",
" #Determine Alphabets in the input\n",
" names = data['Name'].values\n",
" txt = \"\"\n",
" for n in names:\n",
" txt += n.lower()\n",
"\n",
" #Alphabet derived as an unordered set containing unique entries of all characters used in name\n",
" chars = sorted(set(txt))\n",
" alphabet_size = len(chars)\n",
"\n",
" #Assign index values to each symbols in Alphabet\n",
" char_indices = dict((str(chr(c)), i) for i, c in enumerate(range(97,123)))\n",
" alphabet_size = 123-97\n",
" char_indices['max_name_length'] = max_name_length\n",
"\n",
" #One hot encoding to create training-X\n",
" X = np.zeros((num_names, max_name_length, alphabet_size))\n",
" for i,name in enumerate(names):\n",
" name = name.lower()\n",
" for t, char in enumerate(name):\n",
" X[i, t,char_indices[char]] = 1\n",
"\n",
" #Encode training-Y with 'M' as 1 and 'F' as 0\n",
" Y = np.ones((num_names,2))\n",
" Y[data['Gender'] == 'F',0] = 0\n",
" Y[data['Gender'] == 'M',1] = 0\n",
"\n",
" #Shape of one-hot encoded array is equal to length of longest input string by size of Alphabet\n",
" data_dim = alphabet_size\n",
" timesteps = max_name_length\n",
" print(\"Training data prepared\")\n",
"\n",
" #Consider this as a binary classification problem\n",
" num_classes = 2\n",
"\n",
" #Initiate a sequential model\n",
" model = Sequential()\n",
"\n",
" # Add an LSTM layer that returns a sequence of vectors of dimension sequence size (512 by default)\n",
" model.add(LSTM(sequence_size, return_sequences=True, input_shape=(timesteps, data_dim)))\n",
"\n",
" # Drop out certain percentage (20% by default) to prevent over fitting\n",
" if dropout_ratio > 0 and dropout_ratio < 1:\n",
" model.add(Dropout(dropout_ratio))\n",
"\n",
" # Stack another LSTM layer that returns a single vector of dimension sequence size (512 by default)\n",
" model.add(LSTM(sequence_size, return_sequences=False))\n",
"\n",
" # Drop out certain percentage (20% by default) to prevent over fitting\n",
" if dropout_ratio > 0 and dropout_ratio < 1:\n",
" model.add(Dropout(dropout_ratio))\n",
"\n",
" # Finally add an activation layer with a chosen activation function (Sigmoid by default)\n",
" model.add(Dense(num_classes, activation=activation_function))\n",
"\n",
" # Compile the Stacked LSTM Model with a loss function (binary_crossentropy by default),\n",
" #optimizer function (rmsprop) and a metric for measuring model effectiveness (accuracy by default)\n",
" model.compile(loss=loss_function, optimizer=optimizer_function, metrics=[metrics_measure])\n",
" print(\"Model compiled\")\n",
"\n",
" # Train the model for a number of epochs (10 by default), with a batch size (1000 by default)\n",
" # Split a portion of trainining data (20% by default) to be used a validation data\n",
" model.fit(X, Y, validation_split=split_ratio, epochs=num_epochs, batch_size=batch_records)\n",
" print(\"Model trained\")\n",
"\n",
" # Save the model artifacts and character indices under /opt/ml/model\n",
" model_type='lstm-gender-classifier'\n",
" model.save(os.path.join(model_path,'{}-model.h5'.format(model_type)))\n",
" char_indices['max_name_length'] = max_name_length\n",
" np.save(os.path.join(model_path,'{}-indices.npy'.format(model_type)), char_indices)\n",
"\n",
" print('Training complete.')\n",
" except Exception as e:\n",
" # Write out an error file. This will be returned as the failureReason in the\n",
" # DescribeTrainingJob result.\n",
" trc = traceback.format_exc()\n",
" with open(os.path.join(output_path, 'failure'), 'w') as s:\n",
" s.write('Exception during training: ' + str(e) + '\\n' + trc)\n",
" # Printing this causes the exception to be in the training job logs, as well.\n",
" print('Exception during training: ' + str(e) + '\\n' + trc, file=sys.stderr)\n",
" # A non-zero exit code causes the training job to be marked as Failed.\n",
" sys.exit(255)\n",
"\n",
"if __name__ == '__main__':\n",
" train()\n",
"\n",
" # A zero exit code causes the job to be marked a Succeeded.\n",
" sys.exit(0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inference code\n",
"\n",
"The file named `predictor.py` is where we need to package the code for generating inference using the trained model that was saved into an S3 bucket location by the training code during the training job run. \n",
"We'll write code into this file using Jupyter magic command - `writefile`. \n",
"We create `Class` variables in this class to hold loaded model, character indices, tensor-flow graph, and anything else that needs to be referenced while generating prediction. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a byoa/predictor.py\n",
"\n",
"# A singleton for holding the model. This simply loads the model and holds it.\n",
"# It has a predict function that does a prediction based on the model and the input data.\n",
"\n",
"class ScoringService(object):\n",
" model_type = None # Where we keep the model type, qualified by hyperparameters used during training\n",
" model = None # Where we keep the model when it's loaded\n",
" graph = None\n",
" indices = None # Where we keep the indices of Alphabet when it's loaded"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Generally, we have to provide class methods to load the model and related artefacts from the model path as assigned by SageMaker within the running container. \n",
"Notice here that SageMaker copies the artefacts from the S3 location (as defined during model creation) into the container local file system."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a byoa/predictor.py\n",
"\n",
" @classmethod\n",
" def get_indices(cls):\n",
" #Get the indices for Alphabet for this instance, loading it if it's not already loaded\n",
" if cls.indices == None:\n",
" model_type='lstm-gender-classifier'\n",
" index_path = os.path.join(model_path, '{}-indices.npy'.format(model_type))\n",
" if os.path.exists(index_path):\n",
" cls.indices = np.load(index_path).item()\n",
" else:\n",
" print(\"Character Indices not found.\")\n",
" return cls.indices\n",
"\n",
" @classmethod\n",
" def get_model(cls):\n",
" #Get the model object for this instance, loading it if it's not already loaded\n",
" if cls.model == None:\n",
" model_type='lstm-gender-classifier'\n",
" mod_path = os.path.join(model_path, '{}-model.h5'.format(model_type))\n",
" if os.path.exists(mod_path):\n",
" cls.model = load_model(mod_path)\n",
" cls.model._make_predict_function()\n",
" cls.graph = tf.get_default_graph()\n",
" else:\n",
" print(\"LSTM Model not found.\")\n",
" return cls.model\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, inside another clas method, named `predict`, we provide the code that we used earlier to generate prediction. \n",
"Only difference with our previous test prediciton (in development notebook) is that in this case, the predictor will grab the data from the `input` variable, which in turn is obtained from the HTTP request payload."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a byoa/predictor.py\n",
"\n",
" @classmethod\n",
" def predict(cls, input):\n",
"\n",
" mod = cls.get_model()\n",
" ind = cls.get_indices()\n",
"\n",
" result = {}\n",
"\n",
" if mod == None:\n",
" print(\"Model not loaded.\")\n",
" else:\n",
" if 'max_name_length' not in ind:\n",
" max_name_length = 15\n",
" alphabet_size = 26\n",
" else:\n",
" max_name_length = ind['max_name_length']\n",
" ind.pop('max_name_length', None)\n",
" alphabet_size = len(ind)\n",
"\n",
" inputs_list = input.strip('\\n').split(\",\")\n",
" num_inputs = len(inputs_list)\n",
"\n",
" X_test = np.zeros((num_inputs, max_name_length, alphabet_size))\n",
"\n",
" for i,name in enumerate(inputs_list):\n",
" name = name.lower().strip('\\n')\n",
" for t, char in enumerate(name):\n",
" if char in ind:\n",
" X_test[i, t,ind[char]] = 1\n",
"\n",
" with cls.graph.as_default():\n",
" predictions = mod.predict(X_test)\n",
"\n",
" for i,name in enumerate(inputs_list):\n",
" result[name] = 'M' if predictions[i][0]>predictions[i][1] else 'F'\n",
" print(\"{} ({})\".format(inputs_list[i],\"M\" if predictions[i][0]>predictions[i][1] else \"F\"))\n",
"\n",
" return json.dumps(result)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the prediction code captured, we move on to define the flask app, and provide a `ping`, which SageMaker uses to conduct health check on container instances that are responsible behind the hosted prediction endpoint. \n",
"Here we can have the container return healthy response, with status code `200` when everythings goes well. \n",
"For simplicity, we are only validating whether model has been loaded in this case. In practice, this provides opportunity extensive health check (including any external dependency check), as required."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a byoa/predictor.py\n",
"\n",
"# The flask app for serving predictions\n",
"app = flask.Flask(__name__)\n",
"\n",
"@app.route('/ping', methods=['GET'])\n",
"def ping():\n",
" #Determine if the container is working and healthy.\n",
" # Declare it healthy if we can load the model successfully.\n",
" health = ScoringService.get_model() is not None and ScoringService.get_indices() is not None\n",
" status = 200 if health else 404\n",
" return flask.Response(response='\\n', status=status, mimetype='application/json')\n"
]
},
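{
"cell_type": "markdown",
"metadata": {},
"source": [
"If a deeper health check were desired, `ping` could exercise the full prediction path on a small synthetic payload, so that a corrupt model or index file also fails the check. Below is a hedged sketch; the helper name is hypothetical, and this cell is not appended to `predictor.py`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical deeper health check (sketch only, not part of predictor.py):\n",
"# besides verifying that the artifacts load, run a throwaway prediction\n",
"# end-to-end, so a broken graph or index file also fails the check\n",
"def deep_health_check():\n",
"    try:\n",
"        if ScoringService.get_model() is None or ScoringService.get_indices() is None:\n",
"            return False\n",
"        ScoringService.predict('healthcheck')  # any short, valid CSV payload\n",
"        return True\n",
"    except Exception:\n",
"        return False"
]
},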
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Last but not the least, we define a `transformation` method that would intercept the HTTP request coming through to the SageMaker hosted endpoint. \n",
"Here we have the opportunity to decide what type of data we accept with the request. In this particular example, we are accepting only `CSV` formatted data, decoding the data, and invoking prediction. \n",
"The response is similarly funneled backed to the caller with MIME type of `CSV`. \n",
"You are free to choose any or multiple MIME types for your requests and response. However if you choose to do so, it is within this method that we have to transform the back to and from the format that is suitable to passed for prediction."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a byoa/predictor.py\n",
"\n",
"\n",
"@app.route('/invocations', methods=['POST'])\n",
"def transformation():\n",
" #Do an inference on a single batch of data\n",
" data = None\n",
"\n",
" # Convert from CSV to pandas\n",
" if flask.request.content_type == 'text/csv':\n",
" data = flask.request.data.decode('utf-8')\n",
" else:\n",
" return flask.Response(response='This predictor only supports CSV data', status=415, mimetype='text/plain')\n",
"\n",
" print('Invoked with {} records'.format(data.count(\",\")+1))\n",
"\n",
" # Do the prediction\n",
" predictions = ScoringService.predict(data)\n",
"\n",
" result = \"\"\n",
" for prediction in predictions:\n",
" result = result + prediction\n",
"\n",
" return flask.Response(response=result, status=200, mimetype='text/csv')"
]
},
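{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, if we also wanted to accept `JSON` requests, the transformation method could branch on the content type and normalize the payload into the comma-separated form that `predict` expects. The following is a hypothetical sketch, not part of the container code; the route name is made up, and the workshop container itself supports only `CSV`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical JSON-accepting variant of transformation() (sketch only):\n",
"# expects payloads of the form {\"names\": [\"Tom\", \"Allie\"]} and normalizes\n",
"# them into the CSV string that ScoringService.predict() expects\n",
"@app.route('/invocations_json_sketch', methods=['POST'])\n",
"def transformation_json_sketch():\n",
"    if flask.request.content_type == 'application/json':\n",
"        payload = json.loads(flask.request.data.decode('utf-8'))\n",
"        data = ','.join(payload['names'])\n",
"    elif flask.request.content_type == 'text/csv':\n",
"        data = flask.request.data.decode('utf-8')\n",
"    else:\n",
"        return flask.Response(response='Unsupported content type', status=415, mimetype='text/plain')\n",
"\n",
"    result = ScoringService.predict(data)  # returns a JSON string mapping name to gender\n",
"    return flask.Response(response=result, status=200, mimetype='application/json')"
]
},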
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that in containerizing our custom LSTM Algorithm, where we used `Keras` as our framework of our choice, we did not have to interact directly with the SageMaker API, even though SageMaker API doesn't support `Keras`. \n",
"This serves to show the power and flexibility offered by containerized machine learning pipeline on SageMaker."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Container publishing\n",
"\n",
"Of course the code written so far in this notebook haven't sttod the test of execution so far. In order to do so, we need to actually build the `Docker` containers, publish it to `Amazon ECR` repository, and then either use SageMaker console or API to run the training hosting and deployment stages."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Conceptually, the steps required for publishing are: \n",
"1. Make the `train` and `predictor.py` files executable\n",
"2. Create an ECR repository within your default region\n",
"3. Build a docker container with an identifieable name (we used a combination or model name and version as unique)\n",
"4. Tage the image and publish to the ECR repository\n",
" \n",
"However, it is often more convenient to automate these steps. This notebook shows one way to do so, using `boto3 SageMaker` API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sagemaker = boto3.client('sagemaker')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we create a training job specifying the name of our produciton pipeline, ARN of the published image on ECR, location of available training data on S3 bucket, and desired S3 location where we need the trained model to be saved. \n",
"We wait until the training job completes before proceeding to the next stage."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"region = boto3.session.Session().region_name\n",
"response = sagemaker.create_training_job(\n",
" TrainingJobName='{}-training'.format(run_name),\n",
" HyperParameters={\n",
" 'num_epochs': epochs\n",
" },\n",
" AlgorithmSpecification={\n",
" 'TrainingImage': '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(account,region,run_name),\n",
" 'TrainingInputMode': 'File'\n",
" }, \n",
" RoleArn='arn:aws:iam::{}:role/service-role/AmazonSageMaker-ExecutionRole-{}'.format(account,timestamp),\n",
" InputDataConfig=[\n",
" {\n",
" 'ChannelName': 'train',\n",
" 'DataSource': {\n",
" 'S3DataSource': {\n",
" 'S3DataType': 'S3Prefix',\n",
" 'S3Uri': 's3://{}/data'.format(s3bucketname),\n",
" 'S3DataDistributionType': 'FullyReplicated'\n",
" }\n",
" },\n",
" 'CompressionType': 'None',\n",
" 'RecordWrapperType': 'None'\n",
" },\n",
" ],\n",
" OutputDataConfig={\n",
" 'S3OutputPath': 's3://{}/output'.format(s3bucketname)\n",
" },\n",
" ResourceConfig={\n",
" 'InstanceType': instance_type,\n",
" 'InstanceCount': 1,\n",
" 'VolumeSizeInGB': 10\n",
" },\n",
" StoppingCondition={\n",
" 'MaxRuntimeInSeconds': 86400\n",
" },\n",
" Tags=[\n",
" {\n",
" 'Key': 'Name',\n",
" 'Value': '{}-training'.format(run_name)\n",
" }\n",
" ] \n",
")\n",
"status='InProgress'\n",
"step = 0\n",
"sleep = 30\n",
"print(\"{} - Time Elapsed: {} seconds\".format(status,step*sleep))\n",
"while status != 'Completed' and status != 'Failed':\n",
" response = sagemaker.describe_training_job(\n",
" TrainingJobName=run_name+'-training'\n",
" )\n",
" status = response['TrainingJobStatus']\n",
" time.sleep(sleep)\n",
" step = step+1\n",
" print(\"{} - Time Elapsed: {} seconds\".format(status,step*sleep))"
]
},
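{
"cell_type": "markdown",
"metadata": {},
"source": [
"Instead of hand-rolling a polling loop, a waiter can do the blocking for us; a minimal sketch, assuming your `boto3` release includes the SageMaker waiters:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Equivalent to the polling loop above: block until the training job completes\n",
"# or stops (an 'endpoint_in_service' waiter exists similarly for the endpoint\n",
"# creation step further below)\n",
"waiter = sagemaker.get_waiter('training_job_completed_or_stopped')\n",
"waiter.wait(TrainingJobName='{}-training'.format(run_name))"
]
},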
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If training succeeds, we move on to create a model hosting definition, by providing the S3 location to the model artifact, and ARN to the ECR image of the container."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if status == 'Completed':\n",
" response = sagemaker.create_model(\n",
" ModelName='{}-model'.format(run_name),\n",
" PrimaryContainer={\n",
" 'Image': '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(account,region,run_name),\n",
" 'ModelDataUrl': 's3://{}/output/{}-training/output/model.tar.gz'.format(s3bucketname,run_name),\n",
" 'Environment': {\n",
" 'string': 'string'\n",
" }\n",
" },\n",
" ExecutionRoleArn='arn:aws:iam::{}:role/service-role/AmazonSageMaker-ExecutionRole-{}'.format(account,timestamp),\n",
" Tags=[\n",
" {\n",
" 'Key': 'Name',\n",
" 'Value': '{}-model'.format(run_name)\n",
" }\n",
" ]\n",
" ) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the model hosting definition, our next step is to create configuration of a hosted endpoint that will be used to serve prediciton generation requests. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = sagemaker.create_endpoint_config(\n",
" EndpointConfigName='{}-endpoint-config'.format(run_name),\n",
" ProductionVariants=[\n",
" {\n",
" 'VariantName': 'default',\n",
" 'ModelName': '{}-model'.format(run_name),\n",
" 'InitialInstanceCount': 1,\n",
" 'InstanceType': instance_type,\n",
" 'InitialVariantWeight': 1\n",
" },\n",
" ],\n",
" Tags=[\n",
" {\n",
" 'Key': 'Name',\n",
" 'Value': '{}-endpoint-config'.format(run_name)\n",
" }\n",
" ]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Creating the endpoint is the last step in the ML cycle, that prepares your model to serve client reqests from applications. \n",
"We wait until provisioning of infrastrucuture needed to host the endpoint is completed and the endpoint in service."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = sagemaker.create_endpoint(\n",
" EndpointName='{}-endpoint'.format(run_name),\n",
" EndpointConfigName='{}-endpoint-config'.format(run_name),\n",
" Tags=[\n",
" {\n",
" 'Key': 'string',\n",
" 'Value': run_name+'-endpoint'\n",
" }\n",
" ]\n",
")\n",
"status='Creating'\n",
"step = 0\n",
"sleep = 30\n",
"print(\"{} - Time Elapsed: {} seconds\".format(status,step*sleep))\n",
"while status != 'InService' and status != 'Failed' and status != 'OutOfService':\n",
" response = sagemaker.describe_endpoint(\n",
" EndpointName='{}-endpoint'.format(run_name)\n",
" )\n",
" status = response['EndpointStatus']\n",
" time.sleep(sleep)\n",
" step = step+1\n",
" print(\"{} - Time Elapsed: {} seconds\".format(status,step*sleep))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"At the end we run a quick test to validate we are able to generate same predicitions as we did in our preparation notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!aws sagemaker-runtime invoke-endpoint --endpoint-name \"$run_name-endpoint\" --body 'Tom,Allie,Jim,Sophie,John,Kayla,Mike,Amanda,Andrew' --content-type text/csv outfile\n",
"!cat outfile"
]
},
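{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same test can also be issued from Python through the runtime API; a minimal `boto3` sketch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Invoke the hosted endpoint via the SageMaker runtime API, mirroring the CLI call above\n",
"runtime = boto3.client('sagemaker-runtime')\n",
"invoke_response = runtime.invoke_endpoint(\n",
"    EndpointName='{}-endpoint'.format(run_name),\n",
"    ContentType='text/csv',\n",
"    Body='Tom,Allie,Jim,Sophie,John,Kayla,Mike,Amanda,Andrew'\n",
")\n",
"print(invoke_response['Body'].read().decode('utf-8'))"
]
},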
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Head back to Module-3 of the workshop now, to the section titled - `Integration`, and follow the steps described. \n",
"You'll need to copy the endpoint name from the output of the cell below, to use in the Lambda function that will send request to this hosted endpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(response['EndpointName'])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_python3",
"language": "python",
"name": "conda_python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
\n",
"First part of the file would contain the necessary imports, as ususal. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile byoa/predictor.py\n",
"# This is the file that implements a flask server to do inferences. It's the file that you will modify to\n",
"# implement the scoring for your own algorithm.\n",
"\n",
"from __future__ import print_function\n",
"\n",
"import os\n",
"import json\n",
"import pickle\n",
"from io import StringIO\n",
"import sys\n",
"import signal\n",
"import traceback\n",
"\n",
"import numpy as np\n",
"\n",
"import keras\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense, Dropout\n",
"from keras.layers import Embedding\n",
"from keras.layers import LSTM\n",
"from keras.models import load_model\n",
"import flask\n",
"\n",
"import tensorflow as tf\n",
"\n",
"import pandas as pd\n",
"\n",
"from os import listdir, sep\n",
"from os.path import abspath, basename, isdir\n",
"from sys import argv\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When run within an instantiated container, SageMaker makes the trained model available locally at `/opt/ml`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a byoa/predictor.py\n",
"\n",
"prefix = '/opt/ml/'\n",
"model_path = os.path.join(prefix, 'model')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The machinery to produce inference is wrapped around in a Pythonic class structure, within a `Singleton` class, aptly named - `ScoringService`.
\n",
"All of these ar conveniently encapsulated inside `build_and_push` script. We simply run it with the unique name of our production run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!sh build_and_push.sh $run_name"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Orchestraion\n",
"\n",
"At this point, we can head to ECS console, grab the ARN for the repository where we published the docker image with our training and inference code, and use SageMaker console to spawn training job, create hosted model, and endpoint.