{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "4baabab9", "metadata": { "papermill": { "duration": 0.021827, "end_time": "2021-06-07T00:16:46.305199", "exception": false, "start_time": "2021-06-07T00:16:46.283372", "status": "completed" }, "tags": [] }, "source": [ "# Machine Translation English-German Example Using SageMaker Seq2Seq\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "3fd82ba7", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "---" ] }, { "attachments": {}, "cell_type": "markdown", "id": "568e74ab", "metadata": { "papermill": { "duration": 0.021827, "end_time": "2021-06-07T00:16:46.305199", "exception": false, "start_time": "2021-06-07T00:16:46.283372", "status": "completed" }, "tags": [] }, "source": [ "\n", "1. [Introduction](#Introduction)\n", "2. [Setup](#Setup)\n", "3. [Download dataset and preprocess](#Download-dataset-and-preprocess)\n", "4. [Training the Machine Translation model](#Training-the-Machine-Translation-model)\n", "5. [Inference](#Inference)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "387c8d44", "metadata": { "papermill": { "duration": 0.021137, "end_time": "2021-06-07T00:16:46.347464", "exception": false, "start_time": "2021-06-07T00:16:46.326327", "status": "completed" }, "tags": [] }, "source": [ "## Introduction\n", "\n", "Welcome to our Machine Translation end-to-end example! 
In this demo, we will train an English-German translation model and test its predictions on a few examples.\n", "\n", "The SageMaker Seq2Seq algorithm is built on top of [Sockeye](https://github.com/awslabs/sockeye), a sequence-to-sequence framework for Neural Machine Translation based on MXNet. SageMaker Seq2Seq implements state-of-the-art encoder-decoder architectures which can also be used for tasks like Abstractive Summarization in addition to Machine Translation.\n", "\n", "To get started, we need to set up the environment with a few prerequisite steps, for permissions, configurations, and so on." ] }, { "attachments": {}, "cell_type": "markdown", "id": "13622fa5", "metadata": { "papermill": { "duration": 0.021033, "end_time": "2021-06-07T00:16:46.389485", "exception": false, "start_time": "2021-06-07T00:16:46.368452", "status": "completed" }, "tags": [] }, "source": [ "## Setup\n", "\n", "Let's start by specifying:\n", "- The S3 bucket and prefix that you want to use for training and model data. **This should be within the same region as the Notebook Instance, training, and hosting.**\n", "- The IAM role arn used to give training and hosting access to your data. See the documentation for how to create these. Note, if more than one role is required for notebook instances, training, and/or hosting, please replace the `get_execution_role()` call in the cell below with the appropriate full IAM role arn string(s)." 
] }, { "cell_type": "code", "execution_count": null, "id": "34170980", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:16:46.435546Z", "iopub.status.busy": "2021-06-07T00:16:46.434925Z", "iopub.status.idle": "2021-06-07T00:16:46.436650Z", "shell.execute_reply": "2021-06-07T00:16:46.437019Z" }, "isConfigCell": true, "papermill": { "duration": 0.026128, "end_time": "2021-06-07T00:16:46.437174", "exception": false, "start_time": "2021-06-07T00:16:46.411046", "status": "completed" }, "tags": [ "parameters" ] }, "outputs": [], "source": [ "import sagemaker\n", "\n", "sagemaker_session = sagemaker.Session()\n", "region = sagemaker_session.boto_region_name\n", "\n", "# S3 bucket and prefix\n", "bucket = sagemaker_session.default_bucket()\n", "prefix = \"sagemaker/DEMO-seq2seq\"" ] }, { "cell_type": "code", "execution_count": null, "id": "c8c71c9e", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:16:46.530988Z", "iopub.status.busy": "2021-06-07T00:16:46.530301Z", "iopub.status.idle": "2021-06-07T00:16:47.386678Z", "shell.execute_reply": "2021-06-07T00:16:47.386168Z" }, "papermill": { "duration": 0.88088, "end_time": "2021-06-07T00:16:47.386786", "exception": false, "start_time": "2021-06-07T00:16:46.505906", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "import boto3\n", "import re\n", "from sagemaker import get_execution_role\n", "\n", "role = get_execution_role()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "58f2a566", "metadata": { "papermill": { "duration": 0.021349, "end_time": "2021-06-07T00:16:47.429511", "exception": false, "start_time": "2021-06-07T00:16:47.408162", "status": "completed" }, "tags": [] }, "source": [ "Next, we'll import the Python libraries we'll need for the remainder of the exercise." 
] }, { "cell_type": "code", "execution_count": null, "id": "11fdbaeb", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:16:47.477268Z", "iopub.status.busy": "2021-06-07T00:16:47.476811Z", "iopub.status.idle": "2021-06-07T00:16:48.051269Z", "shell.execute_reply": "2021-06-07T00:16:48.051644Z" }, "papermill": { "duration": 0.60087, "end_time": "2021-06-07T00:16:48.051780", "exception": false, "start_time": "2021-06-07T00:16:47.450910", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "from time import gmtime, strftime\n", "import time\n", "import numpy as np\n", "import os\n", "import json\n", "\n", "# For plotting attention matrix later on\n", "import matplotlib\n", "\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2fa13beb", "metadata": { "papermill": { "duration": 0.021444, "end_time": "2021-06-07T00:16:48.094706", "exception": false, "start_time": "2021-06-07T00:16:48.073262", "status": "completed" }, "tags": [] }, "source": [ "## Download dataset and preprocess" ] }, { "attachments": {}, "cell_type": "markdown", "id": "e5a3e0b4", "metadata": { "papermill": { "duration": 0.021348, "end_time": "2021-06-07T00:16:48.137455", "exception": false, "start_time": "2021-06-07T00:16:48.116107", "status": "completed" }, "tags": [] }, "source": [ "In this notebook, we will train an English-to-German translation model on a dataset from the\n", "[Conference on Machine Translation (WMT) 2017](http://www.statmt.org/wmt17/)." 
] }, { "cell_type": "code", "execution_count": null, "id": "4b5eeab5", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:16:48.187097Z", "iopub.status.busy": "2021-06-07T00:16:48.184203Z", "iopub.status.idle": "2021-06-07T00:17:28.807525Z", "shell.execute_reply": "2021-06-07T00:17:28.807097Z" }, "papermill": { "duration": 40.648725, "end_time": "2021-06-07T00:17:28.807637", "exception": false, "start_time": "2021-06-07T00:16:48.158912", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "%%bash\n", "wget http://data.statmt.org/wmt17/translation-task/preprocessed/de-en/corpus.tc.de.gz & \\\n", "wget http://data.statmt.org/wmt17/translation-task/preprocessed/de-en/corpus.tc.en.gz & wait\n", "gunzip corpus.tc.de.gz & \\\n", "gunzip corpus.tc.en.gz & wait\n", "mkdir -p validation\n", "wget http://data.statmt.org/wmt17/translation-task/preprocessed/de-en/dev.tgz\n", "tar xzf dev.tgz -C validation" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c783e499", "metadata": { "papermill": { "duration": 0.060228, "end_time": "2021-06-07T00:17:28.943237", "exception": false, "start_time": "2021-06-07T00:17:28.883009", "status": "completed" }, "tags": [] }, "source": [ "Please note that it is common practice to split words into subwords using Byte Pair Encoding (BPE). Please refer to [this](https://github.com/awslabs/sockeye/tree/master/tutorials/wmt) tutorial if you are interested in performing BPE." ] }, { "attachments": {}, "cell_type": "markdown", "id": "d5f518a8", "metadata": { "papermill": { "duration": 0.049397, "end_time": "2021-06-07T00:17:29.036709", "exception": false, "start_time": "2021-06-07T00:17:28.987312", "status": "completed" }, "tags": [] }, "source": [ "Since training on the whole dataset might take several hours or days, for this demo, let us train on the **first 10,000 lines only**. Don't run the next cell if you want to train on the complete dataset." 
] }, { "cell_type": "code", "execution_count": null, "id": "3bfccf50", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:17:29.140937Z", "iopub.status.busy": "2021-06-07T00:17:29.140466Z", "iopub.status.idle": "2021-06-07T00:17:29.449926Z", "shell.execute_reply": "2021-06-07T00:17:29.449511Z" }, "papermill": { "duration": 0.360556, "end_time": "2021-06-07T00:17:29.450066", "exception": false, "start_time": "2021-06-07T00:17:29.089510", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "!head -n 10000 corpus.tc.en > corpus.tc.en.small\n", "!head -n 10000 corpus.tc.de > corpus.tc.de.small" ] }, { "attachments": {}, "cell_type": "markdown", "id": "171589d1", "metadata": { "papermill": { "duration": 0.043187, "end_time": "2021-06-07T00:17:29.536593", "exception": false, "start_time": "2021-06-07T00:17:29.493406", "status": "completed" }, "tags": [] }, "source": [ "Now, let's use the preprocessing script `create_vocab_proto.py` (provided with this notebook) to create vocabulary mappings (strings to integers) and convert these files to x-recordio-protobuf as required for training by SageMaker Seq2Seq. \n", "Uncomment the cell below and run it to check the arguments this script expects." 
] }, { "cell_type": "code", "execution_count": null, "id": "88fb15d3", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:17:29.628365Z", "iopub.status.busy": "2021-06-07T00:17:29.625318Z", "iopub.status.idle": "2021-06-07T00:17:29.636907Z", "shell.execute_reply": "2021-06-07T00:17:29.636466Z" }, "papermill": { "duration": 0.05778, "end_time": "2021-06-07T00:17:29.637011", "exception": false, "start_time": "2021-06-07T00:17:29.579231", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "%%bash\n", "# python3 create_vocab_proto.py -h" ] }, { "attachments": {}, "cell_type": "markdown", "id": "4c601108", "metadata": { "papermill": { "duration": 0.043197, "end_time": "2021-06-07T00:17:29.722780", "exception": false, "start_time": "2021-06-07T00:17:29.679583", "status": "completed" }, "tags": [] }, "source": [ "The cell below does the preprocessing. If you are using the complete dataset, the script might take around 10-15 min on an m4.xlarge notebook instance. Remove \".small\" from the file names for training on full datasets." 
] }, { "cell_type": "code", "execution_count": null, "id": "33cd43e5", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:17:29.813204Z", "iopub.status.busy": "2021-06-07T00:17:29.812643Z", "iopub.status.idle": "2021-06-07T00:17:30.902664Z", "shell.execute_reply": "2021-06-07T00:17:30.903018Z" }, "papermill": { "duration": 1.137858, "end_time": "2021-06-07T00:17:30.903154", "exception": false, "start_time": "2021-06-07T00:17:29.765296", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "%%time\n", "%%bash\n", "python3 create_vocab_proto.py \\\n", " --train-source corpus.tc.en.small \\\n", " --train-target corpus.tc.de.small \\\n", " --val-source validation/newstest2014.tc.en \\\n", " --val-target validation/newstest2014.tc.de" ] }, { "attachments": {}, "cell_type": "markdown", "id": "a1d0868b", "metadata": { "papermill": { "duration": 0.043095, "end_time": "2021-06-07T00:17:30.989500", "exception": false, "start_time": "2021-06-07T00:17:30.946405", "status": "completed" }, "tags": [] }, "source": [ "The script will output 4 files, namely:\n", "- train.rec : Contains source and target sentences for training in protobuf format\n", "- val.rec : Contains source and target sentences for validation in protobuf format\n", "- vocab.src.json : Vocabulary mapping (string to int) for source language (English in this example)\n", "- vocab.trg.json : Vocabulary mapping (string to int) for target language (German in this example)\n", "\n", "Let's upload the pre-processed dataset and vocabularies to S3" ] }, { "cell_type": "code", "execution_count": null, "id": "a44024c9", "metadata": { "execution": { "iopub.execute_input": "2021-06-07T00:17:31.092372Z", "iopub.status.busy": "2021-06-07T00:17:31.091900Z", "iopub.status.idle": "2021-06-07T00:17:31.306296Z", "shell.execute_reply": "2021-06-07T00:17:31.305608Z" }, "papermill": { "duration": 0.271489, "end_time": "2021-06-07T00:17:31.306497", "exception": true, "start_time": 
"2021-06-07T00:17:31.035008", "status": "failed" }, "tags": [] }, "outputs": [], "source": [ "def upload_to_s3(bucket, prefix, channel, file):\n", " s3 = boto3.resource(\"s3\")\n", " data = open(file, \"rb\")\n", " key = prefix + \"/\" + channel + \"/\" + file\n", " s3.Bucket(bucket).put_object(Key=key, Body=data)\n", "\n", "\n", "upload_to_s3(bucket, prefix, \"train\", \"train.rec\")\n", "upload_to_s3(bucket, prefix, \"validation\", \"val.rec\")\n", "upload_to_s3(bucket, prefix, \"vocab\", \"vocab.src.json\")\n", "upload_to_s3(bucket, prefix, \"vocab\", \"vocab.trg.json\")" ] }, { "cell_type": "code", "execution_count": null, "id": "1ea80f0e", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "region_name = boto3.Session().region_name" ] }, { "cell_type": "code", "execution_count": null, "id": "0beb3344", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "from sagemaker.amazon.amazon_estimator import get_image_uri\n", "\n", "container = get_image_uri(region_name, \"seq2seq\")\n", "\n", "print(\"Using SageMaker Seq2Seq container: {} ({})\".format(container, region_name))" ] }, { "attachments": {}, "cell_type": "markdown", "id": "f82a4c92", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "## Training the Machine Translation model" ] }, { "cell_type": "code", "execution_count": null, "id": "fa329926", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "job_name = \"DEMO-seq2seq-en-de-\" + strftime(\"%Y-%m-%d-%H-%M-%S\", gmtime())\n", "print(\"Training job\", job_name)\n", "\n", "create_training_params = {\n", " 
\"AlgorithmSpecification\": {\"TrainingImage\": container, \"TrainingInputMode\": \"File\"},\n", " \"RoleArn\": role,\n", " \"OutputDataConfig\": {\"S3OutputPath\": \"s3://{}/{}/\".format(bucket, prefix)},\n", " \"ResourceConfig\": {\n", " # Seq2Seq does not support multiple machines. Currently, it only supports single machine, multiple GPUs\n", " \"InstanceCount\": 1,\n", " \"InstanceType\": \"ml.p2.xlarge\", # We suggest one of [\"ml.p2.16xlarge\", \"ml.p2.8xlarge\", \"ml.p2.xlarge\"]\n", " \"VolumeSizeInGB\": 50,\n", " },\n", " \"TrainingJobName\": job_name,\n", " \"HyperParameters\": {\n", " # Please refer to the documentation for complete list of parameters\n", " \"max_seq_len_source\": \"60\",\n", " \"max_seq_len_target\": \"60\",\n", " \"optimized_metric\": \"bleu\",\n", " \"batch_size\": \"64\", # Please use a larger batch size (256 or 512) if using ml.p2.8xlarge or ml.p2.16xlarge\n", " \"checkpoint_frequency_num_batches\": \"1000\",\n", " \"rnn_num_hidden\": \"512\",\n", " \"num_layers_encoder\": \"1\",\n", " \"num_layers_decoder\": \"1\",\n", " \"num_embed_source\": \"512\",\n", " \"num_embed_target\": \"512\",\n", " \"checkpoint_threshold\": \"3\",\n", " \"max_num_batches\": \"2100\"\n", " # Training will stop after 2100 iterations/batches.\n", " # This is just for demo purposes. 
Remove the above parameter if you want a better model.\n", " },\n", " \"StoppingCondition\": {\"MaxRuntimeInSeconds\": 48 * 3600},\n", " \"InputDataConfig\": [\n", " {\n", " \"ChannelName\": \"train\",\n", " \"DataSource\": {\n", " \"S3DataSource\": {\n", " \"S3DataType\": \"S3Prefix\",\n", " \"S3Uri\": \"s3://{}/{}/train/\".format(bucket, prefix),\n", " \"S3DataDistributionType\": \"FullyReplicated\",\n", " }\n", " },\n", " },\n", " {\n", " \"ChannelName\": \"vocab\",\n", " \"DataSource\": {\n", " \"S3DataSource\": {\n", " \"S3DataType\": \"S3Prefix\",\n", " \"S3Uri\": \"s3://{}/{}/vocab/\".format(bucket, prefix),\n", " \"S3DataDistributionType\": \"FullyReplicated\",\n", " }\n", " },\n", " },\n", " {\n", " \"ChannelName\": \"validation\",\n", " \"DataSource\": {\n", " \"S3DataSource\": {\n", " \"S3DataType\": \"S3Prefix\",\n", " \"S3Uri\": \"s3://{}/{}/validation/\".format(bucket, prefix),\n", " \"S3DataDistributionType\": \"FullyReplicated\",\n", " }\n", " },\n", " },\n", " ],\n", "}\n", "\n", "sagemaker_client = boto3.Session().client(service_name=\"sagemaker\")\n", "sagemaker_client.create_training_job(**create_training_params)\n", "\n", "status = sagemaker_client.describe_training_job(TrainingJobName=job_name)[\"TrainingJobStatus\"]\n", "print(status)" ] }, { "cell_type": "code", "execution_count": null, "id": "d453c1c2", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "import time\n", "\n", "status = sagemaker_client.describe_training_job(TrainingJobName=job_name)[\"TrainingJobStatus\"]\n", "\n", "while status == \"InProgress\":\n", " time.sleep(60)\n", " status = sagemaker_client.describe_training_job(TrainingJobName=job_name)[\"TrainingJobStatus\"]\n", "\n", "\n", "print(status)\n", "# if the job failed, determine why\n", "if status == \"Failed\":\n", " message = 
sagemaker_client.describe_training_job(TrainingJobName=job_name)[\"FailureReason\"]\n", " print(\"Training failed with the following error: {}\".format(message))\n", " raise Exception(\"Training job failed\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "3b5ed18f", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "> Now wait for the training job to complete and proceed to the next step after you see model artifacts in your S3 bucket." ] }, { "attachments": {}, "cell_type": "markdown", "id": "5cb116e9", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "You can jump to [Use a pretrained model](#Use-a-pretrained-model) as training might take some time." ] }, { "attachments": {}, "cell_type": "markdown", "id": "93d2480f", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "## Inference\n", "\n", "A trained model does nothing on its own. We now want to use the model to perform inference. For this example, that means translating sentence(s) from English to German.\n", "This section involves several steps,\n", "- Create model - Create a model using the artifact (model.tar.gz) produced by training\n", "- Create Endpoint Configuration - Create a configuration defining an endpoint, using the above model\n", "- Create Endpoint - Use the configuration to create an inference endpoint.\n", "- Perform Inference - Perform inference on some input data using the endpoint.\n", "\n", "### Create model\n", "We now create a SageMaker Model from the training output. Using the model, we can then create an Endpoint Configuration." 
] }, { "cell_type": "code", "execution_count": null, "id": "25744e84", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "use_pretrained_model = False" ] }, { "attachments": {}, "cell_type": "markdown", "id": "0e028094", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "### Use a pretrained model\n", "#### Please uncomment and run the cell below if you want to use a pretrained model, as training might take several hours/days to complete." ] }, { "cell_type": "code", "execution_count": null, "id": "365589a1", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "# use_pretrained_model = True\n", "# model_name = \"DEMO-pretrained-en-de-model\"\n", "# s3 = boto3.client(\"s3\")\n", "# s3.download_file(f\"sagemaker-example-files-prod-{region}\", \"models/seq2seq-data/model.tar.gz\", \"model.tar.gz\")\n", "# s3.download_file(f\"sagemaker-example-files-prod-{region}\", \"models/seq2seq-data/vocab.src.json\", \"vocab.src.json\")\n", "# s3.download_file(f\"sagemaker-example-files-prod-{region}\", \"models/seq2seq-data/vocab.trg.json\", \"vocab.trg.json\")\n", "# upload_to_s3(bucket, prefix, 'pretrained_model', 'model.tar.gz')\n", "# model_data = \"s3://{}/{}/pretrained_model/model.tar.gz\".format(bucket, prefix)" ] }, { "cell_type": "code", "execution_count": null, "id": "87e32144", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "%%time\n", "\n", "sage = boto3.client(\"sagemaker\")\n", "\n", "if not use_pretrained_model:\n", " info = sage.describe_training_job(TrainingJobName=job_name)\n", " model_name = job_name\n", " 
model_data = info[\"ModelArtifacts\"][\"S3ModelArtifacts\"]\n", "\n", "print(model_name)\n", "print(model_data)\n", "\n", "primary_container = {\"Image\": container, \"ModelDataUrl\": model_data}\n", "\n", "create_model_response = sage.create_model(\n", " ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container\n", ")\n", "\n", "print(create_model_response[\"ModelArn\"])" ] }, { "attachments": {}, "cell_type": "markdown", "id": "f7903d8c", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "### Create endpoint configuration\n", "Use the model to create an endpoint configuration. The endpoint configuration also contains information about the type and number of EC2 instances to use when hosting the model.\n", "\n", "Since SageMaker Seq2Seq is based on Neural Nets, we could use an ml.p2.xlarge (GPU) instance, but for this example we will use a free tier eligible ml.m4.xlarge." 
] }, { "cell_type": "code", "execution_count": null, "id": "720a775b", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "from time import gmtime, strftime\n", "\n", "endpoint_config_name = \"DEMO-Seq2SeqEndpointConfig-\" + strftime(\"%Y-%m-%d-%H-%M-%S\", gmtime())\n", "print(endpoint_config_name)\n", "create_endpoint_config_response = sage.create_endpoint_config(\n", " EndpointConfigName=endpoint_config_name,\n", " ProductionVariants=[\n", " {\n", " \"InstanceType\": \"ml.m4.xlarge\",\n", " \"InitialInstanceCount\": 1,\n", " \"ModelName\": model_name,\n", " \"VariantName\": \"AllTraffic\",\n", " }\n", " ],\n", ")\n", "\n", "print(\"Endpoint Config Arn: \" + create_endpoint_config_response[\"EndpointConfigArn\"])" ] }, { "attachments": {}, "cell_type": "markdown", "id": "46b991e2", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "### Create endpoint\n", "Lastly, we create the endpoint that serves the model by specifying the endpoint name and the configuration defined above. The end result is an endpoint that can be validated and incorporated into production applications. This takes 10-15 minutes to complete." 
] }, { "cell_type": "code", "execution_count": null, "id": "ba6dc71c", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "%%time\n", "import time\n", "\n", "endpoint_name = \"DEMO-Seq2SeqEndpoint-\" + strftime(\"%Y-%m-%d-%H-%M-%S\", gmtime())\n", "print(endpoint_name)\n", "create_endpoint_response = sage.create_endpoint(\n", " EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name\n", ")\n", "print(create_endpoint_response[\"EndpointArn\"])\n", "\n", "resp = sage.describe_endpoint(EndpointName=endpoint_name)\n", "status = resp[\"EndpointStatus\"]\n", "print(\"Status: \" + status)\n", "\n", "# wait until the status has changed\n", "sage.get_waiter(\"endpoint_in_service\").wait(EndpointName=endpoint_name)\n", "\n", "# print the status of the endpoint\n", "endpoint_response = sage.describe_endpoint(EndpointName=endpoint_name)\n", "status = endpoint_response[\"EndpointStatus\"]\n", "print(\"Endpoint creation ended with EndpointStatus = {}\".format(status))\n", "\n", "if status != \"InService\":\n", " raise Exception(\"Endpoint creation failed.\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "5266c3e0", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "If you see the message,\n", "> Endpoint creation ended with EndpointStatus = InService\n", "\n", "then congratulations! You now have a functioning inference endpoint. You can confirm the endpoint configuration and status by navigating to the \"Endpoints\" tab in the AWS SageMaker console. \n", "\n", "We will finally create a runtime object from which we can invoke the endpoint." 
] }, { "cell_type": "code", "execution_count": null, "id": "83cdd141", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "runtime = boto3.client(service_name=\"runtime.sagemaker\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "f0cf9f1f", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "## Perform Inference" ] }, { "attachments": {}, "cell_type": "markdown", "id": "0a6defaa", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "### Using JSON format for inference (Suggested for a single or small number of data instances)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "af99ae50", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "#### Note that you don't have to convert strings to integers using the vocabulary mapping for inference using JSON mode" ] }, { "cell_type": "code", "execution_count": null, "id": "2682605e", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "sentences = [\"you are so good !\", \"can you drive a car ?\", \"i want to watch a movie .\"]\n", "\n", "payload = {\"instances\": []}\n", "for sent in sentences:\n", " payload[\"instances\"].append({\"data\": sent})\n", "\n", "response = runtime.invoke_endpoint(\n", " EndpointName=endpoint_name, ContentType=\"application/json\", Body=json.dumps(payload)\n", ")\n", "\n", "response = response[\"Body\"].read().decode(\"utf-8\")\n", "response = json.loads(response)\n", "print(response)" ] }, { "attachments": {}, "cell_type": "markdown", "id": 
"0f6efa98", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "### Retrieving the Attention Matrix" ] }, { "attachments": {}, "cell_type": "markdown", "id": "ac8b596e", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "Passing `\"attention_matrix\":\"true\"` in `configuration` of the data instance will return the attention matrix." ] }, { "cell_type": "code", "execution_count": null, "id": "002db02f", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "sentence = \"can you drive a car ?\"\n", "\n", "payload = {\"instances\": [{\"data\": sentence, \"configuration\": {\"attention_matrix\": \"true\"}}]}\n", "\n", "response = runtime.invoke_endpoint(\n", " EndpointName=endpoint_name, ContentType=\"application/json\", Body=json.dumps(payload)\n", ")\n", "\n", "response = response[\"Body\"].read().decode(\"utf-8\")\n", "response = json.loads(response)[\"predictions\"][0]\n", "\n", "source = sentence\n", "target = response[\"target\"]\n", "attention_matrix = np.array(response[\"matrix\"])\n", "\n", "print(\"Source: %s \\nTarget: %s\" % (source, target))" ] }, { "cell_type": "code", "execution_count": null, "id": "26efc5d8", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "# Define a function for plotting the attention matrix\n", "def plot_matrix(attention_matrix, target, source):\n", " source_tokens = source.split()\n", " target_tokens = target.split()\n", " assert attention_matrix.shape[0] == len(target_tokens)\n", " plt.imshow(attention_matrix.transpose(), interpolation=\"nearest\", cmap=\"Greys\")\n", " plt.xlabel(\"target\")\n", " 
plt.ylabel(\"source\")\n", " plt.gca().set_xticks([i for i in range(0, len(target_tokens))])\n", " plt.gca().set_yticks([i for i in range(0, len(source_tokens))])\n", " plt.gca().set_xticklabels(target_tokens)\n", " plt.gca().set_yticklabels(source_tokens)\n", " plt.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "id": "76887048", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "plot_matrix(attention_matrix, target, source)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "08336fef", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "### Using Protobuf format for inference (Suggested for efficient bulk inference)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "e5d55712", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "First, read the vocabulary mappings, since this mode of inference accepts lists of integers and returns lists of integers." 
] }, { "cell_type": "code", "execution_count": null, "id": "6d4bb300", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "import io\n", "import tempfile\n", "from record_pb2 import Record\n", "from create_vocab_proto import (\n", " vocab_from_json,\n", " reverse_vocab,\n", " write_recordio,\n", " list_to_record_bytes,\n", " read_next,\n", ")\n", "\n", "source = vocab_from_json(\"vocab.src.json\")\n", "target = vocab_from_json(\"vocab.trg.json\")\n", "\n", "source_rev = reverse_vocab(source)\n", "target_rev = reverse_vocab(target)" ] }, { "cell_type": "code", "execution_count": null, "id": "3bb99c18", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "sentences = [\n", " \"this is so cool\",\n", " \"i am having dinner .\",\n", " \"i am sitting in an aeroplane .\",\n", " \"come let us go for a long drive .\",\n", "]" ] }, { "attachments": {}, "cell_type": "markdown", "id": "7dd386b3", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "Converting the string to integers, followed by protobuf encoding:" ] }, { "cell_type": "code", "execution_count": null, "id": "ff7f8176", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "# Convert strings to integers using source vocab mapping. 
Out-of-vocabulary strings are mapped to 1, the mapping for the unknown-word token.\n", "sentences = [[source.get(token, 1) for token in sentence.split()] for sentence in sentences]\n", "f = io.BytesIO()\n", "for sentence in sentences:\n", "    record = list_to_record_bytes(sentence, [])\n", "    write_recordio(f, record)" ] }, { "cell_type": "code", "execution_count": null, "id": "b39d0fc6", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "response = runtime.invoke_endpoint(\n", "    EndpointName=endpoint_name, ContentType=\"application/x-recordio-protobuf\", Body=f.getvalue()\n", ")\n", "\n", "response = response[\"Body\"].read()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "21c31712", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "Now, parse the protobuf response and convert each list of integers back to a string:" ] }, { "cell_type": "code", "execution_count": null, "id": "60eb209d", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "def _parse_proto_response(received_bytes):\n", "    # Write the response bytes to a temporary file so the records can be read back from it\n", "    output_file = tempfile.NamedTemporaryFile()\n", "    output_file.write(received_bytes)\n", "    output_file.flush()\n", "    target_sentences = []\n", "    with open(output_file.name, \"rb\") as datum:\n", "        next_record = True\n", "        # Keep reading until read_next returns nothing\n", "        while next_record:\n", "            next_record = read_next(datum)\n", "            if next_record:\n", "                rec = Record()\n", "                rec.ParseFromString(next_record)\n", "                # Each record holds one translated sentence as a list of target-vocabulary integers\n", "                target = list(rec.features[\"target\"].int32_tensor.values)\n", "                target_sentences.append(target)\n", "            else:\n", "                break\n", "    return target_sentences" ] }, { "cell_type": "code", "execution_count": null, "id": "70edfe45", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, 
"start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "targets = _parse_proto_response(response)\n", "resp = [\" \".join([target_rev.get(token, \"\") for token in sentence]) for sentence in targets]\n", "print(resp)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "4faac22e", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "source": [ "## Stop / Close the Endpoint (Optional)\n", "\n", "Finally, we should delete the endpoint before we close the notebook." ] }, { "cell_type": "code", "execution_count": null, "id": "a13afa51", "metadata": { "papermill": { "duration": null, "end_time": null, "exception": null, "start_time": null, "status": "pending" }, "tags": [] }, "outputs": [], "source": [ "sage.delete_endpoint(EndpointName=endpoint_name)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "6b4a5ac9", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This us-west-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This eu-central-1 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This ap-northeast-2 badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n", "\n", "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/introduction_to_amazon_algorithms|seq2seq_translation_en-de|SageMaker-Seq2Seq-Translation-English-German.ipynb)\n" ] } ], "metadata": { "celltoolbar": "Tags", "kernelspec": { "display_name": "Python 3 (Data Science 3.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/sagemaker-data-science-310-v1" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "notice": "Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
See the License for the specific language governing permissions and limitations under the License.", "papermill": { "default_parameters": {}, "duration": 46.346339, "end_time": "2021-06-07T00:17:31.811195", "environment_variables": {}, "exception": true, "input_path": "SageMaker-Seq2Seq-Translation-English-German.ipynb", "output_path": "/opt/ml/processing/output/SageMaker-Seq2Seq-Translation-English-German-2021-06-07-00-12-41.ipynb", "parameters": { "kms_key": "arn:aws:kms:us-west-2:521695447989:key/6e9984db-50cf-4c7e-926c-877ec47a8b25" }, "start_time": "2021-06-07T00:16:45.464856", "version": "2.3.3" } }, "nbformat": 4, "nbformat_minor": 5 }