{
"cells": [
{
"cell_type": "markdown",
"id": "6df43048",
"metadata": {},
"source": [
"# In-context learning with AlexaTM 20B in SageMaker JumpStart"
]
},
{
"cell_type": "markdown",
"id": "3e8c78ed",
"metadata": {},
"source": [
"\n",
"\n",
"---\n",
"\n",
"Welcome to [SageMaker JumpStart](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html)! SageMaker JumpStart is the Machine Learning (ML) Hub of SageMaker that offers 350+ pre-trained models, built-in algorithms, and pre-built solution templates to help customers get started with ML quickly. You can access these models and algorithms in JumpStart programmatically using [SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/overview.html#use-built-in-algorithms-with-pre-trained-models-in-sagemaker-python-sdk) or through the JumpStart UI in [SageMaker Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html). In this notebook, we demonstrate how to use Amazon’s state-of-the-art [Alexa Teacher Model 20 Billion parameter model](https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning) (AlexaTM 20B), through the [SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/overview.html#use-prebuilt-models-with-sagemaker-jumpstart).\n",
"\n",
"AlexaTM 20B is a multitask, multilingual, large-scale sequence-to-sequence (seq2seq) model, trained on a mixture of Common Crawl (mC4) and Wikipedia data across 12 languages, using denoising and Causal Language Modeling (CLM) tasks. It achieves state-of-the-art performance on common in-context language tasks such as 1-shot summarization and 1-shot machine translation, outperforming decoder only models such as Open AI’s GPT3 and Google’s PaLM, that are over eight times bigger. \n",
"\n",
"In-context learning - also known as \"prompting\" - refers to a method where you use a NLP model on a new task without having to fine-tune it. A few task examples are provided to the model only as part of the inference input, a paradigm known as \"few-shot in-context learning\". In some cases, the model can perform well without any training data at all, only given an explanation of what should be predicted. This is called \"zero-shot in-context learning\". A simple example of the latter is passing as input to a multilingual generation model the following sentence: \"Translate from English to French. Original: We should hurry up and get out of there; Translation: \". The model completes the example, prompted to translate the original text due to the setup we provided. To learn more about the model, please check out the Amazon Science [blog post](https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning) or read the [paper](https://arxiv.org/abs/2208.01448). \n",
"\n",
"In this notebook, we demonstrate how to deploy the AlexaTM 20B through the JumpStart API and run inference. Furthermore, we demonstrate how AlexaTM 20B can be used for in-context learning with five example tasks: (i) Text Summarization, (ii) Natural Language generation, (iii) Machine Translation, (iv) Extractive Question Answering, and (v) Natural language Inference and Classification.\n",
"\n",
"\n",
"Licensing: By using this model, you agree to the [Alexa Teacher Model license](https://jumpstart-cache-prod-us-east-2.s3.us-east-2.amazonaws.com/licenses/Alexa20b-License/MODEL_LICENSE.md).\n",
"\n",
"---\n"
]
},
{
"cell_type": "markdown",
"id": "783faff2",
"metadata": {},
"source": [
"## Table of Contents\n",
"\n",
"1. [Set Up](#1.-Set-Up)\n",
"2. [Retrieve jumpstart artifacts & deploy an endpoint](#2.-Retrieve-jumpstart-artifacts-&-deploy-an-endpoint)\n",
"3. [Query endpoint and parse response](#3.-Query-endpoint-and-parse-response)\n",
"4. [In-context learning](#4.-In-context-learning)\n",
" * [1-shot text summarization](#4.1.-1-shot-text-summarization)\n",
" * [1-shot natural language Generation](#4.2.-1-shot-natural-language-generation)\n",
" * [1-shot machine translation](#4.3.-1-shot-machine-translation)\n",
" * [Zero-shot extractive question answering](#4.4.-Zero-shot-extractive-question-answering)\n",
" * [Zero-shot natural language inference and classification](#4.5.-Zero-shot-natural-language-inference-and-classification)\n",
"5. [Clean up the endpoint](#5.-Clean-up-the-endpoint)\n",
"6. [License](#6.-License)"
]
},
{
"cell_type": "markdown",
"id": "976dc431",
"metadata": {},
"source": [
"Note: This notebook was tested on ml.t3.medium instance in Amazon SageMaker Studio with Python 3.10 (Data Science 3.0) kernel. The real-time endpoint instance onto which the model is deployed (ml.g4dn.12xlarge) requires at least 50 GB of CPU memory and 42 GB of GPU memory. CPU instances have not been tested, but they may work.\n",
"\n",
"Please see steps in [Onboard to Amazon SageMaker Domain Using Quick setup](https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-quick-start.html) to make use of SageMaker Studio."
]
},
{
"cell_type": "markdown",
"id": "89774571",
"metadata": {},
"source": [
"## 1. Set Up"
]
},
{
"cell_type": "markdown",
"id": "32fec93c",
"metadata": {},
"source": [
"---\n",
"Before executing the notebook, there are some initial steps required for set up. This notebook requires the latest version of the Sagemaker SDK.\n",
"\n",
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d68a82f",
"metadata": {},
"outputs": [],
"source": [
"!pip install sagemaker boto3 --upgrade --quiet\n",
"!pip show sagemaker | egrep \"Name|Version\"\n",
"!pip show boto3 | egrep \"Name|Version\"\n",
"!python --version"
]
},
{
"cell_type": "markdown",
"id": "2a8808c0",
"metadata": {},
"source": [
"#### Permissions and environment variables\n",
"\n",
"---\n",
"To host on Amazon SageMaker, we need to set up and authenticate the use of AWS services. Here, we use the execution role associated with the current notebook as the AWS account role with SageMaker access. \n",
"\n",
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ce34fc7e",
"metadata": {},
"outputs": [],
"source": [
"import sagemaker, json\n",
"from sagemaker import get_execution_role\n",
"\n",
"aws_role = get_execution_role()"
]
},
{
"cell_type": "markdown",
"id": "c1986865",
"metadata": {},
"source": [
"## 2. Retrieve jumpstart artifacts & deploy an endpoint\n",
"\n",
"***\n",
"\n",
"JumpStart allows you to deploy pre-trained models for inference without having to fine-tune them on a new dataset. This can be done by first fetching `deploy_image_uri` - an optimized deep learning container image - using the `model_id`. SageMaker gives customers the ability to run docker containers extensively for training and inferencing. JumpStart uses these available framework-specific SageMaker Deep Learning Containers (https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html) (DLCs). Then, we fetch the `model_uri` containing the model parameters, along with inference handling scripts and any associated dependencies. Finally, we go on to create a [sagemaker.model.Model](https://sagemaker.readthedocs.io/en/stable/api/inference/model.html) instance in SageMaker and deploy it to a real-time endpoint.\n",
"\n",
"\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f8c28ed",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from datetime import datetime\n",
"from sagemaker import image_uris, model_uris, script_uris, hyperparameters\n",
"from sagemaker.model import Model\n",
"from sagemaker.predictor import Predictor\n",
"from sagemaker.utils import name_from_base\n",
"\n",
"# model_version = \"*\" fetches the latest version of the model\n",
"model_id, model_version = \"pytorch-textgeneration1-alexa20b\", \"*\"\n",
"\n",
"endpoint_name = name_from_base(f\"jumpstart-console-infer-{model_id}\")\n",
"\n",
"endpoint_config_name = \"config-\" + endpoint_name\n",
"\n",
"\n",
"# GPU Instance Reqts: >50 GB of CPU RAM and >42 GB of GPU memory in total\n",
"# Tested with ml.g4dn.12xlarge, ml.p3.8xlarge and ml.p3.16xlarge\n",
"instance_type = \"ml.g4dn.12xlarge\"\n",
"\n",
"# If using an EBS-backed instance, you must specify at least 256 GB of storage\n",
"# If using an instance with local SSD storage, volume_size must be None\n",
"if instance_type == \"ml.g4dn.12xlarge\":\n",
" volume_size = None\n",
"elif instance_type in [\"ml.p3.8xlarge\", \"ml.p3.16xlarge\"]:\n",
" volume_size = 256\n",
"else:\n",
" volume_size = None\n",
" print(\n",
" f\"Instance_type={instance_type} not tested. Setting volume_size = None.\"\n",
" \"If you run into out of space errors and your instance supports EBS storage,\"\n",
" \"please set volume_size = 256.\"\n",
" )\n",
"\n",
"# Retrieve the inference docker container uri. This is the base PyTorch container image.\n",
"deploy_image_uri = image_uris.retrieve(\n",
" region=None,\n",
" framework=None, # automatically inferred from model_id\n",
" image_scope=\"inference\",\n",
" model_id=model_id,\n",
" model_version=model_version,\n",
" instance_type=instance_type,\n",
")\n",
"\n",
"\n",
"# Retrieve the model uri. This includes both pre-trained parameters, inference handling scripts and any dependencies.\n",
"model_uri = model_uris.retrieve(\n",
" model_id=model_id, model_version=model_version, model_scope=\"inference\"\n",
")\n",
"\n",
"env = {\n",
" \"SAGEMAKER_MODEL_SERVER_TIMEOUT\": str(3600),\n",
" \"MODEL_CACHE_ROOT\": \"/opt/ml/model\",\n",
" \"SAGEMAKER_ENV\": \"1\",\n",
" \"SAGEMAKER_SUBMIT_DIRECTORY\": \"/opt/ml/model/code/\",\n",
" \"SAGEMAKER_PROGRAM\": \"inference.py\",\n",
" \"SAGEMAKER_MODEL_SERVER_WORKERS\": \"1\", # without this, there will be one process per GPU\n",
" \"TS_DEFAULT_WORKERS_PER_MODEL\": \"1\", # without this, each worker will have 1/num_gpus the RAM\n",
"}\n",
"\n",
"# Create the SageMaker model instance. Note that we need to pass Predictor class when we deploy model through Model class,\n",
"# for being able to run inference through the sagemaker API.\n",
"model = Model(\n",
" image_uri=deploy_image_uri,\n",
" model_data=model_uri,\n",
" role=aws_role,\n",
" predictor_cls=Predictor,\n",
" name=endpoint_name,\n",
" env=env,\n",
")\n",
"\n",
"print(\"☕ Spinning up the endpoint. This will take a little while ☕\")\n",
"\n",
"# deploy the Model.\n",
"model_predictor = model.deploy(\n",
" initial_instance_count=1,\n",
" instance_type=instance_type,\n",
" endpoint_name=endpoint_name,\n",
" volume_size=volume_size, # Specify the size of the Amazon EBS volume.\n",
" model_data_download_timeout=3600, # Specify the model download timeout in seconds.\n",
" container_startup_health_check_timeout=3600, # Specify the health checkup timeout in seconds\n",
")"
]
},
{
"cell_type": "markdown",
"id": "631a748c",
"metadata": {
"jupyter": {
"outputs_hidden": true
},
"tags": []
},
"source": [
"Note that for some of the JumpStart models, we package model parameters and inference handling scripts separately and supply each to the Model class. However, due to the size of the AlexaTM20B model, this is currently infeasible for AlexaTM 20B. Thus, we store the inference script as well as any dependency along with the model parameters for AlexaTM20B."
]
},
{
"cell_type": "markdown",
"id": "0f7c28e9",
"metadata": {},
"source": [
"## 3. Query endpoint and parse response\n",
"\n",
"---\n",
"The input passed to the endpoint is a json with `input_texts` and other text generation arguments encoded in `utf-8` format. The output of the endpoint is a `json` object containing the generated list of text, `generated_texts`.\n",
"\n",
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8803b608",
"metadata": {},
"outputs": [],
"source": [
"def query(model_predictor, text, generate_kwargs=None, max_num_attempts=5):\n",
" \"\"\"Query the model predictor.\n",
"\n",
" model_predictor: The deployed model pipeline.\n",
" text: a string or list of strings to input to the model pipeline.\n",
" generate_kwargs: A dictionary of generation arguments.\n",
" max_num_attempts: Maximum number of invokation request.\n",
"\n",
" returns: A JSON of the model outputs.\n",
" \"\"\"\n",
"\n",
" payload = {\"text_inputs\": text}\n",
" if generate_kwargs is not None:\n",
" payload.update(generate_kwargs)\n",
"\n",
" encoded_inp = json.dumps(payload).encode(\"utf-8\")\n",
" for _ in range(max_num_attempts):\n",
" try:\n",
" query_response = model_predictor.predict(\n",
" encoded_inp,\n",
" {\"ContentType\": \"application/json\", \"Accept\": \"application/json\"},\n",
" )\n",
" break\n",
" except Exception as e:\n",
" print(\"Invokation request unsuccessful. Retrying.\")\n",
" continue\n",
" return query_response\n",
"\n",
"\n",
"def parse_response(query_response):\n",
" \"\"\"Parse response and return the list of generated texts.\"\"\"\n",
"\n",
" return json.loads(query_response)[\"generated_texts\"]\n",
"\n",
"\n",
"newline, bold, unbold = \"\\n\", \"\\033[1m\", \"\\033[0m\"\n",
"text = \"[CLM]My name is Lewis and I like to\"\n",
"kwargs = {\"num_beams\": 5, \"no_repeat_ngram_size\": 2}\n",
"query_response = query(model_predictor, text, kwargs)\n",
"generated_texts = parse_response(query_response)\n",
"print(f\"Input text: {text}{newline}\" f\"Generated text: {bold}{generated_texts}{unbold}{newline}\")"
]
},
{
"cell_type": "markdown",
"id": "0f83a141",
"metadata": {},
"source": [
"### 3.1 Supported parameters"
]
},
{
"cell_type": "markdown",
"id": "de065dbf",
"metadata": {},
"source": [
"***\n",
"\n",
"We currently support 10 text generation parameters for inference which determine the length of output sequence, number of output sequences, diversity in output sequences and much more. All of these arguments are optional.\n",
"\n",
"* **max_length:** Model generates text until the output length (which includes the input context length) reaches max_length. If specified, it must be a positive integer.\n",
"* **num_return_sequences:** Number of output sequences returned. If specified, it must be a positive integer.\n",
"* **num_beams:** Number of beams used for greedy search. If specified, it must be an integer greater than or equal to num_return_sequences.\n",
"* **no_repeat_ngram_size:** Model ensures that a sequence of words of no_repeat_ngram_size is not repeated in the output sequence. If specified, it must be a positive integer greater than 1.\n",
"* **temperature:** Controls the randomness in the output. Higher temperature results in output sequence with low-probability words and lower temperature results in output sequence with high-probability words. If temperature -> 0, it results in greedy decoding. If specified, it must be a positive float.\n",
"* **early_stopping:** If True, text generation is finished when all beam hypotheses reach the end of sentence token. If specified, it must be boolean.\n",
"* **do_sample:** If True, model samples the next word as per the likelyhood. If specified, it must be boolean.\n",
"* **top_k:** In each step of text generation, sample from only the top_k most likely words. If specified, it must be a positive integer.\n",
"* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability top_p. If specified, it must be a float between 0 and 1.\n",
"* **seed:** Fix the randomized state for reproducibility. If specified, it must be an integer.\n",
"\n",
"You may specify any subset of the parameters mentioned above when invoking an endpoint. For a more detailed guide on how to set these parameters, please see [this article](https://huggingface.co/blog/how-to-generate).\n",
"\n",
"***"
]
},
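{
"cell_type": "markdown",
"id": "b7e41c90",
"metadata": {},
"source": [
"As a rough illustration of the constraints listed above, the sketch below checks a `generate_kwargs` dictionary before it is sent to the endpoint. `validate_generate_kwargs` is a hypothetical helper written for this notebook, not part of the SageMaker SDK or the model container, and the endpoint may apply its own validation. It covers only some of the documented constraints (`no_repeat_ngram_size` and `seed` are not checked)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c52d8a13",
"metadata": {},
"outputs": [],
"source": [
"def validate_generate_kwargs(kwargs):\n",
"    # Hypothetical helper: checks a subset of the constraints documented in section 3.1.\n",
"    def is_pos_int(value):\n",
"        # bool is a subclass of int in Python, so exclude it explicitly.\n",
"        return isinstance(value, int) and not isinstance(value, bool) and value > 0\n",
"\n",
"    for name in ['max_length', 'num_return_sequences', 'num_beams', 'top_k']:\n",
"        if name in kwargs and not is_pos_int(kwargs[name]):\n",
"            raise ValueError(f'{name} must be a positive integer')\n",
"    if 'num_beams' in kwargs and 'num_return_sequences' in kwargs:\n",
"        if kwargs['num_beams'] < kwargs['num_return_sequences']:\n",
"            raise ValueError('num_beams must be >= num_return_sequences')\n",
"    if 'temperature' in kwargs and not kwargs['temperature'] > 0:\n",
"        raise ValueError('temperature must be a positive float')\n",
"    if 'top_p' in kwargs and not 0 <= kwargs['top_p'] <= 1:\n",
"        raise ValueError('top_p must be a float between 0 and 1')\n",
"    for name in ['early_stopping', 'do_sample']:\n",
"        if name in kwargs and not isinstance(kwargs[name], bool):\n",
"            raise ValueError(f'{name} must be boolean')\n",
"    return kwargs\n",
"\n",
"\n",
"print(validate_generate_kwargs({'do_sample': True, 'max_length': 50, 'top_k': 50}))"
]
},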
{
"cell_type": "markdown",
"id": "e86983e5",
"metadata": {},
"source": [
"## 3.2 Details to consider\n",
"\n",
"***\n",
"\n",
"There are a few details to consider as you use the AlexaTM20B model, listed below.\n",
"\n",
"**CLM vs Denoising:** The AlexaTM 20B model was trained with both denoising and Causal Language Modeling (CLM) tasks. Denoising is the default mode, and CLM is activated by prepending each input with `[CLM] `. In most cases, CLM mode is best for in-context learning.\n",
"\n",
"**Delimiters:** Some models use linebreaks (`\\n`) as a delimiter within prompts and/or between shots. The AlexaTM 20B tokenizer removes `\\n`, so it should not be used. Instead, we recommend using `
`, `;`, or `\\\\n`.\n",
"\n",
"**Dummy shots:** Sometimes the model benefits from adding \"dummy shots\" like `Question: Is this question in English?
Answer: Yes`. Some examples are given below.\n",
"\n",
"**Prompt patterns:** For more ideas about prompt patterns, please see [Appendix B of this paper](https://arxiv.org/abs/2208.01448)\n",
"\n",
"**Batching:** Batches of inputs, provided to the `query` function above as a list of strings, may also be used.\n",
"\n",
"*Sometimes the request will time out. Please just try the cell again.*\n",
"\n",
"***\n"
]
},
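{
"cell_type": "markdown",
"id": "d3a97f46",
"metadata": {},
"source": [
"The points above on CLM mode and delimiters can be combined into a small prompt-building sketch. `build_prompt` is a hypothetical helper, not something shipped with the model, and the German and English sentences are made up for illustration. It assumes ` <br> ` as the delimiter between shots and rejects newlines, since the tokenizer removes them."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e8b1640a",
"metadata": {},
"outputs": [],
"source": [
"def build_prompt(shots, test_input, input_label, output_label, delimiter=' <br> '):\n",
"    # Hypothetical helper: joins (input, output) shots and a final test input into one CLM prompt.\n",
"    parts = [f'{input_label}: {inp} {output_label}: {out}' for inp, out in shots]\n",
"    parts.append(f'{input_label}: {test_input} {output_label}:')\n",
"    prompt = '[CLM] ' + delimiter.join(parts)\n",
"    # The tokenizer removes newlines, so they should not appear in prompts.\n",
"    if '\\n' in prompt:\n",
"        raise ValueError('prompt must not contain newline characters')\n",
"    return prompt\n",
"\n",
"\n",
"example_prompt = build_prompt(\n",
"    shots=[('Guten Morgen.', 'Good morning.')],  # made-up shot for illustration\n",
"    test_input='Vielen Dank.',\n",
"    input_label='Sentence',\n",
"    output_label='Translation in English',\n",
")\n",
"print(example_prompt)"
]
},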
{
"cell_type": "markdown",
"id": "fa1a01f4",
"metadata": {},
"source": [
"## 4. In-context learning\n",
"\n",
"---\n",
"In-context learning refers to learning a task without finetuning i.e. changing the model parameters. When performing in-context learning, we provide the language model with a prompt, which consists of training input-output pairs that demonstrate the task objective. We append a test input to the prompt and allow the language model to make predictions by conditioning on the prompt and predicting the next tokens/words. It is a highly effective technique to solve few-shot learning problems, a setting where we learn a task from a few training samples. Unlike prior sequence-to-sequence models, AlexaTM 20B was trained on causal language modeling in addition to denoising, which makes it a model very performant for in-context learning. Next, we show how you can use this model for several 1-shot and zero-shot tasks via in-context learning. \n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "51dcf8bb",
"metadata": {},
"source": [
"### 4.1. 1-shot text summarization\n",
"\n",
"\n",
"\n",
"Text Summarization is the task of shortening the data and creating a summary that represents the most important information present in the original text. 1-shot text summarization refers to the setting where we learn to summarize the text based on a single training sample. Following is a text summarization sample from the [XSUM](https://huggingface.co/datasets/xsum/viewer/default/train) dataset.\n",
"\n",
"***\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05bfce4b",
"metadata": {},
"outputs": [],
"source": [
"train_article = 'The announcement ends months of uncertainty for Cornish Language Partnership staff whose contracts had been due to end. Local government minister Andrew Stunnell said the three-year funding package for the service would help make sure the language survived. But he warned that long term funding should come from Cornwall. He said it was \"important to make sure the Cornish were given the opportunity to put down sound foundations.\" \"In the longer term support for the Cornish language is going to be something which is going to have to be based in Cornwall and will not come from London,\" he added. The Cornish Language Partnership\\'s, Jennifer Lowe, said: \"We can now plan for the future thanks to the funding.\" The United Nations recently upgraded the status of the Cornish language from \"extinct\" to \"critically endangered\". It is thought fewer than 500 people worldwide are fluent in the language.\"'\n",
"train_summary = \"The government is spending nearly £400,000 to help save the Cornish language.\"\n",
"\n",
"test_article = 'Torrents of water brought down a suspended ceiling and damaged stock in the Victoria Centre store at about 22:40 BST on Tuesday. Managers had hoped for a weekend reopening but it is now closed \"until further notice\". Staff have been helping with the clean-up operation. Water poured through from a rooftop room, leaving the top floor under three inches of water and stock \"significantly\" damaged. A spokeswoman said: \"Our teams are working around the clock to get the shop open as quickly as possible and we\\'re sorry for the inconvenience this has caused to our customers.\"'\n",
"test_summary = \"John Lewis's Nottingham store is to remain closed longer than expected after 80,000 litres of hot water leaked from a ruptured heating pipe.\"\n",
"\n",
"prompt = (\n",
" f\"[CLM] {train_article} ==> \"\n",
" f\"Short summary: {train_summary}
\"\n",
" f\"Article: {test_article} ==> Short summary:\"\n",
")\n",
"\n",
"\n",
"query_response = query(model_predictor, text=prompt)\n",
"generated_text = parse_response(query_response)[0]\n",
"\n",
"# Trim using the delimiter\n",
"generated_text = generated_text.split(\"
\")[0].strip()\n",
"\n",
"print(f\"Model input: {prompt}{newline}\")\n",
"print(f\"Model output: {bold}{generated_text}{unbold}{newline}\")\n",
"print(f\"Ground truth: {test_summary}\")"
]
},
{
"cell_type": "markdown",
"id": "5a153495",
"metadata": {},
"source": [
"***\n",
"Next, we demonstrate the same inference request, this time using top k sampling. Recall that in top k sampling, the model only considers the top k most likely words in each step of text generation. You can run this multiple times to get different results, or set a seed to get the same result each time.\n",
"***"
]
},
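{
"cell_type": "markdown",
"id": "f09c2e7b",
"metadata": {},
"source": [
"To make the idea concrete, here is a toy sketch of top-k sampling in plain Python. It is not how the model container implements sampling: the vocabulary and probabilities are made up, and `sample_top_k` is a hypothetical helper. It only shows that words outside the top k can never be drawn, and that a fixed seed makes the draws repeatable."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0a6d53c8",
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"\n",
"\n",
"def sample_top_k(word_probs, k, rng):\n",
"    # Keep only the k most likely words, then sample in proportion to their probabilities.\n",
"    top = sorted(word_probs.items(), key=lambda item: item[1], reverse=True)[:k]\n",
"    words = [word for word, _ in top]\n",
"    weights = [prob for _, prob in top]\n",
"    return rng.choices(words, weights=weights, k=1)[0]\n",
"\n",
"\n",
"# Made-up next-word distribution, for illustration only.\n",
"toy_probs = {'store': 0.5, 'shop': 0.3, 'centre': 0.15, 'banana': 0.05}\n",
"toy_rng = random.Random(0)  # fixing the seed makes the draws reproducible\n",
"toy_samples = [sample_top_k(toy_probs, k=2, rng=toy_rng) for _ in range(5)]\n",
"print(toy_samples)"
]
},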
{
"cell_type": "code",
"execution_count": null,
"id": "4d8ce40e",
"metadata": {},
"outputs": [],
"source": [
"train_article = 'The announcement ends months of uncertainty for Cornish Language Partnership staff whose contracts had been due to end. Local government minister Andrew Stunnell said the three-year funding package for the service would help make sure the language survived. But he warned that long term funding should come from Cornwall. He said it was \"important to make sure the Cornish were given the opportunity to put down sound foundations.\" \"In the longer term support for the Cornish language is going to be something which is going to have to be based in Cornwall and will not come from London,\" he added. The Cornish Language Partnership\\'s, Jennifer Lowe, said: \"We can now plan for the future thanks to the funding.\" The United Nations recently upgraded the status of the Cornish language from \"extinct\" to \"critically endangered\". It is thought fewer than 500 people worldwide are fluent in the language.\"'\n",
"train_summary = \"The government is spending nearly £400,000 to help save the Cornish language.\"\n",
"\n",
"test_article = 'Torrents of water brought down a suspended ceiling and damaged stock in the Victoria Centre store at about 22:40 BST on Tuesday. Managers had hoped for a weekend reopening but it is now closed \"until further notice\". Staff have been helping with the clean-up operation. Water poured through from a rooftop room, leaving the top floor under three inches of water and stock \"significantly\" damaged. A spokeswoman said: \"Our teams are working around the clock to get the shop open as quickly as possible and we\\'re sorry for the inconvenience this has caused to our customers.\"'\n",
"test_summary = \"John Lewis's Nottingham store is to remain closed longer than expected after 80,000 litres of hot water leaked from a ruptured heating pipe.\"\n",
"\n",
"prompt = (\n",
" f\"[CLM] {train_article} ==> \"\n",
" f\"Short summary: {train_summary}
\"\n",
" f\"Article: {test_article} ==> Short summary:\"\n",
")\n",
"\n",
"generate_kwargs = {\"do_sample\": True, \"max_length\": 50, \"top_k\": 50}\n",
"\n",
"query_response = query(model_predictor, text=prompt, generate_kwargs=generate_kwargs)\n",
"generated_text = parse_response(query_response)[0]\n",
"\n",
"# Trim using the delimiter\n",
"generated_text = generated_text.split(\"
\")[0].strip()\n",
"\n",
"print(f\"Model input: {prompt}{newline}\")\n",
"print(f\"Model output: {bold}{generated_text}{unbold}{newline}\")\n",
"print(f\"Ground truth: {test_summary}\")"
]
},
{
"cell_type": "markdown",
"id": "2e9493ba",
"metadata": {},
"source": [
"## 4.2. 1-shot natural language generation\n",
"\n",
"Natural language generation is the task of producing text narratives given the input text. The following sample shows a training sample from the [E2E dataset](https://huggingface.co/datasets/e2e_nlg)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "320b5a7c",
"metadata": {},
"outputs": [],
"source": [
"train_inp = \"name[The Punter], food[Indian], priceRange[cheap]\"\n",
"train_out = \"The Punter provides Indian food in the cheap price range.\"\n",
"\n",
"test_inp = \"name[Blue Spice], eatType[coffee shop], area[city centre]\"\n",
"test_out = \"Blue Spice is a coffee shop in city centre.\"\n",
"\n",
"prompt = (\n",
" f\"[CLM] {train_inp} ==> \"\n",
" f\"senetence describing the place: {train_out} ; \"\n",
" f\"{test_inp} ==> sentence describing the place:\"\n",
")\n",
"\n",
"query_response = query(model_predictor, prompt)\n",
"generated_text = parse_response(query_response)[0]\n",
"\n",
"# Trim using the delimiter\n",
"generated_text = generated_text.split(\";\")[0]\n",
"\n",
"print(f\"Model input: {prompt}{newline}\")\n",
"print(f\"Model output: {bold}{generated_text}{unbold}{newline}\")\n",
"print(f\"Ground truth: {test_out}\")"
]
},
{
"cell_type": "markdown",
"id": "80a91e66",
"metadata": {},
"source": [
"### 4.3. 1-shot machine translation\n",
"***\n",
"Machine Translation is the task of translating text from one language to another. The following example shows a training sample from the [WMT19 dataset](https://huggingface.co/datasets/wmt19), where we need to translate from German to English.\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d8fba10",
"metadata": {},
"outputs": [],
"source": [
"train_inp = \"Das Parlament erhebt sich zu einer Schweigeminute.\"\n",
"train_out = \"The House rose and observed a minute' s silence\"\n",
"\n",
"test_inp = \"Kleingärtner bewirtschaften den einstigen Grund von Bauern.\"\n",
"test_out = \"Allotment holders cultivate the soil of former farmers.\"\n",
"\n",
"prompt = (\n",
" f\"[CLM] Sentence: {train_inp}; \"\n",
" f\"Translation in English: {train_out}; \"\n",
" f\"Sentence: {test_inp}; \"\n",
" \"Translation in English:\"\n",
")\n",
"\n",
"generate_kwargs = {\"num_return_sequences\": 1, \"num_beams\": 10}\n",
"\n",
"query_response = query(model_predictor, text=prompt, generate_kwargs=generate_kwargs)\n",
"generated_text = parse_response(query_response)[0]\n",
"\n",
"# Trim using the delimiter\n",
"generated_text = generated_text.split(\";\")[0]\n",
"\n",
"print(f\"Model input: {prompt}{newline}\")\n",
"print(f\"Model output: {bold}{generated_text}{unbold}{newline}\")\n",
"print(f\"Ground truth: {test_out}\")"
]
},
{
"cell_type": "markdown",
"id": "1f6c5f69",
"metadata": {},
"source": [
"### 4.4. Zero-shot extractive question answering\n",
"\n",
"***\n",
"Extractive question answering is the task of finding the answer to a question from the context paragraph. Here is an example of a context and a question from the [SQuAD v2 dataset](https://huggingface.co/datasets/squad_v2).\n",
"***\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9137cba0",
"metadata": {},
"outputs": [],
"source": [
"test_context = 'The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse (\"Norman\" comes from \"Norseman\") raiders and pirates from Denmark, Iceland and Norway who, under their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations of assimilation and mixing with the native Frankish and Roman-Gaulish populations, their descendants would gradually merge with the Carolingian-based cultures of West Francia. The distinct cultural and ethnic identity of the Normans emerged initially in the first half of the 10th century, and it continued to evolve over the succeeding centuries.'\n",
"test_question = \"In what country is Normandy located?\"\n",
"test_answer = [\"France\", \"France\", \"France\", \"France\"]\n",
"\n",
"prompt = (\n",
" f\"[CLM] Context: {test_context} \"\n",
" \"Question: What is the last word in the passage? \"\n",
" f\"Answer: {test_context.split()[-1].strip('.').strip('?')}; \"\n",
" f\"Question: {test_question} \"\n",
" \"Answer:\"\n",
")\n",
"\n",
"\n",
"query_response = query(model_predictor, prompt)\n",
"generated_text = parse_response(query_response)[0]\n",
"\n",
"# Trim using the delimiter\n",
"generated_text = generated_text.split(\";\")[0]\n",
"\n",
"print(f\"Model input: {prompt}{newline}\")\n",
"print(f\"Model output: {bold}{generated_text}{unbold}{newline}\")\n",
"print(f\"Ground truth: {test_answer}\")"
]
},
{
"cell_type": "markdown",
"id": "13e6feca",
"metadata": {},
"source": [
"### 4.5. Zero-shot natural language inference and classification\n",
"\n",
"***\n",
"\n",
"Natural language inference is a task where, given a premise, we need to determine if a hypothesis is true, false or undetermined (neutral). Here is an example from the [CB dataset](https://huggingface.co/datasets/super_glue/viewer/cb/test).\n",
"\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6c52691",
"metadata": {},
"outputs": [],
"source": [
"test_context = \"When Matt married Inez I was left high and dry. A year later I met a sales rep for an agricultural firm and married him. You could say it was on the rebound but I had no reason to regret it.\"\n",
"test_hypothesis = \"the marriage was on the rebound\"\n",
"test_answer = \"True\"\n",
"\n",
"prompt = (\n",
" f\"[CLM] Context: {test_context}
\"\n",
" \"Question: Is this passage written in English?
Answer (True, False, or Neither): True
\"\n",
" \"Question: Is this passage written in French?
Answer (True, False, or Neither): False
\"\n",
" f\"Question: {test_hypothesis}
\"\n",
" \"Answer (True, False, or Neither):\"\n",
")\n",
"\n",
"\n",
"query_response = query(model_predictor, prompt)\n",
"generated_text = parse_response(query_response)[0]\n",
"\n",
"# Trim using the delimiter\n",
"generated_text = generated_text.split(\"
\")[0].strip()\n",
"\n",
"print(f\"Model input: {prompt}{newline}\")\n",
"print(f\"Model output: {bold}{generated_text}{unbold}{newline}\")\n",
"print(f\"Ground truth: {test_answer}\")"
]
},
{
"cell_type": "markdown",
"id": "48bfef9e",
"metadata": {},
"source": [
"## 5. Clean up the endpoint"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38041fac",
"metadata": {},
"outputs": [],
"source": [
"# Delete the SageMaker endpoint\n",
"model_predictor.delete_model()\n",
"model_predictor.delete_endpoint()"
]
},
{
"cell_type": "markdown",
"id": "2d0f49e5",
"metadata": {},
"source": [
"## 6. License\n",
"\n",
"---\n",
"\n",
"There are two separate licenses: one for the code (Apache 2.0) and one for any usage of the model weights (Alexa Teacher Model License Agreement), which can be seen below.\n",
"\n",
"\n",
" Licenses:
\n",
"\n",
"\n",
"## Code License\n",
" Apache License\n",
" Version 2.0, January 2004\n",
" http://www.apache.org/licenses/\n",
"\n",
" \n",
" TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n",
"\n",
" 1. Definitions.\n",
"\n",
" \"License\" shall mean the terms and conditions for use, reproduction,\n",
" and distribution as defined by Sections 1 through 9 of this document.\n",
"\n",
" \"Licensor\" shall mean the copyright owner or entity authorized by\n",
" the copyright owner that is granting the License.\n",
"\n",
" \"Legal Entity\" shall mean the union of the acting entity and all\n",
" other entities that control, are controlled by, or are under common\n",
" control with that entity. For the purposes of this definition,\n",
" \"control\" means (i) the power, direct or indirect, to cause the\n",
" direction or management of such entity, whether by contract or\n",
" otherwise, or (ii) ownership of fifty percent (50%) or more of the\n",
" outstanding shares, or (iii) beneficial ownership of such entity.\n",
"\n",
" \"You\" (or \"Your\") shall mean an individual or Legal Entity\n",
" exercising permissions granted by this License.\n",
"\n",
" \"Source\" form shall mean the preferred form for making modifications,\n",
" including but not limited to software source code, documentation\n",
" source, and configuration files.\n",
"\n",
" \"Object\" form shall mean any form resulting from mechanical\n",
" transformation or translation of a Source form, including but\n",
" not limited to compiled object code, generated documentation,\n",
" and conversions to other media types.\n",
"\n",
" \"Work\" shall mean the work of authorship, whether in Source or\n",
" Object form, made available under the License, as indicated by a\n",
" copyright notice that is included in or attached to the work\n",
" (an example is provided in the Appendix below).\n",
"\n",
" \"Derivative Works\" shall mean any work, whether in Source or Object\n",
" form, that is based on (or derived from) the Work and for which the\n",
" editorial revisions, annotations, elaborations, or other modifications\n",
" represent, as a whole, an original work of authorship. For the purposes\n",
" of this License, Derivative Works shall not include works that remain\n",
" separable from, or merely link (or bind by name) to the interfaces of,\n",
" the Work and Derivative Works thereof.\n",
"\n",
" \"Contribution\" shall mean any work of authorship, including\n",
" the original version of the Work and any modifications or additions\n",
" to that Work or Derivative Works thereof, that is intentionally\n",
" submitted to Licensor for inclusion in the Work by the copyright owner\n",
" or by an individual or Legal Entity authorized to submit on behalf of\n",
" the copyright owner. For the purposes of this definition, \"submitted\"\n",
" means any form of electronic, verbal, or written communication sent\n",
" to the Licensor or its representatives, including but not limited to\n",
" communication on electronic mailing lists, source code control systems,\n",
" and issue tracking systems that are managed by, or on behalf of, the\n",
" Licensor for the purpose of discussing and improving the Work, but\n",
" excluding communication that is conspicuously marked or otherwise\n",
" designated in writing by the copyright owner as \"Not a Contribution.\"\n",
"\n",
" \"Contributor\" shall mean Licensor and any individual or Legal Entity\n",
" on behalf of whom a Contribution has been received by Licensor and\n",
" subsequently incorporated within the Work.\n",
"\n",
" 2. Grant of Copyright License. Subject to the terms and conditions of\n",
" this License, each Contributor hereby grants to You a perpetual,\n",
" worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n",
" copyright license to reproduce, prepare Derivative Works of,\n",
" publicly display, publicly perform, sublicense, and distribute the\n",
" Work and such Derivative Works in Source or Object form.\n",
"\n",
" 3. Grant of Patent License. Subject to the terms and conditions of\n",
" this License, each Contributor hereby grants to You a perpetual,\n",
" worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n",
" (except as stated in this section) patent license to make, have made,\n",
" use, offer to sell, sell, import, and otherwise transfer the Work,\n",
" where such license applies only to those patent claims licensable\n",
" by such Contributor that are necessarily infringed by their\n",
" Contribution(s) alone or by combination of their Contribution(s)\n",
" with the Work to which such Contribution(s) was submitted. If You\n",
" institute patent litigation against any entity (including a\n",
" cross-claim or counterclaim in a lawsuit) alleging that the Work\n",
" or a Contribution incorporated within the Work constitutes direct\n",
" or contributory patent infringement, then any patent licenses\n",
" granted to You under this License for that Work shall terminate\n",
" as of the date such litigation is filed.\n",
"\n",
" 4. Redistribution. You may reproduce and distribute copies of the\n",
" Work or Derivative Works thereof in any medium, with or without\n",
" modifications, and in Source or Object form, provided that You\n",
" meet the following conditions:\n",
"\n",
" (a) You must give any other recipients of the Work or\n",
" Derivative Works a copy of this License; and\n",
"\n",
" (b) You must cause any modified files to carry prominent notices\n",
" stating that You changed the files; and\n",
"\n",
" (c) You must retain, in the Source form of any Derivative Works\n",
" that You distribute, all copyright, patent, trademark, and\n",
" attribution notices from the Source form of the Work,\n",
" excluding those notices that do not pertain to any part of\n",
" the Derivative Works; and\n",
"\n",
" (d) If the Work includes a \"NOTICE\" text file as part of its\n",
" distribution, then any Derivative Works that You distribute must\n",
" include a readable copy of the attribution notices contained\n",
" within such NOTICE file, excluding those notices that do not\n",
" pertain to any part of the Derivative Works, in at least one\n",
" of the following places: within a NOTICE text file distributed\n",
" as part of the Derivative Works; within the Source form or\n",
" documentation, if provided along with the Derivative Works; or,\n",
" within a display generated by the Derivative Works, if and\n",
" wherever such third-party notices normally appear. The contents\n",
" of the NOTICE file are for informational purposes only and\n",
" do not modify the License. You may add Your own attribution\n",
" notices within Derivative Works that You distribute, alongside\n",
" or as an addendum to the NOTICE text from the Work, provided\n",
" that such additional attribution notices cannot be construed\n",
" as modifying the License.\n",
"\n",
" You may add Your own copyright statement to Your modifications and\n",
" may provide additional or different license terms and conditions\n",
" for use, reproduction, or distribution of Your modifications, or\n",
" for any such Derivative Works as a whole, provided Your use,\n",
" reproduction, and distribution of the Work otherwise complies with\n",
" the conditions stated in this License.\n",
"\n",
" 5. Submission of Contributions. Unless You explicitly state otherwise,\n",
" any Contribution intentionally submitted for inclusion in the Work\n",
" by You to the Licensor shall be under the terms and conditions of\n",
" this License, without any additional terms or conditions.\n",
" Notwithstanding the above, nothing herein shall supersede or modify\n",
" the terms of any separate license agreement you may have executed\n",
" with Licensor regarding such Contributions.\n",
"\n",
" 6. Trademarks. This License does not grant permission to use the trade\n",
" names, trademarks, service marks, or product names of the Licensor,\n",
" except as required for reasonable and customary use in describing the\n",
" origin of the Work and reproducing the content of the NOTICE file.\n",
"\n",
" 7. Disclaimer of Warranty. Unless required by applicable law or\n",
" agreed to in writing, Licensor provides the Work (and each\n",
" Contributor provides its Contributions) on an \"AS IS\" BASIS,\n",
" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n",
" implied, including, without limitation, any warranties or conditions\n",
" of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n",
" PARTICULAR PURPOSE. You are solely responsible for determining the\n",
" appropriateness of using or redistributing the Work and assume any\n",
" risks associated with Your exercise of permissions under this License.\n",
"\n",
" 8. Limitation of Liability. In no event and under no legal theory,\n",
" whether in tort (including negligence), contract, or otherwise,\n",
" unless required by applicable law (such as deliberate and grossly\n",
" negligent acts) or agreed to in writing, shall any Contributor be\n",
" liable to You for damages, including any direct, indirect, special,\n",
" incidental, or consequential damages of any character arising as a\n",
" result of this License or out of the use or inability to use the\n",
" Work (including but not limited to damages for loss of goodwill,\n",
" work stoppage, computer failure or malfunction, or any and all\n",
" other commercial damages or losses), even if such Contributor\n",
" has been advised of the possibility of such damages.\n",
"\n",
" 9. Accepting Warranty or Additional Liability. While redistributing\n",
" the Work or Derivative Works thereof, You may choose to offer,\n",
" and charge a fee for, acceptance of support, warranty, indemnity,\n",
" or other liability obligations and/or rights consistent with this\n",
" License. However, in accepting such obligations, You may act only\n",
" on Your own behalf and on Your sole responsibility, not on behalf\n",
" of any other Contributor, and only if You agree to indemnify,\n",
" defend, and hold each Contributor harmless for any liability\n",
" incurred by, or claims asserted against, such Contributor by reason\n",
" of your accepting any such warranty or additional liability.\n",
"\n",
" END OF TERMS AND CONDITIONS\n",
"\n",
" APPENDIX: How to apply the Apache License to your work.\n",
"\n",
" To apply the Apache License to your work, attach the following\n",
" boilerplate notice, with the fields enclosed by brackets \"[]\"\n",
" replaced with your own identifying information. (Don't include\n",
" the brackets!) The text should be enclosed in the appropriate\n",
" comment syntax for the file format. We also recommend that a\n",
" file or class name and description of purpose be included on the\n",
" same \"printed page\" as the copyright notice for easier\n",
" identification within third-party archives.\n",
"\n",
" Copyright 2022 Amazon.com Services LLC or its respective affiliates\n",
"\n",
" Licensed under the Apache License, Version 2.0 (the \"License\");\n",
" you may not use this file except in compliance with the License.\n",
" You may obtain a copy of the License at\n",
"\n",
" http://www.apache.org/licenses/LICENSE-2.0\n",
"\n",
" Unless required by applicable law or agreed to in writing, software\n",
" distributed under the License is distributed on an \"AS IS\" BASIS,\n",
" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
" See the License for the specific language governing permissions and\n",
" limitations under the License.\n",
"\n",
"\n",
"## Model License\n",
" \n",
"# Alexa Teacher Model License Agreement\n",
"This is a license agreement (“License Agreement”) between the individual or entity that uses any of the Materials as defined below (“you”) and Amazon.com Services LLC or its respective affiliates that make such Materials available under this License Agreement (individually and collectively, “Amazon”). If you copy (including download), use, reproduce, distribute, or create derivative works of any of the Materials, you accept and agree to be bound by this License Agreement and represent that you have the authority to bind yourself or the entity you represent to this License Agreement.\n",
"### 1. Definitions\n",
"“Materials” means software, code, sample code, templates, libraries, documentation, data, information, model weights, and any other materials made available under this License Agreement.\n",
"“Non-Commercial” means not primarily or in substantial part intended for or directed towards obtaining commercial advantage or monetary compensation.\n",
"### 2. License\n",
"Amazon grants you a perpetual, irrevocable, non-exclusive, royalty-free, non-transferable, non-sub-licensable license to copy, use, reproduce, distribute, and create derivative works of the Materials for Non-Commercial purposes only.\n",
"### 3. Limitations\n",
"You may use the Materials only as expressly authorized by this License Agreement. You may use the Materials, derivative works of the Materials, or any other materials that are derived from or based on the Materials for Non-Commercial purposes only. You may not: (a) circumvent or disable any copy protection, security, or other controls in the Materials; (b) offer or impose any terms on the Materials that alter or are inconsistent with the terms of this License Agreement; (c) remove, modify, or obscure any copyright, trademark, or other proprietary or attribution notices on or in any Materials; or (d) direct, encourage, or assist any third party to take any action prohibited by this License Agreement.\n",
"### 4. Attribution\n",
"Any publication or research submitted for publication, academic or otherwise, that is based in whole or in part on the Materials or use of the Materials must include a citation to the Materials in accordance with reasonable academic standards.\n",
"### 5. Distribution\n",
"If you distribute any copies of the Materials, you must include with the Materials: (1) a copy of this License Agreement and (2) the following notice: “Alexa Teacher Model is licensed under the Alexa Teacher Model License Agreement, Copyright (c) Amazon.com, Inc. or its affiliates. All Rights Reserved.”\n",
"### 6. Reservation of Rights; Feedback; Derivative Works\n",
"The Materials include the intellectual property of Amazon or its licensors. Except for the rights explicitly granted to you in this License Agreement, all right, title, and interest in and to the Materials is reserved and retained by Amazon and its licensors. If you make suggestions, ideas, or other feedback available to Amazon relating to the Materials, Amazon may exercise all rights in such feedback without restriction and without compensating you.\n",
"### 7. Disclaimers and Limitation on Liability\n",
"THE MATERIALS ARE PROVIDED “AS IS” WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, OR NON-INFRINGEMENT. YOUR USE OF THE MATERIALS IS AT YOUR SOLE RISK. IN NO EVENT WILL AMAZON, ITS LICENSORS, OR ANY OF THEIR RESPECTIVE AFFILIATES OR SUPPLIERS BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, CONSEQUENTIAL, SPECIAL, INCIDENTAL, PUNITIVE, OR EXEMPLARY DAMAGES (INCLUDING FOR ANY LOSS OF GOODWILL, BUSINESS INTERRUPTION, LOST PROFITS OR DATA, COST OF COVER, OR COMPUTER FAILURE OR MALFUNCTION) ARISING FROM OR RELATING TO THE MATERIALS OR THIS LICENSE AGREEMENT, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, EVEN IF AMAZON HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. THESE LIMITATIONS AND DISCLAIMERS APPLY EXCEPT TO THE EXTENT PROHIBITED BY APPLICABLE LAW.\n",
"### 8. Compliance with Laws\n",
"You will comply with all applicable laws, rules, regulations, orders, and other requirements of governmental agencies or courts (together, “Laws”) in your use of the Materials, including but not limited to all applicable export, re-export, and import Laws of the United States and other countries. You will not use the Materials in any way that infringes, misappropriates, or otherwise violates Amazon’s rights or the rights of any third party.\n",
"### 9. Termination\n",
"If you violate any term of this License Agreement, your rights under this License Agreement will terminate immediately.\n",
"### 10. General\n",
"If any provision of this License Agreement is held invalid by a court with jurisdiction over the parties to this License Agreement, such provision will be deemed to be restated to reflect as nearly as possible the original intentions of the parties in accordance with applicable law, and the remainder of this License Agreement will remain in full force and effect. This License Agreement is governed by the laws of the State of Washington, without reference to rules governing choice of laws, and you irrevocably consent to the exclusive jurisdiction and venue of the federal and state courts located in King County, Washington. However, Amazon may seek injunctive (or similar) remedies in any jurisdiction. This License Agreement supersedes all prior or contemporaneous representations, understandings, agreements, or communications between you and Amazon, whether written or verbal, regarding the subject matter of this License Agreement.\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "04daede2",
"metadata": {},
"source": [
"## Citation\n",
"\n",
"---\n",
"\n",
"If you use this model in conjunction with a publication, please cite our paper:\n",
"\n",
"```\n",
"@article{soltan2022alexatm,\n",
" title={AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2seq Model},\n",
" author={Saleh Soltan and Shankar Ananthakrishnan and Jack FitzGerald and Rahul Gupta and Wael Hamza and Haidar Khan and Charith Peris and Stephen Rawls and Andy Rosenbaum and Anna Rumshisky and Chandana Satya Prakash and Mukund Sridhar and Fabian Triefenbach and Apurv Verma and Gokhan Tur and Prem Natarajan},\n",
" year={2022}\n",
"}\n",
"```\n",
"\n",
"---"
]
}
],
"metadata": {
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "conda_amazonei_mxnet_p36",
"language": "python",
"name": "conda_amazonei_mxnet_p36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}