{ "cells": [ { "cell_type": "markdown", "id": "4a1a52b6", "metadata": {}, "source": [ "# SageMaker JumpStart Foundation Models - BloomZ 176B FP16 Few-shot and Zero-shot learning" ] }, { "cell_type": "markdown", "id": "14d6b75f", "metadata": {}, "source": [ "---\n", "\n", "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "5acea92d", "metadata": {}, "source": [ "---\n", "Welcome to Amazon [SageMaker JumpStart](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html)! You can use SageMaker JumpStart to solve many machine learning tasks with one click in SageMaker Studio, or through the [SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/overview.html#use-prebuilt-models-with-sagemaker-jumpstart).\n", "\n", "\n", "In this demo notebook, we demonstrate how to use the SageMaker Python SDK to deploy a foundation model as an endpoint and use it for various NLP tasks. The foundation models shown here perform **Text Generation**: they take a text string as input and predict the next few words in the sequence. These models can, for example, fill in incomplete text or paraphrase.\n", "\n", "Here, we show how to use the state-of-the-art instruction-tuned **BloomZ 176B model** from [Hugging Face](https://huggingface.co/bigscience/bloomz) for **Text Generation** in the following tasks. You can use the **BloomZ 176B model** with **few-shot learning** and **zero-shot learning** for many NLP tasks, without fine-tuning the model. 
\n", "\n", "**Few-shot learning** allows a machine learning model to make predictions from only a handful of examples. There is no need to train a new model, because models like the **BloomZ 176B model** have enough parameters to adapt to many contexts without being retrained. **Zero-shot learning** goes one step further and relies on an instruction alone, with no examples in the prompt. Below are the tasks that this notebook solves with text generation, grouped by method.\n", "\n", "* **Few-shot learning**\n", "    * **Text summarization**\n", "    * **Code generation**\n", "    * **Named entity recognition (NER)**\n", "    * **Question answering**\n", "    * **Grammar and spelling correction**\n", "    * **Product description and generalization**\n", "    * **Sentence / sentiment classification**\n", "    * **Chatbot and conversational AI**\n", "    * **Tweet generation**\n", "    * **Machine translation**\n", "    * **Intent classification**\n", "* **Zero-shot learning**\n", "    * **Multilingual text / sentiment classification**\n", "    * **Multilingual question answering**\n", "    * **Code generation**\n", "    * **Paragraph rephrasing**\n", "    * **Summarization**\n", "    * **Common sense reasoning / natural language inference**\n", "    * **Question answering**\n", "    * **Sentence / sentiment classification**\n", "    * **Imaginary article generation based on a title**\n", "    * **Title generation based on an article**\n", " \n", "**Note: Deploying this model requires an `ml.p4de.24xlarge` instance and the deployment usually takes about 1 hour.** \n", "\n", "**If you don't have quota for such an instance, please go to AWS console -> Service Quotas -> AWS services to request it.**\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "815c0bc7", "metadata": {}, "source": [ "1. [Set Up](#1.-Set-Up)\n", "2. [Select a pre-trained model](#2.-Select-a-pre-trained-model)\n", "3. [Retrieve Artifacts & Deploy an Endpoint](#3.-Retrieve-Artifacts-&-Deploy-an-Endpoint)\n", "4. [Query endpoint and parse response](#4.-Query-endpoint-and-parse-response)\n", "5. 
[Advanced features: How to use various advanced parameters to control the generated text](#5.-Advanced-features:-How-to-use-various-advanced-parameters-to-control-the-generated-text)\n", "6. [Advanced features: How to use prompt engineering to solve different tasks](#6.-Advanced-features:-How-to-use-prompts-engineering-to-solve-different-few-shot-and-zero-shot-learning-NLP-tasks)\n", "7. [Clean up the endpoint](#5.-Clean-up-the-endpoint)" ] }, { "cell_type": "markdown", "id": "a7e35194", "metadata": {}, "source": [ "Note: This notebook was tested on an ml.t3.medium instance in Amazon SageMaker Studio with the Python 3 (Data Science) kernel and in an Amazon SageMaker Notebook instance with the conda_python3 kernel." ] }, { "cell_type": "markdown", "id": "d2f8dfad", "metadata": {}, "source": [ "### 1. Set Up" ] }, { "cell_type": "markdown", "id": "32f31be0", "metadata": {}, "source": [ "---\n", "Before executing the notebook, a few initial setup steps are required. This notebook requires ipywidgets.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": null, "id": "eb67d497", "metadata": { "tags": [] }, "outputs": [], "source": [ "!pip install ipywidgets==7.0.0 --quiet\n", "!pip install --upgrade sagemaker --quiet" ] }, { "cell_type": "markdown", "id": "769f5d81", "metadata": {}, "source": [ "#### Permissions and environment variables\n", "\n", "---\n", "To host on Amazon SageMaker, we need to set up and authenticate the use of AWS services. Here, we use the execution role associated with the current notebook as the AWS account role with SageMaker access. 
\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 5, "id": "67131eee", "metadata": { "tags": [] }, "outputs": [], "source": [ "import sagemaker, boto3, json\n", "from sagemaker.session import Session\n", "\n", "sagemaker_session = Session()\n", "aws_role = sagemaker_session.get_caller_identity_arn()\n", "aws_region = boto3.Session().region_name\n", "sess = sagemaker.Session()" ] }, { "cell_type": "markdown", "id": "69849d02", "metadata": {}, "source": [ "### 2. Select a pre-trained model\n", "***\n", "You can continue with the default model, or choose a different model from the dropdown generated upon running the next cell. A complete list of SageMaker pre-trained models can also be accessed at [SageMaker pre-trained models](https://sagemaker.readthedocs.io/en/stable/doc_utils/pretrainedmodels.html#).\n", "***" ] }, { "cell_type": "code", "execution_count": 11, "id": "652b2d4f", "metadata": { "jumpStartAlterations": [ "modelIdVersion" ], "tags": [] }, "outputs": [], "source": [ "model_id, model_version, = (\n", "    \"huggingface-textgeneration1-bloomz-176b-fp16\",\n", "    \"*\",\n", ")" ] }, { "cell_type": "markdown", "id": "170e1228", "metadata": {}, "source": [ "***\n", "[Optional] Select a different SageMaker pre-trained model. Here, we download the model_manifest file from the Built-In Algorithms S3 bucket, filter out all the Text Generation models and select a model for inference.\n", "\n", "The notebook in the following sections uses [**BloomZ 176B**](https://huggingface.co/bigscience/bloomz) as an example. You are welcome to try [**BloomZ 7b1**](https://huggingface.co/bigscience/bloomz-7b1), [**BloomZ 3b**](https://huggingface.co/bigscience/bloomz-3b), and many others yourself. 
Please modify `model_id` based on the dropdown list shown below.\n", "***" ] }, { "cell_type": "code", "execution_count": 12, "id": "0d8a1f7e", "metadata": { "tags": [] }, "outputs": [], "source": [ "from ipywidgets import Dropdown\n", "from sagemaker.jumpstart.notebook_utils import list_jumpstart_models\n", "\n", "# Retrieve all Text Generation models available in SageMaker Built-In Algorithms.\n", "filter_value = \"task == textgeneration1\"\n", "text_generation_models = list_jumpstart_models(filter=filter_value)\n", "\n", "# Display the model IDs in a dropdown to select a model for inference.\n", "model_dropdown = Dropdown(\n", "    options=text_generation_models,\n", "    value=model_id,\n", "    description=\"Select a model\",\n", "    style={\"description_width\": \"initial\"},\n", "    layout={\"width\": \"max-content\"},\n", ")" ] }, { "cell_type": "markdown", "id": "a28d45e5", "metadata": {}, "source": [ "#### Choose a model for Inference" ] }, { "cell_type": "code", "execution_count": 13, "id": "52b7a67a", "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5c0159c195e5444ab895b03be09730e3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "A Jupyter Widget" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(model_dropdown)" ] }, { "cell_type": "code", "execution_count": 14, "id": "271642dc", "metadata": { "tags": [] }, "outputs": [], "source": [ "# model_version=\"*\" fetches the latest version of the model\n", "model_id, model_version = model_dropdown.value, \"*\"" ] }, { "cell_type": "markdown", "id": "0b08aa4a", "metadata": {}, "source": [ "### 3. Retrieve Artifacts & Deploy an Endpoint\n", "\n", "***\n", "\n", "Using SageMaker, we can perform inference on the pre-trained model, even without fine-tuning it first on a new dataset. We start by retrieving the `deploy_image_uri`, `deploy_source_uri`, and `model_uri` for the pre-trained model. 
To host the pre-trained model, we create a `JumpStartModel`, a subclass of [`sagemaker.model.Model`](https://sagemaker.readthedocs.io/en/stable/api/inference/model.html), and deploy it. For a model of this size, deployment takes about an hour.\n", "\n", "***" ] }, { "cell_type": "code", "execution_count": 8, "id": "acff6c2a", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "data": { "text/plain": [ "'huggingface-textgeneration1-bloomz-176b-fp16'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_id" ] }, { "cell_type": "markdown", "id": "d0b40890-567f-4d3d-bc68-a0f1dbd2bace", "metadata": {}, "source": [ "**Now we deploy the model using the simplified JumpStart SDK with the following few lines of code.**\n", "\n", "**Note: Deploying this model requires an `ml.p4de.24xlarge` instance and the deployment usually takes about 1 hour.**\n", "\n", "**If you don't have quota for such an instance, please go to AWS console -> Service Quotas -> AWS services to request it.**" ] }, { "cell_type": "code", "execution_count": 18, "id": "9e916ece-66a3-42e4-a301-e7d2e5c27bb7", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "---------------------------------------------------------------------------------------------------------!Elastic time: 3498.1374440193176\n" ] } ], "source": [ "import time\n", "from sagemaker.jumpstart.model import JumpStartModel\n", "from sagemaker.utils import name_from_base\n", "\n", "\n", "endpoint_name = name_from_base(f\"jumpstart-example-{model_id}\")\n", "\n", "inference_instance_type = \"ml.p4de.24xlarge\"\n", "\n", "\n", "model = JumpStartModel(\n", "    model_id=model_id,\n", "    name=endpoint_name,\n", ")\n", "\n", "start = time.time()\n", "\n", "predictor = model.deploy(\n", "    initial_instance_count=1,\n", "    instance_type=inference_instance_type,\n", "    # predictor_cls=Predictor,\n", "    endpoint_name=endpoint_name,\n", "    
model_data_download_timeout=3600,\n", "    container_startup_health_check_timeout=3600,\n", ")\n", "\n", "print(f\"Elapsed time: {time.time()-start}\")  # seconds spent in model.deploy()" ] }, { "cell_type": "markdown", "id": "1f9c254b", "metadata": {}, "source": [ "### 4. Query endpoint and parse response\n", "\n", "---\n", "The input to the endpoint is any string of text, formatted as JSON and encoded as UTF-8. The output of the endpoint is a JSON document containing the generated text.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 19, "id": "439998c0", "metadata": { "tags": [] }, "outputs": [], "source": [ "newline, bold, unbold = \"\\n\", \"\\033[1m\", \"\\033[0m\"\n", "\n", "\n", "def query_endpoint(encoded_text, endpoint_name):\n", "    client = boto3.client(\"runtime.sagemaker\")\n", "    response = client.invoke_endpoint(\n", "        EndpointName=endpoint_name, ContentType=\"application/x-text\", Body=encoded_text\n", "    )\n", "    return response\n", "\n", "\n", "def parse_response(query_response):\n", "    model_predictions = json.loads(query_response[\"Body\"].read())\n", "    generated_text = model_predictions[0][\"generated_text\"]\n", "    return generated_text" ] }, { "cell_type": "markdown", "id": "fc5d644d", "metadata": {}, "source": [ "---\n", "Below, we provide some example input text. You can enter any text, and the model predicts the next words in the sequence. 
Longer sequences of text can be generated by calling the model repeatedly.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 24, "id": "8262dfcc", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Inference:\n", "input text: I will\n", "generated text: \u001b[1m['\"I will\" is a future tense, so it is used to talk about what will happen in the']\u001b[0m\n", "\n", "Inference:\n", "input text: The movie is so funny\n", "generated text: \u001b[1m['\"The movie is so funny\"']\u001b[0m\n", "\n" ] } ], "source": [ "newline, bold, unbold = \"\\n\", \"\\033[1m\", \"\\033[0m\"\n", "\n", "text1 = \"I will\"\n", "text2 = \"The movie is so funny\"\n", "# `endpoint_name` is the endpoint deployed in section 3; set it explicitly only to reuse an existing endpoint.\n", "\n", "for text in [text1, text2]:\n", "    query_response = query_endpoint(json.dumps(text).encode(\"utf-8\"), endpoint_name=endpoint_name)\n", "    generated_text = parse_response(query_response)\n", "    print(\n", "        f\"Inference:{newline}\"\n", "        f\"input text: {text}{newline}\"\n", "        f\"generated text: {bold}{generated_text}{unbold}{newline}\"\n", "    )" ] }, { "cell_type": "markdown", "id": "a2554851-cbcc-4ef9-864e-776a3550ceca", "metadata": { "tags": [] }, "source": [ "### 5. Advanced features: How to use various advanced parameters to control the generated text\n", "\n", "***\n", "This model also supports many advanced parameters while performing inference. They include:\n", "\n", "* **max_length:** Model generates text until the output length (which includes the input context length) reaches `max_length`. If specified, it must be a positive integer.\n", "* **num_return_sequences:** Number of output sequences returned. If specified, it must be a positive integer.\n", "* **num_beams:** Number of beams used in the beam search. 
If specified, it must be an integer greater than or equal to `num_return_sequences`.\n", "* **no_repeat_ngram_size:** Model ensures that no sequence of `no_repeat_ngram_size` words is repeated in the output sequence. If specified, it must be a positive integer greater than 1.\n", "* **temperature:** Controls the randomness in the output. A higher temperature makes low-probability words more likely to appear in the output, and a lower temperature favors high-probability words. As `temperature` -> 0, sampling approaches greedy decoding. If specified, it must be a positive float.\n", "* **early_stopping:** If True, text generation is finished when all beam hypotheses reach the end-of-sentence token. If specified, it must be a boolean.\n", "* **do_sample:** If True, sample the next word according to its likelihood. If specified, it must be a boolean.\n", "* **top_k:** In each step of text generation, sample from only the `top_k` most likely words. If specified, it must be a positive integer.\n", "* **top_p:** In each step of text generation, sample from the smallest possible set of words with cumulative probability `top_p`. If specified, it must be a float between 0 and 1.\n", "* **seed:** Fix the randomized state for reproducibility. If specified, it must be an integer.\n", "\n", "We may specify any subset of the parameters mentioned above while invoking an endpoint. Next, we show an example of how to invoke the endpoint with these arguments.\n", "\n", "***" ] }, { "cell_type": "code", "execution_count": 31, "id": "1af6f7b0-4093-48c9-acdb-54b05886b2dc", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['How to make a pasta? boil water in a pot, then add the pasta.', 'How to make a pasta? 
boil water in a pot, add salt and pasta.']\n" ] } ], "source": [ "# The input must be JSON\n", "payload = {\n", "    \"text_inputs\": [\"How to make a pasta?\"],\n", "    \"max_length\": 2000,\n", "    \"num_return_sequences\": 2,\n", "    \"top_k\": 250,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "    \"num_beams\": 3,\n", "}\n", "\n", "\n", "def query_endpoint_with_json_payload(encoded_json, endpoint_name):\n", "    client = boto3.client(\"runtime.sagemaker\")\n", "    response = client.invoke_endpoint(\n", "        EndpointName=endpoint_name, ContentType=\"application/json\", Body=encoded_json\n", "    )\n", "    return response\n", "\n", "\n", "def parse_response_multiple_texts(query_response):\n", "    model_predictions = json.loads(query_response[\"Body\"].read())\n", "    # The first element holds one entry per returned sequence.\n", "    generated_text = []\n", "    for x in model_predictions[0]:\n", "        generated_text.append(x[\"generated_text\"])\n", "    return generated_text\n", "\n", "\n", "query_response = query_endpoint_with_json_payload(\n", "    json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", ")\n", "\n", "generated_texts = parse_response_multiple_texts(query_response)\n", "print(generated_texts)" ] }, { "cell_type": "markdown", "id": "c54977c4-d91a-4489-b137-26fa1d5f1f2d", "metadata": {}, "source": [ "### 6. Advanced features: How to use prompts engineering to solve different few shot and zero shot learning NLP tasks\n", "\n", "Note: The notebook in the following sections uses [**BloomZ 176B**](https://huggingface.co/bigscience/bloomz) as an example. You are welcome to try [**BloomZ 7b1**](https://huggingface.co/bigscience/bloomz-7b1), [**BloomZ 3b**](https://huggingface.co/bigscience/bloomz-3b) and many others yourself. To do so, modify `model_id` in the section above."
] }, { "cell_type": "markdown", "id": "a37469f7-be44-4827-bdda-e167061ee4af", "metadata": {}, "source": [ "## Let's first try few shot learning tasks" ] }, { "cell_type": "markdown", "id": "6ec5b958-1f20-4899-a02f-e3854e804c27", "metadata": {}, "source": [ "### 6.1. Summarization" ] }, { "cell_type": "markdown", "id": "68a1b142-bf45-4e4e-9535-57184ad986df", "metadata": {}, "source": [ "Define the text article you want to summarize." ] }, { "cell_type": "code", "execution_count": 66, "id": "84c4893d-5673-428d-90b8-09583987151a", "metadata": {}, "outputs": [], "source": [ "text = \"\"\"Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. \n", "You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. \"\"\"" ] }, { "cell_type": "code", "execution_count": 67, "id": "9e746d6f-317b-4cb4-a334-11b9b3eb55d5", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1mNumber of return sequences are set as 2\u001b[0m\n", "\n", "\u001b[1m For prompt: '[Original]: Amazon scientists, in collaboration with researchers from the University of Sheffield, are making a large-scale fact extraction and verification dataset publicly available for the first time. 
The dataset, comprising more than 185,000 evidence-backed claims, is being made available to hopefully catalyze research and development that addresses the problems of fact extraction and verification in software applications or cloud-based services that perform automatic information extraction.\n", " [Summary]: Amazon and University researchers make fact extraction and verification dataset publicly available.\n", " ###\n", " [Original]: Prime members in the U.S. can get even more delivered to their door with a Prime membership. Members can now enjoy one year of Grubhub+ valued at $9.99 per month for free—at no added cost to their Prime membership. To activate this deal, visit amazon.com/grubhub. This new offer includes unlimited, $0 food delivery fees on orders over $12 as well as exclusive perks for Grubhub+ members and rewards like free food and order discounts. Plus, diners can “eat good while doing good” by opting into Grubhub’s Donate the Change program, a donation-matching initiative that raised more than $25 million in 2021 alone, benefiting more than 20 charitable organizations across the country.\n", " [Summary]: Prime members in the U.S. can enjoy one year of Grubhub+ for free, with no food-delivery fees on eligible orders.\n", " ###\n", " [Original]: {text}\n", " [Summary]:'\u001b[0m\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: [Original]: Amazon scientists, in collaboration with researchers from the University of Sheffield, are making a large-scale fact extraction and verification dataset publicly available for the first time. 
The dataset, comprising more than 185,000 evidence-backed claims, is being made available to hopefully catalyze research and development that addresses the problems of fact extraction and verification in software applications or cloud-based services that perform automatic information extraction.\n", " [Summary]: Amazon and University researchers make fact extraction and verification dataset publicly available.\n", " ###\n", " [Original]: Prime members in the U.S. can get even more delivered to their door with a Prime membership. Members can now enjoy one year of Grubhub+ valued at $9.99 per month for free—at no added cost to their Prime membership. To activate this deal, visit amazon.com/grubhub. This new offer includes unlimited, $0 food delivery fees on orders over $12 as well as exclusive perks for Grubhub+ members and rewards like free food and order discounts. Plus, diners can “eat good while doing good” by opting into Grubhub’s Donate the Change program, a donation-matching initiative that raised more than $25 million in 2021 alone, benefiting more than 20 charitable organizations across the country.\n", " [Summary]: Prime members in the U.S. can enjoy one year of Grubhub+ for free, with no food-delivery fees on eligible orders.\n", " ###\n", " [Original]: Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. \n", "You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. 
You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. \n", " [Summary]: Amazon Comprehend makes it easy to extract insights from documents.\n", "\n", "\u001b[1mResult 1\u001b[0m: [Original]: Amazon scientists, in collaboration with researchers from the University of Sheffield, are making a large-scale fact extraction and verification dataset publicly available for the first time. The dataset, comprising more than 185,000 evidence-backed claims, is being made available to hopefully catalyze research and development that addresses the problems of fact extraction and verification in software applications or cloud-based services that perform automatic information extraction.\n", " [Summary]: Amazon and University researchers make fact extraction and verification dataset publicly available.\n", " ###\n", " [Original]: Prime members in the U.S. can get even more delivered to their door with a Prime membership. Members can now enjoy one year of Grubhub+ valued at $9.99 per month for free—at no added cost to their Prime membership. To activate this deal, visit amazon.com/grubhub. This new offer includes unlimited, $0 food delivery fees on orders over $12 as well as exclusive perks for Grubhub+ members and rewards like free food and order discounts. Plus, diners can “eat good while doing good” by opting into Grubhub’s Donate the Change program, a donation-matching initiative that raised more than $25 million in 2021 alone, benefiting more than 20 charitable organizations across the country.\n", " [Summary]: Prime members in the U.S. can enjoy one year of Grubhub+ for free, with no food-delivery fees on eligible orders.\n", " ###\n", " [Original]: Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. 
It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. \n", "You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. \n", " [Summary]: Amazon Comprehend is a document analysis service that uses natural language processing (NLP) to extract insights from documents.\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"[Original]: Amazon scientists, in collaboration with researchers from the University of Sheffield, are making a large-scale fact extraction and verification dataset publicly available for the first time. The dataset, comprising more than 185,000 evidence-backed claims, is being made available to hopefully catalyze research and development that addresses the problems of fact extraction and verification in software applications or cloud-based services that perform automatic information extraction.\n", " [Summary]: Amazon and University researchers make fact extraction and verification dataset publicly available.\n", " ###\n", " [Original]: Prime members in the U.S. can get even more delivered to their door with a Prime membership. Members can now enjoy one year of Grubhub+ valued at $9.99 per month for free—at no added cost to their Prime membership. To activate this deal, visit amazon.com/grubhub. 
This new offer includes unlimited, $0 food delivery fees on orders over $12 as well as exclusive perks for Grubhub+ members and rewards like free food and order discounts. Plus, diners can “eat good while doing good” by opting into Grubhub’s Donate the Change program, a donation-matching initiative that raised more than $25 million in 2021 alone, benefiting more than 20 charitable organizations across the country.\n", " [Summary]: Prime members in the U.S. can enjoy one year of Grubhub+ for free, with no food-delivery fees on eligible orders.\n", " ###\n", " [Original]: {text}\n", " [Summary]:\"\"\"\n", "]\n", "\n", "num_return_sequences = 2\n", "parameters = {\n", " \"max_length\": 600,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_p\": 0.01,\n", " \"do_sample\": False,\n", "}\n", "\n", "print(f\"{bold}Number of return sequences are set as {num_return_sequences}{unbold}{newline}\")\n", "for each_prompt in prompts:\n", " payload = {\"text_inputs\": each_prompt.replace(\"{text}\", text), **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} For prompt: '{each_prompt}'{unbold}{newline}\")\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold}Result {idx}{unbold}: {each_generated_text}{newline}\")" ] }, { "cell_type": "markdown", "id": "c357c0dc-2644-4ce6-b579-f61c60a0777e", "metadata": {}, "source": [ "### 6.2. 
Code generation\n" ] }, { "cell_type": "code", "execution_count": 68, "id": "f33632b8-b054-47ba-a3f0-f59fadc7bb4f", "metadata": { "tags": [] }, "outputs": [], "source": [ "description = \"a Headline saying Welcome to AI\"" ] }, { "cell_type": "code", "execution_count": 69, "id": "633e0c14-19c5-4bfe-9a0a-584dffadc5f3", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'description: a orange button that says stop\n", " code: \n", " ###\n", " description: a blue box that contains yellow circles with red borders\n", " code:
\n", " ###\n", " description: {description}\n", " code:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'description: a orange button that says stop\n", " code: \n", " ###\n", " description: a blue box that contains yellow circles with red borders\n", " code:
\n", " ###\n", " description: a Headline saying Welcome to AI\n", " code:

Welcome to AI

'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'description: a orange button that says stop\n", " code: \n", " ###\n", " description: a blue box that contains yellow circles with red borders\n", " code:
\n", " ###\n", " description: a Headline saying Welcome to AI\n", " code:

Welcome to AI

'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"description: a orange button that says stop\n", " code: \n", " ###\n", " description: a blue box that contains yellow circles with red borders\n", " code:
\n", " ###\n", " description: {description}\n", " code:\"\"\"\n", "]\n", "\n", "num_return_sequences = 2\n", "\n", "parameters = {\n", " \"max_length\": 200,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_k\": 50,\n", " \"top_p\": 0.3,\n", " \"do_sample\": False,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{description}\", description)\n", " print(f\"{bold} For prompt{unbold}: '{each_prompt}'{newline}\")\n", "\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "f8340134-5938-40c4-8aab-ed05b84f8beb", "metadata": {}, "source": [ "How about SQL code?" 
] }, { "cell_type": "code", "execution_count": 70, "id": "29d6d889-5ea6-4d55-abd1-33d875f7df9a", "metadata": { "tags": [] }, "outputs": [], "source": [ "question = \"Fetch three employees from the Employee table\"" ] }, { "cell_type": "code", "execution_count": 71, "id": "9d937ec4-3987-4f40-a839-b82177c060e2", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Question: Fetch the companies that have less than five people in it.\n", " Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;\n", " ###\n", " Question: Show all companies along with the number of employees in each department\n", " Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;\n", " ###\n", " Question: Show the last record of the Employee table\n", " Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;\n", " ###\n", " Question: {question};\n", " Answer:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'Question: Fetch the companies that have less than five people in it.\n", " Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;\n", " ###\n", " Question: Show all companies along with the number of employees in each department\n", " Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;\n", " ###\n", " Question: Show the last record of the Employee table\n", " Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;\n", " ###\n", " Question: Fetch three employees from the Employee table;\n", " Answer: SELECT * FROM Employee LIMIT 3;'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'Question: Fetch the companies that have less than five people in it.\n", " Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;\n", " ###\n", " Question: Show all companies along 
with the number of employees in each department\n", " Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;\n", " ###\n", " Question: Show the last record of the Employee table\n", " Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;\n", " ###\n", " Question: Fetch three employees from the Employee table;\n", " Answer: SELECT * FROM Employee LIMIT 3;\n", " ###'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"Question: Fetch the companies that have less than five people in it.\n", " Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;\n", " ###\n", " Question: Show all companies along with the number of employees in each department\n", " Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;\n", " ###\n", " Question: Show the last record of the Employee table\n", " Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;\n", " ###\n", " Question: {question};\n", " Answer:\"\"\"\n", "]\n", "\n", "num_return_sequences = 2\n", "\n", "parameters = {\n", " \"max_length\": 200,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_p\": 0.01,\n", " \"do_sample\": False,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{question}\", question)\n", " print(f\"{bold} For prompt{unbold}: '{each_prompt}'{newline}\")\n", "\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "bc8f7ac9-695d-4707-b857-4244236ec101", "metadata": 
{}, "source": [ "### 6.3. Name entity recognition (NER)" ] }, { "cell_type": "code", "execution_count": 72, "id": "13666249-8961-4c94-a702-4af4d2b38314", "metadata": { "tags": [] }, "outputs": [], "source": [ "sentence = \"David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.\"" ] }, { "cell_type": "code", "execution_count": 73, "id": "1f072e98-2489-4404-a766-9ac9e5cf1d62", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: '[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. \n", " [Name]: Fred\n", " [Position]: Co-founder and CEO\n", " [Company]: Platform.sh\n", " ###\n", " [Text]: Microsoft (the word being a portmanteau of \"microcomputer software\") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a \"devices and services\" strategy.\n", " [Name]: Steve Ballmer\n", " [Position]: CEO\n", " [Company]: Microsoft\n", " ###\n", " [Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. 
He is the CEO at Danone.\n", " [Name]: Franck Riboud\n", " [Position]: CEO\n", " [Company]: Danone\n", " ###\n", " [Text]: {sentence}\n", " '\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: '[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. \n", " [Name]: Fred\n", " [Position]: Co-founder and CEO\n", " [Company]: Platform.sh\n", " ###\n", " [Text]: Microsoft (the word being a portmanteau of \"microcomputer software\") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a \"devices and services\" strategy.\n", " [Name]: Steve Ballmer\n", " [Position]: CEO\n", " [Company]: Microsoft\n", " ###\n", " [Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.\n", " [Name]: Franck Riboud\n", " [Position]: CEO\n", " [Company]: Danone\n", " ###\n", " [Text]: David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.\n", " [Name]: David Melvin\n", " [Position]: Senior Adviser\n", " [Company]: CITIC CLSA'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: '[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. 
His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. \n", " [Name]: Fred\n", " [Position]: Co-founder and CEO\n", " [Company]: Platform.sh\n", " ###\n", " [Text]: Microsoft (the word being a portmanteau of \"microcomputer software\") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a \"devices and services\" strategy.\n", " [Name]: Steve Ballmer\n", " [Position]: CEO\n", " [Company]: Microsoft\n", " ###\n", " [Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.\n", " [Name]: Franck Riboud\n", " [Position]: CEO\n", " [Company]: Danone\n", " ###\n", " [Text]: David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.\n", " [Name]: David Melvin\n", " [Position]: Senior Adviser\n", " [Company]: CITIC CLSA'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. 
\n", " [Name]: Fred\n", " [Position]: Co-founder and CEO\n", " [Company]: Platform.sh\n", " ###\n", " [Text]: Microsoft (the word being a portmanteau of \"microcomputer software\") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a \"devices and services\" strategy.\n", " [Name]: Steve Ballmer\n", " [Position]: CEO\n", " [Company]: Microsoft\n", " ###\n", " [Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.\n", " [Name]: Franck Riboud\n", " [Position]: CEO\n", " [Company]: Danone\n", " ###\n", " [Text]: {sentence}\n", " \"\"\"\n", "]\n", "\n", "\n", "parameters = {\n", " \"max_length\": 550,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_p\": 0.1,\n", " \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{sentence}\", sentence)\n", " print(f\"{bold} For prompt{unbold}: '{each_prompt}'{newline}\")\n", "\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "310ef8b5-b913-47a5-979e-79debb651a0b", "metadata": {}, "source": [ "### 6.4. 
Question answering" ] }, { "cell_type": "code", "execution_count": 74, "id": "2df4f7ee-5067-4033-9075-99521965a0f2", "metadata": { "tags": [] }, "outputs": [], "source": [ "question = \"Which plan is recommended for GPT-J?\"" ] }, { "cell_type": "code", "execution_count": 75, "id": "b14ff933-2238-47b8-82b0-8b6d345e7241", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.\n", " Question: When was NLP Cloud founded?\n", " Answer: 2021\n", " ###\n", " Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.\n", " Question: What did NLP Cloud develop?\n", " Answer: API\n", " ###\n", " Context: All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.\n", " Question: When can plans be stopped?\n", " Answer: Anytime\n", " ###\n", " Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.\n", " Question: {question}\n", " Answer:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.\n", " Question: When was NLP Cloud founded?\n", " Answer: 2021\n", " ###\n", " Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.\n", " Question: What did NLP Cloud develop?\n", " Answer: API\n", " ###\n", " Context: All plans can be stopped anytime. You only pay for the time you used the service. 
In case of a downgrade, you will get a discount on your next invoice.\n", " Question: When can plans be stopped?\n", " Answer: Anytime\n", " ###\n", " Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.\n", " Question: Which plan is recommended for GPT-J?\n", " Answer: GPU'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.\n", " Question: When was NLP Cloud founded?\n", " Answer: 2021\n", " ###\n", " Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.\n", " Question: What did NLP Cloud develop?\n", " Answer: API\n", " ###\n", " Context: All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.\n", " Question: When can plans be stopped?\n", " Answer: Anytime\n", " ###\n", " Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.\n", " Question: Which plan is recommended for GPT-J?\n", " Answer: GPU'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.\n", " Question: When was NLP Cloud founded?\n", " Answer: 2021\n", " ###\n", " Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.\n", " Question: What did NLP Cloud develop?\n", " Answer: API\n", " ###\n", " Context: All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.\n", " Question: When can plans be stopped?\n", " Answer: Anytime\n", " ###\n", " Context: The main challenge with GPT-J is memory consumption. 
Using a GPU plan is recommended.\n", " Question: {question}\n", " Answer:\"\"\"\n", "]\n", "\n", "\n", "parameters = {\n", " \"max_length\": 350,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_p\": 0.1,\n", " \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{question}\", question)\n", " print(f\"{bold} For prompt{unbold}: '{each_prompt}'{newline}\")\n", "\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "368d9c05-ffc9-4d2b-a53b-d7d5ef5481ba", "metadata": {}, "source": [ "### 6.5. 
Grammar and spelling correction " ] }, { "cell_type": "code", "execution_count": 76, "id": "2727c519-ba16-460c-a601-c72a16051f35", "metadata": { "tags": [] }, "outputs": [], "source": [ "sentence = \"I do not wan to go\"" ] }, { "cell_type": "code", "execution_count": 77, "id": "52c87de5-63e7-4ae3-af1a-f2cbf1f7bf08", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'I love goin to the beach.\n", " Correction: I love going to the beach.\n", " ###\n", " Let me hav it!\n", " Correction: Let me have it!\n", " ###\n", " It have too many drawbacks.\n", " Correction: It has too many drawbacks.\n", " ###\n", " {sentence}\n", " Correction:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'I love goin to the beach.\n", " Correction: I love going to the beach.\n", " ###\n", " Let me hav it!\n", " Correction: Let me have it!\n", " ###\n", " It have too many drawbacks.\n", " Correction: It has too many drawbacks.\n", " ###\n", " I do not wan to go\n", " Correction: I don't want to go\n", " ###'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'I love goin to the beach.\n", " Correction: I love going to the beach.\n", " ###\n", " Let me hav it!\n", " Correction: Let me have it!\n", " ###\n", " It have too many drawbacks.\n", " Correction: It has too many drawbacks.\n", " ###\n", " I do not wan to go\n", " Correction: I don't want to go\n", " ###'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"I love goin to the beach.\n", " Correction: I love going to the beach.\n", " ###\n", " Let me hav it!\n", " Correction: Let me have it!\n", " ###\n", " It have too many drawbacks.\n", " Correction: It has too many drawbacks.\n", " ###\n", " {sentence}\n", " Correction:\"\"\"\n", "]\n", "\n", "\n", "parameters = {\n", " \"max_length\": 250,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_p\": 0.1,\n", " \"do_sample\": 
True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{sentence}\", sentence)\n", " print(f\"{bold} For prompt{unbold}: '{each_prompt}'{newline}\")\n", "\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "c00febfb-9f98-4f45-8204-a79f134ed17e", "metadata": {}, "source": [ "### 6.6. Product description and generalization\n" ] }, { "cell_type": "code", "execution_count": 78, "id": "a0a08432-60ef-430f-b564-1c59c84bd2ee", "metadata": {}, "outputs": [], "source": [ "sentence = \"\"\"t-shirt, men, $39\"\"\"" ] }, { "cell_type": "code", "execution_count": 79, "id": "2a70c89f-42d5-4e2c-b3c1-4cfaf4e3f655", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Generate a product description out of keywords.\n", "\n", " Keywords: shoes, women, $59\n", " Sentence: Beautiful shoes for women at the price of $59.\n", " ###\n", " Keywords: trousers, men, $69\n", " Sentence: Modern trousers for men, for $69 only.\n", " ###\n", " Keywords: gloves, winter, $19\n", " Sentence: Amazingly hot gloves for cold winters, at $19.\n", " ###\n", " Keywords: {sentence}\n", " Sentence:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'Generate a product description out of keywords.\n", "\n", " Keywords: shoes, women, $59\n", " Sentence: Beautiful shoes for women at the price of $59.\n", " ###\n", " Keywords: trousers, men, 
$69\n", " Sentence: Modern trousers for men, for $69 only.\n", " ###\n", " Keywords: gloves, winter, $19\n", " Sentence: Amazingly hot gloves for cold winters, at $19.\n", " ###\n", " Keywords: t-shirt, men, $39\n", " Sentence: Cool t-shirt for men, at $39.'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'Generate a product description out of keywords.\n", "\n", " Keywords: shoes, women, $59\n", " Sentence: Beautiful shoes for women at the price of $59.\n", " ###\n", " Keywords: trousers, men, $69\n", " Sentence: Modern trousers for men, for $69 only.\n", " ###\n", " Keywords: gloves, winter, $19\n", " Sentence: Amazingly hot gloves for cold winters, at $19.\n", " ###\n", " Keywords: t-shirt, men, $39\n", " Sentence: Cool t-shirt for men, for $39 only.'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"Generate a product description out of keywords.\n", "\n", " Keywords: shoes, women, $59\n", " Sentence: Beautiful shoes for women at the price of $59.\n", " ###\n", " Keywords: trousers, men, $69\n", " Sentence: Modern trousers for men, for $69 only.\n", " ###\n", " Keywords: gloves, winter, $19\n", " Sentence: Amazingly hot gloves for cold winters, at $19.\n", " ###\n", " Keywords: {sentence}\n", " Sentence:\"\"\"\n", "]\n", "\n", "\n", "parameters = {\n", " \"max_length\": 150,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"do_sample\": False,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{sentence}\", sentence)\n", " print(f\"{bold} For prompt{unbold}: '{each_prompt}'{newline}\")\n", "\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in 
enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "13ab5c92-5210-4ffe-842a-78e8ea0a6521", "metadata": {}, "source": [ "### 6.7. Sentence / Sentiment Classification" ] }, { "cell_type": "markdown", "id": "2f140ae5-54c0-4738-b2c2-d715f8ae8050", "metadata": {}, "source": [ "Define the sentence you want to classify; the candidate topics are illustrated by the few-shot examples in the prompt." ] }, { "cell_type": "code", "execution_count": 80, "id": "dd29e5fb-d7f9-43bd-ad03-4887c1381169", "metadata": {}, "outputs": [], "source": [ "sentence = \"I am trying to cook chicken with tomatoes.\"" ] }, { "cell_type": "code", "execution_count": 81, "id": "edb65e09-8c2f-4a8c-87bd-64ab49a067b0", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Message: When the spaceship landed on Mars, the whole humanity was excited\n", " Topic: space\n", " ###\n", " Message: I love playing tennis and golf. I'm practicing twice a week.\n", " Topic: sport\n", " ###\n", " Message: Managing a team of sales people is a tough but rewarding job.\n", " Topic: business\n", " ###\n", " Message: I am trying to cook chicken with tomatoes.\n", " Topic:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'Message: When the spaceship landed on Mars, the whole humanity was excited\n", " Topic: space\n", " ###\n", " Message: I love playing tennis and golf. I'm practicing twice a week.\n", " Topic: sport\n", " ###\n", " Message: Managing a team of sales people is a tough but rewarding job.\n", " Topic: business\n", " ###\n", " Message: I am trying to cook chicken with tomatoes.\n", " Topic: food'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'Message: When the spaceship landed on Mars, the whole humanity was excited\n", " Topic: space\n", " ###\n", " Message: I love playing tennis and golf. 
I'm practicing twice a week.\n", " Topic: sport\n", " ###\n", " Message: Managing a team of sales people is a tough but rewarding job.\n", " Topic: business\n", " ###\n", " Message: I am trying to cook chicken with tomatoes.\n", " Topic: food'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"Message: When the spaceship landed on Mars, the whole humanity was excited\n", " Topic: space\n", " ###\n", " Message: I love playing tennis and golf. I'm practicing twice a week.\n", " Topic: sport\n", " ###\n", " Message: Managing a team of sales people is a tough but rewarding job.\n", " Topic: business\n", " ###\n", " Message: {sentence}\n", " Topic:\"\"\"\n", "]\n", "\n", "parameters = {\n", " \"max_length\": 200,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_k\": 50,\n", " \"top_p\": 0.95,\n", " \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{sentence}\", sentence)\n", " print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "3046a9ef-d64e-42ec-a1ca-107a15f7164f", "metadata": {}, "source": [ "### 6.8. Chatbot and Conversational AI" ] }, { "cell_type": "markdown", "id": "6f151f47-a656-42aa-9b6b-8cc6e4ced876", "metadata": {}, "source": [ "Define the human message you want the chatbot to respond to." 
] }, { "cell_type": "code", "execution_count": 82, "id": "e5f8885f-76bd-4b42-92ac-cdeb905c97a6", "metadata": {}, "outputs": [], "source": [ "sentence = \"I caught flu\"" ] }, { "cell_type": "code", "execution_count": 83, "id": "b7259e6a-45c1-49af-a7c8-f176f3552ee8", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'This is a discussion between a [human] and a [robot]. \n", " The [robot] is very nice and empathetic.\n", "\n", " [human]: Hello nice to meet you.\n", " [robot]: Nice to meet you too.\n", " ###\n", " [human]: How is it going today?\n", " [robot]: Not so bad, thank you! How about you?\n", " ###\n", " [human]: I am ok, but I am a bit sad...\n", " [robot]: Oh? Why that?\n", " ###\n", " [human]: I caught flu\n", " [robot]:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'This is a discussion between a [human] and a [robot]. \n", " The [robot] is very nice and empathetic.\n", "\n", " [human]: Hello nice to meet you.\n", " [robot]: Nice to meet you too.\n", " ###\n", " [human]: How is it going today?\n", " [robot]: Not so bad, thank you! How about you?\n", " ###\n", " [human]: I am ok, but I am a bit sad...\n", " [robot]: Oh? Why that?\n", " ###\n", " [human]: I caught flu\n", " [robot]: Oh no, I am sorry to hear that.'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'This is a discussion between a [human] and a [robot]. \n", " The [robot] is very nice and empathetic.\n", "\n", " [human]: Hello nice to meet you.\n", " [robot]: Nice to meet you too.\n", " ###\n", " [human]: How is it going today?\n", " [robot]: Not so bad, thank you! How about you?\n", " ###\n", " [human]: I am ok, but I am a bit sad...\n", " [robot]: Oh? 
Why that?\n", " ###\n", " [human]: I caught flu\n", " [robot]: Oh no, I am sorry to hear that!'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"This is a discussion between a [human] and a [robot]. \n", " The [robot] is very nice and empathetic.\n", "\n", " [human]: Hello nice to meet you.\n", " [robot]: Nice to meet you too.\n", " ###\n", " [human]: How is it going today?\n", " [robot]: Not so bad, thank you! How about you?\n", " ###\n", " [human]: I am ok, but I am a bit sad...\n", " [robot]: Oh? Why that?\n", " ###\n", " [human]: {sentence}\n", " [robot]:\"\"\"\n", "]\n", "\n", "parameters = {\n", " \"max_length\": 250,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"do_sample\": False,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{sentence}\", sentence)\n", " print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "ca7602e2-27f0-4fe7-86dd-806081810265", "metadata": {}, "source": [ "### 6.9. Tweet generation" ] }, { "cell_type": "markdown", "id": "74c0b210-1c03-4af9-a2aa-a6a7225b738d", "metadata": {}, "source": [ "Define the keyword you want to generate a tweet about." 
] }, { "cell_type": "code", "execution_count": 84, "id": "d099455d-7275-46cb-a73b-75ffaf918796", "metadata": {}, "outputs": [], "source": [ "keyword = \"nlp\"" ] }, { "cell_type": "code", "execution_count": 85, "id": "974aca77-2154-4162-af03-49a4f62cda9b", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'keyword: markets\n", " tweet: Take feedback from nature and markets, not from people\n", " ###\n", " keyword: children\n", " tweet: Maybe we die so we can come back as children.\n", " ###\n", " keyword: startups\n", " tweet: Startups should not worry about how to put out fires, they should worry about how to start them.\n", " ###\n", " keyword: nlp\n", " tweet:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'keyword: markets\n", " tweet: Take feedback from nature and markets, not from people\n", " ###\n", " keyword: children\n", " tweet: Maybe we die so we can come back as children.\n", " ###\n", " keyword: startups\n", " tweet: Startups should not worry about how to put out fires, they should worry about how to start them.\n", " ###\n", " keyword: nlp\n", " tweet: NLP is a tool, not a goal.'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'keyword: markets\n", " tweet: Take feedback from nature and markets, not from people\n", " ###\n", " keyword: children\n", " tweet: Maybe we die so we can come back as children.\n", " ###\n", " keyword: startups\n", " tweet: Startups should not worry about how to put out fires, they should worry about how to start them.\n", " ###\n", " keyword: nlp\n", " tweet: NLP is a tool, not a goal, not a religion.'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"keyword: markets\n", " tweet: Take feedback from nature and markets, not from people\n", " ###\n", " keyword: children\n", " tweet: Maybe we die so we can come back as children.\n", " ###\n", " keyword: startups\n", " tweet: 
Startups should not worry about how to put out fires, they should worry about how to start them.\n", " ###\n", " keyword: {keyword}\n", " tweet:\"\"\"\n", "]\n", "\n", "parameters = {\n", " \"max_length\": 300,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"do_sample\": False,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{keyword}\", keyword)\n", " print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "efb120a1-a66d-4312-ad71-4ec632111d49", "metadata": {}, "source": [ "### 6.10. 
Machine translation" ] }, { "cell_type": "code", "execution_count": 86, "id": "2e4a91a4-16d2-4bb7-a6bc-fde2e0603ecf", "metadata": { "tags": [] }, "outputs": [], "source": [ "sentence = \"NLP Cloud permet de deployer le NLP en production facilement.\"" ] }, { "cell_type": "code", "execution_count": 87, "id": "d51584bc-1004-455d-92bd-5f24cb162309", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Hugging Face a révolutionné le NLP.\n", " Translation: Hugging Face revolutionized NLP.\n", " ###\n", " Cela est incroyable!\n", " Translation: This is unbelievable!\n", " ###\n", " Désolé je ne peux pas.\n", " Translation: Sorry but I cannot.\n", " ###\n", " NLP Cloud permet de deployer le NLP en production facilement.\n", " Translation:'\n", "\n", "\u001b[1m The 2 summarized results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'Hugging Face a révolutionné le NLP.\n", " Translation: Hugging Face revolutionized NLP.\n", " ###\n", " Cela est incroyable!\n", " Translation: This is unbelievable!\n", " ###\n", " Désolé je ne peux pas.\n", " Translation: Sorry but I cannot.\n", " ###\n", " NLP Cloud permet de deployer le NLP en production facilement.\n", " Translation: NLP Cloud makes it easy to deploy NLP in production.'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'Hugging Face a révolutionné le NLP.\n", " Translation: Hugging Face revolutionized NLP.\n", " ###\n", " Cela est incroyable!\n", " Translation: This is unbelievable!\n", " ###\n", " Désolé je ne peux pas.\n", " Translation: Sorry but I cannot.\n", " ###\n", " NLP Cloud permet de deployer le NLP en production facilement.\n", " Translation: NLP Cloud makes it easy to deploy NLP in production.\n", " ###'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"Hugging Face a révolutionné le NLP.\n", " Translation: Hugging Face revolutionized NLP.\n", " ###\n", " Cela est incroyable!\n", " Translation: This 
is unbelievable!\n", " ###\n", " Désolé je ne peux pas.\n", " Translation: Sorry but I cannot.\n", " ###\n", " {sentence}\n", " Translation:\"\"\"\n", "]\n", "\n", "parameters = {\n", "    \"max_length\": 150,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"do_sample\": False,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{sentence}\", sentence)\n", "    print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "61f0b08a-9605-4075-ad53-66766ddc24b9", "metadata": {}, "source": [ "### 6.11. 
Intent classification" ] }, { "cell_type": "code", "execution_count": 88, "id": "9ac7e00b-cb6f-406f-be6f-2256caf2ec1f", "metadata": { "tags": [] }, "outputs": [], "source": [ "sentence = \"Can you please teach me Chinese next week?\"" ] }, { "cell_type": "code", "execution_count": 89, "id": "5cee0a58-a91a-47cb-93db-1844894d1e7e", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'I want to start coding tomorrow because it seems to be so fun!\n", " Intent: start coding\n", " ###\n", " Show me the last pictures you have please.\n", " Intent: show pictures\n", " ###\n", " Search all these files as fast as possible.\n", " Intent: search files\n", " ###\n", " Can you please teach me Chinese next week?\n", " Intent:'\n", "\n", "\u001b[1m The 2 summarized results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'I want to start coding tomorrow because it seems to be so fun!\n", " Intent: start coding\n", " ###\n", " Show me the last pictures you have please.\n", " Intent: show pictures\n", " ###\n", " Search all these files as fast as possible.\n", " Intent: search files\n", " ###\n", " Can you please teach me Chinese next week?\n", " Intent: teach me chinese\n", " ###'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'I want to start coding tomorrow because it seems to be so fun!\n", " Intent: start coding\n", " ###\n", " Show me the last pictures you have please.\n", " Intent: show pictures\n", " ###\n", " Search all these files as fast as possible.\n", " Intent: search files\n", " ###\n", " Can you please teach me Chinese next week?\n", " Intent: teach me chinese\n", " ###'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"I want to start coding tomorrow because it seems to be so fun!\n", " Intent: start coding\n", " ###\n", " Show me the last pictures you have please.\n", " Intent: show pictures\n", " ###\n", " Search all these files as fast as possible.\n", 
" Intent: search files\n", " ###\n", " {sentence}\n", " Intent:\"\"\"\n", "]\n", "\n", "\n", "parameters = {\n", " \"max_length\": 150,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_p\": 0.5,\n", " \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", " input_text = each_prompt.replace(\"{sentence}\", sentence)\n", " print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", " payload = {\"text_inputs\": input_text, **parameters}\n", " query_response = query_endpoint_with_json_payload(\n", " json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", " )\n", " generated_texts = parse_response_multiple_texts(query_response)\n", " print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", " for idx, each_generated_text in enumerate(generated_texts):\n", " print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "7fffbe00-30ee-4c0a-938a-5a6c27b9f36b", "metadata": {}, "source": [ "## Next, let's try zero shot learning tasks" ] }, { "cell_type": "markdown", "id": "d2bf731e-ca3c-465e-8169-f931cefe8eb3", "metadata": {}, "source": [ "### 6.13. 
Multilingual text / sentiment classification (Chinese to English)" ] }, { "cell_type": "code", "execution_count": 91, "id": "80d3289a-e7b3-487e-abac-884b22aa0bc4", "metadata": { "tags": [] }, "outputs": [], "source": [ "text = \"\"\"一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。\"\"\"" ] }, { "cell_type": "code", "execution_count": 94, "id": "3555103c-8468-4255-b666-bba5515dd4b2", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "For input with prompt: 一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。 Would you rate the previous review as positive, neutral or negative?\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: '一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。 Would you rate the previous review as positive, neutral or negative? positive'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: '一个传奇的开端,一个不灭的神话,这不仅仅是一部电影,而是作为一个走进新时代的标签,永远彪炳史册。 Would you rate the previous review as positive, neutral or negative? 
positive'\n", "\n" ] } ], "source": [ "prompts = [\"\"\"{text} Would you rate the previous review as positive, neutral or negative?\"\"\"]\n", "\n", "parameters = {\n", "    \"max_length\": 200,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 250,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "}\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{text}\", text)\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"For input with prompt: {payload['text_inputs']}\")\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "ba70796b-6320-4313-91e6-170e0c3a9870", "metadata": {}, "source": [ "### 6.14. Multilingual question and answering (English to Chinese)" ] }, { "cell_type": "code", "execution_count": 95, "id": "95ab3351-f372-48be-aff7-b17c39faa1bf", "metadata": { "tags": [] }, "outputs": [], "source": [ "text = \"\"\"what is the backpropagation.\"\"\"" ] }, { "cell_type": "code", "execution_count": 99, "id": "5b70fb09-b9ce-4888-a6aa-5d0e19d7bc9c", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "For input with prompt: Explain to me in Traditional Chinese what is the backpropagation.\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'Explain to me in Traditional Chinese what is the backpropagation. 
梯度反向传播算法是一种机器学习算法,用于训练人工神经网络。该算法使用梯度下降来最小化损失函数。'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'Explain to me in Traditional Chinese what is the backpropagation. 遞迴神經網絡的反向傳播算法,是一種用於訓練遞迴神經網絡的算法。'\n", "\n" ] } ], "source": [ "prompts = [\"\"\"Explain to me in Traditional Chinese {text}\"\"\"]\n", "\n", "parameters = {\n", "    \"max_length\": 500,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 250,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "}\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{text}\", text)\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"For input with prompt: {payload['text_inputs']}\")\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "e56362ba-0552-42c5-8c36-c6d2a420079c", "metadata": {}, "source": [ "### 6.15. Code generation" ] }, { "cell_type": "code", "execution_count": 100, "id": "7f653857-2edc-4fd6-badd-10235e3e9ab8", "metadata": { "tags": [] }, "outputs": [], "source": [ "text = \"binary search tree\"\n", "code_start = \"\"\"\n", "def binary_search(a, x):\n", " low = 0\n", " high = len(a) - 1\"\"\"" ] }, { "cell_type": "code", "execution_count": 101, "id": "1a39ca18-d9bb-42f1-be1e-efb068238d99", "metadata": { "tags": [] }, "outputs": [], "source": [ "prompts = [\n", "    \"\"\"Write a {text} with O(log(n)) computational complexity. 
\n", "{code_start}\"\"\"\n", "]" ] }, { "cell_type": "code", "execution_count": 102, "id": "7d9adbd0-6ae1-498b-84a5-eb0b41fdda32", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "For input with prompt: Write a binary search tree with O(log(n)) computational complexity. \n", "\n", "def binary_search(a, x):\n", " low = 0\n", " high = len(a) - 1\n", "\u001b[1m For prompt: 'Write a {text} with O(log(n)) computational complexity. \n", "{code_start}'\u001b[0m\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: 'Write a binary search tree with O(log(n)) computational complexity. \n", "\n", "def binary_search(a, x):\n", " low = 0\n", " high = len(a) - 1\n", " while low <= high:\n", " mid = (low + high) // 2\n", " if a[mid] < x:\n", " low = mid + 1\n", " elif a[mid] > x:\n", " high = mid - 1\n", " else:\n", " return mid\n", " return -1'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: 'Write a binary search tree with O(log(n)) computational complexity. 
\n", "\n", "def binary_search(a, x):\n", " low = 0\n", " high = len(a) - 1\n", " while low <= high:\n", " mid = (low + high) // 2\n", " if a[mid] < x:\n", " low = mid + 1\n", " elif a[mid] > x:\n", " high = mid - 1\n", " else:\n", " return mid\n", " return low'\n", "\n" ] } ], "source": [ "parameters = {\n", "    \"max_length\": 500,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 50,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": False,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{text}\", text)\n", "    input_text = input_text.replace(\"{code_start}\", code_start)\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"For input with prompt: {payload['text_inputs']}\")\n", "    print(f\"{bold} For prompt: '{each_prompt}'{unbold}{newline}\")\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "163af7aa-3751-4641-94c2-9b7a15d6ab89", "metadata": {}, "source": [ "### 6.16. 
Paragraph rephrase" ] }, { "cell_type": "code", "execution_count": 103, "id": "32e58dc3-c25b-4e27-86ca-adb397674cc5", "metadata": { "tags": [] }, "outputs": [], "source": [ "sentence = \"\"\"Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker.\n", "SageMaker aims to provide a machine learning service which can be used to build, train, and deploy ML models for virtually any use case.\n", "During this year’s re:Invent conference, AWS made several announcements to further improve SageMaker’s capabilities.\"\"\"" ] }, { "cell_type": "code", "execution_count": 105, "id": "498bd66b-4cc9-4089-9ff4-03c9c3d41b3e", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: '\\ Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker.\n", "SageMaker aims to provide a machine learning service which can be used to build, train, and deploy ML models for virtually any use case.\n", "During this year’s re:Invent conference, AWS made several announcements to further improve SageMaker’s capabilities.\n", "\n", "How would you rephrase that briefly using English?'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: '\\ Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker.\n", "SageMaker aims to provide a machine learning service which can be used to build, train, and deploy ML models for virtually any use case.\n", "During this year’s re:Invent conference, AWS made several announcements to further improve SageMaker’s capabilities.\n", "\n", "How would you rephrase that briefly using English? 
Amazon SageMaker gets a major update'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: '\\ Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker.\n", "SageMaker aims to provide a machine learning service which can be used to build, train, and deploy ML models for virtually any use case.\n", "During this year’s re:Invent conference, AWS made several announcements to further improve SageMaker’s capabilities.\n", "\n", "How would you rephrase that briefly using English? Amazon SageMaker gets a major update'\n", "\n", "\u001b[1m For prompt\u001b[0m: '\"Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker.\n", "SageMaker aims to provide a machine learning service which can be used to build, train, and deploy ML models for virtually any use case.\n", "During this year’s re:Invent conference, AWS made several announcements to further improve SageMaker’s capabilities.\n", "The above sentence is very complicated. Please provide me a simplified synonymous version consisting of multiple sentences:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1m The reasoning result 0 is\u001b[0m: '\"Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker.\n", "SageMaker aims to provide a machine learning service which can be used to build, train, and deploy ML models for virtually any use case.\n", "During this year’s re:Invent conference, AWS made several announcements to further improve SageMaker’s capabilities.\n", "The above sentence is very complicated. Please provide me a simplified synonymous version consisting of multiple sentences: Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker. 
SageMaker aims to provide a machine learning service that can be used to build, train and deploy ML models for virtually any use case.'\n", "\n", "\u001b[1m The reasoning result 1 is\u001b[0m: '\"Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker.\n", "SageMaker aims to provide a machine learning service which can be used to build, train, and deploy ML models for virtually any use case.\n", "During this year’s re:Invent conference, AWS made several announcements to further improve SageMaker’s capabilities.\n", "The above sentence is very complicated. Please provide me a simplified synonymous version consisting of multiple sentences: Amazon Web Services (AWS) has announced nine major new updates for its cloud-based machine learning platform, SageMaker. SageMaker aims to provide a machine learning service that can be used to build, train and deploy ML models for virtually any use case.'\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"\\ {sentence}\\n\\nHow would you rephrase that briefly using English?\"\"\",\n", " \"\"\"\"{sentence}\\nThe above sentence is very complicated. 
Please provide me a simplified synonymous version consisting of multiple sentences:\"\"\",\n", "]\n", "\n", "parameters = {\n", "    \"max_length\": 5000,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 250,\n", "    \"top_p\": 0.01,\n", "    \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{sentence}\", sentence)\n", "    print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold} The reasoning result {idx} is{unbold}: '{each_generated_text}'{newline}\")" ] }, { "cell_type": "markdown", "id": "91ed5956-bc68-454d-add8-a014c8ea04df", "metadata": {}, "source": [ "### 6.17. Summarization" ] }, { "cell_type": "code", "execution_count": 106, "id": "631ecdab-cc4b-403a-8954-b4b24f53bc0e", "metadata": { "tags": [] }, "outputs": [], "source": [ "text = \"\"\"Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. \n", "You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. 
You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. \n", "All of the Amazon Comprehend features accept UTF-8 text documents as the input. In addition, custom classification and custom entity recognition accept image files, PDF files, and Word files as input. \n", "Amazon Comprehend can examine and analyze documents in a variety of languages, depending on the specific feature. For more information, see Languages supported in Amazon Comprehend. Amazon Comprehend's Dominant language capability can examine documents and determine the dominant language for a far wider selection of languages.\"\"\"" ] }, { "cell_type": "code", "execution_count": 107, "id": "6023de95-2502-41bd-aa94-94f39fbb875d", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1mNumber of return sequences are set as 2\u001b[0m\n", "\n", "\u001b[1m For prompt: '{text}\n", "\n", "===\n", "Write a summary of the previous text in English:'\u001b[0m\n", "\n", "\u001b[1m The 2 summarized results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. \n", "You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. 
You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. \n", "All of the Amazon Comprehend features accept UTF-8 text documents as the input. In addition, custom classification and custom entity recognition accept image files, PDF files, and Word files as input. \n", "Amazon Comprehend can examine and analyze documents in a variety of languages, depending on the specific feature. For more information, see Languages supported in Amazon Comprehend. Amazon Comprehend's Dominant language capability can examine documents and determine the dominant language for a far wider selection of languages.\n", "\n", "===\n", "Write a summary of the previous text in English: Analyze documents with natural language processing. Build new products on top of document analysis. Use Amazon Comprehend's document analysis capabilities.\n", "\n", "\u001b[1mResult 1\u001b[0m: Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. Use Amazon Comprehend to create new products based on understanding the structure of documents. For example, using Amazon Comprehend you can search social networking feeds for mentions of products or scan an entire document repository for key phrases. \n", "You can access Amazon Comprehend document analysis capabilities using the Amazon Comprehend console or using the Amazon Comprehend APIs. You can run real-time analysis for small workloads or you can start asynchronous analysis jobs for large document sets. You can use the pre-trained models that Amazon Comprehend provides, or you can train your own custom models for classification and entity recognition. 
\n", "All of the Amazon Comprehend features accept UTF-8 text documents as the input. In addition, custom classification and custom entity recognition accept image files, PDF files, and Word files as input. \n", "Amazon Comprehend can examine and analyze documents in a variety of languages, depending on the specific feature. For more information, see Languages supported in Amazon Comprehend. Amazon Comprehend's Dominant language capability can examine documents and determine the dominant language for a far wider selection of languages.\n", "\n", "===\n", "Write a summary of the previous text in English: Analyze documents with natural language processing. Create new products based on document analysis. Use the Amazon Comprehend console or APIs. Input documents in UTF-8 format.\n", "\n" ] } ], "source": [ "prompts = [\"\"\"{text}\\n\\n===\\nWrite a summary of the previous text in English:\"\"\"]\n", "\n", "num_return_sequences = 2\n", "parameters = {\n", "    \"max_length\": 500,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 50,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "}\n", "\n", "print(f\"{bold}Number of return sequences are set as {num_return_sequences}{unbold}{newline}\")\n", "for each_prompt in prompts:\n", "    payload = {\"text_inputs\": each_prompt.replace(\"{text}\", text), **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} For prompt: '{each_prompt}'{unbold}{newline}\")\n", "    print(f\"{bold} The {num_return_sequences} summarized results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold}Result {idx}{unbold}: {each_generated_text}{newline}\")" ] }, { "cell_type": "markdown", "id": "276cad38-cfa0-4590-acee-f6a5cb641a92", "metadata": {}, "source": [ "### 6.18. 
Common sense reasoning / natural language inference\n", "\n", "For common sense reasoning, design a prompt that combines the premise and the hypothesis (and, optionally, a list of answer options), then send the combined text to the endpoint to get an answer. Examples are shown below." ] }, { "cell_type": "code", "execution_count": 108, "id": "27e3b487-caee-4a55-8032-ae7e969f7570", "metadata": { "tags": [] }, "outputs": [], "source": [ "premise = \"The world cup has kicked off in Los Angeles, United States.\"\n", "hypothesis = \"The world cup takes place in United States.\"\n", "options = \"\"\"[\"yes\", \"no\"]\"\"\"" ] }, { "cell_type": "code", "execution_count": 111, "id": "772d1939-85f0-4063-88f5-725872c8f9bd", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Given that The world cup has kicked off in Los Angeles, United States. Does it follow that The world cup takes place in United States. Yes or no?'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: Given that The world cup has kicked off in Los Angeles, United States. Does it follow that The world cup takes place in United States. Yes or no? Yes\n", "\n", "\u001b[1mResult 1\u001b[0m: Given that The world cup has kicked off in Los Angeles, United States. Does it follow that The world cup takes place in United States. Yes or no? Yes\n", "\n", "\u001b[1m For prompt\u001b[0m: '\"The world cup has kicked off in Los Angeles, United States. Using only the above description and what you know about the world, is \"The world cup takes place in United States.\" definitely correct? Yes or no?'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: \"The world cup has kicked off in Los Angeles, United States. Using only the above description and what you know about the world, is \"The world cup takes place in United States.\" definitely correct? Yes or no? 
Yes\n", "\n", "\u001b[1mResult 1\u001b[0m: \"The world cup has kicked off in Los Angeles, United States. Using only the above description and what you know about the world, is \"The world cup takes place in United States.\" definitely correct? Yes or no? Yes\n", "\n" ] } ], "source": [ "prompts = [\n", "    \"\"\"Given that {premise} Does it follow that {hypothesis} Yes or no?\"\"\",\n", "    \"\"\"\"{premise} Using only the above description and what you know about the world, is \\\"{hypothesis}\\\" definitely correct? Yes or no?\"\"\",\n", "]\n", "\n", "parameters = {\n", "    \"max_length\": 50,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 50,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{premise}\", premise)\n", "    input_text = input_text.replace(\"{hypothesis}\", hypothesis)\n", "    # input_text = input_text.replace(\"{options_}\", options)\n", "    print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold}Result {idx}{unbold}: {each_generated_text}{newline}\")" ] }, { "cell_type": "markdown", "id": "792a7d6a-4d05-4f8c-a8c0-d0c7fd5bfb26", "metadata": {}, "source": [ "### 6.19. Question and Answering\n", "\n", "Now, let's try another reasoning task with a different type of prompt template. You can simply provide context and question as shown below." 
] }, { "cell_type": "code", "execution_count": 112, "id": "d3052a9e-6bfe-4de8-b884-12cabcbcc66c", "metadata": { "tags": [] }, "outputs": [], "source": [ "context = \"\"\"The newest and most innovative Kindle yet lets you take notes on millions of books and documents, write lists and journals, and more. \n", "\n", "For readers who have always wished they could write in their eBooks, Amazon’s new Kindle lets them do just that. The Kindle Scribe is the first Kindle for reading and writing and allows users to supplement their books and documents with notes, lists, and more.\n", "\n", "Here’s everything you need to know about the Kindle Scribe, including frequently asked questions.\n", "\n", "The Kindle Scribe makes it easy to read and write like you would on paper \n", "\n", "The Kindle Scribe features a 10.2-inch, glare-free screen (the largest of all Kindle devices), crisp 300 ppi resolution, and 35 LED front lights that automatically adjust to your environment. Further personalize your experience with the adjustable warm light, font sizes, line spacing, and more.\n", "\n", "It comes with your choice of the Basic Pen or the Premium Pen, which you use to write on the screen like you would on paper. They also attach magnetically to your Kindle and never need to be charged. 
The Premium Pen includes a dedicated eraser and a customizable shortcut button.\n", "\n", "The Kindle Scribe has the most storage options of all Kindle devices: choose from 8 GB, 16 GB, or 32 GB to suit your level of reading and writing.\n", "\"\"\"\n", "question = \"what are the key features of new Kindle?\"" ] }, { "cell_type": "code", "execution_count": 114, "id": "d0d20026-4c74-40fa-963d-be9793af704c", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'question: \"{question}\"\\n\n", "Context: \"{context}\"\\n\n", "Answer:'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: question: \"what are the key features of new Kindle?\"\\n\n", "Context: \"The newest and most innovative Kindle yet lets you take notes on millions of books and documents, write lists and journals, and more. \n", "\n", "For readers who have always wished they could write in their eBooks, Amazon’s new Kindle lets them do just that. The Kindle Scribe is the first Kindle for reading and writing and allows users to supplement their books and documents with notes, lists, and more.\n", "\n", "Here’s everything you need to know about the Kindle Scribe, including frequently asked questions.\n", "\n", "The Kindle Scribe makes it easy to read and write like you would on paper \n", "\n", "The Kindle Scribe features a 10.2-inch, glare-free screen (the largest of all Kindle devices), crisp 300 ppi resolution, and 35 LED front lights that automatically adjust to your environment. Further personalize your experience with the adjustable warm light, font sizes, line spacing, and more.\n", "\n", "It comes with your choice of the Basic Pen or the Premium Pen, which you use to write on the screen like you would on paper. They also attach magnetically to your Kindle and never need to be charged. 
The Premium Pen includes a dedicated eraser and a customizable shortcut button.\n", "\n", "The Kindle Scribe has the most storage options of all Kindle devices: choose from 8 GB, 16 GB, or 32 GB to suit your level of reading and writing.\n", "\"\\n\n", "Answer: lets you take notes on millions of books and documents, write lists and journals, and more\n", "\n", "\u001b[1mResult 1\u001b[0m: question: \"what are the key features of new Kindle?\"\\n\n", "Context: \"The newest and most innovative Kindle yet lets you take notes on millions of books and documents, write lists and journals, and more. \n", "\n", "For readers who have always wished they could write in their eBooks, Amazon’s new Kindle lets them do just that. The Kindle Scribe is the first Kindle for reading and writing and allows users to supplement their books and documents with notes, lists, and more.\n", "\n", "Here’s everything you need to know about the Kindle Scribe, including frequently asked questions.\n", "\n", "The Kindle Scribe makes it easy to read and write like you would on paper \n", "\n", "The Kindle Scribe features a 10.2-inch, glare-free screen (the largest of all Kindle devices), crisp 300 ppi resolution, and 35 LED front lights that automatically adjust to your environment. Further personalize your experience with the adjustable warm light, font sizes, line spacing, and more.\n", "\n", "It comes with your choice of the Basic Pen or the Premium Pen, which you use to write on the screen like you would on paper. They also attach magnetically to your Kindle and never need to be charged. 
The Premium Pen includes a dedicated eraser and a customizable shortcut button.\n", "\n", "The Kindle Scribe has the most storage options of all Kindle devices: choose from 8 GB, 16 GB, or 32 GB to suit your level of reading and writing.\n", "\"\\n\n", "Answer: take notes on millions of books and documents, write lists and journals, and more\n", "\n" ] } ], "source": [ "prompts = [\"\"\"question: \\\"{question}\"\\\\n\\nContext: \\\"{context}\"\\\\n\\nAnswer:\"\"\"]\n", "\n", "\n", "parameters = {\n", "    \"max_length\": 500,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 250,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{context}\", context)\n", "    input_text = input_text.replace(\"{question}\", question)\n", "    print(f\"{bold} For prompt{unbold}: '{each_prompt}'{newline}\")\n", "\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold}Result {idx}{unbold}: {each_generated_text}{newline}\")" ] }, { "cell_type": "markdown", "id": "45552d06-6e76-4386-9ae0-ddbe62c5404a", "metadata": {}, "source": [ "### 6.20. 
Sentence / Sentiment Classification" ] }, { "cell_type": "code", "execution_count": 115, "id": "5a5f8885-554b-42a4-9a80-e299f430199e", "metadata": { "tags": [] }, "outputs": [], "source": [ "sentence1 = \"This moive is so great and once again dazzles and delights us\"\n", "options_ = \"\"\"OPTIONS:\\n-positive \\n-negative \"\"\"" ] }, { "cell_type": "code", "execution_count": 116, "id": "8d081751-2f9f-4483-aaf8-daad4a8d14d9", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Review:\n", "This moive is so great and once again dazzles and delights us\n", "Is this movie review sentence negative or positive?\n", "OPTIONS:\n", "-positive \n", "-negative '\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: Review:\n", "This moive is so great and once again dazzles and delights us\n", "Is this movie review sentence negative or positive?\n", "OPTIONS:\n", "-positive \n", "-negative Positive\n", "\n", "\u001b[1mResult 1\u001b[0m: Review:\n", "This moive is so great and once again dazzles and delights us\n", "Is this movie review sentence negative or positive?\n", "OPTIONS:\n", "-positive \n", "-negative Positive\n", "\n", "\u001b[1m For prompt\u001b[0m: 'Short movie review: This moive is so great and once again dazzles and delights us\n", "Did the critic think positively or negatively of the movie?\n", "OPTIONS:\n", "-positive \n", "-negative '\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: Short movie review: This moive is so great and once again dazzles and delights us\n", "Did the critic think positively or negatively of the movie?\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1mResult 1\u001b[0m: Short movie review: This moive is so great and once again dazzles and delights us\n", "Did the critic think positively or negatively of the movie?\n", "OPTIONS:\n", "-positive \n", "-negative 
positive\n", "\n", "\u001b[1m For prompt\u001b[0m: 'Sentence from a movie review: This moive is so great and once again dazzles and delights us\n", "Was the movie seen positively or negatively based on the preceding review? \n", "\n", "OPTIONS:\n", "-positive \n", "-negative '\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: Sentence from a movie review: This moive is so great and once again dazzles and delights us\n", "Was the movie seen positively or negatively based on the preceding review? \n", "\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1mResult 1\u001b[0m: Sentence from a movie review: This moive is so great and once again dazzles and delights us\n", "Was the movie seen positively or negatively based on the preceding review? \n", "\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1m For prompt\u001b[0m: '\"This moive is so great and once again dazzles and delights us\"\n", "How would the sentiment of this sentence be perceived?\n", "\n", "OPTIONS:\n", "-positive \n", "-negative '\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: \"This moive is so great and once again dazzles and delights us\"\n", "How would the sentiment of this sentence be perceived?\n", "\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1mResult 1\u001b[0m: \"This moive is so great and once again dazzles and delights us\"\n", "How would the sentiment of this sentence be perceived?\n", "\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1m For prompt\u001b[0m: 'Is the sentiment of the following sentence positive or negative?\n", "This moive is so great and once again dazzles and delights us\n", "OPTIONS:\n", "-positive \n", "-negative '\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: Is the sentiment of the following sentence positive or negative?\n", "This moive is so great and 
once again dazzles and delights us\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1mResult 1\u001b[0m: Is the sentiment of the following sentence positive or negative?\n", "This moive is so great and once again dazzles and delights us\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1m For prompt\u001b[0m: 'What is the sentiment of the following movie review sentence?\n", "This moive is so great and once again dazzles and delights us\n", "OPTIONS:\n", "-positive \n", "-negative '\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: What is the sentiment of the following movie review sentence?\n", "This moive is so great and once again dazzles and delights us\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n", "\u001b[1mResult 1\u001b[0m: What is the sentiment of the following movie review sentence?\n", "This moive is so great and once again dazzles and delights us\n", "OPTIONS:\n", "-positive \n", "-negative positive\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"Review:\\n{sentence}\\nIs this movie review sentence negative or positive?\\n{options_}\"\"\",\n", " \"\"\"Short movie review: {sentence}\\nDid the critic think positively or negatively of the movie?\\n{options_}\"\"\",\n", " \"\"\"Sentence from a movie review: {sentence}\\nWas the movie seen positively or negatively based on the preceding review? 
\\n\\n{options_}\"\"\",\n", " \"\"\"\\\"{sentence}\\\"\\nHow would the sentiment of this sentence be perceived?\\n\\n{options_}\"\"\",\n", " \"\"\"Is the sentiment of the following sentence positive or negative?\\n{sentence}\\n{options_}\"\"\",\n", " \"\"\"What is the sentiment of the following movie review sentence?\\n{sentence}\\n{options_}\"\"\",\n", "]\n", "\n", "parameters = {\n", "    \"max_length\": 50,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 50,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{sentence}\", sentence1)\n", "    input_text = input_text.replace(\"{options_}\", options_)\n", "    print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold}Result {idx}{unbold}: {each_generated_text}{newline}\")" ] }, { "cell_type": "markdown", "id": "1774dc76-f284-48bc-9792-ce0cf6dd8110", "metadata": {}, "source": [ "### 6.21. 
Imaginary article generation based on title" ] }, { "cell_type": "code", "execution_count": 120, "id": "fc2365eb-4251-4f31-b9fe-f956c789c001", "metadata": { "tags": [] }, "outputs": [], "source": [ "title = \"University has new facility coming up\"" ] }, { "cell_type": "code", "execution_count": 121, "id": "53ad09f7-7894-4723-8d68-129838cda02a", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: 'Title: \"University has new facility coming up\"\\nGiven the above title of an imaginary article, imagine the article.\\n'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: Title: \"University has new facility coming up\"\\nGiven the above title of an imaginary article, imagine the article.\\n The new £5.7m facility will be built on the site of the existing £3.2m building at the University of the Highlands and Islands (UHI) campus in Inverness.\n", "\n", "It will include a new lecture theatre, seminar room, office space and a new entrance.\n", "\n", "UHI said the new building would be completed by the end of the year.\n", "\n", "The existing building was completed in 2004 and has been used for teaching and research since then.\n", "\n", "UHI said the new building would allow it to expand its provision of higher education in the Highlands and Islands.\n", "\n", "It said the new building would also allow it to increase the number of students it could accommodate.\n", "\n", "UHI said it had made a number of changes to its provision of higher education in the Highlands and Islands in recent years.\n", "\n", "It said it had introduced a new range of undergraduate courses in Inverness and Dundee and had increased the number of postgraduate courses it offered in the Highlands and Islands.\n", "\n", "It said it had also increased the number of staff based in the Highlands and Islands.\n", "\n", "UHI said it had been working closely with the Scottish Funding Council 
(SFC) on the new building.\n", "\n", "It said the SFC had made a contribution of £1.6m towards the cost of the new building.\n", "\n", "UHI said the new building would also include a new entrance to the campus.\n", "\n", "It said the new entrance would be designed to reflect the architecture of the existing building.\n", "\n", "UHI said it was also working with the SFC on plans for a new £1.3m laboratory building at its campus in Dundee.\n", "\n", "It said the new laboratory building would be built on the site of the existing laboratory and would provide space for a number of new teaching and research laboratories.\n", "\n", "UHI said the new laboratory building would be completed by the end of 2013.\n", "\n", "\u001b[1mResult 1\u001b[0m: Title: \"University has new facility coming up\"\\nGiven the above title of an imaginary article, imagine the article.\\n The £6.2m research centre, which will be built on the University of Bath campus, will provide a home for the Institute of Advanced Materials and Nanotechnology (IAMNT).\n", "\n", "It will also be home to the Institute for Manufacturing Processes (IMP), which was established in 2007.\n", "\n", "The new facility will be known as the Advanced Materials Innovation Centre (AMIC).\n", "\n", "The centre will be built on the site of the former University of Bath College of Science and Technology (COST) building.\n", "\n", "It will include laboratories, offices, conference facilities and an open plan area for collaborative research, as well as a café and a seminar room.\n", "\n", "The centre will be built on the site of the former University of Bath College of Science and Technology (COST) building\n", "\n", "The new centre will be the home of the Institute of Advanced Materials and Nanotechnology (IAMNT), which was set up in 2006 to conduct research in the fields of advanced materials and nanotechnology.\n", "\n", "The centre will also be home to the Institute for Manufacturing Processes (IMP), which was established in 
2007 to carry out research in the field of advanced manufacturing processes.\n", "\n", "The centre will be built on the site of the former University of Bath College of Science and Technology (COST) building.\n", "\n", "The university said the new building would provide the \"space and facilities\" to allow the two institutes to expand their research activities.\n", "\n", "The new centre will be the home of the Institute of Advanced Materials and Nanotechnology (IAMNT), which was set up in 2006 to conduct research in the fields of advanced materials and nanotechnology.\n", "\n", "The centre will also be home to the Institute for Manufacturing Processes (IMP), which was established in 2007 to carry out research in the field of advanced manufacturing processes.\n", "\n", "The university said the new building would provide the \"space and facilities\" to allow the two institutes to expand their research activities.\n", "\n", "The university said the new building would provide the \"space and facilities\" to allow the two institutes to expand their research activities.\n", "\n", "It is hoped that the new centre will help to attract more research funding, and increase the number of PhD students and post-doctoral researchers at the University of Bath.\n", "\n", "The project is being funded by the Engineering and Physical Sciences Research Council (EPSRC) and the National Institute for Health Research (NIHR).\n", "\n", "It is hoped that the new centre will help to attract more research funding, and increase the number of PhD students and post-doctoral researchers at the University of Bath.\n", "\n", "The project is being funded by the Engineering and Physical Sciences Research Council (EPSRC) and the National Institute for Health Research (NIHR)...\n", "\n" ] } ], "source": [ "prompts = [\n", " \"\"\"Title: \\\"{title}\\\"\\\\nGiven the above title of an imaginary article, imagine the article.\\\\n\"\"\"\n", "]\n", "\n", "\n", "parameters = {\n", " \"max_length\": 
5000,\n", " \"num_return_sequences\": num_return_sequences,\n", " \"top_k\": 50,\n", " \"top_p\": 0.95,\n", " \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{title}\", title)\n", "    print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold}Result {idx}{unbold}: {each_generated_text}{newline}\")" ] }, { "cell_type": "markdown", "id": "7b3b8473-ea51-457c-b555-e647a107a5c2", "metadata": {}, "source": [ "### 6.22. Summarize a title based on the article" ] }, { "cell_type": "code", "execution_count": 122, "id": "1b425e6a-6a1a-4a52-b9b2-b4bbe67bc20b", "metadata": { "tags": [] }, "outputs": [], "source": [ "article = \"\"\"The newest and most innovative Kindle yet lets you take notes on millions of books and documents, write lists and journals, and more. \n", "\n", "For readers who have always wished they could write in their eBooks, Amazon’s new Kindle lets them do just that. The Kindle Scribe is the first Kindle for reading and writing and allows users to supplement their books and documents with notes, lists, and more.\n", "\n", "Here’s everything you need to know about the Kindle Scribe, including frequently asked questions.\n", "\n", "The Kindle Scribe makes it easy to read and write like you would on paper \n", "\n", "The Kindle Scribe features a 10.2-inch, glare-free screen (the largest of all Kindle devices), crisp 300 ppi resolution, and 35 LED front lights that automatically adjust to your environment. 
Further personalize your experience with the adjustable warm light, font sizes, line spacing, and more.\n", "\n", "It comes with your choice of the Basic Pen or the Premium Pen, which you use to write on the screen like you would on paper. They also attach magnetically to your Kindle and never need to be charged. The Premium Pen includes a dedicated eraser and a customizable shortcut button.\n", "\n", "The Kindle Scribe has the most storage options of all Kindle devices: choose from 8 GB, 16 GB, or 32 GB to suit your level of reading and writing.\"\"\"" ] }, { "cell_type": "code", "execution_count": 124, "id": "a41ed3c8-30a5-478a-9b8a-50f5ab0fa6de", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m For prompt\u001b[0m: '''The newest and most innovative Kindle yet lets you take notes on millions of books and documents, write lists and journals, and more. \n", "\n", "For readers who have always wished they could write in their eBooks, Amazon’s new Kindle lets them do just that. The Kindle Scribe is the first Kindle for reading and writing and allows users to supplement their books and documents with notes, lists, and more.\n", "\n", "Here’s everything you need to know about the Kindle Scribe, including frequently asked questions.\n", "\n", "The Kindle Scribe makes it easy to read and write like you would on paper \n", "\n", "The Kindle Scribe features a 10.2-inch, glare-free screen (the largest of all Kindle devices), crisp 300 ppi resolution, and 35 LED front lights that automatically adjust to your environment. Further personalize your experience with the adjustable warm light, font sizes, line spacing, and more.\n", "\n", "It comes with your choice of the Basic Pen or the Premium Pen, which you use to write on the screen like you would on paper. They also attach magnetically to your Kindle and never need to be charged. 
The Premium Pen includes a dedicated eraser and a customizable shortcut button.\n", "\n", "The Kindle Scribe has the most storage options of all Kindle devices: choose from 8 GB, 16 GB, or 32 GB to suit your level of reading and writing. \n", "\n", " \\n\\nGive me a good title for the article above.'\n", "\n", "\u001b[1m The 2 results are\u001b[0m:\n", "\n", "\u001b[1mResult 0\u001b[0m: ''The newest and most innovative Kindle yet lets you take notes on millions of books and documents, write lists and journals, and more. \n", "\n", "For readers who have always wished they could write in their eBooks, Amazon’s new Kindle lets them do just that. The Kindle Scribe is the first Kindle for reading and writing and allows users to supplement their books and documents with notes, lists, and more.\n", "\n", "Here’s everything you need to know about the Kindle Scribe, including frequently asked questions.\n", "\n", "The Kindle Scribe makes it easy to read and write like you would on paper \n", "\n", "The Kindle Scribe features a 10.2-inch, glare-free screen (the largest of all Kindle devices), crisp 300 ppi resolution, and 35 LED front lights that automatically adjust to your environment. Further personalize your experience with the adjustable warm light, font sizes, line spacing, and more.\n", "\n", "It comes with your choice of the Basic Pen or the Premium Pen, which you use to write on the screen like you would on paper. They also attach magnetically to your Kindle and never need to be charged. The Premium Pen includes a dedicated eraser and a customizable shortcut button.\n", "\n", "The Kindle Scribe has the most storage options of all Kindle devices: choose from 8 GB, 16 GB, or 32 GB to suit your level of reading and writing. \n", "\n", " \\n\\nGive me a good title for the article above. 
Amazon's Kindle Scribe: Everything you need to know\n", "\n", "\u001b[1mResult 1\u001b[0m: ''The newest and most innovative Kindle yet lets you take notes on millions of books and documents, write lists and journals, and more. \n", "\n", "For readers who have always wished they could write in their eBooks, Amazon’s new Kindle lets them do just that. The Kindle Scribe is the first Kindle for reading and writing and allows users to supplement their books and documents with notes, lists, and more.\n", "\n", "Here’s everything you need to know about the Kindle Scribe, including frequently asked questions.\n", "\n", "The Kindle Scribe makes it easy to read and write like you would on paper \n", "\n", "The Kindle Scribe features a 10.2-inch, glare-free screen (the largest of all Kindle devices), crisp 300 ppi resolution, and 35 LED front lights that automatically adjust to your environment. Further personalize your experience with the adjustable warm light, font sizes, line spacing, and more.\n", "\n", "It comes with your choice of the Basic Pen or the Premium Pen, which you use to write on the screen like you would on paper. They also attach magnetically to your Kindle and never need to be charged. The Premium Pen includes a dedicated eraser and a customizable shortcut button.\n", "\n", "The Kindle Scribe has the most storage options of all Kindle devices: choose from 8 GB, 16 GB, or 32 GB to suit your level of reading and writing. \n", "\n", " \\n\\nGive me a good title for the article above. 
Kindle Scribe: Everything you need to know\n", "\n" ] } ], "source": [ "prompts = [\"\"\"'\\'{article} \\n\\n \\\\n\\\\nGive me a good title for the article above.\"\"\"]\n", "\n", "parameters = {\n", "    \"max_length\": 5000,\n", "    \"num_return_sequences\": num_return_sequences,\n", "    \"top_k\": 250,\n", "    \"top_p\": 0.95,\n", "    \"do_sample\": True,\n", "}\n", "\n", "\n", "for each_prompt in prompts:\n", "    input_text = each_prompt.replace(\"{article}\", article)\n", "    print(f\"{bold} For prompt{unbold}: '{input_text}'{newline}\")\n", "    payload = {\"text_inputs\": input_text, **parameters}\n", "    query_response = query_endpoint_with_json_payload(\n", "        json.dumps(payload).encode(\"utf-8\"), endpoint_name=endpoint_name\n", "    )\n", "    generated_texts = parse_response_multiple_texts(query_response)\n", "    print(f\"{bold} The {num_return_sequences} results are{unbold}:{newline}\")\n", "    for idx, each_generated_text in enumerate(generated_texts):\n", "        print(f\"{bold}Result {idx}{unbold}: {each_generated_text}{newline}\")" ] }, { "cell_type": "markdown", "id": "aa5de21f", "metadata": {}, "source": [ "### 7. Clean up the endpoint" ] }, { "cell_type": "code", "execution_count": 90, "id": "69b588d1", "metadata": {}, "outputs": [], "source": [ "# Delete the SageMaker model and endpoint\n", "model_predictor.delete_model()\n", "model_predictor.delete_endpoint()" ] }, { "cell_type": "markdown", "id": "4f08c710", "metadata": {}, "source": [ "## Notebook CI Test Results\n", "\n", "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. 
Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n", "\n", "![This badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/introduction_to_amazon_algorithms|jumpstart-foundation-models|text-generation-few-shot-learning.ipynb)\n" ] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, 
"name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", 
"vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": 
false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", 
"gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", 
"gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 } ], "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science 2.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/sagemaker-data-science-38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" } }, "nbformat": 4, "nbformat_minor": 5 }