{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "10f42322-8e6b-4df5-9585-a969d364b12a",
   "metadata": {},
   "source": [
    "# Summarize Scientific Documents with a Foundation Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2e28edf-de16-4de0-96ff-c4b5678d4e17",
   "metadata": {},
   "source": [
    "Researchers must stay up-to-date on their fields of interest. However, it's difficult to keep track of the large number of journals, whitepapers, and research pre-prints generated in many areas. In response, many research groups have turned to AI/ML tools to summarize and classify new documents.\n",
    "\n",
    "In this workshop, we'll use a foundation model (FM) to summarize scientific documents from the Hugging Face [scientific_papers](https://huggingface.co/datasets/scientific_papers) dataset.\n",
    "\n",
    "This notebook was created and tested on an `ml.m5.2xlarge` (8 vCPU + 32 GiB) notebook instance running the `Python 3 (Data Science 3.0)` kernel in SageMaker Studio."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0da45f90-8e9c-4c0e-8fe6-da0f9bbe994b",
   "metadata": {},
   "source": [
    "## 1. Install required libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ff4bfd94-7d5f-4c69-b4b1-637d71c465e6",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%pip install -q -U pip\n",
    "%pip install -q -U torch --index-url https://download.pytorch.org/whl/cpu \n",
    "%pip install -q -U transformers datasets einops accelerate "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1de31b09-df82-49af-876f-f5c76b4dd13c",
   "metadata": {},
   "source": [
    "## 2. Download PubMed document abstracts"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f7ac0c3-4b3d-48cb-84be-388f00f758ee",
   "metadata": {},
   "source": [
    "Download the first 5,000 records of the PubMed test split from the Hugging Face Hub (https://huggingface.co/datasets/scientific_papers)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "2d63a32a-a54f-4296-a896-e734c1e16d6a",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Found cached dataset scientific_papers (/root/.cache/huggingface/datasets/scientific_papers/pubmed/1.1.1/306757013fb6f37089b6a75469e6638a553bd9f009484938d8f75a4c5e84206f)\n"
     ]
    }
   ],
   "source": [
    "from datasets import load_dataset\n",
    "\n",
    "dataset = load_dataset(\"scientific_papers\", \"pubmed\", split='test[:5000]')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d256b9cb-2d09-4fe2-994b-8e50b6ed0ce1",
   "metadata": {},
   "source": [
    "Take a look at an example abstract."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "a453296c-ab89-427e-83b0-3fcab4fa4989",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " background : dental students use extracted human teeth to learn practical and technical skills before they enter the clinical environment . in the present research , knowledge , performance , and attitudes toward sterilization / disinfection methods of extracted human teeth were evaluated in a selected group of iranian dental students.materials and methods : in this descriptive cross - sectional study the subjects consisted of fourth- , fifth- and sixth - year dental students . \n",
      " data were collected by questionnaires and analyzed by fisher 's exact test and chi - squared test using spss 11.5.results:in this study , 100 dental students participated . \n",
      " the average knowledge score was 15.9  4.8 . \n",
      " based on the opinion of 81 students sodium hypochlorite was selected as suitable material for sterilization and 78 students believed that oven sterilization is a good way for the purpose . \n",
      " the average performance score was 4.1  0.8 , with 3.9  1.7 and 4.3  1.1 for males and females , respectively , with no significant differences between the two sexes . \n",
      " the maximum and minimum attitude scores were 60 and 25 , with an average score of 53.1  5.2.conclusion:the results of this study indicated that knowledge , performance and attitude of dental students in relation to sterilization / disinfection methods of extracted human teeth were good . \n",
      " however , weaknesses were observed in relation to teaching and materials suitable for sterilization . \n"
     ]
    }
   ],
   "source": [
    "import random\n",
    "\n",
    "def get_random_abstract(data):\n",
    "    return random.choice(data['abstract'])\n",
    "    \n",
    "abstract = get_random_abstract(dataset)\n",
    "print(abstract)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d75362f8-af69-426b-8289-659818c3a63d",
   "metadata": {},
   "source": [
    "## 3. Generate abstract summaries using a foundation model (FM)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff893fc0-d28d-4431-898f-d8fdec0d5755",
   "metadata": {},
   "source": [
    "[Flan-T5](https://huggingface.co/docs/transformers/model_doc/flan-t5) is a foundation model trained on a large collection of text documents. In addition, it was \"instruction-tuned\" to perform reasonably well on a wide range of language processing tasks, such as question answering and translation. In this example, we'll use it for text summarization.\n",
    "\n",
    "Although the pre-training data likely included some scientific text, Flan-T5 was not specifically trained to handle biomedical text. We'll see the result of this in the outputs below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "c3302b8b-c099-4226-a4eb-7afe42479b5c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from transformers import AutoTokenizer, AutoModelForSeq2SeqLM\n",
    "\n",
    "model_checkpoint='google/flan-t5-base'\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=True)\n",
    "model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "87f6b19d-a602-49c0-ae66-39df06e8b74e",
   "metadata": {},
   "source": [
    "### 3.1. Basic text generation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73bb11ab-015e-4528-87e8-aa4e43ad496b",
   "metadata": {},
   "source": [
    "Before passing the abstract text to our model, we need to tokenize it, i.e. convert it from text into a numerical representation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "4122e92c-9292-414b-8fe4-0bb011f4a5a6",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([ 2458,     3,    10,  4814,   481,   169, 21527,   936,  3841,    12,\n",
      "          669,  3236,    11,  2268,  1098,   274,    79,  2058,     8,  3739,\n",
      "         1164,     3,     5,    16,     8,   915,   585,     3,     6,  1103,\n",
      "            3,     6,   821,     3,     6,    11, 18537,  2957, 29675,   257,\n",
      "            3,    87, 30929,    23,   106,  2254,    13, 21527,   936,  3841,\n",
      "          130, 14434,    16,     3,     9,  2639,   563,    13,     3,    23,\n",
      "           52,     9, 15710,  4814,   481,     5, 11303,     7,    11,  2254,\n",
      "            3,    10,    16,    48, 25444,  2269,     3,    18,  1375,   138,\n",
      "          810,     8,  7404, 14280,    26,    13,  4509,    18,     3,     6,\n",
      "         8486,    18,    11, 13305,     3,    18,   215,  4814,   481,     3,\n",
      "            5,   331,   130,  4759,    57, 19144,     7,    11,     3, 16466,\n",
      "           57,  2495,    49,     3,    31,     7,  2883,   794,    11,     3,\n",
      "         1436,     3,    18,  2812,    26,   794,   338,     3,     7,   102,\n",
      "            7,     7,  7806,  9125,    60,     7,    83,    17,     7,    10,\n",
      "           77,    48,   810,     3,     6,   910,  4814,   481, 10627,     3,\n",
      "            5,     8,  1348,  1103,  2604,    47,  9996,  1298,     3, 27441,\n",
      "            3,     5,     3,   390,    30,     8,  3474,    13,     3,  4959,\n",
      "          481, 19049, 10950,   524,   322,   155,    15,    47,  2639,    38,\n",
      "         3255,  1037,    21, 29675,   257,    11,     3,  3940,   481,  6141,\n",
      "           24,  4836, 29675,   257,    19,     3,     9,   207,   194,    21,\n",
      "            8,  1730,     3,     5,     8,  1348,   821,  2604,    47,     3,\n",
      "        19708,     3, 22384,     3,     6,    28,     3, 28640,     3, 18596,\n",
      "           11,     3, 21841,     3, 11039,    21,  5069,     7,    11,  3955,\n",
      "            7,     3,     6,  6898,     3,     6,    28,   150,  1516,  5859,\n",
      "          344,     8,   192,     3,     7,  6667,     7,     3,     5,     8,\n",
      "         2411,    11,  2559,  7525,  7586,   130,  1640,    11,   944,     3,\n",
      "            6,    28,    46,  1348,  2604,    13,   305, 18495,     3, 24342,\n",
      "            5,  1018, 11593,    10,   532,   772,    13,    48,   810,  7972,\n",
      "           24,  1103,     3,     6,   821,    11,  7525,    13,  4814,   481,\n",
      "           16,  4689,    12, 29675,   257,     3,    87, 30929,    23,   106,\n",
      "         2254,    13, 21527,   936,  3841,   130,   207,     3,     5,   983,\n",
      "            3,     6, 21506,   130,  6970,    16,  4689,    12,  2119,    11,\n",
      "         1397,  3255,    21, 29675,   257,     3,     5,     3,     1])\n"
     ]
    }
   ],
   "source": [
    "import pprint\n",
    "tokenized_abstract = tokenizer(abstract, return_tensors='pt')\n",
    "pprint.pprint(tokenized_abstract['input_ids'][0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61a30134-f310-4d3c-ab60-31cae38f4288",
   "metadata": {},
   "source": [
    "Next, we pass the tokens to the model and ask it to generate new tokens to \"fill in the blank\" at the end."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "5c259adc-48f2-4326-8a91-68876ef9db22",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([    0,  1103,     6,   821,    11, 18537,  2957, 29675,   257,  2254,\n",
       "           13, 21527,   936,  3841,   130,   207,    68, 31221,   130,  6970,\n",
       "           16,  2119,    11,  1397,  3255,    21, 29675,   257,     5,     1])"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model_output = model.generate(tokenized_abstract['input_ids'], max_new_tokens=50)[0]\n",
    "model_output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b4e64a2-2942-455a-944a-e0081bbd72e3",
   "metadata": {},
   "source": [
    "Finally, we decode the model output back into text and clean it up."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "8c5fe645-cb91-440c-9151-7458c644f3b9",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Knowledge, performance and attitudes toward sterilization methods of extracted human teeth were good but shortcomings were observed in teaching and materials suitable for sterilization.\n"
     ]
    }
   ],
   "source": [
    "print(tokenizer.decode(model_output, skip_special_tokens=True).strip().capitalize())"
   ]
  },
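  {
   "cell_type": "markdown",
   "id": "capitalize-caveat-sketch",
   "metadata": {},
   "source": [
    "One caveat about the clean-up step: Python's `str.capitalize()` uppercases the first character and lowercases everything after it, so acronyms and proper nouns in the decoded text lose their capitals. The sketch below shows the difference on a made-up string; `capitalize_first` is a hypothetical helper, not part of this notebook's pipeline.\n",
    "\n",
    "```python\n",
    "def capitalize_first(text):\n",
    "    # Hypothetical helper: uppercase only the first character, keep the rest as-is\n",
    "    text = text.strip()\n",
    "    return text[:1].upper() + text[1:]\n",
    "\n",
    "raw = '  results were analyzed with SPSS.  '  # made-up example string\n",
    "print(raw.strip().capitalize())  # everything after the first character is lowercased\n",
    "print(capitalize_first(raw))     # 'SPSS' keeps its capitals\n",
    "```\n",
    "\n",
    "Whether this matters depends on the text; the PubMed abstracts shown here are already lowercased, so `capitalize()` loses little."
   ]
  },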
  {
   "cell_type": "markdown",
   "id": "1de343cf-8b31-4068-9565-7dafff069e10",
   "metadata": {},
   "source": [
    "Let's put these steps together in a single function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "f84dcbfd-96ec-4284-a556-97fa97092c18",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MODEL INPUT:\n",
      "' background : dental students use extracted human teeth to learn practical and technical skills before they enter the clinical environment . in the present research , knowledge , performance , and attitudes toward sterilization / disinfection methods of extracted human teeth were evaluated in a selected group of iranian dental students.materials and methods : in this descriptive cross - sectional study the subjects consisted of fourth- , fifth- and sixth - year dental students . \n",
      " data were collected by questionnaires and analyzed by fisher 's exact test and chi - squared test using spss 11.5.results:in this study , 100 dental students participated . \n",
      " the average knowledge score was 15.9  4.8 . \n",
      " based on the opinion of 81 students sodium hypochlorite was selected as suitable material for sterilization and 78 students believed that oven sterilization is a good way for the purpose . \n",
      " the average performance score was 4.1  0.8 , with 3.9  1.7 and 4.3  1.1 for males and females , respectively , with no significant differences between the two sexes . \n",
      " the maximum and minimum attitude scores were 60 and 25 , with an average score of 53.1  5.2.conclusion:the results of this study indicated that knowledge , performance and attitude of dental students in relation to sterilization / disinfection methods of extracted human teeth were good . \n",
      " however , weaknesses were observed in relation to teaching and materials suitable for sterilization . '\n",
      "\n",
      "MODEL OUTPUT:\n",
      "'Knowledge and attitudes toward sterilization methods of extracted human teeth are good but educational and material suitable for sterilization are abysmal'\n",
      "\n"
     ]
    }
   ],
   "source": [
    "def generate_text(input_text):\n",
    "    model_input = input_text\n",
    "    tokenized_input = tokenizer(model_input, return_tensors='pt')\n",
    "    model_output = tokenizer.decode(\n",
    "        model.generate(\n",
    "            tokenized_input[\"input_ids\"], \n",
    "            max_new_tokens=50,\n",
    "            temperature=0.75,\n",
    "            do_sample=True\n",
    "        )[0], \n",
    "        skip_special_tokens=True\n",
    "    )\n",
    "    \n",
    "    return(model_input, model_output.strip().capitalize())\n",
    "\n",
    "\n",
    "no_prompt_input, no_prompt_output = generate_text(abstract)\n",
    "\n",
    "print(f\"MODEL INPUT:\\n'{no_prompt_input}'\\n\")\n",
    "print(f\"MODEL OUTPUT:\\n'{no_prompt_output}'\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb14daf5-9092-4eff-ac98-f75d60bc4560",
   "metadata": {},
   "source": [
    "### 3.2. Using a text prompt"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd467033-1f82-48ab-988d-fe99fdf0c82b",
   "metadata": {},
   "source": [
    "We can guide the model to generate a more accurate response via \"prompt engineering\". This helps it to understand the task at hand. In the next cell, we'll try using a list of prompts to see how they affect the output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "8377dbe6-dfab-43ae-bb31-4cc6211e700f",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Abstract:\n",
      "' background : dental students use extracted human teeth to learn practical and technical skills before they enter the clinical environment . in the present research , knowledge , performance , and attitudes toward sterilization / disinfection methods of extracted human teeth were evaluated in a selected group of iranian dental students.materials and methods : in this descriptive cross - sectional study the subjects consisted of fourth- , fifth- and sixth - year dental students . \n",
      " data were collected by questionnaires and analyzed by fisher 's exact test and chi - squared test using spss 11.5.results:in this study , 100 dental students participated . \n",
      " the average knowledge score was 15.9  4.8 . \n",
      " based on the opinion of 81 students sodium hypochlorite was selected as suitable material for sterilization and 78 students believed that oven sterilization is a good way for the purpose . \n",
      " the average performance score was 4.1  0.8 , with 3.9  1.7 and 4.3  1.1 for males and females , respectively , with no significant differences between the two sexes . \n",
      " the maximum and minimum attitude scores were 60 and 25 , with an average score of 53.1  5.2.conclusion:the results of this study indicated that knowledge , performance and attitude of dental students in relation to sterilization / disinfection methods of extracted human teeth were good . \n",
      " however , weaknesses were observed in relation to teaching and materials suitable for sterilization . '\n",
      "\n",
      "#########################\n",
      "Prompt: 'Briefly summarize this sentence: {text}'\n",
      "Model response: 'Information about sterilization methods and attitudes toward sterilization methods of extracted human teeth in dental students of iranian university'\n",
      "\n",
      "#########################\n",
      "Prompt: 'Write a short summary for this text: {text}'\n",
      "Model response: 'Knowledge, performance and attitudes toward sterilization methods is good but there are weaknesses in teaching and materials for sterilization'\n",
      "\n",
      "#########################\n",
      "Prompt: '{text}\n",
      "\n",
      "Write a brief summary in a sentence or less'\n",
      "Model response: 'Knowledge performance and attitudes toward sterilization methods of extracted human teeth are good but shortcomings in teaching and materials for sterilization'\n",
      "\n",
      "#########################\n",
      "Prompt: 'Write a sentence based on '{text}''\n",
      "Model response: 'We have found that the knowledge and attitudes toward sterilization / disinfection methods of extracted human teeth were good but there were some weaknesses in teaching and materials suitable for sterilization, which may have to be addressed in the future.'\n",
      "\n",
      "#########################\n",
      "Prompt: 'Summarize this article:\n",
      "\n",
      "{text}'\n",
      "Model response: 'Knowledge, performance and attitudes toward sterilization methods of extracted human teeth are good'\n",
      "\n"
     ]
    }
   ],
   "source": [
    "prompts = [\n",
    "    \"Briefly summarize this sentence: {text}\",\n",
    "    \"Write a short summary for this text: {text}\",\n",
    "    \"{text}\\n\\nWrite a brief summary in a sentence or less\",\n",
    "    \"Write a sentence based on '{text}'\",\n",
    "    \"Summarize this article:\\n\\n{text}\",\n",
    "]\n",
    "print(f\"Abstract:\\n'{abstract}'\\n\")\n",
    "for each_prompt in prompts:\n",
    "    print(\"#\"*25)\n",
    "    print(f\"Prompt: '{each_prompt}'\")\n",
    "    prompt_input = each_prompt.replace(\"{text}\", abstract)\n",
    "    prompted_input, prompted_output = generate_text(prompt_input)\n",
    "    print(f\"Model response: '{prompted_output}'\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b70b84da-2ee0-4fdc-b7e9-c0a268866b73",
   "metadata": {},
   "source": [
    "These are all pretty good. For the purposes of our testing, let's use the prompt `\"Summarize this article:\\n\\n{text}\"` going forward."
   ]
  },
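  {
   "cell_type": "markdown",
   "id": "prompt-template-sketch",
   "metadata": {},
   "source": [
    "The prompts above mark where the abstract goes with a `{text}` placeholder. As a minimal sketch of that substitution, the hypothetical helper `fill_prompt` below (not part of the workshop code) uses `str.replace` rather than `str.format`. The assumption is that some abstracts may contain literal braces, which `str.format` would try to interpret as fields.\n",
    "\n",
    "```python\n",
    "def fill_prompt(template, text, placeholder='{text}'):\n",
    "    # Hypothetical helper: put the abstract where the placeholder is.\n",
    "    # str.replace leaves any other braces in the template or text untouched.\n",
    "    return template.replace(placeholder, text)\n",
    "\n",
    "toy_abstract = 'a {toy} abstract with braces .'  # made-up input\n",
    "filled = fill_prompt('Summarize this article:\\n\\n{text}', toy_abstract)\n",
    "print(filled)\n",
    "```\n",
    "\n",
    "With the default arguments used in the next cell, prepending `start_prompt` to the abstract produces the same string as filling this template."
   ]
  },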
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "f7465c23-684f-454e-a459-10c5fe208216",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MODEL INPUT:\n",
      "'Summarize this article:\n",
      "\n",
      " background : dental students use extracted human teeth to learn practical and technical skills before they enter the clinical environment . in the present research , knowledge , performance , and attitudes toward sterilization / disinfection methods of extracted human teeth were evaluated in a selected group of iranian dental students.materials and methods : in this descriptive cross - sectional study the subjects consisted of fourth- , fifth- and sixth - year dental students . \n",
      " data were collected by questionnaires and analyzed by fisher 's exact test and chi - squared test using spss 11.5.results:in this study , 100 dental students participated . \n",
      " the average knowledge score was 15.9  4.8 . \n",
      " based on the opinion of 81 students sodium hypochlorite was selected as suitable material for sterilization and 78 students believed that oven sterilization is a good way for the purpose . \n",
      " the average performance score was 4.1  0.8 , with 3.9  1.7 and 4.3  1.1 for males and females , respectively , with no significant differences between the two sexes . \n",
      " the maximum and minimum attitude scores were 60 and 25 , with an average score of 53.1  5.2.conclusion:the results of this study indicated that knowledge , performance and attitude of dental students in relation to sterilization / disinfection methods of extracted human teeth were good . \n",
      " however , weaknesses were observed in relation to teaching and materials suitable for sterilization . '\n",
      "\n",
      "MODEL OUTPUT:\n",
      "'Learning and teaching methods of dental students in iran to sterilize human teeth are adequate'\n",
      "\n"
     ]
    }
   ],
   "source": [
    "def generate_w_prompt(input_text, start_prompt = 'Summarize this article:\\n\\n', end_prompt = ''):  \n",
    "    model_input = start_prompt + input_text + end_prompt\n",
    "    prompted_input, prompted_output = generate_text(model_input)\n",
    "\n",
    "    return(prompted_input, prompted_output.strip().capitalize())\n",
    "    \n",
    "prompted_input, prompted_output = generate_w_prompt(abstract)\n",
    "\n",
    "print(f\"MODEL INPUT:\\n'{prompted_input}'\\n\")\n",
    "print(f\"MODEL OUTPUT:\\n'{prompted_output}'\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9e7d796c-56c0-460a-98e3-339aa35975d8",
   "metadata": {
    "tags": []
   },
   "source": [
    "### 3.3. Use few-shot inference"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "60838582-51b2-4f96-9a8d-e28ec79bac80",
   "metadata": {},
   "source": [
    "Like people, LLMs often perform better when shown a few examples. Here we prepend one or more example text/summary pairs to the prompt, a technique known as \"few-shot learning\". We're not actually retraining the model, just giving it additional guidance at inference time.\n",
    "\n",
    "In this case, our \"examples\" will be from the [scitldr](https://huggingface.co/datasets/allenai/scitldr) dataset from the Allen Institute."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "fc579463-3c7c-47df-95a5-fdfbd1ea83b2",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Found cached dataset scitldr (/root/.cache/huggingface/datasets/allenai___scitldr/Abstract/0.0.0/79e0fa75961392034484808cfcc8f37deb15ceda153b798c92d9f621d1042fef)\n"
     ]
    }
   ],
   "source": [
    "huggingface_dataset_name = \"allenai/scitldr\"\n",
    "scitldr_dataset = load_dataset(huggingface_dataset_name, 'Abstract', split='train')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "38f3dfb6-aff8-45a7-b39e-ebd7da38b4e6",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MODEL INPUT:\n",
      "'Summarize this article:\n",
      "\n",
      "The Deep Image Prior (DIP, Ulyanov et al., 2017) is a fascinating recent approach for recovering images which appear natural, yet is not fully understood. This work aims at shedding some further light on this approach by investigating the properties of the early outputs of the DIP. First, we show that these early iterations demonstrate invariance to adversarial perturbations by classifying progres\n",
      "\n",
      "We investigate properties of the recently introduced deep image prior (ulyanov et al, 2017)\n",
      "\n",
      "Summarize this article:\n",
      "\n",
      " background : dental students use extracted human teeth to learn practical and technical skills before they enter the clinical environment . in the present research , knowledge , performance , and attitudes toward sterilization / disinfection methods of extracted human teeth were evaluated in a selected group of iranian dental students.materials and methods : in this descriptive cross - sectional study the subjects consisted of fourth- , fifth- and sixth - year dental students . \n",
      " data were collected by questionnaires and analyzed by fisher 's exact test and chi - squared test using spss 11.5.results:in this study , 100 dental students participated . \n",
      " the average knowledge score was 15.9  4.8 . \n",
      " based on the opinion of 81 students sodium hypochlorite was selected as suitable material for sterilization and 78 students believed that oven sterilization is a good way for the purpose . \n",
      " the average performance score was 4.1  0.8 , with 3.9  1.7 and 4.3  1.1 for males and females , respectively , with no significant differences between the two sexes . \n",
      " the maximum and minimum attitude scores were 60 and 25 , with an average score of 53.1  5.2.conclusion:the results of this study indicated that knowledge , performance and attitude of dental students in relation to sterilization / disinfection methods of extracted human teeth were good . \n",
      " however , weaknesses were observed in relation to teaching and materials suitable for sterilization . '\n",
      "\n",
      "MODEL OUTPUT:\n",
      "'In a qualitative and quantitative study the knowledge of dental students regarding sterilization methods of extracted human teeth is good and the teaching is good.'\n",
      "\n"
     ]
    }
   ],
   "source": [
    "def generate_w_few_shot_prompt(input_text, example_dataset, num_shots=1, sep_sequence = '\\n\\n', start_prompt = 'Summarize this article:\\n\\n'): \n",
    "    shots = ''\n",
    "    for _ in range(num_shots):\n",
    "        # randrange excludes the upper bound, so the index is always valid\n",
    "        n = random.randrange(len(example_dataset))\n",
    "        example_text = ' '.join(example_dataset[n]['source'])[:400]\n",
    "        example_summary = example_dataset[n]['target'][0].strip().capitalize()\n",
    "        # Append each example so that num_shots > 1 keeps every shot\n",
    "        shots += start_prompt + example_text + sep_sequence + example_summary + sep_sequence\n",
    "    shot = shots + start_prompt\n",
    "    few_shot_input, few_shot_output = generate_w_prompt(input_text=input_text, start_prompt=shot)\n",
    "    return(few_shot_input, few_shot_output.strip().capitalize())\n",
    "\n",
    "few_shot_input, few_shot_output = generate_w_few_shot_prompt(abstract, scitldr_dataset, num_shots=1)\n",
    "\n",
    "print(f\"MODEL INPUT:\\n'{few_shot_input}'\\n\")\n",
    "print(f\"MODEL OUTPUT:\\n'{few_shot_output}'\\n\")"
   ]
  },
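  {
   "cell_type": "markdown",
   "id": "few-shot-layout-sketch",
   "metadata": {},
   "source": [
    "To make the layout of a few-shot prompt explicit, here is a minimal sketch that uses hand-written toy pairs instead of scitldr records. The helper name `build_few_shot_prompt` and the example strings are assumptions for illustration only; the instruction and separator mirror the defaults used above.\n",
    "\n",
    "```python\n",
    "def build_few_shot_prompt(examples, query, instruction='Summarize this article:\\n\\n', sep='\\n\\n'):\n",
    "    # Hypothetical helper: examples is a list of (text, summary) pairs.\n",
    "    parts = [instruction + text + sep + summary + sep for text, summary in examples]\n",
    "    # The last block repeats the instruction and leaves the summary for the model.\n",
    "    return ''.join(parts) + instruction + query\n",
    "\n",
    "toy_examples = [('Long text A.', 'Short A.'), ('Long text B.', 'Short B.')]  # made-up pairs\n",
    "demo_prompt = build_few_shot_prompt(toy_examples, 'Long text C.')\n",
    "print(demo_prompt)\n",
    "```\n",
    "\n",
    "Each added shot lengthens the input, so the number of examples has to be balanced against the model's maximum input length."
   ]
  },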
  {
   "cell_type": "markdown",
   "id": "2c3e738e-75ca-4f7f-82c6-11f41935b9d5",
   "metadata": {},
   "source": [
    "### 3.4. Compare all the methods"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "5237d1bf-6aa8-43e4-bb9c-145cfab4b8e2",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ABSTRACT:\n",
      " background : dental students use extracted human teeth to learn practical and technical skills before they enter the clinical environment . in the present research , knowledge , performance , and attitudes toward sterilization / disinfection methods of extracted human teeth were evaluated in a selected group of iranian dental students.materials and methods : in this descriptive cross - sectional study the subjects consisted of fourth- , fifth- and sixth - year dental students . \n",
      " data were collected by questionnaires and analyzed by fisher 's exact test and chi - squared test using spss 11.5.results:in this study , 100 dental students participated . \n",
      " the average knowledge score was 15.9  4.8 . \n",
      " based on the opinion of 81 students sodium hypochlorite was selected as suitable material for sterilization and 78 students believed that oven sterilization is a good way for the purpose . \n",
      " the average performance score was 4.1  0.8 , with 3.9  1.7 and 4.3  1.1 for males and females , respectively , with no significant differences between the two sexes . \n",
      " the maximum and minimum attitude scores were 60 and 25 , with an average score of 53.1  5.2.conclusion:the results of this study indicated that knowledge , performance and attitude of dental students in relation to sterilization / disinfection methods of extracted human teeth were good . \n",
      " however , weaknesses were observed in relation to teaching and materials suitable for sterilization . \n",
      "\n",
      "\n",
      "NO-PROMPT-SUMMARY:\tKnowledge and attitudes toward sterilization methods of extracted human teeth are good but educational and material suitable for sterilization are abysmal\n",
      "ZERO-SHOT-SUMMARY:\tLearning and teaching methods of dental students in iran to sterilize human teeth are adequate\n",
      "FEW-SHOT-SUMMARY:\tIn a qualitative and quantitative study the knowledge of dental students regarding sterilization methods of extracted human teeth is good and the teaching is good.\n"
     ]
    }
   ],
   "source": [
    "print(\"ABSTRACT:\")\n",
    "print(abstract)\n",
    "print(\"\\n\")\n",
    "print(f\"NO-PROMPT-SUMMARY:\\t{no_prompt_output}\")\n",
    "print(f\"ZERO-SHOT-SUMMARY:\\t{prompted_output}\")\n",
    "print(f\"FEW-SHOT-SUMMARY:\\t{few_shot_output}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34824488-86d0-49d5-a8d1-59305bc612e0",
   "metadata": {},
   "source": [
    "Try a few more examples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "c07dc232-ee04-4285-8c75-528108532b9f",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "#########################\n",
      "SOURCE_TEXT:\n",
      " background : delivery is one of the most important crises with mental , social , and deep emotional dimensions in women 's life . health providers respect to pregnant women 's bill of rights , as an important component of providing humanistic and ethical care , is of utmost importance . \n",
      " this study aimed to determine health providers compliance with the pregnant women 's bill of rights in labor and delivery and some of its related factors in 2013.materials and methods : this descriptive , cross - sectional study was carried out on the subjects selected through census sampling ( n = 257 ) from among the healthcare providers working in the labor rooms of four educational hospitals . \n",
      " the data were collected by a self - reported questionnaire whose validity and reliability were established . \n",
      " data were analyzed through descriptive and inferential statistics.results:the compliance with pregnant women 's bill of rights was found to be at a very high level in 22.8% of the midwifery students , 28.6% of the residents of obstetrics and gynecology , 21.9% of the interns , 50% of the obstetrics and gynecology faculty members ( professors ) , and 31.9% of the midwives . \n",
      " there was a significant difference between the five groups of service providers in terms of overall compliance with mothers rights ( p = 0.002 ) . \n",
      " the results showed that the residents in higher years of education ( p = 0.001 ) , midwifery students in higher semesters ( p = 0.001 ) , midwives with more work experience ( p < 0.001 ) , and personal experience of being hospitalized in labor room ( p < 0.001 ) had a higher compliance with bill of rights . \n",
      " meanwhile , there was no significant difference in compliance with bill of rights between labor and age ( p = 0.82).conclusions : the results showed that the health providers compliance with the pregnant women 's bill of rights was not acceptable in the labor room . \n",
      " therefore , necessary actions are needed to remove the barriers against pregnant women 's compliance of bill of rights and to facilitate the compliance with it in hospitals . \n",
      "\n",
      "NO-PROMPT-SUMMARY:\tCompliance with the pregnant women's bill of rights in labor and delivery is not acceptable in the labor room.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Token indices sequence length is longer than the specified maximum sequence length for this model (645 > 512). Running this sequence through the model will result in indexing errors\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ZERO-SHOT-PROMPT:\tCompliance with the pregnant women's bill of rights in labor and delivery is not acceptable in the labor room.\n",
      "FEW-SHOT-PROMPT:\tThe compliance with pregnant women's bill of rights is not acceptable in the labor room\n",
      "\n",
      "#########################\n",
      "SOURCE_TEXT:\n",
      " objectivesin some clinical situations , dentists come across partially edentulous patients , \n",
      " and it might be necessary to connect teeth to implants . \n",
      " the aim of this study was \n",
      " to evaluate a metal - ceramic fixed tooth / implant - supported denture with a straight \n",
      " segment , located in the posterior region of the maxilla , when varying the number \n",
      " of teeth used as abutments . \n",
      " materials and methodsa three - element fixed denture composed of one tooth and one implant ( model 1 ) , and \n",
      " a four - element fixed denture composed of two teeth and one implant ( model 2 ) were \n",
      " modeled . \n",
      " a 100 n load was applied , distributed uniformly on the entire set , \n",
      " simulating functional mastication , for further analysis of the seqv ( von mises ) \n",
      " principal stresses , which were compared with the flow limit of the materials . \n",
      " resultsin a quantitative analysis , it may be observed that in the denture with one tooth , \n",
      " the maximum seqv stress was 47.84 mpa , whereas for the denture with two teeth the \n",
      " maximum seqv stress was 35.82 mpa , both located in the region between the pontic \n",
      " and the tooth . \n",
      " conclusionlower stresses were observed in the denture with an additional tooth . based on the \n",
      " flow limit of the materials , \n",
      " porcelain showed values below the limit of functional \n",
      " mastication . \n",
      "\n",
      "NO-PROMPT-SUMMARY:\tA metal-ceramic model of a fixed denture that has a straight segment and a small incisor.\n",
      "ZERO-SHOT-PROMPT:\tA metal - ceramic fixed implant - supported denture with a straight segment located in the posterior region of the maxilla.\n",
      "FEW-SHOT-PROMPT:\tA metal - ceramic fixed tooth / implant supported denture with a straight segment located in the posterior region of the maxilla: a comparison of strengths\n",
      "\n",
      "#########################\n",
      "SOURCE_TEXT:\n",
      " nucleotide insertions in the ferritin light chain ( ftl ) polypeptide gene \n",
      "  cause hereditary ferritinopathy , a neurodegenerative disease characterized by \n",
      "  abnormal accumulation of ferritin and iron in the central nervous system . here \n",
      "  \n",
      " we describe for the first time the protein structure and iron storage function \n",
      "  of the ftl mutant p.phe167serfsx26 ( mt - ftl ) , which has a c terminus \n",
      "  altered in sequence and extended in length . \n",
      " mt - ftl polypeptides assembled \n",
      "  spontaneously into soluble , spherical 24-mers that were ultrastructurally \n",
      "  indistinguishable from those of the wild type . \n",
      " far - uv cd showed a decrease in \n",
      "  -helical content , and 8-anilino-1-naphthalenesulfonate fluorescence \n",
      "  revealed the appearance of hydrophobic binding sites . \n",
      " near - uv cd and \n",
      "  proteolysis studies suggested little or no structural alteration outside of \n",
      "  the c - terminal region . \n",
      " in contrast to wild type , mt - ftl homopolymers \n",
      "  precipitated at much lower iron loading , had a diminished capacity to \n",
      "  incorporate iron , and were less thermostable . \n",
      " however , precipitation was \n",
      "  significantly reversed by addition of iron chelators both in vitro \n",
      "  and in vivo . \n",
      " our results reveal substantial protein conformational \n",
      "  changes localized at the 4-fold pore of mt - ftl homopolymers and imply that the \n",
      "  c terminus of the mt - ftl polypeptide plays an important role in ferritin \n",
      "  solubility , stability , and iron management . \n",
      " we propose that the protrusion of \n",
      "  some portion of the c terminus above the spherical shell allows it to \n",
      "  cross - link with other mutant polypeptides through iron bridging , leading to \n",
      "  enhanced mutant precipitation by iron . \n",
      " our data suggest that hereditary \n",
      "  ferritinopathy pathogenesis is likely to result from a combination of \n",
      "  reduction in iron storage function and enhanced toxicity associated with \n",
      "  iron - induced ferritin aggregates . \n",
      "\n",
      "NO-PROMPT-SUMMARY:\tPhe167serfsx26 mutant ferritin is complex with iron binding sites to make it more stable and resist iron sulfides\n",
      "ZERO-SHOT-PROMPT:\tA mutant ferritin light chain polypeptide is a protrusion of the c terminus of the mt-ftl polypeptide gene resulting in iron-induced sedimentation in the spherical\n",
      "FEW-SHOT-PROMPT:\tStructural properties of the mutant ftl polypeptide p.phe167serfsx26 are significantly altered by iron bridging. a novel c terminus of the ftl poly\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Draw three abstracts at random and compare the three prompting approaches\n",
    "abstracts = random.sample(dataset['abstract'], 3)\n",
    "\n",
    "for abstract in abstracts:\n",
    "    print(\"#\"*25)\n",
    "    print(f\"SOURCE_TEXT:\\n{abstract}\\n\")\n",
    "    # No prompt: summarize the raw abstract\n",
    "    no_prompt_input, no_prompt_output = generate_text(abstract)\n",
    "    print(f\"NO-PROMPT-SUMMARY:\\t{no_prompt_output}\")\n",
    "    # Zero-shot prompt\n",
    "    prompted_input, prompted_output = generate_w_prompt(abstract)\n",
    "    print(f\"ZERO-SHOT-PROMPT:\\t{prompted_output}\")\n",
    "    # Few-shot prompt built from scitldr_dataset\n",
    "    few_shot_input, few_shot_output = generate_w_few_shot_prompt(abstract, scitldr_dataset)\n",
    "    print(f\"FEW-SHOT-PROMPT:\\t{few_shot_output}\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a7ad210-3480-42b9-aedb-6cab154e287f",
   "metadata": {},
   "source": [
    "## 4. Conclusions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "de0e603f-13f9-4f88-bc82-c73570bc9973",
   "metadata": {},
   "source": [
    "In this notebook, we saw how to adapt a pre-trained LLM to summarize scientific text without any additional training by comparing unprompted, zero-shot, and few-shot inputs. However, for domain-specific language like these PubMed abstracts, you may see better results after fine-tuning."
   ]
  }
 ],
 "metadata": {
  "availableInstances": [
   {
    "_defaultOrder": 0,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.t3.medium",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 1,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.t3.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 2,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.t3.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 3,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.t3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 4,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 5,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 6,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 7,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 8,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 9,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 10,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 11,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 12,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5d.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 13,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5d.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 14,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5d.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 15,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5d.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 16,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5d.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 17,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5d.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 18,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5d.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 19,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 20,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": true,
    "memoryGiB": 0,
    "name": "ml.geospatial.interactive",
    "supportedImageNames": [
     "sagemaker-geospatial-v1-0"
    ],
    "vcpuNum": 0
   },
   {
    "_defaultOrder": 21,
    "_isFastLaunch": true,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.c5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 22,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.c5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 23,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.c5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 24,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.c5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 25,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 72,
    "name": "ml.c5.9xlarge",
    "vcpuNum": 36
   },
   {
    "_defaultOrder": 26,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 96,
    "name": "ml.c5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 27,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 144,
    "name": "ml.c5.18xlarge",
    "vcpuNum": 72
   },
   {
    "_defaultOrder": 28,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.c5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 29,
    "_isFastLaunch": true,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g4dn.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 30,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g4dn.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 31,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g4dn.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 32,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g4dn.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 33,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g4dn.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 34,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g4dn.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 35,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 61,
    "name": "ml.p3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 36,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 244,
    "name": "ml.p3.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 37,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 488,
    "name": "ml.p3.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 38,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.p3dn.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 39,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.r5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 40,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.r5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 41,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.r5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 42,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.r5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 43,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.r5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 44,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.r5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 45,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 512,
    "name": "ml.r5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 46,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.r5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 47,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 48,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 49,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 50,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 51,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 52,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 53,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.g5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 54,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.g5.48xlarge",
    "vcpuNum": 192
   },
   {
    "_defaultOrder": 55,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 56,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4de.24xlarge",
    "vcpuNum": 96
   }
  ],
  "instance_type": "ml.m5.2xlarge",
  "kernelspec": {
   "display_name": "Python 3 (Data Science 3.0)",
   "language": "python",
   "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/sagemaker-data-science-310-v1"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  },
  "vscode": {
   "interpreter": {
    "hash": "a8534c14445fc6cdc3039d8140510d6736e5b4960d89f445a45d8db6afd8452b"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}