{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "cefcedcc-b81e-4fb7-8147-886102f5c0e5",
   "metadata": {},
   "source": [
    "# Lab 3: Chatbot and Semantic Search Applications\n",
    "\n",
    "---\n",
    "\n",
    "본 모듈에서는 SageMaker 훈련 인스턴스에서 훈련한 모델을 로컬 공간으로 복사하여 추론을 수행합니다.\n",
    "\n",
    "SageMaker 훈련된 모델 파라메터는 model.tar.gz로 압축되어 S3에 저장되며, 압축 파일 내에는 훈련 인스턴스의 /opt/ml/model의 모든 파일/디렉토리들이 포함되어 있습니다. 따라서, 로컬/개발/온프레미스 환경에서 훈련된 모델을 자유롭게 테스트할 수 있습니다.\n",
    "\n",
    "만약 로컬/개발/온프레미스 환경이 아닌 SageMaker 엔드포인트 배포를 고려한다면, 아래 URL을 참조하세요.\n",
    "- SageMaker Hugging Face Inference Toolkit: https://github.com/aws/sagemaker-huggingface-inference-toolkit\n",
    "- Amazon SageMaker Deep Learning Inference Hands-on-Lab: https://github.com/aws-samples/sagemaker-inference-samples-kr\n",
    "\n",
    "\n",
    "<div class=\"alert alert-info\"><h4>Note</h4><p>\n",
    "CUDA out of memory 에러가 발생하면, 현재 실행 중인 노트북들을 모두 셧다운 후, 이 노트북을 재실행해 주시기 바랍니다. \n",
    "</p></div>\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a7d2b9d-4ece-4c8e-ad27-66ad482fafee",
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "%store -r"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24edc799-50b7-48a8-abbf-e2595429c392",
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    s3_model_path \n",
    "    local_model_dir \n",
    "    train_dir \n",
    "    print(\"[OK] You can proceed.\")\n",
    "except NameError:\n",
    "    print(\"+\"*60)\n",
    "    print(\"[ERROR] Please run previous notebooks and before you continue.\")\n",
    "    print(\"+\"*60)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f6eb74b3-a239-489f-bb17-694e58e4142a",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$local_model_dir\" \"$s3_model_path\"\n",
    "#aws s3 cp $2 $1\n",
    "cd $1\n",
    "tar -xzvf model.tar.gz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "30bdf5be-05da-4025-ae95-872e0b0b309d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import random\n",
    "import time\n",
    "from operator import itemgetter \n",
    "\n",
    "def get_faiss_index(emb, data, dim=768):\n",
    "    import faiss\n",
    "    n_gpus = torch.cuda.device_count()\n",
    "\n",
    "    if n_gpus == 0:\n",
    "        # Create the Inner Product Index\n",
    "        index = faiss.IndexFlatIP(dim)\n",
    "    else:\n",
    "        flat_config = []\n",
    "        res = [faiss.StandardGpuResources() for i in range(n_gpus)]\n",
    "        for i in range(n_gpus):\n",
    "            cfg = faiss.GpuIndexFlatConfig()\n",
    "            cfg.useFloat16 = False\n",
    "            cfg.device = i\n",
    "            flat_config.append(cfg)\n",
    "\n",
    "        index = faiss.GpuIndexFlatIP(res[0], dim, flat_config[0])\n",
    "\n",
    "    index = faiss.IndexIDMap(index)\n",
    "    index.add_with_ids(emb, np.array(range(0, len(data)))) \n",
    "    return index\n",
    "\n",
    "\n",
    "def search(model, query, data, index, k=5, random_select=False, verbose=True):\n",
    "    t = time.time()\n",
    "    query_vector = model.encode(query)\n",
    "    dists, top_k_inds = index.search(query_vector, k)\n",
    "    if verbose:\n",
    "        print('total time: {}'.format(time.time() - t))\n",
    "    results = [itemgetter(*ind)(data) for ind in top_k_inds] \n",
    "    \n",
    "    if random_select:\n",
    "        return [random.choice(r) for r in results]\n",
    "    else:\n",
    "        return results"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7950ec22-d8a7-4f5a-8b11-7390e2191226",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## 1. Load trained model\n",
    "---\n",
    "\n",
    "이전 모듈에서 훈련한 모델을 로드합니다. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2e8e39b8-0f78-48a3-b57c-7feef4741315",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import torch\n",
    "from sentence_transformers import SentenceTransformer\n",
    "\n",
    "dirs = ([name for name in os.listdir(local_model_dir) if os.path.isdir(os.path.join(local_model_dir, name))])\n",
    "model = SentenceTransformer(os.path.join(local_model_dir, dirs[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7fc140b5-86ff-4a61-98d5-3b74fec56b36",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## 2. Predictions\n",
    "---\n",
    "\n",
    "### Chatbot\n",
    "\n",
    "챗봇은 크게 두 가지 형태로 개발합니다. 1) 생성 모델을 사용하여 해당 질문에 대한 창의적인 답변을 생성하거나, 2) 수많은 질문-답변 리스트들 중 질문에 부합하는 질문 후보들을 추린 다음 해당 후보에 적합한 답변을 찾는 방식이죠.\n",
    "본 핸즈온은 2)의 방법으로 간단하게 챗봇 예시를 보여드립니다. 질문 텍스트를 입력으로 받으면, 해당 질문의 임베딩을 계산하여 질문 임베딩과 모든 질문 리스트의 임베딩을 비교하여 유사도가 가장 높은 질문 후보들을 찾고, 각 후보에 매칭되는 답변을 찾습니다.\n",
    "\n",
    "코사인 유사도를 직접 계산할 수도 있지만, 페이스북에서 개발한 Faiss 라이브러리 (https://github.com/facebookresearch/faiss) 를 사용하면 훨씬 빠른 속도로 계산할 수 있습니다. Faiss는 Product 양자화 알고리즘을 GPU로 더욱 빠르게 구현한 라이브러리입니다.\n",
    "\n",
    "References\n",
    "- Billion-scale similarity search with GPUs: https://arxiv.org/pdf/1702.08734.pdf\n",
    "- Product Quantizers for k-NN Tutorial Part 1: https://mccormickml.com/2017/10/13/product-quantizer-tutorial-part-1\n",
    "- Product Quantizers for k-NN Tutorial Part 1: http://mccormickml.com/2017/10/22/product-quantizer-tutorial-part-2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "081c6317-c469-4964-86fc-6278f4011a54",
   "metadata": {},
   "outputs": [],
   "source": [
    "chatbot_df = pd.read_csv(f'{train_dir}/chatbot-train.csv')\n",
    "chatbot_data = chatbot_df['A'].tolist()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c241160-8da6-48f6-a310-0968620a0758",
   "metadata": {},
   "source": [
    "#### Load embedding & Indexing the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c805f341-9ba9-4847-bbaf-d5ab08c482c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "chatbot_emb = np.load(f'{local_model_path}/chatbot_emb.npy')\n",
    "chatbot_index = get_faiss_index(chatbot_emb, chatbot_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f3547790-ea20-4589-9d37-dd23eb531de4",
   "metadata": {},
   "source": [
    "#### Inference\n",
    "샘플 질문들에 대한 추론을 수행합니다. 챗봇 답변은 계속 변경됩니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "60ba7b6a-c06c-4851-a9fc-997535dd1705",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = ['커피 라떼 마시고 싶어']\n",
    "search(model, query, chatbot_data, chatbot_index, k=10, random_select=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e20240bd-a90b-4f50-8834-9d70fee25bcb",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = ['너무 졸려', '놀고 싶어']\n",
    "search(model, query, chatbot_data, chatbot_index, random_select=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eae029d6-13b8-4e5e-9fca-4bc1be0f8503",
   "metadata": {},
   "source": [
    "### Semantic Search (News)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "916d3fc2-48a9-48dc-bcce-b02c7343ce63",
   "metadata": {},
   "outputs": [],
   "source": [
    "news_data = []\n",
    "f = open(f'{train_dir}/KCCq28_Korean_sentences_EUCKR_v2.txt', 'rt', encoding='cp949')\n",
    "lines = f.readlines()\n",
    "for line in lines:\n",
    "    line = line.strip()\n",
    "    news_data.append(line)\n",
    "f.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8ed838ff-9b5f-4de2-a686-4789be05707b",
   "metadata": {},
   "source": [
    "#### Load embedding & Indexing the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4cc51366-3ce8-4539-bc6c-16d8ef5eb151",
   "metadata": {},
   "outputs": [],
   "source": [
    "news_emb = np.load(f'{local_model_path}/news_emb.npy')\n",
    "news_index = get_faiss_index(news_emb, news_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d2d15996-5dbc-4e55-b378-69ff090ecd14",
   "metadata": {},
   "source": [
    "#### Inference\n",
    "샘플 질문들에 대한 추론을 수행합니다. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1db3c4c6-fb38-4ccf-baaf-b3d5806564ef",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = ['메이저리그 베이스볼']\n",
    "search(model, query, news_data, news_index, k=10, random_select=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e69e6a21-e67e-4088-8567-5ff46d84c931",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = ['아이스 라떼', '미세먼지']\n",
    "search(model, query, news_data, news_index, k=7, random_select=False)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_pytorch_p38",
   "language": "python",
   "name": "conda_pytorch_p38"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}