{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hMBwJT0r38tj"
   },
   "source": [
    "# Lab 2: Train Hugging Face Transformers on Amazon SageMaker\n",
    "\n",
    "### Korean NLP Downstream task: Token Classification\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Amc52mb94Jzq"
   },
   "source": [
    "\n",
    "## Introduction\n",
    "---\n",
    "\n",
    "바로 이전 모듈에서 기존에 온프레미스에서 개발했던 환경과 동일한 환경으로 모델을 빌드하고 훈련했습니다. 하지만 아래와 같은 상황들에서도 기존 환경을 사용하는 것이 바람직할까요?\n",
    "\n",
    "- 온프레미스의 GPU는 총 1장으로 훈련 시간이 너무 오래 소요됨\n",
    "- 가용 서버 대수가 2대인데 10개의 딥러닝 모델을 동시에 훈련해야 함\n",
    "- 필요한 상황에만 GPU를 활용\n",
    "- 기타 등등\n",
    "\n",
    "Amazon SageMaker는 데이터 과학자들 및 머신 러닝 엔지니어들을 위한 완전 관리형 머신 러닝 서비스로 훈련 및 추론 수행 시 인프라 설정에 대한 추가 작업이 필요하지 않기에, 단일 GPU 기반의 딥러닝 훈련을 포함한 멀티 GPU 및 멀티 인스턴스 분산 훈련을 보다 쉽고 빠르게 수행할 수 있습니다. SageMaker는 다양한 유즈케이스들에 적합한 예제들을 지속적으로 업데이트하고 있으며, 한국어 세션 및 자료들도 제공되고 있습니다.\n",
    "\n",
    "### Notes\n",
    "\n",
    "이미 기본적인 Hugging Face 용법 및 자연어 처리에 익숙하신 분들은 앞 모듈을 생략하고 이 모듈부터 핸즈온을 시작하셔도 됩니다.\n",
    "이 노트북은 SageMaker 기본 API를 참조하므로, SageMaker Studio, SageMaker 노트북 인스턴스 또는 AWS CLI가 설정된 로컬 시스템에서 실행해야 합니다. SageMaker Studio 또는 SageMaker 노트북 인스턴스를 사용하는 경우 PyTorch 기반 커널을 선택하세요.\n",
    "훈련 job 수행 시 최소 `ml.g4dn.xlarge` 이상의 훈련 인스턴스가 필요하며, `ml.p3.8xlarge`나 `ml.p3.16xlarge` 인스턴스를 권장합니다. 만약 인스턴스 사용에 제한이 걸려 있다면 Request a service quota increase for SageMaker resources를 참조하여 인스턴스 제한을 해제해 주세요.\n",
    "\n",
    "### References\n",
    "- Hugging Face Tutorial: https://huggingface.co/docs/transformers/training\n",
    "- 네이버, 창원대가 함께하는 NLP Challenge GitHub: https://github.com/naver/nlp-challenge\n",
    "- 네이버, 창원대가 함께하는 NLP Challenge 리더보드 및 라이센스: http://air.changwon.ac.kr/?page_id=10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "## 1. Setup Environments\n",
    "---\n",
    "\n",
    "### Import modules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sagemaker role arn: arn:aws:iam::143656149352:role/service-role/AmazonSageMaker-ExecutionRole-20220317T150353\n",
      "sagemaker bucket: sagemaker-us-east-1-143656149352\n",
      "sagemaker session region: us-east-1\n"
     ]
    }
   ],
   "source": [
    "import boto3\n",
    "import sagemaker\n",
    "import sagemaker.huggingface\n",
    "\n",
    "sess = sagemaker.Session()\n",
    "# sagemaker session bucket -> used for uploading data, models and logs\n",
    "# sagemaker will automatically create this bucket if it not exists\n",
    "sagemaker_session_bucket=None\n",
    "if sagemaker_session_bucket is None and sess is not None:\n",
    "    # set to default bucket if a bucket name is not given\n",
    "    sagemaker_session_bucket = sess.default_bucket()\n",
    "\n",
    "role = sagemaker.get_execution_role()\n",
    "region = boto3.Session().region_name\n",
    "sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)\n",
    "\n",
    "print(f\"sagemaker role arn: {role}\")\n",
    "print(f\"sagemaker bucket: {sess.default_bucket()}\")\n",
    "print(f\"sagemaker session region: {sess.boto_region_name}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import sys\n",
    "import logging\n",
    "import argparse\n",
    "import torch\n",
    "from torch import nn\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from tqdm import tqdm\n",
    "from sklearn.model_selection import train_test_split\n",
    "from transformers import BertTokenizer, BertTokenizerFast, BertConfig, BertForTokenClassification\n",
    "\n",
    "logging.basicConfig(\n",
    "    level=logging.INFO, \n",
    "    format='[{%(filename)s:%(lineno)d} %(levelname)s - %(message)s',\n",
    "    handlers=[\n",
    "        logging.StreamHandler(sys.stdout)\n",
    "    ]\n",
    ")\n",
    "logger = logging.getLogger(__name__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load Feature set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[{2817287866.py:9} INFO - Loaded train_dataset length is: 52416\n",
      "[{2817287866.py:10} INFO - Loaded test_dataset length is: 13104\n"
     ]
    }
   ],
   "source": [
    "train_dir = 'ner_train'\n",
    "valid_dir = 'ner_valid'\n",
    "train_dict = torch.load(os.path.join(train_dir, 'train_features.pt'))\n",
    "valid_dict = torch.load(os.path.join(valid_dir, 'valid_features.pt'))\n",
    "\n",
    "train_ids, train_attention_masks, train_labels = train_dict['input_ids'], train_dict['attention_mask'], train_dict['labels']     \n",
    "valid_ids, valid_attention_masks, valid_labels = valid_dict['input_ids'], valid_dict['attention_mask'], valid_dict['labels']   \n",
    "        \n",
    "logger.info(f\"Loaded train_dataset length is: {len(train_dict['input_ids'])}\")\n",
    "logger.info(f\"Loaded test_dataset length is: {len(valid_dict['input_ids'])}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# num_debug_samples = 500\n",
    "# train_ids = train_ids[:num_debug_samples, :]\n",
    "# train_attention_masks = train_attention_masks[:num_debug_samples, :]\n",
    "# train_labels = train_labels[:num_debug_samples, :]\n",
    "\n",
    "# valid_ids = valid_ids[:num_debug_samples, :]\n",
    "# valid_attention_masks = valid_attention_masks[:num_debug_samples, :]\n",
    "# valid_labels = valid_labels[:num_debug_samples, :]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# train_dict = {'input_ids': train_ids, 'attention_mask': train_attention_masks, 'labels': train_labels}\n",
    "# valid_dict = {'input_ids': valid_ids, 'attention_mask': valid_attention_masks, 'labels': valid_labels}\n",
    "\n",
    "# torch.save(train_dict, os.path.join(train_dir, 'train_features.pt'))\n",
    "# torch.save(valid_dict, os.path.join(valid_dir, 'valid_features.pt'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Uploading data to Amazon S3 Bucket\n",
    "\n",
    "SageMaker 훈련을 위해 데이터셋을 S3로 업로드합니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "bucket = sess.default_bucket()\n",
    "\n",
    "# s3 key prefix for the data\n",
    "s3_prefix = 'samples/datasets/naver-changwon-ner'\n",
    "\n",
    "# save train_dataset to s3\n",
    "train_input_path = f's3://{bucket}/{s3_prefix}/train'\n",
    "valid_input_path = f's3://{bucket}/{s3_prefix}/valid'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 25.1 ms, sys: 29.7 ms, total: 54.8 ms\n",
      "Wall time: 3.07 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "!aws s3 cp {train_dir} {train_input_path} --recursive --quiet\n",
    "!aws s3 cp {valid_dir} {valid_input_path} --recursive --quiet"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br>\n",
    "\n",
    "## 2. Training with Native Hugging Face (PyTorch Framework)\n",
    "\n",
    "---\n",
    "\n",
    "### Overview and Training Script\n",
    "\n",
    "SageMaker에 대한 대표적인 오해가 여전히 많은 분들이 SageMaker 훈련을 위해 소스 코드를 전면적으로 수정해야 한다고 생각합니다. 하지만, 실제로는 별도의 소스 코드 수정 없이 기존 여러분이 사용했던 파이썬 스크립트에 SageMaker 훈련에 필요한 SageMaker 전용 환경 변수들만 추가하면 됩니다. \n",
    "\n",
    "SageMaker 훈련은 훈련 작업을 호출할 때, 1) 훈련 EC2 인스턴스 프로비저닝 - 2) 컨테이너 구동을 위한 도커 이미지 및 훈련 데이터 다운로드 - 3) 컨테이너 구동 - 4) 컨테이너 환경에서 훈련 수행 - 5) 컨테이너 환경에서 S3의 특정 버킷에 저장 - 6) 훈련 인스턴스 종료로 구성됩니다. 따라서, 훈련 수행 로직은 아래 예시와 같이 기존 개발 환경과 동일합니다.\n",
    "\n",
    "```python\n",
    "/opt/conda/bin/python train.py --epochs 5 --train_batch_size 32 ...\n",
    "```\n",
    "\n",
    "이 과정에서 컨테이너 환경에 필요한 환경 변수(예: 모델 경로, 훈련 데이터 경로) 들은 사전에 지정되어 있으며, 이 환경 변수들이 설정되어 있어야 훈련에 필요한 파일들의 경로를 인식할 수 있습니다. 대표적인 환경 변수들에 대한 자세한 내용은 https://github.com/aws/sagemaker-containers#important-environment-variables 을 참조하세요."
   ]
  },
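  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of the idea above, a training script can pick up these paths through `argparse` defaults backed by environment variables. The `SM_*` names are the variables SageMaker sets inside the container; the parser itself, its default values, and the local fallback paths are assumptions for illustration, since the actual `scripts/train.py` is not shown in this notebook.\n",
    "\n",
    "```python\n",
    "import argparse\n",
    "import os\n",
    "\n",
    "\n",
    "def parse_args(argv=None):\n",
    "    # Hypothetical parser; the real scripts/train.py may differ.\n",
    "    parser = argparse.ArgumentParser()\n",
    "\n",
    "    # Hyperparameters arrive as ordinary command-line arguments.\n",
    "    parser.add_argument('--epochs', type=int, default=3)\n",
    "    parser.add_argument('--train_batch_size', type=int, default=32)\n",
    "    parser.add_argument('--learning_rate', type=float, default=5e-5)\n",
    "\n",
    "    # SageMaker sets these variables inside the container. The fallbacks are\n",
    "    # assumed local paths so the same script also runs outside SageMaker.\n",
    "    parser.add_argument('--model_dir', default=os.environ.get('SM_MODEL_DIR', './model'))\n",
    "    parser.add_argument('--train_dir', default=os.environ.get('SM_CHANNEL_TRAIN', './ner_train'))\n",
    "    parser.add_argument('--valid_dir', default=os.environ.get('SM_CHANNEL_VALID', './ner_valid'))\n",
    "    parser.add_argument('--n_gpus', type=int, default=int(os.environ.get('SM_NUM_GPUS', 0)))\n",
    "    return parser.parse_args(argv)\n",
    "\n",
    "\n",
    "# Pass an explicit list so the example also works inside a notebook kernel.\n",
    "args = parse_args(['--epochs', '5'])\n",
    "print(args.epochs, args.train_dir)\n",
    "```\n",
    "\n",
    "With this pattern the script needs no SageMaker-specific branches: the same command line works locally and in the training container."
   ]
  },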
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[{1923326789.py:12} INFO - learning_rate: 0.0002\n"
     ]
    }
   ],
   "source": [
    "from sagemaker.huggingface import HuggingFace\n",
    "import time\n",
    "instance_type = 'ml.p3.8xlarge'\n",
    "num_gpus = 4\n",
    "instance_count = 1\n",
    "batch_size_native = 32\n",
    "learning_rate_native = float('5e-5')\n",
    "batch_size = batch_size_native\n",
    "model_id = 'bert-base-multilingual-cased' \n",
    "tokenizer_id = model_id\n",
    "learning_rate = learning_rate_native / batch_size_native * batch_size * num_gpus * instance_count\n",
    "logger.info(f'learning_rate: {learning_rate}')\n",
    "\n",
    "# hyperparameters, which are passed into the training job\n",
    "hyperparameters = {\n",
    "    'n_gpus': num_gpus,                 # number of GPUs per instance\n",
    "    'epochs': 3,                        # number of training epochs\n",
    "    'seed': 42,                         # random seed\n",
    "    'train_batch_size': batch_size,     # batch size for training\n",
    "    'eval_batch_size': batch_size*2,    # batch size for evaluation\n",
    "    'learning_rate': learning_rate,     # learning rate used during training\n",
    "    'tokenizer_id': tokenizer_id,       # pre-trained tokenizer\n",
    "    'model_id': model_id                # pre-trained model\n",
    "}"
   ]
  },
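  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The learning rate in the cell above is scaled linearly with the number of GPUs and instances. The sketch below restates that arithmetic as a small function, assuming `train_batch_size` is the per-GPU batch size; the function name and its parameters are hypothetical and not part of the SageMaker SDK.\n",
    "\n",
    "```python\n",
    "def scale_learning_rate(base_lr, base_batch_size, per_gpu_batch_size, num_gpus, instance_count=1):\n",
    "    # Linear scaling: the learning rate grows in proportion to the global batch size.\n",
    "    global_batch_size = per_gpu_batch_size * num_gpus * instance_count\n",
    "    return base_lr / base_batch_size * global_batch_size\n",
    "\n",
    "\n",
    "# Same inputs as the cell above: base rate 5e-5, batch size 32, 4 GPUs on 1 instance.\n",
    "lr = scale_learning_rate(5e-5, 32, 32, num_gpus=4, instance_count=1)\n",
    "print(f'{lr:.6f}')\n",
    "```\n",
    "\n",
    "Linear scaling is a common heuristic rather than a guarantee, so it is worth checking the result against validation metrics when the GPU count changes."
   ]
  },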
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[{666484959.py:23} INFO - training job name: kornlp-ner-training-2022-07-05-22-46-48\n"
     ]
    }
   ],
   "source": [
    "# define Training Job Name \n",
    "job_name = f'kornlp-ner-training-{time.strftime(\"%Y-%m-%d-%H-%M-%S\", time.localtime())}'\n",
    "chkpt_s3_path = f's3://{bucket}/{s3_prefix}/native/checkpoints'\n",
    "\n",
    "# create the Estimator\n",
    "sm_estimator = HuggingFace(\n",
    "    entry_point           = 'train.py',         # fine-tuning script used in training jon\n",
    "    source_dir            = './scripts',        # directory where fine-tuning script is stored\n",
    "    instance_type         = instance_type,      # instances type used for the training job\n",
    "    instance_count        = instance_count,     # the number of instances used for training\n",
    "    base_job_name         = job_name,           # the name of the training job\n",
    "    role                  = role,               # IAM role used in training job to access AWS ressources, e.g. S3\n",
    "    transformers_version  = '4.17.0',           # the transformers version used in the training job\n",
    "    pytorch_version       = '1.10.2',           # the pytorch_version version used in the training job\n",
    "    py_version            = 'py38',             # the python version used in the training job\n",
    "    hyperparameters       = hyperparameters,    # the hyperparameter used for running the training job\n",
    "    disable_profiler      = True,\n",
    "    debugger_hook_config  = False, \n",
    "    checkpoint_s3_uri     = chkpt_s3_path,\n",
    "    checkpoint_local_path ='/opt/ml/checkpoints',  \n",
    ")\n",
    "\n",
    "logger.info(f'training job name: {job_name}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`fit()` 메소드를 호출하여 훈련 job을 시작합니다. `fit()` 메소드의 인자값 중 `wait=True`로 설정할 경우에는 동기(synchronous) 방식으로 동직하게 되며, `wait=False`일 경우에는 비동기(aynchronous) 방식으로 동작하여 여러 개의 훈련 job을 동시에 실행할 수 있습니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[{session.py:590} INFO - Creating training-job with name: kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543\n"
     ]
    }
   ],
   "source": [
    "# define a data input dictonary with our uploaded s3 uris\n",
    "data = {\n",
    "    'train': train_input_path,\n",
    "    'valid': valid_input_path\n",
    "}\n",
    "\n",
    "# starting the train job with our uploaded datasets as input\n",
    "sm_estimator.fit(data, wait=False)\n",
    "train_job_name = sm_estimator.latest_training_job.job_name"
   ]
  },
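  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because `fit(..., wait=False)` returns immediately, one way to find out when the job has finished is to poll its status. The sketch below assumes a boto3 SageMaker client (or any object exposing `describe_training_job`); the helper name and its parameters are hypothetical.\n",
    "\n",
    "```python\n",
    "import time\n",
    "\n",
    "\n",
    "def wait_for_training_job(sm_client, job_name, poll_seconds=30, max_polls=120):\n",
    "    # Poll DescribeTrainingJob until the job reaches a terminal status.\n",
    "    terminal = {'Completed', 'Failed', 'Stopped'}\n",
    "    for _ in range(max_polls):\n",
    "        response = sm_client.describe_training_job(TrainingJobName=job_name)\n",
    "        status = response['TrainingJobStatus']\n",
    "        if status in terminal:\n",
    "            return status\n",
    "        time.sleep(poll_seconds)\n",
    "    raise TimeoutError(f'{job_name} did not finish after {max_polls} polls')\n",
    "\n",
    "\n",
    "# Usage in this notebook (needs AWS credentials, so it is left commented out):\n",
    "# import boto3\n",
    "# status = wait_for_training_job(boto3.client('sagemaker'), train_job_name)\n",
    "```\n",
    "\n",
    "Passing the client in as an argument keeps the helper easy to exercise with a stub object when no AWS account is available."
   ]
  },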
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### View Training Job\n",
    "SageMaker 콘솔 창에서 훈련 내역을 직접 확인할 수도 있지만, 아래 코드 셀에서 생성되는 링크를 클릭하면 더 편리하게 훈련 내역을 확인할 수 있습니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<b> [Hugging Face Training - Native] Review <a target=\"blank\" href=\"https://console.aws.amazon.com/sagemaker/home?region=us-east-1#/jobs/kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543\">Training Job</a></b>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<b> [Hugging Face Training - Native] Review <a target=\"blank\" href=\"https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logStream:group=/aws/sagemaker/TrainingJobs;prefix=kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543;streamFilter=typeLogStreamPrefix\">CloudWatch Logs</a></b>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from IPython.core.display import display, HTML\n",
    "\n",
    "def make_console_link(region, train_job_name, train_task='[Training]'):\n",
    "    train_job_link = f'<b> {train_task} Review <a target=\"blank\" href=\"https://console.aws.amazon.com/sagemaker/home?region={region}#/jobs/{train_job_name}\">Training Job</a></b>'   \n",
    "    cloudwatch_link = f'<b> {train_task} Review <a target=\"blank\" href=\"https://console.aws.amazon.com/cloudwatch/home?region={region}#logStream:group=/aws/sagemaker/TrainingJobs;prefix={train_job_name};streamFilter=typeLogStreamPrefix\">CloudWatch Logs</a></b>'\n",
    "    return train_job_link, cloudwatch_link  \n",
    "        \n",
    "train_job_link, cloudwatch_link = make_console_link(region, train_job_name, '[Hugging Face Training - Native]')\n",
    "\n",
    "display(HTML(train_job_link))\n",
    "display(HTML(cloudwatch_link))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Wait for the training jobs to complete\n",
    "훈련이 완료될 때까지 기다립니다. `estimator.fit(...)`에서 `wait=False`로 설정한 경우, 아래 코드 셀의 주석을 해제 후 실행하여 동기 방식으로 변경할 수도 있습니다. 훈련 완료까지는 약 15-20분의 시간이 소요됩니다."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2022-07-05 22:46:50 Starting - Starting the training job......\n",
      "2022-07-05 22:47:36 Starting - Preparing the instances for training......\n",
      "2022-07-05 22:48:52 Downloading - Downloading input data.........\n",
      "2022-07-05 22:50:18 Training - Downloading the training image..................\n",
      "2022-07-05 22:53:34 Training - Training image download completed. Training in progress..\u001b[34mbash: cannot set terminal process group (-1): Inappropriate ioctl for device\u001b[0m\n",
      "\u001b[34mbash: no job control in this shell\u001b[0m\n",
      "\u001b[34m/opt/conda/lib/python3.8/site-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated\n",
      "  \"class\": algorithms.Blowfish,\u001b[0m\n",
      "\u001b[34m2022-07-05 22:53:37,016 sagemaker-training-toolkit INFO     Imported framework sagemaker_pytorch_container.training\u001b[0m\n",
      "\u001b[34m2022-07-05 22:53:37,055 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.\u001b[0m\n",
      "\u001b[34m2022-07-05 22:53:37,061 sagemaker_pytorch_container.training INFO     Invoking user training script.\u001b[0m\n",
      "\u001b[34m2022-07-05 22:53:37,548 sagemaker-training-toolkit INFO     Installing dependencies from requirements.txt:\u001b[0m\n",
      "\u001b[34m/opt/conda/bin/python3.8 -m pip install -r requirements.txt\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: transformers in /opt/conda/lib/python3.8/site-packages (from -r requirements.txt (line 1)) (4.17.0)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: tokenizers!=0.11.3,>=0.11.1 in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (0.12.1)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: sacremoses in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (0.0.53)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (4.63.0)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: pyyaml>=5.1 in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (5.4.1)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: requests in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (2.27.1)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: filelock in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (3.7.0)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (1.22.2)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (21.3)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (2022.4.24)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: huggingface-hub<1.0,>=0.1.0 in /opt/conda/lib/python3.8/site-packages (from transformers->-r requirements.txt (line 1)) (0.6.0)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/lib/python3.8/site-packages (from huggingface-hub<1.0,>=0.1.0->transformers->-r requirements.txt (line 1)) (4.2.0)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /opt/conda/lib/python3.8/site-packages (from packaging>=20.0->transformers->-r requirements.txt (line 1)) (3.0.9)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: charset-normalizer~=2.0.0 in /opt/conda/lib/python3.8/site-packages (from requests->transformers->-r requirements.txt (line 1)) (2.0.12)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.8/site-packages (from requests->transformers->-r requirements.txt (line 1)) (3.3)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.8/site-packages (from requests->transformers->-r requirements.txt (line 1)) (2021.10.8)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/conda/lib/python3.8/site-packages (from requests->transformers->-r requirements.txt (line 1)) (1.26.9)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: six in /opt/conda/lib/python3.8/site-packages (from sacremoses->transformers->-r requirements.txt (line 1)) (1.16.0)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: joblib in /opt/conda/lib/python3.8/site-packages (from sacremoses->transformers->-r requirements.txt (line 1)) (1.1.0)\u001b[0m\n",
      "\u001b[34mRequirement already satisfied: click in /opt/conda/lib/python3.8/site-packages (from sacremoses->transformers->-r requirements.txt (line 1)) (8.1.3)\u001b[0m\n",
      "\u001b[34mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\n",
      "\u001b[34mWARNING: There was an error checking the latest version of pip.\u001b[0m\n",
      "\u001b[34m2022-07-05 22:53:40,442 sagemaker-training-toolkit INFO     Invoking user script\u001b[0m\n",
      "\u001b[34mTraining Env:\u001b[0m\n",
      "\u001b[34m{\n",
      "    \"additional_framework_parameters\": {},\n",
      "    \"channel_input_dirs\": {\n",
      "        \"train\": \"/opt/ml/input/data/train\",\n",
      "        \"valid\": \"/opt/ml/input/data/valid\"\n",
      "    },\n",
      "    \"current_host\": \"algo-1\",\n",
      "    \"framework_module\": \"sagemaker_pytorch_container.training:main\",\n",
      "    \"hosts\": [\n",
      "        \"algo-1\"\n",
      "    ],\n",
      "    \"hyperparameters\": {\n",
      "        \"epochs\": 3,\n",
      "        \"eval_batch_size\": 64,\n",
      "        \"learning_rate\": 0.0002,\n",
      "        \"model_id\": \"bert-base-multilingual-cased\",\n",
      "        \"n_gpus\": 4,\n",
      "        \"seed\": 42,\n",
      "        \"tokenizer_id\": \"bert-base-multilingual-cased\",\n",
      "        \"train_batch_size\": 32\n",
      "    },\n",
      "    \"input_config_dir\": \"/opt/ml/input/config\",\n",
      "    \"input_data_config\": {\n",
      "        \"train\": {\n",
      "            \"TrainingInputMode\": \"File\",\n",
      "            \"S3DistributionType\": \"FullyReplicated\",\n",
      "            \"RecordWrapperType\": \"None\"\n",
      "        },\n",
      "        \"valid\": {\n",
      "            \"TrainingInputMode\": \"File\",\n",
      "            \"S3DistributionType\": \"FullyReplicated\",\n",
      "            \"RecordWrapperType\": \"None\"\n",
      "        }\n",
      "    },\n",
      "    \"input_dir\": \"/opt/ml/input\",\n",
      "    \"is_master\": true,\n",
      "    \"job_name\": \"kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543\",\n",
      "    \"log_level\": 20,\n",
      "    \"master_hostname\": \"algo-1\",\n",
      "    \"model_dir\": \"/opt/ml/model\",\n",
      "    \"module_dir\": \"s3://sagemaker-us-east-1-143656149352/kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543/source/sourcedir.tar.gz\",\n",
      "    \"module_name\": \"train\",\n",
      "    \"network_interface_name\": \"eth0\",\n",
      "    \"num_cpus\": 32,\n",
      "    \"num_gpus\": 4,\n",
      "    \"output_data_dir\": \"/opt/ml/output/data\",\n",
      "    \"output_dir\": \"/opt/ml/output\",\n",
      "    \"output_intermediate_dir\": \"/opt/ml/output/intermediate\",\n",
      "    \"resource_config\": {\n",
      "        \"current_host\": \"algo-1\",\n",
      "        \"current_instance_type\": \"ml.p3.8xlarge\",\n",
      "        \"current_group_name\": \"homogeneousCluster\",\n",
      "        \"hosts\": [\n",
      "            \"algo-1\"\n",
      "        ],\n",
      "        \"instance_groups\": [\n",
      "            {\n",
      "                \"instance_group_name\": \"homogeneousCluster\",\n",
      "                \"instance_type\": \"ml.p3.8xlarge\",\n",
      "                \"hosts\": [\n",
      "                    \"algo-1\"\n",
      "                ]\n",
      "            }\n",
      "        ],\n",
      "        \"network_interface_name\": \"eth0\"\n",
      "    },\n",
      "    \"user_entry_point\": \"train.py\"\u001b[0m\n",
      "\u001b[34m}\u001b[0m\n",
      "\u001b[34mEnvironment variables:\u001b[0m\n",
      "\u001b[34mSM_HOSTS=[\"algo-1\"]\u001b[0m\n",
      "\u001b[34mSM_NETWORK_INTERFACE_NAME=eth0\u001b[0m\n",
      "\u001b[34mSM_HPS={\"epochs\":3,\"eval_batch_size\":64,\"learning_rate\":0.0002,\"model_id\":\"bert-base-multilingual-cased\",\"n_gpus\":4,\"seed\":42,\"tokenizer_id\":\"bert-base-multilingual-cased\",\"train_batch_size\":32}\u001b[0m\n",
      "\u001b[34mSM_USER_ENTRY_POINT=train.py\u001b[0m\n",
      "\u001b[34mSM_FRAMEWORK_PARAMS={}\u001b[0m\n",
      "\u001b[34mSM_RESOURCE_CONFIG={\"current_group_name\":\"homogeneousCluster\",\"current_host\":\"algo-1\",\"current_instance_type\":\"ml.p3.8xlarge\",\"hosts\":[\"algo-1\"],\"instance_groups\":[{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.p3.8xlarge\"}],\"network_interface_name\":\"eth0\"}\u001b[0m\n",
      "\u001b[34mSM_INPUT_DATA_CONFIG={\"train\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"valid\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}}\u001b[0m\n",
      "\u001b[34mSM_OUTPUT_DATA_DIR=/opt/ml/output/data\u001b[0m\n",
      "\u001b[34mSM_CHANNELS=[\"train\",\"valid\"]\u001b[0m\n",
      "\u001b[34mSM_CURRENT_HOST=algo-1\u001b[0m\n",
      "\u001b[34mSM_MODULE_NAME=train\u001b[0m\n",
      "\u001b[34mSM_LOG_LEVEL=20\u001b[0m\n",
      "\u001b[34mSM_FRAMEWORK_MODULE=sagemaker_pytorch_container.training:main\u001b[0m\n",
      "\u001b[34mSM_INPUT_DIR=/opt/ml/input\u001b[0m\n",
      "\u001b[34mSM_INPUT_CONFIG_DIR=/opt/ml/input/config\u001b[0m\n",
      "\u001b[34mSM_OUTPUT_DIR=/opt/ml/output\u001b[0m\n",
      "\u001b[34mSM_NUM_CPUS=32\u001b[0m\n",
      "\u001b[34mSM_NUM_GPUS=4\u001b[0m\n",
      "\u001b[34mSM_MODEL_DIR=/opt/ml/model\u001b[0m\n",
      "\u001b[34mSM_MODULE_DIR=s3://sagemaker-us-east-1-143656149352/kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543/source/sourcedir.tar.gz\u001b[0m\n",
      "\u001b[34mSM_TRAINING_ENV={\"additional_framework_parameters\":{},\"channel_input_dirs\":{\"train\":\"/opt/ml/input/data/train\",\"valid\":\"/opt/ml/input/data/valid\"},\"current_host\":\"algo-1\",\"framework_module\":\"sagemaker_pytorch_container.training:main\",\"hosts\":[\"algo-1\"],\"hyperparameters\":{\"epochs\":3,\"eval_batch_size\":64,\"learning_rate\":0.0002,\"model_id\":\"bert-base-multilingual-cased\",\"n_gpus\":4,\"seed\":42,\"tokenizer_id\":\"bert-base-multilingual-cased\",\"train_batch_size\":32},\"input_config_dir\":\"/opt/ml/input/config\",\"input_data_config\":{\"train\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"valid\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}},\"input_dir\":\"/opt/ml/input\",\"is_master\":true,\"job_name\":\"kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543\",\"log_level\":20,\"master_hostname\":\"algo-1\",\"model_dir\":\"/opt/ml/model\",\"module_dir\":\"s3://sagemaker-us-east-1-143656149352/kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543/source/sourcedir.tar.gz\",\"module_name\":\"train\",\"network_interface_name\":\"eth0\",\"num_cpus\":32,\"num_gpus\":4,\"output_data_dir\":\"/opt/ml/output/data\",\"output_dir\":\"/opt/ml/output\",\"output_intermediate_dir\":\"/opt/ml/output/intermediate\",\"resource_config\":{\"current_group_name\":\"homogeneousCluster\",\"current_host\":\"algo-1\",\"current_instance_type\":\"ml.p3.8xlarge\",\"hosts\":[\"algo-1\"],\"instance_groups\":[{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.p3.8xlarge\"}],\"network_interface_name\":\"eth0\"},\"user_entry_point\":\"train.py\"}\u001b[0m\n",
      "\u001b[34mSM_USER_ARGS=[\"--epochs\",\"3\",\"--eval_batch_size\",\"64\",\"--learning_rate\",\"0.0002\",\"--model_id\",\"bert-base-multilingual-cased\",\"--n_gpus\",\"4\",\"--seed\",\"42\",\"--tokenizer_id\",\"bert-base-multilingual-cased\",\"--train_batch_size\",\"32\"]\u001b[0m\n",
      "\u001b[34mSM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate\u001b[0m\n",
      "\u001b[34mSM_CHANNEL_TRAIN=/opt/ml/input/data/train\u001b[0m\n",
      "\u001b[34mSM_CHANNEL_VALID=/opt/ml/input/data/valid\u001b[0m\n",
      "\u001b[34mSM_HP_EPOCHS=3\u001b[0m\n",
      "\u001b[34mSM_HP_EVAL_BATCH_SIZE=64\u001b[0m\n",
      "\u001b[34mSM_HP_LEARNING_RATE=0.0002\u001b[0m\n",
      "\u001b[34mSM_HP_MODEL_ID=bert-base-multilingual-cased\u001b[0m\n",
      "\u001b[34mSM_HP_N_GPUS=4\u001b[0m\n",
      "\u001b[34mSM_HP_SEED=42\u001b[0m\n",
      "\u001b[34mSM_HP_TOKENIZER_ID=bert-base-multilingual-cased\u001b[0m\n",
      "\u001b[34mSM_HP_TRAIN_BATCH_SIZE=32\u001b[0m\n",
      "\u001b[34mPYTHONPATH=/opt/ml/code:/opt/conda/bin:/opt/conda/lib/python38.zip:/opt/conda/lib/python3.8:/opt/conda/lib/python3.8/lib-dynload:/opt/conda/lib/python3.8/site-packages:/opt/conda/lib/python3.8/site-packages/smdebug-1.0.13b20220512-py3.8.egg:/opt/conda/lib/python3.8/site-packages/pyinstrument-3.4.2-py3.8.egg:/opt/conda/lib/python3.8/site-packages/pyinstrument_cext-0.2.4-py3.8-linux-x86_64.egg\u001b[0m\n",
      "\u001b[34mInvoking script with the following command:\u001b[0m\n",
      "\u001b[34m/opt/conda/bin/python3.8 train.py --epochs 3 --eval_batch_size 64 --learning_rate 0.0002 --model_id bert-base-multilingual-cased --n_gpus 4 --seed 42 --tokenizer_id bert-base-multilingual-cased --train_batch_size 32\u001b[0m\n",
      "\u001b[34m[{train.py:128} INFO - ***** Arguments *****\u001b[0m\n",
      "\u001b[34m[{train.py:129} INFO - epochs=3\u001b[0m\n",
      "\u001b[34mseed=42\u001b[0m\n",
      "\u001b[34mtrain_batch_size=32\u001b[0m\n",
      "\u001b[34meval_batch_size=64\u001b[0m\n",
      "\u001b[34mwarmup_steps=100\u001b[0m\n",
      "\u001b[34mlearning_rate=0.0002\u001b[0m\n",
      "\u001b[34mdisable_tqdm=False\u001b[0m\n",
      "\u001b[34mdebug=False\u001b[0m\n",
      "\u001b[34mtokenizer_id=bert-base-multilingual-cased\u001b[0m\n",
      "\u001b[34mmodel_id=bert-base-multilingual-cased\u001b[0m\n",
      "\u001b[34moutput_data_dir=/opt/ml/output/data\u001b[0m\n",
      "\u001b[34mmodel_dir=/opt/ml/model\u001b[0m\n",
      "\u001b[34mn_gpus=4\u001b[0m\n",
      "\u001b[34mtrain_dir=/opt/ml/input/data/train\u001b[0m\n",
      "\u001b[34mvalid_dir=/opt/ml/input/data/valid\u001b[0m\n",
      "\u001b[34mchkpt_dir=/opt/ml/checkpoints\u001b[0m\n",
      "\u001b[34m[{train.py:165} INFO - Loaded train_dataset length is: 52416\u001b[0m\n",
      "\u001b[34m[{train.py:166} INFO - Loaded test_dataset length is: 13104\u001b[0m\n",
      "\u001b[34mDownloading:   0%|          | 0.00/29.0 [00:00<?, ?B/s]\u001b[0m\n",
      "\u001b[34mDownloading: 100%|██████████| 29.0/29.0 [00:00<00:00, 38.7kB/s]\u001b[0m\n",
      "\u001b[34mDownloading:   0%|          | 0.00/972k [00:00<?, ?B/s]\u001b[0m\n",
      "\u001b[34mDownloading: 100%|██████████| 972k/972k [00:00<00:00, 57.0MB/s]\u001b[0m\n",
      "\u001b[34mDownloading:   0%|          | 0.00/1.87M [00:00<?, ?B/s]\u001b[0m\n",
      "\u001b[34mDownloading: 100%|██████████| 1.87M/1.87M [00:00<00:00, 55.5MB/s]\u001b[0m\n",
      "\u001b[34mDownloading:   0%|          | 0.00/625 [00:00<?, ?B/s]\u001b[0m\n",
      "\u001b[34mDownloading: 100%|██████████| 625/625 [00:00<00:00, 954kB/s]\u001b[0m\n",
      "\u001b[34m[{train.py:176} INFO - num_train samples=52416, num_valid samples=13104\u001b[0m\n",
      "\u001b[34mDownloading:   0%|          | 0.00/681M [00:00<?, ?B/s]\u001b[0m\n",
      "\u001b[34mDownloading: 100%|██████████| 681M/681M [00:09<00:00, 73.1MB/s]\u001b[0m\n",
      "\u001b[34mSome weights of the model checkpoint at bert-base-multilingual-cased were not used when initializing BertForTokenClassification: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight']\u001b[0m\n",
      "\u001b[34m- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\u001b[0m\n",
      "\u001b[34m- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\u001b[0m\n",
      "\u001b[34mSome weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-multilingual-cased and are newly initialized: ['classifier.bias', 'classifier.weight']\u001b[0m\n",
      "\u001b[34mYou should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\u001b[0m\n",
      "\u001b[34m[{train.py:221} INFO - ***** Continue Training *****\u001b[0m\n",
      "\u001b[34mLoading model from /opt/ml/checkpoints/checkpoint-1230).\u001b[0m\n",
      "\u001b[34mLoading model from /opt/ml/checkpoints/checkpoint-1230).\u001b[0m\n",
      "\u001b[34m/opt/conda/lib/python3.8/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n",
      "  warnings.warn(\u001b[0m\n",
      "\u001b[34m***** Running training *****\n",
      "  Num examples = 52416\u001b[0m\n",
      "\u001b[34m***** Running training *****\n",
      "  Num examples = 52416\n",
      "  Num Epochs = 3\u001b[0m\n",
      "\u001b[34mNum Epochs = 3\n",
      "  Instantaneous batch size per device = 32\n",
      "  Total train batch size (w. parallel, distributed & accumulation) = 128\u001b[0m\n",
      "\u001b[34mInstantaneous batch size per device = 32\n",
      "  Total train batch size (w. parallel, distributed & accumulation) = 128\u001b[0m\n",
      "\u001b[34mGradient Accumulation steps = 1\n",
      "  Total optimization steps = 1230\u001b[0m\n",
      "\u001b[34mGradient Accumulation steps = 1\n",
      "  Total optimization steps = 1230\u001b[0m\n",
      "\u001b[34mContinuing training from checkpoint, will skip to saved global_step\u001b[0m\n",
      "\u001b[34mContinuing training from checkpoint, will skip to saved global_step\n",
      "  Continuing training from epoch 3\u001b[0m\n",
      "\u001b[34mContinuing training from epoch 3\n",
      "  Continuing training from global step 1230\n",
      "  Will skip the first 3 epochs then the first 0 batches in the first epoch. If this takes a lot of time, you can add the `--ignore_data_skip` flag to your launch command, but you will resume the training on data already seen by your model.\u001b[0m\n",
      "\u001b[34mContinuing training from global step 1230\n",
      "  Will skip the first 3 epochs then the first 0 batches in the first epoch. If this takes a lot of time, you can add the `--ignore_data_skip` flag to your launch command, but you will resume the training on data already seen by your model.\u001b[0m\n",
      "\u001b[34m0it [00:00, ?it/s]\u001b[0m\n",
      "\u001b[34mSkipping the first batches: : 0it [00:00, ?it/s]\u001b[0m\n",
      "\u001b[34m0%|          | 0/1230 [00:00<?, ?it/s]#033[A\u001b[0m\n",
      "\u001b[34mTraining completed. Do not forget to share your model on huggingface.co/models =)\u001b[0m\n",
      "\u001b[34mTraining completed. Do not forget to share your model on huggingface.co/models =)\u001b[0m\n",
      "\u001b[34m#033[A\u001b[0m\n",
      "\u001b[34m{'train_runtime': 1.1273, 'train_samples_per_second': 139489.252, 'train_steps_per_second': 1091.09, 'train_loss': 0.0, 'epoch': 3.0}\u001b[0m\n",
      "\u001b[34mSkipping the first batches: : 0it [00:01, ?it/s]\u001b[0m\n",
      "\u001b[34m0%|          | 0/1230 [00:01<?, ?it/s]#033[A\u001b[0m\n",
      "\u001b[34m0%|          | 0/1230 [00:01<?, ?it/s]\u001b[0m\n",
      "\u001b[34m***** Running Prediction *****\u001b[0m\n",
      "\u001b[34m***** Running Prediction *****\u001b[0m\n",
      "\u001b[34mNum examples = 13104\n",
      "  Batch size = 256\u001b[0m\n",
      "\u001b[34mNum examples = 13104\n",
      "  Batch size = 256\u001b[0m\n",
      "\u001b[34malgo-1:32:32 [0] ofi_init:1157 NCCL WARN NET/OFI Only EFA provider is supported\u001b[0m\n",
      "\u001b[34malgo-1:32:32 [0] ofi_init:1208 NCCL WARN NET/OFI aws-ofi-nccl initialization failed\u001b[0m\n",
      "\u001b[34mNCCL version 2.10.3+cuda11.3\u001b[0m\n",
      "\u001b[34m/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.\n",
      "  warnings.warn('Was asked to gather along dimension 0, but all '\u001b[0m\n",
      "\u001b[34m0%|          | 0/52 [00:00<?, ?it/s]#033[A\u001b[0m\n",
      "\u001b[34m***** Evaluation results at /opt/ml/output/data *****\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_accuracy = 0.9013632683138849\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_f1 = 0.9013632683138849\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_loss = 0.34593191742897034\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_precision = 0.9013632683138849\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_recall = 0.9013632683138849\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_runtime = 20.3227\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_samples_per_second = 644.798\u001b[0m\n",
      "\u001b[34m[{train.py:236} INFO - test_steps_per_second = 2.559\u001b[0m\n",
      "\u001b[34mtokenizer config file saved in /opt/ml/model/tokenizer_config.json\u001b[0m\n",
      "\u001b[34mtokenizer config file saved in /opt/ml/model/tokenizer_config.json\u001b[0m\n",
      "\u001b[34mSpecial tokens file saved in /opt/ml/model/special_tokens_map.json\u001b[0m\n",
      "\u001b[34mSpecial tokens file saved in /opt/ml/model/special_tokens_map.json\u001b[0m\n",
      "\u001b[34mSaving model checkpoint to /opt/ml/model\u001b[0m\n",
      "\u001b[34mSaving model checkpoint to /opt/ml/model\u001b[0m\n",
      "\u001b[34mConfiguration saved in /opt/ml/model/config.json\u001b[0m\n",
      "\u001b[34mConfiguration saved in /opt/ml/model/config.json\u001b[0m\n",
      "\u001b[34mModel weights saved in /opt/ml/model/pytorch_model.bin\u001b[0m\n",
      "\u001b[34mModel weights saved in /opt/ml/model/pytorch_model.bin\u001b[0m\n",
      "\u001b[34mtokenizer config file saved in /opt/ml/model/tokenizer_config.json\u001b[0m\n",
      "\u001b[34mtokenizer config file saved in /opt/ml/model/tokenizer_config.json\u001b[0m\n",
      "\u001b[34mSpecial tokens file saved in /opt/ml/model/special_tokens_map.json\u001b[0m\n",
      "\u001b[34mSpecial tokens file saved in /opt/ml/model/special_tokens_map.json\u001b[0m\n",
      "\u001b[34m100%|██████████| 52/52 [00:11<00:00,  4.61it/s]\u001b[0m\n",
      "\u001b[34m2022-07-05 22:54:42,171 sagemaker-training-toolkit INFO     Reporting training SUCCESS\u001b[0m\n",
      "\n",
      "2022-07-05 22:54:49 Uploading - Uploading generated training model\n",
      "2022-07-05 22:56:25 Completed - Training job completed\n",
      "Training seconds: 453\n",
      "Billable seconds: 453\n"
     ]
    }
   ],
   "source": [
    "sess.logs_for_job(job_name=train_job_name, wait=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Copy model artifacts from S3 to local path\n",
    "\n",
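    "The trained model parameters are compressed into `model.tar.gz` and stored in S3. If you deploy the model directly on SageMaker, you do not need to run the code cells below. To quickly test the trained model in a local or development environment, or to serve it from another environment, download the model from S3 first."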
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "local_model_dir = './model'\n",
    "!rm -rf {local_model_dir}\n",
    "s3_model_path = sm_estimator.model_data\n",
    "os.makedirs(local_model_dir, exist_ok=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "download: s3://sagemaker-us-east-1-143656149352/kornlp-ner-training-2022-07-05-22-46-48-2022-07-05-22-46-49-543/output/model.tar.gz to model/model.tar.gz\n",
      "tokenizer_config.json\n",
      "tag2entity.json\n",
      "pytorch_model.bin\n",
      "training_args.bin\n",
      "vocab.txt\n",
      "tokenizer.json\n",
      "tag2id.json\n",
      "config.json\n",
      "id2tag.json\n",
      "special_tokens_map.json\n"
     ]
    }
   ],
   "source": [
    "%%bash -s \"$local_model_dir\" \"$s3_model_path\"\n",
    "aws s3 cp \"$2\" \"$1\"\n",
    "cd \"$1\"\n",
    "tar -xzvf model.tar.gz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Stored 's3_model_path' (str)\n",
      "Stored 'local_model_dir' (str)\n",
      "Stored 'model_id' (str)\n",
      "Stored 'tokenizer_id' (str)\n"
     ]
    }
   ],
   "source": [
    "%store s3_model_path local_model_dir model_id tokenizer_id"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "emi8JP4I4lxX"
   },
   "source": [
    "<br>\n",
    "\n",
    "## 3. Prediction\n",
    "---\n",
    "### Load fine-tuned model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Load the label mappings that were saved alongside the model.\n",
    "with open(os.path.join(local_model_dir, 'tag2id.json'), 'r') as f:\n",
    "    tag2id = json.load(f)\n",
    "\n",
    "with open(os.path.join(local_model_dir, 'id2tag.json'), 'r') as f:\n",
    "    id2tag = json.load(f)\n",
    "\n",
    "with open(os.path.join(local_model_dir, 'tag2entity.json'), 'r') as f:\n",
    "    tag2entity = json.load(f)\n",
    "\n",
    "# JSON object keys are always strings, so convert the label ids back to integers.\n",
    "tag2id = {k: int(v) for k, v in tag2id.items()}\n",
    "id2tag = {int(k): v for k, v in id2tag.items()}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: write the tokenizer to `tokenizer.json` in the local model directory.\n",
    "# from transformers import BertTokenizerFast\n",
    "# tokenizer = BertTokenizerFast.from_pretrained(tokenizer_id)\n",
    "# tokenizer.backend_tokenizer.save(f'{local_model_dir}/tokenizer.json')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Some weights of the model checkpoint at bert-base-multilingual-cased were not used when initializing BertForTokenClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias']\n",
      "- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
      "- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
      "Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-multilingual-cased and are newly initialized: ['classifier.bias', 'classifier.weight']\n",
      "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
     ]
    }
   ],
   "source": [
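    "from transformers import BertTokenizerFast, BertForTokenClassification\n",
    "\n",
    "# Load the tokenizer saved with the fine-tuned model.\n",
    "tokenizer = BertTokenizerFast.from_pretrained(local_model_dir)\n",
    "# Build the base architecture, then overwrite its weights with the fine-tuned checkpoint.\n",
    "model = BertForTokenClassification.from_pretrained(model_id, num_labels=len(id2tag))\n",
    "model.config.id2label = id2tag\n",
    "model.config.label2id = tag2id\n",
    "# map_location='cpu' keeps the load working on machines without a GPU.\n",
    "state_dict = torch.load(f'{local_model_dir}/pytorch_model.bin', map_location='cpu')\n",
    "model.load_state_dict(state_dict)\n",
    "model = model.eval()"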
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "mvYrdoG8t8UV",
    "outputId": "7e64b739-52c4-45c2-9bed-8e13437bf666"
   },
   "outputs": [],
   "source": [
    "def predict_fn_token(example, model):\n",
    "    # Tokenize and keep character offsets so each token can be mapped back to the input text.\n",
    "    inputs_with_offsets = tokenizer(example, return_offsets_mapping=True, return_tensors='pt')\n",
    "    inputs = inputs_with_offsets.copy()\n",
    "    inputs.pop('offset_mapping')  # offsets are not a model input\n",
    "    tokens = inputs_with_offsets.tokens()\n",
    "    offsets = inputs_with_offsets[\"offset_mapping\"]\n",
    "\n",
    "    device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "    inputs = inputs.to(device)\n",
    "    model = model.to(device)\n",
    "\n",
    "    # Inference only, so skip gradient tracking.\n",
    "    with torch.no_grad():\n",
    "        output = model(**inputs)\n",
    "    softmax_fn = nn.Softmax(dim=-1)\n",
    "    softmax_output = softmax_fn(output['logits'])\n",
    "\n",
    "    # For each token, take the most probable label and its probability.\n",
    "    prob, pred = torch.max(softmax_output, dim=-1)\n",
    "    prob_lst, pred_lst = prob.squeeze().tolist(), pred.squeeze().tolist()\n",
    "    pred_str_lst = [model.config.id2label[label_id] for label_id in pred_lst]\n",
    "    offsets_lst = offsets.squeeze().tolist()\n",
    "\n",
    "    df = pd.DataFrame(zip(tokens, pred_str_lst, prob_lst, pred_lst, offsets_lst),\n",
    "                      columns=['token', 'tag', 'score', 'label', 'offset'])\n",
    "\n",
    "    return df\n",
    "\n",
    "\n",
    "def predict_fn_word(example, model):\n",
    "    from transformers import pipeline\n",
    "\n",
    "    device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
    "    device_id = -1 if device.type == \"cpu\" else 0  # pipeline convention: -1 = CPU, 0 = first GPU\n",
    "\n",
    "    # Aggregate sub-word tokens into words, averaging their scores.\n",
    "    nlp = pipeline(\"ner\", model=model.to(device), device=device_id,\n",
    "                   tokenizer=tokenizer, aggregation_strategy='average')\n",
    "    results = nlp(example)\n",
    "\n",
    "    # Map an empty entity group to '-' (no entity).\n",
    "    tag2entity[''] = '-'\n",
    "\n",
    "    rows = [(r['word'], tag2entity[r['entity_group']], r['score'], r['start'], r['end'])\n",
    "            for r in results]\n",
    "    df = pd.DataFrame(rows, columns=['word', 'entity', 'score', 'start', 'end'])\n",
    "    return df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example\n",
    "Write your own sample sentences and try running inference on them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>token</th>\n",
       "      <th>tag</th>\n",
       "      <th>score</th>\n",
       "      <th>label</th>\n",
       "      <th>offset</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[CLS]</td>\n",
       "      <td>-</td>\n",
       "      <td>0.999694</td>\n",
       "      <td>2</td>\n",
       "      <td>[0, 0]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>아</td>\n",
       "      <td>ORG_B</td>\n",
       "      <td>0.971049</td>\n",
       "      <td>3</td>\n",
       "      <td>[0, 1]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>##마</td>\n",
       "      <td>ORG_B</td>\n",
       "      <td>0.971712</td>\n",
       "      <td>3</td>\n",
       "      <td>[1, 2]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>##존</td>\n",
       "      <td>ORG_B</td>\n",
       "      <td>0.969575</td>\n",
       "      <td>3</td>\n",
       "      <td>[2, 3]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Sage</td>\n",
       "      <td>TRM_B</td>\n",
       "      <td>0.558043</td>\n",
       "      <td>8</td>\n",
       "      <td>[4, 8]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>##M</td>\n",
       "      <td>TRM_B</td>\n",
       "      <td>0.546610</td>\n",
       "      <td>8</td>\n",
       "      <td>[8, 9]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>##aker</td>\n",
       "      <td>TRM_B</td>\n",
       "      <td>0.550489</td>\n",
       "      <td>8</td>\n",
       "      <td>[9, 13]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>##는</td>\n",
       "      <td>TRM_B</td>\n",
       "      <td>0.540549</td>\n",
       "      <td>8</td>\n",
       "      <td>[13, 14]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>머신</td>\n",
       "      <td>TRM_B</td>\n",
       "      <td>0.943233</td>\n",
       "      <td>8</td>\n",
       "      <td>[15, 17]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>러</td>\n",
       "      <td>TRM_I</td>\n",
       "      <td>0.683184</td>\n",
       "      <td>9</td>\n",
       "      <td>[18, 19]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>##닝</td>\n",
       "      <td>TRM_I</td>\n",
       "      <td>0.706626</td>\n",
       "      <td>9</td>\n",
       "      <td>[19, 20]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>통</td>\n",
       "      <td>-</td>\n",
       "      <td>0.543906</td>\n",
       "      <td>2</td>\n",
       "      <td>[21, 22]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>##합</td>\n",
       "      <td>-</td>\n",
       "      <td>0.524757</td>\n",
       "      <td>2</td>\n",
       "      <td>[22, 23]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>엔</td>\n",
       "      <td>TRM_I</td>\n",
       "      <td>0.515450</td>\n",
       "      <td>9</td>\n",
       "      <td>[24, 25]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>##드</td>\n",
       "      <td>TRM_I</td>\n",
       "      <td>0.526613</td>\n",
       "      <td>9</td>\n",
       "      <td>[25, 26]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>##투</td>\n",
       "      <td>TRM_I</td>\n",
       "      <td>0.550793</td>\n",
       "      <td>9</td>\n",
       "      <td>[26, 27]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>##엔</td>\n",
       "      <td>TRM_I</td>\n",
       "      <td>0.584466</td>\n",
       "      <td>9</td>\n",
       "      <td>[27, 28]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>##드</td>\n",
       "      <td>TRM_I</td>\n",
       "      <td>0.526351</td>\n",
       "      <td>9</td>\n",
       "      <td>[28, 29]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>관</td>\n",
       "      <td>-</td>\n",
       "      <td>0.525129</td>\n",
       "      <td>2</td>\n",
       "      <td>[30, 31]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>##리</td>\n",
       "      <td>-</td>\n",
       "      <td>0.548310</td>\n",
       "      <td>2</td>\n",
       "      <td>[31, 32]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     token    tag     score  label    offset\n",
       "0    [CLS]      -  0.999694      2    [0, 0]\n",
       "1        아  ORG_B  0.971049      3    [0, 1]\n",
       "2      ##마  ORG_B  0.971712      3    [1, 2]\n",
       "3      ##존  ORG_B  0.969575      3    [2, 3]\n",
       "4     Sage  TRM_B  0.558043      8    [4, 8]\n",
       "5      ##M  TRM_B  0.546610      8    [8, 9]\n",
       "6   ##aker  TRM_B  0.550489      8   [9, 13]\n",
       "7      ##는  TRM_B  0.540549      8  [13, 14]\n",
       "8       머신  TRM_B  0.943233      8  [15, 17]\n",
       "9        러  TRM_I  0.683184      9  [18, 19]\n",
       "10     ##닝  TRM_I  0.706626      9  [19, 20]\n",
       "11       통      -  0.543906      2  [21, 22]\n",
       "12     ##합      -  0.524757      2  [22, 23]\n",
       "13       엔  TRM_I  0.515450      9  [24, 25]\n",
       "14     ##드  TRM_I  0.526613      9  [25, 26]\n",
       "15     ##투  TRM_I  0.550793      9  [26, 27]\n",
       "16     ##엔  TRM_I  0.584466      9  [27, 28]\n",
       "17     ##드  TRM_I  0.526351      9  [28, 29]\n",
       "18       관      -  0.525129      2  [30, 31]\n",
       "19     ##리      -  0.548310      2  [31, 32]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example = '아마존 SageMaker는 머신 러닝 통합 엔드투엔드 관리형 서비스로 2017년 re:Invent 행사가 열린 라스베가스에서 발표되었다.'\n",
    "token_df = predict_fn_token(example, model)\n",
    "token_df.head(20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
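    "from transformers import pipeline\n",
    "\n",
    "# Build a CPU-only NER pipeline (device=-1) around the fine-tuned model.\n",
    "# aggregation_strategy='average' groups sub-word tokens into words and averages\n",
    "# the token scores within each word before choosing the word's label.\n",
    "nlp = pipeline(\"ner\", model=model.to('cpu'), device=-1,\n",
    "               tokenizer=tokenizer, aggregation_strategy='average')\n",
    "results = nlp(example)"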
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>word</th>\n",
       "      <th>entity</th>\n",
       "      <th>score</th>\n",
       "      <th>start</th>\n",
       "      <th>end</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>아마존</td>\n",
       "      <td>Organization</td>\n",
       "      <td>0.970779</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>SageMaker는 머신</td>\n",
       "      <td>Term</td>\n",
       "      <td>0.746078</td>\n",
       "      <td>4</td>\n",
       "      <td>17</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>러닝</td>\n",
       "      <td>Term</td>\n",
       "      <td>0.694905</td>\n",
       "      <td>18</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>통합</td>\n",
       "      <td>-</td>\n",
       "      <td>0.534332</td>\n",
       "      <td>21</td>\n",
       "      <td>23</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>엔드투엔드</td>\n",
       "      <td>Term</td>\n",
       "      <td>0.540735</td>\n",
       "      <td>24</td>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>관리형</td>\n",
       "      <td>-</td>\n",
       "      <td>0.531987</td>\n",
       "      <td>30</td>\n",
       "      <td>33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>서비스로</td>\n",
       "      <td>Term</td>\n",
       "      <td>0.483000</td>\n",
       "      <td>34</td>\n",
       "      <td>38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>2017년</td>\n",
       "      <td>Date</td>\n",
       "      <td>0.996233</td>\n",
       "      <td>39</td>\n",
       "      <td>44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>re :</td>\n",
       "      <td>Event</td>\n",
       "      <td>0.803023</td>\n",
       "      <td>45</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Invent 행사가</td>\n",
       "      <td>Event</td>\n",
       "      <td>0.731452</td>\n",
       "      <td>48</td>\n",
       "      <td>58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>열린</td>\n",
       "      <td>-</td>\n",
       "      <td>0.999549</td>\n",
       "      <td>59</td>\n",
       "      <td>61</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>라스베가스에서</td>\n",
       "      <td>Location</td>\n",
       "      <td>0.994044</td>\n",
       "      <td>62</td>\n",
       "      <td>69</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>발표되었다.</td>\n",
       "      <td>-</td>\n",
       "      <td>0.999645</td>\n",
       "      <td>70</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             word        entity     score  start  end\n",
       "0             아마존  Organization  0.970779      0    3\n",
       "1   SageMaker는 머신          Term  0.746078      4   17\n",
       "2              러닝          Term  0.694905     18   20\n",
       "3              통합             -  0.534332     21   23\n",
       "4           엔드투엔드          Term  0.540735     24   29\n",
       "5             관리형             -  0.531987     30   33\n",
       "6            서비스로          Term  0.483000     34   38\n",
       "7           2017년          Date  0.996233     39   44\n",
       "8            re :         Event  0.803023     45   48\n",
       "9      Invent 행사가         Event  0.731452     48   58\n",
       "10             열린             -  0.999549     59   61\n",
       "11        라스베가스에서      Location  0.994044     62   69\n",
       "12         발표되었다.             -  0.999645     70   76"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "word_df = predict_fn_word(example, model)\n",
    "word_df"
   ]
  }
 ],
 "metadata": {
  "accelerator": "TPU",
  "colab": {
   "authorship_tag": "ABX9TyPT/32fR6YbrNgmG6aLi8U7",
   "include_colab_link": true,
   "machine_shape": "hm",
   "name": "5_(BERT_실습)한국어 개체명 인식.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "conda_pytorch_p38",
   "language": "python",
   "name": "conda_pytorch_p38"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "1ac7cea5aaba45af9eddeaaee02e1e5a": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "2ef88e8c35374ca69203a64d209745ea": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "32c340873ce247e88df66c73309eecdc": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "4a4ff12bb4604faf8c1cd79156713854": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": "initial"
     }
    },
    "51eafe68808a4ffbac05605381c2d5a3": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "598f160635264f138769ae94a127455c": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "Downloading: 100%",
      "description_tooltip": null,
      "layout": "IPY_MODEL_1ac7cea5aaba45af9eddeaaee02e1e5a",
      "max": 1961828,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_b61fb58de9be4c84b7767bf69e32c5d9",
      "value": 1961828
     }
    },
    "5b0843766d3f4ac785c7dba85254d605": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_ca07bf481be7473ab1b22babaa76c3da",
       "IPY_MODEL_ab60a81459a84ae19fa922aa4ce27e8a"
      ],
      "layout": "IPY_MODEL_6356e0609f9f49d5996ef7f4f77fbd2d"
     }
    },
    "6356e0609f9f49d5996ef7f4f77fbd2d": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "693887ccba30416586e2085b7e36118b": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_9169c98d7ee5423ba5b6eb3c4dbbeda7",
      "placeholder": "​",
      "style": "IPY_MODEL_cc353722b52045efb9009ef79c7d56b7",
      "value": " 1.96M/1.96M [00:00&lt;00:00, 6.43MB/s]"
     }
    },
    "6cb6badcbbd34359be9dad2c8af93098": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "88f28f34de9e41cc948c7aebb4035589": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "8d879e2bbea04536aaa1ee5d356bb7c3": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_2ef88e8c35374ca69203a64d209745ea",
      "placeholder": "​",
      "style": "IPY_MODEL_32c340873ce247e88df66c73309eecdc",
      "value": " 29.0/29.0 [00:00&lt;00:00, 50.2B/s]"
     }
    },
    "9169c98d7ee5423ba5b6eb3c4dbbeda7": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "93234e1bd6444d819b130d83402d2d7b": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_e6e0a2274e7f420f91dc97a143315da0",
       "IPY_MODEL_8d879e2bbea04536aaa1ee5d356bb7c3"
      ],
      "layout": "IPY_MODEL_ad295e710380441588473f810a9210d7"
     }
    },
    "a0dbf235f20c497186d319b5b1558dd9": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": "initial"
     }
    },
    "ab60a81459a84ae19fa922aa4ce27e8a": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_51eafe68808a4ffbac05605381c2d5a3",
      "placeholder": "​",
      "style": "IPY_MODEL_6cb6badcbbd34359be9dad2c8af93098",
      "value": " 996k/996k [00:00&lt;00:00, 1.67MB/s]"
     }
    },
    "ad295e710380441588473f810a9210d7": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "b61fb58de9be4c84b7767bf69e32c5d9": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": "initial"
     }
    },
    "ca07bf481be7473ab1b22babaa76c3da": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "Downloading: 100%",
      "description_tooltip": null,
      "layout": "IPY_MODEL_88f28f34de9e41cc948c7aebb4035589",
      "max": 995526,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_4a4ff12bb4604faf8c1cd79156713854",
      "value": 995526
     }
    },
    "cc353722b52045efb9009ef79c7d56b7": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "e4dcb08aab3748b18a10d0f5daaf3554": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_598f160635264f138769ae94a127455c",
       "IPY_MODEL_693887ccba30416586e2085b7e36118b"
      ],
      "layout": "IPY_MODEL_f1ed2fc28a3e499fa784d5aa1777a77b"
     }
    },
    "e6e0a2274e7f420f91dc97a143315da0": {
     "model_module": "@jupyter-widgets/controls",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "Downloading: 100%",
      "description_tooltip": null,
      "layout": "IPY_MODEL_ed834b7f997141479ab90216655e230a",
      "max": 29,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_a0dbf235f20c497186d319b5b1558dd9",
      "value": 29
     }
    },
    "ed834b7f997141479ab90216655e230a": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "f1ed2fc28a3e499fa784d5aa1777a77b": {
     "model_module": "@jupyter-widgets/base",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}