{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "hMBwJT0r38tj" }, "source": [ "# Lab 2: Train Hugging Face Transformers on Amazon SageMaker\n", "\n", "### Korean NLP Downstream task: Question Answering\n", "---" ] }, { "cell_type": "markdown", "metadata": { "id": "Amc52mb94Jzq" }, "source": [ "\n", "## Introduction\n", "---\n", "\n", "In the previous module, we built and trained a model in the same environment we had been using on premises. But is it still a good idea to keep using that environment in situations like the following?\n", "\n", "- The on-premises machine has a single GPU, so training takes too long\n", "- Only 2 servers are available, but 10 deep learning models must be trained at the same time\n", "- GPUs should be used only when they are actually needed\n", "- And so on\n", "\n", "Amazon SageMaker is a fully managed machine learning service for data scientists and machine learning engineers. Because training and inference require no additional infrastructure setup, you can run single-GPU deep learning training as well as multi-GPU and multi-instance distributed training more easily and quickly. SageMaker continuously updates examples for a wide range of use cases, and Korean-language sessions and materials are also available.\n", "\n", "### Notes\n", "\n", "If you are already familiar with basic Hugging Face usage and natural language processing, you can skip the previous module and start the hands-on from this one.\n", "This notebook uses the core SageMaker APIs, so run it in SageMaker Studio, on a SageMaker notebook instance, or on a local machine with the AWS CLI configured. If you use SageMaker Studio or a SageMaker notebook instance, select a PyTorch-based kernel.\n", "The training job requires at least an `ml.g4dn.xlarge` training instance; `ml.p3.8xlarge` or `ml.p3.16xlarge` is recommended. If your account has a quota on these instances, see Request a service quota increase for SageMaker resources to have the limit raised.\n", "\n", "### References\n", "- Hugging Face Tutorial: https://huggingface.co/docs/transformers/training\n", "- NLP Challenge by NAVER and Changwon National University, GitHub: https://github.com/naver/nlp-challenge\n", "- NLP Challenge by NAVER and Changwon National University, leaderboard and license: http://air.changwon.ac.kr/?page_id=10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 1. 
Setup Environments\n", "---\n", "\n", "### Import modules" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sagemaker role arn: arn:aws:iam::143656149352:role/service-role/AmazonSageMaker-ExecutionRole-20220317T150353\n", "sagemaker bucket: sagemaker-us-east-1-143656149352\n", "sagemaker session region: us-east-1\n" ] } ], "source": [ "import boto3\n", "import sagemaker\n", "import sagemaker.huggingface\n", "\n", "sess = sagemaker.Session()\n", "# sagemaker session bucket -> used for uploading data, models and logs\n", "# sagemaker will automatically create this bucket if it does not exist\n", "sagemaker_session_bucket=None\n", "if sagemaker_session_bucket is None and sess is not None:\n", "    # set to default bucket if a bucket name is not given\n", "    sagemaker_session_bucket = sess.default_bucket()\n", "\n", "role = sagemaker.get_execution_role()\n", "region = boto3.Session().region_name\n", "sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)\n", "\n", "print(f\"sagemaker role arn: {role}\")\n", "print(f\"sagemaker bucket: {sess.default_bucket()}\")\n", "print(f\"sagemaker session region: {sess.boto_region_name}\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import os\n", "import sys\n", "import logging\n", "import argparse\n", "import torch\n", "from torch import nn\n", "import numpy as np\n", "import pandas as pd\n", "from tqdm import tqdm\n", "from sklearn.model_selection import train_test_split\n", "from transformers import BertTokenizer, BertTokenizerFast, BertConfig, BertForTokenClassification\n", "\n", "logging.basicConfig(\n", "    level=logging.INFO, \n", "    format='[{%(filename)s:%(lineno)d} %(levelname)s - %(message)s',\n", "    handlers=[\n", "        logging.StreamHandler(sys.stdout)\n", "    ]\n", ")\n", "logger = logging.getLogger(__name__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load Feature 
set" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2022-07-06 04:57:21-- https://korquad.github.io/dataset/KorQuAD_v1.0_train.json\n", "Resolving korquad.github.io (korquad.github.io)... 185.199.108.153, 185.199.111.153, 185.199.109.153, ...\n", "Connecting to korquad.github.io (korquad.github.io)|185.199.108.153|:443... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 38527475 (37M) [application/json]\n", "Saving to: ‘qna_train/KorQuAD_v1.0_train.json’\n", "\n", "100%[======================================>] 38,527,475 --.-K/s in 0.1s \n", "\n", "2022-07-06 04:57:21 (269 MB/s) - ‘qna_train/KorQuAD_v1.0_train.json’ saved [38527475/38527475]\n", "\n", "--2022-07-06 04:57:21-- https://korquad.github.io/dataset/KorQuAD_v1.0_dev.json\n", "Resolving korquad.github.io (korquad.github.io)... 185.199.110.153, 185.199.109.153, 185.199.111.153, ...\n", "Connecting to korquad.github.io (korquad.github.io)|185.199.110.153|:443... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 3881058 (3.7M) [application/json]\n", "Saving to: ‘qna_valid/KorQuAD_v1.0_dev.json’\n", "\n", "100%[======================================>] 3,881,058 --.-K/s in 0.02s \n", "\n", "2022-07-06 04:57:21 (164 MB/s) - ‘qna_valid/KorQuAD_v1.0_dev.json’ saved [3881058/3881058]\n", "\n" ] } ], "source": [ "train_dir = 'qna_train'\n", "valid_dir = 'qna_valid'\n", "\n", "# make sure the target directories exist before wget writes into them\n", "os.makedirs(train_dir, exist_ok=True)\n", "os.makedirs(valid_dir, exist_ok=True)\n", "\n", "!wget https://korquad.github.io/dataset/KorQuAD_v1.0_train.json -O {train_dir}/KorQuAD_v1.0_train.json\n", "!wget https://korquad.github.io/dataset/KorQuAD_v1.0_dev.json -O {valid_dir}/KorQuAD_v1.0_dev.json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Uploading data to Amazon S3 Bucket\n", "\n", "Upload the dataset to S3 so that SageMaker can use it for training." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "bucket = sess.default_bucket()\n", "\n", "# s3 key prefix for the data\n", "s3_prefix = 'samples/datasets/korquad'\n", "\n", "# save train_dataset to s3\n", "train_input_path = f's3://{bucket}/{s3_prefix}/train'\n", "valid_input_path = f's3://{bucket}/{s3_prefix}/valid'" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 7.29 ms, sys: 42 ms, total: 49.3 ms\n", "Wall time: 1.74 s\n" ] } ], "source": [ "%%time\n", "!aws s3 cp {train_dir} {train_input_path} --recursive --quiet\n", "!aws s3 cp {valid_dir} {valid_input_path} --recursive --quiet" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "## 2. Training with Native Hugging Face (PyTorch Framework)\n", "\n", "---\n", "\n", "### Overview and Training Script\n", "\n", "A common misconception about SageMaker is that training on it requires rewriting your source code from the ground up. In practice, you can keep your existing Python script largely as it is and only add handling for the SageMaker-specific environment variables that the training job needs. \n", "\n", "When you launch a training job, SageMaker 1) provisions the training EC2 instance, 2) downloads the Docker image for the container and the training data, 3) starts the container, 4) runs training inside the container, 5) saves the training output from the container to a designated S3 bucket, and 6) terminates the training instance. The training logic itself is therefore invoked the same way as in your existing development environment, as in the example below.\n", "\n", "```bash\n", "/opt/conda/bin/python train.py --epochs 5 --train_batch_size 32 ...\n", "```\n", "\n", "During this process, the environment variables that the container needs (for example, the model path and the training data path) are predefined, and the training script relies on them to locate the files required for training. See https://github.com/aws/sagemaker-containers#important-environment-variables for details on the most common environment variables." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{2571691105.py:15} INFO - learning_rate: 5e-05\n" ] } ], "source": [ "from sagemaker.huggingface import HuggingFace\n", "import time\n", "instance_type = 'ml.p3.8xlarge'\n", "num_gpus = 4\n", "instance_count = 1\n", "train_batch_size = 16\n", "eval_batch_size = 64\n", "max_length = 384\n", "stride = 64\n", "model_id = 'salti/bert-base-multilingual-cased-finetuned-squad' \n", "fp16 = True\n", "tokenizer_id = model_id\n", "logging_steps = 100\n", "learning_rate = 5e-5\n", "logger.info(f'learning_rate: {learning_rate}')\n", "\n", "# hyperparameters, which are passed into the training job\n", "hyperparameters = {\n", "    'n_gpus': num_gpus, # number of GPUs per instance\n", "    'epochs': 3, # number of training epochs\n", "    'seed': 42, # random seed\n", "    'train_batch_size': train_batch_size, # batch size for training\n", "    'eval_batch_size': eval_batch_size, # batch size for evaluation\n", "    'max_length': max_length, # maximum sequence length\n", "    'stride': stride, # stride used when splitting long contexts\n", "    
'logging_steps': logging_steps, # logging steps\n", "    'learning_rate': learning_rate, # learning rate used during training\n", "    'fp16': fp16, # use FP16\n", "    'tokenizer_id': tokenizer_id, # pre-trained tokenizer\n", "    'model_id': model_id # pre-trained model\n", "}" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{581002583.py:23} INFO - training job name: kornlp-qna-training-2022-07-06-04-57-30\n" ] } ], "source": [ "# define Training Job Name \n", "job_name = f'kornlp-qna-training-{time.strftime(\"%Y-%m-%d-%H-%M-%S\", time.localtime())}'\n", "chkpt_s3_path = f's3://{bucket}/{s3_prefix}/native/checkpoints'\n", "\n", "# create the Estimator\n", "sm_estimator = HuggingFace(\n", "    entry_point = 'train.py', # fine-tuning script used in the training job\n", "    source_dir = './scripts', # directory where the fine-tuning script is stored\n", "    instance_type = instance_type, # instance type used for the training job\n", "    instance_count = instance_count, # the number of instances used for training\n", "    base_job_name = job_name, # prefix of the training job name (SageMaker appends its own timestamp)\n", "    role = role, # IAM role used in the training job to access AWS resources, e.g. S3\n", "    transformers_version = '4.17.0', # the transformers version used in the training job\n", "    pytorch_version = '1.10.2', # the PyTorch version used in the training job\n", "    py_version = 'py38', # the python version used in the training job\n", "    hyperparameters = hyperparameters, # the hyperparameters passed to the training job\n", "    disable_profiler = True, # turn off profiling for this job\n", "    debugger_hook_config = False, # turn off the SageMaker Debugger hook\n", "    checkpoint_s3_uri = chkpt_s3_path, # S3 location that checkpoints are synced to\n", "    checkpoint_local_path ='/opt/ml/checkpoints', # checkpoint directory inside the container\n", ")\n", "\n", "logger.info(f'training job name: {job_name}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Call the `fit()` method to start the training job. 
With `wait=True`, `fit()` runs synchronously and blocks until the job finishes; with `wait=False`, it runs asynchronously and returns immediately, so you can launch several training jobs at the same time." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{session.py:590} INFO - Creating training-job with name: kornlp-qna-training-2022-07-06-04-57-30-2022-07-06-04-57-31-762\n" ] } ], "source": [ "# define a data input dictionary with our uploaded S3 URIs\n", "data = {\n", "    'train': train_input_path,\n", "    'valid': valid_input_path\n", "}\n", "\n", "# start the training job with our uploaded datasets as input\n", "sm_estimator.fit(data, wait=False)\n", "train_job_name = sm_estimator.latest_training_job.job_name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### View Training Job\n", "You can check the training job directly in the SageMaker console, but clicking the links generated by the code cell below is more convenient." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ " [Hugging Face Training - Native] Review Training Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ " [Hugging Face Training - Native] Review CloudWatch Logs" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import display, HTML\n", "\n", "def make_console_link(region, train_job_name, train_task='[Training]'):\n", "    train_job_link = f' {train_task} Review Training Job' \n", "    cloudwatch_link = f' {train_task} Review CloudWatch Logs'\n", "    return train_job_link, cloudwatch_link \n", " \n", "train_job_link, cloudwatch_link = make_console_link(region, train_job_name, '[Hugging Face Training - Native]')\n", "\n", "display(HTML(train_job_link))\n", "display(HTML(cloudwatch_link))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Wait for the training jobs to complete\n", "Wait until training completes. 
If you set `wait=False` in `estimator.fit(...)`, you can uncomment and run the code cell below to switch to synchronous waiting. Training takes about 15-20 minutes to complete." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2022-07-06 05:01:19 Starting - Starting the training job......\n", "2022-07-06 05:01:58 Starting - Preparing the instances for training.........\n", "2022-07-06 05:03:25 Downloading - Downloading input data......\n", "2022-07-06 05:04:36 Training - Downloading the training image.......................\u001b[34mbash: cannot set terminal process group (-1): Inappropriate ioctl for device\u001b[0m\n", "\u001b[34mbash: no job control in this shell\u001b[0m\n", "\u001b[34m/opt/conda/lib/python3.8/site-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated\n", " \"class\": algorithms.Blowfish,\u001b[0m\n", "\u001b[34m2022-07-06 05:08:30,777 sagemaker-training-toolkit INFO Imported framework sagemaker_pytorch_container.training\u001b[0m\n", "\u001b[34m2022-07-06 05:08:30,816 sagemaker_pytorch_container.training INFO Block until all host DNS lookups succeed.\u001b[0m\n", "\u001b[34m2022-07-06 05:08:30,824 sagemaker_pytorch_container.training INFO Invoking user training script.\u001b[0m\n", "\n", "2022-07-06 05:08:27 Training - Training image download completed. 
Training in progress.\u001b[34m2022-07-06 05:09:55,362 sagemaker-training-toolkit INFO Invoking user script\u001b[0m\n", "\u001b[34mTraining Env:\u001b[0m\n", "\u001b[34m{\n", " \"additional_framework_parameters\": {},\n", " \"channel_input_dirs\": {\n", " \"train\": \"/opt/ml/input/data/train\",\n", " \"valid\": \"/opt/ml/input/data/valid\"\n", " },\n", " \"current_host\": \"algo-1\",\n", " \"framework_module\": \"sagemaker_pytorch_container.training:main\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ],\n", " \"hyperparameters\": {\n", " \"epochs\": 3,\n", " \"eval_batch_size\": 64,\n", " \"fp16\": true,\n", " \"learning_rate\": 5e-05,\n", " \"logging_steps\": 100,\n", " \"max_length\": 384,\n", " \"model_id\": \"salti/bert-base-multilingual-cased-finetuned-squad\",\n", " \"n_gpus\": 4,\n", " \"seed\": 42,\n", " \"stride\": 64,\n", " \"tokenizer_id\": \"salti/bert-base-multilingual-cased-finetuned-squad\",\n", " \"train_batch_size\": 16\n", " },\n", " \"input_config_dir\": \"/opt/ml/input/config\",\n", " \"input_data_config\": {\n", " \"train\": {\n", " \"TrainingInputMode\": \"File\",\n", " \"S3DistributionType\": \"FullyReplicated\",\n", " \"RecordWrapperType\": \"None\"\n", " },\n", " \"valid\": {\n", " \"TrainingInputMode\": \"File\",\n", " \"S3DistributionType\": \"FullyReplicated\",\n", " \"RecordWrapperType\": \"None\"\n", " }\n", " },\n", " \"input_dir\": \"/opt/ml/input\",\n", " \"is_master\": true,\n", " \"job_name\": \"kornlp-qna-training-2022-07-06-04-57-30-2022-07-06-04-57-31-762\",\n", " \"log_level\": 20,\n", " \"master_hostname\": \"algo-1\",\n", " \"model_dir\": \"/opt/ml/model\",\n", " \"module_dir\": \"s3://sagemaker-us-east-1-143656149352/kornlp-qna-training-2022-07-06-04-57-30-2022-07-06-04-57-31-762/source/sourcedir.tar.gz\",\n", " \"module_name\": \"train\",\n", " \"network_interface_name\": \"eth0\",\n", " \"num_cpus\": 32,\n", " \"num_gpus\": 4,\n", " \"output_data_dir\": \"/opt/ml/output/data\",\n", " \"output_dir\": 
\"/opt/ml/output\",\n", " \"output_intermediate_dir\": \"/opt/ml/output/intermediate\",\n", " \"resource_config\": {\n", " \"current_host\": \"algo-1\",\n", " \"current_instance_type\": \"ml.p3.8xlarge\",\n", " \"current_group_name\": \"homogeneousCluster\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ],\n", " \"instance_groups\": [\n", " {\n", " \"instance_group_name\": \"homogeneousCluster\",\n", " \"instance_type\": \"ml.p3.8xlarge\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ]\n", " }\n", " ],\n", " \"network_interface_name\": \"eth0\"\n", " },\n", " \"user_entry_point\": \"train.py\"\u001b[0m\n", "\u001b[34m}\u001b[0m\n", "\u001b[34mEnvironment variables:\u001b[0m\n", "\u001b[34mSM_HOSTS=[\"algo-1\"]\u001b[0m\n", "\u001b[34mSM_NETWORK_INTERFACE_NAME=eth0\u001b[0m\n", "\u001b[34mSM_HPS={\"epochs\":3,\"eval_batch_size\":64,\"fp16\":true,\"learning_rate\":5e-05,\"logging_steps\":100,\"max_length\":384,\"model_id\":\"salti/bert-base-multilingual-cased-finetuned-squad\",\"n_gpus\":4,\"seed\":42,\"stride\":64,\"tokenizer_id\":\"salti/bert-base-multilingual-cased-finetuned-squad\",\"train_batch_size\":16}\u001b[0m\n", "\u001b[34mSM_USER_ENTRY_POINT=train.py\u001b[0m\n", "\u001b[34mSM_FRAMEWORK_PARAMS={}\u001b[0m\n", "\u001b[34mSM_RESOURCE_CONFIG={\"current_group_name\":\"homogeneousCluster\",\"current_host\":\"algo-1\",\"current_instance_type\":\"ml.p3.8xlarge\",\"hosts\":[\"algo-1\"],\"instance_groups\":[{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.p3.8xlarge\"}],\"network_interface_name\":\"eth0\"}\u001b[0m\n", "\u001b[34mSM_INPUT_DATA_CONFIG={\"train\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"valid\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}}\u001b[0m\n", "\u001b[34mSM_OUTPUT_DATA_DIR=/opt/ml/output/data\u001b[0m\n", "\u001b[34mSM_CHANNELS=[\"train\",\"valid\"]\u001b[0m\n", 
"\u001b[34mSM_CURRENT_HOST=algo-1\u001b[0m\n", "\u001b[34mSM_MODULE_NAME=train\u001b[0m\n", "\u001b[34mSM_LOG_LEVEL=20\u001b[0m\n", "\u001b[34mSM_FRAMEWORK_MODULE=sagemaker_pytorch_container.training:main\u001b[0m\n", "\u001b[34mSM_INPUT_DIR=/opt/ml/input\u001b[0m\n", "\u001b[34mSM_INPUT_CONFIG_DIR=/opt/ml/input/config\u001b[0m\n", "\u001b[34mSM_OUTPUT_DIR=/opt/ml/output\u001b[0m\n", "\u001b[34mSM_NUM_CPUS=32\u001b[0m\n", "\u001b[34mSM_NUM_GPUS=4\u001b[0m\n", "\u001b[34mSM_MODEL_DIR=/opt/ml/model\u001b[0m\n", "\u001b[34mSM_MODULE_DIR=s3://sagemaker-us-east-1-143656149352/kornlp-qna-training-2022-07-06-04-57-30-2022-07-06-04-57-31-762/source/sourcedir.tar.gz\u001b[0m\n", "\u001b[34mSM_TRAINING_ENV={\"additional_framework_parameters\":{},\"channel_input_dirs\":{\"train\":\"/opt/ml/input/data/train\",\"valid\":\"/opt/ml/input/data/valid\"},\"current_host\":\"algo-1\",\"framework_module\":\"sagemaker_pytorch_container.training:main\",\"hosts\":[\"algo-1\"],\"hyperparameters\":{\"epochs\":3,\"eval_batch_size\":64,\"fp16\":true,\"learning_rate\":5e-05,\"logging_steps\":100,\"max_length\":384,\"model_id\":\"salti/bert-base-multilingual-cased-finetuned-squad\",\"n_gpus\":4,\"seed\":42,\"stride\":64,\"tokenizer_id\":\"salti/bert-base-multilingual-cased-finetuned-squad\",\"train_batch_size\":16},\"input_config_dir\":\"/opt/ml/input/config\",\"input_data_config\":{\"train\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"valid\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}},\"input_dir\":\"/opt/ml/input\",\"is_master\":true,\"job_name\":\"kornlp-qna-training-2022-07-06-04-57-30-2022-07-06-04-57-31-762\",\"log_level\":20,\"master_hostname\":\"algo-1\",\"model_dir\":\"/opt/ml/model\",\"module_dir\":\"s3://sagemaker-us-east-1-143656149352/kornlp-qna-training-2022-07-06-04-57-30-2022-07-06-04-57-31-762/source/sourcedir.tar.gz\",\"module_name\":\"train\",\"netw
ork_interface_name\":\"eth0\",\"num_cpus\":32,\"num_gpus\":4,\"output_data_dir\":\"/opt/ml/output/data\",\"output_dir\":\"/opt/ml/output\",\"output_intermediate_dir\":\"/opt/ml/output/intermediate\",\"resource_config\":{\"current_group_name\":\"homogeneousCluster\",\"current_host\":\"algo-1\",\"current_instance_type\":\"ml.p3.8xlarge\",\"hosts\":[\"algo-1\"],\"instance_groups\":[{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.p3.8xlarge\"}],\"network_interface_name\":\"eth0\"},\"user_entry_point\":\"train.py\"}\u001b[0m\n", "\u001b[34mSM_USER_ARGS=[\"--epochs\",\"3\",\"--eval_batch_size\",\"64\",\"--fp16\",\"True\",\"--learning_rate\",\"5e-05\",\"--logging_steps\",\"100\",\"--max_length\",\"384\",\"--model_id\",\"salti/bert-base-multilingual-cased-finetuned-squad\",\"--n_gpus\",\"4\",\"--seed\",\"42\",\"--stride\",\"64\",\"--tokenizer_id\",\"salti/bert-base-multilingual-cased-finetuned-squad\",\"--train_batch_size\",\"16\"]\u001b[0m\n", "\u001b[34mSM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate\u001b[0m\n", "\u001b[34mSM_CHANNEL_TRAIN=/opt/ml/input/data/train\u001b[0m\n", "\u001b[34mSM_CHANNEL_VALID=/opt/ml/input/data/valid\u001b[0m\n", "\u001b[34mSM_HP_EPOCHS=3\u001b[0m\n", "\u001b[34mSM_HP_EVAL_BATCH_SIZE=64\u001b[0m\n", "\u001b[34mSM_HP_FP16=true\u001b[0m\n", "\u001b[34mSM_HP_LEARNING_RATE=5e-05\u001b[0m\n", "\u001b[34mSM_HP_LOGGING_STEPS=100\u001b[0m\n", "\u001b[34mSM_HP_MAX_LENGTH=384\u001b[0m\n", "\u001b[34mSM_HP_MODEL_ID=salti/bert-base-multilingual-cased-finetuned-squad\u001b[0m\n", "\u001b[34mSM_HP_N_GPUS=4\u001b[0m\n", "\u001b[34mSM_HP_SEED=42\u001b[0m\n", "\u001b[34mSM_HP_STRIDE=64\u001b[0m\n", "\u001b[34mSM_HP_TOKENIZER_ID=salti/bert-base-multilingual-cased-finetuned-squad\u001b[0m\n", "\u001b[34mSM_HP_TRAIN_BATCH_SIZE=16\u001b[0m\n", 
"\u001b[34mPYTHONPATH=/opt/ml/code:/opt/conda/bin:/opt/conda/lib/python38.zip:/opt/conda/lib/python3.8:/opt/conda/lib/python3.8/lib-dynload:/opt/conda/lib/python3.8/site-packages:/opt/conda/lib/python3.8/site-packages/smdebug-1.0.13b20220512-py3.8.egg:/opt/conda/lib/python3.8/site-packages/pyinstrument-3.4.2-py3.8.egg:/opt/conda/lib/python3.8/site-packages/pyinstrument_cext-0.2.4-py3.8-linux-x86_64.egg\u001b[0m\n", "\u001b[34mInvoking script with the following command:\u001b[0m\n", "\u001b[34m/opt/conda/bin/python3.8 train.py --epochs 3 --eval_batch_size 64 --fp16 True --learning_rate 5e-05 --logging_steps 100 --max_length 384 --model_id salti/bert-base-multilingual-cased-finetuned-squad --n_gpus 4 --seed 42 --stride 64 --tokenizer_id salti/bert-base-multilingual-cased-finetuned-squad --train_batch_size 16\u001b[0m\n", "\u001b[34m[{train.py:164} INFO - ***** Arguments *****\u001b[0m\n", "\u001b[34m[{train.py:165} INFO - epochs=3\u001b[0m\n", "\u001b[34mseed=42\u001b[0m\n", "\u001b[34mtrain_batch_size=16\u001b[0m\n", "\u001b[34meval_batch_size=64\u001b[0m\n", "\u001b[34mmax_length=384\u001b[0m\n", "\u001b[34mstride=64\u001b[0m\n", "\u001b[34mwarmup_steps=100\u001b[0m\n", "\u001b[34mlogging_steps=100\u001b[0m\n", "\u001b[34mlearning_rate=5e-05\u001b[0m\n", "\u001b[34mdisable_tqdm=False\u001b[0m\n", "\u001b[34mfp16=True\u001b[0m\n", "\u001b[34mtokenizer_id=salti/bert-base-multilingual-cased-finetuned-squad\u001b[0m\n", "\u001b[34mmodel_id=salti/bert-base-multilingual-cased-finetuned-squad\u001b[0m\n", "\u001b[34moutput_data_dir=/opt/ml/output/data\u001b[0m\n", "\u001b[34mmodel_dir=/opt/ml/model\u001b[0m\n", "\u001b[34mn_gpus=4\u001b[0m\n", "\u001b[34mtrain_dir=/opt/ml/input/data/train\u001b[0m\n", "\u001b[34mvalid_dir=/opt/ml/input/data/valid\u001b[0m\n", "\u001b[34mchkpt_dir=/opt/ml/checkpoints\u001b[0m\n", "\u001b[34mDownloading: 0%| | 0.00/264 [00:00\n", "\n", "## 3. 
Prediction\n", "---\n", "### Load fine-tuned model" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "from transformers import BertForQuestionAnswering\n", "tokenizer = BertTokenizerFast.from_pretrained(local_model_dir)\n", "model = BertForQuestionAnswering.from_pretrained(local_model_dir)\n", "# map_location='cpu' lets the weights load on machines without a GPU as well\n", "model.load_state_dict(torch.load(f'{local_model_dir}/pytorch_model.bin', map_location='cpu'))\n", "model = model.eval()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def predict_fn(example, model):\n", "    \n", "    from transformers import pipeline\n", "    # pipeline() takes a device index: -1 for CPU, 0 for the first GPU\n", "    device_id = 0 if torch.cuda.is_available() else -1\n", "    \n", "    context = example[0]\n", "    question = example[1]\n", "    \n", "    # the pipeline moves the model to the requested device itself\n", "    nlp = pipeline(\"question-answering\", model=model, tokenizer=tokenizer, device=device_id)\n", "    results = nlp(question=question, context=context)\n", "    return results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example\n", "Feel free to write your own sample passages and questions and run inference on them." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "context = r\"\"\"\n", "아마존웹서비스(AWS)는 카카오 게임 전문 계열사 카카오게임즈가 자사 머신러닝(ML), 데이터베이스(DB) 및 데이터 분석 등 서비스를 통해 사용자 경험을 제고했다고 7일 밝혔다.\n", "AWS는 카카오게임즈가 AWS클라우드 역량을 활용해 게임 데이터 분석 솔루션을 실행하고, 대량의 게임 데이터와 설치 건수, 사용자 유지율과 같은 성과 지표를 분석하고 있다고 설명했다. \n", "현재 카카오게임즈는 폭증하는 데이터를 저장·분석하기 위한 방법으로 클라우드 오브젝트 스토리지 서비스 '아마존 S3(Amazon Simple Storage Service)' 기반 데이터 레이크(Data Lake)를 구축했다. 또 데이터 분석을 용이하게 해주는 대화형 쿼리 서비스 '아마존 아테나(Amazon Athena)'를 도입해 데이터 레이크로부터 게임 데이터를 통합하고, 게임 사용자 행동과 관련된 인사이트를 확보 중이다. \n", "이를 통해 카카오게임즈는 게임 봇을 탐지하고 제거하는 방식으로 사용자 경험을 제고했다. 또한 관계형 데이터베이스 서비스 '아마존 오로라(Amazon Aurora)'를 활용해 게임 내 구매와 같은 대규모 데이터베이스 거래를 처리하고 있다. 
이밖에도 카카오게임즈는 ML 모델 구축, 교육 및 배포를 위한 완전 관리형 서비스 '아마존 세이지메이커(Amazon SageMaker)'를 활용할 예정이다.\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'score': 0.998553991317749, 'start': 29, 'end': 36, 'answer': '카카오게임즈가'}\n", "{'score': 0.8284631967544556, 'start': 263, 'end': 269, 'answer': '아마존 S3'}\n", "{'score': 0.9286826252937317, 'start': 514, 'end': 521, 'answer': '아마존 오로라'}\n", "{'score': 0.8117018938064575, 'start': 626, 'end': 636, 'answer': '아마존 세이지메이커'}\n" ] } ], "source": [ "question = \"카카오 게임 전문 계열사는?\"\n", "print(predict_fn((context, question), model))\n", "\n", "question = \"AWS의 클라우드 오브젝트 스토리지 서비스는?\"\n", "print(predict_fn((context, question), model))\n", "\n", "question = \"AWS의 관계형 데이터베이스 서비스는?\"\n", "print(predict_fn((context, question), model))\n", "\n", "question = \"AWS의 ML 모델 완전 관리형 서비스는?\"\n", "print(predict_fn((context, question), model))" ] } ], "metadata": { "accelerator": "TPU", "colab": { "authorship_tag": "ABX9TyPT/32fR6YbrNgmG6aLi8U7", "include_colab_link": true, "machine_shape": "hm", "name": "5_(BERT_실습)한국어 개체명 인식.ipynb", "provenance": [] }, "kernelspec": { "display_name": "conda_pytorch_p38", "language": "python", "name": "conda_pytorch_p38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "1ac7cea5aaba45af9eddeaaee02e1e5a": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, 
"border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "2ef88e8c35374ca69203a64d209745ea": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "32c340873ce247e88df66c73309eecdc": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", 
"_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "4a4ff12bb4604faf8c1cd79156713854": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "initial" } }, "51eafe68808a4ffbac05605381c2d5a3": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "598f160635264f138769ae94a127455c": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", 
"_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "Downloading: 100%", "description_tooltip": null, "layout": "IPY_MODEL_1ac7cea5aaba45af9eddeaaee02e1e5a", "max": 1961828, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_b61fb58de9be4c84b7767bf69e32c5d9", "value": 1961828 } }, "5b0843766d3f4ac785c7dba85254d605": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_ca07bf481be7473ab1b22babaa76c3da", "IPY_MODEL_ab60a81459a84ae19fa922aa4ce27e8a" ], "layout": "IPY_MODEL_6356e0609f9f49d5996ef7f4f77fbd2d" } }, "6356e0609f9f49d5996ef7f4f77fbd2d": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": 
null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "693887ccba30416586e2085b7e36118b": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_9169c98d7ee5423ba5b6eb3c4dbbeda7", "placeholder": "​", "style": "IPY_MODEL_cc353722b52045efb9009ef79c7d56b7", "value": " 1.96M/1.96M [00:00<00:00, 6.43MB/s]" } }, "6cb6badcbbd34359be9dad2c8af93098": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "88f28f34de9e41cc948c7aebb4035589": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": 
null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "8d879e2bbea04536aaa1ee5d356bb7c3": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_2ef88e8c35374ca69203a64d209745ea", "placeholder": "​", "style": "IPY_MODEL_32c340873ce247e88df66c73309eecdc", "value": " 29.0/29.0 [00:00<00:00, 50.2B/s]" } }, "9169c98d7ee5423ba5b6eb3c4dbbeda7": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "93234e1bd6444d819b130d83402d2d7b": { "model_module": 
"@jupyter-widgets/controls", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_e6e0a2274e7f420f91dc97a143315da0", "IPY_MODEL_8d879e2bbea04536aaa1ee5d356bb7c3" ], "layout": "IPY_MODEL_ad295e710380441588473f810a9210d7" } }, "a0dbf235f20c497186d319b5b1558dd9": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "initial" } }, "ab60a81459a84ae19fa922aa4ce27e8a": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_51eafe68808a4ffbac05605381c2d5a3", "placeholder": "​", "style": "IPY_MODEL_6cb6badcbbd34359be9dad2c8af93098", "value": " 996k/996k [00:00<00:00, 1.67MB/s]" } }, "ad295e710380441588473f810a9210d7": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, 
"flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "b61fb58de9be4c84b7767bf69e32c5d9": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "initial" } }, "ca07bf481be7473ab1b22babaa76c3da": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "Downloading: 100%", "description_tooltip": null, "layout": "IPY_MODEL_88f28f34de9e41cc948c7aebb4035589", "max": 995526, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_4a4ff12bb4604faf8c1cd79156713854", "value": 995526 } }, "cc353722b52045efb9009ef79c7d56b7": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", 
"_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "e4dcb08aab3748b18a10d0f5daaf3554": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_598f160635264f138769ae94a127455c", "IPY_MODEL_693887ccba30416586e2085b7e36118b" ], "layout": "IPY_MODEL_f1ed2fc28a3e499fa784d5aa1777a77b" } }, "e6e0a2274e7f420f91dc97a143315da0": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "Downloading: 100%", "description_tooltip": null, "layout": "IPY_MODEL_ed834b7f997141479ab90216655e230a", "max": 29, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_a0dbf235f20c497186d319b5b1558dd9", "value": 29 } }, "ed834b7f997141479ab90216655e230a": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, 
"grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f1ed2fc28a3e499fa784d5aa1777a77b": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } } } } }, "nbformat": 4, "nbformat_minor": 4 }