{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# Training and Evaluating YOLOv5 with SageMaker\n", "\n", "In this module, we make effective use of the Amazon SageMaker API and run model training using only the PyTorch framework's native implementation of multi-GPU distributed training." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "install_needed = True # should only be True once\n", "install_needed = False" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%bash\n", "\n", "DAEMON_PATH=\"/etc/docker\"\n", "MEMORY_SIZE=10G\n", "\n", "FLAG=$(cat $DAEMON_PATH/daemon.json | jq 'has(\"data-root\")')\n", "# echo $FLAG\n", "\n", "if [ \"$FLAG\" == true ]; then\n", "    echo \"Already revised\"\n", "else\n", "    echo \"Add data-root and default-shm-size=$MEMORY_SIZE\"\n", "    sudo cp $DAEMON_PATH/daemon.json $DAEMON_PATH/daemon.json.bak\n", "    sudo cat $DAEMON_PATH/daemon.json.bak | jq '. += {\"data-root\":\"/home/ec2-user/SageMaker/.container/docker\",\"default-shm-size\":\"'$MEMORY_SIZE'\"}' | sudo tee $DAEMON_PATH/daemon.json > /dev/null\n", "    sudo service docker restart\n", "    echo \"Docker Restart\"\n", "fi" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "import sys\n", "import IPython\n", "\n", "if install_needed:\n", "    print(\"installing deps and restarting kernel\")\n", "# !{sys.executable} -m pip install -U split-folders tqdm albumentations crc32c wget\n", "    !{sys.executable} -m pip install 'sagemaker[local]' --upgrade\n", "    !{sys.executable} -m pip install -U smdebug sagemaker-experiments\n", "    !{sys.executable} -m pip install -U sagemaker ipyplot jsonlines\n", "    IPython.Application.instance().kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## 2. Set Up the Model Experiment Environment\n", "\n", "

Import the basic packages needed for SageMaker training.

\n", "

boto3 provides a convenient abstraction that hides the underlying HTTP API calls and exposes Python classes for working with AWS resources such as Amazon EC2 instances and S3 buckets.

\n", "

The SageMaker Python SDK is an open-source library for training and deploying machine learning models on Amazon SageMaker.

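\n", "\n", "For example (a minimal sketch using only these two libraries), you can confirm which AWS account and region the current session points to before launching any jobs:\n", "\n", "```python\n", "import boto3\n", "import sagemaker\n", "\n", "print(boto3.client(\"sts\").get_caller_identity()[\"Account\"])\n", "print(sagemaker.Session().boto_region_name)\n", "```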
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import joblib\n", "import sagemaker\n", "\n", "import datetime\n", "import glob\n", "import os\n", "import time\n", "import warnings\n", "\n", "import sys\n", "import matplotlib.image as mpimg\n", "import ipyplot\n", "\n", "from smexperiments.experiment import Experiment\n", "from smexperiments.trial import Trial\n", "\n", "import shutil\n", "\n", "import boto3\n", "import numpy as np\n", "import pandas as pd\n", "\n", "# from tqdm import tqdm\n", "from time import strftime\n", "from PIL import Image\n", "\n", "from sagemaker import get_execution_role\n", "from sagemaker.pytorch import PyTorch\n", "\n", "from sagemaker.debugger import (Rule,\n", "                                rule_configs,\n", "                                ProfilerConfig,\n", "                                FrameworkProfile,\n", "                                DetailedProfilingConfig,\n", "                                DataloaderProfilingConfig,\n", "                                PythonProfilingConfig)\n", "\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "role = get_execution_role()\n", "\n", "sess = boto3.Session()\n", "region = sess.region_name\n", "account = boto3.client(\"sts\").get_caller_identity().get(\"Account\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sagemaker.__version__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def create_experiment(experiment_name):\n", "    try:\n", "        sm_experiment = Experiment.load(experiment_name)\n", "    except Exception:\n", "        sm_experiment = Experiment.create(experiment_name=experiment_name)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def create_trial(experiment_name):\n", "    create_date = strftime(\"%m%d-%H%M%s\")\n", "    sm_trial = Trial.create(trial_name=f'{experiment_name}-{create_date}',\n", "                            experiment_name=experiment_name)\n", "\n", "    job_name = f'{sm_trial.trial_name}'\n", "    return job_name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Create a Custom Training Container\n", "Add the IAMFullAccess policy to the IAM role used by this notebook before running the steps below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# iam_client = boto3.client('iam')\n", "# role = get_execution_role()\n", "# base_role_name = role.split('/')[-1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# iam_client.attach_role_policy(\n", "#     RoleName=base_role_name,\n", "#     PolicyArn='arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess'\n", "# )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# %%bash\n", "# cd ./docker\n", "# echo $(pwd)\n", "# container_name=yolov5-training-sagemaker\n", "# account=$(aws sts get-caller-identity --query Account --output text)\n", "\n", "# # Get the region defined in the current configuration (default to us-west-2 if none defined)\n", "# region=$(aws configure get region)\n", "# region=${region:-us-west-2}\n", "\n", "# fullname=\"${account}.dkr.ecr.${region}.amazonaws.com/${container_name}:1.0\"\n", "\n", "# # If the repository doesn't exist in ECR, create it.\n", "# aws ecr describe-repositories --repository-names \"${container_name}\" > /dev/null 2>&1\n", "# if [ $? 
-ne 0 ]\n", "# then\n", "#     aws ecr create-repository --repository-name \"${container_name}\" > /dev/null\n", "# fi\n", "\n", "# # Get the login command from ECR and execute it directly\n", "# $(aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin \"763104351884.dkr.ecr.us-west-2.amazonaws.com\")\n", "\n", "# # Build the docker image locally with the image name and then push it to ECR\n", "# # with the full name.\n", "# docker build -f Dockerfile -t ${fullname} .\n", "# # docker tag ${container_name} ${fullname}\n", "\n", "# # Get the login command from ECR and execute it directly\n", "# $(aws ecr get-login --region ${region} --no-include-email)\n", "# docker push ${fullname}" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## 4. Prepare the Data in YOLOv5 Format" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset used in this exercise is the [BCCD Dataset published by Roboflow](https://public.roboflow.com/object-detection/bccd), a collection of blood images containing WBC (white blood cells), RBC (red blood cells), and Platelets.\n", "\n", "The train/val/test datasets for training a YOLOv5 object detection model must follow the folder structure below. Images are stored in the `images` folder, and the per-image annotation files are stored in the `labels` folder.\n", "```\n", "├── test\n", "│   ├── images\n", "│   └── labels\n", "├── train\n", "│   ├── images\n", "│   └── labels\n", "└── valid\n", "    ├── images\n", "    └── labels\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "YOLOv5 declares the dataset paths, the number of classes, and the class names in a separate YAML file.\n", "\n", "- `data_sm.yaml`: the configuration file used when training runs in SageMaker Local mode or SageMaker managed training. SageMaker places the dataset (from S3 for managed training, from local disk in Local mode) under `/opt/ml/input/data/[channel_name]/` inside the SageMaker container, so the `train` and `val` paths must point to SageMaker container paths rather than local paths on the Jupyter notebook instance." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%writefile yolov5/data/data_sm.yaml\n", "train: /opt/ml/input/data/inputdata/train/images\n", "val: /opt/ml/input/data/inputdata/valid/images\n", "\n", "nc: 3\n", "names: ['Platelets', 'RBC', 'WBC']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Configure the Estimator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Metric Definitions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SageMaker can scan the standard-output logs produced by the training code for values that match specific patterns and store them as custom CloudWatch metrics. Pass `metric_definitions` as the `metric_definitions` parameter when declaring the SageMaker `Estimator`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metric_definitions = [\n", "    {'Name': 'Precision', 'Regex': r'all\\s+[0-9.]+\\s+[0-9.]+\\s+([0-9.]+)'},\n", "    {'Name': 'Recall', 'Regex': r'all\\s+[0-9.]+\\s+[0-9.]+\\s+[0-9.]+\\s+([0-9.]+)'},\n", "    {'Name': 'mAP@.5', 'Regex': r'all\\s+[0-9.]+\\s+[0-9.]+\\s+[0-9.]+\\s+[0-9.]+\\s+([0-9.]+)'},\n", "    {'Name': 'mAP@.5:.95', 'Regex': r'all\\s+[0-9.]+\\s+[0-9.]+\\s+[0-9.]+\\s+[0-9.]+\\s+[0-9.]+\\s+([0-9.]+)'}\n", "]" ] },
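{ "cell_type": "markdown", "metadata": {}, "source": [ "Before launching a training job, you can sanity-check these regular expressions locally against a sample of the per-class summary line that YOLOv5 prints during validation (`Class Images Labels P R mAP@.5 mAP@.5:.95`). This is a minimal sketch; the numbers in the sample line below are illustrative only, not real results." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import re\n", "\n", "# Illustrative copy of the 'all' summary line printed by YOLOv5 validation\n", "sample_line = 'all         36        967      0.855      0.884       0.92      0.617'\n", "\n", "for metric in metric_definitions:\n", "    match = re.search(metric['Regex'], sample_line)\n", "    print(metric['Name'], '->', match.group(1) if match else 'no match')" ] },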
{ "cell_type": "markdown", "metadata": {}, "source": [ "### WandB Setup (Optional)\n", "https://wandb.ai/cayush/yoloV5/reports/Track-and-debug-your-YOLOv5-models--VmlldzozMDQ1OTg\n", "\n", "Running `wandb.init()` saves the web API key to the `/root/.netrc` file. Add the code below to `utils/loggers/__init__.py` so that this `.netrc` file is copied to `/root/.netrc` inside the container that runs SageMaker Local/managed training. The `.netrc` file must be placed in `source_dir/.netrc` beforehand.\n", "\n", "```\n", "# __init__.py\n", "...\n", "################## For SageMaker ##################\n", "from pathlib import Path\n", "import subprocess\n", "\n", "def wandb_setting():\n", "    set_path = '/opt/ml/code/.netrc'  # WandB API key\n", "    file = Path(set_path)\n", "    if file.exists():\n", "        subprocess.run(['cp', '-r', set_path, '/root/.netrc'])\n", "    else:\n", "        print('=' * 100)\n", "        print('Not found!!!')\n", "        print('=' * 100)\n", "\n", "wandb_setting()\n", "################## For SageMaker ##################\n", "...\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure Storage" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sagemaker_session = sagemaker.Session()\n", "bucket = sagemaker_session.default_bucket()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# bucket = 'XXXXXXXXXXXXX' ## <== Create a bucket in your account and modify the bucket name.\n", "code_location = f's3://{bucket}/poc_yolov5/sm_codes'\n", "output_path = f's3://{bucket}/poc_yolov5/output'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure Hyperparameters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hyperparameters = {\n", "    'data': 'data_sm.yaml',\n", "    'cfg': 'yolov5s.yaml',\n", "    'weights': 'weights/yolov5s.pt',  # Transfer learning\n", "    'batch-size': 64,\n", "    'epochs': 1,\n", "    'project': '/opt/ml/model',\n", "    'workers': 0,  # To avoid shm OOM issue\n", "    'freeze': 10,  # For transfer learning, freeze all layers except for the final output convolution layers.\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure the Instance and Maximum Training Time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment_name = 'yolov5-poc-exp1'\n", "instance_type = 'ml.m5.4xlarge'\n", "# instance_type = 'local'\n", "instance_count = 1\n", "do_spot_training = False\n", "max_wait = None\n", "max_run = 1*60*60" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Local mode" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "source_dir = f'{Path.cwd()}/yolov5'\n", "\n", "\n", "if instance_type in ['local_gpu', 'local']:\n", "    from sagemaker.local import LocalSession\n", "\n", "    sagemaker_session = LocalSession()\n", "    sagemaker_session.config = {'local': {'local_code': True}}\n", "    s3_data_path = f'file://{Path.cwd()}/BCCD'\n", "\n", "    checkpoint_s3_uri = None\n", "else:\n", "    sagemaker_session = sagemaker.Session()\n", "    s3_data_path = f's3://{bucket}/dataset/BCCD'\n", "\n", "    checkpoint_s3_uri = f's3://{bucket}/poc_yolov5/checkpoints'\n", "    !aws s3 sync ./BCCD {s3_data_path} --quiet" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_uri = None\n", "distribution = None\n", "train_job_name = 'sm'\n", "\n", "if do_spot_training:\n", "    max_wait = max_run\n", "\n", "print(\"train_job_name : {} \\ntrain_instance_type : {} \\ntrain_instance_count : {} \\nimage_uri : {} \\ndistribution : {}\".format(train_job_name, instance_type, instance_count, image_uri, distribution))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# all input configurations, parameters, and metrics specified in the estimator\n", "# definition are automatically tracked\n",
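"# Note (for reference): SageMaker script mode passes the 'hyperparameters' dict defined above\n", "# to the entry point as command-line flags, so inside the container the launch is roughly\n", "# equivalent to: python train.py --data data_sm.yaml --cfg yolov5s.yaml --weights weights/yolov5s.pt --batch-size 64 --epochs 1 --project /opt/ml/model --workers 0 --freeze 10\n",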
"estimator = PyTorch(\n", "    entry_point='train.py',\n", "    source_dir=source_dir,\n", "    role=role,\n", "    sagemaker_session=sagemaker_session,\n", "    framework_version='1.10',\n", "    py_version='py38',\n", "    # image_uri=image_uri,\n", "    instance_count=instance_count,\n", "    instance_type=instance_type,\n", "    # volume_size=1024,\n", "    code_location=code_location,\n", "    output_path=output_path,\n", "    hyperparameters=hyperparameters,\n", "    # distribution=distribution,\n", "    # disable_profiler=True,\n", "    # debugger_hook_config=False,\n", "    metric_definitions=metric_definitions,\n", "    max_run=max_run,\n", "    use_spot_instances=do_spot_training,\n", "    max_wait=max_wait,\n", "    checkpoint_s3_uri=checkpoint_s3_uri,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!sudo rm -rf ./yolov5/weights/" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "create_experiment(experiment_name)\n", "job_name = create_trial(experiment_name)\n", "\n", "# Now associate the estimator with the Experiment and Trial\n", "estimator.fit(\n", "    inputs={'inputdata': s3_data_path},\n", "    job_name=job_name,\n", "    experiment_config={\n", "        'TrialName': job_name,\n", "        'TrialComponentDisplayName': job_name,\n", "    },\n", "    wait=False,\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "job_name = estimator.latest_training_job.name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sagemaker_session.logs_for_job(job_name=job_name, wait=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Check the Training Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "artifacts_dir = estimator.model_data.replace('model.tar.gz', '')\n", "print(artifacts_dir)\n", "!aws s3 ls --human-readable {artifacts_dir}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_dir = './model'\n", "\n", "!rm -rf $model_dir\n", "\n", "import json, os\n", "\n", "if not os.path.exists(model_dir):\n", "    os.makedirs(model_dir)\n", "\n", "!aws s3 cp {artifacts_dir}model.tar.gz {model_dir}/model.tar.gz\n", "!tar -xvzf {model_dir}/model.tar.gz -C {model_dir}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from yolov5 import utils\n", "\n", "sys.path.append(f'{os.getcwd()}/yolov5')\n", "display = utils.notebook_init()  # checks" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "images = []\n", "for img_path in glob.glob('model/*/*'):\n", "    if img_path.split('.')[-1] in ['jpg', 'png']:\n", "        images.append(mpimg.imread(img_path))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ipyplot.plot_images(images, max_images=2, img_width=450)" ] },
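{ "cell_type": "markdown", "metadata": {}, "source": [ "Optionally, you can run a quick local inference with the trained weights before moving on to the Processing-job-based inference below. This is a minimal sketch: the `model/*/weights/best.pt` pattern is an assumption about how the extracted `model.tar.gz` is laid out, and the model is loaded through the local `yolov5` repository via `torch.hub`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "\n", "# Locate the trained weights inside the extracted archive (assumed layout: <run_name>/weights/best.pt)\n", "best_weights = sorted(glob.glob('model/*/weights/best.pt'))\n", "test_images = sorted(glob.glob('BCCD/test/images/*.jpg'))\n", "\n", "if best_weights and test_images:\n", "    local_model = torch.hub.load('./yolov5', 'custom', path=best_weights[0], source='local')\n", "    results = local_model(test_images[0])\n", "    results.print()\n", "else:\n", "    print('best.pt or test images not found - skipping this quick check')" ] },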
{ "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## 7. Create a Custom Inference Docker Image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# %%bash\n", "# cd ./docker\n", "# echo $(pwd)\n", "# container_name=yolov5-inference-sagemaker\n", "# account=$(aws sts get-caller-identity --query Account --output text)\n", "\n", "# # Get the region defined in the current configuration (default to us-west-2 if none defined)\n", "# region=$(aws configure get region)\n", "# region=${region:-us-west-2}\n", "\n", "# fullname=\"${account}.dkr.ecr.${region}.amazonaws.com/${container_name}:1.0\"\n", "\n", "# # If the repository doesn't exist in ECR, create it.\n", "# aws ecr describe-repositories --repository-names \"${container_name}\" > /dev/null 2>&1\n", "# if [ $? -ne 0 ]\n", "# then\n", "#     aws ecr create-repository --repository-name \"${container_name}\" > /dev/null\n", "# fi\n", "\n", "# # Get the login command from ECR and execute it directly\n", "# $(aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin \"763104351884.dkr.ecr.us-west-2.amazonaws.com\")\n", "\n", "# # Build the docker image locally with the image name and then push it to ECR\n", "# # with the full name.\n", "# docker build -f Dockerfile.inf -t ${fullname} .\n", "# # docker tag ${container_name} ${fullname}\n", "\n", "# # Get the login command from ECR and execute it directly\n", "# $(aws ecr get-login --region ${region} --no-include-email)\n", "# docker push ${fullname}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 8. Run Inference with a Processing Job" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sagemaker.pytorch import PyTorch\n", "from sagemaker.processing import Processor, ScriptProcessor, FrameworkProcessor\n", "from sagemaker.processing import ProcessingInput, ProcessingOutput" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# inf_image_uri = f'{account}.dkr.ecr.{region}.amazonaws.com/yolov5-inference-sagemaker:1.0'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "instance_count = 1\n", "instance_type = \"ml.c5.2xlarge\"\n", "# instance_type = 'local'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "detect_processor = FrameworkProcessor(\n", "    PyTorch,\n", "    framework_version=\"1.10\",\n", "    py_version=\"py38\",\n", "    role=role,\n", "    # image_uri=inf_image_uri,\n", "    instance_count=instance_count,\n", "    instance_type=instance_type,\n", "    code_location=code_location\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s3_test_path = f\"s3://{bucket}/dataset/BCCD/test/images/\"\n", "detect_output = f\"s3://{bucket}/poc_yolov5/detect_output\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if instance_type == 'local':\n", "    from pathlib import Path\n", "    from sagemaker.local import LocalSession\n", "\n", "    sagemaker_session = LocalSession()\n", "\n", "    sagemaker_session.config = {'local': {'local_code': True}}\n", "    # s3_data_path = f'file://{Path.cwd()}/BCCD/test/images/'\n", "else:\n", "    sagemaker_session = sagemaker.Session()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Modify detect.py\n", "Add code inside the function to extract the model.tar.gz archive.\n", "```python\n", "def run(...)
:\n", "\n", "    try:\n", "        ############ MLOps for SageMaker ###########\n", "        w = str(weights[0] if isinstance(weights, list) else weights)\n", "        if w.endswith('.tar.gz'):\n", "            import tarfile\n", "            ap = tarfile.open(w)\n", "            ap.extractall(\"/opt/ml/processing/weights/\")\n", "            ap.close()\n", "\n", "        import glob\n", "        # print(glob.glob(\"/opt/ml/processing/weights/*/*/*\"))\n", "        training_job_name = glob.glob(\"/opt/ml/processing/weights/*/*/*\")[0].split(\"/\")[-3]\n", "        # LOGGER.info(f\" **************** training_job_name : {training_job_name}\")\n", "        weights = f\"/opt/ml/processing/weights/{training_job_name}/weights/best.pt\"\n", "    except Exception:\n", "        print(\"Not using a SageMaker Processing job\")\n", "        pass\n", "```\n", "If a permission issue occurs in Local mode, add the following at the very bottom of the script.\n", "```python\n", "os.system('chmod 777 -R /opt/ml/processing/output/exp')\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "create_experiment(experiment_name)\n", "job_name = create_trial(experiment_name)\n", "\n", "detect_processor.run(\n", "    code=\"detect.py\",\n", "    source_dir=source_dir,\n", "    inputs=[ProcessingInput(source=s3_test_path, input_name=\"test_data\", destination=\"/opt/ml/processing/input\"),\n", "            ProcessingInput(source=artifacts_dir, input_name=\"model_weight\", destination=\"/opt/ml/processing/weights\")\n", "    ],\n", "    outputs=[\n", "        ProcessingOutput(source=\"/opt/ml/processing/output\", destination=detect_output),\n", "    ],\n", "    arguments=[\"--img\", \"640\", \"--conf\", \"0.25\", \"--source\", \"/opt/ml/processing/input\", \"--weights\", \"/opt/ml/processing/weights/model.tar.gz\", \"--project\", \"/opt/ml/processing/output\"],\n", "    job_name=job_name,\n", "    experiment_config={\n", "        'TrialName': job_name,\n", "        'TrialComponentDisplayName': job_name,\n", "    },\n", "    wait=False\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "detect_processor.latest_job.wait()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "detect_output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm -rf detect_output\n", "!mkdir detect_output\n", "!aws s3 sync {detect_output}/exp detect_output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "detect_result = []\n", "for img_path in glob.glob('./detect_output/*'):\n", "    if img_path.split('.')[-1] in ['jpg', 'png']:\n", "        detect_result.append(mpimg.imread(img_path))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ipyplot.plot_images(detect_result, max_images=2, img_width=250)" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Create a Code Repository and Push the Code" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add the IAMFullAccess policy to the IAM role used by this notebook before running the steps below."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "iam_client = boto3.client('iam')\n", "\n", "role = get_execution_role()\n", "base_role_name = role.split('/')[-1]\n", "\n", "iam_client.attach_role_policy(\n", "    RoleName=base_role_name,\n", "    PolicyArn='arn:aws:iam::aws:policy/AWSCodeCommitFullAccess'\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "codecommit = boto3.client('codecommit')\n", "repository_name = 'model_code'\n", "\n", "try:\n", "    response = codecommit.create_repository(\n", "        repositoryName=repository_name,\n", "        repositoryDescription='Data Scientists share their training code using this Repository'\n", "    )\n", "except Exception:\n", "    print(\"Repository already exists\")\n", "    response = codecommit.get_repository(\n", "        repositoryName=repository_name\n", "    )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "codecommit_repo = response['repositoryMetadata']['cloneUrlHttp']\n", "codecommit_repo" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!sudo rm -rf ./yolov5/.git/" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!git init\n", "!git remote add repo_codecommit $codecommit_repo\n", "!git checkout -b main\n", "!git add yolov5\n", "!git commit -m \"yolov5-update\"\n", "!git push --set-upstream repo_codecommit main" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Save Values Needed by the Pipeline\n", "The values stored here can be restored in a subsequent notebook with `%store -r`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%store bucket artifacts_dir codecommit_repo" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "instance_type": "ml.c5.large", "kernelspec": { "display_name": "conda_pytorch_p36", "language": "python", "name": "conda_pytorch_p36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.13" } }, "nbformat": 4, "nbformat_minor": 4 }