{ "cells": [ { "cell_type": "markdown", "id": "2fceb094", "metadata": {}, "source": [ "# 1.4 SageMaker Training with Experiments, HPO and Processing" ] }, { "cell_type": "markdown", "id": "b1c7b1e2", "metadata": {}, "source": [ "## Overview of the training job notebook\n", "\n", "- Adding SageMaker Experiments to SageMaker Training lets you compare the results of multiple runs.\n", "    - Import the libraries needed to run the job\n", "    - Define the SageMaker session, role, and bucket\n", "    - Define hyperparameters\n", "    - Define the training job\n", "        - Training script name\n", "        - Training source directory name\n", "        - Framework type and version used by the training code\n", "        - Training instance type and count\n", "        - SageMaker session\n", "        - Hyperparameters for the training job\n", "        - S3 bucket settings for the training job outputs\n", "    - Specify the training dataset\n", "        - S3 URI of the dataset used for training\n", "    - Set up SageMaker Experiments\n", "    - Run training\n", "    - Dataset description\n", "    - View the experiment results\n", "    - Run the evaluation" ] }, { "cell_type": "markdown", "id": "4dc90460", "metadata": {}, "source": [ "### Import the libraries needed to run the job" ] }, { "cell_type": "code", "execution_count": null, "id": "80a85024", "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import sagemaker" ] }, { "cell_type": "markdown", "id": "ba9d596a", "metadata": {}, "source": [ "### Define the SageMaker session, role, and bucket" ] }, { "cell_type": "code", "execution_count": null, "id": "bda98a0a", "metadata": {}, "outputs": [], "source": [ "sagemaker_session = sagemaker.session.Session()\n", "# Use the public attribute instead of the private _region_name\n", "region = sagemaker_session.boto_region_name\n", "role = sagemaker.get_execution_role()" ] }, { "cell_type": "code", "execution_count": null, "id": "8664b776", "metadata": {}, "outputs": [], "source": [ "bucket = sagemaker_session.default_bucket()\n", "code_location = f's3://{bucket}/xgboost/code'\n", "output_path = f's3://{bucket}/xgboost/output'" ] }, { "cell_type": "markdown", "id": "21bd23dc", "metadata": {}, "source": [ "### Define hyperparameters" ] }, { "cell_type": "code", "execution_count": null, "id": "fbfbb81f", "metadata": 
{}, "outputs": [], "source": [ "hyperparameters = {\n", "    \"scale_pos_weight\": 29,\n", "    \"objective\": \"binary:logistic\",\n", "    \"num_round\": 100,\n", "}" ] }, { "cell_type": "markdown", "id": "3d765652", "metadata": {}, "source": [ "### Define the training job" ] }, { "cell_type": "code", "execution_count": null, "id": "17aae4fc", "metadata": {}, "outputs": [], "source": [ "instance_count = 1\n", "instance_type = \"ml.m5.large\"\n", "# instance_type = 'local'\n", "max_run = 1*60*60  # maximum training time in seconds\n", "\n", "use_spot_instances = False\n", "if use_spot_instances:\n", "    max_wait = 1*60*60  # maximum time to wait for Spot capacity, in seconds\n", "else:\n", "    max_wait = None" ] }, { "cell_type": "code", "execution_count": null, "id": "1347adc9-0747-4f68-9946-7864d384d1ec", "metadata": {}, "outputs": [], "source": [ "# image_uri = sagemaker.image_uris.retrieve(\n", "#     \"xgboost\",\n", "#     version=\"1.5-1\",\n", "#     region=region,\n", "#     image_scope='training',\n", "#     # instance_type=instance_type\n", "# )\n", "# image_uri" ] }, { "cell_type": "code", "execution_count": null, "id": "188e81ec-c435-4472-8273-5bf13d0a5bb5", "metadata": {}, "outputs": [], "source": [ "# Use a local session when running in local mode\n", "if instance_type in ['local', 'local_gpu']:\n", "    from sagemaker.local import LocalSession\n", "    sagemaker_session = LocalSession()\n", "    sagemaker_session.config = {'local': {'local_code': True}}\n", "else:\n", "    sagemaker_session = sagemaker.session.Session()" ] }, { "cell_type": "code", "execution_count": null, "id": "3e489556", "metadata": {}, "outputs": [], "source": [ "from sagemaker.xgboost.estimator import XGBoost\n", "\n", "# estimator = sagemaker.estimator.Estimator(\n", "estimator = XGBoost(\n", "    entry_point=\"xgboost_starter_script.py\",\n", "    source_dir=\"src\",\n", "    # image_uri=image_uri,\n", "    output_path=output_path,\n", "    code_location=code_location,\n", "    hyperparameters=hyperparameters,\n", "    role=role,\n", "    sagemaker_session=sagemaker_session,\n", "    instance_count=instance_count,\n", "    instance_type=instance_type,\n", "    framework_version=\"1.3-1\",\n", " 
max_run=max_run,\n", "    use_spot_instances=use_spot_instances,  # use Spot Instances\n", "    max_wait=max_wait,\n", ")" ] }, { "cell_type": "markdown", "id": "9eeb2bf2", "metadata": {}, "source": [ "### Specify the training dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "475c82bb", "metadata": {}, "outputs": [], "source": [ "data_path=f's3://{bucket}/xgboost/dataset'\n", "!aws s3 sync ../data/dataset/ $data_path" ] }, { "cell_type": "code", "execution_count": null, "id": "2c5ef86f-5aea-493c-b4df-41faaf2dd686", "metadata": {}, "outputs": [], "source": [ "if instance_type in ['local', 'local_gpu']:\n", "    from pathlib import Path\n", "    file_path = f'file://{Path.cwd()}'\n", "    inputs = file_path.split('lab_1_training')[0] + 'data/dataset/'\n", "\n", "else:\n", "    inputs = data_path\n", "inputs" ] }, { "cell_type": "markdown", "id": "c686dcc8", "metadata": {}, "source": [ "### Set up SageMaker Experiments" ] }, { "cell_type": "code", "execution_count": null, "id": "e68616ec", "metadata": {}, "outputs": [], "source": [ "# !pip install -U sagemaker-experiments" ] }, { "cell_type": "code", "execution_count": null, "id": "7f26f8ec", "metadata": {}, "outputs": [], "source": [ "experiment_name='xgb-1'" ] }, { "cell_type": "code", "execution_count": null, "id": "b9750d63", "metadata": {}, "outputs": [], "source": [ "from smexperiments.experiment import Experiment\n", "from smexperiments.trial import Trial\n", "from time import strftime" ] }, { "cell_type": "code", "execution_count": null, "id": "99736980", "metadata": {}, "outputs": [], "source": [ "def create_experiment(experiment_name):\n", "    # Reuse the experiment if it already exists; otherwise create it.\n", "    try:\n", "        sm_experiment = Experiment.load(experiment_name)\n", "    except Exception:\n", "        sm_experiment = Experiment.create(experiment_name=experiment_name)" ] }, { "cell_type": "code", "execution_count": null, "id": "38ae2ace", "metadata": {}, "outputs": [], "source": [ "def create_trial(experiment_name):\n", " # Note: lowercase %s is a platform-specific (glibc) extension that yields epoch seconds.\n", " create_date = strftime(\"%m%d-%H%M%s\")\n", " sm_trial = 
Trial.create(trial_name=f'{experiment_name}-{create_date}',\n", " experiment_name=experiment_name)\n", "\n", " job_name = f'{sm_trial.trial_name}'\n", " return job_name" ] }, { "cell_type": "markdown", "id": "1913bbb2-8bca-4a42-9e82-0632a8b5b549", "metadata": { "tags": [] }, "source": [ "### Run HPO and training\n", "\n", "You can use SageMaker [Automatic Model Tuning](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html). It uses Bayesian optimization, an approach designed for optimization problems in which each evaluation is expensive. Because HPO jobs usually take a long time, `fit()` on the [HyperparameterTuner](https://sagemaker.readthedocs.io/en/stable/tuner.html) is called with `wait=False` in this notebook so that it returns immediately, and `tuner.wait()` then blocks until the tuning job finishes. The \"Hyperparameter Tuning Jobs\" page in the SageMaker console provides a convenient UI for checking the detailed status and metrics of running jobs." ] }, { "cell_type": "code", "execution_count": null, "id": "3f69dad0-2eb1-4eb4-92a8-3a935ebcf126", "metadata": {}, "outputs": [], "source": [ "max_jobs=4 # TODO: Ideally 12 or more\n", "max_parallel_jobs=2 # TODO: Maybe only 1 for Event Engine, 2-3 if possible" ] }, { "cell_type": "code", "execution_count": null, "id": "c088ce23-2a9f-4277-9c29-c9610b242263", "metadata": {}, "outputs": [], "source": [ "create_experiment(experiment_name)\n", "job_name = create_trial(experiment_name)\n", "\n", "# Keep only the trailing part of the trial name: the HPO job name must be 32 characters or fewer\n", "job_name = job_name[15:]\n", "\n", "tuner = sagemaker.tuner.HyperparameterTuner(\n", "    estimator,\n", "    objective_metric_name=\"validation:auc\",\n", "    hyperparameter_ranges={\n", "        \"max_depth\": sagemaker.tuner.IntegerParameter(2, 5),\n", "        \"eta\": sagemaker.tuner.ContinuousParameter(0.1, 0.5)\n", "    },\n", "    objective_type=\"Maximize\",\n", "    max_jobs=max_jobs, # TODO: Ideally 12 or more\n", "    max_parallel_jobs=max_parallel_jobs, # TODO: Maybe only 1 for Event Engine, 2-3 if possible\n", ")\n", "\n", "tuner.fit(\n", "    job_name=job_name,\n", "    inputs={'inputdata': inputs},\n", "    experiment_config={\n", "        'TrialName': job_name,\n", "        'TrialComponentDisplayName': 
job_name,\n", "    },\n", "    wait=False\n", ")\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ee680f4b-f9e4-4f93-b30a-db6642007ad3", "metadata": {}, "outputs": [], "source": [ "tuner.wait()" ] }, { "cell_type": "markdown", "id": "620b2585", "metadata": {}, "source": [ "### View the experiment results\n", "Review the results of the runs above.\n", "- For each training job, you can check the training data used, the input hyperparameters, the evaluation metrics, and the location of the model artifacts.\n", "- **Everything below can also be viewed intuitively in SageMaker Studio.**" ] }, { "cell_type": "code", "execution_count": null, "id": "b121c5ca", "metadata": {}, "outputs": [], "source": [ "from sagemaker.analytics import ExperimentAnalytics, HyperparameterTuningJobAnalytics\n", "import pandas as pd\n", "pd.options.display.max_columns = 50\n", "pd.options.display.max_rows = 10\n", "pd.options.display.max_colwidth = 100" ] }, { "cell_type": "code", "execution_count": null, "id": "80080207", "metadata": {}, "outputs": [], "source": [ "trial_component_training_analytics = HyperparameterTuningJobAnalytics(\n", "    sagemaker_session=sagemaker_session,\n", "    hyperparameter_tuning_job_name=job_name\n", ")\n", "\n", "# One row per training job launched by the tuning job\n", "trial_component_training_analytics.dataframe()[['TrainingJobName', 'TrainingJobStatus',\n", "                                                'eta', 'max_depth', 'FinalObjectiveValue']]" ] }, { "cell_type": "markdown", "id": "1a23cfc2", "metadata": { "tags": [] }, "source": [ "### Run the evaluation\n", "You can run the evaluation code with SageMaker Processing. In an MLOps workflow, Processing lets you run steps such as preprocessing and evaluation as managed jobs, without keeping your own servers running." 
] }, { "cell_type": "code", "execution_count": null, "id": "6671a6c5", "metadata": {}, "outputs": [], "source": [ "from sagemaker.processing import FrameworkProcessor\n", "from sagemaker.processing import ProcessingInput, ProcessingOutput" ] }, { "cell_type": "code", "execution_count": null, "id": "a18f8d6c", "metadata": {}, "outputs": [], "source": [ "instance_count = 1\n", "instance_type = \"ml.m5.large\"\n", "# instance_type = 'local'" ] }, { "cell_type": "code", "execution_count": null, "id": "31647257", "metadata": {}, "outputs": [], "source": [ "script_eval = FrameworkProcessor(\n", "    XGBoost,\n", "    framework_version=\"1.3-1\",\n", "    role=role,\n", "    instance_type=instance_type,\n", "    instance_count=instance_count\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "b3f2ad2a-8b86-4a9e-82ff-67a1fe8cbdb7", "metadata": {}, "outputs": [], "source": [ "client = boto3.client('sagemaker')\n", "response = client.describe_training_job(\n", "    TrainingJobName=tuner.best_training_job()\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "3ec78e8c-c257-477e-b4d4-3e987c0630f5", "metadata": {}, "outputs": [], "source": [ "artifacts_dir = response['ModelArtifacts']['S3ModelArtifacts']\n", "artifacts_dir" ] }, { "cell_type": "code", "execution_count": null, "id": "f9829f86", "metadata": {}, "outputs": [], "source": [ "s3_test_path = data_path + '/test.csv'\n", "detect_outputpath = f's3://{bucket}/xgboost/processing'" ] }, { "cell_type": "code", "execution_count": null, "id": "ef915fe1", "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "source_dir = f'{Path.cwd()}/src'\n", "\n", "if instance_type == 'local':\n", " from sagemaker.local import LocalSession\n", "\n", " sagemaker_session = LocalSession()\n", " sagemaker_session.config = {'local': {'local_code': True}}\n", "\n", " s3_test_path='../data/dataset/test.csv'\n", "else:\n", " sagemaker_session = sagemaker.session.Session()\n", " 
s3_test_path=data_path + '/test.csv' " ] }, { "cell_type": "code", "execution_count": null, "id": "d54892a4", "metadata": {}, "outputs": [], "source": [ "create_experiment(experiment_name)\n", "job_name = create_trial(experiment_name)\n", "\n", "script_eval.run(\n", "    code=\"evaluation.py\",\n", "    source_dir=source_dir,\n", "    inputs=[\n", "        ProcessingInput(source=s3_test_path, input_name=\"test_data\", destination=\"/opt/ml/processing/test\"),\n", "        ProcessingInput(source=artifacts_dir, input_name=\"model_weight\", destination=\"/opt/ml/processing/model\")\n", "    ],\n", "    outputs=[\n", "        ProcessingOutput(source=\"/opt/ml/processing/output\", output_name='evaluation', destination=detect_outputpath + \"/\" + job_name),\n", "    ],\n", "    job_name=job_name,\n", "    experiment_config={\n", "        'TrialName': job_name,\n", "        'TrialComponentDisplayName': job_name,\n", "    },\n", "    wait=False\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "3b4e2101", "metadata": {}, "outputs": [], "source": [ "script_eval.latest_job.wait()" ] }, { "cell_type": "markdown", "id": "1f2b05ad", "metadata": {}, "source": [ "### Check the experiment results" ] }, { "cell_type": "code", "execution_count": null, "id": "60c7b7f3", "metadata": {}, "outputs": [], "source": [ "# artifacts_dir = xgb_estimator.model_data.replace('model.tar.gz', '')\n", "print(artifacts_dir)\n", "!aws s3 ls --human-readable {artifacts_dir}" ] }, { "cell_type": "code", "execution_count": null, "id": "3ffe8e7b", "metadata": {}, "outputs": [], "source": [ "model_dir = './model'\n", "\n", "!rm -rf $model_dir\n", "\n", "import json\n", "import os\n", "\n", "# Recreate the local model directory, then download and unpack the model artifact\n", "os.makedirs(model_dir, exist_ok=True)\n", "\n", "!aws s3 cp {artifacts_dir} {model_dir}/model.tar.gz\n", "!tar -xvzf {model_dir}/model.tar.gz -C {model_dir}" ] }, { "cell_type": "code", "execution_count": null, "id": "63b86672", "metadata": {}, "outputs": [], "source": [ "!pip install xgboost graphviz" ] }, { "cell_type": "code", "execution_count": null, "id": "d82fcc80", 
"metadata": {}, "outputs": [], "source": [ "import xgboost as xgb\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "id": "c000ace4", "metadata": {}, "outputs": [], "source": [ "model = xgb.XGBClassifier()\n", "model.load_model(\"./model/xgboost-model\")" ] }, { "cell_type": "code", "execution_count": null, "id": "18c74b46", "metadata": {}, "outputs": [], "source": [ "test_prep_df = pd.read_csv('../data/dataset/test.csv')\n", "x_test = test_prep_df.drop('fraud', axis=1)\n", "feature_data = xgb.DMatrix(x_test)\n", "model.get_booster().feature_names = feature_data.feature_names\n", "model.get_booster().feature_types = feature_data.feature_types\n", "fig, ax = plt.subplots(figsize=(15, 8))\n", "xgb.plot_importance(model, ax=ax, importance_type='gain')" ] }, { "cell_type": "code", "execution_count": null, "id": "b8b38848", "metadata": {}, "outputs": [], "source": [ "xgb.plot_tree(model, num_trees=0, rankdir='LR')\n", "\n", "fig = plt.gcf()\n", "fig.set_size_inches(50, 15)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "880bc4cf", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "e62a7f6d-d8d6-432f-9f33-b9c0ab141fd9", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 5 }