{ "cells": [ { "cell_type": "markdown", "id": "09065721", "metadata": {}, "source": [ "# SageMaker Endpoint (Single Model Endpoint)\n", "---\n", "\n", "You are now ready to deploy your model to a **SageMaker endpoint, the SageMaker model hosting service**. \n", "\n", "A SageMaker endpoint is a fully managed service for real-time inference over a REST API. The infrastructure for high availability on distributed containers, multi-model loading, and A/B testing (EC2, load balancer, auto scaling, model artifact loading, and so on) is already in place, so an endpoint is created with just a few lines of code and you can deploy a model to production quickly.\n", "\n", "With the SageMaker built-in XGBoost algorithm, you can train a model and deploy it to an endpoint without writing separate training or inference scripts. However, when you need custom inference logic (for example, returning feature contributions for SHAP computation, or returning both the prediction and its score), you can use the SageMaker XGBoost container instead of the built-in XGBoost algorithm.\n", "\n", "This notebook shows how to deploy a model with a basic inference script on the SageMaker XGBoost container, following the table of contents below. \n", "\n", "It takes about **20-30 minutes** to complete.\n", "\n", "### Table of Contents\n", "- [1. Create Model Serving Script](#1.-Create-Model-Serving-Script)\n", "- [2. Deploy a trained model from Amazon S3](#2.-Deploy-a-trained-model-from-Amazon-S3)\n", " - [2.1. Deploy to Local Environment: XGBoostModel class](#2.1.-Deploy-to-Local-Environment:-XGBoostModel-class)\n", " - [2.2. Deploy to Local Environment: Model class](#2.2.-Deploy-to-Local-Environment:-Model-class)\n", " - [2.3. Deploy to Hosting Instance](#2.3.-Deploy-to-Hosting-Instance)\n", "\n", "If you would like to practice with a similar example, see the sample notebook linked below.\n", "- https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone_dist_script_mode.ipynb" ] }, { "cell_type": "markdown", "id": "2686e584", "metadata": {}, "source": [ "

Caution

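\n", "The code cell below installs the libraries needed for this hands-on and shuts down the Jupyter notebook kernel. \n", " \n", "Once the notebook kernel has shut down, change the code cell below to install_needed = False and run the code cell again. You only need to do this once. \n", "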

" ] }, { "cell_type": "code", "execution_count": null, "id": "b88f96bb", "metadata": {}, "outputs": [], "source": [ "import sys, IPython\n", "\n", "install_needed = True\n", "#install_needed = False\n", "\n", "if install_needed:\n", " print(\"===> Installing deps and restarting kernel. Please change 'install_needed = False' and run this code cell again.\")\n", " !{sys.executable} -m pip install -U sagemaker locust pyngrok xgboost==1.3.1\n", " IPython.Application.instance().kernel.do_shutdown(True)\n", "else:\n", " import sagemaker, xgboost\n", " print(f'SageMaker API version={sagemaker.__version__}, XGBoost version={xgboost.__version__}')" ] }, { "cell_type": "code", "execution_count": null, "id": "f0e3e2d6", "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "%store -r\n", "XGB_FRAMEWORK_VERSION = '1.3-1'\n", "DATASET_PATH = '../data/dataset'" ] }, { "cell_type": "markdown", "id": "f10bfaf7", "metadata": {}, "source": [ "
\n", "\n", "# 0. (Optional) Prepare Your Model\n", "---\n", "\n", "If you have not finished the previous steps, uncomment the code cell below and run it." ] }, { "cell_type": "code", "execution_count": null, "id": "2ad63102", "metadata": {}, "outputs": [], "source": [ "# import pandas as pd\n", "# import xgboost as xgb\n", "# data = pd.read_csv(f'{DATASET_PATH}/train.csv')\n", "# train = data.drop('fraud', axis=1)\n", "# label = pd.DataFrame(data['fraud'])\n", "# dtrain = xgb.DMatrix(train, label=label)\n", "\n", "# params = {'max_depth': 3, 'eta': 0.2, 'objective': \"binary:logistic\", 'scale_pos_weight': 29}\n", "# num_boost_round = 100\n", "# nfold = 5\n", "# early_stopping_rounds = 10\n", "\n", "# cv_results = xgb.cv(\n", "# params = params,\n", "# dtrain = dtrain,\n", "# num_boost_round = num_boost_round,\n", "# nfold = nfold,\n", "# early_stopping_rounds = early_stopping_rounds,\n", "# metrics = ('auc'),\n", "# stratified = True, # split the train/validation sets according to the label (0,1) distribution\n", "# seed = 0\n", "# )\n", "\n", "# print(\"cv_results: \", cv_results)\n", "\n", "# # Select the best score\n", "# print(f\"[0]#011train-auc:{cv_results.iloc[-1]['train-auc-mean']}\")\n", "# print(f\"[1]#011validation-auc:{cv_results.iloc[-1]['test-auc-mean']}\")\n", "\n", "# metrics_data = {\n", "# 'classification_metrics': {\n", "# 'validation:auc': { 'value': cv_results.iloc[-1]['test-auc-mean']},\n", "# 'train:auc': {'value': cv_results.iloc[-1]['train-auc-mean']}\n", "# }\n", "# }\n", " \n", "# model = xgb.train(params=params, dtrain=dtrain, num_boost_round=len(cv_results))\n", "# model.save_model(\"xgboost-model\")\n", "# !tar -czvf model.tar.gz xgboost-model && rm xgboost-model " ] }, { "cell_type": "markdown", "id": "a676f6fb", "metadata": {}, "source": [ "`Session()` provides a convenient way to manage the connection information for the AWS environment and the resources used by SageMaker. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "b820e0cc", "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import sagemaker\n", "import json\n", "\n", "sess = sagemaker.Session()\n", "sm_session = sagemaker.session.Session()\n", "boto_session = boto3.session.Session()\n", "sm_client = boto3.client(\"sagemaker\")\n", "bucket = sm_session.default_bucket()\n", "role = sagemaker.get_execution_role()\n", "region = boto_session.region_name\n", "\n", "print((bucket, role))" ] }, { "cell_type": "markdown", "id": "04c68ee3", "metadata": {}, "source": [ "
\n", "\n", "# 1. Create Model Serving Script\n", "\n", "---\n", "\n", "The code cell below saves the SageMaker inference script to the src directory.\n", "\n", "#### Option 1.\n", "- `model_fn(model_dir)`: Loads the model artifact (downloaded from S3) stored in `model_dir`.\n", "- `input_fn(request_body, content_type)`: Preprocesses the input data. `content_type` can be handled differently depending on the type of input data (for example, `application/x-npy`, `application/json`, `application/csv`).\n", "- `predict_fn(input_object, model)`: Runs inference on the data returned by `input_fn(...)`.\n", "- `output_fn(prediction, accept_type)`: Post-processes the inference result from `predict_fn(...)` and sends it to the frontend.\n", "\n", "#### Option 2.\n", "- `model_fn(model_dir)`: Loads the model artifact (downloaded from S3) stored in `model_dir`.\n", "- `transform_fn(model, request_body, content_type, accept_type)`: `input_fn(...)`, `predict_fn(...)`, and `output_fn(...)` can be combined into a single `transform_fn(...)`." ] }, { "cell_type": "code", "execution_count": null, "id": "c2c3e1eb", "metadata": {}, "outputs": [], "source": [ "%%writefile src/inference.py\n", "import os\n", "import time\n", "import json\n", "import pickle as pkl\n", "import numpy as np\n", "from io import BytesIO\n", "import xgboost as xgb\n", "import sagemaker_xgboost_container.encoder as xgb_encoders\n", "NUM_FEATURES = 58\n", "\n", "\n", "def model_fn(model_dir):\n", "    \"\"\"\n", "    Deserialize and return fitted model.\n", "    \"\"\"\n", "    model_file = \"xgboost-model\"\n", "    model = xgb.Booster()\n", "    model.load_model(os.path.join(model_dir, model_file))\n", "    return model\n", " \n", "\n", "def input_fn(request_body, request_content_type):\n", "    \"\"\"\n", "    The SageMaker XGBoost model server receives the request data body and the content type,\n", "    and invokes the `input_fn`.\n", "    Return a DMatrix (an object that can be passed to predict_fn).\n", "    \"\"\"\n", "    print(\"Content type: \", request_content_type)\n", "    if request_content_type == \"application/x-npy\":\n", "        stream = BytesIO(request_body)\n", "        array = np.frombuffer(stream.getvalue())\n", "        array = array.reshape(int(len(array)/NUM_FEATURES), 
NUM_FEATURES)\n", "        return xgb.DMatrix(array)\n", "    elif request_content_type == \"text/csv\":\n", "        return xgb_encoders.csv_to_dmatrix(request_body.rstrip(\"\\n\"))\n", "    elif request_content_type == \"text/libsvm\":\n", "        return xgb_encoders.libsvm_to_dmatrix(request_body)\n", "    else:\n", "        raise ValueError(\n", "            \"Content type {} is not supported.\".format(request_content_type)\n", "        )\n", " \n", "\n", "def predict_fn(input_data, model):\n", "    \"\"\"\n", "    SageMaker XGBoost model server invokes `predict_fn` on the return value of `input_fn`.\n", "\n", "    Return a two-dimensional NumPy array (predictions and scores)\n", "    \"\"\"\n", "    start_time = time.time()\n", "    y_probs = model.predict(input_data)\n", "    print(\"--- Inference time: %s secs ---\" % (time.time() - start_time))\n", "    y_preds = [1 if e >= 0.5 else 0 for e in y_probs]\n", "    #feature_contribs = model.predict(input_data, pred_contribs=True, validate_features=False)\n", "    return np.vstack((y_preds, y_probs))\n", "\n", "\n", "def output_fn(predictions, content_type=\"application/json\"):\n", "    \"\"\"\n", "    After invoking predict_fn, the model server invokes `output_fn`.\n", "    \"\"\"\n", "    if content_type == \"text/csv\":\n", "        # The original referenced an undefined name here; return the predicted labels (row 0) as one CSV line\n", "        return ','.join(str(int(x)) for x in predictions[0,:])\n", "    elif content_type == \"application/json\":\n", "        outputs = json.dumps({\n", "            'pred': predictions[0,:].tolist(),\n", "            'prob': predictions[1,:].tolist()\n", "        })\n", " \n", "        return outputs\n", "    else:\n", "        raise ValueError(\"Content type {} is not supported.\".format(content_type))" ] }, { "cell_type": "markdown", "id": "a26dcf78", "metadata": {}, "source": [ "### Check Inference Results & Debugging\n", "\n", "Before deploying to a local or hosted endpoint, run inference directly in the local environment and check the results. 
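
As a quick sanity check that needs no container, the thresholding done in `predict_fn` and the JSON layout produced by `output_fn` can be mimicked in plain Python. This is only a sketch: `format_response` and its `threshold` parameter are hypothetical names introduced here, and the 0.5 cut-off simply mirrors the inference script.

```python
import json


def format_response(probs, threshold=0.5):
    # Turn each probability into a 0/1 label, as predict_fn does.
    preds = [1 if p >= threshold else 0 for p in probs]
    # Same keys as the JSON branch of output_fn: 'pred' and 'prob'.
    return json.dumps({'pred': preds, 'prob': [float(p) for p in probs]})


# Parse the payload the way a client would after calling the endpoint.
response = json.loads(format_response([0.1, 0.5, 0.9]))
print(response['pred'], response['prob'])
```

Comparing such a payload with what the deployed endpoint returns for the same rows is one way to spot a mismatch in the serving script early.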
" ] }, { "cell_type": "code", "execution_count": null, "id": "6a58c6bc", "metadata": {}, "outputs": [], "source": [ "!rm -rf model && mkdir model && tar -xzvf model.tar.gz -C model" ] }, { "cell_type": "code", "execution_count": null, "id": "e3010604", "metadata": {}, "outputs": [], "source": [ "import xgboost as xgb\n", "import numpy as np\n", "import pandas as pd\n", "\n", "model = xgb.Booster()\n", "model.load_model(\"model/xgboost-model\")\n", "\n", "test_df = pd.read_csv(f'{DATASET_PATH}/test.csv')\n", "y_test = test_df.iloc[:, 0].astype('int')\n", "test_df = test_df.drop('fraud', axis=1)\n", "dtest = xgb.DMatrix(test_df)" ] }, { "cell_type": "code", "execution_count": null, "id": "7407029b", "metadata": {}, "outputs": [], "source": [ "y_prob = model.predict(dtest)\n", "y_pred = np.array([1 if e >= 0.5 else 0 for e in y_prob])\n", "y_prob[0], y_pred[0]" ] }, { "cell_type": "markdown", "id": "351c2412", "metadata": {}, "source": [ "
\n", "\n", "# 2. Deploy a trained model from Amazon S3\n", "---\n", "\n", "The `Model` class of the SageMaker API defines the model artifact and Docker image used to serve a trained model. \n", "When instantiating the `Model` class you can pass in the URL of a Docker image pre-built by AWS directly, but if you initialize a child class of Model (for example, `XGBoostModel` or `TensorFlowModel`), you can conveniently define the inference environment just by specifying a version as a parameter." ] }, { "cell_type": "markdown", "id": "c7300ce7", "metadata": {}, "source": [ "### Upload model artifacts to S3\n", "Copy the compressed model artifact to Amazon S3." ] }, { "cell_type": "code", "execution_count": null, "id": "f51977cb", "metadata": {}, "outputs": [], "source": [ "prefix = 'sm-special-webinar/deploy'\n", "s3_path = f's3://{bucket}/{prefix}/model.tar.gz'\n", "!aws s3 cp model.tar.gz {s3_path}" ] }, { "cell_type": "markdown", "id": "1471eefc", "metadata": {}, "source": [ "
\n", "\n", "## 2.1. Deploy to Local Environment: XGBoostModel class\n", "\n", "Before deploying to a SageMaker hosted endpoint, you can deploy to a local mode endpoint. Local mode runs Docker containers in your current development environment to emulate SageMaker processing, training, and inference jobs. For inference, it pulls the deep learning framework based inference container from Amazon ECR (docker pull) and runs the container (docker run) to start the model server.\n", "\n", "https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html\n", "\n", "```python\n", "local_model_path = f'{os.getcwd()}/model'\n", "ecr_uri = xgb_image_uri\n", "\n", "# Start the Docker container\n", "!docker run --name xgb -itd -p 8080:8080 -v {local_model_path}:/opt/ml/model {ecr_uri} serve\n", "\n", "# Test a real-time invocation \n", "!curl -X POST -H 'Content-Type: application/json' localhost:8080/invocations -d ...\n", "\n", "# Stop and remove the Docker container \n", "!docker stop xgb\n", "!docker rm xgb\n", "```\n", "\n", "Note that when you deploy an endpoint with the `deploy(...)` method of the SageMaker SDK, setting the instance type to local or local_gpu performs the steps above automatically.\n", "\n", "```python\n", "# Deploy a local endpoint\n", "local_predictor = local_model.deploy(initial_instance_count=1, instance_type=\"local\")\n", "\n", "# Test a real-time invocation \n", "local_predictor.predict(...)\n", "\n", "# Delete the local endpoint (stops and removes the Docker container)\n", "local_predictor.delete_endpoint()\n", "```" ] }, { "cell_type": "markdown", "id": "34dafaf0", "metadata": {}, "source": [ "As the code below shows, you can easily keep up with continuously updated Python, framework, and Transformers versions. 
See the list of Deep Learning Containers (DLC) managed by AWS at the address below.\n", "- https://github.com/aws/deep-learning-containers/blob/master/available_images.md" ] }, { "cell_type": "markdown", "id": "5d1a1bd9", "metadata": {}, "source": [ "### Create Model" ] }, { "cell_type": "code", "execution_count": null, "id": "3dfcb78f", "metadata": {}, "outputs": [], "source": [ "from sagemaker.xgboost.model import XGBoostModel\n", "from sagemaker.model import Model\n", "from sagemaker.image_uris import retrieve\n", "\n", "xgb_model = XGBoostModel(\n", " model_data=s3_path,\n", " role=role,\n", " entry_point=\"src/inference.py\",\n", " framework_version=XGB_FRAMEWORK_VERSION,\n", ")" ] }, { "cell_type": "markdown", "id": "364bb6a0", "metadata": {}, "source": [ "### Create Endpoint\n", "\n", "When you call the `deploy(...)` method, the SageMaker SDK runs `create-endpoint-config` and `create-endpoint` together. If you need finer-grained parameter control, we recommend using the AWS CLI or the boto3 SDK client." ] }, { "cell_type": "code", "execution_count": null, "id": "9ee2abd0", "metadata": {}, "outputs": [], "source": [ "xgb_predictor = xgb_model.deploy(\n", " initial_instance_count=1,\n", " instance_type='local'\n", ")" ] }, { "cell_type": "markdown", "id": "0bd036cc", "metadata": {}, "source": [ "### Check Docker\n", "\n", "You can confirm that the Docker container for model serving is running." ] }, { "cell_type": "code", "execution_count": null, "id": "1f9fd6cf", "metadata": {}, "outputs": [], "source": [ "!docker ps" ] }, { "cell_type": "markdown", "id": "cf33fa1b", "metadata": {}, "source": [ "### Prediction - SageMaker SDK & text/csv\n", "Run inference on sample data." 
] }, { "cell_type": "code", "execution_count": null, "id": "84a21b3d", "metadata": {}, "outputs": [], "source": [ "from sagemaker.serializers import CSVSerializer, NumpySerializer\n", "from sagemaker.deserializers import JSONDeserializer\n", "xgb_predictor.serializer = CSVSerializer()\n", "xgb_predictor.deserializer = JSONDeserializer() \n", "\n", "outputs = xgb_predictor.predict(test_df.values[0:4,:])\n", "y_test_sample = y_test[0:4].values\n", "y_pred_sample = outputs['pred']; y_prob_sample = outputs['prob']\n", "y_test_sample, y_pred_sample, y_prob_sample" ] }, { "cell_type": "markdown", "id": "9c78f3f4", "metadata": {}, "source": [ "### Prediction - boto3 SDK & application/x-npy\n", "\n", "You can run inference with the `predict(...)` method of the SageMaker SDK as in the code cell above, but this time we will run inference with the `invoke_endpoint(...)` method of boto3.\n", "Boto3 is a service-level, low-level SDK. Unlike the SageMaker SDK, a high-level SDK that abstracts some functionality with a focus on ML experimentation, it gives you full control over the SageMaker API and is well suited to production and automation work.\n", "\n", "[Note] When creating the runtime client instance for calling `invoke_endpoint(...)` in local deployment mode, you must call `sagemaker.local.LocalSagemakerRuntimeClient(...)`." 
] }, { "cell_type": "code", "execution_count": null, "id": "080259f7", "metadata": {}, "outputs": [], "source": [ "import json\n", "runtime_client = sagemaker.local.LocalSagemakerRuntimeClient()\n", "endpoint_name = xgb_model.endpoint_name\n", "\n", "response = runtime_client.invoke_endpoint(\n", " EndpointName=endpoint_name, \n", " ContentType='application/x-npy',\n", " Accept='application/json',\n", " Body=test_df.values[0:4,:].tobytes()\n", ")\n", "\n", "print(json.loads(response['Body'].read().decode()))" ] }, { "cell_type": "markdown", "id": "dc1d180d", "metadata": {}, "source": [ "### Prediction - boto3 SDK & text/csv" ] }, { "cell_type": "code", "execution_count": null, "id": "519f49ef", "metadata": {}, "outputs": [], "source": [ "import io\n", "from io import StringIO\n", "csv_file = io.StringIO()\n", "test_df[0:4].to_csv(csv_file, sep=\",\", header=False, index=False)\n", "payload = csv_file.getvalue()\n", "\n", "response = runtime_client.invoke_endpoint(\n", " EndpointName=endpoint_name, \n", " ContentType='text/csv',\n", " Accept='application/json',\n", " Body=payload\n", ")\n", "\n", "print(json.loads(response['Body'].read().decode()))" ] }, { "cell_type": "markdown", "id": "dadddacf", "metadata": {}, "source": [ "### Local Mode Endpoint Clean-up" ] }, { "cell_type": "code", "execution_count": null, "id": "46ca908e", "metadata": {}, "outputs": [], "source": [ "xgb_predictor.delete_endpoint()\n", "xgb_model.delete_model()" ] }, { "cell_type": "markdown", "id": "69882923", "metadata": {}, "source": [ "
\n", "\n", "## 2.2. Deploy to Local Environment: Model class\n", "\n", "This time, serve the model in the local environment with the `Model` class. This is useful when you customize your inference environment and register it in a private ECR." ] }, { "cell_type": "markdown", "id": "1b18643d", "metadata": {}, "source": [ "### Create Model" ] }, { "cell_type": "code", "execution_count": null, "id": "2b3d696a", "metadata": {}, "outputs": [], "source": [ "# If you need to create a sagemaker.model.Model, rather than sagemaker.xgboost.model.XGBoostModel\n", "image_uri = sagemaker.image_uris.retrieve(\"xgboost\", region, XGB_FRAMEWORK_VERSION)\n", "print(image_uri)\n", "\n", "xgb_model = Model(\n", " image_uri=image_uri,\n", " model_data=s3_path,\n", " role=role,\n", " entry_point=\"src/inference.py\"\n", ")" ] }, { "cell_type": "markdown", "id": "06d41094", "metadata": {}, "source": [ "### Create Endpoint" ] }, { "cell_type": "code", "execution_count": null, "id": "632d087f", "metadata": {}, "outputs": [], "source": [ "xgb_predictor = xgb_model.deploy(\n", " initial_instance_count=1,\n", " instance_type='local'\n", ")" ] }, { "cell_type": "markdown", "id": "9c57e4ae", "metadata": {}, "source": [ "### Create Predictor\n", "\n", "When you create a model with the `Model` class, create a `Predictor` instance and specify the serialization and deserialization formats." ] }, { "cell_type": "code", "execution_count": null, "id": "78ef54b2", "metadata": {}, "outputs": [], "source": [ "from sagemaker.predictor import Predictor\n", "from sagemaker.serializers import CSVSerializer\n", "from sagemaker.local import LocalSession\n", "\n", "endpoint_name = xgb_model.endpoint_name\n", "local_sess = LocalSession()\n", "\n", "xgb_predictor = Predictor(\n", " endpoint_name=endpoint_name, \n", " sagemaker_session=local_sess,\n", " serializer=CSVSerializer()\n", ")" ] }, { "cell_type": "markdown", "id": "f5aece95", "metadata": {}, "source": [ "### Prediction - SageMaker SDK & text/csv\n", "Run inference on sample data." 
] }, { "cell_type": "code", "execution_count": null, "id": "900509c0", "metadata": {}, "outputs": [], "source": [ "import json\n", "outputs = json.loads(xgb_predictor.predict(test_df.values[0:4,:]))\n", "y_test_sample = y_test[0:4].values\n", "y_pred_sample = outputs['pred']; y_prob_sample = outputs['prob']\n", "y_test_sample, y_pred_sample, y_prob_sample" ] }, { "cell_type": "markdown", "id": "d79688a6", "metadata": {}, "source": [ "### Local Mode Endpoint Clean-up" ] }, { "cell_type": "code", "execution_count": null, "id": "6d55647c", "metadata": {}, "outputs": [], "source": [ "xgb_predictor.delete_model()\n", "xgb_predictor.delete_endpoint()" ] }, { "cell_type": "markdown", "id": "017ba349", "metadata": {}, "source": [ "
\n", "\n", "## 2.3. Deploy to Hosting Instance\n", "\n", "Once you have debugged enough in local mode, it is time to deploy to an actual hosting instance. The code is almost identical; note that only `instance_type` differs! " ] }, { "cell_type": "markdown", "id": "49bcdcfe", "metadata": {}, "source": [ "### Create Model" ] }, { "cell_type": "code", "execution_count": null, "id": "1ae35c3d", "metadata": {}, "outputs": [], "source": [ "from sagemaker.xgboost.model import XGBoostModel\n", "\n", "xgb_model = XGBoostModel(\n", " model_data=s3_path,\n", " role=role,\n", " entry_point=\"src/inference.py\",\n", " framework_version=XGB_FRAMEWORK_VERSION,\n", ")" ] }, { "cell_type": "markdown", "id": "c7bd632e", "metadata": {}, "source": [ "### Create Endpoint\n", "\n", "When you call the `deploy(...)` method, the SageMaker SDK runs `create-endpoint-config` and `create-endpoint` together. If you need finer-grained parameter control, we recommend using the AWS CLI or the boto3 SDK client." ] }, { "cell_type": "code", "execution_count": null, "id": "fad942a7", "metadata": {}, "outputs": [], "source": [ "xgb_predictor = xgb_model.deploy(\n", " initial_instance_count=1,\n", " instance_type='ml.m5.xlarge', \n", " wait=False\n", ")" ] }, { "cell_type": "markdown", "id": "8f911c60", "metadata": {}, "source": [ "### Wait for the endpoint jobs to complete\n", "\n", "Wait until the endpoint is created. Provisioning the hosting resources behind the endpoint takes a few minutes. 
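
Waiting for an endpoint amounts to polling its status until it reaches `InService`. The sketch below illustrates that loop in plain Python so it can be dry-run without AWS; `wait_for_status` and all of its parameters are hypothetical names introduced here, not part of the SageMaker SDK, and the default poll interval and timeout are arbitrary assumptions.

```python
import time


def wait_for_status(get_status, target='InService', failed=('Failed',),
                    poll=5, timeout=1800, sleep=time.sleep):
    # Poll get_status() until it returns target; raise on failure or timeout.
    waited = 0
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f'endpoint entered status {status}')
        if waited >= timeout:
            raise TimeoutError(f'still {status} after {waited} seconds')
        sleep(poll)
        waited += poll


# Dry run with canned statuses instead of a real endpoint.
statuses = iter(['Creating', 'Creating', 'InService'])
print(wait_for_status(lambda: next(statuses), sleep=lambda seconds: None))
```

Against a real endpoint, `get_status` could wrap `sm_client.describe_endpoint(EndpointName=endpoint_name)['EndpointStatus']`; in this notebook the SDK call `wait_for_endpoint(...)` does the waiting for you.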
" ] }, { "cell_type": "code", "execution_count": null, "id": "a7d680ad", "metadata": {}, "outputs": [], "source": [ "from IPython.display import display, HTML\n", "def make_endpoint_link(region, endpoint_name, endpoint_task):\n", " endpoint_link = f'{endpoint_task} Review Endpoint' \n", " return endpoint_link \n", " \n", "endpoint_link = make_endpoint_link(region, xgb_predictor.endpoint_name, '[Deploy model from S3]')\n", "display(HTML(endpoint_link))" ] }, { "cell_type": "code", "execution_count": null, "id": "e2ecba5e", "metadata": {}, "outputs": [], "source": [ "sess.wait_for_endpoint(xgb_predictor.endpoint_name, poll=5)" ] }, { "cell_type": "markdown", "id": "2f00697f", "metadata": {}, "source": [ "### Prediction - SageMaker SDK & text/csv\n", "Run inference on sample data." ] }, { "cell_type": "code", "execution_count": null, "id": "94d32365", "metadata": {}, "outputs": [], "source": [ "from sagemaker.serializers import CSVSerializer, NumpySerializer\n", "from sagemaker.deserializers import JSONDeserializer\n", "xgb_predictor.serializer = CSVSerializer()\n", "xgb_predictor.deserializer = JSONDeserializer() \n", "\n", "outputs = xgb_predictor.predict(test_df.values[0:4,:])\n", "y_test_sample = y_test[0:4].values\n", "y_pred_sample = outputs['pred']; y_prob_sample = outputs['prob']\n", "y_test_sample, y_pred_sample, y_prob_sample" ] }, { "cell_type": "markdown", "id": "552f9424", "metadata": {}, "source": [ "### Prediction - boto3 SDK & application/x-npy\n", "\n", "You can run inference with the `predict(...)` method of the SageMaker SDK as in the code cell above, but this time we will run inference with the `invoke_endpoint(...)` method of boto3.\n", "Boto3 is a service-level, low-level SDK. Unlike the SageMaker SDK, a high-level SDK that abstracts some functionality with a focus on ML experimentation, it gives you full control over the SageMaker API and is well suited to production and automation work." 
] }, { "cell_type": "code", "execution_count": null, "id": "3c472196", "metadata": {}, "outputs": [], "source": [ "runtime_client = boto3.client('sagemaker-runtime')\n", "endpoint_name = xgb_model.endpoint_name\n", "\n", "response = runtime_client.invoke_endpoint(\n", " EndpointName=endpoint_name, \n", " ContentType='application/x-npy',\n", " Accept='application/json',\n", " Body=test_df.values[0:4,:].tobytes()\n", ")\n", "\n", "print(json.loads(response['Body'].read().decode()))" ] }, { "cell_type": "markdown", "id": "370dbfcb", "metadata": {}, "source": [ "### Prediction - boto3 SDK & text/csv" ] }, { "cell_type": "code", "execution_count": null, "id": "a6d430cd", "metadata": {}, "outputs": [], "source": [ "import io\n", "from io import StringIO\n", "csv_file = io.StringIO()\n", "test_df[0:4].to_csv(csv_file, sep=\",\", header=False, index=False)\n", "payload = csv_file.getvalue()\n", "\n", "response = runtime_client.invoke_endpoint(\n", " EndpointName=endpoint_name, \n", " ContentType='text/csv',\n", " Accept='application/json',\n", " Body=payload\n", ")\n", "\n", "print(json.loads(response['Body'].read().decode()))" ] }, { "cell_type": "markdown", "id": "b1ada7b8", "metadata": {}, "source": [ "### Evaluation (Not Required)\n", "\n", "Evaluate performance on the test set. Keep in mind, however, that in general a test set does not include ground-truth labels." 
] }, { "cell_type": "code", "execution_count": null, "id": "90988dcd", "metadata": {}, "outputs": [], "source": [ "outputs = xgb_predictor.predict(test_df.values)\n", "y_pred = outputs['pred']; y_prob = outputs['prob']" ] }, { "cell_type": "code", "execution_count": null, "id": "87591ede", "metadata": {}, "outputs": [], "source": [ "from sklearn.metrics import classification_report\n", "from sklearn.metrics import confusion_matrix\n", "from sklearn.metrics import mean_squared_error\n", "print(f\"{classification_report(y_true=y_test, y_pred=y_pred)}\")\n", "\n", "cm = confusion_matrix(y_true=y_test, y_pred=y_pred) \n", "print(cm)\n", "\n", "mse = mean_squared_error(y_test, y_pred)\n", "std = np.std(y_test - y_pred)\n", "report_dict = {\n", " \"regression_metrics\": {\n", " \"mse\": {\n", " \"value\": mse,\n", " \"standard_deviation\": std\n", " },\n", " },\n", "}\n", "report_dict" ] }, { "cell_type": "code", "execution_count": null, "id": "d9824d60", "metadata": {}, "outputs": [], "source": [ "from sklearn.metrics import roc_curve\n", "from sklearn.metrics import roc_auc_score\n", "from matplotlib import pyplot\n", "auc = roc_auc_score(y_test, y_prob)\n", "fpr, tpr, _ = roc_curve(y_test, y_prob)\n", "\n", "pyplot.plot(fpr, tpr, linestyle='--', label='Fraud')\n", "pyplot.xlabel('False Positive Rate')\n", "pyplot.ylabel('True Positive Rate')\n", "pyplot.title(f'AUC={auc:.4f}')\n", "pyplot.legend()\n", "pyplot.show()" ] }, { "cell_type": "markdown", "id": "011e2106", "metadata": {}, "source": [ "Store the variables to reuse in the next module. If you are not continuing to the next module, uncomment and run the code cell in the section below." 
] }, { "cell_type": "code", "execution_count": null, "id": "746aea9b", "metadata": {}, "outputs": [], "source": [ "%store endpoint_name test_df s3_path" ] }, { "cell_type": "markdown", "id": "f84cb564", "metadata": {}, "source": [ "### (Optional) Endpoint Clean-up" ] }, { "cell_type": "code", "execution_count": null, "id": "2a09d334", "metadata": {}, "outputs": [], "source": [ "# xgb_predictor.delete_endpoint()\n", "# xgb_model.delete_model()" ] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.13" } }, "nbformat": 4, "nbformat_minor": 5 }