{ "cells": [ { "cell_type": "markdown", "id": "41e5f0b9", "metadata": {}, "source": [ "# Load Testing using Locust\n", "\n", "---\n", "\n", "모델 배포는 모델 서빙의 첫 단추로 프로덕션 배포 시에 고려할 점들이 많습니다. 예를 들어, 특정 이벤트로 인해 갑자기 동시 접속자가 증가해서 트래픽이 몰릴 수 있죠. SageMaker는 관리형 서비스이니만큼 오토스케일링 policy를 손쉽게 구성할 수 있지만, 비용 최적화 관점에서 최적의 인스턴스 종류와 개수를 정하는 것은 쉽지 않습니다. 따라서, 로드 테스트를 통해 엔드포인트가 처리할 수 있는 RPS(Request Per Second; 동시 초당 접속자)를 파악하는 것이 중요하며, 이를 위해 자체 테스트 툴킷을 개발하거나 오픈소스 툴킷을 사용합니다. (또한, re:Invent 2021에 소개된 신규 서비스인 SageMaker Inference Recommender를 사용하여 로드 테스트를 API 호출로 편리하게 수행할 수 있습니다.)\n", "\n", "본 노트북에서는 Locust (https://docs.locust.io/en/stable/) 를 사용하여 간단한 로드 테스트를 수행해 보겠습니다. Locust는 Python으로 테스트 스크립트를 빠르게 작성할 수 있고 파라메터들이 직관적이라 빠르게 로드 테스트 환경을 구축하고 실행할 수 있습니다.\n", "\n", "완료 시간은 **10-20분** 정도 소요됩니다. \n", "\n", "\n", "### 목차\n", "- [1. Create Locust Script](#1.-Create-Locust-Script)\n", "- [2. Load Testing](#2.-Load-Testing)" ] }, { "cell_type": "markdown", "id": "0779b632", "metadata": {}, "source": [ "

주의

\n", "아래 코드 셀은 ngrok 토큰을 설정하고, 주피터 노트북 커널을 셧다운시킵니다. https://ngrok.com/ 에서 회원 가입 후, 토큰을 설정해 주시기 바랍니다.\n", " \n", "노트북 커널이 셧다운된다면, 아래 코드 셀에서 setup_needed = False로 변경 후, 코드 셀을 다시 실행해 주세요. 이 작업은 한 번만 수행하면 됩니다. \n", "

" ] }, { "cell_type": "markdown", "id": "92cc9f7b", "metadata": {}, "source": [ "회원 가입 후, 로그인하면 아래와 같은 화면이 출력됩니다. **2. Connect your account** 에서 `ngrok authtoken [YOUR-TOKEN]`의 `[YOUR-TOKEN]`을 아래 코드 셀로 복사하세요.\n", "![ngrok_1](img/ngrok_1.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "4a0be349", "metadata": {}, "outputs": [], "source": [ "import sys, IPython\n", "from pyngrok import ngrok\n", "\n", "setup_needed = True\n", "#setup_needed = False\n", "\n", "if setup_needed:\n", " print(\"===> Setting the authtoken. Please change 'setup_needed = False' and run this code cell again.\")\n", " ngrok.set_auth_token(\"[YOUR-TOKEN]\")\n", " IPython.Application.instance().kernel.do_shutdown(True)" ] }, { "cell_type": "code", "execution_count": null, "id": "9aa02e82", "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "%store -r\n", "%store" ] }, { "cell_type": "code", "execution_count": null, "id": "824624af", "metadata": {}, "outputs": [], "source": [ "try:\n", " endpoint_name\n", " s3_path\n", " test_df\n", " print(\"[OK] You can proceed.\")\n", "except NameError:\n", " print(\"+\"*60)\n", " print(\"[ERROR] Please run '1_deploy.ipynb' before you continue.\")\n", " print(\"+\"*60)" ] }, { "cell_type": "markdown", "id": "0c1533bc", "metadata": {}, "source": [ "
\n", "\n", "# 1. Create Locust Script\n", "---\n", "\n", "아래 코드 셀은 Locust 기반 로드 테스트에 필요한 스크립트를 저장합니다. \n", "- `config.json`: 로드 테스트에서 사용할 설정값들을 저장합니다.\n", "- `stress.py`: 로드 테스트 시 각 사용자의 객체를 생성하는 스크립트로, `HttpUser` 클래스를 상속받습니다. 이 클래스는 각 사용자에게 client 속성을 부여합니다. " ] }, { "cell_type": "code", "execution_count": null, "id": "e26614a5", "metadata": {}, "outputs": [], "source": [ "%%writefile config.json\n", "{\n", " \"contentType\": \"text/csv\",\n", " \"showEndpointResponse\": 0,\n", " \"dataFile\": \"../data/dataset/test.csv\",\n", " \"numTestSamples\": 100\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "a0a46c6c", "metadata": {}, "outputs": [], "source": [ "%%writefile stress.py\n", "import os\n", "import json\n", "import time\n", "import boto3\n", "import io\n", "from io import StringIO\n", "import pandas as pd\n", "from locust import HttpUser, task, events, between\n", "\n", "\n", "class SageMakerConfig:\n", "\n", " def __init__(self):\n", " self.__config__ = None\n", "\n", " @property\n", " def data_file(self):\n", " return self.config[\"dataFile\"]\n", "\n", " @property\n", " def content_type(self):\n", " return self.config[\"contentType\"]\n", "\n", " @property\n", " def show_endpoint_response(self):\n", " return self.config[\"showEndpointResponse\"]\n", " \n", " @property\n", " def num_test_samples(self):\n", " return self.config[\"numTestSamples\"]\n", "\n", " @property\n", " def config(self):\n", " self.__config__ = self.__config__ or self.load_config()\n", " return self.__config__\n", "\n", " def load_config(self):\n", " config_file = os.path.join(os.path.dirname(os.path.realpath(__file__)), \"config.json\")\n", " with open(config_file, \"r\") as c:\n", " return json.loads(c.read())\n", "\n", "\n", " \n", "class SageMakerEndpointTestSet(HttpUser):\n", " wait_time = between(5, 15)\n", " \n", " def __init__(self, parent):\n", " super().__init__(parent)\n", " self.config = SageMakerConfig()\n", " \n", " \n", " def on_start(self):\n", " data_file_full_path = os.path.join(os.path.dirname(__file__), self.config.data_file)\n", " \n", " test_df = pd.read_csv(data_file_full_path)\n", " y_test = test_df.iloc[:, 0].astype('int')\n", " test_df = test_df.drop('fraud', axis=1)\n", "\n", " csv_file = io.StringIO()\n", " test_df[0:self.config.num_test_samples].to_csv(csv_file, sep=\",\", header=False, index=False)\n", " self.payload = csv_file.getvalue()\n", " \n", "\n", " @task\n", " def test_invoke(self):\n", " response = self._locust_wrapper(self._invoke_endpoint, self.payload)\n", " if self.config.show_endpoint_response:\n", " print(response[\"Body\"].read())\n", "\n", " \n", " def _invoke_endpoint(self, payload):\n", " region = self.client.base_url.split(\"://\")[1].split(\".\")[2]\n", " endpoint_name = self.client.base_url.split(\"/\")[-2]\n", " runtime_client = boto3.client('sagemaker-runtime', region_name=region)\n", "\n", " response = runtime_client.invoke_endpoint(\n", " EndpointName=endpoint_name,\n", " Body=payload,\n", " ContentType=self.config.content_type\n", " )\n", "\n", " return response\n", " \n", "\n", " def _locust_wrapper(self, func, *args, **kwargs):\n", " \"\"\"\n", " Locust wrapper so that the func fires the sucess and failure events for custom boto3 client\n", " :param func: The function to invoke\n", " :param args: args to use\n", " :param kwargs:\n", " :return:\n", " \"\"\"\n", " start_time = time.time()\n", " try:\n", " result = func(*args, **kwargs)\n", " total_time = int((time.time() - start_time) * 1000)\n", " events.request.fire(request_type=\"boto3\", \n", " name=\"invoke_endpoint\", \n", " response_time=total_time, \n", " response_length=0)\n", "\n", " return result\n", " except Exception as e:\n", " total_time = int((time.time() - start_time) * 1000)\n", " events.request.fire(request_type=\"boto3\", \n", " name=\"invoke_endpoint\", \n", " response_time=total_time, \n", " response_length=0,\n", " exception=e)\n", "\n", " raise e" ] }, { "cell_type": "markdown", "id": "86e785ab", "metadata": {}, "source": [ "
\n", "\n", "# 2. Load Testing\n", "---\n", "\n", "로드 테스트는 아래 파라메터들의 설정만으로 로드 테스트를 편리하게 수행할 수 있습니다.\n", "\n", "- `num_users`: 어플리케이션을 테스트하는 총 사용자 수입니다. \n", "- `spawn_rate`: 초당 몇 명씩 사용자를 늘릴 것인지 정합니다. 이 때, on_start 함수가 정의되어 있다면 이 함수를 같이 호출합니다.\n", "\n", "예를 들어 `num_users=100, spawn_rate=10` 일 때는 초당 10명의 사용자가 추가되며, 10초 후에는 100명의 사용자로 늘어납니다. 이 사용자 수에 도달하면 통계치가 재설정되니다." ] }, { "cell_type": "code", "execution_count": null, "id": "198ade15", "metadata": {}, "outputs": [], "source": [ "import boto3\n", "region = boto3.Session().region_name\n", "\n", "num_users = 100\n", "spawn_rate = 10\n", "endpoint_url = f'https://runtime.sagemaker.{region}.amazonaws.com/endpoints/{endpoint_name}/invocations'" ] }, { "cell_type": "markdown", "id": "74239f8e", "metadata": {}, "source": [ "### Running a locustfile\n", "\n", "주피터 노트북 상에서의 실습을 위해 nohup으로 백그라운드에서 locust를 시작합니다. Locust는 기본적으로 8089 포트를 사용합니다. (http://localahost:8089)" ] }, { "cell_type": "code", "execution_count": null, "id": "786aaf67", "metadata": {}, "outputs": [], "source": [ "# %%bash -s \"$num_users\" \"$spawn_rate\" \"$endpoint_url\"\n", "\n", "# echo locust -f stress.py -u $1 -r $2 -H $3" ] }, { "cell_type": "code", "execution_count": null, "id": "60e3259d", "metadata": {}, "outputs": [], "source": [ "%%bash -s \"$num_users\" \"$spawn_rate\" \"$endpoint_url\"\n", "\n", "nohup locust -f stress.py -u $1 -r $2 -H $3 >/dev/null 2>&1 &" ] }, { "cell_type": "markdown", "id": "6566ef39", "metadata": {}, "source": [ "### Secure tunnels to localhost using ngrok\n", "\n", "ngrok를 사용해 외부에서 로컬호스트로 접속할 수 있습니다. pyngrok는 Python wrapper로 API 호출로 ngrok을 더 편리하게 사용할 수 있습니다.\n", "\n", "- ngrok: https://ngrok.com/\n", "- pyngrok: https://pyngrok.readthedocs.io/en/latest" ] }, { "cell_type": "code", "execution_count": null, "id": "efcf8ddd", "metadata": {}, "outputs": [], "source": [ "http_tunnel = ngrok.connect(8089, bind_tls=True)\n", "http_url = http_tunnel.public_url" ] }, { "cell_type": "markdown", "id": "b2a8ddd5", "metadata": {}, "source": [ "아래 코드 셀 실행 시 출력되는 URL을 클릭 후, `Start swarming` 버튼을 클릭해 주세요." ] }, { "cell_type": "code", "execution_count": null, "id": "ae9e8b7e", "metadata": {}, "outputs": [], "source": [ "from IPython.core.display import display, HTML\n", "display(HTML(f'Load test: {http_url}'))" ] }, { "cell_type": "markdown", "id": "31a9513a", "metadata": {}, "source": [ "![locust_1](img/locust_1.png)\n", "![locust_2](img/locust_2.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "d6713c53", "metadata": {}, "outputs": [], "source": [ "tunnels = ngrok.get_tunnels()\n", "print(tunnels)" ] }, { "cell_type": "markdown", "id": "f8396ae3", "metadata": {}, "source": [ "### CloudWatch Monitoring\n", "아래 코드 셀에서 출력되는 링크를 클릭해면 CloudWatch 대시보드로 이동합니다." ] }, { "cell_type": "code", "execution_count": null, "id": "69f3e10e", "metadata": {}, "outputs": [], "source": [ "cw_url = f\"https://console.aws.amazon.com/cloudwatch/home?region={region}#metricsV2:graph=~(metrics~(~(~'AWS*2fSageMaker~'InvocationsPerInstance~'EndpointName~'{endpoint_name}~'VariantName~'AllTraffic))~view~'timeSeries~stacked~false~region~'{region}~start~'-PT15M~end~'P0D~stat~'SampleCount~period~60);query=~'*7bAWS*2fSageMaker*2cEndpointName*2cVariantName*7d*20{endpoint_name}\"\n", "display(HTML(f'Cloudwatch Monitoring'))" ] }, { "cell_type": "markdown", "id": "2175d394", "metadata": {}, "source": [ "### Stop Locust and Disconnect ngrok" ] }, { "cell_type": "code", "execution_count": null, "id": "7ac9b47d", "metadata": {}, "outputs": [], "source": [ "!pkill -9 -ef locust\n", "ngrok.disconnect(http_url)" ] }, { "cell_type": "markdown", "id": "979c2580", "metadata": {}, "source": [ "### (Optional) More testing\n", "\n", "위 섹션에서 `num_users, spawn_rate`를 변경해서 테스트해 보세요. (예: `num_users=1000, spawn_rate=20`) RPS가 일정 이상이면 Failures 수치가 올라가는 것을 확인할 수 있습니다." ] }, { "cell_type": "markdown", "id": "7fe1fb50", "metadata": {}, "source": [ "
\n", "\n", "# 3. Endpoint Clean-up\n", "---\n", "\n", "과금 방지를 위해 엔드포인트를 삭제합니다." ] }, { "cell_type": "code", "execution_count": null, "id": "e2640d28", "metadata": {}, "outputs": [], "source": [ "import sagemaker\n", "from sagemaker.predictor import Predictor\n", "from sagemaker.serializers import CSVSerializer\n", "\n", "sess = sagemaker.Session()\n", "xgb_predictor = Predictor(\n", " endpoint_name=endpoint_name, \n", " sagemaker_session=sess,\n", " serializer=CSVSerializer()\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "4226fdde", "metadata": {}, "outputs": [], "source": [ "xgb_predictor.delete_endpoint()" ] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" } }, "nbformat": 4, "nbformat_minor": 5 }