{ "cells": [ { "cell_type": "markdown", "id": "c7e94f2f", "metadata": {}, "source": [ "# Lab 3: Deploy Hugging Face Transformers in SageMaker Real-time Endpoint\n", "---" ] }, { "cell_type": "markdown", "id": "a49e8444", "metadata": {}, "source": [ "## Introduction\n", "---\n", "\n", "본 모듈에서는 Hugging Face 모델을 리얼타임 엔드포인트로 배포합니다. SageMakers는 사전 빌드된 Hugging Face 추론 컨테이너와 Hugging Face Inference Toolkit을 제공하고 있기 때문에, 기존 SageMaker 엔드포인트 배포와 동일한 방법으로 진행할 수 있습니다. 또한, Hugging Face 전용 기능으로 Hugging Face Hub(https://huggingface.co/models) 에 등록된 모델을 직접 임포트해서 엔드포인트 배포가 가능합니다. 아래의 예제 코드를 참조해 주세요.\n", "\n", "```python\n", "hub = {\n", " 'HF_MODEL_ID': model_id, \n", " 'HF_TASK':'text-classification' \n", "}\n", "\n", "hf_hub_model = HuggingFaceModel(\n", " env=hub,\n", " ...\n", ")\n", "```\n", "\n", "SageMaker Hugging Face Inference Toolkit은 ML 모델을 제공하기 위해 [멀티 모델 서버(MMS; Multi Model Server)](https://github.com/awslabs/multi-model-server)를 사용합니다. SageMaker와 호환되도록 하는 구성 및 설정으로 MMS를 부트스트랩하고 시나리오의 요구 사항에 따라 모델 당 작업자 수(number of workers per model)와 같은 중요한 성능 매개변수를 조정할 수 있습니다.\n", "\n", "보다 다양한 유즈케이스에 대한 예제 코드가 필요하고 핸즈온 및 추론에 필요한 스크립트를 커스터마이징하고 싶다면(BYOS; Bring Your Own Scripts) 아래 URL을 참조하세요.\n", "\n", "- SageMaker Hugging Face Inference Toolkit: https://github.com/aws/sagemaker-huggingface-inference-toolkit\n", "- Amazon SageMaker Deep Learning Inference Hands-on-Lab: https://github.com/aws-samples/sagemaker-inference-samples-kr\n", "\n", "\n", "엔드포인트 생성은 다음의 세 단계로 구성됩니다.\n", "1. **모델(Model) 생성** — SageMaker 배포에 필요한 모델을 생성합니다. 추론 컨테이너 이미지와 모델 아티팩트의 S3 경로를 설정합니다.\n", "2. **엔드포인트 설정(Endpoint Configuration) 생성** — 프로덕션 변형(production variants)에서 하나 이상의 모델 이름과 SageMaker가 각 프로덕션 변형을 호스팅하기 위해 프로비저닝할 추론 호스팅 인스턴스 타입을 지정합니다.\n", "3. **엔드포인트(Endpoint) 생성** — 엔드포인트 설정을 기반으로 엔드포인트를 생성합니다. 
 ] }, { "cell_type": "code", "execution_count": 1, "id": "365b29db", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Stored variables and their in-db values:\n", "local_model_dir -> './model'\n", "model_id -> 'bert-base-multilingual-cased'\n", "s3_model_path -> 's3://sagemaker-us-east-1-143656149352/kornlp-ner-\n", "tokenizer_id -> 'bert-base-multilingual-cased'\n" ] } ], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "%store -r\n", "%store" ] }, { "cell_type": "code", "execution_count": 2, "id": "b3bf0de4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[OK] You can proceed.\n" ] } ], "source": [ "try:\n", "    model_id \n", "    tokenizer_id\n", "    s3_model_path\n", "    local_model_dir\n", "    print(\"[OK] You can proceed.\")\n", "except NameError:\n", "    print(\"+\"*60)\n", "    print(\"[ERROR] Please run the previous hands-on lab before you continue.\")\n", "    print(\"+\"*60)" ] }, { "cell_type": "code", "execution_count": 3, "id": "1147860f", "metadata": {}, "outputs": [], "source": [ "import os\n", "import json\n", "import sys\n", "import logging\n", "import boto3\n", "import sagemaker\n", "import pandas as pd\n", "from sagemaker.huggingface import HuggingFaceModel\n", "from sagemaker import session\n", "from transformers import ElectraConfig\n", "from transformers import (\n", "    ElectraModel, ElectraTokenizer, ElectraForSequenceClassification\n", ")\n", "\n", "logging.basicConfig(\n", "    level=logging.INFO, \n", "    format='[{%(filename)s:%(lineno)d} %(levelname)s - %(message)s',\n", "    handlers=[\n", "        logging.StreamHandler(sys.stdout)\n", "    ]\n", ")\n", "logger = logging.getLogger(__name__)\n", "\n", "\n", "sess = sagemaker.Session()\n", "role = sagemaker.get_execution_role()\n", "region = boto3.Session().region_name" ] }, { "cell_type": "markdown", "id": "8d580b98", "metadata": {}, "source": [ "
\n", "\n", "## 1. Model Serving Preparation\n", "---\n", "\n", "### Create Model Serving Script\n", "\n", "아래 코드 셀은 src 디렉토리에 SageMaker 추론 스크립트를 저장합니다.\n", "\n", "#### Option 1.\n", "- `model_fn(model_dir)`: S3의 `model_dir`에 저장된 모델 아티팩트를 로드합니다.\n", "- `input_fn(request_body, content_type)`: 입력 데이터를 전처리합니다. `content_type`은 입력 데이터 종류에 따라 다양하게 처리 가능합니다. (예: `application/x-npy`, `application/json`, `application/csv`등)\n", "- `predict_fn(input_object, model)`: `input_fn(...)`을 통해 들어온 데이터에 대해 추론을 수행합니다.\n", "- `output_fn(prediction, accept_type)`: `predict_fn(...)`에서 받은 추론 결과를 후처리를 거쳐 프론트엔드로 전송합니다.\n", "\n", "#### Option 2.\n", "- `model_fn(model_dir)`: S3의 model_dir에 저장된 모델 아티팩트를 로드합니다.\n", "- `transform_fn(model, request_body, content_type, accept_type)`: `input_fn(...), predict_fn(...), output_fn(...)`을 `transform_fn(...)`으로 통합할 수 있습니다." ] }, { "cell_type": "code", "execution_count": 4, "id": "c59fc54f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting scripts/inference.py\n" ] } ], "source": [ "%%writefile scripts/inference.py\n", "import os\n", "import sys\n", "import json\n", "import torch\n", "import logging\n", "import numpy as np\n", "from transformers import BertTokenizerFast, BertConfig, BertForTokenClassification, pipeline\n", "os.environ[\"TOKENIZERS_PARALLELISM\"] = \"false\"\n", "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", "\n", "logging.basicConfig(\n", " level=logging.INFO, \n", " format='[{%(filename)s:%(lineno)d} %(levelname)s - %(message)s',\n", " handlers=[\n", " logging.StreamHandler(sys.stdout)\n", " ]\n", ")\n", "logger = logging.getLogger(__name__)\n", "\n", "\n", "def model_fn(model_dir):\n", " tokenizer = BertTokenizerFast.from_pretrained(f'{model_dir}')\n", "\n", " with open(os.path.join(model_dir, 'tag2id.json'), 'r') as f:\n", " tag2id = json.loads(f.read())\n", "\n", " with open(os.path.join(model_dir, 'id2tag.json'), 'r') as f:\n", " id2tag = json.loads(f.read()) \n", "\n", " with open(os.path.join(model_dir, 'tag2entity.json'), 'r') as f:\n", " tag2entity = json.loads(f.read())\n", "\n", " model_file = 'pytorch_model.bin'\n", " model_id = 'bert-base-multilingual-cased'\n", " model = BertForTokenClassification.from_pretrained(model_id, num_labels=len(id2tag))\n", " \n", " tag2id = {k:int(v) for k,v in tag2id.items()} \n", " id2tag = {int(k):v for k,v in id2tag.items()} \n", " \n", " model.config.id2label = id2tag\n", " model.config.label2id = tag2id\n", " model.load_state_dict(torch.load(f'{model_dir}/{model_file}', map_location=torch.device(device)))\n", " model = model.eval()\n", " return (model, tokenizer)\n", "\n", "\n", "def input_fn(input_data, content_type=\"application/jsonlines\"): \n", " \n", " data_str = input_data.decode(\"utf-8\")\n", " jsonlines = data_str.split(\"\\n\")\n", " inputs = []\n", "\n", " for jsonline in jsonlines:\n", " text = json.loads(jsonline)[\"text\"][0]\n", " logger.info(\"input text: {}\".format(text)) \n", " inputs.append(text)\n", " \n", " return inputs\n", "\n", "\n", "def predict_fn(inputs, model_tuple): \n", " model, tokenizer = model_tuple\n", " device_id = -1 if device.type == \"cpu\" else 0\n", " outputs = []\n", " \n", " for example in inputs:\n", " nlp = pipeline(\"ner\", model=model.to(device), device=device_id, \n", " tokenizer=tokenizer, aggregation_strategy='average')\n", " output = nlp(example)\n", " logger.info(\"predicted_results: {}\".format(output))\n", " print(\"predicted_results: {}\".format(output))\n", " \n", " prediction_dict 
= {}\n", " prediction_dict[\"output\"] = output \n", "\n", " outputs.append(output)\n", " \n", " output = outputs[0]\n", " jsonlines = []\n", "\n", " for entity in output:\n", " for k, v in entity.items():\n", " if type(v) == np.float32:\n", " entity[k] = v.item()\n", "\n", " jsonline = json.dumps(entity)\n", " jsonlines.append(jsonline)\n", "\n", " jsonlines_output = '\\n'.join(jsonlines)\n", "\n", " return jsonlines_output\n", "\n", "\n", "def output_fn(outputs, accept=\"application/jsonlines\"):\n", " return outputs, accept" ] }, { "cell_type": "markdown", "id": "3f8fde69", "metadata": {}, "source": [ "### Check Inference Results & Debugging\n", "로컬 엔드포인트나 호스팅 엔드포인트 배포 전, 로컬 환경 상에서 직접 추론을 수행하여 결과를 확인합니다." ] }, { "cell_type": "code", "execution_count": 5, "id": "ac1a3480", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Some weights of the model checkpoint at bert-base-multilingual-cased were not used when initializing BertForTokenClassification: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight']\n", "- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n", "- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n", "Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-multilingual-cased and are newly initialized: ['classifier.bias', 'classifier.weight']\n", "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n" ] } ], "source": [ "from scripts.inference import model_fn, input_fn, predict_fn\n", "model_tuple = model_fn('./model')\n", "model_sample_path = 'payload_samples.txt'" ] }, { "cell_type": "code", "execution_count": 6, "id": "1746097e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{inference.py:55} INFO - input text: 아마존 SageMaker는 머신 러닝 통합 엔드투엔드 관리형 서비스로 2017년 런칭되었다.\n", "[{inference.py:70} INFO - predicted_results: [{'entity_group': 'ORG_B', 'score': 0.93706465, 'word': '아마존', 'start': 0, 'end': 3}, {'entity_group': 'TRM_I', 'score': 0.52719957, 'word': 'SageMaker는', 'start': 4, 'end': 14}, {'entity_group': 'TRM_B', 'score': 0.92613167, 'word': '머신', 'start': 15, 'end': 17}, {'entity_group': 'TRM_I', 'score': 0.5666977, 'word': '러닝', 'start': 18, 'end': 20}, {'entity_group': '', 'score': 0.67403936, 'word': '통합', 'start': 21, 'end': 23}, {'entity_group': 'TRM_I', 'score': 0.45783806, 'word': '엔드투엔드', 'start': 24, 'end': 29}, {'entity_group': '', 'score': 0.44695696, 'word': '관리형', 'start': 30, 'end': 33}, {'entity_group': 'TRM_I', 'score': 0.54463005, 'word': '서비스로', 'start': 34, 'end': 38}, {'entity_group': 'DAT_B', 'score': 0.9937644, 'word': '2017년', 'start': 39, 'end': 44}, {'entity_group': '', 'score': 0.9995462, 'word': '런칭되었다.', 'start': 45, 'end': 51}]\n", "predicted_results: [{'entity_group': 'ORG_B', 'score': 0.93706465, 'word': '아마존', 'start': 0, 'end': 3}, {'entity_group': 
'TRM_I', 'score': 0.52719957, 'word': 'SageMaker는', 'start': 4, 'end': 14}, {'entity_group': 'TRM_B', 'score': 0.92613167, 'word': '머신', 'start': 15, 'end': 17}, {'entity_group': 'TRM_I', 'score': 0.5666977, 'word': '러닝', 'start': 18, 'end': 20}, {'entity_group': '', 'score': 0.67403936, 'word': '통합', 'start': 21, 'end': 23}, {'entity_group': 'TRM_I', 'score': 0.45783806, 'word': '엔드투엔드', 'start': 24, 'end': 29}, {'entity_group': '', 'score': 0.44695696, 'word': '관리형', 'start': 30, 'end': 33}, {'entity_group': 'TRM_I', 'score': 0.54463005, 'word': '서비스로', 'start': 34, 'end': 38}, {'entity_group': 'DAT_B', 'score': 0.9937644, 'word': '2017년', 'start': 39, 'end': 44}, {'entity_group': '', 'score': 0.9995462, 'word': '런칭되었다.', 'start': 45, 'end': 51}]\n" ] } ], "source": [ "with open(model_sample_path, 'w') as file:\n", " file.write('{\"text\": [\"아마존 SageMaker는 머신 러닝 통합 엔드투엔드 관리형 서비스로 2017년 런칭되었다.\"]}')\n", " \n", "with open(model_sample_path, mode='rb') as file:\n", " request_body = file.read() \n", " \n", "inputs = input_fn(request_body)\n", "outputs = predict_fn(inputs, model_tuple) " ] }, { "cell_type": "markdown", "id": "9bb2303e", "metadata": {}, "source": [ "
\n", "\n", "## 2. Deploy to Local Environment\n", "\n", "---\n", "\n", "SageMaker 호스팅 엔드포인트로 배포하기 전에 로컬 모드 엔드포인트로 배포할 수 있습니다. 로컬 모드는 현재 개발 중인 환경에서 도커 컨테이너를 실행하여 SageMaker 프로세싱/훈련/추론 작업을 에뮬레이트할 수 있습니다. 추론 작업의 경우는 Amazon ECR의 딥러닝 프레임워크 기반 추론 컨테이너를 로컬로 가져오고(docker pull) 컨테이너를 실행하여(docker run) 모델 서버를 시작합니다.\n", "\n", "https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html\n", "\n", "```python\n", "local_model_path = f'{os.getcwd()}/model'\n", "ecr_uri = image_uri\n", "\n", "# 도커 컨테이너 구동\n", "!docker run --name smmodel -itd -p 8080:8080 -v {local_model_path}:/opt/ml/model {ecr_uri} serve\n", "\n", "# 실시간 호출 테스트 \n", "!curl -X POST -H 'Content-Type: application/json' localhost:8080/invocations -d ...\n", "\n", "# 도커 컨테이너 중지 및 삭제 \n", "!docker stop smmodel\n", "!docker rm smmodel\n", "```\n", "\n", "참고로 SageMaker SDK에서 `deploy(...)` 메소드로 엔드포인트 배포 시, 인스턴스 타입을 local 이나 local_gpu로 지정하면 위의 과정을 자동으로 수행할 수 있습니다.\n", "\n", "```python\n", "# 로컬 엔드포인트 배포\n", "local_predictor = local_model.deploy(initial_instance_count=1, instance_type=\"local\")\n", "\n", "# 실시간 호출 테스트 \n", "local_predictor.predict(...)\n", "\n", "# 로컬 엔드포인트 삭제 (도커 컨테이너 중지 및 삭제)\n", "local_predictor.delete_endpoint()\n", "```\n", "\n", "아래 코드를 보시면 아시겠지만, 지속적으로 업데이트되는 파이썬 버전&프레임워크 버전&트랜스포머 버전에 쉽게 대응할 수 있습니다. AWS에서 관리하고 있는 딥러닝 컨테이너(DLC) 목록을 아래 주소에서 확인해 보세요.\n", "\n", "https://github.com/aws/deep-learning-containers/blob/master/available_images.md" ] }, { "cell_type": "code", "execution_count": 12, "id": "d1009927", "metadata": {}, "outputs": [], "source": [ "import os\n", "import time\n", "import sagemaker\n", "from sagemaker.pytorch import PyTorchModel\n", "from sagemaker.serializers import JSONSerializer, JSONLinesSerializer, IdentitySerializer\n", "from sagemaker.deserializers import JSONDeserializer, JSONLinesDeserializer\n", "#from sagemaker.pytorch.model import PyTorchModel\n", "role = sagemaker.get_execution_role()\n", "endpoint_name = \"local-endpoint-pytorch-{}\".format(int(time.time()))\n", "local_model_path = f'file://{os.getcwd()}/{local_model_dir}/model.tar.gz'" ] }, { "cell_type": "markdown", "id": "9a68c326", "metadata": {}, "source": [ "### Create Endpoint\n", "\n", "SageMaker SDK는 `deploy(...)` 메소드를 호출 시, 엔드포인트 컨피그 생성(create-endpoint-config)과 엔드포인트 생성(create-endpoint)을 같이 수행합니다. 좀 더 세분화된 파라메터 조정을 원하면 AWS CLI나 boto3 SDK client 활용을 권장 드립니다." 
] }, { "cell_type": "code", "execution_count": 13, "id": "f4257173", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{session.py:2668} INFO - Creating model with name: pytorch-inference-2022-06-13-08-53-27-217\n", "[{session.py:3585} INFO - Creating endpoint-config with name pytorch-inference-2022-06-13-08-53-27-218\n", "[{session.py:3053} INFO - Creating endpoint with name pytorch-inference-2022-06-13-08-53-27-218\n", "[{image.py:270} INFO - serving\n", "[{image.py:273} INFO - creating hosting dir in /tmp/tmpsvvw65dp\n", "[{image.py:1012} INFO - No AWS credentials found in session but credentials from EC2 Metadata Service are available.\n", "[{image.py:685} INFO - docker compose file: \n", "networks:\n", " sagemaker-local:\n", " name: sagemaker-local\n", "services:\n", " algo-1-zqdqk:\n", " command: serve\n", " container_name: cx2mwac59i-algo-1-zqdqk\n", " environment:\n", " - '[Masked]'\n", " - '[Masked]'\n", " - '[Masked]'\n", " - '[Masked]'\n", " image: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.8.1-cpu-py3\n", " networks:\n", " sagemaker-local:\n", " aliases:\n", " - algo-1-zqdqk\n", " ports:\n", " - 8080:8080\n", " stdin_open: true\n", " tty: true\n", " volumes:\n", " - /tmp/tmpe0z9alqy:/opt/ml/model\n", "version: '2.3'\n", "\n", "[{image.py:708} INFO - docker command: docker-compose -f /tmp/tmpsvvw65dp/docker-compose.yaml up --build --abort-on-container-exit\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 5\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused')': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused')': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused')': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 10\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 15\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", 
"[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 20\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 25\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 30\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 35\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': 
/ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 40\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 45\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 50\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 55\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', 
ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 60\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 65\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 70\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "Attaching to cx2mwac59i-algo-1-zqdqk\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting transformers\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading transformers-4.18.0-py3-none-any.whl (4.0 MB)\n", " |████████████████████████████████| 4.0 MB 31.8 MB/s 
\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25hCollecting huggingface-hub<1.0,>=0.1.0\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading huggingface_hub-0.4.0-py3-none-any.whl (67 kB)\n", " |████████████████████████████████| 67 kB 9.6 MB/s \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25hCollecting regex!=2019.12.17\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading regex-2022.6.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (749 kB)\n", " |████████████████████████████████| 749 kB 65.6 MB/s \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25hRequirement already satisfied: pyyaml>=5.1 in /opt/conda/lib/python3.6/site-packages (from transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (5.4.1)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting filelock\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading filelock-3.4.1-py3-none-any.whl (9.9 kB)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.6/site-packages (from transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (1.19.1)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting tokenizers!=0.11.3,<0.13,>=0.11.1\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading tokenizers-0.12.1-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)\n", " |████████████████████████████████| 6.6 MB 63.7 MB/s \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25hRequirement already satisfied: requests in /opt/conda/lib/python3.6/site-packages (from transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (2.22.0)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: dataclasses in /opt/conda/lib/python3.6/site-packages (from transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (0.8)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.6/site-packages (from transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (4.61.2)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting importlib-metadata\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading importlib_metadata-4.8.3-py3-none-any.whl (17 kB)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.6/site-packages (from transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (20.4)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting sacremoses\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading sacremoses-0.0.53.tar.gz (880 kB)\n", " |████████████████████████████████| 880 kB 63.5 MB/s \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25h Preparing metadata (setup.py) ... 
\u001b[?25ldone\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25hRequirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/lib/python3.6/site-packages (from huggingface-hub<1.0,>=0.1.0->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (4.0.1)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting packaging>=20.0\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading packaging-21.3-py3-none-any.whl (40 kB)\n", " |████████████████████████████████| 40 kB 9.1 MB/s \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25hRequirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /opt/conda/lib/python3.6/site-packages (from packaging>=20.0->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (3.0.6)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting zipp>=0.5\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading zipp-3.6.0-py3-none-any.whl (5.3 kB)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/lib/python3.6/site-packages (from requests->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (2.8)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.6/site-packages (from requests->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (2021.5.30)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.6/site-packages (from requests->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (1.25.11)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.6/site-packages (from requests->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (3.0.4)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Requirement already satisfied: six in /opt/conda/lib/python3.6/site-packages (from sacremoses->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (1.16.0)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Collecting click\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Downloading click-8.0.4-py3-none-any.whl (97 kB)\n", " |████████████████████████████████| 97 kB 12.9 MB/s \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25hRequirement already satisfied: joblib in /opt/conda/lib/python3.6/site-packages (from sacremoses->transformers->-r /opt/ml/model/code/requirements.txt (line 1)) (1.0.1)\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Building wheels for collected packages: sacremoses\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 75\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", 
"[{entities.py:635} INFO - Container still not up, got: -1\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Building wheel for sacremoses (setup.py) ... \u001b[?25ldone\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[?25h Created wheel for sacremoses: filename=sacremoses-0.0.53-py3-none-any.whl size=895259 sha256=7bf713231461fd434a535731f9896c7b2adfec471f7b3cbddf262e00c04817b3\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Stored in directory: /root/.cache/pip/wheels/4c/64/31/e9900a234b23fb3e9dc565d6114a9d6ff84a72dbdd356502b4\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Successfully built sacremoses\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Installing collected packages: zipp, importlib-metadata, regex, packaging, filelock, click, tokenizers, sacremoses, huggingface-hub, transformers\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Attempting uninstall: packaging\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Found existing installation: packaging 20.4\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Uninstalling packaging-20.4:\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Successfully uninstalled packaging-20.4\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Successfully installed click-8.0.4 filelock-3.4.1 huggingface-hub-0.4.0 importlib-metadata-4.8.3 packaging-21.3 regex-2022.6.2 sacremoses-0.0.53 tokenizers-0.12.1 transformers-4.18.0 zipp-3.6.0\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m \u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m ['torchserve', '--start', '--model-store', '/.sagemaker/ts/models', '--ts-config', '/etc/sagemaker-ts.properties', '--log-config', '/opt/conda/lib/python3.6/site-packages/sagemaker_pytorch_serving_container/etc/log4j.properties', '--models', 'model.mar']\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:54:49,358 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:54:49,588 [INFO ] main org.pytorch.serve.ModelServer - \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Torchserve version: 0.4.0\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m TS Home: /opt/conda/lib/python3.6/site-packages\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Current directory: /\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Temp directory: /home/model-server/tmp\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Number of GPUs: 0\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Number of CPUs: 4\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Max heap size: 3928 M\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Python executable: /opt/conda/bin/python3.6\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Config file: /etc/sagemaker-ts.properties\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Inference address: http://0.0.0.0:8080\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Management address: http://0.0.0.0:8080\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Metrics address: http://127.0.0.1:8082\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Model Store: /.sagemaker/ts/models\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Initial Models: model.mar\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Log dir: /logs\n", 
"\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Metrics dir: /logs\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Netty threads: 0\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Netty client threads: 0\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Default workers per model: 4\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Blacklist Regex: N/A\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Maximum Response Size: 6553500\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Maximum Request Size: 6553500\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Prefer direct buffer: false\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Allowed Urls: [file://.*|http(s)?://.*]\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Custom python dependency for model allowed: false\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Metrics report format: prometheus\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Enable metrics API: true\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Workflow Store: /.sagemaker/ts/models\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:54:49,607 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:54:49,655 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: model.mar\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 80\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 85\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 90\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", 
"[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 95\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{connectionpool.py:810} WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /ping\n", "[{entities.py:635} INFO - Container still not up, got: -1\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:09,059 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model model loaded.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:09,107 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:09,339 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:09,345 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:09,352 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m Model server started.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:09,943 [WARN ] pool-2-thread-1 org.pytorch.serve.metrics.MetricCollector - worker pid is not available yet.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,025 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:dc9d06b0ca46,timestamp:1655110510\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,026 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:0.0|#Level:Host|#hostname:dc9d06b0ca46,timestamp:1655110510\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,027 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:93.93645477294922|#Level:Host|#hostname:dc9d06b0ca46,timestamp:1655110510\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,028 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:100.0|#Level:Host|#hostname:dc9d06b0ca46,timestamp:1655110510\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,028 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:7333.359375|#Level:Host|#hostname:dc9d06b0ca46,timestamp:1655110510\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 
2022-06-13 08:55:10,035 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:8031.3203125|#Level:Host|#hostname:dc9d06b0ca46,timestamp:1655110510\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,035 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:53.3|#Level:Host|#hostname:dc9d06b0ca46,timestamp:1655110510\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,381 [INFO ] W-9001-model_1-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9001\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,381 [INFO ] W-9002-model_1-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9002\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,384 [INFO ] W-9001-model_1-stdout MODEL_LOG - [PID]93\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,384 [INFO ] W-9000-model_1-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,385 [INFO ] W-9001-model_1-stdout MODEL_LOG - Torch worker started.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,386 [INFO ] W-9002-model_1-stdout MODEL_LOG - [PID]95\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,386 [INFO ] W-9000-model_1-stdout MODEL_LOG - [PID]92\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,386 [INFO ] W-9000-model_1-stdout MODEL_LOG - Torch worker started.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,387 [INFO ] W-9000-model_1-stdout MODEL_LOG - Python runtime: 3.6.13\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,387 [INFO ] W-9002-model_1-stdout MODEL_LOG - Torch worker started.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,387 [INFO ] W-9002-model_1-stdout MODEL_LOG - Python runtime: 3.6.13\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,390 [INFO ] W-9001-model_1-stdout MODEL_LOG - Python runtime: 3.6.13\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,393 [INFO ] W-9001-model_1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9001\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,394 [INFO ] W-9000-model_1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,394 [INFO ] W-9002-model_1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9002\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,396 [INFO ] W-9003-model_1-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9003\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,398 [INFO ] W-9003-model_1-stdout MODEL_LOG - [PID]94\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,398 [INFO ] W-9003-model_1-stdout MODEL_LOG - Torch worker started.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,398 [INFO ] W-9003-model_1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9003\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,398 [INFO ] W-9003-model_1-stdout MODEL_LOG - Python runtime: 3.6.13\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,425 [INFO ] W-9003-model_1-stdout MODEL_LOG - Connection accepted: 
/home/model-server/tmp/.ts.sock.9003.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,436 [INFO ] W-9002-model_1-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9002.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,438 [INFO ] W-9000-model_1-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,440 [INFO ] W-9001-model_1-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9001.\n", "[{entities.py:632} INFO - Checking if serving container is up, attempt: 100\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,654 [INFO ] pool-1-thread-5 ACCESS_LOG - /172.19.0.1:35610 \"GET /ping HTTP/1.1\" 200 31\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:10,677 [INFO ] pool-1-thread-5 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:dc9d06b0ca46,timestamp:null\n", "!\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:12,458 [WARN ] W-9001-model_1-stderr MODEL_LOG - \n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:55:12,460 [WARN ] W-9001-model_1-stderr MODEL_LOG - Downloading: 0%| | 0.00/625 [00:00<?, ?B/s]\n" ] } ], "source": [ "model = PyTorchModel(\n", "    model_data=local_model_path,\n", "    role=role,\n", "    entry_point='inference.py',\n", "    source_dir='scripts',\n", "    framework_version=\"1.8.1\",\n", "    py_version='py3',\n", ")\n", "predictor = model.deploy(\n", "    initial_instance_count=1,\n", "    instance_type=\"local\",\n", "    serializer=JSONLinesSerializer(),\n", "    deserializer=JSONLinesDeserializer(),\n", ")" ] }, { "cell_type": "code", "execution_count": 14, "id": "b7c8d9e0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8080/tcp, :::8080->8080/tcp, 8081/tcp cx2mwac59i-algo-1-zqdqk\r\n" ] } ], "source": [ "!docker ps" ] }, { "cell_type": "markdown", "id": "934ed7fe", "metadata": {}, "source": [ "### Sample data prediction\n", "\n", "Run inference with sample data." ] }, { "cell_type": "code", "execution_count": 16, "id": "133c6b66", "metadata": {}, "outputs": [], "source": [ "data = [\n", "    {\"text\": [\"아마존 SageMaker는 머신 러닝 통합 엔드투엔드 관리형 서비스로 2017년 re:Invent 행사가 열린 라스베가스에서 발표되었다.\"]}\n", "]" ] }, { "cell_type": "code", "execution_count": 17, "id": "36abf3d3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,152 [INFO ] W-9002-model_1-stdout MODEL_LOG - input text: 아마존 SageMaker는 머신 러닝 통합 엔드투엔드 관리형 서비스로 2017년 re:Invent 행사가 열린 라스베가스에서 발표되었다.\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,627 [INFO ] W-9002-model_1-stdout MODEL_LOG - predicted_results: [{'entity_group': 'ORG_B', 'score': 0.97080106, 'word': '아마존', 'start': 0, 'end': 3}, {'entity_group': 'TRM_B', 'score': 0.74606895, 'word': 'SageMaker는 머신', 'start': 4, 'end': 17}, {'entity_group': 'TRM_I', 'score': 0.69509137, 'word': '러닝', 'start': 18, 'end': 20}, {'entity_group': '', 'score': 0.534267, 'word': '통합', 'start': 21, 'end': 23}, {'entity_group': 'TRM_I', 'score': 0.54091847, 'word': '엔드투엔드', 'start': 24, 'end': 29}, {'entity_group': '', 'score': 0.53203374, 'word': '관리형', 'start': 30, 'end': 33}, {'entity_group': 'TRM_I', 'score': 0.48312756, 'word': '서비스로', 'start': 34, 'end': 38}, {'entity_group': 'DAT_B', 'score': 0.9962328, 'word': '2017년', 'start': 39, 'end': 44}, {'entity_group': 'EVT_B', 'score': 0.80297107, 'word': 're :', 'start': 45, 'end': 48}, {'entity_group': 'EVT_I', 'score': 0.7315686, 'word': 'Invent 행사가', 'start': 48, 'end': 58}, {'entity_group': '', 'score': 0.99955004, 'word': '열린', 'start': 59, 'end': 61}, {'entity_group': 'LOC_B', 'score': 0.9940478, 'word': '라스베가스에서', 'start': 62, 'end': 69}, {'entity_group': '', 'score': 0.9996456, 'word': '발표되었다.', 'start': 70, 'end': 76}]\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,628 [INFO ] W-9002-model_1-stdout MODEL_LOG - predicted_results: [{'entity_group': 'ORG_B', 'score': 0.97080106, 'word': '아마존', 'start': 0, 'end': 
3}, {'entity_group': 'TRM_B', 'score': 0.74606895, 'word': 'SageMaker는 머신', 'start': 4, 'end': 17}, {'entity_group': 'TRM_I', 'score': 0.69509137, 'word': '러닝', 'start': 18, 'end': 20}, {'entity_group': '', 'score': 0.534267, 'word': '통합', 'start': 21, 'end': 23}, {'entity_group': 'TRM_I', 'score': 0.54091847, 'word': '엔드투엔드', 'start': 24, 'end': 29}, {'entity_group': '', 'score': 0.53203374, 'word': '관리형', 'start': 30, 'end': 33}, {'entity_group': 'TRM_I', 'score': 0.48312756, 'word': '서비스로', 'start': 34, 'end': 38}, {'entity_group': 'DAT_B', 'score': 0.9962328, 'word': '2017년', 'start': 39, 'end': 44}, {'entity_group': 'EVT_B', 'score': 0.80297107, 'word': 're :', 'start': 45, 'end': 48}, {'entity_group': 'EVT_I', 'score': 0.7315686, 'word': 'Invent 행사가', 'start': 48, 'end': 58}, {'entity_group': '', 'score': 0.99955004, 'word': '열린', 'start': 59, 'end': 61}, {'entity_group': 'LOC_B', 'score': 0.9940478, 'word': '라스베가스에서', 'start': 62, 'end': 69}, {'entity_group': '', 'score': 0.9996456, 'word': '발표되었다.', 'start': 70, 'end': 76}]\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,634 [INFO ] W-9002-model_1 org.pytorch.serve.wlm.WorkerThread - Backend response time: 483\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,634 [INFO ] W-9002-model_1 ACCESS_LOG - /172.19.0.1:47938 \"POST /invocations HTTP/1.1\" 200 496\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,635 [INFO ] W-9002-model_1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:dc9d06b0ca46,timestamp:null\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,633 [INFO ] W-9002-model_1-stdout MODEL_METRICS - PredictionTime.Milliseconds:474.34|#ModelName:model,Level:Model|#hostname:dc9d06b0ca46,requestID:a4bd0268-ef61-4cb2-b7a1-1748cc5a7f48,timestamp:1655110646\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,637 [INFO ] W-9002-model_1 TS_METRICS - QueueTime.ms:0|#Level:Host|#hostname:dc9d06b0ca46,timestamp:null\n", "\u001b[36mcx2mwac59i-algo-1-zqdqk |\u001b[0m 2022-06-13 08:57:26,637 [INFO ] W-9002-model_1 TS_METRICS - WorkerThreadTime.ms:3|#Level:Host|#hostname:dc9d06b0ca46,timestamp:null\n" ] } ], "source": [ "results = predictor.predict(data)" ] }, { "cell_type": "code", "execution_count": 18, "id": "251acfaf", "metadata": {}, "outputs": [], "source": [ "def display_ner_outputs(results, tag2entity):\n", " entity_lst, score_lst, word_lst, start_lst, end_lst = [], [], [], [], []\n", " tag2entity[''] = '-'\n", "\n", " for result in results:\n", " entity = tag2entity[result['entity_group']]\n", " score = result['score']\n", " word = result['word']\n", " start = result['start']\n", " end = result['end']\n", "\n", " entity_lst.append(entity)\n", " score_lst.append(score)\n", " word_lst.append(word)\n", " start_lst.append(start)\n", " end_lst.append(end)\n", "\n", " df = pd.DataFrame(zip(word_lst, entity_lst, score_lst, start_lst, end_lst), \n", " columns=['word', 'entity', 'score', 'start', 'end'])\n", " return df" ] }, { "cell_type": "code", "execution_count": 19, "id": "8d9c6b50", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
wordentityscorestartend
0아마존Organization0.97080103
1SageMaker는 머신Term0.746069417
2러닝Term0.6950911820
3통합-0.5342672123
4엔드투엔드Term0.5409182429
5관리형-0.5320343033
6서비스로Term0.4831283438
72017년Date0.9962333944
8re :Event0.8029714548
9Invent 행사가Event0.7315694858
10열린-0.9995505961
11라스베가스에서Location0.9940486269
12발표되었다.-0.9996467076
\n", "
" ], "text/plain": [ " word entity score start end\n", "0 아마존 Organization 0.970801 0 3\n", "1 SageMaker는 머신 Term 0.746069 4 17\n", "2 러닝 Term 0.695091 18 20\n", "3 통합 - 0.534267 21 23\n", "4 엔드투엔드 Term 0.540918 24 29\n", "5 관리형 - 0.532034 30 33\n", "6 서비스로 Term 0.483128 34 38\n", "7 2017년 Date 0.996233 39 44\n", "8 re : Event 0.802971 45 48\n", "9 Invent 행사가 Event 0.731569 48 58\n", "10 열린 - 0.999550 59 61\n", "11 라스베가스에서 Location 0.994048 62 69\n", "12 발표되었다. - 0.999646 70 76" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "local_model_dir = 'model'\n", "with open(os.path.join(local_model_dir, 'tag2entity.json'), 'r') as f:\n", " tag2entity = json.loads(f.read())\n", "display_ner_outputs(results, tag2entity)" ] }, { "cell_type": "code", "execution_count": 20, "id": "c62330f1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{session.py:3113} INFO - Deleting endpoint configuration with name: pytorch-inference-2022-06-13-08-53-27-218\n", "[{session.py:3103} INFO - Deleting endpoint with name: pytorch-inference-2022-06-13-08-53-27-218\n", "Gracefully stopping... (press Ctrl+C again to force)\n" ] } ], "source": [ "predictor.delete_endpoint()" ] }, { "cell_type": "markdown", "id": "c248bc76", "metadata": {}, "source": [ "
\n", "\n", "## 3. Deploy to Hosting Instance\n", "---\n", "로컬 모드에서 충분히 디버깅했으면 실제 호스팅 인스턴스로 배포할 차례입니다. 코드는 거의 동일하며, instance_type만 다르다는 점을 주목해 주세요!" ] }, { "cell_type": "markdown", "id": "c0aede3c", "metadata": {}, "source": [ "### Create Endpoint" ] }, { "cell_type": "code", "execution_count": 23, "id": "3c66c4f3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{session.py:2668} INFO - Creating model with name: pytorch-inference-2022-06-13-08-59-04-856\n", "[{session.py:3585} INFO - Creating endpoint-config with name pytorch-inference-2022-06-13-08-59-05-123\n", "[{session.py:3053} INFO - Creating endpoint with name pytorch-inference-2022-06-13-08-59-05-123\n" ] } ], "source": [ "from sagemaker.pytorch import PyTorchModel\n", "from sagemaker.serializers import JSONSerializer, JSONLinesSerializer, IdentitySerializer\n", "from sagemaker.deserializers import JSONDeserializer, JSONLinesDeserializer\n", "\n", "model = PyTorchModel(\n", " model_data=f\"{s3_model_path}\", # path to your trained SageMaker model\n", " role=role, # IAM role with permissions to create an endpoint \n", " entry_point='inference.py',\n", " source_dir='scripts',\n", " framework_version=\"1.8.1\", # PyTorch version used\n", " py_version='py3', # Python version used\n", ")\n", "predictor = model.deploy(\n", " initial_instance_count=1,\n", " instance_type=\"ml.c5.xlarge\",\n", " serializer=JSONLinesSerializer(),\n", " deserializer=JSONLinesDeserializer(), \n", " wait=False\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "10433bc0", "metadata": {}, "outputs": [], "source": [ "# from sagemaker.huggingface.model import HuggingFaceModel\n", "# model = HuggingFaceModel(\n", "# model_data=f\"{s3_model_path}\", # path to your trained SageMaker model\n", "# role=role, \n", "# transformers_version=\"4.6\",\n", "# pytorch_version=\"1.7\",\n", "\n", "# #entry_point='inference.py',\n", "# source_dir='scripts',\n", "# #framework_version=\"1.8.1\", # PyTorch version used\n", "# py_version='py36', # Python version used\n", "# )\n", "\n", "# predictor = model.deploy(\n", "# initial_instance_count=1,\n", "# instance_type=\"ml.c5.xlarge\",\n", "# wait=False\n", "# )" ] }, { "cell_type": "markdown", "id": "96947f3d", "metadata": {}, "source": [ "### Wait for the endpoint jobs to complete\n", "엔드포인트가 생성될 때까지 기다립니다. 엔드포인트가 가리키는 호스팅 리소스를 프로비저닝하는 데에 몇 분의 시간이 소요됩니다." ] }, { "cell_type": "code", "execution_count": 24, "id": "a07d3048", "metadata": {}, "outputs": [ { "data": { "text/html": [ "[Deploy model from S3] Review Endpoint" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.core.display import display, HTML\n", "def make_endpoint_link(region, endpoint_name, endpoint_task):\n", " endpoint_link = f'{endpoint_task} Review Endpoint' \n", " return endpoint_link \n", " \n", "endpoint_link = make_endpoint_link(region, predictor.endpoint_name, '[Deploy model from S3]')\n", "display(HTML(endpoint_link))" ] }, { "cell_type": "code", "execution_count": 25, "id": "69fc553e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--------------------------------------!" 
] }, { "data": { "text/plain": [ "{'EndpointName': 'pytorch-inference-2022-06-13-08-59-05-123',\n", " 'EndpointArn': 'arn:aws:sagemaker:us-east-1:143656149352:endpoint/pytorch-inference-2022-06-13-08-59-05-123',\n", " 'EndpointConfigName': 'pytorch-inference-2022-06-13-08-59-05-123',\n", " 'ProductionVariants': [{'VariantName': 'AllTraffic',\n", " 'DeployedImages': [{'SpecifiedImage': '763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.8.1-cpu-py3',\n", " 'ResolvedImage': '763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference@sha256:ea8411872129fd66a7712a2d21564b82cb165628bd534ce3a587d3c1ec6241cd',\n", " 'ResolutionTime': datetime.datetime(2022, 6, 13, 8, 59, 7, 370000, tzinfo=tzlocal())}],\n", " 'CurrentWeight': 1.0,\n", " 'DesiredWeight': 1.0,\n", " 'CurrentInstanceCount': 1,\n", " 'DesiredInstanceCount': 1}],\n", " 'EndpointStatus': 'InService',\n", " 'CreationTime': datetime.datetime(2022, 6, 13, 8, 59, 5, 398000, tzinfo=tzlocal()),\n", " 'LastModifiedTime': datetime.datetime(2022, 6, 13, 9, 2, 14, 93000, tzinfo=tzlocal()),\n", " 'ResponseMetadata': {'RequestId': '67502169-41b7-416d-b014-817e37dca4f0',\n", " 'HTTPStatusCode': 200,\n", " 'HTTPHeaders': {'x-amzn-requestid': '67502169-41b7-416d-b014-817e37dca4f0',\n", " 'content-type': 'application/x-amz-json-1.1',\n", " 'content-length': '775',\n", " 'date': 'Mon, 13 Jun 2022 09:02:16 GMT'},\n", " 'RetryAttempts': 0}}" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sess.wait_for_endpoint(predictor.endpoint_name, poll=5)" ] }, { "cell_type": "markdown", "id": "c3e2bfa8", "metadata": {}, "source": [ "### Sample data prediction\n", "\n", "샘플 데이터로 추론을 수행합니다." ] }, { "cell_type": "code", "execution_count": 26, "id": "d4461751", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
wordentityscorestartend
0아마존Organization0.97080103
1SageMaker는 머신Term0.746069417
2러닝Term0.6950911820
3통합-0.5342672123
4엔드투엔드Term0.5409182429
5관리형-0.5320343033
6서비스로Term0.4831283438
72017년Date0.9962333944
8re :Event0.8029714548
9Invent 행사가Event0.7315694858
10열린-0.9995505961
11라스베가스에서Location0.9940486269
12발표되었다.-0.9996467076
\n", "
" ], "text/plain": [ " word entity score start end\n", "0 아마존 Organization 0.970801 0 3\n", "1 SageMaker는 머신 Term 0.746069 4 17\n", "2 러닝 Term 0.695091 18 20\n", "3 통합 - 0.534267 21 23\n", "4 엔드투엔드 Term 0.540918 24 29\n", "5 관리형 - 0.532034 30 33\n", "6 서비스로 Term 0.483128 34 38\n", "7 2017년 Date 0.996233 39 44\n", "8 re : Event 0.802971 45 48\n", "9 Invent 행사가 Event 0.731569 48 58\n", "10 열린 - 0.999550 59 61\n", "11 라스베가스에서 Location 0.994048 62 69\n", "12 발표되었다. - 0.999646 70 76" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = [\n", " {\"text\": [\"아마존 SageMaker는 머신 러닝 통합 엔드투엔드 관리형 서비스로 2017년 re:Invent 행사가 열린 라스베가스에서 발표되었다.\"]}\n", "]\n", "results = predictor.predict(data)\n", "display_ner_outputs(results, tag2entity)" ] }, { "cell_type": "markdown", "id": "7a758949", "metadata": {}, "source": [ "## Clean up \n", "\n", "비용 과금 방지를 위해 엔드포인트를 삭제합니다." ] }, { "cell_type": "code", "execution_count": 28, "id": "6be7bc9f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{session.py:3113} INFO - Deleting endpoint configuration with name: pytorch-inference-2022-06-13-08-59-05-123\n", "[{session.py:3103} INFO - Deleting endpoint with name: pytorch-inference-2022-06-13-08-59-05-123\n" ] } ], "source": [ "predictor.delete_endpoint()\n", "model.delete_model()" ] } ], "metadata": { "kernelspec": { "display_name": "conda_pytorch_p38", "language": "python", "name": "conda_pytorch_p38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 5 }