{ "cells": [ { "cell_type": "markdown", "id": "16c61f54", "metadata": {}, "source": [ "# Fine-tune a Stable Diffusion model for text-to-image generation" ] }, { "cell_type": "markdown", "id": "bdc23bae", "metadata": {}, "source": [ "***\n", "Welcome to Amazon [SageMaker JumpStart](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html)! You can use JumpStart to solve many machine learning tasks, either with one click in SageMaker Studio or through the [SageMaker JumpStart API](https://sagemaker.readthedocs.io/en/stable/overview.html#use-prebuilt-models-with-sagemaker-jumpstart). This demo notebook shows how to use the JumpStart API to generate images from text with a state-of-the-art Stable Diffusion model. It also shows how to fine-tune the model on your own dataset.\n", "\n", "Stable Diffusion is a text-to-image model that can generate realistic images from a text prompt alone. A diffusion model is trained by learning to remove noise that was added to real images, and this denoising process is what produces a realistic image. These models can also generate images from text alone by conditioning the generation process on the text. For example, Stable Diffusion is a latent diffusion model that learns to recognize shapes in a pure-noise image and gradually brings those shapes into focus when they match the words of the input text.\n", "\n", "Training and deploying large models, and running inference on models such as Stable Diffusion, is often challenging, with issues such as out-of-memory errors and exceeded payload size limits. JumpStart simplifies this process by providing ready-to-use scripts that have been robustly tested. It also offers guidance on each step of the process, including the recommended instance types, how to select the parameters that guide image generation, and prompt engineering. In addition, you can deploy and run inference on any of the 80+ diffusion models in JumpStart without writing any code of your own.\n", "\n", "\n", "In this notebook, you will learn how to use JumpStart to fine-tune a Stable Diffusion model on your dataset. This is useful for creating art, logos, custom designs, NFTs, and so on, or for fun tasks such as generating custom AI images of your pet or an avatar of yourself.\n", "\n", "Model license: By using this model, you agree to the [CreativeML Open RAIL-M++ license](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL).\n", "\n", "***" ] }, { "cell_type": "markdown", "id": "5db28351", "metadata": {}, "source": [ "1. [Set Up](#1.-Set-Up)\n", "2. 
[Fine-tune the pre-trained model on a custom dataset](#2.-Fine-tune-the-pre-trained-model-on-a-custom-dataset)\n", "    * [Retrieve Training Artifacts](#2.1.-Retrieve-Training-Artifacts)\n", "    * [Set Training parameters](#2.2.-Set-Training-parameters)\n", "    * [Start Training](#2.3.-Start-Training)\n", "    * [Deploy and run inference on the fine-tuned model](#3.2.-Deploy-and-run-inference-on-the-fine-tuned-model)\n", "3. [Conclusion](#3.-Conclusion)\n" ] }, { "cell_type": "markdown", "id": "ce462973", "metadata": {}, "source": [ "Note: This notebook was tested on an ml.t3.medium instance in Amazon SageMaker Studio with the Python 3 (Data Science) kernel, and on an Amazon SageMaker Notebook instance with the conda_python3 kernel.\n", "\n", "Note: To fine-tune the model on a dataset, the `ml.g4dn.2xlarge` instance type must be available in your account. To deploy the fine-tuned model, you can use the `ml.p3.2xlarge` or `ml.g4dn.2xlarge` instance type. If the `ml.g5.2xlarge` instance type is available in your region, we recommend using it for deployment. " ] }, { "cell_type": "markdown", "id": "9ea47727", "metadata": {}, "source": [ "## 1. Set Up" ] }, { "cell_type": "markdown", "id": "35b91e81", "metadata": {}, "source": [ "***\n", "Before running the notebook, a few initial setup steps are required. This notebook requires the latest version of sagemaker.\n", "\n", "***" ] }, { "cell_type": "markdown", "id": "48370155", "metadata": {}, "source": [ "#### Permissions and environment variables\n", "\n", "***\n", "To host on Amazon SageMaker, you need to set up and authenticate the use of AWS services. 
Here, we use the execution role associated with the current notebook as the AWS account role with SageMaker access.\n", "\n", "***" ] }, { "cell_type": "code", "execution_count": 2, "id": "90518e45", "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.8/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.2\n", " warnings.warn(f\"A NumPy version >={np_minversion} and <{np_maxversion}\"\n" ] } ], "source": [ "import sagemaker, boto3, json\n", "from sagemaker import get_execution_role\n", "\n", "aws_role = get_execution_role()\n", "aws_region = boto3.Session().region_name\n", "sess = sagemaker.Session()" ] }, { "cell_type": "markdown", "id": "2c8edfc4", "metadata": {}, "source": [ "## 2. Fine-tune the pre-trained model on a custom dataset\n", "\n", "---\n", "Previously, we saw how to run inference on a pre-trained model. Next, we discuss how to fine-tune the model on a custom dataset with multiple classes.\n", "\n", "The model can be fine-tuned on any dataset of images. It works very well even with as few as five training images.\n", "\n", "The fine-tuning script is built on the script from [dreambooth](https://dreambooth.github.io/). The model returned by fine-tuning can be further deployed for inference. Below are the instructions for how the training data should be formatted.\n", "\n", "\n", "- **Input:** A directory containing the instance images, `dataset_info.json`, and (optionally) a directory `class_data_dir`.\n", "    - Images may be in `.png`, `.jpg`, or `.jpeg` format.\n", "    - The `dataset_info.json` file must be of the format {'instance_prompt':<>,'class_prompt':<>}.\n", "    - If with_prior_preservation = False, you may choose to ignore 'class_prompt'.\n", "    - The `class_data_dir` directory must contain the class images. If with_prior_preservation = True and class_data_dir contains no class images or not enough of them, additional images are sampled using class_prompt.\n", "- **Output:** A trained model that can be deployed for inference.\n", "\n", "The S3 path should look like `s3://bucket_name/input_directory/`. 
The trailing `/` is required.\n", "\n", "Here is an example format of the training data.\n", "\n", "    input_directory\n", "        |---instance_image_1.png\n", "        |---instance_image_2.png\n", "        |---instance_image_3.png\n", "        |---instance_image_4.png\n", "        |---instance_image_5.png\n", "        |---dataset_info.json\n", "        |---class_data_dir\n", "            |---class_image_1.png\n", "            |---class_image_2.png\n", "            |---class_image_3.png\n", "            |---class_image_4.png\n", "\n", "**Prior preservation, instance prompt, and class prompt:** Prior preservation is a technique that uses additional images of the same class as the one being trained on. For example, if the training data consists of images of a particular dog, prior preservation incorporates class images of generic dogs. Showing the model images of other dogs while it trains on the particular dog helps avoid overfitting. The tag that identifies the specific dog in the instance prompt is absent from the class prompt. For example, the instance prompt may be \"a photo of a Doppler dog\" and the class prompt may be \"a photo of a dog\". You can enable prior preservation by setting the hyperparameter with_prior_preservation = True.\n", "\n", "We provide a default dataset of dog images. It consists of images of a single dog (the instance images corresponding to the instance prompt) with no class images. If you use the default dataset, try the prompt \"a photo of a Doppler dog\" when running inference in the demo notebook.\n", "\n", "License: [MIT](https://github.com/marshmellow77/dreambooth-sm/blob/main/LICENSE)." ] }, { "cell_type": "markdown", "id": "b8bfaa4d", "metadata": {}, "source": [ "### 2.1. Retrieve Training Artifacts\n", "\n", "---\n", "Here, we retrieve the training Docker container, the training algorithm source, and the pre-trained base model. Note that model_version=\"*\" fetches the latest model.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 7, "id": "f11ff722", "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:sagemaker.image_uris:image_uri is not presented, retrieving image_uri based on instance_type, framework etc.\n" ] } ], "source": [ "from sagemaker import image_uris, model_uris, script_uris\n", "\n", "# Currently, not all the Stable Diffusion models in JumpStart support fine-tuning. 
Thus, we manually select a model\n", "# which supports fine-tuning.\n", "train_model_id, train_model_version, train_scope = (\n", "    \"model-txt2img-stabilityai-stable-diffusion-v2-1-base\",\n", "    \"*\",\n", "    \"training\",\n", ")\n", "\n", "# image_uris.retrieve below needs an instance type, so define a default here.\n", "# It is replaced in Section 2.2 once the hyperparameters are known.\n", "training_instance_type = \"ml.g4dn.2xlarge\"\n", "\n", "# Retrieve the docker image\n", "train_image_uri = image_uris.retrieve(\n", "    region=None,\n", "    framework=None,  # automatically inferred from model_id\n", "    model_id=train_model_id,\n", "    model_version=train_model_version,\n", "    image_scope=train_scope,\n", "    instance_type=training_instance_type,\n", ")\n", "\n", "# Retrieve the training script. This contains all the necessary files including data processing, model training etc.\n", "train_source_uri = script_uris.retrieve(\n", "    model_id=train_model_id, model_version=train_model_version, script_scope=train_scope\n", ")\n", "# Retrieve the pre-trained model tarball to further fine-tune\n", "train_model_uri = model_uris.retrieve(\n", "    model_id=train_model_id, model_version=train_model_version, model_scope=train_scope\n", ")" ] }, { "cell_type": "markdown", "id": "6e266289", "metadata": {}, "source": [ "### 2.2. Set Training parameters\n", "\n", "---\n", "\n", "Now that all the setup is done, we are ready to train the Stable Diffusion model. To begin, we create a [``sagemaker.estimator.Estimator``](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html) object. This estimator launches the training job.\n", "\n", "There are two kinds of parameters that need to be set for training. The first are the parameters of the training job. These include: (i) the training data path, which is the S3 folder where the input data is stored; (ii) the output path, which is the S3 folder where the training output is stored; and (iii) the training instance type, which indicates the type of machine on which to run training. 
We defined the training instance type above so that the correct train_image_uri could be fetched.\n", "\n", "The second set of parameters are the algorithm-specific training hyperparameters.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 8, "id": "e21c709f", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Sample training data is available in this bucket\n", "training_data_bucket = f\"jumpstart-cache-prod-{aws_region}\"\n", "training_data_prefix = \"training-datasets/dogs_sd_finetuning/\"\n", "\n", "training_dataset_s3_path = f\"s3://{training_data_bucket}/{training_data_prefix}\"\n", "\n", "output_bucket = sess.default_bucket()\n", "output_prefix = \"jumpstart-example-sd-training\"\n", "\n", "s3_output_location = f\"s3://{output_bucket}/{output_prefix}/output\"" ] }, { "cell_type": "markdown", "id": "adda2a1e", "metadata": {}, "source": [ "---\n", "\n", "For the algorithm-specific hyperparameters, we start by fetching a Python dictionary of the training hyperparameters that the algorithm accepts, with their default values. These can then be overridden with custom values.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 13, "id": "aa371787", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'epochs': '20', 'max_steps': '400', 'batch_size': '1', 'with_prior_preservation': True, 'num_class_images': '100', 'learning_rate': '2e-06', 'prior_loss_weight': '1.0', 'center_crop': 'False', 'lr_scheduler': 'constant', 'adam_weight_decay': '0.01', 'adam_beta1': '0.9', 'adam_beta2': '0.999', 'adam_epsilon': '1e-08', 'gradient_accumulation_steps': '1', 'max_grad_norm': '1.0', 'compute_fid': 'False', 'seed': '0'}\n" ] } ], "source": [ "from sagemaker import hyperparameters\n", "\n", "# Retrieve the default hyper-parameters for fine-tuning the model\n", "hyperparameters = hyperparameters.retrieve_default(\n", "    model_id=train_model_id, model_version=train_model_version\n", ")\n", "\n", "# [Optional] Override default hyperparameters with custom values\n", "hyperparameters[\"max_steps\"] = \"400\"\n", "# hyperparameters[\"with_prior_preservation\"] = False\n", 
"hyperparameters[\"with_prior_preservation\"] = True\n", "\n", "print(hyperparameters)" ] }, { "cell_type": "code", "execution_count": 16, "id": "0b52e342-2697-47b5-9f79-4c86af95e45a", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'ml.g5.xlarge'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Tested with ml.g4dn.2xlarge (16GB GPU memory) and ml.g5.2xlarge (24GB GPU memory) instances. Other instances may work as well.\n", "# If ml.g5.2xlarge instance type is available, please change the following instance type to speed up training.\n", "# Prior preservation generates class images and needs more GPU memory, so a g5 instance is selected in that case.\n", "if hyperparameters[\"with_prior_preservation\"]:\n", "    training_instance_type = \"ml.g5.xlarge\"\n", "else:\n", "    training_instance_type = \"ml.g4dn.2xlarge\"\n", "\n", "training_instance_type" ] }, { "cell_type": "markdown", "id": "d102c884", "metadata": {}, "source": [ "---\n", "When `with_prior_preservation=True`, generating the class images requires more memory, so please use the ml.g5.2xlarge instance type (the cell above selects ml.g5.xlarge, which has the same 24 GB GPU). Currently, training on the ml.g4dn.2xlarge instance type runs into a CUDA out-of-memory issue when `with_prior_preservation=True` is set.\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "7cda2854", "metadata": {}, "source": [ "### 2.3. Start Training\n", "---\n", "We start by creating the estimator object with all the required assets, and then launch the training job. 
On the default dataset, it takes less than 10 minutes.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 23, "id": "76bdbb83", "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:sagemaker:Creating training-job with name: jumpstart-example-model-txt2img-stabili-2023-05-28-07-21-53-037\n" ] } ], "source": [ "from sagemaker.estimator import Estimator\n", "from sagemaker.utils import name_from_base\n", "from sagemaker.tuner import HyperparameterTuner\n", "\n", "training_job_name = name_from_base(f\"jumpstart-example-{train_model_id}-transfer-learning\")\n", "\n", "# Create SageMaker Estimator instance\n", "sd_estimator = Estimator(\n", "    role=aws_role,\n", "    image_uri=train_image_uri,\n", "    source_dir=train_source_uri,\n", "    model_uri=train_model_uri,\n", "    entry_point=\"transfer_learning.py\",  # Entry-point file in source_dir and present in train_source_uri.\n", "    instance_count=1,\n", "    instance_type=training_instance_type,\n", "    max_run=360000,\n", "    hyperparameters=hyperparameters,\n", "    output_path=s3_output_location,\n", "    base_job_name=training_job_name,\n", ")\n", "\n", "# Launch a SageMaker Training job by passing s3 path of the training data\n", "sd_estimator.fit({\"training\": training_dataset_s3_path}, logs=True, wait=False)" ] }, { "cell_type": "code", "execution_count": 24, "id": "79965f1f-eded-4a60-88ca-2b33e3f975a5", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2023-05-28 07:21:57 Starting - Starting the training job...\n", "2023-05-28 07:22:13 Starting - Preparing the instances for training......\n", "2023-05-28 07:23:09 Downloading - Downloading input data......\n", "2023-05-28 07:24:24 Training - Downloading the training image...............\n", "2023-05-28 07:27:00 Training - Training image download completed. 
Training in progress..\u001b[34mbash: cannot set terminal process group (-1): Inappropriate ioctl for device\u001b[0m\n", "\u001b[34mbash: no job control in this shell\u001b[0m\n", "\u001b[34m2023-05-28 07:27:07,781 sagemaker-training-toolkit INFO Imported framework sagemaker_pytorch_container.training\u001b[0m\n", "\u001b[34m2023-05-28 07:27:07,801 sagemaker_pytorch_container.training INFO Block until all host DNS lookups succeed.\u001b[0m\n", "\u001b[34m2023-05-28 07:27:07,804 sagemaker_pytorch_container.training INFO Invoking user training script.\u001b[0m\n", "\u001b[34m2023-05-28 07:27:09,174 sagemaker-training-toolkit INFO Installing dependencies from requirements.txt:\u001b[0m\n", "\u001b[34m/opt/conda/bin/python3.8 -m pip install -r requirements.txt\u001b[0m\n", "\u001b[34mProcessing ./lib/accelerate/accelerate-0.15.0-py3-none-any.whl\u001b[0m\n", "\u001b[34mProcessing ./lib/bitsandbytes/bitsandbytes-0.38.1-py3-none-any.whl\u001b[0m\n", "\u001b[34mProcessing ./lib/datasets/datasets-2.8.0-py3-none-any.whl\u001b[0m\n", "\u001b[34mProcessing ./lib/diffusers/diffusers-0.11.1-py3-none-any.whl\u001b[0m\n", "\u001b[34mProcessing ./lib/pytorch-fid/pytorch_fid-0.3.0-py3-none-any.whl\u001b[0m\n", "\u001b[34mProcessing ./lib/transformers/transformers-4.25.1-py3-none-any.whl\u001b[0m\n", "\u001b[34mProcessing ./lib/sagemaker_jumpstart_script_utilities/sagemaker_jumpstart_script_utilities-1.1.4-py2.py3-none-any.whl\u001b[0m\n", "\u001b[34mRequirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.8/site-packages (from accelerate==0.15.0->-r requirements.txt (line 1)) (1.22.2)\u001b[0m\n", "\u001b[34mRequirement already satisfied: psutil in /opt/conda/lib/python3.8/site-packages (from accelerate==0.15.0->-r requirements.txt (line 1)) (5.9.2)\u001b[0m\n", "\u001b[34mRequirement already satisfied: torch>=1.4.0 in /opt/conda/lib/python3.8/site-packages (from accelerate==0.15.0->-r requirements.txt (line 1)) (1.10.2+cu113)\u001b[0m\n", "\u001b[34mRequirement 
already satisfied: packaging>=20.0 in /opt/conda/lib/python3.8/site-packages (from accelerate==0.15.0->-r requirements.txt (line 1)) (21.3)\u001b[0m\n", "\u001b[34mRequirement already satisfied: pyyaml in /opt/conda/lib/python3.8/site-packages (from accelerate==0.15.0->-r requirements.txt (line 1)) (5.4.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: requests>=2.19.0 in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (2.28.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: dill<0.3.7 in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (0.3.5.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: fsspec[http]>=2021.11.1 in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (2022.8.2)\u001b[0m\n", "\u001b[34mRequirement already satisfied: aiohttp in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (3.8.3)\u001b[0m\n", "\u001b[34mRequirement already satisfied: tqdm>=4.62.1 in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (4.64.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: pyarrow>=6.0.0 in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (9.0.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: pandas in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (1.5.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: huggingface-hub<1.0.0,>=0.2.0 in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (0.10.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: xxhash in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (3.0.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: responses<0.19 in /opt/conda/lib/python3.8/site-packages (from 
datasets==2.8.0->-r requirements.txt (line 3)) (0.18.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: multiprocess in /opt/conda/lib/python3.8/site-packages (from datasets==2.8.0->-r requirements.txt (line 3)) (0.70.13)\u001b[0m\n", "\u001b[34mRequirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.8/site-packages (from diffusers==0.11.1->-r requirements.txt (line 4)) (2022.9.13)\u001b[0m\n", "\u001b[34mRequirement already satisfied: Pillow in /opt/conda/lib/python3.8/site-packages (from diffusers==0.11.1->-r requirements.txt (line 4)) (9.2.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: filelock in /opt/conda/lib/python3.8/site-packages (from diffusers==0.11.1->-r requirements.txt (line 4)) (3.8.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: importlib-metadata in /opt/conda/lib/python3.8/site-packages (from diffusers==0.11.1->-r requirements.txt (line 4)) (4.12.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: torchvision>=0.2.2 in /opt/conda/lib/python3.8/site-packages (from pytorch-fid==0.3.0->-r requirements.txt (line 5)) (0.11.3)\u001b[0m\n", "\u001b[34mRequirement already satisfied: scipy in /opt/conda/lib/python3.8/site-packages (from pytorch-fid==0.3.0->-r requirements.txt (line 5)) (1.9.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /opt/conda/lib/python3.8/site-packages (from transformers==4.25.1->-r requirements.txt (line 6)) (0.13.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.8/site-packages (from aiohttp->datasets==2.8.0->-r requirements.txt (line 3)) (6.0.2)\u001b[0m\n", "\u001b[34mRequirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /opt/conda/lib/python3.8/site-packages (from aiohttp->datasets==2.8.0->-r requirements.txt (line 3)) (4.0.2)\u001b[0m\n", "\u001b[34mRequirement already satisfied: aiosignal>=1.1.2 in /opt/conda/lib/python3.8/site-packages (from 
aiohttp->datasets==2.8.0->-r requirements.txt (line 3)) (1.2.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.8/site-packages (from aiohttp->datasets==2.8.0->-r requirements.txt (line 3)) (1.8.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: charset-normalizer<3.0,>=2.0 in /opt/conda/lib/python3.8/site-packages (from aiohttp->datasets==2.8.0->-r requirements.txt (line 3)) (2.0.12)\u001b[0m\n", "\u001b[34mRequirement already satisfied: frozenlist>=1.1.1 in /opt/conda/lib/python3.8/site-packages (from aiohttp->datasets==2.8.0->-r requirements.txt (line 3)) (1.3.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: attrs>=17.3.0 in /opt/conda/lib/python3.8/site-packages (from aiohttp->datasets==2.8.0->-r requirements.txt (line 3)) (21.4.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/lib/python3.8/site-packages (from huggingface-hub<1.0.0,>=0.2.0->datasets==2.8.0->-r requirements.txt (line 3)) (4.3.0)\u001b[0m\n", "\u001b[34mRequirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /opt/conda/lib/python3.8/site-packages (from packaging>=20.0->accelerate==0.15.0->-r requirements.txt (line 1)) (3.0.9)\u001b[0m\n", "\u001b[34mRequirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/conda/lib/python3.8/site-packages (from requests>=2.19.0->datasets==2.8.0->-r requirements.txt (line 3)) (1.26.12)\u001b[0m\n", "\u001b[34mRequirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.8/site-packages (from requests>=2.19.0->datasets==2.8.0->-r requirements.txt (line 3)) (3.3)\u001b[0m\n", "\u001b[34mRequirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.8/site-packages (from requests>=2.19.0->datasets==2.8.0->-r requirements.txt (line 3)) (2022.9.24)\u001b[0m\n", "\u001b[34mRequirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.8/site-packages (from importlib-metadata->diffusers==0.11.1->-r requirements.txt (line 4)) 
(3.8.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: python-dateutil>=2.8.1 in /opt/conda/lib/python3.8/site-packages (from pandas->datasets==2.8.0->-r requirements.txt (line 3)) (2.8.2)\u001b[0m\n", "\u001b[34mRequirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.8/site-packages (from pandas->datasets==2.8.0->-r requirements.txt (line 3)) (2022.2.1)\u001b[0m\n", "\u001b[34mRequirement already satisfied: six>=1.5 in /opt/conda/lib/python3.8/site-packages (from python-dateutil>=2.8.1->pandas->datasets==2.8.0->-r requirements.txt (line 3)) (1.16.0)\u001b[0m\n", "\u001b[34mInstalling collected packages: bitsandbytes, sagemaker-jumpstart-script-utilities, accelerate, transformers, pytorch-fid, diffusers, datasets\u001b[0m\n", "\u001b[34mAttempting uninstall: transformers\u001b[0m\n", "\u001b[34mFound existing installation: transformers 4.17.0\u001b[0m\n", "\u001b[34mUninstalling transformers-4.17.0:\u001b[0m\n", "\u001b[34mSuccessfully uninstalled transformers-4.17.0\u001b[0m\n", "\u001b[34mAttempting uninstall: datasets\u001b[0m\n", "\u001b[34mFound existing installation: datasets 1.18.4\u001b[0m\n", "\u001b[34mUninstalling datasets-1.18.4:\u001b[0m\n", "\u001b[34mSuccessfully uninstalled datasets-1.18.4\u001b[0m\n", "\u001b[34mSuccessfully installed accelerate-0.15.0 bitsandbytes-0.38.1 datasets-2.8.0 diffusers-0.11.1 pytorch-fid-0.3.0 sagemaker-jumpstart-script-utilities-1.1.4 transformers-4.25.1\u001b[0m\n", "\u001b[34mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. 
It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\n", "\u001b[34m[notice] A new release of pip available: 22.2.2 -> 23.1.2\u001b[0m\n", "\u001b[34m[notice] To update, run: pip install --upgrade pip\u001b[0m\n", "\u001b[34m2023-05-28 07:27:17,403 sagemaker-training-toolkit INFO Waiting for the process to finish and give a return code.\u001b[0m\n", "\u001b[34m2023-05-28 07:27:17,404 sagemaker-training-toolkit INFO Done waiting for a return code. Received 0 from exiting process.\u001b[0m\n", "\u001b[34m2023-05-28 07:27:17,468 sagemaker-training-toolkit INFO Invoking user script\u001b[0m\n", "\u001b[34mTraining Env:\u001b[0m\n", "\u001b[34m{\n", " \"additional_framework_parameters\": {},\n", " \"channel_input_dirs\": {\n", " \"model\": \"/opt/ml/input/data/model\",\n", " \"training\": \"/opt/ml/input/data/training\"\n", " },\n", " \"current_host\": \"algo-1\",\n", " \"current_instance_group\": \"homogeneousCluster\",\n", " \"current_instance_group_hosts\": [\n", " \"algo-1\"\n", " ],\n", " \"current_instance_type\": \"ml.g5.xlarge\",\n", " \"distribution_hosts\": [],\n", " \"distribution_instance_groups\": [],\n", " \"framework_module\": \"sagemaker_pytorch_container.training:main\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ],\n", " \"hyperparameters\": {\n", " \"adam_beta1\": \"0.9\",\n", " \"adam_beta2\": \"0.999\",\n", " \"adam_epsilon\": \"1e-08\",\n", " \"adam_weight_decay\": \"0.01\",\n", " \"batch_size\": \"1\",\n", " \"center_crop\": \"False\",\n", " \"compute_fid\": \"False\",\n", " \"epochs\": \"20\",\n", " \"gradient_accumulation_steps\": \"1\",\n", " \"learning_rate\": \"2e-06\",\n", " \"lr_scheduler\": \"constant\",\n", " \"max_grad_norm\": \"1.0\",\n", " \"max_steps\": \"400\",\n", " \"num_class_images\": \"100\",\n", " \"prior_loss_weight\": \"1.0\",\n", " \"seed\": \"0\",\n", " \"with_prior_preservation\": true\n", " },\n", " \"input_config_dir\": \"/opt/ml/input/config\",\n", " \"input_data_config\": 
{\n", " \"model\": {\n", " \"ContentType\": \"application/x-sagemaker-model\",\n", " \"TrainingInputMode\": \"File\",\n", " \"S3DistributionType\": \"FullyReplicated\",\n", " \"RecordWrapperType\": \"None\"\n", " },\n", " \"training\": {\n", " \"TrainingInputMode\": \"File\",\n", " \"S3DistributionType\": \"FullyReplicated\",\n", " \"RecordWrapperType\": \"None\"\n", " }\n", " },\n", " \"input_dir\": \"/opt/ml/input\",\n", " \"instance_groups\": [\n", " \"homogeneousCluster\"\n", " ],\n", " \"instance_groups_dict\": {\n", " \"homogeneousCluster\": {\n", " \"instance_group_name\": \"homogeneousCluster\",\n", " \"instance_type\": \"ml.g5.xlarge\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ]\n", " }\n", " },\n", " \"is_hetero\": false,\n", " \"is_master\": true,\n", " \"is_modelparallel_enabled\": null,\n", " \"job_name\": \"jumpstart-example-model-txt2img-stabili-2023-05-28-07-21-53-037\",\n", " \"log_level\": 20,\n", " \"master_hostname\": \"algo-1\",\n", " \"model_dir\": \"/opt/ml/model\",\n", " \"module_dir\": \"s3://jumpstart-cache-prod-us-west-2/source-directory-tarballs/stabilityai/transfer_learning/txt2img/prepack/v1.0.3/sourcedir.tar.gz\",\n", " \"module_name\": \"transfer_learning\",\n", " \"network_interface_name\": \"eth0\",\n", " \"num_cpus\": 4,\n", " \"num_gpus\": 1,\n", " \"output_data_dir\": \"/opt/ml/output/data\",\n", " \"output_dir\": \"/opt/ml/output\",\n", " \"output_intermediate_dir\": \"/opt/ml/output/intermediate\",\n", " \"resource_config\": {\n", " \"current_host\": \"algo-1\",\n", " \"current_instance_type\": \"ml.g5.xlarge\",\n", " \"current_group_name\": \"homogeneousCluster\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ],\n", " \"instance_groups\": [\n", " {\n", " \"instance_group_name\": \"homogeneousCluster\",\n", " \"instance_type\": \"ml.g5.xlarge\",\n", " \"hosts\": [\n", " \"algo-1\"\n", " ]\n", " }\n", " ],\n", " \"network_interface_name\": \"eth0\"\n", " },\n", " \"user_entry_point\": \"transfer_learning.py\"\u001b[0m\n", 
"\u001b[34m}\u001b[0m\n", "\u001b[34mEnvironment variables:\u001b[0m\n", "\u001b[34mSM_HOSTS=[\"algo-1\"]\u001b[0m\n", "\u001b[34mSM_NETWORK_INTERFACE_NAME=eth0\u001b[0m\n", "\u001b[34mSM_HPS={\"adam_beta1\":\"0.9\",\"adam_beta2\":\"0.999\",\"adam_epsilon\":\"1e-08\",\"adam_weight_decay\":\"0.01\",\"batch_size\":\"1\",\"center_crop\":\"False\",\"compute_fid\":\"False\",\"epochs\":\"20\",\"gradient_accumulation_steps\":\"1\",\"learning_rate\":\"2e-06\",\"lr_scheduler\":\"constant\",\"max_grad_norm\":\"1.0\",\"max_steps\":\"400\",\"num_class_images\":\"100\",\"prior_loss_weight\":\"1.0\",\"seed\":\"0\",\"with_prior_preservation\":true}\u001b[0m\n", "\u001b[34mSM_USER_ENTRY_POINT=transfer_learning.py\u001b[0m\n", "\u001b[34mSM_FRAMEWORK_PARAMS={}\u001b[0m\n", "\u001b[34mSM_RESOURCE_CONFIG={\"current_group_name\":\"homogeneousCluster\",\"current_host\":\"algo-1\",\"current_instance_type\":\"ml.g5.xlarge\",\"hosts\":[\"algo-1\"],\"instance_groups\":[{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.g5.xlarge\"}],\"network_interface_name\":\"eth0\"}\u001b[0m\n", "\u001b[34mSM_INPUT_DATA_CONFIG={\"model\":{\"ContentType\":\"application/x-sagemaker-model\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"training\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}}\u001b[0m\n", "\u001b[34mSM_OUTPUT_DATA_DIR=/opt/ml/output/data\u001b[0m\n", "\u001b[34mSM_CHANNELS=[\"model\",\"training\"]\u001b[0m\n", "\u001b[34mSM_CURRENT_HOST=algo-1\u001b[0m\n", "\u001b[34mSM_CURRENT_INSTANCE_TYPE=ml.g5.xlarge\u001b[0m\n", "\u001b[34mSM_CURRENT_INSTANCE_GROUP=homogeneousCluster\u001b[0m\n", "\u001b[34mSM_CURRENT_INSTANCE_GROUP_HOSTS=[\"algo-1\"]\u001b[0m\n", "\u001b[34mSM_INSTANCE_GROUPS=[\"homogeneousCluster\"]\u001b[0m\n", 
"\u001b[34mSM_INSTANCE_GROUPS_DICT={\"homogeneousCluster\":{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.g5.xlarge\"}}\u001b[0m\n", "\u001b[34mSM_DISTRIBUTION_INSTANCE_GROUPS=[]\u001b[0m\n", "\u001b[34mSM_IS_HETERO=false\u001b[0m\n", "\u001b[34mSM_MODULE_NAME=transfer_learning\u001b[0m\n", "\u001b[34mSM_LOG_LEVEL=20\u001b[0m\n", "\u001b[34mSM_FRAMEWORK_MODULE=sagemaker_pytorch_container.training:main\u001b[0m\n", "\u001b[34mSM_INPUT_DIR=/opt/ml/input\u001b[0m\n", "\u001b[34mSM_INPUT_CONFIG_DIR=/opt/ml/input/config\u001b[0m\n", "\u001b[34mSM_OUTPUT_DIR=/opt/ml/output\u001b[0m\n", "\u001b[34mSM_NUM_CPUS=4\u001b[0m\n", "\u001b[34mSM_NUM_GPUS=1\u001b[0m\n", "\u001b[34mSM_MODEL_DIR=/opt/ml/model\u001b[0m\n", "\u001b[34mSM_MODULE_DIR=s3://jumpstart-cache-prod-us-west-2/source-directory-tarballs/stabilityai/transfer_learning/txt2img/prepack/v1.0.3/sourcedir.tar.gz\u001b[0m\n", "\u001b[34mSM_TRAINING_ENV={\"additional_framework_parameters\":{},\"channel_input_dirs\":{\"model\":\"/opt/ml/input/data/model\",\"training\":\"/opt/ml/input/data/training\"},\"current_host\":\"algo-1\",\"current_instance_group\":\"homogeneousCluster\",\"current_instance_group_hosts\":[\"algo-1\"],\"current_instance_type\":\"ml.g5.xlarge\",\"distribution_hosts\":[],\"distribution_instance_groups\":[],\"framework_module\":\"sagemaker_pytorch_container.training:main\",\"hosts\":[\"algo-1\"],\"hyperparameters\":{\"adam_beta1\":\"0.9\",\"adam_beta2\":\"0.999\",\"adam_epsilon\":\"1e-08\",\"adam_weight_decay\":\"0.01\",\"batch_size\":\"1\",\"center_crop\":\"False\",\"compute_fid\":\"False\",\"epochs\":\"20\",\"gradient_accumulation_steps\":\"1\",\"learning_rate\":\"2e-06\",\"lr_scheduler\":\"constant\",\"max_grad_norm\":\"1.0\",\"max_steps\":\"400\",\"num_class_images\":\"100\",\"prior_loss_weight\":\"1.0\",\"seed\":\"0\",\"with_prior_preservation\":true},\"input_config_dir\":\"/opt/ml/input/config\",\"input_data_config\":{\"model\":{\"ContentType\":\"applic
ation/x-sagemaker-model\",\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"},\"training\":{\"RecordWrapperType\":\"None\",\"S3DistributionType\":\"FullyReplicated\",\"TrainingInputMode\":\"File\"}},\"input_dir\":\"/opt/ml/input\",\"instance_groups\":[\"homogeneousCluster\"],\"instance_groups_dict\":{\"homogeneousCluster\":{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.g5.xlarge\"}},\"is_hetero\":false,\"is_master\":true,\"is_modelparallel_enabled\":null,\"job_name\":\"jumpstart-example-model-txt2img-stabili-2023-05-28-07-21-53-037\",\"log_level\":20,\"master_hostname\":\"algo-1\",\"model_dir\":\"/opt/ml/model\",\"module_dir\":\"s3://jumpstart-cache-prod-us-west-2/source-directory-tarballs/stabilityai/transfer_learning/txt2img/prepack/v1.0.3/sourcedir.tar.gz\",\"module_name\":\"transfer_learning\",\"network_interface_name\":\"eth0\",\"num_cpus\":4,\"num_gpus\":1,\"output_data_dir\":\"/opt/ml/output/data\",\"output_dir\":\"/opt/ml/output\",\"output_intermediate_dir\":\"/opt/ml/output/intermediate\",\"resource_config\":{\"current_group_name\":\"homogeneousCluster\",\"current_host\":\"algo-1\",\"current_instance_type\":\"ml.g5.xlarge\",\"hosts\":[\"algo-1\"],\"instance_groups\":[{\"hosts\":[\"algo-1\"],\"instance_group_name\":\"homogeneousCluster\",\"instance_type\":\"ml.g5.xlarge\"}],\"network_interface_name\":\"eth0\"},\"user_entry_point\":\"transfer_learning.py\"}\u001b[0m\n", 
"\u001b[34mSM_USER_ARGS=[\"--adam_beta1\",\"0.9\",\"--adam_beta2\",\"0.999\",\"--adam_epsilon\",\"1e-08\",\"--adam_weight_decay\",\"0.01\",\"--batch_size\",\"1\",\"--center_crop\",\"False\",\"--compute_fid\",\"False\",\"--epochs\",\"20\",\"--gradient_accumulation_steps\",\"1\",\"--learning_rate\",\"2e-06\",\"--lr_scheduler\",\"constant\",\"--max_grad_norm\",\"1.0\",\"--max_steps\",\"400\",\"--num_class_images\",\"100\",\"--prior_loss_weight\",\"1.0\",\"--seed\",\"0\",\"--with_prior_preservation\",\"True\"]\u001b[0m\n", "\u001b[34mSM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate\u001b[0m\n", "\u001b[34mSM_CHANNEL_MODEL=/opt/ml/input/data/model\u001b[0m\n", "\u001b[34mSM_CHANNEL_TRAINING=/opt/ml/input/data/training\u001b[0m\n", "\u001b[34mSM_HP_ADAM_BETA1=0.9\u001b[0m\n", "\u001b[34mSM_HP_ADAM_BETA2=0.999\u001b[0m\n", "\u001b[34mSM_HP_ADAM_EPSILON=1e-08\u001b[0m\n", "\u001b[34mSM_HP_ADAM_WEIGHT_DECAY=0.01\u001b[0m\n", "\u001b[34mSM_HP_BATCH_SIZE=1\u001b[0m\n", "\u001b[34mSM_HP_CENTER_CROP=False\u001b[0m\n", "\u001b[34mSM_HP_COMPUTE_FID=False\u001b[0m\n", "\u001b[34mSM_HP_EPOCHS=20\u001b[0m\n", "\u001b[34mSM_HP_GRADIENT_ACCUMULATION_STEPS=1\u001b[0m\n", "\u001b[34mSM_HP_LEARNING_RATE=2e-06\u001b[0m\n", "\u001b[34mSM_HP_LR_SCHEDULER=constant\u001b[0m\n", "\u001b[34mSM_HP_MAX_GRAD_NORM=1.0\u001b[0m\n", "\u001b[34mSM_HP_MAX_STEPS=400\u001b[0m\n", "\u001b[34mSM_HP_NUM_CLASS_IMAGES=100\u001b[0m\n", "\u001b[34mSM_HP_PRIOR_LOSS_WEIGHT=1.0\u001b[0m\n", "\u001b[34mSM_HP_SEED=0\u001b[0m\n", "\u001b[34mSM_HP_WITH_PRIOR_PRESERVATION=true\u001b[0m\n", "\u001b[34mPYTHONPATH=/opt/ml/code:/opt/conda/bin:/opt/conda/lib/python38.zip:/opt/conda/lib/python3.8:/opt/conda/lib/python3.8/lib-dynload:/opt/conda/lib/python3.8/site-packages:/opt/conda/lib/python3.8/site-packages/smdebug-1.0.22b20220929-py3.8.egg:/opt/conda/lib/python3.8/site-packages/pyinstrument-3.4.2-py3.8.egg:/opt/conda/lib/python3.8/site-packages/pyinstrument_cext-0.2.4-py3.8-linux-x86_64.egg\u001b[0m\n", 
"\u001b[34mInvoking script with the following command:\u001b[0m\n", "\u001b[34m/opt/conda/bin/python3.8 transfer_learning.py --adam_beta1 0.9 --adam_beta2 0.999 --adam_epsilon 1e-08 --adam_weight_decay 0.01 --batch_size 1 --center_crop False --compute_fid False --epochs 20 --gradient_accumulation_steps 1 --learning_rate 2e-06 --lr_scheduler constant --max_grad_norm 1.0 --max_steps 400 --num_class_images 100 --prior_loss_weight 1.0 --seed 0 --with_prior_preservation True\u001b[0m\n", "\u001b[34m[2023-05-28 07:27:18.720: W smdistributed/modelparallel/torch/nn/predefined_hooks.py:68] Found unsupported HuggingFace version 4.25.1 for automated tensor parallelism. HuggingFace modules will not be automatically distributed. You can use smp.tp_register_with_module API to register desired modules for tensor parallelism, or directly instantiate an smp.nn.DistributedModule. Supported HuggingFace transformers versions for automated tensor parallelism: ['4.17.0', '4.20.1', '4.21.0']\u001b[0m\n", "\u001b[34mRunning training scripts with arguments: Namespace(adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, adam_weight_decay=0.01, batch_size=1, center_crop=False, class_data_dir='class_data_dir', compute_fid=False, epochs=20, gradient_accumulation_steps=1, gradient_checkpointing=True, learning_rate=2e-06, local_rank=-1, lr_scheduler='constant', lr_warmup_steps=500, max_grad_norm=1.0, max_steps=400, mixed_precision=None, model_dir='/opt/ml/model', num_class_images=100, pretrained_model='/opt/ml/input/data/model', prior_generation_precision=None, prior_loss_weight=1.0, scale_lr=False, seed=0, train='/opt/ml/input/data/training', train_alt=None, train_text_encoder=False, use_8bit_adam=True, with_prior_preservation=True).\u001b[0m\n", "\u001b[34mIgnoring unrecognized arguments: [].\u001b[0m\n", "\u001b[34mGenerating class images\u001b[0m\n", "\u001b[34mNumber of class images to sample: 100.\u001b[0m\n", "\u001b[34mGenerating class images: 0%| | 0/100 [00:00" ] }, "metadata": { 
"needs_background": "light" }, "output_type": "display_data" } ], "source": [ "text = \"a photo of a Doppler dog with a hat\"\n", "query_response = query(finetuned_predictor, text)\n", "img, prmpt = parse_response(query_response)\n", "display_img_and_prompt(img, prmpt)" ] }, { "cell_type": "markdown", "id": "944a6f0f", "metadata": {}, "source": [ "All the parameters mentioned in [2.4. Supported Inference parameters](#2.4.-Supported-Inference-parameters) are also supported by the fine-tuned model. You can also change `accept` to receive compressed image output, as described in [2.5. Compressed Image Output](#2.5.-Compressed-Image-Output)." ] }, { "cell_type": "markdown", "id": "f3381a2c", "metadata": {}, "source": [ "---\n", "Next, delete the endpoint corresponding to the fine-tuned model.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 34, "id": "b03c8594", "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:sagemaker:Deleting model with name: sagemaker-jumpstart-2023-05-29-04-21-26-310\n", "INFO:sagemaker:Deleting endpoint configuration with name: jumpstart-example-FT-model-txt2img-stab-2023-05-29-04-21-26-310\n", "INFO:sagemaker:Deleting endpoint with name: jumpstart-example-FT-model-txt2img-stab-2023-05-29-04-21-26-310\n" ] } ], "source": [ "# Delete the SageMaker model and endpoint\n", "finetuned_predictor.delete_model()\n", "finetuned_predictor.delete_endpoint()" ] }, { "cell_type": "markdown", "id": "a504c9ac", "metadata": {}, "source": [ "## 3. Conclusion\n", "---\n", "\n", "Creating impressive images is useful in industries ranging from art to NFTs, and today we also expect AI to be personalizable. JumpStart provides fine-tuning for its pre-trained models, so you can adapt a model to your own use case with as few as five training images. This is useful for creating art, logos, custom designs, NFTs, and so on, and also for fun tasks such as generating custom AI images of your pets or avatars of yourself. In this lab, you learned how to fine-tune a Stable Diffusion text-to-image generation model. 
To learn more about Stable Diffusion fine-tuning, see the blog post [Fine-tune text-to-image Stable Diffusion models with Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/fine-tune-text-to-image-stable-diffusion-models-with-amazon-sagemaker-jumpstart/)." ] }, { "cell_type": "code", "execution_count": null, "id": "e5238f67-d001-441b-b054-129270e3a641", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "availableInstances": [ { "_defaultOrder": 0, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.t3.medium", "vcpuNum": 2 }, { "_defaultOrder": 1, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.t3.large", "vcpuNum": 2 }, { "_defaultOrder": 2, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.t3.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 3, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.t3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 4, "_isFastLaunch": true, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5.large", "vcpuNum": 2 }, { "_defaultOrder": 5, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 6, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 7, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 8, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": 
"ml.m5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 9, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 10, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 11, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 12, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.m5d.large", "vcpuNum": 2 }, { "_defaultOrder": 13, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.m5d.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 14, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.m5d.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 15, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.m5d.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 16, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.m5d.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 17, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.m5d.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 18, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.m5d.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 19, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.m5d.24xlarge", "vcpuNum": 
96 }, { "_defaultOrder": 20, "_isFastLaunch": false, "category": "General purpose", "gpuNum": 0, "hideHardwareSpecs": true, "memoryGiB": 0, "name": "ml.geospatial.interactive", "supportedImageNames": [ "sagemaker-geospatial-v1-0" ], "vcpuNum": 0 }, { "_defaultOrder": 21, "_isFastLaunch": true, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 4, "name": "ml.c5.large", "vcpuNum": 2 }, { "_defaultOrder": 22, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 8, "name": "ml.c5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 23, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.c5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 24, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.c5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 25, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 72, "name": "ml.c5.9xlarge", "vcpuNum": 36 }, { "_defaultOrder": 26, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 96, "name": "ml.c5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 27, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 144, "name": "ml.c5.18xlarge", "vcpuNum": 72 }, { "_defaultOrder": 28, "_isFastLaunch": false, "category": "Compute optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.c5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 29, "_isFastLaunch": true, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g4dn.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 30, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, 
"memoryGiB": 32, "name": "ml.g4dn.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 31, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g4dn.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 32, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g4dn.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 33, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g4dn.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 34, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g4dn.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 35, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 61, "name": "ml.p3.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 36, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 244, "name": "ml.p3.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 37, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 488, "name": "ml.p3.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 38, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.p3dn.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 39, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.r5.large", "vcpuNum": 2 }, { "_defaultOrder": 40, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.r5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 41, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, 
"hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.r5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 42, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.r5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 43, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.r5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 44, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.r5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 45, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 512, "name": "ml.r5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 46, "_isFastLaunch": false, "category": "Memory Optimized", "gpuNum": 0, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.r5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 47, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 16, "name": "ml.g5.xlarge", "vcpuNum": 4 }, { "_defaultOrder": 48, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 32, "name": "ml.g5.2xlarge", "vcpuNum": 8 }, { "_defaultOrder": 49, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 64, "name": "ml.g5.4xlarge", "vcpuNum": 16 }, { "_defaultOrder": 50, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 128, "name": "ml.g5.8xlarge", "vcpuNum": 32 }, { "_defaultOrder": 51, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 1, "hideHardwareSpecs": false, "memoryGiB": 256, "name": "ml.g5.16xlarge", "vcpuNum": 64 }, { "_defaultOrder": 52, "_isFastLaunch": false, "category": "Accelerated computing", 
"gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 192, "name": "ml.g5.12xlarge", "vcpuNum": 48 }, { "_defaultOrder": 53, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 4, "hideHardwareSpecs": false, "memoryGiB": 384, "name": "ml.g5.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 54, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 768, "name": "ml.g5.48xlarge", "vcpuNum": 192 }, { "_defaultOrder": 55, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4d.24xlarge", "vcpuNum": 96 }, { "_defaultOrder": 56, "_isFastLaunch": false, "category": "Accelerated computing", "gpuNum": 8, "hideHardwareSpecs": false, "memoryGiB": 1152, "name": "ml.p4de.24xlarge", "vcpuNum": 96 } ], "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science 2.0)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-38" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "pycharm": { "stem_cell": { "cell_type": "raw", "metadata": { "collapsed": false }, "source": [] } } }, "nbformat": 4, "nbformat_minor": 5 }