{ "cells": [ { "cell_type": "markdown", "id": "25e3e8e5", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "### SageMaker Stable Diffusion Quick Kit - Inference Deployment\n", " [SageMaker Stable Diffusion Quick Kit](https://github.com/aws-samples/sagemaker-stablediffusion-quick-kit) provides ready-to-use code and configuration files that help customers quickly build a Stable Diffusion AI image-generation service on AWS with Amazon SageMaker, Lambda, and CloudFront.\n", " \n", " ![Architecture](https://raw.githubusercontent.com/aws-samples/sagemaker-stablediffusion-quick-kit/main/images/architecture.png)\n", "\n", "\n", "#### Prerequisites\n", "1. An AWS account\n", "2. ml.g4dn.xlarge or ml.g5.xlarge instances are recommended\n", "\n", "### Notebook deployment steps\n", "1. Upgrade boto3 and the SageMaker Python SDK\n", "2. Deploy the AIGC inference service\n", "   * Configure model parameters\n", "   * Configure asynchronous inference\n", "   * Deploy the SageMaker endpoint\n", "3. Test the model\n", "4. Configure an auto scaling policy for the inference service (optional)\n", "5. Clean up resources\n" ] }, { "cell_type": "markdown", "id": "ee4f4c43", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "### 1. Upgrade boto3 and the SageMaker Python SDK" ] }, { "cell_type": "code", "execution_count": null, "id": "04812f1f", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "!pip install --upgrade boto3 sagemaker" ] }, { "cell_type": "code", "execution_count": null, "id": "96c1e0a6", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "import time\n", "import boto3\n", "import sagemaker\n", "account_id = boto3.client('sts').get_caller_identity().get('Account')\n", "region_name = boto3.session.Session().region_name\n", "\n", "sagemaker_session = sagemaker.Session()\n", "bucket = sagemaker_session.default_bucket()\n", "role = sagemaker.get_execution_role()\n", "\n", "print(f\"role: {role}\")\n", "print(f\"bucket: {bucket}\")" ] }, { "cell_type": "markdown", "id": "cc41f63f", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "### 2. 
Deploy the AIGC inference service" ] }, { "cell_type": "markdown", "id": "c039a56c", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "#### 2.1 Configure model parameters\n", " * model_name: supports models in the Hugging Face diffusers layout\n", "     * A Hugging Face model name can be used directly, e.g. Linaqruf/anything-v3.0\n", "     * An S3 URI also works, e.g. s3://sagemaker-us-east-1-123456789011/dreambooth/trained_models/model.tar.gz\n", "     * The .ckpt (single checkpoint) format is not currently supported; use a conversion script to convert it to the diffusers format\n", " * model_args: diffusers StableDiffusionPipeline init arguments\n", " * framework_version: PyTorch version\n", " * py_version: Python version\n", " * model_environment: environment variables for the inference container" ] }, { "cell_type": "code", "execution_count": null, "id": "a09f2b8f", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "\n", "#model_name = 'andite/anything-v4.0' # default; high-quality, highly detailed anime style\n", "#model_name = 'Envvi/Inkpunk-Diffusion' # Inkpunk style; trigger word: nvinkpunk\n", "#model_name = 'nousr/robo-diffusion-2-base' # cool-looking robots; trigger word: nousr robot\n", "#model_name = 'prompthero/openjourney' # Openjourney style; trigger word: mdjrny-v4 style\n", "#model_name = 'dreamlike-art/dreamlike-photoreal-2.0' # photorealistic style; trigger word: photo\n", "#model_name = 'runwayml/stable-diffusion-inpainting'\n", "#model_name = 'danbrown/RPG-v4' # RPG (role-playing game) style\n", "model_name = 'runwayml/stable-diffusion-v1-5' # standard Stable Diffusion 1.5\n", "\n", "\n", "\n", "# Example of loading an SD WebUI LoRA model\n", "# lora_model: first upload the corresponding LoRA model to an S3 bucket in your own account\n", "#lora_model = f's3://{bucket}/fakemonPokMonLORA/fakemonPokMonLORA_v10Beta.safetensors'\n", "\n", "\n", "\n", "framework_version = '1.10'\n", "py_version = 'py38'\n", "\n", "model_environment = {\n", "    'SAGEMAKER_MODEL_SERVER_TIMEOUT': '600',\n", "    'SAGEMAKER_MODEL_SERVER_WORKERS': '1',\n", "    'model_name': model_name,\n", "    #'lora_model': lora_model,  # uncomment to enable LoRA\n", "    's3_bucket': bucket\n", "}" ] }, { "cell_type": "markdown", "id": "df6274b0", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "#### 2.2 Create a dummy model_data file (the actual model is loaded by inference.py) and create a PyTorchModel for the SageMaker endpoint" ] }, { "cell_type": 
"code", "execution_count": null, "id": "a01e5c39", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "!touch dummy\n", "!tar czvf model.tar.gz dummy sagemaker-logo-small.png\n", "assets_dir = 's3://{0}/{1}/assets/'.format(bucket, 'stablediffusion')\n", "model_data = 's3://{0}/{1}/assets/model.tar.gz'.format(bucket, 'stablediffusion')\n", "!aws s3 cp model.tar.gz $assets_dir\n", "!rm -f dummy model.tar.gz" ] }, { "cell_type": "code", "execution_count": null, "id": "f895ce27", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "from sagemaker.pytorch.model import PyTorchModel\n", "\n", "model = PyTorchModel(\n", " name = None,\n", " model_data = model_data,\n", " entry_point = 'inference.py',\n", " source_dir = \"./code/\",\n", " role = role,\n", " framework_version = framework_version, \n", " py_version = py_version,\n", " env = model_environment\n", ")" ] }, { "cell_type": "markdown", "id": "9f9cc05d", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "#### 2.3 配置异步推理、设置推理使用的实例类型\n" ] }, { "cell_type": "code", "execution_count": null, "id": "78b6bfab", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "from sagemaker.async_inference import AsyncInferenceConfig\n", "import uuid\n", "\n", "endpoint_name = f'AIGC-Quick-Kit-{str(uuid.uuid4())}'\n", "instance_type = 'ml.g5.xlarge'\n", "instance_count = 1\n", "async_config = AsyncInferenceConfig(output_path='s3://{0}/{1}/asyncinvoke/out/'.format(bucket, 'stablediffusion'))\n", "\n", "print(f'endpoint_name: {endpoint_name}')" ] }, { "cell_type": "markdown", "id": "5692d24e", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "#### 2.4 部署SageMaker Endpoint" ] }, { "cell_type": "code", "execution_count": null, "id": "72c5e717", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "from sagemaker.serializers import JSONSerializer\n", "from 
sagemaker.deserializers import JSONDeserializer\n", "\n", "\n", "async_predictor = model.deploy(\n", "    endpoint_name = endpoint_name,\n", "    instance_type = instance_type,\n", "    initial_instance_count = instance_count,\n", "    async_inference_config = async_config,\n", "    serializer = JSONSerializer(),\n", "    deserializer = JSONDeserializer(),\n", "    #wait=False\n", ")" ] }, { "cell_type": "markdown", "id": "ab9cbef2", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "\n", "#### 2.5 Write helper functions for asynchronous inference calls (for use in the notebook)\n", " * get_bucket_and_key: split an S3 URI into its bucket and key\n", " * draw_image: download the generated images from S3 and draw them in the notebook\n", " * async_predict_fn: submit an asynchronous request, wait for the result, and draw it\n" ] }, { "cell_type": "code", "execution_count": null, "id": "85546a90", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "import json\n", "import io\n", "from PIL import Image\n", "import traceback\n", "import time\n", "from sagemaker.async_inference.waiter_config import WaiterConfig\n", "\n", "\n", "s3_resource = boto3.resource('s3')\n", "\n", "def get_bucket_and_key(s3uri):\n", "    # Skip the 's3://' prefix (5 characters), then split at the first '/'\n", "    pos = s3uri.find('/', 5)\n", "    bucket = s3uri[5 : pos]\n", "    key = s3uri[pos + 1 : ]\n", "    return bucket, key\n", "\n", "def draw_image(response):\n", "    try:\n", "        # The async output object is a JSON document whose 'result' lists image S3 URIs\n", "        bucket, key = get_bucket_and_key(response.output_path)\n", "        obj = s3_resource.Object(bucket, key)\n", "        body = obj.get()['Body'].read().decode('utf-8')\n", "        predictions = json.loads(body)['result']\n", "        print(predictions)\n", "        for prediction in predictions:\n", "            bucket, key = get_bucket_and_key(prediction)\n", "            obj = s3_resource.Object(bucket, key)\n", "            image_bytes = obj.get()['Body'].read()\n", "            image = Image.open(io.BytesIO(image_bytes))\n", "            image.show()\n", "    except Exception as e:\n", "        traceback.print_exc()\n", "        print(e)\n", "\n", "\n", "def async_predict_fn(predictor,inputs):\n", " response = predictor.predict_async(inputs)\n", " \n", " print(f\"Response object: {response}\")\n", " print(f\"Response output path: 
{response.output_path}\")\n", " print(\"Start polling to get response:\")\n", " \n", " start = time.time()\n", " config = WaiterConfig(\n", "     max_attempts=100, # number of attempts\n", "     delay=10 # time in seconds to wait between attempts\n", " )\n", "\n", " response.get_result(config)\n", " draw_image(response)\n", "\n", " print(f\"Time taken: {time.time() - start}s\")" ] }, { "cell_type": "markdown", "id": "b43860d1", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "### 3. Test\n", "#### 3.1 txt2img: text-to-image inference" ] }, { "cell_type": "code", "execution_count": null, "id": "cb56166e", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "#AIGC Quick Kit txt2img\n", "inputs_txt2img = {\n", "    \"prompt\": \"a photo of an astronaut riding a horse on mars\",\n", "    \"negative_prompt\":\"\",\n", "    \"steps\":20,\n", "    \"sampler\":\"euler_a\",\n", "    \"seed\": 52362,\n", "    \"height\": 512,\n", "    \"width\": 512,\n", "    \"count\":2\n", "\n", "}\n", "start=time.time()\n", "async_predict_fn(async_predictor,inputs_txt2img)\n", "print(f\"Time taken: {time.time() - start}s\")" ] }, { "cell_type": "markdown", "id": "e0ecd78c", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "#### 3.3 img2img: image-to-image inference\n", " \n", " * Original image: ![](https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg)" ] }, { "cell_type": "code", "execution_count": null, "id": "4006b47a", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "#AIGC Quick Kit img2img\n", "# Image-to-image inference\n", "inputs_img2img = {\n", " \"prompt\": \"A fantasy landscape, trending on artstation\",\n", " \"negative_prompt\":\"\",\n", " \"steps\":20,\n", " \"sampler\":\"euler_a\",\n", " \"seed\":43768,\n", " \"height\": 512, \n", " \"width\": 512,\n", " \"count\":2,\n", " 
\"input_image\":\"https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg\"\n", " #\"input_image\":\"https://img.getimg.ai/inputs/img-Rj5vsMBFWrshn7cFwneVI.png\"\n", "\n", "}\n", "\n", "async_predict_fn(async_predictor,inputs_img2img)" ] }, { "cell_type": "markdown", "id": "d0877bea", "metadata": {}, "source": [ "#### 3.2 LoRA test\n", "Test the LoRA model. This assumes `lora_model` was enabled in `model_environment` before the endpoint was deployed." ] }, { "cell_type": "code", "execution_count": null, "id": "c350d596", "metadata": {}, "outputs": [], "source": [ "prompt =\"pokemon,fire, a red wolf with blue eyes\"\n", "\n", "\n", "negative_prompt= \"nsfw, human, 1boy, 1girl,watermark, (worst quality, low quality:1.4), ( jpeg artifacts:1.4), (depth of field, bokeh, blurry, film grain, chromatic aberration, lens flare:1.0), greyscale, monochrome, dusty sunbeams, trembling, motion lines, motion blur, emphasis lines, text, title, logo, signature,\"\n", "\n", "inputs_txt2img = {\n", "    \"prompt\": prompt,\n", "    \"negative_prompt\":negative_prompt,\n", "    \"steps\":20,\n", "    \"sampler\":\"euler_a\",\n", "    \"seed\": 52362,\n", "    \"height\": 512,\n", "    \"width\": 512,\n", "    \"count\":2\n", "\n", "}\n", "start=time.time()\n", "async_predict_fn(async_predictor,inputs_txt2img)\n", "print(f\"Time taken: {time.time() - start}s\")\n" ] }, { "cell_type": "markdown", "id": "5af27f2f", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "### 4. 
Configure an auto scaling policy for the inference service (optional)" ] }, { "cell_type": "code", "execution_count": null, "id": "d306e2a2", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "# application-autoscaling client\n", "asg_client = boto3.client(\"application-autoscaling\")\n", "\n", "# This is the format in which application autoscaling references the endpoint\n", "resource_id = f\"endpoint/{async_predictor.endpoint_name}/variant/AllTraffic\"\n", "\n", "# Register the asynchronous endpoint as a scalable target (between 1 and 2 instances)\n", "response = asg_client.register_scalable_target(\n", "    ServiceNamespace=\"sagemaker\",\n", "    ResourceId=resource_id,\n", "    ScalableDimension=\"sagemaker:variant:DesiredInstanceCount\",\n", "    MinCapacity=1,\n", "    MaxCapacity=2,\n", ")\n", "\n", "# Target tracking: aim for an average backlog of 2 queued requests per instance\n", "response = asg_client.put_scaling_policy(\n", "    PolicyName=f'Request-ScalingPolicy-{async_predictor.endpoint_name}',\n", "    ServiceNamespace=\"sagemaker\",\n", "    ResourceId=resource_id,\n", "    ScalableDimension=\"sagemaker:variant:DesiredInstanceCount\",\n", "    PolicyType=\"TargetTrackingScaling\",\n", "    TargetTrackingScalingPolicyConfiguration={\n", "        \"TargetValue\": 2.0,\n", "        \"CustomizedMetricSpecification\": {\n", "            \"MetricName\": \"ApproximateBacklogSizePerInstance\",\n", "            \"Namespace\": \"AWS/SageMaker\",\n", "            \"Dimensions\": [{\"Name\": \"EndpointName\", \"Value\": async_predictor.endpoint_name}],\n", "            \"Statistic\": \"Average\",\n", "        },\n", "        \"ScaleInCooldown\": 600, # seconds to wait after a scaling activity before scaling in (down to MinCapacity)\n", "        \"ScaleOutCooldown\": 300 # seconds to wait between scale-out activities\n", "    },\n", ")" ] }, { "cell_type": "markdown", "id": "33236185", "metadata": {}, "source": [ "#### Test the scaling policy with concurrent inference requests" ] }, { "cell_type": "code", "execution_count": null, "id": "6c346b59", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "import time\n", "import random\n", "\n", "start = time.time()\n", "\n", "outputs=[]\n", "\n", "def build_prompts_with_random_seed():\n", " 
return {\n", "     \"prompt\": \"a photo of an astronaut riding a horse on mars\",\n", "     \"negative_prompt\":\"\",\n", "     \"steps\":50,\n", "     \"sampler\":\"ddim\",\n", "     \"seed\": random.randint(52362, 99999999),\n", "     \"height\": 512,\n", "     \"width\": 512,\n", "     \"count\":2\n", "\n", " }\n", "\n", "# send 10 requests\n", "for i in range(10):\n", "    prediction = async_predictor.predict_async(build_prompts_with_random_seed())\n", "    outputs.append(prediction)\n", "\n", "# iterate over the pending responses and wait for each result\n", "results = []\n", "for output in outputs:\n", "    response = output.get_result(WaiterConfig(max_attempts=600))\n", "    results.append(response)\n", "\n", "print(f\"Time taken: {time.time() - start}s\")\n", "print(results)" ] }, { "cell_type": "markdown", "id": "d294816f", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "source": [ "#### Draw the inference results\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3b29866f", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "# Separate names avoid overwriting the notebook-level `bucket` variable\n", "for r in results:\n", "    for item in r[\"result\"]:\n", "        img_bucket, img_key = get_bucket_and_key(item)\n", "        obj = s3_resource.Object(img_bucket, img_key)\n", "        image_bytes = obj.get()['Body'].read()\n", "        image = Image.open(io.BytesIO(image_bytes))\n", "        image.show()\n" ] }, { "cell_type": "code", "execution_count": null, "id": "be86866f", "metadata": { "pycharm": { "name": "#%%\n" }, "tags": [] }, "outputs": [], "source": [ "response = asg_client.deregister_scalable_target(\n", "    ServiceNamespace='sagemaker',\n", "    ResourceId=resource_id,\n", "    ScalableDimension='sagemaker:variant:DesiredInstanceCount'\n", ")\n" ] }, { "cell_type": "markdown", "id": "0e496af7", "metadata": { "pycharm": { "name": "#%% md\n" }, "tags": [] }, "source": [ "### 5. 
Clean up resources" ] }, { "cell_type": "code", "execution_count": null, "id": "a2e3df27", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "async_predictor.delete_endpoint()" ] } ], "metadata": { "kernelspec": { "display_name": "conda_pytorch_p39", "language": "python", "name": "conda_pytorch_p39" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15" } }, "nbformat": 4, "nbformat_minor": 5 }