{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "25e3e8e5",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### SageMaker Stable diffusion Quick Kit - Inference 部署\n",
    "   [SageMaker Stable Diffusion Quick Kit](https://github.com/aws-samples/sagemaker-stablediffusion-quick-kit) 提供了一组开箱即用的代码、配置文件，它可以帮助客户在亚马逊云上使用Amazon SageMaker , Lambda, Cloudfront快速构建Stable diffusion AI绘图服务.\n",
    "   \n",
    "   ![架构](https://raw.githubusercontent.com/aws-samples/sagemaker-stablediffusion-quick-kit/main/images/architecture.png)\n",
    "\n",
    "\n",
    "#### 前提条件\n",
    "1. 亚马逊云账号\n",
    "2. 建议使用ml.g4dn.xlarge/ml.g5.xlarge\n",
    "\n",
    "### Notebook部署步骤\n",
    "1. 升级boto3, sagemaker python sdk\n",
    "2. 部署AIGC推理服务\n",
    "    * 配置模型参数\n",
    "    * 配置异步推理\n",
    "    * 部署SageMaker Endpoint \n",
    "3. 测试模型\n",
    "4. 配置推理服务弹性伸缩策略(可选)\n",
    "5. 清除资源\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee4f4c43",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### 1. 升级boto3, sagemaker python sdk"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04812f1f",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "!pip install --upgrade boto3 sagemaker"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "96c1e0a6",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "import time\n",
    "import boto3\n",
    "import sagemaker\n",
    "account_id = boto3.client('sts').get_caller_identity().get('Account')\n",
    "region_name = boto3.session.Session().region_name\n",
    "\n",
    "sagemaker_session = sagemaker.Session()\n",
    "bucket = sagemaker_session.default_bucket()\n",
    "role = sagemaker.get_execution_role()\n",
    "\n",
    "print(f\"role: {role}\")\n",
    "print(f\"bucket: {bucket}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cc41f63f",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### 2. 部署AIGC推理服务"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c039a56c",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "#### 2.1 配置模型参数\n",
    "   * model_name: 支持 Huggingface diffusers models 结构，\n",
    "       * 可以直接使用Huggingface的名字:Linaqruf/anything-v3.0\n",
    "       * s3://sagemaker-us-east-1-123456789011/dreambooth/trained_models/model.tar.gz\n",
    "       * 目前不支持.ckpt(single check point format),请使用转换脚本转换为diffusers格式\n",
    "   * model_args:  diffuser StableDiffusionPipeline init arguments\n",
    "   * framework_version: pytroch版本\n",
    "   * py_version: python版本\n",
    "   * model_environment: 推理环境变量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a09f2b8f",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "\n",
    "#model_name = 'andite/anything-v4.0' # 默认的，高品质、高细节的动漫风格\n",
    "#model_name = 'Envvi/Inkpunk-Diffusion' # 温克朋克风格，提示词 nvinkpunk\n",
    "#model_name = 'nousr/robo-diffusion-2-base' # 看起来很酷的机器人，提示词 nousr robot \n",
    "#model_name = 'prompthero/openjourney' # openjorney 风格,提示词 mdjrny-v4 style\n",
    "#model_name = 'dreamlike-art/dreamlike-photoreal-2.0' #写实，真实风格，提示词 photo\n",
    "#model_name = 'runwayml/stable-diffusion-inpainting'\n",
    "#model_name = 'danbrown/RPG-v4' #RPG 角色扮演\n",
    "model_name = 'runwayml/stable-diffusion-v1-5' #标准stable diffusion 1.5\n",
    "\n",
    "\n",
    "\n",
    "#增加 SD Webui Lora 模型加载示例, \n",
    "#lora_model 请对应的LoRA模型上传到自己账号的s3桶\n",
    "#lora_model = f's3://{bucket}/fakemonPokMonLORA/fakemonPokMonLORA_v10Beta.safetensors'\n",
    "\n",
    "\n",
    "\n",
    "framework_version = '1.10'\n",
    "py_version = 'py38'\n",
    "\n",
    "model_environment = {\n",
    "    'SAGEMAKER_MODEL_SERVER_TIMEOUT':'600', \n",
    "    'SAGEMAKER_MODEL_SERVER_WORKERS': '1', \n",
    "    'model_name':model_name,\n",
    "    #'lora_model':lora_model, #开启LoRA使用\n",
    "    's3_bucket':bucket\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df6274b0",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "#### 2.2 创建dummy model_data 文件(真正的模型使用infernece.py进行加载), 为SageMaker Endpoint 创建 PyTorchModel "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a01e5c39",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "!touch dummy\n",
    "!tar czvf model.tar.gz dummy sagemaker-logo-small.png\n",
    "assets_dir = 's3://{0}/{1}/assets/'.format(bucket, 'stablediffusion')\n",
    "model_data = 's3://{0}/{1}/assets/model.tar.gz'.format(bucket, 'stablediffusion')\n",
    "!aws s3 cp model.tar.gz $assets_dir\n",
    "!rm -f dummy model.tar.gz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f895ce27",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.pytorch.model import PyTorchModel\n",
    "\n",
    "model = PyTorchModel(\n",
    "    name = None,\n",
    "    model_data = model_data,\n",
    "    entry_point = 'inference.py',\n",
    "    source_dir = \"./code/\",\n",
    "    role = role,\n",
    "    framework_version = framework_version, \n",
    "    py_version = py_version,\n",
    "    env = model_environment\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f9cc05d",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "#### 2.3 配置异步推理、设置推理使用的实例类型\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78b6bfab",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.async_inference import AsyncInferenceConfig\n",
    "import uuid\n",
    "\n",
    "endpoint_name = f'AIGC-Quick-Kit-{str(uuid.uuid4())}'\n",
    "instance_type = 'ml.g5.xlarge'\n",
    "instance_count = 1\n",
    "async_config = AsyncInferenceConfig(output_path='s3://{0}/{1}/asyncinvoke/out/'.format(bucket, 'stablediffusion'))\n",
    "\n",
    "print(f'endpoint_name: {endpoint_name}')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5692d24e",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "#### 2.4 部署SageMaker Endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72c5e717",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.serializers import JSONSerializer\n",
    "from sagemaker.deserializers import JSONDeserializer\n",
    "\n",
    "\n",
    "async_predictor = model.deploy(\n",
    "    endpoint_name = endpoint_name,\n",
    "    instance_type = instance_type, \n",
    "    initial_instance_count = instance_count,\n",
    "    async_inference_config = async_config,\n",
    "    serializer = JSONSerializer(),\n",
    "    deserializer = JSONDeserializer(),\n",
    "    #wait=False\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab9cbef2",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "\n",
    "#### 2.5 编写异步推理调用辅助方法(适用于Notebook)\n",
    " * get_bucket_and_key, read s3 object\n",
    " * draw_image, download image from s3 and draw it in notebook\n",
    " * async_predict_fn \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85546a90",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "import json\n",
    "import io\n",
    "from PIL import Image\n",
    "import traceback\n",
    "import time\n",
    "from sagemaker.async_inference.waiter_config import WaiterConfig\n",
    "\n",
    "\n",
    "s3_resource = boto3.resource('s3')\n",
    "\n",
    "def get_bucket_and_key(s3uri):\n",
    "    pos = s3uri.find('/', 5)\n",
    "    bucket = s3uri[5 : pos]\n",
    "    key = s3uri[pos + 1 : ]\n",
    "    return bucket, key\n",
    "\n",
    "def draw_image(response):\n",
    "    try:\n",
    "        bucket, key = get_bucket_and_key(response.output_path)\n",
    "        obj = s3_resource.Object(bucket, key)\n",
    "        body = obj.get()['Body'].read().decode('utf-8') \n",
    "        predictions = json.loads(body)['result']\n",
    "        print(predictions)\n",
    "        for prediction in predictions:\n",
    "            bucket, key = get_bucket_and_key(prediction)\n",
    "            obj = s3_resource.Object(bucket, key)\n",
    "            bytes = obj.get()['Body'].read()\n",
    "            image = Image.open(io.BytesIO(bytes))\n",
    "            image.show()\n",
    "    except Exception as e:\n",
    "        traceback.print_exc()\n",
    "        print(e)\n",
    "\n",
    "\n",
    "def async_predict_fn(predictor,inputs):\n",
    "    response = predictor.predict_async(inputs)\n",
    "    \n",
    "    print(f\"Response object: {response}\")\n",
    "    print(f\"Response output path: {response.output_path}\")\n",
    "    print(\"Start Polling to get response:\")\n",
    "    \n",
    "    start = time.time()\n",
    "    config = WaiterConfig(\n",
    "        max_attempts=100, #  number of attempts\n",
    "        delay=10 #  time in seconds to wait between attempts\n",
    "    )\n",
    "\n",
    "    response.get_result(config)\n",
    "    draw_image(response)\n",
    "\n",
    "    print(f\"Time taken: {time.time() - start}s\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b43860d1",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### 3. 测试\n",
    "#### 3.1 txt2img 文本到图片推理"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cb56166e",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "#AIGC Quick Kit txt2img\n",
    "inputs_txt2img = {\n",
    "    \"prompt\": \"a photo of an astronaut riding a horse on mars\",\n",
    "    \"negative_prompt\":\"\",\n",
    "    \"steps\":20,\n",
    "    \"sampler\":\"euler_a\",\n",
    "    \"seed\": 52362,\n",
    "    \"height\": 512, \n",
    "    \"width\": 512,\n",
    "    \"count\":2\n",
    "\n",
    "}\n",
    "start=time.time()\n",
    "async_predict_fn(async_predictor,inputs_txt2img)\n",
    "print(f\"Time taken: {time.time() - start}s\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0ecd78c",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "#### 3.3 img2img 图片到图片推理\n",
    "  \n",
    " * 原始图片 :![](https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4006b47a",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "#AIGC Quick Kit img2img\n",
    "# 图片到图片推理\n",
    "inputs_img2img = {\n",
    "    \"prompt\": \"A fantasy landscape, trending on artstation\",\n",
    "    \"negative_prompt\":\"\",\n",
    "    \"steps\":20,\n",
    "    \"sampler\":\"euler_a\",\n",
    "    \"seed\":43768,\n",
    "    \"height\": 512, \n",
    "    \"width\": 512,\n",
    "    \"count\":2,\n",
    "    \"input_image\":\"https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg\"\n",
    "    #\"input_image\":\"https://img.getimg.ai/inputs/img-Rj5vsMBFWrshn7cFwneVI.png\"\n",
    "\n",
    "}\n",
    "\n",
    "async_predict_fn(async_predictor,inputs_img2img)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0877bea",
   "metadata": {},
   "source": [
    "#### 3.2 LoRA  测试\n",
    "测试LoRA 模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c350d596",
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt =\"pokemon,fire, a red wolf with blue eyes\"\n",
    "\n",
    "\n",
    "negative_prompt= \"nsfw, human, 1boy, 1girl,watermark, (worst quality, low quality:1.4), ( jpeg artifacts:1.4), (depth of field, bokeh, blurry, film grain, chromatic aberration, lens flare:1.0), greyscale, monochrome, dusty sunbeams, trembling, motion lines, motion blur, emphasis lines, text, title, logo, signature,\"\n",
    "\n",
    "inputs_txt2img = {\n",
    "    \"prompt\": prompt,\n",
    "    \"negative_prompt\":negative_prompt,\n",
    "    \"steps\":20,\n",
    "    \"sampler\":\"euler_a\",\n",
    "    \"seed\": 52362,\n",
    "    \"height\": 512, \n",
    "    \"width\": 512,\n",
    "    \"count\":2\n",
    "\n",
    "}\n",
    "start=time.time()\n",
    "async_predict_fn(async_predictor,inputs_txt2img)\n",
    "print(f\"Time taken: {time.time() - start}s\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5af27f2f",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### 4. 配置推理服务弹性伸缩策略(可选)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d306e2a2",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "# application-autoscaling client\n",
    "asg_client = boto3.client(\"application-autoscaling\")\n",
    "\n",
    "# This is the format in which application autoscaling references the endpoint\n",
    "resource_id = f\"endpoint/{async_predictor.endpoint_name}/variant/AllTraffic\"\n",
    "\n",
    "# Configure Autoscaling on asynchronous endpoint down to zero instances\n",
    "response = asg_client.register_scalable_target(\n",
    "    ServiceNamespace=\"sagemaker\",\n",
    "    ResourceId=resource_id,\n",
    "    ScalableDimension=\"sagemaker:variant:DesiredInstanceCount\",\n",
    "    MinCapacity=1,\n",
    "    MaxCapacity=2,\n",
    ")\n",
    "\n",
    "response = asg_client.put_scaling_policy(\n",
    "    PolicyName=f'Request-ScalingPolicy-{async_predictor.endpoint_name}',\n",
    "    ServiceNamespace=\"sagemaker\",\n",
    "    ResourceId=resource_id,\n",
    "    ScalableDimension=\"sagemaker:variant:DesiredInstanceCount\",\n",
    "    PolicyType=\"TargetTrackingScaling\",\n",
    "    TargetTrackingScalingPolicyConfiguration={\n",
    "        \"TargetValue\": 2.0,\n",
    "        \"CustomizedMetricSpecification\": {\n",
    "            \"MetricName\": \"ApproximateBacklogSizePerInstance\",\n",
    "            \"Namespace\": \"AWS/SageMaker\",\n",
    "            \"Dimensions\": [{\"Name\": \"EndpointName\", \"Value\": async_predictor.endpoint_name}],\n",
    "            \"Statistic\": \"Average\",\n",
    "        },\n",
    "        \"ScaleInCooldown\": 600, # duration until scale in begins (down to zero)\n",
    "        \"ScaleOutCooldown\": 300 # duration between scale out attempts\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "33236185",
   "metadata": {},
   "source": [
    "#### 通过并发推理，测试伸缩策略"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6c346b59",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "import time\n",
    "import random\n",
    "\n",
    "start = time.time()\n",
    "\n",
    "outputs=[]\n",
    "\n",
    "def build_prompts_with_random_seed():\n",
    "    return {\n",
    "        \"prompt\": \"a photo of an astronaut riding a horse on mars\",\n",
    "        \"negative_prompt\":\"\",\n",
    "        \"steps\":50,\n",
    "        \"sampler\":\"ddim\",\n",
    "        \"seed\": random.randint(52362, 99999999),\n",
    "        \"height\": 512, \n",
    "        \"width\": 512,\n",
    "        \"count\":2\n",
    "\n",
    "    }\n",
    "\n",
    "# send 10 requests\n",
    "for i in range(10):\n",
    "    prediction = async_predictor.predict_async(build_prompts_with_random_seed())\n",
    "    outputs.append(prediction)\n",
    "\n",
    "# iterate over list of output paths and get results\n",
    "results = []\n",
    "for output in outputs:\n",
    "    response = output.get_result(WaiterConfig(max_attempts=600))\n",
    "    results.append(response)\n",
    "\n",
    "print(f\"Time taken: {time.time() - start}s\")\n",
    "print(results)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d294816f",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "source": [
    "#### 绘制推理结果\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b29866f",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "for r in results:\n",
    "    for item in r[\"result\"]:\n",
    "        bucket, key = get_bucket_and_key(item)\n",
    "        obj = s3_resource.Object(bucket, key)\n",
    "        bytes = obj.get()['Body'].read()\n",
    "        image = Image.open(io.BytesIO(bytes))\n",
    "        image.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "be86866f",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "response = asg_client.deregister_scalable_target(\n",
    "    ServiceNamespace='sagemaker',\n",
    "    ResourceId=resource_id,\n",
    "    ScalableDimension='sagemaker:variant:DesiredInstanceCount'\n",
    ")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0e496af7",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    },
    "tags": []
   },
   "source": [
    "### 5. 清除资源"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2e3df27",
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "async_predictor.delete_endpoint()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_pytorch_p39",
   "language": "python",
   "name": "conda_pytorch_p39"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}