# Stable Diffusion Inpainting with ClipSeg

This sample notebook deploys a Stable Diffusion inpainting pipeline together with [ClipSeg](https://huggingface.co/blog/clipseg-zero-shot) to a SageMaker endpoint.

You can generate an inpainted image without creating a mask image yourself: describe the region to mask in text instead.
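
To make the text-to-mask idea concrete, here is a minimal sketch of how per-pixel segmentation logits can be turned into a binary mask. It assumes the segmentation model returns one logit per pixel for the text query; the helper name `logits_to_mask` and the 0.5 threshold are illustrative assumptions, and the deployed pipeline performs its own version of this step internally.

```python
import math


def logits_to_mask(logits, threshold=0.5):
    """Convert a 2-D grid of logits into a binary mask (1 = repaint, 0 = keep).

    Hypothetical helper for illustration only: each logit is passed through a
    sigmoid and compared against the threshold.
    """
    mask = []
    for row in logits:
        mask.append(
            [1 if 1.0 / (1.0 + math.exp(-value)) >= threshold else 0 for value in row]
        )
    return mask


# Toy logits: positive values mean "this pixel matches the text query".
toy_logits = [
    [-4.0, -3.0, 2.5],
    [-2.0, 3.0, 4.0],
]
mask = logits_to_mask(toy_logits)
```

Pixels marked 1 are the ones the inpainting model would be asked to repaint according to the prompt.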

In [None]:
import sagemaker, boto3, json
from sagemaker import get_execution_role
from sagemaker.pytorch.model import PyTorchModel
from sagemaker.huggingface import HuggingFace
from sagemaker.pytorch import PyTorch

role = get_execution_role()
region = boto3.Session().region_name
sess = sagemaker.Session()
bucket = sess.default_bucket()

sagemaker.__version__

## Generate Scripts for Inference

Generate `sd_inpaint_clipseg_scripts/code/inference.py` and `sd_inpaint_clipseg_scripts/code/requirements.txt`, which the endpoint uses for inference.
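
The inference script defines four handlers: `model_fn` loads the model once, and `input_fn`, `predict_fn`, and `output_fn` run in that order for each request. The sketch below chains simplified stand-ins for the three per-request handlers so the JSON contract can be checked without a GPU. The `handle_request` and `echo_model` names are hypothetical, and the stand-in model just returns the input bytes instead of running Stable Diffusion.

```python
import base64
import json


def input_fn(body, content_type):
    """Parse the request body; only JSON is accepted."""
    if content_type != "application/json":
        raise TypeError("only application/json is supported")
    return json.loads(body)


def predict_fn(data, model):
    """Decode the Base64 image, call the model, and re-encode the result."""
    image_bytes = base64.b64decode(data["image"])
    result_bytes = model(image_bytes, data["prompt"], data["text"])
    return base64.b64encode(result_bytes).decode()


def output_fn(prediction, accept_type):
    """Wrap the Base64 string in the JSON response the client expects."""
    if accept_type != "application/json":
        raise TypeError("only application/json is supported")
    return json.dumps({"generated_image": prediction})


def handle_request(body, model, content_type="application/json",
                   accept_type="application/json"):
    """Hypothetical driver that mimics the per-request handler order."""
    data = input_fn(body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept_type)


def echo_model(image_bytes, prompt, text):
    """Hypothetical stand-in model: returns the input bytes unchanged."""
    return image_bytes


request_body = json.dumps({
    "prompt": "a cat",
    "text": "a dog",
    "image": base64.b64encode(b"fake-image-bytes").decode(),
})
response_body = handle_request(request_body, echo_model)
```

The response is a JSON object with a single `generated_image` key holding a Base64 string, which is the same shape the client code later in this notebook decodes.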

In [None]:
!mkdir -p sd_inpaint_clipseg_scripts/code

In [None]:
%%writefile sd_inpaint_clipseg_scripts/code/requirements.txt
transformers
diffusers
accelerate

In [None]:
%%writefile sd_inpaint_clipseg_scripts/code/inference.py
import torch
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation
from diffusers import DiffusionPipeline
from torch import autocast

import json
import base64
from PIL import Image
from io import BytesIO
import numpy as np

def model_fn(model_dir):
    """Load ClipSeg and the inpainting pipeline once, when the container starts."""
    processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
    model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

    # Diffusion Pipeline: https://huggingface.co/docs/diffusers/api/diffusion_pipeline
    pipe = DiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting",
        custom_pipeline="text_inpainting",
        segmentation_model=model,
        segmentation_processor=processor,
    )

    pipe = pipe.to("cuda")
    return pipe


def input_fn(data, content_type):
    """Parse the request body; only JSON is accepted."""
    if content_type == 'application/json':
        data = json.loads(data)
    else:
        raise TypeError('content_type must be application/json')
    return data


def predict_fn(data, model):
    """Decode the Base64 input image, run the pipeline, and return a Base64 JPEG."""
    pipe = model
    image_decoded = BytesIO(base64.b64decode(data['image'].encode()))
    image = Image.open(image_decoded).convert("RGB")
    data["image"] = image
    # All remaining keys in the payload are forwarded to the pipeline as keyword arguments.
    with autocast("cuda"):
        image = pipe(**data).images[0]
    # Encode the generated image as a Base64 JPEG string so it can be returned as JSON
    buffered = BytesIO()
    image.save(buffered, format="JPEG")
    return base64.b64encode(buffered.getvalue()).decode()


def output_fn(data, accept_type):
    """Wrap the Base64 image in a JSON response."""
    if accept_type == 'application/json':
        data = json.dumps({'generated_image': data})
    else:
        raise TypeError('accept_type must be application/json')
    return data

## Package and Deploy Model

Package the generated files, upload the archive to S3, and deploy an inference endpoint.
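
The archive layout matters here: running `tar` from inside the scripts directory puts `code/inference.py` and `code/requirements.txt` at the top level of the archive, which is where the serving container is expected to look for them. As a sketch of the same packaging step in pure Python, the hypothetical helper `package_model` below builds an archive with that layout and returns its member names so the structure can be verified. The demo runs on a throwaway directory with placeholder files, not on the real scripts.

```python
import os
import tarfile
import tempfile


def package_model(source_dir, archive_path):
    """Create a .tar.gz whose top-level entries are the children of source_dir.

    Returns the list of member names so the layout can be inspected.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(source_dir)):
            # arcname drops the source_dir prefix, like running tar inside the directory
            tar.add(os.path.join(source_dir, name), arcname=name)
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Demo on a temporary directory that mimics the layout used in this notebook.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "scripts")
    os.makedirs(os.path.join(src, "code"))
    for file_name in ("inference.py", "requirements.txt"):
        with open(os.path.join(src, "code", file_name), "w") as f:
            f.write("# placeholder\n")
    members = package_model(src, os.path.join(tmp, "package.tar.gz"))
```

Checking the member names before uploading is a cheap way to catch an archive that accidentally nests everything under an extra parent directory.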

In [None]:
%cd sd_inpaint_clipseg_scripts
!tar -czvf ../sd_inpaint_clipseg_package.tar.gz *
%cd -

In [None]:
model_path = sess.upload_data("sd_inpaint_clipseg_package.tar.gz", bucket=bucket, key_prefix=f"StableDiffusionInpainting-ClipSeg")
model_path

In [None]:
endpoint_name = "StableDiffusionInpainting-CLIPSeg"

# The scripts run on the SageMaker PyTorch inference container.
pytorch_model = PyTorchModel(
    model_data=model_path,
    framework_version="2.0",
    py_version='py310',
    role=role,
    name=endpoint_name
)

# Deploy the model to a SageMaker real-time inference endpoint
predictor = pytorch_model.deploy(
    initial_instance_count=1,
    instance_type='ml.g5.2xlarge',
    endpoint_name=endpoint_name
)

## Run Inference

Run inference on the deployed endpoint.
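
The cells in this section use the SageMaker SDK `Predictor`. For callers that only have `boto3`, the following is a minimal sketch of the same request through a SageMaker runtime client. The function name `invoke_inpainting` and the `FakeRuntimeClient` class are hypothetical; the fake client only echoes the input image so the request and response handling can be exercised offline.

```python
import base64
import io
import json


def invoke_inpainting(runtime_client, endpoint_name, image_bytes, mask_text, prompt):
    """Send one inpainting request and return the generated image as raw bytes."""
    payload = {
        "prompt": prompt,
        "text": mask_text,
        "image": base64.b64encode(image_bytes).decode(),
    }
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read())
    return base64.b64decode(result["generated_image"])


class FakeRuntimeClient:
    """Hypothetical stand-in for a runtime client; echoes the input image."""

    def invoke_endpoint(self, EndpointName, ContentType, Accept, Body):
        request = json.loads(Body)
        body = json.dumps({"generated_image": request["image"]}).encode()
        return {"Body": io.BytesIO(body)}


image_out = invoke_inpainting(
    FakeRuntimeClient(), "my-endpoint", b"raw-bytes", "a dog", "a cat"
)
```

Against a real endpoint, pass `boto3.client("sagemaker-runtime")` as `runtime_client` and the deployed endpoint name; the returned bytes are then the JPEG produced by the endpoint.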

In [None]:
from sagemaker.predictor import Predictor
from sagemaker.predictor_async import AsyncPredictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

import base64
from PIL import Image
from io import BytesIO
import matplotlib.pyplot as plt
import numpy as np

predictor_client = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sess,
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer()
)


def inference(input_img_file_name, mask_text, prompt):
    """Send an image, mask text, and prompt to the endpoint; return a PIL image."""
    with open(input_img_file_name, "rb") as f:
        input_img_image_bytes = f.read()
    encoded_input_image = base64.b64encode(input_img_image_bytes).decode()
    data = {
        "prompt": prompt,           # what to paint into the masked region
        "text": mask_text,          # text describing the region to mask
        "image": encoded_input_image
    }
    response = predictor_client.predict(
        data=data
    )
    generated_image = response["generated_image"]  # Base64-encoded JPEG
    generated_image_decoded = BytesIO(base64.b64decode(generated_image.encode()))
    generated_image_rgb = Image.open(generated_image_decoded).convert("RGB")
    return generated_image_rgb


def display_img_and_prompt(img, prmpt):
    """Display the generated image with the prompt as its title."""
    plt.figure(figsize=(12, 12))
    plt.imshow(np.array(img))
    plt.axis("off")
    plt.title(prmpt)
    plt.show()

Run inference on an image file in the current directory. Feel free to change the file name, mask text, and prompt.
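
Because `predict_fn` forwards the whole JSON payload to the pipeline as keyword arguments (`pipe(**data)`), extra generation settings can in principle travel in the same request. The sketch below uses a hypothetical helper, `build_payload`, to assemble such a request. The extra keys shown (`num_inference_steps`, `guidance_scale`) are assumptions based on common Stable Diffusion pipeline parameters and are not confirmed here for the `text_inpainting` pipeline.

```python
import base64
import json


def build_payload(image_bytes, mask_text, prompt, **pipeline_kwargs):
    """Build the JSON-serializable request body for the endpoint.

    Hypothetical helper: extra keyword arguments are copied into the payload
    and would be passed through to the pipeline by predict_fn.
    """
    reserved = {"image", "text", "prompt"}
    clash = reserved.intersection(pipeline_kwargs)
    if clash:
        raise ValueError(f"reserved keys cannot be overridden: {sorted(clash)}")
    payload = {
        "prompt": prompt,
        "text": mask_text,
        "image": base64.b64encode(image_bytes).decode(),
    }
    payload.update(pipeline_kwargs)
    json.dumps(payload)  # fail early if a value is not JSON serializable
    return payload


payload = build_payload(
    b"abc", "a dog", "a cat",
    num_inference_steps=30,  # assumed pipeline parameter
    guidance_scale=7.5,      # assumed pipeline parameter
)
```

A payload built this way can be sent with `predictor_client.predict(data=payload)`; any key the pipeline does not accept would surface as an error from the endpoint.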

In [None]:
input_img_file_name = "dog_suit.jpg"
mask_text = "a dog"
prompt = "a cat"
generated_image = inference(input_img_file_name, mask_text, prompt)
display_img_and_prompt(generated_image, prompt)

## Delete Endpoint

Delete the deployed model and endpoint.
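
The cell at the end of this section uses the SageMaker SDK for cleanup. As a sketch of the equivalent steps with a low-level SageMaker client, the hypothetical helper `delete_endpoint_resources` below looks up the endpoint configuration and model names, then deletes the endpoint, its configuration, and the models. `FakeSageMakerClient` is a stand-in that records calls so the order can be checked offline; the resource names in it are made up.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint, its endpoint config, and the models it references."""
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)
    return {"endpoint": endpoint_name, "endpoint_config": config_name,
            "models": model_names}


class FakeSageMakerClient:
    """Hypothetical stand-in that records which delete calls were made."""

    def __init__(self):
        self.calls = []

    def describe_endpoint(self, EndpointName):
        return {"EndpointConfigName": "demo-config"}

    def describe_endpoint_config(self, EndpointConfigName):
        return {"ProductionVariants": [{"ModelName": "demo-model"}]}

    def delete_endpoint(self, EndpointName):
        self.calls.append(("delete_endpoint", EndpointName))

    def delete_endpoint_config(self, EndpointConfigName):
        self.calls.append(("delete_endpoint_config", EndpointConfigName))

    def delete_model(self, ModelName):
        self.calls.append(("delete_model", ModelName))


fake_client = FakeSageMakerClient()
deleted = delete_endpoint_resources(fake_client, "demo-endpoint")
```

With real resources, pass `boto3.client("sagemaker")` instead of the fake client. The lookups happen before any deletion because the endpoint configuration name can no longer be read from the endpoint once it is gone.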

In [None]:
predictor_client.delete_model()
predictor_client.delete_endpoint()