{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction to Bedrock - Generating images using Stable Diffusion\n", "---\n", "In this demo notebook, we demonstrate how to use the Bedrock SDK for an image generation task. We show how to use the Stable Diffusion foundational model to create images\n", "1. Text to Image\n", "2. Image to Image\n", "\n", "Images in Stable Diffusion are generated by these 4 main models below\n", "1. The CLIP text encoder;\n", "2. The VAE decoder;\n", "3. The UNet, and\n", "4. The VAE_post_quant_conv\n", "\n", "These blocks are chosen because they represent the bulk of the compute in the pipeline\n", "\n", "see this diagram below\n", "\n", "![SD Architecture](./images/sd.png)\n", "\n", "#### Image prompting\n", "\n", "Writing a good prompt can sometime be an art. It is often difficult to predict whether a certain prompt will yield a satisfactory image with a given model. However, there are certain templates that have been observed to work. Broadly, a prompt can be roughly broken down into three pieces: (i) type of image (photograph/sketch/painting etc.), (ii) description (subject/object/environment/scene etc.) and (iii) the style of the image (realistic/artistic/type of art etc.). You can change each of the three parts individually to generate variations of an image. Adjectives have been known to play a significant role in the image generation process. Also, adding more details help in the generation process.\n", "\n", "To generate a realistic image, you can use phrases such as “a photo of”, “a photograph of”, “realistic” or “hyper realistic”. To generate images by artists you can use phrases like “by Pablo Piccaso” or “oil painting by Rembrandt” or “landscape art by Frederic Edwin Church” or “pencil drawing by Albrecht Dürer”. You can also combine different artists as well. To generate artistic image by category, you can add the art category in the prompt such as “lion on a beach, abstract”. Some other categories include “oil painting”, “pencil drawing, “pop art”, “digital art”, “anime”, “cartoon”, “futurism”, “watercolor”, “manga” etc. You can also include details such as lighting or camera lens such as 35mm wide lens or 85mm wide lens and details about the framing (portrait/landscape/close up etc.).\n", "\n", "Note that model generates different images even if same prompt is given multiple times. So, you can generate multiple images and select the image that suits your application best." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### ⚠️⚠️⚠️ Execute the following cells before running this notebook ⚠️⚠️⚠️\n", "\n", "For a detailed description on what the following cells do refer to [Bedrock boto3 setup](../00_Intro/bedrock_boto3_setup.ipynb) notebook." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make sure you run `download-dependencies.sh` from the root of the repository to download the dependencies before running this cell\n", "%pip install ../dependencies/botocore-1.29.162-py3-none-any.whl ../dependencies/boto3-1.26.162-py3-none-any.whl ../dependencies/awscli-1.27.162-py3-none-any.whl --force-reinstall\n", "%pip install langchain==0.0.190 --quiet" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#### Un comment the following lines to run from your local environment outside of the AWS account with Bedrock access\n", "\n", "#import os\n", "#os.environ['BEDROCK_ASSUME_ROLE'] = ''\n", "#os.environ['AWS_PROFILE'] = ''" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import boto3\n", "import json\n", "import os\n", "import sys\n", "\n", "module_path = \"..\"\n", "sys.path.append(os.path.abspath(module_path))\n", "from utils import bedrock\n", "\n", "os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'\n", "boto3_bedrock = bedrock.get_bedrock_client(os.environ.get('BEDROCK_ASSUME_ROLE', None))" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Install additional dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install pillow==9.5.0" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import io, base64\n", "from PIL import Image" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Text to Image\n", "In order to generate an image, a description of what needs to be generated is needed. This is called `prompt`.\n", "\n", "You can also provide some negative prompts to guide the model to avoid certain type of outputs.\n", "\n", "Prompt acts as the input to the model and steers the model to generate a relevant output. With Stable Diffusion XL you have the option to choose certain [style presets](https://platform.stability.ai/docs/release-notes#style-presets) as well" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "prompt = \"Dog in a forest\"\n", "negative_prompts = [\n", " \"poorly rendered\", \n", " \"poor background details\", \n", " \"poorly drawn dog\", \n", " \"disfigured dog features\"\n", " ]\n", "style_preset = \"photographic\" # (photographic, digital-art, cinematic, ...)\n", "#prompt = \"photo taken from above of an italian landscape. cloud is clear with few clouds. Green hills and few villages, a lake\"" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "`Bedrock` class implements a method `generate_image`. This method takes input a prompt and prepares a payload to be sent over to Bedrock API.\n", "You can provide the following model inference parameters to control the repetitiveness of responses:\n", "- prompt (string): Input text prompt for the model\n", "- seed (int): Determines initial noise. Using same seed with same settings will create similar images.\n", "- cfg_scale (float): Presence strength - Determines how much final image portrays prompts.\n", "- steps (int): Generation step - How many times image is sampled. More steps may be more accurate." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "As an output the Bedrock generates a `base64` encoded string respresentation of the image." 
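{ "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The parameters above map onto a JSON request body sent to the Bedrock API. The next cell is a minimal sketch of what such a payload could look like, assuming the Stability (SDXL) request schema (`text_prompts`, `cfg_scale`, `seed`, `steps`, `style_preset`) and an illustrative model ID; the actual `invoke_model` call is left commented out because the `generate_image` helper used below performs an equivalent call for you." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A rough sketch of the request that a Stable Diffusion call on Bedrock could use.\n", "# The body fields and model ID below follow the Stability (SDXL) schema as we\n", "# understand it; treat this as an illustration, not a reference implementation.\n", "import json\n", "\n", "request_body = {\n", "    # positive prompt plus negative prompts with negative weights (assumed convention)\n", "    \"text_prompts\": [{\"text\": prompt, \"weight\": 1.0}]\n", "                    + [{\"text\": nprompt, \"weight\": -1.0} for nprompt in negative_prompts],\n", "    \"cfg_scale\": 5,       # how strongly the image should follow the prompt\n", "    \"seed\": 5450,         # fixes the initial noise, making runs reproducible\n", "    \"steps\": 70,          # number of diffusion sampling steps\n", "    \"style_preset\": style_preset,\n", "}\n", "print(json.dumps(request_body, indent=2))\n", "\n", "# response = boto3_bedrock.invoke_model(\n", "#     body=json.dumps(request_body),\n", "#     modelId=\"stability.stable-diffusion-xl\",  # assumed model ID\n", "#     accept=\"application/json\",\n", "#     contentType=\"application/json\",\n", "# )\n", "# base_64_img_str = json.loads(response[\"body\"].read())[\"artifacts\"][0][\"base64\"]" ] },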
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = bedrock.Bedrock(boto3_bedrock)\n", "base_64_img_str = model.generate_image(prompt, cfg_scale=5, seed=5450, steps=70, style_preset=style_preset)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We can convert the `base64` image to a PIL image to be displayed" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_1 = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, \"utf-8\"))))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_1" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Image to Image\n", "\n", "Stable Diffusion let's us do some interesting stuff with our images like adding new characters or modifying scenery let's give it a try.\n", "\n", "You can use the previously generated image or use a different one to create a base64 string to be passed on as an initial image to the model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from io import BytesIO\n", "from base64 import b64encode\n", "\n", "buffer = BytesIO()\n", "image_1.save(buffer, format=\"JPEG\")\n", "img_bytes = buffer.getvalue()\n", "init_image = b64encode(img_bytes).decode()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "A new guiding prompt can then help the model to act on the intial image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "change_prompt = \"add some leaves around the dog\"" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The `generate_image` method also accepts an additional paramter `init_image` which can be used to pass the initial image to the Stable Diffusion model on Bedrock." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "base_64_img_str = model.generate_image(change_prompt, init_image=init_image, seed=321, start_schedule=0.6)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_2 = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, \"utf-8\"))))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "image_2" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Summary\n", "\n", "And play around with different prompts to see amazing results." ] } ], "metadata": { "kernelspec": { "display_name": "bedrock", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }