We live in exciting times, with announcements about cutting-edge technology arriving every week. OpenAI recently released DALL·E 2, a state-of-the-art text-to-image model.
Only a few people were granted early access to the new AI system, which generates realistic images from natural-language descriptions; it remains closed to the public.
Stability AI then released Stable Diffusion, an open-source text-to-image model comparable to DALL·E 2. This launch changed everything: people all across the internet began publishing their results and marveling at the realistic art the model produced.
What is Stable Diffusion?
Stable Diffusion is a machine learning model that can generate images from text, modify existing images based on a text prompt, and add detail to low-resolution or low-detail images.
But with limited local compute, the Stable Diffusion model takes a long time to produce high-quality images. Running the model in the cloud gives us access to nearly unlimited computational resources, so we can get excellent results much faster.
Hosting the model as a microservice also lets other creative apps tap into the model's capabilities without having to deal with the complexities of serving ML models online.
In this post, we will demonstrate how to build a Stable Diffusion service and deploy it to AWS.
Build and Deploy Stable Diffusion
We will use BentoML together with Amazon Web Services (AWS) EC2 to host the Stable Diffusion model online. BentoML is an open-source framework for building and scaling machine learning services. With BentoML, we will build a Stable Diffusion service and deploy it to AWS EC2.
Preparing the environment and downloading the Stable Diffusion model
Install requirements and clone the repository.
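The setup might look like the following (a sketch: the repository URL is the BentoML example project, and the virtual-environment step is an assumption about your workflow):

```shell
# Clone the BentoML Stable Diffusion example project
git clone https://github.com/bentoml/stable-diffusion-bentoml.git
cd stable-diffusion-bentoml

# Create an isolated environment and install the Python requirements
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
```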
You can choose which Stable Diffusion model to download. The single-precision (fp32) model is suitable for CPUs, or for GPUs with more than 10 GB of VRAM. The half-precision (fp16) model is meant for GPUs with less than 10 GB of VRAM.
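The example project ships its own download scripts; as an equivalent sketch, the weights can be fetched with the diffusers library (the model id and target directories here are assumptions, and the download is several gigabytes):

```python
# Sketch: fetch Stable Diffusion v1.4 weights in single and half precision.
import torch
from diffusers import StableDiffusionPipeline

# Single precision (fp32): CPUs, or GPUs with > 10 GB VRAM
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.save_pretrained("models/v1_4")  # target path is an assumption

# Half precision (fp16): GPUs with < 10 GB VRAM
pipe_fp16 = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    revision="fp16",
)
pipe_fp16.save_pretrained("models/v1_4_fp16")
```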
Building Stable Diffusion
We will build a BentoML service that serves the model behind a RESTful API. The following example uses the single-precision model, with the business logic wired up in the service.py module. Functions are exposed as APIs by decorating them with @svc.api.
Furthermore, the decorator's parameters define each API's input and output types. The txt2img endpoint, for example, accepts a JSON input and returns an Image output, whereas the img2img endpoint accepts both an Image and a JSON input and returns an Image output.
The core inference logic is defined in a StableDiffusionRunnable, which is responsible for calling the model's txt2img and img2img pipelines with the relevant inputs. A custom Runner is then created from the StableDiffusionRunnable to run the model inference logic inside the APIs.
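A condensed sketch of what service.py might look like follows (names mirror the description above; the model path models/v1_4 and the default pipeline parameters are assumptions):

```python
# service.py -- sketch of a BentoML service wrapping Stable Diffusion
import bentoml
from bentoml.io import JSON, Image, Multipart


class StableDiffusionRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        import torch
        from diffusers import (
            StableDiffusionPipeline,
            StableDiffusionImg2ImgPipeline,
        )

        device = "cuda" if torch.cuda.is_available() else "cpu"
        # Load the single-precision model saved earlier (path assumed)
        self.txt2img_pipe = StableDiffusionPipeline.from_pretrained(
            "models/v1_4"
        ).to(device)
        # Reuse the loaded components for the img2img pipeline
        self.img2img_pipe = StableDiffusionImg2ImgPipeline(
            **self.txt2img_pipe.components
        ).to(device)

    @bentoml.Runnable.method(batchable=False)
    def txt2img(self, data):
        return self.txt2img_pipe(
            prompt=data["prompt"],
            guidance_scale=data.get("guidance_scale", 7.5),
            num_inference_steps=data.get("num_inference_steps", 50),
        ).images[0]

    @bentoml.Runnable.method(batchable=False)
    def img2img(self, init_image, data):
        return self.img2img_pipe(
            prompt=data["prompt"], image=init_image
        ).images[0]


# Build a custom Runner from the runnable and attach it to the service
stable_diffusion_runner = bentoml.Runner(
    StableDiffusionRunnable, name="stable_diffusion_runner"
)
svc = bentoml.Service("stable_diffusion", runners=[stable_diffusion_runner])


@svc.api(input=JSON(), output=Image())
def txt2img(data):
    return stable_diffusion_runner.txt2img.run(data)


@svc.api(input=Multipart(img=Image(), data=JSON()), output=Image())
def img2img(img, data):
    return stable_diffusion_runner.img2img.run(img, data)
```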
Then start the BentoML service locally for testing with the following command. Note that running Stable Diffusion inference on CPUs is quite slow; each request takes about five minutes to process.
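Assuming the service object is named svc in service.py, the local server can be started and exercised like this (the prompt and output file name are illustrative):

```shell
# Start the service locally on the default port 3000
bentoml serve service:svc --production

# In another terminal: send a text-to-image request
curl -X POST http://127.0.0.1:3000/txt2img \
    -H 'Content-Type: application/json' \
    -d '{"prompt": "a mountain lake at sunrise"}' \
    --output output.jpg
```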
Text to image
Text to image output
The bentofile.yaml file defines the required files and dependencies.
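A minimal bentofile.yaml might look like the following (a sketch: the include list, CUDA version, and requirements path depend on your project layout):

```yaml
service: "service:svc"
include:
  - "service.py"
  - "models/v1_4/"
python:
  requirements_txt: "./requirements.txt"
docker:
  distro: debian
  cuda_version: "11.6.2"
```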
Use the command below to build a bento. A bento is the distribution format of a BentoML service: a self-contained archive containing all of the source code, models, and configuration needed to run the service.
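Run the build from the directory containing bentofile.yaml:

```shell
# Package the service into a bento as described by bentofile.yaml
bentoml build

# Confirm the bento was added to the local bento store
bentoml list
```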
The Stable Diffusion bento is now built. If you were unable to build the bento yourself, don't panic; you can download a pre-built one using the commands listed in the next section.
The following pre-built bentos are available:
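The original download links are not reproduced here; as a sketch, a downloaded bento archive can be brought into the local bento store with bentoml import (the file name below is a placeholder):

```shell
# Import a pre-built Stable Diffusion bento archive (file name is a placeholder)
bentoml import ./stable_diffusion_fp32.bento
bentoml list
```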
Deploy Stable Diffusion model to EC2
To deploy the bento to EC2, we will use bentoctl, which deploys bentos to any major cloud platform via Terraform. Install the AWS EC2 operator, which bentoctl uses to generate and apply the Terraform files.
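The installation might look like this (assumes Terraform and the AWS CLI are already installed and your AWS credentials are configured):

```shell
# Install bentoctl and add the AWS EC2 operator
pip install bentoctl
bentoctl operator install aws-ec2
```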
The deployment is already configured in the deployment_config.yaml file; feel free to edit it to match your requirements. By default, the bento is deployed to a g4dn.xlarge instance in the us-west-1 region, using the Deep Learning AMI GPU PyTorch 1.12.0 (Ubuntu 20.04) image.
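The deployment_config.yaml might look roughly like this (a sketch: the field layout follows the bentoctl aws-ec2 operator, and the ami_id value is a placeholder for the Deep Learning AMI GPU PyTorch 1.12.0 image in your region):

```yaml
api_version: v1
name: stable-diffusion
operator:
  name: aws-ec2
template: terraform
spec:
  region: us-west-1
  instance_type: g4dn.xlarge
  ami_id: ami-0123456789abcdef0   # placeholder for the Deep Learning AMI
  enable_gpus: true
```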
Now generate the Terraform files, then build the Docker image and push it to AWS ECR. Depending on your bandwidth, uploading the image may take a while. Finally, apply the Terraform files to deploy the bento to AWS EC2.
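These steps correspond to the standard bentoctl workflow (the bento tag below assumes the build from the previous section):

```shell
# Generate the Terraform files from deployment_config.yaml
bentoctl generate

# Build the Docker image and push it to AWS ECR (may take a while)
bentoctl build -b stable_diffusion:latest -f deployment_config.yaml

# Provision the EC2 instance and deploy the service
terraform init
terraform apply -var-file=bentoctl.tfvars -auto-approve
```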
To access the Swagger UI, find the instance's public IP address in the EC2 console and open it in a browser. Finally, when the Stable Diffusion BentoML service is no longer required, delete the deployment.
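Tearing the deployment down is a single bentoctl command:

```shell
# Destroy the EC2 deployment and its associated resources
bentoctl destroy -f deployment_config.yaml
```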
By now you should have a sense of how fascinating and powerful Stable Diffusion and its companion models are. Time will tell whether the community keeps iterating on this concept or moves on to more sophisticated approaches; meanwhile, there are already initiatives underway to train larger models that better understand context and follow instructions.
In this post, we built a Stable Diffusion service with BentoML and deployed it to AWS EC2.
Deploying the service on AWS EC2 let us run the Stable Diffusion model on more powerful hardware, generate images with low latency, and scale beyond a single machine.