Deploying Stable Diffusion on m7i.4xlarge and Accelerating It with OpenVINO

How to deploy Stable Diffusion on EC2

Published Feb 12, 2024
Hi, I’m Beny, and this is my first article in the AWS Community. In this article, you will learn how to deploy Stable Diffusion on an EC2 m7i.4xlarge instance (4th Gen Intel® Xeon®) and use OpenVINO to speed up AI inference.

Stable Diffusion Overview

Stable Diffusion is a text-to-image model released in 2022 and based on diffusion techniques. Its main use is generating detailed images from text prompts, but it also handles tasks such as inpainting, outpainting, and text-guided image-to-image translation.
Stable Diffusion uses a kind of diffusion model (DM) called a latent diffusion model (LDM), developed by the CompVis group at LMU Munich.
Stable Diffusion is broadly accessible and needs considerably fewer computational resources than other models in the field. Its uses include text-to-image generation, image-to-image transformation, graphic artwork creation, image editing, and video production.
Text-to-image generation is the most common use. The model builds an image from a text prompt, and you can get different results by changing the random seed or by adjusting the denoising schedule.
Furthermore, in image-to-image generation, Stable Diffusion facilitates the creation of images using both input images and textual prompts. A typical application involves generating images from sketches paired with relevant prompts.
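To make the role of the random seed more concrete, here is a minimal sketch in plain Python. It is an illustration, not the code Stable Diffusion runs: the function name `initial_latent_noise` is hypothetical, and the sketch assumes the Stable Diffusion v1 latent layout (4 channels, each side of the image divided by 8). The real pipeline draws this noise with a PyTorch generator, so the actual values differ.

```python
import random


def initial_latent_noise(seed, height=768, width=512, channels=4, factor=8):
    """Return seeded Gaussian noise shaped like a Stable Diffusion v1 latent.

    Illustration only: channels=4 and factor=8 are assumptions based on the
    v1 architecture, and the values are not the ones PyTorch would produce.
    """
    rng = random.Random(seed)
    rows, cols = height // factor, width // factor
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
        for _ in range(channels)
    ]


# The two seeds below are the ones reported for the runs in this article.
same_a = initial_latent_noise(448692853)
same_b = initial_latent_noise(448692853)
other = initial_latent_noise(1225832328)

print(len(same_a), len(same_a[0]), len(same_a[0][0]))  # channels, rows, cols
print(same_a == same_b)  # same seed gives the same starting noise
print(same_a == other)   # a different seed gives different starting noise
```

Because denoising starts from this noise, keeping the seed fixed while changing only the prompt or sampler is a simple way to compare settings on an equal footing.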

OpenVINO Overview

The OpenVINO™ toolkit, an open-source platform,
enhances AI inference by minimizing latency and maximizing throughput without
compromising accuracy. It achieves this by reducing model size, optimizing
hardware utilization, and facilitating streamlined development and integration
of deep learning across various domains such as computer vision, large language
models, and generative AI.

Deploying Stable Diffusion on the AWS M7i Instance Family with Intel OpenVINO

I’m using m7i.4xlarge because it runs on 4th Gen Intel® Xeon® processors and offers up to 15% better price performance compared to m6i.4xlarge. The setup uses the following components:
1. Anaconda
2. Python 3.10.6
3. Torch 2.2.0
4. Torchvision
5. Xformers
6. Torchaudio
7. EC2 m7i.4xlarge
Installation Process
# Create a Python 3.10.6 environment
conda create -n sd_env python==3.10.6
conda activate sd_env
# Download the OpenVINO fork of the Stable Diffusion web UI
git clone https://github.com/openvinotoolkit/stable-diffusion-webui.git
cd stable-diffusion-webui
# Set the launch arguments for CPU-only inference
export COMMANDLINE_ARGS="--skip-torch-cuda-test --precision full --no-half"
# Download the model checkpoint into the web UI's model directory
wget https://huggingface.co/nitrosocke/mo-di-diffusion/resolve/main/moDi-v1-pruned.ckpt
mv moDi-v1-pruned.ckpt models/Stable-diffusion/
# Install the required Linux package
sudo apt install freeglut3-dev
# Install the Python modules
pip install insightface
pip install xformers
pip install
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
# Fix the import in basicsr's degradations.py
vim /home/ubuntu/stable-diffusion-webui/venv/lib/python3.10/site-packages/basicsr/data/degradations.py
# Replace this line:
#   from torchvision.transforms.functional_tensor import rgb_to_grayscale
# with:
#   from torchvision.transforms.functional import rgb_to_grayscale
#Launch the WebUI
./webui.sh --share --listen --enable-insecure-extension-access
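The manual edit of degradations.py in the listing above can also be scripted, which helps when the setup has to be repeated on several instances. The following is a minimal sketch rather than part of the original procedure: the helper names `patch_import` and `patch_file` are hypothetical, and the file path is the one used in this article, so adjust it if your install location differs.

```python
from pathlib import Path

OLD_IMPORT = "from torchvision.transforms.functional_tensor import rgb_to_grayscale"
NEW_IMPORT = "from torchvision.transforms.functional import rgb_to_grayscale"


def patch_import(source: str) -> str:
    """Replace the functional_tensor import with the functional one."""
    return source.replace(OLD_IMPORT, NEW_IMPORT)


def patch_file(path: Path) -> bool:
    """Patch the file in place; return True only if it was changed."""
    original = path.read_text()
    patched = patch_import(original)
    if patched == original:
        return False
    path.write_text(patched)
    return True


if __name__ == "__main__":
    # Path taken from this article; it is an assumption about your layout.
    target = Path(
        "/home/ubuntu/stable-diffusion-webui/venv/lib/python3.10/"
        "site-packages/basicsr/data/degradations.py"
    )
    if target.exists():
        print("patched" if patch_file(target) else "already up to date")
    else:
        print(f"not found: {target}")
```

The script is safe to run more than once: a second run finds nothing to replace and leaves the file untouched. If the file is not present, it reports that and changes nothing.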
[Screenshot: Web UI console output]
[Screenshot: Web UI in the browser]
Next, we run an image through Stable Diffusion:
  1. Choose img2img
  2. Prompt = Modern Cartoon Characters smiling
  3. Negative prompt = girl women
  4. Sampling method = DPM++ 2M Karras
  5. Script = Accelerate with OpenVINO, with the sampling method override = DPM++ 2M Karras
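The settings listed above, together with the step count, CFG scale, size, and denoising strength reported in the run details of this article, can also be written down as data. The sketch below builds a request body for the web UI's `/sdapi/v1/img2img` endpoint. Several things here are assumptions and not part of the original walkthrough: the API is only served when the web UI is started with the `--api` flag, which the launch command in this article does not include; the function name `build_img2img_payload` is hypothetical; and selecting the OpenVINO script through the API is left out because its argument layout is not covered here.

```python
import base64
import json


def build_img2img_payload(image_bytes: bytes, seed: int = -1) -> dict:
    """Build a request body for the web UI's /sdapi/v1/img2img endpoint."""
    return {
        # The API expects input images as base64-encoded strings.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": "Modern Cartoon Characters smiling",
        "negative_prompt": "girl women",
        "sampler_name": "DPM++ 2M Karras",
        "steps": 20,
        "cfg_scale": 7,
        "width": 512,
        "height": 768,
        "denoising_strength": 0.75,
        "seed": seed,  # -1 asks the web UI to pick a random seed
    }


# Placeholder bytes stand in for a real image file read with open(..., "rb").
payload = build_img2img_payload(b"placeholder image bytes")
body = json.dumps(payload)
print(sorted(payload))
```

With the API enabled, `body` could be sent as a JSON POST to the img2img endpoint of the running web UI. Keeping the settings in one dictionary also makes it easy to record exactly what was used for each run.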
With the OpenVINO script enabled, generating the image takes 37.6 seconds.
* Performance varies depending on the environment and instance family.
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Image CFG scale: 1.5, Seed: 448692853, Size: 512x768, Denoising strength: 0.75, Version: 1.6.0, Warm up time: 13.81 secs, Performance: 1.09 it/s
Time taken: 37.6 sec.
With the OpenVINO script disabled, generating the image takes 1 minute 5.1 seconds.
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1225832328, Size: 512x768, Model hash: 8067368533, Model: moDi-v1-pruned, Denoising strength: 0.75, Version: 1.6.0
Time taken: 1 min. 5.1 sec.
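To put the two timings side by side, the short sketch below converts the "Time taken" values into seconds and computes the speedup. The helper name `parse_time_taken` is hypothetical. Keep in mind that each figure comes from a single run, the two runs used different seeds, and the OpenVINO run reports a warm-up time, so treat the result as a rough indication and not as a benchmark.

```python
import re


def parse_time_taken(text: str) -> float:
    """Convert a value such as '1 min. 5.1 sec.' or '37.6 sec.' to seconds."""
    minutes = re.search(r"(\d+(?:\.\d+)?)\s*min", text)
    seconds = re.search(r"(\d+(?:\.\d+)?)\s*sec", text)
    total = 0.0
    if minutes:
        total += 60.0 * float(minutes.group(1))
    if seconds:
        total += float(seconds.group(1))
    return total


baseline = parse_time_taken("1 min. 5.1 sec.")  # OpenVINO script disabled
accelerated = parse_time_taken("37.6 sec.")     # OpenVINO script enabled

speedup = baseline / accelerated
reduction = 1.0 - accelerated / baseline

print(f"{baseline:.1f} s -> {accelerated:.1f} s")
print(f"speedup: {speedup:.2f}x, time reduction: {reduction:.0%}")
```

For these two runs the script reports a speedup of about 1.7x, which corresponds to roughly 42% less time per image.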


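Conclusion

  1. Generative AI workloads can run on the M7i, R7i, and C7i instance families, which are powered by 4th Gen Intel® Xeon® CPUs.
  2. Stable Diffusion runs faster with OpenVINO: in this test, image generation took 37.6 seconds with the OpenVINO script enabled versus 1 minute 5.1 seconds without it.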


References

  1. https://github.com/openvinotoolkit/stable-diffusion-webui/wiki/Installation-on-Intel-Silicon
  2. https://github.com/openvinotoolkit/stable-diffusion-webui?tab=readme-ov-file
  3. https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html