How To Run Stable Diffusion (Similar to DALL-E 2)

And some fun generated images

This post is a part of the stable-diffusion series.

If you just want to see examples of images I’ve generated, click here.

Stable Diffusion is a new, free-to-use model hosted on Hugging Face, produced by CompVis and Stability AI. The model takes in a text prompt and produces an image, and was trained on 5.85 billion images. What’s most exciting is that it’s freely available, and you’re able to run it on your own computer at home!

I’m using a 2080 Ti (that’s an affiliate link, in case anyone purchases one of these now-ancient GPUs ;). It lets me generate an image in under five seconds!

The Stable Diffusion scene is moving rather quickly, and there are many forks of the code repositories available today. At its core, there are two different routes you can take to play with Stable Diffusion:

  1. Using the official package on your host OS
  2. Using a Dockerized wrapper of the official package

For your convenience, I’ll describe both approaches. In either case, I’m assuming the following prerequisites (a quick way to verify them follows this list):

  • You have Docker installed (only required for the Dockerized route)
  • You have the latest Nvidia Driver and CUDA >= 11.7 installed
    • These are conveniently preinstalled with Pop OS
  • You have cgroupv2 >= 1.10.0 installed
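
Before moving on, it’s worth a quick sanity check that the driver and Docker are actually visible. These commands only read state, so they’re safe to run anywhere:

    # shows the installed driver and the highest CUDA version it supports
    $ nvidia-smi
    # confirms Docker is installed (only needed for the Dockerized route)
    $ docker --version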

The Hugging Face model card will give you all the instructions you need to get this running locally. I’ll describe them here, with notes on using either pip or poetry. I’d recommend leveraging float16 precision instead of the full float32 to reduce your memory usage at a minimal drop in quality (there’s a further note on memory after the steps below).

  1. Create a virtual environment. You can use pip, poetry, or any other tool to install the required Python libraries.
    • With pip we’d run $ python3 -m venv venv and activate it with $ source venv/bin/activate
    • With poetry we’d run $ poetry init
  2. Install the necessary dependencies.
    • With pip we’d run $ python3 -m pip install diffusers transformers scipy
    • With poetry we’d run $ poetry add diffusers transformers scipy
  3. We’d then log in to Hugging Face. Before doing this, make sure you’ve made an account on their website, then go to Settings > Access Tokens and generate a token. Make sure the token has WRITE permissions.
    • $ huggingface-cli login and paste your token
  4. Now we can write our first Python script; save it as app.py.
    
    import torch
    from torch import autocast
    from diffusers import StableDiffusionPipeline
    
    model_id = "CompVis/stable-diffusion-v1-4"
    device = "cuda"
    
    # torch_dtype and revision configure the use of float16 instead of the default float32
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, revision="fp16", use_auth_token=True)
    pipe = pipe.to(device)
    
    prompt = "a photo of an astronaut riding a horse on mars"
    with autocast("cuda"):
        image = pipe(prompt, guidance_scale=7.5)["sample"][0]
    
    image.save("astronaut_rides_horse.png")
    
  5. Now we can run our program.
    • With pip, $ python3 app.py
    • With poetry, $ poetry run python3 app.py
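
A note on memory: the fp16 pipeline above already roughly halves VRAM usage. If you’re still running out of memory, recent releases of diffusers also expose attention slicing. Here’s a minimal sketch, assuming your installed diffusers version provides enable_attention_slicing (newer versions also return images under .images rather than ["sample"]):

    import torch
    from diffusers import StableDiffusionPipeline

    # same fp16 pipeline as in app.py above
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        torch_dtype=torch.float16,
        revision="fp16",
        use_auth_token=True,
    )
    pipe = pipe.to("cuda")

    # Trades a little speed for a much lower peak VRAM usage.
    # Only available in recent diffusers releases; skip it if your version lacks it.
    pipe.enable_attention_slicing()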

Next up, the Dockerized route. These days I use hlky’s fork, which bundles a web UI. The UI will look like this:

hlky-web-ui.png

  1. Clone the Github repo

    $ git clone git@github.com:hlky/stable-diffusion.git

  2. Bring up the necessary Docker containers (a detached-mode variant is shown after these steps).

    $ docker compose up

  3. At this point you should be able to access the web UI on localhost:7860. You’ll be able to write prompts, tweak settings, and download your generated images! You can even leverage img2img to draw images and use them as input, upscale images, and restore faces!
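
If you’d rather not keep a terminal attached to the containers, Compose can also run them in the background:

    # start the containers detached
    $ docker compose up -d
    # follow the logs when you want to see what the model is doing
    $ docker compose logs -f
    # tear everything down when you’re finished
    $ docker compose down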

I no longer use the next repo, cmdr2/stable-diffusion-ui, as I prefer hlky’s, but I’ve left the steps here for your convenience.

The UI will look like this:

web-ui.png

We’re going to be leveraging cmdr2’s incredibly convenient Github repo.

  1. In order for our docker container to have access to our GPU, we’ll have to install a dependency:

    
    sudo apt-get update -y
    sudo apt-get install -y nvidia-container-toolkit
    sudo systemctl restart docker
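    # optional: verify that containers can now see the GPU
    # (the nvidia/cuda image tag below is just an example; any recent tag should work)
    docker run --rm --gpus all nvidia/cuda:11.7.1-base-ubuntu22.04 nvidia-smi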
    
  2. Clone the Github repo

    $ git clone git@github.com:cmdr2/stable-diffusion-ui.git

  3. Bring up the necessary Docker containers. The project leverages Docker Compose by defining a docker-compose.yml file with two services: one runs Stable Diffusion, while the other runs a thin web UI, making the model quickly and easily accessible. If you’d prefer to use the model directly, you can always shell into the model’s container and run your commands there (see the example after these steps).

    $ docker compose up &

  4. At this point you should be able to access the web UI on localhost:9000. You’ll be able to write prompts, tweak settings, and download your generated images!
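
If you do want to poke at the model container directly (as mentioned in step 3), Compose can open a shell in it. The service name below is illustrative; check the repo’s docker-compose.yml or the output of docker compose ps for the real one:

    # list the running services to find the model container’s service name
    $ docker compose ps
    # open a shell inside it (replace "stable-diffusion" with the actual service name)
    $ docker compose exec stable-diffusion bash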

As with all my posts, if you run into any issues or have questions, please don’t hesitate to leave a comment below. Thanks for reading!

If you’d like to see examples of images I’ve generated, click here.

