If you just want to see examples of images I’ve generated, click here.

Stable Diffusion is a new, free to use, model hosted by Hugging Face, which was produced by CompVis and Stability AI. This model takes in a text prompt, and produces an image. It was trained on 5.85 billion images. What’s most exciting about it, is it is freely available, and you’re able to run it on your own computer at home!

I’m using a 2080 TI (this is an affiliate link, in case anyone purchases one of these now ancient GPUs ;). This allows me to create an image in under five seconds!

The Stable Diffusion scene is moving rather quickly, as there are many code repositories, and forks to said repositories, available today.

[In A Web UI] Using Automatic1111’s Package

AUTOMATIC1111’s stable-diffusion-webui has become my preferred way of experimenting with Stable Diffusion. His web ui provides a slew of features I did not even know I’d want. For example, with his web UI you can:

  • run image creation in batch
  • easily explore multiple seeds
  • create X/Y plots exploring the same prompt with different parameters like CFG Scale and Steps
  • etc…

Rather than re-write his instrallation instructions, I’d instead elect that you follow what he’s written here. There’s support for automatic installation on Linux and Windows.

[In Code] Using The Official Package On Your Host OS

The Hugging Face model card will give you all the instructions you need to get this running locally. I’ll describe them here with explanations on using pip or poetry. I’d recommend leveraging the float16 precision instead of the full float32 to reduce your memory usage, at a minimum drop in quality.

  1. Create a virtual environment. You can use pip, poetry or any other tool to install the required python libraries.
    • With pip we’d run $ python3 -m venv venv and activate it with $ source venv/bin/activate
    • With poetry we’d run $ poetry init
  2. Install the necessary dependencies.
    • With pip we’d run $ python3 -m pip install diffusers transformers scipy
    • With poetry we’d run $ poetry add diffusers transformers scipy
  3. We’d then login with Hugging Face. Before doing this, make sure you make an account on their website, then go to settings > access tokens and generate a token. Make sure the token has WRITE permissions.
    • $ huggingface-cli login and paste your token
  4. Now we can write our first Python script, save this in app.py.
    import torch
    from torch import autocast
    from diffusers import StableDiffusionPipeline
    
    model_id = "CompVis/stable-diffusion-v1-4"
    device = "cuda"
    
    # the middle two arguments configure the usage of `float.16` instead of the default `float.32`
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, revision="fp16", use_auth_token=True)
    pipe = pipe.to(device)
    
    prompt = "a photo of an astronaut riding a horse on mars"
    with autocast("cuda"):
        image = pipe(prompt, guidance_scale=7.5)["sample"][0]
    
    image.save("astronaut_rides_horse.png")
    
  5. Now we can run our program.
    • With pip, $ python3 app.py
    • With poetry, $ poetry run app.py

Conclusion

As with all my posts, if you run into any issues or have questions, please don’t hesitate to leave a comment below. Thanks for reading!

If you’d like to see examples of images I’ve generated, click here.