This is a guide to learn what Stable Diffusion is and how you can use it.
The image above was generated with Stable Diffusion from the following text (prompt):
City skyline with skyscrapers, by Stanislav Sidorov, digital art, ultra realistic, ultra detailed, photorealistic, 4k, character concept, soft light, blade runner, futuristic
Stable Diffusion is a text-to-image machine learning model: a deep learning model that generates images from the text we provide as input (the prompt).
It's not the first model or tool of this kind; right now there's a lot of talk about DALL-E 2, Midjourney, and Google's Imagen, but it is the most important because of what it represents. Stable Diffusion is an open source project, so anyone can use and modify it. In version 1.4 the entire pre-trained model ships as a ~4 GB .ckpt file, and that is a real revolution.
So much so that in just two or three weeks since its release there are already plugins for Photoshop, GIMP, Krita, WordPress, Blender, etc. Pretty much every tool that works with images is adopting Stable Diffusion, to the point that even competitors like Midjourney are using it to enhance their own products. And it is not only embedded in other tools: as users we can install it on our own PC and generate images locally.
Being open source does not mean it is less powerful than the alternatives. It is a true wonder. For me, right now, it is the best tool we can use to generate images for any project.
Ways to install and use Stable Diffusion
There are different ways to use it. Right now I recommend two. If your computer has the necessary power, that is, a graphics card with about 8 GB of VRAM, install it on your machine. If your hardware is not powerful enough, use a Google Colab; right now I recommend Altryne's, because it comes with a graphical interface and is easier to use.
Let's look at each option in detail.
Altryne's Colab
This is the option I recommend if your computer is not powerful enough (a GPU with 8 GB of VRAM) or if you want to try all its features without having to install anything.
I recommend it because it has a very convenient graphical interface with many options to control image generation, plus other model features such as image-to-image and upscaling.
It uses the Google Colab created by Altryne together with Google Drive to store the model and the results.
It's all free. I'm leaving a video of the whole process, which, as you will see, is very simple.
Install on PC
To install it on your PC you can follow the instructions in its GitHub repository, https://github.com/CompVis/stable-diffusion, or use the version with a graphical interface, which I like much more: https://github.com/AUTOMATIC1111/stable-diffusion-webui. On Windows and Linux you can also use the Stable Diffusion UI v2 installer.
Keep in mind that you need a powerful GPU with a minimum of 8 GB of VRAM for it to run smoothly. You can run it on the CPU, but it is much slower, and speed will depend on the processor you have. So if your machine is old you will have to settle for Colab or one of the paid services to use Stable Diffusion.
The advantage of having it on your PC is that it is much faster to use: you only have to install and configure it once, and from then on everything is immediate.
Another reason I like it a lot is that I can integrate it into my own scripts and feed the generated images directly into the workflow of other tasks, which is a very important point. A minimal sketch of that kind of integration follows below.
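As an illustration, here is a sketch using Hugging Face's diffusers library; the model id, prompt, and output path are my own assumptions, not something tied to any particular install:

```python
# Minimal sketch: calling Stable Diffusion from your own Python script
# with the diffusers library. Model id, prompt, and output path are
# illustrative assumptions.
from pathlib import Path

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,  # half precision fits better in 8 GB of VRAM
)
pipe = pipe.to("cuda")

prompt = "City skyline with skyscrapers, digital art, ultra detailed, 4k"
image = pipe(prompt).images[0]

# Drop the result straight into the folder the next step of your workflow reads.
Path("generated").mkdir(exist_ok=True)
image.save("generated/skyline.png")
```

From there it is trivial to loop over a list of prompts or to call the script from another tool.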
Official Diffusers Colab
It is very similar to the Colab recommended above and runs almost the same; you do NOT have to upload the model, but it has no graphical interface, so to change any option you have to edit the code cells and adjust them to what you need.
In addition, we cannot use the image-to-image option, which is very attractive; a sketch of what that looks like with the diffusers library follows below.
You can access it here: https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb
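Although that notebook does not expose it, the diffusers library itself includes an image-to-image pipeline; a minimal sketch, assuming a recent diffusers version (the model id, input file, and parameter values are illustrative):

```python
# Minimal sketch of image-to-image with diffusers (the feature missing
# from the official Colab). Model id, input file, and values are
# illustrative assumptions; older diffusers versions named the image
# argument init_image instead of image.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

init_image = Image.open("sketch.png").convert("RGB").resize((512, 512))

# strength controls how far the result may drift from the original
# image (0 keeps it intact, 1 ignores it almost completely).
result = pipe(
    prompt="futuristic city, digital art",
    image=init_image,
    strength=0.75,
    guidance_scale=7.5,
)
result.images[0].save("img2img_result.png")
```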
It includes a filter for adult images, the famous NSFW filter, but you can disable it by creating a cell in the notebook with the following code:
```python
def dummy_checker(images, **kwargs):
    return images, False

pipe.safety_checker = dummy_checker
```
You have to place this cell right after the one containing

```python
pipe = pipe.to("cuda")
```

and run it.
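In context, the cells end up in this order (a sketch; the earlier notebook cells that build `pipe` stay as they are):

```python
# ...earlier notebook cells create `pipe`, the StableDiffusionPipeline...

pipe = pipe.to("cuda")  # existing cell: move the pipeline to the GPU

# New cell, placed immediately after the one above: swap the safety
# checker for a no-op so the NSFW filter no longer blacks out images.
def dummy_checker(images, **kwargs):
    return images, False

pipe.safety_checker = dummy_checker
```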
Stable Diffusion Infinity Colab
In this Colab we can use the Infinity tool, which lets us extend images beyond their borders (outpainting), creating new content from the existing image. It's really impressive.
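The same idea can be reproduced by hand with the inpainting pipeline that ships with diffusers, filling in a masked region from a prompt; a minimal sketch (the model id, input image, and mask are my own assumptions, and Infinity adds a canvas interface on top of this):

```python
# Minimal sketch of inpainting/outpainting with diffusers. The model id,
# input image, and mask are illustrative assumptions; white pixels in
# the mask mark the region the model should fill in.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

image = Image.open("scene.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("RGB").resize((512, 512))

result = pipe(prompt="a futuristic skyline", image=image, mask_image=mask)
result.images[0].save("outpainted.png")
```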
DreamBooth with Stable Diffusion
This is an implementation of Google's DreamBooth with Stable Diffusion that, starting from a few images of a person, lets us obtain personalized results with the face we give it.
An amazing way to customize images.
https://github.com/XavierXiao/Dreambooth-Stable-Diffusion
Other Colabs
Now that you know how to work in Colab, here are others I keep finding so you can use whichever you like best. You can even make a copy and modify it to your liking to have your own version.
- https://colab.research.google.com/drive/1AfAmwLMd_Vx33O9IwY2TmO9wKZ8ABRRa
- https://colab.research.google.com/drive/1Iy-xW9t1-OQWhb0hNxueGij8phCyluOh#scrollTo=B977dVS6AZcL
- Stable Diffusion for lossy image compression https://colab.research.google.com/drive/1Ci1VYHuFJK5eOX9TB0Mq4NsqkeDrMaaH?usp=sharing
- Implementation with Keras https://colab.research.google.com/drive/1zVTa4mLeM_w44WaFwl7utTaa6JcaH1zK
From its official website
A simple way to use it, much like using DALL-E 2 on OpenAI's platform, but here the service is paid: https://stability.ai/
From Hugging Face
An interesting option to test it quickly and generate a few images just to see how it works, but it lacks many of the options we will want if we are going to get serious about this.
Using AWS or another cloud service
The Stable Diffusion model can also be run on cloud hardware; a classic choice is Amazon's AWS. Right now I am testing EC2 instances to work with different algorithms. I'll report back on how it goes.
Other paid services
There are many, and more keep emerging, from integrations in stock photo sites to services that let us connect via APIs. Some have caught my attention, although personally I am going to use the free options.
Tools for prompt engineering
Prompt engineering refers to crafting the prompt, that is, the phrase we feed the model so that it generates our images. It is not a trivial matter, and you have to know how to do it well to obtain great results.
A very useful tool for learning is Lexica, where we can see images along with the prompt used to generate them, the seed, and the guidance scale.
Browsing around, you will learn what kind of elements to include in the prompt to obtain the type of result you are looking for.
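In the diffusers library, for example, the seed and the guidance scale map directly onto the pipeline call, so you can reproduce or tweak a prompt you found on Lexica; a minimal sketch (model id and values are illustrative):

```python
# Minimal sketch: fixing the seed and the guidance scale so an image
# can be reproduced exactly. Model id and values are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")

generator = torch.Generator("cuda").manual_seed(1024)  # same seed -> same image
image = pipe(
    "City skyline with skyscrapers, digital art, ultra detailed, 4k",
    guidance_scale=7.5,  # how strongly the result must follow the prompt
    generator=generator,
).images[0]
image.save("reproducible.png")
```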