We Explain How Stable Diffusion Creates AI Generative Art
If you are new to the world of Artificial Intelligence, and of course generative art, then you might have heard of the AI generative art tool Stable Diffusion. Stable Diffusion is a deep learning model that can generate high-quality images from user input, commonly known as prompts. But you might want to know: how does Stable Diffusion work?
Let us enlighten you on how the amazing technology behind Stable Diffusion works and how you can create realistic images from prompts. Let’s face it, here at So Artificial we have created dozens of guides on Stable Diffusion, so we might as well explain how it all works.
How does Stable Diffusion create images?
Stable Diffusion is a generative model built with deep learning and trained on a vast dataset of images. It works text-to-image: you give it a text description (a prompt) and it creates an image from it. Under the hood is a complex neural network that has learned to map text descriptors to images, so it can create pictures from user text input such as: “A frog in a posh suit reading a novel, monocle and top hat“
OK, it might not be perfect, but it will do as a simple example of what Stable Diffusion is capable of from a prompt.
The way Stable Diffusion creates these images is by using a method called, you guessed it, “diffusion”. The diffusion process starts from random noise and repeatedly updates the image’s pixels, smoothing away a little of that noise at each update until the final image you see emerges. These repeated updates are known as steps. The more steps, the more work goes into the image; this of course doesn’t mean that using more steps always makes the image better. Please note that Stable Diffusion is an image generation tool; it does not generate text the way ChatGPT does, as that requires a different kind of dataset.
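The step-by-step refinement can be sketched with a toy example. This is a drastic simplification, not the real model: real Stable Diffusion uses a trained neural network to predict the noise to remove at each step, and the “target” below stands in for the direction the text prompt steers the image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "target" image the process should converge toward.
# (In real Stable Diffusion, the text prompt steers this direction.)
target = np.linspace(0.0, 1.0, 64).reshape(8, 8)

# Start from pure random noise, just like the diffusion process does.
image = rng.normal(size=(8, 8))

steps = 30
for _ in range(steps):
    # Each step removes a little noise by nudging pixels toward the target.
    # The real model predicts the noise to remove with a neural network.
    image = image + 0.2 * (target - image)

# After enough steps, the noise has been smoothed away almost entirely.
final_error = float(np.abs(image - target).mean())
```

Each pass shrinks the remaining noise by a constant factor here, which is why more steps refine the image further but with diminishing returns.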
SD Is an Energy-Based Model
Stable Diffusion is what is known as an energy-based model (EBM), a form of generative model (GM) imported directly from statistical physics into machine learning. Generative models learn by analysing a sample dataset; in Stable Diffusion’s case, the dataset used was an insanely huge collection of images. EBMs are trained to learn the characteristics of a dataset and can then generate new samples that share those characteristics. This is how Stable Diffusion is able to generate images.
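Here is that idea in a hedged, drastically simplified sketch. Instead of learning an energy function over millions of images, this toy “generative model” learns just two characteristics of a one-dimensional dataset (its average and its spread) and then generates brand-new samples that resemble the data without copying any single sample:

```python
import numpy as np

rng = np.random.default_rng(42)

# A tiny "dataset": 1,000 samples from some unknown source
# (a stand-in for the huge image dataset described above).
dataset = rng.normal(loc=5.0, scale=2.0, size=1000)

# "Training": learn the characteristics of the dataset.
learned_mean = dataset.mean()
learned_std = dataset.std()

# "Generation": produce new samples that share those characteristics.
generated = rng.normal(loc=learned_mean, scale=learned_std, size=1000)
```

The generated samples are statistically similar to the dataset, yet none of them is a copy of a training sample, which is the essence of how generative models work.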
In simple terms, it uses the images it has gathered in its dataset and tries to create its very own images from them, depending on the prompt the user submits. Not so complicated now, right? That’s basically how generative models work.
EBMs are quite common in the generative AI world; you find them in robotics, natural language processing chatbots (OpenAI uses a different approach) and even computer vision.
Does Stable Diffusion use images?
It has, it does, and yes it can. Make sense? When Stable Diffusion was being created, it relied on a large dataset of images. Originally this would have been small, say a few thousand images of a person’s face; the model would then be asked to draw its own version of a face using the images in its dataset.
Nowadays that dataset is huge, seriously massive! Because of the sheer size of the image dataset, it can now draw pretty much anything you ask of it; even for things that don’t exist it will try its best to create something for you. This has, of course, caused a lot of controversy with regard to copyright law.
This is how the brain of Stable Diffusion works: it uses large datasets to create the images you see. Without these datasets, Stable Diffusion would not be able to generate images. These datasets are known as models, and there are thousands of them available all over the internet. Users can even train their own models; in theory you could gather a million pictures of, say, The Simpsons and train a model to create only that style of artwork. Can you see why people might be concerned about this technology?
Stable Diffusion can also use images as input and edit them. There is a handy little feature known as inpainting, where Stable Diffusion takes an image you provide and edits parts of it. Don’t like the look of something in your photo? Why not try out Stable Diffusion’s inpainting; sure, it’s creative and sometimes odd, but other times it’s just plain amazing.
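The core mechanic of inpainting can be sketched in a few lines. This toy version just fills a masked region with the average of the surrounding pixels; real inpainting instead runs the diffusion process on the masked region, conditioned on the prompt and on the untouched pixels, but the contract is the same: only the masked pixels change.

```python
import numpy as np

# A flat grey "photo" with a bright blemish we want to edit out.
image = np.full((8, 8), 0.5)
image[3:5, 3:5] = 1.0

# The mask marks the pixels the user wants regenerated.
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True

# Toy fill: replace masked pixels using the unmasked ones.
# (Real inpainting generates new content there via diffusion.)
fill_value = image[~mask].mean()
result = image.copy()
result[mask] = fill_value
```

Everything outside the mask is left untouched, which is why inpainting can fix one object in a photo without redrawing the rest.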
When you have Stable Diffusion installed correctly, with the correct model, you can generate images from text descriptions (prompts). All you need to do is feed Stable Diffusion input text and it will spit out the image it thinks most closely resembles your prompt. The image can be further refined by adjusting parameters and settings, which is one of the many reasons people choose Stable Diffusion.
Advantages of Stable Diffusion Over Other Generative AI Art Tools
When it comes to generative AI art tools, there are hundreds if not thousands of tools on the market that pretty much all do the same thing. The top three are Stable Diffusion, Midjourney and DALL·E 2. Each has its strengths and weaknesses.
When it comes to Stable Diffusion, its biggest weakness is using the service through third parties, which can at times be expensive or just plain slow. This is why it is recommended that you install Stable Diffusion on your home PC. Then you have complete control over every aspect of your installation, can use your own models and can edit all the settings without limitations. But this gives users another headache: is your computer compatible with Stable Diffusion? Does it have enough VRAM? This is why communities are on hand to help you with the process.
Now, when it comes to advantages, Stable Diffusion has plenty:
- Free to use – paid version via DreamStudio
- Can be installed Locally
- Open Source
- Custom models can be used
- Customizable settings
- Custom LoRA files
- Large Community
- Create your own characters with Stable Diffusion
- Extra tools available like inpainting
- Negative prompts: tell Stable Diffusion what you don’t want it to show
As you can see, Stable Diffusion has a pretty healthy list of benefits, so why on earth would you not want to use it? Stable Diffusion is our go-to generative AI art tool due to its customisation; without that, it would be just like any other AI art tool. Oh, and the fact that it’s open source is a big plus!
Do you use Stable Diffusion? If so, why not leave a comment below and let us know what you have created? If not, what other AI art tools do you use, or have you used in the past?