What is embedding?
Embedding is the result of textual inversion, a method to define new keywords in a model without modifying it. The method has gained attention because it's capable of injecting new styles or objects into a model with as few as 3 to 5 sample images.
How does textual inversion work?
The amazing thing about textual inversion is NOT the ability to add new styles or objects — other fine-tuning methods can do that as well or better. It is the fact that it can do so without changing the model.
The diagram from the original research article reproduced below illustrates how it works.
First, you define a new keyword that’s not in the model for the new object or style. That new keyword gets tokenized (that is, represented by a number) just like any other keyword in the prompt.
Each token is then converted to a unique embedding vector to be used by the model for image generation.
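To make the keyword-to-token-to-vector step concrete, here is a toy sketch in Python. The vocabulary, the token numbers, and the 4-number vectors are all made up for illustration; a real text encoder uses a far larger vocabulary, a smarter tokenizer, and much longer vectors.

```python
import random

random.seed(0)
EMBEDDING_DIM = 4  # toy size, chosen only for readability

# Hypothetical toy vocabulary: keyword -> token number.
vocab = {"a": 0, "photo": 1, "of": 2, "cat": 3}

# Toy embedding table: one vector per token number, filled with random values.
embedding_table = [
    [random.gauss(0, 1) for _ in range(EMBEDDING_DIM)] for _ in vocab
]

def tokenize(prompt):
    """Turn a prompt into token numbers (simple whitespace split)."""
    return [vocab[word] for word in prompt.lower().split()]

def embed(token_ids):
    """Look up the embedding vector for each token number."""
    return [embedding_table[token_id] for token_id in token_ids]

token_ids = tokenize("a photo of cat")
vectors = embed(token_ids)
print(token_ids)      # [0, 1, 2, 3]
print(len(vectors))   # 4 vectors, one per token
```

Adding a new keyword in this picture means adding one more entry to the vocabulary and one more row to the table, while everything else stays as it was.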
Textual inversion finds the embedding vector of the new keyword that best represents the new style or object, without changing any part of the model. You can think of it as finding a way within the language model to describe the new concept.
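The idea of "train one vector, freeze everything else" can be sketched with a toy optimization. This is not the actual training procedure: as a stand-in, the "model" here is a fixed random linear map, the "sample images" are a random target, and the loss is a plain squared error. All names and sizes are hypothetical.

```python
import random

random.seed(0)
DIM = 4        # toy embedding size
FEATURES = 6   # toy "image feature" size

# Frozen stand-in for the model: a fixed linear map that is never updated.
frozen_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(FEATURES)]
weights_before = [row[:] for row in frozen_weights]

# Stand-in for what the sample images of the new concept look like.
target = [random.gauss(0, 1) for _ in range(FEATURES)]

def model(vector):
    """Apply the frozen linear map to an embedding vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in frozen_weights]

def loss(vector):
    """Squared error between the model output and the target."""
    return sum((p - t) ** 2 for p, t in zip(model(vector), target))

# The only trainable quantity: the embedding vector of the new keyword.
new_embedding = [0.0] * DIM
initial_loss = loss(new_embedding)

learning_rate = 0.01
for _ in range(500):
    errors = [p - t for p, t in zip(model(new_embedding), target)]
    # Gradient of the squared error with respect to the embedding vector only.
    gradient = [
        2 * sum(errors[i] * frozen_weights[i][j] for i in range(FEATURES))
        for j in range(DIM)
    ]
    new_embedding = [x - learning_rate * g for x, g in zip(new_embedding, gradient)]

final_loss = loss(new_embedding)
print(initial_loss, final_loss)          # the loss goes down
print(frozen_weights == weights_before)  # the "model" is untouched
```

The point of the sketch is the division of labor: the update step only ever writes to new_embedding, so the result can be saved as a tiny file and shared separately from the model.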
Examples of embeddings
Embeddings can be used for new objects. Below is an example of injecting a toy cat. Note that the new concept (toy cat) can be used with other existing concepts (boat, backpack, etc.) in the model.
Embeddings can also be a new style. The example below shows embedding a new style and transferring the style to different contexts.
Where to find embeddings
Hugging Face hosts the Stable Diffusion Concept Library, which is a repository of a large number of custom embeddings.
Civitai is another great site where you can browse models, including embeddings. Filter by textual inversion to view embeddings only.
How to use embeddings
Stable Diffusion Conceptualizer is a great way to try out embeddings without downloading them.
First identify the embedding you want to test in the Concept Library. Let’s say you want to use this Marc Allante style. Next, identify the token needed to trigger this style. You can find it in the file token_identifier.txt.
Putting that token in the prompt gives you the unique Marc Allante style.
The downside of the web interface is that you cannot use the embedding with a different model or change any parameters.
Using an embedding in AUTOMATIC1111 is easy.
First, download an embedding file from the Concept Library. It is the file named learned_embeds.bin. Don’t right-click and save the link on the file listing page; that saves the webpage the link points to instead of the file. Click the file name, then click the download button on the next page.
Next, rename the file to the keyword you want to use for this embedding. It has to be something that does not already exist in the model. marc_allante.bin is a good choice.
Put it in the embeddings folder in the GUI’s working directory.
Restart the GUI. In the startup terminal, you should see a message like:
Loaded a total of 1 textual inversion embeddings.
Use the filename (without the extension) as part of the prompt to trigger the embedding.
For example, the following prompt would work on AUTOMATIC1111.
We get the image with the expected style.
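The rename-and-copy steps can also be scripted. The sketch below is one possible way to do it in Python; the function name and the example paths are hypothetical, and it assumes the GUI's working directory already contains an embeddings folder, as described above.

```python
import shutil
from pathlib import Path

def install_embedding(downloaded_file, webui_dir, keyword):
    """Copy a downloaded embedding into the embeddings folder, named after the keyword.

    The file extension of the download (.bin or .pt) is kept as-is.
    """
    source = Path(downloaded_file)
    embeddings_dir = Path(webui_dir) / "embeddings"
    if not embeddings_dir.is_dir():
        # Refuse to guess: a missing folder usually means a wrong webui_dir.
        raise FileNotFoundError(f"No embeddings folder found in {webui_dir}")
    destination = embeddings_dir / f"{keyword}{source.suffix}"
    shutil.copyfile(source, destination)
    return destination

# Example call (paths are placeholders, adjust them to your own setup):
# install_embedding("Downloads/learned_embeds.bin", "path/to/webui", "marc_allante")
```

Copying rather than moving keeps the original download around, which is handy if you later want to try a different keyword.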
Shortcut to use embeddings in AUTOMATIC1111
An embedding won’t work if the keyword is even one letter off. Also, you cannot use v1 embeddings with v2, and vice versa, because they use two different language models.
Have you ever wondered how to make sure you are actually using the embedding? It can be difficult to tell because its effect can sometimes be subtle.
There’s a little trick in AUTOMATIC1111 to ensure that. There’s a button between the trash and the copy buttons that looks like a little iPod (sorry if it was from a time before you were born…).
Click it and you will see all the embeddings that are available. They are all under the Textual Inversion tab.
Clicking any of them will insert that into the prompt. This function is especially useful to eliminate the tedious work of making sure you’ve entered the embedding magic word correctly.
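If you build prompts outside the GUI, a similar sanity check can be sketched in Python. This is only an illustration, not part of AUTOMATIC1111: the function names are made up, and it assumes that each keyword is simply the name of a .bin or .pt file in the embeddings folder without its extension.

```python
import difflib
import re
from pathlib import Path

def available_keywords(embeddings_dir):
    """List keywords, taken as file names without the .bin or .pt extension."""
    return sorted(
        path.stem
        for path in Path(embeddings_dir).iterdir()
        if path.suffix in {".bin", ".pt"}
    )

def check_prompt(prompt, keywords):
    """Return (exact keywords found, {mistyped word: suggested keyword})."""
    words = re.findall(r"[A-Za-z0-9_\-]+", prompt)
    found = [keyword for keyword in keywords if keyword in words]
    near_misses = {}
    for word in words:
        if word in keywords:
            continue
        # Flag words that are almost, but not exactly, a known keyword.
        close = difflib.get_close_matches(word, keywords, n=1, cutoff=0.85)
        if close:
            near_misses[word] = close[0]
    return found, near_misses

print(check_prompt("(marc_alante:1.2) a dog", ["marc_allante"]))
# ([], {'marc_alante': 'marc_allante'})
```

An empty first list together with a non-empty second one is the "one letter off" situation: the prompt will run, but the embedding is silently ignored.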
Note on using embeddings in AUTOMATIC1111
If you pay attention to the prompt, you will notice I have increased the strength of the triggering keyword marc_allante. I found that it is necessary to adjust the keyword strength. This may have something to do with the way AUTOMATIC1111 loads the embedding.
You may have to play with the keyword strength to get the effect you want. Below is an example of varying the strength while keeping the seed and everything else the same.
To further complicate the matter, the strength needed could be different for different seed values.
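Trying several strengths by hand gets tedious, so here is a small Python sketch that writes out the prompt variants using the (keyword:strength) syntax. The helper names and the "a dog" prompt are placeholders; you would paste each resulting prompt into the GUI while keeping the seed fixed.

```python
def weighted(keyword, strength):
    """Format a keyword with the (keyword:strength) emphasis syntax."""
    return f"({keyword}:{strength:g})"

def strength_sweep(keyword, rest_of_prompt, strengths):
    """Build one prompt per strength value, leaving the rest of the prompt unchanged."""
    return [f"{weighted(keyword, s)} {rest_of_prompt}" for s in strengths]

prompts = strength_sweep("marc_allante", "a dog", [0.8, 1.0, 1.2, 1.4])
for prompt in prompts:
    print(prompt)
# (marc_allante:0.8) a dog
# (marc_allante:1) a dog
# (marc_allante:1.2) a dog
# (marc_allante:1.4) a dog
```

Because the needed strength can change with the seed, it may be worth repeating the sweep for a couple of seeds before settling on a value.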
Some embeddings I like
There are more embeddings available than I can try. Here are a few that I like.
If you have played with Stable Diffusion base models, you will find it impossible to generate wlop’s style no matter how hard you try. An embedding together with a custom model can finally do this.
If you try it out, you may find it doesn’t work at all. What you need to do is adjust the prompt strength.
A working prompt for AUTOMATIC1111 is
(wlop_style :0.6) (m_wlop:1.4) woman wearing dress, perfect face, beautiful detailed eyes, long hair, birds
closed eyes, disfigured, deformed
wlop_style is the keyword for the embedding; m_wlop is the keyword for the model.
Don’t get frustrated if you don’t get the style. Try changing the prompt strengths of the two keywords. Some objects may simply not work with the embedding. Try some common objects in wlop’s artworks.
(_kuvshinov:1), a woman with beautiful detailed eyes, highlight hair
(Note I have renamed the embedding as
Difference between embedding, dreambooth and hypernetwork
There are three popular methods to fine-tune Stable Diffusion models: textual inversion (embedding), dreambooth and hypernetwork.
An embedding defines a new keyword to describe a new concept without changing the model. The embedding vectors are stored in .bin or .pt files. The file size is very small, usually less than 100 KB.
Dreambooth injects a new concept by fine-tuning the whole model. The file size is typical of a Stable Diffusion model, around 2 to 4 GB. The file extension is the same as for other models, .ckpt.
A hypernetwork is an additional network attached to the denoising UNet of the Stable Diffusion model. Its purpose is to fine-tune a model without changing the model itself. The file size is typically about 100 MB.
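The tiny size of an embedding file follows from what it stores: a handful of vectors rather than network weights. As a rough back-of-the-envelope sketch, assuming 768 numbers per vector (the text-embedding width of the v1 text encoder), 32-bit floats, and no file-format overhead:

```python
BYTES_PER_FLOAT32 = 4

def embedding_size_kb(num_vectors, vector_length=768):
    """Approximate raw size in KB of an embedding made of num_vectors vectors."""
    return num_vectors * vector_length * BYTES_PER_FLOAT32 / 1024

for num_vectors in (1, 4, 16):
    print(num_vectors, embedding_size_kb(num_vectors))
# 1 3.0
# 4 12.0
# 16 48.0
```

Even an embedding spread over 16 vectors stays well under 100 KB in this estimate, which is in line with the sizes quoted above. The number of vectors per embedding is an assumption here and varies with how the embedding was trained.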
Pros and Cons of using embedding
One of the advantages of using an embedding is its small size. With a file size of 100 KB or less, it is easy to keep many of them in your local storage. Because embeddings are just new keywords, they can be used together in the same image.
The drawback of using an embedding is that sometimes it’s not clear which model it is supposed to be used with. If the trainer didn’t say, you can start with v1.4 or v1.5. You may also want to include a VAE to see if that makes any difference. For anime styles, it is not uncommon for trainers to use anime models like Anything v3.
In general, I found using embedding a bit more difficult than using custom models. I had trouble reproducing the demo styles in many embeddings I downloaded. It’s true that I may get there if I keep tweaking the keyword strengths, but in reality people move on after a few tries.