LoRA models are small Stable Diffusion models that apply tiny changes to standard checkpoint models. They are usually 10 to 100 times smaller than checkpoint models. That makes them very attractive to people who keep an extensive collection of models.
This is a tutorial for beginners who haven’t used LoRA models before. You will learn what LoRA models are, where to find them, and how to use them in the AUTOMATIC1111 GUI. You will find a few demos of LoRA models at the end.
What are LoRA models?
LoRA (Low-Rank Adaptation) is a training technique for fine-tuning Stable Diffusion models.
But we already have training techniques such as Dreambooth and textual inversion. What’s the big deal about LoRA? LoRA offers a good trade-off between file size and training power. Dreambooth is powerful but results in large model files (2-7 GB). Textual inversions are tiny (about 100 KB), but you can’t do as much with them.
LoRA sits in between. Its file size is much more manageable (2-200 MB), and the training power is decent.
Stable Diffusion users who like experimenting with models can tell you how quickly their local storage fills up. Because checkpoint files are so large, it is hard to maintain a collection on a personal computer. LoRA is an excellent solution to the storage problem.
Like textual inversion, you cannot use a LoRA model alone. It must be used with a model checkpoint file. LoRA modifies styles by applying small changes to the accompanying model file.
How does LoRA work?
LoRA applies small changes to the most critical part of Stable Diffusion models: the cross-attention layers. They are the part of the model where the image and the prompt meet. Researchers found it sufficient to fine-tune this part of the model to achieve good training. The cross-attention layers are the yellow parts in the Stable Diffusion model architecture below.
The weights of a cross-attention layer are arranged in matrices. Matrices are just a bunch of numbers arranged in columns and rows, like on an Excel spreadsheet. A LoRA model fine-tunes a model by adding its weights to these matrices.
How can LoRA model files be smaller if they need to change the same number of weights? The trick of LoRA is expressing the changes to a matrix as the product of two smaller (low-rank) matrices. It stores far fewer numbers by doing this. Let’s illustrate this with the following example.
Let’s say the model has a matrix with 1,000 rows and 2,000 columns. That’s 2,000,000 numbers (1,000 x 2,000) to store in the model file. LoRA breaks down the matrix into a 1,000-by-2 matrix and a 2-by-2,000 matrix. That’s only 6,000 numbers (1,000 x 2 + 2 x 2,000), about 333 times fewer. That’s why LoRA files are a lot smaller.
In this example, the rank of the matrices is 2. It is much lower than the original dimensions, so they are called low-rank matrices. The rank can be as low as 1.
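To make the arithmetic above concrete, here is a minimal Python sketch. It only counts stored numbers and multiplies two tiny matrices; the function names (`full_count`, `low_rank_count`, `matmul`) and the small demo values are made up for illustration and are not taken from any LoRA library.

```python
# Count the numbers stored for a full matrix versus its low-rank factors.
def full_count(rows, cols):
    return rows * cols

def low_rank_count(rows, cols, rank):
    # One (rows x rank) matrix plus one (rank x cols) matrix.
    return rows * rank + rank * cols

def matmul(a, b):
    # Plain-Python matrix product, good enough for a tiny demo.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

rows, cols, rank = 1000, 2000, 2
full = full_count(rows, cols)
low = low_rank_count(rows, cols, rank)
print(full, low, round(full / low))  # 2000000 6000 333

# Tiny rank-1 demo: 3 + 4 = 7 stored numbers expand to a full 3 x 4 matrix.
left = [[1.0], [2.0], [3.0]]
right = [[0.5, 0.0, -1.0, 2.0]]
delta = matmul(left, right)
print(len(delta), len(delta[0]))  # 3 4
```

The small demo shows the other half of the trick: multiplying the two thin matrices back together gives a matrix of the original shape, so it can be added to the checkpoint's weights.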
But is there any harm in pulling a trick like that? Researchers found that doing it in the cross-attention layers did not affect the power of fine-tuning much. So we are good.
Where to find LoRA models?
The go-to place to find LoRA models is Civitai. The site hosts a large collection of them. Apply the LORA filter to see only LoRA models. You may find that they all tend to be similar: female portraits, anime, realistic illustration styles, etc.
(Be aware that there’s a lot of NSFW stuff on Civitai. Make sure to turn on the NSFW filter if you don’t want to see something you cannot unsee…)
Hugging Face is another good source of LoRA models. Its collection is a lot smaller, but you will find more variety there.
How to use LoRA?
AUTOMATIC1111 supports LoRA natively. You don’t need to install any extensions.
Step 1: Install LoRA model
To install LoRA models in AUTOMATIC1111 webui, put the model files in the following folder.
Step 2: Use LoRA model
To use a LoRA model, put a phrase of the following form in the prompt: <lora:filename:multiplier>
filename is the file name of the LoRA model, excluding the extension.
multiplier is the weight applied to the LoRA model. The default is 1. Setting it to 0 disables the model.
How can you be sure the filename is correct? Instead of writing this phrase, you should click on the model button.
Click on the Lora tab. You should see a list of LoRA models installed. Click on the one you want to use.
The LoRA phrase will be inserted in the prompt.
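If you ever need to write the phrase by hand or in a script, the rule above (file name without extension, then the multiplier) can be sketched in a few lines of Python. The helper name `lora_phrase` is hypothetical, and the `.safetensors` extension in the example is an assumption; use whatever extension your downloaded file actually has.

```python
from pathlib import Path

def lora_phrase(model_file, multiplier=1.0):
    # Path.stem drops the final extension, leaving the name used in the phrase.
    name = Path(model_file).stem
    # The "g" format prints 1.0 as "1" and 0.6 as "0.6".
    return f"<lora:{name}:{multiplier:g}>"

print(lora_phrase("Moxin_Shukezouma.safetensors"))
# <lora:Moxin_Shukezouma:1>
print(lora_phrase("akemiTakada1980sStyle_1.safetensors", 0.6))
# <lora:akemiTakada1980sStyle_1:0.6>
```

Clicking the model in the Lora tab remains the safer route, since it always uses the name the GUI itself sees.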
Notes on using LoRA
You may adjust the multiplier to crank up or tone down the effect. Setting the multiplier to 0 disables the LoRA model. You can adjust the style effect between 0 and 1.
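One way to picture the multiplier is as a scale factor on the LoRA's changes before they are added to the checkpoint's weights. The sketch below assumes exactly that simple linear scaling; the function name `apply_lora` and the 2-by-2 numbers are invented for illustration, and the real implementation in AUTOMATIC1111 may include further scaling details.

```python
def apply_lora(base, delta, multiplier=1.0):
    # Add the scaled LoRA change to every weight of the base matrix.
    return [[w + multiplier * d for w, d in zip(base_row, delta_row)]
            for base_row, delta_row in zip(base, delta)]

base = [[1.0, 2.0], [3.0, 4.0]]    # stand-in for checkpoint weights
delta = [[0.5, -0.5], [1.0, 0.0]]  # stand-in for the LoRA's changes

print(apply_lora(base, delta, 0))    # [[1.0, 2.0], [3.0, 4.0]]
print(apply_lora(base, delta, 0.5))  # [[1.25, 1.75], [3.5, 4.0]]
print(apply_lora(base, delta, 1))    # [[1.5, 1.5], [4.0, 4.0]]
```

With a multiplier of 0 the weights come back unchanged, which matches the LoRA being disabled.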
Some LoRA models are trained with Dreambooth. You will need to include a trigger keyword to use the LoRA model. You can find the trigger keyword on the model’s page.
Similar to embeddings, you can use multiple LoRA models at the same time. You can also use them with embeddings.
In AUTOMATIC1111, the LoRA phrase is not part of the prompt. It will be removed after the LoRA model is applied. That means you cannot use prompt syntax like [keyword1:keyword2: 0.8] with them.
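To illustrate what "removed after the LoRA model is applied" means, here is a rough Python sketch that pulls LoRA phrases out of a prompt and returns the remaining text. It is not AUTOMATIC1111's actual parsing code; the function name `split_lora_phrases`, the regular expression, and the names `styleA` and `styleB` are assumptions made for this example.

```python
import re

# Matches <lora:NAME:MULTIPLIER> as it appears in the prompts in this article.
LORA_PATTERN = re.compile(r"<lora:([^:>]+):([^>]+)>")

def split_lora_phrases(prompt):
    # Collect (name, multiplier) pairs, then strip the phrases from the text.
    loras = [(name, float(mult)) for name, mult in LORA_PATTERN.findall(prompt)]
    text = LORA_PATTERN.sub("", prompt)
    return " ".join(text.split()), loras

prompt = "cyberpunk, tarot card, masterpiece <lora:cyberpunk2077Tarot_tarotCard512x1024:0.6>"
text, loras = split_lora_phrases(prompt)
print(text)   # cyberpunk, tarot card, masterpiece
print(loras)  # [('cyberpunk2077Tarot_tarotCard512x1024', 0.6)]

# Several LoRA phrases in one prompt are collected in order.
print(split_lora_phrases("a cat <lora:styleA:1> in rain <lora:styleB:0.5>"))
# ('a cat in rain', [('styleA', 1.0), ('styleB', 0.5)])
```

Because the phrases are taken out before the text is processed, they never take part in keyword blending or other prompt syntax.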
Some LoRA models
Here’s my biased selection of LoRA models.
Shukezouma LoRA model brings out a stylish Chinese ink painting theme. Shukezouma means the negative space (commonly seen in Chinese paintings) of the painting is so ample that a horse can pass through it.
Use this LoRA with the Chinese-style model Guo Feng.
Trigger keyword: shukezouma
(shukezouma:0.5) ,<lora:Moxin_Shukezouma:1> , chinese painting, half body, female, perfect symmetric face, detailed chinese dress, mountains, flowers, 1girl, tiger
disfigured, ugly, bad, immature
Akemi Takada (1980s) Style
Akemi Takada is a Japanese manga illustrator. This is for you if you like Japanese anime from the 1980s and 1990s.
Use with the AbyssOrangeMix2 model.
takada akemi, Tifa lockhart as magician, Final Fantasy VII, 1girl, small breast, beautiful eyes, brown hair, smiling, red eyes, highres, diamond earring, long hair, side parted hair, hair behind ear, upper body, stylish dress, indoors, bar 1980s (style), painting (medium), retro artstyle, watercolor (medium) <lora:akemiTakada1980sStyle_1:0.6>
(worst quality, low quality:1.4), (painting by bad-artist-anime:0.9), (painting by bad-artist:0.9), watermark, text, error, blurry, jpeg artifacts, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, artist name, bad anatomy
Cyberpunk 2077 Tarot card
This LoRA model generates cyborgs and cities with a futuristic cyberpunk style.
Use with the Anything v4.5 model.
cyberpunk, tarot card, close up, portrait, bionic body, cat, young man, perfect human symmetric face, leather metallic jacket, circuit, city street in background, natural lighting, masterpiece <lora:cyberpunk2077Tarot_tarotCard512x1024:0.6>
(worst quality, low quality:1.4), (painting by bad-artist-anime:0.9), (painting by bad-artist:0.9), watermark, text, error, blurry, jpeg artifacts, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, artist name, bad anatomy, big breast
LoRA models are small modifiers of checkpoint models. You can easily use them in AUTOMATIC1111 by including a phrase in the prompt.
That’s it for today! I will tell you how to train LoRA models in a future article.
Below are some additional readings for curious minds.
- Using LoRA for Efficient Stable Diffusion Fine-Tuning (Hugging Face blog). A good overview of how LoRA is applied to Stable Diffusion.
- LoRA: Low-Rank Adaptation of Large Language Models (2021). The research article that first proposed the LoRA technique (for language models).
- GitHub: Low-rank Adaptation for Fast Text-to-Image Diffusion Fine-tuning. This is the initial work applying LoRA to Stable Diffusion. You will find an excellent technical description of LoRA.