This post will go through:
- What the SDXL model is.
- Comparing images generated with the v1 and SDXL models.
- Running SDXL on AUTOMATIC1111 Web-UI.
- Running SDXL with an AUTOMATIC1111 extension.
- Running SDXL with SD.Next. (Windows)
If you want to try SDXL quickly, using it with the AUTOMATIC1111 Web-UI is the easiest way. However, it is a bit of a hassle to use the refiner in AUTOMATIC1111. SD.Next is for people who want to use the base and the refiner in one go.
What is the SDXL model?
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.
It is a much larger model, and in the AI world, bigger usually means better. The SDXL model has 6.6 billion parameters in total, compared with 0.98 billion for the v1.5 model.
Differences between SDXL and v1.5 models
The SDXL model is, in practice, two models. You run the base model, followed by the refiner model. The base model sets the global composition. The refiner model adds finer details. (You can optionally run the base model alone.)
The language model (the module that understands your prompts) is a combination of the largest OpenCLIP model (ViT-G/14) and OpenAI’s CLIP ViT-L. This is a smart choice: Stable Diffusion v2 uses OpenCLIP alone and is hard to prompt, and bringing back OpenAI’s CLIP makes prompting easier. Prompts that work on v1.5 have a good chance of working on SDXL.
The U-Net, the most crucial part of the diffusion model, is now 3 times larger. Together with the larger language model, the SDXL model generates high-quality images matching the prompt closely.
The default image size of SDXL is 1024×1024. This is four times the pixel count of the v1.5 model’s 512×512.
Sample images from SDXL
According to Stability AI’s own study, most users prefer the images from the SDXL model over the v1.5 base model. You will find a series of images generated with the same prompts from the v1.5 and SDXL models. You can decide for yourself.
Let’s first compare realistic images generated using the prompt in the realistic people tutorial.
photo of young Caucasian woman, highlight hair, sitting outside restaurant, wearing dress, rim lighting, studio lighting, looking at the camera, dslr, ultra quality, sharp focus, tack sharp, dof, film grain, Fujifilm XT3, crystal clear, 8K UHD, highly detailed glossy eyes, high detailed skin, skin pores
disfigured, ugly, bad, immature, cartoon, anime, 3d, painting, b&w
All parameters except image sizes are kept the same for comparison. The size for the v1 models is 512×512. The size for the SDXL model is 1024×1024.
Here are the SD 1.5 images.
Here’s the base model.
Here’s the base model with the refiner.
The refiner model improves rendering details.
Using the base v1.5 model is not doing justice to the v1 models. Most users use fine-tuned v1.5 models to generate realistic people. So I include the result using URPM, an excellent realistic model, below.
Below is another set of comparison images using a different seed value.
The SDXL base model produced a usable image in this set, although the face looks a bit too smooth for a realistic image. The refiner adds nice realistic details to the face.
The ability to generate correct text stood out as a ground-breaking capability when I tested the SDXL Beta Model. SDXL should be at least as good.
A fast food restaurant on the moon with name “Moon Burger”
disfigured, ugly, bad, immature, cartoon, anime, 3d, painting, b&w
Here are the images from the SDXL base and the SDXL base with refiner.
On the other hand, the v1.5 base model fails miserably. Not only did it fail to produce legible text, it also did not generate a correct image.
Let’s compare the images with the Anime style.
anime, photorealistic, 1girl, collarbone, wavy hair, looking at viewer, upper body, necklace, floral print, ponytail, freckles, red hair, sunlight
disfigured, ugly, bad, immature, photo, amateur, overexposed, underexposed
Here are images from the SDXL model with and without the refiner.
Here are images from v1.5 and the Anything v4.5 model (fine-tuned v1.5).
The SDXL base model produces decent anime images. The images are fantastic for a base model. The refiner adds good details but seems to introduce some repeating artifacts. Getting to a particular style likely needs custom fine-tuned models, as with v1.
Finally, some sample images of a city with this simple prompt.
Painting of a beautiful city by Brad Rigney.
This is from v1.5 for comparison.
Download SDXL 1.0 model
You can find the SDXL base, refiner and VAE models in the following repository.
Here are the direct download links of the safetensor model files. You typically don’t need to download the VAE file unless you plan to try out different ones.
Tips on using SDXL 1.0 model
A Stability AI staff member has shared some tips on using the SDXL 1.0 model. Here’s a summary.
- Negative prompt. Negative prompts are not as necessary as they are for the v1.5 and v2.0 models. Many common negative terms are useless, e.g. extra fingers.
- Keyword weight. You don’t need keyword weights as high as in the v1 models. A weight of 1.5 is very high for the SDXL model. You may need to reduce the weights when you reuse a prompt from v1 models. Lowering a weight works better than increasing one.
- Safetensor. Always use the safetensor version, not the checkpoint version. It is safer and won’t execute code on your machine.
- Refiner strength. Use a low refiner strength for the best outcome.
- Refiner. Use a noisy image to get the best out of the refiner.
- Image size. The native size is 1024×1024. SDXL supports different aspect ratios, but the quality is sensitive to size. Here are the image sizes used in DreamStudio, Stability AI’s official image generator:
- 21:9 – 1536 x 640
- 16:9 – 1344 x 768
- 3:2 – 1216 x 832
- 5:4 – 1152 x 896
- 1:1 – 1024 x 1024
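These sizes follow a simple pattern: every dimension is a multiple of 64, and the total pixel count stays close to the native 1024×1024. A small Python sketch (the helper name is ours, not part of any tool) encodes the table:

```python
# DreamStudio's SDXL sizes, keyed by aspect ratio. Every dimension is a
# multiple of 64 and the pixel count stays close to the native 1024x1024.
SDXL_SIZES = {
    "21:9": (1536, 640),
    "16:9": (1344, 768),
    "3:2": (1216, 832),
    "5:4": (1152, 896),
    "1:1": (1024, 1024),
}

def sdxl_size(aspect_ratio: str) -> tuple:
    """Look up the recommended (width, height) for a supported aspect ratio."""
    return SDXL_SIZES[aspect_ratio]

for ratio, (w, h) in SDXL_SIZES.items():
    assert w % 64 == 0 and h % 64 == 0  # SDXL sizes come in 64-pixel units
    print(f"{ratio}: {w}x{h} ({w * h:,} pixels)")
```

Swap width and height for portrait orientation (e.g. 832×1216 for 2:3).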
Run SDXL model on AUTOMATIC1111
The update that supports SDXL was released on July 24, 2023. You may need to update your AUTOMATIC1111 to use the SDXL models.
Download the Quick Start Guide if you are new to Stable Diffusion.
Installing SDXL 1.0 models on Google Colab
Installing SDXL 1.0 models on Windows or Mac
Download the SDXL base and refiner models and put them in the models/Stable-diffusion folder as usual. See the model install guide if you are new to this.
After clicking the refresh icon next to the Stable Diffusion Checkpoint dropdown menu, you should see the two SDXL models showing up in the dropdown menu.
Using SDXL base model text-to-image
Using the SDXL base model on the txt2img page is no different from using any other models. The basic steps are:
- Select the SDXL 1.0 base model in the Stable Diffusion Checkpoint dropdown menu
- Enter a prompt and, optionally, a negative prompt.
- Set image size to 1024×1024, or something close to 1024 for a different aspect ratio. (see the tips section above)
IMPORTANT: Make sure you didn’t select a VAE of a v1 model. Go to Settings > Stable Diffusion. Set SD VAE to None or Automatic.
TIPS: In Settings > User Interface > QuickSetting, add sd_vae to get a dropdown menu for selecting VAEs next to the checkpoint dropdown.
1girl ,solo,high contrast, hands on the pocket, (black and white dress, looking at viewer, white and light blue theme, white and light blue background, white hair, blue eyes, full body, black footwear the light blue water on sky and white cloud and day from above, Ink painting
sketch, ugly, huge eyes, text, logo, monochrome, bad art
Size: 896 x 1152
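Per the tip above about keyword weights, prompts reused from v1 models often carry weights like 1.5 that are too strong for SDXL. Here is a rough, regex-based helper (our own sketch, not a feature of AUTOMATIC1111) that scales every (keyword:weight) token in the A1111 prompt syntax:

```python
import re

def scale_weights(prompt: str, factor: float) -> str:
    """Multiply every (keyword:weight) in an AUTOMATIC1111-style prompt by factor."""
    def repl(m: "re.Match") -> str:
        # Rebuild the token with the scaled weight, rounded to 2 decimals.
        return f"({m.group(1)}:{float(m.group(2)) * factor:.2f})"
    return re.sub(r"\(([^:()]+):(\d+(?:\.\d+)?)\)", repl, prompt)

# Tone a v1 prompt down by 20% before trying it on SDXL.
print(scale_weights("portrait, (freckles:1.5), (red hair:1.2)", 0.8))
# portrait, (freckles:1.20), (red hair:0.96)
```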
Using the refiner model
The refiner step is done in the img2img page.
1. Click Send to img2img under the output image.
2. Select the SDXL 1.0 refiner model in the Stable Diffusion Checkpoint dropdown menu.
3. Set denoising strength to 0.1-0.3. (This IS the refiner strength. Increase it to add more detail.)
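Why a low denoising strength is enough: in most img2img implementations (AUTOMATIC1111 included), the strength controls how far the image is re-noised, so only roughly steps × strength sampling steps actually run. A quick illustration (the helper is our own approximation):

```python
def refiner_steps(sampling_steps: int, denoising_strength: float) -> int:
    """Approximate number of img2img sampling steps that actually run.

    The input image is only re-noised to a depth of `denoising_strength`,
    so roughly `sampling_steps * denoising_strength` steps are executed.
    """
    return int(sampling_steps * denoising_strength)

# With 30 sampling steps, the recommended 0.1-0.3 range runs 3-9 steps:
# enough to add detail without changing the composition.
for strength in (0.1, 0.2, 0.3):
    print(f"strength {strength}: ~{refiner_steps(30, strength)} steps")
```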
Using preset styles for SDXL
DreamStudio, the official Stable Diffusion generator, has a list of preset styles available. They are actually implemented by adding keywords to the prompt and negative prompt. You can install the StyleSelectorXL extension to add the same list of preset styles to AUTOMATIC1111.
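To illustrate the mechanism, here is a minimal sketch of prompt-based style presets. The style entries below are made-up examples, not the actual StyleSelectorXL data, but the real extension stores its styles in a JSON file of the same shape:

```python
# Hypothetical style presets: each entry is a prompt template with a
# {prompt} placeholder plus extra negative-prompt keywords.
STYLES = {
    "base": {"prompt": "{prompt}", "negative_prompt": ""},
    "anime": {
        "prompt": "anime artwork, {prompt}, anime style, vibrant",
        "negative_prompt": "photo, deformed, black and white, realism",
    },
}

def apply_style(style: str, prompt: str, negative_prompt: str = "") -> tuple:
    """Substitute the user prompt into the style template and merge negatives."""
    entry = STYLES[style]
    styled_prompt = entry["prompt"].format(prompt=prompt)
    styled_negative = ", ".join(
        part for part in (entry["negative_prompt"], negative_prompt) if part
    )
    return styled_prompt, styled_negative

p, n = apply_style("anime", "1girl, red hair", "ugly")
print(p)  # anime artwork, 1girl, red hair, anime style, vibrant
print(n)  # photo, deformed, black and white, realism, ugly
```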
Installing StyleSelectorXL extension
To install the extension, navigate to the Extensions page in AUTOMATIC1111. Select the Install from URL tab. Put the following in the URL for extension’s git repository field.
Press Install. After you see the confirmation of successful installation, restart the AUTOMATIC1111 Web-UI completely.
Using SDXL style selector
You should see a new section appear on the txt2img page.
Write the prompt and the negative prompt as usual. Make sure the SDXL Styles option is enabled. Select a style other than base to apply a style.
Run SDXL model using an AUTOMATIC1111 extension
Update July 26, 2023: You no longer need this extension to run SDXL on AUTOMATIC1111.
Installing the SDXL demo extension on Windows or Mac
To install the SDXL demo extension, navigate to the Extensions page in AUTOMATIC1111. Go to the Install from URL tab. Enter the following URL in the URL for extension’s git repository field.
Click Install and wait for confirmation that the extension is successfully installed.
Installing the SDXL demo extension on Google Colab
Installing the SDXL demo extension on the site’s Colab notebook is easy. Enter the extension’s URL in the Extensions_from_URL field.
Start the Colab Notebook as usual.
Setting up the SDXL extension
Step 1. Fill out the agreement form
Fill out this form on HuggingFace. If you don’t see a form, it means you have already filled it out, or it is no longer required.
Step 2. Create a Huggingface token
Go to this page to create a token. Click New Token, give it a name, and copy it.
Step 3. Enter the access token
In AUTOMATIC1111 Web-UI, navigate to the Settings page.
Go to the SDXL Demo section using the left panel selection.
Enter your Access token in the Huggingface access token field.
Select SDXL 0.9 (fp16) in the Model field.
Click Apply Settings.
Restart AUTOMATIC1111 for the settings to take effect.
For Google Colab users, you need to stop and restart the cell. Don’t disconnect the runtime.
You should see the model being downloaded in the next startup.
Now the setup is complete.
Using the SDXL demo extension
To use the SDXL base model, navigate to the SDXL Demo page in AUTOMATIC1111.
The interface is similar to the txt2img page. Enter a prompt and press Generate to generate an image.
To use the refiner model, select the Refiner checkbox. An image canvas will appear.
Upload the image generated from the base model to the image canvas.
Click Refine to run the refiner model.
Run SDXL model with SD.Next
AUTOMATIC1111 now supports SDXL in the stable release. However, the refiner cannot be used with the base model seamlessly. It has to be a separate step on the img2img page and requires switching the model.
To seamlessly generate an image with the base and the refiner models in one go, SD.Next is a good alternative. It is built on AUTOMATIC1111, so you will find the interface similar.
You will need a Windows machine with a discrete Nvidia graphics card (GPU) with at least 12 GB of VRAM.
Install SD.Next on Windows
(You don’t need to perform steps 1 and 2 if you already have AUTOMATIC1111 running on your machine.)
Step 1: Install Python
You will need Python 3.10. Installing it from the Microsoft Store is the easiest option.
Step 2: Install git
Git is a code repository management system. You will need it to install and update SD.Next.
Go to this page to download the Windows version.
Open the installer. Click Install to accept the license and install the software.
Step 3: Clone SD.Next
Press the Windows key (it should be to the left of the space bar on your keyboard). A search window should pop up. Type powershell. Click the Windows PowerShell icon to launch the PowerShell app.
Type the following command and press Enter to clone SD.Next to your computer.
git clone https://github.com/vladmandic/automatic
Step 4: Run SD.Next
Open the File Explorer App.
In the address bar, go to the following location
Find a file called webui.bat. If you don’t see the .bat extension, look for the webui file with the type Windows Batch File.
Double-click the webui.bat file to start SD.Next.
Double-click the webui.bat file to rerun it if you encounter the following error.
ImportError: cannot import name 'deprecated' from 'typing_extensions'
It should correct itself the second time.
Type y and press Enter when asked if you want to download the default model.
The installation is complete when you see the Local URL message, which is http://127.0.0.1:7860/ by default.
Step 5: Access the webui on a browser
Go to the local URL on the browser to start SD.Next.
You should see a user interface very similar to the AUTOMATIC1111 Web-UI.
Test generating an image with a simple prompt (e.g. a cat) and press Generate to make sure it is working correctly.
Download the SDXL 1.0 models
You will need to download the SDXL model to use with SD.Next. Note that there’s a malicious copy of SDXL out there, so you should download it from the official repository.
In SD.Next Web-UI, navigate to the Models page and the Huggingface tab.
In the Select model field, enter the following.
Press Download Model.
Verify the download is in progress in the console window.
Wait for it to complete.
Press the Download button again if you receive a read time-out error.
Repeat the download with the following in the Select model field.
Now you have completed downloading both the base and the refiner model.
Setting up SD.Next to use SDXL
SD.Next has two Stable Diffusion backends: (1) original and (2) diffusers. You must switch to the diffusers backend to use the SDXL model.
Switching to the diffusers backend
You will need to start SD.Next’s webui.bat with the extra argument --backend diffusers to use the diffusers backend. After startup, you can switch back and forth between the original and the diffusers backends through the GUI.
If you use the PowerShell app for the web-UI, use the following command.
cd automatic; .\webui.bat --backend diffusers
File Explorer method
You can also create a new shortcut to add the extra argument for convenience.
In File Explorer, right-click the file webui.bat. Click Show more options. Click create shortcut.
A new shortcut file is created. Rename it to webui.bat - diffusers.
Right-click the file and select Properties. At the end of the Target field, add --backend diffusers. There should be a space between webui.bat and --backend diffusers.
Click Apply and then OK.
From now on, you can double-click the shortcut file webui.bat - diffusers to launch SD.Next with the diffusers backend.
For convenience, you should add the refiner model dropdown menu.
Go to the Settings page, in the QuickSettings list (search quick to find it), add
Click Apply settings and then Restart server.
After restarting, you should see the Stable Diffusion refiner dropdown menu next to the Stable Diffusion checkpoint.
Using the SDXL model
You are now ready to generate images with the SDXL model. There are two modes to generate images:
- Base model alone
- Base model followed by the refiner
Base model only
Navigate to the From Text tab.
Select the SDXL base model in the Stable Diffusion checkpoint dropdown menu.
Select None in the Stable Diffusion refiner dropdown menu.
Enter a prompt.
a cat playing guitar, wearing sunglasses
Set sampling steps to 30.
Set both the width and the height to 1024. This is important because the SDXL model was trained to generate 1024×1024 images.
Base model + refiner
Use the base model followed by the refiner to get the best result. You will get images similar to the base model but with more fine details.
This option uses a lot of VRAM. Not all graphics cards can handle it.
To use the base model with the refiner, do everything in the last section except select the SDXL refiner model in the Stable Diffusion refiner dropdown menu.
Switching between original and diffusers backends
You can switch between the original and the diffusers backend on the Settings page.
Switch back to the original backend to use v1 and v2 models. Select original in Stable Diffusion backend.
Click Apply settings and Restart server.
Some notes about SDXL
Make sure to use an image size of 1024 x 1024 or similar. 512×512 doesn’t work well with SDXL.
SD.Next currently uses the diffusers backend for the SDXL model. Further optimizations using the original backend on SD.Next and AUTOMATIC1111 will be available later. That should translate to faster generation and lower VRAM requirements.
Frequently Asked Questions
Can I use SDXL on Mac?
Yes, but you will need a Mac with Apple Silicon (M1 or M2). Make sure your AUTOMATIC1111 is up-to-date. See the installation tutorial.
Can I use ControlNet with SDXL models?
ControlNet currently only works with v1 models. SDXL is not supported.
But support appears to be in the works.
What image sizes should I use with SDXL models?
Here are the recommended image sizes for different aspect ratios.
- 21:9 – 1536 x 640
- 16:9 – 1344 x 768
- 3:2 – 1216 x 832
- 5:4 – 1152 x 896
- 1:1 – 1024 x 1024
Check out some SDXL prompts to get a quick start.
Stability AI launches SDXL 0.9: A Leap Forward in AI Image Generation – Official press release of SDXL 0.9.
ANNOUNCING SDXL 1.0 — Stability AI – Official press release of SDXL 1.0
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis – Research article detailing the SDXL model.
Diffusers · vladmandic/automatic Wiki – Using Diffusers mode in SD.Next