Getting Started

In this guide, we’ll walk through how to train LORAs for the Flux AI image generation model end to end, including:

  • Tools and tips for building and organizing the dataset
  • Captioning options
  • Training the model using a ComfyUI workflow
  • Testing the LORA

This method was used to generate:

[Example images: santa, amiga, Galaxy]

Specs Required

This guide was written using a 24GB GeForce RTX 3090 and 24GB of system RAM. I have heard that Flux Dev training hits an out-of-memory (OOM) error on a 16GB 4090.

Flux Dev or Flux Schnell?

We will be using the Flux Dev model, but the Flux Schnell model should work in a similar way.

If you need to download the Dev model, go here. You’ll need to log in and accept the ToS. Place the model in your checkpoints folder.

Preparing a Dataset

As is often the case, preparation is key to success here. You must have high-quality, properly captioned images of your subject or style to produce a good LORA.

How many images you need in your dataset depends on your subject.

If you’re making a LORA of a person or something specific, 25 images will work fine.

For general styles/looks, LORAs that need to work across a variety of prompts/settings, or otherwise conceptual LORAs, 100-200 images have worked well for me.

The larger your dataset is, the greater the chance that it contains blurry or incorrectly captioned images.

Also, using images of various sizes and aspect ratios will produce better final results. We’ll walk through choosing specific resolutions later.

Finding Images

If you’re building a conceptual or style LORA, I’d recommend spending some time looking for a good image source/archive. (The source of the NASA LORA shown above was the APOD NASA archive site.)

If you’re manually collecting images across the web or out of a big archive, check out PureRef, which lets you drag and drop any image onto an infinite canvas, then organize, resize, and save them as one file.

If you’ve found a large set of images somewhere, like in the NASA example, I have had a TON of success getting ChatGPT, in a single prompt, to write Python scraping scripts that download the images with Scrapy. You can also do this with YouTube videos using similar libraries.

If you scrape a large set of images, it’s worth asking the LLM to also sort them into folders by aspect ratio, which will save time later. Example ChatGPT prompt here.
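
As a rough idea of what such a sorting script might look like, here is a minimal sketch. It is not the script from the prompt linked above. The function names, the folder naming ("16x9" for landscape, "9x16" for portrait), and the list of accepted file extensions are all my own assumptions, and reading the image size relies on the third-party Pillow library.

```python
import shutil
from pathlib import Path

# Aspect ratios (long side, short side) matching the resolution table in this guide.
ASPECT_RATIOS = [(1, 1), (3, 2), (4, 3), (16, 9), (21, 9)]


def nearest_aspect_ratio(width: int, height: int) -> str:
    """Return a folder name such as '16x9' (landscape) or '9x16' (portrait)."""
    ratio = max(width, height) / min(width, height)
    long_side, short_side = min(
        ASPECT_RATIOS, key=lambda ar: abs(ar[0] / ar[1] - ratio)
    )
    if width >= height:
        return f"{long_side}x{short_side}"
    return f"{short_side}x{long_side}"


def sort_by_aspect_ratio(src_dir: str, dest_dir: str) -> None:
    """Copy every image in src_dir into dest_dir/<nearest aspect ratio>/."""
    from PIL import Image  # Pillow (third-party): pip install pillow

    for path in sorted(Path(src_dir).iterdir()):
        # Assumed list of image extensions; adjust for your dataset.
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
            continue
        with Image.open(path) as img:
            width, height = img.size
        target = Path(dest_dir) / nearest_aspect_ratio(width, height)
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)  # copy, so the originals stay put
```

Copying instead of moving keeps the scraped originals intact in case the sort needs to be redone.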

Organizing and Normalizing Images

Before we train our LORA, we need to properly organize our dataset images into folders based on their input size. If you have very large images, you may also need to resize them.

Resizing Images

The LORA training script will resize input images, but doing this beforehand gives you more control over the input images and lets you see poor resizes before you spend hours training, so this step is recommended.

Even if you do not manually resize your images, or they don’t need to be resized, the images need to be organized into folders by resolution, unless you are training on a single resolution.

ImageMagick

If you need to resize or crop images in bulk, or if you need to normalize images of a similar aspect ratio to specific dimensions, see ImageMagick.

ChatGPT/LLMs can help you write good convert or mogrify (which changes images in place) commands for your dataset.
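
For illustration, here is a minimal sketch of one such mogrify call, wrapped in Python so it can sit next to a scraping script. The function names are hypothetical. It assumes the `mogrify` executable is on your PATH (some ImageMagick 7 installs expose it as `magick mogrify` instead) and that your images are .jpg files. It resizes each image to cover the target size, center-crops to the exact dimensions, and writes the results to a separate folder instead of changing the originals in place.

```python
import subprocess
from pathlib import Path


def build_mogrify_command(src_dir, out_dir, width, height):
    """Build a mogrify call that resizes to cover WxH, then center-crops to WxH."""
    size = f"{width}x{height}"
    # Only *.jpg is matched here (case-sensitive on Linux); an assumption.
    images = sorted(str(p) for p in Path(src_dir).glob("*.jpg"))
    return [
        "mogrify",
        "-path", str(out_dir),   # write results here instead of overwriting
        "-resize", f"{size}^",   # '^' scales until the image covers the box
        "-gravity", "center",    # crop around the center of the image
        "-extent", size,         # trim the canvas to exactly WxH
        *images,
    ]


def resize_folder(src_dir, out_dir, width, height):
    """Run the command above; does nothing if src_dir has no .jpg files."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_mogrify_command(src_dir, out_dir, width, height)
    if len(cmd) > 9:  # the first 9 items are the program name and its options
        subprocess.run(cmd, check=True)
```

Center-cropping is only one choice. If your subject is often off-center, review the output folder before training.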

Choosing Resolutions

Before training, you’ll need to pick 1-3 resolutions/aspect ratios for your dataset from the table below. Create datasets/Your_Loras_name/ folder(s) in your ComfyUI folder. Inside, create a folder for each of your chosen resolutions. I have been using this list of resolutions.

Some resolutions, whether original or resized, will fail to process in the training step, so it’s recommended to use a resolution from the table.

The maximum resolutions will take 2-3 days to train to 3000 steps on similar hardware. Only use the minimum resolutions if your source images are very small.

AR      Minimum      Recommended    Maximum
1:1     320 x 320    1024 x 1024    1408 x 1408
3:2     384 x 256    1216 x 832     1728 x 1152
4:3     448 x 320    1152 x 896     1664 x 1216
16:9    448 x 256    1344 x 768     1920 x 1088
21:9    576 x 256    1536 x 640     2176 x 960
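
If you’d like to script the folder setup, the sketch below encodes the table and creates one folder per chosen resolution under datasets/. The function names and the "1024x1024" style of folder name are my own assumptions, not something the training workflow requires. The check only confirms that a size appears in the table.

```python
from pathlib import Path

# (minimum, recommended, maximum) as (width, height), copied from the table.
RESOLUTIONS = {
    "1:1": ((320, 320), (1024, 1024), (1408, 1408)),
    "3:2": ((384, 256), (1216, 832), (1728, 1152)),
    "4:3": ((448, 320), (1152, 896), (1664, 1216)),
    "16:9": ((448, 256), (1344, 768), (1920, 1088)),
    "21:9": ((576, 256), (1536, 640), (2176, 960)),
}


def is_table_resolution(width: int, height: int) -> bool:
    """True if (width, height) is one of the sizes listed in the table."""
    return any((width, height) in tiers for tiers in RESOLUTIONS.values())


def make_bucket_dirs(comfy_root, lora_name, sizes):
    """Create <comfy_root>/datasets/<lora_name>/<W>x<H>/ for each chosen size."""
    created = []
    for width, height in sizes:
        if not is_table_resolution(width, height):
            raise ValueError(f"{width}x{height} is not in the resolution table")
        folder = Path(comfy_root) / "datasets" / lora_name / f"{width}x{height}"
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, `make_bucket_dirs("ComfyUI", "coolLora", [(1024, 1024), (1216, 832)])` would create two bucket folders for a 1:1 and a 3:2 dataset.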

Image Review and Moving Images

For large datasets, review and remove any poor quality, irrelevant, or duplicate images now.
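
Part of the duplicate check can be automated. The sketch below (the function name is hypothetical) groups files whose contents are byte-for-byte identical, which often happens when a scraper downloads the same image twice. It will not catch resized or re-encoded copies, so a visual pass is still needed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(folder):
    """Return lists of files under folder that have identical contents."""
    by_hash = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Hash the raw bytes; identical files get identical digests.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Each returned group lists the copies of one image, so you can keep the first and delete the rest.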

If your images have watermarks, they will impact the LORA results. Consider using an img2img workflow with ComfyUI’s mask editor (right click on the load image node) to remove them.

I like to review images in bulk using XnView MP, which also lets you sort files by image dimensions. That is useful if your scraping script didn’t sort the images for you.

Captioning

Download the Captioning ComfyUI Workflow here.

For small datasets, I’ve had good success manually captioning images. Even if you use the automatic method below, you should review and edit the generated captions for best results.

To automatically caption large datasets, we’ll use Miaoshouai Tagger, which is fine-tuned using Civitai image tags and images. You can use the workflow below to bulk caption your images.

Caption files need to be .txt files in the same folder as the image, with the same name apart from the extension. Example:

coolLora/myimage.jpg
coolLora/myimage.txt
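
Before training, it can help to confirm that every image actually has a caption. This is a minimal sketch of such a check. The function name and the list of image extensions are my assumptions, and it treats an empty .txt file as a missing caption.

```python
from pathlib import Path

# Assumed set of image extensions; adjust for your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def find_missing_captions(folder):
    """Return images in folder that lack a same-named, non-empty .txt file."""
    missing = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption = path.with_suffix(".txt")  # myimage.jpg -> myimage.txt
        if not caption.is_file() or not caption.read_text(encoding="utf-8").strip():
            missing.append(path)
    return missing
```

Run it once per resolution folder. An empty result means every image in that folder has a caption next to it.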

Training

Download the Lora Training ComfyUI Workflow here.

Training should take 2-8 hours with proper settings and using reasonably sized images, even with very large datasets. If things are running too slow (you can see your it/s in the console), try lowering your image resolutions.
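
To turn the it/s figure from the console into a time estimate, divide the total steps by the rate. The helper below is just that arithmetic. The function names are hypothetical, and the 3000 steps and 0.25 it/s are example numbers, not measurements. If your console reports seconds per iteration (s/it) instead, invert it first.

```python
def estimate_training_hours(total_steps: int, iterations_per_second: float) -> float:
    """Estimated training time in hours for a given step count and it/s rate."""
    if iterations_per_second <= 0:
        raise ValueError("iterations_per_second must be positive")
    return total_steps / iterations_per_second / 3600


def its_from_seconds_per_iteration(seconds_per_iteration: float) -> float:
    """Convert an s/it reading into it/s."""
    return 1 / seconds_per_iteration


# Example: 3000 steps at 4 s/it (0.25 it/s) is 12000 seconds.
rate = its_from_seconds_per_iteration(4.0)
print(f"{estimate_training_hours(3000, rate):.1f} hours")  # 3.3 hours
```

If the estimate lands well outside the 2-8 hour range, that is a good sign to lower your image resolutions.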

Training Workflow

Running the Training Workflow

Enable or disable each of the 3 Dataset Buckets, enter the path to your image/caption folders, and set the dimensions for each bucket.

In the Lora Training Config section, enter your Lora name, trigger word, and save directory, and review the other options.

Make sure you load the correct Transformer and T5 models.

Sample Prompts will be generated at each loop (every 750 steps by default).

Other optional training settings can be found in the Settings group.

Testing

Download the ComfyUI Flux LORA Testing Workflow here.

Your Lora and the intermediate steps should be saved to your output location. Move the LORA to your ComfyUI/models/loras folder and you’re ready to use your new LORA!
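
If you retrain often, that file step can be scripted. This is a minimal sketch with a hypothetical function name. It copies the file rather than moving it, so the original stays in your output location, and it assumes your ComfyUI folder uses the models/loras layout mentioned above.

```python
import shutil
from pathlib import Path


def install_lora(lora_file, comfy_root):
    """Copy a trained LORA file into <comfy_root>/models/loras; return the new path."""
    dest_dir = Path(comfy_root) / "models" / "loras"
    dest_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the file name and timestamps and returns the destination path.
    return Path(shutil.copy2(lora_file, dest_dir))
```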

For testing various prompts, strengths and settings, try the Lora Testing workflow linked above. This will generate 2x1 grids with the Lora on and off using configurable strength ranges.

Share!

Make sure to share your LORA (unless it’s of you or your dog) with the CivitAI community and, if you used this guide, make sure to leave a link to your LORA in the comments below.

Civitai

Check out other useful ComfyUI Workflows in this GitHub repo.

Appendix: Tools and Workflows

Flux Models

Workflows

Dataset Preparation Tools

  • PureRef - Tool for organizing and managing large image collections.
  • ImageMagick - Used for resizing, cropping, and normalizing images from the command line.
  • XnView MP - Useful for bulk reviewing and sorting images by dimensions.

Other