Updated on Aug 29, 2024

FLUX.1: A Deep Dive


What Is FLUX.1?

In early August 2024, Black Forest Labs introduced FLUX.1. The company focuses on building and advancing generative deep learning models for media such as images and video. FLUX.1 is a text-to-image model built on a transformer architecture, and it has 12 billion parameters.

FLUX.1 is designed to give more control over the generated content, which makes it a useful tool for artists, designers, and developers. It is trained with Flow Matching and uses a DiT (Diffusion Transformer) architecture, the same family of techniques adopted by Stable Diffusion 3.
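To make the Flow Matching idea more concrete, here is a minimal sketch of how one training example is built in the common straight-line (rectified flow) formulation. It is a generic illustration in plain Python, not Black Forest Labs' training code; the function name `flow_matching_pair` and the toy three-value "sample" are assumptions made for this example.

```python
import random

def flow_matching_pair(x_data, rng):
    """Build one flow-matching training example on a straight noise-to-data path.

    x_data: a clean sample as a list of floats (a stand-in for image latents).
    Returns (x_t, t, target_velocity).
    """
    noise = [rng.gauss(0.0, 1.0) for _ in x_data]                 # draw Gaussian noise
    t = rng.random()                                              # time in [0, 1)
    x_t = [(1.0 - t) * n + t * x for n, x in zip(noise, x_data)]  # point on the path
    target_velocity = [x - n for n, x in zip(noise, x_data)]      # constant along the path
    return x_t, t, target_velocity

rng = random.Random(0)
x_data = [1.0, -2.0, 0.5]
x_t, t, velocity = flow_matching_pair(x_data, rng)

# A network v(x_t, t, prompt) would be trained to predict `velocity`
# with a mean squared error loss. Sanity check: following the target
# velocity for the remaining time (1 - t) lands on the clean sample.
recovered = [xt + (1.0 - t) * v for xt, v in zip(x_t, velocity)]
assert all(abs(r - x) < 1e-9 for r, x in zip(recovered, x_data))
```

At sampling time the learned velocity field is integrated from noise to an image over a small number of steps, which is what the `num_inference_steps` setting controls.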

FLUX.1 is available in three variations:

  • FLUX Schnell: A distilled version of FLUX.1 optimized for fast inference, ideal for applications where speed is crucial. It is not well suited to fine-tuning. It is released under the Apache 2.0 license, which permits commercial use.
  • FLUX Dev: This variant is the one to use for fine-tuning. It provides the flexibility and control needed to customize the model for specific needs. Its weights are openly available under a non-commercial license, so commercial use requires a separate license.
  • FLUX Pro: The Pro version of FLUX.1 offers the highest quality and control. However, its weights are not publicly released and it is accessible only through an API. Fine-tuning is not supported on this variant.
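The list above can be read as a small decision table. The sketch below encodes it as data, taking the fine-tuning and weight-availability claims from this article at face value; the class `FluxVariant` and the helper `pick_variants` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FluxVariant:
    name: str
    open_weights: bool   # can the weights be downloaded?
    fine_tunable: bool   # suitable for fine-tuning, per the list above

VARIANTS = [
    FluxVariant("schnell", open_weights=True, fine_tunable=False),
    FluxVariant("dev", open_weights=True, fine_tunable=True),
    FluxVariant("pro", open_weights=False, fine_tunable=False),
]

def pick_variants(need_fine_tuning=False, need_open_weights=False):
    """Return the names of the variants that satisfy the given requirements."""
    return [
        v.name
        for v in VARIANTS
        if (v.fine_tunable or not need_fine_tuning)
        and (v.open_weights or not need_open_weights)
    ]

print(pick_variants(need_fine_tuning=True))
```

Licensing is left out of the table on purpose, since license terms are better checked against the model pages than hard-coded.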

Fine-Tuning FLUX.1: Step-by-Step Guide

This section is a quickstart for fine-tuning FLUX.1.

Fine-tuning FLUX.1 for specific tasks like generating human faces can significantly improve the quality of the generated images. Here’s how you can do it using Replicate’s API:

Training via an API

You can start a training run from your own code through the API.

Make sure you have your REPLICATE_API_TOKEN set in your environment. Find it in your account settings.

export REPLICATE_API_TOKEN=r8_***************************
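A missing token otherwise surfaces only as an authentication error on the first API call, so it can help to check for it at the top of a script. The helper below is a small sketch of that check; `require_token` is a hypothetical name, not part of the Replicate client.

```python
import os

def require_token(env=os.environ):
    """Return the Replicate API token, or raise a clear error if it is missing."""
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it before running this script."
        )
    return token

# Typical use at the top of a training script:
# require_token()
```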

Create a new model that will serve as the destination for your fine-tuned weights. This is where your trained model will live once the process is complete.

import replicate


model = replicate.models.create(
    owner="yourusername",
    name="flux-your-model-name",
    visibility="public",  # or "private" if you prefer
    hardware="gpu-t4",  # Replicate will override this for fine-tuned models
    description="A fine-tuned FLUX.1 model"
)


print(f"Model created: {model.name}")
print(f"Model URL: https://replicate.com/{model.owner}/{model.name}")
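The training run in the next step takes its images as a single zip archive. If your images sit in a folder, a short helper like the one below can build that archive. This is a sketch under assumptions: the function name `zip_training_images` is hypothetical, and the set of accepted file extensions is a guess that should be checked against the trainer's documentation.

```python
import zipfile
from pathlib import Path

# Assumed image types; adjust to whatever the trainer actually accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}

def zip_training_images(image_dir, zip_path):
    """Pack the image files directly inside image_dir into zip_path.

    Returns the list of file names added, in sorted order.
    """
    image_dir = Path(image_dir)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(image_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                archive.write(path, arcname=path.name)  # flat layout, no parent folders
                added.append(path.name)
    if not added:
        raise ValueError(f"No images found in {image_dir}")
    return added

# Example (paths are placeholders):
# zip_training_images("/path/to/your/images", "/path/to/your/local/training-images.zip")
```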

Now that you have your model, start the training process by creating a new training run. You’ll need to provide the input images, the number of steps, and any other desired parameters.

# Now use this model as the destination for your training
training = replicate.trainings.create(
    version="ostris/flux-dev-lora-trainer:4ffd32160efd92e956d39c5338a9b8fbafca58e03f791f6d8011f3e20e8ea6fa",
    input={
        "input_images": open("/path/to/your/local/training-images.zip", "rb"),
        "steps": 1000,
        "hf_token": "YOUR_HUGGING_FACE_TOKEN",  # optional
        "hf_repo_id": "YOUR_HUGGING_FACE_REPO_ID",  # optional
    },
    destination=f"{model.owner}/{model.name}"
)


print(f"Training started: {training.status}")
print(f"Training URL: https://replicate.com/p/{training.id}")
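Training takes a while, so a script usually needs to wait for it to finish. The sketch below polls until the run reaches a terminal status. The polling helper itself is generic and hypothetical (`wait_for_training` is not a Replicate function); the commented usage assumes the `training` object created above and the Python client's `reload()` method, and needs network access to run.

```python
import time

# Statuses after which a Replicate training no longer changes.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}

def wait_for_training(fetch_status, poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or the timeout elapses."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Training still '{status}' after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds

# Usage with the `training` object from the previous step:
#
# def fetch_status():
#     training.reload()        # refresh the object from the API
#     return training.status
#
# final_status = wait_for_training(fetch_status)
# print(f"Training finished with status: {final_status}")
```

The default interval and timeout are arbitrary choices for illustration; the training page linked above shows the same status and logs in the browser.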


Using Your Trained Model

Once the training is complete, you can use your trained model directly on Replicate, just like any other model.

You can run it on the web:

  • Go to your model page on Replicate (e.g., https://replicate.com/yourusername/flux-your-model-name).
  • For the prompt input, include your trigger word (such as “bad 70s food”) to activate your fine-tuned concept.
  • Adjust any other inputs as needed.
  • Click “Run” to generate your image.

Or you can run it through the API, for example with the Python client:

import replicate
output = replicate.run(
    "yourusername/flux-your-model-name:version_id",
    input={
        "prompt": "A portrait photo of a space station, bad 70s food",
        "num_inference_steps": 28,
        "guidance_scale": 7.5,
        "model": "dev",
    }
)


print(f"Generated image URL: {output}")

Replace yourusername/flux-your-model-name:version_id with your actual model details.

You can find more information about running it with an API on the “API” tab of your model page.
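Depending on the client version, `replicate.run` may return plain URL strings or file-like objects that expose a `url` attribute, and image models typically return a list. The helper below is a sketch that normalises either shape to a list of URL strings. `output_urls` is a hypothetical name, and the `.webp` extension in the commented download loop is an assumption about the output format.

```python
def output_urls(output):
    """Normalise the result of replicate.run() to a list of URL strings."""
    items = list(output) if isinstance(output, (list, tuple)) else [output]
    urls = []
    for item in items:
        url = getattr(item, "url", None)   # file-like outputs carry a .url attribute
        urls.append(url if isinstance(url, str) else str(item))
    return urls

# Saving the generated images locally (requires network access):
#
# import urllib.request
# for index, url in enumerate(output_urls(output)):
#     urllib.request.urlretrieve(url, f"output_{index}.webp")
```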

Example Prompts & Outputs

For our training images, we used photos of a model taken from the free image library Pexels.

Prompt: A woman in a serene garden setting, holding a yoga pose on a mat, surrounded by lush greenery and blooming flowers. She is dressed in stylish yoga pants and a top, perfectly blending with the natural, peaceful environment. The garden features soft sunlight filtering through the trees, creating a tranquil atmosphere ideal for practising yoga.

Prompt: A woman sitting at her desk, using a laptop with an e-commerce website open, virtually trying on a sleek corporate outfit—formal trousers and a crisp shirt. The screen shows a business meeting setting, with the woman appearing confident and professional in her virtual attire. The background is a modern office space, with soft lighting and minimalist decor.

Prompt: A woman gazing at her laptop screen, where she is virtually trying on a luxurious saree or an elegant lehenga. The e-commerce website displays a festive setting with intricate jewellery and accessories complementing the outfit. The woman appears excited, surrounded by soft, warm lighting that highlights the richness of the traditional Indian attire.

Prompt: A woman in a casual summer outfit, driving a convertible car along a scenic coastal road, with the wind in her hair. The scene captures the joy and freedom of a holiday, with beautiful coastal views, a clear blue sky, and the open road ahead. The focus is entirely on her relaxed, carefree vibe as she enjoys her drive, free from any distractions.

These outputs are simply brilliant!

A Note about FLUX.1 Prompting

FLUX.1 responds better to elaborate, narrative-style prompts than to the concise tag format. What truly distinguishes FLUX.1 from other models is its ability to render text with impressive clarity: not just individual words, but entire sentences. This feature alone opens up many creative possibilities for anyone who wants to integrate text into their images.
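As a rough illustration of the difference between the two prompt styles, the sketch below assembles a narrative prompt from a few descriptive pieces and puts the exact wording to be rendered in quotes. The helper `build_prompt`, its parameters, and the sign text are all made up for this example; quoting the text is a common prompting convention rather than a documented requirement.

```python
def build_prompt(subject, setting, mood, visible_text=None, trigger_word=None):
    """Assemble a narrative-style prompt from a few descriptive pieces."""
    sentences = [f"{subject} {setting}.", f"The scene feels {mood}."]
    if visible_text:
        # Quote the exact wording so it is clear which words should appear in the image.
        sentences.append(f'A sign in the frame reads "{visible_text}".')
    prompt = " ".join(sentences)
    if trigger_word:
        prompt = f"{prompt} {trigger_word}"   # trigger word of a fine-tuned model, if any
    return prompt

tag_style = "woman, yoga, garden, sunlight"   # the concise tag format, for comparison

narrative = build_prompt(
    subject="A woman holds a yoga pose on a mat",
    setting="in a quiet garden with soft sunlight filtering through the trees",
    mood="calm and unhurried",
    visible_text="Morning Yoga, 7 AM",
)
print(narrative)
```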

Summary

FLUX.1 offers a versatile and powerful approach to image generation, with variants tailored to different use cases. Whether you're looking for quick inference, in-depth fine-tuning, or the highest quality images, FLUX.1 has a solution for you. By following the steps outlined above, you can fine-tune FLUX.1 to suit your specific needs and get the most out of this model.

Authors