[Qwen-Image] adding validation for guidance_scale, true_cfg_scale and negative_prompt #12223

yiyixuxu · 2025-08-23T05:52:52Z

This PR adds parameter validation for guidance_scale, true_cfg_scale, and negative_prompt, and tries to improve the user experience for the QwenImagePipeline.

Behavior for non-guidance-distilled models:

guidance_scale defaults to None; if provided, it is ignored with a warning
CFG is enabled only when negative_prompt is provided (it is not provided by default) AND true_cfg_scale > 1 (defaults to 4.0)
Warns if true_cfg_scale > 1 but no negative_prompt is provided
Warns if negative_prompt is provided but true_cfg_scale <= 1
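
The rules above can be sketched as a small standalone function. This is only an illustration, not the pipeline's actual code: the helper name `resolve_true_cfg` and its return shape are hypothetical, and only the two warning strings are taken from the outputs shown further down in this description.

```python
from typing import List, Optional, Tuple


def resolve_true_cfg(
    negative_prompt: Optional[str] = None,
    true_cfg_scale: float = 4.0,
) -> Tuple[bool, List[str]]:
    """Hypothetical helper: decide whether CFG runs and collect warnings.

    Returns (do_true_cfg, warnings). The real pipeline logs its warnings;
    here they are returned so the behavior is easy to inspect.
    """
    warnings: List[str] = []
    has_neg_prompt = negative_prompt is not None

    if true_cfg_scale > 1 and not has_neg_prompt:
        warnings.append(
            f"true_cfg_scale is passed as {true_cfg_scale}, but classifier-free "
            "guidance is not enabled since no negative_prompt is provided."
        )
    elif true_cfg_scale <= 1 and has_neg_prompt:
        warnings.append(
            "negative_prompt is passed but classifier-free guidance is not "
            "enabled since true_cfg_scale <= 1"
        )

    # CFG needs both conditions: a negative prompt AND a scale above 1.
    do_true_cfg = true_cfg_scale > 1 and has_neg_prompt
    return do_true_cfg, warnings


if __name__ == "__main__":
    print(resolve_true_cfg())                                           # defaults only
    print(resolve_true_cfg(negative_prompt=" "))                        # CFG on
    print(resolve_true_cfg(negative_prompt=" ", true_cfg_scale=1.0))    # CFG off
```

Note that a whitespace-only negative_prompt such as " " still counts as provided in this sketch, matching how the test script below enables CFG.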

test script

import torch

from diffusers import QwenImagePipeline

pipeline = QwenImagePipeline.from_pretrained("Qwen/Qwen-Image")
pipeline.to(torch.bfloat16)
pipeline.enable_model_cpu_offload(device="cuda:0")

prompt = "现实主义风格的人像摄影作品，画面主体是一位容貌惊艳的女性面部特写。她拥有一头自然微卷的短发，发丝根根分明，蓬松的刘海修饰着额头，增添俏皮感。头上佩戴一顶绿色格子蕾丝边头巾，增添复古与柔美气息。身着一件简约绿色背心裙，在纯白色背景下格外突出。两只手分别握着半个红色桃子，双手轻轻贴在脸颊两侧，营造出可爱又富有创意的视觉效果。  人物表情生动，一只眼睛睁开，另一只微微闭合，展现出调皮与自信的神态。整体构图采用个性视角、非对称构图，聚焦人物主体，增强现场感和既视感。背景虚化处理，层次丰富，景深效果强烈，营造出低光氛围下浓厚的情绪张力。  画面细节精致，色彩生动饱满却不失柔和，呈现出富士胶片独有的温润质感。光影运用充满美学张力，带有轻微超现实的光效处理，提升整体画面高级感。整体风格为现实主义人像摄影，强调细腻的纹理与艺术化的光线表现，堪称一幅细节丰富、氛围拉满的杰作。超清，4K，电影级构图"
common_inputs = {
    "prompt": prompt,
    "height": 1328,
    "width": 1328,
    "num_inference_steps": 50,
}

# test1: this works without cfg, but should raise a warning since true_cfg_scale defaults to 4.0 and no negative_prompt is provided
print("test1:")
generator = torch.Generator(device="cuda:0").manual_seed(0)
output = pipeline(**common_inputs, generator=generator).images[0]
output.save("yiyi_test_16_output_1_no_cfg.png")

# test2: this should work as expected with cfg, default true_cfg_scale = 4.0, and negative_prompt is provided.
print("test2:")
generator = torch.Generator(device="cuda:0").manual_seed(0)
output = pipeline(**common_inputs, negative_prompt=" ", generator=generator).images[0]
output.save("yiyi_test_16_output_2_cfg.png")


# test3: this should work as expected without cfg, true_cfg_scale <= 1 and no negative_prompt is provided
print("test3:")
generator = torch.Generator(device="cuda:0").manual_seed(0)
output = pipeline(**common_inputs, true_cfg_scale=1.0, generator=generator).images[0]
output.save("yiyi_test_16_output_3_no_cfg.png")

# test4: runs without cfg, but emits a warning since negative_prompt is provided while true_cfg_scale <= 1.
print("test4:")
generator = torch.Generator(device="cuda:0").manual_seed(0)
output = pipeline(**common_inputs, true_cfg_scale=1.0, negative_prompt=" ", generator=generator).images[0]
output.save("yiyi_test_16_output_4_no_cfg.png")

# test5: this should get a warning since guidance_scale is passed but it is not a guidance-distilled model.
print("test5:")
generator = torch.Generator(device="cuda:0").manual_seed(0)
output = pipeline(**common_inputs, negative_prompt=" ", guidance_scale=1.0, generator=generator).images[0]
output.save("yiyi_test_16_output_5_cfg.png")

outputs

test1:
true_cfg_scale is passed as 4.0, but classifier-free guidance is not enabled since no negative_prompt is provided.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:58<00:00,  1.16s/it]
test2:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:43<00:00,  2.08s/it]
test3:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:58<00:00,  1.16s/it]
test4:
 negative_prompt is passed but classifier-free guidance is not enabled since true_cfg_scale <= 1
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:58<00:00,  1.16s/it]
test5:
guidance_scale is passed as 1.0, but ignored since the model is not guidance-distilled.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:43<00:00,  2.08s/it]

Behavior for Guidance-distilled models

(currently we don't have a guidance-distilled Qwen-Image checkpoint, but the team might release one in the future)

guidance_scale is required (raises ValueError if None)
Can use both guidance distillation and CFG simultaneously
Same CFG validation logic as non-distilled models
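
As a rough sketch of how the guidance_scale handling differs between the two model types: the helper `resolve_guidance` and the `is_guidance_distilled` flag are hypothetical names used only for illustration, and the ValueError wording is an assumption. Only the "ignored" warning text comes from the test5 output above.

```python
from typing import Optional, Tuple


def resolve_guidance(
    guidance_scale: Optional[float],
    is_guidance_distilled: bool,
) -> Tuple[Optional[float], Optional[str]]:
    """Hypothetical helper: return (effective_guidance_scale, warning)."""
    if is_guidance_distilled:
        if guidance_scale is None:
            # Assumed wording; the description only says a ValueError is raised.
            raise ValueError("guidance_scale is required for a guidance-distilled model.")
        return guidance_scale, None

    if guidance_scale is not None:
        # Non-distilled model: the value is dropped and a warning is produced.
        return None, (
            f"guidance_scale is passed as {guidance_scale}, but ignored since "
            "the model is not guidance-distilled."
        )
    return None, None


if __name__ == "__main__":
    print(resolve_guidance(1.0, is_guidance_distilled=False))
    print(resolve_guidance(3.5, is_guidance_distilled=True))
```

This check is independent of the true_cfg_scale / negative_prompt check, which is why a distilled model could use both guidance distillation and CFG in the same call.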

yiyixuxu · 2025-08-23T05:59:13Z

cc @naykun

HuggingFaceDocBuilderDev · 2025-08-23T06:07:51Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

naykun · 2025-08-23T07:21:36Z

I've tested locally and everything appears to be working correctly. Thank you @yiyixuxu !

sayakpaul

Looks great! Should we propagate this to the other Qwen-Image pipelines as well?

asomoza

Thanks! It all looks good to me, but maybe we can also change the example, as correctly pointed out here by @vagitablebirdcode

up

342c1f9

yiyixuxu requested review from sayakpaul, asomoza and DN6 August 23, 2025 05:57

Merge branch 'main' into guidance-warn

ca58ce3

up up

b7959dc

sayakpaul approved these changes Aug 23, 2025

yiyixuxu added 2 commits August 24, 2025 01:42

apply changes to other pipelines

6b05803

style

1efd106

asomoza approved these changes Aug 25, 2025

sayakpaul mentioned this pull request Aug 27, 2025

Add Qwen-Image-Edit Inpainting pipeline #12225

Open

yiyixuxu merged commit 865ba10 into main Aug 27, 2025
14 of 15 checks passed

yiyixuxu deleted the guidance-warn branch August 27, 2025 11:04
