Propose to update & upgrade SkyReels-V2 #12167
base: main
Conversation
Wraps the visual demonstration section in a Markdown code block. This change corrects the rendering of ASCII diagrams and examples, improving the overall readability of the document.
Improves the readability of the `step_matrix` examples by replacing long sequences of repeated numbers with a more compact `value×count` notation. This change makes the underlying data patterns in the examples easier to understand at a glance.
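To illustrate the notation: a run of identical timesteps such as twenty 999s is displayed as `999×20` instead of being written out in full. The helper below is a hypothetical sketch for illustration only, not code from the PR.

```python
# Hypothetical sketch of the value×count notation described above; the helper name
# and the sample values are illustrative and not part of the PR.
from itertools import groupby

def compact(row):
    """Render e.g. [999, 999, 999, 875] as '999×3, 875×1'."""
    return ", ".join(f"{value}×{len(list(group))}" for value, group in groupby(row))

print(compact([999] * 20 + [875] * 10))  # -> "999×20, 875×10"
```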
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for improving the docs!
Key Pattern: Block i lags behind Block i-1 by exactly ar_step=5 timesteps, creating the
staggered "diffusion forcing" effect where later blocks condition on cleaner earlier blocks.
```text
Thanks for improving! I think the `text` code block should only be used for the graph and chart visuals.
I modified it accordingly. I used backticks for the row representations too; otherwise the columns don't seem as aligned. How is it now?
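To make the "Key Pattern" quoted earlier more concrete, here is a rough sketch of the staggered schedule; the block count, step count, and the construction itself are illustrative assumptions, not the pipeline's actual `step_matrix` code.

```python
# Rough sketch (illustrative values) of the staggered "diffusion forcing" schedule:
# block i starts denoising ar_step iterations after block i-1, so later blocks
# always condition on cleaner earlier blocks. Not the pipeline's real implementation.
num_blocks = 4   # assumed number of latent blocks
num_steps = 30   # assumed denoising steps per block
ar_step = 5      # lag between consecutive blocks, as described in the docs

num_rows = num_steps + ar_step * (num_blocks - 1)
step_matrix = [
    [min(max(row - ar_step * block, 0), num_steps) for block in range(num_blocks)]
    for row in range(num_rows)
]

for row in step_matrix[:8]:
    print(row)  # block 0 stays exactly ar_step=5 steps ahead of block 1, and so on
```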
## Notes

- SkyReels-V2 supports LoRAs with [`~loaders.SkyReelsV2LoraLoaderMixin.load_lora_weights`].
Why is the LoRA example being removed?
This part was copied from Wan's page. Since I didn't test anything with LoRAs, I removed the specific example. But SkyReels-V2 and Wan have almost the same architecture. I preserved the line `- SkyReels-V2 supports LoRAs with [~loaders.SkyReelsV2LoraLoaderMixin.load_lora_weights].`
Thanks, the docs LGTM!
Let's wait for @a-r-r-o-w to chime in on the other changes before we merge :)
Alright, thanks for your review!
Nice @tolgacangoz! Thanks for propagating the attention backend changes to SkyReels.
# Some acceleration helpers
# Be sure to install Flash Attention: https://github.com/Dao-AILab/flash-attention#installation-and-features
# Normally ~14 min.; ~12 min. with compile_repeated_blocks(fullgraph=True); even less with Flash Attention as well, on an A100.
# If you want to follow the original implementation in terms of attentions:
# for block in pipeline.transformer.blocks:
#     block.attn1.set_attention_backend("_native_cudnn")
#     block.attn2.set_attention_backend("flash_varlen")  # or "_flash_varlen_3"
# pipeline.transformer.compile_repeated_blocks(fullgraph=True)
This is nice! We can use this instead of iterating the blocks:
def set_attention_backend(self, backend: str) -> None:
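A minimal sketch of what that could look like, assuming an already-loaded SkyReels-V2 pipeline named `pipeline`; the backend string is an illustrative choice and must be supported by your installation.

```python
# Sketch: set the attention backend once on the whole transformer via the method
# referenced above, instead of looping over blocks. Assumes `pipeline` is an
# already-loaded SkyReels-V2 pipeline; "flash_varlen" is an illustrative choice.
pipeline.transformer.set_attention_backend("flash_varlen")
```

Going through this method also keeps the experimental-API warning and the backend requirement checks mentioned later in this thread.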
Can we set different attention backends for `attn1` and `attn2`?
Not sure why we need to showcase different backends though? Is there a specific reason?
Examples should showcase usage via the official expected APIs. There are usually multiple side-effects when not doing so. In this case, one instance could be the following warning not being raised:
logger.warning("Attention backends are an experimental feature and the API may be subject to change.")
Another problem is that the following check will be skipped. When skipped, not-so-technical users will be baffled by the traceback:
_check_attention_backend_requirements(backend)
Custom uses are okay, but they are usually best done by power users who have a better understanding of the library.
SkyReels-V2's self-attentions use `"_native_cudnn"` with a custom attention mask, and its cross-attentions use the varlen types. We cannot set flash attention for all of them because it doesn't support a custom mask, right?
I agree this should be done via the expected API. I opened a feature request for it, but then I thought it wasn't really necessary: #12210.
I can remove this `# Some acceleration helpers` comment section completely for now. Since this model is a bit slow, I just wanted to add several speedup helpers. I plan to profile and optimize this model.
Removed them.
Removed comments about acceleration helpers and Flash Attention installation.

- `"_native_cudnn"` for self-attentions and `"flash_varlen"` or `"_flash_varlen_3"` for cross-attentions.
- `pipeline.transformer.compile_repeated_blocks(fullgraph=True)`, and looking forward to torch.compile compatibility with varlen APIs #11970 being merged.

Timings:

- `main`: ~14 min. (main.mp4, Wan.s_RoPE.mp4)
- `compile_repeated_blocks(fullgraph=True)`: ~12 min.
- `compile_repeated_blocks(fullgraph=True)` + `"_native_cudnn"` for `attn1` and `"flash"` for `attn2`, FA=2.8.3: ~8 min. (Wan.s_RoPE+regional.mp4, Wan.s_RoPE+regional+FA.mp4)

Reproducer
Environment

@a-r-r-o-w @yiyixuxu @stevhliu
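As a rough guide (this is not the collapsed reproducer above), the fastest configuration in the list could be approximated as follows; the checkpoint id and backend names are assumptions and should be adapted to your setup.

```python
# Hedged sketch of the benchmarked setup: per-attention backends + regional compilation.
# The checkpoint id is illustrative; the per-block loop bypasses the checks performed by
# the official set_attention_backend method, so use it only if you understand the internals.
import torch
from diffusers import SkyReelsV2DiffusionForcingPipeline

pipeline = SkyReelsV2DiffusionForcingPipeline.from_pretrained(
    "Skywork/SkyReels-V2-DF-1.3B-540P-Diffusers",  # assumed checkpoint id
    torch_dtype=torch.bfloat16,
)
pipeline.to("cuda")

for block in pipeline.transformer.blocks:
    block.attn1.set_attention_backend("_native_cudnn")  # self-attention with a custom mask
    block.attn2.set_attention_backend("flash")          # cross-attention; requires Flash Attention

pipeline.transformer.compile_repeated_blocks(fullgraph=True)
```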