
[JAX] THD ring attention #1454

Open
wants to merge 11 commits into base: main

Conversation

@zlsh80826 (Collaborator) commented Feb 4, 2025

Description

Support P2P context parallelism (ring attention) with the THD format. This feature is only available for self-attention + causal masking + segment IDs/positions + load balancing (reorder before attention and inverse-reorder after attention).
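The load-balancing reorder mentioned above can be sketched as follows. This is a hypothetical NumPy illustration of the dual-chunk-swap pattern (function names and signatures are assumptions, not TE's actual API): the sequence is split into `2 * cp_size` chunks and rank `r` is assigned chunks `r` and `2*cp_size - 1 - r`, so every rank performs a comparable amount of causal-attention work.

```python
import numpy as np

def reorder_causal_load_balancing(x: np.ndarray, cp_size: int,
                                  seq_dim: int = 0) -> np.ndarray:
    """Reorder the sequence dim so rank r holds chunks (r, 2*cp_size-1-r)."""
    chunks = np.split(x, 2 * cp_size, axis=seq_dim)
    order = []
    for r in range(cp_size):
        order += [r, 2 * cp_size - 1 - r]
    return np.concatenate([chunks[i] for i in order], axis=seq_dim)

def inverse_reorder_causal_load_balancing(x: np.ndarray, cp_size: int,
                                          seq_dim: int = 0) -> np.ndarray:
    """Undo reorder_causal_load_balancing (the inverse permutation of chunks)."""
    chunks = np.split(x, 2 * cp_size, axis=seq_dim)
    order = []
    for r in range(cp_size):
        order += [r, 2 * cp_size - 1 - r]
    inv = np.argsort(order)  # invert the chunk permutation
    return np.concatenate([chunks[i] for i in inv], axis=seq_dim)
```

For example, with `cp_size=2` a sequence of 8 tokens is split into 4 chunks and reordered as `[0, 3, 1, 2]`, so rank 0 sees (early chunk 0, late chunk 3) and rank 1 sees (1, 2).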

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

  • Refactor the reorder/inverse_reorder_causal_load_balancing API to support different reorder strategies.
  • Support P2P context parallelism. The limitations are listed above.
  • Reduce the number of kv_groups test configs in test_distributed_fused_attn.
  • Use AttnBiasType, AttnMaskType, and QKVLayout in cpp_extensions/attention.py to maintain readability.

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@zlsh80826 zlsh80826 marked this pull request as draft February 4, 2025 20:27
@zlsh80826 (Collaborator, Author)

/te-ci jax L1

@zlsh80826 zlsh80826 marked this pull request as ready for review February 6, 2025 09:46
@zlsh80826 zlsh80826 requested a review from phu0ngng February 6, 2025 09:56
@zlsh80826 (Collaborator, Author)

/te-ci jax L1

@zlsh80826 zlsh80826 force-pushed the rewang/thd-ring-attn branch from d486032 to 33ac4d2 Compare February 8, 2025 09:41
@zlsh80826 (Collaborator, Author)

/te-ci jax L1

@zlsh80826 zlsh80826 force-pushed the rewang/thd-ring-attn branch from 33ac4d2 to ddd6a8b Compare February 10, 2025 06:38
@zlsh80826 (Collaborator, Author)

/te-ci jax L1

@phu0ngng phu0ngng requested a review from kocchop February 13, 2025 01:54
@zlsh80826 zlsh80826 force-pushed the rewang/thd-ring-attn branch from 6dd5fdb to 4c17948 Compare February 19, 2025 15:13
@zlsh80826 (Collaborator, Author)

/te-ci jax L1

Comment on lines 345 to 350
if strategy == ReorderStrategy.DualChunkSwap:
return tex.attention.reorder_causal_load_balancing(tensor, cp_size, seq_dim, True)
if strategy == ReorderStrategy.Striped:
return _inverse_reorder_causal_striped(tensor, cp_size, seq_dim)
Collaborator:
Hi,

Why did we implement reorder_causal_load_balancing() in jax/cpp_extensions/attention.py but _inverse_reorder_causal_striped in attention.py?

I think we should make _reorder_causal_striped have the same API as tex.reorder_causal_load_balancing(), which accepts a boolean if_inverse and can handle both cases.
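The suggested single-function API could look like the following hypothetical NumPy sketch (the name, the `inverse` flag, and the signature are assumptions based on this suggestion, not the PR's actual code). Striped reordering assigns rank `r` every `cp_size`-th token starting at offset `r`:

```python
import numpy as np

def reorder_causal_striped(x: np.ndarray, cp_size: int, seq_dim: int = 0,
                           inverse: bool = False) -> np.ndarray:
    """Striped reorder: rank r ends up holding tokens r, r+cp_size, r+2*cp_size, ...
    With inverse=True the same function undoes the permutation."""
    seq_len = x.shape[seq_dim]
    # Gather every cp_size-th token for each rank in turn.
    perm = np.concatenate([np.arange(r, seq_len, cp_size) for r in range(cp_size)])
    if inverse:
        perm = np.argsort(perm)  # invert the permutation
    return np.take(x, perm, axis=seq_dim)
```

Handling both directions through one flag keeps the call sites symmetric with tex.reorder_causal_load_balancing(), as suggested.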

Collaborator, Author:

The last argument of reorder_causal_load_balancing is actually to_contiguous, not inverse/not-inverse. The reason reorder_causal_load_balancing needs to live under cpp_extensions/attention.py is that it is also used by code in cpp_extensions/attention.py, whereas _reorder_causal_striped is not.

Collaborator, Author:

But for better alignment, I can move it into cpp_extensions/attention.py.

Collaborator, Author:

Done

@@ -310,7 +312,7 @@ def abstract(
rng_state_shape = (seed_aval.shape[0], checker.rng_state_size)
rng_state_aval = seed_aval.update(shape=rng_state_shape, dtype=checker.rng_state_dtype)

-        if config.attn_bias_type == NVTE_Bias_Type.NVTE_NO_BIAS:
+        if config.attn_bias_type == AttnBiasType.NO_BIAS:
bias_batch = bias_heads = 0
else:
*bias_batch_shape, bias_heads, _, _ = bias_aval.shape
Collaborator:

Hi, what does the full shape of bias_aval look like here? Is this bias for PreBias, PostBias, or both?

Collaborator, Author:

When no_bias, a zero-shape bias is passed. When it is not, the bias is intended for both PreBias and PostBias.
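The shape unpacking in the diff above can be illustrated with a minimal sketch (the concrete shape values here are hypothetical, chosen only to show which dimensions go where):

```python
# Hypothetical bias shape following the diff above: (*batch, heads, sq, skv).
# A zero-size placeholder is passed instead when attn_bias_type is NO_BIAS.
bias_shape = (2, 16, 128, 128)

# Starred unpacking: everything before the last three dims is the batch part,
# the third-from-last dim is the head count, and sq/skv are discarded.
*bias_batch_shape, bias_heads, _, _ = bias_shape
```

Here `bias_batch_shape` becomes `[2]` and `bias_heads` becomes `16`, matching the `else` branch of the abstract rule.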

@zlsh80826 zlsh80826 force-pushed the rewang/thd-ring-attn branch from 4c17948 to fc2ebcb Compare February 24, 2025 14:45
@zlsh80826 (Collaborator, Author)

/te-ci jax L1
