Is dp_size = tp_size still required for the DeepSeek model? #3359

Closed

luzengxiangcn opened this issue Feb 7, 2025 · 5 comments

Comments

@luzengxiangcn
Contributor

According to the DeepSeek optimizations, the DeepSeek model with DP+TP attention is now supported. Is dp_size = tp_size still required?

In the server arguments doc:

> enable_dp_attention: Enable [Data Parallelism Attention](https://lmsys.org/blog/2024-12-04-sglang-v0-4/#data-parallelism-attention-for-deepseek-models) for Deepseek models. Note that you need to choose dp_size = tp_size for this.
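For concreteness, here is a minimal launch sketch based on my reading of that doc; the model path, GPU count, and exact flag spellings (`--tp`, `--dp`, `--enable-dp-attention`) are assumptions and may differ across SGLang versions:

```bash
# Minimal sketch (assumed flags, verify against your SGLang version).
# Per the server-args doc, DP attention expects dp_size = tp_size,
# so --dp matches --tp here.
python3 -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-V3 \
  --tp 8 \
  --dp 8 \
  --enable-dp-attention \
  --trust-remote-code
```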
@zhaochenyang20
Collaborator

Well, I think this is still needed, but we've handled it automatically.

@luzengxiangcn
Contributor Author

cc @zhaochenyang20. Does that mean either DP or TP is supported, but I can't use both at the same time, e.g. tp=4, dp=2? We need this feature to deploy DeepSeek-V3 across multiple nodes: TP within each node, DP across all nodes.

@zhaochenyang20
Collaborator

The dp in data parallelism attention has a different meaning. Check this:

https://docs.sglang.ai/references/deepseek.html#multi-head-latent-attention-mla-throughput-optimizations

@luzengxiangcn
Contributor Author

luzengxiangcn commented Feb 7, 2025

@zhaochenyang20

> The dp in data parallelism attention has a different meaning. Check this:
>
> https://docs.sglang.ai/references/deepseek.html#multi-head-latent-attention-mla-throughput-optimizations

Gotcha!
What I am looking for is:

[Image: proposed deployment architecture diagram]

We are planning to apply this deployment architecture to reduce network pressure between nodes, while not increasing inference time too much when throughput is low.
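To illustrate the layout we have in mind, here is a rough per-node sketch; the commands, ports, and the idea of putting a router/load balancer in front are assumptions about how we would wire this up, not something the DP-attention feature provides:

```bash
# Rough sketch of the planned layout (assumed commands/flags):
# each node runs an independent TP-only replica, and "DP across nodes"
# comes from a front-end load balancer rather than --dp on the server.

# On node 0 (TP across the node's local GPUs):
python3 -m sglang.launch_server --model-path deepseek-ai/DeepSeek-V3 \
  --tp 8 --port 30000 --trust-remote-code

# On node 1 (identical, independent replica):
python3 -m sglang.launch_server --model-path deepseek-ai/DeepSeek-V3 \
  --tp 8 --port 30000 --trust-remote-code

# A router / HTTP load balancer in front of node0:30000 and node1:30000
# then spreads requests across the replicas.
```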

@zhaochenyang20
Collaborator

@luzengxiangcn Cool. Nice work. Hope to see it soon.
