deepseek-R1 AssertionError occurred in the batch request of the client #3477
Comments
Oh. Currently, please do not use batch with dpsk models; we are aware of this problem. Batch requests can easily be replaced with chat completions, e.g. as sketched below.
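A minimal sketch of that workaround, assuming sglang's OpenAI-compatible endpoint on its default port 30000 and a JSONL file in the OpenAI Batch input format; the file path, port, and model name are placeholders, not values from this issue:

```python
# Replace the Batch API with per-request chat completions against the
# OpenAI-compatible server. base_url, file path, and model are assumptions.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

# Each line of the (hypothetical) batch input file is one JSONL request entry.
with open("batch_input.jsonl") as f:
    for line in f:
        entry = json.loads(line)
        body = entry["body"]
        response = client.chat.completions.create(
            model=body.get("model", "DeepSeek-R1"),
            messages=body["messages"],
            max_tokens=body.get("max_tokens", 512),
        )
        print(entry.get("custom_id"), response.choices[0].message.content)
```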
Do you have a plan to fix this issue? We need the Batch API in our scenario.
Yeah. As mentioned, @FrankLeeeee is working on this (batch support for dpsk models). Please wait and see, thanks!
@tanconghui @Roysky do you still encounter this issue with the latest release? I cannot reproduce the error. If you can provide me with a script to reproduce it, that will help as well.
@tanconghui @Roysky you can take a look at #3754; I didn't encounter the error anymore with this fix.
Thanks, @FrankLeeeee. I also noticed this issue, but maybe it is better to use a UUID instead of the custom_id as the request id? For example, if two batches are processed at the same time and samples with the same custom_id exist in both batches, the current solution in #3754 still seems problematic. A rough sketch of the idea is below.
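A rough sketch of that suggestion, not sglang's actual internals: keep custom_id for bookkeeping, but derive the internal request id from a UUID so two concurrent batches that reuse the same custom_id cannot collide.

```python
# Illustrative only: make the internal request id collision-free by
# combining the user-supplied custom_id with a random UUID.
import uuid

def make_request_id(custom_id: str) -> str:
    # custom_id stays visible for traceability; uuid4 guarantees uniqueness.
    return f"{custom_id}-{uuid.uuid4().hex}"

print(make_request_id("sample-1"))  # e.g. sample-1-3f2a9c...
```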
@tanconghui We just merged this into main. Thanks!
While using deepseek-R1 for inference on 2 nodes * 8 GPUs (H800), an AssertionError occurred during a client batch request.
The specific error is as follows:
The environment configuration is as follows:
Startup commands:
node1:
python -m sglang.launch_server --model-path DeepSeek-R1 --tp 16 --nccl-init-addr 10.1.10.42:5000 --nnodes 2 --node-rank 0 --trust-remote-code --host 0.0.0.0
node2:
python -m sglang.launch_server --model-path DeepSeek-R1 --tp 16 --nccl-init-addr 10.1.10.42:5000 --nnodes 2 --node-rank 1 --trust-remote-code --host 0.0.0.0
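For reference, a minimal sketch of the kind of client batch request discussed in this issue, using the OpenAI-compatible Files/Batches endpoints of the server launched above; the port (sglang's default 30000), file name, and prompt contents are assumptions for illustration:

```python
# Submit a small batch job to the OpenAI-compatible server and poll it.
# base_url, model name, and the JSONL contents are placeholders.
import json
import time
from openai import OpenAI

client = OpenAI(base_url="http://10.1.10.42:30000/v1", api_key="EMPTY")

# Write a small JSONL batch input file in the OpenAI Batch input format.
with open("batch_input.jsonl", "w") as f:
    for i in range(4):
        f.write(json.dumps({
            "custom_id": f"request-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "DeepSeek-R1",
                "messages": [{"role": "user", "content": f"Question {i}"}],
                "max_tokens": 128,
            },
        }) + "\n")

# Upload the file and create the batch job.
uploaded = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=uploaded.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Poll until the batch reaches a terminal state.
while batch.status not in ("completed", "failed", "cancelled"):
    time.sleep(5)
    batch = client.batches.retrieve(batch.id)
print(batch.status)
```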