Pull requests: vllm-project/vllm
Pull requests list
[Models] Remove GPU-CPU sync when do_pan_and_scan=false in Gemma3
#19999
opened Jun 23, 2025 by
lgeiger
[Llama4] Update attn_temperature_tuning
llama
Related to Llama models
ready
ONLY add when PR is ready to merge/full CI is needed
#19997
opened Jun 23, 2025 by
b8zhong
Update test case parameter to have the throughput above 8.0
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
tpu
Related to Google TPUs
#19994
opened Jun 23, 2025 by
QiliangCui
2 tasks done
[Misc] Clean up InternVL family config registration
#19992
opened Jun 23, 2025 by
Isotr0py
3 of 4 tasks
Add support for encoder embedding models
tpu
Related to Google TPUs
v1
#19988
opened Jun 23, 2025 by
maxdebayser
•
Draft
Move to a faster base64 implementation
ci/build
multi-modality
Related to multi-modality (#4194)
#19984
opened Jun 23, 2025 by
h-avsha
Blocked fp8 CUTLASS MoE
ci/build
performance
Performance-related issues
#19983
opened Jun 23, 2025 by
ElizaWszola
•
Draft
[doc] Fix broken link in the installation for CPU
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#19980
opened Jun 23, 2025 by
yankay
3 of 4 tasks
[Model][1/N] Automatic conversion of CrossEncoding model. Part 1
documentation
Improvements or additions to documentation
qwen
Related to Qwen models
feat: add reward model + min_p speculative decode
frontend
qwen
Related to Qwen models
#19968
opened Jun 23, 2025 by
jatery55555
4 tasks
feat: offload weights to cpu before fp8 online quant
documentation
Improvements or additions to documentation
#19967
opened Jun 23, 2025 by
yma11
[Chore] Clarifying log messages for KV Connector
ready
ONLY add when PR is ready to merge/full CI is needed
#19965
opened Jun 23, 2025 by
aarnphm
[CI/Build] Upgrade lm-eval to 0.4.9
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#19962
opened Jun 23, 2025 by
yeqcharlotte
feat(audio): add flag for Whisper chunking (#19772)
frontend
#19961
opened Jun 23, 2025 by
hardikkgupta
1 of 4 tasks
[CI/Build] Add basic multimodal lm eval for CI testing
ci/build
#19959
opened Jun 23, 2025 by
yeqcharlotte
3 of 4 tasks
[Doc] cmd+k
documentation
Improvements or additions to documentation
#19957
opened Jun 22, 2025 by
aarnphm
[Perf][Frontend]: eliminate api_key and x_request_id headers middleware overhead
documentation
Improvements or additions to documentation
frontend
#19946
opened Jun 22, 2025 by
Yazan-Sharaya
4 tasks done
[Bugfix] fix sampling seeding being off when sequences are preempted
v0
#19940
opened Jun 21, 2025 by
Jackmin801
•
Draft
[PERF] Speedup of MRoPE prepare inputs
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#19939
opened Jun 21, 2025 by
vadiklyutiy
3 tasks done