Pull requests: vllm-project/vllm

Pull requests list

[Llama4] Update attn_temperature_tuning
Labels: llama, ready
#19997 opened Jun 23, 2025 by b8zhong
Update test case parameter to have the throughput above 8.0
Labels: ci/build, ready, tpu
#19994 opened Jun 23, 2025 by QiliangCui
2 tasks done
[Misc] Clean up InternVL family config registration
#19992 opened Jun 23, 2025 by Isotr0py
3 of 4 tasks
Add support for encoder embedding models
Labels: tpu, v1
#19988 opened Jun 23, 2025 by maxdebayser Draft
Move to a faster base64 implementation
Labels: ci/build, multi-modality
#19984 opened Jun 23, 2025 by h-avsha
Blocked fp8 CUTLASS MoE
Labels: ci/build, performance
#19983 opened Jun 23, 2025 by ElizaWszola Draft
[doc] Fix broken link in the installation for CPU
Labels: documentation, ready
#19980 opened Jun 23, 2025 by yankay
3 of 4 tasks
[Model][1/N] Automatic conversion of CrossEncoding model. Part 1
Labels: documentation, qwen
#19978 opened Jun 23, 2025 by noooop Draft
4 tasks
support --no-enable-chunked-prefill for V1
#19975 opened Jun 23, 2025 by liuyumoye
Enabling Safe KVConnector
#19972 opened Jun 23, 2025 by prashant182
[Core][V1] Support sharded state loading
#19971 opened Jun 23, 2025 by aarnphm
Implement Async Scheduling v1
#19970 opened Jun 23, 2025 by WoosukKwon Draft
4 tasks
feat: add reward model + min_p speculative decode
Labels: frontend, qwen
#19968 opened Jun 23, 2025 by jatery55555
4 tasks
feat: offload weights to cpu before fp8 online quant
Labels: documentation
#19967 opened Jun 23, 2025 by yma11
[Chore] Clarifying log messages for KV Connector
Labels: ready
#19965 opened Jun 23, 2025 by aarnphm
[CI/Build] Upgrade lm-eval to 0.4.9
Labels: ci/build, ready
#19962 opened Jun 23, 2025 by yeqcharlotte
feat(audio): add flag for Whisper chunking (#19772)
Labels: frontend
#19961 opened Jun 23, 2025 by hardikkgupta
1 of 4 tasks
[CI/Build] Add basic multimodal lm eval for CI testing
Labels: ci/build
#19959 opened Jun 23, 2025 by yeqcharlotte
3 of 4 tasks
[Doc] cmd+k
Labels: documentation
#19957 opened Jun 22, 2025 by aarnphm
[Perf][Frontend]: eliminate api_key and x_request_id headers middleware overhead
Labels: documentation, frontend
#19946 opened Jun 22, 2025 by Yazan-Sharaya
4 tasks done
[PERF] Speedup of MRoPE prepare inputs
Labels: qwen, ready, v1
#19939 opened Jun 21, 2025 by vadiklyutiy
3 tasks done