forked from vllm-project/vllm
Pull requests: ROCm/vllm
Pull requests list
removing quant and kv-cache fp8 from deepseek run instructions
#509 opened Apr 9, 2025 by arakowsk-amd
Handling input dim size greater than 3 in tuned_gemm.py
#482 opened Mar 13, 2025 by charlifu
EXPERIMENTING WITH K8S // NO NEED TO MERGE // Rocm vllm ci fix nd k8 osci
#477 opened Mar 12, 2025 by Alexei-V-Ivanov-AMD
Updating ISL and OSL to align with reported benchmark table
#424 opened Feb 14, 2025 by eduand-alvarez
K8test baseline -> Testing a single MI300 8x GPU node for CI performance // no need to merge
#409 opened Feb 6, 2025 by Alexei-V-Ivanov-AMD
Add TritonScaledMMLinearKernel to fix broken support for int8 models
stale
#377 opened Jan 21, 2025 by rasmith
Trying to pass toml file as a parameter to codespell
stale
#376 opened Jan 21, 2025 by gshtras