Pull requests: vllm-project/vllm
[WIP] Add TRITON_MLA_SPARSE backend for SM80 sparse MLA support
documentation
Improvements or additions to documentation
nvidia
rocm
Related to AMD ROCm
v1
fix(p2p_nccl): free KV recv_store entries immediately to prevent OOM (#38472)
kv-connector
v1
#38475
opened Mar 29, 2026 by
saifmb0
Fix potential infinite loop in SonnetDataset.sample
performance
Performance-related issues
#38471
opened Mar 29, 2026 by
frankie-ys
1 of 5 tasks
fix: Add apply_with_spec_decode() method to LogitBiasLogitsProcessor
v1
#38469
opened Mar 29, 2026 by
ranger2571
5 tasks
Add platform manual_seed_all API
intel-gpu
Related to Intel GPU
nvidia
performance
Performance-related issues
rocm
Related to AMD ROCm
speculative-decoding
v1
#38468
opened Mar 29, 2026 by
yma11
[Feature] Add apply_with_spec_decode() to LogitBiasLogitsProcessor
v1
#38467
opened Mar 29, 2026 by
NJX-njx
[Bugfix] Fix limit_mm_per_prompt being ignored for encoder cache profiling
bug
Something isn't working
multi-modality
Related to multi-modality (#4194)
#38465
opened Mar 29, 2026 by
NJX-njx
[Logging] Improve DCP error message to suggest VLLM_ATTENTION_BACKEND
v1
#38464
opened Mar 29, 2026 by
WJYuuuu
3 of 5 tasks
[Quantization] Consolidate experts_int8 with fp8 online quantization
#38463
opened Mar 29, 2026 by
Josephasafg
Draft
3 of 5 tasks
[Logging] Add JIT compilation progress log for FlashInfer
nvidia
v1
#38462
opened Mar 29, 2026 by
WJYuuuu
3 of 5 tasks
[Perf] Batch KV cache swap copies via cuMemcpyBatchAsync
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#38460
opened Mar 29, 2026 by
Etelis
[Docs] Add vLLM CI overview documentation for contributors
documentation
Improvements or additions to documentation
#38458
opened Mar 29, 2026 by
khluu
3 tasks
[ROCm] [DOC] Update the Documentation to include ROCm Nightly Wheel support
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
#38457
opened Mar 29, 2026 by
tjtanaa
5 tasks
[CI] Fix online FP8 quantization materializing tensors on CPU
bug
Something isn't working
#38456
opened Mar 29, 2026 by
haosdent
[ROCm] Add RDNA 3.5/4 device IDs (gfx1150, gfx1151, gfx1201)
rocm
Related to AMD ROCm
#38455
opened Mar 29, 2026 by
dondetir
[ROCm][Test] Add hybrid block size and RDNA4 backend selection tests
rocm
Related to AMD ROCm
v1
#38454
opened Mar 29, 2026 by
dondetir
[kv_offload+HMA][8/N]: Support multi-group worker transfer
v1
#38453
opened Mar 29, 2026 by
orozery
fix(metrics): capture num_preemptions before _reset() clears it in log()
v1
#38452
opened Mar 29, 2026 by
610lyn
1 of 2 tasks
[Perf] Fix DBO overlap: capture DeepEP event before yield
#38451
opened Mar 29, 2026 by
czhu-cohere
5 tasks
[CI] Revamp translation validation tests: parametrize ROCm backends, add seed, relax semantic assertions
rocm
Related to AMD ROCm
#38449
opened Mar 29, 2026 by
AndreasKaratzas
1 task done
fix(tokenizer): skip reasoning_effort when None in Mistral tokenizer
#38448
opened Mar 29, 2026 by
marioiseli89
3 tasks done