Pull requests: alibaba/rtp-llm

Pull requests list

Feat/smoke flexlb pd
#1001 opened May 13, 2026 by jianglan89 Collaborator
fix: switch FC endpoints from VPC to public network
#1000 opened May 13, 2026 by guoj14 Contributor
feat: update rtp-kernel for w4a8-opt and sm103a
#999 opened May 13, 2026 by Bruce-Lee-LY Collaborator
fix: qwen3.5-moe-nvfp4 accuracy
#998 opened May 13, 2026 by junna2016 Collaborator
W4A8 OPT with internal_source flashinfer
#996 opened May 12, 2026 by qqbbiu Collaborator
Feature/fix mtp leakmem
#995 opened May 12, 2026 by zerozw Collaborator
test: add server_args for server_test
#994 opened May 12, 2026 by zhangjianning-zjn Collaborator
Feat/fuse remap local ids triton
#993 opened May 12, 2026 by Xu-Sheng-lin Collaborator
feat(flexlb): add configurable group routing policy
#988 opened May 10, 2026 by jianglan89 Collaborator
Real-time constrained decoding
#987 opened May 9, 2026 by Glen11111Z
fix - fuse some norm kernel
#986 opened May 9, 2026 by Nancheng-11 Collaborator
feat: native setup.py and pytest (v2)
#985 opened May 9, 2026 by LLLLKKKK Collaborator
feat: Qwen35-MXFP4 impl
#984 opened May 9, 2026 by zhaoan12-prc Collaborator
Develop/bailian 0508 after rebase
#982 opened May 9, 2026 by jianglan89 Collaborator
feat - merge qwen35 gate + qkv gemm
#981 opened May 8, 2026 by zerozw Collaborator
feat - add fa3 attention for mtp
#980 opened May 8, 2026 by zerozw Collaborator
prompt generator server
#979 opened May 8, 2026 by parkerpang
fix(model-loader): avoid deepcopy for fp8 scale params
#978 opened May 8, 2026 by siluzhou Collaborator
feat(mori-ep): Add MoRI Expert Parallelism support for ROCm
#977 opened May 8, 2026 by jacobwin-ai Collaborator
Ele dev 0508
#976 opened May 8, 2026 by Glen11111Z
Fix/kimi linear flaky
#975 opened May 8, 2026 by theNiemand Collaborator
fix(rocm): apply RoPE for embedding models without KV cache
#973 opened May 7, 2026 by siluzhou Collaborator