Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[KV Cache Manager] Dead code elimination, we no longer record/fetch through WindowBlockManager:: mContextBlocksByHash
#6249 opened Jul 22, 2025 by eopXD
[TRTLLM-5627] feat: Implement pytorch sampler for MTP
#6245 opened Jul 22, 2025 by nvxuanyuc
[PERF] Don't use hmac encryption for loopback interfaces
Community want to contribute
PRs initiated from Community
#6241 opened Jul 22, 2025 by vadiklyutiy
Add Acceptance Rate calculation to benchmark_serving
#6240 opened Jul 22, 2025 by zerollzeng
[fix][nvbugs/5399355] Fix Lamport buffer clear issue for MNNVL TwoShot Allreduce
#6237 opened Jul 21, 2025 by timlee0212
[nvbug/5376229]: Remove flash-attn dependency from test_ptp_quickstart_multimodal
#6236 opened Jul 21, 2025 by moraxu
[Fix][Nvbug 5401163] Fix bug of MoE on tp > 1 with trtllm moe backend
#6235 opened Jul 21, 2025 by byshiue
add env for dlcluster docker command for slurm job
#6234 opened Jul 21, 2025 by yuanjingx87
[fix] Allow custom model config for Kimi-K2
Community want to contribute
PRs initiated from Community
#6228 opened Jul 21, 2025 by meenchen
[nvbugs/5401261][fix] Fix Triton backend disaggregated serving support
#6224 opened Jul 21, 2025 by Tabrizian