forked from NVIDIA/TensorRT-LLM
Pull requests: nv-auto-deploy/TensorRT-LLM
[feat] TP Sharding read from the model config (fixes #6342)
enhancement
#117 opened Jul 24, 2025 by greg-kwasniewski1
[TRTLLM-4789] Support logit softcapping during the graph import and optimization
#65 opened Jun 24, 2025 by nvchenghaoz
[TRTLLM-4880, TRTLLM-4595] Add soft logit capping in custom kernel and flashinfer
#62 opened Jun 16, 2025 by nvchenghaoz