nv-auto-deploy / TensorRT-LLM Public

forked from NVIDIA/TensorRT-LLM

Pull requests: nv-auto-deploy/TensorRT-LLM

12 Open 140 Closed

Pull requests list

Fix the unit test errors / enable accuracy tests

#150 opened Oct 1, 2025 by nvchenghaoz

Sg/bamba bench

#139 opened Sep 24, 2025 by suyoggupta • Draft

Support Pixtral

#128 opened Aug 14, 2025 by nvchenghaoz • Draft

Support Qwen 2.5 VL

#127 opened Aug 12, 2025 by nvchenghaoz • Draft

Skip pattern matching in specified modules

#125 opened Aug 8, 2025 by suyoggupta • Draft

remove spurious cpu->gpu and gpu->cpu transfers

#123 opened Aug 1, 2025 by suyoggupta

Sg/opt2

#120 opened Jul 28, 2025 by suyoggupta • Draft

avoid copying new_tokens to cpu

#118 opened Jul 27, 2025 by suyoggupta

[feat] TP Sharding read from the model config (fixes #6342)

Label: enhancement (New feature or request)

#117 opened Jul 24, 2025 by greg-kwasniewski1

[AutoDeploy] dist_ops revisited

#96 opened Jul 18, 2025 by lucaslie

[TRTLLM-4789] Support logit softcapping during the graph import and optimization

#65 opened Jun 24, 2025 by nvchenghaoz

[TRTLLM-4880, TRTLLM-4595] Add soft logit capping in custom kernel and flashinfer

#62 opened Jun 16, 2025 by nvchenghaoz
