1 file changed: +3 -1 lines changed
@@ -25,6 +25,8 @@ TensorRT-LLM optimizes the performance of a range of well-known models on NVIDIA
| `Qwen2ForRewardModel` | Qwen2-based | `Qwen/Qwen2.5-Math-RM-72B` | L |
| `Qwen2VLForConditionalGeneration` | Qwen2-VL | `Qwen/Qwen2-VL-7B-Instruct` | L + V |
| `Qwen2_5_VLForConditionalGeneration` | Qwen2.5-VL | `Qwen/Qwen2.5-VL-7B-Instruct` | L + V |
+ | `Qwen3ForCausalLM` | Qwen3 | `Qwen/Qwen3-8B` | L |
+ | `Qwen3MoeForCausalLM` | Qwen3MoE | `Qwen/Qwen3-30B-A3B` | L |
Note:
- L: Language only
@@ -72,7 +74,7 @@
- [mT5](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/core/enc_dec)
- [OPT](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/contrib/opt)
- [Phi-1.5/Phi-2/Phi-3](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/core/phi)
- - [Qwen/Qwen1.5/Qwen2](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/core/qwen)
+ - [Qwen/Qwen1.5/Qwen2/Qwen3](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/core/qwen)
- [Qwen-VL](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/core/qwenvl)
- [RecurrentGemma](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/core/recurrentgemma)
- [Replit Code](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/contrib/mpt)[^replitcode]