Commit ecd621f
feat: Add head size 72 support for QKV Preprocessing kernel (#3743)
* refactor: Fix head size 72 attention error for TRTLLM attn backend in PyTorch workflow
- Remove the head size pre-check logic in AttentionOp because head size 72 can be supported with fmha kernels.
- Added support for head size 72 in unfused attention kernels (QKVPreprocessing).
- Enhanced unit tests by introducing a scenario generation function for better test coverage of attention configurations (including head size 72).
Signed-off-by: qixiang-99 <[email protected]>
* update: Waive head_dim=72 test cases and enhance test representation
- Added a waiver for head_dim=72 cases on post sm100 in the test suite to address known issues.
- Introduced a custom __repr__ method in the Scenario class for pytest substring match.
Signed-off-by: qixiang-99 <[email protected]>
---------
Signed-off-by: qixiang-99 <[email protected]>
1 parent 5b9897a commit ecd621f
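The test-side changes described in the commit message (a scenario generation function, a custom `__repr__` on `Scenario` for pytest substring matching, and a waiver for head_dim=72) might look roughly like the sketch below. This is an illustration only, not the code from this commit: the field names, `generate_scenarios`, `is_waived`, the default values, and the reading of "post sm100" as "SM 100 and newer" are all assumptions.

```python
from dataclasses import dataclass
from itertools import product


# repr=False so the dataclass decorator does not generate its own __repr__.
@dataclass(frozen=True, repr=False)
class Scenario:
    """Hypothetical attention test configuration (field names are assumed)."""
    num_heads: int
    num_kv_heads: int
    head_dim: int

    def __repr__(self) -> str:
        # Flat, predictable string: when used as the pytest test id, a
        # substring such as "head_dim_72" can select or waive matching cases.
        return (f"Scenario-num_heads_{self.num_heads}"
                f"-num_kv_heads_{self.num_kv_heads}"
                f"-head_dim_{self.head_dim}")


def generate_scenarios(head_dims=(64, 72, 128), num_heads=(8,), kv_ratios=(1, 4)):
    """Build the cross product of the given settings (values are assumed)."""
    return [
        Scenario(num_heads=nh, num_kv_heads=nh // ratio, head_dim=hd)
        for hd, nh, ratio in product(head_dims, num_heads, kv_ratios)
    ]


def is_waived(scenario: Scenario, sm_version: int) -> bool:
    """Hypothetical waiver rule; assumes "post sm100" means SM 100 and newer."""
    return scenario.head_dim == 72 and sm_version >= 100
```

With pytest, such a list could be passed to `pytest.mark.parametrize("scenario", generate_scenarios(), ids=repr)`, after which `pytest -k head_dim_72` would select only the head size 72 cases.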
File tree
3 files changed, +40 -20 lines changed
- cpp/tensorrt_llm
  - common
  - kernels/unfusedAttentionKernels
- tests/unittest/_torch
Diff summary (the changed line contents were not captured; only line numbers and +/- markers survive):

| File (in the order shown) | Additions | Deletions | Added lines (new numbering) | Removed lines (old numbering) |
|---|---|---|---|---|
| 1 | 2 | 2 | 2300-2301 | 2300-2301 |
| 2 | 1 | 0 | 1606 | none |
| 3 | 37 | 18 | 1, 5, 9, 16-32, 132-135, 170-173, 175-180, 183-184, 199 | 4, 147-163 |