Commit 6efa1b5

zhou-yuxin authored and Ransiki committed
Update fmhaRunner.cpp to fix guardwords scan error (NVIDIA#6327)
Signed-off-by: Zhou Yuxin <[email protected]>
Signed-off-by: Ransiki Zhang <[email protected]>
1 parent 4148f29 commit 6efa1b5

1 file changed, +1 -1 lines changed

cpp/tensorrt_llm/kernels/contextFusedMultiHeadAttention/fmhaRunner.cpp

Lines changed: 1 addition & 1 deletion
@@ -538,7 +538,7 @@ void FusedMHARunnerV2::setTmaDescriptors(MHARunnerParams runnerParams)
     // Box size of TMA
     const uint32_t box_size_o[3] = {d_per_group, 1, 16};
 
-    // Yuxin: dataTypeOut may be different with dataType, so desc_format and swizzle_mode
+    // dataTypeOut may be different with dataType, so desc_format and swizzle_mode
     // may be incorrect. For example, QKV are in bf16 while O is in fp8.
     // Luckily, this case doesn't exist so far. But we should keep one eye on it.
     qo_tma_descriptor.set_tma_desctriptor(o_ptr, desc_format, cudaTmaDescInterleave::INTERLEAVE_DISABLED,
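The comment in this hunk warns that the output descriptor reuses a format and swizzle mode derived from the input data type, which would be wrong if the output type ever differed (for example, bf16 QKV with fp8 O). One way to "keep one eye on it" is an explicit guard, sketched below. This is only an illustration and is not part of the commit: the `DataType` enum and the functions `outputMatchesInput` and `checkOutputDescriptorType` are hypothetical stand-ins, not TensorRT-LLM APIs.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the runner's data type enum (assumption, not the real type).
enum class DataType
{
    kFP16,
    kBF16,
    kFP8
};

// Returns true when the output tensor uses the same element type as the inputs,
// i.e. when reusing the input-derived descriptor format and swizzle mode is safe.
constexpr bool outputMatchesInput(DataType dataType, DataType dataTypeOut)
{
    return dataType == dataTypeOut;
}

// Hypothetical guard: throws instead of silently building an output descriptor
// from settings that were derived for a different element type.
inline void checkOutputDescriptorType(DataType dataType, DataType dataTypeOut)
{
    if (!outputMatchesInput(dataType, dataTypeOut))
    {
        throw std::invalid_argument(
            "output data type differs from input data type; "
            "the output TMA descriptor needs its own format and swizzle mode");
    }
}
```

Calling such a guard before the output descriptor is built would turn the "doesn't exist so far" assumption into a hard failure if a mixed-type configuration is ever introduced.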
