Commit 32e7a5c

[GPU][BUG] Incomplete conditions for sdpa prefill and head_size (#31727)
### Details:
 - *item1*
 - *...*

### Tickets:
 - *CVS-172034, CVS-169994*

---------

Co-authored-by: Chen Peter <[email protected]>
1 parent 7b1b471 commit 32e7a5c

File tree

1 file changed: +1 -1 lines changed
  • src/plugins/intel_gpu/src/graph/impls/ocl_v2/sdpa

src/plugins/intel_gpu/src/graph/impls/ocl_v2/sdpa/sdpa_opt.cpp

Lines changed: 1 addition & 1 deletion
@@ -123,7 +123,7 @@ class SDPAOptImpl : public SDPAImplBase {
         // So far this case was observed only from the non-lm models such as vision embedding model.
         // If we need to optimize unaligned head size SDPA for 2nd+ token phase of LM model,
         // we'll need to fix single_token kernel to support unaligned head size.
-        if (is_prefill || unaligned_head_size(params)) {
+        if (is_prefill || unaligned_head_size(new_params)) {
             GPU_DEBUG_TRACE_DETAIL << "execute multi_tokens for prefill with indirect = " << is_indirect << "\n";
             return execute_stage(events, instance, is_indirect ? indirect_multi_tokens : regular_multi_tokens);
         }
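
The dispatch rule in this hunk can be sketched in isolation to show why the object passed to the head-size check matters. This is a minimal illustration, not the plugin's code: `SdpaParamsSketch`, `Stage`, `pick_stage`, `unaligned_head_size_sketch`, and the alignment value of 16 are all hypothetical stand-ins, and it assumes (the commit does not say so) that `params` and `new_params` can describe different head sizes.

```cpp
#include <cstddef>

// Hypothetical stand-in for the kernel parameters; not the plugin's real type.
struct SdpaParamsSketch {
    std::size_t head_size;
};

// The two execution paths named in the diff, reduced to an enum.
enum class Stage { SingleToken, MultiTokens };

// Assumed alignment requirement, chosen only for illustration.
constexpr std::size_t kAssumedAlignment = 16;

// Sketch of the predicate: true when the head size is not a multiple
// of the assumed alignment.
bool unaligned_head_size_sketch(const SdpaParamsSketch& p) {
    return p.head_size % kAssumedAlignment != 0;
}

// Mirrors the condition in the hunk: prefill, or an unaligned head size,
// selects the multi-token path; everything else takes the single-token path.
Stage pick_stage(bool is_prefill, const SdpaParamsSketch& p) {
    return (is_prefill || unaligned_head_size_sketch(p)) ? Stage::MultiTokens
                                                         : Stage::SingleToken;
}
```

Under these assumptions, calling `pick_stage(false, {64})` selects the single-token path while `pick_stage(false, {72})` selects the multi-token path, so evaluating the predicate on a parameter object that does not reflect the current shapes could route an unaligned case to a kernel that does not support it.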
