@MengAiDev MengAiDev commented Aug 26, 2025

  • Add stream->wait() to ensure all kernels finish execution before proceeding
  • This resolves potential race conditions in the argsort operation

Fixes: #15580

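The effect of waiting on the stream can be illustrated with a small toy model. This is only a sketch in plain C++, not the SYCL code touched by this PR: `ToyStream`, `argsort_kernel`, `ids_in_range`, and `run_demo` are hypothetical names invented for the example, and the "stream" simply defers work until `wait()` is called so that the ordering problem is deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <queue>
#include <utility>
#include <vector>

// Toy stand-in for an asynchronous device queue (NOT the SYCL API):
// submit() only records work, wait() runs everything recorded so far.
class ToyStream {
public:
    void submit(std::function<void()> kernel) { pending_.push(std::move(kernel)); }
    void wait() {
        while (!pending_.empty()) {
            pending_.front()();
            pending_.pop();
        }
    }
private:
    std::queue<std::function<void()>> pending_;
};

// Hypothetical argsort "kernel": fills idx with the ascending order of values.
void argsort_kernel(const std::vector<float>& values, std::vector<int>& idx) {
    std::iota(idx.begin(), idx.end(), 0);
    std::sort(idx.begin(), idx.end(),
              [&values](int a, int b) { return values[a] < values[b]; });
}

// Range check on the result: every id must lie in [0, n).
bool ids_in_range(const std::vector<int>& idx) {
    const int n = static_cast<int>(idx.size());
    return std::all_of(idx.begin(), idx.end(),
                       [n](int id) { return id >= 0 && id < n; });
}

// Usage: the host reads idx only after wait() has returned.
std::vector<int> run_demo() {
    const std::vector<float> values = {3.0f, 1.0f, 2.0f};
    std::vector<int> idx(values.size(), -1);  // -1 marks "not written yet"
    ToyStream stream;
    stream.submit([&] { argsort_kernel(values, idx); });
    // Reading idx at this point would still see -1 in every slot.
    stream.wait();
    assert(ids_in_range(idx));
    return idx;
}
```

In this model, any consumer that reads `idx` before `wait()` sees out-of-range ids, which is the kind of ordering hazard the added `stream->wait()` is meant to rule out. A real device queue is not deterministic in this way, so the same read may succeed or fail from run to run.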
@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) labels on Aug 26, 2025
@simonlui

@MengAiDev The closing brace for the function is missing, so the branch failed to compile when I checked it out. I added an extra line with } to close the function, and it works.

@NeoZhangJianyu
Collaborator

#15580 was reported on an iGPU.
Could you check whether a dGPU has this issue too?
If not, maybe add a condition that checks for an iGPU and call wait() for the iGPU only.

That could reduce the potential risk to dGPUs.

@simonlui

@NeoZhangJianyu I have an Intel Arc A770 16GB and can confirm the issue exists on my dGPU too. This is a snippet from the backtrace I posted in the issue:
/home/simonlui/Code_Repositories/llama-cpp-python/vendor/llama.cpp/ggml/src/ggml-sycl/ggml-sycl.cpp:3380: GGML_ASSERT(row_id_i >= 0 && row_id_i < n_as) failed
It is the same assert error as on the iGPU.

@NeoZhangJianyu
Collaborator

> @NeoZhangJianyu I have an Intel Arc A770 16GB and can confirm the issue existed on my dGPU too. This is a snippet from the backtrace I posted in the issue.
> /home/simonlui/Code_Repositories/llama-cpp-python/vendor/llama.cpp/ggml/src/ggml-sycl/ggml-sycl.cpp:3380: GGML_ASSERT(row_id_i >= 0 && row_id_i < n_as) failed
> Same assert error as iGPU.

OK! Thank you for your feedback!
It's OK with me!

@MengAiDev
Author

I have fixed the missing }

Successfully merging this pull request may close these issues.

Eval bug: Asynchronous Kernel Execution on iGPU Causes Runtime Errors with MOE Model