@magnumripper magnumripper commented Aug 20, 2025

Self-test with default LWS at device maximum

Also use a default GWS of 2x that LWS. The new figures are better at triggering bugs. If a kernel needs a lower LWS than the device maximum, our code already handles that.

We previously used LWS=7 GWS=49 for speed, and for checking heuristics that could bug out on non-power-of-two values, but that was introduced before our autotune was sped up by orders of magnitude (and the heuristics have been stable for many years).

Like before, the self-test will obey any given lws/gws options or environment variables.
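
A minimal sketch of the selection rule described above, for illustration only. The struct, the function name and the convention that 0 means "not given by the user" are assumptions made for this example and are not taken from the project's code. It also assumes that a user-supplied LWS without a GWS still gets the 2x default.

```c
#include <stddef.h>

/* Hypothetical container for the two work sizes (not a project type). */
struct work_sizes {
	size_t lws; /* local work size */
	size_t gws; /* global work size */
};

/*
 * Hypothetical helper: choose work sizes for a self-test run.
 * A user value of 0 is assumed to mean "not specified".
 */
static struct work_sizes self_test_work_sizes(size_t device_max_lws,
                                              size_t user_lws,
                                              size_t user_gws)
{
	struct work_sizes ws;

	/* A user-supplied LWS wins; otherwise default to the device maximum. */
	ws.lws = user_lws ? user_lws : device_max_lws;

	/* A user-supplied GWS wins; otherwise default to twice the LWS. */
	ws.gws = user_gws ? user_gws : 2 * ws.lws;

	return ws;
}
```

With a device maximum of 256 and no user settings, this sketch yields LWS=256 and GWS=512. User values are passed through unchecked, which mirrors the statement that given options are obeyed.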

Closes #5822

@magnumripper magnumripper force-pushed the self-test-worksizes branch 2 times, most recently from ccf5acb to ba4cebc Compare August 20, 2025 18:23
magnumripper commented Aug 20, 2025

So a build bot choked on office-opencl hanging (10 minutes no output) on its Intel CPU device. I'm not quite sure what to do with that.

Edit: I used the same CPU-specific logic that Claudio added nearby.

@magnumripper magnumripper force-pushed the self-test-worksizes branch 3 times, most recently from 7e93ffb to ff382db Compare August 20, 2025 20:53
@magnumripper

Found two format bugs so far because of this change. Fixes are in this PR but as separate commits.

magnumripper and others added 3 commits August 20, 2025 23:49
Also with a GWS of 2x that LWS.  The new figures are better at triggering
bugs. If a kernel needs a lower LWS than the device maximum, our code
already handles that.

We previously had it as LWS=7 GWS=49 for speed (and for checking
heuristics that could bug out on non-power-of-two values) but that was
introduced before our autotune was sped up by orders of magnitude
and those heuristics have been stable for many years.

Like before, the self-test will obey any given lws/gws options or
environment variables (in that case even when they are past limits).

Closes openwall#5822
Bug found after bumping work size during self-test.
Bug found after bumping work size during self-test.
@magnumripper

Ready for merge. Tested a lot, works fine and finds bugs. The only drawback is a tad slower self-test but it's only significant for things like --test=@opencl.

	if (self_test_running)
		global_work_size = local_work_size;
	else if (!global_work_size)
		global_work_size = 12 * local_work_size;
Member

This looks like it'd sometimes affect more than just self-test. Could be worth mentioning in the commit message?

Member Author

I'm not sure what this code affects (not shared autotune afaik) so wouldn't know what to write. Anyway I don't think it's worth mentioning, looks like some default start value or worst case fallback.
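
To make the question in this thread concrete, the reviewed lines can be wrapped in a small standalone function. The function name and its parameters are hypothetical; in the code under review these are variables in the surrounding scope. The sketch only restates the four lines quoted above and makes no claim about which of them the PR changed.

```c
#include <stddef.h>

/*
 * Hypothetical wrapper around the reviewed lines.
 * global_work_size == 0 is treated as "no GWS set", as in the snippet.
 */
static size_t pick_global_work_size(int self_test_running,
                                    size_t local_work_size,
                                    size_t global_work_size)
{
	if (self_test_running)
		/* Self-test: any previous GWS is replaced by the LWS. */
		global_work_size = local_work_size;
	else if (!global_work_size)
		/* Outside self-test with no GWS set: fall back to 12x LWS. */
		global_work_size = 12 * local_work_size;

	return global_work_size;
}
```

The second branch is only reached when no self-test is running, so its multiplier applies to normal runs that arrive without a GWS. That is consistent with both readings in the thread: it can affect more than the self-test, and it behaves like a default start value or fallback.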

@magnumripper magnumripper merged commit 157223c into openwall:bleeding-jumbo Aug 23, 2025
32 of 33 checks passed
@magnumripper magnumripper deleted the self-test-worksizes branch August 23, 2025 21:41
Development

Successfully merging this pull request may close these issues.

Note about OpenCL self-tests

2 participants