-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Self test worksizes #5825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Self test worksizes #5825
Conversation
ccf5acb to
ba4cebc
Compare
|
So a build bot choked on office-opencl hanging (10 minutes no output) on its Intel CPU device. I'm not quite sure what to do with that. Edit: I used the same CPU-specific logic that Claudio added nearby. |
7e93ffb to
ff382db
Compare
|
Found two format bugs so far because of this change. Fixes are in this PR but as separate commits. |
Also with a GWS of 2x that LWS. The new figures are better at triggering bugs. If a kernel needs a lower LWS than device max. our code already handles that. We previously had it as LWS=7 GWS=49 for speed (and for checking heuristics that could bug out on non-log2 values) but that was introduced before our autotune was sped up with orders of magnitude and those heuristics has been stable for many years. Like before, the self-test will obey any given lws/gws options or environment variables (in that case even when they are past limits). Closes openwall#5822
Bug found after bumping work size during self-test.
Bug found after bumping work size during self-test.
92ec343 to
eb7682f
Compare
|
Ready for merge. Tested a lot, works fine and finds bugs. The only drawback is a tad slower self-test but it's only significant for things like |
| if (self_test_running) | ||
| global_work_size = local_work_size; | ||
| else if (!global_work_size) | ||
| global_work_size = 12 * local_work_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it'd sometimes affect more than just self-test. Could be worth mentioning in the commit message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what this code affects (not shared autotune afaik) so wouldn't know what to write. Anyway I don't think it's worth mentioning, looks like some default start value or worst case fallback.
Self-test with default LWS at device maximum
Also with a default GWS of 2x that LWS. The new figures are better at triggering bugs. If a kernel needs a lower LWS than device max. our code already handles that.
We previously had it as LWS=7 GWS=49 for speed and for checking heuristics that could bug out on non-log2 values but that was introduced before our autotune was sped up with orders of magnitude (and the heuristics has been stable for many years).
Like before, the self-test will obey any given lws/gws options or environment variables.
Closes #5822