Add new SpGEMM/SpGEAM interface with reuse capabilities #1934
MarcelKoch
left a comment
mostly lgtm, only some smaller remarks.
@MarcelKoch I added the functionality to a benchmark. A very brief check on a 1Mx1M stencil matrix on an A2 GPU shows a roughly 1.4x speedup over normal SpGEMM.
MarcelKoch
left a comment
main question about how to construct the info object.
I added advanced_spgemm (multiply_add) and spgeam (add_scale) interfaces, as well as reusable versions of both, with extensive tests of all the size checks and cross-executor functionality.
MarcelKoch
left a comment
Only smaller nits left, mostly regarding the testing.
5cd4586 to b015950
- add documentation to multiply_reuse_info
- more strict input dimension checks
- fix test comments
Co-authored-by: Marcel Koch <[email protected]>

- make move operations noexcept
- reorder members for consistency
Co-authored-by: Marcel Koch <[email protected]>

- rename add_scale to scale_add
- remove unused variables
Co-authored-by: Marcel Koch <[email protected]>
8c5883a to 3dcf152
This adds a new SpGEMM/SpGEAM interface that works without abusing the LinOp::apply interface, and a new SpGEMM kernel that operates on an existing output structure. Maybe it also makes sense to expose this as a masked SpGEMM operation? The difference is similar to the one between LU and ILU kernels: contributions that fall outside the given sparsity pattern are dropped.
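
To illustrate what "operates on an existing output structure" could mean, here is a minimal host-side sketch. It is not the kernel from this PR and does not use any Ginkgo types: the `Csr` struct and the function name `spgemm_on_pattern` are hypothetical, and column indices are assumed to be sorted within each row. The function only recomputes the values of C = A * B for entries already present in C's pattern, so the symbolic phase is skipped and products outside the pattern are discarded (the masked, ILU-like behavior).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal CSR container for illustration only
// (not gko::matrix::Csr).
struct Csr {
    int rows;
    int cols;
    std::vector<int> row_ptrs;   // size rows + 1
    std::vector<int> col_idxs;   // assumed sorted within each row
    std::vector<double> values;  // same length as col_idxs
};

// Numeric phase only: overwrite the values of c with those of a * b,
// restricted to the sparsity pattern c already has. Products that would
// land outside that pattern are silently dropped.
void spgemm_on_pattern(const Csr& a, const Csr& b, Csr& c)
{
    // Dimension checks, in the spirit of the stricter checks in this PR.
    assert(a.cols == b.rows);
    assert(c.rows == a.rows && c.cols == b.cols);

    // Stale values from a previous product must not leak into the result.
    std::fill(c.values.begin(), c.values.end(), 0.0);

    for (int i = 0; i < a.rows; ++i) {
        const auto c_begin = c.col_idxs.begin() + c.row_ptrs[i];
        const auto c_end = c.col_idxs.begin() + c.row_ptrs[i + 1];
        for (int a_nz = a.row_ptrs[i]; a_nz < a.row_ptrs[i + 1]; ++a_nz) {
            const int k = a.col_idxs[a_nz];
            for (int b_nz = b.row_ptrs[k]; b_nz < b.row_ptrs[k + 1]; ++b_nz) {
                const int j = b.col_idxs[b_nz];
                // Binary search for column j in row i of c.
                const auto it = std::lower_bound(c_begin, c_end, j);
                if (it != c_end && *it == j) {
                    c.values[it - c.col_idxs.begin()] +=
                        a.values[a_nz] * b.values[b_nz];
                }
            }
        }
    }
}
```

If C carries the full pattern of A * B, this reproduces the regular product, which is the reuse case. If C carries a smaller pattern, it acts as a masked SpGEMM.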