Revise RAJA plugin support #1742

@adayton1

Description

Is your feature request related to a problem? Please describe.

CHAI uses RAJA plugins to make sure the data backing a ManagedArray is in the correct memory space and up to date. However, the current approach is not stream aware, which leads to suboptimal performance on GPU platforms. Where there is a dual memory space (CUDA), memory copies to the host are done on stream 0, which forces the whole device to synchronize. Where there is a single memory space (HIP), we have to synchronize the whole device to make sure the data is valid during host accesses.

Describe the solution you'd like

Making CHAI stream aware would be relatively straightforward if the camp resource used by RAJA were passed as an argument to the plugin functions. The postLaunch function should also receive an event with a wait method that CHAI can call when it needs to be sure the kernel has completed.
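
To make the request concrete, here is a rough sketch of the shape this could take. It is not RAJA's or camp's actual API: `Resource`, `Event`, and `StreamAwarePlugin` are hypothetical stand-ins, and the "kernel" is simulated with a shared flag so the example compiles without a GPU.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for a camp resource: identifies the stream in use.
struct Resource {
  int stream_id;
};

// Hypothetical stand-in for an event recorded on a resource after a launch.
// A default-constructed Event means "nothing pending".
class Event {
public:
  Event() = default;
  explicit Event(std::shared_ptr<bool> done) : m_done(std::move(done)) {}

  bool valid() const { return m_done != nullptr; }

  // A real event would block until the kernel finishes; here completion is
  // simulated by setting the shared flag.
  void wait() const {
    if (m_done) { *m_done = true; }
  }

private:
  std::shared_ptr<bool> m_done;
};

// Hypothetical plugin: the hooks receive the resource, and postLaunch also
// receives an event that is only waited on when the host touches the data.
class StreamAwarePlugin {
public:
  void preLaunch(const Resource& resource) {
    m_last_stream = resource.stream_id;  // e.g. issue copies on this stream
  }

  void postLaunch(const Resource& resource, Event event) {
    m_last_stream = resource.stream_id;
    m_pending = std::move(event);        // defer synchronization
  }

  // Called before a host access: wait on the one pending event instead of
  // synchronizing the whole device.
  void hostAccess() {
    if (m_pending.valid()) {
      m_pending.wait();
      m_pending = Event();
      ++m_waits;
    }
  }

  int waits() const { return m_waits; }
  int lastStream() const { return m_last_stream; }

private:
  Event m_pending;
  int m_last_stream = -1;
  int m_waits = 0;
};
```

The point of the sketch is that synchronization is deferred until a host access actually needs it, and it is scoped to a single event rather than the device.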

Describe alternatives you've considered

Instead of modifying the plugin, RAJA could set some global state that is accessible when the plugin methods are called.
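
For comparison, the global-state alternative might look something like the following. Again, this is only an illustration under assumed names (`Resource`, `currentResource`, `ScopedResource`, and `pluginPreLaunch` are all hypothetical), using a thread-local pointer that is set for the duration of a launch.

```cpp
// Hypothetical stand-in for a camp resource.
struct Resource {
  int stream_id;
};

// Thread-local "current resource" that the launch code would set and the
// plugin hooks would read. Null when no launch is in progress.
inline const Resource*& currentResource() {
  thread_local const Resource* current = nullptr;
  return current;
}

// RAII guard: publishes the resource for the duration of a launch and
// restores the previous value afterwards, so nested launches behave.
class ScopedResource {
public:
  explicit ScopedResource(const Resource& resource)
      : m_previous(currentResource()) {
    currentResource() = &resource;
  }
  ~ScopedResource() { currentResource() = m_previous; }

  ScopedResource(const ScopedResource&) = delete;
  ScopedResource& operator=(const ScopedResource&) = delete;

private:
  const Resource* m_previous;
};

// A plugin hook with an unchanged signature reads the global state.
// Returns the stream id, or -1 if no resource is currently published.
inline int pluginPreLaunch() {
  const Resource* resource = currentResource();
  return resource ? resource->stream_id : -1;
}
```

This keeps the plugin signatures unchanged, at the cost of hidden state that every launch path has to remember to set.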

Additional context

Umpire is working on camp resource aware allocators (llnl/Umpire#901), which CHAI will also be using.

Also, note that even if only one stream is being used in an application, this new approach will be more efficient than synchronizing across the whole device.
