This PR seems to cause:

> CUDA RUNTIME API error: DeviceSetLimit failed with error cudaErrorInvalidValue.

(tested on H100 device)

_Originally posted by @hfp in https://github.com/cp2k/dbcsr/issues/767#issuecomment-2034752764_