BUILD: Add configurable CUDA device debug info flag #800
Signed-off-by: Michal Shalev <[email protected]>
👋 Hi michal-shalev! Thank you for contributing to ai-dynamo/nixl. Your PR reviewers will review your contribution then trigger the CI to test your changes. 🚀
if not get_option('enable_cuda_debug')
    add_project_arguments('-dopt=on', language: 'cuda')
endif
Suggested change:
- if not get_option('enable_cuda_debug')
-     add_project_arguments('-dopt=on', language: 'cuda')
- endif
+ if get_option('enable_cuda_debug')
+     add_project_arguments('-G', language: 'cuda')
+ endif
When -G is not specified, -dopt=on is implicit.
I checked here: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/ (section 4.2.3.3).
I also compiled with and without add_project_arguments('-dopt=on', language: 'cuda') to verify.
But as @rakhmets mentioned, the nvcc documentation says:
4.2.3.12. --dopt kind (-dopt)
Enable device code optimization. When specified along with -G, enables limited debug information generation for optimized device code (currently, only line number information). When -G is not specified, -dopt=on is implicit.
So, should we just add -G for debug mode and not add it for release?
I think the issue is that -G is added by meson. So, in a debug build it will be -G. But I think the purpose of the PR is:
- debug build: -G -dopt=on
- debug + cuda debug: -G
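The two cases listed above could be expressed in Meson roughly as follows. This is a minimal sketch, not the PR's actual code: it assumes the boolean option 'enable_cuda_debug' shown elsewhere in this PR, and it assumes (as the PR description states) that Meson itself adds -G whenever the built-in 'debug' option is true.

```meson
# Sketch only. 'enable_cuda_debug' is the option proposed in this PR;
# 'debug' is Meson's built-in boolean option.
if get_option('debug') and not get_option('enable_cuda_debug')
    # debug build: Meson passes -G, and -dopt=on keeps device code optimized
    add_project_arguments('-dopt=on', language: 'cuda')
endif
# debug + cuda debug: nothing is added here, so only Meson's -G applies
```

The 'debug' guard is optional, since the nvcc documentation quoted in this thread says -dopt=on is already implicit when -G is absent.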
Why do we need a separate debug option for CUDA? I'd just make sure we have -G for debug and do not have it for release.
debug is the default build type, and that is not documented anywhere. Out of the box, users can get very bad performance because of this -G flag.
option('cudapath_inc', type: 'string', value: '', description: 'Include path for CUDA')
option('cudapath_lib', type: 'string', value: '', description: 'Library path for CUDA')
option('cudapath_stub', type: 'string', value: '', description: 'Extra Stub path for CUDA')
option('enable_cuda_debug', type: 'boolean', value: false, description: 'Enable CUDA debug mode (disables -dopt=on optimization)')
Could we tie this to the buildtype (debug/release) instead of adding a new option?
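One possible shape for tying the flag to the build type, using only Meson's built-in 'debug' and 'optimization' options, is sketched below. It is a hypothetical alternative, not something the PR implements, and it again assumes Meson adds -G whenever 'debug' is true.

```meson
# Hypothetical sketch: no project-specific option is introduced.
# buildtype=debug          -> debug=true,  optimization='0'
# buildtype=debugoptimized -> debug=true,  optimization='2'
# buildtype=release        -> debug=false, optimization='3'
if get_option('debug') and get_option('optimization') != '0'
    # Debug info requested together with optimization:
    # keep device code optimized despite Meson's -G
    add_project_arguments('-dopt=on', language: 'cuda')
endif
```

With this variant a plain debug build keeps unoptimized device code, so it would not by itself address the concern that debug is the default build type.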
What?
Add a configurable CUDA device debug info flag with the cuda_enable_debug meson option.
Why?
Meson automatically adds the -G flag for CUDA debug builds, which significantly degrades performance. The -G flag generates device debug info and turns off optimizations unless -dopt=on is specified.
Reference: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/ 4.2.3.3
How?