michal-shalev
Contributor

What?

Add a configurable CUDA device debug info flag, exposed as the enable_cuda_debug Meson option.

Why?

Meson automatically adds the -G flag for CUDA debug builds, which significantly degrades performance. The -G flag generates device debug info and turns off optimizations unless -dopt=on is specified.
Reference: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/ (section 4.2.3.3)

How?

  • Add an enable_cuda_debug boolean option (default: false, for performance)
  • When the option is disabled, add -dopt=on so that device code stays optimized even though Meson still adds -G
  • When the option is enabled, omit -dopt=on so that Meson's -G produces full device debug info for debugging/profiling
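
A minimal sketch of how the two pieces described above might fit together, assuming the option is spelled enable_cuda_debug as in the diff under review and that the conditional runs before any CUDA target is declared. The comments restate the behaviour described in this PR, not verified Meson internals.

```meson
# meson_options.txt
# Off by default so that an out-of-the-box build keeps device optimizations.
option('enable_cuda_debug', type: 'boolean', value: false,
       description: 'Enable CUDA debug mode (disables -dopt=on optimization)')

# meson.build
# Per this PR, Meson adds -G to CUDA compiles in debug builds.
# Passing -dopt=on alongside -G keeps device code optimized (line info only).
if not get_option('enable_cuda_debug')
    add_project_arguments('-dopt=on', language: 'cuda')
endif
```

With a build directory named build (a placeholder), full device debug info could then be requested with: meson setup build -Denable_cuda_debug=true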

👋 Hi michal-shalev! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

Signed-off-by: Michal Shalev <[email protected]>
Comment on lines +104 to +106
if not get_option('enable_cuda_debug')
    add_project_arguments('-dopt=on', language: 'cuda')
endif
Contributor

Suggested change
- if not get_option('enable_cuda_debug')
-     add_project_arguments('-dopt=on', language: 'cuda')
- endif
+ if get_option('enable_cuda_debug')
+     add_project_arguments('-G', language: 'cuda')
+ endif

When -G is not specified, -dopt=on is implicit.

Contributor Author

I checked here:
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/ (section 4.2.3.3)
I also compiled with and without add_project_arguments('-dopt=on', language: 'cuda') to verify.

Contributor

But as @rakhmets mentioned:

4.2.3.12. --dopt kind (-dopt)
Enable device code optimization. When specified along with -G, enables limited debug information generation for optimized device code (currently, only line number information). When -G is not specified, -dopt=on is implicit.

So, should we just add -G for debug mode and not add it for release?

Contributor

I think the issue is that -G is added by Meson, so a debug build always gets -G. But I think the purpose of the PR is:

  • debug build: -G -dopt=on
  • debug + cuda debug: -G
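
The two configurations listed above can be made visible at configure time. A small sketch, assuming the enable_cuda_debug option from this PR and assuming Meson adds -G only when buildtype is debug, as described in this thread. The message text is illustrative, not output from any existing build.

```meson
# Report the device-flag combination expected for the current configuration.
# Assumption: Meson itself contributes -G for buildtype=debug (per this thread).
if get_option('buildtype') == 'debug'
    if get_option('enable_cuda_debug')
        message('CUDA device flags: -G (full device debug info, optimizations off)')
    else
        message('CUDA device flags: -G -dopt=on (optimized, line info only)')
    endif
else
    message('CUDA device flags: no -G expected from Meson for this build type')
endif
```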

Contributor

Why do we need a separate debug option for CUDA? I'd just make sure we have -G for debug and don't have it for release.

Contributor Author

debug is the default build type, and that is not documented anywhere. Out of the box, users can get very bad performance because of this -G flag.

rakhmets previously approved these changes Sep 17, 2025
yosefe previously approved these changes Sep 17, 2025
option('cudapath_inc', type: 'string', value: '', description: 'Include path for CUDA')
option('cudapath_lib', type: 'string', value: '', description: 'Library path for CUDA')
option('cudapath_stub', type: 'string', value: '', description: 'Extra Stub path for CUDA')
option('enable_cuda_debug', type: 'boolean', value: false, description: 'Enable CUDA debug mode (disables -dopt=on optimization)')
Contributor

Could we tie this to the buildtype (debug/release) instead of adding a new option?
