BUILD: Add configurable CUDA device debug info flag #800

michal-shalev · 2025-09-16T20:28:27Z

What?

Add a configurable CUDA device debug info flag through the enable_cuda_debug meson option.

Why?

Meson automatically adds the -G flag for CUDA debug builds, which significantly degrades performance. The -G flag generates device debug info and turns off optimizations unless -dopt=on is specified.
Reference: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/ (section 4.2.3.3)

How?

Add an enable_cuda_debug boolean option (default: false, for performance).
When the option is disabled, add -dopt=on so that device code stays optimized even though meson still passes -G in debug builds.
When the option is enabled, leave meson's -G flag as-is for device debugging/profiling.

Signed-off-by: Michal Shalev <[email protected]>

github-actions · 2025-09-16T20:28:35Z

👋 Hi michal-shalev! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

rakhmets · 2025-09-17T09:32:07Z

meson.build

+    if not get_option('enable_cuda_debug')
+        add_project_arguments('-dopt=on', language: 'cuda')
+    endif


Suggested change

-    if not get_option('enable_cuda_debug')
-        add_project_arguments('-dopt=on', language: 'cuda')
-    endif
+    if get_option('enable_cuda_debug')
+        add_project_arguments('-G', language: 'cuda')
+    endif

When -G is not specified, -dopt=on is implicit.

michal-shalev · 2025-09-17

I checked the nvcc documentation, section 4.2.3.3:
https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/
I also compiled with and without add_project_arguments('-dopt=on', language: 'cuda') to verify.

brminich · 2025-09-17

But as @rakhmets mentioned:

4.2.3.12. --dopt kind (-dopt): Enable device code optimization. When specified along with -G, enables limited debug information generation for optimized device code (currently, only line number information). When -G is not specified, -dopt=on is implicit.

So should we just add -G for debug builds and not add it for release builds?

rakhmets · 2025-09-17

I think the issue is that -G is added by meson, so a debug build always gets -G. But I think the purpose of the PR is:

debug build: -G -dopt=on
debug + cuda debug: -G

brminich · 2025-09-17

Why do we need a separate debug option for CUDA? I'd just make sure we have -G for debug builds and don't have it for release builds.

michal-shalev · 2025-09-17

Debug is the default build type, and that isn't documented anywhere. Out of the box, users can get very bad performance because of this -G flag.
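
The combinations discussed in this thread can be made visible at configure time. The following is a minimal sketch, not the PR's actual code. It assumes the option is named enable_cuda_debug as in the diff above, and that meson itself adds -G when the build type is debug, as the PR description states. The variable names meson_adds_G, device_debug and expected_flags are hypothetical.

```meson
# Sketch only: report which device-debug flags the CUDA compiler is expected to see.
# Assumption (from the PR description): meson adds -G itself for buildtype 'debug'.
meson_adds_G = get_option('buildtype') == 'debug'   # hypothetical helper variable
device_debug = get_option('enable_cuda_debug')      # option from meson_options.txt

expected_flags = []
if meson_adds_G
    expected_flags += ['-G']
endif
if not device_debug
    # Same condition the PR uses before calling add_project_arguments()
    expected_flags += ['-dopt=on']
endif

# Printed at the end of `meson setup`
summary({
    'buildtype': get_option('buildtype'),
    'enable_cuda_debug': device_debug,
    'expected device flags': ' '.join(expected_flags),
}, section: 'CUDA')
```

Under these assumptions, a default debug build would report "-G -dopt=on", and a debug build configured with -Denable_cuda_debug=true would report "-G", matching the two cases listed above.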

ovidiusm · 2025-09-18T09:22:02Z

meson_options.txt

 option('cudapath_inc', type: 'string', value: '', description: 'Include path for CUDA')
 option('cudapath_lib', type: 'string', value: '', description: 'Library path for CUDA')
 option('cudapath_stub', type: 'string', value: '', description: 'Extra Stub path for CUDA')
+option('enable_cuda_debug', type: 'boolean', value: false, description: 'Enable CUDA debug mode (disables -dopt=on optimization)')


Could we tie this to the buildtype (debug/release) instead of adding a new option?
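
One possible reading of this suggestion, sketched under assumptions, is to drop the new option and let the build type alone decide. The check on get_option('buildtype') below is illustrative and is not something proposed verbatim in this thread.

```meson
# Sketch only: no dedicated option; the build type alone decides.
# Assumption: only the plain 'debug' build type should keep unoptimized device code.
if get_option('buildtype') != 'debug'
    # Harmless when -G is absent, since nvcc documents -dopt=on as implicit in that case.
    add_project_arguments('-dopt=on', language: 'cuda')
endif
```

With this variant, a default (debug) build still compiles device code with -G and without -dopt=on, which is the out-of-the-box performance concern raised earlier in the thread.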

BUILD: Add configurable CUDA device debug info flag

c73f54c

Signed-off-by: Michal Shalev <[email protected]>

michal-shalev self-assigned this Sep 16, 2025

michal-shalev requested a review from a team as a code owner September 16, 2025 20:28

pull-request-size bot added the size/XS label Sep 16, 2025

github-actions bot added the external-contribution label Sep 16, 2025

michal-shalev requested review from brminich, yosefe and rakhmets September 16, 2025 20:28

yosefe reviewed Sep 17, 2025

michal-shalev added 2 commits September 17, 2025 11:27

PR fixes + CI fix

4ad257d

Signed-off-by: Michal Shalev <[email protected]>

Merge branch 'main' into configurable-cuda-debug-flag

6647af5

michal-shalev requested a review from yosefe September 17, 2025 08:30

CI fix

09e3082

Signed-off-by: Michal Shalev <[email protected]>

rakhmets reviewed Sep 17, 2025

rakhmets previously approved these changes Sep 17, 2025

yosefe previously approved these changes Sep 17, 2025

CI fix - change ubuntu-latest to ubuntu-24.04

c4ab7c6

Signed-off-by: Michal Shalev <[email protected]>

michal-shalev dismissed stale reviews from yosefe and rakhmets via c4ab7c6 September 17, 2025 10:59

michal-shalev requested review from a team as code owners September 17, 2025 10:59

ovidiusm reviewed Sep 18, 2025
