@cr7258 cr7258 commented Sep 18, 2025

Overview:

By default, namespaceRestriction is set to true. As per the NVIDIA Dynamo production deployment guide, the Dynamo Operator will only reconcile DynamoGraphDeployment resources in the namespace where it is running (in this case, the dynamo-kubernetes namespace).

However, if a user then deploys a DynamoGraphDeployment using the provided examples (e.g., agg.yaml), which is deployed in the default namespace, the Dynamo Operator will not reconcile this DynamoGraphDeployment. Consequently, the Dynamo Operator does not create any vLLM instances, and no relevant logs are emitted. This behavior can be very confusing for new users.

To ensure new users can smoothly experience Dynamo, this PR sets the default value of namespaceRestriction to false, allowing users to successfully deploy DynamoGraphDeployment by following the existing installation guide and example.
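For operators who prefer the previous scoped behavior once this default changes, a values override along these lines should restore it. This is a minimal sketch: the key path follows the `dynamo-operator.namespaceRestriction` block touched by this PR, while the file name `values-override.yaml` is hypothetical.

```yaml
# values-override.yaml (hypothetical file name, not part of this PR)
dynamo-operator:
  namespaceRestriction:
    # true: reconcile DynamoGraphDeployment resources in a single namespace only
    # false: reconcile them in every namespace (the new default proposed here)
    enabled: true
    # Leave empty to use the namespace the operator is installed in
    targetNamespace:
```

Such a file would be passed to `helm install` or `helm upgrade` with `-f`, so the choice stays explicit regardless of the chart default.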

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • Chores
    • Updated platform deployment configuration to align with current operational standards.
    • No changes to user-facing features or behavior.
    • No impact on existing settings or workflows for end-users.
    • Internal operational adjustment to improve manageability and consistency across environments.


copy-pr-bot bot commented Sep 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


👋 Hi cr7258! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added chore external-contribution Pull request is from an external contributor labels Sep 18, 2025
@cr7258 cr7258 force-pushed the namespaceRestriction branch from fe7001a to 5bad4c7 Compare September 18, 2025 02:10

coderabbitai bot commented Sep 18, 2025

Walkthrough

A single Helm values change updates deploy/cloud/helm/platform/values.yaml to set dynamo-operator.namespaceRestriction.enabled from true to false.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Helm values update: deploy/cloud/helm/platform/values.yaml | Set dynamo-operator.namespaceRestriction.enabled to false (previously true). No other public settings changed. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

I nudge a toggle, soft as clover’s sway,
A gate unlatches—namespaces give way.
In charts I hop, with values neat and terse,
One flag flipped true-to-false across the verse.
Thump! Configuration set—now onward, deploy day. 🐇✨

Pre-merge checks

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The Overview is detailed and explains the rationale and impact of the change, but the Details and Where should the reviewer start? sections remain template placeholders with no concrete information, and the Related Issues entry contains a placeholder ("#xxx"), so the PR description does not fully follow the repository template. | Please populate the Details section with the specific changes (for example, deploy/cloud/helm/platform/values.yaml: set dynamo-operator.namespaceRestriction.enabled from true to false), add a "Where should the reviewer start?" section listing files/lines to inspect, and update Related Issues with the actual issue number or remove the placeholder; also include any test/verification steps and notes about backward-compatibility or user impact. After these additions the description will conform to the required template. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title clearly and concisely summarizes the primary change: switching the default of namespaceRestriction to false; it follows conventional commit style ("chore:") and is specific enough for a teammate scanning PR history to understand the main intent. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |


   namespaceRestriction:
     # -- Whether to restrict operator to specific namespaces
-    enabled: true
+    enabled: false

Yes, this seems to be a problem for unfamiliar users. Especially since we did not set a specific namespace in the example. https://github.com/ai-dynamo/dynamo/tree/main/components/backends/vllm/deploy


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
deploy/cloud/helm/platform/values.yaml (2)

33-37: Call out the default change in chart release notes and bump chart version.

This is a security‑relevant default flip; document prominently in CHANGELOG/NOTES.txt and bump the chart minor version.


33-37: Optional: safer “open by default” via opt‑in labels.

Future enhancement: even with restriction disabled, require namespaces to carry a label (e.g., dynamo.nvidia.com/enabled=true) to be reconciled. This avoids accidental cluster‑wide adoption.
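As an illustration of the opt-in idea above, a namespace carrying the suggested label might look like the following. This is only a sketch of a proposed future enhancement: nothing in this PR indicates that the operator reads this label today, and the namespace name `team-a` is hypothetical.

```yaml
# Hypothetical opt-in namespace; the operator does not act on this label in this PR
apiVersion: v1
kind: Namespace
metadata:
  name: team-a  # hypothetical namespace name
  labels:
    # Label key taken from the suggestion above; label values must be strings
    dynamo.nvidia.com/enabled: "true"
```

Under that design, namespaces without the label would be ignored even when namespaceRestriction is disabled, which limits accidental cluster-wide adoption.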

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6dd3326 and 5bad4c7.

📒 Files selected for processing (1)
  • deploy/cloud/helm/platform/values.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: julienmancuso
PR: ai-dynamo/dynamo#3100
File: deploy/cloud/operator/cmd/main.go:186-190
Timestamp: 2025-09-17T22:35:40.649Z
Learning: The mpiRunSecretName validation in deploy/cloud/operator/cmd/main.go is safe for upgrades because the Helm chart automatically populates dynamo-operator.dynamo.mpiRun.secretName with a default value of "mpi-run-ssh-secret" and includes SSH key generation functionality via sshKeygen.enabled: true.

Applied to files:

  • deploy/cloud/helm/platform/values.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
deploy/cloud/helm/platform/values.yaml (1)

33-37: Verify RBAC and watch-scope behavior with namespaceRestriction default (enabled: false)

The repository search produced no matches, so this could not be verified. From the repo root, run the following and paste the outputs:

# find where the flag is referenced
rg -n --hidden -S '\.Values\.namespaceRestriction|namespaceRestriction' deploy/cloud/helm -g '!**/values.yaml'

# find WATCH_NAMESPACE / namespace args/env in templates
rg -n --hidden -S 'WATCH_NAMESPACE|--namespace|watchNamespace|watch.*namespace' deploy/cloud/helm -g '!**/values.yaml'

# find RBAC resources in templates
rg -n --hidden -S 'ClusterRole|ClusterRoleBinding|RoleBinding|Role\b' deploy/cloud/helm -g '!**/values.yaml'

# render templates for both cases (requires helm)
# the chart lives in deploy/cloud/helm/platform and the flag is nested under dynamo-operator
helm template test-release deploy/cloud/helm/platform -f deploy/cloud/helm/platform/values.yaml --set dynamo-operator.namespaceRestriction.enabled=false | rg -n 'ClusterRole|ClusterRoleBinding|RoleBinding|Role\b|WATCH_NAMESPACE|env:|--namespace'
helm template test-release deploy/cloud/helm/platform -f deploy/cloud/helm/platform/values.yaml --set dynamo-operator.namespaceRestriction.enabled=true --set dynamo-operator.namespaceRestriction.targetNamespace=foo | rg -n 'ClusterRole|ClusterRoleBinding|RoleBinding|Role\b|WATCH_NAMESPACE|env:|--namespace'

Expected outcomes:

  • enabled=false → ClusterRole/ClusterRoleBinding present; no WATCH_NAMESPACE env/arg (manager watches cluster‑wide).
  • enabled=true → namespaced Role/RoleBinding rendered; WATCH_NAMESPACE (or equivalent) set to targetNamespace.
  • Confirm no duplicate operator instances watch the same namespace in multi‑ns installs.

Comment on lines 33 to 37
   namespaceRestriction:
     # -- Whether to restrict operator to specific namespaces
-    enabled: true
+    enabled: false
     # -- Target namespace for operator deployment (leave empty for current namespace)
     targetNamespace:
⚠️ Potential issue

Changing default to cluster-wide reconciliation broadens scope; add an explicit warning (and consider alternative rollout).

This default can surprise multi‑tenant/prod users by reconciling CRs in all namespaces. At minimum, add a prominent warning next to the value. Prefer shipping a quickstart values file for examples instead of changing the global default.

Apply this inline doc tweak:

   namespaceRestriction:
-    # -- Whether to restrict operator to specific namespaces
-    enabled: false
+    # -- Whether to restrict operator to specific namespaces
+    # WARNING: Disabling namespace restriction makes the operator reconcile across ALL namespaces.
+    # Not recommended for multi-tenant or production clusters. Set to true for production.
+    enabled: false
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
   namespaceRestriction:
     # -- Whether to restrict operator to specific namespaces
+    # WARNING: Disabling namespace restriction makes the operator reconcile across ALL namespaces.
+    # Not recommended for multi-tenant or production clusters. Set to true for production.
     enabled: false
     # -- Target namespace for operator deployment (leave empty for current namespace)
     targetNamespace:
🤖 Prompt for AI Agents
In deploy/cloud/helm/platform/values.yaml around lines 33 to 37, the inline docs
do not warn users that leaving namespaceRestriction disabled enables
cluster-wide reconciliation; add a prominent warning comment immediately above
or on the enabled field calling out that false = cluster-wide reconciliation and
may be unsafe in multi-tenant/prod environments, and suggest setting enabled:
true and targetNamespace for scoped installs; additionally add a note
referencing a new quickstart-values.yaml (in the same directory) that contains
example scoped and cluster-wide configurations for safe rollout.
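The quickstart values file mentioned in this comment does not exist in the PR. One possible shape for it, assuming the chart default stays restricted and the examples opt out explicitly, is sketched below; the file name comes from the comment, and the contents are an assumption.

```yaml
# quickstart-values.yaml (hypothetical; suggested in review, not part of this PR)
dynamo-operator:
  namespaceRestriction:
    # Cluster-wide reconciliation so the examples work from any namespace,
    # including default. Not recommended for multi-tenant or production clusters.
    enabled: false

    # Scoped alternative for production installs (uncomment and replace the line above):
    # enabled: true
    # targetNamespace: dynamo-kubernetes
```

Keeping the permissive setting in a separate file would let new users follow the examples unchanged while leaving the chart default conservative.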

Labels
chore external-contribution Pull request is from an external contributor size/XS

2 participants