@tmonty12 (Contributor) commented Sep 17, 2025

Overview:

The Dynamo operator automatically detects Docker config secrets in the namespace where a DGD (DynamoGraphDeployment) is deployed, matches them against the container images in use, and injects the matching secrets into the pod spec's imagePullSecrets.

There are scenarios where this mechanism should be disabled.

For example, when a publicly available image is used and an invalid secret is present in the namespace.

Details:

This PR adds an annotation nvidia.com/disable-image-pull-secret-discovery for individual DGD components to disable the injection of the imagePullSecrets.

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Added opt-out control for automatic image pull secret discovery via annotation: nvidia.com/disable-image-pull-secret-discovery: "true".
    • By default, the operator auto-discovers and injects imagePullSecrets matching the container registry.
  • Documentation

    • New “Image Pull Secret Configuration” section with guidance, YAML examples for manual configuration, and details on using the opt-out annotation.
  • Tests

    • Added tests covering the discovery toggle, default behavior, invalid values, and absence of a secrets retriever.

coderabbitai bot (Contributor) commented Sep 17, 2025

Walkthrough

Adds a new opt-out annotation constant and gates image pull secret discovery in GenerateBasePodSpec based on that annotation. Introduces tests (including a duplicate test function) for the gating behavior and updates documentation to describe automatic discovery and the opt-out, with examples for manual configuration.

Changes

Cohort | File(s) | Summary
Constants: annotation key | deploy/cloud/operator/internal/consts/consts.go | Adds exported constant KubeAnnotationDisableImagePullSecretDiscovery = "nvidia.com/disable-image-pull-secret-discovery".
Operator logic: image pull secret discovery gating | deploy/cloud/operator/internal/dynamo/graph.go | Reads the new annotation from component metadata; if set to "true", skips image pull secret discovery. Otherwise, preserves existing discovery behavior. No public API changes.
Tests for discovery gating | deploy/cloud/operator/internal/dynamo/graph_test.go | Adds mock secrets retriever and tests covering true/false/absent/invalid annotation values and nil retriever. Note: duplicate test function TestGenerateBasePodSpec_DisableImagePullSecretDiscovery added.
Docs: deployment guide | docs/guides/dynamo_deploy/create_deployment.md | Adds “Image Pull Secret Configuration” section documenting automatic discovery, the disabling annotation, and YAML examples for manual configuration via extraPodSpec.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User as User
  participant Dyn as Dynamo Operator
  participant Gen as GenerateBasePodSpec
  participant SR as SecretsRetriever
  participant K8s as Kubernetes API

  User->>Dyn: Create/Update Component
  Dyn->>Gen: Build base PodSpec
  Gen->>Gen: Read annotation nvidia.com/disable-image-pull-secret-discovery

  alt Annotation == "true" (opt-out)
    note right of Gen: Discovery disabled
    Gen-->>Dyn: PodSpec (no imagePullSecrets from discovery)
  else Not "true" or absent
    Gen->>SR: RetrieveImagePullSecrets(component)
    SR->>K8s: List/Get Docker config secrets
    K8s-->>SR: Matching secrets
    SR-->>Gen: Secret refs
    Gen-->>Dyn: PodSpec (imagePullSecrets set)
  end
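
The gating shown in the diagram can be sketched in Go roughly as follows. This is a minimal illustration, not the operator's actual code: `secretRef` is a stand-in for `corev1.LocalObjectReference`, and `discoverPullSecrets` and its `retrieve` callback are hypothetical names with an assumed signature. Only the annotation key and the literal `"true"` value come from this PR. The nil-retriever and empty-image checks reflect behavior mentioned in the PR summary and the review notes.

```go
package main

// Annotation key and opt-out value as described in this PR.
const (
	annotationDisableDiscovery = "nvidia.com/disable-image-pull-secret-discovery"
	annotationValueTrue        = "true"
)

// secretRef is a hypothetical stand-in for corev1.LocalObjectReference.
type secretRef struct{ Name string }

// discoverPullSecrets returns the secrets to inject for one component.
// retrieve is a hypothetical callback standing in for the operator's
// secrets retriever; its real signature may differ.
func discoverPullSecrets(
	annotations map[string]string,
	image string,
	retrieve func(image string) ([]string, error),
) []secretRef {
	// Opt-out: only the exact string "true" disables discovery.
	// Reading from a nil map is safe in Go and yields "".
	if annotations[annotationDisableDiscovery] == annotationValueTrue {
		return nil
	}
	// Nothing to discover without a retriever or an image.
	if retrieve == nil || image == "" {
		return nil
	}
	names, err := retrieve(image)
	if err != nil {
		// Sketch only: the error is dropped here; a real implementation
		// would likely log it.
		return nil
	}
	refs := make([]secretRef, 0, len(names))
	for _, n := range names {
		refs = append(refs, secretRef{Name: n})
	}
	return refs
}
```

Under this sketch, any annotation value other than the exact string `"true"` (including an absent annotation) leaves discovery enabled.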

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I nudge the pods with whiskered care,
A toggle whisper: “skip secrets” in the air.
If true, I hop past docks unseen—
If false, I fetch them, crisp and clean.
Docs in paw, I sign the spec—
Carrot commit, no rabbit wreck. 🥕🐇

Pre-merge checks

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Title Check | ✅ Passed | The title succinctly and accurately describes the primary change: adding a mechanism in the operator to disable automatic imagePullSecrets discovery and uses a conventional-commit prefix that fits repo style. It directly reflects the code and documentation changes in this PR and is clear for teammates scanning history.
Description Check | ✅ Passed | The description provides a clear Overview and Details explaining the change and rationale, but the "Where should the reviewer start?" section is empty and the Related Issues field contains a placeholder "#xxx" instead of a real issue reference; the description also does not call out the added tests and docs. These omissions reduce reviewer efficiency and may hide test issues (graph_test.go appears to include a duplicated test function).

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
deploy/cloud/operator/internal/consts/consts.go (1)

33-34: LGTM: new opt-out annotation constant

Name, scope, and placement look consistent with existing Kube* constants. Consider adding a short GoDoc explaining default behavior (discovery on; set "true" to disable) for searchability.
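
A GoDoc along the lines suggested here might read roughly as follows. The constant name and value are taken from this PR; the comment wording is an assumption, and `package main` is used only so the snippet compiles on its own (the real constant lives in the operator's consts package).

```go
package main

// KubeAnnotationDisableImagePullSecretDiscovery is the component annotation
// that opts out of automatic image pull secret discovery. Discovery is on by
// default; set the annotation to "true" to disable it.
// (Illustrative wording, not the merged GoDoc.)
const KubeAnnotationDisableImagePullSecretDiscovery = "nvidia.com/disable-image-pull-secret-discovery"
```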

docs/guides/dynamo_deploy/create_deployment.md (1)

135-166: Clarify behavior and edge cases for imagePullSecrets discovery

  • Note that discovery only runs when extraPodSpec.mainContainer.image is set.
  • State that manual extraPodSpec.podSpec.imagePullSecrets are honored; if discovery is not disabled, both sets are merged.
  • Specify the expected literal value "true" for the annotation (or make the code accept it case-insensitively; see separate comment).
  • Optionally note that only dockerconfigjson-type secrets are considered.

Suggested doc tweak:

@@
-**Disabling Automatic Discovery:**
+**Disabling Automatic Discovery:**
+Notes:
+- Discovery runs only if `extraPodSpec.mainContainer.image` is set.
+- Manually specified `extraPodSpec.podSpec.imagePullSecrets` are always honored. If you do not disable discovery, discovered secrets will be merged in addition to your manual list.
+- Set the annotation value to the lowercase string `"true"`.
@@
   annotations:
     nvidia.com/disable-image-pull-secret-discovery: "true"
deploy/cloud/operator/internal/dynamo/graph.go (1)

768-779: Make annotation parsing tolerant + keep behavior explicit

  • Use case-insensitive parsing for the boolean-like annotation to reduce footguns.
  • Current merge appends discovered secrets after any user-provided ones; add de-duplication to avoid duplicate entries when users also list the same secret.

Proposed changes:

@@
-	disableImagePullSecretAnnotationValue := component.Annotations[commonconsts.KubeAnnotationDisableImagePullSecretDiscovery]
-	shouldDisableImagePullSecret := disableImagePullSecretAnnotationValue == commonconsts.KubeLabelValueTrue
+	disableImagePullSecretAnnotationValue := component.Annotations[commonconsts.KubeAnnotationDisableImagePullSecretDiscovery]
+	shouldDisableImagePullSecret := strings.EqualFold(disableImagePullSecretAnnotationValue, commonconsts.KubeLabelValueTrue)
@@
-	podSpec.ImagePullSecrets = append(podSpec.ImagePullSecrets, imagePullSecrets...)
+	// Merge discovered imagePullSecrets without duplicates (favor user-provided)
+	if len(imagePullSecrets) > 0 {
+		seen := make(map[string]struct{}, len(podSpec.ImagePullSecrets))
+		for _, s := range podSpec.ImagePullSecrets {
+			seen[s.Name] = struct{}{}
+		}
+		for _, s := range imagePullSecrets {
+			if _, ok := seen[s.Name]; !ok {
+				podSpec.ImagePullSecrets = append(podSpec.ImagePullSecrets, s)
+			}
+		}
+	}

Optional: log at debug when GetSecrets errors are swallowed to aid troubleshooting.
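
The two suggestions above can be tried in isolation with a self-contained sketch like the one below. `pullSecretRef` is a stand-in for `corev1.LocalObjectReference`, and `isDiscoveryDisabled` and `mergePullSecrets` are hypothetical helper names, not functions in the operator. Unlike the proposed diff, this version also skips repeated names within the discovered list itself.

```go
package main

import "strings"

// pullSecretRef is a hypothetical stand-in for corev1.LocalObjectReference.
type pullSecretRef struct{ Name string }

// isDiscoveryDisabled reports whether the annotation value means "true",
// ignoring case ("true", "True", "TRUE" all match).
func isDiscoveryDisabled(annotationValue string) bool {
	return strings.EqualFold(annotationValue, "true")
}

// mergePullSecrets appends discovered secrets after the user-provided ones,
// skipping any discovered name that is already present. User-provided
// entries are kept as-is and always come first.
func mergePullSecrets(manual, discovered []pullSecretRef) []pullSecretRef {
	merged := make([]pullSecretRef, 0, len(manual)+len(discovered))
	merged = append(merged, manual...)

	seen := make(map[string]struct{}, len(manual)+len(discovered))
	for _, s := range manual {
		seen[s.Name] = struct{}{}
	}
	for _, s := range discovered {
		if _, ok := seen[s.Name]; ok {
			continue // already listed by the user or discovered earlier
		}
		seen[s.Name] = struct{}{}
		merged = append(merged, s)
	}
	return merged
}
```

For example, merging a manual list containing `manual-secret` with a discovered list containing `test-docker-secret` keeps both, manual entry first; if the manual list already names `test-docker-secret`, the discovered copy is dropped.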

deploy/cloud/operator/internal/dynamo/graph_test.go (1)

4501-4631: Great coverage; add cases for manual + discovered secrets and de-dup

  • Add a case where ExtraPodSpec.PodSpec.ImagePullSecrets is preset and discovery is enabled to assert both are present.
  • Add a case where the same secret appears manually and via discovery to assert de-dup (if you adopt the merge change).

Minimal additions:

@@
 func TestGenerateBasePodSpec_DisableImagePullSecretDiscovery(t *testing.T) {
   tests := []struct {
@@
   }{
+    {
+      name: "manual imagePullSecrets retained alongside discovery",
+      component: &v1alpha1.DynamoComponentDeploymentOverridesSpec{
+        DynamoComponentDeploymentSharedSpec: v1alpha1.DynamoComponentDeploymentSharedSpec{
+          ComponentType: commonconsts.ComponentTypeFrontend,
+          ExtraPodSpec: &common.ExtraPodSpec{
+            PodSpec: &corev1.PodSpec{
+              ImagePullSecrets: []corev1.LocalObjectReference{{Name: "manual-secret"}},
+            },
+            MainContainer: &corev1.Container{Image: "test-registry/test-image:latest"},
+          },
+        },
+      },
+      secretsRetriever: &mockSecretsRetrieverWithSecrets{},
+      expectedImagePullSecrets: []corev1.LocalObjectReference{
+        {Name: "manual-secret"}, {Name: "test-docker-secret"},
+      },
+    },
+    {
+      name: "duplicate secret names are de-duplicated",
+      component: &v1alpha1.DynamoComponentDeploymentOverridesSpec{
+        DynamoComponentDeploymentSharedSpec: v1alpha1.DynamoComponentDeploymentSharedSpec{
+          ComponentType: commonconsts.ComponentTypeFrontend,
+          ExtraPodSpec: &common.ExtraPodSpec{
+            PodSpec: &corev1.PodSpec{
+              ImagePullSecrets: []corev1.LocalObjectReference{{Name: "test-docker-secret"}},
+            },
+            MainContainer: &corev1.Container{Image: "test-registry/test-image:latest"},
+          },
+        },
+      },
+      secretsRetriever:         &mockSecretsRetrieverWithSecrets{},
+      expectedImagePullSecrets: []corev1.LocalObjectReference{{Name: "test-docker-secret"}},
+    },
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26889b0 and 48e4e05.

📒 Files selected for processing (4)
  • deploy/cloud/operator/internal/consts/consts.go (1 hunks)
  • deploy/cloud/operator/internal/dynamo/graph.go (1 hunks)
  • deploy/cloud/operator/internal/dynamo/graph_test.go (2 hunks)
  • docs/guides/dynamo_deploy/create_deployment.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-03T15:26:55.732Z
Learnt from: julienmancuso
PR: ai-dynamo/dynamo#1337
File: deploy/cloud/helm/platform/components/operator/templates/image-builer-serviceaccount.yaml:0-0
Timestamp: 2025-06-03T15:26:55.732Z
Learning: The image-builder ServiceAccount in deploy/cloud/helm/platform/components/operator/templates/image-builer-serviceaccount.yaml does not need imagePullSecrets, unlike the component ServiceAccount.

Applied to files:

  • deploy/cloud/operator/internal/dynamo/graph.go
  • docs/guides/dynamo_deploy/create_deployment.md
🧬 Code graph analysis (2)
deploy/cloud/operator/internal/dynamo/graph.go (2)
deploy/cloud/operator/internal/consts/consts.go (2)
  • KubeAnnotationDisableImagePullSecretDiscovery (33-33)
  • KubeLabelValueTrue (42-42)
deploy/cloud/operator/api/dynamo/common/common.go (1)
  • ExtraPodSpec (58-61)
deploy/cloud/operator/internal/dynamo/graph_test.go (5)
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2)
  • DynamoComponentDeploymentOverridesSpec (56-58)
  • DynamoComponentDeploymentSharedSpec (60-115)
deploy/cloud/operator/internal/dynamo/graph.go (5)
  • SecretsRetriever (316-318)
  • Config (64-74)
  • GenerateBasePodSpec (699-839)
  • BackendFrameworkNoop (1067-1067)
  • RoleMain (572-572)
deploy/cloud/operator/internal/consts/consts.go (5)
  • ComponentTypeFrontend (51-51)
  • KubeAnnotationDisableImagePullSecretDiscovery (33-33)
  • KubeLabelValueTrue (42-42)
  • KubeLabelValueFalse (41-41)
  • MultinodeDeploymentTypeGrove (83-83)
deploy/cloud/operator/api/dynamo/common/common.go (1)
  • ExtraPodSpec (58-61)
deploy/cloud/operator/internal/controller_common/predicate.go (1)
  • Config (50-63)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and Test - vllm
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
deploy/cloud/operator/internal/dynamo/graph_test.go (1)

3144-3157: LGTM: focused mock for docker secret injection tests

The mock covers both interfaces and returns a stable secret, good for table tests.

@tmonty12 tmonty12 merged commit 70a8aa3 into main Sep 18, 2025
14 of 16 checks passed
@tmonty12 tmonty12 deleted the tmonty12/dep-383-mechanism-disable-image-secret-auto-injection branch September 18, 2025 23:58