-
Notifications
You must be signed in to change notification settings - Fork 111
🐛 Fix ManagedClusterAddons not removed when ClusterManagementAddon is deleted #1160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Fix ManagedClusterAddons not removed when ClusterManagementAddon is deleted #1160
Conversation
WalkthroughGates stopping of addon managers on presence of ManagedClusterAddOns using a name-based indexer; replaces label/list MCA discovery with index lookups, updates controller informer wiring and sync/run flows, adds unit tests for stopUnusedManagers, tweaks template-agent logging, and introduces duplicate e2e tests for cascade deletion cleanup. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Pre-merge checks (2 passed, 3 warnings)❌ Failed checks (3 warnings)
✅ Passed checks (2 passed)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1160 +/- ##
==========================================
- Coverage 58.17% 58.17% -0.01%
==========================================
Files 211 211
Lines 20797 20819 +22
==========================================
+ Hits 12099 12111 +12
- Misses 7628 7637 +9
- Partials 1070 1071 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
a6896e5
to
5bdd118
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (6)
test/e2e/addonmanagement_test.go (3)
852-862
: Nit: fix “copyiedConfig” typo.Rename for readability.
- copyiedConfig, err := spoke.KubeClient.CoreV1().ConfigMaps(addonInstallNamespace).Get( + copiedConfig, err := spoke.KubeClient.CoreV1().ConfigMaps(addonInstallNamespace).Get( context.Background(), configmap.Name, metav1.GetOptions{}) @@ - if !equality.Semantic.DeepEqual(copyiedConfig.Data, configmap.Data) { - return fmt.Errorf("expected configmap is not correct, %v", copyiedConfig.Data) + if !equality.Semantic.DeepEqual(copiedConfig.Data, configmap.Data) { + return fmt.Errorf("expected configmap is not correct, %v", copiedConfig.Data)
869-907
: Reduce duplication via small helpers to improve test maintainability.Create tiny helpers like waitForConfigMapCopy(ns, name), waitForConfigMapDeletion(ns, name), waitForJobDeletion(ns, name) and reuse them here and in earlier tests that perform the same checks.
883-919
: Optional: add explicit Eventually timeouts to reduce flakes.Consider explicit timeouts/polling intervals (e.g., 2m, 5s) to avoid environment-dependent defaults.
pkg/addon/controllers/addontemplate/controller_test.go (1)
273-391
: Solid unit coverage for stopUnusedManagers; consider two small additions.
- Add a case to simulate list error (fake client reactor) and assert requeue-on-error (after you wire it in the controller, see separate comment).
- Clarify behavior when no manager exists but MCAs do: do we want to requeue in that case? If not, add a case and assertion.
pkg/addon/controllers/addontemplate/controller.go (2)
109-113
: Prefer fields.OneTermEqualSelector for field selector construction.More robust than manual string concatenation.
- listOptions := metav1.ListOptions{ - FieldSelector: "metadata.name=" + addOnName, - } + listOptions := metav1.ListOptions{ + FieldSelector: fields.OneTermEqualSelector("metadata.name", addOnName).String(), + }Add import:
- k8s.io/apimachinery/pkg/fields
119-124
: Optional: use rate-limited requeue and a named constant.
- Rate-limited requeue reduces fixed polling.
- Extract 10s to a const for easier tuning and tests.
Example:
const waitForMCACleanup = 10 * time.Second // ... syncCtx.Queue().AddAfter(addOnName, waitForMCACleanup) // or: syncCtx.Queue().AddRateLimited(addOnName)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
pkg/addon/controllers/addontemplate/controller.go
(1 hunks)pkg/addon/controllers/addontemplate/controller_test.go
(2 hunks)test/e2e/addonmanagement_test.go
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: skeeey
PR: open-cluster-management-io/ocm#1053
File: pkg/server/services/addon/addon_test.go:208-225
Timestamp: 2025-06-27T09:46:24.579Z
Learning: In the Open Cluster Management addon service, deletion actions are not expected to occur. The OnDelete handler method is implemented as a no-op, and test coverage for deletion is not needed since this operation is not part of the service's expected functionality.
📚 Learning: 2025-06-27T09:46:24.579Z
Learnt from: skeeey
PR: open-cluster-management-io/ocm#1053
File: pkg/server/services/addon/addon_test.go:208-225
Timestamp: 2025-06-27T09:46:24.579Z
Learning: In the Open Cluster Management addon service, deletion actions are not expected to occur. The OnDelete handler method is implemented as a no-op, and test coverage for deletion is not needed since this operation is not part of the service's expected functionality.
Applied to files:
test/e2e/addonmanagement_test.go
🧬 Code graph analysis (1)
pkg/addon/controllers/addontemplate/controller_test.go (2)
pkg/registration/helpers/testing/testinghelpers.go (1)
NewManagedClusterAddons
(520-529)pkg/common/testing/fake_sync_context.go (1)
NewFakeSyncContext
(21-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: integration
- GitHub Check: e2e-singleton
- GitHub Check: unit
- GitHub Check: e2e
- GitHub Check: build
- GitHub Check: e2e-hosted
- GitHub Check: verify
🔇 Additional comments (3)
test/e2e/addonmanagement_test.go (1)
834-868
: Good end-to-end coverage for CMA-driven cascading deletion.This validates the pre-delete job path and MCA cleanup before asserting CMA deletion. Matches the PR objective well.
pkg/addon/controllers/addontemplate/controller_test.go (1)
28-29
: Import looks correct.Importing testinghelpers to construct MCAs aligns with the new tests’ needs.
pkg/addon/controllers/addontemplate/controller.go (1)
127-133
: LGTM: stop and remove manager after confirming zero MCAs.Matches the intended gating and logs the action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
pkg/addon/controllers/addontemplate/controller.go (1)
110-115
: Requeue on List error to avoid leaking managers when CMA is gone.If the List call fails while the CMA key is being processed, explicitly requeue to ensure we retry teardown instead of relying solely on controller-level error handling.
Apply:
- mcaList, err := c.addonClient.AddonV1alpha1().ManagedClusterAddOns("").List(ctx, listOptions) - if err != nil { - return err - } + mcaList, err := c.addonClient.AddonV1alpha1().ManagedClusterAddOns("").List(ctx, listOptions) + if err != nil { + logger.Error(err, "Failed to list ManagedClusterAddOns", "addonName", addOnName) + // retry shortly on transient API errors + syncCtx.Queue().AddAfter(addOnName, 10*time.Second) + return nil + }
🧹 Nitpick comments (3)
pkg/addon/controllers/addontemplate/controller.go (3)
109-113
: Build the FieldSelector via fields.OneTermEqualSelector.Small safety/readability win vs hand-building the string.
Apply:
- listOptions := metav1.ListOptions{ - FieldSelector: "metadata.name=" + addOnName, - } + listOptions := metav1.ListOptions{ + FieldSelector: fields.OneTermEqualSelector("metadata.name", addOnName).String(), + }Add import (outside this hunk):
import ( // ... "k8s.io/apimachinery/pkg/fields" // ... )
118-124
: LGTM; consider making the 10s delay a named const.The requeue-on-existence behavior is correct. Exposing 10*time.Second as a named const (e.g., waitMCADeletion) makes tuning easier across environments.
130-133
: Tweak log message for accuracy.Log reflects the post-stop state; make it past tense.
- logger.Info("Stopping the manager for addon", "addonName", addOnName) + logger.Info("Stopped the manager for addon", "addonName", addOnName)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
pkg/addon/controllers/addontemplate/controller.go
(2 hunks)pkg/addon/controllers/addontemplate/controller_test.go
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/addon/controllers/addontemplate/controller_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/addon/controllers/addontemplate/controller.go (1)
pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: e2e-singleton
- GitHub Check: e2e-hosted
- GitHub Check: e2e
- GitHub Check: unit
- GitHub Check: verify
- GitHub Check: integration
- GitHub Check: build
🔇 Additional comments (2)
pkg/addon/controllers/addontemplate/controller.go (2)
104-106
: Signature + context logger change looks good.Returning error and using klog.FromContext for structured logging is appropriate here.
142-149
: Confirm requeue semantics on error propagation.You now return the error from stopUnusedManagers. Please confirm library-go’s factory.Controller requeues the same key on sync error; if not, add an explicit requeue in the error path to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another thing we need to confirm is when cma is deleting, does install controller still create mca?
listOptions := metav1.ListOptions{ | ||
FieldSelector: "metadata.name=" + addOnName, | ||
} | ||
mcaList, err := c.addonClient.AddonV1alpha1().ManagedClusterAddOns("").List(ctx, listOptions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we list from lister instead calling api directly?
11338ad
to
0844710
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/addon/controllers/addontemplate/controller.go (1)
22-23
: Duplicate import alias (won’t compile).The same package is imported twice under different aliases. Consolidate to a single alias and update the type reference.
- addonapiv1alpha1 "open-cluster-management.io/api/addon/v1alpha1" addonv1alpha1 "open-cluster-management.io/api/addon/v1alpha1"
- getValuesClosure := func(cluster *clusterv1.ManagedCluster, addon *addonapiv1alpha1.ManagedClusterAddOn) (addonfactory.Values, error) { + getValuesClosure := func(cluster *clusterv1.ManagedCluster, addon *addonv1alpha1.ManagedClusterAddOn) (addonfactory.Values, error) { return templateagent.GetAddOnRegistriesPrivateValuesFromClusterAnnotation(klog.FromContext(ctx), cluster, addon) }Also applies to: 222-224
🧹 Nitpick comments (4)
pkg/addon/controllers/addontemplate/controller.go (4)
36-39
: Requeue delay: make configurable or add jitter.10s is fine, but consider a flag/env or jitter to avoid alignment across many addons.
-const ( - // requeueDelay is the delay for requeuing when ManagedClusterAddOns still exist - requeueDelay = 10 * time.Second -) +const requeueDelay = 10 * time.Second +// TODO: consider making this configurable and/or apply jitter.
53-53
: Ensure MCA informer is started/synced for lister reads.Since stopUnusedManagers depends on mcaLister, include the MCA informer in WithBareInformers so the factory guarantees cache sync before sync runs.
return factory.New().WithInformersQueueKeysFunc( queue.QueueKeyByMetaNamespaceName, addonInformers.Addon().V1alpha1().ClusterManagementAddOns().Informer()). - WithBareInformers( + WithBareInformers( // do not need to queue, just make sure the controller reconciles after the addonTemplate cache is synced // otherwise, there will be "xx-addon-template" not found" errors in the log as the controller uses the // addonTemplate lister to get the template object - addonInformers.Addon().V1alpha1().AddOnTemplates().Informer()). + addonInformers.Addon().V1alpha1().AddOnTemplates().Informer(), + // ensure MCA cache is synced for mcaLister usage + addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer()).Also applies to: 83-84, 99-106
110-136
: Gating logic looks good; simplify and short‑circuit on first match.No need to build a slice; return as soon as one MCA for this addon is found. Also align param name with addonName used elsewhere.
-func (c *addonTemplateController) stopUnusedManagers( - ctx context.Context, syncCtx factory.SyncContext, addOnName string) error { +func (c *addonTemplateController) stopUnusedManagers( + ctx context.Context, syncCtx factory.SyncContext, addonName string) error { logger := klog.FromContext(ctx) - // Check if all managed cluster addon instances are deleted before stopping the manager - // This allows pre-delete jobs to complete mcaList, err := c.mcaLister.List(labels.Everything()) if err != nil { return err } - // Filter by addon name - var matchingMCAs []*addonv1alpha1.ManagedClusterAddOn - for _, mca := range mcaList { - if mca.Name == addOnName { - matchingMCAs = append(matchingMCAs, mca) - } - } - - // Check if there are still ManagedClusterAddOns for this addon - if len(matchingMCAs) > 0 { - logger.Info("ManagedClusterAddOn still exists, waiting for deletion", - "addonName", addOnName, "count", len(matchingMCAs)) - // Requeue to check again later - syncCtx.Queue().AddAfter(addOnName, requeueDelay) - return nil - } + for _, mca := range mcaList { + if mca.Name == addonName { + logger.Info("ManagedClusterAddOn still exists, waiting for deletion", + "addonName", addonName) + syncCtx.Queue().AddAfter(addonName, requeueDelay) + return nil + } + }
257-261
: Log errors with logger.Error for consistency.Use structured error logging.
- if err != nil { - logger.Info("Failed to list ManagedClusterAddOns", "error", err) + if err != nil { + logger.Error(err, "Failed to list ManagedClusterAddOns")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
pkg/addon/controllers/addontemplate/controller.go
(5 hunks)pkg/addon/controllers/addontemplate/controller_test.go
(2 hunks)test/e2e/addonmanagement_test.go
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- test/e2e/addonmanagement_test.go
- pkg/addon/controllers/addontemplate/controller_test.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/addon/controllers/addontemplate/controller.go (1)
pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: e2e-singleton
- GitHub Check: e2e-hosted
- GitHub Check: e2e
- GitHub Check: integration
- GitHub Check: unit
- GitHub Check: verify
🔇 Additional comments (2)
pkg/addon/controllers/addontemplate/controller.go (2)
142-144
: LGTM: stop and remove manager only after MCAs are gone.
147-161
: Good: propagate errors from stopUnusedManagers.This prevents silent leaks and leverages controller requeue semantics.
/hold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
test/e2e/addonmanagement_test.go (4)
865-865
: Remove leftover commented sleep.
Dead code adds noise.- // time.Sleep(time.Minute * 10)
852-862
: Fix typo: copyiedConfig → copiedConfig.
Minor readability polish.- copyiedConfig, err := spoke.KubeClient.CoreV1().ConfigMaps(addonInstallNamespace).Get( + copiedConfig, err := spoke.KubeClient.CoreV1().ConfigMaps(addonInstallNamespace).Get( context.Background(), configmap.Name, metav1.GetOptions{}) if err != nil { return err } - if !equality.Semantic.DeepEqual(copyiedConfig.Data, configmap.Data) { - return fmt.Errorf("expected configmap is not correct, %v", copyiedConfig.Data) + if !equality.Semantic.DeepEqual(copiedConfig.Data, configmap.Data) { + return fmt.Errorf("expected configmap is not correct, %v", copiedConfig.Data) }
848-851
: Add local cleanup for the hub-side ConfigMap to avoid test resource leaks.
Keeps the e2e environment clean across runs._, err := hub.KubeClient.CoreV1().ConfigMaps(universalClusterName).Create( context.Background(), configmap, metav1.CreateOptions{}) gomega.Expect(err).ToNot(gomega.HaveOccurred()) + defer func() { + _ = hub.KubeClient.CoreV1().ConfigMaps(universalClusterName).Delete( + context.Background(), configmap.Name, metav1.DeleteOptions{}) + }()
900-910
: Harden Eventually timeout for MCA deletion to match the pre-delete window.
Avoids flakiness on slow clusters.- gomega.Eventually(func() error { + gomega.Eventually(func() error { _, err := hub.AddonClient.AddonV1alpha1().ManagedClusterAddOns(universalClusterName).Get( context.TODO(), addOnName, metav1.GetOptions{}) if err != nil { if errors.IsNotFound(err) { return nil } return err } return fmt.Errorf("the ManagedClusterAddon should be deleted") - }).ShouldNot(gomega.HaveOccurred()) + }, time.Second*300, time.Second*3).ShouldNot(gomega.HaveOccurred())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
test/e2e/addonmanagement_test.go
(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: unit
- GitHub Check: e2e-hosted
- GitHub Check: integration
- GitHub Check: e2e
- GitHub Check: e2e-singleton
- GitHub Check: verify
- GitHub Check: build
🔇 Additional comments (1)
test/e2e/addonmanagement_test.go (1)
8-8
: LGTM: time import is required for explicit Eventually timeouts.
be2536c
to
8dc16ef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
test/e2e/addonmanagement_test.go (1)
77-77
: Defocus the suite/spec: replace FDescribe/FIt with Describe/It.
Focused specs will skip the rest of the suite in CI.Apply:
-var _ = ginkgo.FDescribe("Enable addon management feature gate", ginkgo.Ordered, ginkgo.Label("addon-manager"), func() { +var _ = ginkgo.Describe("Enable addon management feature gate", ginkgo.Ordered, ginkgo.Label("addon-manager"), func() {-ginkgo.FIt("ClusterManagementAddon deletion should wait for ManagedClusterAddons cleanup", func() { +ginkgo.It("ClusterManagementAddon deletion should wait for ManagedClusterAddons cleanup", func() {Run to ensure no other focused specs remain:
#!/bin/bash rg -nP -g '!**/vendor/**' -C0 '\bF(?:Describe|Context|It|Specify|When|DescribeTable|Entry)\s*\('Also applies to: 835-835
🧹 Nitpick comments (5)
test/e2e/addonmanagement_test.go (5)
853-861
: Nit: fix variable name typo.
Use copiedConfig for clarity/consistency.- copyiedConfig, err := spoke.KubeClient.CoreV1().ConfigMaps(addonInstallNamespace).Get( + copiedConfig, err := spoke.KubeClient.CoreV1().ConfigMaps(addonInstallNamespace).Get( context.Background(), configmap.Name, metav1.GetOptions{}) if err != nil { return err } - if !equality.Semantic.DeepEqual(copyiedConfig.Data, configmap.Data) { - return fmt.Errorf("expected configmap is not correct, %v", copyiedConfig.Data) + if !equality.Semantic.DeepEqual(copiedConfig.Data, configmap.Data) { + return fmt.Errorf("expected configmap is not correct, %v", copiedConfig.Data) }
917-921
: Simplify TailLines pointer creation.
Avoid the inline lambda.- logs, logErr := hub.KubeClient.CoreV1().Pods("open-cluster-management-hub").GetLogs( - pod.Name, &corev1.PodLogOptions{ - TailLines: func(i int64) *int64 { return &i }(50), - }).DoRaw(context.Background()) + tail := int64(50) + logs, logErr := hub.KubeClient.CoreV1().Pods("open-cluster-management-hub").GetLogs( + pod.Name, &corev1.PodLogOptions{TailLines: &tail}, + ).DoRaw(context.Background())
941-943
: Nit: log field label is misleading.
It's ContainerStatuses, not a single Ready flag.- klog.Infof("Job pod %s status: Phase=%s, Ready=%v, Conditions=%v", - pod.Name, pod.Status.Phase, pod.Status.ContainerStatuses, pod.Status.Conditions) + klog.Infof("Job pod %s status: Phase=%s, ContainerStatuses=%v, Conditions=%v", + pod.Name, pod.Status.Phase, pod.Status.ContainerStatuses, pod.Status.Conditions)
946-954
: Prefer typed PodPhase constants over string literals.
Reduces typo risk.- if pod.Status.Phase == "Failed" || pod.Status.Phase == "Succeeded" { + if pod.Status.Phase == corev1.PodFailed || pod.Status.Phase == corev1.PodSucceeded { logs, logErr := spoke.KubeClient.CoreV1().Pods(addonInstallNamespace).GetLogs( - pod.Name, &corev1.PodLogOptions{}).DoRaw(context.Background()) + pod.Name, &corev1.PodLogOptions{}, + ).DoRaw(context.Background())
970-970
: Gomega v2 style: use WithTimeout/WithPolling (optional).
Improves readability and matches current idioms.- }, time.Second*300, time.Second*3).ShouldNot(gomega.HaveOccurred()) + }).WithTimeout(5*time.Minute).WithPolling(3*time.Second).ShouldNot(gomega.HaveOccurred())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
test/e2e/addonmanagement_test.go
(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: e2e-hosted
- GitHub Check: e2e
- GitHub Check: e2e-singleton
- GitHub Check: unit
- GitHub Check: build
- GitHub Check: verify
- GitHub Check: integration
🔇 Additional comments (1)
test/e2e/addonmanagement_test.go (1)
8-8
: Import of time is appropriate.
Used for Eventually timeouts; no issues.
def832c
to
b3aeefc
Compare
|
||
// Check if all managed cluster addon instances are deleted before stopping the manager | ||
// This allows pre-delete jobs to complete | ||
mcaList, err := c.mcaLister.List(labels.Everything()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have an index addonindex.ManagedClusterAddonByName for this.
logger.Info("ManagedClusterAddOn still exists, waiting for deletion", | ||
"addonName", addOnName, "count", len(matchingMCAs)) | ||
// Requeue to check again later | ||
syncCtx.Queue().AddAfter(addOnName, requeueDelay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to requeue? deletion of MCA will also trigger the reconcile, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the controller only watches CMA
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we watch MCA? we could filter by configuration in status of mca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/addon/controllers/addontemplate/controller.go (1)
99-109
: Ensure MCA informer is started/synced before using its indexer.stopUnusedManagers relies on the ManagedClusterAddOns indexer. Add the MCA informer to WithBareInformers so the factory starts and syncs it, avoiding empty index reads if the manager hasn't started yet.
return factory.New().WithInformersQueueKeysFunc( queue.QueueKeyByMetaNamespaceName, addonInformers.Addon().V1alpha1().ClusterManagementAddOns().Informer()). - WithBareInformers( + WithBareInformers( // do not need to queue, just make sure the controller reconciles after the addonTemplate cache is synced // otherwise, there will be "xx-addon-template" not found" errors in the log as the controller uses the // addonTemplate lister to get the template object - addonInformers.Addon().V1alpha1().AddOnTemplates().Informer()). + addonInformers.Addon().V1alpha1().AddOnTemplates().Informer(), + addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer()). WithSync(c.sync). ToController("addon-template-controller", recorder)
🧹 Nitpick comments (6)
pkg/addon/templateagent/registration.go (2)
94-96
: Use error-level logging for template fetch failures.Fetching the template failing is an error path; logging at Info may hide real issues during triage. Recommend logger.Error with addonName keyed.
- a.logger.Info("CSRConfigurations failed to get addon template", "addonName", a.addonName, "error", err) + a.logger.Error(err, "CSRConfigurations failed to get addon template", "addonName", a.addonName)
171-173
: Align severity: approval check should log error on template fetch failure.Same rationale as above; favor Error over Info.
- a.logger.Info("CSRApproveCheck failed to get addon template", "addonName", a.addonName, "error", err) + a.logger.Error(err, "CSRApproveCheck failed to get addon template", "addonName", a.addonName)pkg/addon/controllers/addontemplate/controller_test.go (1)
410-414
: Optionally assert requeue behavior.You can assert the key was requeued by waiting briefly and checking syncContext.Queue().Len() > 0 or popping a single item; this makes the test intention explicit.
pkg/addon/controllers/addontemplate/controller.go (3)
37-41
: Consider making requeueDelay configurable or rate-limited.A fixed 10s works, but allowing configuration or using the controller’s rate-limited queue avoids hot loops under prolonged cleanup.
111-139
: Nit: fix log typo and avoid requeue when no manager exists.
- “Sart” -> “Start”.
- If no manager exists, requeueing adds noise. Only requeue while a manager is present.
- if len(addons) > 0 { - logger.Info("ManagedClusterAddOn still exists, waiting for deletion", - "addonName", addOnName, "count", len(addons)) - // Requeue to check again later - syncCtx.Queue().AddAfter(addOnName, requeueDelay) - return nil - } + if len(addons) > 0 { + if _, exists := c.addonManagers[addOnName]; exists { + logger.Info("ManagedClusterAddOn still exists, waiting for deletion", + "addonName", addOnName, "count", len(addons)) + // Requeue to check again later + syncCtx.Queue().AddAfter(addOnName, requeueDelay) + } + return nil + } ... - logger.Info("Sart to stop the manager for addon", "addonName", addOnName) + logger.Info("Start to stop the manager for addon", "addonName", addOnName)
252-255
: Log list/index failures at error level and include addonName.This helps surface startup issues when warming initial triggers.
- logger.Info("Failed to list ManagedClusterAddOns by index", "error", err) + logger.Error(err, "Failed to list ManagedClusterAddOns by index", "addonName", addonName)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
pkg/addon/controllers/addontemplate/controller.go
(6 hunks)pkg/addon/controllers/addontemplate/controller_test.go
(5 hunks)pkg/addon/templateagent/registration.go
(2 hunks)pkg/addon/templateagent/template_agent.go
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- pkg/addon/templateagent/template_agent.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/addon/controllers/addontemplate/controller.go (2)
pkg/addon/index/index.go (1)
ManagedClusterAddonByName
(19-19)pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
pkg/addon/controllers/addontemplate/controller_test.go (3)
pkg/addon/index/index.go (2)
ManagedClusterAddonByName
(19-19)IndexManagedClusterAddonByName
(42-50)pkg/registration/helpers/testing/testinghelpers.go (1)
NewManagedClusterAddons
(520-529)pkg/common/testing/fake_sync_context.go (1)
NewFakeSyncContext
(21-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: e2e
- GitHub Check: e2e-singleton
- GitHub Check: e2e-hosted
- GitHub Check: unit
- GitHub Check: integration
🔇 Additional comments (2)
pkg/addon/controllers/addontemplate/controller_test.go (2)
156-164
: Good: index by addon name wired into tests.Registering ManagedClusterAddonByName ensures the controller paths using ByIndex are exercised realistically.
263-273
: Controller under test correctly receives the MCA indexer.Wiring the indexer into the test controller mirrors production wiring and avoids false negatives.
6bdb759
to
de9f918
Compare
5b57be0
to
7055812
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
test/e2e/addonmanagement_test.go (1)
895-906
: Avoid hard-coded cleanup job nameUse the addOnName to derive the cleanup Job name for resilience across addons.
- _, err := spoke.KubeClient.BatchV1().Jobs(addonInstallNamespace).Get( - context.Background(), "hello-template-cleanup-configmap", metav1.GetOptions{}) + jobName := fmt.Sprintf("%s-cleanup-configmap", addOnName) + _, err := spoke.KubeClient.BatchV1().Jobs(addonInstallNamespace).Get( + context.Background(), jobName, metav1.GetOptions{})pkg/addon/controllers/addontemplate/controller.go (2)
124-141
: Add a safety requeue when MCAs still exist (defensive forward progress)Relying solely on filtered MCA events is OK, but a light AddAfter fallback avoids stalls if predicates ever miss an event (e.g., status.ConfigReferences not populated).
if len(addons) > 0 { logger.Info("ManagedClusterAddOn still exists, waiting for deletion", "addonName", addOnName, "count", len(addons)) - // No need to requeue since we now watch ManagedClusterAddon events - return nil + // Defensive requeue to guarantee forward progress even if no MCA event fires. + syncCtx.Queue().AddAfter(addOnName, 10*time.Second) + return nil }
263-267
: Log index lookup failures as errors (not info)Index failures indicate misconfiguration; use Error to surface this.
- logger.Info("Failed to list ManagedClusterAddOns by index", "error", err) + logger.Error(err, "Failed to list ManagedClusterAddOns by index", "addonName", addonName)pkg/addon/controllers/addontemplate/controller_test.go (1)
292-315
: Assert enqueue behavior via FakeSyncContext.Queue()
FakeSyncContext.Queue() returns the workqueue, so renameexpectedRequeue
toexpectedEnqueued
and add an assertion such as:g.Expect(ctx.Queue().Len() > 0).To(Equal(expectedEnqueued))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
pkg/addon/controllers/addontemplate/controller.go
(9 hunks)pkg/addon/controllers/addontemplate/controller_test.go
(5 hunks)pkg/addon/templateagent/registration.go
(2 hunks)pkg/addon/templateagent/template_agent.go
(1 hunks)test/e2e/addonmanagement_test.go
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/addon/templateagent/registration.go
- pkg/addon/templateagent/template_agent.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/addon/controllers/addontemplate/controller_test.go (3)
pkg/addon/index/index.go (2)
ManagedClusterAddonByName
(19-19)IndexManagedClusterAddonByName
(42-50)pkg/registration/helpers/testing/testinghelpers.go (1)
NewManagedClusterAddons
(520-529)pkg/common/testing/fake_sync_context.go (1)
NewFakeSyncContext
(21-27)
pkg/addon/controllers/addontemplate/controller.go (3)
pkg/common/queue/queuekey.go (2)
QueueKeyByMetaNamespaceName
(67-70)QueueKeyByMetaName
(57-60)pkg/addon/index/index.go (1)
ManagedClusterAddonByName
(19-19)pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: e2e-singleton
- GitHub Check: e2e
- GitHub Check: unit
- GitHub Check: e2e-hosted
- GitHub Check: integration
- GitHub Check: verify
- GitHub Check: build
🔇 Additional comments (7)
test/e2e/addonmanagement_test.go (1)
258-269
: ConfigMap propagation assertion — LGTMUsing copiedConfig.Data vs. configmap.Data is correct and fixes the prior compare bug.
pkg/addon/controllers/addontemplate/controller.go (3)
97-115
: Filtered MCA watch wiring — LGTMWatching ManagedClusterAddOn with a focused predicate and queueing by name cleanly unblocks stop logic via MCA delete events while limiting noise.
160-167
: Stop path on CMA not found / unsupported — LGTMDelegating to stopUnusedManagers on CMA absence or non-template CMA aligns with the new gating logic.
72-85
: Index already registered globally
TheManagedClusterAddonByName
index is set up in pkg/addon/manager.go when informers are initialized, so this controller doesn’t need to call AddIndexers here.Likely an incorrect or invalid review comment.
pkg/addon/controllers/addontemplate/controller_test.go (3)
156-163
: Indexer registration in tests — LGTMRegistering ManagedClusterAddonByName pre-start ensures ByIndex queries work in tests.
262-273
: Controller under test wiring — LGTMmanaging managedClusterAddonIndexer and shared informers mirrors production wiring.
354-363
: Good fix: capture loop variables for stopFunc wrappersShadowing capturedName/capturedStopFunc prevents the closure-capture bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
pkg/addon/controllers/addontemplate/controller.go (2)
154-164
: Use shared GVR constant to avoid driftLeverage utils.AddOnTemplateGVR instead of hard-coded strings.
- if configRef.ConfigGroupResource.Group == "addon.open-cluster-management.io" && - configRef.ConfigGroupResource.Resource == "addontemplates" { + if configRef.ConfigGroupResource.Group == utils.AddOnTemplateGVR.Group && + configRef.ConfigGroupResource.Resource == utils.AddOnTemplateGVR.Resource { templateRefs = append(templateRefs, configRef) }
306-314
: Log index errors at error level and include addonNameMinor severity tweak for visibility.
- logger.Info("Failed to list ManagedClusterAddOns by index", "error", err) + logger.Error(err, "Failed to list ManagedClusterAddOns by index", "addonName", addonName)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/addon/controllers/addontemplate/controller.go
(8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/addon/controllers/addontemplate/controller.go (3)
pkg/common/queue/queuekey.go (1)
QueueKeyByMetaNamespaceName
(67-70)pkg/addon/index/index.go (1)
ManagedClusterAddonByName
(19-19)pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: integration
- GitHub Check: build
- GitHub Check: verify
- GitHub Check: unit
- GitHub Check: e2e-singleton
- GitHub Check: e2e
- GitHub Check: e2e-hosted
🔇 Additional comments (7)
pkg/addon/controllers/addontemplate/controller.go (7)
5-19
: Imports and index alias look correctNew imports (reflect/cache/workqueue) and addonindex alias are appropriate for the new logic.
Also applies to: 33-36
45-57
: Controller fields wiring is soundStoring the indexer and queue on the controller enables the MCA event-driven requeues you rely on. Looks good.
75-92
: SyncContext and queue plumbing LGTMUsing a named controller and persisting syncCtx.Queue() for custom handlers is correct.
101-113
: Informer wiring is appropriateWatching CMA for keys and using MCA/AddOnTemplate as bare informers matches the design. Nice.
185-193
: Manager stop path LGTMClean shutdown with map cleanup and helpful logging.
199-209
: Reconciliation gating is correctDeferring stop on CMA NotFound/unsupported until MCAs are gone matches the intended fix.
167-183
: No changes required — theManagedClusterAddonByName
index is registered in pkg/addon/manager.go (lines 113–118).
// Add custom event handler for ManagedClusterAddon to filter configReference changes | ||
// following the pattern from managedClusterSetController | ||
_, err := addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{ | ||
UpdateFunc: func(oldObj, newObj interface{}) { | ||
// Only handle configReference updates for AddOnTemplates | ||
oldMCA, ok := oldObj.(*addonv1alpha1.ManagedClusterAddOn) | ||
if !ok { | ||
return | ||
} | ||
newMCA, ok := newObj.(*addonv1alpha1.ManagedClusterAddOn) | ||
if !ok { | ||
return | ||
} | ||
|
||
// Extract AddOnTemplate configReferences from both old and new | ||
oldTemplateRefs := extractAddOnTemplateConfigRefs(oldMCA) | ||
newTemplateRefs := extractAddOnTemplateConfigRefs(newMCA) | ||
|
||
// Only process if AddOnTemplate configReferences changed | ||
if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) { | ||
// Queue the addon name to trigger reconciliation | ||
c.queue.Add(newMCA.Name) | ||
} | ||
}, | ||
DeleteFunc: func(obj interface{}) { | ||
mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) | ||
if !ok { | ||
return | ||
} | ||
c.queue.Add(mca.Name) | ||
}, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Handle tombstones in DeleteFunc and make Update comparison order-insensitive
- Delete events can arrive as cache.DeletedFinalStateUnknown; current cast may miss them.
- reflect.DeepEqual on slices is order-sensitive; benign reordering will spuriously enqueue.
Apply:
@@
- DeleteFunc: func(obj interface{}) {
- mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn)
- if !ok {
- return
- }
- c.queue.Add(mca.Name)
- },
+ DeleteFunc: func(obj interface{}) {
+ switch t := obj.(type) {
+ case *addonv1alpha1.ManagedClusterAddOn:
+ c.queue.Add(t.Name)
+ case cache.DeletedFinalStateUnknown:
+ if mca, ok := t.Obj.(*addonv1alpha1.ManagedClusterAddOn); ok {
+ c.queue.Add(mca.Name)
+ }
+ default:
+ return
+ }
+ },
@@
- if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) {
+ sort.Slice(oldTemplateRefs, func(i, j int) bool {
+ oi, oj := oldTemplateRefs[i], oldTemplateRefs[j]
+ return oi.UniqueName < oj.UniqueName
+ })
+ sort.Slice(newTemplateRefs, func(i, j int) bool {
+ ni, nj := newTemplateRefs[i], newTemplateRefs[j]
+ return ni.UniqueName < nj.UniqueName
+ })
+ if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) {
// Queue the addon name to trigger reconciliation
c.queue.Add(newMCA.Name)
}
Note: UniqueName is available on ConfigReference; if not, sort by Group/Resource/Name composite.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Add custom event handler for ManagedClusterAddon to filter configReference changes | |
// following the pattern from managedClusterSetController | |
_, err := addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{ | |
UpdateFunc: func(oldObj, newObj interface{}) { | |
// Only handle configReference updates for AddOnTemplates | |
oldMCA, ok := oldObj.(*addonv1alpha1.ManagedClusterAddOn) | |
if !ok { | |
return | |
} | |
newMCA, ok := newObj.(*addonv1alpha1.ManagedClusterAddOn) | |
if !ok { | |
return | |
} | |
// Extract AddOnTemplate configReferences from both old and new | |
oldTemplateRefs := extractAddOnTemplateConfigRefs(oldMCA) | |
newTemplateRefs := extractAddOnTemplateConfigRefs(newMCA) | |
// Only process if AddOnTemplate configReferences changed | |
if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) { | |
// Queue the addon name to trigger reconciliation | |
c.queue.Add(newMCA.Name) | |
} | |
}, | |
DeleteFunc: func(obj interface{}) { | |
mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) | |
if !ok { | |
return | |
} | |
c.queue.Add(mca.Name) | |
}, | |
}) | |
// Add custom event handler for ManagedClusterAddon to filter configReference changes | |
// following the pattern from managedClusterSetController | |
_, err := addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{ | |
UpdateFunc: func(oldObj, newObj interface{}) { | |
// Only handle configReference updates for AddOnTemplates | |
oldMCA, ok := oldObj.(*addonv1alpha1.ManagedClusterAddOn) | |
if !ok { | |
return | |
} | |
newMCA, ok := newObj.(*addonv1alpha1.ManagedClusterAddOn) | |
if !ok { | |
return | |
} | |
// Extract AddOnTemplate configReferences from both old and new | |
oldTemplateRefs := extractAddOnTemplateConfigRefs(oldMCA) | |
newTemplateRefs := extractAddOnTemplateConfigRefs(newMCA) | |
// Sort the configReferences to make comparison order-insensitive | |
sort.Slice(oldTemplateRefs, func(i, j int) bool { | |
oi, oj := oldTemplateRefs[i], oldTemplateRefs[j] | |
return oi.UniqueName < oj.UniqueName | |
}) | |
sort.Slice(newTemplateRefs, func(i, j int) bool { | |
ni, nj := newTemplateRefs[i], newTemplateRefs[j] | |
return ni.UniqueName < nj.UniqueName | |
}) | |
// Only process if AddOnTemplate configReferences changed | |
if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) { | |
// Queue the addon name to trigger reconciliation | |
c.queue.Add(newMCA.Name) | |
} | |
}, | |
DeleteFunc: func(obj interface{}) { | |
switch t := obj.(type) { | |
case *addonv1alpha1.ManagedClusterAddOn: | |
c.queue.Add(t.Name) | |
case cache.DeletedFinalStateUnknown: | |
if mca, ok := t.Obj.(*addonv1alpha1.ManagedClusterAddOn); ok { | |
c.queue.Add(mca.Name) | |
} | |
default: | |
return | |
} | |
}, | |
}) |
🤖 Prompt for AI Agents
In pkg/addon/controllers/addontemplate/controller.go around lines 115 to 146,
the DeleteFunc must handle cache.DeletedFinalStateUnknown tombstones and the
UpdateFunc comparison must be order-insensitive: modify DeleteFunc to detect if
obj is of type cache.DeletedFinalStateUnknown and, if so, extract the contained
object before casting to *addonv1alpha1.ManagedClusterAddOn, then enqueue its
Name; for UpdateFunc normalize the old/new AddOnTemplate configReference lists
into a deterministic order (either sort by ConfigReference.UniqueName if
present, or by a composite key of Group/Resource/Name) or convert them to a
set/map of unique keys and compare those instead of reflect.DeepEqual, ensuring
nil/empty slices are handled consistently so benign reordering doesn’t trigger
reconciliation.
778aed5
to
159f919
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
pkg/addon/controllers/addontemplate/controller.go (1)
115-146
: Handle tombstones in DeleteFunc and make Update comparison order-insensitive (use UniqueName sort).
- Delete events can arrive as cache.DeletedFinalStateUnknown; current cast may miss them.
- reflect.DeepEqual on slices is order-sensitive; benign reordering spuriously enqueues.
Apply:
@@ - _, err := addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{ + _, err := addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{ UpdateFunc: func(oldObj, newObj interface{}) { // Only handle configReference updates for AddOnTemplates oldMCA, ok := oldObj.(*addonv1alpha1.ManagedClusterAddOn) if !ok { return } newMCA, ok := newObj.(*addonv1alpha1.ManagedClusterAddOn) if !ok { return } // Extract AddOnTemplate configReferences from both old and new oldTemplateRefs := extractAddOnTemplateConfigRefs(oldMCA) newTemplateRefs := extractAddOnTemplateConfigRefs(newMCA) - // Only process if AddOnTemplate configReferences changed - if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) { + // Normalize order by UniqueName to avoid spurious enqueue on reordering + sort.Slice(oldTemplateRefs, func(i, j int) bool { + return oldTemplateRefs[i].UniqueName < oldTemplateRefs[j].UniqueName + }) + sort.Slice(newTemplateRefs, func(i, j int) bool { + return newTemplateRefs[i].UniqueName < newTemplateRefs[j].UniqueName + }) + // Only process if AddOnTemplate configReferences changed + if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) { // Queue the addon name to trigger reconciliation c.queue.Add(newMCA.Name) } }, - DeleteFunc: func(obj interface{}) { - mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) - if !ok { - return - } - c.queue.Add(mca.Name) - }, + DeleteFunc: func(obj interface{}) { + switch t := obj.(type) { + case *addonv1alpha1.ManagedClusterAddOn: + c.queue.Add(t.Name) + case cache.DeletedFinalStateUnknown: + if mca, ok := t.Obj.(*addonv1alpha1.ManagedClusterAddOn); ok { + c.queue.Add(mca.Name) + } + default: + return + } + }, })Optional: also enqueue on Add when a MCA starts referencing AddOnTemplate to catch edge cases:
+ AddFunc: func(obj interface{}) { + if mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn); ok && len(extractAddOnTemplateConfigRefs(mca)) > 0 { + c.queue.Add(mca.Name) + } + },
🧹 Nitpick comments (3)
pkg/addon/controllers/addontemplate/controller.go (3)
3-21
: Import sort for order-insensitive comparison.Required by the UpdateFunc change above.
import ( "context" - "reflect" + "reflect" + "sort" "time" @@ - "k8s.io/client-go/tools/cache" - "k8s.io/client-go/util/workqueue" + "k8s.io/client-go/tools/cache" + "k8s.io/client-go/util/workqueue" )
153-163
: Avoid hard-coded GVR strings; use utils.AddOnTemplateGVR for correctness and consistency.This prevents drift if group/resource names change and matches existing helpers.
-func extractAddOnTemplateConfigRefs(mca *addonv1alpha1.ManagedClusterAddOn) []addonv1alpha1.ConfigReference { +func extractAddOnTemplateConfigRefs(mca *addonv1alpha1.ManagedClusterAddOn) []addonv1alpha1.ConfigReference { var templateRefs []addonv1alpha1.ConfigReference for _, configRef := range mca.Status.ConfigReferences { - if configRef.ConfigGroupResource.Group == "addon.open-cluster-management.io" && - configRef.ConfigGroupResource.Resource == "addontemplates" { + if configRef.ConfigGroupResource.Group == utils.AddOnTemplateGVR.Group && + configRef.ConfigGroupResource.Resource == utils.AddOnTemplateGVR.Resource { templateRefs = append(templateRefs, configRef) } } return templateRefs }
305-313
: Log index lookup failures as errors and include addonName context.Improves signal when the index isn’t wired or cache isn’t synced yet.
- mcas, err := c.managedClusterAddonIndexer.ByIndex(addonindex.ManagedClusterAddonByName, addonName) + mcas, err := c.managedClusterAddonIndexer.ByIndex(addonindex.ManagedClusterAddonByName, addonName) if err != nil { - logger.Info("Failed to list ManagedClusterAddOns by index", "error", err) + logger.Error(err, "Failed to list ManagedClusterAddOns by index", "addonName", addonName) } else { for _, mca := range mcas { addon, ok := mca.(*addonv1alpha1.ManagedClusterAddOn) if ok { mgr.Trigger(addon.Namespace, addonName) } } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/addon/controllers/addontemplate/controller.go
(8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/addon/controllers/addontemplate/controller.go (3)
pkg/common/queue/queuekey.go (1)
QueueKeyByMetaNamespaceName
(67-70)pkg/addon/index/index.go (1)
ManagedClusterAddonByName
(19-19)pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: e2e
- GitHub Check: e2e-singleton
- GitHub Check: e2e-hosted
- GitHub Check: integration
- GitHub Check: verify
- GitHub Check: unit
- GitHub Check: build
🔇 Additional comments (2)
pkg/addon/controllers/addontemplate/controller.go (2)
101-114
: Wiring looks solid.Using WithInformersQueueKeysFunc for CMA and WithBareInformers to ensure MCA/AddOnTemplate caches are synced before reconcile is correct for this controller’s model.
169-183
: Index registration for ManagedClusterAddonByName confirmed
ManagedClusterAddOns informer in pkg/addon/manager.go registers the ManagedClusterAddonByName index, and IndexManagedClusterAddonByName correctly returns the addon’s Name.
@qiujian16 PTAL |
} | ||
|
||
// Extract AddOnTemplate configReferences from both old and new | ||
oldTemplateRefs := extractAddOnTemplateConfigRefs(oldMCA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just filter the mca based on whether it is an addonTemplate based addon?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not only filter the addonTemplate-based addon, but also filter only the template that has changes.
Do you mean we do not need to compare whether the template has changed, we enqueue all update events as long as it is template-based?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, why do we only compare template change? If this is only for deleting, should we just filter the one that is template addon and is in deleting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. PTAL
} | ||
}, | ||
DeleteFunc: func(obj interface{}) { | ||
mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, we only need to care template based addon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in the below lines 145-148
…eleted The addon template controller was stopping addon managers immediately when ClusterManagementAddon was deleted, without waiting for pre-delete jobs to complete or ManagedClusterAddons to be cleaned up via owner reference cascading deletion. This change implements the TODO at line 105 by checking if all ManagedClusterAddons are deleted before stopping the manager. The controller now uses field selectors to efficiently query for remaining ManagedClusterAddons and requeues after 10 seconds if any still exist, allowing time for proper cleanup. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
Signed-off-by: zhujian <[email protected]>
Signed-off-by: zhujian <[email protected]>
- Use lister instead of API client for better performance - Add named constant for requeue delay - Fix test cache synchronization issues - Improve test coverage from 74.7% to 75.6% Addresses review feedback from Qiujian16 and CodeRabbit. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
Add explicit 180s timeout for pre-delete job configmap cleanup. The default 90s timeout was insufficient for the deletion workflow. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
- Replace utilruntime.HandleError with structured logging in CSR functions - Add more context to error messages for better debugging - Use logger.Info for template retrieval errors to provide better visibility 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
- Replace inefficient list-and-filter with indexed lookup - Add managedClusterAddonIndexer field to controller struct - Update comment to accurately describe functionality - Fix unit tests to properly set up the required index This addresses the PR review feedback to use the existing index instead of listing all ManagedClusterAddOns and filtering by name. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
Since we now use managedClusterAddonIndexer for efficient lookup, the mcaLister field is no longer needed. This cleanup reduces memory usage and simplifies the controller structure. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
Use managedClusterAddonIndexer.ByIndex() instead of listing all ManagedClusterAddOns and filtering by name. This provides O(1) indexed lookup instead of O(n) linear scan. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
- Fix closure capture bug in controller test by using captured variables - Fix typo 'copyiedConfig' to 'copiedConfig' in e2e tests 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
Replace filtered event handling with custom event handlers that only trigger reconciliation when AddOnTemplate configReferences actually change. This reduces unnecessary reconciliation cycles by using reflect.DeepEqual to compare config references between old and new objects. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]> Signed-off-by: zhujian <[email protected]>
159f919
to
4649d1b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
pkg/addon/controllers/addontemplate/controller.go (1)
115-151
: Harden MCA event handlers: handle tombstones and compare order-insensitively.
- DeleteFunc ignores cache.DeletedFinalStateUnknown; deletes may be missed.
- reflect.DeepEqual is order-sensitive; benign reordering will spuriously enqueue.
Apply:
@@ - UpdateFunc: func(oldObj, newObj interface{}) { + UpdateFunc: func(oldObj, newObj interface{}) { // Only handle configReference updates for AddOnTemplates oldMCA, ok := oldObj.(*addonv1alpha1.ManagedClusterAddOn) if !ok { return } newMCA, ok := newObj.(*addonv1alpha1.ManagedClusterAddOn) if !ok { return } // Extract AddOnTemplate configReferences from both old and new oldTemplateRefs := extractAddOnTemplateConfigRefs(oldMCA) newTemplateRefs := extractAddOnTemplateConfigRefs(newMCA) - // Only process if AddOnTemplate configReferences changed - if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) { + // Normalize order before comparison (avoid spurious diffs from reordering) + sort.Slice(oldTemplateRefs, func(i, j int) bool { + oi, oj := oldTemplateRefs[i], oldTemplateRefs[j] + if oi.ConfigReferent.Namespace != oj.ConfigReferent.Namespace { + return oi.ConfigReferent.Namespace < oj.ConfigReferent.Namespace + } + return oi.ConfigReferent.Name < oj.ConfigReferent.Name + }) + sort.Slice(newTemplateRefs, func(i, j int) bool { + ni, nj := newTemplateRefs[i], newTemplateRefs[j] + if ni.ConfigReferent.Namespace != nj.ConfigReferent.Namespace { + return ni.ConfigReferent.Namespace < nj.ConfigReferent.Namespace + } + return ni.ConfigReferent.Name < nj.ConfigReferent.Name + }) + if !reflect.DeepEqual(oldTemplateRefs, newTemplateRefs) { // Queue the addon name to trigger reconciliation c.queue.Add(newMCA.Name) } }, - DeleteFunc: func(obj interface{}) { - mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) - if !ok { - return - } - - // Only process template-based addons - templateRefs := extractAddOnTemplateConfigRefs(mca) - if len(templateRefs) > 0 { - c.queue.Add(mca.Name) - } - }, + DeleteFunc: func(obj interface{}) { + switch t := obj.(type) { + case *addonv1alpha1.ManagedClusterAddOn: + if len(extractAddOnTemplateConfigRefs(t)) > 0 { + c.queue.Add(t.Name) + } + case cache.DeletedFinalStateUnknown: + if mca, ok := t.Obj.(*addonv1alpha1.ManagedClusterAddOn); ok { + if len(extractAddOnTemplateConfigRefs(mca)) > 0 { + c.queue.Add(mca.Name) + } + } + default: + return + } + },And import sort:
import ( "context" - "reflect" + "reflect" + "sort" "time"
🧹 Nitpick comments (3)
test/e2e/addonmanagement_test.go (1)
834-920
: Avoid flakiness and clean up hub-side ConfigMap.
- Pre-delete flows can be slow; add explicit Eventually timeouts/polling to reduce flakes.
- The original ConfigMap in the hub namespace isn’t cleaned up by the job; delete it to avoid residue between tests.
Apply:
- gomega.Eventually(func() error { + gomega.Eventually(func() error { _, err := spoke.KubeClient.CoreV1().ConfigMaps(addonInstallNamespace).Get( context.Background(), configmap.Name, metav1.GetOptions{}) if err != nil { if errors.IsNotFound(err) { return nil } return err } return fmt.Errorf("the configmap should be deleted") - }).ShouldNot(gomega.HaveOccurred()) + }).WithTimeout(3 * time.Minute).WithPolling(2 * time.Second).ShouldNot(gomega.HaveOccurred()) @@ - gomega.Eventually(func() error { + gomega.Eventually(func() error { _, err := hub.AddonClient.AddonV1alpha1().ManagedClusterAddOns(universalClusterName).Get( context.TODO(), addOnName, metav1.GetOptions{}) if err != nil { if errors.IsNotFound(err) { return nil } return err } return fmt.Errorf("the ManagedClusterAddon should be deleted") - }).ShouldNot(gomega.HaveOccurred()) + }).WithTimeout(3 * time.Minute).WithPolling(2 * time.Second).ShouldNot(gomega.HaveOccurred()) @@ - gomega.Eventually(func() error { + gomega.Eventually(func() error { _, err := spoke.KubeClient.BatchV1().Jobs(addonInstallNamespace).Get( context.Background(), "hello-template-cleanup-configmap", metav1.GetOptions{}) if err != nil { if errors.IsNotFound(err) { return nil } return err } return fmt.Errorf("the job should be deleted") - }).ShouldNot(gomega.HaveOccurred()) + }).WithTimeout(3 * time.Minute).WithPolling(2 * time.Second).ShouldNot(gomega.HaveOccurred()) + + // Best-effort cleanup of the original hub ConfigMap created for this test. + _ = hub.KubeClient.CoreV1().ConfigMaps(universalClusterName). + Delete(context.TODO(), configmap.Name, metav1.DeleteOptions{})pkg/addon/controllers/addontemplate/controller.go (1)
311-313
: Log errors as errors (and consider retry).Use logger.Error for ByIndex failures; optional: requeue to retry transient index/list errors.
Apply:
- logger.Info("Failed to list ManagedClusterAddOns by index", "error", err) + logger.Error(err, "Failed to list ManagedClusterAddOns by index") + // optionally: c.queue.AddAfter(addonName, 10*time.Second)pkg/addon/controllers/addontemplate/controller_test.go (1)
389-417
: Consider asserting queueing behavior explicitly.If we keep “expectedRequeue”, we can assert enqueue by wiring a fake SyncContext or wrapping c.queue with a test queue to observe Add calls. Optional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
pkg/addon/controllers/addontemplate/controller.go
(8 hunks)pkg/addon/controllers/addontemplate/controller_test.go
(5 hunks)pkg/addon/templateagent/registration.go
(2 hunks)pkg/addon/templateagent/template_agent.go
(1 hunks)test/e2e/addonmanagement_test.go
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/addon/templateagent/template_agent.go
- pkg/addon/templateagent/registration.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/addon/controllers/addontemplate/controller_test.go (3)
pkg/addon/index/index.go (2)
ManagedClusterAddonByName
(19-19)IndexManagedClusterAddonByName
(42-50)pkg/registration/helpers/testing/testinghelpers.go (1)
NewManagedClusterAddons
(520-529)pkg/common/testing/fake_sync_context.go (1)
NewFakeSyncContext
(21-27)
pkg/addon/controllers/addontemplate/controller.go (3)
pkg/common/queue/queuekey.go (1)
QueueKeyByMetaNamespaceName
(67-70)pkg/addon/index/index.go (1)
ManagedClusterAddonByName
(19-19)pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: build
- GitHub Check: integration
- GitHub Check: unit
- GitHub Check: verify
- GitHub Check: e2e-singleton
- GitHub Check: e2e-hosted
- GitHub Check: e2e
🔇 Additional comments (5)
test/e2e/addonmanagement_test.go (1)
259-266
: Fix looks good — correct var and comparison.Using copiedConfig and comparing Data fields directly is the right fix.
pkg/addon/controllers/addontemplate/controller.go (1)
171-187
: Confirm watch-based requeue is sufficient when MCAs remain.stopUnusedManagers returns without requeue by design; this relies entirely on MCA watch events. With tombstone handling fixed, that should be fine. If external index errors occur, consider AddAfter to avoid indefinite liveness.
Would you like me to add a tiny unit test that emits a DeletedFinalStateUnknown for MCA and asserts the CMA name is enqueued?
pkg/addon/controllers/addontemplate/controller_test.go (3)
156-164
: Good: index registered for tests.Registering ManagedClusterAddonByName ensures ByIndex works in tests.
352-363
: Nice capture fix for closures.Shadowing loop vars avoids the classic capture bug.
365-373
: Redundant but fine: indexer registration here too.Consistent with TestReconcile; keeps tests hermetic.
managedClusterAddonIndexer: addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer().GetIndexer(), | ||
addonManagers: make(map[string]context.CancelFunc), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Register the ManagedClusterAddonByName index before using ByIndex.
ByIndex will error if the indexer wasn’t registered; in production this will cause stop/start paths to misbehave or leak managers. Tests add the indexer, but the controller doesn’t.
Apply:
controller := factory.New().
WithSyncContext(syncCtx).
@@
WithSync(c.sync).
ToController("addon-template-controller", recorder)
+ // Ensure the MCA by-name index is registered (required for ByIndex calls).
+ if err := addonInformers.Addon().V1alpha1().ManagedClusterAddOns().
+ Informer().AddIndexers(cache.Indexers{
+ addonindex.ManagedClusterAddonByName: addonindex.IndexManagedClusterAddonByName,
+ }); err != nil {
+ utilruntime.HandleError(err)
+ }
Also applies to: 101-109
🤖 Prompt for AI Agents
In pkg/addon/controllers/addontemplate/controller.go around lines 84-85 (and
similarly around 101-109), the ManagedClusterAddon indexer is used via ByIndex
without ensuring the index was registered; add code during controller
construction to call the indexer's AddIndexers to register the
ManagedClusterAddonByName indexer (provide a function that returns the addon
name key from the ManagedClusterAddOn object or its metadata), do this
registration before any call to ByIndex, and handle and log/return any error
from AddIndexers so the controller fails fast if registration fails.
… controller" This reverts commit 4649d1b. Signed-off-by: zhujian <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
pkg/addon/controllers/addontemplate/controller.go (2)
135-141
: Consider a time-based requeue as a safety net.Relying solely on MCA watch events risks leaking a manager if an event is missed; the PR summary mentioned a 10s requeue. Add a light requeue when MCAs still exist.
Apply:
- logger.Info("ManagedClusterAddOn still exists, waiting for deletion", - "addonName", addOnName, "count", len(addons)) - // No need to requeue since we now watch ManagedClusterAddon events - return nil + logger.Info("ManagedClusterAddOn still exists, waiting for deletion", + "addonName", addOnName, "count", len(addons)) + syncCtx.Queue().AddAfter(addOnName, 10*time.Second) + return nilWas dropping the timed requeue intentional given the PR description states a 10s requeue?
264-267
: Log indexer failures as errors, not info.Indexer errors indicate misconfiguration or cache issues; promote to Error for visibility.
Apply:
- if err != nil { - logger.Info("Failed to list ManagedClusterAddOns by index", "error", err) + if err != nil { + logger.Error(err, "ByIndex ManagedClusterAddOns failed", "addonName", addonName)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/addon/controllers/addontemplate/controller.go
(9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/addon/controllers/addontemplate/controller.go (3)
pkg/common/queue/queuekey.go (2)
QueueKeyByMetaNamespaceName
(67-70)QueueKeyByMetaName
(57-60)pkg/addon/index/index.go (1)
ManagedClusterAddonByName
(19-19)pkg/addon/templateagent/common.go (1)
SupportAddOnTemplate
(34-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: integration
- GitHub Check: unit
- GitHub Check: verify
- GitHub Check: build
- GitHub Check: e2e-singleton
- GitHub Check: e2e
- GitHub Check: e2e-hosted
🔇 Additional comments (2)
pkg/addon/controllers/addontemplate/controller.go (2)
228-230
: LGTM: type-correct getValuesClosure signature.Switching addon param to *addonv1alpha1.ManagedClusterAddOn aligns with surrounding code.
87-96
: No additional index registration needed in addontemplate controller
The ManagedClusterAddOns index is already registered once in pkg/addon/manager.go (lines 113–116) before any controllers run, so adding it again in controller.go is redundant.Likely an incorrect or invalid review comment.
WithFilteredEventsInformersQueueKeysFunc( | ||
queue.QueueKeyByMetaName, | ||
func(obj interface{}) bool { | ||
mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) | ||
if !ok { | ||
return false | ||
} | ||
|
||
// Only process ManagedClusterAddOns that reference AddOnTemplates | ||
for _, configRef := range mca.Status.ConfigReferences { | ||
if configRef.ConfigGroupResource.Group == "addon.open-cluster-management.io" && | ||
configRef.ConfigGroupResource.Resource == "addontemplates" { | ||
return true | ||
} | ||
} | ||
return false | ||
}, | ||
addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer()). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Handle tombstones in the filter predicate and use constants instead of hard-coded strings.
Delete events can arrive as cache.DeletedFinalStateUnknown; current cast drops those. Also prefer utils.AddOnTemplateGVR to avoid drift.
Apply:
- queue.QueueKeyByMetaName,
- func(obj interface{}) bool {
- mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn)
- if !ok {
- return false
- }
-
- // Only process ManagedClusterAddOns that reference AddOnTemplates
- for _, configRef := range mca.Status.ConfigReferences {
- if configRef.ConfigGroupResource.Group == "addon.open-cluster-management.io" &&
- configRef.ConfigGroupResource.Resource == "addontemplates" {
- return true
- }
- }
- return false
- },
+ queue.QueueKeyByMetaName,
+ func(obj interface{}) bool {
+ // unwrap tombstone for delete events
+ if ts, ok := obj.(cache.DeletedFinalStateUnknown); ok {
+ obj = ts.Obj
+ }
+ mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn)
+ if !ok {
+ return false
+ }
+ // Only process MCAs that reference AddOnTemplates
+ for _, ref := range mca.Status.ConfigReferences {
+ if ref.ConfigGroupResource.Group == utils.AddOnTemplateGVR.Group &&
+ ref.ConfigGroupResource.Resource == utils.AddOnTemplateGVR.Resource {
+ return true
+ }
+ }
+ return false
+ },
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
WithFilteredEventsInformersQueueKeysFunc( | |
queue.QueueKeyByMetaName, | |
func(obj interface{}) bool { | |
mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) | |
if !ok { | |
return false | |
} | |
// Only process ManagedClusterAddOns that reference AddOnTemplates | |
for _, configRef := range mca.Status.ConfigReferences { | |
if configRef.ConfigGroupResource.Group == "addon.open-cluster-management.io" && | |
configRef.ConfigGroupResource.Resource == "addontemplates" { | |
return true | |
} | |
} | |
return false | |
}, | |
addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer()). | |
WithFilteredEventsInformersQueueKeysFunc( | |
queue.QueueKeyByMetaName, | |
func(obj interface{}) bool { | |
// unwrap tombstone for delete events | |
if ts, ok := obj.(cache.DeletedFinalStateUnknown); ok { | |
obj = ts.Obj | |
} | |
mca, ok := obj.(*addonv1alpha1.ManagedClusterAddOn) | |
if !ok { | |
return false | |
} | |
// Only process MCAs that reference AddOnTemplates | |
for _, ref := range mca.Status.ConfigReferences { | |
if ref.ConfigGroupResource.Group == utils.AddOnTemplateGVR.Group && | |
ref.ConfigGroupResource.Resource == utils.AddOnTemplateGVR.Resource { | |
return true | |
} | |
} | |
return false | |
}, | |
addonInformers.Addon().V1alpha1().ManagedClusterAddOns().Informer()). |
🤖 Prompt for AI Agents
In pkg/addon/controllers/addontemplate/controller.go around lines 97 to 114, the
filter predicate currently casts obj directly to
*addonv1alpha1.ManagedClusterAddOn which drops tombstone-wrapped delete events
(cache.DeletedFinalStateUnknown) and uses hard-coded group/resource strings;
update the predicate to unwrap cache.DeletedFinalStateUnknown by extracting the
tombstone.Obj when needed and then assert the target type, and replace the
literal "addon.open-cluster-management.io"/"addontemplates" checks with the
shared constant (e.g., utils.AddOnTemplateGVR or the project’s AddOnTemplateGVR)
comparing Group and Resource against those constant values so delete events are
handled and the GVR strings are not hard-coded.
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: qiujian16, zhujian7 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
b506d16
into
open-cluster-management-io:main
The addon template controller was stopping addon managers immediately when ClusterManagementAddon was deleted, without waiting for pre-delete jobs to complete or ManagedClusterAddons to be cleaned up via owner reference cascading deletion.
This change implements the TODO at line 105 by checking if all ManagedClusterAddons are deleted before stopping the manager. The controller now uses field selectors to efficiently query for remaining ManagedClusterAddons and requeues after 10 seconds if any still exist, allowing time for proper cleanup.
🤖 Generated with Claude Code
Summary
Related issue(s)
Fixes #1156
Summary by CodeRabbit
Bug Fixes
New Features
Tests
Chores