-
Notifications
You must be signed in to change notification settings - Fork 115
🐛 PlacementRollout ReadyStatus #1281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
🐛 PlacementRollout ReadyStatus #1281
Conversation
WalkthroughModified the manifest work replica set deployment reconciliation logic to track succeeded clusters separately and evaluate completion status based on succeeded cluster count rather than total cluster count. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
0c997f1 to
441473d
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: annelaucg The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/work/hub/controllers/manifestworkreplicasetcontroller/manifestworkreplicaset_deploy_reconcile.go (1)
105-108: Correctly tracks succeeded clusters, but consider failed cluster visibility.The logic appropriately identifies and tracks clusters with
Succeededstatus. However, clusters inFailedor permanently stuckProgressingstates are not tracked separately, which means:
- The rollout completion check (line 201) will never pass if any cluster fails
- The status will show as "Progressing" indefinitely, with no distinction between "actively rolling out" vs. "failed and stuck"
This may be acceptable depending on requirements, but consider whether observability could be improved by tracking failed clusters separately to provide clearer feedback to users.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/work/hub/controllers/manifestworkreplicasetcontroller/manifestworkreplicaset_deploy_reconcile.go(5 hunks)
🔇 Additional comments (4)
pkg/work/hub/controllers/manifestworkreplicasetcontroller/manifestworkreplicaset_deploy_reconcile.go (4)
38-38: LGTM! Clear variable initialization.The addition of
succeededCountalongsidecountandtotalclearly establishes the three metrics being tracked: expected clusters, existing ManifestWorks, and succeeded ManifestWorks.
68-68: LGTM! Appropriate data structure for tracking.The
succeededClusterNamesset mirrors the pattern ofexistingClusterNamesand is the right choice for tracking unique cluster names that have reached succeeded status within each placement.
180-180: LGTM! Consistent summation pattern.The accumulation of
succeededCountfollows the same pattern ascounton line 179, correctly summing succeeded clusters across all placements.
201-205: Behavioral change: completion now requires all clusters to succeed.This change shifts the rollout completion criterion from "all ManifestWorks created" (
total == count) to "all ManifestWorks succeeded" (total == succeededCount). This aligns well with the PR objective of tracking actual readiness rather than just deployment initiation.Implications:
- Rollouts will remain in
Progressingstate until all clusters reachSucceededstatus- If any cluster fails or gets stuck, the rollout will never show as
Complete- There is no distinction in status between "actively progressing" and "blocked by failures"
This is likely the intended behavior for a readiness check, but ensure that operators have sufficient visibility into individual cluster states (via logs, metrics, or cluster-level status) to diagnose why a rollout isn't completing.
Consider verifying that:
- Operators have adequate visibility into per-cluster rollout status (especially failures)
- The
Progressingstate for stuck/failed rollouts is acceptable UX, or if a separateFailedcondition would be beneficial
Summary
Related issue(s)
#1235
#1280
Fixes #
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.