@annelaucg annelaucg commented Dec 4, 2025

Summary

Related issue(s)

#1235
#1280

Fixes #

Summary by CodeRabbit

  • Bug Fixes
    • Improved deployment completion status tracking across multiple clusters to more accurately reflect the successful rollout state of managed resources.

coderabbitai bot commented Dec 4, 2025

Walkthrough

Modified the manifest work replica set deployment reconciliation logic to track succeeded clusters separately and evaluate completion status based on succeeded cluster count rather than total cluster count.

Changes

Cohort / File(s) | Change Summary
Manifest Work Replica Set Reconciliation: pkg/work/hub/controllers/manifestworkreplicasetcontroller/manifestworkreplicaset_deploy_reconcile.go | Introduces succeededCount and succeededClusterNames variables to track clusters with succeeded rollout status. Updates completion evaluation logic to count only succeeded clusters instead of all existing clusters when determining if deployment is complete.
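
To make the change summary concrete, here is a minimal sketch of the two completion criteria side by side. It is not the controller's code: the `RolloutStatus` type, its constants, and the `countSucceeded` helper are hypothetical stand-ins, and only the names `total`, `count`, and `succeededCount` come from the review.

```go
package main

import "fmt"

// RolloutStatus is a hypothetical stand-in for the per-cluster rollout
// status; the real type is defined in the project's own packages.
type RolloutStatus int

const (
	Progressing RolloutStatus = iota
	Succeeded
	Failed
)

// countSucceeded (hypothetical helper) returns the number of existing
// ManifestWorks and how many of them have reached Succeeded.
func countSucceeded(statuses map[string]RolloutStatus) (count, succeededCount int) {
	for _, s := range statuses {
		count++
		if s == Succeeded {
			succeededCount++
		}
	}
	return count, succeededCount
}

func main() {
	statuses := map[string]RolloutStatus{
		"cluster1": Succeeded,
		"cluster2": Progressing,
		"cluster3": Succeeded,
	}
	total := 3 // expected clusters (assumed value for this sketch)

	count, succeededCount := countSucceeded(statuses)
	fmt.Println("old criterion (total == count):", total == count)
	fmt.Println("new criterion (total == succeededCount):", total == succeededCount)
}
```

With one cluster still progressing, the old criterion reports completion while the new one does not, which is the behavioral difference the summary describes.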

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Logic change in completion evaluation: Review the shift from total cluster count to succeeded cluster count to ensure it correctly handles edge cases (e.g., partial failures, cluster removal during rollout)
  • Placement-level tracking: Verify that succeededClusterNames is correctly populated and managed across different placement states
  • Rollout status conditions: Confirm that the condition checking for Succeeded status is comprehensive and doesn't miss valid completion scenarios

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Description check | ⚠️ Warning | The description is incomplete. It includes Related issue(s) but the Summary section is empty and the Fixes field lacks an issue number, leaving key context missing. | Add a detailed summary explaining the bug and the fix applied. Specify which issue this PR fixes by adding an issue number to the Fixes field.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title correctly identifies the main change: a bug fix for PlacementRollout ReadyStatus. It uses the appropriate icon prefix (:bug:) and clearly references the affected component.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@annelaucg annelaucg force-pushed the placementrollout-readystatus branch from 0c997f1 to 441473d Compare December 9, 2025 22:44
openshift-ci bot commented Dec 9, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: annelaucg
Once this PR has been reviewed and has the lgtm label, please assign elgnay for approval. For more information see the Code Review Process.

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@annelaucg annelaucg marked this pull request as ready for review December 9, 2025 22:44
@openshift-ci openshift-ci bot requested review from bhperry and elgnay December 9, 2025 22:44
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/work/hub/controllers/manifestworkreplicasetcontroller/manifestworkreplicaset_deploy_reconcile.go (1)

105-108: Correctly tracks succeeded clusters, but consider failed cluster visibility.

The logic appropriately identifies and tracks clusters with Succeeded status. However, clusters in Failed or permanently stuck Progressing states are not tracked separately, which means:

  • The rollout completion check (line 201) will never pass if any cluster fails
  • The status will show as "Progressing" indefinitely, with no distinction between "actively rolling out" vs. "failed and stuck"

This may be acceptable depending on requirements, but consider whether observability could be improved by tracking failed clusters separately to provide clearer feedback to users.
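
One way to picture the suggestion is the sketch below, which counts failed clusters separately and derives a coarse state from the counts. Everything here is an assumption for illustration: the `Summary` type, the `summarize` helper, and the "Degraded" state name are hypothetical and do not exist in the PR.

```go
package main

import "fmt"

// RolloutStatus is a hypothetical stand-in for the per-cluster status.
type RolloutStatus string

const (
	StatusProgressing RolloutStatus = "Progressing"
	StatusSucceeded   RolloutStatus = "Succeeded"
	StatusFailed      RolloutStatus = "Failed"
)

// Summary holds per-status counts for one rollout (hypothetical type).
type Summary struct {
	Total, Succeeded, Failed int
}

// summarize tallies succeeded and failed clusters separately.
func summarize(statuses map[string]RolloutStatus) Summary {
	s := Summary{Total: len(statuses)}
	for _, st := range statuses {
		switch st {
		case StatusSucceeded:
			s.Succeeded++
		case StatusFailed:
			s.Failed++
		}
	}
	return s
}

// State separates "failed and stuck" from "actively rolling out".
// "Degraded" is an assumed name, not a condition defined by the project.
// Note that an empty rollout (Total == 0) is reported as Complete here.
func (s Summary) State() string {
	switch {
	case s.Succeeded == s.Total:
		return "Complete"
	case s.Failed > 0:
		return "Degraded"
	default:
		return "Progressing"
	}
}

func main() {
	s := summarize(map[string]RolloutStatus{
		"cluster1": StatusSucceeded,
		"cluster2": StatusFailed,
		"cluster3": StatusProgressing,
	})
	fmt.Printf("%+v -> %s\n", s, s.State())
}
```

Whether a failed cluster should surface as a distinct state or only as a count in the status is a design choice left to the maintainers.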

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d5d82c and 441473d.

📒 Files selected for processing (1)
  • pkg/work/hub/controllers/manifestworkreplicasetcontroller/manifestworkreplicaset_deploy_reconcile.go (5 hunks)
🔇 Additional comments (4)
pkg/work/hub/controllers/manifestworkreplicasetcontroller/manifestworkreplicaset_deploy_reconcile.go (4)

38-38: LGTM! Clear variable initialization.

The addition of succeededCount alongside total and count clearly establishes the three metrics being tracked: expected clusters (total), existing ManifestWorks (count), and succeeded ManifestWorks (succeededCount).


68-68: LGTM! Appropriate data structure for tracking.

The succeededClusterNames set mirrors the pattern of existingClusterNames and is the right choice for tracking unique cluster names that have reached succeeded status within each placement.
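
As a rough illustration of why a set fits here, the sketch below uses a plain map-based set so that a cluster observed more than once is still counted once. The `nameSet` and `observation` types and the `track` helper are hypothetical; the actual code may use a different set implementation.

```go
package main

import "fmt"

// nameSet is a minimal string set used only for this sketch.
type nameSet map[string]struct{}

// Insert adds a name; inserting the same name twice has no further effect.
func (s nameSet) Insert(name string) { s[name] = struct{}{} }

// observation is a hypothetical record of one ManifestWork seen for a cluster.
type observation struct {
	cluster   string
	succeeded bool
}

// track fills both sets for a single placement.
func track(obs []observation) (existingClusterNames, succeededClusterNames nameSet) {
	existingClusterNames, succeededClusterNames = nameSet{}, nameSet{}
	for _, o := range obs {
		existingClusterNames.Insert(o.cluster)
		if o.succeeded {
			succeededClusterNames.Insert(o.cluster)
		}
	}
	return existingClusterNames, succeededClusterNames
}

func main() {
	existing, succeeded := track([]observation{
		{cluster: "cluster1", succeeded: true},
		{cluster: "cluster2", succeeded: false},
		{cluster: "cluster1", succeeded: true}, // duplicate, counted once
	})
	fmt.Println("existing:", len(existing), "succeeded:", len(succeeded))
}
```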


180-180: LGTM! Consistent summation pattern.

The accumulation of succeededCount follows the same pattern as count on line 179, correctly summing succeeded clusters across all placements.
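
The summation pattern can be sketched as follows, assuming each placement yields an existing count and a succeeded count. The `placementSummary` type and `sumPlacements` helper are hypothetical names introduced for this example only.

```go
package main

import "fmt"

// placementSummary is a hypothetical per-placement result.
type placementSummary struct {
	Name      string
	Count     int // existing ManifestWorks for this placement
	Succeeded int // ManifestWorks that reached Succeeded
}

// sumPlacements accumulates both counters across all placements,
// so the two totals are always built from the same set of placements.
func sumPlacements(summaries []placementSummary) (count, succeededCount int) {
	for _, p := range summaries {
		count += p.Count
		succeededCount += p.Succeeded
	}
	return count, succeededCount
}

func main() {
	count, succeededCount := sumPlacements([]placementSummary{
		{Name: "placement-a", Count: 3, Succeeded: 3},
		{Name: "placement-b", Count: 2, Succeeded: 1},
	})
	fmt.Println("count:", count, "succeededCount:", succeededCount)
}
```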


201-205: Behavioral change: completion now requires all clusters to succeed.

This change shifts the rollout completion criterion from "all ManifestWorks created" (total == count) to "all ManifestWorks succeeded" (total == succeededCount). This aligns well with the PR objective of tracking actual readiness rather than just deployment initiation.

Implications:

  • Rollouts will remain in Progressing state until all clusters reach Succeeded status
  • If any cluster fails or gets stuck, the rollout will never show as Complete
  • There is no distinction in status between "actively progressing" and "blocked by failures"

This is likely the intended behavior for a readiness check, but ensure that operators have sufficient visibility into individual cluster states (via logs, metrics, or cluster-level status) to diagnose why a rollout isn't completing.

Consider verifying that:

  1. Operators have adequate visibility into per-cluster rollout status (especially failures)
  2. The Progressing state for stuck or failed rollouts is acceptable UX; if it is not, a separate Failed condition may be beneficial
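
For the visibility point, one possible approach is to name the clusters that have not yet succeeded in a status message. The sketch below is an assumption, not part of the PR: `pendingMessage` is a hypothetical helper and the message format is invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// pendingMessage (hypothetical helper) lists clusters that exist but have
// not reached Succeeded. It returns an empty string when none are pending.
func pendingMessage(existing, succeeded map[string]struct{}) string {
	var pending []string
	for name := range existing {
		if _, ok := succeeded[name]; !ok {
			pending = append(pending, name)
		}
	}
	if len(pending) == 0 {
		return ""
	}
	sort.Strings(pending) // stable output regardless of map iteration order
	return fmt.Sprintf("%d of %d clusters have not succeeded: %s",
		len(pending), len(existing), strings.Join(pending, ", "))
}

func main() {
	existing := map[string]struct{}{"cluster1": {}, "cluster2": {}, "cluster3": {}}
	succeeded := map[string]struct{}{"cluster1": {}}
	fmt.Println(pendingMessage(existing, succeeded))
}
```

Sorting the names keeps the message stable between reconciles, which avoids spurious status updates if such a message were written to a condition.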
