Introduce RetryOnFailure lifecycle management strategy #1278

@stefanprodan

This is a proposal for introducing a new way of performing lifecycle management, akin to the operational mode of Flux kustomize-controller.

The RetryOnFailure strategy is suitable for StatefulSets and other workloads that cannot tolerate rollbacks and whose long rollout durations make them susceptible to health check timeouts and transient capacity errors.

The RetryOnFailure strategy will ensure that:

  • An installation failure will not be retried immediately. Instead, the controller will retry with an upgrade at a fixed interval defined in the HelmRelease.
  • An upgrade failure will leave the Helm release in a failed state without performing any remediation.
  • An upgrade failure will never result in a rollback or uninstall.
  • Upgrade failures are always retried at a fixed interval defined in the HelmRelease.

API Changes

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
spec:
  install:
    strategy:
      name: RetryOnFailure # defaults to RemediateOnFailure
      retryInterval: 5m # only used for RetryOnFailure (defaults to 5m)
    remediation: # ignored for RetryOnFailure
      retries: 2
  upgrade:
    strategy:
      name: RetryOnFailure # defaults to RemediateOnFailure
      retryInterval: 5m # only used for RetryOnFailure (defaults to 5m)
    remediation: # ignored for RetryOnFailure
      retries: 2
      strategy: rollback

When the strategy is not specified, or when it is set to RemediateOnFailure, lifecycle management works as before.
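
To make the default concrete, the fragment below spells out the strategy explicitly. It is a sketch that uses only the field names proposed in this issue; under the proposal it should behave the same as a HelmRelease that omits strategy altogether, with the existing remediation settings still honored.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
spec:
  install:
    strategy:
      name: RemediateOnFailure # proposed field; same effect as omitting strategy
    remediation: # honored, as in the current API
      retries: 2
  upgrade:
    strategy:
      name: RemediateOnFailure # proposed field; same effect as omitting strategy
    remediation: # honored, as in the current API
      retries: 2
      strategy: rollback
```

Existing manifests that set only remediation would therefore need no changes.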

For installations, the RetryOnFailure strategy will uninstall the release on failure, then rerun the installation after the specified retry interval.

For upgrades, the RetryOnFailure strategy will behave like flux reconcile hr --force when the remediation retries are set to 0.
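
Putting the proposed fields together, a HelmRelease for a slow-rolling stateful workload might look like the sketch below. The release name, namespace, chart name, HelmRepository name, and the interval and timeout values are all hypothetical and chosen only for illustration; the strategy block follows the API shape proposed above and is not an implemented API.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: my-database # hypothetical release name
  namespace: databases # hypothetical namespace
spec:
  interval: 30m # regular reconciliation interval (illustrative value)
  timeout: 15m # generous timeout for a slow rollout (illustrative value)
  chart:
    spec:
      chart: my-database # hypothetical chart name
      sourceRef:
        kind: HelmRepository
        name: my-charts # hypothetical repository name
  install:
    strategy:
      name: RetryOnFailure # proposed: retry at a fixed interval
      retryInterval: 10m # proposed: defaults to 5m when omitted
  upgrade:
    strategy:
      name: RetryOnFailure # proposed: never roll back or uninstall
      retryInterval: 10m # proposed: defaults to 5m when omitted
```

The remediation blocks are left out here because, per the proposal, they are ignored when RetryOnFailure is selected.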

Labels: area/ux, enhancement
