kops_cluster_updater is broken on new cluster creation in 1.31.0 (but worked in 1.30.1) #217

@shapirus

Situation: creating a new kops cluster in AWS from scratch using the Terraform kops provider.

Resources defined:

  • kops_cluster
  • a master kops_instance_group (min_size=1 max_size=1)
  • a non-master kops_instance_group (min_size=1 max_size=1)

terraform plan runs fine. The cluster revision and revisions of both instance groups are present in the keepers attribute. The plan output looks good.
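
For reference, the updater wiring described above can be sketched roughly as follows. This is a minimal illustration rather than the exact configuration used: the resource labels (`cluster`, `master`, `nodes`, `updater`) and the keeper keys are hypothetical, the `kops_cluster` and `kops_instance_group` definitions are omitted, and the attribute names follow the provider's documented examples as I understand them, so they should be checked against the provider version in use.

```hcl
# Sketch only: resource labels and keeper keys are hypothetical.
# The kops_cluster and both kops_instance_group resources are
# assumed to be defined elsewhere in the same module.
resource "kops_cluster_updater" "updater" {
  cluster_name = kops_cluster.cluster.id

  # A change in any of these revisions should trigger an update.
  # In the scenario reported here, all three revisions show up
  # in the plan output.
  keepers = {
    cluster = kops_cluster.cluster.revision
    master  = kops_instance_group.master.revision
    nodes   = kops_instance_group.nodes.revision
  }
}
```

Because the keepers reference both instance groups, Terraform should order the updater after them, which is why the missing non-master instance group in the applied manifest is unexpected.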

Behavior of kops_cluster_updater in 1.30.1:

  • applies the generated cluster manifest
  • initializes the launch templates, ASGs and everything else in the cloud
  • waits for instances in both IGs to start, validates the cluster, job done

Behavior of kops_cluster_updater in 1.31.0:

  • applies a cluster manifest without the non-master instance group, only the master IG is applied
  • only the master instance group's cloud resources (LT, ASG) are created
  • kops_cluster_updater is stuck in the "still creating" state and is never able to validate the cluster, because the non-master instance group is never created; it eventually fails on timeout
  • if kops update cluster is run manually at this point, it displays a pending diff containing everything that belongs to the non-master instance group; running it again with --yes creates the respective resources in the cloud.

Since it worked fine in 1.30.1 and is broken in 1.31.0, it should probably be easy enough to track down the change that introduced this bug.
