justinsb
Member

Less hacky support for GCP, encode more of the logic into controllers.

WIP

k8s-ci-robot added labels on Sep 30, 2025: do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.), cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.), size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)
k8s-ci-robot requested a review from zetaab on September 30, 2025 22:42
justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from c2c7c26 to 5e7c5bb, on September 30, 2025 22:55
hakman requested review from hakman and removed request for olemarkus and zetaab on September 30, 2025 23:27
justinsb force-pushed the clusterapi_controllers branch 4 times, most recently from 80b82ec to 855ef49, on October 6, 2025 16:43
k8s-ci-robot added the area/provider/gcp (Issues or PRs related to gcp provider) label on Oct 6, 2025
justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from 27f468b to 3e80afa, on October 7, 2025 16:22
@justinsb
Member Author

justinsb commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

Let's try this new test :-)

@hakman
Member

hakman commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

hakman force-pushed the clusterapi_controllers branch from b8bc822 to a4fbb3d on October 7, 2025 18:06
@hakman
Member

hakman commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

hakman force-pushed the clusterapi_controllers branch from a4fbb3d to 5496816 on October 7, 2025 18:08
@hakman
Member

hakman commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from a3ef085 to fb68367, on October 7, 2025 22:26
@justinsb
Member Author

justinsb commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

justinsb force-pushed the clusterapi_controllers branch from fb68367 to b7f6dc3 on October 7, 2025 22:58
@justinsb
Member Author

justinsb commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

justinsb force-pushed the clusterapi_controllers branch from 05506d9 to 1e290a0 on October 10, 2025 18:32
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

justinsb force-pushed the clusterapi_controllers branch from ac839db to 1af674f on October 11, 2025 02:51
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

justinsb force-pushed the clusterapi_controllers branch from 1af674f to 172a4a7 on October 11, 2025 10:36
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

justinsb force-pushed the clusterapi_controllers branch from 172a4a7 to e3db3be on October 11, 2025 10:59
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

This is an interesting one. For most builds, kops.Version is the same as kops.KOPS_RELEASE_VERSION, and we sometimes use KOPS_RELEASE_VERSION instead of kops.Version. But for CI builds, they are not the same. So when we use KOPS_RELEASE_VERSION with a CI build, we get an error like this one from kops-controller:

I1011 11:27:40.929602       1 server.go:247] bootstrap failed to build node config: building nodeConfig for instanceGroup: error during apply: kops version older than last used to update the cluster

*********************************************************************************

The cluster was last updated by kops version 1.34.0-beta.2+v1.34.0-beta.1-16-gce1ad2cc6c
To permit updating by the older version 1.34.0-beta.1, run with the --allow-kops-downgrade flag

*********************************************************************************

Trying to switch everything to use kops.Version
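
To make the mismatch concrete, here is a minimal Go sketch of a semver-style comparison applied to the two version strings from the log above. It is only an illustration under stated assumptions: the `parse`, `compareIdent`, and `compare` helpers are hypothetical and are not the code kops uses for its check, and the precedence rules are a simplified reading of Semantic Versioning (build metadata after `+` is ignored).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cmpInt returns -1, 0, or 1 depending on how a compares to b.
func cmpInt(a, b int) int {
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

// parse splits "X.Y.Z-pre+build" into numeric core parts and prerelease
// identifiers. Build metadata after '+' is dropped (hypothetical helper).
func parse(v string) ([]int, []string, error) {
	v, _, _ = strings.Cut(v, "+")
	coreStr, preStr, hasPre := strings.Cut(v, "-")
	var core []int
	for _, p := range strings.Split(coreStr, ".") {
		n, err := strconv.Atoi(p)
		if err != nil {
			return nil, nil, fmt.Errorf("invalid version %q: %w", v, err)
		}
		core = append(core, n)
	}
	var pre []string
	if hasPre {
		pre = strings.Split(preStr, ".")
	}
	return core, pre, nil
}

// compareIdent compares two prerelease identifiers: numeric ones compare
// numerically and sort before alphanumeric ones.
func compareIdent(a, b string) int {
	an, aErr := strconv.Atoi(a)
	bn, bErr := strconv.Atoi(b)
	switch {
	case aErr == nil && bErr == nil:
		return cmpInt(an, bn)
	case aErr == nil:
		return -1
	case bErr == nil:
		return 1
	}
	return strings.Compare(a, b)
}

// compare returns a negative value if a is older than b, zero if they have
// equal precedence, and a positive value if a is newer.
func compare(a, b string) (int, error) {
	ac, ap, err := parse(a)
	if err != nil {
		return 0, err
	}
	bc, bp, err := parse(b)
	if err != nil {
		return 0, err
	}
	for i := 0; i < len(ac) && i < len(bc); i++ {
		if c := cmpInt(ac[i], bc[i]); c != 0 {
			return c, nil
		}
	}
	if c := cmpInt(len(ac), len(bc)); c != 0 {
		return c, nil
	}
	// A version without a prerelease is newer than one with a prerelease.
	if len(ap) == 0 || len(bp) == 0 {
		return cmpInt(len(bp), len(ap)), nil
	}
	for i := 0; i < len(ap) && i < len(bp); i++ {
		if c := compareIdent(ap[i], bp[i]); c != 0 {
			return c, nil
		}
	}
	return cmpInt(len(ap), len(bp)), nil
}

func main() {
	// Both values are copied from the kops-controller log in this thread.
	lastUsed := "1.34.0-beta.2+v1.34.0-beta.1-16-gce1ad2cc6c"
	running := "1.34.0-beta.1"
	c, err := compare(running, lastUsed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if c < 0 {
		fmt.Printf("%s is older than %s: update would be refused\n", running, lastUsed)
	} else {
		fmt.Println("version check passes")
	}
}
```

Under these rules the release-style version sorts before the CI build version, which matches the "older than last used" error in the log.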

justinsb force-pushed the clusterapi_controllers branch from 907dda5 to d951951 on October 11, 2025 12:23
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

OK, switching to kops.Version changed image tags and had a bigger impact. I'll try that separately, and start by turning off the kops version check.

@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

Looks like we don't have the GCP project configured when we acquire it from boskos; fixing that!

justinsb force-pushed the clusterapi_controllers branch from c4c0a5b to 9e681a8 on October 11, 2025 16:02
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

Refactoring the signal handler, but something is odd there!

justinsb force-pushed the clusterapi_controllers branch from 9e681a8 to d34ad3e on October 12, 2025 11:25
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

Working on a theory that killing go run doesn't send a signal to the child process (?), so trying to build it first.

The good news is that it seems to be working, so I think I could just move this code into toolbox or kops-controller, but ... let's see if this passes!

justinsb force-pushed the clusterapi_controllers branch from d34ad3e to bfb5cce on October 12, 2025 12:45
justinsb force-pushed the clusterapi_controllers branch from bfb5cce to 059cb26 on October 12, 2025 14:50
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

I think this is getting pretty close :-)

@justinsb
Member Author

/retest

The clusterapi test is now passing! Other failures look unrelated, e.g. (KUBE_BUILD_CONTAINER_NAME_BASE: unbound variable).

@k8s-ci-robot
Contributor

@justinsb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kops-kubernetes-e2e-ubuntu-gce-build | 059cb26 | link | false | /test pull-kops-kubernetes-e2e-ubuntu-gce-build |


@justinsb
Member Author

(I am still skipping the version check. Skipping it should not be needed, but I think this is #17658.)

Labels

area/api
area/documentation
area/kops-controller
area/nodeup
area/provider/gcp (Issues or PRs related to gcp provider)
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.)
size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)
