✨ Rosa Config implementation #5499

Open
wants to merge 3 commits into base: main

Contributor

@PanSpagetka PanSpagetka commented May 21, 2025

Based on proposal #5451
Adds the RosaRoleConfig API with an implementation that creates the Account/Operator roles and the OIDC config/provider needed to create a ROSA cluster.

We need to move the RosaMachinePoolAutoScaling definition to controlplane, because otherwise there would be a circular dependency.

What type of PR is this?
/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:


@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 21, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from Ankitasw and serngawy May 21, 2025 13:37
@k8s-ci-robot k8s-ci-robot added needs-priority size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 21, 2025
@k8s-ci-robot
Contributor

Hi @PanSpagetka. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@PanSpagetka PanSpagetka marked this pull request as draft May 21, 2025 13:37
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2025
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from 11dac0b to 6056618 Compare May 28, 2025 13:43
@serngawy serngawy mentioned this pull request May 29, 2025
5 tasks
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch 6 times, most recently from 1587db4 to 9121ec2 Compare June 9, 2025 13:40
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch 3 times, most recently from 07b73a3 to 097252f Compare June 16, 2025 11:47
Dockerfile Outdated
@@ -28,12 +28,17 @@ WORKDIR /workspace
# Copy the Go Modules manifests
COPY go.mod go.mod
COPY go.sum go.sum
COPY ./rosa /workspace/rosa
Contributor

Why are we adding this? What is the rosa file?

Dockerfile Outdated
# Cache deps before building and copying source so that we don't need to re-download as much
# and so that source changes don't invalidate our downloaded layer
RUN --mount=type=cache,target=/root/.local/share/golang \
--mount=type=cache,target=/go/pkg/mod \
go mod download

# RUN go mod download
Contributor

no need for this line

PROJECT Outdated
Contributor

can you just add the RosaRoleConfig item without changing the order of other items

Contributor

Why do we need this file?

@@ -38,6 +39,7 @@ patchesStrategicMerge:
- patches/webhook_in_awsmanagedcontrolplanes.yaml
- patches/webhook_in_eksconfigs.yaml
- patches/webhook_in_eksconfigtemplates.yaml
#- patches/webhook_in_rosaroleconfigs.yaml
Contributor

I believe you need to uncomment this line

}
}

if scope.RosaRoleConfig.Status.OIDCID == "" {
Contributor

@serngawy serngawy Jun 17, 2025

I believe you should check/set the RosaRoleConfig condition first, then get the OIDC config using the OCM client and create it only if it does not exist.
The same applies to account-roles and operator-roles.
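
A minimal sketch of the get-then-create flow described in this comment. The `oidcAPI` interface, its method names, `errNotFound`, and `fakeAPI` are hypothetical stand-ins invented for illustration; the real OCM client's methods and error types may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error used only in this sketch.
var errNotFound = errors.New("oidc config not found")

// oidcAPI is a hypothetical stand-in for the OCM client.
type oidcAPI interface {
	GetOIDCConfig(id string) (string, error)
	CreateOIDCConfig() (string, error)
}

// ensureOIDCConfig looks up the recorded config first and creates a new one
// only when nothing is recorded or the lookup reports "not found".
func ensureOIDCConfig(api oidcAPI, recordedID string) (string, error) {
	if recordedID != "" {
		id, err := api.GetOIDCConfig(recordedID)
		if err == nil {
			return id, nil
		}
		if !errors.Is(err, errNotFound) {
			return "", fmt.Errorf("looking up OIDC config %q: %w", recordedID, err)
		}
	}
	return api.CreateOIDCConfig()
}

// fakeAPI is an in-memory implementation used only to exercise the sketch.
type fakeAPI struct {
	existing map[string]bool
	created  int
}

func (f *fakeAPI) GetOIDCConfig(id string) (string, error) {
	if f.existing[id] {
		return id, nil
	}
	return "", errNotFound
}

func (f *fakeAPI) CreateOIDCConfig() (string, error) {
	f.created++
	id := fmt.Sprintf("oidc-%d", f.created)
	f.existing[id] = true
	return id, nil
}

func main() {
	api := &fakeAPI{existing: map[string]bool{}}
	first, _ := ensureOIDCConfig(api, "")     // nothing recorded: creates
	second, _ := ensureOIDCConfig(api, first) // recorded and present: reuses
	fmt.Println(first, second, api.created)
}
```

The same lookup-before-create shape would apply to account roles and operator roles, which keeps repeated reconciles from creating duplicates.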

}
}

err = r.deleteOperatorRoles(ocmClient, awsClient, scope.RosaRoleConfig.Spec.AccountRoleConfig.Prefix)
Contributor

Do we need to delete the operator roles before the oidc-provider?

Contributor Author

Nope, I changed it so it matches reverse creation order.

Contributor

I guess the order here is not correct: deleting the operatorRoles first, then the oidc-provider, then the oidc-config?
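
One reading of this thread is that teardown should run operator roles, then the OIDC provider, then the OIDC config, stopping at the first failure. A rough sketch of that ordering follows; the `step` type, `runInOrder`, `teardown`, and `errStub` are hypothetical names, and the delete actions are caller-supplied stubs rather than the controller's real methods.

```go
package main

import (
	"errors"
	"fmt"
)

// errStub is a placeholder failure used only to exercise the sketch.
var errStub = errors.New("still referenced")

// step is a hypothetical named cleanup action.
type step struct {
	name string
	run  func() error
}

// runInOrder executes steps sequentially and stops at the first failure,
// returning the names of the steps that completed.
func runInOrder(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

// teardown fixes the order discussed in the review: operator roles,
// then the OIDC provider, then the OIDC config.
func teardown(deleteRoles, deleteProvider, deleteConfig func() error) []step {
	return []step{
		{name: "operator-roles", run: deleteRoles},
		{name: "oidc-provider", run: deleteProvider},
		{name: "oidc-config", run: deleteConfig},
	}
}

func main() {
	ok := func() error { return nil }
	fail := func() error { return errStub }

	// The provider deletion fails, so the config deletion is never attempted.
	done, err := runInOrder(teardown(ok, fail, ok))
	fmt.Println(done, err)
}
```

Returning on the first error leaves later resources untouched, so a subsequent reconcile can retry from the failed step.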

return ocmClient.DeleteOidcConfig(oidcConfigID)
}

type reporter struct {
Contributor

Please move this to another file, preferably under pkg/.../rosa

@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from 097252f to 0c9fa93 Compare June 18, 2025 11:40
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 19, 2025
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch 3 times, most recently from b3aded3 to 23fa4cb Compare June 24, 2025 11:29
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from 23fa4cb to 2515be2 Compare July 2, 2025 08:09
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from fedd967 to a428b51 Compare August 5, 2025 11:12
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 5, 2025
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from a428b51 to c2e82ca Compare August 5, 2025 11:31
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 5, 2025
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch 3 times, most recently from 4761247 to fe144c5 Compare August 5, 2025 12:13
@PanSpagetka
Contributor Author

/test pull-cluster-api-provider-aws-e2e-blocking

// +optional
// +immutable
SharedVPCConfig SharedVPCConfig `json:"sharedVPCConfig,omitempty"`
// OIDCID is the ID of the OIDC config that will be used to create the operator roles.
Contributor

Suggested change
- // OIDCID is the ID of the OIDC config that will be used to create the operator roles.
+ // OIDCID is the ID of the OIDC config that will be used to create the operator roles. A managed OIDC-provider will be created if the OIDCID is not specified.

OperatorRoleConfig OperatorRoleConfig `json:"operatorRoleConfig"`
OIDCConfig OIDCConfig `json:"oidcConfig"`
IdentityRef *infrav1.AWSIdentityReference `json:"identityRef,omitempty"`
Region string `json:"region,omitempty"`
Contributor

as we discussed no need for region

}

oidcID := scope.RosaRoleConfig.Status.OIDCID
err = r.deleteOIDCProvider(ocmClient, awsClient, oidcID)
Contributor

We don't delete the oidc-provider if the user set it under the spec.operatorRole.OIDCConfig
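
The rule in this comment could be expressed as a small guard that runs before any delete call. This is only a sketch: `oidcState`, its two fields, and `providerToDelete` are hypothetical and flatten whatever the real spec and status types look like.

```go
package main

import "fmt"

// oidcState is a hypothetical, flattened view of the fields involved.
type oidcState struct {
	UserProvidedID string // ID the user set in the spec, if any (assumed field)
	StatusID       string // ID recorded in status by the controller
}

// providerToDelete reports which OIDC provider, if any, the controller
// should remove. A provider supplied by the user is never deleted.
func providerToDelete(s oidcState) (string, bool) {
	if s.UserProvidedID != "" || s.StatusID == "" {
		return "", false
	}
	return s.StatusID, true
}

func main() {
	fmt.Println(providerToDelete(oidcState{StatusID: "managed-1"}))
	fmt.Println(providerToDelete(oidcState{UserProvidedID: "byo-1", StatusID: "byo-1"}))
}
```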

@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from fe144c5 to 60be2ee Compare August 7, 2025 12:02
@PanSpagetka
Contributor Author

/retest-required

@PanSpagetka
Contributor Author

/test pull-cluster-api-provider-aws-test

@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from 60be2ee to bc8e7af Compare August 8, 2025 06:44
@PanSpagetka
Contributor Author

/test pull-cluster-api-provider-aws-apidiff-main

// If specified, the roles and OIDC configuration will be taken from the referenced RosaRoleConfig instead of the direct fields.
//
// +optional
RosaRoleConfigRef *corev1.LocalObjectReference `json:"rosaRoleConfigRef,omitempty"`
Contributor

The OIDC config is immutable, so validation should be applied (possibly in the update webhook) so that RosaRoleConfigRef cannot be changed once the cluster is provisioned.
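
A sketch of the update-time check this comment asks for, written without the Kubernetes webhook types so it stays self-contained. The `controlPlane` struct, the `Provisioned` flag, and `validateRefUpdate` are assumptions for illustration; how the real webhook decides that a cluster is provisioned is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// controlPlane is a hypothetical reduction of the ROSAControlPlane object.
type controlPlane struct {
	RoleConfigRef string // name of the referenced RosaRoleConfig; empty when unset
	Provisioned   bool   // assumed signal that the cluster already exists
}

var errRefImmutable = errors.New("rosaRoleConfigRef is immutable once the cluster is provisioned")

// validateRefUpdate rejects a changed reference on an already provisioned cluster
// and allows any change before provisioning.
func validateRefUpdate(oldCP, newCP controlPlane) error {
	if oldCP.Provisioned && oldCP.RoleConfigRef != newCP.RoleConfigRef {
		return errRefImmutable
	}
	return nil
}

func main() {
	old := controlPlane{RoleConfigRef: "roles-a", Provisioned: true}
	fmt.Println(validateRefUpdate(old, controlPlane{RoleConfigRef: "roles-b", Provisioned: true}))
	fmt.Println(validateRefUpdate(old, old))
}
```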

if len(operatorRoles) > 0 {
for _, roles := range operatorRoles {
for _, role := range roles {
if strings.Contains(role.RoleName, fmt.Sprintf("%s-openshift-ingress-operator-cloud-credentials", config.Prefix)) {
Contributor

Why strings.Contains rather than an exact string match on the name?
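
To illustrate the concern, the sketch below shows a substring match accepting a role that belongs to a different prefix, while an exact comparison does not. The prefixes `dev` and `predev` are made up, and whether exact equality fits the real naming scheme (for example if names are ever truncated) is an assumption to verify.

```go
package main

import (
	"fmt"
	"strings"
)

// ingressSuffix mirrors the format string used in the diff.
const ingressSuffix = "-openshift-ingress-operator-cloud-credentials"

// matchContains reproduces the substring check from the diff.
func matchContains(roleName, prefix string) bool {
	return strings.Contains(roleName, prefix+ingressSuffix)
}

// matchExact compares the full role name instead.
func matchExact(roleName, prefix string) bool {
	return roleName == prefix+ingressSuffix
}

func main() {
	// A role created for the unrelated prefix "predev".
	other := "predev" + ingressSuffix

	fmt.Println(matchContains(other, "dev")) // true: false positive
	fmt.Println(matchExact(other, "dev"))    // false
}
```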

}
}
}
} else {
Contributor

The logic here is not correct. What if only one role is missing, not all of them? We create 8 operator roles, so we should check every single one.

Contributor Author

No, if there is one role missing we don't create 8 roles, because you can't have 2 roles with the same name. And in the rosa CLI we handle role creation as a batch operation, so we are consistent with that.

Contributor

That is not correct: the rosa lib will re-create the missing operator roles if there are any. You can reproduce this scenario with rosa-cli.
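
Whichever side of this thread is right about rosa's behavior, detecting a partially missing set is cheap to do. The sketch below computes which expected names are absent; `missingRoles` is a hypothetical helper, and `operatorRoleSuffixes` lists only the two suffixes visible in the diff, not the full set of eight mentioned above.

```go
package main

import "fmt"

// operatorRoleSuffixes holds only the two suffixes that appear in the diff;
// the complete list is not shown in this review.
var operatorRoleSuffixes = []string{
	"openshift-ingress-operator-cloud-credentials",
	"openshift-image-registry-installer-cloud-credentials",
}

// missingRoles returns the expected role names that are not in existing.
func missingRoles(prefix string, existing []string) []string {
	have := make(map[string]struct{}, len(existing))
	for _, name := range existing {
		have[name] = struct{}{}
	}

	var missing []string
	for _, suffix := range operatorRoleSuffixes {
		want := prefix + "-" + suffix
		if _, ok := have[want]; !ok {
			missing = append(missing, want)
		}
	}
	return missing
}

func main() {
	existing := []string{"dev-openshift-ingress-operator-cloud-credentials"}
	fmt.Println(missingRoles("dev", existing))
}
```

An empty result means every expected role exists; a non-empty one names exactly the roles that still need to be created.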

if strings.Contains(role.RoleName, fmt.Sprintf("%s-openshift-ingress-operator-cloud-credentials", config.Prefix)) {
scope.RosaRoleConfig.Status.OperatorRolesRef.IngressARN = role.RoleARN
}
if strings.Contains(role.RoleName, fmt.Sprintf("%s-openshift-image-registry-installer-cloud-credentials", config.Prefix)) {
Contributor

Suggested change
- if strings.Contains(role.RoleName, fmt.Sprintf("%s-openshift-image-registry-installer-cloud-credentials", config.Prefix)) {
+ else if strings.Contains(role.RoleName, fmt.Sprintf("%s-openshift-image-registry-installer-cloud-credentials", config.Prefix)) {

Better to use else if for the subsequent if checks.

Contributor

Better to keep the role names in a list/map, so it is easier to check the role count and existence.

Contributor Author

No, having role names in a list doesn't help: every role has its own field, so every condition needs to be spelled out.
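
One way to reconcile the two positions in this thread is a map from expected role name to a pointer at the destination field, so each role keeps its own field without a chain of conditions. This is a sketch under assumptions: `rolesRef` and `role` are cut-down hypothetical types with only two of the fields, and the lookup uses exact names.

```go
package main

import "fmt"

// rolesRef is a hypothetical subset of the status fields being populated.
type rolesRef struct {
	IngressARN       string
	ImageRegistryARN string
}

// role is a hypothetical subset of the role data returned by the lookup.
type role struct {
	RoleName string
	RoleARN  string
}

// assignARNs copies each role's ARN into its matching field and returns
// how many roles were matched.
func assignARNs(prefix string, roles []role, ref *rolesRef) int {
	targets := map[string]*string{
		prefix + "-openshift-ingress-operator-cloud-credentials":         &ref.IngressARN,
		prefix + "-openshift-image-registry-installer-cloud-credentials": &ref.ImageRegistryARN,
	}

	matched := 0
	for _, r := range roles {
		if dst, ok := targets[r.RoleName]; ok {
			*dst = r.RoleARN
			matched++
		}
	}
	return matched
}

func main() {
	var ref rolesRef
	n := assignARNs("dev", []role{
		{RoleName: "dev-openshift-ingress-operator-cloud-credentials", RoleARN: "arn:example:ingress"},
	}, &ref)
	fmt.Println(n, ref.IngressARN, ref.ImageRegistryARN == "")
}
```

Comparing the returned count with the number of map entries gives the existence check the reviewer asked for.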

@PanSpagetka
Contributor Author

/retest-required

@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from 0cbe365 to b3e19fc Compare August 14, 2025 07:32
@PanSpagetka PanSpagetka force-pushed the rosa-roles-implementations branch from b3e19fc to 54f21ce Compare August 14, 2025 11:05
Contributor

@serngawy serngawy left a comment

Thanks @PanSpagetka, this looks good in general; we can add more tests in future PRs.

@serngawy
Contributor

Please add a release note mentioning the new RosaRoleConfig API.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 15, 2025
@k8s-ci-robot
Contributor

PR needs rebase.

Contributor

@nrb nrb left a comment

Some questions and style suggestions.

@@ -714,8 +737,8 @@ type AWSRolesRef struct {
// ]
// }
// +immutable
- ControlPlaneOperatorARN string `json:"controlPlaneOperatorARN"`
- KMSProviderARN string `json:"kmsProviderARN"`
+ ControlPlaneOperatorARN string `json:"controlPlaneOperatorARN,omitempty"`
Contributor

Why are these fields being updated with omitempty?
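
For context on what the tag changes at the serialization level, the sketch below marshals an empty value with and without `omitempty`. It only demonstrates `encoding/json` behavior and does not speak to why the PR made the change; `withoutOmit`, `withOmit`, and `encode` are illustrative names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withoutOmit always serializes the field, even when it is empty.
type withoutOmit struct {
	ControlPlaneOperatorARN string `json:"controlPlaneOperatorARN"`
}

// withOmit drops the field from the output when it is the empty string.
type withOmit struct {
	ControlPlaneOperatorARN string `json:"controlPlaneOperatorARN,omitempty"`
}

// encode marshals v and returns the JSON as a string.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(withoutOmit{})) // {"controlPlaneOperatorARN":""}
	fmt.Println(encode(withOmit{}))    // {}
}
```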

@@ -179,6 +183,29 @@ func (r *ROSAControlPlane) validateExternalAuthProviders() *field.Error {
return nil
}

func (r *ROSAControlPlane) validateRosaRoleConfig() *field.Error {
hasRosaRoleConfigRef := r.Spec.RosaRoleConfigRef != nil
hasAnyDirectRoleFields := r.Spec.OIDCID != "" || r.Spec.InstallerRoleARN != "" || r.Spec.SupportRoleARN != "" || r.Spec.WorkerRoleARN != "" ||
Contributor

True, but you don't know which field is missing without checking all of them.

I can see why this information is being moved into a RoleConfig, because checking all these fields individually is tricky.

@@ -179,6 +186,58 @@ func (r *ROSAControlPlane) validateExternalAuthProviders() *field.Error {
return nil
}

func (r *ROSAControlPlane) validateRosaRoleConfig() *field.Error {
hasAnyDirectRoleFields := r.Spec.OIDCID != "" || r.Spec.InstallerRoleARN != "" || r.Spec.SupportRoleARN != "" || r.Spec.WorkerRoleARN != "" ||
r.Spec.RolesRef.IngressARN != "" || r.Spec.RolesRef.ImageRegistryARN != "" || r.Spec.RolesRef.StorageARN != "" ||
Contributor

For the sake of clarity, perhaps we could use a r.Spec.RolesRef == nil check to demonstrate that there is no RolesRef at all, without checking every field within it.

We can look at the specific RolesRef fields themselves further down in the function, since up here we're just looking for a high level conflict.

if r.Spec.WorkerRoleARN == "" {
return field.Invalid(field.NewPath("spec.workerRoleARN"), r.Spec.WorkerRoleARN, "must be specified")
}
if r.Spec.RolesRef.IngressARN == "" {
Contributor

Probably also want to make sure that r.Spec.RolesRef isn't nil at some point, possibly up above where my previous comment is.

Another suggestion would be to say if you reach this point in the code, you know all the previous fields aren't "", because the function would have returned.

Knowing that, we could say that at this point if r.Spec.RolesRef is not nil, then the user has specified all the direct fields and a RolesRef, triggering the mutual exclusion.

In terms of clarity, I think it would also be worth having a couple helper functions for checking the direct fields and the RolesRef fields; it keeps the conditionals grouped into smaller chunks.
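
The helper-function idea could look roughly like the sketch below. It uses plain errors instead of `field.Error` to stay self-contained, keeps `RolesRef` as a value type as the diff appears to, and treats "neither source set" as an error, which is an assumption. All type and function names here are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// rolesRef and spec are hypothetical reductions of the real API types.
type rolesRef struct {
	IngressARN, ImageRegistryARN, StorageARN string
}

type spec struct {
	RosaRoleConfigRef *string
	OIDCID            string
	InstallerRoleARN  string
	SupportRoleARN    string
	WorkerRoleARN     string
	RolesRef          rolesRef
}

var (
	errBoth    = errors.New("rosaRoleConfigRef and direct role fields are mutually exclusive")
	errNeither = errors.New("either rosaRoleConfigRef or the direct role fields must be set")
)

// anySet reports whether at least one value is non-empty.
func anySet(values ...string) bool {
	for _, v := range values {
		if v != "" {
			return true
		}
	}
	return false
}

// hasDirectRoleFields groups the top-level role checks.
func hasDirectRoleFields(s spec) bool {
	return anySet(s.OIDCID, s.InstallerRoleARN, s.SupportRoleARN, s.WorkerRoleARN)
}

// hasRolesRefFields groups the RolesRef checks.
func hasRolesRefFields(r rolesRef) bool {
	return anySet(r.IngressARN, r.ImageRegistryARN, r.StorageARN)
}

// validateRoleSource performs only the high-level conflict check;
// per-field "must be specified" checks would follow separately.
func validateRoleSource(s spec) error {
	hasRef := s.RosaRoleConfigRef != nil
	hasDirect := hasDirectRoleFields(s) || hasRolesRefFields(s.RolesRef)
	switch {
	case hasRef && hasDirect:
		return errBoth
	case !hasRef && !hasDirect:
		return errNeither
	default:
		return nil
	}
}

func main() {
	name := "roles"
	fmt.Println(validateRoleSource(spec{RosaRoleConfigRef: &name, OIDCID: "abc"}))
	fmt.Println(validateRoleSource(spec{RosaRoleConfigRef: &name}))
}
```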

Comment on lines +28 to +29
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
Contributor

Suggested change
- // EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
- // NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.

"context"

awsv2 "github.com/aws/aws-sdk-go-v2/aws"
iamv2 "github.com/aws/aws-sdk-go-v2/service/iam"
Contributor

Could import these directly without an alias now that there's no reference to the older SDK.

// PatchObject persists the RosaRoleConfig configuration and status.
func (s *RosaRoleConfigScope) PatchObject() error {
return s.patchHelper.Patch(
context.TODO(),
Contributor

Suggested change
- context.TODO(),
+ context.Background(),

}

// Close closes the current scope persisting the RosaRoleConfig configuration and status.
func (s *RosaRoleConfigScope) Close() error {
Contributor

Why have this wrapper function?

}

// Debugf prints a debug message with the given format and arguments.
func (r *Reporter) Debugf(format string, args ...interface{}) {
Contributor

These functions just swallow the input? Are you trying to hide the output?
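
If hiding the output is not the intent, a reporter can forward messages to a logger instead of dropping them. The sketch below uses the standard library `log` package purely for illustration; the `logger` field, the `DEBUG: ` prefix, and the `capture` helper are assumptions, and the real code would presumably use the controller's own logger.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// Reporter forwards messages to a logger instead of discarding them.
type Reporter struct {
	logger *log.Logger
}

// Debugf formats the message and writes it to the underlying logger.
func (r *Reporter) Debugf(format string, args ...interface{}) {
	r.logger.Printf("DEBUG: "+format, args...)
}

// capture runs fn with a Reporter backed by an in-memory buffer and
// returns everything that was logged.
func capture(fn func(r *Reporter)) string {
	var buf bytes.Buffer
	fn(&Reporter{logger: log.New(&buf, "", 0)})
	return buf.String()
}

func main() {
	out := capture(func(r *Reporter) {
		r.Debugf("created role %q", "example-role")
	})
	fmt.Print(out)
}
```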

@@ -1,6 +1,6 @@
module sigs.k8s.io/cluster-api-provider-aws/v2

-go 1.23.0
+go 1.23.1
Contributor

Leave this at 1.23.0.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. needs-priority needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants