
Commit 06737ef

Merge pull request #17567 from hakman/karpenter-1.6.2
Update Karpenter to v1.6.2
2 parents 9dadd5c + efdc249 commit 06737ef

File tree: 34 files changed (+3814 / -3573 lines)


cmd/kops/integration_test.go

Lines changed: 3 additions & 3 deletions
@@ -914,11 +914,11 @@ func TestKarpenter(t *testing.T) {
         withOIDCDiscovery().
         withDefaults24().
         withAddons("karpenter.sh-k8s-1.19").
-        withServiceAccountRole("aws-node-termination-handler.kube-system", true).
+        withoutNTH().
         withServiceAccountRole("karpenter.kube-system", true)
     test.expectTerraformFilenames = append(test.expectTerraformFilenames,
-        "aws_launch_template_karpenter-nodes-single-machinetype.minimal.example.com_user_data",
-        "aws_launch_template_karpenter-nodes-default.minimal.example.com_user_data",
+        "aws_s3_object_nodeupscript-karpenter-nodes-single-machinetype_content",
+        "aws_s3_object_nodeupscript-karpenter-nodes-default_content",
         "aws_s3_object_nodeupconfig-karpenter-nodes-single-machinetype_content",
         "aws_s3_object_nodeupconfig-karpenter-nodes-default_content",
     )

docs/operations/karpenter.md

Lines changed: 92 additions & 39 deletions
@@ -1,69 +1,122 @@
 # Karpenter
 
-[Karpenter](https://karpenter.sh) is a Kubernetes-native capacity manager that directly provisions Nodes and underlying instances based on Pod requirements. On AWS, kOps supports managing an InstanceGroup with either Karpenter or an AWS Auto Scaling Group (ASG).
+[Karpenter](https://karpenter.sh) is an open-source node lifecycle management project built for Kubernetes.
+Adding Karpenter to a Kubernetes cluster can dramatically improve the efficiency and cost of running workloads on that cluster.
+
+On AWS, kOps supports managing an InstanceGroup with either Karpenter or an AWS Auto Scaling Group (ASG).
+
+## Prerequisites
+
+Managed Karpenter requires kOps 1.34+ and that [IAM Roles for Service Accounts (IRSA)](/cluster_spec#service-account-issuer-discovery-and-aws-iam-roles-for-service-accounts-irsa) be enabled for the cluster.
+
+If an older version of Karpenter was installed, it must be uninstalled before installing the new version.
 
 ## Installing
 
-If using kOps 1.26 or older, enable the Karpenter feature flag :
+### New clusters
 
 ```sh
-export KOPS_FEATURE_FLAGS="Karpenter"
-```
+export KOPS_STATE_STORE="s3://my-state-store"
+export KOPS_DISCOVERY_STORE="s3://my-discovery-store"
+export NAME="my-cluster.example.com"
+export ZONES="eu-central-1a"
 
-Karpenter requires that external permissions for ServiceAccounts be enabled for the cluster. See [AWS IAM roles for ServiceAccounts documentation](/cluster_spec#service-account-issuer-discovery-and-aws-iam-roles-for-service-accounts-irsa) for how to enable this.
+kops create cluster --name ${NAME} \
+  --cloud=aws \
+  --instance-manager=karpenter \
+  --discovery-store=${KOPS_DISCOVERY_STORE} \
+  --zones=${ZONES} \
+  --yes
+
+kops validate cluster --name ${NAME} --wait=10m
+
+kops export kubeconfig --name ${NAME} --admin
+```
 
 ### Existing clusters
 
-On existing clusters, you can create a Karpenter InstanceGroup by adding the following to its InstanceGroup spec:
+The Karpenter addon must be enabled in the cluster spec:
 
 ```yaml
 spec:
-  manager: Karpenter
+  karpenter:
+    enabled: true
 ```
 
-You also need to enable the Karpenter addon in the cluster spec:
+To create a Karpenter InstanceGroup, set the following in its InstanceGroup spec:
 
 ```yaml
 spec:
-  karpenter:
-    enabled: true
+  manager: Karpenter
 ```
 
-### New clusters
-
-On new clusters, you can simply add the `--instance-manager=karpenter` flag:
+### EC2NodeClass and NodePool
 
 ```sh
-kops create cluster --name mycluster.example.com --cloud aws --networking=amazonvpc --zones=eu-central-1a,eu-central-1b --master-count=3 --yes --discovery-store=s3://discovery-store/
+USER_DATA=$(aws s3 cp ${KOPS_STATE_STORE}/${NAME}/igconfig/node/nodes/nodeupscript.sh -)
+USER_DATA=${USER_DATA//$'\n'/$'\n    '}
+
+kubectl apply -f - <<YAML
+apiVersion: karpenter.k8s.aws/v1
+kind: EC2NodeClass
+metadata:
+  name: default
+spec:
+  amiFamily: Custom
+  amiSelectorTerms:
+    - ssmParameter: /aws/service/canonical/ubuntu/server/24.04/stable/current/amd64/hvm/ebs-gp3/ami-id
+    - ssmParameter: /aws/service/canonical/ubuntu/server/24.04/stable/current/arm64/hvm/ebs-gp3/ami-id
+  associatePublicIPAddress: true
+  tags:
+    KubernetesCluster: ${NAME}
+    kops.k8s.io/instancegroup: nodes
+    k8s.io/role/node: "1"
+  subnetSelectorTerms:
+    - tags:
+        KubernetesCluster: ${NAME}
+  securityGroupSelectorTerms:
+    - tags:
+        KubernetesCluster: ${NAME}
+        Name: nodes.${NAME}
+  instanceProfile: nodes.${NAME}
+  userData: |
+    ${USER_DATA}
+YAML
+
+kubectl apply -f - <<YAML
+apiVersion: karpenter.sh/v1
+kind: NodePool
+metadata:
+  name: default
+spec:
+  template:
+    spec:
+      requirements:
+        - key: kubernetes.io/arch
+          operator: In
+          values: ["amd64", "arm64"]
+        - key: kubernetes.io/os
+          operator: In
+          values: ["linux"]
+        - key: karpenter.sh/capacity-type
+          operator: In
+          values: ["on-demand", "spot"]
+      nodeClassRef:
+        group: karpenter.k8s.aws
+        kind: EC2NodeClass
+        name: default
+YAML
 ```
 
 ## Karpenter-managed InstanceGroups
 
-A Karpenter-managed InstanceGroup controls a corresponding Karpenter Provisioner resource. kOps will ensure that the Provisioner is configured with the correct AWS security groups, subnets, and launch templates. Just like with ASG-managed InstanceGroups, you can add labels and taints to Nodes and kOps will ensure those are added accordingly.
-
-Note that not all features of InstanceGroups are supported.
-
-## Subnets
-
-By default, kOps will tag subnets with `kops.k8s.io/instance-group/<intancegroup>: "true"` for each InstanceGroup the subnet is assigned to. If you enable manual tagging of subnets, you have to ensure these tags are added, if not Karpenter will fail to provision any instances.
-
-## Instance Types
-
-If you do not specify a mixed instances policy, only the instance type specified by `spec.machineType` will be used. With Karpenter, one typically wants a wider range of instances to choose from. kOps supports both providing a list of instance types through `spec.mixedInstancesPolicy.instances` and providing instance type requirements through `spec.mixedInstancesPolicy.instanceRequirements`. See (/instance_groups)[InstanceGroup documentation] for more details.
+A Karpenter-managed InstanceGroup controls the bootstrap script. kOps will ensure the correct AWS security groups, subnets and permissions.
+`EC2NodeClass` and `NodePool` objects must be created by the cluster operator.
 
 ## Known limitations
 
-### Karpenter-managed Launch Templates
-
-On EKS, Karpener creates its own launch templates for Provisioners. These launch templates will not work with a kOps cluster for a number of reasons. Most importantly, they do not use supported AMIs and they do not install and configure nodeup, the instance-side kOps component. The Karpenter features that require Karpenter to directly manage launch templates will not be available on kOps.
-
-### Unmanaged Provisioner resources
-
-As mentioned above, kOps will manage a Provisioner resource per InstanceGroup. It is technically possible to create Provsioner resources directly, but you have to ensure that you configure Provisioners according to kOps requirements. As mentioned above, Karpenter-managed launch templates do not work and you have to maintain your own kOps-compatible launch templates.
-
-### Other minor limitations
-
-* Control plane nodes must be provisioned with an ASG, not Karpenter.
-* Provisioners will unconditionally use spot with a fallback on ondemand instances.
-* Provisioners will unconditionally include burstable instance groups such as the T3 instance family.
-* kOps will not allow mixing arm64 and amd64 instances in the same Provider.
+* **Upgrade is not supported** from the previous version of managed Karpenter.
+* Control plane nodes must be provisioned with an ASG.
+* All `EC2NodeClass` objects must have the `spec.amiFamily` set to `Custom`.
+* `spec.instanceStorePolicy` configuration is not supported in `EC2NodeClass`.
+* `spec.kubelet`, `spec.taints` and `spec.labels` configuration are not supported in `EC2NodeClass`, but they can be configured in the `Cluster` or `InstanceGroup` spec.
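
A note on the `USER_DATA` substitution in the new docs: the bash expansion replaces every newline in the fetched bootstrap script with a newline plus indentation, so the multi-line script stays nested under the `userData: |` block scalar once the heredoc is expanded. The four-space width shown above is an assumption chosen to match the `    ${USER_DATA}` placeholder. A minimal Go sketch of the same transformation:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// Stand-in for the script fetched from the state store at
	// ${KOPS_STATE_STORE}/${NAME}/igconfig/node/nodes/nodeupscript.sh.
	userData := "#!/bin/bash\nset -o errexit\necho bootstrapping node"

	// Equivalent of ${USER_DATA//$'\n'/$'\n    '}: re-indent every
	// continuation line to the YAML block-scalar level.
	indented := strings.ReplaceAll(userData, "\n", "\n    ")

	fmt.Printf("  userData: |\n    %s\n", indented)
}
```

Without the re-indent, the second and later script lines would fall outside the YAML scalar and the manifest would fail to parse.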

docs/releases/1.34-NOTES.md

Lines changed: 4 additions & 2 deletions
@@ -16,7 +16,7 @@ This is a document to gather the release notes prior to the release.
 
 ## AWS
 
-* TODO
+* Karpenter has been upgraded to v1.6.2. ([17567](https://github.com/kubernetes/kops/pull/17567))
 
 ## GCP
 
@@ -34,7 +34,9 @@ This is a document to gather the release notes prior to the release.
 
 ## Other breaking changes
 
-* Legacy addons have been removed from the kOps repo. These were only referenced by kOps <1.22 ([17322](https://github.com/kubernetes/kops/pull/17332))
+* Legacy addons have been removed from the kOps repo. These were only referenced by kOps <1.22. ([17322](https://github.com/kubernetes/kops/pull/17332))
+
+* If an older version of Karpenter was installed, it must be uninstalled before upgrading. ([17567](https://github.com/kubernetes/kops/pull/17567))
 
 # Known Issues

pkg/apis/kops/validation/validation.go

Lines changed: 3 additions & 0 deletions
@@ -1894,6 +1894,9 @@ func validateMetricsServer(cluster *kops.Cluster, spec *kops.MetricsServerConfig
 }
 
 func validateNodeTerminationHandler(cluster *kops.Cluster, spec *kops.NodeTerminationHandlerSpec, fldPath *field.Path) (allErrs field.ErrorList) {
+    if (spec.Enabled == nil || *spec.Enabled) && cluster.Spec.Karpenter != nil && cluster.Spec.Karpenter.Enabled {
+        allErrs = append(allErrs, field.Forbidden(fldPath, "nodeTerminationHandler cannot be used in conjunction with Karpenter"))
+    }
    if spec.IsQueueMode() {
        if spec.EnableSpotInterruptionDraining != nil && !*spec.EnableSpotInterruptionDraining {
            allErrs = append(allErrs, field.Forbidden(fldPath.Child("enableSpotInterruptionDraining"), "spot interruption draining cannot be disabled in Queue Processor mode"))
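
Since `nodeTerminationHandler` defaults to enabled when `spec.Enabled` is nil, the new check rejects Karpenter unless NTH is explicitly disabled. A standalone sketch of the predicate, using simplified stand-in types rather than the real kOps API structs:

```go
package main

import "fmt"

// Simplified stand-ins for the kOps API types referenced in the diff.
type NodeTerminationHandlerSpec struct{ Enabled *bool }

type KarpenterConfig struct{ Enabled bool }

// nthConflictsWithKarpenter mirrors the new validation rule: NTH counts as
// enabled when Enabled is nil (the default), so it must be explicitly
// disabled before Karpenter can be turned on.
func nthConflictsWithKarpenter(nth *NodeTerminationHandlerSpec, k *KarpenterConfig) bool {
	nthEnabled := nth.Enabled == nil || *nth.Enabled
	return nthEnabled && k != nil && k.Enabled
}

func main() {
	off := false
	karpenter := &KarpenterConfig{Enabled: true}
	fmt.Println(nthConflictsWithKarpenter(&NodeTerminationHandlerSpec{}, karpenter))              // true: NTH default-enabled
	fmt.Println(nthConflictsWithKarpenter(&NodeTerminationHandlerSpec{Enabled: &off}, karpenter)) // false: NTH explicitly off
}
```

In the real validator the same condition appends a `field.Forbidden` error instead of returning a bool.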

pkg/model/awsmodel/autoscalinggroup.go

Lines changed: 12 additions & 10 deletions
@@ -74,19 +74,26 @@ func (b *AutoscalingGroupModelBuilder) Build(c *fi.CloudupModelBuilderContext) e
         }
     }
 
-    task, err := b.buildLaunchTemplateTask(c, name, ig)
+    // Always create the user data, even for Karpenter managed instance groups
+    // Karpenter expects the user data to be available in the state store:
+    // ${KOPS_STATE_STORE}/${CLUSTER_NAME}/igconfig/node/${IG_NAME}/nodeupscript.sh
+    userData, err := b.BootstrapScriptBuilder.ResourceNodeUp(c, ig)
     if err != nil {
         return err
     }
-    c.AddTask(task)
 
-    // @step: now lets build the autoscaling group task
     if ig.Spec.Manager != "Karpenter" {
+        lt, err := b.buildLaunchTemplateTask(c, name, ig, userData)
+        if err != nil {
+            return err
+        }
+        c.AddTask(lt)
+
         asg, err := b.buildAutoScalingGroupTask(c, name, ig)
         if err != nil {
             return err
         }
-        asg.LaunchTemplate = task
+        asg.LaunchTemplate = lt
         c.AddTask(asg)
 
         warmPool := b.Cluster.Spec.CloudProvider.AWS.WarmPool.ResolveDefaults(ig)
@@ -136,7 +143,7 @@ func (b *AutoscalingGroupModelBuilder) Build(c *fi.CloudupModelBuilderContext) e
 }
 
 // buildLaunchTemplateTask is responsible for creating the template task into the aws model
-func (b *AutoscalingGroupModelBuilder) buildLaunchTemplateTask(c *fi.CloudupModelBuilderContext, name string, ig *kops.InstanceGroup) (*awstasks.LaunchTemplate, error) {
+func (b *AutoscalingGroupModelBuilder) buildLaunchTemplateTask(c *fi.CloudupModelBuilderContext, name string, ig *kops.InstanceGroup, userData fi.Resource) (*awstasks.LaunchTemplate, error) {
     // @step: add the iam instance profile
     link, err := b.LinkToIAMInstanceProfile(ig)
     if err != nil {
@@ -180,11 +187,6 @@ func (b *AutoscalingGroupModelBuilder) buildLaunchTemplateTask(c *fi.CloudupMode
         return nil, fmt.Errorf("error building cloud tags: %v", err)
     }
 
-    userData, err := b.BootstrapScriptBuilder.ResourceNodeUp(c, ig)
-    if err != nil {
-        return nil, err
-    }
-
     lt := &awstasks.LaunchTemplate{
         Name:      fi.PtrTo(name),
         Lifecycle: b.Lifecycle,
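
The net effect of the refactor is that user data is generated for every InstanceGroup and published to the state store, while launch-template and ASG tasks are only added for groups not managed by Karpenter. A condensed, self-contained sketch of that control flow (hypothetical helper names, not the real builder API, which wires fi tasks into the model context instead):

```go
package main

import "fmt"

// Simplified stand-in for the kOps InstanceGroup.
type InstanceGroup struct{ Name, Manager string }

func buildUserData(ig InstanceGroup) (string, error) { return "#!/bin/bash ...", nil }

func buildLaunchTemplate(ig InstanceGroup, userData string) (string, error) {
	return "lt-" + ig.Name, nil
}

func buildASG(ig InstanceGroup, lt string) error {
	fmt.Printf("asg for %s uses %s\n", ig.Name, lt)
	return nil
}

// buildInstanceGroup mirrors the post-refactor flow: user data is always
// produced (Karpenter reads it from the state store), while launch-template
// and ASG tasks are only added for non-Karpenter groups.
func buildInstanceGroup(ig InstanceGroup) error {
	userData, err := buildUserData(ig) // always runs; published as nodeupscript.sh
	if err != nil {
		return err
	}
	if ig.Manager == "Karpenter" {
		return nil // Karpenter launches the instances itself
	}
	lt, err := buildLaunchTemplate(ig, userData)
	if err != nil {
		return err
	}
	return buildASG(ig, lt)
}

func main() {
	_ = buildInstanceGroup(InstanceGroup{Name: "karpenter-nodes", Manager: "Karpenter"})
	_ = buildInstanceGroup(InstanceGroup{Name: "nodes"})
}
```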

pkg/model/bootstrapscript.go

Lines changed: 49 additions & 2 deletions
@@ -27,11 +27,13 @@ import (
     "k8s.io/kops/pkg/apis/nodeup"
     "k8s.io/kops/pkg/assets"
     "k8s.io/kops/pkg/model/resources"
+    "k8s.io/kops/pkg/nodemodel/wellknownassets"
     "k8s.io/kops/pkg/wellknownservices"
     "k8s.io/kops/upup/pkg/fi"
     "k8s.io/kops/upup/pkg/fi/fitasks"
     "k8s.io/kops/upup/pkg/fi/utils"
     "k8s.io/kops/util/pkg/architectures"
+    "k8s.io/kops/util/pkg/vfs"
 )
 
 type NodeUpConfigBuilder interface {
@@ -65,6 +67,8 @@ type BootstrapScript struct {
 
     // nodeupConfig contains the nodeup config.
     nodeupConfig fi.CloudupTaskDependentResource
+    // nodeupScript contains the nodeup bootstrap script, for use with Karpenter.
+    nodeupScript fi.CloudupTaskDependentResource
 }
 
 var (
@@ -74,7 +78,7 @@
 )
 
 // kubeEnv returns the boot config for the instance group
-func (b *BootstrapScript) kubeEnv(ig *kops.InstanceGroup, c *fi.CloudupContext) (*nodeup.BootConfig, error) {
+func (b *BootstrapScript) kubeEnv(cluster *kops.Cluster, ig *kops.InstanceGroup, c *fi.CloudupContext) (*nodeup.BootConfig, error) {
     wellKnownAddresses := make(WellKnownAddresses)
 
     for _, hasAddress := range b.hasAddressTasks {
@@ -121,6 +125,40 @@ func (b *BootstrapScript) kubeEnv(ig *kops.InstanceGroup, c *fi.CloudupContext)
     bootConfig.NodeupConfigHash = base64.StdEncoding.EncodeToString(sum256[:])
     b.nodeupConfig.Resource = fi.NewBytesResource(configData)
 
+    if ig.Spec.Manager == kops.InstanceManagerKarpenter {
+        assetBuilder := assets.NewAssetBuilder(vfs.NewVFSContext(), cluster.Spec.Assets, false)
+        nodeUpAssets := make(map[architectures.Architecture]*assets.MirroredAsset)
+        for _, arch := range architectures.GetSupported() {
+            asset, err := wellknownassets.NodeUpAsset(assetBuilder, arch)
+            if err != nil {
+                return nil, err
+            }
+            nodeUpAssets[arch] = asset
+        }
+
+        var nodeupScript resources.NodeUpScript
+        nodeupScript.NodeUpAssets = nodeUpAssets
+        nodeupScript.BootConfig = bootConfig
+
+        nodeupScript.WithEnvironmentVariables(cluster, ig)
+        nodeupScript.WithProxyEnv(cluster)
+        nodeupScript.WithSysctls()
+
+        nodeupScript.CompressUserData = fi.ValueOf(ig.Spec.CompressUserData)
+
+        nodeupScript.CloudProvider = string(cluster.GetCloudProvider())
+
+        scriptResource, err := nodeupScript.Build()
+        if err != nil {
+            return nil, err
+        }
+        scriptData, err := fi.ResourceAsBytes(scriptResource)
+        if err != nil {
+            return nil, err
+        }
+        b.nodeupScript.Resource = fi.NewBytesResource(scriptData)
+    }
+
     return bootConfig, nil
 }
 
@@ -194,6 +232,7 @@ func (b *BootstrapScriptBuilder) ResourceNodeUp(c *fi.CloudupModelBuilderContext
     }
     task.resource.Task = task
     task.nodeupConfig.Task = task
+    task.nodeupScript.Task = task
     c.AddTask(task)
 
     c.AddTask(&fitasks.ManagedFile{
@@ -202,6 +241,14 @@
         Location:  fi.PtrTo("igconfig/" + ig.Spec.Role.ToLowerString() + "/" + ig.Name + "/nodeupconfig.yaml"),
         Contents:  &task.nodeupConfig,
     })
+    if ig.Spec.Manager == kops.InstanceManagerKarpenter {
+        c.AddTask(&fitasks.ManagedFile{
+            Name:      fi.PtrTo("nodeupscript-" + ig.Name),
+            Lifecycle: b.Lifecycle,
+            Location:  fi.PtrTo("igconfig/" + ig.Spec.Role.ToLowerString() + "/" + ig.Name + "/nodeupscript.sh"),
+            Contents:  &task.nodeupScript,
+        })
+    }
     return &task.resource, nil
 }
 
@@ -231,7 +278,7 @@ func (b *BootstrapScript) Run(c *fi.CloudupContext) error {
         return nil
     }
 
-    bootConfig, err := b.kubeEnv(b.ig, c)
+    bootConfig, err := b.kubeEnv(b.cluster, b.ig, c)
     if err != nil {
         return err
     }
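
With these changes a Karpenter-managed InstanceGroup gets two ManagedFile artifacts: the existing `nodeupconfig.yaml` plus the new `nodeupscript.sh`. A small sketch composing the state-store locations used in the diff (role and group name are illustrative):

```go
package main

import "fmt"

// igConfigPaths mirrors the ManagedFile locations in the diff: a
// nodeupconfig.yaml for every group, plus nodeupscript.sh when the
// group is Karpenter-managed.
func igConfigPaths(role, igName string, karpenterManaged bool) []string {
	base := "igconfig/" + role + "/" + igName + "/"
	paths := []string{base + "nodeupconfig.yaml"}
	if karpenterManaged {
		paths = append(paths, base+"nodeupscript.sh")
	}
	return paths
}

func main() {
	// Locations are relative to ${KOPS_STATE_STORE}/${CLUSTER_NAME}/.
	for _, p := range igConfigPaths("node", "nodes", true) {
		fmt.Println(p)
	}
}
```

The second path is the same one the new docs read the script from with `aws s3 cp`.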
