
Commit 61b4928

Merge pull request #51483 from pohly/dra-1.34
DRA: core update for 1.34
2 parents: e087053 + 36162b1

7 files changed: +79 −67 lines
content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md
Lines changed: 12 additions & 13 deletions

@@ -219,7 +219,7 @@ creating or modifying ResourceSlices.
 Consider the following example ResourceSlice:
 
 ```yaml
-apiVersion: resource.k8s.io/v1beta1
+apiVersion: resource.k8s.io/v1
 kind: ResourceSlice
 metadata:
   name: cat-slice
@@ -233,14 +233,13 @@ spec:
   allNodes: true
   devices:
   - name: "large-black-cat"
-    basic:
-      attributes:
-        color:
-          string: "black"
-        size:
-          string: "large"
-        cat:
-          boolean: true
+    attributes:
+      color:
+        string: "black"
+      size:
+        string: "large"
+      cat:
+        boolean: true
 ```
 This ResourceSlice is managed by the `resource-driver.example.com` driver in the
 `black-cat-pool` pool. The `allNodes: true` field indicates that any node in the
@@ -399,7 +398,7 @@ admin access grants access to in-use devices and may enable additional
 permissions when making the device available in a container:
 
 ```yaml
-apiVersion: resource.k8s.io/v1beta2
+apiVersion: resource.k8s.io/v1
 kind: ResourceClaimTemplate
 metadata:
   name: large-black-cat-claim-template
@@ -441,7 +440,7 @@ allocated if it is available. But if it is not and two small white devices are a
 the pod will still be able to run.
 
 ```yaml
-apiVersion: resource.k8s.io/v1beta2
+apiVersion: resource.k8s.io/v1
 kind: ResourceClaimTemplate
 metadata:
   name: prioritized-list-claim-template
@@ -495,7 +494,7 @@ handles this and it is transparent to the consumer as the ResourceClaim API is n
 
 ```yaml
 kind: ResourceSlice
-apiVersion: resource.k8s.io/v1beta2
+apiVersion: resource.k8s.io/v1
 metadata:
   name: resourceslice
 spec:
@@ -632,4 +631,4 @@ spec:
 - [Allocate devices to workloads using DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra/)
 - For more information on the design, see the
   [Dynamic Resource Allocation with Structured Parameters](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters)
-  KEP.
+  KEP.
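The hunks above make two changes to the example: the apiVersion moves to `resource.k8s.io/v1`, and the v1beta `basic:` wrapper around each device's attributes is dropped. Assembled from the hunks, the complete v1 ResourceSlice would look roughly like this sketch (the `pool` sub-fields are assumptions based on the surrounding prose, not shown in the diff):

```yaml
# Sketch of the full example after this commit. The driver and pool names
# come from the diff's context lines; pool.generation and
# pool.resourceSliceCount are assumed typical values.
apiVersion: resource.k8s.io/v1
kind: ResourceSlice
metadata:
  name: cat-slice
spec:
  driver: resource-driver.example.com
  pool:
    name: black-cat-pool
    generation: 1
    resourceSliceCount: 1
  allNodes: true
  devices:
  - name: "large-black-cat"
    attributes:
      color:
        string: "black"
      size:
        string: "large"
      cat:
        boolean: true
```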

content/en/docs/reference/command-line-tools-reference/feature-gates/DynamicResourceAllocation.md
Lines changed: 6 additions & 1 deletion

@@ -13,8 +13,13 @@ stages:
 - stage: beta
   defaultValue: false
   fromVersion: "1.32"
+  toVersion: "1.33"
+- stage: stable
+  defaultValue: true
+  locked: false
+  fromVersion: "1.34"
 
-# TODO: as soon as this is locked to "true" (= GA), comments about other DRA
+# TODO: as soon as this is locked to "true" (= some time after GA, *not* yet in 1.34), comments about other DRA
 # feature gate(s) like "unless you also enable the `DynamicResourceAllocation` feature gate"
 # can be removed (for example, in dra-admin-access.md).
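After this change, the gate's `stages` list records a closed beta stage and an unlocked stable stage, meaning the gate defaults to on in 1.34 but can still be turned off. A sketch of the resulting list (the alpha entry is an assumption based on DRA's history and is not part of this diff):

```yaml
# Sketch of the complete stages list after this commit; the alpha stage
# shown here is assumed, only the beta and stable entries appear in the hunk.
stages:
- stage: alpha
  defaultValue: false
  fromVersion: "1.26"
  toVersion: "1.31"
- stage: beta
  defaultValue: false
  fromVersion: "1.32"
  toVersion: "1.33"
- stage: stable
  defaultValue: true
  locked: false          # still possible to disable; locking comes after GA
  fromVersion: "1.34"
```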

content/en/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra.md
Lines changed: 16 additions & 2 deletions

@@ -1,7 +1,7 @@
 ---
 title: Allocate Devices to Workloads with DRA
 content_type: task
-min-kubernetes-server-version: v1.32
+min-kubernetes-server-version: v1.34
 weight: 20
 ---
 {{< feature-state feature_gate_name="DynamicResourceAllocation" >}}
@@ -157,6 +157,20 @@ claims in different containers.
    kubectl apply -f https://k8s.io/examples/dra/dra-example-job.yaml
    ```
 
+Try the following troubleshooting steps:
+
+1. When the workload does not start as expected, drill down from Job
+   to Pods to ResourceClaims and check the objects
+   at each level with `kubectl describe` to see whether there are any
+   status fields or events which might explain why the workload is
+   not starting.
+1. When creating a Pod fails with `must specify one of: resourceClaimName,
+   resourceClaimTemplateName`, check that all entries in `pod.spec.resourceClaims`
+   have exactly one of those fields set. If they do, then it is possible
+   that the cluster has a mutating Pod webhook installed which was built
+   against APIs from Kubernetes < 1.32. Work with your cluster administrator
+   to check this.
+
 ## Clean up {#clean-up}
 
 To delete the Kubernetes objects that you created in this task, follow these
@@ -183,4 +197,4 @@ steps:
 
 ## {{% heading "whatsnext" %}}
 
-* [Learn more about DRA](/docs/concepts/scheduling-eviction/dynamic-resource-allocation)
+* [Learn more about DRA](/docs/concepts/scheduling-eviction/dynamic-resource-allocation)
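The new troubleshooting text added in this file requires that each entry in `pod.spec.resourceClaims` sets exactly one of `resourceClaimName` or `resourceClaimTemplateName`. A sketch of a valid Pod (names and image are hypothetical):

```yaml
# Sketch: each resourceClaims entry sets exactly one of the two fields.
# Pod name, image, and claim names are illustrative, not from the diff.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  resourceClaims:
  - name: gpu                    # referenced by the container below
    resourceClaimTemplateName: example-resource-claim-template
    # resourceClaimName: example-resource-claim   # alternative; never set both
  containers:
  - name: app
    image: registry.example/app:latest
    resources:
      claims:
      - name: gpu
```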

content/en/docs/tasks/configure-pod-container/assign-resources/set-up-dra-cluster.md
Lines changed: 42 additions & 48 deletions

@@ -1,7 +1,7 @@
 ---
 title: "Set Up DRA in a Cluster"
 content_type: task
-min-kubernetes-server-version: v1.32
+min-kubernetes-server-version: v1.34
 weight: 10
 ---
 {{< feature-state feature_gate_name="DynamicResourceAllocation" >}}
@@ -37,30 +37,20 @@ For details, see
 
 <!-- steps -->
 
-## Enable the DRA API groups {#enable-dra}
+## Optional: enable legacy DRA API groups {#enable-dra}
 
-To let Kubernetes allocate resources to your Pods with DRA, complete the
-following configuration steps:
+DRA graduated to stable in Kubernetes 1.34 and is enabled by default.
+Some older DRA drivers or workloads might still need the
+v1beta1 API from Kubernetes 1.30 or v1beta2 from Kubernetes 1.32.
+If and only if support for those is desired, then enable the following
+{{< glossary_tooltip text="API groups" term_id="api-group" >}}:
+
+* `resource.k8s.io/v1beta1`
+* `resource.k8s.io/v1beta2`
+
+For more information, see
+[Enabling or disabling API groups](/docs/reference/using-api/#enabling-or-disabling).
 
-1. Enable the `DynamicResourceAllocation`
-   [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
-   on all of the following components:
-
-   * `kube-apiserver`
-   * `kube-controller-manager`
-   * `kube-scheduler`
-   * `kubelet`
-
-1. Enable the following
-   {{< glossary_tooltip text="API groups" term_id="api-group" >}}:
-
-   * `resource.k8s.io/v1beta1`: required for DRA to function.
-   * `resource.k8s.io/v1beta2`: optional, recommended improvements to the user
-     experience.
-
-   For more information, see
-   [Enabling or disabling API groups](/docs/reference/using-api/#enabling-or-disabling).
-
 ## Verify that DRA is enabled {#verify}
 
 To verify that the cluster is configured correctly, try to list DeviceClasses:
@@ -81,15 +71,15 @@ similar to the following:
 ```
 error: the server doesn't have a resource type "deviceclasses"
 ```
+
 Try the following troubleshooting steps:
 
-1. Ensure that the `kube-scheduler` component has the `DynamicResourceAllocation`
-   feature gate enabled *and* uses the
-   [v1 configuration API](/docs/reference/config-api/kube-scheduler-config.v1/).
-   If you use a custom configuration, you might need to perform additional steps
-   to enable the `DynamicResource` plugin.
-1. Restart the `kube-apiserver` component and the `kube-controller-manager`
-   component to propagate the API group changes.
+1. Reconfigure and restart the `kube-apiserver` component.
+
+1. If the complete `.spec.resourceClaims` field gets removed from Pods, or if
+   Pods get scheduled without considering the ResourceClaims, then verify
+   that the `DynamicResourceAllocation` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is not turned off
+   for kube-apiserver, kube-controller-manager, kube-scheduler or the kubelet.
 
 ## Install device drivers {#install-drivers}
 
@@ -112,6 +102,12 @@ cluster-1-device-pool-1-driver.example.com-lqx8x cluster-1-node-1 driver
 cluster-1-device-pool-2-driver.example.com-29t7b cluster-1-node-2 driver.example.com cluster-1-device-pool-2-446z 8s
 ```
 
+Try the following troubleshooting steps:
+
+1. Check the health of the DRA driver and look for error messages about
+   publishing ResourceSlices in its log output. The vendor of the driver
+   may have further instructions about installation and troubleshooting.
+
 ## Create DeviceClasses {#create-deviceclasses}
 
 You can define categories of devices that your application operators can
@@ -135,27 +131,25 @@ operators.
 The output is similar to the following:
 
 ```yaml
-apiVersion: resource.k8s.io/v1beta1
+apiVersion: resource.k8s.io/v1
 kind: ResourceSlice
 # lines omitted for clarity
 spec:
   devices:
-  - basic:
-      attributes:
-        type:
-          string: gpu
-      capacity:
-        memory:
-          value: 64Gi
-    name: gpu-0
-  - basic:
-      attributes:
-        type:
-          string: gpu
-      capacity:
-        memory:
-          value: 64Gi
-    name: gpu-1
+  - attributes:
+      type:
+        string: gpu
+    capacity:
+      memory:
+        value: 64Gi
+    name: gpu-0
+  - attributes:
+      type:
+        string: gpu
+    capacity:
+      memory:
+        value: 64Gi
+    name: gpu-1
   driver: driver.example.com
   nodeName: cluster-1-node-1
 # lines omitted for clarity
@@ -186,4 +180,4 @@ kubectl delete -f https://k8s.io/examples/dra/deviceclass.yaml
 ## {{% heading "whatsnext" %}}
 
 * [Learn more about DRA](/docs/concepts/scheduling-eviction/dynamic-resource-allocation)
-* [Allocate Devices to Workloads with DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra)
+* [Allocate Devices to Workloads with DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra)
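The rewritten "Optional: enable legacy DRA API groups" section tells readers to enable `resource.k8s.io/v1beta1` and `resource.k8s.io/v1beta2` only when older drivers need them. On a kubeadm-managed cluster, that might be configured roughly as follows (a sketch assuming kubeadm; other installers pass the `--runtime-config` flag to kube-apiserver differently):

```yaml
# Sketch, assuming a kubeadm cluster: serve the legacy DRA API versions
# alongside v1 via the kube-apiserver --runtime-config flag.
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
apiServer:
  extraArgs:
  - name: runtime-config
    value: "resource.k8s.io/v1beta1=true,resource.k8s.io/v1beta2=true"
```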

content/en/examples/dra/deviceclass.yaml
Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-apiVersion: resource.k8s.io/v1beta2
+apiVersion: resource.k8s.io/v1
 kind: DeviceClass
 metadata:
   name: example-device-class
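Only the first four lines of the example file appear in this hunk. A complete v1 DeviceClass built around them might look like this sketch (the selector expression and driver name are hypothetical, not part of the diff):

```yaml
# Sketch of a full v1 DeviceClass; the CEL selector below is an assumed
# illustration, the diff only changes the apiVersion line.
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: example-device-class
spec:
  selectors:
  - cel:
      expression: device.driver == "driver.example.com"
```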

content/en/examples/dra/resourceclaim.yaml
Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-apiVersion: resource.k8s.io/v1beta2
+apiVersion: resource.k8s.io/v1
 kind: ResourceClaim
 metadata:
   name: example-resource-claim
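As above, the hunk only touches the apiVersion. For context, a complete v1 ResourceClaim requesting one device could look like this sketch (the request name, class, and counts are illustrative assumptions):

```yaml
# Sketch of a full v1 ResourceClaim; spec fields below are assumed for
# illustration, only the apiVersion change is part of this commit.
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: example-resource-claim
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: example-device-class
        allocationMode: ExactCount
        count: 1
```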

content/en/examples/dra/resourceclaimtemplate.yaml
Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-apiVersion: resource.k8s.io/v1beta2
+apiVersion: resource.k8s.io/v1
 kind: ResourceClaimTemplate
 metadata:
   name: example-resource-claim-template
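A ResourceClaimTemplate wraps a claim spec so that each Pod referencing it gets its own ResourceClaim. A sketch of a complete v1 template in the same style (spec fields are illustrative assumptions; the commit only changes the apiVersion):

```yaml
# Sketch of a full v1 ResourceClaimTemplate; the nested spec below is
# assumed for illustration.
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: example-resource-claim-template
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: example-device-class
```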
