AEP-8459: MemoryPerCPU Enforce #8459
base: master
Changes from all commits
61e4f32
1766fc2
f4b20b6
5fed004
5bf11a4
35b9ab3
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,162 @@ | ||
# AEP-8459: MemoryPerCPU | ||
|
||
<!-- toc --> | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [Design Details](#design-details) | ||
- [API Changes](#api-changes) | ||
- [Behavior](#behavior) | ||
- [Feature Enablement and Rollback](#feature-enablement-and-rollback) | ||
- [How can this feature be enabled / disabled in a live cluster?](#how-can-this-feature-be-enabled--disabled-in-a-live-cluster) | ||
- [Kubernetes Version Compatibility](#kubernetes-version-compatibility) | ||
- [Validation](#validation) | ||
- [Test Plan](#test-plan) | ||
- [Implementation History](#implementation-history) | ||
- [Future Work](#future-work) | ||
- [Alternatives](#alternatives) | ||
<!-- /toc --> | ||
|
||
## Summary | ||
|
||
This AEP proposes a new feature to allow enforcing a fixed memory-per-CPU ratio (`memoryPerCPU`) in Vertical Pod Autoscaler (VPA) recommendations. | ||
The feature is controlled by a new alpha feature gate `MemoryPerCPURatio` (default off). | ||
|
||
## Motivation | ||
|
||
Many workloads scale their memory requirements proportionally to CPU, but today VPA generates independent CPU and memory recommendations. This can lead to skewed configurations: for example, too much memory for a small CPU allocation, or too little memory for a high CPU allocation. | ||
|
||
The `memoryPerCPU` field addresses this by enforcing a predictable CPU-to-memory ratio in recommendations. This reduces the risk of misconfiguration, ensures consistency, and simplifies tuning for workloads where CPU and memory usage are tightly coupled. | ||
|
||
This feature is particularly useful in environments where services are billed primarily on CPU with a fixed CPU-to-memory ratio. In such cases, it allows VPA to be used for automatic vertical scaling while preserving the existing billing model and guarantees to customers. | ||
|
||
### Goals | ||
|
||
* Allow users to specify a `memoryPerCPU` ratio in `VerticalPodAutoscaler` objects. | ||
* Ensure VPA recommendations respect the ratio across Target, LowerBound, UpperBound, and UncappedTarget. | ||
* Provide a feature gate to enable/disable the feature cluster-wide. | ||
|
||
### Non-Goals | ||
|
||
* Redesign of the VPA recommender algorithm beyond enforcing the ratio. | ||
* Supporting multiple ratio policies per container (only one `memoryPerCPU` is supported). | ||
* Retroactive migration of existing VPAs without explicit user opt-in. | ||
|
||
## Proposal | ||
|
||
Extend `ContainerResourcePolicy` with a new optional field: | ||
|
||
```yaml | ||
apiVersion: autoscaling.k8s.io/v1 | ||
kind: VerticalPodAutoscaler | ||
metadata: | ||
name: my-app | ||
spec: | ||
resourcePolicy: | ||
containerPolicies: | ||
- containerName: app | ||
minAllowed: | ||
cpu: 1 | ||
memory: 4Gi | ||
maxAllowed: | ||
cpu: 4 | ||
memory: 16Gi | ||
controlledResources: ["cpu", "memory"] | ||
controlledValues: RequestsAndLimits | ||
memoryPerCPU: "4Gi" | ||
``` | ||
|
||
When enabled, VPA will adjust CPU or memory recommendations to maintain: | ||
|
||
``` | ||
memory_bytes = cpu_cores * memoryPerCPU | ||
``` | ||
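As a worked illustration of the formula above, the following sketch computes the memory implied by a CPU value. It assumes CPU is tracked in millicores and memory in bytes; the helper name `memory_for_cpu` is hypothetical and not part of the VPA code base.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def memory_for_cpu(cpu_millicores: int, memory_per_cpu_bytes: int) -> int:
    """Return memory_bytes = cpu_cores * memoryPerCPU using integer math.

    CPU is expressed in millicores (1000m = 1 core), so the product is
    divided by 1000. Hypothetical helper, for illustration only.
    """
    return cpu_millicores * memory_per_cpu_bytes // 1000


# 2 cores at 4Gi per core -> 8Gi
print(memory_for_cpu(2000, 4 * GIB) // GIB)
# 500m at 4Gi per core -> 2Gi
print(memory_for_cpu(500, 4 * GIB) // GIB)
```

Working in millicores and bytes keeps the arithmetic in integers, which avoids floating-point drift for fractional CPU values.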
|
||
## Design Details | ||
Oh yes, thanks. See commit: 5fed004 |
||
|
||
### API Changes | ||
|
||
* New field `memoryPerCPU` (`resource.Quantity`) in `ContainerResourcePolicy`. | ||
* Feature gate: `MemoryPerCPURatio` (alpha, default off). | ||
|
||
### Behavior | ||
|
||
* If both CPU and memory are controlled, VPA enforces the ratio. | ||
What if both (cpu and memory) are not specified? Should that be a validation error? It seems like we should enforce that if you don't specify both you get an error; this way we'll ensure that you either specify all the pieces of the puzzle or none.

Initially, my thinking was to simply ignore `memoryPerCPU` if either CPU or memory was not specified in `controlledResources`. But if the philosophy is rather to fail fast and return a validation error whenever `memoryPerCPU` is set without both CPU and memory being present, I’m fine with that approach too; I can update the AEP accordingly. |
||
* Applies to Target, LowerBound, UpperBound, and UncappedTarget. | ||
* Ratio enforcement is strict: | ||
* If the memory recommendation would exceed `cpu * memoryPerCPU`, then **CPU is increased** to satisfy the ratio. | ||
* If the CPU recommendation would exceed `memory / memoryPerCPU`, then **memory is increased** to satisfy the ratio. | ||
I'm inclined to say we should error out if the math doesn't work out with the CPU and memory values. Adjusting seems "magical", and I'd advise against it. Explicitness is always better.

I see your point; implicit adjustments can indeed feel “magical.” If we only validated and errored, users wouldn’t get the behavior they’re asking for (“always keep memory = cpu × memoryPerCPU”); they’d just see failures. Or maybe I didn’t fully understand your point? |
||
* If the ratio cannot be applied (e.g., the CPU recommendation is missing), VPA falls back to standard recommendations. | ||
* With the `MemoryPerCPURatio` feature gate disabled, the `memoryPerCPU` field is ignored and recommendations fall back to standard VPA behavior. | ||
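The behavior listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the recommender's implementation: the function name `enforce_ratio`, the millicore/byte units, the `gate_enabled` parameter, and the choice to round CPU up to a whole millicore are all hypothetical.

```python
from typing import Optional, Tuple

GIB = 1024 ** 3  # bytes in one gibibyte


def enforce_ratio(
    cpu_millicores: Optional[int],
    memory_bytes: Optional[int],
    memory_per_cpu_bytes: Optional[int],
    gate_enabled: bool = True,
) -> Tuple[Optional[int], Optional[int]]:
    """Adjust a (cpu, memory) recommendation so memory = cpu * memoryPerCPU."""
    # Fall back to the unmodified recommendation when the gate is off,
    # no ratio is configured, or one of the two resources is missing.
    if not gate_enabled or not memory_per_cpu_bytes:
        return cpu_millicores, memory_bytes
    if cpu_millicores is None or memory_bytes is None:
        return cpu_millicores, memory_bytes

    implied_memory = cpu_millicores * memory_per_cpu_bytes // 1000
    if memory_bytes > implied_memory:
        # Memory exceeds cpu * memoryPerCPU: increase CPU.
        # Assumption: round up to the next whole millicore (integer ceiling).
        cpu_millicores = (
            memory_bytes * 1000 + memory_per_cpu_bytes - 1
        ) // memory_per_cpu_bytes

    # CPU exceeds memory / memoryPerCPU (or CPU was just rounded up):
    # increase memory to the value implied by the final CPU.
    memory_bytes = max(memory_bytes, cpu_millicores * memory_per_cpu_bytes // 1000)
    return cpu_millicores, memory_bytes


# Baseline 1 CPU / 8Gi with memoryPerCPU = 4Gi: CPU is raised to 2 cores.
print(enforce_ratio(1000, 8 * GIB, 4 * GIB))
# Baseline 2 CPUs / 4Gi with memoryPerCPU = 4Gi: memory is raised to 8Gi.
print(enforce_ratio(2000, 4 * GIB, 4 * GIB))
```

Note that the sketch only ever increases a value, never decreases one, which matches the "CPU is increased" and "memory is increased" rules in the list.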
Comment on lines +84 to +92

Could you add some examples here to help users better understand how the algorithm behaves?

Done, I’ve added examples to clarify the behavior in commit 5fed004 |
||
|
||
> [!IMPORTANT] | ||
> The enforced ratio values will be capped by | ||
> [`--container-recommendation-max-allowed-cpu`](https://github.com/kubernetes/autoscaler/blob/4d294562e505431d518a81e8833accc0ec99c9b8/vertical-pod-autoscaler/pkg/recommender/main.go#L122) | ||
> and | ||
> [`--container-recommendation-max-allowed-memory`](https://github.com/kubernetes/autoscaler/blob/4d294562e505431d518a81e8833accc0ec99c9b8/vertical-pod-autoscaler/pkg/recommender/main.go#L123) | ||
> flag values, if set. | ||
|
||
#### Examples | ||
|
||
* Example 1: `memoryPerCPU = 4Gi` | ||
* Baseline recommendation: 1 CPU, 8Gi memory | ||
* UncappedTarget (ratio enforced): 2 CPUs, 8Gi | ||
* Target (after policy/caps): 2 CPUs, 8Gi | ||
|
||
* Example 2: `memoryPerCPU = 4Gi` | ||
* Baseline recommendation: 2 CPUs, 4Gi memory | ||
* UncappedTarget (ratio enforced): 2 CPUs, 8Gi | ||
* Target (after policy/caps): 2 CPUs, 8Gi | ||
|
||
* Example 3: `memoryPerCPU = 4Gi`, with `--container-recommendation-max-allowed-memory=7Gi` or with `maxAllowed.memory=7Gi` set in the VPA object | ||
* Baseline recommendation: 2 CPUs, 4Gi memory | ||
* UncappedTarget (ratio enforced): 2 CPUs, 8Gi | ||
* Target (capped): 2 CPUs, 7Gi ← memory capped by max-allowed-memory; ratio not fully satisfied | ||
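The capping step in Example 3 can be sketched as below. This assumes caps are applied after ratio enforcement, as the example describes; the helper names `cap_recommendation` and `ratio_satisfied` are hypothetical, and values are in millicores and bytes.

```python
from typing import Optional, Tuple

GIB = 1024 ** 3  # bytes in one gibibyte


def cap_recommendation(
    cpu_millicores: int,
    memory_bytes: int,
    max_cpu_millicores: Optional[int] = None,
    max_memory_bytes: Optional[int] = None,
) -> Tuple[int, int]:
    """Clamp a ratio-enforced recommendation to the configured maximums."""
    if max_cpu_millicores is not None:
        cpu_millicores = min(cpu_millicores, max_cpu_millicores)
    if max_memory_bytes is not None:
        memory_bytes = min(memory_bytes, max_memory_bytes)
    return cpu_millicores, memory_bytes


def ratio_satisfied(
    cpu_millicores: int, memory_bytes: int, memory_per_cpu_bytes: int
) -> bool:
    """Check memory_bytes == cpu_cores * memoryPerCPU without using floats."""
    return memory_bytes * 1000 == cpu_millicores * memory_per_cpu_bytes


# Example 3: UncappedTarget of 2 CPUs / 8Gi with a 7Gi memory cap.
cpu, memory = cap_recommendation(2000, 8 * GIB, max_memory_bytes=7 * GIB)
print(cpu, memory // GIB, ratio_satisfied(cpu, memory, 4 * GIB))
```

A check like `ratio_satisfied` could be used to surface cases where a cap breaks the ratio, for instance in a status condition or log line; whether VPA should report this is not specified in the proposal.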
|
||
### Feature Enablement and Rollback | ||
|
||
#### How can this feature be enabled / disabled in a live cluster? | ||
|
||
* Feature gate name: `MemoryPerCPURatio` | ||
* Default: Off (Alpha) | ||
* Components depending on the feature gate: | ||
* admission-controller | ||
* recommender | ||
|
||
**When enabled**: | ||
* VPA honors `memoryPerCPU` in recommendations. | ||
Comment on lines +128 to +129

And ignore memory recommendation ...

When the memory suggestion is greater than 4Gi per CPU, the CPU will be increased to respect the memoryPerCPU ratio. |
||
|
||
**When disabled**: | ||
* `memoryPerCPU` is ignored. | ||
* Recommendations behave as before. | ||
|
||
### Kubernetes Version Compatibility | ||
|
||
The `memoryPerCPU` feature requires VPA version 1.5.0 or higher. The feature is being introduced as alpha and will follow the standard Kubernetes feature gate graduation process: | ||
- Alpha: v1.5.0 (default off) | ||
- Beta: TBD (default on) | ||
- GA: TBD (default on) | ||
|
||
### Validation | ||
|
||
* `memoryPerCPU` must be > 0. | ||
* Value must be a valid `resource.Quantity` (e.g., `512Mi`, `4Gi`). | ||
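The two validation rules can be illustrated with a simplified parser. This is only a sketch: it accepts integer values with binary suffixes, whereas the real `resource.Quantity` grammar is broader (decimal suffixes, exponents, fractional values). The function name `parse_memory_per_cpu` is hypothetical.

```python
import re

# Simplified subset of the quantity grammar: a non-negative integer
# followed by an optional binary suffix. Not the real Quantity parser.
_BINARY_SUFFIXES = {
    "": 1,
    "Ki": 1024,
    "Mi": 1024 ** 2,
    "Gi": 1024 ** 3,
    "Ti": 1024 ** 4,
}
_QUANTITY_RE = re.compile(r"([0-9]+)(Ki|Mi|Gi|Ti)?")


def parse_memory_per_cpu(value: str) -> int:
    """Return memoryPerCPU in bytes, or raise ValueError if it is invalid."""
    match = _QUANTITY_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"memoryPerCPU {value!r} is not a valid quantity")
    number, suffix = match.groups()
    result = int(number) * _BINARY_SUFFIXES[suffix or ""]
    if result <= 0:
        raise ValueError("memoryPerCPU must be > 0")
    return result


print(parse_memory_per_cpu("512Mi"))
print(parse_memory_per_cpu("4Gi"))
```

Values such as `"0"`, `"-1Gi"`, or `"abc"` raise `ValueError` in this sketch, mirroring the "must be > 0" and "must be a valid quantity" rules.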
|
||
### Test Plan | ||
|
||
* Unit tests covering: | ||
  - the ratio enforcement logic, | ||
  - that values and validation are applied correctly with the feature gate both on and off. | ||
* E2E tests comparing behavior with different configurations. | ||
|
||
## Implementation History | ||
|
||
* 2025-08-19: Initial proposal | ||
|
||
## Future Work | ||
|
||
|
||
## Alternatives | ||
|
So in this case, VPA should skip the memory suggestion, right? We can also force this (for example, make sure `memoryPerCPU` and `controlledResources.memory` are mutually exclusive). I think just skipping is better. Just an idea.

What happens if the memory suggestion was greater than 4Gi per CPU? Surely it needs to be increased then?

I don't think so. You specify that for every CPU you will get 4Gi. The whole idea is to have a "hard-coded" CPU-memory ratio. As I wrote in the comment below, those use cases should be clear to the users.

The intent is that the ratio is strictly enforced in both directions.
Concretely: if the memory recommendation is higher than `cpu * memoryPerCPU`, then CPU will be increased accordingly. Likewise, if CPU is higher than `memory / memoryPerCPU`, then memory is increased.
I’ll update the AEP to make this explicit.

Thanks for the explanation. Could you also update the use cases as Omer had suggested, since it's not clear to me why someone may want this feature?

Thanks for the feedback! I just pushed a commit updating the Motivation section to clarify the use cases, as suggested.
In our specific case, we want to use VPA to automatically scale our customers' services vertically, but since we charge them based on CPU with a guaranteed CPU-to-memory ratio, we need VPA to respect this fixed ratio in its recommendations.

Why do you want to express the ratio as a `resource.Quantity` and not just a plain integer or float, if you're considering partials?

A different question, taking a step back: is there a possibility that users will be interested in providing a ratio for a different pair of resources? The answer will allow us to better match the name for this variable.

My initial reasoning was consistency with other VPA fields that already use `resource.Quantity`. But you’re right that semantically this field represents a ratio, not a resource amount. I’m fine switching to a float or plain integer if that’s the preferred approach.

I’m not 100% sure I understood the second question. What do you mean by "users will be interested in providing a ratio for different pair of resources"?