-
Notifications
You must be signed in to change notification settings - Fork 113
[Documentation] Add HPA example to LWS site #663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: LuyuZhang00 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Hi @LuyuZhang00. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
✅ Deploy Preview for kubernetes-sigs-lws ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
/ok-to-test |
|
|
||
| Before setting up HPA, ensure you have: | ||
|
|
||
| ### 1. Install Metrics Server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is metrics server provided by most cloud providers? I've never had to do this step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a note, if user already have installed, they can safely ignore this step, I included it for my initial test
For open-source Kubernetes:
By default, installations like kubeadm, kind, or minikube don't include the Metrics Server. The quickest way to get started with kubectl topor HPA is to install it manually.
For managed Kubernetes services (like GKE, EKS, AKS, etc.):
Most cloud providers include Metrics Server out of the box in their managed offerings—so it’s usually already there and good to go for things like HPA and kubectl top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I'm fine with keeping this section. I think we should have a note saying: if using a managed Kubernetes service, skip this step
|
|
||
| When using HPA with LeaderWorkerSet: | ||
|
|
||
| - HPA monitors **leader pods** only (not worker pods) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to add that model server metrics often take into account the whole group
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested adding CPU load to both the leader pod and the worker pods separately. In practice, the HPA only took effect when CPU load was increased on the leader pod. I noticed that in the updateStatus function within pkg/controllers/leaderworkerset_controller.go, it also monitors the leader pod. This seems like a potential area for improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, when using metrics exposed by the nodes themselves, it will only take the leader into account. My point was that we should add that model servers, such as vLLM, expose metrics that take into account the whole group.
This seems like a potential area for improvement.
I agree, but it is a complex area. We would need a way to aggregate the metrics of a group of pods, which should be covered by the revised Gang Scheduling KEP https://docs.google.com/document/d/1ulO5eUnAsBWzqJdk_o5L-qdq5DIVwGcE7gWzCQ80SCM/edit?pli=1&tab=t.0.
An idea I was thinking of earlier was to be able to change whether to target a leader pod or a worker pod for HPA, since there are deployment patterns where the leader only has a proxy server, so it isn't representative of when the accelerators being used have reached its limit.
| ### Memory-based Scaling | ||
|
|
||
| You can also configure HPA to scale based on memory utilization: | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @LuyuZhang00
Considering other examples, their YAML files are placed in the https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/sample directory rather than within the markdown files.
Should we create an HPA directory under docs/examples and place the YAML files there? This would make maintenance easier.
What type of PR is this?
/kind documentation
What this PR does / why we need it
add hpa docs
Which issue(s) this PR fixes
Fixes #652
Special notes for your reviewer
cc @Edwinhr716
Does this PR introduce a user-facing change?
No