Contributor

@LuyuZhang00 LuyuZhang00 commented Sep 22, 2025

What type of PR is this?

/kind documentation

What this PR does / why we need it

Add HPA docs.

Which issue(s) this PR fixes

Fixes #652

Special notes for your reviewer

cc @Edwinhr716

Does this PR introduce a user-facing change?

No

@k8s-ci-robot added the kind/documentation label (categorizes issue or PR as related to documentation) on Sep 22, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LuyuZhang00
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) labels on Sep 22, 2025
@k8s-ci-robot
Contributor

Hi @LuyuZhang00. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) on Sep 22, 2025
@netlify

netlify bot commented Sep 22, 2025

Deploy Preview for kubernetes-sigs-lws ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | c1e9ac6 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/kubernetes-sigs-lws/deploys/68d1529cfe1aec000835a470 |
| 😎 Deploy Preview | https://deploy-preview-663--kubernetes-sigs-lws.netlify.app |

@Edwinhr716
Contributor

/ok-to-test

@k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on Sep 22, 2025

Before setting up HPA, ensure you have:

### 1. Install Metrics Server
Contributor

Is metrics server provided by most cloud providers? I've never had to do this step.

Contributor Author

This is a note: if users already have Metrics Server installed, they can safely ignore this step. I included it for my initial test.

For open-source Kubernetes:
By default, installations like kubeadm, kind, or minikube don't include the Metrics Server. The quickest way to get started with kubectl top or HPA is to install it manually.

For managed Kubernetes services (like GKE, EKS, AKS, etc.):
Most cloud providers include Metrics Server out of the box in their managed offerings, so it's usually already there and good to go for things like HPA and kubectl top.

Contributor

Makes sense, I'm fine with keeping this section. I think we should have a note saying: if using a managed Kubernetes service, skip this step


When using HPA with LeaderWorkerSet:

- HPA monitors **leader pods** only (not worker pods)
Contributor

Would be good to add that model server metrics often take into account the whole group

Contributor Author

@LuyuZhang00 LuyuZhang00 Sep 23, 2025

I tested adding CPU load to both the leader pod and the worker pods separately. In practice, the HPA only took effect when CPU load was increased on the leader pod. I noticed that in the updateStatus function within pkg/controllers/leaderworkerset_controller.go, it also monitors the leader pod. This seems like a potential area for improvement.
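
For reference, a CPU-based HPA of the kind described in this test could look roughly like the sketch below. It is an illustration rather than the manifest from the PR: the object names, replica bounds, and the 50% target are assumptions, and a `Utilization` target only works if the leader pod's containers declare CPU requests.

```yaml
# Sketch only: names, replica bounds, and the utilization target are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lws-cpu-hpa                  # hypothetical HPA name
spec:
  # Point the HPA at the LeaderWorkerSet itself; each replica is one leader/worker group.
  scaleTargetRef:
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    name: leaderworkerset-sample     # hypothetical LeaderWorkerSet name
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization            # percentage of the containers' CPU requests
        averageUtilization: 50
```

With a manifest like this, load on worker pods alone would not move the measured utilization, which matches the behavior observed above.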

Contributor

Yes, when using metrics exposed by the nodes themselves, it will only take the leader into account. My point was that we should add that model servers, such as vLLM, expose metrics that take into account the whole group.

> This seems like a potential area for improvement.

I agree, but it is a complex area. We would need a way to aggregate the metrics of a group of pods, which should be covered by the revised Gang Scheduling KEP https://docs.google.com/document/d/1ulO5eUnAsBWzqJdk_o5L-qdq5DIVwGcE7gWzCQ80SCM/edit?pli=1&tab=t.0.

An idea I was thinking of earlier was to make it configurable whether HPA targets the leader pod or a worker pod, since there are deployment patterns where the leader only runs a proxy server, so it isn't representative of when the accelerators in use have reached their limit.
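
To illustrate the model-server point above, scaling on a group-wide metric reported by the leader could be sketched as follows. This assumes a custom metrics adapter (for example, Prometheus Adapter) is installed and exposes a per-pod metric; the metric name `vllm_num_requests_waiting`, the object names, and the target value are all hypothetical and depend on how the adapter is configured.

```yaml
# Sketch only: requires a custom metrics adapter; the metric name below is a
# hypothetical adapter-exposed name, not something defined by this PR.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lws-vllm-hpa                 # hypothetical HPA name
spec:
  scaleTargetRef:
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    name: vllm                       # hypothetical LeaderWorkerSet name
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Pods
    pods:
      metric:
        name: vllm_num_requests_waiting   # hypothetical per-pod custom metric
      target:
        type: AverageValue
        averageValue: "5"            # average across the pods the HPA selects
```

Because the HPA only reads the leader pods, this pattern is only meaningful when the model server on the leader reports load for the whole group.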

@LuyuZhang00 changed the title from "docs: add hpa docs" to "[Documentation] Add HPA example to LWS site" on Sep 23, 2025
### Memory-based Scaling

You can also configure HPA to scale based on memory utilization:
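
The manifest from the PR is not reproduced in this conversation. A minimal sketch of a memory-based HPA, with hypothetical names, replica bounds, and a 70% target, might look like this (memory `Utilization` is computed against the containers' memory requests, so those must be set):

```yaml
# Sketch only: names, replica bounds, and the utilization target are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lws-memory-hpa               # hypothetical HPA name
spec:
  scaleTargetRef:
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    name: leaderworkerset-sample     # hypothetical LeaderWorkerSet name
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization            # percentage of the containers' memory requests
        averageUtilization: 70
```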

Member

Hi @LuyuZhang00

Considering other examples, their YAML files are placed in the https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/sample directory rather than within the markdown files.

Should we create an HPA directory under docs/examples and place the YAML files there? This would make maintenance easier.

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/documentation Categorizes issue or PR as related to documentation. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Documentation] Add HPA example to LWS site

4 participants