[Documentation] Add HPA example to LWS site #663

LuyuZhang00 · 2025-09-22T13:43:53Z

What type of PR is this?

/kind documentation

What this PR does / why we need it

Adds an HPA example page to the LWS documentation site.

Which issue(s) this PR fixes

Fixes #652

Special notes for your reviewer

cc @Edwinhr716

Does this PR introduce a user-facing change?

No

k8s-ci-robot · 2025-09-22T13:43:59Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LuyuZhang00
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

site/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2025-09-22T13:44:03Z

Hi @LuyuZhang00. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

netlify · 2025-09-22T13:44:59Z

✅ Deploy Preview for kubernetes-sigs-lws ready!

Name	Link
🔨 Latest commit	`c1e9ac6`
🔍 Latest deploy log	https://app.netlify.com/projects/kubernetes-sigs-lws/deploys/68d1529cfe1aec000835a470
😎 Deploy Preview	https://deploy-preview-663--kubernetes-sigs-lws.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.

Edwinhr716 · 2025-09-22T16:39:23Z

/ok-to-test

Edwinhr716 · 2025-09-22T16:47:52Z

site/content/en/docs/examples/hpa.md

+
+Before setting up HPA, ensure you have:
+
+### 1. Install Metrics Server


Edwinhr716 (Sep 22, 2025): Is Metrics Server provided by most cloud providers? I've never had to do this step.

LuyuZhang00 (Sep 23, 2025): This step is just a note. Users who already have Metrics Server installed can safely skip it; I included it for my initial test.

For open-source Kubernetes:
By default, installations such as kubeadm, kind, or minikube don't include Metrics Server. The quickest way to get started with kubectl top or HPA is to install it manually.

For managed Kubernetes services (GKE, EKS, AKS, etc.):
Most cloud providers include Metrics Server out of the box in their managed offerings, so it is usually already available for HPA and kubectl top.

Edwinhr716 (Sep 23, 2025): Makes sense, I'm fine with keeping this section. I think we should add a note saying: if you are using a managed Kubernetes service, skip this step.

Edwinhr716 · 2025-09-22T16:50:06Z

site/content/en/docs/examples/hpa.md

+
+When using HPA with LeaderWorkerSet:
+
+- HPA monitors **leader pods** only (not worker pods)


Edwinhr716 (Sep 22, 2025): It would be good to add that model server metrics often take the whole group into account.

LuyuZhang00 (Sep 23, 2025): I tested adding CPU load to the leader pod and to the worker pods separately. In practice, the HPA only reacted when CPU load increased on the leader pod. I noticed that the updateStatus function in pkg/controllers/leaderworkerset_controller.go also monitors the leader pod. This seems like a potential area for improvement.

Edwinhr716 (Sep 23, 2025): Yes, when using metrics exposed by the nodes themselves, HPA will only take the leader into account. My point was that we should add that model servers, such as vLLM, expose metrics that take the whole group into account.

> This seems like a potential area for improvement.

I agree, but it is a complex area. We would need a way to aggregate the metrics of a group of pods, which should be covered by the revised Gang Scheduling KEP https://docs.google.com/document/d/1ulO5eUnAsBWzqJdk_o5L-qdq5DIVwGcE7gWzCQ80SCM/edit?pli=1&tab=t.0.

An idea I had earlier was to make it configurable whether HPA targets a leader pod or a worker pod, since there are deployment patterns where the leader only runs a proxy server, so it isn't representative of when the accelerators in use have reached their limit.
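
To illustrate the point about model server metrics, the following is a minimal sketch of an HPA that scales a LeaderWorkerSet on a per-pod custom metric instead of CPU. It assumes a custom metrics adapter (for example, Prometheus Adapter) is installed and serves the metric through the custom metrics API. The metric name `vllm_num_requests_waiting`, the target value, the replica bounds, and the object names are hypothetical and are not taken from the PR.

```yaml
# Sketch only: names, bounds, and the metric are illustrative assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lws-custom-metric-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    name: leaderworkerset-sample       # hypothetical LWS name
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Pods
    pods:
      metric:
        # Hypothetical metric name as exposed by the adapter; the actual
        # name depends on the model server and the adapter rules.
        name: vllm_num_requests_waiting
      target:
        type: AverageValue
        averageValue: "5"              # scale out above ~5 queued requests per leader
```

As discussed in this thread, the HPA reads the metric from the leader pods only, so this approach reflects the whole group only when the model server on the leader reports group-wide load.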

yankay · 2025-09-24T02:26:10Z

site/content/en/docs/examples/hpa.md

+### Memory-based Scaling
+
+You can also configure HPA to scale based on memory utilization:
+


Hi @LuyuZhang00,

The other examples keep their YAML files in the https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/sample directory rather than inline in the markdown files.

Should we create an HPA directory under docs/examples and place the YAML files there? This would make maintenance easier.
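
For reference, a standalone manifest of the kind suggested here might look like the following sketch of memory-based scaling. The object names, replica bounds, and the 80% target are illustrative assumptions, not contents of the PR. A Utilization target is computed relative to the containers' resource requests, so the leader pod template must set memory requests for this to work.

```yaml
# Sketch only: names, bounds, and the threshold are illustrative assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lws-memory-hpa                 # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    name: leaderworkerset-sample       # hypothetical LWS name
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80         # percent of requested memory, averaged over leader pods
```

Keeping the manifest in its own file would let the markdown page link to it instead of embedding the YAML inline.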

add hpa docs

c1e9ac6

k8s-ci-robot added the kind/documentation Categorizes issue or PR as related to documentation. label Sep 22, 2025

k8s-ci-robot requested review from ahg-g and kerthcet September 22, 2025 13:43

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 22, 2025

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 22, 2025

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 22, 2025

Edwinhr716 reviewed Sep 22, 2025


LuyuZhang00 changed the title ~~docs: add hpa docs~~ [Documentation] Add HPA example to LWS site Sep 23, 2025

yankay reviewed Sep 24, 2025


4 participants