Conversation

ksmusz
Contributor

@ksmusz ksmusz commented Sep 3, 2025

Warming up the sampler with different configurations eliminates the recompilations of larger sampler graphs that would otherwise occur during actual execution. As tested with example workloads and batch sizes, the only remaining sampler recompilations come from minor graphs, which have minimal influence on execution time.

Sampler warmup takes around 1-3 seconds, depending on the batch sizes of the buckets being warmed up.
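
For illustration, here is a minimal sketch of what such a warmup loop could look like. The helper name, the sampler call signature, and the parameter combinations are hypothetical, chosen only to show the idea of pre-compiling one sampler graph per (batch size, sampling configuration) pair; the actual implementation lives in hpu_model_runner.py.

```python
import itertools

import torch

def warmup_sampler(sampler, bucket_batch_sizes, vocab_size, device="hpu"):
    # Representative sampling configurations: greedy plus a couple of
    # random-sampling variants exercising the temperature, top-p and top-k paths.
    sampling_configs = [
        dict(temperature=0.0, top_p=1.0, top_k=0),   # greedy
        dict(temperature=1.0, top_p=1.0, top_k=0),   # plain random sampling
        dict(temperature=0.7, top_p=0.9, top_k=50),  # temperature + top-p + top-k
    ]
    for batch_size, cfg in itertools.product(bucket_batch_sizes, sampling_configs):
        # Dummy logits are sufficient: only the tensor shape and the sampling
        # parameters determine which sampler graph gets compiled.
        dummy_logits = torch.zeros(batch_size, vocab_size, device=device)
        sampler(dummy_logits, **cfg)
```

The point is simply that every combination of bucketed batch size and sampling mode the sampler will see at runtime is exercised once up front, so compilation happens before serving rather than during it.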

Additionally, this removes the case where the warmup method was called twice (previously visible as duplicated prints during the warmup phase, but with empty lists of warmed-up buckets, since all of them had already been warmed up).
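
A minimal sketch of that guard, assuming a simple boolean flag on the worker (attribute and method names here are hypothetical, not the exact hpu_worker.py code):

```python
class HPUWorker:
    def __init__(self):
        # Hypothetical flag: remembers that warmup already ran, so a second
        # call becomes a no-op instead of re-printing empty bucket lists.
        self._graphs_compiled = False

    def warmup_model(self):
        if self._graphs_compiled:
            return
        # ... run sampler warmup, then model graph warmup ...
        self._graphs_compiled = True
```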

Signed-off-by: Krzysztof Smusz <[email protected]>
@ksmusz
Contributor Author

ksmusz commented Sep 3, 2025

/run-gaudi-tests

@sys-hab-pt-service
Collaborator

Only codeowners can request to run Gaudi tests. Contact list: kzawora-intel, xuechendi, mswiniarsk, adobrzyn

@adobrzyn
Collaborator

adobrzyn commented Sep 4, 2025

/run-gaudi-tests

Signed-off-by: Krzysztof Smusz <[email protected]>
@adobrzyn
Collaborator

adobrzyn commented Sep 4, 2025

/run-gaudi-tests

@adobrzyn adobrzyn requested a review from Copilot September 5, 2025 07:50

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces sampler warmup as a separate warmup step to reduce graph recompilations during model execution. The warmup tests various sampling configurations with different batch sizes and temperature/top-p/top-k values to pre-compile sampler graphs, reducing compilation overhead during actual inference.

Key changes:

  • Added separate sampler warmup phase before model graph warmup
  • Refactored sampling code to extract common functionality into reusable methods
  • Modified worker warmup conditions to prevent redundant warmups

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
vllm_gaudi/v1/worker/hpu_worker.py | Added condition to prevent redundant model warmup when graphs are already compiled
vllm_gaudi/v1/worker/hpu_model_runner.py | Added comprehensive sampler warmup functionality and refactored sampling code

@adobrzyn
Collaborator

adobrzyn commented Sep 5, 2025

/run-gaudi-tests

@adobrzyn
Collaborator

adobrzyn commented Sep 9, 2025

/run-gaudi-tests

@kzawora-intel kzawora-intel enabled auto-merge (squash) September 9, 2025 11:44
@kzawora-intel
Collaborator

/run-gaudi-tests

1 similar comment
@kzawora-intel
Collaborator

/run-gaudi-tests

@kzawora-intel kzawora-intel merged commit a9702fe into vllm-project:main Sep 9, 2025
7 checks passed
kfojcik-intel pushed a commit to kfojcik-intel/vllm-gaudi that referenced this pull request Sep 12, 2025