improve model for prefix cache score #1770

kaushikmitr · 2025-10-25T00:45:08Z

This pull request introduces several improvements to both the training and prediction servers for latency prediction, focusing on more granular feature engineering and data bucketing, especially around prefix cache score. The changes enhance model training and prediction accuracy by adding interaction features and refining how data is bucketed and processed. The most important updates are grouped below by theme.

Feature Engineering and Data Preparation:

Added a _prepare_features_with_interaction method to both prediction_server.py and training_server.py, which generates new interaction features (such as effective_input_tokens and a categorical prefill_score_bucket) for the TTFT model, improving model learning and prediction accuracy. [1] [2]
Updated the prediction methods (predict and predict_batch) to use these engineered features for both single and batch predictions, ensuring consistency with the training pipeline. [1] [2] [3]
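
For readers without the diff open, here is a minimal sketch of what such an interaction step could look like. The formulas are assumptions and are not taken from the PR: effective_input_tokens is modelled here as the share of the input not covered by the prefix cache, and NUM_PREFIX_BUCKETS, the input key names, and the helper names are hypothetical.

```python
from typing import Dict, List

# Hypothetical bucket count; the PR description does not state the real value.
NUM_PREFIX_BUCKETS = 4


def prepare_features_with_interaction(row: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of `row` with two engineered TTFT features added.

    Assumed (not confirmed by the PR) definitions:
      effective_input_tokens = (1 - prefix_cache_score) * input_token_length
      prefill_score_bucket   = equal-width bucket index of prefix_cache_score
    """
    # Clamp the score into [0, 1] so a noisy value cannot produce a bad bucket.
    score = min(max(row["prefix_cache_score"], 0.0), 1.0)
    features = dict(row)  # copy, so the caller's row is left untouched
    features["effective_input_tokens"] = (1.0 - score) * row["input_token_length"]
    # A score of exactly 1.0 would land in bucket NUM_PREFIX_BUCKETS, so cap it.
    features["prefill_score_bucket"] = min(
        int(score * NUM_PREFIX_BUCKETS), NUM_PREFIX_BUCKETS - 1
    )
    return features


def prepare_batch(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Apply the same transformation to every row of a batch request."""
    return [prepare_features_with_interaction(r) for r in rows]
```

Routing both the single and the batch path through one function is what keeps prediction-time features identical to the ones seen during training.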

Data Bucketing Enhancements:

Expanded data bucketing in the training server to include a third dimension based on prefix cache score, using a new _get_prefix_bucket method and updating bucket keys for both TTFT and TPOT data. This enables more granular sampling and storage. [1] [2] [3]
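
As an illustration of the three-part bucket key, the sketch below stores samples in bounded per-bucket queues. Only the idea of a third, prefix-cache-based dimension comes from the description above; the first two dimensions (queue_bucket, cache_bucket), the bucket count, and the per-bucket cap are hypothetical placeholders.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

PREFIX_BUCKETS = 4          # hypothetical number of prefix-score buckets
MAX_SAMPLES_PER_BUCKET = 3  # hypothetical cap, kept tiny for illustration

BucketKey = Tuple[int, int, int]


def get_prefix_bucket(prefix_cache_score: float) -> int:
    """Map a score in [0, 1] to an equal-width bucket index."""
    clamped = min(max(prefix_cache_score, 0.0), 1.0)
    return min(int(clamped * PREFIX_BUCKETS), PREFIX_BUCKETS - 1)


def bucket_key(queue_bucket: int, cache_bucket: int, prefix_cache_score: float) -> BucketKey:
    """Build the key: two pre-existing dimensions plus the prefix bucket."""
    return (queue_bucket, cache_bucket, get_prefix_bucket(prefix_cache_score))


# Each bucket is a bounded queue, so the oldest samples are evicted first.
buckets: Dict[BucketKey, Deque[dict]] = defaultdict(
    lambda: deque(maxlen=MAX_SAMPLES_PER_BUCKET)
)


def add_sample(sample: dict, queue_bucket: int, cache_bucket: int) -> BucketKey:
    """Store a sample under its three-part key and return that key."""
    key = bucket_key(queue_bucket, cache_bucket, sample["prefix_cache_score"])
    buckets[key].append(sample)
    return key
```

With a per-bucket cap, requests with rare prefix cache scores are not crowded out by the most common ones when training data is sampled.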

Model Training Improvements:

Modified the _train_model_with_scaling method to accept sample weighting and to drop the categorical prefill_score_bucket for Bayesian Ridge models, ensuring compatibility and improved training. [1] [2]
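
The column handling can be sketched as follows, presumably motivated by the fact that a bucket index has no meaningful linear interpretation for a Bayesian Ridge regressor. The "bayesian_ridge" string, the function name, and the row layout are hypothetical; the real method also performs scaling and fitting, which are omitted here.

```python
from typing import Dict, List, Tuple

CATEGORICAL_COLUMN = "prefill_score_bucket"


def select_training_columns(
    rows: List[Dict[str, float]], model_type: str
) -> Tuple[List[str], List[List[float]]]:
    """Return (column names, feature matrix) suited to the given model type.

    Assumption: the model type is identified by a plain string, and only the
    linear model needs the categorical bucket removed.
    """
    if not rows:
        return [], []
    drop = {CATEGORICAL_COLUMN} if model_type == "bayesian_ridge" else set()
    # Dict insertion order is preserved, so column order is stable across rows
    # as long as every row was built by the same feature function.
    columns = [name for name in rows[0] if name not in drop]
    matrix = [[row[name] for name in columns] for row in rows]
    return columns, matrix
```

The same selection would have to be applied on the prediction side, otherwise the model would receive a different number of columns than it was trained on.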

Configuration and Miscellaneous:

Added a new configuration flag SAMPLE_WEIGHTING_FOR_PREFIX_CACHE to the settings, allowing optional sample weighting based on prefix cache score.
Minor code cleanup and import fixes in training_server.py.
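
One way such a flag could gate sample weighting is sketched below. Reading the flag from an environment variable and weighting samples by inverse bucket frequency are both assumptions; the PR description only says that weighting is optional and based on prefix cache score.

```python
import os
from collections import Counter
from typing import List


def flag_enabled(name: str, default: str = "false") -> bool:
    """Interpret an environment variable as a boolean (assumed mechanism)."""
    return os.getenv(name, default).strip().lower() in ("1", "true", "yes")


def sample_weights(prefix_buckets: List[int], enabled: bool) -> List[float]:
    """Return one weight per sample.

    Disabled: every sample gets weight 1.0.
    Enabled:  inverse-frequency weights, so each prefix bucket contributes the
              same total weight and the weights sum to the sample count.
    """
    if not enabled or not prefix_buckets:
        return [1.0] * len(prefix_buckets)
    counts = Counter(prefix_buckets)
    total = len(prefix_buckets)
    return [total / (len(counts) * counts[b]) for b in prefix_buckets]


# Example: three samples in bucket 0 and one in bucket 3.
weights = sample_weights(
    [0, 0, 0, 3], enabled=flag_enabled("SAMPLE_WEIGHTING_FOR_PREFIX_CACHE", "true")
)
```

A list like this could be passed as the sample_weight argument that scikit-learn estimators accept in fit, which is the kind of hook the updated training method would need.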

These changes collectively improve the accuracy and flexibility of latency prediction by allowing the models to better capture the effects of prefix cache and input size, and by aligning feature engineering in both training and prediction workflows.

BenjaminBraunDev · 2025-10-27T22:38:44Z

LGTM. I have finished rebasing and moving the logic to the plugins, so once this is merged I can rebase on top of it and open the PR for that.

ahg-g · 2025-10-27T22:40:04Z

/lgtm
/approve

k8s-ci-robot · 2025-10-27T22:40:12Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, kaushikmitr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ahg-g]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

improve model for prefix cache score

e599fa2

k8s-ci-robot added the "cncf-cla: yes" label (indicates the PR's author has signed the CNCF CLA) Oct 25, 2025

k8s-ci-robot requested review from danehans and robscott October 25, 2025 00:45

k8s-ci-robot added the "size/XL" label (denotes a PR that changes 500-999 lines, ignoring generated files) Oct 25, 2025

k8s-ci-robot assigned ahg-g Oct 27, 2025

k8s-ci-robot added the "lgtm" label ("Looks good to me", indicates that a PR is ready to be merged) Oct 27, 2025

k8s-ci-robot added the "approved" label (indicates a PR has been approved by an approver from all required OWNERS files) Oct 27, 2025

k8s-ci-robot merged commit 60726b0 into kubernetes-sigs:slo-prediction-experimental Oct 27, 2025
5 checks passed

4 participants
