Record EPP NormalizedTimePerOutputToken metric on streaming mode #1706
✅ Deploy Preview for gateway-api-inference-extension ready!
/cc @delavet
@dharaneeshvrd: GitHub didn't allow me to request PR reviews from the following users: delavet. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.
Hey @dharaneeshvrd! Thanks for the PR, do you mind adding this metric to our hermetic tests to validate the behavior?
462c169 to e652c5a
e652c5a to 46c873a
Update e2e/epp/e2e_test & integration/epp/hermetic_test to validate inference_objective_normalized_time_per_output_token_seconds metric
Signed-off-by: Dharaneeshwaran Ravichandran <[email protected]>
46c873a to 6c7ce3e
@kfswain Updated the hermetic test. PTAL!
@kfswain Can you please review this PR when you get a chance?
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: danehans, dharaneeshvrd
…ernetes-sigs#1706)
Update e2e/epp/e2e_test & integration/epp/hermetic_test to validate inference_objective_normalized_time_per_output_token_seconds metric
Signed-off-by: Dharaneeshwaran Ravichandran <[email protected]>
What type of PR is this?
/kind bug
/kind failing-test
What this PR does / why we need it:
Record the NormalizedTimePerOutputToken metric in EPP in streaming mode. The e2e EPP test expects this metric to be present.
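For readers unfamiliar with the metric, here is a minimal sketch of the kind of calculation involved. It assumes the value is the end-to-end request latency divided by the number of output tokens, and that in streaming mode the token count is only known once the final chunk has arrived. The helper `normalizedTimePerOutputToken` is hypothetical and is not the EPP code touched by this PR.

```go
package main

import (
	"fmt"
	"time"
)

// normalizedTimePerOutputToken is a hypothetical helper (not the EPP API).
// It divides the request latency in seconds by the number of generated
// tokens and reports false when the inputs cannot yield a meaningful value,
// so the caller can skip recording the observation.
func normalizedTimePerOutputToken(elapsedSeconds float64, outputTokens int) (float64, bool) {
	if elapsedSeconds < 0 || outputTokens <= 0 {
		return 0, false
	}
	return elapsedSeconds / float64(outputTokens), true
}

func main() {
	// Assumption: for a streamed response, the timestamps and the token
	// count are available only after the last chunk has been processed.
	received := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	completed := received.Add(2 * time.Second)
	outputTokens := 4

	elapsed := completed.Sub(received).Seconds()
	if v, ok := normalizedTimePerOutputToken(elapsed, outputTokens); ok {
		// A real implementation would observe v on a histogram such as
		// inference_objective_normalized_time_per_output_token_seconds.
		fmt.Printf("normalized time per output token: %.3f s\n", v)
	}
}
```

Returning a boolean instead of recording a zero keeps requests with no reported output tokens from skewing the distribution.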
Which issue(s) this PR fixes:
Fixes #939
Does this PR introduce a user-facing change?: