site-src/guides/metrics-and-observability.md (1 addition & 1 deletion)
@@ -35,7 +35,7 @@ This guide describes the current state of exposed metrics and how to scrape them
| inference_objective_request_total | Counter | The counter of requests broken out for each model. |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
| inference_objective_request_error_total | Counter | The counter of requests errors broken out for each model. |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
| inference_objective_request_duration_seconds | Distribution | Distribution of response latency. |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
-|normalized_time_per_output_token_seconds| Distribution | Distribution of ntpot (response latency per output token) |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
+|inference_objective_normalized_time_per_output_token_seconds| Distribution | Distribution of ntpot (response latency per output token) |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
| inference_objective_request_sizes | Distribution | Distribution of request size in bytes. |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
| inference_objective_response_sizes | Distribution | Distribution of response size in bytes. |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
| inference_objective_input_tokens | Distribution | Distribution of input token count. |`model_name`=<model-name> <br> `target_model_name`=<target-model-name>| ALPHA |
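
For orientation, here is a sketch of PromQL queries against the counter metrics listed in the table above. The metric and label names come from the table; the 5m rate window and the per-model grouping are illustrative choices, not something this guide prescribes.

```promql
# Per-model request throughput from the request counter (5m window is an example choice).
sum(rate(inference_objective_request_total[5m])) by (model_name)

# Per-model error ratio, dividing the error counter by the request counter.
sum(rate(inference_objective_request_error_total[5m])) by (model_name)
  /
sum(rate(inference_objective_request_total[5m])) by (model_name)
```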
# HELP inference_objective_normalized_time_per_output_token_seconds [ALPHA] Inference objective latency divided by number of output tokens in seconds for each model and target model.
# TYPE inference_objective_normalized_time_per_output_token_seconds histogram
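
Since the renamed metric is exposed as a Prometheus histogram (see the HELP/TYPE lines above), it can be queried through the standard `_bucket`, `_sum`, and `_count` series. The quantile value and window below are example choices.

```promql
# p95 normalized time per output token, per model (0.95 and 5m are example choices).
histogram_quantile(0.95,
  sum(rate(inference_objective_normalized_time_per_output_token_seconds_bucket[5m])) by (le, model_name)
)

# Mean NTPOT over the same window, from the histogram's _sum and _count series.
sum(rate(inference_objective_normalized_time_per_output_token_seconds_sum[5m])) by (model_name)
  /
sum(rate(inference_objective_normalized_time_per_output_token_seconds_count[5m])) by (model_name)
```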