adding scripted metric aggs docs #10211
Signed-off-by: Anton Rubin <[email protected]>
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.
Signed-off-by: Anton Rubin <[email protected]>
@sandeshkr419 Could you please review this PR? Thanks!
Signed-off-by: Anton Rubin <[email protected]>
The scripts within the script stages are Painless scripts, right?
Should we mention that somewhere in the doc, for example, that these script stages should contain Painless scripts (and then maybe link to it)?
Basically, what I am looking for are the rules/syntax for defining these scripts, but at the same time, explaining Painless scripting might be out of scope for this page.
So if we can link to it and mention it briefly, that might be good.

## Handling empty buckets (no documents scenario)

When using a `scripted_metric` aggregation as a sub-aggregation within a bucket aggregation (such as terms), it is important to account for buckets that contain no documents on certain shards. In such cases, those shards return a `null` value for the aggregation state. During the `reduce_script` phase, the states array may therefore include `null` entries corresponding to these shards. To ensure reliable execution, the `reduce_script` must be designed to handle `null` values gracefully. A common approach is to include a conditional check, such as `if (state != null)`, before accessing or operating on each state. Failure to implement such checks can result in runtime errors when processing empty buckets across shards.
Not related to this documentation, but do you think that handling null could be another improvement in the code base, for example, by passing some parameter like `ignore_null_results`?
I'm wondering whether that might even make scripted aggs faster if the null checks are part of the code flow via some param rather than run as part of the script.
Signed-off-by: Anton Rubin <[email protected]>
@sandeshkr419 Thank you for the review. The changes have been pushed.
LGTM!
Thanks for addressing the comments.
Thank you, @AntonEliatra! Please see my comments and let me know if you have any questions.
Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: AntonEliatra <[email protected]>
@kolchfa-aws That's updated now.
@AntonEliatra Please see my comments and changes and tag me for approval when addressed. Thanks!

| Parameter | Data type | Required/Optional | Description |
| --- | --- | --- | --- |
| `init_script` | String | Optional | A script that executes once per shard before any documents are processed. Used to set up an initial state (for example, initialize counters or lists in a state object). If not provided, the state starts as an empty object on each shard. |
Should any/all instances of "state" be in code font on lines 22, 23, and 24?

## Allowed return types

Scripts can use any valid operation and object internally. However, the data you store in `state` or return from any script must be of one of the allowed types. This restriction exists because the intermediate state needs to be sent between nodes. The following types are allowed:
Should "state" be in code font here?
@@ -55,18 +231,38 @@ GET opensearch_dashboards_sample_data_logs/_search
```
{% include copy-curl.html %}

#### Example response

The response returns three values in the `value` object, demonstrating how a scripted metric can return multiple metrics at once by using a map in the state:
"scripted metric" => "`scripted_metric` aggregation"? Should "state" be in code font?

## Handling empty buckets (no documents scenario)

When using a `scripted_metric` aggregation as a subaggregation within a bucket aggregation (such as `terms`), it is important to account for buckets that contain no documents on certain shards. In such cases, those shards return a `null` value for the aggregation state. During the `reduce_script` phase, the `states` array may therefore include `null` entries corresponding to these shards. To ensure reliable execution, the `reduce_script` must be designed to handle `null` values gracefully. A common approach is to include a conditional check, such as `if (state != null)`, before accessing or operating on each state. Failure to implement such checks can result in runtime errors when processing empty buckets across shards.
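
To illustrate the null check described in this section, here is a minimal sketch in plain Python (not Painless) that models what a `reduce_script` does with the per-shard `states` array. The function name `reduce_states`, the sample values, and the assumption that each shard's combined state is a single number are all hypothetical and chosen only for illustration; `None` stands in for the `null` returned by a shard that had no documents for the bucket.

```python
from typing import Optional, Sequence


def reduce_states(states: Sequence[Optional[int]]) -> int:
    """Sum per-shard results, skipping shards that returned no state."""
    total = 0
    for state in states:
        # Mirrors the `if (state != null)` guard recommended for reduce_script.
        if state is not None:
            total += state
    return total


# Hypothetical bucket: three shards, the middle one had no matching documents.
shard_states = [120, None, 45]
print(reduce_states(shard_states))  # 165
```

Without the guard, the addition would fail on the `None` entry, which corresponds to the runtime error the paragraph warns about.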
End of 2nd sentence: Should "state" be in code font?
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: AntonEliatra <[email protected]>
Signed-off-by: Anton Rubin <[email protected]>
Signed-off-by: Anton Rubin <[email protected]>
Signed-off-by: Nathan Bower <[email protected]>
Thanks @AntonEliatra! LGTM
* adding scripted metric aggs docs Signed-off-by: Anton Rubin <[email protected]> * fixing vale errors Signed-off-by: Anton Rubin <[email protected]> * addressing the PR comments Signed-off-by: Anton Rubin <[email protected]> * addressing the PR comments Signed-off-by: Anton Rubin <[email protected]> * Apply suggestions from code review Co-authored-by: kolchfa-aws <[email protected]> Signed-off-by: AntonEliatra <[email protected]> * Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: AntonEliatra <[email protected]> * addressing the PR comments Signed-off-by: Anton Rubin <[email protected]> * addressing the PR comments Signed-off-by: Anton Rubin <[email protected]> * Apply suggestions from code review Signed-off-by: Nathan Bower <[email protected]> --------- Signed-off-by: Anton Rubin <[email protected]> Signed-off-by: AntonEliatra <[email protected]> Signed-off-by: Nathan Bower <[email protected]> Co-authored-by: kolchfa-aws <[email protected]> Co-authored-by: Nathan Bower <[email protected]> (cherry picked from commit 489b888) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
adding scripted metric aggs docs
Version
all
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.