Optimization in String Terms Aggregation query for Large Bucket Counts #18732

Open: wants to merge 10 commits into base: main

vinaykpud
Contributor

@vinaykpud vinaykpud commented Jul 11, 2025

Description

When the number of requested top-N buckets is close to or exceeds the maximum bucket ordinal, using a PriorityQueue for top-N selection is inefficient or redundant. This PR makes the following changes:

  1. Use quickselect for top-N selection if the requested size is greater than 20% of the total number of buckets.
  2. If the requested size is greater than the number of buckets, return all the buckets.
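
The selection strategy described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: the class name `TopNSelectionSketch`, the method names, and the use of a plain `long[]` of doc counts are assumptions made for the example, and the actual aggregator works on bucket ordinals and supports other orderings. Only the 20% threshold and the three-way choice (return everything, quickselect, or PriorityQueue) come from the description.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical sketch; names are illustrative and not taken from OpenSearch.
class TopNSelectionSketch {
    // Threshold from the PR description: quickselect above 20% of total buckets.
    static final double QUICKSELECT_RATIO = 0.2;

    /** Returns the `size` largest doc counts in descending order. */
    static long[] topN(long[] docCounts, int size) {
        int n = docCounts.length;
        if (size <= 0 || n == 0) {
            return new long[0];
        }
        long[] result;
        if (size >= n) {
            // Every bucket is requested, so no selection is needed.
            result = docCounts.clone();
        } else if (size > QUICKSELECT_RATIO * n) {
            // Large request: partition so the first `size` slots hold the largest counts.
            long[] copy = docCounts.clone();
            quickSelectLargest(copy, size);
            result = Arrays.copyOf(copy, size);
        } else {
            // Small request: bounded min-heap that keeps the `size` largest counts seen.
            PriorityQueue<Long> heap = new PriorityQueue<>(size);
            for (long count : docCounts) {
                if (heap.size() < size) {
                    heap.add(count);
                } else if (count > heap.peek()) {
                    heap.poll();
                    heap.add(count);
                }
            }
            result = new long[heap.size()];
            int i = 0;
            for (long count : heap) {
                result[i++] = count;
            }
        }
        sortDescending(result);
        return result;
    }

    /** Rearranges `a` so that a[0..k-1] hold the k largest values, in any order. */
    static void quickSelectLargest(long[] a, int k) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo < hi) {
            int p = partitionDescending(a, lo, hi);
            if (p == k - 1) {
                return;
            } else if (p < k - 1) {
                lo = p + 1;
            } else {
                hi = p - 1;
            }
        }
    }

    /** Lomuto partition with a middle pivot; values greater than the pivot move left. */
    static int partitionDescending(long[] a, int lo, int hi) {
        swap(a, lo + (hi - lo) / 2, hi);
        long pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] > pivot) {
                swap(a, i, store++);
            }
        }
        swap(a, store, hi);
        return store;
    }

    static void sortDescending(long[] a) {
        Arrays.sort(a);
        for (int i = 0, j = a.length - 1; i < j; i++, j--) {
            swap(a, i, j);
        }
    }

    static void swap(long[] a, int i, int j) {
        long t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        // 2 of 5 buckets is above the 20% threshold, so this takes the quickselect path.
        System.out.println(Arrays.toString(topN(new long[] {5, 1, 9, 3, 7}, 2))); // prints [9, 7]
    }
}
```

The intuition behind the threshold is that a bounded heap costs roughly O(n log k) while quickselect is O(n) on average, so the heap loses its advantage as the requested size k approaches the number of buckets n.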

Benchmark results are available here:

#18704 (comment)

Related Issues

Resolves #18704
Related #18650

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Rishabh Maurya <[email protected]>
@github-actions github-actions bot added the bug (Something isn't working) and Search:Performance labels Jul 11, 2025
@vinaykpud vinaykpud force-pushed the string-term-agg-opt branch 3 times, most recently from d5dad5c to fa96268 Compare July 11, 2025 18:21
@vinaykpud vinaykpud force-pushed the string-term-agg-opt branch from fa96268 to 0cf5b78 Compare July 11, 2025 18:22
❌ Gradle check result for b25271f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for 242faae: FAILURE

@vinaykpud vinaykpud force-pushed the string-term-agg-opt branch from 242faae to 482a37e Compare July 14, 2025 22:34
❌ Gradle check result for 482a37e: FAILURE

@vinaykpud vinaykpud force-pushed the string-term-agg-opt branch from 482a37e to 81211c1 Compare July 14, 2025 23:46
❌ Gradle check result for 81211c1: FAILURE

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
❌ Gradle check result for a81608e: FAILURE

@vinaykpud vinaykpud closed this Jul 15, 2025
@vinaykpud vinaykpud reopened this Jul 15, 2025
❌ Gradle check result for a81608e: FAILURE

@vinaykpud vinaykpud marked this pull request as ready for review July 22, 2025 20:20
❌ Gradle check result for 5f2c4bf: FAILURE

❌ Gradle check result for 5f2c4bf: FAILURE

@vinaykpud vinaykpud closed this Jul 28, 2025
@vinaykpud vinaykpud reopened this Jul 28, 2025
❌ Gradle check result for 5f2c4bf: FAILURE

❌ Gradle check result for 93034bb: FAILURE

❌ Gradle check result for 93034bb: null

@vinaykpud vinaykpud closed this Jul 29, 2025
@vinaykpud vinaykpud reopened this Jul 29, 2025
❌ Gradle check result for 93034bb: FAILURE

❌ Gradle check result for cb83f07: FAILURE

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
✅ Gradle check result for ce67085: SUCCESS

codecov bot commented Jul 29, 2025

Codecov Report

❌ Patch coverage is 95.45455% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.83%. Comparing base (2906600) to head (ce67085).
⚠️ Report is 19 commits behind head on main.

Files with missing lines | Patch % | Lines
...ket/terms/GlobalOrdinalsStringTermsAggregator.java | 95.45% | 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18732      +/-   ##
============================================
+ Coverage     72.78%   72.83%   +0.05%     
- Complexity    68544    68575      +31     
============================================
  Files          5567     5569       +2     
  Lines        314911   314958      +47     
  Branches      45684    45691       +7     
============================================
+ Hits         229201   229404     +203     
+ Misses        67092    66978     -114     
+ Partials      18618    18576      -42     

Labels
bug (Something isn't working), Search:Performance
Development

Successfully merging this pull request may close these issues.

[Performance] Optimize String terms agg
2 participants