Fix MaxScoreBulkScorer leaf-bound overshoot and prevent merging zero-score fragments #15346
+162
−0
This PR delivers two minimal, targeted fixes with regression tests:
Core: Prevent MaxScoreBulkScorer from advancing past a leaf’s maxDoc under filtered disjunctions (avoids potential EOF when norms are accessed after NO_MORE_DOCS).
Highlighter: Don’t merge zero-scored fragments (GH-15333) to avoid producing merged passages that include content with no matches.
Motivation
MaxScoreBulkScorer: With a restrictive filter combined with a disjunction, the candidate windowing logic could overshoot a segment’s maxDoc. If norms were then accessed after NO_MORE_DOCS, this could trigger an unexpected EOF.
Highlighter: Zero-score fragments should not be merged with adjacent fragments, otherwise the final passage can include unrelated content with no matches.
Changes
Core (lucene/core)
Clamp candidate advancement at the leaf boundary in MaxScoreBulkScorer (e.g., within nextCandidate) so NO_MORE_DOCS is returned when rangeEnd exceeds maxDoc.
Added regression test: org.apache.lucene.search.TestMaxScoreBulkScorerFilterBounds.
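The clamping idea can be illustrated with a minimal, standalone sketch. This is not the code from this PR or from MaxScoreBulkScorer: the class `LeafBoundSketch` and the helpers `clampWindowEnd` and `clampToLeaf` are hypothetical names invented for illustration. The sketch relies only on the fact that doc IDs in a leaf run from 0 to maxDoc - 1 and that NO_MORE_DOCS is Integer.MAX_VALUE.

```java
/** Hypothetical, simplified model of leaf-bounded advancement; not the Lucene implementation. */
final class LeafBoundSketch {
  // Same sentinel value as DocIdSetIterator.NO_MORE_DOCS in Lucene.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Never let a scoring window extend past the end of the leaf. */
  static int clampWindowEnd(int rangeEnd, int maxDoc) {
    return Math.min(rangeEnd, maxDoc);
  }

  /** Doc IDs in a leaf are in [0, maxDoc); anything at or beyond maxDoc is reported as exhausted. */
  static int clampToLeaf(int candidate, int maxDoc) {
    return candidate >= maxDoc ? NO_MORE_DOCS : candidate;
  }

  public static void main(String[] args) {
    int maxDoc = 10;
    System.out.println(clampWindowEnd(4096, maxDoc));                // window shrinks to the leaf size
    System.out.println(clampToLeaf(7, maxDoc));                      // in bounds, returned unchanged
    System.out.println(clampToLeaf(12, maxDoc) == NO_MORE_DOCS);     // out of bounds, reported as exhausted
  }
}
```

With this shape, a caller that stops on NO_MORE_DOCS never asks per-document structures such as norms for a doc ID outside the leaf.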
Highlighter (lucene/highlighter)
In Highlighter, filter out zero-scored TextFragments before mergeContiguousFragments to prevent unintended merges.
Added regression test: org.apache.lucene.search.highlight.TestZeroScoreMerging.
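The pre-filter can be sketched in isolation as follows. To stay self-contained, the example uses a hypothetical `Fragment` class as a stand-in for Lucene's TextFragment, and `dropZeroScored` is an illustrative helper name, not a method from the Highlighter API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of filtering zero-scored fragments before merging; not the Highlighter code. */
final class ZeroScoreFilterSketch {

  /** Stand-in for a highlighter fragment: some text plus the score assigned to it. */
  static final class Fragment {
    final String text;
    final float score;

    Fragment(String text, float score) {
      this.text = text;
      this.score = score;
    }
  }

  /** Returns the fragments whose score is not zero, preserving their original order. */
  static List<Fragment> dropZeroScored(List<Fragment> fragments) {
    List<Fragment> kept = new ArrayList<>();
    for (Fragment f : fragments) {
      if (f.score == 0f) {
        continue; // no matches in this fragment, so it must not take part in merging
      }
      kept.add(f);
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Fragment> fragments = new ArrayList<>();
    fragments.add(new Fragment("matching text", 1.5f));
    fragments.add(new Fragment("unrelated text", 0f));
    fragments.add(new Fragment("more matching text", 0.7f));
    for (Fragment f : dropZeroScored(fragments)) {
      System.out.println(f.text);
    }
  }
}
```

Running the filter before the merge step means a contiguous zero-score fragment can no longer be absorbed into a neighboring passage.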
Docs
Updated CHANGES.txt with entries for both fixes, referencing the new test names.
Testing
New tests:
lucene/core: TestMaxScoreBulkScorerFilterBounds validates filtered-disjunction execution does not score past maxDoc and does not throw.
lucene/highlighter: TestZeroScoreMerging ensures zero-score fragments aren’t merged.
Both tests pass locally in isolation for their respective modules.
Backwards compatibility
Behavior is strictly safer/more correct:
Core: Prevents out-of-bounds progression; no API changes.
Highlighter: Merging now excludes fragments with score == 0, which is the expected behavior; no API changes.
Performance
Neutral. The core change is a simple bounds check in the candidate advancement logic; the highlighter change is a small pre-filter on fragments.
Risk
Low. Changes are localized and covered by focused regression tests.
Related
Fix: #15333