@kdt523 kdt523 commented Oct 21, 2025

This PR delivers two minimal, targeted fixes with regression tests:

Core: Prevent MaxScoreBulkScorer from advancing past a leaf’s maxDoc under filtered disjunctions (avoids potential EOF when norms are accessed after NO_MORE_DOCS).
Highlighter: Don’t merge zero-scored fragments (GH-15333) to avoid producing merged passages that include content with no matches.

Motivation
MaxScoreBulkScorer: With a restrictive filter plus a disjunction, the candidate windowing logic could overshoot a segment’s maxDoc. If norms were then accessed after NO_MORE_DOCS, this could trigger an unexpected EOF error.
Highlighter: Zero-score fragments should not be merged with adjacent fragments, otherwise the final passage can include unrelated content with no matches.

Changes
Core (lucene/core)
Clamp candidate advancement at the leaf boundary in MaxScoreBulkScorer (e.g., within nextCandidate) so NO_MORE_DOCS is returned when rangeEnd exceeds maxDoc.
Added regression test: org.apache.lucene.search.TestMaxScoreBulkScorerFilterBounds.
Highlighter (lucene/highlighter)
In Highlighter, filter out zero-scored TextFragments before mergeContiguousFragments to prevent unintended merges.
Added regression test: org.apache.lucene.search.highlight.TestZeroScoreMerging.
Docs
Updated CHANGES.txt with both fixes and referenced test names.
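
To illustrate the Core change, here is a minimal sketch of the clamping idea, not the actual patch. The class and the helpers clampCandidate and clampWindowEnd are hypothetical names used only for illustration; the real logic lives inside MaxScoreBulkScorer. The sentinel value matches DocIdSetIterator.NO_MORE_DOCS, which is Integer.MAX_VALUE.

```java
// Illustrative sketch only: hypothetical helpers, not code from MaxScoreBulkScorer.
final class WindowClampSketch {
  // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /**
   * Returns the candidate unchanged while it is inside the leaf, and
   * NO_MORE_DOCS once it reaches or passes the leaf's maxDoc.
   */
  static int clampCandidate(int candidate, int maxDoc) {
    return candidate >= maxDoc ? NO_MORE_DOCS : candidate;
  }

  /**
   * Returns the exclusive end of a scoring window, never past maxDoc.
   * The sum is computed as a long so it cannot overflow int.
   */
  static int clampWindowEnd(int windowMin, int windowSize, int maxDoc) {
    long end = (long) windowMin + windowSize;
    return (int) Math.min(end, (long) maxDoc);
  }
}
```

With a check of this kind in place, a caller that sees NO_MORE_DOCS can stop before touching per-document data such as norms for a doc ID that does not exist in the leaf.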

Testing
New tests:
lucene/core: TestMaxScoreBulkScorerFilterBounds validates filtered-disjunction execution does not score past maxDoc and does not throw.
lucene/highlighter: TestZeroScoreMerging ensures zero-score fragments aren’t merged.
Both tests pass locally when run in isolation within their respective modules.

Backwards compatibility
Behavior is strictly safer/more correct:
Core: Prevents out-of-bounds progression; no API changes.
Highlighter: Merge semantics exclude fragments with score == 0; expected/intuitive behavior, no API changes.
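
The Highlighter semantics above can be sketched as a pre-filter that runs before any merge step. This is a rough illustration under stated assumptions: Frag is a hypothetical stand-in for a scored fragment (it is not Lucene's TextFragment), dropZeroScored is a hypothetical helper, and treating "zero-scored" as "score not strictly positive" is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical types, not code from Highlighter.
final class ZeroScoreFilterSketch {
  /** Hypothetical stand-in for a scored highlight fragment. */
  static final class Frag {
    final String text;
    final float score;

    Frag(String text, float score) {
      this.text = text;
      this.score = score;
    }
  }

  /**
   * Keeps only fragments with a strictly positive score, preserving order,
   * so a later merge step never sees fragments without matches.
   */
  static List<Frag> dropZeroScored(List<Frag> frags) {
    List<Frag> kept = new ArrayList<>();
    for (Frag f : frags) {
      if (f.score > 0f) {
        kept.add(f);
      }
    }
    return kept;
  }
}
```

Filtering first, rather than special-casing zero scores inside the merge, keeps the merge logic itself unchanged, which is consistent with the "no API changes" note above.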

Performance
Neutral. The core change is a simple bounds check in the candidate advancement logic. The highlighter change is a small pre-filter on fragments.

Risk
Low. Changes are localized and covered by focused regression tests.

Related
Fix: #15333

…score fragments

Clamp candidate advancement to leaf bounds in filtered disjunctions; filter zero-score fragments before merge.

Add regression tests: TestMaxScoreBulkScorerFilterBounds and TestZeroScoreMerging.

Update CHANGES.txt with both fixes.
@kdt523 kdt523 force-pushed the fix/maxscore-highlighter-15333 branch from b959674 to 1e91cab Compare October 23, 2025 08:28
Development

Successfully merging this pull request may close these issues.

Highlighter.getBestFragments() merges zero-scored fragments with scored fragments, polluting highlight results