Skip to content

Conversation

guojialiang92
Copy link
Contributor

Description

This PR is based on [18720]'s follow-up work. Based on the discussion with @ashking94, I made some optimizations. The main purpose is to avoid the listAll operation in IndexShard#computeReferencedSegmentsCheckpoint.

The main changes in this PR are as follows:

  • I Introduced IndexShard#primaryMergedSegmentCheckpoints for primary shard to track merged segment checkpoints that have been published for pre-warm but not yet refreshed. We will add merged segment checkpoint to IndexShard#primaryMergedSegmentCheckpoints before publish_merged_segment.
  • Rename IndexShard#pendingMergedSegmentCheckpoints to IndexShard#replicaMergedSegmentCheckpoints, which is used for replica shard record the pre-copied merged segment checkpoints, which are not yet refreshed.
  • When ReplicationCheckpointUpdater#afterRefresh is invoked, remove refreshed segment from IndexShard#primaryMergedSegmentCheckpoints. The reason why it cannot be executed in PublishCheckpointAction can refer to the discussion in issue #18845.
  • To avoid memory leaks in IndexShard#primaryMergedSegmentCheckpoints, I introduced a configuration IndexSettings#INDEX_MERGED_SEGMENT_CHECKPOINT_RETENTION_TIME (default 15 minutes) to clean up expired checkpoints after publish_referenced_segments.
  • I introduced MergedSegmentWarmerIT#testPrimaryMergedSegmentCheckpointRetentionTimeout. Construct a case with redundant merged segment checkpoint in the primary shard and delete it based on the expiration time.

Related Issues

Resolves #[18845]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: guojialiang <[email protected]>
Copy link
Contributor

github-actions bot commented Aug 1, 2025

❌ Gradle check result for 0c06302: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: guojialiang <[email protected]>
Copy link
Contributor

github-actions bot commented Aug 1, 2025

❌ Gradle check result for 253c15f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Aug 2, 2025

❌ Gradle check result for 1c90ffc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: guojialiang <[email protected]>
Copy link
Contributor

github-actions bot commented Aug 2, 2025

❌ Gradle check result for 9ba9cb6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Aug 4, 2025

❌ Gradle check result for 742b545: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: guojialiang <[email protected]>
@guojialiang92 guojialiang92 force-pushed the dev/optimize-redundant-merged-segment-cleanup branch from 742b545 to 762f707 Compare August 4, 2025 03:57
Copy link
Contributor

github-actions bot commented Aug 4, 2025

✅ Gradle check result for 762f707: SUCCESS

@guojialiang92
Copy link
Contributor Author

Hi, @ashking94
If you have time, please help review this PR.

@guojialiang92
Copy link
Contributor Author

Hi, @ashking94

I have merged the latest code from the main branch and resolved the conflicts.
I noticed that the merged segment warmer of the remote store has been merged into the main branch. After this PR, I will adapt this part of the code to the remote store.

Looking forward to your reply:)

Copy link
Contributor

❌ Gradle check result for 467872f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

✅ Gradle check result for 137195f: SUCCESS

Copy link

codecov bot commented Aug 11, 2025

Codecov Report

❌ Patch coverage is 89.79592% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.80%. Comparing base (f967a72) to head (137195f).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...in/java/org/opensearch/index/shard/IndexShard.java 92.10% 1 Missing and 2 partials ⚠️
.../main/java/org/opensearch/index/IndexSettings.java 75.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18890      +/-   ##
============================================
- Coverage     72.96%   72.80%   -0.17%     
+ Complexity    69451    69295     -156     
============================================
  Files          5645     5645              
  Lines        318787   318821      +34     
  Branches      46125    46126       +1     
============================================
- Hits         232610   232115     -495     
- Misses        67358    67870     +512     
- Partials      18819    18836      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@guojialiang92
Copy link
Contributor Author

Hi, @ashking94
Could you please help review this PR? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant