-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Description
Is your feature request related to a problem? Please describe
@ashking94 and I had a discussion in 18720's comment, and we can optimize the logic for cleaning up redundant pending merged segments in subsequent PRs. Thereby avoiding listAll
operations.
Describe the solution you'd like
I paste the key discussion here for easy reading.
As I understand, we only want to prevent the deletion of such segments that are produced as part of the background merges and not yet part of the reader yet.
Yes.
I think this can be easily solved by having a variable that tracks segment files that have been published for pre-warm but not yet refreshed.
For the convenience of subsequent discussions, we can name it unSearchableMergedSegmentCheckpointList
, which is maintained by the primary shard.
unSearchableMergedSegmentCheckpointList
tracks segment files that have been published for pre-warm but not yet refreshed. Therefore, the merged segments it contains cannot be deleted.
I think we can add list of segments to be sent from primary to replica just before the publish merged segment action is invoked, and removed from the list when the publish refresh action is invoked.
Perhaps we need a dedicated refresh listener and remove MergedSegmentCheckpoint
from unSearchableMergedSegmentCheckpointList
when executing RefreshListener#afterRefresh
.
If we perform the remove operation when PublishCheckpointAction
is invoked, we may miss some checkpoints. For example, in RemoteStoreRefreshListener
, PublishCheckpointAction
is only invoked when the upload is successful.
The
computeReferencedSegmentsCheckpoint
can then use this list to add the pending merged segments files that are currently not part ofsegmentInfosGatedCloseable
.
After introducing this proposal, MergedSegmentCheckpoint
in unsearchableMergedSegmentCheckpointList
need to be cleaned up to avoid memory leak.
Can we remove MergedSegmentCheckpoint from unSearchableMergedSegmentCheckpointList only when it is searchable (which means that the segmentInfosGatedCloseable holds the file)?
Yes.
I was thinking if publish_referenced_segments action can act as that hook where we do the cleanup of segments that are present in segmentInfosGatedCloseable from unSearchableMergedSegmentCheckpointList.
I think we still need a regularly scheduled task, as we still need to clean up expired checkpoints.
Also, around the memory leak point, we may have to take time information on when the checkpoint was added and clean it up after it is older than X mins/hours at the same time when the above cleanup is happening?
Yes, but the default value may be quite large. Meanwhile, when users need to reduce the configuration, they will understand the risk that merged segments are no longer tracked, which may result in useful pending merged segments on replica nodes being cleaned up.
Related component
Indexing:Replication
Describe alternatives you've considered
No response
Additional context
No response