Add option to enable remote store for segments only #18773
Current state
Start up a local cluster with two nodes using:
Create an index with a single shard and a single search replica:
Write a document to the index:
The primary shard is created, but the search replica is unassigned and the recovery source is
The primary shard is writing segments to the remote store, which is nice:
|
❌ Gradle check result for 4bef91a: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Aha! The search replica wasn't getting allocated because we added an allocation decider that only assigns a search replica if it's a search-only node. But what if a cluster doesn't have dedicated search nodes? I've changed the logic to let me allocate a search replica to a node if a) the node is a search node, or b) there are no search nodes in the cluster. With that change, I'm able to get remote store based replication working with two nodes running on my laptop. |
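For readers following along, the relaxed rule described above can be sketched roughly as below. This is only an illustration under stated assumptions: `Node` and `canAllocateSearchReplica` are hypothetical names made up for this sketch and are not the OpenSearch allocation decider API; it also assumes Java 16+ for records.

```java
import java.util.List;

public class SearchReplicaAllocationSketch {

    // Hypothetical stand-in for a cluster node; not an OpenSearch class.
    public record Node(String name, boolean searchNode) {}

    // A search replica may go to the target node if
    //   a) the target is a search node, or
    //   b) the cluster has no search nodes at all.
    public static boolean canAllocateSearchReplica(Node target, List<Node> clusterNodes) {
        boolean clusterHasSearchNodes = clusterNodes.stream().anyMatch(Node::searchNode);
        return target.searchNode() || !clusterHasSearchNodes;
    }

    public static void main(String[] args) {
        Node data1 = new Node("data-1", false);
        Node data2 = new Node("data-2", false);
        Node search1 = new Node("search-1", true);

        // Two plain nodes, no search nodes: allocation is allowed.
        System.out.println(canAllocateSearchReplica(data2, List.of(data1, data2))); // true

        // A search node exists, so a plain node is rejected.
        System.out.println(canAllocateSearchReplica(data2, List.of(data1, data2, search1))); // false

        // The search node itself is always acceptable.
        System.out.println(canAllocateSearchReplica(search1, List.of(data1, data2, search1))); // true
    }
}
```

The first case mirrors the two-node laptop setup: with no dedicated search nodes, either node can host the search replica.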
ddf33fb to
80444d2
Compare
|
❌ Gradle check result for 80444d2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Michael Froh <[email protected]>
Signed-off-by: Michael Froh <[email protected]>
9748f24 to
9cda00f
Compare
|
Thanks a lot, @shwetathareja! I've tried to incorporate your feedback. To make things a little better, I've renamed the new node attribute to I was able to confirm that I could run a couple of nodes on my laptop using a shared |
|
❌ Gradle check result for 9cda00f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Sorry, just catching up. The current snapshots (pinned timestamps v2) are built on top of segments and translogs, since we don't perform a per-shard flush prior to snapshots (which is what allows us to scale better). So with this change we would also have to move snapshots back to support the shallow snapshot capability.
|
@Bukhtawar, this change is to allow us to use segrep with pull-based ingestion and a clusterless architecture (without cluster managers). Since snapshots run through cluster managers, we can't do them anyway. Also, we don't have a translog, since the event stream that we're pulling from takes on that role. |
server/src/main/java/org/opensearch/cluster/metadata/MetadataCreateIndexService.java
server/src/main/java/org/opensearch/index/shard/IndexShard.java
ashking94
left a comment
How does the replication mode change with segments-only remote-store-based indexes? Would it fall back to the full request replication model?
Signed-off-by: Michael Froh <[email protected]>
|
Thanks @ashking94! I made those changes that you called out. |
|
❌ Gradle check result for b6fb4ba: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Michael Froh <[email protected]>
…ct#18773) Currently, the remote store implementation is all or nothing. If you want anything stored in the remote store, you pretty much need to store everything in the remote store. This change adds an explicit setting so expert users can say, "No, thanks, I don't want any of this remote cluster state, remote translog stuff. I just want segments replicated to a remote store." --------- Signed-off-by: Michael Froh <[email protected]> Signed-off-by: sunqijun.jun <[email protected]>
…ct#18773) Currently, the remote store implementation is all or nothing. If you want anything stored in the remote store, you pretty much need to store everything in the remote store. This change adds an explicit setting so expert users can say, "No, thanks, I don't want any of this remote cluster state, remote translog stuff. I just want segments replicated to a remote store." --------- Signed-off-by: Michael Froh <[email protected]>
Description
Currently, the remote store implementation is all or nothing. If you want anything stored in the remote store, you pretty much need to store everything in the remote store.
This change adds an explicit setting so expert users can say, "No thanks, I don't want any of this remote cluster state or remote translog stuff. I just want segments replicated to a remote store." I needed to hack away at some of the existing logic that has embraced this "all or nothing" assumption.
I still can't bring up a search replica, because I can't seem to recover from remote store without translog recovery, but I can get a primary to push segments to the remote store.
Update: I can bring up a search replica! The failure had nothing to do with remote store configuration; it came from logic in SearchReplicaAllocationDecider that says search replicas must live on search nodes. I removed that rule when the cluster has no dedicated search nodes.
Related Issues
Resolves #18669
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.