[Access] POC Collection Syncing #8154
base: master
// `e.consumers`.
// Note: the `e.consumers` will be guaranteed to receive at least one `OnExecutionDataFetched` event
// for each sealed block in consecutive block height order.
e.notificationConsumer, err = jobqueue.NewComponentConsumer(
I simplified the requester by removing the notification consumer entirely. Instead, we now just call e.distributor.OnExecutionDataReceived().
Note: OnExecutionDataReceived used to be called with the execution data read from storage, but no consumer actually made use of that data. Each consumer reads execution data itself, starting from its next unprocessed index, to ensure that data for all heights is processed.
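The pattern described in this comment, where the notification is only a wake-up signal and the consumer pulls data by its own index, can be sketched as below. This is a minimal illustration, not the code in this PR: the `indexer` type, its fields, and the `onNotified` method are hypothetical names, and a plain map stands in for execution data storage.

```go
package main

import "fmt"

// indexer is a hypothetical consumer. It ignores any payload attached to a
// notification and instead reads from storage at its own next unprocessed height.
type indexer struct {
	store     map[uint64]string // stand-in for execution data storage, keyed by height
	next      uint64            // next unprocessed height
	processed []uint64          // heights handled so far, in order
}

// onNotified is called whenever new execution data has been received.
// It processes every consecutive height that is available, so no height
// is skipped even when notifications arrive out of order or are coalesced.
func (ix *indexer) onNotified() {
	for {
		if _, ok := ix.store[ix.next]; !ok {
			return // data for the next height is not available yet
		}
		ix.processed = append(ix.processed, ix.next)
		ix.next++
	}
}

func main() {
	ix := &indexer{store: map[uint64]string{5: "a", 7: "c"}, next: 5}

	ix.onNotified()                    // height 7 is present, but 6 is missing
	fmt.Println(ix.processed, ix.next) // [5] 6

	ix.store[6] = "b"
	ix.onNotified()                    // a single signal drains heights 6 and 7
	fmt.Println(ix.processed, ix.next) // [5 6 7] 8
}
```

Because the consumer never depends on the notification's contents, the sender does not need to load execution data from storage just to deliver the event.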
Close #8121
This PR addresses two issues:
1. When an Access Node is far behind, startup is slow because it attempts to download collections for all finalized heights with missing collections, causing extended downtime.
   Solution: Switched to a job queue that focuses on the next missing height to fetch. This enables faster startup, reduces downtime, and speeds up catch-up.
2. Collections can be synced from either collection nodes or execution nodes. Currently, the Access Node syncs from both. Because the storage function uses a lock, concurrent syncing causes both procedures to block each other.
   Solution: Prioritize syncing from execution nodes via execution data syncing, and allow only one sync procedure at a time. Syncing from collection nodes is only enabled when execution data syncing is turned off. This reduces lock contention, speeds up indexing, and reduces load on collection nodes.
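To illustrate the first fix (a job queue that focuses on the next missing height), here is a rough sketch of the bookkeeping involved. It does not use the flow-go `jobqueue` package. The `heightTracker` type and its methods are hypothetical, and the sketch assumes progress is tracked as a single "last fully processed height" that advances only across contiguous completed heights.

```go
package main

import "fmt"

// heightTracker is a hypothetical stand-in for the progress index that a
// job queue consumer would persist. It is not the flow-go jobqueue API.
type heightTracker struct {
	lastProcessed uint64          // highest height below which nothing is missing
	done          map[uint64]bool // heights completed ahead of lastProcessed
}

func newHeightTracker(lastProcessed uint64) *heightTracker {
	return &heightTracker{lastProcessed: lastProcessed, done: make(map[uint64]bool)}
}

// NextMissing returns the lowest height whose collections still need fetching.
func (t *heightTracker) NextMissing() uint64 {
	return t.lastProcessed + 1
}

// MarkDone records a completed height and advances lastProcessed across
// any contiguous run of completed heights.
func (t *heightTracker) MarkDone(height uint64) {
	if height <= t.lastProcessed {
		return // already covered
	}
	t.done[height] = true
	for t.done[t.lastProcessed+1] {
		delete(t.done, t.lastProcessed+1)
		t.lastProcessed++
	}
}

func main() {
	tr := newHeightTracker(100)

	tr.MarkDone(102)              // finished out of order
	fmt.Println(tr.NextMissing()) // 101

	tr.MarkDone(101)              // fills the gap, so progress jumps past 102
	fmt.Println(tr.NextMissing()) // 103
}
```

On restart, a node following this approach would resume from `NextMissing()` rather than scanning all finalized heights for missing collections, which is the startup cost the PR description sets out to avoid.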
Collection Indexing Refactoring POC
Both collection sync engines live under engine/access/collection_sync, with only one enabled at startup. This structure supports a future hybrid mode.
- The engine/access/collection_sync/execution_data_index/processor.go engine indexes collections from execution data. It receives notifications when new execution data is downloaded and indexes the collections from that data.
- The engine/access/collection_sync/fetcher/engine.go engine fetches collections from collection nodes and indexes them.
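The "only one enabled at startup" rule could be expressed roughly as follows. This is a sketch under assumptions: the `syncMode` type, its constants, and `selectSyncMode` are hypothetical names, and the single boolean input stands in for whatever configuration flag controls execution data syncing.

```go
package main

import "fmt"

// syncMode is a hypothetical enum naming which collection sync engine runs.
type syncMode int

const (
	syncFromExecutionData   syncMode = iota // index collections from execution data
	syncFromCollectionNodes                 // fetch collections from collection nodes
)

func (m syncMode) String() string {
	if m == syncFromExecutionData {
		return "execution-data-index"
	}
	return "collection-fetcher"
}

// selectSyncMode picks exactly one engine: execution data indexing is
// preferred, and fetching from collection nodes is the fallback used only
// when execution data syncing is turned off.
func selectSyncMode(executionDataSyncEnabled bool) syncMode {
	if executionDataSyncEnabled {
		return syncFromExecutionData
	}
	return syncFromCollectionNodes
}

func main() {
	fmt.Println(selectSyncMode(true))  // execution-data-index
	fmt.Println(selectSyncMode(false)) // collection-fetcher
}
```

Returning a single mode makes it impossible to start both engines at once, so they cannot contend for the storage lock. A future hybrid mode would need a third value and its own coordination logic.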