-
Notifications
You must be signed in to change notification settings - Fork 2k
perf: Eliminate spawn_blocking in multiproof manager #19203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| /// Message about completion of proof calculation for a specific state update | ||
| #[derive(Debug)] | ||
| pub(super) struct ProofCalculated { | ||
| /// The index of this proof in the sequence of state updates | ||
| sequence_number: u64, | ||
| /// Sparse trie update | ||
| update: SparseTrieUpdate, | ||
| /// The time taken to calculate the proof. | ||
| elapsed: Duration, | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove ProofCalculated entirely because the task loop no longer
forwards worker results throughProofCalculated.
Results now arrive as ProofResultMessage over proof_result_rx, and
the loop converts them straight into SparseTrieUpdates inline—no
intermediate wrapper needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR eliminates spawn_blocking calls in the multiproof manager by restructuring proof result delivery. Instead of spawning blocking tasks that wait for proof completion, workers now send results directly through a dedicated crossbeam channel to the MultiProofTask event loop.
Key changes:
- Workers deliver proof results directly via
ProofResultMessagethrough a crossbeam channel MultiProofTaskusescrossbeam_channel::select!to handle both state updates and proof results concurrently- Removed intermediate blocking tasks and message types (
ProofCalculated,ProofCalculationError)
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| crates/trie/parallel/src/proof_task.rs | Adds ProofResultMessage type and ProofResultContext for direct result delivery; workers send results via channel instead of returning through receivers |
| crates/trie/parallel/src/proof.rs | Updated to use crossbeam channel for receiving proof results directly from workers |
| crates/engine/tree/src/tree/payload_processor/multiproof.rs | Removed spawn_blocking; workers send results to shared channel; event loop uses select! to handle state updates and proof results |
| crates/engine/tree/src/tree/payload_processor/prewarm.rs | Updated channel type from std mpsc to crossbeam for multiproof messages |
| crates/engine/tree/src/tree/payload_processor/mod.rs | Updated channel types to use crossbeam channels for multiproof messages |
| crates/engine/tree/Cargo.toml | Added crossbeam-channel dependency |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
crates/engine/tree/src/tree/payload_processor/multiproof.rs:463
- The
elapsedvariable calculation on line 422 was removed but is still used on line 485. This will cause a compilation error aselapsedis undefined at the point of use in the storage multiproof result message.
let elapsed = start.elapsed();
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
this is looking pretty good already |
0edadef to
c9494dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will review properly in a bit, but looks good directionally
crates/trie/parallel/src/proof.rs
Outdated
| collect_branch_node_masks: self.collect_branch_node_masks, | ||
| multi_added_removed_keys: self.multi_added_removed_keys.clone(), | ||
| missed_leaves_storage_roots: self.missed_leaves_storage_roots.clone(), | ||
| proof_result_sender: ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is getting harder to read, let's introduce a separate struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good suggestion! addressed in 5aa0087
| prefetch_proofs_requested, | ||
| updates_finished, | ||
| ) { | ||
| crossbeam_channel::select! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a very nice change: we now have a channel that we receive the state updates on, and a separate channel that we receive the proof results on. Much cleaner imo, we can maybe separate the handling of messages into separate methods so it's easier to navigate, but not a blocker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sense, let me configure and see how to clean that up
we can maybe separate the handling of messages into separate methods so it's easier to navigate, but not a blocker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shekhirin just did a prototype, i think its easier to review in a separate pr :) noted down in #19182
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot especially because this removes a totally redundant spawn
imo there's room for cleanup (not in this pr tho) see also #19182
| } | ||
|
|
||
| /// Spawns a single storage proof calculation task. | ||
| fn spawn_storage_proof(&mut self, storage_multiproof_input: StorageMultiproofInput) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to rename this now to just send_ removing spawn_ from the fn names because misleading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i changed to the hierarchy and naming to dispatch (felt more fitting than send) :
dispatch() // Entry point - checks if empty, then sends
→ send_multiproof_task() // removed shallow function
→ dispatch_storage_proof() // dispatch to storage workers
→ dispatch_multiproof() // dispatch to account workers
what do you think @mattsse ?
| prefetch_proofs_requested, | ||
| updates_finished, | ||
| ) { | ||
| crossbeam_channel::select! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of this looks good to me
| } | ||
| } | ||
| Err(_) => { | ||
| error!(target: "engine::tree::payload_processor::multiproof", "Proof result channel closed unexpectedly"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be unreachable because always keep the channel open via the
MultiproofManager.proof_result_tx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good pt, changed to unreachable! 01f1db8
| /// 4. `MultiProofTask` consumes the message from the same channel and sequences it with | ||
| /// `ProofSequencer`. | ||
| #[derive(Debug)] | ||
| pub struct MultiproofManager { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this type now simply acts as container to groups certain fields within the MultiProofTask
I personally find this introduces more complexity because now you also need to know what this entity does even if this now only does simple dispatch, similar situation with the ProofSequencer
all of this could be squashed into the
MultiProofTask
but we can decide this later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the note, noted this down as a cleanup in
#19182
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
receiving the ProofCalculated in multiproof manager
This is incorrect. we should get rid of them
ProofResultVariant references
…r closed proof result channel
Switch back to start.elapsed() to capture full request latency instead of just calculation time, fixing under-reported metrics.
077e5b5 to
105008f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice, LGTM
| proof_sequence_number, | ||
| state_root_message_sender, | ||
| multi_added_removed_keys, | ||
| .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, to be the same as in dispatch_storage_proof
| .. | |
| state_root_message_sender: _, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense! resolved
| break | ||
| } | ||
| } | ||
| MultiProofMessage::EmptyProof { sequence_number, state } => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something is going on with indentation here, rustfmt gave up I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yongkangc I think we may need to fix this manually, idk why rustfmt doesn't do it for us
| proof_targets: MultiProofTargets, | ||
| proof_sequence_number: u64, | ||
| state_root_message_sender: Sender<MultiProofMessage>, | ||
| state_root_message_sender: CrossbeamSender<MultiProofMessage>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now it's only used for EmptyProof message, I wonder if we can remove it altogether and just send the empty proof to proof_result_tx? No need to address it in this PR, just a thought cc @mattsse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one last question otherwise lgtm
| proof_sequence_number, | ||
| state_root_message_sender, | ||
| multi_added_removed_keys, | ||
| state_root_message_sender: _, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this feels a bit weird because we first clone stuff from self into the
StorageMultiproofInput only to drop it here
imo this entire type StorageMultiproofInput is redundant
| // SAFETY: This is unreachable because `self.multiproof_manager` owns | ||
| // `proof_result_tx` (the sender), which keeps the channel open for | ||
| // the entire lifetime of this task. The channel cannot close while | ||
| // `self` exists. | ||
| unreachable!("proof result channel closed while multiproof manager still exists") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this checks out because select does return RecvError and not TryRecvError so this is only reachable if the channel is disconnected.
tho maybe we should just return the an error here
| // Extract storage proof from the multiproof wrapper | ||
| let (mut multiproof, _stats) = proof_msg.result?; | ||
| let proof = | ||
| multiproof.storages.remove(&hashed_address).ok_or_else(|| { | ||
| ParallelStateRootError::Other(format!( | ||
| "storage proof not found in multiproof for {hashed_address}" | ||
| )) | ||
| })?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to do this now?
this is added a few times and unclear what changed here?
| if let Ok(proof_msg) = receiver.recv() { | ||
| // Extract storage proof from the multiproof wrapper | ||
| if let Ok((mut multiproof, _stats)) = proof_msg.result && | ||
| let Some(proof) = multiproof.storages.remove(&hashed_address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| Err(e) => Err(e), | ||
| }; | ||
|
|
||
| // Send result directly to MultiProofTask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove this comment, because this codepath is also used from ParallelProof::storage_proof
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shekhirin removed comment
Closes: #19183
This PR refactors the multiproof pipeline to improve performance and simplify its design by eliminating a
spawn_blockinghop for handling proof results.Previously, the
MultiproofManagerwould spawn a blocking task to await the result of each proof calculation, which introduced unnecessary overhead and complexity.The new implementation streamlines this process:
ProofResultMessage) directly to theMultiProofTaskevent loop via a dedicatedcrossbeamchannel.MultiProofTaskusescrossbeam::select!to concurrently handle both incoming control messages (like state updates) and the proof results from workers.This change removes the intermediate blocking task, resulting in a more direct, efficient, and easier-to-understand data flow.
The updated message flow is as follows: