@AndreiEres (Contributor) commented Oct 2, 2025

Description

Fixes gossiping and scalability issues in the statement-store networking.

  1. Reduced gossip traffic by propagating only recent statements instead of all statements.
  2. Added an early check that skips statements the node already has, avoiding duplicate processing.
  3. Large statement batches are now split so that each notification stays under MAX_STATEMENT_NOTIFICATION_SIZE; individual statements that exceed the limit are skipped.
  4. Changed MAX_STATEMENT_NOTIFICATION_SIZE to the commonly used 1MB, which drastically improved gossiping speed.
  5. Notifications are now sent asynchronously. I don't see much difference in performance, but according to @lexnv it is the better approach; see "poc: async backpressure for transaction notifications" (#9296).
  6. Added a 10s timeout to handle very slow or disconnected peers.
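
The batch splitting in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `EncodedStatement` and `split_into_batches` are hypothetical names, and the size of a batch is approximated as the sum of the statement lengths, ignoring any encoding overhead the real notification format adds.

```rust
/// Hypothetical stand-in for an already-encoded statement.
type EncodedStatement = Vec<u8>;

/// Splits `statements` into batches whose summed length is at most `max_size` bytes.
///
/// Returns the batches plus the number of statements that were skipped because
/// a single statement is larger than `max_size` on its own.
fn split_into_batches(
    statements: &[EncodedStatement],
    max_size: usize,
) -> (Vec<Vec<EncodedStatement>>, usize) {
    let mut batches = Vec::new();
    let mut current: Vec<EncodedStatement> = Vec::new();
    let mut current_size = 0usize;
    let mut skipped = 0usize;

    for statement in statements {
        let size = statement.len();

        // An oversized statement can never fit into any notification: skip it.
        if size > max_size {
            skipped += 1;
            continue;
        }

        // Close the current batch if adding this statement would exceed the limit.
        if current_size + size > max_size && !current.is_empty() {
            batches.push(std::mem::take(&mut current));
            current_size = 0;
        }

        current_size += size;
        current.push(statement.clone());
    }

    // Flush whatever is left in the last, partially filled batch.
    if !current.is_empty() {
        batches.push(current);
    }

    (batches, skipped)
}
```

With a limit of 10 bytes and statements of 4, 4, 20 and 6 bytes, this sketch yields two batches (4 + 4 and 6) and skips the 20-byte statement. A greedy split like this keeps statements in their original order, at the cost of not always packing batches as tightly as possible.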

Integration

Internal optimizations to the gossip protocol. No downstream changes required.

Related PR: #9965

Things to handle in further PRs

  • After this PR, nodes don't send all statements to new peers anymore, only the recent ones.
  • After restarting, the node doesn't re-gossip statements it had not yet gossiped.
  • Broadcasting notifications to all peers is held up when the first peer is slow. We could use a FuturesUnordered instead.
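
A rough timing model shows why the last point matters. It assumes notifications are awaited one peer at a time and that each send is capped by the 10s timeout; neither assumption is confirmed in detail here. The function names are hypothetical, and the concurrent case only stands in for what driving all sends together (for example through a FuturesUnordered) would achieve.

```rust
use std::time::Duration;

/// Time to finish a broadcast when sends are awaited one peer after another.
/// Each send takes its peer's latency, capped at `timeout`.
fn sequential_broadcast_time(latencies: &[Duration], timeout: Duration) -> Duration {
    latencies.iter().map(|l| (*l).min(timeout)).sum()
}

/// Time to finish a broadcast when all sends are driven concurrently:
/// the slowest (capped) send determines the total.
fn concurrent_broadcast_time(latencies: &[Duration], timeout: Duration) -> Duration {
    latencies
        .iter()
        .map(|l| (*l).min(timeout))
        .max()
        .unwrap_or(Duration::ZERO)
}
```

In this model, one stalled peer followed by two peers answering in 50ms gives 10.1s sequentially, with the fast peers waiting behind the stalled one, versus 10s concurrently, where the fast peers are served right away.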

@AndreiEres AndreiEres force-pushed the AndreiEres/fix-statement-store-gossiping branch from 860f41c to 7b8aeb1 Compare October 2, 2025 11:10
@AndreiEres AndreiEres added the T0-node This PR/Issue is related to the topic “node”. label Oct 2, 2025
@AndreiEres (Contributor, Author)

/cmd prdoc --audience node_dev --bump patch

@AndreiEres AndreiEres changed the title from "[WIP] Fix statement-store gossiping" to "Fix statement-store gossiping" Oct 2, 2025
@AndreiEres AndreiEres requested a review from bkchr October 3, 2025 08:03
@bkchr (Member) left a comment

Some tests would be nice for the new send logic.

@AndreiEres AndreiEres requested a review from bkchr October 7, 2025 16:01
@AndreiEres AndreiEres added the A4-backport-unstable2507 and A4-backport-stable2509 labels and removed the A4-backport-stable2503 label Oct 16, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 16, 2025
@AndreiEres AndreiEres added this pull request to the merge queue Oct 16, 2025
@AndreiEres AndreiEres removed this pull request from the merge queue due to a manual request Oct 16, 2025
@AndreiEres AndreiEres enabled auto-merge October 16, 2025 09:05
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/18556111296
Failed job name: test-linux-stable

@AndreiEres AndreiEres added this pull request to the merge queue Oct 16, 2025
Merged via the queue into master with commit b21cbb5 Oct 16, 2025
261 of 264 checks passed
@AndreiEres AndreiEres deleted the AndreiEres/fix-statement-store-gossiping branch October 16, 2025 16:18
@paritytech-release-backport-bot

Created backport PR for stable2506:

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin backport-9912-to-stable2506
git worktree add --checkout .worktree/backport-9912-to-stable2506 backport-9912-to-stable2506
cd .worktree/backport-9912-to-stable2506
git reset --hard HEAD^
git cherry-pick -x b21cbb58ab50d5d10371393967537f6f221bb92f
git push --force-with-lease

@paritytech-release-backport-bot

Created backport PR for unstable2507:

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin backport-9912-to-unstable2507
git worktree add --checkout .worktree/backport-9912-to-unstable2507 backport-9912-to-unstable2507
cd .worktree/backport-9912-to-unstable2507
git reset --hard HEAD^
git cherry-pick -x b21cbb58ab50d5d10371393967537f6f221bb92f
git push --force-with-lease

@paritytech-release-backport-bot

Created backport PR for stable2509:

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin backport-9912-to-stable2509
git worktree add --checkout .worktree/backport-9912-to-stable2509 backport-9912-to-stable2509
cd .worktree/backport-9912-to-stable2509
git reset --hard HEAD^
git cherry-pick -x b21cbb58ab50d5d10371393967537f6f221bb92f
git push --force-with-lease

alvicsam pushed a commit that referenced this pull request Oct 17, 2025
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bastian Köcher <[email protected]>
@AndreiEres AndreiEres added the A4-backport-stable2506, A4-backport-stable2509, and A4-backport-unstable2507 labels and removed the A4-backport-stable2506, A4-backport-unstable2507, and A4-backport-stable2509 labels Oct 21, 2025
gui1117 pushed a commit that referenced this pull request Oct 22, 2025
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bastian Köcher <[email protected]>
(cherry picked from commit b21cbb5)
acatangiu pushed a commit that referenced this pull request Oct 22, 2025
Backport #9912 into `stable2509` from AndreiEres.

See the
[documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md)
on how to use this bot.

Co-authored-by: Andrei Eres <[email protected]>
Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: gui1117 <[email protected]>
Labels

A4-backport-stable2506: Pull request must be backported to the stable2506 release branch
A4-backport-stable2509: Pull request must be backported to the stable2509 release branch
A4-backport-unstable2507: Pull request must be backported to the unstable2507 release branch
T0-node: This PR/Issue is related to the topic “node”.

4 participants