Commit 86ebed5

Add docs about wasted traffic (#2047)
Fixes #576 Signed-off-by: Martin Florian <[email protected]>
1 parent 9bf47a6 commit 86ebed5

3 files changed

+49
-1
lines changed

docs/src/deployment/observability/metrics.rst

Lines changed: 2 additions & 0 deletions
@@ -64,6 +64,8 @@ Configuring a docker compose deployment to enable metrics
6464

6565
When using docker compose for the deployment, the metrics are enabled by default. These can be accessed at `http://validator.localhost/metrics` for the validator app and at `http://participant.localhost/metrics` for the participant.
6666

67+
.. _metrics_grafana_dashboards:
68+
6769
Grafana Dashboards
6870
++++++++++++++++++
6971

docs/src/deployment/traffic.rst

Lines changed: 46 additions & 0 deletions
@@ -171,3 +171,49 @@ the validator app will
171171

172172
For configuring the built-in top-up automation, please refer to the :ref:`validator deployment guide <helm_validator_topup>`.
173173
Configuring alternative methods for buying traffic, e.g., using third-party services, exceeds the scope of this documentation.
174+
175+
.. _traffic_wasted:
176+
177+
Wasted traffic
178+
--------------
179+
180+
`Wasted traffic` is defined as synchronizer events that have been sequenced but will not be delivered to their recipients.
181+
For validators, which are subject to traffic fees,
182+
wasted traffic implies that :ref:`traffic <traffic_accounting>` has been charged for a message that was ultimately not delivered.
183+
Not all failed submissions result in wasted traffic:
184+
wasted traffic occurs only when a synchronizer event is rejected after sequencing but before delivery.
185+
Some level of wasted traffic is expected and unavoidable, due to factors such as:
186+
187+
- Submission request amplification.
188+
Participants that use BFT sequencer connections retry submission requests after a timeout to ensure speedy delivery in the face of nonresponsive sequencers;
189+
if processing was simply slower than usual but the sequencer was not faulty, the duplicate request counts as wasted traffic.
190+
- Duplication of messages within the ordering layer, typically linked to transient networking issues or load spikes.
191+
- Duplication of submissions on the participant/app side, for example when catching up after restoring from a backup or after some crashes.
192+
193+
Validator perspective
194+
+++++++++++++++++++++
195+
196+
Validator operators are encouraged to investigate the causes of repeatedly failing submissions.
197+
As stated above, not all failed submissions result in wasted traffic, and some wasted traffic is unavoidable.
198+
Attention is warranted, however, if the rate of wasted traffic increases significantly.
199+
200+
The Splice distribution contains a :ref:`Grafana dashboard <metrics_grafana_dashboards>` about `Synchronizer Fees (validator view)`,
201+
to assist in monitoring traffic-related metrics.
202+
The `Rejected Event Traffic` panel on this dashboard is especially relevant for determining the rate of wasted traffic.
203+
(Hover on the ⓘ symbols in panel headers for precise descriptions of the shown data.)
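As a rough illustration of what "increases significantly" could mean in practice, the following sketch compares a current wasted-traffic ratio against a baseline of earlier observations. It is not part of Splice: the function names, the ``factor`` and ``floor`` parameters, and the input values are all hypothetical, and the numbers would have to come from your own monitoring (for example, values read off the `Rejected Event Traffic` panel).

```python
from statistics import mean


def wasted_ratio(wasted_bytes: float, total_bytes: float) -> float:
    """Fraction of charged traffic that was wasted (0.0 if nothing was charged)."""
    if total_bytes <= 0:
        return 0.0
    return wasted_bytes / total_bytes


def is_significant_increase(
    history: list[float],
    current: float,
    factor: float = 3.0,   # hypothetical: how many times the baseline counts as "significant"
    floor: float = 0.01,   # hypothetical: ignore ratios below 1% regardless of baseline
) -> bool:
    """Return True if `current` clearly exceeds the average of `history`.

    Some wasted traffic is expected, so the check is relative to a baseline
    rather than to zero. With no history there is no baseline to compare to.
    """
    if not history:
        return False
    baseline = mean(history)
    return current > max(floor, baseline * factor)


if __name__ == "__main__":
    past = [0.02, 0.02, 0.02]  # illustrative ratios from earlier time windows
    now = wasted_ratio(wasted_bytes=10_000, total_bytes=100_000)
    print(is_significant_increase(past, now))
```

The thresholds here are placeholders; a suitable baseline window and factor depend on the workload of the validator in question.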
204+
205+
SV perspective
206+
++++++++++++++
207+
208+
SV operators are encouraged to monitor wasted traffic across all synchronizer members,
209+
as reported for example by sequencer :ref:`metrics <metrics>`,
210+
to detect cases where wasted traffic increases significantly and/or globally.
211+
The Splice distribution contains a :ref:`Grafana dashboard <metrics_grafana_dashboards>` about `Synchronizer Fees (SV view)` that can be helpful,
212+
as well as an alert definition that focuses on validator participants.
213+
214+
Note that wasted traffic is less relevant for SVs themselves, as SV components have unlimited traffic.
215+
Note also that SV mediators and sequencers waste traffic as part of their regular operation:
216+
they make heavy use of aggregate submissions, where sequencers collect messages from a group of senders and only deliver a single message per recipient once a threshold of individual submissions has been sequenced;
217+
sequenced individual submissions beyond the aggregation threshold count as wasted traffic.
218+
All that said, should an SV component suddenly exhibit a significant increase in wasted traffic,
219+
this likely points to an actual issue that should be investigated.
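The aggregation rule described above can be sketched with a small counting example. This is only an illustrative model, not Canton or Splice code: the function name and parameters are hypothetical, and returning zero when the threshold is not reached is an assumption, since the text only describes submissions sequenced beyond the threshold.

```python
def wasted_aggregate_submissions(sequenced: int, threshold: int) -> int:
    """Count individual submissions that land beyond the aggregation threshold.

    `sequenced` is the number of individual submissions that were sequenced
    for one aggregate submission; `threshold` is the number required before
    the single aggregated message is delivered to each recipient.

    Assumption: if the threshold is never reached, this sketch reports zero,
    because the documentation above does not describe that case.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    if sequenced < 0:
        raise ValueError("sequenced must be non-negative")
    return max(0, sequenced - threshold)


if __name__ == "__main__":
    # Five senders submit, but delivery happens once three have been sequenced:
    # the remaining two submissions count as wasted traffic.
    print(wasted_aggregate_submissions(sequenced=5, threshold=3))
```

This shows why a steady, non-zero level of wasted traffic from SV mediators and sequencers is normal: whenever more senders submit than the threshold requires, the surplus is wasted by construction.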

docs/src/release_notes.rst

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ Upcoming
2020

2121
- Various improvements to the docs on :ref:`recovering a validator from an identities backup <validator_reonboard>`,
2222
including adding a section on :ref:`obtaining an identities backup from a database backup <validator_manual_dump>`.
23-
23+
- Add documentation about :ref:`Wasted traffic <traffic_wasted>`.
2424

2525
0.4.13
2626
------
