Make metrics based bad order detection order specific #4021

MartinquaXD · 2026-01-06T10:41:45Z

Description

Currently the bad token detection assumes that we can reliably detect "broken" orders, and that the only remaining problems come from orders trading specific tokens that a particular solver cannot handle. This assumption no longer holds given the increasing complexity of new order types, which can suddenly start failing for any number of reasons.
The most prominent recent example was flashloan orders, where the EIP-1271 signature verified correctly but transferring the tokens into the settlement contract failed because the user's Aave debt position was not healthy enough.
Our current logic caused a lot of collateral damage: such orders could get many reasonable tokens flagged as unsupported, although the tokens themselves were perfectly fine and only that particular order was problematic.

Changes

To address this, this PR changes the metrics-based detection mechanism to flag individual orders instead of flagging all orders trading specific tokens. The change itself is relatively simple (collect metrics keyed by Uid instead of token) but came with a few related changes:

- the name bad_token_detection is now incorrect in most (but not all!) cases, so many things were renamed
  - this includes a few config parameters, so they must be updated in the infra repo as well!
- caching uids has a lot more potential to bloat the cache, so a cache eviction task was introduced; this required 2 new config parameters (max_age, gc_interval)
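
For readers unfamiliar with the detector, the core idea of keying metrics by order instead of by token can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `OrderMetrics` type, the `failure_ratio` and `required_measurements` parameters, and the 56-byte `Uid` stand-in are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an order UID (assumed here to be 56 bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Uid(pub [u8; 56]);

/// Outcome counters for a single order.
#[derive(Clone, Copy, Debug, Default)]
struct Stats {
    attempts: u32,
    failures: u32,
}

/// Tracks settlement outcomes per order rather than per token, so one
/// failing order cannot cause unrelated orders on the same token to be flagged.
pub struct OrderMetrics {
    /// Assumed threshold: share of failed attempts at which an order is flagged.
    failure_ratio: f64,
    /// Assumed minimum number of attempts before an order can be flagged.
    required_measurements: u32,
    stats: HashMap<Uid, Stats>,
}

impl OrderMetrics {
    pub fn new(failure_ratio: f64, required_measurements: u32) -> Self {
        Self {
            failure_ratio,
            required_measurements,
            stats: HashMap::new(),
        }
    }

    /// Records one settlement attempt for the given order.
    pub fn record(&mut self, uid: Uid, failed: bool) {
        let entry = self.stats.entry(uid).or_default();
        entry.attempts += 1;
        if failed {
            entry.failures += 1;
        }
    }

    /// Returns true only for orders with enough measurements and a
    /// failure share at or above the threshold. Unknown orders are not bad.
    pub fn is_bad(&self, uid: &Uid) -> bool {
        match self.stats.get(uid) {
            Some(s) if s.attempts >= self.required_measurements => {
                f64::from(s.failures) / f64::from(s.attempts) >= self.failure_ratio
            }
            _ => false,
        }
    }
}
```

With this shape, two orders trading the same token are judged independently: repeated failures of one order only affect that order's entry.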

How to test

adjusted existing unit tests to make sure the metrics logic still works correctly with Uid
added a new unit test for the cache eviction

MartinquaXD · 2026-01-06T10:45:45Z

crates/driver/src/domain/competition/detector/bad_orders/metrics.rs

GitHub unfortunately marks this whole file as new when it was actually just moved and modified slightly. The new parts are:

added last_seen_at

added spawn_gc_task (and the associated unit test)
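
To illustrate how last_seen_at and a GC task fit together, here is a rough sketch under stated assumptions. It uses a plain std thread and a `u64` key for brevity; the actual `spawn_gc_task` in this PR may well be an async task with a different signature. `Cache`, `touch` and `evict_stale` are hypothetical names; only last_seen_at, max_age and gc_interval come from the PR.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, Weak};
use std::thread;
use std::time::{Duration, Instant};

/// Per-order entry; only the field relevant to eviction is shown.
struct Entry {
    /// Refreshed whenever the order is observed again.
    last_seen_at: Instant,
}

/// Cache keyed by a hypothetical order identifier (a plain u64 for brevity).
#[derive(Default)]
pub struct Cache {
    entries: HashMap<u64, Entry>,
}

impl Cache {
    /// Inserts the order or refreshes its last_seen_at timestamp.
    pub fn touch(&mut self, uid: u64, now: Instant) {
        self.entries.insert(uid, Entry { last_seen_at: now });
    }

    /// Drops every entry not seen within `max_age`; returns how many were removed.
    pub fn evict_stale(&mut self, now: Instant, max_age: Duration) -> usize {
        let before = self.entries.len();
        self.entries
            .retain(|_, e| now.saturating_duration_since(e.last_seen_at) <= max_age);
        before - self.entries.len()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

/// Runs eviction every `gc_interval` until the cache itself has been dropped.
/// Holding only a Weak reference keeps the task from keeping the cache alive.
pub fn spawn_gc_task(
    cache: Weak<Mutex<Cache>>,
    max_age: Duration,
    gc_interval: Duration,
) -> thread::JoinHandle<()> {
    thread::spawn(move || loop {
        thread::sleep(gc_interval);
        match cache.upgrade() {
            Some(cache) => {
                cache.lock().unwrap().evict_stale(Instant::now(), max_age);
            }
            None => break,
        }
    })
}
```

Passing `now` into `evict_stale` explicitly keeps the eviction logic testable without sleeping, which is one way a unit test for cache eviction could be written.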

The refactoring/renaming should probably be separated from the logic changes to avoid this.

This diff is much better https://www.diffchecker.com/iMNVcpf7/

m-sz · 2026-01-07T14:34:19Z

Infra PR https://github.com/cowprotocol/infrastructure/pull/4302

jmg-duarte

LGTM

crates/driver/src/domain/competition/bad_orders/metrics.rs

squadgazzz · 2026-01-07T17:41:30Z

crates/driver/src/infra/solver/mod.rs

+pub struct BadOrderDetection {
    /// Tokens that are explicitly allow- or deny-listed.
-    pub tokens_supported: HashMap<eth::TokenAddress, bad_tokens::Quality>,
+    pub tokens_supported: HashMap<eth::TokenAddress, bad_orders::Quality>,


It is unfortunate that bad_orders now relate to tokens.

I have moved this into the common detector module.

squadgazzz · 2026-01-07T17:42:31Z

crates/driver/src/domain/competition/detector/bad_tokens/simulation.rs

Do we really need to move the simulation to bad_orders? It still works with tokens, doesn't it?

The simulation still considers the tokens. I will re-arrange the hierarchy to reflect that and be less confusing.

m-sz · 2026-01-08T14:03:00Z

Moved bad_token and bad_order detection into the common detector module as submodules.

The public-facing API (Quality and Detector) now resides in it. Its submodules, bad_tokens::simulation and bad_orders::metrics, describe exactly how we rule out the specific offending trades. The rest of the crate uses only the main detector module and the detector::Detector struct, which makes it less confusing.
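
A rough sketch of the module hierarchy described above, for orientation only. The module and type names Quality, Detector, bad_tokens::simulation and bad_orders::metrics come from this thread; the Quality variants, method names and bodies are placeholders and do not reflect the real implementation.

```rust
pub mod detector {
    /// Public-facing verdict shared by both strategies (variants are assumed).
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum Quality {
        Supported,
        Unsupported,
        Unknown,
    }

    pub mod bad_tokens {
        /// Simulation-based detection still reasons about tokens.
        pub mod simulation {
            use super::super::Quality;

            #[derive(Default)]
            pub struct Detector;

            impl Detector {
                /// Placeholder: a real detector would simulate a transfer.
                pub fn token_quality(&self, _token: [u8; 20]) -> Quality {
                    Quality::Unknown
                }
            }
        }
    }

    pub mod bad_orders {
        /// Metrics-based detection reasons about individual orders.
        pub mod metrics {
            use super::super::Quality;

            #[derive(Default)]
            pub struct Detector;

            impl Detector {
                /// Placeholder: a real detector would consult per-order stats.
                pub fn order_quality(&self, _uid: [u8; 56]) -> Quality {
                    Quality::Unknown
                }
            }
        }
    }

    /// Single entry point used by the rest of the crate.
    #[derive(Default)]
    pub struct Detector {
        simulation: bad_tokens::simulation::Detector,
        metrics: bad_orders::metrics::Detector,
    }

    impl Detector {
        /// An order is filtered out only if either strategy rejects it.
        pub fn is_tradable(&self, token: [u8; 20], uid: [u8; 56]) -> bool {
            self.simulation.token_quality(token) != Quality::Unsupported
                && self.metrics.order_quality(uid) != Quality::Unsupported
        }
    }
}
```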

Also removed the hand rolled current_unix_timestamp() in favour of now_in_epoch_seconds()

m-sz · 2026-01-08T14:07:39Z

I am now left wondering what the configuration struct should be called. It's currently named BadOrderDetectionConfig and configures both the simulation-based bad token detection and the metrics-based bad order detection. Calling it simply DetectorConfig could be confusing.

jmg-duarte · 2026-01-08T17:40:26Z

I am now left wondering what the configuration struct should be called. It's currently named BadOrderDetectionConfig and configures both the simulation-based bad token detection and the metrics-based bad order detection. Calling it simply DetectorConfig could be confusing.

Suggestions:

- FaultDetectorConfig
- Split the configs into token/order or metrics/simulation (or both) and then create a separate one called DetectionConfigurations; it's no longer ambiguous because you can open it and see both explicit ones
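
The second suggestion could look roughly like the sketch below. Only DetectionConfigurations, max_age and gc_interval are names taken from this PR; the two inner struct names, the `enabled` flags and all default values are made up for illustration.

```rust
use std::time::Duration;

/// Hypothetical settings for the simulation-based bad token detection.
#[derive(Clone, Debug, PartialEq)]
pub struct TokenSimulationConfig {
    pub enabled: bool,
}

/// Hypothetical settings for the metrics-based bad order detection.
#[derive(Clone, Debug, PartialEq)]
pub struct OrderMetricsConfig {
    pub enabled: bool,
    /// Entries not seen for longer than this are evicted from the cache.
    pub max_age: Duration,
    /// How often the eviction task runs.
    pub gc_interval: Duration,
}

/// Wrapper that makes both explicit configs visible in one place.
#[derive(Clone, Debug, PartialEq)]
pub struct DetectionConfigurations {
    pub token_simulation: TokenSimulationConfig,
    pub order_metrics: OrderMetricsConfig,
}

impl Default for DetectionConfigurations {
    fn default() -> Self {
        Self {
            token_simulation: TokenSimulationConfig { enabled: false },
            order_metrics: OrderMetricsConfig {
                enabled: false,
                // Placeholder values, not the defaults used by the driver.
                max_age: Duration::from_secs(600),
                gc_interval: Duration::from_secs(60),
            },
        }
    }
}
```

Whether this split is practical depends on whether the two detectors share any parameters.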

squadgazzz · 2026-01-08T18:56:38Z

I am now left wondering what the configuration struct should be called. It's currently named BadOrderDetectionConfig and configures both the simulation-based bad token detection and the metrics-based bad order detection. Calling it simply DetectorConfig could be confusing.

We now have 2 detectors that live in different modules. Can we simply split their configs or do they share some config params?

MartinquaXD added 4 commits January 6, 2026 07:48

use metrics to flag bad orders instead of bad tokens

33b7411

Rename bad_token module to bad_orders

d5e84e4

Merge remote-tracking branch 'origin/main' into bad-order-detection

23350d4

More renaming and unit tests

5ba26d4

MartinquaXD and others added 3 commits January 6, 2026 10:49

fix failing tests

2962298

Merge branch 'main' into convert-bad-token-to-bad-order-detection

a7c4aa4

Merge branch 'main' into convert-bad-token-to-bad-order-detection

b9e05ac

m-sz self-assigned this Jan 7, 2026

m-sz added 2 commits January 7, 2026 15:12

Merge branch 'main' into convert-bad-token-to-bad-order-detection

b530ad1

Fix naming of freeze-time argument

e71af0e

m-sz marked this pull request as ready for review January 7, 2026 14:52

m-sz requested a review from a team as a code owner January 7, 2026 14:52

m-sz added 2 commits January 7, 2026 16:44

Merge branch 'main' into convert-bad-token-to-bad-order-detection

2d6a925

Merge branch 'main' into convert-bad-token-to-bad-order-detection

c1fc50c

jmg-duarte approved these changes Jan 7, 2026

squadgazzz reviewed Jan 7, 2026

m-sz added 2 commits January 8, 2026 14:58

Re-structure bad order and bad token detector

cf7ddd9

Use now_in_epoch_seconds()

a91e6c3

m-sz added 2 commits January 8, 2026 15:58

Merge branch 'main' into convert-bad-token-to-bad-order-detection

90a4ef6

clippy

3e65816
