Skip to content

Conversation

@MartinquaXD
Copy link
Contributor

@MartinquaXD MartinquaXD commented Jan 6, 2026

Description

Currently the bad token detection assumes that we are perfectly able to detect "broken" orders and only orders that trade specific tokens that a particular solver is not able to handle cause problems. However this assumption does not work well with the increasing complexity of new order types that can suddenly start failing for any number of reasons.
The most prominent recent example were flashloan orders where the EIP 1271 signature verified correctly but transferring the tokens into the settlement contract failed because the user's Aave debt position was not healthy enough.
Our current logic caused a lot of collateral damage because such orders could cause many reasonable tokens to be flagged as unsupported although the tokens themselves were perfectly fine and only that particular order was problematic.

Changes

To address this this PR change the metrics based detection mechanism to only flag on an order by order basis instead flagging all orders trading specific tokens. The change itself is relatively simple (collect metrics keyed by Uid instead of token`) but came with a few related changes:

  • the name bad_token_detection is now incorrect in most (but not all!) cases so many things were renamed
    • this includes a few config parameters so they must be updated in the infra repo as well!
  • caching uids has a lot more potential to bloat the cache so a cache eviction task was introduced, this required 2 new config parameters (max_age, gc_interval)

How to test

  • adjusted existing unit to make sure the metrics logic still works correctly with Uid
  • added a new unit test for the cache eviction

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

github unfortunately marks this whole file as new when it actually was actually just moved and modified slightly. The new parts are:

  • added last_seen_at
  • added spawn_gc_task (and the associated unit test)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactoring/renaming should probably be separated from the logic changes to avoid this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is much better https://www.diffchecker.com/iMNVcpf7/

@m-sz m-sz self-assigned this Jan 7, 2026
@m-sz
Copy link
Contributor

m-sz commented Jan 7, 2026

@m-sz m-sz marked this pull request as ready for review January 7, 2026 14:52
@m-sz m-sz requested a review from a team as a code owner January 7, 2026 14:52
Copy link
Contributor

@jmg-duarte jmg-duarte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pub struct BadOrderDetection {
/// Tokens that are explicitly allow- or deny-listed.
pub tokens_supported: HashMap<eth::TokenAddress, bad_tokens::Quality>,
pub tokens_supported: HashMap<eth::TokenAddress, bad_orders::Quality>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unfortunate that bad_orders now relate to tokens.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved this into common module detector

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to move the simulation to bad_orders? It still works with tokens, doesn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simulation still considers the tokens. I will re-arrange the hierarchy to reflect that and be less confusing.

@m-sz
Copy link
Contributor

m-sz commented Jan 8, 2026

Moved bad_token and bad_order detection as submodules to common detector

The public facing api (Quality and the Detector) now reside inside of it. Then its submodules are bad_tokens::simulation and bad_orders::metrics which exactly describe how we rule out the specific offending trades. The rest of the crate uses only the main detector module and detector::Detector struct which makes it less confusing.

Also removed the hand rolled current_unix_timestamp() in favour of now_in_epoch_seconds()

@m-sz
Copy link
Contributor

m-sz commented Jan 8, 2026

I am now left wondering how should the configuration struct be called. It's currently named BadOrderDetectionConfig and configures both the bad token detection based on simulation and bad order detection based on metrics. Calling it simply DetectorConfig could be confusing.

@jmg-duarte
Copy link
Contributor

jmg-duarte commented Jan 8, 2026

I am now left wondering how should the configuration struct be called. It's currently named BadOrderDetectionConfig and configures both the bad token detection based on simulation and bad order detection based on metrics. Calling it simply DetectorConfig could be confusing.

Suggestions:

  1. FaultDetectorConfig
  2. Split the configs into token/order or metrics/simulation (or both) and then create a separate one called DetectionConfigurations — its no longer ambiguous because you can open it and see both explicit ones

@squadgazzz
Copy link
Contributor

I am now left wondering how should the configuration struct be called. It's currently named BadOrderDetectionConfig and configures both the bad token detection based on simulation and bad order detection based on metrics. Calling it simply DetectorConfig could be confusing.

We now have 2 detectors that live in different modules. Can we simply split their configs or do they share some config params?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants