petarjuki7
Contributor

Issue Addressed

#558
Makes sure if a cluster is liquidated that it doesn't get processed.

@petarjuki7 petarjuki7 changed the title fix(validator_store):filter inactive falidators fix(validator_store):filter inactive validators Sep 3, 2025
@petarjuki7 petarjuki7 changed the title fix(validator_store):filter inactive validators fix(validator_store): filter inactive validators Sep 3, 2025
// First, attempt to get the cluster normally
if let Some(cluster) = state.clusters().get_by(&validator.cluster_id) {
    if cluster.liquidated {
        return Err(Error::SpecificError(SpecificError::Unsupported));
Contributor Author

Not sure if this is the appropriate error to throw

Member

Should probably just add a new variant in, something like ClusterLiquidated
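A minimal sketch of that suggestion, assuming a simplified error enum. The `SpecificError` and `Cluster` definitions below are hypothetical stand-ins (only `Unsupported` and the `liquidated` field appear in the diff), and `ensure_cluster_active` is an illustrative helper, not a function from the codebase:

```rust
use std::fmt;

// Hypothetical stand-in for the crate's error enum. Only `Unsupported`
// appears in the diff; `ClusterLiquidated` is the name floated in review.
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
enum SpecificError {
    Unsupported,
    ClusterLiquidated,
}

impl fmt::Display for SpecificError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SpecificError::Unsupported => write!(f, "operation not supported"),
            SpecificError::ClusterLiquidated => write!(f, "cluster is liquidated"),
        }
    }
}

// Hypothetical, trimmed-down cluster record.
struct Cluster {
    liquidated: bool,
}

// Illustrative helper: reject liquidated clusters with the dedicated variant.
fn ensure_cluster_active(cluster: &Cluster) -> Result<(), SpecificError> {
    if cluster.liquidated {
        return Err(SpecificError::ClusterLiquidated);
    }
    Ok(())
}

fn main() {
    let liquidated = Cluster { liquidated: true };
    match ensure_cluster_active(&liquidated) {
        Ok(()) => println!("cluster is active"),
        Err(err) => println!("rejected: {err}"),
    }
}
```

A dedicated variant lets callers tell "cluster liquidated" apart from other unsupported operations without inspecting a message string.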

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a critical issue in which validators in liquidated clusters were still being processed. It adds a check that filters inactive validators out of cluster operations, so that validators belonging to liquidated clusters are rejected rather than processed.

  • Adds a liquidation status check for validator clusters
  • Returns an error when attempting to process validators from liquidated clusters

@petarjuki7 petarjuki7 self-assigned this Sep 3, 2025
@diegomrsantos
Contributor

Can we add a test to avoid this regression from being added?

@dknopik
Member

dknopik commented Sep 4, 2025

Nice! We also should filter in voting_pubkeys, so that the validator services stop attempting duties for those validators.

@petarjuki7
Contributor Author

Can we add a test to avoid this regression from being added?

Not sure if I understand, where to add a test?

@dknopik
Member

dknopik commented Sep 4, 2025

Can we add a test to avoid this regression from being added?

While I agree, right now it is really hard to test the validator store due to its tightly coupled nature. I think it might be better to keep it in mind for when we add more tests.

Alternatively, our integration test approach currently is based on local Testnets. We could add a check there where we assert that a liquidated validator does not attest

.shares()
.values()
.filter_map(|v| filter_func(DoppelgangerStatus::SigningEnabled(v.validator_pubkey)))
.filter(|public_key| self.get_validator_and_cluster(*public_key).is_ok())
Member

This approach has a problem:

We try to lock (within get_validator_and_cluster) while we already hold a lock (here by calling state)

This can cause a deadlock, as the second lock might wait for a pending writer, which waits for the lock we're holding here first.

Instead, get the cluster here and check for liquidation.
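A minimal sketch of the suggested shape, under stated assumptions: `std::sync::RwLock` stands in for whatever lock the database actually uses, and `PubKey`, `State`, `Store`, and `demo_store` are hypothetical simplifications. The point is that the liquidation check borrows the state that is already locked instead of acquiring a second read guard:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-ins; the real store uses richer types.
type PubKey = u64;

struct Cluster {
    liquidated: bool,
}

struct State {
    shares: Vec<PubKey>,
    // Simplification: clusters are looked up directly by validator pubkey.
    clusters: HashMap<PubKey, Cluster>,
}

struct Store {
    state: RwLock<State>,
}

// Takes the already-borrowed state, so it never touches the lock itself.
fn is_cluster_active(state: &State, pubkey: &PubKey) -> bool {
    if let Some(cluster) = state.clusters.get(pubkey) {
        return !cluster.liquidated;
    }
    // Unknown cluster: treat as inactive.
    false
}

impl Store {
    fn voting_pubkeys(&self) -> Vec<PubKey> {
        // Single read guard for the whole operation; no nested `read()` call.
        let state = self.state.read().expect("state lock poisoned");
        let keys: Vec<PubKey> = state
            .shares
            .iter()
            .filter(|pk| is_cluster_active(&state, pk))
            .copied()
            .collect();
        keys
    }
}

// Hypothetical fixture: key 1 is active, key 2 is liquidated, key 3 has no cluster.
fn demo_store() -> Store {
    let mut clusters = HashMap::new();
    clusters.insert(1, Cluster { liquidated: false });
    clusters.insert(2, Cluster { liquidated: true });
    Store {
        state: RwLock::new(State { shares: vec![1, 2, 3], clusters }),
    }
}

fn main() {
    println!("{:?}", demo_store().voting_pubkeys());
}
```

Passing `&State` into the helper keeps the locking decision in one place: the caller that already holds the guard.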

Contributor Author

Ah okay, I understand. I was trying to avoid some code duplication.

@diegomrsantos
Contributor

Can we add a test to avoid this regression from being added?

While I agree, right now it is really hard to test the validator store due to its tightly coupled nature. I think it might be better to keep it in mind for when we add more tests.

Alternatively, our integration test approach currently is based on local Testnets. We could add a check there where we assert that a liquidated validator does not attest

Could you explain in more detail how our integration test approach works? Where are the tests located?

@dknopik
Member

dknopik commented Sep 4, 2025

@diegomrsantos I am talking about #562. It runs a Testnet and then checks if the expected duties are performed. It seems feasible to add a liquidated cluster and check that it does not perform duties.

@diegomrsantos
Contributor

@diegomrsantos I am talking about #562. It runs a Testnet and then checks if the expected duties are performed. It seems feasible to add a liquidated cluster and check that it does not perform duties.

But where are the tests exactly? I see only a yaml file there

Comment on lines 593 to 604
fn is_cluster_active(&self, validator_pubkey: &PublicKeyBytes) -> bool {
    let state = self.database.state();

    if let Some(validator) = state.metadata().get_by(validator_pubkey)
        && let Some(cluster) = state.clusters().get_by(&validator.cluster_id)
    {
        return !cluster.liquidated;
    }

    // We did not manage to fetch the cluster
    false
}
Member

Unfortunately, same issue as before:

By trying to borrow here (in line 594) while we are already holding the lock through the .state() call in voting_pubkeys, we risk a deadlock if a writer tries to acquire the lock between the two calls.

Furthermore, you can call get_by directly with the validator pubkey - no need to go via the metadata, as the multi index map allows access via the validator_pubkey.

dknopik
dknopik previously approved these changes Sep 5, 2025
Member

@dknopik dknopik left a comment

LGTM, thanks! I've got one nitpick, but that one is a matter of taste. Feel free to address or merge (by adding the ready-to-merge label)

Comment on lines 808 to 810
if let Some(clusters) = clusters.get_by(public_key) {
    return !clusters.liquidated;
}
false
Member

nice, exactly :)

still got one nitpick in me :P

I really like Option::is_some_and, but that is a matter of taste
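For comparison, a small sketch of the two equivalent forms. `Cluster` is a hypothetical trimmed-down type and the function names are illustrative; `Option::is_some_and` itself is a standard-library method (stable since Rust 1.70):

```rust
// Hypothetical, trimmed-down cluster record.
struct Cluster {
    liquidated: bool,
}

// Shape used in the diff: explicit `if let` with a fallback of `false`.
fn is_active_if_let(cluster: Option<&Cluster>) -> bool {
    if let Some(cluster) = cluster {
        return !cluster.liquidated;
    }
    false
}

// Same logic with `Option::is_some_and`: `None` yields `false`,
// `Some(c)` yields the result of the closure.
fn is_active_is_some_and(cluster: Option<&Cluster>) -> bool {
    cluster.is_some_and(|c| !c.liquidated)
}

fn main() {
    let active = Cluster { liquidated: false };
    let liquidated = Cluster { liquidated: true };
    for candidate in [None, Some(&active), Some(&liquidated)] {
        // Both forms agree on every input.
        assert_eq!(is_active_if_let(candidate), is_active_is_some_and(candidate));
    }
    println!("both forms agree");
}
```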

@dknopik dknopik added the v1.0.0 First Mainnet-release label Sep 5, 2025
@dknopik dknopik changed the base branch from unstable to release-v0.3.0 September 5, 2025 15:12
@dknopik dknopik dismissed their stale review September 5, 2025 15:12

The base branch was changed.

@dknopik
Member

dknopik commented Sep 5, 2025

Sorry, I just noticed that this was based on unstable. Most bugfixes should be based on the release-v0.3.0 branch, in order to separate experimental work (on unstable) from our fixes ahead of the big release. Can you please rebase your commits on top of release-v0.3.0?

@petarjuki7 petarjuki7 force-pushed the petarjuki7/filter_inactive_validators branch from 2fac0bf to d2d1e46 Compare September 6, 2025 14:58
@petarjuki7
Contributor Author

Closing this in favour of #589 to keep a cleaner commit history

@petarjuki7 petarjuki7 closed this Sep 8, 2025
mergify bot pushed a commit that referenced this pull request Sep 8, 2025
#558
Makes sure if a cluster is liquidated that it doesn't get processed.

Closes #576
petarjuki7 added a commit to petarjuki7/anchor that referenced this pull request Sep 16, 2025
sigp#558
Makes sure if a cluster is liquidated that it doesn't get processed.

Closes sigp#576
diegomrsantos pushed a commit to diegomrsantos/anchor that referenced this pull request Sep 17, 2025
sigp#558
Makes sure if a cluster is liquidated that it doesn't get processed.

Closes sigp#576