Member

@gballet gballet commented Aug 14, 2025

πŸ—’οΈ Description

Add a test required as part of the BloatNet effort. This is the

🔗 Related Issues or PRs

Not an issue, but a test plan is described here.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx --with=tox-uv tox -e lint,typecheck,spellcheck,markdownlint
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Signed-off-by: Guillaume Ballet <[email protected]>
@gballet gballet changed the title from "BloatNet: add first few single-opcode test for state access." to "feat(tests): add first few single-opcode test for state access in BloatNet" on Aug 14, 2025
Signed-off-by: Guillaume Ballet <[email protected]>

remove leftover single whitespace :|
@gballet gballet force-pushed the bloatnet-test-SSTORE branch from b6cd62a to 374e08a Compare August 14, 2025 19:16
Collaborator

LouisTsai-Csie commented Aug 15, 2025

Hello @gballet! Thanks for adding this case.

This is the issue tracker for bloatnet test cases. Could you please (1) add the PR to the issue tracker's description (like this) and (2) link this PR to the issue? This would help us track progress better, thank you!

For benchmark tests, we now add new cases under tests/benchmark, and I think test_worst_stateful_opcodes.py is the best fit for your test.

I also added some review comments below; please feel free to let me know if you have any questions! If you want a reference for benchmark tests, you can take a look at this. It is a similar case for this benchmark, and you can follow its structure!

Member Author

gballet commented Aug 21, 2025

Hey @LouisTsai-Csie thanks for the feedback.

This is the issue tracker for bloatnet test cases. Could you please (1) add the PR to the issue tracker's description (like this) and (2) link this PR to the issue? This would help us track progress better, thank you!

I tried to do this, but it looks very involved. I made a best effort, but since I don't know what you're expecting and I don't have all the time in the world, I'll leave it in your court to comment on that. #2064

For benchmark tests, we now add new cases under tests/benchmark, and I think test_worst_stateful_opcodes.py is the best fit for your test.

If I do that, how do I run the tests? They seem to be ignored after I moved the file to that directory. I have pushed the change to this PR for your consideration.

I also added some review comments below; please feel free to let me know if you have any questions! If you want a reference for benchmark tests, you can take a look at this. It is a similar case for this benchmark, and you can follow its structure!

Thanks for the reference.

@LouisTsai-Csie
Collaborator

@gballet Apologies, I forgot to link the issue tracker for you. We've created an issue tracker based on your documentation.

I have linked this PR to the item "SSTORE — Fill block with SSTORE(0 → 1) to maximize new storage slot creation". Please let me know if it does not fit in that category.

Also, it would be great if you could help me review whether anything is missing or wrong in our issue tracker!

Member Author

gballet commented Aug 21, 2025

I'll need to have a closer look, but it seems fine as a first pass. Do you know what the problem is with moving my file to benchmarks?

Collaborator

LouisTsai-Csie commented Aug 21, 2025

Our documentation is incomplete (I will fix it ASAP). To run tests under the benchmark/ folder, you need to add the -m benchmark flag. By default, these tests are ignored to avoid some overhead in the CI/release process.

This is the command in our documentation:

fill -v tests/benchmark/test_worst_blocks.py::test_block_full_of_ether_transfers --fork Osaka

But I would add some flags to run it:

uv run fill -v tests/benchmark/test_worst_blocks.py::test_block_full_of_ether_transfers --fork Osaka -m benchmark --clean
  • uv run: we use uv as the package manager.
  • -m benchmark: without this flag, benchmark tests are ignored by default.
  • --clean: you will need this if you have already filled tests before.

Please let me know if there is anything unclear to you!
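
As a rough illustration of how the flags above combine, here is a minimal Python sketch that assembles the same command line. The helper name `build_fill_command` is hypothetical and not part of the repository; only the flags quoted in the comment above (`-v`, `--fork`, `-m benchmark`, `--clean`) are used.

```python
import shlex


def build_fill_command(
    test_id: str,
    fork: str,
    *,
    benchmark: bool = True,
    clean: bool = True,
) -> list[str]:
    """Assemble the `uv run fill` invocation as an argv list (hypothetical helper)."""
    cmd = ["uv", "run", "fill", "-v", test_id, "--fork", fork]
    if benchmark:
        # Without `-m benchmark`, tests under the benchmark folder are ignored by default.
        cmd += ["-m", "benchmark"]
    if clean:
        # Needed when tests were already filled by an earlier run.
        cmd.append("--clean")
    return cmd


command = build_fill_command(
    "tests/benchmark/test_worst_blocks.py::test_block_full_of_ether_transfers",
    "Osaka",
)
print(shlex.join(command))
```

Building the command as a list, rather than one string, keeps each flag a separate argument, so it could be handed to `subprocess.run` without shell quoting concerns.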

@gballet gballet marked this pull request as ready for review August 27, 2025 08:53
Collaborator

@fselmo fselmo left a comment

Hey @gballet. I did a first pass at this strictly just from setup. I didn't dig into the actual test case to make sure it's doing what we want it to do. I am going to take a deeper look at the logic.

Collaborator

fselmo commented Aug 28, 2025

I just wanted to add a bit more context on the gas_benchmark_value. This allows us to run something like:

uv run fill --fork=Prague -m benchmark --gas-benchmark-values 1,10,30,45,100,150 --clean -k bloatnet

The values set the gas limit for the block, not the transaction gas cap, and are given in Mgas (1 = 1 Mgas), so we can test against several block gas limits in one run. I am looking a bit deeper into the PR next but wanted to provide some better context.
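
To make the unit concrete, here is a small Python sketch of how such a comma-separated list maps to block gas limits, assuming only what the comment states (1 = 1 Mgas). The function name `parse_gas_benchmark_values` is hypothetical; this is not the framework's own parser.

```python
MGAS = 1_000_000  # 1 Mgas expressed in gas units


def parse_gas_benchmark_values(raw: str) -> list[int]:
    """Convert a string like "1,10,30" (in Mgas) into block gas limits in gas units."""
    limits = []
    for part in raw.split(","):
        mgas = int(part.strip())
        if mgas <= 0:
            raise ValueError(f"gas benchmark value must be positive, got {mgas}")
        limits.append(mgas * MGAS)
    return limits


for limit in parse_gas_benchmark_values("1,10,30,45,100,150"):
    print(f"{limit:,} gas")
```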

# with them. Only fill the block by a factor of SPEEDUP.
SPEEDUP: int = 100


Collaborator

@jsign jsign Aug 31, 2025

We have already covered cold and warm SSTORE/SLOAD, with the same and with different values.

Are these tests different in some way?

Collaborator

@fselmo fselmo Aug 31, 2025

Hmm yeah, good observation. The biggest question, I suppose, is whether those worst-case scenarios are the same as the ones in the bloatnet doc here. If they are, a good approach might be to put a marker on the existing relevant tests so we don't have to redefine them. Then, instead of trying to run only the tests in this file, we could select with a marker like -m bloatnet: any existing, relevant parametrized cases carrying this marker would be included in the run. We could also mark this whole file (test_bloatnet.py) with the same marker, so that you get all the test cases relevant to you in one command, whether they are defined here or elsewhere.

Just a thought.
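
The marker idea can be sketched with a toy model in plain Python. This is not pytest itself, only an illustration of the selection behaviour being proposed: a case is picked up by a `-m bloatnet`-style filter if it carries the marker, regardless of which file defines it. The test names `test_storage_write` and `test_sstore` are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A collected test case and the markers attached to it."""

    node_id: str
    markers: frozenset[str]


def select(cases: list[Case], marker: str) -> list[str]:
    """Return the ids of cases carrying `marker`, mimicking `-m <marker>` selection."""
    return [case.node_id for case in cases if marker in case.markers]


CASES = [
    # Existing benchmark case that is also tagged as relevant to bloatnet.
    Case(
        "test_worst_stateful_opcodes.py::test_storage_write",
        frozenset({"benchmark", "bloatnet"}),
    ),
    # Existing benchmark case that bloatnet is not interested in.
    Case(
        "test_worst_blocks.py::test_block_full_of_ether_transfers",
        frozenset({"benchmark"}),
    ),
    # A case from test_bloatnet.py, which would inherit a file-level marker.
    Case("test_bloatnet.py::test_sstore", frozenset({"benchmark", "bloatnet"})),
]

print(select(CASES, "bloatnet"))
```

In pytest terms, the per-case tags would correspond to `pytest.mark.bloatnet` applied to individual tests or parameters, and the file-level tag to a module-level `pytestmark` assignment; the marker name itself is the one suggested in the comment above and would need to be registered.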

Member Author

Let's not conflate two different things; the objectives are clearly different:

  • BloatNet looks for the performance of regular execution
  • the zk benchmarks are benchmarking zkevms, a widely different environment

Adding extra coupling here causes more work for no benefit, since these tests are maintained by different sets of people.

Regarding the worst case, the tests are doing different things since the code itself is different. The goal of the bloatnet test is to measure the performance of SSTORE alone in a client, whereas the zkvm-specific code, as I read it, does extra work such as jumps. That's normal: you wouldn't be able to load as much code inside a zkvm as we do in our test. But we can.

Collaborator

@jsign jsign Sep 1, 2025

@gballet, the tests in benchmark are not specific to zkvms. (They are even used for PerfNet).

The tests I linked execute blocks where the full gas limit is used to do cold or warm reads, or to write to existing or non-existent storage slots. There's nothing specific to zkvms there, which is why I ask how these tests are different: mainly to avoid duplication, or to explain better which different variant is being benchmarked.

Member

This needs to be addressed before merging!

Member Author

I consider this already addressed by my first comment: it's not the same bytecode, not the same objectives (worst case vs perf), and not the same constraints. Ignacio's tests might not be specific to zkvms, but we do need something specific for us.

fselmo and others added 5 commits September 2, 2025 16:59
* refactor(tests): Proposed patch for bloatnet SSTORE tests

* refactor(tests): Update tests from comments on PR

PR: #1
Signed-off-by: fselmo <[email protected]>

* Use parametrization of the value that is written to

---------

Signed-off-by: fselmo <[email protected]>
Co-authored-by: Guillaume Ballet <[email protected]>
Co-authored-by: felipe <[email protected]>
Collaborator

@LouisTsai-Csie LouisTsai-Csie left a comment

I have suggested some refactoring of the code; please help apply it, thanks!

Let's wait for the answer about execute mode for the bloatnet scenario.

Member

@marioevz marioevz left a comment

LGTM, just one comment about an issue that was making the test fail when using small --gas-benchmark-values.

gballet and others added 3 commits September 9, 2025 13:15
Co-authored-by: 蔡佳誠 Louis Tsai <[email protected]>
Co-authored-by: Mario Vega <[email protected]>

5 participants