Skip to content

Conversation

DZakh
Copy link
Member

@DZakh DZakh commented Sep 23, 2025

The first PR of the refactoring, where I introduced checkpoints. Currently, most of the functionality stays the same, and checkpoints serve exactly the same purpose as the last_scanned_block_hashes did - reorg detection only.
It'd be nice to add a few more tests, but it's good to go as it is.

Also, besides the preparations for the next changes, this PR fixes:

  • Issue of skipping reorgs on crashing during rollback
  • Potentially removes a lot of redundant db writes for indexers with many addresses
  • Correctly displays indexing progress for ordered multichain indexers (primarily visual)

Summary by CodeRabbit

  • New Features

    • Per-chain progress tracking and reorg-aware checkpoints; initialization now exposes checkpointId and reorg checkpoints.
    • Unified Batch.t API carries total batch size, per-chain progress, and checkpoint metadata consumed across IO and processing flows.
  • Refactor

    • Replaced legacy end-of-block storage with a Checkpoints model and centralized reorg detection around maxReorgDepth.
    • Batch-centric processing replaces many per-item/progress flows; utilities for dict operations streamlined.
  • Chores

    • Config key renamed: confirmedBlockThreshold → maxReorgDepth; DB/schema updated to persist max_reorg_depth and checkpoints.
  • Tests

    • Tests and mocks updated to new reorg/batch APIs; some obsolete E2E mocks/tests removed.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 23, 2025

Walkthrough

Reworks batching to a batch-centric, per-chain progression model with checkpoints and reorg-aware state. Adds Batch.t (with progressedChainsById and checkpoint arrays), threads reorgCheckpoints/checkpointId through storage and fetchers, replaces confirmedBlockThreshold with maxReorgDepth, and removes EndOfBlockRangeScannedData APIs.

Changes

Cohort / File(s) Summary
Batch model & pipeline
codegenerator/cli/npm/envio/src/Batch.res, codegenerator/cli/templates/static/codegen/src/EventProcessing.res, codegenerator/cli/templates/static/codegen/src/IO.res, codegenerator/cli/templates/static/codegen/src/InMemoryStore.res, codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
Introduces Batch.t (items, totalBatchSize, progressedChainsById, checkpoint arrays); implements ordered/unordered prepare paths and make; converts event processing, IO, and in-memory DCS storage to consume Batch.t; changes setDcsToStore to single-chain API.
Reorg detection & fetchers
codegenerator/cli/npm/envio/src/ReorgDetection.res, codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
Replaces LastBlockScannedHashes with ReorgDetection.t and constructor make(~blocks, ~maxReorgDepth, ~shouldRollbackOnReorg); uses maxReorgDepth for thresholds; ChainFetcher threaded with reorgCheckpoints/maxReorgDepth.
Persistence & storage wiring
codegenerator/cli/npm/envio/src/Persistence.res, codegenerator/cli/npm/envio/src/PgStorage.res, codegenerator/cli/npm/envio/src/Internal.res
Adds reorgCheckpoints and checkpointId to initial/resumed state; PgStorage queries and exposes checkpoints and committed checkpoint id; introduces Internal.reorgCheckpoint type.
DB schema & table APIs
codegenerator/cli/npm/envio/src/db/InternalTable.res, codegenerator/cli/npm/envio/src/db/Table.res
Adds max_reorg_depth / maxReorgDepth to Chains; renames EndOfBlockRangeScannedDataCheckpoints (envio_checkpoints); adds makeGetReorgCheckpointsQuery, makeCommitedCheckpointIdQuery, makeInsertCheckpointQuery; adds BigInt field type variant; updates progressed-chains persistence shape.
Config & templates
codegenerator/cli/npm/envio/src/InternalConfig.res, codegenerator/cli/templates/dynamic/codegen/src/RegisterHandlers.res.hbs, codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs, codegenerator/cli/templates/static/codegen/src/db/DbFunctions.res, codegenerator/cli/templates/static/codegen/src/db/DbFunctionsImplementation.js
Renames confirmedBlockThresholdmaxReorgDepth in configs/templates; updates generated code; removes EndOfBlockRangeScannedData APIs/JS exports; TestHelpers_MockDb switched to batch-centric flow and includes reorgCheckpoints in storage readiness.
Global state & actions
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
Switches reducers/actions to batch payloads (EventBatchProcessed now carries Batch.t); removes UpdateEndOfBlockRangeScannedData and prune throttler; updates queue/progress updates to use progressedChainsById.
Utilities & fetch state
codegenerator/cli/npm/envio/src/Utils.res, codegenerator/cli/npm/envio/src/FetchState.res
Reimplements dict helpers (mapValues, mapValuesToArray, filterMapValues, forEach*, size, isEmpty) using raw loops; adjusts dict iteration key/value ordering usages; renames getProgressBlockNumbergetUnorderedMultichainProgressBlockNumberAt(~index) and updates unordered-batch sorting/selection logic.
Tests & scenarios
scenarios/test_codegen/test/* (e.g., ChainManager_test.res, ReorgDetection_test.res, lib_tests/PgStorage_test.res, lib_tests/FetchState_test.res, lib_tests/Persistence_test.res, helpers/Mock.res, rollback/Rollback_test.res, removed files: Mock_test.res, __mocks__/DbStub.res)
Updates tests to maxReorgDepth and ReorgDetection.make, migrates tests to Batch-based flows, adds Checkpoints query tests, includes reorgCheckpoints/checkpointId in mock/persistence initial state, and removes/refactors some E2E/mock test fixtures.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor App
  participant ChainManager
  participant BatchPrep as Batch.make
  participant Fetchers as ChainFetcher[]
  participant EventProc as EventProcessing
  participant DbIO as Db.executeBatch
  participant Pg as PgStorage/Checkpoints

  App->>ChainManager: createBatch(~checkpointIdBeforeBatch, ~chainsBeforeBatch, ~multichain, ~batchSizeTarget)
  ChainManager->>Fetchers: read fetchState + reorgDetection
  ChainManager->>BatchPrep: make(chainsBeforeBatch, multichain, batchSizeTarget)
  BatchPrep-->>ChainManager: Batch.t { items, progressedChainsById, checkpoint arrays }

  App->>EventProc: processEventBatch(~batch)
  EventProc->>DbIO: executeBatch(~batch)
  DbIO->>Pg: setProgressedChains(derived from batch.progressedChainsById)
  Note over Pg: reads/writes envio_checkpoints and returns checkpointId/reorgCheckpoints
  DbIO-->>EventProc: ok
  EventProc-->>App: done
Loading
sequenceDiagram
  autonumber
  participant Pg as PgStorage
  participant IT as InternalTable.Checkpoints
  participant CF as ChainFetcher
  participant RD as ReorgDetection

  Pg->>IT: makeGetReorgCheckpointsQuery(~pgSchema)
  IT-->>Pg: SQL rows (reorg checkpoints, safe_block per chain)
  Pg-->>CF: reorgCheckpoints[]
  CF->>RD: make(~blocks from checkpoints, ~maxReorgDepth, ~shouldRollbackOnReorg)
  RD-->>CF: ReorgDetection.t
  CF->>RD: getLatestValidScannedBlock / getThresholdBlockNumbers
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested reviewers

  • JonoPrest
  • moose-code

Poem

"I nibble code beneath the moon,
New batches bloom and checkpoints tune.
Reorgs kept mild, depths set right,
Chains hop steady through the night.
A rabbit cheers: the pipeline's bright!" 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Track block hashes close to the head” reflects a concrete part of the changeset by highlighting the new checkpoint storage of block hashes for reorg detection, even though the PR also includes a wider refactor of batch and reorg logic.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dz/track-blocks

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Base automatically changed from dz/full-block-in-batch to main September 30, 2025 08:37
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (1)

477-539: Critical: Implement reorg retry logic in getLastKnownValidBlock

  • Lines 477–539 of codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res still contain FIXME placeholders and a commented-out retry loop with no ChainBlocks calls.
  • You must replace these placeholders with a ChainBlocks-based reorg detection and retry mechanism (for example, leveraging ChainBlocks.registerReorgGuard or an equivalent getLatestValidBlock API) to ensure correct rollback.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0a3cb9 and 55b0da6.

📒 Files selected for processing (23)
  • codegenerator/cli/npm/envio/src/Batch.res (4 hunks)
  • codegenerator/cli/npm/envio/src/ChainBlocks.res (1 hunks)
  • codegenerator/cli/npm/envio/src/InternalConfig.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Persistence.res (1 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (3 hunks)
  • codegenerator/cli/npm/envio/src/ReorgDetection.res (8 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (7 hunks)
  • codegenerator/cli/npm/envio/src/db/Table.res (1 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbs (2 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/RegisterHandlers.res.hbs (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctions.res (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsImplementation.js (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (11 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (4 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (4 hunks)
  • scenarios/test_codegen/test/ChainManager_test.res (1 hunks)
  • scenarios/test_codegen/test/Config_test.res (1 hunks)
  • scenarios/test_codegen/test/E2EEthNode_test.res (1 hunks)
  • scenarios/test_codegen/test/Integration_ts_helpers.res (1 hunks)
  • scenarios/test_codegen/test/ReorgDetection_test.res (15 hunks)
  • scenarios/test_codegen/test/__mocks__/MockConfig.res (1 hunks)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res (4 hunks)
💤 Files with no reviewable changes (2)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsImplementation.js
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctions.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/npm/envio/src/Persistence.res
  • scenarios/test_codegen/test/Integration_ts_helpers.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/InternalConfig.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
  • scenarios/test_codegen/test/E2EEthNode_test.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • codegenerator/cli/npm/envio/src/ReorgDetection.res
  • scenarios/test_codegen/test/ReorgDetection_test.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/npm/envio/src/db/Table.res
  • codegenerator/cli/npm/envio/src/ChainBlocks.res
  • scenarios/test_codegen/test/ChainManager_test.res
  • scenarios/test_codegen/test/__mocks__/MockConfig.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • scenarios/test_codegen/test/Config_test.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/Batch.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/npm/envio/src/Persistence.res
  • scenarios/test_codegen/test/Integration_ts_helpers.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/InternalConfig.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
  • scenarios/test_codegen/test/E2EEthNode_test.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • codegenerator/cli/npm/envio/src/ReorgDetection.res
  • scenarios/test_codegen/test/ReorgDetection_test.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/npm/envio/src/db/Table.res
  • codegenerator/cli/npm/envio/src/ChainBlocks.res
  • scenarios/test_codegen/test/ChainManager_test.res
  • scenarios/test_codegen/test/__mocks__/MockConfig.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • scenarios/test_codegen/test/Config_test.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/Batch.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/RegisterHandlers.res.hbs
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbs
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (51)
codegenerator/cli/templates/static/codegen/src/IO.res (1)

227-240: Review comment is accurate and actionable. It correctly highlights the missing blocks-table rollback and the optional simplification of the single‐element Promise.all. No changes needed.

codegenerator/cli/npm/envio/src/db/InternalTable.res (3)

19-19: LGTM! Field addition follows consistent patterns.

The maxReorgDepth field is correctly added across all necessary locations with proper naming conventions (snake_case for DB, camelCase for ReScript) and consistent type mappings.

Also applies to: 32-32, 57-57, 70-70, 105-105


209-242: LGTM! Type and function correctly restructured.

The new progressedChain type is properly scoped to this module's needs (excluding batchSize which is not persisted), and setProgressedChains correctly maps the fields to the progress update query.


268-291: LGTM! Module refactored for checkpoint-based reorg tracking.

The Checkpoints module (formerly EndOfBlockRangeScannedData) is correctly structured with:

  • Sequential bigint id as primary key for ordering
  • Nullable blockHash for reorg detection when applicable
  • Proper field mappings and schema definitions
codegenerator/cli/npm/envio/src/Batch.res (2)

3-24: LGTM! New types properly structured for checkpoint-based batching.

The new types introduce:

  • batchCheckpoint with optional blockHash and items for flexible reorg tracking
  • chainBeforeBatch to encapsulate per-chain state
  • mutChainAcc for mutable accumulation during batch building

The optional field syntax and comments clearly document when fields may be absent.


50-50: Minor optimization: Safe use of getUnsafe.

The change from Array.get to Array.getUnsafe is safe here since the loop bounds guarantee idx is within array bounds (0 to length-1).

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (2)

33-33: LGTM! Throttler field correctly removed.

The pruneStaleEndBlockData field has been removed from WriteThrottlers in alignment with the removal of EndOfBlockRangeScannedData tracking. The return statement correctly includes only the remaining throttlers.


357-394: LGTM! Refactored to use blocks-based reorg detection.

The validatePartitionQueryResponse function has been successfully refactored to:

  • Use ChainBlocks.registerReorgGuard instead of manual hash tracking
  • Update chainFetcher.blocks instead of lastBlockScannedHashes
  • Simplify the non-rollback return path by removing EndOfBlockRangeScannedData construction
scenarios/test_codegen/test/Integration_ts_helpers.res (1)

31-31: LGTM! Field rename is consistent with the PR objectives.

The rename from confirmedBlockThreshold to maxReorgDepth aligns with the project-wide refactoring and maintains the same value (200).

codegenerator/cli/npm/envio/src/db/Table.res (1)

8-8: LGTM! BigInt field type addition is properly integrated.

The new BigInt variant is correctly decorated with @as("BIGINT") and is already handled in the schema coercion logic at line 202.

scenarios/test_codegen/test/E2EEthNode_test.res (1)

42-42: LGTM! Consistent field rename in E2E test configuration.

The rename maintains the same value and aligns with the project-wide refactoring.

scenarios/test_codegen/test/ChainManager_test.res (1)

111-111: LGTM! Parameter rename aligns with ReorgDetection API changes.

The update to ~maxReorgDepth=200 is consistent with the renamed parameter in the LastBlockScannedHashes.empty function.

scenarios/test_codegen/test/Config_test.res (1)

11-11: LGTM! Test expectation updated to match the renamed config field.

The assertion correctly expects maxReorgDepth: 200 instead of the old field name, ensuring the test remains valid.

codegenerator/cli/npm/envio/src/InternalConfig.res (1)

33-33: LGTM! Semantically improved field name.

The rename from confirmedBlockThreshold to maxReorgDepth improves clarity about the field's purpose in reorg handling.

scenarios/test_codegen/test/__mocks__/MockConfig.res (1)

43-43: LGTM! Mock config updated consistently.

The test mock correctly reflects the renamed field from the type definition.

scenarios/test_codegen/test/lib_tests/PgStorage_test.res (2)

118-125: LGTM! Test chain configs updated consistently.

All test inputs correctly use maxReorgDepth matching the type definition changes.


571-634: LGTM! Additional test configs consistently updated.

All chain config test cases properly reflect the maxReorgDepth field rename with values preserved.

codegenerator/cli/npm/envio/src/Persistence.res (1)

20-20: LGTM! New field for reorg checkpoint tracking.

The reorgCheckpoints field extends the initial state to support the new checkpoint-based reorg detection mechanism described in the PR objectives.

codegenerator/cli/npm/envio/src/PgStorage.res (4)

710-710: LGTM! Clean run initializes with empty checkpoints.

Correctly initializes reorgCheckpoints as an empty array for a clean run, which is appropriate since no blocks have been processed yet.


899-911: LGTM! Resume path loads checkpoints concurrently.

The code properly:

  • Upgrades from Promise.all2 to Promise.all3 to include checkpoint loading
  • Loads all checkpoint records from the database
  • Maintains concurrent loading for performance

913-918: LGTM! Resumed state includes checkpoints.

The returned state correctly includes the loaded reorgCheckpoints, matching the Persistence.initialState type definition.


68-68: Verify migration for table rename

I found no migration or cleanup logic handling the replacement of the old end_of_block_range_scanned_data table with the new Checkpoints table in the ReScript sources or SQL scripts. Confirm that your upgrade/migration workflow includes renaming or migrating data from the old table (or dropping it safely) and that existing deployments remain compatible.

codegenerator/cli/templates/dynamic/codegen/src/RegisterHandlers.res.hbs (1)

59-59: LGTM! Clean rename from confirmedBlockThreshold to maxReorgDepth.

The field rename is consistent with the broader refactoring, and the value source remains correctly mapped from the YAML configuration.

codegenerator/cli/templates/dynamic/codegen/src/ConfigYAML.res.hbs (2)

28-28: LGTM! Type signature updated to maxReorgDepth.

The field rename in the type definition aligns with the broader refactoring across the codebase.


61-61: LGTM! Public config builder updated to maxReorgDepth.

The object key rename is consistent with the type change and maintains correct value mapping.

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (3)

6-10: Clear documentation for the new progressCheckpointId field.

The comments explain the intent: tracking blocks in processing order and noting that it can be reset to 0 when no history exists yet.


33-33: LGTM! Clean initialization in makeFromConfig.

Starting with 0n is appropriate for a fresh config without history.


74-74: LGTM! Proper threading of reorgCheckpoints.

The reorgCheckpoints from initialState are correctly passed to ChainFetcher.makeFromDbState for reorg detection.

scenarios/test_codegen/test/ReorgDetection_test.res (3)

17-24: LGTM! Test helper updated to use maxReorgDepth.

The mock function correctly reflects the API change from confirmedBlockThreshold to maxReorgDepth, maintaining test functionality.


30-74: LGTM! Comprehensive test parameter updates.

All test cases systematically updated to use maxReorgDepth, ensuring consistency with the renamed API.


81-145: LGTM! Registration tests updated correctly.

The registerReorgGuard test properly uses the new maxReorgDepth parameter in both the empty initialization and mock comparisons.

codegenerator/cli/npm/envio/src/ChainBlocks.res (3)

3-6: LGTM! Clean type definition for ChainBlocks.

The type appropriately encapsulates reorg rollback configuration and the block hash tracking state.


8-29: LGTM! Constructor properly filters and initializes chain-specific reorg state.

The make function correctly:

  • Filters reorgCheckpoints by matching chainId
  • Maps filtered checkpoints to the expected blockNumber and blockHash shape
  • Passes the data and maxReorgDepth to ReorgDetection.LastBlockScannedHashes.makeWithData

31-49: LGTM! registerReorgGuard properly delegates and updates state.

The function:

  • Correctly threads shouldRollbackOnReorg from the ChainBlocks instance
  • Delegates to the underlying ReorgDetection.LastBlockScannedHashes.registerReorgGuard
  • Returns both the updated ChainBlocks instance and the reorg result
codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (8)

15-15: LGTM!

The addition of maxReorgDepth field to the type t is correct and aligns with the PR objective to rename confirmedBlockThreshold to maxReorgDepth.


24-24: LGTM!

Replacing lastBlockScannedHashes with blocks: ChainBlocks.t is consistent with the PR's goal to introduce ChainBlocks for reorg tracking.


45-46: LGTM!

The addition of maxReorgDepth and reorgCheckpoints parameters to the make function signature is correct and properly threads the reorg state through initialization.


184-187: LGTM!

The blockLag calculation correctly uses maxReorgDepth instead of the old chainConfig.confirmedBlockThreshold, maintaining the same logic with the renamed field.


202-212: LGTM!

The construction of blocks via ChainBlocks.make properly propagates chainId, maxReorgDepth, reorgCheckpoints, and shouldRollbackOnReorg configuration. This aligns with the PR's introduction of ChainBlocks for reorg tracking.


234-235: LGTM!

The makeFromConfig function correctly propagates maxReorgDepth from chainConfig.maxReorgDepth and initializes reorgCheckpoints as an empty array, which is appropriate for a fresh config-based initialization.


279-284: LGTM!

The makeFromDbState function correctly threads maxReorgDepth from resumedChainState.maxReorgDepth and reorgCheckpoints through to the make function, enabling proper state restoration from the database.


468-471: LGTM!

The getHighestBlockBelowThreshold function correctly uses cf.maxReorgDepth instead of the old chainConfig.confirmedBlockThreshold, maintaining the same threshold calculation logic.

codegenerator/cli/npm/envio/src/ReorgDetection.res (9)

46-46: LGTM!

The makeWithData signature correctly updates the parameter from confirmedBlockThreshold to maxReorgDepth, maintaining API consistency with the PR's renaming objective.


49-49: LGTM!

The empty signature correctly updates the parameter from confirmedBlockThreshold to maxReorgDepth.


81-81: LGTM!

The type field rename from confirmedBlockThreshold to maxReorgDepth is correct and consistent with the PR's objective.


92-104: LGTM!

The makeWithData implementation correctly propagates the renamed maxReorgDepth parameter throughout the function body (lines 100, 102).


106-110: LGTM!

The empty implementation correctly uses maxReorgDepth in both the parameter and the returned record (line 107).


113-118: LGTM!

The getDataByBlockNumberCopyInThreshold function correctly destructures maxReorgDepth (line 113) and uses it in the threshold calculation (line 118).


135-179: LGTM!

The registerReorgGuard function correctly destructures maxReorgDepth from self (line 135) and uses it when calling empty (line 179), maintaining consistency with the renaming.


196-203: LGTM!

The returned record literal correctly uses maxReorgDepth (line 198) in place of the old field name.


288-316: LGTM!

The rollbackToValidBlockNumber function correctly destructures maxReorgDepth (line 288) and uses it in the returned record (line 312), completing the consistent renaming across the module.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
codegenerator/cli/npm/envio/src/db/InternalTable.res (1)

209-242: Minor: Consider documenting the function rename.

The function was renamed from setProgress to setProgressedChains, and a new progressedChain type was introduced. While the changes are correct, the rename isn't directly related to the PR's stated objectives (tracking block hashes and reorg checkpoints).

Was this rename intentional for this PR, or should it be part of a separate refactoring effort?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55b0da6 and f5bf560.

📒 Files selected for processing (6)
  • codegenerator/cli/npm/envio/src/Batch.res (1 hunks)
  • codegenerator/cli/npm/envio/src/ChainBlocks.res (1 hunks)
  • codegenerator/cli/npm/envio/src/Persistence.res (1 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (3 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (7 hunks)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • codegenerator/cli/npm/envio/src/Batch.res
  • codegenerator/cli/npm/envio/src/ChainBlocks.res
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • codegenerator/cli/npm/envio/src/Persistence.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • codegenerator/cli/npm/envio/src/Persistence.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (10)
codegenerator/cli/npm/envio/src/Persistence.res (1)

20-20: LGTM! Field addition aligns with reorg checkpoint tracking.

The new reorgCheckpoints field properly extends the initial state to support reorg detection based on stored block hashes. The type reference to InternalTable.Checkpoints.reorgCheckpoint is consistent with the broader refactoring.

scenarios/test_codegen/test/lib_tests/PgStorage_test.res (3)

118-118: LGTM! Test data updated for renamed field.

The test correctly uses maxReorgDepth instead of the old confirmedBlockThreshold field name, maintaining consistency with the refactored configuration schema.

Also applies to: 125-125


547-653: Excellent test coverage for reorg checkpoint filtering logic.

The new test suite validates:

  • Chains with maxReorgDepth=0 are correctly skipped
  • Chains with negative reorg thresholds (blockHeight < maxReorgDepth) are filtered out
  • Query returns None when all chains are filtered
  • Only eligible chains appear in the generated SQL WHERE clause

This ensures the query generation logic properly handles edge cases.


679-679: LGTM! Consistent field rename across test cases.

All test chain configurations have been updated to use maxReorgDepth instead of confirmedBlockThreshold, ensuring test coverage aligns with the production code changes.

Also applies to: 706-706, 734-734, 742-742

codegenerator/cli/npm/envio/src/db/InternalTable.res (2)

19-19: LGTM! maxReorgDepth field properly integrated into Chains module.

The field is correctly:

  • Added to the field enum (line 19)
  • Included in the fields array (line 32)
  • Mapped in the type definition with @as("max_reorg_depth") (line 57)
  • Defined as Integer type with proper schema (line 70)
  • Populated from config in initialFromConfig (line 105)

Also applies to: 32-32, 57-57, 70-70, 105-105


310-335: Reorg checkpoint query logic looks correct.

The makeGetReorgCheckpointsQuery function properly:

  • Filters chains with maxReorgDepth === 0 (no reorg protection needed)
  • Filters chains where reorgThreshold < 0 (not enough blocks scanned yet)
  • Returns None when no valid chains remain (avoiding empty queries)
  • Builds parameterized WHERE conditions for each eligible chain
  • Only selects rows where block_hash IS NOT NULL

The filtering logic prevents unnecessary database queries and handles edge cases correctly.

codegenerator/cli/npm/envio/src/PgStorage.res (4)

68-68: LGTM! Table reference updated for Checkpoints module.

The change correctly references InternalTable.Checkpoints.table instead of the old EndOfBlockRangeScannedData.table, aligning with the module rename.


710-710: LGTM! Empty checkpoint array for clean initialization.

On a clean initialization, returning an empty reorgCheckpoints array is correct since no blocks have been processed yet.


908-921: Excellent conditional query execution pattern.

The code properly:

  • Calls makeGetReorgCheckpointsQuery which returns Option<string>
  • Returns empty array when query is None (no valid chains to check)
  • Only executes the database query when needed
  • Uses proper type casting for the result

This pattern avoids unnecessary database queries and handles edge cases gracefully.


925-925: LGTM! Checkpoints properly included in resume state.

The reorgCheckpoints field is correctly added to the returned initial state, ensuring reorg detection can resume from the last known checkpoint.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

1004-1009: FIXME: Blocks-based rollback logic still pending.

This FIXME was flagged in previous reviews and remains unresolved. The commented-out code shows the intended blocks-based rollback logic for lastBlockScannedHashes during reorg, but it's not yet implemented.

This could leave reorg rollback functionality incomplete for the new blocks-based checkpoint model.

Based on past review comments, the blocks-based rollback implementation is still pending. Verify whether this is blocking or can be deferred:

#!/bin/bash
# Description: Search for other FIXME/TODO related to blocks-based rollback

echo "=== Searching for blocks/rollback related FIXMEs ==="
rg -n 'FIXME.*block.*rollback|TODO.*block.*rollback' -g '*.res' -C3

echo "=== Searching for LastBlockScannedHashes.rollbackToValidBlockNumber usage ==="
rg -n 'rollbackToValidBlockNumber' -g '*.res' -C3
🧹 Nitpick comments (5)
codegenerator/cli/templates/static/codegen/src/InMemoryStore.res (1)

113-144: Consider deduping and early exit for DCR writes.

  • If dcs can contain duplicates by address, redundant Set updates may occur. Optional: dedupe by dc.address before looping.
  • Micro-optimization: return early when dcs is empty.
- dcs->Belt.Array.forEach(dc => {
+ if dcs->Belt.Array.length == 0 { () } else {
+   let unique =
+     dcs
+     ->Belt.Array.reduce(Js.Dict.empty(), (acc, dc) => { acc->Js.Dict.set(dc.address, dc); acc })
+     ->Js.Dict.values;
+   unique->Belt.Array.forEach(dc => {
      /* existing body */
-   })
+   })
+ }

Ensure InMemoryTable.Entity.set is idempotent for same entity.id.

codegenerator/cli/npm/envio/src/PgStorage.res (2)

69-69: Checkpoints included in initialization: good; ensure indexes support queries.

Adding InternalTable.Checkpoints.table is correct. Verify the table defines indices required by makeGetReorgCheckpointsQuery (e.g., by chainId, serial/height) to keep resume fast.


908-922: resumeInitialState: safe fallback and typed parse; consider ordering/limit.

  • None → [] is safe.
  • If query returns many rows, consider adding ORDER BY/LIMIT in makeGetReorgCheckpointsQuery for predictable size and performance.

Also applies to: 924-928

codegenerator/cli/npm/envio/src/Batch.res (2)

132-165: Potential mutation ordering issue with dcsToStore.

On line 145, leftDcsToStore is created as an empty array and passed to updateInternal. Then lines 148-157 mutate this same array by pushing items to it. This relies on the mutation happening after the reference is captured but before it's actually used.

While this works in JavaScript/ReScript, it's fragile and could be error-prone if the execution order changes. Consider creating the leftDcsToStore array before calling updateInternal.

Apply this diff to make the mutation order more explicit:

-        | dcs => {
-            let leftDcsToStore = []
-            let batchDcs = []
-            let updatedFetchState =
-              fetchState->FetchState.updateInternal(~mutItems=leftItems, ~dcsToStore=leftDcsToStore)
-            let nextProgressBlockNumber = updatedFetchState->FetchState.getProgressBlockNumber
-
-            dcs->Array.forEach(dc => {
-              // Important: This should be a registering block number.
-              // This works for now since dc.startBlock is a registering block number.
-              if dc.startBlock <= nextProgressBlockNumber {
-                batchDcs->Array.push(dc)
-              } else {
-                // Mutate the array we passed to the updateInternal beforehand
-                leftDcsToStore->Array.push(dc)
-              }
-            })
-
-            getChainAfterBatchIfProgressed(
-              ~chainBeforeBatch,
-              ~batchSize,
-              ~dcsToStore=Some(batchDcs),
-              ~updatedFetchState,
-            )
-          }
+        | dcs => {
+            let batchDcs = []
+            let leftDcsToStore = []
+            
+            // First determine progress block number with empty dcsToStore
+            let tempFetchState =
+              fetchState->FetchState.updateInternal(~mutItems=leftItems, ~dcsToStore=[])
+            let nextProgressBlockNumber = tempFetchState->FetchState.getProgressBlockNumber
+
+            // Then partition dcs based on progress block number
+            dcs->Array.forEach(dc => {
+              if dc.startBlock <= nextProgressBlockNumber {
+                batchDcs->Array.push(dc)
+              } else {
+                leftDcsToStore->Array.push(dc)
+              }
+            })
+
+            // Finally create the actual updated fetch state
+            let updatedFetchState =
+              fetchState->FetchState.updateInternal(~mutItems=leftItems, ~dcsToStore=leftDcsToStore)
+
+            getChainAfterBatchIfProgressed(
+              ~chainBeforeBatch,
+              ~batchSize,
+              ~dcsToStore=Some(batchDcs),
+              ~updatedFetchState,
+            )
+          }

85-187: Add explicit test for zero-batch progression
No existing tests exercise the edge case where batchSize = 0. Add a test that invokes createBatch (or getProgressedChainsById) with a zero batch size and verifies that chains at head still progress when progressBlockNumber < nextProgressBlockNumber.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5bf560 and 2dbe083.

⛔ Files ignored due to path filters (1)
  • scenarios/test_codegen/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (12)
  • codegenerator/cli/npm/envio/src/Batch.res (5 hunks)
  • codegenerator/cli/npm/envio/src/FetchState.res (3 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (4 hunks)
  • codegenerator/cli/npm/envio/src/Utils.res (1 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (8 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (2 hunks)
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res (2 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (2 hunks)
  • codegenerator/cli/templates/static/codegen/src/InMemoryStore.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (6 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (19 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/InMemoryStore.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/Utils.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/npm/envio/src/FetchState.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/InMemoryStore.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/Utils.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/npm/envio/src/FetchState.res
  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/InMemoryStore.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
🔇 Additional comments (24)
codegenerator/cli/templates/static/codegen/src/IO.res (1)

51-265: LGTM! Batch-centric refactor is correctly implemented.

The signature change from ~progressedChains to ~batch: Batch.t aligns with the PR's batch-centric progression model. The inline transformation at lines 259-265 correctly derives the progressedChain array from batch.progressedChainsById using the new chainAfterBatch type.

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (2)

17-17: LGTM! Field renamed to better reflect its purpose.

The rename from isFetchingAtHead to isProgressAtHead improves clarity about what this boolean tracks.


183-183: LGTM! Consistent rename of reorg threshold configuration.

The rename from confirmedBlockThreshold to maxReorgDepth is applied consistently across initialization, block lag calculation, and hash tracking. This aligns with the broader PR changes to use more descriptive naming.

Also applies to: 212-212, 263-263, 474-474

codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (2)

382-387: LGTM! Test mock updated to include reorg checkpoints.

The storageStatus construction now includes the reorgCheckpoints: [] field, aligning with the new checkpoint tracking introduced in this PR.


406-409: LGTM! Updated to match new InMemoryStore API.

The call to setDcsToStore now uses labeled parameters ~chainId and ~dcs instead of a composed dictionary argument, matching the updated API signature.

codegenerator/cli/npm/envio/src/FetchState.res (1)

95-98: Parameter order swapped consistently.

The forEachWithKey callback parameters have been reordered from (contractName, addresses) to (addresses, contractName) across all four call sites. The logic inside each callback has been correctly updated to match the new order.

As per coding guidelines: ReScript uses = for setting record field values, and := only for ref values created with the ref function. The changes here don't introduce any violations.

Also applies to: 103-110, 115-131, 1156-1162

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (3)

98-105: LGTM! Simplified to use Batch module for multichain readiness check.

The inline mapping of cf.fetchState directly to Batch.hasMultichainReadyItem simplifies the logic and aligns with the batch-centric model introduced in this PR.


107-118: LGTM! Batch construction consolidated into single call.

The createBatch function now uses Batch.make with an inline map to construct chainBeforeBatch records. The field mapping is correct:

  • fetchState from cf.fetchState
  • progressBlockNumber from cf.committedProgressBlockNumber
  • totalEventsProcessed from cf.numEventsProcessed
  • sourceBlockNumber from cf.currentBlockHeight

120-123: LGTM! Helper function renamed for clarity.

The rename from isFetchingAtHead to isProgressAtHead is consistent with the field rename in ChainFetcher.res and improves code readability.

codegenerator/cli/templates/static/codegen/src/EventProcessing.res (3)

316-325: byChain logging: verify field name and own-key iteration.

  • Ensure chainAfterBatch.progressBlockNumber matches Batch.chainAfterBatch type (name drift risk). If it’s toBlockNumber/progressToBlockNumber, update here.
  • This uses Utils.Dict.filterMapValues; after we add hasOwnProperty guards in Utils (see separate comment), this will be safe from inherited keys.

333-337: Handler and DB paths correctly switch to batch.items and IO.executeBatch(~batch).

Preload, handlers, and write path are consistent with the new batch model.

Also applies to: 341-341, 354-355


304-311: All processEventBatch call sites have been updated No instances of ~items or ~progressedChains remain in *.res files.

codegenerator/cli/templates/static/codegen/src/InMemoryStore.res (1)

105-107: Signature change: simpler call sites.

setDcsToStore(~chainId, ~dcs, ...) is clearer than dict-based input. LGTM.

codegenerator/cli/npm/envio/src/PgStorage.res (1)

711-711: initialize: reorgCheckpoints default [].

Baseline is reasonable. LGTM.

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (8)

11-34: LGTM! Clean removal of pruneStaleEndBlockData throttler.

The WriteThrottlers.make function has been simplified to remove the pruneStaleEndBlockData throttler, and the signature no longer requires a config parameter. This aligns with the PR's removal of EndOfBlockRangeScannedData APIs.


107-107: EventBatchProcessed payload updated to batch-centric model.

The action payload changed from {progressedChains, items} to {batch: Batch.t}, which is consistent with the batch-centric refactoring described in the PR summary.


110-115: UpdateQueues payload migrated to progressedChainsById.

The payload changed from updatedFetchStates: ChainMap.t<FetchState.t> to progressedChainsById: dict<Batch.chainAfterBatch>. This is consistent with the new batch model where progression state is captured per-chain in a dictionary.


565-602: EventBatchProcessed handler correctly uses batch object.

The handler has been updated to:

  • Accept {batch} payload (line 565)
  • Pass ~batch to updateProgressedChains (line 575)
  • Check for exit condition after batch processing (lines 580-595)

The logic correctly handles the batch-centric model. The preservation of NoExit when shouldUseTui is true (line 590) ensures the Dev Console remains operational.


629-656: UpdateQueues correctly extracts fetchState from progressedChainsById.

The handler iterates through chain fetchers and updates each with the corresponding fetchState from progressedChainsById (lines 631-636). The fallback to cf.fetchState when a chain is not in the dict (line 635) correctly handles chains that weren't affected by the batch.


823-835: Reorg threshold check correctly uses batch progressedChainsById.

The logic to determine shouldEnterReorgThreshold now correctly extracts the updated fetchState from batch.progressedChainsById (lines 826-831) before checking if the chain is ready to enter reorg threshold. The fallback to chainFetcher.fetchState when not in the dict is appropriate.


891-909: Dynamic contracts correctly extracted from batch progressedChainsById.

The loop (lines 889-909) correctly:

  • Extracts chainAfterBatch from progressedChainsById (line 891)
  • Accesses fetchState.dcsToStore (line 893)
  • Sets dynamic contracts in the in-memory store when present (lines 894-900)
  • Updates Prometheus metrics with chainAfterBatch.batchSize and progressBlockNumber (lines 902-906)

177-304: Verify cached isProgressAtHead remains accurate across batch creation and processing.
Ensure that the flag captured in isProgressAtHeadWhenBatchCreated (Batch.res) still reflects the true “at head” status when updateProgressedChains runs, and cannot become stale if new blocks or events arrive between batch creation and processing.

codegenerator/cli/npm/envio/src/Batch.res (2)

19-22: All downstream consumers use the updated Batch.t fields; no action needed.


290-295: Add tests or confirm coverage for single-chain ordered mode
The optimization using chainsBeforeBatch->ChainMap.size === 1 is valid, but I found no tests covering that code path. Add a test for single-chain ordered batches or confirm that existing tests exercise this scenario.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
scenarios/test_codegen/test/rollback/Rollback_test.res (1)

168-385: Consider adding tests for new block hash tracking mechanism.

While the existing tests continue to verify rollback behavior at a high level (checking task types, entity states, and rollback triggers), the commented assertions leave a gap in verifying the core PR feature: storing block hashes with serial IDs.

Based on the PR objectives, consider adding new test assertions that verify:

  1. Block hashes are stored with correct serial IDs following processing order
  2. Block hashes are NOT stored for unprocessed blocks (as stated in PR description)
  3. The new maxReorgDepth mechanism works correctly with the block hash tracking

Would you like me to help identify what specific assertions should be added to test the new block hash tracking functionality?

codegenerator/cli/npm/envio/src/ReorgDetection.res (1)

177-255: Avoid throwing on missing verified hashes; return the previous valid block instead.

The “Unexpected case” throw can occur if upstream gives a shorter verification set; prefer graceful fallback.

-        | None =>
-          Js.Exn.raiseError(
-            `Unexpected case. Couldn't find verified hash for block number ${blockNumberKey}`,
-          )
+        | None =>
+          getPrevScannedBlock(idx)

Also apply explicit sorting here for stability:

-    // Js engine automatically orders numeric object keys
-    let ascBlockNumberKeys = dataByBlockNumber->Js.Dict.keys
+    let ascBlockNumberKeys =
+      dataByBlockNumber
+      ->Js.Dict.keys
+      ->Belt.Array.keepMap(Int.fromString)
+      ->Js.Array2.sortInPlaceWith((a, b) => compare(a, b))
+      ->Belt.Array.map(Belt.Int.toString)

Doc nit: the comment mentions “or the latest scanned block outside of the reorg threshold.” Current implementation can’t return outside-threshold data. Update the comment or add a fallback scan over self.dataByBlockNumber when in-threshold search fails.

♻️ Duplicate comments (1)
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

1000-1006: FIXME remains: blocks-based rollback logic incomplete.

The commented-out rollback logic for lastBlockScannedHashes (lines 1000-1005) remains incomplete. This was flagged in previous reviews and is marked as addressed in commit 2dbe083, but the FIXME comment is still present in this version.

🧹 Nitpick comments (5)
codegenerator/cli/npm/envio/src/ReorgDetection.res (5)

63-76: Validate inputs in make (non-negative maxReorgDepth).

Guard against accidental negatives to avoid retaining too-large windows when subtracting from currentBlockHeight.

 let make = (~blocks, ~maxReorgDepth, ~shouldRollbackOnReorg, ~detectedReorgBlock=?) => {
+  if maxReorgDepth < 0 {
+    Js.Exn.raiseError("maxReorgDepth must be >= 0")
+  }
   let dataByBlockNumber = Js.Dict.empty()

78-99: Don’t rely on object-key enumeration order; sort keys explicitly.

Explicit sort makes behavior stable across runtimes and future refactors.

-  // Js engine automatically orders numeric object keys
-  let ascBlockNumberKeys = dataByBlockNumber->Js.Dict.keys
+  // Sort keys numerically to guarantee ascending order
+  let ascBlockNumberKeys =
+    dataByBlockNumber
+    ->Js.Dict.keys
+    ->Belt.Array.keepMap(Int.fromString)
+    ->Js.Array2.sortInPlaceWith((a, b) => compare(a, b))
+    ->Belt.Array.map(Belt.Int.toString)

100-175: Reorg branch should also apply pruning; confirm detectedReorgBlock reset semantics.

  • When reorg is detected and rollback is enabled, you compute the pruned copy but don’t persist it. Persisting keeps the map bounded.
  • In the NoReorg path you reset detectedReorgBlock to None; confirm this is intended for duplicate‑reorg suppression logic.
   | Some(reorgDetected) => (
       shouldRollbackOnReorg
         ? {
             ...self,
+            dataByBlockNumber: dataByBlockNumberCopyInThreshold,
             detectedReorgBlock: Some(reorgDetected.scannedBlock),
           }
         : make(~blocks=[], ~maxReorgDepth, ~shouldRollbackOnReorg),
       ReorgDetected(reorgDetected),
     )

If the reset is not intended, consider preserving detectedReorgBlock until you pass a safe checkpoint block.


257-293: Rollback should iterate over a sorted view of keys.

Ensure deterministic rollback boundary handling.

-  // Js engine automatically orders numeric object keys
-  let ascBlockNumberKeys = dataByBlockNumber->Js.Dict.keys
+  let ascBlockNumberKeys =
+    dataByBlockNumber
+    ->Js.Dict.keys
+    ->Belt.Array.keepMap(Int.fromString)
+    ->Js.Array2.sortInPlaceWith((a, b) => compare(a, b))
+    ->Belt.Array.map(Belt.Int.toString)

295-300: Sort returned block numbers.

Callers may expect ascending order; make it explicit.

-  dataByBlockNumberCopyInThreshold->Js.Dict.values->Js.Array2.map(v => v.blockNumber)
+  dataByBlockNumberCopyInThreshold
+  ->Js.Dict.values
+  ->Js.Array2.map(v => v.blockNumber)
+  ->Js.Array2.sortInPlaceWith((a, b) => compare(a, b))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2dbe083 and 82f127e.

📒 Files selected for processing (10)
  • codegenerator/cli/npm/envio/src/ReorgDetection.res (2 hunks)
  • codegenerator/cli/npm/envio/src/bindings/PromClient.res (2 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (10 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (2 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (19 hunks)
  • scenarios/test_codegen/test/ChainManager_test.res (3 hunks)
  • scenarios/test_codegen/test/E2E_test.res (1 hunks)
  • scenarios/test_codegen/test/helpers/Mock.res (3 hunks)
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res (2 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (9 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • scenarios/test_codegen/test/ChainManager_test.res
  • scenarios/test_codegen/test/E2E_test.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/npm/envio/src/ReorgDetection.res
  • codegenerator/cli/npm/envio/src/bindings/PromClient.res
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • scenarios/test_codegen/test/ChainManager_test.res
  • scenarios/test_codegen/test/E2E_test.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/npm/envio/src/ReorgDetection.res
  • codegenerator/cli/npm/envio/src/bindings/PromClient.res
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
🔇 Additional comments (35)
scenarios/test_codegen/test/rollback/Rollback_test.res (1)

716-716: LGTM! Helper function properly migrated to module.

The migration of initialEnterReorgThreshold to M.Helper.initialEnterReorgThreshold is consistent across all test scenarios and properly namespaced.

Also applies to: 746-747, 769-769, 1030-1031

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (8)

17-17: LGTM: Field updates align with reorg detection refactor.

The rename from isFetchingAtHead to isProgressAtHead improves clarity, and the addition of reorgDetection field properly replaces the previous hash-tracking approach.

Also applies to: 23-23


44-46: LGTM: Constructor signature updated correctly.

The new parameters reorgCheckpoints and maxReorgDepth properly support the transition from lastBlockScannedHashes and confirmedBlockThreshold to the new reorg detection model.


184-186: LGTM: Block lag calculation updated correctly.

The transition from confirmedBlockThreshold to maxReorgDepth in the block lag calculation maintains the correct semantics.


197-210: LGTM: ReorgDetection construction is correct.

The filtering of reorgCheckpoints by chainId and subsequent mapping to ReorgDetection.make parameters is implemented correctly, ensuring only relevant checkpoints are used for each chain.


231-232: LGTM: Fresh initialization properly uses empty checkpoints.

Starting with an empty reorgCheckpoints array in makeFromConfig is correct for a clean indexer start.


279-280: LGTM: Resume path correctly forwards persisted reorg data.

The makeFromDbState function properly passes through reorgCheckpoints and retrieves maxReorgDepth from the persisted resumedChainState.


474-477: LGTM: Threshold calculation updated correctly.

The substitution of maxReorgDepth for confirmedBlockThreshold maintains the same semantic calculation.


489-491: LGTM: ReorgDetection method calls are correct.

The calls to getThresholdBlockNumbers and getLatestValidScannedBlock correctly use the new reorgDetection API with appropriate parameters.

Also applies to: 524-527

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (9)

107-107: LGTM: EventBatchProcessed refactored to batch-centric model.

Consolidating the payload into {batch: Batch.t} aligns with the PR's goal of centralizing batch-based data flow.


110-115: LGTM: UpdateQueues refactored to use batch-based progression.

The change from updatedFetchStates to progressedChainsById: dict<Batch.chainAfterBatch> properly reflects the new batch-centric architecture.


177-228: LGTM: Batch-based progress tracking implemented correctly.

The updateProgressedChains function correctly extracts chain-specific progress from batch.progressedChainsById and updates Prometheus metrics and chain fetcher state appropriately.


349-355: LGTM: Reorg detection state update handled correctly.

The unpacking and assignment of updatedReorgDetection properly maintains the reorg detection state.


561-574: LGTM: EventBatchProcessed handler updated correctly.

The handler properly extracts the batch object and passes it to updateProgressedChains, maintaining consistency with the batch-centric model.


625-639: LGTM: UpdateQueues handler correctly uses progressedChainsById.

The handler properly looks up chain-specific progress from the progressedChainsById dictionary and updates fetch state accordingly.


837-837: LGTM: Empty progressedChainsById check prevents unnecessary work.

The guard correctly skips processing when no chains have progressed.


857-905: LGTM: Non-empty batch processing correctly handles progressedChainsById.

The processing path correctly:

  • Dispatches UpdateQueues with progressedChainsById
  • Extracts chainAfterBatch data for metrics
  • Accesses dcsToStore and batchSize from chainAfterBatch

819-831: LGTM: Reorg threshold check correctly uses batch fetchState.

The logic properly checks each chain's readiness to enter the reorg threshold, using the updated fetchState from batch.progressedChainsById when available.

scenarios/test_codegen/test/lib_tests/Persistence_test.res (1)

81-81: LGTM: Test updated to include new reorgCheckpoints field.

The addition of reorgCheckpoints: [] to both initialState objects correctly reflects the updated Persistence.initialState type.

Also applies to: 146-146

scenarios/test_codegen/test/E2E_test.res (1)

41-84: LGTM: E2E test for Prometheus metrics is well-structured.

The new test case properly validates the Prometheus metrics lifecycle:

  1. Initial state verification
  2. Post-reorg-threshold verification
  3. Post-batch-write verification

This provides good coverage for the metrics integration.

scenarios/test_codegen/test/helpers/Mock.res (5)

210-213: LGTM: Metric type added to support Prometheus testing.

The new metric type and corresponding field in Indexer.t enable E2E tests to verify Prometheus metrics behavior.

Also applies to: 221-221


228-229: LGTM: Metrics reset ensures clean test state.

Calling PromClient.resetMetrics during Indexer.make prevents metric pollution across test runs.


203-203: LGTM: Storage mock updated with reorgCheckpoints field.

The addition of reorgCheckpoints: [] aligns the mock with the updated storage state structure.


354-363: LGTM: Metric retrieval implementation is correct.

The implementation properly queries the Prometheus registry and formats the result into the test-friendly metric type.


593-618: LGTM: Helper function streamlines reorg threshold testing.

The initialEnterReorgThreshold helper encapsulates the common flow for driving the indexer into the reorg threshold state, improving test readability and reducing duplication.

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (4)

67-67: LGTM! Reorg checkpoints properly wired from initialState.

The reorg checkpoints are correctly passed from initialState.reorgCheckpoints to ChainFetcher.makeFromDbState, ensuring reorg detection state is restored on resume.


100-106: LGTM! Clean refactor to inline fetchState mapping.

The removal of the getFetchStates helper in favor of inline mapping is cleaner and more direct. The delegation to Batch.hasMultichainReadyItem maintains the same logic while simplifying the code structure.


109-119: LGTM! Batch creation refactored to single Batch.make call.

The refactoring to use a single Batch.make call with inline chainBeforeBatch construction is cleaner and more maintainable. The field mappings from ChainFetcher properties are correct:

  • fetchStatecf.fetchState
  • progressBlockNumbercf.committedProgressBlockNumber
  • totalEventsProcessedcf.numEventsProcessed
  • sourceBlockNumbercf.currentBlockHeight

121-124: LGTM! Consistent rename to isProgressAtHead.

The rename from isFetchingAtHead to isProgressAtHead is semantically clearer and consistent with the broader PR changes. Both the function name and the property access (cf.isProgressAtHead) have been updated correctly.

scenarios/test_codegen/test/ChainManager_test.res (4)

111-111: LGTM! Mock initialization updated to use maxReorgDepth.

The initialization of lastBlockScannedHashes using ReorgDetection.LastBlockScannedHashes.empty(~maxReorgDepth=200) correctly reflects the PR's terminology change from confirmedBlockThreshold to maxReorgDepth. The hardcoded value of 200 is appropriate for test mocks.


112-112: LGTM! Mock field renamed to isProgressAtHead.

The rename from isFetchingAtHead to isProgressAtHead is consistent with the changes in ChainManager.res and the broader PR terminology updates.


160-160: LGTM! Batch destructuring updated for progressedChainsById.

The destructuring pattern {items, progressedChainsById} correctly reflects the new Batch.t shape introduced in the PR, replacing the previous updatedFetchStates pattern.


181-186: LGTM! FetchState retrieval correctly handles progressive updates.

The logic correctly retrieves the updated fetchState from progressedChainsById when available, falling back to the original fetcher.fetchState when the chain wasn't progressed in the batch. Despite the "dangerous" naming of Utils.Dict.dangerouslyGetByIntNonOption, the pattern match safely handles both Some and None cases.

codegenerator/cli/npm/envio/src/bindings/PromClient.res (1)

14-20: LGTM! New Prometheus metric bindings added.

The new bindings provide essential Prometheus metric access:

  • metricValue type captures metric value and labels
  • metricInstance provides async access to metric values
  • getSingleMetric safely returns option<metricInstance> for missing metrics
  • resetMetrics enables registry cleanup

These additions enable metric introspection and testing capabilities.

codegenerator/cli/npm/envio/src/ReorgDetection.res (1)

43-61: LGTM: new state shape is coherent and minimal.

The fields align with the “track hashes near head” objective and replace threshold semantics cleanly.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 82f127e and 9bc0799.

📒 Files selected for processing (1)
  • codegenerator/cli/npm/envio/src/Batch.res (5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/npm/envio/src/Batch.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/npm/envio/src/Batch.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
codegenerator/cli/npm/envio/src/Utils.res (1)

100-159: for..in loops require hasOwnProperty guards to prevent prototype pollution.

All dict iteration functions (mapValues, filterMapValues, mapValuesToArray, forEach, forEachWithKey, size, isEmpty) use for..in without hasOwnProperty guards. This allows inherited enumerable properties to be processed, changing semantics vs Object.keys and enabling prototype pollution attacks.

This issue was previously flagged with detailed fixes. Apply the diffs from the past review comment to add hasOwnProperty guards to each for..in loop.

Based on past review comments.

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1)

81-84: Restore committed checkpoint ID from persistence

commitedCheckpointId stays hardcoded at 0, so every restart forgets the last persisted checkpoint. That breaks history checkpoint ordering and replays work that should stay committed. Please load the stored checkpoint id from initialState (or extend it if missing) and thread it into ChainManager.makeFromDbState, instead of defaulting to zero.

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (2)

828-849: Empty-items branch can drop pending rollback (FIXME still present)

If RollbackInMemStore is set and totalBatchSize === 0, persisting only progress risks losing the pending rollback on restart. Flush the rollback in-memory store through the normal batch execution path even for zero-sized batches, or defer persisting progress.

Apply:

       } else if totalBatchSize === 0 {
         dispatchAction(StartProcessingBatch)
-        // For this case there shouldn't be any FetchState changes
-        // so we don't dispatch UpdateQueues - only update the progress for chains without events
-        await Db.sql->InternalTable.Chains.setProgressedChains(
-          ~pgSchema=Db.publicSchema,
-          ~progressedChains=progressedChainsById->Utils.Dict.mapValuesToArray((
-            chainAfterBatch
-          ): InternalTable.Chains.progressedChain => {
-            chainId: chainAfterBatch.fetchState.chainId,
-            progressBlockNumber: chainAfterBatch.progressBlockNumber,
-            totalEventsProcessed: chainAfterBatch.totalEventsProcessed,
-          }),
-        )
-        // FIXME: When state.rollbackState is RollbackInMemStore
-        // If we increase progress in this case (no items)
-        // and then indexer restarts - there's a high chance of missing
-        // the rollback. This should be tested and fixed.
-        dispatchAction(EventBatchProcessed({batch: batch}))
+        // If a rollback in-memory store is present, commit it via the normal batch execution path
+        // to ensure rollback side-effects are persisted even without events.
+        let rollbackInMemStore = switch state.rollbackState {
+        | RollbackInMemStore(inMemoryStore) => Some(inMemoryStore)
+        | _ => None
+        }
+        switch rollbackInMemStore {
+        | Some(inMemoryStore) =>
+          switch await EventProcessing.processEventBatch(
+            ~batch,
+            ~inMemoryStore,
+            ~isInReorgThreshold,
+            ~loadManager=state.loadManager,
+            ~config=state.config,
+          ) {
+          | Ok() =>
+            dispatchAction(ResetRollbackState)
+            dispatchAction(EventBatchProcessed({batch}))
+          | Error(errHandler) => dispatchAction(ErrorExit(errHandler))
+          }
+        | None =>
+          // No rollback pending; only persist progressed chains
+          await Db.sql->InternalTable.Chains.setProgressedChains(
+            ~pgSchema=Db.publicSchema,
+            ~progressedChains=progressedChainsById->Utils.Dict.mapValuesToArray((
+              chainAfterBatch
+            ): InternalTable.Chains.progressedChain => {
+              chainId: chainAfterBatch.fetchState.chainId,
+              progressBlockNumber: chainAfterBatch.progressBlockNumber,
+              totalEventsProcessed: chainAfterBatch.totalEventsProcessed,
+            }),
+          )
+          dispatchAction(EventBatchProcessed({batch}))
+        }

Add a unit test for empty batch + RollbackInMemStore to prevent regressions.


990-999: Implement reorgDetection rollback alongside fetchState rollback

The blocks-based reorg state isn’t rolled back; the FIXME remains. Align it with fetchState rollback to prevent stale/invalid hash state post-rollback.

Apply:

           let rolledBackCf = {
             ...cf,
-            // FIXME:
-            // lastBlockScannedHashes: chain == reorgChain
-            //   ? cf.lastBlockScannedHashes->ReorgDetection.LastBlockScannedHashes.rollbackToValidBlockNumber(
-            //       ~blockNumber=lastKnownValidBlockNumber,
-            //     )
-            //   : cf.lastBlockScannedHashes,
+            reorgDetection:
+              chain == reorgChain
+                ? cf.reorgDetection->ReorgDetection.rollbackToValidBlockNumber(
+                    ~blockNumber=lastKnownValidBlockNumber,
+                  )
+                : cf.reorgDetection,
             fetchState,
           }
🧹 Nitpick comments (5)
scenarios/test_codegen/test/ChainManager_test.res (2)

225-229: Nit: Fix assertion message typos/grammar

Minor polish.

-          ~message="There were a different number of events created to what was recieved from the queues.",
+          ~message="The number of events created differed from the number read from the queues.",

128-128: Typo in public field name ‘commitedCheckpointId’

If feasible, consider renaming to ‘committedCheckpointId’ across the codebase to avoid long‑term API friction. Defer if this reflects an existing public contract.

scenarios/test_codegen/test/ReorgDetection_test.res (1)

213-218: Typos in assertion message

Minor text fixes.

-      ~message="Prunes original blocks at 1 and 50. It writes invalid data for block 20 and 50, but they are outside of the reorg thershold, so we don't care",
+      ~message="Prunes original blocks at 1 and 50. It writes invalid data for blocks 20 and 50, but they are outside of the reorg threshold, so we don't care",
codegenerator/cli/templates/static/codegen/src/EventProcessing.res (1)

261-285: Guard batch layout invariants while iterating checkpoints

The loop uses raw indices/offsets. Add a lightweight guard or dev-assert (e.g., ensure checkpointIndex + fixedCheckpointFields + eventsProcessed <= length) to fail fast on malformed batches.

codegenerator/cli/npm/envio/src/Batch.res (1)

23-54: Document the commented-out blockHash field handling.

The comments at lines 30-31 describe blockHash at offset 3, but the implementation in prepareOrderedBatch (lines 302-305) and prepareUnorderedBatch (lines 419-422) has the blockHash assignment commented out. This creates inconsistency between the documentation and implementation.

Consider either:

  1. Removing blockHash references from the comments if it's not being used
  2. Documenting why blockHash is temporarily disabled
  3. Re-enabling blockHash if it should be tracked
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b2c7f0e and fb3487b.

📒 Files selected for processing (14)
  • codegenerator/cli/npm/envio/src/Batch.res (5 hunks)
  • codegenerator/cli/npm/envio/src/FetchState.res (4 hunks)
  • codegenerator/cli/npm/envio/src/Utils.res (2 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res (5 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (3 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (10 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (5 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (16 hunks)
  • scenarios/test_codegen/test/ChainManager_test.res (4 hunks)
  • scenarios/test_codegen/test/Mock_test.res (0 hunks)
  • scenarios/test_codegen/test/ReorgDetection_test.res (23 hunks)
  • scenarios/test_codegen/test/__mocks__/DbStub.res (0 hunks)
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res (4 hunks)
💤 Files with no reviewable changes (2)
  • scenarios/test_codegen/test/mocks/DbStub.res
  • scenarios/test_codegen/test/Mock_test.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/npm/envio/src/Utils.res
  • scenarios/test_codegen/test/ReorgDetection_test.res
  • codegenerator/cli/npm/envio/src/FetchState.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • scenarios/test_codegen/test/ChainManager_test.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • codegenerator/cli/npm/envio/src/Batch.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/npm/envio/src/Utils.res
  • scenarios/test_codegen/test/ReorgDetection_test.res
  • codegenerator/cli/npm/envio/src/FetchState.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • scenarios/test_codegen/test/ChainManager_test.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • codegenerator/cli/npm/envio/src/Batch.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (19)
codegenerator/cli/npm/envio/src/Utils.res (1)

192-199: Verify no usage of UnsafeIntOperators or ensure safe integer ranges
No references to UnsafeIntOperators were found; manually confirm it’s unused or that all integer arguments remain ≤ Number.MAX_SAFE_INTEGER to prevent precision loss.

scenarios/test_codegen/test/ChainManager_test.res (2)

111-115: Good migration to ReorgDetection.make

Initialization with empty blocks, explicit maxReorgDepth, and shouldRollbackOnReorg=false looks correct for this test.


186-193: Deriving fetchState from progressedChainsById is correct

Using progressedChainsById with a safe fallback to prior fetchState is the right approach for per-chain progress projection.

scenarios/test_codegen/test/ReorgDetection_test.res (1)

15-24: ReorgDetection.make usage LGTM

Clean switch to the new constructor; mapping fixtures into blocks and threading maxReorgDepth/flags is correct.

codegenerator/cli/templates/static/codegen/src/EventProcessing.res (2)

193-251: Preload across batch is solid

Efficiently filters items with handlers and safely absorbs both sync and async errors. Good.


319-337: Useful structured logging

totalBatchSize and per-chain batch summaries improve observability. Good.

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (2)

177-215: Progress-update via batch is correct

Metrics, firstEventBlockNumber derivation, and isProgressAtHead carry-over look good.


616-623: UpdateQueues now uses progressedChainsById: LGTM

Properly selects per-chain fetchState with threshold update on entry.

codegenerator/cli/templates/static/codegen/src/eventFetching/ChainFetcher.res (4)

184-186: Block lag tied to maxReorgDepth is sensible

Correctly zeroes lag in reorg-threshold or when not rolling back; otherwise uses chain maxReorgDepth.


197-211: ReorgDetection constructed from per-chain checkpoints

Filtering checkpoints by chainId and threading maxReorgDepth and shouldRollbackOnReorg is correct. No sorting required per tests.


475-477: getHighestBlockBelowThreshold uses maxReorgDepth

Good replacement of deprecated threshold naming.


489-492: Rollback ‘last known valid block’ via ReorgDetection API

Threshold and validity resolution through ReorgDetection looks correct; fallback path retained for empty thresholds.

Ensure getBlockHashes’ return shape matches ReorgDetection.blockDataWithTimestamp.

Also applies to: 525-528

codegenerator/cli/npm/envio/src/Batch.res (7)

117-232: LGTM: Well-structured chain progression tracking.

The getProgressedChainsById function correctly:

  • Determines which chains progressed by comparing block numbers (line 128)
  • Updates fetch states with remaining items (line 181)
  • Splits dynamic contracts (dcs) based on whether their startBlock has been reached (lines 191-200)
  • Handles chains without batch changes (lines 212-219)

The logic for accumulating batchDcs and leftDcsToStore ensures that only contracts whose registration block has been processed are included in the batch.


234-239: LGTM: Checkpoint index calculation is correct.

The formula correctly computes the checkpoint's position in the array by:

  • Starting at checkpointsStartIndex (2)
  • Accounting for previous checkpoints via (checkpointId - checkpointIdBeforeBatch - 1) * fixedCheckpointFields
  • Adding totalBatchSize to skip over all items from previous checkpoints

241-344: LGTM: Ordered batch preparation is well-implemented.

The prepareOrderedBatch function correctly:

  • Selects the chain with the earliest timestamp at each iteration (line 258)
  • Limits to 1 block per iteration to maintain ordering across chains (lines 265-270)
  • Creates checkpoints per block with proper indexing (lines 275-279)
  • Copies items using setUnsafe for performance (lines 284-289)
  • Updates progress tracking per chain (line 315)
  • Handles chains reaching their buffer (lines 317-323)
  • Finalizes the batch array with computed values (lines 331-343)

The checkpoint index calculation at lines 275-279 matches the getCheckpointIndex helper.


468-484: LGTM: Correct dispatch between ordered and unordered modes.

The make function correctly:

  • Uses unordered batch preparation for Unordered multichain or single-chain scenarios (lines 476-477)
  • Uses ordered batch preparation only for multi-chain Ordered mode (line 482)
  • Passes all required parameters to both batch preparation functions

486-508: LGTM: Safe iteration through batch checkpoints.

The keepMap function correctly:

  • Iterates through all checkpoints starting at checkpointsStartIndex (line 488)
  • Extracts eventsProcessed count using the proper offset (lines 491-494)
  • Iterates through items within each checkpoint (lines 495-504)
  • Uses Utils.magic to cast from unknown to typed values safely
  • Advances the checkpoint index by fixed fields plus items (line 505)

510-535: LGTM: Efficient early-exit search for chain's first block.

The findFirstEventBlockNumber function correctly:

  • Searches through checkpoints until finding a matching chainId (lines 514-526)
  • Returns the block number immediately upon finding a match (lines 520-525)
  • Advances the index properly when skipping non-matching checkpoints (lines 527-532)
  • Returns None if no matching checkpoint is found

6-13: chainAfterBatch naming is correct The isProgressAtHeadWhenBatchCreated field replaces the old isProgressAtHead consistently across type and usages; no updates needed.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
codegenerator/cli/npm/envio/src/ReorgDetection.res (1)

63-69: Normalize block-number key encoding (use one helper, avoid mixing Int.toString/Js.Int.toString).

Define a single helper and replace all occurrences for consistency and future‑proofing.

Apply near top-level (after type defs):

+/* Normalize int keys used in JS dicts */
+let toKey = Belt.Int.toString

Then replace conversions in this file, e.g.:

- dataByBlockNumber->Js.Dict.set(block.blockNumber->Js.Int.toString, block)
+ dataByBlockNumber->Js.Dict.set(block.blockNumber->toKey, block)

- ...dangerouslyGetNonOption(rangeLastBlock.blockNumber->Int.toString)
+ ...dangerouslyGetNonOption(rangeLastBlock.blockNumber->toKey)

- ...dangerouslyGetNonOption(prevRangeLastBlock.blockNumber->Int.toString)
+ ...dangerouslyGetNonOption(prevRangeLastBlock.blockNumber->toKey)

- dataByBlockNumberCopyInThreshold->Js.Dict.set(rangeLastBlock.blockNumber->Int.toString, rangeLastBlock)
+ dataByBlockNumberCopyInThreshold->Js.Dict.set(rangeLastBlock.blockNumber->toKey, rangeLastBlock)

- dataByBlockNumberCopyInThreshold->Js.Dict.set(prevRangeLastBlock.blockNumber->Int.toString, prevRangeLastBlock)
+ dataByBlockNumberCopyInThreshold->Js.Dict.set(prevRangeLastBlock.blockNumber->toKey, prevRangeLastBlock)

- verifiedDataByBlockNumber->Js.Dict.set(blockData.blockNumber->Int.toString, blockData)
+ verifiedDataByBlockNumber->Js.Dict.set(blockData.blockNumber->toKey, blockData)

- ...dangerouslyGetNonOption(detectedReorgBlock.blockNumber->Int.toString)
+ ...dangerouslyGetNonOption(detectedReorgBlock.blockNumber->toKey)

Optionally also switch to Utils.Dict.dangerouslyGetByIntNonOption consistently where available.

Also applies to: 113-116, 127-131, 151-154, 158-161, 188-191, 209-214

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

828-849: Preserve RollbackInMemStore in empty-items path; avoid persisting progress while rollback is pending.

Persisting progress with zero items can drop a pending in‑memory rollback on restart.

Minimal guard:

-      } else if totalBatchSize === 0 {
+      } else if totalBatchSize === 0 {
         dispatchAction(StartProcessingBatch)
-        // For this case there shouldn't be any FetchState changes
-        // so we don't dispatch UpdateQueues - only update the progress for chains without events
-        await Db.sql->InternalTable.Chains.setProgressedChains(
-          ~pgSchema=Db.publicSchema,
-          ~progressedChains=progressedChainsById->Utils.Dict.mapValuesToArray((
-            chainAfterBatch
-          ): InternalTable.Chains.progressedChain => {
-            chainId: chainAfterBatch.fetchState.chainId,
-            progressBlockNumber: chainAfterBatch.progressBlockNumber,
-            totalEventsProcessed: chainAfterBatch.totalEventsProcessed,
-          }),
-        )
-        // FIXME: When state.rollbackState is RollbackInMemStore
-        // If we increase progress in this case (no items)
-        // and then indexer restarts - there's a high chance of missing
-        // the rollback. This should be tested and fixed.
-        dispatchAction(EventBatchProcessed({batch: batch}))
+        let hasPendingRollback = switch state.rollbackState {
+        | RollbackInMemStore(_) => true
+        | _ => false
+        }
+        if !hasPendingRollback {
+          // Only persist if no rollback is pending
+          await Db.sql->InternalTable.Chains.setProgressedChains(
+            ~pgSchema=Db.publicSchema,
+            ~progressedChains=progressedChainsById->Utils.Dict.mapValuesToArray((
+              chainAfterBatch
+            ): InternalTable.Chains.progressedChain => {
+              chainId: chainAfterBatch.fetchState.chainId,
+              progressBlockNumber: chainAfterBatch.progressBlockNumber,
+              totalEventsProcessed: chainAfterBatch.totalEventsProcessed,
+            }),
+          )
+        }
+        dispatchAction(EventBatchProcessed({batch: batch}))

Add a test to cover empty-batch + RollbackInMemStore.

🧹 Nitpick comments (3)
codegenerator/cli/npm/envio/src/ReorgDetection.res (1)

82-98: Optional: don’t rely on JS object key enumeration ordering.

Relying on “Js engine automatically orders numeric object keys” can be brittle. Consider sorting keys numerically for clarity.

scenarios/test_codegen/test/ChainManager_test.res (1)

160-176: Batch API usage and per-chain progression look correct.

createBatch destructuring and progressedChainsById propagation are correct; guard on totalBatchSize prevents OOB on items[0].

Minor: you can use items->Array.getUnsafe(0) after the guard instead of Option.getExn.

Also applies to: 188-193

codegenerator/cli/npm/envio/src/Batch.res (1)

142-147: Use by-int Dict helpers to avoid string conversions.

Prefer Utils.Dict.dangerouslyGetByIntNonOption for int keys.

-  let progressBlockNumberAfterBatch = switch progressBlockNumberPerChain->Utils.Dict.dangerouslyGetNonOption(
-    fetchState.chainId->Int.toString,
-  ) {
+  let progressBlockNumberAfterBatch = switch progressBlockNumberPerChain->Utils.Dict.dangerouslyGetByIntNonOption(
+    fetchState.chainId,
+  ) {
@@
-  switch switch batchSizePerChain->Utils.Dict.dangerouslyGetNonOption(
-    fetchState.chainId->Int.toString,
-  ) {
+  switch switch batchSizePerChain->Utils.Dict.dangerouslyGetByIntNonOption(
+    fetchState.chainId,
+  ) {

Also applies to: 149-151

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3ace66 and 24c348c.

📒 Files selected for processing (6)
  • codegenerator/cli/npm/envio/src/Batch.res (4 hunks)
  • codegenerator/cli/npm/envio/src/ReorgDetection.res (2 hunks)
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res (5 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (16 hunks)
  • scenarios/test_codegen/test/ChainManager_test.res (4 hunks)
  • scenarios/test_codegen/test/E2E_test.res (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • codegenerator/cli/templates/static/codegen/src/EventProcessing.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • scenarios/test_codegen/test/ChainManager_test.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • scenarios/test_codegen/test/E2E_test.res
  • codegenerator/cli/npm/envio/src/ReorgDetection.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • scenarios/test_codegen/test/ChainManager_test.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • scenarios/test_codegen/test/E2E_test.res
  • codegenerator/cli/npm/envio/src/ReorgDetection.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (3)
scenarios/test_codegen/test/ChainManager_test.res (1)

111-115: LGTM on reorgDetection init and committed checkpoint wiring.

Using ReorgDetection.make with detect-only mode and introducing commitedCheckpointId=0 aligns with the new checkpoints flow.

Also applies to: 128-129

codegenerator/cli/npm/envio/src/Batch.res (1)

252-396: Checkpoints, items, and progressedChainsById composition looks solid.

Ordered/unordered batch assembly correctly groups per-block items, emits checkpoints, and computes progressedChainsById.

If helpful, add a unit test asserting:

  • sum(checkpointEventsProcessed) == totalBatchSize
  • each checkpointBlockHash matches ReorgDetection.getHashByBlockNumber for its chain/block
  • progressedChainsById keys cover only chains that advanced this batch.

Also applies to: 516-529, 532-548, 550-566

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

177-216: Batch-centric progression wiring LGTM.

  • updateProgressedChains applies progressedChainsById correctly.
  • EventBatchProcessed and UpdateQueues switched to Batch.t payloads.
  • Metrics updates aligned with batch data.

Also applies to: 551-565, 801-861, 879-893, 899-921

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

828-849: Critical: Empty-items batch with RollbackInMemStore can drop rollback state.

At lines 830-848, when totalBatchSize === 0 (empty-items batch), the code persists progress via setProgressedChains and dispatches EventBatchProcessed without checking state.rollbackState. If rollbackState is RollbackInMemStore, the in-memory rollback is never executed and will be lost on indexer restart, causing the rollback to be missed.

Mirror the non-empty batch path (lines 864-871) by extracting rollbackInMemStore from state.rollbackState and either:

  1. Include it when calling a persistence function that handles rollback state, or
  2. Defer persisting progress while RollbackInMemStore is set.

Add a unit test covering the empty-batch + RollbackInMemStore scenario to prevent regression.

Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24c348c and b77485b.

📒 Files selected for processing (2)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (16 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (11 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (9)
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (9)

11-34: LGTM! Throttler simplification aligns with batch-centric refactor.

The removal of pruneStaleEndBlockData and the no-parameter signature for WriteThrottlers.make are consistent with the PR's shift to batch-based progress tracking. The remaining throttlers (chainMetaData, pruneStaleEntityHistory) are correctly constructed.


60-60: LGTM!

The call to WriteThrottlers.make() correctly uses the updated no-parameter signature.


107-111: LGTM! Batch-centric action payloads are well-defined.

The updated payloads for EventBatchProcessed and UpdateQueues correctly reflect the batch-oriented data flow introduced in this PR.


177-294: LGTM! Batch-centric progress tracking is correctly implemented.

The refactored updateProgressedChains function correctly extracts per-chain state from batch.progressedChainsById and appropriately handles chains that didn't progress (fallback to existing state). Progress metrics, first event detection, and sync-to-head logic are all sound.


339-345: LGTM! ReorgDetection integration is correct.

The updated reorg handling correctly uses ReorgDetection.registerReorgGuard and propagates the updated reorgDetection state to the chain fetcher, consistent with the migration from lastBlockScannedHashes.


551-588: LGTM! EventBatchProcessed handling is consistent with batch model.

The case correctly uses the {batch} payload, passes it to updateProgressedChains, and appropriately dispatches follow-up tasks including exit logic and optional pruning.


615-642: LGTM! UpdateQueues correctly applies batch-derived fetch states.

The case appropriately extracts per-chain fetchState from progressedChainsById, falls back to existing state when absent, and conditionally updates blockLag for reorg threshold handling.


993-997: LGTM! Reorg chain rollback is now correctly implemented.

The conditional rollback logic for reorgDetection correctly calls rollbackToValidBlockNumber only for the reorg chain while preserving the state for other chains. This addresses the FIXME flagged in previous reviews.

Based on learnings


797-924: LGTM! ProcessEventBatch correctly uses batch-centric structures.

The task correctly creates a batch, extracts progressedChainsById and totalBatchSize, applies fetch state lookups for reorg threshold detection, dispatches UpdateQueues with batch data, and passes the batch to EventProcessing.processEventBatch. Metrics and dynamic contract storage handling are also correctly aligned with the batch model.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
scenarios/test_codegen/test/rollback/Rollback_test.res (1)

701-701: Remove .it_only before merging.

The .it_only modifier skips all other tests in the suite and must not be committed to the main branch.

Apply this diff:

-  Async.it_only("Should re-enter reorg threshold on restart", async () => {
+  Async.it("Should re-enter reorg threshold on restart", async () => {
codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1)

65-74: Good: checkpoint and reorg state are restored from initialState.

Passing reorgCheckpoints into ChainFetcher and initializing checkpointId from initialState resolves the prior “hardcoded 0” concern.

Also applies to: 81-85

codegenerator/cli/npm/envio/src/db/InternalTable.res (1)

268-276: Ensure DB migration exists for “envio_checkpoints” creation/rename.

Schema introduced/renamed table and PK shape; a migration must handle it (including data backfill if needed).

Run to verify presence:

#!/bin/bash
# Look for SQL migrations or ReScript-managed DDL for envio_checkpoints / old table
rg -n -C2 -i 'envio_checkpoints|end_of_block_range_scanned_data|ALTER TABLE|CREATE TABLE|DROP TABLE' --type=sql
rg -n -C2 -i 'envio_checkpoints|end_of_block_range_scanned_data|ALTER TABLE|CREATE TABLE|DROP TABLE' --type=res

Also applies to: 291-300

codegenerator/cli/npm/envio/src/Batch.res (1)

226-250: Remove duplicate assignment for clarity.

Lines 234 and 242 both assign checkpointId to prevCheckpointId. The second assignment on line 242 is redundant and should be removed.

Apply this diff:

         mutCheckpointBlockHashes->Js.Array2.push(Js.Null.Value(hash))->ignore
         mutCheckpointEventsProcessed->Js.Array2.push(0)->ignore

-        prevCheckpointId := checkpointId
       | Js.Null.Null => ()
       }
     }
🧹 Nitpick comments (1)
codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (1)

4-4: Typo in API surface: “commitedCheckpointId” → “committedCheckpointId”.

Public field and inits are misspelled. Consider renaming now (and providing a thin alias/shim if needed) to avoid cementing a typo into the API.

If you want, I can generate a scripted rename across the repo.

Also applies to: 27-27, 81-81

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b77485b and 5eefea9.

📒 Files selected for processing (11)
  • codegenerator/cli/npm/envio/src/Batch.res (4 hunks)
  • codegenerator/cli/npm/envio/src/FetchState.res (6 hunks)
  • codegenerator/cli/npm/envio/src/Persistence.res (1 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (4 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (8 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (3 hunks)
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res (5 hunks)
  • scenarios/test_codegen/test/helpers/Mock.res (6 hunks)
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res (2 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • codegenerator/cli/npm/envio/src/Persistence.res
  • codegenerator/cli/npm/envio/src/FetchState.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • scenarios/test_codegen/test/lib_tests/Persistence_test.res
  • scenarios/test_codegen/test/helpers/Mock.res
  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/npm/envio/src/Batch.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/eventFetching/ChainManager.res
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
  • codegenerator/cli/templates/static/codegen/src/IO.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (12)
scenarios/test_codegen/test/lib_tests/Persistence_test.res (1)

81-83: Tests updated to assert new initialState fields.

Looks correct and aligned with the new persistence surface.

If you adopt bigint for checkpointId (see InternalTable comment), update these literals to 0n.

Also applies to: 147-149

codegenerator/cli/templates/static/codegen/src/IO.res (2)

53-54: Batch → progressedChains mapping LGTM.

Deriving progressedChains from batch.progressedChainsById matches the new data flow.

Also applies to: 260-267


235-241: Reorg rollback no longer touches checkpoints; confirm intent.

You removed EoBR rollback and didn’t add a replacement for envio_checkpoints. If checkpoints are append‑only by design (for reorg detection), fine; otherwise consider pruning rows after the rollback point per chain.

I can draft a small helper like Checkpoints.rollbackAfter(~chainId, ~blockNumber) if needed.

codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1)

318-321: Mock DB batch path aligns with new reorg/checkpoint model.

Looks good for tests. If checkpointId moves to bigint, switch 0 → 0n in checkpointIdBeforeBatch and Ready.checkpointId.

Also applies to: 323-327, 388-397, 423-468

codegenerator/cli/npm/envio/src/PgStorage.res (3)

68-68: LGTM: Table reference updated to match module rename.

The change from EndOfBlockRangeScannedData.table to Checkpoints.table correctly reflects the module rename described in the PR.


710-712: LGTM: New fields properly initialized for clean runs.

The additions of reorgCheckpoints: [] and checkpointId: InternalTable.Checkpoints.initialCheckpointId correctly initialize the new state fields for fresh database initialization.


328-328: Parameter order correct. Utils.Dict.forEachWithKey passes (value, key) to the callback as expected.

scenarios/test_codegen/test/helpers/Mock.res (2)

203-204: LGTM: Mock storage state correctly updated.

The additions of reorgCheckpoints: [] and checkpointId: 0 properly reflect the new state fields introduced in the main storage implementation.


227-288: LGTM: Test flexibility improvements and consistent API updates.

The addition of the ~reset=true parameter to Indexer.make and the currentBlockHeight parameter throughout the mock source implementation correctly reflect the changes in the main codebase and improve test configurability.

codegenerator/cli/npm/envio/src/Batch.res (3)

1-33: LGTM: Checkpoint-based batch architecture properly defined.

The new type definitions (chainAfterBatch, chainBeforeBatch, and t) correctly implement the checkpoint-based batch structure. The use of UnsafeIntOperators is acceptable for performance but ensure all integer operations are validated to prevent overflow/underflow.


96-530: LGTM: Batch preparation logic correctly implements checkpoint-based architecture.

The refactored getProgressedChainsById, prepareOrderedBatch, and prepareUnorderedBatch functions correctly implement the new checkpoint-based batch system. Previous review concerns regarding empty return values and checkpoint index calculations have been properly addressed.


550-566: LGTM: Helper function correctly locates first event block.

The findFirstEventBlockNumber function properly iterates through checkpoints to find the first block with events for the specified chain.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
codegenerator/cli/npm/envio/src/db/InternalTable.res (1)

278-286: Checkpoint ID remains int despite past recommendation for bigint.

The checkpoint id field is consistently typed as int (type definition line 278, schema line 294, initialCheckpointId line 289, query line 330). However, past review comments recommended using bigint end-to-end to prevent overflow for serial IDs tracking processing order in long-running indexers.

Current state:

  • Type t.id: int (line 278)
  • Table field: Integer with S.int (line 294)
  • Initial value: 0 (line 289)
  • Query COALESCE: Belt.Int.toString (line 330)

Past recommendation:
Change to bigint throughout (type, schema, initial value, query) to safely handle serial ID growth near the head.

Based on learnings

Also applies to: 289-289, 294-294, 330-331

scenarios/test_codegen/test/rollback/Rollback_test.res (2)

168-185: Commented assertions still present - tests incomplete.

The deep equality assertions for UpdateEndOfBlockRangeScannedData payloads remain commented out across multiple test scenarios. Tests currently only verify variant tags (e.g., line 172), not actual checkpoint data like block hashes, block numbers, or checkpoint IDs.

Missing validation:

  • Lines 168-185: First test scenario
  • Lines 235-251: "Successfully rolls back single chain indexer"
  • Lines 326-346: After rollback checkpoint verification
  • Lines 358-385: Block 4 processing checkpoint

These commented assertions should be restored and updated to validate the new Checkpoints module behavior, ensuring block hashes are correctly tracked with serial IDs as per PR objectives.

Also applies to: 235-251, 326-346, 358-385


798-798: Remove .it_only before merging.

The .it_only modifier at line 798 causes all other tests in the suite to be skipped. This should be changed to .it before merging to ensure the full test suite runs.

Apply this diff:

-  Async.it_only("Rollback of a single chain indexer", async () => {
+  Async.it("Rollback of a single chain indexer", async () => {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5eefea9 and 9785792.

📒 Files selected for processing (4)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (8 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (4 hunks)
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res (8 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (6 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/IO.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (10)
scenarios/test_codegen/test/lib_tests/FetchState_test.res (2)

2365-2365: LGTM! API rename correctly applied.

The test suite correctly reflects the rename from filterAndSortForUnorderedBatch to sortForUnorderedBatch. All call sites maintain the same parameters (~batchSizeTarget) and test expectations remain unchanged, confirming the behavior is preserved.

Also applies to: 2401-2404, 2443-2446, 2484-2487


2774-2777: Ensure multichain scenarios are tested
All calls to getUnorderedMultichainProgressBlockNumberAt currently use ~index=0, covering only the single‐chain case. Add tests for non-zero indices (additional chains) and invalid index handling.

codegenerator/cli/npm/envio/src/db/InternalTable.res (4)

19-19: LGTM: Consistent maxReorgDepth integration.

The maxReorgDepth field is properly integrated across all necessary locations (field union, fields array, type definition, table schema, and initialization). The naming change from confirmedBlockThreshold aligns with the PR objectives.

Also applies to: 32-32, 57-57, 70-70, 105-105


190-190: LGTM: Parameter order corrected.

The parameter order change from (chainId, data) to (data, chainId) aligns with ReScript conventions where the value typically precedes the key in forEach-style functions.


209-242: LGTM: Refactored to use explicit progressedChain type.

The new progressedChain type provides a clearer API than the previous dict-based approach. The function now accepts a strongly-typed array, improving type safety.


302-327: LGTM: Sound reorg checkpoint query logic.

The makeGetReorgCheckpointsQuery correctly:

  • Filters out chains with zero reorg depth or negative thresholds
  • Builds dynamic per-chain conditions based on blockHeight - maxReorgDepth
  • Returns None when no chains require checkpoint retrieval

This aligns with the PR goal of tracking block hashes near the head within the reorg threshold.

scenarios/test_codegen/test/rollback/Rollback_test.res (1)

701-796: LGTM: Comprehensive reorg threshold restart test.

The new test validates that the indexer correctly re-enters the reorg threshold after restart by:

  • Setting up multi-chain scenario
  • Verifying initial reorg threshold entry and metrics
  • Simulating restart
  • Confirming reorg threshold re-entry with updated block heights

This provides valuable coverage for the checkpoint and reorg detection persistence across restarts.

codegenerator/cli/templates/static/codegen/src/IO.res (3)

53-53: LGTM: Refactored to batch-centric API.

The function signature change from ~progressedChains: array<Batch.progressedChain> to ~batch: Batch.t provides a cleaner API by encapsulating all batch-related data in a single parameter.


59-59: LGTM: Conditional history persistence added.

The new shouldSaveHistory flag correctly gates entity history writes based on the reorg threshold state, preventing unnecessary history accumulation during normal processing while preserving it when needed for rollbacks.

Also applies to: 95-139


257-283: LGTM: Batch-centric checkpoint persistence.

The new implementation correctly:

  • Transforms batch.progressedChainsById to InternalTable.Chains.progressedChain format using Utils.Dict.mapValuesToArray
  • Conditionally persists checkpoints only when shouldSaveHistory is true
  • Inserts checkpoints with serial IDs and block hash data from the batch

This aligns with the PR objective of tracking block hashes with serial IDs in processing order.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scenarios/test_codegen/test/lib_tests/PgStorage_test.res (1)

108-130: Update init DDL expectations to use envio_checkpoints
makeInitializeTransaction now emits a CREATE TABLE for envio_checkpoints instead of end_of_block_range_scanned_data. Update the expected DDL strings in scenarios/test_codegen/test/lib_tests/PgStorage_test.res at lines 599-607, 627-635, 655-663, and 666-671 to reflect this change.

♻️ Duplicate comments (1)
codegenerator/cli/npm/envio/src/PgStorage.res (1)

900-921: Guard against empty checkpointIdResult to avoid crash

Belt.Array.getUnsafe(0) will throw when no rows (e.g., first run after migration). Handle empty-case with a default.

Apply this diff:

-    {
-      cleanRun: false,
-      reorgCheckpoints,
-      cache,
-      chains,
-      checkpointId: (checkpointIdResult->Belt.Array.getUnsafe(0))["id"],
-    }
+    let checkpointId =
+      switch checkpointIdResult->Belt.Array.get(0) {
+      | Some(row) => row["id"]
+      | None => InternalTable.Checkpoints.initialCheckpointId
+      }
+
+    {
+      cleanRun: false,
+      reorgCheckpoints,
+      cache,
+      chains,
+      checkpointId,
+    }
🧹 Nitpick comments (6)
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

728-731: Consider migrating away from deprecated checkpoint pruning.

The code calls deprecated_pruneStaleCheckpoints, which the naming suggests is outdated. Consider migrating to the new checkpoint implementation if available.

codegenerator/cli/templates/static/codegen/src/IO.res (1)

237-241: Rollback only deletes checkpoints for the reorged chain; add pruning to prevent stale buildup

Using deprecated_rollbackReorgedChainCheckpoints handles the immediate chain, but stale checkpoints for other chains below their safe_block can accumulate. Consider invoking a pruning step (e.g., deprecated_pruneStaleCheckpoints or a non‑deprecated equivalent) post-commit. Also verify Db.publicSchema matches the configured schema for internal tables.

codegenerator/cli/npm/envio/src/db/InternalTable.res (4)

268-365: Checkpoint id as int may not scale; prefer bigint end-to-end

To avoid overflow and align with “serial id” semantics, consider bigint:

  • Change Checkpoints.t.id to bigint
  • Column type to Numeric (or BigInt) with fieldSchema=S.bigint
  • initialCheckpointId to 0n
  • Update makeCommitedCheckpointIdQuery COALESCE fallback accordingly
  • Propagate to consumers (Persistence.initialState.checkpointId, resume/initialize decoders, Batch/ChainManager types)

302-326: CTE query: LGTM; add supporting indices

Query is sound. For performance, add indices on:

  • envio_checkpoints(chain_id, block_number)
  • Optionally a partial index on envio_checkpoints(chain_id, block_number) WHERE block_hash IS NOT NULL

366-381: Rollback helper is per-chain and deprecated; provide/replace with a non-deprecated API

This utility is used during rollback; consider promoting a non-deprecated name and semantics (or clearly steer callers to pruneStaleCheckpoints post-commit).


383-410: Add periodic pruning to control table growth

The deprecated_pruneStaleCheckpoints helper is useful after commits or at startup to cap history outside safe windows. Consider wiring it in an appropriate maintenance path.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9785792 and e9f4722.

📒 Files selected for processing (7)
  • codegenerator/cli/npm/envio/src/Persistence.res (1 hunks)
  • codegenerator/cli/npm/envio/src/PgStorage.res (4 hunks)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (8 hunks)
  • codegenerator/cli/templates/static/codegen/src/IO.res (4 hunks)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsImplementation.js (0 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (17 hunks)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res (5 hunks)
💤 Files with no reviewable changes (1)
  • codegenerator/cli/templates/static/codegen/src/db/DbFunctionsImplementation.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • codegenerator/cli/npm/envio/src/Persistence.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/npm/envio/src/PgStorage.res
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/static/codegen/src/IO.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (18)
codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (8)

11-34: LGTM: WriteThrottlers simplified correctly.

The removal of pruneStaleEndBlockData and the updated signature align with the batch-centric progression model. The remaining throttlers are properly initialized.


107-111: LGTM: Action payloads updated for batch-centric model.

The payload changes for EventBatchProcessed and UpdateQueues align with the batch-centric progression approach. The new structure using Batch.t and progressedChainsById provides better encapsulation.


177-294: LGTM: Batch-centric progress tracking implemented correctly.

The refactoring to use batch.progressedChainsById and derive per-chain fetch state from chainAfterBatch is well-structured. Prometheus metrics and progress tracking are updated correctly based on the batch data.


339-345: LGTM: ReorgDetection abstraction properly integrated.

The migration from lastBlockScannedHashes to ReorgDetection.registerReorgGuard aligns with the PR's goal to track block hashes with serial IDs. The reorg validation logic is correctly updated.


551-588: LGTM: EventBatchProcessed handler updated correctly.

The handler properly extracts and uses the batch parameter, correctly calling updateProgressedChains(~batch) and maintaining the exit logic for end blocks.


615-642: LGTM: UpdateQueues correctly uses progressedChainsById.

The handler properly extracts fetch state from progressedChainsById, with appropriate fallback to the existing cf.fetchState when a chain hasn't progressed. The reorg threshold logic is correctly applied.


805-930: LGTM: Batch processing logic correctly refactored.

The batch-centric processing flow is well-structured, correctly extracting progressedChainsById and totalBatchSize from the batch, properly updating Prometheus metrics, and handling dcsToStore. The reorg threshold detection logic using progressedChainsById is correct.


997-1001: LGTM: Blocks-based rollback correctly implemented.

The rollback logic properly uses reorgDetection->ReorgDetection.rollbackToValidBlockNumber for the reorg chain, correctly implementing the blocks-based approach mentioned in past review comments.

codegenerator/cli/templates/static/codegen/src/IO.res (4)

53-60: Batch threading + history gating look good

executeBatch accepts ~batch and shouldSaveHistory localizes the decision. No issues.


257-271: Progress updates derived from batch: LGTM

Mapping batch.progressedChainsById into setProgressedChains is clear and correct.


272-283: Checkpoint insertion gated by history: LGTM; ensure arrays are aligned

Insert uses UNNEST. Confirm all checkpoint arrays are equal length; otherwise PG will error.


285-288: One-shot execution plan: LGTM

Awaiting Promise.all over db funcs within the transaction is fine.

scenarios/test_codegen/test/lib_tests/PgStorage_test.res (1)

547-580: CTE-based reorg checkpoints query test: LGTM

Expected SQL matches builder semantics and filters. Good coverage.

codegenerator/cli/npm/envio/src/PgStorage.res (2)

68-70: Initialize now creates envio_checkpoints

Good swap. Ensure migrations/tests are aligned with the new table name and schema.


707-713: Expose reorgCheckpoints and checkpointId on initialize: LGTM

Shapes look consistent with InternalTable.Checkpoints.initialCheckpointId.

codegenerator/cli/npm/envio/src/db/InternalTable.res (3)

19-33: max_reorg_depth threading: LGTM

Field added to schema, type, and initial config wiring looks consistent.

Also applies to: 53-71, 100-116


209-242: setProgressedChains helper: LGTM

Clear API; aligns with batch updates. Promise.ignoreValue is fine here.


332-364: Batch insert query: LGTM

UNNEST types match table fields; block_hash nullable is handled. No issues.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (5)
scenarios/test_codegen/test/rollback/Rollback_test.res (4)

358-385: Restore disabled block hash assertions.

Fourth occurrence of commented-out block hash assertions. All these disabled assertions across the test file prevent proper verification of the PR's core feature: tracking block hashes with serial IDs.

Based on past review comments indicating this was addressed in commits fd3fad7 to e3ace66.


168-185: Restore disabled block hash assertions.

The commented-out assertions at lines 168-185 were previously flagged and marked as addressed, but they remain disabled in this version. According to the PR objectives to "track block hashes with a serial id," these assertions should verify the new block hash tracking mechanism rather than being left commented out.

The current test only checks variant tags (lines 170-173), which won't catch data integrity issues like incorrect block numbers or hashes.

Based on past review comments indicating this was supposedly addressed in commits fd3fad7 to e3ace66.


235-251: Restore disabled block hash assertions.

Similar to lines 168-185, these commented-out assertions remain disabled despite being marked as addressed in past reviews. This is part of a systematic pattern across the test file where block hash tracking verification has been weakened.

Based on past review comments indicating this was addressed in commits fd3fad7 to e3ace66.


326-346: Restore disabled block hash assertions.

Third occurrence of commented-out block hash assertions that remain disabled. This continues the pattern of weakened test coverage for the new block hash tracking feature.

Based on past review comments indicating this was addressed in commits fd3fad7 to e3ace66.

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (1)

839-857: Address PreparedRollback handling in empty-batch case.

The FIXME correctly identifies a critical data consistency issue: when totalBatchSize === 0 (no items to process) but chains progress, and state.rollbackState is PreparedRollback, the code persists progress via setProgressedChains (lines 843-852) and dispatches EventBatchProcessed (line 857) without preserving the rollback state. On restart, the pending rollback is lost, potentially leading to incorrect indexed data.

⚠️ This issue was flagged in multiple past reviews and remains unresolved.

Mirror the non-empty branch (lines 873-880) by extracting and handling rollbackInMemStore before persisting progress. Either:

  1. Check state.rollbackState and skip persisting progress while PreparedRollback is pending, or
  2. Pass the rollback state through to ensure it's preserved across restarts, or
  3. Defer dispatching EventBatchProcessed until after the rollback is executed in the next batch

Apply this diff to add the check:

-      } else if totalBatchSize === 0 {
+      } else if totalBatchSize === 0 {
+        // Don't persist progress if rollback is pending
+        switch state.rollbackState {
+        | PreparedRollback(_) =>
+          // Rollback is pending, skip progress update and process the rollback in next iteration
+          ()
+        | NoRollback | PreparingRollback(_) =>
         dispatchAction(StartProcessingBatch)
         // For this case there shouldn't be any FetchState changes
         // so we don't dispatch UpdateQueues - only update the progress for chains without events
         await Db.sql->InternalTable.Chains.setProgressedChains(
           ~pgSchema=Db.publicSchema,
           ~progressedChains=progressedChainsById->Utils.Dict.mapValuesToArray((
             chainAfterBatch
           ): InternalTable.Chains.progressedChain => {
             chainId: chainAfterBatch.fetchState.chainId,
             progressBlockNumber: chainAfterBatch.progressBlockNumber,
             totalEventsProcessed: chainAfterBatch.totalEventsProcessed,
           }),
         )
-        // FIXME: When state.rollbackState is PreparedRollback
-        // If we increase progress in this case (no items)
-        // and then indexer restarts - there's a high chance of missing
-        // the rollback. This should be tested and fixed.
         dispatchAction(EventBatchProcessed({batch: batch}))
+        }

Add a unit test covering the empty-batch + PreparedRollback scenario to prevent regressions.

🧹 Nitpick comments (1)
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1)

436-442: Consider a type-safe alternative to Obj.magic.

While Obj.magic is acceptable in test helpers, it bypasses type safety. If FetchState exposes a factory for test/mock scenarios or accepts a minimal valid config, prefer that approach.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9f4722 and bf9720d.

📒 Files selected for processing (4)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1 hunks)
  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (21 hunks)
  • scenarios/test_codegen/test/helpers/Mock.res (7 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • scenarios/test_codegen/test/helpers/Mock.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (10)
scenarios/test_codegen/test/rollback/Rollback_test.res (1)

701-796: LGTM!

The new test "Should re-enter reorg threshold on restart" is well-structured and covers an important scenario for the reorg detection feature. It properly:

  • Tests multi-chain reorg threshold entry
  • Verifies metrics before and after restart
  • Confirms correct fetch behavior on restart
  • Uses descriptive assertion messages
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (6)

318-320: LGTM! Efficient early exit.

The guard clause correctly handles the empty events case and avoids unnecessary processing overhead.


321-371: LGTM! Per-event processing logic is sound.

The state tracking (latestFetchedBlockNumber, itemsWithContractRegister) and chain ID validation are correctly implemented. The error messages are clear and actionable.


379-401: LGTM! Persistence setup aligns with reorg-aware architecture.

The initialization correctly includes reorgCheckpoints: [] and checkpointId: 0, consistent with the new checkpoint-based model. The deep copy approach prevents unwanted side effects.


406-421: LGTM! Contract registration flow is correct.

The async handling and conditional DCS storage are appropriate. The TODO at line 413 about reusing FetchState logic for deduplication is a reasonable future improvement.


478-492: LGTM! Error handling is comprehensive.

The try/catch properly handles ProcessingError and converts it with appropriate logging. The flags shouldSaveHistory=false and shouldBenchmark=false are correct for mock testing scenarios.


494-499: LGTM! Write-back preserves immutability.

The write-back correctly operates on the cloned mockDb, ensuring the original remains unchanged and allowing state comparison as documented in the comments.

codegenerator/cli/templates/static/codegen/src/globalState/GlobalState.res (3)

1002-1006: LGTM! Blocks-based rollback logic implemented.

The previously commented-out rollback logic for reorgDetection has been properly implemented. The conditional correctly applies rollbackToValidBlockNumber only to the reorg chain while preserving the state for other chains.

✅ Addresses past review comment about missing blocks-based rollback implementation.


733-736: Clarify use of deprecated_pruneStaleCheckpoints

  • The call at GlobalState.res:733 invokes deprecated_pruneStaleCheckpoints (defined at codegenerator/cli/npm/envio/src/db/InternalTable.res:387); confirm when and how this will be replaced or removed.
  • No non-deprecated pruneStaleCheckpoints implementation was found—if one exists, update this call.
  • Add a TODO with a tracking-issue reference to surface the migration plan.

293-296: Empty checkpointIds handling is correct. Batch.t defaults checkpointIds to [], so Utils.Array.last returning None and falling back to chainManager.commitedCheckpointId is intended.

Comment on lines 373 to 377
let processingChainId = switch processingChainId.contents {
| Some(chainId) => chainId
| None =>
Js.Exn.raiseError("No events provided to processEvents. Please provide at least one event.")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unreachable error path with misleading message.

This error should never occur because:

  • Line 318 returns early if events is empty
  • If non-empty, the loop (line 330) executes at least once, setting processingChainId via line 351

The error message is misleading since events were already validated as non-empty at line 318.

Consider removing this check or replacing it with an assertion, as it's defensive code that should never execute in practice:

-    let processingChainId = switch processingChainId.contents {
-    | Some(chainId) => chainId
-    | None =>
-      Js.Exn.raiseError("No events provided to processEvents. Please provide at least one event.")
-    }
+    let processingChainId = switch processingChainId.contents {
+    | Some(chainId) => chainId
+    | None =>
+      // Should never happen due to early return at line 318
+      Js.Exn.raiseError("Internal error: processingChainId not set despite non-empty events array")
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let processingChainId = switch processingChainId.contents {
| Some(chainId) => chainId
| None =>
Js.Exn.raiseError("No events provided to processEvents. Please provide at least one event.")
}
let processingChainId = switch processingChainId.contents {
| Some(chainId) => chainId
| None =>
// Should never happen due to early return at line 318
Js.Exn.raiseError("Internal error: processingChainId not set despite non-empty events array")
}
🤖 Prompt for AI Agents
In codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
around lines 373-377, the switch that raises "No events provided..." is an
unreachable defensive branch (events are validated non-empty earlier and
processingChainId is set in the loop), so remove the unreachable switch or
replace it with a clear assertion; specifically either delete the switch and use
the already-determined processingChainId value, or replace the match with an
assertion like `Js.Exn.raiseError("unreachable")` / `assert(false)` (or your
project’s preferred assert/fail helper) to document the invariant and avoid a
misleading runtime message.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
codegenerator/cli/npm/envio/src/db/InternalTable.res (1)

278-330: Verify checkpoint ID type is sufficient for expected scale.

The checkpoint id field uses int type (32-bit) throughout this module, which limits the maximum checkpoint ID to ~2.1 billion. For a system tracking block hashes close to the head with potentially frequent checkpoints, this could overflow.

Consider whether the expected checkpoint frequency and indexer lifespan could exceed this limit. If there's any risk of overflow, use bigint consistently (type definition, schema, and COALESCE default) as suggested in past reviews.

Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf9720d and 3251b14.

📒 Files selected for processing (5)
  • codegenerator/cli/npm/envio/src/db/InternalTable.res (8 hunks)
  • scenarios/test_codegen/test/ReorgDetection_test.res (23 hunks)
  • scenarios/test_codegen/test/lib_tests/FetchState_test.res (7 hunks)
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res (12 hunks)
  • scenarios/test_codegen/test/rollback/Rollback_test.res (5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • scenarios/test_codegen/test/ReorgDetection_test.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • scenarios/test_codegen/test/lib_tests/FetchState_test.res
  • scenarios/test_codegen/test/rollback/Rollback_test.res
  • scenarios/test_codegen/test/ReorgDetection_test.res
  • codegenerator/cli/npm/envio/src/db/InternalTable.res
  • scenarios/test_codegen/test/lib_tests/PgStorage_test.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (11)
scenarios/test_codegen/test/lib_tests/FetchState_test.res (2)

2365-2409: LGTM! API rename applied consistently.

The test suite correctly reflects the API change from filterAndSortForUnorderedBatch to sortForUnorderedBatch, with all test cases properly updated to use the new function name and batchSizeTarget parameter.


2768-2809: LGTM! Progress API migration is correct.

The tests properly use the new getUnorderedMultichainProgressBlockNumberAt(~index=0) API, replacing the previous getProgressBlockNumber accessor. The index parameter allows for more flexible progress tracking across multiple chains.

scenarios/test_codegen/test/lib_tests/PgStorage_test.res (3)

118-125: LGTM! Field rename applied consistently.

The chain configuration correctly uses maxReorgDepth instead of confirmedBlockThreshold, with appropriate test values (10 and 200 blocks).


547-580: LGTM! Comprehensive test for optimized checkpoint query.

The new test properly validates the CTE-based query structure that pre-filters chains and computes safe_block before joining with checkpoints. The expected SQL correctly implements the reorg detection logic with proper filtering conditions.


147-189: LGTM! SQL schema correctly updated.

The initialization queries properly include the max_reorg_depth column and the renamed envio_checkpoints table. The INSERT statements correctly populate maxReorgDepth values for all chains.

scenarios/test_codegen/test/ReorgDetection_test.res (2)

8-25: LGTM! Helper functions correctly migrated to new API.

The test helpers properly use the new ReorgDetection.make constructor with blocks array, maxReorgDepth, and optional shouldRollbackOnReorg/detectedReorgBlock parameters. The mapping from tuple format to block records is clean.


27-496: LGTM! ReorgDetection API migration is complete and consistent.

All test cases have been properly updated to use the new API surface:

  • getThresholdBlockNumbers with ~currentBlockHeight parameter
  • registerReorgGuard with reorg guard structure
  • getLatestValidScannedBlock and rollbackToValidBlockNumber methods

The test logic is preserved and expectations remain valid throughout.

codegenerator/cli/npm/envio/src/db/InternalTable.res (4)

19-105: LGTM! maxReorgDepth field integrated correctly.

The maxReorgDepth field is properly added to the Chains module across all necessary locations:

  • Field enum definition
  • Fields array
  • Type definition
  • Table field schema (Integer type with S.int)
  • Initialization from config

209-242: LGTM! New progressedChain type and batch-centric progression.

The new progressedChain type and setProgressedChains function properly implement per-chain progression tracking with an array-based approach instead of dictionary-based. The SQL parameter construction is correct.


302-326: LGTM! Well-optimized CTE query for reorg detection.

The makeGetReorgCheckpointsQuery uses an efficient CTE-based approach:

  1. Pre-filters chains with maxReorgDepth > 0 (small table scan)
  2. Computes safe_block once per chain
  3. Checks progress has entered reorg window
  4. Joins with checkpoints only for relevant chains
  5. Filters for non-null hashes and recent blocks

The query structure minimizes unnecessary checkpoint scans and is well-documented.


190-204: Ignore parameter‐order concern: Utils.Dict.forEachWithKey is defined as (dict, (value, key)) ⇒ unit, so using (data, chainId) is correct.

Likely an incorrect or invalid review comment.

@DZakh DZakh requested a review from JonoPrest October 7, 2025 15:14
@DZakh DZakh mentioned this pull request Oct 7, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scenarios/test_codegen/test/ChainManager_test.res (1)

27-35: Restore per-chain chainId wiring

Hard-coding ~chainId=1 drops every fetcher’s real chain id, so multi-chain batches now carry duplicate ids. That breaks progressedChainsById lookups and any logic keyed by chain id. Please keep feeding the actual id into FetchState.make.

-      ~chainId=1,
+      ~chainId=id,
♻️ Duplicate comments (2)
scenarios/test_codegen/test/ChainManager_test.res (1)

169-178: Still indexing the array as if it were an option

items[0] is a batch item, not option. Piping it into Option.getExn won’t compile (same issue noted earlier). Guard the empty case and grab the element via Array.get.

-            let firstEventInBlock = items[0]->Option.getExn
+            let firstEventInBlock =
+              switch items->Array.get(0) {
+              | Some(event) => event
+              | None => Js.Exn.raiseError("Unexpected empty batch with non-zero size")
+              }
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1)

373-377: Remove unreachable error path; replace with invariant assertion.

This branch can’t occur: non-empty events and the map above always set processingChainId. Replace with an invariant getExn to avoid a misleading runtime error.

-    let processingChainId = switch processingChainId.contents {
-    | Some(chainId) => chainId
-    | None =>
-      Js.Exn.raiseError("No events provided to processEvents. Please provide at least one event.")
-    }
+    let processingChainId =
+      processingChainId.contents->Utils.Option.getExn(
+        "Internal error: processingChainId not set despite non-empty events"
+      )

As per coding guidelines

🧹 Nitpick comments (4)
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (4)

329-329: Use actual latest block timestamp (avoid 0).

Capture max block timestamp alongside block number to better reflect head state.

@@
-    let latestFetchedBlockNumber = ref(0)
+    let latestFetchedBlockNumber = ref(0)
+    let latestFetchedBlockTimestamp = ref(0)
@@
       latestFetchedBlockNumber.contents = Pervasives.max(
         latestFetchedBlockNumber.contents,
         event.block->Types.Block.getNumber,
       )
+      latestFetchedBlockTimestamp.contents = Pervasives.max(
+        latestFetchedBlockTimestamp.contents,
+        event.block->Types.Block.getTimestamp,
+      )
@@
-                ~latestFetchedBlock={
-                  blockNumber: latestFetchedBlockNumber.contents,
-                  blockTimestamp: 0,
-                },
+                ~latestFetchedBlock={
+                  blockNumber: latestFetchedBlockNumber.contents,
+                  blockTimestamp: latestFetchedBlockTimestamp.contents,
+                },

As per coding guidelines

Also applies to: 363-366, 447-450


323-326: Avoid shadowing config twice; rename locals for clarity.

Shadowing makes later reads ambiguous. Consider configWithRegistrations and configWithPersistence, and pass them explicitly (e.g., ~config=configWithPersistence).

As per coding guidelines

Also applies to: 398-401


413-421: Address TODO: deduplicate DCS before storing.

Prevent redundant writes by de-duping DCS items (e.g., dict keyed by contract id/name or a hash). I can draft a helper if you want.

As per coding guidelines


493-494: Fix typo in comment.

“contatin” → “contain”.

-    //In mem store can still contatin raw events and dynamic contracts for the
+    //In mem store can still contain raw events and dynamic contracts for the

As per coding guidelines

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3251b14 and ed1514a.

📒 Files selected for processing (6)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1 hunks)
  • scenarios/erc20_multichain_factory/test/RollbackDynamicContract_test.res (0 hunks)
  • scenarios/erc20_multichain_factory/test/RollbackMultichain_test.res (0 hunks)
  • scenarios/erc20_multichain_factory/test/TestDeleteEntity.res (0 hunks)
  • scenarios/erc20_multichain_factory/test/TestWhereQuery.res (0 hunks)
  • scenarios/test_codegen/test/ChainManager_test.res (5 hunks)
💤 Files with no reviewable changes (4)
  • scenarios/erc20_multichain_factory/test/RollbackMultichain_test.res
  • scenarios/erc20_multichain_factory/test/TestDeleteEntity.res
  • scenarios/erc20_multichain_factory/test/RollbackDynamicContract_test.res
  • scenarios/erc20_multichain_factory/test/TestWhereQuery.res
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • scenarios/test_codegen/test/ChainManager_test.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • scenarios/test_codegen/test/ChainManager_test.res
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (1)
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1)

318-320: Early exit on empty events — good guard.

Prevents unnecessary setup work and matches intended no-op semantics.

Comment on lines -790 to -792
await Db.sql->DbFunctions.EndOfBlockRangeScannedData.setEndOfBlockRangeScannedData(
nextEndOfBlockRangeScannedData,
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm so happy to get rid of this. Especially since we have indexers with 300 partitions.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1)

373-378: Unreachable branch message fixed.

Message now reflects the invariant. Thanks for addressing.

codegenerator/cli/npm/envio/src/Batch.res (1)

226-248: Shadowing makes addReorgCheckpoints harder to follow.

Line 229 shadows the parameter prevCheckpointId with a ref of the same name. While the logic is correct (the if-branch returns .contents, the else-branch returns the parameter), this naming reuse is confusing: inside the if-block prevCheckpointId means the ref, outside it means the int parameter.

Apply this diff to clarify intent:

 let addReorgCheckpoints = (
   ~prevCheckpointId,
   ~reorgDetection: ReorgDetection.t,
   ~fromBlockExclusive,
   ~toBlockExclusive,
   ~chainId,
   ~mutCheckpointIds,
   ~mutCheckpointChainIds,
   ~mutCheckpointBlockNumbers,
   ~mutCheckpointBlockHashes,
   ~mutCheckpointEventsProcessed,
 ) => {
   if (
     reorgDetection.shouldRollbackOnReorg && !(reorgDetection.dataByBlockNumber->Utils.Dict.isEmpty)
   ) {
-    let prevCheckpointId = ref(prevCheckpointId)
+    let prevCheckpointIdRef = ref(prevCheckpointId)
     for blockNumber in fromBlockExclusive + 1 to toBlockExclusive - 1 {
       switch reorgDetection->ReorgDetection.getHashByBlockNumber(~blockNumber) {
       | Js.Null.Value(hash) =>
-        let checkpointId = prevCheckpointId.contents + 1
-        prevCheckpointId := checkpointId
+        let checkpointId = prevCheckpointIdRef.contents + 1
+        prevCheckpointIdRef := checkpointId

         mutCheckpointIds->Js.Array2.push(checkpointId)->ignore
         mutCheckpointChainIds->Js.Array2.push(chainId)->ignore
         mutCheckpointBlockNumbers->Js.Array2.push(blockNumber)->ignore
         mutCheckpointBlockHashes->Js.Array2.push(Js.Null.Value(hash))->ignore
         mutCheckpointEventsProcessed->Js.Array2.push(0)->ignore
       | Js.Null.Null => ()
       }
     }
-    prevCheckpointId.contents
+    prevCheckpointIdRef.contents
   } else {
     prevCheckpointId
   }
 }
🧹 Nitpick comments (4)
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (4)

323-326: Avoid shadowing the top-level config.

Shadowing hurts readability. Prefer a distinct name for the “registered” config.

Apply this minimal rename (update downstream references accordingly):

-    let config = {
-      ...config,
-      registrations: Some(EventRegister.finishRegistration()),
-    }
+    let registeredConfig = {
+      ...config,
+      registrations: Some(EventRegister.finishRegistration()),
+    }

339-352: Rename local chainId binding to avoid shadowing the param.

The let chainId = switch chainId { ... } shadows the function arg; use a distinct name to reduce confusion.

-      let chainId = switch chainId {
+      let resolvedChainId = switch chainId {
       | Some(chainId) => chainId
       | None => event.chainId
       }
       ...
-      | None => processingChainId.contents = Some(chainId)
+      | None => processingChainId.contents = Some(resolvedChainId)

Also use resolvedChainId at later uses within this scope (e.g., when computing chain).


385-403: Use the clone for persistence storage to keep state isolation clear.

To strictly honor “clone, mutate, return clone,” build storage from the clone.

-    let persistence = {
+    let persistence = {
       ...config.persistence,
-      storage: makeMockStorage(mockDb),
+      storage: makeMockStorage(mockDbClone),
       storageStatus: Ready({
         cleanRun: false,
         cache: Js.Dict.empty(),
         chains: [],
         reorgCheckpoints: [],
         checkpointId: 0,
       }),
     }
-    let config = {
-      ...config,
-      persistence,
-    }
+    let effectiveConfig = { ...registeredConfig, persistence }

Update downstream ~config args to effectiveConfig.


408-422: Optional: deduplicate DCS before writing.

You already noted the TODO. A simple keyed map (by contract id/address) before setDcsToStore would avoid dupes.

I can sketch a small helper to uniq by contract key if helpful.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed1514a and 5c66c63.

📒 Files selected for processing (2)
  • codegenerator/cli/npm/envio/src/Batch.res (4 hunks)
  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
codegenerator/cli/templates/{static,dynamic}/codegen/src/**/*

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Edit template versions under codegenerator/cli/templates/static/codegen/src or codegenerator/cli/templates/dynamic/codegen/src instead of editing generated/src

Files:

  • codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
**/*.{res,resi}

📄 CodeRabbit inference engine (.cursor/rules/rescript.mdc)

**/*.{res,resi}: Never use [| item |] to create an array. Use [ item ] instead.
Must always use = for setting value to a field. Use := only for ref values created using ref function.
ReScript has record types which require a type definition before hand. You can access record fields by dot like foo.myField.
It's also possible to define an inline object, it'll have quoted fields in this case.
Use records when working with structured data, and objects to conveniently pass payload data between functions.
Never use %raw to access object fields if you know the type.

Files:

  • codegenerator/cli/npm/envio/src/Batch.res
**/*.res

📄 CodeRabbit inference engine (.cursor/rules/navigation.mdc)

Prefer reading ReScript .res modules directly; compiled .js artifacts can be ignored

Files:

  • codegenerator/cli/npm/envio/src/Batch.res
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_and_test
🔇 Additional comments (6)
codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs (3)

318-321: Early exit looks good.

Short-circuiting on empty events avoids unnecessary setup. LGTM.


424-476: Confirm selection.eventConfigs consistency.

FetchState.make gets a dummy non-empty eventConfigs, but the query’s selection.eventConfigs is empty. Ensure handleQueryResult doesn’t rely on selection to gate newItems.

Optionally align both for clarity:

-                  selection: {eventConfigs: [], dependsOnAddresses: false},
+                  selection: {eventConfigs: [{"id": "test"}->Obj.magic], dependsOnAddresses: false},

478-486: Batch execution flow LGTM.

Preload then run handlers with history/benchmark flags off is appropriate for tests.

codegenerator/cli/npm/envio/src/Batch.res (3)

530-546: LGTM!

The make function correctly dispatches between unordered and ordered batch preparation. The logic to treat single-chain ordered mode as unordered is a sensible optimization.


548-564: LGTM!

The findFirstEventBlockNumber function correctly searches the checkpoint arrays with proper bounds checking. The loop termination conditions are safe.


436-480: Guard against empty checkpoints before increment
If the first buffered item’s blockNumber equals chainBeforeBatch.progressBlockNumber, the else-branch at Batch.res lines 472–480 will compute lastIndex = −1 and access checkpointEventsProcessed[−1]. Verify that fetchState.buffer never contains items at or below progressBlockNumber, or add a guard/assertion to ensure a checkpoint exists before incrementing.

Comment on lines +494 to 499
//In mem store can still contatin raw events and dynamic contracts for the
//testing framework in cases where either contract register or loaderHandler
//is None
mockDbClone->writeFromMemoryStore(~inMemoryStore)
mockDbClone
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Typo in comment.

“contatin” → “contain”.

-    //In mem store can still contatin raw events and dynamic contracts for the
+    //In mem store can still contain raw events and dynamic contracts for the
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
//In mem store can still contatin raw events and dynamic contracts for the
//testing framework in cases where either contract register or loaderHandler
//is None
mockDbClone->writeFromMemoryStore(~inMemoryStore)
mockDbClone
}
//In mem store can still contain raw events and dynamic contracts for the
//testing framework in cases where either contract register or loaderHandler
//is None
mockDbClone->writeFromMemoryStore(~inMemoryStore)
mockDbClone
}
🤖 Prompt for AI Agents
In codegenerator/cli/templates/dynamic/codegen/src/TestHelpers_MockDb.res.hbs
around lines 494 to 499, the inline comment contains a typo: "contatin" should
be "contain"; update the comment to read "In mem store can still contain raw
events and dynamic contracts for the testing framework in cases where either
contract register or loaderHandler is None" (preserve surrounding spacing and
punctuation).

Copy link
Collaborator

@JonoPrest JonoPrest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

@DZakh DZakh mentioned this pull request Oct 21, 2025
@DZakh
Copy link
Member Author

DZakh commented Oct 22, 2025

Merged as part of #789

@DZakh DZakh closed this Oct 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants