Skip to content

Conversation

@glyh
Copy link
Member

@glyh glyh commented Sep 23, 2025

In this test we run schema upgrade sometimes randomly when restoring from pre-computed block, so to simulate a race condition that user might encounter when live upgrading their archive node.

TODO(low priority): replace test result with Malleable_error in follow-up PRs

@glyh glyh marked this pull request as draft September 23, 2025 07:23
@glyh glyh marked this pull request as ready for review September 23, 2025 09:10
@glyh glyh force-pushed the lyh/archive-live-upgrade-test branch 7 times, most recently from 4b0793b to 2f3f6d0 Compare September 23, 2025 12:51
@glyh
Copy link
Member Author

glyh commented Sep 23, 2025

!ci-build-me

@glyh glyh force-pushed the lyh/archive-live-upgrade-test branch from 2f3f6d0 to 5f945e0 Compare September 23, 2025 13:03
@glyh
Copy link
Member Author

glyh commented Sep 23, 2025

This becomes more of a refactor than implementing test... I suggest reviewer to review the 1st commit if it's too hard to review everything.

However, other commits are pretty trivial.

Please tell me if splitting up to more PRs are preferred.

EDIT: moved refactor part to #17853

@glyh
Copy link
Member Author

glyh commented Sep 23, 2025

Archive node test is passing, specifically this test can be found here: https://buildkite.com/organizations/o-1-labs-2/pipelines/mina-o-1-labs/builds/39107/jobs/019976ae-3049-4c69-a0ce-5836d03ecaab/log#130-6002

Nix test is failing on all PRs for now.

|> Option.value_exn ~message:"Failed to find upgrade script"
in
let upgrade_script_finished = Ivar.create () in
(let%bind () = after (Time.Span.of_min (Random.float_range 0. 5.)) in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 0-5 minutes, but the sleep in archive_blocks_from_files is 0-5 seconds. Should this one be seconds too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, 5min is the time it takes to run the whole test on my machine. I'm expecting the upgrade script to be ran any time during the process of loading precomputed blocks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see - this delay wasn't supposed to match up with the delay from archive_blocks_from_files. This is a random delay to put the start of the upgrade script somewhere in the middle of that other function's execution.

let archive_blocks_from_files t ~archive_address ~format ?(sleep = 5) blocks =
Deferred.List.iter blocks ~f:(fun block ->
let%bind _ = archive_blocks t ~archive_address ~format [ block ] () in
(* WARN: live upgrade test expect this sleep to be present so we can emulate a race condition *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe set the sleep parameter explicitly where this function is used in the live upgrade test?

Copy link
Member Author

@glyh glyh Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented

@cjjdespres
Copy link
Member

Please tell me if splitting up to more PRs are preferred.

I think I would prefer that. Thanks.

@glyh glyh force-pushed the lyh/archive-live-upgrade-test branch from 5f945e0 to c7f2013 Compare September 24, 2025 02:02
Copy link
Member

@cjjdespres cjjdespres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the delay is working, looking at the logs in that test run above:

{"timestamp":"2025-09-23 13:43:39.073308Z","level":"Debug","source":{"module":"Mina_automation__Archive","location":"File \"src/test/mina_automation/archive.ml\", line 99, characters 14-26"},"message":"Archive stdout: $stdout","metadata":{"stdout":"{\"timestamp\":\"2025-09-23 13:43:39.073200Z\",\"level\":\"Info\",\"source\":{\"module\":\"Archive_lib__Processor\",\"location\":\"File \\\"src/app/archive/lib/processor.ml\\\", line 4593, characters 24-35\"},\"message\":\"Added accounts created for block with state hash $state_hash to archive database\",\"metadata\":{\"num_accounts_created\":0,\"state_hash\":\"3NLF5Ew9z8s6oeqrHus9hnbxxU6wRXswqGC73x27Gqm8QbuCtAxC\"}}\n"}}
{"timestamp":"2025-09-23 13:43:42.833308Z","level":"Info","source":{"module":"Dune__exe__Live_upgrade_archive","location":"File \"src/test/archive/archive_node_tests/live_upgrade_archive.ml\", line 37, characters 3-14"},"message":"Starting upgrade script","metadata":{}}
{"timestamp":"2025-09-23 13:43:44.368473Z","level":"Debug","source":{"module":"Mina_automation__Archive","location":"File \"src/test/mina_automation/archive.ml\", line 99, characters 14-26"},"message":"Archive stdout: $stdout","metadata":{"stdout":"{\"timestamp\":\"2025-09-23 13:43:44.368050Z\",\"level\":\"Info\",\"source\":{\"module\":\"Archive_lib__Processor\",\"location\":\"File \\\"src/app/archive/lib/processor.ml\\\", line 4576, characters 20-31\"},\"message\":\"Added accounts accessed for block with state hash $state_hash to archive database\",\"metadata\":{\"num_accounts_accessed\":3,\"state_hash\":\"3NKp2MV6MUps3xAaszHe3w5hTju2rj6HGiWmmsY55zRMkNJ9pbDP\"}}\n"}}

So this seems fine.

@glyh
Copy link
Member Author

glyh commented Sep 26, 2025

!ci-build-me

@glyh glyh force-pushed the lyh/archive-live-upgrade-test branch from 7bb7648 to 4b5ac02 Compare September 26, 2025 13:44
@glyh
Copy link
Member Author

glyh commented Sep 26, 2025

!ci-build-me

@glyh
Copy link
Member Author

glyh commented Sep 30, 2025

Rebasing onto latest compatible

@glyh glyh force-pushed the lyh/archive-live-upgrade-test branch from 4b5ac02 to a1cc174 Compare September 30, 2025 02:47
@glyh
Copy link
Member Author

glyh commented Sep 30, 2025

!ci-build-me

@glyh
Copy link
Member Author

glyh commented Sep 30, 2025

!ci-nightly-me

@glyh
Copy link
Member Author

glyh commented Sep 30, 2025

@amc-ie
Copy link
Member

amc-ie commented Sep 30, 2025

!ci-bypass-changelog

@glyh glyh merged commit 6fc2083 into compatible Sep 30, 2025
55 checks passed
@glyh glyh deleted the lyh/archive-live-upgrade-test branch September 30, 2025 07:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants