-
Notifications
You must be signed in to change notification settings - Fork 583
Rewrite hardfork test to Go #17919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: compatible
Are you sure you want to change the base?
Rewrite hardfork test to Go #17919
Conversation
69c09d7
to
ec89fba
Compare
ec89fba
to
9bd5c76
Compare
echo "Creates a quick-epoch-turnaround configuration in localnet/ and launches two Mina nodes" >&2 | ||
echo "Usage: $0 [-m|--mina $MINA_EXE] [-i|--tx-interval $TX_INTERVAL] [-d|--delay-min $DELAY_MIN] [-s|--slot $SLOT] [--develop] [-c|--config ./config.json] [--slot-tx-end 100] [--slot-chain-end 130] [--genesis-ledger-dir ./genesis]" >&2 | ||
echo "Consider reading script's code for information on optional arguments" >&2 | ||
usage() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes were introduced to make output of the script nicer, not directly related to rewrite to Go
"${NODE_ARGS_1[@]}" \ | ||
--block-producer-key "$PWD/$CONF_DIR/bp" \ | ||
--config-directory "$PWD/localnet/runtime_1" \ | ||
--run-snark-worker "$(cat $CONF_DIR/bp.pub)" --work-selection seq \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a fix of a pre-existing glitch, not directly related to rewrite to Go
!ci-build-me |
!ci-nightly-me |
!ci-bypass-changelog |
That's a lot of code |
It seems there's issue:
we should probably fix this inside our hosted nix image |
Some ideas:
EDIT: implemented |
I think so? I noticed this exchange on slack:
and response
So setting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's one test failure that I thought I'd point out now. I haven't gone through much of the code yet.
I'm unsure why there were tzdata exceptions in the nightly run, because the logs have this in them:
+ [[ 1 -gt 0 ]]
+ ln -sf /usr/share/zoneinfo/UTC /etc/localtime
+ chown -R root /workdir
+ git config --global --add safe.directory /workdir
+ git fetch
so I think the build-and-test.sh
script is supposed to correct for the lack of /etc/timezone
already. Maybe the agents don't always have /usr/share/zoneinfo/UTC
?
The regular CI failed in the dev unit tests like this:
This run has ID `99BF9CDD-C904-49A0-981F-F1DD407607A5`.
[OK] Root 0 closing stable root, reload as converting.
[OK] Root 1 moving a root.
[FAIL] Root 2 make checkpointing a root.
-- Root.002 [make checkpointing a root.] Failed --
in `/workdir/_build/default/src/lib/mina_ledger/test/_build/_tests/99BF9CDD-C904-49A0-981F-F1DD407607A5/root.002.output`:
ASSERT make checkpointing a root timed out after 2s
[exception] (monitor.ml.Error
("Alcotest__Core.Check_error(\"Error make checkpointing a root timed out after 2s.\")")
which is really bizarre, but is very likely unrelated to this PR. It's possible that there was something slow in IO?
echo "Running HF test with SLOT_TX_END=$SLOT_TX_END" | ||
"$SCRIPT_DIR"/test.sh compatible-devnet{/bin/mina,-genesis/bin/runtime_genesis_ledger} fork-devnet{/bin/mina,-genesis/bin/runtime_genesis_ledger} && echo "HF test completed successfully" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "Bash: shellcheck" CI job is failing like this:
In ./scripts/hardfork/build-and-test.sh line 53:
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
^--------^ SC2034 (warning): SCRIPT_DIR appears unused. Verify use (or export if used externally).
because this change removes the one use of SCRIPT_DIR
from this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, I ran ./build-and-test.sh
from the scripts/hardfork/
directory itself and got this at the end, after everything had been built:
Error: failed to start main network: fork/exec /home/despresc/src/mina-reviews/scripts/hardfork/scripts/hardfork/run-localnet.sh: no such file or directory
failed to start main network: fork/exec /home/despresc/src/mina-reviews/scripts/hardfork/scripts/hardfork/run-localnet.sh: no such file or directory
So it might be good to keep using SCRIPT_DIR
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retriggering dev unit test.
I tried running it locally - I think my laptop might not be able to handle this test:
with everything that gets spawned. |
I have tried to run the original bash test on my laptop and succeeded once. Given we're using the same model it's interesting that a rewrite cause limit on resource. It does take sometimes to run though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the top level of this test. I feel there's much more to review, if we want to ensure 100% correctness.
errorLogger: log.New(os.Stderr, "ERROR: ", log.LstdFlags), | ||
debugLogger: log.New(os.Stdout, "DEBUG: ", log.LstdFlags), | ||
isDebug: isDebug, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For forward compatibility, could we generate JSON logs so it's parsable by downstream consumer?
|
||
var blocks []BlockData | ||
|
||
result.Get("data.bestChain").ForEach(func(_, value gjson.Result) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems very verbose. I think it's better to use a 3rd party library that deals with parsing for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use gqlgen? It has client side API that could be used for our purpose. Best case we should reuse the generated gql schema in this repo.
https://github.com/99designs/gqlgen/tree/master/_examples/todo
return 0, 0, err | ||
} | ||
|
||
blockHeight := result.Get("data.bestChain.0.protocolState.consensusState.blockHeight").Int() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an example, we're manually dealing with int conversion here, and I think the error message won't be pretty here.
@@ -0,0 +1 @@ | |||
{"go.mod":"b4592235afa6583ad9fa5ace6072a5b9733165d1580bfb47954fd154da9fe6ee","go.sum":"a66a7ea9363eecf0990f36729b257548c44b114ce588890dccdf633b123974d3","vendorSha256":"sha256-2Qy9V5NOKli0bjnhOykBYShk6+TYqIJshOXdWk+9O4c="} No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather have this hash pinning functionality specific to nix to go into another PR. It doesn't seems to be relating to the test itself closely.
Is this nix's restriction?
mainGenesisTimestamp := config.FormatTimestamp(mainGenesisTs) | ||
|
||
// Prepare run-localnet.sh command | ||
cmd := exec.Command( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of the fact this is a mix of go and bash. I think we're gradually migrating?
if err := t.ValidateForkConfigData(analysis.LatestNonEmptyBlock, forkConfigBytes); err != nil { | ||
return err | ||
} | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why there's a scope here.
|
||
// ExtractForkConfig extracts the fork configuration from the network | ||
func (t *HardforkTest) ExtractForkConfig(port int, forkConfigPath string) ([]byte, error) { | ||
for attempt := 1; attempt <= t.Config.ForkConfigMaxRetries; attempt++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this retry mechanism is a bit problematic. When encountering some errors, no retry happens at all
@@ -0,0 +1,193 @@ | |||
package hardfork |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these json validation is very verbose. We might want to consider simplifying them.
}() | ||
|
||
// Sleep until fork genesis | ||
t.Logger.Info("Sleeping for %d minutes until fork genesis...", t.Config.ForkDelay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is confusing, and it doesn't seems to correspond to any thing, and it's not in original script. suggest deleting.
} | ||
|
||
// Wait until best chain query time | ||
t.WaitUntilBestChainQuery(t.Config.MainSlot, t.Config.MainDelay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sleep is before the query in original script. might be a bug here.
sleep $((FORK_DELAY*60))s
earliest_str=""
while [[ "$earliest_str" == "" ]] || [[ "$earliest_str" == "," ]]; do
earliest_str=$(get_height_and_slot_of_earliest 10303 2>/dev/null)
sleep "$FORK_SLOT"s
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, in that script we're assuming fork genesis happen right after local net is started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is pretty confusing because there's a sleep $((FORK_SLOT*10))s
after the above script. I would suggest everything to be operating on time stamp not duration.
Running this test locally. |
Rewrite HF test core logic in Go.
Explain how you tested your changes:
Checklist: