Skip to content

Conversation

@PawelPlesniak
Copy link
Collaborator

@PawelPlesniak PawelPlesniak commented Nov 11, 2025

Description

Changes the structure of the PR template to prioritize the testing, introduces the requirements from other repos as a field.

No tests or further checks have been run, as this is a template issue, and does not affect the core code.

Type of change

  • Documentation (non-breaking change that adds or improves the documentation)
  • New feature (non-breaking change which adds functionality)
  • Optimization (non-breaking, back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (whatever its nature)

Key checklist

  • All tests pass (eg. python -m pytest)
  • Pre-commit hooks run successfully (eg. pre-commit run --all-files)

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added or an issue has been opened to tackle that in the future.
    (Indicate issue here: # (issue))

@PawelPlesniak
Copy link
Collaborator Author

@jamesturner246 @Aurashk @miruuna this is the PR to clarify what tests should be performed for PR merges. This will update the PR template so the testing requiremenst are clearer, this is sometthing that I would value feedback on. The tests that should be performed are:

  • unit tests with pytest
  • regression tests with minimal_system_quick_test
  • integration tests with daqsystemtest_integtest_bundle for anyone with either access to the NP0x cluster or CVMFS

Looking forward to feedback from you

@Aurashk
Copy link
Contributor

Aurashk commented Nov 21, 2025

@jamesturner246 @Aurashk @miruuna this is the PR to clarify what tests should be performed for PR merges. This will update the PR template so the testing requiremenst are clearer, this is sometthing that I would value feedback on. The tests that should be performed are:

* unit tests with `pytest`

* regression tests with `minimal_system_quick_test`

* integration tests with `daqsystemtest_integtest_bundle` for anyone with either access to the NP0x cluster or CVMFS

Looking forward to feedback from you

Thanks Pawel, that's very helpful.
Just wondering about the integration tests, is it enough to daqsystemtest_integtest_bundle on any machine with CVMFS active? So there is no requirement to test on specific hardware?

@PawelPlesniak
Copy link
Collaborator Author

There are no hardware restrictions, but this code has been written to run on AlmaLinux9. If I remember correctly, @jamesturner246 got CVMFS set up on his laptop, and should be able to run these tests locally

@miruuna
Copy link
Contributor

miruuna commented Nov 24, 2025

This looks good and it'll help standardize the workflow. However, running daqsystemtest_integtest_bundle on my machine (Ubuntu 24.04.3 LTS) seems to always take a very long time (>30 min). I am not sure if it's got to do with my machine being slow but this would definitely slow the workflow in my case by quite a bit.

@PawelPlesniak
Copy link
Collaborator Author

PawelPlesniak commented Nov 24, 2025

This looks good and it'll help standardize the workflow. However, running daqsystemtest_integtest_bundle on my machine (Ubuntu 24.04.3 LTS) seems to always take a very long time (>30 min). I am not sure if it's got to do with my machine being slow but this would definitely slow the workflow in my case by quite a bit.

Thanks Miruna. Yes, it is expected that this test takes very long, but this should be only have to be run when major changes are made to the core codebase, prior to merging a PR. I will make this clearer in the testing list requirements

@bieryAtFnal
Copy link
Contributor

In case it is useful, I will contribute a couple of notes regarding where we can run the daqsystemtest regression tests.

  1. The system software packages that are needed on a computer in order to run the DAQ generally are mentioned on the "software area instructions" page here in section 1.vi.
  2. As @PawelPlesniak knows, we don't yet have sufficient-computer-resource checks in all of the existing regression tests, including those in daqsystemtest. These checks attempt to ensure that the current computer has enough CPU/memory/free disk to handle a given regression test. If such checking should be a priority, I will work on it - please let me know. (For reference, somewhat recently we added such checking to the regression tests in the dfmodules package. Now, the tests in that package will skip tests that require more resources than the current computer has available. This helps to avoid confusing errors when we run a test on a system that is underpowered for the test.)

@Aurashk
Copy link
Contributor

Aurashk commented Nov 25, 2025

In case it is useful, I will contribute a couple of notes regarding where we can run the daqsystemtest regression tests.

1. The system software packages that are needed on a computer in order to run the DAQ generally are mentioned on the "software area instructions" page [here](https://github.com/DUNE-DAQ/daqconf/wiki/Setting-up-a-fddaq%E2%80%90v5.5.0-development-area) in section 1.vi.

2. As @PawelPlesniak knows, we don't yet have sufficient-computer-resource checks in all of the existing regression tests, including those in `daqsystemtest`.  These checks attempt to ensure that the current computer has enough CPU/memory/free disk to handle a given regression test.  If such checking should be a priority, I will work on it - please let me know.  (For reference, somewhat recently we added such checking to the regression tests in the `dfmodules` package.  Now, the tests in that package will skip tests that require more resources than the current computer has available.  This helps to avoid confusing errors when we run a test on a system that is underpowered for the test.)

Thanks Kurt, that's very helpful, it's possible 2. explains some test failures I was seeing locally last time I tried. Is it totally unpredictable what will happen in the tests without these checks for computational resources or is there a common point of failure? Also one other more general thing that might be useful to know is what makes the tests run long? Is it a timed simulation of everything working meaninfully together or is it doing a lot of computational work?

@Aurashk
Copy link
Contributor

Aurashk commented Nov 25, 2025

Also another thing came to mind. What's the situation with this MSQT test in the CI https://github.com/DUNE-DAQ/drunc/actions/workflows/run_mqst.yml? It seems like it was abandoned some time earlier in the year judging by the actions runs. Is this something we want to get working again?

@PawelPlesniak
Copy link
Collaborator Author

In case it is useful, I will contribute a couple of notes regarding where we can run the daqsystemtest regression tests.

1. The system software packages that are needed on a computer in order to run the DAQ generally are mentioned on the "software area instructions" page [here](https://github.com/DUNE-DAQ/daqconf/wiki/Setting-up-a-fddaq%E2%80%90v5.5.0-development-area) in section 1.vi.

2. As @PawelPlesniak knows, we don't yet have sufficient-computer-resource checks in all of the existing regression tests, including those in `daqsystemtest`.  These checks attempt to ensure that the current computer has enough CPU/memory/free disk to handle a given regression test.  If such checking should be a priority, I will work on it - please let me know.  (For reference, somewhat recently we added such checking to the regression tests in the `dfmodules` package.  Now, the tests in that package will skip tests that require more resources than the current computer has available.  This helps to avoid confusing errors when we run a test on a system that is underpowered for the test.)

Thanks Kurt, that's very helpful, it's possible 2. explains some test failures I was seeing locally last time I tried. Is it totally unpredictable what will happen in the tests without these checks for computational resources or is there a common point of failure? Also one other more general thing that might be useful to know is what makes the tests run long? Is it a timed simulation of everything working meaninfully together or is it doing a lot of computational work?

When a session is running on a host with insufficient resources, a session will likley throw errors with the number of missing/empty data products. This takes time as we run a variety of configurations with many runs - there are 9 tests and multiple configurations for some of these tests. Supposing each tests takes 3 mins, this will get you the approximate half hour for running.

@PawelPlesniak
Copy link
Collaborator Author

Also another thing came to mind. What's the situation with this MSQT test in the CI https://github.com/DUNE-DAQ/drunc/actions/workflows/run_mqst.yml? It seems like it was abandoned some time earlier in the year judging by the actions runs. Is this something we want to get working again?

This is something held back by the development of the Subprocess process manager, in this PR

@bieryAtFnal
Copy link
Contributor

In principle, the time that each regression test takes to run is dominated by the amount of time spent waiting in each FSM state (e.g. trigger-enabled), as much time as the writer of the test chose. Of course, if lots of failures happen and/or a process either stalls or crashes, run control transitions can take longer than usual (e.g. some of the "stop-run" transitions), and those might produce a noticeable extra amount of time.

@jamesturner246
Copy link
Contributor

jamesturner246 commented Dec 15, 2025

Hi all. As discussed last meeting, I think a special cluster account just for testing PRs would be invaluable for this workflow.

Something we could perhaps hook into CI -- e.g. manually (or even auto, but maybe too noisy) trigger the full integration test suite on the cluster once the PR is marked ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants