test(supervisors): fix occasional assertion failures and hangs #2431

vvanglro · 2024-08-14T04:28:05Z

Summary

It's not entirely clear why the assertion `assert not process.is_alive()` fails here.

Make sure that test resources can be cleaned up anyway.

Checklist

I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
I've updated the documentation accordingly.

…lly' Ensure proper cleanup in 'test_multiprocess' by wrapping the supervisor shutdown logic in a 'finally' block. This guarantees that the supervisor is properly terminated and all processes are joined, avoiding potential resource leaks or state residue between test runs.
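
The cleanup pattern described in this commit message can be sketched roughly as follows. This is not the actual test code: `start_workers`, `run_check`, and `check_with_cleanup` are hypothetical names, and plain `subprocess.Popen` children stand in for the supervisor's worker processes.

```python
import subprocess
import sys


def start_workers(count: int) -> list:
    """Start `count` child processes that sleep until they are killed (hypothetical helper)."""
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    return [subprocess.Popen(cmd) for _ in range(count)]


def run_check(workers: list) -> None:
    """Stand-in for the test body; a failing assertion here must not leak children."""
    assert all(worker.poll() is None for worker in workers)


def check_with_cleanup() -> list:
    workers = start_workers(2)
    try:
        run_check(workers)
    finally:
        # Runs whether or not the assertion above failed, so no child
        # process outlives the test.
        for worker in workers:
            worker.kill()
        for worker in workers:
            worker.wait(timeout=5)
    return workers
```

With the shutdown in `finally`, a failing assertion still propagates to the test runner, but only after every child has been killed and reaped.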

vvanglro · 2024-08-14T07:43:08Z

There are now two problems: the first is that the process hangs when the assertion fails (this PR solves that), and the second is that the test hangs during execution.

related:
https://stackoverflow.com/questions/74114409/flask-hangs-when-run-as-a-subprocess-alongside-pytest
pytest-dev/pytest#10965

I set up a Windows environment, but the problem is hard to reproduce. 🙃

A 10-second timeout has been added to the subprocess call in test_multiprocess.py to ensure it does not hang indefinitely in case of failures or slow systems.
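
For illustration, a small sketch of how a `timeout` on `subprocess.run` behaves. The helper name `run_with_timeout` and the commands passed to it are assumptions made for this example, not code from the PR.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout: float) -> str:
    """Run `code` in a fresh interpreter and report whether it finished in time."""
    try:
        subprocess.run([sys.executable, "-c", code], timeout=timeout, check=True)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before re-raising,
        # so the caller is not left with a hung process.
        return "timed out"
    return "finished"
```

A hung child then shows up as a `TimeoutExpired` failure rather than as a test run that never returns.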

tests/supervisors/test_multiprocess.py

vvanglro · 2024-08-15T08:53:10Z

Ah, it seems to make sense now. After `process.kill()`, the process does not end immediately, which is why `assert not process.is_alive()` occasionally fails here.

The successful runs all take more than 7 s; that is, by the time the ping is executed the process has really exited, so the ping times out and returns false.

I added a 1-second timeout to the assertion here, and the total execution time is still more than 3 s, which indicates that the ping path is taken here.

The same.
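
The race described above can be sketched as follows. This is an assumption-laden illustration rather than the test itself: `subprocess.Popen` stands in for the supervisor's process wrapper, `poll() is None` plays the role of `is_alive()`, and `kill_and_confirm` is a hypothetical name.

```python
import subprocess
import sys


def kill_and_confirm(timeout: float = 5.0) -> bool:
    """Kill a child process and report whether it has exited after waiting."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    child.kill()
    # kill() only requests termination. At this point poll() may still return
    # None, so asserting "not alive" immediately would be racy.
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return child.poll() is not None
```

Waiting with a bounded timeout before the assertion removes the race without letting the test block forever.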

Kludex · 2024-08-15T08:58:43Z

See? No need for retry. 😌👍

Thanks for your time investigating this. 🙏

tests/supervisors/test_multiprocess.py

Kludex · 2024-08-20T11:50:36Z

tests/supervisors/test_multiprocess.py

@@ -34,6 +34,7 @@ def new_function():
                "-c",
                f"from {module} import {name}; {name}.__wrapped__()",
            ],
+            timeout=10,


Why do we need the timeout?

Oh, this was added earlier when I was investigating why it was hanging. There's no need for it now; I'll remove it later.

- Replace fixed sleep with retry mechanism for process status checks
- Enhance test stability by allowing for variable process startup times
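
One way to replace a fixed sleep with a bounded status check is sketched below. The helper `wait_until` and its `timeout` and `interval` parameters are assumptions made for this example and are not taken from the PR.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `condition` until it is true or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One last check, so a condition that became true during the final sleep counts.
    return condition()
```

A test could then write something like `assert wait_until(lambda: process.is_alive())`, which returns as soon as the process is up and fails only after the deadline has passed.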

Kludex · 2024-11-21T08:13:09Z

Thanks.

vvanglro · 2024-11-21T08:15:02Z

We can't control it with precision, so we might as well retry.

Just make sure the killed process is eventually restarted successfully.

Retries are not a good sign in a test suite.

Kludex · 2024-12-15T18:31:57Z

@abersheeran suggestions to solve this flaky test?

vvanglro · 2024-12-16T09:27:52Z

It's very rare; I had to run it many times before hitting it. It looks like the restart of the process is delayed?

vvanglro · 2024-12-16T10:02:50Z

Ah, does the unit test code have to be counted in the coverage statistics too?

abersheeran · 2025-01-06T07:31:22Z

> @abersheeran suggestions to solve this flaky test?

I think using a `while` loop to check is not a good idea in unit tests. To be honest, I think it would be better to just kill all processes in the tree externally in case of weird scheduling errors.

vvanglro · 2025-01-14T05:52:43Z

@Kludex I think the issue is resolved in the latest commit.

Kludex

Sorry for the delay!

- Add retry mechanism for unstable test_multiprocess_health_check
- Ensure server is running before killing the process
- Add assertion to check process is alive before killing it

vvanglro · 2025-07-03T02:59:25Z

🤦 This test case seems very unstable on Windows, and perhaps increasing the retry count is the right way to suppress it.

- Add a `with_retry` decorator in tests/utils.py to handle retries
- Apply the decorator to the `test_multiprocess_health_check` function
- Remove the ad-hoc retry logic specific to `test_multiprocess_health_check`

- Add pragma nocover comment to async_wrapper and sync_wrapper functions
- This change excludes the decorator logic from code coverage metrics

- Add pragma nocover comment to exclude async_wrapper return statement from coverage
- This change helps to avoid unnecessary test coverage metrics

vvanglro · 2025-07-16T05:07:18Z

Using a retry decorator might be better. I have observed that some other test cases are also unstable, such as the ws cases, so this decorator could later be applied to those unstable test cases as well.
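
A retry decorator along these lines might look like the sketch below. Only the names `with_retry`, `async_wrapper`, and `sync_wrapper` come from the commit messages in this PR. The `retry_count` and `delay` parameters, and the whole body, are assumptions and may differ from what is actually in tests/utils.py.

```python
import asyncio
import functools
import inspect
import time
from typing import Any, Callable


def with_retry(retry_count: int = 3, delay: float = 0.0) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Re-run a sync or async test function up to `retry_count` times if it raises."""
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
                for attempt in range(retry_count):
                    try:
                        return await func(*args, **kwargs)
                    except Exception:
                        if attempt == retry_count - 1:
                            raise  # out of attempts: surface the last failure
                        await asyncio.sleep(delay)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(retry_count):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == retry_count - 1:
                        raise  # out of attempts: surface the last failure
                    time.sleep(delay)

        return sync_wrapper

    return decorator
```

Because `AssertionError` is a subclass of `Exception`, a failed assertion triggers another attempt, and the final failure is re-raised unchanged so the test report still shows the real error.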

vvanglro added 5 commits August 14, 2024 17:18

test(supervisor): add timeout to multiprocess test subprocess

b4d8b09

try to solve

a65c61a

try to solve

d832a5a

Merge branch 'master' into fix/test_hang

cd7dcc9

feat: add retry for unstable tests

27c6968

vvanglro commented Aug 15, 2024

tests/supervisors/test_multiprocess.py (outdated)

test(supervisor): remove retry, ensure process close.

f0d496e

vvanglro changed the title ~~test(supervisors): ensure cleanup in test_multiprocess by using 'finally'~~ test(supervisors): fix occasional assertion failures and hangs Aug 16, 2024

Merge branch 'master' into fix/test_hang

239a3cf

Kludex reviewed Aug 20, 2024

vvanglro added 4 commits August 20, 2024 20:28

test case

ffe43cb

Merge branch 'master' into fix/test_hang

41583fd

Merge branch 'master' into fix/test_hang

5b97b39

Merge branch 'master' into fix/test_hang

d737e28

This was referenced Oct 17, 2024

feat: supports setting multiple hosts #2486

Open

fix(http): enable httptools lenient data #2488

Merged

Merge branch 'master' into fix/test_hang

b8f2769

Kludex previously approved these changes Nov 20, 2024

Kludex enabled auto-merge (squash) November 20, 2024 20:54

test(supervisors): improve test_multiprocess reliability

0b1b372

auto-merge was automatically disabled November 21, 2024 08:12
Head branch was pushed to by a user without write access

vvanglro added 2 commits December 16, 2024 17:53

test(supervisors): improve test_multiprocess reliability

bd08e0e

Merge branch 'master' into fix/test_hang

0956647

vvanglro added 2 commits December 16, 2024 18:04

test(supervisors): improve test_multiprocess reliability

f95cd0f

Merge branch 'master' into fix/test_hang

49c40db

vvanglro added 3 commits January 6, 2025 16:26

Release gil to let the supervisor thread switch execution

7c48f04

format

418dc84

Merge branch 'master' into fix/test_hang

b8a0686

Merge branch 'master' into fix/test_hang

a711e2e

Kludex enabled auto-merge (squash) July 2, 2025 11:04

Kludex disabled auto-merge July 2, 2025 11:04

Kludex approved these changes Jul 2, 2025

Kludex enabled auto-merge (squash) July 2, 2025 11:05

Merge branch 'master' into fix/test_hang

8a53143

Kludex disabled auto-merge July 2, 2025 16:26

test(supervisors): improve test_multiprocess reliability

13c6970

Kludex requested a review from abersheeran July 3, 2025 06:19

vvanglro added 2 commits July 3, 2025 23:20

Merge branch 'master' into fix/test_hang

8ff5788

Merge branch 'master' into fix/test_hang

d5bee19

abersheeran approved these changes Jul 13, 2025

vvanglro added 4 commits July 16, 2025 12:43

test(supervisors): implement a generic retry mechanism for tests

a447a5a

test: add pragma nocover to retry decorator

68a59ed

lint

66fef01

test: add pragma nocover to async_wrapper return statement

e470d85
