-
-
Notifications
You must be signed in to change notification settings - Fork 826
test(supervisors): fix occasional assertion failures and hangs #2431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…lly' Ensure proper cleanup in 'test_multiprocess' by wrapping the supervisor shutdown logic in a 'finally' block. This guarantees that the supervisor is properly terminated and all processes are joined, avoiding potential resource leaks or state residuebetween test runs.
Now there are two problems, one is that the process hangs when assertion fails(This PR solves), and the second is that it hangs on execution. related: I create a Windows environment, but it's hard to reproduce.🙃 |
A10-second timeout has been added to the subprocess call within the test_multiprocess.py to ensure it does not hang indefinitely in case of failures or slow systems.
See? No need for retry. 😌👍 Thanks for your time investigating this. 🙏 |
@@ -34,6 +34,7 @@ def new_function(): | |||
"-c", | |||
f"from {module} import {name}; {name}.__wrapped__()", | |||
], | |||
timeout=10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need the timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this was added earlier when I was investigating why it was hanging. No need for it now, I'll remove it later
- Replace fixed sleep with retry mechanism for process status checks- Enhance test stability by allowing for variable process startup times
Head branch was pushed to by a user without write access
Thanks. |
We can't control it with precision, so we might as well retry. Just make sure the dead process is eventually started successfully. |
Retries are not a good sign in a test suite.
@abersheeran suggestions to solve this flaky test? |
Ah, does the unit test code have to be statistically covered too? |
I think using while to check is not a good idea in unit tests. To be honest, I think it would be better to just kill all processes in the tree externally in case of weird scheduling errors. |
@Kludex I think the issue is resolved in the latest commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry the delay!
- Add retry mechanism for unstable test_multiprocess_health_check- Ensure server is running before killing the process - Add assertion to check process is alive before killing it
🤦 This use case seems very unstable under Windows, and perhaps increasing the retry count is the correct way to suppress it. |
- Add a `with_retry` decorator in tests/utils.py to handle retries - Apply the decorator to the `test_multiprocess_health_check` function - Remove the ad-hoc retry logic specific to `test_multiprocess_health_check`
- Add pragma nocover comment to async_wrapper and sync_wrapper functions- This change excludes the decorator logic from code coverage metrics
- Add pragma nocover comment to exclude async_wrapper return statement from coverage - This change helps to avoid unnecessary test coverage metrics
Using a retry decorator might be better, as I have observed that some test cases are also unstable, such as ws cases, subsequently, it can be considered to use this decorator for these unstable use cases. |
Summary
It's not entirely clear why the assertion fails here.
assert not process.is_alive()
Make sure that test resources can be cleaned up anyway.
Checklist