Refactor and expand tests #220
Conversation
… inputs and add tests
…antile handling; add tests for new functionality
I will try to review later today; on a quick look it looks good. I think the main challenge will be having our CI jobs run the relevant tests in representative enough environments, given the complex relations between the optional dependencies. It is also possible we have dead code lying around from discarded experiments.
I have started going over the tests; I will continue, as it will take a while. So far the most recurring comment is to add tests of axis behaviour in array functions. The xarray-related tests are definitely passing the argument to the array function, so it might not have any effect on actual coverage when looked at from a global perspective.
However, there are already many functions with tests for axis, and given we have a minimal test job that runs tests in an env without xarray nor arviz-base, I think it would be nice to try to get both the global coverage of the library and the coverage of the minimal test job (counted with respect to things inside base/* except the dataarray file).
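For illustration, a hypothetical sketch of the kind of axis test being requested (names are made up, and `np.mean` stands in for an arviz-stats array function):

```python
import numpy as np
import pytest

@pytest.mark.parametrize("axis", [0, 1, -1])
def test_axis_is_honoured(axis):
    # Check that the axis argument actually changes which dimension is reduced.
    x = np.random.default_rng(0).normal(size=(3, 4, 5))
    out = np.mean(x, axis=axis)
    assert out.shape == tuple(np.delete(x.shape, axis))
```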
The second common comment is to avoid the np.random.function pattern in new code and instead use default_rng() and then rng.function.
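For reference, the pattern NumPy recommends for new code looks like:

```python
import numpy as np

# Seeded Generator instead of the legacy module-level np.random functions.
rng = np.random.default_rng(42)
draws = rng.normal(size=(4, 100))  # rather than np.random.normal(size=(4, 100))
```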
All source code related changes look good.
Think I have all of the changes related to axes testing you mentioned. Appreciate the review so far as well. Going to continue looking for places to test axis behavior where it makes sense.
I think we can merge already, as it is a clear improvement on the current situation, but I think we still need to keep working on this, not so much in terms of coverage numbers or the tests themselves but more on organizational-level things. This was already an issue before, but I am still not sure, if I make a change to the kde, where new tests should go, or, without running them, even where the tests to update will be.
@pytest.mark.filterwarnings("ignore::UserWarning")
@pytest.mark.filterwarnings("ignore::RuntimeWarning")
Not the first I have seen, but these are great, thanks! For legacy ArviZ we missed some warnings because the tests were too noisy, and we didn't see some deprecations until the function was removed, for example.
Do you know if it is possible to restrict ignores to matching messages too?
The warning messages were getting wild in the test output because we're explicitly trying to raise a lot of these warnings. So I tried to go through and make sure we're not ignoring organic warnings that surface normally.
from arviz_stats.utils import ELPDData
@pytest.mark.filterwarnings("ignore:Estimated shape parameter:UserWarning")
this looks like the pattern to ignore warnings with matching message, TIL ❤️
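For context, pytest filter specs follow the `warnings` module format `action:message:category:module:lineno`, where the message field is a regex matched against the start of the warning text, so only warnings matching both the message and the category are silenced. A small sketch (the test name is made up):

```python
import warnings
import pytest

@pytest.mark.filterwarnings("ignore:Estimated shape parameter:UserWarning")
def test_pareto_shape_warning():
    # Silenced: matches both the message prefix and the category.
    warnings.warn("Estimated shape parameter exceeds 0.7", UserWarning)
    # Any other UserWarning would still surface in the test output.
```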
Agreed, I think the organization could definitely be better. Also, the workflows you mentioned in Slack for running different test suites based on dependencies would be very nice. The tests take slightly longer to run now given the new volume of tests, but I don't think it's too crazy. I imagine the organization of where to put new tests can evolve; for certain things it's obvious, like LOO-based tests, but to your point about KDE tests, right now there are a couple of places where they could live. So we can definitely centralize that more so it's more obvious.
This significantly refactors and expands the test suite for arviz-stats. The goal was to test core functionality and relevant edge cases in areas that either lacked test coverage or had very minimal exposure. I also moved all
LOO-based testing into its own testing directory so we can expand on those tests and separate the modules out. While this PR primarily focuses on robustifying the test suite, there were several changes to source code as well along the way. Some of these are (hopefully) small improvements, while others came up from edge-case tests.
Source Changes
- `src/arviz_stats/accessors.py:276`: changed `self._obs.name` to `self._obj.name`
- `src/arviz_stats/numba/intervals.py`: fixed the `return` statement in the `quantile()` function (line 31); in `_quantile()` (line 21), changed `result = ...` to `result[:] = ...` so the output buffer is written in place (see the sketch below)
- `src/arviz_stats/numba/diagnostics.py`: in `rhat()` (line 121) and `ess()` (line 286), changed `ds[group]` to `ds[group].dataset`; at line 69, changed `raw_rearrange` to `rearrange` (was getting an error for `raw_rearrange`)
- `src/arviz_stats/sampling_diagnostics.py:388`: changed `rhat` to `rhat_nested` in the `data.map()` call
- `src/arviz_stats/base/dataarray.py` (lines 325-328): check that `factor >= 1`
- `src/arviz_stats/survival.py` (lines 25-27 and 106-108): the `var_names` parameter in `kaplan_meier()` and `generate_survival_curves()` now accepts `str` and `list[str]` and is no longer listed as optional
- `src/arviz_stats/base/dataarray.py` (line 115, lines 125-129): added a `method` parameter to `rhat_nested()` with default `"rank"`
- `src/arviz_stats/sampling_diagnostics.py` (lines 378-416): pass the `method` parameter through all `rhat_nested()` calls

I think all fixes should be backward compatible. I know this is quite a large PR. Things escalated quickly, to say the least 😄.
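A minimal sketch of why the `result[:] = ...` change matters, assuming the numba code is a gufunc (the `demean` example below is hypothetical, not arviz-stats code): inside a `@guvectorize` kernel, plain assignment only rebinds the local name, so the output must be written in place.

```python
import numpy as np
from numba import guvectorize

@guvectorize(["void(float64[:], float64[:])"], "(n)->(n)")
def demean(x, res):
    # Correct: writes through the output buffer numba hands us.
    res[:] = x - x.mean()
    # Wrong: `res = x - x.mean()` would rebind the local name and
    # leave the actual output uninitialized.

print(demean(np.array([1.0, 2.0, 3.0])))  # [-1.  0.  1.]
```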
Resolves (first pass at least): #218
📚 Documentation preview 📚: https://arviz-stats--220.org.readthedocs.build/en/220/