REF: consistent ndarray-wrapping for lists in arithmetic ops #62552
The alternative to np.array is sanitize_array (effectively |
Discussed on the dev call this week to (I think) general approval that np.array-wrapping is the simplest, most consistent behavior available and that users who want something else can pretty easily do that themselves. That said, I've found three issues where this would enshrine behavior that users complain about: #54554, #62524, #62353. In each of those cases the list contains datetimes or timedeltas. Calling np.array on it returns object dtype, where the users expected dt64 or td64. I still think np.array is the right move here, but want reviewers to be aware of the tradeoffs. |
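To make the tradeoff concrete, here is a minimal sketch of the inference difference being discussed. It only compares np.array with the public Series constructor on lists of stdlib datetime/timedelta objects; it does not exercise the arithmetic code path touched by this PR, and the variable names are illustrative.

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

stamps = [datetime(2024, 1, 1), datetime(2024, 1, 2)]
deltas = [timedelta(days=1), timedelta(hours=6)]

# np.array does no datetime-specific inference on Python objects:
# both lists come back as object dtype.
np_stamps = np.array(stamps)
np_deltas = np.array(deltas)

# The Series constructor infers datetime64 / timedelta64 instead.
# The exact unit (ns, us, ...) depends on the pandas version.
ser_stamps = pd.Series(stamps)
ser_deltas = pd.Series(deltas)

print(np_stamps.dtype, np_deltas.dtype)
print(ser_stamps.dtype, ser_deltas.dtype)
```

This is the gap the linked issues run into: the list looks like datetimes to the user, but np.array-wrapping treats it as generic objects.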
Added 3.0 milestone bc it is technically a breaking change. |
Dangit, wrapping in np.array introduces an inconsistency in DataFrame-vs-EverythingElse behavior. If we do a DataFrame op with a list, it passes that list to the Series constructor, which has different inference behavior from np.array (e.g. on I think we need to do Series-constructor-like behavior across the board. |
Double dang. Now I remember why I said "no" last time I tried sanitize_array. We run into NotImplemented trouble. e.g. with
left+right will go through |
Implementation details and complexities aside, I think conceptually "the right thing" would be to convert to an array like I assume that is the Series-constructor-like / sanitize_array you mention above? (and where you run into issues ..) |
Discussed in #62423 among others. This ensures we wrap lists in np.array at the top of all arithmetic/comparison/logical methods*.
* Not quite accurate. For logical methods on numpy dtypes we deprecated and now disallow dtype-less sequences. So this does the wrapping on EAs, but doesn't revert that deprecation for non-EA cases.
As a result, Series[dt64] - [datetime_obj] no longer raises. Note that it does return object dtype, which is not what the user in #62524 wants. This also results in a changed dtype for test_add_list_to_masked_array.
Update: I'm working on a branch to deprecate wrapping for tuple and non-list is_list_like(other) and not hasattr(other, "dtype")
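As a rough illustration of which objects that condition picks out, the sketch below evaluates it with the public pandas.api.types.is_list_like. The helper name is hypothetical, and which of these objects the deprecation branch would actually cover is an assumption on my part, not something stated above.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like


def is_dtypeless_listlike(other):
    # Hypothetical helper: the condition quoted above, verbatim.
    return is_list_like(other) and not hasattr(other, "dtype")


candidates = {
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "range": range(3),
    "ndarray": np.array([1, 2, 3]),
    "Series": pd.Series([1, 2, 3]),
    "str": "abc",
    "scalar": 1,
}

# Map each label to whether it is list-like without a dtype attribute.
flags = {name: is_dtypeless_listlike(obj) for name, obj in candidates.items()}

# Under my reading, the "non-list" cases are the True entries other than list.
non_list = [name for name, flag in flags.items() if flag and name != "list"]

print(flags)
print(non_list)
```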