-
-
Notifications
You must be signed in to change notification settings - Fork 145
feat(series): #1098 arithmetic addition #1275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(series): #1098 arithmetic addition #1275
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like great progress here!
3b3f899
to
bf79de1
Compare
02c646d
to
8f43efc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @cmp0xff!
I approved but I will let @Dr-Irv review/merge when he is back as this is a larger MR. I think adding all these overloads is in the the spirit of what pandas-stubs has been doing (but it might also create more maintenance burden).
(Personally, I'm still hoping for something like this #820 (comment) which would be more generic, but not supported by today's type checkers/python type system)
e101ff4
to
14f79a0
Compare
14f79a0
to
2b94ab0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the improvement of the tests and your approach. Thanks much for that.
I also like that you test the operator +
(via both __add__()
and __radd__()
, and Series.add()
and Series.radd()
. But the arguments types for other
for __add__()
and Series.add()
should be the same, and the argument types for other
for __radd__()
and Series.radd()
should be the same.
As suggested in the review, if you make those changes, you should be in good shape.
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
i + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.complex128, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
f + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.complex128, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.complex128, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really do not want to have ignore
statements in the tests.
I think if you modify __radd__()
as follows it might work:
def __radd__(
self: Series[complex], other: complex | Sequence[complex] | np_ndarray_anyint
| np_ndarray_float
| np_ndarray_complex
) -> Series[complex]: ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I suggested in the comment, it did not make a difference when I included np_ndarray_*
in the typing for other
. My guess is that pandas.Series.__radd__
has a lower priority than np.ndarray.__add__
, which already gives Any
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're correct about the precedence.
Actually, I think np.ndarray.__add__()
is type checking to return np.ndarray
, so can you change the test to show that it does return np_ndarray_int
or np_ndarray_float
in the assert_type()
statements, but that the runtime result is a Series
.
Then add a comment indicating why the tests are this way because of the precedence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4f11405 except for the mypy
bug mentioned in the comment for the untyped Series
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
i + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.float64, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
f + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.float64, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.complex128, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same issue as above. Modify __radd__()
to include the numpy arrays as the other
argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
i + left, "pd.Series[int]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.int64, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
f + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.float64, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same - modify __radd__()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/series/arithmetic/test_add.py
Outdated
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
i + left, pd.Series # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
f + left, pd.Series # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
c + left, pd.Series # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same - modify __radd__()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2b94ab0
to
2576640
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the arguments types for other for add() and Series.add() should be the same, and the argument types for other for radd() and Series.radd() should be the same.
With 2576640, arguments types are now
- the same for
Series.__add__
and Series.add` - almost the same for
Series.__radd__
andSeries.radd
, except fornp_ndarray_*
are not added to the former case, becausenumpy
has already defined the results to be of typeAny
.
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
i + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.complex128, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
f + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.complex128, | ||
) | ||
check( | ||
assert_type( # type: ignore[assert-type] | ||
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure] | ||
), | ||
pd.Series, | ||
np.complex128, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I suggested in the comment, it did not make a difference when I included np_ndarray_*
in the typing for other
. My guess is that pandas.Series.__radd__
has a lower priority than np.ndarray.__add__
, which already gives Any
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make __radd__()
and radd()
have the same arguments, even though it doesn't do what we want?
Can you update the tests as indicated in the comment so that you don't have the # type: ignore
, but you use assert_type()
on the type that numpy
is returning, and then check()
on the runtime type (pd.Series
)
Also, I'm finding it hard to see what you changed since you're original commit. Please don't force push your changes, but just push your latest commits.
They are almost the same now, except for
Added, except for #1275 (comment)
I see, but the base branch had been updated and there were conflicts. What should I do instead of rebasing, which caused the forced-update? |
git checkout mybranch
git fetch upstream
git merge upstream/main If there are conflicts when you merge, fix them, and commit, and you should be good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @cmp0xff . Really nice work!
This PR is motivated by #1273 (review).
Comprehensive plan (maybe in another PR)
Inspired by #1093, one can draft the following comprehensive plan for all possible combinations of the left arithmetic operand, the binary arithmetic operator and the right arithmetic operand.
Left operand
Series[Any]
Series[int]
Series[float]
Series[complex]
Binary arithmetic operators
Include their right-version and functional version. In
add
we give a full example. We follow the official documentation.__add__
,__radd__
,add
,radd
__truediv__
__floordiv__
__pow__
__mod__
__mul__
__sub__
Right operand
Scalar
Sequence
numpy
arraysSeries
Draft of a plan
add
.Checklist
assert_type()
to assert the type of any return value