Skip to content

feat(series): #1098 arithmetic addition #1275

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

cmp0xff
Copy link
Contributor

@cmp0xff cmp0xff commented Jul 13, 2025

This PR is motivated by #1273 (review).

Comprehensive plan (maybe in another PR)

Inspired by #1093, one can draft the following comprehensive plan for all possible combinations of the left arithmetic operand, the binary arithmetic operator and the right arithmetic operand.

Left operand

  • Series[Any]
  • Series[int]
  • Series[float]
  • Series[complex]

Binary arithmetic operators

Include their right-version and functional version. In add we give a full example. We follow the official documentation.

  • __add__, __radd__, add, radd
  • __truediv__
  • __floordiv__
  • __pow__
  • __mod__
  • __mul__
  • __sub__

Right operand

  • Scalar
  • Sequence
  • numpy arrays
  • Series

Draft of a plan

  • We can have 4 ~ 5 test scripts for each left operand
  • In each test script, we can have 7 test functions for each binary arithmetic operator
  • In each test function, we test on all right operands. Do not forget the possible four variations, see the example of add.

Checklist

@cmp0xff cmp0xff marked this pull request as ready for review July 13, 2025 16:32
Copy link
Member

@loicdiridollou loicdiridollou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like great progress here!

@cmp0xff cmp0xff marked this pull request as draft July 14, 2025 16:51
@cmp0xff cmp0xff force-pushed the feature/cmp0xff/1098-__add__-and-__mul__ branch from 3b3f899 to bf79de1 Compare July 15, 2025 09:29
@cmp0xff cmp0xff force-pushed the feature/cmp0xff/1098-__add__-and-__mul__ branch 2 times, most recently from 02c646d to 8f43efc Compare July 15, 2025 15:25
@cmp0xff cmp0xff changed the title feat(series): #1098 __add__ and __mul__ feat(series): #1098 additions Jul 15, 2025
@cmp0xff cmp0xff changed the title feat(series): #1098 additions feat(series): #1098 arithmetic additions Jul 15, 2025
@cmp0xff cmp0xff changed the title feat(series): #1098 arithmetic additions feat(series): #1098 arithmetic addition Jul 15, 2025
@cmp0xff cmp0xff marked this pull request as ready for review July 15, 2025 16:57
cmp0xff added a commit to cmp0xff/pandas-stubs that referenced this pull request Jul 16, 2025
Copy link
Member

@twoertwein twoertwein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @cmp0xff!

I approved but I will let @Dr-Irv review/merge when he is back as this is a larger MR. I think adding all these overloads is in the the spirit of what pandas-stubs has been doing (but it might also create more maintenance burden).

(Personally, I'm still hoping for something like this #820 (comment) which would be more generic, but not supported by today's type checkers/python type system)

cmp0xff added a commit to cmp0xff/pandas-stubs that referenced this pull request Jul 17, 2025
@cmp0xff cmp0xff force-pushed the feature/cmp0xff/1098-__add__-and-__mul__ branch from e101ff4 to 14f79a0 Compare July 17, 2025 07:32
cmp0xff added a commit to cmp0xff/pandas-stubs that referenced this pull request Jul 17, 2025
@cmp0xff cmp0xff force-pushed the feature/cmp0xff/1098-__add__-and-__mul__ branch from 14f79a0 to 2b94ab0 Compare July 17, 2025 15:10
Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the improvement of the tests and your approach. Thanks much for that.

I also like that you test the operator + (via both __add__() and __radd__(), and Series.add() and Series.radd(). But the arguments types for other for __add__() and Series.add() should be the same, and the argument types for other for __radd__() and Series.radd() should be the same.

As suggested in the review, if you make those changes, you should be in good shape.

Comment on lines 62 to 83
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix
check(
assert_type( # type: ignore[assert-type]
i + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.complex128,
)
check(
assert_type( # type: ignore[assert-type]
f + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.complex128,
)
check(
assert_type( # type: ignore[assert-type]
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.complex128,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really do not want to have ignore statements in the tests.

I think if you modify __radd__() as follows it might work:

def __radd__(
        self: Series[complex], other: complex | Sequence[complex] | np_ndarray_anyint
            | np_ndarray_float
            | np_ndarray_complex
    ) -> Series[complex]: ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I suggested in the comment, it did not make a difference when I included np_ndarray_* in the typing for other. My guess is that pandas.Series.__radd__ has a lower priority than np.ndarray.__add__, which already gives Any.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're correct about the precedence.

Actually, I think np.ndarray.__add__() is type checking to return np.ndarray, so can you change the test to show that it does return np_ndarray_int or np_ndarray_float in the assert_type() statements, but that the runtime result is a Series.

Then add a comment indicating why the tests are this way because of the precedence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4f11405 except for the mypy bug mentioned in the comment for the untyped Series

Comment on lines 62 to 83
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix
check(
assert_type( # type: ignore[assert-type]
i + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.float64,
)
check(
assert_type( # type: ignore[assert-type]
f + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.float64,
)
check(
assert_type( # type: ignore[assert-type]
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.complex128,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue as above. Modify __radd__() to include the numpy arrays as the other argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 62 to 79
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix
check(
assert_type( # type: ignore[assert-type]
i + left, "pd.Series[int]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.int64,
)
check(
assert_type( # type: ignore[assert-type]
f + left, "pd.Series[float]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.float64,
)
check(
assert_type( # type: ignore[assert-type]
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same - modify __radd__()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 62 to 81
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix
check(
assert_type( # type: ignore[assert-type]
i + left, pd.Series # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
)
check(
assert_type( # type: ignore[assert-type]
f + left, pd.Series # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
)
check(
assert_type( # type: ignore[assert-type]
c + left, pd.Series # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same - modify __radd__()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cmp0xff cmp0xff force-pushed the feature/cmp0xff/1098-__add__-and-__mul__ branch from 2b94ab0 to 2576640 Compare July 28, 2025 10:34
Copy link
Contributor Author

@cmp0xff cmp0xff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the arguments types for other for add() and Series.add() should be the same, and the argument types for other for radd() and Series.radd() should be the same.

With 2576640, arguments types are now

  • the same for Series.__add__ and Series.add`
  • almost the same for Series.__radd__ and Series.radd, except for np_ndarray_* are not added to the former case, because numpy has already defined the results to be of type Any.

Comment on lines 62 to 83
# numpy typing gives ndarray instead of `pd.Series[...]` in reality, which we cannot fix
check(
assert_type( # type: ignore[assert-type]
i + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.complex128,
)
check(
assert_type( # type: ignore[assert-type]
f + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.complex128,
)
check(
assert_type( # type: ignore[assert-type]
c + left, "pd.Series[complex]" # pyright: ignore[reportAssertTypeFailure]
),
pd.Series,
np.complex128,
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I suggested in the comment, it did not make a difference when I included np_ndarray_* in the typing for other. My guess is that pandas.Series.__radd__ has a lower priority than np.ndarray.__add__, which already gives Any.

@cmp0xff cmp0xff requested a review from Dr-Irv July 28, 2025 10:35
Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make __radd__() and radd() have the same arguments, even though it doesn't do what we want?

Can you update the tests as indicated in the comment so that you don't have the # type: ignore, but you use assert_type() on the type that numpy is returning, and then check() on the runtime type (pd.Series)

Also, I'm finding it hard to see what you changed since you're original commit. Please don't force push your changes, but just push your latest commits.

@cmp0xff
Copy link
Contributor Author

cmp0xff commented Jul 29, 2025

Can you make __radd__() and radd() have the same arguments, even though it doesn't do what we want?

They are almost the same now, except for __radd__s do not need Series in other, because they are governed by __add__ already.

Can you update the tests as indicated in the comment so that you don't have the # type: ignore, but you use assert_type() on the type that numpy is returning, and then check() on the runtime type (pd.Series)

Added, except for #1275 (comment)

Also, I'm finding it hard to see what you changed since you're original commit. Please don't force push your changes, but just push your latest commits.

I see, but the base branch had been updated and there were conflicts. What should I do instead of rebasing, which caused the forced-update?

@cmp0xff cmp0xff requested a review from Dr-Irv July 29, 2025 09:24
@Dr-Irv
Copy link
Collaborator

Dr-Irv commented Jul 29, 2025

I see, but the base branch had been updated and there were conflicts. What should I do instead of rebasing, which caused the forced-update?

git checkout mybranch
git fetch upstream
git merge upstream/main

If there are conflicts when you merge, fix them, and commit, and you should be good.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @cmp0xff . Really nice work!

@Dr-Irv Dr-Irv merged commit 845f9c5 into pandas-dev:main Jul 29, 2025
13 checks passed
@cmp0xff cmp0xff deleted the feature/cmp0xff/1098-__add__-and-__mul__ branch July 29, 2025 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Addition of a Series[int] with a complex returns Series[unknown]
4 participants