✨: add CanArrayX protocols #32
OK, this PR is doing too much. Let me pare it down to just a few Protocols and do the rest as a series of follow-ups.
96067a4 to a1be18e
Ping @NeilGirdhar, given related discussions.
src/array_api_typing/_array.py
class CanArrayAdd(Protocol):
I was thinking about parametrizing by dtype. Self, other, output. Bit of a mess. Maybe tackle parametrizing as a followup?
Should all the Protocols inherit from |
I don't know what Joren will say, but I would guess no and no? (I think you got it right in this PR?) Also, I'm guessing you're aware that |
My thought was for building stuff like

class Positive(Protocol):
    def __call__(self, array: CanArrayPos, /) -> CanArrayPos: ...

is wrong. It should be something like

class Positive(Protocol):
    def __call__(self, array: HasArrayNamespace, /) -> HasArrayNamespace: ...

But I think we want

class Positive(Protocol):
    def __call__(self, array: CanArrayPos, /) -> HasArrayNamespace: ...

Which I think works best if it's

class CanArrayPos(HasArrayNamespace, Protocol): ...
Yes. :).
I see, you're kind of using it as a poor man's intersection?
Okay, is that because you're going to generate some documentation from these annotations? Or you find it less confusing? Also, are you going to add |
It's for two reasons: the Array API does it in its docs, and I think the Python numerical tower is a mess. Since ints and floats aren't subclasses of each other, it makes little sense for them to be interchangeable at the static type level. 😤😆
Worth discussing. The array api does not.
The docs are that way to help beginners who might be confused. (At least that was the argument that was presented.) But you aren't expecting beginners to read your code, are you? And, you aren't using this repo to build docs? The downside of populating the unions unnecessarily is overcomplicated type errors. So from a user standpoint, I think this is worse. From a developer standpoint, it's a matter of taste. Personally, I think more succinct is easier to understand.
As much as you might like to turn back time and change the typing decisions that were made, the fact is that the static type I think I understand what you're doing and why. I spent years writing
array.__add__(other: int | float | complex | array, /) → array
Have I misunderstood the documentation?
Ah. We're building towards v2021 first.
It might be easier to use optype for this, as it already provides single-method generic protocols for each of the special dunders:
https://github.com/jorenham/optype/blob/master/optype/_core/_can.py
There's even documentation: https://github.com/jorenham/optype#binary-operations
And of course it's tested and thoroughly type-checked and stuff
@jorenham is this prep for using optype?
Yea, pretty much.
Signed-off-by: Nathaniel Starkman <[email protected]>
Support unary minus operator. Signed-off-by: Nathaniel Starkman <[email protected]>
I've thought about this, but I'm not sure what the best approach is. I considered four approaches:
Now that I've written these down, I lean most toward option 4. As far as I'm concerned, docstrings are a "should-have", not a "must-have" (MoSCoW jargon). By postponing the docstrings, we can focus on building the actual functionality first. This feels like the most agile approach to me. Thoughts?
For magic dunder methods I agree we can start with 4. What about doing
|
I like that!
We still have the problem of E.g.

class CanArrayAdd(Protocol):
    def __add__(self, other: Self | int | float, /) -> Self: ...

which isn't compatible with
Edit: the closest I can get is

opt.CanAdd["HasArrayNamespace[NS_contra] | int | float", "Array[NS_contra]"],

Doing
I'll add them to optype then update |
Awesome, so then it'll be...

CanAddSelf[T, R=Self] = CanAdd[Self | T, Self | R]

so we can do
Something like this, @nstarman?

class CanAddSelf(Protocol[_T_contra]):
    def __add__(self, rhs: Self | _T_contra, /) -> Self: ...
Great! I guess the return type probably isn't necessary.
Yea indeed. And if anyone needs it after all, then we can always add it as optional type parameter later on.
BTW, this wouldn't work in case of boolean arrays.
Yeah. I noticed that. It's in the signature of the Array API, but without a way to detect boolean dtypes, how else do we write this statically? Also we need |
I don't think we need to do single-method Protocols now that we're using

@docstring_setter(
    __pos__ = """...""",
    ...
)
class Array(
    HasArrayNamespace[NS_co],
    opt.CanPosSelf,
    opt.CanNegSelf,
    opt.CanAddSelf[int | float],
    opt.CanIAddSelf[int | float],
    opt.CanRAddSelf[int | float],
    opt.CanSubSelf[int | float],
    opt.CanISubSelf[int | float],
    opt.CanRSubSelf[int | float],
    opt.CanMulSelf[int | float],
    opt.CanIMulSelf[int | float],
    opt.CanRMulSelf[int | float],
    opt.CanTrueDivSelf[int | float],
    opt.CanRTrueDivSelf[int | float],
    opt.CanFloorDivSelf[int | float],
    opt.CanIFloorDivSelf[int | float],
    opt.CanRFloorDivSelf[int | float],
    opt.CanModSelf[int | float],
    opt.CanIModSelf[int | float],
    opt.CanRModSelf[int | float],
    opt.CanPowSelf[int | float],
    opt.CanIPowSelf[int | float],
    opt.CanRPowSelf[int | float],
    Protocol,
):
Then that should be changed 🤷🏻♂️
I'd make it generic:

class CanAddSelf(Protocol[_T_contra]):
    def __add__(self, rhs: Self | _T_contra, /) -> Self: ...

😏
Yea I'll add But I'm thinking of leaving out the def __radd__(self, rhs: _T_contra, /) -> Self: .. because it shouldn't be needed, ...right? |
We'll still need some for the non-python dunders like |
Yes, ones that don't have a natural fit in |
We don't care about |
That's a good idea. We can define a generic Array[InputT]:

NumericArray = Array[int | float]
BoolArray = Array[bool]
Signed-off-by: Nathaniel Starkman <[email protected]>
Pushing a commit that won't work since it references non-existent |
src/array_api_typing/_array.py
@docstring_setter(
    __pos__="""Evaluates `+self_i` for each element of an array instance.
Should we push the docstrings to a JSON that gets read in? It would make this
@docstring_setter(**docstrings_json)
I opted for a toml file since it has nicely formatted multiline raw strings.
…strings from TOML file Signed-off-by: Nathaniel Starkman <[email protected]>
I just released optype 0.12.0 :)
@jorenham. It works!
op.CanAddSame[T_contra],
op.CanIAddSelf[T_contra],
op.CanRAddSelf[T_contra],
op.CanSubSame[T_contra],
This won't accept boolean numpy arrays:
>>> import numpy as np
>>> np.array(True) - np.array(False)
Traceback (most recent call last):
File "<python-input-2>", line 1, in <module>
np.array(True) - np.array(False)
~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.
op.CanPosSelf,
op.CanNegSelf,
op.CanAddSame[T_contra],
op.CanIAddSelf[T_contra],
+= also works if you just have an __add__ and no __iadd__:
>>> class Thingy:
... def __add__(self, rhs, /):
... return self if isinstance(rhs, Thingy) else NotImplemented
...
>>> a = Thingy()
>>> a + a
<__main__.Thingy object at 0x7f9896498830>
>>> a += a
>>> a
<__main__.Thingy object at 0x7f9896498830>
We already require Can{binop}Same, so can we remove CanI{binop}Self?
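The fallback can be seen more clearly when `__add__` returns a new object. In the sketch below (both classes are hypothetical), `+=` on a class with only `__add__` rebinds the name to a fresh object, while a class that defines `__iadd__` can mutate in place. Both forms are accepted by Python.

```python
class OnlyAdd:
    """Defines ``__add__`` only; ``+=`` falls back to it and rebinds the name."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, rhs, /):
        if not isinstance(rhs, OnlyAdd):
            return NotImplemented
        return OnlyAdd(a + b for a, b in zip(self.values, rhs.values))


class WithIAdd(OnlyAdd):
    """Adds ``__iadd__``; ``+=`` now mutates the existing object."""

    def __iadd__(self, rhs, /):
        if not isinstance(rhs, OnlyAdd):
            return NotImplemented
        self.values = [a + b for a, b in zip(self.values, rhs.values)]
        return self


a = OnlyAdd([1, 2])
a_before = a
a += OnlyAdd([10, 20])  # evaluated as a = a.__add__(...)

b = WithIAdd([1, 2])
b_before = b
b += WithIAdd([10, 20])  # evaluated as b = b.__iadd__(...)
```

So requiring only the `__add__`-style protocol still admits `+=` at the call site; what is lost is any guarantee that the update happens in place.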
op.CanMulSame[T_contra],
op.CanIMulSelf[T_contra],
op.CanRMulSelf[T_contra],
op.CanTruedivSame[T_contra],
CanTruedivSame requires __truediv__: (Self, Self) -> Self. In NumPy, that only holds for np.inexact dtypes (floating and complex). So this would reject integer and boolean arrays:
>>> import numpy as np
>>> np.array([1]) / np.array([1])
array([1.])
>>> np.array([True]) / np.array([True])
array([1.])
op.CanTruedivSame[T_contra],
op.CanITruedivSelf[T_contra],
op.CanRTruedivSelf[T_contra],
op.CanFloordivSame[T_contra],
This doesn't hold for boolean numpy arrays:
>>> import numpy as np
>>> np.array([True]) // np.array([True])
array([1], dtype=int8)
op.CanFloordivSame[T_contra],
op.CanIFloordivSelf[T_contra],
op.CanRFloordivSelf[T_contra],
op.CanModSame[T_contra],
mod and floordiv have identical signatures in numpy, so this won't work for boolean arrays:
>>> import numpy as np
>>> np.array([True]) % np.array([True])
array([0], dtype=int8)
op.CanModSame[T_contra],
op.CanIModSelf[T_contra],
op.CanRModSelf[T_contra],
op.CanPowSame[T_contra],
poor boolean arrays:
>>> np.array([True]) ** np.array([True])
array([1], dtype=int8)
###
# Ensure that `np.ndarray` instances are assignable to `xpt.Array`.

arr_array: xpt.Array[Any, Any] = arr
What if you set the first typar to Never? Because that way, e.g. __add__ becomes (Self, Self | Never) -> Self which reduces to (Self, Self) -> Self.
In theory it shouldn't make a difference here. But I know that pyright has a bug where it (incorrectly) reduces Self | Any to Any in certain situations. So I wouldn't be surprised if mypy would also behave incorrectly in this case.
# Ensure that `np.ndarray` instances are assignable to `xpt.Array`.

arr_array: xpt.Array[Any, Any] = arr
arr_floatarray: xpt.Array[float, Any] = arr
I'm also kinda curious if xpt.Array[float, Any] will reject boolean- and integer arrays.
arr_array: xpt.Array[Any, Any] = arr
arr_floatarray: xpt.Array[float, Any] = arr
arr_boolarray: xpt.Array[bool, Any] = arr
these should probably stay in sync with the ones in test_numpy1.pyi