-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
BUG: IntervalIndex.unique() only contains the first interval if all interval borders are negative #61920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hello @khemkaran10, I have tried your fix and I have the following issue. When i run with this branch this code: Gould you try to run the code again and assure, that idx_neg.unique() really returns "IntervalIndex([(-4, -3], (-3, -2], (-2, -1]], dtype='interval[int64, right]')"? In my tests the return is "IntervalIndex([([-4607182418800017408, -4609434218613702656], (-4609434218613702656, -4611686018427387904], (-4611686018427387904, -4616189618054758400]], dtype='interval[int64, right]')" which triggers the error: "E ValueError: left side of interval must be <= right side" |
@il1sf4 Thanks for pointing out. I have update the PR. |
pandas/core/arrays/interval.py
Outdated
@@ -1985,6 +1985,9 @@ def _from_combined(self, combined: np.ndarray) -> IntervalArray: | |||
)._from_sequence(nc[:, 1], dtype=dtype) | |||
else: | |||
assert isinstance(dtype, np.dtype) | |||
nc = np.hstack( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the hstack really necessary here since the next 2 lines is just splitting them back up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @jbrockmendel , It's not required. I'll change this.
pandas/core/arrays/interval.py
Outdated
nc = unique( | ||
# Using .view("complex128") with negatives causes issues. | ||
# GH#61917 | ||
(np.array(self._combined[:, 0], dtype=complex)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is More Correct, should we just patch inside _combined directly? The only other place it is used is isin
; will there be an analaguos but there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, agree we can add this logic directly in _combined.
Fixes: #61917
Before FIx ❌:
After Fix ✅:
Fixes incorrect deduplication of negative-valued intervals when using .unique().
Previously used .view("complex128"), which failed for negative floats/ints.
closes BUG:
IntervalIndex.unique()
only contains the first interval if all interval borders are negative #61917Tests added and passed if fixing a bug or adding a new feature
All code checks passed.