Merged
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -615,6 +615,7 @@ Groupby/resample/rolling
- Bug in :meth:`DataFrameGroupBy.cumsum` where it did not return the correct dtype when the label contained ``None``. (:issue:`58811`)
- Bug in :meth:`DataFrameGroupby.transform` and :meth:`SeriesGroupby.transform` with a reducer and ``observed=False`` that coerces dtype to float when there are unobserved categories. (:issue:`55326`)
- Bug in :meth:`Rolling.apply` where the applied function could be called on fewer than ``min_period`` periods if ``method="table"``. (:issue:`58868`)
- Bug in :meth:`Series.resample` could raise when the date range ended shortly before DST. (:issue:`58380`)
Member

Not just before DST, but more generally before a non-existent time, right?


Reshaping
^^^^^^^^^
2 changes: 1 addition & 1 deletion pandas/core/resample.py
@@ -2466,7 +2466,7 @@ def _get_timestamp_range_edges(
         )
         if isinstance(freq, Day):
             first = first.tz_localize(index_tz)
-            last = last.tz_localize(index_tz)
+            last = last.tz_localize(index_tz, nonexistent="shift_forward")
Member

@mroeschke mroeschke Aug 12, 2024

Should nonexistent be dependent on closed?

Contributor Author

I don't think last here is used as the last element in the series. Instead, last needs to be strictly after all the data points:

pandas/pandas/_libs/lib.pyx

Lines 893 to 894 in 0fadaa9

if values[lenidx - 1] > binner[lenbin - 1]:
    raise ValueError("Values falls after last bin")
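
To make the quoted condition concrete, here is a minimal pure-Python sketch. `check_last_bin` is a hypothetical helper written for this note, not a pandas function; it only mirrors the comparison above, with plain integers standing in for timestamps.

```python
def check_last_bin(values, binner):
    """Hypothetical stand-in for the quoted check; not part of pandas."""
    # values and binner are assumed to be sorted in ascending order.
    if values[-1] > binner[-1]:
        raise ValueError("Values falls after last bin")


values = [0, 5, 9]

# The last edge (10) lies after the last value (9), so the check passes.
check_last_bin(values, [0, 10])

# If the last edge is moved earlier than the last value, the check fails.
try:
    check_last_bin(values, [0, 8])
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

The point is that any adjustment that moves the final edge earlier than the last data point trips this check.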

Shifting backward would raise near a nonexistent hour in the rare case where the index contains the nanosecond just before it:

import pandas as pd

almost_a_day = pd.Timedelta(days=1) - pd.Timedelta(nanoseconds=1)
timestamp = pd.to_datetime("2024-04-25").tz_localize("Africa/Cairo")
ts = pd.Series(
    1,
    [timestamp, timestamp + almost_a_day],
)
ts.resample("1D", closed="right").sum()

However, shifting forward runs with the following output:

2024-04-24 00:00:00+02:00    1
2024-04-25 00:00:00+02:00    1
Freq: D, dtype: int64

(I'm not sure why it gives a bin on 2024-04-24.)
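
A possible explanation for the extra 2024-04-24 bin, sketched with tz-naive data to keep DST out of the picture: with closed="right" each daily bin is the interval (start, end], while the label stays on the left edge by default for daily frequencies. A point sitting exactly on midnight therefore lands in the bin that starts the previous day. The dates below are made up for illustration.

```python
import pandas as pd

# Two points: one exactly at midnight, one later the same day.
index = pd.to_datetime(["2024-01-02 00:00", "2024-01-02 12:00"])
s = pd.Series(1, index=index)

# Bins are (2024-01-01, 2024-01-02] and (2024-01-02, 2024-01-03];
# each is labelled by its left edge (the default label for "D").
out = s.resample("1D", closed="right").sum()
print(out)

# Labelling by the right edge instead moves both labels one day later.
out_right = s.resample("1D", closed="right", label="right").sum()
print(out_right)
```

With the default left labels, the midnight point is reported under 2024-01-01, which matches the pattern in the output above.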

     else:
         first = first.normalize()
         last = last.normalize()
14 changes: 14 additions & 0 deletions pandas/tests/resample/test_datetime_index.py
@@ -958,6 +958,20 @@ def _create_series(values, timestamps, freq="D"):
    tm.assert_series_equal(result, expected)


def test_resample_dst_midnight_last_nonexistent():
    # GH 58380
    ts = Series(
        1,
        date_range("2024-04-19", "2024-04-20", tz="Africa/Cairo", freq="15min"),
    )

    expected = Series([len(ts)], index=DatetimeIndex([ts.index[0]], freq="7D"))

    result = ts.resample("7D").sum()
    print(f"{result=}")
Member

Can you remove the print?

Contributor Author

Oops, sorry about that!

    tm.assert_series_equal(result, expected)


def test_resample_daily_anchored(unit):
    rng = date_range("1/1/2000 0:00:00", periods=10000, freq="min").as_unit(unit)
    ts = Series(np.random.default_rng(2).standard_normal(len(rng)), index=rng)