Skip to content

Conversation

@zachyattack23
Copy link

Description

Adds regression test for inconsistent behavior between DataFrame.where with
inplace=True and inplace=False when using StringArray with NA-like values.

Background

  • Original bug: In pandas 2.2, inplace=True would incorrectly store
    float nan while inplace=False stored pd.NA
  • Status: Bug is fixed in current main (3.0.0.dev0) - both now correctly
    use pd.NA
  • This PR: Adds test to prevent regression

Changes

  • Added test_where_inplace_string_array_consistency() to
    pandas/tests/frame/indexing/test_where.py
  • Test verifies both inplace=True and inplace=False produce identical
    results with pd.NA for StringArray

Testing

pytest pandas/tests/frame/indexing/test_where.py::test_where_inplace_string_array_consistency -v

PASSED

Closes #46512

[x] Tests added and passed
[x] All code checks passed (will verify in CI)
[ ] Added type annotations (N/A - test only)
[ ] Added whatsnew entry (N/A - test only)
[x] Used AI following AGENTS.md

- Add test for GH#46512
- Bug was fixed - inplace and non-inplace now behave consistently
- Test ensures StringArray with NA-like values works correctly
- Both inplace=True and inplace=False now return pd.NA consistently

Closes pandas-dev#46512
Comment on lines +1079 to +1080
# GH#46512 - inplace and non-inplace should have consistent behavior
# for StringArray with NA-like values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove all comments (and below) except GH#46512

Comment on lines +1090 to +1092
# Both should produce pd.NA, not float nan
assert isinstance(result["A"]._values[1], type(pd.NA))
assert isinstance(df_inplace["A"]._values[1], type(pd.NA))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Both should produce pd.NA, not float nan
assert isinstance(result["A"]._values[1], type(pd.NA))
assert isinstance(df_inplace["A"]._values[1], type(pd.NA))

This isn't needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Inconsistency in DataFrame.where between inplace and not inplace with na like value for StringArray

2 participants