Open
Labels
Categorical (Categorical Data Type), Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate), PDEP missing values (Issues that would be addressed by the Ice Cream Agreement from the Aug 2023 sprint)
Description
Now that Categorical depends on ExtensionArray, it makes more sense to return and output pd.NA as a missing value instead of np.nan.
I propose that we announce in the 2.0 release that this will change in a future release. It is not clear whether, or how, we can create a deprecation message here.
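To make the deprecation question concrete, here is a minimal sketch of one possible shape for such a message, using only the standard library. The helper name `get_missing_scalar` and the warning text are hypothetical and not part of pandas; this only illustrates the general `FutureWarning` pattern, not how pandas would actually implement it.

```python
import warnings


def get_missing_scalar():
    """Hypothetical stand-in for the scalar returned for a missing Categorical entry."""
    # Assumption: the warning is raised at the point where the missing
    # scalar is handed back to the caller.
    warnings.warn(
        "Missing values in Categorical are currently returned as np.nan; "
        "a future release may return pd.NA instead.",
        FutureWarning,
        stacklevel=2,
    )
    return float("nan")  # mirrors the current np.nan result


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = get_missing_scalar()

print(caught[0].category.__name__, "-", caught[0].message)
```

One drawback of this shape is that the warning would fire on every lookup of a missing value, which could be noisy in practice.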
Current behavior:
>>> import pandas as pd
>>> c = pd.Categorical(["a", "a", "b", "c", "c"], categories=["a", "b", "c"])
>>> c
['a', 'a', 'b', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> s = pd.Series(c)
>>> s
0    a
1    a
2    b
3    c
4    c
dtype: category
Categories (3, object): ['a', 'b', 'c']
>>> s.iloc[2] = pd.NA
>>> s.iloc[2]
nan
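The same check can be written as a small script that does not depend on which sentinel a given pandas version returns. This is a sketch, assuming only public pandas APIs (`pd.isna`, `Series.cat.codes`); the variable names are illustrative. It relies on the fact that a Categorical stores a missing entry as code -1, so the sentinel seen on scalar access is a presentation choice rather than stored data.

```python
import pandas as pd

# Rebuild the Series from the example above.
s = pd.Series(pd.Categorical(["a", "a", "b", "c", "c"], categories=["a", "b", "c"]))
s.iloc[2] = pd.NA

value = s.iloc[2]

# pd.isna treats both np.nan and pd.NA as missing, so this check holds
# regardless of which sentinel is returned.
print("is missing:", pd.isna(value))

# Identity check that reveals which sentinel this pandas version returns.
print("is pd.NA:", value is pd.NA)

# Missing entries are stored internally as code -1.
print("code:", s.cat.codes.iloc[2])
```

Code that tests for missing values with `pd.isna` rather than comparing against `np.nan` directly should keep working if the returned sentinel changes.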