Skip to content

keep() and discard() err on NA ... perhaps unintentionally? #1156

@mmuurr

Description

@mmuurr

I haven't checked previous {purrr} versions to see if this is new(ish) behavior, but for some reason this surprised me:

keep(c(TRUE, FALSE, NA), identity)
# Error in `keep()`:
# ℹ In index: 3.
# Caused by error:

(Same for discard().)

When inspecting keep(), however, we see that:

keep
# function(.x, .p, ...) {
#   where <- where_if(.x, .p, ...)
#   .x[!is.na(where) & where]
# }

... suggesting that it's intended to handle NA values (same for discard()), but it's purrr:::where_if() that's actually throwing the error.

Is this intended?

While I tend to try to opt for explicitness as much as possible, I think the NA-handling as coded in keep() and discard() make sense and align with dplyr::filter()'s logic. I also think that enabling NA-handling as coded again would be backwards compatible, since any existing code must be coercing (or otherwise explicitly-handling) NAs anyhow.

FWIW, the documentation doesn't explicitly mention NA-handling, but states:

  • for keep: "all elements where .p evaluates to TRUE", which would be valid as coded
  • for discard: "all elements where .p evaluates to FALSE", which is not exactly valid as coded, as discard()'s definition also retains NAs (if where_if() didn't throw).

Alternatively, adding something like .na = c(TRUE, FALSE) parameter (with match.arg) would be a clear & compact way to make such choices explicit, while removing common coercion cruft in .p.

Cheers!

{purrr} v1.0.2

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions