[Variant] Support bulk-appends in cast_to_variant #8323

@scovich

Re:

> My biggest comment / suggestion is to consider making the API vectorized (convert the entire Arrow Array) but I think we can do that as a follow on PR

And #8299 (comment), which notes that run-end encoding could be handled more easily in a vectorized API.

And #8299 (comment), which suggests an append_all_rows() method.

And #8299 (comment), which also wonders about vectorization.

I'll try to give one response that covers them all:

I think it's reasonable to consider adding a bulk-append API, but we have to be cognizant of the limitations and challenges it will face:

  • We will need a new trait that knows how to create (and finish!) variant builder instances
  • Variant building is inherently row-based, so any builder that ultimately needs to produce a variant array or variant object as its output will have a trivial append_all_rows that just calls append_row in a loop (like today), in order to recursively build up the fields/elements of the variant it creates.
  • The API would be very nice for converting primitive arrays to variant, because they don't need to recurse on anything. It would also be nice because we could potentially define a specialized impl just for VariantArrayBuilder, so we don't have to deal with that new variant builder create+finish trait.
  • Casting a list of primitive values is an interesting intermediate case, where one should be able to append all the elements of a given list in one shot. But that might require the new create+finish trait? Or maybe it just needs a second specialization for ListBuilder?
  • Maybe instead of a no-arg append_all_rows(), we should consider a ranged append_many_rows(start..end)? One could always pass .. to request encoding of all rows.
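
To make the trade-offs above concrete, here is a rough sketch of what a ranged bulk append could look like. Everything in it is hypothetical: `RowAppender`, `resolve_rows`, `Int64Appender` and `ObjectAppender` are illustration-only names, not existing arrow-rs types, and plain `Vec`s stand in for the real variant builders. It only shows the shape of the idea: a default `append_many_rows` that loops over `append_row`, plus a primitive-only override that appends the whole range in one shot.

```rust
use std::ops::{Bound, Range, RangeBounds};

/// Hypothetical helper: turn any range expression (`a..b`, `..`, `..=n`)
/// into a concrete `start..end`, given the number of input rows.
fn resolve_rows(rows: impl RangeBounds<usize>, num_rows: usize) -> Range<usize> {
    let start = match rows.start_bound() {
        Bound::Included(&s) => s,
        Bound::Excluded(&s) => s + 1,
        Bound::Unbounded => 0,
    };
    let end = match rows.end_bound() {
        Bound::Included(&e) => e + 1,
        Bound::Excluded(&e) => e,
        Bound::Unbounded => num_rows,
    };
    start..end
}

/// Hypothetical row-oriented conversion interface (not an arrow-rs trait).
trait RowAppender {
    /// Number of rows in the input being converted.
    fn num_rows(&self) -> usize;

    /// Convert and append the single input row at `index`.
    fn append_row(&mut self, index: usize);

    /// Ranged bulk append. The default is the "trivial" version:
    /// it just calls `append_row` in a loop.
    fn append_many_rows(&mut self, rows: impl RangeBounds<usize>) {
        for index in resolve_rows(rows, self.num_rows()) {
            self.append_row(index);
        }
    }
}

/// Primitive case: nothing to recurse into, so the range can be copied
/// in one call. `output` is a stand-in for encoded variant values.
struct Int64Appender<'a> {
    input: &'a [i64],
    output: Vec<i64>,
}

impl RowAppender for Int64Appender<'_> {
    fn num_rows(&self) -> usize {
        self.input.len()
    }

    fn append_row(&mut self, index: usize) {
        self.output.push(self.input[index]);
    }

    /// Specialized override: one bulk copy instead of a per-row loop.
    fn append_many_rows(&mut self, rows: impl RangeBounds<usize>) {
        let range = resolve_rows(rows, self.input.len());
        self.output.extend_from_slice(&self.input[range]);
    }
}

/// Nested case: each output row is assembled field by field, so this
/// type keeps the default loop. Strings stand in for variant objects.
struct ObjectAppender<'a> {
    ids: &'a [i64],
    names: &'a [&'a str],
    output: Vec<String>,
}

impl RowAppender for ObjectAppender<'_> {
    fn num_rows(&self) -> usize {
        self.ids.len()
    }

    fn append_row(&mut self, index: usize) {
        let row = format!("{{id: {}, name: {}}}", self.ids[index], self.names[index]);
        self.output.push(row);
    }
}
```

With this shape, `appender.append_many_rows(1..3)` converts rows 1 and 2, and `appender.append_many_rows(..)` converts every row, so a separate no-arg `append_all_rows()` would not be needed. The sketch deliberately leaves out the create+finish trait question, since how builders get created and finished is exactly the open design point.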

Originally posted by @scovich in #8299 (comment)
