Skip to content

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Aug 6, 2025

Which issue does this PR close?

We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.

Rationale for this change

As part of the review in #8021, @scovich and I were discussing how VariantArray::value should behave in the presence of nulls: #8021 (comment)

Suggest to make this return Option so callers don't have to check for null themselves.

I realized it might not be 100% clear that the existing convention in this crate was that value() methods did not check for nulls / return Option. I think we should document it better

What changes are included in this PR?

Explicitly document that value methods do not check for nulls and explain what happens when they are used on null values

Are these changes tested?

Yes, by CI

Are there any user-facing changes?

Additional documentation. No behavior changes

@alamb alamb added the documentation Improvements or additions to documentation label Aug 6, 2025
@github-actions github-actions bot added the arrow Changes to the arrow crate label Aug 6, 2025
@scovich
Copy link
Contributor

scovich commented Aug 6, 2025

As part of the review in #8021, @scovich and I were discussing how VariantArray::value should behave in the presence of nulls:

I wasn't suggesting to change the signature of that public function... just of the internal typed_value_to_variant helper?

@alamb
Copy link
Contributor Author

alamb commented Aug 6, 2025

I wasn't suggesting to change the signature of that public function... just of the internal typed_value_to_variant helper?

Right, sorry -- the caller of typed_value_to_variant (value()) doesn't return an option, so I figured I would at least explain why that was not returning Option

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Aug 18, 2025
/// Returns the boolean value at index `i`.
///
/// Note: This method does not check for nulls and the value is arbitrary
/// if [`is_null`](Self::is_null) returns true for the index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this calls value_unchecked actually, do we also want to add the note to value_unchecked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an excellent call. I have done so


/// Returns the boolean value at index `i`.
///
/// Note: This method does not check for nulls and the value is arbitrary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And actually I think all value methods on all arrays behaves like this.

Copy link
Contributor Author

@alamb alamb Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree, that is why i was trying to add the documentation.

Copy link
Contributor Author

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @viirya

/// Returns the boolean value at index `i`.
///
/// Note: This method does not check for nulls and the value is arbitrary
/// if [`is_null`](Self::is_null) returns true for the index.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an excellent call. I have done so

@alamb alamb merged commit 78212bf into apache:main Aug 18, 2025
29 checks passed
@alamb
Copy link
Contributor Author

alamb commented Aug 18, 2025

Thanks again @viirya @scovich and @kylebarron

I merged this PR in as I am trying to get stuff merged before the next release. I am happy to make additional improvements in follow on PRs. Just let me know

@alamb alamb deleted the alamb/doc_value branch August 18, 2025 19:55
Comment on lines +181 to +182
/// Note: This method does not check for nulls and the value is arbitrary
/// if [`is_null`](Self::is_null) returns true for the index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other places, is it guaranteed not to panic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is guaranteed not to panic.

/// Returns the element at index `i`
///
/// Note: This method does not check for nulls and the value is arbitrary
/// if [`is_null`](Self::is_null) returns true for the index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other places, is it guaranteed not to panic and return a valid T::Native value corresponding to type T (in (theoretical?) case T::Native could represent some values not valid for T)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other places, is it guaranteed not to panic and return a valid T::Native value corresponding to type T

Yes, to the best of my knowledge

(in (theoretical?) case T::Native could represent some values not valid for T)

Yes I think this is possible -- the only one I can think of now is potentially date/time types where some valid integer is not a valid date/time/timestamp instance.

/// Returns ith value of this list array.
///
/// Note: This method does not check for nulls and the value is arbitrary
/// (but still well-defined) if [`is_null`](Self::is_null) returns true for the index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but still well-defined

should this be construed as meaning the returned value is of size L, if self is a FixedSizeList(L)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate documentation Improvements or additions to documentation parquet-variant parquet-variant* crates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants