Skip to content

Conversation

@paleolimbot
Copy link
Member

Which issue does this PR close?

Follow on PR for #17986 , cleaning a few verbose helpers for comparing equality and printing. This is done by introducing a non-owning SerializedTypeView (name up for debate) that handles comparison and display of extension types.

Rationale for this change

The two helpers designed to centralize the printing and comparison of DataType + Metadata were verbose and hard to understand.

What changes are included in this PR?

  • Added SerializedTypeView (== &DataType + Option(extension_name) + Option(extension_metadata)
  • Implemented Display and PartialEq
  • Removed previous helpers and used the new class for display and comparison

Are these changes tested?

I'll complete the test coverage if this seems like a reasonable approach.

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions common Related to common crate labels Oct 24, 2025
Comment on lines -78 to -86
pub fn check_metadata_with_storage_equal(
actual: (
&DataType,
Option<&std::collections::HashMap<String, String>>,
),
expected: (
&DataType,
Option<&std::collections::HashMap<String, String>>,
),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I'm trying to remove

Comment on lines -124 to -127
pub fn format_type_and_metadata(
data_type: &DataType,
metadata: Option<&std::collections::HashMap<String, String>>,
) -> String {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this

false => write!(f, "{v:?} {:?}", metadata.as_ref().unwrap()),
true => write!(f, "{v:?}"),
}
write!(f, "{}", ScalarAndMetadata::new(v.clone(), metadata.clone()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cloning the scalar here just to render it isn't ideal. I played with just making this Expr::Literal(ScalarAndMetadata) but that touches a lot.

Comment on lines +235 to +241
// TODO: Something about string equality or expression equality is causing these cases
// to fail
// TestCase {
// desc: r#"min(c2) --> "min(c2)" -- (column *named* "min(t.c2)"!)"#,
// input: sort(min(col("c2"))),
// expected: sort(col("min(t.c2)")),
// },
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this...something about equality or the string representation.

Comment on lines +1526 to +1531
let actual = SerializedTypeView::from(field);
let expected = SerializedTypeView::from(prev);
if actual != expected {
return plan_err!(
"Conflicting types for parameter '{id}': expected {expected} but got {actual}"
);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper in action

Comment on lines -798 to +797
Prepare: "my_plan" [Utf8, FixedSizeBinary(16)<{"ARROW:extension:name": "arrow.uuid"}>]
Prepare: "my_plan" [Utf8, arrow.uuid<FixedSizeBinary(16)>]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also significantly improves the plan representation of an extension type (although this is only used in the prepare output)

Comment on lines +112 to +117
#[derive(Debug, Clone, Copy, Eq)]
pub struct SerializedTypeView<'a, 'b, 'c> {
arrow_type: &'a DataType,
extension_name: Option<&'b str>,
extension_metadata: Option<&'c str>,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lifetimes are kind of annoying but we mix and match metadata in a lot of ways and this manages to not copy anything when we're peeking at the type.

Comment on lines +168 to +173
pub fn extension_metadata(&self) -> Option<&str> {
if let Some(metadata) = self.extension_metadata {
if !metadata.is_empty() {
return Some(metadata);
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This normalizes comparing two extension types where the metadata is missing and/or empty. Arrow C++ always exports "empty" instead of "absent" (and used to crash on the "absent" case). The byte-for-byte comparison of the metadata is probably too strict for any actual parameterized extension types but is safer in lieu of pluggable type comparison.

Comment on lines +203 to +221
impl Display for SerializedTypeView<'_, '_, '_> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match (self.extension_name(), self.extension_metadata()) {
(Some(name), None) => write!(f, "{}<{}>", name, self.arrow_type()),
(Some(name), Some(metadata)) => {
write!(f, "{}({})<{}>", name, metadata, self.arrow_type())
}
_ => write!(f, "{}", self.arrow_type()),
}
}
}

impl PartialEq<SerializedTypeView<'_, '_, '_>> for SerializedTypeView<'_, '_, '_> {
fn eq(&self, other: &SerializedTypeView) -> bool {
self.arrow_type() == other.arrow_type()
&& self.extension_name() == other.extension_name()
&& self.extension_metadata() == other.extension_metadata()
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replacement formatter and comparison.

This does have a subtle difference with the previous version, which is that non extension metadata is not considered with type equality. I think this is a good thing...extension types are formal (we error on mismatch), metadata is informal (we accumulate on mismatch, or at least existing code does).

Comment on lines +67 to +75
impl Display for ScalarAndMetadata {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let storage_type = self.value.data_type();
let serialized_type = SerializedTypeView::from((&storage_type, self.metadata()));

let metadata_without_extension_info = self
.metadata
.as_ref()
.map(|metadata| {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure all of this is a good idea but I was trying to improve the extension scalar outuput without dropping the display of extra metadata.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to common crate logical-expr Logical plan and expressions sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant