Conversation

Contributor

@etseidl etseidl commented Sep 12, 2025

Which issue does this PR close?

Note: this targets a feature branch, not main

Rationale for this change

As I started on decoding Thrift page headers, I found that the approach I had been taking was no longer going to work. This PR begins abstracting the Thrift reader to allow for other implementations.

What changes are included in this PR?

In addition to reworking the reader itself, this PR moves away from the previous TryFrom approach and instead adds a ReadThrift trait.
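
To illustrate the difference, here is a minimal sketch of what a dedicated read trait over an abstract protocol can look like. Every name in it (`InputProtocol`, `SliceProtocol`, `ThriftError`, and this particular shape of `ReadThrift`) is a simplified assumption for illustration, not the parquet crate's actual definitions.

```rust
// Hypothetical, simplified stand-ins; not the real parquet crate API.

#[derive(Debug, PartialEq)]
pub struct ThriftError(pub String);

pub type Result<T> = std::result::Result<T, ThriftError>;

/// Abstracts the byte source so several reader implementations
/// can share the same decoding logic.
pub trait InputProtocol {
    /// Read a single byte, or fail at end of input.
    fn read_byte(&mut self) -> Result<u8>;

    /// Decode an unsigned LEB128 varint (7 payload bits per byte).
    fn read_vlq(&mut self) -> Result<u64> {
        let mut value = 0u64;
        let mut shift = 0u32;
        loop {
            if shift > 63 {
                return Err(ThriftError("varint too long".to_string()));
            }
            let byte = self.read_byte()?;
            value |= u64::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                return Ok(value);
            }
            shift += 7;
        }
    }

    /// Decode a zigzag-encoded i32, as used by the compact protocol.
    fn read_i32(&mut self) -> Result<i32> {
        let v = self.read_vlq()? as u32;
        Ok(((v >> 1) as i32) ^ -((v & 1) as i32))
    }
}

/// One possible implementation: a reader over a borrowed byte slice.
pub struct SliceProtocol<'a> {
    buf: &'a [u8],
}

impl<'a> SliceProtocol<'a> {
    pub fn new(buf: &'a [u8]) -> Self {
        Self { buf }
    }
}

impl<'a> InputProtocol for SliceProtocol<'a> {
    fn read_byte(&mut self) -> Result<u8> {
        let buf = self.buf;
        let (&first, rest) = buf
            .split_first()
            .ok_or_else(|| ThriftError("unexpected end of input".to_string()))?;
        self.buf = rest;
        Ok(first)
    }
}

/// The trait-based alternative to `TryFrom`: the protocol type is a
/// generic parameter, so a type is written once and works with any reader.
pub trait ReadThrift<R: InputProtocol>: Sized {
    fn read_thrift(prot: &mut R) -> Result<Self>;
}

impl<R: InputProtocol> ReadThrift<R> for i32 {
    fn read_thrift(prot: &mut R) -> Result<Self> {
        prot.read_i32()
    }
}
```

With this shape, a caller writes `i32::read_thrift(&mut prot)` regardless of which `InputProtocol` implementation `prot` is, whereas a `TryFrom` impl would have to name a concrete reader type (and its lifetime) as the conversion source.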

Are these changes tested?

Should be covered by existing tests

Are there any user-facing changes?

Yes

@github-actions github-actions bot added the parquet Changes to the parquet crate label Sep 12, 2025
/// Read a Thrift encoded [list] from the input protocol object.
///
/// [list]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#list-and-set
pub(crate) fn read_thrift_vec<'a, T, R>(prot: &mut R) -> Result<Vec<T>>
Contributor Author

This is a big reason why this change was needed: getting the reading of vectors of serializable objects to work with the changes to ThriftCompactInputProtocol was just too hard. Adding this function a) worked and b) made implementing the thrift macros easier.
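
As a rough illustration of what a helper like this has to do, the sketch below decodes a compact-protocol list header (element count in the high nibble, element type in the low nibble, with a count of 15 meaning a varint length follows) and then reads each element through a trait. The `InputProtocol` and `ReadThrift` definitions here are simplified assumptions, and the body of `read_thrift_vec` is a guess at the general technique rather than the code merged in this PR.

```rust
// Hypothetical, simplified stand-ins; not the real parquet crate API.
pub type Result<T> = std::result::Result<T, String>;

pub trait InputProtocol {
    fn read_byte(&mut self) -> Result<u8>;

    /// Decode an unsigned LEB128 varint.
    fn read_vlq(&mut self) -> Result<u64> {
        let mut value = 0u64;
        let mut shift = 0u32;
        loop {
            if shift > 63 {
                return Err("varint too long".to_string());
            }
            let byte = self.read_byte()?;
            value |= u64::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                return Ok(value);
            }
            shift += 7;
        }
    }
}

/// A byte slice acts as a reader that advances itself as it is consumed.
impl InputProtocol for &[u8] {
    fn read_byte(&mut self) -> Result<u8> {
        let (&first, rest) = self.split_first().ok_or("unexpected end of input")?;
        *self = rest;
        Ok(first)
    }
}

pub trait ReadThrift<R: InputProtocol>: Sized {
    fn read_thrift(prot: &mut R) -> Result<Self>;
}

impl<R: InputProtocol> ReadThrift<R> for i32 {
    fn read_thrift(prot: &mut R) -> Result<Self> {
        // i32 values are zigzag-encoded varints in the compact protocol.
        let v = prot.read_vlq()? as u32;
        Ok(((v >> 1) as i32) ^ -((v & 1) as i32))
    }
}

/// Read a Thrift list of `T` by decoding the header, then each element.
pub fn read_thrift_vec<T, R>(prot: &mut R) -> Result<Vec<T>>
where
    R: InputProtocol,
    T: ReadThrift<R>,
{
    let header = prot.read_byte()?;
    // Low nibble is the element type; a fuller implementation might validate it.
    let _elem_type = header & 0x0f;
    let short_len = (header >> 4) as usize;
    let len = if short_len == 15 {
        prot.read_vlq()? as usize
    } else {
        short_len
    };
    // Cap the preallocation so a corrupt length cannot force a huge allocation.
    let mut out = Vec::with_capacity(len.min(1024));
    for _ in 0..len {
        out.push(T::read_thrift(prot)?);
    }
    Ok(out)
}
```

Because the element type only needs `T: ReadThrift<R>`, no lifetime has to be threaded through the `Vec` itself in this simplified version; the borrowed data lives entirely inside the reader `R`.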

@mbrobbel mbrobbel modified the milestones: 56.2.0, 57.0.0 Sep 16, 2025
@mbrobbel mbrobbel added api-change Changes to the arrow API next-major-release the PR has API changes and it waiting on the next major version labels Sep 16, 2025
Contributor

@alamb alamb left a comment

Looks good to me -- I don't fully appreciate why using a special trait rather than TryFrom makes things easier, but on the other hand this code is simpler and more explicit so I give it a major 👍

}
}

/// A high performance Thrift reader that reads from a slice of bytes.
Contributor

🚀

Contributor Author

etseidl commented Sep 17, 2025

Thanks @alamb!

> I don't fully appreciate why using a special trait rather than TryFrom makes things easier

TBH neither do I 🤣. It was mostly down to the Vec handling requiring so much arcane lifetime foo. As I started pushing on that a little too hard, it all broke down. I probably could have just abandoned TryFrom for vectors, as I ultimately did for the new trait, but I do like the symmetry of ReadThrift and WriteThrift.

Anyway, onward!

@etseidl etseidl merged commit c327d7f into apache:gh5854_thrift_remodel Sep 17, 2025
16 checks passed
3 participants