Fix failures for EXPLAIN and EXPLAIN ANALYZE on Iceberg OPTIMIZE queries #26670
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
@cla-bot check
The cla-bot has been summoned, and re-checked this pull request!
// Certain table handle attributes are not applicable to select queries (which need stats).
// If this changes, the caching logic here may need to be revised.
checkArgument(!originalHandle.isRecordScannedFiles(), "Unexpected scanned files recording set");
checkArgument(originalHandle.getMaxScannedFileSize().isEmpty(), "Unexpected max scanned file size set");
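To illustrate why these preconditions matter for this PR, here is a minimal sketch of how such a guard behaves when statistics are requested for a handle that has scanned-file recording enabled, which the discussion indicates is the case for OPTIMIZE. The `Handle` record and `getTableStatistics` method are hypothetical stand-ins, not Trino classes; the local `checkArgument` only mirrors the contract of Guava's `Preconditions.checkArgument(boolean, Object)`.

```java
// Simplified stand-in for illustration; the types here are hypothetical, not Trino classes.
final class PreconditionSketch {
    // Hypothetical, reduced table handle carrying only the flag under discussion.
    record Handle(boolean recordScannedFiles) {}

    // Same contract as Guava's Preconditions.checkArgument(boolean, Object):
    // throws IllegalArgumentException when the expression is false.
    static void checkArgument(boolean expression, String message) {
        if (!expression) {
            throw new IllegalArgumentException(message);
        }
    }

    // Hypothetical statistics lookup guarded the same way as the snippet above.
    static String getTableStatistics(Handle handle) {
        checkArgument(!handle.recordScannedFiles(), "Unexpected scanned files recording set");
        return "statistics";
    }
}
```

Under these assumptions, a handle with the flag unset gets statistics, while a handle with the flag set makes the statistics call throw instead of returning.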
I think the fix can be to either return empty statistics here instead of the checkArgument,
if (originalHandle.isRecordScannedFiles()) {
    // Skip
    return TableStatistics.empty();
}
or just remove the checks altogether and include originalHandle.isRecordScannedFiles() in the cacheKey below.
The test for the fix would be to successfully run EXPLAIN and EXPLAIN ANALYZE on an OPTIMIZE query with the ignore_stats_calculator_failures session property set to false.
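The second option, making the flag part of the cache key, can be sketched roughly as follows. This is a simplified model under stated assumptions, not the connector's code: `Handle`, `CacheKey`, the string-valued statistics, and the `computations` counter are all hypothetical and exist only to show that handles differing in `recordScannedFiles` no longer share a cache entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of a per-query statistics cache; all names are hypothetical.
final class StatsCacheSketch {
    // Reduced table handle: table name plus the flag discussed in the review.
    record Handle(String table, boolean recordScannedFiles) {}

    // The flag is part of the key, so no precondition on it is needed.
    record CacheKey(String table, boolean recordScannedFiles) {}

    private final Map<CacheKey, String> cache = new ConcurrentHashMap<>();

    // Counts how often statistics were actually computed (for demonstration only).
    final AtomicInteger computations = new AtomicInteger();

    String getTableStatistics(Handle handle) {
        CacheKey key = new CacheKey(handle.table(), handle.recordScannedFiles());
        return cache.computeIfAbsent(key, k -> {
            computations.incrementAndGet();
            return "statistics for " + k.table();
        });
    }
}
```

In this model, repeated lookups with an identical handle reuse the cached entry, while a handle for the same table with the flag set is computed and cached separately.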
Thanks for the suggestion! I agree this is the simplest fix — I actually considered that approach earlier as well.
However, I decided to introduce the disableTableStatisticsCache option because it provides a more explicit and controllable behavior. It allows users (and tests) to completely turn off caching when diagnosing issues related to stale or inconsistent statistics.
This approach keeps the cache key logic clean and avoids coupling it with isRecordScannedFiles(), while also giving us better flexibility to handle similar scenarios in the future — especially if we later support outputting statistics for operations like OPTIMIZE or other similar behaviors in Iceberg.
That said, I’m open to adjusting the naming or scope of the option if you think that would make the intent clearer.
We don't want to introduce the amount of complexity added here for solving this case.
Iceberg doesn't have a problem of stale statistics because we don't cache statistics across queries, and the metadata files cached for planning are immutable, so they can never provide stale information.
If we want to output statistics for OPTIMIZE, then my second suggestion of "remove the checks altogether and include originalHandle.isRecordScannedFiles() in the cacheKey below" is the way to do it.
Hi @raunaqmorarka, if we are just trying to solve the current issue, I agree with your suggestion. The changes have already been pushed.
f5a56ac to 29c5d35
1. Fixed server-side failure in EXPLAIN; when IGNORE_STATS_CALCULATOR_FAILURES is false, the query failed. 2. Fixed failure in EXPLAIN ANALYZE OPTIMIZE.
29c5d35 to ab31deb
Description
This PR fixes failures when running EXPLAIN and EXPLAIN ANALYZE on Iceberg OPTIMIZE statements:
1. Fixed server-side failure in EXPLAIN; EXPLAIN prints an error on the server side, which is only ignored because of the IGNORE_STATS_CALCULATOR_FAILURES session property. When this property is set to false, it can still cause the query to fail.
2. Fixed failure in EXPLAIN ANALYZE OPTIMIZE.
This change ensures both EXPLAIN and EXPLAIN ANALYZE OPTIMIZE queries complete without errors.
Additional context and related issues
This would also close #26598
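For context, the way a failure can stay hidden behind an "ignore failures" setting can be modeled as below. This is only a sketch of the behavior described above, not the engine's implementation: `calculateStats`, the `ignoreFailures` parameter, and the `"unknown"` fallback value are hypothetical names chosen for illustration.

```java
import java.util.function.Supplier;

// Hypothetical model of a stats calculation wrapped by an "ignore failures" switch.
final class IgnoreFailuresSketch {
    static String calculateStats(Supplier<String> calculator, boolean ignoreFailures) {
        try {
            return calculator.get();
        } catch (RuntimeException e) {
            if (ignoreFailures) {
                // The failure is swallowed and replaced by an "unknown" estimate.
                return "unknown";
            }
            // With the switch off, the same failure surfaces and fails the caller.
            throw e;
        }
    }
}
```

In this model, the same underlying exception is invisible to the caller with the switch on and fatal with it off, which is why the review asks for the test to run with the property set to false.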
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: