@crepererum crepererum commented Jul 28, 2025

This is #69 but with apache#13511 reverted.

Tracking issue: https://github.com/influxdata/influxdb_iox/issues/14762

Patches

Patches map to commits 1:1 (i.e. every patch is exactly 1 commit) and are ordered for easier correlation of the description and the respective commits. They are also grouped in 3 stages.

A: Dummy

No actual patches, can be dropped at any point:

  1. a dummy patch just to get "a diff" to the base branch

B: CI Fixes

Need to get CI up and running before picking any actual patches:

  1. Fix intermittent SQL logic test failure in limit.slt by adding ORDER BY clause
    That's apache/datafusion#16257. Can be dropped with DF 49.
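Why adding `ORDER BY` removes the flakiness can be shown with a small standalone sketch. It does not use DataFusion: the helper names and the two row orders are hypothetical and only stand in for the same rows arriving from partitions in different orders.

```rust
/// Emulates `SELECT v FROM t LIMIT n`: keeps the first `n` rows in arrival order.
/// (Hypothetical helper, not DataFusion code.)
fn limit(rows: &[i64], n: usize) -> Vec<i64> {
    rows.iter().copied().take(n).collect()
}

/// Emulates `SELECT v FROM t ORDER BY v LIMIT n`: sorts first, then keeps `n` rows.
fn order_by_then_limit(rows: &[i64], n: usize) -> Vec<i64> {
    let mut sorted = rows.to_vec();
    sorted.sort_unstable();
    sorted.truncate(n);
    sorted
}

/// The same five rows in two assumed arrival orders, e.g. from two test runs.
fn arrival_orders() -> (Vec<i64>, Vec<i64>) {
    (vec![3, 1, 2, 5, 4], vec![5, 4, 3, 1, 2])
}
```

With these inputs, `limit` returns different rows for the two orders, while `order_by_then_limit` returns the same rows for both. A plain `LIMIT` in a test expectation depends on arrival order in the same way.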

All commits afterwards should build cleanly!

C: Patches

These are the actual relevant patches:

  1. chore: default=true for skip_physical_aggregate_schema_check, and add warn logging:
    until we chase down all warnings in our iox logs (see https://github.com/influxdata/influxdb_iox/issues/12404 )
  2. (New) Test + workaround for SanityCheck plan:
    according to this slack thread, we can drop this with DataFusion version 49.
  3. chore: skip order calculation / exponential planning:
    workaround for apache/datafusion#13748 ("Exponential planning time (100s of seconds) with UNION and ORDER BY queries"), which should be fixed in DataFusion version 49
  4. fix: temporary fix to handle incorrect coalesce (inserted during EnforceDistribution) which later causes an error during EnforceSort (without our patch). The next DataFusion version 46 upgrade does the proper fix, which is to not insert the coalesce in the first place.:
    There is EAR-5822 (also see https://github.com/influxdata/influxdb_iox/issues/13310 ). Despite what the notes in #54 ("Patched DataFusion version 45.0.0") and in apache/datafusion#14691 (comment) ("ParallelizeSorts, a subrule of EnforceSorting optimizer, should not remove necessary coalesce.") say, this is still required for DF version 46. Otherwise the regression test fails. Also see this slack thread.
  5. fix: reserved keywords in qualified column names:
    That's apache/datafusion#16584. Can be dropped with DF 49.
  6. feat: add SchemaProvider::table_type(table_name: &str)
    That's apache/datafusion#16401. Can be dropped with DF 49.
  7. fix: support nullable columns in pre-sorted data sources
    That's apache/datafusion#16783. Can be dropped with DF 49.
  8. revert: Support WITHIN GROUP syntax to standardize certain existing aggregate functions
    That reverts apache/datafusion#13511 because it is a breaking change without a smooth transition. Can be dropped once we have apache/datafusion#16955 ("Support old syntax for approx_percentile_cont and approx_percentile_cont_with_weight").
  9. revert: Fix ClickBench extended queries after update to APPROX_PERCENTILE_CONT
    That reverts apache/datafusion#15929 because it is the test adjustment for apache/datafusion#13511 (see previous point).
  10. feat: support distinct for window
    That's apache/datafusion#16925, picked because a customer wants it (see https://github.com/influxdata/EAR/issues/6252 ). Can be dropped with DF 50.
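The motivation for patch 6 (`SchemaProvider::table_type`) is that listing `information_schema.tables` needs only each table's type, not a fully constructed provider. The following is a minimal sketch of that idea, assuming a simplified, synchronous stand-in trait. `Schema`, `Table`, `CatalogBacked`, `list_table_types` and the `expensive_loads` counter are hypothetical names for illustration and are not the DataFusion API.

```rust
use std::cell::Cell;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TableType {
    Base,
    View,
}

/// Stand-in for a fully constructed table provider (assumed expensive to build).
struct Table {
    table_type: TableType,
}

/// Simplified stand-in for a schema provider trait (hypothetical, synchronous).
trait Schema {
    fn table_names(&self) -> Vec<String>;

    /// Builds the full table; may hit the catalog.
    fn table(&self, name: &str) -> Option<Table>;

    /// Default: derive the type from the full table, so existing
    /// implementations keep working without changes.
    fn table_type(&self, name: &str) -> Option<TableType> {
        self.table(name).map(|t| t.table_type)
    }
}

/// A schema whose catalog already knows table types cheaply.
struct CatalogBacked {
    entries: Vec<(String, TableType)>,
    /// Counts how often the expensive `table` path was taken.
    expensive_loads: Cell<usize>,
}

impl CatalogBacked {
    fn lookup(&self, name: &str) -> Option<TableType> {
        self.entries
            .iter()
            .find(|(n, _)| n.as_str() == name)
            .map(|(_, t)| *t)
    }
}

impl Schema for CatalogBacked {
    fn table_names(&self) -> Vec<String> {
        self.entries.iter().map(|(n, _)| n.clone()).collect()
    }

    fn table(&self, name: &str) -> Option<Table> {
        self.expensive_loads.set(self.expensive_loads.get() + 1);
        self.lookup(name).map(|table_type| Table { table_type })
    }

    /// Override: answer from metadata without constructing the table.
    fn table_type(&self, name: &str) -> Option<TableType> {
        self.lookup(name)
    }
}

/// What a listing of table types needs: names and types only.
fn list_table_types(schema: &dyn Schema) -> Vec<(String, TableType)> {
    schema
        .table_names()
        .into_iter()
        .filter_map(|name| {
            let ty = schema.table_type(&name)?;
            Some((name, ty))
        })
        .collect()
}

/// Example data with made-up table names.
fn example_schema() -> CatalogBacked {
    CatalogBacked {
        entries: vec![
            ("cpu".to_string(), TableType::Base),
            ("cpu_daily".to_string(), TableType::View),
        ],
        expensive_loads: Cell::new(0),
    }
}
```

Calling `list_table_types(&example_schema())` leaves `expensive_loads` at zero because the override never goes through `table`. An implementation that does not override `table_type` falls back to the default and pays one `table` call per listed table.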

crepererum and others added 10 commits July 14, 2025 13:25
…BY clause (apache#16257)

* Add order by clause to limit query for consistent results

* test: update explain plan
…rceDistribution) which later causes an error during EnforceSort (without our patch). The next DataFusion version 46 upgrade does the proper fix, which is to not insert the coalesce in the first place.

test: recreating the iox plan:
* demonstrate the insertion of coalesce after the use of column estimates, and the removal of the test scenario's forcing of rr repartitioning

test: reproducer of SanityCheck failure after EnforceSorting removes the coalesce added in the EnforceDistribution

fix: special case to not remove the needed coalesce
* feat:  add SchemaProvider::table_type(table_name: &str)

InformationSchemaConfig::make_tables only needs the TableType, not the
whole TableProvider. The latter may require an expensive catalog
operation to construct, while the former may not.

This allows avoiding `SELECT * FROM information_schema.tables` having to
make 1 of those potentially expensive operations per table.

* test:  new InformationSchemaConfig::make_tables behavior

* Move tests to same file to fix CI

---------

Co-authored-by: Andrew Lamb <[email protected]>
crepererum and others added 2 commits July 28, 2025 17:26
* feat: support distinct for window

* fix

* fix

* fisx

* fix unparse

* fix test

* fix test

* easy way

* add test

* add comments