Relax constraint that file sort order must only reference individual columns #17419

pepijnve · 2025-09-04T16:16:29Z

Which issue does this PR close?

Closes External table with complex order expression not allowed #17411

Rationale for this change

The documentation states that WITH ORDER clauses may use non-trivial expressions. It even has an example showing the usage of this feature. In practice this does not work and the implementation is limited to simple column references.

What changes are included in this PR?

Add a new physical_expr::create_lex_ordering function that provides a more flexible version of physical_expr::create_ordering. create_ordering, with its single-column constraint, has been retained for backwards compatibility but should perhaps be deprecated. It does not seem possible to reimplement it in terms of create_lex_ordering, since an ExecutionProps instance is required.
Add a new physical_expr::equivalence::project_orderings convenience function that uses the existing sort order projection logic.
Adjust the various users of sort order projection to make use of the new implementation.
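
To illustrate why a sort order may need more than plain column references, here is a minimal, self-contained sketch. It does not use DataFusion's API: `SortKey`, `eval` and `is_sorted_asc` are hypothetical names invented for this example. It shows a set of rows that is ordered by `a + b` (as in the `a@0 + b@1 ASC NULLS LAST` ordering from the new test) while being ordered by neither `a` nor `b` alone.

```rust
/// Hypothetical, simplified stand-in for a physical sort expression.
/// This is NOT DataFusion's `PhysicalExpr`; it only models two cases.
#[derive(Debug, Clone, PartialEq)]
pub enum SortKey {
    /// Plain column reference by index, e.g. `a@0`.
    Column(usize),
    /// Sum of two columns, e.g. `a@0 + b@1`.
    Add(usize, usize),
}

impl SortKey {
    /// Evaluate the sort key against a single row of integer values.
    pub fn eval(&self, row: &[i64]) -> i64 {
        match self {
            SortKey::Column(i) => row[*i],
            SortKey::Add(l, r) => row[*l] + row[*r],
        }
    }
}

/// Returns true if `rows` is in ascending order with respect to `key`.
pub fn is_sorted_asc(rows: &[Vec<i64>], key: &SortKey) -> bool {
    rows.windows(2)
        .all(|pair| key.eval(&pair[0]) <= key.eval(&pair[1]))
}
```

With rows `(3, 0)`, `(0, 4)`, `(4, 1)`, the check succeeds for `SortKey::Add(0, 1)` and fails for either single column, so a column-only ordering could not describe this data.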

Are these changes tested?

Changed logic is covered by existing tests.
Added SQL logic tests that verify a non-trivial WITH ORDER case.

Are there any user-facing changes?

The example in the documentation will now actually work.
Consumers of sort expressions may now have to deal with arbitrary PhysicalExpr instances rather than only Column.

alamb

Thanks @pepijnve -- this looks good to me. I'll kick off some planning benchmarks just to make sure this doesn't affect them, but I don't expect to see any slowdown

alamb · 2025-09-10T19:50:26Z

datafusion/catalog/src/stream.rs

-            }
-            None => create_ordering(self.0.source.schema(), &self.0.order)?,
+        let schema = self.0.source.schema();
+        let df_schema = DFSchema::try_from(Arc::clone(schema))?;


I wonder if (re)creating this DFSchema is necessary -- it feels like at this point we know the schema information

However, I also see we need to have a DFSchema to correctly create arbitrary PhysicalExprs, so this is probably fine.

I was a bit concerned about the waste here as well, but I couldn't figure out a simple way to avoid this.

alamb · 2025-09-10T19:52:54Z

datafusion/sqllogictest/test_files/order.slt

+----
+physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/data/composite_order.csv]]}, projection=[a, b], output_ordering=[a@0 + b@1 ASC NULLS LAST], file_type=csv, has_header=true
+
+# Query ordered by the declared order should be just a table scan


alamb · 2025-09-10T19:54:00Z

datafusion/sqllogictest/test_files/window.slt

@@ -3532,7 +3532,7 @@ physical_plan
 01)BoundedWindowAggExec: wdw=[sum(multiple_ordered_table.a) ORDER BY [multiple_ordered_table.b ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Field { name: "sum(multiple_ordered_table.a) ORDER BY [multiple_ordered_table.b ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW], mode=[Sorted]
 02)--CoalesceBatchesExec: target_batch_size=4096
 03)----FilterExec: b@2 = 0
-04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[a0, a, b, c, d], output_orderings=[[a@1 ASC NULLS LAST, b@2 ASC NULLS LAST], [c@3 ASC NULLS LAST]], file_type=csv, has_header=true
+04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[a0, a, b, c, d], output_orderings=[[c@3 ASC NULLS LAST], [a@1 ASC NULLS LAST, b@2 ASC NULLS LAST]], file_type=csv, has_header=true


Do you know why the output orderings now come out in a different (reverse) order?

No, I didn't take the time to try to understand why. That's how they're being emitted by the EquivalenceClass code. I had assumed the order was not important, but if it is I can take a closer look.

I don't think it is important.

alamb · 2025-09-10T20:13:10Z

🤖 ./gh_compare_branch_bench.sh Benchmark Script Running
Linux aal-dev 6.14.0-1014-gcp #15~24.04.1-Ubuntu SMP Fri Jul 25 23:26:08 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing issue_17411 (124953d) to 241b669 diff
BENCH_NAME=sql_planner
BENCH_COMMAND=cargo bench --bench sql_planner
BENCH_FILTER=
BENCH_BRANCH_NAME=issue_17411
Results will be posted here when complete

alamb · 2025-09-11T10:27:22Z

🤖: Benchmark completed

Details

group                                                issue_17411                            main
-----                                                -----------                            ----
logical_aggregate_with_join                          1.00    624.8±4.06µs        ? ?/sec    1.01    631.5±3.42µs        ? ?/sec
logical_plan_optimize                                1.00     179.0±5.63s        ? ?/sec    1.02     182.6±8.59s        ? ?/sec
logical_select_all_from_1000                         1.00     11.0±0.07ms        ? ?/sec    1.04     11.4±0.08ms        ? ?/sec
logical_select_one_from_700                          1.00    412.9±5.21µs        ? ?/sec    1.01    417.5±3.09µs        ? ?/sec
logical_trivial_join_high_numbered_columns           1.00    370.1±3.09µs        ? ?/sec    1.01    372.9±2.20µs        ? ?/sec
logical_trivial_join_low_numbered_columns            1.00    356.0±1.57µs        ? ?/sec    1.01    358.9±2.83µs        ? ?/sec
physical_intersection                                1.00    830.7±4.30µs        ? ?/sec    1.01    842.1±9.45µs        ? ?/sec
physical_join_consider_sort                          1.00   1382.2±8.17µs        ? ?/sec    1.01   1400.8±8.36µs        ? ?/sec
physical_join_distinct                               1.00    347.8±1.66µs        ? ?/sec    1.02    353.7±2.71µs        ? ?/sec
physical_many_self_joins                             1.00     10.2±0.06ms        ? ?/sec    1.02     10.4±0.07ms        ? ?/sec
physical_plan_clickbench_all                         1.06    217.7±7.87ms        ? ?/sec    1.00    205.0±5.02ms        ? ?/sec
physical_plan_clickbench_q1                          1.01      2.7±0.06ms        ? ?/sec    1.00      2.6±0.07ms        ? ?/sec
physical_plan_clickbench_q10                         1.03      3.7±0.12ms        ? ?/sec    1.00      3.6±0.13ms        ? ?/sec
physical_plan_clickbench_q11                         1.03      4.0±0.10ms        ? ?/sec    1.00      3.9±0.15ms        ? ?/sec
physical_plan_clickbench_q12                         1.03      4.2±0.16ms        ? ?/sec    1.00      4.0±0.16ms        ? ?/sec
physical_plan_clickbench_q13                         1.03      3.7±0.11ms        ? ?/sec    1.00      3.6±0.13ms        ? ?/sec
physical_plan_clickbench_q14                         1.03      4.0±0.11ms        ? ?/sec    1.00      3.9±0.12ms        ? ?/sec
physical_plan_clickbench_q15                         1.04      3.8±0.10ms        ? ?/sec    1.00      3.7±0.14ms        ? ?/sec
physical_plan_clickbench_q16                         1.00      3.6±0.07ms        ? ?/sec    1.00      3.6±0.11ms        ? ?/sec
physical_plan_clickbench_q17                         1.00      3.7±0.09ms        ? ?/sec    1.00      3.7±0.11ms        ? ?/sec
physical_plan_clickbench_q18                         1.05      3.2±0.08ms        ? ?/sec    1.00      3.1±0.10ms        ? ?/sec
physical_plan_clickbench_q19                         1.01      4.1±0.11ms        ? ?/sec    1.00      4.1±0.11ms        ? ?/sec
physical_plan_clickbench_q2                          1.05      3.3±0.13ms        ? ?/sec    1.00      3.2±0.12ms        ? ?/sec
physical_plan_clickbench_q20                         1.06      2.9±0.08ms        ? ?/sec    1.00      2.8±0.07ms        ? ?/sec
physical_plan_clickbench_q21                         1.06      3.3±0.08ms        ? ?/sec    1.00      3.1±0.08ms        ? ?/sec
physical_plan_clickbench_q22                         1.03      3.9±0.12ms        ? ?/sec    1.00      3.8±0.09ms        ? ?/sec
physical_plan_clickbench_q23                         1.01      4.2±0.12ms        ? ?/sec    1.00      4.2±0.15ms        ? ?/sec
physical_plan_clickbench_q24                         1.17      5.5±0.21ms        ? ?/sec    1.00      4.7±0.14ms        ? ?/sec
physical_plan_clickbench_q25                         1.07      3.6±0.12ms        ? ?/sec    1.00      3.3±0.11ms        ? ?/sec
physical_plan_clickbench_q26                         1.07      3.3±0.10ms        ? ?/sec    1.00      3.1±0.09ms        ? ?/sec
physical_plan_clickbench_q27                         1.12      3.7±0.24ms        ? ?/sec    1.00      3.3±0.09ms        ? ?/sec
physical_plan_clickbench_q28                         1.03      4.3±0.12ms        ? ?/sec    1.00      4.1±0.15ms        ? ?/sec
physical_plan_clickbench_q29                         1.02      5.0±0.16ms        ? ?/sec    1.00      4.9±0.12ms        ? ?/sec
physical_plan_clickbench_q3                          1.04      3.2±0.10ms        ? ?/sec    1.00      3.0±0.12ms        ? ?/sec
physical_plan_clickbench_q30                         1.03     14.6±0.60ms        ? ?/sec    1.00     14.1±0.48ms        ? ?/sec
physical_plan_clickbench_q31                         1.09      4.5±0.17ms        ? ?/sec    1.00      4.1±0.12ms        ? ?/sec
physical_plan_clickbench_q32                         1.06      4.4±0.16ms        ? ?/sec    1.00      4.1±0.11ms        ? ?/sec
physical_plan_clickbench_q33                         1.02      3.7±0.12ms        ? ?/sec    1.00      3.7±0.29ms        ? ?/sec
physical_plan_clickbench_q34                         1.05      3.4±0.14ms        ? ?/sec    1.00      3.2±0.08ms        ? ?/sec
physical_plan_clickbench_q35                         1.03      3.4±0.08ms        ? ?/sec    1.00      3.3±0.09ms        ? ?/sec
physical_plan_clickbench_q36                         1.07      4.4±0.17ms        ? ?/sec    1.00      4.1±0.10ms        ? ?/sec
physical_plan_clickbench_q37                         1.10      4.6±0.17ms        ? ?/sec    1.00      4.2±0.18ms        ? ?/sec
physical_plan_clickbench_q38                         1.10      4.6±0.15ms        ? ?/sec    1.00      4.2±0.13ms        ? ?/sec
physical_plan_clickbench_q39                         1.10      4.4±0.16ms        ? ?/sec    1.00      4.0±0.13ms        ? ?/sec
physical_plan_clickbench_q4                          1.04      2.8±0.06ms        ? ?/sec    1.00      2.7±0.10ms        ? ?/sec
physical_plan_clickbench_q40                         1.13      5.3±0.18ms        ? ?/sec    1.00      4.7±0.15ms        ? ?/sec
physical_plan_clickbench_q41                         1.13      4.8±0.18ms        ? ?/sec    1.00      4.2±0.13ms        ? ?/sec
physical_plan_clickbench_q42                         1.14      4.7±0.21ms        ? ?/sec    1.00      4.2±0.13ms        ? ?/sec
physical_plan_clickbench_q43                         1.20      5.4±0.26ms        ? ?/sec    1.00      4.5±0.13ms        ? ?/sec
physical_plan_clickbench_q44                         1.05      3.1±0.11ms        ? ?/sec    1.00      2.9±0.09ms        ? ?/sec
physical_plan_clickbench_q45                         1.04      3.0±0.09ms        ? ?/sec    1.00      2.9±0.09ms        ? ?/sec
physical_plan_clickbench_q46                         1.03      3.5±0.11ms        ? ?/sec    1.00      3.4±0.11ms        ? ?/sec
physical_plan_clickbench_q47                         1.02      4.1±0.16ms        ? ?/sec    1.00      4.1±0.12ms        ? ?/sec
physical_plan_clickbench_q48                         1.08      5.3±0.21ms        ? ?/sec    1.00      4.9±0.20ms        ? ?/sec
physical_plan_clickbench_q49                         1.09      5.6±0.25ms        ? ?/sec    1.00      5.1±0.15ms        ? ?/sec
physical_plan_clickbench_q5                          1.02      3.1±0.08ms        ? ?/sec    1.00      3.0±0.08ms        ? ?/sec
physical_plan_clickbench_q50                         1.04      4.8±0.22ms        ? ?/sec    1.00      4.7±0.20ms        ? ?/sec
physical_plan_clickbench_q51                         1.03      3.6±0.14ms        ? ?/sec    1.00      3.5±0.11ms        ? ?/sec
physical_plan_clickbench_q6                          1.04      3.1±0.08ms        ? ?/sec    1.00      3.0±0.09ms        ? ?/sec
physical_plan_clickbench_q7                          1.02      2.7±0.06ms        ? ?/sec    1.00      2.6±0.08ms        ? ?/sec
physical_plan_clickbench_q8                          1.03      3.8±0.09ms        ? ?/sec    1.00      3.7±0.11ms        ? ?/sec
physical_plan_clickbench_q9                          1.01      3.5±0.08ms        ? ?/sec    1.00      3.5±0.09ms        ? ?/sec
physical_plan_tpcds_all                              1.02   1074.3±3.41ms        ? ?/sec    1.00   1053.3±7.82ms        ? ?/sec
physical_plan_tpch_all                               1.03     65.8±0.43ms        ? ?/sec    1.00     63.9±0.61ms        ? ?/sec
physical_plan_tpch_q1                                1.00      2.1±0.01ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
physical_plan_tpch_q10                               1.03      4.0±0.02ms        ? ?/sec    1.00      3.9±0.03ms        ? ?/sec
physical_plan_tpch_q11                               1.05      3.5±0.02ms        ? ?/sec    1.00      3.4±0.03ms        ? ?/sec
physical_plan_tpch_q12                               1.00  1849.7±10.34µs        ? ?/sec    1.00  1844.4±15.32µs        ? ?/sec
physical_plan_tpch_q13                               1.00  1485.6±10.56µs        ? ?/sec    1.00  1486.7±11.55µs        ? ?/sec
physical_plan_tpch_q14                               1.00  1990.1±16.15µs        ? ?/sec    1.00  1991.3±12.82µs        ? ?/sec
physical_plan_tpch_q16                               1.00      2.5±0.02ms        ? ?/sec    1.00      2.5±0.02ms        ? ?/sec
physical_plan_tpch_q17                               1.06      2.6±0.02ms        ? ?/sec    1.00      2.5±0.03ms        ? ?/sec
physical_plan_tpch_q18                               1.00      2.7±0.01ms        ? ?/sec    1.01      2.7±0.02ms        ? ?/sec
physical_plan_tpch_q19                               1.01      3.3±0.02ms        ? ?/sec    1.00      3.3±0.03ms        ? ?/sec
physical_plan_tpch_q2                                1.07      6.0±0.11ms        ? ?/sec    1.00      5.6±0.05ms        ? ?/sec
physical_plan_tpch_q20                               1.02      3.2±0.03ms        ? ?/sec    1.00      3.2±0.07ms        ? ?/sec
physical_plan_tpch_q21                               1.06      4.4±0.03ms        ? ?/sec    1.00      4.1±0.04ms        ? ?/sec
physical_plan_tpch_q22                               1.00      2.7±0.01ms        ? ?/sec    1.00      2.7±0.02ms        ? ?/sec
physical_plan_tpch_q3                                1.04      2.7±0.01ms        ? ?/sec    1.00      2.6±0.02ms        ? ?/sec
physical_plan_tpch_q4                                1.00   1541.2±6.67µs        ? ?/sec    1.01  1551.8±11.19µs        ? ?/sec
physical_plan_tpch_q5                                1.03      3.3±0.02ms        ? ?/sec    1.00      3.2±0.02ms        ? ?/sec
physical_plan_tpch_q6                                1.00    876.1±7.08µs        ? ?/sec    1.01    881.2±8.98µs        ? ?/sec
physical_plan_tpch_q7                                1.00      4.3±0.04ms        ? ?/sec    1.01      4.3±0.06ms        ? ?/sec
physical_plan_tpch_q8                                1.07      5.6±0.16ms        ? ?/sec    1.00      5.2±0.05ms        ? ?/sec
physical_plan_tpch_q9                                1.00      4.1±0.02ms        ? ?/sec    1.00      4.1±0.03ms        ? ?/sec
physical_select_aggregates_from_200                  1.00     16.9±0.08ms        ? ?/sec    1.01     17.0±0.10ms        ? ?/sec
physical_select_all_from_1000                        1.00     24.4±0.17ms        ? ?/sec    1.02     25.0±0.12ms        ? ?/sec
physical_select_one_from_700                         1.00  1057.1±10.09µs        ? ?/sec    1.02   1081.8±8.25µs        ? ?/sec
physical_sorted_union_order_by_10                    1.00     13.2±0.20ms        ? ?/sec    1.00     13.2±0.19ms        ? ?/sec
physical_sorted_union_order_by_100                   1.02       2.1±0.03s        ? ?/sec    1.00       2.0±0.01s        ? ?/sec
physical_sorted_union_order_by_200                   1.01      13.0±0.18s        ? ?/sec    1.00      12.9±0.14s        ? ?/sec
physical_sorted_union_order_by_300                   1.02      39.5±0.29s        ? ?/sec    1.00      38.9±0.54s        ? ?/sec
physical_sorted_union_order_by_50                    1.00    390.4±3.72ms        ? ?/sec    1.00    390.5±4.02ms        ? ?/sec
physical_theta_join_consider_sort                    1.00   1745.6±8.52µs        ? ?/sec    1.01  1762.5±10.74µs        ? ?/sec
physical_unnest_to_join                              1.00   1307.8±9.30µs        ? ?/sec    1.00   1312.9±5.21µs        ? ?/sec
physical_window_function_partition_by_4_on_values    1.00   1302.2±5.14µs        ? ?/sec    1.00   1298.8±5.89µs        ? ?/sec
physical_window_function_partition_by_7_on_values    1.00     34.8±0.15ms        ? ?/sec    1.02     35.4±0.10ms        ? ?/sec
physical_window_function_partition_by_8_on_values    1.00    136.2±0.66ms        ? ?/sec    1.01    138.2±0.29ms        ? ?/sec
with_param_values_many_columns                       1.00    148.7±4.52µs        ? ?/sec    1.01    150.3±5.18µs        ? ?/sec

alamb · 2025-09-11T10:27:26Z

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.14.0-1014-gcp #15~24.04.1-Ubuntu SMP Fri Jul 25 23:26:08 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing issue_17411 (124953d) to 241b669 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb · 2025-09-11T11:24:24Z

🤖: Benchmark completed

Details

Comparing HEAD and issue_17411
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ issue_17411 ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │  2724.96 ms │  2883.22 ms │ 1.06x slower │
│ QQuery 1     │  1368.60 ms │  1466.75 ms │ 1.07x slower │
│ QQuery 2     │  2447.12 ms │  2678.82 ms │ 1.09x slower │
│ QQuery 3     │  1205.52 ms │  1172.61 ms │    no change │
│ QQuery 4     │  2291.41 ms │  2376.86 ms │    no change │
│ QQuery 5     │ 27385.01 ms │ 27708.13 ms │    no change │
│ QQuery 6     │  4140.13 ms │  4248.71 ms │    no change │
│ QQuery 7     │  3741.37 ms │  4334.76 ms │ 1.16x slower │
└──────────────┴─────────────┴─────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)          │ 45304.13ms │
│ Total Time (issue_17411)   │ 46869.87ms │
│ Average Time (HEAD)        │  5663.02ms │
│ Average Time (issue_17411) │  5858.73ms │
│ Queries Faster             │          0 │
│ Queries Slower             │          4 │
│ Queries with No Change     │          4 │
│ Queries with Failure       │          0 │
└────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ issue_17411 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.20 ms │     2.37 ms │  1.08x slower │
│ QQuery 1     │    50.00 ms │    50.48 ms │     no change │
│ QQuery 2     │   139.23 ms │   133.63 ms │     no change │
│ QQuery 3     │   165.19 ms │   164.52 ms │     no change │
│ QQuery 4     │  1071.85 ms │  1060.62 ms │     no change │
│ QQuery 5     │  1475.94 ms │  1544.15 ms │     no change │
│ QQuery 6     │     2.17 ms │     2.30 ms │  1.06x slower │
│ QQuery 7     │    54.95 ms │    54.03 ms │     no change │
│ QQuery 8     │  1438.20 ms │  1497.52 ms │     no change │
│ QQuery 9     │  1838.06 ms │  1861.33 ms │     no change │
│ QQuery 10    │   379.72 ms │   386.61 ms │     no change │
│ QQuery 11    │   426.25 ms │   443.43 ms │     no change │
│ QQuery 12    │  1351.47 ms │  1369.58 ms │     no change │
│ QQuery 13    │  2125.51 ms │  2191.41 ms │     no change │
│ QQuery 14    │  1264.83 ms │  1306.22 ms │     no change │
│ QQuery 15    │  1227.45 ms │  1217.76 ms │     no change │
│ QQuery 16    │  2760.44 ms │  2715.61 ms │     no change │
│ QQuery 17    │  2644.63 ms │  2686.47 ms │     no change │
│ QQuery 18    │  5498.43 ms │  5035.57 ms │ +1.09x faster │
│ QQuery 19    │   127.09 ms │   125.21 ms │     no change │
│ QQuery 20    │  2046.49 ms │  2047.36 ms │     no change │
│ QQuery 21    │  2368.18 ms │  2349.67 ms │     no change │
│ QQuery 22    │  4483.89 ms │  4037.62 ms │ +1.11x faster │
│ QQuery 23    │ 12849.04 ms │ 12919.60 ms │     no change │
│ QQuery 24    │   215.42 ms │   234.47 ms │  1.09x slower │
│ QQuery 25    │   495.37 ms │   522.15 ms │  1.05x slower │
│ QQuery 26    │   220.22 ms │   218.60 ms │     no change │
│ QQuery 27    │  2947.09 ms │  2952.02 ms │     no change │
│ QQuery 28    │ 23157.97 ms │ 23142.16 ms │     no change │
│ QQuery 29    │   991.05 ms │   985.86 ms │     no change │
│ QQuery 30    │  1348.32 ms │  1319.36 ms │     no change │
│ QQuery 31    │  1348.57 ms │  1333.53 ms │     no change │
│ QQuery 32    │  4515.69 ms │  4490.59 ms │     no change │
│ QQuery 33    │  5826.81 ms │  5722.14 ms │     no change │
│ QQuery 34    │  6030.99 ms │  6093.99 ms │     no change │
│ QQuery 35    │  2083.58 ms │  2077.14 ms │     no change │
│ QQuery 36    │   120.66 ms │   125.11 ms │     no change │
│ QQuery 37    │    54.35 ms │    57.26 ms │  1.05x slower │
│ QQuery 38    │   122.58 ms │   122.68 ms │     no change │
│ QQuery 39    │   203.48 ms │   203.30 ms │     no change │
│ QQuery 40    │    45.17 ms │    46.75 ms │     no change │
│ QQuery 41    │    39.61 ms │    40.37 ms │     no change │
│ QQuery 42    │    35.30 ms │    36.12 ms │     no change │
└──────────────┴─────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)          │ 95593.44ms │
│ Total Time (issue_17411)   │ 94926.68ms │
│ Average Time (HEAD)        │  2223.10ms │
│ Average Time (issue_17411) │  2207.60ms │
│ Queries Faster             │          2 │
│ Queries Slower             │          5 │
│ Queries with No Change     │         36 │
│ Queries with Failure       │          0 │
└────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ issue_17411 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 174.48 ms │   175.18 ms │     no change │
│ QQuery 2     │  25.04 ms │    25.70 ms │     no change │
│ QQuery 3     │  44.59 ms │    44.60 ms │     no change │
│ QQuery 4     │  27.25 ms │    26.34 ms │     no change │
│ QQuery 5     │  74.23 ms │    73.67 ms │     no change │
│ QQuery 6     │  19.61 ms │    19.13 ms │     no change │
│ QQuery 7     │ 145.96 ms │   140.82 ms │     no change │
│ QQuery 8     │  32.90 ms │    31.99 ms │     no change │
│ QQuery 9     │  83.36 ms │    85.80 ms │     no change │
│ QQuery 10    │  58.51 ms │    57.31 ms │     no change │
│ QQuery 11    │  42.27 ms │    41.12 ms │     no change │
│ QQuery 12    │  51.50 ms │    50.90 ms │     no change │
│ QQuery 13    │  48.05 ms │    45.82 ms │     no change │
│ QQuery 14    │  14.40 ms │    13.28 ms │ +1.08x faster │
│ QQuery 15    │  24.72 ms │    24.35 ms │     no change │
│ QQuery 16    │  24.44 ms │    23.66 ms │     no change │
│ QQuery 17    │ 147.72 ms │   144.33 ms │     no change │
│ QQuery 18    │ 331.26 ms │   321.67 ms │     no change │
│ QQuery 19    │  48.83 ms │    36.64 ms │ +1.33x faster │
│ QQuery 20    │  60.25 ms │    50.18 ms │ +1.20x faster │
│ QQuery 21    │ 222.16 ms │   224.59 ms │     no change │
│ QQuery 22    │  20.51 ms │    20.09 ms │     no change │
└──────────────┴───────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)          │ 1722.02ms │
│ Total Time (issue_17411)   │ 1677.16ms │
│ Average Time (HEAD)        │   78.27ms │
│ Average Time (issue_17411) │   76.23ms │
│ Queries Faster             │         3 │
│ Queries Slower             │         0 │
│ Queries with No Change     │        19 │
│ Queries with Failure       │         0 │
└────────────────────────────┴───────────┘

alamb · 2025-09-14T11:06:05Z

I am a little worried about the reported slowdowns in the sql planning benchmarks. I'll try and reproduce them locally

pepijnve · 2025-09-14T17:23:08Z

Perhaps a two step approach would be better then where we try the “column only” version first and only use the more complex code path as fallback.
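
A rough sketch of what such a two-step approach could look like, under stated assumptions: `Expr`, `PlannedOrdering`, `try_columns_only` and `plan_ordering` are hypothetical names for illustration and do not correspond to DataFusion's actual types or functions. The fast path resolves plain column names to indices; anything else falls through to the general path.

```rust
/// Hypothetical, simplified sort expression (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical result of planning a declared sort order.
#[derive(Debug, PartialEq)]
pub enum PlannedOrdering {
    /// Fast path: every sort expression was a plain column in the schema.
    ColumnsOnly(Vec<usize>),
    /// Fallback: at least one expression needs the general code path.
    General(Vec<Expr>),
}

/// Step 1: succeed only if all expressions are columns found in `schema`.
fn try_columns_only(exprs: &[Expr], schema: &[&str]) -> Option<Vec<usize>> {
    exprs
        .iter()
        .map(|expr| match expr {
            Expr::Column(name) => schema.iter().position(|f| *f == name.as_str()),
            _ => None,
        })
        .collect()
}

/// Step 2: use the cheap result if available, otherwise fall back.
/// (Unknown column names also fall back here; a real planner would
/// report an error for them on the general path.)
pub fn plan_ordering(exprs: &[Expr], schema: &[&str]) -> PlannedOrdering {
    match try_columns_only(exprs, schema) {
        Some(indices) => PlannedOrdering::ColumnsOnly(indices),
        None => PlannedOrdering::General(exprs.to_vec()),
    }
}
```

The intent is that the common case (plain columns) avoids whatever extra work the general path requires, so only tables declared with expression-based orderings pay for it. Whether this would actually recover the planning time reported above is untested here.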

github-actions bot added physical-expr Changes to the physical-expr crates core Core DataFusion crate catalog Related to the catalog crate datasource Changes to the datasource crate labels Sep 4, 2025

pepijnve force-pushed the issue_17411 branch 7 times, most recently from 4e04412 to 4bbc81a Compare September 5, 2025 15:06

github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Sep 5, 2025

pepijnve force-pushed the issue_17411 branch from 49a6c8a to 5861425 Compare September 8, 2025 07:51

pepijnve marked this pull request as ready for review September 8, 2025 07:59

apache#17411 Relax constraint that file sort order must only reference individual columns

9453640

pepijnve force-pushed the issue_17411 branch from 5861425 to 9453640 Compare September 8, 2025 08:11

apache#17411 Add additional SQL logic tests

124953d

alamb approved these changes Sep 10, 2025
