Description
Describe the bug
Currently the output ordering -- which is defined per partition -- is a vector of LexOrdering:
datafusion/datafusion/datasource/src/file_scan_config.rs
Lines 183 to 184 in d19bf52
/// All equivalent lexicographical orderings that describe the schema.
pub output_ordering: Vec<LexOrdering>,
This was kinda OK before #16217 because LexOrdering was allowed to be empty -- which was basically a sentinel value for "no ordering". Now, however, the code can no longer specify that a partition isn't ordered. This actually leads to some funky bugs, like in this code here:
datafusion/datafusion/datasource/src/file_scan_config.rs
Lines 1382 to 1384 in d19bf52
let Some(new_ordering) = LexOrdering::new(new_ordering) else {
    continue;
};
So if the projection for a single partition leads to "unordered", then get_projected_output_ordering(input).len() < input.len(), i.e. we lose partitions. This clearly cannot be right.
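To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the DataFusion code: the LexOrdering below is a simplified stand-in (a non-empty list of column names), and project_columns and project_lossy are hypothetical helpers that only mimic the let-else/continue pattern quoted above. Each entry is treated as one partition's ordering, as in the description.

```rust
// Simplified stand-in for the real type; everything here is hypothetical
// and only models "a LexOrdering can no longer be empty".
#[derive(Debug, Clone, PartialEq)]
struct LexOrdering(Vec<String>);

impl LexOrdering {
    // Returns None for an empty column list, so "unordered" has no representation.
    fn new(cols: Vec<String>) -> Option<Self> {
        if cols.is_empty() {
            None
        } else {
            Some(LexOrdering(cols))
        }
    }
}

// Keep the leading sort columns that survive the projection.
fn project_columns(ordering: &LexOrdering, projected: &[&str]) -> Vec<String> {
    ordering
        .0
        .iter()
        .take_while(|c| projected.contains(&c.as_str()))
        .cloned()
        .collect()
}

// Same shape as the quoted snippet: an entry whose projected ordering is
// empty is skipped, so the output can be shorter than the input.
fn project_lossy(input: &[LexOrdering], projected: &[&str]) -> Vec<LexOrdering> {
    let mut out = Vec::new();
    for ordering in input {
        let Some(new_ordering) = LexOrdering::new(project_columns(ordering, projected)) else {
            continue;
        };
        out.push(new_ordering);
    }
    out
}
```

With two input orderings, [a, b] and [c], and a projection that keeps only a and b, project_lossy returns a single entry: the second one is dropped, not marked as unordered.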
To Reproduce
No response
Expected behavior
FileScanConfig::output_ordering must be Vec<Option<LexOrdering>>.
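A rough sketch of what the proposed shape buys, again using a simplified stand-in for LexOrdering and a hypothetical project_preserving helper, not the actual DataFusion API: with Option in the element type, a projection that removes every sort column maps the entry to None, and the output keeps one entry per input.

```rust
// Simplified stand-in for the real type; all names below are hypothetical.
#[derive(Debug, Clone, PartialEq)]
struct LexOrdering(Vec<String>);

impl LexOrdering {
    // Returns None for an empty column list.
    fn new(cols: Vec<String>) -> Option<Self> {
        if cols.is_empty() {
            None
        } else {
            Some(LexOrdering(cols))
        }
    }
}

// Keep the leading sort columns that survive the projection.
fn project_columns(ordering: &LexOrdering, projected: &[&str]) -> Vec<String> {
    ordering
        .0
        .iter()
        .take_while(|c| projected.contains(&c.as_str()))
        .cloned()
        .collect()
}

// One output entry per input entry: None means "this entry is unordered".
fn project_preserving(
    input: &[Option<LexOrdering>],
    projected: &[&str],
) -> Vec<Option<LexOrdering>> {
    input
        .iter()
        .map(|entry| {
            entry
                .as_ref()
                .and_then(|o| LexOrdering::new(project_columns(o, projected)))
        })
        .collect()
}
```

Because the mapping is one-to-one, the result always has the same length as the input, and an entry that loses its ordering becomes None rather than vanishing.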
Additional context
No response