@XiangpengHao XiangpengHao commented Jul 2, 2025

This is my latest attempt to make pushdown faster. Prior art: #6921

cc @alamb @zhuqi-lucas

Problems of #6921

  1. It proactively loads the entire row group into memory (rather than loading only the pages that pass the filter predicate).
  2. It only caches decompressed pages, so it still pays the decoding cost twice.

This PR takes a different approach: it does not change the decoding pipeline, so we avoid problem 1. It also caches the Arrow record batch, so we avoid problem 2.

But this means we need to use more memory to cache data.

How does it work?

  1. It instruments the array_readers with a transparent cached_array_reader.
  2. The cache layer first consults the RowGroupCache to look for a batch, and only reads from the underlying reader on a cache miss.
  3. There is a cache producer and a cache consumer. The producer runs when we build filters and inserts Arrow arrays into the cache; the consumer runs when we build outputs and removes Arrow arrays from the cache. So the memory usage should look like this:
    ▲
    │     ╭─╮
    │    ╱   ╲
    │   ╱     ╲
    │  ╱       ╲
    │ ╱         ╲
    │╱           ╲
    └─────────────╲──────► Time
    │      │      │
    Filter  Peak  Consume
    Phase (Built) (Decrease)

In a concurrent setup, not all readers reach the peak at the same time, so the peak system memory usage might be lower.

  4. It has a max_cache_size knob, which is a per-row-group setting. If the row group has used up the budget, the cache stops taking new data, and the cached_array_reader falls back to reading and decoding from Parquet.
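
The producer/consumer flow and the per-row-group budget can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the PR's code: `SketchCache`, its methods, `decode_from_parquet`, `read_for_output`, and the use of `Vec<u8>` in place of a decoded Arrow array are all hypothetical names made up for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the PR's RowGroupCache. Entries are keyed by
/// (column index, batch index); a Vec<u8> stands in for a decoded Arrow array.
struct SketchCache {
    entries: HashMap<(usize, usize), Vec<u8>>,
    used_bytes: usize,
    max_cache_size: Option<usize>, // None means no limit
}

impl SketchCache {
    fn new(max_cache_size: Option<usize>) -> Self {
        Self { entries: HashMap::new(), used_bytes: 0, max_cache_size }
    }

    /// Producer side (filter phase). Returns false and stores nothing
    /// when the entry would push the cache past its budget.
    fn insert(&mut self, column: usize, batch: usize, data: Vec<u8>) -> bool {
        let size = data.len();
        if let Some(limit) = self.max_cache_size {
            if self.used_bytes + size > limit {
                return false;
            }
        }
        if let Some(old) = self.entries.insert((column, batch), data) {
            self.used_bytes -= old.len();
        }
        self.used_bytes += size;
        true
    }

    /// Consumer side (output phase). Removing the entry releases its memory
    /// as soon as the output has been built. None means a cache miss.
    fn remove(&mut self, column: usize, batch: usize) -> Option<Vec<u8>> {
        let data = self.entries.remove(&(column, batch))?;
        self.used_bytes -= data.len();
        Some(data)
    }
}

/// Stand-in for reading and decoding a batch from Parquet.
fn decode_from_parquet(column: usize, batch: usize) -> Vec<u8> {
    vec![(column + batch) as u8; 4]
}

/// Output path: take the batch from the cache, or decode it again on a miss.
fn read_for_output(cache: &mut SketchCache, column: usize, batch: usize) -> Vec<u8> {
    cache
        .remove(column, batch)
        .unwrap_or_else(|| decode_from_parquet(column, batch))
}

fn main() {
    let mut cache = SketchCache::new(Some(6)); // 6-byte budget for the demo
    // Filter phase: the first batch fits, the second would exceed the budget.
    let first_cached = cache.insert(0, 0, decode_from_parquet(0, 0));
    let second_cached = cache.insert(0, 1, decode_from_parquet(0, 1));
    println!("cached: {first_cached} {second_cached}");
    // Output phase: batch 0 is a hit, batch 1 is decoded a second time.
    let a = read_for_output(&mut cache, 0, 0);
    let b = read_for_output(&mut cache, 0, 1);
    println!("{a:?} {b:?}, bytes still cached: {}", cache.used_bytes);
}
```

A full cache only costs performance in this sketch: a rejected insert means the output phase decodes that batch a second time, which is the baseline behavior.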

Other benefits

  1. This architecture allows nested columns (not implemented in this PR), i.e., it's future-proof.
  2. There are many optimizations left to further squeeze out performance, but even in its current state it has no regressions.

How does it perform?

My criterion somehow won't produce a result from --save-baseline, so I asked an LLM to generate a table from this benchmark:

cargo bench --bench arrow_reader_clickbench --features "arrow async" "async"

Baseline is the implementation on the current main branch.
New Unlimited is the new pushdown with an unlimited memory budget.
New 100MB is the new pushdown with a 100MB memory budget for each row group's cache.
Each Diff column is the difference from Baseline (negative means faster).

Query  | Baseline (ms)  | New Unlimited (ms)  | Diff (ms)  | New 100MB (ms) | Diff (ms)
-------+----------------+---------------------+------------+----------------+-----------
Q1     | 0.847          | 0.803               | -0.044     | 0.812          | -0.035    
Q10    | 4.060          | 6.273               | +2.213     | 6.216          | +2.156    
Q11    | 5.088          | 7.152               | +2.064     | 7.193          | +2.105    
Q12    | 18.485         | 14.937              | -3.548     | 14.904         | -3.581    
Q13    | 24.859         | 21.908              | -2.951     | 21.705         | -3.154    
Q14    | 23.994         | 20.691              | -3.303     | 20.467         | -3.527    
Q19    | 1.894          | 1.980               | +0.086     | 1.996          | +0.102    
Q20    | 90.325         | 64.689              | -25.636    | 74.478         | -15.847   
Q21    | 106.610        | 74.766              | -31.844    | 99.557         | -7.053    
Q22    | 232.730        | 101.660             | -131.070   | 204.800        | -27.930   
Q23    | 222.800        | 186.320             | -36.480    | 186.590        | -36.210   
Q24    | 24.840         | 19.762              | -5.078     | 19.908         | -4.932    
Q27    | 80.463         | 47.118              | -33.345    | 49.597         | -30.866   
Q28    | 78.999         | 47.583              | -31.416    | 51.432         | -27.567   
Q30    | 28.587         | 28.710              | +0.123     | 28.926         | +0.339    
Q36    | 80.157         | 57.954              | -22.203    | 58.012         | -22.145   
Q37    | 46.962         | 45.901              | -1.061     | 45.386         | -1.576    
Q38    | 16.324         | 16.492              | +0.168     | 16.522         | +0.198    
Q39    | 20.754         | 20.734              | -0.020     | 20.648         | -0.106    
Q40    | 22.554         | 21.707              | -0.847     | 21.995         | -0.559    
Q41    | 16.430         | 16.391              | -0.039     | 16.581         | +0.151    
Q42    | 6.045          | 6.157               | +0.112     | 6.120          | +0.075    

  1. If we consider diffs within 5ms to be noise, then we are never worse than the current implementation.
  2. We see significant improvements for string-heavy queries, because string columns are large and take time to decompress and decode.
  3. A 100MB cache budget seems to have a small performance impact on most queries (Q20, Q21 and Q22 are noticeably slower than with an unlimited budget).

Limitations

  1. It only works for async readers, because sync readers do not follow the same row-group-by-row-group structure.
  2. It is memory hungry compared to #6921 (Experimental parquet decoder with first-class selection pushdown support). But changing the decoding pipeline without eagerly loading the entire row group would require significant changes to the current decoding infrastructure, e.g., we would need to make the page iterator an async function.
  3. It currently doesn't support nested columns; more specifically, it doesn't support nested columns with nullable parents. Supporting them is straightforward and needs no big changes.
  4. The current memory accounting is not accurate: it overestimates memory usage, especially when reading string view arrays, where multiple string views may share the same underlying buffer and that buffer's size is counted twice. Either way, we never exceed the user-configured memory limit.
  5. If even one row passes the filter, the entire batch will be cached. We can probably optimize this though.
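
The overestimate in item 4 can be illustrated with a small sketch. It assumes a simplified model in which each cached array holds reference-counted data buffers; `SketchViewArray`, `naive_size` and `deduplicated_size` are hypothetical names for the example, not arrow-rs types or the PR's accounting code.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-in for a string view array: it only records which
/// shared data buffers it references. Not the arrow-rs StringViewArray.
struct SketchViewArray {
    buffers: Vec<Arc<Vec<u8>>>,
}

/// Naive accounting: every array is charged for every buffer it references,
/// so a buffer shared by two arrays is counted twice.
fn naive_size(arrays: &[SketchViewArray]) -> usize {
    arrays
        .iter()
        .flat_map(|a| a.buffers.iter())
        .map(|b| b.len())
        .sum()
}

/// Deduplicated accounting: each distinct allocation is counted once,
/// identified by the pointer behind the Arc.
fn deduplicated_size(arrays: &[SketchViewArray]) -> usize {
    let mut seen = HashSet::new();
    arrays
        .iter()
        .flat_map(|a| a.buffers.iter())
        .filter(|b| seen.insert(Arc::as_ptr(*b)))
        .map(|b| b.len())
        .sum()
}

fn main() {
    let shared = Arc::new(vec![0u8; 1024]);
    // Two cached arrays that reference the same 1024-byte buffer.
    let arrays = vec![
        SketchViewArray { buffers: vec![Arc::clone(&shared)] },
        SketchViewArray { buffers: vec![Arc::clone(&shared)] },
    ];
    // prints "naive: 2048, deduplicated: 1024"
    println!(
        "naive: {}, deduplicated: {}",
        naive_size(&arrays),
        deduplicated_size(&arrays)
    );
}
```

Because the naive count is always at least the real footprint, a budget enforced against it is reached early rather than exceeded.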

Next steps?

This PR is largely a proof of concept; I want to collect some feedback before sending a multi-thousand-line PR :)

Some items I can think of:

  1. Design an interface for users to specify the cache size limit; currently it's hard-coded.
  2. Don't instrument the nested array reader if the Parquet file has a nullable parent; currently it will panic.
  3. More testing, plus integration tests/benchmarks with DataFusion.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Jul 2, 2025
#[derive(Clone)]
pub struct CacheOptions<'a> {
pub projection_mask: &'a ProjectionMask,
pub cache: Arc<Mutex<RowGroupCache>>,
XiangpengHao (author) commented:

Practically there's no contention because there's no parallelism in decoding one row group. We add the Mutex here because we need to use Arc (an Arc only hands out shared references, so mutating the cache needs interior mutability).

let row_group_cache = Arc::new(Mutex::new(RowGroupCache::new(
batch_size,
// None,
Some(1024 * 1024 * 100),
XiangpengHao (author) commented:

This is currently hard-coded; making it configurable through user settings is left as future work.

XiangpengHao changed the title from "Pushdown v4" to "Parquet filter pushdown v4" Jul 2, 2025
@@ -613,8 +623,18 @@ where
.fetch(&mut self.input, predicate.projection(), selection)
.await?;

let mut cache_projection = predicate.projection().clone();
cache_projection.intersect(&projection);
XiangpengHao (author) commented:

A column is cached if and only if it appears in both the output projection and the filter projection.
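
As a rough illustration of that rule, the sketch below intersects two column masks. `SketchMask` and its methods are hypothetical stand-ins written for this example with plain `Vec<bool>` flags; they are not the parquet crate's `ProjectionMask`.

```rust
/// Hypothetical stand-in for a projection mask: one flag per leaf column.
/// This is not the parquet crate's ProjectionMask type.
#[derive(Clone, Debug, PartialEq)]
struct SketchMask(Vec<bool>);

impl SketchMask {
    /// Build a mask over `num_columns` leaves with the given indices selected.
    fn leaves(num_columns: usize, selected: &[usize]) -> Self {
        let mut flags = vec![false; num_columns];
        for &i in selected {
            flags[i] = true;
        }
        SketchMask(flags)
    }

    /// Keep only the columns that are selected in both masks.
    fn intersect(&mut self, other: &SketchMask) {
        assert_eq!(self.0.len(), other.0.len(), "masks must cover the same schema");
        for (mine, theirs) in self.0.iter_mut().zip(&other.0) {
            *mine = *mine && *theirs;
        }
    }

    /// Indices of the selected columns.
    fn selected(&self) -> Vec<usize> {
        (0..self.0.len()).filter(|&i| self.0[i]).collect()
    }
}

fn main() {
    let filter_projection = SketchMask::leaves(4, &[0, 2]); // columns the predicate reads
    let output_projection = SketchMask::leaves(4, &[2, 3]); // columns the query returns

    let mut cache_projection = filter_projection.clone();
    cache_projection.intersect(&output_projection);

    // Only column 2 is both filtered on and returned, so only it is cached.
    println!("cached columns: {:?}", cache_projection.selected());
}
```

In this example column 0 is decoded for the filter but never needed again, and column 3 is never decoded during filtering, so caching either would not save any work.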

alamb commented Jul 15, 2025:

So one thing I didn't understand after reading this PR in detail was how the relative row positions are updated after applying a filter.

For example, if we are applying multiple filters, the first may reduce the original RowSelection down to [100->200], and when the second filter runs it is only evaluated on the 100->200 rows, not the original selection.

In other words, I think there needs to be some sort of function equivalent to RowSelection::and_then that applies to the cache

// Narrow the cache so that it only retains the results of evaluating the predicate
let row_group_cache = row_group_cache.and_then(resulting_selection)

Maybe this is the root cause of https://github.com/apache/datafusion/actions/runs/16302299778/job/46039904381?pr=16711
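
To make the row-position concern concrete, here is a small sketch of the and_then composition on plain boolean masks. It assumes the semantics described in the comment (the second selection is expressed relative to the rows the first one kept); the free function `and_then` below is a hypothetical helper for illustration, not the arrow-rs `RowSelection` implementation.

```rust
/// Compose two selections.
/// `first` has one flag per row of the row group.
/// `second` has one flag per row that `first` selected.
/// The result has one flag per row of the row group.
fn and_then(first: &[bool], second: &[bool]) -> Vec<bool> {
    let selected_by_first = first.iter().filter(|&&s| s).count();
    assert_eq!(
        selected_by_first,
        second.len(),
        "second must have one entry per row selected by first"
    );
    let mut inner = second.iter();
    first
        .iter()
        // `inner` only advances on rows that `first` kept.
        .map(|&s| s && *inner.next().unwrap())
        .collect()
}

fn main() {
    // The first filter keeps rows 1, 2 and 3 of a five-row group.
    let first = [false, true, true, true, false];
    // The second filter sees only those three rows and keeps the 1st and 3rd.
    let second = [true, false, true];
    // Mapped back to row-group positions, rows 1 and 3 survive.
    println!("{:?}", and_then(&first, &second));
}
```

A cache that is indexed by row-group positions needs the same mapping before it can serve the second filter's relative positions.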

}

fn get_def_levels(&self) -> Option<&[i16]> {
None // we don't allow nullable parent for now.
XiangpengHao (author) commented:

Nested columns are not supported yet.

alamb commented Jul 2, 2025

😮 -- My brain is likely too fried at the moment to review this properly but it is on my list for first thing tomorrow

zhuqi-lucas commented:

Thank you @XiangpengHao for the amazing work, I will try to review and test this PR!

alamb left a comment:

TLDR: I think this is really clever, very nice @XiangpengHao. I left some structural comments / suggestions but nothing major.

I will run some more benchmarks, but it was showing very nice improvements for Q21 locally for me (129ms --> 90ms)

If that looks good I'll wire it up in DataFusion and run those benchmarks

Some thoughts:

  1. I would be happy to wire in the buffering limit / API
  2. As you say, there are many more improvements possible -- specifically I suspect the RowSelector representation is going to cause us pain and suffering for filters that have many short selections when bitmaps would be a better choice

Buffering

I think buffering the intermediate filter results is unavoidable if we want to preserve the current behavior of minimizing the size of IO requests

If we want to reduce buffering I think we can only really do it by increasing the number of IO requests (so we can incrementally produce the final output). I think we should proceed with buffering and then tune if/when needed

Comment on lines 632 to 636
CacheOptions {
projection_mask: &cache_projection,
cache: row_group_cache.clone(),
role: crate::arrow::array_reader::CacheRole::Producer,
},
alamb commented:

Structurally, both here and below, it might help to move the creation of the CacheOptions into the cache itself, so a reader of this code doesn't have to understand the innards of the cache.

Suggested change
- CacheOptions {
-     projection_mask: &cache_projection,
-     cache: row_group_cache.clone(),
-     role: crate::arrow::array_reader::CacheRole::Producer,
- },
+ row_group_cache.producer_options(projection, predicate.projection())


let start_position = self.outer_position - row_count;

let selection_buffer = row_selection_to_boolean_buffer(row_count, self.selections.iter());
alamb commented:

this is clever -- though it will likely suffer from the same "RowSelection is a crappy representation for small selection runs" problem

XiangpengHao (author) commented:

Yes, this is to alleviate that problem. If we have multiple small selection runs on the same cached batch, we first combine them into a boolean buffer and then do the boolean selection once.
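
The idea of turning many short runs into one mask and filtering once can be sketched as below. `Run`, `runs_to_mask` and `filter_once` are hypothetical names for this illustration; the PR itself uses row_selection_to_boolean_buffer and arrow_select::filter::filter, whose signatures are not reproduced here.

```rust
/// Hypothetical stand-in for one run of a row selection:
/// `row_count` consecutive rows that are either all skipped or all selected.
#[derive(Clone, Copy)]
struct Run {
    row_count: usize,
    skip: bool,
}

/// Expand a list of runs into one flag per row (true means "keep").
fn runs_to_mask(runs: &[Run]) -> Vec<bool> {
    let mut mask = Vec::with_capacity(runs.iter().map(|r| r.row_count).sum());
    for run in runs {
        mask.extend(std::iter::repeat(!run.skip).take(run.row_count));
    }
    mask
}

/// Apply the mask in a single pass instead of slicing once per run.
fn filter_once<T: Clone>(values: &[T], mask: &[bool]) -> Vec<T> {
    assert_eq!(values.len(), mask.len(), "mask must cover every value");
    values
        .iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(v, _)| v.clone())
        .collect()
}

fn main() {
    // Select 2 rows, skip 3, select 1: three short runs over one cached batch.
    let runs = [
        Run { row_count: 2, skip: false },
        Run { row_count: 3, skip: true },
        Run { row_count: 1, skip: false },
    ];
    let cached_batch = [10, 11, 12, 13, 14, 15];
    let mask = runs_to_mask(&runs);
    println!("{:?}", filter_once(&cached_batch, &mask));
}
```

The single-pass form does one traversal of the batch no matter how many runs there are, whereas slicing per run produces one small intermediate array per run that then has to be concatenated.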

.expect("data must be already cached in the read_records call, this is a bug");
let cached = cached.slice(overlap_start - batch_start, selection_length);
let filtered = arrow_select::filter::filter(&cached, &mask_array)?;
selected_arrays.push(filtered);
alamb commented:

You can probably use the new BatchCoalescer here instead: https://docs.rs/arrow/latest/arrow/compute/struct.BatchCoalescer.html

It is definitely faster for primitive arrays and will save intermediate memory usage

It might have some trouble with StringView as it also tries to gc internally too -- we may need to optimize the output to avoid gc'ing if we see the same buffer from call to call

alamb commented Jul 3, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing pushdown-v4 (1851f0b) to af8564f diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=pushdown-v4
Results will be posted here when complete

alamb commented Jul 3, 2025

🤖: Benchmark completed

Details

group                                main                                   pushdown-v4
-----                                ----                                   -----------
arrow_reader_clickbench/async/Q1     1.00      2.4±0.02ms        ? ?/sec    1.00      2.4±0.12ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     10.4±0.11ms        ? ?/sec    1.10     11.5±0.26ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.00     12.4±0.14ms        ? ?/sec    1.09     13.5±0.18ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.34     34.4±0.29ms        ? ?/sec    1.00     25.7±0.22ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.23     48.6±0.32ms        ? ?/sec    1.00     39.5±0.26ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.24     46.3±0.35ms        ? ?/sec    1.00     37.2±0.27ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.00      5.2±0.05ms        ? ?/sec    1.08      5.6±0.07ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.32    161.7±0.73ms        ? ?/sec    1.00    122.3±0.50ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.30    207.7±0.83ms        ? ?/sec    1.00    159.6±0.65ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.06    479.2±2.17ms        ? ?/sec    1.00    450.6±8.27ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.13   492.5±12.42ms        ? ?/sec    1.00   436.3±14.78ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.21     53.8±0.69ms        ? ?/sec    1.00     44.3±0.41ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.52    163.9±0.89ms        ? ?/sec    1.00    107.7±0.60ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.45    160.0±0.86ms        ? ?/sec    1.00    110.3±0.47ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.00     61.5±0.37ms        ? ?/sec    1.00     61.6±0.37ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.33    169.1±0.95ms        ? ?/sec    1.00    127.2±0.54ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.01    100.1±0.47ms        ? ?/sec    1.00     98.7±0.39ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.00     39.6±0.23ms        ? ?/sec    1.00     39.5±0.25ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.02     49.6±0.20ms        ? ?/sec    1.00     48.9±0.43ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.05     54.1±0.36ms        ? ?/sec    1.00     51.7±0.44ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.00     40.8±0.26ms        ? ?/sec    1.01     41.1±0.34ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.00     14.5±0.12ms        ? ?/sec    1.00     14.5±0.10ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00      2.2±0.00ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.00      9.2±0.09ms        ? ?/sec    1.01      9.3±0.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.00     11.1±0.07ms        ? ?/sec    1.01     11.2±0.07ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.00     36.4±0.28ms        ? ?/sec    1.00     36.4±0.29ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     49.9±0.41ms        ? ?/sec    1.00     49.9±0.38ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     47.9±0.28ms        ? ?/sec    1.01     48.2±0.38ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.02      4.3±0.02ms        ? ?/sec    1.00      4.2±0.02ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.01    178.1±0.90ms        ? ?/sec    1.00    176.8±0.72ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    233.1±2.45ms        ? ?/sec    1.00    233.5±0.83ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.01    479.4±2.39ms        ? ?/sec    1.00    476.4±2.19ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.02   443.9±12.86ms        ? ?/sec    1.00   435.5±16.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.00     51.0±0.52ms        ? ?/sec    1.01     51.7±0.65ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    153.9±0.61ms        ? ?/sec    1.00    153.3±0.68ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.01    150.4±0.65ms        ? ?/sec    1.00    149.2±0.86ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.01     59.3±0.40ms        ? ?/sec    1.00     58.9±0.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.01    158.6±1.04ms        ? ?/sec    1.00    157.7±0.94ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.01     93.2±0.44ms        ? ?/sec    1.00     92.5±0.42ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     31.9±0.20ms        ? ?/sec    1.01     32.2±0.29ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.01     34.7±0.41ms        ? ?/sec    1.00     34.3±0.29ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     50.4±0.47ms        ? ?/sec    1.00     50.5±0.48ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.00     37.7±0.37ms        ? ?/sec    1.01     38.0±0.30ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.00     13.6±0.07ms        ? ?/sec    1.01     13.7±0.09ms        ? ?/sec

alamb commented Jul 3, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing pushdown-v4 (1851f0b) to af8564f diff
BENCH_NAME=arrow_reader
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader
BENCH_FILTER=
BENCH_BRANCH_NAME=pushdown-v4
Results will be posted here when complete

alamb commented Jul 3, 2025

🤖: Benchmark completed

😎 -- very nice

zhuqi-lucas commented:

> 🤖: Benchmark completed
>
> 😎 -- very nice

Great result!

I am curious about the performance compared with the no-filter-pushdown case, because the previous attempt also improved performance on this benchmark, but compared to the no-filter-pushdown case it had some regressions.

alamb commented Jul 3, 2025

> I am curious about the performance compared with the no-filter-pushdown case, because the previous attempt also improved performance on this benchmark, but compared to the no-filter-pushdown case it had some regressions.

I will try and run this experiment later today

zhuqi-lucas commented:

> > I am curious about the performance compared with the no-filter-pushdown case, because the previous attempt also improved performance on this benchmark, but compared to the no-filter-pushdown case it had some regressions.
>
> I will try and run this experiment later today

Thank you @alamb. If it has no regression, I believe this PR will also resolve the adaptive selection cases; if it has regressions, we can further combine it with adaptive selection as a final optimization.

alamb commented Jul 3, 2025

🤖: Benchmark completed

Details

group                                                                                                      main                                   pushdown-v4
-----                                                                                                      ----                                   -----------
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                           1.06   1356.3±2.84µs        ? ?/sec    1.00   1277.4±2.92µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                          1.02   1352.0±2.48µs        ? ?/sec    1.00   1323.1±3.61µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                            1.06   1361.7±3.15µs        ? ?/sec    1.00   1283.6±2.09µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                     1.00    484.4±6.57µs        ? ?/sec    1.06    512.0±4.35µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                    1.00    662.9±2.03µs        ? ?/sec    1.05    694.0±2.13µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                      1.00    485.8±3.76µs        ? ?/sec    1.05    509.5±4.37µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                          1.09    626.7±3.48µs        ? ?/sec    1.00    577.1±3.17µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                         1.01    772.8±2.90µs        ? ?/sec    1.00    763.2±2.98µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                           1.07    632.7±2.73µs        ? ?/sec    1.00    590.5±4.25µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                                 1.03    258.8±3.21µs        ? ?/sec    1.00    251.7±2.83µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                                1.17    269.3±0.80µs        ? ?/sec    1.00    230.1±0.60µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                  1.00    257.7±2.56µs        ? ?/sec    1.00    258.5±3.28µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                      1.00    309.6±1.51µs        ? ?/sec    1.00    311.1±2.30µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                        1.00    301.0±0.54µs        ? ?/sec    1.07    321.4±0.61µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                     1.13    306.2±1.12µs        ? ?/sec    1.00    269.9±1.09µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                       1.00    317.2±1.37µs        ? ?/sec    1.00    318.4±1.88µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs     1.01   1077.6±2.48µs        ? ?/sec    1.00   1066.7±1.91µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs    1.05    951.0±2.12µs        ? ?/sec    1.00    902.7±2.82µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs      1.01   1083.5±2.79µs        ? ?/sec    1.00   1074.1±4.83µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                 1.04    448.4±3.42µs        ? ?/sec    1.00    432.8±4.39µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                1.11    630.6±1.87µs        ? ?/sec    1.00    567.9±4.22µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                  1.04    457.8±4.89µs        ? ?/sec    1.00    438.3±3.40µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    153.1±0.31µs        ? ?/sec    1.05    160.6±0.29µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.19    297.8±0.69µs        ? ?/sec    1.00    249.8±0.82µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    158.7±0.36µs        ? ?/sec    1.05    166.4±1.13µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     77.3±0.22µs        ? ?/sec    1.00     77.2±0.19µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.25    257.7±0.48µs        ? ?/sec    1.00    206.9±0.37µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.02     83.5±0.22µs        ? ?/sec    1.00     82.0±3.11µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, mandatory, no NULLs                    1.00    686.9±1.54µs        ? ?/sec    1.08    740.3±4.00µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, half NULLs                   1.02    561.3±1.29µs        ? ?/sec    1.00    550.5±1.88µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, no NULLs                     1.00    693.1±1.30µs        ? ?/sec    1.08    747.3±2.10µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, mandatory, no NULLs                                1.00     65.1±4.91µs        ? ?/sec    1.07     69.3±4.01µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.19    254.1±3.38µs        ? ?/sec    1.00    214.4±1.60µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, no NULLs                                 1.00     71.5±3.59µs        ? ?/sec    1.07     76.4±4.51µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, mandatory, no NULLs                     1.00     86.3±0.17µs        ? ?/sec    1.09     94.4±0.72µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, half NULLs                    1.26    228.6±0.89µs        ? ?/sec    1.00    181.1±0.37µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, no NULLs                      1.00     91.0±0.29µs        ? ?/sec    1.09     99.2±0.27µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, mandatory, no NULLs                                 1.00      9.3±0.11µs        ? ?/sec    1.02      9.5±0.23µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, half NULLs                                1.37    190.3±0.85µs        ? ?/sec    1.00    138.5±0.26µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, no NULLs                                  1.00     14.6±0.24µs        ? ?/sec    1.02     14.9±0.39µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, mandatory, no NULLs                     1.00    170.2±0.42µs        ? ?/sec    1.08    184.4±0.56µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, half NULLs                    1.27    349.1±0.82µs        ? ?/sec    1.00    275.7±0.70µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, no NULLs                      1.00    175.8±0.44µs        ? ?/sec    1.08    189.6±0.51µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, mandatory, no NULLs                                 1.00     12.9±0.26µs        ? ?/sec    1.14     14.7±0.42µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, half NULLs                                1.41    267.4±0.67µs        ? ?/sec    1.00    190.2±0.58µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, no NULLs                                  1.00     20.0±0.74µs        ? ?/sec    1.00     20.0±0.36µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, mandatory, no NULLs                     1.00    340.8±0.84µs        ? ?/sec    1.07    365.3±0.82µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, half NULLs                    1.08    376.1±1.45µs        ? ?/sec    1.00    348.3±0.85µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, no NULLs                      1.00    347.6±1.68µs        ? ?/sec    1.07    371.8±0.92µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, mandatory, no NULLs                                 1.00     26.0±0.54µs        ? ?/sec    1.17     30.3±1.95µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, half NULLs                                1.22    219.8±0.58µs        ? ?/sec    1.00    179.7±0.58µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, no NULLs                                  1.00     32.6±0.53µs        ? ?/sec    1.09     35.5±1.36µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    120.2±0.20µs        ? ?/sec    1.01    121.8±0.18µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    135.7±0.53µs        ? ?/sec    1.02    138.6±0.32µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    123.1±0.19µs        ? ?/sec    1.02    126.1±0.26µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, mandatory, no NULLs                                1.01    174.1±0.60µs        ? ?/sec    1.00    171.8±0.28µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, half NULLs                               1.00    230.2±0.68µs        ? ?/sec    1.01    232.8±0.70µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, no NULLs                                 1.01    179.4±0.43µs        ? ?/sec    1.00    177.0±0.46µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00     77.2±0.20µs        ? ?/sec    1.01     78.0±0.68µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.00    178.9±0.83µs        ? ?/sec    1.01    181.2±1.04µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.01     82.3±0.31µs        ? ?/sec    1.00     81.8±0.26µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.00    138.4±0.42µs        ? ?/sec    1.06    147.0±0.36µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, half NULLs                          1.00    213.4±0.55µs        ? ?/sec    1.03    219.8±0.91µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    143.6±0.28µs        ? ?/sec    1.06    152.8±0.29µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, mandatory, no NULLs                                1.00     74.6±0.44µs        ? ?/sec    1.00     74.6±0.30µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, half NULLs                               1.00    177.5±0.71µs        ? ?/sec    1.01    179.7±0.46µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, no NULLs                                 1.00     78.4±0.22µs        ? ?/sec    1.01     79.5±0.26µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    113.8±0.15µs        ? ?/sec    1.01    114.9±0.18µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    140.0±0.32µs        ? ?/sec    1.03    144.7±0.64µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    116.7±0.13µs        ? ?/sec    1.02    119.6±0.57µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    171.7±0.63µs        ? ?/sec    1.02    175.7±0.48µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, half NULLs                               1.00    249.4±0.59µs        ? ?/sec    1.02    253.6±0.63µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, no NULLs                                 1.00    176.6±0.51µs        ? ?/sec    1.03    181.6±0.73µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00    202.6±0.43µs        ? ?/sec    1.00    203.3±0.29µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.00    263.1±0.57µs        ? ?/sec    1.00    263.6±0.81µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.00    209.1±0.51µs        ? ?/sec    1.01    210.2±0.56µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.00    145.9±0.34µs        ? ?/sec    1.07    156.7±0.30µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, half NULLs                          1.00    230.6±0.61µs        ? ?/sec    1.03    236.8±0.62µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    151.3±0.34µs        ? ?/sec    1.06    159.9±0.96µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, mandatory, no NULLs                                1.00     97.6±0.97µs        ? ?/sec    1.11    108.3±0.72µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, half NULLs                               1.00    208.6±1.32µs        ? ?/sec    1.03    214.8±0.91µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, no NULLs                                 1.00    107.3±2.25µs        ? ?/sec    1.15    123.3±1.20µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, mandatory, no NULLs                                      1.00     95.6±0.12µs        ? ?/sec    1.04     99.4±0.25µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, half NULLs                                     1.00    113.9±0.18µs        ? ?/sec    1.02    116.2±0.46µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, no NULLs                                       1.00     98.6±0.22µs        ? ?/sec    1.04    102.3±0.33µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, mandatory, no NULLs                                           1.00    130.9±0.37µs        ? ?/sec    1.05    138.0±0.77µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, half NULLs                                          1.00    189.6±0.46µs        ? ?/sec    1.03    194.5±0.29µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, no NULLs                                            1.00    135.5±0.33µs        ? ?/sec    1.06    143.0±0.58µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     44.4±0.11µs        ? ?/sec    1.01     44.9±0.11µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, half NULLs                              1.00    143.4±0.29µs        ? ?/sec    1.01    144.4±1.68µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, no NULLs                                1.00     48.6±0.12µs        ? ?/sec    1.01     49.2±0.17µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, mandatory, no NULLs                                      1.00    104.6±0.17µs        ? ?/sec    1.09    114.4±0.27µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, half NULLs                                     1.00    177.8±0.47µs        ? ?/sec    1.03    182.6±2.84µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, no NULLs                                       1.00    109.4±0.22µs        ? ?/sec    1.09    119.6±3.64µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, mandatory, no NULLs                                           1.00     38.9±0.14µs        ? ?/sec    1.00     38.8±0.08µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, half NULLs                                          1.00    141.4±0.38µs        ? ?/sec    1.00    140.8±1.42µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, no NULLs                                            1.01     43.8±0.19µs        ? ?/sec    1.00     43.5±0.22µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, mandatory, no NULLs                                      1.00     94.6±0.20µs        ? ?/sec    1.02     96.1±0.21µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, half NULLs                                     1.00    108.9±0.32µs        ? ?/sec    1.02    110.9±0.92µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, no NULLs                                       1.00     98.2±0.32µs        ? ?/sec    1.01     98.7±0.23µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, mandatory, no NULLs                                           1.00    121.4±0.27µs        ? ?/sec    1.00    121.0±0.36µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, half NULLs                                          1.00    174.6±0.69µs        ? ?/sec    1.02    177.9±0.35µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, no NULLs                                            1.00    125.8±0.44µs        ? ?/sec    1.00    126.0±0.39µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, mandatory, no NULLs                               1.11     26.3±0.21µs        ? ?/sec    1.00     23.7±0.06µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, half NULLs                              1.00    126.0±0.27µs        ? ?/sec    1.01    127.5±0.31µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, no NULLs                                1.00     30.1±0.25µs        ? ?/sec    1.03     31.1±0.19µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, mandatory, no NULLs                                      1.00     87.2±0.26µs        ? ?/sec    1.11     96.5±0.28µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, half NULLs                                     1.00    157.0±0.39µs        ? ?/sec    1.04    163.6±0.36µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, no NULLs                                       1.00     91.1±0.36µs        ? ?/sec    1.12    101.7±0.40µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, mandatory, no NULLs                                           1.00     18.2±0.22µs        ? ?/sec    1.01     18.4±0.39µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, half NULLs                                          1.00    122.0±0.34µs        ? ?/sec    1.01    123.1±0.45µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, no NULLs                                            1.01     24.9±0.49µs        ? ?/sec    1.00     24.8±0.43µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, mandatory, no NULLs                                      1.00     87.0±0.43µs        ? ?/sec    1.02     88.4±0.67µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, half NULLs                                     1.00    112.3±0.35µs        ? ?/sec    1.00    111.9±0.36µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, no NULLs                                       1.00     89.3±0.27µs        ? ?/sec    1.01     90.6±0.31µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, mandatory, no NULLs                                           1.00    117.9±0.65µs        ? ?/sec    1.04    122.6±0.58µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, half NULLs                                          1.00    186.8±0.63µs        ? ?/sec    1.03    193.3±0.82µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, no NULLs                                            1.00    120.7±0.60µs        ? ?/sec    1.05    127.3±3.66µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, mandatory, no NULLs                               1.01    151.7±0.32µs        ? ?/sec    1.00    149.8±0.46µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, half NULLs                              1.01    209.7±0.70µs        ? ?/sec    1.00    207.1±1.72µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, no NULLs                                1.01    156.7±0.39µs        ? ?/sec    1.00    154.5±0.26µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, mandatory, no NULLs                                      1.00     93.1±0.46µs        ? ?/sec    1.09    101.7±0.58µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, half NULLs                                     1.02    182.8±0.54µs        ? ?/sec    1.00    179.1±0.71µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, no NULLs                                       1.00     97.7±0.52µs        ? ?/sec    1.10    107.5±2.91µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, mandatory, no NULLs                                           1.00     42.5±0.65µs        ? ?/sec    1.12     47.7±1.88µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, half NULLs                                          1.00    150.0±0.71µs        ? ?/sec    1.00    150.5±1.19µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, no NULLs                                            1.00     47.0±0.68µs        ? ?/sec    1.14     53.7±1.88µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, mandatory, no NULLs                                       1.00     92.3±0.17µs        ? ?/sec    1.01     93.3±0.21µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, half NULLs                                      1.00    110.0±0.61µs        ? ?/sec    1.01    111.2±0.24µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, no NULLs                                        1.00     95.1±0.17µs        ? ?/sec    1.01     96.3±0.24µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, mandatory, no NULLs                                            1.01    123.0±0.28µs        ? ?/sec    1.00    122.4±0.61µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, half NULLs                                           1.00    182.0±1.07µs        ? ?/sec    1.00    182.3±0.35µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, no NULLs                                             1.00    127.3±0.44µs        ? ?/sec    1.00    126.9±1.12µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, mandatory, no NULLs                                1.00     36.9±0.12µs        ? ?/sec    1.00     37.0±0.07µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, half NULLs                               1.01    136.8±0.48µs        ? ?/sec    1.00    135.7±0.34µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, no NULLs                                 1.00     41.0±0.32µs        ? ?/sec    1.01     41.4±0.10µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, mandatory, no NULLs                                       1.00     96.6±0.20µs        ? ?/sec    1.11    106.9±0.24µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, half NULLs                                      1.00    170.4±0.44µs        ? ?/sec    1.03    175.1±1.72µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, no NULLs                                        1.00    101.3±0.25µs        ? ?/sec    1.10    111.6±0.80µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, mandatory, no NULLs                                            1.00     31.2±0.12µs        ? ?/sec    1.00     31.1±0.07µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, half NULLs                                           1.00    133.7±0.58µs        ? ?/sec    1.00    133.3±0.23µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, no NULLs                                             1.00     35.5±0.20µs        ? ?/sec    1.01     36.0±0.11µs        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings half NULLs                                     1.01      7.2±0.04ms        ? ?/sec    1.00      7.1±0.04ms        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings no NULLs                                       1.01     13.3±0.11ms        ? ?/sec    1.00     13.2±0.16ms        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs                                     1.00    495.7±3.68µs        ? ?/sec    1.04    513.4±2.64µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs                                    1.00    665.7±5.16µs        ? ?/sec    1.04    694.8±1.99µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs                                      1.00    498.7±3.42µs        ? ?/sec    1.02    510.0±3.08µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, mandatory, no NULLs                                          1.20    726.9±3.72µs        ? ?/sec    1.00    607.3±3.10µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, half NULLs                                         1.03    817.4±3.99µs        ? ?/sec    1.00    796.7±7.55µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, no NULLs                                           1.19    732.6±2.67µs        ? ?/sec    1.00    615.7±3.29µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, mandatory, no NULLs                                1.01    322.0±1.12µs        ? ?/sec    1.00    320.3±1.68µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, half NULLs                               1.00    401.2±1.26µs        ? ?/sec    1.08    432.0±2.30µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, no NULLs                                 1.01    328.1±1.32µs        ? ?/sec    1.00    326.5±1.63µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, mandatory, no NULLs                                 1.02    259.4±2.77µs        ? ?/sec    1.00    255.2±2.32µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, half NULLs                                1.15    277.3±0.65µs        ? ?/sec    1.00    240.4±0.67µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, no NULLs                                  1.00    265.7±2.52µs        ? ?/sec    1.01    269.6±2.35µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, mandatory, no NULLs                                      1.03    383.8±1.92µs        ? ?/sec    1.00    372.4±1.34µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, half NULLs                                     1.13    339.4±1.33µs        ? ?/sec    1.00    301.3±1.63µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, no NULLs                                       1.03    395.0±6.05µs        ? ?/sec    1.00    385.3±2.38µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, mandatory, no NULLs                                     1.00    102.2±0.19µs        ? ?/sec    1.00    101.8±0.23µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, half NULLs                                    1.00    118.1±0.29µs        ? ?/sec    1.00    117.6±1.43µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, no NULLs                                      1.01    105.0±0.34µs        ? ?/sec    1.00    104.1±0.24µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, mandatory, no NULLs                                          1.01    139.8±0.27µs        ? ?/sec    1.00    139.0±0.19µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, half NULLs                                         1.00    195.4±0.41µs        ? ?/sec    1.00    194.8±0.63µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, no NULLs                                           1.01    144.3±0.30µs        ? ?/sec    1.00    143.5±0.95µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, mandatory, no NULLs                              1.04     44.6±0.12µs        ? ?/sec    1.00     43.0±0.10µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, half NULLs                             1.01    144.2±1.16µs        ? ?/sec    1.00    143.2±1.37µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, no NULLs                               1.03     49.0±0.13µs        ? ?/sec    1.00     47.6±0.15µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, mandatory, no NULLs                                     1.00    104.5±0.28µs        ? ?/sec    1.10    114.6±0.48µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, half NULLs                                    1.00    178.3±1.76µs        ? ?/sec    1.02    182.6±1.21µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, no NULLs                                      1.00    109.3±0.70µs        ? ?/sec    1.09    119.2±0.46µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, mandatory, no NULLs                                          1.01     39.2±0.31µs        ? ?/sec    1.00     38.9±0.09µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, half NULLs                                         1.01    142.2±3.14µs        ? ?/sec    1.00    140.9±0.58µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, no NULLs                                           1.00     43.1±0.09µs        ? ?/sec    1.01     43.7±0.13µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, mandatory, no NULLs                                     1.00     94.5±0.13µs        ? ?/sec    1.02     96.4±1.15µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, half NULLs                                    1.00    109.5±0.24µs        ? ?/sec    1.01    110.2±0.31µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, no NULLs                                      1.00     97.3±0.22µs        ? ?/sec    1.01     98.7±0.21µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, mandatory, no NULLs                                          1.02    123.7±0.55µs        ? ?/sec    1.00    121.2±0.31µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, half NULLs                                         1.00    177.5±0.37µs        ? ?/sec    1.00    177.0±0.41µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, no NULLs                                           1.02    128.0±0.69µs        ? ?/sec    1.00    125.8±0.41µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, mandatory, no NULLs                              1.00     27.1±0.35µs        ? ?/sec    1.00     27.0±0.22µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, half NULLs                             1.01    128.1±1.29µs        ? ?/sec    1.00    126.5±0.36µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, no NULLs                               1.00     31.5±0.33µs        ? ?/sec    1.00     31.5±0.44µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, mandatory, no NULLs                                     1.00     87.1±0.42µs        ? ?/sec    1.11     96.8±0.35µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, half NULLs                                    1.00    161.0±0.23µs        ? ?/sec    1.02    164.3±0.48µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, no NULLs                                      1.00     91.8±0.28µs        ? ?/sec    1.10    101.3±0.70µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, mandatory, no NULLs                                          1.00     21.5±0.40µs        ? ?/sec    1.02     21.8±0.57µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, half NULLs                                         1.00    123.9±0.57µs        ? ?/sec    1.00    124.2±0.41µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, no NULLs                                           1.00     26.3±0.38µs        ? ?/sec    1.01     26.6±0.37µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, mandatory, no NULLs                                     1.00     87.1±0.25µs        ? ?/sec    1.03     89.4±0.36µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, half NULLs                                    1.00    112.4±0.44µs        ? ?/sec    1.01    113.0±1.44µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, no NULLs                                      1.00     89.3±0.27µs        ? ?/sec    1.03     92.2±0.34µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, mandatory, no NULLs                                          1.00    118.0±0.64µs        ? ?/sec    1.03    121.6±0.59µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, half NULLs                                         1.00    186.2±0.50µs        ? ?/sec    1.05    195.5±0.49µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, no NULLs                                           1.00    120.6±0.44µs        ? ?/sec    1.04    125.3±0.44µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, mandatory, no NULLs                              1.01    151.9±0.56µs        ? ?/sec    1.00    150.7±0.38µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, half NULLs                             1.01    207.6±1.85µs        ? ?/sec    1.00    205.1±0.74µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, no NULLs                               1.01    156.8±0.44µs        ? ?/sec    1.00    155.8±1.35µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, mandatory, no NULLs                                     1.00     93.6±0.75µs        ? ?/sec    1.09    102.2±0.67µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, half NULLs                                    1.02    182.2±0.36µs        ? ?/sec    1.00    178.6±0.45µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, no NULLs                                      1.00     97.3±0.73µs        ? ?/sec    1.11    107.6±0.74µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, mandatory, no NULLs                                          1.00     43.9±0.70µs        ? ?/sec    1.05     46.3±2.01µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, half NULLs                                         1.01    150.1±1.21µs        ? ?/sec    1.00    149.2±0.87µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, no NULLs                                           1.00     50.0±1.04µs        ? ?/sec    1.06     53.0±2.24µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, mandatory, no NULLs                                      1.01    100.9±0.19µs        ? ?/sec    1.00    100.0±0.22µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, half NULLs                                     1.00    114.5±0.43µs        ? ?/sec    1.00    114.9±0.29µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, no NULLs                                       1.01    103.5±0.33µs        ? ?/sec    1.00    102.4±0.46µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, mandatory, no NULLs                                           1.02    132.4±0.16µs        ? ?/sec    1.00    130.2±0.28µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, half NULLs                                          1.00    186.6±0.43µs        ? ?/sec    1.00    187.0±0.46µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, no NULLs                                            1.01    137.0±0.70µs        ? ?/sec    1.00    135.1±0.99µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     35.1±0.08µs        ? ?/sec    1.03     36.2±0.16µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, half NULLs                              1.01    136.9±0.39µs        ? ?/sec    1.00    136.2±1.12µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, no NULLs                                1.00     39.7±0.12µs        ? ?/sec    1.04     41.2±0.20µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, mandatory, no NULLs                                      1.00     97.1±0.52µs        ? ?/sec    1.10    106.7±0.19µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, half NULLs                                     1.00    170.1±1.99µs        ? ?/sec    1.03    174.9±0.37µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, no NULLs                                       1.00    101.6±0.18µs        ? ?/sec    1.10    111.5±0.34µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, mandatory, no NULLs                                           1.00     30.5±0.27µs        ? ?/sec    1.02     31.0±0.16µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, half NULLs                                          1.01    133.3±0.83µs        ? ?/sec    1.00    132.6±0.40µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, no NULLs                                            1.00     35.1±0.25µs        ? ?/sec    1.01     35.6±0.37µs        ? ?/sec

@XiangpengHao

This comment was marked as resolved.

@alamb
Contributor

alamb commented Jul 25, 2025

  1. Update some comments to make it clear the predicate cache only applies to the async decoder
  2. Update the code to avoid breaking API changes
  3. Break out the end-to-end tests + statistics into a separate PR to make review tractable

I will do this once the above two PRs are merged

Revert backwards incompatible changes to the Parquet reader API
Clarify in documentation that cache is only for async decoder
@XiangpengHao
Contributor Author

  1. I don't think it is worth trying to also add the cache to the sync reader. Instead, I think we should pursue a more general solution (see [Epic] Parquet Reader Improvement Plan / Proposal - July 2025 #8000)

I really like #8000, thank you @alamb for writing it up! I'll think about it over the next couple of days.

Contributor

@alamb alamb left a comment

Thank you @XiangpengHao -- I think we should proceed with this PR

I broke out some of the infrastructure into a new PR in case that is easier for other reviewers

What I think we should do is wait until after we cut the next release (ETA early next week) and then merge it in

alamb added a commit that referenced this pull request Aug 1, 2025
# Which issue does this PR close?

We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax.

- Related to #7850

# Rationale for this change

While reviewing #7850 from
@XiangpengHao I found myself wanting even more comments (or maybe I was
doing this as an exercise to load the state back into my head)

In any case, I wrote up some comments that I think would make the code
easier to understand

# What changes are included in this PR?

Add some more docs

# Are these changes tested?

By CI

# Are there any user-facing changes?
No -- this is documentation to internal interfaces

There is no code or functional change
@alamb
Contributor

alamb commented Aug 6, 2025

Now that we have released 56.0.0 and we have a story for why we won't do predicate result caching for the sync reader (namely #7983) I think we are ready to merge this PR

I merged up from main, and I am going to take one more look to make sure there are no breaking API changes

use std::sync::atomic::AtomicUsize;
use std::sync::Arc;

/// This enum represents the state of Arrow reader metrics collection.
Contributor

I also think the addition of metrics will be very helpful for other use cases (as mentioned recently by @mapleFU and @steveloughran)
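
For readers following along, here is a minimal sketch of what an enum-based metrics handle built on `Arc` and `AtomicUsize` could look like. It is not the PR's actual type: `ReaderMetrics`, `MetricsInner`, and the counter names are hypothetical and chosen only for illustration. The point is that a disabled handle does no work, while an enabled one can be cloned and shared cheaply across readers.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical counters; the field names are illustrative, not the PR's.
#[derive(Debug, Default)]
struct MetricsInner {
    records_read_from_cache: AtomicUsize,
    records_read_from_inner: AtomicUsize,
}

/// Hypothetical metrics handle: either disabled (no bookkeeping at all)
/// or enabled with counters shared between clones.
#[derive(Debug, Clone)]
enum ReaderMetrics {
    Disabled,
    Enabled(Arc<MetricsInner>),
}

impl ReaderMetrics {
    fn enabled() -> Self {
        Self::Enabled(Arc::new(MetricsInner::default()))
    }

    /// Count records served from the cache; a no-op when disabled.
    fn record_cache_hit(&self, n: usize) {
        if let Self::Enabled(inner) = self {
            inner.records_read_from_cache.fetch_add(n, Ordering::Relaxed);
        }
    }

    /// Count records decoded by the underlying reader; a no-op when disabled.
    fn record_inner_read(&self, n: usize) {
        if let Self::Enabled(inner) = self {
            inner.records_read_from_inner.fetch_add(n, Ordering::Relaxed);
        }
    }

    /// Returns `None` when metrics are disabled.
    fn cache_hits(&self) -> Option<usize> {
        match self {
            Self::Disabled => None,
            Self::Enabled(inner) => Some(inner.records_read_from_cache.load(Ordering::Relaxed)),
        }
    }

    /// Returns `None` when metrics are disabled.
    fn inner_reads(&self) -> Option<usize> {
        match self {
            Self::Disabled => None,
            Self::Enabled(inner) => Some(inner.records_read_from_inner.load(Ordering::Relaxed)),
        }
    }
}

fn main() {
    let metrics = ReaderMetrics::enabled();
    // Clones share the same counters through the `Arc`.
    let shared = metrics.clone();
    shared.record_cache_hit(100);
    metrics.record_inner_read(28);
    println!("cache hits: {:?}, inner reads: {:?}", metrics.cache_hits(), metrics.inner_reads());
}
```

A test could then assert on these counters to check how many rows came from the cache versus the underlying decoder, which is the kind of use the comment above has in mind.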

Contributor

@alamb alamb left a comment

I took another look at this PR and I think we need to fix the test before merging. Otherwise we are good to go.

I'll follow up with @XiangpengHao and either he or I will fix it

@@ -1832,6 +1882,7 @@ mod tests {
assert_eq!(total_rows, 730);
}

#[ignore]
Contributor

@alamb alamb Aug 6, 2025

I don't think we can merge this PR without un-ignoring this test

I think it is showing a regression. When I looked into it more, it seems that the new cache, even when supposedly disabled, is changing the behavior and fetching more pages.

I think we need to ensure that if the cache is disabled, then the IO behavior is the same as before

Specifically, it looks like we now fetch all the pages, even those that are supposed to be skipped:

Expected page requests: [
    113..222,
    331..440,
    573..682,
    791..900,
    1033..1142,
    1251..1360,
...
Actual page requests: [
    4..113,
    113..222,
    222..331,
    331..440,
    440..573,
    573..682,
    682..791,
    791..900,
    900..1033,
    1033..1142,
    1142..1251,
    1251..1360,
...

Here is the diff I was using to investigate:


diff --git a/parquet/src/arrow/async_reader/mod.rs b/parquet/src/arrow/async_reader/mod.rs
index 843ad766e9..b3da39c48e 100644
--- a/parquet/src/arrow/async_reader/mod.rs
+++ b/parquet/src/arrow/async_reader/mod.rs
@@ -1884,7 +1884,6 @@ mod tests {
         assert_eq!(total_rows, 730);
     }

-    #[ignore]
     #[tokio::test]
     async fn test_in_memory_row_group_sparse() {
         let testdata = arrow::util::test_util::parquet_test_data();
@@ -1925,8 +1924,6 @@ mod tests {
         )
         .unwrap();

-        let _schema_desc = metadata.file_metadata().schema_descr();
-
         let projection = ProjectionMask::leaves(metadata.file_metadata().schema_descr(), vec![0]);

         let reader_factory = ReaderFactory {
@@ -1946,19 +1943,25 @@ mod tests {
         // Setup `RowSelection` so that we can skip every other page, selecting the last page
         let mut selectors = vec![];
         let mut expected_page_requests: Vec<Range<usize>> = vec![];
+        let mut page_idx = 0;
         while let Some(page) = pages.next() {
+
             let num_rows = if let Some(next_page) = pages.peek() {
                 next_page.first_row_index - page.first_row_index
             } else {
                 num_rows - page.first_row_index
             };
+            println!("page {page_idx}: first_row_index={} offset={} compressed_page_size={}, num_rows={num_rows}, skip={skip}", page.first_row_index, page.offset, page.compressed_page_size);
+            page_idx += 1;

+            let start = page.offset as usize;
+            let end = start + page.compressed_page_size as usize;
             if skip {
                 selectors.push(RowSelector::skip(num_rows as usize));
+                println!("  skipping page with {num_rows} rows : {start}..{end}");
             } else {
                 selectors.push(RowSelector::select(num_rows as usize));
-                let start = page.offset as usize;
-                let end = start + page.compressed_page_size as usize;
+                println!("  selecting page with {num_rows} rows: {start}..{end}");
                 expected_page_requests.push(start..end);
             }
             skip = !skip;
@@ -1973,7 +1976,13 @@ mod tests {

         let requests = requests.lock().unwrap();

-        assert_eq!(&requests[..], &expected_page_requests)
+        println!("Expected page requests: {:#?}", &expected_page_requests);
+        println!("Actual page requests: {:#?}", &requests[..]);
+
+        assert_eq!(
+            format!("{:#?}",&expected_page_requests),
+            format!("{:#?}", &requests[..]),
+        );
     }

     #[tokio::test]
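
For readers following along, the expectation that the test encodes can be sketched outside the parquet crate. The code below is only an illustration under stated assumptions: `PageLoc`, `Selector`, and `alternate_pages` are hypothetical stand-ins, not the real `parquet` types or the test's actual helpers. It alternates skip/select over pages and collects the byte ranges of the selected pages, which is what the reader should request when no extra pages are fetched.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a Parquet page location (not the real
/// `parquet` crate type): byte offset, compressed size, first row index.
#[derive(Debug, Clone, Copy)]
pub struct PageLoc {
    pub offset: usize,
    pub compressed_page_size: usize,
    pub first_row_index: usize,
}

/// A run of rows that is either skipped or selected (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub enum Selector {
    Skip(usize),
    Select(usize),
}

/// Alternate skip/select page by page, starting with `skip_first`.
/// Returns the selectors and the byte ranges expected to be fetched.
pub fn alternate_pages(
    pages: &[PageLoc],
    total_rows: usize,
    skip_first: bool,
) -> (Vec<Selector>, Vec<Range<usize>>) {
    let mut selectors = Vec::new();
    let mut expected = Vec::new();
    let mut skip = skip_first;
    for (i, page) in pages.iter().enumerate() {
        // Rows in this page: distance to the next page's first row,
        // or to the end of the row group for the last page.
        let next_first = pages
            .get(i + 1)
            .map(|p| p.first_row_index)
            .unwrap_or(total_rows);
        let num_rows = next_first - page.first_row_index;
        if skip {
            selectors.push(Selector::Skip(num_rows));
        } else {
            selectors.push(Selector::Select(num_rows));
            let start = page.offset;
            expected.push(start..start + page.compressed_page_size);
        }
        skip = !skip;
    }
    (selectors, expected)
}
```

With four pages at offsets 4, 113, 222, and 331 (109 bytes each, and made-up row counts of 10 per page), starting with a skip gives `113..222` and `331..440`, matching the first two entries of the expected list above. Any request for `4..113` or `222..331` would mean a skipped page was fetched.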

@XiangpengHao

yes, will take a look soon

I took another look at this PR and I think we need to fix the test before merging. Otherwise we are good to go.

I'll follow up with @XiangpengHao and either he or I will fix it

@XiangpengHao

I have a few more things I'd like to change, will update here once they're ready

@XiangpengHao XiangpengHao left a comment

I've finished my pass with two new changes, can you take a look? @alamb

No test is ignored now.

}

/// Exclude leaves belonging to roots that span multiple parquet leaves (i.e. nested columns)
fn exclude_nested_columns_from_cache(&self, mask: &ProjectionMask) -> Option<ProjectionMask> {

New change 1: exclude nested column from cache.

Previous behavior: panic.

Caching nested columns is not impossible, but it is very hard. We don't support it yet.

With this change, nested columns fall back to the old implementation, i.e., decoding twice, but at least it will not panic.
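
A rough sketch of the filtering idea, assuming the rule is the one in the doc comment (drop leaves whose root spans more than one Parquet leaf). This is not the PR's implementation: `exclude_nested_leaves` and the `leaf_to_root` slice are hypothetical, and plain index vectors stand in for `ProjectionMask`.

```rust
use std::collections::HashMap;

/// `leaf_to_root[i]` is the index of the root (top-level) column that
/// leaf `i` belongs to. Keep only the requested leaves whose root owns
/// exactly one leaf. Returns `None` when nothing is left to cache.
/// (Hypothetical helper; index vectors stand in for `ProjectionMask`.)
pub fn exclude_nested_leaves(
    leaf_to_root: &[usize],
    requested: &[usize],
) -> Option<Vec<usize>> {
    // Count how many leaves each root column spans.
    let mut leaves_per_root: HashMap<usize, usize> = HashMap::new();
    for &root in leaf_to_root {
        *leaves_per_root.entry(root).or_insert(0) += 1;
    }
    // A leaf is cacheable only if it is the sole leaf under its root.
    let kept: Vec<usize> = requested
        .iter()
        .copied()
        .filter(|&leaf| {
            leaf < leaf_to_root.len() && leaves_per_root[&leaf_to_root[leaf]] == 1
        })
        .collect();
    if kept.is_empty() {
        None
    } else {
        Some(kept)
    }
}
```

For a schema where leaves 1 and 2 both belong to root 1 (say, a struct with two fields), requesting leaves `[0, 1, 3]` keeps `[0, 3]`, and requesting only `[1, 2]` yields `None`, so those columns take the non-cached path.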

@@ -924,7 +1014,15 @@ impl InMemoryRowGroup<'_> {
_ => (),
}

ranges.extend(selection.scan_ranges(&offset_index[idx].page_locations));
// Expand selection to batch boundaries only for cached columns
let use_expanded = cache_mask.map(|m| m.leaf_included(idx)).unwrap_or(false);

New change 2: only expand the selection for cached columns, not for other columns. This should reduce IO.
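
To make "expand to batch boundaries" concrete, here is a minimal sketch under assumptions: the selection is modelled as sorted row ranges rather than a `RowSelection`, a bool slice stands in for `ProjectionMask`, and `expand_to_batch_boundaries` is a hypothetical helper, not a function from the PR. Only `use_expanded` mirrors the check shown in the diff.

```rust
use std::ops::Range;

/// Mirrors the `cache_mask.map(..).unwrap_or(false)` check from the diff,
/// with a plain bool slice standing in for `ProjectionMask` (an assumption).
pub fn use_expanded(cache_mask: Option<&[bool]>, leaf_idx: usize) -> bool {
    cache_mask
        .map(|m| m.get(leaf_idx).copied().unwrap_or(false))
        .unwrap_or(false)
}

/// Round each selected row range outward to multiples of `batch_size`,
/// clamp to `total_rows`, and merge ranges that touch or overlap.
/// Assumes `selected` is sorted by start and non-overlapping.
pub fn expand_to_batch_boundaries(
    selected: &[Range<usize>],
    batch_size: usize,
    total_rows: usize,
) -> Vec<Range<usize>> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut out: Vec<Range<usize>> = Vec::new();
    for r in selected {
        if r.start >= r.end {
            continue; // ignore empty ranges
        }
        let start = (r.start / batch_size) * batch_size;
        let end = (((r.end + batch_size - 1) / batch_size) * batch_size).min(total_rows);
        if let Some(last) = out.last_mut() {
            if start <= last.end {
                // Touches or overlaps the previous expanded range: merge.
                last.end = last.end.max(end);
                continue;
            }
        }
        out.push(start..end);
    }
    out
}
```

With a batch size of 10 and 28 rows, selecting rows `5..7` and `20..25` expands to `0..10` and `20..28`. A column that is not in the cache mask would keep the original, narrower ranges, which is why restricting the expansion to cached columns avoids fetching extra pages for everything else.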

@alamb

alamb commented Aug 8, 2025

Thank you @XiangpengHao -- I am starting to check this out

alamb added a commit to alamb/datafusion that referenced this pull request Aug 8, 2025

@alamb alamb left a comment

Thank you @XiangpengHao

I looked at the code in the latest commits and it looks good to me. I am testing this PR here

Assuming everything looks good I'll merge it in

@@ -1920,7 +1930,6 @@ mod tests {
assert_eq!(total_rows, 730);
}

#[ignore]

❤️

@alamb

alamb commented Aug 8, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing pushdown-v4 (bea4433) to c561acb diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=pushdown-v4
Results will be posted here when complete

@alamb

alamb commented Aug 8, 2025

🤖: Benchmark completed


group                                main                                   pushdown-v4
-----                                ----                                   -----------
arrow_reader_clickbench/async/Q1     1.01      2.4±0.01ms        ? ?/sec    1.00      2.4±0.01ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     10.9±0.15ms        ? ?/sec    1.36     14.8±0.36ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.00     12.9±0.14ms        ? ?/sec    1.29     16.7±0.62ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.25     36.1±0.23ms        ? ?/sec    1.00     28.9±0.47ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.17     49.9±0.41ms        ? ?/sec    1.00     42.6±0.55ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.19     47.6±0.27ms        ? ?/sec    1.00     40.1±0.48ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.00      5.4±0.10ms        ? ?/sec    1.10      5.9±0.17ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.39    171.7±0.69ms        ? ?/sec    1.00    123.1±0.72ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.56    223.7±1.38ms        ? ?/sec    1.00    143.7±0.74ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.68    491.3±2.03ms        ? ?/sec    1.00    291.7±7.17ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.12   495.0±10.25ms        ? ?/sec    1.00    444.0±4.31ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.19     57.4±0.59ms        ? ?/sec    1.00     48.2±0.82ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.57    168.7±0.94ms        ? ?/sec    1.00    107.1±0.61ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.52    165.3±0.89ms        ? ?/sec    1.00    108.6±0.72ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.00     62.1±0.61ms        ? ?/sec    1.03     64.1±0.52ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.32    174.4±0.96ms        ? ?/sec    1.00    132.0±0.99ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.00    101.1±0.77ms        ? ?/sec    1.05    106.5±0.46ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.00     40.9±0.38ms        ? ?/sec    1.01     41.2±0.32ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     50.4±0.33ms        ? ?/sec    1.01     50.9±0.57ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.08     56.1±0.34ms        ? ?/sec    1.00     51.9±0.46ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.01     42.0±0.54ms        ? ?/sec    1.00     41.7±0.32ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.00     15.0±0.23ms        ? ?/sec    1.00     14.9±0.28ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00      2.2±0.01ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.00      9.5±0.06ms        ? ?/sec    1.05     10.0±0.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.00     11.3±0.07ms        ? ?/sec    1.06     12.0±0.12ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.00     38.2±0.19ms        ? ?/sec    1.03     39.5±0.41ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     51.7±0.40ms        ? ?/sec    1.03     53.0±0.28ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     48.4±0.53ms        ? ?/sec    1.05     50.8±0.23ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.00      4.3±0.05ms        ? ?/sec    1.02      4.4±0.04ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00    177.4±1.07ms        ? ?/sec    1.03    182.4±0.91ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    238.5±2.01ms        ? ?/sec    1.04    246.9±2.73ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    484.1±6.12ms        ? ?/sec    1.02    495.2±3.11ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.00   441.8±11.62ms        ? ?/sec    1.00   441.2±14.66ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.00     53.2±1.22ms        ? ?/sec    1.02     54.1±0.97ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    154.5±1.31ms        ? ?/sec    1.02    157.5±0.71ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.00    151.0±0.89ms        ? ?/sec    1.02    154.7±0.59ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.00     60.4±0.35ms        ? ?/sec    1.01     61.3±0.56ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.00    161.3±2.14ms        ? ?/sec    1.01    162.8±0.89ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.00     94.5±0.52ms        ? ?/sec    1.02     96.2±0.75ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     32.7±0.11ms        ? ?/sec    1.00     32.8±0.22ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.02     36.7±0.38ms        ? ?/sec    1.00     36.1±0.73ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     52.6±0.50ms        ? ?/sec    1.01     53.2±0.38ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.00     39.5±0.32ms        ? ?/sec    1.01     40.0±0.32ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.00     14.1±0.12ms        ? ?/sec    1.02     14.4±0.14ms        ? ?/sec

@alamb

alamb commented Aug 8, 2025

Ok, I think we have bikeshed this one enough and let's go!

@alamb alamb merged commit 04f217b into apache:main Aug 8, 2025
16 checks passed
@alamb alamb mentioned this pull request Aug 8, 2025
4 tasks
alamb added a commit that referenced this pull request Aug 15, 2025
# Which issue does this PR close?

- Part of #8000

- Related to #7850

# Rationale for this change

There is quite a bit of code in the current Parquet sync and async
readers related to IO patterns that I do not think is covered by
existing tests. As I refactor the guts of the readers into the
PushDecoder, I would like to ensure we don't introduce regressions in
existing functionality.

I would like to add tests that cover the IO patterns of the Parquet
Reader so I don't break it

# What changes are included in this PR?

Add tests that
1. Create a temporary parquet file with a known row group structure
2. Read data from that file using the Arrow Parquet Reader, recording
the IO operations
3. Assert the expected IO patterns based on the read operations in a
human understandable format

This is done for both the sync and async readers.
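
The recording step can be illustrated with nothing but the standard library. This is a sketch of the general technique, not the test harness from the commit: `RecordingReader` and `read_range` are hypothetical names, and the wrapper assumes the inner reader starts at position 0.

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Wraps a reader and logs the byte range of every successful read.
/// (Hypothetical helper; assumes `inner` starts at position 0.)
pub struct RecordingReader<R> {
    inner: R,
    pos: u64,
    pub log: Vec<Range<u64>>,
}

impl<R> RecordingReader<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, pos: 0, log: Vec::new() }
    }
}

impl<R: Read> Read for RecordingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        if n > 0 {
            // Record exactly which bytes were handed back to the caller.
            self.log.push(self.pos..self.pos + n as u64);
            self.pos += n as u64;
        }
        Ok(n)
    }
}

impl<R: Seek> Seek for RecordingReader<R> {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        // Track the position so later reads are logged at the right offset.
        self.pos = self.inner.seek(from)?;
        Ok(self.pos)
    }
}

/// Seek to `range.start` and read exactly `range.end - range.start` bytes.
pub fn read_range<R: Read + Seek>(reader: &mut R, range: Range<u64>) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; (range.end - range.start) as usize];
    reader.seek(SeekFrom::Start(range.start))?;
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

A test can wrap an in-memory `std::io::Cursor`, drive the reader, and then compare `log` against the ranges it expects. Comparing whole range lists keeps a failure readable: an unexpected extra fetch shows up as an extra entry.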

I am sorry this is such a massive PR, but it is entirely tests and I
think it is quite important. I could break the sync or async tests into
their own PR, but this seems unnecessary

# Are these changes tested?

Yes, indeed the entire PR is only tests


# Are there any user-facing changes?
Labels
parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parquet decoder / decoded predicate / page / results Cache
4 participants