Speed up Parquet filter pushdown v4 (Predicate evaluation cache for async_reader) #7850
Conversation
#[derive(Clone)]
pub struct CacheOptions<'a> {
    pub projection_mask: &'a ProjectionMask,
    pub cache: Arc<Mutex<RowGroupCache>>,
Practically there's no contention because there's no parallelism in decoding a single row group; we add the Mutex only because we need to share the cache behind an Arc.
let row_group_cache = Arc::new(Mutex::new(RowGroupCache::new(
    batch_size,
    // None,
    Some(1024 * 1024 * 100),
This is currently hard-coded; making it configurable through user settings is left as future work.
@@ -613,8 +623,18 @@ where
            .fetch(&mut self.input, predicate.projection(), selection)
            .await?;

        let mut cache_projection = predicate.projection().clone();
        cache_projection.intersect(&projection);
A column is cached if and only if it appears in both the output projection and the filter projection.
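As a sketch of that rule (assuming the same `ProjectionMask::intersect` used in the diff above; the schema here is illustrative):

```rust
use parquet::arrow::ProjectionMask;
use parquet::schema::parser::parse_message_type;
use parquet::schema::types::SchemaDescriptor;
use std::sync::Arc;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let schema = SchemaDescriptor::new(Arc::new(parse_message_type(
        "message schema { REQUIRED INT64 a; REQUIRED INT64 b; REQUIRED INT64 c; }",
    )?));

    // The predicate reads {a, b}; the output projection is {b, c}
    let mut cache_projection = ProjectionMask::leaves(&schema, [0, 1]);
    let output_projection = ProjectionMask::leaves(&schema, [1, 2]);

    // Only `b` survives the intersection, so only `b` is worth caching:
    // `a` is never needed again after the filter, and `c` is never decoded by it.
    cache_projection.intersect(&output_projection);
    assert!(!cache_projection.leaf_included(0));
    assert!(cache_projection.leaf_included(1));
    assert!(!cache_projection.leaf_included(2));
    Ok(())
}
```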
So one thing I didn't understand after reading this PR in detail was how the relative row positions are updated after applying a filter.
For example, if we are applying multiple filters, the first may reduce the original RowSelection down to [100->200], and now when the second filter runs it is only evaluated on those 100->200 rows, not the original selection.
In other words, I think there needs to be some sort of function equivalent to RowSelection::and_then that applies to the cache:
// Narrow the cache so that it only retains the results of evaluating the predicate
let row_group_cache = row_group_cache.and_then(resulting_selection)
Maybe this is the root cause of https://github.com/apache/datafusion/actions/runs/16302299778/job/46039904381?pr=16711
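For reference, `RowSelection::and_then` already does this re-mapping for selections; a cache keyed by absolute row positions would need the equivalent. A minimal sketch of the existing API:

```rust
use parquet::arrow::arrow_reader::{RowSelection, RowSelector};

fn main() {
    // The first predicate narrows 300 rows down to rows 100..200
    let first = RowSelection::from(vec![
        RowSelector::skip(100),
        RowSelector::select(100),
        RowSelector::skip(100),
    ]);

    // The second predicate only sees those 100 rows and keeps the first 40
    let second = RowSelection::from(vec![RowSelector::select(40), RowSelector::skip(60)]);

    // and_then maps the second (relative) selection back onto the original
    // row numbering: rows 100..140 of the row group remain selected
    let combined = first.and_then(&second);
    assert_eq!(combined.row_count(), 40);
}
```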
}

fn get_def_levels(&self) -> Option<&[i16]> {
    None // we don't allow nullable parent for now.
Nested columns are not supported yet.
😮 -- My brain is likely too fried at the moment to review this properly but it is on my list for first thing tomorrow
Thank you @XiangpengHao for the amazing work, I will try to review and test this PR!
TLDR is I think this is really clever - very nice @XiangpengHao . I left some structural comments / suggestions but nothing major.
I will run some more benchmarks, but it was showing very nice improvements for Q21 locally for me (129ms --> 90ms)
If that looks good I'll wire it up in DataFusion and run those benchmarks
Some thoughts:
- I would be happy to wire in the buffering limit / API
- As you say, there are many more improvements possible -- specifically I suspect the RowSelector representation is going to cause us pain and suffering for filters that have many short selections, when bitmaps would be a better choice
Buffering
I think buffering the intermediate filter results is unavoidable if we want to preserve the current behavior of minimizing the size of IO requests
If we want to reduce buffering I think we can only really do it by increasing the number of IO requests (so we can incrementally produce the final output). I think we should proceed with buffering and then tune if/when needed
CacheOptions {
    projection_mask: &cache_projection,
    cache: row_group_cache.clone(),
    role: crate::arrow::array_reader::CacheRole::Producer,
},
Structurally, both here and below, it might help to move the creation of the CacheOptions into the cache itself, so a reader of this code doesn't have to understand the innards of the cache:
CacheOptions {
    projection_mask: &cache_projection,
    cache: row_group_cache.clone(),
    role: crate::arrow::array_reader::CacheRole::Producer,
},
row_group_cache.producer_options(projection, predicate.projection())
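A hypothetical sketch of that suggestion, using stand-in types (neither `SharedRowGroupCache` nor `producer_options` exists in the crate; the field names follow the snippet above):

```rust
use std::sync::{Arc, Mutex};

// Stand-ins for the PR's types, just to show the shape of the suggestion
struct ProjectionMask;
struct RowGroupCache;

#[derive(Clone, Copy)]
enum CacheRole {
    Producer,
    Consumer,
}

struct CacheOptions<'a> {
    projection_mask: &'a ProjectionMask,
    cache: Arc<Mutex<RowGroupCache>>,
    role: CacheRole,
}

/// A handle that owns the shared cache and knows how to build its own options,
/// so call sites don't construct CacheOptions field-by-field.
struct SharedRowGroupCache(Arc<Mutex<RowGroupCache>>);

impl SharedRowGroupCache {
    /// Options for the filter phase, which populates the cache. A real version
    /// might intersect the output and predicate projections internally.
    fn producer_options<'a>(&self, cache_projection: &'a ProjectionMask) -> CacheOptions<'a> {
        CacheOptions {
            projection_mask: cache_projection,
            cache: Arc::clone(&self.0),
            role: CacheRole::Producer,
        }
    }

    /// Options for the output phase, which reads from the cache
    fn consumer_options<'a>(&self, cache_projection: &'a ProjectionMask) -> CacheOptions<'a> {
        CacheOptions {
            projection_mask: cache_projection,
            cache: Arc::clone(&self.0),
            role: CacheRole::Consumer,
        }
    }
}
```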
let start_position = self.outer_position - row_count;

let selection_buffer = row_selection_to_boolean_buffer(row_count, self.selections.iter());
this is clever -- though it will likely suffer from the same "RowSelection is a crappy representation for small selection runs" problem
Yes, this is to alleviate that problem: if we have multiple small selection runs on the same cached batch, we first combine them into a boolean buffer and then do the boolean selection once.
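A sketch of that combine-then-filter-once idea (the helper name is illustrative, not the PR's exact function):

```rust
use arrow_array::{Array, ArrayRef, BooleanArray, Int32Array};
use arrow_buffer::BooleanBufferBuilder;
use parquet::arrow::arrow_reader::RowSelector;
use std::sync::Arc;

/// Build a single boolean mask covering `row_count` rows from a run of
/// selectors, so a cached batch can be filtered once instead of sliced
/// once per RowSelector.
fn selection_to_mask<'a>(
    row_count: usize,
    selectors: impl Iterator<Item = &'a RowSelector>,
) -> BooleanArray {
    let mut builder = BooleanBufferBuilder::new(row_count);
    for s in selectors {
        builder.append_n(s.row_count, !s.skip);
    }
    BooleanArray::new(builder.finish(), None)
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let cached: ArrayRef = Arc::new(Int32Array::from_iter_values(0..10));
    let runs = vec![
        RowSelector::select(2),
        RowSelector::skip(3),
        RowSelector::select(1),
        RowSelector::skip(4),
    ];
    let mask = selection_to_mask(10, runs.iter());
    // One filter call instead of one slice per RowSelector
    let filtered = arrow_select::filter::filter(&cached, &mask)?;
    assert_eq!(filtered.len(), 3);
    Ok(())
}
```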
.expect("data must be already cached in the read_records call, this is a bug"); | ||
let cached = cached.slice(overlap_start - batch_start, selection_length); | ||
let filtered = arrow_select::filter::filter(&cached, &mask_array)?; | ||
selected_arrays.push(filtered); |
You can probably use the new BatchCoalescer here instead: https://docs.rs/arrow/latest/arrow/compute/struct.BatchCoalescer.html
It is definitely faster for primitive arrays and will save intermediate memory usage.
It might have some trouble with StringView as it also tries to gc internally -- we may need to optimize the output to avoid gc'ing if we see the same buffer from call to call.
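A minimal usage sketch of that suggestion (assuming the `arrow::compute::BatchCoalescer` API linked above; the sizes are illustrative):

```rust
use arrow::array::{ArrayRef, Int64Array, RecordBatch};
use arrow::compute::BatchCoalescer;
use arrow::datatypes::{DataType, Field, Schema};
use std::sync::Arc;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let schema = Arc::new(Schema::new(vec![Field::new("v", DataType::Int64, false)]));

    // Push many small filtered pieces and let the coalescer assemble them into
    // target-sized output batches, instead of collecting a Vec of arrays and
    // concatenating at the end.
    let mut coalescer = BatchCoalescer::new(schema.clone(), 4096);
    let mut output_rows = 0;
    for start in (0..10_000i64).step_by(100) {
        let col: ArrayRef = Arc::new(Int64Array::from_iter_values(start..start + 100));
        coalescer.push_batch(RecordBatch::try_new(schema.clone(), vec![col])?)?;
        while let Some(batch) = coalescer.next_completed_batch() {
            output_rows += batch.num_rows();
        }
    }
    // Flush whatever is still buffered as a final (smaller) batch
    coalescer.finish_buffered_batch()?;
    while let Some(batch) = coalescer.next_completed_batch() {
        output_rows += batch.num_rows();
    }
    assert_eq!(output_rows, 10_000);
    Ok(())
}
```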
🤖: Benchmark completed
😎 -- very nice
Great result! I am curious about the performance compared with the no-filter-pushdown case, because the previous attempt also improved performance for this benchmark, but compared to the no-filter-pushdown case it had some regression.
I will try and run this experiment later today
Thank you @alamb. If there is no regression, I believe this PR will also resolve the adaptive selection cases; if there is a regression, we can further combine it with adaptive selection for a final optimization.
I will do this once the above two PRs are merged
- Revert backwards incompatible changes to the Parquet reader API
- Clarify in documentation that cache is only for async decoder
I really like #8000, thank you @alamb for writing it up! I'll think about it over the next couple of days.
Thank you @XiangpengHao -- I think we should proceed with this PR
I broke out some of the infrastructure into a new PR in case that is easier for other reviewers
What I think we should do is wait until after we cut the next release (eta early next week) and then merge it in
# Which issue does this PR close?
We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.
- related to #7850

# Rationale for this change
While reviewing #7850 from @XiangpengHao I found myself wanting even more comments (or maybe I was doing this as an exercise to load the state back into my head). In any case, I wrote up some comments that I think would make the code easier to understand.

# What changes are included in this PR?
Add some more docs

# Are these changes tested?
By CI

# Are there any user-facing changes?
No -- this is documentation of internal interfaces. There is no code or functional change.
Now that we have released 56.0.0 and we have a story for why we won't do predicate result caching for the sync reader (namely #7983), I think we are ready to merge this PR. I merged up from main, and I am going to take one more look to make sure there are no breaking API changes.
use std::sync::atomic::AtomicUsize;
use std::sync::Arc;

/// This enum represents the state of Arrow reader metrics collection.
I also think the addition of metrics will be very helpful for other use cases (as recently mentioned by @mapleFU and @steveloughran)
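Purely as an illustration of the shape implied by the snippet above (the actual variant and counter names in the PR may differ), the metrics handle could look something like:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical sketch: metrics are either disabled (all updates are no-ops)
/// or share a set of atomic counters between the reader and the caller.
#[derive(Clone, Default)]
enum ReaderMetricsSketch {
    #[default]
    Disabled,
    Enabled(Arc<Counters>),
}

#[derive(Default)]
struct Counters {
    records_read_from_cache: AtomicUsize,
}

impl ReaderMetricsSketch {
    fn add_records_read_from_cache(&self, n: usize) {
        if let Self::Enabled(c) = self {
            c.records_read_from_cache.fetch_add(n, Ordering::Relaxed);
        }
    }
}

fn main() {
    let metrics = ReaderMetricsSketch::Enabled(Arc::new(Counters::default()));
    metrics.add_records_read_from_cache(8192);
}
```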
I took another look at this PR and I think we need to fix the test before merging. Otherwise we are good to go.
I'll follow up with @XiangpengHao and either he or I will fix it
@@ -1832,6 +1882,7 @@ mod tests {
        assert_eq!(total_rows, 730);
    }

    #[ignore]
I don't think we can merge this PR without un-ignoring this test
I think it is showing a regression. When I looked into it more, it seems like the new cache, even when supposedly disabled, is changing the behavior and fetching more pages.
I think we need to ensure that if the cache is disabled, then the IO behavior is the same as before
Specifically, it looks like we now fetch all the pages, even those that are supposed to be skipped:
Expected page requests: [
113..222,
331..440,
573..682,
791..900,
1033..1142,
1251..1360,
...
Actual page requests: [
4..113,
113..222,
222..331,
331..440,
440..573,
573..682,
682..791,
791..900,
900..1033,
1033..1142,
1142..1251,
1251..1360,
...
Here is the diff I was using to investigate:
diff --git a/parquet/src/arrow/async_reader/mod.rs b/parquet/src/arrow/async_reader/mod.rs
index 843ad766e9..b3da39c48e 100644
--- a/parquet/src/arrow/async_reader/mod.rs
+++ b/parquet/src/arrow/async_reader/mod.rs
@@ -1884,7 +1884,6 @@ mod tests {
assert_eq!(total_rows, 730);
}
- #[ignore]
#[tokio::test]
async fn test_in_memory_row_group_sparse() {
let testdata = arrow::util::test_util::parquet_test_data();
@@ -1925,8 +1924,6 @@ mod tests {
)
.unwrap();
- let _schema_desc = metadata.file_metadata().schema_descr();
-
let projection = ProjectionMask::leaves(metadata.file_metadata().schema_descr(), vec![0]);
let reader_factory = ReaderFactory {
@@ -1946,19 +1943,25 @@ mod tests {
// Setup `RowSelection` so that we can skip every other page, selecting the last page
let mut selectors = vec![];
let mut expected_page_requests: Vec<Range<usize>> = vec![];
+ let mut page_idx = 0;
while let Some(page) = pages.next() {
+
let num_rows = if let Some(next_page) = pages.peek() {
next_page.first_row_index - page.first_row_index
} else {
num_rows - page.first_row_index
};
+ println!("page {page_idx}: first_row_index={} offset={} compressed_page_size={}, num_rows={num_rows}, skip={skip}", page.first_row_index, page.offset, page.compressed_page_size);
+ page_idx += 1;
+ let start = page.offset as usize;
+ let end = start + page.compressed_page_size as usize;
if skip {
selectors.push(RowSelector::skip(num_rows as usize));
+ println!(" skipping page with {num_rows} rows : {start}..{end}");
} else {
selectors.push(RowSelector::select(num_rows as usize));
- let start = page.offset as usize;
- let end = start + page.compressed_page_size as usize;
+ println!(" selecting page with {num_rows} rows: {start}..{end}");
expected_page_requests.push(start..end);
}
skip = !skip;
@@ -1973,7 +1976,13 @@ mod tests {
let requests = requests.lock().unwrap();
- assert_eq!(&requests[..], &expected_page_requests)
+ println!("Expected page requests: {:#?}", &expected_page_requests);
+ println!("Actual page requests: {:#?}", &requests[..]);
+
+ assert_eq!(
+ format!("{:#?}",&expected_page_requests),
+ format!("{:#?}", &requests[..]),
+ );
}
#[tokio::test]
yes, will take a look soon
I have a few more things I'd like to change, will update here once they're ready
I've finished my pass with two new changes, can you take a look? @alamb
No test is ignored now.
}

/// Exclude leaves belonging to roots that span multiple parquet leaves (i.e. nested columns)
fn exclude_nested_columns_from_cache(&self, mask: &ProjectionMask) -> Option<ProjectionMask> {
New change 1: exclude nested columns from the cache.
Previous behavior: panic.
It's not impossible, but it is very hard to support caching nested columns, and we don't support it yet.
With this change, nested columns fall back to the old implementation, i.e., they are decoded twice, but at least we no longer panic.
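A self-contained sketch of the idea (not the PR's exact implementation; the schema and helper name are illustrative):

```rust
use parquet::arrow::ProjectionMask;
use parquet::schema::parser::parse_message_type;
use parquet::schema::types::SchemaDescriptor;
use std::sync::Arc;

/// Drop any leaf whose root column owns more than one parquet leaf (i.e. a
/// nested column), returning None if nothing cacheable remains.
fn exclude_nested_leaves(schema: &SchemaDescriptor, mask: &ProjectionMask) -> Option<ProjectionMask> {
    // Count how many leaves each root column owns
    let mut leaves_per_root = vec![0usize; schema.root_schema().get_fields().len()];
    for leaf in 0..schema.num_columns() {
        leaves_per_root[schema.get_column_root_idx(leaf)] += 1;
    }
    // Keep only flat (single-leaf) roots that the mask already includes
    let keep: Vec<usize> = (0..schema.num_columns())
        .filter(|&leaf| mask.leaf_included(leaf) && leaves_per_root[schema.get_column_root_idx(leaf)] == 1)
        .collect();
    if keep.is_empty() {
        None
    } else {
        Some(ProjectionMask::leaves(schema, keep))
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let message = "message schema {
        REQUIRED INT64 flat;
        REQUIRED GROUP nested { REQUIRED INT64 a; REQUIRED INT64 b; }
    }";
    let schema = SchemaDescriptor::new(Arc::new(parse_message_type(message)?));
    let cacheable = exclude_nested_leaves(&schema, &ProjectionMask::all()).expect("flat column remains");
    assert!(cacheable.leaf_included(0));  // `flat` can be cached
    assert!(!cacheable.leaf_included(1)); // `nested.a` falls back to re-decoding
    Ok(())
}
```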
@@ -924,7 +1014,15 @@ impl InMemoryRowGroup<'_> {
            _ => (),
        }

        ranges.extend(selection.scan_ranges(&offset_index[idx].page_locations));
        // Expand selection to batch boundaries only for cached columns
        let use_expanded = cache_mask.map(|m| m.leaf_included(idx)).unwrap_or(false);
Change 2: only expand the selection to batch boundaries for the cached columns, not the other columns. This should improve the IO.
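A small sketch of the per-leaf decision (helper name illustrative):

```rust
use parquet::arrow::arrow_reader::RowSelection;
use parquet::arrow::ProjectionMask;

/// Only columns in the cache mask use the batch-aligned selection (so whole
/// batches land in the cache); every other column keeps the tight selection
/// and therefore fetches no extra pages.
fn selection_for_leaf<'a>(
    leaf_idx: usize,
    cache_mask: Option<&ProjectionMask>,
    tight: &'a RowSelection,
    batch_aligned: &'a RowSelection,
) -> &'a RowSelection {
    let cache_this_column = cache_mask.map(|m| m.leaf_included(leaf_idx)).unwrap_or(false);
    if cache_this_column {
        batch_aligned
    } else {
        tight
    }
}

fn main() {
    let tight = RowSelection::from(vec![]);
    let aligned = RowSelection::from(vec![]);
    // No cache mask: every column keeps its tight selection
    let chosen = selection_for_leaf(0, None, &tight, &aligned);
    assert_eq!(chosen.row_count(), 0);
}
```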
Thank you @XiangpengHao -- I am starting to check this out
Thank you @XiangpengHao
I looked at the code in the latest commits and it looks good to me. I am testing this PR here
- Rerunning the performance tests in DataFusion (see apache/datafusion#16711 (comment))
- Rerunning benchmarks on this PR #7850 (comment)
- Running this PR with the new tests from #7971 (see #8096)
Assuming everything looks good I'll merge it in
@@ -1920,7 +1930,6 @@ mod tests {
        assert_eq!(total_rows, 730);
    }

    #[ignore]
❤️
🤖: Benchmark completed
Ok, I think we have bikeshed this one enough and let's go!
# Which issue does this PR close?
- Part of #8000
- Related to #7850

# Rationale for this change
There is quite a bit of code in the current Parquet sync and async readers related to IO patterns that I do not think is covered by existing tests. As I refactor the guts of the readers into the PushDecoder, I would like to ensure we don't introduce regressions in existing functionality, so I would like to add tests that cover the IO patterns of the Parquet Reader so I don't break it.

# What changes are included in this PR?
Add tests which:
1. Create a temporary parquet file with a known row group structure
2. Read data from that file using the Arrow Parquet Reader, recording the IO operations
3. Assert the expected IO patterns based on the read operations, in a human-understandable form

This is done for both the sync and async readers. I am sorry this is such a massive PR, but it is entirely tests and I think it is quite important. I could break the sync or async tests into their own PR, but this seems unnecessary.

# Are these changes tested?
Yes, indeed the entire PR is only tests

# Are there any user-facing changes?
This is my latest attempt to make pushdown faster. Prior art: #6921
cc @alamb @zhuqi-lucas
- Enable parquet filter pushdown (filter_pushdown) by default datafusion#3463

Problems of #6921
This PR takes a different approach: it does not change the decoding pipeline, so we avoid problem 1. It also caches the arrow record batches, so we avoid problem 2.
But this means we need to use more memory to cache data.
How it works?
- It wraps the array_readers with a transparent cached_array_reader.
- The cached_array_reader first consults the RowGroupCache to look for a batch, and only reads from the underlying reader on a cache miss. In a concurrent setup, not all readers may reach their peak at the same time, so the peak system memory usage might be lower.
- If the needed batch is not in the cache (for example because the memory budget was exceeded), the cached_array_reader falls back to reading and decoding from Parquet.
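A stand-in sketch of that wrapping idea (not the PR's actual types or signatures; the real cached_array_reader implements the ArrayReader trait, and the cache keying differs):

```rust
use arrow_array::{Array, ArrayRef};
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for the PR's cache: decoded batches keyed by
/// (leaf column index, batch start row).
#[derive(Default)]
struct RowGroupCacheSketch {
    batches: HashMap<(usize, usize), ArrayRef>,
}

/// Transparent wrapper: the filter phase (producer) populates the cache while
/// it decodes, and the output phase (consumer) reuses the decoded batch
/// instead of decoding the column a second time.
struct CachedArrayReaderSketch {
    column_idx: usize,
    cache: Arc<Mutex<RowGroupCacheSketch>>,
}

impl CachedArrayReaderSketch {
    fn read_batch(&self, batch_start: usize, decode: impl FnOnce() -> ArrayRef) -> ArrayRef {
        let mut cache = self.cache.lock().unwrap();
        cache
            .batches
            .entry((self.column_idx, batch_start))
            .or_insert_with(decode)
            .clone()
    }
}

fn main() {
    use arrow_array::Int32Array;
    let reader = CachedArrayReaderSketch {
        column_idx: 0,
        cache: Arc::new(Mutex::new(RowGroupCacheSketch::default())),
    };
    let decode = || -> ArrayRef { Arc::new(Int32Array::from(vec![1, 2, 3])) };
    // First call decodes and populates the cache
    let first = reader.read_batch(0, decode);
    // Second call for the same batch is served from the cache
    let second = reader.read_batch(0, || unreachable!("already cached"));
    assert_eq!(first.len(), second.len());
}
```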
Other benefits
How does it perform?
My criterion somehow won't produce a result from --save-baseline, so I asked an LLM to generate a table from this benchmark:
- Baseline is the implementation on the current main branch.
- New Unlimited is the new pushdown with an unlimited memory budget.
- New 100MB is the new pushdown with a 100MB memory budget for row group caching.

Limitations
Next steps?
This PR is largely a proof of concept; I want to collect some feedback before sending a multi-thousand-line PR :)
Some items I can think of: