benchmark: Add parquet h2o support #16804
Conversation
Updated: parquet data generation errors out for the join dataset, though it works for group by: ./bench.sh data h2o_medium_join_parquet
***************************
DataFusion Benchmark Runner and Data Generator
COMMAND: data
BENCHMARK: h2o_medium_join_parquet
DATA_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/data
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: true
***************************
Found Python version 3.13, which is suitable.
Using Python command: /opt/homebrew/bin/python3
Installing falsa...
Generating h2o test data in /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o with size=MEDIUM and format=PARQUET
100 rows will be saved into: /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e8_1e2_0.parquet
100000 rows will be saved into: /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e8_1e5_0.parquet
100000000 rows will be saved into: /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e8_1e8_NA.parquet
An SMALL data schema is the following:
id1: int64 not null
id4: string
v2: double
An output format is PARQUET
Batch mode is supported.
In case of memory problems you can try to reduce a batch_size.
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% -:--:--
╭──────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────────────────────────────────────────────╮
│ /Users/zhuqi/arrow-datafusion/benchmarks/venv/lib/python3.13/site-packages/falsa/app.py:144 in join │
│ │
│ 141 │ ) │
│ 142 │ │
│ 143 │ for batch in track(join_small.iter_batches(), total=len(join_small.batches)): │
│ ❱ 144 │ │ writer_small.write_batch(batch) │
│ 145 │ writer_small.close() │
│ 146 │ │
│ 147 │ if data_format is Format.DELTA: │
│ │
│ ╭──────────────────────────────────────────────────────────────────────────────────────────────────── locals ────────────────────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ batch = pyarrow.RecordBatch │ │
│ │ id1: int64 not null │ │
│ │ id4: string not null │ │
│ │ v2: double not null │ │
│ │ ---- │ │
│ │ id1: [12,77,106,10,52,105,29,64,46,51,...,110,82,72,8,1,104,69,5,44,25] │ │
│ │ id4: ["id12","id77","id106","id10","id52","id105","id29","id64","id46","id51",...,"id110","id82","id72","id8","id1","id104","id69","id5","id44","id25"] │ │
│ │ v2: │ │
│ │ [53.0075954693085,32.410072200393316,72.68372205230826,71.61363809771296,86.99915627358179,15.539006557813716,23.840799451398684,7.23383214431385,6.366591524991982,20.222312628293857… │ │
│ │ batch_size = 5000000 │ │
│ │ data_filename_big = 'J1_1e8_1e8_NA.parquet' │ │
│ │ data_filename_lhs = 'J1_1e8_NA_0.parquet' │ │
│ │ data_filename_medium = 'J1_1e8_1e5_0.parquet' │ │
│ │ data_filename_small = 'J1_1e8_1e2_0.parquet' │ │
│ │ data_format = <Format.PARQUET: 'PARQUET'> │ │
│ │ generation_seed = 6839596180442651345 │ │
│ │ join_big = <falsa.local_fs.JoinBigGenerator object at 0x105e2fe00> │ │
│ │ join_lhs = <falsa.local_fs.JoinLHSGenerator object at 0x105e2da90> │ │
│ │ join_medium = <falsa.local_fs.JoinMediumGenerator object at 0x105e2fcb0> │ │
│ │ join_small = <falsa.local_fs.JoinSmallGenerator object at 0x105e2fb60> │ │
│ │ k = 10 │ │
│ │ keys_seed = 1026847926404610461 │ │
│ │ n_big = 100000000 │ │
│ │ n_medium = 100000 │ │
│ │ n_small = 100 │ │
│ │ nas = 0 │ │
│ │ output_big = PosixPath('/Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e8_1e8_NA.parquet') │ │
│ │ output_dir = PosixPath('/Users/zhuqi/arrow-datafusion/benchmarks/data/h2o') │ │
│ │ output_lhs = PosixPath('/Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e8_NA_0.parquet') │ │
│ │ output_medium = PosixPath('/Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e8_1e5_0.parquet') │ │
│ │ output_small = PosixPath('/Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e8_1e2_0.parquet') │ │
│ │ path_prefix = '/Users/zhuqi/arrow-datafusion/benchmarks/data/h2o' │ │
│ │ seed = 42 │ │
│ │ size = <Size.MEDIUM: 'MEDIUM'> │ │
│ │ writer_small = <pyarrow.parquet.core.ParquetWriter object at 0x1067d4050> │ │
│ ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/zhuqi/arrow-datafusion/benchmarks/venv/lib/python3.13/site-packages/pyarrow/parquet/core.py:1089 in write_batch │
│ │
│ 1086 │ │ │ will be used instead. │
│ 1087 │ │ """ │
│ 1088 │ │ table = pa.Table.from_batches([batch], batch.schema) │
│ ❱ 1089 │ │ self.write_table(table, row_group_size) │
│ 1090 │ │
│ 1091 │ def write_table(self, table, row_group_size=None): │
│ 1092 │ │ """ │
│ │
│ ╭──────────────────────────────────────────────────────────────────────────────────────────────────── locals ────────────────────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ batch = pyarrow.RecordBatch │ │
│ │ id1: int64 not null │ │
│ │ id4: string not null │ │
│ │ v2: double not null │ │
│ │ ---- │ │
│ │ id1: [12,77,106,10,52,105,29,64,46,51,...,110,82,72,8,1,104,69,5,44,25] │ │
│ │ id4: ["id12","id77","id106","id10","id52","id105","id29","id64","id46","id51",...,"id110","id82","id72","id8","id1","id104","id69","id5","id44","id25"] │ │
│ │ v2: │ │
│ │ [53.0075954693085,32.410072200393316,72.68372205230826,71.61363809771296,86.99915627358179,15.539006557813716,23.840799451398684,7.23383214431385,6.366591524991982,20.222312628293857,...,6… │ │
│ │ row_group_size = None │ │
│ │ self = <pyarrow.parquet.core.ParquetWriter object at 0x1067d4050> │ │
│ │ table = pyarrow.Table │ │
│ │ id1: int64 not null │ │
│ │ id4: string not null │ │
│ │ v2: double not null │ │
│ │ ---- │ │
│ │ id1: [[12,77,106,10,52,...,104,69,5,44,25]] │ │
│ │ id4: [["id12","id77","id106","id10","id52",...,"id104","id69","id5","id44","id25"]] │ │
│ │ v2: │ │
│ │ [[53.0075954693085,32.410072200393316,72.68372205230826,71.61363809771296,86.99915627358179,...,26.7118533955444,73.44416011403574,93.63022604514522,51.816253173876824,78.95727980955964]] │ │
│ ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /Users/zhuqi/arrow-datafusion/benchmarks/venv/lib/python3.13/site-packages/pyarrow/parquet/core.py:1113 in write_table │
│ │
│ 1110 │ │ │ msg = ('Table schema does not match schema used to create file: ' │
│ 1111 │ │ │ │ '\ntable:\n{!s} vs. \nfile:\n{!s}' │
│ 1112 │ │ │ │ .format(table.schema, self.schema)) │
│ ❱ 1113 │ │ │ raise ValueError(msg) │
│ 1114 │ │ │
│ 1115 │ │ self.writer.write_table(table, row_group_size=row_group_size) │
│ 1116 │
│ │
│ ╭──────────────────────────────────────────────────────────────────────────────────────────────────── locals ────────────────────────────────────────────────────────────────────────────────────────────────────╮ │
│ │ msg = 'Table schema does not match schema used to create file: \ntable:\nid1: int64 not n'+98 │ │
│ │ row_group_size = None │ │
│ │ self = <pyarrow.parquet.core.ParquetWriter object at 0x1067d4050> │ │
│ │ table = pyarrow.Table │ │
│ │ id1: int64 not null │ │
│ │ id4: string not null │ │
│ │ v2: double not null │ │
│ │ ---- │ │
│ │ id1: [[12,77,106,10,52,...,104,69,5,44,25]] │ │
│ │ id4: [["id12","id77","id106","id10","id52",...,"id104","id69","id5","id44","id25"]] │ │
│ │ v2: │ │
│ │ [[53.0075954693085,32.410072200393316,72.68372205230826,71.61363809771296,86.99915627358179,...,26.7118533955444,73.44416011403574,93.63022604514522,51.816253173876824,78.95727980955964]] │ │
│ ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Table schema does not match schema used to create file:
table:
id1: int64 not null
id4: string not null
v2: double not null vs.
file:
id1: int64 not null
id4: string
v2: double
Filed an issue on the falsa side: it fails to generate parquet data for the join set, but works well for group by.
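For context, the ValueError above is pyarrow's ParquetWriter rejecting a batch whose schema differs from the schema the writer was opened with: the file schema (see "file:" in the message) leaves id4 and v2 nullable, while the generated batches declare every column not null. Below is a minimal, self-contained sketch of that failure mode and one generic way around it; the filename and sample values are illustrative, and the actual upstream fix in falsa may have taken a different route.

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Schema the writer is opened with: id4 and v2 nullable,
# matching the "file:" schema in the error above.
file_schema = pa.schema([
    pa.field("id1", pa.int64(), nullable=False),
    pa.field("id4", pa.string()),   # nullable
    pa.field("v2", pa.float64()),   # nullable
])
writer = pq.ParquetWriter("J1_demo.parquet", file_schema)  # illustrative filename

# The generator's batches declare every column non-null,
# matching the "table:" schema in the error.
batch_schema = pa.schema([
    pa.field("id1", pa.int64(), nullable=False),
    pa.field("id4", pa.string(), nullable=False),
    pa.field("v2", pa.float64(), nullable=False),
])
batch = pa.record_batch(
    [pa.array([12, 77]), pa.array(["id12", "id77"]), pa.array([53.0, 32.4])],
    schema=batch_schema,
)

try:
    writer.write_batch(batch)  # ValueError: Table schema does not match schema used to create file
except ValueError as err:
    print(err)

# One generic workaround: cast to the writer's schema before writing.
# A nullability-only cast is metadata-level and effectively free.
writer.write_table(pa.Table.from_batches([batch]).cast(file_schema))
writer.close()
```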
Update: it works now; falsa has merged and released the fix: mrpowers-io/falsa#28. ./bench.sh data h2o_small_join_parquet
***************************
DataFusion Benchmark Runner and Data Generator
COMMAND: data
BENCHMARK: h2o_small_join_parquet
DATA_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/data
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: true
***************************
Found Python version 3.13, which is suitable.
Using Python command: /opt/homebrew/bin/python3
Installing falsa...
Generating h2o test data in /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o with size=SMALL and format=PARQUET
10 rows will be saved into: /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e7_1e1_0.parquet
10000 rows will be saved into: /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e7_1e4_0.parquet
10000000 rows will be saved into: /Users/zhuqi/arrow-datafusion/benchmarks/data/h2o/J1_1e7_1e7_NA.parquet
An SMALL data schema is the following:
id1: int64 not null
id4: string not null
v2: double not null
An output format is PARQUET
Batch mode is supported.
In case of memory problems you can try to reduce a batch_size.
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
An MEDIUM data schema is the following:
id1: int64 not null
id2: int64 not null
id4: string not null
id5: string not null
v2: double not null
An output format is PARQUET
Batch mode is supported.
In case of memory problems you can try to reduce a batch_size.
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
An BIG data schema is the following:
id1: int64 not null
id2: int64 not null
id3: int64 not null
id4: string not null
id5: string not null
id6: string not null
v2: double not null
An output format is PARQUET
Batch mode is supported.
In case of memory problems you can try to reduce a batch_size.
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:02
An LSH data schema is the following:
id1: int64 not null
id2: int64 not null
id3: int64 not null
id4: string not null
id5: string not null
id6: string not null
v1: double not null
An output format is PARQUET
Batch mode is supported.
In case of memory problems you can try to reduce a batch_size.
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
@zhuqi-lucas In case falsa generates too much noise in stdout (and in the runner's output), I can add a command-line argument to suppress it. Something like …
Thank you @SemyonSinchenko, it's fine for benchmark data generation; it's useful to see the detailed info.
Force-pushed from 689d67d to 7bcedd5
Thank you! LGTM. I have also tested it locally.
h2o_small_window: Extended h2oai benchmark with small dataset (1e7 rows) for window, default file format is csv
h2o_medium_window: Extended h2oai benchmark with medium dataset (1e8 rows) for window, default file format is csv
h2o_big_window: Extended h2oai benchmark with large dataset (1e9 rows) for window, default file format is csv
h2o_small: h2oai benchmark with small dataset (1e7 rows) for groupby, default file format is csv
Later, we can clean it up with additional size/format options, like:
./bench.sh run h2o_join medium parquet
Good suggestion @2010YOUY01, that's clearer.
benchmarks/bench.sh (outdated)
@@ -775,6 +840,7 @@ data_h2o() {
# Set virtual environment directory
VIRTUAL_ENV="${PWD}/venv"
rm -rf "$VIRTUAL_ENV"
Could you add a comment for this line?
Thank you @2010YOUY01, I removed this line in the latest PR; I think it was leftover test code I had added.
Thank you @2010YOUY01 for the review!
Thanks @zhuqi-lucas! The CSV files were taking much of the benchmark time; this should be a nice improvement.
Looks good to me -- thank you @zhuqi-lucas and @2010YOUY01
I think the CSV tests are quite important as that is what the original benchmark uses (and yes that means it is largely a test of CSV performance)
Thank you @alamb @jonathanc-n for the review.
Which issue does this PR close?
The h2o benchmark currently only supports the CSV format, while the comparison results from other databases use Parquet, so this PR adds Parquet format support to DataFusion's h2o benchmark.
Details:
#16710 (comment)
cc @alamb @Dandandan @2010YOUY01
Rationale for this change
Same as above: the published h2o comparisons with other databases are run against Parquet data, so DataFusion should be able to benchmark that format rather than CSV only.
What changes are included in this PR?
Parquet format support for the h2o benchmark's data generation and run targets, alongside the existing CSV support.
Are these changes tested?
Yes; tested both the group by and join benchmarks locally, and both work now.
Are there any user-facing changes?
Yes, new Parquet format options for the h2o benchmark.