Commit 322745d
Enable parallel writing across row groups when writing encrypted parquet (#8162)
- Closes #8115.
- Closes #8260
- Closes #8259
# Rationale for this change
#8029 introduced `pub ArrowWriter.get_column_writers` and
`pub ArrowWriter.append_row_group` to enable multi-threaded writing of
encrypted parquet. However, downstream testing showed that this API is
not feasible; see #8115.
# What changes are included in this PR?
This introduces `pub ArrowWriter.into_serialized_writer` and deprecates
`pub ArrowWriter.get_column_writers` and `pub
ArrowWriter.append_row_group`. It also makes
`ArrowRowGroupWriterFactory` public and adds a `pub
ArrowRowGroupWriterFactory.create_column_writers`.
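
The intended usage pattern (encode column chunks concurrently across columns and row groups, then assemble them in row group order) can be illustrated with a minimal, standard-library-only sketch. It does not use the parquet crate: `EncodedChunk`, `encode_column`, and `write_parallel` are hypothetical stand-ins for illustration, and the "encoding" is a toy transformation that merely depends on the row group index.

```rust
use std::thread;

/// Hypothetical stand-in for an encoded column chunk (not a parquet crate type).
#[derive(Debug, PartialEq)]
struct EncodedChunk {
    row_group: usize,
    column: usize,
    bytes: Vec<u8>,
}

/// Toy encoder (assumption for illustration): the output depends on the row
/// group index, so each worker must know which row group it is encoding.
fn encode_column(row_group: usize, column: usize, values: &[u8]) -> EncodedChunk {
    let bytes: Vec<u8> = values
        .iter()
        .map(|v| v.wrapping_add(row_group as u8))
        .collect();
    EncodedChunk { row_group, column, bytes }
}

/// Encode every column of every row group on its own thread, then collect
/// the chunks grouped by row group, in row group and column order.
fn write_parallel(row_groups: Vec<Vec<Vec<u8>>>) -> Vec<Vec<EncodedChunk>> {
    // Spawn one worker per (row group, column) pair.
    let handles: Vec<Vec<thread::JoinHandle<EncodedChunk>>> = row_groups
        .into_iter()
        .enumerate()
        .map(|(rg, columns)| {
            columns
                .into_iter()
                .enumerate()
                .map(|(col, values)| thread::spawn(move || encode_column(rg, col, &values)))
                .collect::<Vec<_>>()
        })
        .collect();

    // Joining in spawn order keeps the result deterministic even though the
    // workers may finish in any order.
    handles
        .into_iter()
        .map(|cols| {
            cols.into_iter()
                .map(|h| h.join().expect("encoder thread panicked"))
                .collect::<Vec<_>>()
        })
        .collect()
}
```

Calling `write_parallel` with two row groups of two columns each returns two inner vectors whose chunks carry their row group and column indices. In real code, the per-row-group column writers would presumably come from `ArrowRowGroupWriterFactory.create_column_writers`, and the finished chunks would be appended through the writer returned by `ArrowWriter.into_serialized_writer`.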
# Are these changes tested?
This includes a DataFusion-inspired test for concurrent writing across
columns and row groups to make sure parallel writing is, and remains,
possible with `ArrowWriter`'s API. Further, we created a draft PR in
DataFusion, apache/datafusion#16738, to test for
multithreaded writing support.
# Are there any user-facing changes?
See description of changes.
---------
Co-authored-by: Adam Reeve <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>

1 parent f4840f6 commit 322745d
File tree
4 files changed, +391 -94 lines changed
- parquet
  - src/arrow
    - arrow_writer
    - async_writer
  - tests/encryption