feat(benchmark) : added a dedicated page for benchmarking in contributor-guide #18367

manasa-manoj-nbr · 2025-10-29T21:01:38Z

Which issue does this PR close?

Closes #17811 (Add a page to describe the bench code we have).

Rationale for this change

The DataFusion project has an extensive benchmarking infrastructure with many different benchmark types (TPCH, ClickBench, IMDB, H2O.ai, micro-benchmarks, etc.) scattered across README files and code comments. Contributors need a centralized, easily discoverable resource to understand what benchmarks are available, how to use them for validating performance changes, and where to add new benchmark code. This addresses the maintainer's request in issue #17811 to create a dedicated documentation page describing all the benchmark code we have.

What changes are included in this PR?

- Created docs/source/contributor-guide/benchmarking.md: A comprehensive documentation page covering all DataFusion benchmarks, organized by category (Performance Benchmarks, Specialized Benchmarks, Micro-benchmarks)
- Updated docs/source/index.rst: Added the new benchmarking page to the Contributor Guide navigation structure
- Updated docs/source/contributor-guide/testing.md: Added a cross-reference to the new dedicated benchmarking page in the existing benchmarks section

The new documentation consolidates information about:

- All major benchmark suites (TPCH, ClickBench, IMDB, H2O.ai, Sort, External Aggregation, etc.)
- Usage instructions for the bench.sh script and the dfbench binary
- Configuration options and environment variables
- Guidelines for adding new benchmarks
- Troubleshooting common issues

Are these changes tested?

- Documentation builds successfully without warnings or errors
- Navigation structure tested - new page appears correctly in Contributor Guide menu
- Internal links verified - all cross-references and links work properly
- Content accuracy verified - all benchmark information sourced from official /benchmarks/README.md and existing documentation

Are there any user-facing changes?

No Breaking Changes:
- No changes to APIs, CLIs, or runtime behavior
- No changes to existing benchmark functionality
- Purely additive documentation enhancement


2010YOUY01 · 2025-10-30T06:12:39Z

Thank you for the contribution.

The original issue suggests adding a contributor guide page for the micro-benchmarks scattered across the codebase, whereas this PR covers the end-to-end benchmarks. We already have a doc for those: https://github.com/apache/datafusion/blob/main/benchmarks/README.md
I think it's a good idea to move that doc into the contributor guide, so we don't have to generate a new one.

manasa-manoj-nbr added 2 commits October 30, 2025 02:24

feat(benchmark) : added a dedicated page for benchmarking in contribu…

5790f86

…tion-guide

feat(benchmark) : apply prettier formatter

e0c73e3

github-actions bot added the documentation label Oct 29, 2025
