Zero Collision Hash Benchmark Framework #3127

lizhouyu · 2025-06-23T05:40:46Z

Differential Revision: D77033290

facebook-github-bot · 2025-06-23T05:41:06Z

This pull request was exported from Phabricator. Differential Revision: D77033290

facebook-github-bot · 2025-07-06T01:34:48Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

facebook-github-bot · 2025-07-06T04:35:07Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

facebook-github-bot · 2025-07-06T04:59:24Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

facebook-github-bot · 2025-07-06T05:33:03Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

facebook-github-bot · 2025-07-08T00:08:59Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

facebook-github-bot · 2025-07-09T17:56:49Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

facebook-github-bot · 2025-07-14T20:27:09Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

facebook-github-bot · 2025-07-15T20:34:57Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: https://github.com/pytorch/torchrec/pull/3127

## Context

The high-level intention of this diff is to open-source (OSS) the benchmark testbed for hash collision management algorithms. The goal is to provide a testbed for accurate and fair benchmarking.

### Major Changes

**Summary**

This diff introduces a new benchmark testbed for hash collision management modules. The framework includes several new files and modules:

* `torchrec/distributed/benchmark/benchmark_zch/models`: a folder for configuration files and wrapper classes for the models to benchmark.
* `torchrec/distributed/benchmark/benchmark_zch/data`: a folder for configuration files and wrapper classes for the datasets used for benchmarking.
* `torchrec/modules/mc_adapter.py`: a new module that implements the MC Adapter algorithm, which plugs hash collision management modules into the embedding collection modules of existing and future models in a plug-and-play manner. The adapter mimics all the APIs of the embedding collection and embedding bag collection modules, with a managed collision module called before the embedding look-up.
* `torchrec/distributed/benchmark/benchmark_zch/data`: a new module for data-related functions, currently empty.
* `torchrec/distributed/benchmark/benchmark_zch/benchmark_zch.py`: a new script that runs the ZCH benchmark.
* `torchrec/distributed/benchmark/benchmark_zch/plots`: a new module for plotting training metrics, including an example notebook `plot_training_metrics.ipynb`.

The diff includes a significant amount of new code, including model definitions, data loading, and plotting functions. The `benchmark_zch.py` script is the main entry point for running the ZCH benchmark.

### Key Features

* Implements the MC Adapter algorithm for ZCH models.
* Includes a new benchmark framework for evaluating ZCH models.
* Provides data loading and plotting functions for training metrics.

### Implications

This diff provides a comprehensive framework for evaluating and optimizing ZCH models. The MC Adapter algorithm is a key component of this framework, and the benchmark script provides a convenient way to run and compare different ZCH models. The plotting functions allow for easy visualization of training metrics, facilitating model optimization and improvement.

Differential Revision: D77033290

facebook-github-bot · 2025-07-15T21:27:17Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: https://github.com/pytorch/torchrec/pull/3127

## Context

The high-level intention of this diff is to open-source (OSS) the benchmark testbed for hash collision management algorithms. The goal is to provide a testbed for accurate and fair benchmarking.

### Major Changes

**Summary**

This diff introduces a new benchmark testbed for hash collision management modules. The framework includes several new files and modules:

* `torchrec/distributed/benchmark/benchmark_zch/models`: a folder for configuration files and wrapper classes for the models to benchmark.
* `torchrec/distributed/benchmark/benchmark_zch/data`: a folder for configuration files and wrapper classes for the datasets used for benchmarking.
* `torchrec/modules/mc_adapter.py`: a new module that implements the MC Adapter algorithm, which plugs hash collision management modules into the embedding collection modules of existing and future models in a plug-and-play manner. The adapter mimics all the APIs of the embedding collection and embedding bag collection modules, with a managed collision module called before the embedding look-up.
* `torchrec/distributed/benchmark/benchmark_zch/benchmark_zch.py`: the main entry point of the benchmark testbed.
* `torchrec/distributed/benchmark/benchmark_zch/plots`: a folder for plotting notebooks for training and evaluation metrics.

### Key Features

* Implements the MC Adapter algorithm for ZCH models.
* Includes a new benchmark framework for evaluating hash collision management models.
* Provides data loading and plotting functions for training metrics.
* Writes metrics to TensorBoard during training so that users can inspect results in real time.

### Implications

This diff provides a comprehensive framework for evaluating and optimizing hash collision management models. The MC Adapter algorithm is a key component of this framework, and the benchmark script provides a unified, convenient way to run and compare different hash collision management models. The plotting functions allow for easy visualization of training metrics, facilitating model optimization and improvement.

Differential Revision: D77033290

facebook-github-bot · 2025-07-16T02:07:23Z

This pull request was exported from Phabricator. Differential Revision: D77033290


facebook-github-bot · 2025-07-17T03:38:27Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: https://github.com/pytorch/torchrec/pull/3127

## Context

The high-level intention of this diff is to open-source (OSS) the benchmark testbed for hash collision management algorithms. The goal is to provide a testbed for accurate and fair benchmarking.

### Major Changes

**Summary**

This diff introduces a new benchmark testbed for hash collision management modules. The framework includes several new files and modules:

* `torchrec/distributed/benchmark/benchmark_zch/models`: a folder for configuration files and wrapper classes for the models to benchmark.
* `torchrec/distributed/benchmark/benchmark_zch/data`: a folder for configuration files and wrapper classes for the datasets used for benchmarking. It also includes a pre-hash script, `sparse_kuairand_dataset.py`, which takes the KuaiRand dataset as input and distributes the input values evenly across the input hash space.
* `torchrec/modules/mc_adapter.py`: a new module that implements the MC Adapter algorithm, which plugs hash collision management modules into the embedding collection modules of existing and future models in a plug-and-play manner. The adapter mimics all the APIs of the embedding collection and embedding bag collection modules, with a managed collision module called before the embedding look-up.
* `torchrec/distributed/benchmark/benchmark_zch/benchmark_zch.py`: the main entry point of the benchmark testbed.
* `torchrec/distributed/benchmark/benchmark_zch/plots`: a folder for plotting notebooks for training and evaluation metrics.

### Key Features

* Implements the MC Adapter algorithm for ZCH models.
* Includes a new benchmark framework for evaluating hash collision management models.
* Provides data loading and plotting functions for training metrics.
* Writes metrics to TensorBoard during training so that users can inspect results in real time.

### Implications

This diff provides a comprehensive framework for evaluating and optimizing hash collision management models. The MC Adapter algorithm is a key component of this framework, and the benchmark script provides a unified, convenient way to run and compare different hash collision management models. The plotting functions allow for easy visualization of training metrics, facilitating model optimization and improvement.

Reviewed By: aporialiao

Differential Revision: D77033290
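The pre-hash step described for `sparse_kuairand_dataset.py` can be illustrated with a rough sketch. This is not the script's actual code: the function names `pre_hash` and `pre_hash_column` are hypothetical, and the choice of SHA-256 reduced modulo the hash space size is an assumption made only to show how clustered raw IDs can be spread evenly over a large input hash space.

```python
import hashlib
from typing import Iterable, List


def pre_hash(raw_id: int, hash_space_size: int) -> int:
    """Map one raw ID to a pseudo-uniform position in [0, hash_space_size).

    Assumption: a cryptographic hash is used as the mixing function; the real
    script may use a different scheme.
    """
    digest = hashlib.sha256(str(raw_id).encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then fold into the space.
    return int.from_bytes(digest[:8], "big") % hash_space_size


def pre_hash_column(raw_ids: Iterable[int], hash_space_size: int) -> List[int]:
    """Apply pre_hash to every value of one sparse feature column."""
    return [pre_hash(raw_id, hash_space_size) for raw_id in raw_ids]


# Densely packed raw IDs (0..999) end up scattered across a 2**32 space.
hashed = pre_hash_column(range(1000), 2**32)
print(min(hashed), max(hashed))
```

The mapping is deterministic, so the same raw ID always lands on the same position, which keeps training and evaluation inputs consistent.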

facebook-github-bot · 2025-07-17T03:53:45Z

This pull request was exported from Phabricator. Differential Revision: D77033290


facebook-github-bot · 2025-07-17T03:59:12Z

This pull request was exported from Phabricator. Differential Revision: D77033290

Summary: Pull Request resolved: https://github.com/pytorch/torchrec/pull/3127

## Context

The high-level intention of this diff is to open-source (OSS) the benchmark testbed for hash collision management algorithms. The goal is to provide a testbed for accurate and fair benchmarking.

### Major Changes

**Summary**

This diff introduces a new benchmark testbed for hash collision management modules. The framework includes several new files and modules:

* `torchrec/distributed/benchmark/benchmark_zch/models`: a folder for configuration files and wrapper classes for the models to benchmark.
* `torchrec/distributed/benchmark/benchmark_zch/data`: a folder for configuration files and wrapper classes for the datasets used for benchmarking. It also includes a pre-hash script, `sparse_kuairand_dataset.py`, which takes the KuaiRand dataset as input and distributes the input values evenly across the input hash space.
* `torchrec/modules/mc_adapter.py`: a new module that implements the MC Adapter algorithm, which plugs hash collision management modules into the embedding collection modules of existing and future models in a plug-and-play manner. The adapter mimics all the APIs of the embedding collection and embedding bag collection modules, with a managed collision module called before the embedding look-up.
* `torchrec/distributed/benchmark/benchmark_zch/benchmark_zch.py`: the main entry point of the benchmark testbed.
* `torchrec/distributed/benchmark/benchmark_zch/plots`: a folder for plotting notebooks for training and evaluation metrics.

### Key Features

* Implements the MC Adapter algorithm for ZCH models.
* Includes a new benchmark framework for evaluating hash collision management models.
* Provides data loading and plotting functions for training metrics.
* Writes metrics to TensorBoard during training so that users can inspect results in real time.

### Implications

This diff provides a comprehensive framework for evaluating and optimizing hash collision management models. The MC Adapter algorithm is a key component of this framework, and the benchmark script provides a unified, convenient way to run and compare different hash collision management models. The plotting functions allow for easy visualization of training metrics, facilitating model optimization and improvement.

Reviewed By: aporialiao

Differential Revision: D77033290

fbshipit-source-id: 298a1c6e1ab858641992db7362ccf227725ec12b
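The adapter idea in the summary (same look-up interface as an embedding module, with a managed collision module called first) can be sketched in plain Python. This is a conceptual toy, not the code in `torchrec/modules/mc_adapter.py` and not the TorchRec API: the classes `ToyManagedCollision`, `ToyEmbeddingTable`, and `ToyMCAdapter` are hypothetical, and the remapping policy (first-come slot assignment with a modulo fallback when the table is full) is an assumption chosen for brevity.

```python
from typing import Dict, List


class ToyManagedCollision:
    """Hypothetical managed collision module: maps raw IDs to table slots."""

    def __init__(self, num_slots: int) -> None:
        self.num_slots = num_slots
        self.slot_of: Dict[int, int] = {}

    def remap(self, raw_ids: List[int]) -> List[int]:
        slots = []
        for raw_id in raw_ids:
            # While free slots remain, each new ID gets its own slot.
            if raw_id not in self.slot_of and len(self.slot_of) < self.num_slots:
                self.slot_of[raw_id] = len(self.slot_of)
            # Known IDs use their slot; overflow IDs fall back to modulo hashing,
            # which may collide.
            slots.append(self.slot_of.get(raw_id, raw_id % self.num_slots))
        return slots


class ToyEmbeddingTable:
    """Hypothetical embedding table with a single lookup method."""

    def __init__(self, num_slots: int, dim: int) -> None:
        # Row i is filled with the value i so look-ups are easy to inspect.
        self.weights = [[float(slot)] * dim for slot in range(num_slots)]

    def lookup(self, ids: List[int]) -> List[List[float]]:
        return [self.weights[i] for i in ids]


class ToyMCAdapter:
    """Exposes the same lookup() as the table, but remaps IDs first."""

    def __init__(self, table: ToyEmbeddingTable, mc: ToyManagedCollision) -> None:
        self.table = table
        self.mc = mc

    def lookup(self, raw_ids: List[int]) -> List[List[float]]:
        return self.table.lookup(self.mc.remap(raw_ids))


adapter = ToyMCAdapter(ToyEmbeddingTable(num_slots=4, dim=2), ToyManagedCollision(num_slots=4))
print(adapter.lookup([10**12, 7, 10**12]))
```

Because the adapter keeps the table's interface, calling code does not need to change when collision management is switched on, which is the plug-and-play property the summary describes.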

facebook-github-bot added the CLA Signed label Jun 23, 2025

facebook-github-bot added the fb-exported label Jun 23, 2025

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 6, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

f95b8fa

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu force-pushed the export-D77033290 branch from 729264d to f95b8fa July 6, 2025 01:34

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 6, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

c44d954

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu force-pushed the export-D77033290 branch from f95b8fa to c44d954 July 6, 2025 04:35

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 6, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

1a691d5

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu force-pushed the export-D77033290 branch from c44d954 to 1a691d5 July 6, 2025 04:59

lizhouyu force-pushed the export-D77033290 branch from 1a691d5 to 61b76e8 July 6, 2025 05:33

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 6, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

61b76e8

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 6, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

b8b1556

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 8, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

115412b

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu force-pushed the export-D77033290 branch from 61b76e8 to 115412b July 8, 2025 00:09

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 8, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

f3d68f4

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu force-pushed the export-D77033290 branch from 115412b to c3663f5 July 9, 2025 17:56

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 9, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

c3663f5

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 9, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

f40295d

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 14, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

42e9eff

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu added a commit to lizhouyu/torchrec that referenced this pull request Jul 14, 2025

Zero Collision Hash Benchmark Framework (pytorch#3127)

e7c1596

Summary: Pull Request resolved: pytorch#3127 Differential Revision: D77033290

lizhouyu force-pushed the export-D77033290 branch from c3663f5 to e7c1596 July 14, 2025 20:27

lizhouyu force-pushed the export-D77033290 branch from e7c1596 to 7556c4e July 15, 2025 20:35

lizhouyu force-pushed the export-D77033290 branch from 7556c4e to 0970774 July 15, 2025 21:27

lizhouyu force-pushed the export-D77033290 branch from 0970774 to 6d2e16f July 16, 2025 02:07

lizhouyu force-pushed the export-D77033290 branch from 6d2e16f to a84406c July 17, 2025 03:38

lizhouyu force-pushed the export-D77033290 branch from a84406c to 9230233 July 17, 2025 03:53

lizhouyu force-pushed the export-D77033290 branch from 9230233 to 6e3df72 July 17, 2025 03:59
