Zero Collision Hash Benchmark Framework #3127
This pull request was exported from Phabricator. Differential Revision: D77033290
Summary:

Pull Request resolved: https://github.com/pytorch/torchrec/pull/3127

### Context

The high-level intention of this diff is to open-source (OSS) the benchmark testbed for hash collision management algorithms. The goal is to provide a testbed for accurate and fair benchmarking.

### Major Changes

**Summary**

This diff introduces a new benchmark testbed for hash collision management modules. The framework includes several new files and modules:

* `torchrec/distributed/benchmark/benchmark_zch/models`: a folder for the configuration files and wrapper classes of the models to benchmark.
* `torchrec/distributed/benchmark/benchmark_zch/data`: a folder for the configuration files and wrapper classes of the datasets used for benchmarking. It also includes a pre-hash script, `sparse_kuairand_dataset.py`, which takes the KuaiRand dataset as input and distributes the input values evenly across the input hash space.
* `torchrec/modules/mc_adapter.py`: a new module that implements the MC Adapter algorithm, which plugs hash collision management modules into the embedding collection modules of existing and future models in a plug-and-play manner. The adapter mirrors all the APIs of the embedding collection and embedding bag collection modules, and calls a managed collision module before the embedding look-up.
* `torchrec/distributed/benchmark/benchmark_zch/benchmark_zch.py`: the main entry point of the benchmark testbed.
* `torchrec/distributed/benchmark/benchmark_zch/plots`: a folder of plotting notebooks for training and evaluation metrics.

### Key Features

* Implements the MC Adapter algorithm for zero collision hash (ZCH) models.
* Includes a new benchmark framework for evaluating hash collision management models.
* Provides data loading and plotting functions for training metrics.
* Writes metrics to TensorBoard during training so that users can inspect results in real time.
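The adapter pattern described for `torchrec/modules/mc_adapter.py` (same interface as the embedding module, with ID remapping run before the look-up) can be illustrated with a minimal pure-Python sketch. This is not the TorchRec implementation: `ToyEmbeddingTable`, `ToyManagedCollision`, and `ToyMCAdapter` are hypothetical names, and the first-come-first-served remapping policy is an assumption chosen only to keep the example short.

```python
from typing import Dict, List


class ToyEmbeddingTable:
    """Hypothetical stand-in for an embedding collection: one vector per row."""

    def __init__(self, num_rows: int, dim: int) -> None:
        self.num_rows = num_rows
        # Row r holds the vector [r, r, ...] so look-ups are easy to check.
        self.rows = [[float(r)] * dim for r in range(num_rows)]

    def forward(self, indices: List[int]) -> List[List[float]]:
        return [self.rows[i] for i in indices]


class ToyManagedCollision:
    """Hypothetical remapper (assumed policy): each new raw ID gets its own
    free row while rows remain, then falls back to modulo hashing."""

    def __init__(self, num_rows: int) -> None:
        self.num_rows = num_rows
        self.slot_of: Dict[int, int] = {}

    def remap(self, raw_ids: List[int]) -> List[int]:
        remapped = []
        for raw in raw_ids:
            if raw not in self.slot_of and len(self.slot_of) < self.num_rows:
                self.slot_of[raw] = len(self.slot_of)  # claim the next free row
            # Known IDs use their own row; overflow IDs may collide via modulo.
            remapped.append(self.slot_of.get(raw, raw % self.num_rows))
        return remapped


class ToyMCAdapter:
    """Exposes the same forward() as the table it wraps, remapping IDs first."""

    def __init__(self, table: ToyEmbeddingTable, mc: ToyManagedCollision) -> None:
        self.table = table
        self.mc = mc

    def forward(self, raw_ids: List[int]) -> List[List[float]]:
        return self.table.forward(self.mc.remap(raw_ids))


table = ToyEmbeddingTable(num_rows=4, dim=2)
adapter = ToyMCAdapter(table, ToyManagedCollision(num_rows=4))
# Raw IDs far outside the table size are still looked up safely.
vectors = adapter.forward([10**12, 7, 10**12])
```

Because the adapter keeps the wrapped module's call signature, the model code that calls `forward` does not need to change when collision management is switched on, which is the plug-and-play property the description refers to.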
### Implications

This diff provides a framework for evaluating and optimizing hash collision management models. The MC Adapter algorithm is a key component of this framework, and the benchmark script provides a unified, convenient way to run and compare different hash collision management models. The plotting functions make it easy to visualize training metrics, which helps with model optimization and improvement.

Reviewed By: aporialiao

Differential Revision: D77033290

fbshipit-source-id: 298a1c6e1ab858641992db7362ccf227725ec12b
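To make "compare different hash collision management models" concrete, here is one way such a comparison could be scored. It is a sketch under stated assumptions, not the metric used by `benchmark_zch.py`: the `collision_rate` function, the modulo baseline, and the dictionary-based remapper are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, Iterable


def collision_rate(ids: Iterable[int], to_row: Callable[[int], int]) -> float:
    """Fraction of distinct IDs that do not get a row of their own
    (assumed metric definition, for illustration only)."""
    distinct = set(ids)
    if not distinct:
        return 0.0
    used_rows = {to_row(i) for i in distinct}
    return (len(distinct) - len(used_rows)) / len(distinct)


NUM_ROWS = 8
ids = [0, 8, 16, 3, 5, 3]  # 0, 8 and 16 all map to row 0 under modulo


def modulo_row(raw_id: int) -> int:
    """Baseline: plain modulo hashing into the table."""
    return raw_id % NUM_ROWS


slots: Dict[int, int] = {}


def managed_row(raw_id: int) -> int:
    """Hypothetical collision-managed mapping: the next free row per new ID.
    Assumes the number of distinct IDs does not exceed NUM_ROWS."""
    return slots.setdefault(raw_id, len(slots))


modulo_rate = collision_rate(ids, modulo_row)
managed_rate = collision_rate(ids, managed_row)
print(f"modulo: {modulo_rate:.2f}, managed: {managed_rate:.2f}")
```

Scoring every mapping through the same function on the same ID stream is what keeps a comparison of this kind fair: only the mapping under test changes between runs.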