Enable Changing the # of shards for CW resharding #3188
This pull request was exported from Phabricator. Differential Revision: D78291717
Summary:

Currently, Dynamic Sharding assumes the number of shards per embedding table stays the same:
- https://www.internalfb.com/code/fbsource/[6d270632037a1e8bca7f63500dd07fd0b213e572]/fbcode/torchrec/distributed/sharding/dynamic_sharding.py?lines=140

E.g.
- `table_0` is originally sharded on ranks [0, 1].
- The Reshard API currently supports moving the `table_0` shards to ranks [1, 2].
- The shard on rank 0 moves to rank 1, and the shard on rank 1 moves to rank 2.

We want to support changing the number of shards:
- e.g. `table_0` originally on ranks [0, 1] --> reshard to [0]
- Or reshard to [0, 1, 2, 3]

Here's the unit test you can modify to check whether your use case passes:
- https://www.internalfb.com/code/fbsource/[4d0d74b9f3c441e7aa35ce7102200fa0ca8c95cf]/fbcode/torchrec/distributed/tests/test_dynamic_sharding.py?lines=453-459
- Change the new sharding plan to use a different number of ranks than the original sharding plan.

Note: the new total number of ranks for each embedding table should be a factor of dimension 0 of that embedding table.
- e.g. an embedding table of size [4, 8] can only be sharded on 1, 2, or 4 ranks, not on 3 ranks.

Differential Revision: D78291717
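The constraint in the note, and the existing same-count behavior, can be illustrated with a small standalone sketch. This is not TorchRec code: `valid_shard_counts`, `can_reshard`, and `plan_same_count_moves` are hypothetical helper names introduced here for illustration, and the sketch only encodes the rule as stated above (the new number of ranks must divide dimension 0 of the table).

```python
from typing import List, Sequence, Tuple


def valid_shard_counts(table_shape: Sequence[int]) -> List[int]:
    """Hypothetical helper: every shard count that evenly divides dimension 0."""
    dim0 = table_shape[0]
    return [n for n in range(1, dim0 + 1) if dim0 % n == 0]


def can_reshard(table_shape: Sequence[int], new_ranks: Sequence[int]) -> bool:
    """Hypothetical helper: check the rule from the note for a proposed rank list."""
    num_shards = len(new_ranks)
    return num_shards > 0 and table_shape[0] % num_shards == 0


def plan_same_count_moves(
    old_ranks: Sequence[int], new_ranks: Sequence[int]
) -> List[Tuple[int, int]]:
    """Pair each old rank with its new rank when the shard count is unchanged.

    This mirrors the pre-existing behavior described in the summary:
    the i-th shard moves from old_ranks[i] to new_ranks[i].
    """
    if len(old_ranks) != len(new_ranks):
        raise ValueError("this sketch only covers the unchanged-shard-count case")
    return list(zip(old_ranks, new_ranks))


if __name__ == "__main__":
    shape = [4, 8]  # the example table size from the note
    print(valid_shard_counts(shape))               # [1, 2, 4]
    print(can_reshard(shape, [0]))                 # True: 1 shard
    print(can_reshard(shape, [0, 1, 2, 3]))        # True: 4 shards
    print(can_reshard(shape, [0, 1, 2]))           # False: 3 does not divide 4
    print(plan_same_count_moves([0, 1], [1, 2]))   # [(0, 1), (1, 2)]
```

A check like this could run before modifying the new sharding plan in the unit test, so that an invalid rank count is caught before the reshard is attempted.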