Added DLRM notebook and generated python script #2131

kharshith-k · 2025-06-26T05:39:35Z

Added DLRM.ipynb and generated dlrm.py

abheesht17

Thanks for the PR! Left some comments, let me know what you think :).

Here are some notes for posterity (we don't need to implement these now in this PR):

We should, in the future, also think about the specific case of how DLRM implements distributed training. The embeddings are sharded across devices, whereas the MLP layers are DDP'd.
The paper mentions that they use torch.nn.EmbeddingBag instead of torch.nn.Embedding. Apparently, torch.nn.EmbeddingBag is more efficient than torch.nn.Embedding followed by some aggregation op. We should think about this at some later point (maybe, keras_rs.embeddings.DistributedEmbedding solves this). Anyway, we can worry about this later!
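To make the EmbeddingBag note concrete, here is a minimal, framework-free sketch of the two ways to pool a "bag" of ids. It only illustrates sum-style pooling. The table values, the bag, and the function names `lookup_then_sum` and `fused_bag_sum` are made up for illustration and are not taken from the PR or from any library.

```python
# Toy embedding table: 5 rows (ids 0..4), embedding dimension 3.
# All values are invented for illustration.
table = [
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [1.0, 1.0, 1.0],
]


def lookup_then_sum(table, bag):
    """Two-step version: gather one vector per id, then reduce over the bag."""
    gathered = [table[i] for i in bag]  # materializes len(bag) vectors
    return [sum(column) for column in zip(*gathered)]


def fused_bag_sum(table, bag):
    """Fused version: accumulate straight into one output vector."""
    out = [0.0] * len(table[0])
    for i in bag:
        for d, value in enumerate(table[i]):
            out[d] += value
    return out


bag = [1, 3, 4]  # e.g. the ids of one multi-valued categorical feature
pooled_two_step = lookup_then_sum(table, bag)
pooled_fused = fused_bag_sum(table, bag)
```

Both functions return the same pooled vector. The difference is that the fused version never builds the intermediate list of gathered vectors, which is the kind of saving the comment above attributes to a bag-style layer.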

examples/keras_rs/dlrm.py

abheesht17 · 2025-07-07T16:54:47Z

examples/keras_rs/dlrm.py

+        self.embedding_layers = []
+        for feature_name, vocabulary in vocabularies.items():
+            self.embedding_layers.append(
+                keras.layers.Embedding(
+                    input_dim=len(vocabulary) + 1,
+                    output_dim=embedding_dim,
+                )
+            )


One of the highlights of DLRM is that it can process both categorical and dense features. We should use some dense features present in the MovieLens dataset. Do we have any such features?

The MovieLens 100k-ratings dataset mostly contains categorical features. It does have a user age feature, but it is bucketized, which turns it into a categorical feature.

Hmmm, doesn't it have raw_user_age as a feature? The reason I'm insisting on this is that the two towers for dense and categorical features are a salient part of DLRM. And do you think we can use normalised timestamp as a feature? Will that help/does that make sense?

Sure! Like you mentioned, the two towers for dense and categorical features are a salient part of DLRM. I'll have another look at the code and see if any dense features can be used. Timestamp looks like a suitable dense feature. I'll modify the code and try using that.

And we can remove bucketized_user_age and use raw_user_age instead, maybe.

Sure! Will try that
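As a rough illustration of what the thread above is asking for, the sketch below standardizes two dense features (raw_user_age and timestamp), passes them through a one-layer "bottom MLP", and combines the result with categorical embeddings via pairwise dot products. It uses plain Python to keep the data flow visible. The sample values, weights, embedding vectors, and helper names (`standardize`, `dense_layer`, `pairwise_dots`) are all assumptions for illustration, not code from dlrm.py.

```python
import math


def standardize(values):
    """Z-score a list of floats: (x - mean) / std (population std)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against a constant column
    return [(v - mean) / std for v in values]


def dense_layer(x, weights, bias):
    """y = relu(W x + b) for a single example; W is a list of rows."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def pairwise_dots(vectors):
    """Dot product of every unordered pair of vectors (the interaction step)."""
    return [
        sum(a * b for a, b in zip(vectors[i], vectors[j]))
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    ]


# Made-up batch of three rating events.
raw_user_age = [25.0, 35.0, 45.0]
timestamp = [880000000.0, 885000000.0, 890000000.0]
ages = standardize(raw_user_age)
times = standardize(timestamp)

# Hypothetical weights: the bottom MLP maps 2 dense features to embedding_dim=2.
bottom_w = [[1.0, 0.0], [0.0, 1.0]]
bottom_b = [0.5, 0.5]
# Hypothetical embeddings for the third example's user_id and movie_id.
user_vec = [1.0, 2.0]
movie_vec = [3.0, -1.0]

# Dense tower for the third example, then interactions with the embeddings.
dense_vec = dense_layer([ages[2], times[2]], bottom_w, bottom_b)
interactions = pairwise_dots([dense_vec, user_vec, movie_vec])
# Input to the top MLP: dense tower output concatenated with the interactions.
top_mlp_input = dense_vec + interactions
```

Standardizing the timestamp matters here because raw values on the order of 10^8 would otherwise dominate the dot products. Whether z-scoring or min-max scaling works better for this example is something to check empirically.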

examples/keras_rs/dlrm.py

Updated docstrings to match the context according to code

abheesht17 · 2025-07-28T08:24:15Z

@kharshith-k - let me know when this is ready for another round of review. Thanks!

kharshith-k · 2025-07-29T04:22:14Z

> @kharshith-k - let me know when this is ready for another round of review. Thanks!
@abheesht17 - The changes are ready for review. Please let me know if any more changes would be required. Thanks!

abheesht17

Thanks! Great work, this looks good to me overall. Just one major comment on using dense features (it isn't DLRM without the two blocks for dense and categorical features).

abheesht17 · 2025-07-28T08:25:27Z

.DS_Store

Remove this file

abheesht17 · 2025-07-28T08:25:51Z

examples/.DS_Store

Please remove this file. You can also add these files to .gitignore, BTW, to avoid these getting added by accident.

abheesht17 · 2025-07-28T08:25:59Z

examples/keras_rs/.DS_Store

Please remove this file. You can also add these files to .gitignore, BTW, to avoid these getting added by accident.

abheesht17 · 2025-07-28T08:26:30Z

examples/keras_rs/img/.DS_Store

Please remove

abheesht17 · 2025-07-28T08:27:05Z

examples/keras_rs/ipynb/dlrm.ipynb

There are two copies of this notebook. Let's delete one of them

examples/keras_rs/ipynb/DLRM.ipynb
examples/keras_rs/ipynb/dlrm.ipynb

abheesht17 · 2025-08-07T02:55:19Z

examples/keras_rs/dlrm.py

+        self.embedding_layers = []
+        for feature_name, vocabulary in vocabularies.items():
+            self.embedding_layers.append(
+                keras.layers.Embedding(
+                    input_dim=len(vocabulary) + 1,
+                    output_dim=embedding_dim,
+                )
+            )


Hmmm, doesn't it have raw_user_age as a feature? The reason I'm insisting on this is that the two towers for dense and categorical features are a salient part of DLRM. And do you think we can use normalised timestamp as a feature? Will that help/does that make sense?

kharshith-k added 2 commits June 26, 2025 05:35

Added DLRM.ipynb and generated dlrm.py

b76d67e

Merge branch 'keras-team:master' into keras-rs-examples

7a4c247

github-actions bot assigned sachinprasadhs Jun 26, 2025

sachinprasadhs requested review from abheesht17 June 26, 2025 17:44

abheesht17 requested changes Jul 7, 2025

kharshith-k added 6 commits July 10, 2025 13:09

Update dlrm.py

421c24f

Updated docstrings to match the context according to code

Modified DLRM.ipynb to add architecture diagram

768fb2e

Modified dlrm.ipynb to add architecture Diagram and reference to paper

42641ff

Merge branch 'keras-team:master' into keras-rs-examples

ef7accd

Added dlrm.ipynb

1f37ad3

Added dlrm.py

01b2257

kharshith-k closed this Jul 29, 2025

kharshith-k reopened this Jul 29, 2025

abheesht17 reviewed Aug 7, 2025
