Added DLRM notebook and generated python script #2131

Open

kharshith-k wants to merge 8 commits into master

Conversation

kharshith-k (Author)

Added DLRM.ipynb and generated dlrm.py

abheesht17 (Collaborator) left a comment

Thanks for the PR! Left some comments, let me know what you think :).

Here are some notes for posterity (we don't need to implement these now in this PR):

  • We should, in the future, also think about the specific case of how DLRM implements distributed training: the embeddings are sharded across devices, whereas the MLP layers are DDP'd.
  • The paper mentions that they use torch.nn.EmbeddingBag instead of torch.nn.Embedding. Apparently, torch.nn.EmbeddingBag is more efficient than torch.nn.Embedding followed by some aggregation op (see the sketch right after this list). We should think about this at some later point (maybe keras_rs.embeddings.DistributedEmbedding solves this). Anyway, we can worry about this later!
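
For reference, here is a minimal PyTorch sketch (not part of this PR, only an illustration of the point above): an EmbeddingBag with mode="sum" produces the same result as an Embedding followed by an explicit sum over each bag, but does the lookup and aggregation in one fused op.

import torch

num_embeddings, embedding_dim = 1000, 16
bag = torch.nn.EmbeddingBag(num_embeddings, embedding_dim, mode="sum")
emb = torch.nn.Embedding(num_embeddings, embedding_dim)
emb.weight = bag.weight  # share weights so both paths compute the same thing

ids = torch.randint(0, num_embeddings, (4, 5))  # 4 bags of 5 ids each

fused = bag(ids)               # (4, 16): lookup + sum in a single op
unfused = emb(ids).sum(dim=1)  # (4, 16): lookup, then a separate aggregation
assert torch.allclose(fused, unfused)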

Comment on lines +273 to +280
self.embedding_layers = []
for feature_name, vocabulary in vocabularies.items():
    self.embedding_layers.append(
        keras.layers.Embedding(
            input_dim=len(vocabulary) + 1,
            output_dim=embedding_dim,
        )
    )
abheesht17 (Collaborator)

One of the highlights of DLRM is that it can process both categorical and dense features. We should use some dense features present in the MovieLens dataset. Do we have any such features?

kharshith-k (Author)

The MovieLens 100k-ratings dataset mostly contains categorical features. However, it also has a user age feature, but it is bucketized, which turns it into a categorical feature.

abheesht17 (Collaborator)

Hmmm, doesn't it have raw_user_age as a feature? The reason I'm insisting on this is that the two towers for dense and categorical features are a salient part of DLRM. Also, do you think we can use a normalised timestamp as a feature? Will that help / does that make sense?

kharshith-k (Author)

Sure! Like you mentioned, the two towers for dense and categorical features are a salient part of DLRM. I'll have another look at the code and see if any dense features can be used. Timestamp looks like a suitable dense feature; I'll modify the code and try using that.

abheesht17 (Collaborator)

And we can remove user_bucketised_age and use raw_user_age, maybe

kharshith-k (Author)

Sure! Will try that
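
(For context, a minimal sketch of the idea discussed above, not taken from the PR: treating timestamp and raw_user_age from the movielens/100k-ratings TFDS split as dense features by adapting keras.layers.Normalization to the training data. The sample size and variable names here are placeholders.)

import keras
import numpy as np
import tensorflow_datasets as tfds

ratings = tfds.load("movielens/100k-ratings", split="train")

# Materialize the raw dense values once so the normalization layers can be adapted.
sample = list(ratings.take(10_000).as_numpy_iterator())
raw_ages = np.array([x["raw_user_age"] for x in sample], dtype="float32")
timestamps = np.array([x["timestamp"] for x in sample], dtype="float32")

age_norm = keras.layers.Normalization(axis=None)
age_norm.adapt(raw_ages)

ts_norm = keras.layers.Normalization(axis=None)
ts_norm.adapt(timestamps)

# The normalized values would then feed the dense (bottom MLP) block of the model,
# e.g. dense_features = keras.ops.stack([age_norm(ages), ts_norm(ts)], axis=-1)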

abheesht17 (Collaborator)

@kharshith-k - let me know when this is ready for another round of review. Thanks!

kharshith-k (Author) commented Jul 29, 2025

@kharshith-k - let me know when this is ready for another round of review. Thanks!
@abheesht17 - The changes are ready for review. Please let me know if any further changes are required. Thanks!

kharshith-k reopened this Jul 29, 2025
abheesht17 (Collaborator) left a comment

Thanks! Great work, this looks good to me, overall. Just one major comment on using dense features (it isn't DLRM without the two blocks for dense and categorical features)
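
(To make the "two blocks" concrete, here is a rough Keras sketch, an assumption about how the example could be structured rather than the PR's actual code: one embedding table per categorical feature plus a bottom MLP over the dense features, with the outputs concatenated before the interaction / top MLP.)

import keras

class DLRMBlocks(keras.layers.Layer):
    """Categorical embeddings + dense bottom MLP, the two DLRM input blocks."""

    def __init__(self, vocabularies, embedding_dim):
        super().__init__()
        # One embedding table per categorical feature, as in the snippet above.
        self.embedding_layers = {
            name: keras.layers.Embedding(
                input_dim=len(vocabulary) + 1, output_dim=embedding_dim
            )
            for name, vocabulary in vocabularies.items()
        }
        # Bottom MLP projects the dense features to embedding_dim so they can
        # take part in the feature interaction alongside the embeddings.
        self.bottom_mlp = keras.Sequential(
            [
                keras.layers.Dense(64, activation="relu"),
                keras.layers.Dense(embedding_dim),
            ]
        )

    def call(self, categorical_inputs, dense_inputs):
        embedded = [
            layer(categorical_inputs[name])
            for name, layer in self.embedding_layers.items()
        ]
        dense = self.bottom_mlp(dense_inputs)  # (batch, embedding_dim)
        # Simple concatenation here; DLRM proper uses pairwise dot-product interactions.
        return keras.ops.concatenate(embedded + [dense], axis=-1)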

.DS_Store Outdated
abheesht17 (Collaborator)

Remove this file

abheesht17 (Collaborator)

Please remove this file. You can also add these files to .gitignore, BTW, to avoid these getting added by accident.

abheesht17 (Collaborator)

Please remove this file. You can also add these files to .gitignore, BTW, to avoid these getting added by accident.
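
(For example, an entry along these lines in the repository's .gitignore would keep macOS Finder metadata out of future commits; shown only as an illustration.)

# macOS Finder metadata
.DS_Store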

abheesht17 (Collaborator)

Please remove

abheesht17 (Collaborator)

There are two copies of this notebook. Let's delete one of them:

examples/keras_rs/ipynb/DLRM.ipynb
examples/keras_rs/ipynb/dlrm.ipynb
