Two Stage Recommender System with Marketing Interaction Example #2214
Summary of Changes
Hello @mehtamansi29, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a new, detailed tutorial for implementing a two-stage recommender system within the keras_rs framework. It provides an end-to-end guide for a marketing interaction use case, demonstrating how to effectively predict ad click-through rates. The tutorial walks through data preparation, building a retrieval model to narrow down potential ad candidates, and then developing a ranking model to optimize the final selection, offering a complete solution for personalized ad delivery.
Code Review
The pull request introduces a new example tutorial for a Two-Stage Recommender System. The overall structure and explanation are clear and provide a good overview of the system. However, there are several minor issues related to typos, unused imports, and some potentially confusing or inefficient code patterns that could be improved for clarity and robustness. Specifically, there are multiple instances of "Retrival" instead of "Retrieval", some unused imports, and a loss function named bpr_hinge_loss that does not implement BPR Hinge Loss. Additionally, the Python script version of the notebook contains shell commands that are not valid Python syntax.
| "def bpr_hinge_loss(y_true, y_pred):\n", | ||
| " margin = 1.0\n", | ||
| " return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10)\n", |
The bpr_hinge_loss function is misnamed. The current implementation -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10) is a form of logistic loss, not BPR Hinge Loss, which typically involves a margin and max(0, margin - (pos_score - neg_score)). Also, margin = 1.0 is defined but not used. Please rename the function to accurately reflect its implementation or implement the actual BPR Hinge Loss.
def pairwise_logistic_loss(y_true, y_pred):
    return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10)
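To make the distinction in this comment concrete, here is a minimal, framework-free sketch of the two pairwise losses. It assumes the positive and negative item scores are available separately; how the PR constructs `y_pred` is not shown here, and the function signatures below are illustrative rather than the PR's.

```python
import math


def pairwise_logistic_loss(pos_score: float, neg_score: float) -> float:
    """BPR-style logistic loss: -log(sigmoid(pos - neg)), computed stably."""
    diff = pos_score - neg_score
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch to avoid overflow in exp.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def pairwise_hinge_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    """Hinge variant: max(0, margin - (pos - neg)); this one actually uses the margin."""
    return max(0.0, margin - (pos_score - neg_score))
```

The hinge loss is exactly zero once the positive score beats the negative score by at least `margin`, while the logistic loss keeps decreasing smoothly and never reaches zero.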
| self.user_tower = user_tower | ||
| self.ad_tower = ad_tower |
The __init__ method of RetrievalModel takes user_tower_instance and ad_tower_instance as arguments but then overwrites them with the global user_tower and ad_tower variables. This makes the passed arguments redundant and can lead to unexpected behavior if different tower instances were intended to be used. It should use the passed arguments.
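A small sketch of why this matters, using a hypothetical plain-Python stand-in for `RetrievalModel` (the class in the PR is a Keras model; the tower functions here are made up for illustration). When the constructor stores its arguments, two differently configured towers can be injected without touching module-level state; the suggested change that follows applies the same idea to the PR's code.

```python
class RetrievalModel:
    """Hypothetical stand-in for the PR's model class, for illustration only."""

    def __init__(self, user_tower_instance, ad_tower_instance):
        # Store the injected towers instead of reading module-level globals.
        self.user_tower = user_tower_instance
        self.ad_tower = ad_tower_instance


# Two distinct "towers" (plain callables here, assumed for the example).
def double_tower(features):
    return [2.0 * f for f in features]


def negate_tower(features):
    return [-f for f in features]


model = RetrievalModel(double_tower, negate_tower)
```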
self.user_tower = user_tower_instance
self.ad_tower = ad_tower_instance
| margin = 1.0 | ||
| return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10) | ||
The bpr_hinge_loss function is misnamed. The current implementation -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10) is a form of logistic loss, not BPR Hinge Loss, which typically involves a margin and max(0, margin - (pos_score - neg_score)). Also, margin = 1.0 is defined but not used. Please rename the function to accurately reflect its implementation or implement the actual BPR Hinge Loss.
def pairwise_logistic_loss(y_true, y_pred):
    return -tf.math.log(tf.nn.sigmoid(y_pred) + 1e-10)
| pip install -q kaggle | ||
| # Download the dataset (requires Kaggle API key in ~/.kaggle/kaggle.json) | ||
| kaggle datasets download -d mafrojaakter/ad-click-data --unzip -p ./ad_click_dataset | ||
| """ |
Shell commands like pip install and kaggle datasets download are specific to Jupyter notebooks and will cause a SyntaxError if this Python file is run directly as a script. These lines should be removed or commented out for a pure Python file. Also, !# is an incorrect comment for a shell command.
| pip install -q kaggle | |
| # Download the dataset (requires Kaggle API key in ~/.kaggle/kaggle.json) | |
| kaggle datasets download -d mafrojaakter/ad-click-data --unzip -p ./ad_click_dataset | |
| """ | |
| # pip install -q kaggle | |
| # # Download the dataset (requires Kaggle API key in ~/.kaggle/kaggle.json) | |
| # kaggle datasets download -d mafrojaakter/ad-click-data --unzip -p ./ad_click_dataset |
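If the script version should still be able to fetch the dataset, one alternative to commenting the commands out is to invoke them through `subprocess`. The sketch below is an assumption about how that could look, not something the PR contains; the helper names are hypothetical, and the Kaggle flags are copied from the notebook cell above.

```python
import subprocess
import sys


def pip_install_command(package: str) -> list[str]:
    # Use the current interpreter so the package lands in the active environment.
    return [sys.executable, "-m", "pip", "install", "-q", package]


def kaggle_download_command(dataset: str, dest: str) -> list[str]:
    # Same flags as the notebook cell: -d <dataset> --unzip -p <path>.
    return ["kaggle", "datasets", "download", "-d", dataset, "--unzip", "-p", dest]


def run(command: list[str]) -> None:
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(command, check=True)
```

Calling `run(pip_install_command("kaggle"))` followed by `run(kaggle_download_command("mafrojaakter/ad-click-data", "./ad_click_dataset"))` would then mirror the notebook cell, and still requires the Kaggle API key mentioned in the comment.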
| !pip install -q keras-rs | ||
| """ |
| "import tensorflow_datasets as tfds\n", | ||
| "from mpl_toolkits.axes_grid1 import make_axes_locatable\n", |
| history = retrieval_model.fit(retrieval_train_dataset, epochs=30) | ||
| pd.DataFrame(history.history).plot( | ||
| subplots=True, layout=(1, 3), figsize=(12, 4), title="Retrival Model Metrics" |
| plt.show() | ||
| """ | ||
| # **Predictions of Retrival Model** |
| Retrieval model only calculates a simple similarity score (Dot Product). It doesn't | ||
| account for complex feature interactions. | ||
| So we need to build ranking model after words retrival model. |
Typo and grammatical error: "Retrival model" should be "Retrieval model", and "after words retrival model" should be "after the retrieval model".
Retrieval model only calculates a simple similarity score (Dot Product). It doesn't
account for complex feature interactions.
So we need to build a ranking model after the retrieval model.
| top_ads = retrieval_engine.decode_results(scores, indices)[0] | ||
| final_ranked_ads = rerank_ads_for_user(sample_user, top_ads, ranking_model) | ||
| print(f"User: {sample_user['user_id']}") | ||
| print(f"{'Ad ID':<10} | {'Topic':<30} | {'Retrival Score':<11} | {'Rank Probability'}") |
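The retrieve-then-rerank flow in the excerpt above can be illustrated with a small, framework-free sketch. Everything here is hypothetical (the embeddings, ad ids, and the lookup-table ranking score); it does not reproduce the PR's `retrieval_engine` or `rerank_ads_for_user`, only the general two-stage idea of a dot-product shortlist followed by a separate ranking score.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(user_vec, ad_vecs, k):
    """Stage 1: score every ad by dot product and keep the k best."""
    scored = [(ad_id, dot(user_vec, vec)) for ad_id, vec in ad_vecs.items()]
    scored.sort(key=lambda row: row[1], reverse=True)
    return scored[:k]


def rerank(candidates, rank_fn):
    """Stage 2: re-order the shortlist by a separate ranking score."""
    ranked = [(ad_id, score, rank_fn(ad_id)) for ad_id, score in candidates]
    return sorted(ranked, key=lambda row: row[2], reverse=True)


# Made-up embeddings and ranking probabilities, for illustration only.
user_vec = [1.0, 0.0]
ad_vecs = {"ad_a": [0.9, 0.1], "ad_b": [0.5, 0.5], "ad_c": [-1.0, 0.0]}
rank_probability = {"ad_a": 0.2, "ad_b": 0.7}

shortlist = retrieve_top_k(user_vec, ad_vecs, k=2)
final = rerank(shortlist, rank_probability.get)
```

In this toy data the retrieval stage prefers `ad_a`, but the ranking stage places `ad_b` first, which is the kind of re-ordering the second stage exists to provide.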
Warning: Gemini encountered an error creating the review. You can try again by commenting
It looks like you have added two .ipynb files; please remove the one that is not needed.
This commit adds a new example tutorial demonstrating how to build a Two-Stage Recommender System using keras_rs. The example focuses on a marketing interaction use case (Ad Click Prediction), covering both the Retrieval stage (Two-Tower model) and the Ranking stage.