Category: B2; Team name: DLLB; Dataset: PPI #234

dleko11 · 2025-11-22T20:46:23Z

Co-authored-by: luka-benic [email protected]
Co-authored-by: dleko11 [email protected]

Checklist

- My pull request has a clear and explanatory title.
- My pull request passes the Linting test.
- I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
- My PR follows PEP8 guidelines. (refer to comment below)
- My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
- I linked to issues and PRs that are relevant to this PR.

Description

This PR extends TopoBench to support edge-level link prediction on both transductive and inductive graph datasets, and adds a tutorial notebook that illustrates how to use the new functionality.

Concretely, the PR introduces:

- Edge-level split utilities for link prediction (transductive and inductive).
- A dynamic negative sampling transform integrated into the dataloading pipeline.
- A dedicated edge-level readout (LinkPredictionReadOut) for link prediction on top of existing GNN backbones (GCN, GAT).
- Example dataset configurations for Cora (transductive), MUTAG (inductive), and PPI (inductive, predefined splits).
- A tutorial notebook showing the full workflow end-to-end.

Key Changes (Code)

Edge-level splitting
- load_edge_transductive_splits for single-graph / transductive datasets (e.g. Cora).
- load_edge_inductive_splits for multi-graph / inductive datasets (e.g. MUTAG, PPI).
- Both return DataloadDataset objects with:
  - edge_label_index, edge_label (positive and negative candidate edges),
  - consistent handling of val/test negatives vs train-time negatives.
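
As a rough illustration of the transductive case, the sketch below cuts the positive edges of a single graph into train/val/test sets. It is dependency-free plain Python and does not reproduce load_edge_transductive_splits; the function name `split_edges`, its signature, and the choice to give the remainder to train are assumptions made for this example.

```python
import random


def split_edges(edges, val_prop=0.1, test_prop=0.2, seed=0):
    """Shuffle undirected edges and cut them into train/val/test positives.

    Hypothetical helper: a sketch of the idea, not the TopoBench implementation.
    """
    # Canonicalise (u, v) so each undirected edge is counted once.
    unique = sorted({(min(u, v), max(u, v)) for u, v in edges if u != v})
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_val = int(len(unique) * val_prop)
    n_test = int(len(unique) * test_prop)
    return {
        "val": unique[:n_val],
        "test": unique[n_val:n_val + n_test],
        # Whatever is left after val and test becomes the training positives.
        "train": unique[n_val + n_test:],
    }


# Toy graph on 5 nodes; (0, 1) and (1, 0) describe the same undirected edge.
edges = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 4), (4, 0),
         (0, 2), (1, 3), (2, 4), (3, 0), (4, 1)]
splits = split_edges(edges, val_prop=0.1, test_prop=0.2, seed=0)
```

Deduplicating before the shuffle keeps the two directions of one undirected edge from landing in different splits, which would leak test edges into training.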
Dynamic negative sampling
- NegativeSamplingTransform in topobench.transforms.data_manipulations:
  - takes positive edges from edge_label_index,
  - samples fresh negatives via torch_geometric.utils.negative_sampling,
  - rebuilds edge_label_index / edge_label each epoch according to neg_pos_ratio and neg_sampling_method.
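
The per-epoch resampling can be sketched without any dependencies as follows. The PR itself relies on torch_geometric.utils.negative_sampling; this example swaps in simple rejection sampling in plain Python, and the function name `resample_negatives` and its arguments are hypothetical.

```python
import random


def resample_negatives(pos_edges, num_nodes, neg_pos_ratio=1.0, rng=None):
    """Draw fresh negative edges and rebuild (edge_label_index, edge_label).

    Hypothetical helper using rejection sampling, not the PyG routine.
    """
    if rng is None:
        rng = random.Random()
    existing = {(min(u, v), max(u, v)) for u, v in pos_edges}
    # Never ask for more negatives than there are non-edges in the graph.
    max_neg = num_nodes * (num_nodes - 1) // 2 - len(existing)
    num_neg = min(int(len(pos_edges) * neg_pos_ratio), max_neg)
    negatives = set()
    while len(negatives) < num_neg:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        pair = (min(u, v), max(u, v))
        # Reject self-loops and pairs that are real edges.
        if u != v and pair not in existing:
            negatives.add(pair)
    neg_edges = sorted(negatives)
    edge_label_index = list(pos_edges) + neg_edges
    edge_label = [1] * len(pos_edges) + [0] * len(neg_edges)
    return edge_label_index, edge_label


pos = [(0, 1), (1, 2), (2, 3)]
rng = random.Random(0)
for epoch in range(2):
    # Each call yields a new set of negatives for the same positives.
    index, label = resample_negatives(pos, num_nodes=6, neg_pos_ratio=2.0, rng=rng)
```

Sharing one random generator across epochs is what makes the negatives differ from epoch to epoch while the positives stay fixed.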
Edge-level readout
- LinkPredictionReadOut in topobench.nn.readouts:
  - consumes node embeddings x_0 from the backbone,
  - scores candidate edges via dot products,
  - outputs 2-class logits (no-edge, edge) and attaches labels for the loss/evaluator.
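
A minimal sketch of dot-product scoring is shown below in plain Python. How LinkPredictionReadOut turns a single score into two logits is not stated above; the sketch assumes the pair (0, score), for which the softmax probability of the edge class equals the sigmoid of the score. The names `link_logits` and `edge_probability` are hypothetical.

```python
import math


def link_logits(x_0, edge_label_index):
    """Score each candidate edge by the dot product of its endpoint embeddings.

    Returns one (no_edge_logit, edge_logit) pair per candidate edge.
    The (0, score) layout is an assumption made for this sketch.
    """
    logits = []
    for u, v in edge_label_index:
        score = sum(a * b for a, b in zip(x_0[u], x_0[v]))
        logits.append((0.0, score))
    return logits


def edge_probability(logit_pair):
    """Two-class softmax probability of the 'edge' class."""
    no_edge, edge = logit_pair
    return 1.0 / (1.0 + math.exp(no_edge - edge))


# Three toy node embeddings: nodes 0 and 1 are aligned, node 2 points the other way.
x_0 = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
logits = link_logits(x_0, [(0, 1), (0, 2)])
probs = [edge_probability(pair) for pair in logits]
```

Emitting two logits rather than one score lets the task reuse a standard 2-class cross-entropy loss and classification evaluator.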
PPI dataset support
- New loader (based on torch_geometric.datasets.PPI) that:
  - loads the predefined train/val/test splits from PyG,
  - combines them into a single dataset with a split_idx mapping,
  - is compatible with the inductive edge-level splitting utilities.
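
The idea of merging predefined splits into one dataset with an index mapping can be sketched as follows. Strings stand in for graph objects, and the function name `combine_splits` and the split_idx keys ("train", "valid", "test") are assumptions for this example rather than the loader's confirmed layout.

```python
def combine_splits(train_graphs, val_graphs, test_graphs):
    """Concatenate predefined splits and record which positions belong to each.

    Hypothetical helper: a sketch of the idea, not the PPI loader in this PR.
    """
    data_list = list(train_graphs) + list(val_graphs) + list(test_graphs)
    n_train, n_val = len(train_graphs), len(val_graphs)
    split_idx = {
        "train": list(range(n_train)),
        "valid": list(range(n_train, n_train + n_val)),
        "test": list(range(n_train + n_val, len(data_list))),
    }
    return data_list, split_idx


# Placeholder names; PPI in PyG ships 20 training, 2 validation and 2 test graphs.
train = [f"train_graph_{i}" for i in range(20)]
val = [f"val_graph_{i}" for i in range(2)]
test = [f"test_graph_{i}" for i in range(2)]
data_list, split_idx = combine_splits(train, val, test)
```

Keeping a single list plus an index mapping preserves the predefined assignment of graphs to splits while presenting one dataset to downstream utilities.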
Configuration-level support
- task_level: edge and num_classes: 2 for link prediction.
- Extended split_params for link prediction:
  - learning_setting (transductive / inductive),
  - val_prop, test_prop, train_prop,
  - is_undirected,
  - neg_pos_ratio (dynamic train negatives),
  - neg_sampling_ratio (static val/test negatives),
  - neg_sampling_method.
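
For orientation, the options above could resolve to something like the dictionary below. The keys come from the list above, but the nesting, all values, and the helper `find_config_problems` are illustrative assumptions; "sparse" is a method name accepted by torch_geometric.utils.negative_sampling, and it is assumed here that neg_sampling_method is passed through to it.

```python
# Illustrative values only; the real configs live in the Hydra YAML files.
link_prediction_cfg = {
    "task_level": "edge",
    "num_classes": 2,
    "split_params": {
        "learning_setting": "transductive",
        "train_prop": 0.7,
        "val_prop": 0.1,
        "test_prop": 0.2,
        "is_undirected": True,
        "neg_pos_ratio": 1.0,        # negatives per positive, resampled for training
        "neg_sampling_ratio": 1.0,   # negatives per positive, fixed for val/test
        "neg_sampling_method": "sparse",
    },
}


def find_config_problems(cfg):
    """Return a list of human-readable problems (hypothetical sanity check)."""
    problems = []
    if cfg.get("task_level") != "edge":
        problems.append("task_level must be 'edge' for link prediction")
    if cfg.get("num_classes") != 2:
        problems.append("num_classes must be 2 (no-edge, edge)")
    split = cfg.get("split_params", {})
    if split.get("learning_setting") not in ("transductive", "inductive"):
        problems.append("learning_setting must be 'transductive' or 'inductive'")
    for key in ("neg_pos_ratio", "neg_sampling_ratio"):
        if split.get(key, 0) <= 0:
            problems.append(f"{key} must be positive")
    return problems


problems = find_config_problems(link_prediction_cfg)
```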

These changes plug into the existing TopoBench training pipeline without altering the high-level interface (Hydra configs + run.yaml).

Tutorial (Usage Example)

To illustrate the new link prediction support, this PR also adds:

tutorials/tutorial_link_prediction.ipynb

The notebook demonstrates:

- Transductive and inductive link prediction setups on the Cora and MUTAG datasets.
- How the split utilities, the negative sampling transform, and LinkPredictionReadOut interact in practice.
- How to run short GCN/GAT experiments and inspect basic metrics and visualizations of positive/negative edges in the splits.

The tutorial is an example user guide for the new functionality; all core logic lives in the library code.

Issue

Additional context

Co-authored-by: luka-benic <[email protected]> Co-authored-by: dleko11 <[email protected]>

review-notebook-app · 2025-11-22T20:46:27Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

luka-benic · 2025-11-23T10:51:48Z

We fixed some compatibility issues. Specifically, the PPI dataset class from torch_geometric version 2.8.0 was not compatible with networkx version 2.8.8.

David Leko and others added 6 commits November 13, 2025 21:49

- f9eca63: Added dataset for download
- c167696: Added dataset
- c72824d: Added more datasets.
- dd80a6d: Cleaning.
- c0004ac: Merge branch 'geometric-intelligence:main' into main
- f577ce6: Category: B2; Team name: DLLB; Dataset: PPI
  Co-authored-by: luka-benic <[email protected]> Co-authored-by: dleko11 <[email protected]>

David Leko and others added 3 commits November 22, 2025 21:48

- ea73a35: Some fixes.
- 087954b: Deleted one file.
- 3a65f93: Fixed issues.

levtelyatnikov added the category-b2 label (Submission to TDL Challenge 2025: Mission B, Category 2) on Nov 24, 2025
