Skip to content

Conversation

@jableable
Copy link

@jableable jableable commented Nov 25, 2025

Checklist

  • My pull request has a clear and explanatory title.
  • My pull request passes the Linting test.
  • I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
  • My PR follows PEP8 guidelines. (refer to comment below)
  • My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
  • I linked to issues and PRs that are relevant to this PR.

Description

This pull request adds the Metamath proof-graph dataset for the TAG-DS TopoBench Challenge 2025 (Category A1). The dataset is distributed as a single data.pt file hosted on Hugging Face and contains 20,000 graphs derived from the first 10,000 theorems in the ~45k-theorem Metamath database.

This dataset provides a large-scale, real-world benchmark for graph neural networks on formal reasoning, with deep directed acyclic graph (DAG) structure, long-range dependencies, and a highly non-uniform label distribution reflective of real mathematical theorem usage.


Dataset Structure

Each of the 10k theorems appears twice:

1. Theorem Graph (statement only). A small DAG containing:

  • hypothesis nodes
  • a conclusion node
  • no proof steps

2. Proof Graph (full derivation). A full proof DAG containing:

  • hypothesis nodes
  • all intermediate proof steps
  • the same conclusion node

Train/Val/Test Split Design

  • All 10,000 theorem-only graphs (indices 0–9,999) are always included in the training set.
  • Proof graphs (indices 10,000–19,999) undergo fixed random seed 80/10/10 split.
  • Only proof graphs appear in validation and test splits.

Task: Node-Level Classification

The model must predict, for each node, the justification label (the theorem used at that proof step).

  • Label space: 3,557 labels (all labels with <5 occurrences in the training set are mapped to an UNK class)
  • Conclusion masking: In all proof graphs, the final proof step has its embedding zeroed out, forcing the model to infer the correct conclusion from graph context.
  • Node features: 768‑dimensional CodeBERT embeddings for each Metamath statement.

References:

[1] Metamath Official Site - https://us.metamath.org/index.html
[2] Huggingface Dataset - https://huggingface.co/datasets/jableable/metamath-proof-graphs

@levtelyatnikov levtelyatnikov added the category-a1 Submission to TDL Challenge 2025: Mission A, Category 1. label Nov 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category-a1 Submission to TDL Challenge 2025: Mission A, Category 1.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants