snomura1 commented Nov 8, 2025

Thank you for creating and maintaining FOLIO. This dataset has been extremely helpful for advancing the field.

This PR addresses multiple syntax errors and inconsistencies found in the FOLIO v0.0 training dataset (folio-train.jsonl and folio-train.txt).

Changes Made

Syntax Corrections

  • Removed empty FOL formulae entries
  • Fixed incorrect parenthesis positions and counts
  • Removed extraneous punctuation
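
The syntax problems listed above can be screened for mechanically. The following is a minimal sketch, not the script used for this PR: `check_formula` is a hypothetical helper that flags empty formulas and mismatched parentheses in a single FOL string. How the formulas are read out of folio-train.jsonl is left out, since the field layout is not described here.

```python
def check_formula(formula: str) -> list[str]:
    """Return a list of syntax problems found in one FOL formula string.

    Hypothetical helper for illustration; it only checks emptiness and
    parenthesis balance, not full FOL well-formedness.
    """
    problems = []
    if not formula.strip():
        problems.append("empty formula")
        return problems

    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis appeared before its opener.
                problems.append("unmatched ')'")
                break
    if depth > 0:
        problems.append("unclosed '('")
    return problems
```

An empty result means the formula passed both checks; it does not mean the formula is semantically correct.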

Predicate Name Consistency

  • Standardized predicate naming conventions for proper nouns:
    • Robin → robin
    • Mark → mark
    • Carol → carol
  • Fixed predicate capitalization:
    • team(x) → Team(x)
    • women(adenocarcinoma) → Women(adenocarcinoma)
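
The capitalization fixes can be located with a simple pattern match. This is a rough sketch under the assumption, implied by the fixes above, that predicate names start with an uppercase letter; `lowercase_predicates` is a hypothetical helper and treats any identifier immediately followed by `(` as a predicate name.

```python
import re

# An identifier directly followed by "(" is treated as a predicate name.
# This is an assumption for illustration, not a full FOL parser.
PREDICATE_RE = re.compile(r"\b([A-Za-z_]\w*)\(")


def lowercase_predicates(formula: str) -> list[str]:
    """Return predicate names in the formula that start with a lowercase letter."""
    return [name for name in PREDICATE_RE.findall(formula) if name[0].islower()]
```

For example, `lowercase_predicates("women(adenocarcinoma)")` reports `women`, while `Team(x)` is not flagged.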

Variable/Predicate Fixes

  • Corrected Country(x) Nearby formula structure

I hope these fixes contribute to the continued improvement of this dataset.
