O-labels treatment #98

@bschembri-UoM

I would like to understand how seqeval treats sentences with no expected entities.

Taking the example below (adapted from the documentation):
from seqeval.metrics import classification_report
from seqeval.scheme import IOB2

actuals = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O'], ['O', 'O', 'O', 'O']]
preds = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O'], ['O', 'O', 'O', 'O']]
print(classification_report(actuals, preds, mode='strict', scheme=IOB2, digits=4))

I get the following output:

              precision    recall  f1-score   support

        MISC     0.0000    0.0000    0.0000         1
         PER     1.0000    1.0000    1.0000         1

   micro avg     0.5000    0.5000    0.5000         2
   macro avg     0.5000    0.5000    0.5000         2
weighted avg     0.5000    0.5000    0.5000         2

When a sentence with no entities is correctly predicted as all 'O', shouldn't that sentence (its labels) be included in the metric calculations?
The total support of 2 suggests that the last sentence is not taken into consideration at all.
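
To make the question concrete, here is a minimal sketch of entity-level counting as I understand it. It is not seqeval's actual code: `extract_entities` and `entity_counts` are hypothetical helpers written only for illustration, and the sketch assumes well-formed IOB2 tags with exact matching on both span and type.

```python
from collections import defaultdict


def extract_entities(sentences):
    """Return a set of (sentence_index, type, start, end) spans from IOB2 tags."""
    entities = set()
    for sent_idx, tags in enumerate(sentences):
        start, etype = None, None
        # The trailing 'O' is a sentinel that closes an entity ending the sentence.
        for i, tag in enumerate(tags + ['O']):
            # Close the open entity unless this tag continues it.
            if etype is not None and tag != 'I-' + etype:
                entities.add((sent_idx, etype, start, i - 1))
                start, etype = None, None
            if tag.startswith('B-'):
                start, etype = i, tag[2:]
    return entities


def entity_counts(y_true, y_pred):
    """Count true positives, predicted entities, and support per entity type."""
    true_ents = extract_entities(y_true)
    pred_ents = extract_entities(y_pred)
    counts = defaultdict(lambda: {'tp': 0, 'pred': 0, 'support': 0})
    for ent in true_ents:
        counts[ent[1]]['support'] += 1
    for ent in pred_ents:
        counts[ent[1]]['pred'] += 1
        if ent in true_ents:  # exact match on sentence, type, and boundaries
            counts[ent[1]]['tp'] += 1
    return dict(counts)


actuals = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'],
           ['B-PER', 'I-PER', 'O'],
           ['O', 'O', 'O', 'O']]
preds = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'],
         ['B-PER', 'I-PER', 'O'],
         ['O', 'O', 'O', 'O']]

counts = entity_counts(actuals, preds)
print(counts['MISC'])  # {'tp': 0, 'pred': 1, 'support': 1}
print(counts['PER'])   # {'tp': 1, 'pred': 1, 'support': 1}

# Dropping the all-'O' sentence leaves every count unchanged.
print(counts == entity_counts(actuals[:2], preds[:2]))  # True
```

Under this kind of counting, a sentence tagged entirely 'O' in both lists contributes no true and no predicted entities, so it changes neither the scores nor the support. That would be consistent with the total support of 2 in the report above.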

Can you clarify this please?

  • Operating System: Ubuntu 20.04
  • Python Version: 3.7
  • Package Version: 1.2.2
