@ArneBinder ArneBinder commented Sep 23, 2025

This change was missed in #476.

This also moves the helper method `_config_to_str` to the tests package root and fixes an import bug in an unused test fixture in `tests/taskmodules/test_simple_transformer_text_classification.py`.

@ArneBinder ArneBinder added the bug Something isn't working label Sep 23, 2025
@ArneBinder ArneBinder changed the title from "use documents and annotations directly from pie_documents" to "fix: use documents and annotations directly from pie_documents" Sep 23, 2025

codecov bot commented Sep 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.82%. Comparing base (c156b3a) to head (16f21ca).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #499      +/-   ##
==========================================
- Coverage   76.77%   69.82%   -6.95%     
==========================================
  Files          31       31              
  Lines        1804     1803       -1     
  Branches      346      346              
==========================================
- Hits         1385     1259     -126     
- Misses        339      474     +135     
+ Partials       80       70      -10     

@ArneBinder ArneBinder merged commit 481157c into main Sep 23, 2025
4 checks passed
@ArneBinder ArneBinder deleted the fix-use-pie-documents branch September 23, 2025 17:30
ArneBinder added a commit that referenced this pull request Sep 24, 2025
This PR implements #459, i.e., it adds the models and taskmodules
implemented originally in
[pie-modules](https://github.com/ArneBinder/pie-modules) (except for QA
and span-pair based RE, see potential follow-ups below).

- added models:
  - `SequenceClassificationModelWithPooler`
  - `SequencePairSimilarityModelWithPooler`
  - `SimpleTokenClassificationModel`
  - `SimpleGenerativeModel`
  - `SimpleSequenceClassificationModel`
  - `TokenClassificationModelWithSeq2SeqEncoderAndCrf`
- added taskmodules:
  - `RETextClassificationWithIndicesTaskModule`
  - `TextToTextTaskModule`
  - `LabeledSpanExtractionByTokenClassificationTaskModule`
  - `PointerNetworkTaskModuleForEnd2EndRE`
  - `CrossTextBinaryCorefTaskModule`
 
**IMPORTANT: This restricts the version of transformers to
`>=4.35.0,<4.37.0`! So, this is breaking.**
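
To illustrate what the pin above means in practice, here is a minimal sketch of a range check for a version string. The helper names (`parse_release`, `satisfies_transformers_pin`) are hypothetical and not part of this repository, and pre-release suffixes such as `rc1` are handled only approximately; real installers apply the full PEP 440 rules.

```python
import re
from typing import Tuple

# Bounds taken from the constraint stated above: >=4.35.0,<4.37.0
LOWER_BOUND = (4, 35, 0)  # inclusive
UPPER_BOUND = (4, 37, 0)  # exclusive


def parse_release(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric release segment, e.g. '4.36.2' -> (4, 36, 2).

    Simplification: suffixes such as '.dev0' or 'rc1' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions like '4.36' to three components for comparison.
    return parts + (0,) * max(0, 3 - len(parts))


def satisfies_transformers_pin(version: str) -> bool:
    """Return True if the release lies in [4.35.0, 4.37.0)."""
    return LOWER_BOUND <= parse_release(version) < UPPER_BOUND
```

Under these assumptions, `4.35.0` and `4.36.2` fall inside the allowed range, while `4.34.1` and `4.37.0` do not.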

requires: 
- #482
- #499

Additional changes:
- add `tabulate` and `pytorch-crf` to dev dependencies
- set dependency `torchmetrics[text] >=1.5, <2` to resolve conflicts with
`nltk` (the `text` extra pulls in the required additional dependencies, and `>=1.5`
ensures that no deprecated nltk models are loaded; note that we already
use the modern nltk models in
[`pie_documents.document.processing.NltkSentenceSplitter`](https://github.com/ArneBinder/pie-documents/blob/main/src/pie_documents/document/processing/sentence_splitter.py))
- add `SpanNotAlignedWithTokenException` and `get_aligned_token_span` to
`utils.document`
- add `RequiresMaxInputLength` and `RequiresTaskmoduleConfig` to
`models.interface`
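
For context on the alignment helpers listed above: the general problem is mapping a character span onto token indices and failing loudly when the span boundaries do not coincide with token boundaries. The following is only a sketch of that idea under stated assumptions; `align_char_span_to_tokens` and `SpanNotAlignedError` are hypothetical names, and the actual signatures of `get_aligned_token_span` and `SpanNotAlignedWithTokenException` are not shown in this PR. Token offsets are assumed to be `(start, end)` character pairs, with special tokens encoded as empty `(0, 0)` entries.

```python
from typing import List, Optional, Tuple


class SpanNotAlignedError(ValueError):
    """Hypothetical error: a character span does not match token boundaries."""


def align_char_span_to_tokens(
    token_offsets: List[Tuple[int, int]], char_start: int, char_end: int
) -> Tuple[int, int]:
    """Return (token_start, token_end), end exclusive, covering exactly
    the characters [char_start, char_end)."""
    start_idx: Optional[int] = None
    end_idx: Optional[int] = None
    for idx, (tok_start, tok_end) in enumerate(token_offsets):
        if tok_start == tok_end:
            # Skip empty offsets (assumed to mark special tokens).
            continue
        if tok_start == char_start and start_idx is None:
            start_idx = idx
        if tok_end == char_end:
            end_idx = idx + 1
    if start_idx is None or end_idx is None or start_idx >= end_idx:
        raise SpanNotAlignedError(
            f"span ({char_start}, {char_end}) is not aligned with token boundaries"
        )
    return start_idx, end_idx


# Offsets for "Hello world!" with one special token at each end (assumed layout).
example_offsets = [(0, 0), (0, 5), (6, 11), (11, 12), (0, 0)]
```

With `example_offsets`, the character span `(6, 12)` ("world!") maps to tokens `(2, 4)`, whereas `(2, 5)` starts inside a token and raises the error.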

potential follow-ups:
- [ ] add remaining models (SimpleExtractiveQuestionAnsweringModel and
SpanTupleClassificationModel)
- [ ] add remaining taskmodules (ExtractiveQuestionAnsweringTaskModule
and RESpanPairClassificationTaskModule)

---------

Co-authored-by: Danylo Mysak <[email protected]>