import edsnlp, edsnlp.pipes as eds
from edsnlp.pipes.ner.tnm.patterns_new import tnm_pattern_new
from edsnlp.pipes.ner.tnm.patterns import tnm_pattern
text = "Mise à jour de la classification : T3 N1b M0."
# Old
nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.tnm(pattern=tnm_pattern))
print(nlp(text).ents)
# Out: ()
# New
nlp_new = edsnlp.blank("eds")
nlp_new.add_pipe(eds.tnm(pattern=tnm_pattern_new))
print(nlp_new(text).ents)
# Out: (T3 N1b M0)
Changes
patterns_new.py: File containing the new TNM regex. Compared to the old one, it adds many new sections.
patterns.py: Old regex file. Some sections were renamed to match the new section names used in model.py.
tnm.py: Changed the default pattern to the new pattern.
test_tnm.py: Changed the tnm pipe definition so the tests still use the old regex.
model.py: Removed part of the pydantic typing validation so the model works with both the old and new patterns.
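To make the idea of regex "sections" concrete, here is a minimal sketch using named groups. Both patterns below are hypothetical simplifications written for this illustration; they are not the regexes in patterns.py or patterns_new.py, and the group names are assumptions. The sketch only shows how a pattern with extra optional sections and tolerated whitespace can match the example text where a stricter one does not.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# They are NOT the regexes shipped in patterns.py / patterns_new.py.

# Strict variant: components must be contiguous, no sub-letters.
STRICT = re.compile(
    r"T(?P<tumour>[0-4x])N(?P<node>[0-3x])M(?P<metastasis>[01x])",
    re.IGNORECASE,
)

# Permissive variant: optional whitespace between components and
# extra optional "specification" sections (e.g. the "b" in N1b).
PERMISSIVE = re.compile(
    r"""
    T\s?(?P<tumour>[0-4x])(?P<tumour_specification>[a-d])?
    \s*
    N\s?(?P<node>[0-3x])(?P<node_specification>[a-c])?
    \s*
    M\s?(?P<metastasis>[01x])
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract(pattern, text):
    """Return the named sections of the first match, or None if no match."""
    match = pattern.search(text)
    return match.groupdict() if match else None


text = "Mise à jour de la classification : T3 N1b M0."
strict_result = extract(STRICT, text)
permissive_result = extract(PERMISSIVE, text)
print(strict_result)
print(permissive_result)
```

Each named group plays the role of one section, so adding sections to a pattern also adds keys that downstream code (here, a plain dict) has to accept.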
TODO
model.py: restore proper pydantic typing
test_tnm.py: update unit tests
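One possible direction for the typing TODO, sketched under assumptions: declare the sections that only one of the two patterns produces as optional, so a single typed model accepts both outputs. The class name and field names below are hypothetical and chosen for illustration; the actual fields in model.py may differ.

```python
from typing import Optional

from pydantic import BaseModel


class TNMSketch(BaseModel):
    """Hypothetical model; field names are assumptions, not model.py's."""

    # Sections assumed to be produced by both patterns.
    tumour: Optional[str] = None
    node: Optional[str] = None
    metastasis: Optional[str] = None
    # Sections assumed to be produced only by the newer pattern.
    tumour_specification: Optional[str] = None
    node_specification: Optional[str] = None


# Output shaped like a match without the extra sections.
old_style = TNMSketch(tumour="3", node="1", metastasis="0")

# Output shaped like a match that also carries a specification.
new_style = TNMSketch(
    tumour="3", node="1", node_specification="b", metastasis="0"
)

print(old_style.node_specification)
print(new_style.node_specification)
```

Keeping every field typed as an optional string preserves validation of the values while tolerating sections that are absent from one pattern.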
Checklist
[] If this PR is a bug fix, the bug is documented in the test suite.
[] Changes were documented in the changelog (pending section).
[] If necessary, changes were made to the documentation (e.g. new pipeline).
if torch.Tensor in copyreg.dispatch_table:
- old_dispatch[torch.Tensor] = copyreg.dispatch_table[torch.Tensor]
copyreg.pickle(torch.Tensor, reduce_empty)
83
9
0
89.16%
edsnlp/utils/span_getters.py
Was already missing at lines 78-80
if span_getter is None:
- yield doc[:], None
- return
if callable(span_getter):
Was already missing at lines 81-83
if callable(span_getter):
- yield from span_getter(doc)
- return
for key, span_filter in span_getter.items():
Was already missing at line 85
if key == "*":
- candidates = (
(span, group) for group in doc.spans.values() for span in group
Was already missing at lines 94-97
else:
- for span, group in candidates:
- if span.label_ in span_filter:
- yield span, group
Was already missing at line 101
if callable(span_setter):
- span_setter(doc, matches)
else:
Was already missing at line 181
elif isinstance(v, str):
- new_value[k] = [v]
elif isinstance(v, list) and all(isinstance(i, str) for i in v):
231
10
0
95.67%
edsnlp/utils/resources.py
Was already missing at line 33
if not verbs:
- return conjugated_verbs
24
1
0
95.83%
edsnlp/utils/numbers.py
Was already missing at line 34
else:
- string = s
string = string.lower().strip()
self.on_stop()
- except BaseException as e:
...
- self.main_control_queue.put(e)
finally:
Was already missing at lines 402-404
pass
- except StopSignal:
- pass
for name, queue in self.consumer_queues(stage):
Was already missing at line 542
while schedule[task_idx] is None:
- task_idx = (task_idx + 1) % len(schedule)
Was already missing at lines 606-608
if isinstance(docs, StreamSentinel):
- self.active_batches[stage].append([None, None, None, docs])
- continue
batch_id = str(hash(tuple(id(x) for x in docs)))[-8:] + "-" + self.uid
if not consultation_mention:
- consultation_mention = []
elif consultation_mention is True:
48
2
0
95.83%
edsnlp/pipes/core/normalizer/__init__.py
Was already missing at line 7
def excluded_or_space_getter(t):
- return t.is_space or t.tag_ == "EXCLUDED"
5
1
0
80.00%
edsnlp/pipes/core/endlines/endlines.py
Was already missing at lines 160-164
if end_lines_model is None:
- path = build_path(__file__, "base_model.pkl")
-
- with open(path, "rb") as inp:
- self.model = pickle.load(inp)
elif isinstance(end_lines_model, str):
# if module is reloaded.
- existing_func = registry.factories.get(internal_name)
- if not util.is_same_func(factory_func, existing_func):
raise ValueError(
31
2
0
93.55%
edsnlp/package.py
Was already missing at lines 474-476
version = version or pyproject["project"]["version"]
- except (KeyError, TypeError):
- version = "0.1.0"
name = name or pyproject["project"]["name"]
Description
Add a new TNM regex that outperforms the old one. By default, eds.tnm will use the new regex pattern, but the old one will remain accessible.