Commit 5f31166

changelog
1 parent 406ff1d commit 5f31166

3 files changed: +31 -25 lines changed

changelog.md

Lines changed: 12 additions & 0 deletions
@@ -1,5 +1,17 @@
 # Changelog
 
+## Unreleased
+
+### Added
+
+- `EDS.Tokenizer` now handles `-\n` (found in text when a long word is split across a line break) as a specific token, which can be discarded by the normalizer pipe.
+
+### Fixed
+
+- When using `ignore_space_tokens=True`, words separated only by linebreaks will be collected (via `get_text()`) with spaces in between
+- The `process` method of `Qualifiers` now accepts a `Span` as input and treats it as a `Doc` to avoid alignment issues
+- The `detailed_status_mapping` of disorder/behavior pipes is now a defaultdict to avoid the `KeyError: None` that can occur when loading pre-annotated docs without instantiating pipes beforehand
+
 ## v0.13.1
 
 ### Added
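
The `-\n` entry in the changelog above can be illustrated with a standalone sketch. This is not EDS-NLP's tokenizer: the regex `TOKEN_RE` and the helpers `tokenize` and `drop_linebreak_hyphens` are hypothetical names, used only to show the idea of emitting `-\n` as its own token so that a later step can discard it.

```python
import re

# Hypothetical pattern (an assumption, not the library's rules):
# try the hyphen + linebreak pair first, then words, then any other
# single non-space character.
TOKEN_RE = re.compile(r"-\n|\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping '-\\n' as one dedicated token."""
    return TOKEN_RE.findall(text)


def drop_linebreak_hyphens(tokens):
    """Discard the dedicated '-\\n' tokens and glue the word halves back."""
    return "".join(tok for tok in tokens if tok != "-\n")


tokens = tokenize("hospi-\ntalisation")
rejoined = drop_linebreak_hyphens(tokens)
```

Because the hyphen and the linebreak form one token, removing it does not affect ordinary hyphens inside compound words.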

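The `detailed_status_mapping` fix relies on standard `collections.defaultdict` behavior, sketched below. The status labels and the `None` fallback are assumptions chosen for illustration; the commit page does not show the actual mapping or its default.

```python
from collections import defaultdict

# Hypothetical labels, for illustration only.
plain_mapping = {1: "PRESENT", 2: "SEVERE"}

# A plain dict raises KeyError when the status was never set (None).
try:
    plain_mapping[None]
    raised = False
except KeyError:
    raised = True

# A defaultdict returns the factory's value for missing keys instead.
# The fallback value (None here) is an assumption.
safe_mapping = defaultdict(lambda: None, plain_mapping)
fallback = safe_mapping[None]
```

Note that looking up a missing key in a `defaultdict` also stores that key with the fallback value.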
docs/pipes/ner/behaviors/index.md

Lines changed: 18 additions & 19 deletions
@@ -9,25 +9,24 @@ Some general considerations about those components:
 - The matched comorbidity is also available under the `ent.label_` of each match.
 - Matches have an associated `_.status` attribute taking the value `1` or `2`. A corresponding `_.detailed_status` attribute stores the human-readable status, which can be component-dependent. See each component documentation for more details.
 - Some components add additional information to matches. For instance, the `tobacco` component adds, if relevant, the extracted *pack-year* (= *paquet-année*). This information is available under the `ent._.assigned` attribute.
-- Those components work on **normalized** documents. Please use the `eds.normalizer` pipeline with the following parameters:
-```{ .python .no-check }
-nlp.add_pipe(
-    eds.normalizer(
-        accents=True,
-        lowercase=True,
-        quotes=True,
-        spaces=True,
-        pollution=dict(
-            information=True,
-            bars=True,
-            biology=True,
-            doctors=True,
-            web=True,
-            coding=True,
-            footer=True,
-        ),
-    ),
-)
+- Those components work on **normalized** documents. Please use the `eds.normalizer` pipeline with these additional flags:
+
+```{ .python .no-check }
+import edsnlp, edsnlp.pipes as eds
+...
+
+nlp.add_pipe(
+    eds.normalizer(
+        accents=True,
+        lowercase=True,
+        quotes=True,
+        spaces=True,
+        pollution=dict(
+            biology=True,
+            coding=True,
+        ),
+    ),
+)
 ```
 
 --8<-- "docs/pipes/ner/disorders/warning.md"

docs/pipes/ner/disorders/index.md

Lines changed: 1 addition & 6 deletions
@@ -12,7 +12,7 @@ Some general considerations about those components:
 - The matched comorbidity is also available under the `ent.label_` of each match.
 - Matches have an associated `_.status` attribute taking the value `1` or `2`. A corresponding `_.detailed_status` attribute stores the human-readable status, which can be component-dependent. See each component documentation for more details.
 - Some components add additional information to matches. For instance, the `tobacco` component adds, if relevant, the extracted *pack-year* (= *paquet-année*). This information is available under the `ent._.assigned` attribute.
-- Those components work on **normalized** documents. Please use the `eds.normalizer` pipeline with the following parameters:
+- Those components work on **normalized** documents. Please use the `eds.normalizer` pipeline with these additional flags:
 
 ```{ .python .no-check }
 import edsnlp, edsnlp.pipes as eds
@@ -25,13 +25,8 @@ Some general considerations about those components:
         quotes=True,
         spaces=True,
         pollution=dict(
-            information=True,
-            bars=True,
             biology=True,
-            doctors=True,
-            web=True,
             coding=True,
-            footer=True,
         ),
     ),
 )
