You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+11-10Lines changed: 11 additions & 10 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,9 +2,9 @@
2
2
3
3
This package wraps the fast and efficient [UDPipe](http://ufal.mff.cuni.cz/udpipe) language-agnostic NLP pipeline
4
4
(via its [Python bindings](https://github.com/ufal/udpipe/tree/master/bindings/python)), so you can use
5
-
[UDPipe pre-trained models](https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-2998) as a [spaCy](https://spacy.io/) pipeline for 50+ languages out-of-the-box.
6
-
Inspired by [spacy-stanfordnlp](https://github.com/explosion/spacy-stanfordnlp), this package offers slightly less accurate
7
-
models that are in turn much faster (see benchmarks for [UDPipe](https://ufal.mff.cuni.cz/udpipe/models#universal_dependencies_24_models_performance) and [StanfordNLP](https://stanfordnlp.github.io/stanfordnlp/performance.html)).
5
+
[UDPipe pre-trained models](https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-3131) as a [spaCy](https://spacy.io/) pipeline for 50+ languages out-of-the-box.
6
+
Inspired by [spacy-stanza](https://github.com/explosion/spacy-stanza), this package offers slightly less accurate
7
+
models that are in turn much faster (see benchmarks for [UDPipe](https://ufal.mff.cuni.cz/udpipe/models#universal_dependencies_25_models_performance) and [Stanza](https://stanfordnlp.github.io/stanza/performance.html)).
8
8
9
9
## Installation
10
10
@@ -14,10 +14,10 @@ Use the package manager [pip](https://pip.pypa.io/en/stable/) to install spacy-u
14
14
pip install spacy-udpipe
15
15
```
16
16
17
-
After installation, use `spacy_udpipe.download(<language ISO code>)` to download the pre-trained model for the desired language.
17
+
After installation, use `spacy_udpipe.download()` to download the pre-trained model for the desired language.
18
18
19
19
## Usage
20
-
The loaded UDPipeLanguage class returns a spaCy [`Language` object](https://spacy.io/api/language), i.e., the nlp object you can use to process text and create a [`Doc` object](https://spacy.io/api/doc).
20
+
The loaded UDPipeLanguage class returns a spaCy [`Language` object](https://spacy.io/api/language), i.e., the object you can use to process text and create a [`Doc` object](https://spacy.io/api/doc).
21
21
22
22
```python
23
23
import spacy_udpipe
@@ -56,18 +56,19 @@ Created by [Antonio Šajatović](http://github.com/asajatovic) during an interns
56
56
## Contributing
57
57
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
58
58
59
-
Please make sure to update tests as appropriate. Tests are run automatically for each pull request on the master branch. To start the tests locally, just run [`pytest`](https://docs.pytest.org/en/latest/contents.html) in the root source directory.
59
+
Please make sure to update the tests as appropriate. Tests are run automatically for each pull request on the master branch.
60
+
To start the tests locally, first, install the package with `pip install -e .`, then run [`pytest`](https://docs.pytest.org/en/latest/contents.html) in the root source directory.
Maintained by [Text Analysis and Knowledge Engineering Lab (TakeLab)](http://takelab.fer.hr/).
66
67
67
68
## Notes
68
69
* All available pre-trained models are licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/).
69
70
70
-
*Full list of pre-trained models for supported languages is available in [`languages.json`](https://github.com/TakeLab/spacy-udpipe/blob/master/spacy_udpipe/languages.json).
71
+
*A full list of pre-trained models for supported languages is available in [`languages.json`](https://github.com/TakeLab/spacy-udpipe/blob/master/spacy_udpipe/languages.json).
71
72
72
73
* This package exposes a `spacy_languages` entry point in its [`setup.py`](https://github.com/TakeLab/spacy-udpipe/blob/master/setup.py) so full suport for serialization is enabled:
73
74
```python
@@ -84,10 +85,10 @@ Maintained by [Text Analysis and Knowledge Engineering Lab (TakeLab)](http://tak
84
85
* Known possible issues:
85
86
* Tag map
86
87
87
-
`Token.tag_`is a [CoNLL](https://universaldependencies.org/format.html) XPOS tag (language-specific part-of-speech tag), defined for each language separately by the corresponding [Universal Dependencies](https://universaldependencies.org/) treebank. Mappings between between XPOSand Universal Dependencies POS tags should be defined in a `TAG_MAP` dictionary (located in language-specific `tag_map.py` files), along with optional morphological features. See [spaCy tag map](https://spacy.io/usage/adding-languages#tag-map) for more details.
88
+
`Token.tag_`is a [CoNLL](https://universaldependencies.org/format.html) XPOS tag (language-specific part-of-speech tag), defined for each language separately by the corresponding [Universal Dependencies](https://universaldependencies.org/) treebank. Mappings between XPOSand Universal Dependencies POS tags should be defined in a `TAG_MAP` dictionary (located in language-specific `tag_map.py` files), along with optional morphological features. See [spaCy tag map](https://spacy.io/usage/adding-languages#tag-map) for more details.
88
89
* Syntax iterators
89
90
90
91
In order to extract `Doc.noun_chunks`, a proper syntax iterator implementation for the language of interest is required. For more details, please see [spaCy syntax iterators](https://spacy.io/usage/adding-languages#syntax-iterators).
91
92
* Other language-specific issues
92
93
93
-
A quick way to check language-specific defaults in [spaCy](https://spacy.io) is to visit [spaCy language support](https://spacy.io/usage/models#languages). Also, please see [spaCy language data](https://spacy.io/usage/adding-languages#language-data) for details regarding other language-specific data.
94
+
A quick way to check language-specific defaults in [spaCy](https://spacy.io) is to visit [spaCy language support](https://spacy.io/usage/models#languages). Also, please see [spaCy language data](https://spacy.io/usage/adding-languages#language-data) for details regarding other language-specific data.
0 commit comments