- A collection of transformer models built with Hugging Face for various tasks. Training is done with PyTorch Lightning.
- Datasets, models, and tokenizers come from Hugging Face.
- Goal: get familiar with the Hugging Face and PyTorch Lightning ecosystems.
- To train models, install using pip: `pip install transformers-collection`
- Check the installation: `transformers-collection version`
- To play around with the code, clone the repo: `git clone [email protected]:aadhithya/transformers-collection.git`
- Install poetry: `pip install poetry`
- Install dependencies: `poetry install`
Note: `poetry install` will create a new venv.
Note: `poetry install` and `pip install` install the CPU version of PyTorch if PyTorch is not already available. Make sure to install the CUDA version if needed.
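
To see which PyTorch build ended up in the environment, a quick check along these lines may help. This is a minimal sketch, not part of the package: `describe_torch_build` is a hypothetical helper name, and it relies only on the standard `torch.cuda.is_available()` and `torch.version.cuda` attributes.

```python
import importlib.util


def describe_torch_build() -> str:
    """Return a one-line summary of the installed PyTorch build.

    Hypothetical helper for illustration; not provided by transformers-collection.
    """
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch  # imported lazily so the check above also works without torch

    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version the installed build was compiled against
        return f"PyTorch {torch.__version__} with a usable GPU (CUDA build: {torch.version.cuda})"
    return f"PyTorch {torch.__version__} without a usable GPU (CPU only)"


print(describe_torch_build())
```

If the output reports CPU only on a machine with a GPU, the CPU wheel was probably installed and the CUDA build needs to be installed separately.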
- Create the YAML config file for the model (see `configs/sentiment-clf.yml` for an example).
- Train the model using: `transformers-collection train /path/to/config.yml`
- For a list of supported models, see section Supported Models.
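
The train command above can also be scripted, for example to run several configs in a row. The following is a rough sketch under stated assumptions: `build_train_command` and `train_all` are hypothetical helpers, not part of the package, and the sketch assumes the CLI exits with a non-zero status when training fails.

```python
import shutil
import subprocess

CLI_NAME = "transformers-collection"


def build_train_command(config_path):
    """Build the argument list for `transformers-collection train <config>`."""
    return [CLI_NAME, "train", str(config_path)]


def train_all(config_paths):
    """Run training once per config file, stopping at the first failure."""
    if shutil.which(CLI_NAME) is None:
        raise RuntimeError(f"{CLI_NAME} is not on PATH; install it with pip first")
    for config_path in config_paths:
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumption: the CLI signals failure through its exit status)
        subprocess.run(build_train_command(config_path), check=True)
```

Calling `train_all(["configs/sentiment-clf.yml"])` from the repo root would then run the example config.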
The following models are supported or planned:
| Task | Model | Default Dataset | Status | Checkpoint |
|---|---|---|---|---|
| Text Classification | SentimentClassification | emotion | ✅ | TBD |
| Text Summarization | - | 🗓️ Planned | TBD |
- Automatically push models to the Hugging Face Hub with the commit ref.
- Make models available via transformers pipelines.
- Add more models.