This repository contains the implementation for "From Gloss to Meaning: Evaluating Pre-trained Language Models for Bidirectional Sign Language Translation" - a comprehensive study comparing fine-tuned pre-trained language models against transformer models trained from scratch for sign language gloss translation.
Our research demonstrates that fine-tuning large pre-trained language models (PLMs) significantly outperforms training Transformers from scratch for bidirectional sign language gloss translation. We evaluate multiple PLMs across three benchmark datasets and achieve state-of-the-art results.
asl-translation/
├── base_pipeline.py # Base class with common functionality
├── preprocessors.py # Text and gloss preprocessing utilities
│
├── gloss_to_text_data.py # Data processing for gloss→text
├── gloss_to_text_model.py # Model handling for gloss→text
├── gloss_to_text_pipeline.py # Complete gloss→text pipeline
│
├── text_to_gloss_data.py # Data processing for text→gloss
├── text_to_gloss_model.py # Model handling for text→gloss
├── text_to_gloss_pipeline.py # Complete text→gloss pipeline
│
├── example_usage.py # Multiple usage examples
├── requirements.txt # Python dependencies
├── __init__.py # Package initialization
├── setup.py # Package installation
└── README.md # This file
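The two translation directions mirror each other: each has a data module, a model module, and a pipeline module that share functionality from base_pipeline.py and preprocessors.py. A minimal sketch of the two entry points (the gloss→text class name appears in the quick start below; the text→gloss class name is assumed by symmetry and may differ):

```python
# Sketch of the two pipeline entry points. GlossToTextTranslationPipeline is used
# in the quick start below; TextToGlossTranslationPipeline is an assumed name for
# the mirrored text→gloss class and may differ in text_to_gloss_pipeline.py.
from gloss_to_text_pipeline import GlossToTextTranslationPipeline
from text_to_gloss_pipeline import TextToGlossTranslationPipeline  # assumed class name

g2t_pipeline = GlossToTextTranslationPipeline()   # gloss → spoken-language text
t2g_pipeline = TextToGlossTranslationPipeline()   # spoken-language text → gloss
```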
- Python 3.8+
- CUDA-capable GPU (8GB+ VRAM recommended, 16GB+ for LLaMA)
- 16GB+ system RAM (32GB+ recommended for LLaMA)
git clone https://github.com/imics-lab/gloss2text.git
cd gloss2text
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

from gloss_to_text_pipeline import GlossToTextTranslationPipeline
# Initialize
pipeline = GlossToTextTranslationPipeline()
# Step by step
raw_ds = pipeline.load_dataset()
df, gloss_col, text_col = pipeline.preprocess_data(raw_ds)
ds = pipeline.prepare_data_for_training(df, gloss_col, text_col)
tokenizer, _ = pipeline.load_model_and_tokenizer()
tok_ds = pipeline.tokenize_data(ds)
trainer = pipeline.train_model(tok_ds, output_dir="./gloss_to_text_t5")

Supported models:
- t5-base: T5-small (220M params)
- flan-t5-base: Flan-T5-small (220M params)
- mbart: mBART-small (125M params)
- llama-8b: LLaMA 3.1 8B (8B params)

Supported datasets:
- signum: SIGNUM dataset (DGS ↔ German)
- phoenix: RWTH-PHOENIX-14T (DGS ↔ German)
- aslg: ASLG-PC12 (ASL ↔ English)

Translation directions:
- g2t: Gloss-to-Text translation
- t2g: Text-to-Gloss translation
- both: Train both directions sequentially
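Once training finishes and the checkpoint has been saved to the output directory (for example via trainer.save_model()), the fine-tuned model can be loaded for inference with plain Hugging Face transformers. A minimal sketch, assuming a T5-style seq2seq checkpoint in ./gloss_to_text_t5; note that if the pipeline prepends a task prefix or applies other preprocessing during training (see preprocessors.py), the same preprocessing should be mirrored at inference time:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumes the fine-tuned checkpoint was saved to ./gloss_to_text_t5,
# e.g. via trainer.save_model("./gloss_to_text_t5").
tokenizer = AutoTokenizer.from_pretrained("./gloss_to_text_t5")
model = AutoModelForSeq2SeqLM.from_pretrained("./gloss_to_text_t5")

gloss = "JOHN FUTURE FINISH BUY HOUSE"  # illustrative ASL gloss sequence
inputs = tokenizer(gloss, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```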
- Fine-tuned PLMs significantly outperform baseline Transformers across all benchmarks.
- G2T is consistently easier than T2G: BLEU-4 is 30–60% higher and WER substantially lower.
- LLaMA 8B achieves the best results overall, especially on large-scale ASLG-PC12 (83.10 BLEU-4 G2T, 55.21 BLEU-4 T2G).
- mBART-small excels on low-resource datasets like SIGNUM and PHOENIX-14T due to its multilingual denoising pre-training.
| Model | Min VRAM | Recommended VRAM | Training Time* |
|---|---|---|---|
| T5-small | 4GB | 8GB | ~2 hours |
| Flan-T5-small | 4GB | 8GB | ~2 hours |
| mBART-small | 6GB | 8GB | ~2.5 hours |
| LLaMA 8B | 12GB | 16GB+ | ~8 hours |
*Approximate times for 1000 samples, 5 epochs on an RTX 4090
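Fitting LLaMA 8B into 12–16GB of VRAM typically requires parameter-efficient fine-tuning. The sketch below shows one common recipe (4-bit quantization with bitsandbytes plus LoRA adapters via peft); it illustrates the general technique and is not necessarily the exact configuration used by this repository's pipeline:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.1-8B"  # illustrative; use the checkpoint configured in the pipeline

# Load the base model in 4-bit NF4 so the quantized weights fit in a few GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Train only small low-rank adapters instead of all 8B parameters.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```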
- BLEU-1/2/3/4: N-gram precision scores
- ROUGE-L: Longest common subsequence
- METEOR: Alignment-based semantic evaluation
- WER: Word Error Rate
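These metrics can be reproduced with standard libraries. A minimal sketch using the Hugging Face evaluate package (the exact tokenization, casing, and aggregation behind the reported numbers may differ):

```python
import evaluate

predictions = ["the man buys a house"]        # model outputs
references  = ["the man is buying a house"]   # gold translations

bleu = evaluate.load("bleu")      # reports BLEU plus 1- to 4-gram precisions
rouge = evaluate.load("rouge")    # reports ROUGE-L among others
meteor = evaluate.load("meteor")
wer = evaluate.load("wer")

bleu_scores = bleu.compute(predictions=predictions,
                           references=[[r] for r in references], max_order=4)
print("BLEU-4:", bleu_scores["bleu"], "n-gram precisions:", bleu_scores["precisions"])
print("ROUGE-L:", rouge.compute(predictions=predictions, references=references)["rougeL"])
print("METEOR:", meteor.compute(predictions=predictions, references=references)["meteor"])
print("WER:", wer.compute(predictions=predictions, references=references))
```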
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
This project is licensed under the MIT License - see the LICENSE file for details.