A FastAPI-based service that provides text embeddings using various Sentence Transformer models. This service offers a simple API to generate embeddings for text inputs, supporting both single strings and batches.
- Multiple model support (`all-MiniLM-L6-v2`, `all-mpnet-base-v2`, `paraphrase-multilingual-MiniLM-L12-v2`)
- OpenAI-compatible API format
- Batched inference support
- Docker support
- Comprehensive test suite
- GitHub Actions CI/CD pipeline
- Token usage tracking
docker run -d --name embedding-service -p 8000:8000 ghcr.io/shaharia-lab/embedding-service:latest
# Build the image
docker build -t embedding-service .
# Run the container
docker run -p 8000:8000 embedding-service
- Create a virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies:
pip install -r requirements-dev.txt
- Run the server:
uvicorn app.main:app --reload
Once running, the API will be available at http://localhost:8000. You can visit http://localhost:8000/docs for interactive API documentation.
curl -X POST http://localhost:8000/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": "Hello world",
"model": "all-MiniLM-L6-v2"
}'
import requests
url = "http://localhost:8000/v1/embeddings"
payload = {
"input": "Hello world",
"model": "all-MiniLM-L6-v2" # optional, defaults to all-MiniLM-L6-v2
}
headers = {"Content-Type": "application/json"}
response = requests.post(url, json=payload)
result = response.json()  # full OpenAI-compatible response, not just the raw vectors
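`response.json()` returns the whole OpenAI-compatible response object, so the embedding vector and token counts are read out of it. A minimal sketch, assuming the standard OpenAI embeddings response shape (`data`, `usage`):
vector = result["data"][0]["embedding"]  # list of floats for "Hello world"
print(len(vector))                       # embedding dimensionality
print(result.get("usage"))               # token usage tracking (e.g. prompt_tokens)
For batch processing, pass a list of strings as the input: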
payload = {
"input": ["Hello world", "Another text"],
"model": "all-MiniLM-L6-v2"
}
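Each input string gets its own embedding object in the response. Assuming the same OpenAI-compatible shape, the results can be read back in order (a sketch):
response = requests.post(url, json=payload, headers=headers)
batch_result = response.json()
for item in batch_result["data"]:  # one entry per input, ordered by "index"
    print(item["index"], len(item["embedding"]))
The service also works with the official OpenAI SDKs. For example, in JavaScript/TypeScript: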
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: "",
baseURL: "http://localhost:8000/v1"
});
const embedding = await openai.embeddings.create({
model: "all-MiniLM-L6-v2",
input: "Your text string goes here",
encoding_format: "float",
});
console.log(embedding);
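The same works from Python with the official openai client (v1+) by pointing base_url at the service. A minimal sketch, assuming the openai package is installed; the placeholder API key mirrors the empty apiKey in the JavaScript example above:
from openai import OpenAI

client = OpenAI(api_key="unused", base_url="http://localhost:8000/v1")  # placeholder key, not validated here

resp = client.embeddings.create(
    model="all-MiniLM-L6-v2",
    input="Your text string goes here",
)
print(resp.data[0].embedding[:5])  # first few dimensions of the vector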
Once the server is running, you can access:
- Interactive API documentation: http://localhost:8000/docs
- OpenAPI schema: http://localhost:8000/openapi.json
# Run tests
pytest app/tests -v
# Run tests with coverage
pytest app/tests -v --cov=app --cov-report=term-missing
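New tests can follow the existing suite's pattern by exercising the endpoint through FastAPI's TestClient. A minimal sketch, assuming app.main exposes the app instance (as the uvicorn app.main:app command above implies):
from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

def test_embeddings_endpoint():
    resp = client.post(
        "/v1/embeddings",
        json={"input": "Hello world", "model": "all-MiniLM-L6-v2"},
    )
    assert resp.status_code == 200
    # OpenAI-compatible responses return embeddings under "data"
    assert resp.json()["data"][0]["embedding"]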
embedding-service/
├── app/
│ ├── __init__.py
│ ├── main.py
│ └── tests/
│ ├── __init__.py
│ └── test_main.py
├── requirements.txt
├── requirements-dev.txt
├── Dockerfile
└── README.md
The project uses GitHub Actions for:
- Running tests on pull requests and pushes to main
- Building and publishing Docker images on releases
- Automated testing and validation
MIT License
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request