I ship production-ready ML APIs you can run in minutes: FastAPI, API-key auth, rate limits, p50/p95 metrics, CI, and a Docker image on GHCR.
- Design and ship production-style ML microservices (FastAPI + Uvicorn)
- Add API-key auth & token-bucket rate limiting
- Instrument /metrics (p50/p95) plus /health and /version
- Build reproducible Docker images and publish to GHCR
- Wire up CI smoke tests (boot server → hit health → assert JSON)
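The token-bucket rate limiting mentioned above fits in a few lines; this is a minimal sketch, and the refill rate, burst size, and class name are illustrative assumptions, not the services' actual defaults:

```python
import time

class TokenBucket:
    """Toy token bucket: refills at `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=0.5, capacity=2)  # ~30 req/min with a burst of 2
results = [bucket.allow() for _ in range(3)]
print(results)  # first two requests pass, the third is throttled
```

In a FastAPI service this would typically hang off a dependency keyed by the caller's API key, so each key gets its own bucket.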
- Contract ML Engineer (remote, US-friendly time zones)
- 10–25 hrs/week, 1–3 week sprints, or full time
- ML APIs (FastAPI), RAG baselines, Docker, CI/CD
- Start: immediately · Contact: [email protected] · LinkedIn
Repo → serving_app · Image → ghcr.io/kylesdeveloper/serving_app:latest
# Run the container
docker run --rm -p 8011:8000 ghcr.io/kylesdeveloper/serving_app:latest
# Health & version
curl -s http://localhost:8011/health | python -m json.tool
curl -s http://localhost:8011/version
# Single prediction
curl -s -X POST http://localhost:8011/predict \
-H 'Content-Type: application/json' \
-d '{"features":[5.1,3.5,1.4,0.2], "return_proba": true}' | python -m json.tool
What’s inside: Pydantic schemas, /predict & /predict_batch, /health & /version, CI that boots the API and hits /health.
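Those Pydantic schemas can be sketched minimally; the field names mirror the curl payload above, while the class name and default are illustrative assumptions:

```python
from pydantic import BaseModel

class PredictRequest(BaseModel):
    # Field names mirror the curl example; the class name is an assumption.
    features: list[float]
    return_proba: bool = False

# FastAPI parses and validates the JSON body into this model automatically.
req = PredictRequest(features=[5.1, 3.5, 1.4, 0.2], return_proba=True)
print(req.features, req.return_proba)
```

Validation failures (wrong types, missing fields) surface as 422 responses without any hand-written checks.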
Repo → rag_service (FastAPI + BM25 baseline + metrics)
# Quickstart (local)
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python -m rag_app.index --corpus ./corpus --out ./rag_app/index.json
API_KEY=dev-key RATE_LIMIT_PER_MIN=30 uvicorn rag_app.main:app --port 8010
# Ask
curl -s -X POST "http://localhost:8010/ask" \
-H "x-api-key: dev-key" -H "Content-Type: application/json" \
-d '{"question":"What is coinsurance?","k":5}' | python -m json.tool
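The stopword-aware BM25 scoring behind /ask can be illustrated standalone; the stopword set, toy corpus, and constants here are assumptions for the sketch, not the service's real index:

```python
import math
from collections import Counter

STOPWORDS = {"a", "is", "the", "what", "you"}  # toy subset; real lists are larger

def tokenize(text: str) -> list[str]:
    # Lowercase, strip trailing punctuation, and drop stopwords so they
    # neither match query terms nor inflate document length.
    tokens = (t.strip("?.,!") for t in text.lower().split())
    return [t for t in tokens if t and t not in STOPWORDS]

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75):
    corpus = [tokenize(d) for d in docs]
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    n = len(corpus)
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for d in corpus if term in d)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            )
        scores.append(score)
    return scores

docs = [
    "Coinsurance is the share of costs you pay after the deductible.",
    "A premium is the amount you pay for your plan each month.",
]
scores = bm25_scores("What is coinsurance?", docs)
print(scores)  # the coinsurance document scores highest
```

Filtering stopwords before scoring is the "boosting" piece: it stops common words from matching and keeps document lengths honest.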
# Metrics
curl -s "http://localhost:8010/metrics" | python -m json.tool
What’s inside: API-key auth, per-key rate limits, stopword-aware BM25 boosting, /metrics with p50/p95, CI smoke tests, Dockerfile.
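The p50/p95 numbers reported by /metrics are just percentiles over recorded request latencies; a nearest-rank sketch (sample data and function name are illustrative):

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(pct/100 * N)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Illustrative per-request latencies in milliseconds.
latencies_ms = [12.0, 15.0, 11.0, 90.0, 14.0, 13.0, 16.0, 12.0, 15.0, 14.0]
summary = {"p50": percentile(latencies_ms, 50), "p95": percentile(latencies_ms, 95)}
print(summary)  # the single 90 ms outlier dominates p95 but leaves p50 untouched
```

That asymmetry is why both numbers matter: p50 tracks the typical request, p95 tracks the tail your slowest users actually feel.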
- Ship early: health/version + one endpoint + metrics first, then iterate
- Automate: CI smoke tests (boot server, hit /health, fetch /metrics)
- Document: clear README with curl examples & troubleshooting
- Own the pipeline: training script → artifact → serving API → image → registry
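The smoke-test loop above maps naturally onto a GitHub Actions job. A minimal sketch of such a workflow; the workflow name and step layout are assumptions, with the module path and port borrowed from the rag_service quickstart rather than the repos' actual CI:

```yaml
name: smoke
on: [push]
jobs:
  smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - name: Boot server, hit /health, fetch /metrics
        run: |
          API_KEY=dev-key uvicorn rag_app.main:app --port 8010 &
          for i in $(seq 1 15); do
            curl -sf http://localhost:8010/health && break
            sleep 1
          done
          curl -sf http://localhost:8010/health | python -m json.tool
          curl -sf http://localhost:8010/metrics | python -m json.tool
```

The retry loop matters: the job races server startup, so failing on the first unanswered curl would make the check flaky.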
Python, FastAPI, Uvicorn, Pydantic, scikit-learn, NumPy, Docker, GitHub Actions, rank-bm25
- Open to: contract / part-time / short engagements
- Areas: ML APIs, retrieval/RAG baselines, metrics & reliability, containerization, CI/CD
- Contact: [email protected] · LinkedIn
Want something similar for your team? I can clone one of these services to your domain and ship a runnable image with metrics and CI.
- 🔧 Serving App: https://github.com/KyleSDeveloper/serving_app
- 📚 RAG Service: https://github.com/KyleSDeveloper/rag_service