A microservices LLM API platform with MCP tool integration
Before running Jan Server locally, make sure you have:
- Docker Desktop (Windows/macOS) or Docker Engine + docker compose V2 (Linux)
- Make (preinstalled on most Linux/macOS systems; install separately on Windows)
- Git for cloning the repository
- 8 GB RAM minimum (12 GB recommended for all services)
- Optional: NVIDIA GPU + recent drivers if you plan to run local vLLM inference
# Clone and enter the repo
git clone https://github.com/janhq/jan-server.git
cd jan-server
# Interactive setup (runs jan-cli wizard and docker compose)
make quickstart

The quickstart target wraps jan-cli and guides you through:
- Selecting the LLM provider (local vLLM vs remote OpenAI-compatible endpoint)
- Choosing the MCP search provider (Serper, SearXNG, or disabled)
- Enabling or disabling the Media API
Need to rerun the wizard? Execute make quickstart again and accept the prompt to update your .env.
Prefer a scripted setup? Run:
make setup # Generates/updates .env via jan-cli
make up-full # Starts every service defined in docker-compose.yml

More detail: Quickstart Documentation
Services running after make up-full:
- API Gateway: http://localhost:8000 (Kong)
- LLM API: http://localhost:8080 (OpenAI-compatible)
- Response API: http://localhost:8082 (Multi-step orchestration)
- Media API: http://localhost:8285 (Media management)
- MCP Tools: http://localhost:8091 (Tool integration)
- API Documentation: http://localhost:8000/v1/swagger/
- Keycloak Console: http://localhost:8085 (admin/admin)
Keycloak now runs directly from the official quay.io/keycloak/keycloak:24.0.5 image with our realm/import scripts bind-mounted at runtime; no bundled Keycloak source tree is required.
Full setup guide: Getting Started
Jan Server is an enterprise-grade LLM API platform that provides:
- OpenAI-compatible API for chat completions and conversations
- Multi-step tool orchestration with Response API for complex workflows
- Media management with S3 integration and jan_* ID resolution
- MCP (Model Context Protocol) tools for web search, scraping, and code execution
- OAuth/OIDC authentication via Keycloak with guest access
- Full observability with OpenTelemetry, Prometheus, Jaeger, and Grafana
- Flexible deployment with Docker Compose profiles and Kubernetes support
- OpenAI-compatible chat completions API with streaming support
- Response API for multi-step tool orchestration (max depth: 8, timeout: 45s)
- Media API with S3 storage, jan_* ID system, and presigned URLs
- MCP tools (google_search, web scraping, code execution via SandboxFusion)
- Conversation and message management with PostgreSQL persistence
- Guest and user authentication via Keycloak OIDC enforced by Kong gateway (JWT + custom API key plugin)
- API gateway routing via Kong v3.5
- Distributed tracing with Jaeger and OpenTelemetry
- Metrics and dashboards with Prometheus and Grafana
- Development mode with host.docker.internal support for flexible debugging
- Comprehensive testing suite with 6 jan-cli api-test collections
- Service template system for rapid microservice creation
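The gateway accepts the two credential styles described above: a Keycloak-issued JWT sent as a Bearer token, or an API key (prefixed `sk_`) sent via `X-API-Key`. As an illustrative sketch (the helper name is ours, not part of any Jan Server SDK), picking the right header could look like:

```python
def auth_headers(token: str) -> dict:
    """Build request headers for the Kong gateway.

    Jan Server accepts either a Keycloak-issued JWT (Bearer token)
    or a key from the custom API key plugin (keys start with "sk_",
    sent via X-API-Key). This routing rule mirrors the README text;
    confirm details against the gateway docs.
    """
    headers = {"Content-Type": "application/json"}
    if token.startswith("sk_"):
        headers["X-API-Key"] = token  # API-key plugin path
    else:
        headers["Authorization"] = f"Bearer {token}"  # JWT path
    return headers
```

For example, `auth_headers("sk_live_abc")` yields an `X-API-Key` header, while a guest-login JWT yields an `Authorization: Bearer ...` header.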
Primary entry points:
- docs/README.md - Documentation hub overview
- docs/index.md - Navigation map grouped by audience
- docs/architecture/services.md - Service responsibilities and ports
- docs/api/README.md - API reference hub
- docs/getting-started/README.md - Five minute setup
- docs/quickstart.md - Interactive setup walkthrough and commands
Governance and quality:
- DOCUMENTATION_QUALITY_REPORT.md - Recent audit summary and metrics
- CHANGELOG.md - Release history and notable changes
- docs-improve.todo - Active documentation improvement tracker
- CONTRIBUTING.md - Development workflow expectations
- docs/architecture/security.md - Security posture and hardening guidance
jan-server/
|-- services/ # Go microservices
| |-- llm-api/
| |-- response-api/
| |-- media-api/
| |-- mcp-tools/
| |-- template-api/
|-- docs/ # Documentation hub
|-- docker/ # Compose profiles (infra, api, mcp, inference)
|-- monitoring/ # Grafana, Prometheus, OTEL configs
|-- k8s/ # Helm chart + setup guide
|-- config/ # Environment templates and helpers
|-- kong/ # Gateway declarative config
|-- keycloak/ # Realm + theme customisation
|-- scripts/ # Utility scripts (new service template, etc.)
|-- Makefile # Build, test, deploy targets
Key directories:
- services/ - source for each microservice plus local docs.
- docs/ - user, operator, and developer documentation (see docs/README.md).
- docker/ - compose files included via docker-compose.yml.
- monitoring/ - observability stack definitions (Grafana dashboards live here).
- k8s/ - Helm chart (k8s/jan-server) and cluster setup notes.
- config/ - .env templates and environment overlays.
- kong/ / keycloak/ - edge and auth configuration.
- scripts/ - automation (service scaffolding, utility scripts).
| Service | Purpose | Port(s) | Source | Docs |
|---|---|---|---|---|
| LLM API | OpenAI-compatible chat, conversations, models | 8080 (direct), 8000 via Kong | services/llm-api | docs/api/llm-api/README.md |
| Response API | Multi-step orchestration using MCP tools | 8082 | services/response-api | docs/api/response-api/README.md |
| Media API | jan_* IDs, S3 ingest, media resolution | 8285 | services/media-api | docs/api/media-api/README.md |
| MCP Tools | Model Context Protocol tools (search, scrape, file search, python) | 8091 | services/mcp-tools | docs/api/mcp-tools/README.md |
See docs/architecture/services.md for dependency graphs and integration notes.
Create new microservices quickly with the template system:
# Generate new service from template
./scripts/new-service-from-template.ps1 -Name my-new-service
# Template includes:
# - Go service skeleton with Gin HTTP server
# - Configuration management (Viper)
# - Structured logging (Zerolog)
# - OpenTelemetry tracing support
# - PostgreSQL with GORM
# - Dependency injection with Wire
# - Docker and Makefile setup
# - Health check endpoint

Documentation:
- Template guide: docs/guides/services-template.md
- Template README: docs/guides/services-template.md
# Start services
make up-full # Full stack (all 4 APIs + infrastructure)
make up-gpu # With GPU inference (vLLM)
make up-cpu # CPU-only inference
make up # Infrastructure only (DB, Keycloak, Redis)
# Build services
make build-llm-api # Build LLM API
make build-response-api # Build Response API
make build-media-api # Build Media API
make build-mcp # Build MCP Tools
# Development
make test-all # Run all test suites
make swag # Generate API docs
# Testing
make test-auth # Authentication tests
make test-conversations # Conversation tests
make test-response # Response API tests
make test-media # Media API tests
make test-mcp # MCP tools tests
make test-e2e # Gateway E2E tests
# Monitoring
make monitor-up # Start monitoring stack
make monitor-logs # View monitoring logs
# Logs & Status
make logs-llm-api # View LLM API logs
make logs-response-api # View Response API logs
make logs-media-api # View Media API logs
make logs-mcp # View MCP Tools logs
make health-check # Check all services health
# Database
make db-migrate # Run migrations
make db-reset # Reset database
make db-seed # Seed test data
# Cleanup
make down # Stop services
make clean # Clean artifacts
make clean-all # Clean everything (including volumes)

Run services on your host for debugging:
# Start all services in Docker with host.docker.internal support
make dev-full
# Stop any service and run it on your host
docker compose stop llm-api
.\scripts\dev-full-run.ps1 llm-api # Windows
./scripts/dev-full-run.sh llm-api # Linux/macOS

See Development Guide and Dev-Full Mode for details.
Jan Server includes a unified CLI tool for configuration management, service operations, and development tasks.
# Install globally (recommended)
make cli-install
# Add to PATH as instructed, then run from anywhere
jan-cli --version
jan-cli config validate
jan-cli service list

Use the wrapper scripts from the project root:
# Linux/macOS/WSL
./jan-cli.sh config validate
./jan-cli.sh service list
./jan-cli.sh dev setup
# Windows PowerShell
.\jan-cli.ps1 config validate
.\jan-cli.ps1 service list
.\jan-cli.ps1 dev setup

The wrapper scripts automatically build the CLI if needed.
Configuration Management:
jan-cli config validate # Validate configuration
jan-cli config export --format env # Export as environment variables
jan-cli config show llm-api # Show service configuration
jan-cli config k8s-values --env prod # Generate Kubernetes values

Service Operations:
jan-cli service list # List all services
jan-cli service logs llm-api # Show service logs
jan-cli service status # Check service health

Development Tools:
jan-cli dev setup # Setup development environment
jan-cli dev scaffold my-service # Create new service from template

Documentation:
- Complete guide: docs/guides/jan-cli.md
- Command reference: cmd/jan-cli/README.md
Kong (http://localhost:8000) fronts all /llm/* services and enforces Keycloak-issued JWTs or the custom API key plugin (X-API-Key: sk_*). Acquire temporary guest tokens at POST /llm/auth/guest-login, then include Authorization: Bearer <token> (or X-API-Key) on subsequent requests.
# Get guest token (no registration required)
curl -X POST http://localhost:8000/llm/auth/guest-login
# Sample response:
# {
# "access_token": "eyJhbGc...",
# "token_type": "Bearer",
# "expires_in": 3600,
# "refresh_token": "...",
# "user_id": "guest-..."
# }

# Simple chat completion
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"model": "jan-v1-4b",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": true
}'
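With `"stream": true` the endpoint emits server-sent events, which in the standard OpenAI-compatible format arrive as `data: {...}` lines ending with `data: [DONE]`. A minimal sketch of extracting the text deltas, assuming Jan Server follows that standard chunk shape (verify against the Swagger docs):

```python
import json

def iter_stream_deltas(lines):
    """Yield text deltas from an OpenAI-style SSE stream.

    Assumes the standard streaming chunk format:
      data: {"choices": [{"delta": {"content": "..."}}]}
    terminated by 'data: [DONE]'.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta
```

Feed it the response body lines from any HTTP client and join the yielded deltas to reconstruct the full completion.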
# With media (using jan_* ID)
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"model": "jan-v1-4b",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image?"},
{"type": "image_url", "image_url": {"url": "jan_01hqr8v9k2x3f4g5h6j7k8m9n0"}}
]
}]
}'

# Upload media (remote URL)
curl -X POST http://localhost:8285/v1/media \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"source": {
"type": "remote_url",
"url": "https://example.com/image.jpg"
},
"user_id": "user123"
}'
# Response:
# {
# "id": "jan_01hqr8v9k2x3f4g5h6j7k8m9n0",
# "mime": "image/jpeg",
# "bytes": 45678,
# "presigned_url": "https://s3.menlo.ai/platform-dev/..."
# }
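As a sketch of the round trip shown by the curl examples in this section (upload, then trade the returned `jan_*` ID for a presigned URL), here are two illustrative helpers; the function names are ours, while the request/response shapes mirror the examples:

```python
def media_upload_body(url: str, user_id: str) -> dict:
    """Body for POST /v1/media using a remote_url source,
    matching the request shape in the curl example."""
    return {"source": {"type": "remote_url", "url": url}, "user_id": user_id}

def resolve_body(upload_response: dict) -> dict:
    """Given the upload response, build the body for
    POST /v1/media/resolve to obtain a presigned URL."""
    media_id = upload_response["id"]
    # Media API identifiers use the jan_ prefix per the docs above.
    assert media_id.startswith("jan_"), "expected a jan_* media ID"
    return {"ids": [media_id]}
```

The resolved presigned URLs can then be embedded as `image_url` content parts in chat completion requests, or the `jan_*` ID can be passed directly as shown earlier.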
# Resolve jan_* ID to presigned URL
curl -X POST http://localhost:8285/v1/media/resolve \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"ids": ["jan_01hqr8v9k2x3f4g5h6j7k8m9n0"]}'

# Google search
curl -X POST http://localhost:8000/v1/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "google_search",
"arguments": {"q": "latest AI news", "num": 5}
}
}'
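MCP requests are plain JSON-RPC 2.0 envelopes, so tool calls like the one above can be generated programmatically. A small illustrative builder (the helper name is ours; tool names and argument schemas come from `GET /v1/mcp/tools`):

```python
import itertools

_request_ids = itertools.count(1)  # JSON-RPC ids must be unique per request

def mcp_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 tools/call envelope for the MCP endpoint,
    matching the google_search example above."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
```

POST the returned dict as JSON to `http://localhost:8000/v1/mcp` (or the direct MCP Tools port 8091).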
# List available tools
curl -X GET http://localhost:8091/v1/mcp/tools

# Create response with tool execution
curl -X POST http://localhost:8082/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"input": "Search for the latest AI news and summarize the top 3 results"
}'
# Response includes:
# - Tool execution trace
# - Final generated response
# - Execution metadata (depth, duration, etc.)

More examples: API Documentation
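A minimal sketch of the `/v1/responses` request body used above; the builder name is ours, and note that the orchestration limits quoted earlier (max tool depth 8, 45s timeout) are enforced server-side rather than passed as request parameters, unless the API reference says otherwise:

```python
def responses_body(model: str, user_input: str) -> dict:
    """Minimal body for POST /v1/responses, matching the curl
    example above: a model name plus a free-form input string.
    Additional fields are documented in docs/api/response-api."""
    return {"model": model, "input": user_input}
```

POST the dict as JSON to `http://localhost:8082/v1/responses` (or via Kong) and inspect the tool execution trace in the reply.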
make up-full # All services
make up-gpu # With GPU inference
make up-cpu # CPU-only inference
make monitor-up # Add monitoring stack

# Quick setup with defaults
make setup
# Or manually configure
cp config/secrets.env.example config/secrets.env
# Edit config/secrets.env with your API keys:
# - HF_TOKEN (HuggingFace token for model downloads)
# - SERPER_API_KEY (for Google Search tool)
# - POSTGRES_PASSWORD (database password)
# - KEYCLOAK_ADMIN_PASSWORD (Keycloak admin password)
# Available environment configs:
# - config/defaults.env - Base configuration
# - config/development.env - Docker development
# - config/testing.env - Testing configuration
# - config/production.env.example - Production template

Required secrets:
- HF_TOKEN - HuggingFace token (get from https://huggingface.co/settings/tokens)
- SERPER_API_KEY - Serper API key (get from https://serper.dev)
See Deployment Guide for production setup.
# Run all tests (6 jan-cli api-test collections)
make test-all
# Specific test suites
make test-auth # Authentication flows (guest + user)
make test-conversations # Conversation management
make test-response # Response API orchestration
make test-media # Media API operations
make test-mcp # MCP tools integration
make test-e2e # Gateway end-to-end tests
# Test reports
# - CLI output: Detailed results with assertions

Test Collections:
- tests/automation/auth-postman-scripts.json - Auth flows
- tests/automation/conversations-postman-scripts.json - Conversations
- tests/automation/responses-postman-scripts.json - Response API
- tests/automation/media-postman-scripts.json - Media API
- tests/automation/mcp-postman-scripts.json - MCP tools
- tests/automation/test-all.postman.json - Complete E2E suite
Testing guide: docs/guides/testing.md
Access monitoring dashboards:
- Grafana: http://localhost:3331 (admin/admin)
- Prometheus: http://localhost:9090
- Jaeger: http://localhost:16686
See Monitoring Guide for configuration.
| Layer | Technology | Version |
|---|---|---|
| API Gateway | Kong | 3.5 |
| Services | Go (Gin framework) | 1.21+ |
| Database | PostgreSQL | 18 |
| Cache | Redis | Latest |
| Auth | Keycloak (OIDC) | Latest |
| Inference | vLLM | Latest |
| Search | SearXNG | Latest |
| Code Execution | SandboxFusion | Latest |
| Observability | OpenTelemetry | Latest |
| Metrics | Prometheus | Latest |
| Tracing | Jaeger | Latest |
| Dashboards | Grafana | Latest |
| MCP Protocol | mark3labs/mcp-go | Latest |
| Container | Docker Compose | 2.0+ |
| Orchestration | Kubernetes + Helm | 1.28+ |
Microservices:
- LLM API: Go 1.21+ with Gin, GORM, Wire DI
- Response API: Go 1.21+ with Gin, GORM, Wire DI
- Media API: Go 1.21+ with Gin, GORM, S3 SDK
- MCP Tools: Go 1.21+ with JSON-RPC 2.0
See CONTRIBUTING.md for contribution guidelines.
[License information]
- Documentation: docs/README.md
- Issue Tracker: https://github.com/janhq/jan-server/issues
- Discussions: https://github.com/janhq/jan-server/discussions
Quick Start: make setup && make up-full | Documentation: docs/ | API Docs: http://localhost:8000/v1/swagger/