This repository now supports deployment via a Helm chart.
- Use `setup.sh` within the `helm` subdirectory to deploy all services using Helm.
- Most configuration is managed in `helm/values.yaml`. The most common settings to change are:
  - Authentication (API keys, secrets, etc.)
  - Domain URL for access (`firecrawlDomainUrl` controls the Ingress host)
- The code assumes a Traefik ingress proxy for external access.
- All core Kubernetes YAML and system design remain unchanged; Helm simply provides a more flexible and maintainable deployment method.
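As a minimal sketch of the domain setting mentioned above: the key name `firecrawlDomainUrl` comes from this README, but its position in the file and the hostname shown are assumptions, so check `helm/values.yaml` for the actual layout. The authentication keys are omitted because their names are not given here.

```yaml
# Excerpt of helm/values.yaml (sketch; all other keys omitted).
# Assumption: firecrawlDomainUrl is a top-level key. Verify against the real file.
firecrawlDomainUrl: firecrawl.example.com  # placeholder host for the Ingress
```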
This repository contains Kubernetes manifests and scripts to deploy the Firecrawl system, a scalable web crawling and automation platform, on a Kubernetes cluster. The system is composed of several microservices, including an API, worker, Playwright automation service, and Redis for caching and message brokering.
- Architecture
- Components
- Prerequisites
- Setup
- Configuration
- Usage
- Testing
- Teardown
- Contributing
- License

    +----------------+      +----------------+      +------------------+
    |                |      |                |      |                  |
    |  API Service   +<---->+     Redis      +<---->+     Worker       |
    |                |      |                |      |                  |
    +-------+--------+      +----------------+      +------------------+
            |
            v
    +----------------------+
    |  Playwright Service  |
    +----------------------+

- API Service: Handles client requests and orchestrates crawling tasks.
- Worker: Processes crawl jobs and interacts with Playwright for browser automation.
- Playwright Service: Provides browser automation capabilities.
- Redis: Used for caching, message brokering, and job queueing.
- `k8s/api-deployment.yaml` & `k8s/api-service.yaml`: Deploy and expose the API service.
- `k8s/api-ingress.yaml`: Ingress configuration for external access to the API.
- `k8s/worker-deployment.yaml`: Deploys the worker service.
- `k8s/playwright-service-deployment.yaml` & `k8s/playwright-service-service.yaml`: Deploy and expose the Playwright automation service.
- `k8s/redis-deployment.yaml` & `k8s/redis-service.yaml`: Deploy and expose Redis.
- `k8s/firecrawl-config-configmap.yaml`: Configuration for the system (non-sensitive).
- `k8s/firecrawl-secrets-secret.yaml`: Sensitive configuration (e.g., API keys, credentials).
- Kubernetes cluster (local or cloud)
- kubectl
- Docker (for building images, if needed)
- Access to required container images (see deployment manifests)
- Clone the repository:

      git clone https://github.com/your-org/firecrawl_k8s.git
      cd firecrawl_k8s

- Configure environment variables and secrets:
  - Edit `k8s/firecrawl-config-configmap.yaml` for non-sensitive configuration.
  - Edit `k8s/firecrawl-secrets-secret.yaml` for secrets (do not commit sensitive data).
- Deploy all services:

      ./setup.sh

  This script applies all Kubernetes manifests in the `k8s/` directory.
- ConfigMap (`k8s/firecrawl-config-configmap.yaml`): Stores non-sensitive configuration (e.g., environment variables, feature flags).
- Secret (`k8s/firecrawl-secrets-secret.yaml`): Stores sensitive data (e.g., API keys, credentials). Do not commit real secrets to version control.
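To illustrate how a ConfigMap and a Secret typically reach a container, here is a hedged sketch. The resource names, keys, container name, and image are hypothetical and are not taken from the repository's manifests. Only the Kubernetes fields themselves (`stringData`, `envFrom`, `configMapRef`, `secretRef`) are standard.

```yaml
# Sketch only: names and keys below are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: firecrawl-config           # assumed name
data:
  EXAMPLE_FEATURE_FLAG: "true"     # hypothetical non-sensitive setting
---
apiVersion: v1
kind: Secret
metadata:
  name: firecrawl-secrets          # assumed name
type: Opaque
stringData:                        # plain-text input; stored base64-encoded under data
  EXAMPLE_API_KEY: "replace-me"    # hypothetical key; never commit a real value
---
# Fragment of a pod template (spec.template.spec), not a standalone manifest.
containers:
  - name: api                      # hypothetical container name
    image: example/firecrawl-api   # placeholder image
    envFrom:                       # load every key as an environment variable
      - configMapRef:
          name: firecrawl-config
      - secretRef:
          name: firecrawl-secrets
```

Using `stringData` avoids hand-encoding values in base64, which makes it harder to commit an encoded real secret by mistake.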
- API Access: The API service is exposed via the Ingress defined in `k8s/api-ingress.yaml`. Update the ingress host as needed for your environment.
- Scaling: You can scale the worker or API deployments using:

      kubectl scale deployment worker-deployment --replicas=3
      kubectl scale deployment api-deployment --replicas=2

- Logs: View logs for a pod:

      kubectl logs <pod-name>
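For the API Access item above, updating the ingress host usually means changing a single `host` entry. The sketch below shows where that field sits in a standard `networking.k8s.io/v1` Ingress. The resource name, ingress class, backend service name, and port are assumptions for illustration and may differ from `k8s/api-ingress.yaml`.

```yaml
# Sketch of an Ingress; compare with k8s/api-ingress.yaml before applying.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress                # assumed name
spec:
  ingressClassName: traefik        # assumption: a Traefik ingress controller
  rules:
    - host: firecrawl.example.com  # replace with the host for your environment
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service  # assumed Service name
                port:
                  number: 3002     # assumed port; use the one the Service exposes
```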
To run integration or system tests (if provided):

    ./test.sh

This script will execute tests against the deployed services.
To remove all deployed resources:

    ./teardown.sh

This script deletes all Kubernetes resources created by the setup.
Contributions are welcome! Please open issues or submit pull requests for improvements or bug fixes.
This project is licensed under the MIT License. See LICENSE for details.
- Ensure all secrets are managed securely.
- For production deployments, review and update resource requests/limits, security contexts, and ingress settings as needed.
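As a rough starting point for the production review mentioned above, the following Deployment fragment shows where replicas, resource requests/limits, and a container security context are declared. The container name and all numbers are illustrative assumptions, not recommendations derived from this project. Size them from observed usage.

```yaml
# Fragment of a Deployment spec; values are illustrative only.
spec:
  replicas: 3                      # declarative alternative to kubectl scale
  template:
    spec:
      containers:
        - name: worker             # hypothetical container name
          resources:
            requests:              # what the scheduler reserves for the pod
              cpu: 500m
              memory: 512Mi
            limits:                # hard ceiling enforced at runtime
              cpu: "1"
              memory: 1Gi
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true     # only works if the image runs as a non-root user
```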