Official helm charts for deploying Datafold into Kubernetes.
The recommended way to deploy Datafold is using our operator, which provides a simpler and more manageable deployment experience.
You'll need two files from Datafold:

- `datafold-operator-secrets.yaml` - Contains application secrets and configuration
- `datafold-docker-secret.yaml` - Contains Docker registry credentials
Create a namespace for your Datafold deployment:

```bash
kubectl create namespace datafold-apps
kubectl config set-context --current --namespace=datafold-apps
```
Deploy the Docker registry secret to allow pulling private Datafold images:

```bash
kubectl apply -f datafold-docker-secret.yaml
```
Update the `datafold-operator-secrets.yaml` file with your specific configuration (namespace, keys, email server password, etc.) and deploy it:

```bash
kubectl apply -f datafold-operator-secrets.yaml
```
Add the Datafold Helm repository:

```bash
helm repo add datafold https://charts.datafold.com
helm repo update
```
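To confirm the repository was added and see which chart versions it provides, you can run a standard Helm search (no assumptions beyond the repo name used above):

```shell
helm search repo datafold
```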
Deploy the Datafold operator using the `datafold-manager` chart:

```bash
helm upgrade --install datafold-manager datafold/datafold-manager \
  --namespace datafold-apps \
  --set namespace.name=datafold-apps
```
Create a `DatafoldApplication` custom resource to define your Datafold deployment. See the `examples/` directory for configuration templates:

```bash
kubectl apply -f examples/datafold-application-full.yaml
```
The operator will automatically deploy and manage your Datafold application based on the `DatafoldApplication` specification.
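Once the operator picks up the resource, the rollout can be watched with standard `kubectl` commands (a sketch; the resource plural `datafoldapplications` is an assumption based on the CRD name):

```shell
# Watch the pods the operator creates for the application
kubectl get pods --namespace datafold-apps --watch

# Inspect the custom resource itself (plural name is an assumption)
kubectl get datafoldapplications --namespace datafold-apps
```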
The `examples/` directory contains various `DatafoldApplication` configuration templates for different deployment scenarios:

- `datafold-application-full.yaml` - Complete production configuration with all components
- `datafold-application-minimal.yaml` - Minimal configuration for development/testing
- `datafold-application-aws-lb.yaml` - AWS-specific configuration with load balancer
- `datafold-application-gcp-lb.yaml` - GCP-specific configuration with load balancer
- `datafold-application-signoz.yaml` - Configuration with SigNoz monitoring
- `datafold-application-datadog.yaml` - Configuration with Datadog monitoring
Choose the example that best matches your environment and customize it as needed.
This method is intended for users who prefer the traditional Helm charts approach or need more direct control over the deployment. It is significantly more complex than the operator-based installation.
We will run a couple of commands where the Kubernetes operations must execute in the same namespace. We will set that namespace name in a shell environment variable so that the same value is used consistently. You can change `datafold` to any other namespace name you like.

```bash
kubectl create namespace datafold
kubectl config set-context --current --namespace=datafold
```
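The environment-variable approach mentioned above could look like this (a sketch; `NAMESPACE` is just an illustrative variable name):

```shell
# Set the namespace once, then reuse it so every command targets the same value
NAMESPACE=datafold
kubectl create namespace "$NAMESPACE"
kubectl config set-context --current --namespace="$NAMESPACE"
```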
Our images are stored in a private registry. You can request a JSON key to be used to pull those images. Install that JSON key as follows, paying attention to the value of `--docker-server` and making sure the correct namespace is targeted in your context:
```bash
kubectl create secret docker-registry datafold-docker-secret \
  --docker-server=us-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat ~/json-key-file.json)" \
  [email protected]
```
The Helm chart requires a complete `values.yaml` file that merges configuration from multiple sources. You can use our example as a starting point:

```bash
# Copy and customize the example values file
cp examples/old-method-values-example.yaml values.yaml

# Edit values.yaml with your specific configuration:
# - Update serverName, clusterName, and other global settings
# - Configure your database connection details
# - Set up AWS load balancer ARNs and target groups
# - Adjust resource limits and worker counts as needed
```
The example file is based on a real dedicated cloud deployment and includes all necessary configuration sections.
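As an illustration only, the kind of top-level settings mentioned above might look like this in `values.yaml` (field names, nesting, and values here are assumptions; the authoritative structure is in `examples/old-method-values-example.yaml`):

```yaml
# Hypothetical fragment - consult the example file for the real structure
global:
  serverName: datafold.example.com   # public hostname of your deployment
  clusterName: my-datafold-cluster   # identifier for this installation
```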
Make sure to use the latest release of the Helm charts from our release list: https://github.com/datafold/helm-charts/releases

```bash
helm repo add datafold https://charts.datafold.com
helm repo update

helm upgrade --install datafold datafold/datafold \
  --values values.yaml
```
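After the release is installed, its state can be checked with standard Helm and `kubectl` commands (no Datafold-specific assumptions beyond the release name used above):

```shell
# Show the release status and revision
helm status datafold

# List the pods the chart created in the current namespace
kubectl get pods
```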
## Development and Validation
### Kubeconform Validation
This repository includes kubeconform validation to ensure Helm charts conform to the Kubernetes API schema. You can run validation locally using the `jeeves` tool:
```bash
# Install dependencies
pip install -r jeeves/requirements.txt
# Set environment variables (optional - wrapper commands set these automatically)
export DATAFOLD_K8S_SECRETFILE="./dev/secrets.yaml"
export DATAFOLD_K8S_CONFIGFILE="./dev/config.yaml"
export DATAFOLD_DEPLOY_NAME="datafold-test"
export TAG="latest"
# Run kubeconform validation for different cloud providers
# AWS Configuration (automatically sets environment variables and infra file)
./j dev kubeconform run --cloud-provider aws --strict
# GCP Configuration (automatically sets environment variables and infra file)
./j dev kubeconform run --cloud-provider gcp --strict
# Azure Configuration (automatically sets environment variables and infra file)
./j dev kubeconform run --cloud-provider azure --strict
# Custom validation parameters
./j dev kubeconform run --cloud-provider aws --kubernetes-version "1.29.0" --output-format "json"
./j dev kubeconform run --cloud-provider gcp --skip-list "CustomResourceDefinition,ValidatingWebhookConfiguration"
./j dev kubeconform run --cloud-provider azure --strict false
```
The validation runs automatically on pull requests for all three cloud providers (AWS, GCP, Azure) to ensure your charts work correctly across different Kubernetes environments.
For more information about `jeeves` and available commands, see `jeeves/README.md`.
### Local dev install
```bash
helm upgrade --install datafold-operator charts/datafold-operator \
  --namespace datafold-apps \
  --set namespace.name=datafold-apps
```
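Before installing, you can also render the chart locally to inspect the generated manifests without touching a cluster (a sketch using the standard `helm template` command, with the same flags as the install above):

```shell
helm template datafold-operator charts/datafold-operator \
  --namespace datafold-apps \
  --set namespace.name=datafold-apps
```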