Cluster Infrastructure #24

@webwurst

FIXME Link here to doc about our current Kubernetes cluster and hosting setup.

  • Monitoring
    • Add customized dashboards using Grafonnet to kube-prometheus.
      • cert-manager
      • openebs
    • Configure alerts and send notifications to a Matrix channel.
    • Add website analytics with Fathom.
    • Create public status page with overview of current apps.
    • Regularly check observatory.mozilla.org for all public sites.
  • Authentication
    • OpenID Connect via Keycloak for kube-apiserver and apps.
    • Add gangway.
  • Security
  • Shared services
    • Kinto
    • Postgres
    • Minio
    • Elasticsearch
  • Backup
    • Regularly push database snapshots and filestores to some S3 storage.
  • Stability
    • Automatically replace the oldest node every twelve hours with a fresh one. Maybe with the help of kured.
    • Make sure limits are set with every pod.
    • Back every service with at least two replicas. Label apps that can't deal with this.
    • Set PodDisruptionBudget for all apps.
    • Set recommended labels for all resources.
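
The stability items above (limits, replicas, PodDisruptionBudget, recommended labels) could be checked mechanically. The following is a minimal sketch, assuming manifests are already loaded as Python dicts; `SINGLE_REPLICA_LABEL`, `check_deployment` and `pdb_for` are hypothetical names made up for this illustration and not part of our current setup.

```python
from typing import Any

# Label keys from the Kubernetes "recommended labels" convention.
RECOMMENDED_LABELS = (
    "app.kubernetes.io/name",
    "app.kubernetes.io/instance",
    "app.kubernetes.io/version",
    "app.kubernetes.io/component",
    "app.kubernetes.io/part-of",
    "app.kubernetes.io/managed-by",
)

# Hypothetical opt-out label for apps that can't run with two replicas.
SINGLE_REPLICA_LABEL = "example.org/single-replica"


def check_deployment(manifest: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems found in a Deployment dict."""
    problems: list[str] = []
    labels = manifest.get("metadata", {}).get("labels", {})
    spec = manifest.get("spec", {})

    for label in RECOMMENDED_LABELS:
        if label not in labels:
            problems.append(f"missing label {label}")

    # Kubernetes defaults to one replica when the field is absent.
    opted_out = labels.get(SINGLE_REPLICA_LABEL) == "true"
    if spec.get("replicas", 1) < 2 and not opted_out:
        problems.append("fewer than two replicas")

    containers = spec.get("template", {}).get("spec", {}).get("containers", [])
    for container in containers:
        limits = container.get("resources", {}).get("limits", {})
        for resource in ("cpu", "memory"):
            if resource not in limits:
                name = container.get("name", "?")
                problems.append(f"container {name} has no {resource} limit")
    return problems


def pdb_for(name: str, namespace: str, min_available: int = 1) -> dict[str, Any]:
    """Build a PodDisruptionBudget dict selecting pods by app.kubernetes.io/name."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app.kubernetes.io/name": name}},
        },
    }
```

Something like this could run in CI against rendered manifests, failing the build when `check_deployment` returns a non-empty list.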

Random Ideas

  • Try Varnish under traffic spikes and backend crashes.
  • Add blackbox exporter for our public services.
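
To illustrate the kind of check a blackbox-style HTTP probe performs on a public service, here is a rough sketch using only the Python standard library. It is not the Prometheus blackbox exporter itself; the `probe` function and its `opener` parameter are hypothetical names introduced for this example.

```python
import urllib.error
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 5.0,
          opener: Callable = urllib.request.urlopen) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout.

    `opener` is injectable so the function can be exercised without network
    access; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False
```

A real setup would let Prometheus scrape the exporter and alert on failed probes; the sketch only shows the pass/fail decision for a single URL.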
