22 changes: 14 additions & 8 deletions README.md
@@ -4,15 +4,21 @@ This repository contains the non-sensitive Kubernetes declarations powering the

 Secrets and credentials are managed separately in a Blackbox repository: [tfwiki/secrets](https://github.com/tfwiki/secrets)

+> **Warning**
+> We are migrating away from managing Kubernetes resources directly via manifest files (fiddly and error-prone) to managing them via Terraform.
+>
+> See the [terraform](./terraform) folder for progress on this migration
+
 Rough notes:

-### Prerequisites
-* Kubernetes cluster running 1.8.x (to avoid hardcoding NFS Service IP in PersistentVolume declaration)
-* Cloud SQL database `cloudsql-instance-credentials` https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine
-* Persistent disk for mediawiki images (mounted via NFS)
-* Global Static IP address
+## Prerequisites
+
+- Kubernetes cluster running 1.8.x (to avoid hardcoding NFS Service IP in PersistentVolume declaration)
+- Cloud SQL database `cloudsql-instance-credentials` <https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine>
+- Persistent disk for mediawiki images (mounted via NFS)
+- Global Static IP address

-### Task list
+## Task list

 1. Create cluster in Google Container Engine
 2. Work on correct zone (`gcloud config set compute/zone [COMPUTE-ZONE]`)
@@ -26,6 +32,6 @@ Rough notes:

 Syncing files from the Valve-hosted wiki is managed via the [`media-sync`](k8s/common/media-sync.yaml) job, but needs authorised SSH keys stored within a Kubernetes secret:

-```
+```sh
 kubectl create secret generic media-sync-secret --from-file=ssh-privatekey=/path/to/.ssh/id_rsa --from-file=ssh-publickey=/path/to/.ssh/id_rsa.pub
-```
+```
34 changes: 34 additions & 0 deletions terraform/.gitignore
@@ -0,0 +1,34 @@
# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Crash log files
crash.log

# Exclude all .tfvars files, which are likely to contain sensitive data, such as
# passwords, private keys, and other secrets. These should not be part of version
# control as they are data points which are potentially sensitive and subject
# to change depending on the environment.
#
*.tfvars

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Include override files you do wish to add to version control using negated pattern
#
# !example_override.tf

# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
# example: *tfplan*

# Ignore CLI configuration files
.terraformrc
terraform.rc
42 changes: 42 additions & 0 deletions terraform/.terraform.lock.hcl


39 changes: 39 additions & 0 deletions terraform/README.md
@@ -0,0 +1,39 @@
# Terraform deployment stuffs

All the Kubernetes stuff, but in Terraform so it's easier to deal with.

Resources to track/import:

- Cluster itself
  - [x] GKE cluster
  - [x] GKE node pool
- Supporting infrastructure
  - [ ] Blackfire
  - [x] Ingress
  - [ ] Cert manager ??
  - [ ] Filestore
  - [ ] CloudSQL database
  - [ ] (any other external resources?)
- Kubernetes deployments
  - [x] Cloudsql-proxy daemonset
  - [x] mcrouter daemonset
  - [x] Mediawiki deployment
  - [x] Mediawiki-update deployment
  - [x] Memcached stateful set
  - [x] Run-jobs deployment
  - [x] Update special pages cron job
  - [x] Varnish deployment
- Kubernetes services
  - [x] all-varnish
  - [x] cloudsql-proxy
  - [x] mcrouter
  - [x] mediawiki
  - [x] memcached
  - [x] nfs-server
  - [x] nfs-varnish

TODO

- Extract appropriate variables from kubernetes configs
- Replace hardcoded resource references with usage of resource attributes
- Set up remote Terraform state
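
For the remote state item, one option on Google Cloud is Terraform's `gcs` backend. The block below is only a sketch: the bucket name and prefix are hypothetical and not part of this change, and the bucket would need to exist before `terraform init` is run.

```hcl
terraform {
  backend "gcs" {
    # Hypothetical bucket name; create it separately before `terraform init`
    bucket = "example-terraform-state"

    # Hypothetical prefix; state objects are stored under this path in the bucket
    prefix = "kubernetes"
  }
}
```

After adding a backend block, re-running `terraform init` should detect the change and offer to migrate any existing local state.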
1 change: 1 addition & 0 deletions terraform/config.tf
@@ -0,0 +1 @@
# TODO: Config and secrets
59 changes: 59 additions & 0 deletions terraform/gke-cluster/main.tf
@@ -0,0 +1,59 @@
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "3.90.0"
    }
  }
}

data "google_container_engine_versions" "supported" {
  location       = var.google_zone
  version_prefix = var.kubernetes_version
}

resource "google_container_cluster" "default" {
  name               = var.cluster_name
  location           = var.google_zone
  min_master_version = data.google_container_engine_versions.supported.latest_master_version
  # node version must match master version
  # https://www.terraform.io/docs/providers/google/r/container_cluster.html#node_version
  node_version       = data.google_container_engine_versions.supported.latest_master_version
  initial_node_count = 0

  resource_labels = {
    "env" = var.env_label
  }
}
resource "google_container_node_pool" "highcpu" {
  name    = "high-cpu-pool"
  cluster = var.cluster_name

  node_locations = [
    var.google_zone
  ]

  node_config {
    machine_type = var.machine_type

    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/service.management",
      "https://www.googleapis.com/auth/servicecontrol",
    ]
  }

  autoscaling {
    max_node_count = 9
    min_node_count = 3
  }

  depends_on = [
    # Can't directly reference this for cluster_name because it'll force a
    # replacement due to diff name formats
    google_container_cluster.default
  ]
}
7 changes: 7 additions & 0 deletions terraform/gke-cluster/output.tf
@@ -0,0 +1,7 @@
output "node_version" {
  value = google_container_cluster.default.node_version
}

output "google_zone" {
  value = var.google_zone
}
20 changes: 20 additions & 0 deletions terraform/gke-cluster/variables.tf
@@ -0,0 +1,20 @@
variable "kubernetes_version" {
  default = "1.18"
}

variable "cluster_name" {
  type = string
}

variable "google_zone" {
  type = string
}

variable "machine_type" {
  type    = string
  default = "n1-highcpu-32"
}

variable "env_label" {
  type = string
}
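
Three of these variables (`cluster_name`, `google_zone`, `env_label`) have no default, so any caller has to supply them. As a rough sketch, assuming `gke-cluster` is consumed as a module from the parent `terraform/` folder (the diff does not show how it is wired up), a call might look like the following. All values are hypothetical placeholders.

```hcl
module "gke_cluster" {
  # Assumes the calling configuration sits next to the gke-cluster/ folder
  source = "./gke-cluster"

  # Hypothetical values for the variables that have no default
  cluster_name = "example-cluster"
  google_zone  = "us-west1-a"
  env_label    = "production"

  # kubernetes_version and machine_type fall back to their defaults
}
```

The `google` provider would still need a project and credentials configured in the calling configuration.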
134 changes: 134 additions & 0 deletions terraform/kubernetes-config/cloudsql-proxy.tf
@@ -0,0 +1,134 @@
resource "kubernetes_service" "cloudsql_proxy" {
  metadata {
    name = "cloudsql-proxy"

    labels = {
      app = "cloudsql-proxy"
    }
  }

  spec {
    port {
      name        = "cloudsql-proxy"
      protocol    = "TCP"
      port        = 3306
      target_port = "cloudsql-proxy"
    }

    selector = {
      app = "cloudsql-proxy"
    }

    type                    = "NodePort"
    session_affinity        = "None"
    external_traffic_policy = "Cluster"
  }
}

resource "kubernetes_daemonset" "cloudsql_proxy" {
  metadata {
    name = "cloudsql-proxy"

    labels = {
      app = "cloudsql-proxy"
    }
  }

  spec {
    selector {
      match_labels = {
        app = "cloudsql-proxy"
      }
    }

    template {
      metadata {
        labels = {
          app = "cloudsql-proxy"
        }
      }

      spec {
        volume {
          name = "cloudsql-instance-credentials"

          secret {
            secret_name  = "cloudsql-instance-credentials"
            default_mode = "0644"
          }
        }

        volume {
          name = "ssl-certs"

          host_path {
            path = "/etc/ssl/certs"
          }
        }

        volume {
          name = "cloudsql"
        }

        container {
          name  = "cloudsql-proxy"
          image = "gcr.io/cloudsql-docker/gce-proxy:1.11"

          # TODO: Extract variables
          command = ["/cloud_sql_proxy", "--dir=/cloudsql", "-instances=tfwiki-182108:us-west1:tfwiki-production=tcp:0.0.0.0:3306", "-credential_file=/secrets/cloudsql/credentials.json"]

          port {
            name           = "cloudsql-proxy"
            container_port = 3306
            protocol       = "TCP"
          }

          resources {
            limits = {
              cpu = "1024m"

              memory = "512Mi"
            }

            requests = {
              cpu = "512m"

              memory = "128Mi"
            }
          }

          volume_mount {
            name       = "cloudsql-instance-credentials"
            read_only  = true
            mount_path = "/secrets/cloudsql"
          }

          volume_mount {
            name       = "ssl-certs"
            mount_path = "/etc/ssl/certs"
          }

          volume_mount {
            name       = "cloudsql"
            mount_path = "/cloudsql"
          }

          termination_message_path   = "/dev/termination-log"
          termination_message_policy = "File"
          image_pull_policy          = "IfNotPresent"
        }

        restart_policy                   = "Always"
        termination_grace_period_seconds = 30
        dns_policy                       = "ClusterFirst"
      }
    }

    strategy {
      type = "RollingUpdate"
    }

    revision_history_limit = 10
  }
}
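
One possible way to address the `# TODO: Extract variables` note on the proxy command is to lift the Cloud SQL instance connection name into a variable and build the argument list in a local. This is only a sketch: the variable and local names below are hypothetical and do not appear in this change.

```hcl
# Hypothetical variable; the name is an assumption, not part of this change
variable "cloudsql_instance_connection_name" {
  type        = string
  description = "Cloud SQL instance in project:region:instance form"
}

locals {
  # Same arguments as the hardcoded command, with the instance interpolated
  cloudsql_proxy_command = [
    "/cloud_sql_proxy",
    "--dir=/cloudsql",
    "-instances=${var.cloudsql_instance_connection_name}=tcp:0.0.0.0:3306",
    "-credential_file=/secrets/cloudsql/credentials.json",
  ]
}
```

The container block could then use `command = local.cloudsql_proxy_command`, with the current value `tfwiki-182108:us-west1:tfwiki-production` supplied through the variable.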
