Implement ingress for easier services exposing #1314

sitole · 2025-10-07T15:39:55Z

Deploys Traefik v3.5 as a Nomad job, configured to auto-discover Nomad services.
Adds an ingress load balancer that handles traffic for services behind Traefik.
Moves "additional domains" to the secret store so that other IaC projects can read them and DNS propagates correctly for each domain routed through Traefik.

Note

Deploys a Traefik-based ingress via Nomad and GCP HTTPS load balancer, and moves additional routing domains to Secret Manager with module-wide wiring.

Ingress (Traefik)
- Add Nomad job ingress running traefik:v3.5 with Nomad/Consul providers; new ingress_count and ingress_port variables.
- Provision GCP External Managed HTTPS LB for ingress: health check, backend service to api_instance_group, URL map, target HTTPS proxy, global IP/forwarding rule; reference existing certificate map.
- Wire ingress_port across modules (variables.tf, main.tf), add API instance group named port, and open firewall for the ingress port.
Routing domains via secrets
- Create Secret Manager entries routing-domains and initial version; output routing_domains_secret_name.
- Read and merge secret-stored domains with env additional_domains in root module and pass downstream.
Misc
- Makefile: propagate INGRESS_COUNT TF var.

Written by Cursor Bugbot for commit 9e24600. This will update automatically on new commits.
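
The load balancer chain listed in the note above (health check, backend service, URL map, target HTTPS proxy, global IP and forwarding rule) could be wired roughly as follows. This is only a sketch, not the PR's actual `ingress.tf`: every resource name, the three input variables, the named port `ingress`, and the health-check path are assumptions made for illustration.

```hcl
# Hypothetical inputs; the real module wiring may differ.
variable "ingress_port" {
  type = number
}

variable "api_instance_group" {
  type        = string
  description = "Self link of the API instance group that runs Traefik."
}

variable "certificate_map_id" {
  type        = string
  description = "ID of the existing Certificate Manager certificate map."
}

resource "google_compute_health_check" "ingress" {
  name = "ingress-health"

  http_health_check {
    port         = var.ingress_port
    request_path = "/ping" # assumed path; depends on how Traefik is configured
  }
}

resource "google_compute_backend_service" "ingress" {
  name                  = "ingress"
  protocol              = "HTTP"
  port_name             = "ingress" # assumed; must match the instance group's named port
  load_balancing_scheme = "EXTERNAL_MANAGED"
  health_checks         = [google_compute_health_check.ingress.id]

  backend {
    group = var.api_instance_group
  }
}

# Everything falls through to the Traefik backend; Traefik does the host routing.
resource "google_compute_url_map" "ingress" {
  name            = "ingress"
  default_service = google_compute_backend_service.ingress.id
}

resource "google_compute_target_https_proxy" "ingress" {
  name            = "ingress"
  url_map         = google_compute_url_map.ingress.id
  certificate_map = "//certificatemanager.googleapis.com/${var.certificate_map_id}"
}

resource "google_compute_global_address" "ingress" {
  name = "ingress"
}

resource "google_compute_global_forwarding_rule" "ingress" {
  name                  = "ingress"
  target                = google_compute_target_https_proxy.ingress.id
  ip_address            = google_compute_global_address.ingress.address
  port_range            = "443"
  load_balancing_scheme = "EXTERNAL_MANAGED"
}
```

A firewall rule allowing Google's load balancer and health-check ranges to reach `var.ingress_port` would still be needed, as the note mentions; it is left out here to keep the sketch short.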

linear · 2025-10-07T15:39:57Z

ENG-3138 Ingress for additional services deployed to our cloud

sitole · 2025-10-07T15:40:49Z

This is a prerequisite for https://github.com/e2b-dev/belt/pull/217, which shows how to expose a service via the ingress service.
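
For context, exposing a service through a Traefik ingress with Nomad auto-discovery usually comes down to tagging the service. The job below is a minimal sketch, not taken from the linked PR: the job name, image, and hostname are hypothetical, and it assumes the ingress does not expose services by default, so the service has to opt in with `traefik.enable=true`.

```hcl
# Hypothetical example service; names, image, and hostname are illustrative only.
job "example-service" {
  type = "service"

  group "app" {
    network {
      port "http" {
        to = 80
      }
    }

    service {
      name     = "example-service"
      port     = "http"
      provider = "nomad"

      tags = [
        # Opt in explicitly, since services are assumed not to be exposed by default.
        "traefik.enable=true",
        # Route requests for this hostname to the service.
        "traefik.http.routers.example-service.rule=Host(`service.example.com`)",
      ]
    }

    task "app" {
      driver = "docker"

      config {
        image = "traefik/whoami"
        ports = ["http"]
      }
    }
  }
}
```

The hostname in the router rule would also need DNS and a certificate on the load balancer side, which is what the routing-domains secret in this PR is meant to feed.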

djeebus · 2025-10-07T17:23:04Z

iac/provider-gcp/nomad-cluster/network/ingress.tf

+  }
+}
+
+resource "google_compute_url_map" "ingress" {


Can we get a better name here? I think we now have two load balancers, "ingress" and "orch_map", and neither name helps explain what it does. Maybe "traefik" and "direct"?

I like "ingress" because it's a common name and it describes what it is. Yes, "orch_map" is, in my opinion, a mistake, as it no longer makes sense. Ideally, I would like to transition away from the current load balancer once the migration is complete, or rename it to something like "ingress-sandboxes" to distinguish it better.

I don't want to call it Traefik, as we can switch the ingress backend at any time in the future, but I'm okay with you coming up with a better name.

A note here: if we want to rename orch_map to ingress-sandboxes, maybe we should name this ingress something like ingress-api, ingress-management, or ingress-services.

I still don’t like that we would need two load balancers just because we cannot filter sandbox traffic. I will look into it again tomorrow.

Yep, we can rename ingress to something else. I am not sure about management/api, as we may use it for something different in the future. "ingress-services" sounds okay to me.

I thought it was actually quite nice to have separate LBs for users' sandbox traffic and our services traffic (different limitations, limits, HTTP support, etc.), but maybe it's unnecessary.

Ideally, we should be able to match sandbox traffic to different rules (right now it's a catch-all fallback) so we can apply different limits/armor rules to it; then we don't need separate LBs.

As for supporting newer versions of HTTP, etc., we can still migrate everything relatively easily, and I'm not sure we would need a special LB that cannot handle both sandbox and services traffic.

dobrac

Just to confirm: this doesn't route any traffic yet and is just preparation?

dobrac · 2025-10-08T20:33:09Z

iac/provider-gcp/nomad/main.tf

+      cpu_count     = 1
+      memory_mb     = 512


Are these enough for our traffic?

Yes, more than enough for now. When we proceed and start routing sandbox traffic, we need to revisit this, with memory ideally set to 1024 MB minimum and 2048 MB maximum.

dobrac · 2025-10-08T20:35:00Z

iac/provider-gcp/Makefile

 	$(call tfvar, GCP_REGION) \
 	$(call tfvar, GCP_ZONE) \
 	$(call tfvar, DOMAIN_NAME) \
-	$(call tfvar, ADDITIONAL_DOMAINS) \


A note here: this is a breaking change and should be handled with a proper migration path.

Do we have one? @jakubno, we handle this type of transformation manually.

Maybe I can push the attribute before this PR.

Can we maybe take the secret's default value from the env var? Is that a bad idea?

IMHO, during the migration period we can merge the server and TF var lists.

Added compatibility for the migration period: both the env var and the secret are accepted and merged together.
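
The merge described above might look something like the following in the root module. This is a hedged sketch rather than the PR's code: the variable, data source, and local names are hypothetical, the secret ID is assumed to be `routing-domains`, and both sources are assumed to hold comma-separated domain lists.

```hcl
# Hypothetical names throughout; the PR's actual wiring may differ.
variable "additional_domains" {
  type        = string
  default     = ""
  description = "Comma-separated domains from the ADDITIONAL_DOMAINS env var (legacy path)."
}

data "google_secret_manager_secret_version" "routing_domains" {
  secret = "routing-domains" # assumed secret ID
}

locals {
  # Split each source, trim whitespace, and drop empty entries.
  env_domains = compact([
    for d in split(",", var.additional_domains) : trimspace(d)
  ])

  secret_domains = compact([
    for d in split(",", nonsensitive(data.google_secret_manager_secret_version.routing_domains.secret_data)) : trimspace(d)
  ])

  # Union of both sources without duplicates, passed to downstream modules.
  routing_domains = distinct(concat(local.env_domains, local.secret_domains))
}
```

Because empty strings are dropped by `compact`, an unset env var or an empty initial secret version contributes nothing, so either source can be retired after the migration without changing the result.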

cursor · 2025-10-09T10:48:15Z

iac/provider-gcp/nomad/jobs/ingress.hcl

+          "--providers.consulcatalog=true",
+          "--providers.consulcatalog.exposedByDefault=false",
+          "--providers.consulcatalog.endpoint.address=${consul_endpoint}",
+          "--providers.consulcatalog.endpoint.token=${consul_token}",


Bug: Token Exposure and Network Resolution Issues

Sensitive Nomad and Consul tokens are passed as command-line arguments to the Traefik container, exposing them in process lists and logs. Additionally, if the nomad_endpoint is configured as localhost, it may not resolve correctly from within the container using host networking, potentially causing service discovery failures.
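
One possible way to address the process-list exposure is to pass the provider settings as environment variables instead of command-line arguments. The sketch below is not the PR's job file: the job layout is simplified, and `nomad_token` is an assumed template variable alongside the `consul_endpoint`, `consul_token`, and `nomad_endpoint` values mentioned in the review. Traefik documents its static configuration sources (file, CLI arguments, environment variables) as mutually exclusive, so the remaining flags would have to move to environment variables as well.

```hcl
# Simplified, hypothetical version of the ingress job; only the relevant parts are shown.
job "ingress" {
  type = "service"

  group "ingress" {
    task "ingress" {
      driver = "docker"

      config {
        image        = "traefik:v3.5"
        network_mode = "host"
        # No args here: tokens no longer show up in the container's command line.
      }

      env {
        TRAEFIK_PROVIDERS_CONSULCATALOG                  = "true"
        TRAEFIK_PROVIDERS_CONSULCATALOG_EXPOSEDBYDEFAULT = "false"
        TRAEFIK_PROVIDERS_CONSULCATALOG_ENDPOINT_ADDRESS = "${consul_endpoint}"
        TRAEFIK_PROVIDERS_CONSULCATALOG_ENDPOINT_TOKEN   = "${consul_token}"

        TRAEFIK_PROVIDERS_NOMAD                  = "true"
        TRAEFIK_PROVIDERS_NOMAD_ENDPOINT_ADDRESS = "${nomad_endpoint}"
        TRAEFIK_PROVIDERS_NOMAD_ENDPOINT_TOKEN   = "${nomad_token}" # assumed variable name
      }
    }
  }
}
```

This keeps the tokens out of process listings, but they remain readable in the rendered job specification, so it only partly addresses the concern.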

sitole added 3 commits October 7, 2025 17:35

Use additional domains from secret store instead local envs

e8a4bc5

Traefik ingress service

8191248

Load balancer for ingress backed by traefik

2ab1c6f

sitole added the improvement Improvement for current functionality label Oct 7, 2025

e2b-request-same-site-reviewers bot requested a review from dobrac October 7, 2025 15:40

sitole marked this pull request as ready for review October 7, 2025 15:42

sitole requested review from ValentaTomas and jakubno as code owners October 7, 2025 15:42

djeebus reviewed Oct 7, 2025

sitole requested a review from djeebus October 8, 2025 09:11

jakubno self-assigned this Oct 8, 2025

dobrac requested changes Oct 8, 2025

sitole mentioned this pull request Oct 9, 2025

PoC: Use of ingress controller for Nomad cluster #1248

Closed

sitole added 3 commits October 9, 2025 12:42

options to customize ingress job count

158f2a8

Additional domains can be provide from env and storage secret

14fd3f4

Removed outdatec comment

9e24600

cursor bot reviewed Oct 9, 2025
