-
Notifications
You must be signed in to change notification settings - Fork 8
feat: add backend infrastructure foundation #892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
i just want to make sure i understand well where we are here.. this is effectively a prerequisite to meaningfully incorporating ai, whatever john produces for a sra metadata db, and makes it easy to support the endpoints we need for ncbi linkouts. yes? what makes this a draft? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @dannon, thanks so much for creating this. I would really love to get a minimal Python backend up and running. Looking at this, I see a few changes I would like to request. I will outline them here as they are a bit structural rather than line by line. So, in the order of the changed-file outline:
-
Can you remove the
VersionDisplaycomponent? It looks like this duplicates or can be merged with the existing version display, and can be done in a separate PR. -
In the
/backenddirectory, can you move the contents there one level down in the folder hierarchy to make room for other backend applications, please? For example, I can see us implementing an "aI-service", a "api-service", and a "next-js-service" as discussed. Looks like this one is theapi-service? -
In
/scripts, please co-locate docker-build.sh with the docker app that it will build. -
In the top-level directory, can you
- Move docker-compose.yml into the backend directory
- For nginx.conf, please move this to the backend directory and remove the section about hosting the Next.js static site content. CloudFront currently serves the static site via S3. No need to change this to get an API server running.
Remove Redis from this PR. Redis could always come in later if needed, but it doesn't seem to be required now, and it could be excluded from a minimal PR. Additionally, CloudFront caches items, depending on the cache control and TTL settings.
Thanks for adding playwright and updating the Jest config.
Thanks again for preparing this.
D
|
OK, I updated the requested changes. I removed the request to delete Redis from this PR. |
|
I want to push back again on the multi-service architecture suggestion, at least for now. The functionality here—LLM search/discovery, Galaxy tool execution, ENA data retrieval, NCBI linkouts—belongs behind a single API: These aren't independent services—splitting them would add network hops and API boundaries to what should be a seamless user experience. These features share:
Distributing auth and user context across multiple services would require JWT validation in each service, shared auth middleware, and complex session management—significant overhead for features that naturally operate on the same user request. FastAPI handles this elegantly through routers: backend/app/api/v1/
├── llm.py # Natural language dataset search
├── galaxy.py # Galaxy tool execution and monitoring
├── ena.py # Data retrieval
├── more.py # the next featureThe frontend stays static (Next.js → S3 → CloudFront). There's no requirement for SSR/ISR that would justify a "next-js-service" runtime. As a likely responsible party for deploying and managing this infrastructure, it's a huge benefit to have a single-service architecture that's straightforward to deploy, monitor, and debug. If we hit actual technical constraints requiring service separation, I'm happy to do the refactoring then. YAGNI—we can always extract a service later if needed, but we can't easily "un-microservice" once we've committed to that complexity. Start simple. nginx and Static FilesI'd like nginx to serve static files for local dev/E2E testing while production uses CloudFront → S3. This provides a single URL locally ( VersionDisplay ComponentYou're right about duplication with the existing version chip. I added it to demonstrate backend connectivity and show server version (critical for debugging version mismatches), but the cleaner solution is enhancing the existing version chip's tooltip. I'll split that into a separate PR to keep this focused on backend infrastructure. Proposed StructureI'm happy to accommodate other structural preferences (directory naming, moving docker-compose/nginx.conf into backend/, etc.), but I'd like us to more completely understand technical requirements driving the need for service separation before committing to that complexity. |
|
thanks for the detail dannon. im ok w leaving decisions about additional services to a future ticket/ pr. agree that can be refactored as needed, once weve had time to discuss those ideas separately. |
Creates the base infrastructure layer for BRC Analytics backend: - FastAPI application with health and cache endpoints - Redis caching service with TTL management - Docker Compose setup (backend + redis + nginx) - nginx reverse proxy configuration - uv for dependency management - Ruff formatting for Python code - E2E health check tests This branch serves as the foundation for feature branches to build upon.
Adds GET /api/v1/version endpoint that returns: - API version (from APP_VERSION env var) - Environment (development/production) - Service name This provides a simple way for the frontend to display version info and demonstrates the backend infrastructure is working.
Use APP_VERSION build arg to inject version from package.json into backend container. Adds docker-build.sh script to automate this. All endpoints now use settings.APP_VERSION instead of hardcoded values.
Remove VersionDisplay component from footer branding as it duplicates the existing version chip functionality. The version chip already displays build date, version, and git commit info. Backend version display will be added to the existing version chip in a separate PR to avoid duplication. Addresses PR galaxyproject#892 feedback.
Move Docker-related files into the backend directory to co-locate infrastructure with the backend code: - docker-compose.yml → backend/docker-compose.yml - nginx.conf → backend/nginx.conf - scripts/docker-build.sh → backend/docker-build.sh Update all path references in: - backend/README.md: Update build script path - backend/docker-compose.yml: Update context and volume paths - backend/docker-build.sh: Update package.json reference This keeps backend infrastructure self-contained and makes the deployment setup clearer. Addresses PR galaxyproject#892 feedback.
a5f4d11 to
fcb8d9a
Compare
|
Hi @dannon, thanks for the update. Yes, happy to discuss splitting the current backend app into AI- and API-focused apps (running on the same machine and isolated Docker network) in a future PR as @d-callan suggests.
I am requesting that we support a folder structure like: For this PR, we would only need like: |
|
I think next js server needs a separate issue and pr, don't want to hold this up for that. More important, in my mind, to move on to talking about ncbi link out stuffs. As for dir tree, i see it as six of one half dozen of another. If we assume everyone is willing to discuss and iterate, it doesn't seem a blocker? Let's try to get this done please. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR establishes the foundational backend infrastructure for BRC Analytics, introducing a FastAPI-based REST API with Redis caching, Docker containerization with nginx reverse proxy, version syncing from package.json, and comprehensive E2E testing.
Key Changes:
- FastAPI backend with health check, cache management, and version endpoints
- Redis-based caching service with TTL support and async operations
- Docker Compose setup with nginx reverse proxy, backend service, and Redis
- E2E tests for API health checks and version display functionality
Reviewed Changes
Copilot reviewed 27 out of 30 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
backend/app/main.py |
FastAPI application setup with CORS middleware and API router registration |
backend/app/core/config.py |
Application settings management with environment variable loading |
backend/app/core/cache.py |
Redis cache service implementation with async operations and TTL management |
backend/app/core/dependencies.py |
Dependency injection for cache service singleton |
backend/app/api/v1/health.py |
Health check endpoint for service monitoring |
backend/app/api/v1/cache.py |
Cache health endpoint with connectivity verification |
backend/app/api/v1/version.py |
Version information endpoint |
backend/Dockerfile |
Multi-stage Docker build using uv package manager |
backend/docker-compose.yml |
Service orchestration for nginx, backend, and Redis |
backend/nginx.conf |
Nginx configuration for reverse proxy and static file serving |
backend/docker-build.sh |
Build script to sync version from package.json |
backend/pyproject.toml |
Python project dependencies and configuration |
backend/ruff.toml |
Ruff linter configuration |
tests/e2e/03-api-health.spec.ts |
E2E tests for API health and documentation endpoints |
playwright.config.ts |
Playwright test configuration with dual web server setup |
pyproject.toml |
Root project configuration excluding backend directory |
scripts/format-python.sh |
Python formatting script updated to include backend |
package.json |
Updated test script to pass with no tests |
site-config/brc-analytics/local/.env |
Added backend URL and Galaxy environment variables |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| """Generate a cache key from prefix and parameters""" | ||
| # Sort parameters for consistent keys | ||
| param_str = json.dumps(params, sort_keys=True, default=str) | ||
| hash_val = hashlib.md5(param_str.encode()).hexdigest()[:16] |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using MD5 for hashing cache keys is not secure for cryptographic purposes. While acceptable for cache key generation, consider using a more secure hash function like SHA-256 or explicitly document that this is only for cache key generation, not security.
| hash_val = hashlib.md5(param_str.encode()).hexdigest()[:16] | |
| hash_val = hashlib.sha256(param_str.encode()).hexdigest()[:16] |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok then, in the interest of moving things along, I am happy to deal with the directory structure whenever we add a new service. I will go ahead and approve this PR.
One more thing to note is that it seems like a security issue to expose redis and backend outside the host. Is that happening here? It seems only nginx should be accessible outside the host. Should this be fixed here, or would it be addressed by additional networking config on actual deployment?
|
Thank you! Much appreciated 👍 let's have a few mins next week, maybe Tues, to discuss and ticket any remaining todos |
|
ok, i added an attached ticket for our future selves sake. when i say that i mean i pointed claude at the pr and told it to draft content for an accompanying ticket that summarizes where we landed and make it sounds like reqs lol then i just added the open questions section and a note to the dir structure section 😂 hopefully that makes it a little easier to parse if needed later. but also, maybe, can serve as a thing we can use to facilitate conversation about what our process should look like going forward. but we can talk about that later. so.. @dannon please look at the ticket and if claude did a bad job of anything feel free to update it to better reflect the current state of this pr, and then merge when youre ready. 🎉 thanks again everyone 💐 |
Summary