Conversation

@gavinelder
Contributor

@gavinelder gavinelder commented Dec 17, 2025

Follow-up to #891 which moves the JVM memory tuning configuration to a dedicated page under Advanced topics.

Manually configuring JVM settings can have adverse consequences and should only be done in response to observed performance issues. Specifying JVM parameters on deployments by default can negatively impact customers, and we should rely on the framework's default memory management instead.

I am following up with detailed guidance on observing metrics, to give customers greater insight into their application performance.

@netlify

netlify bot commented Dec 17, 2025

Deploy Preview for seqera-docs failed.

| Name | Link |
|---|---|
| 🔨 Latest commit | 12d9d53 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/seqera-docs/deploys/6942d4ec96b3660008b72e81 |

gavinelder and others added 8 commits December 17, 2025 15:59
Signed-off-by: Justine Geffen <[email protected]>
Added creation date and tags to the JVM memory tuning documentation.

Signed-off-by: Justine Geffen <[email protected]>
Add creation date and tags for JVM memory tuning documentation

Signed-off-by: Justine Geffen <[email protected]>
Add creation date and tags to JVM memory tuning documentation

Signed-off-by: Justine Geffen <[email protected]>
Added creation date and tags to JVM memory tuning documentation.

Signed-off-by: Justine Geffen <[email protected]>
Added creation date and tags to JVM memory tuning documentation.

Signed-off-by: Justine Geffen <[email protected]>
@justinegeffen justinegeffen added 1. Editor review Needs a language review 1. Dev/PM/SME Needs a review by a Dev/PM/SME and removed 1. Dev/PM/SME Needs a review by a Dev/PM/SME labels Dec 17, 2025
Member

@gwright99 gwright99 left a comment

Comments and first thoughts within. Happy to discuss further async.

"enterprise/advanced-topics/custom-launch-container",
"enterprise/advanced-topics/firewall-configuration",
"enterprise/advanced-topics/seqera-container-images",
"enterprise/advanced-topics/content-security-policy"
Member

Is there a reason the CSP link only starts in v25.2 docs?

Contributor Author

CSP was not configurable before then and was added for Studios support.

@@ -0,0 +1,66 @@
---
Member

Rather than have multiple copies of the same text in various versions, is it possible to make these pages DRY and link back to platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md?

Contributor Author

Sadly that is a limitation of Docusaurus.

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md is not published. Instead, when you cut a version, each versioned copy of the docs needs its own copy of the page.

This duplication is more due to backporting.

JVM memory tuning is an advanced topic that may cause instability and performance issues.
:::

Seqera Platform scales memory allocation based on resources allocated to the application. To best inform available memory, set memory requests and limits on your deployments. We recommend increasing memory allocation before manually configuring JVM settings.
Member

"increasing memory allocation" -- I assume this means requests / limits in the K8s manifests? Vertical scaling on a docker compose node?

Contributor Author

This applies to Docker Compose as well.

We should be setting these values:

backend:
    image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
    platform: linux/amd64
    command: -c '/wait-for-it.sh db:3306 -t 60; /tower.sh'
    networks:
      - frontend
      - backend
    expose:
      - 8080
    deploy:
      resources:
        limits:
          memory: 4G        # <---- Limit 
        reservations:
          memory: 2G        # <---- Reservations
    restart: always
    depends_on:
      - db
      - redis
      - cron

Member

@gwright99 gwright99 Dec 17, 2025

Not currently defined in the docker-compose template (maybe we should add it so things are aligned?)

# https://docs.seqera.io/assets/files/docker-compose-0655848af8f21b6e6211d1a9c8ebc702.yml
  backend:
    image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
    platform: linux/amd64
    command: -c '/wait-for-it.sh db:3306 -t 60; /tower.sh'
    networks:
      - frontend
      - backend
    expose:
      - 8080
    volumes:
      - $PWD/tower.yml:/tower.yml
      # Data studios RSA key is required for the data studios functionality. Uncomment the line below to mount the key.
      #- $PWD/data-studios-rsa.pem:/data-studios-rsa.pem
    env_file:
      # Seqera environment variables — see https://docs.seqera.io/platform-enterprise/enterprise/configuration/overview for details
      - tower.env
    environment:
      # Micronaut environments are required. Do not edit these values
      - MICRONAUT_ENVIRONMENTS=prod,redis,ha
    restart: always
    depends_on:
      - db
      - redis
      - cron

JVM memory tuning is an advanced topic that may cause instability and performance issues.
:::

Seqera Platform scales memory allocation based on resources allocated to the application. To best inform available memory, set memory requests and limits on your deployments. We recommend increasing memory allocation before manually configuring JVM settings.
Member

Is there a scenario when we expect a client would need to start tinkering with the JVM settings? When is it? How would it be identified?

Contributor Author

Some of this is answered by #954 where both of these will need revisions.

JVM monitoring links back to overall system monitoring.

| 3 | 8 GB | 5 GB | 1.5 GB | `-XX:ActiveProcessorCount=3 -Xms2000M -Xmx5000M -XX:MaxDirectMemorySize=1500m` |
| 3 | 16 GB | 11 GB | 2.5 GB | `-XX:ActiveProcessorCount=3 -Xms4000M -Xmx11000M -XX:MaxDirectMemorySize=2500m` |
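
As a quick sanity check on the two rows above, the following minimal sketch parses each flag string and computes how much of the allocation is left once heap and direct memory are subtracted. The helper names (`flag_mb`, `headroom_mb`) are hypothetical, and the sketch assumes the table's own convention of treating 1 GB as 1000 MB; it is not part of Seqera Platform.

```python
import re

# Rows copied from the table above: (allocated memory in GB, JVM flags).
ROWS = [
    (8, "-XX:ActiveProcessorCount=3 -Xms2000M -Xmx5000M -XX:MaxDirectMemorySize=1500m"),
    (16, "-XX:ActiveProcessorCount=3 -Xms4000M -Xmx11000M -XX:MaxDirectMemorySize=2500m"),
]


def flag_mb(flags: str, pattern: str) -> int:
    """Return the integer megabyte value captured by the first group of `pattern`."""
    match = re.search(pattern, flags)
    if match is None:
        raise ValueError(f"flag not found: {pattern}")
    return int(match.group(1))


def headroom_mb(total_gb: int, flags: str) -> int:
    """Memory left after max heap and max direct memory (assumes 1 GB = 1000 MB)."""
    heap = flag_mb(flags, r"-Xmx(\d+)[mM]")
    direct = flag_mb(flags, r"-XX:MaxDirectMemorySize=(\d+)[mM]")
    return total_gb * 1000 - heap - direct


for total_gb, flags in ROWS:
    print(total_gb, "GB allocation leaves", headroom_mb(total_gb, flags), "MB outside heap and direct memory")
```

Under those assumptions, the remainder is what is available for everything else the JVM and container need, so a negative or very small value would suggest the flags do not fit the allocation.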

## When to adjust memory settings
Member

Ok, it's at the bottom. Based on my questions, I think this would be more useful nearer to the top.

**Increase heap memory (`-Xmx`)** if you see:

- `OutOfMemoryError: Java heap space` errors in logs
- Garbage collection pauses affecting performance
Member

How are we expecting these metrics to be visible? IIRC you don't get memory metrics with the standard EC2 monitoring package. Do we expect the client to upgrade their monitoring system or use an aggregating agent like Datadog?

Contributor Author

As above, they would need to be monitoring JVM stats via Prometheus or another agent.


**Increase direct memory (`MaxDirectMemorySize`)** if you see:

- `OutOfMemoryError: Direct buffer memory` errors in logs
Member

Increase relative to what?

  • Grant more memory at the expense of heap?
  • Grant more memory at the expense of overhead?
  • Something else?

Contributor Author

It should not be at the expense of the other.

There is a typical expected ratio of heap vs direct memory.

If the heap is hitting 100% usage, it can be scaled on its own. You can then review your direct memory usage and opt to reduce it if you have overhead, or increase the memory allocated to the pod.
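
To illustrate how the two log signals quoted in the page under review point at different flags, here is a minimal sketch that scans log lines for each error message and reports which flag to review. The helper `flags_to_review` and the sample log lines are hypothetical; only the two error strings and flag names come from the page. Exact messages can vary between JDK versions, so treat the matching as an assumption.

```python
# Hypothetical mapping from the error messages quoted in the docs page
# to the JVM flag the page suggests increasing.
SIGNALS = {
    "OutOfMemoryError: Java heap space": "-Xmx",
    "OutOfMemoryError: Direct buffer memory": "-XX:MaxDirectMemorySize",
}


def flags_to_review(log_lines):
    """Return the sorted list of flags whose error signal appears in the log lines."""
    found = set()
    for line in log_lines:
        for message, flag in SIGNALS.items():
            if message in line:
                found.add(flag)
    return sorted(found)


# Made-up sample log lines, for illustration only.
sample = [
    "INFO  startup complete",
    "ERROR java.lang.OutOfMemoryError: Direct buffer memory",
]
print(flags_to_review(sample))
```

Each signal is handled independently here, which matches the point above that heap and direct memory are sized separately rather than traded against each other.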


- `OutOfMemoryError: Direct buffer memory` errors in logs
- High concurrent workflow launch rates (more than 100 simultaneous workflows)
- Large configuration payloads or extensive API usage
Member

"Large" ==?
"Extensive" == ?

Contributor Author

Happy to drop these

**Increase direct memory (`MaxDirectMemorySize`)** if you see:

- `OutOfMemoryError: Direct buffer memory` errors in logs
- High concurrent workflow launch rates (more than 100 simultaneous workflows)
Member

Is 100 a known pain point when using the default options or was this just chosen because it's a nice number?

Contributor Author

Yes, "workflows" is the wrong word here: this is not about Nextflow, it relates to Java task allocation.

Labels

1. Editor review Needs a language review

4 participants