docs: Move memory tuning to an advanced topic #970

Open

gavinelder wants to merge 9 commits into master from ge/fix/jvm-memory-tuning

Contributor

gavinelder commented Dec 17, 2025

Follow-up to #891. This PR moves the JVM memory tuning configuration to a dedicated page under Advanced topics.

Manually configuring JVM settings can have adverse consequences and should only be done in response to observed performance issues. Specifying JVM parameters on deployments by default can negatively impact customers; we should rely on the framework's default memory management instead.

I am following up with detailed guidance on observing metrics, to give customers greater insight into their application performance.


          docs: Move memory tuning to an advanced topic

gavinelder requested a review from llewellyn-sl

December 17, 2025 15:58

netlify bot commented Dec 17, 2025

❌ Deploy Preview for seqera-docs failed. Why did it fail? →

Name	Link
🔨 Latest commit	`12d9d53`
🔍 Latest deploy log	https://app.netlify.com/projects/seqera-docs/deploys/6942d4ec96b3660008b72e81

gavinelder requested review from gwright99 and swampie

December 17, 2025 15:58

gavinelder and others added 8 commits

December 17, 2025 15:59


          Merge branch 'master' into ge/fix/jvm-memory-tuning

5119c3c


          Update jvm-memory-tuning.md

668ec22

Signed-off-by: Justine Geffen <[email protected]>


          Update JVM memory tuning documentation metadata

f0825a6

Added creation date and tags to the JVM memory tuning documentation.

Signed-off-by: Justine Geffen <[email protected]>


          Update JVM memory tuning documentation metadata

455954f

Add creation date and tags for JVM memory tuning documentation

Signed-off-by: Justine Geffen <[email protected]>


          Update JVM memory tuning doc with metadata

707930c

Add creation date and tags to JVM memory tuning documentation

Signed-off-by: Justine Geffen <[email protected]>


          Update JVM memory tuning doc with date and tags

5a8f1f0

Added creation date and tags to JVM memory tuning documentation.

Signed-off-by: Justine Geffen <[email protected]>


          Update JVM memory tuning documentation metadata

12d9d53

Added creation date and tags to JVM memory tuning documentation.

Signed-off-by: Justine Geffen <[email protected]>


          Merge branch 'master' into ge/fix/jvm-memory-tuning

11f0625

justinegeffen approved these changes

justinegeffen added the "1. Editor review" and "1. Dev/PM/SME" labels and removed the "1. Dev/PM/SME" label

gwright99 requested changes

Member

gwright99 left a comment

Comments and first thoughts within. Happy to discuss further async.

platform-enterprise_versioned_sidebars/version-25.2-sidebars.json

    
                            "enterprise/advanced-topics/custom-launch-container",

                            "enterprise/advanced-topics/firewall-configuration",

                            "enterprise/advanced-topics/seqera-container-images",

                            "enterprise/advanced-topics/content-security-policy"

Member

gwright99 Dec 17, 2025

Is there a reason the CSP link only starts in v25.2 docs?

Contributor Author

gavinelder Dec 17, 2025

CSP was not configurable before then and was added for Studios support.

platform-enterprise_versioned_docs/version-24.2/enterprise/advanced-topics/jvm-memory-tuning.md

    
            @@ -0,0 +1,66 @@
          
              ---

Member

gwright99 Dec 17, 2025

Rather than have multiple copies of the same text in various versions, is it possible to make these pages DRY and link back to platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md?

Contributor Author

gavinelder Dec 17, 2025

Sadly that is a limitation of Docusaurus.

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md is not published. Instead, when you cut a version, each set of versioned docs needs its own copy of the page.

The duplication in this PR is mostly a result of backporting.

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md

    
              JVM memory tuning is an advanced topic that may cause instability and performance issues.

              :::

              Seqera Platform scales memory allocation based on resources allocated to the application. To best inform available memory, set memory requests and limits on your deployments. We recommend increasing memory allocation before manually configuring JVM settings.

Member

gwright99 Dec 17, 2025

"increasing memory allocation" -- I assume this means requests / limits in the K8s manifests? Vertical scaling on a docker compose node?

Contributor Author

gavinelder Dec 17, 2025

This applies to Docker Compose as well.

We should be setting these values:

backend:
    image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
    platform: linux/amd64
    command: -c '/wait-for-it.sh db:3306 -t 60; /tower.sh'
    networks:
      - frontend
      - backend
    expose:
      - 8080
    deploy:
      resources:
        limits:
          memory: 4G        # <---- Limit 
        reservations:
          memory: 2G        # <---- Reservations
    restart: always
    depends_on:
      - db
      - redis
      - cron

Member

gwright99 Dec 17, 2025

These are not currently defined in the docker-compose template (maybe we should add them so things are aligned?):

# https://docs.seqera.io/assets/files/docker-compose-0655848af8f21b6e6211d1a9c8ebc702.yml
  backend:
    image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
    platform: linux/amd64
    command: -c '/wait-for-it.sh db:3306 -t 60; /tower.sh'
    networks:
      - frontend
      - backend
    expose:
      - 8080
    volumes:
      - $PWD/tower.yml:/tower.yml
      # Data studios RSA key is required for the data studios functionality. Uncomment the line below to mount the key.
      #- $PWD/data-studios-rsa.pem:/data-studios-rsa.pem
    env_file:
      # Seqera environment variables — see https://docs.seqera.io/platform-enterprise/enterprise/configuration/overview for details
      - tower.env
    environment:
      # Micronaut environments are required. Do not edit these values
      - MICRONAUT_ENVIRONMENTS=prod,redis,ha
    restart: always
    depends_on:
      - db
      - redis
      - cron

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md

    
              JVM memory tuning is an advanced topic that may cause instability and performance issues.

              :::

              Seqera Platform scales memory allocation based on resources allocated to the application. To best inform available memory, set memory requests and limits on your deployments. We recommend increasing memory allocation before manually configuring JVM settings.

Member

gwright99 Dec 17, 2025

Is there a scenario when we expect a client would need to start tinkering with the JVM settings? When is it? How would it be identified?

Contributor Author

gavinelder Dec 17, 2025

Some of this is answered by #954 where both of these will need revisions.

JVM monitoring links back to overall system monitoring.

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md

    
              |  3   | 8 GB  |     5 GB      |    1.5 GB     | `-XX:ActiveProcessorCount=3 -Xms2000M -Xmx5000M -XX:MaxDirectMemorySize=1500m`  |

              |  3   | 16 GB |     11 GB     |    2.5 GB     | `-XX:ActiveProcessorCount=3 -Xms4000M -Xmx11000M -XX:MaxDirectMemorySize=2500m` |

              ## When to adjust memory settings

Member

gwright99 Dec 17, 2025

Ok, it's at the bottom. Based on my questions, I think this would be more useful nearer to the top.
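
For reference, the two table rows quoted in this thread follow a single flag pattern. The sketch below rebuilds those flag strings from their numeric parts. `jvm_flags` and its parameter names are hypothetical helpers for illustration, not part of Seqera Platform, and how the resulting string is passed to the backend container is not shown in this thread.

```python
def jvm_flags(cpus: int, xms_mb: int, xmx_mb: int, direct_mb: int) -> str:
    """Build a JVM flag string in the same shape as the quoted table rows.

    Hypothetical helper: it only formats the values, it does not validate them.
    """
    return (
        f"-XX:ActiveProcessorCount={cpus} "
        f"-Xms{xms_mb}M -Xmx{xmx_mb}M "
        f"-XX:MaxDirectMemorySize={direct_mb}m"
    )


# Values copied from the 8 GB and 16 GB rows quoted above.
row_8gb = jvm_flags(cpus=3, xms_mb=2000, xmx_mb=5000, direct_mb=1500)
row_16gb = jvm_flags(cpus=3, xms_mb=4000, xmx_mb=11000, direct_mb=2500)

print(row_8gb)
print(row_16gb)
```

Keeping the numbers separate from the formatting makes it easier to change one value, such as `-Xmx`, without retyping the whole string.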

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md

    
              **Increase heap memory (`-Xmx`)** if you see:

              - `OutOfMemoryError: Java heap space` errors in logs

              - Garbage collection pauses affecting performance

Member

gwright99 Dec 17, 2025

How are we expecting these metrics to be visible? IIRC you don't get memory metrics with the standard EC2 monitoring package. Do we expect the client to upgrade their monitoring system or use an aggregating agent like Datadog?

Contributor Author

gavinelder Dec 17, 2025

As above, they would need to be monitoring JVM stats via Prometheus or another agent.

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md

    
              **Increase direct memory (`MaxDirectMemorySize`)** if you see:

              - `OutOfMemoryError: Direct buffer memory` errors in logs

Member

gwright99 Dec 17, 2025

Increase relative to what?

Grant more memory at the expense of heap?
Grant more memory at the expense of overhead?
Something else?

Contributor Author

gavinelder Dec 17, 2025

It should not be at the expense of the other.

There is a typical expected ratio of heap vs direct memory.

If the heap is hitting 100% usage, it can be scaled on its own. You can then review your direct memory usage and either reduce it if there is headroom or increase the memory allocated to the pod.
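
A rough sketch of the arithmetic behind this, assuming the columns in the quoted table rows are total memory, heap, and direct memory in GB (the column headers are not visible in this thread). `overhead_gb` is a hypothetical helper, and the 6 GB and 9 GB figures are made-up inputs for illustration.

```python
def overhead_gb(total_gb: float, heap_gb: float, direct_gb: float) -> float:
    """Memory left once heap and direct memory are subtracted from the total."""
    return total_gb - heap_gb - direct_gb


# (total, heap, direct) in GB, taken from the two table rows quoted in this PR.
TABLE_ROWS = [(8, 5, 1.5), (16, 11, 2.5)]

for total, heap, direct in TABLE_ROWS:
    remainder = overhead_gb(total, heap, direct)
    print(f"{total} GB total: {remainder} GB outside heap and direct memory")

# Hypothetical change: raise heap from 5 GB to 6 GB and leave direct memory alone.
same_allocation = overhead_gb(8, 6, 1.5)    # total memory unchanged
larger_allocation = overhead_gb(9, 6, 1.5)  # total memory raised by the same 1 GB
print(same_allocation, larger_allocation)
```

With the total held fixed, the extra heap comes out of the remainder. Raising the allocation by the same amount keeps the remainder where it was, which is the "increase memory allocated to the pod" option.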

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md

    
              - `OutOfMemoryError: Direct buffer memory` errors in logs

              - High concurrent workflow launch rates (more than 100 simultaneous workflows)

              - Large configuration payloads or extensive API usage

Member

gwright99 Dec 17, 2025

"Large" ==?
"Extensive" == ?

Contributor Author

gavinelder Dec 17, 2025

Happy to drop these

platform-enterprise_docs/enterprise/advanced-topics/jvm-memory-tuning.md

    
              **Increase direct memory (`MaxDirectMemorySize`)** if you see:

              - `OutOfMemoryError: Direct buffer memory` errors in logs

              - High concurrent workflow launch rates (more than 100 simultaneous workflows)

Member

gwright99 Dec 17, 2025

Is 100 a known pain point when using the default options or was this just chosen because it's a nice number?

Contributor Author

gavinelder Dec 17, 2025

Yes, "workflows" is the wrong word here: this is not about Nextflow, it relates to Java task allocation.

Labels

1. Editor review