Skip to content

Conversation

alphaprinz
Copy link
Contributor

@alphaprinz alphaprinz commented Aug 25, 2025

Describe the Problem

Add read only pool that connects to read only replica's host.

Explain the Changes

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

  • Doc added/updated
  • Tests added

Summary by CodeRabbit

  • New Features

    • Added a dedicated read-only database pool and optional read-only host configuration (env fallback) so read and write pools can target separate hosts.
  • Bug Fixes

    • Connection/disconnection messages now reference the requested pool for clearer diagnostics.
  • Refactor

    • Connection pooling updated for flexible per-pool selection and per-pool host overrides without impacting other pools.

Copy link

coderabbitai bot commented Aug 25, 2025

Walkthrough

Adds a read-only Postgres pool with optional host override, per-pool host support during pool creation, and updates PostgresTable.get_pool / PostgresSequence.get_pool to accept an optional key parameter; error and logging now reference the requested pool key.

Changes

Cohort / File(s) Summary
Postgres client pools and accessors
src/util/postgres_client.js
- Added pools.read_only (instance: null, size: config.POSTGRES_DEFAULT_MAX_CLIENTS) and set pools.read_only.host from POSTGRES_HOST_RO / POSTGRES_HOST_RO_PATH
- _create_pool clones this.new_pool_params per pool and applies pool.host override before creating Pool with {...new_pool_params, max: pool.size} (per-pool host support)
- Pool creation/disconnection/error messages now reference the requested key
Pool accessor signature updates
src/util/postgres_client.js
- PostgresTable.get_pool changed to get_pool(key = this.pool_key) (uses provided key for lookup and errors)
- PostgresSequence.get_pool changed to get_pool(key = this.pool_key) (same behavior)

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Table as PostgresTable
  participant Seq as PostgresSequence
  participant Client as PostgresClient
  participant Pool as Pool (rw / read_only)
  participant PG as Postgres

  Caller->>Table: request connection (optional key)
  Table->>Client: get_pool(key)
  alt Sequence operation
    Caller->>Seq: request sequence op (optional key)
    Seq->>Client: get_pool(key)
  end
  alt key == "read_only"
    Client->>Client: select pools.read_only (use host_ro if set)
  else other key
    Client->>Client: select pools.<key> (default host)
  end
  Client->>Pool: create or reuse Pool with per-pool params (host override, max)
  Pool->>PG: connect / execute query
  PG-->>Pool: result / error
  Pool-->>Caller: return result or error (messages reference requested key)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly identifies the addition of a read-only pool for performance as a proof of concept, which aligns directly with the pull request’s main change to support read-only replica hosts.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eae1192 and 4850fef.

📒 Files selected for processing (1)
  • src/util/postgres_client.js (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/util/postgres_client.js

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

port,
...params,
};
if (host.endsWith("-rw")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using an env to pass the read-only host, e.g. process.env.POSTGRES_RO_HOST. This will be more consistent with the rest of the postgres info we pass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind the change, it's just that the other params come from the postgres mounted secret path /etc/postgres-secret/, which will make this param different than the others.

...params,
};
if (host.endsWith("-rw")) {
this.pools.read_only.host = host.replace("-rw", "-r");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the -ro service. The -r service routes to any instance that can perform read, either secondary or primary. the -ro service will only route to the secondary, and allow us to control which instance we want to communicate with.

@alphaprinz alphaprinz force-pushed the read_only_pool branch 3 times, most recently from 1f3cbd0 to 0942286 Compare September 9, 2025 16:59
@alphaprinz alphaprinz marked this pull request as ready for review September 9, 2025 16:59
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/util/postgres_client.js (1)

1514-1515: Broaden RO host resolution and keep env naming compatible.

Support both POSTGRES_RO_HOST (suggested earlier) and POSTGRES_HOST_RO (current), including their *_PATH variants. Also guard the -rw-ro fallback to avoid surprising substitutions when naming doesn’t follow that convention.

Apply this diff:

-            const host_ro = process.env.POSTGRES_HOST_RO || fs_utils.try_read_file_sync(process.env.POSTGRES_HOST_RO_PATH) || host.replace("-rw", "-ro");
+            const host_ro =
+                process.env.POSTGRES_RO_HOST ||
+                process.env.POSTGRES_HOST_RO ||
+                fs_utils.try_read_file_sync(process.env.POSTGRES_RO_HOST_PATH || process.env.POSTGRES_HOST_RO_PATH) ||
+                (host.includes('-rw') ? host.replace('-rw', '-ro') : host);
 
             // TODO: This need to move to another function
@@
-            this.pools.read_only.host = host_ro;
+            this.pools.read_only.host = host_ro;

Additionally (outside the shown hunk), if you support POSTGRES_CONNECTION_STRING_PATH, consider adding optional POSTGRES_CONNECTION_STRING_RO / POSTGRES_CONNECTION_STRING_RO_PATH to let RO use a different DSN when one is provided.

Also applies to: 1528-1529

🧹 Nitpick comments (3)
src/util/postgres_client.js (3)

1497-1501: Allow a dedicated max-clients knob for the read-only pool.

Right now read_only uses POSTGRES_DEFAULT_MAX_CLIENTS. Consider a separate limit so read traffic can scale independently without starving writes.

Apply this diff:

-            read_only: {
-                instance: null,
-                size: config.POSTGRES_DEFAULT_MAX_CLIENTS
-            }
+            read_only: {
+                instance: null,
+                // If not defined in config, fallback remains DEFAULT_MAX_CLIENTS
+                size: config.POSTGRES_RO_MAX_CLIENTS || config.POSTGRES_DEFAULT_MAX_CLIENTS
+            }

1727-1731: Per-pool overrides should also support connection strings and ensure host override wins.

If a per-pool connectionString is provided in the future, honor it. If a per-pool host is set, make sure it is not shadowed by a base-level connectionString.

Apply this diff:

-        const new_pool_params = _.clone(this.new_pool_params);
-        if (pool.host) {
-            new_pool_params.host = pool.host;
-        }
+        const new_pool_params = _.clone(this.new_pool_params);
+        if (pool.connectionString) {
+            new_pool_params.connectionString = pool.connectionString;
+        } else if (pool.host) {
+            // ensure host override takes effect even if base params had a connectionString
+            delete new_pool_params.connectionString;
+            new_pool_params.host = pool.host;
+        }
         if (!pool.instance) {
             pool.instance = new Pool({ ...new_pool_params, max: pool.size });

Also applies to: 1733-1733


629-633: Introduce a get_read_pool() helper and update call sites

  • No occurrences of get_pool('read_only') were found, so the new read-only pool isn’t exercised.
  • Add a get_read_pool() method in src/util/postgres_client.js and replace any scattered 'read_only' literals in callers with calls to this helper.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3afc81f and 0942286.

📒 Files selected for processing (1)
  • src/util/postgres_client.js (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Noobaa Image
  • GitHub Check: run-jest-unit-tests
  • GitHub Check: run-package-lock-validation

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
src/util/postgres_client.js (2)

1497-1501: Unconditional creation of the read_only pool risks doubling connections when POSTGRES_HOST_RO is unset.

If host_ro isn’t provided, read_only currently connects to the same host as default, potentially doubling max clients to the primary and degrading DB capacity. We should skip creating the RO pool unless explicitly configured. This also aligns with earlier feedback to not use RO by default when not configured.

See proposed guard in _create_pool below.


1727-1733: Guard RO pool creation when not configured and support per-pool connectionString override.

Prevents accidental duplicate pools to the primary and enables RO via connection string.

Apply:

-        const new_pool_params = _.clone(this.new_pool_params);
-        if (pool.host) {
-            new_pool_params.host = pool.host;
-        }
+        const new_pool_params = _.clone(this.new_pool_params);
+        // Skip RO pool unless explicitly configured
+        if (name === 'read_only' && !pool.host && !pool.connectionString) {
+            dbg.log0('Skipping read_only pool creation: no POSTGRES_HOST_RO/POSTGRES_CONNECTION_STRING_RO configured');
+            return;
+        }
+        // Per-pool overrides
+        if (pool.host) new_pool_params.host = pool.host;
+        if (pool.connectionString) new_pool_params.connectionString = pool.connectionString;
🧹 Nitpick comments (3)
src/util/postgres_client.js (3)

1514-1514: Host override wiring is OK; consider supporting RO connection strings and validate presence.

  • Support POSTGRES_CONNECTION_STRING_RO/…_PATH for deployments using connection strings.
  • Only set up RO when a host/connectionString is actually provided.

Proposed additions outside this hunk (in the connection-string branch):

 if (process.env.POSTGRES_CONNECTION_STRING_PATH) {
   /** @type {import('pg').PoolConfig} */
   this.new_pool_params = {
     connectionString: fs.readFileSync(process.env.POSTGRES_CONNECTION_STRING_PATH, "utf8"),
     ...params,
   };
+  // Optional RO connection string
+  const ro_path = process.env.POSTGRES_CONNECTION_STRING_RO_PATH;
+  const ro_str  = process.env.POSTGRES_CONNECTION_STRING_RO;
+  if (ro_path || ro_str) {
+    this.pools.read_only = this.pools.read_only || { instance: null, size: config.POSTGRES_DEFAULT_MAX_CLIENTS };
+    this.pools.read_only.connectionString = ro_path ? fs.readFileSync(ro_path, "utf8") : ro_str;
+  }
 } else {
   // get the connection configuration. first from env, then from file, then default
   const host = process.env.POSTGRES_HOST || fs_utils.try_read_file_sync(process.env.POSTGRES_HOST_PATH) || '127.0.0.1';
   const host_ro = process.env.POSTGRES_HOST_RO || fs_utils.try_read_file_sync(process.env.POSTGRES_HOST_RO_PATH);
   ...
   this.pools.read_only.host = host_ro;
 }

Also applies to: 1528-1528


1531-1533: Bug: masking source is wrong; print_pool_params becomes {}.

Use new_pool_params as the source when omitting the password.

Apply:

-        this.print_pool_params = _.omit(this.print_pool_params, 'password');
+        this.print_pool_params = _.omit(this.new_pool_params, 'password');

1497-1501: Consider a dedicated config knob for RO pool size.

Using POSTGRES_DEFAULT_MAX_CLIENTS for RO may oversubscribe connections. Prefer a separate limit, e.g., config.POSTGRES_RO_MAX_CLIENTS (fallback to default if undefined).

Example:

-            read_only: {
-                instance: null,
-                size: config.POSTGRES_DEFAULT_MAX_CLIENTS
-            }
+            read_only: {
+                instance: null,
+                size: config.POSTGRES_RO_MAX_CLIENTS || config.POSTGRES_DEFAULT_MAX_CLIENTS
+            }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0942286 and 2c0f6f0.

📒 Files selected for processing (1)
  • src/util/postgres_client.js (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T13:10:36.141Z
Learnt from: naveenpaul1
PR: noobaa/noobaa-core#9182
File: src/server/system_services/pool_server.js:1314-1317
Timestamp: 2025-08-08T13:10:36.141Z
Learning: In src/server/system_services/pool_server.js (and config usage), the constant config.INTERNAL_STORAGE_POOL_NAME has been removed from the system. Future logic should not depend on this constant and should instead use config.DEFAULT_POOL_NAME or structural markers (e.g., pool.resource_type === 'INTERNAL' or pool.mongo_info) to identify internal/mongo pools.

Applied to files:

  • src/util/postgres_client.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build Noobaa Image
  • GitHub Check: run-jest-unit-tests
🔇 Additional comments (1)
src/util/postgres_client.js (1)

629-633: Confirm read-only pool usage: callers must pass 'read_only' explicitly

Defaulting to this.pool_key is backward-compatible. Repo search shows no occurrences of postgres_pool:'read_only' or get_pool('read_only'); many call sites call this.get_pool() with no key — review and update any code paths that should use the read-only pool to either set postgres_pool: 'read_only' at collection initialization or call get_pool('read_only').

Locations to review:

  • src/util/postgres_client.js — default-using call sites: lines 579, 641, 647, 704, 725, 1297; pool_key assigned at lines 527 and 604.
  • src/server/object_services/md_store.js — example explicit postgres_pool = 'md'.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/util/postgres_client.js (2)

1514-1514: Update comment to reflect recommended behavior.

The comment states "if not present defaults to general pg host", but based on past review feedback, the read-only pool should not be used if no read-only host is configured.

Update the comment after addressing the fallback behavior:

-//optional read-only host. if not present defaults to general pg host
+// Optional read-only host for connecting to read-only replicas
+// If not configured, the read_only pool should not be used

1498-1501: Consider adding tests and documentation for the POC.

As this is a POC for read-only pool support, consider the following next steps before moving to production:

  1. Tests: Add tests to verify:

    • The read-only pool connects to the correct host
    • Queries can be executed through the read-only pool
    • Error handling when read-only host is not configured
    • Fallback behavior (if kept) works as expected
  2. Documentation: Document:

    • How to configure POSTGRES_HOST_RO or POSTGRES_HOST_RO_PATH
    • When to use the read-only pool vs. the default pool
    • The expected format for the read-only host (e.g., using -ro service suffix)
  3. Usage example: Add at least one usage example showing how to request the read-only pool in a table or sequence definition

Would you like me to help generate test cases or usage examples for the read-only pool functionality?

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6921679 and eae1192.

📒 Files selected for processing (1)
  • src/util/postgres_client.js (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: run-package-lock-validation
  • GitHub Check: Build Noobaa Image
  • GitHub Check: run-jest-unit-tests
🔇 Additional comments (3)
src/util/postgres_client.js (3)

1497-1501: LGTM!

The read-only pool configuration follows the same pattern as existing pools and is properly initialized.


1728-1733: LGTM!

The per-pool host override logic is correctly implemented:

  • Clones new_pool_params to avoid mutating the shared object
  • Conditionally overrides the host when pool.host is defined
  • Creates the Pool instance with the per-pool configuration

This enables the read-only pool to connect to a different host while keeping other pools using the primary host.


629-635: get_pool signature change is backward compatible

No read_only pool usage found across the codebase; verify that the read-only pool is configured and referenced where needed.

Signed-off-by: Amit Prinz Setter <[email protected]>
@alphaprinz alphaprinz merged commit 69d598a into noobaa:master Oct 9, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants