[Questions] Understanding Quorum Queue Behavior #14744

stephen7921 · 2025-10-17T01:15:58Z

Community Support Policy

I have read RabbitMQ's Community Support Policy
I run RabbitMQ 4.x, the only series currently covered by community support
I promise to provide all relevant information (versions, logs from all nodes, rabbitmq-diagnostics output, detailed reproduction steps)

RabbitMQ version used

4.1.2

Erlang version used

27.3.x

Operating system (distribution) used

Ubuntu 22.04.5

How is RabbitMQ deployed?

Community Docker image

rabbitmq-diagnostics status output

See https://www.rabbitmq.com/docs/cli to learn how to use rabbitmq-diagnostics

# PASTE OUTPUT HERE, BETWEEN BACKTICKS

Logs from node 1 (with sensitive values edited out)

See https://www.rabbitmq.com/docs/logging to learn how to collect logs

# PASTE LOG HERE, BETWEEN BACKTICKS

Logs from node 2 (if applicable, with sensitive values edited out)

See https://www.rabbitmq.com/docs/logging to learn how to collect logs

# PASTE LOG HERE, BETWEEN BACKTICKS

Logs from node 3 (if applicable, with sensitive values edited out)

See https://www.rabbitmq.com/docs/logging to learn how to collect logs

# PASTE LOG HERE, BETWEEN BACKTICKS

rabbitmq.conf

See https://www.rabbitmq.com/docs/configure#config-location to learn how to find rabbitmq.conf file location

# PASTE rabbitmq.conf HERE, BETWEEN BACKTICKS

Steps to deploy RabbitMQ cluster

We run RabbitMQ as a 3-node cluster on AWS.

Steps to reproduce the behavior in question

More than 150 messages per second (4K bytes each, spanning multiple segment files) are published to a single queue.
Each queue has a lifecycle of creation, message transmission, and deletion, and more than 40 queues are active at any given time while the service is running. (In other words, RabbitMQ is processing more than 6,000 messages per second.)
RabbitMQ is configured in a clustered environment on AWS, and the application server exists in the same region.
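As a rough check of the figures above, the aggregate load can be worked out as follows. This is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, "4K bytes" is assumed to mean 4 KiB of payload, and the per-queue rate and queue count are the lower bounds quoted in the post.

```python
# Figures quoted in the post, treated here as lower bounds.
MESSAGES_PER_SECOND_PER_QUEUE = 150
ACTIVE_QUEUES = 40
MESSAGE_SIZE_BYTES = 4 * 1024  # assumption: "4K bytes" means 4 KiB


def aggregate_load(rate_per_queue, queues, message_size):
    """Return (messages per second, payload bytes per second) across all queues."""
    total_rate = rate_per_queue * queues
    return total_rate, total_rate * message_size


total_rate, total_bytes = aggregate_load(
    MESSAGES_PER_SECOND_PER_QUEUE, ACTIVE_QUEUES, MESSAGE_SIZE_BYTES
)
print(f"{total_rate} msg/s, {total_bytes / 2**20:.1f} MiB/s of payload")
# prints "6000 msg/s, 23.4 MiB/s of payload"
```

The result covers message payload only; protocol framing and replication traffic between cluster nodes are not included.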

With RabbitMQ 3.10.x and classic queues (without HA), we observed no specific issues in the service.
However, starting from RabbitMQ version 4.X, the default queue type is set to Quorum, and the following abnormal phenomena have been observed:
(The application does not specify the queue type during creation, so it follows the default queue type.)
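For reference, an application can stop depending on the default by passing the `x-queue-type` optional argument when it declares a queue. The sketch below only builds that argument map; the helper name `queue_arguments` is hypothetical, and how the map is handed to the broker depends on the client library in use, which the post does not name.

```python
ALLOWED_QUEUE_TYPES = {"classic", "quorum", "stream"}


def queue_arguments(queue_type):
    """Build optional queue arguments that pin the queue type explicitly."""
    if queue_type not in ALLOWED_QUEUE_TYPES:
        raise ValueError(f"unsupported queue type: {queue_type!r}")
    return {"x-queue-type": queue_type}


args = queue_arguments("classic")
print(args)

# With a Python client such as pika, the map would be passed at declaration
# time, for example (the queue name "orders" is a placeholder):
#   channel.queue_declare(queue="orders", durable=True, arguments=args)
```

Pinning the type in the declaration keeps behavior the same across upgrades and virtual host settings, whichever type is chosen.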

Symptoms:

The consumer of the queue disappears, and the number of ready messages increases.
RabbitMQ Log: missed heartbeats from client, timeout: 60s
Threads related to queue operations in the application server experience locks and blocks.
The number of file descriptors (FDs) in the application server increases.
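To make the heartbeat log line concrete, here is a toy model of a heartbeat timeout check. It is a simplified sketch, not the broker's implementation: the class and method names are invented, and it only captures the documented idea that a peer which sends no traffic at all (heartbeat frames or anything else) for roughly the negotiated timeout is treated as unreachable and its connection is closed.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Toy model: tracks when a peer was last heard from."""

    timeout_seconds: float
    last_activity: float = 0.0

    def record_activity(self, now):
        # Any inbound traffic counts, not only heartbeat frames.
        self.last_activity = now

    def peer_timed_out(self, now):
        # The peer is considered gone once it has been silent longer than the timeout.
        return now - self.last_activity > self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=60)
monitor.record_activity(now=0)
print(monitor.peer_timed_out(now=30))  # prints False
print(monitor.peer_timed_out(now=61))  # prints True
```

Under this model, a client whose I/O is stalled for longer than the timeout would have its connection closed by the server, and its consumers would go away with it.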

advanced.config

See https://www.rabbitmq.com/docs/configure#config-location to learn how to find advanced.config file location

# PASTE advanced.config HERE, BETWEEN BACKTICKS

Application code

# PASTE CODE HERE, BETWEEN BACKTICKS

Kubernetes deployment file

# Relevant parts of K8S deployment that demonstrate how RabbitMQ is deployed
# PASTE YAML HERE, BETWEEN BACKTICKS

What problem are you trying to solve?

While it is understood that Classic Queues are lighter in weight compared to Quorum Queues, these symptoms do not seem to be secondary effects caused by using Quorum Queues.

Could you explain why such phenomena might occur in Quorum Queues?
(If additional information is required, please let me know, but please understand that very detailed information cannot be disclosed.)

kjnilsson · 2025-10-17T08:36:01Z

Maintainer

Have you inspected the RabbitMQ server logs? That would be your first port of call.


michaelklishin · 2025-10-20T15:58:49Z

Maintainer

@stephen7921 we do not guess in this community. You haven't shared any logs or code.

You are running an outdated patch release that does not include the fix for a file descriptor leak in QQs; that fix shipped in 4.1.3.

> The consumer of the queue disappears, and the number of ready messages increases.

This is expected: when a consumer goes away, all of its unacknowledged deliveries are automatically requeued.
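The requeue behavior can be illustrated with a small in-memory model. This is a toy sketch and not how RabbitMQ implements queues: `ToyQueue` and its methods are hypothetical names, and the only point is that deliveries a consumer has not acknowledged go back to the ready state when that consumer disappears, so the ready count rises.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a queue with ready and unacknowledged messages."""

    def __init__(self):
        self.ready = deque()
        self.unacked = {}  # consumer name -> messages delivered but not yet acked

    def publish(self, message):
        self.ready.append(message)

    def deliver(self, consumer):
        message = self.ready.popleft()
        self.unacked.setdefault(consumer, []).append(message)
        return message

    def ack(self, consumer, message):
        self.unacked[consumer].remove(message)

    def consumer_gone(self, consumer):
        # Requeue outstanding deliveries at the head, preserving their order.
        pending = self.unacked.pop(consumer, [])
        self.ready.extendleft(reversed(pending))


queue = ToyQueue()
for message in ("m1", "m2", "m3"):
    queue.publish(message)

first = queue.deliver("c1")
second = queue.deliver("c1")
queue.ack("c1", first)
print(len(queue.ready))   # prints 1
queue.consumer_gone("c1")
print(list(queue.ready))  # prints ['m2', 'm3']
```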

> Threads related to queue operations in the application server experience locks and blocks

We cannot suggest anything specific without logs from all nodes. There is a known scenario where a QQ can end up without a leader in 4.1.4; it takes days to reproduce, so most users won't hit it.
There is also #12366 which those using QQs as they were meant to be used

> The number of file descriptors (FDs) in the application server increases

We cannot know what your "application server" does or why.

The aforementioned change in 4.1.3, rabbitmq/ra#553, is a change to RabbitMQ itself, not to an "application server" or client libraries.

Opening a new connection to RabbitMQ means increasing the file (and socket) descriptor count, that's how network sockets work. I assume that can affect "application servers" (but I don't really know what exactly that means to you).
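The point about descriptors can be seen with plain sockets. The sketch below is illustrative only: it creates unconnected TCP sockets (nothing is sent to any broker) and shows that each one holds its own file descriptor until it is closed. The helper name `open_sockets` is made up for this example.

```python
import socket


def open_sockets(count):
    """Create `count` unconnected TCP sockets; each one occupies a descriptor."""
    return [socket.socket(socket.AF_INET, socket.SOCK_STREAM) for _ in range(count)]


socks = open_sockets(5)
descriptors = [s.fileno() for s in socks]
print(len(set(descriptors)))  # prints 5: one distinct descriptor per socket

# Closing a socket releases its descriptor; fileno() then reports -1.
for s in socks:
    s.close()
closed = [s.fileno() for s in socks]
print(closed)  # prints [-1, -1, -1, -1, -1]
```

An application that keeps opening connections without closing the old ones will therefore see its descriptor count grow, regardless of what the server does.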
