Commit 83f7517

ericfirth and cswatt authored
Added DLQ documentation (#30879)
* Added DLQ documentation
* some changes
* last edits

Co-authored-by: cecilia saixue watt <[email protected]>
1 parent 9de981d commit 83f7517

2 files changed: +79 -2 lines changed

config/_default/menus/main.en.yaml

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4601,16 +4601,21 @@ menu:
46014601
identifier: data_streams_live_messages
46024602
parent: data_streams
46034603
weight: 3
4604+
- name: Dead Letter Queues
4605+
url: data_streams/dead_letter_queues
4606+
identifier: data_streams_dead_letter_queues
4607+
parent: data_streams
4608+
weight: 4
46044609
- name: Data Pipeline Lineage
46054610
url: data_streams/data_pipeline_lineage
46064611
identifier: data_streams_pipeline_lineage
46074612
parent: data_streams
4608-
weight: 4
4613+
weight: 5
46094614
- name: Metrics and Tags
46104615
url: data_streams/metrics_and_tags
46114616
identifier: data_streams_metrics_and_tags
46124617
parent: data_streams
4613-
weight: 5
4618+
weight: 6
46144619
- name: Data Jobs Monitoring
46154620
url: data_jobs/
46164621
pre: data-jobs-monitoring
Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
---
2+
title: Dead Letter Queues
3+
---
4+
5+
Data Streams Monitoring (DSM) provides visibility into your non-empty dead letter queues (DLQs), enabling you to monitor and inspect message processing failures. DSM also enables you to remediate these message processing failures directly within Datadog.
6+
7+
<div class="alert alert-info">Monitoring dead letter queues is available for Amazon SQS queues.</div>
8+
9+
## Monitor DLQs
10+
11+
### Setup
12+
* Enable [Data Streams Monitoring][1] for your messaging services.
13+
* Install the [Datadog-AWS integration][2]. Use this integration to manage permissions.
14+
* To remediate message processing failures within Datadog, additional setup is required. See the [Remediate DLQ issues](#remediate-dlq-issues) section.
15+
16+
### Usage
17+
18+
#### Create a monitor for a dead letter queue
19+
20+
To track whether your queue is rerouting messages to its DLQ, you can create a [metric monitor][8] that alerts on the [`data_streams.sqs.dead_letter_queue.messages`][3] metric.
21+
22+
To create a monitor for a queue's DLQ:
23+
24+
1. In Datadog, navigate to [Data Streams Monitoring][4].
25+
2. Select the **Explore** tab (default).
26+
3. Click on a supported queue to open its side panel.
27+
4. Select the **Dead Letter Queue** tab.
28+
5. Click **Create Monitor** to open a monitor setup page. The default inputs are sufficient to create a monitor that alerts when your DLQ is non-empty. You can adjust any other settings on this page as needed.
29+
6. Click **Create** at the bottom of the page.
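
The monitor that these steps produce can also be pictured as a plain metric monitor definition. The following is a minimal sketch, not the exact payload the **Create Monitor** button generates: the `queue:orders` tag, the `max` aggregation, and the five-minute window are assumptions chosen for illustration, and the field names follow the general shape of a Datadog metric monitor (`name`, `type`, `query`, `message`, `options`).

```python
import json

# Metric named in this page; everything else below is illustrative.
DLQ_METRIC = "data_streams.sqs.dead_letter_queue.messages"


def build_dlq_monitor(scope_tag: str, threshold: int = 0) -> dict:
    """Build a metric monitor definition that alerts when the DLQ is non-empty.

    scope_tag is a hypothetical tag (for example "queue:orders") used to
    scope the query to one queue. Check the tags on your own metric before
    relying on it.
    """
    # Alert when the maximum DLQ message count over the window exceeds the threshold.
    query = f"max(last_5m):max:{DLQ_METRIC}{{{scope_tag}}} > {threshold}"
    return {
        "name": f"Dead letter queue is non-empty ({scope_tag})",
        "type": "metric alert",
        "query": query,
        "message": "Messages are being rerouted to the dead letter queue.",
        "options": {"thresholds": {"critical": threshold}},
    }


if __name__ == "__main__":
    print(json.dumps(build_dlq_monitor("queue:orders"), indent=2))
```

A threshold of `0` matches the default behavior described above (alert as soon as the DLQ holds any message). Raise it if a small number of dead-lettered messages is expected.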
30+
31+
#### Detect message processing issues
32+
33+
Data Streams Monitoring helps you detect where messages couldn't be processed and what downstream services could be affected:
34+
35+
* The DSM [**Service Map**][6] highlights queues with messages in their DLQs, helping you to visually identify where failures occur
36+
37+
* The DSM [**Issues**][7] page lists all queues that are experiencing message processing issues
38+
39+
## Remediate DLQ issues
40+
You can inspect and resolve non-empty DLQs directly in Datadog by using [Datadog Actions][5].
41+
42+
### Setup
43+
In Datadog, create a [Connection][9]. You need an IAM entity to perform the actions. This IAM entity can be an IAM User (with a secret access key) or an IAM Role (assumed by using `sts:AssumeRole`), and it must have the following permissions:
44+
* `sqs:ReceiveMessage` (for _peek_)
45+
* `sqs:StartMessageMoveTask` (for _redrive_)
46+
* `sqs:PurgeQueue` (for _purge_)
47+
48+
These permissions can be applied globally to all SQS queues, or restricted to specific queues.
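
As an illustration of the two scoping options, the sketch below builds an IAM policy document containing the three permissions listed above, either for all queues or for specific queue ARNs. The helper name, the `Sid`, and the example ARN are hypothetical. The policy includes only the actions named on this page, so your environment may need more.

```python
import json

# The three actions listed in this section.
DLQ_ACTIONS = [
    "sqs:ReceiveMessage",        # peek
    "sqs:StartMessageMoveTask",  # redrive
    "sqs:PurgeQueue",            # purge
]


def build_dlq_policy(queue_arns=None) -> dict:
    """Return an IAM policy document granting the DLQ actions.

    With no ARNs, the statement applies to all SQS queues ("*").
    With ARNs, it is restricted to those queues only.
    """
    resource = list(queue_arns) if queue_arns else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DatadogDlqActions",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": DLQ_ACTIONS,
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical DLQ ARN, used only to show the restricted form.
    arn = "arn:aws:sqs:us-east-1:123456789012:orders-dlq"
    print(json.dumps(build_dlq_policy([arn]), indent=2))
```

Restricting `Resource` to the DLQ ARNs limits what the connection can purge or redrive, which is usually the safer default for destructive actions such as purge.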
49+
50+
### Usage
51+
52+
After you set up the connection, you can click on a supported queue to open its side panel, where you can use the following actions:
53+
54+
* **Peek** to inspect failed message content and identify the root cause
55+
* **Redrive** to requeue messages for another processing attempt
56+
* **Purge** to clear messages that no longer need processing
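
To make the three actions concrete, here is a rough sketch of the SQS operations that the listed permissions correspond to. It assumes a boto3 SQS client is passed in as `sqs`. It is not a description of how Datadog implements these actions, and the function names are hypothetical.

```python
def peek(sqs, dlq_url: str, max_messages: int = 10) -> list:
    """Read message bodies from the DLQ without hiding them from other consumers.

    VisibilityTimeout=0 makes the received messages visible again right away.
    """
    response = sqs.receive_message(
        QueueUrl=dlq_url,
        MaxNumberOfMessages=max_messages,
        VisibilityTimeout=0,
    )
    return [m["Body"] for m in response.get("Messages", [])]


def redrive(sqs, dlq_arn: str) -> dict:
    """Start moving messages from the DLQ back for another processing attempt.

    With no destination given, SQS moves messages back to their original source queue.
    """
    return sqs.start_message_move_task(SourceArn=dlq_arn)


def purge(sqs, dlq_url: str) -> dict:
    """Delete all messages in the DLQ. This cannot be undone."""
    return sqs.purge_queue(QueueUrl=dlq_url)
```

With boto3 installed, a client for these helpers can be created with `boto3.client("sqs")`.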
57+
58+
## Troubleshooting
59+
If you are unable to see dead letter queue information:
60+
* Confirm that you have installed the [Datadog-AWS integration][2]
61+
* Confirm that your AWS role uses the AWS-managed `AmazonSQSReadOnlyAccess` policy
62+
* Confirm that your role has `sqs:ListQueues` and `sqs:GetQueueAttributes` permissions
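
If you have the role's policy document as JSON, the last check can be approximated offline. The sketch below is a deliberate simplification: it considers only `Allow` statements and `Action` wildcards, and ignores `Deny` statements, `NotAction`, conditions, and resource scoping. Treat a clean result as a hint, not as proof of access. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Permissions named in the troubleshooting list above.
REQUIRED_ACTIONS = ("sqs:ListQueues", "sqs:GetQueueAttributes")


def missing_actions(policy: dict, required=REQUIRED_ACTIONS) -> list:
    """Return the required actions that no Allow statement in the policy covers."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]

    allowed_patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lowercase.
        allowed_patterns.extend(a.lower() for a in actions)

    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), p) for p in allowed_patterns)
    ]
```

For example, a policy that allows `sqs:Get*` and `sqs:ListQueues` yields an empty list, while one that allows only `sqs:ListQueues` reports `sqs:GetQueueAttributes` as missing.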
63+
64+
[1]: /data_streams/setup
65+
[2]: /integrations/amazon-web-services/
66+
[3]: /data_streams/metrics_and_tags/#data_streamssqsdead_letter_queuemessages
67+
[4]: https://app.datadoghq.com/data-streams/
68+
[5]: https://app.datadoghq.com/actions
69+
[6]: https://app.datadoghq.com/data-streams/map
70+
[7]: https://app.datadoghq.com/data-streams/issues
71+
[8]: /monitors/types/metric/
72+
[9]: https://app.datadoghq.com/actions/connections
