feat: Add support for OpenTelemetry #551

shikokuchuo · 2025-10-27T19:56:46Z

This PR implements basic OpenTelemetry instrumentation for DBI.

It follows the OpenTelemetry semantic conventions for database spans as far as possible (allowing for limitations of the R API, performance, etc.).

The following is a screenshot of the spans created by running the examples for dbGetQuery(). This trace may also be examined interactively at this public link (30 day validity):
https://logfire-eu.pydantic.dev/public-trace/a3da0166-cf62-43de-b194-864bf3c9e33d?spanId=77e385b051f86076

Implementation progress:

Implemented for: dbConnect/dbDisconnect, dbCreateTable/dbRemoveTable, dbGetQuery

Todo:

- Extended coverage to: dbAppendTable, dbWriteTable/dbReadTable and all Arrow variants
- Add tests
- ~~Add documentation~~ (covered by news item + separate article for other packages)

I've assumed otel to be an 'imports' package for simplicity, but it shouldn't be a problem to move to 'suggests' if that's the preference.

krlmlr

Thanks, lovely!

For what functions does it not make sense to implement telemetry? I guess dbQuote*(), what else?

Does this supersede https://github.com/r-dbi/dblog? Do you think a non-invasive approach like used there would be feasible here as well? What is the overhead if no listeners are active?

If we need to add here, a suggested package would be preferred.

hadley

Looks like a great first step!

DESCRIPTION

R/dbGetQuery.R

shikokuchuo · 2025-10-28T09:47:55Z

> For what functions does it not make sense to implement telemetry? I guess dbQuote*(), what else?

My preliminary thoughts are that we should instrument all 'full transactions', where we might be interested in the length of the spans. Otel expects all spans to be short-lived. Hence for example, we don't have a span that starts with dbConnect() and ends with dbDisconnect(), we have spans for each.

> Does this supersede https://github.com/r-dbi/dblog? Do you think a non-invasive approach like used there would be feasible here as well? What is the overhead if no listeners are active?

Gabor has spent a lot of time making the interface as user-friendly as possible. I think the idea that you can get instrumentation for free with no code changes, and can leave it on in production, is a powerful proposition.

If not active, there is practically no overhead: the current main instrumentation function otel_local_active_span() is guarded by an early return based on a variable that is cached (implemented in d6213f6). It is designed so that its arguments remain unevaluated when tracing is inactive.
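The low-overhead claim rests on R's lazy evaluation of function arguments. The following is a minimal sketch of that idea, assuming a cached logical flag and an early return; `make_span_helper()`, `set_tracing()` and `local_span()` are hypothetical names used only for illustration and are not the helpers in this PR.

```r
# Hypothetical sketch: a cached flag guards the span helper, so argument
# promises are never forced while tracing is inactive.
make_span_helper <- function() {
  is_tracing <- FALSE  # cached flag, assumed to be set once up front
  list(
    set_tracing = function(value) {
      is_tracing <<- isTRUE(value)
      invisible(is_tracing)
    },
    local_span = function(name, attributes = NULL) {
      if (!is_tracing) {
        # Early return: `name` and `attributes` stay unevaluated.
        return(invisible(NULL))
      }
      # Stand-in for starting a real span; here we only return the inputs.
      list(name = name, attributes = attributes)
    }
  )
}

helper <- make_span_helper()

# Count how often the (potentially expensive) attribute builder runs.
n_evaluated <- 0L
expensive_attributes <- function() {
  n_evaluated <<- n_evaluated + 1L
  list(db.operation.name = "SELECT")
}

inactive <- helper$local_span("SELECT", attributes = expensive_attributes())
n_inactive <- n_evaluated  # the argument was never forced

helper$set_tracing(TRUE)
active <- helper$local_span("SELECT", attributes = expensive_attributes())
n_active <- n_evaluated    # forced once, when the span is recorded
```

With tracing off, the only cost in this sketch is the function call and one logical check, which is the property described above.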

> If we need to add here, a suggested package would be preferred.

Updated to suggests in d6213f6.

shikokuchuo · 2025-10-31T19:57:03Z

I've updated this PR to cover the high-level operations - let me know if any obvious ones are missing. Instrumenting the lower level ones would result in much more (noisy) output.

Live link here: https://logfire-eu.pydantic.dev/public-trace/73de9eac-7379-4581-b285-845a7a52c56b?spanId=eddfab36dba4de22

Re. documentation, let me know if you have a particular preference here e.g. if you want to stick with a news item (knitr), or have a separate vignette (mirai).

krlmlr

Thanks. My understanding is that this is opt-in, and that tracing for DBI can be disabled even if tracing for other sources is enabled. I wonder if we can emit a banner message when connecting that points to relevant documentation?

R/otel.R

R/11-dbAppendTable.R

tests/testthat/test-otel.R

shikokuchuo · 2025-11-03T10:57:22Z

> My understanding is that this is opt-in, and that tracing for DBI can be disabled even if tracing for other sources is enabled.

Yes, you're right - and detailed in the otelsdk instrumentation docs.

> I wonder if we can emit a banner message when connecting that points to relevant documentation?

I'm thinking that in some cases it may be a system admin who has set up otel collection rather than the end user. So it may be surprising for the user to see a banner, especially as they wouldn't then know what to do with the information.

hadley · 2025-11-03T13:27:18Z

As this rolls out across more packages, we'll do more to promote it, so hopefully folks start to internalise that this sort of observability is available in all the packages they rely on the most.

shikokuchuo · 2025-11-08T10:51:41Z

I've now updated this PR with a common approach on caching the tracer, and a testing helper (following discussions with @schloerke who's been spearheading the otel integration in Shiny/promises).

Copilot

Pull request overview

This PR adds basic OpenTelemetry instrumentation to DBI, implementing tracing for database operations following the OpenTelemetry semantic conventions for database spans. The implementation provides observability into database operations by creating spans for connections, queries, and table operations.

Key changes:

- Core OpenTelemetry infrastructure with lazy initialization and tracer caching
- Instrumentation added to generic database operations (connect/disconnect, queries, table operations)
- Test coverage for OpenTelemetry tracing functionality

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| R/otel.R | New file implementing core OpenTelemetry helper functions for tracer management, span creation, and SQL query attribute extraction |
| R/zzz.R | Added tracer initialization call in .onLoad hook |
| R/DBI-package.R | Added .onLoad function to initialize OpenTelemetry tracer |
| R/dbConnect.R | Added OpenTelemetry span instrumentation for database connection |
| R/dbDisconnect.R | Added OpenTelemetry span instrumentation for database disconnection |
| R/dbGetQuery.R | Added OpenTelemetry span instrumentation for query execution |
| R/dbGetQueryArrow.R | Added OpenTelemetry span instrumentation for Arrow query execution |
| R/dbReadTable.R | Added OpenTelemetry span instrumentation for table reading |
| R/dbReadTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table reading |
| R/13-dbWriteTable.R | Added OpenTelemetry span instrumentation for table writing |
| R/23-dbWriteTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table writing |
| R/11-dbAppendTable.R | Added OpenTelemetry span instrumentation for table appending |
| R/21-dbAppendTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table appending |
| R/12-dbCreateTable.R | Added OpenTelemetry span instrumentation for table creation |
| R/22-dbCreateTableArrow.R | Added OpenTelemetry span instrumentation for Arrow table creation |
| R/dbRemoveTable.R | Added OpenTelemetry span instrumentation for table removal |
| tests/testthat/test-otel.R | New test file validating OpenTelemetry span creation and attributes |
| DESCRIPTION | Added otel and otelsdk to Suggests dependencies |

R/otel.R

R/DBI-package.R

Copilot · 2025-12-04T06:28:10Z

R/dbGetQuery.R

+  otel_local_active_span(
+    dynGet("attributes")$db.operation.name,
+    conn,
+    label = dynGet("attributes")$db.collection.name,
+    attributes = make_query_attributes(statement)


The dynGet("attributes") calls on lines 65 and 67 attempt to retrieve the value of attributes before it's assigned on line 68, which will fail. The dynGet function searches for a variable in parent frames, but attributes is being assigned as a parameter in the current call, not in a parent frame.

This should be refactored to compute the attributes first, then use them:

    setGeneric("dbGetQuery", def = function(conn, statement, ...) {
      attributes <- make_query_attributes(statement)
      otel_local_active_span(
        attributes$db.operation.name,
        conn,
        label = attributes$db.collection.name,
        attributes = attributes
      )
      standardGeneric("dbGetQuery")
    })

Suggested change

-  otel_local_active_span(
-    dynGet("attributes")$db.operation.name,
-    conn,
-    label = dynGet("attributes")$db.collection.name,
-    attributes = make_query_attributes(statement)
+  attributes <- make_query_attributes(statement)
+  otel_local_active_span(
+    attributes$db.operation.name,
+    conn,
+    label = attributes$db.collection.name,
+    attributes = attributes

krlmlr · 2025-12-05

@shikokuchuo: Can you confirm that dynGet() is doing what Copilot thinks it's doing? I understand that we want to run make_query_attributes() only once, and only if we actually record the span. How about:

Suggested change

-  otel_local_active_span(
-    dynGet("attributes")$db.operation.name,
-    conn,
-    label = dynGet("attributes")$db.collection.name,
-    attributes = make_query_attributes(statement)
+  otel_local_active_span(
+    attributes$db.operation.name,
+    conn,
+    label = attributes$db.collection.name,
+    attributes = { attributes <- make_query_attributes(statement) }

Scratch that. The code in make_query_attributes() is already brittle. Should we record source code locations instead?
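On the dynGet() question, one possible reading of the pattern is that it relies on lazy evaluation: dynGet() searches the calling frames, and by the time the first argument is forced inside the span helper, that helper's own frame already holds an `attributes` argument. The sketch below illustrates this reading only; `fake_span()`, `build_attributes()` and `run_query()` are hypothetical stand-ins, and the real body of otel_local_active_span() is not shown in this thread.

```r
# Hypothetical stand-in for otel_local_active_span().
fake_span <- function(name, conn, label = NULL, attributes = NULL,
                      active = TRUE) {
  if (!active) {
    return(invisible(NULL))  # nothing is forced when tracing is off
  }
  # Forcing `name` evaluates dynGet("attributes") from the caller's promise.
  # dynGet() walks the calling frames, finds this function's `attributes`
  # argument and forces it, so the builder runs once and the value is reused.
  list(name = name, label = label, attributes = attributes)
}

# Count how often the attribute builder runs.
n_calls <- 0L
build_attributes <- function(statement) {
  n_calls <<- n_calls + 1L
  list(db.operation.name = "SELECT", db.collection.name = "cars")
}

run_query <- function(statement, active = TRUE) {
  fake_span(
    dynGet("attributes")$db.operation.name,
    conn = NULL,
    label = dynGet("attributes")$db.collection.name,
    attributes = build_attributes(statement),
    active = active
  )
}

skipped <- run_query("SELECT * FROM cars", active = FALSE)
calls_when_inactive <- n_calls

span <- run_query("SELECT * FROM cars", active = TRUE)
calls_when_active <- n_calls
```

In this sketch the builder is not called at all when the helper returns early, and is called once when the span is recorded. Whether the real helper forces its arguments in a compatible order would need to be confirmed against R/otel.R.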

Copilot · 2025-12-04T06:28:10Z

R/dbGetQueryArrow.R

+  otel_local_active_span(
+    dynGet("attributes")$db.operation.name,
+    conn,
+    label = dynGet("attributes")$db.collection.name,
+    attributes = make_query_attributes(statement)


The dynGet("attributes") calls on lines 59 and 61 attempt to retrieve the value of attributes before it's assigned on line 62, which will fail. The dynGet function searches for a variable in parent frames, but attributes is being assigned as a parameter in the current call, not in a parent frame.

This should be refactored to compute the attributes first, then use them:

    setGeneric("dbGetQueryArrow", def = function(conn, statement, ...) {
      attributes <- make_query_attributes(statement)
      otel_local_active_span(
        attributes$db.operation.name,
        conn,
        label = attributes$db.collection.name,
        attributes = attributes
      )
      standardGeneric("dbGetQueryArrow")
    })

Suggested change

-  otel_local_active_span(
-    dynGet("attributes")$db.operation.name,
-    conn,
-    label = dynGet("attributes")$db.collection.name,
-    attributes = make_query_attributes(statement)
+  attributes <- make_query_attributes(statement)
+  otel_local_active_span(
+    attributes$db.operation.name,
+    conn,
+    label = attributes$db.collection.name,
+    attributes = attributes

krlmlr · 2025-12-05

Source code locations?

Copilot · 2025-12-04T06:28:11Z

R/otel.R

+  list(
+    db.operation.name = query[1L],
+    db.collection.name = query[which(query == "FROM") + 1L]


When a SQL statement doesn't contain a "FROM" clause, which(query == "FROM") returns integer(0), and query[integer(0)] returns character(0). This results in db.collection.name being set to character(0) instead of NA_character_ or being omitted. This could cause issues downstream when the attribute is used. Consider handling this case explicitly:

    make_query_attributes <- function(statement) {
      query <- strsplit(statement, " ", fixed = TRUE)[[1L]]
      from_idx <- which(query == "FROM")
      list(
        db.operation.name = query[1L],
        db.collection.name = if (length(from_idx) > 0) query[from_idx[1L] + 1L] else NA_character_
      )
    }

Additionally, consider making the FROM matching case-insensitive using toupper(query) == "FROM" to handle queries written in lowercase or mixed case.

Suggested change

-  list(
-    db.operation.name = query[1L],
-    db.collection.name = query[which(query == "FROM") + 1L]
+  query_upper <- toupper(query)
+  from_idx <- which(query_upper == "FROM")
+  list(
+    db.operation.name = query[1L],
+    db.collection.name = if (length(from_idx) > 0) query[from_idx[1L] + 1L] else NA_character_

krlmlr · 2025-12-05

Source code locations?
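The zero-length result described in this thread is easy to reproduce in plain R. The sketch below mirrors the suggestion above using a hypothetical helper, `extract_collection()`, which is an illustration only and not the PR's final code.

```r
# Hypothetical helper: find the token after the first FROM, ignoring case.
extract_collection <- function(statement) {
  query <- strsplit(statement, " ", fixed = TRUE)[[1L]]
  from_idx <- which(toupper(query) == "FROM")
  if (length(from_idx) > 0L) query[from_idx[1L] + 1L] else NA_character_
}

# Unguarded indexing, as in the reviewed snippet, yields a zero-length vector
# when the statement has no FROM clause.
tokens <- strsplit("SELECT 1", " ", fixed = TRUE)[[1L]]
unguarded <- tokens[which(tokens == "FROM") + 1L]

upper <- extract_collection("SELECT * FROM cars")
lower <- extract_collection("select * from cars")
missing_from <- extract_collection("SELECT 1")
```

Splitting on single spaces stays brittle (newlines, quoted identifiers and subqueries are not handled), which is the concern raised in the replies.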

R/otel.R

krlmlr

Thanks, great! I'll play with it locally to get a feeling as well.

krlmlr · 2025-12-04T06:23:41Z

R/21-dbAppendTableArrow.R

+    label = collection_name(name, conn),
+    attributes = list(
+      db.collection.name = collection_name(name, conn),
+      db.operation.name = "INSERT INTO"


Should we distinguish between Arrow and data frame source?

krlmlr · 2025-12-04T06:31:48Z

R/22-dbCreateTableArrow.R

  "dbCreateTableArrow",
  def = function(conn, name, value, ..., temporary = FALSE) {
+    otel_local_active_span(
+      "CREATE TABLE",


Same here, but not as critical perhaps.

krlmlr · 2025-12-04T06:37:57Z

R/otel.R

+  otel_is_tracing <- FALSE
+
+  otel_cache_tracer <<- function() {
+    requireNamespace("otel", quietly = TRUE) || return()


What happens if otel is installed during the session? Can we somehow support this use case?

Will otel print diagnostics on the console if it's active, by default?

shikokuchuo added 3 commits October 27, 2025 19:10

- OpenTelemetry concept (9d66707)
- Add table creation, removal and queries (2bbc16c)
- Adhere closer to semantic conventions (791e073)

krlmlr reviewed Oct 27, 2025

hadley approved these changes Oct 27, 2025

shikokuchuo added 5 commits October 28, 2025 10:48

- Move otel to suggests and cache tracer on package load (d6213f6)
- Refactor otel_local_active_span() (631e53b)
- Add testing infrastructure (d1ac9b2)
- Add dbWriteTable, dbAppendTable, dbReadTable; handle names more robustly (cca88e4)
- Rename some parameters (cfc7ec7)

shikokuchuo marked this pull request as ready for review October 31, 2025 16:41

Merge branch 'main' into otel

fababe1

krlmlr changed the title ~~OpenTelemetry Integration~~ feat: Added support for [OpenTelemetry](https://opentelemetry.io/) observability. See ?otelsdk::collecting for more details on configuring OpenTelemetry Oct 31, 2025

krlmlr reviewed Oct 31, 2025

shikokuchuo added 3 commits November 3, 2025 10:45

- Use updated otel_refresh_tracer() (28c98df)
- Do not record query text (2f62ee7)
- Use INSERT INTO for dbAppendTable() (4288bc1)

hadley changed the title ~~feat: Added support for [OpenTelemetry](https://opentelemetry.io/) observability. See ?otelsdk::collecting for more details on configuring OpenTelemetry~~ feat: Add support for OpenTelemetry Nov 3, 2025

shikokuchuo added 5 commits November 3, 2025 21:45

- Implement collection_name() helper (d0e5f6c)
- Use safe subsetting (d2996e3)
- Simplify modify_binding() helper (f009219)
- Simplify otel_refresh_tracer() (989afe0)
- Use local scope to cache tracer; add testing helper (c9296bf)

shikokuchuo added 2 commits November 8, 2025 12:33

- Simplify otel_cache_tracer() (2968b5d)
- Drop otel::as_attributes() (8f2db6b)

Merge branch 'main' into otel

debca2a

shikokuchuo requested review from hadley and krlmlr November 10, 2025 20:22

hadley mentioned this pull request Dec 2, 2025

logging sql to file in compute tidyverse/dbplyr#1650

Closed

krlmlr requested a review from Copilot December 4, 2025 06:22

Copilot started reviewing on behalf of krlmlr December 4, 2025 06:23

Copilot finished reviewing on behalf of krlmlr December 4, 2025 06:26

Copilot AI reviewed Dec 4, 2025

krlmlr reviewed Dec 4, 2025

krlmlr and others added 2 commits December 5, 2025 17:12

- More precise arg name (ea5f2be), co-authored by Copilot
- Update tests/testthat/test-otel.R (dffa0f9), co-authored by github-actions[bot]
