Skip to content

Conversation

@shikokuchuo
Copy link

@shikokuchuo shikokuchuo commented Oct 27, 2025

This PR implements basic OpenTelemetry instrumentation for DBI.

Abides by the otel semantic conventions for database spans as far as possible (with considerations for limitations of the R API, performance etc.).

The following is a screenshot of the spans created by running the examples for dbGetQuery(). This trace may also be examined interactively at this public link (30 day validity):
https://logfire-eu.pydantic.dev/public-trace/a3da0166-cf62-43de-b194-864bf3c9e33d?spanId=77e385b051f86076

Screenshot 2025-10-27 at 19 37 42

Implementation progress:

  • Implemented for: dbConnect/dbDisconnect, dbCreateTable/dbRemoveTable, dbGetQuery

Todo:

  • Extended coverage to: dbAppendTable, dbWriteTable/dbReadTable and all Arrow variants
  • Add tests
    - [ ] Add documentation (covered by news item + separate article for other packages)

I've assumed otel to be an 'imports' package for simplicity, but it shouldn't be a problem to move to 'suggests' if that's the preference.

Copy link
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, lovely!

For what functions does it not make sense to implement telemetry? I guess dbQuote*(), what else?

Does this supersede https://github.com/r-dbi/dblog? Do you think a non-invasive approach like used there would be feasible here as well? What is the overhead if no listeners are active?

If we need to add here, a suggested package would be preferred.

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a great first step!

@shikokuchuo
Copy link
Author

shikokuchuo commented Oct 28, 2025

For what functions does it not make sense to implement telemetry? I guess dbQuote*(), what else?

My preliminary thoughts are that we should instrument all 'full transactions', where we might be interested in the length of the spans. Otel expects all spans to be short-lived. Hence for example, we don't have a span that starts with dbConnect() and ends with dbDisconnect(), we have spans for each.

Does this supersede https://github.com/r-dbi/dblog? Do you think a non-invasive approach like used there would be feasible here as well? What is the overhead if no listeners are active?

Gabor has spent a lot of time to make the interface as user-friendly as possible. I think the idea that you can get instrumentation for free with no code changes, and can leave it on in production is a powerful proposition.

If not active, there is practically no overhead - the current main instrumentation function otel_local_active_span() is guarded by an early return based on a variable that will be cached (implemented in d6213f6). The design is so that arguments would remain unevaluated.

If we need to add here, a suggested package would be preferred.

Updated to suggests in d6213f6.

@shikokuchuo shikokuchuo marked this pull request as ready for review October 31, 2025 16:41
@shikokuchuo
Copy link
Author

I've updated this PR to cover the high-level operations - let me know if any obvious ones are missing. Instrumenting the lower level ones would result in much more (noisy) output.

Live link here: https://logfire-eu.pydantic.dev/public-trace/73de9eac-7379-4581-b285-845a7a52c56b?spanId=eddfab36dba4de22

Screenshot 2025-10-31 at 18 07 23

Re. documentation, let me know if you have a particular preference here e.g. if you want to stick with a news item (knitr), or have a separate vignette (mirai).

@krlmlr krlmlr changed the title OpenTelemetry Integration feat: Added support for [OpenTelemetry](https://opentelemetry.io/) observability. See ?otelsdk::collecting for more details on configuring OpenTelemetry Oct 31, 2025
Copy link
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. My understanding is that this is opt-in, and that tracing for DBI can be disabled even if tracing for other sources is enabled. I wonder if we can emit a banner message when connecting that points to relevant documentation?

@shikokuchuo
Copy link
Author

My understanding is that this is opt-in, and that tracing for DBI can be disabled even if tracing for other sources is enabled.

Yes, you're right - and detailed in the otelsdk instrumentation docs.

I wonder if we can emit a banner message when connecting that points to relevant documentation?

I'm thinking that in some cases it may be a system admin which has set up otel collection rather than the end user. So it may be surprising for the user to see a banner, especially as they wouldn't then know what to do with the information.

@hadley hadley changed the title feat: Added support for [OpenTelemetry](https://opentelemetry.io/) observability. See ?otelsdk::collecting for more details on configuring OpenTelemetry feat: Add support for OpenTelemetry Nov 3, 2025
@hadley
Copy link
Member

hadley commented Nov 3, 2025

As this rolls out across more packages, we'll do more to promote it, so hopefully folks start to internalise that this sort of observability is available in all the packages they rely on the most.

@shikokuchuo
Copy link
Author

I've now updated this PR with a common approach on caching the tracer, and a testing helper (following discussions with @schloerke who's been spearheading the otel integration in Shiny/promises).

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds basic OpenTelemetry instrumentation to DBI, implementing tracing for database operations following the OpenTelemetry semantic conventions for database spans. The implementation provides observability into database operations by creating spans for connections, queries, and table operations.

Key changes:

  • Core OpenTelemetry infrastructure with lazy initialization and tracer caching
  • Instrumentation added to generic database operations (connect/disconnect, queries, table operations)
  • Test coverage for OpenTelemetry tracing functionality

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
R/otel.R New file implementing core OpenTelemetry helper functions for tracer management, span creation, and SQL query attribute extraction
R/zzz.R Added tracer initialization call in .onLoad hook
R/DBI-package.R Added .onLoad function to initialize OpenTelemetry tracer
R/dbConnect.R Added OpenTelemetry span instrumentation for database connection
R/dbDisconnect.R Added OpenTelemetry span instrumentation for database disconnection
R/dbGetQuery.R Added OpenTelemetry span instrumentation for query execution
R/dbGetQueryArrow.R Added OpenTelemetry span instrumentation for Arrow query execution
R/dbReadTable.R Added OpenTelemetry span instrumentation for table reading
R/dbReadTableArrow.R Added OpenTelemetry span instrumentation for Arrow table reading
R/13-dbWriteTable.R Added OpenTelemetry span instrumentation for table writing
R/23-dbWriteTableArrow.R Added OpenTelemetry span instrumentation for Arrow table writing
R/11-dbAppendTable.R Added OpenTelemetry span instrumentation for table appending
R/21-dbAppendTableArrow.R Added OpenTelemetry span instrumentation for Arrow table appending
R/12-dbCreateTable.R Added OpenTelemetry span instrumentation for table creation
R/22-dbCreateTableArrow.R Added OpenTelemetry span instrumentation for Arrow table creation
R/dbRemoveTable.R Added OpenTelemetry span instrumentation for table removal
tests/testthat/test-otel.R New test file validating OpenTelemetry span creation and attributes
DESCRIPTION Added otel and otelsdk to Suggests dependencies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +64 to +68
otel_local_active_span(
dynGet("attributes")$db.operation.name,
conn,
label = dynGet("attributes")$db.collection.name,
attributes = make_query_attributes(statement)
Copy link

Copilot AI Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dynGet("attributes") calls on lines 65 and 67 attempt to retrieve the value of attributes before it's assigned on line 68, which will fail. The dynGet function searches for a variable in parent frames, but attributes is being assigned as a parameter in the current call, not in a parent frame.

This should be refactored to compute the attributes first, then use them:

setGeneric("dbGetQuery", def = function(conn, statement, ...) {
  attributes <- make_query_attributes(statement)
  otel_local_active_span(
    attributes$db.operation.name,
    conn,
    label = attributes$db.collection.name,
    attributes = attributes
  )
  standardGeneric("dbGetQuery")
})
Suggested change
otel_local_active_span(
dynGet("attributes")$db.operation.name,
conn,
label = dynGet("attributes")$db.collection.name,
attributes = make_query_attributes(statement)
attributes <- make_query_attributes(statement)
otel_local_active_span(
attributes$db.operation.name,
conn,
label = attributes$db.collection.name,
attributes = attributes

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shikokuchuo: Can you confirm that dynGet() is doing what Copilot thinks it's doing? I understand that we want to run make_query_attributes() only once, and only if we actually record the span. How about:

Suggested change
otel_local_active_span(
dynGet("attributes")$db.operation.name,
conn,
label = dynGet("attributes")$db.collection.name,
attributes = make_query_attributes(statement)
otel_local_active_span(
attributes$db.operation.name,
conn,
label = attributes$db.collection.name,
attributes = { attributes <- make_query_attributes(statement) }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch that. The code in make_query_attributes() is already brittle. Should we record source code locations instead?

Comment on lines +58 to +62
otel_local_active_span(
dynGet("attributes")$db.operation.name,
conn,
label = dynGet("attributes")$db.collection.name,
attributes = make_query_attributes(statement)
Copy link

Copilot AI Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dynGet("attributes") calls on lines 59 and 61 attempt to retrieve the value of attributes before it's assigned on line 62, which will fail. The dynGet function searches for a variable in parent frames, but attributes is being assigned as a parameter in the current call, not in a parent frame.

This should be refactored to compute the attributes first, then use them:

setGeneric("dbGetQueryArrow", def = function(conn, statement, ...) {
  attributes <- make_query_attributes(statement)
  otel_local_active_span(
    attributes$db.operation.name,
    conn,
    label = attributes$db.collection.name,
    attributes = attributes
  )
  standardGeneric("dbGetQueryArrow")
})
Suggested change
otel_local_active_span(
dynGet("attributes")$db.operation.name,
conn,
label = dynGet("attributes")$db.collection.name,
attributes = make_query_attributes(statement)
attributes <- make_query_attributes(statement)
otel_local_active_span(
attributes$db.operation.name,
conn,
label = attributes$db.collection.name,
attributes = attributes

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Source code locations?

Comment on lines +62 to +64
list(
db.operation.name = query[1L],
db.collection.name = query[which(query == "FROM") + 1L]
Copy link

Copilot AI Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a SQL statement doesn't contain a "FROM" clause, which(query == "FROM") returns integer(0), and query[integer(0)] returns character(0). This results in db.collection.name being set to character(0) instead of NA_character_ or being omitted. This could cause issues downstream when the attribute is used. Consider handling this case explicitly:

make_query_attributes <- function(statement) {
  query <- strsplit(statement, " ", fixed = TRUE)[[1L]]
  from_idx <- which(query == "FROM")
  
  list(
    db.operation.name = query[1L],
    db.collection.name = if (length(from_idx) > 0) query[from_idx[1L] + 1L] else NA_character_
  )
}

Additionally, consider making the FROM matching case-insensitive using toupper(query) == "FROM" to handle queries written in lowercase or mixed case.

Suggested change
list(
db.operation.name = query[1L],
db.collection.name = query[which(query == "FROM") + 1L]
query_upper <- toupper(query)
from_idx <- which(query_upper == "FROM")
list(
db.operation.name = query[1L],
db.collection.name = if (length(from_idx) > 0) query[from_idx[1L] + 1L] else NA_character_

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Source code locations?

Copy link
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, great! I'll play with it locally to get a feeling as well.

label = collection_name(name, conn),
attributes = list(
db.collection.name = collection_name(name, conn),
db.operation.name = "INSERT INTO"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we distinguish between Arrow and data frame source?

"dbCreateTableArrow",
def = function(conn, name, value, ..., temporary = FALSE) {
otel_local_active_span(
"CREATE TABLE",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, but not as critical perhaps.

otel_is_tracing <- FALSE

otel_cache_tracer <<- function() {
requireNamespace("otel", quietly = TRUE) || return()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if otel is installed during the session? Can we somehow support this use case?

Will otel print diagnostics on the console if it's active, by default?

krlmlr and others added 2 commits December 5, 2025 17:12
Co-authored-by: Copilot <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants