atheriel
Collaborator

This commit wraps all LLM model calls in an OpenTelemetry span that abides by the (still nascent) semantic conventions for Generative AI clients.

It's very similar in approach to what was done for httr2, and in fact the two of them complement one another nicely:
r-lib/httr2#729.

For example:

library(otelsdk)

Sys.setenv(OTEL_TRACES_EXPORTER = "stderr")

chat <- ellmer::chat_databricks(model = "databricks-claude-3-7-sonnet")
chat$chat("Tell me a joke in the form of an SQL query.")

@atheriel requested review from hadley and gaborcsardi on May 22, 2025 19:37
@atheriel
Collaborator Author

Traces that mix LLM model call spans with httr2 spans:

[Image: screenshot taken 2025-05-22 of the live trace view in Pydantic Logfire, project posit1_starter-project]

@jcheng5
Collaborator

jcheng5 commented May 23, 2025

cc @cpsievert @schloerke @icarusz

@hadley
Member

hadley commented May 28, 2025

Do we want to (optionally?) also include user and assistant messages, a la https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/ ?

@atheriel
Collaborator Author

@hadley I do. But there's a ton of disagreement in the OTel LLM community about how to do that, and none of the existing instrumentation libraries work in the same way 😞. Plus the whole "structured body" mechanism the current spec proposes (1) isn't supported by the span API; and (2) is formally deprecated.

So I kind of think we need to noodle on what to do there, and I suggest pushing it into follow-up work. I'm planning on writing up an issue describing what options we have.

I also think we should have first-class support for tool call spans, because that's something that ellmer focuses on specifically. This PR is really the "basic" bit that the title implies.

@hadley
Member

hadley commented May 28, 2025

@atheriel ok, that makes sense. I'm sure there will be a lot of learning as we figure out exactly what is most useful to instrument across packages.

@atheriel marked this pull request as draft on June 6, 2025 17:59
@atheriel
Collaborator Author

atheriel commented Jun 6, 2025

Moving this back to draft because it has known issues (i.e. the concurrency does not work correctly).

This commit instruments various operations with OpenTelemetry spans that
abide by the (still nascent) semantic conventions for Generative AI
clients [0].

These conventions classify `ellmer` chatbots as "agents" due to their
ability to run tool calls, so in fact there are three types of span: (1)
a top-level `invoke_agent` span for each chat interaction; (2) `chat`
spans that wrap model API calls; and (3) `execute_tool` spans that wrap
tool calls on our end.

There's currently no community consensus on how to attach turns to
spans, so I've left that out for now.

Example code:

    library(otelsdk)

    Sys.setenv(OTEL_TRACES_EXPORTER = "stderr")

    chat <- ellmer::chat_databricks(model = "databricks-claude-3-7-sonnet")
    chat$chat("Tell me a joke in the form of an SQL query.")

Unit tests are included.

[0]: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/

Signed-off-by: Aaron Jacobs <[email protected]>

@atheriel changed the title from "Add basic Open Telemetry instrumentation for model calls" to "Add Open Telemetry instrumentation" on Jun 20, 2025
@atheriel
Collaborator Author

atheriel commented Jun 20, 2025

This has been updated to support async operations and to account for changes in otel and otelsdk. It now also includes pretty extensive unit tests and support for agent and tool call spans.
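
To make the three span types from the commit message concrete, here is a toy sketch of how one chat interaction with a single tool call might nest. It uses plain R lists rather than the otel API, and it is not ellmer's implementation. The agent, model, and tool names are made up, and treating the `chat` and `execute_tool` spans as siblings under `invoke_agent` is an assumption. The `gen_ai.operation.name` attribute and the "operation target" span naming follow my reading of the semantic conventions linked in the commit message.

```r
# Toy model of a span: a name, attributes, and child spans.
# Plain lists only; no OpenTelemetry API is involved here.
new_span <- function(operation, target, children = list()) {
  list(
    name = paste(operation, target),
    attributes = list("gen_ai.operation.name" = operation),
    children = children
  )
}

# One chat interaction in which the model requests a single tool call.
# "example-agent", "example-model" and "get_weather" are hypothetical names.
trace <- new_span(
  "invoke_agent", "example-agent",
  children = list(
    new_span("chat", "example-model"),        # model asks for a tool call
    new_span("execute_tool", "get_weather"),  # tool runs on the client side
    new_span("chat", "example-model")         # model receives the tool result
  )
)

# Print the tree, indenting each level to show parent/child nesting.
print_span <- function(span, depth = 0) {
  cat(strrep("  ", depth), span$name, "\n", sep = "")
  for (child in span$children) print_span(child, depth + 1)
  invisible(span)
}

print_span(trace)
```

With the stderr exporter from the example code, the real spans would carry many more attributes. The sketch only shows the parent/child shape.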

schloerke and others added 10 commits September 8, 2025 09:50
* main: (95 commits)
  fix(chat): Call `check_echo()` in `chat()` for consistent echo behavior (#742)
  Increment version number to 0.3.2.9000
  Increment version number to 0.3.2
  Don't run `content_image()` on CRAN (#739)
  feat(chat_): Add `params` and `model` to all `chat_` functions (#699)
  Fix spelling in `tool_prompt.md` (#730)
  Fix typos in source comments and regenerate documentation (#736)
  Fix news bullet
  Increment version number to 0.3.1.9000
  Increment version number to 0.3.1
  Update cran comments
  Check revdpes
  Re-build readme
  Typo fixes (#686)
  Polish news
  Update to latest Air settings and use `format-suggest.yaml` (#683)
  Use newer REST API base url (#726)
  Fix auth scope and API endpoints for Google Vertex (#704)
  Run `Rscript data-raw/prices.R` to update pricing info (#727)
  Improve error message for `batch_chat()` (#716)
  ...
if (!is.null(tracer)) {
  return(tracer)
}
if (testthat::is_testing()) {
Collaborator

Must check if {testthat} is installed (as it is currently only a Suggests pkg)

Member

You can just inline that function (i.e. copy it into utils.R)
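
A minimal sketch of the two options raised in this thread. The first guards the call so it is safe when testthat is not installed. The second inlines the check so no testthat dependency is needed at all. The inlined body relies on the `TESTTHAT` environment variable, which I believe is how testthat's own `is_testing()` works, but treat that as an assumption and check it against the installed version. The name `is_testing_guarded` is hypothetical.

```r
# Option 1: guard the call, since testthat is only in Suggests.
# Returns FALSE when testthat is not installed.
is_testing_guarded <- function() {
  requireNamespace("testthat", quietly = TRUE) && testthat::is_testing()
}

# Option 2: inline the helper (e.g. in utils.R) so testthat is not needed.
# Assumes testthat sets TESTTHAT=true while tests are running.
is_testing <- function() {
  identical(Sys.getenv("TESTTHAT"), "true")
}
```

Option 2 keeps the package's hard dependencies unchanged, at the cost of duplicating a small piece of testthat's logic.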
