Skip to content

Conversation

@zacharycmontoya
Copy link
Contributor

@zacharycmontoya zacharycmontoya commented Jun 17, 2025

Motivation

We are looking to implement the OpenTelemetry Metrics API in our libraries, so this adds initial tests to ensure that the implementation generates metrics in the OTLP format. APMAPI-1323

Changes

Test changes:

  • Adds a new test tests/otel/test_metrics.py::Test_OTelMetrics to test that our various weblog applications that send OTLP metrics to the Datadog Agent's OTLP Ingest using OTLP/HTTP have the correct information. Asynchronous instruments report every time the configured MetricReader polls for results, and synchronous instruments are reported inside the /basic/metric weblog endpoint.
    • This new test is expected to be run under the following cases:
      • scenario=OTEL_METRIC_E2E + library=java_otel
      • scenario=APM_TRACING_E2E_OTEL + library=dotnet

Weblog changes:

  • .NET: Adds the following asynchronous metrics using the OpenTelemetry Metrics API:
    • ObservableCounter (Asynchronous Counter)
    • ObservableUpDownCounter (Asynchronous UpDownCounter)
    • ObservableGauge (Asynchronous Gauge)
  • .NET: Adds a /basic/metric endpoint to generate one of each of the following metrics using the OpenTelemetry Metrics API:
    • Counter
    • UpDownCounter
    • Histogram
    • Gauge
  • .NET: (To be deleted) Imports the OpenTelemetry SDK and configures an OTLP metrics exporter. This will be removed before merging, as this behavior should be handled by the DD library
  • Java OTel: Adds code to generate metrics with asynchronous instruments and the remaining synchronous instruments

Interface changes:

  • OpenTelemetry: Adds a get_metrics function to return all of the OTLP metrics sent to the interface with a target host (e.g. 'agent')

Scenario changes:

  • Provides some small fixes for the OpenTelemetry scenario by configuring additional interfaces
  • Adds the open_telemetry interface to the general EndToEndScenario class so that any weblog application can emit OTLP data that will be properly captured by the proxy

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on you PR until the CI passes (if something not related to your task is failing, you can ignore it)
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from RFC owner. We're working on refining the codeowners file quickly.
    • Framework is modified, or non obvious usage of it -> get a review from R&P team

🚀 Once your PR is reviewed, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • If PR title starts with [<language>], double-check that only <language> is impacted by the change
  • No system-tests internal is modified. Otherwise, I have the approval from R&P team
  • CI is green, or failing jobs are not related to this change (and you are 100% sure about this statement)
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added (or removed)?

@zacharycmontoya zacharycmontoya changed the title tracing: Zach.montoya/otel otlp experiment tracing: Add tests for OpenTelemetry Metrics API Jun 17, 2025
…ert the format of metrics generated by an OTel SDK. This test currently asserts the upload to the backend, so this should be updated to instead assert how metrics are received by the Datadog Agent.
…he OTLP endpoints are all sending data to the proxy at the open_telemetry_weblog port)
…2e.py::Test_OTelMetricE2E class to a new tests/otel/test_metrics.py::Test_OTelMetrics class
…ics.py::Test_OTelMetrics test cases with weblog applications
…gainst the .NET weblog app which is currently set to pass due to directly bringing in the OTel SDK for metrics collection, simply as a demonstration of the tests
…red in the OTLP format for individual metrics
…t in order to pass the tests/otel/test_metrics.py::Test_OTelMetrics test cases (when the OTel Metrics API implementation is added)
@zacharycmontoya zacharycmontoya force-pushed the zach.montoya/otel-otlp-experiment branch from f25dc55 to 8cea4f8 Compare June 25, 2025 23:24
zacharycmontoya added a commit to DataDog/dd-trace-py that referenced this pull request Jun 25, 2025
…in-progress system-tests for the OTel Metrics API feature at DataDog/system-tests#4790
…e the tracer instrumenting the weblog application to use the new OTel Metrics API support
zacharycmontoya added a commit to DataDog/dd-trace-py that referenced this pull request Jul 9, 2025
## Motivation
We want to implement support for the OTel Metrics API across our
libraries. There is an
[RFC](https://docs.google.com/document/d/1Zkpk6QdlRQ67459fqHk8_SlJKb3-ZqRg-3juLHtWBtU/edit?usp=sharing)
in development as well as
[system-tests](DataDog/system-tests#4790) to
define that behavior, and this POC aims to deliver the necessary code to
implement that.

For API usage, see the `utils/build/docker/python/flask/app.py` file in
the referenced system-tests PR.

Tracking Issue: [APMAPI-1322]

## Summary
This PR provides an opt-in implementation of the OTel Metrics API by
taking an optional dependency on the OpenTelemetry SDK packaage, so it
will only be enabled when both the flag is enabled and the dependency is
present. By bringing in the SDK component, we minimize API
incompatibilities for the time being and we cover the OTel Metrics API
that application / library developers expect:
- Get/Set a global `MeterProvider` instance (which provides access to
`Meter`s)
- Get a `Meter` with a given name+version
- Create synchronous/asynchronous instruments with a given name, kind,
unit, and description. This also allows users to directly record
measurements on the synchronous instruments.

All of this functionality is controlled by the environment variable
`DD_TRACE_OTEL_METRICS_ENABLED`, which is disabled by default.

## Testing
This PR adds unit tests for the configuration, ensuring that we do not
load any metrics SDK types unless the user has brought their SDK +
exporter AND `DD_TRACE_OTEL_METRICS_ENABLED=true`. In terms of
end-to-end testing, I am currently relying on the linked system-tests
PR, which this branch is currently passing.

## Improvements
This is still early in the development, so we can expect to implement
additional changes in the future, such as making the integration with
Datadog environment variables complete and setting metric preferences
that align with Datadog's processing. For now, I'd like to merge as-is.

## Related Changes
There are a number of changes that are adopted from
#13831, including riotfile
changes and configuration test code. When one PR is merged, the other
should rebase to make merging easier.

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - -

[APMAPI-1322]:
https://datadoghq.atlassian.net/browse/APMAPI-1322?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Munir Abdinur <[email protected]>
Co-authored-by: Munir Abdinur <[email protected]>
Comment on lines +559 to +560
"DD_TRACE_OTEL_ENABLED": "true",
"DD_TRACE_OTEL_METRICS_ENABLED": "true",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we should remove the first config here and also change the name of the env var below to DD_METRICS_OTEL_ENABLED instead?

alyshawang pushed a commit to DataDog/dd-trace-py that referenced this pull request Jul 25, 2025
## Motivation
We want to implement support for the OTel Metrics API across our
libraries. There is an
[RFC](https://docs.google.com/document/d/1Zkpk6QdlRQ67459fqHk8_SlJKb3-ZqRg-3juLHtWBtU/edit?usp=sharing)
in development as well as
[system-tests](DataDog/system-tests#4790) to
define that behavior, and this POC aims to deliver the necessary code to
implement that.

For API usage, see the `utils/build/docker/python/flask/app.py` file in
the referenced system-tests PR.

Tracking Issue: [APMAPI-1322]

## Summary
This PR provides an opt-in implementation of the OTel Metrics API by
taking an optional dependency on the OpenTelemetry SDK packaage, so it
will only be enabled when both the flag is enabled and the dependency is
present. By bringing in the SDK component, we minimize API
incompatibilities for the time being and we cover the OTel Metrics API
that application / library developers expect:
- Get/Set a global `MeterProvider` instance (which provides access to
`Meter`s)
- Get a `Meter` with a given name+version
- Create synchronous/asynchronous instruments with a given name, kind,
unit, and description. This also allows users to directly record
measurements on the synchronous instruments.

All of this functionality is controlled by the environment variable
`DD_TRACE_OTEL_METRICS_ENABLED`, which is disabled by default.

## Testing
This PR adds unit tests for the configuration, ensuring that we do not
load any metrics SDK types unless the user has brought their SDK +
exporter AND `DD_TRACE_OTEL_METRICS_ENABLED=true`. In terms of
end-to-end testing, I am currently relying on the linked system-tests
PR, which this branch is currently passing.

## Improvements
This is still early in the development, so we can expect to implement
additional changes in the future, such as making the integration with
Datadog environment variables complete and setting metric preferences
that align with Datadog's processing. For now, I'd like to merge as-is.

## Related Changes
There are a number of changes that are adopted from
#13831, including riotfile
changes and configuration test code. When one PR is merged, the other
should rebase to make merging easier.

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - -

[APMAPI-1322]:
https://datadoghq.atlassian.net/browse/APMAPI-1322?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: Munir Abdinur <[email protected]>
Co-authored-by: Munir Abdinur <[email protected]>
@zacharycmontoya
Copy link
Contributor Author

Superseded by #5106

@zacharycmontoya zacharycmontoya deleted the zach.montoya/otel-otlp-experiment branch October 5, 2025 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants