Skip to content

Conversation

jan-xyz
Copy link
Contributor

@jan-xyz jan-xyz commented Aug 25, 2025

Changes

This adds tracing support for the tower layer. It was addressed briefly with @cijothomas on the CNCF slack, but we didn't discuss any details of the implementation. I took the liberty for now to take 2 major design decisions:

  1. General vs. Specialized tower layer: I chose to pick a general layer that does metrics and tracing together with a no-op tracer in case someone doesn't want to do tracing with this layer. I assume that long term people/teams will fully switch to OpenTelemetry for Metrics and Tracing and instead of specifying two layers which each inspect the request and response, it makes more sense to do this in one layer.

  2. Backwards compatibility: I tried to leave the original types in place which have the "Metric" implementation in the name. This is maybe a more smooth migration? I am not sure how helpful that actually is, and maybe we should just break compatibility in case we want to go with a unified layer.

Note: It currently does not include context extraction.

As a follow up I would also like to provide a way to use this as a client layer, where the semantics change a bit.

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@jan-xyz jan-xyz requested a review from a team as a code owner August 25, 2025 12:57
Copy link

codecov bot commented Aug 25, 2025

Codecov Report

❌ Patch coverage is 83.81503% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.6%. Comparing base (aceb106) to head (4d10b92).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
opentelemetry-instrumentation-tower/src/lib.rs 83.8% 28 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main    #431     +/-   ##
=======================================
+ Coverage   53.1%   53.6%   +0.4%     
=======================================
  Files         70      70             
  Lines      10643   10767    +124     
=======================================
+ Hits        5660    5778    +118     
- Misses      4983    4989      +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@jan-xyz jan-xyz changed the title Add tower tracing (feat) Add tower tracing Aug 25, 2025
@jan-xyz jan-xyz changed the title (feat) Add tower tracing feat: Add tower tracing Aug 25, 2025
@@ -281,8 +302,13 @@ impl<ReqExt, ResExt> HTTPMetricsLayerBuilder<ReqExt, ResExt> {
self
}

fn make_state(meter: Meter, req_dur_bounds: Vec<f64>) -> HTTPMetricsLayerState {
HTTPMetricsLayerState {
pub fn with_tracer(mut self, tracer: BoxedTracer) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not something introduced in this PR, but I am not sure if the instrumentation should really accept a tracer and meter. It can just get them from global.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the tests then set the global tracer and meter? Or do you have a better design in mind? I guess this raises the question for me: how should the default behavior be (in terms of which tracer/meter instance to pick but also in terms of activated functionality)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libraries (and instrumentation libraries which does instrumentation) should always retrieve the global tracer/meter. It is upto application owner to ensure they set a valid meter/tracer using global - if they don't, NoOps get used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so that implies that I should set the global tracer in tests to the InMemory variants?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. (might need some tricks to avoid collision between tests).
Anyway - changing to use global tracer/meter - it can be done on its own small PR. For this one, it is okay to continue with existing meter like approach of accepting tracer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated it to use global providers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is already breaking backwards compatibility I also took the liberty to remove the type aliases which kept backwards compatibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-added the functions to set the tracer and meter provider, because the tests are not well supported for the global setup. The default is now to use the global tracer and meter, but with the option to override it, which I also chose for testing. Otherwise, it's not possible to normally run the tests, and the main opentelemetry-rust repo even had all these tests disabled that used the global state. I am not sure if this will become better supported at some point, or you already have a plan for it? Otherwise, I think this is a good compromise.

@jan-xyz jan-xyz requested a review from cijothomas August 29, 2025 07:19
@jan-xyz
Copy link
Contributor Author

jan-xyz commented Aug 29, 2025

while working on this, I started to wonder how the overall strategy for this crate could look like, and I am wondering if my PR is maybe not going far enough. For the future we have multiple use cases for a (collection of?) tower layer(s).

So far I see at least 4:

  • HTTP Server
  • HTTP Client
  • gRPC Server
  • gRPC Client

As it stands now, it provides HTTP server support, for that it might make sense to also rename the structs accordingly to HTTPServerLayer a.s.o. I won't push the other 3 concerns into this PR, but I at least would have the plan to add these in the next couple of days/weeks if it fits this project since I need this functionality for a project I am currently working on.

It might even be an idea to provide some form of auto-detecting abstraction on top which figures out if it's in a server/client context or gRPC/HTTP exchange. I am not sure how realistic that is though, as I haven't looked into the details yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants