Skip to content

Conversation

@nsavoire
Copy link

@nsavoire nsavoire commented Sep 1, 2025

What does this PR do?

This PR stores the tracer process level context in a named anonymous mapping on Linux.

PROF-12406

Motivation

Project plan

  • Datadog proposal to OTel
  • Technical RFC on process-level information and why we chose named anonymous mapping over memfd

Reviewer's Checklist

  • Changed code has unit tests for its functionality at or near 100% coverage.
  • System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
  • There is a benchmark for any new code, or changes to existing code.
  • If this interacts with the agent in a new way, a system test has been added.
  • New code is free of linting errors. You can check this by running ./scripts/lint.sh locally.
  • Add an appropriate team label so this PR gets put in the right place for the release notes.
  • Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.

Unsure? Have a question? Request a review!

@nsavoire nsavoire requested review from a team as code owners September 1, 2025 21:28
@nsavoire nsavoire force-pushed the nsavoire/process_level_context branch from 0cd2733 to 99d1142 Compare September 1, 2025 21:32
@pr-commenter
Copy link

pr-commenter bot commented Sep 1, 2025

Benchmarks

Benchmark execution time: 2025-11-17 11:33:34

Comparing candidate commit 76e24fe in PR branch nsavoire/process_level_context with baseline commit bce309f in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 24 metrics, 0 unstable metrics.

@nsavoire nsavoire changed the title [PROF-12406] feat: store OTEL process context in named anonymous mapping on Linux feat: store OTEL process context in named anonymous mapping on Linux Sep 2, 2025
@nsavoire nsavoire force-pushed the nsavoire/process_level_context branch 2 times, most recently from f4348e8 to 56e26fa Compare September 2, 2025 08:08
Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of comments!

My go-foo is not very strong so definitely worth getting a pass from someone that's not pretending to know go as well ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is cool! Much better than generating this by hand as we did in the C implementation ;)

@nsavoire nsavoire requested a review from ivoanjo September 4, 2025 11:52
@nsavoire nsavoire force-pushed the nsavoire/process_level_context branch from 2282839 to 3025072 Compare September 4, 2025 12:12
Copy link
Contributor

@nsrip-dd nsrip-dd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this with a simple Go program that just starts the APM tracer and then blocks forever. I've set DD_ENV, service name, and other config values, and then used Ivo's shell script to confirm that the memory mapping is there and has the expected contents.

Just left a few small comments.

// https://opentelemetry.io/docs/specs/semconv/registry/attributes/service/#service-version
ServiceVersion string `msg:"service.version"`
// https://opentelemetry.io/docs/specs/semconv/registry/attributes/telemetry/#telemetry-sdk-language
TelemetrySdkLanguage string `msg:"telemetry.sdk.language"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really small nitpick: the Go custom is to capitalize acronyms, like TelemetrySDKLanguage

log.Error("failed to store the configuration: %s", err.Error())
}

processContext := otelProcessContext{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a doc that explains why were are not using the Memfd solution for OpenTelemetry? IIRC we had some good reason, but I forgot what it was. We should probably link to that doc here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reasons were:

  • memfd_create syscall might be blocked by some seccomp profiles
  • it requires more syscalls / IO since all symlinks in /proc/<pid>/fd/ need to be resolved until datadog-tracer-info-<id> file is found

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "not inherited by fork" is also a nice bonus, and connects back to the discussions happening internally on the meaning of "runtime-id" -- the anonymous mapping mechanism, because it's not inherited by forks won't have any issues with stale information (such as a runtime-id) being carried from parent to child process.

@nsavoire nsavoire force-pushed the nsavoire/process_level_context branch from d14c0bc to 0111ebb Compare November 14, 2025 14:46
@nsavoire nsavoire requested a review from a team as a code owner November 14, 2025 14:46
@nsavoire nsavoire requested a review from nsrip-dd November 14, 2025 14:50
Comment on lines +1 to +6
// Unless explicitly stated otherwise all files in this repository are licensed
// under the Apache License Version 2.0.
// This product includes software developed at Datadog (https://www.datadoghq.com/).
// Copyright 2025 Datadog, Inc.

//go:build linux
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is a Linux specific implementation, but it'd be good to have some tests. You can skip a test if the OS isn't Linux like this:

func TestCreateOtelProcessContextMapping(t *testing.T) {
    if runtime.GOOS != "linux" {
        t.Skip("OTEL process context mapping is only supported on Linux")
    }
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reluctant to add a test because of all the code needed to read the mapping.
But so be it, I added a test.

@nsavoire nsavoire requested a review from darccio November 14, 2025 21:19
@nsavoire nsavoire force-pushed the nsavoire/process_level_context branch from e5bd886 to 76e24fe Compare November 17, 2025 11:02
@r1viollet
Copy link

@darccio would you have time for this review ? 🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants