396 changes: 396 additions & 0 deletions oteps/4672-OpAmp-metric-schedule-extension.md
# Enhancing OpenTelemetry for Large-Scale Metric Management

## Summary

This proposal outlines an enhancement to OpenTelemetry's OpAmp control plane to
address the challenges of large-scale metric management in push-based telemetry
systems. It suggests extending OpAmp to include a standardized protocol for
server-driven metric configuration, allowing backends to define sampling periods
and routing instructions based on metric name and resource. This would enable
proactive management of telemetry flow, reducing operational costs and improving
efficiency, similar to capabilities found in pull-based systems like Prometheus,
but without requiring server-side polling and registration. The proposal details
specific protobuf extensions, `ScheduleInfoRequest` and `ScheduleInfoResponse`,
and describes a state machine for how agents and servers would interact to
implement this dynamic metric scheduling.

## Motivation

The core motivation for this proposal stems from a fundamental principle of
operating at scale: the cheapest RPC is the one that is never sent. Push-based
telemetry systems like OpenTelemetry currently lack a standardized,
server-driven mechanism to manage the volume and frequency of data sent by
clients. Unlike pull-based systems such as Prometheus, which inherently control
data load through client-side polling, push-based architectures risk
overwhelming ingestion systems as deployments grow. Without a way for the
backend to provide instructions on data transmission, we are left reacting to
data floods rather than preventing them proactively.

By extending the OpenTelemetry OpAmp control plane, we can introduce a
standardized protocol for metric configuration. This enhancement would provide a
common, interoperable language for the backend to instruct clients on how to
batch, sample, and transmit telemetry data. While OpAmp's existing custom
capabilities offer a path for vendor-specific solutions, a standardized
mechanism is essential for the broad adoption and interoperability that
OpenTelemetry champions. This new capability would enable proactive,
server-driven management of telemetry flow, leading to significantly improved
efficiency, reduced data ingestion and storage costs, and more resilient
monitoring infrastructure in large-scale environments.

Is the client in this an SDK or a collector? Right now collectors are the only
OpAMP clients, which to me limits part of the effectiveness of this proposal.
The idea of being able to control this on an SDK level would reach the goals in
the proposal, whereas simply doing this in a collector is already possible
through OpAMP remote configuration.

I see below that the proposal would involve SDK/API changes; I think it would be
useful to specify these changes in the summary.

Contributor

@jaronoff97 This proposal is targeted at SDKs, but CAN apply to opentelemetry
collectors. The idea behind the decoupling is that we don't care WHICH consumer
applies the config as long as we know the consumer can accept it.

Comment on lines +32 to +35

I'm not sure what vendor-specific solutions this refers to; OpAMP's existing
capabilities for the collector and bridge are both vendor-neutral and allow you
to control batching and sampling.

This protocol extension would unlock several powerful use cases for dynamically
managing telemetry data at the source.
Comment on lines +40 to +41

Both of these are possible today via remote configuration.

Contributor

Not quite. We'd like to be able to have remote configuration where we don't
need to know the exact details of what agent we're talking to. I.e. we need an
abstraction of specific control use cases that we can push to any agent.

Today, OpAMP implementations (in my experience) require hard-coding knowledge
of each agent implementation in every server. We need something to break that
down, or the cost of supporting new agents grows with every new agent you want
to support vs. the cost growing with every use case you want to support.


* **Dynamic Sampling and Throttling:** The backend could instruct a client to
change the collection period for a specific metric on a particular resource.
For instance, it could reduce the reporting frequency for a stable,
high-volume metric or increase it during a critical incident investigation.

* **Intelligent Data Routing:** The protocol could convey dynamic routing
information, allowing clients to send specific types of telemetry directly
to the most efficient storage or processing backend. A client could be
instructed to send high-cardinality metrics to a time-series database
optimized for that workload, while sending critical health check metrics to
a high-availability alerting system. This optimizes data write paths and
improves overall system performance.

Initially, this proposal focuses on metrics, but the framework could easily be
extended to manage other signals like traces and logs in the future.

## Explanation

As an OpenTelemetry user, you now have a more efficient way to manage your
metrics, especially in large-scale systems. The OpAmp control plane has been
enhanced to allow your backend systems to dynamically configure how your
telemetry data is collected.

Here's how it works:

* **Server-Driven Configuration:** Instead of relying on client-side polling,
your backend can now define sampling periods and routing instructions for
your metrics. This means your backend can tell your OpenTelemetry agents
exactly how often to send specific metric data and where to send it, based
on the metric name and the resource being monitored.
* **Reduced Operational Costs:** By proactively managing telemetry flow, you
can avoid overwhelming your ingestion systems. This leads to reduced
operational costs and improved efficiency, similar to what's possible with
pull-based systems like Prometheus, but without the need for client-side
polling.
* **Dynamic Metric Scheduling:** The system uses a state machine to manage the
interaction between your agents and servers. When a new metric with a new
resource is written, your agent sends a request to the OpAmp server for
scheduling information. The server then provides the sampling periods and
other configuration parameters.
* **Flexible Sampling:** You can have different sampling rates for different
metrics and resources. For example, critical metrics like RPC status, which
are used for alerting, can be collected at a high frequency (e.g., every few
seconds), while less critical metrics like machine addresses can be
collected much more slowly (e.g., every few hours). This allows you to
tailor data collection to the specific needs of your systems.
* **Collector Integration:** If you use an OpenTelemetry collector, it can act
as a proxy for the original requests, potentially with caching capabilities,
or even take on the role of an agent or server itself, providing further
flexibility in your metric management setup.

In essence, this enhancement gives you more granular control over your telemetry
data, allowing you to optimize data collection for performance and cost
efficiency.
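
To make the flexible-sampling example above more concrete, the sketch below
shows the kind of per-metric policy a backend might hold; the metric prefixes
and periods are purely illustrative and not part of the proposal:

```python
from datetime import timedelta

# Hypothetical server-side policy: metric-name prefixes mapped to sampling
# periods, mirroring the examples above (alerting metrics every few seconds,
# slow-changing inventory metrics every few hours).
EXAMPLE_POLICY = {
    "rpc/server/status/": timedelta(seconds=10),   # critical, used for alerting
    "machine/description/": timedelta(hours=4),    # slow-changing metadata
}

DEFAULT_PERIOD = timedelta(minutes=1)  # fallback when no rule matches


def sampling_period_for(metric_name: str) -> timedelta:
    """Return the sampling period for a metric; the longest matching prefix wins."""
    matches = [(p, d) for p, d in EXAMPLE_POLICY.items()
               if metric_name.startswith(p)]
    if not matches:
        return DEFAULT_PERIOD
    return max(matches, key=lambda item: len(item[0]))[1]
```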

## Internal details

In this example, let's assume that we have a new metric reader and exporter:

* The `ControlledExporter` is an exporter that uses OpAmp to control its
export, similar to other configurations retrieved from the management
protocol.
* The `MultiPeriodicExportingMetricReader` is a reader that pushes metrics
similarly to the `PeriodicExportingMetricReader`, except it permits multiple
sampling periods per metric, and accepts a `ControlledExporter` as input.

During initialization, the `ControlledExporter` contacts the OpAmp server,
informing it that it requires the server to provide the sampling periods per
metric and resource, based on its policies. The server will indicate whether it
has that capability; if it does not, it can still provide a default sampling
period, and if it provides none, the exporter will fall back to a default
sampling period of its own.

The `MultiPeriodicExportingMetricReader` will then interface with the
`ControlledExporter`: for each new metric, it will request the sampling period
and other configuration parameters from the exporter. To support this, the
exporter needs to implement an agreed-upon interface, an extension of the
regular `MetricExporter` interface.
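
As a rough illustration of that interface, the sketch below shows one possible
shape for the reader/exporter contract. The class names come from this
proposal, but the `Schedule` type and the method names (`schedules_for`,
`on_new_metric`) are assumptions; the OTEP only requires an agreed-upon
extension of the regular `MetricExporter` interface.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Schedule:
    """One schedule entry: how often to export and which extra resource to attach."""
    sampling_period: timedelta
    extra_resource: Mapping[str, str]


class ControlledExporter:
    """Exporter whose export schedules are driven by OpAmp responses."""

    def __init__(self, opamp_client, fallback_period: timedelta):
        self._opamp = opamp_client   # hypothetical OpAmp client handle
        self._fallback = fallback_period
        self._schedules = {}         # (resource key, metric name) -> schedules

    def schedules_for(self, resource: Mapping[str, str],
                      metric_name: str) -> Sequence[Schedule]:
        """Return server-provided schedules, or a fallback while none are known."""
        key = (tuple(sorted(resource.items())), metric_name)
        if key not in self._schedules:
            # A real implementation would issue a non-blocking
            # ScheduleInfoRequest and update this cache when the response
            # arrives; until then the fallback period is used.
            return [Schedule(self._fallback, {})]
        return self._schedules[key]


class MultiPeriodicExportingMetricReader:
    """Reader that keeps one collection cycle per distinct sampling period."""

    def __init__(self, exporter: ControlledExporter):
        self._exporter = exporter
        self._periods = {}           # sampling period -> metrics collected at it

    def on_new_metric(self, resource: Mapping[str, str], metric_name: str) -> None:
        for schedule in self._exporter.schedules_for(resource, metric_name):
            # A real reader would (re)arm an export timer per period and merge
            # schedule.extra_resource into the exported resource.
            self._periods.setdefault(schedule.sampling_period, set()).add(metric_name)
```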

### Extensions to OpAmp

For metrics, we could extend the current `AgentToServer` message with a new
`ScheduleInfoRequest` that agents could use to request the sampling period,
plus metadata like storage keys that would be completely opaque / arbitrary to
them and that they could use later to optimize data routing and storage.

Comment on lines +123 to +124

There is no current ScheduleInfoRequest as far as I can tell?

The request could look something like this:

```protobuf
import "opentelemetry/proto/common/v1/resource.proto";

package opentelemetry.proto.metrics.v1;

message ScheduleInfoRequest {
// Identifies the entity to read schedules for from its resources.
opentelemetry.proto.resource.v1.Resource resource = 1;

// Last known fingerprint of the server response. Optional. If provided,
// the server MAY drop responses in case the fingerprint is unchanged.
int64 response_fingerprint = 2 [default = -1];

// List of metrics for which the agent wants schedule information. If
// unset, all metrics matching the resource will be evaluated.
repeated string metric_name = 3;

// Last known information mapped from these resources. The server
// can use this information to avoid sending unnecessary data. Multiple
// entries imply sending the same request as multiple resources, one
// for each entry below.
message Metadata {
// The most recent list of additional resources to be passed by
// the agent received from the server. Helps the server deduplicate
// requests on their side.
opentelemetry.proto.resource.v1.Resource extra_resource = 1;

// Last known fingerprint reported by the server. The server
// MAY avoid re-sending data if nothing has changed on its side.
fixed64 schedules_fingerprint = 3;
}

repeated Metadata metadata = 4;

// List of custom capabilities implemented by the agent for which the
// server might be able to provide information, similar to the main
// "custom_capabilities" feature in OpAmp. If unset, the server will
// not report these back.
repeated string custom_scheduling_capabilities = 5;
}
```

Then, in the
[ServerToAgent](https://opentelemetry.io/docs/specs/opamp/#servertoagent-message)
message we would add the response message, as follows:

```protobuf
import "opentelemetry/proto/common/v1/resource.proto";

message ScheduleInfoResponse {
// Fingerprint of the response, provided by the server. If it is the
// same as the one sent in the request, the server provided the same
// answer as before, and the client should just ignore this reply,
// even if it comes with the other fields filled in.
int64 response_fingerprint = 1 [default = -1];

// Last known information mapped from these resources. The server
// can use this information to avoid sending unnecessary data. Multiple
// entries imply sending the same request as multiple resources, one
// for each entry below.
repeated Metadata metadata = 2;

message Metadata {
// Additional resources to be passed by requests dispatching
// this set of resources.
opentelemetry.proto.resource.v1.Resource extra_resource = 1;

// Schedules fingerprint given by the server. The client SHOULD send
// the last known fingerprint and extra resource pair back to the
// server so the server can avoid re-sending information
// unnecessarily.
fixed64 schedules_fingerprint = 3;

// Schedules which apply to this set of extra resources, or empty
// if the fingerprints match and the server decided to not send them.
message Schedule {
// The sampling period at which to send points.
google.protobuf.Duration sampling_period = 1;

// Set of patterns to be applied to metric names, matching by prefix
// if the pattern ends with /, or exactly otherwise. A schedule
// applies to a given metric if the metric matches the inclusion
// patterns and does NOT match the exclusion patterns.
repeated string metric_inclusion_pattern = 2;
repeated string metric_exclusion_pattern = 3;

// Any additional information supporting a particular custom
// capability supported by the server and requested by the client.
// It is up to the client and the server to agree on a common
// protocol in this scenario.
message CustomSchedulingMessage {
// Same syntax as the regular CustomMessage used by regular
// custom capabilities.
string capability = 1;
string type = 2;
bytes data = 3;
}
repeated CustomSchedulingMessage custom_scheduling_message = 4;
}
repeated Schedule schedule = 4;
}
// Delay before the next time the agent should request a schedule for
// the metrics in question, as suggested by the server. Clients SHOULD
// NOT send a similar request before this delay has elapsed unless
// there is a crash or some other special circumstance.
google.protobuf.Duration next_delay = 3;
}
```
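
The inclusion/exclusion rule described in the `Schedule` message could be
implemented along the lines of the Python sketch below; the function names and
example metric names are illustrative only:

```python
from typing import Iterable


def _matches(metric_name: str, patterns: Iterable[str]) -> bool:
    """True if the metric matches any pattern (prefix match if it ends with '/')."""
    for pattern in patterns:
        if pattern.endswith("/"):
            if metric_name.startswith(pattern):
                return True
        elif metric_name == pattern:
            return True
    return False


def schedule_applies(metric_name: str,
                     inclusion: Iterable[str],
                     exclusion: Iterable[str]) -> bool:
    """A schedule applies if the metric is included and not excluded."""
    return _matches(metric_name, inclusion) and not _matches(metric_name, exclusion)


# Example: include all RPC server metrics except a detailed latency histogram.
assert schedule_applies("rpc/server/count",
                        inclusion=["rpc/server/"],
                        exclusion=["rpc/server/latency_histogram"])
assert not schedule_applies("rpc/server/latency_histogram",
                            inclusion=["rpc/server/"],
                            exclusion=["rpc/server/latency_histogram"])
```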

### State machine

These extensions would enable the proposed reader and exporter, together with
OpAmp, to implement the following state machine:

1. Whenever a metric with a new resource is written, the SDK (through the
reader and exporter described above) sends an `AgentToServer` message to the
OpAmp endpoint, asking for the schedule information.
2. The server will check its internal configuration, and construct an answer
for the client. For each requested schedule, it will also compute a
`schedules_fingerprint` that summarizes the schedule. If the fingerprint
matches the one sent by the client for that schedule, the server will skip
that change and not send it in the response.
3. The client reads the response and stores it in its internal memory.
   1. If no schedule is provided, the client will assume the metric is not
      expected to be collected by the server, and simply not configure any
      exporter. This saves resources both on the client, which doesn't have to
      create RPCs for the metric, and on the server, which doesn't have to do
      any processing for a metric that will not be retained.
   2. Otherwise, the sampling period provided will then be used to configure
      the push metric period **instead of using one set in the code**. Clients
      are expected to spread writes over time to avoid overloading a server.
   3. If multiple entries are provided, the client is expected to perform
      separate collections, once for each sampling period provided. This
      enables cases where the same metric should be retained at different
      frequencies at the discretion of the server. For example, the server
      could collect a metric at a higher frequency and keep it in memory, yet
      have a separate, lower-frequency collection that is immediately stored
      on disk or forwarded to a separate system that does some form of
      analytics.
   4. Each request should have its resource description extended by the
      extra_resources provided by the server.
4. Once the client receives the response, it should wait for `next_delay` and
   then send a new `AgentToServer` request to the server, sending any previous
   fingerprint it received from the server.
   1. In case changes are reported, the client is expected to update the
      sampling periods accordingly.

Comment on lines +258 to +260

Wouldn't this potentially create a new exporter for every schedule? I would
worry about the efficiency of doing this.

Contributor

No, we would not need more than one exporter for each schedule.

When metrics scale per-SDK (e.g. even in large-scale Prometheus applications),
filtering the metrics sent per batch or altering collection to have different
metrics at different intervals is a must-have capability. This can be done
efficiently and does not imply a new mechanism for export; it can be controlled
purely at the "metric reader" layer.

Would these operations be blocking or async? I would worry about them blocking
and causing a massive memory backup.

Contributor

This would operate similar to how Jaeger-Remote-Sampler works today.

These calls are NON blocking, and there is a fallback behavior used until the
values from OpAmp are returned.
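
Returning to step 2, one possible way for a server to compute and compare the
`schedules_fingerprint` is sketched below. The proposal does not mandate any
particular fingerprint algorithm, so the SHA-256-based choice and the helper
names here are assumptions, not part of the protocol:

```python
import hashlib
import struct
from typing import Optional


def schedules_fingerprint(serialized_schedules: bytes) -> int:
    """Derive a stable 64-bit fingerprint from a canonical serialization.

    The proposal only requires the fingerprint to be stable for identical
    schedules; hashing a canonical serialization is one possible choice.
    """
    digest = hashlib.sha256(serialized_schedules).digest()
    return struct.unpack("<Q", digest[:8])[0]


def metadata_for_response(serialized_schedules: bytes,
                          client_fingerprint: Optional[int]):
    """Return (fingerprint, schedules to send); schedules are omitted on a match."""
    fingerprint = schedules_fingerprint(serialized_schedules)
    if client_fingerprint == fingerprint:
        # The client already has these schedules, so the server skips
        # re-sending them in this Metadata entry (step 2).
        return fingerprint, None
    return fingerprint, serialized_schedules
```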

These can be seen in the diagram below

```mermaid
graph TD
%% Agent Subgraph
subgraph Agent
init("Initialization")
send_request("Send AgentToServer Request")
read_response("Read Server Response")
store_schedule{"Has updated<br>schedule?"}
wait_next_delay("Wait for Next Delay")
metric_write("Metric write with new resource")
no_schedule("No schedule, stop exporting")
configure_exporter("Configure MetricReader<br>with provided<br>sampling period")
end

%% Server Subgraph
subgraph Server
receive_request("Receive AgentToServer Request")
check_config("Check Internal Config")
build_response("Build Response with Schedules<br> and Extra Resources")
compute_fingerprint("Compute Schedules Fingerprint")
compare_fingerprint{"Compare Fingerprint"}
skip_response("Skip sending Schedule")
send_server_response("Send Server Response")
end

%% Connections
init --> metric_write
metric_write -- "New Resource" --> send_request
send_request --> receive_request
receive_request --> check_config
check_config --> build_response
build_response --> compute_fingerprint
compute_fingerprint --> compare_fingerprint
compare_fingerprint -- "Fingerprint matches" --> skip_response
compare_fingerprint -- "Fingerprint differs" --> send_server_response
skip_response --> send_server_response
send_server_response --> read_response
read_response --> store_schedule
store_schedule -- "Schedule provided" --> configure_exporter
store_schedule -- "No schedule provided" --> no_schedule
configure_exporter --> wait_next_delay
no_schedule --> wait_next_delay
wait_next_delay -- "After Delay" --> send_request
```
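
The diagram can also be read as a simplified, non-normative agent-side loop,
sketched below in Python. The client and reader APIs (`request_schedule_info`,
`apply_schedules`) are assumptions; a real SDK would use the generated OpAmp
messages and a non-blocking transport rather than a sleep loop:

```python
import time
from datetime import timedelta

# Assumed default used when the server does not suggest a next_delay.
FALLBACK_DELAY = timedelta(minutes=5)


def schedule_polling_loop(opamp_client, reader, resource, known_fingerprints):
    """Keep the schedules for one resource up to date (steps 1, 3 and 4)."""
    while True:
        # Steps 1/4: send an AgentToServer request carrying the last known
        # fingerprints so the server can skip unchanged Metadata entries.
        response = opamp_client.request_schedule_info(
            resource=resource,
            metadata=known_fingerprints,  # {extra-resource key: fingerprint}
        )
        for entry in response.metadata:
            key = tuple(sorted(entry.extra_resource.items()))
            if not entry.schedules:
                # Step 3.1: either the fingerprint matched (keep the current
                # configuration) or the server does not want this data at all.
                continue
            known_fingerprints[key] = entry.schedules_fingerprint
            # Steps 3.2-3.4: reconfigure the reader, one collection per
            # schedule, with the export resource extended by extra_resource.
            reader.apply_schedules(entry.extra_resource, entry.schedules)
        # Step 4: wait for the server-suggested delay before asking again.
        delay = response.next_delay or FALLBACK_DELAY
        time.sleep(delay.total_seconds())
```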

### Adding an OTel collector to the protocol

In case a collector is present, the collector would be expected to act in one of
the following modes:

* If the collector is not doing any major data transformations, it could be
seen purely as a proxy for the original request from the agent to OpAmp,
perhaps with some caching capabilities to avoid sending too many requests to
the original storage system, but retaining the ability of the central system
to configure sampling periods for all collected metrics.
* Note that it is also possible that the agents contact the OpAmp server
directly even though a collector is used in the collection path.
* Otherwise, the collector can assume either the role of the agent or the
server, or even both. Let's say that a collector has a rule where, for a
given metric, it should always collect 10 samples and report the highest
value out of the 10 samples during an interval. The collector could first
request the sampling period from the configuration system to determine
whether the metric has an active schedule and at what frequency, allowing it
to apply its custom collection rule within the server-defined interval.
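
For the first (pure proxy) mode, a collector could cache schedule responses on
behalf of the SDKs behind it, roughly as in the sketch below; the types and
method names are assumptions rather than an existing collector component:

```python
import time


class ScheduleProxy:
    """Collector-side cache of schedule responses for downstream SDKs."""

    def __init__(self, upstream_client, default_ttl_seconds: float = 300.0):
        self._upstream = upstream_client  # hypothetical OpAmp client handle
        self._default_ttl = default_ttl_seconds
        self._cache = {}                  # cache key -> (expiry, response)

    def get_schedule_info(self, request):
        """Serve from cache while fresh, otherwise forward to the OpAmp server."""
        key = self._cache_key(request)
        now = time.monotonic()
        cached = self._cache.get(key)
        if cached and cached[0] > now:
            return cached[1]
        response = self._upstream.request_schedule_info(request)
        # Honour the server-suggested next_delay as the cache lifetime when
        # present, otherwise fall back to a local default.
        ttl = getattr(response, "next_delay_seconds", None) or self._default_ttl
        self._cache[key] = (now + ttl, response)
        return response

    @staticmethod
    def _cache_key(request):
        # Assumption: resource attributes plus the requested metric names
        # identify a schedule lookup well enough for caching purposes.
        return (tuple(sorted(request.resource.items())),
                tuple(sorted(request.metric_name)))
```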

## Trade-offs and mitigations

Since the sampling period is now computed by the server and overrides the one
configured in the agent, as described earlier, a misconfiguration on the server
would prevent an agent from sending data to the server and cause telemetry loss.
This is no different from a Prometheus server not sending the collection
requests or an OTel server rejecting an RPC because it has no information on how
to store a given metric, so it is important to ensure the server is configured
correctly and is able to reply to OpAmp requests when prompted.

One mitigation for this scenario is to configure agents with a default fallback
sampling period to be used if none is provided and the OpAmp server can't be
contacted, so the agent can keep sending data even in cases where the server is
misconfigured. This unfortunately has the side effect of creating additional
cost, since RPCs are sent even if the server does not intend to collect the
data, but for some systems and metrics it might be preferable to have the data
written rather than having no data at all.
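
On the agent side, that mitigation could look roughly like the sketch below;
the default value and function name are illustrative only:

```python
from datetime import timedelta
from typing import Optional

DEFAULT_FALLBACK_PERIOD = timedelta(seconds=60)  # assumed local default


def effective_sampling_period(server_period: Optional[timedelta],
                              opamp_reachable: bool,
                              fallback: timedelta = DEFAULT_FALLBACK_PERIOD
                              ) -> Optional[timedelta]:
    """Pick the sampling period to use for a metric.

    A server-provided period always overrides the one configured in code. If
    the OpAmp server cannot be contacted, the local fallback keeps data
    flowing, accepting that the server may discard it. If the server is
    reachable but provided no schedule, the metric is not exported.
    """
    if server_period is not None:
        return server_period
    if not opamp_reachable:
        return fallback
    return None  # no schedule: do not export this metric
```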

## Prior art and alternatives
Member

Have you considered using existing remote configuration capabilities of OpAMP?

OpAMP server can send a config setting that describes the sampling policy (what this proposal defines in ScheduleInfoResponse). You can define an Otel semconv that defines a name of the config file that contains sampling settings and include that file in AgentConfigMap. The format of the file can be also defined by the same semconv or be delegated to Otel Configuration SIG to make a decision on.

The information carried in ScheduleInfoRequest can be part of existing AgentDescription message, recorded in non_identifying_attributes.

This way there is no need to make any changes to OpAMP itself.

Contributor

While we could use the AgentConfigMap as a vehicle to send config, OpAmp is lacking the ability to declare what semantics would be acceptable config files.

E.g. how would we know that any particular Agent can accept the config file we want to send? How would we reserve the "name" portion of the ConfigMap? How do we know it's safe to attach our config into the config map for any agent?

This proposal first started with looking at using "custom capabilities" to advertise whether or not Metric config would be something an Agent would understand and accept.

While I appreciate your suggestion - I still think there's either some missing pieces or clear interaction between AgentConfigMap + CustomCapabilities to make this practical to use generically.

Member

> E.g. how would we know that any particular Agent can accept the config file we want to send? How would we reserve the "name" portion of the ConfigMap? How do we know it's safe to attach our config into the config map for any agent?

This can all be semantic conventions. A convention for Agent to specify in non-identifying attributes what it can accept. Another convention for the "name" of the config.

> I still think there's either some missing pieces or clear interaction between AgentConfigMap + CustomCapabilities to make this practical to use generically.

What exactly is impractical with the semantic conventions approach?

Author

Hi Tigran,

In this proposal we do not expect the config to be rolled out all at once, for a couple of reasons:

* We would like to allow dynamic configuration updates and gradual rollout mechanisms. The server can distribute specific configurations to different agents at different times, enhancing system resilience against widespread outages and supporting features like rate limiting during high workloads.
* Configuration sizes can be substantial in large-scale systems like the ones targeted by the proposal. In our existing systems, even agents requesting minimal configurations based on their metric usage can incur significant memory consumption for configuration handling if they export a significant amount of data, and most of these agents only require a fraction of the total config in the server.

I took a look at the config and at a high level it seems this would not be possible with the current AgentConfigMap, or at least it would require a mechanism to permit these configs to be rolled out slowly based on the agent's needs, whereas the config currently needs to be resolved all in a single step; at least that is my reading of https://opentelemetry.io/docs/specs/opamp/#configuration

Also, adopting a single configuration format for metric sampling periods would impose a strict standard across all systems. This proposal, conversely, focuses solely on the agent-server interface, allowing each system to define its own configuration. This offers greater flexibility for servers to establish their own rules for metric configuration and collection.

Let me know your thoughts.

Member (@tigrannajaryan, Oct 2, 2025)

> We would like to allow dynamic configuration updates and gradual rollout mechanisms.

The Server controls the rollout of config changes and can do it gradually. There is nothing in the OpAMP spec which says when is the Server supposed to send a remote config change to the Agent. It is a Server decision and the Server can decide to do a gradual rollout using whatever strategy it chooses.

However, if the intent here is to be able to rollout an incremental change (e.g. just an update to a sampling rate - a single numeric value) then I can see how re-sending the entire config can be inefficient.

We have a long-standing issue to support incremental changes that we discussed briefly in the past but did not move forward with. It may be time to revive that.

Let me think a bit about this proposal. My strong preference is to minimize the changes to OpAMP protocol itself, especially changes that serve a relatively narrow use case.

> Also, adopting a single configuration format for metric sampling periods would impose a strict standard across all systems.

I don't see that as a downside. To me a standardization of configuration file format for Otel-compatible systems is a benefit.

Author

I see your point about the update. There are a few points that would be requirements for this proposal and might not fit with the current configuration model:

* One, as you said, is the fact that sending the config both ways is not ideal, since these can be very large. In this protocol we are proposing splitting the config into smaller pieces and only sending/resending when there are differences between them. This could be done in some other way than the one proposed, but it is almost prohibitive to send the whole config.
* The other important aspect is that we don't want to send the whole config, but only what the agent is expected to use. IIUC the effective config can be reported back, but there would be limits on how we would request information for additional metrics. One possibility is that whenever a new metric is added or another resource is being monitored, the agent extends its effective config and sends it to the server, which then provides the config, but this still means the whole config would be resent in this event.

Member (@tigrannajaryan, Oct 2, 2025)

@jsuereth I think the problem that this proposal wants to solve is a type of a configuration problem. If the generic configuration capabilities in OpAMP are not good enough to solve this particular problem I prefer to spend time fixing/extending the generic OpAMP capabilities so that it can handle this particular use case without the need to implement a specialized new flow and new set of messages. I suggest that we try to do that. This will allow future similar use cases to be handled without new effort (e.g. think trace sampling). Can we work together on this?

If we don't find a good way to solve this via a general-purpose OpAMP config capabilities I am happy to come back to this special-purpose solution and discuss it, but I think it should be our last resort second choice.

I also think a demo showing this working with an OpAMP client in an SDK would be helpful to understand more of the tradeoffs. Being able to see the difference in dev experience, user experience, performance based on a few of the approaches laid out here would make it easier to evaluate the options.

Contributor

> @jsuereth I think the problem that this proposal wants to solve is a type of a configuration problem. If the generic configuration capabilities in OpAMP are not good enough to solve this particular problem I prefer to spend time fixing/extending the generic OpAMP capabilities so that it can handle this particular use case without the need to implement a specialized new flow and new set of messages. I suggest that we try to do that. This will allow future similar use cases to be handled without new effort (e.g. think trace sampling). Can we work together on this?

Happy to!

I also think this proposal will have direct ties into Sampling configuration, both for traces and logs.

The key use case to think about here is designing an OpAMP server that can know how to control sampling or logging without needing to know the implementations of clients that connect to it, and be certain it's not causing issues or downing SDKs/Collectors by sending bad configs where they aren't understood.

OpAMP already has a lot in place for this, but is just missing a few "connect the dots" pieces:

* The ability to declare support for specific config formats that are supported by an agent.
* The ability to request partial configuration, or specific config formats.
* The interaction of the above system with the existing "full configuration" mode of OpAMP.

If you compare OpAMP to xDS (Envoy's control plane), you'll see a huge deviation between the two, with Envoy allowing only specific use cases via control plane. The idea here is OpAMP is generally more flexible but allows that style of use case within it.

Again the goal here is that an implementation of an OpAMP server could control e.g. metric-reporting, log/span sampling without having to know if it's talking to a specific SDK or Collector. It may not even need (or want) full configuration control over things it talks to, just a key set of use cases.

When looking at the design of OpAMP it seemed like custom capability was the way this would be most suited, but I agree that this doesn't answer how the interaction with general configuration would be done.

@tigrannajaryan - Ideally @menderico and I could join the OpAMP SIG to discuss, however the time is not very EU friendly. I've added it to my calendar and will see what I can do but let's continue this discussion offline.

Member (@tigrannajaryan, Oct 6, 2025)

> The key use case to think about here is designing an OpAMP server that can know how to control sampling or logging without needing to know the implementations of clients that connect to it, and be certain it's not causing issues or downing SDKs/Collectors by sending bad configs where they aren't understood.

Absolutely agree. That's the philosophy behind OpAMP design and we should stick to this.

> The ability to declare support for specific config formats that are supported by an agent.
> The ability to request partial configuration, or specific config formats.
> The interaction of the above system with the existing "full configuration" mode of OpAMP.

This is a good initial summary. Let's work on this.

We should definitely do this within OpAMP SIG, but if the time does not work, offline is fine too.


There are a few examples of control planes built on top of OpAmp,
[ControlTheory](https://www.controltheory.com/use-case/cut-observability-costs/)
and [BindPlane](https://cloud.google.com/stackdriver/bindplane) being two
systems that use OpAmp to monitor the health of collectors and manage their
configuration.

Prometheus is a system that allows per-metric sampling period configuration,
but its pull-based nature means the client library doesn't need to contact the
control plane; only the Prometheus server needs to access the configuration to
determine its pull intervals. This differs from the proposal presented above,
where the goal is to have the agents and collectors retrieve the relevant
configuration and avoid pushing data faster than the server expects, without
requiring a pull mechanism.

For traces, a similar concept has been implemented by Jaeger through its
[remote sampling configuration](https://www.jaegertracing.io/docs/1.27/architecture/sampling/),
with the sampling configuration being controlled by a central repository.

## Prototypes

While no external prototype is available, the mechanism described above is the
one used internally at Google to collect telemetry data across all jobs and make
it available in Monarch, via the instrumentation library mentioned in
[section 4.1 of the VLDB paper about Monarch](https://www.vldb.org/pvldb/vol13/p3181-adams.pdf).

## Future possibilities

The same principles could be adopted for logs and tracing, for example by having
the server provide sampling and batching periods that instruct agents how often
to sample a trace and how to batch the results. This could lead to additional
cost savings by dispatching multiple writes in a single call and/or providing
more opportunities for protocols like gRPC to compress requests with repeated
information.