
Conversation

@roidelapluie
Member

No description provided.

Signed-off-by: Julien Pivotto <[email protected]>
@juliusv
Member

juliusv commented Nov 3, 2025

Wow, I really like this and think it's well thought out and clearly put, thanks @roidelapluie!

The only obvious question is of course whether all those scrape-time rules should really live in the main config file, which is inconsistent with normal rules living in separate files. So I wonder if it should be scrape_rule_files instead of scrape_rules (or optionally in addition). But maybe we don't expect people to use many scrape-time rules at all, so putting them into the main config file is more doable?
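
For concreteness, a hypothetical sketch of the two shapes under discussion (field names and placement are assumptions, not taken from the proposal):

```yaml
# Option A: rules inline in the main config, per scrape config (names assumed):
scrape_configs:
  - job_name: node
    scrape_rules:
      - record: node_cpu_seconds:sum
        expr: sum without (cpu) (node_cpu_seconds_total)

# Option B: rules in separate files, mirroring how rule_files works (name assumed):
# scrape_rule_files:
#   - "scrape_rules/*.yml"
```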

* **Related Issues and PRs:**
* [Original feature request](https://github.com/prometheus/prometheus/issues/394)

> This proposal introduces the ability to evaluate PromQL expressions at scrape time against raw metrics from a single scrape, before any relabeling occurs. This enables the creation of derived metrics that combine values from the same scrape without the time skew issues inherent in recording rules or query-time calculations. Additionally, by evaluating before relabeling, this enables powerful cardinality reduction strategies where aggregated metrics can be computed and stored while dropping the original high-cardinality metrics.
Member

@bwplotka Nov 4, 2025


Suggested change (original, then suggested):
> This proposal introduces the ability to evaluate PromQL expressions at scrape time against raw metrics from a single scrape, before any relabeling occurs. This enables the creation of derived metrics that combine values from the same scrape without the time skew issues inherent in recording rules or query-time calculations. Additionally, by evaluating before relabeling, this enables powerful cardinality reduction strategies where aggregated metrics can be computed and stored while dropping the original high-cardinality metrics.
> This proposal introduces the ability to evaluate PromQL expressions at scrape time against raw metrics from a single scrape, before metric relabeling occurs. This enables the creation of derived metrics that combine values from the same scrape without the time skew issues inherent in recording rules or query-time calculations. Additionally, by evaluating before metric relabeling, this enables powerful cardinality reduction strategies where aggregated metrics can be computed and stored while dropping the original high-cardinality metrics.

Adding "metric" for relabeling mention as scrape rules (and scrapes in general) will be done AFTER (SD) relabeling

Member Author

@roidelapluie Nov 4, 2025


That is not really correct, because I want scrape rules BEFORE SD relabeling (which means that the metrics used by the rules DO NOT have the target labels). Just bare metrics from the scrape. They should look just like scraped metrics.
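
To illustrate the intended ordering (the stage breakdown below is an interpretation of this comment, not text from the proposal):

```yaml
# Assumed pipeline for a single scrape:
#   1. scrape the target        -> bare samples, e.g. node_cpu_seconds_total{cpu="0"}
#   2. scrape rules evaluate    -> they see only those bare metrics, no instance/job
#   3. target labels attached   -> instance, job, and labels from (SD) relabeling
#   4. metric_relabel_configs   -> normal metric relabeling
#   5. append to TSDB
```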

Member

@bwplotka left a comment


Amazing! It would be a great addition 👍🏽

cc @fpetkovski who started an initial experiment.

As discussed before, the main obvious challenge is the relatively limited potential of this feature versus what people actually want, e.g.:

  • Quick 5m rollups (rate/irate) and drop the rest.
  • Instance label drop + aggregation across targets (or even scrapers!) and drop the rest.
  • Drop metrics conditionally using existing TSDB data.

This means that if we accept this we might need to manage user expectations; but moreover, we will need to solve this problem eventually anyway, likely with a 3rd solution (e.g. a cross-process aggregation proxy like vmagent does, or a stateful 5m aggregation layer in the Prometheus process). That means users will likely have 3 types of rules to choose from (unless we reuse one of the existing config surfaces). A sketch of this boundary follows below.
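
A hypothetical sketch (the scrape_rules syntax is assumed): scrape rules can only combine samples from a single scrape, so anything needing history or other targets falls outside.

```yaml
# Within scope: collapse a label using only the current scrape.
scrape_rules:
  - record: node_cpu_seconds:sum
    expr: sum without (cpu) (node_cpu_seconds_total)

# Out of scope: rate() needs history, and per-job sums need samples from
# other targets, so a rule like the following would still require recording
# rules or a separate aggregation layer:
#   - record: job:http_requests:rate5m
#     expr: sum by (job) (rate(http_requests_total[5m]))
```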

I think that's acceptable and it's good to iterate here, just want to make the consequences clear (:

> The only obvious question is of course whether all those scrape-time rules should really live in the main config file, which is inconsistent with normal rules living in separate files. So I wonder if it should be scrape_rule_files instead of scrape_rules (or optionally in addition). But maybe we don't expect people to use many scrape-time rules at all, so putting them into the main config file is more doable?

Good question. Given that users are expected to DROP certain metrics based on scrape rules, it would be more difficult to see what's happening if the scrape rules lived in a remote place. Plus, as mentioned before, scrape rules have relatively limited usability, so maybe there won't be that many? Previous small discussion.
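
For example (a hypothetical config; the proposal's exact drop mechanism is not shown here), keeping the rule and the drop side by side makes the flow easy to audit:

```yaml
scrape_configs:
  - job_name: node
    # Compute the aggregate first, at scrape time (scrape_rules syntax assumed):
    scrape_rules:
      - record: node_cpu_seconds:sum
        expr: sum without (cpu) (node_cpu_seconds_total)
    # ...then drop the high-cardinality originals right next to the rule,
    # here via ordinary metric relabeling:
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: node_cpu_seconds_total
        action: drop
```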

@bwplotka changed the title from "Add Scrape Time Rule Evaluation" to "PROM-67: Scrape Time Rule Evaluation" on Nov 4, 2025
@juliusv
Member

juliusv commented Nov 4, 2025

> As discussed before, the main obvious challenge is the relatively limited potential of this feature versus what people actually want […]

Good point. The use case for aggregating across instances is probably going to be way more important than the one for aggregating within an instance.

FWIW, I have no strong opinion on whether the kind of scrape-time rule evaluation as laid out in this proposal here should be added to Prometheus or not, I can find arguments either way. I just think the proposal is well done :)

> The only obvious question is of course whether all those scrape-time rules should really live in the main config file, which is inconsistent with normal rules living in separate files. So I wonder if it should be scrape_rule_files instead of scrape_rules (or optionally in addition). But maybe we don't expect people to use many scrape-time rules at all, so putting them into the main config file is more doable?

> Good question. Given that users are expected to DROP certain metrics based on scrape rules, it would be more difficult to see what's happening if the scrape rules lived in a remote place. Plus, as mentioned before, scrape rules have relatively limited usability, so maybe there won't be that many? Previous small discussion.

Agreed with both, even though it's a bit odd :)

@fpetkovski

Thanks for writing this out @roidelapluie. Applying aggregations before SD relabeling makes perfect sense and I think it will solve the caching issue described in my POC.

@roidelapluie
Member Author

> Thanks for writing this out @roidelapluie. Applying aggregations before SD relabeling makes perfect sense and I think it will solve the caching issue described in my POC.

Ideally we would only add the "relevant" metrics to the in-memory cache, by pre-parsing rules and extracting matchers as described in this document, but that could be added in a second PR.
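
A sketch of that idea (rule syntax and caching behaviour are both assumptions):

```yaml
# Given this rule...
scrape_rules:
  - record: node_cpu_seconds:sum
    expr: sum without (cpu) (node_cpu_seconds_total)

# ...pre-parsing the expression yields the selector
#   {__name__="node_cpu_seconds_total"}
# so the scrape-time cache only needs to admit matching samples
# instead of buffering the whole scrape.
```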
