Conversation

@ceorourke
Member

When a static or percent-based detector is changed to a dynamic detector, we need to send Seer historical data for that detector so it can detect anomalies. Likewise, when a dynamic detector's snuba query's query or aggregate changes, we need to update the data Seer has so it detects anomalies on the correct data. This PR also avoids persisting the updated models if the call to Seer fails for any reason.
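The conditions described above can be sketched as follows. This is an illustrative outline only, not Sentry's actual implementation; the field names (`detection_type`, `query`, `aggregate`) and dict shapes are assumptions:

```python
# Illustrative sketch of when a Seer backfill/refresh is needed, per the PR
# description. Field names and dict shapes are assumptions, not Sentry's API.
def needs_seer_refresh(old_config, new_config, old_query, new_query):
    """Return True if Seer must receive (fresh) historical data."""
    becomes_dynamic = (
        old_config.get("detection_type") != "dynamic"
        and new_config.get("detection_type") == "dynamic"
    )
    # For an already-dynamic detector, only query/aggregate changes matter here.
    query_changed = new_config.get("detection_type") == "dynamic" and (
        new_query.get("query", old_query["query"]) != old_query["query"]
        or new_query.get("aggregate", old_query["aggregate"]) != old_query["aggregate"]
    )
    return becomes_dynamic or query_changed
```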

@github-actions bot added the Scope: Backend label (automatically applied to PRs that change backend components) Nov 7, 2025
Comment on lines -154 to -155
"id": self.data_condition_group.id,
"organizationId": self.organization.id,
Member Author

These aren't sent by the front end, so I wanted the test payload to match. For the ids especially, it doesn't make sense that we'd be sending them on creation.

@@ -553,6 +551,379 @@ def test_transaction_dataset_deprecation_multiple_data_sources(self) -> None:
):
validator.save()


class TestMetricAlertsUpdateDetectorValidator(TestMetricAlertsDetectorValidator):
def test_update_with_valid_data(self) -> None:
Member Author

We didn't have a simple update test case, so I added one.

raise DetectorException(
f"Could not create detector, data condition {dcg_id} not found or too many found."
)
# use setattr to avoid saving the models until the Seer call has successfully finished,
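The deferred-save pattern that comment refers to can be sketched like this; a minimal stand-in, assuming a Django-style `save()` model and a boolean result from the Seer call, not Sentry's actual models or Seer client:

```python
# Minimal sketch of the setattr/deferred-save pattern: stage field changes in
# memory, call Seer, and persist only on success. FakeModel and seer_call are
# stand-ins for Sentry's actual models and Seer client.
class FakeModel:
    def __init__(self, **fields):
        self.persisted = dict(fields)  # simulates the row in the database
        self.__dict__.update(fields)

    def save(self):
        self.persisted = {
            k: v for k, v in self.__dict__.items() if k != "persisted"
        }


def apply_update(model, updates, seer_call):
    for field, value in updates.items():
        setattr(model, field, value)  # in-memory only; nothing saved yet
    if not seer_call():
        return False                  # Seer failed: the DB keeps old values
    model.save()
    return True
```

If the Seer call raises or returns failure, the staged attributes are simply discarded with the in-memory objects, so the stored rows are never left pointing at data Seer doesn't have.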

@ceorourke ceorourke force-pushed the ceorourke/send-historical-data-to-seer-on-update branch from c4fdd12 to 2adcc69 Compare November 7, 2025 18:50
@codecov

codecov bot commented Nov 7, 2025

❌ 3 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --- | --- | --- | --- |
| 29534 | 3 | 29531 | 243 |
View the top 3 failed test(s) by shortest run time
tests.sentry.incidents.endpoints.validators.test_validators.TestMetricAlertsUpdateDetectorValidator::test_anomaly_detection__send_historical_data_update_fails
Stack Traces | 2.91s run time
.../endpoints/validators/test_validators.py:845: in test_anomaly_detection__send_historical_data_update_fails
    assert len(conditions) == 1
E   AssertionError: assert 2 == 1
E    +  where 2 = len([<DataCondition at 0x7fcf6cb3ac30: id=59, type='gt', comparison=100, condition_result=75, condition_group_id=35>, <DataCondition at 0x7fcf6cb3acc0: id=60, type='lte', comparison=100, condition_result=0, condition_group_id=35>])
tests.sentry.incidents.endpoints.validators.test_validators.TestMetricAlertsUpdateDetectorValidator::test_update_anomaly_detection_from_static
Stack Traces | 2.95s run time
.../seer/anomaly_detection/store_data_workflow_engine.py:66: in _fetch_related_models
    data_condition = DataCondition.objects.get(
.venv/lib/python3.13.../db/models/manager.py:87: in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
.venv/lib/python3.13.../db/models/query.py:636: in get
    raise self.model.MultipleObjectsReturned(
E   sentry.workflow_engine.models.data_condition.DataCondition.MultipleObjectsReturned: get() returned more than one DataCondition -- it returned 2!

During handling of the above exception, another exception occurred:
.../sentry/incidents/metric_issue_detector.py:277: in update
    update_detector_data(instance, validated_data)
.../seer/anomaly_detection/store_data_workflow_engine.py:86: in update_detector_data
    data_source, data_condition, snuba_query = _fetch_related_models(detector, "update")
.../seer/anomaly_detection/store_data_workflow_engine.py:76: in _fetch_related_models
    raise DetectorException(
E   sentry.workflow_engine.types.DetectorException: Could not update detector, data condition 43 not found or too many found.

During handling of the above exception, another exception occurred:
.../endpoints/validators/test_validators.py:666: in test_update_anomaly_detection_from_static
    dynamic_detector = update_validator.save()
.venv/lib/python3.13.../site-packages/rest_framework/serializers.py:205: in save
    self.instance = self.update(self.instance, validated_data)
.../sentry/incidents/metric_issue_detector.py:280: in update
    raise serializers.ValidationError(
E   rest_framework.exceptions.ValidationError: [ErrorDetail(string='Failed to send data to Seer, cannot update detector', code='invalid')]
tests.sentry.incidents.endpoints.validators.test_validators.TestMetricAlertsUpdateDetectorValidator::test_update_with_valid_data
Stack Traces | 3.15s run time
.../endpoints/validators/test_validators.py:635: in test_update_with_valid_data
    assert update_validator.is_valid(), update_validator.errors
E   AssertionError: {'conditionGroup': {'conditions': [ErrorDetail(string='Resolution condition required for metric issue detector.', code='invalid')]}}
E   assert False
E    +  where False = <bound method BaseSerializer.is_valid of MetricIssueDetectorValidator(context={'organization': <Organization at 0x7fed...))\n        group_by = ListField(allow_empty=False, child=CharField(allow_blank=False, max_length=200), required=False)>()
E    +    where <bound method BaseSerializer.is_valid of MetricIssueDetectorValidator(context={'organization': <Organization at 0x7fed...))\n        group_by = ListField(allow_empty=False, child=CharField(allow_blank=False, max_length=200), required=False)> = MetricIssueDetectorValidator(context={'organization': <Organization at 0x7fedfe0c62b0: id=4557073293967360, owner_id=N...())\n        group_by = ListField(allow_empty=False, child=CharField(allow_blank=False, max_length=200), required=False).is_valid


@ceorourke ceorourke marked this pull request as ready for review November 7, 2025 21:18
@ceorourke ceorourke requested review from a team as code owners November 7, 2025 21:18
resolution=timedelta(seconds=data_source.get("resolution", snuba_query.resolution)),
environment=data_source.get("environment", snuba_query.environment),
event_types=data_source.get("event_types", [event_type for event_type in event_types]),
)

Bug: update_detector_data receives a single data source dict instead of a {"data_sources": [...]} structure, preventing snuba_query updates.
Severity: CRITICAL | Confidence: 0.95

🔍 Detailed Analysis

When update_detector_data is invoked at src/sentry/incidents/metric_issue_detector.py:249, it receives validated_data_source, which is a single dictionary. However, the update_detector_data function expects a dictionary containing a "data_sources" key with a list of data sources. This mismatch causes the internal logic to skip updating the snuba_query object's fields. Consequently, when a dynamic detector's snuba query is updated, the old query, aggregate, and event types are sent to Seer instead of the new values, leading to anomaly detection operating on incorrect metrics.

💡 Suggested Fix

Modify the call to update_detector_data at src/sentry/incidents/metric_issue_detector.py:249 to pass {"data_sources": [validated_data_source]} instead of validated_data_source directly, aligning with the expected input structure.
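To make the shape mismatch concrete, here is a hypothetical reading of the updater's access pattern. The key name `data_sources` comes from the report above; everything else is illustrative:

```python
# Hypothetical illustration of the shape mismatch described above: an updater
# that reads validated_data["data_sources"] silently does nothing when handed
# a bare data-source dict, so the snuba_query fields are never touched.
def extract_data_sources(validated_data):
    return validated_data.get("data_sources") or []


bare = {"query": "level:error", "aggregate": "count()"}

assert extract_data_sources(bare) == []                          # fields skipped
assert extract_data_sources({"data_sources": [bare]}) == [bare]  # wrapped: works
```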


@mifu67 mifu67 self-requested a review November 11, 2025 19:00
@mifu67 mifu67 (Contributor) left a comment

The update logic itself looks good; just a question about when we should be updating.


# Handle a dynamic detector's snuba query changing
if instance.config.get("detection_type") == AlertRuleDetectionType.DYNAMIC:
if snuba_query.query != data_source.get(
Contributor

Do we need to check other snuba query fields as well (like timeWindow)? Should we just resend the data every time we update a dynamic detector?

Member Author

Good question, I'll look into this in a follow-up. I don't think resending on every change is necessary (e.g. the change could just be renaming the detector), but we might be missing some cases where we should be updating.
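One possible shape for that follow-up: diff only a fixed allow-list of Seer-relevant fields, so a rename alone doesn't trigger a resend. The field list here is a guess for illustration, not Sentry's actual set:

```python
# Sketch of an allow-list comparison for deciding whether to resend data to
# Seer. SEER_RELEVANT_FIELDS is an assumption about which snuba query fields
# affect the data Seer detects anomalies on.
SEER_RELEVANT_FIELDS = ("query", "aggregate", "time_window", "event_types")


def seer_data_changed(old, new):
    """True if any Seer-relevant field differs between old and new data."""
    return any(
        new.get(field, old.get(field)) != old.get(field)
        for field in SEER_RELEVANT_FIELDS
    )
```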
