@Mihir-Mavalankar Mihir-Mavalankar commented Nov 6, 2025

PR Details

@Mihir-Mavalankar Mihir-Mavalankar self-assigned this Nov 6, 2025
@Mihir-Mavalankar Mihir-Mavalankar requested a review from a team as a code owner November 6, 2025 23:43
@github-actions github-actions bot added the Scope: Backend label (automatically applied to PRs that change backend components) Nov 6, 2025
@Mihir-Mavalankar Mihir-Mavalankar changed the title from "feat(autofix): Atomicity check for starting automation runs" to "feat(autofix): Duplicate check for automation runs" Nov 7, 2025
Member

@JoshFerge JoshFerge left a comment

could we add a test?

Contributor

@kddubey kddubey left a comment

the ttl should prolly be minutes, not 1 hour. the automation task has a 35 sec. limit w/ 1 retry. if something flakily fails in the automation task (e.g., gemini 429s) and we don't actually trigger autofix, we'll be delaying the next automation run for 1 hour. during that time it's possible someone would've wanted an RCA on their issue

if the ttl is minutes, then as long as a new event comes in after that, autofix has another chance to get triggered

or, if the automation task fails, delete the key. that feels a little riskier tho
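For illustration, the trade-off described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in this PR: `TTLStore`, `maybe_start_automation`, the key format, and the `release_on_failure` flag are all hypothetical names, and the in-memory store only stands in for whatever shared cache the real check uses (an add-if-absent call on a shared cache would be needed for the check to hold across workers).

```python
import time
from typing import Callable, Dict

# Hypothetical in-memory stand-in for a shared cache with add-if-absent semantics.
# A real deployment would need an atomic add on a shared cache; this class is
# only for illustrating the TTL behaviour discussed in the review.
class TTLStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expiry: Dict[str, float] = {}

    def add(self, key: str, ttl_seconds: float) -> bool:
        """Set the key only if it is absent or expired. Return True if it was set."""
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False
        self._expiry[key] = now + ttl_seconds
        return True

    def delete(self, key: str) -> None:
        self._expiry.pop(key, None)


DEDUP_TTL_SECONDS = 10 * 60  # the 10 minute value settled on later in this thread


def maybe_start_automation(
    store: TTLStore,
    group_id: int,
    run_task: Callable[[int], object],
    *,
    ttl_seconds: float = DEDUP_TTL_SECONDS,
    release_on_failure: bool = False,
) -> str:
    """Run the task at most once per group within the TTL window."""
    key = f"autofix-automation:{group_id}"  # hypothetical key format
    if not store.add(key, ttl_seconds):
        return "skipped"  # another run claimed this group recently
    try:
        run_task(group_id)
    except Exception:
        if release_on_failure:
            # The alternative raised in the review: free the key so the next
            # event can retry immediately instead of waiting out the TTL.
            store.delete(key)
        return "failed"
    return "started"
```

With `release_on_failure=False`, a flaky failure blocks further runs for that group until the TTL expires, which is why a TTL of minutes rather than an hour limits the delay. With `release_on_failure=True`, the key is freed right away, at the cost of allowing repeated attempts if the task keeps failing.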

@Mihir-Mavalankar Mihir-Mavalankar requested review from a team November 7, 2025 17:03
@Mihir-Mavalankar
Contributor Author

Okay will change to 15 minutes

Contributor

@kddubey kddubey left a comment

nit: could prolly be 5 minutes

@Mihir-Mavalankar
Contributor Author

Hmm 5 seems a bit too close. Will take it down to 10.
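As a rough way to compare the TTL candidates raised in this thread, the sketch below relates each one to the worst-case task window implied by the review comment (a 35 second limit with 1 retry). The function names are hypothetical, and the delay between retries is not stated in the thread, so it is assumed to be zero here.

```python
TASK_TIME_LIMIT_SECONDS = 35  # from the review comment
MAX_RETRIES = 1               # from the review comment


def worst_case_task_window(
    time_limit: float = TASK_TIME_LIMIT_SECONDS,
    retries: int = MAX_RETRIES,
    retry_delay: float = 0.0,  # assumption: not stated in the thread
) -> float:
    """Longest time the task could occupy, counting every attempt."""
    attempts = 1 + retries
    return attempts * time_limit + retries * retry_delay


def headroom(ttl_minutes: float, window_seconds: float) -> float:
    """How many worst-case task windows fit inside the TTL."""
    return ttl_minutes * 60 / window_seconds


window = worst_case_task_window()
for ttl_minutes in (5, 10, 15, 60):
    ratio = headroom(ttl_minutes, window)
    print(f"ttl={ttl_minutes:>2} min -> {ratio:.1f}x the worst-case task window")
```

Under these assumptions the worst-case window is 70 seconds, so every candidate TTL outlasts the task itself. The choice between 5 and 10 minutes is then about how long a failed run should block the next attempt, not about the task overrunning the key.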
