Skip to content

Add support for a discard option in concurrency controls #594

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 11, 2025

Conversation

rosa
Copy link
Member

@rosa rosa commented Jul 10, 2025

This PR builds on #523 and #586, after having great discussions in those with @joelzwarrington and @mhenrixon. Thanks a lot for your work on that and for bearing with me 🙏 In the beginning, I wasn't sure how to approach this because of the way Solid Queue enqueues jobs in two steps (first, the job is created, then an execution for it is created), but after going through those two PRs again and looking into how Sidekiq's unique jobs feature behaves, I think the behaviour here is the right one.

The way this works is quite simple: a new option, on_conflict, that defaults to :block but can be set to :discard.

class DeliverAnnouncementToContactJob < ApplicationJob
  limits_concurrency to: 1, key: ->(contact) { contact.account }, duration: 5.minutes, on_conflict: :discard

  def perform(contact)
    # ...

Basically, when a job needs to be discarded because of a conflict, we delete it in the same transaction where it was created. We don't mark it as finished as #586 does because I don't want to have a provider job ID in this case. I became convinced of this after reviewing how Sidekiq's unique jobs feature behaves. I was super lucky I got the chance to speak with @mperham yesterday at RailsConf (🙏 🙇‍♀️ ), so I could make sure I got this right. Chatting with him and other super nice and helpful attendees (I'm sorry I didn't catch your GitHub handle, but if you're reading this, you know who you are so let me know!), made me realise Active Job's behaviour with regards to successfully_enqueued? is inconsistent between perform_later and perform_all_later. With perform_later, unless the adapter raises, the job is considered successfully enqueued. This doesn't match perform_all_later where raising doesn't make sense because you need to enqueue multiple jobs, so you don't want the whole thing to fail if a single job fails to be enqueued. In Sidekiq's case, when a job fails to be enqueued an exception is not raised: the return result is not a job ID (jid) but nil instead. This is reflected in the bulk enqueue implementation but not in the individual enqueue. I think this should be fixed in Active Job, giving the adapter a chance to set sucessfully_enqueued instead of having to raise an exception to prevent it from being set to true here. I think the inconsistency might simply be because perform_all_later was added much later and the exception raising behaviour couldn't be used there 🤔

Anyway, here I chose to do the same as Sidekiq, and just not enqueue the job at all (and not have a job ID) when we need to discard a job because of concurrency limits.

joelzwarrington and others added 6 commits July 7, 2025 11:27
Remove 'duplicate' verbiage and use concurrency limits instead, simplify control flow

Fix race condition vulnerability by changing logic to enqueue

Add assertions when bulk enqueuing jobs with concurrency controls

Dispatch jobs in the order they were enqueued

Set ActiveJob successfully_enqueued for both enqueued/blocked and discarded jobs

Change concurrency 'at_limit' -> 'on_conflict'

Update discard logic to trigger an ActiveRecord rollback when attempting dispatch to prevent discarded job creation

Change default on_conflict concurrency option to old behaviour (blocking execution)

Add concurrent on_conflict documentation to README

Add test for discarding grouped concurrent jobs

Fix tests which expect raising enqueue errors

Add test to confirm scheduled jobs are also discarded
Co-authored-by: Philippe Tring <[email protected]>
Brought together #523 and #586, and took some parts from each.
Clean and enhance the README as well.
Otherwise we'd get inconsistent behaviour with other adapters like
Sidekiq, where only jobs that are actually enqueued get assigned a
provider_job_id. Turns out, issuing a DELETE query for a record that got
inserted within a transaction works perfectly, even with bulk INSERT, so
let's just avoid creating those jobs altogether.
@joelzwarrington
Copy link
Contributor

Awesome! 👏

In the future, I also see an option to: on_duplicate: :terminate_existing

@mhenrixon
Copy link
Contributor

mhenrixon commented Jul 10, 2025

It looks great, @rosa! Thank you for wrapping it up for us; I am happy to have contributed some ideas.

Look forward to seeing this released. ❤️

@rosa rosa merged commit a912126 into main Jul 11, 2025
211 of 215 checks passed
@rosa rosa deleted the concurrency-on-conflict-discard branch July 11, 2025 14:54
@Teyler7
Copy link

Teyler7 commented Jul 13, 2025

This looks incredible!

It was a honor to get to talk to you and Mike at Rails Conf and discuss Active Job's successfully_enqueued? behavior. Thanks for all your hard work on Solid Queue 🧡

@@ -46,6 +50,14 @@ def release_concurrency_lock
Semaphore.signal(self)
end

def handle_concurrency_conflict
if concurrency_on_conflict.discard?
destroy
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

destroy! maybe?

Copy link

@marelons1337 marelons1337 Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parent method is not defined with a bang, so should this one call a bang operation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants