Concurrency control for ActiveJob with Rails 6

Viral Patel
Nov 2, 2021 · 4 min read
Concurrency control and distributed systems programming with Rails

If you are familiar with the Rails ecosystem and have experience working with the framework, you already know about ActiveJob. For those who are new, ActiveJob is a framework, part of core Rails, that allows developers to create, enqueue and execute background jobs. Many types (or should I say almost all) of applications require background tasks that run independently of the user interface. This helps applications be more responsive, since background tasks do not block user-interactive calls, and it improves the availability and performance of your overall system.

The objective of this blog post is to introduce distributed computing principles and a style guide for Rails developers on writing concurrency-safe background tasks using ActiveJob.

Why is concurrency control needed?

Background jobs do not run sequentially. In a typical Rails system, multiple instances of background workers run independently to process queued jobs, which means all the jobs compete for access to resources and services such as databases and storage. This concurrent access can result in resource contention, which might cause conflicts in the availability of services and in the integrity of data in storage. There are multiple ways to resolve resource contention. We could leverage locking mechanisms at the database level to remove the side effects of concurrent operations, but we are not discussing those in this blog post; here we focus on designing concurrency control around the execution of ActiveJob.

Considerations for designing rate-controlled ActiveJobs:

  • Introducing a limit on the number of similar background jobs that are allowed to run concurrently, to reduce resource usage and exhaustion
  • Throttling the number of background jobs that run in a given period of time, which again helps reduce resource usage and exhaustion.

ActiveJob does not provide throttling or concurrency control out of the box.

Introducing Activejob Traffic Control

Under the hood, this gem provides a performant distributed-lock solution built on a distributed cache store such as Memcached or Redis. The biggest benefit of using it is that it saves you the trouble of inventing rate-control logic in your own code, or of adding extra columns to your data models or additional tables, to achieve similar functionality for your background jobs.
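To make the distributed-lock idea concrete, here is a toy sketch in plain Ruby. A thread-safe Hash stands in for Redis/Memcached, and set_nx mimics Redis's SET ... NX, the atomic "set if absent" primitive that such locks are typically built on. All names here are illustrative, not the gem's actual API:

```ruby
require "monitor"

# A toy "cache" standing in for Redis/Memcached. The key property is that
# set_nx is atomic: only one caller can ever claim an absent key.
class FakeCache
  def initialize
    @store = {}
    @lock = Monitor.new
  end

  # Mimics Redis `SET key value NX`: succeeds only if the key is absent.
  def set_nx(key, value)
    @lock.synchronize do
      return false if @store.key?(key)
      @store[key] = value
      true
    end
  end

  def delete(key)
    @lock.synchronize { @store.delete(key) }
  end
end

cache = FakeCache.new

# The first worker acquires the job's lock; a second concurrent worker cannot.
first  = cache.set_nx("lock:ConcurrencyTestJob", "worker-1")
second = cache.set_nx("lock:ConcurrencyTestJob", "worker-2")
cache.delete("lock:ConcurrencyTestJob") # releasing the lock frees the slot
third  = cache.set_nx("lock:ConcurrencyTestJob", "worker-2")

puts first  # => true
puts second # => false
puts third  # => true
```

Because the cache is shared by every worker process, this "claim a key, release it when done" pattern works across machines, which is exactly what per-process mutexes cannot give you.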

Activejob Traffic Control provides a nice abstraction and a clean interface for quickly adding rate control to the existing jobs in your application. Of the approaches this library provides, the one I have used most often is the concurrency limit for a given background job; you can achieve it by adding the concurrency macro just after defining the job class, as shown below:

class ConcurrencyTestJob < ActiveJob::Base
  concurrency 5, drop: false

  def perform
    # only five `ConcurrencyTestJob` jobs will ever run simultaneously
  end
end

It’s that easy. As the example shows, at most five instances of ConcurrencyTestJob will run at once. This can be super helpful for background jobs that process data in large batches or interact heavily with shared resources such as databases, data stores, Elasticsearch clusters or cloud APIs. If you are bulk inserting or updating records, this lets you do it at a controlled rate so it does not become a bottleneck for other parts of the application that access the same shared resources.
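Conceptually, concurrency 5 behaves like a five-slot semaphore shared by all workers. Here is a minimal, pure-Ruby sketch of that idea using a SizedQueue as the semaphore. The real gem enforces the limit with distributed locks across processes, not threads; this is only an illustration of the mechanism:

```ruby
MAX_CONCURRENCY = 5
semaphore = SizedQueue.new(MAX_CONCURRENCY) # holds at most five "slot" tokens

running = 0
peak = 0
state = Mutex.new

# Simulate twenty queued jobs competing for five slots.
threads = 20.times.map do
  Thread.new do
    semaphore.push(:slot) # blocks while five jobs already hold a slot
    begin
      state.synchronize do
        running += 1
        peak = [peak, running].max
      end
      sleep 0.01 # stand-in for the job's actual work
      state.synchronize { running -= 1 }
    ensure
      semaphore.pop # release the slot for the next waiting job
    end
  end
end
threads.each(&:join)

puts peak # never exceeds 5
```

However many jobs are enqueued, the observed peak concurrency stays at or below the limit, which is the guarantee the macro gives you for shared-resource access.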

Another solution, which I use less frequently, is throttling the number of jobs that run in a given period of time. Here is an example from the README of the repo:

class CanThrottleJob < ActiveJob::Base
  throttle threshold: 2, period: 1.second

  def perform
    # no more than two `CanThrottleJob` jobs will run every second
    # if more than that attempt to run, they will be re-enqueued to run at
    # a random time ranging from 1-5x the period (so, 1-5 seconds in this case)
  end
end

We have seen this pattern in UI search inputs, where we throttle the number of requests made to the backend by sending a single request every n seconds rather than one per keystroke, which would be very inefficient and read-intensive. Similarly, ActiveJob Traffic Control gives us the ability to throttle background jobs. This can be super helpful if you have an API endpoint that triggers a background job for every single incoming request. Suppose a background task calls a third-party API with strict rate limits: we can throttle the job to run n times in a given period so we never hit rate-limiting errors; jobs beyond the limit are simply re-enqueued to run at a random time in the near future.
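Under the hood, a throttle like this amounts to counting runs per time window. Here is a minimal pure-Ruby sketch of a fixed-window counter; the Throttle class and its names are hypothetical, for illustration only (the gem keeps this count in a shared cache so the window is global across all workers):

```ruby
# A hypothetical fixed-window throttle: allow `threshold` runs per `period`
# seconds; callers denied a slot would re-enqueue themselves for later.
class Throttle
  def initialize(threshold:, period:)
    @threshold = threshold
    @period = period
    @window_start = nil
    @count = 0
  end

  # Returns true if the job may run now, false if it should be re-enqueued.
  def allow?(now = Time.now)
    if @window_start.nil? || now - @window_start >= @period
      @window_start = now # a new window begins; reset the counter
      @count = 0
    end
    return false if @count >= @threshold
    @count += 1
    true
  end
end

throttle = Throttle.new(threshold: 2, period: 1)
t0 = Time.now
results = 5.times.map { throttle.allow?(t0) } # five attempts in one instant
later   = throttle.allow?(t0 + 2)             # an attempt in the next window

puts results.inspect # => [true, true, false, false, false]
puts later           # => true
```

The denied attempts are not lost: in the gem's version of this idea they are re-enqueued to retry once a fresh window opens, just as the example job's comments describe.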

Read more about what this gem can offer you in terms of rate limiting for your background jobs: https://github.com/nickelser/activejob-traffic_control

Trade-offs

There are no perfect solutions. Every design choice has trade-offs, and using concurrency control in your ActiveJobs is no exception.

  • Reducing the number of concurrent tasks gives up some of the performance benefits of concurrency, but since the settings can be configured on a per-job basis, the impact is well contained.
  • You need to introduce a Redis or Memcached store into your application stack if you are not already using one.
  • Some programmers may see this as a silver bullet, but it is not a tool for data conflict resolution. Database-level constraints are a must for data integrity and consistency. This is just a nice addition to your concurrency toolbox, reducing resource starvation and exhaustion when resources are accessed by many concurrent systems.
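To make that last point concrete: even with job-level concurrency control, integrity rules belong in the database. A hedged, hypothetical example (the table and column names are invented for illustration) of enforcing uniqueness with a constraint, rather than relying on only one job running at a time:

```ruby
# Hypothetical migration: a unique index guards against duplicate rows even
# if two jobs race past the application-level rate control and both insert.
class AddUniqueIndexToImports < ActiveRecord::Migration[6.1]
  def change
    add_index :imports, :external_id, unique: true
  end
end
```

With the constraint in place, the losing job raises a uniqueness error it can rescue and handle, instead of silently corrupting data.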

Please leave comments if you find this blog useful. It will help me decide if I should write more blogs on concurrent and parallel programming techniques with Ruby on Rails.
