Running Ruby Blocks in the Background
Update
There is a new version available. More details are available here.
For the impatient
Introduction
Every Rails developer knows this: your application is fast and responsive until the data set handled in one request gets so large that request times become unacceptable. The obvious solution is to identify code that does not have to run immediately and delegate it to another process. There are several ready-made solutions for delegating execution of code to a background process, like ActiveMessaging using ActiveMQ, BackgrounDRb using a DRb server, or database-driven job queues.
Each of the above-mentioned approaches has the following problems:
- Responsibilities get cut out of objects (most of the time, the code running in the background process actually belongs to an object residing in the foreground process). This can be solved by delegating the time-consuming task to the background process, which clones the object and delegates the task back to it, only in another process.
- Not DRY. Background processes tend to repeat code, like the above-mentioned back-delegation. This can be reduced to a certain extent, but there will always be some overhead.
- Not failsafe. Almost always, the task at hand is delegated over a socket of some sort. If the other process is busy, hangs, or is down, there can be timeouts that result in ugly exceptions and, in the end, discard the task. To make background processing failsafe, you need to write a lot more code, which is repetitive as well.
Our solution addresses all of the above problems, and it is elegant as well. Running a task in the background is as easy as:
background do
  # run your code
end
The communication with the background process is configurable, as is the error handling. For example, you could use ActiveMQ for queueing your background tasks. If the connection to ActiveMQ fails for some reason, the task could be executed in-process. If this fails (e.g. because of an error or timeout), the task would be dumped to disk for a later replay when the problem is fixed.
How does it work?
Actually, the solution is pretty easy. The block’s code and local variables needed by the block are serialized and sent to the handler, which then evaluates the block in the context of the local variables. The attentive reader might notice that it is impossible to serialize code blocks, let alone know all the block’s local variables in advance. Well, that is almost true.
There is a genius piece of code, called proc_source, that allows you to serialize code blocks. It works by parsing the source file that contains the block. This is possible, because code blocks know where they are in the source code.
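In current Ruby versions, the location a block remembers can be inspected directly through Proc#source_location. The following is a minimal sketch of that idea only. It is not how proc_source is implemented, and the helper name block_location is made up for illustration.

```ruby
# Hypothetical helper: reports where a block was defined.
def block_location(&block)
  file, line = block.source_location
  { :file => file, :line => line }
end

location = block_location { puts "time-consuming work" }
puts "block defined in #{location[:file]} at line #{location[:line]}"
```

A tool that knows the file and line can open that file and parse the block's text from there, which is the approach the paragraph above describes.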
It turns out that the local variables can’t be accessed with plain Ruby. But that is actually a good thing, because we might want to control which objects are sent over the wire, and only choose the ones that are actually needed, in order to save computation time and bandwidth.
So, to attach local variables to the code block, you’d use the following method call:
background :locals => { :user => current_user } do
  user.do_some_time_consuming_operation
end
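The mechanism described in this section can be sketched in a few lines. This is an illustration under assumptions, not the plugin's code: the block is represented by its source text (which the plugin would obtain through proc_source), Marshal stands in for whatever serialization the handler uses, and the names TaskContext, pack_task and run_task are hypothetical.

```ruby
# Exposes each entry of the locals hash as a method, so the evaluated code
# can refer to `user` or `numbers` as if they were local variables.
class TaskContext
  def initialize(locals)
    locals.each do |name, value|
      define_singleton_method(name) { value }
    end
  end
end

# Sender side: turn the code and its locals into a byte string.
def pack_task(source, locals)
  Marshal.dump({ :source => source, :locals => locals })
end

# Receiver side: restore the payload and evaluate the code against the locals.
# Only load payloads that come from a trusted sender.
def run_task(payload)
  task = Marshal.load(payload)
  TaskContext.new(task[:locals]).instance_eval(task[:source])
end

payload = pack_task("numbers.inject(0) { |sum, n| sum + n }", { :numbers => [1, 2, 3] })
puts run_task(payload) # prints 6
```

Because only the values listed in the locals hash travel with the code, the sender decides exactly which objects cross the wire.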
Failsafe background processing
To make sure that your task is executed even when your background process of choice is not available, you can specify a handler, and a couple of fallback handlers. The handler and fallback handlers are tried in order, until the first one succeeds.
background :locals => { :user => current_user },
           :handler => :active_messaging,
           :fallback => [:in_process, :disk, :forget] do
  user.do_some_time_consuming_operation
end
In this example, the :active_messaging handler is tried first. If it fails, the code is executed in-process. If it still fails, the code is dumped into a file, and if even this fails, the task is discarded.
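The "try each handler in order until one succeeds" behavior could look roughly like the sketch below. The handler table and the dispatch method are hypothetical stand-ins for illustration; the plugin's real handlers talk to ActiveMQ, the file system and so on.

```ruby
# Hypothetical handlers: each one either processes the task or raises.
def example_handlers
  {
    :active_messaging => lambda { |task| raise IOError, "broker unreachable" },
    :in_process       => lambda { |task| task.call },
    :forget           => lambda { |task| nil }
  }
end

# Tries each named handler in order and returns the name of the first one
# that finishes without raising, or nil if every handler failed.
def dispatch(task, names, handlers = example_handlers)
  names.each do |name|
    begin
      handlers.fetch(name).call(task)
      return name
    rescue StandardError => error
      warn "#{name} failed: #{error.message}"
    end
  end
  nil
end

task = lambda { puts "doing the work" }
used = dispatch(task, [:active_messaging, :in_process, :forget])
puts "handled by #{used}"
```

Putting a handler that cannot fail, such as :forget, at the end of the chain guarantees that the caller never sees an exception from background processing.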
The self object
One of the amazing things is that you can use the self keyword inside blocks. This works because the object in which the code is executed is serialized as well.
class User
  def some_operation
    variable = some_evaluation
    background :locals => { :variable => variable } do
      self.do_something_with(variable)
    end
  end
end
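One consequence of serializing self is worth spelling out: the block runs against a copy of the object, not the object itself. The sketch below illustrates this with Marshal, which is an assumption about the serialization format; Counter and run_on_copy are hypothetical names.

```ruby
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  def increment_by(amount)
    @count += amount
  end
end

# Hypothetical stand-in for the background process: it receives a serialized
# copy of the object and evaluates the block's source with that copy as self.
def run_on_copy(object, source)
  copy = Marshal.load(Marshal.dump(object))
  copy.instance_eval(source)
  copy
end

original = Counter.new
copy = run_on_copy(original, "self.increment_by(5)")
puts original.count # prints 0
puts copy.count     # prints 5
```

In-memory changes made in the background are therefore invisible to the foreground object. Results that matter should be persisted, for example by saving the record to the database inside the block.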
Error reporting
As a developer, you might want to be informed when something goes wrong, in order to fix it. But since every project uses a different error reporting system, the error reporting is configurable. Also, some code is rather important while other code is optional, so you might not want to be informed about every error. To specify the error reporting, use the optional :reporter parameter:
background :reporter => :exception_notification do
  # background code
end
Note that errors are only reported if an exception occurs while talking to the background process. If you want to be informed when an error occurs while the block is executed in the other process, you need to implement your own reporting for the background process.
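To make the scope of the reporter concrete, here is a minimal sketch in which the reporter is invoked only when handing the task over fails. The method hand_over and the lambdas are hypothetical; they are not the plugin's API.

```ruby
# The reporter only sees failures that happen while the task is being handed
# over, not failures inside the background process itself.
def hand_over(task, handler, reporter)
  handler.call(task)
  true
rescue StandardError => error
  reporter.call(error)
  false
end

reported = []
reporter = lambda { |error| reported << error.message }
broken_handler = lambda { |task| raise IOError, "connection refused" }

hand_over(lambda { puts "work" }, broken_handler, reporter)
puts reported.inspect # prints ["connection refused"]
```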
Decorating existing methods
Most of the time, you want a whole method to be executed in another process. To make this pattern DRY, the background method can be used as a method decorator, when it is called in class-level scope:
class User
  def do_something_complicated(parameter, argument)
    # complicated things
  end

  # execute all calls to do_something_complicated in the background
  background :do_something_complicated, :params => ['parameter', 'argument']
end
Note that you have to specify all of the method’s parameter names as they are written in the original method’s definition. This is not 100% DRY, but necessary to correctly send all the parameters to the background process. If you have a suggestion on how to avoid this, please let us know.
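On current Ruby versions, UnboundMethod#parameters reports a method's parameter names, which suggests one way to avoid listing them by hand. The decorator below is a hypothetical sketch of that idea, not the plugin's implementation: BackgroundSketch and its sent log are made-up names, the call is run in-process instead of being shipped to a handler, and only required positional parameters are handled.

```ruby
module BackgroundSketch
  # Records what would be sent to the background process.
  def self.sent
    @sent ||= []
  end

  # Class-level macro: wraps an existing instance method.
  def background(method_name)
    original = instance_method(method_name)
    # parameters returns pairs such as [:req, :title]; keep only the names.
    names = original.parameters.map { |_kind, name| name }

    define_method(method_name) do |*args|
      locals = Hash[names.zip(args)]
      BackgroundSketch.sent << [method_name, locals]
      # A real handler would ship the call elsewhere; this sketch runs it in-process.
      original.bind(self).call(*args)
    end
  end
end

class Report
  extend BackgroundSketch

  def build(title, pages)
    "#{title}: #{pages} pages"
  end

  background :build
end

puts Report.new.build("Q1", 3)
puts BackgroundSketch.sent.inspect
```

The recorded entry pairs each parameter name with the argument passed at call time, which is the same information the :params option supplies manually.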
Default configuration
You can configure the default background handler, a default fallback chain, and a default error reporter. The configuration lives in the Background::Config class.
Security Issues
As with any background processing, you need to be careful about the requests that are processed. Since the background plugin executes arbitrary Ruby code, you need to take special care that no unfiltered user input is injected. Make sure that your firewall does not allow connections from the outside, and that the code that connects from the inside is controlled by you. We don’t take any responsibility for any damage caused by the operation of the background plugin.
Limitations
Since objects with singleton methods cannot be serialized, all of the singleton methods are stripped away before objects are sent to the background process. Be aware of this fact if you rely on these methods. Most of the time, it should be easy to extend the objects again inside the code block.
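The limitation can be reproduced with plain Ruby, assuming Marshal is the serializer. In the sketch below, dup is used to drop singleton methods; that is one possible way to strip them, not necessarily what the plugin does, and the names dumpable? and Greeting are made up for illustration.

```ruby
# Returns true if Marshal can serialize the object.
def dumpable?(object)
  Marshal.dump(object)
  true
rescue TypeError
  false
end

module Greeting
  def greet
    "hello"
  end
end

plain = Object.new
special = Object.new
special.define_singleton_method(:greet) { "hello" }

puts dumpable?(plain)       # prints true
puts dumpable?(special)     # prints false
puts dumpable?(special.dup) # prints true, because dup does not copy singleton methods

# On the receiving side, the lost behavior can be added back by extending
# the restored object with a module.
restored = Marshal.load(Marshal.dump(special.dup))
restored.extend(Greeting)
puts restored.greet         # prints hello
```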
Dependencies
The background plugin depends only on ActiveSupport, which is part of Rails.
Getting it, License and Patches
Get the complete source code through GitHub. The license is MIT, which means that you can do whatever you want with the software as long as the copyright statement stays intact. Please be a kind open source citizen and give back your patches and extensions. Just fork the code on GitHub, and after you’re done, send us a pull request. Thanks for your help!

Great job! This is exactly how async processing is supposed to work.
As to the problem of finding out the arguments of a method you could use Ruby2Ruby:
Ruby2Ruby.translate(User, :do_something_complicated)
If you don’t want to introduce a new dependency for this, you could go smart and use the parameter auto-detection only if Ruby2Ruby is available and resort to the manual way if it isn’t.
I’ll give your plugin a try and maybe I can add something to it. I’d like to use Starling as a job queue instead of activemessaging.
Andreas Korth
18 Jun 08 at 8:53 pm
This is pretty interesting. Is there a way to have multiple queues – one for longer processing background processes and one for the short ones? One of the advantages of systems like Workling/Starling is the ability to setup multiple queues.
Thanks.
Ram
18 Jun 08 at 8:53 pm
@andreas: I’ve looked into Ruby2Ruby and it seems that it would be a lot of effort to actually parse the argument names of the decorated method. AFAICS, even with Ruby2Ruby, it wouldn’t work with dynamically generated methods, which are pretty common in Rails. But thanks for the hint anyway, I will use it in upcoming projects for sure.
As soon as you are done with your Starling background handler, please let us know, so that we can incorporate it into the main release. Thanks!
@ram: No, right now, there is no way to use different queues. But since we need that feature anyway, I will be working on it next week. I will post updates here as soon as the next version is out.
Thomas Kadauke
18 Jun 08 at 8:53 pm
@thomas: Ruby2Ruby certainly does work with dynamic methods – that’s the whole point of it. And you tell me how hard it really is to parse the argument names: http://www.pastie.org/218757.
Andreas Korth
18 Jun 08 at 8:53 pm
There is a new version available that allows you to use multiple queues by configuring the background handler. See http://devblog.imedo.de/2008/6/28/new-version-of-background-with-backend-configuration for details.
I didn’t have time to check out Ruby2Ruby yet, but I will have a look at it soon, since I might need it in another sub-project.
I already have some ActiveMessaging-related changes in the pipeline, for making the background process (i.e. the poller) more fault-resistant. I will post updates on this blog as soon as it’s ready to release.
Thomas Kadauke
18 Jun 08 at 8:53 pm