eventmachine/eventmachine

View on GitHub
docs/old/DEFERRABLES

Summary

Maintainability
Test Coverage
EventMachine (EM) adds two different formalisms for lightweight concurrency
to the Ruby programmer's toolbox: spawned processes and deferrables. This
note will show you how to use deferrables. For more information, see the
separate document LIGHTWEIGHT_CONCURRENCY.

=== What are Deferrables?

EventMachine's Deferrable borrows heavily from the "deferred" object in
Python's "Twisted" event-handling framework. Here's a minimal example that
illustrates Deferrable:

 require 'eventmachine'

 class MyClass
   include EM::Deferrable

   def print_value x
     puts "MyClass instance received #{x}"
   end
 end

 EM.run {
   df = MyClass.new
   df.callback {|x|
     df.print_value(x)
     EM.stop
   }

   EM::Timer.new(2) {
     df.set_deferred_status :succeeded, 100
   }
 }


This program will spin for two seconds, print out the string "MyClass
instance received 100" and then exit. The Deferrable pattern relies on
an unusual metaphor that may be unfamiliar to you, unless you've used
Python's Twisted. You may need to read the following material through
more than once before you get the idea.

EventMachine::Deferrable is simply a Ruby Module that you can include
in your own classes. (There also is a class named
EventMachine::DefaultDeferrable for when you want to create one without
including it in code of your own.)

An object that includes EventMachine::Deferrable is like any other Ruby
object: it can be created whenever you want, returned from your functions,
or passed as an argument to other functions.

The Deferrable pattern allows you to specify any number of Ruby code
blocks (callbacks or errbacks) that will be executed at some future time
when the status of the Deferrable object changes.

How might that be useful? Well, imagine that you're implementing an HTTP
server, but you need to make a call to some other server in order to fulfill
a client request.

When you receive a request from one of your clients, you can create and
return a Deferrable object. Some other section of your program can add a
callback to the Deferrable that will cause the client's request to be
fulfilled. Simultaneously, you initiate an event-driven or threaded client
request to some different server. And then your EM program will continue to
process other events and service other client requests.

When your client request to the other server completes some time later, you
will call the #set_deferred_status method on the Deferrable object, passing
either a success or failure status, and an arbitrary number of parameters
(which might include the data you received from the other server).

At that point, the status of the Deferrable object becomes known, and its
callback or errback methods are immediately executed. Callbacks and errbacks
are code blocks that are attached to Deferrable objects at any time through
the methods #callback and #errback.

The deep beauty of this pattern is that it decouples the disposition of one
operation (such as a client request to an outboard server) from the
subsequent operations that depend on that disposition (which may include
responding to a different client or any other operation).

The code which invokes the deferred operation (that will eventually result
in a success or failure status together with associated data) is completely
separate from the code which depends on that status and data. This achieves
one of the primary goals for which threading is typically used in
sophisticated applications, with none of the nondeterminacy or debugging
difficulties of threads.

As soon as the deferred status of a Deferrable becomes known by way of a call
to #set_deferred_status, the Deferrable will IMMEDIATELY execute all of its
callbacks or errbacks in the order in which they were added to the Deferrable.

Callbacks and errbacks can be added to a Deferrable object at any time, not
just when the object is created. They can even be added after the status of
the object has been determined! (In this case, they will be executed
immediately when they are added.)

A call to Deferrable#set_deferred_status takes :succeeded or :failed as its
first argument. (This determines whether the object will call its callbacks
or its errbacks.) #set_deferred_status also takes zero or more additional
parameters, that will in turn be passed as parameters to the callbacks or
errbacks.

In general, you can only call #set_deferred_status ONCE on a Deferrable
object. A call to #set_deferred_status will not return until all of the
associated callbacks or errbacks have been called. If you add callbacks or
errbacks AFTER making a call to #set_deferred_status, those additional
callbacks or errbacks will execute IMMEDIATELY. Any given callback or
errback will be executed AT MOST once.

It's possible to call #set_deferred_status AGAIN, during the execution a
callback or errback. This makes it possible to change the parameters which
will be sent to the callbacks or errbacks farther down the chain, enabling
some extremely elegant use-cases. You can transform the data returned from
a deferred operation in arbitrary ways as needed by subsequent users, without
changing any of the code that generated the original data.

A call to #set_deferred_status will not return until all of the associated
callbacks or errbacks have been called. If you add callbacks or errbacks
AFTER making a call to #set_deferred_status, those additional callbacks or
errbacks will execute IMMEDIATELY.

Let's look at some more sample code. It turns out that many of the internal
protocol implementations in the EventMachine package rely on Deferrable. One
of these is EM::Protocols::HttpClient.

To make an evented HTTP request, use the module function
EM::Protocols::HttpClient#request, which returns a Deferrable object.
Here's how:

 require 'eventmachine'

 EM.run {
   df = EM::Protocols::HttpClient.request( :host=>"www.example.com",
                                           :request=>"/index.html" )

   df.callback {|response|
     puts "Succeeded: #{response[:content]}"
     EM.stop
   }

   df.errback {|response|
     puts "ERROR: #{response[:status]}"
     EM.stop
   }
 }

(See the documentation of EventMachine::Protocols::HttpClient for information
on the object returned by #request.)

In this code, we make a call to HttpClient#request, which immediately returns
a Deferrable object. In the background, an HTTP client request is being made
to www.example.com, although your code will continue to run concurrently.

At some future point, the HTTP client request will complete, and the code in
EM::Protocols::HttpClient will process either a valid HTTP response (including
returned content), or an error.

At that point, EM::Protocols::HttpClient will call
EM::Deferrable#set_deferred_status on the Deferrable object that was returned
to your program, as the return value from EM::Protocols::HttpClient.request.
You don't have to do anything to make this happen. All you have to do is tell
the Deferrable what to do in case of either success, failure, or both.

In our code sample, we set one callback and one errback. The former will be
called if the HTTP call succeeds, and the latter if it fails. (For
simplicity, we have both of them calling EM#stop to end the program, although
real programs would be very unlikely to do this.)

Setting callbacks and errbacks is optional. They are handlers to defined
events in the lifecycle of the Deferrable event. It's not an error if you
fail to set either a callback, an errback, or both. But of course your
program will then fail to receive those notifications.

If through some bug it turns out that #set_deferred_status is never called
on a Deferrable object, then that object's callbacks or errbacks will NEVER
be called. It's also possible to set a timeout on a Deferrable. If the
timeout elapses before any other call to #set_deferred_status, the Deferrable
object will behave as is you had called set_deferred_status(:failed) on it.


Now let's modify the example to illustrate some additional points:

 require 'eventmachine'

 EM.run {
   df = EM::Protocols::HttpClient.request( :host=>"www.example.com",
                                           :request=>"/index.html" )

   df.callback {|response|
     df.set_deferred_status :succeeded, response[:content]
   }

   df.callback {|string|
     puts "Succeeded: #{string}"
     EM.stop
   }

   df.errback {|response|
     puts "ERROR: #{response[:status]}"
     EM.stop
   }
 }


Just for the sake of illustration, we've now set two callbacks instead of
one. If the deferrable operation (the HTTP client-request) succeeds, then
both of the callbacks will be executed in order.

But notice that we've also made our own call to #set_deferred_status in the
first callback. This isn't required, because the HttpClient implementation
already made a call to #set_deferred_status. (Otherwise, of course, the
callback would not be executing.)

But we used #set_deferred_status in the first callback in order to change the
parameters that will be sent to subsequent callbacks in the chain. In this
way, you can construct powerful sequences of layered functionality. If you
want, you can even change the status of the Deferrable from :succeeded to
:failed, which would abort the chain of callback calls, and invoke the chain
of errbacks instead.

Now of course it's somewhat trivial to define two callbacks in the same
method, even with the parameter-changing effect we just described. It would
be much more interesting to pass the Deferrable to some other function (for
example, a function defined in another module or a different gem), that would
in turn add callbacks and/or errbacks of its own. That would illustrate the
true power of the Deferrable pattern: to isolate the HTTP client-request
from other functions that use the data that it returns without caring where
those data came from.

Remember that you can add a callback or an errback to a Deferrable at any
point in time, regardless of whether the status of the deferred operation is
known (more precisely, regardless of when #set_deferred_status is called on
the object). Even hours or days later.

When you add a callback or errback to a Deferrable object on which
#set_deferred_status has not yet been called, the callback/errback is queued
up for future execution, inside the Deferrable object. When you add a
callback or errback to a Deferrable on which #set_deferred_status has
already been called, the callback/errback will be executed immediately.
Your code doesn't have to worry about the ordering, and there are no timing
issues, as there would be with a threaded approach.

For more information on Deferrables and their typical usage patterns, look
in the EM unit tests. There are also quite a few sugarings (including
EM::Deferrable#future) that make typical Deferrable usages syntactically
easier to work with.