celluloid 0.2.2 → 0.5.0

data/README.md CHANGED
@@ -2,18 +2,28 @@ Celluloid
2
2
  =========
3
3
  [![Build Status](http://travis-ci.org/tarcieri/celluloid.png)](http://travis-ci.org/tarcieri/celluloid)
4
4
 
5
- > "I thought of objects being like biological cells and/or individual
6
- > computers on a network, only able to communicate with messages"
5
+ > "I thought of objects being like biological cells and/or individual
6
+ > computers on a network, only able to communicate with messages"
7
7
  > _--Alan Kay, creator of Smalltalk, on the meaning of "object oriented programming"_
8
8
 
9
- Celluloid is a concurrent object framework for Ruby inspired by Erlang and
10
- the Actor model. Celluloid gives you thread-backed objects that run
11
- concurrently, providing the simplicity of Ruby objects for the most common
12
- use cases, but also the ability to call methods _asynchronously_, allowing
13
- the receiver to do things in the background while the caller carries on
14
- with its business. These concurrent objects are called "actors". Actors are
15
- somewhere in between the kind of object you're typically used to working
16
- with and a network service.
9
+ Celluloid provides a simple and natural way to build fault-tolerant concurrent
10
+ programs in Ruby. With Celluloid, you can build systems out of concurrent
11
+ objects just as easily as you build sequential programs out of regular objects.
12
+ Recommended for any developer, including novices, Celluloid should help ease
13
+ your worries about building multithreaded Ruby programs.
14
+
15
+ Under the hood, Celluloid wraps regular objects in threads that talk to each
16
+ other using messages. These concurrent objects are called "actors". When a
17
+ caller wants another actor to execute a method, it literally sends it a
18
+ message object telling it what method to execute. The receiver listens on its
19
+ mailbox, gets the request, runs the method, and sends the caller the result.
20
+ The receiver processes messages in its inbox one-at-a-time, which means that
21
+ you don't need to worry about synchronizing access to an object's instance
22
+ variables.
23
+
24
+ In addition to that, Celluloid also gives you the ability to call methods
25
+ _asynchronously_, so the receiver can do things in the background for you
26
+ without the caller having to sit around waiting for the result.
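
The mailbox idea described above can be illustrated with a toy actor written in plain Ruby. This is only a sketch under stated assumptions, not Celluloid's implementation: `ToyActor`, `Counter`, `call`, and `async` are hypothetical names invented for this example, and the sketch leaves out linking, supervision, and crash handling.

```ruby
require 'thread'

# A toy actor (hypothetical, NOT Celluloid's implementation): it wraps a
# plain object in a thread and only talks to it through a mailbox.
class ToyActor
  def initialize(target)
    @target  = target
    @mailbox = Queue.new
    @thread  = Thread.new { run }
  end

  # Synchronous call: send a message, then block until the reply arrives.
  def call(meth, *args)
    reply = Queue.new
    @mailbox << [meth, args, reply]
    status, value = reply.pop
    raise value if status == :error
    value
  end

  # Asynchronous call: enqueue the message and return immediately.
  def async(meth, *args)
    @mailbox << [meth, args, nil]
    nil
  end

  private

  # Messages are handled one at a time, so the wrapped object's instance
  # variables are only ever touched by this single thread.
  def run
    loop do
      meth, args, reply = @mailbox.pop
      begin
        result = @target.public_send(meth, *args)
        reply << [:ok, result] if reply
      rescue StandardError => e
        reply << [:error, e] if reply
      end
    end
  end
end

# An ordinary object with unsynchronized state.
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end
end

counter = ToyActor.new(Counter.new)
100.times { counter.async(:increment) }
puts counter.call(:count)
```

Because the mailbox is a FIFO queue drained by one thread, the synchronous `call(:count)` is only processed after every `increment` queued before it.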
17
27
 
18
28
  Like Celluloid? [Join the Google Group](http://groups.google.com/group/celluloid-ruby)
19
29
 
@@ -28,47 +38,47 @@ to the JRuby executable, or set the "JRUBY_OPTS=--1.9" environment variable.
28
38
 
29
39
  Celluloid works on Rubinius in either 1.8 or 1.9 mode.
30
40
 
31
- Usage
32
- -----
41
+ Basic Usage
42
+ -----------
33
43
 
34
44
  To use Celluloid, define a normal Ruby class that includes Celluloid:
35
45
 
36
46
  class Sheen
37
47
  include Celluloid
38
-
48
+
39
49
  def initialize(name)
40
50
  @name = name
41
51
  end
42
-
52
+
43
53
  def set_status(status)
44
54
  @status = status
45
55
  end
46
-
56
+
47
57
  def report
48
58
  "#{@name} is #{@status}"
49
59
  end
50
60
  end
51
-
61
+
52
62
  Now when you create new instances of this class, they're actually concurrent
53
63
  objects, each running in their own thread:
54
64
 
55
65
  >> charlie = Sheen.new "Charlie Sheen"
56
66
  => #<Celluloid::Actor(Sheen:0x00000100a312d0) @name="Charlie Sheen">
57
67
  >> charlie.set_status "winning!"
58
- => "winning!"
68
+ => "winning!"
59
69
  >> charlie.report
60
- => "Charlie Sheen is winning!"
70
+ => "Charlie Sheen is winning!"
61
71
  >> charlie.set_status! "asynchronously winning!"
62
- => nil
72
+ => nil
63
73
  >> charlie.report
64
- => "Charlie Sheen is asynchronously winning!"
74
+ => "Charlie Sheen is asynchronously winning!"
65
75
 
66
- You can call methods on this concurrent object just like you would any other
76
+ You can call methods on this concurrent object just like you would any other
67
77
  Ruby object. The Sheen#set_status method works exactly like you'd expect,
68
78
  returning the last expression evaluated.
69
79
 
70
80
  However, Celluloid's secret sauce kicks in when you call bang
71
- methods (i.e. methods ending in !). Even though the Sheen class has no
81
+ methods (i.e. methods ending in !). Even though the Sheen class has no
72
82
  set_status! method, you can still call it. Why is this? Because bang methods
73
83
  have a special meaning in Celluloid. (Note: this also means you can't define
74
84
  bang methods on Celluloid classes and expect them to be callable from other
@@ -99,7 +109,7 @@ Futures
99
109
  Futures allow you to request a computation and get the result later. There are
100
110
  two types of futures supported by Celluloid: method futures and block futures.
101
111
  Method futures work by invoking the _future_ method on an actor. This method
102
- is analogous to the typical _send_ method in that it takes a method name,
112
+ is analogous to the typical _send_ method in that it takes a method name,
103
113
  followed by an arbitrary number of arguments, and a block. Let's invoke the
104
114
  report method from the charlie object used in the above example using a future:
105
115
 
@@ -111,7 +121,7 @@ report method from the charlie object used in the above example using a future:
111
121
  The call to charlie.future immediately returns a Celluloid::Future object,
112
122
  regardless of how long it takes to execute the "report" method. To obtain
113
123
  the result of the call to "report", we call the _value_ method of the
114
- future object. This call will block until the value returned from the method
124
+ future object. This call will block until the value returned from the method
115
125
  call is available (i.e. the method has finished executing). If an exception
116
126
  occurred during the method call, the call to future.value will reraise the
117
127
  same exception.
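
To make the blocking and re-raising behaviour of `value` concrete, here is a minimal sketch of a block future in plain Ruby. `ToyFuture` is a hypothetical class written for illustration; it is not how Celluloid::Future is implemented, and it assumes one background thread per future.

```ruby
# A toy block future (hypothetical, NOT Celluloid::Future): the block runs
# in a background thread and #value blocks until the result is available.
class ToyFuture
  def initialize(&block)
    @thread = Thread.new do
      begin
        [:ok, block.call]
      rescue StandardError => e
        [:error, e]
      end
    end
  end

  # Blocks until the block has finished, then returns its result or
  # re-raises the exception it raised.
  def value
    status, result = @thread.value
    raise result if status == :error
    result
  end
end

future = ToyFuture.new { 2 + 2 }
puts future.value

begin
  ToyFuture.new { 1 / 0 }.value
rescue ZeroDivisionError => e
  puts "re-raised in the caller: #{e.class}"
end
```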
@@ -119,9 +129,9 @@ same exception.
119
129
  Futures also allow you to background the computation of any block:
120
130
 
121
131
  >> future = Celluloid::Future.new { 2 + 2 }
122
- => #<Celluloid::Future:0x000001008425f0>
132
+ => #<Celluloid::Future:0x000001008425f0>
123
133
  >> future.value
124
- => 4
134
+ => 4
125
135
 
126
136
  One thing to be aware of when using futures: always make sure to obtain the
127
137
  value of any future you make. Futures create a thread in the background which
@@ -142,9 +152,9 @@ class from the example above:
142
152
 
143
153
  >> supervisor = Sheen.supervise "Charlie Sheen"
144
154
  => #<Celluloid::Supervisor(Sheen) "Charlie Sheen">
145
-
155
+
146
156
  This created a new Celluloid::Supervisor actor, and also created a new Sheen
147
- actor, giving its initialize method the argument "Charlie Sheen". The
157
+ actor, giving its initialize method the argument "Charlie Sheen". The
148
158
  _supervise_ method has the same method signature as _new_. However, rather
149
159
  than returning the newly created actor, _supervise_ returns the supervisor.
150
160
  To retrieve the actor that the supervisor is currently using, use the
@@ -154,7 +164,7 @@ Celluloid::Supervisor#actor method:
154
164
  => #<Celluloid::Supervisor(Sheen) "Charlie Sheen">
155
165
  >> charlie = supervisor.actor
156
166
  => #<Celluloid::Actor(Sheen:0x00000100a312d0)>
157
-
167
+
158
168
  Supervisors can also automatically put actors into the actor _registry_ using
159
169
  the supervise_as method:
160
170
 
@@ -162,7 +172,7 @@ the supervise_as method:
162
172
  => #<Celluloid::Supervisor(Sheen) "Charlie Sheen">
163
173
  >> charlie = Celluloid::Actor[:charlie]
164
174
  => #<Celluloid::Actor(Sheen:0x00000100a312d0)>
165
-
175
+
166
176
  In this case, the supervisor will ensure that an actor of the Sheen class,
167
177
  created using the given arguments, is always available by calling
168
178
  Celluloid::Actor[:charlie]. The first argument to supervise_as is the name
@@ -180,21 +190,21 @@ that actor crashes and dies. Let's start with an example:
180
190
  class JamesDean
181
191
  include Celluloid
182
192
  class CarInMyLaneError < StandardError; end
183
-
193
+
184
194
  def drive_little_bastard
185
195
  raise CarInMyLaneError, "that guy's gotta stop. he'll see us"
186
196
  end
187
197
  end
188
-
198
+
189
199
  Now, let's have James drive Little Bastard and see what happens:
190
200
 
191
201
  >> james = JamesDean.new
192
- => #<Celluloid::Actor(JamesDean:0x1068)>
202
+ => #<Celluloid::Actor(JamesDean:0x1068)>
193
203
  >> james.drive_little_bastard!
194
- => nil
204
+ => nil
195
205
  >> james
196
- => #<Celluloid::Actor(JamesDean:0x1068) dead>
197
-
206
+ => #<Celluloid::Actor(JamesDean:0x1068) dead>
207
+
198
208
  When we told james asynchronously to drive Little Bastard, it killed him! If
199
209
  we were Elizabeth Taylor, co-star in James' latest film at the time of his
200
210
  death, we'd certainly want to know when he died. So how can we do that?
@@ -207,7 +217,7 @@ Elizabeth Taylor object could be notified that James Dean has crashed:
207
217
  class ElizabethTaylor
208
218
  include Celluloid
209
219
  trap_exit :actor_died
210
-
220
+
211
221
  def actor_died(actor, reason)
212
222
  puts "Oh no! #{actor.inspect} has died because of a #{reason.class}"
213
223
  end
@@ -218,13 +228,13 @@ whenever any linked actors crashed. Now we need to link Elizabeth to James so
218
228
  James' crash notifications get sent to her:
219
229
 
220
230
  >> james = JamesDean.new
221
- => #<Celluloid::Actor(JamesDean:0x11b8)>
231
+ => #<Celluloid::Actor(JamesDean:0x11b8)>
222
232
  >> elizabeth = ElizabethTaylor.new
223
- => #<Celluloid::Actor(ElizabethTaylor:0x11f0)>
233
+ => #<Celluloid::Actor(ElizabethTaylor:0x11f0)>
224
234
  >> elizabeth.link james
225
- => #<Celluloid::Actor(JamesDean:0x11b8)>
235
+ => #<Celluloid::Actor(JamesDean:0x11b8)>
226
236
  >> james.drive_little_bastard!
227
- => nil
237
+ => nil
228
238
  Oh no! #<Celluloid::Actor(JamesDean:0x11b8) dead> has died because of a JamesDean::CarInMyLaneError
229
239
 
230
240
  Elizabeth called the _link_ method to receive crash events from James. Because
@@ -238,57 +248,57 @@ objects, one for James himself and one for Little Bastard, his car:
238
248
  class PorscheSpider
239
249
  include Celluloid
240
250
  class CarInMyLaneError < StandardError; end
241
-
251
+
242
252
  def drive_on_route_466
243
253
  raise CarInMyLaneError, "head on collision :("
244
254
  end
245
255
  end
246
-
256
+
247
257
  class JamesDean
248
258
  include Celluloid
249
-
259
+
250
260
  def initialize
251
261
  @little_bastard = PorscheSpider.new_link
252
262
  end
253
-
263
+
254
264
  def drive_little_bastard
255
265
  @little_bastard.drive_on_route_466
256
266
  end
257
267
  end
258
-
268
+
259
269
  If you take a look in JamesDean#initialize, you'll notice that to create an
260
270
  instance of PorscheSpider, James is calling the new_link method.
261
271
 
262
- This method works similarly to _new_, except it combines _new_ and _link_
272
+ This method works similarly to _new_, except it combines _new_ and _link_
263
273
  into a single call.
264
274
 
265
275
  Now what happens if we repeat the same scenario with Elizabeth Taylor watching
266
276
  for James Dean's crash?
267
277
 
268
278
  >> james = JamesDean.new
269
- => #<Celluloid::Actor(JamesDean:0x1108) @little_bastard=#<Celluloid::Actor(PorscheSpider:0x10ec)>>
279
+ => #<Celluloid::Actor(JamesDean:0x1108) @little_bastard=#<Celluloid::Actor(PorscheSpider:0x10ec)>>
270
280
  >> elizabeth = ElizabethTaylor.new
271
- => #<Celluloid::Actor(ElizabethTaylor:0x1144)>
281
+ => #<Celluloid::Actor(ElizabethTaylor:0x1144)>
272
282
  >> elizabeth.link james
273
- => #<Celluloid::Actor(JamesDean:0x1108) @little_bastard=#<Celluloid::Actor(PorscheSpider:0x10ec)>>
283
+ => #<Celluloid::Actor(JamesDean:0x1108) @little_bastard=#<Celluloid::Actor(PorscheSpider:0x10ec)>>
274
284
  >> james.drive_little_bastard!
275
- => nil
285
+ => nil
276
286
  Oh no! #<Celluloid::Actor(JamesDean:0x1108) dead> has died because of a PorscheSpider::CarInMyLaneError
277
287
 
278
- When Little Bastard crashed, it killed James as well. Little Bastard killed
279
- James, and because Elizabeth was trapping James' exit events, she received the
288
+ When Little Bastard crashed, it killed James as well. Little Bastard killed
289
+ James, and because Elizabeth was trapping James' exit events, she received the
280
290
  notification of James' death.
281
291
 
282
- Actors that are linked together propagate their error messages to all other
292
+ Actors that are linked together propagate their error messages to all other
283
293
  actors that they're linked to. Unless those actors are trapping exit events,
284
- those actors too will die, like James did in this case. If you have many,
294
+ those actors too will die, like James did in this case. If you have many,
285
295
  many actors linked together in a large object graph, killing one will kill them
286
296
  all unless they are trapping exits.
287
297
 
288
298
  This allows you to factor your problem into several actors. If an error occurs
289
299
  in any of them, it will kill off all actors used in a particular system. In
290
300
  general, you'll probably want to have a supervisor start a single actor which
291
- is in charge of a particular part of your system, and have that actor
301
+ is in charge of a particular part of your system, and have that actor
292
302
  new_link to other actors which are part of the same system. If any error
293
303
  occurs in any of these actors, all of them will be killed off and the entire
294
304
  subsystem will be restarted by the supervisor in a clean state.
@@ -303,11 +313,11 @@ Celluloid lets you register actors so you can refer to them symbolically.
303
313
  You can register Actors using Celluloid::Actor[]:
304
314
 
305
315
  >> james = JamesDean.new
306
- => #<Celluloid::Actor(JamesDean:0x80c27ce0)>
316
+ => #<Celluloid::Actor(JamesDean:0x80c27ce0)>
307
317
  >> Celluloid::Actor[:james] = james
308
- => #<Celluloid::Actor(JamesDean:0x80c27ce0)>
318
+ => #<Celluloid::Actor(JamesDean:0x80c27ce0)>
309
319
  >> Celluloid::Actor[:james]
310
- => #<Celluloid::Actor(JamesDean:0x80c27ce0)>
320
+ => #<Celluloid::Actor(JamesDean:0x80c27ce0)>
311
321
 
312
322
  The Celluloid::Actor constant acts as a hash, allowing you to register actors
313
323
  under the name of your choosing, and access actors by name rather than
@@ -331,22 +341,22 @@ send them a value in the process:
331
341
  class SignalingExample
332
342
  include Celluloid
333
343
  attr_reader :signaled
334
-
344
+
335
345
  def initialize
336
346
  @signaled = false
337
347
  end
338
-
348
+
339
349
  def wait_for_signal
340
350
  value = wait :ponycopter
341
351
  @signaled = true
342
352
  value
343
353
  end
344
-
354
+
345
355
  def send_signal(value)
346
356
  signal :ponycopter, value
347
357
  end
348
358
  end
349
-
359
+
350
360
  The wait_for_signal method in turn calls a method called "wait". Wait suspends
351
361
  the running method until another method of the same object calls the "signal"
352
362
  method with the same label.
@@ -354,19 +364,156 @@ method with the same label.
354
364
  The send_signal method of this class does just that, signaling "ponycopter"
355
365
  with the given value. This value is returned from the original wait call.
356
366
 
367
+ Handling I/O with Celluloid::IO
368
+ -------------------------------
369
+
370
+ Celluloid provides a separate class of actors which run alongside I/O
371
+ operations. These actors are slower and more heavyweight and should only be
372
+ used when writing actors that also handle IO operations. Every IO actor will
373
+ use 2 file descriptors (it uses a pipe for signaling), so use them sparingly
374
+ and only when directly interacting with IO.
375
+
376
+ To create an IO actor, include Celluloid::IO:
377
+
378
+ class IOActor
379
+ include Celluloid::IO
380
+
381
+ def initialize(sock)
382
+ @sock = sock
383
+ end
384
+
385
+ def read
386
+ wait_readable(@sock) do
387
+ @sock.read_nonblock(4096) # read_nonblock requires a maximum length; 4096 is an arbitrary choice
388
+ end
389
+ end
390
+ end
391
+
392
+ The Celluloid::IO#wait_readable and #wait_writeable methods suspend execution
393
+ of the current method until the given IO object is ready to be read from or
394
+ written to respectively. In the meantime, the current actor will continue
395
+ processing incoming messages, allowing it to respond to method requests even
396
+ while a method (or many methods) are waiting on IO objects.
397
+
357
398
  Logging
358
399
  -------
359
400
 
360
401
  By default, Celluloid will log any errors and backtraces from any crashing
361
402
  actors to STDOUT. However, if you wish you can use any logger which is
362
- compatible with the standard Ruby Logger API. For example, if you're using
363
- Celluloid within a Rails application, you'll probably want to do:
403
+ duck-type compatible with the standard Ruby Logger API (i.e. it implements the #error
404
+ method). For example, if you're using Celluloid within a Rails
405
+ application, you'll probably want to do:
364
406
 
365
407
  Celluloid.logger = Rails.logger
366
-
408
+
409
+ The logger class you specify must be thread-safe, although with a logging
410
+ API about the worst you have to worry about with thread safety bugs is
411
+ out-of-order messages in the log.
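
As a rough illustration of what a duck-typed, thread-safe logger could look like, here is a minimal sketch. `TinyLogger` is a hypothetical class, and the sketch assumes, as the text above says, that only the #error method is required; it would be assigned with `Celluloid.logger = TinyLogger.new`.

```ruby
require 'stringio'

# A hypothetical minimal logger: it only implements #error, and a Mutex
# keeps lines written by different threads from interleaving.
class TinyLogger
  def initialize(io = $stdout)
    @io   = io
    @lock = Mutex.new
  end

  def error(message)
    @lock.synchronize { @io.puts("ERROR: #{message}") }
    nil
  end
end

# Several threads logging at once still produce whole lines.
log    = StringIO.new
logger = TinyLogger.new(log)
4.times.map { |i| Thread.new { logger.error("actor #{i} crashed") } }.each(&:join)
puts log.string
```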
412
+
413
+ Implementation and Gotchas
414
+ --------------------------
415
+
416
+ Celluloid is fundamentally a messaging system which uses thread-safe proxies
417
+ to manage all inter-object communication in the system. While the goal of
418
+ these proxies is to make it simple for you to write concurrent programs by
419
+ applying the uniform access principle to thread-safe inter-object messaging,
420
+ you can't simply forget they're there.
421
+
422
+ The thread-safety guarantees Celluloid provides around synchronizing access to
423
+ instance variables only work so long as all access to actors goes through the
424
+ proxy objects. If the real objects that Celluloid is wrapping in an actor
425
+ manage to leak out of the system, all hell will break loose.
426
+
427
+ Here are a few rules you can follow to keep this from happening:
428
+
429
+ 1. ***NEVER RETURN SELF*** (or pass self as an argument to other actors): in
430
+ cases where you want to pass an actor around to other actors or threads,
431
+ use Celluloid.current_actor. If you grab the latest master of Celluloid
432
+ off of Github, you can just use the #current_actor method when you are
433
+ inside of an actor itself.
434
+
435
+ 2. Don't mutate the state of objects you've sent in calls to other actors:
436
+ This means you must think about data in one of two different ways: either
437
+ you "fire and forget" the data, leaving it for other actors to do with
438
+ what they will, or you must treat it as immutable if you have any plans
439
+ of sharing it with other actors. If you're paranoid (and when you're
440
+ dealing with concurrency, there's nothing wrong with being paranoid),
441
+ you can freeze objects so you can detect subsequent mutations (or rather,
442
+ turn attempts at mutation into errors).
443
+
444
+ 3. Don't mix Ruby thread primitives and calls to other actors: if you make
445
+ a call to another actor with a mutex held, you're doing it wrong. It's
446
+ perfectly fine and strongly encouraged to call out to thread safe
447
+ libraries from Celluloid actors. However, if you're using libraries that
448
+ acquire mutexes and then execute callbacks (e.g. they take a block while
449
+ they're holding a mutex) the guarantees that Celluloid provides will
450
+ become weak and you may encounter deadlocks.
451
+
452
+ 4. Use Fibers at your own risk: Celluloid employs Fibers as an intrinsic part
453
+ of how it implements actors. While it's possible for certain uses of Fibers
454
+ to cooperatively work alongside how Celluloid behaves, in most cases you'll
455
+ be writing a check you can't afford. So please ask yourself: why are you
456
+ using Fibers, and why can't it be solved by a block? If you've got a really
457
+ good reason and you're feeling lucky, knock yourself out.
458
+
459
+ On Thread Safety in Ruby
460
+ ------------------------
461
+
462
+ Ruby actually has a pretty good story when it comes to thread safety. The best
463
+ strategy for thread safety is to share as little state as possible, and if
464
+ you do share state, you should never mutate it. The worry of anyone stepping
465
+ into a thread safe world is that you're using a bunch of legacy libraries with
466
+ dubious thread safety. Who knows what those crazy library authors were doing?
467
+
468
+ Relax, people. You're using a language where somebody can change what the '+'
469
+ operator does to numbers. So why aren't we afraid to add numbers? Who knows
470
+ what those crazy library authors may have done! Instead of freaking out, we
471
+ can learn some telltale signs of things that will cause thread safety problems
472
+ in Ruby programs so we can identify potential problem libraries just from how
473
+ their APIs behave.
474
+
475
+ The #1 thread safety issue to look out for in a Ruby library is if it provides
476
+ some sort of singleton access to a particular object through a class method,
477
+ e.g. MyClass.zomgobject, as opposed to asking you to do MyClass.new. If you
478
+ aren't allocating the object, it isn't yours, it's somebody else's, and you
479
+ better damn well make sure you can share nice, or you shouldn't play with it
480
+ at all.
481
+
482
+ How do we share nicely? Let's find out by first looking at a thread-unsafe
483
+ version of a singleton method:
484
+
485
+ class Foo
486
+ def self.current
487
+ @foo ||= Foo.new
488
+ end
489
+ end
490
+
491
+ Seems bad. All threads will share access to the same Foo object, and there's
492
+ also a secondary bug here that shows up when the object is first being allocated
493
+ and memoized as @foo. The first thread that tries to allocate it may get a
494
+ different version than all the other threads because the memo value it set
495
+ got clobbered by another thread because it's unsynchronized.
496
+
497
+ What else can we do? It depends on why the library is memoizing. Perhaps the
498
+ Foo object has some kind of setup cost, such as making a network connection,
499
+ and we want to keep it around instead of setting it up and tearing it down
500
+ every time. If that's the case, the simplest thing we can do to make this
501
+ code thread safe is to create a thread-specific memo of the object:
502
+
503
+ class Foo
504
+ def self.current
505
+ Thread.current[:foo] ||= Foo.new
506
+ end
507
+ end
508
+
509
+ Keep in mind that this will require N Foo objects for N threads. If each
510
+ object is wrapping a network connection, this might be a concern. That said,
511
+ if you see this pattern employed in the singleton methods of a library,
512
+ it's most likely thread safe, provided that Foo doesn't do other wonky things.
513
+
367
514
  Contributing to Celluloid
368
515
  -------------------------
369
-
516
+
370
517
  * Fork Celluloid on github
371
518
  * Make your changes and send me a pull request
372
519
  * If I like them I'll merge them and give you commit access to my repository