dispatch 0.0.1pre
- data/.gitignore +4 -0
- data/Gemfile +4 -0
- data/Gemfile.lock +14 -0
- data/README.rdoc +670 -0
- data/Rakefile +2 -0
- data/dispatch.gemspec +21 -0
- data/examples/benchmarks.rb +90 -0
- data/examples/dispatch_methods.rb +277 -0
- data/examples/dispatch_methods.sh +5 -0
- data/examples/futures.rb +34 -0
- data/examples/ring_buffer.rb +95 -0
- data/examples/sleeping_barber.rb +19 -0
- data/lib/dispatch.rb +22 -0
- data/lib/dispatch/enumerable.rb +108 -0
- data/lib/dispatch/job.rb +52 -0
- data/lib/dispatch/proxy.rb +49 -0
- data/lib/dispatch/queue.rb +34 -0
- data/lib/dispatch/source.rb +83 -0
- data/lib/dispatch/version.rb +3 -0
- data/spec/enumerable_spec.rb +188 -0
- data/spec/job_spec.rb +103 -0
- data/spec/proxy_spec.rb +105 -0
- data/spec/queue_spec.rb +45 -0
- data/spec/source_spec.rb +190 -0
- data/spec/spec_helper.rb +60 -0
- metadata +99 -0
data/.gitignore
ADDED
data/Gemfile
ADDED
data/Gemfile.lock
ADDED
data/README.rdoc
ADDED
= Grand Central Dispatch for MacRuby

== Introduction

This article explains how to use Grand Central Dispatch (*GCD*) from MacRuby, and is adapted from {Introducing Blocks and Grand Central Dispatch}[http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html] at the {Apple Developer Connection}[http://developer.apple.com/].

=== About GCD

GCD is a revolutionary approach to multicore computing that is woven throughout the fabric of {Mac OS X}[http://www.apple.com/macosx/] version 10.6 Snow Leopard. GCD combines an easy-to-use programming model with highly-efficient system services to radically simplify the code needed to make best use of multiple processors. The technologies in GCD improve the performance, efficiency, and responsiveness of Snow Leopard out of the box, and will deliver even greater benefits as more developers adopt them.

The central insight of GCD is shifting the responsibility for managing threads and their execution from applications to the operating system. As a result, programmers can write less code to deal with concurrent operations in their applications, and the system can perform more efficiently on single-processor machines, large multiprocessor servers, and everything in between. Without a pervasive approach such as GCD, even the best-written application cannot deliver optimal performance, because it doesn't have full insight into everything else happening in the system.

=== The MacRuby Dispatch module

GCD is natively implemented as a C API and runtime engine. MacRuby 0.5 introduces a new "Dispatch" module, which provides a Ruby wrapper around that API. This allows Ruby blocks to be scheduled on queues for asynchronous and concurrent execution either explicitly or in response to various kinds of events, with GCD automatically mapping queues to threads as needed. The Dispatch module provides four primary abstractions that mirror the C API:

Dispatch::Queue:: The basic unit for organizing blocks. Several queues are created by default, and applications may create additional queues for their own use.

Dispatch::Group:: Allows applications to track the progress of blocks submitted to queues and take action when the blocks complete.

Dispatch::Source:: Monitors and coalesces low-level system events so that they can be responded to asynchronously via simple event handlers.

Dispatch::Semaphore:: Synchronizes threads via a combination of waiting and signalling.

=== What You Need

The examples all assume you run the latest macirb and require the +dispatch+ library:

  $ macirb
  require 'dispatch'

We also assume that you are already familiar with Ruby, though not necessarily MacRuby. No prior knowledge of C or GCD is assumed or required, but the {dispatch(3) man page}[http://developer.apple.com/mac/library/DOCUMENTATION/Darwin/Reference/ManPages/man3/dispatch.3.html] may be helpful if you wish to better understand the underlying semantics.

== Dispatch::Job: Easy Concurrency

The easiest way to perform concurrent work is via a +Job+ object. Say you have a complex, long-running calculation you want to happen in the background. Create a job by passing Dispatch::Job's initializer the block you want to execute:

  job = Dispatch::Job.new { Math.sqrt(10**100) }

This atomically[http://en.wikipedia.org/wiki/Atomic_operation] adds the block to GCD's default concurrent queue, then returns immediately so you don't stall the main thread.

Concurrent queues schedule as many simultaneous blocks as they can on a first-in/first-out (FIFO[http://en.wikipedia.org/wiki/FIFO]) basis, as long as there are threads available. If there are spare CPUs, the system will automatically create more threads -- and reclaim them when idle -- allowing GCD to dynamically scale the number of threads based on the overall system load. Thus (unlike with threads, which choke when you create too many) you can generally create as many jobs as you want, and GCD will do the right thing.

=== Job#value: Asynchronous Return Values

The downside of asynchrony is that you don't know exactly when your job will execute. Fortunately, +Dispatch::Job+ attempts to duck-type +Thread[http://ruby-doc.org/core/classes/Thread.html]+, so you can call +value[http://ruby-doc.org/core/classes/Thread.html#M000460]+ to obtain the result of executing that block:

  @result = job.value
  puts "value (sync): #{@result} => 1.0e+50"

This will wait until the value has been calculated, allowing it to be used as an {explicit Future}[http://en.wikipedia.org/wiki/Futures_and_promises]. However, this may stall the main thread indefinitely, which reduces the benefits of concurrency.

Wherever possible, you should instead attempt to figure out exactly _when_ and _why_ you need to know the result of asynchronous work. Then, call +value+ with a block to also perform _that_ work asynchronously once the value has been calculated -- all without blocking the main thread:

  job.value {|v| puts "value (async): #{v} => 1.0e+50" } # (eventually)
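Since +Job+ duck-types +Thread+, the synchronous future pattern can also be seen in portable Ruby using only the standard library (no Dispatch module required); +Thread#value+ joins the thread and returns the block's result:

```ruby
# Plain-Ruby analogue of Job#value: Thread#value waits for the
# block to finish and returns its result.
thread = Thread.new { Math.sqrt(10**100) }
puts "value (sync): #{thread.value} => 1.0e+50"
```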
=== Job#join: Job Completion

If you just want to track completion, you can call +join[http://ruby-doc.org/core/classes/Thread.html#M000462]+, which waits without returning the result:

  job.join
  puts "join done (sync)"

Similarly, call +join+ with a block to run asynchronously once the work has been completed:

  job.join { puts "join done (async)" }

=== Job#add: Coordinating Work

More commonly, you will have multiple units of work you'd like to perform in parallel. You can add blocks to an existing job using +add+:

  job.add { Math.sqrt(2**64) }

If there are multiple blocks in a job, +value+ will wait until they all finish, then return the last value received:

  job.value {|b| puts "value (async): #{b} => 4294967296.0" }

=== Job#values: Returning All Values

Note that values may be received out of order, since they may take differing amounts of time to complete. If you need to force a particular ordering, create a new +Job+ or call +join+ before submitting the block.

Additionally, you can call +values+ to obtain all the values:

  @values = job.values
  puts "values: #{@values.inspect} => [1.0E50]"
  job.join
  puts "values: #{@values.inspect} => [1.0E50, 4294967296.0]"

Note that unlike +value+, this will not by itself first +join+ the job, and thus does not have an asynchronous equivalent.

== Dispatch::Proxy: Protecting Shared Data

Concurrency would be easy if everything was {embarrassingly parallel}[http://en.wikipedia.org/wiki/Embarrassingly_parallel], but it becomes tricky when we need to share data between threads. If two threads try to modify the same object at the same time, it could lead to inconsistent (read: _corrupt_) data. There are well-known techniques for preventing this sort of data corruption (e.g. locks[http://en.wikipedia.org/wiki/Lock_(computer_science)] and mutexes[http://en.wikipedia.org/wiki/Mutual_exclusion]), but these have their own well-known problems (e.g., deadlock[http://en.wikipedia.org/wiki/Deadlock] and {priority inversion}[http://en.wikipedia.org/wiki/Priority_inversion]).

Because Ruby traditionally had a global VM lock (or GIL[http://en.wikipedia.org/wiki/Global_Interpreter_Lock]), only one thread could modify data at a time, so developers never had to worry about these issues; then again, this also meant they didn't get much benefit from additional threads.

In MacRuby, every thread has its own Virtual Machine, which means all of them can access Ruby objects at the same time -- great for concurrency, not so great for data integrity. Fortunately, GCD provides _serial queues_ for {lock-free synchronization}[http://en.wikipedia.org/wiki/Non-blocking_synchronization], by ensuring that only one thread at a time accesses a particular object -- without the complexity and inefficiency of locking. Here we will focus on +Dispatch::Proxy+, a high-level construct that implements the {Actor model}[http://en.wikipedia.org/wiki/Actor_model] by wrapping any arbitrary Ruby object with a +SimpleDelegator+ that only allows execution of one method at a time (i.e., serializes data access onto a private queue).
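The serialization idea can be sketched in portable Ruby without the Dispatch module, by funneling every method call through a single worker thread. +TinyActor+ and its internals below are illustrative names of our own, not part of the Dispatch API:

```ruby
# A toy actor: every method call is queued and executed by one worker
# thread, so only one method ever touches the delegate at a time --
# serialization without locks, in the spirit of Dispatch::Proxy.
class TinyActor
  def initialize(delegate)
    @delegate = delegate
    @mailbox = Queue.new            # pending messages, drained serially
    Thread.new do
      loop do
        method, args, reply = @mailbox.pop
        reply << @delegate.send(method, *args)
      end
    end
  end

  # Forward any call to the delegate on the worker thread, then
  # block until the reply arrives (a synchronous "ask").
  def method_missing(method, *args)
    reply = Queue.new
    @mailbox << [method, args, reply]
    reply.pop
  end
end

actor = TinyActor.new(Hash.new)
actor.store(:foo, :bar)
puts actor.fetch(:foo)  # prints "bar"
```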
=== Job#synchronize: Creating Proxies

The easiest way to create a Proxy is to first create an empty Job:

  job = Dispatch::Job.new {}

then ask it to wrap the object you want to modify from multiple threads:

  @hash = job.synchronize Hash.new
  puts "synchronize: #{@hash.class} => Dispatch::Proxy"

This is actually the same type of object used to manage the list of +values+:

  puts "values: #{job.values.class} => Dispatch::Proxy"

=== Proxy#method_missing: Using Proxies

The Proxy object can be called just as if it were the delegate object:

  @hash[:foo] = :bar
  puts "proxy: #{@hash} => {:foo=>:bar}"
  @hash.delete :foo

Except that you can use it safely inside Dispatch blocks from multiple threads:

  [64, 100].each do |n|
    job.add { @hash[n] = Math.sqrt(10**n) }
  end
  job.join
  puts "proxy: #{@hash} => {64 => 1.0E32, 100 => 1.0E50}"

In this case, each block will perform the +sqrt+ asynchronously on the concurrent queue, potentially on multiple threads.

As with Dispatch::Job, you can make any invocation asynchronous by passing a block:

  @hash.inspect { |s| puts "inspect: #{s} => {64 => 1.0E32, 100 => 1.0E50}" }

=== Proxy#\_\_value\_\_: Returning Delegate

If for any reason you need to retrieve the original (unproxied) object, simply call +__value__+:

  delegate = @hash.__value__
  puts "\n__value__: #{delegate.class} => Hash"

This differs from +SimpleDelegator#__getobj__+ (which Dispatch::Proxy inherits) in that it will first wait until any pending asynchronous blocks have executed.

As elsewhere in Ruby, the "__" namespace implies "internal" methods, in this case meaning they are called directly on the proxy rather than passed to the delegate.

==== Caveat: Local Variables

Because Dispatch blocks may execute after the local context has gone away, you should always store Proxy objects in a non-local variable: instance, class, or global -- anything with a sigil[http://en.wikipedia.org/wiki/Sigil_(computer_programming)].

Note that we can as usual _access_ local variables from inside the block; GCD automatically copies them, which is why this works as expected:

  n = 42
  job = Dispatch::Job.new { puts "n (during): #{n} => 42" }
  job.join

but this doesn't:

  n = 0
  job = Dispatch::Job.new { n = 21 }
  job.join
  puts "n (after): #{n} => 0?!?"

The general rule is "do *not* assign to external variables inside a Dispatch block." Assigning local variables will have no effect (outside that block), and assigning other variables may replace your Proxy object with a non-Proxy version. Remember also that Ruby treats the accumulation operations ("+=", "||=", etc.) as syntactic sugar over assignment, and thus those operations only affect the copy of the variable:

  n = 0
  job = Dispatch::Job.new { n += 84 }
  job.join
  puts "n (+=): #{n} => 0?!?"

== Dispatch Enumerable: Parallel Iterations

Jobs are useful when you want to run a single item in the background or to run many different operations at once. But if you want to run the _same_ operation multiple times, you can take advantage of specialized GCD iterators. The Dispatch module defines "p_" variants of common Ruby iterators, making it trivial to parallelize existing operations.

In addition, for simplicity they are all _synchronous_, meaning they won't return until all the work has completed.

=== Integer#p_times

The simplest iteration is defined on the +Integer+ class, and passes the index that many +times+:

  5.times { |i| print "#{10**i}\t" }
  puts "times"

becomes

  5.p_times { |i| print "#{10**i}\t" }
  puts "p_times"

Note that even though the iterator as a whole is synchronous, and blocks are scheduled in the order received, each block runs independently and therefore may complete out of order.

This does add some overhead compared to the non-parallel version, so if you have a large number of relatively cheap iterations you can batch them together by specifying a +stride+:

  5.p_times(3) { |i| print "#{10**i}\t" }
  puts "p_times(3)"

It doesn't change the result, but schedules fewer blocks, thus amortizing the overhead over more work. Note that items _within_ a stride are executed completely in the original order, but no order is guaranteed _between_ strides.
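The batching itself can be pictured in portable Ruby with +each_slice+: a serial sketch of how a stride of 3 groups the indices (in the real +p_times+, each slice would run as a single GCD block):

```ruby
# Serial sketch of striding: each slice would become one GCD block.
strides = (0...5).each_slice(3).to_a
p strides  # => [[0, 1, 2], [3, 4]]
strides.each do |stride|
  stride.each { |i| print "#{10**i}\t" }  # in-order within a stride
end
puts
```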
The +p_times+ method is used to implement several convenience methods on +Enumerable+, which are therefore available from any class which mixes that in (e.g., +Array+, +Hash+, etc.). These also can take an optional stride.

=== Enumerable#p_each

Passes each object, like +each+:

  DAYS=%w(Mon Tue Wed Thu Fri)

  DAYS.each { |day| print "#{day}\t"}
  puts "each"

  DAYS.p_each { |day| print "#{day}\t"}
  puts "p_each"

  DAYS.p_each(3) { |day| print "#{day}\t"}
  puts "p_each(3)"

=== Enumerable#p_each_with_index

Passes each object and its index, like +each_with_index+:

  DAYS.each_with_index { |day, i | print "#{i}:#{day}\t"}
  puts "each_with_index"

  DAYS.p_each_with_index { |day, i | print "#{i}:#{day}\t"}
  puts "p_each_with_index"

  DAYS.p_each_with_index(3) { |day, i | print "#{i}:#{day}\t"}
  puts "p_each_with_index(3)"

=== Enumerable#p_map

Passes each object and collects the transformed values, like +map+:

  print (0..4).map { |i| "#{10**i}\t" }.join
  puts "map"

  print (0..4).p_map { |i| "#{10**i}\t" }.join
  puts "p_map"

  print (0..4).p_map(3) { |i| "#{10**i}\t" }.join
  puts "p_map(3)"

=== Enumerable#p_mapreduce

Unlike the others, this method does not have a serial equivalent, but you may recognize it from the world of {distributed computing}[http://en.wikipedia.org/wiki/MapReduce]:

  mr = (0..4).p_mapreduce(0) { |i| 10**i }
  puts "p_mapreduce: #{mr} => 11111"

This uses a parallel +inject+ (formerly known as +reduce+) to return a single value by combining the result of +map+. Unlike +inject+, you must specify an explicit initial value as the first parameter. The default accumulator is ":+", but you can specify a different symbol to +send+:
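Though there is no built-in serial +p_mapreduce+, the same result can be composed from +map+ and +inject+ in portable Ruby, which may help clarify the semantics:

```ruby
# Serial equivalent of (0..4).p_mapreduce(0) { |i| 10**i }:
# map each element, then fold the results with :+ from the initial value 0.
mr = (0..4).map { |i| 10**i }.inject(0, :+)
puts "mapreduce: #{mr} => 11111"
```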
  mr = (0..4).p_mapreduce([], :concat) { |i| [10**i] }
  puts "p_mapreduce(:concat): #{mr} => [1, 1000, 10, 100, 10000]"

Because of those parameters, the optional +stride+ is now the third:

  mr = (0..4).p_mapreduce([], :concat, 3) { |i| [10**i] }
  puts "p_mapreduce(3): #{mr} => [1000, 10000, 1, 10, 100]"

=== Enumerable#p_find_all

Passes each object and collects those for which the block is true, like +find_all+:

  puts "find_all | p_find_all | p_find_all(3)"
  puts (0..4).find_all { |i| i.odd? }.inspect
  puts (0..4).p_find_all { |i| i.odd? }.inspect
  puts (0..4).p_find_all(3) { |i| i.odd? }.inspect

=== Enumerable#p_find

Passes each object and returns nil if none match. Similar to +find+, it returns the first object it _finds_ for which the block is true, but unlike +find+ that may not be the _actual_ first object, since blocks -- say it with me -- "may complete out of order":

  puts "find | p_find | p_find(3)"

  puts (0..4).find { |i| i == 5 }.nil? # => nil
  puts (0..4).p_find { |i| i == 5 }.nil? # => nil
  puts (0..4).p_find(3) { |i| i == 5 }.nil? # => nil

  puts "#{(0..4).find { |i| i.odd? }} => 1"
  puts "#{(0..4).p_find { |i| i.odd? }} => 1?"
  puts "#{(0..4).p_find(3) { |i| i.odd? }} => 3?"

== Queues: Serialization

Most of the time, you can simply use GCD's default concurrent queues or the built-in queues associated with synchronized objects. However, if you want more precise control you can create and use your own queues.

=== Queue::new

The simplest way to create a +new+ queue is by passing it a meaningful name, typically using reverse-DNS naming:

  puts
  puts q = Dispatch::Queue.new("org.macruby.queue.example")

=== Queue#sync

You can schedule blocks directly on a queue synchronously using +sync+:

  q.sync { puts "queue sync" }

=== Queue#async

However, it is usually more useful to schedule them asynchronously using +async+:

  q.async { puts "queue async" }

=== Queue#join

A key advantage of scheduling blocks on your own private queue is that you can ensure that all pending blocks have been executed, via a +join+:

  puts "queue join"
  q.join

== Semaphores: Synchronization

Semaphores provide a powerful mechanism for communicating information across multiple queues. They are another low-level mechanism you can use for synchronizing work.

=== Semaphore::new

First, create a semaphore using +new+:

  puts
  puts semaphore = Dispatch::Semaphore.new(0)

Semaphores can be used to manage complex interactions, but here we will simply use them to signal completion of a single task, by passing a +count+ of zero.

=== Semaphore#signal

Next, schedule an asynchronous block that will +signal+ when it is done:

  q.async {
    puts "semaphore signal"
    semaphore.signal
  }

=== Semaphore#wait

Finally, +wait+ for that signal to arrive:

  puts "semaphore wait"
  semaphore.wait
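For readers without MacRuby handy, the same signal/wait handshake can be sketched in portable Ruby, with a standard-library +Queue+ standing in for a zero-count semaphore (a sketch only, not the Dispatch API):

```ruby
# A Queue as a stand-in for Dispatch::Semaphore.new(0):
# pop blocks until another thread pushes, just as wait blocks until signal.
done = Queue.new
Thread.new do
  puts "semaphore signal"
  done << :signal          # plays the role of Semaphore#signal
end
puts "semaphore wait"
done.pop                   # plays the role of Semaphore#wait
```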
== Sources: Asynchronous Events

In addition to scheduling blocks directly, GCD makes it easy to run a block in response to various system events via a Dispatch::Source, which can be a:

* Timer
* Custom event
* Signal
* File descriptor (file or socket)
* Process state change

When the source "fires", GCD will schedule the _handler_ block on the specific queue if it is not currently running, or -- more importantly -- coalesce pending events if it is. This provides excellent responsiveness without the expense of either polling or binding a thread to the event source. Plus, since the handler is never run more than once at a time, the block doesn't even need to be reentrant -- and thus you don't need to +synchronize+ any variables that are only used there.

=== Source.periodic

We'll start with a simple example: a +periodic+ timer that runs every 0.4 seconds and prints out the number of pending events:

  puts
  timer = Dispatch::Source.periodic(0.4) do |src|
    puts "Dispatch::Source.periodic: #{src.data}"
  end
  sleep 1 # => 1 1 ...

If you're familiar with the C API for GCD, be aware that a +Dispatch::Source+ is fully configured at the time of instantiation, and does not need to be +resume+d. Also, times are in seconds, not nanoseconds.

=== Source#data

As you can see above, the handler gets called with the source itself as a parameter, which allows you to query it for the source's +data+. The meaning of the data varies with the type of +Source+, though it is always an integer. Most commonly -- as in this case -- it is a count of the number of events being processed, and thus "1".

=== Source#suspend!

This monotony rapidly gets annoying; to pause, just +suspend!+ the source:

  timer.suspend!
  puts "suspend!"
  sleep 1

You can suspend a source at any time to prevent it from running another block, though this will not affect a block that is already being processed.

=== Source#resume!

If you change your mind, you can always +resume!+ the source:

  timer.resume!
  puts "resume!"
  sleep 1 # => 1 2 1 ...

If the +Source+ has fired one or more times, it will schedule a block containing the coalesced events. In this case, we were suspended for over 2 intervals, so the next block will fire with +data+ being at least 2.

=== Source#cancel!

Finally, you can stop the source entirely by calling +cancel!+:

  timer.cancel!
  puts "cancel!"
  puts

Cancellation is particularly significant in MacRuby's implementation of GCD, since (due to the reliance on garbage collection) there is no other way to explicitly stop using a source.

=== Custom Sources

Next up are _custom_ or _application-specific_ sources, which are fired explicitly by the developer instead of in response to an external event. These simple behaviors are the primitives upon which other sources are built.

==== Source.add

The +add+ source accumulates the sum of the event data (e.g., for a counter) in a thread-safe manner:

  @sum = 0
  adder = Dispatch::Source.add do |s|
    puts "Dispatch::Source.add: #{s.data} (#{@sum += s.data})"
    semaphore.signal
  end

Note that we use an instance variable (since it is re-assigned), but we don't have to +synchronize+ it -- and can safely re-assign it -- since the event handler does not need to be reentrant.

==== Source#<<

To fire a custom source, we invoke what GCD calls a _merge_ using the shovel operator ('+<<+'):

  adder << 1
  semaphore.wait
  puts "sum: #{@sum} => 1"

Note the use of +Semaphore#wait+ to ensure the asynchronously-scheduled event handler has been run.

The name "merge" makes more sense when you see it coalesce multiple firings into a single handler:

  adder.suspend!
  adder << 3
  adder << 5
  puts "sum: #{@sum} => 1"
  adder.resume!
  semaphore.wait
  puts "sum: #{@sum} => 9"
  adder.cancel!

Since the source is suspended -- mimicking what would happen if your event handler was busy at the time -- GCD automatically _merges_ the results together using addition. This is useful for tracking cumulative results across multiple threads, e.g. for a progress meter. Notice this is the event coalescing behavior used by +periodic+.

==== Source.or

Similarly, the +or+ source combines events using a logical OR (e.g., for booleans or bitmasks):

  @mask = 0
  masker = Dispatch::Source.or do |s|
    @mask |= s.data
    puts "Dispatch::Source.or: #{s.data.to_s(2)} (#{@mask.to_s(2)})"
    semaphore.signal
  end
  masker << 0b0001
  semaphore.wait
  puts "mask: #{@mask.to_s(2)} => 1"
  masker.suspend!
  masker << 0b0011
  masker << 0b1010
  puts "mask: #{@mask.to_s(2)} => 1"
  masker.resume!
  semaphore.wait
  puts "mask: #{@mask.to_s(2)} => 1011"
  masker.cancel!
  puts

This is primarily useful for flagging what _kinds_ of events have taken place since the last time the handler fired.
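The coalescing arithmetic itself is plain bitwise OR, which you can verify in any Ruby:

```ruby
# While the source is suspended, pending data is merged with OR:
merged = 0b0011 | 0b1010
puts merged.to_s(2)          # prints "1011"
# The handler then ORs the merged data into the running mask:
mask = 0b0001 | merged
puts mask.to_s(2)            # prints "1011"
```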
=== Process Sources

Let's see how both of these are used for sources which deal with UNIX processes.

==== Source.process

This +or+-style source takes and returns a mask of different events affecting the specified +process+:

exec:: Dispatch::Source.PROC_EXEC
exit:: Dispatch::Source.PROC_EXIT
fork:: Dispatch::Source.PROC_FORK
signal:: Dispatch::Source.PROC_SIGNAL

[*NOTE*: +Thread#fork+ is not supported by MacRuby]

The underlying API expects and returns integers, e.g.:

  @event = 0
  mask = Dispatch::Source::PROC_EXIT | Dispatch::Source::PROC_SIGNAL
  proc_src = Dispatch::Source.process($$, mask) do |s|
    @event |= s.data
    puts "Dispatch::Source.process: #{s.data.to_s(2)} (#{@event.to_s(2)})"
    semaphore.signal
  end

In this case, we are watching the current process ('$$') for +:signal+ and (less usefully :-) +:exit+ events.

==== Source#data2events

Alternatively, you can pass in an array of names (symbols or strings) for the mask, and optionally use +data2events+ to convert the returned data into an array of symbols:

  semaphore2 = Dispatch::Semaphore.new(0)
  @events = []
  mask2 = [:exit, :fork, :exec, :signal]
  proc_src2 = Dispatch::Source.process($$, mask2) do |s|
    these = Dispatch::Source.data2events(s.data)
    @events += these
    puts "Dispatch::Source.process: #{these} (#{@events})"
    semaphore2.signal
  end

==== Source.process Example

To fire the event, we can, e.g., send an un-trapped signal:

  sig_usr1 = Signal.list["USR1"]
  Signal.trap(sig_usr1, "IGNORE")
  Process.kill(sig_usr1, $$)
  Signal.trap(sig_usr1, "DEFAULT")

You can check which flags were set by _and_ing against the bitmask:

  semaphore.wait
  result = @event & mask
  print "@event: #{result.to_s(2)} =>"
  puts " #{Dispatch::Source::PROC_SIGNAL.to_s(2)} (Dispatch::Source::PROC_SIGNAL)"
  proc_src.cancel!

Or equivalently, intersecting the array:

  semaphore2.wait
  puts "@events: #{(result2 = @events & mask2)} => [:signal]"
  proc_src2.cancel!

==== Source#event2num

You can convert from symbol to int via +event2num+:

  puts "event2num: #{Dispatch::Source.event2num(result2[0]).to_s(2)} => #{result.to_s(2)}"

==== Source#data2events

Similarly, use +data2events+ to turn an int into a symbol:

  puts "data2events: #{Dispatch::Source.data2events(result)} => #{result2}"
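Conceptually, these helpers amount to a lookup between symbols and bit flags. A portable sketch, using made-up bit values (*not* the real PROC_* constants) and our own method names, might look like:

```ruby
# Illustrative flag table only; the real PROC_* bit values differ.
EVENTS = { exit: 1 << 0, fork: 1 << 1, exec: 1 << 2, signal: 1 << 3 }

# symbol -> bit (in the spirit of event2num)
def event2num(sym)
  EVENTS[sym]
end

# bitmask -> array of symbols whose bits are set (in the spirit of data2events)
def data2events(data)
  EVENTS.keys.select { |sym| (data & EVENTS[sym]) != 0 }
end

puts event2num(:signal)                               # prints "8"
p data2events(event2num(:exit) | event2num(:signal))  # prints "[:exit, :signal]"
```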
|
538
|
+
|
539
|
+
==== Source.signal
|
540
|
+
|
541
|
+
This +Source+ overlaps slightly with the previous one, but uses +add+ to track the number of times that a specific +signal+ was fired against the *current* process:
|
542
|
+
|
543
|
+
@signals = 0
|
544
|
+
sig_usr2 = Signal.list["USR2"]
|
545
|
+
signal = Dispatch::Source.signal(sig_usr2) do |s|
|
546
|
+
puts "Dispatch::Source.signal: #{s.data} (#{@signals += s.data})"
|
547
|
+
semaphore.signal
|
548
|
+
end
|
549
|
+
|
550
|
+
puts "signals: #{@signals} => 0"
|
551
|
+
signal.suspend!
|
552
|
+
Signal.trap(sig_usr2, "IGNORE")
|
553
|
+
3.times { Process.kill(sig_usr2, $$) }
|
554
|
+
Signal.trap(sig_usr2, "DEFAULT")
|
555
|
+
signal.resume!
|
556
|
+
semaphore.wait
|
557
|
+
puts "signals: #{@signals} => 3"
|
558
|
+
signal.cancel!
|
559
|
+
puts
|
560
|
+
|
561
|
+
=== File Sources
|
562
|
+
|
563
|
+
Next up are sources which deal with file operations -- actually, anything that modifies a vnode, including sockets and pipes.
|
564
|
+
|
565
|
+
==== Source.file
|
566
|
+
|
567
|
+
This +or+-style source takes and returns a mask of different events affecting the specified +file+:
|
568
|
+
|
569
|
+
delete:: Dispatch::Source.VNODE_DELETE
|
570
|
+
write:: Dispatch::Source.VNODE_WRITE
|
571
|
+
extend:: Dispatch::Source.VNODE_EXTEND
|
572
|
+
attrib:: Dispatch::Source.VNODE_ATTRIB
|
573
|
+
link:: Dispatch::Source.VNODE_LINK
|
574
|
+
rename:: Dispatch::Source.VNODE_RENAME
|
575
|
+
revoke:: Dispatch::Source.VNODE_REVOKE
|
576
|
+
|
577
|
+
As before, the underlying API expects and returns integers, e.g.:
|
578
|
+
|
579
|
+
  @fevent = 0
  @msg = "#{$$}-#{Time.now.to_s.gsub(' ','_')}"
  puts "msg: #{@msg}"
  filename = "/tmp/dispatch-#{@msg}"
  puts "filename: #{filename}"
  file = File.open(filename, "w")
  fmask = Dispatch::Source::VNODE_DELETE | Dispatch::Source::VNODE_WRITE
  file_src = Dispatch::Source.file(file.fileno, fmask, q) do |s|
    @fevent |= s.data
    puts "Dispatch::Source.file: #{s.data.to_s(2)} (#{@fevent.to_s(2)})"
    semaphore.signal
  end
  file.print @msg
  file.flush
  file.close
  semaphore.wait(0.1)
  print "fevent: #{(@fevent & fmask).to_s(2)} =>"
  puts " #{Dispatch::Source::VNODE_WRITE.to_s(2)} (Dispatch::Source::VNODE_WRITE)"
  File.delete(filename)
  semaphore.wait(0.1)
  print "fevent: #{@fevent.to_s(2)} => #{fmask.to_s(2)}"
  puts " (Dispatch::Source::VNODE_DELETE | Dispatch::Source::VNODE_WRITE)"
  file_src.cancel!
  q.join

And of course you can also use symbols:

  @fevent2 = []
  file = File.open(filename, "w")
  fmask2 = %w(delete write)
  file_src2 = Dispatch::Source.file(file, fmask2) do |s|
    @fevent2 += Dispatch::Source.data2events(s.data)
    puts "Dispatch::Source.file: #{Dispatch::Source.data2events(s.data)} (#{@fevent2})"
    semaphore2.signal
  end
  file.print @msg
  file.flush
  semaphore2.wait(0.1)
  puts "fevent2: #{@fevent2} => [:write]"
  file_src2.cancel!

As a bonus, if you pass in an actual IO object (not just a file descriptor), the Dispatch library will automatically create a handler that closes the file for you when the source is cancelled!

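A rough plain-Ruby sketch of what that convenience amounts to -- the class and method names here are purely illustrative, not the gem's actual internals:

```ruby
require 'stringio'

# Illustrative wrapper: remembers the IO it was handed and closes it
# on cancellation, so the caller never leaks the descriptor.
class AutoClosingSource
  def initialize(io, &handler)
    @io = io
    @handler = handler  # would be invoked with event data by a real source
  end

  def cancel!
    @io.close unless @io.closed?
  end
end

io = StringIO.new
src = AutoClosingSource.new(io) { |data| }
src.cancel!
puts io.closed? # => true
```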
==== Source.read

In contrast to the previous sources, these next two refer to internal state rather than external events. Specifically, this +add+-style source avoids blocking on a +read+ by only calling the handler when it estimates there are +s.data+ unread bytes available in the buffer:

  file = File.open(filename, "r")
  @input = ""
  reader = Dispatch::Source.read(file) do |s|
    @input << file.read(s.data)
    puts "Dispatch::Source.read: #{s.data}: #{@input}"
  end
  while (@input.size < @msg.size) do; end
  puts "input: #{@input} => #{@msg}" # => e.g., 74323-2010-07-07_15:23:10_-0700
  reader.cancel!

Strictly speaking, the count returned by +s.data+ is only an estimate. It would be safer to call +@file.read(1)+ each time to avoid any risk of blocking -- but that would lead to many more block invocations, which might not be a net win.

Note that since the block handler may be called many times, we can't simply wait on a semaphore; instead, we poll a shared variable. In a real implementation you should detect end-of-file instead.

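The trade-off can be illustrated without MacRuby at all: Ruby's own +read_nonblock+ raises rather than blocking, which is the guarantee the safer single-byte approach relies on. A minimal sketch using a pipe (note the handler-style loop draining one byte at a time):

```ruby
# Create a pipe, fill it, and close the write end so the reader sees EOF.
r, w = IO.pipe
w.write("abc")
w.close

input = ""
begin
  # Reading one byte per call can never over-ask the buffer; at EOF,
  # read_nonblock raises EOFError instead of blocking forever.
  loop { input << r.read_nonblock(1) }
rescue EOFError
  # Pipe drained; a real handler would cancel its source here.
end
puts input # => "abc"
```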
==== Source.write

This +add+-style source is similar to the above, but waits until a +write+ buffer is available:

  file = File.open(filename, "w")
  @next_char = 0
  writer = Dispatch::Source.write(file) do |s|
    if @next_char < @msg.size then
      char = @msg[@next_char]
      file.write(char)
      @next_char += 1
      puts "Dispatch::Source.write: #{char}|#{@msg[@next_char..-1]}"
    end
  end
  while (@next_char < @msg.size) do; end
  puts "output: #{File.read(filename)} => #{@msg}" # e.g., 74323-2010-07-07_15:23:10_-0700
  File.delete(filename)

In this case we play it safe by writing out only a single character each time we are called, avoiding any risk of blocking (and simplifying our algorithm).

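The "wait until a write buffer is available" condition that the source monitors can be approximated in plain Ruby with +IO.select+, which blocks until a descriptor is writable. This sketch (not MacRuby-specific) writes one character per readiness check, mirroring the example above:

```ruby
r, w = IO.pipe
msg = "hello"
next_char = 0
while next_char < msg.size
  # Block until w has buffer space -- the condition a write source
  # would otherwise monitor on our behalf.
  IO.select(nil, [w])
  w.write(msg[next_char])  # one character per "readiness" callback
  next_char += 1
end
w.close
result = r.read
puts result # => "hello"
```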
= What's Next?

This concludes our introduction to the high-level wrappers available via +require 'dispatch'+. These allow you to easily write concurrent, asynchronous code using simple Ruby idioms. For additional performance and fine-grained control, you may want to drop down and use Queues, Groups, and Semaphores directly.
At the moment, this is best done by reading the {existing C documentation for GCD}[http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html] and comparing it to the relevant RubyDoc:

  $ macri Dispatch
  $ macri Dispatch::Queue
  $ macri Dispatch::Group
  $ macri Dispatch::Semaphore

However, feel free to {file bugs}[https://www.macruby.org/auth/login/?next=/trac/newticket] about additional documentation you would like to see.