tupelo 0.7 → 0.8

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 74456cde00385b5340494724ee0affc7cf6e2534
-  data.tar.gz: 331f3970d1fd41a702ada650f5701616de24b184
+  metadata.gz: 1a19d73a0cfead5fa7d4078521ac079bc69b6654
+  data.tar.gz: 24d0e5206aa04688072768f90c630d95bff54636
 SHA512:
-  metadata.gz: bcfe41b674e798ba15761f6740bc74a11963a793538a725ec393fefd4328eb9053e019d95c0f4701cb2ca680dc52c2d2e8dce48921be7cd68532c639b4c61d90
-  data.tar.gz: 82b13b6aa0675069946428033e5d032ca4418409c5901da00cac76cf139d7cc34afa428c4e666e42370b69241c797cabb55b375cfd34e5d8bd5bc83f2e6bf9a9
+  metadata.gz: c06d448f481f6e18b2d944e2961af9fb3007a2e56eaf69b65fa82975023c622abf5beca87142f3e844af0fa1e4e7eba2905b8c6c83b06c4193309d33345fac21
+  data.tar.gz: 97163417cc8366be36adb87b3ee3b1b1cbfcedf007dc817d579782231790c0cdaf0177c4104abde51fd4034c40ff88d2e90ef836442a66944b4c9a00cee7f9d4
data/README.md CHANGED
@@ -11,12 +11,16 @@ Tupelo differs from other spaces in several ways:
 
 * minimal central computation: just counter increment, message dispatch, and connection management (and it never unpacks serialized tuples)
 
-* clients do all the tuple work: registering and checking waiters, matching, searching, notifying, storing, inserting, deleting, persisting, etc. Each client is free to to decide how to do these things (application code is insulated from this, however). Special-purpose clients may use specialized algorithms and stores for the subspaces they manage.
+* clients do all the tuple work: registering and checking waiters, matching, searching, notifying, storing, inserting, deleting, persisting, etc. Each client is free to decide how to do these things (application code is insulated from this, however). Special-purpose clients (known as *tuplets*) may use specialized algorithms and stores for the subspaces they manage.
 
-* transactions, in addition to the classic operators.
+* transactions, in addition to the classic operators (and transactions execute client-side, reducing the central bottleneck and increasing expressiveness).
 
 * replication is inherent in the design (in fact it is unavoidable), for better or worse.
 
+Documentation
+============
+
+* [FAQ](doc/faq.md)
 
 Getting started
 ==========
@@ -58,11 +62,12 @@ Getting started
 
 Pulse without waiting:
 
-    pulse_nowait ...
+    pulse_nowait <tuple>,...
 
 Read tuple matching a template, waiting for a match to exist:
 
     r <template>
+    read <template>
     read_wait <template>
 
 Read tuple matching a template and return it, without waiting for a match to exist (returning nil in that case):
@@ -78,7 +83,7 @@ Getting started
 
     ra Hash
 
-reads all hash tuples (and ignore array tuples), and
+reads all hash tuples (and ignores array tuples), and
 
     ra proc {|t| t.size==2}
 
@@ -119,6 +124,22 @@ Getting started
     end
 
 Note that the block may execute more than once, if there is competition for the tuples that you are trying to #take or #read. When the block exits, however, the transaction is final and universally accepted by all clients.
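For example, here is a sketch of a transfer whose block can safely re-execute (the "acct" tuples are invented for illustration); all of its effects are takes and writes, which commit atomically only when the block exits:

    write ["acct", "a", 100], ["acct", "b", 0]

    transaction do
      _, _, a = take ["acct", "a", nil]
      _, _, b = take ["acct", "b", nil]
      write ["acct", "a", a - 10], ["acct", "b", b + 10]
    end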
+
+You can time out a transaction:
+
+    transaction timeout: 1 do
+      read ["does not exist"]
+    end
+
+This uses tupelo's internal lightweight scheduler, rather than ruby's heavyweight (one thread per timeout) Timeout, though the latter works with tupelo as well.
+
+You can also abort a transaction while inside it by calling `#abort` on it:
+
+    write [1]
+    transaction {take [1]; abort}
+    read_all # => [[1]]
+
+Another thread can abort a transaction in progress (to the extent possible) by calling `#cancel` on it. See [example/cancel.rb](example/cancel.rb).
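A rough sketch of cancelling from another thread follows. It is not the canonical example; it assumes, as the `#abort` example above suggests, that the block runs in the transaction's own context (so `self` there is the transaction), and that cancelling wakes the blocked thread:

    txn = nil
    t = Thread.new do
      transaction do
        txn = self              # capture the transaction object
        take ["never written"]  # block, waiting for a match
      end
    end
    sleep 0.1                   # let the transaction start waiting
    txn.cancel if txn           # cancel from this thread, to the extent possible
    t.join rescue nil           # the cancelled transaction may raise in its thread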
 
 4. Run tup with a server file so that two sessions can interact. Do this in two terminals in the same dir:
 
@@ -134,7 +155,13 @@ Getting started
 
 Note that all bin and example programs accept blob type (e.g., --msgpack, --json) on the command line (it only needs to be specified for the server -- the clients discover it). Also, all these programs accept log level on the command line. The default is --warn. The --info level is a good way to get an idea of what is happening, without the verbosity of --debug.
 
-6. Debugging: in addition to the --info switch on all bin and example programs, bin/tspy is also really useful. There is also the similar --trace switch that is available to all bin and example programs. This switch diagnostic output for each transaction. For example:
+6. Debugging: in addition to the --info switch on all bin and example programs, bin/tspy is also really useful; it shows all tuplespace events in the sequence they occur. For example, run
+
+    $ tspy svr
+
+in another terminal after running `tup svr`. The output shows the clock tick, sending client, operation, and operation status (success or failure).
+
+There is also the similar --trace switch that is available to all bin and example programs. This turns on diagnostic output for each transaction. For example:
 
 ```
 tick cid status operation
@@ -143,7 +170,7 @@ Getting started
    3   3        atomic take ["x", 1], ["y", 2]
 ```
 
-The `Tupelo.application` command, provided by `tupelo/app`, is the source of all these options and is available to your programs. It's a kind of lightweight process deployment and control framework; however it is not necessary to use tupelo.
+The `Tupelo.application` command, provided by `tupelo/app`, is the source of all these options and is available to your programs. It's a kind of lightweight process deployment and control framework; however `Tupelo.application` is not necessary to use tupelo.
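For a sense of what `Tupelo.application` gives you, here is a minimal sketch of a standalone program built on it, using the same `child`/`local` process-control calls and client methods that appear in this release's examples:

    require 'tupelo/app'

    Tupelo.application do
      child do
        write ["hello", "world"]   # runs in a forked child process
      end

      local do
        log take ["hello", nil]    # runs in this process; waits for the tuple
      end
    end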
 
 
 What is a tuplespace?
@@ -151,7 +178,7 @@ What is a tuplespace?
 
 A tuplespace is a service for coordination, configuration, and control of concurrent and distributed systems. The model it provides to processes is a shared space that they can use to communicate in a deterministic and sequential manner. (Deterministic in that all clients see the same, consistent view of the data.) The space contains tuples. The operations on the space are few, but powerful. It's not a database, but it might be a front-end for one or more databases.
 
-See https://en.wikipedia.org/wiki/Tuple_space for general information and history. This project is strongly influenced by Masatoshi Seki's Rinda implementation, part of the Ruby standard library.
+See https://en.wikipedia.org/wiki/Tuple_space for general information and history. This project is strongly influenced by Masatoshi Seki's Rinda implementation, part of the Ruby standard library. See http://pragprog.com/book/sidruby/the-druby-book for a good introduction to rinda and druby.
 
 What is a tuple?
 ----------------
@@ -183,6 +210,12 @@ In other words, a tuple is a fairly general object, though this depends on the s
 
 It's kind of like a "JSON object", except that, when using the json serializer, the hash keys can only be strings. In the msgpack case, keys have no special limitations. In the case of the marshal and yaml modes, tuples can contain many other kinds of objects.
 
+One other thing to keep in mind: in the array case, the order of the elements is significant. In the hash case, the order is not significant. So these are both true:
+
+    [1,2] != [2,1]
+    {a:1, b:2} == {b:2, a:1}
+
+
 What is a template?
 -------------------
 
@@ -255,9 +288,11 @@ If you prefer classical tuplespace locking, you can simply use certain tuples as
 
 If an optimistic transaction fails (for example, it is trying to take a tuple, but the tuple has just been taken by another transaction), then the transaction block is re-executed, possibly waiting for new matches to the templates. Application code must be aware of the possible re-execution of the block. This is better explained in the examples...
 
-Transactions have a significant disadvantage compared to lock tuples: a transaction can protect only resources that are represented in the tuplespace, whereas a lock can protect anything: a file, a device, a service, etc. This is because a transaction begins and ends within a single instant of logical (tuplespace) time, whereas a lock tuple can be taken out for an arbitrary duration of real time. Furthermore, the instant of logical time in which a transaction takes effect may occur at different wall-clock times on different processes, even on the same host.
+Transactions have a significant disadvantage compared to using take/write to lock/unlock tuples: a transaction can protect only resources that are represented in the tuplespace, whereas a lock can protect anything: a file, a device, a service, etc. This is because a transaction begins and ends within a single instant of logical (tuplespace) time, whereas a lock tuple can be taken out for an arbitrary duration of real (and logical) time. Furthermore, the instant of logical time in which a transaction takes effect may occur at different wall-clock times on different processes, even on the same host.
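A sketch of that take/write locking pattern (the resource name here is invented for illustration):

    take ["lock", "printer"]      # acquire: block until the lock tuple exists
    begin
      # ... use the printer, a file, or anything else outside the tuplespace ...
    ensure
      write ["lock", "printer"]   # release the lock, whatever happens
    end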
 
-Tupelo transactions are ACID in the following sense. They are Atomic and Isolated -- this is enforced by the transaction processing in each client. Consistency is enforced by the underlying message sequencer: each client's copy of the space is the deterministic result of the same sequence of operations. Durability is optional, but can be provided by the archiver (to be implemented) or other clients.
+Transactions do have an advantage over using take/write to lock/unlock tuples: there is no possibility of deadlock. See [example/deadlock.rb](example/deadlock.rb) and [example/parallel.rb](example/parallel.rb).
+
+Tupelo transactions are ACID in the following sense. They are Atomic and Isolated -- this is enforced by the transaction processing in each client. Consistency is enforced by the underlying message sequencer: each client's copy of the space is the deterministic result of the same sequence of operations. Durability is optional, but can be provided by the persistent archiver or other clients.
 
 On the CAP spectrum, tupelo tends towards consistency: for all clients, write and take operations are applied in the same order, so the state of the entire system up through a given tick of discrete time is universally agreed upon. Of course, because of the difficulties of distributed systems, one client may not yet have seen the same range of ticks as another.
 
@@ -318,7 +353,7 @@ Future
 
 - Investigate nio4r for faster networking, especially with many clients.
 
-- Interoperable client and server implementations in C, Python, Go, ....
+- Interoperable client and server implementations in C, Python, Go, .... Elixir?
 
 - UDP multicast to further reduce the bottleneck in the message sequencer.
 
@@ -363,11 +398,13 @@ To compare
 
 * resque
 
-* zookeeper -- totally ordered updates
+* zookeeper -- totally ordered updates; tupelo trades availability for lower latency (?)
 
 * chubby
 
-* doozer
+* doozer, etcd
+
+* arakoon
 
 * hazelcast
 
@@ -375,12 +412,17 @@ To compare
 
 * datomic -- similar distribution of "facts", but not tuplespace; similar use of pluggable storage managers
 
-* job queues: sidekiq, resque, delayedjob, http://queues.io
+* job queues: sidekiq, resque, delayedjob, http://queues.io, https://github.com/factual/skuld
 
 * pubsubs: kafka
 
 * spark, storm
 
+* tibco and gigaspace
+
+* gridgain
+
+
 Architecture
 ============
 
data/bin/tup CHANGED
@@ -47,6 +47,8 @@ if ARGV.delete("-h") or ARGV.delete("--help")
 
     -v            verbose mode (include time and pid in log messages)
 
+    --trace       enable trace output
+
     --pubsub      publish/subscribe mode; does not keep local tuple store:
 
                   * read only works in blocking mode (waiting for new tuple)
@@ -57,107 +59,53 @@ if ARGV.delete("-h") or ARGV.delete("--help")
     --yaml
     --json
     --msgpack <-- default
+
+    --persist-dir DIR
+                  load and save tuplespace to DIR
65
 
61
66
  END
62
67
  exit
63
68
  end
64
69
 
65
- require 'easy-serve'
70
+ require 'tupelo/app'
66
71
 
67
- log_level = case
68
- when ARGV.delete("--debug"); Logger::DEBUG
69
- when ARGV.delete("--info"); Logger::INFO
70
- when ARGV.delete("--warn"); Logger::WARN
71
- when ARGV.delete("--error"); Logger::ERROR
72
- when ARGV.delete("--fatal"); Logger::FATAL
73
- else Logger::WARN
74
- end
75
- verbose = ARGV.delete("-v")
76
- pubsub = ARGV.delete("--pubsub")
72
+ argv, tupelo_opts = Tupelo.parse_args(ARGV)
77
73
 
78
- blob_type = nil
79
- %w{--marshal --yaml --json --msgpack}.each do |switch|
80
- s = ARGV.delete(switch) and
81
- blob_type ||= s.delete("--")
82
- end
74
+ pubsub = argv.delete("--pubsub") # not a standard tupelo opt
83
75
 
84
- ez_opts = {
85
- servers_file: ARGV.shift,
86
- interactive: $stdin.isatty
87
- }
76
+ servers_file = argv.shift
77
+ addr = argv.shift(3)
88
78
 
89
- addr = ARGV.shift(3)
79
+ Tupelo.application(
80
+ argv: argv,
81
+ **tupelo_opts,
82
+ servers_file: servers_file,
83
+ seqd_addr: addr,
84
+ cseqd_addr: addr, # using same addr causes autoincrement of port/filename
85
+ arcd_addr: addr) do
90
86
 
91
- EasyServe.start ez_opts do |ez|
92
- log = ez.log
93
- log.level = log_level
94
- log.formatter = nil if verbose
95
- log.progname = File.basename($0)
96
-
97
- ez.start_servers do
98
- arc_to_seq_sock, seq_to_arc_sock = UNIXSocket.pair
99
- arc_to_cseq_sock, cseq_to_arc_sock = UNIXSocket.pair
100
-
101
- ez.server :seqd, *addr do |svr|
102
- require 'funl/message-sequencer'
103
- seq_opts = {}
104
- seq_opts[:blob_type] = blob_type if blob_type
105
- seq = Funl::MessageSequencer.new svr, seq_to_arc_sock, log: log,
106
- **seq_opts
107
- seq.start ## thwait? or can easy-serve do that?
108
- end
109
-
110
- ez.server :cseqd, *addr do |svr|
111
- require 'funl/client-sequencer'
112
- cseq = Funl::ClientSequencer.new svr, cseq_to_arc_sock, log: log
113
- cseq.start
114
- end
115
-
116
- ez.server :arcd, *addr do |svr|
117
- require 'tupelo/archiver'
118
- arc = Tupelo::Archiver.new svr, seq: arc_to_seq_sock,
119
- cseq: arc_to_cseq_sock, log: log
120
- arc.start
121
- end
87
+ class TupClient < Tupelo::Client
88
+ alias w write_wait
89
+ alias pl pulse_wait
90
+ alias t take
91
+ alias r read_wait
92
+ alias ra read_all
93
+ alias tr transaction
94
+ CMD_ALIASES = %w{ w pl t r ra tr }
95
+ private *CMD_ALIASES
122
96
  end
123
-
124
- ez.local :seqd, :cseqd, :arcd do |seqd, cseqd, arcd|
125
- log.progname = "client <starting in #{log.progname}>"
126
-
127
- require 'tupelo/client'
128
- class TupClient < Tupelo::Client
129
- alias w write_wait
130
- alias pl pulse_wait
131
- alias t take
132
- alias r read_wait
133
- alias ra read_all
134
- alias tr transaction
135
- CMD_ALIASES = %w{ w pl t r ra tr }
136
- private *CMD_ALIASES
137
- end
138
-
139
- client_opts = {seq: seqd, cseq: cseqd, log: log}
140
- if pubsub
141
- client_opts[:arc] = nil
142
- client_opts[:tuplespace] = TupClient::NullTuplespace
143
- else
144
- client_opts[:arc] = arcd
145
- end
146
-
147
- client = TupClient.new client_opts
148
- client.start do
149
- log.progname = "client #{client.client_id}"
150
- end
151
- log.info {
152
- "cpu time: %.2fs" % Process.times.inject {|s,x|s+x}
153
- }
154
- log.info {
155
- "starting shell. Commands: #{TupClient::CMD_ALIASES.join(", ")}"
156
- }
157
97
 
158
- require 'tupelo/app/irb-shell'
159
- IRB.start_session(client)
98
+ client_opts = {}
99
+ if pubsub
100
+ client_opts[:arc] = nil
101
+ client_opts[:tuplespace] = TupClient::NullTuplespace
102
+ end
160
103
 
161
- client.stop
104
+ local TupClient, **client_opts do
105
+ log.info {"cpu time: %.2fs" % Process.times.inject {|s,x|s+x}}
106
+ log.info {"starting shell. Commands: #{TupClient::CMD_ALIASES.join(", ")}"}
107
+
108
+ require 'tupelo/app/irb-shell'
109
+ IRB.start_session(self)
162
110
  end
163
111
  end
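As a usage sketch of the new options documented in the help text above (the directory path is invented), tracing and persistence can be enabled when starting the shell, since `Tupelo.parse_args` now strips the common switches before `tup` reads its positional arguments:

    $ tup svr --trace --persist-dir /tmp/tupelo-data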
@@ -0,0 +1,34 @@
+require 'tupelo/app'
+
+### need a programmatic way to start up clients
+
+Tupelo.application do |app|
+
+  app.child do ## local still hangs
+    3.times do |i|
+      app.child do
+        write [i]
+        log "wrote #{i}"
+      end
+    end
+
+    3.times do
+      log take [nil]
+    end
+  end
+end
+
+__END__
+
+this hangs sometimes but not always:
+
+tick cid status operation
+A: client 3: wrote 0
+A: client 4: wrote 1
+   1   3        batch write [0]
+   2   4        batch write [1]
+A: client 2: [0]
+   3   2        atomic take [0]
+   4   2        atomic take [1]
+A: client 2: [1]
+A: client 5: wrote 2
data/example/deadlock.rb ADDED
@@ -0,0 +1,66 @@
+# See the README reference to this file.
+# Run with --trace to see what's happening.
+
+require 'tupelo/app'
+
+def observe_deadlock
+  done = false
+  at_exit do
+    # for a passive client, exit is forced when there are no
+    # more non-passive clients
+    if done
+      log "done (should not happen)"
+    else
+      log "stopped in deadlock (as expected)"
+    end
+  end
+
+  yield
+
+  done = true
+end
+
+Tupelo.application do
+  local do
+    write [1], [2], [3], [4]
+  end
+
+  child passive: true do
+    observe_deadlock do
+      take [1]
+      sleep 1
+      take [2]
+      write [1], [2]
+    end
+  end
+
+  child passive: true do
+    observe_deadlock do
+      sleep 0.5
+      take [2]
+      take [1]
+      write [1], [2]
+    end
+  end
+
+  child do
+    transaction do
+      take [3]
+      sleep 1
+      take [4]
+      write [3], [4]
+      log "done"
+    end
+  end
+
+  child do
+    transaction do
+      sleep 0.5
+      take [4]
+      take [3]
+      write [3], [4]
+      log "done"
+    end
+  end
+
+end
data/example/lease.rb ADDED
@@ -0,0 +1,103 @@
1
+ require 'tupelo/app'
2
+
3
+ N_WORKERS = 3
4
+ N_TASKS = 10
5
+ N_SLEEPS = 2
6
+
7
+ Tupelo.application do
8
+ N_WORKERS.times do |w_i|
9
+ child passive: true do
10
+ loop do
11
+ task_id = task_data = nil
12
+
13
+ transaction do
14
+ _, task_id, task_data = take ["task", nil, nil]
15
+ write ["lease", client_id, task_id, task_data]
16
+ write ["alive", client_id, task_id, (Time.now + 1).to_f]
17
+ end
18
+
19
+ N_SLEEPS.times do
20
+ sleep 1 # pretend to be working
21
+ write ["alive", client_id, task_id, (Time.now + 1).to_f]
22
+
23
+ # randomly exit or oversleep the lease deadline
24
+ if w_i == 1
25
+ log "bad worker exiting"
26
+ exit
27
+ elsif w_i == 2
28
+ log "bad worker oversleeping"
29
+ sleep 3
30
+ end
31
+ end
32
+
33
+ result = task_data * 1000
34
+
35
+ transaction do
36
+ if take_nowait ["lease", client_id, task_id, nil]
37
+ write ["result", task_id, result]
38
+ # write the result only if this client still has lease --
39
+ # otherwise, some other client has been assigned to this task.
40
+ else
41
+ log.warn "I lost my lease because I didn't finish task in time!"
42
+ end
43
+ end
44
+ end
45
+ end
46
+ end
47
+
48
+ # Lease manager. Ensures that, for each input tuple ["task", i, ...],
49
+ # there is exactly one output tuple ["result", i, ...]. It does not
50
+ # attempt to stop / start processes. So it can fail if all the workers die,
51
+ # or if the lease manager itself dies. But it will succeed if it and at least
52
+ # one worker lives. This demonstrates how to recover from worker failure
53
+ # and prevent "lost tuples".
54
+ child passive: true do
55
+ require 'tupelo/client/atdo'
56
+
57
+ scheduler = make_scheduler
58
+ alive_until = Hash.new(0)
59
+
60
+ loop do
61
+ _, lease_client_id, task_id, time = take ["alive", nil, nil, nil]
62
+ t = alive_until[[lease_client_id, task_id]]
63
+ alive_until[[lease_client_id, task_id]] = [t, time].max
64
+
65
+ scheduler.at time + 0.2 do # allow for network latency etc.
66
+ t = alive_until[[lease_client_id, task_id]]
67
+ if t < Time.now.to_f # expired
68
+ task_data = nil
69
+ transaction do
70
+ _,_,_,task_data =
71
+ take_nowait ["lease", lease_client_id, task_id, nil]
72
+ # if lease is gone, ok!
73
+ if task_data
74
+ write ["task", task_id, task_data] # for someone else to work on
75
+ end
76
+ end
77
+ if task_data
78
+ log.warn "took lease from #{lease_client_id} on #{task_id}"
79
+ end
80
+ end
81
+ end
82
+ end
83
+ end
84
+
85
+ # Task requestor.
86
+ child do
87
+ N_TASKS.times do |task_id|
88
+ task_data = task_id # for simplicity
89
+ write ["task", task_id, task_data]
90
+ end
91
+
92
+ N_TASKS.times do |task_id|
93
+ log take ["result", task_id, nil]
94
+ end
95
+
96
+ extra_results = read_all ["result", nil, nil]
97
+ if extra_results.empty?
98
+ log "results look ok!"
99
+ else
100
+ log.error "extra results = #{extra_results}"
101
+ end
102
+ end
103
+ end
data/example/parallel.rb CHANGED
@@ -1 +1,100 @@
1
- # like gnu parallel
1
+ # a bit like gnu parallel
2
+ # see also https://github.com/grosser/parallel
3
+
4
+ require 'tupelo/app/remote'
5
+
6
+ show_steps = !!ARGV.delete("--show-steps")
7
+
8
+ hosts = ARGV.shift
9
+ map = ARGV.slice!(0,3)
10
+ reduce = ARGV.slice!(0,4)
11
+
12
+ abort <<END unless hosts and
13
+ map[0] == "map" and reduce[0] == "reduce" and reduce[3]
14
+
15
+ usage: #$0 <ssh-host>,... map <var> <expr> reduce <var> <var> <expr> [<infile> ...]
16
+
17
+ Input can be provided on standard input or as the contents of the files
18
+ specified in the infile arguments. Writes the result of the last
19
+ reduction to standard output.
20
+
21
+ If --show-steps is set then intermediate reductions are printed as they
22
+ are computed. If input is stdin at the terminal, then you can see these
23
+ outputs even before you type the EOF character.
24
+
25
+ Caution: very little argument checking!
26
+ Caution: no robustness guarantees (but see comments)!
27
+
28
+ Example:
29
+
30
+ ruby #$0 localhost,localhost map s s.length reduce l1 l2 l1+l2
31
+
32
+ Use `s.split.length` to get word count instead of char count.
33
+
34
+ END
35
+
36
+ hosts = hosts.split(",")
37
+
38
+ map_str = <<END
39
+ proc do |#{map[1]}|
40
+ #{map[2]}
41
+ end
42
+ END
43
+
44
+ reducer = eval <<END
45
+ proc do |#{reduce[1]}, #{reduce[2]}|
46
+ #{reduce[3]}
47
+ end
48
+ END
49
+
50
+ Tupelo.tcp_application do
51
+ hosts.each do |host|
52
+ remote host: host, passive: true, log: true, eval: %{
53
+ mapper = #{map_str}
54
+ loop do
55
+ s = take(line: String)["line"]
56
+ output = mapper[s]
57
+ log(mapped: output) if #{show_steps}
58
+ write output: output
59
+ end
60
+ }
61
+ end
62
+
63
+ child passive: true do
64
+ loop do
65
+ m1, m2 = transaction do # transaction avoids deadlock!
66
+ [take(output: nil)["output"],
67
+ take(output: nil)["output"]]
68
+ end
69
+
70
+ # Fragile! A crash after the transaction above means the whole app
71
+ # can't finish. You could fix this with lease tuples--see lease.rb.
72
+
73
+ output = reducer[m1, m2]
74
+ log reduced: output if show_steps
75
+
76
+ transaction do
77
+ count = take(count: nil)["count"]
78
+ write count: count - 1
79
+ write output: output
80
+ end
81
+ end
82
+ end
83
+
84
+ local do
85
+ write count: 0
86
+
87
+ ARGF.each do |line|
88
+ transaction do
89
+ write line: line.chomp
90
+ count = take(count: nil)["count"]
91
+ write count: count + 1
92
+ end
93
+ end
94
+
95
+ read count: 1
96
+ result = take output: nil
97
+ log result if show_steps
98
+ puts result["output"]
99
+ end
100
+ end
@@ -1,3 +1,5 @@
+# see also parallel.rb
+
 require 'tupelo/app/remote'
 
 hosts = ARGV.shift or abort "usage: #$0 <ssh-hostname>,<ssh-hostname>,..."
@@ -22,6 +24,5 @@ Tupelo.tcp_application do
       sum += take([Numeric])[0]
     end
     log "sum = #{sum}, correct sum = #{input.flatten.join.size}"
-    sleep 2
   end
 end