tupelo 0.16 → 0.17

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: e5af7d65abd86881773acbdf6957c756d6788bf4
4
- data.tar.gz: f8d9c3bb4b8cb83a3d18714a508eeb374e457ba3
3
+ metadata.gz: c5703dd8bef3eca3c30fc208f1ff5fe291b19b4a
4
+ data.tar.gz: c84e02596fc16e71dce360a8ba67b7033194d15b
5
5
  SHA512:
6
- metadata.gz: e6af32369c79059957d4513ec90d711a0d0dd514488f126944b7da0dc888fadc07654a32f22387b6e095629f0149c72df4ce6f7ea4ccc24834d52d66e0159a5d
7
- data.tar.gz: bac6727749123e5b64472b9b983c18fd90f301fa6bb0486b0b6888c4864b818fb1bc6089d4e5f2f7b46bce651927113bffbcff537fa642a4f5446a5965381e73
6
+ metadata.gz: f6e2afa9ccf4cea0e12497ae4b8d98d169d2b8b6b46dcf92ff13d46f44d111080331fc80aa637f9234ef41d3727cd7e591d7e861bf2492bce5fda9ab258e6e32
7
+ data.tar.gz: afb4669e2bdf49294eda7423b250ea62f0e4efedd7f9ea250658ee531e76871d27d027b456492680d0fa920ffd9d5045fb6f7df569bdc52945f7677b86ffb325
data/README.md CHANGED
@@ -1,34 +1,32 @@
1
- tupelo
1
+ Tupelo
2
2
  ==
3
3
 
4
- A tuplespace that is fast, scalable, and language agnostic. It is designed for distribution of both computation and storage, in a unified language that has both transactional and tuple-operation (read/write/take) semantics.
4
+ A tuplespace that is fast, scalable, and language agnostic. It is designed for distribution of both computation and storage (disk and memory), in a unified language that has both transactional and tuple-operation (read/write/take) semantics.
5
5
 
6
-
7
- This is the reference implementation in ruby. It should be able to communicate with implementations in other languages. Planned implementation languages include C, Python, and Go.
8
-
9
- Tupelo differs from other spaces in several ways:
10
-
11
- * minimal central storage: the only state in the server is a counter and socket connections
12
-
13
- * minimal central computation: just counter increment, message dispatch, and connection management (and it never unpacks serialized tuples)
14
-
15
- * clients do all the tuple work: registering and checking waiters, matching, searching, notifying, storing, inserting, deleting, persisting, etc. Each client is free to decide how to do these things (application code is insulated from this, however). Special-purpose clients (known as *tuplets*) may use specialized algorithms and stores for the subspaces they manage.
16
-
17
- * transactions, in addition to the classic operators (and transactions execute client-side, reducing bottleneck and increasing expressiveness).
18
-
19
- * replication is inherent in the design (in fact it is unavoidable), for better or worse.
6
+ This is the reference implementation in ruby. It should be able to communicate with implementations in other languages.
20
7
 
21
8
  Documentation
22
9
  ============
23
10
 
11
+ * [Tutorial](doc/tutorial.md)
24
12
  * [FAQ](doc/faq.md)
13
+ * [Comparisons](doc/compare.md)
14
+ * [Transactions](doc/transactions.md)
25
15
  * [Subspaces](doc/subspace.md)
26
- * [Abstract](sfdc.md) and [slides](doc/sfdc.pdf) for San Francisco Distributed Systems meetup
16
+ * [Examples](example/)
17
+
18
+ Internals
19
+ ---------
20
+ * [Architecture and protocol](doc/arch.md)
21
+
22
+ Talk
23
+ ----
24
+ * [Abstract](sfdc.md) and [slides](doc/sfdc.pdf) for San Francisco Distributed Computing meetup
27
25
 
28
26
  Getting started
29
27
  ==========
30
28
 
31
- 1. Install ruby 2 (not 1.9) from http://ruby-lang.org. Examples and tests will not work on windows (they use fork and unix sockets) or jruby, though probably the underlying libs will (using tcp sockets).
29
+ 1. Install ruby 2.0 or 2.1 (not 1.9) from http://ruby-lang.org. Examples and tests will not work on windows (they use fork and unix sockets) or jruby, though probably the underlying libs will (using tcp sockets).
32
30
 
33
31
  2. Install the gem and its dependencies (you may need to `sudo` this):
34
32
 
@@ -43,327 +41,14 @@ Getting started
43
41
  >> t [nil, nil]
44
42
  => ["hello", "world"]
45
43
 
46
- If you run tup with the --info switch it will tell you the aliases to the tuple API (and also tell you much about what is happening in your transactions). Here's an overview of the API, including the short aliases available in tup:
47
-
48
- Write one or more tuples (and wait for the transaction to be recorded in the local space):
49
-
50
- w <tuple>,...
51
- write_wait <tuple>,...
52
-
53
- Write without waiting:
54
-
55
- write <tuple>,...
56
-
57
- Write and then wait, under user control:
58
-
59
- write(...).wait
60
-
61
- Pulse a tuple or several (write but immediately delete it, like pubsub):
62
-
63
- pl <tuple>,...
64
- pulse_wait ...
65
-
66
- Pulse without waiting:
67
-
68
- pulse_nowait <tuple>,...
69
-
70
- Read tuple matching a template, waiting for a match to exist:
71
-
72
- r <template>
73
- read <template>
74
- read_wait <template>
75
-
76
- Read tuple matching a template and return it, without waiting for a match to exist (returning nil in that case):
77
-
78
- read_nowait <template>
79
-
80
- Note that neither #read nor #read_nowait wait for any previously issued writes to complete. The difference is that #read waits for a match to exist and #read_nowait does not. Compare:
81
-
82
- write [1]; read_nowait [1] # ==> nil, probably
83
- write [2]; read [2] # ==> [2]
84
-
85
- Read all tuples matching a template, no waiting (like #read_nowait):
86
-
87
- ra <template>
88
- read_all <template>
89
-
90
- If the template is omitted, reads everything (careful, you get what you ask for!). The template can be a standard template as discussed below or anything with a #=== method. Hence
91
-
92
- ra Hash
93
-
94
- reads all hash tuples (and ignores array tuples), and
95
-
96
- ra proc {|t| t.size==2}
97
-
98
- reads all 2-tuples.
99
-
100
- Read tuples in a stream, both existing and as they arrive:
101
-
102
- read <template> do |tuple| ... end
103
- read do |tuple| ... end # match any tuple
104
-
105
- Take a tuple matching a template:
106
-
107
- t <template>
108
- take <template>
109
-
110
- Take a tuple matching a template and optimistically use the local value before the transaction is complete:
111
-
112
- x_final = take <template> do |x_optimistic|
113
- ...
114
- end
115
-
116
- There is no guarantee that `x_final == x_optimistic`. The block may execute more than once.
117
-
118
- Take a tuple matching a template, but only if a local match exists (otherwise return nil):
119
-
120
- take_nowait <template>
121
-
122
- x_final = take_nowait <template> do |x_optimistic|
123
- ...
124
- end
125
-
126
- Note that a local match is still not a guarantee of `x_final == x_optimistic`. Another process may take `x_optimistic` first, and the take will be re-executed. (Think of #take_nowait as a way of saying "take a match, but don't bother trying if there is no match known at this time.") Similarly, #take_nowait returning nil is not a guarantee that a match does not exist: another process could have written a match later than the time of the local search.
127
-
128
- Perform a general transaction:
129
-
130
- result =
131
- transaction do |t|
132
- rval = t.read ... # optimistic value
133
- t.write ...
134
- t.pulse ...
135
- tval = t.take ... # optimistic value
136
- [rval, tval] # pass out result
137
- end
138
-
139
- Note that the block may execute more than once, if there is competition for the tuples that you are trying to #take or #read. When the block exits, however, the transaction is final and universally accepted by all clients.
140
-
141
- Tuples written or taken during a transaction affect subsequent operations in the transaction without modifying the tuplespace or affecting other concurrent transactions (until the transaction completes):
142
-
143
- transaction do |t|
144
- t.write [3]
145
- p t.read [3] # => [3]
146
- p read_all # => [] # note read_all called on client, not trans.
147
- t.take [3]
148
- p t.read_nowait [3] # => nil
149
- end
150
-
151
- Be careful about context within the do...end. If you omit the `|t|` block argument, then all operations are automatically scoped to the transaction, rather than the client. The following is equivalent to the previous example:
152
-
153
- client = self # local var that we can use inside the block
154
- transaction do
155
- write [3]
156
- p read [3]
157
- p client.read_all
158
- take [3]
159
- p read_nowait [3]
160
- end
161
-
162
- You can timeout a transaction:
163
-
164
- transaction timeout: 1 do
165
- read ["does not exist"]
166
- end
167
-
168
- This uses tupelo's internal lightweight scheduler, rather than ruby's heavyweight (one thread per timeout) Timeout, though the latter works with tupelo as well.
169
-
170
- You can also abort a transaction while inside it by calling `#abort` on it:
171
-
172
- write [1]
173
- transaction {take [1]; abort}
174
- read_all # => [[1]]
175
-
176
- Another thread can abort a transaction in progress (to the extent possible) by calling `#cancel` on it. See [example/cancel.rb](example/cancel.rb).
177
-
178
- 4. Run tup with a server file so that two sessions can interact. Do this in two terminals in the same dir:
179
-
180
- $ tup sv
181
-
182
- (The 'sv' argument names a file that the first instance of tup uses to store information like socket addresses and the second instance uses to connect. The first instance starts the servers as child processes. However, both instances appear in the terminal as interactive shells.)
183
-
184
- To do this on two hosts, copy the sv file and, if necessary, edit its connect_host field. You can even do this:
185
-
186
- host1$ tup sv tcp localhost
187
-
188
- host2$ tup host1:path/to/sv --tunnel
189
-
190
-
191
- 5. Look at the examples. You may need to dig a bit to find the gem installation. For example:
192
-
193
- ls -d /usr/local/lib/ruby/gems/*/gems/tupelo*
194
-
195
- Note that all bin and example programs accept blob type (e.g., --msgpack, --json) on command line (it only needs to be specified for server -- the clients discover it). Also, all these programs accept log level on command line. The default is --warn. The --info level is a good way to get an idea of what is happening, without the verbosity of --debug.
196
-
197
- 6. Debugging: in addition to the --info switch on all bin and example programs, bin/tspy is also really useful; it shows all tuplespace events in sequence that they occur. For example, run
198
-
199
- $ tspy sv
200
-
201
- in another terminal after running `tup sv`. The output shows the clock tick, sending client, operation, and operation status (success or failure).
202
-
203
- There is also the similar --trace switch that is available to all bin and example programs. This turns on diagnostic output for each transaction. For example:
204
-
205
- ```
206
- tick cid status operation
207
- 1 2 write ["x", 1]
208
- 2 2 write ["y", 2]
209
- 3 3 take ["x", 1], ["y", 2]
210
- ```
211
-
212
- The `Tupelo.application` command, provided by `tupelo/app`, is the source of all these options and is available to your programs. It's a kind of lightweight process deployment and control framework; however `Tupelo.application` is not necessary to use tupelo.
213
-
214
-
215
- What is a tuplespace?
216
- =====================
217
-
218
- A tuplespace is a service for coordination, configuration, and control of concurrent and distributed systems. The model it provides to processes is a shared space that they can use to communicate in a deterministic and sequential manner. (Deterministic in that all clients see the same, consistent view of the data.) The space contains tuples. The operations on the space are few, but powerful. It's not a database, but it might be a front-end for one or more databases.
219
-
220
- See https://en.wikipedia.org/wiki/Tuple_space for general information and history. This project is strongly influenced by Masatoshi Seki's Rinda implementation, part of the Ruby standard library. See http://pragprog.com/book/sidruby/the-druby-book for a good introduction to rinda and druby.
221
-
222
- See http://dbmsmusings.blogspot.com/2010/08/problems-with-acid-and-how-to-fix-them.html for an explanation of the importance of determinism in distributed transaction systems.
223
-
224
- What is a tuple?
225
- ----------------
226
-
227
- A tuple is the unit of information in a tuplespace. It is immutable in the context of the tuplespace -- you can write a tuple into the space and you can read or take one from the space, but you cannot update a tuple within a space. A tuple does not have an identity other than the data it contains. A tuplespace can contain multiple copies of the same tuple. (In the ruby client, two tuples are considered the same when they are #==.)
228
-
229
- A tuple is either an array:
230
-
231
- ["hello", 7]
232
- [nil, true, false]
233
- ["foo", 3.2, [6,5,4], {"bar" => 3}]
234
-
235
- ... or a hash:
236
-
237
- {name: "Myrtle", location: [100,200]}
238
- { [1,2] => 3, [5,7] => 12 }
239
-
240
- In other words, a tuple is a fairly general object, though this depends on the serializer--see below. More or less, a tuple is anything that can be built out of:
241
-
242
- * strings
243
-
244
- * numbers
245
-
246
- * nil, true, false
247
-
248
- * arrays
249
-
250
- * hashes
251
-
252
- It's kind of like a "JSON object", except that, when using the json serializer, the hash keys can only be strings. In the msgpack case, keys have no special limitations. In the case of the marshal and yaml modes, tuples can contain many other kinds of objects.
253
-
254
- The empty tuples `[]` and `{}` are allowed, but bare values such as `3.14` or `false` are not tuples by themselves.
255
-
256
- One other thing to keep in mind: in the array case, the order of the elements is significant. In the hash case, the order is not significant. So these are both true:
257
-
258
- [1,2] != [2,1]
259
- {a:1, b:2} == {b:2, a:1}
260
-
261
-
262
- What is a template?
263
- -------------------
264
-
265
- A template is an object that matches (or does not match) tuples. It's used for querying a tuplespace. Typically, a template looks just like a tuple, but possibly with wildcards of some sort. The template:
266
-
267
- [3..5, Integer, /foo/, nil]
268
-
269
- would match the tuple:
270
-
271
- [4, 7, "foobar", "xyz"]
272
-
273
- but not these tuples:
274
-
275
- [6, 7, "foobar", "xyz"]
276
- [3, 7.2, "foobar", "xyz"]
277
- [3, 7, "fobar", "xyz"]
278
-
279
- The nil wildcard matches anything. The Range, Regexp, and Class entries function as wildcards because of the way they define the #=== (match) method. See ruby docs for general information on "threequals" matching.
280
-
281
- Every tuple can also be used as a template. The template:
282
-
283
- [4, 7, "foobar", "xyz"]
284
-
285
- matches itself.
44
+ 4. Take a look at the [FAQ](doc/faq.md), [tutorial](doc/tutorial.md), and many [examples](example/).
286
45
 
287
- Here's a template for matching some hash tuples:
288
-
289
- {name: String, location: "home"}
290
-
291
- This would match all tuples whose keys are "name" and "location" and whose values for those keys are any string and the string "home", respectively.
292
-
293
- A template doesn't have to be a tuple pattern with wildcards, though. It can be anything with a #=== method. For example:
294
-
295
- read_all proc {|t| some_predicate(t)}
296
- read_all Hash
297
- read_all Array
298
- read_all Object
299
-
300
- An optional library, `tupelo/util/boolean`, provides a #match_any method to construct the boolean `or` of other templates:
301
-
302
- read_all match_any( [1,2,3], {foo: "bar"} )
303
-
304
- Unlike in some tuplespace implementations, templates are a client-side concept (except for subspace-defining templates), which is a source of efficiency and scalability. Matching operations (which can be computationally heavy) are performed on the client, rather than on the server, which would bottleneck the whole system.
305
-
306
- What are the operations on tuples?
307
- --------------------
308
-
309
- * read - search the space for matching tuples, waiting if none found
310
-
311
- * write - insert the tuple into the space
312
-
313
- * take - search the space for matching tuples, waiting if none found, removing the tuple if found
314
-
315
- * pulse - write and take the tuple; readers see it, but it cannot be taken by other clients, and it cannot be read later (this is not a classical tuplespace operation, but is useful for publish-subscribe communication patterns)
316
-
317
- These operations have a few variations (wait vs nowait) and options (timeouts).
318
-
319
- Transactions and optimistic concurrency
320
- --------------------
321
-
322
- Transactions combine operations into a group that take effect at the same instant in (logical) time, isolated from other transactions.
323
-
324
- However, it may take some time to prepare the transaction. This is true in terms of both real time (clock and process) and logical time (global sequence of operations). Preparing a transaction means finding tuples that match the criteria of the read and take operations. Finding tuples may require searching (locally) for tuples, or waiting for new tuples to be written by others. Also, the transaction may fail even after matching tuples are found (when another process takes tuples of interest). Then the transaction needs to be prepared again. Once prepared, the transaction is sent to all clients, where it may either succeed (in all clients) or fail (for the same reason as before--someone else grabbed one of our tuples). If it fails, then the preparation begins again. A transaction guarantees that, when it completes, all the operations were performed on the tuples at the same logical time. It does not guarantee that the world stands still while one process is inside the `transaction {...}` block.
325
-
326
- Transactions are not just about batching up operations into a more efficient package. A transaction makes the combined operations execute atomically: the transaction finishes only when all of its operations can be successfully performed. Writes and pulses can always succeed, but takes and reads only succeed if the tuples exist.
327
-
328
- Transactions give you a means of optimistic locking: the transaction proceeds in a way that depends on preconditions. See [example/increment.rb](example/increment.rb) for a very simple example. Not only can you make a transaction depend on the existence of a tuple, you can make the effect of the transaction a function of existing tuples (see [example/transaction-logic.rb](example/transaction-logic.rb) and [example/broker-optimistic.rb](example/broker-optimistic.rb)).
329
-
330
- If you prefer classical tuplespace locking, you can simply use certain tuples as locks, using take/write to lock/unlock them. See the examples, such as [example/broker-locking.rb](example/broker-locking.rb). If you have a lot of contention and want to avoid the thundering herd, see [example/lock-mgr-with-queue.rb](example/lock-mgr-with-queue.rb).
331
-
332
- If an optimistic transaction fails (for example, it is trying to take a tuple, but the tuple has just been taken by another transaction), then the transaction block is re-executed, possibly waiting for new matches to the templates. Application code must be aware of the possible re-execution of the block. This is better explained in the examples...
333
-
334
- Transactions have a significant disadvantage compared to using take/write to lock/unlock tuples: a transaction can protect only resources that are represented in the tuplespace, whereas a lock can protect anything: a file, a device, a service, etc. This is because a transaction begins and ends within a single instant of logical (tuplespace) time, whereas a lock tuple can be taken out for an arbitrary duration of real (and logical) time. Furthermore, the instant of logical time in which a transaction takes effect may occur at different wall-clock times on different processes, even on the same host.
335
-
336
- Transactions do have an advantage over using take/write to lock/unlock tuples: there is no possibility of deadlock. See [example/deadlock.rb](example/deadlock.rb) and [example/parallel.rb](example/parallel.rb).
337
-
338
- Another advantage of transactions is that it is possible to guarantee continuous existence of a time-series of tuples. For example, suppose that tuples matching `{step: Numeric}` indicate the progress of some activity. With transactions, you can guarantee that there is exactly one matching tuple at any time, and that no client ever sees an intermediate or inconsistent state of the counter:
339
-
340
- transaction do
341
- step = take(step: nil)["step"]
342
- write step: step + 1
343
- end
344
-
345
- Any client which reads this template will find a (unique) match without blocking.
346
-
347
- Another use of transactions: forcing a retry when something changes:
348
-
349
- transaction do
350
- step = read(step: nil)["step"]
351
- take value: nil, step: step
352
- end
353
-
354
- This code waits on the existence of a value, but retries if the step changes while waiting. See example/pregel/distributed.rb for a use of this technique.
355
-
356
- Tupelo transactions are ACID in the following sense. They are Atomic and Isolated -- this is enforced by the transaction processing in each client. Consistency is enforced by the underlying message sequencer: each client's copy of the space is the deterministic result of the same sequence of operations. This is also known as [sequential consistency](https://en.wikipedia.org/wiki/Sequential_consistency). Durability is optional, but can be provided by the persistent archiver or other clients.
357
-
358
- On the CAP spectrum, tupelo tends towards consistency: for all clients, write and take operations are applied in the same order, so the state of the entire system up through a given tick of discrete time is universally agreed upon. This is known as [state machine replication](http://en.wikipedia.org/wiki/State%20machine%20replication). Of course, because of the difficulties of distributed systems, one client may not yet have seen the same range of ticks as another. Tupelo's replication model (especially in the use of subspaces) can also be described as [virtual synchrony](https://en.wikipedia.org/wiki/Virtual_synchrony).
359
-
360
- Tupelo transactions do not require two-phase commit, because they are less powerful than general transactions. Each client has enough information to decide (in the same way as all other clients) whether the transaction succeeds or fails. This has performance advantages, but imposes some limitations on transactions over subspaces that are known to one client but not another. [Subspaces](doc/subspace.md).
361
46
 
47
+ Applications
48
+ =======
362
49
 
363
- Syntax
364
- ======
50
+ Tupelo is a flexible base layer for various distributed programming paradigms: job queues, dataflow, map-reduce, etc.
365
51
 
366
- You can use tupelo with a simplified syntax, like a "domain-specific language". Each construct with a block can be used in either of two forms, with an explicit block param or without. Compare [example/add-dsl.rb](example/add-dsl.rb) and [example/add.rb](example/add.rb).
367
52
 
368
53
 
369
54
  Advantages
@@ -414,137 +99,13 @@ Future
414
99
 
415
100
  - Investigate nio4r for faster networking, especially with many clients.
416
101
 
417
- - Interoperable client and server implementations in C, Python, Go, .... Elixir?
102
+ - Interoperable client and server implementations in C, Python, Go, Elixir?
418
103
 
419
104
  - UDP multicast to further reduce the bottleneck in the message sequencer.
420
105
 
421
106
  - Tupelo as a service; specialized and replicated subspace managers as services.
422
107
 
423
108
 
424
- Comparisons
425
- ===========
426
-
427
- Redis
428
- -----
429
-
430
- Unlike redis, computations are not a centralized bottleneck. Set intersection, for example.
431
-
432
- Pushing data to client eliminates need for polling, makes reads faster.
433
-
434
- Tupelo's pulse/read ops are like pubsub in redis.
435
-
436
- However, tupelo is not a substitute for the caching functionality of redis and memcache.
437
-
438
-
439
- Rinda
440
- -----
441
-
442
- Very similar api.
443
-
444
- Rinda has a severe bottleneck, though: all matching, waiting, etc. are performed in one process.
445
-
446
- Rinda is rpc-based, which is slower and also more vulnerable due to the extra client-server state; tupelo is implemented on a message layer, rather than rpc. This also helps with pipelined writes.
447
-
448
- Tupelo also supports custom classes in tuples, but only with marshal / yaml; must define #==; see [example/custom-class.rb](example/custom-class.rb)
449
-
450
- Both: tuples can be arrays or hashes.
451
-
452
- Spaces have an advantage over distributed hash tables: different clients may access tuples in terms of different dimensions. For example, a producer generates [producer_id, value]; a consumer looks for [nil, SomeParticularValues]. Separation of concerns, decoupling in the data space.
453
-
454
-
455
- To compare
456
- ----------
457
-
458
- * beanstalkd
459
-
460
- * resque
461
-
462
- * zookeeper -- totally ordered updates; tupelo trades availability for lower latency (?)
463
-
464
- * chubby
465
-
466
- * doozer, etcd
467
-
468
- * serf -- tupelo has lower latency and is transactional, but at a cost compared to serf; tupelo semantics is closer to databases
469
-
470
- * arakoon
471
-
472
- * hazelcast
473
-
474
- * lmax -- minimal spof
475
-
476
- * datomic -- similar distribution of "facts", but not tuplespace; similar use of pluggable storage managers
477
-
478
- * job queues: sidekiq, resque, delayedjob, http://queues.io, https://github.com/factual/skuld
479
-
480
- * pubsubs: kafka
481
-
482
- * spark, storm
483
-
484
- * tibco and gigaspace
485
-
486
- * gridgain
487
-
488
-
489
- Architecture
490
- ============
491
-
492
- Two central processes:
493
-
494
- * message sequencer -- assigns unique increasing IDs to each message (a message is essentially a transaction containing operations on the tuplespace). This is the key to the whole design. By sequencing all transactions in a way that all clients agree with, the transactions can be applied (or rejected) by all clients without further negotiation.
495
-
496
- * client sequencer -- assigns unique increasing IDs to clients when they join the distributed system
497
-
498
- Specialized clients:
499
-
500
- * archiver -- dumps tuplespace state to clients joining the system later than t=0; at least one archiver is required, unless all clients start at t=0.
501
-
502
- * tup -- command line shell for accessing (and creating) tuplespaces
503
-
504
- * tspy -- uses the notification API to watch all events in the space
505
-
506
- * queue / lock / lease managers (see examples)
507
-
508
- General application clients:
509
-
510
- * contain a worker thread and any number of application-level client threads
511
-
512
- * worker thread manages local tuplespace state and requests to modify or access it
513
-
514
- * client threads construct transactions and wait for results (communicating with the worker thread over queues); they may also use asynchronous transactions
515
-
516
- Some design principles:
517
-
518
- * Once a transaction has been sent from a client to the message sequencer, it references only tuples, not templates. This makes it faster and simpler for each receiving client to apply or reject the transaction. Also, clients that do not support local template searching (such as archivers) can store tuples using especially efficient data structures that only support tuple-insert, tuple-delete, and iterate/export operations.
519
-
520
- * Use non-blocking protocols. For example, transactions can be evaluated in one client without waiting for information from other clients. Even at the level of reading messages over sockets, tupelo uses (via funl and object-stream) non-blocking constructs. At the application level, you can use transactions to optimistically modify shared state (but applications are free to use locking if high contention demands it).
521
-
522
- * Do the hard work on the client side. For example, all pattern matching happens in the client that requested an operation that has a template argument, not on the server or other clients.
523
-
524
- Protocol
525
- --------
526
-
527
- Nothing in the protocol specifies local searching or storage, or matching, or notification, or templating. That's all up to each client. The protocol only contains tuples and operations on them (take, write, pulse, read), combined into transactions.
528
-
529
- The protocol has two layers. The outer (message) layer is 6 fields, managed by the funl gem, using msgpack for serialization. All socket reads are non-blocking (using msgpack's stream mode), so a slow sender will not block other activity in the system.
530
-
531
- One of those 6 fields is a data blob, containing the actual transaction and tuple information. The inner (blob) layer manages that field using msgpack (by default), marshal, json, or yaml. This layer contains the transaction operations. The blob is not unpacked by the server, only by clients.
532
-
533
- Each inner serialization method ("blobber") has its own advantages and drawbacks:
534
-
535
- * marshal is ruby only, but can contain the widest variation of objects
536
-
537
- * yaml is portable and humanly readable, and still fairly diverse, but very inefficient
538
-
539
- * msgpack and json (yajl) are both relatively efficient (in terms of packet size, as well as parse/emit time)
540
-
541
- * msgpack and json support the least diversity of objects (just "JSON objects"), but msgpack also supports hash keys that are objects rather than just strings.
542
-
543
- For most purposes, msgpack is a good choice, so it is the default.
544
-
545
- The sending client's tupelo library must make sure that there is no aliasing within the list of tuples (this is only an issue for Marshal and YAML, since msgpack and json do not support references).
546
-
547
-
548
109
  Development
549
110
  ===========
550
111
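Note on the README text removed in this diff: it describes array templates as matching tuples element by element through Ruby's `#===`, with `nil` acting as a wildcard. A minimal sketch of that rule follows. The helper name `template_match?` is hypothetical and is not part of tupelo's API, and tupelo's own matcher may differ in detail.

```ruby
# Hypothetical helper (an assumption for illustration, not tupelo's API):
# returns true when every template entry accepts the tuple entry in the
# same position. A nil entry is treated as a wildcard; any other entry is
# compared with #===, so Range, Regexp, and Class entries act as patterns.
def template_match?(template, tuple)
  return false unless tuple.is_a?(Array) && tuple.size == template.size

  template.zip(tuple).all? do |pattern, value|
    pattern.nil? || pattern === value
  end
end

template = [3..5, Integer, /foo/, nil]

p template_match?(template, [4, 7, "foobar", "xyz"])   # matches
p template_match?(template, [6, 7, "foobar", "xyz"])   # 6 is outside 3..5
p template_match?(template, [3, 7.2, "foobar", "xyz"]) # 7.2 is not an Integer
p template_match?(template, [3, 7, "fobar", "xyz"])    # "fobar" lacks "foo"
```

The tuples are the ones used in the removed README section, so the first call reports a match and the other three do not. A plain tuple also works as a template here, because `#===` falls back to equality for strings and numbers.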
 
data/bin/tup CHANGED
@@ -49,7 +49,7 @@ if ARGV.delete("-h") or ARGV.delete("--help")
49
49
 
50
50
  --trace enable trace output
51
51
 
52
- --tunnel remote clients use ssh tunnels by default (OpenSSH >= 6.0)
52
+ --tunnel remote clients use ssh tunnels by default
53
53
 
54
54
  --pubsub publish/subscribe mode; does not keep local tuple store:
55
55
 
@@ -118,6 +118,15 @@ Tupelo.application(
118
118
  alias tr transaction
119
119
  CMD_ALIASES = %w{ w pl t r ra tr }
120
120
  private *CMD_ALIASES
121
+
122
+ def help
123
+ puts "Command aliases:"
124
+ CMD_ALIASES.each do |m_name|
125
+ m = method(m_name)
126
+ printf "%8s -> %s\n", m.name, m.original_name
127
+ end
128
+ nil
129
+ end
121
130
  end
122
131
 
123
132
  client_opts = {}
@@ -134,7 +143,7 @@ Tupelo.application(
134
143
  use_subspaces! if use_subspaces
135
144
 
136
145
  log.info {"cpu time: %.2fs" % Process.times.inject {|s,x|s+x}}
137
- log.info {"starting shell. Commands: #{TupClient::CMD_ALIASES.join(", ")}"}
146
+ log.info {"starting shell."}
138
147
 
139
148
  require 'tupelo/app/irb-shell'
140
149
  IRB.start_session(self)
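Note on the `help` method added to bin/tup in this diff: it relies on `Method#original_name` to map each alias back to the method it was defined from. The sketch below reproduces that pattern outside tupelo. The class `DemoShell` and its stub methods are assumptions made for illustration; only the body of `help` follows the diff.

```ruby
# Stand-in for the shell client class; the real methods live in tupelo.
class DemoShell
  # Stub operations (hypothetical bodies, just enough to alias).
  def write_wait(*tuples); tuples; end
  def take(template); template; end
  def read_all(template = nil); []; end

  alias w write_wait
  alias t take
  alias ra read_all

  CMD_ALIASES = %w{ w t ra }
  private(*CMD_ALIASES)

  # Same shape as the method added in the diff: Object#method can look up
  # private methods, and original_name reports the pre-alias name.
  def help
    puts "Command aliases:"
    CMD_ALIASES.each do |m_name|
      m = method(m_name)
      printf "%8s -> %s\n", m.name, m.original_name
    end
    nil
  end
end

DemoShell.new.help
```

Returning `nil` keeps an interactive shell from echoing the `CMD_ALIASES` array after the table is printed.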
@@ -2,8 +2,6 @@
2
2
 
3
3
  require 'tupelo/app'
4
4
 
5
- sv = "chat-nohistory.yaml"
6
-
7
5
  Thread.abort_on_exception = true
8
6
 
9
7
  def display_message msg
@@ -13,7 +11,7 @@ def display_message msg
13
11
  puts "#{from}@#{time_str}> #{line}"
14
12
  end
15
13
 
16
- Tupelo.tcp_application services_file: sv do
14
+ Tupelo.tcp_application do
17
15
  me = argv.shift
18
16
 
19
17
  local do
data/example/chat/chat.rb CHANGED
@@ -1,11 +1,23 @@
1
- # Accepts usual tupelo switches (such as --trace, --debug), plus one argument: a
2
- # user name to be shared with other chat clients. New clients see a brief
3
- # history of the chat, as well as new messages from other clients.
1
+ # Network chat program.
4
2
  #
5
3
  # You can run several instances of chat.rb. The first will set up all needed
6
- # services. The rest will connect by referring to a yaml file in the same dir.
7
- # Copy that file to remote hosts (and modify hostnames as needed) for remote
8
- # access. If the first instance is run with "--persist-dir <dir>", messages
4
+ # services, as well as run the chat shell. The rest will connect by referring to
5
+ # the services specified in a yaml file, and then run the chat shell.
6
+ #
7
+ # Usage:
8
+ #
9
+ # ruby chat.rb chat.yaml username
10
+ #
11
+ # For remote clients, you can copy the yaml file, or use scp syntax:
12
+ #
13
+ # ruby chat.rb host:path/to/chat.yaml username
14
+ #
15
+ # The username is shared with other chat clients. New clients see a brief
16
+ # history of the chat, as well as new messages from other clients.
17
+ #
18
+ # Accepts usual tupelo switches (such as --trace, --debug, --tunnel).
19
+ #
20
+ # If the first instance is run with "--persist-dir <dir>", messages
9
21
  # will persist across service shutdown.
10
22
  #
11
23
  # Compare: https://github.com/bloom-lang/bud/blob/master/examples/chat.
@@ -15,7 +27,6 @@
15
27
 
16
28
  require 'tupelo/app'
17
29
 
18
- sv = "chat.yaml"
19
30
  history_period = 60 # seconds -- discard _my_ messages older than this
20
31
 
21
32
  Thread.abort_on_exception = true
@@ -27,7 +38,7 @@ def display_message msg
27
38
  puts "#{from}@#{time_str}> #{line}"
28
39
  end
29
40
 
30
- Tupelo.tcp_application services_file: sv do
41
+ Tupelo.tcp_application do
31
42
  me = argv.shift
32
43
 
33
44
  local do