tupelo 0.19 → 0.20

Files changed (51)
  1. checksums.yaml +4 -4
  2. data/README.md +98 -36
  3. data/bin/tup +1 -7
  4. data/bugs/take-write.rb +8 -0
  5. data/example/bingo/bingo-v2.rb +20 -0
  6. data/example/broker-queue.rb +35 -0
  7. data/example/child-of-child.rb +34 -0
  8. data/example/consistent-hash.rb +0 -2
  9. data/example/counters/lock.rb +24 -0
  10. data/example/counters/merge.rb +35 -0
  11. data/example/counters/optimistic.rb +29 -0
  12. data/example/dataflow.rb +21 -0
  13. data/example/dedup.rb +45 -0
  14. data/example/map-reduce/ex.rb +32 -0
  15. data/example/multi-tier/memo2.rb +0 -2
  16. data/example/pregel/dist-opt.rb +15 -0
  17. data/example/riemann/event-subspace.rb +2 -0
  18. data/example/riemann/expiration-dbg.rb +15 -0
  19. data/example/riemann/producer.rb +34 -13
  20. data/example/riemann/v1/expirer.rb +28 -0
  21. data/example/riemann/{riemann-v1.rb → v1/riemann.rb} +5 -8
  22. data/example/riemann/v2/expirer.rb +31 -0
  23. data/example/riemann/v2/hash-store.rb +33 -0
  24. data/example/riemann/v2/http-mode.rb +53 -0
  25. data/example/riemann/v2/ordered-event-store.rb +128 -0
  26. data/example/riemann/{riemann-v2.rb → v2/riemann.rb} +32 -17
  27. data/example/sqlite/poi-store.rb +160 -0
  28. data/example/sqlite/poi-v2.rb +58 -0
  29. data/example/sqlite/poi.rb +40 -0
  30. data/example/sqlite/tmp/poi-sqlite.rb +33 -0
  31. data/example/subspaces/addr-book-v1.rb +0 -2
  32. data/example/subspaces/addr-book-v2.rb +0 -2
  33. data/example/subspaces/addr-book.rb +0 -2
  34. data/example/subspaces/pubsub.rb +0 -2
  35. data/example/subspaces/ramp.rb +0 -2
  36. data/example/subspaces/shop/shop-v2.rb +0 -2
  37. data/example/subspaces/simple.rb +0 -1
  38. data/example/subspaces/sorted-set-space.rb +5 -0
  39. data/lib/tupelo/app.rb +8 -0
  40. data/lib/tupelo/archiver/persistent-tuplespace.rb +2 -2
  41. data/lib/tupelo/archiver/tuplespace.rb +2 -2
  42. data/lib/tupelo/client/reader.rb +18 -8
  43. data/lib/tupelo/client/subspace.rb +12 -4
  44. data/lib/tupelo/client/transaction.rb +13 -1
  45. data/lib/tupelo/client/worker.rb +27 -4
  46. data/lib/tupelo/client.rb +3 -5
  47. data/lib/tupelo/tuplets/persistent-archiver/tuplespace.rb +5 -0
  48. data/lib/tupelo/version.rb +1 -1
  49. data/test/lib/mock-client.rb +1 -0
  50. metadata +26 -7
  51. data/example/riemann/expirer-v1.rb +0 -25
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: a8a67711763af58aa46979622519450ba198e603
- data.tar.gz: fc399bd00b0b69ec5938a0d3ffae68155c186ae7
+ metadata.gz: 8f1def09de170ac1f78bdc5bfe99374b02566dab
+ data.tar.gz: fe906b6b7f0724a056b93de2144b5b198b77c577
  SHA512:
- metadata.gz: 0f3570788a47744267fc361e25bd50b56bf327177764fb428317cb51ed1870c7e86acc8df53ef28d60278f9c902057b2a5275f86eba29189ea80f28259ddfc9f
- data.tar.gz: 85a3be5a56f49d9518c2ed1409d992654e3f36f80847b441d7f4d84bf29afba541c11164c4052f97e208916e9a2a4cbf57aeef04736207160d4c6b4cd2d69e25
+ metadata.gz: b033a673de3a993f6cec2bdb76beca10797a37c579e6bba09c19073a1eea875e683d24cdd28ca2450378602f3ce74a8870ee14ede361ae35e5cf7e4a95fa460b
+ data.tar.gz: 43e3999fe505e7eca3c483839e6dcbde880f9cdc7791bb2b88541ce43de9e506b47cc16354d595f051456146b45ea4a78422d9419b25e1e028eaa93a51cb716c
data/README.md CHANGED
@@ -1,35 +1,49 @@
  Tupelo
  ==
 
- A tuplespace that is fast, scalable, and language agnostic. Tupelo is designed for distribution of both computation and storage (disk and memory), in a unified language that has both transactional and tuple-operation (read/write/take) semantics.
+ Tupelo is a language-agnostic tuplespace for coordination of distributed programs. It is designed for distribution of both computation and storage, on disk and in memory, with pluggable storage adapters. Its programming model is small and semantically transparent: there are tuples (built from arrays, hashes, and scalars), a few operations on tuples (read, write, take), and transactions composed of these operations. This data-centric model, unlike RPC and most forms of messaging, decouples application endpoints from each other, not only in space and time, but also in referential structure: processes refer to data rather than to other processes.
+
+ Tupelo is inspired by Masatoshi Seki's Rinda in the Ruby standard library, which in turn is based on David Gelernter's Linda. The programming models of Tupelo and Rinda are similar, except for the lack of transactions in Rinda. However, the implementations of the two are nearly opposite in architectural approach.
+
+ This repository contains the reference implementation in Ruby, with documentation, tests, benchmarks, and examples. Implementations in other languages must communicate with this one.
 
- This is the reference implementation in ruby. It should be able to communicate with implementations in other languages.
 
  Documentation
  ============
 
+ Introductory
+ ------------
  * [Tutorial](doc/tutorial.md)
+ * [Examples](example)
  * [FAQ](doc/faq.md)
- * [Comparisons](doc/compare.md)
+
+ In Depth
+ --------
  * [Transactions](doc/transactions.md)
  * [Replication](doc/replication.md)
  * [Subspaces](doc/subspace.md)
  * [Causality](doc/causality.md)
  * [Concurrency](doc/concurrency.md)
- * [Examples](example/)
+
+ Big Picture
+ -----------
+ * [Comparisons](doc/compare.md)
+ * [Planned future work](doc/future.md)
 
  Internals
  ---------
- * [Architecture and protocol](doc/arch.md)
+ * [Architecture](doc/arch.md)
+ * [Protocols](doc/protocol.md)
 
  Talk
  ----
- * [Abstract](sfdc.md) and [slides](doc/sfdc.pdf) for San Francisco Distributed Computing meetup
+ * [Abstract](sfdc.md) and [slides](doc/sfdc.pdf) for San Francisco Distributed Computing meetup, December 2013.
+
 
  Getting started
  ==========
 
- 1. Install ruby 2.0 or 2.1 (not 1.9) from http://ruby-lang.org. Examples and tests will not work on windows (they use fork and unix sockets) or jruby, though probably the underying libs will (using tcp sockets).
+ 1. Install ruby 2.0 or 2.1 (not 1.9) from http://ruby-lang.org. Examples and tests will not work on Windows (they use fork and unix sockets) or JRuby, though probably the underlying libs will (using tcp sockets on Windows).
 
  2. Install the gem and its dependencies (you may need to `sudo` this):
@@ -44,68 +58,114 @@ Getting started
  >> t [nil, nil]
  => ["hello", "world"]
 
- 4. Take a look at the [FAQ](doc/faq.md), [tutorial](doc/tutorial.md), and the many [examples](example/).
+ 4. Take a look at the [FAQ](doc/faq.md), the [tutorial](doc/tutorial.md), and the many [examples](example).
 
 
  Applications
  =======
 
- Tupelo is a flexible base layer for various distributed programming paradigms: job queues, dataflow, map-reduce, etc. Using subspaces, it's also a transactional, replicated datastore with pluggable storage providers.
+ Tupelo is a flexible base layer for various distributed programming patterns and techniques, which are explored in the examples: job queues, shared configuration and state, load balancing, service discovery, in-memory data grids, message queues, publish/subscribe, dataflow, map-reduce, and both optimistic and pessimistic (lock/lease) concurrency control.
 
+ Tupelo can be used to impose a unified transactional structure and distributed access model on a mixture of programs and languages (polyglot computation) and a mixture of data stores (polyglot persistence), with consistent replication.
 
- Advantages
- ==========
 
- Tupelo can be used to impose a unified transactional structure and distributed access model on a mixture of programs and stores. ("Polyglot persistence".) Need examples....
+ Example
+ -------
 
- Speed (latency, throughput):
+ This program counts prime numbers in an interval by distributing the problem to a set of hosts:
 
- * minimal system-wide bottlenecks
+     require 'tupelo/app/remote'
 
- * non-blocking socket reads
+     hosts = %w{itchy scratchy lisa bart} # ssh hosts with key-based auth
 
- * read -- local and hence very fast
+     Tupelo.tcp_application do
+       hosts.each do |host|
+         remote host: host, passive: true, eval: %{
+           require 'prime' # ruby stdlib for prime factorization
+           loop do
+             _, input = take(["input", Integer])
+             write ["output", input, input.prime_division]
+           end
+         }
+       end
 
- * write -- fast, pipelined (waiting for acknowledgement is optional);
+       local do
+         inputs = 1_000_000_000_000 .. 1_000_000_000_200
 
- * transactions -- combine several takes and writes, reducing latency and avoiding locking
+         inputs.each do |input|
+           write ["input", input]
+         end
 
- Can use optimal data structure for each subspace of tuplespace.
+         count = 0
+         inputs.size.times do |i|
+           _, input, factors = take ["output", Integer, nil]
+           count += 1 if factors.size == 1 and factors[0][1] == 1
+           print "\rChecked #{i}"
+         end
 
- Decouples storage from query. (E.g. archiver for storage, optimized for just insert, delete, dump. And in-memory data structure, such as red-black tree, optimized for sorted query.)
+         puts "\nThere are #{count} primes in #{inputs}"
+       end
+     end
 
- Each client can have its own matching agorithms and api -- matching is not part of the comm protocol, which is defined purely in terms of tuples.
+ Ssh is used to set up the remote processes. Additionally, with the `--tunnel` command line argument, all tuple communication is tunneled over ssh. More examples like this are in [example/map-reduce](example/map-reduce).
 
- Data replication is easy--hard to avoid in fact.
 
  Limitations
  ===========
 
- Better for small messages, because they tend to propagate widely.
+ The main limitation of tupelo is that **all network communication passes through a single process**, the message sequencer. This process has minimal state and minimal computation. The state is just a counter and the network connections (no storage of tuples or other application data). The computation is just counter increment and message dispatch (no transaction execution or searches). A transaction requires just one message (possibly with many recipients) to pass through the sequencer. The message sequencer can be light and fast.
+
+ Nevertheless, this process is a bottleneck. Each message traverses two hops, to and from the sequencer. Each tupelo client must be connected to the sequencer to transact on tuples (aside from local reads).
 
- May stress network and local memory (but subspaces can help).
+ **Tupelo will always have this limitation.** It is essential to the design of the system. By accepting this cost, we get some benefits, discussed in the next section.
 
- Worker thread has cpu cost (but subspaces can help).
+ The message sequencer is also a SPoF (single point of failure), but this is not inherent in the design. A future version of tupelo will have options for failover or clustering of the sequencer, perhaps based on [raft](http://raftconsensus.github.io), with a cost of increased latency and complexity. (However, redundancy and failover of *application* data and computation *is* supported by the current implementation; app data and computations are distributed among the client processes.)
 
- What other potential problems and how does tupelo solve them?
+ There are some limitations that may result from naive application of tupelo: high client memory use, high bandwidth use, and high client cpu use. These resource issues can often be controlled with [subspaces](doc/subspace.md) and specialized data structures and data stores. There are several examples addressing these problems. Another approach is to use the tuplespace for low-volume references to high-volume data.
 
+ Also, see the discussion in [transactions](doc/transactions.md) on limitations of transactions across subspaces.
 
- Future
- ======
+ This implementation is also limited in efficiency because of its use of Ruby.
 
- - Subspaces. Redundancy, for read-heavy data stores (redundant array of in-memory sqlite, for example). Clients managing different subspaces may benefit by using different stores and algorithms.
+ Finally, it must be understood that work on tupelo is still in early, experimental stages. **The tupelo software should not yet be relied on for applications where failure resistance and recovery are important.** The current version is suited for things like batch processing (especially complex dataflow topologies), which can be restarted after failure, or other distributed systems that have short lifespans or are disposable.
+
+
+ Benefits
+ ========
 
- - More persistence options.
+ As noted above, the sequencer assigns an incrementing sequence number, or *tick*, to each transaction and dispatches it to the clients, who take on all the burden of tuple computation and storage. This design choice leads to:
 
- - Fail-over. Robustness.
+ * strong consistency: all clients have the same view of the tuplespace at a given tick of the global clock;
 
- - Investigate nio4r for faster networking, especially with many clients.
+ * deterministic transaction execution across processes: transactions complete in two network hops, and transactions reference concrete tuples, not templates or queries that require further searching;
 
- - Interoperable client and server implementations in C, Python, Go, Elixir?
+ * high concurrency: no interprocess locking or coordination is needed to prepare or execute transactions;
 
- - UDP multicast to further reduce the bottleneck in the message sequencer.
+ * efficient distribution of transaction workload off of the critical path: transaction preparation (finding matching tuples) is performed by just the client initiating the transaction, and transaction execution is performed only by clients that subscribe to subspaces relevant to the transaction;
 
- - Tupelo as a service; specialized and replicated subspace managers as services.
+ * client-side logic within transactions: any client state can be accessed while preparing a transaction, and each client is free to use any template and search mechanism (deterministic or not), possibly taking advantage of the client's specialized tuple storage;
+
+ * zero-latency reads: clients store subscribed tuples locally, so searching and waiting for matching tuples are local operations;
+
+ * relatively easy data replication: all subscribers to a subspace replicate that subspace, possibly with different storage implementations;
+
+ * the current state of the tuplespace can be computed from an earlier state by replaying the transactions in sequence;
+
+ * the evolution of system state over time is observable, and tupelo provides the tools to do so: the `--trace` switch, the `#trace` api, and the `tspy` program.
+
+ Additional benefits (not related to message sequencing) include:
+
+ * the `tup` program for interactively starting and connecting to tupelo instances;
+
+ * a framework for starting and controlling child and remote processes connected to the tuplespace;
+
+ * options to tunnel connections over ssh and through firewalls, for running in public clouds and other insecure environments;
+
+ * choice of object serialization method (msgpack, json, marshal, yaml);
+
+ * choice of UNIX or TCP sockets.
+
+ Process control and tunneling are available independently of tupelo using the easy-serve gem.
 
 
  Development
@@ -138,9 +198,11 @@ Other gems:
 
  * yajl-ruby (only used to support --json option)
 
+ * nio4r (optional dependency of funl)
+
  Optional gems for some of the examples:
 
- * sinatra, http, sequel, sqlite, rbtree, leveldb-native
+ * sinatra, json, http, sequel, sqlite, rbtree, leveldb-native, lmdb
 
  Contact
  =======
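The benefits list in the README diff above says that the current state of the tuplespace can be computed from an earlier state by replaying the transactions in sequence. A minimal plain-Ruby sketch of that idea, under stated assumptions (the `ReplaySpace` class and the `[:write, tuple]` / `[:take, tuple]` log format are illustrative, not tupelo internals):

```ruby
# Sketch: a tuplespace as a multiset of tuples, plus a global log of
# sequenced operations. Any client that applies the same log in tick
# order reconstructs the same state -- the determinism the README claims.
class ReplaySpace
  attr_reader :tick

  def initialize
    @counts = Hash.new(0) # tuple => multiplicity (a tuplespace is a multiset)
    @tick = 0
  end

  # Apply one sequenced operation: kind is :write or :take.
  def apply(kind, tuple)
    case kind
    when :write then @counts[tuple] += 1
    when :take
      raise "no tuple #{tuple.inspect} at tick #{@tick}" if @counts[tuple].zero?
      @counts[tuple] -= 1
      @counts.delete(tuple) if @counts[tuple].zero?
    end
    @tick += 1
  end

  def tuples
    @counts.flat_map {|t, n| [t] * n}
  end
end

log = [
  [:write, ["hello", "world"]],
  [:write, [1]],
  [:take,  [1]],
]

space = ReplaySpace.new
log.each {|kind, tuple| space.apply(kind, tuple)}
p space.tick    # => 3
p space.tuples  # => [["hello", "world"]]
```

Replaying a prefix of the log yields the state at any earlier tick, which is also why the `--trace` output shown elsewhere on this page is enough to reconstruct what each client saw.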
data/bin/tup CHANGED
@@ -66,11 +66,9 @@ if ARGV.delete("-h") or ARGV.delete("--help")
      load and save tuplespace to DIR
      (only needs to be set on first tup invocation)
 
- --use-subspaces
-   enable subspaces for this tupelo service
-   (only needs to be set on first tup invocation)
  --subscribe TAG,TAG,...
    subscribe to specified subspaces; use "" for none
+   by default, tup client subscribes to everything
 
  END
  exit
@@ -82,8 +80,6 @@ argv, tupelo_opts = Tupelo.parse_args(ARGV)
 
  pubsub = argv.delete("--pubsub") # not a standard tupelo opt
 
- use_subspaces = argv.delete("--use-subspaces") # not a standard tupelo opt
-
  if i=argv.index("--subscribe") # default is to subscribe to all
    argv.delete("--subscribe")
    subscribed_tags = argv.delete_at(i).split(",")
@@ -156,8 +152,6 @@ Tupelo.application(
    end
 
  local TupClient, **client_opts do
-   use_subspaces! if use_subspaces
-
    log.info {"cpu time: %.2fs" % Process.times.inject {|s,x|s+x}}
    log.info {"starting shell."}
 
data/bugs/take-write.rb CHANGED
@@ -15,5 +15,13 @@ Tupelo.application do
      status, tick, cid, op = note.wait
      p op # should "read [1]", not "write [1]; take [1]"
      # this is just an optimization, not really a bug
+
+     # however, need to be careful about this optimization, since
+     #   transaction {take [1]; take [1]; write [1]}
+     # is not the same as
+     #   transaction {take [1]; read [1]}
+     # but rather more like
+     #   transaction {take [1]; read_distinct [1]}
+     # except #read_distinct is not defined.
    end
  end
data/example/bingo/bingo-v2.rb ADDED
@@ -0,0 +1,20 @@
+
+ __END__
+ # dealer
+ child passive: true do
+   loop do
+     # transaction -- add to v2
+     _, player_id = take ["buy", nil]
+
+   end
+ end
+
+ # player -- in v3 use subspace per player
+ end
+
+ ## how to buy 4 without increasing contention risk
+
+ ## reduce contention for cards by randomizing or cons. hashing?
+
+ ## swapping these lines -> more contention?
+
data/example/broker-queue.rb ADDED
@@ -0,0 +1,35 @@
+ # more like how you would do it in redis, except that the queue is not stored in
+ # the central server, so operations on it are not a bottleneck, FWIW
+
+ require 'tupelo/app'
+
+ N_PLAYERS = 10
+
+ Tupelo.application do
+   N_PLAYERS.times do
+     # sleep rand / 10 # reduce contention -- could also randomize inserts
+     child do
+       me = client_id
+       write name: me
+
+       you = transaction do
+         game = read_nowait(
+           player1: nil,
+           player2: me)
+         break game["player1"] if game
+
+         unless take_nowait name: me
+           raise Tupelo::Client::TransactionFailure
+         end
+
+         you = take(name: nil)["name"]
+         write(
+           player1: me,
+           player2: you)
+         you
+       end
+
+       log "now playing with #{you}"
+     end
+   end
+ end
data/example/child-of-child.rb ADDED
@@ -0,0 +1,34 @@
+ require 'tupelo/app'
+
+ ### need a programmatic way to start up clients
+
+ Tupelo.application do |app|
+
+   app.child do ## local still hangs
+     3.times do |i|
+       app.child do
+         write [i]
+         log "wrote #{i}"
+       end
+     end
+
+     3.times do
+       log take [nil]
+     end
+   end
+ end
+
+ __END__
+
+ this hangs sometimes but not always:
+
+ tick cid status operation
+ A: client 3: wrote 0
+ A: client 4: wrote 1
+ 1 3 batch write [0]
+ 2 4 batch write [1]
+ A: client 2: [0]
+ 3 2 atomic take [0]
+ 4 2 atomic take [1]
+ A: client 2: [1]
+ A: client 5: wrote 2
data/example/consistent-hash.rb CHANGED
@@ -9,8 +9,6 @@ N_ITER = 1000
 
  Tupelo.application do
    local do
-     use_subspaces!
-
      N_BINS.times do |id|
        define_subspace id, [id, Numeric, Numeric]
      end
data/example/counters/lock.rb ADDED
@@ -0,0 +1,24 @@
+ require 'tupelo/app'
+
+ N_PROCS = 3
+ N_ITER = 10
+
+ Tupelo.application do
+   pids = N_PROCS.times.map do
+     child do
+       N_ITER.times do
+         c = take count: nil
+         # unlike optimistic.rb, a system/network failure here
+         # could cause this tuple to be lost. To safeguard: example/lease.rb
+         write count: c["count"] + 1
+         sleep 0.1
+       end
+     end
+   end
+
+   local do
+     write count: 0
+     pids.each {|pid| Process.waitpid pid}
+     log read count: nil
+   end
+ end
data/example/counters/merge.rb ADDED
@@ -0,0 +1,35 @@
+ require 'tupelo/app'
+
+ N_PROCS = 3
+ N_ITER = 10
+
+ Tupelo.application do
+   pids = N_PROCS.times.map do
+     child do
+       N_ITER.times do
+         write count: 1
+         sleep 0.1
+       end
+     end
+   end
+
+   local do
+     # note: no need to init counter
+     pids.each {|pid| Process.waitpid pid}
+
+     # but we cannot read the counter(s) without first merging them
+     # see also example/dedup.rb
+     while transaction do
+       c1 = take_nowait count: nil
+       c2 = take_nowait count: nil
+       if c1 and c2
+         write count: c1["count"] + c2["count"]
+         true
+       elsif c1
+         log c1
+       else
+         log count: 0
+       end
+     end
+   end
+ end
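The merge technique in the diff above treats the counter as a bag of partial `count:` tuples that any client may combine two at a time. A self-contained plain-Ruby sketch of just the merge step (the array stands in for the tuplespace; in the real example each take/take/write is one tupelo transaction, so it is atomic even with concurrent mergers):

```ruby
# Repeatedly take two counter tuples and write back their sum until
# a single tuple remains. Order does not matter because addition is
# commutative and associative -- that is what makes the counters mergeable.
def merge_counters(tuples)
  until tuples.size <= 1
    c1 = tuples.pop                              # take count: nil
    c2 = tuples.pop                              # take count: nil
    tuples.push(count: c1[:count] + c2[:count])  # write the merged count
  end
  tuples.first
end

partials = Array.new(6) { {count: 1} }  # as written by the workers
p merge_counters(partials)  # => {:count=>6}
```

The same shape (take two, write the combination) is used by example/dedup.rb, where the "combination" of two identical tuples is just one of them.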
data/example/counters/optimistic.rb ADDED
@@ -0,0 +1,29 @@
+ require 'tupelo/app'
+
+ N_PROCS = 3
+ N_ITER = 10
+
+ Tupelo.application do
+   pids = N_PROCS.times.map do
+     child do
+       N_ITER.times do
+         transaction do
+           c = take count: nil
+           write count: c["count"] + 1
+         end
+         sleep 0.1
+       end
+     end
+   end
+
+   local do
+     write count: 0
+     # we have to make sure we write this initial counter only once,
+     # which is easy in this case, but not always. See merge.rb for
+     # an approach that allows multiple initializations.
+
+     pids.each {|pid| Process.waitpid pid}
+     # could also use tuples to do this
+     log read count: nil
+   end
+ end
data/example/dataflow.rb ADDED
@@ -0,0 +1,21 @@
+ # http://slashdot.org/topic/bi/how-is-reactive-different-from-procedural-programming
+ # http://developers.slashdot.org/story/14/01/13/2119202/how-reactive-programming-differs-from-procedural-programming
+ # bud and bloom
+
+ # TODO: DSL for expressing data dependency relations
+
+ require 'tupelo/app'
+
+ N_WORKERS = 2
+
+ Tupelo.application do
+   N_WORKERS.times do
+     child passive: true do
+
+     end
+   end
+
+   local do
+     #write
+   end
+ end
data/example/dedup.rb ADDED
@@ -0,0 +1,45 @@
+ # How to deduplicate a tuple.
+ #
+ # Problem: we have some clients, each of which writes the same tuple.
+ # How can we remove duplicates?
+ #
+ # Run with --trace to see how this works.
+
+ require 'tupelo/app'
+
+ N_CLIENTS = 5
+
+ T = [1] # whatever
+
+ Tupelo.application do
+   N_CLIENTS.times do
+     child do
+       unless read_nowait T # try not to write dups, but...
+         write_wait T # T is possibly not unique
+       end
+
+       # After writing T, each client tries to reduce the T population to one
+       # tuple.
+       catch do |done|
+         loop do
+           transaction do
+             if take_nowait T and take_nowait T
+               write T
+             else
+               throw done # don't take or write anything
+             end
+           end
+         end
+       end
+
+       # At the tick on which the last transaction above from this client
+       # completes, there is a unique T, but of course that may change in the
+       # future, for example, if another client's `write_wait T` was delayed in
+       # flight over the network. But in that case, the other client will also
+       # perform the same de-dup code.
+
+       count = read_all(T).size
+       log "count = #{count}"
+     end
+   end
+ end
data/example/map-reduce/ex.rb ADDED
@@ -0,0 +1,32 @@
+ require 'tupelo/app/remote'
+
+ hosts = %w{od1 od2 ut} * 4 # list of ssh hosts; must not require password
+
+ Tupelo.tcp_application do
+   hosts.each do |host|
+     remote host: host, passive: true, eval: %{
+       require 'prime' # ruby stdlib for prime factorization
+       loop do
+         _, input = take(["input", Integer])
+         write ["output", input, input.prime_division]
+       end
+     }
+   end
+
+   local do
+     inputs = 1_000_000_000_000 .. 1_000_000_000_200
+
+     inputs.each do |input|
+       write ["input", input]
+     end
+
+     count = 0
+     inputs.size.times do |i|
+       _, input, factors = take ["output", Integer, nil]
+       count += 1 if factors.size == 1 and factors[0][1] == 1
+       print "\rChecked #{i}"
+     end
+
+     puts "\nThere are #{count} primes in #{inputs}"
+   end
+ end
data/example/multi-tier/memo2.rb CHANGED
@@ -14,8 +14,6 @@ fork do
 
  Tupelo.application do
    local do
-     use_subspaces!
-
      define_subspace("memo", [
        "memo", # tag is encoded in each tuple, for recognizing
        String, # key in the cache, must be string
data/example/pregel/dist-opt.rb ADDED
@@ -0,0 +1,15 @@
+ #
+ # Minor optimization:
+
+ class KeyMatcher
+   def initialize i, n
+     @i = i
+     @n = n
+   end
+
+   def === id
+     id % @n == @i
+   end
+ end
+
+ vertex = take id: v_id_matcher, step: step, rank: nil, active: true
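The KeyMatcher snippet above works because template components are matched with `===`, the same operator `case`/`when` dispatches on, so any object that defines `===` can claim a partition of the id space. A self-contained illustration outside tupelo (the worker-index interpretation is an assumption based on the snippet, not documented behavior):

```ruby
# KeyMatcher as in the snippet above: matches ids whose bucket
# (id mod n) equals this worker's index i.
class KeyMatcher
  def initialize(i, n)
    @i = i
    @n = n
  end

  def ===(id)
    id % @n == @i
  end
end

matcher = KeyMatcher.new(1, 3) # worker 1 of 3
matches = (0..9).select {|id| matcher === id}
p matches  # => [1, 4, 7]

# Because matching goes through #===, it also composes with case/when:
case 7
when KeyMatcher.new(1, 3) then puts "belongs to worker 1"
end
```

The optimization in the diff is that `take id: matcher, ...` lets each worker match only its own vertices locally, instead of taking arbitrary vertex tuples and filtering afterward.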
data/example/riemann/event-subspace.rb CHANGED
@@ -46,6 +46,8 @@ class Tupelo::Client
      custom: nil
    }.freeze
 
+   # Also could be a subspace of the event subspace, but for now, we can just
+   # use it as a template to select expired events out of the event subspace.
    EXPIRED_EVENT = {
      host: nil,
      service: nil,
1
+ class Tupelo::Client
2
+ def run_expiration_debugger
3
+ read Tupelo::Client::EXPIRED_EVENT do |event|
4
+ event_exp = event["time"] + event["ttl"]
5
+ delta = Time.now.to_f - event_exp
6
+ if delta > 0.1
7
+ log.warn "expired late by %6.4f seconds: #{event}" % delta
8
+ elsif delta < 0
9
+ log.warn "expired too soon: #{event}"
10
+ else
11
+ log.info "expired on time: #{event}"
12
+ end
13
+ end
14
+ end
15
+ end