riak-ruby-ledger 0.0.4 → 0.0.5
- checksums.yaml +8 -8
- data/README.md +75 -292
- data/docs/implementation.md +366 -0
- data/docs/release_notes.md +33 -0
- data/docs/riak_counter_drift.md +56 -0
- data/docs/usage.md +110 -0
- data/lib/crdt/tgcounter.rb +64 -18
- data/lib/ledger.rb +8 -8
- data/lib/ledger/version.rb +2 -2
- data/riak-ruby-ledger.gemspec +2 -2
- data/test/lib/{tgcounter_test.rb → crdt/tgcounter_test.rb} +1 -1
- data/test/lib/{tpncounter_test.rb → crdt/tpncounter_test.rb} +25 -24
- data/test/lib/ledger_test.rb +48 -34
- metadata +16 -13
checksums.yaml
CHANGED
@@ -1,15 +1,15 @@
|
|
1
1
|
---
|
2
2
|
!binary "U0hBMQ==":
|
3
3
|
metadata.gz: !binary |-
|
4
|
-
|
4
|
+
YmFiZWE5ODMwODQwODRiYzI3YjBjM2Y5ZDAwMmM0NzI4Y2ZhZDZlNA==
|
5
5
|
data.tar.gz: !binary |-
|
6
|
-
|
6
|
+
ZDNhNDI3MzA5NWZjY2U3NGRhYjAxNGMzYzVhOTEwMGUzZTZjYTA5OQ==
|
7
7
|
SHA512:
|
8
8
|
metadata.gz: !binary |-
|
9
|
-
|
10
|
-
|
11
|
-
|
9
|
+
YjI5MzU1MzVhYzZhZDY4Y2UwMzJiNGJkMDc3YjdjY2JmNmU5MTM4MzM4ZjQ3
|
10
|
+
OTRjYjI5Mzg5NTI4ZGIxY2M5MjUwNmY3YWVlZmRhNDI2OWViZmIzZWZhYjQw
|
11
|
+
ODE1NTFhYWQyM2YwZDU0OGVhYzQyYjUxMTgxZjRiMTcwZmNmNGE=
|
12
12
|
data.tar.gz: !binary |-
|
13
|
-
|
14
|
-
|
15
|
-
|
13
|
+
ZjRmNjQ3NmJhNzlmNTBjMWNmMGM2ZWFhNWRhNjJiMGQxMzFhNmIzYzI1M2Fk
|
14
|
+
MGM2YzU1YWEwNzRmYzYxYTVmMzdhYzc2NTFhNzdiMjZhNDNiYTliMjQwNjFj
|
15
|
+
MThjZTAxMzc2MDFhMTk5ODdiZjM5NWY2ZDJiYzcyZjg4NDc1MTM=
|
data/README.md
CHANGED
@@ -1,49 +1,94 @@
|
|
1
1
|
# Riak-Ruby-Ledger
|
2
2
|
|
3
|
-
|
3
|
+
An alternative to Riak Counters with idempotent writes within a client-defined window.
|
4
4
|
|
5
|
-
|
5
|
+
# Summary
|
6
6
|
|
7
|
-
###
|
7
|
+
### Quick Links
|
8
8
|
|
9
|
-
|
9
|
+
Below are a few documents that are relevant to this gem, **please read before considering using this gem for anything important**.
|
10
10
|
|
11
|
-
|
12
|
-
CRDT PNCounters (two GCounters) such as Riak Counters are non-idempotent, and store nothing about a counter transaction other than the final value. As such it doesn't make sense to use them to store any counter that needs to be accurate.
|
11
|
+
##### Riak Ruby Ledger Docs
|
13
12
|
|
14
|
-
|
15
|
-
|
13
|
+
Document Link | Description
|
14
|
+
--- | ---
|
15
|
+
[[docs/riak_counter_drift.md]](https://github.com/drewkerrigan/riak-ruby-ledger/blob/master/docs/riak_counter_drift.md) | Why Riak Counters may or may not work for your use case (Counter Drift).
|
16
|
+
[[docs/implementation.md]](https://github.com/drewkerrigan/riak-ruby-ledger/blob/master/docs/implementation.md) | Implementation details about this gem as well as some of the reasoning behind the approach.
|
17
|
+
[[docs/usage.md]](https://github.com/drewkerrigan/riak-ruby-ledger/blob/master/docs/usage.md) | Suggested usage of this gem from your application, and implications of changing various settings.
|
18
|
+
[[docs/release_notes.md]](https://github.com/drewkerrigan/riak-ruby-ledger/blob/master/docs/release_notes.md) | Information about what changed in each version.
|
16
19
|
|
17
|
-
|
18
|
-
By allowing clients to set how many transactions to keep in the counter object as well as set a retry policy on the Riak actions performed on the counter, a good balance can be achieved. The `Riak::Ledger` class in this gem can be instantiated with the following options:
|
20
|
+
### Counter Drift
|
19
21
|
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
22
|
+
**Why shouldn't I use Riak Counters?**
|
23
|
+
|
24
|
+
CRDT PNCounters (two plain GCounters) such as Riak Counters are non-idempotent and store nothing about a counter transaction other than the final value. This means that if an increment operation fails in any number of ways (500 response from server, process that made the call dies, network connection is interrupted, operation times out, etc), your application now has no idea whether or not the increment actually happened.
|
25
|
+
|
26
|
+
**What is Counter Drift?**
|
27
|
+
|
28
|
+
In the above situation of a failed increment operation, your application has two choices:
|
29
|
+
|
30
|
+
1. Retry the operation: This could result in the operation occurring twice, causing what is called **positive counter drift**
|
31
|
+
2. Don't retry the operation: This could result in the operation never occurring at all, causing **negative counter drift**
|
32
|
+
|
33
|
+
As such it doesn't make sense to use plain GCounters or PNCounters to store any counter that needs to be accurate.
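The two choices above can be illustrated with a minimal Ruby sketch. `FlakyCounter` is a hypothetical stand-in for a non-idempotent counter (it is not part of this gem or of riak-client): its `increment` applies the write and then reports failure, which is the ambiguous case described above.

```ruby
# Hypothetical stand-in for a non-idempotent counter; not part of the gem.
class FlakyCounter
  attr_reader :value

  def initialize
    @value = 0
  end

  # Applies the increment, then reports failure to the caller, simulating
  # a write that succeeded on the server but whose response was lost.
  def increment(amount)
    @value += amount
    false
  end
end

# Choice 1: retry once after the ambiguous failure.
retried = FlakyCounter.new
retried.increment(50) || retried.increment(50)
# The counter now holds 100 although only 50 was intended (positive drift).

# Choice 2 is the mirror image: if the first write had really been lost and
# the application never retried, the counter would stay at 0 (negative drift).
```

Because the counter stores only a number, nothing in its state lets the caller tell the two situations apart after the fact.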
|
34
|
+
|
35
|
+
***More information about Riak Counters and Drift***: [[docs/riak_counter_drift.md]](https://github.com/drewkerrigan/riak-ruby-ledger/blob/master/docs/riak_counter_drift.md)
|
36
|
+
|
37
|
+
### Implementation
|
38
|
+
|
39
|
+
The data type implemented is a PNCounter CRDT with an ordered array of transactions for each GCounter actor. Transaction ids are stored with the GCounter, so operations against this counter are idempotent while the transaction remains in any actor's array.
|
40
|
+
|
41
|
+
**High Level API**
|
42
|
+
|
43
|
+
Function | Description
|
44
|
+
--- | ---
|
45
|
+
`Riak::Ledger.new` | Creates a new Ledger instance
|
46
|
+
`Riak::Ledger.find!` | Finds an existing Ledger in Riak, merges it locally, and then writes the merged value back to Riak
|
47
|
+
`#credit!`, `#debit!`, `#update!` | Reads the existing state of the ledger from Riak, merges it locally, and adds a new `transaction` and positive or negative `value`
|
48
|
+
|
49
|
+
**Ledger Options**
|
50
|
+
|
51
|
+
Name | Description
|
52
|
+
--- | ---
|
53
|
+
`:retry_count`[Integer] | When a write to Riak is a "maybe" (500, timeout, or any other error condition), resubmit the request `:retry_count` number of times, and return false if it is still unsuccessful
|
54
|
+
`:history_length`[Integer] | Keep up to `:history_length` number of transactions in each actor's section of the underlying GCounter. When the (`:history_length` + 1)th transaction is written then merged, add the oldest transaction's value to the actor's total
|
55
|
+
|
56
|
+
***More information about the implementation and how edge cases can be avoided***: [[docs/implementation.md]](https://github.com/drewkerrigan/riak-ruby-ledger/blob/master/docs/implementation.md)
|
57
|
+
|
58
|
+
### Suggested Usage and Configuration
|
59
|
+
|
60
|
+
Depending on your use case, you may want to tweak the configuration options `:history_length` and `:retry_count`.
|
61
|
+
|
62
|
+
The default `:history_length` is 10. This means that if a transaction fails, but your application is unable to determine whether or not the counter was actually incremented, you have a buffer space or window of 9 additional transactions on that counter before you can no longer retry the original failed transaction without assuming counter drift is happening.
|
25
63
|
|
26
|
-
|
64
|
+
The default `:retry_count` is also 10. This means that if a transaction fails, the actor that attempted the transaction will continue trying 9 more times. If the request to change the counter still fails after the 10th try, the operation will return `false` for failure. At this point your application can attempt to try the transaction again, or return a failure to the user with a note that the transaction will be retried in the future.
|
27
65
|
|
28
|
-
|
66
|
+
An example of a failure might look like the following:
|
29
67
|
|
30
|
-
|
68
|
+
1. transaction1 fails with actor1, and because of the nature of the failure, your application is unsure whether or not the counter was actually incremented.
|
31
69
|
|
32
|
-
|
70
|
+
1. If your `:retry_count` is low, you can quickly determine in your application that something went wrong, and inform the user that the transaction was unsuccessful for now, but will be attempted later
|
71
|
+
2. If your `:retry_count` is high, the user will be kept waiting longer, but the odds of the transaction eventually working are higher
|
72
|
+
2. If after the initial retries, the transaction was still a failure, your application must decide what to do next
|
33
73
|
|
34
|
-
|
74
|
+
1. If your `:history_length` is low, your options are limited. You must continue to retry that same failed transaction for that user (using any available actor) until it is successful. If you allow additional transactions to take place on the same counter before retrying, you run a high risk of counter drift.
|
75
|
+
2. If your `:history_length` is medium-high, then you have an allowance of (`:history_length` - 1) additional transactions for that counter before you run the risk of counter drift.
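The failure handling described above could be sketched at the application level roughly as follows. This is an assumption-laden illustration, not part of the gem: `StubLedger`, `credit_or_defer`, and `flush_pending` are hypothetical names, and the stub only mimics the documented behavior that `#credit!` returns `false` once its retries are exhausted.

```ruby
# StubLedger is a hypothetical stand-in so the sketch runs without Riak; it
# mimics only the documented behavior that credit! returns false on failure.
StubLedger = Struct.new(:outcomes) do
  def credit!(_txn_id, _amount)
    outcomes.shift
  end
end

# Try a credit; if it still fails after the ledger's own retries, remember
# the exact same transaction id so it can be retried before newer ones.
def credit_or_defer(ledger, txn_id, amount, pending)
  return true if ledger.credit!(txn_id, amount)
  pending << [txn_id, amount]
  false
end

# Re-submit deferred transactions in order, stopping at the first failure.
def flush_pending(ledger, pending)
  until pending.empty?
    txn_id, amount = pending.first
    break unless ledger.credit!(txn_id, amount)
    pending.shift
  end
  pending.empty?
end

pending = []
ledger = StubLedger.new([false, true])
credit_or_defer(ledger, "txn1", 10, pending) # first attempt fails, txn1 is deferred
flush_pending(ledger, pending)               # retry with the same id succeeds
```

Reusing the original transaction id on every retry is what lets the ledger recognize a write that had in fact already succeeded, as long as the retry arrives within the `:history_length` window.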
|
35
76
|
|
36
|
-
|
77
|
+
**Note**
|
37
78
|
|
38
|
-
|
79
|
+
This gem cannot guarantee transaction idempotence of a counter for greater than `:history_length` number of transactions.
|
39
80
|
|
40
|
-
|
81
|
+
***More information about configuration and implications of changing various settings***: [[docs/usage.md]](https://github.com/drewkerrigan/riak-ruby-ledger/blob/master/docs/usage.md)
|
41
82
|
|
42
|
-
|
83
|
+
### Additional Reading
|
43
84
|
|
44
|
-
|
85
|
+
Document Link | Description
|
86
|
+
--- | ---
|
87
|
+
[[http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf](http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf)] | CRDT paper from Shapiro et al. at INRIA
|
88
|
+
[[http://basho.com/counters-in-riak-1-4/](http://basho.com/counters-in-riak-1-4/)] | Riak Counters
|
89
|
+
[[github.com/basho/riak_dt](https://github.com/basho/riak_dt)] | Other Riak Data Types
|
45
90
|
|
46
|
-
|
91
|
+
# Installation
|
47
92
|
|
48
93
|
Add this line to your application's Gemfile:
|
49
94
|
|
@@ -57,7 +102,7 @@ Or install it yourself as:
|
|
57
102
|
|
58
103
|
$ gem install riak-ruby-ledger
|
59
104
|
|
60
|
-
|
105
|
+
# Usage
|
61
106
|
|
62
107
|
### Initialize
|
63
108
|
|
@@ -65,7 +110,7 @@ Or install it yourself as:
|
|
65
110
|
require 'riak' # riak-client gem
|
66
111
|
require 'ledger' # riak-ruby-ledger gem
|
67
112
|
|
68
|
-
# Name your
|
113
|
+
# Name each of your threads
|
69
114
|
Thread.current["name"] = "ACTOR1"
|
70
115
|
|
71
116
|
# Create a Riak::Client instance
|
@@ -148,272 +193,10 @@ ledger.has_transaction? "txn6" #true
|
|
148
193
|
ledger.delete()
|
149
194
|
```
|
150
195
|
|
151
|
-
|
152
|
-
|
153
|
-
### When to use Riak Counters
|
154
|
-
|
155
|
-
Riak Counters are very well suited for certain problems:
|
156
|
-
|
157
|
-
* Facebook likes
|
158
|
-
* Youtube views
|
159
|
-
* Reddit upvotes
|
160
|
-
* Twitter followers
|
161
|
-
* Any non-critical counts
|
162
|
-
* Counts that do not adversely affect applications or users when off by a few
|
163
|
-
|
164
|
-
### When not to use Riak Counters
|
165
|
-
|
166
|
-
* Currency (virtual or real) balances
|
167
|
-
* Metrics that result in charging a customer
|
168
|
-
* Keeping track of how many calls are made to a paid API endpoint
|
169
|
-
* Storage used by a user
|
170
|
-
* Real-time counts
|
171
|
-
* Any critical counts
|
172
|
-
* Counts that must be accurate
|
173
|
-
|
174
|
-
### Counter Drift
|
175
|
-
|
176
|
-
Riak Counters as currently implemented are not ***idempotent***. This simply means that you cannot retry the same increment or decrement operation more than once.
|
177
|
-
|
178
|
-
Take the following scenario into consideration:
|
179
|
-
|
180
|
-
1. User buys an in-game item that costs 50 gold, and has a current balance of 100 gold
|
181
|
-
2. Application server attempts to debit user's account 50 gold
|
182
|
-
a. If Riak successfully returns 200 response code, no problem!
|
183
|
-
b. If Riak returns 500 (or any other error code), we don't have any way of knowing whether or not the operation succeeded
|
184
|
-
c. If the application server fails at any point in the execution, we also don't have a good way of knowing whether or not the operation succeeded
|
185
|
-
|
186
|
-
In the case of 2b and 2c, we have the following choices:
|
187
|
-
|
188
|
-
* Retry the operation (Risk absolute positive drift)
|
189
|
-
* If the original counter decrement was successful, we have now debited the user's balance twice, thereby charging them 100 gold for a 50 gold item
|
190
|
-
* Never retry (Risk absolute negative drift)
|
191
|
-
* If the original counter decrement was unsuccessful, we gave the user an item for free
|
192
|
-
|
193
|
-
## Idempotent Counters
|
194
|
-
|
195
|
-
There are several approaches to making counters varying degrees of idempotent, the ones relative to the goals of this gem described here.
|
196
|
-
|
197
|
-
### Definitions
|
198
|
-
|
199
|
-
* ***Transaction id***: Globally unique externally generated transaction id that is available per counter action (increment or decrement)
|
200
|
-
* ***Actor***: A thread, process, or server that is able to serially perform actions (a single actor can never perform actions in parallel with itself)
|
201
|
-
* ***Sibling***: In Riak, when you write to the same key without specifying a vector clock, a sibling is created. This is denoted below as `[...sibling1..., ...sibling2...]`.
|
202
|
-
|
203
|
-
### Approach 1: Ensure idempotent counter actions at any time, by any actor
|
204
|
-
|
205
|
-
This is possible if the entire transaction history is stored inside of the counter object:
|
206
|
-
|
207
|
-
Actor 1 writes txn1: 50
|
208
|
-
|
209
|
-
```
|
210
|
-
{"txn1": 50}
|
211
|
-
```
|
212
|
-
|
213
|
-
Actor 2 writes txn1: 50, txn2: 100
|
214
|
-
|
215
|
-
```
|
216
|
-
[
|
217
|
-
#sibling 1
|
218
|
-
{"txn1": 50},
|
219
|
-
#sibling 2
|
220
|
-
{"txn1": 50, "txn2": 100}
|
221
|
-
]
|
222
|
-
```
|
223
|
-
|
224
|
-
Actor 1 reads and merges value
|
225
|
-
|
226
|
-
```
|
227
|
-
{"txn1": 50, "txn2": 100}
|
228
|
-
```
|
229
|
-
|
230
|
-
Total: 150
|
231
|
-
|
232
|
-
This is not a counter, but a ***GSet***, because the entire set of transactions needs to be stored with the object. The total for a counter is defined by the sum of the entire set of values
|
233
|
-
|
234
|
-
***Pros***:
|
235
|
-
|
236
|
-
* Retry any action at any time by any actor in the system.
|
237
|
-
* Optimize for writes: No need to read the value prior to writing a new transaction.
|
238
|
-
|
239
|
-
***Cons***:
|
240
|
-
|
241
|
-
* GSet sizes can become too large for ruby to handle. If more than ~1000 transactions are expected for a single counter, this approach should not be used
|
242
|
-
|
243
|
-
|
244
|
-
### Approach 2a: Ensure idempotent counter actions by any actor, for the current transaction
|
245
|
-
|
246
|
-
In this approach, the transaction id is stored per actor for the most recently written transaction
|
247
|
-
|
248
|
-
Actor 1 writes txn1: 50
|
249
|
-
|
250
|
-
```
|
251
|
-
Actor1: {"total": 0} {"txn1": 50}
|
252
|
-
```
|
253
|
-
|
254
|
-
Actor 2 attempts to write txn1: 50
|
255
|
-
|
256
|
-
Actor 2 reads current value and sees that txn1 has already been written, ignores it's own txn1
|
257
|
-
|
258
|
-
Actor 2 writes merged value
|
259
|
-
|
260
|
-
```
|
261
|
-
Actor1: {"total": 0} {"txn1": 50}
|
262
|
-
```
|
263
|
-
|
264
|
-
Actor 2 Writes txn2: 100
|
265
|
-
|
266
|
-
```
|
267
|
-
Actor1: {"total": 0} {"txn1": 50}
|
268
|
-
Actor2: {"total": 0} {"txn2": 100}
|
269
|
-
```
|
270
|
-
|
271
|
-
Actor 2 Reads current value, and writes txn3: 10 along with it's own merged data
|
272
|
-
|
273
|
-
```
|
274
|
-
Actor1: {"total": 0} {"txn1": 50}
|
275
|
-
Actor2: {"total": 100} {"txn3": 10}
|
276
|
-
```
|
277
|
-
|
278
|
-
Actor 1 reads and merges value
|
279
|
-
|
280
|
-
```
|
281
|
-
Actor1: {"total": 0} {"txn1": 50}
|
282
|
-
Actor2: {"total": 100} {"txn3": 10}
|
283
|
-
```
|
284
|
-
|
285
|
-
Total: 160
|
286
|
-
|
287
|
-
***Pros***:
|
288
|
-
|
289
|
-
* Retry an action with any actor in the system, assuming the actions are serialized per counter
|
290
|
-
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast
|
291
|
-
|
292
|
-
***Cons***:
|
293
|
-
|
294
|
-
* Counter drift is a possibility in the case where transaction 1 fails, several other transactions succeed without retrying transaction 1, and then transaction 1 is tried again
|
295
|
-
|
296
|
-
### Approach 2b: Ensure idempotent counter actions by any actor, for the previous `X` transactions
|
297
|
-
|
298
|
-
This approach is the same as 2a, but instead of only storing the most previous transaction, we store the most previous `X` transactions. In this example we'll use X=5
|
299
|
-
|
300
|
-
Actor 1 writes txn1: 50, txn2: 10, txn3: 100 (order is preserved using an array instead of a hash for transactions)
|
301
|
-
|
302
|
-
```
|
303
|
-
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]}
|
304
|
-
```
|
305
|
-
|
306
|
-
Actor 2 attempts to write txn1: 50
|
307
|
-
|
308
|
-
Actor 2 reads current value and sees that txn1 has already been written, ignores it's own txn1
|
309
|
-
|
310
|
-
Actor 2 writes merged value
|
311
|
-
|
312
|
-
```
|
313
|
-
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
|
314
|
-
```
|
315
|
-
|
316
|
-
Actor 2 Writes txn4: 100
|
317
|
-
|
318
|
-
```
|
319
|
-
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
|
320
|
-
Actor2: {"total": 0} [["txn4", 100]]
|
321
|
-
```
|
322
|
-
|
323
|
-
Actor 1 Writes txn5: 20, txn6: 20
|
324
|
-
|
325
|
-
```
|
326
|
-
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20]]
|
327
|
-
Actor2: {"total": 0} [["txn4", 100]]
|
328
|
-
```
|
329
|
-
|
330
|
-
Actor 1 Writes txn7: 30, and writes it's own merged data
|
331
|
-
|
332
|
-
```
|
333
|
-
Actor1: {"total": 50} [["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20], ["txn7", 30]]
|
334
|
-
Actor2: {"total": 0} [["txn4", 100]]
|
335
|
-
```
|
336
|
-
|
337
|
-
Actor 1 reads and merges value
|
338
|
-
|
339
|
-
```
|
340
|
-
Actor1: {"total": 50} [["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20], ["txn7", 30]]
|
341
|
-
Actor2: {"total": 0} [["txn4", 100]]
|
342
|
-
```
|
343
|
-
|
344
|
-
Total: 330
|
345
|
-
|
346
|
-
***Pros***:
|
347
|
-
|
348
|
-
* Retry an action with any actor in the system, for the last X actions
|
349
|
-
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast
|
350
|
-
|
351
|
-
***Cons***:
|
352
|
-
|
353
|
-
* Counter drift is a possibility in the case where transaction 1 fails, X + 1 actions occur, and transaction 1 is retried
|
354
|
-
|
355
|
-
### Approach 3: Ensure idempotent counter actions by a single actor, for the current transaction
|
356
|
-
|
357
|
-
In this approach, a globally unique transaction id is no longer required, because we are assuming that only a single actor can ever be responsible for a single transaction
|
358
|
-
|
359
|
-
Actor 1 writes request1: 50, the request shows an error, but the write actually occurred in Riak
|
360
|
-
|
361
|
-
```
|
362
|
-
Actor1: {"total": 0} {"request1": 50}
|
363
|
-
```
|
364
|
-
|
365
|
-
Actor 1 retries request1: 50, the request succeeds, but since request1 is already there, it is ignored and returns a success to the client
|
366
|
-
|
367
|
-
```
|
368
|
-
Actor1: {"total": 0} {"request1": 50}
|
369
|
-
```
|
370
|
-
|
371
|
-
Actor 1 writes request2: 100, the request succeeds
|
372
|
-
|
373
|
-
```
|
374
|
-
Actor1: {"total": 50} {"request2": 100}
|
375
|
-
```
|
376
|
-
|
377
|
-
Actor 2 writes request3: 10. Since request ids are only unique to the actor, no cross-actor uniqueness check can be made.
|
378
|
-
|
379
|
-
```
|
380
|
-
Actor1: {"total": 50} {"request2": 100}
|
381
|
-
Actor2: {"total": 0} {"request3": 10}
|
382
|
-
```
|
383
|
-
|
384
|
-
Actor 2 Writes request4: 100
|
385
|
-
|
386
|
-
```
|
387
|
-
Actor1: {"total": 50} {"request2": 100}
|
388
|
-
Actor2: {"total": 10} {"request4": 100}
|
389
|
-
```
|
390
|
-
|
391
|
-
Actor 1 reads and merges value
|
392
|
-
|
393
|
-
```
|
394
|
-
Actor1: {"total": 50} {"request2": 100}
|
395
|
-
Actor2: {"total": 10} {"request4": 100}
|
396
|
-
```
|
397
|
-
|
398
|
-
Total: 260
|
399
|
-
|
400
|
-
***Pros***:
|
401
|
-
|
402
|
-
* No reliance on an external globally unique transaction id
|
403
|
-
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast
|
404
|
-
|
405
|
-
***Cons***:
|
406
|
-
|
407
|
-
* Counter drift is a possibility if any action is retried by someone other than the current actor during it's current transaction
|
408
|
-
|
409
|
-
## Conclusion
|
410
|
-
|
411
|
-
In order to attempt to best meet the requirements of *most* counters that cannot be satisfied with Riak Counters, this gem implements approach ***2b*** as it should handle the most likely retry scenarios for most applications.
|
412
|
-
|
413
|
-
## Contributing
|
196
|
+
# Contributing
|
414
197
|
|
415
198
|
1. Fork it
|
416
199
|
2. Create your feature branch (`git checkout -b my-new-feature`)
|
417
200
|
3. Commit your changes (`git commit -am 'Add some feature'`)
|
418
201
|
4. Push to the branch (`git push origin my-new-feature`)
|
419
|
-
5. Create new Pull Request
|
202
|
+
5. Create new Pull Request
|
@@ -0,0 +1,366 @@
|
|
1
|
+
## Implementation
|
2
|
+
|
3
|
+
### Summary
|
4
|
+
|
5
|
+
The data type implemented is a PNCounter CRDT with an ordered array of transactions for each GCounter actor. Transaction ids are stored with the GCounter, so operations against this counter are idempotent while the transaction remains in any actor's array.
|
6
|
+
|
7
|
+
**High Level API**
|
8
|
+
|
9
|
+
Function | Description
|
10
|
+
--- | ---
|
11
|
+
`Riak::Ledger.new` | Creates a new Ledger instance
|
12
|
+
`Riak::Ledger.find!` | Finds an existing Ledger in Riak, merges it locally, and then writes the merged value back to Riak
|
13
|
+
`#credit!`, `#debit!`, `#update!` | Reads the existing state of the ledger from Riak, merges it locally, and adds a new `transaction` and positive or negative `value`
|
14
|
+
|
15
|
+
**Ledger Options**
|
16
|
+
|
17
|
+
Name | Description
|
18
|
+
--- | ---
|
19
|
+
`:retry_count`[Integer] | When a write to Riak is a "maybe" (500, timeout, or any other error condition), resubmit the request `:retry_count` number of times, and return false if it is still unsuccessful
|
20
|
+
`:history_length`[Integer] | Keep up to `:history_length` number of transactions in each actor's section of the underlying GCounter. When the (`:history_length` + 1)th transaction is written then merged, add the oldest transaction's value to the actor's total
|
21
|
+
|
22
|
+
### GCounters
|
23
|
+
|
24
|
+
A typical GCounter data structure looks something like this:
|
25
|
+
|
26
|
+
```
|
27
|
+
{
|
28
|
+
"actor1": 10,
|
29
|
+
"actor2": 20,
|
30
|
+
"actor3": 5
|
31
|
+
}
|
32
|
+
```
|
33
|
+
|
34
|
+
Since no actor can affect any other actor's total, this is a safe way to increment a single number concurrently. The total value of this counter is the sum of all actors' totals.
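A minimal plain-Ruby sketch of this structure follows. `GCounterSketch` is a hypothetical name, not the gem's `TGCounter`; the per-actor maximum used in `merge` is the standard GCounter merge rule from the CRDT literature rather than something taken from this gem's code.

```ruby
# Minimal GCounter sketch (illustrative; not the gem's TGCounter).
class GCounterSketch
  attr_reader :counts

  def initialize(counts = {})
    @counts = Hash.new(0)
    @counts.update(counts)
  end

  # An actor may only increase its own entry.
  def increment(actor, amount = 1)
    raise ArgumentError, "amount must be non-negative" if amount < 0
    @counts[actor] += amount
  end

  # The counter's value is the sum of every actor's total.
  def value
    @counts.values.inject(0, :+)
  end

  # Merging two replicas keeps the larger total for each actor.
  def merge(other)
    merged = @counts.merge(other.counts) { |_actor, mine, theirs| [mine, theirs].max }
    GCounterSketch.new(merged)
  end
end

counter = GCounterSketch.new("actor1" => 10, "actor2" => 20, "actor3" => 5)
counter.value # => 35
```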
|
35
|
+
|
36
|
+
### PNCounters
|
37
|
+
|
38
|
+
Because GCounters only allow for a counter to increment, a simple way to allow for decrements is to use two GCounters. A PNCounter is defined by two GCounters, one for increments, and one for decrements.
|
39
|
+
|
40
|
+
```
|
41
|
+
{
|
42
|
+
"p": <GCounter>,
|
43
|
+
"n": <GCounter>,
|
44
|
+
}
|
45
|
+
```
|
46
|
+
|
47
|
+
"p" is for positive, and "n" is for negative, so the current value of a PNCounter is defined by P minus N.
|
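As a rough illustration of "P minus N", the following sketch computes a PNCounter's value from two per-actor hashes. The sample numbers and the `pn_value` helper are made up for this example.

```ruby
# PNCounter sketch: two per-actor hashes, "p" for increments and "n" for decrements.
pn = {
  "p" => { "actor1" => 10, "actor2" => 20 },
  "n" => { "actor1" => 3 }
}

# Hypothetical helper: sum each GCounter, then subtract N from P.
def pn_value(pn)
  sum = ->(gcounter) { gcounter.values.inject(0, :+) }
  sum.call(pn["p"]) - sum.call(pn["n"])
end

pn_value(pn) # => 27
```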
48
|
+
|
49
|
+
### TPNCounter and TGCounter (unique to this gem and its functionality)
|
50
|
+
|
51
|
+
For idempotent operations over a limited window of transactions, an array of transactions can be stored with each actor's counter value. The mechanics of the GCounter are unchanged, but the method with which the single total value for an actor gets incremented is dependent upon the current size of the transaction list.
|
52
|
+
|
53
|
+
The new data structure for the GCounter portion of this gem's PNCounter looks like this:
|
54
|
+
|
55
|
+
```
|
56
|
+
{
|
57
|
+
"actor1": {"total": 10, "txns": [["txn1", 5], ["txn2", 1], ["txn3", 10]]},
|
58
|
+
"actor2": {"total": 20, "txns": [["txn4", 5], ["txn5", 1], ["txn6", 10]]},
|
59
|
+
"actor3": {"total": 5, "txns": [["txn7", 5], ["txn8", 1], ["txn9", 10]]}
|
60
|
+
}
|
61
|
+
```
|
62
|
+
|
63
|
+
Since these are not true PN or G counters, in the code they are named `Riak::CRDT::TPNCounter` and `Riak::CRDT::TGCounter` (T for transaction).
|
64
|
+
|
65
|
+
#### History Length
|
66
|
+
|
67
|
+
The history length option determines the maximum length of the transaction array per actor before that actor will start removing (oldest first) transactions from its list. Take the following code example into consideration:
|
68
|
+
|
69
|
+
For this example, the `:history_length` is lowered to 3 so it gets reached faster.
|
70
|
+
|
71
|
+
```
|
72
|
+
options = {:history_length => 3}
|
73
|
+
ledger = Riak::Ledger.new(client["ledgers"], "player_2", options)
|
74
|
+
|
75
|
+
ledger.credit!("txn1", 10)
|
76
|
+
ledger.credit!("txn2", 10)
|
77
|
+
ledger.credit!("txn3", 10)
|
78
|
+
ledger.credit!("txn4", 10)
|
79
|
+
ledger.credit!("txn5", 10)
|
80
|
+
ledger.credit!("txn6", 10)
|
81
|
+
|
82
|
+
ledger.value #60
|
83
|
+
|
84
|
+
ledger.has_transaction? "txn1" #false
|
85
|
+
ledger.has_transaction? "txn2" #false
|
86
|
+
ledger.has_transaction? "txn3" #true
|
87
|
+
|
88
|
+
# txn3 is still in the history because the most recent write does not trigger a merge of the actor's total
|
89
|
+
# Performing a find! will trigger the merge however
|
90
|
+
ledger = Riak::Ledger.find!(client["ledgers"], "player_2", options)
|
91
|
+
|
92
|
+
ledger.has_transaction? "txn3" #false
|
93
|
+
ledger.has_transaction? "txn4" #true
|
94
|
+
ledger.has_transaction? "txn5" #true
|
95
|
+
ledger.has_transaction? "txn6" #true
|
96
|
+
```
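The trimming behavior shown above can be sketched in plain Ruby without Riak. This is not the gem's `TGCounter` code: `ActorSlot` and its methods are hypothetical names, and the sketch compacts only when `compact!` is called explicitly, whereas the gem trims during a merge.

```ruby
# Hypothetical per-actor slot: a running total plus an ordered window of
# [transaction_id, value] pairs. Not the gem's actual implementation.
class ActorSlot
  attr_reader :total, :txns

  def initialize(history_length)
    @history_length = history_length
    @total = 0
    @txns = []
  end

  # Record a transaction unless its id is still inside the window.
  def add(txn_id, value)
    return false if has_transaction?(txn_id)
    @txns << [txn_id, value]
    true
  end

  def has_transaction?(txn_id)
    @txns.any? { |id, _value| id == txn_id }
  end

  # Fold the oldest transactions into the total until the window fits.
  def compact!
    while @txns.length > @history_length
      _id, value = @txns.shift
      @total += value
    end
  end

  # The slot's value is the folded total plus everything still in the window.
  def value
    @total + @txns.inject(0) { |sum, (_id, v)| sum + v }
  end
end

slot = ActorSlot.new(3)
(1..6).each { |i| slot.add("txn#{i}", 10) }
slot.compact!
slot.value                    # => 60
slot.has_transaction?("txn3") # => false
slot.has_transaction?("txn4") # => true
```

Once a transaction has been folded into `total`, its id is gone, so a retry of that id would be counted again. That is the window limit the `:history_length` option controls.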
|
97
|
+
|
98
|
+
#### Edge Case: Duplicates
|
99
|
+
|
100
|
+
**First line of defense**
|
101
|
+
|
102
|
+
Before every write, the Ledger class reads the current value of the counter from Riak. If the transaction id already exists there, the operation does not continue because the transaction has already been applied.
|
103
|
+
|
104
|
+
It is possible, however, to have duplicate transactions across multiple actors if the following happens:
|
105
|
+
|
106
|
+
1. Actor 1 attempts to write transaction1, but is taking a long time to do so for some reason
|
107
|
+
2. Your application decides that Actor 1 has taken too long, and issues the same transaction to Actor 2 for writing
|
108
|
+
3. Since Actor 1's version of the transaction is still in flight, it could finish successfully while Actor 2's write of transaction1 was also successful
|
109
|
+
|
110
|
+
This situation would result in siblings getting created where the merged result ends up being 2 actors with the same transaction1
|
111
|
+
|
112
|
+
**Second line of defense**
|
113
|
+
|
114
|
+
Upon Actor 2's or Actor 1's next merge, they will find that there is indeed a duplicate, and the following logic happens in order to deal with the duplicate:
|
115
|
+
|
116
|
+
1. A merge occurs, and a string comparison on the actors' ids takes place to see who should own the transaction
|
117
|
+
|
118
|
+
1. If Actor 1 is merging, "ACTOR1" is less than "ACTOR2", so Actor 1 gets rid of the transaction without counting it
|
119
|
+
2. If Actor 2 is merging, "ACTOR2" is greater than "ACTOR1", so Actor 2 keeps the transaction, knowing that Actor 1 should delete it
|
120
|
+
|
121
|
+
This approach allows for the case in which Actor1 and Actor2 are simultaneously merging, similarly to when they simultaneously added the transaction
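The ownership rule above comes down to a string comparison. A minimal sketch follows, with a hypothetical method name and return values that are not taken from the gem's code:

```ruby
# Illustrative only: decides what the merging actor does with a transaction
# that another actor also holds. The method name is hypothetical.
def resolve_duplicate(merging_actor, other_actor)
  # Ruby compares strings lexicographically, so "ACTOR1" < "ACTOR2".
  merging_actor > other_actor ? :keep : :drop_without_counting
end

resolve_duplicate("ACTOR1", "ACTOR2") # => :drop_without_counting
resolve_duplicate("ACTOR2", "ACTOR1") # => :keep
```

Because the rule depends only on the two actor ids, both actors reach compatible decisions without coordinating, even when they merge at the same time.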
|
122
|
+
|
123
|
+
It is quite possible however for Actor 1 to become stale, and never get rid of the transaction as they should have...
|
124
|
+
|
125
|
+
**Third and final line of defense**
|
126
|
+
|
127
|
+
The following workflow should be read in the voice of Actor 2:
|
128
|
+
|
129
|
+
If we have held onto a duplicate this long, we meet the following criteria:
|
130
|
+
|
131
|
+
1. We are the actor who is supposed to keep this duplicate while the other removes it
|
132
|
+
2. We have had enough time to do :history_length number of transactions since the other actor
|
133
|
+
has performed a merge
|
134
|
+
3. If they stay dormant and the txn remains untouched there, I shouldn't count it
|
135
|
+
4. If they are currently merging and about to count it, I also shouldn't count it for fear of counting it twice,
|
136
|
+
5. The third possibility is the following:
|
137
|
+
|
138
|
+
1. Actor 1 attempts to write transaction 1, it takes a long time, application decides to retry after timeout
|
139
|
+
2. Actor 2 manages to successfully write transaction 1, and then :history_length - 1 more writes and
|
140
|
+
is currently deciding what to do with that transaction ("hmmm, should I count it?")
|
141
|
+
3. While that merge is happening, Actor 1 finally finishes writing transaction 1 and now Actor 2's
|
142
|
+
request is taking a long time for some reason
|
143
|
+
4. While still waiting on Actor 2, Actor 1 performs another merge and sees that Actor 2 has transaction 1
|
144
|
+
knowing it is the inferior actor, Actor 1 removes without counting. But at this stage, Actor 2 wouldn't have known that Actor 1 ever even had transaction 1, and would have correctly counted the value
|
145
|
+
|
146
|
+
Given that 5) would actually be handled by the second line of defense, this leaves us with 3) and 4). Since both of those situations result in Actor 1 counting the value, during the compression phases of Actor 2's merge, if the duplicate transaction is about to be deleted, Actor 2 would remove the transaction without counting it towards its own total.
|
147
|
+
|
148
|
+
## Other Possible Approaches to the Idempotent Counter Problem
|
149
|
+
|
150
|
+
There are several approaches to making counters idempotent to varying degrees. The ones relevant to the goals of this gem are described here.
|
151
|
+
|
152
|
+
### Definitions
|
153
|
+
|
154
|
+
* ***Transaction id***: Globally unique externally generated transaction id that is available per counter action (increment or decrement)
|
155
|
+
* ***Actor***: A thread, process, or server that is able to serially perform actions (a single actor can never perform actions in parallel with itself)
|
156
|
+
* ***Sibling***: In Riak, when you write to the same key without specifying a vector clock, a sibling is created. This is denoted below as `[...sibling1..., ...sibling2...]`.
|
157
|
+
|
158
|
+
### Approach 1: Ensure idempotent counter actions at any time, by any actor
|
159
|
+
|
160
|
+
This is possible if the entire transaction history is stored inside of the counter object:
|
161
|
+
|
162
|
+
Actor 1 writes txn1: 50
|
163
|
+
|
164
|
+
```
|
165
|
+
{"txn1": 50}
|
166
|
+
```
|
167
|
+
|
168
|
+
Actor 2 writes txn1: 50, txn2: 100
|
169
|
+
|
170
|
+
```
|
171
|
+
[
|
172
|
+
#sibling 1
|
173
|
+
{"txn1": 50},
|
174
|
+
#sibling 2
|
175
|
+
{"txn1": 50, "txn2": 100}
|
176
|
+
]
|
177
|
+
```
|
178
|
+
|
179
|
+
Actor 1 reads and merges value
|
180
|
+
|
181
|
+
```
|
182
|
+
{"txn1": 50, "txn2": 100}
|
183
|
+
```
|
184
|
+
|
185
|
+
Total: 150
|
186
|
+
|
187
|
+
This is not a counter, but a ***GSet***, because the entire set of transactions needs to be stored with the object. The total for a counter is defined by the sum of the entire set of values
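The read-and-merge step of this approach can be sketched as a union of sibling hashes followed by a sum. The `siblings` array below is hand-written sample data mirroring the example above; fetching real siblings from Riak is outside the scope of this sketch.

```ruby
# Siblings for one key: each is a hash of transaction id => value.
siblings = [
  { "txn1" => 50 },
  { "txn1" => 50, "txn2" => 100 }
]

# Union the siblings; a transaction id seen twice is kept once.
merged = siblings.inject({}) { |acc, sibling| acc.merge(sibling) }

# The counter's total is the sum over the whole set of transactions.
total = merged.values.inject(0, :+)
total # => 150
```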
|
188
|
+
|
189
|
+
***Pros***:
|
190
|
+
|
191
|
+
* Retry any action at any time by any actor in the system.
|
192
|
+
* Optimize for writes: No need to read the value prior to writing a new transaction.
|
193
|
+
|
194
|
+
***Cons***:
|
195
|
+
|
196
|
+
* GSet sizes can become too large for ruby to handle. If more than ~1000 transactions are expected for a single counter, this approach should not be used
|
197
|
+
|
198
|
+
|
199
|
+
### Approach 2a: Ensure idempotent counter actions by any actor, for the current transaction
|
200
|
+
|
201
|
+
In this approach, the transaction id is stored per actor for the most recently written transaction
|
202
|
+
|
203
|
+
Actor 1 writes txn1: 50
|
204
|
+
|
205
|
+
```
|
206
|
+
Actor1: {"total": 0} {"txn1": 50}
|
207
|
+
```
|
208
|
+
|
209
|
+
Actor 2 attempts to write txn1: 50
|
210
|
+
|
211
|
+
Actor 2 reads the current value, sees that txn1 has already been written, and ignores its own txn1
|
212
|
+
|
213
|
+
Actor 2 writes merged value
|
214
|
+
|
215
|
+
```
|
216
|
+
Actor1: {"total": 0} {"txn1": 50}
|
217
|
+
```
|
218
|
+
|
219
|
+
Actor 2 writes txn2: 100
|
220
|
+
|
221
|
+
```
|
222
|
+
Actor1: {"total": 0} {"txn1": 50}
|
223
|
+
Actor2: {"total": 0} {"txn2": 100}
|
224
|
+
```
|
225
|
+
|
226
|
+
Actor 2 reads the current value and writes txn3: 10 along with its own merged data (txn2 is folded into Actor 2's total)
|
227
|
+
|
228
|
+
```
|
229
|
+
Actor1: {"total": 0} {"txn1": 50}
|
230
|
+
Actor2: {"total": 100} {"txn3": 10}
|
231
|
+
```
|
232
|
+
|
233
|
+
Actor 1 reads and merges value
|
234
|
+
|
235
|
+
```
|
236
|
+
Actor1: {"total": 0} {"txn1": 50}
|
237
|
+
Actor2: {"total": 100} {"txn3": 10}
|
238
|
+
```
|
239
|
+
|
240
|
+
Total: 160
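
The walkthrough above can be modeled in memory as follows. This is a sketch under assumptions: the class name `LastTxnCounter` and its methods are hypothetical, and all Riak reads, writes, and sibling resolution are replaced by a single local hash.

```ruby
# Hypothetical in-memory model of approach 2a; names are illustrative only.
class LastTxnCounter
  def initialize
    # actor id => { total: Integer, last: [txn_id, amount] or nil }
    @actors = Hash.new { |hash, actor| hash[actor] = { total: 0, last: nil } }
  end

  # Returns true if the transaction was applied, false if it was a duplicate.
  def write(actor, txn_id, amount)
    return false if seen?(txn_id)

    state = @actors[actor]
    # Fold the actor's previous transaction into its total before replacing it.
    state[:total] += state[:last][1] if state[:last]
    state[:last] = [txn_id, amount]
    true
  end

  # A transaction id is only known while it is some actor's latest transaction.
  def seen?(txn_id)
    @actors.values.any? { |state| state[:last] && state[:last][0] == txn_id }
  end

  def total
    @actors.values.reduce(0) do |sum, state|
      sum + state[:total] + (state[:last] ? state[:last][1] : 0)
    end
  end
end

counter = LastTxnCounter.new
counter.write("Actor1", "txn1", 50)   # applied
counter.write("Actor2", "txn1", 50)   # duplicate, ignored
counter.write("Actor2", "txn2", 100)  # applied
counter.write("Actor2", "txn3", 10)   # applied; txn2 folds into Actor2's total
puts counter.total # prints 160
```

In this model, retrying `txn2` after `txn3` has been written would be applied a second time, because `txn2` is no longer stored anywhere. That is the drift scenario listed in the cons for this approach.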
|
241
|
+
|
242
|
+
***Pros***:
|
243
|
+
|
244
|
+
* Retry an action with any actor in the system, assuming the actions are serialized per counter
|
245
|
+
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast
|
246
|
+
|
247
|
+
***Cons***:
|
248
|
+
|
249
|
+
* Counter drift is a possibility in the case where transaction 1 fails, several other transactions succeed without retrying transaction 1, and then transaction 1 is tried again
|
250
|
+
|
251
|
+
### Approach 2b: Ensure idempotent counter actions by any actor, for the previous `X` transactions
|
252
|
+
|
253
|
+
This approach is the same as 2a, but instead of storing only the most recent transaction, we store the `X` most recent transactions. In this example we'll use X=5.
|
254
|
+
|
255
|
+
Actor 1 writes txn1: 50, txn2: 10, txn3: 100 (order is preserved using an array instead of a hash for transactions)
|
256
|
+
|
257
|
+
```
|
258
|
+
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
|
259
|
+
```
|
260
|
+
|
261
|
+
Actor 2 attempts to write txn1: 50
|
262
|
+
|
263
|
+
Actor 2 reads the current value, sees that txn1 has already been written, and ignores its own txn1
|
264
|
+
|
265
|
+
Actor 2 writes merged value
|
266
|
+
|
267
|
+
```
|
268
|
+
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
|
269
|
+
```
|
270
|
+
|
271
|
+
Actor 2 writes txn4: 100
|
272
|
+
|
273
|
+
```
|
274
|
+
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
|
275
|
+
Actor2: {"total": 0} [["txn4", 100]]
|
276
|
+
```
|
277
|
+
|
278
|
+
Actor 1 writes txn5: 20, txn6: 20
|
279
|
+
|
280
|
+
```
|
281
|
+
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20]]
|
282
|
+
Actor2: {"total": 0} [["txn4", 100]]
|
283
|
+
```
|
284
|
+
|
285
|
+
Actor 1 writes txn7: 30 along with its own merged data. Because only X=5 transactions are kept, the oldest (txn1) is folded into Actor 1's total
|
286
|
+
|
287
|
+
```
|
288
|
+
Actor1: {"total": 50} [["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20], ["txn7", 30]]
|
289
|
+
Actor2: {"total": 0} [["txn4", 100]]
|
290
|
+
```
|
291
|
+
|
292
|
+
Actor 1 reads and merges value
|
293
|
+
|
294
|
+
```
|
295
|
+
Actor1: {"total": 50} [["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20], ["txn7", 30]]
|
296
|
+
Actor2: {"total": 0} [["txn4", 100]]
|
297
|
+
```
|
298
|
+
|
299
|
+
Total: 330
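
The window behaviour in this walkthrough can be sketched in memory as below. It is an illustration under assumptions, not the gem's implementation: `WindowedCounter` and its methods are hypothetical names, and Riak I/O and sibling merging are left out.

```ruby
# Hypothetical in-memory model of approach 2b; names are illustrative only.
class WindowedCounter
  def initialize(window)
    @window = window
    # actor id => { total: Integer, txns: [[txn_id, amount], ...] }
    @actors = Hash.new { |hash, actor| hash[actor] = { total: 0, txns: [] } }
  end

  # Returns true if the transaction was applied, false if it was a duplicate.
  def write(actor, txn_id, amount)
    return false if seen?(txn_id)

    state = @actors[actor]
    state[:txns] << [txn_id, amount]
    # Keep only the newest `window` transactions; fold older ones into total.
    while state[:txns].length > @window
      _old_id, old_amount = state[:txns].shift
      state[:total] += old_amount
    end
    true
  end

  # A transaction id is known only while it is inside some actor's window.
  def seen?(txn_id)
    @actors.values.any? { |state| state[:txns].any? { |id, _| id == txn_id } }
  end

  def total
    @actors.values.reduce(0) do |sum, state|
      sum + state[:total] + state[:txns].reduce(0) { |acc, (_, amount)| acc + amount }
    end
  end
end

counter = WindowedCounter.new(5)
counter.write("Actor1", "txn1", 50)
counter.write("Actor1", "txn2", 10)
counter.write("Actor1", "txn3", 100)
counter.write("Actor2", "txn1", 50)   # duplicate, ignored
counter.write("Actor2", "txn4", 100)
counter.write("Actor1", "txn5", 20)
counter.write("Actor1", "txn6", 20)
counter.write("Actor1", "txn7", 30)   # txn1 leaves the window, total becomes 50
puts counter.total # prints 330
```

A larger window trades object size for a longer period during which retries remain safe.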
|
300
|
+
|
301
|
+
***Pros***:
|
302
|
+
|
303
|
+
* Retry an action with any actor in the system, for the last X actions
|
304
|
+
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast
|
305
|
+
|
306
|
+
***Cons***:
|
307
|
+
|
308
|
+
* Counter drift is a possibility in the case where transaction 1 fails, X + 1 actions occur, and transaction 1 is retried
|
309
|
+
|
310
|
+
### Approach 3: Ensure idempotent counter actions by a single actor, for the current transaction
|
311
|
+
|
312
|
+
In this approach, a globally unique transaction id is no longer required, because we are assuming that only a single actor can ever be responsible for a single transaction
|
313
|
+
|
314
|
+
Actor 1 writes request1: 50. The request reports an error, but the write actually occurred in Riak
|
315
|
+
|
316
|
+
```
|
317
|
+
Actor1: {"total": 0} {"request1": 50}
|
318
|
+
```
|
319
|
+
|
320
|
+
Actor 1 retries request1: 50. The request succeeds, but since request1 is already stored, the write is ignored and a success is returned to the client
|
321
|
+
|
322
|
+
```
|
323
|
+
Actor1: {"total": 0} {"request1": 50}
|
324
|
+
```
|
325
|
+
|
326
|
+
Actor 1 writes request2: 100, the request succeeds
|
327
|
+
|
328
|
+
```
|
329
|
+
Actor1: {"total": 50} {"request2": 100}
|
330
|
+
```
|
331
|
+
|
332
|
+
Actor 2 writes request3: 10. Since request ids are only unique to the actor, no cross-actor uniqueness check can be made.
|
333
|
+
|
334
|
+
```
|
335
|
+
Actor1: {"total": 50} {"request2": 100}
|
336
|
+
Actor2: {"total": 0} {"request3": 10}
|
337
|
+
```
|
338
|
+
|
339
|
+
Actor 2 writes request4: 100
|
340
|
+
|
341
|
+
```
|
342
|
+
Actor1: {"total": 50} {"request2": 100}
|
343
|
+
Actor2: {"total": 10} {"request4": 100}
|
344
|
+
```
|
345
|
+
|
346
|
+
Actor 1 reads and merges value
|
347
|
+
|
348
|
+
```
|
349
|
+
Actor1: {"total": 50} {"request2": 100}
|
350
|
+
Actor2: {"total": 10} {"request4": 100}
|
351
|
+
```
|
352
|
+
|
353
|
+
Total: 260
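
A minimal in-memory sketch of this walkthrough follows. `PerActorCounter` and its methods are hypothetical names chosen for illustration, and Riak I/O is omitted. The difference from approach 2a is that only the writing actor's own last request is checked.

```ruby
# Hypothetical in-memory model of approach 3; names are illustrative only.
class PerActorCounter
  def initialize
    # actor id => { total: Integer, last: [request_id, amount] or nil }
    @actors = Hash.new { |hash, actor| hash[actor] = { total: 0, last: nil } }
  end

  # Only the writing actor's own last request is checked for duplicates.
  def write(actor, request_id, amount)
    state = @actors[actor]
    return false if state[:last] && state[:last][0] == request_id

    # Fold the previous request into the total before replacing it.
    state[:total] += state[:last][1] if state[:last]
    state[:last] = [request_id, amount]
    true
  end

  def total
    @actors.values.reduce(0) do |sum, state|
      sum + state[:total] + (state[:last] ? state[:last][1] : 0)
    end
  end
end

counter = PerActorCounter.new
counter.write("Actor1", "request1", 50)   # applied, although the client saw an error
counter.write("Actor1", "request1", 50)   # retry by the same actor, ignored
counter.write("Actor1", "request2", 100)
counter.write("Actor2", "request3", 10)
counter.write("Actor2", "request4", 100)
puts counter.total # prints 260
```

If a different actor retried `request2` in this model, it would be applied again, since no cross-actor check exists.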
|
354
|
+
|
355
|
+
***Pros***:
|
356
|
+
|
357
|
+
* No reliance on an external globally unique transaction id
|
358
|
+
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast
|
359
|
+
|
360
|
+
***Cons***:
|
361
|
+
|
362
|
+
* Counter drift is a possibility if any action is retried by someone other than the current actor during its current transaction
|
363
|
+
|
364
|
+
## Conclusion
|
365
|
+
|
366
|
+
To best meet the requirements of *most* counters that cannot be satisfied with Riak Counters, this gem implements approach ***2b***, as it should handle the most likely retry scenarios for most applications.
|