riak-ruby-ledger 0.0.4
- checksums.yaml +15 -0
- data/.gitignore +18 -0
- data/Gemfile +4 -0
- data/LICENSE +16 -0
- data/README.md +419 -0
- data/Rakefile +8 -0
- data/lib/crdt/tgcounter.rb +171 -0
- data/lib/crdt/tpncounter.rb +64 -0
- data/lib/ledger.rb +173 -0
- data/lib/ledger/version.rb +5 -0
- data/riak-ruby-ledger.gemspec +25 -0
- data/test/lib/ledger/version_test.rb +9 -0
- data/test/lib/ledger_test.rb +142 -0
- data/test/lib/tgcounter_test.rb +99 -0
- data/test/lib/tpncounter_test.rb +97 -0
- data/test/test_helper.rb +5 -0
- metadata +125 -0
checksums.yaml
ADDED
@@ -0,0 +1,15 @@
---
!binary "U0hBMQ==":
  metadata.gz: !binary |-
    NTFkZWMxYzc0MDM4MzdjYWZiYzY3ZGVmOWY1MGNmYTczMjNiZDgzMA==
  data.tar.gz: !binary |-
    MzFlMzBlYzM0MDYwYWU1OGVkNWM0MDRhZGM4NDhmMjIyZDA5OTQ4Mw==
SHA512:
  metadata.gz: !binary |-
    YjYwMWMzNGJjMzg1OTQ3ZGZlNGI5M2VlZjE1NDgzZTc1YzQzOGJhMzU4MDhh
    MmRkY2U3ZWRhYTU0M2Y4MmIzMjdhMTc5NTZkNDdlMTRhMmU2YTc0NTU0ZWZk
    ZDBjNzJjMjc1M2U4MWE2NTY0ZDdkZjA1MTQyYTU3MTljODEwNDY=
  data.tar.gz: !binary |-
    Y2QxZDNlNWFmOWYyY2E2YjgzM2U4NjlhYWYzNDFkZmY5OTI2YTgyZWJhZDMy
    ZWY0MDc0ODkyZjBmODU0MTFiMDNlOTdhNjRhOTE5NjE4YzE5NjE1ZjYyNjg4
    MWIxNzc0M2Q2ODg5ZjEwMGQxYzNlYzUyYTdjMDRmYmU4NmI2YmI=
data/.gitignore
ADDED
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,16 @@
Copyright 2013-2014 Drew Kerrigan.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

All of the files in this project are under the project-wide license
unless they are otherwise marked.
data/README.md
ADDED
@@ -0,0 +1,419 @@
# Riak-Ruby-Ledger

A PNCounter CRDT with ledger transaction ids for tunable write idempotence

## Summary of Functionality

### What does it do?

This gem attempts to provide a tunable Counter option by combining non-idempotent GCounters and a partially idempotent GSet for calculating a running counter or ledger.

#### Zero Transaction History
CRDT PNCounters (two GCounters) such as Riak Counters are non-idempotent, and store nothing about a counter transaction other than the final value. As such, it doesn't make sense to use them to store any counter that needs to be accurate.

#### Entire Transaction History
Another approach would be to use a CRDT GSet to store the entire set of transactions, and calculate the current value from the unique list of transaction ids. While accurate, this isn't feasible for many use cases due to the space it consumes.

#### Tunable Transaction History
By allowing clients to set how many transactions to keep in the counter object, as well as a retry policy for the Riak actions performed on the counter, a good balance can be achieved. The `Riak::Ledger` class in this gem can be instantiated with the following options:

```
:actor => Actor ID, one per thread or serialized writer
:history_length => Number of transactions to store per actor per type (credit or debit)
:retry_count => Number of times to retry Riak requests if they fail
```

Furthermore, each `#credit!` and `#debit!` action against the ledger takes an (assumed) globally unique `transaction` id that is determined by your application.

These options combined give you reasonable guarantees that a single transaction can be retried against a counter indefinitely, as long as fewer than X other transactions are applied to the same counter in the meantime (where X is the `:history_length`).

The gem automatically retries up to `:retry_count` times. If the request still fails after that, you can define a secondary retry or reconciliation policy within your application to deal with the failure. If the actions are continually failing, however, it is possible that something is systematically wrong with your Riak cluster.
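
A secondary policy of this kind might look like the following sketch. It relies only on the documented contract that `#credit!` returns false when the write could not be saved. `FlakyLedger`, `credit_with_fallback`, and the `pending` queue are hypothetical names introduced for illustration; `FlakyLedger` is an in-memory stand-in so the example runs without a Riak cluster, and none of these are part of the gem.

```ruby
# Hypothetical stand-in for Riak::Ledger (not the gem's class).
# It fails a fixed number of calls before it starts succeeding.
class FlakyLedger
  attr_reader :value

  def initialize(failures)
    @failures = failures
    @value = 0
    @seen = {}
  end

  # Mirrors the documented contract: false means the write was not saved.
  def credit!(transaction, amount)
    if @failures > 0
      @failures -= 1
      return false
    end
    unless @seen[transaction]
      @seen[transaction] = true
      @value += amount
    end
    true
  end
end

# Application-level policy (an assumption, not gem behaviour): retry a few
# more times with the SAME transaction id, then park the transaction for
# later reconciliation instead of dropping it.
def credit_with_fallback(ledger, transaction, amount, attempts, pending)
  attempts.times do
    return true if ledger.credit!(transaction, amount) != false
  end
  pending << [transaction, amount]
  false
end

pending = []
ledger = FlakyLedger.new(2)
saved = credit_with_fallback(ledger, "txn-42", 50, 3, pending)
puts "saved=#{saved} value=#{ledger.value} pending=#{pending.length}"
```

Reusing the same transaction id on every attempt is what keeps the extra retries safe, provided fewer than `:history_length` other transactions land on the counter in between.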

### What doesn't it do?

This gem cannot guarantee transaction idempotence over the entire lifetime of a counter once more than `:history_length` transactions have been applied. If your application requires this level of idempotence on a counter, a slower-reading GSet-based implementation may be right for you, but keep in mind that this will penalize the most active users of the counter.

### Further Reading

In order to attempt to best meet the requirements of *most* counters that cannot be satisfied with Riak Counters, this gem implements approach ***2b*** described in the [Problem Statement](https://github.com/drewkerrigan/riak-ruby-ledger/tree/ack-refactor#problem-statement) below as it should handle the most likely retry scenarios for most applications.

CRDT paper from Shapiro et al. at INRIA: [http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf](http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf)

Riak Counters: [http://basho.com/counters-in-riak-1-4/](http://basho.com/counters-in-riak-1-4/)

Other Riak Data Types: [github.com/basho/riak_dt](https://github.com/basho/riak_dt)

## Installation

Add this line to your application's Gemfile:

    gem 'riak-ruby-ledger'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install riak-ruby-ledger

## Usage

### Initialize

```
require 'riak' # riak-client gem
require 'ledger' # riak-ruby-ledger gem

# Name your thread
Thread.current["name"] = "ACTOR1"

# Create a Riak::Client instance
client = Riak::Client.new pb_port: 8087

# Default option values
options = {
  :actor => Thread.current["name"], # Actor ID, one per thread or serialized writer
  :history_length => 10, # Number of transactions to store per actor per type (credit or debit)
  :retry_count => 10 # Number of times to retry Riak requests if they fail
}

# Create the ledger object
#                         Riak::Bucket       Key         Hash
ledger = Riak::Ledger.new(client["ledgers"], "player_1", options)
```

### Credit and debit

```
ledger.credit!("transaction1", 50)
ledger.value # 50
ledger.debit!("transaction2", 10)
ledger.value # 40

# Repeating a transaction id that is still in the history has no effect
ledger.debit!("transaction2", 10)
ledger.value # 40
```

### Finding an existing Ledger

```
ledger = Riak::Ledger.find!(client["ledgers"], "player_1", options)
ledger.value # 40
```

### Request success

If a call to `#debit!` or `#credit!` does not return false, the transaction can be considered saved, because the call would have been retried otherwise. Still, for debugging, testing, or external failure policies, `#has_transaction?` is also exposed.

```
ledger.has_transaction? "transaction2" # true
ledger.has_transaction? "transaction1" # true
```

### Merging after history_length is reached

For this example, the `:history_length` is lowered to 3 so it gets reached faster.

```
options = {:history_length => 3}
ledger = Riak::Ledger.new(client["ledgers"], "player_2", options)

ledger.credit!("txn1", 10)
ledger.credit!("txn2", 10)
ledger.credit!("txn3", 10)
ledger.credit!("txn4", 10)
ledger.credit!("txn5", 10)
ledger.credit!("txn6", 10)

ledger.value # 60

ledger.has_transaction? "txn1" # false
ledger.has_transaction? "txn2" # false
ledger.has_transaction? "txn3" # true

# txn3 is still in the history because the most recent write does not trigger a merge of the actor's total
# Performing a find! will trigger the merge, however
ledger = Riak::Ledger.find!(client["ledgers"], "player_2", options)

ledger.has_transaction? "txn3" # false
ledger.has_transaction? "txn4" # true
ledger.has_transaction? "txn5" # true
ledger.has_transaction? "txn6" # true
```

### Deleting a ledger

```
ledger.delete()
```

## Problem Statement

### When to use Riak Counters

Riak Counters are very well suited for certain problems:

* Facebook likes
* YouTube views
* Reddit upvotes
* Twitter followers
* Any non-critical counts
* Counts that do not adversely affect applications or users when off by a few

### When not to use Riak Counters

* Currency (virtual or real) balances
* Metrics that result in charging a customer
* Keeping track of how many calls are made to a paid API endpoint
* Storage used by a user
* Real-time counts
* Any critical counts
* Counts that must be accurate

### Counter Drift

Riak Counters as currently implemented are not ***idempotent***. This simply means that you cannot safely retry an increment or decrement operation: if the original attempt actually succeeded, the retry applies it a second time.

Take the following scenario into consideration:

1. User buys an in-game item that costs 50 gold, and has a current balance of 100 gold
2. Application server attempts to debit user's account 50 gold
    a. If Riak successfully returns 200 response code, no problem!
    b. If Riak returns 500 (or any other error code), we don't have any way of knowing whether or not the operation succeeded
    c. If the application server fails at any point in the execution, we also don't have a good way of knowing whether or not the operation succeeded

In the case of 2b and 2c, we have the following choices:

* Retry the operation (Risk absolute positive drift)
    * If the original counter decrement was successful, we have now debited the user's balance twice, thereby charging them 100 gold for a 50 gold item
* Never retry (Risk absolute negative drift)
    * If the original counter decrement was unsuccessful, we gave the user an item for free
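
The retry branch of this scenario can be reproduced with a small in-memory sketch. `NaiveCounter` is a hypothetical class written for this illustration, not Riak's counter implementation; it only models the property that every decrement it receives is applied, and it can be told to "lose" the response to imitate case 2b.

```ruby
# Hypothetical in-memory counter that behaves like a non-idempotent counter:
# every decrement it receives is applied, even when it is a retry.
class NaiveCounter
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Applies the decrement, then optionally reports failure even though the
  # write happened (the client saw an error, as in case 2b).
  def decrement(amount, lose_response = false)
    @value -= amount
    !lose_response
  end
end

balance = NaiveCounter.new(100)

# First attempt: the write is applied, but the client sees a failure.
ok = balance.decrement(50, true)

# Choice 1: retry. The debit is applied a second time.
balance.decrement(50) unless ok

puts balance.value # 0 rather than the intended 50
```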

## Idempotent Counters

There are several approaches to making counters idempotent to varying degrees; the ones relevant to the goals of this gem are described here.

### Definitions

* ***Transaction id***: Globally unique externally generated transaction id that is available per counter action (increment or decrement)
* ***Actor***: A thread, process, or server that is able to serially perform actions (a single actor can never perform actions in parallel with itself)
* ***Sibling***: In Riak, when you write to the same key without specifying a vector clock, a sibling is created. This is denoted below as `[...sibling1..., ...sibling2...]`.

### Approach 1: Ensure idempotent counter actions at any time, by any actor

This is possible if the entire transaction history is stored inside of the counter object:

Actor 1 writes txn1: 50

```
{"txn1": 50}
```

Actor 2 writes txn1: 50, txn2: 100

```
[
  #sibling 1
  {"txn1": 50},
  #sibling 2
  {"txn1": 50, "txn2": 100}
]
```

Actor 1 reads and merges value

```
{"txn1": 50, "txn2": 100}
```

Total: 150

This is not a counter, but a ***GSet***, because the entire set of transactions needs to be stored with the object. The total for a counter is defined by the sum of the entire set of values.

***Pros***:

* Retry any action at any time by any actor in the system.
* Optimize for writes: No need to read the value prior to writing a new transaction.

***Cons***:

* GSet sizes can become too large for Ruby to handle. If more than ~1000 transactions are expected for a single counter, this approach should not be used
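
The read-and-merge step of Approach 1 can be sketched in a few lines of Ruby. This is a minimal illustration that assumes each sibling is a Hash of transaction id to amount, as in the example above; `merge_siblings` and `total` are hypothetical helper names, not methods of this gem.

```ruby
# Merge sibling transaction sets by taking the union on transaction id.
# Each sibling is a Hash of transaction id => amount.
def merge_siblings(siblings)
  siblings.reduce({}) { |merged, sibling| merged.merge(sibling) }
end

# The counter value is the sum over the unique transactions.
def total(transactions)
  transactions.values.reduce(0, :+)
end

siblings = [
  { "txn1" => 50 },
  { "txn1" => 50, "txn2" => 100 }
]

merged = merge_siblings(siblings)
puts total(merged) # 150
```

Because the merge is a set union keyed on transaction id, applying it repeatedly or in a different order gives the same result, which is why any actor can retry any action at any time under this approach.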

### Approach 2a: Ensure idempotent counter actions by any actor, for the current transaction

In this approach, the transaction id is stored per actor for the most recently written transaction.

Actor 1 writes txn1: 50

```
Actor1: {"total": 0} {"txn1": 50}
```

Actor 2 attempts to write txn1: 50

Actor 2 reads current value and sees that txn1 has already been written, ignores its own txn1

Actor 2 writes merged value

```
Actor1: {"total": 0} {"txn1": 50}
```

Actor 2 writes txn2: 100

```
Actor1: {"total": 0} {"txn1": 50}
Actor2: {"total": 0} {"txn2": 100}
```

Actor 2 reads current value, and writes txn3: 10 along with its own merged data

```
Actor1: {"total": 0} {"txn1": 50}
Actor2: {"total": 100} {"txn3": 10}
```

Actor 1 reads and merges value

```
Actor1: {"total": 0} {"txn1": 50}
Actor2: {"total": 100} {"txn3": 10}
```

Total: 160

***Pros***:

* Retry an action with any actor in the system, assuming the actions are serialized per counter
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast

***Cons***:

* Counter drift is a possibility in the case where transaction 1 fails, several other transactions succeed without retrying transaction 1, and then transaction 1 is tried again
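
The walkthrough for Approach 2a can be replayed with a small in-memory model. `LastTxnCounter` is a hypothetical class written for this illustration; it ignores Riak, siblings, and vector clocks, and only models the bookkeeping: one running total and one remembered transaction per actor.

```ruby
# Hypothetical in-memory model of approach 2a: each actor keeps a running
# total plus only its most recently written transaction.
class LastTxnCounter
  def initialize
    @actors = Hash.new { |hash, actor| hash[actor] = { total: 0, txn: nil } }
  end

  # Returns false when the transaction id is still visible, so the write is ignored.
  def write(actor, txn_id, amount)
    return false if @actors.values.any? { |state| state[:txn] && state[:txn][0] == txn_id }

    state = @actors[actor]
    state[:total] += state[:txn][1] if state[:txn] # fold the previous transaction
    state[:txn] = [txn_id, amount]
    true
  end

  def value
    @actors.values.reduce(0) do |sum, state|
      sum + state[:total] + (state[:txn] ? state[:txn][1] : 0)
    end
  end
end

counter = LastTxnCounter.new
counter.write("Actor1", "txn1", 50)
counter.write("Actor2", "txn1", 50)  # duplicate of a visible id: ignored
counter.write("Actor2", "txn2", 100)
counter.write("Actor2", "txn3", 10)
puts counter.value # 160

# Drift: txn2 has already been folded into Actor2's total, so a late retry
# of txn2 is no longer recognised and is applied a second time.
counter.write("Actor1", "txn2", 100)
puts counter.value # 260
```

The last two lines show one way drift can appear in this model: once a transaction id has been folded into a total, nothing is left to recognise a late retry of it.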

### Approach 2b: Ensure idempotent counter actions by any actor, for the previous `X` transactions

This approach is the same as 2a, but instead of only storing the most recent transaction, we store the most recent `X` transactions. In this example we'll use X=5.

Actor 1 writes txn1: 50, txn2: 10, txn3: 100 (order is preserved using an array instead of a hash for transactions)

```
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
```

Actor 2 attempts to write txn1: 50

Actor 2 reads current value and sees that txn1 has already been written, ignores its own txn1

Actor 2 writes merged value

```
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
```

Actor 2 writes txn4: 100

```
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100]]
Actor2: {"total": 0} [["txn4", 100]]
```

Actor 1 writes txn5: 20, txn6: 20

```
Actor1: {"total": 0} [["txn1", 50], ["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20]]
Actor2: {"total": 0} [["txn4", 100]]
```

Actor 1 writes txn7: 30, and writes its own merged data

```
Actor1: {"total": 50} [["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20], ["txn7", 30]]
Actor2: {"total": 0} [["txn4", 100]]
```

Actor 1 reads and merges value

```
Actor1: {"total": 50} [["txn2", 10], ["txn3", 100], ["txn5", 20], ["txn6", 20], ["txn7", 30]]
Actor2: {"total": 0} [["txn4", 100]]
```

Total: 330

***Pros***:

* Retry an action with any actor in the system, for the last X actions
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast

***Cons***:

* Counter drift is a possibility in the case where transaction 1 fails, X + 1 actions occur, and transaction 1 is retried
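
The X=5 walkthrough can be replayed with a simplified in-memory model. `BoundedHistoryCounter` is a hypothetical class written for this illustration and is not the gem's implementation: it folds the oldest transaction into the total as soon as the history overflows, and it leaves out Riak, siblings, and the separate credit and debit histories.

```ruby
# Hypothetical in-memory model of approach 2b: each actor keeps a running
# total plus its last `history_length` transactions, oldest first.
class BoundedHistoryCounter
  def initialize(history_length)
    @history_length = history_length
    @actors = Hash.new { |hash, actor| hash[actor] = { total: 0, txns: [] } }
  end

  def has_transaction?(txn_id)
    @actors.values.any? { |state| state[:txns].any? { |id, _| id == txn_id } }
  end

  # Returns false when the id is still in some actor's history (write ignored).
  def write(actor, txn_id, amount)
    return false if has_transaction?(txn_id)

    state = @actors[actor]
    state[:txns] << [txn_id, amount]
    # Fold the oldest transactions into the total once the history is full.
    while state[:txns].length > @history_length
      _, old_amount = state[:txns].shift
      state[:total] += old_amount
    end
    true
  end

  def value
    @actors.values.reduce(0) do |sum, state|
      sum + state[:total] + state[:txns].reduce(0) { |s, (_, amount)| s + amount }
    end
  end
end

counter = BoundedHistoryCounter.new(5)
counter.write("Actor1", "txn1", 50)
counter.write("Actor1", "txn2", 10)
counter.write("Actor1", "txn3", 100)
counter.write("Actor2", "txn1", 50)   # still in Actor1's history: ignored
counter.write("Actor2", "txn4", 100)
counter.write("Actor1", "txn5", 20)
counter.write("Actor1", "txn6", 20)
counter.write("Actor1", "txn7", 30)   # txn1 is folded into Actor1's total

puts counter.value                    # 330
puts counter.has_transaction?("txn1") # false
puts counter.has_transaction?("txn2") # true
```

After the final write, txn1 is no longer recognised, so a retry of txn1 at that point would be counted again. That is the drift case listed under the cons.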

### Approach 3: Ensure idempotent counter actions by a single actor, for the current transaction

In this approach, a globally unique transaction id is no longer required, because we are assuming that only a single actor can ever be responsible for a single transaction.

Actor 1 writes request1: 50, the request shows an error, but the write actually occurred in Riak

```
Actor1: {"total": 0} {"request1": 50}
```

Actor 1 retries request1: 50, the request succeeds, but since request1 is already there, it is ignored and returns a success to the client

```
Actor1: {"total": 0} {"request1": 50}
```

Actor 1 writes request2: 100, the request succeeds

```
Actor1: {"total": 50} {"request2": 100}
```

Actor 2 writes request3: 10. Since request ids are only unique to the actor, no cross-actor uniqueness check can be made.

```
Actor1: {"total": 50} {"request2": 100}
Actor2: {"total": 0} {"request3": 10}
```

Actor 2 writes request4: 100

```
Actor1: {"total": 50} {"request2": 100}
Actor2: {"total": 10} {"request4": 100}
```

Actor 1 reads and merges value

```
Actor1: {"total": 50} {"request2": 100}
Actor2: {"total": 10} {"request4": 100}
```

Total: 260

***Pros***:

* No reliance on an external globally unique transaction id
* Optimize for reads: Since a very small amount of data is stored in the counter, reads should be very fast

***Cons***:

* Counter drift is a possibility if any action is retried by someone other than the current actor during its current transaction
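
A minimal in-memory sketch of Approach 3 follows. `PerActorCounter` is a hypothetical class written for this illustration; the only difference from a cross-actor check is that a request id is compared against the writing actor's own last request and nothing else.

```ruby
# Hypothetical in-memory model of approach 3: an actor only recognises a
# retry of its own most recent request.
class PerActorCounter
  def initialize
    @actors = Hash.new { |hash, actor| hash[actor] = { total: 0, request: nil } }
  end

  # Returns false when this actor's last request has the same id (write ignored).
  def write(actor, request_id, amount)
    state = @actors[actor]
    last = state[:request]
    return false if last && last[0] == request_id

    state[:total] += last[1] if last # fold the previous request
    state[:request] = [request_id, amount]
    true
  end

  def value
    @actors.values.reduce(0) do |sum, state|
      sum + state[:total] + (state[:request] ? state[:request][1] : 0)
    end
  end
end

counter = PerActorCounter.new
counter.write("Actor1", "request1", 50)
counter.write("Actor1", "request1", 50)  # same actor retries: ignored
counter.write("Actor1", "request2", 100)
counter.write("Actor2", "request3", 10)
counter.write("Actor2", "request4", 100)
puts counter.value # 260

# Drift: Actor 2 retries a request that Actor 1 already wrote. There is no
# cross-actor check, so it is applied again.
counter.write("Actor2", "request2", 100)
puts counter.value # 360
```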

## Conclusion

In order to attempt to best meet the requirements of *most* counters that cannot be satisfied with Riak Counters, this gem implements approach ***2b*** as it should handle the most likely retry scenarios for most applications.

## Contributing

1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request