riak-ruby-ledger 0.0.4 → 0.0.5

@@ -0,0 +1,33 @@
1
+ ## Release notes
2
+
3
+ ##### Version 0.0.4 and Counter Drift
4
+
5
+ In version 0.0.4 of this gem, counter drift is still a possibility. Take the following scenario into consideration:
6
+
7
+ 1. Actor 1 and Actor 2 both try to write the same transaction id, possibly because the process writing the transaction took too long and your application erroneously retried the same transaction before the first actor finished.
8
+ a. If Actor 1 successfully writes the transaction before Actor 2 begins, Actor 2 will see that the transaction id already exists and will return success without attempting to write.
9
+ b. Similarly, if Actor 2 finishes before Actor 1 starts, Actor 1 would disregard the request and report success.
10
+ c. If Actor 1 and Actor 2 simultaneously and successfully write the same transaction, the result is two siblings.
11
+ 2. If 1a or 1b happens, there is no problem. If 1c occurs, the second line of defense is the merge (merges are triggered prior to every write and after every read).
12
+ a. If Actor 1 merges before Actor 2, Actor 1 will remove its own duplicate transaction in favor of leaving Actor 2's version, knowing it cannot modify any other actor's data.
13
+ b. Similarly, if Actor 2 merges before Actor 1, it will remove its own duplicate transaction.
14
+ c. If Actor 1 and Actor 2 merge simultaneously and successfully, they would both remove their own duplicate (from their point of view) version of the transaction, meaning it would be lost, causing negative counter drift (on increments) and positive drift (on decrements).
15
+
16
+ This is an unlikely but possible scenario. Here are some ways to reduce or eliminate the possibility of 2c:
17
+
18
+ 1. The precursor to the condition resulting from 2c can be avoided by serializing writes per transaction, as in the example of a game's application server knowing to submit a given unique transaction only once at a time. Submitting simultaneous transactions is fine, so long as the same transaction isn't active in more than one actor at the same time.
19
+ a. This is possible with this gem; it is just a matter of implementing some control over which actor can write a given unique transaction at any one time.
20
+ 2. Adopt a policy of never deleting duplicates, meaning that you could potentially have an ever-growing list of duplicate transactions if your application causes this situation often.
21
+ a. This is unimplemented in this gem as of now, but depending on the thoughts of others, I may add it as an optional policy.
22
+ 3. Attach a microsecond epoch to each transaction so that during merges the duplicate transaction with the highest epoch always wins.
23
+ a. This is unimplemented in this gem. It would only lessen the statistical likelihood of 2c happening; 2c would still be possible.
24
+ 4. Do a string comparison on the actor ids; whichever actor has the greater id always keeps its version of the duplicate transaction.
25
+ a. This is now implemented in version 0.0.5, see below.
26
+
27
+ ##### Version 0.0.5 and Actor Naming [***Important***]
28
+
29
+ Solution 4 has been implemented to address the potential counter drift caused by two simultaneous writes and later merges of a duplicate transaction, as described in the previous section.
30
+
31
+ As a result, keep in mind that actor names are compared as strings for ordering purposes.
32
+
33
+ Example: "ACTOR2" is greater than "ACTOR1", so "ACTOR1" will always remove its version of a duplicate transaction during a merge, and "ACTOR2" will never remove its version. Avoid actor ids that could compare as equal strings.
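
A minimal sketch of the tie-breaking rule described above, assuming it follows the `<=>` comparison used by `remove_duplicates` in this release. `removes_duplicate?` is a hypothetical helper name used only for illustration; it is not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): decides whether `my_actor`
# should drop its copy of a transaction that `other_actor` also holds.
# Rule being illustrated: the actor with the greater id keeps its version.
def removes_duplicate?(my_actor, other_actor)
  # String#<=> returns 1 when my_actor sorts after other_actor.
  (my_actor <=> other_actor) != 1
end

removes_duplicate?("ACTOR1", "ACTOR2") # => true,  ACTOR1 drops its copy
removes_duplicate?("ACTOR2", "ACTOR1") # => false, ACTOR2 keeps its copy

# The comparison is lexicographic, not numeric:
# "ACTOR10" sorts before "ACTOR2", so ACTOR10 would drop its copy.
removes_duplicate?("ACTOR10", "ACTOR2") # => true
```

Because exactly one side of any pair of distinct ids compares as greater, simultaneous merges cannot both delete the same duplicate, which is the failure described in 2c.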
@@ -0,0 +1,56 @@
1
+ ## Riak Counters and Drift
2
+
3
+ ### Summary
4
+
5
+ **Why shouldn't I use Riak Counters?**
6
+
7
+ CRDT PNCounters (two plain GCounters) such as Riak Counters are non-idempotent and store nothing about a counter transaction other than the final value. This means that if an increment operation fails in any number of ways (500 response from server, process that made the call dies, network connection is interrupted, operation times out, etc), your application now has no idea whether or not the increment actually happened.
8
+
9
+ **What is Counter Drift?**
10
+
11
+ In the above situation of a failed increment operation, your application has two choices:
12
+
13
+ 1. Retry the operation: This could result in the operation occurring twice, causing what is called **positive counter drift**
14
+ 2. Don't retry the operation: This could result in the operation never occurring at all, causing **negative counter drift**
15
+
16
+ As such it doesn't make sense to use plain GCounters or PNCounters to store any counter that needs to be accurate.
17
+
18
+ ### When to use Riak Counters
19
+
20
+ Riak Counters are very well suited for certain problems:
21
+
22
+ * Facebook likes
23
+ * Youtube views
24
+ * Reddit upvotes
25
+ * Twitter followers
26
+ * Any non-critical counts
27
+ * Counts that do not adversely affect applications or users when off by a few
28
+
29
+ ### When not to use Riak Counters
30
+
31
+ * Currency (virtual or real) balances
32
+ * Metrics that result in charging a customer
33
+ * Keeping track of how many calls are made to a paid API endpoint
34
+ * Storage used by a user
35
+ * Real-time counts
36
+ * Any critical counts
37
+ * Counts that must be accurate
38
+
39
+ ### Counter Drift
40
+
41
+ Riak Counters (and GCounters in general) as currently implemented are not ***idempotent***. This means that you cannot safely retry an increment or decrement operation: if the first attempt actually succeeded, the retry applies it a second time.
42
+
43
+ Take the following scenario into consideration:
44
+
45
+ 1. User buys an in-game item that costs 50 gold, and has a current balance of 100 gold
46
+ 2. Application server attempts to debit user's account 50 gold
47
+ a. If Riak successfully returns a 200 response code, no problem!
48
+ b. If Riak returns 500 (or any other error code), we don't have any way of knowing whether or not the operation succeeded
49
+ c. If the application server fails at any point in the execution, we also don't have a good way of knowing whether or not the operation succeeded
50
+
51
+ In the case of 2b and 2c, we have the following choices:
52
+
53
+ * Retry the operation (Risk absolute positive drift)
54
+ * If the original counter decrement was successful, we have now debited the user's balance twice, thereby charging them 100 gold for a 50 gold item
55
+ * Never retry (Risk absolute negative drift)
56
+ * If the original counter decrement was unsuccessful, we gave the user an item for free
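
A small in-memory sketch of the scenario above, contrasting a blind retry on a plain counter with a retry that carries a transaction id. `SketchLedger` is a hypothetical class written for this illustration; it is not the gem's API and does not talk to Riak.

```ruby
# Hypothetical in-memory model (not the gem's API): a balance that
# remembers which transaction ids it has already applied.
class SketchLedger
  attr_reader :balance

  def initialize(balance)
    @balance = balance
    @applied = {} # transaction id => signed amount
  end

  # Applies the debit once per transaction id; repeats are no-ops.
  def debit!(transaction, amount)
    return true if @applied.key?(transaction)

    @applied[transaction] = -amount
    @balance -= amount
    true
  end
end

# Plain counter: the first decrement succeeded but the response was lost,
# so the application retries and the user is charged twice.
plain_balance = 100
2.times { plain_balance -= 50 }
plain_balance # => 0

# With a transaction id, the retry is recognised and ignored.
ledger = SketchLedger.new(100)
2.times { ledger.debit!("purchase-1", 50) }
ledger.balance # => 50
```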
data/docs/usage.md ADDED
@@ -0,0 +1,110 @@
1
+ ## Suggested Usage and Configuration
2
+
3
+ ### Summary
4
+
5
+ Depending on your use case, you may want to tweak the configuration options `:history_length` and `:retry_count`.
6
+
7
+ The default `:history_length` is 10. This means that if a transaction fails, but your application is unable to determine whether or not the counter was actually incremented, you have a buffer space or window of 9 additional transactions on that counter before you can no longer retry the original failed transaction without assuming counter drift is happening.
8
+
9
+ The default `:retry_count` is also 10. This means that if a transaction fails, the actor that attempted the transaction will continue trying 9 more times. If the request to change the counter still fails after the 10th try, the operation will return `false` for failure. At this point your application can attempt to try the transaction again, or return a failure to the user with a note that the transaction will be retried in the future.
10
+
11
+ An example of a failure might look like the following:
12
+
13
+ 1. transaction1 fails with actor1, and because of the nature of the failure, your application is unsure whether or not the counter was actually incremented.
14
+
15
+ 1. If your `:retry_count` is low, you can quickly determine in your application that something went wrong, and inform the user that the transaction was unsuccessful for now but will be attempted later.
16
+ 2. If your `:retry_count` is high, the user will be kept waiting longer, but the odds of the transaction eventually working are higher.
17
+ 2. If the transaction still fails after the initial retries, your application must decide what to do next.
18
+
19
+ 1. If your `:history_length` is low, your options are limited. You must continue to retry that same failed transaction for that user (using any available actor) until it is successful. If you allow additional transactions to take place on the same counter before retrying, you run a high risk of counter drift.
20
+ 2. If your `:history_length` is medium-high, then you have an allowance of (`:history_length` - 1) additional transactions for that counter before you run the risk of counter drift.
21
+
22
+ **Note**
23
+
24
+ This gem cannot guarantee transaction idempotence of a counter for more than `:history_length` transactions.
25
+
26
+ ### Tunable Transaction History
27
+ By allowing clients to set how many transactions to keep in the counter object as well as set a retry policy on the Riak actions performed on the counter, a good balance can be achieved. The `Riak::Ledger` class in this gem can be instantiated with the following options:
28
+
29
+ ```
30
+ :actor => Actor ID, one per thread or serialized writer
31
+ :history_length => Number of transactions to store per actor per type (credit or debit)
32
+ :retry_count => Number of times to retry Riak requests if they fail
33
+ ```
34
+
35
+ Furthermore, each `#credit!` and `#debit!` action against the ledger takes an (assumed) globally unique `transaction` id that is determined by your application.
36
+
37
+ These options combined give you reasonable guarantees that a single transaction can be retried per counter continually as long as fewer than X other transactions are applied to the same counter (where X is the `:history_length`).
38
+
39
+ The gem will automatically retry `:retry_count` number of times, and if it still fails after that you can define a secondary retry or reconciliation policy within your application to deal with the failure, although if the actions are continually failing, it is possible that something is systematically wrong with your Riak cluster.
40
+
41
+ ##### Merging Siblings and Collapsing Old Transactions
42
+
43
+ Prior to every write (`#credit!` and `#debit!`), and on every read (`#find!`), two merges happen: Sibling Merges and Transaction Collapse.
44
+
45
+ Sibling Merges simply combine the data from two Riak siblings into a single object; nothing extraordinary happens here.
46
+
47
+ Transaction collapse happens based on the specified or default `:history_length`. In the following example, assume `:history_length` is equal to 2:
48
+
49
+ Add 3 transactions
50
+
51
+ ```
52
+ ledger = Riak::Ledger.new(client["ledgers"], "player_2", {:history_length => 2})
53
+
54
+ ledger.credit!("txn1", 10)
55
+ ledger.credit!("txn2", 10)
56
+ ledger.credit!("txn3", 10)
57
+ ```
58
+
59
+ Check transaction existence
60
+
61
+ ```
62
+ ledger.has_transaction? "txn1" #true
63
+ ledger.has_transaction? "txn2" #true
64
+ ledger.has_transaction? "txn3" #true
65
+ ```
66
+
67
+ Based on the above, you might expect "txn1" to have been collapsed. However, merges only happen on reads, including the read that precedes every write. The read before writing "txn3" saw only two transactions, so nothing was collapsed, and "txn3" was then appended. The next read triggers a merge that collapses "txn1":
68
+
69
+ ```
70
+ ledger = Riak::Ledger.find!(client["ledgers"], "player_2", {:history_length => 2})
71
+
72
+ ledger.has_transaction? "txn1" #false
73
+ ```
74
+
75
+ ### :retry_count values
76
+
77
+ A low `:retry_count` (1-9) might be appropriate for applications that would rather give their users immediate feedback about a failed request so that they can continue performing other actions. This should be coupled with a higher `:history_length` if you intend to allow your user to initiate other transactions while waiting to retry the first one.
78
+
79
+ A medium `:retry_count` (10-50) might be appropriate for applications that require a higher level of certainty about a specific transaction's success at all times. Allowing a single actor to attempt retries for as long as necessary also greatly reduces the chance that duplicate transactions will ever be created, but requests will take longer in that case. Even if duplicate transactions are created, they should be merged at a later time, but it is safer to have a policy of one actor per transaction at a time.
80
+
81
+ ### :history_length values
82
+
83
+ A low `:history_length` (1-9) is generally not recommended, as it shrinks the window in which operations remain idempotent. The only time a low `:history_length` might be necessary is if your cluster is not big enough to handle the space consumed by the transaction list. Here is an example calculation to show how much space various transaction histories might consume.
84
+
85
+ ```
86
+ # Riak's replication value
87
+ n = 3
88
+ actor_count = 5
89
+ # a single transaction within counter json might
90
+ # look like this: ["550e8400-e29b-41d4-a716-446655440000": 10],
91
+ bytes_per_txn = 45
92
+ # if you have 1 million users, and 1 ledger per user
93
+ number_of_counters = 1_000_000
94
+ ```
95
+
96
+ For a `:history_length` of 10:
97
+
98
+ ```
99
+ (actor_count * number_of_counters * bytes_per_txn * history_length) * n = 6750000000 bytes or 6.28643 GiB total raw disk storage
100
+ ```
101
+
102
+ For a `:history_length` of 50:
103
+
104
+ ```
105
+ (actor_count * number_of_counters * bytes_per_txn * history_length) * n = 33750000000 bytes or 31.4321 GiB total raw disk storage
106
+ ```
107
+
108
+ A medium `:history_length` (10-50) is a safe balance for most applications. Applications suited to this range are ones without very high concurrent access requirements on a per-counter basis, for example an application that allows a user only one transaction in flight at a time but wants to let the user continue with a few more transactions before the state of a failed transaction is known.
109
+
110
+ A high `:history_length` (50+) might be suitable for applications whose primary function is to provide highly concurrent and frequent access to a limited number of counters. An example might be a service that needs to keep accurate track of a limited number of statistics like how much bandwidth is consumed for a series of endpoints for the purposes of billing a customer.
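
The storage estimates in the `:history_length` section can be reproduced with a few lines of Ruby. This is a rough sketch using the example figures from the text; `raw_storage_bytes` is a hypothetical helper name, and the result is reported in GiB (1024^3 bytes).

```ruby
# Rough storage estimate using the example figures from the text.
# raw_storage_bytes is a hypothetical helper, not part of the gem.
def raw_storage_bytes(history_length, actor_count: 5,
                      number_of_counters: 1_000_000,
                      bytes_per_txn: 45, n: 3)
  # n is Riak's replication value; every transaction is stored n times.
  actor_count * number_of_counters * bytes_per_txn * history_length * n
end

[10, 50].each do |history_length|
  bytes = raw_storage_bytes(history_length)
  gib = bytes / (1024.0**3)
  puts format("history_length=%d: %d bytes (%.2f GiB)",
              history_length, bytes, gib)
end
```

Changing the keyword arguments makes it easy to see how the estimate scales with the number of actors or counters in your own deployment.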
@@ -1,5 +1,3 @@
1
- require 'set'
2
-
3
1
  module Riak::CRDT
4
2
  class TGCounter
5
3
  attr_accessor :counts, :actor, :history_length
@@ -63,23 +61,52 @@ module Riak::CRDT
63
61
  self.counts[actor]["txns"][transaction] = value
64
62
  end
65
63
 
66
- # Get unique list of all transactions and values across all known actors
67
- # @param [String] ignore_actor
64
+ # Get unique list of all transactions and values across all known actors, or optionally for a single actor
65
+ # @param [String] for_actor
68
66
  # @return [Hash]
69
- def unique_transactions(ignore_actor=nil)
67
+ def unique_transactions(for_actor=nil)
70
68
  txns = Hash.new()
71
69
 
72
70
  self.counts.each do |a, values|
73
- unless a == ignore_actor
74
- values["txns"].arr.each do |arr|
75
- txns[arr[0]] = arr[1]
76
- end
71
+ next if for_actor && a != for_actor
72
+ values["txns"].arr.each do |arr|
73
+ txns[arr[0]] = arr[1]
77
74
  end
78
75
  end
79
76
 
80
77
  txns
81
78
  end
82
79
 
80
+ # Get unique list of all duplicate transactions per actor other than self
81
+ # @return [Hash]
82
+ def duplicate_transactions_by_actor()
83
+ actor_txns = Hash.new()
84
+
85
+ my_transactions = self.unique_transactions(self.actor).keys
86
+
87
+ self.counts.keys.each do |a|
88
+ next if a == self.actor
89
+ uniques = self.unique_transactions(a).keys
90
+ actor_txns[a] = (my_transactions & uniques)
91
+ end
92
+
93
+ actor_txns
94
+ end
95
+
96
+ # Get unique list of all duplicate transactions for all actors other than self
97
+ # @return [Hash]
98
+ def duplicate_transactions()
99
+ duplicates = Hash.new()
100
+
101
+ self.duplicate_transactions_by_actor().each do |a, txns|
102
+ txns.each do |txn, val|
103
+ duplicates[txn] = val
104
+ end
105
+ end
106
+
107
+ duplicates
108
+ end
109
+
83
110
  def has_transaction?(transaction)
84
111
  self.unique_transactions().keys.member?(transaction)
85
112
  end
@@ -96,11 +123,18 @@ module Riak::CRDT
96
123
  total
97
124
  end
98
125
 
99
- # Merge actor data from a sibling into self, additionally compress oldest
100
- # transactions that exceed the :history_length param into actor's total
126
+ # Merge actor data from a sibling into self, additionally remove duplicate
127
+ # transactions and compress oldest transactions that exceed the
128
+ # :history_length param into actor's total
101
129
  # @param [TGCounter] other
102
130
  def merge(other)
103
- # Combine all actors first
131
+ self.merge_actors(other)
132
+ self.remove_duplicates()
133
+ self.compress_history()
134
+ end
135
+
136
+ # Combine all actors' data
137
+ def merge_actors(other)
104
138
  other.counts.each do |other_actor, other_values|
105
139
  if self.counts[other_actor]
106
140
  # Max of totals
@@ -118,18 +152,30 @@ module Riak::CRDT
118
152
  self.counts[other_actor] = other_values
119
153
  end
120
154
  end
155
+ end
121
156
 
122
- # Remove duplicate transactions if other actors have claimed them
123
- self.unique_transactions(actor).keys.each do |txn|
124
- self.counts[actor]["txns"].delete(txn)
157
+ # Remove duplicate transactions if other actors have claimed them
158
+ def remove_duplicates()
159
+ self.duplicate_transactions_by_actor().each do |a, txns|
160
+ # Spaceship operator, if my actor is of greater value than theirs, skip because they should remove the dupe
161
+ next if (self.actor <=> a) == 1
162
+ txns.each do |txn|
163
+ self.counts[self.actor]["txns"].delete(txn)
164
+ end
125
165
  end
166
+ end
126
167
 
127
- # Merge this actor's data based on history_length
168
+ # Compress this actor's data based on history_length
169
+ def compress_history()
128
170
  total = 0
171
+
172
+ duplicates = self.duplicate_transactions()
173
+
129
174
  if self.counts[actor]["txns"].length > self.history_length
130
175
  to_delete = self.counts[actor]["txns"].length - self.history_length
131
176
  self.counts[actor]["txns"].arr.slice!(0..to_delete - 1).each do |arr|
132
- total += arr[1]
177
+ txn, val = arr
178
+ total += val unless duplicates.member? txn
133
179
  end
134
180
  end
135
181
 
@@ -138,7 +184,7 @@ module Riak::CRDT
138
184
  end
139
185
  end
140
186
 
141
- # Ease of use class: Wraps an ordered array with some hash-like functions
187
+ # Ease of use class - Wraps an ordered array with some hash-like functions
142
188
  class TransactionArray
143
189
  attr_accessor :arr
144
190
 
data/lib/ledger.rb CHANGED
@@ -54,7 +54,7 @@ module Riak
54
54
  # @see update!(transaction, value)
55
55
  # @return [Boolean]
56
56
  def credit!(transaction, value)
57
- update!(transaction, value)
57
+ self.update!(transaction, value)
58
58
  end
59
59
 
60
60
  # Decrement the counter, merge and save it
@@ -63,7 +63,7 @@ module Riak
63
63
  # @see update!(transaction, value)
64
64
  # @return [Boolean]
65
65
  def debit!(transaction, value)
66
- update!(transaction, value * -1)
66
+ self.update!(transaction, value * -1)
67
67
  end
68
68
 
69
69
  # Update the counter, merge and save it. Retry if unsuccessful
@@ -78,10 +78,10 @@ module Riak
78
78
  end
79
79
 
80
80
  # Get the current merged state of this counter
81
- vclock = refresh()
81
+ vclock = self.refresh()
82
82
 
83
83
 
84
- if has_transaction?(transaction)
84
+ if self.has_transaction?(transaction)
85
85
  # If the transaction already exists in the counter, no problem
86
86
  return true
87
87
  else
@@ -92,10 +92,10 @@ module Riak
92
92
  self.counter.increment(transaction, value)
93
93
  end
94
94
 
95
- unless save(vclock)
95
+ unless self.save(vclock)
96
96
  # If the save wasn't successful, retry
97
97
  current_retry = self.retry_count unless current_retry
98
- update!(transaction, value, current_retry - 1)
98
+ self.update!(transaction, value, current_retry - 1)
99
99
  else
100
100
  # If the save succeeded, no problem
101
101
  return true
@@ -103,7 +103,7 @@ module Riak
103
103
  end
104
104
  end
105
105
 
106
- # Create a new Ledger object
106
+ # Check if the counter has transaction
107
107
  # @param [String] transaction
108
108
  # @return [Boolean]
109
109
  def has_transaction?(transaction)
@@ -155,7 +155,7 @@ module Riak
155
155
  object = self.bucket.new(self.key)
156
156
  object.vclock = vclock if vclock
157
157
  object.content_type = 'application/json'
158
- object.raw_data = to_json
158
+ object.raw_data = self.to_json()
159
159
 
160
160
  begin
161
161
  options = {:returnbody => false}
@@ -1,5 +1,5 @@
1
1
  module Riak
2
2
  class Ledger
3
- VERSION = "0.0.4"
3
+ VERSION = "0.0.5"
4
4
  end
5
- end
5
+ end
@@ -8,8 +8,8 @@ Gem::Specification.new do |spec|
8
8
  spec.version = Riak::Ledger::VERSION
9
9
  spec.authors = ["drewkerrigan"]
10
10
  spec.email = ["dkerrigan@basho.com"]
11
- spec.description = %q{A PNCounter CRDT based ledger with support for transaction ids and tunable write idempotence}
12
- spec.summary = %q{This gem attempts to provide a tunable Counter option by combining non-idempotent GCounters and a partially idempotent GSet for calculating a running counter or ledger. By allowing clients to set how many transactions to keep in the counter object as well as set a retry policy on the Riak actions performed on the counter, a good balance can be achieved.}
11
+ spec.description = %q{An alternative to Riak Counters with idempotent writes within a client defined window}
12
+ spec.summary = %q{The data type implemented is a PNCounter CRDT with an ordered array of transactions for each GCounter actor. Transaction ids are stored with the GCounter, so operations against this counter are idempotent while the transaction remains in any actor's array.}
13
13
  spec.homepage = "https://github.com/drewkerrigan/riak-ruby-ledger"
14
14
  spec.license = "Apache2"
15
15
 
@@ -1,4 +1,4 @@
1
- require_relative '../test_helper'
1
+ require_relative '../../test_helper'
2
2
 
3
3
  describe Riak::CRDT::TGCounter do
4
4
  options1 = {:actor => "ACTOR1", :history_length => 5}
@@ -1,4 +1,4 @@
1
- require_relative '../test_helper'
1
+ require_relative '../../test_helper'
2
2
 
3
3
  describe Riak::CRDT::TPNCounter do
4
4
  options1 = {:actor => "ACTOR1", :history_length => 5}
@@ -59,39 +59,40 @@ describe Riak::CRDT::TPNCounter do
59
59
 
60
60
  it "must merge" do
61
61
  counter = Riak::CRDT::TPNCounter.new(options1)
62
- counter.increment("txn1", 10)
63
- counter.increment("txn2", 10)
64
- counter.increment("txn1", 10)
65
- counter.increment("txn1", 10)
66
- counter.decrement("txn3", 5)
67
-
68
- counter.increment("txn4", 10)
69
- counter.increment("txn5", 10)
70
- counter.increment("txn6", 10)
71
- counter.increment("txn7", 10)
72
- counter.decrement("txn8", 5)
62
+ counter.increment("txn1", 10) #ignore
63
+ counter.increment("txn2", 10) #ignore
64
+ counter.increment("txn1", 10) #ignore
65
+ counter.increment("txn1", 10) #ignore
66
+ counter.decrement("txn3", 5) #keep
67
+
68
+ counter.increment("txn4", 10) #ignore
69
+ counter.increment("txn5", 10) #keep
70
+ counter.increment("txn6", 10) #keep
71
+ counter.increment("txn7", 10) #keep
72
+ counter.decrement("txn8", 5) #keep
73
73
 
74
74
  counter2 = Riak::CRDT::TPNCounter.new(options2)
75
- counter2.increment("txn1", 10)
76
- counter2.increment("txn2", 10)
77
- counter2.increment("txn4", 10)
78
- counter2.increment("txn1", 10)
79
- counter2.decrement("txn5", 1)
80
-
81
- counter2.increment("txn9", 10)
82
- counter2.increment("txn10", 10)
83
- counter2.increment("txn11", 10)
84
- counter2.increment("txn12", 10)
85
- counter2.decrement("txn13", 1)
75
+ counter2.increment("txn1", 10) #ignore
76
+ counter2.increment("txn2", 10) #keep
77
+ counter2.increment("txn4", 10) #keep
78
+ counter2.increment("txn1", 10) #keep
79
+ counter2.decrement("txn14", 1) #keep
80
+
81
+ counter2.increment("txn9", 10) #keep
82
+ counter2.increment("txn10", 10) #keep
83
+ counter2.increment("txn11", 10) #keep
84
+ counter2.increment("txn12", 10) #keep
85
+ counter2.decrement("txn13", 1) #keep
86
86
 
87
87
  counter.merge(counter2)
88
88
 
89
89
  assert_equal(0, counter.p.counts["ACTOR1"]["total"])
90
90
  assert_equal(88, counter.value)
91
91
 
92
+ counter.increment("txn9", 10) #ignore, keep in actor 1 even though actor 2 would normally have it
92
93
  counter2.merge(counter)
93
94
 
94
- assert_equal(20, counter2.p.counts["ACTOR2"]["total"])
95
+ assert_equal(10, counter2.p.counts["ACTOR2"]["total"])
95
96
  assert_equal(88, counter2.value)
96
97
  end
97
98
  end
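
The history compression exercised by the "must merge" test (`compress_history` in the TGCounter hunk) can be sketched on plain arrays. This is a simplified stand-alone illustration, not the gem's code: `collapse` is a hypothetical name, and the sketch leaves out the duplicate check that the real method applies before adding a value to the total.

```ruby
# Hypothetical stand-alone version of the collapse step (not the gem's code).
# txns is an ordered array of [transaction_id, value] pairs, oldest first.
# Returns the trimmed transaction list and the updated running total.
def collapse(txns, total, history_length)
  overflow = txns.length - history_length
  return [txns, total] if overflow <= 0

  folded = txns.first(overflow) # oldest entries beyond the window
  [txns.drop(overflow), total + folded.sum { |_id, value| value }]
end

txns, total = collapse([["txn1", 10], ["txn2", 10], ["txn3", 10]], 0, 2)
txns  # => [["txn2", 10], ["txn3", 10]]
total # => 10
```

Once a transaction id has been folded into the total it can no longer be recognised, which is why retries are only idempotent while the id remains inside the `:history_length` window.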