ohm_util 0.1

@@ -0,0 +1,237 @@
1
+ ### Tagging
2
+
3
+ #### Intro
4
+
5
+ # When building a Web 2.0 application, tagging will probably come up
6
+ # as one of the most requested features. Popularized by Delicious,
7
+ # it has quickly become a useful way to organize crowd-sourced data.
8
+
9
+ #### How it was done
10
+
11
+ # Typically, when you do tagging using an RDBMS, you'll probably end up
12
+ # having a taggings and a tags table, hence a many-to-many design.
13
+ # Here is a quick sketch just to illustrate:
14
+ #
15
+ #
16
+ #
17
+ # Post Taggings Tag
18
+ # ---- -------- ---
19
+ # id tag_id id
20
+ # title post_id name
21
+ #
22
+ # As you can see, this design leads to a lot of problems:
23
+ #
24
+ # 1. Finding the tags of a post means going through taggings, and
25
+ # then looking up each actual tag individually.
26
+ # 2. One might be inclined to use a JOIN query, but we all know
27
+ # [joins are evil](http://stackoverflow.com/questions/1020847).
28
+ # 3. Building a tag cloud or some form of tag ranking is unintuitive.
29
+
30
+ #### The Ohm approach
31
+
32
+ # Here is a basic outline of what we'll need:
33
+ #
34
+ # 1. We should be able to tag a post (with tags separated by commas).
35
+ # 2. We should be able to find a post with a given tag.
36
+
37
+ #### Beginning with our Post model
38
+
39
+ # Let's first require ohm.
40
+ require 'ohm'
41
+
42
+ # We then declare our class, inheriting from `Ohm::Model` in the process.
43
+ class Post < Ohm::Model
44
+
45
+ # The structure, fields, and other associations are defined in a declarative
46
+ # manner. Ohm allows us to declare *attributes*, *sets*, *lists* and
47
+ # *counters*. For our use case here, only two *attributes* will get the job
48
+ # done. The `body` will just
49
+ # be a plain string, and the `tags` will contain our comma-separated list of
50
+ # words, e.g. "ruby, redis, ohm". We then declare an `index` (which can be
51
+ # an `attribute` or just a plain old method), which we point to our method
52
+ # `tag`.
53
+ attribute :body
54
+ attribute :tags
55
+ index :tag
56
+
57
+ # One very interesting thing about Ohm indexes is that each can be either a
58
+ # *String* or an *Enumerable* data structure. When we declare it as an
59
+ # *Enumerable*, `Ohm` will create an index for every element. So if `tag`
60
+ # returned `[ruby, redis, ohm]` then we can search it using any of the
61
+ # following:
62
+ #
63
+ # 1. ruby
64
+ # 2. redis
65
+ # 3. ohm
66
+ # 4. ruby, redis
67
+ # 5. ruby, ohm
68
+ # 6. redis, ohm
69
+ # 7. ruby, redis, ohm
70
+ #
71
+ # Pretty neat ain't it?
72
+ def tag
73
+ tags.to_s.split(/\s*,\s*/).uniq
74
+ end
75
+ end
76
+
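To see what the `tag` method yields on its own, the following minimal sketch runs the same split-and-deduplicate expression on plain strings, without Ohm or Redis. The helper name `split_tags` is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper mirroring the body of Post#tag, usable without Ohm.
def split_tags(tags)
  # to_s guards against nil; the regex swallows whitespace around each comma.
  tags.to_s.split(/\s*,\s*/).uniq
end

p split_tags("ruby, redis, ohm")   # => ["ruby", "redis", "ohm"]
p split_tags("ruby ,ruby,  ohm")   # => ["ruby", "ohm"]
p split_tags(nil)                  # => []
```

Because the result is an array, an index declared on it gets one entry per element, which is what makes the per-tag lookups possible.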
77
+ #### Testing it out
78
+
79
+ # It's a very good habit to test all the time. In the Ruby community,
80
+ # a lot of test frameworks have been created.
81
+
82
+ # For our purposes in this example, we'll use cutest.
83
+ require "cutest"
84
+
85
+ # Cutest allows us to define callbacks which are guaranteed to be executed
86
+ # every time a new `test` begins. Here, we just make sure that the Redis
87
+ # instance of `Ohm` is empty every time.
88
+ prepare { Ohm.flush }
89
+
90
+ # Next, let's create a simple `Post` instance. The return value of the `setup`
91
+ # block will be passed to every `test` block, so we don't actually have to
92
+ # assign it to an instance variable.
93
+ setup do
94
+ Post.create(:body => "Ohm Tagging", :tags => "tagging, ohm, redis")
95
+ end
96
+
97
+ # For our first run, let's verify the fact that we can find a `Post`
98
+ # using any of the tags we gave.
99
+ test "find using a single tag" do |p|
100
+ assert Post.find(tag: "tagging").include?(p)
101
+ assert Post.find(tag: "ohm").include?(p)
102
+ assert Post.find(tag: "redis").include?(p)
103
+ end
104
+
105
+ # Now we verify our claim earlier, that it is possible to find a post
106
+ # using any one of the combinations for the given set of tags.
107
+ #
108
+ # We also verify that if we pass in a non-existent tag name that
109
+ # we'll fail to find the `Post` we just created.
110
+ test "find using an intersection of multiple tag names" do |p|
111
+ assert Post.find(tag: ["tagging", "ohm"]).include?(p)
112
+ assert Post.find(tag: ["tagging", "redis"]).include?(p)
113
+ assert Post.find(tag: ["ohm", "redis"]).include?(p)
114
+ assert Post.find(tag: ["tagging", "ohm", "redis"]).include?(p)
115
+
116
+ assert ! Post.find(tag: ["tagging", "foo"]).include?(p)
117
+ end
118
+
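The multi-tag lookups above can be pictured as set intersections, with one set of post ids per tag value. The sketch below models that idea in memory with Ruby's `Set`. The names `INDEX`, `index_post` and `find_by_tags` are hypothetical and are not part of Ohm, which keeps these sets in Redis.

```ruby
require "set"

# Hypothetical in-memory stand-in for the per-value index sets.
INDEX = Hash.new { |hash, tag| hash[tag] = Set.new }

# Record the post id under every tag it carries.
def index_post(id, tags)
  tags.each { |tag| INDEX[tag] << id }
end

# Return the ids of posts carrying all of the given tags (set intersection).
def find_by_tags(*tags)
  tags.map { |tag| INDEX.fetch(tag, Set.new) }.reduce(:&) || Set.new
end

index_post(1, %w[tagging ohm redis])
index_post(2, %w[ruby redis])

p find_by_tags("redis").to_a            # => [1, 2]
p find_by_tags("ohm", "redis").to_a     # => [1]
p find_by_tags("tagging", "foo").to_a   # => []
```

A tag that was never indexed contributes an empty set, so any intersection that includes it is empty, matching the `"foo"` assertion in the test above.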
119
+ #### Adding a Tag model
120
+
121
+ # Let's pretend that the client suddenly requested that we keep track
122
+ # of the number of times a tag has been used. It's a pretty fair requirement
123
+ # after all. Updating our requirements, we will now have:
124
+ #
125
+ # 1. We should be able to tag a post (with tags separated by commas).
126
+ # 2. We should be able to find a post with a given tag.
127
+ # 3. We should be able to find top tags, and their count.
128
+
129
+ # Continuing from our example above, let's require `ohm-contrib`, which we
130
+ # will be using for callbacks.
131
+ require "ohm/contrib"
132
+
133
+ # Let's quickly re-open our Post class.
134
+ class Post
135
+ # When we want our class to have extended functionality like callbacks,
136
+ # we simply include the necessary modules, in this case `Ohm::Callbacks`,
137
+ # which will be responsible for inserting `before_*` and `after_*` methods
138
+ # in the object's lifecycle.
139
+ include Ohm::Callbacks
140
+
141
+ # To make our code more concise, we just quickly change our implementation
142
+ # of `tag` to receive a default parameter:
143
+ def tag(tags = self.tags)
144
+ tags.to_s.split(/\s*,\s*/).uniq
145
+ end
146
+
147
+ # For all but the most simple cases, we would probably need to define
148
+ # callbacks. When we included `Ohm::Callbacks` above, it actually gave us
149
+ # the following:
150
+ #
151
+ # 1. `before_validate` and `after_validate`
152
+ # 2. `before_create` and `after_create`
153
+ # 3. `before_update` and `after_update`
154
+ # 4. `before_save` and `after_save`
155
+ # 5. `before_delete` and `after_delete`
156
+
157
+ # For our scenario, we only need a `before_update` and `after_save`.
158
+ # The idea for our `before_update` is to decrement the `total` of
159
+ # all existing tags. We use `get(:tags)` to fetch the original tags for the
160
+ # record, and use the newly assigned ones on save.
161
+ protected
162
+ def before_update
163
+ assigned_tags = tags
164
+ tag(get(:tags)).map(&Tag).each { |t| t.decrement :total }
165
+ self.tags = assigned_tags
166
+ end
167
+
168
+ # And of course, we increment all new tags for a particular record
169
+ # after successfully saving it.
170
+ def after_save
171
+ tag.map(&Tag).each { |t| t.increment :total }
172
+ end
173
+ end
174
+
175
+ #### Our Tag model
176
+
177
+ # The `Tag` model has only one field, which is a `counter` for the `total`.
178
+ # Since `Ohm` allows us to use any kind of ID (not just numeric sequences),
179
+ # we can actually use the tag name to identify a `Tag`.
180
+ class Tag < Ohm::Model
181
+ counter :total
182
+
183
+ # The syntax for finding a record by its ID is `Tag["ruby"]`. The standard
184
+ # behavior in `Ohm` is to return `nil` when the ID does not exist.
185
+ #
186
+ # To simplify our code, we override `Tag["ruby"]`, and make it create a
187
+ # new `Tag` if it doesn't exist yet. One important implementation detail
188
+ # though is that we need to encode the tag name, so special characters
189
+ # and spaces won't produce an invalid key.
190
+ def self.[](id)
191
+ encoded_id = id.encode
192
+ super(encoded_id) || create(:id => encoded_id)
193
+ end
194
+ end
195
+
196
+ #### Verifying our third requirement
197
+
198
+ # Continuing from our test cases above, let's add test coverage for the
199
+ # behavior of counting tags.
200
+
201
+ # For each and every tag we initially create, we need to make sure they have a
202
+ # total of 1.
203
+ test "verify total to be exactly 1" do
204
+ assert 1 == Tag["ohm"].total
205
+ assert 1 == Tag["redis"].total
206
+ assert 1 == Tag["tagging"].total
207
+ end
208
+
209
+ # If we try and create another post tagged "ruby", "redis", `Tag["redis"]`
210
+ # should then have a total of 2. All of the other tags will still have
211
+ # a total of 1.
212
+ test "verify totals increase" do
213
+ Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
214
+
215
+ assert 1 == Tag["ohm"].total
216
+ assert 1 == Tag["tagging"].total
217
+ assert 1 == Tag["ruby"].total
218
+ assert 2 == Tag["redis"].total
219
+ end
220
+
221
+ # Finally, let's verify the scenario where we create a `Post` tagged
222
+ # "ruby", "redis" and update it to only have the tag "redis",
223
+ # effectively removing the tag "ruby" from our `Post`.
224
+ test "updating an existing post decrements the tags removed" do
225
+ p = Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
226
+ p.update(:tags => "redis")
227
+
228
+ assert 0 == Tag["ruby"].total
229
+ assert 2 == Tag["redis"].total
230
+ end
231
+
232
+ #### Conclusion
233
+
234
+ # Most of the time we tend to think in RDBMS terms, and this is in
235
+ # no way a negative thing. However, it is important to try and switch your
236
+ # frame of mind when working with Ohm (and Redis) because it can save you
237
+ # a great deal of time, and possibly lead to a great design.
@@ -0,0 +1,72 @@
1
+ -- This script receives three parameters, all encoded with
2
+ -- JSON. The decoded values are used for deleting a model
3
+ -- instance in Redis and removing any reference to it in sets
4
+ -- (indices) and hashes (unique indices).
5
+ --
6
+ -- # model
7
+ --
8
+ -- Table with three attributes:
9
+ -- id (model instance id)
10
+ -- key (hash where the attributes are stored)
11
+ -- name (model name)
12
+ --
13
+ -- # uniques
14
+ --
15
+ -- Fields and values to be removed from the unique indices.
16
+ --
17
+ -- # tracked
18
+ --
19
+ -- Keys that share the lifecycle of this model instance, that
20
+ -- should be removed as this object is deleted.
21
+ --
22
+ local model = cjson.decode(ARGV[1])
23
+ local uniques = cjson.decode(ARGV[2])
24
+ local tracked = cjson.decode(ARGV[3])
25
+
26
+ local function remove_indices(model)
27
+ local memo = model.key .. ":_indices"
28
+ local existing = redis.call("SMEMBERS", memo)
29
+
30
+ for _, key in ipairs(existing) do
31
+ redis.call("SREM", key, model.id)
32
+ redis.call("SREM", memo, key)
33
+ end
34
+ end
35
+
36
+ local function remove_uniques(model, uniques)
37
+ local memo = model.key .. ":_uniques"
38
+
39
+ for field, _ in pairs(uniques) do
40
+ local key = model.name .. ":uniques:" .. field
41
+
42
+ redis.call("HDEL", key, redis.call("HGET", memo, key))
43
+ redis.call("HDEL", memo, key)
44
+ end
45
+ end
46
+
47
+ local function remove_tracked(model, tracked)
48
+ for _, tracked_key in ipairs(tracked) do
49
+ local key = model.key .. ":" .. tracked_key
50
+
51
+ redis.call("DEL", key)
52
+ end
53
+ end
54
+
55
+ local function delete(model)
56
+ local keys = {
57
+ model.key .. ":counters",
58
+ model.key .. ":_indices",
59
+ model.key .. ":_uniques",
60
+ model.key
61
+ }
62
+
63
+ redis.call("SREM", model.name .. ":all", model.id)
64
+ redis.call("DEL", unpack(keys))
65
+ end
66
+
67
+ remove_indices(model)
68
+ remove_uniques(model, uniques)
69
+ remove_tracked(model, tracked)
70
+ delete(model)
71
+
72
+ return model.id
data/lib/lua/save.lua ADDED
@@ -0,0 +1,126 @@
1
+ -- This script receives four parameters, all encoded with
2
+ -- JSON. The decoded values are used for saving a model
3
+ -- instance in Redis, creating or updating a hash as needed and
4
+ -- updating zero or more sets (indices) and zero or more hashes
5
+ -- (unique indices).
6
+ --
7
+ -- # model
8
+ --
9
+ -- Table with one or two attributes:
10
+ -- name (model name)
11
+ -- id (model instance id, optional)
12
+ --
13
+ -- If the id is not provided, it is treated as a new record.
14
+ --
15
+ -- # attrs
16
+ --
17
+ -- Array with attribute/value pairs.
18
+ --
19
+ -- # indices
20
+ --
21
+ -- Fields and values to be indexed. Each key in the indices
22
+ -- table is mapped to an array of values. One index is created
23
+ -- for each field/value pair.
24
+ --
25
+ -- # uniques
26
+ --
27
+ -- Fields and values to be indexed as unique. Unlike indices,
28
+ -- values are not enumerable. If a field/value pair is not unique
29
+ -- (i.e., if there was already a hash entry for that field and
30
+ -- value), an error is returned with the UniqueIndexViolation
31
+ -- message and the field that triggered the error.
32
+ --
33
+ local model = cjson.decode(ARGV[1])
34
+ local attrs = cjson.decode(ARGV[2])
35
+ local indices = cjson.decode(ARGV[3])
36
+ local uniques = cjson.decode(ARGV[4])
37
+
38
+ local function save(model, attrs)
39
+ if model.id == nil then
40
+ model.id = redis.call("INCR", model.name .. ":id")
41
+ end
42
+
43
+ model.key = model.name .. ":" .. model.id
44
+
45
+ redis.call("SADD", model.name .. ":all", model.id)
46
+ redis.call("DEL", model.key)
47
+
48
+ if #attrs % 2 == 1 then
49
+ error("Wrong number of attribute/value pairs")
50
+ end
51
+
52
+ if #attrs > 0 then
53
+ redis.call("HMSET", model.key, unpack(attrs))
54
+ end
55
+ end
56
+
57
+ local function index(model, indices)
58
+ for field, enum in pairs(indices) do
59
+ for _, val in ipairs(enum) do
60
+ local key = model.name .. ":indices:" .. field .. ":" .. val
61
+
62
+ redis.call("SADD", model.key .. ":_indices", key)
63
+ redis.call("SADD", key, model.id)
64
+ end
65
+ end
66
+ end
67
+
68
+ local function remove_indices(model)
69
+ local memo = model.key .. ":_indices"
70
+ local existing = redis.call("SMEMBERS", memo)
71
+
72
+ for _, key in ipairs(existing) do
73
+ redis.call("SREM", key, model.id)
74
+ redis.call("SREM", memo, key)
75
+ end
76
+ end
77
+
78
+ local function unique(model, uniques)
79
+ for field, value in pairs(uniques) do
80
+ local key = model.name .. ":uniques:" .. field
81
+ local val = value
82
+
83
+ redis.call("HSET", model.key .. ":_uniques", key, val)
84
+ redis.call("HSET", key, val, model.id)
85
+ end
86
+ end
87
+
88
+ local function remove_uniques(model)
89
+ local memo = model.key .. ":_uniques"
90
+
91
+ for _, key in pairs(redis.call("HKEYS", memo)) do
92
+ redis.call("HDEL", key, redis.call("HGET", memo, key))
93
+ redis.call("HDEL", memo, key)
94
+ end
95
+ end
96
+
97
+ local function verify(model, uniques)
98
+ local duplicates = {}
99
+
100
+ for field, value in pairs(uniques) do
101
+ local key = model.name .. ":uniques:" .. field
102
+ local id = redis.call("HGET", key, value)
103
+
104
+ if id and id ~= tostring(model.id) then
105
+ duplicates[#duplicates + 1] = field
106
+ end
107
+ end
108
+
109
+ return duplicates, #duplicates ~= 0
110
+ end
111
+
112
+ local duplicates, err = verify(model, uniques)
113
+
114
+ if err then
115
+ error("UniqueIndexViolation: " .. duplicates[1])
116
+ end
117
+
118
+ save(model, attrs)
119
+
120
+ remove_indices(model)
121
+ index(model, indices)
122
+
123
+ remove_uniques(model)
124
+ unique(model, uniques)
125
+
126
+ return tostring(model.id)
data/lib/ohm_util.rb ADDED
@@ -0,0 +1,116 @@
1
+ # encoding: UTF-8
2
+
3
+ module OhmUtil
4
+ LUA_CACHE = Hash.new { |h, k| h[k] = Hash.new }
5
+ LUA_SAVE = File.expand_path("../lua/save.lua", __FILE__)
6
+ LUA_DELETE = File.expand_path("../lua/delete.lua", __FILE__)
7
+
8
+ # All of the known errors in Ohm can be traced back to one of these
9
+ # exceptions.
10
+ #
11
+ # MissingID:
12
+ #
13
+ # Comment.new.id # => nil
14
+ # Comment.new.key # => Error
15
+ #
16
+ # Solution: you need to save your model first.
17
+ #
18
+ # IndexNotFound:
19
+ #
20
+ # Comment.find(:foo => "Bar") # => Error
21
+ #
22
+ # Solution: add an index with `Comment.index :foo`.
23
+ #
24
+ # UniqueIndexViolation:
25
+ #
26
+ # Raised when trying to save an object with a `unique` index for
27
+ # which the value already exists.
28
+ #
29
+ # Solution: rescue `Ohm::UniqueIndexViolation` during save, but
30
+ # also, do some validations even before attempting to save.
31
+ #
32
+ class Error < StandardError; end
33
+ class MissingID < Error; end
34
+ class IndexNotFound < Error; end
35
+ class UniqueIndexViolation < Error; end
36
+
37
+ module ErrorPatterns
38
+ DUPLICATE = /(UniqueIndexViolation: (\w+))/.freeze
39
+ NOSCRIPT = /^NOSCRIPT/.freeze
40
+ end
41
+
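As a quick illustration of how these patterns behave, the sketch below matches them against sample error messages. The message strings are made up for the example; only the two regular expressions come from the module above.

```ruby
# The two patterns, copied from OhmUtil::ErrorPatterns.
DUPLICATE = /(UniqueIndexViolation: (\w+))/.freeze
NOSCRIPT  = /^NOSCRIPT/.freeze

# Hypothetical error messages, shaped like what a script error might carry.
duplicate_msg = "ERR user_script:115: UniqueIndexViolation: email"
noscript_msg  = "NOSCRIPT No matching script. Please use EVAL."

if (match = DUPLICATE.match(duplicate_msg))
  puts match[1]   # => UniqueIndexViolation: email
  puts match[2]   # => email
end

puts NOSCRIPT.match?(noscript_msg)    # => true
puts NOSCRIPT.match?(duplicate_msg)   # => false
```

The first capture group holds the whole violation text and the second holds only the field name.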
42
+ # Used by: `attribute`, `counter`, `set`, `reference`,
43
+ # `collection`.
44
+ #
45
+ # Employed as a solution to avoid `NameError` problems when trying
46
+ # to load models referring to other models not yet loaded.
47
+ #
48
+ # Example:
49
+ #
50
+ # class Comment < Ohm::Model
51
+ # reference :user, User # NameError undefined constant User.
52
+ # end
53
+ #
54
+ # # Instead of relying on some clever `const_missing` hack, we can
55
+ # # simply use a symbol or a string.
56
+ #
57
+ # class Comment < Ohm::Model
58
+ # reference :user, :User
59
+ # reference :post, "Post"
60
+ # end
61
+ #
62
+ def self.const(context, name)
63
+ case name
64
+ when Symbol, String
65
+ context.const_get(name)
66
+ else name
67
+ end
68
+ end
69
+
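The following sketch shows the lookup performed by `const` in isolation. The wrapper module `ConstDemo` and the empty `User` class are hypothetical names used only for this demonstration.

```ruby
# Copy of OhmUtil.const, wrapped in a hypothetical module for the demo.
module ConstDemo
  def self.const(context, name)
    case name
    when Symbol, String
      # Resolve the name lazily, at call time, inside the given context.
      context.const_get(name)
    else name
    end
  end
end

# Hypothetical model class, defined after ConstDemo on purpose.
class User; end

p ConstDemo.const(Object, :User)    # => User
p ConstDemo.const(Object, "User")   # => User
p ConstDemo.const(Object, User)     # => User
```

Since the symbol or string is only resolved when `const` is called, the referenced class does not have to exist when the referring model is loaded.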
70
+ def self.dict(arr)
71
+ Hash[*arr]
72
+ end
73
+
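A small sketch of what `dict` does with a flat list of field/value pairs, the same shape the `attrs` parameter of `save.lua` is described as having. The sample values are made up for illustration.

```ruby
# A flat list of alternating fields and values (sample data).
flat = ["body", "Ohm Tagging", "tags", "tagging, ohm, redis"]

# Same expression as OhmUtil.dict: pair up consecutive elements.
dict = Hash[*flat]

puts dict["body"]   # => Ohm Tagging
puts dict["tags"]   # => tagging, ohm, redis
puts dict.size      # => 2
```

Note that `Hash[*arr]` raises an `ArgumentError` when the list has an odd number of elements.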
74
+ def self.sort_options(options)
75
+ args = []
76
+
77
+ args.concat(["BY", options[:by]]) if options[:by]
78
+ args.concat(["GET", options[:get]]) if options[:get]
79
+ args.concat(["LIMIT"] + options[:limit]) if options[:limit]
80
+ args.concat(options[:order].split(" ")) if options[:order]
81
+ args.concat(["STORE", options[:store]]) if options[:store]
82
+
83
+ return args
84
+ end
85
+
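To make the mapping from an options hash to a `SORT` argument list concrete, here is a minimal sketch that reuses the same logic. The wrapper module `SortDemo` and the sample option values are assumptions made for the example.

```ruby
# Copy of OhmUtil.sort_options, wrapped in a hypothetical module for the demo.
module SortDemo
  def self.sort_options(options)
    args = []

    args.concat(["BY", options[:by]]) if options[:by]
    args.concat(["GET", options[:get]]) if options[:get]
    args.concat(["LIMIT"] + options[:limit]) if options[:limit]
    args.concat(options[:order].split(" ")) if options[:order]
    args.concat(["STORE", options[:store]]) if options[:store]

    args
  end
end

p SortDemo.sort_options(by: "Post:*->body", limit: [0, 10], order: "ALPHA DESC")
# => ["BY", "Post:*->body", "LIMIT", 0, 10, "ALPHA", "DESC"]

p SortDemo.sort_options({})
# => []
```

The output order is fixed by the method, not by the order of the keys in the hash, and `:limit` is expected to be an array of offset and count.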
86
+ # Run Lua scripts and cache each script's SHA so that successive
87
+ # calls can use EVALSHA without reloading the source.
88
+ def self.script(redis, file, *args)
89
+ begin
90
+ cache = LUA_CACHE[redis.url]
91
+
92
+ if cache.key?(file)
93
+ sha = cache[file]
94
+ else
95
+ src = File.read(file)
96
+ sha = redis.call("SCRIPT", "LOAD", src)
97
+
98
+ cache[file] = sha
99
+ end
100
+
101
+ redis.call!("EVALSHA", sha, *args)
102
+
103
+ rescue RuntimeError
104
+
105
+ case $!.message
106
+ when ErrorPatterns::NOSCRIPT
107
+ LUA_CACHE[redis.url].clear
108
+ retry
109
+ when ErrorPatterns::DUPLICATE
110
+ raise UniqueIndexViolation, $1
111
+ else
112
+ raise $!
113
+ end
114
+ end
115
+ end
116
+ end
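The load, cache and retry flow of `script` can be exercised without a Redis server. The sketch below is a simplified model under stated assumptions: `FakeRedis`, its methods, `CACHE` and `run_script` are all hypothetical, and the script source itself is used as the cache key instead of a file path.

```ruby
# Hypothetical stand-in for a Redis client; it only mimics script loading.
class FakeRedis
  attr_reader :loads

  def initialize
    @scripts = {}
    @loads = 0
  end

  def url
    "redis://fake"
  end

  # Mimics SCRIPT LOAD: store the source under a made-up digest.
  def load_script(src)
    @loads += 1
    sha = "sha-#{src.hash}"
    @scripts[sha] = src
    sha
  end

  # Mimics EVALSHA: fail with NOSCRIPT when the digest is unknown.
  def evalsha(sha)
    raise "NOSCRIPT No matching script" unless @scripts.key?(sha)
    "ran #{@scripts[sha]}"
  end

  # Mimics the server forgetting its scripts, e.g. after a restart.
  def flush_scripts
    @scripts.clear
  end
end

# One digest cache per server URL, as with LUA_CACHE.
CACHE = Hash.new { |h, k| h[k] = {} }

def run_script(redis, src)
  sha = (CACHE[redis.url][src] ||= redis.load_script(src))
  redis.evalsha(sha)
rescue RuntimeError => e
  raise unless e.message.start_with?("NOSCRIPT")
  # The cached digest is stale: drop it and load the script again.
  CACHE[redis.url].clear
  retry
end

redis = FakeRedis.new
puts run_script(redis, "return 1")   # => ran return 1
run_script(redis, "return 1")
puts redis.loads                     # => 1
redis.flush_scripts
run_script(redis, "return 1")
puts redis.loads                     # => 2
```

The second call reuses the cached digest, and the call after `flush_scripts` recovers by reloading once and retrying.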