ohm_util 0.1

@@ -0,0 +1,237 @@
+ ### Tagging
+
+ #### Intro
+
+ # When building a Web 2.0 application, tagging will probably come up
+ # as one of the most requested features. Popularized by Delicious,
+ # it has quickly become a useful way to organize crowd-sourced data.
+
+ #### How it was done
+
+ # Typically, when you do tagging using an RDBMS, you'll probably end up
+ # having a taggings and a tags table, hence a many-to-many design.
+ # Here is a quick sketch just to illustrate:
+ #
+ #     Post      Taggings      Tag
+ #     ----      --------      ---
+ #     id        tag_id        id
+ #     title     post_id       name
+ #
+ # As you can see, this design leads to a lot of problems:
+ #
+ # 1. Finding the tags of a post means going through taggings, and
+ #    then individually finding each actual tag.
+ # 2. One might be inclined to use a JOIN query, but we all know
+ #    [joins are evil](http://stackoverflow.com/questions/1020847).
+ # 3. Building a tag cloud or some form of tag ranking is unintuitive.
+
+ #### The Ohm approach
+
+ # Here is a basic outline of what we'll need:
+ #
+ # 1. We should be able to tag a post (with tags separated by commas).
+ # 2. We should be able to find a post with a given tag.
+
+ #### Beginning with our Post model
+
+ # Let's first require ohm.
+ require 'ohm'
+
+ # We then declare our class, inheriting from `Ohm::Model` in the process.
+ class Post < Ohm::Model
+
+   # The structure, fields, and other associations are defined in a declarative
+   # manner. Ohm allows us to declare *attributes*, *sets*, *lists* and
+   # *counters*. For our use case here, only two *attributes* will get the job
+   # done. The `body` will just be a plain string, and the `tags` will contain
+   # our comma-separated list of words, e.g. "ruby, redis, ohm". We then declare
+   # an `index` (which can be an `attribute` or just a plain old method), which
+   # we point to our method `tag`.
+   attribute :body
+   attribute :tags
+   index :tag
+
+   # One very interesting thing about Ohm indices is that the indexed value can
+   # be either a *String* or an *Enumerable* data structure. When it is an
+   # *Enumerable*, `Ohm` will create an index for every element. So if `tag`
+   # returned `["ruby", "redis", "ohm"]` then we can search it using any of the
+   # following:
+   #
+   # 1. ruby
+   # 2. redis
+   # 3. ohm
+   # 4. ruby, redis
+   # 5. ruby, ohm
+   # 6. redis, ohm
+   # 7. ruby, redis, ohm
+   #
+   # Pretty neat, ain't it?
+   def tag
+     tags.to_s.split(/\s*,\s*/).uniq
+   end
+ end
+
+ #### Testing it out
+
+ # It's a very good habit to test all the time. In the Ruby community,
+ # a lot of test frameworks have been created.
+
+ # For our purposes in this example, we'll use cutest.
+ require "cutest"
+
+ # Cutest allows us to define callbacks which are guaranteed to be executed
+ # every time a new `test` begins. Here, we just make sure that the Redis
+ # instance of `Ohm` is empty every time.
+ prepare { Ohm.flush }
+
+ # Next, let's create a simple `Post` instance. The return value of the `setup`
+ # block will be passed to every `test` block, so we don't actually have to
+ # assign it to an instance variable.
+ setup do
+   Post.create(:body => "Ohm Tagging", :tags => "tagging, ohm, redis")
+ end
+
+ # For our first run, let's verify the fact that we can find a `Post`
+ # using any of the tags we gave.
+ test "find using a single tag" do |p|
+   assert Post.find(tag: "tagging").include?(p)
+   assert Post.find(tag: "ohm").include?(p)
+   assert Post.find(tag: "redis").include?(p)
+ end
+
+ # Now we verify our earlier claim, that it is possible to find a post
+ # using any one of the combinations for the given set of tags.
+ #
+ # We also verify that if we pass in a non-existent tag name,
+ # we'll fail to find the `Post` we just created.
+ test "find using an intersection of multiple tag names" do |p|
+   assert Post.find(tag: ["tagging", "ohm"]).include?(p)
+   assert Post.find(tag: ["tagging", "redis"]).include?(p)
+   assert Post.find(tag: ["ohm", "redis"]).include?(p)
+   assert Post.find(tag: ["tagging", "ohm", "redis"]).include?(p)
+
+   assert ! Post.find(tag: ["tagging", "foo"]).include?(p)
+ end
+
+ #### Adding a Tag model
+
+ # Let's pretend that the client suddenly requested that we keep track
+ # of the number of times a tag has been used. It's a pretty fair requirement
+ # after all. Updating our requirements, we will now have:
+ #
+ # 1. We should be able to tag a post (with tags separated by commas).
+ # 2. We should be able to find a post with a given tag.
+ # 3. We should be able to find top tags, and their count.
+
+ # Continuing from our example above, let's require `ohm-contrib`, which we
+ # will be using for callbacks.
+ require "ohm/contrib"
+
+ # Let's quickly re-open our Post class.
+ class Post
+   # When we want our class to have extended functionality like callbacks,
+   # we simply include the necessary modules, in this case `Ohm::Callbacks`,
+   # which will be responsible for inserting `before_*` and `after_*` methods
+   # in the object's lifecycle.
+   include Ohm::Callbacks
+
+   # To make our code more concise, we just quickly change our implementation
+   # of `tag` to receive a default parameter:
+   def tag(tags = self.tags)
+     tags.to_s.split(/\s*,\s*/).uniq
+   end
+
+   # For all but the most simple cases, we would probably need to define
+   # callbacks. When we included `Ohm::Callbacks` above, it actually gave us
+   # the following:
+   #
+   # 1. `before_validate` and `after_validate`
+   # 2. `before_create` and `after_create`
+   # 3. `before_update` and `after_update`
+   # 4. `before_save` and `after_save`
+   # 5. `before_delete` and `after_delete`
+
+   # For our scenario, we only need a `before_update` and `after_save`.
+   # The idea for our `before_update` is to decrement the `total` of
+   # all existing tags. We use `get(:tags)` to fetch the original tags for
+   # the record, then put the assigned ones back so they are used on save.
+   protected
+   def before_update
+     assigned_tags = tags
+     tag(get(:tags)).map(&Tag).each { |t| t.decrement :total }
+     self.tags = assigned_tags
+   end
+
+   # And of course, we increment all new tags for a particular record
+   # after successfully saving it.
+   def after_save
+     tag.map(&Tag).each { |t| t.increment :total }
+   end
+ end
+
+ #### Our Tag model
+
+ # The `Tag` model has only one field, which is a `counter` for the `total`.
+ # Since `Ohm` allows us to use any kind of ID (not just numeric sequences),
+ # we can actually use the tag name to identify a `Tag`.
+ class Tag < Ohm::Model
+   counter :total
+
+   # The syntax for finding a record by its ID is `Tag["ruby"]`. The standard
+   # behavior in `Ohm` is to return `nil` when the ID does not exist.
+   #
+   # To simplify our code, we override `Tag["ruby"]`, and make it create a
+   # new `Tag` if it doesn't exist yet. One important implementation detail
+   # though is that we need to encode the tag name, so special characters
+   # and spaces won't produce an invalid key.
+   def self.[](id)
+     encoded_id = id.encode
+     super(encoded_id) || create(:id => encoded_id)
+   end
+ end
+
+ #### Verifying our third requirement
+
+ # Continuing from our test cases above, let's add test coverage for the
+ # behavior of counting tags.
+
+ # For each and every tag we initially create, we need to make sure they have a
+ # total of 1.
+ test "verify total to be exactly 1" do
+   assert 1 == Tag["ohm"].total
+   assert 1 == Tag["redis"].total
+   assert 1 == Tag["tagging"].total
+ end
+
+ # If we try and create another post tagged "ruby", "redis", `Tag["redis"]`
+ # should then have a total of 2. All of the other tags will still have
+ # a total of 1.
+ test "verify totals increase" do
+   Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
+
+   assert 1 == Tag["ohm"].total
+   assert 1 == Tag["tagging"].total
+   assert 1 == Tag["ruby"].total
+   assert 2 == Tag["redis"].total
+ end
+
+ # Finally, let's verify the scenario where we create a `Post` tagged
+ # "ruby", "redis" and update it to only have the tag "redis",
+ # effectively removing the tag "ruby" from our `Post`.
+ test "updating an existing post decrements the tags removed" do
+   p = Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
+   p.update(:tags => "redis")
+
+   assert 0 == Tag["ruby"].total
+   assert 2 == Tag["redis"].total
+ end
+
+ #### Conclusion
+
+ # Most of the time we tend to think in RDBMS terms, and this is in
+ # no way a negative thing. However, it is important to try and switch your
+ # frame of mind when working with Ohm (and Redis) because it can save
+ # you a great deal of time, and possibly lead to a great design.
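The per-element index and the tag counters described in the example can be pictured without Redis. The following is a minimal in-memory sketch, assuming plain Ruby hashes and sets stand in for the Redis structures; `TagSketch` and all of its methods are hypothetical names for illustration and are not part of Ohm.

```ruby
require "set"

# Hypothetical in-memory stand-in for the Redis structures; not Ohm's API.
class TagSketch
  def initialize
    @index  = Hash.new { |h, k| h[k] = Set.new } # tag name => post ids
    @totals = Hash.new(0)                        # tag name => usage count
    @tags   = {}                                 # post id  => raw tag string
  end

  # Same parsing rule as Post#tag in the example.
  def parse(raw)
    raw.to_s.split(/\s*,\s*/).uniq
  end

  # Create or update: release the old tags first, then record the new ones.
  def save(id, raw)
    parse(@tags[id]).each do |name|
      @index[name].delete(id)
      @totals[name] -= 1
    end

    parse(raw).each do |name|
      @index[name] << id
      @totals[name] += 1
    end

    @tags[id] = raw
  end

  # Posts carrying every one of the given tags (a set intersection).
  def find(*names)
    names.map { |name| @index[name] }.reduce(:&) || Set.new
  end

  def total(name)
    @totals[name]
  end
end

sketch = TagSketch.new
sketch.save(1, "tagging, ohm, redis")
sketch.save(2, "ruby, redis")
sketch.find("tagging", "ohm") # contains only post 1
sketch.save(2, "redis")       # drops "ruby" from post 2
sketch.total("ruby")          # => 0
```

Keeping one set per tag is what makes multi-tag lookups an intersection rather than a join, which is the design point the example argues for.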
@@ -0,0 +1,72 @@
+ -- This script receives three parameters, all encoded with
+ -- JSON. The decoded values are used for deleting a model
+ -- instance in Redis and removing any reference to it in sets
+ -- (indices) and hashes (unique indices).
+ --
+ -- # model
+ --
+ -- Table with three attributes:
+ --    id (model instance id)
+ --    key (hash where the attributes will be saved)
+ --    name (model name)
+ --
+ -- # uniques
+ --
+ -- Fields and values to be removed from the unique indices.
+ --
+ -- # tracked
+ --
+ -- Keys that share the lifecycle of this model instance, that
+ -- should be removed as this object is deleted.
+ --
+ local model = cjson.decode(ARGV[1])
+ local uniques = cjson.decode(ARGV[2])
+ local tracked = cjson.decode(ARGV[3])
+
+ local function remove_indices(model)
+   local memo = model.key .. ":_indices"
+   local existing = redis.call("SMEMBERS", memo)
+
+   for _, key in ipairs(existing) do
+     redis.call("SREM", key, model.id)
+     redis.call("SREM", memo, key)
+   end
+ end
+
+ local function remove_uniques(model, uniques)
+   local memo = model.key .. ":_uniques"
+
+   for field, _ in pairs(uniques) do
+     local key = model.name .. ":uniques:" .. field
+
+     redis.call("HDEL", key, redis.call("HGET", memo, key))
+     redis.call("HDEL", memo, key)
+   end
+ end
+
+ local function remove_tracked(model, tracked)
+   for _, tracked_key in ipairs(tracked) do
+     local key = model.key .. ":" .. tracked_key
+
+     redis.call("DEL", key)
+   end
+ end
+
+ local function delete(model)
+   local keys = {
+     model.key .. ":counters",
+     model.key .. ":_indices",
+     model.key .. ":_uniques",
+     model.key
+   }
+
+   redis.call("SREM", model.name .. ":all", model.id)
+   redis.call("DEL", unpack(keys))
+ end
+
+ remove_indices(model)
+ remove_uniques(model, uniques)
+ remove_tracked(model, tracked)
+ delete(model)
+
+ return model.id
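The `remove_indices` step in the delete script relies on a per-instance "memo" set (`<key>:_indices`) that remembers which index sets the instance was added to, so they can be cleaned up without scanning. Here is a rough in-memory sketch of that bookkeeping, assuming a Ruby hash of sets in place of Redis; `remove_indices_sketch` and `store` are hypothetical names used only for illustration.

```ruby
require "set"

# Hypothetical stand-in for Redis: every key maps to a Set of members.
store = Hash.new { |h, k| h[k] = Set.new }

# Mirrors the shape of remove_indices: walk the memo set, remove the id
# from each index set it names, then forget that index in the memo.
def remove_indices_sketch(store, model_key, id)
  memo = store["#{model_key}:_indices"]

  memo.to_a.each do |index_key|
    store[index_key].delete(id)
    memo.delete(index_key)
  end
end

# Two posts share the "redis" index; only post 1 is removed from it.
store["Post:indices:tag:redis"] = Set["1", "2"]
store["Post:indices:tag:ohm"]   = Set["1"]
store["Post:1:_indices"]        = Set["Post:indices:tag:redis", "Post:indices:tag:ohm"]

remove_indices_sketch(store, "Post:1", "1")
```

The memo is iterated over a copy (`to_a`) because it is modified inside the loop; the Lua script achieves the same effect by reading `SMEMBERS` into a table first.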
data/lib/lua/save.lua ADDED
@@ -0,0 +1,126 @@
+ -- This script receives four parameters, all encoded with
+ -- JSON. The decoded values are used for saving a model
+ -- instance in Redis, creating or updating a hash as needed and
+ -- updating zero or more sets (indices) and zero or more hashes
+ -- (unique indices).
+ --
+ -- # model
+ --
+ -- Table with one or two attributes:
+ --    name (model name)
+ --    id (model instance id, optional)
+ --
+ -- If the id is not provided, it is treated as a new record.
+ --
+ -- # attrs
+ --
+ -- Array with attribute/value pairs.
+ --
+ -- # indices
+ --
+ -- Fields and values to be indexed. Each key in the indices
+ -- table is mapped to an array of values. One index is created
+ -- for each field/value pair.
+ --
+ -- # uniques
+ --
+ -- Fields and values to be indexed as unique. Unlike indices,
+ -- values are not enumerable. If a field/value pair is not unique
+ -- (i.e., if there was already a hash entry for that field and
+ -- value), an error is returned with the UniqueIndexViolation
+ -- message and the field that triggered the error.
+ --
+ local model = cjson.decode(ARGV[1])
+ local attrs = cjson.decode(ARGV[2])
+ local indices = cjson.decode(ARGV[3])
+ local uniques = cjson.decode(ARGV[4])
+
+ local function save(model, attrs)
+   if model.id == nil then
+     model.id = redis.call("INCR", model.name .. ":id")
+   end
+
+   model.key = model.name .. ":" .. model.id
+
+   redis.call("SADD", model.name .. ":all", model.id)
+   redis.call("DEL", model.key)
+
+   if #attrs % 2 == 1 then
+     error("Wrong number of attribute/value pairs")
+   end
+
+   if #attrs > 0 then
+     redis.call("HMSET", model.key, unpack(attrs))
+   end
+ end
+
+ local function index(model, indices)
+   for field, enum in pairs(indices) do
+     for _, val in ipairs(enum) do
+       local key = model.name .. ":indices:" .. field .. ":" .. val
+
+       redis.call("SADD", model.key .. ":_indices", key)
+       redis.call("SADD", key, model.id)
+     end
+   end
+ end
+
+ local function remove_indices(model)
+   local memo = model.key .. ":_indices"
+   local existing = redis.call("SMEMBERS", memo)
+
+   for _, key in ipairs(existing) do
+     redis.call("SREM", key, model.id)
+     redis.call("SREM", memo, key)
+   end
+ end
+
+ local function unique(model, uniques)
+   for field, value in pairs(uniques) do
+     local key = model.name .. ":uniques:" .. field
+     local val = value
+
+     redis.call("HSET", model.key .. ":_uniques", key, val)
+     redis.call("HSET", key, val, model.id)
+   end
+ end
+
+ local function remove_uniques(model)
+   local memo = model.key .. ":_uniques"
+
+   for _, key in pairs(redis.call("HKEYS", memo)) do
+     redis.call("HDEL", key, redis.call("HGET", memo, key))
+     redis.call("HDEL", memo, key)
+   end
+ end
+
+ local function verify(model, uniques)
+   local duplicates = {}
+
+   for field, value in pairs(uniques) do
+     local key = model.name .. ":uniques:" .. field
+     local id = redis.call("HGET", key, value)
+
+     if id and id ~= tostring(model.id) then
+       duplicates[#duplicates + 1] = field
+     end
+   end
+
+   return duplicates, #duplicates ~= 0
+ end
+
+ local duplicates, err = verify(model, uniques)
+
+ if err then
+   error("UniqueIndexViolation: " .. duplicates[1])
+ end
+
+ save(model, attrs)
+
+ remove_indices(model)
+ index(model, indices)
+
+ remove_uniques(model)
+ unique(model, uniques)
+
+ return tostring(model.id)
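The header comment of the save script describes four JSON-encoded arguments. As a rough illustration of what a caller might assemble, here is a sketch using only Ruby's standard `json` library; `save_arguments` and `index_keys` are hypothetical helpers, not functions provided by this gem. Because the script tests `model.id == nil`, the sketch leaves the `id` key out entirely for a new record.

```ruby
require "json"

# Hypothetical helper: builds the four JSON strings in the order the
# script reads them (model, attrs, indices, uniques).
def save_arguments(name, id, attrs, indices, uniques)
  model = { "name" => name }
  model["id"] = id unless id.nil? # no id key => treated as a new record

  # attrs becomes a flat array of attribute/value pairs.
  pairs = attrs.to_a.flatten(1)

  [model, pairs, indices, uniques].map { |part| JSON.generate(part) }
end

# Hypothetical helper: the index set names that index() would touch,
# following the "<name>:indices:<field>:<value>" layout in the script.
def index_keys(name, indices)
  indices.flat_map do |field, values|
    values.map { |value| "#{name}:indices:#{field}:#{value}" }
  end
end

args = save_arguments(
  "Post", nil,
  { "body" => "Ohm Tagging", "tags" => "ohm, redis" },
  { "tag" => ["ohm", "redis"] },
  {}
)
keys = index_keys("Post", { "tag" => ["ohm", "redis"] })
```

Passing each index field as an array of values is what lets a single `tags` string fan out into one set per tag.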
data/lib/ohm_util.rb ADDED
@@ -0,0 +1,116 @@
+ # encoding: UTF-8
+
+ module OhmUtil
+   LUA_CACHE = Hash.new { |h, k| h[k] = Hash.new }
+   LUA_SAVE = File.expand_path("../lua/save.lua", __FILE__)
+   LUA_DELETE = File.expand_path("../lua/delete.lua", __FILE__)
+
+   # All of the known errors in Ohm can be traced back to one of these
+   # exceptions.
+   #
+   # MissingID:
+   #
+   #   Comment.new.id # => nil
+   #   Comment.new.key # => Error
+   #
+   #   Solution: you need to save your model first.
+   #
+   # IndexNotFound:
+   #
+   #   Comment.find(:foo => "Bar") # => Error
+   #
+   #   Solution: add an index with `Comment.index :foo`.
+   #
+   # UniqueIndexViolation:
+   #
+   #   Raised when trying to save an object with a `unique` index for
+   #   which the value already exists.
+   #
+   #   Solution: rescue `Ohm::UniqueIndexViolation` during save, but
+   #   also, do some validations even before attempting to save.
+   #
+   class Error < StandardError; end
+   class MissingID < Error; end
+   class IndexNotFound < Error; end
+   class UniqueIndexViolation < Error; end
+
+   module ErrorPatterns
+     DUPLICATE = /(UniqueIndexViolation: (\w+))/.freeze
+     NOSCRIPT = /^NOSCRIPT/.freeze
+   end
+
+   # Used by: `attribute`, `counter`, `set`, `reference`,
+   # `collection`.
+   #
+   # Employed as a solution to avoid `NameError` problems when trying
+   # to load models referring to other models not yet loaded.
+   #
+   # Example:
+   #
+   #   class Comment < Ohm::Model
+   #     reference :user, User # NameError undefined constant User.
+   #   end
+   #
+   #   # Instead of relying on some clever `const_missing` hack, we can
+   #   # simply use a symbol or a string.
+   #
+   #   class Comment < Ohm::Model
+   #     reference :user, :User
+   #     reference :post, "Post"
+   #   end
+   #
+   def self.const(context, name)
+     case name
+     when Symbol, String
+       context.const_get(name)
+     else name
+     end
+   end
+
+   def self.dict(arr)
+     Hash[*arr]
+   end
+
+   def self.sort_options(options)
+     args = []
+
+     args.concat(["BY", options[:by]]) if options[:by]
+     args.concat(["GET", options[:get]]) if options[:get]
+     args.concat(["LIMIT"] + options[:limit]) if options[:limit]
+     args.concat(options[:order].split(" ")) if options[:order]
+     args.concat(["STORE", options[:store]]) if options[:store]
+
+     return args
+   end
+
+   # Run Lua scripts and cache each script's SHA to speed up
+   # successive calls. If the server reports NOSCRIPT, the cache
+   # for that connection is cleared and the call is retried.
+   def self.script(redis, file, *args)
+     begin
+       cache = LUA_CACHE[redis.url]
+
+       if cache.key?(file)
+         sha = cache[file]
+       else
+         src = File.read(file)
+         sha = redis.call("SCRIPT", "LOAD", src)
+
+         cache[file] = sha
+       end
+
+       redis.call!("EVALSHA", sha, *args)
+
+     rescue RuntimeError
+
+       case $!.message
+       when ErrorPatterns::NOSCRIPT
+         LUA_CACHE[redis.url].clear
+         retry
+       when ErrorPatterns::DUPLICATE
+         raise UniqueIndexViolation, $1
+       else
+         raise $!
+       end
+     end
+   end
+ end
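The load-once, retry-on-NOSCRIPT pattern in `OhmUtil.script` can be exercised without a Redis server. The sketch below is an assumption-laden illustration: `StubRedis`, `SHA_CACHE`, and `run_cached` are hypothetical names, and the stub only imitates the three things the pattern depends on (a `url`, a `call` that handles `SCRIPT LOAD`, and a `call!` that raises on an unknown SHA).

```ruby
require "digest"

# Hypothetical stand-in for a Redis client; not a real client API.
class StubRedis
  attr_reader :url, :loads

  def initialize(url)
    @url = url
    @scripts = {} # sha => script source
    @loads = 0    # how many times SCRIPT LOAD was issued
  end

  # Only understands SCRIPT LOAD; returns the script's SHA1.
  def call(*args)
    return unless args[0] == "SCRIPT" && args[1] == "LOAD"

    @loads += 1
    sha = Digest::SHA1.hexdigest(args[2])
    @scripts[sha] = args[2]
    sha
  end

  # Imitates EVALSHA: raises when the SHA is unknown, else echoes the args.
  def call!(_command, sha, *rest)
    raise RuntimeError, "NOSCRIPT No matching script" unless @scripts.key?(sha)

    rest
  end

  # Imitates a server that lost its script cache (e.g. after a restart).
  def flush_scripts
    @scripts.clear
  end
end

SHA_CACHE = Hash.new { |h, k| h[k] = {} }

# Load the script once per connection URL, then call it by SHA. On
# NOSCRIPT, drop the cached SHAs for that URL and run the method again.
def run_cached(redis, name, src, *args)
  cache = SHA_CACHE[redis.url]
  cache[name] ||= redis.call("SCRIPT", "LOAD", src)
  redis.call!("EVALSHA", cache[name], *args)
rescue RuntimeError => e
  raise unless e.message.start_with?("NOSCRIPT")

  SHA_CACHE[redis.url].clear
  retry
end

redis = StubRedis.new("redis://example.invalid")
run_cached(redis, :save, "return 1", "0") # loads the script
run_cached(redis, :save, "return 1", "0") # reuses the cached SHA
```

Keying the cache by connection URL means a SHA cached for one server is never assumed to exist on another.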