RubyGems - ohm_util - Versions diffs - 0.1 - Mend

ohm_util 0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40)

data/examples/tagging.rb ADDED

@@ -0,0 +1,237 @@
+### Tagging
+#### Intro
+# When building a Web 2.0 application, tagging will probably come up
+# as one of the most requested features. Popularized by Delicious,
+# it has quickly become a useful way to organize crowdsourced data.
+#### How it was done
+# Typically, when you do tagging using an RDBMS, you'll probably end up
+# having a taggings and a tags table, hence a many-to-many design.
+# Here is a quick sketch just to illustrate:
+#
+#
+#
+#     Post      Taggings      Tag
+#     ----      --------      ---
+#     id        tag_id        id
+#     title     post_id       name
+#
+# As you can see, this design leads to a lot of problems:
+#
+# 1. Trying to find the tags of a post will have to go through taggings, and
+#    then individually find the actual tag.
+# 2. One might be inclined to use a JOIN query, but we all know
+#    [joins are evil](http://stackoverflow.com/questions/1020847).
+# 3. Building a tag cloud or some form of tag ranking is unintuitive.
+#### The Ohm approach
+# Here is a basic outline of what we'll need:
+#
+# 1.  We should be able to tag a post (separated by commas).
+# 2.  We should be able to find a post with a given tag.
+#### Beginning with our Post model
+# Let's first require ohm.
+require 'ohm'
+# We then declare our class, inheriting from `Ohm::Model` in the process.
+class Post < Ohm::Model
+  # The structure, fields, and other associations are defined in a declarative
+  # manner. Ohm allows us to declare *attributes*, *sets*, *lists* and
+  # *counters*. For our use case here, only two *attributes* will get the job
+  # done. The `body` will just
+  # be a plain string, and the `tags` will contain our comma-separated list of
+  # words, e.g. "ruby, redis, ohm". We then declare an `index` (which can be
+  # an `attribute` or just a plain old method), which we point to our method
+  # `tag`.
+  attribute :body
+  attribute :tags
+  index :tag
+  # One very interesting thing about Ohm indexes is that each one can be either
+  # a *String* or an *Enumerable* data structure. When we declare it as an
+  # *Enumerable*, `Ohm` will create an index for every element. So if `tag`
+  # returned `[ruby, redis, ohm]` then we can search it using any of the
+  # following:
+  #
+  # 1. ruby
+  # 2. redis
+  # 3. ohm
+  # 4. ruby, redis
+  # 5. ruby, ohm
+  # 6. redis, ohm
+  # 7. ruby, redis, ohm
+  #
+  # Pretty neat ain't it?
+  def tag
+    tags.to_s.split(/\s*,\s*/).uniq
+  end
+end
+#### Testing it out
+# It's a very good habit to test all the time. In the Ruby community,
+# a lot of test frameworks have been created.
+# For our purposes in this example, we'll use cutest.
+require "cutest"
+# Cutest allows us to define callbacks which are guaranteed to be executed
+# every time a new `test` begins. Here, we just make sure that the Redis
+# instance of `Ohm` is empty every time.
+prepare { Ohm.flush }
+# Next, let's create a simple `Post` instance. The return value of the `setup`
+# block will be passed to every `test` block, so we don't actually have to
+# assign it to an instance variable.
+setup do
+  Post.create(:body => "Ohm Tagging", :tags => "tagging, ohm, redis")
+end
+# For our first run, let's verify the fact that we can find a `Post`
+# using any of the tags we gave.
+test "find using a single tag" do |p|
+  assert Post.find(tag: "tagging").include?(p)
+  assert Post.find(tag: "ohm").include?(p)
+  assert Post.find(tag: "redis").include?(p)
+end
+# Now we verify our earlier claim that it is possible to find a `Post`
+# using any one of the combinations for the given set of tags.
+#
+# We also verify that if we pass in a non-existent tag name that
+# we'll fail to find the `Post` we just created.
+test "find using an intersection of multiple tag names" do |p|
+  assert Post.find(tag: ["tagging", "ohm"]).include?(p)
+  assert Post.find(tag: ["tagging", "redis"]).include?(p)
+  assert Post.find(tag: ["ohm", "redis"]).include?(p)
+  assert Post.find(tag: ["tagging", "ohm", "redis"]).include?(p)
+  assert ! Post.find(tag: ["tagging", "foo"]).include?(p)
+end
+#### Adding a Tag model
+# Let's pretend that the client suddenly requested that we keep track
+# of the number of times a tag has been used. It's a pretty fair requirement
+# after all. Updating our requirements, we will now have:
+#
+# 1.  We should be able to tag a post (separated by commas).
+# 2.  We should be able to find a post with a given tag.
+# 3.  We should be able to find top tags, and their count.
+# Continuing from our example above, let's require `ohm-contrib`, which we
+# will be using for callbacks.
+require "ohm/contrib"
+# Let's quickly re-open our Post class.
+class Post
+  # When we want our class to have extended functionality like callbacks,
+  # we simply include the necessary modules, in this case `Ohm::Callbacks`,
+  # which will be responsible for inserting `before_*` and `after_*` methods
+  # in the object's lifecycle.
+  include Ohm::Callbacks
+  # To make our code more concise, we just quickly change our implementation
+  # of `tag` to receive a default parameter:
+  def tag(tags = self.tags)
+    tags.to_s.split(/\s*,\s*/).uniq
+  end
+  # For all but the most simple cases, we would probably need to define
+  # callbacks. When we included `Ohm::Callbacks` above, it actually gave us
+  # the following:
+  #
+  # 1. `before_validate` and `after_validate`
+  # 2. `before_create` and `after_create`
+  # 3. `before_update` and `after_update`
+  # 4. `before_save` and `after_save`
+  # 5. `before_delete` and `after_delete`
+  # For our scenario, we only need a `before_update` and an `after_save`.
+  # The idea for our `before_update` is to decrement the `total` of
+  # all existing tags. We use `get(:tags)` to fetch the original tags for
+  # the record, then restore the newly assigned ones so they are used on save.
+protected
+  def before_update
+    assigned_tags = tags
+    tag(get(:tags)).map(&Tag).each { |t| t.decrement :total }
+    self.tags = assigned_tags
+  end
+  # And of course, we increment all new tags for a particular record
+  # after successfully saving it.
+  def after_save
+    tag.map(&Tag).each { |t| t.increment :total }
+  end
+end
+#### Our Tag model
+# The `Tag` model has only one field, which is a `counter` for the `total`.
+# Since `Ohm` allows us to use any kind of ID (not just numeric sequences),
+# we can actually use the tag name to identify a `Tag`.
+class Tag < Ohm::Model
+  counter :total
+  # The syntax for finding a record by its ID is `Tag["ruby"]`. The standard
+  # behavior in `Ohm` is to return `nil` when the ID does not exist.
+  #
+  # To simplify our code, we override `Tag["ruby"]`, and make it create a
+  # new `Tag` if it doesn't exist yet. One important implementation detail
+  # though is that we need to encode the tag name, so special characters
+  # and spaces won't produce an invalid key.
+  def self.[](id)
+    encoded_id = id.encode
+    super(encoded_id) || create(:id => encoded_id)
+  end
+end
+#### Verifying our third requirement
+# Continuing from our test cases above, let's add test coverage for the
+# behavior of counting tags.
+# For each and every tag we initially create, we need to make sure they have a
+# total of 1.
+test "verify total to be exactly 1" do
+  assert 1 == Tag["ohm"].total
+  assert 1 == Tag["redis"].total
+  assert 1 == Tag["tagging"].total
+end
+# If we try and create another post tagged "ruby", "redis", `Tag["redis"]`
+# should then have a total of 2. All of the other tags will still have
+# a total of 1.
+test "verify totals increase" do
+  Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
+  assert 1 == Tag["ohm"].total
+  assert 1 == Tag["tagging"].total
+  assert 1 == Tag["ruby"].total
+  assert 2 == Tag["redis"].total
+end
+# Finally, let's verify the scenario where we create a `Post` tagged
+# "ruby", "redis" and update it to only have the tag "redis",
+# effectively removing the tag "ruby" from our `Post`.
+test "updating an existing post decrements the tags removed" do
+  p = Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
+  p.update(:tags => "redis")
+  assert 0 == Tag["ruby"].total
+  assert 2 == Tag["redis"].total
+end
+## Conclusion
+# Most of the time we tend to think in RDBMS terms, and this is in
+# no way a negative thing. However, it is important to try and switch your
+# frame of mind when working with Ohm (and Redis) because it can save
+# you a lot of time, and possibly lead to a better design.
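
The multi-tag lookups tested in tagging.rb can be approximated without Redis. The following is a minimal sketch, not Ohm's implementation: `TagIndex`, `add`, and `find` are hypothetical names, and plain Ruby `Set`s stand in for the per-value Redis sets that an Ohm index maintains. Only the `parse` body is taken from the example's `tag` method.

```ruby
require "set"

# Hypothetical in-memory stand-in for the per-value index sets.
class TagIndex
  def initialize
    # One set of post ids per tag, similar in spirit to one Redis set per value.
    @sets = Hash.new { |h, k| h[k] = Set.new }
  end

  # Same parsing as the `tag` method in tagging.rb.
  def self.parse(tags)
    tags.to_s.split(/\s*,\s*/).uniq
  end

  # Record the post id under every tag it carries.
  def add(id, tags)
    self.class.parse(tags).each { |t| @sets[t] << id }
  end

  # Finding by several tags is the intersection of the per-tag sets.
  def find(*tags)
    tags.map { |t| @sets.fetch(t, Set.new) }.reduce(:&) || Set.new
  end
end

index = TagIndex.new
index.add(1, "tagging, ohm, redis")
index.add(2, "ruby, redis")
```

With this data, `index.find("ohm", "redis")` contains only post 1, and any lookup that includes an unknown tag such as "foo" comes back empty, which mirrors the last assertion in the intersection test.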

data/lib/lua/delete.lua ADDED

@@ -0,0 +1,72 @@
+-- This script receives three parameters, all encoded with
+-- JSON. The decoded values are used for deleting a model
+-- instance in Redis and removing any reference to it in sets
+-- (indices) and hashes (unique indices).
+--
+-- # model
+--
+-- Table with three attributes:
+--    id (model instance id)
+--    key (hash where the attributes will be saved)
+--    name (model name)
+--
+-- # uniques
+--
+-- Fields and values to be removed from the unique indices.
+--
+-- # tracked
+--
+-- Keys that share the lifecycle of this model instance, that
+-- should be removed as this object is deleted.
+--
+local model   = cjson.decode(ARGV[1])
+local uniques = cjson.decode(ARGV[2])
+local tracked = cjson.decode(ARGV[3])
+local function remove_indices(model)
+	local memo = model.key .. ":_indices"
+	local existing = redis.call("SMEMBERS", memo)
+	for _, key in ipairs(existing) do
+		redis.call("SREM", key, model.id)
+		redis.call("SREM", memo, key)
+	end
+end
+local function remove_uniques(model, uniques)
+	local memo = model.key .. ":_uniques"
+	for field, _ in pairs(uniques) do
+		local key = model.name .. ":uniques:" .. field
+		redis.call("HDEL", key, redis.call("HGET", memo, key))
+		redis.call("HDEL", memo, key)
+	end
+end
+local function remove_tracked(model, tracked)
+	for _, tracked_key in ipairs(tracked) do
+		local key = model.key .. ":" .. tracked_key
+		redis.call("DEL", key)
+	end
+end
+local function delete(model)
+	local keys = {
+		model.key .. ":counters",
+		model.key .. ":_indices",
+		model.key .. ":_uniques",
+		model.key
+	}
+	redis.call("SREM", model.name .. ":all", model.id)
+	redis.call("DEL", unpack(keys))
+end
+remove_indices(model)
+remove_uniques(model, uniques)
+remove_tracked(model, tracked)
+delete(model)
+return model.id
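
To make the calling convention of delete.lua concrete, here is a rough sketch of how a caller might build the three JSON-encoded arguments. It is illustrative only: `delete_args` is a hypothetical helper, the model name, id, and field values are made up, and the key layout simply follows the names used in the script's header comment and in its `delete` and `remove_tracked` functions.

```ruby
require "json"

# Hypothetical helper: builds the three JSON strings that delete.lua
# decodes from ARGV[1], ARGV[2] and ARGV[3].
def delete_args(name, id, uniques, tracked)
  model = { "name" => name, "id" => id.to_s, "key" => "#{name}:#{id}" }
  [model.to_json, uniques.to_json, tracked.to_json]
end

# Illustrative values, not taken from the package.
args = delete_args("User", 7, { "email" => "a@example.com" }, ["posts"])

# Keys the script would delete for this instance, following its naming scheme.
key = JSON.parse(args[0])["key"]
touched = ["#{key}:counters", "#{key}:_indices", "#{key}:_uniques", key, "#{key}:posts"]
```

The tracked key ends up as `User:7:posts` because `remove_tracked` prefixes each tracked name with `model.key`.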

data/lib/lua/save.lua ADDED

@@ -0,0 +1,126 @@
+-- This script receives four parameters, all encoded with
+-- JSON. The decoded values are used for saving a model
+-- instance in Redis, creating or updating a hash as needed and
+-- updating zero or more sets (indices) and zero or more hashes
+-- (unique indices).
+--
+-- # model
+--
+-- Table with one or two attributes:
+--    name (model name)
+--    id (model instance id, optional)
+--
+-- If the id is not provided, it is treated as a new record.
+--
+-- # attrs
+--
+-- Array with attribute/value pairs.
+--
+-- # indices
+--
+-- Fields and values to be indexed. Each key in the indices
+-- table is mapped to an array of values. One index is created
+-- for each field/value pair.
+--
+-- # uniques
+--
+-- Fields and values to be indexed as unique. Unlike indices,
+-- values are not enumerable. If a field/value pair is not unique
+-- (i.e., if there was already a hash entry for that field and
+-- value), an error is returned with the UniqueIndexViolation
+-- message and the field that triggered the error.
+--
+local model   = cjson.decode(ARGV[1])
+local attrs   = cjson.decode(ARGV[2])
+local indices = cjson.decode(ARGV[3])
+local uniques = cjson.decode(ARGV[4])
+local function save(model, attrs)
+	if model.id == nil then
+		model.id = redis.call("INCR", model.name .. ":id")
+	end
+	model.key = model.name .. ":" .. model.id
+	redis.call("SADD", model.name .. ":all", model.id)
+	redis.call("DEL", model.key)
+	if math.mod(#attrs, 2) == 1 then
+		error("Wrong number of attribute/value pairs")
+	end
+	if #attrs > 0 then
+		redis.call("HMSET", model.key, unpack(attrs))
+	end
+end
+local function index(model, indices)
+	for field, enum in pairs(indices) do
+		for _, val in ipairs(enum) do
+			local key = model.name .. ":indices:" .. field .. ":" .. val
+			redis.call("SADD", model.key .. ":_indices", key)
+			redis.call("SADD", key, model.id)
+		end
+	end
+end
+local function remove_indices(model)
+	local memo = model.key .. ":_indices"
+	local existing = redis.call("SMEMBERS", memo)
+	for _, key in ipairs(existing) do
+		redis.call("SREM", key, model.id)
+		redis.call("SREM", memo, key)
+	end
+end
+local function unique(model, uniques)
+	for field, value in pairs(uniques) do
+		local key = model.name .. ":uniques:" .. field
+		local val = value
+		redis.call("HSET", model.key .. ":_uniques", key, val)
+		redis.call("HSET", key, val, model.id)
+	end
+end
+local function remove_uniques(model)
+	local memo = model.key .. ":_uniques"
+	for _, key in pairs(redis.call("HKEYS", memo)) do
+		redis.call("HDEL", key, redis.call("HGET", memo, key))
+		redis.call("HDEL", memo, key)
+	end
+end
+local function verify(model, uniques)
+	local duplicates = {}
+	for field, value in pairs(uniques) do
+		local key = model.name .. ":uniques:" .. field
+		local id = redis.call("HGET", key, value)
+		if id and id ~= tostring(model.id) then
+			duplicates[#duplicates + 1] = field
+		end
+	end
+	return duplicates, #duplicates ~= 0
+end
+local duplicates, err = verify(model, uniques)
+if err then
+	error("UniqueIndexViolation: " .. duplicates[1])
+end
+save(model, attrs)
+remove_indices(model)
+index(model, indices)
+remove_uniques(model, uniques)
+unique(model, uniques)
+return tostring(model.id)
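
The uniqueness check performed by `verify` in save.lua can be sketched in plain Ruby. This is an approximation under stated assumptions: `duplicate_fields` is a hypothetical name, and a nested Ruby `Hash` stands in for the Redis hashes stored under `<name>:uniques:<field>`. As in the Lua code, a value counts as a duplicate only when it is already owned by a different id.

```ruby
# Hypothetical in-memory stand-in for the "<Model>:uniques:<field>" hashes.
# `store` maps a unique-index key to a { value => owner id } hash.
def duplicate_fields(store, model_name, model_id, uniques)
  uniques.each_with_object([]) do |(field, value), dups|
    owner = store.fetch("#{model_name}:uniques:#{field}", {})[value]
    # Re-saving the owning record is allowed; any other owner is a violation.
    dups << field if owner && owner != model_id.to_s
  end
end

# Illustrative data: record 1 already owns this email address.
store = { "User:uniques:email" => { "a@example.com" => "1" } }

clash  = duplicate_fields(store, "User", 2, { "email" => "a@example.com" })
resave = duplicate_fields(store, "User", 1, { "email" => "a@example.com" })
```

Here `clash` lists the `email` field while `resave` is empty, which is why updating a record without changing its unique value does not raise `UniqueIndexViolation`.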

data/lib/ohm_util.rb ADDED

@@ -0,0 +1,116 @@
+# encoding: UTF-8
+module OhmUtil
+  LUA_CACHE   = Hash.new { |h, k| h[k] = Hash.new }
+  LUA_SAVE    = File.expand_path("../lua/save.lua",   __FILE__)
+  LUA_DELETE  = File.expand_path("../lua/delete.lua", __FILE__)
+  # All of the known errors in Ohm can be traced back to one of these
+  # exceptions.
+  #
+  # MissingID:
+  #
+  #   Comment.new.id # => nil
+  #   Comment.new.key # => Error
+  #
+  #   Solution: you need to save your model first.
+  #
+  # IndexNotFound:
+  #
+  #   Comment.find(:foo => "Bar") # => Error
+  #
+  #   Solution: add an index with `Comment.index :foo`.
+  #
+  # UniqueIndexViolation:
+  #
+  #   Raised when trying to save an object with a `unique` index for
+  #   which the value already exists.
+  #
+  #   Solution: rescue `Ohm::UniqueIndexViolation` during save, but
+  #   also, do some validations even before attempting to save.
+  #
+  class Error < StandardError; end
+  class MissingID < Error; end
+  class IndexNotFound < Error; end
+  class UniqueIndexViolation < Error; end
+  module ErrorPatterns
+    DUPLICATE = /(UniqueIndexViolation: (\w+))/.freeze
+    NOSCRIPT = /^NOSCRIPT/.freeze
+  end
+  # Used by: `attribute`, `counter`, `set`, `reference`,
+  # `collection`.
+  #
+  # Employed as a solution to avoid `NameError` problems when trying
+  # to load models referring to other models not yet loaded.
+  #
+  # Example:
+  #
+  #   class Comment < Ohm::Model
+  #     reference :user, User # NameError undefined constant User.
+  #   end
+  #
+  #   # Instead of relying on some clever `const_missing` hack, we can
+  #   # simply use a symbol or a string.
+  #
+  #   class Comment < Ohm::Model
+  #     reference :user, :User
+  #     reference :post, "Post"
+  #   end
+  #
+  def self.const(context, name)
+    case name
+    when Symbol, String
+      context.const_get(name)
+    else name
+    end
+  end
+  def self.dict(arr)
+    Hash[*arr]
+  end
+  def self.sort_options(options)
+    args = []
+    args.concat(["BY", options[:by]]) if options[:by]
+    args.concat(["GET", options[:get]]) if options[:get]
+    args.concat(["LIMIT"] + options[:limit]) if options[:limit]
+    args.concat(options[:order].split(" ")) if options[:order]
+    args.concat(["STORE", options[:store]]) if options[:store]
+    return args
+  end
+  # Run lua scripts and cache the sha in order to improve
+  # successive calls.
+  def self.script(redis, file, *args)
+    begin
+      cache = LUA_CACHE[redis.url]
+      if cache.key?(file)
+        sha = cache[file]
+      else
+        src = File.read(file)
+        sha = redis.call("SCRIPT", "LOAD", src)
+        cache[file] = sha
+      end
+      redis.call!("EVALSHA", sha, *args)
+    rescue RuntimeError
+      case $!.message
+      when ErrorPatterns::NOSCRIPT
+        LUA_CACHE[redis.url].clear
+        retry
+      when ErrorPatterns::DUPLICATE
+        raise UniqueIndexViolation, $1
+      else
+        raise $!
+      end
+    end
+  end
+end
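
The two small helpers `dict` and `sort_options` are easiest to understand from their output. The sketch below copies their bodies from the diff into a hypothetical module, `OhmUtilSketch`, so it runs without the gem installed. The input values are illustrative assumptions, not taken from the package.

```ruby
# Standalone copies of OhmUtil.dict and OhmUtil.sort_options, wrapped in a
# hypothetical module so the snippet runs on its own.
module OhmUtilSketch
  # Turns a flat [key, value, key, value, ...] array into a Hash.
  def self.dict(arr)
    Hash[*arr]
  end

  # Translates an options hash into a flat list of SORT arguments.
  def self.sort_options(options)
    args = []
    args.concat(["BY", options[:by]]) if options[:by]
    args.concat(["GET", options[:get]]) if options[:get]
    args.concat(["LIMIT"] + options[:limit]) if options[:limit]
    args.concat(options[:order].split(" ")) if options[:order]
    args.concat(["STORE", options[:store]]) if options[:store]
    args
  end
end

# Illustrative inputs.
pairs = OhmUtilSketch.dict(["body", "Ohm Tagging", "tags", "ohm, redis"])
sort  = OhmUtilSketch.sort_options(by: "Post:*->body", limit: [0, 10], order: "ALPHA DESC")
```

Note that the arguments are emitted in a fixed order (BY, GET, LIMIT, order tokens, STORE) regardless of the order of the keys in the options hash.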