RubyGems - merkle-hash-tree - Versions diffs - 0.1.0 - Mend

merkle-hash-tree 0.1.0


Files changed (17)

data/.gitignore +1 -0
data/Gemfile +3 -0
data/Guardfile +13 -0
data/README.md +148 -0
data/Rakefile +38 -0
data/doc/DAI.md +111 -0
data/lib/merkle-hash-tree.rb +283 -0
data/lib/range_extensions.rb +13 -0
data/merkle-hash-tree.gemspec +31 -0
data/spec/audit_proof_spec.rb +116 -0
data/spec/consistency_proof_spec.rb +41 -0
data/spec/dai_caching_spec.rb +42 -0
data/spec/head_n_spec.rb +36 -0
data/spec/head_spec.rb +45 -0
data/spec/power_of_2_smaller_than_spec.rb +37 -0
data/spec/spec_helper.rb +34 -0
metadata +243 -0

data/.gitignore ADDED

@@ -0,0 +1 @@
+/Gemfile.lock

data/Gemfile ADDED

@@ -0,0 +1,3 @@
+source 'https://rubygems.org/'
+gemspec

data/Guardfile ADDED

@@ -0,0 +1,13 @@
+guard 'spork' do
+  watch('Gemfile')             { :rspec }
+  watch('Gemfile.lock')        { :rspec }
+  watch('spec/spec_helper.rb') { :rspec }
+end
+guard 'rspec',
+      :cmd            => "rspec --drb",
+      :all_on_start   => true,
+      :all_after_pass => true do
+  watch(%r{^spec/.+_spec\.rb$})
+  watch(%r{^lib/})               { "spec" }
+end

data/README.md ADDED

@@ -0,0 +1,148 @@
+This gem contains an implementation of "Merkle Hash Trees" (MHT).
+Specifically, it implements the variant described in
+[RFC6962](http://tools.ietf.org/html/rfc6962), as the initial use-case for
+this gem was for an implementation of a [Certificate
+Transparency](http://www.certificate-transparency.org/) log server.
+# Installation
+Installation should be trivial, if you're using rubygems:
+    gem install merkle-hash-tree
+If you want to install directly from the git repo, run `rake install`.
+# Usage
+Using `MerkleHashTree` is relatively straightforward, although it does have
+one or two intricacies.  Because MHTs typically deal with large volumes of
+data, it isn't enough to just load a giant list of objects into memory and
+go to town -- you'll run out of memory pretty quickly, and on a large tree
+you'll likely burn a lot of CPU time computing hashes.  Instead, in order to
+instantiate an MHT you must first construct an object that implements a
+specific interface, which the MHT implementation then uses to interact with
+your dataset.
+## Basic Usage
+For now, though, let's assume that you have such an object, named
+`mht_data`, and we'll look at how to use the MHT.  (It might be useful to
+understand [how MHT proofs
+work](http://www.certificate-transparency.org/log-proofs-work) before you go
+too deeply into this).
+For starters, we'll create a new MHT:
+    mht = MerkleHashTree.new(mht_data, Digest::SHA256)
+The `MerkleHashTree` constructor takes exactly two arguments: an object that
+implements the data access interface we'll talk about later, and a class (or
+object) which implements the same `digest` method signature as the core
+`Digest::Base` class.  Typically, this will simply be a `Digest` subclass,
+such as `Digest::MD5`, `Digest::SHA1`, or (as in the example above)
+`Digest::SHA256`.  This second argument is the way that the MHT calculates
+hashes in the tree -- it simply calls the `#digest` method on whatever you
+pass in as the second argument, passing in a string and expecting raw octets
+out the other end.
+Once we have our MHT object, we can start to do things with it.  For
+example, we can get the hash of the "head" of the tree:
+    mht.head   # => "<some long string of octets>"
+You can also get the head of any subtree, by passing a range that gives the
+first and last elements of the list to be covered by the subtree:
+    mht.head(16..20)   # => "<some more octets>"
+Note that the beginning element must be a power of 2.
+If you want to get the subtree from 0 to an arbitrary element in the list,
+start the range at 0:
+    mht.head(0..42)   # => "<some other long string of octets>"
+We can also ask for a "consistency proof" between any two list sizes:
+    mht.consistency_proof(42, 69)   # => ["<hash>", "<hash>", ... ]
+If we want a consistency proof between a subtree and the current head, we
+pass the current list length as the second parameter:
+    mht.consistency_proof(42, mht_data.length)   # => ["<hash>", "<hash>", ... ]
+I'm not going to describe Merkle consistency proofs here; the Internet does
+a far better job than I ever will.  The return value of `#consistency_proof`
+is simply an array of the hashes that are required by a client to prove that
+the smaller subtree is, indeed, a subtree of the larger one (and nothing
+dodgy has gone on behind the scenes).  [RFC6962,
+s2.1.2](http://tools.ietf.org/html/rfc6962#section-2.1.2) has all the gory
+details of how to calculate it and how to use the result.
+There are also such things as "audit proofs" (again, I'm not going to
+explain them here), which you get by specifying a single leaf number and a
+range of list items that defines the subtree:
+    mht.audit_proof(13, 0..41)   # => ["<hash>", "<hash>", ... ]
+In this example, the audit proof will return a list of hashes, starting from
+the leaf node's sibling and working up towards the root node for a hash tree
+containing 42 elements, that demonstrate that leaf 13 is in the tree and
+hasn't been removed or altered.
+You can also drop the second argument, in which case you get an audit proof
+for the tree that represents the entire list as it currently exists:
+    mht.audit_proof(13)   # => ["<hash>", "<hash>", ... ]
+And that's it!  There really isn't much you can do from the outside.  All
+the fun happens inside.
+## The Data Access Interface
+Rather than trying to work with an entire dataset in memory,
+`MerkleHashTree` is capable of working with a dataset far larger than what
+could fit in memory, by using a data access object to fetch items and cache
+intermediate results (the hashes of nodes in the tree).  To do this, though,
+a fair number of methods need to be implemented.
+How you implement them is up to you -- you could query a backend database,
+or just make up data as you felt like it.  In the minimal case, you *can*
+pass in an instance of Array, although I doubt you'll enjoy the performance
+on any but the smallest possible hash tree.
+The complete interface definition is given in `doc/DAI.md`, for those who
+wish to implement their own interface.  Essentially, you *must* implement
+`[](n)`, which returns the `n`th entry in the (zero-indexed) list, as well
+as `length`, which returns the current size of the list.  You can also
+implement `mht_cache_set(key, value)` and `mht_cache_get(key)`, which set
+and get entries in the cache of node values.  If you don't implement these,
+then `MerkleHashTree` will need to recalculate every hash in the tree
+repeatedly for almost every operation -- which will be *very* slow for
+anything other than the most trivial tree.
+As I said before, you *can* just use Array, if you want to, which could look
+something like this:
+    a = Array.new
+    mht = MerkleHashTree.new(a, Digest::MD5)
+    a << 'a'
+    a << 'b'
+    a << 'c'
+    a << 'd'
+    a << 'e'
+    mht.head   # => "O\xA2\x03\x12\xF6\x0F\xFBtU\x95GY\xE53\x17\x8D"
+## Further Info
+In a reversal of standard operating procedure, I heavily document all the
+methods and interfaces I write.  You can get complete API documentation by
+using `ri` (or a descendent thereof), or via your web-based rdoc browser of
+choice.
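
As a rough illustration of what `mht.head` computes, the sketch below follows the RFC6962 tree hash definition directly: leaves are hashed with a `0x00` prefix, interior nodes with a `0x01` prefix, and the list is split at the largest power of two strictly smaller than its length. It is a standalone sketch, not the gem's code; the names `split_point` and `tree_head` are made up for this example, and it holds all leaves in memory, which is exactly what the gem's data access interface is meant to avoid.

```ruby
require 'digest'

# Largest power of two strictly smaller than n (only meaningful for n >= 2).
def split_point(n)
  1 << ((n - 1).bit_length - 1)
end

# RFC6962 tree hash over an in-memory array of leaves (illustrative only).
def tree_head(leaves)
  # The hash of an empty list is the digest of the empty string.
  return Digest::SHA256.digest("") if leaves.empty?

  # A single leaf is hashed with a 0x00 prefix.
  if leaves.length == 1
    return Digest::SHA256.digest("\x00".b + leaves[0].to_s.b)
  end

  # Interior nodes hash 0x01 followed by the left and right subtree hashes.
  k = split_point(leaves.length)
  left  = tree_head(leaves[0...k])
  right = tree_head(leaves[k..-1])
  Digest::SHA256.digest("\x01".b + left + right)
end

# Print the head of a small tree as hex.
puts tree_head(["", "\0", "\x10"]).unpack("H*").first
```

The same leaf values appear in `spec/head_spec.rb`, so the printed value can be compared against the hashes listed there.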

data/Rakefile ADDED

@@ -0,0 +1,38 @@
+require 'rubygems'
+require 'bundler'
+task :default => :test
+begin
+	Bundler.setup(:default, :development)
+rescue Bundler::BundlerError => e
+	$stderr.puts e.message
+	$stderr.puts "Run `bundle install` to install missing gems"
+	exit e.status_code
+end
+require 'git-version-bump/rake-tasks'
+Bundler::GemHelper.install_tasks
+require 'rdoc/task'
+Rake::RDocTask.new do |rd|
+	rd.main = "README.md"
+	rd.title = 'lvmsync'
+	rd.rdoc_files.include("README.md", "lib/**/*.rb")
+end
+desc "Run guard"
+task :guard do
+	require 'guard'
+	::Guard.start(:clear => true)
+	while ::Guard.running do
+		sleep 0.5
+	end
+end
+require 'rspec/core/rake_task'
+RSpec::Core::RakeTask.new :test do |t|
+	t.pattern = "spec/**/*_spec.rb"
+end

data/doc/DAI.md ADDED

@@ -0,0 +1,111 @@
+The Data Access Interface for this library is a flexible way for the tree to
+retrieve and cache information it needs.  This is important, because the use
+case for this library is to provide hash trees for datasets *far* larger
+than what can be reasonably stored in memory by Ruby objects, and
+potentially in diverse and application-specific stores.  Therefore, it is
+important that the interface between instances of `MerkleHashTree` and the
+underlying list data is as flexible as possible.
+The interface is designed so that an instance of `Array` *will* work, in the
+minimal case, although it won't perform particularly well.  In order to
+maximise performance, it is recommended that the optional caching methods
+also be implemented, with the cache data stored either in memory, or in a
+fast network-accessible cache such as memcached or Redis.
+# Mandatory Methods
+There are two mandatory methods which *must* be implemented by any object
+which is passed in as the data object to a call to `MerkleHashTree.new`.
+They might look familiar.
+Both of these methods are called *very* frequently and repeatedly; it is
+highly recommended that they perform their own caching of results if
+retrieval from backing store is an expensive operation.  Caching is quite
+easy in this system, because no value ever changes once it has been defined
+(with the exception of `length`).
+## `length`
+This method returns the number of items in the list.  This number must be
+monotonically increasing with each call -- that is, there must never be a
+case where a call to `#length` on a given data object returns a
+value less than that returned by a previous call to `#length` on that same
+object.  Failure to observe this property will absolutely and with
+guaranteed certainty lead to heartbreak.
+## `[](n)`
+This method returns the `n`th item in the list, indexed from zero.  All
+values of `[](n)` for `0 <= n < length` must return an object which responds
+to `to_s` correctly (the value of `to_s` is used as the value passed to the
+hashing function which calculates the leaf hash value).
+If `n` is greater than or equal to a value previously returned from a call
+to `length` on the same object, it is permissible to either return `nil`,
+raise `ArgumentError`, or do whatever you like -- if `MerkleHashTree` ever
+does that, it's a big fat stinky bug in this library.
+Once a call to this method with a given value of `n` has been made, *every*
+future call for the same value of `n` MUST return an object whose `to_s`
+method returns an identical string.  Failure to observe this requirement
+will surely cause demons to fly out of your nose.
+# Optional Methods
+There are two optional methods which your DAI object may choose to
+implement.  If you implement one, though, you must implement both.  They are
+used to cache intermediate hashes within the tree nodes, and can
+significantly improve performance on large trees, because it's a lot quicker
+to retrieve a value from a database than it is to recalculate a few hundred
+thousand SHA256 hashes.
+## `mht_cache_set(key, value)`
+This method takes a string `key` and a string `value`, and should store that
+association somewhere convenient for later retrieval.  The return value is
+ignored (although raising an exception is Just Not On).
+For a given `key`, only one `value` will *ever* be passed (for a given DAI
+object).  If this allows you to optimise some part of your cache
+implementation, mazel tov.
+## `mht_cache_get(key)`
+This method takes a string `key` and returns either a string `value` or
+`nil`.  If a string is returned, that string MUST be the value passed to a
+previous call to `mht_cache_set` for the same `key`.
+Since this is a caching interface, it is entirely permissible to return
+`nil` to a call to `mht_cache_get` for a given key when a previous call for
+the same key returned a string `value`.  The cache entry may well have
+expired in the interim.  `MerkleHashTree` will *always* handle a call to
+`mht_cache_get` returning `nil` (by recalculating any and all hashes
+required to regenerate the value that has not been cached).
+This method MAY be called with a given `key` without a previous call to
+`mht_cache_set` being made for the same `key`, and your implementation must
+handle that gracefully (by returning `nil`).
+# Item Methods
+The objects returned from calls to `[](n)` must implement a `to_s` method
+that returns a string.  There is no requirement for the value returned by
+`to_s` to be unique amongst all objects returned from `[](n)`, but I
+certainly wouldn't recommend them all returning the same value (it would be
+a very boring-looking hash tree).
+To slightly improve performance, objects can also implement an accessor
+method pair, `mht_leaf_hash` and `mht_leaf_hash=(s)`.  If available,
+`mht_leaf_hash` will be called to determine the hash value of the object; if
+this method returns `nil`, then the hash value will be calculated from the
+string returned by `to_s`, and then cached in the object by calling
+`mht_leaf_hash=(h)`.  It is not recommended that you try to be clever by
+implementing a hashing scheme yourself in `mht_leaf_hash`; that way lies
+madness.
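
To make the interface above more concrete, here is a minimal sketch of a data access object that keeps both the list and the node cache in memory. The class name `InMemoryDAI` and the `Entry` struct are hypothetical and exist only for this example; a real implementation would more likely sit in front of a database and an external cache. Only the method names `[]`, `length`, `mht_cache_get`, `mht_cache_set`, `to_s`, `mht_leaf_hash` and `mht_leaf_hash=` come from the interface description.

```ruby
# Hypothetical list item: to_s supplies the leaf data, and the Struct
# member gives us the optional mht_leaf_hash / mht_leaf_hash= accessor pair.
Entry = Struct.new(:payload, :mht_leaf_hash) do
  def to_s
    payload.to_s
  end
end

# Hypothetical in-memory data access object (names are illustrative).
class InMemoryDAI
  def initialize
    @items = []   # append-only, so length never decreases
    @cache = {}   # string key => string value
  end

  # Append an item; existing entries are never modified or removed.
  def <<(item)
    @items << item
    self
  end

  # Mandatory: current number of items in the list.
  def length
    @items.length
  end

  # Mandatory: the nth item, indexed from zero.
  def [](n)
    @items[n]
  end

  # Optional: returns the cached string, or nil if the key is unknown.
  def mht_cache_get(key)
    @cache[key]
  end

  # Optional: remember the value for a key; the return value is ignored.
  def mht_cache_set(key, value)
    @cache[key] = value
    nil
  end
end

dai = InMemoryDAI.new
dai << Entry.new("a") << Entry.new("b")
puts dai.length
```

An object like `dai` could then be passed as the first argument to `MerkleHashTree.new`, assuming the gem is loaded.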

data/lib/merkle-hash-tree.rb ADDED

@@ -0,0 +1,283 @@
+require 'range_extensions'
+# Implement an RFC6962-compliant Merkle Hash Tree.
+#
+class MerkleHashTree
+	# Instantiate a new MerkleHashTree.
+	#
+	# Arguments:
+	#
+	# * `data_access` -- An object which implements the Data Access Interface
+	#   specified in `doc/DAI.md`.  `Array` implements the basic interface,
+	#   but for performance you'll want to implement the caching methods
+	#   described in `doc/DAI.md`.
+	#
+	#   The MerkleHashTree gets all of its data from this object.
+	#
+	# * `hash_class` -- An object which provides a `.digest` method which
+	#   behaves identically to `Digest::Base.digest` -- that is, it takes
+	#   an arbitrary string and returns another string, with the requirement
+	#   that every call with the same input will return the same output.
+	#
+	# Raises:
+	#
+	# * `ArgumentError` -- If either argument does not meet the basic
+	#   requirements specified above (that is, the objects don't implement
+	#   the defined interface).
+	#
+	def initialize(data_access, hash_class)
+		@data = data_access
+		unless @data.respond_to?(:[])
+			raise ArgumentError,
+			      "data_access (#{@data}) does not implement #[]"
+		end
+		unless @data.respond_to?(:length)
+			raise ArgumentError,
+			      "data_access (#{@data}) does not implement #length"
+		end
+		@digest = hash_class
+		unless @digest.respond_to?(:digest)
+			raise ArgumentError,
+			      "hash_class (#{@digest}) does not implement #digest"
+		end
+	end
+	# Return the hash value of a subtree.
+	#
+	# Arguments:
+	#
+	# * `subtree` -- A range of the list items over which the tree hash will
+	#   be calculated.  If not specified, it defaults to the entire current
+	#   list.
+	#
+	# Raises:
+	#
+	# * `ArgumentError` -- if the range doesn't consist of integers, or if the
+	#   range is outside the bounds of the current list size.
+	#
+	def head(subtree = nil)
+		# Super-special case when we're asking for the hash of an entire list
+		# that...  just happens to be empty
+		if subtree.nil? and @data.length == 0
+			return digest("")
+		end
+		subtree ||= 0..(@data.length-1)
+		unless subtree.min.is_a? Integer and subtree.max.is_a? Integer
+			raise ArgumentError,
+			      "subtree is not all integers (got #{subtree.inspect})"
+		end
+		if subtree.min < 0
+			raise ArgumentError,
+			      "subtree cannot go negative (#{subtree.inspect})"
+		end
+		if subtree.max >= @data.length
+			raise ArgumentError,
+			      "subtree extends beyond list length (subtree is #{subtree.inspect}, list has #{@data.length} items)"
+		end
+		if subtree.max < subtree.min
+			raise ArgumentError,
+			      "subtree goes backwards (#{subtree.inspect})"
+		end
+		if @data.respond_to?(:mht_cache_get) and h = @data.mht_cache_get(subtree.inspect)
+			return h
+		end
+		# No caching, or not in the cache... recalculate!
+		h = if subtree.size == 1
+			# We're at a leaf!
+			leaf_hash(subtree.min)
+		else
+			k = power_of_2_smaller_than(subtree.size)
+			node_hash(head((0..k-1)+subtree.min), head(subtree.min+k..subtree.max))
+		end
+		if @data.respond_to?(:mht_cache_set)
+			@data.mht_cache_set(subtree.inspect, h)
+		end
+		h
+	end
+	# Generate an "audit proof" for a list item.
+	#
+	# Arguments:
+	#
+	# * `item` -- Specifies the index in the list to retrieve the audit proof
+	#   for.  Must be a non-negative integer within the bounds of the current
+	#   list.
+	#
+	# * `subtree` -- A range which defines the subset of list items within
+	#   which to generate the audit proof.  The bounds of the range must be
+	#   within the bounds of the current list.
+	#
+	# The return value of this method is an array of node hashes which make
+	# up the audit proof.  The first element of the array is the immediate
+	# sibling of the item requested; the last is a child of the root.
+	#
+	# Raises:
+	#
+	# * `ArgumentError` -- if any provided argument isn't an integer, or is
+	#   negative, or is out of range.
+	#
+	# * `RuntimeError` -- if an attempt is made to request an audit proof on
+	#   an empty list.
+	#
+	def audit_proof(item, subtree=nil)
+		if @data.length == 0
+			raise RuntimeError,
+			      "Cannot calculate an audit proof on an empty list"
+		end
+		subtree ||= (0..@data.length - 1)
+		unless subtree.min.is_a? Integer and subtree.max.is_a? Integer
+			raise ArgumentError,
+			      "subtree must be an integer range (got #{subtree.inspect})"
+		end
+		unless item.is_a? Integer
+			raise ArgumentError,
+			      "item must be an integer (got #{item.inspect})"
+		end
+		if subtree.min < 0
+			raise ArgumentError,
+			      "subtree range must be non-negative (subtree is #{subtree.inspect})"
+		end
+		if subtree.max >= @data.length
+			raise ArgumentError,
+			      "subtree must not extend beyond the end of the list (subtree is #{subtree.inspect}, list has #{@data.length} items)"
+		end
+		if subtree.max < subtree.min
+			raise ArgumentError,
+			      "subtree must be min..max (subtree is #{subtree.inspect})"
+		end
+		# And finally, after all that, we can start actually *doing* something
+		if subtree.size == 1
+			# Audit proof for a single item is defined as being empty
+			[]
+		else
+			k = power_of_2_smaller_than(subtree.size)
+			if item < k
+				audit_proof(item, (0..k-1)+subtree.min) + [head(subtree.min+k..subtree.max)]
+			else
+				# item is an offset from subtree.min, so only k is subtracted here
+				audit_proof(item-k, subtree.min+k..subtree.max) +
+				  [head((0..k-1)+subtree.min)]
+			end
+		end
+	end
+	# Generate a consistency proof.
+	#
+	# Arguments:
+	#
+	# * `m` -- The smaller list size for which you wish to generate
+	#   the consistency proof.
+	#
+	# * `n` -- The larger list size for which you wish to generate
+	#   the consistency proof.
+	#
+	# Raises:
+	#
+	# * `ArgumentError` -- If the arguments aren't integers, or if they're
+	#   negative, or if `n < m`.
+	#
+	def consistency_proof(m, n)
+		unless m.is_a? Integer
+			raise ArgumentError,
+			      "m is not an integer (got #{m.inspect})"
+		end
+		unless n.is_a? Integer
+			raise ArgumentError,
+			      "n is not an integer (got #{n.inspect})"
+		end
+		if m < 0
+			raise ArgumentError,
+			      "m cannot be negative (m is #{m})"
+		end
+		if n > @data.length
+			raise ArgumentError,
+			      "n cannot be larger than the list length (n is #{n}, list has #{@data.length} elements)"
+		end
+		if n < m
+			raise ArgumentError,
+			      "n cannot be less than m (m is #{m}, n is #{n})"
+		end
+		# This is taken from in-practice behaviour of the Google pilot/aviator
+		# CT servers... when first=0, you always get an empty proof.
+		return [] if m == 0
+		# And now... on to the real show!
+		subproof(m, 0..n-1, true)
+	end
+	private
+	# :nodoc:
+	def subproof(m, n, b)
+		if n.max == m-1
+			if b
+				[]
+			else
+				[head(n)]
+			end
+		elsif n.min == n.max
+			[head(n)]
+		else
+			k = power_of_2_smaller_than(n.size)
+			if m <= k+n.min
+				subproof(m, (0..k-1)+n.min, b) + [head((n.min+k)..n.max)]
+			else
+				subproof(m, ((n.min+k)..n.max), false) + [head((0..k-1)+n.min)]
+			end
+		end
+	end
+	def digest(s)
+		@digest.digest(s)
+	end
+	def leaf_hash(n)
+		if @data[n].respond_to?(:mht_leaf_hash) and h = @data[n].mht_leaf_hash
+			return h
+		end
+		h = digest("\0" + @data[n].to_s)
+		if @data[n].respond_to?(:mht_leaf_hash=)
+			@data[n].mht_leaf_hash = h
+		end
+		h
+	end
+	def node_hash(h1, h2)
+		digest("\x01" + h1 + h2)
+	end
+	# This is almost certainly horribly inefficient, but my math skills have
+	# atrophied embarrassingly
+	def power_of_2_smaller_than(n)
+		raise ArgumentError, "Too small, Jim" if n < 2
+		s = (n-1).to_s(2).length-1
+		2**s
+	end
+end
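
The audit proof returned by `audit_proof` is only useful if a client can check it. The sketch below shows one way to do that, following the recursive definitions in RFC6962: it builds an audit path over an in-memory array and then recomputes the root from the leaf hash and the path. It is a standalone illustration using SHA-256, not part of the gem; every function name here (`split_point`, `leaf_hash`, `node_hash`, `tree_head`, `audit_path`, `root_from_path`) is an assumption made for this example.

```ruby
require 'digest'

# Largest power of two strictly smaller than n (n >= 2).
def split_point(n)
  1 << ((n - 1).bit_length - 1)
end

def leaf_hash(data)
  Digest::SHA256.digest("\x00".b + data.to_s.b)
end

def node_hash(left, right)
  Digest::SHA256.digest("\x01".b + left + right)
end

# Tree hash over a non-empty in-memory array of leaves.
def tree_head(leaves)
  raise ArgumentError, "empty list" if leaves.empty?
  return leaf_hash(leaves[0]) if leaves.length == 1
  k = split_point(leaves.length)
  node_hash(tree_head(leaves[0...k]), tree_head(leaves[k..-1]))
end

# Audit path for the leaf at offset m, ordered from the leaf's sibling
# up towards the root.
def audit_path(m, leaves)
  return [] if leaves.length == 1
  k = split_point(leaves.length)
  if m < k
    audit_path(m, leaves[0...k]) + [tree_head(leaves[k..-1])]
  else
    audit_path(m - k, leaves[k..-1]) + [tree_head(leaves[0...k])]
  end
end

# Recompute the root of an n-leaf tree from one leaf hash and its path.
def root_from_path(m, n, hash, path)
  return hash if n == 1
  raise ArgumentError, "proof too short" if path.empty?
  k = split_point(n)
  sibling = path.last    # the hash nearest the root is consumed first
  rest = path[0...-1]
  if m < k
    node_hash(root_from_path(m, k, hash, rest), sibling)
  else
    node_hash(sibling, root_from_path(m - k, n - k, hash, rest))
  end
end

leaves = ("a".."p").to_a
path = audit_path(12, leaves)
ok = root_from_path(12, leaves.length, leaf_hash(leaves[12]), path) == tree_head(leaves)
puts ok
```

The verifier needs only the leaf, its index, the tree size, the path and a trusted root; it never sees the other leaves. This sketch does not reject proofs that are longer than necessary.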

data/lib/range_extensions.rb ADDED

@@ -0,0 +1,13 @@
+class Range
+	def length
+		last - first + 1
+	end
+	alias_method :size, :length
+	def +(n)
+		raise ArgumentError, "n must be integer" unless n.is_a? Integer
+		(min+n)..(max+n)
+	end
+end

data/merkle-hash-tree.gemspec ADDED

@@ -0,0 +1,31 @@
+require 'git-version-bump'
+Gem::Specification.new do |s|
+	s.name = "merkle-hash-tree"
+	s.version = GVB.version
+	s.date    = GVB.date
+	s.platform = Gem::Platform::RUBY
+	s.homepage = "http://theshed.hezmatt.org/merkle-hash-tree"
+	s.summary = "An RFC6962-compliant implementation of Merkle Hash Trees"
+	s.authors = ["Matt Palmer"]
+	s.extra_rdoc_files = ["README.md"]
+	s.files = `git ls-files`.split("\n")
+	s.add_runtime_dependency "git-version-bump"
+	s.add_development_dependency 'bundler'
+	s.add_development_dependency 'guard-spork'
+	s.add_development_dependency 'guard-rspec'
+	s.add_development_dependency 'plymouth'
+	s.add_development_dependency 'pry-debugger'
+	s.add_development_dependency 'rake'
+	# Needed for guard
+	s.add_development_dependency 'rb-inotify', '~> 0.9'
+	s.add_development_dependency 'rdoc'
+	s.add_development_dependency 'rspec', '~> 2.11'
+	s.add_development_dependency 'rspec-mocks'
+end

data/spec/audit_proof_spec.rb ADDED

@@ -0,0 +1,116 @@
+require_relative './spec_helper'
+require_relative '../lib/merkle-hash-tree'
+describe "MerkleHashTree#audit_proof" do
+	let(:mht) { MerkleHashTree.new(data, IdentityDigest) }
+	context "with a single element" do
+		let(:data) { %w{a} }
+		it "returns empty" do
+			expect(mht.audit_proof(0)).to eq([])
+		end
+	end
+	context "with a one-level tree" do
+		let(:data) { %w{a b} }
+		it "works for 'b'" do
+			expect(mht.audit_proof(1)).to eq(['a'])
+		end
+		it "works for 'a'" do
+			expect(mht.audit_proof(0)).to eq(['b'])
+		end
+	end
+	context "with a two-level tree" do
+		let(:data) { %w{a b c d} }
+		it "works for 'a'" do
+			expect(mht.audit_proof(0)).to eq(['b', 'cd'])
+		end
+		it "works for 'b'" do
+			expect(mht.audit_proof(1)).to eq(['a', 'cd'])
+		end
+		it "works for 'c'" do
+			expect(mht.audit_proof(2)).to eq(['d', 'ab'])
+		end
+		it "works for 'd'" do
+			expect(mht.audit_proof(3)).to eq(['c', 'ab'])
+		end
+	end
+	context "with an unbalanced three-level tree" do
+		let(:data) { %w{a b c d e} }
+		it "works for 'a'" do
+			expect(mht.audit_proof(0)).to eq(['b', 'cd', 'e'])
+		end
+		it "works for 'b'" do
+			expect(mht.audit_proof(1)).to eq(['a', 'cd', 'e'])
+		end
+		it "works for 'c'" do
+			expect(mht.audit_proof(2)).to eq(['d', 'ab', 'e'])
+		end
+		it "works for 'd'" do
+			expect(mht.audit_proof(3)).to eq(['c', 'ab', 'e'])
+		end
+		it "works for 'e'" do
+			# It makes sense if you drink *juuuuust* enough tequila
+			expect(mht.audit_proof(4)).to eq(['abcd'])
+		end
+	end
+	context "with a seven node tree" do
+		# Taken from RFC6962, s2.1.3
+		let(:data) { %w{a b c d e f g} }
+		it "works for 'a'" do
+			expect(mht.audit_proof(0)).to eq(%w{b cd efg})
+		end
+		it "works for 'd'" do
+			expect(mht.audit_proof(3)).to eq(%w{c ab efg})
+		end
+		it "works for 'e'" do
+			expect(mht.audit_proof(4)).to eq(%w{f g abcd})
+		end
+		it "works for 'g'" do
+			expect(mht.audit_proof(6)).to eq(%w{ef abcd})
+		end
+	end
+	context "with a large tree" do
+		let(:data) { %w{a b c d e f g h} }
+		it "works for 'd'" do
+			expect(mht.audit_proof(3)).to eq(%w{c ab efgh})
+		end
+	end
+	context "with a hueg tree" do
+		let(:data) { %w{a b c d e f g h i j k l m n o p q r s t u v w x y z} }
+		it "works for 'f'" do
+			expect(mht.audit_proof(5)).to eq(%w{e gh abcd ijklmnop qrstuvwxyz})
+		end
+		it "works for 'p'" do
+			expect(mht.audit_proof(15)).to eq(%w{o mn ijkl abcdefgh qrstuvwxyz})
+		end
+		it "works for 'z'" do
+			expect(mht.audit_proof(25)).to eq(%w{y qrstuvwx abcdefghijklmnop})
+		end
+	end
+end

data/spec/consistency_proof_spec.rb ADDED

@@ -0,0 +1,41 @@
+require_relative './spec_helper'
+require_relative '../lib/merkle-hash-tree'
+describe "MerkleHashTree#consistency_proof" do
+	let(:mht) { MerkleHashTree.new(data, IdentityDigest) }
+	context "with a seven-node tree" do
+		# Taken from RFC6962, s2.1.3
+		let(:data) { %w{a b c d e f g} }
+		it "is empty for 0->7" do
+			expect(mht.consistency_proof(0, 7)).to eq([])
+		end
+		it "is empty for 7->7" do
+			expect(mht.consistency_proof(7, 7)).to eq([])
+		end
+		it "works for 3->7" do
+			expect(mht.consistency_proof(3, 7)).to eq(%w{c d ab efg})
+		end
+		it "works for 4->7" do
+			expect(mht.consistency_proof(4, 7)).to eq(%w{efg})
+		end
+		it "works for 6->7" do
+			expect(mht.consistency_proof(6, 7)).to eq(%w{ef g abcd})
+		end
+	end
+	context "with a full three-level tree" do
+		# Taken from www.certificate-transparency.org's "How Log Proofs Work"
+		# page
+		let(:data) { %w{a b c d e f g h} }
+		it "works for 6->8" do
+			expect(mht.consistency_proof(6, 8)).to eq(%w{ef gh abcd})
+		end
+	end
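
The expectations in this spec can be reproduced by hand with a small sketch of the RFC6962 consistency proof recursion. Like the spec's `IdentityDigest`, the sketch uses a toy "hash" that simply concatenates leaf labels, so each proof element shows which leaves it covers. The function names (`split_point`, `toy_mth`, `subproof`, `consistency_proof`) are assumptions for this illustration and work on an in-memory array rather than through the gem's data access interface.

```ruby
# Largest power of two strictly smaller than n (n >= 2).
def split_point(n)
  1 << ((n - 1).bit_length - 1)
end

# Toy tree hash: the concatenation of the leaf labels it covers.
def toy_mth(leaves)
  leaves.join
end

# SUBPROOF(m, D[n], b): `complete` is true while the subtree being
# examined is still the whole of the original m-leaf tree.
def subproof(m, leaves, complete)
  n = leaves.length
  if m == n
    # The old tree's own root is already known to the verifier.
    return complete ? [] : [toy_mth(leaves)]
  end
  k = split_point(n)
  if m <= k
    # The old tree lies entirely in the left half; add the right half's hash.
    subproof(m, leaves[0...k], complete) + [toy_mth(leaves[k..-1])]
  else
    # The left half is shared by both trees; recurse into the right half.
    subproof(m - k, leaves[k..-1], false) + [toy_mth(leaves[0...k])]
  end
end

# Proof that the first m leaves form a prefix of the full list.
def consistency_proof(m, leaves)
  raise ArgumentError, "m out of range" if m < 0 || m > leaves.length
  return [] if m == 0
  subproof(m, leaves, true)
end

p consistency_proof(3, %w{a b c d e f g})
```

Swapping `toy_mth` for a real tree hash would give proofs of actual digests; the recursion itself would stay the same.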
+end

data/spec/dai_caching_spec.rb ADDED

@@ -0,0 +1,42 @@
+require_relative './spec_helper'
+require_relative '../lib/merkle-hash-tree'
+describe "DAI caching" do
+	let(:dai) do
+		%w{a b c d e f g}
+	end
+	let(:mht) { MerkleHashTree.new(dai, IdentityDigest) }
+	it "tries to get cache values, but is OK with nil" do
+		dai.
+		  should_receive(:mht_cache_get).
+		  with(any_args).
+		  at_least(:once).
+		  and_return(nil)
+		mht.head
+	end
+	it "respects cache values" do
+		dai.
+		  should_receive(:mht_cache_get).
+		  with('1..5').
+		  and_return('xyzzy')
+		expect(mht.head(1..5)).to eq('xyzzy')
+	end
+	it "tries to cache calculated values" do
+		dai.
+		  should_receive(:mht_cache_get).
+		  with(any_args).
+		  at_least(:once).
+		  and_return(nil)
+		dai.
+		  should_receive(:mht_cache_set).
+		  with('2..2', 'c')
+		mht.head(2..2)
+	end
+end

data/spec/head_n_spec.rb ADDED

@@ -0,0 +1,36 @@
+require_relative './spec_helper'
+require_relative '../lib/merkle-hash-tree'
+require 'digest/sha2'
+describe "MerkleHashTree#head" do
+	let(:data) do
+		["", "\0", "\x10", "\x20\x21", "\x30\x31", "\x40\x41\x42\x43",
+	    "\x50\x51\x52\x53\x54\x55\x56\x57",
+	    "\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f"
+	   ]
+	end
+	let(:mht) { MerkleHashTree.new(data, Digest::SHA256) }
+	hashes = ["6e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d",
+	          "fac54203e7cc696cf0dfcb42c92a1d9dbaf70ad9e621f4bd8d98662f00e3c125",
+	          "aeb6bcfe274b70a14fb067a5e5578264db0fa9b51af5e0ba159158f329e06e77",
+	          "d37ee418976dd95753c1c73862b9398fa2a2cf9b4ff0fdfe8b30cd95209614b7",
+	          "4e3bbb1f7b478dcfe71fb631631519a3bca12c9aefca1612bfce4c13a86264d4",
+	          "76e67dadbcdf1e10e1b74ddc608abd2f98dfb16fbce75277b5232a127f2087ef",
+	          "ddb89be403809e325750d3d263cd78929c2942b7942a34b77e122c9594a74c8c",
+	          "5dc9da79a70659a9ad559cb701ded9a2ab9d823aad2f4960cfe370eff4604328"
+	         ]
+	def hexstring(s)
+		s.scan(/./m).map { |c| sprintf("%02x", c.ord) }.join
+	end
+	8.times do |i|
+		context "head(#{i})" do
+			it "gives a specific hash" do
+				expect(hexstring(mht.head(0..i))).to eq(hashes[i])
+			end
+		end
+	end
+end

data/spec/head_spec.rb ADDED

@@ -0,0 +1,45 @@
+require_relative './spec_helper'
+require_relative '../lib/merkle-hash-tree'
+require 'digest/sha2'
+# These tests are taken directly from the CT reference implementation's
+# merkle tree test suite; they're used to ensure we conform to that
+# specification correctly
+describe "MerkleHashTree#head" do
+	let(:mht) { MerkleHashTree.new(data, Digest::SHA256) }
+	hashes = [Digest::SHA256.hexdigest(""),
+	          "6e340b9cffb37a989ca544e6bb780a2c78901d3fb33738768511a30617afa01d",
+	          "fac54203e7cc696cf0dfcb42c92a1d9dbaf70ad9e621f4bd8d98662f00e3c125",
+	          "aeb6bcfe274b70a14fb067a5e5578264db0fa9b51af5e0ba159158f329e06e77",
+	          "d37ee418976dd95753c1c73862b9398fa2a2cf9b4ff0fdfe8b30cd95209614b7",
+	          "4e3bbb1f7b478dcfe71fb631631519a3bca12c9aefca1612bfce4c13a86264d4",
+	          "76e67dadbcdf1e10e1b74ddc608abd2f98dfb16fbce75277b5232a127f2087ef",
+	          "ddb89be403809e325750d3d263cd78929c2942b7942a34b77e122c9594a74c8c",
+	          "5dc9da79a70659a9ad559cb701ded9a2ab9d823aad2f4960cfe370eff4604328"
+	         ]
+	leaves = ["",
+	          "\0",
+	          "\x10",
+	          "\x20\x21",
+	          "\x30\x31",
+	          "\x40\x41\x42\x43",
+	          "\x50\x51\x52\x53\x54\x55\x56\x57",
+	          "\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f"
+	         ]
+	def string_from_hex(s)
+		s.scan(/../).map { |c| c.to_i(16).chr }.join
+	end
+	9.times do |i|
+		context "with #{i} items" do
+			let(:data) { leaves.take(i) }
+			it "gives a specific hash" do
+				expect(mht.head).to eq(string_from_hex(hashes[i]))
+			end
+		end
+	end
+end

data/spec/power_of_2_smaller_than_spec.rb ADDED

@@ -0,0 +1,37 @@
+require_relative './spec_helper'
+require_relative '../lib/merkle-hash-tree'
+require 'digest/sha1'
+describe "MerkleHashTree#power_of_2_smaller_than" do
+	tests = {
+		2 => 1,
+		3 => 2,
+		4 => 2,
+		5 => 4,
+		6 => 4,
+		7 => 4,
+		8 => 4,
+		9 => 8,
+		10 => 8,
+		15 => 8,
+		16 => 8,
+		17 => 16,
+		18 => 16,
+		31 => 16,
+		32 => 16,
+		33 => 32
+	}
+	let(:mht) { MerkleHashTree.new([], Digest::SHA1) }
+	it "bombs out for n=1" do
+		expect { mht.send(:power_of_2_smaller_than, 1) }.
+		  to raise_error(ArgumentError, /Too small, Jim/)
+	end
+	tests.each_pair do |k, v|
+		it "(#{k}) => #{v}" do
+			expect(mht.send(:power_of_2_smaller_than, k)).to eq(v)
+		end
+	end
+end

data/spec/spec_helper.rb ADDED

@@ -0,0 +1,34 @@
+require 'spork'
+Spork.prefork do
+	require 'bundler'
+	Bundler.setup(:default, :test)
+	require 'rspec/core'
+	require 'rspec/mocks'
+	require 'pry'
+#	require 'plymouth'
+	RSpec.configure do |config|
+		config.fail_fast = true
+#		config.full_backtrace = true
+		config.expect_with :rspec do |c|
+			c.syntax = :expect
+		end
+	end
+	# Our super-special digest class to make it easier to understand WTF is
+	# going on
+	class IdentityDigest
+		def self.digest(s)
+			# Strip off the first character, it'll just be a \0 or \x1 anyway
+			s[1..-1]
+		end
+	end
+end
+Spork.each_run do
+	# Nothing to do here, specs will load the files they need
+end

metadata ADDED

@@ -0,0 +1,243 @@
+--- !ruby/object:Gem::Specification
+name: merkle-hash-tree
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+  prerelease:
+platform: ruby
+authors:
+- Matt Palmer
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2014-07-10 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: git-version-bump
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: guard-spork
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: guard-rspec
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: plymouth
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: pry-debugger
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: rb-inotify
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '0.9'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '0.9'
+- !ruby/object:Gem::Dependency
+  name: rdoc
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: rspec
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '2.11'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '2.11'
+- !ruby/object:Gem::Dependency
+  name: rspec-mocks
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+description:
+email:
+executables: []
+extensions: []
+extra_rdoc_files:
+- README.md
+files:
+- .gitignore
+- Gemfile
+- Guardfile
+- README.md
+- Rakefile
+- doc/DAI.md
+- lib/merkle-hash-tree.rb
+- lib/range_extensions.rb
+- merkle-hash-tree.gemspec
+- spec/audit_proof_spec.rb
+- spec/consistency_proof_spec.rb
+- spec/dai_caching_spec.rb
+- spec/head_n_spec.rb
+- spec/head_spec.rb
+- spec/power_of_2_smaller_than_spec.rb
+- spec/spec_helper.rb
+homepage: http://theshed.hezmatt.org/merkle-hash-tree
+licenses: []
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: 4544178081523726354
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: 4544178081523726354
+requirements: []
+rubyforge_project:
+rubygems_version: 1.8.23
+signing_key:
+specification_version: 3
+summary: An RFC6962-compliant implementation of Merkle Hash Trees
+test_files: []