RubyGems - perobs - Versions diffs - 0.0.1 - Mend

perobs 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (30)

checksums.yaml +7 -0
data/.gitignore +14 -0
data/Gemfile +4 -0
data/LICENSE.txt +22 -0
data/README.md +113 -0
data/Rakefile +22 -0
data/lib/perobs/Array.rb +173 -0
data/lib/perobs/BlockDB.rb +242 -0
data/lib/perobs/Cache.rb +201 -0
data/lib/perobs/DataBase.rb +115 -0
data/lib/perobs/FileSystemDB.rb +171 -0
data/lib/perobs/Hash.rb +175 -0
data/lib/perobs/HashedBlocksDB.rb +153 -0
data/lib/perobs/Object.rb +189 -0
data/lib/perobs/ObjectBase.rb +159 -0
data/lib/perobs/Store.rb +290 -0
data/lib/perobs/version.rb +4 -0
data/lib/perobs.rb +29 -0
data/perobs.gemspec +23 -0
data/spec/Array_spec.rb +94 -0
data/spec/FileSystemDB_spec.rb +107 -0
data/spec/Hash_spec.rb +96 -0
data/spec/Object_spec.rb +108 -0
data/spec/Store_spec.rb +412 -0
data/spec/perobs_spec.rb +155 -0
data/tasks/changelog.rake +169 -0
data/tasks/gem.rake +50 -0
data/tasks/rdoc.rake +14 -0
data/tasks/test.rake +7 -0
metadata +121 -0

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 9dd54b9f62dc6b5cc7129d25b9ba87e2d7aa3775
+  data.tar.gz: 1431e7ec23c7bf2c18fa65b7c3e14b33bc696b2b
+SHA512:
+  metadata.gz: dbf7166adf28acabef48594bb80721512f0b30156f66965e7077f6b4089e429c5d951b621c0c3322bc6c2a0b269e47042994dd9449d75cb00858e0d2a23bbbbc
+  data.tar.gz: 73ae5cbfd48a5bc3a53194961398ec6a0ff57a4ac4ed606ba0e3ab1922fcba89cbc72cb90327e2256e2c9709a2a2707bdea8a8bd9124ff973531b800ffc2f304
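
The digests above can be compared against the `metadata.gz` and `data.tar.gz` members of the unpacked `.gem` archive. A minimal sketch, assuming the archive has already been extracted; the helper name `checksum_matches?` is hypothetical and not part of RubyGems or perobs:

```ruby
require 'digest'

# Hypothetical helper: compare the digest of the file at `path` with an
# expected hex string. `algorithm` is any Digest class, such as
# Digest::SHA1 or Digest::SHA512.
def checksum_matches?(path, expected_hex, algorithm = Digest::SHA512)
  algorithm.file(path).hexdigest == expected_hex.downcase
end
```

Pass `Digest::SHA1` as the third argument to check the shorter SHA1 entries.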

data/.gitignore ADDED

@@ -0,0 +1,14 @@
+/.bundle/
+/.yardoc
+/Gemfile.lock
+/_yardoc/
+/coverage/
+/doc/
+/pkg/
+/spec/reports/
+/tmp/
+*.bundle
+*.so
+*.o
+*.a
+mkmf.log

data/Gemfile ADDED

@@ -0,0 +1,4 @@
+source 'https://rubygems.org'
+# Specify your gem's dependencies in perobs.gemspec
+gemspec

data/LICENSE.txt ADDED

@@ -0,0 +1,22 @@
+Copyright (c) 2015 Chris Schlaeger
+MIT License
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

data/README.md ADDED

@@ -0,0 +1,113 @@
+# PEROBS - PErsistent Ruby OBject Store
+PEROBS is a library that provides a persistent object store for Ruby
+objects. Objects of your classes can be made persistent by deriving
+them from PEROBS::Object. They will be in memory when needed and
+transparently stored into a persistent storage. Currently only
+filesystem based storage is supported, but back-ends for key/value
+databases can be easily added.
+This library is ideal for Ruby applications that work on huge, mostly
+constant data sets and usually handle a small subset of the data at a
+time. To ensure data consistency of a larger data set, you can use
+transactions to make modifications of multiple objects atomic.
+Transactions can be nested and are aborted when an exception is
+raised.
+## Usage
+It features a garbage collector that removes all objects that are no
+longer in use. A built-in cache keeps access latencies to recently
+used objects low and lazily flushes modified objects into the
+persistent back-end.
+Persistent objects must be created by deriving your class from
+PEROBS::Object. Only instance variables that are declared via
+po_attr will be persistent. All objects that are stored in persistent
+instance variables must provide a to_json method that generates JSON
+syntax that can be parsed into their original object again. It is
+recommended that all references to other objects point to persistent
+objects as well.
+There are currently 3 kinds of persistent objects available:
+* PEROBS::Object is the base class for all your classes that should be
+  persistent.
+* PEROBS::Array provides an interface similar to the built-in Array class
+  but its objects are automatically stored.
+* PEROBS::Hash provides an interface similar to the built-in Hash
+  class but its objects are automatically stored.
+In addition to these classes, you also need to create a PEROBS::Store
+object that owns your persistent objects. The store provides the
+persistent database. If you are using the default serializer (JSON),
+you can only use the subset of Ruby types that JSON supports.
+Alternatively, you can use Marshal or YAML which support almost every
+Ruby data type.
+Here is an example of how to use PEROBS. Let's define a class that models
+a person with their family relations.
+```
+require 'perobs'
+class Person < PEROBS::Object
+  po_attr :name, :mother, :father, :kids
+  def initialize(store, name)
+    super(store)
+    attr_init(:name, name)
+    attr_init(:kids, PEROBS::Array.new(store))
+  end
+  def to_s
+    "#{@name} is the child of #{self.mother ? self.mother.name : 'unknown'} " +
+    "and #{self.father ? self.father.name : 'unknown'}."
+  end
+end
+store = PEROBS::Store.new('family')
+store['grandpa'] = joe = Person.new(store, 'Joe')
+store['grandma'] = jane = Person.new(store, 'Jane')
+jim = Person.new(store, 'Jim')
+jim.father = joe
+joe.kids << jim
+jim.mother = jane
+jane.kids << jim
+store.sync
+```
+When you run this script, a folder named 'family' will be created. It
+contains the 3 Person objects.
+## Installation
+Add this line to your application's Gemfile:
+```ruby
+gem 'perobs'
+```
+And then execute:
+    $ bundle
+Or install it yourself as:
+    $ gem install perobs
+## Contributing
+1. Fork it ( https://github.com/scrapper/perobs/fork )
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Add some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create a new Pull Request
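
The README requires that values stored in persistent attributes provide a `to_json` method whose output can be parsed back into the original object. A minimal sketch of such a round trip, using the standard `json` library's `json_create` hook; the `Point` class is hypothetical, and whether PEROBS restores objects through exactly this mechanism is an assumption:

```ruby
require 'json'

# Hypothetical value class, not part of PEROBS.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Emit a JSON object tagged with the class name so that the parser can
  # find the class again.
  def to_json(*args)
    { JSON.create_id => self.class.name, 'x' => @x, 'y' => @y }.to_json(*args)
  end

  # Called by JSON.parse when create_additions is enabled.
  def self.json_create(hash)
    new(hash['x'], hash['y'])
  end

  def ==(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
end

original = Point.new(3, 4)
restored = JSON.parse(original.to_json, create_additions: true)
```

Only enable `create_additions` for data you trust, since it instantiates whatever class the JSON names.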

data/Rakefile ADDED

@@ -0,0 +1,22 @@
+# Add the lib directory to the search path if it isn't included already
+# lib = File.expand_path('../lib', __FILE__)
+# $:.unshift lib unless $:.include?(lib)
+require "bundler/gem_tasks"
+require "rspec/core/rake_task"
+require 'rake/clean'
+require 'yard'
+YARD::Rake::YardocTask.new
+Dir.glob( 'tasks/*.rake').each do |fn|
+  begin
+    load fn;
+  rescue LoadError
+    puts "#{fn.split('/')[1]} tasks unavailable: #{$!}"
+  end
+end
+task :default => :spec
+desc 'Run all unit and spec tests'
+task :test => :spec

data/lib/perobs/Array.rb ADDED

@@ -0,0 +1,173 @@
+# encoding: UTF-8
+#
+# = Array.rb -- Persistent Ruby Object Store
+#
+# Copyright (c) 2015 by Chris Schlaeger <chris@taskjuggler.org>
+#
+# MIT License
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# "Software"), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+require 'perobs/ObjectBase'
+module PEROBS
+  # An Array that is transparently persisted onto the back-end storage. It is
+  # very similar to the Ruby built-in Array class but implements only a
+  # subset of its methods.
+  class Array < ObjectBase
+    # Create a new PersistentArray object.
+    # @param store [Store] The Store this array is stored in
+    # @param size [Fixnum] The requested size of the Array
+    # @param default [Any] The default value that is returned when no value is
+    #        stored for a specific index.
+    def initialize(store, size = 0, default = nil)
+      super(store)
+      @data = ::Array.new(size, default)
+    end
+    # Equivalent to Array::[]
+    def [](index)
+      _dereferenced(@data[index])
+    end
+    # Equivalent to Array::[]=
+    def []=(index, obj)
+      @data[index] = _referenced(obj)
+      @store.cache.cache_write(self)
+      obj
+    end
+    # Equivalent to Array::<<
+    def <<(obj)
+      @store.cache.cache_write(self)
+      @data << _referenced(obj)
+    end
+    # Equivalent to Array::+
+    def +(ary)
+      @store.cache.cache_write(self)
+      @data + ary
+    end
+    # Equivalent to Array::push
+    def push(obj)
+      @store.cache.cache_write(self)
+      @data.push(_referenced(obj))
+    end
+    # Equivalent to Array::pop
+    def pop
+      @store.cache.cache_write(self)
+      _dereferenced(@data.pop)
+    end
+    # Equivalent to Array::clear
+    def clear
+      @store.cache.cache_write(self)
+      @data.clear
+    end
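+    # Equivalent to Array::delete. Array#delete requires an argument, so the
+    # dereferenced comparison is done with delete_if instead.
+    def delete(obj)
+      @store.cache.cache_write(self)
+      @data.delete_if { |v| _dereferenced(v) == obj }
+    end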
+    # Equivalent to Array::delete_at
+    def delete_at(index)
+      @store.cache.cache_write(self)
+      @data.delete_at(index)
+    end
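+    # Equivalent to Array::delete_if
+    def delete_if
+      # Mark the object as modified, like the other mutating methods do.
+      @store.cache.cache_write(self)
+      @data.delete_if do |item|
+        yield(_dereferenced(item))
+      end
+    end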
+    # Equivalent to Array::each
+    def each
+      @data.each do |item|
+        yield(_dereferenced(item))
+      end
+    end
+    # Equivalent to Array::empty?
+    def empty?
+      @data.empty?
+    end
+    # Equivalent to Array::include?
+    def include?(obj)
+      @data.each { |v| return true if _dereferenced(v) == obj }
+      false
+    end
+    # Equivalent to Array::length
+    def length
+      @data.length
+    end
+    alias size length
+    # Equivalent to Array::map
+    def map
+      @data.map do |item|
+        yield(_dereferenced(item))
+      end
+    end
+    alias collect map
+    # Return a list of all object IDs of all persistent objects that this Array
+    # is referencing.
+    # @return [Array of Fixnum or Bignum] IDs of referenced objects
+    def _referenced_object_ids
+      @data.each.select { |v| v && v.is_a?(POReference) }.map { |o| o.id }
+    end
+    # This method should only be used during store repair operations. It will
+    # delete all references to the given object ID.
+    # @param id [Fixnum/Bignum] targeted object ID
+    def _delete_reference_to_id(id)
+      @data.delete_if { |v| v && v.is_a?(POReference) && v.id == id }
+    end
+    # Restore the persistent data from a single data structure.
+    # This is a library internal method. Do not use outside of this library.
+    # @param data [Array] the actual Array object
+    # @private
+    def _deserialize(data)
+      @data = data
+    end
+    private
+    def _serialize
+      @data
+    end
+  end
+end
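
The class above stores references rather than the objects themselves and resolves them on access through `_referenced` and `_dereferenced`, which are defined outside this file. A minimal sketch of that pattern under simplifying assumptions; `Ref`, `Doc`, `TinyStore` and `RefArray` are hypothetical stand-ins and do not reflect the actual PEROBS implementation:

```ruby
# Hypothetical stand-ins; none of these names come from PEROBS.
Ref = Struct.new(:id)      # placeholder stored in place of an object
Doc = Struct.new(:title)   # the kind of object that gets referenced

class TinyStore
  def initialize
    @objects = {}
  end

  # Remember the object and hand back a reference to it.
  def register(obj)
    @objects[obj.object_id] = obj
    Ref.new(obj.object_id)
  end

  # Turn a reference back into the object it points to.
  def resolve(ref)
    @objects.fetch(ref.id)
  end
end

class RefArray
  def initialize(store)
    @store = store
    @data = []
  end

  # Store a Ref for Doc objects and any other value as-is.
  def <<(obj)
    @data << (obj.is_a?(Doc) ? @store.register(obj) : obj)
    self
  end

  # Resolve references on read so that callers only see real objects.
  def [](index)
    item = @data[index]
    item.is_a?(Ref) ? @store.resolve(item) : item
  end

  # IDs of all referenced objects, comparable to _referenced_object_ids.
  def referenced_ids
    @data.select { |v| v.is_a?(Ref) }.map(&:id)
  end
end
```

Keeping only IDs in the container is what lets a garbage collector walk references without loading every object.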

data/lib/perobs/BlockDB.rb ADDED

@@ -0,0 +1,242 @@
+# encoding: UTF-8
+#
+# = BlockDB.rb -- Persistent Ruby Object Store
+#
+# Copyright (c) 2015 by Chris Schlaeger <chris@taskjuggler.org>
+#
+# MIT License
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# "Software"), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+require 'json'
+require 'json/add/core'
+require 'json/add/struct'
+module PEROBS
+  # This class manages the usage of the data blocks in the corresponding
+  # HashedBlocks object.
+  class BlockDB
+    # Create a new BlockDB object.
+    def initialize(dir, block_size)
+      @dir = dir
+      @block_size = block_size
+      @index_file_name = File.join(dir, 'index.json')
+      @block_file_name = File.join(dir, 'data')
+      read_index
+    end
+    # Write the given bytes with the given ID into the DB.
+    # @param id [Fixnum or Bignum] ID
+    # @param raw [String] sequence of bytes
+    def write_object(id, raw)
+      bytes = raw.bytesize
+      start_address = reserve_blocks(id, bytes)
+      if write_to_block_file(raw, start_address) != bytes
+        raise RuntimeError, 'Object length does not match written bytes'
+      end
+      write_index
+    end
+    # Read the entry for the given ID and return it as bytes.
+    # @param id [Fixnum or Bignum] ID
+    # @return [String] sequence of bytes
+    def read_object(id)
+      read_from_block_file(*find(id))
+    end
+    # Find the data for the object with given id.
+    # @param id [Fixnum or Bignum] Object ID
+    # @return [Array] Returns an Array with two Fixnum entries. The first is
+    #         the number of bytes and the second is the starting offset in the
+    #         block storage file.
+    def find(id)
+      @entries.each do |entry|
+        if entry['id'] == id
+          return [ entry['bytes'], entry['first_block'] * @block_size ]
+        end
+      end
+      nil
+    end
+    # Write a string of bytes into the file at the given address.
+    # @param raw [String] bytes to write
+    # @param address [Fixnum] offset in the file
+    # @return [Fixnum] number of bytes written
+    def write_to_block_file(raw, address)
+      begin
+        File.write(@block_file_name, raw, address)
+      rescue => e
+        raise IOError,
+              "Cannot write block file #{@block_file_name}: #{e.message}"
+      end
+    end
+    # Read _bytes_ bytes from the file starting at offset _address_.
+    # @param bytes [Fixnum] number of bytes to read
+    # @param address [Fixnum] offset in the file
+    def read_from_block_file(bytes, address)
+      begin
+        File.read(@block_file_name, bytes, address)
+      rescue => e
+        raise IOError,
+              "Cannot read block file #{@block_file_name}: #{e.message}"
+      end
+    end
+    # Clear the mark on all entries in the index.
+    def clear_marks
+      @entries.each { |e| e['marked'] = false}
+      write_index
+    end
+    # Set a mark on the entry with the given ID.
+    # @param id [Fixnum or Bignum] ID of the entry
+    def mark(id)
+      found = false
+      @entries.each do |entry|
+        if entry['id'] == id
+          entry['marked'] = true
+          found = true
+          break
+        end
+      end
+      unless found
+        raise ArgumentError, "Cannot find an entry for ID #{id} to mark"
+      end
+      write_index
+    end
+    # Check if the entry for a given ID is marked.
+    # @param id [Fixnum or Bignum] ID of the entry
+    # @return [TrueClass or FalseClass] true if marked, false otherwise
+    def is_marked?(id)
+      @entries.each do |entry|
+        return entry['marked'] if entry['id'] == id
+      end
+      raise ArgumentError, "Cannot find an entry for ID #{id} to check"
+    end
+    # Remove all entries from the index that have not been marked.
+    def delete_unmarked_entries
+      @entries.delete_if { |e| e['marked'] == false }
+      write_index
+    end
+    private
+    # Reserve the blocks needed for the specified number of bytes with the
+    # given ID.
+    # @param id [Fixnum or Bignum] ID of the entry
+    # @param bytes [Fixnum] number of bytes for this entry
+    # @return [Fixnum] the start address of the reserved block
+    def reserve_blocks(id, bytes)
+      # size of the entry in blocks
+      blocks = size_in_blocks(bytes)
+      # index of first block after the last seen entry
+      end_of_last_entry = 0
+      # block index of best fit segment
+      best_fit_start = nil
+      # best-fit segment size in blocks
+      best_fit_blocks = nil
+      # If there is already an entry for an object with the _id_, we mark it
+      # for deletion.
+      entry_to_delete = nil
+      @entries.each do |entry|
+        if entry['id'] == id
+          # We've found an old entry for this ID.
+          if entry['blocks'] >= blocks
+            # The old entry still fits. Let's just reuse it.
+            entry['bytes'] = bytes
+            entry['blocks'] = blocks
+            return entry['first_block'] * @block_size
+          end
+          # It does not fit. Ignore the entry and mark it for deletion.
+          entry_to_delete = entry
+          next
+        end
+        gap = entry['first_block'] - end_of_last_entry
+        if gap >= blocks &&
+          (best_fit_blocks.nil? || gap < best_fit_blocks)
+          # We've found a segment that fits the requested bytes and fits
+          # better than any previous find.
+          best_fit_start = end_of_last_entry
+          best_fit_blocks = gap
+        end
+        end_of_last_entry = entry['first_block'] + entry['blocks']
+      end
+      # Delete the old entry if requested.
+      @entries.delete(entry_to_delete) if entry_to_delete
+      # Create a new entry and insert it.
+      entry = {
+        'id' => id,
+        'bytes' => bytes,
+        'first_block' => best_fit_start || end_of_last_entry,
+        'blocks' => blocks,
+        'marked' => false
+      }
+      @entries << entry
+      @entries.sort! { |e1, e2| e1['first_block'] <=> e2['first_block'] }
+      entry['first_block'] * @block_size
+    end
+    def read_index
+      if File.exist?(@index_file_name)
+        begin
+          @entries = JSON.parse(File.read(@index_file_name))
+        rescue => e
+          raise RuntimeError,
+                "BlockDB file #{@index_file_name} corrupted: #{e.message}"
+        end
+      else
+        @entries = []
+      end
+    end
+    def write_index
+      begin
+        File.write(@index_file_name, @entries.to_json)
+      rescue => e
+        raise RuntimeError,
+              "Cannot write BlockDB index file #{@index_file_name}: " +
+              e.message
+      end
+    end
+    def size_in_blocks(bytes)
+      bytes / @block_size + (bytes % @block_size != 0 ? 1 : 0)
+    end
+  end
+end
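
The allocation strategy in `reserve_blocks` is a best-fit search over the gaps between entries sorted by `first_block`. The following standalone sketch isolates that search and the block rounding; it leaves out the reuse and deletion of an existing entry for the same ID, and the free function `best_fit_start` and the sample data are illustrative only:

```ruby
# Number of fixed-size blocks needed to hold the given byte count
# (ceiling division; equivalent to the method above for non-negative sizes).
def size_in_blocks(bytes, block_size)
  (bytes + block_size - 1) / block_size
end

# Return the first block index of the smallest gap that can hold `blocks`
# blocks, or the block just past the last entry when no gap fits.
# `entries` must be sorted by 'first_block'.
def best_fit_start(entries, blocks)
  end_of_last_entry = 0
  best_start = nil
  best_size = nil
  entries.each do |entry|
    gap = entry['first_block'] - end_of_last_entry
    if gap >= blocks && (best_size.nil? || gap < best_size)
      best_start = end_of_last_entry
      best_size = gap
    end
    end_of_last_entry = entry['first_block'] + entry['blocks']
  end
  best_start || end_of_last_entry
end

entries = [
  { 'first_block' => 0, 'blocks' => 2 },  # occupies blocks 0..1
  { 'first_block' => 6, 'blocks' => 1 },  # preceded by a 4-block gap (2..5)
  { 'first_block' => 9, 'blocks' => 3 }   # preceded by a 2-block gap (7..8)
]

best_fit_start(entries, 2)  # picks the tighter gap, which starts at block 7
best_fit_start(entries, 5)  # no gap is large enough, so appends at block 12
```

Choosing the smallest sufficient gap keeps larger gaps available for larger objects, at the cost of scanning every entry on each write.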