relaxo 0.4.7 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 1bd62c00d2fd57e6f6e9bcc39ced85062f777125
- data.tar.gz: 3ca923d3f102f4ffb22c7cae9cd8a9e1b10a81ca
+ metadata.gz: 814525baa6f655dc170eec88ee0878abf080f10c
+ data.tar.gz: 8c530c78c0b059db275418b4db7bfc521b562663
  SHA512:
- metadata.gz: c93d34f9b6f6d7de7f720e272a66b1dbf50e8f42a4b45fa2526c08aad4db290e16c0ae9b37aecd4164c476ae5836e5eba32b569b43f583a171026ebc71b5ccbb
- data.tar.gz: 9f6fa956df6cc287a0308a176376824d9e64e9a1fa3e87b7de33999c4c31826a929d699a3cce229d59a0b365637bf4efebee21284ef334ba4cc9da9c840e68e5
+ metadata.gz: e5f3dbc6500971bf4ece1cea1369f8f4b0d1e3ffa5055163a89da3a4c9f2ff796eea4b7031c82231bda3694f65f77a7620312966490124ff5d0ec4d9611aa5d8
+ data.tar.gz: 1322de675206c9ce30f8f088d85ba0646155e9ae54f0e974274fdeae23bd80b9287526a91f8c6554c81536a3c67c8c244fba00487c7b216bc004ba50fb98106d
data/.gitignore CHANGED
@@ -15,3 +15,4 @@ spec/reports
  test/tmp
  test/version_tmp
  tmp
+ spec/relaxo/test
data/.rspec ADDED
@@ -0,0 +1,3 @@
+ --color
+ --format documentation
+ --warnings
@@ -1,13 +1,16 @@
  language: ruby
  sudo: false
+ before_install:
+ # For testing purposes:
+ - git config --global user.name "Samuel Williams"
+ - git config --global user.email "samuel@oriontransfer.net"
  rvm:
  - 2.1.8
  - 2.2.4
  - 2.3.1
+ - 2.4.0
  - rbx-2
- services:
- - couchdb
- env: COVERAGE=true
+ env: COVERAGE=true BENCHMARK=true
  matrix:
  allow_failures:
  - rvm: ruby-head
data/Gemfile CHANGED
@@ -3,11 +3,18 @@ source 'https://rubygems.org'
  # Specify your gem's dependencies in relaxo.gemspec
  gemspec

- platforms :jruby do
- gem 'jruby-openssl'
+ gem 'rugged', git: 'git://github.com/libgit2/rugged.git', submodules: true
+
+ group :development do
+ gem "pry"
+ gem "msgpack"
  end

  group :test do
+ gem 'benchmark-ips'
+ gem 'ruby-prof'
+
+ gem 'rack-test'
  gem 'simplecov'
  gem 'coveralls', require: false
  end
data/README.md CHANGED
@@ -1,103 +1,110 @@
  # Relaxo

- Relaxo provides a set of tools and interfaces for interacting with CouchDB. It aims to be as simple and efficient as possible while still improving the usability of various CouchDB features.
+ Relaxo is a transactional database built on top of git. Its aim is to provide a robust interface for document storage and sorted indexes.

- [![Build Status](https://secure.travis-ci.org/ioquatix/relaxo.png)](http://travis-ci.org/ioquatix/relaxo)
- [![Code Climate](https://codeclimate.com/github/ioquatix/relaxo.png)](https://codeclimate.com/github/ioquatix/relaxo)
+ [![Build Status](https://secure.travis-ci.org/ioquatix/relaxo.svg)](http://travis-ci.org/ioquatix/relaxo)
+ [![Code Climate](https://codeclimate.com/github/ioquatix/relaxo.svg)](https://codeclimate.com/github/ioquatix/relaxo)
  [![Coverage Status](https://coveralls.io/repos/ioquatix/relaxo/badge.svg)](https://coveralls.io/r/ioquatix/relaxo)
+
  ## Installation

  Add this line to your application's Gemfile:

- gem 'relaxo'
+ gem 'relaxo'

  And then execute:

- $ bundle
+ $ bundle

  Or install it yourself as:

- $ gem install relaxo
+ $ gem install relaxo

  ## Usage

  Connect to a local database and manipulate some documents.

  require 'relaxo'
+ require 'msgpack'
+
+ DB = Relaxo.connect("test")

- database = Relaxo.connect("http://localhost:5984/test")
+ DB.commit(message: "Create test data") do |dataset|
+ object = dataset.append(MessagePack.dump({bob: 'dole'}))
+ dataset.write("doc1.json", object)
+ end

- doc1 = {:bob => 'dole'}
- database.save(doc1)
+ DB.commit(message: "Update test data") do |dataset|
+ doc = MessagePack.load dataset.read('doc1.json').data
+ doc[:foo] = 'bar'
+
+ object = dataset.append(MessagePack.dump(doc))
+ dataset.write("doc2.json", object)
+ end

- doc2 = database.get(doc1['_id'])
- doc2[:foo] = 'bar'
- database.save(doc2)
+ doc = MessagePack.load DB.current['doc2.json'].data
+ puts doc
+ # => {"bob"=>"dole", "foo"=>"bar"}

- ### Transactions/Bulk Save
+ ### Document Storage

- Sessions support a very similar interface to the main database class and can for many cases be used interchangeably, but with added efficiency.
+ Relaxo uses the git persistent data structure for storing documents. This data structure exposes a file-system like interface, which stores any kind of data. This means that you are free to use JSON, or BSON, or MessagePack, or JPEG, or XML, or any combination of those.

- require 'relaxo'
- require 'relaxo/session'
-
- database = Relaxo.connect("http://localhost:5984/test")
- animals = ['Neko-san', 'Wan-chan', 'Nezu-chan', 'Chicken-san']
+ Relaxo has a transactional model for both reading and writing.
+
+ #### Reading Files
+
+ path = "path/to/document"

- database.transaction do |transaction|
- animals.each do |animal|
- transaction.save({:name => animal})
- end
+ DB.current do |dataset|
+ object = dataset.read(path)
+
+ puts "The object id: #{object.oid}"
+ puts "The object data size: #{object.size}"
+ puts "The object data: #{object.data.inspect}"
  end
- # => [
- # {:name=>"Neko-san", "_id"=>"...", "_rev"=>"..."},
- # {:name=>"Wan-chan", "_id"=>"...", "_rev"=>"..."},
- # {:name=>"Nezu-chan", "_id"=>"...", "_rev"=>"..."},
- # {:name=>"Chicken-san", "_id"=>"...", "_rev"=>"..."}
- #]
-
- All documents will allocated UUIDs appropriately and at the end of the session block they will be updated (saved or deleted) using CouchDB `_bulk_save`. The Transactions interface doesn't support any kind of interaction with the server and thus views won't be updated until after the transaction is complete.

- To abort the session, either raise an exception or call `transaction.abort!` which is equivalent to `throw :abort`.
+ #### Writing Files

- ### Loading Data
+ path = "path/to/document"
+ data = MessagePack.dump(document)
+
+ DB.commit(message: "Adding document") do |changeset|
+ object = changeset.append(data)
+ changeset.write(path, object)
+ end
+
+ ### Datasets and Transactions

- Relaxo includes a command line script to import documents into a CouchDB database:
+ `Dataset`s and `Changeset`s are important concepts. Relaxo doesn't allow arbitrary access to data, but instead exposes the git persistent model for both reading and writing. The implications of this are that when reading or writing, you always see a consistent snapshot of the data store.

- % relaxo --help
- Usage: relaxo [options] [server-url] [files]
- This script can be used to import data to CouchDB.
+ ### Suitability

- Document creation:
- --existing [mode] Control whether to 'update (new document attributes takes priority), 'merge' (existing document attributes takes priority) or replace (old document attributes discarded) existing documents.
- --format [type] Control the input format. 'yaml' files are imported as a single document or array of documents. 'csv' files are imported as records using the first row as attribute keys.
- --[no-]transaction Controls whether data is saved using the batch save operation. Not suitable for huge amounts of data.
+ Relaxo is designed to scale to the hundreds of thousands of documents. It's designed around the git persistent data store, and therefore has some performance and concurrency limitations due to the underlying implementation.

- Help and Copyright information:
- --copy Display copyright and warranty information
- -h, --help Show this help message.
+ Because it maintains a full history of all changes, the repository would continue to grow over time by default, but there are mechanisms to deal with that.

- This command loads the documents stored in `design.yaml` and `sample.yaml` into the database at `http://localhost:5984/test`.
+ #### Performance

- % relaxo http://localhost:5984/test design.yaml sample.yaml
+ Relaxo can do anywhere from 1,000 to 10,000 inserts per second depending on how you structure the workload.

- ...where `design.yaml` and `sample.yaml` contain lists of valid documents, e.g.:
+ Relaxo Performance
+ Warming up --------------------------------------
+ single 129.000 i/100ms
+ Calculating -------------------------------------
+ single 6.224k (±14.7%) i/s - 114.036k in 20.000025s
+ single transaction should be fast
+ Warming up --------------------------------------
+ multiple 152.000 i/100ms
+ Calculating -------------------------------------
+ multiple 1.452k (±15.2%) i/s - 28.120k in 20.101831s
+ multiple transactions should be fast

- # design.yaml
- - _id: "_design/services"
- language: javascript
- views:
- service:
- map: |
- function(doc) {
- if (doc.type == 'service') {
- emit(doc._id, doc._rev);
- }
- }
+ Reading data is lightning fast as it's loaded directly from disk and cached.

- If you specify `--format=csv`, the input files will be parsed as standard CSV. The document schema is inferred from the zeroth (header) row and all subsequent rows will be converted to individual documents. All fields will be saved as text.
+ ### Loading Data

- If your requirements are more complex, consider writing a custom script either to import directly using the `relaxo` gem or convert your data to YAML and import that as above.
+ As Relaxo is unapologetically based on git, you can use git directly with a non-bare working directory to add any files you like. You can even point Relaxo at an existing git repository.

  ## Contributing

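Note on the "Datasets and Transactions" paragraph in the new README: the consistent-snapshot behaviour it describes can be illustrated without git at all. The following is a minimal sketch under stated assumptions, not Relaxo's implementation: `SnapshotStore` is a hypothetical in-memory stand-in, and a frozen Hash plays the role of a committed tree.

```ruby
# Hypothetical in-memory stand-in for the git-backed store; not Relaxo's API.
class SnapshotStore
  def initialize
    @head = {}.freeze
  end

  # Point-in-time, read-only view of the data.
  def current
    @head
  end

  # Yield a private mutable copy and publish it only once the block returns.
  def commit
    working = @head.dup # dup of a frozen Hash is unfrozen
    yield working
    @head = working.freeze
  end
end

store = SnapshotStore.new
before = store.current

store.commit do |changes|
  changes["doc1"] = "hello"
  # Readers outside the block still see the old snapshot at this point.
  raise "uncommitted change leaked" if store.current.key?("doc1")
end

puts before.key?("doc1")        # the earlier snapshot is unchanged
puts store.current.key?("doc1") # a fresh snapshot sees the commit
```

A snapshot obtained before the commit never changes, which is the property the README attributes to `Dataset`; the real library presumably gets this from immutable git trees rather than frozen hashes.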
data/Rakefile CHANGED
@@ -6,3 +6,15 @@ RSpec::Core::RakeTask.new(:spec) do |task|
  end

  task :default => :spec
+
+ task :console do
+ require 'pry'
+ require 'msgpack'
+ require 'securerandom'
+
+ require_relative 'lib/relaxo'
+
+ DB = Relaxo.connect(File.join(__dir__, '/tmp/relaxo-test-db'))
+
+ Pry.start
+ end
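Note on the `:console` task: the leading slash in `'/tmp/relaxo-test-db'` does not make the result an absolute `/tmp` path, because `File.join` collapses the duplicate separator at the joint. A small check, using a hypothetical directory in place of `__dir__`:

```ruby
# "/srv/app" is a made-up project directory standing in for __dir__.
root = "/srv/app"

with_slash = File.join(root, "/tmp/relaxo-test-db")
without_slash = File.join(root, "tmp/relaxo-test-db")

puts with_slash
puts without_slash
puts with_slash == without_slash
```

Both calls resolve to a `tmp/relaxo-test-db` directory under the project root, so the console database lives inside the checkout rather than in the system temporary directory.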
@@ -20,23 +20,14 @@

  require 'relaxo/database'

+ require 'pry'
+
  module Relaxo
- def self.connect(url, metadata = nil)
- host = "http://localhost:5984"
-
- if url =~ /^(https?:\/\/.+?)\/(.+)$/
- host = $1
- name = $2
-
- # Ensure that we use the default port if none has been specified:
- unless host =~ /:\d+$/
- host = host + ":5984"
- end
- else
- name = url
+ def self.connect(path, metadata = {})
+ unless File.exist?(path)
+ Rugged::Repository.init_at(path, true)
  end

- connection = Connection.new(host)
- database = Database.new(connection, name, metadata)
+ return Database.new(path, metadata)
  end
  end
@@ -18,49 +18,81 @@
  # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  # THE SOFTWARE.

- require 'relaxo/client'
-
- require 'thread'
+ require_relative 'dataset'

  module Relaxo
- class Connection
- DEFAULT_UUID_FETCH_COUNT = 10
-
- def initialize(url)
- @url = url
- @uuids = []
+ class Changeset < Dataset
+ def initialize(repository, tree)
+ super

- @uuid_lock = Mutex.new
+ @changes = {}
+ @directories = {}
  end

- attr :url
+ attr :ref
+ attr :changes

- # This implementation could be improved. It's not exactly fast to request 1 UUID at a time. One idea is to add a UUID queue to Transaction which allows UUIDs to be fetched in bulk on a per-transaction basis, and reused if the transaction fails.
- def next_uuid
- @uuid_lock.synchronize do
- fetch_uuids(DEFAULT_UUID_FETCH_COUNT) if @uuids.size == 0
-
- return @uuids.pop
+ def changes?
+ @changes.any?
+ end
+
+ def read(path)
+ if update = @changes[path]
+ if update[:action] != :remove
+ @repository.read(update[:oid])
+ end
+ else
+ super
  end
  end

- def info
- Client.get @url
+ def append(data, type = :blob)
+ oid = @repository.write(data, type)
+
+ return Rugged::Object.new(@repository, oid)
  end

- # Returns a list of names, one for each available database.
- def databases
- Client.get("#{@url}/_all_dbs")
+ def write(path, object, mode = 0100644)
+ root, _, name = path.rpartition('/')
+
+ entry = @changes[path] = {
+ action: :upsert,
+ oid: object.oid,
+ object: object,
+ filemode: mode,
+ path: path,
+ root: root,
+ name: name,
+ }
+
+ directory(root).insert(entry)
+
+ return entry
  end

- def configuration
- Client.get("#{@url}/_config")
+ alias []= write
+
+ def delete(path)
+ root, _, name = path.rpartition('/')
+
+ entry = @changes[path] = {
+ action: :remove,
+ path: path,
+ root: root,
+ name: name,
+ }
+
+ directory(root).delete(entry)
+
+ return entry
  end

- private
+ def abort!
+ throw :abort
+ end

- def fetch_uuids(count)
- @uuids += Client.get("#{@url}/_uuids?count=#{count}")["uuids"]
+ def write_tree
+ @tree.update(@changes.values)
  end
  end
  end
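Note on `Changeset#read`, `#write` and `#delete`: pending changes shadow the committed tree, so a block reads back its own uncommitted writes, and a pending removal reads as `nil`. The sketch below shows that overlay idea and the `rpartition('/')` path split in isolation. It rests on assumptions: `OverlaySketch` is a hypothetical class, a plain Hash replaces the Rugged tree, and raw data is stored where the real code stores object ids.

```ruby
# Hypothetical sketch: a Hash stands in for the committed git tree.
class OverlaySketch
  def initialize(base)
    @base = base
    @changes = {}
  end

  attr_reader :changes

  # Record an upsert, splitting the path into directory and file name.
  def write(path, data)
    root, _, name = path.rpartition("/")
    @changes[path] = {action: :upsert, data: data, root: root, name: name}
  end

  # Record a removal for the same path key.
  def delete(path)
    root, _, name = path.rpartition("/")
    @changes[path] = {action: :remove, root: root, name: name}
  end

  # Pending changes shadow the base; a pending removal reads as nil.
  def read(path)
    if (update = @changes[path])
      update[:data] unless update[:action] == :remove
    else
      @base[path]
    end
  end
end

overlay = OverlaySketch.new({"a/doc.json" => "old", "b.json" => "kept"})
overlay.write("a/doc.json", "new")
overlay.delete("b.json")

p overlay.read("a/doc.json")           # => "new"
p overlay.read("b.json")               # => nil
p overlay.changes["a/doc.json"][:root] # => "a"
p "b.json".rpartition("/")             # => ["", "", "b.json"]
```

For a path without any `/`, `rpartition` yields an empty root, which is how top-level files such as `doc1.json` end up in the root directory entry.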
@@ -18,118 +18,102 @@
  # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  # THE SOFTWARE.

- require 'relaxo/client'
- require 'relaxo/connection'
+ require 'rugged'
+ require 'logger'
+
+ require_relative 'dataset'
+ require_relative 'changeset'

  module Relaxo
- ID = '_id'
- REV = '_rev'
- DELETED = '_deleted'
+ HEAD = 'HEAD'.freeze

  class Database
- def initialize(connection, name, metadata = {})
- @connection = connection
- @name = name
-
+ def initialize(path, metadata = {})
+ @path = path
  @metadata = metadata

- @root = connection.url + "/" + CGI.escape(name)
+ @logger = metadata[:logger] || Logger.new($stderr).tap{|logger| logger.level = Logger::INFO}
+
+ @repository = repository || Rugged::Repository.new(path)
  end

- attr :connection
- attr :name
- attr :root
-
+ attr :path
  attr :metadata
+ attr :repository

- def [] key
- @metadata[key]
- end
-
- # Create the database, will potentially throw an exception if it already exists.
- def create!
- Client.put @root
- end
-
- # Return true if the database already exists.
- def exist?
- Client.head @root
- end
-
- # Delete the database and all data.
- def delete!
- Client.delete @root
- end
-
- # Compact the database, removing old document revisions and optimizing space use.
- def compact!
- Client.post "#{@root}/_compact"
- end
-
- def id?(id, parameters = {})
- Client.head document_url(id, parameters)
- end
-
- def get(id, parameters = {})
- Client.get document_url(id, parameters)
+ def empty?
+ @repository.empty?
  end

- def put(document)
- Client.put document_url(document[ID] || @connection.next_uuid), document
+ def [] key
+ @metadata[key]
  end

- def delete(document)
- Client.delete document_url(document[ID]) + "?rev=#{document[REV]}"
+ # During the execution of the block, changes don't get stored immediately, so reading from the dataset (from outside the block) will continue to return the values that were stored in the configuration when the transaction was started.
+ def commit(**options)
+ track_time(options[:message]) do
+ catch(:abort) do
+ begin
+ parent, tree = latest_commit
+
+ changeset = Changeset.new(@repository, tree)
+
+ yield changeset
+ end until apply(parent, changeset, **options)
+ end
+ end
  end

- def save(document)
- status = put(document)
+ # Efficient point-in-time read-only access.
+ def current
+ _, tree = latest_commit

- if status['ok']
- document[ID] = status['id']
- document[REV] = status['rev']
- end
+ dataset = Dataset.new(@repository, tree)

- return status
- end
-
- def bulk_save(documents, options = {})
- options = {
- :docs => documents,
- :all_or_nothing => true
- }.merge(options)
+ yield dataset if block_given?

- Client.post command_url("_bulk_docs"), options
- end
-
- # Accepts paramaters as described in http://wiki.apache.org/couchdb/HttpViewApi
- def view(name, parameters = {})
- Client.get view_url(name, parameters)
+ return dataset
  end

- def info
- Client.get @root
- end
+ private

- def documents(parameters = {})
- view("_all_docs", parameters)
+ def track_time(message)
+ start_time = Time.now
+
+ yield
+ ensure
+ end_time = Time.now
+ elapsed_time = end_time - start_time
+
+ @logger.debug("time") {"#{message.inspect}: %0.3fs" % elapsed_time}
  end

- private
-
- # Convert a simplified view name into a complete view path. If the name already starts with a "_" no alterations will be made.
- def view_url(name, parameters = {})
- path = (name =~ /^([^_].+?)\/(.*)$/ ? "_design/#{$1}/_view/#{$2}" : name)
+ def apply(parent, changeset, **options)
+ return true unless changeset.changes?
+
+ options[:tree] = changeset.write_tree
+ options[:parents] ||= [parent]
+ options[:update_ref] ||= HEAD

- Client.encode_url("#{@root}/#{path}", parameters)
+ begin
+ Rugged::Commit.create(@repository, options)
+ rescue Rugged::ObjectError
+ return false
+ end
  end

- def document_url(id, parameters = {})
- Client.encode_url("#{@root}/#{Client.escape_id(id)}", parameters)
+ def latest_commit
+ if head = @repository.head
+ return head.target, head.target.tree
+ else
+ return nil, empty_tree
+ end
+ rescue Rugged::ReferenceError
+ return nil, empty_tree
  end

- def command_url(command, parameters = {})
- Client.encode_url("#{@root}/#{command}", parameters)
+ def empty_tree
+ @empty_tree ||= Rugged::Tree.empty(@repository)
  end
  end
  end
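Note on `Database#commit`: the `begin ... end until apply(...)` loop reads as an optimistic retry, where the block runs again against the latest commit whenever applying the changes fails, and `throw :abort` leaves the loop without committing. The sketch below isolates that control flow. It is an assumption-laden model rather than the library's code: `RetryStore`, its version counter and `interfere!` are hypothetical, and a version mismatch stands in for the `Rugged::ObjectError` that the real `apply` rescues.

```ruby
# Hypothetical stand-in for the repository: a value plus a version counter.
class RetryStore
  attr_reader :value, :version

  def initialize
    @value = 0
    @version = 0
  end

  # Re-run the block until the commit applies; `throw :abort` cancels it.
  def commit
    catch(:abort) do
      begin
        parent = @version
        proposed = yield @value
      end until apply(parent, proposed)
    end
  end

  # Simulates another writer committing while our block is running.
  def interfere!
    @value += 100
    @version += 1
  end

  private

  # Succeeds only if nobody committed since `parent` was read.
  def apply(parent, proposed)
    return false unless parent == @version

    @value = proposed
    @version += 1
    true
  end
end

store = RetryStore.new
attempts = 0

store.commit do |value|
  attempts += 1
  store.interfere! if attempts == 1 # force exactly one conflict
  value + 1
end

puts attempts    # 2: the block ran again after the conflict
puts store.value # 101: the increment was applied on top of the other write

store.commit { |_value| throw :abort }
puts store.value # still 101: the aborted block committed nothing
```

One consequence worth keeping in mind when writing commit blocks: since the block may run more than once, side effects inside it (logging, counters, external calls) can repeat.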