rndb 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/LICENSE ADDED
@@ -0,0 +1,24 @@
1
+ This is free and unencumbered software released into the public domain.
2
+
3
+ Anyone is free to copy, modify, publish, use, compile, sell, or
4
+ distribute this software, either in source code form or as a compiled
5
+ binary, for any purpose, commercial or non-commercial, and by any
6
+ means.
7
+
8
+ In jurisdictions that recognize copyright laws, the author or authors
9
+ of this software dedicate any and all copyright interest in the
10
+ software to the public domain. We make this dedication for the benefit
11
+ of the public at large and to the detriment of our heirs and
12
+ successors. We intend this dedication to be an overt act of
13
+ relinquishment in perpetuity of all present and future rights to this
14
+ software under copyright law.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
19
+ IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
20
+ OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
21
+ ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
22
+ OTHER DEALINGS IN THE SOFTWARE.
23
+
24
+ For more information, please refer to <https://unlicense.org>
data/README.md ADDED
@@ -0,0 +1,96 @@
1
+ [![test](https://github.com/kranzky/rndb/actions/workflows/test.yml/badge.svg)](https://github.com/kranzky/rndb/actions/workflows/test.yml)
2
+ [![Coverage Status](https://coveralls.io/repos/github/kranzky/rndb/badge.svg?branch=main)](https://coveralls.io/github/kranzky/rndb?branch=main)
3
+
4
+ # RnDB
5
+
6
+ RnDB is a procedurally-generated fake database.
7
+
8
+ ## Usage
9
+
10
+ First, create tables with columns that may have a pre-determined distribution of
11
+ values (which can be queried on), or which may have a lambda for generating a
12
+ random value, or both (such as for the `weight` column below).
13
+
14
+ ```
15
+ class Widget < RnDB::Table
16
+ column :colour, { red: 0.3, green: 0.12, brown: 0.01, blue: 0.5, orange: 0.07 }
17
+ column :weight, { light: 0.3, medium: 0.64, heavy: 0.06 }, -> value do
18
+ range =
19
+ case value
20
+ when :light
21
+ (0.0..5.0)
22
+ when :medium
23
+ (6.0..9.0)
24
+ when :heavy
25
+ (10.0..20.0)
26
+ end
27
+ self.rand(range)
28
+ end
29
+ column :name, -> { Faker::Games::Pokemon.name }
30
+ end
31
+ ```
32
+
33
+ Next, create a database with an optional random seed (`137` in the example
34
+ below), and add the table to the database, specifying the number of records to
35
+ simulate (in this case, one trillion).
36
+
37
+ ```
38
+ DB = RnDB::Database.new(137)
39
+ DB.add_table(Widget, 1e12)
40
+ ```
41
+
42
+ Finally, fetch some records!
43
+
44
+ ```
45
+ puts Widget.count
46
+ puts Widget[1234567890].name
47
+ puts Widget.find { |widget| (3.1415..3.1416).include?(widget.weight) }.attributes
48
+ ```
49
+
50
+ Which will display the following:
51
+
52
+ ```
53
+ 1000000000000
54
+ Charmander
55
+ {:id=>61520, :weight=>3.1415121332762386, :colour=>:red, :name=>"Exeggcute"}
56
+ ```
57
+
58
+ Note that the `find` command tested over sixty thousand records in just a second
59
+ or two without needing to generate all attributes of each record first. But an
60
+ even faster way of homing in on a particular record is to run a query, such as:
61
+
62
+ ```
63
+ query = Widget.where(colour: [:brown, :orange], weight: :heavy)
64
+ ```
65
+
66
+ You can then retrieve random records that match the query with `sample`, use
67
+ `pluck` to retrieve specific attributes without generating all of them, and use
68
+ `find` or `filter` to further refine your search, like this:
69
+
70
+ ```
71
+ puts query.count
72
+ puts query.sample.pluck(:colour, :weight)
73
+ puts query.lazy.filter { |ball| ball.name == 'Pikachu' }.map(&:id).take(10).to_a
74
+ ```
75
+
76
+ Which will display the following:
77
+
78
+ ```
79
+ 4800000000
80
+ {:colour=>:orange, :weight=>16.096085279047017}
81
+ [429400000068, 429400000087, 429400000875, 429400000885, 429400000914, 429400001036, 429400001062, 429400001330, 429400001341, 429400001438]
82
+ ```
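The count of 4800000000 can be checked by hand from the probabilities in the `Widget` example. The sketch below assumes the `colour` and `weight` columns are distributed independently of each other, which is consistent with the output shown but is not confirmed by the code in this diff. `expected_count` is a hypothetical helper for illustration, not part of RnDB.

```ruby
# Probabilities copied from the Widget example in the README.
colour = { red: 0.3, green: 0.12, brown: 0.01, blue: 0.5, orange: 0.07 }
weight = { light: 0.3, medium: 0.64, heavy: 0.06 }

# Hypothetical helper: expected number of matching records, ASSUMING the
# constrained columns are independent. Each element of `selections` is the
# list of probabilities accepted for one column. Rationals built from the
# decimal strings avoid floating-point rounding.
def expected_count(size, *selections)
  fraction = selections.reduce(Rational(1)) do |product, probabilities|
    product * probabilities.sum { |p| Rational(p.to_s) }
  end
  (size * fraction).to_i
end

matches = expected_count(
  10**12,
  colour.values_at(:brown, :orange), # 0.01 + 0.07 = 0.08
  weight.values_at(:heavy)           # 0.06
)
puts matches # 4800000000
```

In other words, 0.08 × 0.06 × 10^12 = 4.8 × 10^9, which matches the value printed by `query.count`.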
83
+
84
+ Note that we used the `lazy` enumerator when filtering records to prevent
85
+ running the block on all records before performing the `map` and taking the
86
+ first ten results.
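The effect of `lazy` can be seen with plain Ruby enumerators, independent of RnDB. In this minimal sketch, `first_matches` is a hypothetical helper that counts how often the filter block runs; the range and the filter condition are made up for illustration.

```ruby
# Hypothetical helper: returns the first `limit` results together with the
# number of times the filter block was actually called.
def first_matches(limit)
  calls = 0
  multiples = (0...1_000_000).lazy.select do |id|
    calls += 1
    (id % 7).zero?
  end
  # `first` pulls elements through the lazy chain one at a time and stops
  # as soon as `limit` results have been produced.
  results = multiples.map { |id| id * 2 }.first(limit)
  [results, calls]
end

results, calls = first_matches(3)
p results  # [0, 14, 28]
puts calls # 15
```

Only the IDs 0 through 14 are tested. Without `lazy`, `select` would run the block on all one million IDs before `map` and `first` were applied.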
87
+
88
+ ## Release Process
89
+
90
+ 1. `rake version:bump:whatever`
91
+ 2. `rake git:release BRANCH=main`
92
+ 3. Create new release on GitHub to trigger ship workflow
93
+
94
+ ## Copyright
95
+
96
+ Copyright (c) 2021 Jason Hutchens. See LICENSE for further details.
data/Rakefile ADDED
@@ -0,0 +1,29 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'rubygems'
4
+ require 'bundler'
5
+ begin
6
+ Bundler.setup(:default, :development)
7
+ rescue Bundler::BundlerError => e
8
+ warn e.message
9
+ warn "Run `bundle install` to install missing gems"
10
+ exit e.status_code
11
+ end
12
+ require 'rake'
13
+ require 'juwelier'
14
+ Juwelier::Tasks.new do |gem|
15
+ gem.name = "rndb"
16
+ gem.homepage = "https://github.com/kranzky/rndb"
17
+ gem.license = "Unlicense"
18
+ gem.summary = "RnDB is a procedurally-generated mock database."
19
+ gem.description = ""
20
+ gem.email = "lloyd@kranzky.com"
21
+ gem.authors = ["Lloyd Kranzky"]
22
+ gem.required_ruby_version = ">= 2.1"
23
+ end
24
+ Juwelier::RubygemsDotOrgTasks.new
25
+
26
+ require 'yard'
27
+ YARD::Rake::YardocTask.new
28
+
29
+ task default: :clean
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.1.1
data/lib/rndb.rb ADDED
@@ -0,0 +1,7 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'rndb/database'
4
+ require 'rndb/slice'
5
+ require 'rndb/thicket'
6
+ require 'rndb/table'
7
+ require 'rndb/query'
data/lib/rndb/database.rb ADDED
@@ -0,0 +1,26 @@
1
+ # frozen_string_literal: true
2
+
3
+ module RnDB
4
+ class Database
5
+ attr_accessor :prng
6
+ attr_reader :seed
7
+
8
+ # Opens a new fake database. A seed for the PRNG may be optionally supplied.
9
+ def initialize(seed=Time.now.to_i)
10
+ raise "database already open" unless Thread.current[:rndb_database].nil?
11
+ Thread.current[:rndb_database] = self
12
+ @prng = Random
13
+ @seed = seed
14
+ end
15
+
16
+ # Add a Table to the database, specifying the number of records to simulate.
17
+ def add_table(klass, size)
18
+ klass.send(:_migrate, size.to_i)
19
+ end
20
+
21
+ # Dump the table schemas as a hash.
22
+ def schema
23
+ Thread.current[:rndb_tables]
24
+ end
25
+ end
26
+ end
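The database seed stored here is what makes generated values reproducible: `_seed_prng` in `table.rb` hashes the seed together with the table name, attribute, and record ID, and reseeds the PRNG with the result before a generator lambda runs. The sketch below follows the same steps but, as a simplification, builds a fresh `Random` instance instead of reseeding the shared `Random` class. `prng_for` is a hypothetical helper, and the values it yields are not claimed to match RnDB's own output.

```ruby
require 'digest'

# Hypothetical helper mirroring the steps in _seed_prng: join the tuple,
# hash it with SHA-256, reduce the digest to 64 bits, and seed a PRNG.
# Unlike the gem, this returns a new Random instance rather than calling
# srand on a shared generator.
def prng_for(seed, table, attribute, id)
  tuple = [seed, table, attribute, id].join('-')
  digest = Digest::SHA256.hexdigest(tuple)
  Random.new(digest.to_i(16) % 2**64)
end

first = prng_for(137, :widget, :weight, 61_520).rand(0.0..5.0)
again = prng_for(137, :widget, :weight, 61_520).rand(0.0..5.0)
puts first == again # true: the same tuple always yields the same value
```

Because the seed depends only on the tuple, any attribute of any record can be regenerated on demand without storing it or generating its neighbours first.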
data/lib/rndb/query.rb ADDED
@@ -0,0 +1,57 @@
1
+ # frozen_string_literal: true
2
+
3
+ module RnDB
4
+ class Query
5
+ include Enumerable
6
+
7
+ # Query records of the given table based on the IDs in the supplied Thicket.
8
+ def initialize(table, ids)
9
+ @table, @ids = table, ids
10
+ end
11
+
12
+ # Delegate counting to the Thicket.
13
+ def count
14
+ @ids.count
15
+ end
16
+
17
+ # Retrieve the ID of an index into this query and use it to instantiate a record.
18
+ def [](index)
19
+ @table[@ids[index]]
20
+ end
21
+
22
+ # Implemented to be consistent with #first, which we get by magic.
23
+ def last
24
+ self[-1] unless count.zero?
25
+ end
26
+
27
+ # Delegate iteration to the Thicket, yielding records to the caller.
28
+ def each
29
+ @ids.each { |id| yield @table[id] }
30
+ end
31
+
32
+ # Return an array or a hash of plucked values, avoiding generation of all attributes.
33
+ def pluck(*args)
34
+ @ids.map do |id|
35
+ if args.count == 1
36
+ @table.value(id, args.first)
37
+ else
38
+ args.map do |attribute|
39
+ [attribute, @table.value(id, attribute)]
40
+ end.to_h
41
+ end
42
+ end
43
+ end
44
+
45
+ # Return a new query that takes a random sampling of IDs from the current query.
46
+ def sample(limit=1)
47
+ _db.prng.srand
48
+ self.class.new(@table, @ids.sample(limit, _db.prng))
49
+ end
50
+
51
+ private
52
+
53
+ def _db
54
+ Thread.current[:rndb_database]
55
+ end
56
+ end
57
+ end
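The return shape of `pluck` depends on how many attributes are requested: a single attribute gives a flat array of values, while several attributes give an array of hashes. The following standalone sketch reproduces that branching. `fake_value` is a made-up stand-in for `Table.value`, and `pluck_sketch` is a hypothetical name.

```ruby
# Made-up stand-in for Table.value: any deterministic function of the
# record ID and attribute name will do for illustration.
def fake_value(id, attribute)
  attribute == :id ? id : "#{attribute}-#{id}"
end

# Same branching as Query#pluck, applied to a plain array of IDs.
def pluck_sketch(ids, *attributes)
  ids.map do |id|
    if attributes.count == 1
      fake_value(id, attributes.first)
    else
      attributes.map { |attribute| [attribute, fake_value(id, attribute)] }.to_h
    end
  end
end

p pluck_sketch([1, 2], :name)
# => ["name-1", "name-2"]
p pluck_sketch([1, 2], :id, :name)
# => [{ id: 1, name: "name-1" }, { id: 2, name: "name-2" }]
```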
data/lib/rndb/slice.rb ADDED
@@ -0,0 +1,26 @@
1
+ # frozen_string_literal: true
2
+
3
+ module RnDB
4
+ class Slice < Range
5
+ # A range that knows how to sort and intersect itself, private to Thickets.
6
+ def initialize(min, max)
7
+ super(min.to_i, max.to_i)
8
+ end
9
+
10
+ # Just in case the Range implementation is inefficient.
11
+ def count
12
+ max - min + 1
13
+ end
14
+
15
+ # Because Slices in a Thicket are disjoint, we can sort by min or max.
16
+ def <=>(other)
17
+ min <=> other.min
18
+ end
19
+
20
+ # We need to intersect slices when processing query constraints.
21
+ def &(other)
22
+ return nil if min > other.max || max < other.min
23
+ self.class.new([min, other.min].max, [max, other.max].min)
24
+ end
25
+ end
26
+ end
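The intersection logic in `Slice#&` can be tried out on plain inclusive integer ranges. This is a minimal sketch rather than the gem's class: `intersect` is a hypothetical free-standing helper that applies the same overlap test and bounds.

```ruby
# Hypothetical helper: intersect two inclusive integer ranges using the
# same rule as Slice#&. Returns nil when the ranges do not overlap.
def intersect(a, b)
  return nil if a.min > b.max || a.max < b.min

  ([a.min, b.min].max..[a.max, b.max].min)
end

p intersect(0..9, 5..14) # 5..9
p intersect(0..4, 5..9)  # nil
p intersect(3..3, 0..9)  # 3..3

# The size of an inclusive integer range is max - min + 1, as in Slice#count.
overlap = intersect(0..9, 5..14)
puts overlap.max - overlap.min + 1 # 5
```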
data/lib/rndb/table.rb ADDED
@@ -0,0 +1,220 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'digest'
4
+ require 'byebug'
5
+
6
+ module RnDB
7
+ class Table
8
+ attr_reader :id
9
+
10
+ # Create a new record with the given ID.
11
+ def initialize(id)
12
+ _validate!
13
+ @id = id
14
+ end
15
+
16
+ # Generate all attributes, which may be expensive.
17
+ def attributes
18
+ _generate_all
19
+ end
20
+
21
+ # Return the attributes as a hash.
22
+ def to_h
23
+ attributes
24
+ end
25
+
26
+ # Return a stringified version of the attributes hash.
27
+ def to_s
28
+ to_h.to_s
29
+ end
30
+
31
+ private
32
+
33
+ def _generate_all
34
+ _schema[:columns].each_key do |name|
35
+ _generate_column(name)
36
+ end
37
+ @_attributes
38
+ end
39
+
40
+ def _generate_column(name)
41
+ @_attributes ||= { id: @id }
42
+ @_attributes[name] ||= self.class.value(@id, name)
43
+ end
44
+
45
+ def _validate!
46
+ self.class.send(:_validate!)
47
+ end
48
+
49
+ def _schema
50
+ self.class.send(:_schema)
51
+ end
52
+
53
+ class << self
54
+ include Enumerable
55
+
56
+ # Return the name of the table, which is derived from the class name.
57
+ def table_name
58
+ name.downcase.to_sym
59
+ end
60
+
61
+ # Return a new record corresponding to the specified index.
62
+ def [](index)
63
+ _validate!
64
+ new(index) if index < count
65
+ end
66
+
67
+ # Return a Query that matches the supplied constraints
68
+ def where(constraints={})
69
+ _validate!
70
+ ids = Thicket.new(0..._schema[:size])
71
+ constraints.each do |attribute, values|
72
+ column = _schema[:columns][attribute]
73
+ other = Array(values).reduce(Thicket.new) do |thicket, value|
74
+ thicket | column[:mapping][value]
75
+ end
76
+ ids &= other
77
+ end
78
+ Query.new(self, ids)
79
+ end
80
+
81
+ # Return all records.
82
+ def all
83
+ where
84
+ end
85
+
86
+ # Count all records, delegating this to the all Query.
87
+ def count
88
+ all.count
89
+ end
90
+
91
+ # Return the last record, to be consistent with #first, which we get by magic.
92
+ def last
93
+ all.last
94
+ end
95
+
96
+ # Iterate over all records, delegating this to the all Query
97
+ def each(&block)
98
+ all.each(&block)
99
+ end
100
+
101
+ # Pluck specified attributes from all records, delegating this to the all query.
102
+ def pluck(*args)
103
+ all.pluck(*args)
104
+ end
105
+
106
+ # Return a Query that contains a random sampling of records.
107
+ def sample(limit=1)
108
+ all.sample(limit)
109
+ end
110
+
111
+ # Add a new column to the Table model.
112
+ def column(attribute, *args)
113
+ args.each do |arg|
114
+ index =
115
+ case arg
116
+ when Hash
117
+ :distribution
118
+ when Proc
119
+ :generator
120
+ else
121
+ raise "unsupported column parameter"
122
+ end
123
+ _schema[:columns][attribute][index] = arg
124
+ end
125
+ define_method(attribute) do
126
+ _generate_column(attribute)
127
+ end
128
+ end
129
+
130
+ # Generate a random number, intended to be used in lambdas. The number
131
+ # will have been seeded appropriately to ensure determinism.
132
+ def rand(args)
133
+ _validate!
134
+ _db.prng.rand(args)
135
+ end
136
+
137
+ # Retrieve the value of the given attribute for the given ID.
138
+ def value(id, attribute)
139
+ _validate!
140
+ return id if attribute == :id
141
+ column = _schema[:columns][attribute]
142
+ value =
143
+ unless column[:distribution].nil?
144
+ column[:mapping].find do |_, ids|
145
+ ids.include?(id)
146
+ end&.first
147
+ end
148
+ unless column[:generator].nil?
149
+ _seed_prng(id, attribute)
150
+ value =
151
+ if column[:distribution].nil?
152
+ column[:generator].call
153
+ else
154
+ column[:generator].call(value)
155
+ end
156
+ end
157
+ value
158
+ end
159
+
160
+ private
161
+
162
+ def _db
163
+ Thread.current[:rndb_database]
164
+ end
165
+
166
+ def _schema
167
+ Thread.current[:rndb_tables] ||= Hash.new do |tables, name|
168
+ tables[name] = {
169
+ class: nil,
170
+ size: 0,
171
+ columns: Hash.new do |columns, key|
172
+ columns[key] = {
173
+ distribution: nil,
174
+ mapping: {},
175
+ generator: nil
176
+ }
177
+ end
178
+ }
179
+ end
180
+ Thread.current[:rndb_tables][table_name]
181
+ end
182
+
183
+ def _migrate(size)
184
+ raise "table already migrated" unless _schema[:class].nil?
185
+ ids = Thicket.new(0...size)
186
+ _schema[:columns].each_value do |column|
187
+ distribution = column[:distribution]
188
+ next if distribution.nil?
189
+ raise "distribution must sum to unity" unless distribution.values.sum == 1
190
+ min = 0.0
191
+ column[:distribution].each do |value, probability|
192
+ max = min + probability
193
+ column[:mapping][value] = ids * (min..max)
194
+ min = max
195
+ end
196
+ ids =
197
+ column[:mapping].values.reduce(Thicket.new) do |thicket, other|
198
+ thicket | other
199
+ end
200
+ end
201
+ _schema[:size] = size
202
+ _schema[:class] = self
203
+ end
204
+
205
+ def _seed_prng(id, attribute)
206
+ tuple = [_db.seed, table_name, attribute, id].join('-')
207
+ digest = Digest::SHA256.hexdigest(tuple)
208
+ value = digest.to_i(16) % 18_446_744_073_709_551_616
209
+ _db.prng.srand(value)
210
+ Faker::Config.random = _db.prng
211
+ value
212
+ end
213
+
214
+ def _validate!
215
+ @valid ||= (self == _schema[:class])
216
+ raise "table not added to database" unless @valid
217
+ end
218
+ end
219
+ end
220
+ end
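`_migrate` turns each column's distribution into a mapping from value to a set of record IDs, and `value` looks an ID up in that mapping. The `Thicket` class that performs the actual splitting is not included in this diff, so the sketch below is only an approximation for a single column: it assumes the ID space is cut into one contiguous range per value, sized by cumulative probability. `partition` is a hypothetical helper.

```ruby
# Hypothetical helper: split the IDs 0...size into one contiguous range per
# value, in proportion to each probability. ASSUMPTION: this models only a
# single column; how Thicket subdivides IDs across several columns is not
# shown in this diff.
def partition(size, distribution)
  lower = Rational(0)
  distribution.each_with_object({}) do |(value, probability), mapping|
    upper = lower + Rational(probability.to_s)
    mapping[value] = ((lower * size).to_i...(upper * size).to_i)
    lower = upper
  end
end

mapping = partition(100, { light: 0.3, medium: 0.64, heavy: 0.06 })
p mapping[:light]  # 0...30
p mapping[:medium] # 30...94
p mapping[:heavy]  # 94...100

# Looking up the value for an ID works like the `find` in Table.value.
p mapping.find { |_, ids| ids.include?(95) }&.first # :heavy
```

Because the mapping stores ranges rather than individual IDs, counting or constraining a trillion simulated records only needs arithmetic on range bounds.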