bmg 0.22.0 → 0.23.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: be420623e04afe03570a1756d1607a202eb96481
4
- data.tar.gz: 150b4f3f4daced5825e661765ed4530455263422
3
+ metadata.gz: 542c4218634ee7ae5b224400ee07ec6d2998473a
4
+ data.tar.gz: 37eddfc05f9fcfca90e96c50fc6628a09cdf8742
5
5
  SHA512:
6
- metadata.gz: 5b2d86803d1bfedfa5605f158a34c744e8b0fa77ed25314e38a28dac6bdef1512bc51a3398cfcd55df42fb60354880b4378b49d339682724cd3da42ef282ef27
7
- data.tar.gz: 7321a2b290d2852ff4e2fe52b915e190d3fb13143807e8ac062928c8745a330d70c43fcbbf4abb545507cc8d2b9a05167502c44c7b6d5310c24b2e781920f827
6
+ metadata.gz: 9c5009a14fa10be21fc6d76e23fb47a8e0e69e9dd300478822c6bc9758230ff593cb0a3783b94b0a64c4afbd82456b362ee081a02e4a049c65759c9699a3f074
7
+ data.tar.gz: 5d73d4dad1eb72f3794cbcfddab422e01a197d9c128f6991243cf7d381e41f9ae0cf255f03aff5eb23ac3d4375de1da1345503e086ff01253dd6e636497727b4
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
- # Bmg, a relational algebra (Alf's successor)!
1
+ # Bmg, a relational algebra
2
2
 
3
- Bmg is a relational algebra implemented as a Ruby library. It implements the
3
+ Bmg is a [relational algebra](https://www.relational-algebra.dev/) implemented as a Ruby library. It implements the
4
4
  [Relation as First-Class Citizen](http://www.try-alf.org/blog/2013-10-21-relations-as-first-class-citizen)
5
5
  paradigm contributed with [Alf](http://www.try-alf.org/) a few years ago.
6
6
 
@@ -9,16 +9,24 @@ and any data source that can be seen as serving relations. Cross data-sources
9
9
  joins are supported, as with Alf. For differences with Alf, see a section
10
10
  further down this README.
11
11
 
12
+ ## Links
13
+
14
+ * Documentation can be found at https://www.relational-algebra.dev/
15
+ * Contribute to that documentation on github: https://github.com/enspirit/bmg-website
16
+
12
17
  ## Outline
13
18
 
14
19
  * [Example](#example)
15
20
  * [Where are base relations coming from?](#where-are-base-relations-coming-from)
16
21
  * [Memory relations](#memory-relations)
17
22
  * [Connecting to SQL databases](#connecting-to-sql-databases)
18
- * [Reading files (csv, Excel, text)](#reading-files-csv-excel-text)
23
+ * [Reading data files](#reading-data-files-json-csv-yaml-text-xls--xlsx)
19
24
  * [Connecting to Redis databases](#connecting-to-redis-databases)
20
25
  * [Your own relations](#your-own-relations)
26
+ * [The Database abstraction](#the-database-abstraction)
21
27
  * [List of supported operators](#supported-operators)
28
+ * [List of supported predicates](#supported-predicates)
29
+ * [List of supported summaries](#supported-summaries)
22
30
  * [How is this different?](#how-is-this-different)
23
31
  * [... from similar libraries](#-from-similar-libraries)
24
32
  * [... from Alf](#-from-alf)
@@ -117,33 +125,38 @@ Bmg.sequel(:suppliers, sequel_db)
117
125
  # {:array=>false})
118
126
  ```
119
127
 
120
- ### Reading files (csv, Excel, text)
128
+ ### Reading data files (json, csv, yaml, text, xls & xlsx)
121
129
 
122
130
  Bmg provides simple adapters to read files and reach Relationland as soon as
123
131
  possible.
124
132
 
125
- #### CSV files
133
+ #### JSON files
126
134
 
127
135
  ```ruby
128
- csv_options = { col_sep: ",", quote_char: '"' }
129
- r = Bmg.csv("path/to/a/file.csv", csv_options)
136
+ r = Bmg.json("path/to/a/file.json")
130
137
  ```
131
138
 
132
- Options are directly transmitted to `::CSV.new`, check Ruby's standard
133
- library.
139
+ The json file is expected to contain tuples of the same heading.
134
140
 
135
- #### Excel files
141
+ #### YAML files
136
142
 
137
- You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
138
- read `.xls` and `.xlsx` files with Bmg.
143
+ ```ruby
144
+ r = Bmg.yaml("path/to/a/file.yaml")
145
+ ```
146
+
147
+ The yaml file is expected to contain tuples of the same heading.
148
+
149
+ #### CSV files
139
150
 
140
151
  ```ruby
141
- roo_options = { skip: 1 }
142
- r = Bmg.excel("path/to/a/file.xls", roo_options)
152
+ csv_options = { col_sep: ",", quote_char: '"' }
153
+ r = Bmg.csv("path/to/a/file.csv", csv_options)
143
154
  ```
144
155
 
145
- Options are directly transmitted to `Roo::Spreadsheet.open`, check roo's
146
- documentation.
156
+ Options are directly transmitted to `::CSV.new`, check Ruby's standard
157
+ library. If you don't provide them, `Bmg` uses `headers: true` (hence making
158
+ the assumption that attribute names are provided on the first line), and makes a
159
+ best effort to infer the column separator.
147
160
 
148
161
  #### Text files
149
162
 
@@ -173,6 +186,19 @@ r.type.attrlist
173
186
  In this scenario, non matching lines are skipped. The `:line` attribute keeps
174
187
  being used to have at least one candidate key (so to speak).
175
188
 
189
+ #### Excel files
190
+
191
+ You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
192
+ read `.xls` and `.xlsx` files with Bmg.
193
+
194
+ ```ruby
195
+ roo_options = { skip: 1 }
196
+ r = Bmg.excel("path/to/a/file.xls", roo_options)
197
+ ```
198
+
199
+ Options are directly transmitted to `Roo::Spreadsheet.open`, check roo's
200
+ documentation.
201
+
176
202
  ### Connecting to Redis databases
177
203
 
178
204
  Bmg currently requires `bmg-redis` and `redis >= 4.6` to connect
@@ -240,6 +266,58 @@ restrictions down the tree) by overriding the underscored version of operators
240
266
  Have a look at `Bmg::Algebra` for the protocol and `Bmg::Sql::Relation` for an
241
267
  example. Keep in touch with the team if you need some help.
242
268
 
269
+ ## The Database abstraction
270
+
271
+ The previous section focused on obtaining *relations*. In practice you frequently
272
+ have a collection of relations hence a *database*:
273
+
274
+ * A SQL database with multiple tables
275
+ * A list of data files, all in the same folder
276
+ * An excel file with various sheets
277
+
278
+ Bmg supports a simple Database abstraction that serves those relations "by name",
279
+ in a simple way. A database can also be easily dumped back to a data folder of
280
+ json or csv files, or as simple xlsx files with multiple sheets.
281
+
282
+ ### Connecting to a SQL Database
283
+
284
+ For a SQL database, connected with Sequel:
285
+
286
+ ```
287
+ db = Bmg::Database.sequel(Sequel.connect('...'))
288
+ db.suppliers # yields a Bmg::Relation over the `suppliers` table
289
+ ```
290
+
291
+ ### Connecting to data files in the same folder
292
+
293
+ Data files all in the same folder can be seen as a very basic form of database,
294
+ and served as such. Bmg supports `json`, `csv` and `yaml` files:
295
+
296
+ ```
297
+ db = Bmg::Database.data_folder('./my-database')
298
+ db.suppliers # yields a Bmg::Relation over the `suppliers.(json,csv,yml)` file
299
+ ```
300
+
301
+ Bmg supports files in different formats in the same folder. When files with the
302
+ same basename exist, json is preferred over yaml, which is preferred over csv.
303
+
304
+ ### Dumping a Database instance
305
+
306
+ As a data folder:
307
+
308
+ ```
309
+ db = Bmg::Database.sequel(Sequel.connect('...'))
310
+ db.to_data_folder('path/to/folder', :json)
311
+ ```
312
+
313
+ As an .xlsx file (any existing file will be erased, we don't support modifying
314
+ existing files):
315
+
316
+ ```
317
+ require 'bmg/xlsx'
318
+ db.to_xlsx('path/to/file.xlsx')
319
+ ```
320
+
243
321
  ## Supported operators
244
322
 
245
323
  ```ruby
@@ -283,6 +361,67 @@ r.unwrap(:a) # shortcut over unwrap([:a])
283
361
  r.where(predicate) # alias for restrict(predicate)
284
362
  ```
285
363
 
364
+ ## Supported Predicates
365
+
366
+ Usual operators are supported and map to their SQL equivalent as expected:
367
+
368
+ ```ruby
369
+ Predicate.eq # =
370
+ Predicate.neq # <>
371
+ Predicate.lt # <
372
+ Predicate.lte # <=
373
+ Predicate.gt # >
374
+ Predicate.gte # >=
375
+ Predicate.in # SQL's IN
376
+ Predicate.is_null # SQL's IS NULL
377
+ ```
378
+
379
+ See the [Predicate gem](https://github.com/enspirit/predicate) for a more
380
+ complete list.
381
+
382
+ Note: predicates that implement specific Ruby algorithms or patterns are
383
+ not compiled to SQL (and more generally not delegated to underlying database
384
+ servers).
385
+
386
+ ## Supported Summaries
387
+
388
+ The `summarize` operator receives a list of `attr: summarizer` pairs, e.g.
389
+
390
+ ```ruby
391
+ r.summarize([:city], {
392
+ how_many: :count, # same as how_many: Bmg::Summarizer.count
393
+ status: :max, # same as status: Bmg::Summarizer.max(:status)
394
+ min_status: Bmg::Summarizer.min(:status)
395
+ })
396
+ ```
397
+
398
+ The following summarizers are available and translated to SQL:
399
+
400
+ ```ruby
401
+ Bmg::Summarizer.count # count the number of tuples
402
+ Bmg::Summarizer.distinct(:a) # collect distinct values (as an array)
403
+ Bmg::Summarizer.distinct_count(:a) # count of distinct values
404
+ Bmg::Summarizer.min(:a) # min value for attribute :a
405
+ Bmg::Summarizer.max(:a) # max value
406
+ Bmg::Summarizer.sum(:a) # sum :a's values
407
+ Bmg::Summarizer.avg(:a) # average
408
+ ```
409
+
410
+ The following summarizers are implemented in Ruby (they are supported when
411
+ querying SQL databases, but not compiled to SQL):
412
+
413
+ ```ruby
414
+ Bmg::Summarizer.collect(:a) # collect :a's values (as an array)
415
+ Bmg::Summarizer.concat(:a, opts: { ... }) # concat :a's values (opts, e.g. {between: ','})
416
+ Bmg::Summarizer.first(:a, order: ...) # smallest seen :a's value according to a tuple ordering
417
+ Bmg::Summarizer.last(:a, order: ...) # largest seen :a's value according to a tuple ordering
418
+ Bmg::Summarizer.variance(:a) # variance
419
+ Bmg::Summarizer.stddev(:a) # standard deviation
420
+ Bmg::Summarizer.percentile(:a, nth) # (continuous) nth percentile
421
+ Bmg::Summarizer.percentile_disc(:a, nth) # discrete nth percentile
422
+ Bmg::Summarizer.value_by(:a, :by => :b) # { :b => :a } as a Hash
423
+ ```
424
+
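To make the `summarize` call from the Supported Summaries section concrete, here is a minimal plain-Ruby sketch of the grouping it describes, assuming the usual relational semantics (group by the listed attributes, then aggregate each group). It does not use Bmg: the `suppliers` data and the `summarize_by` helper are hypothetical and only illustrate the shape of the result one would expect.

```ruby
# Plain-Ruby illustration of "summarize by city" (no Bmg required).
# `summarize_by` is a hypothetical helper, not part of Bmg's API.
def summarize_by(tuples, by)
  tuples.group_by { |t| t.slice(*by) }.map do |key, group|
    statuses = group.map { |t| t[:status] }
    key.merge(
      how_many:   group.size,    # what `how_many: :count` stands for
      status:     statuses.max,  # what `status: :max` stands for
      min_status: statuses.min   # what `Bmg::Summarizer.min(:status)` stands for
    )
  end
end

# Hypothetical sample data.
suppliers = [
  { sid: "S1", city: "London", status: 20 },
  { sid: "S2", city: "Paris",  status: 10 },
  { sid: "S3", city: "Paris",  status: 30 },
]

result = summarize_by(suppliers, [:city])
result.each { |tuple| p tuple }
```

The sketch yields one tuple per distinct `city`, carrying the three aggregates. Summarizers the README lists as translated to SQL would presumably be computed by the database instead when the relation comes from Sequel.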
286
425
  ## How is this different?
287
426
 
288
427
  ### ... from similar libraries?
@@ -0,0 +1,67 @@
1
+ module Bmg
2
+ class Database
3
+ class DataFolder < Database
4
+
5
+ DEFAULT_OPTIONS = {
6
+ data_extensions: ['json', 'yml', 'yaml', 'csv']
7
+ }
8
+
9
+ def initialize(folder, options = {})
10
+ @folder = Path(folder)
11
+ @options = DEFAULT_OPTIONS.merge(options)
12
+ end
13
+
14
+ def method_missing(name, *args, &bl)
15
+ return super(name, *args, &bl) unless args.empty? && bl.nil?
16
+ raise NotSuchRelationError, name.to_s unless (file = find_file(name))
17
+ read_file(file)
18
+ end
19
+
20
+ def each_relation_pair
21
+ return to_enum(:each_relation_pair) unless block_given?
22
+
23
+ @folder.glob('*') do |path|
24
+ next unless path.file?
25
+ next unless @options[:data_extensions].find {|ext|
26
+ path.ext == ".#{ext}" || path.ext == ext
27
+ }
28
+ yield(path.basename.rm_ext.to_sym, read_file(path))
29
+ end
30
+ end
31
+
32
+ def self.dump(database, path, ext = :json)
33
+ path = Path(path)
34
+ path.mkdir_p
35
+ database.each_relation_pair do |name, rel|
36
+ (path/"#{name}.#{ext}").write(rel.public_send(:"to_#{ext}"))
37
+ end
38
+ path
39
+ end
40
+
41
+ private
42
+
43
+ def read_file(file)
44
+ case file.ext.to_s
45
+ when '.json'
46
+ Bmg.json(file)
47
+ when '.yaml', '.yml'
48
+ Bmg.yaml(file)
49
+ when '.csv'
50
+ Bmg.csv(file)
51
+ else
52
+ raise NotSupportedError, "Unable to use #{file} as a relation"
53
+ end
54
+ end
55
+
56
+ def find_file(name)
57
+ exts = @options[:data_extensions]
58
+ exts.each do |ext|
59
+ target = @folder/"#{name}.#{ext}"
60
+ return target if target.file?
61
+ end
62
+ raise NotSuchRelationError, "#{@folder}/#{name}.#{exts.join(',')}"
63
+ end
64
+
65
+ end # class DataFolder
66
+ end # class Database
67
+ end # module Bmg
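The lookup in `find_file` tries the configured extensions in order, which is what gives json priority over yaml and csv. The following is a minimal sketch of that idea using only Ruby's standard library (`Pathname` instead of the `Path` gem used in the diff). The `find_data_file` helper, the `DATA_EXTENSIONS` constant and the sample files are hypothetical.

```ruby
require "pathname"
require "tmpdir"

# Extensions are tried in order, so earlier entries win (json, then yaml, then csv).
DATA_EXTENSIONS = %w[json yml yaml csv].freeze

# Hypothetical stand-in for DataFolder#find_file; returns nil when nothing matches.
def find_data_file(folder, name, exts = DATA_EXTENSIONS)
  exts.each do |ext|
    target = Pathname(folder) / "#{name}.#{ext}"
    return target if target.file?
  end
  nil
end

found = nil
missing = :unset
Dir.mktmpdir do |dir|
  # Two files share the same basename; the json one should be selected.
  File.write(File.join(dir, "suppliers.csv"), "sid,name\nS1,Smith\n")
  File.write(File.join(dir, "suppliers.json"), '[{"sid":"S1","name":"Smith"}]')
  found = find_data_file(dir, :suppliers).basename.to_s
  missing = find_data_file(dir, :parts)
end
puts found
```

Unlike this sketch, the `find_file` method in the diff raises `NotSuchRelationError` itself when no candidate exists, rather than returning `nil`.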
@@ -0,0 +1,35 @@
1
+ module Bmg
2
+ class Database
3
+ class Sequel < Database
4
+
5
+ DEFAULT_OPTIONS = {
6
+ }
7
+
8
+ def initialize(sequel_db, options = {})
9
+ @sequel_db = sequel_db
10
+ @sequel_db = ::Sequel.connect(@sequel_db) unless @sequel_db.is_a?(::Sequel::Database)
11
+ end
12
+
13
+ def method_missing(name, *args, &bl)
14
+ return super(name, *args, &bl) unless args.empty? && bl.nil?
15
+ raise NotSuchRelationError, name.to_s unless @sequel_db.table_exists?(name)
16
+ rel_for(name)
17
+ end
18
+
19
+ def each_relation_pair
20
+ return to_enum(:each_relation_pair) unless block_given?
21
+
22
+ @sequel_db.tables.each do |table|
23
+ yield(table, rel_for(table))
24
+ end
25
+ end
26
+
27
+ protected
28
+
29
+ def rel_for(table_name)
30
+ Bmg.sequel(table_name, @sequel_db)
31
+ end
32
+
33
+ end # class Sequel
34
+ end # class Database
35
+ end # module Bmg
@@ -0,0 +1,41 @@
1
+ module Bmg
2
+ class Database
3
+ class Xlsx < Database
4
+
5
+ DEFAULT_OPTIONS = {
6
+ }
7
+
8
+ def initialize(path, options = {})
9
+ path = Path(path) if path.is_a?(String)
10
+ @path = path
11
+ @options = DEFAULT_OPTIONS.merge(options)
12
+ end
13
+
14
+ def method_missing(name, *args, &bl)
15
+ return super(name, *args, &bl) unless args.empty? && bl.nil?
16
+ rel = rel_for(name)
17
+ raise NotSuchRelationError, name.to_s unless rel
18
+ rel
19
+ end
20
+
21
+ def each_relation_pair
22
+ return to_enum(:each_relation_pair) unless block_given?
23
+
24
+ spreadsheet.sheets.each do |sheet_name|
25
+ yield(sheet_name.to_sym, rel_for(sheet_name))
26
+ end
27
+ end
28
+
29
+ protected
30
+
31
+ def spreadsheet
32
+ @spreadsheet ||= Roo::Spreadsheet.open(@path, @options)
33
+ end
34
+
35
+ def rel_for(sheet_name)
36
+ Bmg.excel(@path, { sheet: sheet_name.to_s })
37
+ end
38
+
39
+ end # class Xlsx
40
+ end # class Database
41
+ end # module Bmg
@@ -0,0 +1,35 @@
1
+ module Bmg
2
+ class Database
3
+
4
+ def self.data_folder(*args)
5
+ require_relative 'database/data_folder'
6
+ DataFolder.new(*args)
7
+ end
8
+
9
+ def self.sequel(*args)
10
+ require 'bmg/sequel'
11
+ require_relative 'database/sequel'
12
+ Sequel.new(*args)
13
+ end
14
+
15
+ def self.xlsx(*args)
16
+ require 'bmg/xlsx'
17
+ require_relative 'database/xlsx'
18
+ Xlsx.new(*args)
19
+ end
20
+
21
+ def to_xlsx(*args)
22
+ require 'bmg/xlsx'
23
+ Writer::Xlsx.to_xlsx(self, *args)
24
+ end
25
+
26
+ def to_data_folder(*args)
27
+ DataFolder.dump(self, *args)
28
+ end
29
+
30
+ def each_relation_pair
31
+ raise NotImplementedError
32
+ end
33
+
34
+ end # class Database
35
+ end # module Bmg
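All three `Database` subclasses serve relations "by name" through `method_missing` and enumerate them through `each_relation_pair`. The sketch below shows that pattern in isolation with an in-memory hash; `TinyDatabase` and `NoSuchRelation` are hypothetical names, not part of Bmg. It also raises with the `raise ErrorClass, message` form and defines `respond_to_missing?`, two choices made for this sketch rather than taken from the diff.

```ruby
# Hypothetical error class standing in for Bmg::NotSuchRelationError.
class NoSuchRelation < StandardError; end

# Hypothetical in-memory database: relation names map to arrays of tuples.
class TinyDatabase
  def initialize(relations)
    @relations = relations
  end

  # Serve relations "by name", e.g. db.suppliers.
  def method_missing(name, *args, &block)
    return super unless args.empty? && block.nil?
    raise NoSuchRelation, name.to_s unless @relations.key?(name)
    @relations[name]
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @relations.key?(name) || super
  end

  # Enumerate (name, relation) pairs, as the dump helpers in the diff expect.
  def each_relation_pair
    return to_enum(:each_relation_pair) unless block_given?
    @relations.each { |name, rel| yield(name, rel) }
  end
end

db = TinyDatabase.new(suppliers: [{ sid: "S1" }], parts: [{ pid: "P1" }])
names = db.each_relation_pair.map { |name, _rel| name }
p db.suppliers
p names
```

Because both dump targets only need `each_relation_pair`, any object exposing that method could in principle be written to a data folder or an xlsx file the same way.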
data/lib/bmg/error.rb CHANGED
@@ -23,4 +23,7 @@ module Bmg
23
23
  # to backtrack to something more ruby-native.
24
24
  class NotSupportedError < Error; end
25
25
 
26
+ # Raised when relation (variable) is not found
27
+ class NotSuchRelationError < Error; end
28
+
26
29
  end
@@ -1,80 +1 @@
1
- module Bmg
2
- module Reader
3
- class Excel
4
- include Reader
5
-
6
- DEFAULT_OPTIONS = {
7
- sheet: 0,
8
- skip: 0,
9
- row_num: true
10
- }
11
-
12
- def initialize(type, path, options = {})
13
- require 'roo'
14
- @path = path
15
- @options = DEFAULT_OPTIONS.merge(options)
16
- @type = type.knows_attrlist? ? type : type.with_attrlist(infer_attrlist)
17
- end
18
-
19
- def each
20
- return to_enum unless block_given?
21
-
22
- headers = type.attrlist
23
- headers = headers[1..-1] if generate_row_num?
24
- start_at = @options[:skip] + 2
25
- end_at = spreadsheet.last_row
26
- (start_at..end_at).each do |i|
27
- row = spreadsheet.row(i)
28
- init = init_tuple(i - start_at + 1)
29
- tuple = (0...headers.size).each_with_object(init){|i,t|
30
- t[headers[i]] = row[i]
31
- }
32
- yield(tuple)
33
- end
34
- end
35
-
36
- def to_ast
37
- [ :excel, @path, @options ]
38
- end
39
-
40
- def to_s
41
- "(excel #{@path})"
42
- end
43
- alias :inspect :to_s
44
-
45
- private
46
-
47
- def spreadsheet
48
- @spreadsheet ||= Roo::Spreadsheet
49
- .open(@path, @options)
50
- .sheet(@options[:sheet])
51
- end
52
-
53
- def infer_attrlist
54
- row = spreadsheet.row(1+@options[:skip])
55
- attrlist = row.map{|c| c.to_s.strip.to_sym }
56
- attrlist.unshift(row_num_name) if generate_row_num?
57
- attrlist
58
- end
59
-
60
- def generate_row_num?
61
- !!@options[:row_num]
62
- end
63
-
64
- def row_num_name
65
- case as = @options[:row_num]
66
- when TrueClass then :row_num
67
- when Symbol then as
68
- else nil
69
- end
70
- end
71
-
72
- def init_tuple(i)
73
- return {} unless generate_row_num?
74
-
75
- { row_num_name => i }
76
- end
77
-
78
- end # class Excel
79
- end # module Reader
80
- end # module Bmg
1
+ require_relative 'xlsx'
@@ -0,0 +1,80 @@
1
+ module Bmg
2
+ module Reader
3
+ class Excel
4
+ include Reader
5
+
6
+ DEFAULT_OPTIONS = {
7
+ sheet: 0,
8
+ skip: 0,
9
+ row_num: true
10
+ }
11
+
12
+ def initialize(type, path, options = {})
13
+ require 'roo'
14
+ @path = path
15
+ @options = DEFAULT_OPTIONS.merge(options)
16
+ @type = type.knows_attrlist? ? type : type.with_attrlist(infer_attrlist)
17
+ end
18
+
19
+ def each
20
+ return to_enum unless block_given?
21
+
22
+ headers = type.attrlist
23
+ headers = headers[1..-1] if generate_row_num?
24
+ start_at = @options[:skip] + 2
25
+ end_at = spreadsheet.last_row
26
+ (start_at..end_at).each do |i|
27
+ row = spreadsheet.row(i)
28
+ init = init_tuple(i - start_at + 1)
29
+ tuple = (0...headers.size).each_with_object(init){|i,t|
30
+ t[headers[i]] = row[i]
31
+ }
32
+ yield(tuple)
33
+ end
34
+ end
35
+
36
+ def to_ast
37
+ [ :excel, @path, @options ]
38
+ end
39
+
40
+ def to_s
41
+ "(excel #{@path})"
42
+ end
43
+ alias :inspect :to_s
44
+
45
+ private
46
+
47
+ def spreadsheet
48
+ @spreadsheet ||= Roo::Spreadsheet
49
+ .open(@path, @options)
50
+ .sheet(@options[:sheet])
51
+ end
52
+
53
+ def infer_attrlist
54
+ row = spreadsheet.row(1+@options[:skip])
55
+ attrlist = row.map{|c| c.to_s.strip.to_sym }
56
+ attrlist.unshift(row_num_name) if generate_row_num?
57
+ attrlist
58
+ end
59
+
60
+ def generate_row_num?
61
+ !!@options[:row_num]
62
+ end
63
+
64
+ def row_num_name
65
+ case as = @options[:row_num]
66
+ when TrueClass then :row_num
67
+ when Symbol then as
68
+ else nil
69
+ end
70
+ end
71
+
72
+ def init_tuple(i)
73
+ return {} unless generate_row_num?
74
+
75
+ { row_num_name => i }
76
+ end
77
+
78
+ end # class Excel
79
+ end # module Reader
80
+ end # module Bmg
data/lib/bmg/sequel.rb CHANGED
@@ -1,3 +1,4 @@
1
+ require 'bmg'
1
2
  require 'bmg/sql'
2
3
  require 'sequel'
3
4
  require 'predicate/sequel'
@@ -0,0 +1,82 @@
1
+ module Bmg
2
+ class Summarizer
3
+ #
4
+ # Bucketizer summarizer.
5
+ #
6
+ # Example:
7
+ #
8
+ # # direct ruby usage
9
+ # Bmg::Summarizer.bucketize(:qty, :size => 2).summarize(...)
10
+ #
11
+ class Bucketize < Summarizer
12
+
13
+ # Sets default options.
14
+ def default_options
15
+ { :size => 10 }
16
+ end
17
+
18
+ # Returns the initial memo (a pair of empty arrays)
19
+ def least()
20
+ [[], []]
21
+ end
22
+
23
+ # Appends val to the list of collected values
24
+ def _happens(memo, val)
25
+ memo.first << val
26
+ memo
27
+ end
28
+
29
+ # Finalizes computation
30
+ def finalize(memo)
31
+ buckets = compute_buckets(memo.first, options[:size])
32
+ buckets = touching_buckets(buckets) if options[:boundaries] == :touching
33
+ buckets
34
+ end
35
+
36
+ private
37
+
38
+ def compute_buckets(values, num_buckets = 10)
39
+ sorted_values = values.compact.sort
40
+ sorted_values = sorted_values.map{|v| v.to_s[0...options[:value_length]] } if options[:value_length]
41
+ sorted_values = sorted_values.uniq if options[:distinct]
42
+
43
+ # Calculate the size of each bucket
44
+ total_values = sorted_values.length
45
+ bucket_size = (total_values / num_buckets.to_f).ceil
46
+
47
+ # Create the ranges for each bucket
48
+ bucket_ranges = []
49
+ num_buckets.times do |i|
50
+ start_index = i * bucket_size
51
+ break if start_index >= total_values # Ensure we do not exceed the array bounds
52
+
53
+ end_index = [(start_index + bucket_size - 1), total_values - 1].min
54
+ start_value = sorted_values[start_index]
55
+ end_value = sorted_values[end_index]
56
+ bucket_ranges << (start_value..end_value)
57
+ end
58
+
59
+ bucket_ranges
60
+ end
61
+
62
+ def touching_buckets(buckets)
63
+ result = []
64
+ buckets.each do |b|
65
+ r_start = result.empty? ? b.begin : result.last.end
66
+ r_end = b.end
67
+ result << (r_start...r_end)
68
+ end
69
+ result[-1] = (result.last.begin..result.last.end)
70
+
71
+ result
72
+ end
73
+
74
+ end # class Bucketize
75
+
76
+ # Factors a bucketize summarizer
77
+ def self.bucketize(*args, &bl)
78
+ Bucketize.new(*args, &bl)
79
+ end
80
+
81
+ end # class Summarizer
82
+ end # module Bmg
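The bucketing logic above splits the sorted values into at most `size` ranges holding roughly the same number of values, and the `:touching` option rewrites them so that each bucket starts where the previous one ends. Here is a standalone sketch of both steps, reduced from the diff (the `value_length` and `distinct` options are left out, and a guard for empty input is added as an assumption). It runs without Bmg.

```ruby
# Equal-frequency bucketing, reduced from Bucketize#compute_buckets.
def compute_buckets(values, num_buckets = 10)
  sorted = values.compact.sort
  total = sorted.length
  bucket_size = (total / num_buckets.to_f).ceil

  ranges = []
  num_buckets.times do |i|
    start_index = i * bucket_size
    break if start_index >= total # no values left for further buckets

    end_index = [start_index + bucket_size - 1, total - 1].min
    ranges << (sorted[start_index]..sorted[end_index])
  end
  ranges
end

# Make consecutive buckets share their boundary: all but the last exclude their end.
def touching_buckets(buckets)
  return [] if buckets.empty? # guard added for this sketch

  result = []
  buckets.each do |b|
    r_start = result.empty? ? b.begin : result.last.end
    result << (r_start...b.end)
  end
  result[-1] = (result.last.begin..result.last.end)
  result
end

buckets  = compute_buckets((1..10).to_a, 3)
touching = touching_buckets(buckets)
p buckets
p touching
```

With ten values and three buckets, the bucket size is `ceil(10 / 3.0) = 4`, so the last bucket holds only the two remaining values.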
@@ -21,7 +21,7 @@ module Bmg
21
21
  end
22
22
 
23
23
  # Concatenates current memo with val.to_s
24
- def _happens(memo, val)
24
+ def _happens(memo, val)
25
25
  memo << options[:between].to_s unless memo.empty?
26
26
  memo << val.to_s
27
27
  end
@@ -172,3 +172,4 @@ require_relative 'summarizer/value_by'
172
172
  require_relative 'summarizer/positional'
173
173
  require_relative 'summarizer/first'
174
174
  require_relative 'summarizer/last'
175
+ require_relative 'summarizer/bucketize'
data/lib/bmg/version.rb CHANGED
@@ -1,7 +1,7 @@
1
1
  module Bmg
2
2
  module Version
3
3
  MAJOR = 0
4
- MINOR = 22
4
+ MINOR = 23
5
5
  TINY = 0
6
6
  end
7
7
  VERSION = "#{Version::MAJOR}.#{Version::MINOR}.#{Version::TINY}"
@@ -7,22 +7,36 @@ module Bmg
7
7
  }
8
8
 
9
9
  def initialize(xlsx_options, output_preferences = nil)
10
+ require 'write_xlsx'
10
11
  @xlsx_options = DEFAULT_OPTIONS.merge(xlsx_options)
11
12
  @output_preferences = OutputPreferences.dress(output_preferences)
12
13
  end
13
14
  attr_reader :xlsx_options, :output_preferences
14
15
 
15
16
  def call(relation, path)
16
- require 'write_xlsx'
17
17
  dup._call(relation, path)
18
18
  end
19
19
 
20
+ def self.to_xlsx(database, path)
21
+ require 'write_xlsx'
22
+ workbook = WriteXLSX.new(path)
23
+ database.each_relation_pair do |name, rel|
24
+ worksheet = workbook.add_worksheet(name)
25
+ rel.to_xlsx({
26
+ workbook: workbook,
27
+ worksheet: worksheet,
28
+ })
29
+ end
30
+ workbook.close
31
+ end
32
+
20
33
  protected
21
34
  attr_reader :workbook, :worksheet
22
35
 
23
36
  def _call(relation, path)
24
37
  @workbook = xlsx_options[:workbook] || WriteXLSX.new(path)
25
38
  @worksheet = xlsx_options[:worksheet] || workbook.add_worksheet
39
+ @worksheet = workbook.add_worksheet(@worksheet) if @worksheet.is_a?(String)
26
40
 
27
41
  headers = infer_headers(relation.type)
28
42
  before = nil
data/lib/bmg/xlsx.rb ADDED
@@ -0,0 +1,3 @@
1
+ require_relative 'reader/xlsx'
2
+ require_relative 'writer/xlsx'
3
+ require_relative 'database/xlsx'
data/lib/bmg.rb CHANGED
@@ -24,6 +24,16 @@ module Bmg
24
24
  end
25
25
  module_function :csv
26
26
 
27
+ def json(path, options = {}, type = Type::ANY)
28
+ in_memory(path.load.map{|tuple| TupleAlgebra.symbolize_keys(tuple) })
29
+ end
30
+ module_function :json
31
+
32
+ def yaml(path, options = {}, type = Type::ANY)
33
+ in_memory(path.load.map{|tuple| TupleAlgebra.symbolize_keys(tuple) })
34
+ end
35
+ module_function :yaml
36
+
27
37
  def excel(path, options = {}, type = Type::ANY)
28
38
  Reader::Excel.new(type, path, options).spied(main_spy)
29
39
  end
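Both new readers follow the same two steps: load the file into an array of hashes, then symbolize the keys of each tuple before handing it to `in_memory`. The diff calls `path.load`, which suggests a `Path` object is expected. The sketch below reproduces the two steps with the standard `json` and `yaml` libraries on inline strings; `symbolize_tuples` is a hypothetical helper standing in for `TupleAlgebra.symbolize_keys`.

```ruby
require "json"
require "yaml"

# Hypothetical helper mirroring the "load, then symbolize keys" step of the diff.
def symbolize_tuples(tuples)
  tuples.map { |tuple| tuple.transform_keys(&:to_sym) }
end

# Hypothetical sample documents: tuples of the same heading.
json_text = '[{"sid": "S1", "city": "London"}, {"sid": "S2", "city": "Paris"}]'
yaml_text = <<~YAML
  - sid: S1
    city: London
  - sid: S2
    city: Paris
YAML

from_json = symbolize_tuples(JSON.parse(json_text))
from_yaml = symbolize_tuples(YAML.safe_load(yaml_text))
p from_json
p from_json == from_yaml
```

Both formats end up as the same array of symbol-keyed hashes, which is the shape an in-memory relation is built from.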
@@ -57,6 +67,8 @@ module Bmg
57
67
  require_relative 'bmg/relation/materialized'
58
68
  require_relative 'bmg/relation/proxy'
59
69
 
70
+ require_relative 'bmg/database'
71
+
60
72
  # Deprecated
61
73
  Leaf = Relation::InMemory
62
74
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: bmg
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.22.0
4
+ version: 0.23.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Bernard Lambeau
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-05-17 00:00:00.000000000 Z
11
+ date: 2024-06-27 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: predicate
@@ -142,6 +142,10 @@ files:
142
142
  - lib/bmg.rb
143
143
  - lib/bmg/algebra.rb
144
144
  - lib/bmg/algebra/shortcuts.rb
145
+ - lib/bmg/database.rb
146
+ - lib/bmg/database/data_folder.rb
147
+ - lib/bmg/database/sequel.rb
148
+ - lib/bmg/database/xlsx.rb
145
149
  - lib/bmg/error.rb
146
150
  - lib/bmg/operator.rb
147
151
  - lib/bmg/operator/allbut.rb
@@ -172,6 +176,7 @@ files:
172
176
  - lib/bmg/reader/csv.rb
173
177
  - lib/bmg/reader/excel.rb
174
178
  - lib/bmg/reader/text_file.rb
179
+ - lib/bmg/reader/xlsx.rb
175
180
  - lib/bmg/relation.rb
176
181
  - lib/bmg/relation/empty.rb
177
182
  - lib/bmg/relation/in_memory.rb
@@ -272,6 +277,7 @@ files:
272
277
  - lib/bmg/sql/version.rb
273
278
  - lib/bmg/summarizer.rb
274
279
  - lib/bmg/summarizer/avg.rb
280
+ - lib/bmg/summarizer/bucketize.rb
275
281
  - lib/bmg/summarizer/by_proc.rb
276
282
  - lib/bmg/summarizer/collect.rb
277
283
  - lib/bmg/summarizer/concat.rb
@@ -300,6 +306,7 @@ files:
300
306
  - lib/bmg/writer.rb
301
307
  - lib/bmg/writer/csv.rb
302
308
  - lib/bmg/writer/xlsx.rb
309
+ - lib/bmg/xlsx.rb
303
310
  - tasks/gem.rake
304
311
  - tasks/test.rake
305
312
  homepage: http://github.com/enspirit/bmg