bmg 0.22.0 → 0.23.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: be420623e04afe03570a1756d1607a202eb96481
4
- data.tar.gz: 150b4f3f4daced5825e661765ed4530455263422
3
+ metadata.gz: 542c4218634ee7ae5b224400ee07ec6d2998473a
4
+ data.tar.gz: 37eddfc05f9fcfca90e96c50fc6628a09cdf8742
5
5
  SHA512:
6
- metadata.gz: 5b2d86803d1bfedfa5605f158a34c744e8b0fa77ed25314e38a28dac6bdef1512bc51a3398cfcd55df42fb60354880b4378b49d339682724cd3da42ef282ef27
7
- data.tar.gz: 7321a2b290d2852ff4e2fe52b915e190d3fb13143807e8ac062928c8745a330d70c43fcbbf4abb545507cc8d2b9a05167502c44c7b6d5310c24b2e781920f827
6
+ metadata.gz: 9c5009a14fa10be21fc6d76e23fb47a8e0e69e9dd300478822c6bc9758230ff593cb0a3783b94b0a64c4afbd82456b362ee081a02e4a049c65759c9699a3f074
7
+ data.tar.gz: 5d73d4dad1eb72f3794cbcfddab422e01a197d9c128f6991243cf7d381e41f9ae0cf255f03aff5eb23ac3d4375de1da1345503e086ff01253dd6e636497727b4
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
- # Bmg, a relational algebra (Alf's successor)!
1
+ # Bmg, a relational algebra
2
2
 
3
- Bmg is a relational algebra implemented as a Ruby library. It implements the
3
+ Bmg is a [relational algebra](https://www.relational-algebra.dev/) implemented as a Ruby library. It implements the
4
4
  [Relation as First-Class Citizen](http://www.try-alf.org/blog/2013-10-21-relations-as-first-class-citizen)
5
5
  paradigm contributed with [Alf](http://www.try-alf.org/) a few years ago.
6
6
 
@@ -9,16 +9,24 @@ and any data source that can be seen as serving relations. Cross data-sources
9
9
  joins are supported, as with Alf. For differences with Alf, see a section
10
10
  further down this README.
11
11
 
12
+ ## Links
13
+
14
+ * Documentation can be found at https://www.relational-algebra.dev/
15
+ * Contribute to that documentation on github: https://github.com/enspirit/bmg-website
16
+
12
17
  ## Outline
13
18
 
14
19
  * [Example](#example)
15
20
  * [Where are base relations coming from?](#where-are-base-relations-coming-from)
16
21
  * [Memory relations](#memory-relations)
17
22
  * [Connecting to SQL databases](#connecting-to-sql-databases)
18
- * [Reading files (csv, Excel, text)](#reading-files-csv-excel-text)
23
+ * [Reading data files](#reading-data-files-json-csv-yaml-text-xls--xlsx)
19
24
  * [Connecting to Redis databases](#connecting-to-redis-databases)
20
25
  * [Your own relations](#your-own-relations)
26
+ * [The Database abstraction](#the-database-abstraction)
21
27
  * [List of supported operators](#supported-operators)
28
+ * [List of supported predicates](#supported-predicates)
29
+ * [List of supported summaries](#supported-summaries)
22
30
  * [How is this different?](#how-is-this-different)
23
31
  * [... from similar libraries](#-from-similar-libraries)
24
32
  * [... from Alf](#-from-alf)
@@ -117,33 +125,38 @@ Bmg.sequel(:suppliers, sequel_db)
117
125
  # {:array=>false})
118
126
  ```
119
127
 
120
- ### Reading files (csv, Excel, text)
128
+ ### Reading data files (json, csv, yaml, text, xls & xlsx)
121
129
 
122
130
  Bmg provides simple adapters to read files and reach Relationland as soon as
123
131
  possible.
124
132
 
125
- #### CSV files
133
+ #### JSON files
126
134
 
127
135
  ```ruby
128
- csv_options = { col_sep: ",", quote_char: '"' }
129
- r = Bmg.csv("path/to/a/file.csv", csv_options)
136
+ r = Bmg.json("path/to/a/file.json")
130
137
  ```
131
138
 
132
- Options are directly transmitted to `::CSV.new`, check Ruby's standard
133
- library.
139
+ The JSON file is expected to contain tuples that all share the same heading.
134
140
 
135
- #### Excel files
141
+ #### YAML files
136
142
 
137
- You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
138
- read `.xls` and `.xlsx` files with Bmg.
143
+ ```ruby
144
+ r = Bmg.yaml("path/to/a/file.yaml")
145
+ ```
146
+
147
+ The YAML file is expected to contain tuples that all share the same heading.
148
+
149
+ #### CSV files
139
150
 
140
151
  ```ruby
141
- roo_options = { skip: 1 }
142
- r = Bmg.excel("path/to/a/file.xls", roo_options)
152
+ csv_options = { col_sep: ",", quote_char: '"' }
153
+ r = Bmg.csv("path/to/a/file.csv", csv_options)
143
154
  ```
144
155
 
145
- Options are directly transmitted to `Roo::Spreadsheet.open`, check roo's
146
- documentation.
156
+ Options are directly transmitted to `::CSV.new`, check Ruby's standard
157
+ library. If you don't provide them, `Bmg` uses `headers: true` (hence making
158
+ the assumption that attribute names are provided on the first line), and makes a
159
+ best effort to infer the column separator.
147
160
 
148
161
  #### Text files
149
162
 
@@ -173,6 +186,19 @@ r.type.attrlist
173
186
  In this scenario, non matching lines are skipped. The `:line` attribute keeps
174
187
  being used to have at least one candidate key (so to speak).
175
188
 
189
+ #### Excel files
190
+
191
+ You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
192
+ read `.xls` and `.xlsx` files with Bmg.
193
+
194
+ ```ruby
195
+ roo_options = { skip: 1 }
196
+ r = Bmg.excel("path/to/a/file.xls", roo_options)
197
+ ```
198
+
199
+ Options are directly transmitted to `Roo::Spreadsheet.open`, check roo's
200
+ documentation.
201
+
176
202
  ### Connecting to Redis databases
177
203
 
178
204
  Bmg currently requires `bmg-redis` and `redis >= 4.6` to connect
@@ -240,6 +266,58 @@ restrictions down the tree) by overriding the underscored version of operators
240
266
  Have a look at `Bmg::Algebra` for the protocol and `Bmg::Sql::Relation` for an
241
267
  example. Keep in touch with the team if you need some help.
242
268
 
269
+ ## The Database abstraction
270
+
271
+ The previous section focused on obtaining *relations*. In practice you frequently
272
+ have a collection of relations hence a *database*:
273
+
274
+ * A SQL database with multiple tables
275
+ * A list of data files, all in the same folder
276
+ * An excel file with various sheets
277
+
278
+ Bmg supports a simple Database abstraction that serves those relations "by name",
279
+ in a simple way. A database can also be easily dumped back to a data folder of
280
+ json or csv files, or as simple xlsx files with multiple sheets.
281
+
282
+ ### Connecting to a SQL Database
283
+
284
+ For a SQL database, connected with Sequel:
285
+
286
+ ```ruby
287
+ db = Bmg::Database.sequel(Sequel.connect('...'))
288
+ db.suppliers # yields a Bmg::Relation over the `suppliers` table
289
+ ```
290
+
291
+ ### Connecting to data files in the same folder
292
+
293
+ Data files all in the same folder can be seen as a very basic form of database,
294
+ and served as such. Bmg supports `json`, `csv` and `yaml` files:
295
+
296
+ ```ruby
297
+ db = Bmg::Database.data_folder('./my-database')
298
+ db.suppliers # yields a Bmg::Relation over the `suppliers.(json,csv,yml)` file
299
+ ```
300
+
301
+ Bmg supports files in different formats in the same folder. When files with the
302
+ same basename exist, json is preferred over yaml, which is preferred over csv.
303
+
304
+ ### Dumping a Database instance
305
+
306
+ As a data folder:
307
+
308
+ ```ruby
309
+ db = Bmg::Database.sequel(Sequel.connect('...'))
310
+ db.to_data_folder('path/to/folder', :json)
311
+ ```
312
+
313
+ As an .xlsx file (any existing file will be erased, we don't support modifying
314
+ existing files):
315
+
316
+ ```ruby
317
+ require 'bmg/xlsx'
318
+ db.to_xlsx('path/to/file.xlsx')
319
+ ```
320
+
243
321
  ## Supported operators
244
322
 
245
323
  ```ruby
@@ -283,6 +361,67 @@ r.unwrap(:a) # shortcut over unwrap([:a])
283
361
  r.where(predicate) # alias for restrict(predicate)
284
362
  ```
285
363
 
364
+ ## Supported Predicates
365
+
366
+ Usual operators are supported and map to their SQL equivalent as expected:
367
+
368
+ ```ruby
369
+ Predicate.eq # =
370
+ Predicate.neq # <>
371
+ Predicate.lt # <
372
+ Predicate.lte # <=
373
+ Predicate.gt # >
374
+ Predicate.gte # >=
375
+ Predicate.in # SQL's IN
376
+ Predicate.is_null # SQL's IS NULL
377
+ ```
378
+
379
+ See the [Predicate gem](https://github.com/enspirit/predicate) for a more
380
+ complete list.
381
+
382
+ Note: predicates that implement specific Ruby algorithms or patterns are
383
+ not compiled to SQL (and more generally not delegated to underlying database
384
+ servers).
385
+
386
+ ## Supported Summaries
387
+
388
+ The `summarize` operator receives a list of `attr: summarizer` pairs, e.g.
389
+
390
+ ```ruby
391
+ r.summarize([:city], {
392
+ how_many: :count, # same as how_many: Bmg::Summarizer.count
393
+ status: :max, # same as status: Bmg::Summarizer.max(:status)
394
+ min_status: Bmg::Summarizer.min(:status)
395
+ })
396
+ ```
397
+
398
+ The following summarizers are available and translated to SQL:
399
+
400
+ ```ruby
401
+ Bmg::Summarizer.count # count the number of tuples
402
+ Bmg::Summarizer.distinct(:a) # collect distinct values (as an array)
403
+ Bmg::Summarizer.distinct_count(:a) # count of distinct values
404
+ Bmg::Summarizer.min(:a) # min value for attribute :a
405
+ Bmg::Summarizer.max(:a) # max value
406
+ Bmg::Summarizer.sum(:a) # sum :a's values
407
+ Bmg::Summarizer.avg(:a) # average
408
+ ```
409
+
410
+ The following summarizers are implemented in Ruby (they are supported when
411
+ querying SQL databases, but not compiled to SQL):
412
+
413
+ ```ruby
414
+ Bmg::Summarizer.collect(:a) # collect :a's values (as an array)
415
+ Bmg::Summarizer.concat(:a, opts: { ... }) # concat :a's values (opts, e.g. {between: ','})
416
+ Bmg::Summarizer.first(:a, order: ...) # smallest seen :a's value according to a tuple ordering
417
+ Bmg::Summarizer.last(:a, order: ...) # largest seen :a's value according to a tuple ordering
418
+ Bmg::Summarizer.variance(:a) # variance
419
+ Bmg::Summarizer.stddev(:a) # standard deviation
420
+ Bmg::Summarizer.percentile(:a, nth) # (continuous) nth percentile
421
+ Bmg::Summarizer.percentile_disc(:a, nth) # discrete nth percentile
422
+ Bmg::Summarizer.value_by(:a, :by => :b) # { :b => :a } as a Hash
423
+ ```
424
+
286
425
  ## How is this different?
287
426
 
288
427
  ### ... from similar libraries?
data/lib/bmg/database/data_folder.rb ADDED
@@ -0,0 +1,67 @@
1
+ module Bmg
2
+ class Database
3
+ class DataFolder < Database
4
+
5
+ DEFAULT_OPTIONS = {
6
+ data_extensions: ['json', 'yml', 'yaml', 'csv']
7
+ }
8
+
9
+ def initialize(folder, options = {})
10
+ @folder = Path(folder)
11
+ @options = DEFAULT_OPTIONS.merge(options)
12
+ end
13
+
14
+ def method_missing(name, *args, &bl)
15
+ return super(name, *args, &bl) unless args.empty? && bl.nil?
16
+ raise NotSuchRelationError, name.to_s unless file = find_file(name)
17
+ read_file(file)
18
+ end
19
+
20
+ def each_relation_pair
21
+ return to_enum(:each_relation_pair) unless block_given?
22
+
23
+ @folder.glob('*') do |path|
24
+ next unless path.file?
25
+ next unless @options[:data_extensions].find {|ext|
26
+ path.ext == ".#{ext}" || path.ext == ext
27
+ }
28
+ yield(path.basename.rm_ext.to_sym, read_file(path))
29
+ end
30
+ end
31
+
32
+ def self.dump(database, path, ext = :json)
33
+ path = Path(path)
34
+ path.mkdir_p
35
+ database.each_relation_pair do |name, rel|
36
+ (path/"#{name}.#{ext}").write(rel.public_send(:"to_#{ext}"))
37
+ end
38
+ path
39
+ end
40
+
41
+ private
42
+
43
+ def read_file(file)
44
+ case file.ext.to_s
45
+ when '.json'
46
+ Bmg.json(file)
47
+ when '.yaml', '.yml'
48
+ Bmg.yaml(file)
49
+ when '.csv'
50
+ Bmg.csv(file)
51
+ else
52
+ raise NotSupportedError, "Unable to use #{file} as a relation"
53
+ end
54
+ end
55
+
56
+ def find_file(name)
57
+ exts = @options[:data_extensions]
58
+ exts.each do |ext|
59
+ target = @folder/"#{name}.#{ext}"
60
+ return target if target.file?
61
+ end
62
+ raise NotSuchRelationError, "#{@folder}/#{name}.#{exts.join(',')}"
63
+ end
64
+
65
+ end # class DataFolder
66
+ end # class Database
67
+ end # module Bmg
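The lookup in `find_file` above tries each configured extension in order, which is how json ends up taking precedence over yaml and csv. A minimal stdlib-only sketch of that precedence rule follows; it uses `Pathname` rather than the `Path` library the gem relies on, and `find_data_file` and `demo_lookup` are hypothetical names, not part of Bmg.

```ruby
require 'pathname'
require 'tmpdir'

# Extension order mirrors DEFAULT_OPTIONS[:data_extensions] in the diff above.
DATA_EXTENSIONS = %w[json yml yaml csv].freeze

# Returns the first existing `<name>.<ext>` file, or nil when none exists.
# (The gem raises NotSuchRelationError instead of returning nil.)
def find_data_file(folder, name, extensions = DATA_EXTENSIONS)
  extensions
    .map { |ext| Pathname(folder) / "#{name}.#{ext}" }
    .find(&:file?)
end

# Builds a throwaway folder and reports which file each name resolves to.
def demo_lookup
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "suppliers.csv"), "sid,name\nS1,Smith\n")
    File.write(File.join(dir, "suppliers.json"), '[{"sid":"S1","name":"Smith"}]')
    File.write(File.join(dir, "parts.csv"), "pid,name\nP1,Nut\n")
    {
      suppliers: find_data_file(dir, :suppliers)&.basename&.to_s,
      parts: find_data_file(dir, :parts)&.basename&.to_s,
      cities: find_data_file(dir, :cities)&.basename&.to_s,
    }
  end
end
```

With both `suppliers.csv` and `suppliers.json` present, the json file wins because `json` comes first in the extension list.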
data/lib/bmg/database/sequel.rb ADDED
@@ -0,0 +1,35 @@
1
+ module Bmg
2
+ class Database
3
+ class Sequel < Database
4
+
5
+ DEFAULT_OPTIONS = {
6
+ }
7
+
8
+ def initialize(sequel_db, options = {})
9
+ @sequel_db = sequel_db
10
+ @sequel_db = ::Sequel.connect(@sequel_db) unless @sequel_db.is_a?(::Sequel::Database)
11
+ end
12
+
13
+ def method_missing(name, *args, &bl)
14
+ return super(name, *args, &bl) unless args.empty? && bl.nil?
15
+ raise NotSuchRelationError, name.to_s unless @sequel_db.table_exists?(name)
16
+ rel_for(name)
17
+ end
18
+
19
+ def each_relation_pair
20
+ return to_enum(:each_relation_pair) unless block_given?
21
+
22
+ @sequel_db.tables.each do |table|
23
+ yield(table, rel_for(table))
24
+ end
25
+ end
26
+
27
+ protected
28
+
29
+ def rel_for(table_name)
30
+ Bmg.sequel(table_name, @sequel_db)
31
+ end
32
+
33
+ end # class Sequel
34
+ end # class Database
35
+ end # module Bmg
data/lib/bmg/database/xlsx.rb ADDED
@@ -0,0 +1,41 @@
1
+ module Bmg
2
+ class Database
3
+ class Xlsx < Database
4
+
5
+ DEFAULT_OPTIONS = {
6
+ }
7
+
8
+ def initialize(path, options = {})
9
+ path = Path(path) if path.is_a?(String)
10
+ @path = path
11
+ @options = DEFAULT_OPTIONS.merge(options)
12
+ end
13
+
14
+ def method_missing(name, *args, &bl)
15
+ return super(name, *args, &bl) unless args.empty? && bl.nil?
16
+ rel = rel_for(name)
17
+ raise NotSuchRelationError, name.to_s unless rel
18
+ rel
19
+ end
20
+
21
+ def each_relation_pair
22
+ return to_enum(:each_relation_pair) unless block_given?
23
+
24
+ spreadsheet.sheets.each do |sheet_name|
25
+ yield(sheet_name.to_sym, rel_for(sheet_name))
26
+ end
27
+ end
28
+
29
+ protected
30
+
31
+ def spreadsheet
32
+ @spreadsheet ||= Roo::Spreadsheet.open(@path, @options)
33
+ end
34
+
35
+ def rel_for(sheet_name)
36
+ Bmg.excel(@path, { sheet: sheet_name.to_s })
37
+ end
38
+
39
+ end # class Xlsx
40
+ end # class Database
41
+ end # module Bmg
data/lib/bmg/database.rb ADDED
@@ -0,0 +1,35 @@
1
+ module Bmg
2
+ class Database
3
+
4
+ def self.data_folder(*args)
5
+ require_relative 'database/data_folder'
6
+ DataFolder.new(*args)
7
+ end
8
+
9
+ def self.sequel(*args)
10
+ require 'bmg/sequel'
11
+ require_relative 'database/sequel'
12
+ Sequel.new(*args)
13
+ end
14
+
15
+ def self.xlsx(*args)
16
+ require 'bmg/xlsx'
17
+ require_relative 'database/xlsx'
18
+ Xlsx.new(*args)
19
+ end
20
+
21
+ def to_xlsx(*args)
22
+ require 'bmg/xlsx'
23
+ Writer::Xlsx.to_xlsx(self, *args)
24
+ end
25
+
26
+ def to_data_folder(*args)
27
+ DataFolder.dump(self, *args)
28
+ end
29
+
30
+ def each_relation_pair
31
+ raise NotImplementedError
32
+ end
33
+
34
+ end # class Database
35
+ end # module Bmg
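The `Database` subclasses above share a small protocol: serve a relation when a zero-argument method is called with its name, and enumerate `[name, relation]` pairs through `each_relation_pair`. The following stand-in illustrates that protocol without depending on Bmg. `HashDatabase` and `NoSuchRelation` are hypothetical names, and plain arrays of hashes stand in for relations.

```ruby
# Hypothetical stand-ins: neither class exists in bmg.
class NoSuchRelation < StandardError; end

class HashDatabase
  def initialize(relations)
    @relations = relations # e.g. { suppliers: [{ sid: "S1" }] }
  end

  # Serves relations "by name", e.g. db.suppliers
  def method_missing(name, *args, &block)
    return super unless args.empty? && block.nil?
    raise NoSuchRelation, name.to_s unless @relations.key?(name)
    @relations[name]
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @relations.key?(name) || super
  end

  # Yields (name, relation) pairs; returns an Enumerator without a block.
  def each_relation_pair
    return to_enum(:each_relation_pair) unless block_given?
    @relations.each { |name, rel| yield(name, rel) }
  end
end

db = HashDatabase.new(suppliers: [{ sid: "S1" }], parts: [{ pid: "P1" }])
```

Any object exposing `each_relation_pair` in this shape is what `DataFolder.dump` and `Writer::Xlsx.to_xlsx` iterate over in the diff.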
data/lib/bmg/error.rb CHANGED
@@ -23,4 +23,7 @@ module Bmg
23
23
  # to backtrack to something more ruby-native.
24
24
  class NotSupportedError < Error; end
25
25
 
26
+ # Raised when relation (variable) is not found
27
+ class NotSuchRelationError < Error; end
28
+
26
29
  end
data/lib/bmg/reader/excel.rb CHANGED
@@ -1,80 +1 @@
1
- module Bmg
2
- module Reader
3
- class Excel
4
- include Reader
5
-
6
- DEFAULT_OPTIONS = {
7
- sheet: 0,
8
- skip: 0,
9
- row_num: true
10
- }
11
-
12
- def initialize(type, path, options = {})
13
- require 'roo'
14
- @path = path
15
- @options = DEFAULT_OPTIONS.merge(options)
16
- @type = type.knows_attrlist? ? type : type.with_attrlist(infer_attrlist)
17
- end
18
-
19
- def each
20
- return to_enum unless block_given?
21
-
22
- headers = type.attrlist
23
- headers = headers[1..-1] if generate_row_num?
24
- start_at = @options[:skip] + 2
25
- end_at = spreadsheet.last_row
26
- (start_at..end_at).each do |i|
27
- row = spreadsheet.row(i)
28
- init = init_tuple(i - start_at + 1)
29
- tuple = (0...headers.size).each_with_object(init){|i,t|
30
- t[headers[i]] = row[i]
31
- }
32
- yield(tuple)
33
- end
34
- end
35
-
36
- def to_ast
37
- [ :excel, @path, @options ]
38
- end
39
-
40
- def to_s
41
- "(excel #{@path})"
42
- end
43
- alias :inspect :to_s
44
-
45
- private
46
-
47
- def spreadsheet
48
- @spreadsheet ||= Roo::Spreadsheet
49
- .open(@path, @options)
50
- .sheet(@options[:sheet])
51
- end
52
-
53
- def infer_attrlist
54
- row = spreadsheet.row(1+@options[:skip])
55
- attrlist = row.map{|c| c.to_s.strip.to_sym }
56
- attrlist.unshift(row_num_name) if generate_row_num?
57
- attrlist
58
- end
59
-
60
- def generate_row_num?
61
- !!@options[:row_num]
62
- end
63
-
64
- def row_num_name
65
- case as = @options[:row_num]
66
- when TrueClass then :row_num
67
- when Symbol then as
68
- else nil
69
- end
70
- end
71
-
72
- def init_tuple(i)
73
- return {} unless generate_row_num?
74
-
75
- { row_num_name => i }
76
- end
77
-
78
- end # class Excel
79
- end # module Reader
80
- end # module Bmg
1
+ require_relative 'xlsx'
data/lib/bmg/reader/xlsx.rb ADDED
@@ -0,0 +1,80 @@
1
+ module Bmg
2
+ module Reader
3
+ class Excel
4
+ include Reader
5
+
6
+ DEFAULT_OPTIONS = {
7
+ sheet: 0,
8
+ skip: 0,
9
+ row_num: true
10
+ }
11
+
12
+ def initialize(type, path, options = {})
13
+ require 'roo'
14
+ @path = path
15
+ @options = DEFAULT_OPTIONS.merge(options)
16
+ @type = type.knows_attrlist? ? type : type.with_attrlist(infer_attrlist)
17
+ end
18
+
19
+ def each
20
+ return to_enum unless block_given?
21
+
22
+ headers = type.attrlist
23
+ headers = headers[1..-1] if generate_row_num?
24
+ start_at = @options[:skip] + 2
25
+ end_at = spreadsheet.last_row
26
+ (start_at..end_at).each do |i|
27
+ row = spreadsheet.row(i)
28
+ init = init_tuple(i - start_at + 1)
29
+ tuple = (0...headers.size).each_with_object(init){|i,t|
30
+ t[headers[i]] = row[i]
31
+ }
32
+ yield(tuple)
33
+ end
34
+ end
35
+
36
+ def to_ast
37
+ [ :excel, @path, @options ]
38
+ end
39
+
40
+ def to_s
41
+ "(excel #{@path})"
42
+ end
43
+ alias :inspect :to_s
44
+
45
+ private
46
+
47
+ def spreadsheet
48
+ @spreadsheet ||= Roo::Spreadsheet
49
+ .open(@path, @options)
50
+ .sheet(@options[:sheet])
51
+ end
52
+
53
+ def infer_attrlist
54
+ row = spreadsheet.row(1+@options[:skip])
55
+ attrlist = row.map{|c| c.to_s.strip.to_sym }
56
+ attrlist.unshift(row_num_name) if generate_row_num?
57
+ attrlist
58
+ end
59
+
60
+ def generate_row_num?
61
+ !!@options[:row_num]
62
+ end
63
+
64
+ def row_num_name
65
+ case as = @options[:row_num]
66
+ when TrueClass then :row_num
67
+ when Symbol then as
68
+ else nil
69
+ end
70
+ end
71
+
72
+ def init_tuple(i)
73
+ return {} unless generate_row_num?
74
+
75
+ { row_num_name => i }
76
+ end
77
+
78
+ end # class Excel
79
+ end # module Reader
80
+ end # module Bmg
data/lib/bmg/sequel.rb CHANGED
@@ -1,3 +1,4 @@
1
+ require 'bmg'
1
2
  require 'bmg/sql'
2
3
  require 'sequel'
3
4
  require 'predicate/sequel'
data/lib/bmg/summarizer/bucketize.rb ADDED
@@ -0,0 +1,82 @@
1
+ module Bmg
2
+ class Summarizer
3
+ #
4
+ # Bucketizer summarizer.
5
+ #
6
+ # Example:
7
+ #
8
+ # # direct ruby usage
9
+ # Bmg::Summarizer.bucketize(:qty, :size => 2).summarize(...)
10
+ #
11
+ class Bucketize < Summarizer
12
+
13
+ # Sets default options.
14
+ def default_options
15
+ { :size => 10 }
16
+ end
17
+
18
+ # Returns the initial memo: a pair of arrays, the first of which collects values
19
+ def least()
20
+ [[], []]
21
+ end
22
+
23
+ # Appends val to the values collected in the memo
24
+ def _happens(memo, val)
25
+ memo.first << val
26
+ memo
27
+ end
28
+
29
+ # Finalizes computation
30
+ def finalize(memo)
31
+ buckets = compute_buckets(memo.first, options[:size])
32
+ buckets = touching_buckets(buckets) if options[:boundaries] == :touching
33
+ buckets
34
+ end
35
+
36
+ private
37
+
38
+ def compute_buckets(values, num_buckets = 10)
39
+ sorted_values = values.compact.sort
40
+ sorted_values = sorted_values.map{|v| v.to_s[0...options[:value_length]] } if options[:value_length]
41
+ sorted_values = sorted_values.uniq if options[:distinct]
42
+
43
+ # Calculate the size of each bucket
44
+ total_values = sorted_values.length
45
+ bucket_size = (total_values / num_buckets.to_f).ceil
46
+
47
+ # Create the ranges for each bucket
48
+ bucket_ranges = []
49
+ num_buckets.times do |i|
50
+ start_index = i * bucket_size
51
+ break if start_index >= total_values # Ensure we do not exceed the array bounds
52
+
53
+ end_index = [(start_index + bucket_size - 1), total_values - 1].min
54
+ start_value = sorted_values[start_index]
55
+ end_value = sorted_values[end_index]
56
+ bucket_ranges << (start_value..end_value)
57
+ end
58
+
59
+ bucket_ranges
60
+ end
61
+
62
+ def touching_buckets(buckets)
63
+ result = []
64
+ buckets.each do |b|
65
+ r_start = result.empty? ? b.begin : result.last.end
66
+ r_end = b.end
67
+ result << (r_start...r_end)
68
+ end
69
+ result[-1] = (result.last.begin..result.last.end)
70
+
71
+ result
72
+ end
73
+
74
+ end # class Bucketize
75
+
76
+ # Factors a bucketize summarizer
77
+ def self.bucketize(*args, &bl)
78
+ Bucketize.new(*args, &bl)
79
+ end
80
+
81
+ end # class Summarizer
82
+ end # module Bmg
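The bucketing logic above can be tried outside of Bmg. The sketch below is a standalone rendition of `compute_buckets` and `touching_buckets` as they appear in the diff, with two deliberate differences that are assumptions of this sketch: the `:value_length` and `:distinct` options are left out, and `touching_buckets` guards against an empty bucket list.

```ruby
# Splits the sorted values into at most num_buckets ranges holding
# (roughly) the same number of values each.
def compute_buckets(values, num_buckets = 10)
  sorted = values.compact.sort
  total = sorted.length
  bucket_size = (total / num_buckets.to_f).ceil

  ranges = []
  num_buckets.times do |i|
    start_index = i * bucket_size
    break if start_index >= total # no values left for further buckets
    end_index = [start_index + bucket_size - 1, total - 1].min
    ranges << (sorted[start_index]..sorted[end_index])
  end
  ranges
end

# Rewrites buckets so each one starts where the previous one ends:
# half-open ranges, except the last one, which stays closed.
def touching_buckets(buckets)
  result = []
  buckets.each do |b|
    r_start = result.empty? ? b.begin : result.last.end
    result << (r_start...b.end)
  end
  result[-1] = (result.last.begin..result.last.end) unless result.empty?
  result
end

buckets  = compute_buckets((1..10).to_a, 3) # => [1..4, 5..8, 9..10]
touching = touching_buckets(buckets)        # => [1...4, 4...8, 8..10]
```

With ten values and three buckets, the bucket size is `ceil(10 / 3.0) = 4`, so the last bucket only receives the two remaining values.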
data/lib/bmg/summarizer/concat.rb CHANGED
@@ -21,7 +21,7 @@ module Bmg
21
21
  end
22
22
 
23
23
  # Concatenates current memo with val.to_s
24
- def _happens(memo, val)
24
+ def _happens(memo, val)
25
25
  memo << options[:between].to_s unless memo.empty?
26
26
  memo << val.to_s
27
27
  end
data/lib/bmg/summarizer.rb CHANGED
@@ -172,3 +172,4 @@ require_relative 'summarizer/value_by'
172
172
  require_relative 'summarizer/positional'
173
173
  require_relative 'summarizer/first'
174
174
  require_relative 'summarizer/last'
175
+ require_relative 'summarizer/bucketize'
data/lib/bmg/version.rb CHANGED
@@ -1,7 +1,7 @@
1
1
  module Bmg
2
2
  module Version
3
3
  MAJOR = 0
4
- MINOR = 22
4
+ MINOR = 23
5
5
  TINY = 0
6
6
  end
7
7
  VERSION = "#{Version::MAJOR}.#{Version::MINOR}.#{Version::TINY}"
data/lib/bmg/writer/xlsx.rb CHANGED
@@ -7,22 +7,36 @@ module Bmg
7
7
  }
8
8
 
9
9
  def initialize(xlsx_options, output_preferences = nil)
10
+ require 'write_xlsx'
10
11
  @xlsx_options = DEFAULT_OPTIONS.merge(xlsx_options)
11
12
  @output_preferences = OutputPreferences.dress(output_preferences)
12
13
  end
13
14
  attr_reader :xlsx_options, :output_preferences
14
15
 
15
16
  def call(relation, path)
16
- require 'write_xlsx'
17
17
  dup._call(relation, path)
18
18
  end
19
19
 
20
+ def self.to_xlsx(database, path)
21
+ require 'write_xlsx'
22
+ workbook = WriteXLSX.new(path)
23
+ database.each_relation_pair do |name, rel|
24
+ worksheet = workbook.add_worksheet(name)
25
+ rel.to_xlsx({
26
+ workbook: workbook,
27
+ worksheet: worksheet,
28
+ })
29
+ end
30
+ workbook.close
31
+ end
32
+
20
33
  protected
21
34
  attr_reader :workbook, :worksheet
22
35
 
23
36
  def _call(relation, path)
24
37
  @workbook = xlsx_options[:workbook] || WriteXLSX.new(path)
25
38
  @worksheet = xlsx_options[:worksheet] || workbook.add_worksheet
39
+ @worksheet = workbook.add_worksheet(@worksheet) if @worksheet.is_a?(String)
26
40
 
27
41
  headers = infer_headers(relation.type)
28
42
  before = nil
data/lib/bmg/xlsx.rb ADDED
@@ -0,0 +1,3 @@
1
+ require_relative 'reader/xlsx'
2
+ require_relative 'writer/xlsx'
3
+ require_relative 'database/xlsx'
data/lib/bmg.rb CHANGED
@@ -24,6 +24,16 @@ module Bmg
24
24
  end
25
25
  module_function :csv
26
26
 
27
+ def json(path, options = {}, type = Type::ANY)
28
+ in_memory(path.load.map{|tuple| TupleAlgebra.symbolize_keys(tuple) })
29
+ end
30
+ module_function :json
31
+
32
+ def yaml(path, options = {}, type = Type::ANY)
33
+ in_memory(path.load.map{|tuple| TupleAlgebra.symbolize_keys(tuple) })
34
+ end
35
+ module_function :yaml
36
+
27
37
  def excel(path, options = {}, type = Type::ANY)
28
38
  Reader::Excel.new(type, path, options).spied(main_spy)
29
39
  end
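The new `json` and `yaml` readers above load the whole file and symbolize each tuple's keys before wrapping the result in an in-memory relation. A rough stdlib-only equivalent of the JSON case is sketched below. `tuples_from_json` is a hypothetical helper, and it assumes that `TupleAlgebra.symbolize_keys` converts top-level keys to symbols.

```ruby
require 'json'

# Parses a JSON array of objects and symbolizes each tuple's keys.
def tuples_from_json(source)
  JSON.parse(source).map { |tuple| tuple.transform_keys(&:to_sym) }
end

tuples = tuples_from_json(
  '[{"sid": "S1", "city": "London"}, {"sid": "S2", "city": "Paris"}]'
)
```

The resulting array of symbol-keyed hashes is the kind of input `Bmg.in_memory` is handed in the diff.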
@@ -57,6 +67,8 @@ module Bmg
57
67
  require_relative 'bmg/relation/materialized'
58
68
  require_relative 'bmg/relation/proxy'
59
69
 
70
+ require_relative 'bmg/database'
71
+
60
72
  # Deprecated
61
73
  Leaf = Relation::InMemory
62
74
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: bmg
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.22.0
4
+ version: 0.23.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Bernard Lambeau
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-05-17 00:00:00.000000000 Z
11
+ date: 2024-06-27 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: predicate
@@ -142,6 +142,10 @@ files:
142
142
  - lib/bmg.rb
143
143
  - lib/bmg/algebra.rb
144
144
  - lib/bmg/algebra/shortcuts.rb
145
+ - lib/bmg/database.rb
146
+ - lib/bmg/database/data_folder.rb
147
+ - lib/bmg/database/sequel.rb
148
+ - lib/bmg/database/xlsx.rb
145
149
  - lib/bmg/error.rb
146
150
  - lib/bmg/operator.rb
147
151
  - lib/bmg/operator/allbut.rb
@@ -172,6 +176,7 @@ files:
172
176
  - lib/bmg/reader/csv.rb
173
177
  - lib/bmg/reader/excel.rb
174
178
  - lib/bmg/reader/text_file.rb
179
+ - lib/bmg/reader/xlsx.rb
175
180
  - lib/bmg/relation.rb
176
181
  - lib/bmg/relation/empty.rb
177
182
  - lib/bmg/relation/in_memory.rb
@@ -272,6 +277,7 @@ files:
272
277
  - lib/bmg/sql/version.rb
273
278
  - lib/bmg/summarizer.rb
274
279
  - lib/bmg/summarizer/avg.rb
280
+ - lib/bmg/summarizer/bucketize.rb
275
281
  - lib/bmg/summarizer/by_proc.rb
276
282
  - lib/bmg/summarizer/collect.rb
277
283
  - lib/bmg/summarizer/concat.rb
@@ -300,6 +306,7 @@ files:
300
306
  - lib/bmg/writer.rb
301
307
  - lib/bmg/writer/csv.rb
302
308
  - lib/bmg/writer/xlsx.rb
309
+ - lib/bmg/xlsx.rb
303
310
  - tasks/gem.rake
304
311
  - tasks/test.rake
305
312
  homepage: http://github.com/enspirit/bmg