bmg 0.17.6 → 0.18.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: e67d11656619195bfaa20a9d55fff49d593ccaa3ad558d0876c82534977d4e04
- data.tar.gz: 7ecd9b926d4289feb0fec3efcafcf7b14c483b70f8c4e47b87e3ba038db64774
+ metadata.gz: 2c9b490328160c9816102f7dd34b3ee46207ab4362cd2af6d8623c9fc5d4b87a
+ data.tar.gz: d113dc14ae1811f2b5565af9e84a0ffbc82b7842c46284af7fcad3bc9f3b7470
  SHA512:
- metadata.gz: 4d99c96391c73c096e3e981155cbe53e637c909854a0e3fe92bae991eb7da1be7cf78188c87f9fb53cdbb90e5dd0d92b4fec0501f8d1fcf7f1cdb1900828d575
- data.tar.gz: f8ff33a830852b1e0c5648596ef5f11cae7fc5f7bd817c1ed30a73ce3da81a92bfb42871eecd928a3ce7b717499fb9a975ee8f174a1b013054af506ed056b825
+ metadata.gz: fd42b8787cc093c6452386760744d7480c64c8be186ab73b87a548a1b853643b46a6ad48cf019da9d77fddbb38c824eb0e0bef3ef5fa2f333f1d1240f5a0bd84
+ data.tar.gz: 7bd0cd32a28963123b03b59152fa03d6470bbdf1a3c6cfe0b17a6667a80d97e65d5309def4e31ec9cbb47353d409ad9ad2a6d1fbc9c2ec10797460852836ce35
data/Gemfile CHANGED
@@ -1,5 +1,2 @@
  source "https://rubygems.org"
  gemspec
-
- # gem "predicate", github: "enspirit/predicate", branch: "placeholders"
- # gem "predicate", path: "../predicate"
data/README.md CHANGED
@@ -1,16 +1,30 @@
1
1
  # Bmg, a relational algebra (Alf's successor)!
2
2
 
3
+ [![Build Status](https://travis-ci.com/enspirit/bmg.svg?branch=master)](https://travis-ci.com/enspirit/bmg)
4
+
3
5
  Bmg is a relational algebra implemented as a ruby library. It implements the
4
6
  [Relation as First-Class Citizen](http://www.try-alf.org/blog/2013-10-21-relations-as-first-class-citizen)
5
- paradigm contributed with Alf a few years ago.
6
-
7
- Like Alf, Bmg can be used to query relations in memory, from various files,
8
- SQL databases, and any data sources that can be seen as serving relations.
9
- Cross data-sources joins are supported, as with Alf.
10
-
11
- Unlike Alf, Bmg does not make any core ruby extension and exposes the
12
- object-oriented syntax only (not Alf's functional one). Bmg implementation is
13
- also much simpler, and make its easier to implement user-defined relations.
7
+ paradigm contributed with [Alf](http://www.try-alf.org/) a few years ago.
8
+
9
+ Bmg can be used to query relations in memory, from various files, SQL databases,
10
+ and any data source that can be seen as serving relations. Cross data-source
11
+ joins are supported, as with Alf. For differences with Alf, see the section
12
+ further down this README.
13
+
14
+ ## Outline
15
+
16
+ * [Example](#example)
17
+ * [Where are base relations coming from?](#where-are-base-relations-coming-from)
18
+ * [Memory relations](#memory-relations)
19
+ * [Connecting to SQL databases](#connecting-to-sql-databases)
20
+ * [Reading files (csv, excel, text)](#reading-files-csv-excel-text)
21
+ * [Your own relations](#your-own-relations)
22
+ * [List of supported operators](#supported-operators)
23
+ * [How is this different?](#how-is-this-different)
24
+ * [... from similar libraries](#-from-similar-libraries)
25
+ * [... from Alf](#-from-alf)
26
+ * [Contribute](#contribute)
27
+ * [License](#license)
14
28
 
15
29
  ## Example
16
30
 
@@ -27,7 +41,7 @@ suppliers = Bmg::Relation.new([
27
41
  ])
28
42
 
29
43
  by_city = suppliers
30
- .restrict(Predicate.neq(status: 30))
44
+ .exclude(status: 30)
31
45
  .extend(upname: ->(t){ t[:name].upcase })
32
46
  .group([:sid, :name, :status], :suppliers_in)
33
47
 
@@ -35,76 +49,158 @@ puts JSON.pretty_generate(by_city)
35
49
  # [{...},...]
36
50
  ```
37
51
 
38
- ## Connecting to a SQL database
52
+ ## Where are base relations coming from?
53
+
54
+ Bmg sees relations as sets (enumerables) of Ruby hashes with Symbol keys. The following
55
+ sections show you how to get them in the first place, to enter Relationland.
56
+
57
+ ### Memory relations
58
+
59
+ If you have an Array of Hashes -- in fact any Enumerable -- you can easily get
60
+ a Relation using either `Bmg::Relation.new` or `Bmg.in_memory`.
61
+
62
+ ```ruby
63
+ # this...
64
+ r = Bmg::Relation.new [{id: 1}, {id: 2}]
65
+
66
+ # is the same as this...
67
+ r = Bmg.in_memory [{id: 1}, {id: 2}]
68
+
69
+ # entire algebra is available on `r`
70
+ ```
71
+
72
+ ### Connecting to SQL databases
39
73
 
40
- Bmg requires `sequel >= 3.0` to connect to SQL databases.
74
+ Bmg currently requires `sequel >= 3.0` to connect to SQL databases. You also
75
+ need to require `bmg/sequel`.
41
76
 
42
77
  ```ruby
43
78
  require 'sqlite3'
44
79
  require 'bmg'
45
80
  require 'bmg/sequel'
81
+ ```
46
82
 
47
- DB = Sequel.connect("sqlite://suppliers-and-parts.db")
83
+ Then `Bmg.sequel` serves relations for tables of your SQL database:
48
84
 
85
+ ```ruby
86
+ DB = Sequel.connect("sqlite://suppliers-and-parts.db")
49
87
  suppliers = Bmg.sequel(:suppliers, DB)
88
+ ```
89
+
90
+ The entire algebra is available on those relations. As long as you keep using
91
+ operators that can be translated to SQL, results remain SQL-able:
50
92
 
93
+ ```ruby
51
94
  big_suppliers = suppliers
52
- .restrict(Predicate.neq(status: 30))
95
+ .exclude(status: 30)
96
+ .project([:sid, :name])
53
97
 
54
98
  puts big_suppliers.to_sql
55
- # SELECT `t1`.`sid`, `t1`.`name`, `t1`.`status`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)
99
+ # SELECT `t1`.`sid`, `t1`.`name` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)
100
+ ```
56
101
 
57
- puts JSON.pretty_generate(big_suppliers)
58
- # [{...},...]
102
+ Operators not translatable to SQL are available too (such as `group` below).
103
+ Bmg falls back to memory operators for them, but remains capable of pushing some
104
+ operators down the tree as illustrated below (the restriction on `:city` is
105
+ pushed to the SQL server):
106
+
107
+ ```ruby
108
+ Bmg.sequel(:suppliers, sequel_db)
109
+ .project([:sid, :name, :city])
110
+ .group([:sid, :name], :suppliers_in)
111
+ .restrict(city: ["Paris", "London"])
112
+ .debug
113
+
114
+ # (group
115
+ # (sequel SELECT `t1`.`sid`, `t1`.`name`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`city` IN ('Paris', 'London')))
116
+ # [:sid, :name]
117
+ # :suppliers_in
118
+ # {:array=>false})
59
119
  ```
60
120
 
61
- ## How is this different from similar libraries?
121
+ ### Reading files (csv, excel, text)
62
122
 
63
- 1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ,
64
- etc.) do not implement a genuine relational algebra: their support for
65
- chaining relational operators is limited (yielding errors or wrong SQL
66
- queries). Bmg **always** allows chaining operators. If it does not, it's
67
- a bug. In other words, the following query is 100% valid:
123
+ Bmg provides simple adapters to read files and reach Relationland as soon as
124
+ possible.
68
125
 
69
- relation
70
- .restrict(...) # aka where
71
- .union(...)
72
- .summarize(...) # aka group by
73
- .restrict(...)
126
+ #### CSV files
74
127
 
75
- 2. Bmg supports in memory relations, json relations, csv relations, SQL
76
- relations and so on. It's not tight to SQL generation, and supports
77
- queries accross multiple data sources.
128
+ ```ruby
129
+ csv_options = { col_sep: ",", quote_char: '"' }
130
+ r = Bmg.csv("path/to/a/file.csv", csv_options)
131
+ ```
78
132
 
79
- 3. Bmg makes a best effort to optimize queries, simplifying both generated
80
- SQL code (low-level accesses to datasources) and in-memory operations.
133
+ Options are passed directly to `::CSV.new`; check Ruby's standard
134
+ library.
81
135
 
82
- 4. Bmg supports various *structuring* operators (group, image, autowrap,
83
- autosummarize, etc.) and allows building 'non flat' relations.
136
+ #### Excel files
84
137
 
85
- ## How is this different from Alf?
138
+ You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
139
+ read `.xls` and `.xlsx` files with Bmg.
86
140
 
87
- 1. Bmg's implementation is much simpler than Alf, and uses no ruby core
88
- extention.
141
+ ```ruby
142
+ roo_options = { skip: 1 }
143
+ r = Bmg.excel("path/to/a/file.xls", roo_options)
144
+ ```
89
145
 
90
- 2. We are confident using Bmg in production. Systematic inspection of query
91
- plans is suggested though. Alf was a bit too experimental to be used on
92
- (critical) production systems.
146
+ Options are passed directly to `Roo::Spreadsheet.open`; check roo's
147
+ documentation.
93
148
 
94
- 2. Alf exposes a functional syntax, command line tool, restful tools and
95
- many more. Bmg is limited to the core algebra, main Relation abstraction
96
- and SQL generation.
149
+ #### Text files
97
150
 
98
- 3. Bmg is less strict regarding conformance to relational theory, and
99
- may actually expose non relational features (such as support for null,
100
- left_join operator, etc.). Sharp tools hurt, use them with great care.
151
+ There is also a straightforward way to read text files and convert lines to
152
+ tuples.
101
153
 
102
- 4. Bmg does not yet implement all operators documented on try-alf.org, even
103
- if we plan to eventually support them all.
154
+ ```ruby
155
+ r = Bmg.text_file("path/to/a/file.txt")
156
+ r.type.attrlist
157
+ # => [:line, :text]
158
+ ```
104
159
 
105
- 5. Bmg has a few additional operators that prove very useful on real
106
- production use cases: prefix, suffix, autowrap, autosummarize, left_join,
107
- rxmatch, etc.
160
+ Without options, tuples have `:line` and `:text` attributes, the former
161
+ being the line number (starting at 1) and the latter being the line itself
162
+ (stripped).
163
+
164
+ There are a couple of options (see `Bmg::Reader::TextFile`). The most useful one
165
+ is the use of a Regexp with named captures to automatically extract
166
+ attributes:
167
+
168
+ ```ruby
169
+ r = Bmg.text_file("path/to/a/file.txt", parse: /GET (?<url>([^\s]+))/)
170
+ r.type.attrlist
171
+ # => [:line, :url]
172
+ ```
173
+
174
+ In this scenario, non matching lines are skipped. The `:line` attribute keeps
175
+ being used to have at least one candidate key (so to speak).
176
+
177
+ ### Your own relations
178
+
179
+ As noted earlier, Bmg has a simple relation interface where you only have to
180
+ provide an iteration of symbolized tuples.
181
+
182
+ ```ruby
183
+ class MyRelation
184
+ include Bmg::Relation
185
+
186
+ def each
187
+ yield(id: 1, name: "Alf", year: 2014)
188
+ yield(id: 2, name: "Bmg", year: 2018)
189
+ end
190
+ end
191
+
192
+ MyRelation.new
193
+ .restrict(Predicate.gt(:year, 2015))
194
+ .allbut([:year])
195
+ ```
196
+
197
+ As shown, creating adapters on top of various data sources is straightforward.
198
+ Adapters can also participate in query optimization (such as pushing
199
+ restrictions down the tree) by overriding the underscored version of operators
200
+ (e.g. `_restrict`).
201
+
202
+ Have a look at `Bmg::Algebra` for the protocol and `Bmg::Sql::Relation` for an
203
+ example. Keep in touch with the team if you need some help.
108
204
 
109
205
  ## Supported operators
110
206
 
@@ -114,8 +210,10 @@ r.autowrap(split: '_') # structure a flat relation, split:
114
210
  r.autosummarize([:a, :b, ...], x: :sum) # (experimental) usual summarizers supported
115
211
  r.constants(x: 12, ...) # add constant attributes (sometimes useful in unions)
116
212
  r.extend(x: ->(t){ ... }, ...) # add computed attributes
213
+ r.exclude(predicate) # shortcut for restrict(!predicate)
117
214
  r.group([:a, :b, ...], :x) # relation-valued attribute from attributes
118
215
  r.image(right, :x, [:a, :b, ...]) # relation-valued attribute from another relation
216
+ r.images({:x => r1, :y => r2}, [:a, ...]) # shortcut over image(r1, :x, ...).image(r2, :y, ...)
119
217
  r.join(right, [:a, :b, ...]) # natural join on a join key
120
218
  r.join(right, :a => :x, :b => :y, ...) # natural join after right reversed renaming
121
219
  r.left_join(right, [:a, :b, ...], {...}) # left join with optional default right tuple
@@ -137,14 +235,95 @@ t.transform(&:to_s) # similar, but Proc-driven
137
235
  t.transform(:foo => :upcase, ...) # specific-attrs transformation
138
236
  t.transform([:to_s, :upcase]) # chain-transformation
139
237
  r.union(right) # relational union
238
+ r.where(predicate) # alias for restrict(predicate)
140
239
  ```
141
240
 
142
- ## Who is behind Bmg?
241
+ ## How is this different?
242
+
243
+ ### ... from similar libraries?
244
+
245
+ 1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ,
246
+ etc.) do not implement a genuine relational algebra. Their support for
247
+ chaining relational operators is thus limited (restricting your expressive
248
+ power and/or raising errors and/or outputting wrong or counterintuitive
249
+ SQL code). Bmg **always** allows chaining operators. If it does not, it's
250
+ a bug.
251
+
252
+ For instance the expression below is 100% valid in Bmg. The last where
253
+ clause applies to the result of the summarize (while SQL requires a `HAVING`
254
+ clause, or a `SELECT ... FROM (SELECT ...) r`).
255
+
256
+ ```ruby
257
+ relation
258
+ .where(...)
259
+ .union(...)
260
+ .summarize(...) # aka group by
261
+ .where(...)
262
+ ```
263
+
264
+ 2. Bmg supports in memory relations, json relations, csv relations, SQL
265
+ relations and so on. It's not tied to SQL generation, and supports
266
+ queries across multiple data sources.
267
+
268
+ 3. Bmg makes a best effort to optimize queries, simplifying both generated
269
+ SQL code (low-level accesses to datasources) and in-memory operations.
270
+
271
+ 4. Bmg supports various *structuring* operators (group, image, autowrap,
272
+ autosummarize, etc.) and allows building 'non flat' relations.
273
+
274
+ 5. Bmg can use full ruby power when that helps (e.g. regular expressions in
275
+ WHERE clauses or ruby code in EXTEND clauses). This may prevent Bmg from
276
+ delegating work to underlying data sources (e.g. SQL server) and should
277
+ therefore be used with care though.
278
+
279
+ ### ... from Alf?
280
+
281
+ If you use Alf (or used it in the past), below are the main differences between
282
+ Bmg and Alf. Bmg has NOT been written to be API-compatible with Alf and will
283
+ probably never be.
284
+
285
+ 1. Bmg's implementation is much simpler than Alf and uses no ruby core
286
+ extension.
287
+
288
+ 2. We are confident using Bmg in production. Systematic inspection of query
289
+ plans is advised though. Alf was a bit too experimental to be used on
290
+ (critical) production systems.
291
+
292
+ 3. Alf exposes a functional syntax, command line tool, restful tools and
293
+ many more. Bmg is limited to the core algebra, main Relation abstraction
294
+ and SQL generation.
143
295
 
144
- Bernard Lambeau (bernard@klaro.cards) is Alf & Bmg main engineer & maintainer.
296
+ 4. Bmg is less strict regarding conformance to relational theory, and
297
+ may actually expose non relational features (such as support for null,
298
+ left_join operator, etc.). Sharp tools hurt, use them with care.
299
+
300
+ 5. Unlike Alf::Relation, instances of Bmg::Relation capture query trees, not
301
+ values. Currently two instances `r1` and `r2` are not equal even if they
302
+ define the same mathematical relation. As a consequence joining on
303
+ relation-valued attributes does not work as expected in Bmg until further
304
+ notice.
305
+
306
+ 6. Bmg does not implement all operators documented on try-alf.org, even if
307
+ we plan to eventually support most of them.
308
+
309
+ 7. Bmg has a few additional operators that prove very useful on real
310
+ production use cases: prefix, suffix, autowrap, autosummarize, left_join,
311
+ rxmatch, etc.
312
+
313
+ 8. Bmg optimizes queries and compiles them to SQL on the fly, while Alf was
314
+ building an AST internally first. Strictly speaking this makes Bmg less
315
+ powerful than Alf since optimizations cannot be turned off for now.
316
+
317
+ ## Contribute
318
+
319
+ Please use github issues and pull requests for all questions, bug reports,
320
+ and contributions. Don't hesitate to get in touch with us with an early code
321
+ spike if you plan to add non trivial features.
322
+
323
+ ## License
324
+
325
+ This software is distributed by Enspirit SRL under an MIT License. Please
326
+ contact Bernard Lambeau (blambeau@gmail.com) with any question.
145
327
 
146
328
  Enspirit (https://enspirit.be) and Klaro App (https://klaro.cards) are both
147
329
  actively using and contributing to the library.
148
-
149
- Feel free to contact us for help, ideas and/or contributions. Please use github
150
- issues and pull requests if possible if code is involved.
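
Note on the README changes above: the statement that a relation only needs "an iteration of symbolized tuples" can be tried without Bmg at all. The sketch below is an illustration only. `SampleTuples` is a hypothetical class, and plain `Enumerable` stands in for `Bmg::Relation`, so `select` and `map` merely approximate in memory what `restrict` and `allbut` would do.

```ruby
# Hypothetical stand-in: Enumerable replaces Bmg::Relation here, to show
# the tuple-iteration contract only (hashes with Symbol keys).
class SampleTuples
  include Enumerable

  # Yields each tuple as a Hash with Symbol keys.
  def each
    return to_enum(:each) unless block_given?
    yield({ id: 1, name: "Alf", year: 2014 })
    yield({ id: 2, name: "Bmg", year: 2018 })
  end
end

# Rough in-memory equivalents of restrict(year > 2015), then allbut([:year]).
recent = SampleTuples.new
  .select { |t| t[:year] > 2015 }
  .map { |t| t.reject { |k, _| k == :year } }

p recent
```

With the real library, the class would include `Bmg::Relation` instead, as the README example shows, and the whole algebra would then be available on it.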
data/lib/bmg.rb CHANGED
@@ -1,6 +1,7 @@
1
1
  require 'path'
2
2
  require 'predicate'
3
3
  require 'forwardable'
4
+ require 'set'
4
5
  module Bmg
5
6
 
6
7
  def in_memory(enumerable, type = Type::ANY)
@@ -8,6 +9,11 @@ module Bmg
8
9
  end
9
10
  module_function :in_memory
10
11
 
12
+ def text_file(path, options = {}, type = Type::ANY)
13
+ Reader::TextFile.new(type, path, options).spied(main_spy)
14
+ end
15
+ module_function :text_file
16
+
11
17
  def csv(path, options = {}, type = Type::ANY)
12
18
  Reader::Csv.new(type, path, options).spied(main_spy)
13
19
  end
data/lib/bmg/algebra.rb CHANGED
@@ -174,6 +174,7 @@ module Bmg
174
174
 
175
175
  def transform(transformation = nil, options = {}, &proc)
176
176
  transformation, options = proc, (transformation || {}) unless proc.nil?
177
+ return self if transformation.is_a?(Hash) && transformation.empty?
177
178
  _transform(self.type.transform(transformation, options), transformation, options)
178
179
  end
179
180
 
@@ -2,6 +2,14 @@ module Bmg
2
2
  module Algebra
3
3
  module Shortcuts
4
4
 
5
+ def where(predicate)
6
+ restrict(predicate)
7
+ end
8
+
9
+ def exclude(predicate)
10
+ restrict(!Predicate.coerce(predicate))
11
+ end
12
+
5
13
  def rxmatch(attrs, matcher, options = {})
6
14
  predicate = attrs.inject(Predicate.contradiction){|p,a|
7
15
  p | Predicate.match(a, matcher, options)
@@ -31,6 +39,12 @@ module Bmg
31
39
  self.image(right.rename(renaming), as, on.keys, options)
32
40
  end
33
41
 
42
+ def images(rights, on = [], options = {})
43
+ rights.each_pair.inject(self){|memo,(as,right)|
44
+ memo.image(right, as, on, options)
45
+ }
46
+ end
47
+
34
48
  def join(right, on = [])
35
49
  return super unless on.is_a?(Hash)
36
50
  renaming = Hash[on.map{|k,v| [v,k] }]
@@ -63,12 +63,38 @@ module Bmg
63
63
 
64
64
  protected ### optimization
65
65
 
66
+ def _allbut(type, butlist)
67
+ operand.allbut(self.butlist|butlist)
68
+ end
69
+
70
+ def _matching(type, right, on)
71
+ # Always possible to push the matching, since by construction
72
+ # `on` can only use attributes that have not been thrown away,
73
+ # hence they exist on `operand` too.
74
+ operand.matching(right, on).allbut(butlist)
75
+ end
76
+
77
+ def _page(type, ordering, page_index, options)
78
+ return super unless self.preserving_key?
79
+ operand.page(ordering, page_index, options).allbut(butlist)
80
+ end
81
+
82
+ def _project(type, attrlist)
83
+ operand.project(attrlist)
84
+ end
85
+
66
86
  def _restrict(type, predicate)
67
87
  operand.restrict(predicate).allbut(butlist)
68
88
  end
69
89
 
70
90
  protected ### inspect
71
91
 
92
+ def preserving_key?
93
+ operand.type.knows_keys? && operand.type.keys.find{|k|
94
+ (k & butlist).empty?
95
+ }
96
+ end
97
+
72
98
  def args
73
99
  [ butlist ]
74
100
  end
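
Note: the `preserving_key?` check in this hunk decides whether `page` may be pushed below `allbut`: at least one candidate key must survive once the `butlist` attributes are removed. A minimal sketch of that test on plain arrays follows. `key_preserved?` is a hypothetical name, and keys are assumed to be arrays of attribute names.

```ruby
# Hypothetical helper: true when at least one candidate key shares no
# attribute with the list of attributes being thrown away.
def key_preserved?(keys, butlist)
  keys.any? { |key| (key & butlist).empty? }
end

keys = [[:sid], [:name, :city]]

p key_preserved?(keys, [:name])        # [:sid] is untouched
p key_preserved?(keys, [:sid, :city])  # every key loses an attribute
```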
@@ -24,6 +24,22 @@ module Bmg
24
24
 
25
25
  public
26
26
 
27
+ def self.same(*args)
28
+ Same.new(*args)
29
+ end
30
+
31
+ def self.group(*args)
32
+ Group.new(*args)
33
+ end
34
+
35
+ def self.y_by_x(*args)
36
+ YByX.new(*args)
37
+ end
38
+
39
+ def self.ys_by_x(*args)
40
+ YsByX.new(*args)
41
+ end
42
+
27
43
  def each(&bl)
28
44
  h = {}
29
45
  @operand.each do |tuple|
@@ -41,6 +57,12 @@ module Bmg
41
57
  [:autosummarize, operand.to_ast, by.dup, sums.dup]
42
58
  end
43
59
 
60
+ public ### for internal reasons
61
+
62
+ def _count
63
+ operand._count
64
+ end
65
+
44
66
  protected
45
67
 
46
68
  def _restrict(type, predicate)
@@ -175,11 +197,11 @@ module Bmg
175
197
  end
176
198
 
177
199
  def init(v)
178
- [v]
200
+ v.nil? ? [] : [v]
179
201
  end
180
202
 
181
203
  def sum(v1, v2)
182
- v1 << v2
204
+ v2.nil? ? v1 : (v1 << v2)
183
205
  end
184
206
 
185
207
  def term(v)
@@ -211,11 +233,11 @@ module Bmg
211
233
  end
212
234
 
213
235
  def init(v)
214
- [v]
236
+ v.nil? ? [] : [v]
215
237
  end
216
238
 
217
239
  def sum(v1, v2)
218
- v1 << v2
240
+ v2.nil? ? v1 : (v1 << v2)
219
241
  end
220
242
 
221
243
  def term(v)
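
Note: the two `init`/`sum` changes above make the collecting summarizers skip `nil` values instead of accumulating them. The effect can be sketched with two lambdas that copy the new bodies. The lambda names and the sample values are assumptions for illustration.

```ruby
# Same bodies as the new init/sum methods, written as standalone lambdas.
init = ->(v) { v.nil? ? [] : [v] }
sum  = ->(acc, v) { v.nil? ? acc : (acc << v) }

values = [nil, 3, nil, 5]

# The first value seeds the accumulator, the remaining ones are folded in.
collected = values.drop(1).inject(init.call(values.first)) do |acc, v|
  sum.call(acc, v)
end

p collected
```

Before this change, the same input would have kept the `nil` entries in the collected array.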
@@ -52,6 +52,12 @@ module Bmg
52
52
  [ :autowrap, operand.to_ast, @original_options.dup ]
53
53
  end
54
54
 
55
+ public ### for internal reasons
56
+
57
+ def _count
58
+ operand._count
59
+ end
60
+
55
61
  protected ### optimization
56
62
 
57
63
  def _autowrap(type, opts)
@@ -86,6 +92,16 @@ module Bmg
86
92
  false
87
93
  end
88
94
 
95
+ def _matching(type, right, on)
96
+ if (wrapped_roots! & on).empty?
97
+ operand.matching(right, on).autowrap(options)
98
+ else
99
+ super
100
+ end
101
+ rescue UnknownAttributesError
102
+ super
103
+ end
104
+
89
105
  def _page(type, ordering, page_index, opts)
90
106
  attrs = ordering.map{|(a,d)| a }
91
107
  if (wrapped_roots! & attrs).empty?
@@ -97,6 +113,16 @@ module Bmg
97
113
  super
98
114
  end
99
115
 
116
+ def _project(type, attrlist)
117
+ if (wrapped_roots! & attrlist).empty?
118
+ operand.project(attrlist).autowrap(options)
119
+ else
120
+ super
121
+ end
122
+ rescue UnknownAttributesError
123
+ super
124
+ end
125
+
100
126
  def _rename(type, renaming)
101
127
  # 1. Can't optimize if renaming applies to a wrapped one
102
128
  return super unless (wrapped_roots! & renaming.keys).empty?
@@ -54,6 +54,12 @@ module Bmg
54
54
  [ :constants, operand.to_ast, constants.dup ]
55
55
  end
56
56
 
57
+ public ### for internal reasons
58
+
59
+ def _count
60
+ operand._count
61
+ end
62
+
57
63
  protected ### optimization
58
64
 
59
65
  def _page(type, ordering, page_index, options)
@@ -53,6 +53,12 @@ module Bmg
53
53
  [ :extend, operand.to_ast, extension.dup ]
54
54
  end
55
55
 
56
+ public ### for internal reasons
57
+
58
+ def _count
59
+ operand._count
60
+ end
61
+
56
62
  protected ### optimization
57
63
 
58
64
  def _allbut(type, butlist)
@@ -99,9 +99,10 @@ module Bmg
99
99
  key = tuple_project(t, on)
100
100
  index[key].operand << tuple_image(t, on)
101
101
  end
102
- if options[:array]
102
+ if opt = options[:array]
103
+ sorter = to_sorter(opt)
103
104
  index = index.each_with_object({}) do |(k,v),ix|
104
- ix[k] = v.to_a
105
+ ix[k] = sorter ? v.to_a.sort(&sorter) : v.to_a
105
106
  end
106
107
  end
107
108
  index
@@ -154,8 +155,32 @@ module Bmg
154
155
  end
155
156
  end
156
157
 
158
+ public ### for internal reasons
159
+
160
+ def _count
161
+ left._count
162
+ end
163
+
157
164
  protected ### optimization
158
165
 
166
+ def _allbut(type, butlist)
167
+ if butlist.include?(as)
168
+ left.allbut(butlist - [as])
169
+ elsif (butlist & on).empty?
170
+ left.allbut(butlist).image(right, as, on, options)
171
+ else
172
+ super
173
+ end
174
+ end
175
+
176
+ def _matching(type, m_right, m_on)
177
+ if m_on.include?(as)
178
+ super
179
+ else
180
+ left.matching(m_right, m_on).image(right, as, on, options)
181
+ end
182
+ end
183
+
159
184
  def _page(type, ordering, page_index, opts)
160
185
  if ordering.map{|(k,v)| k}.include?(as)
161
186
  super
@@ -166,6 +191,14 @@ module Bmg
166
191
  end
167
192
  end
168
193
 
194
+ def _project(type, attrlist)
195
+ if attrlist.include?(as)
196
+ super
197
+ else
198
+ left.project(attrlist)
199
+ end
200
+ end
201
+
169
202
  def _restrict(type, predicate)
170
203
  on_as, rest = predicate.and_split([as])
171
204
  if rest.tautology?
@@ -227,6 +260,11 @@ module Bmg
227
260
  Relation::InMemory.new(image_type, Set.new)
228
261
  end
229
262
 
263
+ def to_sorter(opt)
264
+ return nil unless opt.is_a?(Array)
265
+ Ordering.new(opt).comparator
266
+ end
267
+
230
268
  public
231
269
 
232
270
  def to_s
@@ -45,13 +45,7 @@ module Bmg
45
45
  protected ### inspect
46
46
 
47
47
  def comparator
48
- ->(t1, t2) {
49
- ordering.each do |(attr,direction)|
50
- c = t1[attr] <=> t2[attr]
51
- return (direction == :desc ? -c : c) unless c==0
52
- end
53
- 0
54
- }
48
+ Ordering.new(@ordering).comparator
55
49
  end
56
50
 
57
51
  def args
@@ -31,7 +31,7 @@ module Bmg
31
31
  def each
32
32
  seen = {}
33
33
  @operand.each do |tuple|
34
- projected = project(tuple)
34
+ projected = tuple_project(tuple)
35
35
  unless seen.has_key?(projected)
36
36
  yield(projected)
37
37
  seen[projected] = true
@@ -74,7 +74,7 @@ module Bmg
74
74
 
75
75
  private
76
76
 
77
- def project(tuple)
77
+ def tuple_project(tuple)
78
78
  tuple.dup.delete_if{|k,_| !@attrlist.include?(k) }
79
79
  end
80
80
 
@@ -60,6 +60,12 @@ module Bmg
60
60
  [ :rename, operand.to_ast, renaming.dup ]
61
61
  end
62
62
 
63
+ public ### for internal reasons
64
+
65
+ def _count
66
+ operand._count
67
+ end
68
+
63
69
  protected ### optimization
64
70
 
65
71
  def _page(type, ordering, page_index, options)
@@ -23,7 +23,7 @@ module Bmg
23
23
 
24
24
  protected
25
25
 
26
- attr_reader :transformation
26
+ attr_reader :transformation, :options
27
27
 
28
28
  public
29
29
 
@@ -40,6 +40,43 @@ module Bmg
40
40
 
41
41
  protected ### optimization
42
42
 
43
+ def _allbut(type, butlist)
44
+ # `allbut` can always be pushed down the tree. Unlike
45
+ # `extend` the Proc that might be used cannot use attributes
46
+ # in butlist, so it's safe to strip them away.
47
+ if transformer.knows_attrlist?
48
+ # We just need to clean the transformation
49
+ attrlist = transformer.to_attrlist
50
+ thrown = attrlist & butlist
51
+ t = transformation.dup.reject{|k,v| thrown.include?(k) }
52
+ operand.allbut(butlist).transform(t, options)
53
+ else
54
+ operand.allbut(butlist).transform(transformation, options)
55
+ end
56
+ end
57
+
58
+ def _project(type, attrlist)
59
+ if transformer.knows_attrlist?
60
+ t = transformation.dup.select{|k,v| attrlist.include?(k) }
61
+ operand.project(attrlist).transform(t, options)
62
+ else
63
+ operand.project(attrlist).transform(transformation, options)
64
+ end
65
+ end
66
+
67
+ def _restrict(type, predicate)
68
+ return super unless transformer.knows_attrlist?
69
+ top, bottom = predicate.and_split(transformer.to_attrlist)
70
+ if top == predicate
71
+ super
72
+ else
73
+ operand
74
+ .restrict(bottom)
75
+ .transform(transformation, options)
76
+ .restrict(top)
77
+ end
78
+ end
79
+
43
80
  protected ### inspect
44
81
 
45
82
  def args
data/lib/bmg/reader.rb CHANGED
@@ -9,3 +9,4 @@ module Bmg
9
9
  end
10
10
  require_relative "reader/csv"
11
11
  require_relative "reader/excel"
12
+ require_relative "reader/text_file"
@@ -0,0 +1,56 @@
1
+ module Bmg
2
+ module Reader
3
+ class TextFile
4
+ include Reader
5
+
6
+ DEFAULT_OPTIONS = {
7
+ strip: true,
8
+ parse: nil
9
+ }
10
+
11
+ def initialize(type, path, options = {})
12
+ options = { parse: options } if options.is_a?(Regexp)
13
+ @path = path
14
+ @options = DEFAULT_OPTIONS.merge(options)
15
+ @type = infer_type(type)
16
+ end
17
+ attr_reader :path, :options
18
+
19
+ public # Relation
20
+
21
+ def each
22
+ path.each_line.each_with_index do |text, line|
23
+ text = text.strip if strip?
24
+ parsed = parse(text)
25
+ yield({line: 1+line}.merge(parsed)) if parsed
26
+ end
27
+ end
28
+
29
+ private
30
+
31
+ def infer_type(base)
32
+ return base unless base == Bmg::Type::ANY
33
+ attr_list = if rx = options[:parse]
34
+ [:line] + rx.names.map(&:to_sym)
35
+ else
36
+ [:line, :text]
37
+ end
38
+ base
39
+ .with_attrlist(attr_list)
40
+ .with_keys([[:line]])
41
+ end
42
+
43
+ def strip?
44
+ options[:strip]
45
+ end
46
+
47
+ def parse(text)
48
+ return { text: text } unless rx = options[:parse]
49
+ if match = rx.match(text)
50
+ TupleAlgebra.symbolize_keys(match.named_captures)
51
+ end
52
+ end
53
+
54
+ end # class TextFile
55
+ end # module Reader
56
+ end # module Bmg
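
Note: the line-to-tuple logic of the new reader can be followed with the standard library alone. The sketch below is not the gem's code. `lines_to_tuples` is a hypothetical helper that takes an array of strings instead of a path, and it uses `transform_keys` where the reader calls `TupleAlgebra.symbolize_keys`.

```ruby
# Hypothetical helper: mimics the reader's each/parse pair on an array of lines.
def lines_to_tuples(lines, parse: nil, strip: true)
  tuples = []
  lines.each_with_index do |text, index|
    text = text.strip if strip
    parsed =
      if parse.nil?
        { text: text }
      elsif (match = parse.match(text))
        # Named captures become attributes, with Symbol keys.
        match.named_captures.transform_keys(&:to_sym)
      end
    # Non-matching lines give nil and are skipped; :line is 1-based.
    tuples << { line: index + 1 }.merge(parsed) if parsed
  end
  tuples
end

log = ["GET /home 200\n", "POST /login 302\n", "GET /about 200\n"]

p lines_to_tuples(log)
p lines_to_tuples(log, parse: /GET (?<url>[^\s]+)/)
```

Because skipped lines still consume a line number, `:line` stays unique per tuple, which is why the reader can declare `[[:line]]` as a key.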
data/lib/bmg/relation.rb CHANGED
@@ -17,6 +17,16 @@ module Bmg
17
17
  self
18
18
  end
19
19
 
20
+ def type
21
+ Bmg::Type::ANY
22
+ end
23
+
24
+ def with_type(type)
25
+ dup.tap{|r|
26
+ r.type = type
27
+ }
28
+ end
29
+
20
30
  def with_typecheck
21
31
  dup.tap{|r|
22
32
  r.type = r.type.with_typecheck
@@ -100,6 +110,18 @@ module Bmg
100
110
  end
101
111
  end
102
112
 
113
+ def count
114
+ if type.knows_keys?
115
+ project(type.keys.first)._count
116
+ else
117
+ self._count
118
+ end
119
+ end
120
+
121
+ def _count
122
+ to_a.size
123
+ end
124
+
103
125
  # Returns a json representation
104
126
  def to_json(*args, &bl)
105
127
  to_a.to_json(*args, &bl)
@@ -113,9 +135,10 @@ module Bmg
113
135
  # When no string_or_io is used, the method uses a string.
114
136
  #
115
137
  # The method always returns the string_or_io.
116
- def to_csv(options = {}, string_or_io = nil)
138
+ def to_csv(options = {}, string_or_io = nil, preferences = nil)
117
139
  options, string_or_io = {}, options unless options.is_a?(Hash)
118
- Writer::Csv.new(options).call(self, string_or_io)
140
+ string_or_io, preferences = nil, string_or_io if string_or_io.is_a?(Hash)
141
+ Writer::Csv.new(options, preferences).call(self, string_or_io)
119
142
  end
120
143
 
121
144
  # Converts to an sexpr expression.
@@ -19,6 +19,10 @@ module Bmg
19
19
  def each(&bl)
20
20
  end
21
21
 
22
+ def _count
23
+ 0
24
+ end
25
+
22
26
  def to_ast
23
27
  [ :empty ]
24
28
  end
@@ -17,6 +17,16 @@ module Bmg
17
17
  @operand.each(&bl)
18
18
  end
19
19
 
20
+ def _count
21
+ if operand.respond_to?(:count)
22
+ operand.count
23
+ elsif operand.respond_to?(:size)
24
+ operand.size
25
+ else
26
+ super
27
+ end
28
+ end
29
+
20
30
  def to_ast
21
31
  [ :in_memory, operand ]
22
32
  end
@@ -16,6 +16,12 @@ module Bmg
16
16
  end
17
17
  protected :type=
18
18
 
19
+ public
20
+
21
+ def _count
22
+ operand._count
23
+ end
24
+
19
25
  public
20
26
 
21
27
  def each(&bl)
@@ -24,10 +24,15 @@ module Bmg
24
24
  protected :type=
25
25
 
26
26
  def each(&bl)
27
- spy.call(self)
27
+ spy.call(self) if bl
28
28
  operand.each(&bl)
29
29
  end
30
30
 
31
+ def count
32
+ spy.call(self) if bl
33
+ operand.count
34
+ end
35
+
31
36
  def to_ast
32
37
  [ :spied, operand.to_ast, spy ]
33
38
  end
@@ -33,6 +33,10 @@ module Bmg
33
33
  base_table.update(arg)
34
34
  end
35
35
 
36
+ def _count
37
+ dataset.count
38
+ end
39
+
36
40
  def to_ast
37
41
  [:sequel, dataset.sql]
38
42
  end
@@ -10,7 +10,6 @@ module Bmg
10
10
  end
11
11
 
12
12
  attr_accessor :type
13
- protected :type=
14
13
 
15
14
  protected
16
15
 
data/lib/bmg/support.rb CHANGED
@@ -1,3 +1,5 @@
1
1
  require_relative 'support/tuple_algebra'
2
2
  require_relative 'support/tuple_transformer'
3
3
  require_relative 'support/keys'
4
+ require_relative 'support/ordering'
5
+ require_relative 'support/output_preferences'
@@ -0,0 +1,20 @@
1
+ module Bmg
2
+ class Ordering
3
+
4
+ def initialize(attrs)
5
+ @attrs = attrs
6
+ end
7
+ attr_reader :attrs
8
+
9
+ def comparator
10
+ ->(t1, t2) {
11
+ attrs.each do |(attr,direction)|
12
+ c = t1[attr] <=> t2[attr]
13
+ return (direction == :desc ? -c : c) unless c==0
14
+ end
15
+ 0
16
+ }
17
+ end
18
+
19
+ end # class Ordering
20
+ end # module Bmg
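
Note: the comparator above walks the ordering pairs and returns on the first attribute that differs. A standalone sketch of the same idea follows. `comparator_for` is a hypothetical name, and the ordering is assumed to be an array of `[attribute, direction]` pairs.

```ruby
# Hypothetical helper: builds a two-argument lambda usable with Array#sort.
def comparator_for(ordering)
  ->(t1, t2) {
    ordering.each do |(attr, direction)|
      c = t1[attr] <=> t2[attr]
      # First differing attribute decides; :desc flips the sign.
      return (direction == :desc ? -c : c) unless c == 0
    end
    0
  }
end

tuples = [
  { city: "Paris",  status: 10 },
  { city: "London", status: 20 },
  { city: "Paris",  status: 30 }
]

sorted = tuples.sort(&comparator_for([[:city, :asc], [:status, :desc]]))
p sorted
```

One caveat that applies to this sketch: `<=>` returns `nil` for values that cannot be compared (a `nil` attribute against an Integer, for instance), and negating `nil` raises a `NoMethodError`, so tuples with missing values need care.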
@@ -0,0 +1,44 @@
+ module Bmg
+   class OutputPreferences
+
+     DEFAULT_PREFS = {
+       attributes_ordering: nil,
+       extra_attributes: :after
+     }
+
+     def initialize(options)
+       @options = DEFAULT_PREFS.merge(options)
+     end
+     attr_reader :options
+
+     def self.dress(arg)
+       return arg if arg.is_a?(OutputPreferences)
+       arg = {} if arg.nil?
+       new(arg)
+     end
+
+     def attributes_ordering
+       options[:attributes_ordering]
+     end
+
+     def extra_attributes
+       options[:extra_attributes]
+     end
+
+     def order_attrlist(attrlist)
+       return attrlist if attributes_ordering.nil?
+       index = Hash[attributes_ordering.each_with_index.to_a]
+       attrlist.sort{|a,b|
+         ai, bi = index[a], index[b]
+         if ai && bi
+           ai <=> bi
+         elsif ai
+           extra_attributes == :after ? -1 : 1
+         else
+           extra_attributes == :after ? 1 : -1
+         end
+       }
+     end
+
+   end # class OutputPreferences
+ end # module Bmg
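The effect of `order_attrlist` can be sketched with a small standalone helper. `order_attrlist_sketch` is a hypothetical function that mirrors the method body above, with the two preferences passed as arguments; the attribute names are made up for illustration.

```ruby
# Hypothetical standalone helper mirroring OutputPreferences#order_attrlist.
def order_attrlist_sketch(attrlist, attributes_ordering, extra_attributes = :after)
  return attrlist if attributes_ordering.nil?
  # Map each listed attribute to its position, e.g. { id: 0, name: 1 }
  index = Hash[attributes_ordering.each_with_index.to_a]
  attrlist.sort { |a, b|
    ai, bi = index[a], index[b]
    if ai && bi
      ai <=> bi                              # both listed: keep listed order
    elsif ai
      extra_attributes == :after ? -1 : 1    # only `a` is listed
    else
      extra_attributes == :after ? 1 : -1    # `a` is an extra attribute
    end
  }
end

listed_first = order_attrlist_sketch([:age, :name, :id], [:id, :name])
extras_first = order_attrlist_sketch([:age, :name, :id], [:id, :name], :before)
untouched    = order_attrlist_sketch([:age, :name, :id], nil)
```

The sketch uses a single extra attribute on purpose: when neither attribute is listed, the block never returns 0, so the relative order of several extra attributes is not pinned down by this comparator.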
@@ -19,5 +19,11 @@ module Bmg
      end
      module_function :rename

+     def symbolize_keys(h)
+       return h if h.empty?
+       h.each_with_object({}){|(k,v),h| h[k.to_sym] = v }
+     end
+     module_function :symbolize_keys
+
    end # module TupleAlgebra
  end # module Bmg
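A minimal sketch of what `symbolize_keys` does, written as a standalone copy so it runs without the gem. The accumulator is named `acc` here only to avoid shadowing the argument; the sample tuple is hypothetical.

```ruby
# Standalone copy of the helper added in the diff (outside Bmg::TupleAlgebra).
def symbolize_keys(h)
  return h if h.empty?
  h.each_with_object({}) { |(k, v), acc| acc[k.to_sym] = v }
end

tuple      = { "sid" => "S1", "name" => "Smith" }
symbolized = symbolize_keys(tuple)

# An empty hash is returned as is (same object), not copied.
empty  = {}
result = symbolize_keys(empty)
```

Non-empty input yields a new hash, so the original tuple keeps its string keys.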
@@ -26,11 +26,7 @@ module Bmg

      def transform_tuple(tuple, with)
        case with
-       when Symbol
-         tuple.each_with_object({}){|(k,v),dup|
-           dup[k] = transform_attr(v, with)
-         }
-       when Proc
+       when Symbol, Proc, Regexp
          tuple.each_with_object({}){|(k,v),dup|
            dup[k] = transform_attr(v, with)
          }
@@ -51,8 +47,13 @@ module Bmg
        case with
        when Symbol
          value.send(with)
+       when Regexp
+         m = with.match(value.to_s)
+         m.nil? ? m : m.to_s
        when Proc
          with.call(value)
+       when Hash
+         with[value]
        else
          raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
        end
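The four value-level transformations handled by this `case` can be sketched in isolation. `transform_attr_sketch` is a hypothetical standalone function built from the branches visible in the hunk above; the sample values are made up.

```ruby
# Hypothetical standalone version of the `case with` shown in the hunk above.
def transform_attr_sketch(value, with)
  case with
  when Symbol
    value.send(with)               # call the named method on the value
  when Regexp
    m = with.match(value.to_s)     # match against the string form
    m.nil? ? m : m.to_s            # nil when nothing matches, else matched text
  when Proc
    with.call(value)
  when Hash
    with[value]                    # plain lookup; nil for unknown keys
  else
    raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
  end
end

upcased  = transform_attr_sketch("smith", :upcase)
digits   = transform_attr_sketch("order-42", /\d+/)
no_match = transform_attr_sketch("order", /\d+/)
doubled  = transform_attr_sketch(21, ->(v) { v * 2 })
mapped   = transform_attr_sketch("L", { "L" => "London" })
unmapped = transform_attr_sketch("P", { "L" => "London" })
```

Both the `Regexp` and the `Hash` branch yield `nil` when there is no match or no entry, so callers may need to handle missing values.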
data/lib/bmg/version.rb CHANGED
@@ -1,8 +1,8 @@
  module Bmg
    module Version
      MAJOR = 0
-     MINOR = 17
-     TINY = 6
+     MINOR = 18
+     TINY = 2
    end
    VERSION = "#{Version::MAJOR}.#{Version::MINOR}.#{Version::TINY}"
  end
@@ -6,26 +6,37 @@ module Bmg
        DEFAULT_OPTIONS = {
        }

-       def initialize(options)
-         @options = DEFAULT_OPTIONS.merge(options)
+       def initialize(csv_options, output_preferences = nil)
+         @csv_options = DEFAULT_OPTIONS.merge(csv_options)
+         @output_preferences = OutputPreferences.dress(output_preferences)
        end
-       attr_reader :options
+       attr_reader :csv_options, :output_preferences

        def call(relation, string_or_io = nil)
          require 'csv'
          string_or_io, to_s = string_or_io.nil? ? [StringIO.new, true] : [string_or_io, false]
-         headers = relation.type.to_attrlist if relation.type.knows_attrlist?
-         csv = nil
+         headers, csv = infer_headers(relation.type), nil
          relation.each do |tuple|
            if csv.nil?
-             headers = tuple.keys if headers.nil?
-             csv = CSV.new(string_or_io, options.merge(headers: headers))
+             headers = infer_headers(tuple) if headers.nil?
+             csv = CSV.new(string_or_io, csv_options.merge(headers: headers))
            end
            csv << headers.map{|h| tuple[h] }
          end
          to_s ? string_or_io.string : string_or_io
        end

+       private
+
+       def infer_headers(from)
+         attrlist = if from.is_a?(Type) && from.knows_attrlist?
+           from.to_attrlist
+         elsif from.is_a?(Hash)
+           from.keys
+         end
+         attrlist ? output_preferences.order_attrlist(attrlist) : nil
+       end
+
      end # class Csv
    end # module Writer
  end # module Bmg
data/tasks/test.rake CHANGED
@@ -6,17 +6,24 @@ namespace :test do
    desc "Runs unit tests"
    RSpec::Core::RakeTask.new(:unit) do |t|
      t.pattern = "spec/unit/**/test_*.rb"
-     t.rspec_opts = ["-Ilib", "-Ispec/unit", "--fail-fast", "--color", "--backtrace", "--format=progress"]
+     t.rspec_opts = ["-Ilib", "-Ispec/unit", "--color", "--backtrace", "--format=progress"]
    end
    tests << :unit

    desc "Runs integration tests"
    RSpec::Core::RakeTask.new(:integration) do |t|
      t.pattern = "spec/integration/**/test_*.rb"
-     t.rspec_opts = ["-Ilib", "-Ispec/integration", "--fail-fast", "--color", "--backtrace", "--format=progress"]
+     t.rspec_opts = ["-Ilib", "-Ispec/integration", "--color", "--backtrace", "--format=progress"]
    end
    tests << :integration

+   desc "Runs github regression tests"
+   RSpec::Core::RakeTask.new(:regression) do |t|
+     t.pattern = "spec/regression/**/test_*.rb"
+     t.rspec_opts = ["-Ilib", "-Ispec/regression", "--color", "--backtrace", "--format=progress"]
+   end
+   tests << :regression
+
    task :all => tests
  end

metadata CHANGED
@@ -1,49 +1,49 @@
  --- !ruby/object:Gem::Specification
  name: bmg
  version: !ruby/object:Gem::Version
-   version: 0.17.6
+   version: 0.18.2
  platform: ruby
  authors:
  - Bernard Lambeau
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-08-28 00:00:00.000000000 Z
+ date: 2021-04-16 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: predicate
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: '2.4'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 2.4.0
+         version: 2.5.0
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.5'
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: '2.4'
      - - ">="
        - !ruby/object:Gem::Version
-         version: 2.4.0
+         version: 2.5.0
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.5'
  - !ruby/object:Gem::Dependency
    name: path
    requirement: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '1.3'
+         version: '2.0'
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '1.3'
+         version: '2.0'
  - !ruby/object:Gem::Dependency
    name: rake
    requirement: !ruby/object:Gem::Requirement
@@ -78,14 +78,14 @@ dependencies:
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '2.7'
+         version: '2.8'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '2.7'
+         version: '2.8'
  - !ruby/object:Gem::Dependency
    name: sequel
    requirement: !ruby/object:Gem::Requirement
@@ -154,6 +154,7 @@ files:
  - lib/bmg/reader.rb
  - lib/bmg/reader/csv.rb
  - lib/bmg/reader/excel.rb
+ - lib/bmg/reader/text_file.rb
  - lib/bmg/relation.rb
  - lib/bmg/relation/empty.rb
  - lib/bmg/relation/in_memory.rb
@@ -260,6 +261,8 @@ files:
  - lib/bmg/summarizer/variance.rb
  - lib/bmg/support.rb
  - lib/bmg/support/keys.rb
+ - lib/bmg/support/ordering.rb
+ - lib/bmg/support/output_preferences.rb
  - lib/bmg/support/tuple_algebra.rb
  - lib/bmg/support/tuple_transformer.rb
  - lib/bmg/type.rb
@@ -287,7 +290,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubygems_version: 3.1.2
+ rubygems_version: 3.0.8
  signing_key:
  specification_version: 4
  summary: Bmg is Alf's successor.