bmg 0.17.4 → 0.18.0
- checksums.yaml +4 -4
- data/Gemfile +0 -3
- data/README.md +239 -57
- data/lib/bmg.rb +8 -0
- data/lib/bmg/algebra.rb +10 -0
- data/lib/bmg/algebra/shortcuts.rb +8 -0
- data/lib/bmg/operator.rb +1 -0
- data/lib/bmg/operator/transform.rb +57 -0
- data/lib/bmg/reader.rb +1 -0
- data/lib/bmg/reader/text_file.rb +56 -0
- data/lib/bmg/relation.rb +24 -0
- data/lib/bmg/relation/proxy.rb +63 -0
- data/lib/bmg/relation/spied.rb +1 -1
- data/lib/bmg/support.rb +2 -0
- data/lib/bmg/support/keys.rb +4 -0
- data/lib/bmg/support/output_preferences.rb +44 -0
- data/lib/bmg/support/tuple_algebra.rb +6 -0
- data/lib/bmg/support/tuple_transformer.rb +63 -0
- data/lib/bmg/type.rb +25 -0
- data/lib/bmg/version.rb +2 -2
- data/lib/bmg/writer.rb +1 -0
- data/lib/bmg/writer/csv.rb +42 -0
- data/tasks/test.rake +9 -2
- metadata +34 -27
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: f1ce59d00b630f644e5716eaff17a83116c1342ea0b6ba7d9174b1f5f4eadd6e
|
4
|
+
data.tar.gz: 2156d1a8eb88999e749f434a8e94d2b4e70c636fadb10470493f19bb805a19a3
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 8e3e714d698ff7c47c2f61b4056c73acaa39983306402eeb54d1957b90959af51fcf20a543c5c17242d260fe835b33c19dd960e3cd4920a66d49ea81a22ee4b0
|
7
|
+
data.tar.gz: 89fe1cc4c3157adf7373fb755e0cf7eb5dd8d93f7061558454d27649b44aca558f936f006e3238dce410ffdfe29821d59feac3b31b5d64241ffc76192bba149f
|
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -1,16 +1,30 @@
|
|
1
1
|
# Bmg, a relational algebra (Alf's successor)!
|
2
2
|
|
3
|
+
[](https://travis-ci.com/enspirit/bmg)
|
4
|
+
|
3
5
|
Bmg is a relational algebra implemented as a ruby library. It implements the
|
4
6
|
[Relation as First-Class Citizen](http://www.try-alf.org/blog/2013-10-21-relations-as-first-class-citizen)
|
5
|
-
paradigm contributed with Alf a few years ago.
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
7
|
+
paradigm contributed with [Alf](http://www.try-alf.org/) a few years ago.
|
8
|
+
|
9
|
+
Bmg can be used to query relations in memory, from various files, SQL databases,
|
10
|
+
and any data source that can be seen as serving relations. Cross data-sources
|
11
|
+
joins are supported, as with Alf. For differences with Alf, see a section
|
12
|
+
further down this README.
|
13
|
+
|
14
|
+
## Outline
|
15
|
+
|
16
|
+
* [Example](#example)
|
17
|
+
* [Where are base relations coming from?](#where-are-base-relations-coming-from)
|
18
|
+
* [Memory relations](#memory-relations)
|
19
|
+
* [Connecting to SQL databases](#connecting-to-sql-databases)
|
20
|
+
* [Reading files (csv, excel, text)](#reading-files-csv-excel-text)
|
21
|
+
* [Your own relations](#your-own-relations)
|
22
|
+
* [List of supported operators](#supported-operators)
|
23
|
+
* [How is this different?](#how-is-this-different)
|
24
|
+
* [... from similar libraries](#-from-similar-libraries)
|
25
|
+
* [... from Alf](#-from-alf)
|
26
|
+
* [Contribute](#contribute)
|
27
|
+
* [License](#license)
|
14
28
|
|
15
29
|
## Example
|
16
30
|
|
@@ -27,7 +41,7 @@ suppliers = Bmg::Relation.new([
|
|
27
41
|
])
|
28
42
|
|
29
43
|
by_city = suppliers
|
30
|
-
.
|
44
|
+
.exclude(status: 30)
|
31
45
|
.extend(upname: ->(t){ t[:name].upcase })
|
32
46
|
.group([:sid, :name, :status], :suppliers_in)
|
33
47
|
|
@@ -35,76 +49,158 @@ puts JSON.pretty_generate(by_city)
|
|
35
49
|
# [{...},...]
|
36
50
|
```
|
37
51
|
|
38
|
-
##
|
52
|
+
## Where are base relations coming from?
|
53
|
+
|
54
|
+
Bmg sees relations as sets/enumerable of symbolized Ruby hashes. The following
|
55
|
+
sections show you how to get them in the first place, to enter Relationland.
|
56
|
+
|
57
|
+
### Memory relations
|
58
|
+
|
59
|
+
If you have an Array of Hashes -- in fact any Enumerable -- you can easily get
|
60
|
+
a Relation using either `Bmg::Relation.new` or `Bmg.in_memory`.
|
61
|
+
|
62
|
+
```ruby
|
63
|
+
# this...
|
64
|
+
r = Bmg::Relation.new [{id: 1}, {id: 2}]
|
65
|
+
|
66
|
+
# is the same as this...
|
67
|
+
r = Bmg.in_memory [{id: 1}, {id: 2}]
|
68
|
+
|
69
|
+
# entire algebra is available on `r`
|
70
|
+
```
|
71
|
+
|
72
|
+
### Connecting to SQL databases
|
39
73
|
|
40
|
-
Bmg requires `sequel >= 3.0` to connect to SQL databases.
|
74
|
+
Bmg currently requires `sequel >= 3.0` to connect to SQL databases. You also
|
75
|
+
need to require `bmg/sequel`.
|
41
76
|
|
42
77
|
```ruby
|
43
78
|
require 'sqlite3'
|
44
79
|
require 'bmg'
|
45
80
|
require 'bmg/sequel'
|
81
|
+
```
|
46
82
|
|
47
|
-
|
83
|
+
Then `Bmg.sequel` serves relations for tables of your SQL database:
|
48
84
|
|
85
|
+
```ruby
|
86
|
+
DB = Sequel.connect("sqlite://suppliers-and-parts.db")
|
49
87
|
suppliers = Bmg.sequel(:suppliers, DB)
|
88
|
+
```
|
89
|
+
|
90
|
+
The entire algebra is available on those relations. As long as you keep using
|
91
|
+
operators that can be translated to SQL, results remain SQL-able:
|
50
92
|
|
93
|
+
```ruby
|
51
94
|
big_suppliers = suppliers
|
52
|
-
.
|
95
|
+
.exclude(status: 30)
|
96
|
+
.project([:sid, :name])
|
53
97
|
|
54
98
|
puts big_suppliers.to_sql
|
55
|
-
# SELECT `t1`.`sid`, `t1`.`name
|
99
|
+
# SELECT `t1`.`sid`, `t1`.`name` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)
|
100
|
+
```
|
56
101
|
|
57
|
-
|
58
|
-
|
102
|
+
Operators not translatable to SQL are available too (such as `group` below).
|
103
|
+
Bmg falls back to memory operators for them, but remains capable of pushing some
|
104
|
+
operators down the tree as illustrated below (the restriction on `:city` is
|
105
|
+
pushed to the SQL server):
|
106
|
+
|
107
|
+
```ruby
|
108
|
+
Bmg.sequel(:suppliers, sequel_db)
|
109
|
+
.project([:sid, :name, :city])
|
110
|
+
.group([:sid, :name], :suppliers_in)
|
111
|
+
.restrict(city: ["Paris", "London"])
|
112
|
+
.debug
|
113
|
+
|
114
|
+
# (group
|
115
|
+
# (sequel SELECT `t1`.`sid`, `t1`.`name`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`city` IN ('Paris', 'London')))
|
116
|
+
# [:sid, :name, :status]
|
117
|
+
# :suppliers_in
|
118
|
+
# {:array=>false})
|
59
119
|
```
|
60
120
|
|
61
|
-
|
121
|
+
### Reading files (csv, excel, text)
|
62
122
|
|
63
|
-
|
64
|
-
|
65
|
-
chaining relational operators is limited (yielding errors or wrong SQL
|
66
|
-
queries). Bmg **always** allows chaining operators. If it does not, it's
|
67
|
-
a bug. In other words, the following query is 100% valid:
|
123
|
+
Bmg provides simple adapters to read files and reach Relationland as soon as
|
124
|
+
possible.
|
68
125
|
|
69
|
-
|
70
|
-
.restrict(...) # aka where
|
71
|
-
.union(...)
|
72
|
-
.summarize(...) # aka group by
|
73
|
-
.restrict(...)
|
126
|
+
#### CSV files
|
74
127
|
|
75
|
-
|
76
|
-
|
77
|
-
|
128
|
+
```ruby
|
129
|
+
csv_options = { col_sep: ",", quote_char: '"' }
|
130
|
+
r = Bmg.csv("path/to/a/file.csv", csv_options)
|
131
|
+
```
|
78
132
|
|
79
|
-
|
80
|
-
|
133
|
+
Options are directly transmitted to `::CSV.new`, check ruby's standard
|
134
|
+
library.
|
81
135
|
|
82
|
-
|
83
|
-
autosummarize, etc.) and allows building 'non flat' relations.
|
136
|
+
#### Excel files
|
84
137
|
|
85
|
-
|
138
|
+
You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
|
139
|
+
read `.xls` and `.xlsx` files with Bmg.
|
86
140
|
|
87
|
-
|
88
|
-
|
141
|
+
```ruby
|
142
|
+
roo_options = { skip: 1 }
|
143
|
+
r = Bmg.excel("path/to/a/file.xls", roo_options)
|
144
|
+
```
|
89
145
|
|
90
|
-
|
91
|
-
|
92
|
-
(critical) production systems.
|
146
|
+
Options are directly transmitted to `Roo::Spreadsheet.open`, check roo's
|
147
|
+
documentation.
|
93
148
|
|
94
|
-
|
95
|
-
many more. Bmg is limited to the core algebra, main Relation abstraction
|
96
|
-
and SQL generation.
|
149
|
+
#### Text files
|
97
150
|
|
98
|
-
|
99
|
-
|
100
|
-
left_join operator, etc.). Sharp tools hurt, use them with great care.
|
151
|
+
There is also a straightforward way to read text files and convert lines to
|
152
|
+
tuples.
|
101
153
|
|
102
|
-
|
103
|
-
|
154
|
+
```ruby
|
155
|
+
r = Bmg.text_file("path/to/a/file.txt")
|
156
|
+
r.type.attrlist
|
157
|
+
# => [:line, :text]
|
158
|
+
```
|
104
159
|
|
105
|
-
|
106
|
-
|
107
|
-
|
160
|
+
Without options, tuples will have `:line` and `:text` attributes, the former
|
161
|
+
being the line number (starting at 1) and the latter being the line itself
|
162
|
+
(stripped).
|
163
|
+
|
164
|
+
There are a couple of options (see `Bmg::Reader::TextFile`). The most useful one
|
165
|
+
is the use of a Regexp with named captures to automatically extract
|
166
|
+
attributes:
|
167
|
+
|
168
|
+
```ruby
|
169
|
+
r = Bmg.text_file("path/to/a/file.txt", parse: /GET (?<url>([^\s]+))/)
|
170
|
+
r.type.attrlist
|
171
|
+
# => [:line, :url]
|
172
|
+
```
|
173
|
+
|
174
|
+
In this scenario, non-matching lines are skipped. The `:line` attribute keeps
|
175
|
+
being used to have at least one candidate key (so to speak).
|
176
|
+
|
177
|
+
### Your own relations
|
178
|
+
|
179
|
+
As noted earlier, Bmg has a simple relation interface where you only have to
|
180
|
+
provide an iteration of symbolized tuples.
|
181
|
+
|
182
|
+
```ruby
|
183
|
+
class MyRelation
|
184
|
+
include Bmg::Relation
|
185
|
+
|
186
|
+
def each
|
187
|
+
yield(id: 1, name: "Alf", year: 2014)
|
188
|
+
yield(id: 2, name: "Bmg", year: 2018)
|
189
|
+
end
|
190
|
+
end
|
191
|
+
|
192
|
+
MyRelation.new
|
193
|
+
.restrict(Predicate.gt(:year, 2015))
|
194
|
+
.allbut([:year])
|
195
|
+
```
|
196
|
+
|
197
|
+
As shown, creating adapters on top of various data sources is straightforward.
|
198
|
+
Adapters can also participate in query optimization (such as pushing
|
199
|
+
restrictions down the tree) by overriding the underscored version of operators
|
200
|
+
(e.g. `_restrict`).
|
201
|
+
|
202
|
+
Have a look at `Bmg::Algebra` for the protocol and `Bmg::Sql::Relation` for an
|
203
|
+
example. Keep in touch with the team if you need some help.
|
108
204
|
|
109
205
|
## Supported operators
|
110
206
|
|
@@ -114,6 +210,7 @@ r.autowrap(split: '_') # structure a flat relation, split:
|
|
114
210
|
r.autosummarize([:a, :b, ...], x: :sum) # (experimental) usual summarizers supported
|
115
211
|
r.constants(x: 12, ...) # add constant attributes (sometimes useful in unions)
|
116
212
|
r.extend(x: ->(t){ ... }, ...) # add computed attributes
|
213
|
+
r.exclude(predicate) # shortcut for restrict(!predicate)
|
117
214
|
r.group([:a, :b, ...], :x) # relation-valued attribute from attributes
|
118
215
|
r.image(right, :x, [:a, :b, ...]) # relation-valued attribute from another relation
|
119
216
|
r.join(right, [:a, :b, ...]) # natural join on a join key
|
@@ -132,15 +229,100 @@ r.restrict(a: "foo", b: "bar", ...) # relational restriction, aka where
|
|
132
229
|
r.rxmatch([:a, :b, ...], /xxx/) # regex match kind of restriction
|
133
230
|
r.summarize([:a, :b, ...], x: :sum) # relational summarization
|
134
231
|
r.suffix(:_foo, but: [:a, ...]) # suffix kind of renaming
|
232
|
+
t.transform(:to_s) # all-attrs transformation
|
233
|
+
t.transform(&:to_s) # similar, but Proc-driven
|
234
|
+
t.transform(:foo => :upcase, ...) # specific-attrs transformation
|
235
|
+
t.transform([:to_s, :upcase]) # chain-transformation
|
135
236
|
r.union(right) # relational union
|
237
|
+
r.where(predicate) # alias for restrict(predicate)
|
136
238
|
```
|
137
239
|
|
138
|
-
##
|
240
|
+
## How is this different?
|
241
|
+
|
242
|
+
### ... from similar libraries?
|
243
|
+
|
244
|
+
1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ,
|
245
|
+
etc.) do not implement a genuine relational algebra. Their support for
|
246
|
+
chaining relational operators is thus limited (restricting your expression
|
247
|
+
power and/or raising errors and/or outputting wrong or counterintuitive
|
248
|
+
SQL code). Bmg **always** allows chaining operators. If it does not, it's
|
249
|
+
a bug.
|
250
|
+
|
251
|
+
For instance the expression below is 100% valid in Bmg. The last where
|
252
|
+
clause applies to the result of the summarize (while SQL requires a `HAVING`
|
253
|
+
clause, or a `SELECT ... FROM (SELECT ...) r`).
|
254
|
+
|
255
|
+
```ruby
|
256
|
+
relation
|
257
|
+
.where(...)
|
258
|
+
.union(...)
|
259
|
+
.summarize(...) # aka group by
|
260
|
+
.where(...)
|
261
|
+
```
|
262
|
+
|
263
|
+
2. Bmg supports in memory relations, json relations, csv relations, SQL
|
264
|
+
relations and so on. It's not tied to SQL generation, and supports
|
265
|
+
queries across multiple data sources.
|
266
|
+
|
267
|
+
3. Bmg makes a best effort to optimize queries, simplifying both generated
|
268
|
+
SQL code (low-level accesses to datasources) and in-memory operations.
|
269
|
+
|
270
|
+
4. Bmg supports various *structuring* operators (group, image, autowrap,
|
271
|
+
autosummarize, etc.) and allows building 'non flat' relations.
|
272
|
+
|
273
|
+
5. Bmg can use full ruby power when that helps (e.g. regular expressions in
|
274
|
+
WHERE clauses or ruby code in EXTEND clauses). This may prevent Bmg from
|
275
|
+
delegating work to underlying data sources (e.g. SQL server) and should
|
276
|
+
therefore be used with care though.
|
277
|
+
|
278
|
+
### ... from Alf?
|
279
|
+
|
280
|
+
If you use Alf (or used it in the past), below are the main differences between
|
281
|
+
Bmg and Alf. Bmg has NOT been written to be API-compatible with Alf and will
|
282
|
+
probably never be.
|
283
|
+
|
284
|
+
1. Bmg's implementation is much simpler than Alf's and uses no ruby core
|
285
|
+
extension.
|
286
|
+
|
287
|
+
2. We are confident using Bmg in production. Systematic inspection of query
|
288
|
+
plans is advised though. Alf was a bit too experimental to be used on
|
289
|
+
(critical) production systems.
|
290
|
+
|
291
|
+
3. Alf exposes a functional syntax, command line tool, restful tools and
|
292
|
+
many more. Bmg is limited to the core algebra, main Relation abstraction
|
293
|
+
and SQL generation.
|
139
294
|
|
140
|
-
|
295
|
+
4. Bmg is less strict regarding conformance to relational theory, and
|
296
|
+
may actually expose non relational features (such as support for null,
|
297
|
+
left_join operator, etc.). Sharp tools hurt, use them with care.
|
298
|
+
|
299
|
+
5. Unlike Alf::Relation, instances of Bmg::Relation capture query-trees, not
|
300
|
+
values. Currently two instances `r1` and `r2` are not equal even if they
|
301
|
+
define the same mathematical relation. As a consequence joining on
|
302
|
+
relation-valued attributes does not work as expected in Bmg until further
|
303
|
+
notice.
|
304
|
+
|
305
|
+
6. Bmg does not implement all operators documented on try-alf.org, even if
|
306
|
+
we plan to eventually support most of them.
|
307
|
+
|
308
|
+
7. Bmg has a few additional operators that prove very useful on real
|
309
|
+
production use cases: prefix, suffix, autowrap, autosummarize, left_join,
|
310
|
+
rxmatch, etc.
|
311
|
+
|
312
|
+
8. Bmg optimizes queries and compiles them to SQL on the fly, while Alf was
|
313
|
+
building an AST internally first. Strictly speaking this makes Bmg less
|
314
|
+
powerful than Alf since optimizations cannot be turned off for now.
|
315
|
+
|
316
|
+
## Contribute
|
317
|
+
|
318
|
+
Please use github issues and pull requests for all questions, bug reports,
|
319
|
+
and contributions. Don't hesitate to get in touch with us with an early code
|
320
|
+
spike if you plan to add non trivial features.
|
321
|
+
|
322
|
+
## Licence
|
323
|
+
|
324
|
+
This software is distributed by Enspirit SRL under an MIT Licence. Please
|
325
|
+
contact Bernard Lambeau (blambeau@gmail.com) with any question.
|
141
326
|
|
142
327
|
Enspirit (https://enspirit.be) and Klaro App (https://klaro.cards) are both
|
143
328
|
actively using and contributing to the library.
|
144
|
-
|
145
|
-
Feel free to contact us for help, ideas and/or contributions. Please use github
|
146
|
-
issues and pull requests if possible if code is involved.
|
data/lib/bmg.rb
CHANGED
@@ -1,6 +1,7 @@
|
|
1
1
|
require 'path'
|
2
2
|
require 'predicate'
|
3
3
|
require 'forwardable'
|
4
|
+
require 'set'
|
4
5
|
module Bmg
|
5
6
|
|
6
7
|
def in_memory(enumerable, type = Type::ANY)
|
@@ -8,6 +9,11 @@ module Bmg
|
|
8
9
|
end
|
9
10
|
module_function :in_memory
|
10
11
|
|
12
|
+
def text_file(path, options = {}, type = Type::ANY)
|
13
|
+
Reader::TextFile.new(type, path, options).spied(main_spy)
|
14
|
+
end
|
15
|
+
module_function :text_file
|
16
|
+
|
11
17
|
def csv(path, options = {}, type = Type::ANY)
|
12
18
|
Reader::Csv.new(type, path, options).spied(main_spy)
|
13
19
|
end
|
@@ -38,11 +44,13 @@ module Bmg
|
|
38
44
|
require_relative 'bmg/operator'
|
39
45
|
|
40
46
|
require_relative 'bmg/reader'
|
47
|
+
require_relative 'bmg/writer'
|
41
48
|
|
42
49
|
require_relative 'bmg/relation/empty'
|
43
50
|
require_relative 'bmg/relation/in_memory'
|
44
51
|
require_relative 'bmg/relation/spied'
|
45
52
|
require_relative 'bmg/relation/materialized'
|
53
|
+
require_relative 'bmg/relation/proxy'
|
46
54
|
|
47
55
|
# Deprecated
|
48
56
|
Leaf = Relation::InMemory
|
data/lib/bmg/algebra.rb
CHANGED
@@ -172,6 +172,16 @@ module Bmg
|
|
172
172
|
end
|
173
173
|
protected :_summarize
|
174
174
|
|
175
|
+
def transform(transformation = nil, options = {}, &proc)
|
176
|
+
transformation, options = proc, (transformation || {}) unless proc.nil?
|
177
|
+
_transform(self.type.transform(transformation, options), transformation, options)
|
178
|
+
end
|
179
|
+
|
180
|
+
def _transform(type, transformation, options)
|
181
|
+
Operator::Transform.new(type, self, transformation, options)
|
182
|
+
end
|
183
|
+
protected :_transform
|
184
|
+
|
175
185
|
def union(other, options = {})
|
176
186
|
return self if other.is_a?(Relation::Empty)
|
177
187
|
_union self.type.union(other.type), other, options
|
@@ -2,6 +2,14 @@ module Bmg
|
|
2
2
|
module Algebra
|
3
3
|
module Shortcuts
|
4
4
|
|
5
|
+
def where(predicate)
|
6
|
+
restrict(predicate)
|
7
|
+
end
|
8
|
+
|
9
|
+
def exclude(predicate)
|
10
|
+
restrict(!Predicate.coerce(predicate))
|
11
|
+
end
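The two shortcuts above are thin wrappers: `where` is `restrict`, and `exclude` is `restrict` on the negated predicate. A minimal sketch of that relationship on plain arrays of hashes follows. It does not use Bmg or the `predicate` gem; `matches?`, `where` and `exclude` here are hypothetical top-level helpers that only understand equality conditions.

```ruby
# Hypothetical helpers, for illustration only (equality conditions only).
def matches?(tuple, conditions)
  conditions.all? { |attr, expected| tuple[attr] == expected }
end

# Keeps tuples satisfying the conditions (what `restrict`/`where` does).
def where(tuples, conditions)
  tuples.select { |tuple| matches?(tuple, conditions) }
end

# Keeps tuples NOT satisfying the conditions, i.e. where(!conditions).
def exclude(tuples, conditions)
  tuples.reject { |tuple| matches?(tuple, conditions) }
end

suppliers = [
  { sid: "S1", status: 20 },
  { sid: "S2", status: 30 },
  { sid: "S3", status: 30 }
]

kept    = exclude(suppliers, { status: 30 })
dropped = where(suppliers, { status: 30 })
```

Taken together, `kept` and `dropped` partition the input, which is the property one would expect from a restriction and its negation.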
|
12
|
+
|
5
13
|
def rxmatch(attrs, matcher, options = {})
|
6
14
|
predicate = attrs.inject(Predicate.contradiction){|p,a|
|
7
15
|
p | Predicate.match(a, matcher, options)
|
data/lib/bmg/operator.rb
CHANGED
@@ -0,0 +1,57 @@
|
|
1
|
+
module Bmg
|
2
|
+
module Operator
|
3
|
+
#
|
4
|
+
# Transform operator.
|
5
|
+
#
|
6
|
+
# Transforms existing attributes through computations
|
7
|
+
#
|
8
|
+
# Example:
|
9
|
+
#
|
10
|
+
# [{ a: 1 }] transform { a: ->(a){ a*2 } } => [{ a: 2 }]
|
11
|
+
#
|
12
|
+
class Transform
|
13
|
+
include Operator::Unary
|
14
|
+
|
15
|
+
DEFAULT_OPTIONS = {}
|
16
|
+
|
17
|
+
def initialize(type, operand, transformation, options = {})
|
18
|
+
@type = type
|
19
|
+
@operand = operand
|
20
|
+
@transformation = transformation
|
21
|
+
@options = DEFAULT_OPTIONS.merge(options)
|
22
|
+
end
|
23
|
+
|
24
|
+
protected
|
25
|
+
|
26
|
+
attr_reader :transformation
|
27
|
+
|
28
|
+
public
|
29
|
+
|
30
|
+
def each
|
31
|
+
t = transformer
|
32
|
+
@operand.each do |tuple|
|
33
|
+
yield t.call(tuple)
|
34
|
+
end
|
35
|
+
end
|
36
|
+
|
37
|
+
def to_ast
|
38
|
+
[ :transform, operand.to_ast, transformation.dup ]
|
39
|
+
end
|
40
|
+
|
41
|
+
protected ### optimization
|
42
|
+
|
43
|
+
protected ### inspect
|
44
|
+
|
45
|
+
def args
|
46
|
+
[ transformation ]
|
47
|
+
end
|
48
|
+
|
49
|
+
private
|
50
|
+
|
51
|
+
def transformer
|
52
|
+
@transformer ||= TupleTransformer.new(transformation)
|
53
|
+
end
|
54
|
+
|
55
|
+
end # class Transform
|
56
|
+
end # module Operator
|
57
|
+
end # module Bmg
|
data/lib/bmg/reader.rb
CHANGED
@@ -0,0 +1,56 @@
|
|
1
|
+
module Bmg
|
2
|
+
module Reader
|
3
|
+
class TextFile
|
4
|
+
include Reader
|
5
|
+
|
6
|
+
DEFAULT_OPTIONS = {
|
7
|
+
strip: true,
|
8
|
+
parse: nil
|
9
|
+
}
|
10
|
+
|
11
|
+
def initialize(type, path, options = {})
|
12
|
+
options = { parse: options } if options.is_a?(Regexp)
|
13
|
+
@path = path
|
14
|
+
@options = DEFAULT_OPTIONS.merge(options)
|
15
|
+
@type = infer_type(type)
|
16
|
+
end
|
17
|
+
attr_reader :path, :options
|
18
|
+
|
19
|
+
public # Relation
|
20
|
+
|
21
|
+
def each
|
22
|
+
path.each_line.each_with_index do |text, line|
|
23
|
+
text = text.strip if strip?
|
24
|
+
parsed = parse(text)
|
25
|
+
yield({line: 1+line}.merge(parsed)) if parsed
|
26
|
+
end
|
27
|
+
end
|
28
|
+
|
29
|
+
private
|
30
|
+
|
31
|
+
def infer_type(base)
|
32
|
+
return base unless base == Bmg::Type::ANY
|
33
|
+
attr_list = if rx = options[:parse]
|
34
|
+
[:line] + rx.names.map(&:to_sym)
|
35
|
+
else
|
36
|
+
[:line, :text]
|
37
|
+
end
|
38
|
+
base
|
39
|
+
.with_attrlist(attr_list)
|
40
|
+
.with_keys([[:line]])
|
41
|
+
end
|
42
|
+
|
43
|
+
def strip?
|
44
|
+
options[:strip]
|
45
|
+
end
|
46
|
+
|
47
|
+
def parse(text)
|
48
|
+
return { text: text } unless rx = options[:parse]
|
49
|
+
if match = rx.match(text)
|
50
|
+
TupleAlgebra.symbolize_keys(match.named_captures)
|
51
|
+
end
|
52
|
+
end
|
53
|
+
|
54
|
+
end # class TextFile
|
55
|
+
end # module Reader
|
56
|
+
end # module Bmg
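To make the reader's behaviour concrete, here is a minimal stand-alone sketch that mirrors the `each`/`parse` logic above without requiring Bmg. `lines_to_tuples` is a hypothetical helper name, and it reads from any object responding to `each_line` (a `StringIO` here) rather than from a `Path`.

```ruby
require 'stringio'

# Hypothetical helper mirroring Reader::TextFile#each: one tuple per line,
# numbered from 1; lines that do not match the Regexp are skipped.
def lines_to_tuples(io, parse: nil, strip: true)
  tuples = []
  io.each_line.each_with_index do |text, index|
    text = text.strip if strip
    parsed =
      if parse.nil?
        { text: text }
      elsif (match = parse.match(text))
        # named_captures has String keys; symbolize them as the reader does
        match.named_captures.transform_keys(&:to_sym)
      end
    tuples << { line: index + 1 }.merge(parsed) if parsed
  end
  tuples
end

log = "GET /index.html\nPOST /login\nGET /about\n"

plain  = lines_to_tuples(StringIO.new(log))
parsed = lines_to_tuples(StringIO.new(log), parse: /GET (?<url>\S+)/)
```

Note that skipped lines still consume a line number, so `:line` remains usable as a key after filtering.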
|
data/lib/bmg/relation.rb
CHANGED
@@ -17,6 +17,16 @@ module Bmg
|
|
17
17
|
self
|
18
18
|
end
|
19
19
|
|
20
|
+
def type
|
21
|
+
Bmg::Type::ANY
|
22
|
+
end
|
23
|
+
|
24
|
+
def with_type(type)
|
25
|
+
dup.tap{|r|
|
26
|
+
r.type = type
|
27
|
+
}
|
28
|
+
end
|
29
|
+
|
20
30
|
def with_typecheck
|
21
31
|
dup.tap{|r|
|
22
32
|
r.type = r.type.with_typecheck
|
@@ -105,6 +115,20 @@ module Bmg
|
|
105
115
|
to_a.to_json(*args, &bl)
|
106
116
|
end
|
107
117
|
|
118
|
+
# Writes the relation data to CSV.
|
119
|
+
#
|
120
|
+
# `string_or_io` and `options` are what CSV::new itself
|
121
|
+
# recognizes, default options are CSV's.
|
122
|
+
#
|
123
|
+
# When no string_or_io is used, the method uses a string.
|
124
|
+
#
|
125
|
+
# The method always returns the string_or_io.
|
126
|
+
def to_csv(options = {}, string_or_io = nil, preferences = nil)
|
127
|
+
options, string_or_io = {}, options unless options.is_a?(Hash)
|
128
|
+
string_or_io, preferences = nil, string_or_io if string_or_io.is_a?(Hash)
|
129
|
+
Writer::Csv.new(options, preferences).call(self, string_or_io)
|
130
|
+
end
|
131
|
+
|
108
132
|
# Converts to an sexpr expression.
|
109
133
|
def to_ast
|
110
134
|
raise "Bmg is missing a feature!"
|
@@ -0,0 +1,63 @@
|
|
1
|
+
module Bmg
|
2
|
+
module Relation
|
3
|
+
#
|
4
|
+
# This module can be used to create typed collections on top
|
5
|
+
# of Bmg relations. Algebra methods will be delegated to the
|
6
|
+
# decorated relation, and results wrapped in a new instance
|
7
|
+
# of the class.
|
8
|
+
#
|
9
|
+
module Proxy
|
10
|
+
|
11
|
+
def initialize(relation)
|
12
|
+
@relation = relation
|
13
|
+
end
|
14
|
+
|
15
|
+
def method_missing(name, *args, &bl)
|
16
|
+
if @relation.respond_to?(name)
|
17
|
+
res = @relation.send(name, *args, &bl)
|
18
|
+
res.is_a?(Relation) ? _proxy(res) : res
|
19
|
+
else
|
20
|
+
super
|
21
|
+
end
|
22
|
+
end
|
23
|
+
|
24
|
+
def respond_to?(name, *args)
|
25
|
+
@relation.respond_to?(name) || super
|
26
|
+
end
|
27
|
+
|
28
|
+
[
|
29
|
+
:extend
|
30
|
+
].each do |name|
|
31
|
+
define_method(name) do |*args, &bl|
|
32
|
+
res = @relation.send(name, *args, &bl)
|
33
|
+
res.is_a?(Relation) ? _proxy(res) : res
|
34
|
+
end
|
35
|
+
end
|
36
|
+
|
37
|
+
[
|
38
|
+
:one,
|
39
|
+
:one_or_nil
|
40
|
+
].each do |meth|
|
41
|
+
define_method(meth) do |*args, &bl|
|
42
|
+
res = @relation.send(meth, *args, &bl)
|
43
|
+
res.nil? ? nil : _proxy_tuple(res)
|
44
|
+
end
|
45
|
+
end
|
46
|
+
|
47
|
+
def to_json(*args, &bl)
|
48
|
+
@relation.to_json(*args, &bl)
|
49
|
+
end
|
50
|
+
|
51
|
+
protected
|
52
|
+
|
53
|
+
def _proxy(relation)
|
54
|
+
self.class.new(relation)
|
55
|
+
end
|
56
|
+
|
57
|
+
def _proxy_tuple(tuple)
|
58
|
+
tuple
|
59
|
+
end
|
60
|
+
|
61
|
+
end # module Proxy
|
62
|
+
end # module Relation
|
63
|
+
end # module Bmg
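The delegation pattern used by `Proxy` can be tried in isolation. The sketch below is an assumption-laden miniature, not Bmg code: `TinyRelation` stands in for a real relation, `Suppliers` is a hypothetical typed collection, and `respond_to_missing?` is used instead of overriding `respond_to?` (a common Ruby idiom, not what the module above does).

```ruby
# Stand-in for a Bmg relation (hypothetical, equality restrictions only).
class TinyRelation
  include Enumerable

  def initialize(tuples)
    @tuples = tuples
  end

  def each(&block)
    @tuples.each(&block)
  end

  def restrict(conditions)
    TinyRelation.new(@tuples.select { |t| conditions.all? { |k, v| t[k] == v } })
  end
end

# Hypothetical typed collection: delegates to the wrapped relation and
# re-wraps relation results in a new instance of its own class.
class Suppliers
  def initialize(relation)
    @relation = relation
  end

  def method_missing(name, *args, &block)
    return super unless @relation.respond_to?(name)
    result = @relation.public_send(name, *args, &block)
    result.is_a?(TinyRelation) ? self.class.new(result) : result
  end

  def respond_to_missing?(name, include_private = false)
    @relation.respond_to?(name, include_private) || super
  end

  # Domain-specific method living next to the delegated algebra.
  def names
    map { |t| t[:name] }
  end
end

suppliers = Suppliers.new(TinyRelation.new([
  { name: "Smith", city: "London" },
  { name: "Jones", city: "Paris" }
]))
in_paris = suppliers.restrict(city: "Paris")
```

Because results are re-wrapped, domain methods such as `names` stay available after any number of chained operators.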
|
data/lib/bmg/relation/spied.rb
CHANGED
data/lib/bmg/support.rb
CHANGED
data/lib/bmg/support/keys.rb
CHANGED
@@ -0,0 +1,44 @@
|
|
1
|
+
module Bmg
|
2
|
+
class OutputPreferences
|
3
|
+
|
4
|
+
DEFAULT_PREFS = {
|
5
|
+
attributes_ordering: nil,
|
6
|
+
extra_attributes: :after
|
7
|
+
}
|
8
|
+
|
9
|
+
def initialize(options)
|
10
|
+
@options = DEFAULT_PREFS.merge(options)
|
11
|
+
end
|
12
|
+
attr_reader :options
|
13
|
+
|
14
|
+
def self.dress(arg)
|
15
|
+
return arg if arg.is_a?(OutputPreferences)
|
16
|
+
arg = {} if arg.nil?
|
17
|
+
new(arg)
|
18
|
+
end
|
19
|
+
|
20
|
+
def attributes_ordering
|
21
|
+
options[:attributes_ordering]
|
22
|
+
end
|
23
|
+
|
24
|
+
def extra_attributes
|
25
|
+
options[:extra_attributes]
|
26
|
+
end
|
27
|
+
|
28
|
+
def order_attrlist(attrlist)
|
29
|
+
return attrlist if attributes_ordering.nil?
|
30
|
+
index = Hash[attributes_ordering.each_with_index.to_a]
|
31
|
+
attrlist.sort{|a,b|
|
32
|
+
ai, bi = index[a], index[b]
|
33
|
+
if ai && bi
|
34
|
+
ai <=> bi
|
35
|
+
elsif ai
|
36
|
+
extra_attributes == :after ? -1 : 1
|
37
|
+
else
|
38
|
+
extra_attributes == :after ? 1 : -1
|
39
|
+
end
|
40
|
+
}
|
41
|
+
end
|
42
|
+
|
43
|
+
end # class OutputPreferences
|
44
|
+
end # module Bmg
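The intent of `order_attrlist` appears to be: attributes named in `attributes_ordering` come in that order, and the remaining ones go before or after them depending on `extra_attributes`. The sketch below expresses that intent without a comparator. It is an assumption about the intended result, not the gem's code: in particular it keeps extra attributes in their original relative order, which the `sort` block above does not specify. The function name and keyword are hypothetical.

```ruby
# Hypothetical stand-alone version: ordered attributes first (or last),
# extras kept in their original relative order.
def order_attrlist(attrlist, ordering, extra_attributes: :after)
  return attrlist if ordering.nil?
  known  = ordering & attrlist   # follows the order given in `ordering`
  extras = attrlist - ordering   # follows the order of `attrlist`
  extra_attributes == :after ? known + extras : extras + known
end

attrs = [:city, :sid, :status, :name]

extras_last  = order_attrlist(attrs, [:sid, :name])
extras_first = order_attrlist(attrs, [:sid, :name], extra_attributes: :before)
untouched    = order_attrlist(attrs, nil)
```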
|
@@ -0,0 +1,63 @@
|
|
1
|
+
module Bmg
|
2
|
+
class TupleTransformer
|
3
|
+
|
4
|
+
def initialize(transformation)
|
5
|
+
@transformation = transformation
|
6
|
+
end
|
7
|
+
|
8
|
+
def self.new(arg)
|
9
|
+
return arg if arg.is_a?(TupleTransformer)
|
10
|
+
super
|
11
|
+
end
|
12
|
+
|
13
|
+
def call(tuple)
|
14
|
+
transform_tuple(tuple, @transformation)
|
15
|
+
end
|
16
|
+
|
17
|
+
def knows_attrlist?
|
18
|
+
@transformation.is_a?(Hash)
|
19
|
+
end
|
20
|
+
|
21
|
+
def to_attrlist
|
22
|
+
@transformation.keys
|
23
|
+
end
|
24
|
+
|
25
|
+
private
|
26
|
+
|
27
|
+
def transform_tuple(tuple, with)
|
28
|
+
case with
|
29
|
+
when Symbol, Proc, Regexp
|
30
|
+
tuple.each_with_object({}){|(k,v),dup|
|
31
|
+
dup[k] = transform_attr(v, with)
|
32
|
+
}
|
33
|
+
when Hash
|
34
|
+
with.each_with_object(tuple.dup){|(k,v),dup|
|
35
|
+
dup[k] = transform_attr(dup[k], v)
|
36
|
+
}
|
37
|
+
when Array
|
38
|
+
with.inject(tuple){|dup,on|
|
39
|
+
transform_tuple(dup, on)
|
40
|
+
}
|
41
|
+
else
|
42
|
+
raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
|
43
|
+
end
|
44
|
+
end
|
45
|
+
|
46
|
+
def transform_attr(value, with)
|
47
|
+
case with
|
48
|
+
when Symbol
|
49
|
+
value.send(with)
|
50
|
+
when Regexp
|
51
|
+
m = with.match(value.to_s)
|
52
|
+
m.nil? ? m : m.to_s
|
53
|
+
when Proc
|
54
|
+
with.call(value)
|
55
|
+
when Hash
|
56
|
+
with[value]
|
57
|
+
else
|
58
|
+
raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
|
59
|
+
end
|
60
|
+
end
|
61
|
+
|
62
|
+
end # class TupleTransformer
|
63
|
+
end # module Bmg
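The dispatch above can be exercised without Bmg. The following sketch copies the Symbol, Proc, Hash and Array cases into two top-level functions (hypothetical stand-ins for the private methods, with the Regexp case left out for brevity) and shows what each transformation style does to a single tuple.

```ruby
# Stand-alone copy of the per-attribute dispatch (Regexp case omitted).
def transform_attr(value, with)
  case with
  when Symbol then value.public_send(with)
  when Proc   then with.call(value)
  when Hash   then with[value]
  else raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
  end
end

# Stand-alone copy of the per-tuple dispatch.
def transform_tuple(tuple, with)
  case with
  when Symbol, Proc
    # applies to every attribute
    tuple.each_with_object({}) { |(k, v), out| out[k] = transform_attr(v, with) }
  when Hash
    # applies only to the listed attributes
    with.each_with_object(tuple.dup) { |(k, how), out| out[k] = transform_attr(out[k], how) }
  when Array
    # applies each step in turn to the result of the previous one
    with.inject(tuple) { |memo, step| transform_tuple(memo, step) }
  else
    raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
  end
end

tuple = { id: 1, name: "alf" }

all_attrs  = transform_tuple(tuple, :to_s)
some_attrs = transform_tuple(tuple, { name: :upcase })
chained    = transform_tuple(tuple, [:to_s, :upcase])
lookup     = transform_tuple(tuple, { id: { 1 => "one" } })
```

A Proc or Symbol receives the attribute value, not the whole tuple, and a nested Hash acts as a lookup table on the value.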
|
data/lib/bmg/type.rb
CHANGED
@@ -241,6 +241,31 @@ module Bmg
|
|
241
241
|
}
|
242
242
|
end
|
243
243
|
|
244
|
+
def transform(transformation, options = {})
|
245
|
+
transformer = TupleTransformer.new(transformation)
|
246
|
+
if typechecked? && knows_attrlist? && transformer.knows_attrlist?
|
247
|
+
known_attributes!(transformer.to_attrlist)
|
248
|
+
end
|
249
|
+
keys = if options[:key_preserving]
|
250
|
+
self._keys
|
251
|
+
elsif transformer.knows_attrlist? && knows_keys?
|
252
|
+
touched_attrs = transformer.to_attrlist
|
253
|
+
keys = self._keys.select{|k| (k & touched_attrs).empty? }
|
254
|
+
else
|
255
|
+
nil
|
256
|
+
end
|
257
|
+
pred = if transformer.knows_attrlist?
|
258
|
+
attr_list = transformer.to_attrlist
|
259
|
+
predicate.and_split(attr_list).last
|
260
|
+
else
|
261
|
+
Predicate.tautology
|
262
|
+
end
|
263
|
+
dup.tap{|x|
|
264
|
+
x.keys = keys
|
265
|
+
x.predicate = pred
|
266
|
+
}
|
267
|
+
end
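The key-handling branch above can be read as: a candidate key survives a transformation only if none of its attributes is touched, unless the caller asserts `key_preserving`. A minimal sketch of just that rule, using plain arrays of symbols (the function name is hypothetical and the case where the transformed attributes are unknown is not covered):

```ruby
# Hypothetical helper isolating the key-filtering rule of Type#transform.
def keys_after_transform(keys, touched_attrs, key_preserving: false)
  return keys if key_preserving
  # keep only keys that share no attribute with the transformed ones
  keys.select { |key| (key & touched_attrs).empty? }
end

keys = [[:sid], [:name, :city]]

untouched_only = keys_after_transform(keys, [:name])
all_kept       = keys_after_transform(keys, [:name], key_preserving: true)
```

Dropping a touched key is the conservative choice, since a transformation such as `:upcase` may map two distinct values to the same one.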
|
268
|
+
|
244
269
|
def union(other)
|
245
270
|
if typechecked? && knows_attrlist? && other.knows_attrlist?
|
246
271
|
missing = self.attrlist - other.attrlist
|
data/lib/bmg/version.rb
CHANGED
data/lib/bmg/writer.rb
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
require_relative 'writer/csv'
|
@@ -0,0 +1,42 @@
|
|
1
|
+
module Bmg
|
2
|
+
module Writer
|
3
|
+
class Csv
|
4
|
+
include Writer
|
5
|
+
|
6
|
+
DEFAULT_OPTIONS = {
|
7
|
+
}
|
8
|
+
|
9
|
+
def initialize(csv_options, output_preferences = nil)
|
10
|
+
@csv_options = DEFAULT_OPTIONS.merge(csv_options)
|
11
|
+
@output_preferences = OutputPreferences.dress(output_preferences)
|
12
|
+
end
|
13
|
+
attr_reader :csv_options, :output_preferences
|
14
|
+
|
15
|
+
def call(relation, string_or_io = nil)
|
16
|
+
require 'csv'
|
17
|
+
string_or_io, to_s = string_or_io.nil? ? [StringIO.new, true] : [string_or_io, false]
|
18
|
+
headers, csv = infer_headers(relation.type), nil
|
19
|
+
relation.each do |tuple|
|
20
|
+
if csv.nil?
|
21
|
+
headers = infer_headers(tuple) if headers.nil?
|
22
|
+
csv = CSV.new(string_or_io, csv_options.merge(headers: headers))
|
23
|
+
end
|
24
|
+
csv << headers.map{|h| tuple[h] }
|
25
|
+
end
|
26
|
+
to_s ? string_or_io.string : string_or_io
|
27
|
+
end
|
28
|
+
|
29
|
+
private
|
30
|
+
|
31
|
+
def infer_headers(from)
|
32
|
+
attrlist = if from.is_a?(Type) && from.knows_attrlist?
|
33
|
+
from.to_attrlist
|
34
|
+
elsif from.is_a?(Hash)
|
35
|
+
from.keys
|
36
|
+
end
|
37
|
+
attrlist ? output_preferences.order_attrlist(attrlist) : nil
|
38
|
+
end
|
39
|
+
|
40
|
+
end # class Csv
|
41
|
+
end # module Writer
|
42
|
+
end # module Bmg
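As a rough illustration of what the writer does, the sketch below turns an array of symbolized hashes into CSV text with Ruby's standard `csv` library, inferring headers from the first tuple when none are given. `tuples_to_csv` is a hypothetical helper; unlike the class above it writes the header row explicitly and always returns a String.

```ruby
require 'csv'

# Hypothetical helper: headers default to the keys of the first tuple,
# extra keyword arguments are passed through to CSV.generate.
def tuples_to_csv(tuples, headers = nil, **csv_options)
  CSV.generate(**csv_options) do |csv|
    tuples.each_with_index do |tuple, index|
      headers ||= tuple.keys
      csv << headers.map(&:to_s) if index.zero?
      csv << headers.map { |h| tuple[h] }
    end
  end
end

suppliers = [
  { id: 1, name: "Alf" },
  { id: 2, name: "Bmg" }
]

default_csv   = tuples_to_csv(suppliers)
semicolon_csv = tuples_to_csv(suppliers, [:name, :id], col_sep: ";")
```

Passing an explicit header list plays the role of the attribute ordering preference: it fixes the column order regardless of the key order in each tuple.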
|
data/tasks/test.rake
CHANGED
@@ -6,17 +6,24 @@ namespace :test do
|
|
6
6
|
desc "Runs unit tests"
|
7
7
|
RSpec::Core::RakeTask.new(:unit) do |t|
|
8
8
|
t.pattern = "spec/unit/**/test_*.rb"
|
9
|
-
t.rspec_opts = ["-Ilib", "-Ispec/unit", "--
|
9
|
+
t.rspec_opts = ["-Ilib", "-Ispec/unit", "--color", "--backtrace", "--format=progress"]
|
10
10
|
end
|
11
11
|
tests << :unit
|
12
12
|
|
13
13
|
desc "Runs integration tests"
|
14
14
|
RSpec::Core::RakeTask.new(:integration) do |t|
|
15
15
|
t.pattern = "spec/integration/**/test_*.rb"
|
16
|
-
t.rspec_opts = ["-Ilib", "-Ispec/integration", "--
|
16
|
+
t.rspec_opts = ["-Ilib", "-Ispec/integration", "--color", "--backtrace", "--format=progress"]
|
17
17
|
end
|
18
18
|
tests << :integration
|
19
19
|
|
20
|
+
desc "Runs github regression tests"
|
21
|
+
RSpec::Core::RakeTask.new(:regression) do |t|
|
22
|
+
t.pattern = "spec/regression/**/test_*.rb"
|
23
|
+
t.rspec_opts = ["-Ilib", "-Ispec/regression", "--color", "--backtrace", "--format=progress"]
|
24
|
+
end
|
25
|
+
tests << :regression
|
26
|
+
|
20
27
|
task :all => tests
|
21
28
|
end
|
22
29
|
|
metadata
CHANGED
@@ -1,91 +1,91 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bmg
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.18.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Bernard Lambeau
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2021-03-12 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: predicate
|
15
15
|
requirement: !ruby/object:Gem::Requirement
|
16
16
|
requirements:
|
17
|
-
- - "~>"
|
18
|
-
- !ruby/object:Gem::Version
|
19
|
-
version: '2.4'
|
20
17
|
- - ">="
|
21
18
|
- !ruby/object:Gem::Version
|
22
|
-
version: 2.
|
19
|
+
version: 2.5.0
|
20
|
+
- - "~>"
|
21
|
+
- !ruby/object:Gem::Version
|
22
|
+
version: '2.5'
|
23
23
|
type: :runtime
|
24
24
|
prerelease: false
|
25
25
|
version_requirements: !ruby/object:Gem::Requirement
|
26
26
|
requirements:
|
27
|
-
- - "~>"
|
28
|
-
- !ruby/object:Gem::Version
|
29
|
-
version: '2.4'
|
30
27
|
- - ">="
|
31
28
|
- !ruby/object:Gem::Version
|
32
|
-
version: 2.
|
29
|
+
version: 2.5.0
|
30
|
+
- - "~>"
|
31
|
+
- !ruby/object:Gem::Version
|
32
|
+
version: '2.5'
|
33
33
|
- !ruby/object:Gem::Dependency
|
34
|
-
name:
|
34
|
+
name: path
|
35
35
|
requirement: !ruby/object:Gem::Requirement
|
36
36
|
requirements:
|
37
|
-
- - "
|
37
|
+
- - ">="
|
38
38
|
- !ruby/object:Gem::Version
|
39
|
-
version: '
|
40
|
-
type: :
|
39
|
+
version: '2.0'
|
40
|
+
type: :runtime
|
41
41
|
prerelease: false
|
42
42
|
version_requirements: !ruby/object:Gem::Requirement
|
43
43
|
requirements:
|
44
|
-
- - "
|
44
|
+
- - ">="
|
45
45
|
- !ruby/object:Gem::Version
|
46
|
-
version: '
|
46
|
+
version: '2.0'
|
47
47
|
- !ruby/object:Gem::Dependency
|
48
|
-
name:
|
48
|
+
name: rake
|
49
49
|
requirement: !ruby/object:Gem::Requirement
|
50
50
|
requirements:
|
51
51
|
- - "~>"
|
52
52
|
- !ruby/object:Gem::Version
|
53
|
-
version: '
|
53
|
+
version: '13'
|
54
54
|
type: :development
|
55
55
|
prerelease: false
|
56
56
|
version_requirements: !ruby/object:Gem::Requirement
|
57
57
|
requirements:
|
58
58
|
- - "~>"
|
59
59
|
- !ruby/object:Gem::Version
|
60
|
-
version: '
|
60
|
+
version: '13'
|
61
61
|
- !ruby/object:Gem::Dependency
|
62
|
-
name:
|
62
|
+
name: rspec
|
63
63
|
requirement: !ruby/object:Gem::Requirement
|
64
64
|
requirements:
|
65
|
-
- - "
|
65
|
+
- - "~>"
|
66
66
|
- !ruby/object:Gem::Version
|
67
|
-
version: '
|
67
|
+
version: '3.6'
|
68
68
|
type: :development
|
69
69
|
prerelease: false
|
70
70
|
version_requirements: !ruby/object:Gem::Requirement
|
71
71
|
requirements:
|
72
|
-
- - "
|
72
|
+
- - "~>"
|
73
73
|
- !ruby/object:Gem::Version
|
74
|
-
version: '
|
74
|
+
version: '3.6'
|
75
75
|
- !ruby/object:Gem::Dependency
|
76
76
|
name: roo
|
77
77
|
requirement: !ruby/object:Gem::Requirement
|
78
78
|
requirements:
|
79
79
|
- - ">="
|
80
80
|
- !ruby/object:Gem::Version
|
81
|
-
version: '2.
|
81
|
+
version: '2.8'
|
82
82
|
type: :development
|
83
83
|
prerelease: false
|
84
84
|
version_requirements: !ruby/object:Gem::Requirement
|
85
85
|
requirements:
|
86
86
|
- - ">="
|
87
87
|
- !ruby/object:Gem::Version
|
88
|
-
version: '2.
|
88
|
+
version: '2.8'
|
89
89
|
- !ruby/object:Gem::Dependency
|
90
90
|
name: sequel
|
91
91
|
requirement: !ruby/object:Gem::Requirement
|
@@ -149,14 +149,17 @@ files:
|
|
149
149
|
- lib/bmg/operator/shared/nary.rb
|
150
150
|
- lib/bmg/operator/shared/unary.rb
|
151
151
|
- lib/bmg/operator/summarize.rb
|
152
|
+
- lib/bmg/operator/transform.rb
|
152
153
|
- lib/bmg/operator/union.rb
|
153
154
|
- lib/bmg/reader.rb
|
154
155
|
- lib/bmg/reader/csv.rb
|
155
156
|
- lib/bmg/reader/excel.rb
|
157
|
+
- lib/bmg/reader/text_file.rb
|
156
158
|
- lib/bmg/relation.rb
|
157
159
|
- lib/bmg/relation/empty.rb
|
158
160
|
- lib/bmg/relation/in_memory.rb
|
159
161
|
- lib/bmg/relation/materialized.rb
|
162
|
+
- lib/bmg/relation/proxy.rb
|
160
163
|
- lib/bmg/relation/spied.rb
|
161
164
|
- lib/bmg/sequel.rb
|
162
165
|
- lib/bmg/sequel/ext.rb
|
@@ -258,9 +261,13 @@ files:
|
|
258
261
|
- lib/bmg/summarizer/variance.rb
|
259
262
|
- lib/bmg/support.rb
|
260
263
|
- lib/bmg/support/keys.rb
|
264
|
+
- lib/bmg/support/output_preferences.rb
|
261
265
|
- lib/bmg/support/tuple_algebra.rb
|
266
|
+
- lib/bmg/support/tuple_transformer.rb
|
262
267
|
- lib/bmg/type.rb
|
263
268
|
- lib/bmg/version.rb
|
269
|
+
- lib/bmg/writer.rb
|
270
|
+
- lib/bmg/writer/csv.rb
|
264
271
|
- tasks/gem.rake
|
265
272
|
- tasks/test.rake
|
266
273
|
homepage: http://github.com/enspirit/bmg
|
@@ -282,7 +289,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
282
289
|
- !ruby/object:Gem::Version
|
283
290
|
version: '0'
|
284
291
|
requirements: []
|
285
|
-
rubygems_version: 3.
|
292
|
+
rubygems_version: 3.0.8
|
286
293
|
signing_key:
|
287
294
|
specification_version: 4
|
288
295
|
summary: Bmg is Alf's successor.
|