bmg 0.17.4 → 0.18.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 1e48d4a2d3c4bea8271e40ad4db52f43438de7f754079cc4c5bcd5e98ce4865d
4
- data.tar.gz: 3fff9f251a2b766ab3de631748ccc6fe6f6b96ccef88507cb3ed76766fea3992
3
+ metadata.gz: f1ce59d00b630f644e5716eaff17a83116c1342ea0b6ba7d9174b1f5f4eadd6e
4
+ data.tar.gz: 2156d1a8eb88999e749f434a8e94d2b4e70c636fadb10470493f19bb805a19a3
5
5
  SHA512:
6
- metadata.gz: facfbb798258cfdec3803cc7b93999d23c8a0776ead53c074266f4f658a27feee1a3e1dcc1145e7ad45bb03579d22f5d65108152516ac0af0d79aa59805ee016
7
- data.tar.gz: 10271e608276bc2e00ccee3d6a94287cb2211d945898cb9cfa54764a05027c9e6a6c34a11e9725f9ed47df4df537fe054d8c3f17b24f58da9713761121523e73
6
+ metadata.gz: 8e3e714d698ff7c47c2f61b4056c73acaa39983306402eeb54d1957b90959af51fcf20a543c5c17242d260fe835b33c19dd960e3cd4920a66d49ea81a22ee4b0
7
+ data.tar.gz: 89fe1cc4c3157adf7373fb755e0cf7eb5dd8d93f7061558454d27649b44aca558f936f006e3238dce410ffdfe29821d59feac3b31b5d64241ffc76192bba149f
data/Gemfile CHANGED
@@ -1,5 +1,2 @@
1
1
  source "https://rubygems.org"
2
2
  gemspec
3
-
4
- # gem "predicate", github: "enspirit/predicate", branch: "placeholders"
5
- # gem "predicate", path: "../predicate"
data/README.md CHANGED
@@ -1,16 +1,30 @@
1
1
  # Bmg, a relational algebra (Alf's successor)!
2
2
 
3
+ [![Build Status](https://travis-ci.com/enspirit/bmg.svg?branch=master)](https://travis-ci.com/enspirit/bmg)
4
+
3
5
  Bmg is a relational algebra implemented as a ruby library. It implements the
4
6
  [Relation as First-Class Citizen](http://www.try-alf.org/blog/2013-10-21-relations-as-first-class-citizen)
5
- paradigm contributed with Alf a few years ago.
6
-
7
- Like Alf, Bmg can be used to query relations in memory, from various files,
8
- SQL databases, and any data sources that can be seen as serving relations.
9
- Cross data-sources joins are supported, as with Alf.
10
-
11
- Unlike Alf, Bmg does not make any core ruby extension and exposes the
12
- object-oriented syntax only (not Alf's functional one). Bmg implementation is
13
- also much simpler, and make its easier to implement user-defined relations.
7
+ paradigm contributed with [Alf](http://www.try-alf.org/) a few years ago.
8
+
9
+ Bmg can be used to query relations in memory, from various files, SQL databases,
10
+ and any data source that can be seen as serving relations. Cross data-sources
11
+ joins are supported, as with Alf. For differences with Alf, see a section
12
+ further down this README.
13
+
14
+ ## Outline
15
+
16
+ * [Example](#example)
17
+ * [Where are base relations coming from?](#where-are-base-relations-coming-from)
18
+ * [Memory relations](#memory-relations)
19
+ * [Connecting to SQL databases](#connecting-to-sql-databases)
20
+ * [Reading files (csv, excel, text)](#reading-files-csv-excel-text)
21
+ * [Your own relations](#your-own-relations)
22
+ * [List of supported operators](#supported-operators)
23
+ * [How is this different?](#how-is-this-different)
24
+ * [... from similar libraries](#-from-similar-libraries)
25
+ * [... from Alf](#-from-alf)
26
+ * [Contribute](#contribute)
27
+ * [License](#license)
14
28
 
15
29
  ## Example
16
30
 
@@ -27,7 +41,7 @@ suppliers = Bmg::Relation.new([
27
41
  ])
28
42
 
29
43
  by_city = suppliers
30
- .restrict(Predicate.neq(status: 30))
44
+ .exclude(status: 30)
31
45
  .extend(upname: ->(t){ t[:name].upcase })
32
46
  .group([:sid, :name, :status], :suppliers_in)
33
47
 
@@ -35,76 +49,158 @@ puts JSON.pretty_generate(by_city)
35
49
  # [{...},...]
36
50
  ```
37
51
 
38
- ## Connecting to a SQL database
52
+ ## Where are base relations coming from?
53
+
54
+ Bmg sees relations as sets/enumerable of symbolized Ruby hashes. The following
55
+ sections show you how to get them in the first place, to enter Relationland.
56
+
57
+ ### Memory relations
58
+
59
+ If you have an Array of Hashes -- in fact any Enumerable -- you can easily get
60
+ a Relation using either `Bmg::Relation.new` or `Bmg.in_memory`.
61
+
62
+ ```ruby
63
+ # this...
64
+ r = Bmg::Relation.new [{id: 1}, {id: 2}]
65
+
66
+ # is the same as this...
67
+ r = Bmg.in_memory [{id: 1}, {id: 2}]
68
+
69
+ # entire algebra is available on `r`
70
+ ```
71
+
72
+ ### Connecting to SQL databases
39
73
 
40
- Bmg requires `sequel >= 3.0` to connect to SQL databases.
74
+ Bmg currently requires `sequel >= 3.0` to connect to SQL databases. You also
75
+ need to require `bmg/sequel`.
41
76
 
42
77
  ```ruby
43
78
  require 'sqlite3'
44
79
  require 'bmg'
45
80
  require 'bmg/sequel'
81
+ ```
46
82
 
47
- DB = Sequel.connect("sqlite://suppliers-and-parts.db")
83
+ Then `Bmg.sequel` serves relations for tables of your SQL database:
48
84
 
85
+ ```ruby
86
+ DB = Sequel.connect("sqlite://suppliers-and-parts.db")
49
87
  suppliers = Bmg.sequel(:suppliers, DB)
88
+ ```
89
+
90
+ The entire algebra is available on those relations. As long as you keep using
91
+ operators that can be translated to SQL, results remain SQL-able:
50
92
 
93
+ ```ruby
51
94
  big_suppliers = suppliers
52
- .restrict(Predicate.neq(status: 30))
95
+ .exclude(status: 30)
96
+ .project([:sid, :name])
53
97
 
54
98
  puts big_suppliers.to_sql
55
- # SELECT `t1`.`sid`, `t1`.`name`, `t1`.`status`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)
99
+ # SELECT `t1`.`sid`, `t1`.`name` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)
100
+ ```
56
101
 
57
- puts JSON.pretty_generate(big_suppliers)
58
- # [{...},...]
102
+ Operators not translatable to SQL are available too (such as `group` below).
103
+ Bmg falls back to in-memory operators for them, but remains capable of pushing some
104
+ operators down the tree as illustrated below (the restriction on `:city` is
105
+ pushed to the SQL server):
106
+
107
+ ```ruby
108
+ Bmg.sequel(:suppliers, sequel_db)
109
+ .project([:sid, :name, :city])
110
+ .group([:sid, :name], :suppliers_in)
111
+ .restrict(city: ["Paris", "London"])
112
+ .debug
113
+
114
+ # (group
115
+ # (sequel SELECT `t1`.`sid`, `t1`.`name`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`city` IN ('Paris', 'London')))
116
+ # [:sid, :name]
117
+ # :suppliers_in
118
+ # {:array=>false})
59
119
  ```
60
120
 
61
- ## How is this different from similar libraries?
121
+ ### Reading files (csv, excel, text)
62
122
 
63
- 1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ,
64
- etc.) do not implement a genuine relational algebra: their support for
65
- chaining relational operators is limited (yielding errors or wrong SQL
66
- queries). Bmg **always** allows chaining operators. If it does not, it's
67
- a bug. In other words, the following query is 100% valid:
123
+ Bmg provides simple adapters to read files and reach Relationland as soon as
124
+ possible.
68
125
 
69
- relation
70
- .restrict(...) # aka where
71
- .union(...)
72
- .summarize(...) # aka group by
73
- .restrict(...)
126
+ #### CSV files
74
127
 
75
- 2. Bmg supports in memory relations, json relations, csv relations, SQL
76
- relations and so on. It's not tight to SQL generation, and supports
77
- queries accross multiple data sources.
128
+ ```ruby
129
+ csv_options = { col_sep: ",", quote_char: '"' }
130
+ r = Bmg.csv("path/to/a/file.csv", csv_options)
131
+ ```
78
132
 
79
- 3. Bmg makes a best effort to optimize queries, simplifying both generated
80
- SQL code (low-level accesses to datasources) and in-memory operations.
133
+ Options are directly transmitted to `::CSV.new`, check ruby's standard
134
+ library.
81
135
 
82
- 4. Bmg supports various *structuring* operators (group, image, autowrap,
83
- autosummarize, etc.) and allows building 'non flat' relations.
136
+ #### Excel files
84
137
 
85
- ## How is this different from Alf?
138
+ You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
139
+ read `.xls` and `.xlsx` files with Bmg.
86
140
 
87
- 1. Bmg's implementation is much simpler than Alf, and uses no ruby core
88
- extention.
141
+ ```ruby
142
+ roo_options = { skip: 1 }
143
+ r = Bmg.excel("path/to/a/file.xls", roo_options)
144
+ ```
89
145
 
90
- 2. We are confident using Bmg in production. Systematic inspection of query
91
- plans is suggested though. Alf was a bit too experimental to be used on
92
- (critical) production systems.
146
+ Options are directly transmitted to `Roo::Spreadsheet.open`, check roo's
147
+ documentation.
93
148
 
94
- 2. Alf exposes a functional syntax, command line tool, restful tools and
95
- many more. Bmg is limited to the core algebra, main Relation abstraction
96
- and SQL generation.
149
+ #### Text files
97
150
 
98
- 3. Bmg is less strict regarding conformance to relational theory, and
99
- may actually expose non relational features (such as support for null,
100
- left_join operator, etc.). Sharp tools hurt, use them with great care.
151
+ There is also a straightforward way to read text files and convert lines to
152
+ tuples.
101
153
 
102
- 4. Bmg does not yet implement all operators documented on try-alf.org, even
103
- if we plan to eventually support them all.
154
+ ```ruby
155
+ r = Bmg.text_file("path/to/a/file.txt")
156
+ r.type.attrlist
157
+ # => [:line, :text]
158
+ ```
104
159
 
105
- 5. Bmg has a few additional operators that prove very useful on real
106
- production use cases: prefix, suffix, autowrap, autosummarize, left_join,
107
- rxmatch, etc.
160
+ Without options tuples will have `:line` and `:text` attributes, the former
161
+ being the line number (starting at 1) and the latter being the line itself
162
+ (stripped).
163
+
164
+ There are a couple of options (see `Bmg::Reader::TextFile`). The most useful one
165
+ is the use of a Regexp with named captures to automatically extract
166
+ attributes:
167
+
168
+ ```ruby
169
+ r = Bmg.text_file("path/to/a/file.txt", parse: /GET (?<url>([^\s]+))/)
170
+ r.type.attrlist
171
+ # => [:line, :url]
172
+ ```
173
+
174
+ In this scenario, non matching lines are skipped. The `:line` attribute keeps
175
+ being used to have at least one candidate key (so to speak).
176
+
177
+ ### Your own relations
178
+
179
+ As noted earlier, Bmg has a simple relation interface where you only have to
180
+ provide an iteration of symbolized tuples.
181
+
182
+ ```ruby
183
+ class MyRelation
184
+ include Bmg::Relation
185
+
186
+ def each
187
+ yield(id: 1, name: "Alf", year: 2014)
188
+ yield(id: 2, name: "Bmg", year: 2018)
189
+ end
190
+ end
191
+
192
+ MyRelation.new
193
+ .restrict(Predicate.gt(:year, 2015))
194
+ .allbut([:year])
195
+ ```
196
+
197
+ As shown, creating adapters on top of various data sources is straightforward.
198
+ Adapters can also participate in query optimization (such as pushing
199
+ restrictions down the tree) by overriding the underscored version of operators
200
+ (e.g. `_restrict`).
201
+
202
+ Have a look at `Bmg::Algebra` for the protocol and `Bmg::Sql::Relation` for an
203
+ example. Keep in touch with the team if you need some help.
108
204
 
109
205
  ## Supported operators
110
206
 
@@ -114,6 +210,7 @@ r.autowrap(split: '_') # structure a flat relation, split:
114
210
  r.autosummarize([:a, :b, ...], x: :sum) # (experimental) usual summarizers supported
115
211
  r.constants(x: 12, ...) # add constant attributes (sometimes useful in unions)
116
212
  r.extend(x: ->(t){ ... }, ...) # add computed attributes
213
+ r.exclude(predicate) # shortcut for restrict(!predicate)
117
214
  r.group([:a, :b, ...], :x) # relation-valued attribute from attributes
118
215
  r.image(right, :x, [:a, :b, ...]) # relation-valued attribute from another relation
119
216
  r.join(right, [:a, :b, ...]) # natural join on a join key
@@ -132,15 +229,100 @@ r.restrict(a: "foo", b: "bar", ...) # relational restriction, aka where
132
229
  r.rxmatch([:a, :b, ...], /xxx/) # regex match kind of restriction
133
230
  r.summarize([:a, :b, ...], x: :sum) # relational summarization
134
231
  r.suffix(:_foo, but: [:a, ...]) # suffix kind of renaming
232
+ r.transform(:to_s) # all-attrs transformation
233
+ r.transform(&:to_s) # similar, but Proc-driven
234
+ r.transform(:foo => :upcase, ...) # specific-attrs transformation
235
+ r.transform([:to_s, :upcase]) # chain-transformation
135
236
  r.union(right) # relational union
237
+ r.where(predicate) # alias for restrict(predicate)
136
238
  ```
137
239
 
138
- ## Who is behind Bmg?
240
+ ## How is this different?
241
+
242
+ ### ... from similar libraries?
243
+
244
+ 1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ,
245
+ etc.) do not implement a genuine relational algebra. Their support for
246
+ chaining relational operators is thus limited (restricting your expression
247
+ power and/or raising errors and/or outputting wrong or counterintuitive
248
+ SQL code). Bmg **always** allows chaining operators. If it does not, it's
249
+ a bug.
250
+
251
+ For instance the expression below is 100% valid in Bmg. The last where
252
+ clause applies to the result of the summarize (while SQL requires a `HAVING`
253
+ clause, or a `SELECT ... FROM (SELECT ...) r`).
254
+
255
+ ```ruby
256
+ relation
257
+ .where(...)
258
+ .union(...)
259
+ .summarize(...) # aka group by
260
+ .where(...)
261
+ ```
262
+
263
+ 2. Bmg supports in memory relations, json relations, csv relations, SQL
264
+ relations and so on. It's not tied to SQL generation, and supports
265
+ queries across multiple data sources.
266
+
267
+ 3. Bmg makes a best effort to optimize queries, simplifying both generated
268
+ SQL code (low-level accesses to datasources) and in-memory operations.
269
+
270
+ 4. Bmg supports various *structuring* operators (group, image, autowrap,
271
+ autosummarize, etc.) and allows building 'non flat' relations.
272
+
273
+ 5. Bmg can use full ruby power when that helps (e.g. regular expressions in
274
+ WHERE clauses or ruby code in EXTEND clauses). This may prevent Bmg from
275
+ delegating work to underlying data sources (e.g. SQL server) and should
276
+ therefore be used with care though.
277
+
278
+ ### ... from Alf?
279
+
280
+ If you use Alf (or used it in the past), below are the main differences between
281
+ Bmg and Alf. Bmg has NOT been written to be API-compatible with Alf and will
282
+ probably never be.
283
+
284
+ 1. Bmg's implementation is much simpler than Alf and uses no ruby core
285
+ extension.
286
+
287
+ 2. We are confident using Bmg in production. Systematic inspection of query
288
+ plans is advised though. Alf was a bit too experimental to be used on
289
+ (critical) production systems.
290
+
291
+ 3. Alf exposes a functional syntax, command line tool, restful tools and
292
+ many more. Bmg is limited to the core algebra, main Relation abstraction
293
+ and SQL generation.
139
294
 
140
- Bernard Lambeau (bernard@klaro.cards) is Alf & Bmg main engineer & maintainer.
295
+ 4. Bmg is less strict regarding conformance to relational theory, and
296
+ may actually expose non relational features (such as support for null,
297
+ left_join operator, etc.). Sharp tools hurt, use them with care.
298
+
299
+ 5. Unlike Alf::Relation, instances of Bmg::Relation capture query trees, not
300
+ values. Currently two instances `r1` and `r2` are not equal even if they
301
+ define the same mathematical relation. As a consequence joining on
302
+ relation-valued attributes does not work as expected in Bmg until further
303
+ notice.
304
+
305
+ 6. Bmg does not implement all operators documented on try-alf.org, even if
306
+ we plan to eventually support most of them.
307
+
308
+ 7. Bmg has a few additional operators that prove very useful on real
309
+ production use cases: prefix, suffix, autowrap, autosummarize, left_join,
310
+ rxmatch, etc.
311
+
312
+ 8. Bmg optimizes queries and compiles them to SQL on the fly, while Alf was
313
+ building an AST internally first. Strictly speaking this makes Bmg less
314
+ powerful than Alf since optimizations cannot be turned off for now.
315
+
316
+ ## Contribute
317
+
318
+ Please use github issues and pull requests for all questions, bug reports,
319
+ and contributions. Don't hesitate to get in touch with us with an early code
320
+ spike if you plan to add non trivial features.
321
+
322
+ ## License
323
+
324
+ This software is distributed by Enspirit SRL under an MIT Licence. Please
325
+ contact Bernard Lambeau (blambeau@gmail.com) with any question.
141
326
 
142
327
  Enspirit (https://enspirit.be) and Klaro App (https://klaro.cards) are both
143
328
  actively using and contributing to the library.
144
-
145
- Feel free to contact us for help, ideas and/or contributions. Please use github
146
- issues and pull requests if possible if code is involved.
data/lib/bmg.rb CHANGED
@@ -1,6 +1,7 @@
1
1
  require 'path'
2
2
  require 'predicate'
3
3
  require 'forwardable'
4
+ require 'set'
4
5
  module Bmg
5
6
 
6
7
  def in_memory(enumerable, type = Type::ANY)
@@ -8,6 +9,11 @@ module Bmg
8
9
  end
9
10
  module_function :in_memory
10
11
 
12
+ def text_file(path, options = {}, type = Type::ANY)
13
+ Reader::TextFile.new(type, path, options).spied(main_spy)
14
+ end
15
+ module_function :text_file
16
+
11
17
  def csv(path, options = {}, type = Type::ANY)
12
18
  Reader::Csv.new(type, path, options).spied(main_spy)
13
19
  end
@@ -38,11 +44,13 @@ module Bmg
38
44
  require_relative 'bmg/operator'
39
45
 
40
46
  require_relative 'bmg/reader'
47
+ require_relative 'bmg/writer'
41
48
 
42
49
  require_relative 'bmg/relation/empty'
43
50
  require_relative 'bmg/relation/in_memory'
44
51
  require_relative 'bmg/relation/spied'
45
52
  require_relative 'bmg/relation/materialized'
53
+ require_relative 'bmg/relation/proxy'
46
54
 
47
55
  # Deprecated
48
56
  Leaf = Relation::InMemory
data/lib/bmg/algebra.rb CHANGED
@@ -172,6 +172,16 @@ module Bmg
172
172
  end
173
173
  protected :_summarize
174
174
 
175
+ def transform(transformation = nil, options = {}, &proc)
176
+ transformation, options = proc, (transformation || {}) unless proc.nil?
177
+ _transform(self.type.transform(transformation, options), transformation, options)
178
+ end
179
+
180
+ def _transform(type, transformation, options)
181
+ Operator::Transform.new(type, self, transformation, options)
182
+ end
183
+ protected :_transform
184
+
175
185
  def union(other, options = {})
176
186
  return self if other.is_a?(Relation::Empty)
177
187
  _union self.type.union(other.type), other, options
@@ -2,6 +2,14 @@ module Bmg
2
2
  module Algebra
3
3
  module Shortcuts
4
4
 
5
+ def where(predicate)
6
+ restrict(predicate)
7
+ end
8
+
9
+ def exclude(predicate)
10
+ restrict(!Predicate.coerce(predicate))
11
+ end
12
+
5
13
  def rxmatch(attrs, matcher, options = {})
6
14
  predicate = attrs.inject(Predicate.contradiction){|p,a|
7
15
  p | Predicate.match(a, matcher, options)
data/lib/bmg/operator.rb CHANGED
@@ -46,4 +46,5 @@ require_relative 'operator/rename'
46
46
  require_relative 'operator/restrict'
47
47
  require_relative 'operator/rxmatch'
48
48
  require_relative 'operator/summarize'
49
+ require_relative 'operator/transform'
49
50
  require_relative 'operator/union'
@@ -0,0 +1,57 @@
1
+ module Bmg
2
+ module Operator
3
+ #
4
+ # Transform operator.
5
+ #
6
+ # Transforms existing attributes through computations
7
+ #
8
+ # Example:
9
+ #
10
+ # [{ a: 1 }] transform { a: ->(a){ a*2 } } => [{ a: 2 }]
11
+ #
12
+ class Transform
13
+ include Operator::Unary
14
+
15
+ DEFAULT_OPTIONS = {}
16
+
17
+ def initialize(type, operand, transformation, options = {})
18
+ @type = type
19
+ @operand = operand
20
+ @transformation = transformation
21
+ @options = DEFAULT_OPTIONS.merge(options)
22
+ end
23
+
24
+ protected
25
+
26
+ attr_reader :transformation
27
+
28
+ public
29
+
30
+ def each
31
+ t = transformer
32
+ @operand.each do |tuple|
33
+ yield t.call(tuple)
34
+ end
35
+ end
36
+
37
+ def to_ast
38
+ [ :transform, operand.to_ast, transformation.dup ]
39
+ end
40
+
41
+ protected ### optimization
42
+
43
+ protected ### inspect
44
+
45
+ def args
46
+ [ transformation ]
47
+ end
48
+
49
+ private
50
+
51
+ def transformer
52
+ @transformer ||= TupleTransformer.new(transformation)
53
+ end
54
+
55
+ end # class Transform
56
+ end # module Operator
57
+ end # module Bmg
data/lib/bmg/reader.rb CHANGED
@@ -9,3 +9,4 @@ module Bmg
9
9
  end
10
10
  require_relative "reader/csv"
11
11
  require_relative "reader/excel"
12
+ require_relative "reader/text_file"
@@ -0,0 +1,56 @@
1
+ module Bmg
2
+ module Reader
3
+ class TextFile
4
+ include Reader
5
+
6
+ DEFAULT_OPTIONS = {
7
+ strip: true,
8
+ parse: nil
9
+ }
10
+
11
+ def initialize(type, path, options = {})
12
+ options = { parse: options } if options.is_a?(Regexp)
13
+ @path = path
14
+ @options = DEFAULT_OPTIONS.merge(options)
15
+ @type = infer_type(type)
16
+ end
17
+ attr_reader :path, :options
18
+
19
+ public # Relation
20
+
21
+ def each
22
+ path.each_line.each_with_index do |text, line|
23
+ text = text.strip if strip?
24
+ parsed = parse(text)
25
+ yield({line: 1+line}.merge(parsed)) if parsed
26
+ end
27
+ end
28
+
29
+ private
30
+
31
+ def infer_type(base)
32
+ return base unless base == Bmg::Type::ANY
33
+ attr_list = if rx = options[:parse]
34
+ [:line] + rx.names.map(&:to_sym)
35
+ else
36
+ [:line, :text]
37
+ end
38
+ base
39
+ .with_attrlist(attr_list)
40
+ .with_keys([[:line]])
41
+ end
42
+
43
+ def strip?
44
+ options[:strip]
45
+ end
46
+
47
+ def parse(text)
48
+ return { text: text } unless rx = options[:parse]
49
+ if match = rx.match(text)
50
+ TupleAlgebra.symbolize_keys(match.named_captures)
51
+ end
52
+ end
53
+
54
+ end # class TextFile
55
+ end # module Reader
56
+ end # module Bmg
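
To try the new reader's behaviour without installing the gem, the sketch below mirrors the `each`/`parse` logic of `Bmg::Reader::TextFile` in plain Ruby. `parse_lines` is a hypothetical helper, not part of Bmg, and it reads from any object that responds to `each_line` (a `StringIO` here) rather than from a `Path`.

```ruby
require 'stringio'

# Hypothetical stand-alone helper mirroring Reader::TextFile#each.
# `io` is anything responding to #each_line; `parse` is an optional
# Regexp with named captures.
def parse_lines(io, parse: nil, strip: true)
  tuples = []
  io.each_line.each_with_index do |text, index|
    text = text.strip if strip
    parsed =
      if parse.nil?
        { text: text }
      elsif (match = parse.match(text))
        # named_captures has String keys; tuples use Symbol keys
        match.named_captures.transform_keys(&:to_sym)
      end
    # Non-matching lines yield nil and are skipped; :line is 1-based
    tuples << { line: index + 1 }.merge(parsed) if parsed
  end
  tuples
end

log  = StringIO.new("GET /home\nPOST /login\nGET /about\n")
urls = parse_lines(log, parse: /GET (?<url>\S+)/)
```

Here `urls` keeps lines 1 and 3 only, each with a `:line` and a `:url` attribute; the `POST` line does not match and is dropped, which is the behaviour the README hunk describes.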
data/lib/bmg/relation.rb CHANGED
@@ -17,6 +17,16 @@ module Bmg
17
17
  self
18
18
  end
19
19
 
20
+ def type
21
+ Bmg::Type::ANY
22
+ end
23
+
24
+ def with_type(type)
25
+ dup.tap{|r|
26
+ r.type = type
27
+ }
28
+ end
29
+
20
30
  def with_typecheck
21
31
  dup.tap{|r|
22
32
  r.type = r.type.with_typecheck
@@ -105,6 +115,20 @@ module Bmg
105
115
  to_a.to_json(*args, &bl)
106
116
  end
107
117
 
118
+ # Writes the relation data to CSV.
119
+ #
120
+ # `string_or_io` and `options` are what CSV::new itself
121
+ # recognizes, default options are CSV's.
122
+ #
123
+ # When no string_or_io is used, the method uses a string.
124
+ #
125
+ # The method always returns the string_or_io.
126
+ def to_csv(options = {}, string_or_io = nil, preferences = nil)
127
+ options, string_or_io = {}, options unless options.is_a?(Hash)
128
+ string_or_io, preferences = nil, string_or_io if string_or_io.is_a?(Hash)
129
+ Writer::Csv.new(options, preferences).call(self, string_or_io)
130
+ end
131
+
108
132
  # Converts to an sexpr expression.
109
133
  def to_ast
110
134
  raise "Bmg is missing a feature!"
@@ -0,0 +1,63 @@
1
+ module Bmg
2
+ module Relation
3
+ #
4
+ # This module can be used to create typed collections on top
5
+ # of Bmg relations. Algebra methods will be delegated to the
6
+ # decorated relation, and results wrapped in a new instance
7
+ # of the class.
8
+ #
9
+ module Proxy
10
+
11
+ def initialize(relation)
12
+ @relation = relation
13
+ end
14
+
15
+ def method_missing(name, *args, &bl)
16
+ if @relation.respond_to?(name)
17
+ res = @relation.send(name, *args, &bl)
18
+ res.is_a?(Relation) ? _proxy(res) : res
19
+ else
20
+ super
21
+ end
22
+ end
23
+
24
+ def respond_to?(name, *args)
25
+ @relation.respond_to?(name) || super
26
+ end
27
+
28
+ [
29
+ :extend
30
+ ].each do |name|
31
+ define_method(name) do |*args, &bl|
32
+ res = @relation.send(name, *args, &bl)
33
+ res.is_a?(Relation) ? _proxy(res) : res
34
+ end
35
+ end
36
+
37
+ [
38
+ :one,
39
+ :one_or_nil
40
+ ].each do |meth|
41
+ define_method(meth) do |*args, &bl|
42
+ res = @relation.send(meth, *args, &bl)
43
+ res.nil? ? nil : _proxy_tuple(res)
44
+ end
45
+ end
46
+
47
+ def to_json(*args, &bl)
48
+ @relation.to_json(*args, &bl)
49
+ end
50
+
51
+ protected
52
+
53
+ def _proxy(relation)
54
+ self.class.new(relation)
55
+ end
56
+
57
+ def _proxy_tuple(tuple)
58
+ tuple
59
+ end
60
+
61
+ end # module Proxy
62
+ end # module Relation
63
+ end # module Bmg
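
The delegation pattern used by `Relation::Proxy` can be sketched without Bmg. Everything below is a toy under stated assumptions: `ToyRelation`, `ToyProxy` and `Suppliers` are hypothetical names, and the sketch uses `respond_to_missing?` (the usual Ruby idiom) where the diff overrides `respond_to?` directly.

```ruby
# Toy stand-in for a relation (hypothetical, not Bmg's API).
class ToyRelation
  include Enumerable

  def initialize(tuples)
    @tuples = tuples
  end

  def each(&bl)
    @tuples.each(&bl)
  end

  # Keeps tuples whose attributes equal the given values.
  def restrict(cond)
    ToyRelation.new(@tuples.select { |t| cond.all? { |k, v| t[k] == v } })
  end
end

# Same idea as Relation::Proxy: delegate to the wrapped relation and
# re-wrap relation results in the including class.
module ToyProxy
  def initialize(relation)
    @relation = relation
  end

  def method_missing(name, *args, &bl)
    if @relation.respond_to?(name)
      res = @relation.send(name, *args, &bl)
      res.is_a?(ToyRelation) ? self.class.new(res) : res
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @relation.respond_to?(name, include_private) || super
  end
end

class Suppliers
  include ToyProxy
end

suppliers = Suppliers.new(ToyRelation.new([
  { sid: 1, city: "London" },
  { sid: 2, city: "Paris" }
]))
in_paris = suppliers.restrict(city: "Paris")
```

`in_paris` is again a `Suppliers` instance, while non-relation results such as `suppliers.count` are returned as is. Methods that every Ruby object already has, such as `extend`, never reach `method_missing`, which is presumably why the diff defines that one explicitly.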
@@ -24,7 +24,7 @@ module Bmg
24
24
  protected :type=
25
25
 
26
26
  def each(&bl)
27
- spy.call(self)
27
+ spy.call(self) if bl
28
28
  operand.each(&bl)
29
29
  end
30
30
 
data/lib/bmg/support.rb CHANGED
@@ -1,2 +1,4 @@
1
1
  require_relative 'support/tuple_algebra'
2
+ require_relative 'support/tuple_transformer'
2
3
  require_relative 'support/keys'
4
+ require_relative 'support/output_preferences'
@@ -7,6 +7,10 @@ module Bmg
7
7
 
8
8
  public ## tools
9
9
 
10
+ def select(&bl)
11
+ Keys.new(@keys.select(&bl), false)
12
+ end
13
+
10
14
  public ## algebra
11
15
 
12
16
  def allbut(oldtype, newtype, butlist)
@@ -0,0 +1,44 @@
1
+ module Bmg
2
+ class OutputPreferences
3
+
4
+ DEFAULT_PREFS = {
5
+ attributes_ordering: nil,
6
+ extra_attributes: :after
7
+ }
8
+
9
+ def initialize(options)
10
+ @options = DEFAULT_PREFS.merge(options)
11
+ end
12
+ attr_reader :options
13
+
14
+ def self.dress(arg)
15
+ return arg if arg.is_a?(OutputPreferences)
16
+ arg = {} if arg.nil?
17
+ new(arg)
18
+ end
19
+
20
+ def attributes_ordering
21
+ options[:attributes_ordering]
22
+ end
23
+
24
+ def extra_attributes
25
+ options[:extra_attributes]
26
+ end
27
+
28
+ def order_attrlist(attrlist)
29
+ return attrlist if attributes_ordering.nil?
30
+ index = Hash[attributes_ordering.each_with_index.to_a]
31
+ attrlist.sort{|a,b|
32
+ ai, bi = index[a], index[b]
33
+ if ai && bi
34
+ ai <=> bi
35
+ elsif ai
36
+ extra_attributes == :after ? -1 : 1
37
+ else
38
+ extra_attributes == :after ? 1 : -1
39
+ end
40
+ }
41
+ end
42
+
43
+ end # class OutputPreferences
44
+ end # module Bmg
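
The ordering rule in `order_attrlist` is easier to check on a small input. The sketch below restates the comparator as a free-standing method; the method signature and the sample attribute names are assumptions made for illustration, not Bmg's API.

```ruby
# Stand-alone restatement of OutputPreferences#order_attrlist.
# `ordering` lists the preferred attribute order; attributes missing
# from it are "extra" and go :after (default) or :before the others.
def order_attrlist(attrlist, ordering, extra = :after)
  return attrlist if ordering.nil?
  index = ordering.each_with_index.to_h # attribute => position
  attrlist.sort do |a, b|
    ai, bi = index[a], index[b]
    if ai && bi
      ai <=> bi                   # both known: follow the ordering
    elsif ai
      extra == :after ? -1 : 1    # only `a` known
    else
      extra == :after ? 1 : -1    # `a` is extra
    end
  end
end

attrs  = [:extra, :name, :id]
after  = order_attrlist(attrs, [:id, :name])
before = order_attrlist(attrs, [:id, :name], :before)
```

With a single extra attribute the result is fully determined: `after` is `[:id, :name, :extra]` and `before` is `[:extra, :id, :name]`. When several attributes are missing from the ordering, the comparator does not say how they compare to each other, so their relative order should not be relied on.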
@@ -19,5 +19,11 @@ module Bmg
19
19
  end
20
20
  module_function :rename
21
21
 
22
+ def symbolize_keys(h)
23
+ return h if h.empty?
24
+ h.each_with_object({}){|(k,v),h| h[k.to_sym] = v }
25
+ end
26
+ module_function :symbolize_keys
27
+
22
28
  end # module TupleAlgebra
23
29
  end # module Bmg
@@ -0,0 +1,63 @@
1
+ module Bmg
2
+ class TupleTransformer
3
+
4
+ def initialize(transformation)
5
+ @transformation = transformation
6
+ end
7
+
8
+ def self.new(arg)
9
+ return arg if arg.is_a?(TupleTransformer)
10
+ super
11
+ end
12
+
13
+ def call(tuple)
14
+ transform_tuple(tuple, @transformation)
15
+ end
16
+
17
+ def knows_attrlist?
18
+ @transformation.is_a?(Hash)
19
+ end
20
+
21
+ def to_attrlist
22
+ @transformation.keys
23
+ end
24
+
25
+ private
26
+
27
+ def transform_tuple(tuple, with)
28
+ case with
29
+ when Symbol, Proc, Regexp
30
+ tuple.each_with_object({}){|(k,v),dup|
31
+ dup[k] = transform_attr(v, with)
32
+ }
33
+ when Hash
34
+ with.each_with_object(tuple.dup){|(k,v),dup|
35
+ dup[k] = transform_attr(dup[k], v)
36
+ }
37
+ when Array
38
+ with.inject(tuple){|dup,on|
39
+ transform_tuple(dup, on)
40
+ }
41
+ else
42
+ raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
43
+ end
44
+ end
45
+
46
+ def transform_attr(value, with)
47
+ case with
48
+ when Symbol
49
+ value.send(with)
50
+ when Regexp
51
+ m = with.match(value.to_s)
52
+ m.nil? ? m : m.to_s
53
+ when Proc
54
+ with.call(value)
55
+ when Hash
56
+ with[value]
57
+ else
58
+ raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
59
+ end
60
+ end
61
+
62
+ end # class TupleTransformer
63
+ end # module Bmg
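
The four transformation forms listed in the README hunk (Symbol, Proc, Hash, Array) map onto the two `case` statements above. The sketch below is a compact restatement for experimentation only; `transform_value` and the free-standing `transform_tuple` are illustrative names, not the gem's public API.

```ruby
# How a single attribute value is transformed.
def transform_value(value, with)
  case with
  when Symbol then value.send(with)   # call a method on the value
  when Proc   then with.call(value)   # the Proc receives the value, not the tuple
  when Hash   then with[value]        # lookup table
  when Regexp then (m = with.match(value.to_s)) && m.to_s
  else raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
  end
end

# How a whole tuple is transformed.
def transform_tuple(tuple, with)
  case with
  when Symbol, Proc, Regexp           # applies to every attribute
    tuple.each_with_object({}) { |(k, v), out| out[k] = transform_value(v, with) }
  when Hash                           # applies to the listed attributes only
    with.each_with_object(tuple.dup) { |(k, t), out| out[k] = transform_value(out[k], t) }
  when Array                          # applies each step in turn
    with.inject(tuple) { |out, step| transform_tuple(out, step) }
  else
    raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
  end
end

tuple   = { id: 1, name: "bmg" }
all     = transform_tuple(tuple, :to_s)
some    = transform_tuple(tuple, { name: :upcase })
chained = transform_tuple(tuple, [:to_s, :upcase])
doubled = transform_tuple(tuple, { id: ->(v) { v * 2 } })
```

Note that a Hash plays two roles: at tuple level it selects which attributes to touch, while at value level (nested inside such a Hash) it acts as a lookup table.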
data/lib/bmg/type.rb CHANGED
@@ -241,6 +241,31 @@ module Bmg
241
241
  }
242
242
  end
243
243
 
244
+ def transform(transformation, options = {})
245
+ transformer = TupleTransformer.new(transformation)
246
+ if typechecked? && knows_attrlist? && transformer.knows_attrlist?
247
+ known_attributes!(transformer.to_attrlist)
248
+ end
249
+ keys = if options[:key_preserving]
250
+ self._keys
251
+ elsif transformer.knows_attrlist? && knows_keys?
252
+ touched_attrs = transformer.to_attrlist
253
+ keys = self._keys.select{|k| (k & touched_attrs).empty? }
254
+ else
255
+ nil
256
+ end
257
+ pred = if transformer.knows_attrlist?
258
+ attr_list = transformer.to_attrlist
259
+ predicate.and_split(attr_list).last
260
+ else
261
+ Predicate.tautology
262
+ end
263
+ dup.tap{|x|
264
+ x.keys = keys
265
+ x.predicate = pred
266
+ }
267
+ end
268
+
244
269
  def union(other)
245
270
  if typechecked? && knows_attrlist? && other.knows_attrlist?
246
271
  missing = self.attrlist - other.attrlist
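
The key handling in `Type#transform` follows a simple rule: a candidate key survives only if the transformation touches none of its attributes, since transforming a key attribute (with `upcase`, say) may map two distinct values to the same one. A minimal sketch of that rule, with `surviving_keys` as a hypothetical helper name:

```ruby
# Candidate keys that remain valid after transforming `touched_attrs`.
# Each key is an Array of attribute names.
def surviving_keys(keys, touched_attrs)
  keys.select { |key| (key & touched_attrs).empty? }
end

keys      = [[:sid], [:name, :city]]
kept      = surviving_keys(keys, [:name])  # [:name, :city] is dropped
untouched = surviving_keys(keys, [:status])
```

In the diff, passing `key_preserving: true` bypasses this filtering and keeps all keys, which is only safe when the caller knows the transformation cannot create duplicates.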
data/lib/bmg/version.rb CHANGED
@@ -1,8 +1,8 @@
1
1
  module Bmg
2
2
  module Version
3
3
  MAJOR = 0
4
- MINOR = 17
5
- TINY = 4
4
+ MINOR = 18
5
+ TINY = 0
6
6
  end
7
7
  VERSION = "#{Version::MAJOR}.#{Version::MINOR}.#{Version::TINY}"
8
8
  end
data/lib/bmg/writer.rb ADDED
@@ -0,0 +1 @@
1
+ require_relative 'writer/csv'
@@ -0,0 +1,42 @@
1
+ module Bmg
2
+ module Writer
3
+ class Csv
4
+ include Writer
5
+
6
+ DEFAULT_OPTIONS = {
7
+ }
8
+
9
+ def initialize(csv_options, output_preferences = nil)
10
+ @csv_options = DEFAULT_OPTIONS.merge(csv_options)
11
+ @output_preferences = OutputPreferences.dress(output_preferences)
12
+ end
13
+ attr_reader :csv_options, :output_preferences
14
+
15
+ def call(relation, string_or_io = nil)
16
+ require 'csv'
17
+ string_or_io, to_s = string_or_io.nil? ? [StringIO.new, true] : [string_or_io, false]
18
+ headers, csv = infer_headers(relation.type), nil
19
+ relation.each do |tuple|
20
+ if csv.nil?
21
+ headers = infer_headers(tuple) if headers.nil?
22
+ csv = CSV.new(string_or_io, csv_options.merge(headers: headers))
23
+ end
24
+ csv << headers.map{|h| tuple[h] }
25
+ end
26
+ to_s ? string_or_io.string : string_or_io
27
+ end
28
+
29
+ private
30
+
31
+ def infer_headers(from)
32
+ attrlist = if from.is_a?(Type) && from.knows_attrlist?
33
+ from.to_attrlist
34
+ elsif from.is_a?(Hash)
35
+ from.keys
36
+ end
37
+ attrlist ? output_preferences.order_attrlist(attrlist) : nil
38
+ end
39
+
40
+ end # class Csv
41
+ end # module Writer
42
+ end # module Bmg
data/tasks/test.rake CHANGED
@@ -6,17 +6,24 @@ namespace :test do
6
6
  desc "Runs unit tests"
7
7
  RSpec::Core::RakeTask.new(:unit) do |t|
8
8
  t.pattern = "spec/unit/**/test_*.rb"
9
- t.rspec_opts = ["-Ilib", "-Ispec/unit", "--fail-fast", "--color", "--backtrace", "--format=progress"]
9
+ t.rspec_opts = ["-Ilib", "-Ispec/unit", "--color", "--backtrace", "--format=progress"]
10
10
  end
11
11
  tests << :unit
12
12
 
13
13
  desc "Runs integration tests"
14
14
  RSpec::Core::RakeTask.new(:integration) do |t|
15
15
  t.pattern = "spec/integration/**/test_*.rb"
16
- t.rspec_opts = ["-Ilib", "-Ispec/integration", "--fail-fast", "--color", "--backtrace", "--format=progress"]
16
+ t.rspec_opts = ["-Ilib", "-Ispec/integration", "--color", "--backtrace", "--format=progress"]
17
17
  end
18
18
  tests << :integration
19
19
 
20
+ desc "Runs github regression tests"
21
+ RSpec::Core::RakeTask.new(:regression) do |t|
22
+ t.pattern = "spec/regression/**/test_*.rb"
23
+ t.rspec_opts = ["-Ilib", "-Ispec/regression", "--color", "--backtrace", "--format=progress"]
24
+ end
25
+ tests << :regression
26
+
20
27
  task :all => tests
21
28
  end
22
29
 
metadata CHANGED
@@ -1,91 +1,91 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: bmg
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.17.4
4
+ version: 0.18.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Bernard Lambeau
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2020-07-23 00:00:00.000000000 Z
11
+ date: 2021-03-12 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: predicate
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - "~>"
18
- - !ruby/object:Gem::Version
19
- version: '2.4'
20
17
  - - ">="
21
18
  - !ruby/object:Gem::Version
22
- version: 2.4.0
19
+ version: 2.5.0
20
+ - - "~>"
21
+ - !ruby/object:Gem::Version
22
+ version: '2.5'
23
23
  type: :runtime
24
24
  prerelease: false
25
25
  version_requirements: !ruby/object:Gem::Requirement
26
26
  requirements:
27
- - - "~>"
28
- - !ruby/object:Gem::Version
29
- version: '2.4'
30
27
  - - ">="
31
28
  - !ruby/object:Gem::Version
32
- version: 2.4.0
29
+ version: 2.5.0
30
+ - - "~>"
31
+ - !ruby/object:Gem::Version
32
+ version: '2.5'
33
33
  - !ruby/object:Gem::Dependency
34
- name: rake
34
+ name: path
35
35
  requirement: !ruby/object:Gem::Requirement
36
36
  requirements:
37
- - - "~>"
37
+ - - ">="
38
38
  - !ruby/object:Gem::Version
39
- version: '10'
40
- type: :development
39
+ version: '2.0'
40
+ type: :runtime
41
41
  prerelease: false
42
42
  version_requirements: !ruby/object:Gem::Requirement
43
43
  requirements:
44
- - - "~>"
44
+ - - ">="
45
45
  - !ruby/object:Gem::Version
46
- version: '10'
46
+ version: '2.0'
47
47
  - !ruby/object:Gem::Dependency
48
- name: rspec
48
+ name: rake
49
49
  requirement: !ruby/object:Gem::Requirement
50
50
  requirements:
51
51
  - - "~>"
52
52
  - !ruby/object:Gem::Version
53
- version: '3.6'
53
+ version: '13'
54
54
  type: :development
55
55
  prerelease: false
56
56
  version_requirements: !ruby/object:Gem::Requirement
57
57
  requirements:
58
58
  - - "~>"
59
59
  - !ruby/object:Gem::Version
60
- version: '3.6'
60
+ version: '13'
61
61
  - !ruby/object:Gem::Dependency
62
- name: path
62
+ name: rspec
63
63
  requirement: !ruby/object:Gem::Requirement
64
64
  requirements:
65
- - - ">="
65
+ - - "~>"
66
66
  - !ruby/object:Gem::Version
67
- version: '1.3'
67
+ version: '3.6'
68
68
  type: :development
69
69
  prerelease: false
70
70
  version_requirements: !ruby/object:Gem::Requirement
71
71
  requirements:
72
- - - ">="
72
+ - - "~>"
73
73
  - !ruby/object:Gem::Version
74
- version: '1.3'
74
+ version: '3.6'
75
75
  - !ruby/object:Gem::Dependency
76
76
  name: roo
77
77
  requirement: !ruby/object:Gem::Requirement
78
78
  requirements:
79
79
  - - ">="
80
80
  - !ruby/object:Gem::Version
81
- version: '2.7'
81
+ version: '2.8'
82
82
  type: :development
83
83
  prerelease: false
84
84
  version_requirements: !ruby/object:Gem::Requirement
85
85
  requirements:
86
86
  - - ">="
87
87
  - !ruby/object:Gem::Version
88
- version: '2.7'
88
+ version: '2.8'
89
89
  - !ruby/object:Gem::Dependency
90
90
  name: sequel
91
91
  requirement: !ruby/object:Gem::Requirement
@@ -149,14 +149,17 @@ files:
149
149
  - lib/bmg/operator/shared/nary.rb
150
150
  - lib/bmg/operator/shared/unary.rb
151
151
  - lib/bmg/operator/summarize.rb
152
+ - lib/bmg/operator/transform.rb
152
153
  - lib/bmg/operator/union.rb
153
154
  - lib/bmg/reader.rb
154
155
  - lib/bmg/reader/csv.rb
155
156
  - lib/bmg/reader/excel.rb
157
+ - lib/bmg/reader/text_file.rb
156
158
  - lib/bmg/relation.rb
157
159
  - lib/bmg/relation/empty.rb
158
160
  - lib/bmg/relation/in_memory.rb
159
161
  - lib/bmg/relation/materialized.rb
162
+ - lib/bmg/relation/proxy.rb
160
163
  - lib/bmg/relation/spied.rb
161
164
  - lib/bmg/sequel.rb
162
165
  - lib/bmg/sequel/ext.rb
@@ -258,9 +261,13 @@ files:
258
261
  - lib/bmg/summarizer/variance.rb
259
262
  - lib/bmg/support.rb
260
263
  - lib/bmg/support/keys.rb
264
+ - lib/bmg/support/output_preferences.rb
261
265
  - lib/bmg/support/tuple_algebra.rb
266
+ - lib/bmg/support/tuple_transformer.rb
262
267
  - lib/bmg/type.rb
263
268
  - lib/bmg/version.rb
269
+ - lib/bmg/writer.rb
270
+ - lib/bmg/writer/csv.rb
264
271
  - tasks/gem.rake
265
272
  - tasks/test.rake
266
273
  homepage: http://github.com/enspirit/bmg
@@ -282,7 +289,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
282
289
  - !ruby/object:Gem::Version
283
290
  version: '0'
284
291
  requirements: []
285
- rubygems_version: 3.1.2
292
+ rubygems_version: 3.0.8
286
293
  signing_key:
287
294
  specification_version: 4
288
295
  summary: Bmg is Alf's successor.