bmg 0.17.4 → 0.18.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 1e48d4a2d3c4bea8271e40ad4db52f43438de7f754079cc4c5bcd5e98ce4865d
4
- data.tar.gz: 3fff9f251a2b766ab3de631748ccc6fe6f6b96ccef88507cb3ed76766fea3992
3
+ metadata.gz: f1ce59d00b630f644e5716eaff17a83116c1342ea0b6ba7d9174b1f5f4eadd6e
4
+ data.tar.gz: 2156d1a8eb88999e749f434a8e94d2b4e70c636fadb10470493f19bb805a19a3
5
5
  SHA512:
6
- metadata.gz: facfbb798258cfdec3803cc7b93999d23c8a0776ead53c074266f4f658a27feee1a3e1dcc1145e7ad45bb03579d22f5d65108152516ac0af0d79aa59805ee016
7
- data.tar.gz: 10271e608276bc2e00ccee3d6a94287cb2211d945898cb9cfa54764a05027c9e6a6c34a11e9725f9ed47df4df537fe054d8c3f17b24f58da9713761121523e73
6
+ metadata.gz: 8e3e714d698ff7c47c2f61b4056c73acaa39983306402eeb54d1957b90959af51fcf20a543c5c17242d260fe835b33c19dd960e3cd4920a66d49ea81a22ee4b0
7
+ data.tar.gz: 89fe1cc4c3157adf7373fb755e0cf7eb5dd8d93f7061558454d27649b44aca558f936f006e3238dce410ffdfe29821d59feac3b31b5d64241ffc76192bba149f
data/Gemfile CHANGED
@@ -1,5 +1,2 @@
1
1
  source "https://rubygems.org"
2
2
  gemspec
3
-
4
- # gem "predicate", github: "enspirit/predicate", branch: "placeholders"
5
- # gem "predicate", path: "../predicate"
data/README.md CHANGED
@@ -1,16 +1,30 @@
1
1
  # Bmg, a relational algebra (Alf's successor)!
2
2
 
3
+ [![Build Status](https://travis-ci.com/enspirit/bmg.svg?branch=master)](https://travis-ci.com/enspirit/bmg)
4
+
3
5
  Bmg is a relational algebra implemented as a ruby library. It implements the
4
6
  [Relation as First-Class Citizen](http://www.try-alf.org/blog/2013-10-21-relations-as-first-class-citizen)
5
- paradigm contributed with Alf a few years ago.
6
-
7
- Like Alf, Bmg can be used to query relations in memory, from various files,
8
- SQL databases, and any data sources that can be seen as serving relations.
9
- Cross data-sources joins are supported, as with Alf.
10
-
11
- Unlike Alf, Bmg does not make any core ruby extension and exposes the
12
- object-oriented syntax only (not Alf's functional one). Bmg implementation is
13
- also much simpler, and make its easier to implement user-defined relations.
7
+ paradigm contributed with [Alf](http://www.try-alf.org/) a few years ago.
8
+
9
+ Bmg can be used to query relations in memory, from various files, SQL databases,
10
+ and any data source that can be seen as serving relations. Cross data-sources
11
+ joins are supported, as with Alf. For differences with Alf, see the section
12
+ further down this README.
13
+
14
+ ## Outline
15
+
16
+ * [Example](#example)
17
+ * [Where are base relations coming from?](#where-are-base-relations-coming-from)
18
+ * [Memory relations](#memory-relations)
19
+ * [Connecting to SQL databases](#connecting-to-sql-databases)
20
+ * [Reading files (csv, excel, text)](#reading-files-csv-excel-text)
21
+ * [Your own relations](#your-own-relations)
22
+ * [List of supported operators](#supported-operators)
23
+ * [How is this different?](#how-is-this-different)
24
+ * [... from similar libraries](#-from-similar-libraries)
25
+ * [... from Alf](#-from-alf)
26
+ * [Contribute](#contribute)
27
+ * [Licence](#licence)
14
28
 
15
29
  ## Example
16
30
 
@@ -27,7 +41,7 @@ suppliers = Bmg::Relation.new([
27
41
  ])
28
42
 
29
43
  by_city = suppliers
30
- .restrict(Predicate.neq(status: 30))
44
+ .exclude(status: 30)
31
45
  .extend(upname: ->(t){ t[:name].upcase })
32
46
  .group([:sid, :name, :status], :suppliers_in)
33
47
 
@@ -35,76 +49,158 @@ puts JSON.pretty_generate(by_city)
35
49
  # [{...},...]
36
50
  ```
37
51
 
38
- ## Connecting to a SQL database
52
+ ## Where are base relations coming from?
53
+
54
+ Bmg sees relations as sets/enumerable of symbolized Ruby hashes. The following
55
+ sections show you how to get them in the first place, to enter Relationland.
56
+
57
+ ### Memory relations
58
+
59
+ If you have an Array of Hashes -- in fact any Enumerable -- you can easily get
60
+ a Relation using either `Bmg::Relation.new` or `Bmg.in_memory`.
61
+
62
+ ```ruby
63
+ # this...
64
+ r = Bmg::Relation.new [{id: 1}, {id: 2}]
65
+
66
+ # is the same as this...
67
+ r = Bmg.in_memory [{id: 1}, {id: 2}]
68
+
69
+ # entire algebra is available on `r`
70
+ ```
71
+
72
+ ### Connecting to SQL databases
39
73
 
40
- Bmg requires `sequel >= 3.0` to connect to SQL databases.
74
+ Bmg currently requires `sequel >= 3.0` to connect to SQL databases. You also
75
+ need to require `bmg/sequel`.
41
76
 
42
77
  ```ruby
43
78
  require 'sqlite3'
44
79
  require 'bmg'
45
80
  require 'bmg/sequel'
81
+ ```
46
82
 
47
- DB = Sequel.connect("sqlite://suppliers-and-parts.db")
83
+ Then `Bmg.sequel` serves relations for tables of your SQL database:
48
84
 
85
+ ```ruby
86
+ DB = Sequel.connect("sqlite://suppliers-and-parts.db")
49
87
  suppliers = Bmg.sequel(:suppliers, DB)
88
+ ```
89
+
90
+ The entire algebra is available on those relations. As long as you keep using
91
+ operators that can be translated to SQL, results remain SQL-able:
50
92
 
93
+ ```ruby
51
94
  big_suppliers = suppliers
52
- .restrict(Predicate.neq(status: 30))
95
+ .exclude(status: 30)
96
+ .project([:sid, :name])
53
97
 
54
98
  puts big_suppliers.to_sql
55
- # SELECT `t1`.`sid`, `t1`.`name`, `t1`.`status`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)
99
+ # SELECT `t1`.`sid`, `t1`.`name` FROM `suppliers` AS 't1' WHERE (`t1`.`status` != 30)
100
+ ```
56
101
 
57
- puts JSON.pretty_generate(big_suppliers)
58
- # [{...},...]
102
+ Operators not translatable to SQL are available too (such as `group` below).
103
+ Bmg falls back to in-memory operators for them, but remains capable of pushing some
104
+ operators down the tree as illustrated below (the restriction on `:city` is
105
+ pushed to the SQL server):
106
+
107
+ ```ruby
108
+ Bmg.sequel(:suppliers, sequel_db)
109
+ .project([:sid, :name, :city])
110
+ .group([:sid, :name], :suppliers_in)
111
+ .restrict(city: ["Paris", "London"])
112
+ .debug
113
+
114
+ # (group
115
+ # (sequel SELECT `t1`.`sid`, `t1`.`name`, `t1`.`city` FROM `suppliers` AS 't1' WHERE (`t1`.`city` IN ('Paris', 'London')))
116
+ # [:sid, :name]
117
+ # :suppliers_in
118
+ # {:array=>false})
59
119
  ```
60
120
 
61
- ## How is this different from similar libraries?
121
+ ### Reading files (csv, excel, text)
62
122
 
63
- 1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ,
64
- etc.) do not implement a genuine relational algebra: their support for
65
- chaining relational operators is limited (yielding errors or wrong SQL
66
- queries). Bmg **always** allows chaining operators. If it does not, it's
67
- a bug. In other words, the following query is 100% valid:
123
+ Bmg provides simple adapters to read files and reach Relationland as soon as
124
+ possible.
68
125
 
69
- relation
70
- .restrict(...) # aka where
71
- .union(...)
72
- .summarize(...) # aka group by
73
- .restrict(...)
126
+ #### CSV files
74
127
 
75
- 2. Bmg supports in memory relations, json relations, csv relations, SQL
76
- relations and so on. It's not tight to SQL generation, and supports
77
- queries accross multiple data sources.
128
+ ```ruby
129
+ csv_options = { col_sep: ",", quote_char: '"' }
130
+ r = Bmg.csv("path/to/a/file.csv", csv_options)
131
+ ```
78
132
 
79
- 3. Bmg makes a best effort to optimize queries, simplifying both generated
80
- SQL code (low-level accesses to datasources) and in-memory operations.
133
+ Options are directly transmitted to `::CSV.new`, check ruby's standard
134
+ library.
81
135
 
82
- 4. Bmg supports various *structuring* operators (group, image, autowrap,
83
- autosummarize, etc.) and allows building 'non flat' relations.
136
+ #### Excel files
84
137
 
85
- ## How is this different from Alf?
138
+ You will need to add [`roo`](https://github.com/roo-rb/roo) to your Gemfile to
139
+ read `.xls` and `.xlsx` files with Bmg.
86
140
 
87
- 1. Bmg's implementation is much simpler than Alf, and uses no ruby core
88
- extention.
141
+ ```ruby
142
+ roo_options = { skip: 1 }
143
+ r = Bmg.excel("path/to/a/file.xls", roo_options)
144
+ ```
89
145
 
90
- 2. We are confident using Bmg in production. Systematic inspection of query
91
- plans is suggested though. Alf was a bit too experimental to be used on
92
- (critical) production systems.
146
+ Options are directly transmitted to `Roo::Spreadsheet.open`, check roo's
147
+ documentation.
93
148
 
94
- 2. Alf exposes a functional syntax, command line tool, restful tools and
95
- many more. Bmg is limited to the core algebra, main Relation abstraction
96
- and SQL generation.
149
+ #### Text files
97
150
 
98
- 3. Bmg is less strict regarding conformance to relational theory, and
99
- may actually expose non relational features (such as support for null,
100
- left_join operator, etc.). Sharp tools hurt, use them with great care.
151
+ There is also a straightforward way to read text files and convert lines to
152
+ tuples.
101
153
 
102
- 4. Bmg does not yet implement all operators documented on try-alf.org, even
103
- if we plan to eventually support them all.
154
+ ```ruby
155
+ r = Bmg.text_file("path/to/a/file.txt")
156
+ r.type.attrlist
157
+ # => [:line, :text]
158
+ ```
104
159
 
105
- 5. Bmg has a few additional operators that prove very useful on real
106
- production use cases: prefix, suffix, autowrap, autosummarize, left_join,
107
- rxmatch, etc.
160
+ Without options tuples will have `:line` and `:text` attributes, the former
161
+ being the line number (starting at 1) and the latter being the line itself
162
+ (stripped).
163
+
164
+ There are a couple of options (see `Bmg::Reader::TextFile`). The most useful one
165
+ is the use of a Regexp with named captures to automatically extract
166
+ attributes:
167
+
168
+ ```ruby
169
+ r = Bmg.text_file("path/to/a/file.txt", parse: /GET (?<url>([^\s]+))/)
170
+ r.type.attrlist
171
+ # => [:line, :url]
172
+ ```
173
+
174
+ In this scenario, non matching lines are skipped. The `:line` attribute keeps
175
+ being used to have at least one candidate key (so to speak).
176
+
177
+ ### Your own relations
178
+
179
+ As noted earlier, Bmg has a simple relation interface where you only have to
180
+ provide an iteration of symbolized tuples.
181
+
182
+ ```ruby
183
+ class MyRelation
184
+ include Bmg::Relation
185
+
186
+ def each
187
+ yield(id: 1, name: "Alf", year: 2014)
188
+ yield(id: 2, name: "Bmg", year: 2018)
189
+ end
190
+ end
191
+
192
+ MyRelation.new
193
+ .restrict(Predicate.gt(:year, 2015))
194
+ .allbut([:year])
195
+ ```
196
+
197
+ As shown, creating adapters on top of various data sources is straightforward.
198
+ Adapters can also participate in query optimization (such as pushing
199
+ restrictions down the tree) by overriding the underscored version of operators
200
+ (e.g. `_restrict`).
201
+
202
+ Have a look at `Bmg::Algebra` for the protocol and `Bmg::Sql::Relation` for an
203
+ example. Keep in touch with the team if you need some help.
108
204
 
109
205
  ## Supported operators
110
206
 
@@ -114,6 +210,7 @@ r.autowrap(split: '_') # structure a flat relation, split:
114
210
  r.autosummarize([:a, :b, ...], x: :sum) # (experimental) usual summarizers supported
115
211
  r.constants(x: 12, ...) # add constant attributes (sometimes useful in unions)
116
212
  r.extend(x: ->(t){ ... }, ...) # add computed attributes
213
+ r.exclude(predicate) # shortcut for restrict(!predicate)
117
214
  r.group([:a, :b, ...], :x) # relation-valued attribute from attributes
118
215
  r.image(right, :x, [:a, :b, ...]) # relation-valued attribute from another relation
119
216
  r.join(right, [:a, :b, ...]) # natural join on a join key
@@ -132,15 +229,100 @@ r.restrict(a: "foo", b: "bar", ...) # relational restriction, aka where
132
229
  r.rxmatch([:a, :b, ...], /xxx/) # regex match kind of restriction
133
230
  r.summarize([:a, :b, ...], x: :sum) # relational summarization
134
231
  r.suffix(:_foo, but: [:a, ...]) # suffix kind of renaming
232
+ r.transform(:to_s) # all-attrs transformation
233
+ r.transform(&:to_s) # similar, but Proc-driven
234
+ r.transform(:foo => :upcase, ...) # specific-attrs transformation
235
+ r.transform([:to_s, :upcase]) # chain-transformation
135
236
  r.union(right) # relational union
237
+ r.where(predicate) # alias for restrict(predicate)
136
238
  ```
137
239
 
138
- ## Who is behind Bmg?
240
+ ## How is this different?
241
+
242
+ ### ... from similar libraries?
243
+
244
+ 1. The libraries you probably know (Sequel, Arel, SQLAlchemy, Korma, jOOQ,
245
+ etc.) do not implement a genuine relational algebra. Their support for
246
+ chaining relational operators is thus limited (restricting your expression
247
+ power and/or raising errors and/or outputting wrong or counterintuitive
248
+ SQL code). Bmg **always** allows chaining operators. If it does not, it's
249
+ a bug.
250
+
251
+ For instance the expression below is 100% valid in Bmg. The last where
252
+ clause applies to the result of the summarize (while SQL requires a `HAVING`
253
+ clause, or a `SELECT ... FROM (SELECT ...) r`).
254
+
255
+ ```ruby
256
+ relation
257
+ .where(...)
258
+ .union(...)
259
+ .summarize(...) # aka group by
260
+ .where(...)
261
+ ```
262
+
263
+ 2. Bmg supports in memory relations, json relations, csv relations, SQL
264
+ relations and so on. It's not tied to SQL generation, and supports
265
+ queries across multiple data sources.
266
+
267
+ 3. Bmg makes a best effort to optimize queries, simplifying both generated
268
+ SQL code (low-level accesses to datasources) and in-memory operations.
269
+
270
+ 4. Bmg supports various *structuring* operators (group, image, autowrap,
271
+ autosummarize, etc.) and allows building 'non flat' relations.
272
+
273
+ 5. Bmg can use full ruby power when that helps (e.g. regular expressions in
274
+ WHERE clauses or ruby code in EXTEND clauses). This may prevent Bmg from
275
+ delegating work to underlying data sources (e.g. SQL server) and should
276
+ therefore be used with care though.
277
+
278
+ ### ... from Alf?
279
+
280
+ If you use Alf (or used it in the past), below are the main differences between
281
+ Bmg and Alf. Bmg has NOT been written to be API-compatible with Alf and will
282
+ probably never be.
283
+
284
+ 1. Bmg's implementation is much simpler than Alf and uses no ruby core
285
+ extension.
286
+
287
+ 2. We are confident using Bmg in production. Systematic inspection of query
288
+ plans is advised though. Alf was a bit too experimental to be used on
289
+ (critical) production systems.
290
+
291
+ 3. Alf exposes a functional syntax, command line tool, restful tools and
292
+ many more. Bmg is limited to the core algebra, main Relation abstraction
293
+ and SQL generation.
139
294
 
140
- Bernard Lambeau (bernard@klaro.cards) is Alf & Bmg main engineer & maintainer.
295
+ 4. Bmg is less strict regarding conformance to relational theory, and
296
+ may actually expose non relational features (such as support for null,
297
+ left_join operator, etc.). Sharp tools hurt, use them with care.
298
+
299
+ 5. Unlike Alf::Relation, instances of Bmg::Relation capture query trees, not
300
+ values. Currently two instances `r1` and `r2` are not equal even if they
301
+ define the same mathematical relation. As a consequence joining on
302
+ relation-valued attributes does not work as expected in Bmg until further
303
+ notice.
304
+
305
+ 6. Bmg does not implement all operators documented on try-alf.org, even if
306
+ we plan to eventually support most of them.
307
+
308
+ 7. Bmg has a few additional operators that prove very useful on real
309
+ production use cases: prefix, suffix, autowrap, autosummarize, left_join,
310
+ rxmatch, etc.
311
+
312
+ 8. Bmg optimizes queries and compiles them to SQL on the fly, while Alf was
313
+ building an AST internally first. Strictly speaking this makes Bmg less
314
+ powerful than Alf since optimizations cannot be turned off for now.
315
+
316
+ ## Contribute
317
+
318
+ Please use github issues and pull requests for all questions, bug reports,
319
+ and contributions. Don't hesitate to get in touch with us with an early code
320
+ spike if you plan to add non trivial features.
321
+
322
+ ## Licence
323
+
324
+ This software is distributed by Enspirit SRL under a MIT Licence. Please
325
+ contact Bernard Lambeau (blambeau@gmail.com) with any question.
141
326
 
142
327
  Enspirit (https://enspirit.be) and Klaro App (https://klaro.cards) are both
143
328
  actively using and contributing to the library.
144
-
145
- Feel free to contact us for help, ideas and/or contributions. Please use github
146
- issues and pull requests if possible if code is involved.
data/lib/bmg.rb CHANGED
@@ -1,6 +1,7 @@
1
1
  require 'path'
2
2
  require 'predicate'
3
3
  require 'forwardable'
4
+ require 'set'
4
5
  module Bmg
5
6
 
6
7
  def in_memory(enumerable, type = Type::ANY)
@@ -8,6 +9,11 @@ module Bmg
8
9
  end
9
10
  module_function :in_memory
10
11
 
12
+ def text_file(path, options = {}, type = Type::ANY)
13
+ Reader::TextFile.new(type, path, options).spied(main_spy)
14
+ end
15
+ module_function :text_file
16
+
11
17
  def csv(path, options = {}, type = Type::ANY)
12
18
  Reader::Csv.new(type, path, options).spied(main_spy)
13
19
  end
@@ -38,11 +44,13 @@ module Bmg
38
44
  require_relative 'bmg/operator'
39
45
 
40
46
  require_relative 'bmg/reader'
47
+ require_relative 'bmg/writer'
41
48
 
42
49
  require_relative 'bmg/relation/empty'
43
50
  require_relative 'bmg/relation/in_memory'
44
51
  require_relative 'bmg/relation/spied'
45
52
  require_relative 'bmg/relation/materialized'
53
+ require_relative 'bmg/relation/proxy'
46
54
 
47
55
  # Deprecated
48
56
  Leaf = Relation::InMemory
data/lib/bmg/algebra.rb CHANGED
@@ -172,6 +172,16 @@ module Bmg
172
172
  end
173
173
  protected :_summarize
174
174
 
175
+ def transform(transformation = nil, options = {}, &proc)
176
+ transformation, options = proc, (transformation || {}) unless proc.nil?
177
+ _transform(self.type.transform(transformation, options), transformation, options)
178
+ end
179
+
180
+ def _transform(type, transformation, options)
181
+ Operator::Transform.new(type, self, transformation, options)
182
+ end
183
+ protected :_transform
184
+
175
185
  def union(other, options = {})
176
186
  return self if other.is_a?(Relation::Empty)
177
187
  _union self.type.union(other.type), other, options
@@ -2,6 +2,14 @@ module Bmg
2
2
  module Algebra
3
3
  module Shortcuts
4
4
 
5
+ def where(predicate)
6
+ restrict(predicate)
7
+ end
8
+
9
+ def exclude(predicate)
10
+ restrict(!Predicate.coerce(predicate))
11
+ end
12
+
5
13
  def rxmatch(attrs, matcher, options = {})
6
14
  predicate = attrs.inject(Predicate.contradiction){|p,a|
7
15
  p | Predicate.match(a, matcher, options)
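To illustrate the `where` and `exclude` shortcuts added in this hunk, here is a plain-Ruby analogue over an array of hashes. It uses neither Bmg nor Predicate: the lambdas `matches`, `where` and `exclude` are hypothetical stand-ins, and "every listed attribute equals the given value" is an assumed simplification of what `Predicate.coerce` does with a hash.

```ruby
suppliers = [
  { sid: "S1", status: 20 },
  { sid: "S2", status: 30 },
  { sid: "S3", status: 30 }
]

# Simplified predicate: every listed attribute must equal the given value.
matches = ->(tuple, attrs) { attrs.all? { |k, v| tuple[k] == v } }

# `where` keeps the matching tuples, `exclude` keeps all the others.
where   = ->(rel, attrs) { rel.select { |t| matches.call(t, attrs) } }
exclude = ->(rel, attrs) { rel.reject { |t| matches.call(t, attrs) } }

with_30    = where.call(suppliers, { status: 30 })
without_30 = exclude.call(suppliers, { status: 30 })
```

In this simplified setting the two results partition the input, which is the intuition behind `exclude` being defined as `restrict` on the negated predicate.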
data/lib/bmg/operator.rb CHANGED
@@ -46,4 +46,5 @@ require_relative 'operator/rename'
46
46
  require_relative 'operator/restrict'
47
47
  require_relative 'operator/rxmatch'
48
48
  require_relative 'operator/summarize'
49
+ require_relative 'operator/transform'
49
50
  require_relative 'operator/union'
@@ -0,0 +1,57 @@
1
+ module Bmg
2
+ module Operator
3
+ #
4
+ # Transform operator.
5
+ #
6
+ # Transforms existing attributes through computations
7
+ #
8
+ # Example:
9
+ #
10
+ # [{ a: 1 }] transform { a: ->(a){ a*2 } } => [{ a: 2 }]
11
+ #
12
+ class Transform
13
+ include Operator::Unary
14
+
15
+ DEFAULT_OPTIONS = {}
16
+
17
+ def initialize(type, operand, transformation, options = {})
18
+ @type = type
19
+ @operand = operand
20
+ @transformation = transformation
21
+ @options = DEFAULT_OPTIONS.merge(options)
22
+ end
23
+
24
+ protected
25
+
26
+ attr_reader :transformation
27
+
28
+ public
29
+
30
+ def each
31
+ t = transformer
32
+ @operand.each do |tuple|
33
+ yield t.call(tuple)
34
+ end
35
+ end
36
+
37
+ def to_ast
38
+ [ :transform, operand.to_ast, transformation.dup ]
39
+ end
40
+
41
+ protected ### optimization
42
+
43
+ protected ### inspect
44
+
45
+ def args
46
+ [ transformation ]
47
+ end
48
+
49
+ private
50
+
51
+ def transformer
52
+ @transformer ||= TupleTransformer.new(transformation)
53
+ end
54
+
55
+ end # class Transform
56
+ end # module Operator
57
+ end # module Bmg
data/lib/bmg/reader.rb CHANGED
@@ -9,3 +9,4 @@ module Bmg
9
9
  end
10
10
  require_relative "reader/csv"
11
11
  require_relative "reader/excel"
12
+ require_relative "reader/text_file"
@@ -0,0 +1,56 @@
1
+ module Bmg
2
+ module Reader
3
+ class TextFile
4
+ include Reader
5
+
6
+ DEFAULT_OPTIONS = {
7
+ strip: true,
8
+ parse: nil
9
+ }
10
+
11
+ def initialize(type, path, options = {})
12
+ options = { parse: options } if options.is_a?(Regexp)
13
+ @path = path
14
+ @options = DEFAULT_OPTIONS.merge(options)
15
+ @type = infer_type(type)
16
+ end
17
+ attr_reader :path, :options
18
+
19
+ public # Relation
20
+
21
+ def each
22
+ path.each_line.each_with_index do |text, line|
23
+ text = text.strip if strip?
24
+ parsed = parse(text)
25
+ yield({line: 1+line}.merge(parsed)) if parsed
26
+ end
27
+ end
28
+
29
+ private
30
+
31
+ def infer_type(base)
32
+ return base unless base == Bmg::Type::ANY
33
+ attr_list = if rx = options[:parse]
34
+ [:line] + rx.names.map(&:to_sym)
35
+ else
36
+ [:line, :text]
37
+ end
38
+ base
39
+ .with_attrlist(attr_list)
40
+ .with_keys([[:line]])
41
+ end
42
+
43
+ def strip?
44
+ options[:strip]
45
+ end
46
+
47
+ def parse(text)
48
+ return { text: text } unless rx = options[:parse]
49
+ if match = rx.match(text)
50
+ TupleAlgebra.symbolize_keys(match.named_captures)
51
+ end
52
+ end
53
+
54
+ end # class TextFile
55
+ end # module Reader
56
+ end # module Bmg
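The reader above can be read as: number each line, strip it, match it against the `parse` regexp, and emit the named captures as a tuple, skipping lines that do not match. The following stand-alone sketch mirrors that loop with the standard library only. The `each_parsed_line` helper and the sample log are hypothetical, and an in-memory `StringIO` stands in for the file path.

```ruby
require 'stringio'

# Hypothetical stand-in for the reader's `each`: yields one tuple per
# matching line, keeping the 1-based line number as `:line`.
def each_parsed_line(io, rx)
  return enum_for(:each_parsed_line, io, rx) unless block_given?
  io.each_line.each_with_index do |text, index|
    match = rx.match(text.strip)
    next unless match # non-matching lines are skipped
    captures = match.named_captures.each_with_object({}) do |(k, v), tuple|
      tuple[k.to_sym] = v
    end
    yield({ line: index + 1 }.merge(captures))
  end
end

log = StringIO.new("GET /home 200\nPOST /login 302\nGET /about 200\n")
tuples = each_parsed_line(log, /GET (?<url>\S+)/).to_a
# tuples: [{ line: 1, url: "/home" }, { line: 3, url: "/about" }]
```

Line numbers refer to the original file, so the skipped second line leaves a gap; this is what lets `:line` act as a key even when lines are filtered out.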
data/lib/bmg/relation.rb CHANGED
@@ -17,6 +17,16 @@ module Bmg
17
17
  self
18
18
  end
19
19
 
20
+ def type
21
+ Bmg::Type::ANY
22
+ end
23
+
24
+ def with_type(type)
25
+ dup.tap{|r|
26
+ r.type = type
27
+ }
28
+ end
29
+
20
30
  def with_typecheck
21
31
  dup.tap{|r|
22
32
  r.type = r.type.with_typecheck
@@ -105,6 +115,20 @@ module Bmg
105
115
  to_a.to_json(*args, &bl)
106
116
  end
107
117
 
118
+ # Writes the relation data to CSV.
119
+ #
120
+ # `string_or_io` and `options` are what CSV::new itself
121
+ # recognizes, default options are CSV's.
122
+ #
123
+ # When no string_or_io is used, the method uses a string.
124
+ #
125
+ # The method always returns the string_or_io.
126
+ def to_csv(options = {}, string_or_io = nil, preferences = nil)
127
+ options, string_or_io = {}, options unless options.is_a?(Hash)
128
+ string_or_io, preferences = nil, string_or_io if string_or_io.is_a?(Hash)
129
+ Writer::Csv.new(options, preferences).call(self, string_or_io)
130
+ end
131
+
108
132
  # Converts to an sexpr expression.
109
133
  def to_ast
110
134
  raise "Bmg is missing a feature!"
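The `to_csv` signature added in this hunk accepts its arguments in several orders. The sketch below isolates the two reshuffling lines in a hypothetical helper, `normalize_to_csv_args`, to show which call shapes end up where. It only mimics the argument handling and writes no CSV.

```ruby
require 'stringio'

# Same two reshuffling lines as in the diff, isolated for inspection.
def normalize_to_csv_args(options = {}, string_or_io = nil, preferences = nil)
  options, string_or_io = {}, options unless options.is_a?(Hash)
  string_or_io, preferences = nil, string_or_io if string_or_io.is_a?(Hash)
  [options, string_or_io, preferences]
end

io    = StringIO.new
prefs = { attributes_ordering: [:sid, :name] }

no_args       = normalize_to_csv_args
io_only       = normalize_to_csv_args(io)
opts_prefs    = normalize_to_csv_args({ col_sep: ";" }, prefs)
io_then_prefs = normalize_to_csv_args(io, prefs)
```

Read this way, `opts_prefs` routes the second hash to the preferences, while `io_then_prefs` drops the preferences because the first line overwrites `string_or_io`. Passing all three arguments explicitly looks like the safe form when both an IO and preferences are needed.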
@@ -0,0 +1,63 @@
1
+ module Bmg
2
+ module Relation
3
+ #
4
+ # This module can be used to create typed collections on top
5
+ # of Bmg relations. Algebra methods will be delegated to the
6
+ # decorated relation, and results wrapped in a new instance
7
+ # of the class.
8
+ #
9
+ module Proxy
10
+
11
+ def initialize(relation)
12
+ @relation = relation
13
+ end
14
+
15
+ def method_missing(name, *args, &bl)
16
+ if @relation.respond_to?(name)
17
+ res = @relation.send(name, *args, &bl)
18
+ res.is_a?(Relation) ? _proxy(res) : res
19
+ else
20
+ super
21
+ end
22
+ end
23
+
24
+ def respond_to?(name, *args)
25
+ @relation.respond_to?(name) || super
26
+ end
27
+
28
+ [
29
+ :extend
30
+ ].each do |name|
31
+ define_method(name) do |*args, &bl|
32
+ res = @relation.send(name, *args, &bl)
33
+ res.is_a?(Relation) ? _proxy(res) : res
34
+ end
35
+ end
36
+
37
+ [
38
+ :one,
39
+ :one_or_nil
40
+ ].each do |meth|
41
+ define_method(meth) do |*args, &bl|
42
+ res = @relation.send(meth, *args, &bl)
43
+ res.nil? ? nil : _proxy_tuple(res)
44
+ end
45
+ end
46
+
47
+ def to_json(*args, &bl)
48
+ @relation.to_json(*args, &bl)
49
+ end
50
+
51
+ protected
52
+
53
+ def _proxy(relation)
54
+ self.class.new(relation)
55
+ end
56
+
57
+ def _proxy_tuple(tuple)
58
+ tuple
59
+ end
60
+
61
+ end # module Proxy
62
+ end # module Relation
63
+ end # module Bmg
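The delegation pattern used by `Proxy` can be sketched without Bmg. In the example below, `FakeRelation` is a hypothetical stand-in for a relation, `Suppliers` plays the role of a class including the proxy module, and `respond_to_missing?` is used in place of the `respond_to?` override shown in the diff.

```ruby
# Hypothetical minimal relation: an Enumerable with one algebra-like method.
class FakeRelation
  include Enumerable

  def initialize(tuples)
    @tuples = tuples
  end

  def each(&bl)
    @tuples.each(&bl)
  end

  def restrict(attrs)
    FakeRelation.new(@tuples.select { |t| attrs.all? { |k, v| t[k] == v } })
  end
end

# Typed collection: delegates to the wrapped relation and re-wraps
# relation results in a new instance of its own class.
class Suppliers
  def initialize(relation)
    @relation = relation
  end

  def method_missing(name, *args, &bl)
    if @relation.respond_to?(name)
      res = @relation.send(name, *args, &bl)
      res.is_a?(FakeRelation) ? self.class.new(res) : res
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @relation.respond_to?(name, include_private) || super
  end
end

all   = Suppliers.new(FakeRelation.new([{ sid: "S1", city: "London" },
                                        { sid: "S2", city: "Paris" }]))
paris = all.restrict(city: "Paris")
```

`paris` is again a `Suppliers`, while non-relation results such as `paris.to_a` come back unwrapped. This also suggests why the diff defines `extend` explicitly: every Ruby object already responds to `Kernel#extend`, so `method_missing` would never be reached for it.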
@@ -24,7 +24,7 @@ module Bmg
24
24
  protected :type=
25
25
 
26
26
  def each(&bl)
27
- spy.call(self)
27
+ spy.call(self) if bl
28
28
  operand.each(&bl)
29
29
  end
30
30
 
data/lib/bmg/support.rb CHANGED
@@ -1,2 +1,4 @@
1
1
  require_relative 'support/tuple_algebra'
2
+ require_relative 'support/tuple_transformer'
2
3
  require_relative 'support/keys'
4
+ require_relative 'support/output_preferences'
@@ -7,6 +7,10 @@ module Bmg
7
7
 
8
8
  public ## tools
9
9
 
10
+ def select(&bl)
11
+ Keys.new(@keys.select(&bl), false)
12
+ end
13
+
10
14
  public ## algebra
11
15
 
12
16
  def allbut(oldtype, newtype, butlist)
@@ -0,0 +1,44 @@
1
+ module Bmg
2
+ class OutputPreferences
3
+
4
+ DEFAULT_PREFS = {
5
+ attributes_ordering: nil,
6
+ extra_attributes: :after
7
+ }
8
+
9
+ def initialize(options)
10
+ @options = DEFAULT_PREFS.merge(options)
11
+ end
12
+ attr_reader :options
13
+
14
+ def self.dress(arg)
15
+ return arg if arg.is_a?(OutputPreferences)
16
+ arg = {} if arg.nil?
17
+ new(arg)
18
+ end
19
+
20
+ def attributes_ordering
21
+ options[:attributes_ordering]
22
+ end
23
+
24
+ def extra_attributes
25
+ options[:extra_attributes]
26
+ end
27
+
28
+ def order_attrlist(attrlist)
29
+ return attrlist if attributes_ordering.nil?
30
+ index = Hash[attributes_ordering.each_with_index.to_a]
31
+ attrlist.sort{|a,b|
32
+ ai, bi = index[a], index[b]
33
+ if ai && bi
34
+ ai <=> bi
35
+ elsif ai
36
+ extra_attributes == :after ? -1 : 1
37
+ else
38
+ extra_attributes == :after ? 1 : -1
39
+ end
40
+ }
41
+ end
42
+
43
+ end # class OutputPreferences
44
+ end # module Bmg
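The comparator in `order_attrlist` is easier to follow on a concrete list. The stand-alone function below restates it with the ordering and the `extra_attributes` mode passed as plain arguments; the function signature and the sample attribute names are assumptions made for illustration.

```ruby
# Restatement of the sort shown above: attributes listed in `ordering`
# come in that order, unlisted ones go after (or before) them.
def order_attrlist(attrlist, ordering, extra_attributes = :after)
  return attrlist if ordering.nil?
  index = Hash[ordering.each_with_index.to_a]
  attrlist.sort do |a, b|
    ai, bi = index[a], index[b]
    if ai && bi
      ai <=> bi
    elsif ai
      extra_attributes == :after ? -1 : 1
    else
      extra_attributes == :after ? 1 : -1
    end
  end
end

unordered    = order_attrlist([:city, :name, :sid], nil)
extra_after  = order_attrlist([:city, :name, :sid], [:sid, :name])
extra_before = order_attrlist([:city, :name, :sid], [:sid, :name], :before)
# extra_after:  [:sid, :name, :city]
# extra_before: [:city, :sid, :name]
```

The example uses a single unlisted attribute on purpose: when two unlisted attributes are compared with each other, the block returns the same sign in both directions, so their relative order does not appear to be defined by this comparator.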
@@ -19,5 +19,11 @@ module Bmg
19
19
  end
20
20
  module_function :rename
21
21
 
22
+ def symbolize_keys(h)
23
+ return h if h.empty?
24
+ h.each_with_object({}){|(k,v),memo| memo[k.to_sym] = v }
25
+ end
26
+ module_function :symbolize_keys
27
+
22
28
  end # module TupleAlgebra
23
29
  end # module Bmg
@@ -0,0 +1,63 @@
1
+ module Bmg
2
+ class TupleTransformer
3
+
4
+ def initialize(transformation)
5
+ @transformation = transformation
6
+ end
7
+
8
+ def self.new(arg)
9
+ return arg if arg.is_a?(TupleTransformer)
10
+ super
11
+ end
12
+
13
+ def call(tuple)
14
+ transform_tuple(tuple, @transformation)
15
+ end
16
+
17
+ def knows_attrlist?
18
+ @transformation.is_a?(Hash)
19
+ end
20
+
21
+ def to_attrlist
22
+ @transformation.keys
23
+ end
24
+
25
+ private
26
+
27
+ def transform_tuple(tuple, with)
28
+ case with
29
+ when Symbol, Proc, Regexp
30
+ tuple.each_with_object({}){|(k,v),dup|
31
+ dup[k] = transform_attr(v, with)
32
+ }
33
+ when Hash
34
+ with.each_with_object(tuple.dup){|(k,v),dup|
35
+ dup[k] = transform_attr(dup[k], v)
36
+ }
37
+ when Array
38
+ with.inject(tuple){|dup,on|
39
+ transform_tuple(dup, on)
40
+ }
41
+ else
42
+ raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
43
+ end
44
+ end
45
+
46
+ def transform_attr(value, with)
47
+ case with
48
+ when Symbol
49
+ value.send(with)
50
+ when Regexp
51
+ m = with.match(value.to_s)
52
+ m.nil? ? m : m.to_s
53
+ when Proc
54
+ with.call(value)
55
+ when Hash
56
+ with[value]
57
+ else
58
+ raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
59
+ end
60
+ end
61
+
62
+ end # class TupleTransformer
63
+ end # module Bmg
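To make the dispatch rules of `TupleTransformer` concrete, the sketch below restates `transform_tuple` and `transform_attr` as top-level methods and applies them to one tuple. It is a stand-alone copy of the logic for illustration, not the class itself, and the sample tuple is made up.

```ruby
# A Symbol sends a message to the value, a Regexp keeps the matched text,
# a Proc is called with the value, a Hash acts as a lookup table.
def transform_attr(value, with)
  case with
  when Symbol
    value.send(with)
  when Regexp
    m = with.match(value.to_s)
    m.nil? ? nil : m.to_s
  when Proc
    with.call(value)
  when Hash
    with[value]
  else
    raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
  end
end

# Symbol/Proc/Regexp apply to all attributes, a Hash to the listed ones,
# an Array chains several transformations in order.
def transform_tuple(tuple, with)
  case with
  when Symbol, Proc, Regexp
    tuple.each_with_object({}) { |(k, v), out| out[k] = transform_attr(v, with) }
  when Hash
    with.each_with_object(tuple.dup) { |(k, v), out| out[k] = transform_attr(out[k], v) }
  when Array
    with.inject(tuple) { |acc, step| transform_tuple(acc, step) }
  else
    raise ArgumentError, "Unexpected transformation `#{with.inspect}`"
  end
end

tuple     = { id: 1, name: "bmg" }
all_to_s  = transform_tuple(tuple, :to_s)
only_name = transform_tuple(tuple, { name: :upcase })
chained   = transform_tuple(tuple, [:to_s, :upcase])
doubled   = transform_tuple(tuple, { id: ->(v) { v * 2 } })
```

Note that a Proc receives the attribute value rather than the whole tuple, so `doubled` turns `id: 1` into `id: 2`.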
data/lib/bmg/type.rb CHANGED
@@ -241,6 +241,31 @@ module Bmg
241
241
  }
242
242
  end
243
243
 
244
+ def transform(transformation, options = {})
245
+ transformer = TupleTransformer.new(transformation)
246
+ if typechecked? && knows_attrlist? && transformer.knows_attrlist?
247
+ known_attributes!(transformer.to_attrlist)
248
+ end
249
+ keys = if options[:key_preserving]
250
+ self._keys
251
+ elsif transformer.knows_attrlist? && knows_keys?
252
+ touched_attrs = transformer.to_attrlist
253
+ self._keys.select{|k| (k & touched_attrs).empty? }
254
+ else
255
+ nil
256
+ end
257
+ pred = if transformer.knows_attrlist?
258
+ attr_list = transformer.to_attrlist
259
+ predicate.and_split(attr_list).last
260
+ else
261
+ Predicate.tautology
262
+ end
263
+ dup.tap{|x|
264
+ x.keys = keys
265
+ x.predicate = pred
266
+ }
267
+ end
268
+
244
269
  def union(other)
245
270
  if typechecked? && knows_attrlist? && other.knows_attrlist?
246
271
  missing = self.attrlist - other.attrlist
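The key handling in `Type#transform` above can be read as follows: a candidate key survives only if none of its attributes is touched by the transformation, unless `key_preserving: true` is passed. Here is a plain-array sketch of that selection. The `surviving_keys` helper is hypothetical, and keys are modelled as arrays of attribute names rather than as a `Keys` object.

```ruby
# Keep a key only when it shares no attribute with the transformed ones.
def surviving_keys(keys, touched_attrs, key_preserving: false)
  return keys if key_preserving
  keys.select { |key| (key & touched_attrs).empty? }
end

keys = [[:sid], [:name, :city]]

after_name_change = surviving_keys(keys, [:name])
after_sid_change  = surviving_keys(keys, [:sid])
preserved         = surviving_keys(keys, [:sid], key_preserving: true)
```

Dropping a touched key is the conservative choice, since a transformation such as `:upcase` may map two distinct values to the same one.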
data/lib/bmg/version.rb CHANGED
@@ -1,8 +1,8 @@
1
1
  module Bmg
2
2
  module Version
3
3
  MAJOR = 0
4
- MINOR = 17
5
- TINY = 4
4
+ MINOR = 18
5
+ TINY = 0
6
6
  end
7
7
  VERSION = "#{Version::MAJOR}.#{Version::MINOR}.#{Version::TINY}"
8
8
  end
data/lib/bmg/writer.rb ADDED
@@ -0,0 +1 @@
1
+ require_relative 'writer/csv'
@@ -0,0 +1,42 @@
1
+ module Bmg
2
+ module Writer
3
+ class Csv
4
+ include Writer
5
+
6
+ DEFAULT_OPTIONS = {
7
+ }
8
+
9
+ def initialize(csv_options, output_preferences = nil)
10
+ @csv_options = DEFAULT_OPTIONS.merge(csv_options)
11
+ @output_preferences = OutputPreferences.dress(output_preferences)
12
+ end
13
+ attr_reader :csv_options, :output_preferences
14
+
15
+ def call(relation, string_or_io = nil)
16
+ require 'csv'
17
+ string_or_io, to_s = string_or_io.nil? ? [StringIO.new, true] : [string_or_io, false]
18
+ headers, csv = infer_headers(relation.type), nil
19
+ relation.each do |tuple|
20
+ if csv.nil?
21
+ headers = infer_headers(tuple) if headers.nil?
22
+ csv = CSV.new(string_or_io, **csv_options.merge(headers: headers))
23
+ end
24
+ csv << headers.map{|h| tuple[h] }
25
+ end
26
+ to_s ? string_or_io.string : string_or_io
27
+ end
28
+
29
+ private
30
+
31
+ def infer_headers(from)
32
+ attrlist = if from.is_a?(Type) && from.knows_attrlist?
33
+ from.to_attrlist
34
+ elsif from.is_a?(Hash)
35
+ from.keys
36
+ end
37
+ attrlist ? output_preferences.order_attrlist(attrlist) : nil
38
+ end
39
+
40
+ end # class Csv
41
+ end # module Writer
42
+ end # module Bmg
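The writer above passes `headers:` to `CSV.new` and then appends one array per tuple. The sketch below reproduces that flow with the standard `csv` library only, to show the effect of the `write_headers` option. The `write_csv` helper is hypothetical, headers are converted to strings for simplicity, and options are splatted with `**` on the assumption that the installed `csv` takes them as keyword arguments.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper mirroring the writer's loop: one CSV row per tuple,
# with columns taken in the order given by `headers`.
def write_csv(tuples, headers, csv_options = {})
  io  = StringIO.new
  csv = CSV.new(io, **csv_options.merge(headers: headers.map(&:to_s)))
  tuples.each { |tuple| csv << headers.map { |h| tuple[h] } }
  io.string
end

tuples  = [{ id: 1, name: "Alf" }, { id: 2, name: "Bmg" }]
headers = tuples.first.keys

rows_only   = write_csv(tuples, headers)
with_header = write_csv(tuples, headers, { write_headers: true })
# rows_only:   "1,Alf\n2,Bmg\n"
# with_header: "id,name\n1,Alf\n2,Bmg\n"
```

Since `DEFAULT_OPTIONS` is empty in the writer, a header row would seemingly require passing `write_headers: true` among the CSV options.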
data/tasks/test.rake CHANGED
@@ -6,17 +6,24 @@ namespace :test do
6
6
  desc "Runs unit tests"
7
7
  RSpec::Core::RakeTask.new(:unit) do |t|
8
8
  t.pattern = "spec/unit/**/test_*.rb"
9
- t.rspec_opts = ["-Ilib", "-Ispec/unit", "--fail-fast", "--color", "--backtrace", "--format=progress"]
9
+ t.rspec_opts = ["-Ilib", "-Ispec/unit", "--color", "--backtrace", "--format=progress"]
10
10
  end
11
11
  tests << :unit
12
12
 
13
13
  desc "Runs integration tests"
14
14
  RSpec::Core::RakeTask.new(:integration) do |t|
15
15
  t.pattern = "spec/integration/**/test_*.rb"
16
- t.rspec_opts = ["-Ilib", "-Ispec/integration", "--fail-fast", "--color", "--backtrace", "--format=progress"]
16
+ t.rspec_opts = ["-Ilib", "-Ispec/integration", "--color", "--backtrace", "--format=progress"]
17
17
  end
18
18
  tests << :integration
19
19
 
20
+ desc "Runs github regression tests"
21
+ RSpec::Core::RakeTask.new(:regression) do |t|
22
+ t.pattern = "spec/regression/**/test_*.rb"
23
+ t.rspec_opts = ["-Ilib", "-Ispec/regression", "--color", "--backtrace", "--format=progress"]
24
+ end
25
+ tests << :regression
26
+
20
27
  task :all => tests
21
28
  end
22
29
 
metadata CHANGED
@@ -1,91 +1,91 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: bmg
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.17.4
4
+ version: 0.18.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Bernard Lambeau
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2020-07-23 00:00:00.000000000 Z
11
+ date: 2021-03-12 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: predicate
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - "~>"
18
- - !ruby/object:Gem::Version
19
- version: '2.4'
20
17
  - - ">="
21
18
  - !ruby/object:Gem::Version
22
- version: 2.4.0
19
+ version: 2.5.0
20
+ - - "~>"
21
+ - !ruby/object:Gem::Version
22
+ version: '2.5'
23
23
  type: :runtime
24
24
  prerelease: false
25
25
  version_requirements: !ruby/object:Gem::Requirement
26
26
  requirements:
27
- - - "~>"
28
- - !ruby/object:Gem::Version
29
- version: '2.4'
30
27
  - - ">="
31
28
  - !ruby/object:Gem::Version
32
- version: 2.4.0
29
+ version: 2.5.0
30
+ - - "~>"
31
+ - !ruby/object:Gem::Version
32
+ version: '2.5'
33
33
  - !ruby/object:Gem::Dependency
34
- name: rake
34
+ name: path
35
35
  requirement: !ruby/object:Gem::Requirement
36
36
  requirements:
37
- - - "~>"
37
+ - - ">="
38
38
  - !ruby/object:Gem::Version
39
- version: '10'
40
- type: :development
39
+ version: '2.0'
40
+ type: :runtime
41
41
  prerelease: false
42
42
  version_requirements: !ruby/object:Gem::Requirement
43
43
  requirements:
44
- - - "~>"
44
+ - - ">="
45
45
  - !ruby/object:Gem::Version
46
- version: '10'
46
+ version: '2.0'
47
47
  - !ruby/object:Gem::Dependency
48
- name: rspec
48
+ name: rake
49
49
  requirement: !ruby/object:Gem::Requirement
50
50
  requirements:
51
51
  - - "~>"
52
52
  - !ruby/object:Gem::Version
53
- version: '3.6'
53
+ version: '13'
54
54
  type: :development
55
55
  prerelease: false
56
56
  version_requirements: !ruby/object:Gem::Requirement
57
57
  requirements:
58
58
  - - "~>"
59
59
  - !ruby/object:Gem::Version
60
- version: '3.6'
60
+ version: '13'
61
61
  - !ruby/object:Gem::Dependency
62
- name: path
62
+ name: rspec
63
63
  requirement: !ruby/object:Gem::Requirement
64
64
  requirements:
65
- - - ">="
65
+ - - "~>"
66
66
  - !ruby/object:Gem::Version
67
- version: '1.3'
67
+ version: '3.6'
68
68
  type: :development
69
69
  prerelease: false
70
70
  version_requirements: !ruby/object:Gem::Requirement
71
71
  requirements:
72
- - - ">="
72
+ - - "~>"
73
73
  - !ruby/object:Gem::Version
74
- version: '1.3'
74
+ version: '3.6'
75
75
  - !ruby/object:Gem::Dependency
76
76
  name: roo
77
77
  requirement: !ruby/object:Gem::Requirement
78
78
  requirements:
79
79
  - - ">="
80
80
  - !ruby/object:Gem::Version
81
- version: '2.7'
81
+ version: '2.8'
82
82
  type: :development
83
83
  prerelease: false
84
84
  version_requirements: !ruby/object:Gem::Requirement
85
85
  requirements:
86
86
  - - ">="
87
87
  - !ruby/object:Gem::Version
88
- version: '2.7'
88
+ version: '2.8'
89
89
  - !ruby/object:Gem::Dependency
90
90
  name: sequel
91
91
  requirement: !ruby/object:Gem::Requirement
@@ -149,14 +149,17 @@ files:
149
149
  - lib/bmg/operator/shared/nary.rb
150
150
  - lib/bmg/operator/shared/unary.rb
151
151
  - lib/bmg/operator/summarize.rb
152
+ - lib/bmg/operator/transform.rb
152
153
  - lib/bmg/operator/union.rb
153
154
  - lib/bmg/reader.rb
154
155
  - lib/bmg/reader/csv.rb
155
156
  - lib/bmg/reader/excel.rb
157
+ - lib/bmg/reader/text_file.rb
156
158
  - lib/bmg/relation.rb
157
159
  - lib/bmg/relation/empty.rb
158
160
  - lib/bmg/relation/in_memory.rb
159
161
  - lib/bmg/relation/materialized.rb
162
+ - lib/bmg/relation/proxy.rb
160
163
  - lib/bmg/relation/spied.rb
161
164
  - lib/bmg/sequel.rb
162
165
  - lib/bmg/sequel/ext.rb
@@ -258,9 +261,13 @@ files:
258
261
  - lib/bmg/summarizer/variance.rb
259
262
  - lib/bmg/support.rb
260
263
  - lib/bmg/support/keys.rb
264
+ - lib/bmg/support/output_preferences.rb
261
265
  - lib/bmg/support/tuple_algebra.rb
266
+ - lib/bmg/support/tuple_transformer.rb
262
267
  - lib/bmg/type.rb
263
268
  - lib/bmg/version.rb
269
+ - lib/bmg/writer.rb
270
+ - lib/bmg/writer/csv.rb
264
271
  - tasks/gem.rake
265
272
  - tasks/test.rake
266
273
  homepage: http://github.com/enspirit/bmg
@@ -282,7 +289,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
282
289
  - !ruby/object:Gem::Version
283
290
  version: '0'
284
291
  requirements: []
285
- rubygems_version: 3.1.2
292
+ rubygems_version: 3.0.8
286
293
  signing_key:
287
294
  specification_version: 4
288
295
  summary: Bmg is Alf's successor.