namo 0.5.0 → 0.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f1cf8cf203da06b319112fec69ad8fae3dddc73ef850d3f84de4d13df2c51249
- data.tar.gz: 1e80df8ee9c7be401c3f9cbf9625eceef0a87b6136a9d8b3825379b32eecff3f
+ metadata.gz: c0aa48a727f9cc67fae8eea94dcfa8227c66a34aa8918ac65e30b33b7685c82b
+ data.tar.gz: c8f634b5c41f35337fb843d286edf80cbdabef951b076dd36533f5a2531c3715
  SHA512:
- metadata.gz: 712227a916253708637a1ca8c1f177e1ab36daba694df13edc2918943779751ba89e71f8d0f578b44d285abf522a869d5f16c3b0f8377899fa7818bbd6009d41
- data.tar.gz: 7c50400df852f31b3a6a12011486cd72f32c946b94c6f7e71a9d83358a50bda4eec9b8854442181112101d82b4b154e61c6159746c697d0adec192f33d99f5a3
+ metadata.gz: 9b196446a0d02dc61790a25ea416bfebdd591020cfd6ef89430f5e8b9d8bd1239cc265bca83f0cd6dde4d42ce3c3927074ac69e1d4e79ae88db667b95caeadf3
+ data.tar.gz: 3680f44daa8e1e78921d1d286c7f436947b6e436c9308d62038c17a1f9658ae4ccc23196afb2eb44e6b3efb41398ab1b39d8cf3bf813f87c24df12c7cb975586
data/CHANGELOG CHANGED
@@ -1,6 +1,37 @@
1
1
  CHANGELOG
2
2
  _________
3
3
 
4
+ 20260520
5
+ 0.7.0: + derived-dimension surfacing, lazy single-column access, live views
6
+
7
+ 1. + Namo#data_dimensions: Returns the storage dimensions as a plain Array (keys of the first row).
8
+ 2. + Namo#derived_dimensions: Returns the formula names as a plain Array.
9
+ 3. + Namo#values(*dims): Per-dimension full sequences (duplicates preserved, in row order). With no args, returns a Hash {dim => sequence} across the queryable namespace. With one arg, lazily computes and returns just that column as an Array. With multiple args, returns a subset Hash containing each requested dimension. Unknown dimensions propagate as nil per row, matching Row#[] and Namo#[] selection conventions — values(:unknown) returns an Array of nils; values(:known, :unknown) returns {known: [...], unknown: [nil, ...]}.
10
+ 4. + Namo#coordinates(*dims): Per-dimension unique-value sets. Same argument shape as #values; coordinates(dim) == values(dim).uniq. Unknown dimensions therefore appear as [nil].
11
+ 5. + Namo#to_h: Alias for the full values Hash.
12
+ 6. ~ Namo#dimensions: Now covers the queryable namespace (storage + derived) instead of storage-only. Return type stays a plain Array. Memoisation removed: every call recomputes from current state (live view).
13
+ 7. ~ Namo#coordinates: Memoisation removed (was @coordinates ||= ...). Now covers the queryable namespace; coordinates(:derived_dim) and coordinates[:derived_dim] work, evaluating the formula across all rows. New positional-args API supports lazy single-column access.
14
+ 8. ~ Namo#canonical_data: Sorts by data_dimensions to preserve 0.6.0 row-equality semantics under the broader dimensions definition.
15
+ 9. /raise_unless_matching_dimensions/raise_unless_matching_data_dimensions/: Private helper renamed to reflect what it actually compares.
16
+ 10. ~ test/namo_test.rb: + Tests for #data_dimensions, #derived_dimensions, the no-arg/single-arg/multi-arg forms of #values and #coordinates, derived-dimension surfacing in #dimensions, #to_h, the coordinates(dim) == values(dim).uniq consistency property, and live-view semantics (added rows / formulae reflected on next call).
17
+ 11. ~ Rakefile: + -V mainfont=Charter -V monofont=Menlo on pandoc invocation in docs:md2pdf, for a cleaner serif body font and so code spans containing Unicode math glyphs (e.g. ∅) render correctly under xelatex.
18
+ 12. ~ Namo::VERSION: /0.6.0/0.7.0/
19
+
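Items 6 and 7 of the 0.7.0 entry replace memoised readers with live views. The sketch below illustrates the difference using two hypothetical stand-in classes (`MemoisedTable` and `LiveTable` are not part of Namo, and the memoised reader here is only loosely modelled on the removed `@dimensions ||=` line): a cached reader keeps returning its first answer, while a recomputing reader reflects formulae added later.

```ruby
# Hypothetical stand-ins, not Namo itself: they contrast a memoised
# reader with one that recomputes from current state on every call.
class MemoisedTable
  attr_reader :formulae

  def initialize(rows, formulae = {})
    @rows = rows
    @formulae = formulae
  end

  # Cached after the first call, so later formula changes are not seen.
  def dimensions
    @dimensions ||= @rows.first.keys + @formulae.keys
  end
end

class LiveTable < MemoisedTable
  # Recomputed on every call, so it always reflects current state.
  def dimensions
    @rows.first.keys + @formulae.keys
  end
end

rows = [{x: 1}, {x: 2}]
memoised = MemoisedTable.new(rows)
live = LiveTable.new(rows)

memoised.dimensions # first call fills the cache
live.dimensions

memoised.formulae[:doubled] = proc { |row| row[:x] * 2 }
live.formulae[:doubled] = proc { |row| row[:x] * 2 }

memoised.dimensions
# => [:x]
live.dimensions
# => [:x, :doubled]
```

The trade-off is the usual one: the live reader does a little more work per call in exchange for never going stale.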
20
+ 20260511
21
+ 0.6.0: + equality, pattern-match, and subset/superset operators
22
+
23
+ 1. + Namo#==: Multiset equality on row data, ignoring class and formulae.
24
+ 2. + Namo#===: Analytical identity match — true iff other has the same dimensions and same formula names as self, ignoring rows and proc bodies. Returns false for non-Namo operands.
25
+ 3. + Namo#eql?: Strict equality requiring class match, multiset-equal data, and formula name match.
26
+ 4. + Namo#hash: Content-based hash, consistent with eql?.
27
+ 5. + Namo#<, #<=, #>, #>=: Multiset subset/superset relations on rows. Raise ArgumentError on mismatched dimensions, TypeError on non-Namo operand.
28
+ 6. ~ Namo#+, #-, #&, #|, #^: Error message on dimension mismatch updated to "dimensions don't match: X vs Y". Non-Namo operand now raises TypeError ("can't compare Namo with X") instead of NoMethodError.
29
+ 7. ~ lib/namo.rb, namo.gemspec: Minor cleanup (./-prefixed requires; gemspec whitespace).
30
+ 8. ~ test/namo_test.rb: Add tests for ==, ===, eql?, hash, <, <=, >, >=, equal?, and the new error message.
31
+ 9. ~ README.md: + Equality section, + Subset and superset section, + design-philosophy paragraph in the opening and one-line principle callouts in the dimensions, formulae, set-operator, and equality sections.
32
+ 10. + script/md4print, ~ Rakefile: + rake docs:print, docs:pdf, docs:all for regenerating docs/*.print.md and docs/*.print.pdf.
33
+ 11. ~ Namo::VERSION: /0.5.0/0.6.0/
34
+
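Item 6 of the 0.6.0 entry describes a guard that turns an accidental `NoMethodError` into a deliberate `TypeError`. A minimal sketch of that pattern follows, using a hypothetical `Bag` class rather than Namo; only the message format is taken from the changelog.

```ruby
# Hypothetical class used only to illustrate the operand guard.
class Bag
  attr_reader :items

  def initialize(items)
    @items = items
  end

  def +(other)
    # Without this check, `other.items` would fail with NoMethodError
    # for operands that are not Bags.
    unless other.is_a?(Bag)
      raise TypeError, "can't compare Bag with #{other.class}"
    end
    Bag.new(items + other.items)
  end
end

message =
  begin
    Bag.new([1]) + [2]
  rescue TypeError => e
    e.message
  end

message
# => "can't compare Bag with Array"
```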
4
35
  20260416
5
36
  0.5.0: + row-axis set operations: intersection (&), union (|), symmetric difference (^)
6
37
 
data/README.md CHANGED
@@ -4,6 +4,8 @@ Named dimensional data for Ruby.
4
4
 
5
5
  Namo is a Ruby library for working with multi-dimensional data using named dimensions. It infers dimensions and coordinates from plain arrays of hashes — the same shape you get from databases, CSV files, JSON, and YAML — so there's no reshaping step.
6
6
 
7
+ The design rests on a few stances: every hash key is a dimension and none is privileged as a coordinate or value; formulae attach to a Namo alongside data and re-evaluate on each access, appearing as derived dimensions alongside the data dimensions; operators that combine Namos all take Namos and return Namos, so analytical pipelines close; and the formula mechanism is type-agnostic — strings, dates, booleans, and arbitrary Ruby objects work as readily as numbers.
8
+
7
9
  ## Installation
8
10
 
9
11
  ```
@@ -44,6 +46,8 @@ sales.coordinates[:quarter]
44
46
  # => ['Q1', 'Q2']
45
47
  ```
46
48
 
49
+ Every key is a dimension; every value is a coordinate. There's no schema declaration and no choosing which column is "the index" — `price` and `quantity` are no less first-class than `product` and `quarter`.
50
+
47
51
  ### Selection
48
52
 
49
53
  Select by named dimension using keyword arguments:
@@ -155,6 +159,8 @@ Selection, projection, and contraction always return a new Namo instance, so eve
155
159
 
156
160
  ### Concatenation
157
161
 
162
+ `+` is the first of Namo's binary operators: it takes a Namo on each side and returns a Namo. The same shape holds for `-`, `&`, `|`, `^`, `==`, `===`, `<`, `<=`, `>`, `>=` and (later) the composition operators — Namo in, Namo (or boolean) out — so analytical pipelines stay queryable end-to-end.
163
+
158
164
  `+` combines two Namo objects that share the same dimensions by appending the rows of the second to the first:
159
165
 
160
166
  ```ruby
@@ -280,6 +286,87 @@ set_a ^ set_b
280
286
 
281
287
  The dimensions must match; different dimensions raise an `ArgumentError`. Formulae merge from both sides; the left-hand side's formulae take precedence on conflict.
282
288
 
289
+ ### Equality
290
+
291
+ Comparison on Namos is **multiset-theoretic on rows**: row order is ignored (it's an accident of ingestion, not data), but row multiplicities count (they *are* data). The same stance carries across the equality, pattern-match, and subset/superset operators below.
292
+
293
+ `==` is multiset equality on rows. Class and formulae are ignored; row order is ignored; row multiplicities are not.
294
+
295
+ ```ruby
296
+ a = Namo.new([{x: 1}, {x: 2}])
297
+ b = Namo.new([{x: 2}, {x: 1}])
298
+
299
+ a == b
300
+ # => true
301
+
302
+ a == Namo.new([{x: 1}, {x: 1}, {x: 2}])
303
+ # => false
304
+ ```
305
+
306
+ `eql?` is stricter: it also requires the class to match and the formula names to match. Like `===`, it ignores proc bodies — proc identity isn't a meaningful equivalence in Ruby (`proc{...} == proc{...}` is false), so neither `===` nor `eql?` uses it.
307
+
308
+ `hash` is consistent with `eql?` and is content-based, so equal Namos hash equally and can be used as Hash keys:
309
+
310
+ ```ruby
311
+ h = {a => 'first'}
312
+ h[b]
313
+ # => 'first'
314
+ ```
315
+
316
+ `equal?` is unchanged from Ruby's default — it tests object identity.
317
+
318
+ `===` answers a different question: does the candidate have the same dimensions and the same formula names? Row data is ignored, and so are the proc bodies themselves — only the names matter. This is the `===` semantics that case statements use, so Namos can serve as templates for analytical shape:
319
+
320
+ ```ruby
321
+ sales_shape = Namo.new([{product: 'X', quarter: 'Q1', price: 0.0, quantity: 0}])
322
+ sales_shape[:revenue] = proc{|row| row[:price] * row[:quantity]}
323
+
324
+ q1 = Namo.new([{product: 'Widget', quarter: 'Q1', price: 10.0, quantity: 100}])
325
+ q1[:revenue] = proc{|row| row[:price] * row[:quantity]}
326
+
327
+ sales_shape === q1
328
+ # => true (same dimensions, same formula name)
329
+
330
+ sales_shape == q1
331
+ # => false (different rows)
332
+ ```
333
+
334
+ The two `:revenue` procs are independently-written and not the same object — `proc{...} == proc{...}` is false in Ruby. But `===` doesn't compare proc identity; it asks "do these Namos have the same analytical shape?" and the shape is the set of dimensions plus the set of formula names.
335
+
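Ruby's `case` calls `===` on each `when` operand, which is what lets an object act as a shape template. The following is a small sketch with a hypothetical `AnalyticalShape` class (not Namo's implementation) that compares sorted dimension names only.

```ruby
# Hypothetical class: matches on the set of dimension names, nothing else.
class AnalyticalShape
  attr_reader :dimensions

  def initialize(*dimensions)
    @dimensions = dimensions
  end

  # `case candidate when template` evaluates `template === candidate`.
  def ===(other)
    other.is_a?(AnalyticalShape) && dimensions.sort == other.dimensions.sort
  end
end

template = AnalyticalShape.new(:product, :price)
candidate = AnalyticalShape.new(:price, :product)

result =
  case candidate
  when template then :matched
  else :not_matched
  end

result
# => :matched
```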
336
+ Each comparison operator answers a distinct question: `eql?` is strictest (class + data + formula names); `==` is data identity; `===` is analytical identity; the subset operators are data containment.
337
+
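For an object to work as a Hash key, `eql?` and `hash` have to agree. The sketch below shows one way to get order-insensitive, multiplicity-sensitive equality with a matching hash. `RowBag` is a hypothetical class, and it uses `Enumerable#tally` rather than the sorted canonical form that appears in `lib/namo.rb` in this diff; it also ignores formulae entirely.

```ruby
# Hypothetical class: multiset equality on rows plus a consistent hash.
class RowBag
  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  # Same class and same row counts; row order is irrelevant.
  def ==(other)
    other.class == self.class && rows.tally == other.rows.tally
  end
  alias_method :eql?, :==

  # Built from the same ingredients as eql?, so equal bags hash equally.
  def hash
    [self.class, rows.tally].hash
  end
end

a = RowBag.new([{x: 1}, {x: 2}])
b = RowBag.new([{x: 2}, {x: 1}])

a == b
# => true

{a => 'first'}[b]
# => 'first'

a == RowBag.new([{x: 1}, {x: 1}, {x: 2}])
# => false
```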
338
+ ### Subset and Superset
339
+
340
+ `<`, `<=`, `>`, `>=` are multiset subset and superset relations on rows.
341
+
342
+ ```ruby
343
+ small = Namo.new([{x: 1}, {x: 2}])
344
+ large = Namo.new([{x: 1}, {x: 2}, {x: 3}])
345
+
346
+ small <= large
347
+ # => true
348
+
349
+ small < large
350
+ # => true
351
+
352
+ large > small
353
+ # => true
354
+ ```
355
+
356
+ Equal sets are `<=` and `>=` each other, but neither `<` nor `>`. Disjoint sets are none of the above — unless one side is empty, in which case it is a subset of (and disjoint with) the other.
357
+
358
+ Multiplicity matters: a single `{x: 1}` is a proper subset of two `{x: 1}`s.
359
+
360
+ ```ruby
361
+ one = Namo.new([{x: 1}])
362
+ two = Namo.new([{x: 1}, {x: 1}])
363
+
364
+ one < two
365
+ # => true
366
+ ```
367
+
368
+ The dimensions must match; different dimensions raise an `ArgumentError`. Comparing against a non-Namo raises a `TypeError`.
369
+
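The multiset subset rule can be sketched over plain arrays of hashes: every row on the left must occur at least as many times on the right. The helper names below are hypothetical; the counting approach mirrors the `tally`-based check visible in `lib/namo.rb` in this diff.

```ruby
# Hypothetical helpers over plain arrays of row hashes.
def multiset_subset?(small, large)
  large_counts = large.tally
  small.tally.all? { |row, count| large_counts.fetch(row, 0) >= count }
end

# Proper subset: contained, but not equal as multisets.
def proper_multiset_subset?(small, large)
  multiset_subset?(small, large) && small.tally != large.tally
end

one = [{x: 1}]
two = [{x: 1}, {x: 1}]

multiset_subset?(one, two)
# => true
proper_multiset_subset?(one, two)
# => true
proper_multiset_subset?(two, two)
# => false
multiset_subset?(two, one)
# => false
multiset_subset?([], one)
# => true
```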
283
370
  ### Formulae
284
371
 
285
372
  Define computed dimensions using `[]=`:
@@ -296,6 +383,8 @@ sales[:product, :quarter, :revenue]
296
383
  # ]>
297
384
  ```
298
385
 
386
+ Formulae aren't materialised into row data — they re-evaluate on every access. A `:revenue` value reflects the current `:price` and `:quantity` at the moment you ask for it, so derived values stay in sync with whatever the underlying data is doing.
387
+
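Re-evaluation on access can be illustrated with a small row wrapper that stores formulae as procs and calls them on each lookup. `LazyRow` is a hypothetical class, and the lookup order (formula first, then stored data) is an assumption made for this sketch, not something the diff confirms about `Namo::Row`.

```ruby
# Hypothetical wrapper: formulae are called on lookup, never stored.
class LazyRow
  def initialize(data, formulae)
    @data = data
    @formulae = formulae
  end

  def [](name)
    formula = @formulae[name]
    formula ? formula.call(self) : @data[name]
  end
end

data = {price: 10.0, quantity: 100}
formulae = {revenue: proc { |row| row[:price] * row[:quantity] }}
row = LazyRow.new(data, formulae)

row[:revenue]
# => 1000.0

data[:quantity] = 150
row[:revenue]
# => 1500.0
```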
299
388
  Formulae compose:
300
389
 
301
390
  ```ruby
@@ -323,6 +412,78 @@ sales[product: 'Widget'][:revenue, :quarter]
323
412
 
324
413
  Formulae carry through selection — a filtered Namo instance remembers its formulae.
325
414
 
415
+ ### Coordinates and values
416
+
417
+ `dimensions` covers the *queryable namespace* — every name you can ask for, whether it lives in the row data or is computed by a formula. Once formulae are defined, they appear alongside data dimensions:
418
+
419
+ ```ruby
420
+ sales[:revenue] = proc{|row| row[:price] * row[:quantity]}
421
+
422
+ sales.dimensions
423
+ # => [:product, :quarter, :price, :quantity, :revenue]
424
+
425
+ sales.data_dimensions
426
+ # => [:product, :quarter, :price, :quantity]
427
+
428
+ sales.derived_dimensions
429
+ # => [:revenue]
430
+ ```
431
+
432
+ `coordinates` gives the unique values per dimension, including derived ones:
433
+
434
+ ```ruby
435
+ sales.coordinates[:product]
436
+ # => ['Widget', 'Gadget']
437
+
438
+ sales.coordinates[:revenue]
439
+ # => [1000.0, 1500.0]
440
+ ```
441
+
442
+ `values` gives the full per-row sequence — duplicates preserved, row order preserved:
443
+
444
+ ```ruby
445
+ sales.values[:product]
446
+ # => ['Widget', 'Widget', 'Gadget', 'Gadget']
447
+
448
+ sales.values[:revenue]
449
+ # => [1000.0, 1500.0, 1000.0, 1500.0]
450
+ ```
451
+
452
+ Both `coordinates` and `values` accept positional arguments. With no args they return a Hash across the queryable namespace; with one arg they lazily compute and return just that column as an Array; with multiple args they return a subset Hash containing just the requested columns:
453
+
454
+ ```ruby
455
+ sales.values(:product)
456
+ # => ['Widget', 'Widget', 'Gadget', 'Gadget']
457
+
458
+ sales.values(:product, :quarter)
459
+ # => {
460
+ # product: ['Widget', 'Widget', 'Gadget', 'Gadget'],
461
+ # quarter: ['Q1', 'Q2', 'Q1', 'Q2']
462
+ # }
463
+
464
+ sales.coordinates(:revenue)
465
+ # => [1000.0, 1500.0]
466
+ ```
467
+
468
+ Single-arg access is lazy: `sales.values(:revenue)` evaluates only the `:revenue` formula, once per row, without materialising the other columns. The bracket form (`sales.values[:revenue]`) still works through ordinary Hash lookup but pays for the full materialisation up front.
469
+
470
+ `coordinates` is `values` with `.uniq` applied per column — `coordinates(dim) == values(dim).uniq` holds for every dimension.
471
+
472
+ `to_h` is the Ruby-conventional alias for the full `values` Hash:
473
+
474
+ ```ruby
475
+ sales.to_h
476
+ # => {
477
+ # product: ['Widget', 'Widget', 'Gadget', 'Gadget'],
478
+ # quarter: ['Q1', 'Q2', 'Q1', 'Q2'],
479
+ # price: [10.0, 10.0, 25.0, 25.0],
480
+ # quantity: [100, 150, 40, 60],
481
+ # revenue: [1000.0, 1500.0, 1000.0, 1500.0]
482
+ # }
483
+ ```
484
+
485
+ Unknown dimensions propagate `nil` per row — `values(:missing)` returns `[nil, nil, ...]` rather than raising or returning a sentinel, matching the convention used by `Row#[]` and `[]` selection. Use `dimensions.include?(:dim)` if you need to check membership directly.
486
+
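The argument shapes described above can be sketched over plain arrays of hashes. The helper names `column_values` and `column_coordinates` are hypothetical, and the sketch covers stored columns only (no formulae); it is meant to show the no-arg, single-arg, and multi-arg return types and the `nil` propagation for unknown names.

```ruby
# Hypothetical helpers; rows is an Array of Hashes with identical keys.
def column_values(rows, *dims)
  column = ->(dim) { rows.map { |row| row[dim] } }
  # One name: return just that column as an Array.
  return column.call(dims.first) if dims.length == 1

  # No names: every stored column; several names: only those requested.
  dims = rows.first.keys if dims.empty?
  dims.to_h { |dim| [dim, column.call(dim)] }
end

def column_coordinates(rows, *dims)
  result = column_values(rows, *dims)
  result.is_a?(Hash) ? result.transform_values(&:uniq) : result.uniq
end

sales_rows = [
  {product: 'Widget', quarter: 'Q1'},
  {product: 'Widget', quarter: 'Q2'},
  {product: 'Gadget', quarter: 'Q1'}
]

column_values(sales_rows, :product)
# => ['Widget', 'Widget', 'Gadget']

column_coordinates(sales_rows, :product)
# => ['Widget', 'Gadget']

column_values(sales_rows, :missing)
# => [nil, nil, nil]

column_coordinates(sales_rows)
# => {product: ['Widget', 'Gadget'], quarter: ['Q1', 'Q2']}
```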
326
487
  ### Enumerable
327
488
 
328
489
  Namo includes `Enumerable`, so `each`, `reduce`, `map`, `select`, `min_by`, and all the rest work out of the box. Rows are yielded as `Row` objects, so formulae are accessible during enumeration:
@@ -354,7 +515,7 @@ sales.flat_map{|row| [row[:price]]}
354
515
 
355
516
  ### Extracting data
356
517
 
357
- `to_a` returns an array of hashes:
518
+ `to_a` returns an array of hashes — the row-oriented form:
358
519
 
359
520
  ```ruby
360
521
  sales[:product, :quarter, :revenue].to_a
@@ -366,6 +527,17 @@ sales[:product, :quarter, :revenue].to_a
366
527
  # ]
367
528
  ```
368
529
 
530
+ `to_h` returns a hash of arrays — the columnar form (see [Coordinates and values](#coordinates-and-values) above):
531
+
532
+ ```ruby
533
+ sales[:product, :quarter, :revenue].to_h
534
+ # => {
535
+ # product: ['Widget', 'Widget', 'Gadget', 'Gadget'],
536
+ # quarter: ['Q1', 'Q2', 'Q1', 'Q2'],
537
+ # revenue: [1000.0, 1500.0, 1000.0, 1500.0]
538
+ # }
539
+ ```
540
+
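The row-oriented and columnar forms carry the same information, so one can be rebuilt from the other. A minimal sketch of the round trip on plain Ruby data follows; `rows_to_columns` and `columns_to_rows` are hypothetical helper names and assume every row has the same keys.

```ruby
# Hypothetical helpers converting between the two layouts.
def rows_to_columns(rows)
  return {} if rows.empty?

  rows.first.keys.to_h { |key| [key, rows.map { |row| row[key] }] }
end

def columns_to_rows(columns)
  # transpose turns per-column arrays into per-row arrays.
  columns.values.transpose.map { |vals| columns.keys.zip(vals).to_h }
end

rows = [
  {product: 'Widget', revenue: 1000.0},
  {product: 'Gadget', revenue: 1500.0}
]

columns = rows_to_columns(rows)
# => {product: ['Widget', 'Gadget'], revenue: [1000.0, 1500.0]}

columns_to_rows(columns) == rows
# => true
```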
369
541
  ## Why?
370
542
 
371
543
  Every other multi-dimensional array library requires you to pre-shape your data before you can work with it. Namo takes it in the form it likely already comes in.
data/Rakefile CHANGED
@@ -6,4 +6,37 @@ Rake::TestTask.new(:test) do |t|
    t.test_files = FileList['test/**/*_test.rb']
  end

+ namespace :docs do
+   SOURCE_DOCS = %w{COMPARISON EXAMPLES README ROADMAP}
+
+   desc "Strip syntax highlighting from code blocks for printing"
+   task :md4print do
+     SOURCE_DOCS.each do |name|
+       sh "script/md4print #{name}.md"
+       sh "mv #{name}.print.md docs/"
+     end
+   end
+
+   desc "Render print-ready markdown to PDF"
+   task :md2pdf => :md4print do
+     Dir.glob('docs/*.print.md').each do |f|
+       pdf = f.sub(/\.md$/, '.pdf')
+       sh "pandoc #{f} --pdf-engine=xelatex -V geometry:margin=1in -V mainfont=Charter -V monofont=Menlo -o #{pdf}"
+     end
+   end
+
+   desc "Remove intermediate .print.md files"
+   task :clean do
+     rm_f Dir.glob('docs/*.print.md')
+   end
+
+   desc "Remove all generated docs (intermediates and PDFs)"
+   task :clobber => :clean do
+     rm_f Dir.glob('docs/*.print.pdf')
+   end
+
+   desc "Regenerate all derived docs"
+   task :gen => [:md2pdf, :clean]
+ end
+
  task default: :test
data/lib/Namo/VERSION.rb CHANGED
@@ -2,5 +2,5 @@
  # Namo::VERSION

  class Namo
-   VERSION = '0.5.0'
+   VERSION = '0.7.0'
  end
data/lib/namo.rb CHANGED
@@ -1,9 +1,9 @@
  # namo.rb
  # Namo

- require_relative 'Namo/NegatedDimension'
- require_relative 'Namo/Row'
- require_relative 'Symbol'
+ require_relative './Namo/NegatedDimension'
+ require_relative './Namo/Row'
+ require_relative './Symbol'

  class Namo
    include Enumerable
@@ -12,15 +12,39 @@ class Namo
    attr_accessor :formulae

    def dimensions
-     @dimensions ||= @data.first.keys
+     @data.first.keys + @formulae.keys
    end

-   def coordinates
-     @coordinates ||= (
-       dimensions.each_with_object({}) do |dimension, hash|
-         hash[dimension] = @data.map{|row| row[dimension]}.uniq
-       end
-     )
+   def data_dimensions
+     @data.first.keys
+   end
+
+   def derived_dimensions
+     @formulae.keys
+   end
+
+   def values(*dims)
+     if dims.empty?
+       dimensions.each_with_object({}){|dim, hash| hash[dim] = values_for(dim)}
+     elsif dims.length == 1
+       values_for(dims.first)
+     else
+       dims.each_with_object({}){|dim, hash| hash[dim] = values_for(dim)}
+     end
+   end
+
+   def coordinates(*dims)
+     if dims.empty?
+       values.transform_values(&:uniq)
+     elsif dims.length == 1
+       values(dims.first).uniq
+     else
+       dims.each_with_object({}){|dim, hash| hash[dim] = values(dim).uniq}
+     end
+   end
+
+   def to_h
+     values
    end

    def [](*names, **selections)
@@ -32,7 +56,7 @@ class Namo
32
56
  projected = (
33
57
  if negated.any?
34
58
  excluded = negated.map(&:name)
35
- kept = dimensions - excluded
59
+ kept = data_dimensions - excluded
36
60
  rows.map do |row|
37
61
  kept.each_with_object({}){|name, hash| hash[name] = row[name]}
38
62
  end
@@ -57,40 +81,80 @@ class Namo
57
81
  end
58
82
 
59
83
  def +(other)
60
- unless dimensions == other.dimensions
61
- raise ArgumentError, "dimensions do not match"
62
- end
84
+ raise_unless_namo(other)
85
+ raise_unless_matching_data_dimensions(other)
63
86
  self.class.new(@data + other.data, formulae: other.formulae.merge(@formulae))
64
87
  end
65
88
 
66
89
  def -(other)
67
- unless dimensions == other.dimensions
68
- raise ArgumentError, "dimensions do not match"
69
- end
90
+ raise_unless_namo(other)
91
+ raise_unless_matching_data_dimensions(other)
70
92
  self.class.new(@data - other.data, formulae: @formulae.dup)
71
93
  end
72
94
 
73
95
  def &(other)
74
- unless dimensions == other.dimensions
75
- raise ArgumentError, "dimensions do not match"
76
- end
96
+ raise_unless_namo(other)
97
+ raise_unless_matching_data_dimensions(other)
77
98
  self.class.new(@data & other.data, formulae: @formulae.dup)
78
99
  end
79
100
 
80
101
  def |(other)
81
- unless dimensions == other.dimensions
82
- raise ArgumentError, "dimensions do not match"
83
- end
102
+ raise_unless_namo(other)
103
+ raise_unless_matching_data_dimensions(other)
84
104
  self.class.new((@data | other.data), formulae: other.formulae.merge(@formulae))
85
105
  end
86
106
 
87
107
  def ^(other)
88
- unless dimensions == other.dimensions
89
- raise ArgumentError, "dimensions do not match"
90
- end
108
+ raise_unless_namo(other)
109
+ raise_unless_matching_data_dimensions(other)
91
110
  self.class.new((@data - other.data) + (other.data - @data), formulae: other.formulae.merge(@formulae))
92
111
  end
93
112
 
113
+ def ==(other)
114
+ return false unless other.is_a?(Namo)
115
+ canonical_data == other.canonical_data
116
+ end
117
+
118
+ def ===(other)
119
+ return false unless other.is_a?(Namo)
120
+ dimensions.sort == other.dimensions.sort &&
121
+ @formulae.keys.sort == other.formulae.keys.sort
122
+ end
123
+
124
+ def eql?(other)
125
+ self.class == other.class &&
126
+ canonical_data == other.canonical_data &&
127
+ @formulae.keys.sort == other.formulae.keys.sort
128
+ end
129
+
130
+ def hash
131
+ [self.class, canonical_data, @formulae.keys.sort].hash
132
+ end
133
+
134
+ def <(other)
135
+ raise_unless_namo(other)
136
+ raise_unless_matching_data_dimensions(other)
137
+ proper_subset_of_rows?(other)
138
+ end
139
+
140
+ def <=(other)
141
+ raise_unless_namo(other)
142
+ raise_unless_matching_data_dimensions(other)
143
+ subset_of_rows?(other)
144
+ end
145
+
146
+ def >(other)
147
+ raise_unless_namo(other)
148
+ raise_unless_matching_data_dimensions(other)
149
+ other.proper_subset_of_rows?(self)
150
+ end
151
+
152
+ def >=(other)
153
+ raise_unless_namo(other)
154
+ raise_unless_matching_data_dimensions(other)
155
+ other.subset_of_rows?(self)
156
+ end
157
+
94
158
  def to_a
95
159
  @data.map do |row|
96
160
  row.keys.each_with_object({}) do |key, hash|
@@ -99,8 +163,44 @@ class Namo
99
163
  end
100
164
  end
101
165
 
166
+ protected
167
+
168
+ def canonical_data
169
+ @data.sort_by{|row| row.values_at(*data_dimensions.sort)}
170
+ end
171
+
172
+ def subset_of_rows?(other)
173
+ self_counts = canonical_data.tally
174
+ other_counts = other.canonical_data.tally
175
+ self_counts.all?{|row, count| (other_counts[row] || 0) >= count}
176
+ end
177
+
178
+ def proper_subset_of_rows?(other)
179
+ subset_of_rows?(other) && self != other
180
+ end
181
+
102
182
  private
103
183
 
184
+ def values_for(dim)
185
+ if data_dimensions.include?(dim)
186
+ @data.map{|row_data| row_data[dim]}
187
+ else
188
+ @data.map{|row_data| Row.new(row_data, @formulae)[dim]}
189
+ end
190
+ end
191
+
192
+ def raise_unless_namo(other)
193
+ unless other.is_a?(Namo)
194
+ raise TypeError, "can't compare Namo with #{other.class}"
195
+ end
196
+ end
197
+
198
+ def raise_unless_matching_data_dimensions(other)
199
+ unless data_dimensions == other.data_dimensions
200
+ raise ArgumentError, "dimensions don't match: #{data_dimensions} vs #{other.data_dimensions}"
201
+ end
202
+ end
203
+
104
204
  def initialize(data = nil, formulae: {})
105
205
  @data = data
106
206
  @formulae = formulae
data/namo.gemspec CHANGED
@@ -19,7 +19,6 @@ Gem::Specification.new do |spec|
19
19
  spec.license = 'MIT'
20
20
 
21
21
  spec.required_ruby_version = '>= 2.7'
22
-
23
22
  spec.require_paths = ['lib']
24
23
 
25
24
  spec.files = [
data/test/namo_test.rb CHANGED
@@ -21,19 +21,159 @@ describe Namo do
21
21
  it "infers dimensions from hash keys" do
22
22
  _(sales.dimensions).must_equal [:product, :quarter, :price, :quantity]
23
23
  end
24
+
25
+ it "includes derived dimensions after storage dimensions" do
26
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
27
+ sales[:label] = proc{|r| "#{r[:product]}-#{r[:quarter]}"}
28
+ _(sales.dimensions).must_equal [:product, :quarter, :price, :quantity, :revenue, :label]
29
+ end
30
+
31
+ it "reflects mutation on the next call" do
32
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
33
+ _(sales.dimensions).must_include :revenue
34
+ sales.formulae.delete(:revenue)
35
+ _(sales.dimensions).wont_include :revenue
36
+ end
37
+ end
38
+
39
+ describe "#data_dimensions" do
40
+ it "returns only the storage keys" do
41
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
42
+ _(sales.data_dimensions).must_equal [:product, :quarter, :price, :quantity]
43
+ end
44
+ end
45
+
46
+ describe "#derived_dimensions" do
47
+ it "returns only the formula keys" do
48
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
49
+ _(sales.derived_dimensions).must_equal [:revenue]
50
+ end
51
+
52
+ it "is empty when no formulae are defined" do
53
+ _(sales.derived_dimensions).must_equal []
54
+ end
24
55
  end
25
56
 
26
57
  describe "#coordinates" do
27
- it "extracts unique values for each dimension" do
58
+ it "with no args returns a Hash of unique values for each dimension" do
28
59
  _(sales.coordinates).must_equal ({
29
60
  product: ['Widget', 'Gadget'],
30
61
  quarter: ['Q1', 'Q2'],
31
62
  price: [10.0, 25.0],
32
63
  quantity: [100, 150, 40, 60]
33
64
  })
65
+ end
66
+
67
+ it "0.6.0-style indexing still works" do
34
68
  _(sales.coordinates[:product]).must_equal ['Widget', 'Gadget']
35
69
  _(sales.coordinates[:quarter]).must_equal ['Q1', 'Q2']
36
70
  end
71
+
72
+ it "with one arg returns just that column's unique values as an Array" do
73
+ _(sales.coordinates(:product)).must_equal ['Widget', 'Gadget']
74
+ end
75
+
76
+ it "with one arg returns [nil] for an unknown dimension (nil values uniqued)" do
77
+ _(sales.coordinates(:missing)).must_equal [nil]
78
+ end
79
+
80
+ it "with one arg evaluates a derived dimension" do
81
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
82
+ _(sales.coordinates(:revenue)).must_equal [1000.0, 1500.0]
83
+ end
84
+
85
+ it "with multiple args returns a subset Hash" do
86
+ _(sales.coordinates(:product, :quarter)).must_equal({
87
+ product: ['Widget', 'Gadget'],
88
+ quarter: ['Q1', 'Q2']
89
+ })
90
+ end
91
+
92
+ it "with multiple args includes unknown dimensions as [nil]" do
93
+ _(sales.coordinates(:product, :missing)).must_equal({
94
+ product: ['Widget', 'Gadget'],
95
+ missing: [nil]
96
+ })
97
+ end
98
+
99
+ it "covers derived dimensions in the no-arg form" do
100
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
101
+ _(sales.coordinates[:revenue]).must_equal [1000.0, 1500.0]
102
+ end
103
+ end
104
+
105
+ describe "#values" do
106
+ it "with no args returns a Hash of full sequences for each dimension" do
107
+ _(sales.values).must_equal({
108
+ product: ['Widget', 'Widget', 'Gadget', 'Gadget'],
109
+ quarter: ['Q1', 'Q2', 'Q1', 'Q2'],
110
+ price: [10.0, 10.0, 25.0, 25.0],
111
+ quantity: [100, 150, 40, 60]
112
+ })
113
+ end
114
+
115
+ it "with one arg returns just that column as an Array, preserving duplicates and order" do
116
+ _(sales.values(:product)).must_equal ['Widget', 'Widget', 'Gadget', 'Gadget']
117
+ _(sales.values(:price)).must_equal [10.0, 10.0, 25.0, 25.0]
118
+ end
119
+
120
+ it "with one arg returns an Array of nils for an unknown dimension (one nil per row)" do
121
+ _(sales.values(:missing)).must_equal [nil, nil, nil, nil]
122
+ end
123
+
124
+ it "with one arg evaluates a derived dimension across all rows" do
125
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
126
+ _(sales.values(:revenue)).must_equal [1000.0, 1500.0, 1000.0, 1500.0]
127
+ end
128
+
129
+ it "with multiple args returns a subset Hash" do
130
+ _(sales.values(:product, :quarter)).must_equal({
131
+ product: ['Widget', 'Widget', 'Gadget', 'Gadget'],
132
+ quarter: ['Q1', 'Q2', 'Q1', 'Q2']
133
+ })
134
+ end
135
+
136
+ it "with multiple args includes unknown dimensions as Arrays of nils" do
137
+ _(sales.values(:product, :missing)).must_equal({
138
+ product: ['Widget', 'Widget', 'Gadget', 'Gadget'],
139
+ missing: [nil, nil, nil, nil]
140
+ })
141
+ end
142
+
143
+ it "covers derived dimensions in the no-arg form" do
144
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
145
+ _(sales.values[:revenue]).must_equal [1000.0, 1500.0, 1000.0, 1500.0]
146
+ end
147
+ end
148
+
149
+ describe "#to_h" do
150
+ it "returns the full values Hash" do
151
+ _(sales.to_h).must_equal sales.values
152
+ end
153
+ end
154
+
155
+ describe "aspect consistency" do
156
+ it "satisfies coordinates(dim) == values(dim).uniq for each dimension" do
157
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
158
+ sales.dimensions.each do |dim|
159
+ _(sales.coordinates(dim)).must_equal sales.values(dim).uniq
160
+ end
161
+ end
162
+ end
163
+
164
+ describe "live-view semantics" do
165
+ it "reflects added rows on next call" do
166
+ _(sales.values(:product)).must_equal ['Widget', 'Widget', 'Gadget', 'Gadget']
167
+ sales.data << {product: 'Thingo', quarter: 'Q3', price: 5.0, quantity: 10}
168
+ _(sales.values(:product)).must_equal ['Widget', 'Widget', 'Gadget', 'Gadget', 'Thingo']
169
+ end
170
+
171
+ it "reflects added formulae on next call" do
172
+ _(sales.derived_dimensions).must_equal []
173
+ sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
174
+ _(sales.derived_dimensions).must_equal [:revenue]
175
+ _(sales.coordinates(:revenue)).must_equal [1000.0, 1500.0]
176
+ end
37
177
  end
38
178
 
39
179
  describe "#[]" do
@@ -466,6 +606,282 @@ describe Namo do
466
606
  end
467
607
  end
468
608
 
609
+ describe "#==" do
610
+ it "is true for same data, same order" do
611
+ a = Namo.new([{x: 1}, {x: 2}])
612
+ b = Namo.new([{x: 1}, {x: 2}])
613
+ _(a == b).must_equal true
614
+ end
615
+
616
+ it "is true for same data, different order" do
617
+ a = Namo.new([{x: 1}, {x: 2}])
618
+ b = Namo.new([{x: 2}, {x: 1}])
619
+ _(a == b).must_equal true
620
+ end
621
+
622
+ it "is false for different data" do
623
+ a = Namo.new([{x: 1}, {x: 2}])
624
+ b = Namo.new([{x: 1}, {x: 3}])
625
+ _(a == b).must_equal false
626
+ end
627
+
628
+ it "is multiset-aware: duplicates count" do
629
+ a = Namo.new([{x: 1}])
630
+ b = Namo.new([{x: 1}, {x: 1}])
631
+ _(a == b).must_equal false
632
+ end
633
+
634
+ it "is true across subclasses with same data" do
635
+ subclass = Class.new(Namo)
636
+ a = Namo.new([{x: 1}, {x: 2}])
637
+ b = subclass.new([{x: 1}, {x: 2}])
638
+ _(a == b).must_equal true
639
+ end
640
+
641
+ it "ignores formulae" do
642
+ a = Namo.new([{x: 1}, {x: 2}])
643
+ b = Namo.new([{x: 1}, {x: 2}])
644
+ b[:y] = proc{|row| row[:x] * 2}
645
+ _(a == b).must_equal true
646
+ end
647
+
648
+ it "is false against a non-Namo" do
649
+ a = Namo.new([{x: 1}, {x: 2}])
650
+ _(a == [{x: 1}, {x: 2}]).must_equal false
651
+ _(a == 'string').must_equal false
652
+ _(a == nil).must_equal false
653
+ end
654
+ end
655
+
656
+ describe "#===" do
657
+ it "is true when dimensions and formulae match, ignoring rows" do
658
+ a = Namo.new([{x: 1}])
659
+ b = Namo.new([{x: 2}, {x: 3}])
660
+ _(a === b).must_equal true
661
+ end
662
+
663
+ it "is false when formulae differ" do
664
+ a = Namo.new([{x: 1}])
665
+ b = Namo.new([{x: 1}])
666
+ b[:doubled] = proc{|row| row[:x] * 2}
667
+ _(a === b).must_equal false
668
+ end
669
+
670
+ it "is true when formulae have the same names, regardless of proc identity" do
671
+ a = Namo.new([{x: 1}])
672
+ a[:doubled] = proc{|row| row[:x] * 2}
673
+ b = Namo.new([{x: 1}])
674
+ b[:doubled] = proc{|row| row[:x] * 2}
675
+ _(a === b).must_equal true
676
+ end
677
+
678
+ it "is false when dimensions differ" do
679
+ a = Namo.new([{x: 1}])
680
+ b = Namo.new([{y: 1}])
681
+ _(a === b).must_equal false
682
+ end
683
+
684
+ it "is true when dimensions are in different order" do
685
+ a = Namo.new([{x: 1, y: 2}])
686
+ b = Namo.new([{y: 9, x: 8}])
687
+ _(a === b).must_equal true
688
+ end
689
+
690
+ it "is false for a non-Namo and does not raise" do
691
+ a = Namo.new([{x: 1}])
692
+ _(a === [{x: 1}]).must_equal false
693
+ _(a === 'string').must_equal false
694
+ _(a === nil).must_equal false
695
+ end
696
+
697
+ it "drives case statement dispatch on analytical type" do
698
+ template = Namo.new([{x: 0}])
699
+ candidate = Namo.new([{x: 5}, {x: 6}])
700
+ result = case candidate
701
+ when template; :matched
702
+ else; :not_matched
703
+ end
704
+ _(result).must_equal :matched
705
+ end
706
+ end
707
+
708
+ describe "#eql?" do
709
+ it "is true for same class, same data, no formulae" do
710
+ a = Namo.new([{x: 1}, {x: 2}])
711
+ b = Namo.new([{x: 1}, {x: 2}])
712
+ _(a.eql?(b)).must_equal true
713
+ end
714
+
715
+ it "is true for same class, same data, different order" do
716
+ a = Namo.new([{x: 1}, {x: 2}])
717
+ b = Namo.new([{x: 2}, {x: 1}])
718
+ _(a.eql?(b)).must_equal true
719
+ end
720
+
721
+ it "is true when formula names match, regardless of proc identity" do
722
+ a = Namo.new([{x: 1}, {x: 2}])
723
+ a[:y] = proc{|row| row[:x] * 2}
724
+ b = Namo.new([{x: 1}, {x: 2}])
725
+ b[:y] = proc{|row| row[:x] * 2}
726
+ _(a.eql?(b)).must_equal true
727
+ end
728
+
729
+ it "is false when formula names differ" do
730
+ a = Namo.new([{x: 1}, {x: 2}])
731
+ a[:doubled] = proc{|row| row[:x] * 2}
732
+ b = Namo.new([{x: 1}, {x: 2}])
733
+ b[:tripled] = proc{|row| row[:x] * 3}
734
+ _(a.eql?(b)).must_equal false
735
+ end
736
+
737
+ it "is false across different classes" do
738
+ subclass = Class.new(Namo)
739
+ a = Namo.new([{x: 1}, {x: 2}])
740
+ b = subclass.new([{x: 1}, {x: 2}])
741
+ _(a.eql?(b)).must_equal false
742
+ end
743
+ end
744
+
745
+ describe "#hash" do
746
+ it "is equal for set-equal Namos" do
747
+ a = Namo.new([{x: 1}, {x: 2}])
748
+ b = Namo.new([{x: 2}, {x: 1}])
749
+ _(a.hash).must_equal b.hash
750
+ end
751
+
752
+ it "differs when formula names differ" do
753
+ a = Namo.new([{x: 1}, {x: 2}])
754
+ b = Namo.new([{x: 1}, {x: 2}])
755
+ b[:y] = proc{|row| row[:x] * 2}
756
+ _(a.hash).wont_equal b.hash
757
+ end
758
+
759
+ it "is equal when formula names match, regardless of proc identity" do
760
+ a = Namo.new([{x: 1}, {x: 2}])
761
+ a[:y] = proc{|row| row[:x] * 2}
762
+ b = Namo.new([{x: 1}, {x: 2}])
763
+ b[:y] = proc{|row| row[:x] * 2}
764
+ _(a.hash).must_equal b.hash
765
+ end
766
+
767
+ it "differs across classes" do
768
+ subclass = Class.new(Namo)
769
+ a = Namo.new([{x: 1}, {x: 2}])
770
+ b = subclass.new([{x: 1}, {x: 2}])
771
+ _(a.hash).wont_equal b.hash
772
+ end
773
+
774
+ it "makes Namos usable as Hash keys" do
775
+ a = Namo.new([{x: 1}, {x: 2}])
776
+ b = Namo.new([{x: 2}, {x: 1}])
777
+ h = {a => 'first'}
778
+ _(h[b]).must_equal 'first'
779
+ end
780
+ end
781
+
782
+ describe "#<, #<=, #>, #>=" do
783
+ let(:small) { Namo.new([{x: 1}, {x: 2}]) }
784
+ let(:large) { Namo.new([{x: 1}, {x: 2}, {x: 3}]) }
785
+ let(:disjoint) { Namo.new([{x: 4}, {x: 5}]) }
786
+
787
+ it "recognises proper subset" do
788
+ _(small < large).must_equal true
789
+ _(small <= large).must_equal true
790
+ _(large > small).must_equal true
791
+ _(large >= small).must_equal true
792
+ end
793
+
794
+ it "treats equal sets as <= and >= but not < or >" do
795
+ copy = Namo.new([{x: 2}, {x: 1}])
796
+ _(small <= copy).must_equal true
797
+ _(small >= copy).must_equal true
798
+ _(small < copy).must_equal false
799
+ _(small > copy).must_equal false
800
+ end
801
+
802
+ it "treats disjoint sets as neither subset nor superset" do
803
+ _(small <= disjoint).must_equal false
804
+ _(small >= disjoint).must_equal false
805
+ _(small < disjoint).must_equal false
806
+ _(small > disjoint).must_equal false
807
+ end
808
+
809
+ it "is multiset-aware: a single row is a proper subset of two of the same row" do
810
+ one = Namo.new([{x: 1}])
811
+ two = Namo.new([{x: 1}, {x: 1}])
812
+ _(one < two).must_equal true
813
+ _(one <= two).must_equal true
814
+ _(two <= one).must_equal false
815
+ _(two < one).must_equal false
816
+ end
817
+
818
+ it "raises ArgumentError on mismatched dimensions" do
819
+ other = Namo.new([{y: 1}])
820
+ _ { small < other }.must_raise ArgumentError
821
+ _ { small <= other }.must_raise ArgumentError
822
+ _ { small > other }.must_raise ArgumentError
823
+ _ { small >= other }.must_raise ArgumentError
824
+ end
825
+
826
+ it "raises TypeError on non-Namo" do
827
+ _ { small < [{x: 1}] }.must_raise TypeError
828
+ _ { small <= 'string' }.must_raise TypeError
829
+ _ { small > nil }.must_raise TypeError
830
+ _ { small >= 42 }.must_raise TypeError
831
+ end
832
+ end
833
+
834
+ describe "#equal?" do
835
+ it "is false for distinct objects" do
836
+ a = Namo.new([{x: 1}])
837
+ b = Namo.new([{x: 1}])
838
+ _(a.equal?(b)).must_equal false
839
+ end
840
+
841
+ it "is true for the same object" do
842
+ a = Namo.new([{x: 1}])
843
+ _(a.equal?(a)).must_equal true
844
+ end
845
+ end
846
+
847
+ describe "dimension-mismatch error message" do
848
+ it "names both dimension lists" do
849
+ a = Namo.new([{x: 1}])
850
+ b = Namo.new([{y: 1}])
851
+ err = _ { a + b }.must_raise ArgumentError
852
+ _(err.message).must_match(/dimensions don't match/)
853
+ _(err.message).must_match(/\[:x\]/)
854
+ _(err.message).must_match(/\[:y\]/)
855
+ end
856
+ end
857
+
858
+ describe "non-Namo comparison error message" do
859
+ it "names the offending class" do
860
+ a = Namo.new([{x: 1}])
861
+ err = _ { a < 'string' }.must_raise TypeError
862
+ _(err.message).must_match(/can't compare Namo with/)
863
+ _(err.message).must_match(/String/)
864
+ end
865
+ end
866
+
867
+ describe "non-Namo set operation error message" do
868
+ it "raises TypeError on non-Namo for #+, #-, #&, #|, #^" do
869
+ a = Namo.new([{x: 1}])
870
+ _ { a + [{x: 1}] }.must_raise TypeError
871
+ _ { a - 'string' }.must_raise TypeError
872
+ _ { a & nil }.must_raise TypeError
873
+ _ { a | 42 }.must_raise TypeError
874
+ _ { a ^ :symbol }.must_raise TypeError
875
+ end
876
+
877
+ it "names the offending class" do
878
+ a = Namo.new([{x: 1}])
879
+ err = _ { a + 'string' }.must_raise TypeError
880
+ _(err.message).must_match(/can't compare Namo with/)
881
+ _(err.message).must_match(/String/)
882
+ end
883
+ end
884
+
469
885
  describe "#to_a" do
470
886
  it "returns the data as an array of hashes" do
471
887
  _(sales.to_a).must_equal sample_data
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: namo
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.5.0
4
+ version: 0.7.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - thoran
@@ -72,8 +72,8 @@ files:
72
72
  - namo.gemspec
73
73
  - test/Namo/NegatedDimension_test.rb
74
74
  - test/Namo/Row_test.rb
75
- - test/Namo_test.rb
76
75
  - test/Symbol_test.rb
76
+ - test/namo_test.rb
77
77
  homepage: https://github.com/thoran/namo
78
78
  licenses:
79
79
  - MIT
@@ -92,7 +92,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
92
92
  - !ruby/object:Gem::Version
93
93
  version: '0'
94
94
  requirements: []
95
- rubygems_version: 4.0.10
95
+ rubygems_version: 4.0.11
96
96
  specification_version: 4
97
97
  summary: Named dimensional data for Ruby.
98
98
  test_files: []