namo 0.8.0 → 0.9.0
- checksums.yaml +4 -4
- data/CHANGELOG +13 -0
- data/README.md +118 -0
- data/lib/Namo/VERSION.rb +1 -1
- data/lib/namo.rb +48 -0
- data/test/namo_test.rb +282 -0
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7adbe8192367d3c4207f7b27ff0b6a8a13f5243b42a4e4f18a6fbc35ef6439be
+  data.tar.gz: cda1e8b3d8fc042b4457efc0bc7846d5a416a5eda670d140824a4f8ad193a0f2
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 60d639f243fc7bf306576b69ac646be388571b343330e5636f7c862519169cbaca3a36beb1cfccf16462b7e65b4735ca1b38e948db693c094c3636507531ebd4
+  data.tar.gz: 5898c9f8b7d9196481a964826cee97a8f1a962916f818604fa4ea647c1b59d5b05e17ea0ac5a578c6cd2a7d499100d95a10344029e7b6abd5bce70a41efc9b1f
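The digests above can be checked against a downloaded copy of the package. The following is a minimal sketch, assuming the `.gem` archive has already been unpacked so that `data.tar.gz` is available as a file; the helper name `checksum_matches?` and the path in the usage comment are hypothetical, and only Ruby's standard `Digest` library is used.

```ruby
require 'digest'

# Hypothetical helper: true when the given bytes hash to the expected hex digest.
# Pass algorithm: Digest::SHA512 to check the SHA512 entries instead.
def checksum_matches?(bytes, expected_hex, algorithm: Digest::SHA256)
  algorithm.hexdigest(bytes) == expected_hex.downcase
end

# Usage sketch (the path is an assumption about where the archive was unpacked):
#   bytes = File.binread('namo-0.9.0/data.tar.gz')
#   checksum_matches?(bytes, 'cda1e8b3d8fc042b4457efc0bc7846d5a416a5eda670d140824a4f8ad193a0f2')
```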
data/CHANGELOG
CHANGED
@@ -1,6 +1,19 @@
 CHANGELOG
 _________
 
+20260521
+0.9.0: + composition operators: equi-join (*), Cartesian product (**), decomposition (/)
+
+1. + Namo#*: Equi-join on shared data dimensions. Inner-join semantics — unmatched rows from both sides are dropped. Raises ArgumentError ("no shared dimensions, need to have shared dimensions") when operands have no overlap. Preserves duplicates multiplicatively. Formulae merge with self winning on conflict.
+2. + Namo#**: Cartesian product of two Namos with disjoint data dimensions. Raises ArgumentError ("dimensions in common, need no common dimensions") when any dimension is shared. Output has left.length * right.length rows. Formulae merge with self winning on conflict.
+3. + Namo#/: Decomposition. Removes from self the dimensions that are also in other (the intersection), then dedupes the projected rows. No precondition — total on Namo × Namo. When self and other share no dimensions, the operator is a no-op. Formulae carry through from self. (a ** b) / b == a exactly; (a * b) / b loses dimensions shared between a and b.
+4. + Namo#raise_unless_shared_data_dimensions, Namo#raise_unless_disjoint_data_dimensions: Private precondition helpers for #* and #** respectively.
+5. ~ test/namo_test.rb: + #* tests (single/multi-dimension join, inner-join symmetry, multiplicative duplicates, formulae merging, error cases). + #** tests (Cartesian product, output sizing, dimension overlap error). + #/ tests (intersection removal, dedupe of collided rows, no-op on disjoint operands, idempotence). + Composition round-trip tests for the ** case (exact identity) and the * case (lossy on shared dimensions).
+6. ~ README.md: + Composition section (*), + Cartesian product section (**), + Decomposition section (/) including the combining-vs-projecting rationale for /'s no-precondition design. Placed after Symmetric Difference and before Equality.
+7. ~ ROADMAP.md: Promote 0.9.0 from upcoming to shipped under "Current state: 0.9.0"; revise Summary to include composition in the operator vocabulary and point "next phase" at 0.10.0+.
+8. ~ COMPARISON.md: /planned (0.9.0)/shipped (0.9.0)/ for Equi-join, Cartesian product, and Decomposition. + Paragraph in the Decomposition entry on the combining-vs-projecting distinction. Date bumped to 20260521.
+9. ~ Namo::VERSION: /0.8.0/0.9.0/
+
 20260521
 0.8.0: + proc and regex-based selection
 
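The three operators named in the changelog can be illustrated without the gem. The following is a rough sketch on plain arrays of hashes, not Namo's API: the helper names `dims`, `equi_join`, `cartesian` and `decompose` are hypothetical, the error messages are shortened, and formulae are left out entirely.

```ruby
# Hypothetical stand-ins on plain arrays of hashes; not part of the namo gem.
def dims(rows)
  rows.flat_map(&:keys).uniq
end

# Equi-join: keep pairs whose values agree on every shared dimension.
def equi_join(left, right)
  shared = dims(left) & dims(right)
  raise ArgumentError, "no shared dimensions" if shared.empty?
  left.flat_map do |l|
    right.select{|r| shared.all?{|d| l[d] == r[d]}}.map{|r| l.merge(r)}
  end
end

# Cartesian product: every left row paired with every right row.
def cartesian(left, right)
  raise ArgumentError, "dimensions in common" unless (dims(left) & dims(right)).empty?
  left.flat_map{|l| right.map{|r| l.merge(r)}}
end

# Decomposition: drop the dimensions also present on the right, then dedupe.
def decompose(left, right)
  kept = dims(left) - dims(right)
  left.map{|row| row.slice(*kept)}.uniq
end

prices = [{symbol: 'BHP', close: 42.5}, {symbol: 'CBA', close: 100.0}]
ratios = [{symbol: 'BHP', pe: 14.5}, {symbol: 'RIO', pe: 9.2}]
quarters = [{quarter: 'Q1'}, {quarter: 'Q2'}]

joined = equi_join(prices, ratios)      # only the BHP row has a partner
product = cartesian(prices, quarters)   # 2 * 2 rows
factored = decompose(product, quarters) # back to the rows of prices
```

In this sketch the unmatched `CBA` and `RIO` rows drop out of `joined`, which is the inner-join behaviour the changelog describes.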
data/README.md
CHANGED
@@ -319,6 +319,124 @@ set_a ^ set_b
 
 The dimensions must match; different dimensions raise an `ArgumentError`. Formulae merge from both sides; the left-hand side's formulae take precedence on conflict.
 
+### Composition
+
+`*` is the equi-join operator. It pairs rows from two Namos where coordinates match on every shared dimension, like an inner join on the shared dimension names:
+
+```ruby
+ohlcv = Namo.new([
+  {symbol: 'BHP', date: '2025-01-01', close: 42.5},
+  {symbol: 'RIO', date: '2025-01-01', close: 118.3}
+])
+
+fundamentals = Namo.new([
+  {symbol: 'BHP', pe: 14.5},
+  {symbol: 'RIO', pe: 9.2}
+])
+
+ohlcv * fundamentals
+# => #<Namo [
+#   {symbol: 'BHP', date: '2025-01-01', close: 42.5, pe: 14.5},
+#   {symbol: 'RIO', date: '2025-01-01', close: 118.3, pe: 9.2}
+# ]>
+```
+
+Inner-join semantics: unmatched rows from either side are dropped. Output dimensions are `self.data_dimensions` followed by `other.data_dimensions` exclusive to other. Duplicates on shared coordinates are preserved multiplicatively — output multiplicity is the product of input multiplicities on each matching key.
+
+The two Namos must have at least one shared data dimension. No overlap raises an `ArgumentError` — the asymmetry with `**` is deliberate, and falling through to a Cartesian product would silently turn a logic error into a large pile of nonsense rows. Formulae merge from both sides; the left-hand side wins on conflict.
+
+### Cartesian product
+
+`**` is the Cartesian product. Every row from the left paired with every row from the right:
+
+```ruby
+products = Namo.new([{product: 'Widget'}, {product: 'Gadget'}])
+quarters = Namo.new([{quarter: 'Q1'}, {quarter: 'Q2'}])
+
+products ** quarters
+# => #<Namo [
+#   {product: 'Widget', quarter: 'Q1'},
+#   {product: 'Widget', quarter: 'Q2'},
+#   {product: 'Gadget', quarter: 'Q1'},
+#   {product: 'Gadget', quarter: 'Q2'}
+# ]>
+```
+
+Output has `self.data.length * other.data.length` rows. Output dimensions are `self.data_dimensions + other.data_dimensions`, in operand order. Duplicates are preserved multiplicatively.
+
+The two Namos must have **no** shared data dimensions — the precondition is the mirror image of `*`. Any overlap raises an `ArgumentError`; allowing it would produce rows with the same dimension named twice. Formulae merge from both sides; the left-hand side wins on conflict.
+
+The visual relationship is intentional: `*` is the filtered version, `**` is the explosive version — more sigil, more output.
+
+### Decomposition
+
+`/` removes from the left Namo the dimensions that are also in the right, then dedupes the projected rows. It's the inverse of `*` and `**`:
+
+```ruby
+combined = Namo.new([
+  {symbol: 'BHP', date: '2025-01-01', close: 42.5, pe: 14.5},
+  {symbol: 'RIO', date: '2025-01-01', close: 118.3, pe: 9.2}
+])
+
+fundamentals = Namo.new([
+  {symbol: 'BHP', pe: 14.5},
+  {symbol: 'RIO', pe: 9.2}
+])
+
+combined / fundamentals
+# => #<Namo [
+#   {date: '2025-01-01', close: 42.5},
+#   {date: '2025-01-01', close: 118.3}
+# ]>
+```
+
+The intersection of dimensions — here `:symbol` and `:pe` — is removed. Everything else stays. The projected rows are deduplicated, so `/` answers "what's left when these dimensions are factored out?" rather than "what rows survive a column drop?". Formulae carry through from the left-hand side.
+
+`/` has no precondition. When the two Namos share no dimensions, the intersection is empty, nothing is removed, and `self / other` returns a Namo equal to self:
+
+```ruby
+shipments = Namo.new([{order_id: 1, weight: 10}])
+weather = Namo.new([{date: '2025-01-01', temperature: 22}])
+
+shipments / weather
+# => #<Namo [{order_id: 1, weight: 10}]> — equal to shipments
+```
+
+The round-trip identity holds for the `**` case exactly:
+
+```ruby
+a = Namo.new([{symbol: 'BHP'}, {symbol: 'RIO'}])
+b = Namo.new([{quarter: 'Q1'}, {quarter: 'Q2'}])
+
+(a ** b) / b == a
+# => true
+```
+
+For `*`, the round-trip is lossy on the dimensions that were shared between the operands:
+
+```ruby
+a = Namo.new([{symbol: 'BHP', close: 42.5}, {symbol: 'RIO', close: 118.3}])
+b = Namo.new([{symbol: 'BHP', pe: 14.5}, {symbol: 'RIO', pe: 9.2}])
+
+(a * b) / b
+# => #<Namo [{close: 42.5}, {close: 118.3}]>
+# Equal to a[-:symbol]. :symbol was shared and is lost.
+```
+
+The asymmetry is inherent: `/` operates only on the two values it receives and can't distinguish "shared dimension that belonged to both" from "exclusive dimension that belonged only to the right". Removing the intersection is the only rule expressible from the operands alone, and it gives clean recovery from `**` and well-defined (if lossy) recovery from `*`.
+
+#### Why `/` is loose
+
+`*` and `**` raise when their preconditions are violated — combining unrelated Namos has no natural answer, and silently producing arbitrary output would turn a logic error into a large pile of nonsense rows. `/` is different: it's a projecting operator, not a combining one, and projecting away nothing returns the original. The no-precondition rule isn't a fallback; it's the structurally correct result.
+
+This earns `/` three properties a strict version would lose:
+
+- **Identity test.** `combined / other == combined` exactly when the two have no shared dimensions — answers "are these Namos dimensionally independent?" without explicit introspection. Same shape as `a & b == a` answering subset from 0.6.0.
+- **Idempotence.** `(c / b) / b == c / b`. Once `b`'s dimensions are removed, removing them again does nothing.
+- **Pipeline composition.** A processing step that applies `/ separator` can run over any Namo regardless of whether the separator's dimensions apply. Uninvolved Namos pass through unchanged; involved Namos get stripped. The pipeline doesn't need to special-case applicability.
+
+This is the same pattern that makes `Array#-` useful with arrays that aren't subsets: `[1, 2, 3] - [9] == [1, 2, 3]`, not an error. The no-op-on-non-applicable behaviour lets the operator compose into pipelines that don't know in advance whether the operation applies.
+
 ### Equality
 
 Comparison on Namos is **multiset-theoretic on rows**: row order is ignored (it's an accident of ingestion, not data), but row multiplicities count (they *are* data). The same stance carries across the equality, pattern-match, and subset/superset operators below.
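The round-trip and idempotence claims in the README hunk can be probed on plain arrays. The sketch below does not use Namo: `product`, `factor_out` and `same_rows?` are hypothetical stand-ins, and `same_rows?` approximates the multiset comparison the README describes (order ignored, multiplicity counted). Because decomposition dedupes, the sketch suggests the `**` round trip may only be exact when the left operand has no repeated rows; whether Namo itself behaves this way is an assumption, not something the diff's tests cover.

```ruby
# Plain-array stand-ins (hypothetical helpers, not the gem's API).
def product(left, right)
  left.flat_map{|l| right.map{|r| l.merge(r)}}
end

# Drop the dimensions also present in other, then dedupe the projected rows.
def factor_out(rows, other)
  kept = rows.flat_map(&:keys).uniq - other.flat_map(&:keys).uniq
  rows.map{|row| row.slice(*kept)}.uniq
end

# Multiset comparison: row order ignored, row multiplicity counted.
def same_rows?(x, y)
  x.tally == y.tally
end

a = [{symbol: 'BHP'}, {symbol: 'RIO'}]
b = [{quarter: 'Q1'}, {quarter: 'Q2'}]
round_trip = same_rows?(factor_out(product(a, b), b), a)

# Same round trip, but the left operand repeats a row.
dup_a = [{symbol: 'BHP'}, {symbol: 'BHP'}]
dup_round_trip = same_rows?(factor_out(product(dup_a, b), b), dup_a)

# Idempotence: factoring out b a second time changes nothing.
once = factor_out(product(a, b), b)
twice = factor_out(once, b)
```

Here `round_trip` holds, `dup_round_trip` does not (the two identical rows collapse to one), and `once` equals `twice`.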
data/lib/Namo/VERSION.rb
CHANGED
data/lib/namo.rb
CHANGED
@@ -110,6 +110,42 @@ class Namo
     self.class.new((@data - other.data) + (other.data - @data), formulae: other.formulae.merge(@formulae))
   end
 
+  def *(other)
+    raise_unless_namo(other)
+    raise_unless_shared_data_dimensions(other)
+    shared = data_dimensions & other.data_dimensions
+    combined_data = []
+    @data.each do |left_row|
+      other.data.each do |right_row|
+        if shared.all?{|dim| left_row[dim] == right_row[dim]}
+          combined_data << left_row.merge(right_row)
+        end
+      end
+    end
+    self.class.new(combined_data, formulae: other.formulae.merge(@formulae))
+  end
+
+  def **(other)
+    raise_unless_namo(other)
+    raise_unless_disjoint_data_dimensions(other)
+    combined_data = []
+    @data.each do |left_row|
+      other.data.each do |right_row|
+        combined_data << left_row.merge(right_row)
+      end
+    end
+    self.class.new(combined_data, formulae: other.formulae.merge(@formulae))
+  end
+
+  def /(other)
+    raise_unless_namo(other)
+    kept = data_dimensions - other.data_dimensions
+    projected = @data.map do |row|
+      kept.each_with_object({}){|dim, hash| hash[dim] = row[dim]}
+    end
+    self.class.new(projected.uniq, formulae: @formulae.dup)
+  end
+
   def ==(other)
     return false unless other.is_a?(Namo)
     canonical_data == other.canonical_data
@@ -201,6 +237,18 @@ class Namo
     end
   end
 
+  def raise_unless_shared_data_dimensions(other)
+    if (data_dimensions & other.data_dimensions).empty?
+      raise ArgumentError, "no shared dimensions, need to have shared dimensions: #{data_dimensions} vs #{other.data_dimensions}"
+    end
+  end
+
+  def raise_unless_disjoint_data_dimensions(other)
+    if (data_dimensions & other.data_dimensions).any?
+      raise ArgumentError, "dimensions in common, need no common dimensions: #{data_dimensions} vs #{other.data_dimensions}"
+    end
+  end
+
   def initialize(data = nil, formulae: {})
     @data = data
     @formulae = formulae
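The `*` implementation in this hunk compares every left row with every right row. For larger inputs, a hash join is one possible alternative. The sketch below is not the gem's code: `nested_loop_join` and `hash_join` are hypothetical helpers on plain arrays of hashes, written only to compare the two shapes side by side.

```ruby
# Nested-loop join, mirroring the shape of the loop in the diff.
def nested_loop_join(left, right, shared)
  left.flat_map do |l|
    right.select{|r| shared.all?{|dim| l[dim] == r[dim]}}.map{|r| l.merge(r)}
  end
end

# Hash join: index the right side once by its shared-dimension values,
# then look up each left row's key instead of scanning the right side.
def hash_join(left, right, shared)
  index = right.group_by{|r| r.values_at(*shared)}
  left.flat_map do |l|
    index.fetch(l.values_at(*shared), []).map{|r| l.merge(r)}
  end
end

left = [
  {symbol: 'BHP', close: 42.5},
  {symbol: 'BHP', close: 43.0},
  {symbol: 'CBA', close: 100.0}
]
right = [
  {symbol: 'BHP', pe: 14.5},
  {symbol: 'BHP', pe: 14.7},
  {symbol: 'RIO', pe: 9.2}
]

looped = nested_loop_join(left, right, [:symbol])
hashed = hash_join(left, right, [:symbol])
```

Both versions keep left-major row order and the multiplicative duplicates. One difference to keep in mind: hash lookup compares keys with `eql?`, so a key of `1` would not match `1.0`, whereas the `==` comparison in the nested loop would treat them as equal.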
data/test/namo_test.rb
CHANGED
|
@@ -710,6 +710,288 @@ describe Namo do
|
|
|
710
710
|
end
|
|
711
711
|
end
|
|
712
712
|
|
|
713
|
+
describe "#*" do
|
|
714
|
+
let(:ohlcv) do
|
|
715
|
+
Namo.new([
|
|
716
|
+
{symbol: 'BHP', date: '2025-01-01', close: 42.5},
|
|
717
|
+
{symbol: 'RIO', date: '2025-01-01', close: 118.3}
|
|
718
|
+
])
|
|
719
|
+
end
|
|
720
|
+
|
|
721
|
+
let(:fundamentals) do
|
|
722
|
+
Namo.new([
|
|
723
|
+
{symbol: 'BHP', pe: 14.5},
|
|
724
|
+
{symbol: 'RIO', pe: 9.2}
|
|
725
|
+
])
|
|
726
|
+
end
|
|
727
|
+
|
|
728
|
+
it "joins on a single shared dimension" do
|
|
729
|
+
result = ohlcv * fundamentals
|
|
730
|
+
_(result.to_a).must_equal [
|
|
731
|
+
{symbol: 'BHP', date: '2025-01-01', close: 42.5, pe: 14.5},
|
|
732
|
+
{symbol: 'RIO', date: '2025-01-01', close: 118.3, pe: 9.2}
|
|
733
|
+
]
|
|
734
|
+
end
|
|
735
|
+
|
|
736
|
+
it "joins on multiple shared dimensions" do
|
|
737
|
+
a = Namo.new([
|
|
738
|
+
{symbol: 'BHP', date: '2025-01-01', close: 42.5},
|
|
739
|
+
{symbol: 'BHP', date: '2025-01-02', close: 43.0}
|
|
740
|
+
])
|
|
741
|
+
b = Namo.new([
|
|
742
|
+
{symbol: 'BHP', date: '2025-01-01', volume: 1000},
|
|
743
|
+
{symbol: 'BHP', date: '2025-01-02', volume: 1500}
|
|
744
|
+
])
|
|
745
|
+
result = a * b
|
|
746
|
+
_(result.to_a).must_equal [
|
|
747
|
+
{symbol: 'BHP', date: '2025-01-01', close: 42.5, volume: 1000},
|
|
748
|
+
{symbol: 'BHP', date: '2025-01-02', close: 43.0, volume: 1500}
|
|
749
|
+
]
|
|
750
|
+
end
|
|
751
|
+
|
|
752
|
+
it "preserves non-shared dimensions from both sides" do
|
|
753
|
+
result = ohlcv * fundamentals
|
|
754
|
+
_(result.dimensions).must_equal [:symbol, :date, :close, :pe]
|
|
755
|
+
end
|
|
756
|
+
|
|
757
|
+
it "drops unmatched rows from both sides (inner-join symmetry)" do
|
|
758
|
+
left = Namo.new([
|
|
759
|
+
{symbol: 'BHP', close: 42.5},
|
|
760
|
+
{symbol: 'CBA', close: 100.0}
|
|
761
|
+
])
|
|
762
|
+
right = Namo.new([
|
|
763
|
+
{symbol: 'BHP', pe: 14.5},
|
|
764
|
+
{symbol: 'RIO', pe: 9.2}
|
|
765
|
+
])
|
|
766
|
+
result = left * right
|
|
767
|
+
_(result.to_a).must_equal [{symbol: 'BHP', close: 42.5, pe: 14.5}]
|
|
768
|
+
end
|
|
769
|
+
|
|
770
|
+
it "produces multiplicative duplicates when inputs have duplicates on shared dimensions" do
|
|
771
|
+
left = Namo.new([
|
|
772
|
+
{symbol: 'BHP', close: 42.5},
|
|
773
|
+
{symbol: 'BHP', close: 43.0}
|
|
774
|
+
])
|
|
775
|
+
right = Namo.new([
|
|
776
|
+
{symbol: 'BHP', pe: 14.5},
|
|
777
|
+
{symbol: 'BHP', pe: 14.7}
|
|
778
|
+
])
|
|
779
|
+
result = left * right
|
|
780
|
+
_(result.to_a.length).must_equal 4
|
|
781
|
+
_(result.to_a).must_equal [
|
|
782
|
+
{symbol: 'BHP', close: 42.5, pe: 14.5},
|
|
783
|
+
{symbol: 'BHP', close: 42.5, pe: 14.7},
|
|
784
|
+
{symbol: 'BHP', close: 43.0, pe: 14.5},
|
|
785
|
+
{symbol: 'BHP', close: 43.0, pe: 14.7}
|
|
786
|
+
]
|
|
787
|
+
end
|
|
788
|
+
|
|
789
|
+
it "carries formulae through from self" do
|
|
790
|
+
ohlcv[:label] = proc{|r| "#{r[:symbol]}-self"}
|
|
791
|
+
result = ohlcv * fundamentals
|
|
792
|
+
_(result.map{|row| row[:label]}).must_equal ['BHP-self', 'RIO-self']
|
|
793
|
+
end
|
|
794
|
+
|
|
795
|
+
it "merges formulae from other" do
|
|
796
|
+
fundamentals[:flag] = proc{|r| "pe=#{r[:pe]}"}
|
|
797
|
+
result = ohlcv * fundamentals
|
|
798
|
+
_(result.map{|row| row[:flag]}).must_equal ['pe=14.5', 'pe=9.2']
|
|
799
|
+
end
|
|
800
|
+
|
|
801
|
+
it "prefers self's formulae on conflict" do
|
|
802
|
+
ohlcv[:label] = proc{|r| "self: #{r[:symbol]}"}
|
|
803
|
+
fundamentals[:label] = proc{|r| "other: #{r[:symbol]}"}
|
|
804
|
+
result = ohlcv * fundamentals
|
|
805
|
+
_(result.map{|row| row[:label]}).must_equal ['self: BHP', 'self: RIO']
|
|
806
|
+
end
|
|
807
|
+
|
|
808
|
+
it "raises ArgumentError when there are no shared dimensions" do
|
|
809
|
+
a = Namo.new([{symbol: 'BHP'}])
|
|
810
|
+
b = Namo.new([{quarter: 'Q1'}])
|
|
811
|
+
err = _ { a * b }.must_raise ArgumentError
|
|
812
|
+
_(err.message).must_match(/no shared dimensions, need to have shared dimensions/)
|
|
813
|
+
end
|
|
814
|
+
|
|
815
|
+
it "raises TypeError on a non-Namo operand" do
|
|
816
|
+
_ { ohlcv * [{symbol: 'BHP'}] }.must_raise TypeError
|
|
817
|
+
end
|
|
818
|
+
|
|
819
|
+
it "returns an instance of self's class" do
|
|
820
|
+
subclass = Class.new(Namo)
|
|
821
|
+
a = subclass.new([{symbol: 'BHP', close: 42.5}])
|
|
822
|
+
b = Namo.new([{symbol: 'BHP', pe: 14.5}])
|
|
823
|
+
_((a * b).class).must_equal subclass
|
|
824
|
+
end
|
|
825
|
+
end
|
|
826
|
+
|
|
827
|
+
describe "#**" do
|
|
828
|
+
let(:products) do
|
|
829
|
+
Namo.new([{product: 'Widget'}, {product: 'Gadget'}])
|
|
830
|
+
end
|
|
831
|
+
|
|
832
|
+
let(:quarters) do
|
|
833
|
+
Namo.new([{quarter: 'Q1'}, {quarter: 'Q2'}])
|
|
834
|
+
end
|
|
835
|
+
|
|
836
|
+
it "Cartesian-products two disjoint Namos" do
|
|
837
|
+
result = products ** quarters
|
|
838
|
+
_(result.to_a).must_equal [
|
|
839
|
+
{product: 'Widget', quarter: 'Q1'},
|
|
840
|
+
{product: 'Widget', quarter: 'Q2'},
|
|
841
|
+
{product: 'Gadget', quarter: 'Q1'},
|
|
842
|
+
{product: 'Gadget', quarter: 'Q2'}
|
|
843
|
+
]
|
|
844
|
+
end
|
|
845
|
+
|
|
846
|
+
it "has self.data.length * other.data.length rows" do
|
|
847
|
+
a = Namo.new([{x: 1}, {x: 2}, {x: 3}])
|
|
848
|
+
b = Namo.new([{y: 'a'}, {y: 'b'}])
|
|
849
|
+
_((a ** b).to_a.length).must_equal 6
|
|
850
|
+
end
|
|
851
|
+
|
|
852
|
+
it "output dimensions are self.data_dimensions + other.data_dimensions" do
|
|
853
|
+
result = products ** quarters
|
|
854
|
+
_(result.dimensions).must_equal [:product, :quarter]
|
|
855
|
+
end
|
|
856
|
+
|
|
857
|
+
it "preserves duplicates on either side multiplicatively" do
|
|
858
|
+
a = Namo.new([{x: 1}, {x: 1}])
|
|
859
|
+
b = Namo.new([{y: 'a'}, {y: 'a'}])
|
|
860
|
+
result = a ** b
|
|
861
|
+
_(result.to_a.length).must_equal 4
|
|
862
|
+
end
|
|
863
|
+
|
|
864
|
+
it "carries formulae through from self" do
|
|
865
|
+
products[:label] = proc{|r| "self: #{r[:product]}"}
|
|
866
|
+
result = products ** quarters
|
|
867
|
+
_(result.map{|row| row[:label]}).must_equal [
|
|
868
|
+
'self: Widget', 'self: Widget', 'self: Gadget', 'self: Gadget'
|
|
869
|
+
]
|
|
870
|
+
end
|
|
871
|
+
|
|
872
|
+
it "merges formulae from other" do
|
|
873
|
+
quarters[:flag] = proc{|r| "q=#{r[:quarter]}"}
|
|
874
|
+
result = products ** quarters
|
|
875
|
+
_(result.map{|row| row[:flag]}).must_equal ['q=Q1', 'q=Q2', 'q=Q1', 'q=Q2']
|
|
876
|
+
end
|
|
877
|
+
|
|
878
|
+
it "prefers self's formulae on conflict" do
|
|
879
|
+
products[:label] = proc{|r| "self: #{r[:product]}"}
|
|
880
|
+
quarters[:label] = proc{|r| "other: #{r[:quarter]}"}
|
|
881
|
+
result = products ** quarters
|
|
882
|
+
_(result.map{|row| row[:label]}).must_equal [
|
|
883
|
+
'self: Widget', 'self: Widget', 'self: Gadget', 'self: Gadget'
|
|
884
|
+
]
|
|
885
|
+
end
|
|
886
|
+
|
|
887
|
+
it "raises ArgumentError when any dimension is shared" do
|
|
888
|
+
a = Namo.new([{symbol: 'BHP', close: 42.5}])
|
|
889
|
+
b = Namo.new([{symbol: 'RIO', pe: 14.5}])
|
|
890
|
+
err = _ { a ** b }.must_raise ArgumentError
|
|
891
|
+
_(err.message).must_match(/dimensions in common, need no common dimensions/)
|
|
892
|
+
end
|
|
893
|
+
|
|
894
|
+
it "raises TypeError on a non-Namo operand" do
|
|
895
|
+
_ { products ** [{quarter: 'Q1'}] }.must_raise TypeError
|
|
896
|
+
end
|
|
897
|
+
|
|
898
|
+
it "returns an instance of self's class" do
|
|
899
|
+
subclass = Class.new(Namo)
|
|
900
|
+
a = subclass.new([{product: 'Widget'}])
|
|
901
|
+
b = Namo.new([{quarter: 'Q1'}])
|
|
902
|
+
_((a ** b).class).must_equal subclass
|
|
903
|
+
end
|
|
904
|
+
end
|
|
905
|
+
|
|
906
|
+
describe "#/" do
|
|
907
|
+
let(:combined) do
|
|
908
|
+
Namo.new([
|
|
909
|
+
{symbol: 'BHP', date: '2025-01-01', close: 42.5, pe: 14.5},
|
|
910
|
+
{symbol: 'RIO', date: '2025-01-01', close: 118.3, pe: 9.2}
|
|
911
|
+
])
|
|
912
|
+
end
|
|
913
|
+
|
|
914
|
+
let(:fundamentals) do
|
|
915
|
+
Namo.new([
|
|
916
|
+
{symbol: 'BHP', pe: 14.5},
|
|
917
|
+
{symbol: 'RIO', pe: 9.2}
|
|
918
|
+
])
|
|
919
|
+
end
|
|
920
|
+
|
|
921
|
+
it "removes dimensions present in both self and other (the intersection)" do
|
|
922
|
+
result = combined / fundamentals
|
|
923
|
+
_(result.dimensions).must_equal [:date, :close]
|
|
924
|
+
end
|
|
925
|
+
|
|
926
|
+
it "preserves dimensions exclusive to self" do
|
|
927
|
+
result = combined / fundamentals
|
|
928
|
+
_(result.to_a).must_equal [
|
|
929
|
+
{date: '2025-01-01', close: 42.5},
|
|
930
|
+
{date: '2025-01-01', close: 118.3}
|
|
931
|
+
]
|
|
932
|
+
end
|
|
933
|
+
|
|
934
|
+
it "dedupes rows that collide after projection" do
|
|
935
|
+
a = Namo.new([
|
|
936
|
+
{symbol: 'BHP', close: 42.5},
|
|
937
|
+
{symbol: 'RIO', close: 42.5}
|
|
938
|
+
])
|
|
939
|
+
b = Namo.new([{symbol: 'X'}])
|
|
940
|
+
result = a / b
|
|
941
|
+
_(result.to_a).must_equal [{close: 42.5}]
|
|
942
|
+
end
|
|
943
|
+
|
|
944
|
+
it "carries formulae through from self" do
|
|
945
|
+
combined[:label] = proc{|r| "row: #{r[:close]}"}
|
|
946
|
+
result = combined / fundamentals
|
|
947
|
+
_(result.map{|row| row[:label]}).must_equal ['row: 42.5', 'row: 118.3']
|
|
948
|
+
end
|
|
949
|
+
|
|
950
|
+
it "is a no-op when self and other share no dimensions" do
|
|
951
|
+
shipments = Namo.new([{order_id: 1, weight: 10}])
|
|
952
|
+
weather = Namo.new([{date: '2025-01-01', temperature: 22}])
|
|
953
|
+
_(shipments / weather).must_equal shipments
|
|
954
|
+
end
|
|
955
|
+
|
|
956
|
+
it "ignores dimensions present in other but not in self" do
|
|
957
|
+
a = Namo.new([{symbol: 'BHP', close: 42.5}])
|
|
958
|
+
b = Namo.new([{symbol: 'BHP', pe: 14.5, sector: 'Mining'}])
|
|
959
|
+
result = a / b
|
|
960
|
+
_(result.dimensions).must_equal [:close]
|
|
961
|
+
end
|
|
962
|
+
|
|
963
|
+
it "is idempotent" do
|
|
964
|
+
first = combined / fundamentals
|
|
965
|
+
second = first / fundamentals
|
|
966
|
+
_(second).must_equal first
|
|
967
|
+
end
|
|
968
|
+
|
|
969
|
+
it "raises TypeError on a non-Namo operand" do
|
|
970
|
+
_ { combined / [{symbol: 'BHP'}] }.must_raise TypeError
|
|
971
|
+
end
|
|
972
|
+
|
|
973
|
+
it "returns an instance of self's class" do
|
|
974
|
+
subclass = Class.new(Namo)
|
|
975
|
+
a = subclass.new([{symbol: 'BHP', close: 42.5}])
|
|
976
|
+
b = Namo.new([{symbol: 'BHP', pe: 14.5}])
|
|
977
|
+
_((a / b).class).must_equal subclass
|
|
978
|
+
end
|
|
979
|
+
end
|
|
980
|
+
|
|
981
|
+
describe "composition round-trip" do
|
|
982
|
+
it "satisfies (a ** b) / b == a for disjoint a and b" do
|
|
983
|
+
a = Namo.new([{symbol: 'BHP'}, {symbol: 'RIO'}])
|
|
984
|
+
b = Namo.new([{quarter: 'Q1'}, {quarter: 'Q2'}])
|
|
985
|
+
_((a ** b) / b).must_equal a
|
|
986
|
+
end
|
|
987
|
+
|
|
988
|
+
it "satisfies (a * b) / b == a[-:shared] for a and b with shared dimensions (shared dimensions lost)" do
|
|
989
|
+
a = Namo.new([{symbol: 'BHP', close: 42.5}, {symbol: 'RIO', close: 118.3}])
|
|
990
|
+
b = Namo.new([{symbol: 'BHP', pe: 14.5}, {symbol: 'RIO', pe: 9.2}])
|
|
991
|
+
_((a * b) / b).must_equal Namo.new([{close: 42.5}, {close: 118.3}])
|
|
992
|
+
end
|
|
993
|
+
end
|
|
994
|
+
|
|
713
995
|
describe "#==" do
|
|
714
996
|
it "is true for same data, same order" do
|
|
715
997
|
a = Namo.new([{x: 1}, {x: 2}])
|