namo 0.15.0 → 0.16.0
- checksums.yaml +4 -4
- data/CHANGELOG +45 -0
- data/README.md +16 -2
- data/lib/Namo/VERSION.rb +1 -1
- data/lib/namo.rb +11 -1
- data/test/namo_test.rb +200 -0
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: dcb7aafe4522ff115c12464016fef1a6909047975829523b095105f601a7ae60
+  data.tar.gz: 81f150f5f370f836734908a30e1ecd600b856f90c0151a698fa25bd117530d92
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6fd5a96f0fdf85e6ffb8a3cfa2d1a560d5da1d118072e95cd549af43149ff2a4c963de122b9026c29c0817e33fcfbd2ba3ece969ee1f6f5d92738c3d7f38b73a
+  data.tar.gz: 78ba07312d7e4f4d4ac85a6ca572852964ed7055d48687361d8a86f9cb998122f421e75418ba67bc978bf78d7b0cc22315ebb800ca734e8ade07817bae886736
data/CHANGELOG
CHANGED
@@ -1,6 +1,51 @@
 CHANGELOG
 _________
 
+20260612
+0.16.0: ~ data/formula exclusivity — projection drops the formulae it materialises; * and ** raise on a data/formula name collision.
+
+1. ~ lib/namo.rb: Namo#[]'s positive-projection branch carries @formulae minus the projected
+   derived names — naming a derived dimension materialises it (stored values, computed against
+   the yielding Namo, windowed over any same-call selection) and drops the formula; omitted
+   formulae carry live and recompute from the result's own rows. Contraction and selection-only
+   calls are unchanged. The projection list is the materialise/live selector.
+2. ~ lib/namo.rb: Namo#* and Namo#** raise ArgumentError via the new private
+   raise_unless_data_formula_exclusivity when one operand's data dimension is the other's
+   derived dimension, block and no-block forms alike. Formula-vs-formula stays left-wins; the
+   set operators need no guard (matching-data-dimensions blocks the asymmetric case); the
+   constructor stays unguarded.
+3. ~ test/namo_test.rb: + "data/formula exclusivity" describe — access-path agreement on a
+   materialised dimension, dimension-listing, dependent-formula carry, omitted-formula
+   liveness, live-without-inputs caveat, two-arity windowing at materialisation,
+   contraction/selection unchanged, subclass type, composition collision raises (both
+   directions, both operators, block forms), left-wins formula merge, contraction-first
+   resolution.
+4. ~ test/namo_test.rb: + "range selection" context under #[] — basic range, beginless and
+   endless forms, range composed with projection, range on a formula-defined dimension
+   (Row_test holds the predicate matrix; these pin the Namo-level wiring).
+5. ~ README.md: + Projection of derived dimensions subsection under Formulae (naming
+   materialises, omitting carries live); data/formula collision sentences under Composition
+   and Cartesian product.
+6. ~ ROADMAP.md: Promote 0.16.0 to shipped; Current state -> 0.16.0; Summary folds in
+   exclusivity; next phase -> 0.17.0. window.length -> window.count and n.length -> n.count
+   in the remaining future-release examples (0.17.0, 2.x, 4.x) — Namo has no #length,
+   extending the 0.15.0 correction.
+7. ~ COMPARISON.md: Repoint the pre-renumbering planned markers to what shipped — proc-based,
+   regex-based, and mixed selection -> shipped (0.8.0); Enumerable methods return Namos ->
+   shipped (0.11.0), with the entry summary's parity sentence and the Sorting entry's
+   "as of 0.14.0" corrected to 0.11.0; values and to_h -> shipped (0.7.0). Aspect classes ->
+   not planned, the entry rewritten to record 0.7.0's plain-return-types decision (Namo#===
+   and subclassing cover case dispatch; a Matcher factory can serve a finer split later).
+   Aggregation repointed from 2.x to group_by returning a Namo::Collection at 0.19.0 (gated
+   on Collection at 0.18.0), with summary/members examples; bare names stay 2.x. Parameterised
+   formulae stays planned (0.17.0), its example's window.length -> window.count.
+8. ~ EXAMPLES.md: + Epidemiology / public health section — a cross-row (two-arity) rolling
+   weekly average in the four-stage format (Polars, then Namo 1.x/2.x/3.x), with the
+   (row, namo) window over the yielding Namo and a one-arity formula referencing the
+   two-arity one. window.length -> window.count in the finance 1.x and 2.x stages. Date
+   bumped to 20260612.
+9. ~ Namo::VERSION: /0.15.0/0.16.0/
+
 20260612
 0.15.0: + two-arity formulae — procs with arity 2 receive (row, namo) for cross-row computation.
 
data/README.md
CHANGED
@@ -353,7 +353,7 @@ ohlcv * fundamentals
 
 Inner-join semantics: unmatched rows from either side are dropped. Output dimensions are `self.data_dimensions` followed by `other.data_dimensions` exclusive to other. Duplicates on shared coordinates are preserved multiplicatively — output multiplicity is the product of input multiplicities on each matching key.
 
-The two Namos must have at least one shared data dimension. No overlap raises an `ArgumentError` — the asymmetry with `**` is deliberate, and falling through to a Cartesian product would silently turn a logic error into a large pile of nonsense rows. Formulae merge from both sides; the left-hand side wins on conflict.
+The two Namos must have at least one shared data dimension. No overlap raises an `ArgumentError` — the asymmetry with `**` is deliberate, and falling through to a Cartesian product would silently turn a logic error into a large pile of nonsense rows. Formulae merge from both sides; the left-hand side wins on conflict. A name that is data on one side and a formula on the other also raises an `ArgumentError` — the operands disagree about what the name means, with no last-write order to appeal to — so resolve before composing: `audited[-:margin] * modelled`.
 
 #### Conditional join
 
@@ -404,7 +404,7 @@ products ** quarters
 
 Output has `self.data.length * other.data.length` rows. Output dimensions are `self.data_dimensions + other.data_dimensions`, in operand order. Duplicates are preserved multiplicatively.
 
-The two Namos must have **no** shared data dimensions — the precondition is the mirror image of `*`. Any overlap raises an `ArgumentError`; allowing it would produce rows with the same dimension named twice. Formulae merge from both sides; the left-hand side wins on conflict
+The two Namos must have **no** shared data dimensions — the precondition is the mirror image of `*`. Any overlap raises an `ArgumentError`; allowing it would produce rows with the same dimension named twice. Formulae merge from both sides; the left-hand side wins on conflict, and a data/formula name collision between the operands raises, as for `*`.
 
 The visual relationship is intentional: `*` is the filtered version, `**` is the explosive version — more sigil, more output.
 
@@ -644,6 +644,20 @@ sales[product: 'Widget'][:revenue, :quarter]
 
 Formulae carry through selection — a filtered Namo instance remembers its formulae.
 
+#### Projection of derived dimensions
+
+Naming a derived dimension in a projection asks for its values: they are computed against the source and stored in the result's rows, and the formula is dropped — the name is a data dimension of the result. Omitting it carries the formula live, recomputing from the result's own rows on every access:
+
+```ruby
+sales[:price, :quantity, :revenue].derived_dimensions
+# => [] — :revenue is stored values, a snapshot taken at projection
+
+sales[:price, :quantity].derived_dimensions
+# => [:revenue] — :revenue recomputes from the projected rows on every access
+```
+
+The projection list is the selector: name a derived dimension for a snapshot, omit it to keep it as computation. A carried formula whose inputs the projection dropped breaks on access — the same caveat as contracting away a formula's inputs.
+
 #### Cross-row formulae
 
 A formula's arity selects its calling convention. A proc with **one** parameter receives the row, as above. A proc with **two** parameters receives `(row, namo)`, where `namo` is the Namo the row belongs to — so the formula can reach beyond the current row to the rest of the collection. That's what cross-row computation needs: moving windows, ranks, running totals, anything whose value depends on the row's neighbours.
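The `(row, namo)` convention described in the README text above can be illustrated without the gem. The sketch below is a plain-Ruby stand-in, not Namo's implementation: the collection is just an array of hashes, and `rows`, `sma`, `full`, `narrowed`, and `windowed` are hypothetical names. It shows why the same expanding average gives different values when it is evaluated against a collection already narrowed to dates 2..3, which is the case the new windowing test in this release pins.

```ruby
# Plain-Ruby stand-in for a two-arity formula: the proc receives the row
# and the collection the row belongs to. All names here are hypothetical.
rows = [
  {symbol: 'AAA', date: 1, close: 10.0},
  {symbol: 'AAA', date: 2, close: 20.0},
  {symbol: 'AAA', date: 3, close: 30.0}
]

# Expanding average of :close over every row up to and including this one.
sma = ->(row, collection) do
  window = collection.select do |other|
    other[:symbol] == row[:symbol] && other[:date] <= row[:date]
  end
  window.sum{|other| other[:close]} / window.count.to_f
end

# Evaluated against the full collection.
full = rows.map{|row| sma.call(row, rows)}
# => [10.0, 15.0, 20.0]

# Evaluated against a collection narrowed to dates 2..3 first, so the
# window can no longer see date 1.
narrowed = rows.select{|row| (2..3).cover?(row[:date])}
windowed = narrowed.map{|row| sma.call(row, narrowed)}
# => [20.0, 25.0]
```

The second result differs because the window is built from whatever collection is passed in, so narrowing the collection narrows the window too.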
data/lib/Namo/VERSION.rb
CHANGED
data/lib/namo.rb
CHANGED
@@ -71,7 +71,8 @@ class Namo
         rows.map(&:to_h)
       end
     )
-
+    carried = positive.any? ? @formulae.reject{|name, _| positive.include?(name)} : @formulae.dup
+    self.class.new(projected, formulae: carried)
   end
 
   def []=(name, value)
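The carry rule in the hunk above comes down to a single `Hash#reject`. The sketch below isolates it in plain Ruby so the three cases can be compared side by side. `carry_formulae` is a hypothetical helper, not part of Namo, and `positive` is assumed to be the list of names requested in the projection, as the hunk suggests.

```ruby
# Hypothetical helper isolating the carry rule: formulae named in the
# projection are dropped (their values are stored as data instead), and
# the rest are kept. An empty projection list carries everything.
def carry_formulae(formulae, positive)
  return formulae.dup if positive.empty?
  formulae.reject{|name, _| positive.include?(name)}
end

formulae = {
  revenue: ->(row){row[:price] * row[:quantity]},
  double_revenue: ->(row){row[:revenue] * 2}
}

# :revenue is named, so it is dropped from the carried formulae.
named = carry_formulae(formulae, [:price, :quantity, :revenue]).keys
# => [:double_revenue]

# :revenue is omitted, so it stays live.
omitted = carry_formulae(formulae, [:price, :quantity]).keys
# => [:revenue, :double_revenue]

# No projection at all (selection-only): every formula is carried.
selection_only = carry_formulae(formulae, []).keys
# => [:revenue, :double_revenue]
```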
@@ -118,6 +119,7 @@ class Namo
   def *(other, &block)
     raise_unless_namo(other)
     raise_unless_shared_data_dimensions(other)
+    raise_unless_data_formula_exclusivity(other)
     shared = data_dimensions & other.data_dimensions
     combined_data = []
     @data.each do |left_row|
@@ -136,6 +138,7 @@ class Namo
   def **(other, &block)
     raise_unless_namo(other)
     raise_unless_disjoint_data_dimensions(other)
+    raise_unless_data_formula_exclusivity(other)
     combined_data = []
     @data.each do |left_row|
       if block
@@ -266,4 +269,11 @@ class Namo
       raise ArgumentError, "dimensions in common, need no common dimensions: #{data_dimensions} vs #{other.data_dimensions}"
     end
   end
+
+  def raise_unless_data_formula_exclusivity(other)
+    collisions = (data_dimensions & other.derived_dimensions) | (derived_dimensions & other.data_dimensions)
+    if collisions.any?
+      raise ArgumentError, "name collision between data and formulae: #{collisions.inspect}"
+    end
+  end
 end
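The new guard is two array intersections joined by a union. The sketch below reproduces that expression as a stand-alone function over plain arrays of names, so both directions and the resolved case can be checked without the gem. `data_formula_collisions` and the four name lists are hypothetical; they mirror the `audited` and `modelled` fixtures used in the tests.

```ruby
# Hypothetical stand-alone version of the guard's expression: each operand
# is described by its data dimension names and its derived dimension names.
def data_formula_collisions(left_data, left_derived, right_data, right_derived)
  (left_data & right_derived) | (left_derived & right_data)
end

audited_data = [:symbol, :margin]          # :margin is stored as data
audited_derived = []
modelled_data = [:symbol, :price, :cost]
modelled_derived = [:margin]               # :margin is defined by a formula

# The check is symmetric, so operand order does not matter.
collisions = data_formula_collisions(audited_data, audited_derived, modelled_data, modelled_derived)
reversed = data_formula_collisions(modelled_data, modelled_derived, audited_data, audited_derived)

# Dropping the data column first, as audited[-:margin] does in the README,
# leaves nothing to collide with.
resolved = data_formula_collisions(audited_data - [:margin], audited_derived, modelled_data, modelled_derived)
```

Shared data dimensions such as `:symbol` never appear in the result, because the expression only compares one side's data names with the other side's derived names.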
data/test/namo_test.rb
CHANGED
@@ -491,6 +491,36 @@ describe Namo do
       end
     end
 
+    context "range selection" do
+      it "selects rows whose value falls within the range" do
+        result = sales[price: 5.0..15.0]
+        _(result.to_a.count).must_equal 2
+        _(result.to_a.map{|row| row[:product]}).must_equal ['Widget', 'Widget']
+      end
+
+      it "supports beginless and endless ranges" do
+        _(sales[price: ..15.0].to_a.count).must_equal 2
+        _(sales[quantity: 100..].to_a.count).must_equal 2
+      end
+
+      it "composes with projection in a single call" do
+        result = sales[:product, :quantity, quantity: 50..120]
+        _(result.to_a).must_equal [
+          {product: 'Widget', quantity: 100},
+          {product: 'Gadget', quantity: 60}
+        ]
+      end
+
+      it "selects on a formula-defined dimension" do
+        sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
+        result = sales[revenue: 1200.0..]
+        _(result.to_a).must_equal [
+          {product: 'Widget', quarter: 'Q2', price: 10.0, quantity: 150},
+          {product: 'Gadget', quarter: 'Q2', price: 25.0, quantity: 60}
+        ]
+      end
+    end
+
     context "mixed proc and regex selection" do
       it "combines a proc and a regex across dimensions" do
         result = sales[product: /^W/, quantity: ->(v){v > 100}]
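The range tests above rely on ordinary Ruby range membership. The sketch below shows one plausible way such a selector could work, using `===`, which ranges, regexes, and procs all implement. This is an assumption for illustration only: the gem's actual matching lives in its Row class, which this diff does not show. `select_rows` is a hypothetical helper, and `sales_rows` is inferred from the expectations in the tests rather than copied from the real fixture. Beginless ranges need Ruby 2.7 or later.

```ruby
# Fixture inferred from the test expectations; the real `sales` fixture is
# defined outside this hunk.
sales_rows = [
  {product: 'Widget', quarter: 'Q1', price: 10.0, quantity: 100},
  {product: 'Widget', quarter: 'Q2', price: 10.0, quantity: 150},
  {product: 'Gadget', quarter: 'Q1', price: 25.0, quantity: 40},
  {product: 'Gadget', quarter: 'Q2', price: 25.0, quantity: 60}
]

# Hypothetical selector: keep the rows where every condition matches via ===.
# Range#=== tests membership, including for beginless and endless ranges.
def select_rows(rows, **conditions)
  rows.select do |row|
    conditions.all?{|name, matcher| matcher === row[name]}
  end
end

bounded = select_rows(sales_rows, price: (5.0..15.0))
beginless = select_rows(sales_rows, price: (..15.0))
endless = select_rows(sales_rows, quantity: (100..))
mixed = select_rows(sales_rows, product: /^W/, quantity: ->(v){v > 100})
```

With this fixture, `bounded`, `beginless`, and `endless` each hold two rows, matching the counts asserted in the tests, and `mixed` holds the single Widget Q2 row.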
@@ -767,6 +797,176 @@ describe Namo do
     end
   end
 
+  describe "data/formula exclusivity" do
+    context "projection" do
+      let(:price_data) do
+        [
+          {symbol: 'AAA', date: 1, close: 10.0},
+          {symbol: 'AAA', date: 2, close: 20.0},
+          {symbol: 'AAA', date: 3, close: 30.0},
+        ]
+      end
+
+      let(:prices) do
+        Namo.new(price_data)
+      end
+
+      let(:sma) do
+        ->(row, namo){
+          window = namo[symbol: row[:symbol], date: ->(d){d <= row[:date]}]
+          window.values(:close).sum / window.count.to_f
+        }
+      end
+
+      it "agrees across all access paths on a materialised dimension" do
+        prices[:sma] = sma
+        projected = prices[:date, :sma]
+        _(projected.values(:sma)).must_equal [10.0, 15.0, 20.0]
+        _(projected.first[:sma]).must_equal projected.values(:sma).first
+        _(projected[sma: ->(v){v > 12.0}].values(:date)).must_equal [2, 3]
+      end
+
+      it "lists a materialised dimension as data, not derived, exactly once" do
+        prices[:sma] = sma
+        projected = prices[:date, :sma]
+        _(projected.data_dimensions).must_include :sma
+        _(projected.derived_dimensions).wont_include :sma
+        _(projected.dimensions.count(:sma)).must_equal 1
+      end
+
+      it "carries a dependent formula not named in the projection, resolving off the materialised column" do
+        prices[:sma] = sma
+        prices[:double_sma] = ->(row){row[:sma] * 2}
+        projected = prices[:date, :sma]
+        _(projected.derived_dimensions).must_equal [:double_sma]
+        _(projected.values(:double_sma)).must_equal [20.0, 30.0, 40.0]
+      end
+
+      it "carries an omitted formula live, recomputing from the result's own rows" do
+        sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
+        projected = sales[:price, :quantity]
+        _(projected.derived_dimensions).must_equal [:revenue]
+        _(projected.values(:revenue)).must_equal [1000.0, 1500.0, 1000.0, 1500.0]
+        projected.data.first[:quantity] = 200
+        _(projected.values(:revenue).first).must_equal 2000.0
+      end
+
+      it "breaks on access when a carried formula's inputs were dropped (caveat emptor)" do
+        sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
+        projected = sales[:product]
+        _(projected.derived_dimensions).must_equal [:revenue]
+        _ { projected.values(:revenue) }.must_raise NoMethodError
+      end
+
+      it "materialises a two-arity formula windowed over the yielding Namo" do
+        prices[:sma] = sma
+        _(prices[:date, :sma].values(:sma)).must_equal [10.0, 15.0, 20.0]
+      end
+
+      it "windows a two-arity materialisation over a same-call selection" do
+        prices[:sma] = sma
+        projected = prices[:date, :sma, date: 2..3]
+        _(projected.values(:sma)).must_equal [20.0, 25.0]
+      end
+
+      it "carries all formulae through a selection-only call" do
+        sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
+        result = sales[price: ..15.0]
+        _(result.derived_dimensions).must_equal [:revenue]
+        _(result.values(:revenue)).must_equal [1000.0, 1500.0]
+      end
+
+      it "carries all formulae through contraction" do
+        sales[:revenue] = proc{|r| r[:price] * r[:quantity]}
+        result = sales[-:quarter]
+        _(result.derived_dimensions).must_equal [:revenue]
+        _(result.values(:revenue)).must_equal [1000.0, 1500.0, 1000.0, 1500.0]
+      end
+
+      it "returns pure materialised values and empty formulae when only derived names are projected" do
+        prices[:sma] = sma
+        projected = prices[:sma]
+        _(projected.to_a).must_equal [{sma: 10.0}, {sma: 15.0}, {sma: 20.0}]
+        _(projected.formulae).must_equal({})
+      end
+
+      it "returns an instance of self's class" do
+        subclass = Class.new(Namo)
+        namo = subclass.new([{x: 1}])
+        namo[:double] = ->(row){row[:x] * 2}
+        _(namo[:double].class).must_equal subclass
+      end
+    end
+
+    context "composition" do
+      let(:audited) do
+        Namo.new([
+          {symbol: 'BHP', margin: 0.3},
+          {symbol: 'RIO', margin: 0.25}
+        ])
+      end
+
+      let(:modelled) do
+        namo = Namo.new([
+          {symbol: 'BHP', price: 10.0, cost: 6.0},
+          {symbol: 'RIO', price: 20.0, cost: 16.0}
+        ])
+        namo[:margin] = proc{|r| (r[:price] - r[:cost]) / r[:price]}
+        namo
+      end
+
+      let(:audited_orders) do
+        Namo.new([{order: 'A', margin: 0.3}])
+      end
+
+      let(:modelled_tiers) do
+        namo = Namo.new([{tier: 'light', price: 10.0, cost: 6.0}])
+        namo[:margin] = proc{|r| (r[:price] - r[:cost]) / r[:price]}
+        namo
+      end
+
+      it "raises on * when self's data dimension is other's derived dimension" do
+        _ { audited * modelled }.must_raise ArgumentError
+      end
+
+      it "raises on * when self's derived dimension is other's data dimension" do
+        _ { modelled * audited }.must_raise ArgumentError
+      end
+
+      it "raises on ** when self's data dimension is other's derived dimension" do
+        _ { audited_orders ** modelled_tiers }.must_raise ArgumentError
+      end
+
+      it "raises on ** when self's derived dimension is other's data dimension" do
+        _ { modelled_tiers ** audited_orders }.must_raise ArgumentError
+      end
+
+      it "raises in the block forms of both operators" do
+        _ { audited.*(modelled){|row, candidates| candidates} }.must_raise ArgumentError
+        _ { audited_orders.**(modelled_tiers){|row, candidates| candidates} }.must_raise ArgumentError
+      end
+
+      it "names the colliding dimensions in the message" do
+        err = _ { audited * modelled }.must_raise ArgumentError
+        _(err.message).must_match(/name collision between data and formulae/)
+        _(err.message).must_include ':margin'
+      end
+
+      it "does not raise on a formula-vs-formula collision — left wins" do
+        left = Namo.new([{symbol: 'BHP', close: 42.5}])
+        right = Namo.new([{symbol: 'BHP', pe: 14.5}])
+        left[:margin] = proc{|r| :left}
+        right[:margin] = proc{|r| :right}
+        _((left * right).values(:margin)).must_equal [:left]
+      end
+
+      it "composes after explicit resolution by contraction" do
+        result = audited[-:margin] * modelled
+        _(result.values(:margin)).must_equal [0.4, 0.2]
+      end
+    end
+  end
+
   describe "#each" do
     it "yields Row objects" do
       rows = []
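The left-wins rule that the formula-vs-formula test pins can be pictured with `Hash#merge`, where the argument's entries override the receiver's. The diff does not show how the gem merges formulae, so the sketch below is only one way to get that behaviour. `left_formulae`, `right_formulae`, and `merged` are hypothetical names.

```ruby
# Hypothetical formula tables for the two operands of a composition.
left_formulae = {margin: ->(row){:left}}
right_formulae = {margin: ->(row){:right}, ratio: ->(row){:right_only}}

# Starting from the right-hand table and merging the left-hand table over it
# means the left-hand definition survives a name conflict.
merged = right_formulae.merge(left_formulae)

margin = merged[:margin].call({})
# => :left
ratio = merged[:ratio].call({})
# => :right_only
```

Formulae that exist on only one side pass through untouched, which matches the "merge from both sides" wording in the README.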