kumi 0.0.8 → 0.0.10
- checksums.yaml +4 -4
- data/CLAUDE.md +28 -44
- data/README.md +188 -108
- data/docs/AST.md +8 -1
- data/docs/FUNCTIONS.md +52 -8
- data/docs/compiler_design_principles.md +86 -0
- data/docs/features/README.md +22 -2
- data/docs/features/hierarchical-broadcasting.md +349 -0
- data/docs/features/javascript-transpiler.md +148 -0
- data/docs/features/performance.md +1 -3
- data/docs/features/s-expression-printer.md +77 -0
- data/docs/schema_metadata.md +7 -7
- data/examples/game_of_life.rb +2 -4
- data/lib/kumi/analyzer.rb +0 -2
- data/lib/kumi/compiler.rb +6 -275
- data/lib/kumi/core/analyzer/passes/broadcast_detector.rb +600 -42
- data/lib/kumi/core/analyzer/passes/input_collector.rb +4 -2
- data/lib/kumi/core/analyzer/passes/semantic_constraint_validator.rb +27 -0
- data/lib/kumi/core/analyzer/passes/type_checker.rb +6 -2
- data/lib/kumi/core/analyzer/passes/unsat_detector.rb +90 -46
- data/lib/kumi/core/cascade_executor_builder.rb +132 -0
- data/lib/kumi/core/compiler/expression_compiler.rb +146 -0
- data/lib/kumi/core/compiler/function_invoker.rb +55 -0
- data/lib/kumi/core/compiler/path_traversal_compiler.rb +158 -0
- data/lib/kumi/core/compiler/reference_compiler.rb +46 -0
- data/lib/kumi/core/compiler_base.rb +137 -0
- data/lib/kumi/core/explain.rb +2 -2
- data/lib/kumi/core/function_registry/collection_functions.rb +86 -3
- data/lib/kumi/core/function_registry/function_builder.rb +5 -3
- data/lib/kumi/core/function_registry/logical_functions.rb +171 -1
- data/lib/kumi/core/function_registry/stat_functions.rb +156 -0
- data/lib/kumi/core/function_registry.rb +32 -10
- data/lib/kumi/core/nested_structure_utils.rb +78 -0
- data/lib/kumi/core/ruby_parser/dsl_cascade_builder.rb +2 -2
- data/lib/kumi/core/ruby_parser/input_builder.rb +61 -8
- data/lib/kumi/core/schema_instance.rb +4 -0
- data/lib/kumi/core/vectorized_function_builder.rb +88 -0
- data/lib/kumi/errors.rb +2 -0
- data/lib/kumi/js/compiler.rb +878 -0
- data/lib/kumi/js/function_registry.rb +333 -0
- data/lib/kumi/js.rb +23 -0
- data/lib/kumi/registry.rb +61 -1
- data/lib/kumi/schema.rb +1 -1
- data/lib/kumi/support/s_expression_printer.rb +162 -0
- data/lib/kumi/syntax/array_expression.rb +6 -6
- data/lib/kumi/syntax/call_expression.rb +4 -4
- data/lib/kumi/syntax/cascade_expression.rb +4 -4
- data/lib/kumi/syntax/case_expression.rb +4 -4
- data/lib/kumi/syntax/declaration_reference.rb +4 -4
- data/lib/kumi/syntax/hash_expression.rb +4 -4
- data/lib/kumi/syntax/input_declaration.rb +6 -5
- data/lib/kumi/syntax/input_element_reference.rb +5 -5
- data/lib/kumi/syntax/input_reference.rb +5 -5
- data/lib/kumi/syntax/literal.rb +4 -4
- data/lib/kumi/syntax/node.rb +34 -34
- data/lib/kumi/syntax/root.rb +6 -6
- data/lib/kumi/syntax/trait_declaration.rb +4 -4
- data/lib/kumi/syntax/value_declaration.rb +4 -4
- data/lib/kumi/version.rb +1 -1
- data/lib/kumi.rb +1 -1
- data/scripts/analyze_broadcast_methods.rb +68 -0
- data/scripts/analyze_cascade_methods.rb +74 -0
- data/scripts/check_broadcasting_coverage.rb +51 -0
- data/scripts/find_dead_code.rb +114 -0
- metadata +22 -4
- data/docs/features/array-broadcasting.md +0 -170
- data/lib/kumi/cli.rb +0 -449
- data/lib/kumi/core/vectorization_metadata.rb +0 -110
data/docs/FUNCTIONS.md
CHANGED
@@ -10,6 +10,8 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
 * **Usage**: `fn(:and, boolean1, boolean2, ...)` → `boolean`
 * **`any?`**: Check if any element in collection is truthy
 * **Usage**: `fn(:any?, array(any) arg1)` → `boolean`
+* **`cascade_and`**: Element-wise AND for arrays with same nested structure
+* **Usage**: `fn(:cascade_and, boolean1, boolean2, ...)` → `boolean`
 * **`none?`**: Check if no elements in collection are truthy
 * **Usage**: `fn(:none?, array(any) arg1)` → `boolean`
 * **`not`**: Logical NOT
@@ -52,14 +54,14 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
 * **Usage**: `fn(:modulo, float arg1, float arg2)` → `float`
 * **`multiply`**: Multiply two numbers
 * **Usage**: `fn(:multiply, float arg1, float arg2)` → `float`
+* **`piecewise_sum`**: Accumulate over tiered ranges; returns [sum, marginal_rate]
+* **Usage**: `fn(:piecewise_sum, float arg1, array(float) arg2, array(float) arg3)` → `array(float)`
 * **`power`**: Raise first number to power of second
 * **Usage**: `fn(:power, float arg1, float arg2)` → `float`
 * **`round`**: Round number to specified precision
 * **Usage**: `fn(:round, float1, float2, ...)` → `float`
 * **`subtract`**: Subtract second number from first
 * **Usage**: `fn(:subtract, float arg1, float arg2)` → `float`
-* **`piecewise_sum`**: Accumulate over tiered ranges; returns [sum, marginal_rate]
-* **Usage**: `fn(:piecewise_sum, float arg1, array(float) arg2, array(float) arg3)` → `array(float)`
 
 ## String Functions
 
@@ -67,16 +69,22 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
 * **Usage**: `fn(:capitalize, string arg1)` → `string`
 * **`concat`**: Concatenate multiple strings
 * **Usage**: `fn(:concat, string1, string2, ...)` → `string`
+* **`contains?`**: Check if string contains substring
+* **Usage**: `fn(:contains?, string arg1, string arg2)` → `boolean`
 * **`downcase`**: Convert string to lowercase
 * **Usage**: `fn(:downcase, string arg1)` → `string`
 * **`end_with?`**: Check if string ends with suffix
 * **Usage**: `fn(:end_with?, string arg1, string arg2)` → `boolean`
-* **`
-* **Usage**: `fn(:
-* **`length`**: Get
-* **Usage**: `fn(:length,
+* **`includes?`**: Check if string contains substring
+* **Usage**: `fn(:includes?, string arg1, string arg2)` → `boolean`
+* **`length`**: Get string length
+* **Usage**: `fn(:length, string arg1)` → `integer`
 * **`start_with?`**: Check if string starts with prefix
 * **Usage**: `fn(:start_with?, string arg1, string arg2)` → `boolean`
+* **`string_include?`**: Check if string contains substring
+* **Usage**: `fn(:string_include?, string arg1, string arg2)` → `boolean`
+* **`string_length`**: Get string length
+* **Usage**: `fn(:string_length, string arg1)` → `integer`
 * **`strip`**: Remove leading and trailing whitespace
 * **Usage**: `fn(:strip, string arg1)` → `string`
 * **`upcase`**: Convert string to uppercase
@@ -84,20 +92,54 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
 
 ## Collection Functions
 
+* **`all_across`**: Check if all elements are truthy across all nested levels
+* **Usage**: `fn(:all_across, array(any) arg1)` → `boolean`
+* **`any_across`**: Check if any element is truthy across all nested levels
+* **Usage**: `fn(:any_across, array(any) arg1)` → `boolean`
+* **`avg_if`**: Average values where corresponding condition is true
+* **Usage**: `fn(:avg_if, array(float) arg1, array(boolean) arg2)` → `float`
+* **`build_array`**: Build array of given size with index values
+* **Usage**: `fn(:build_array, integer arg1)` → `array(any)`
+* **`count_across`**: Count total elements across all nested levels
+* **Usage**: `fn(:count_across, array(any) arg1)` → `integer`
+* **`count_if`**: Count number of true values in boolean array
+* **Usage**: `fn(:count_if, array(boolean) arg1)` → `integer`
+* **`each_slice`**: Group array elements into subarrays of given size
+* **Usage**: `fn(:each_slice, array arg1, integer arg2)` → `array(array)`
 * **`empty?`**: Check if collection is empty
 * **Usage**: `fn(:empty?, array(any) arg1)` → `boolean`
 * **`first`**: Get first element of collection
 * **Usage**: `fn(:first, array(any) arg1)` → `any`
+* **`flatten`**: Flatten nested arrays into a single array
+* **Usage**: `fn(:flatten, array(any) arg1)` → `array(any)`
+* **`flatten_deep`**: Recursively flatten all nested arrays (alias for flatten)
+* **Usage**: `fn(:flatten_deep, array(any) arg1)` → `array(any)`
+* **`flatten_one`**: Flatten nested arrays by one level only
+* **Usage**: `fn(:flatten_one, array(any) arg1)` → `array(any)`
 * **`include?`**: Check if collection includes element
 * **Usage**: `fn(:include?, array(any) arg1, any arg2)` → `boolean`
+* **`indices`**: Generate array of indices for the collection
+* **Usage**: `fn(:indices, array(any) arg1)` → `array(integer)`
+* **`join`**: Join array elements into string with separator
+* **Usage**: `fn(:join, array arg1, string arg2)` → `string`
 * **`last`**: Get last element of collection
 * **Usage**: `fn(:last, array(any) arg1)` → `any`
-* **`
-* **Usage**: `fn(:
+* **`map_add`**: Add value to each element
+* **Usage**: `fn(:map_add, array(float) arg1, float arg2)` → `array(float)`
+* **`map_conditional`**: Transform elements based on condition: if element == condition_value then true_value else false_value
+* **Usage**: `fn(:map_conditional, array arg1, any arg2, any arg3, any arg4)` → `array`
+* **`map_join_rows`**: Join 2D array into string with row and column separators
+* **Usage**: `fn(:map_join_rows, array(array) arg1, string arg2, string arg3)` → `string`
+* **`map_multiply`**: Multiply each element by factor
+* **Usage**: `fn(:map_multiply, array(float) arg1, float arg2)` → `array(float)`
+* **`map_with_index`**: Map collection elements to [element, index] pairs
+* **Usage**: `fn(:map_with_index, array(any) arg1)` → `array(any)`
 * **`max`**: Find maximum value in numeric collection
 * **Usage**: `fn(:max, array(float) arg1)` → `float`
 * **`min`**: Find minimum value in numeric collection
 * **Usage**: `fn(:min, array(float) arg1)` → `float`
+* **`range`**: Generate range of integers from start to finish (exclusive)
+* **Usage**: `fn(:range, integer arg1, integer arg2)` → `array(integer)`
 * **`reverse`**: Reverse collection order
 * **Usage**: `fn(:reverse, array(any) arg1)` → `array(any)`
 * **`size`**: Get collection size
@@ -106,6 +148,8 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
 * **Usage**: `fn(:sort, array(any) arg1)` → `array(any)`
 * **`sum`**: Sum all numeric elements in collection
 * **Usage**: `fn(:sum, array(float) arg1)` → `float`
+* **`sum_if`**: Sum values where corresponding condition is true
+* **Usage**: `fn(:sum_if, array(float) arg1, array(boolean) arg2)` → `float`
 * **`unique`**: Remove duplicate elements from collection
 * **Usage**: `fn(:unique, array(any) arg1)` → `array(any)`
 
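As a rough illustration of some of the collection helpers added in this release, the sketch below mirrors a few of them with Ruby's built-in `Array` methods. It does not call Kumi. The method names are hypothetical stand-ins, and it assumes that `flatten_one` flattens exactly one level and that `range` excludes its end value, as the descriptions above state. Kumi's own implementations may differ.

```ruby
# Plain-Ruby stand-ins for a few documented helpers (assumptions, not Kumi code).

# flatten / flatten_deep: recursively flatten every nesting level.
def sketch_flatten(array)
  array.flatten
end

# flatten_one: flatten a single nesting level only.
def sketch_flatten_one(array)
  array.flatten(1)
end

# range: integers from start up to, but not including, finish.
def sketch_range(start, finish)
  (start...finish).to_a
end

# each_slice: group elements into subarrays of the given size.
def sketch_each_slice(array, size)
  array.each_slice(size).to_a
end

# map_with_index: pair each element with its index.
def sketch_map_with_index(array)
  array.each_with_index.to_a
end

nested = [[1, [2, 3]], [4]]
p sketch_flatten(nested)      # [1, 2, 3, 4]
p sketch_flatten_one(nested)  # [1, [2, 3], 4]
p sketch_range(0, 4)          # [0, 1, 2, 3]
```

The difference between the two flatten variants only shows up when the input is nested more than one level deep, which is why the example uses a three-level array.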
data/docs/compiler_design_principles.md
ADDED
@@ -0,0 +1,86 @@
+# Compiler Design Principles
+
+## Core Principle: Smart Analyzer, Dumb Compiler
+
+The Kumi compiler follows a strict separation of concerns:
+
+### Analyzer Phase (Smart)
+- **Makes all decisions** about how operations should be executed
+- **Analyzes semantic context** to determine operation modes
+- **Pre-computes execution strategies** and stores in metadata
+- **Resolves complex logic** like nested array broadcasting, reduction flattening, etc.
+- **Produces complete instructions** for the compiler to follow
+
+### Compiler Phase (Dumb)
+- **Follows metadata instructions** without making decisions
+- **No conditional logic** based on data types, function types, or structure analysis
+- **Mechanically executes** the pre-computed strategy from analyzer
+- **Pure translation** from AST + metadata → executable functions
+
+## Examples
+
+### ❌ BAD: Compiler Making Decisions
+```ruby
+def compile_call(expr)
+  if Kumi::Registry.reducer?(expr.fn_name)
+    if nested_array_detected?(values)
+      # Compiler deciding to flatten
+      flatten_and_call(expr.fn_name, values)
+    end
+  end
+end
+```
+
+### ✅ GOOD: Compiler Following Metadata
+```ruby
+def compile_call(expr)
+  # Just read the pre-computed strategy
+  strategy = @analysis.metadata[:call_strategies][expr]
+  execute_strategy(strategy, expr)
+end
+```
+
+### ❌ BAD: Runtime Structure Analysis
+```ruby
+def vectorized_function_call(fn_name, values)
+  # Compiler analyzing structure at runtime
+  if values.any? { |v| deeply_nested?(v) }
+    apply_nested_broadcasting(fn, values)
+  end
+end
+```
+
+### ✅ GOOD: Pre-computed Broadcasting Plan
+```ruby
+def compile_element_field_reference(expr)
+  # Analyzer already determined the strategy
+  metadata = @analysis.state[:broadcasts][:nested_paths][expr.path]
+  traverse_nested_path(ctx, expr.path, metadata[:operation_mode])
+end
+```
+
+## Benefits
+
+1. **Predictable Performance**: No runtime analysis or decision-making
+2. **Easier Testing**: Compiler behavior determined entirely by metadata
+3. **Maintainable**: Complex logic isolated in analyzer passes
+4. **Extensible**: New features added by extending analyzer, not compiler
+5. **Debuggable**: All decisions visible in analyzer metadata
+
+## Implementation Pattern
+
+For any new compiler feature:
+
+1. **Analyzer Pass**: Analyze the requirement and store strategy in metadata
+2. **Metadata Schema**: Define clear data structure for the strategy
+3. **Compiler Method**: Read metadata and execute strategy mechanically
+4. **No Conditionals**: Avoid `if` statements based on runtime data in compiler
+
+## Metadata-Driven Architecture
+
+The compiler should be a pure **metadata interpreter**:
+- Input: AST + Analyzer Metadata
+- Output: Executable Functions
+- Process: Mechanical translation following metadata instructions
+
+This ensures the compiler remains simple, fast, and maintainable as the system grows in complexity.
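The "smart analyzer, dumb compiler" split described in the added document can be sketched in a few lines of plain Ruby. Everything here is hypothetical and for illustration only: `STRATEGIES`, `analyze`, and `compile` are invented names, not Kumi's API, and the metadata shape is an assumption. The point is only that the decision (map or reduce) is made once in the analysis step, and the compile step is a table lookup with no branching on the data.

```ruby
# Hypothetical sketch of a metadata-driven compiler (not Kumi's actual code).

# Each strategy is a mechanical recipe; the compiler never chooses between them.
STRATEGIES = {
  map:    ->(fn, values) { values.map { |v| fn.call(v) } },
  reduce: ->(fn, values) { fn.call(values.flatten) }
}.freeze

# "Analyzer": decides up front how each declaration should run and
# records that decision as metadata.
def analyze(declarations)
  declarations.to_h do |name, spec|
    [name, { strategy: spec[:reducer] ? :reduce : :map }]
  end
end

# "Compiler": reads the pre-computed strategy and returns an executable lambda.
def compile(name, fn, metadata)
  strategy = STRATEGIES.fetch(metadata.fetch(name).fetch(:strategy))
  ->(values) { strategy.call(fn, values) }
end

metadata = analyze({ total: { reducer: true }, doubled: { reducer: false } })
total    = compile(:total,   ->(xs) { xs.sum }, metadata)
doubled  = compile(:doubled, ->(x) { x * 2 },   metadata)

p total.call([[1, 2], [3]])  # 6
p doubled.call([1, 2, 3])    # [2, 4, 6]
```

Because the compile step only performs hash lookups, its behavior can be tested by feeding it hand-written metadata, which is the testing benefit the document lists.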
data/docs/features/README.md
CHANGED
@@ -30,13 +30,33 @@ Defines expected inputs with types and constraints.
 - Domain validation at runtime
 - Separates input metadata from business logic
 
+### [Hierarchical Broadcasting](hierarchical-broadcasting.md)
+Automatic vectorization over hierarchical data structures with dual access modes.
+
+- Object access for structured business data
+- Element access for multi-dimensional arrays
+- Mixed access modes in same schema
+
 ### [Performance](performance.md)
-
-Processes large schemas with optimized algorithms.
+Processes large schemas.
 
 - Result caching
 - Selective evaluation
 
+### [S-Expression Printer](s-expression-printer.md)
+Debug and inspect AST structures with readable S-expression notation output.
+
+- Visitor pattern implementation for all node types
+- Proper indentation and hierarchical structure
+- Useful for debugging schema parsing and AST analysis
+
+### [JavaScript Transpiler](javascript-transpiler.md)
+Transpiles compiled schemas to standalone JavaScript code.
+
+- Generates bundles with only required functions
+- Supports CommonJS and browser environments
+- Maintains identical behavior across platforms
+
 ## Integration
 
 - Type inference uses input declarations
data/docs/features/hierarchical-broadcasting.md
ADDED
@@ -0,0 +1,349 @@
+# Hierarchical Broadcasting
+
+Automatic vectorization of operations over hierarchical data structures with two complementary access modes for different use cases.
+
+## Overview
+
+The hierarchical broadcasting system enables natural field access syntax that automatically applies operations element-wise across complex nested data structures, with intelligent detection of map vs reduce operations and support for both structured objects and raw multi-dimensional arrays.
+
+## Core Mechanism
+
+The system uses a three-stage pipeline:
+
+1. **Parser** - Creates InputElementReference AST nodes for nested field access
+2. **BroadcastDetector** - Identifies which operations should be vectorized vs scalar
+3. **Compiler** - Generates appropriate map/reduce functions based on usage context
+
+## Access Modes
+
+The system supports two complementary access modes that can be mixed within the same schema:
+
+### Object Access Mode (Default)
+For structured business data with named fields:
+
+```ruby
+input do
+  array :users do
+    string :name
+    integer :age
+  end
+end
+
+value :user_names, input.users.name # Broadcasts over structured objects
+trait :adults, input.users.age >= 18 # Boolean array from age comparison
+```
+
+### Element Access Mode
+For multi-dimensional raw arrays using `element` syntax with **progressive path traversal**:
+
+```ruby
+input do
+  array :cube do
+    element :array, :layer do
+      element :array, :row do
+        element :integer, :cell
+      end
+    end
+  end
+end
+
+# Progressive path traversal - each level goes deeper
+value :dimensions, [
+  fn(:size, input.cube), # Number of layers
+  fn(:size, input.cube.layer), # Total matrices across all layers
+  fn(:size, input.cube.layer.row), # Total rows across all matrices
+  fn(:size, input.cube.layer.row.cell) # Total cells (leaf elements)
+]
+
+# Operations at different dimensional levels
+value :layer_data, input.cube.layer # 3D matrices
+value :row_data, input.cube.layer.row # 2D rows
+value :cell_data, input.cube.layer.row.cell # 1D values
+```
+
+**Key Benefits of Progressive Path Traversal:**
+- **Intuitive syntax**: `input.cube.layer.row` naturally represents "rows within layers within cube"
+- **Direct dimensional access**: No need for complex flattening operations for basic dimensional analysis
+- **Ranked polymorphism**: Same operations work across different dimensional arrays
+- **Clean code**: `fn(:size, input.cube.layer.row)` instead of `fn(:size, fn(:flatten_one, input.cube.layer))`
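To make the idea of counting at successive nesting depths concrete, here is a minimal plain-Ruby sketch over an ordinary nested array. It does not use Kumi, `size_at_depth` is a hypothetical helper, and how Kumi maps each `input.cube...` path onto a depth is not verified here. The sketch only shows the flatten-then-count computation that the "Clean code" bullet says the path syntax replaces.

```ruby
# Count the elements found after flattening `depth` levels of nesting.
# Depth 0 counts the outermost elements; each extra level looks one
# dimension deeper. (Hypothetical helper, not part of Kumi.)
def size_at_depth(array, depth)
  array.flatten(depth).size
end

# A 2 x 2 x 3 cube: 2 layers, each with 2 rows of 3 cells.
cube = [
  [[1, 2, 3], [4, 5, 6]],
  [[7, 8, 9], [10, 11, 12]]
]

p size_at_depth(cube, 0) # 2  (layers)
p size_at_depth(cube, 1) # 4  (rows across all layers)
p size_at_depth(cube, 2) # 12 (cells across all rows)
```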
+
+## Business Use Cases
+
+Element access mode is essential for common business scenarios involving simple nested arrays:
+
+```ruby
+# E-commerce: Product availability analysis
+input do
+  array :products do
+    string :name
+    array :daily_sales do # Simple array: [12, 8, 15, 3]
+      element :integer, :units
+    end
+  end
+end
+
+# Create flags for business rules
+trait :had_slow_days, fn(:any?, input.products.daily_sales.units < 5)
+trait :consistently_strong, fn(:all?, input.products.daily_sales.units >= 10)
+
+# Progressive analysis
+value :sales_summary, [
+  fn(:size, input.products), # Number of products
+  fn(:size, input.products.daily_sales.units), # Total sales data points
+  fn(:sum, fn(:flatten, input.products.daily_sales.units)) # Total units sold
+]
+```
+
+### Mixed Access Modes
+Both modes work together seamlessly:
+
+```ruby
+input do
+  array :users do # Object access
+    string :name
+    integer :age
+  end
+  array :activity_logs do # Element access
+    element :integer, :day
+  end
+end
+
+value :user_count, fn(:size, input.users)
+value :total_activity_days, fn(:size, fn(:flatten, input.activity_logs.day))
+```
+
+## Basic Broadcasting (Object Access)
+
+```ruby
+schema do
+  input do
+    array :line_items do
+      float :price
+      integer :quantity
+      string :category
+    end
+    float :tax_rate, type: :float
+  end
+
+  # Element-wise computation - broadcasts over each item
+  value :subtotals, input.line_items.price * input.line_items.quantity
+
+  # Element-wise traits - applied to each item
+  trait :is_taxable, (input.line_items.category != "digital")
+
+  # Conditional logic - element-wise evaluation
+  value :taxes, fn(:if, is_taxable, subtotals * input.tax_rate, 0.0)
+end
+```
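For readers unfamiliar with broadcasting, the following plain-Ruby sketch spells out what the schema above is described as computing: one result per line item, with the scalar `tax_rate` applied to every element. It is written without Kumi, and the method names and sample data are invented for illustration.

```ruby
# Hand-written equivalents of the broadcast declarations (illustrative only).

# value :subtotals  -> price * quantity for each item
def subtotals(items)
  items.map { |item| item[:price] * item[:quantity] }
end

# trait :is_taxable -> one boolean per item
def taxable_flags(items)
  items.map { |item| item[:category] != "digital" }
end

# value :taxes      -> element-wise conditional; the scalar tax_rate is
# reused for every element, 0.0 is the fallback for non-taxable items.
def taxes(items, tax_rate)
  subtotals(items).zip(taxable_flags(items)).map do |subtotal, taxable|
    taxable ? subtotal * tax_rate : 0.0
  end
end

line_items = [
  { price: 100.0, quantity: 2, category: "physical" },
  { price: 50.0,  quantity: 1, category: "digital" }
]

p subtotals(line_items)      # [200.0, 50.0]
p taxable_flags(line_items)  # [true, false]
p taxes(line_items, 0.5)     # [100.0, 0.0]
```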
+
+## Aggregation Operations
+
+Operations that consume arrays to produce scalars are automatically detected:
+
+```ruby
+schema do
+  # These aggregate the vectorized results
+  value :total_subtotal, fn(:sum, subtotals)
+  value :total_tax, fn(:sum, taxes)
+  value :grand_total, total_subtotal + total_tax
+
+  # Statistics over arrays
+  value :avg_price, fn(:avg, input.line_items.price)
+  value :max_quantity, fn(:max, input.line_items.quantity)
+end
+```
+
+## Field Access Nesting
+
+Supports arbitrary depth field access with path building:
+
+```ruby
+schema do
+  input do
+    array :orders do
+      array :items do
+        hash :product do
+          string :name
+          float :base_price
+        end
+        integer :quantity
+      end
+    end
+  end
+
+  # Deep field access - automatically broadcasts over nested arrays
+  value :all_product_names, input.orders.items.product.name
+  value :total_values, input.orders.items.product.base_price * input.orders.items.quantity
+end
+```
+
+## Type Inference
+
+The type system automatically infers appropriate types for broadcasted operations:
+
+- `input.items.price` (float array) → inferred as `:float` per element
+- `input.items.price * input.items.quantity` → element-wise `:float` result
+- `fn(:sum, input.items.price)` → scalar `:float` result
+
+## Implementation Details
+
+### Parser Layer
+- **InputFieldProxy** - Handles `input.field.subfield...` with path building
+- **InputElementReference** - AST node representing array field access paths
+
+### Analysis Layer
+- **BroadcastDetector** - Identifies vectorized vs scalar operations
+- **TypeInferencer** - Infers types for array element access patterns
+
+### Compilation Layer
+- **Automatic Dispatch** - Maps element-wise operations to array map functions
+- **Reduction Detection** - Converts aggregation functions to array reduce operations
+
+## Usage Patterns
+
+### Element-wise Operations
+```ruby
+# All of these broadcast element-wise
+value :discounted_prices, input.items.price * 0.9
+trait :expensive, (input.items.price > 100.0)
+value :categories, input.items.category
+```
+
+### Aggregation Operations
+```ruby
+# These consume arrays to produce scalars
+value :item_count, fn(:size, input.items)
+value :total_price, fn(:sum, input.items.price)
+value :has_expensive, fn(:any?, expensive)
+```
+
+### Mixed Operations
+```ruby
+# Element-wise computation followed by aggregation
+value :line_totals, input.items.price * input.items.quantity
+value :order_total, fn(:sum, line_totals)
+value :avg_line_total, fn(:avg, line_totals)
+```
+
+### Conditional Aggregation Functions
+
+Kumi provides powerful conditional aggregation functions that make working with vectorized traits clean and intuitive:
+
+```ruby
+# Step 1: Create vectorized values and traits
+value :revenues, input.sales.price * input.sales.quantity # → [3000.0, 2000.0, 1250.0]
+trait :expensive, input.sales.price > 100 # → [true, true, false]
+trait :bulk_order, input.sales.quantity >= 15 # → [false, false, true]
+
+# Step 2: Use conditional aggregation functions (NEW - clean and readable)
+value :expensive_count, fn(:count_if, expensive) # → 2
+value :expensive_total, fn(:sum_if, revenues, expensive) # → 5000.0
+value :expensive_average, fn(:avg_if, revenues, expensive) # → 2500.0
+
+# Step 3: Combine conditions for complex filtering
+trait :expensive_bulk, expensive & bulk_order # → [false, false, false]
+value :expensive_bulk_total, fn(:sum_if, revenues, expensive_bulk) # → 0.0
+```
+
+**Old cascade pattern** (verbose, harder to read):
+```ruby
+# OLD WAY - required verbose cascade patterns
+value :expensive_amounts do
+  on expensive, revenues
+  base 0.0
+end
+value :expensive_total, fn(:sum, expensive_amounts)
+
+value :expensive_count_markers do
+  on expensive, 1.0
+  base 0.0
+end
+value :expensive_count, fn(:sum, expensive_count_markers)
+```
+
+**New conditional functions** (clean, direct):
+```ruby
+# NEW WAY - direct and readable
+value :expensive_total, fn(:sum_if, revenues, expensive)
+value :expensive_count, fn(:count_if, expensive)
+value :expensive_average, fn(:avg_if, revenues, expensive)
+```
+
+The conditional aggregation functions are now the idiomatic way to work with boolean arrays in Kumi.
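The semantics of `count_if`, `sum_if`, and `avg_if` as documented above can be sketched in plain Ruby. These are stand-in definitions written for illustration, not Kumi's implementations. In particular, returning `nil` from `avg_if` when no flag is true is an assumption, since the diff does not say what Kumi does in that case.

```ruby
# Plain-Ruby sketches of the conditional aggregations (not Kumi's code).

# Count the true entries in a boolean array.
def count_if(flags)
  flags.count(true)
end

# Sum the values whose matching flag is true.
def sum_if(values, flags)
  values.zip(flags).sum(0.0) { |value, flag| flag ? value : 0.0 }
end

# Average the values whose matching flag is true.
# Assumption: nil when nothing is selected.
def avg_if(values, flags)
  selected = values.zip(flags).select { |_, flag| flag }.map(&:first)
  selected.empty? ? nil : selected.sum / selected.size
end

revenues  = [3000.0, 2000.0, 1250.0]
expensive = [true, true, false]

p count_if(expensive)          # 2
p sum_if(revenues, expensive)  # 5000.0
p avg_if(revenues, expensive)  # 2500.0
```

The inputs are the sample arrays from the documentation's own comments, so the printed results line up with the `# →` values shown there.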
+
+## Common Patterns and Gotchas
+
+### Operator Precedence with Mixed Arithmetic
+
+**Problem**: Complex arithmetic expressions with arrays can fail due to precedence parsing:
+
+```ruby
+# This fails with confusing error message
+value :ones, input.items.price * 0 + 1
+# Error: "no implicit conversion of Integer into Array"
+
+# The expression is parsed as: (input.items.price * 0) + 1
+# Which becomes: [0.0, 0.0, 0.0] + 1 (array + scalar in wrong context)
+```
+
+**Solutions**:
+
+```ruby
+# Use explicit function calls
+value :ones, fn(:add, input.items.price * 0, 1)
+
+# Use cascade pattern (most idiomatic)
+trait :always_true, input.items.price >= 0
+value :ones do
+  on always_true, 1.0
+  base 0.0
+end
+
+# Use separate steps
+value :zeros, input.items.price * 0
+value :ones, zeros + 1
+```
+
+The cascade pattern is preferred as it's more explicit about intent and leverages Kumi's automatic scalar broadcasting.
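The error text quoted in this gotcha is the same message plain Ruby raises when `Array#+` receives an Integer, because `Array#+` means concatenation rather than element-wise addition. The sketch below reproduces that in isolation. It does not show Kumi internals, and whether Kumi reaches `Array#+` on this code path is an inference from the message, not something the diff confirms. The helper names are hypothetical.

```ruby
# Array#+ concatenates arrays, so a scalar right-hand side is rejected.
def array_plus_error(array, scalar)
  array + scalar
rescue TypeError => e
  e.message
end

# Element-wise addition has to be spelled out with map.
def add_scalar(array, scalar)
  array.map { |x| x + scalar }
end

zeros = [0.0, 0.0, 0.0]

puts array_plus_error(zeros, 1) # no implicit conversion of Integer into Array
p add_scalar(zeros, 1)          # [1.0, 1.0, 1.0]
```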
+
+## Error Handling
+
+### Dimension Mismatch Detection
+
+Array broadcasting operations are only valid within the same array source. Attempting to broadcast across different arrays generates detailed error messages:
+
+```ruby
+schema do
+  input do
+    array :items do
+      string :name
+    end
+    array :logs do
+      string :user_name
+    end
+  end
+
+  # This will generate a dimension mismatch error
+  trait :same_name, input.items.name == input.logs.user_name
+end
+
+# Error:
+# Cannot broadcast operation across arrays from different sources: items, logs.
+# Problem: Multiple operands are arrays from different sources:
+# - Operand 1 resolves to array(string) from array 'items'
+# - Operand 2 resolves to array(string) from array 'logs'
+# Direct operations on arrays from different sources is ambiguous and not supported.
+# Vectorized operations can only work on fields from the same array input.
+```
+
+The error messages provide:
+- **Quick Summary**: Identifies the conflicting array sources
+- **Type Information**: Shows the resolved types of each operand
+- **Clear Explanation**: Why the operation is ambiguous and not supported
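A same-source check of the kind described here could look roughly like the sketch below. It is a simplified, hypothetical illustration: `Operand`, `check_same_source!`, and the use of `ArgumentError` are invented for the example, and Kumi's real detection lives in its analyzer passes and is likely more involved. The sketch treats the first path segment as the array source and only borrows the first line of the documented error message.

```ruby
# Hypothetical operand record: path is e.g. [:items, :name].
Operand = Struct.new(:path)

# Returns the shared source when all operands come from the same array,
# otherwise raises with a summary naming the conflicting sources.
def check_same_source!(operands)
  sources = operands.map { |operand| operand.path.first }.uniq
  return sources.first if sources.size <= 1

  raise ArgumentError,
        "Cannot broadcast operation across arrays from different sources: " \
        "#{sources.join(', ')}."
end

same  = [Operand.new(%i[items name]), Operand.new(%i[items price])]
mixed = [Operand.new(%i[items name]), Operand.new(%i[logs user_name])]

p check_same_source!(same) # :items

begin
  check_same_source!(mixed)
rescue ArgumentError => e
  puts e.message
end
```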
+
+## Performance Characteristics
+
+- **Single Pass** - Each array is traversed once per computation chain
+- **Lazy Evaluation** - Operations are composed into pipelines
+- **Memory Efficient** - No intermediate array allocations for simple operations
+- **Type Safe** - Full compile-time type checking for array element operations