kumi 0.0.9 → 0.0.10

Files changed (67)
  1. checksums.yaml +4 -4
  2. data/CLAUDE.md +28 -44
  3. data/README.md +187 -120
  4. data/docs/AST.md +1 -1
  5. data/docs/FUNCTIONS.md +52 -8
  6. data/docs/compiler_design_principles.md +86 -0
  7. data/docs/features/README.md +15 -2
  8. data/docs/features/hierarchical-broadcasting.md +349 -0
  9. data/docs/features/javascript-transpiler.md +148 -0
  10. data/docs/features/performance.md +1 -3
  11. data/docs/schema_metadata.md +7 -7
  12. data/examples/game_of_life.rb +2 -4
  13. data/lib/kumi/analyzer.rb +0 -2
  14. data/lib/kumi/compiler.rb +6 -275
  15. data/lib/kumi/core/analyzer/passes/broadcast_detector.rb +600 -42
  16. data/lib/kumi/core/analyzer/passes/input_collector.rb +4 -2
  17. data/lib/kumi/core/analyzer/passes/semantic_constraint_validator.rb +27 -0
  18. data/lib/kumi/core/analyzer/passes/type_checker.rb +6 -2
  19. data/lib/kumi/core/analyzer/passes/unsat_detector.rb +90 -46
  20. data/lib/kumi/core/cascade_executor_builder.rb +132 -0
  21. data/lib/kumi/core/compiler/expression_compiler.rb +146 -0
  22. data/lib/kumi/core/compiler/function_invoker.rb +55 -0
  23. data/lib/kumi/core/compiler/path_traversal_compiler.rb +158 -0
  24. data/lib/kumi/core/compiler/reference_compiler.rb +46 -0
  25. data/lib/kumi/core/compiler_base.rb +137 -0
  26. data/lib/kumi/core/explain.rb +2 -2
  27. data/lib/kumi/core/function_registry/collection_functions.rb +86 -3
  28. data/lib/kumi/core/function_registry/function_builder.rb +5 -3
  29. data/lib/kumi/core/function_registry/logical_functions.rb +171 -1
  30. data/lib/kumi/core/function_registry/stat_functions.rb +156 -0
  31. data/lib/kumi/core/function_registry.rb +32 -10
  32. data/lib/kumi/core/nested_structure_utils.rb +78 -0
  33. data/lib/kumi/core/ruby_parser/dsl_cascade_builder.rb +2 -2
  34. data/lib/kumi/core/ruby_parser/input_builder.rb +61 -8
  35. data/lib/kumi/core/schema_instance.rb +4 -0
  36. data/lib/kumi/core/vectorized_function_builder.rb +88 -0
  37. data/lib/kumi/errors.rb +2 -0
  38. data/lib/kumi/js/compiler.rb +878 -0
  39. data/lib/kumi/js/function_registry.rb +333 -0
  40. data/lib/kumi/js.rb +23 -0
  41. data/lib/kumi/registry.rb +61 -1
  42. data/lib/kumi/schema.rb +1 -1
  43. data/lib/kumi/support/s_expression_printer.rb +16 -15
  44. data/lib/kumi/syntax/array_expression.rb +6 -6
  45. data/lib/kumi/syntax/call_expression.rb +4 -4
  46. data/lib/kumi/syntax/cascade_expression.rb +4 -4
  47. data/lib/kumi/syntax/case_expression.rb +4 -4
  48. data/lib/kumi/syntax/declaration_reference.rb +4 -4
  49. data/lib/kumi/syntax/hash_expression.rb +4 -4
  50. data/lib/kumi/syntax/input_declaration.rb +6 -5
  51. data/lib/kumi/syntax/input_element_reference.rb +5 -5
  52. data/lib/kumi/syntax/input_reference.rb +5 -5
  53. data/lib/kumi/syntax/literal.rb +4 -4
  54. data/lib/kumi/syntax/node.rb +34 -34
  55. data/lib/kumi/syntax/root.rb +6 -6
  56. data/lib/kumi/syntax/trait_declaration.rb +4 -4
  57. data/lib/kumi/syntax/value_declaration.rb +4 -4
  58. data/lib/kumi/version.rb +1 -1
  59. data/lib/kumi.rb +1 -1
  60. data/scripts/analyze_broadcast_methods.rb +68 -0
  61. data/scripts/analyze_cascade_methods.rb +74 -0
  62. data/scripts/check_broadcasting_coverage.rb +51 -0
  63. data/scripts/find_dead_code.rb +114 -0
  64. metadata +20 -4
  65. data/docs/features/array-broadcasting.md +0 -170
  66. data/lib/kumi/cli.rb +0 -449
  67. data/lib/kumi/core/vectorization_metadata.rb +0 -110
data/docs/FUNCTIONS.md CHANGED
@@ -10,6 +10,8 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
10
10
  * **Usage**: `fn(:and, boolean1, boolean2, ...)` → `boolean`
11
11
  * **`any?`**: Check if any element in collection is truthy
12
12
  * **Usage**: `fn(:any?, array(any) arg1)` → `boolean`
13
+ * **`cascade_and`**: Element-wise AND for arrays with the same nested structure
14
+ * **Usage**: `fn(:cascade_and, boolean1, boolean2, ...)` → `boolean`
13
15
  * **`none?`**: Check if no elements in collection are truthy
14
16
  * **Usage**: `fn(:none?, array(any) arg1)` → `boolean`
15
17
  * **`not`**: Logical NOT
@@ -52,14 +54,14 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
52
54
  * **Usage**: `fn(:modulo, float arg1, float arg2)` → `float`
53
55
  * **`multiply`**: Multiply two numbers
54
56
  * **Usage**: `fn(:multiply, float arg1, float arg2)` → `float`
57
+ * **`piecewise_sum`**: Accumulate over tiered ranges; returns [sum, marginal_rate]
58
+ * **Usage**: `fn(:piecewise_sum, float arg1, array(float) arg2, array(float) arg3)` → `array(float)`
55
59
  * **`power`**: Raise first number to power of second
56
60
  * **Usage**: `fn(:power, float arg1, float arg2)` → `float`
57
61
  * **`round`**: Round number to specified precision
58
62
  * **Usage**: `fn(:round, float1, float2, ...)` → `float`
59
63
  * **`subtract`**: Subtract second number from first
60
64
  * **Usage**: `fn(:subtract, float arg1, float arg2)` → `float`
61
- * **`piecewise_sum`**: Accumulate over tiered ranges; returns [sum, marginal_rate]
62
- * **Usage**: `fn(:piecewise_sum, float arg1, array(float) arg2, array(float) arg3)` → `array(float)`
63
65
 
64
66
  ## String Functions
65
67
 
@@ -67,16 +69,22 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
67
69
  * **Usage**: `fn(:capitalize, string arg1)` → `string`
68
70
  * **`concat`**: Concatenate multiple strings
69
71
  * **Usage**: `fn(:concat, string1, string2, ...)` → `string`
72
+ * **`contains?`**: Check if string contains substring
73
+ * **Usage**: `fn(:contains?, string arg1, string arg2)` → `boolean`
70
74
  * **`downcase`**: Convert string to lowercase
71
75
  * **Usage**: `fn(:downcase, string arg1)` → `string`
72
76
  * **`end_with?`**: Check if string ends with suffix
73
77
  * **Usage**: `fn(:end_with?, string arg1, string arg2)` → `boolean`
74
- * **`include?`**: Check if collection includes element
75
- * **Usage**: `fn(:include?, array(any) arg1, any arg2)` → `boolean`
76
- * **`length`**: Get collection length
77
- * **Usage**: `fn(:length, array(any) arg1)` → `integer`
78
+ * **`includes?`**: Check if string contains substring
79
+ * **Usage**: `fn(:includes?, string arg1, string arg2)` → `boolean`
80
+ * **`length`**: Get string length
81
+ * **Usage**: `fn(:length, string arg1)` → `integer`
78
82
  * **`start_with?`**: Check if string starts with prefix
79
83
  * **Usage**: `fn(:start_with?, string arg1, string arg2)` → `boolean`
84
+ * **`string_include?`**: Check if string contains substring
85
+ * **Usage**: `fn(:string_include?, string arg1, string arg2)` → `boolean`
86
+ * **`string_length`**: Get string length
87
+ * **Usage**: `fn(:string_length, string arg1)` → `integer`
80
88
  * **`strip`**: Remove leading and trailing whitespace
81
89
  * **Usage**: `fn(:strip, string arg1)` → `string`
82
90
  * **`upcase`**: Convert string to uppercase
@@ -84,20 +92,54 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
84
92
 
85
93
  ## Collection Functions
86
94
 
95
+ * **`all_across`**: Check if all elements are truthy across all nested levels
96
+ * **Usage**: `fn(:all_across, array(any) arg1)` → `boolean`
97
+ * **`any_across`**: Check if any element is truthy across all nested levels
98
+ * **Usage**: `fn(:any_across, array(any) arg1)` → `boolean`
99
+ * **`avg_if`**: Average values where corresponding condition is true
100
+ * **Usage**: `fn(:avg_if, array(float) arg1, array(boolean) arg2)` → `float`
101
+ * **`build_array`**: Build array of given size with index values
102
+ * **Usage**: `fn(:build_array, integer arg1)` → `array(any)`
103
+ * **`count_across`**: Count total elements across all nested levels
104
+ * **Usage**: `fn(:count_across, array(any) arg1)` → `integer`
105
+ * **`count_if`**: Count number of true values in boolean array
106
+ * **Usage**: `fn(:count_if, array(boolean) arg1)` → `integer`
107
+ * **`each_slice`**: Group array elements into subarrays of given size
108
+ * **Usage**: `fn(:each_slice, array arg1, integer arg2)` → `array(array)`
87
109
  * **`empty?`**: Check if collection is empty
88
110
  * **Usage**: `fn(:empty?, array(any) arg1)` → `boolean`
89
111
  * **`first`**: Get first element of collection
90
112
  * **Usage**: `fn(:first, array(any) arg1)` → `any`
113
+ * **`flatten`**: Flatten nested arrays into a single array
114
+ * **Usage**: `fn(:flatten, array(any) arg1)` → `array(any)`
115
+ * **`flatten_deep`**: Recursively flatten all nested arrays (alias for flatten)
116
+ * **Usage**: `fn(:flatten_deep, array(any) arg1)` → `array(any)`
117
+ * **`flatten_one`**: Flatten nested arrays by one level only
118
+ * **Usage**: `fn(:flatten_one, array(any) arg1)` → `array(any)`
91
119
  * **`include?`**: Check if collection includes element
92
120
  * **Usage**: `fn(:include?, array(any) arg1, any arg2)` → `boolean`
121
+ * **`indices`**: Generate array of indices for the collection
122
+ * **Usage**: `fn(:indices, array(any) arg1)` → `array(integer)`
123
+ * **`join`**: Join array elements into string with separator
124
+ * **Usage**: `fn(:join, array arg1, string arg2)` → `string`
93
125
  * **`last`**: Get last element of collection
94
126
  * **Usage**: `fn(:last, array(any) arg1)` → `any`
95
- * **`length`**: Get collection length
96
- * **Usage**: `fn(:length, array(any) arg1)` → `integer`
127
+ * **`map_add`**: Add value to each element
128
+ * **Usage**: `fn(:map_add, array(float) arg1, float arg2)` → `array(float)`
129
+ * **`map_conditional`**: Transform elements based on condition: if element == condition_value then true_value else false_value
130
+ * **Usage**: `fn(:map_conditional, array arg1, any arg2, any arg3, any arg4)` → `array`
131
+ * **`map_join_rows`**: Join 2D array into string with row and column separators
132
+ * **Usage**: `fn(:map_join_rows, array(array) arg1, string arg2, string arg3)` → `string`
133
+ * **`map_multiply`**: Multiply each element by factor
134
+ * **Usage**: `fn(:map_multiply, array(float) arg1, float arg2)` → `array(float)`
135
+ * **`map_with_index`**: Map collection elements to [element, index] pairs
136
+ * **Usage**: `fn(:map_with_index, array(any) arg1)` → `array(any)`
97
137
  * **`max`**: Find maximum value in numeric collection
98
138
  * **Usage**: `fn(:max, array(float) arg1)` → `float`
99
139
  * **`min`**: Find minimum value in numeric collection
100
140
  * **Usage**: `fn(:min, array(float) arg1)` → `float`
141
+ * **`range`**: Generate range of integers from start to finish (exclusive)
142
+ * **Usage**: `fn(:range, integer arg1, integer arg2)` → `array(integer)`
101
143
  * **`reverse`**: Reverse collection order
102
144
  * **Usage**: `fn(:reverse, array(any) arg1)` → `array(any)`
103
145
  * **`size`**: Get collection size
@@ -106,6 +148,8 @@ Kumi provides a rich library of built-in functions for use within `value` and `t
106
148
  * **Usage**: `fn(:sort, array(any) arg1)` → `array(any)`
107
149
  * **`sum`**: Sum all numeric elements in collection
108
150
  * **Usage**: `fn(:sum, array(float) arg1)` → `float`
151
+ * **`sum_if`**: Sum values where corresponding condition is true
152
+ * **Usage**: `fn(:sum_if, array(float) arg1, array(boolean) arg2)` → `float`
109
153
  * **`unique`**: Remove duplicate elements from collection
110
154
  * **Usage**: `fn(:unique, array(any) arg1)` → `array(any)`
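Several of the newly documented collection helpers read like thin wrappers over Ruby core methods. The sketch below is plain Ruby, not Kumi's implementation: the `SKETCH` table and its lambdas are hypothetical names that mirror only the one-line descriptions in the list above, so edge-case behavior in Kumi may differ.

```ruby
# Plain-Ruby approximations of some collection functions documented above.
# SKETCH is a hypothetical name; Kumi's real implementations may differ.
SKETCH = {
  flatten:         ->(arr) { arr.flatten },                       # all nesting levels
  flatten_one:     ->(arr) { arr.flatten(1) },                    # one level only
  each_slice:      ->(arr, n) { arr.each_slice(n).to_a },         # subarrays of size n
  range:           ->(start, finish) { (start...finish).to_a },   # finish is excluded
  map_with_index:  ->(arr) { arr.each_with_index.to_a },          # [element, index] pairs
  map_conditional: ->(arr, cond, if_true, if_false) {
    arr.map { |e| e == cond ? if_true : if_false }
  }
}.freeze

nested   = [[1, [2, 3]], [4]]
flat_all = SKETCH[:flatten].call(nested)      # every level removed
flat_one = SKETCH[:flatten_one].call(nested)  # inner [2, 3] survives
```

The difference between `flatten` and `flatten_one` only shows up on input nested more than two levels deep, which is why the sample data contains `[2, 3]` inside an inner array.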
111
155
 
data/docs/compiler_design_principles.md ADDED
@@ -0,0 +1,86 @@
1
+ # Compiler Design Principles
2
+
3
+ ## Core Principle: Smart Analyzer, Dumb Compiler
4
+
5
+ The Kumi compiler follows a strict separation of concerns:
6
+
7
+ ### Analyzer Phase (Smart)
8
+ - **Makes all decisions** about how operations should be executed
9
+ - **Analyzes semantic context** to determine operation modes
10
+ - **Pre-computes execution strategies** and stores in metadata
11
+ - **Resolves complex logic** like nested array broadcasting, reduction flattening, etc.
12
+ - **Produces complete instructions** for the compiler to follow
13
+
14
+ ### Compiler Phase (Dumb)
15
+ - **Follows metadata instructions** without making decisions
16
+ - **No conditional logic** based on data types, function types, or structure analysis
17
+ - **Mechanically executes** the pre-computed strategy from analyzer
18
+ - **Pure translation** from AST + metadata → executable functions
19
+
20
+ ## Examples
21
+
22
+ ### ❌ BAD: Compiler Making Decisions
23
+ ```ruby
24
+ def compile_call(expr)
25
+ if Kumi::Registry.reducer?(expr.fn_name)
26
+ if nested_array_detected?(values)
27
+ # Compiler deciding to flatten
28
+ flatten_and_call(expr.fn_name, values)
29
+ end
30
+ end
31
+ end
32
+ ```
33
+
34
+ ### ✅ GOOD: Compiler Following Metadata
35
+ ```ruby
36
+ def compile_call(expr)
37
+ # Just read the pre-computed strategy
38
+ strategy = @analysis.metadata[:call_strategies][expr]
39
+ execute_strategy(strategy, expr)
40
+ end
41
+ ```
42
+
43
+ ### ❌ BAD: Runtime Structure Analysis
44
+ ```ruby
45
+ def vectorized_function_call(fn_name, values)
46
+ # Compiler analyzing structure at runtime
47
+ if values.any? { |v| deeply_nested?(v) }
48
+ apply_nested_broadcasting(fn, values)
49
+ end
50
+ end
51
+ ```
52
+
53
+ ### ✅ GOOD: Pre-computed Broadcasting Plan
54
+ ```ruby
55
+ def compile_element_field_reference(expr)
56
+ # Analyzer already determined the strategy
57
+ metadata = @analysis.state[:broadcasts][:nested_paths][expr.path]
58
+ traverse_nested_path(ctx, expr.path, metadata[:operation_mode])
59
+ end
60
+ ```
61
+
62
+ ## Benefits
63
+
64
+ 1. **Predictable Performance**: No runtime analysis or decision-making
65
+ 2. **Easier Testing**: Compiler behavior determined entirely by metadata
66
+ 3. **Maintainable**: Complex logic isolated in analyzer passes
67
+ 4. **Extensible**: New features added by extending analyzer, not compiler
68
+ 5. **Debuggable**: All decisions visible in analyzer metadata
69
+
70
+ ## Implementation Pattern
71
+
72
+ For any new compiler feature:
73
+
74
+ 1. **Analyzer Pass**: Analyze the requirement and store strategy in metadata
75
+ 2. **Metadata Schema**: Define clear data structure for the strategy
76
+ 3. **Compiler Method**: Read metadata and execute strategy mechanically
77
+ 4. **No Conditionals**: Avoid `if` statements based on runtime data in compiler
78
+
79
+ ## Metadata-Driven Architecture
80
+
81
+ The compiler should be a pure **metadata interpreter**:
82
+ - Input: AST + Analyzer Metadata
83
+ - Output: Executable Functions
84
+ - Process: Mechanical translation following metadata instructions
85
+
86
+ This ensures the compiler remains simple, fast, and maintainable as the system grows in complexity.
data/docs/features/README.md CHANGED
@@ -30,9 +30,15 @@ Defines expected inputs with types and constraints.
30
30
  - Domain validation at runtime
31
31
  - Separates input metadata from business logic
32
32
 
33
+ ### [Hierarchical Broadcasting](hierarchical-broadcasting.md)
34
+ Automatic vectorization over hierarchical data structures with dual access modes.
35
+
36
+ - Object access for structured business data
37
+ - Element access for multi-dimensional arrays
38
+ - Mixed access modes in same schema
39
+
33
40
  ### [Performance](performance.md)
34
- TODO: Add benchmark data
35
- Processes large schemas with optimized algorithms.
41
+ Processes large schemas.
36
42
 
37
43
  - Result caching
38
44
  - Selective evaluation
@@ -44,6 +50,13 @@ Debug and inspect AST structures with readable S-expression notation output.
44
50
  - Proper indentation and hierarchical structure
45
51
  - Useful for debugging schema parsing and AST analysis
46
52
 
53
+ ### [JavaScript Transpiler](javascript-transpiler.md)
54
+ Transpiles compiled schemas to standalone JavaScript code.
55
+
56
+ - Generates bundles with only required functions
57
+ - Supports CommonJS and browser environments
58
+ - Maintains identical behavior across platforms
59
+
47
60
  ## Integration
48
61
 
49
62
  - Type inference uses input declarations
data/docs/features/hierarchical-broadcasting.md ADDED
@@ -0,0 +1,349 @@
1
+ # Hierarchical Broadcasting
2
+
3
+ Automatic vectorization of operations over hierarchical data structures with two complementary access modes for different use cases.
4
+
5
+ ## Overview
6
+
7
+ The hierarchical broadcasting system lets ordinary field access syntax apply operations element-wise across nested data structures. It detects whether each operation is a map or a reduce, and it supports both structured objects and raw multi-dimensional arrays.
8
+
9
+ ## Core Mechanism
10
+
11
+ The system uses a three-stage pipeline:
12
+
13
+ 1. **Parser** - Creates InputElementReference AST nodes for nested field access
14
+ 2. **BroadcastDetector** - Identifies which operations should be vectorized vs scalar
15
+ 3. **Compiler** - Generates appropriate map/reduce functions based on usage context
16
+
17
+ ## Access Modes
18
+
19
+ The system supports two complementary access modes that can be mixed within the same schema:
20
+
21
+ ### Object Access Mode (Default)
22
+ For structured business data with named fields:
23
+
24
+ ```ruby
25
+ input do
26
+ array :users do
27
+ string :name
28
+ integer :age
29
+ end
30
+ end
31
+
32
+ value :user_names, input.users.name # Broadcasts over structured objects
33
+ trait :adults, input.users.age >= 18 # Boolean array from age comparison
34
+ ```
35
+
36
+ ### Element Access Mode
37
+ For multi-dimensional raw arrays using `element` syntax with **progressive path traversal**:
38
+
39
+ ```ruby
40
+ input do
41
+ array :cube do
42
+ element :array, :layer do
43
+ element :array, :row do
44
+ element :integer, :cell
45
+ end
46
+ end
47
+ end
48
+ end
49
+
50
+ # Progressive path traversal - each level goes deeper
51
+ value :dimensions, [
52
+ fn(:size, input.cube), # Number of layers
53
+ fn(:size, input.cube.layer), # Total matrices across all layers
54
+ fn(:size, input.cube.layer.row), # Total rows across all matrices
55
+ fn(:size, input.cube.layer.row.cell) # Total cells (leaf elements)
56
+ ]
57
+
58
+ # Operations at different dimensional levels
59
+ value :layer_data, input.cube.layer # 3D matrices
60
+ value :row_data, input.cube.layer.row # 2D rows
61
+ value :cell_data, input.cube.layer.row.cell # 1D values
62
+ ```
63
+
64
+ **Key Benefits of Progressive Path Traversal:**
65
+ - **Intuitive syntax**: `input.cube.layer.row` naturally represents "rows within layers within cube"
66
+ - **Direct dimensional access**: No need for complex flattening operations for basic dimensional analysis
67
+ - **Ranked polymorphism**: Same operations work across different dimensional arrays
68
+ - **Clean code**: `fn(:size, input.cube.layer.row)` instead of `fn(:size, fn(:flatten_one, input.cube.layer))`
69
+
70
+ ## Business Use Cases
71
+
72
+ Element access mode is essential for common business scenarios involving simple nested arrays:
73
+
74
+ ```ruby
75
+ # E-commerce: Product availability analysis
76
+ input do
77
+ array :products do
78
+ string :name
79
+ array :daily_sales do # Simple array: [12, 8, 15, 3]
80
+ element :integer, :units
81
+ end
82
+ end
83
+ end
84
+
85
+ # Create flags for business rules
86
+ trait :had_slow_days, fn(:any?, input.products.daily_sales.units < 5)
87
+ trait :consistently_strong, fn(:all?, input.products.daily_sales.units >= 10)
88
+
89
+ # Progressive analysis
90
+ value :sales_summary, [
91
+ fn(:size, input.products), # Number of products
92
+ fn(:size, input.products.daily_sales.units), # Total sales data points
93
+ fn(:sum, fn(:flatten, input.products.daily_sales.units)) # Total units sold
94
+ ]
95
+ ```
96
+
97
+ ### Mixed Access Modes
98
+ Both modes work together seamlessly:
99
+
100
+ ```ruby
101
+ input do
102
+ array :users do # Object access
103
+ string :name
104
+ integer :age
105
+ end
106
+ array :activity_logs do # Element access
107
+ element :integer, :day
108
+ end
109
+ end
110
+
111
+ value :user_count, fn(:size, input.users)
112
+ value :total_activity_days, fn(:size, fn(:flatten, input.activity_logs.day))
113
+ ```
114
+
115
+ ## Basic Broadcasting (Object Access)
116
+
117
+ ```ruby
118
+ schema do
119
+ input do
120
+ array :line_items do
121
+ float :price
122
+ integer :quantity
123
+ string :category
124
+ end
125
+ float :tax_rate, type: :float
126
+ end
127
+
128
+ # Element-wise computation - broadcasts over each item
129
+ value :subtotals, input.line_items.price * input.line_items.quantity
130
+
131
+ # Element-wise traits - applied to each item
132
+ trait :is_taxable, (input.line_items.category != "digital")
133
+
134
+ # Conditional logic - element-wise evaluation
135
+ value :taxes, fn(:if, is_taxable, subtotals * input.tax_rate, 0.0)
136
+ end
137
+ ```
138
+
139
+ ## Aggregation Operations
140
+
141
+ Operations that consume arrays to produce scalars are automatically detected:
142
+
143
+ ```ruby
144
+ schema do
145
+ # These aggregate the vectorized results
146
+ value :total_subtotal, fn(:sum, subtotals)
147
+ value :total_tax, fn(:sum, taxes)
148
+ value :grand_total, total_subtotal + total_tax
149
+
150
+ # Statistics over arrays
151
+ value :avg_price, fn(:avg, input.line_items.price)
152
+ value :max_quantity, fn(:max, input.line_items.quantity)
153
+ end
154
+ ```
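For readers who think in plain Ruby, the element-wise and aggregation steps in the two schemas above correspond roughly to `map`, `zip`, and `sum` over an array of hashes. This is a sketch of the semantics only; the sample data and the `tax_rate` value are made up for illustration and are not part of the source.

```ruby
# Plain-Ruby equivalent of the line_items example (illustrative data only).
line_items = [
  { price: 10.0, quantity: 2, category: "book" },
  { price: 5.0,  quantity: 1, category: "digital" }
]
tax_rate = 0.1

# Element-wise ("map") steps: one result per line item.
subtotals  = line_items.map { |i| i[:price] * i[:quantity] }
is_taxable = line_items.map { |i| i[:category] != "digital" }
taxes      = is_taxable.zip(subtotals).map { |taxable, sub| taxable ? sub * tax_rate : 0.0 }

# Aggregation ("reduce") steps: arrays collapse to scalars.
total_subtotal = subtotals.sum
total_tax      = taxes.sum
grand_total    = total_subtotal + total_tax
```

Note that the scalar `tax_rate` is reused for every element, which is the plain-Ruby counterpart of broadcasting a scalar input across an array.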
155
+
156
+ ## Field Access Nesting
157
+
158
+ Supports arbitrary depth field access with path building:
159
+
160
+ ```ruby
161
+ schema do
162
+ input do
163
+ array :orders do
164
+ array :items do
165
+ hash :product do
166
+ string :name
167
+ float :base_price
168
+ end
169
+ integer :quantity
170
+ end
171
+ end
172
+ end
173
+
174
+ # Deep field access - automatically broadcasts over nested arrays
175
+ value :all_product_names, input.orders.items.product.name
176
+ value :total_values, input.orders.items.product.base_price * input.orders.items.quantity
177
+ end
178
+ ```
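One way to picture deep field access is a recursive walk in which arrays are mapped over and hashes consume one path segment. The `traverse` helper below is hypothetical and the sample data is invented; in particular, the source does not state whether Kumi returns nested or flattened results for such paths, so the nested output shape here is an assumption of the sketch.

```ruby
# Hypothetical helper: walk a field path through nested arrays and hashes.
# Arrays are mapped over (broadcast); hashes consume one path segment.
def traverse(data, path)
  return data if path.empty?
  return data.map { |element| traverse(element, path) } if data.is_a?(Array)

  traverse(data.fetch(path.first), path.drop(1))
end

input = {
  orders: [
    { items: [{ product: { name: "pen", base_price: 2.0 }, quantity: 3 },
              { product: { name: "ink", base_price: 5.0 }, quantity: 1 }] },
    { items: [{ product: { name: "pad", base_price: 4.0 }, quantity: 2 }] }
  ]
}

# Mirrors input.orders.items.product.name; nesting follows orders, then items.
all_product_names = traverse(input, %i[orders items product name])
all_quantities    = traverse(input, %i[orders items quantity])
```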
179
+
180
+ ## Type Inference
181
+
182
+ The type system automatically infers appropriate types for broadcasted operations:
183
+
184
+ - `input.items.price` (float array) → inferred as `:float` per element
185
+ - `input.items.price * input.items.quantity` → element-wise `:float` result
186
+ - `fn(:sum, input.items.price)` → scalar `:float` result
187
+
188
+ ## Implementation Details
189
+
190
+ ### Parser Layer
191
+ - **InputFieldProxy** - Handles `input.field.subfield...` with path building
192
+ - **InputElementReference** - AST node representing array field access paths
193
+
194
+ ### Analysis Layer
195
+ - **BroadcastDetector** - Identifies vectorized vs scalar operations
196
+ - **TypeInferencer** - Infers types for array element access patterns
197
+
198
+ ### Compilation Layer
199
+ - **Automatic Dispatch** - Maps element-wise operations to array map functions
200
+ - **Reduction Detection** - Converts aggregation functions to array reduce operations
201
+
202
+ ## Usage Patterns
203
+
204
+ ### Element-wise Operations
205
+ ```ruby
206
+ # All of these broadcast element-wise
207
+ value :discounted_prices, input.items.price * 0.9
208
+ trait :expensive, (input.items.price > 100.0)
209
+ value :categories, input.items.category
210
+ ```
211
+
212
+ ### Aggregation Operations
213
+ ```ruby
214
+ # These consume arrays to produce scalars
215
+ value :item_count, fn(:size, input.items)
216
+ value :total_price, fn(:sum, input.items.price)
217
+ value :has_expensive, fn(:any?, expensive)
218
+ ```
219
+
220
+ ### Mixed Operations
221
+ ```ruby
222
+ # Element-wise computation followed by aggregation
223
+ value :line_totals, input.items.price * input.items.quantity
224
+ value :order_total, fn(:sum, line_totals)
225
+ value :avg_line_total, fn(:avg, line_totals)
226
+ ```
227
+
228
+ ### Conditional Aggregation Functions
229
+
230
+ Kumi provides conditional aggregation functions (`count_if`, `sum_if`, `avg_if`) that aggregate a vectorized value using a boolean trait array as a mask:
231
+
232
+ ```ruby
233
+ # Step 1: Create vectorized values and traits
234
+ value :revenues, input.sales.price * input.sales.quantity # → [3000.0, 2000.0, 1250.0]
235
+ trait :expensive, input.sales.price > 100 # → [true, true, false]
236
+ trait :bulk_order, input.sales.quantity >= 15 # → [false, false, true]
237
+
238
+ # Step 2: Use conditional aggregation functions (NEW - clean and readable)
239
+ value :expensive_count, fn(:count_if, expensive) # → 2
240
+ value :expensive_total, fn(:sum_if, revenues, expensive) # → 5000.0
241
+ value :expensive_average, fn(:avg_if, revenues, expensive) # → 2500.0
242
+
243
+ # Step 3: Combine conditions for complex filtering
244
+ trait :expensive_bulk, expensive & bulk_order # → [false, false, false]
245
+ value :expensive_bulk_total, fn(:sum_if, revenues, expensive_bulk) # → 0.0
246
+ ```
247
+
248
+ **Old cascade pattern** (verbose, harder to read):
249
+ ```ruby
250
+ # OLD WAY - required verbose cascade patterns
251
+ value :expensive_amounts do
252
+ on expensive, revenues
253
+ base 0.0
254
+ end
255
+ value :expensive_total, fn(:sum, expensive_amounts)
256
+
257
+ value :expensive_count_markers do
258
+ on expensive, 1.0
259
+ base 0.0
260
+ end
261
+ value :expensive_count, fn(:sum, expensive_count_markers)
262
+ ```
263
+
264
+ **New conditional functions** (clean, direct):
265
+ ```ruby
266
+ # NEW WAY - direct and readable
267
+ value :expensive_total, fn(:sum_if, revenues, expensive)
268
+ value :expensive_count, fn(:count_if, expensive)
269
+ value :expensive_average, fn(:avg_if, revenues, expensive)
270
+ ```
271
+
272
+ The conditional aggregation functions are now the idiomatic way to work with boolean arrays in Kumi.
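The numbers in the annotated example above can be reproduced with a plain-Ruby sketch. The helper names match the Kumi function names for readability, but the bodies are illustrative rather than Kumi's implementation. Returning `nil` from `avg_if` when nothing is selected is an assumption, since the source does not say what Kumi does in that case.

```ruby
# Plain-Ruby sketch of the conditional aggregations (illustrative bodies).
def count_if(mask)
  mask.count(true)
end

def sum_if(values, mask)
  values.zip(mask).sum(0.0) { |value, keep| keep ? value : 0.0 }
end

# Assumption: returns nil when no element is selected; the source does not
# specify Kumi's behavior for an all-false mask.
def avg_if(values, mask)
  selected = values.zip(mask).select { |_, keep| keep }.map(&:first)
  selected.empty? ? nil : selected.sum(0.0) / selected.size
end

revenues       = [3000.0, 2000.0, 1250.0]
expensive      = [true, true, false]
bulk_order     = [false, false, true]
expensive_bulk = expensive.zip(bulk_order).map { |a, b| a && b }
```

With these inputs the helpers give the same values as the comments in the documented example: a count of 2, a masked sum of 5000.0, a masked average of 2500.0, and a sum of 0.0 for the combined mask.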
273
+
274
+ ## Common Patterns and Gotchas
275
+
276
+ ### Operator Precedence with Mixed Arithmetic
277
+
278
+ **Problem**: Complex arithmetic expressions with arrays can fail due to precedence parsing:
279
+
280
+ ```ruby
281
+ # This fails with confusing error message
282
+ value :ones, input.items.price * 0 + 1
283
+ # Error: "no implicit conversion of Integer into Array"
284
+
285
+ # The expression is parsed as: (input.items.price * 0) + 1
286
+ # Which becomes: [0.0, 0.0, 0.0] + 1 (array + scalar in wrong context)
287
+ ```
288
+
289
+ **Solutions**:
290
+
291
+ ```ruby
292
+ # Use explicit function calls
293
+ value :ones, fn(:add, input.items.price * 0, 1)
294
+
295
+ # Use cascade pattern (most idiomatic)
296
+ trait :always_true, input.items.price >= 0
297
+ value :ones do
298
+ on always_true, 1.0
299
+ base 0.0
300
+ end
301
+
302
+ # Use separate steps
303
+ value :zeros, input.items.price * 0
304
+ value :ones, zeros + 1
305
+ ```
306
+
307
+ The cascade pattern is preferred as it's more explicit about intent and leverages Kumi's automatic scalar broadcasting.
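The quoted error text matches what plain Ruby raises when `+` is called on an `Array` with an `Integer` argument, because `Array#+` means concatenation rather than element-wise addition. The sketch below reproduces that in isolation; it is only a guess at the mechanism and does not show Kumi internals, and the sample prices are invented.

```ruby
# Plain Ruby: Array#+ concatenates, so adding an Integer raises TypeError.
prices = [12.5, 3.0, 8.0]
zeros  = prices.map { |p| p * 0 }

message =
  begin
    zeros + 1
    nil
  rescue TypeError => e
    e.message
  end

# Element-wise addition has to be spelled out explicitly in plain Ruby.
ones = zeros.map { |z| z + 1 }
```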
308
+
309
+ ## Error Handling
310
+
311
+ ### Dimension Mismatch Detection
312
+
313
+ Array broadcasting operations are only valid within the same array source. Attempting to broadcast across different arrays generates detailed error messages:
314
+
315
+ ```ruby
316
+ schema do
317
+ input do
318
+ array :items do
319
+ string :name
320
+ end
321
+ array :logs do
322
+ string :user_name
323
+ end
324
+ end
325
+
326
+ # This will generate a dimension mismatch error
327
+ trait :same_name, input.items.name == input.logs.user_name
328
+ end
329
+
330
+ # Error:
331
+ # Cannot broadcast operation across arrays from different sources: items, logs.
332
+ # Problem: Multiple operands are arrays from different sources:
333
+ # - Operand 1 resolves to array(string) from array 'items'
334
+ # - Operand 2 resolves to array(string) from array 'logs'
335
+ # Direct operations on arrays from different sources is ambiguous and not supported.
336
+ # Vectorized operations can only work on fields from the same array input.
337
+ ```
338
+
339
+ The error messages provide:
340
+ - **Quick Summary**: Identifies the conflicting array sources
341
+ - **Type Information**: Shows the resolved types of each operand
342
+ - **Clear Explanation**: Why the operation is ambiguous and not supported
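A same-source check of this kind can be sketched by comparing the root segment of each operand's path. The `DimensionMismatch` class, the `check_same_source!` helper, and the path representation are all hypothetical; only the first line of the message is modeled on the error text quoted above.

```ruby
# Hypothetical check: all vectorized operands must come from the same root array.
# The error class and helper name are assumptions, not Kumi's API.
class DimensionMismatch < StandardError; end

def check_same_source!(*paths)
  sources = paths.map(&:first).uniq
  return sources.first if sources.size == 1

  raise DimensionMismatch,
        "Cannot broadcast operation across arrays from different sources: #{sources.join(', ')}."
end

# Same root (:items), so the check passes and returns the shared source.
same = check_same_source!(%i[items price], %i[items quantity])

# Different roots (:items vs :logs), so the check raises.
error =
  begin
    check_same_source!(%i[items name], %i[logs user_name])
    nil
  rescue DimensionMismatch => e
    e.message
  end
```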
343
+
344
+ ## Performance Characteristics
345
+
346
+ - **Single Pass** - Each array is traversed once per computation chain
347
+ - **Lazy Evaluation** - Operations are composed into pipelines
348
+ - **Memory Efficient** - No intermediate array allocations for simple operations
349
+ - **Type Safe** - Full compile-time type checking for array element operations