RubyGems - kumi - Versions diffs - 0.0.14 → 0.0.15 - Mend

kumi 0.0.14 → 0.0.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22)

checksums.yaml +4 -4
data/CHANGELOG.md +33 -0
data/README.md +0 -27
data/docs/dev/vm-profiling.md +95 -0
data/docs/features/README.md +0 -7
data/lib/kumi/analyzer.rb +5 -2
data/lib/kumi/compiler.rb +6 -5
data/lib/kumi/core/analyzer/passes/ir_dependency_pass.rb +67 -0
data/lib/kumi/core/analyzer/passes/toposorter.rb +3 -35
data/lib/kumi/core/ir/execution_engine/interpreter.rb +42 -30
data/lib/kumi/core/ir/execution_engine/profiler.rb +139 -11
data/lib/kumi/core/ir/execution_engine.rb +6 -15
data/lib/kumi/dev/profile_aggregator.rb +301 -0
data/lib/kumi/dev/profile_runner.rb +199 -0
data/lib/kumi/dev/runner.rb +3 -1
data/lib/kumi/dev.rb +14 -0
data/lib/kumi/runtime/executable.rb +61 -29
data/lib/kumi/schema.rb +9 -3
data/lib/kumi/version.rb +1 -1
data/lib/kumi.rb +1 -0
metadata +6 -2
data/docs/features/analysis-cascade-mutual-exclusion.md +0 -89

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: ec64db5a7b14df1a8656e2cedc569153a0f092d69a99544c5aa103241f61d931
-  data.tar.gz: 84bd4c7360eb74364e47103901fe8ccbcf6282a9cb40111d42b05357a2172c9e
+  metadata.gz: 5405d7d0612a81e5154bd1d452fdfc150691b022137fc0ee132c47ede1a58e2e
+  data.tar.gz: '093cf7a6d305c02f92de600b06f62be39f8af90d798a0f93ed3ef59f539ada9b'
 SHA512:
-  metadata.gz: da7a9f3135e63779676ea5a802f48288ff2675935ab8dd8dcc751aaf309392275589f08fb92fede4dce351707369b5113e28540e5f4853629a02f4a1945ca75a
-  data.tar.gz: 5a45952056f81d9b4551bc7b48d3fd35c25065eb058e3cd6737879377a5c6e5f61d1cbba5b51955f54ad6ca14a797254e3381c69cd6b9eb145f3929e6e962dae
+  metadata.gz: b3ea711bf465e0c11cc95fabb3809dd632ebbfcc8c36297b161fb1f179fffdda5df1e5c033968e837dd8ed3f983639416de08bd371f30be9f2cefd5543efe1ff
+  data.tar.gz: 0a63fe824fb604639b4efb9cfc2ce24a93429110adc75cfd3edc905c58b3501273ad07a3cf66415eabce4e099c5fd4d202bd2c4064875d7a73e0c2a26f0689cb

data/CHANGELOG.md CHANGED

@@ -1,5 +1,38 @@
 ## [Unreleased]
+## [0.0.15] – 2025-08-21
+### Added
+- (DX) Schema-aware VM profiling with multi-schema performance analysis
+- DAG-based execution optimization with pre-computed dependency resolution
+### Performance
+- Reference operations eliminated as VM bottleneck via O(1) hash lookups
+## [0.0.14] – 2025-08-21
+### Added
+- Text schema frontend with `.kumi` file format support
+- `bin/kumi parse` command for schema analysis and golden file testing
+- LoadInputCSE optimization pass to eliminate redundant load operations
+- Runtime accessor caching with precise field-based invalidation
+- VM profiler with wall time, CPU time, and cache hit rate analysis
+- Structured analyzer debug system with state inspection
+- Checkpoint system for capturing and comparing analyzer states
+- State serialization (StateSerde) for golden testing and regression detection
+- Debug object printers with configurable truncation
+- Multi-run averaging for stable performance benchmarking
+### Fixed
+- VM targeting for `__vec` twin declarations that were failing to resolve
+- Demand-driven reference resolution with proper name indexing and cycle detection
+- Accessor cache invalidation now uses precise field dependencies instead of clearing all caches
+- StateSerde JSON serialization issues with frozen hashes, Sets, and Symbols
+### Performance
+- 14x improvement on update-heavy workloads (1.88k → 26.88k iterations/second)
+- 30-40% reduction in IR module size for schemas with repeated field access
+- Eliminated load_input performance bottleneck that was consuming ~99% of execution time
+- Optional caching system (enabled via KUMI_VM_CACHE=1) for performance-critical scenarios
 ## [0.0.13] – 2025-08-14
 ### Added
 - Runtime performance optimizations for interpreter execution

data/README.md CHANGED

@@ -207,33 +207,6 @@ end
 # ❌ Function arity error: divide expects 2 arguments, got 1
 ```
-**Mutual Recursion**: Kumi supports mutual recursion when cascade conditions are mutually exclusive:
-```ruby
-trait :is_forward, input.operation == "forward"
-trait :is_reverse, input.operation == "reverse"
-# Safe mutual recursion - conditions are mutually exclusive
-value :forward_processor do
-  on is_forward, input.value * 2        # Direct calculation
-  on is_reverse, reverse_processor + 10  # Delegates to reverse (safe)
-  base "invalid operation"
-end
-value :reverse_processor do
-  on is_forward, forward_processor - 5   # Delegates to forward (safe)
-  on is_reverse, input.value / 2         # Direct calculation
-  base "invalid operation"
-end
-# Usage examples:
-# operation="forward", value=10  => forward: 20, reverse: 15
-# operation="reverse", value=10  => forward: 15, reverse: 5
-# operation="unknown", value=10  => both: "invalid operation"
-```
-This compiles because `operation` can only be "forward" or "reverse", never both. Each recursion executes one step before hitting a direct calculation.
 #### **Runtime Introspection: Debug and Understand**
 **Explainability**: Trace exactly how any value is computed, step-by-step. This is invaluable for debugging complex logic and auditing results.

data/docs/dev/vm-profiling.md ADDED

@@ -0,0 +1,95 @@
+# VM Profiling with Schema Differentiation
+## Overview
+Profiles VM operation execution with schema-level differentiation. Tracks operations by schema type for multi-schema performance analysis.
+## Core Components
+**Profiler**: `lib/kumi/core/ir/execution_engine/profiler.rb`
+- Streams VM operation events with schema identification
+- Supports persistent mode for cross-run analysis
+- JSONL event format with operation metadata
+**Profile Aggregator**: `lib/kumi/dev/profile_aggregator.rb`
+- Analyzes profiling data by schema type
+- Generates summary and detailed performance reports
+- Schema breakdown showing operations and timing per schema
+**CLI Integration**: `bin/kumi profile`
+- Processes JSONL profiling data files
+- Multiple output formats: summary, detailed, raw
+## Usage
+### Basic Profiling
+```bash
+# Single schema with operations
+KUMI_PROFILE=1 KUMI_PROFILE_OPS=1 KUMI_PROFILE_FILE=profile.jsonl ruby script.rb
+# Persistent mode across multiple runs
+KUMI_PROFILE=1 KUMI_PROFILE_PERSISTENT=1 KUMI_PROFILE_OPS=1 KUMI_PROFILE_FILE=profile.jsonl ruby script.rb
+# Streaming mode for real-time analysis
+KUMI_PROFILE=1 KUMI_PROFILE_STREAM=1 KUMI_PROFILE_OPS=1 KUMI_PROFILE_FILE=profile.jsonl ruby script.rb
+```
+### CLI Analysis
+```bash
+# Summary report with schema breakdown
+kumi profile profile.jsonl --summary
+# Detailed per-operation analysis
+kumi profile profile.jsonl --detailed
+# Raw event stream
+kumi profile profile.jsonl --raw
+```
+## Environment Variables
+**Core**:
+- `KUMI_PROFILE=1` - Enable profiling
+- `KUMI_PROFILE_FILE=path` - Output file (required)
+- `KUMI_PROFILE_OPS=1` - Enable VM operation profiling
+**Modes**:
+- `KUMI_PROFILE_PERSISTENT=1` - Append to existing files across runs
+- `KUMI_PROFILE_STREAM=1` - Stream individual events vs batch
+- `KUMI_PROFILE_TRUNCATE=1` - Truncate existing files
+## Event Format
+JSONL with operation metadata:
+```json
+{"event":"vm_operation","schema":"TestSchema","operation":"LoadInput","duration_ms":0.001,"timestamp":"2025-01-20T10:30:45.123Z"}
+{"event":"vm_operation","schema":"TestSchema","operation":"Map","duration_ms":0.002,"timestamp":"2025-01-20T10:30:45.125Z"}
+```
+## Schema Differentiation
+Tracks operations by schema class name for multi-schema analysis:
+**Implementation**:
+- Schema name propagated through compilation pipeline
+- Profiler tags each VM operation with schema identifier
+- Aggregator groups operations by schema type
+**Output Example**:
+```
+Total operations: 24 (0.8746ms)
+Schemas analyzed: SchemaA, SchemaB
+  SchemaA: 12 operations, 0.3242ms
+  SchemaB: 12 operations, 0.0504ms
+```
+## Performance Analysis
+**Reference Operations**: Typically dominate execution time in complex schemas
+**Map Operations**: Element-wise computations on arrays
+**LoadInput Operations**: Data access operations
+Use schema breakdown to identify performance differences between schema types.
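
The per-schema grouping described in the added vm-profiling.md can be illustrated with a minimal sketch. It assumes only the JSONL event shape shown in the "Event Format" section; `summarize_profile` is a hypothetical helper written for this illustration and is not taken from `lib/kumi/dev/profile_aggregator.rb`, whose actual implementation is not shown in this diff.

```ruby
require "json"

# Hypothetical helper (not part of kumi): totals vm_operation events per schema.
# Assumes each line is a JSON object with "event", "schema" and "duration_ms" keys,
# as in the Event Format example of vm-profiling.md.
def summarize_profile(jsonl)
  totals = Hash.new { |hash, schema| hash[schema] = { operations: 0, duration_ms: 0.0 } }

  jsonl.each_line do |line|
    line = line.strip
    next if line.empty?

    event = JSON.parse(line)
    next unless event["event"] == "vm_operation"

    entry = totals[event["schema"]]
    entry[:operations] += 1
    entry[:duration_ms] += event["duration_ms"].to_f
  end

  totals
end

# The two sample events from the documentation above.
sample = <<~JSONL
  {"event":"vm_operation","schema":"TestSchema","operation":"LoadInput","duration_ms":0.001,"timestamp":"2025-01-20T10:30:45.123Z"}
  {"event":"vm_operation","schema":"TestSchema","operation":"Map","duration_ms":0.002,"timestamp":"2025-01-20T10:30:45.125Z"}
JSONL

summary = summarize_profile(sample)
summary.each do |schema, stats|
  puts format("%s: %d operations, %.4fms", schema, stats[:operations], stats[:duration_ms])
end
```

Keying the totals by the `schema` field is what makes a breakdown like the "Output Example" possible; the real aggregator may track additional dimensions such as operation type.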

data/docs/features/README.md CHANGED

@@ -9,13 +9,6 @@ Analyzes rule combinations to detect logical impossibilities across dependency c
 - Validates domain constraints
 - Reports multiple errors
-### [Cascade Mutual Exclusion](analysis-cascade-mutual-exclusion.md)
-Enables safe mutual recursion when cascade conditions are mutually exclusive.
-- Allows mathematically sound recursive patterns
-- Detects mutually exclusive conditions
-- Prevents unsafe cycles while enabling safe ones
 ### [Type Inference](analysis-type-inference.md)
 Determines types from expressions and propagates them through dependencies.

data/lib/kumi/analyzer.rb CHANGED

@@ -21,7 +21,8 @@ module Kumi
       Core::Analyzer::Passes::ScopeResolutionPass,             # 15. Plans execution scope and lifting needs for declarations.
       Core::Analyzer::Passes::JoinReducePlanningPass,          # 16. Plans join/reduce operations (Generates IR Structs)
       Core::Analyzer::Passes::LowerToIRPass,                   # 17. Lowers the schema to IR (Generates IR Structs)
-      Core::Analyzer::Passes::LoadInputCSE                     # 18. Eliminates redundant load_input operations
+      Core::Analyzer::Passes::LoadInputCSE,                    # 18. Eliminates redundant load_input operations
+      Core::Analyzer::Passes::IRDependencyPass                 # 19. Extracts IR-level dependencies for VM execution optimization
     ].freeze
     def self.analyze!(schema, passes: DEFAULT_PASSES, **opts)
@@ -58,7 +59,9 @@ module Kumi
         t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
         pass_instance = pass_class.new(schema, state)
         begin
-          state = pass_instance.run(errors)
+          state = Dev::Profiler.phase("analyzer.pass", pass: pass_name) do
+            pass_instance.run(errors)
+          end
         rescue StandardError => e
           # TODO: - GREATLY improve this, need to capture the context of the error
           # and the pass that failed and line number if relevant

data/lib/kumi/compiler.rb CHANGED

@@ -3,18 +3,19 @@
 module Kumi
   # Compiles an analyzed schema into executable lambdas
   class Compiler < Core::CompilerBase
-    def self.compile(schema, analyzer:)
-      new(schema, analyzer).compile
+    def self.compile(schema, analyzer:, schema_name: nil)
+      new(schema, analyzer, schema_name: schema_name).compile
     end
-    def initialize(schema, analyzer)
-      super
+    def initialize(schema, analyzer, schema_name: nil)
+      super(schema, analyzer)
       @bindings = {}
+      @schema_name = schema_name
     end
     def compile
       # Switch to LIR: Use the analysis state instead of old compilation
-      Runtime::Executable.from_analysis(@analysis.state)
+      Runtime::Executable.from_analysis(@analysis.state, schema_name: @schema_name)
     end
   end
 end

data/lib/kumi/core/analyzer/passes/ir_dependency_pass.rb ADDED

@@ -0,0 +1,67 @@
+# frozen_string_literal: true
+module Kumi
+  module Core
+    module Analyzer
+      module Passes
+        # RESPONSIBILITY: Extract IR-level dependencies for VM execution optimization
+        # DEPENDENCIES: :ir_module from LowerToIRPass
+        # PRODUCES: :ir_dependencies - Hash mapping declaration names to referenced bindings
+        #           :name_index - Hash mapping stored binding names to producing declarations
+        # INTERFACE: new(schema, state).run(errors)
+        #
+        # NOTE: This pass extracts actual IR-level dependencies by analyzing :ref operations
+        # in the generated IR, providing the dependency information needed for optimized VM scheduling.
+        class IRDependencyPass < PassBase
+          def run(errors)
+            ir_module = get_state(:ir_module, required: true)
+            ir_dependencies = build_ir_dependency_map(ir_module)
+            name_index = build_name_index(ir_module)
+            state.with(:ir_dependencies, ir_dependencies).with(:name_index, name_index)
+          end
+          private
+          # Build a map of declaration -> [stored_bindings_it_references] from the IR
+          def build_ir_dependency_map(ir_module)
+            deps_map = {}
+            ir_module.decls.each do |decl|
+              refs = []
+              decl.ops.each do |op|
+                if op.tag == :ref
+                  refs << op.attrs[:name]
+                end
+              end
+              deps_map[decl.name] = refs
+            end
+            deps_map.freeze
+          end
+          # Build name index to map stored binding names to their producing declarations
+          def build_name_index(ir_module)
+            name_index = {}
+            ir_module.decls.each do |decl|
+              # Map the primary declaration name
+              name_index[decl.name] = decl
+              # Also map any vectorized twin names produced by this declaration
+              decl.ops.each do |op|
+                if op.tag == :store
+                  stored_name = op.attrs[:name]
+                  name_index[stored_name] = decl
+                end
+              end
+            end
+            name_index.freeze
+          end
+        end
+      end
+    end
+  end
+end
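
As a rough illustration of how the two structures produced by this pass could drive a pre-computed schedule, the sketch below rebuilds a dependency map and a name index over stand-in structs and walks them depth-first from a target. `Op`, `Decl` and `schedule_for` are assumptions made for this example; kumi's real IR classes and the scheduling code in `runtime/executable.rb` are not shown in this part of the diff and may differ. The walk assumes the graph is acyclic, which the Toposorter change in this release enforces upstream.

```ruby
require "set"

# Stand-ins for kumi's IR structs; the real classes may have more fields.
Op   = Struct.new(:tag, :attrs)
Decl = Struct.new(:name, :ops)

# declaration name -> names referenced through :ref ops (mirrors the pass above)
def build_ir_dependency_map(decls)
  decls.to_h do |decl|
    [decl.name, decl.ops.select { |op| op.tag == :ref }.map { |op| op.attrs[:name] }]
  end
end

# stored binding name -> producing declaration (covers extra :store names such as __vec twins)
def build_name_index(decls)
  decls.each_with_object({}) do |decl, index|
    index[decl.name] = decl
    decl.ops.each { |op| index[op.attrs[:name]] = decl if op.tag == :store }
  end
end

# Hypothetical scheduler: returns the declarations needed for `target`,
# producers first. Assumes the dependency graph has no cycles.
def schedule_for(target, deps, name_index)
  order = []
  seen = Set.new
  visit = lambda do |name|
    decl = name_index.fetch(name) { raise KeyError, "unknown binding: #{name}" }
    next if seen.include?(decl.name)

    seen << decl.name
    deps.fetch(decl.name, []).each { |ref| visit.call(ref) }
    order << decl
  end
  visit.call(target)
  order
end

decls = [
  Decl.new(:price, [Op.new(:load_input, {}), Op.new(:store, { name: :price })]),
  Decl.new(:tax, [Op.new(:ref, { name: :price }), Op.new(:store, { name: :tax }),
                  Op.new(:store, { name: :tax__vec })]),
  Decl.new(:total, [Op.new(:ref, { name: :price }), Op.new(:ref, { name: :tax__vec }),
                    Op.new(:store, { name: :total })]),
  Decl.new(:unrelated, [Op.new(:store, { name: :unrelated })])
]

deps = build_ir_dependency_map(decls)
name_index = build_name_index(decls)
schedule = schedule_for(:total, deps, name_index).map(&:name)
puts schedule.inspect
```

Because every producer is scheduled before its consumers, a `:ref` op in such a schedule only needs a hash lookup for an already-computed value.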

data/lib/kumi/core/analyzer/passes/toposorter.rb CHANGED

@@ -5,8 +5,8 @@ module Kumi
   module Core
     module Analyzer
       module Passes
-        # RESPONSIBILITY: Compute topological ordering of declarations, allowing safe conditional cycles
-        # DEPENDENCIES: :dependencies from DependencyResolver, :declarations from NameIndexer, :cascades from UnsatDetector
+        # RESPONSIBILITY: Compute topological ordering of declarations, blocking all cycles
+        # DEPENDENCIES: :dependencies from DependencyResolver, :declarations from NameIndexer
         # PRODUCES: :evaluation_order - Array of declaration names in evaluation order
         #           :node_index - Hash mapping object_id to node metadata for later passes
         # INTERFACE: new(schema, state).run(errors)
@@ -60,19 +60,13 @@ module Kumi
             temp_marks = Set.new
             perm_marks = Set.new
             order = []
-            cascades = get_state(:cascades) || {}
             visit_node = lambda do |node, path = []|
               return if perm_marks.include?(node)
               if temp_marks.include?(node)
-                # Check if this is a safe conditional cycle
-                cycle_path = path + [node]
-                return if safe_conditional_cycle?(cycle_path, graph, cascades)
-                # Allow this cycle - it's safe due to cascade mutual exclusion
+                # Block all cycles - no mutual recursion allowed
                 report_unexpected_cycle(temp_marks, node, errors)
                 return
               end
@@ -102,32 +96,6 @@ module Kumi
             order.freeze
           end
-          def safe_conditional_cycle?(cycle_path, graph, cascades)
-            return false if cycle_path.nil? || cycle_path.size < 2
-            # Find where the cycle starts - look for the first occurrence of the repeated node
-            last_node = cycle_path.last
-            return false if last_node.nil?
-            cycle_start = cycle_path.index(last_node)
-            return false unless cycle_start && cycle_start < cycle_path.size - 1
-            cycle_nodes = cycle_path[cycle_start..]
-            # Check if all edges in the cycle are conditional
-            cycle_nodes.each_cons(2) do |from, to|
-              edges = graph[from] || []
-              edge = edges.find { |e| e.to == to }
-              return false unless edge&.conditional
-              # Check if the cascade has mutually exclusive conditions
-              cascade_meta = cascades[edge.cascade_owner]
-              return false unless cascade_meta&.dig(:all_mutually_exclusive)
-            end
-            true
-          end
           def report_unexpected_cycle(temp_marks, current_node, errors)
             cycle_path = temp_marks.to_a.join(" → ") + " → #{current_node}"
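
The behaviour after this change, where any revisit of a temporarily marked node is reported as a cycle, can be shown with a small standalone sketch of depth-first topological sorting. This is an illustration under simplified assumptions, not the gem's code: `toposort` is a hypothetical function, and the graph is assumed to be a plain Hash from node to the nodes it depends on, whereas the real pass works on edge objects and shared analyzer state.

```ruby
require "set"

# Hypothetical simplified toposort: graph maps node => array of dependencies.
# Returns [order, errors]; every cycle is reported, none are tolerated.
def toposort(graph)
  temp_marks = Set.new # nodes on the current DFS path
  perm_marks = Set.new # nodes fully processed
  order = []
  errors = []

  visit = lambda do |node|
    next if perm_marks.include?(node)

    if temp_marks.include?(node)
      # Revisiting a node on the current path means a cycle; always an error.
      errors << "cycle detected: #{temp_marks.to_a.join(' → ')} → #{node}"
      next
    end

    temp_marks << node
    (graph[node] || []).each { |dep| visit.call(dep) }
    temp_marks.delete(node)
    perm_marks << node
    order << node
  end

  graph.each_key { |node| visit.call(node) }
  [order, errors]
end

acyclic_order, acyclic_errors = toposort({ a: [:b], b: [:c], c: [] })
_cyclic_order, cyclic_errors = toposort({ x: [:y], y: [:x] })

puts acyclic_order.inspect
puts cyclic_errors.inspect
```

With the conditional-cycle exemption removed, mutually recursive declarations like the `forward_processor` / `reverse_processor` pair dropped from the README would fall into the error branch.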

data/lib/kumi/core/ir/execution_engine/interpreter.rb CHANGED

@@ -14,6 +14,7 @@ module Kumi
             ir_module.decls.each do |decl|
               decl.ops.each do |op|
                 next unless op.tag == :store
                 name = op.attrs[:name]
                 index[name] = decl if name
               end
@@ -26,27 +27,39 @@ module Kumi
             raise ArgumentError, "Registry cannot be nil" if registry.nil?
             raise ArgumentError, "Registry must be a Hash, got #{registry.class}" unless registry.is_a?(Hash)
-            # --- PROFILER: init per run ---
-            Profiler.reset!(meta: { decls: ir_module.decls&.size || 0 }) if Profiler.enabled?
+            # --- PROFILER: init per run (but not in persistent mode) ---
+            if Profiler.enabled?
+              schema_name = ctx[:schema_name] || "UnknownSchema"
+              if Profiler.persistent?
+                # In persistent mode, just update schema name without full reset
+                Profiler.set_schema_name(schema_name)
+              else
+                # Normal mode: full reset with schema name
+                Profiler.reset!(meta: { decls: ir_module.decls&.size || 0, schema_name: schema_name })
+              end
+            end
             outputs = {}
             target = ctx[:target]
             guard_stack = [true]
             # Always ensure we have a declaration cache - either from caller or new for this VM run
             declaration_cache = ctx[:declaration_cache] || {}
             # Build name index for targeting by stored names
             name_index = ctx[:name_index] || (target ? build_name_index(ir_module) : nil)
-            # Choose declarations to execute by stored name (not only decl name)
-            decls_to_run =
-              if target
+            # Choose declarations to execute - prefer explicit schedule if present
+            decls_to_run =
+              if ctx[:decls_to_run]
+                ctx[:decls_to_run] # array of decl objects
+              elsif target
                 # Prefer a decl that STORES the target (covers __vec twins)
                 d = name_index && name_index[target]
                 # Fallback: allow targeting by decl name (legacy behavior)
                 d ||= ir_module.decls.find { |dd| dd.name == target }
                 raise "Unknown target: #{target}" unless d
                 [d]
               else
                 ir_module.decls
@@ -84,7 +97,10 @@ module Kumi
                                    false
                                  end
                   slots << nil # keep slot_id == op_index
-                  Profiler.record!(decl: decl.name, idx: op_index, tag: op.tag, op: op, t0: t0, cpu_t0: cpu_t0, rows: 0, note: "enter") if t0
+                  if t0
+                    Profiler.record!(decl: decl.name, idx: op_index, tag: op.tag, op: op, t0: t0, cpu_t0: cpu_t0, rows: 0,
+                                     note: "enter")
+                  end
                   next
                 when :guard_pop
@@ -97,7 +113,10 @@ module Kumi
                 # Skip body when guarded off, but keep indices aligned
                 unless guard_stack.last
                   slots << nil if PRODUCES_SLOT.include?(op.tag) || NON_PRODUCERS.include?(op.tag)
-                  Profiler.record!(decl: decl.name, idx: op_index, tag: op.tag, op: op, t0: t0, cpu_t0: cpu_t0, rows: 0, note: "skipped") if t0
+                  if t0
+                    Profiler.record!(decl: decl.name, idx: op_index, tag: op.tag, op: op, t0: t0, cpu_t0: cpu_t0, rows: 0,
+                                     note: "skipped")
+                  end
                   next
                 end
@@ -149,41 +168,34 @@ module Kumi
                            end
                   rows_touched ||= 1
                   cache_note = hit ? "hit:#{plan_id}" : "miss:#{plan_id}"
-                  Profiler.record!(decl: decl.name, idx: op_index, tag: :load_input, op: op, t0: t0, cpu_t0: cpu_t0,
-                                   rows: rows_touched, note: cache_note) if t0
+                  if t0
+                    Profiler.record!(decl: decl.name, idx: op_index, tag: :load_input, op: op, t0: t0, cpu_t0: cpu_t0,
+                                     rows: rows_touched, note: cache_note)
+                  end
                 when :ref
                   name = op.attrs[:name]
                   if outputs.key?(name)
                     referenced = outputs[name]
+                    hit = :outputs
                   elsif declaration_cache.key?(name)
                     referenced = declaration_cache[name]
+                    hit = :cache
                   else
-                    # demand-compute the producing decl up to the store of `name`
-                    active = (ctx[:active] ||= {})
-                    raise "cycle detected: #{name}" if active[name]
-                    active[name] = true
-                    subctx = {
-                      input: ctx[:input] || ctx["input"],
-                      target: name,                         # target is the STORED NAME
-                      accessor_cache: ctx[:accessor_cache],
-                      declaration_cache: ctx[:declaration_cache],
-                      name_index: name_index,               # reuse map
-                      active: active
-                    }
-                    referenced = self.run(ir_module, subctx, accessors: accessors, registry: registry).fetch(name)
-                    active.delete(name)
+                    raise "unscheduled ref #{name}: producer not executed or dependency analysis failed"
                   end
                   if ENV["DEBUG_VM_ARGS"]
                     puts "DEBUG Ref #{name}: #{referenced[:k] == :scalar ? "scalar(#{referenced[:v].inspect})" : "#{referenced[:k]}(#{referenced[:rows]&.size || 0} rows)"}"
                   end
                   slots << referenced
-                  rows_touched = (referenced[:k] == :vec) ? (referenced[:rows]&.size || 0) : 1
-                  Profiler.record!(decl: decl.name, idx: op_index, tag: :ref, op: op, t0: t0, cpu_t0: cpu_t0, rows: rows_touched) if t0
+                  rows_touched = referenced[:k] == :vec ? (referenced[:rows]&.size || 0) : 1
+                  if t0
+                    Profiler.record!(decl: decl.name, idx: op_index, tag: :ref, op: op, t0: t0, cpu_t0: cpu_t0,
+                                     rows: rows_touched, note: hit)
+                  end
                 when :array
                   # Validate slot indices before accessing