RubyGems - kumi - Versions diffs - 0.0.14 → 0.0.15 - Mend

kumi 0.0.14 → 0.0.15

Files changed (22)

checksums.yaml +4 -4
data/CHANGELOG.md +33 -0
data/README.md +0 -27
data/docs/dev/vm-profiling.md +95 -0
data/docs/features/README.md +0 -7
data/lib/kumi/analyzer.rb +5 -2
data/lib/kumi/compiler.rb +6 -5
data/lib/kumi/core/analyzer/passes/ir_dependency_pass.rb +67 -0
data/lib/kumi/core/analyzer/passes/toposorter.rb +3 -35
data/lib/kumi/core/ir/execution_engine/interpreter.rb +42 -30
data/lib/kumi/core/ir/execution_engine/profiler.rb +139 -11
data/lib/kumi/core/ir/execution_engine.rb +6 -15
data/lib/kumi/dev/profile_aggregator.rb +301 -0
data/lib/kumi/dev/profile_runner.rb +199 -0
data/lib/kumi/dev/runner.rb +3 -1
data/lib/kumi/dev.rb +14 -0
data/lib/kumi/runtime/executable.rb +61 -29
data/lib/kumi/schema.rb +9 -3
data/lib/kumi/version.rb +1 -1
data/lib/kumi.rb +1 -0
metadata +6 -2
data/docs/features/analysis-cascade-mutual-exclusion.md +0 -89

data/lib/kumi/dev/profile_runner.rb ADDED

@@ -0,0 +1,199 @@
+# frozen_string_literal: true
+require "json"
+require "fileutils"
+require "benchmark"
+module Kumi
+  module Dev
+    module ProfileRunner
+      module_function
+      def run(script_path, opts = {})
+        # Validate script exists
+        unless File.exist?(script_path)
+          puts "Error: Script not found: #{script_path}"
+          return false
+        end
+        # Set up profiling environment
+        setup_profiler_env(opts)
+        puts "Profiling: #{script_path}"
+        puts "Configuration:"
+        puts "  Output: #{ENV['KUMI_PROFILE_FILE']}"
+        puts "  Phases: enabled"
+        puts "  Operations: #{ENV['KUMI_PROFILE_OPS'] == '1' ? 'enabled' : 'disabled'}"
+        puts "  Sampling: #{ENV['KUMI_PROFILE_SAMPLE'] || '1'}"
+        puts "  Persistent: #{ENV['KUMI_PROFILE_PERSISTENT'] == '1' ? 'yes' : 'no'}"
+        puts "  Memory snapshots: #{opts[:memory] ? 'enabled' : 'disabled'}"
+        puts
+        # Initialize profiler
+        Dev::Profiler.init_persistent! if ENV["KUMI_PROFILE_PERSISTENT"] == "1"
+        # Add memory snapshot before execution
+        Dev::Profiler.memory_snapshot("script_start") if opts[:memory]
+        # Execute the script
+        start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+        begin
+          result = Dev::Profiler.phase("script_execution", script: File.basename(script_path)) do
+            # Execute in a clean environment to avoid polluting the current process
+            load(File.expand_path(script_path))
+          end
+        rescue StandardError => e
+          puts "Error executing script: #{e.message}"
+          puts e.backtrace.first(5).join("\n")
+          return false
+        ensure
+          execution_time = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
+        end
+        # Add memory snapshot after execution
+        Dev::Profiler.memory_snapshot("script_end") if opts[:memory]
+        # Finalize profiler to get aggregated data
+        Dev::Profiler.finalize!
+        puts "Script completed in #{execution_time.round(4)}s"
+        # Show analysis unless quiet
+        show_analysis(opts) unless opts[:quiet]
+        true
+      rescue LoadError => e
+        puts "Error loading script: #{e.message}"
+        false
+      end
+      private
+      def self.setup_profiler_env(opts)
+        # Always enable profiling
+        ENV["KUMI_PROFILE"] = "1"
+        # Output file
+        output_file = opts[:output] || "tmp/profile.jsonl"
+        ENV["KUMI_PROFILE_FILE"] = output_file
+        # Truncate if requested
+        ENV["KUMI_PROFILE_TRUNCATE"] = opts[:truncate] ? "1" : "0"
+        # Streaming
+        ENV["KUMI_PROFILE_STREAM"] = opts[:stream] ? "1" : "0"
+        # Operations profiling
+        if opts[:phases_only]
+          ENV["KUMI_PROFILE_OPS"] = "0"
+        elsif opts[:ops]
+          ENV["KUMI_PROFILE_OPS"] = "1"
+        else
+          # Default: phases only
+          ENV["KUMI_PROFILE_OPS"] = "0"
+        end
+        # Sampling
+        ENV["KUMI_PROFILE_SAMPLE"] = opts[:sample].to_s if opts[:sample]
+        # Persistent mode
+        ENV["KUMI_PROFILE_PERSISTENT"] = opts[:persistent] ? "1" : "0"
+        # Ensure output directory exists
+        FileUtils.mkdir_p(File.dirname(output_file))
+      end
+      def self.show_analysis(opts)
+        output_file = ENV["KUMI_PROFILE_FILE"]
+        unless File.exist?(output_file)
+          puts "No profile data generated"
+          return
+        end
+        puts "\n=== Profiling Analysis ==="
+        # Use ProfileAggregator for comprehensive analysis
+        require_relative "profile_aggregator"
+        aggregator = ProfileAggregator.new(output_file)
+        if opts[:json]
+          # Export full analysis to JSON and display
+          json_output = opts[:json_file] || "/tmp/profile_analysis.json"
+          aggregator.export_summary(json_output)
+          puts File.read(json_output)
+          return
+        end
+        # Show comprehensive analysis using ProfileAggregator
+        if opts[:detailed]
+          aggregator.detailed_report(limit: opts[:limit] || 15)
+        else
+          # Show summary + key insights
+          aggregator.summary_report
+          # Add some key insights for CLI users
+          puts
+          puts "=== KEY INSIGHTS ==="
+          # Show top hotspots
+          hotspots = aggregator.hotspot_analysis(limit: 3)
+          if hotspots.any?
+            puts "Top Performance Bottlenecks:"
+            hotspots.each_with_index do |(key, stats), i|
+              puts "  #{i+1}. #{stats[:decl]} (#{stats[:tag]}): #{stats[:total_ms]}ms"
+            end
+          end
+          # Reference analysis summary
+          ref_analysis = aggregator.reference_operation_analysis
+          if ref_analysis[:operations] > 0
+            puts "Reference Operation Impact: #{(ref_analysis[:total_time] / aggregator.vm_execution_time * 100).round(1)}% of VM time"
+          end
+          # Memory impact
+          mem = aggregator.memory_analysis
+          if mem
+            puts "Memory Impact: #{mem[:growth][:heap_growth_pct]}% heap growth, #{mem[:growth][:rss_growth_pct]}% RSS growth"
+          end
+        end
+        puts
+        puts "Full profile: #{output_file}"
+        puts "For detailed analysis: bin/kumi profile #{ARGV.join(' ')} --detailed"
+      end
+      def self.analyze_phases(phase_events)
+        phase_events.group_by { |e| e["name"] }.transform_values do |events|
+          {
+            count: events.length,
+            total_ms: events.sum { |e| e["wall_ms"] }.round(3),
+            avg_ms: (events.sum { |e| e["wall_ms"] } / events.length).round(4)
+          }
+        end.sort_by { |_, stats| -stats[:total_ms] }.to_h
+      end
+      def self.analyze_events(events)
+        {
+          summary: {
+            total_events: events.length,
+            phase_events: events.count { |e| e["kind"] == "phase" },
+            memory_events: events.count { |e| e["kind"] == "mem" },
+            operation_events: events.count { |e| !%w[phase mem summary final_summary cache_analysis].include?(e["kind"]) }
+          },
+          phases: analyze_phases(events.select { |e| e["kind"] == "phase" }),
+          memory_snapshots: events.select { |e| e["kind"] == "mem" }.map do |e|
+            {
+              label: e["label"],
+              heap_live: e["heap_live"],
+              rss_mb: e["rss_mb"],
+              timestamp: e["ts"]
+            }
+          end,
+          final_analysis: events.find { |e| e["kind"] == "final_summary" }&.dig("data"),
+          cache_analysis: events.find { |e| e["kind"] == "cache_analysis" }&.dig("data")
+        }
+      end
+    end
+  end
+end
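
The per-phase aggregation in `analyze_phases` above can be tried in isolation. The following is a minimal sketch, assuming events are plain hashes with `"name"` and `"wall_ms"` keys as the added code reads them; the method name `summarize_phases` and the sample events are hypothetical and are not part of the gem.

```ruby
# Standalone sketch of the grouping done by analyze_phases in the diff.
# `summarize_phases` and the sample data are illustrative names only.
def summarize_phases(phase_events)
  phase_events.group_by { |e| e["name"] }.transform_values do |group|
    total = group.sum { |e| e["wall_ms"] }
    {
      count: group.length,
      total_ms: total.round(3),
      avg_ms: (total / group.length).round(4)
    }
  end.sort_by { |_, stats| -stats[:total_ms] }.to_h # slowest phase first
end

# Hypothetical events shaped like the "phase" records in the profile file.
events = [
  { "kind" => "phase", "name" => "vm.run",   "wall_ms" => 1.5 },
  { "kind" => "phase", "name" => "analyzer", "wall_ms" => 3.0 },
  { "kind" => "phase", "name" => "vm.run",   "wall_ms" => 2.5 }
]

summarize_phases(events).each do |name, stats|
  puts "#{name}: #{stats[:count]} call(s), #{stats[:total_ms]}ms total, #{stats[:avg_ms]}ms avg"
end
```

With this sample data, `vm.run` sorts ahead of `analyzer` because its two calls sum to 4.0 ms against 3.0 ms.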

data/lib/kumi/dev/runner.rb CHANGED

@@ -19,7 +19,9 @@ module Kumi
         errors = []
         begin
-          final_state = Kumi::Analyzer.run_analysis_passes(schema, Kumi::Analyzer::DEFAULT_PASSES, state, errors)
+          final_state = Dev::Profiler.phase("text.analyzer") do
+            Kumi::Analyzer.run_analysis_passes(schema, Kumi::Analyzer::DEFAULT_PASSES, state, errors)
+          end
           ir = final_state[:ir_module]
           result = Result.new(

data/lib/kumi/dev.rb ADDED

@@ -0,0 +1,14 @@
+# frozen_string_literal: true
+module Kumi
+  module Dev
+    # Alias to the execution engine profiler for cross-layer access
+    Profiler = Kumi::Core::IR::ExecutionEngine::Profiler
+    # Load profile runner for CLI
+    autoload :ProfileRunner, "kumi/dev/profile_runner"
+    # Load profile aggregator for data analysis
+    autoload :ProfileAggregator, "kumi/dev/profile_aggregator"
+  end
+end

data/lib/kumi/runtime/executable.rb CHANGED

@@ -37,12 +37,16 @@ module Kumi
     # - DEBUG_VM_ARGS=1 to trace VM execution
     # - Accessors can be debugged independently with DEBUG_ACCESSOR_OPS=1
     class Executable
-      def self.from_analysis(state, registry: nil)
+      def self.from_analysis(state, registry: nil, schema_name: nil)
         ir = state.fetch(:ir_module)
         access_plans = state.fetch(:access_plans)
         input_metadata = state[:input_metadata] || {}
         dependents = state[:dependents] || {}
-        accessors = Kumi::Core::Compiler::AccessBuilder.build(access_plans)
+        ir_dependencies = state[:ir_dependencies] || {} # <-- from IR dependency pass
+        name_index = state[:name_index] || {} # <-- from IR dependency pass
+        accessors = Dev::Profiler.phase("compiler.access_builder") do
+          Kumi::Core::Compiler::AccessBuilder.build(access_plans)
+        end
         access_meta = {}
         field_to_plan_ids = Hash.new { |h, k| h[k] = [] }
@@ -60,10 +64,12 @@ module Kumi
         # Use the internal functions hash that VM expects
         registry ||= Kumi::Registry.functions
         new(ir: ir, accessors: accessors, access_meta: access_meta, registry: registry,
-            input_metadata: input_metadata, field_to_plan_ids: field_to_plan_ids, dependents: dependents)
+            input_metadata: input_metadata, field_to_plan_ids: field_to_plan_ids, dependents: dependents,
+            ir_dependencies: ir_dependencies, name_index: name_index, schema_name: schema_name)
       end
-      def initialize(ir:, accessors:, access_meta:, registry:, input_metadata:, field_to_plan_ids: {}, dependents: {})
+      def initialize(ir:, accessors:, access_meta:, registry:, input_metadata:, field_to_plan_ids: {}, dependents: {}, ir_dependencies: {},
+                     name_index: {}, schema_name: nil)
         @ir = ir.freeze
         @acc = accessors.freeze
         @meta = access_meta.freeze
@@ -71,6 +77,9 @@ module Kumi
         @input_metadata = input_metadata.freeze
         @field_to_plan_ids = field_to_plan_ids.freeze
         @dependents = dependents.freeze
+        @ir_dependencies = ir_dependencies.freeze # decl -> [stored_bindings_it_references]
+        @name_index = name_index.freeze # store_name -> producing decl
+        @schema_name = schema_name
         @decl = @ir.decls.map { |d| [d.name, d] }.to_h
         @accessor_cache = {} # Persistent accessor cache across evaluations
       end
@@ -96,14 +105,22 @@ module Kumi
       def eval_decl(name, input, mode: :ruby, declaration_cache: nil)
         raise Kumi::Core::Errors::RuntimeError, "unknown decl #{name}" unless decl?(name)
-        vm_context = {
-          input: input,
-          target: name,
+        # If the caller asked for a specific binding, schedule deps once
+        decls_to_run = topo_closure_for_target(name)
+        vm_context = {
+          input: input,
           accessor_cache: @accessor_cache,
-          declaration_cache: declaration_cache
+          declaration_cache: declaration_cache || {}, # run-local cache
+          decls_to_run: decls_to_run,                # <-- explicit schedule
+          strict_refs: true,                         # <-- refs must be precomputed
+          name_index: @name_index,                   # for error messages, twins, etc.
+          schema_name: @schema_name
         }
-        out = Kumi::Core::IR::ExecutionEngine.run(@ir, vm_context, accessors: @acc, registry: @reg).fetch(name)
+        out = Dev::Profiler.phase("vm.run", target: name) do
+          Kumi::Core::IR::ExecutionEngine.run(@ir, vm_context, accessors: @acc, registry: @reg).fetch(name)
+        end
         mode == :ruby ? unwrap(@decl[name], out) : out
       end
@@ -119,6 +136,39 @@ module Kumi
         v[:k] == :scalar ? v[:v] : v # no grouping needed
       end
+      def topo_closure_for_target(store_name)
+        target_decl = @name_index[store_name]
+        raise "Unknown target store #{store_name}" unless target_decl
+        # DFS collect closure of decl names using pre-computed IR-level dependencies
+        seen = {}
+        order = []
+        visiting = {}
+        visit = lambda do |dname|
+          return if seen[dname]
+          raise "Cycle detected in DAG scheduler: #{dname}. Mutual recursion should be caught earlier by UnsatDetector." if visiting[dname]
+          visiting[dname] = true
+          # Visit declarations that produce the bindings this decl references
+          Array(@ir_dependencies[dname]).each do |ref_binding|
+            # Find which declaration produces this binding
+            producer = @name_index[ref_binding]
+            visit.call(producer.name) if producer
+          end
+          visiting.delete(dname)
+          seen[dname] = true
+          order << dname
+        end
+        visit.call(target_decl.name)
+        # 'order' is postorder; it already yields producers before consumers
+        order.map { |dname| @decl[dname] }
+      end
       private
       def validate_keys(keys)
@@ -146,7 +196,7 @@ module Kumi
           # Store VM format for cross-VM caching
           @cache[name] = vm_result
         end
         # Convert to requested format when returning
         vm_result = @cache[name]
         @mode == :wrapped ? vm_result : @program.unwrap(nil, vm_result)
@@ -203,18 +253,6 @@ module Kumi
         self
       end
-      def wrapped!
-        @mode = :wrapped
-        @cache.clear
-        self
-      end
-      def ruby!
-        @mode = :ruby
-        @cache.clear
-        self
-      end
       private
       def input_field_exists?(field)
@@ -245,12 +283,6 @@ module Kumi
           false
         end
       end
-      def deep_merge(a, b)
-        return b unless a.is_a?(Hash) && b.is_a?(Hash)
-        a.merge(b) { |_k, v1, v2| deep_merge(v1, v2) }
-      end
     end
   end
 end
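
The scheduling idea behind `topo_closure_for_target` (a depth-first walk whose postorder puts producers before consumers) may be easier to follow outside the class. This is a minimal sketch under simplifying assumptions: it uses a single hypothetical `deps` hash that maps each declaration name directly to the names it references, whereas the diff resolves references through `@ir_dependencies` and `@name_index`. The method name `topo_closure` is illustrative only.

```ruby
# Returns the target plus everything it transitively depends on,
# ordered so that each dependency appears before whatever uses it.
# `deps` is a hypothetical name => [referenced names] table.
def topo_closure(target, deps)
  seen = {}      # fully processed names
  visiting = {}  # names on the current DFS path, used for cycle detection
  order = []

  visit = lambda do |name|
    return if seen[name]
    raise ArgumentError, "cycle detected at #{name}" if visiting[name]

    visiting[name] = true
    Array(deps[name]).each { |dep| visit.call(dep) }
    visiting.delete(name)

    seen[name] = true
    order << name # postorder: appended only after all dependencies
  end

  visit.call(target)
  order
end

deps = {
  total:    [:subtotal, :tax],
  tax:      [:subtotal],
  subtotal: [],
  unused:   [:subtotal] # not reachable from :total, so never scheduled
}

p topo_closure(:total, deps) # prints [:subtotal, :tax, :total]
```

Only declarations reachable from the target are scheduled, which is why `:unused` is absent from the result.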

data/lib/kumi/schema.rb CHANGED

@@ -49,12 +49,18 @@ module Kumi
     def schema(&)
       # from_location = caller_locations(1, 1).first
       # raise "Called from #{from_location.path}:#{from_location.lineno}"
-      @__syntax_tree__ = Core::RubyParser::Dsl.build_syntax_tree(&).freeze
+      @__syntax_tree__ = Dev::Profiler.phase("frontend.parse") do
+        Core::RubyParser::Dsl.build_syntax_tree(&).freeze
+      end
       puts Support::SExpressionPrinter.print(@__syntax_tree__, indent: 2) if ENV["KUMI_DEBUG"] || ENV["KUMI_PRINT_SYNTAX_TREE"]
-      @__analyzer_result__ = Analyzer.analyze!(@__syntax_tree__).freeze
-      @__compiled_schema__ = Compiler.compile(@__syntax_tree__, analyzer: @__analyzer_result__).freeze
+      @__analyzer_result__ = Dev::Profiler.phase("analyzer") do
+        Analyzer.analyze!(@__syntax_tree__).freeze
+      end
+      @__compiled_schema__ = Dev::Profiler.phase("compiler") do
+        Compiler.compile(@__syntax_tree__, analyzer: @__analyzer_result__, schema_name: self.name).freeze
+      end
       Inspector.new(@__syntax_tree__, @__analyzer_result__, @__compiled_schema__)
     end
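
The hunks above wrap existing expressions in `Dev::Profiler.phase(...) do ... end` and keep assigning the result, which only works if the wrapper returns the block's value. The profiler's own implementation is not shown in this section, so the following is a hypothetical stand-in, not Kumi's code: `timed_phase` and its `sink` array are invented for illustration.

```ruby
# Hypothetical stand-in for a phase wrapper (NOT Kumi's Profiler).
# Times the block, appends one event to `sink`, and returns the block's
# own value so it can wrap an expression without changing its result.
def timed_phase(name, sink, **tags)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
ensure
  # Runs even if the block raises; the method's return value is still
  # the value of `yield`, because `ensure` does not alter it.
  wall_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  sink << { kind: "phase", name: name, wall_ms: wall_ms.round(3) }.merge(tags)
end

events = []
tree = timed_phase("frontend.parse", events, schema: "Demo") { [:root, []].freeze }

puts tree.inspect
puts events.first[:name]
```

A monotonic clock is used here for the same reason `profile_runner.rb` uses one: wall-clock adjustments cannot produce negative or skewed durations.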

data/lib/kumi/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Kumi
-  VERSION = "0.0.14"
+  VERSION = "0.0.15"
 end

data/lib/kumi.rb CHANGED

@@ -8,6 +8,7 @@ loader.ignore("#{__dir__}/kumi-cli")
 loader.inflector.inflect(
   "lower_to_ir_pass" => "LowerToIRPass",
   "load_input_cse" => "LoadInputCSE",
+  "ir_dependency_pass" => "IRDependencyPass",
   "vm" => "VM",
   "ir" => "IR",
   'ir_dump' => 'IRDump',

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: kumi
 version: !ruby/object:Gem::Version
-  version: 0.0.14
+  version: 0.0.15
 platform: ruby
 authors:
 - André Muta
@@ -48,10 +48,10 @@ files:
 - docs/compiler_design_principles.md
 - docs/dev/analyzer-debug.md
 - docs/dev/parse-command.md
+- docs/dev/vm-profiling.md
 - docs/development/README.md
 - docs/development/error-reporting.md
 - docs/features/README.md
-- docs/features/analysis-cascade-mutual-exclusion.md
 - docs/features/analysis-type-inference.md
 - docs/features/analysis-unsat-detection.md
 - docs/features/hierarchical-broadcasting.md
@@ -93,6 +93,7 @@ files:
 - lib/kumi/core/analyzer/passes/function_signature_pass.rb
 - lib/kumi/core/analyzer/passes/input_access_planner_pass.rb
 - lib/kumi/core/analyzer/passes/input_collector.rb
+- lib/kumi/core/analyzer/passes/ir_dependency_pass.rb
 - lib/kumi/core/analyzer/passes/join_reduce_planning_pass.rb
 - lib/kumi/core/analyzer/passes/load_input_cse.rb
 - lib/kumi/core/analyzer/passes/lower_to_ir_pass.rb
@@ -188,8 +189,11 @@ files:
 - lib/kumi/core/types/inference.rb
 - lib/kumi/core/types/normalizer.rb
 - lib/kumi/core/types/validator.rb
+- lib/kumi/dev.rb
 - lib/kumi/dev/ir.rb
 - lib/kumi/dev/parse.rb
+- lib/kumi/dev/profile_aggregator.rb
+- lib/kumi/dev/profile_runner.rb
 - lib/kumi/dev/runner.rb
 - lib/kumi/errors.rb
 - lib/kumi/frontends.rb

data/docs/features/analysis-cascade-mutual-exclusion.md DELETED

@@ -1,89 +0,0 @@
-# Cascade Mutual Exclusion Detection
-Analyzes cascade expressions to allow safe recursive patterns when conditions are mutually exclusive.
-## Overview
-The cascade mutual exclusion detector identifies when all conditions in a cascade expression cannot be true simultaneously, enabling safe mutual recursion patterns that would otherwise be rejected as cycles.
-## Core Mechanism
-The system performs three-stage analysis:
-1. **Conditional Dependency Tracking** - DependencyResolver marks base case dependencies as conditional
-2. **Mutual Exclusion Analysis** - UnsatDetector determines if cascade conditions are mutually exclusive
-3. **Safe Cycle Detection** - Toposorter allows cycles where all edges are conditional and conditions are mutually exclusive
-## Example: Processing Workflow
-```ruby
-schema do
-  input do
-    string :operation  # "forward", "reverse", "unknown"
-    integer :value
-  end
-  trait :is_forward, input.operation == "forward"
-  trait :is_reverse, input.operation == "reverse"
-  # Safe mutual recursion - conditions are mutually exclusive
-  value :forward_processor do
-    on is_forward, input.value * 2        # Direct calculation
-    on is_reverse, reverse_processor + 10  # Delegates to reverse (safe)
-    base "invalid operation"               # Fallback for unknown operations
-  end
-  value :reverse_processor do
-    on is_forward, forward_processor - 5   # Delegates to forward (safe)
-    on is_reverse, input.value / 2         # Direct calculation
-    base "invalid operation"               # Fallback for unknown operations
-  end
-end
-```
-## Safety Guarantees
-**Allowed**: Cycles where conditions are mutually exclusive
-- `is_forward` and `is_reverse` cannot both be true (operation has single value)
-- Each recursion executes exactly one step before hitting direct calculation
-- Bounded recursion with guaranteed termination
-**Rejected**: Cycles with overlapping conditions
-```ruby
-# This would be rejected - conditions can overlap
-value :unsafe_cycle do
-  on input.n > 0, "positive"
-  on input.n > 5, "large"  # Both can be true!
-  base fn(:not, unsafe_cycle)
-end
-```
-## Implementation Details
-### Conditional Dependencies
-Base case dependencies are marked as conditional because they only execute when no explicit conditions match.
-### Mutual Exclusion Analysis
-Conditions are analyzed for mutual exclusion:
-- Same field equality comparisons: `field == value1` vs `field == value2`
-- Domain constraints ensuring impossibility
-- All condition pairs must be mutually exclusive
-### Metadata Generation
-Analysis results stored in `cascade_metadata` state:
-```ruby
-{
-  condition_traits: [:is_forward, :is_reverse],
-  condition_count: 2,
-  all_mutually_exclusive: true,
-  exclusive_pairs: 1,
-  total_pairs: 1
-}
-```
-## Use Cases
-- Processing workflows with bidirectional logic
-- State machine fallback patterns
-- Recursive decision trees with termination conditions
-- Complex business rules with safe delegation patterns