RubyGems - kumi - Versions diffs - 0.0.0 → 0.0.3 - Mend

kumi 0.0.0 → 0.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (92)

checksums.yaml +4 -4
data/.rubocop.yml +113 -3
data/CHANGELOG.md +21 -1
data/CLAUDE.md +387 -0
data/README.md +257 -20
data/docs/development/README.md +120 -0
data/docs/development/error-reporting.md +361 -0
data/documents/AST.md +126 -0
data/documents/DSL.md +154 -0
data/documents/FUNCTIONS.md +132 -0
data/documents/SYNTAX.md +367 -0
data/examples/deep_schema_compilation_and_evaluation_benchmark.rb +106 -0
data/examples/federal_tax_calculator_2024.rb +112 -0
data/examples/wide_schema_compilation_and_evaluation_benchmark.rb +80 -0
data/lib/generators/trait_engine/templates/schema_spec.rb.erb +27 -0
data/lib/kumi/analyzer/constant_evaluator.rb +51 -0
data/lib/kumi/analyzer/passes/definition_validator.rb +42 -0
data/lib/kumi/analyzer/passes/dependency_resolver.rb +71 -0
data/lib/kumi/analyzer/passes/input_collector.rb +55 -0
data/lib/kumi/analyzer/passes/name_indexer.rb +24 -0
data/lib/kumi/analyzer/passes/pass_base.rb +67 -0
data/lib/kumi/analyzer/passes/toposorter.rb +72 -0
data/lib/kumi/analyzer/passes/type_checker.rb +139 -0
data/lib/kumi/analyzer/passes/type_consistency_checker.rb +45 -0
data/lib/kumi/analyzer/passes/type_inferencer.rb +125 -0
data/lib/kumi/analyzer/passes/unsat_detector.rb +107 -0
data/lib/kumi/analyzer/passes/visitor_pass.rb +41 -0
data/lib/kumi/analyzer.rb +54 -0
data/lib/kumi/atom_unsat_solver.rb +349 -0
data/lib/kumi/compiled_schema.rb +41 -0
data/lib/kumi/compiler.rb +127 -0
data/lib/kumi/domain/enum_analyzer.rb +53 -0
data/lib/kumi/domain/range_analyzer.rb +83 -0
data/lib/kumi/domain/validator.rb +84 -0
data/lib/kumi/domain/violation_formatter.rb +40 -0
data/lib/kumi/domain.rb +8 -0
data/lib/kumi/error_reporter.rb +164 -0
data/lib/kumi/error_reporting.rb +95 -0
data/lib/kumi/errors.rb +116 -0
data/lib/kumi/evaluation_wrapper.rb +20 -0
data/lib/kumi/explain.rb +282 -0
data/lib/kumi/export/deserializer.rb +39 -0
data/lib/kumi/export/errors.rb +12 -0
data/lib/kumi/export/node_builders.rb +140 -0
data/lib/kumi/export/node_registry.rb +38 -0
data/lib/kumi/export/node_serializers.rb +156 -0
data/lib/kumi/export/serializer.rb +23 -0
data/lib/kumi/export.rb +33 -0
data/lib/kumi/function_registry/collection_functions.rb +92 -0
data/lib/kumi/function_registry/comparison_functions.rb +31 -0
data/lib/kumi/function_registry/conditional_functions.rb +36 -0
data/lib/kumi/function_registry/function_builder.rb +92 -0
data/lib/kumi/function_registry/logical_functions.rb +42 -0
data/lib/kumi/function_registry/math_functions.rb +72 -0
data/lib/kumi/function_registry/string_functions.rb +54 -0
data/lib/kumi/function_registry/type_functions.rb +51 -0
data/lib/kumi/function_registry.rb +138 -0
data/lib/kumi/input/type_matcher.rb +92 -0
data/lib/kumi/input/validator.rb +52 -0
data/lib/kumi/input/violation_creator.rb +50 -0
data/lib/kumi/input.rb +8 -0
data/lib/kumi/parser/build_context.rb +25 -0
data/lib/kumi/parser/dsl.rb +12 -0
data/lib/kumi/parser/dsl_cascade_builder.rb +125 -0
data/lib/kumi/parser/expression_converter.rb +58 -0
data/lib/kumi/parser/guard_rails.rb +43 -0
data/lib/kumi/parser/input_builder.rb +94 -0
data/lib/kumi/parser/input_proxy.rb +29 -0
data/lib/kumi/parser/parser.rb +66 -0
data/lib/kumi/parser/schema_builder.rb +172 -0
data/lib/kumi/parser/sugar.rb +108 -0
data/lib/kumi/schema.rb +49 -0
data/lib/kumi/schema_instance.rb +43 -0
data/lib/kumi/syntax/declarations.rb +23 -0
data/lib/kumi/syntax/expressions.rb +30 -0
data/lib/kumi/syntax/node.rb +46 -0
data/lib/kumi/syntax/root.rb +12 -0
data/lib/kumi/syntax/terminal_expressions.rb +27 -0
data/lib/kumi/syntax.rb +9 -0
data/lib/kumi/types/builder.rb +21 -0
data/lib/kumi/types/compatibility.rb +86 -0
data/lib/kumi/types/formatter.rb +24 -0
data/lib/kumi/types/inference.rb +40 -0
data/lib/kumi/types/normalizer.rb +70 -0
data/lib/kumi/types/validator.rb +35 -0
data/lib/kumi/types.rb +64 -0
data/lib/kumi/version.rb +1 -1
data/lib/kumi.rb +7 -3
data/scripts/generate_function_docs.rb +59 -0
data/test_impossible_cascade.rb +51 -0
metadata +93 -10
data/sig/kumi.rbs +0 -4

data/docs/development/error-reporting.md ADDED

@@ -0,0 +1,361 @@
+# Error Reporting Standards
+This guide defines standards for error reporting in Kumi so that error messages are consistent and carry source locations throughout the system.
+## Overview
+Kumi uses a unified error reporting interface that:
+- Provides consistent location information (file:line:column)
+- Categorizes errors by type (syntax, semantic, type, runtime)
+- Supports both immediate raising and error accumulation patterns
+- Maintains backward compatibility with existing tests
+- Enables enhanced error messages with suggestions and context
+## Core Interface Components
+### ErrorReporter Module
+Central error reporting functionality with standardized error entries:
+```ruby
+# Create structured error entry
+entry = ErrorReporter.create_error(
+  "Error message",
+  location: node.loc,
+  type: :semantic,
+  context: { additional: "info" }
+)
+# Add error to accumulator
+ErrorReporter.add_error(errors, "message", location: node.loc)
+# Immediately raise error
+ErrorReporter.raise_error("message", location: node.loc, error_class: Errors::SyntaxError)
+```
+### ErrorReporting Mixin
+Convenient methods for classes that need error reporting:
+```ruby
+class MyClass
+  include ErrorReporting
+  def process
+    # Accumulated errors (analyzer pattern)
+    report_error(errors, "message", location: node.loc, type: :semantic)
+    # Immediate errors (parser pattern)
+    raise_localized_error("message", location: node.loc, error_class: Errors::SyntaxError)
+  end
+end
+```
+## Implementation Patterns
+### Parser Classes (Immediate Errors)
+Parser classes should raise errors immediately when encountered:
+```ruby
+class DslBuilderContext
+  include ErrorReporting
+  def validate_name(name, type, location)
+    return if name.is_a?(Symbol)
+    raise_syntax_error(
+      "The name for '#{type}' must be a Symbol, got #{name.class}",
+      location: location
+    )
+  end
+  def raise_error(message, location)
+    # Legacy method - delegates to new interface
+    raise_syntax_error(message, location: location)
+  end
+end
+```
+### Analyzer Passes (Accumulated Errors)
+Analyzer passes should accumulate errors and report them at the end:
+```ruby
+class MyAnalyzerPass < PassBase
+  def run(errors)
+    each_decl do |decl|
+      validate_declaration(decl, errors)
+    end
+  end
+  private
+  def validate_declaration(decl, errors)
+    # New error reporting method
+    report_error(
+      errors,
+      "Validation failed for #{decl.name}",
+      location: decl.loc,
+      type: :semantic
+    )
+    # Legacy method (backward compatible)
+    add_error(errors, decl.loc, "Legacy format message")
+  end
+end
+```
+## Location Resolution Best Practices
+### Always Provide Location When Available
+```ruby
+# Good: Specific node location
+report_error(errors, "Type mismatch", location: node.loc)
+# Acceptable: Fallback location
+report_error(errors, "Cycle detected", location: first_node&.loc || :cycle)
+# Avoid: No location information
+report_error(errors, "Error occurred", location: nil)
+```
+### Complex Error Location Resolution
+For errors that span multiple nodes or are contextual:
+```ruby
+def report_cycle(cycle_path, errors)
+  # Find first declaration in cycle for location context
+  first_decl = find_declaration_by_name(cycle_path.first)
+  location = first_decl&.loc || :cycle
+  report_error(
+    errors,
+    "cycle detected: #{cycle_path.join(' → ')}",
+    location: location,
+    type: :semantic
+  )
+end
+def find_declaration_by_name(name)
+  return nil unless schema
+  schema.attributes.find { |attr| attr.name == name } ||
+    schema.traits.find { |trait| trait.name == name }
+end
+```
+### Location Fallbacks
+When AST location is not available, use meaningful symbolic locations:
+```ruby
+# Cycle detection
+location = node.loc || :cycle
+# Type inference failures
+location = decl.loc || :type_inference
+# Cross-reference resolution
+location = ref_node.loc || :reference_resolution
+```
+## Error Categorization
+### Error Types
+- `:syntax` - Parse-time structural errors
+- `:semantic` - Analysis-time logical errors
+- `:type` - Type system violations
+- `:runtime` - Execution-time failures
+### Type-specific Methods
+```ruby
+# Syntax errors (parser)
+report_syntax_error(errors, "Invalid syntax", location: loc)
+raise_syntax_error("Invalid syntax", location: loc)
+# Semantic errors (analyzer)
+report_semantic_error(errors, "Logic error", location: loc)
+# Type errors (type checker)
+report_type_error(errors, "Type mismatch", location: loc)
+```
+## Enhanced Error Messages
+### Basic Enhanced Errors
+```ruby
+report_enhanced_error(
+  errors,
+  "undefined reference to `missing_field`",
+  location: node.loc,
+  similar_names: ["missing_value", "missing_data"],
+  suggestions: [
+    "Check spelling of field name",
+    "Ensure field is declared in input block"
+  ]
+)
+```
+### Context-rich Errors
+```ruby
+report_error(
+  errors,
+  "Type mismatch in function call",
+  location: call_node.loc,
+  type: :type,
+  context: {
+    function: call_node.fn_name,
+    expected_type: expected,
+    actual_type: actual,
+    argument_position: position
+  }
+)
+```
+## Backward Compatibility
+### Legacy Format Support
+The system supports both legacy `[location, message]` arrays and new `ErrorEntry` objects:
+```ruby
+# Analyzer.format_errors handles both formats
+def format_errors(errors)
+  errors.map do |error|
+    case error
+    when ErrorReporter::ErrorEntry
+      error.to_s  # New format: "at file.rb:10:5: message"
+    when Array
+      loc, msg = error
+      "at #{loc || '?'}: #{msg}"  # Legacy format
+    end
+  end.join("\n")
+end
+```
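The mixed-format handling described above can be tried in isolation. The following is a minimal sketch, not Kumi's implementation: `ErrorEntry` here is a hypothetical stand-in for `ErrorReporter::ErrorEntry` with only a location and a message, and its `to_s` format is assumed from the comment in the snippet above.

```ruby
# Hypothetical stand-in for ErrorReporter::ErrorEntry; the real class may differ.
ErrorEntry = Struct.new(:location, :message, keyword_init: true) do
  def to_s
    "at #{location || '?'}: #{message}"
  end
end

# Formats a list that mixes new-style entries and legacy [location, message] pairs.
def format_errors(errors)
  errors.map do |error|
    case error
    when ErrorEntry
      error.to_s
    when Array
      loc, msg = error
      "at #{loc || '?'}: #{msg}"
    else
      # Assumption: anything else is a programming error worth surfacing.
      raise ArgumentError, "unsupported error format: #{error.inspect}"
    end
  end.join("\n")
end

errors = [
  ErrorEntry.new(location: "schema.rb:10:5", message: "duplicated definition `adult`"),
  [nil, "undefined reference to `missing_field`"] # legacy pair without a location
]
puts format_errors(errors)
```

The `else` branch is an addition of this sketch; the original snippet silently maps unknown formats to `nil`.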
+### Migration Strategy
+1. **New code**: Use new error reporting methods (`report_error`, `raise_localized_error`)
+2. **Existing code**: No changes required - `add_error` method maintained for compatibility
+3. **Enhanced features**: Migrate to new methods to access suggestions, context, and categorization
+## Testing Error Reporting
+### Error Location Testing
+```ruby
+RSpec.describe "Error Location Verification" do
+  it "reports errors at correct locations" do
+    schema_code = <<~RUBY
+      Kumi.schema do
+        input { integer :age }
+        trait :adult, (input.age >= 18)
+        trait :adult, (input.age >= 21)  # Line 4: Duplicate
+      end
+    RUBY
+    # Fails if no error is raised, unlike a bare begin/rescue
+    expect { eval(schema_code, binding, "test.rb", 1) }.to raise_error(Kumi::Errors::SemanticError) { |e|
+      expect(e.message).to include("test.rb:4")
+      expect(e.message).to include("duplicated definition")
+    }
+  end
+end
+```
+### Error Quality Testing
+```ruby
+it "provides comprehensive error information" do
+  error = expect_semantic_error do
+    schema do
+      input { string :name }
+      value :result, fn(:add, input.name, 5)
+    end
+  end
+  expect(error.message).to include("add")           # Function name
+  expect(error.message).to include("string")        # Actual type
+  expect(error.message).to include("expects")       # Clear expectation
+  expect(error.message).to match(/:\d+:/)          # Line number
+end
+```
+### Edge Case Testing
+Use `spec/integration/potential_breakage_spec.rb` patterns:
+```ruby
+it "detects edge case that should break" do
+  expect do
+    schema do
+      input { integer :x }
+      # Edge case that might not be caught
+      value :result, some_edge_case_construct
+    end
+  end.to raise_error(Kumi::Errors::SemanticError)
+end
+```
+## Performance Considerations
+### Error Object Creation
+- ErrorEntry objects are lightweight structs
+- Location formatting is lazy (only when `to_s` is called)
+- Context information is stored efficiently in hashes
+### Batch Error Processing
+For analyzer passes processing many nodes:
+```ruby
+def run(errors)
+  # Collect invalid nodes first, then create error objects only for those nodes
+  invalid_nodes = collect_invalid_nodes
+  invalid_nodes.each do |node|
+    report_error(errors, "Invalid: #{node.name}", location: node.loc)
+  end
+end
+```
+## Common Patterns and Anti-patterns
+### ✅ Good Patterns
+```ruby
+# Clear, specific error messages
+report_error(errors, "argument 1 of `fn(:add)` expects float, got string", location: arg.loc)
+# Proper location resolution
+location = node.loc || fallback_location_for_context
+# Type-appropriate error categorization
+report_type_error(errors, "type mismatch", location: node.loc)
+```
+### ❌ Anti-patterns
+```ruby
+# Vague error messages
+report_error(errors, "error", location: node.loc)
+# Missing location information
+report_error(errors, "something failed", location: nil)
+# Wrong error categorization
+report_syntax_error(errors, "type mismatch", location: node.loc)  # Should be type error
+```
+## Error Message Guidelines
+### Message Format
+- Start with lowercase (automatic capitalization in display)
+- Be specific about what failed and why
+- Include relevant context (function names, types, values)
+- Avoid technical jargon in user-facing messages
+### Examples
+```ruby
+# Good messages
+"argument 1 of `fn(:add)` expects float, got input field `name` of declared type string"
+"duplicated definition `adult`"
+"undefined reference to `missing_field`"
+"cycle detected: a → b → a"
+# Messages to improve
+"validation failed"
+"error in processing"
+"something went wrong"
+```
+This error reporting system ensures that users get clear, actionable feedback about issues in their Kumi schemas, with precise location information to help them fix problems quickly.

data/documents/AST.md ADDED

@@ -0,0 +1,126 @@
+# Kumi AST Reference
+## Core Node Types
+**Root**: Schema container
+```ruby
+Root = Struct.new(:inputs, :attributes, :traits)
+```
+**FieldDecl**: Input field metadata
+```ruby
+FieldDecl = Struct.new(:name, :domain, :type)
+# DSL: integer :age, domain: 18..65 → FieldDecl(:age, 18..65, :integer)
+```
+**Trait**: Boolean predicate
+```ruby
+Trait = Struct.new(:name, :expression)
+# DSL: trait :adult, (input.age >= 18) → Trait(:adult, CallExpression(...))
+```
+**Attribute**: Computed value
+```ruby
+Attribute = Struct.new(:name, :expression)
+# DSL: value :total, fn(:add, a, b) → Attribute(:total, CallExpression(:add, [...]))
+```
+## Expression Nodes
+**CallExpression**: Function calls and operators
+```ruby
+CallExpression = Struct.new(:fn_name, :args) do
+  # Enable chaining: (expr1) & (expr2)
+  def &(other) = CallExpression.new(:and, [self, other])
+end
+```
+**FieldRef**: Field access (`input.field_name`)
+```ruby
+FieldRef = Struct.new(:name)
+# Has operator methods: >=, <=, >, <, ==, != that create CallExpression nodes
+```
+**Binding**: References to other declarations
+```ruby
+Binding = Struct.new(:name)
+# Created by: ref(:name) OR bare identifier (trait_name) in composite traits
+# DSL: ref(:adult) → Binding(:adult)
+# DSL: adult & verified → CallExpression(:and, [Binding(:adult), Binding(:verified)])
+```
+**Literal**: Constants (`18`, `"text"`, `true`)
+```ruby
+Literal = Struct.new(:value)
+```
+**ListExpression**: Arrays (`[1, 2, 3]`)
+```ruby
+ListExpression = Struct.new(:elements)
+```
+## Cascade Expressions (Conditional Values)
+**CascadeExpression**: Container for conditional logic
+```ruby
+CascadeExpression = Struct.new(:cases)
+```
+**WhenCaseExpression**: Individual conditions
+```ruby
+WhenCaseExpression = Struct.new(:condition, :result)
+```
+**Case type mappings**:
+- `on :a, :b, result` → `condition: fn(:all?, ref(:a), ref(:b))`
+- `on_any :a, :b, result` → `condition: fn(:any?, ref(:a), ref(:b))`
+- `base result` → `condition: literal(true)`
+## Key Nuances
+**Operator methods on FieldRef**: Enable `input.age >= 18` syntax by defining operators that create `CallExpression` nodes
+**CallExpression `&` method**: Enables expression chaining like `(expr1) & (expr2)`
+**Node immutability**: AST nodes are frozen after construction; analysis results stored separately
+**Location tracking**: All nodes include file/line/column for error reporting
+**Tree traversal**: Each node defines `children` method for recursive processing
+**Expression wrapping**: During parsing, raw values auto-convert to `Literal` nodes via `ensure_syntax()`
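The operator-method and traversal points above can be illustrated with a small standalone sketch. It uses simplified stand-ins for the node types (no location tracking, no freezing), and `to_node` and `field_names` are hypothetical helpers, not Kumi functions. The `==` and `!=` operators are deliberately left out so that ordinary Struct equality stays usable for comparing trees.

```ruby
# Simplified stand-ins for the AST nodes described above.
Literal = Struct.new(:value) do
  def children
    []
  end
end

CallExpression = Struct.new(:fn_name, :args) do
  # Chaining: (expr1) & (expr2) builds an :and call.
  def &(other)
    CallExpression.new(:and, [self, other])
  end

  def children
    args
  end
end

# Hypothetical helper: wraps raw Ruby values in Literal, passes nodes through.
def to_node(value)
  value.is_a?(Struct) ? value : Literal.new(value)
end

FieldRef = Struct.new(:name) do
  # Each comparison operator builds a CallExpression instead of comparing.
  %i[>= <= > <].each do |op|
    define_method(op) { |other| CallExpression.new(op, [self, to_node(other)]) }
  end

  def children
    []
  end
end

# Recursive traversal through `children`: collect every referenced input field.
def field_names(node)
  own = node.is_a?(FieldRef) ? [node.name] : []
  own + node.children.flat_map { |child| field_names(child) }
end

tree = (FieldRef.new(:age) >= 21) & (FieldRef.new(:score) > 80)
p tree.fn_name       # => :and
p field_names(tree)  # => [:age, :score]
```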
+## Common Expression Trees
+**Simple**: `(input.age >= 18)`
+```
+CallExpression(:>=, [FieldRef(:age), Literal(18)])
+```
+**Chained AND**: `(input.age >= 21) & (input.verified == true)`
+```
+CallExpression(:and, [
+  CallExpression(:>=, [FieldRef(:age), Literal(21)]),
+  CallExpression(:==, [FieldRef(:verified), Literal(true)])
+])
+```
+**Composite Trait**: `adult & verified & high_income`
+```
+CallExpression(:and, [
+  CallExpression(:and, [
+    Binding(:adult),
+    Binding(:verified)
+  ]),
+  Binding(:high_income)
+])
+```
+**Mixed Composition**: `adult & (input.score > 80) & verified`
+```
+CallExpression(:and, [
+  CallExpression(:and, [
+    Binding(:adult),
+    CallExpression(:>, [FieldRef(:score), Literal(80)])
+  ]),
+  Binding(:verified)
+])
+```

data/documents/DSL.md ADDED

@@ -0,0 +1,154 @@
+# Kumi DSL Reference
+Kumi is a declarative language for defining, analyzing, and executing complex business logic. It compiles rules into a verifiable dependency graph and checks, to the extent the current implementation allows, that the logic is **sound, maintainable, and free of contradictions** before execution.
+-----
+## Guiding Principles
+Kumi's design is opinionated and guides you toward creating robust and analyzable business logic.
+  * **Logic as Code, Not Just Configuration**: Rules are expressed in a clean, readable DSL that can be version-controlled and tested.
+  * **Provable Correctness**: A multi-pass analyzer statically verifies your schema, detecting duplicates, circular dependencies, type errors, and even **logically impossible conditions** (e.g., `age < 25 AND age > 65`) at compile time.
+  * **Explicit Data Contracts**: The mandatory `input` block serves as a formal, self-documenting contract for the data your schema expects, enabling runtime validation of types and domain constraints.
+  * **Composition Over Complexity**: Complex rules are built by composing simpler, named concepts (`trait`s), rather than creating large, monolithic blocks of logic.
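To make the "logically impossible conditions" example concrete, here is a minimal sketch of one way such a contradiction can be detected for a single numeric variable: intersect the bounds and check whether the resulting interval is empty. This is an illustration under stated assumptions, not the algorithm Kumi's analyzer uses; `satisfiable?` is a hypothetical helper and it treats the variable as real-valued.

```ruby
# Each constraint is [operator, bound] on the same numeric variable,
# e.g. [[:<, 25], [:>, 65]] for `age < 25 AND age > 65`.
def satisfiable?(constraints)
  lower = -Float::INFINITY
  upper = Float::INFINITY
  lower_strict = false
  upper_strict = false

  constraints.each do |op, bound|
    case op
    when :>, :>=
      strict = (op == :>)
      # Keep the tightest lower bound; at equal bounds, strict is tighter.
      if bound > lower || (bound == lower && strict)
        lower = bound
        lower_strict = strict
      end
    when :<, :<=
      strict = (op == :<)
      # Keep the tightest upper bound; at equal bounds, strict is tighter.
      if bound < upper || (bound == upper && strict)
        upper = bound
        upper_strict = strict
      end
    else
      raise ArgumentError, "unsupported operator: #{op.inspect}"
    end
  end

  # Non-empty interval, or a single point allowed by two inclusive bounds.
  lower < upper || (lower == upper && !lower_strict && !upper_strict)
end

p satisfiable?([[:<, 25], [:>, 65]])   # => false
p satisfiable?([[:>=, 18], [:<, 65]])  # => true
```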
+-----
+## Core Syntax
+A Kumi schema contains an `input` block to declare its data contract, followed by `trait` and `value` definitions.
+```ruby
+schema do
+  # 1. Define the data contract for this schema.
+  input do
+    # ... field declarations
+  end
+  # 2. Define reusable boolean predicates (traits).
+  # ... trait definitions
+  # 3. Define computed fields (values).
+  # ... value definitions
+end
+```
+-----
+## Input Fields: The Data Contract
+The `input` block declares the schema's data dependencies. All external data must be accessed via the `input` object (e.g., `input.age`).
+### **Declaration Methods**
+The preferred way to declare fields is with **type-specific methods**, which provide compile-time type checking and runtime validation.
+  * **Primitives**:
+    ```ruby
+    string  :name
+    integer :age, domain: 18..65
+    float   :score, domain: 0.0..100.0
+    boolean :is_active
+    ```
+  * **Collections**:
+    ```ruby
+    array :tags, elem: { type: :string }
+    hash  :metadata, key: { type: :string }, val: { type: :any }
+    ```
+### **Domain Constraints**
+Attach validation rules directly to input fields using `domain:`. These are checked when data is loaded.
+  * **Range**: `domain: 1..100` or `0.0...1.0` (exclusive end)
+  * **Enumeration**: `domain: %w[pending active archived]`
+  * **Custom Logic**: `domain: ->(value) { value.even? }`
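The three domain forms listed above can each be checked with plain Ruby. The sketch below uses a hypothetical `domain_allows?` helper to show the idea; it is not Kumi's validator and makes no claim about how Kumi reports violations.

```ruby
# Hypothetical helper: true when `value` satisfies the declared domain.
def domain_allows?(domain, value)
  case domain
  when nil   then true                   # no constraint declared
  when Range then domain.cover?(value)   # honors exclusive ends (a...b)
  when Array then domain.include?(value) # enumeration
  when Proc  then !!domain.call(value)   # custom logic
  else raise ArgumentError, "unsupported domain: #{domain.inspect}"
  end
end

p domain_allows?(1..100, 100)                            # => true
p domain_allows?(0.0...1.0, 1.0)                         # => false
p domain_allows?(%w[pending active archived], "active")  # => true
p domain_allows?(->(value) { value.even? }, 3)           # => false
```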
+-----
+## Traits: Named Logical Predicates
+A **`trait`** is a named expression that **must evaluate to a boolean**. Traits are the fundamental building blocks of logic, defining reusable, verifiable conditions.
+### **Defining & Composing Traits**
+Traits are defined with a parenthesized expression and composed using the `&` operator. This composition is strictly **conjunctive (logical AND)**, a key constraint that enables Kumi's powerful static analysis.
+```ruby
+# Base Traits
+trait :is_adult, (input.age >= 18)
+trait :is_verified, (input.status == "verified")
+# Composite Trait (is_adult AND is_verified)
+trait :can_proceed, is_adult & is_verified
+# Mix bare trait names with inline expressions
+trait :is_eligible, is_adult & is_verified & (input.score > 50)
+```
+-----
+## Values: Computed Fields
+A **`value`** is a named expression that computes a field of any type.
+### **Simple Values**
+Values can be defined with expressions using `input` fields, functions (`fn`), and references to other values.
+```ruby
+value :full_name, fn(:concat, input.first_name, " ", input.last_name)
+value :discounted_price, fn(:multiply, input.base_price, 0.8)
+```
+### **Conditional Values (Cascades)**
+For conditional logic, a `value` takes a block to create a **cascade**. Cascades select a result based on a series of conditions, which **must reference named `trait`s**. This enforces clarity by separating the *what* (the condition's name) from the *how* (its implementation).
+```ruby
+value :access_level do
+  # `on` implies AND: user must be :premium AND :verified.
+  on :premium, :verified, "Full Access"
+  # `on_any` implies OR: user can be :staff OR :admin.
+  on_any :staff, :admin, "Elevated Access"
+  # `on_none` implies NOT (A OR B): user is neither :blocked NOR :suspended.
+  on_none :blocked, :suspended, "Limited Access"
+  # `base` is the default if no other conditions match.
+  base "No Access"
+end
+```
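The cascade above can be read as an ordered list of cases. The following plain-Ruby sketch models that reading with already-evaluated trait values. It assumes the first matching case wins, which the text implies but does not state outright, and `resolve_cascade` is a hypothetical helper rather than part of Kumi.

```ruby
# Each case: [matcher kind, trait names, result].
ACCESS_LEVEL_CASES = [
  [:all,  %i[premium verified],  "Full Access"],     # on
  [:any,  %i[staff admin],       "Elevated Access"], # on_any
  [:none, %i[blocked suspended], "Limited Access"],  # on_none
  [:base, [],                    "No Access"]        # base
].freeze

# traits: Hash mapping trait name => boolean.
def resolve_cascade(cases, traits)
  cases.each do |kind, names, result|
    values = names.map { |name| traits.fetch(name) }
    matched =
      case kind
      when :all  then values.all?
      when :any  then values.any?
      when :none then values.none?
      when :base then true
      end
    return result if matched
  end
  nil # no case matched and no base was given
end

traits = {
  premium: true, verified: false,
  staff: false, admin: true,
  blocked: false, suspended: false
}
puts resolve_cascade(ACCESS_LEVEL_CASES, traits) # prints "Elevated Access"
```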
+-----
+## The Kumi Pattern: Separating AND vs. OR Logic
+Kumi intentionally enforces a pattern for handling different types of logic to maximize clarity and analyzability.
+  * **`trait`s and `&` are for AND logic**: Use `trait` composition to build up a set of conditions that must *all* be true. This is your primary tool for defining constraints.
+  * **`value` cascades are for OR logic**: Use `on_any` within a `value` cascade to handle conditions where *any* one of several predicates is sufficient. This is the idiomatic way to express disjunctive logic.
+This separation forces complex `OR` conditions to be handled within the clear, readable structure of a cascade, rather than being hidden inside a complex `trait` definition.
+-----
+## Best Practices
+  * **Prefer Small, Composable Traits**: Avoid creating large, monolithic traits with many `&` conditions. Instead, define smaller, named traits and compose them.
+    ```ruby
+    # AVOID: Hard to read and reuse
+    trait :eligible, (input.age >= 18) & (input.status == "active") & (input.score > 50)
+    # PREFER: Clear, reusable, and self-documenting
+    trait :is_adult, (input.age >= 18)
+    trait :is_active, (input.status == "active")
+    trait :has_good_score, (input.score > 50)
+    trait :is_eligible, is_adult & is_active & has_good_score
+    ```
+  * **Name All Conditions**: If you need to use a condition in a `value` cascade, define it as a `trait` first. This gives the condition a clear business name and makes the cascade easier to read.