RubyGems - kairos-chain - Versions diffs - 3.16.0 → 3.17.0 - Mend

kairos-chain 3.16.0 → 3.17.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (105)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 57b5814a96eadd45ccf9056493e217017d0ff5d83f676ee5cccde7c8ae8b3fb7
-  data.tar.gz: 12c91db84b545e576e821fe1e1a82c7fa0f30089850fff92adcab7f85e244cc0
+  metadata.gz: 9d2b6b702508caf49a43e1fdc09beb4b6d5a7dd478a89492d07503643744855c
+  data.tar.gz: faaceed727c7de49424391d32768bff4790411a414cd4b2a671fcd31501aadf5
 SHA512:
-  metadata.gz: 528258d448d7079162acb42e7bb6d4ad62402b5d3c6a3802882137175462f24468e3355cc2d90a0efdb778a0949c523209d5e060a637fb5500f893155f9b2ed5
-  data.tar.gz: 74e8d097b208b131bab97a3b2c81208644279c08a48f7f4cb54ddedd9a8fe31c15eb7b22ebfddd0d07eec2bd93672990168e246e9350af202ff517b232c0f992
+  metadata.gz: 727829e5b3672c2b694336baabb14af1a0906ee3b7c196bb320ea0060db98bc163ab36a7971f405eb88e0c1d0393dcb370fed337bb6220e0e8dd1716885e7166
+  data.tar.gz: 12be3d20add101e4ac13871b4c36dce043e104fcbe67ab1af95ad55793b018b2cc0e785324d34a4f8887a2969c7d6b5db9da61a9f777496d560296a8d94f14cc

data/CHANGELOG.md CHANGED

@@ -4,6 +4,48 @@ All notable changes to the `kairos-chain` gem will be documented in this file.
 This project follows [Semantic Versioning](https://semver.org/).
+## [3.17.0] - 2026-04-22
+### Added
+- **Multi-LLM Review SkillSet** — New `multi_llm_review` tool dispatching review
+  tasks in parallel to heterogeneous LLMs (Claude Opus 4.6/4.7 via Agent tool
+  and CLI, OpenAI Codex GPT-5.4, Cursor Composer-2). Returns consensus verdict
+  with aggregated findings. Convergence rule configurable (default `3/4 APPROVE`).
+  Complexity auto-detection from `review_type` + artifact size. Sandbox mode
+  via `review_context: independent` prevents CLAUDE.md contamination.
+- **LLM adapter extensions** — `codex_adapter` (`codex exec` subprocess) and
+  `cursor_adapter` (`agent -p` subprocess) join the existing `claude_code_adapter`.
+  All share the `SafeSubprocess` wrapper with timeout, stderr isolation, and
+  ANSI stripping. `provider_override` parameter on `llm_call` routes to a
+  specific adapter independent of the default config.
+- **Agent SkillSet integration** — Agent OODA loop can invoke `multi_llm_review`
+  as an ACT step for design/implementation review cycles.
+### Fixed
+- **Adapter model label** — `codex_adapter` and `cursor_adapter` no longer fall
+  back to `llm_client.yml`'s Anthropic default model (`claude-opus-4-6`) when
+  the caller omits `model`. They now report `codex-cli-default` /
+  `cursor-cli-default` in response JSON, honestly reflecting that the CLI's
+  own default model was used. No change to dispatch behavior.
+- **Runtime integration** — CLI auth flow, effort control per provider,
+  auto-complexity mapping (`low`/`medium`/`high`/`xhigh`) validated end-to-end.
+- **MMP keypair persistence** — Keypair saved to `.kairos/keys/` instead of
+  ephemeral CWD, preventing loss on directory change.
+- **paused_risk skip transition** — Agent state machine correctly handles
+  `skip` action from `paused_risk` state.
+- **llm_call AuthError fallback** — Auto-switch to `claude_code` provider
+  when Anthropic API auth fails.
+### Review
+- Design: 2 rounds × 4 LLMs (Claude Opus 4.6, Claude Opus 4.7, Codex GPT-5.4,
+  Cursor Composer-2)
+- Implementation: 2 rounds × 4 LLMs
+- Runtime test: 4-LLM diversity verified on buggy Fibonacci artifact
+  (4/4 REJECT with distinct findings per reviewer characteristic)
 ## [3.16.0] - 2026-04-19
 ### Changed
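
The convergence rule named in the 3.17.0 entry (default `3/4 APPROVE`) can be pictured as a plain vote count. The sketch below is illustrative only: the method name `converged?`, the verdict strings, and the `required:` parameter are assumptions made for this example and are not taken from the gem's source.

```ruby
# Hypothetical sketch of a "3 of 4 APPROVE" convergence check.
# None of these names come from kairos-chain; they are illustrative.
def converged?(verdicts, required: 3)
  # Count reviewers whose verdict is APPROVE (case-insensitive).
  approvals = verdicts.count { |v| v.to_s.upcase == 'APPROVE' }
  approvals >= required
end

# Three of four reviewers approve: the default threshold is met.
puts converged?(%w[APPROVE APPROVE REJECT APPROVE])

# The changelog's runtime test reports 4/4 REJECT: no convergence.
puts converged?(%w[REJECT REJECT REJECT REJECT])

# A stricter, unanimous rule is just a different threshold.
puts converged?(%w[APPROVE APPROVE REJECT APPROVE], required: 4)
```

Keeping the threshold as a parameter mirrors the changelog's statement that the rule is configurable. How the real tool aggregates findings or handles missing reviewers is not shown in this diff.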

data/README.md CHANGED

@@ -20,6 +20,7 @@ A self-referential [Model Context Protocol (MCP)](https://modelcontextprotocol.i
 - **Attestation System (Synoptis)** — Cryptographic attestation and trust scoring
 - **Dream Mode** — Speculative knowledge proposals with community review
 - **Claude Code Plugin Projection** — Auto-project SkillSets as Claude Code plugins (hooks, agents, slash commands)
+- **Multi-LLM Review** — Parallel dispatch to heterogeneous LLMs (Claude, Codex, Cursor) via CLI subprocesses; consensus verdict with aggregated findings
 ## Installation

data/lib/kairos_mcp/daemon/active_observe.rb ADDED

@@ -0,0 +1,180 @@
+# frozen_string_literal: true
+module KairosMcp
+  class Daemon
+    # ActiveObserve — policy-driven OBSERVE phase.
+    #
+    # Design (v0.2 §2, P3.1):
+    #   The passive OBSERVE phase inspects mandate state in memory. The
+    #   active variant additionally invokes a whitelisted set of READ-ONLY
+    #   tools named in the mandate's `observe_policies`, collects their
+    #   results, and runs a cheap triage step to highlight what looks
+    #   relevant. In P3.1 the triage is a keyword filter stub; a future
+    #   revision will slot in a cheap LLM call with the same interface.
+    #
+    # Safety:
+    #   Only tools listed in READ_ONLY_ALLOWLIST (or the caller-supplied
+    #   allowlist) are ever invoked. A mandate cannot widen its own
+    #   observation surface — policies must be a subset of the allowlist.
+    class ActiveObserve
+      # A deliberately conservative default. Additional read-only tools may
+      # be allowed by passing an explicit `allowlist:` into #initialize.
+      READ_ONLY_ALLOWLIST = %w[
+        chain_history
+        chain_status
+        chain_verify
+        knowledge_get
+        knowledge_list
+        skills_list
+        skills_get
+        skills_dsl_list
+        skills_dsl_get
+        resource_list
+        resource_read
+        introspection_health
+        introspection_check
+        state_status
+        state_history
+        document_status
+        meeting_browse
+        meeting_check_freshness
+        skillset_browse
+      ].freeze
+      def initialize(allowlist: READ_ONLY_ALLOWLIST, keywords: nil, logger: nil)
+        @allowlist = allowlist.map(&:to_s).freeze
+        @keywords  = keywords
+        @logger    = logger
+      end
+      # Execute the active OBSERVE step.
+      #
+      # @param mandate_hash [Hash] must expose :observe_policies (or the
+      #   'observe_policies' key) as an Array of tool names. May expose
+      #   :goal_name and :goal for keyword-based triage.
+      # @param tool_invoker [#call] a callable accepting
+      #   (tool_name, args) and returning the tool's native result. The
+      #   caller supplies this so ActiveObserve itself is I/O-agnostic
+      #   and trivially testable.
+      # @return [Hash] structured observation with :policies_invoked,
+      #   :policies_skipped, :results, :relevant, :errors.
+      def observe(mandate_hash, tool_invoker:)
+        raise ArgumentError, 'mandate_hash required' unless mandate_hash.is_a?(Hash)
+        raise ArgumentError, 'tool_invoker must respond to call' unless tool_invoker.respond_to?(:call)
+        policies = Array(mandate_hash[:observe_policies] || mandate_hash['observe_policies'])
+        # Deduplicate by normalized tool name — first-wins for args if dupes exist.
+        policies = policies.uniq { |e| normalize_policy(e).first }
+        invoked = []
+        skipped = []
+        results = {}
+        errors  = {}
+        policies.each do |entry|
+          tool_name, args = normalize_policy(entry)
+          unless allowed?(tool_name)
+            skipped << tool_name
+            log(:warn, :active_observe_skip, tool: tool_name, reason: 'not_in_allowlist')
+            next
+          end
+          begin
+            results[tool_name] = tool_invoker.call(tool_name, args)
+            invoked << tool_name
+          rescue StandardError => e
+            errors[tool_name] = "#{e.class}: #{e.message}"
+            log(:error, :active_observe_error, tool: tool_name, error: errors[tool_name])
+          end
+        end
+        relevant = select_relevant(results, mandate_hash)
+        {
+          policies_invoked: invoked,
+          policies_skipped: skipped,
+          results:          results,
+          relevant:         relevant,
+          errors:           errors
+        }
+      end
+      # Triage stub. In P3.1 we score results by simple keyword membership
+      # (mandate goal tokens). Returning the full result under a :match
+      # entry keeps the interface stable for the eventual LLM-backed
+      # implementation, which will add confidence scores without changing
+      # the key layout.
+      #
+      # @return [Hash] tool_name → { match: Boolean, score: Float, matched_keywords: [...] }
+      def select_relevant(results, mandate_hash)
+        keywords = effective_keywords(mandate_hash)
+        return {} if results.empty?
+        results.each_with_object({}) do |(tool, payload), acc|
+          matched = match_keywords(payload, keywords)
+          acc[tool] = {
+            match:            !matched.empty? || keywords.empty?,
+            score:            keywords.empty? ? 1.0 : matched.size.to_f / keywords.size,
+            matched_keywords: matched
+          }
+        end
+      end
+      # ------------------------------------------------------------------ helpers
+      private
+      # A policy entry is either a String tool-name or a Hash
+      # { tool: "...", args: {...} }. Normalizing here means the rest of
+      # the class can assume a pair.
+      def normalize_policy(entry)
+        case entry
+        when String
+          [entry, {}]
+        when Hash
+          tool = entry[:tool] || entry['tool'] || entry[:name] || entry['name']
+          args = entry[:args] || entry['args'] || {}
+          [tool.to_s, args]
+        else
+          [entry.to_s, {}]
+        end
+      end
+      def allowed?(tool_name)
+        @allowlist.include?(tool_name.to_s)
+      end
+      def effective_keywords(mandate_hash)
+        return Array(@keywords).map(&:to_s).reject(&:empty?) if @keywords
+        raw = [
+          mandate_hash[:goal_name], mandate_hash['goal_name'],
+          mandate_hash[:goal],      mandate_hash['goal']
+        ].compact.map(&:to_s).join(' ')
+        raw.downcase.scan(/[a-z0-9]{3,}/).uniq
+      end
+      def match_keywords(payload, keywords)
+        return [] if keywords.empty?
+        haystack = payload_to_string(payload).downcase
+        keywords.select { |k| haystack.include?(k) }
+      end
+      def payload_to_string(payload)
+        case payload
+        when String then payload
+        when Hash, Array then payload.to_s
+        else payload.to_s
+        end
+      end
+      def log(level, event, **fields)
+        return unless @logger && @logger.respond_to?(level)
+        @logger.public_send(level, "#{event} #{fields.inspect}")
+      rescue StandardError
+        # Never let a logger crash mask the original tool error.
+      end
+    end
+  end
+end
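
To illustrate the allowlist gate and keyword triage in `active_observe.rb` without loading the gem, here is a minimal stand-alone sketch. `observe_sketch`, `keyword_score`, and the two-entry `SKETCH_ALLOWLIST` are hypothetical names introduced for this example. The logic is a simplified imitation of the file above, not the class itself.

```ruby
# Stand-alone imitation of the ActiveObserve flow. All names below are
# hypothetical; the real class lives in KairosMcp::Daemon::ActiveObserve.
SKETCH_ALLOWLIST = %w[chain_status knowledge_list].freeze

# Invoke only allowlisted tools; record skips and per-tool errors.
def observe_sketch(policies, invoker, allowlist: SKETCH_ALLOWLIST)
  invoked = []
  skipped = []
  results = {}
  errors  = {}
  policies.map(&:to_s).uniq.each do |tool|
    unless allowlist.include?(tool)
      skipped << tool
      next
    end
    begin
      results[tool] = invoker.call(tool, {})
      invoked << tool
    rescue StandardError => e
      # One failing tool must not abort the remaining observations.
      errors[tool] = "#{e.class}: #{e.message}"
    end
  end
  { invoked: invoked, skipped: skipped, results: results, errors: errors }
end

# Fraction of goal keywords (3+ alphanumerics) found in the payload text.
def keyword_score(payload, goal)
  keywords = goal.downcase.scan(/[a-z0-9]{3,}/).uniq
  return 1.0 if keywords.empty?

  haystack = payload.to_s.downcase
  keywords.count { |k| haystack.include?(k) }.to_f / keywords.size
end

# A fake invoker: one tool succeeds, one raises.
invoker = lambda do |tool, _args|
  raise 'boom' if tool == 'knowledge_list'

  { 'chain' => 'ok' }
end

out = observe_sketch(%w[chain_status file_write knowledge_list chain_status], invoker)
p out[:invoked]  # allowlisted and successful
p out[:skipped]  # rejected by the allowlist
p out[:errors]   # allowlisted but raised
p keyword_score(out[:results]['chain_status'], 'verify chain')
```

Passing the invoker in as a callable, as the original does with `tool_invoker:`, keeps the observation step free of I/O and makes it straightforward to test with a lambda.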

data/lib/kairos_mcp/daemon/approval_gate.rb ADDED

@@ -0,0 +1,231 @@
+# frozen_string_literal: true
+require 'json'
+require 'securerandom'
+require 'fileutils'
+require 'time'
+require 'digest'
+module KairosMcp
+  class Daemon
+    # ApprovalGate — pending-file based approval system for code-gen proposals.
+    #
+    # Design (P3.2 v0.2 §4):
+    #   Proposals are staged as JSON files in .kairos/run/proposals/.
+    #   Human approval/rejection is recorded as a separate .decision.json file.
+    #   proposal_hash binds reviewed content to applied content cryptographically.
+    class ApprovalGate
+      DEFAULT_TTL   = 28_800  # 8 hours (daemon mode)
+      MAX_PENDING   = 16
+      def initialize(dir:, clock: -> { Time.now.utc }, logger: nil)
+        @dir    = dir
+        @clock  = clock
+        @logger = logger
+        FileUtils.mkdir_p(@dir, mode: 0o700)
+      end
+      # Stage a pending-approval proposal. Returns the stored Hash.
+      def stage(proposal)
+        id = proposal[:proposal_id] || proposal['proposal_id']
+        raise ArgumentError, 'proposal_id required' if id.to_s.empty?
+        check_backpressure!
+        now = @clock.call
+        ttl = proposal[:ttl_seconds] || proposal['ttl_seconds'] || DEFAULT_TTL
+        # Compute proposal_hash from canonical content (excludes mutable fields)
+        canonical = proposal.reject { |k, _| mutable_key?(k) }
+        p_hash = canonical_hash(canonical)
+        p = proposal.merge(
+          status:        'pending_approval',
+          proposal_hash: p_hash,
+          created_at:    now.iso8601,
+          expires_at:    (now + ttl).iso8601
+        )
+        write_atomic(file_for(id), JSON.pretty_generate(stringify_keys(p)))
+        p
+      end
+      # Auto-approve (fast path for L2 scopes).
+      def auto_approve(proposal)
+        id = proposal[:proposal_id] || proposal['proposal_id']
+        raise ArgumentError, 'proposal_id required' if id.to_s.empty?
+        now = @clock.call
+        ttl = proposal[:ttl_seconds] || proposal['ttl_seconds'] || DEFAULT_TTL
+        canonical = proposal.reject { |k, _| mutable_key?(k) }
+        p_hash = canonical_hash(canonical)
+        p = proposal.merge(
+          status:        'auto_approved',
+          proposal_hash: p_hash,
+          created_at:    now.iso8601,
+          expires_at:    (now + ttl).iso8601
+        )
+        write_atomic(file_for(id), JSON.pretty_generate(stringify_keys(p)))
+        write_decision(id,
+                       decision: 'approve',
+                       reviewer: 'policy:auto_approve',
+                       proposal_hash: p_hash,
+                       granted_at: now.iso8601,
+                       reason: "scope=#{proposal.dig(:scope, :scope) || proposal.dig('scope', 'scope')} auto-approved")
+        p
+      end
+      # Non-blocking status check.
+      # @return [Symbol] :pending | :approved | :rejected | :expired | :not_found
+      def status_of(proposal_id)
+        p = read_proposal(proposal_id)
+        return :not_found unless p
+        return :expired if Time.parse(p['expires_at']) < @clock.call
+        d = read_decision(proposal_id)
+        return :pending unless d
+        d['decision'] == 'approve' ? :approved : :rejected
+      end
+      # Record a human decision (via AttachServer mailbox).
+      def record_decision(proposal_id, decision:, reviewer:, reason: nil)
+        raise ArgumentError, 'decision must be approve|reject' unless %w[approve reject].include?(decision)
+        p = read_proposal(proposal_id)
+        raise NotFoundError, "proposal not found: #{proposal_id}" unless p
+        raise ConflictError, "already decided: #{proposal_id}" if File.exist?(decision_file(proposal_id))
+        raise ExpiredError, "expired: #{proposal_id}" if Time.parse(p['expires_at']) < @clock.call
+        write_decision(proposal_id,
+                       decision: decision,
+                       reviewer: reviewer,
+                       proposal_hash: p['proposal_hash'],
+                       granted_at: @clock.call.iso8601,
+                       reason: reason)
+      end
+      # For cycle re-entry: returns ApprovalGrant or nil. Never blocks.
+      def consume_grant(proposal_id)
+        case status_of(proposal_id)
+        when :approved
+          ApprovalGrant.new(
+            proposal_id: proposal_id,
+            decision:    read_decision(proposal_id),
+            proposal:    read_proposal(proposal_id)
+          )
+        else
+          nil
+        end
+      end
+      # Verify proposal content integrity at apply time.
+      # @return [Boolean]
+      def verify_proposal_integrity(proposal_id)
+        p = read_proposal(proposal_id)
+        return false unless p
+        d = read_decision(proposal_id)
+        return false unless d
+        canonical = p.reject { |k, _| mutable_key?(k) }
+        recomputed = canonical_hash(canonical)
+        recomputed == p['proposal_hash'] && d['proposal_hash'] == p['proposal_hash']
+      end
+      # List pending proposals.
+      def pending_proposals
+        Dir.glob(File.join(@dir, '*.json')).filter_map do |f|
+          next if f.end_with?('.decision.json') || f.end_with?('.applied.json')
+          p = safe_read_json(f)
+          next unless p
+          next if p['status'] == 'auto_approved' && File.exist?(decision_file(p['proposal_id']))
+          s = status_of(p['proposal_id'])
+          s == :pending ? p : nil
+        end
+      end
+      # Read a proposal record.
+      def read_proposal(proposal_id)
+        safe_read_json(file_for(proposal_id))
+      end
+      # Read a decision record.
+      def read_decision(proposal_id)
+        safe_read_json(decision_file(proposal_id))
+      end
+      ApprovalGrant = Struct.new(:proposal_id, :decision, :proposal, keyword_init: true)
+      class NotFoundError < StandardError; end
+      class ConflictError < StandardError; end
+      class ExpiredError  < StandardError; end
+      class BackpressureError < StandardError; end
+      private
+      def file_for(id)      ; File.join(@dir, "#{id}.json") end
+      def decision_file(id) ; File.join(@dir, "#{id}.decision.json") end
+      MUTABLE_KEYS = %w[status proposal_hash created_at expires_at ttl_seconds].freeze
+      def mutable_key?(k)
+        MUTABLE_KEYS.include?(k.to_s)
+      end
+      # Sorted-key canonical JSON for deterministic hashing (R2 residual fix).
+      def canonical_hash(obj)
+        "sha256:#{Digest::SHA256.hexdigest(canonical_json(obj))}"
+      end
+      def canonical_json(obj)
+        case obj
+        when Hash
+          '{' + obj.keys.map(&:to_s).sort.map { |k|
+            # Look up by both string and symbol to handle mixed-key hashes
+            val = obj.key?(k) ? obj[k] : obj[k.to_sym]
+            k.to_json + ':' + canonical_json(val)
+          }.join(',') + '}'
+        when Array
+          '[' + obj.map { |v| canonical_json(v) }.join(',') + ']'
+        when Symbol
+          obj.to_s.to_json
+        else
+          obj.to_json
+        end
+      end
+      def stringify_keys(hash)
+        hash.transform_keys(&:to_s)
+      end
+      def write_decision(proposal_id, **fields)
+        write_atomic(decision_file(proposal_id), JSON.pretty_generate(fields))
+      end
+      def write_atomic(path, data)
+        tmp = "#{path}.tmp.#{SecureRandom.hex(4)}"
+        File.open(tmp, 'wb', 0o600) do |f|
+          f.write(data)
+          f.flush
+          f.fsync rescue nil
+        end
+        File.rename(tmp, path)
+      ensure
+        File.unlink(tmp) if tmp && File.exist?(tmp)
+      end
+      def safe_read_json(path)
+        return nil unless File.file?(path)
+        JSON.parse(File.read(path))
+      rescue StandardError
+        nil
+      end
+      def check_backpressure!
+        count = Dir.glob(File.join(@dir, '*.json')).count do |f|
+          next false if f.end_with?('.decision.json') || f.end_with?('.applied.json')
+          # Only count proposals that have no decision file (truly pending)
+          proposal_id = File.basename(f, '.json')
+          !File.exist?(decision_file(proposal_id))
+        end
+        raise BackpressureError, "max pending proposals (#{MAX_PENDING}) exceeded" if count >= MAX_PENDING
+      end
+    end
+  end
+end
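
The `proposal_hash` binding in `approval_gate.rb` depends on the hash being identical before and after a JSON round trip, even though the staged proposal may use symbol keys and the file read back uses string keys. The following stand-alone sketch demonstrates that property. It re-implements the idea in simplified form: `proposal_hash_sketch` is a hypothetical helper, and the sample proposal fields are invented for illustration.

```ruby
require 'json'
require 'digest'

# Keys excluded from hashing, copied from the MUTABLE_KEYS list above.
SKETCH_MUTABLE_KEYS = %w[status proposal_hash created_at expires_at ttl_seconds].freeze

# Serialize with string-sorted keys so key order and key type
# (Symbol vs String) do not affect the output.
def canonical_json(obj)
  case obj
  when Hash
    pairs = obj.map { |k, v| [k.to_s, v] }.sort_by(&:first)
    '{' + pairs.map { |k, v| "#{k.to_json}:#{canonical_json(v)}" }.join(',') + '}'
  when Array
    '[' + obj.map { |v| canonical_json(v) }.join(',') + ']'
  when Symbol
    obj.to_s.to_json
  else
    obj.to_json
  end
end

# Hypothetical helper: hash only the immutable part of a proposal.
def proposal_hash_sketch(proposal)
  content = proposal.reject { |k, _| SKETCH_MUTABLE_KEYS.include?(k.to_s) }
  "sha256:#{Digest::SHA256.hexdigest(canonical_json(content))}"
end

# Invented sample data: symbol keys, as a caller might stage it.
staged = { proposal_id: 'p-1', scope: { scope: :l2 }, files: ['a.rb'] }

# What comes back from disk: string keys plus a mutable status field.
stored = JSON.parse(JSON.generate(staged.merge(status: 'pending_approval')))

puts proposal_hash_sketch(staged) == proposal_hash_sketch(stored)   # same content, same hash

# Editing reviewed content after staging changes the hash.
tampered = stored.merge('files' => ['b.rb'])
puts proposal_hash_sketch(stored) == proposal_hash_sketch(tampered) # content differs
```

Excluding the mutable fields is what lets `verify_proposal_integrity` recompute the hash from the stored file, which already contains `status`, `created_at`, and `expires_at`.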