RubyGems - kairos-chain - Versions diffs - 3.23.1 → 3.24.0 - Mend

kairos-chain 3.23.1 → 3.24.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 35124b7f595066816a5e59823ca5f871bcb2c009f12db8127f7570ee0e923206
-  data.tar.gz: 482cef573f07539663a119db81cbdcb5b600362a551b3ee45459d9727c78c755
+  metadata.gz: 20e2a223137f51dc61025e57dd6fd205a8f702ef923b5b0b2e0d464a308d279f
+  data.tar.gz: 51627cb487cf5fc2e46b8e6055bf36e0cf5c8839f15cb2c2236f2ff56efa2def
 SHA512:
-  metadata.gz: a07f31d1e33713c4993aaf396838ffaac1ac5c9979ba7ff24f0d590a39fa4341f64cb1899d061507cc13b7fdfff3a85ee6177dc21b54578a7ef2af9b8de5fe81
-  data.tar.gz: aebf6ebc84bf36682c74ebf7b000c5168051cab9685842c75dcd2a9ebc4b733713b4f1389e7e2d8fe00f61eefd3dc201885e9582a620c77e937a53999b349e3b
+  metadata.gz: 9fd4b17a28bdc06b7b19195e7274ef41e4ba7a93fc70e2c3a38083c58eae85ea8dfd432cb7cc6219cf7ba49ba637dbed12c48ea12312f4a3f26f46dfb936438f
+  data.tar.gz: fadb35fdbf47eeebfc9b667685452a0c222ecb396dbb2031de08ac27b2df1de1a3985fc41c03e4e767d22a58ec98a77d52c2059e921f8084d1663a2a099bdd1d

data/CHANGELOG.md CHANGED

@@ -4,6 +4,77 @@ All notable changes to the `kairos-chain` gem will be documented in this file.
 This project follows [Semantic Versioning](https://semver.org/).
+## [3.24.0] - 2026-04-27
+### Added
+- **multi_llm_review_wait MCP tool** (Phase 1.5) — optional blocking gate
+  between `multi_llm_review` (Phase 1) and `multi_llm_review_collect`
+  (Phase 2). Wraps the existing `WaitForWorker.wait` polling loop and
+  exposes 6 distinct status codes (`ready`, `still_pending`, `crashed`,
+  `unknown_token`, `already_collected`, `past_collect_deadline`) each with
+  a `next_action` recovery hint pointing at the right next tool.
+- **`next_action` hint on `multi_llm_review` delegation_pending response** —
+  structured `{tool, args, purpose}` field nudging the orchestrator to call
+  `multi_llm_review_wait` after persona Agent dispatch. MCP does not enforce
+  ordering; this is a hint, not a constraint, but in practice LLMs follow
+  it reliably.
+- **Path A vs Path B disambiguation in workflow knowledge doc** — surfaces
+  the long-implicit distinction between the host-tracked Bash workflow
+  (Claude Code's `Bash(background)` pattern, statusbar shows `XX shells`)
+  and the MCP-managed SkillSet (detached worker, no host-side tracking,
+  polling required).
+- New config keys under `delegation.parallel`:
+  `wait_poll_interval_seconds: 1.0`, `wait_max_default_seconds: 600`,
+  `wait_max_hard_cap_seconds: 1800`, `wait_still_pending_streak_limit: 3`.
+- Streak guard: 3 consecutive `still_pending` returns escalate to
+  `crashed/wait_exhausted` so a wedged worker cannot trap the orchestrator
+  in an infinite wait loop.
+- 14 new tests in `test_multi_llm_review_wait.rb` covering all status
+  paths, streak persistence/reset, hard cap clamping, deadline-remaining
+  clamping, and backward compatibility (collect still works without wait).
+### Changed
+- `multi_llm_review` SkillSet version 0.4.0 → 0.5.0.
+- `delegation` instruction text now mentions wait → collect chain.
+### Notes
+- Backward compatible: callers that skip wait and call collect directly
+  still work via collect's existing internal polling.
+- Design review (Codex GPT-5.5 + Cursor Composer-2 + Claude Team Opus 4.7)
+  produced 3/3 REVISE with 6-7 P1 issues; revisions R1-R14 captured in
+  handoff L2 `multi_llm_review_wait_tool_handoff` before implementation.
+## [3.23.3] - 2026-04-27
+### Documentation
+- **multi_llm_review_workflow knowledge** — Added "Async/Parallel Collect
+  Timing — Iron Rule" subsection. Documents the workflow constraint that the
+  orchestrator must call `multi_llm_review_collect` immediately after persona
+  Agent reviews complete, without intervening user dialogue. Explains the
+  underlying mechanics (LLM is not event-driven; collect already polls
+  internally at 0.5s intervals; token expiry vs subprocess completion). Adds
+  recommended flow, anti-pattern, and manual recovery instructions.
+- Updated stale `must_collect_by` default reference (600s → 1800s).
+## [3.23.2] - 2026-04-26
+### Fixed
+- **multi_llm_review collect_deadline bug** — `timeout_seconds_override` no longer
+  leaves the orchestrator's submission window shorter than the worker lifespan.
+  In the async/parallel path, `collect_deadline` is now auto-extended to cover
+  `worker self_timeout + poll margin` so raising `timeout_seconds_override`
+  alone keeps the token alive while the worker is healthy.
+- New `collect_deadline_seconds_override` argument on `multi_llm_review` for
+  explicit control of the orchestrator's submission window.
+- Default `delegation.collect_deadline_seconds` raised from `600` (10 min) to
+  `1800` (30 min) to better fit interactive runs where user dialogue intervenes
+  between Phase 1 and `multi_llm_review_collect`.
 ## [3.17.0] - 2026-04-22
 ### Added

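The streak guard described in the 3.24.0 changelog entry can be illustrated with a minimal Ruby sketch. This is not the gem's code: `apply_streak_guard`, the plain `state` Hash, and the `worker_outcome` symbol are hypothetical names, and the order of the checks (results first, then the guard) is an assumption. Only the key `wait_still_pending_streak`, the limit of 3, and the `crashed` / `wait_exhausted` values come from the diff.

```ruby
# Hypothetical sketch of a streak guard; not the gem's actual implementation.
STREAK_LIMIT = 3 # mirrors wait_still_pending_streak_limit from the config diff

# `state` is a plain Hash standing in for the persisted state.json contents.
# `worker_outcome` is what one bounded wait observed: :ready or :still_pending.
def apply_streak_guard(state, worker_outcome, limit: STREAK_LIMIT)
  streak = state.fetch('wait_still_pending_streak', 0).to_i

  # Assumption: finished results win over the guard and reset the streak.
  if worker_outcome == :ready
    state['wait_still_pending_streak'] = 0
    return { 'status' => 'ready' }
  end

  # Assumption: once the stored streak has reached the limit, stop waiting.
  if streak >= limit
    return { 'status' => 'crashed', 'crashed_reason' => 'wait_exhausted' }
  end

  state['wait_still_pending_streak'] = streak + 1
  { 'status' => 'still_pending', 'still_pending_streak' => streak + 1 }
end
```

Persisting the counter in the state (rather than in the caller) is what lets the streak survive across separate tool calls, which is the behaviour the new tests check.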
data/lib/kairos_mcp/version.rb CHANGED

@@ -1,4 +1,4 @@
 module KairosMcp
-  VERSION = "3.23.1"
+  VERSION = "3.24.0"
   CHANGELOG_URL = "https://github.com/masaomi/KairosChain_2026/blob/main/CHANGELOG.md"
 end

data/templates/knowledge/multi_llm_review_workflow/multi_llm_review_workflow.md CHANGED

@@ -29,6 +29,54 @@ This skill covers:
 For **WHO** (which LLM is good at what), see: `multi_llm_reviewer_evaluation`
 For **development lifecycle** (design → implement → verify), see: `design_to_implementation_workflow`
+## Two Execution Paths (read this first)
+There are **two distinct execution paths** with the same name "multi-LLM review".
+They differ in subprocess lifecycle ownership and completion-detection mechanics.
+Pick the right one for your environment:
+### Path A — Host-tracked (Bash workflow)
+- **Trigger**: orchestrator (LLM) calls Claude Code's `Bash` tool with
+  `run_in_background: true` to spawn `claude -p`, `codex exec`, `agent -p` directly.
+- **Process parent**: Claude Code (the host harness).
+- **Completion detection**: **event-driven**. Claude Code's shell tracker monitors
+  the spawned shells; when they exit, the LLM is notified through the standard
+  tool-result mechanism. Statusbar shows `XX shells` while reviewers are running.
+- **When to use**: interactive Claude Code sessions for one-off Tier 3 reviews.
+- **Reference**: see "Orchestration Template" section below for the canonical
+  `Bash(background)` pattern.
+### Path B — MCP-managed (multi_llm_review SkillSet)
+- **Trigger**: orchestrator calls the MCP tool `multi_llm_review`.
+- **Process parent**: the kairos-chain Ruby gem (MCP server). The gem forks a
+  detached worker (`bin/dispatch_worker.rb`) which calls `Process.setsid` and
+  spawns CLI reviewers as a separate session leader.
+- **Completion detection**: **polling required**. Claude Code is not the parent,
+  so the spawned subprocesses do NOT appear in the `XX shells` statusbar count.
+  The orchestrator must call `multi_llm_review_collect` (and optionally
+  `multi_llm_review_wait` first) to observe completion.
+- **When to use**: portable execution (other MCP hosts, autonomous Agent SkillSet),
+  or any case where you want the consensus computation done server-side.
+- **Recommended chain (3-step)**: `multi_llm_review` → `multi_llm_review_wait` →
+  `multi_llm_review_collect`. Each Phase-1/1.5 response carries a `next_action`
+  hint pointing at the next tool. wait is optional but recommended — without it,
+  collect's internal polling still covers worker completion, but recovery hints
+  for `still_pending`, `crashed`, and `past_collect_deadline` are less explicit.
+- **Reference**: see "Orchestrator Delegation Protocol" + "Async/Parallel Collect
+  Timing — Iron Rule" sections below.
+### Quick selector
+| Question | Answer |
+|----------|--------|
+| Are you in an interactive Claude Code session and just need one review? | **Path A** |
+| Do you need this to work in Cursor / autonomous mode / other MCP host? | **Path B** |
+| Do you want the consensus result inside the MCP tool response? | **Path B** |
+| Did you observe `XX shells` in the statusbar last time it worked? | That was Path A |
+| Did the run produce a `collect_token` and a `pending/<token>/` directory? | That was Path B |
 ## Roles
 | Role | Who | Responsibility |
@@ -331,8 +379,8 @@ cross-model subprocess reviewers give epistemic diversity. The two are complemen
 **Failure modes**:
 - `expired_or_unknown_token`: orchestrator missed `must_collect_by` deadline
-  (default 600s), or token never existed. The pending review is gone; call
-  `multi_llm_review` again from scratch.
+  (default 1800s since v3.23.2; was 600s), or token never existed. The pending
+  review is gone; call `multi_llm_review` again from scratch.
 - `error: invalid orchestrator_reviews`: persona count outside 2-4 or missing
   required fields. Fix and retry collect with the same token.
 - All-subprocess-failed at Call 1: returns error immediately; no token issued.
@@ -340,6 +388,62 @@ cross-model subprocess reviewers give epistemic diversity. The two are complemen
 **Default**: `orchestrator_strategy` defaults to `"exclude"` (back-compat). Use
 `"delegate"` explicitly until validated by use.
+#### Async/Parallel Collect Timing — Iron Rule
+When `delegation.parallel.default: true` (the v3.x default), Call 1 returns
+`delegation_pending` **immediately** (~50ms) and a detached worker runs the
+subprocess reviewers in parallel with the orchestrator's persona Agent
+reviews. This is faster, but introduces a timing trap:
+> **The orchestrator MUST call `multi_llm_review_collect` immediately after
+> the persona Agent reviews complete — without intervening user dialogue,
+> unrelated tool calls, or context switches.**
+Why this matters:
+- The LLM is **not event-driven**. When the worker finishes writing
+  `subprocess_status: "done"` to `state.json`, nothing wakes the orchestrator.
+  The orchestrator only notices when it next calls `multi_llm_review_collect`.
+- `multi_llm_review_collect` already polls internally at
+  `poll_interval_seconds: 0.5` for up to `collect_max_wait_seconds: 420` (7min)
+  per call. Polling is not the bottleneck — the bottleneck is the orchestrator
+  forgetting to call collect at all.
+- The token expires at `collect_deadline` (default 30min since v3.23.2). If
+  user dialogue or other work intervenes between persona Agent completion and
+  the collect call, the token can expire while the subprocess results sit
+  ready and unread on disk.
+Recommended orchestrator flow (single LLM turn, no detours):
+```
+1. multi_llm_review(...) → receive delegation_pending + collect_token
+2. Spawn persona Agent reviews (Agent tool, parallel, 2-4 personas)
+3. As soon as ALL personas return → multi_llm_review_collect(collect_token, ...)
+4. Return final consensus to user
+```
+Anti-pattern (do NOT do this):
+```
+1. multi_llm_review(...) → delegation_pending
+2. Run persona Agent reviews
+3. ❌ "By the way, while we wait, let me explain X to the user…"
+4. ❌ User asks an unrelated question, conversation drifts
+5. ❌ 30+ minutes later, finally try collect → expired_or_unknown_token
+```
+If the orchestrator is genuinely interrupted (user explicitly switches topic,
+or persona Agent itself takes a long time and the orchestrator wants to
+report progress), it should still **call collect first** — collect returns
+quickly if the worker is already done, or blocks up to 7min if not. Either
+way, the token stays alive and consensus is captured before resuming side
+work.
+Manual recovery if expiry happens: subprocess results are persisted at
+`.kairos/multi_llm_review/pending/<token>/subprocess_results.json` and remain
+readable until GC. Read them directly and synthesize manually, then re-run
+`multi_llm_review` for fresh results if needed.
 ### Critical CLI Notes
 - **Cursor Agent stdin**: `cat file | agent -p -` does NOT work. Use file-reference:

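The recommended Path B chain (`multi_llm_review` → `multi_llm_review_wait` → `multi_llm_review_collect`) from the knowledge doc above could be driven roughly as follows. This is a sketch under stated assumptions: `run_review_chain`, `call_tool`, `max_wait_calls`, and the `gave_up` status are hypothetical and not part of the gem. The tool names and status strings are taken from the diff. In a real run, the persona Agent reviews would be produced between the first call and the wait; here they are passed in ready-made.

```ruby
# Hypothetical driver for the three-step chain. `call_tool` is any callable
# that takes (tool_name, args_hash) and returns the parsed response Hash;
# it stands in for whatever MCP client the host provides.
def run_review_chain(call_tool, review_args, orchestrator_reviews, max_wait_calls: 5)
  pending = call_tool.call('multi_llm_review', review_args)
  token = pending.fetch('collect_token')

  max_wait_calls.times do
    waited = call_tool.call('multi_llm_review_wait', { 'collect_token' => token })
    case waited['status']
    when 'ready', 'already_collected'
      # Both statuses point at collect as the next tool.
      return call_tool.call('multi_llm_review_collect',
                            { 'collect_token' => token,
                              'orchestrator_reviews' => orchestrator_reviews })
    when 'still_pending'
      next # worker is healthy but slow: wait again
    else
      # crashed, unknown_token, past_collect_deadline: hand the response
      # (including its next_action hint) back to the caller.
      return waited
    end
  end

  # Hypothetical local fallback if the caller's own retry budget runs out.
  { 'status' => 'gave_up', 'collect_token' => token }
end
```

Returning the non-recoverable responses unchanged keeps the `next_action` hint visible to whoever decides whether to redispatch.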
data/templates/skillsets/multi_llm_review/config/multi_llm_review.yml CHANGED

@@ -39,7 +39,7 @@ convergence_rule_after_exclusion: "3/4 APPROVE"
 # Phase 2 (multi_llm_review_collect) receives the orchestrator's persona
 # team review and computes final consensus.
 delegation:
-  collect_deadline_seconds: 600  # how long the orchestrator has to call collect
+  collect_deadline_seconds: 1800 # how long the orchestrator has to call collect (30min — interactive runs often have user dialogue between Phase 1 and collect)
   retain_collected_seconds: 3600 # how long collected results stay for idempotent replay
   # v0.3.0 parallel subprocess worker (Phase 11.5). When default:true, Phase 1
   # returns a delegation_pending token immediately and a detached OS worker
@@ -66,6 +66,11 @@ delegation:
     worker_self_timeout_floor_seconds: 60
     main_call_max_timeout_seconds: 300
     main_call_timeout_margin_seconds: 60
+    # multi_llm_review_wait tool (Phase 1.5) — see tools/multi_llm_review_wait.rb
+    wait_poll_interval_seconds: 1.0           # wait tool polling cadence (separate from collect's 0.5s)
+    wait_max_default_seconds: 600             # default per-call blocking ceiling
+    wait_max_hard_cap_seconds: 1800           # per-call hard cap (clamps max_wait_seconds arg)
+    wait_still_pending_streak_limit: 3        # consecutive still_pending returns before crashed/wait_exhausted
 # Dispatch settings
 timeout_seconds: 300              # global deadline for all reviewers

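The interaction between the new `wait_max_default_seconds`, `wait_max_hard_cap_seconds`, and the remaining collect deadline can be sketched as a single clamp. The helper name `effective_wait_seconds` and its parameters are hypothetical; the diff only states that the hard cap clamps the `max_wait_seconds` argument and that the tests cover deadline-remaining clamping. Flooring at zero is an assumption.

```ruby
# Hypothetical clamp: how long a single wait call may block.
# requested          - caller's max_wait_seconds, or nil when omitted
# deadline_remaining - seconds left until collect_deadline (may be negative)
def effective_wait_seconds(requested, deadline_remaining, default: 600, hard_cap: 1800)
  wanted = requested.nil? ? default : requested
  clamped = [wanted, hard_cap, deadline_remaining].min
  # Assumption: never return a negative wait; past-deadline means "do not block".
  [clamped, 0].max
end

effective_wait_seconds(nil, 5000)      # default applies: 600
effective_wait_seconds(999_999, 5000)  # hard cap applies: 1800
effective_wait_seconds(999_999, 2)     # deadline-remaining applies: 2
```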
data/templates/skillsets/multi_llm_review/skillset.json CHANGED

@@ -1,19 +1,21 @@
 {
   "name": "multi_llm_review",
-  "version": "0.4.0",
-  "description": "Parallel multi-LLM review orchestration. Dispatches review prompts to N LLM backends via llm_client, collects verdicts, and computes consensus. v0.4.0 (Phase 12): adds feedback_text + schema_version to response, sanitization contract for prompt-injection defense, and multi_llm_review_bundle tool for human-handoff paths without dispatch.",
+  "version": "0.5.0",
+  "description": "Parallel multi-LLM review orchestration. Dispatches review prompts to N LLM backends via llm_client, collects verdicts, and computes consensus. v0.5.0: adds multi_llm_review_wait (Phase 1.5) for explicit subprocess completion gating with next_action recovery hints, and Path A/B doc disambiguation. v0.4.0 (Phase 12): feedback_text + schema_version, sanitization contract for prompt-injection defense, and multi_llm_review_bundle tool for human-handoff paths without dispatch.",
   "author": "Masaomi Hatakeyama",
   "layer": "L1",
   "depends_on": ["llm_client"],
   "provides": [
     "multi_llm_review_orchestration",
     "review_consensus",
-    "review_bundle_human_handoff"
+    "review_bundle_human_handoff",
+    "review_wait_gate"
   ],
   "tool_classes": [
     "KairosMcp::SkillSets::MultiLlmReview::Tools::MultiLlmReview",
     "KairosMcp::SkillSets::MultiLlmReview::Tools::MultiLlmReviewCollect",
-    "KairosMcp::SkillSets::MultiLlmReview::Tools::MultiLlmReviewBundle"
+    "KairosMcp::SkillSets::MultiLlmReview::Tools::MultiLlmReviewBundle",
+    "KairosMcp::SkillSets::MultiLlmReview::Tools::MultiLlmReviewWait"
   ],
   "config_files": ["config/multi_llm_review.yml"],
   "knowledge_dirs": [],

data/templates/skillsets/multi_llm_review/test/test_multi_llm_review.rb CHANGED

@@ -861,6 +861,19 @@ module KairosMcp
           FileUtils.rm_rf(@tmp)
         end
+        # Replace WorkerSpawner.spawn with a no-op for the duration of the block.
+        # Avoids actually forking a detached worker process during async tests.
+        def with_stubbed_worker_spawner
+          singleton = WorkerSpawner.singleton_class
+          original = WorkerSpawner.method(:spawn)
+          singleton.send(:define_method, :spawn) { |**_kwargs| true }
+          begin
+            yield
+          ensure
+            singleton.send(:define_method, :spawn, original)
+          end
+        end
         def test_partition_for_strategy_delegate_drops_match
           reviewers = [
             { provider: 'claude_code', model: 'claude-opus-4-7', role_label: 'r47' },
@@ -1024,9 +1037,150 @@ module KairosMcp
           )
           payload = JSON.parse(result.first[:text])
           deadline = Time.iso8601(payload['must_collect_by'])
-          # Should be ~60s from now, not the default 600s
+          # Should be ~60s from now, not the default 1800s
           assert_in_delta 60, deadline - Time.now, 5
         end
+        # Bug #1 fix: collect_deadline_seconds_override must extend the sync
+        # delegate_response deadline beyond the config default.
+        def test_delegate_sync_respects_collect_deadline_override
+          subprocess_results = [
+            { role_label: 'codex', provider: 'codex', model: 'm',
+              raw_text: 'APPROVE', elapsed_seconds: 1, error: nil, status: :success }
+          ]
+          result = @tool.send(:delegate_response,
+            raw_results: subprocess_results,
+            arguments: {
+              'review_type' => 'design', 'artifact_name' => 'x',
+              'collect_deadline_seconds_override' => 3000
+            },
+            config: { 'delegation' => { 'collect_deadline_seconds' => 60 } },
+            orchestrator_model: 'claude-opus-4-7',
+            convergence_rule: '3/4 APPROVE',
+            min_quorum: 2,
+            review_round: 1,
+            complexity: 'high'
+          )
+          payload = JSON.parse(result.first[:text])
+          deadline = Time.iso8601(payload['must_collect_by'])
+          # Override (3000s) wins over config (60s)
+          assert_in_delta 3000, deadline - Time.now, 5
+        end
+        # Bug #3 fix: when no override and no config, default is now 1800s (was 600s).
+        def test_delegate_sync_default_deadline_is_1800
+          subprocess_results = [
+            { role_label: 'codex', provider: 'codex', model: 'm',
+              raw_text: 'APPROVE', elapsed_seconds: 1, error: nil, status: :success }
+          ]
+          result = @tool.send(:delegate_response,
+            raw_results: subprocess_results,
+            arguments: { 'review_type' => 'design', 'artifact_name' => 'x' },
+            config: {},
+            orchestrator_model: 'claude-opus-4-7',
+            convergence_rule: '3/4 APPROVE',
+            min_quorum: 2,
+            review_round: 1,
+            complexity: 'high'
+          )
+          payload = JSON.parse(result.first[:text])
+          deadline = Time.iso8601(payload['must_collect_by'])
+          assert_in_delta 1800, deadline - Time.now, 5
+        end
+        # Bug #1 fix (async): when timeout_seconds_override raises the worker
+        # self_timeout above the configured collect_deadline, the deadline must
+        # auto-extend to cover the worker lifespan + poll margin. Otherwise the
+        # token expires while the worker is still healthy.
+        def test_delegate_async_auto_extends_deadline_to_worker_lifespan
+          reviewers = [{ provider: 'codex', model: 'codex-default', role_label: 'codex' }]
+          arguments = {
+            'review_type' => 'design',
+            'artifact_name' => 'x',
+            'timeout_seconds_override' => 1500
+          }
+          config = {
+            'delegation' => {
+              'collect_deadline_seconds' => 600,
+              'parallel' => {
+                'worker_self_timeout_multiplier' => 1.5,
+                'worker_self_timeout_floor_seconds' => 60,
+                'poll_interval_seconds' => 0.5
+              }
+            }
+          }
+          parallel_cfg = config.dig('delegation', 'parallel')
+          result = nil
+          with_stubbed_worker_spawner do
+            result = @tool.send(:delegate_response_async,
+              reviewers: reviewers,
+              messages: [{ 'role' => 'user', 'content' => 'x' }],
+              system_prompt: 'sys',
+              arguments: arguments,
+              config: config,
+              orchestrator_model: 'claude-opus-4-7',
+              convergence_rule: '3/4 APPROVE',
+              min_quorum: 2,
+              review_round: 1,
+              complexity: 'high',
+              review_context: 'independent',
+              max_concurrent: 2,
+              timeout_secs: 1500,
+              parallel_cfg: parallel_cfg
+            )
+          end
+          payload = JSON.parse(result.first[:text])
+          assert_equal 'delegation_pending', payload['status']
+          deadline = Time.iso8601(payload['must_collect_by'])
+          # worker_lifespan = 1500*1.5 + 60 = 2310; +10s poll margin = 2320
+          # Deadline must be at least worker_lifespan + margin, NOT 600
+          assert_operator deadline - Time.now, :>=, 2320 - 5
+        end
+        # Async: explicit collect_deadline_seconds_override above the auto-min wins.
+        def test_delegate_async_respects_explicit_override
+          reviewers = [{ provider: 'codex', model: 'codex-default', role_label: 'codex' }]
+          arguments = {
+            'review_type' => 'design',
+            'artifact_name' => 'x',
+            'collect_deadline_seconds_override' => 5000
+          }
+          config = {
+            'delegation' => {
+              'collect_deadline_seconds' => 600,
+              'parallel' => {
+                'worker_self_timeout_multiplier' => 1.5,
+                'worker_self_timeout_floor_seconds' => 60,
+                'poll_interval_seconds' => 0.5
+              }
+            }
+          }
+          parallel_cfg = config.dig('delegation', 'parallel')
+          result = nil
+          with_stubbed_worker_spawner do
+            result = @tool.send(:delegate_response_async,
+              reviewers: reviewers,
+              messages: [{ 'role' => 'user', 'content' => 'x' }],
+              system_prompt: 'sys',
+              arguments: arguments,
+              config: config,
+              orchestrator_model: 'claude-opus-4-7',
+              convergence_rule: '3/4 APPROVE',
+              min_quorum: 2,
+              review_round: 1,
+              complexity: 'high',
+              review_context: 'independent',
+              max_concurrent: 2,
+              timeout_secs: 300,
+              parallel_cfg: parallel_cfg
+            )
+          end
+          payload = JSON.parse(result.first[:text])
+          deadline = Time.iso8601(payload['must_collect_by'])
+          assert_in_delta 5000, deadline - Time.now, 5
+        end
       end
       class TestCollectTool < Minitest::Test

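The deadline arithmetic exercised by the two async tests above can be worked through in a few lines of Ruby. This is only a reading of the test comments, not the gem's implementation: `collect_deadline_seconds`, `extra_seconds`, and `poll_margin` are hypothetical names, the 60 s and 10 s terms are copied from the comment `1500*1.5 + 60 = 2310; +10s poll margin = 2320`, and taking the maximum of the requested value and the worker lifespan is an assumption that happens to match both assertions.

```ruby
# Hypothetical reconstruction of the auto-extended collect deadline.
def collect_deadline_seconds(timeout_secs:, configured:, override: nil,
                             multiplier: 1.5, extra_seconds: 60, poll_margin: 10)
  # Worker lifespan as written in the test comment.
  worker_lifespan = timeout_secs * multiplier + extra_seconds
  # An explicit override replaces the configured value.
  requested = override || configured
  # Assumption: the deadline never drops below lifespan + poll margin.
  [requested, worker_lifespan + poll_margin].max
end

# timeout_seconds_override = 1500 with a configured deadline of 600:
collect_deadline_seconds(timeout_secs: 1500, configured: 600)               # => 2320.0
# collect_deadline_seconds_override = 5000 with a 300 s timeout:
collect_deadline_seconds(timeout_secs: 300, configured: 600, override: 5000) # => 5000
```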
data/templates/skillsets/multi_llm_review/test/test_multi_llm_review_wait.rb ADDED

@@ -0,0 +1,249 @@
+# frozen_string_literal: true
+require 'minitest/autorun'
+require 'json'
+require 'tmpdir'
+require 'fileutils'
+require 'time'
+# Stub BaseTool so we can load the tool file in isolation.
+module KairosMcp
+  module Tools
+    class BaseTool
+      def text_content(s); [{ text: s }]; end
+    end
+  end unless defined?(KairosMcp::Tools::BaseTool)
+end
+require_relative '../lib/multi_llm_review/pending_state'
+require_relative '../lib/multi_llm_review/wait_for_worker'
+require_relative '../tools/multi_llm_review_wait'
+module KairosMcp
+  module SkillSets
+    module MultiLlmReview
+      class TestMultiLlmReviewWait < Minitest::Test
+        def setup
+          @tmp = Dir.mktmpdir('mlr-wait-')
+          @orig_cwd = Dir.pwd
+          Dir.chdir(@tmp)
+          @tool = Tools::MultiLlmReviewWait.new
+          @token = '11111111-2222-4333-8444-555555555555'
+        end
+        def teardown
+          Dir.chdir(@orig_cwd)
+          FileUtils.rm_rf(@tmp)
+        end
+        def write_state(extra = {})
+          PendingState.create_token_dir!(@token)
+          PendingState.write_state(@token, {
+            'schema_version' => 4,
+            'token' => @token,
+            'created_at' => Time.now.iso8601,
+            'collect_deadline' => (Time.now + 1800).iso8601,
+            'subprocess_status' => 'pending',
+            'subprocess_total' => 3,
+            'parallel' => true
+          }.merge(extra))
+          FileUtils.touch(PendingState.collect_lock_path(@token))
+        end
+        def call_wait(args = {})
+          payload = JSON.parse(@tool.call({ 'collect_token' => @token }.merge(args)).first[:text])
+          payload
+        end
+        # ── unknown_token ────────────────────────────────────────────────
+        def test_unknown_token_returns_unknown_with_redispatch_hint
+          payload = call_wait
+          assert_equal 'unknown_token', payload['status']
+          assert_equal @token, payload['collect_token']
+          assert_equal 'multi_llm_review', payload['next_action']['tool']
+          assert_match(/never existed|garbage-collected|new dispatch/i,
+                       payload['next_action']['purpose'])
+        end
+        def test_invalid_token_format_returns_unknown
+          payload = JSON.parse(@tool.call({ 'collect_token' => 'not-a-uuid' }).first[:text])
+          assert_equal 'unknown_token', payload['status']
+        end
+        # ── already_collected ────────────────────────────────────────────
+        def test_already_collected_returns_replay_hint
+          write_state
+          PendingState.write_collected(@token, {
+            'final_payload' => { 'status' => 'ok', 'verdict' => 'APPROVE' }
+          })
+          payload = call_wait
+          assert_equal 'already_collected', payload['status']
+          assert_equal 'multi_llm_review_collect', payload['next_action']['tool']
+          assert_match(/idempotent replay/i, payload['next_action']['purpose'])
+        end
+        # ── past_collect_deadline ────────────────────────────────────────
+        def test_past_deadline_returns_redispatch_without_blocking
+          write_state('collect_deadline' => (Time.now - 60).iso8601)
+          t0 = Time.now
+          payload = call_wait('max_wait_seconds' => 5)
+          elapsed = Time.now - t0
+          assert_equal 'past_collect_deadline', payload['status']
+          assert_equal 'multi_llm_review', payload['next_action']['tool']
+          assert_operator elapsed, :<, 1.0, 'must not block when past deadline'
+        end
+        # ── ready ────────────────────────────────────────────────────────
+        def test_ready_when_subprocess_results_present
+          write_state
+          PendingState.write_subprocess_results(@token, {
+            'results' => [
+              { 'role_label' => 'codex', 'raw_text' => 'APPROVE', 'status' => 'success' },
+              { 'role_label' => 'cursor', 'raw_text' => 'APPROVE', 'status' => 'success' },
+              { 'role_label' => 'claude', 'raw_text' => 'APPROVE', 'status' => 'success' }
+            ],
+            'elapsed_seconds' => 12.3
+          })
+          payload = call_wait('max_wait_seconds' => 2)
+          assert_equal 'ready', payload['status']
+          assert_equal 3, payload['subprocess_done']
+          assert_equal 3, payload['subprocess_total']
+          assert_equal 'multi_llm_review_collect', payload['next_action']['tool']
+          assert_includes payload['next_action']['args'].keys, 'orchestrator_reviews'
+        end
+        # ── still_pending + streak escalation ────────────────────────────
+        def test_still_pending_returned_when_worker_healthy_but_slow
+          write_state
+          # Live heartbeat so WaitForWorker sees a healthy worker.
+          FileUtils.touch(PendingState.worker_heartbeat_path(@token))
+          PendingState.write_worker_pid(@token, { 'pid' => Process.pid, 'pgid' => Process.pid })
+          payload = call_wait('max_wait_seconds' => 1)
+          assert_equal 'still_pending', payload['status']
+          assert_equal 1, payload['still_pending_streak']
+          assert_equal 'multi_llm_review_wait', payload['next_action']['tool']
+        end
+        def test_still_pending_streak_persists_across_calls
+          write_state
+          FileUtils.touch(PendingState.worker_heartbeat_path(@token))
+          PendingState.write_worker_pid(@token, { 'pid' => Process.pid, 'pgid' => Process.pid })
+          p1 = call_wait('max_wait_seconds' => 1)
+          assert_equal 1, p1['still_pending_streak']
+          p2 = call_wait('max_wait_seconds' => 1)
+          assert_equal 2, p2['still_pending_streak']
+        end
+        def test_streak_at_limit_escalates_to_crashed
+          write_state('wait_still_pending_streak' => 3)
+          payload = call_wait('max_wait_seconds' => 1)
+          assert_equal 'crashed', payload['status']
+          assert_equal 'wait_exhausted', payload['crashed_reason']
+          assert_equal 'multi_llm_review', payload['next_action']['tool']
+        end
+        def test_ready_resets_streak
+          write_state('wait_still_pending_streak' => 2)
+          PendingState.write_subprocess_results(@token, { 'results' => [], 'elapsed_seconds' => 1 })
+          payload = call_wait('max_wait_seconds' => 1)
+          assert_equal 'ready', payload['status']
+          state = PendingState.load_state(@token)
+          assert_equal 0, state['wait_still_pending_streak'].to_i
+        end
+        # ── crashed (worker terminal) ────────────────────────────────────
+        def test_crashed_status_propagates_reason
+          write_state('subprocess_status' => 'crashed', 'crash_reason' => 'segfault')
+          payload = call_wait('max_wait_seconds' => 1)
+          assert_equal 'crashed', payload['status']
+          assert_equal 'segfault', payload['crashed_reason']
+          assert_equal 'multi_llm_review', payload['next_action']['tool']
+        end
+        # ── hard cap ─────────────────────────────────────────────────────
+        # Hard cap is enforced before WaitForWorker is invoked. We verify the
+        # clamping logic without actually waiting for the cap by checking the
+        # request was processed (well-formed payload returned in bounded time)
+        # and the deadline-remaining check fired.
+        def test_max_wait_clamped_when_request_exceeds_hard_cap
+          # Set a very short deadline so the deadline-remaining clamp fires
+          # almost immediately.
+          write_state('collect_deadline' => (Time.now + 2).iso8601)
+          FileUtils.touch(PendingState.worker_heartbeat_path(@token))
+          PendingState.write_worker_pid(@token, { 'pid' => Process.pid, 'pgid' => Process.pid })
+          t0 = Time.now
+          payload = call_wait('max_wait_seconds' => 999_999)
+          elapsed = Time.now - t0
+          # Whatever status comes back (still_pending or past_collect_deadline
+          # depending on timing), elapsed must be bounded — never the 999_999s
+          # the caller requested. Enforces the clamp path is not bypassed.
+          refute_nil payload['status']
+          assert_operator elapsed, :<, 30.0,
+            'elapsed must be bounded by deadline-remaining clamp, not by raw max_wait_seconds'
+        end
+        # ── elapsed_seconds field is always present ──────────────────────
+        def test_elapsed_seconds_always_present
+          write_state
+          PendingState.write_subprocess_results(@token, { 'results' => [], 'elapsed_seconds' => 0.1 })
+          payload = call_wait('max_wait_seconds' => 1)
+          assert payload.key?('elapsed_seconds'), 'elapsed_seconds field missing'
+          assert_kind_of Float, payload['elapsed_seconds']
+        end
+        # ── next_action present on every status ──────────────────────────
+        def test_next_action_present_on_every_status
+          write_state
+          # ready
+          PendingState.write_subprocess_results(@token, { 'results' => [], 'elapsed_seconds' => 1 })
+          assert call_wait('max_wait_seconds' => 1)['next_action'], 'ready missing next_action'
+          # past_collect_deadline
+          File.delete(PendingState.subprocess_results_path(@token))
+          PendingState.write_state(@token, PendingState.load_state(@token)
+            .merge('collect_deadline' => (Time.now - 1).iso8601))
+          assert call_wait['next_action'], 'past_collect_deadline missing next_action'
+          # crashed
+          PendingState.write_state(@token, PendingState.load_state(@token).merge(
+            'collect_deadline' => (Time.now + 600).iso8601,
+            'subprocess_status' => 'crashed', 'crash_reason' => 'oom'
+          ))
+          assert call_wait['next_action'], 'crashed missing next_action'
+        end
+      end
+      # ── backward compat: collect can still be called without wait ────────
+      # Verifies that introducing wait does not break the existing
+      # "delegation_pending → collect" path. The collect tool already polls
+      # internally and remains the primary completion gate.
+      class TestWaitToolBackwardCompat < Minitest::Test
+        def test_collect_works_without_wait_tool
+          # Smoke test: load the collect tool and verify it has not gained a
+          # required dependency on wait. (Full collect integration is covered
+          # in test_multi_llm_review.rb; this is a presence check.)
+          require_relative '../tools/multi_llm_review_collect'
+          collect = Tools::MultiLlmReviewCollect.new
+          schema = collect.input_schema
+          assert_equal 'object', schema[:type]
+          # The collect tool's required fields must still be just collect_token
+          # + orchestrator_reviews — wait must NOT have been added as required.
+          required = schema[:required] || []
+          refute_includes required, 'wait_completed'
+          refute_includes required, 'wait_token'
+        end
+      end
+    end
+  end
+end
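
The clamp that test_max_wait_clamped_when_request_exceeds_hard_cap exercises can be illustrated in isolation. This is a minimal sketch rather than the gem's code: `clamp_max_wait` and its keyword arguments are hypothetical names, and the 1800-second default only mirrors the hard cap default declared in multi_llm_review_wait.rb.

```ruby
# Hypothetical standalone version of the max_wait clamp (illustrative names).
# Order of operations follows the tool source: hard cap first, then a floor
# of 1 second, then the time remaining before the collect deadline.
def clamp_max_wait(requested, hard_cap: 1800, remaining_deadline: nil)
  wait = requested.to_i
  wait = hard_cap if wait > hard_cap
  wait = 1 if wait < 1
  if remaining_deadline
    remaining = remaining_deadline.to_i
    wait = remaining if remaining < wait
    wait = 1 if wait < 1 # never block for less than one second
  end
  wait
end

clamp_max_wait(999_999)                        # => 1800 (hard cap)
clamp_max_wait(999_999, remaining_deadline: 2) # => 2    (deadline wins)
clamp_max_wait(0)                              # => 1    (floor)
```

With a deadline two seconds away, a 999_999-second request blocks for about two seconds, which is why the test can assert a generous 30-second bound.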

data/templates/skillsets/multi_llm_review/tools/multi_llm_review.rb CHANGED

@@ -92,6 +92,16 @@ module KairosMcp
                   type: 'integer',
                   description: 'Override dispatch timeout in seconds (default from config)'
                 },
+                collect_deadline_seconds_override: {
+                  type: 'integer',
+                  description: 'Override how long the orchestrator has to call ' \
+                    'multi_llm_review_collect before the pending token expires ' \
+                    '(default from config: delegation.collect_deadline_seconds). ' \
+                    'In the async/parallel path, the effective deadline is also ' \
+                    'auto-extended to cover the worker self_timeout plus a poll margin, ' \
+                    'so raising timeout_seconds_override alone no longer leaves the ' \
+                    'collect deadline shorter than the worker lifespan.'
+                },
                 complexity: {
                   type: 'string',
                   enum: %w[auto low medium high critical],
@@ -446,7 +456,8 @@ module KairosMcp
               }))
             end
-            deadline_secs = config.dig('delegation', 'collect_deadline_seconds') || 600
+            deadline_secs = arguments['collect_deadline_seconds_override'] ||
+                            config.dig('delegation', 'collect_deadline_seconds') || 1800
             now = Time.now
             token = PendingState.generate_token
@@ -507,11 +518,22 @@ module KairosMcp
               }))
             end
-            deadline_secs = config.dig('delegation', 'collect_deadline_seconds') || 600
+            deadline_secs = arguments['collect_deadline_seconds_override'] ||
+                            config.dig('delegation', 'collect_deadline_seconds') || 1800
             multiplier = parallel_cfg['worker_self_timeout_multiplier'] || 1.5
             floor      = parallel_cfg['worker_self_timeout_floor_seconds'] || 60
+            poll_interval = parallel_cfg['poll_interval_seconds'] || 0.5
             now = Time.now
-            self_timeout_at = (now + timeout_secs * multiplier + floor).iso8601
+            worker_lifespan_secs = (timeout_secs * multiplier + floor).to_f
+            self_timeout_at = (now + worker_lifespan_secs).iso8601
+            # Auto-extend collect_deadline to cover the worker's self_timeout plus
+            # a polling margin. Without this, raising timeout_seconds_override alone
+            # leaves the orchestrator's submission window shorter than the worker
+            # lifespan — the collect token expires while the worker is still healthy.
+            # Only kicks in for the async path; sync delegate_response has no worker.
+            min_deadline_secs = (worker_lifespan_secs + (poll_interval * 20)).ceil
+            deadline_secs = [deadline_secs.to_i, min_deadline_secs].max
             # UUID collision retry (EEXIST on Dir.mkdir per PendingState§token_dir).
             token = nil
@@ -578,7 +600,7 @@ module KairosMcp
                 'instruction' => 'Run persona-based review using your Agent tool. ' \
                   "Choose #{PersonaAssembly::MIN_PERSONAS}-#{PersonaAssembly::MAX_PERSONAS} " \
                   'personas appropriate to the artifact and review_type. ' \
-                  'Submit findings via multi_llm_review_collect with the collect_token below.',
+                  'Then call multi_llm_review_wait, then multi_llm_review_collect.',
                 'review_type' => arguments['review_type'],
                 'persona_count_min' => PersonaAssembly::MIN_PERSONAS,
                 'persona_count_max' => PersonaAssembly::MAX_PERSONAS
@@ -586,7 +608,18 @@ module KairosMcp
               'subprocess_status' => 'pending',
               'subprocess_total' => reviewers.size,
               'must_collect_by' => (now + deadline_secs).iso8601,
-              'orchestrator_model' => orchestrator_model
+              'orchestrator_model' => orchestrator_model,
+              # next_action hint (R1, R8): MCP does not enforce ordering, but
+              # the LLM is highly likely to follow this hint. Calling wait is
+              # optional — collect alone still works via its internal polling —
+              # but wait surfaces structural completion deterministically.
+              'next_action' => {
+                'tool' => 'multi_llm_review_wait',
+                'args' => { 'collect_token' => token, 'max_wait_seconds' => 600 },
+                'purpose' => 'Phase 1.5: block until subprocess reviewers complete. Call ' \
+                  'AFTER spawning persona Agent reviews, BEFORE multi_llm_review_collect. ' \
+                  'Optional but strongly recommended for deterministic recovery hints.'
+              }
             }))
           end
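
The auto-extension arithmetic in the async path can be checked with a short worked example. This is a sketch under stated assumptions: `effective_collect_deadline` is a hypothetical helper, not a method in the gem, and its defaults are the fallback values visible in the hunk above (multiplier 1.5, floor 60, poll interval 0.5, deadline 1800).

```ruby
# Hypothetical helper reproducing the deadline arithmetic from the diff.
# The worker lifespan is timeout * multiplier + floor; the collect deadline
# is raised, if needed, to cover that lifespan plus 20 poll intervals.
def effective_collect_deadline(timeout_secs:, requested_deadline: 1800,
                               multiplier: 1.5, floor: 60, poll_interval: 0.5)
  worker_lifespan = (timeout_secs * multiplier + floor).to_f
  min_deadline    = (worker_lifespan + poll_interval * 20).ceil
  [requested_deadline.to_i, min_deadline].max
end

# 300 * 1.5 + 60 = 510, plus a 10 s margin = 520; the 1800 s default wins.
effective_collect_deadline(timeout_secs: 300)  # => 1800
# 2400 * 1.5 + 60 = 3660, plus a 10 s margin = 3670; the deadline is extended.
effective_collect_deadline(timeout_secs: 2400) # => 3670
```

The second call shows the case the code comment describes: a large timeout override alone would otherwise leave the 1800-second deadline shorter than the 3660-second worker lifespan.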

data/templates/skillsets/multi_llm_review/tools/multi_llm_review_wait.rb ADDED

@@ -0,0 +1,313 @@
+# frozen_string_literal: true
+require 'json'
+require 'time'
+require_relative '../lib/multi_llm_review/pending_state'
+require_relative '../lib/multi_llm_review/wait_for_worker'
+module KairosMcp
+  module SkillSets
+    module MultiLlmReview
+      module Tools
+        # Phase 1.5 of the orchestrator delegation protocol.
+        #
+        # Optional blocking gate that the orchestrator can call AFTER spawning
+        # persona Agent reviews and BEFORE multi_llm_review_collect. The server
+        # polls the detached worker's state and returns when the subprocess
+        # reviewers complete (or earlier on terminal conditions).
+        #
+        # Without this tool, the orchestrator can still call collect directly;
+        # collect's own internal polling covers worker completion. wait is a
+        # tool-chain checkpoint that surfaces structural status (ready,
+        # crashed, exhausted) with explicit next_action recovery hints, so
+        # the LLM can choose the right next step deterministically.
+        #
+        # Status enum (R10):
+        #   ready                  — subprocess_results.json present, proceed to collect
+        #   still_pending          — max_wait elapsed, worker healthy, may call wait again
+        #   crashed                — worker terminal failure (with reason)
+        #   unknown_token          — token dir missing (never existed or GC'd)
+        #   already_collected      — collected.json present, retrieve cached payload
+        #   past_collect_deadline  — token alive but past deadline; collect would reject
+        class MultiLlmReviewWait < KairosMcp::Tools::BaseTool
+          # Per-call hard cap on max_wait_seconds (R7).
+          MAX_WAIT_HARD_CAP_DEFAULT = 1800
+          # Default streak limit before still_pending escalates to crashed (R7).
+          STILL_PENDING_STREAK_LIMIT_DEFAULT = 3
+          def name
+            'multi_llm_review_wait'
+          end
+          def description
+            'Phase 1.5 — block until subprocess reviewers complete for a delegated ' \
+              'multi_llm_review token. Optional but recommended: call after spawning ' \
+              'persona Agent reviews and before multi_llm_review_collect. Returns ' \
+              'a status enum with a next_action recovery hint for every status.'
+          end
+          def category
+            :review
+          end
+          def usecase_tags
+            %w[review multi-llm wait blocking polling]
+          end
+          def related_tools
+            %w[multi_llm_review multi_llm_review_collect]
+          end
+          def input_schema
+            {
+              type: 'object',
+              properties: {
+                collect_token: {
+                  type: 'string',
+                  description: 'UUID v4 token returned by multi_llm_review delegation_pending'
+                },
+                max_wait_seconds: {
+                  type: 'integer',
+                  description: 'Server-side blocking duration cap in seconds. ' \
+                    'Default from config (delegation.parallel.wait_max_default_seconds). ' \
+                    'Hard cap 1800 (delegation.parallel.wait_max_hard_cap_seconds).'
+                }
+              },
+              required: %w[collect_token]
+            }
+          end
+          def call(arguments)
+            token = arguments['collect_token'].to_s
+            unless PendingState.valid_token?(token)
+              return text_content(JSON.generate({
+                'status' => 'unknown_token',
+                'collect_token' => token,
+                'elapsed_seconds' => 0.0,
+                'next_action' => next_action_redispatch(
+                  'Token format invalid. Re-run multi_llm_review to start a new dispatch.'
+                )
+              }))
+            end
+            cfg = config_parallel
+            default_max  = (cfg['wait_max_default_seconds'] || 600).to_i
+            hard_cap     = (cfg['wait_max_hard_cap_seconds'] || MAX_WAIT_HARD_CAP_DEFAULT).to_i
+            poll_int     = (cfg['wait_poll_interval_seconds'] || 1.0).to_f
+            streak_limit = (cfg['wait_still_pending_streak_limit'] ||
+                            STILL_PENDING_STREAK_LIMIT_DEFAULT).to_i
+            requested_max = (arguments['max_wait_seconds'] || default_max).to_i
+            requested_max = hard_cap if requested_max > hard_cap
+            requested_max = 1 if requested_max < 1
+            # 1. already_collected check (collected.json present) — before any
+            #    deadline / token-dir checks so a successful collect always
+            #    returns deterministically even after deadline expiry.
+            if File.exist?(safe_path { PendingState.collected_path(token) })
+              return reply('already_collected', token, 0.0,
+                next_action: next_action_collect_replay(token,
+                  'Collect already completed for this token. Call multi_llm_review_collect ' \
+                  'to retrieve the cached final consensus (idempotent replay).'))
+            end
+            # 2. unknown_token check (state.json missing).
+            state = PendingState.load_state(token)
+            if state.nil?
+              return reply('unknown_token', token, 0.0,
+                next_action: next_action_redispatch(
+                  'Token not found (never existed or already garbage-collected). ' \
+                  'Re-run multi_llm_review to start a new dispatch.'))
+            end
+            # 3. past_collect_deadline early exit (collect would reject anyway).
+            deadline = (Time.iso8601(state['collect_deadline']) rescue nil)
+            if deadline && Time.now > deadline
+              return reply('past_collect_deadline', token, 0.0,
+                subprocess_total: state['subprocess_total'] ||
+                                  (PendingState.load_request(token)&.dig('reviewers')&.size),
+                next_action: next_action_redispatch(
+                  'Token deadline elapsed. multi_llm_review_collect would reject. ' \
+                  'Re-run multi_llm_review to start a new dispatch.'))
+            end
+            # 4. Cap max_wait by remaining deadline (R7) so we never block
+            #    longer than the useful lifetime of the token.
+            if deadline
+              remaining = (deadline - Time.now).to_i
+              requested_max = remaining if remaining < requested_max
+              requested_max = 1 if requested_max < 1
+            end
+            # 5. Streak guard: if still_pending was returned too many times in
+            #    a row, escalate to crashed/wait_exhausted.
+            streak = (state['wait_still_pending_streak'] || 0).to_i
+            if streak >= streak_limit
+              return reply('crashed', token, 0.0,
+                crashed_reason: 'wait_exhausted',
+                still_pending_streak: streak,
+                next_action: next_action_redispatch(
+                  "still_pending streak reached limit (#{streak_limit}). Worker may be " \
+                  'wedged or pathologically slow. Re-run multi_llm_review.'))
+            end
+            # 6. Delegate to existing WaitForWorker for the actual polling.
+            outcome = WaitForWorker.wait(token, {
+              max_wait_seconds: requested_max,
+              poll_interval_seconds: poll_int,
+              startup_grace_seconds: cfg['startup_grace_seconds'] || 30,
+              heartbeat_stale_threshold_seconds: cfg['heartbeat_stale_threshold_seconds'] || 15
+            })
+            translate_outcome(token, outcome, streak, requested_max, state)
+          rescue StandardError => e
+            warn "[multi_llm_review_wait] INTERNAL ERROR: #{e.class}: #{e.message}"
+            warn e.backtrace.first(10).join("\n") if e.backtrace
+            text_content(JSON.generate({
+              'status' => 'error',
+              'error_class' => 'internal',
+              'error' => "#{e.class}: #{e.message}",
+              'collect_token' => arguments['collect_token']
+            }))
+          end
+          private
+          def translate_outcome(token, outcome, prior_streak, requested_max, state)
+            elapsed = (outcome[:elapsed] || requested_max).to_f
+            subprocess_total = state['subprocess_total'] ||
+                               PendingState.load_request(token)&.dig('reviewers')&.size
+            case outcome[:status]
+            when :ready
+              reset_streak(token)
+              done = (outcome[:results].is_a?(Array) ? outcome[:results].size : nil) ||
+                     subprocess_total
+              reply('ready', token, elapsed,
+                subprocess_done: done,
+                subprocess_total: subprocess_total,
+                next_action: next_action_collect(token,
+                  'Subprocess reviewers complete. Submit your persona Agent findings to ' \
+                  'multi_llm_review_collect to compute the final consensus.'))
+            when :crashed
+              reset_streak(token)
+              reply('crashed', token, elapsed,
+                crashed_reason: outcome[:reason] || 'crashed',
+                subprocess_total: subprocess_total,
+                next_action: next_action_redispatch(
+                  "Worker terminated abnormally (#{outcome[:reason] || 'crashed'}). " \
+                  'Re-run multi_llm_review to start a new dispatch.'))
+            when :timeout
+              new_streak = prior_streak + 1
+              persist_streak(token, new_streak)
+              reply('still_pending', token, elapsed,
+                subprocess_total: subprocess_total,
+                still_pending_streak: new_streak,
+                next_action: next_action_wait(token,
+                  "Worker still healthy after #{requested_max}s. Call multi_llm_review_wait " \
+                  "again with the same token (streak #{new_streak}/#{config_parallel['wait_still_pending_streak_limit'] || STILL_PENDING_STREAK_LIMIT_DEFAULT})."))
+            else
+              reply('crashed', token, elapsed,
+                crashed_reason: "unknown_outcome:#{outcome[:status]}",
+                subprocess_total: subprocess_total,
+                next_action: next_action_redispatch(
+                  'Worker reported an unexpected outcome. Re-run multi_llm_review.'))
+            end
+          end
+          def reply(status, token, elapsed, **fields)
+            payload = {
+              'status' => status,
+              'collect_token' => token,
+              'elapsed_seconds' => elapsed.round(3)
+            }
+            payload['subprocess_done']        = fields[:subprocess_done] if fields.key?(:subprocess_done)
+            payload['subprocess_total']       = fields[:subprocess_total] if fields.key?(:subprocess_total)
+            payload['crashed_reason']         = fields[:crashed_reason] if fields.key?(:crashed_reason)
+            payload['still_pending_streak']   = fields[:still_pending_streak] if fields.key?(:still_pending_streak)
+            payload['next_action']            = fields[:next_action] if fields.key?(:next_action)
+            text_content(JSON.generate(payload))
+          end
+          def next_action_collect(token, purpose)
+            {
+              'tool' => 'multi_llm_review_collect',
+              'args' => {
+                'collect_token' => token,
+                'orchestrator_reviews' => '<persona findings array, 2-4 entries>'
+              },
+              'purpose' => purpose
+            }
+          end
+          def next_action_collect_replay(token, purpose)
+            {
+              'tool' => 'multi_llm_review_collect',
+              'args' => { 'collect_token' => token },
+              'purpose' => purpose
+            }
+          end
+          def next_action_wait(token, purpose)
+            {
+              'tool' => 'multi_llm_review_wait',
+              'args' => { 'collect_token' => token },
+              'purpose' => purpose
+            }
+          end
+          def next_action_redispatch(purpose)
+            {
+              'tool' => 'multi_llm_review',
+              'args' => '<original arguments>',
+              'purpose' => purpose
+            }
+          end
+          # Streak persistence via PendingState.update_state (atomic RMW).
+          def persist_streak(token, n)
+            PendingState.update_state(token) do |state|
+              next nil unless state
+              state['wait_still_pending_streak'] = n
+              state
+            end
+          rescue StandardError
+            # Best-effort. Streak loss = orchestrator gets one more retry,
+            # acceptable degradation.
+          end
+          def reset_streak(token)
+            PendingState.update_state(token) do |state|
+              next nil unless state
+              if state['wait_still_pending_streak'].to_i.positive?
+                state['wait_still_pending_streak'] = 0
+                state
+              else
+                nil
+              end
+            end
+          rescue StandardError
+            # Best-effort.
+          end
+          def safe_path
+            yield
+          rescue StandardError
+            '/dev/null/never_exists'
+          end
+          def config_parallel
+            require 'yaml' # YAML is used below but not required at the top of this file
+            path = File.expand_path('../config/multi_llm_review.yml', __dir__)
+            return {} unless File.exist?(path)
+            cfg = YAML.safe_load_file(path, permitted_classes: [Symbol], aliases: true)
+            (cfg.dig('delegation', 'parallel') || {}).to_h
+          rescue StandardError
+            {}
+          end
+        end
+      end
+    end
+  end
+end
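
From the caller's side, the six statuses reduce to three follow-up steps, matching the next_action helpers above (collect, wait again, or redispatch). The sketch below assumes the orchestrator receives the tool's JSON payload as a string; `NEXT_STEP` and `next_step_for` are hypothetical names, and the `:inspect_error` fallback for unlisted statuses (such as the `error` payload from the rescue branch) is an assumption, not behavior defined by the gem.

```ruby
require 'json'

# Hypothetical orchestrator-side mapping from wait status to follow-up step.
# The groupings follow which next_action_* helper the tool uses per status.
NEXT_STEP = {
  'ready'                 => :collect,    # next_action_collect
  'already_collected'     => :collect,    # next_action_collect_replay
  'still_pending'         => :wait_again, # next_action_wait
  'crashed'               => :redispatch, # next_action_redispatch
  'unknown_token'         => :redispatch,
  'past_collect_deadline' => :redispatch
}.freeze

# Parse a wait payload and return the follow-up step as a symbol.
# Statuses outside the table fall back to :inspect_error (an assumption).
def next_step_for(payload_json)
  payload = JSON.parse(payload_json)
  NEXT_STEP.fetch(payload['status'], :inspect_error)
end

next_step_for(JSON.generate('status' => 'still_pending')) # => :wait_again
```

A table lookup keeps the recovery policy in one place; a caller could equally read `next_action['tool']` from the payload directly, since the tool attaches that hint to every status in the enum.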

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: kairos-chain
 version: !ruby/object:Gem::Version
-  version: 3.23.1
+  version: 3.24.0
 platform: ruby
 authors:
 - Masaomi Hatakeyama
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2026-04-26 00:00:00.000000000 Z
+date: 2026-04-27 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: minitest
@@ -497,6 +497,7 @@ files:
 - templates/skillsets/multi_llm_review/test/test_feedback_formatter.rb
 - templates/skillsets/multi_llm_review/test/test_multi_llm_review.rb
 - templates/skillsets/multi_llm_review/test/test_multi_llm_review_bundle.rb
+- templates/skillsets/multi_llm_review/test/test_multi_llm_review_wait.rb
 - templates/skillsets/multi_llm_review/test/test_pending_state_v3.rb
 - templates/skillsets/multi_llm_review/test/test_pin_resolver.rb
 - templates/skillsets/multi_llm_review/test/test_sanitizer.rb
@@ -504,6 +505,7 @@ files:
 - templates/skillsets/multi_llm_review/tools/multi_llm_review.rb
 - templates/skillsets/multi_llm_review/tools/multi_llm_review_bundle.rb
 - templates/skillsets/multi_llm_review/tools/multi_llm_review_collect.rb
+- templates/skillsets/multi_llm_review/tools/multi_llm_review_wait.rb
 - templates/skillsets/multiuser/config/multiuser.yml
 - templates/skillsets/multiuser/lib/multiuser.rb
 - templates/skillsets/multiuser/lib/multiuser/authorization_gate.rb