kairos-chain 3.5.0 → 3.6.0

Files changed (53)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +65 -0
  3. data/lib/kairos_mcp/invocation_context.rb +118 -0
  4. data/lib/kairos_mcp/protocol.rb +4 -3
  5. data/lib/kairos_mcp/skill_tool_adapter.rb +2 -2
  6. data/lib/kairos_mcp/tool_registry.rb +60 -17
  7. data/lib/kairos_mcp/tools/base_tool.rb +21 -1
  8. data/lib/kairos_mcp/version.rb +1 -1
  9. data/templates/knowledge/design_to_implementation_workflow/design_to_implementation_workflow.md +196 -0
  10. data/templates/knowledge/multi_llm_review_workflow/multi_llm_review_workflow.md +358 -0
  11. data/templates/knowledge/multi_llm_reviewer_evaluation/multi_llm_reviewer_evaluation.md +185 -0
  12. data/templates/skillsets/agent/config/agent.yml +43 -0
  13. data/templates/skillsets/agent/lib/agent/cognitive_loop.rb +146 -0
  14. data/templates/skillsets/agent/lib/agent/mandate_adapter.rb +45 -0
  15. data/templates/skillsets/agent/lib/agent/message_format.rb +33 -0
  16. data/templates/skillsets/agent/lib/agent/session.rb +193 -0
  17. data/templates/skillsets/agent/lib/agent.rb +6 -0
  18. data/templates/skillsets/agent/skillset.json +21 -0
  19. data/templates/skillsets/agent/test/test_agent_m1.rb +329 -0
  20. data/templates/skillsets/agent/test/test_agent_m2.rb +625 -0
  21. data/templates/skillsets/agent/test/test_agent_m3.rb +710 -0
  22. data/templates/skillsets/agent/test/test_agent_m4.rb +545 -0
  23. data/templates/skillsets/agent/tools/agent_start.rb +150 -0
  24. data/templates/skillsets/agent/tools/agent_status.rb +75 -0
  25. data/templates/skillsets/agent/tools/agent_step.rb +481 -0
  26. data/templates/skillsets/agent/tools/agent_stop.rb +74 -0
  27. data/templates/skillsets/autoexec/lib/autoexec/plan_store.rb +46 -14
  28. data/templates/skillsets/autoexec/lib/autoexec/risk_classifier.rb +26 -0
  29. data/templates/skillsets/autoexec/lib/autoexec/task_dsl.rb +81 -8
  30. data/templates/skillsets/autoexec/tools/autoexec_plan.rb +7 -2
  31. data/templates/skillsets/autoexec/tools/autoexec_run.rb +126 -10
  32. data/templates/skillsets/mcp_client/config/mcp_client.yml +15 -0
  33. data/templates/skillsets/mcp_client/lib/mcp_client/client.rb +110 -0
  34. data/templates/skillsets/mcp_client/lib/mcp_client/connection_manager.rb +127 -0
  35. data/templates/skillsets/mcp_client/lib/mcp_client/proxy_tool.rb +62 -0
  36. data/templates/skillsets/mcp_client/lib/mcp_client.rb +5 -0
  37. data/templates/skillsets/mcp_client/skillset.json +14 -0
  38. data/templates/skillsets/mcp_client/test/test_mcp_client.rb +487 -0
  39. data/templates/skillsets/mcp_client/tools/mcp_connect.rb +116 -0
  40. data/templates/skillsets/mcp_client/tools/mcp_disconnect.rb +55 -0
  41. data/templates/skillsets/mcp_client/tools/mcp_list_remote.rb +50 -0
  42. data/templates/skillsets/mmp/config/meeting.yml +6 -0
  43. data/templates/skillsets/mmp/lib/mmp/attestation_nudge.rb +277 -0
  44. data/templates/skillsets/mmp/lib/mmp.rb +24 -0
  45. data/templates/skillsets/mmp/tools/meeting_acquire_skill.rb +13 -0
  46. data/templates/skillsets/mmp/tools/meeting_attest_skill.rb +19 -0
  47. data/templates/skillsets/mmp/tools/meeting_browse.rb +11 -1
  48. data/templates/skillsets/mmp/tools/meeting_check_freshness.rb +12 -2
  49. data/templates/skillsets/mmp/tools/meeting_connect.rb +12 -1
  50. data/templates/skillsets/mmp/tools/meeting_get_skill_details.rb +11 -1
  51. data/templates/skillsets/mmp/tools/meeting_preview_skill.rb +11 -1
  52. metadata +33 -4
  53. data/templates/knowledge/multi_llm_design_review/multi_llm_design_review.md +0 -398
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 330c7ed792c82c5484002e3d4c34eadee17b12a3dff89b4740bf331f49973565
-   data.tar.gz: eaec92628e858a914b10863221c3463e05767e7e01dfaaf5fe430ff667a4c04a
+   metadata.gz: e63245a9c8dd3d8e79b83ad1eba7697cbf2075832b1afbae7080c4b1dea0e0de
+   data.tar.gz: 56ec7236649c644bd73fccbe35989640ac169831e8300531821c46ff01805475
  SHA512:
-   metadata.gz: d55b480528d2a810bf254948ea867f888e6c426e370fff6ec069f146479fbd2e80908b7ea0f2349f713f76763fca34d108eadd5f3592eb51c91125f36466c9bc
-   data.tar.gz: 72d9eeb8df3c7185d6ad411e4bc77305b74d29002c2022e529fcab0de52fd6706c4048f508039c9b94ebff3db96aa6eeb30a69b736fedeb74e67e867ad4ed912
+   metadata.gz: cb2ddc02ed24e10dcd76e6cf6f8149588b135011b8e777c52324961fc06ecb44a6810c1b97cdedda66bbcc8bcca484400a7f6a245dfbe7c1b831b7b2b0b04003
+   data.tar.gz: 455843011cef593d7ff7693d66f22be19dad14f70fe5b26cccc131cae903362089e35367e4410ba91138a9822b9b8adb6823f7ca98420bb6d9811e8f5233f106
data/CHANGELOG.md CHANGED
@@ -4,6 +4,71 @@ All notable changes to the `kairos-chain` gem will be documented in this file.

  This project follows [Semantic Versioning](https://semver.org/).

+ ## [3.6.0] - 2026-03-28
+
+ ### Added
+
+ - **Agent SkillSet** — OODA cognitive loop for autonomous task execution
+   - `agent_start`: Initialize agent session with mandate and goal
+   - `agent_step`: Execute one OODA cycle (Observe → Orient → Decide → Act via autoexec)
+   - `agent_status`: View cycle history and active mandates
+   - `agent_stop`: End agent session with reflection
+   - Cumulative progress file (`progress.jsonl`) for cross-cycle continuity
+   - Loop detection via decision_payload summary comparison
+   - Multi-cycle mandate progression with checkpoint
+   - 90 tests across M1-M4 milestones
+
+ - **mcp_client SkillSet** — Connect to external MCP servers as a client
+   - `mcp_connect`: Establish connection to remote MCP server (HTTP JSON-RPC)
+   - `mcp_disconnect`: Close connection and unregister proxy tools
+   - `mcp_list_remote`: List available tools on connected server
+   - `ProxyTool`: Dynamic tool proxying with namespace prefixing
+   - `ConnectionManager`: Singleton with lifecycle management
+   - Dual blacklist (Agent + InvocationContext) for security
+   - ORIENT_TOOLS integration for Agent SkillSet awareness
+   - 25 tests (Client 6, ConnectionManager 7, ProxyTool 4, Registry 3, E2E 5)
+
+ - **Attestation Nudge** (MMP SkillSet) — Proactive attestation prompts
+   - Tracks usage of acquired skills, suggests attestation after threshold
+   - `register_gate(:attestation_nudge)` passive observer (zero L0 changes)
+   - Gate detects `resource_read`/`knowledge_get` access to received skills
+   - In-memory tool_name/file_path indexes for O(1) gate miss path
+   - `flock(LOCK_EX)` atomic JSON file updates
+   - Time-window throttling: `cooldown_hours` + `nudge_interval_hours`
+   - Passive decline: nudge emission starts cooldown
+   - Nudge footer on 5 MMP tools (browse, connect, details, preview, freshness)
+   - `sanitize_for_display` for remote metadata in nudge messages
+   - 39 tests, 4 rounds of multi-LLM review (3/3 APPROVE including Codex)
+
+ - **InvocationContext** — Tool invocation chain tracking
+   - Depth limiting, caller tracking, mandate propagation
+   - Whitelist/blacklist policy enforcement at registry boundary
+   - `derive` method for Agent SkillSet tool_names extraction
+   - 59 tests
+
+ ### Changed
+
+ - **L1 Knowledge Consolidation** (4 → 3 skills):
+   - `multi_llm_review_workflow` v3.1: merged with `multi_llm_design_review` (methodology + CLI execution in single skill)
+   - `multi_llm_reviewer_evaluation` v1.1: Codex convergence behavior data, APPROVE signal reliability
+   - `design_to_implementation_workflow` v1.1: self-review phase, implementation review phase, Persona Assembly merge gate
+   - Deleted: `multi_llm_design_review` (absorbed into `multi_llm_review_workflow`)
+   - Self-referential review: v3.0 reviewed by its own multi-LLM process → v3.1
+
+ - **meeting_attest_skill**: Fail-closed when `content_hash` is nil (previously fail-open)
+
+ - **autoexec**: Enhanced `task_dsl` and `plan_store` for Agent SkillSet integration
+
+ ### Fixed
+
+ - **Phase 4 review fixes**: Notification method, restore hook, race condition, stale proxy
+ - **Mandate save race**: Single atomic save (no update_status then stale save)
+ - **Attestation Nudge race condition**: `rebuild_indexes_from(data)` inside `with_locked_data`
+ - **Attestation Nudge index staleness**: `mark_attested` rebuilds indexes
+ - **Attestation Nudge JSON recovery**: `with_locked_data` recovers from corrupted JSON
+
+ ---
+
  ## [3.5.0] - 2026-03-27

  ### Added
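Note on the 3.6.0 entries for `flock(LOCK_EX)` and JSON recovery: they describe a locked read-modify-write cycle on a JSON file. The gem's own `with_locked_data` is not part of this excerpt, so the following is only a minimal sketch of that general pattern. `with_locked_json` is a hypothetical helper name, and falling back to an empty hash when the file is corrupted is an assumed recovery policy, not necessarily what the gem does.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper (not the gem's code): read, modify, and rewrite a JSON
# file while holding an exclusive advisory lock on it.
def with_locked_json(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX)            # blocks until the lock is granted
    raw = f.read
    data = begin
      raw.empty? ? {} : JSON.parse(raw)
    rescue JSON::ParserError
      {}                              # assumed recovery: restart from empty state
    end
    result = yield(data)              # caller mutates the hash in place
    f.rewind
    f.write(JSON.generate(data))
    f.flush
    f.truncate(f.pos)                 # drop leftover bytes from a longer old file
    result
  end                                 # closing the file releases the lock
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'usage.json')
  2.times { with_locked_json(path) { |d| d['count'] = d.fetch('count', 0) + 1 } }
  puts File.read(path)                # {"count":2}

  File.write(path, '{not json')       # simulate a corrupted file
  with_locked_json(path) { |d| d['count'] = d.fetch('count', 0) + 1 }
  puts File.read(path)                # {"count":1}
end
```

Keeping the parse, the mutation, and the write inside one lock scope is what avoids the stale-read race the Fixed section mentions; any index rebuilt from the data would need to happen inside the same block.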
data/lib/kairos_mcp/invocation_context.rb ADDED
@@ -0,0 +1,118 @@
+ # frozen_string_literal: true
+
+ require 'securerandom'
+
+ module KairosMcp
+   # Tracks invocation chain metadata for internal tool-to-tool calls.
+   # Carries depth, caller, mandate, and policy (whitelist/blacklist) through
+   # the entire invocation chain. Created by BaseTool#invoke_tool, threaded
+   # through ToolRegistry#call_tool.
+   class InvocationContext
+     MAX_DEPTH = 10
+
+     attr_reader :depth, :caller_tool, :mandate_id, :token_budget,
+                 :whitelist, :blacklist, :root_invocation_id
+
+     def initialize(depth: 0, caller_tool: nil, mandate_id: nil,
+                    token_budget: nil, whitelist: nil, blacklist: nil,
+                    root_invocation_id: nil)
+       @depth = depth
+       @caller_tool = caller_tool
+       @mandate_id = mandate_id
+       @token_budget = token_budget
+       @whitelist = whitelist
+       @blacklist = blacklist
+       @root_invocation_id = root_invocation_id || SecureRandom.hex(8)
+     end
+
+     # Create a child context for a nested invocation.
+     # Inherits all policy from the parent; increments depth.
+     def child(caller_tool:)
+       raise DepthExceededError, "Max invocation depth (#{MAX_DEPTH}) exceeded" if @depth >= MAX_DEPTH
+
+       self.class.new(
+         depth: @depth + 1,
+         caller_tool: caller_tool,
+         mandate_id: @mandate_id,
+         token_budget: @token_budget,
+         whitelist: @whitelist&.dup,
+         blacklist: @blacklist&.dup,
+         root_invocation_id: @root_invocation_id
+       )
+     end
+
+     # Derive a new context with modified blacklist, preserving all other fields.
+     # Used by agent ACT phase to selectively unblock autoexec tools.
+     # Does NOT increment depth — child() does that at invoke_tool time.
+     def derive(blacklist_remove: [], blacklist_add: [])
+       new_blacklist = Array(@blacklist).dup
+       blacklist_remove.each { |pat| new_blacklist.delete(pat) }
+       blacklist_add.each { |pat| new_blacklist << pat unless new_blacklist.include?(pat) }
+
+       self.class.new(
+         depth: @depth,
+         caller_tool: @caller_tool,
+         mandate_id: @mandate_id,
+         token_budget: @token_budget,
+         whitelist: @whitelist&.dup,
+         blacklist: new_blacklist.empty? ? nil : new_blacklist,
+         root_invocation_id: @root_invocation_id
+       )
+     end
+
+     # Serialize to a plain Hash for passing through tool arguments.
+     # Only includes policy-relevant fields (whitelist, blacklist, mandate_id, token_budget).
+     def to_h
+       {
+         'whitelist' => @whitelist,
+         'blacklist' => @blacklist,
+         'mandate_id' => @mandate_id,
+         'token_budget' => @token_budget
+       }
+     end
+
+     def to_json(*args)
+       require 'json'
+       to_h.to_json(*args)
+     end
+
+     # Reconstruct policy from a Hash (e.g., parsed from tool arguments).
+     # Only restores policy fields — depth and caller are not transferred.
+     def self.from_h(hash)
+       return nil if hash.nil?
+
+       new(
+         whitelist: hash['whitelist'],
+         blacklist: hash['blacklist'],
+         mandate_id: hash['mandate_id'],
+         token_budget: hash['token_budget']
+       )
+     end
+
+     def self.from_json(json_string)
+       require 'json'
+       from_h(JSON.parse(json_string))
+     end
+
+     # Check if a tool is allowed by whitelist/blacklist policy.
+     # Blacklist is checked first (deny wins). Both use fnmatch patterns.
+     # For namespaced tools (e.g., "peer1/agent_start"), also checks
+     # the bare name ("agent_start") to prevent blacklist bypass via
+     # remote proxy tool namespace prefix.
+     def allowed?(tool_name)
+       names = [tool_name]
+       names << tool_name.split('/').last if tool_name.include?('/')
+
+       if @blacklist
+         return false if names.any? { |n| @blacklist.any? { |pat| File.fnmatch(pat, n) } }
+       end
+       if @whitelist
+         return names.any? { |n| @whitelist.any? { |pat| File.fnmatch(pat, n) } }
+       end
+       true
+     end
+
+     class DepthExceededError < StandardError; end
+     class PolicyDeniedError < StandardError; end
+   end
+ end
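Note on `allowed?`: the check above can be exercised in isolation. The sketch below restates the same logic as a standalone function so it runs without the gem; `tool_allowed?` and the tool names passed to it are illustrative assumptions, while `File.fnmatch` is the standard library call the diff itself uses.

```ruby
# Standalone restatement of the policy check (hypothetical helper name).
def tool_allowed?(tool_name, whitelist: nil, blacklist: nil)
  names = [tool_name]
  names << tool_name.split('/').last if tool_name.include?('/')

  # Deny wins: a blacklist match on the full or the bare name blocks the call.
  if blacklist && names.any? { |n| blacklist.any? { |pat| File.fnmatch(pat, n) } }
    return false
  end
  # With a whitelist present, at least one of the names must match it.
  return names.any? { |n| whitelist.any? { |pat| File.fnmatch(pat, n) } } if whitelist

  true
end

puts tool_allowed?('agent_start', blacklist: ['agent_*'])              # false
puts tool_allowed?('peer1/agent_start', blacklist: ['agent_*'])        # false (bare name matches)
puts tool_allowed?('knowledge_get', whitelist: ['knowledge_*'])        # true
puts tool_allowed?('peer1/knowledge_get', whitelist: ['knowledge_*'])  # true (bare name matches)
puts tool_allowed?('anything')                                         # true (no policy set)
```

One consequence worth keeping in mind when writing policies: the bare-name fallback applies to the whitelist as well, so a whitelist pattern written for a local tool also admits a namespaced tool with the same bare name unless the blacklist excludes it.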
data/lib/kairos_mcp/protocol.rb CHANGED
@@ -168,9 +168,10 @@ module KairosMcp
      end

      def handle_tools_list
-       {
-         tools: @tool_registry.list_tools
-       }
+       # Filter namespaced proxy tools (e.g., "peer1/tool") from external clients
+       # to prevent infinite proxy loops. Internal call_tool/tool_exists? still sees them.
+       tools = @tool_registry.list_tools.reject { |t| t[:name].to_s.include?('/') }
+       { tools: tools }
      end

      def handle_tools_call(params)
data/lib/kairos_mcp/skill_tool_adapter.rb CHANGED
@@ -5,8 +5,8 @@ module KairosMcp
    # Adapter that wraps a Skill with tool_config as an MCP Tool
    # This allows skills defined in kairos.rb to be exposed as MCP tools
    class SkillToolAdapter < Tools::BaseTool
-     def initialize(skill, safety = nil)
-       super(safety)
+     def initialize(skill, safety = nil, registry: nil)
+       super(safety, registry: registry)
        @skill = skill
        @tool_config = skill.tool_config
      end
data/lib/kairos_mcp/tool_registry.rb CHANGED
@@ -130,6 +130,9 @@ module KairosMcp

        # Skill-based tools (from kairos.rb with tool block)
        register_skill_tools if skill_tools_enabled?
+
+       # Restore dynamic proxy tools from active mcp_client connections (Phase 4)
+       restore_dynamic_tools
      end

      # Register tools from enabled SkillSets
@@ -154,26 +157,11 @@ module KairosMcp

        Kairos.skills.each do |skill|
          next unless skill.has_tool? # Only skills with tool block and executor
-         adapter = SkillToolAdapter.new(skill, @safety)
+         adapter = SkillToolAdapter.new(skill, @safety, registry: self)
          register(adapter)
        end
      end

-     def skill_tools_enabled?
-       SkillsConfig.load['skill_tools_enabled'] == true
-     end
-
-     def register_if_defined(class_name)
-       klass = Object.const_get(class_name)
-       register(klass.new(@safety))
-     rescue NameError
-       # Class not defined yet (file might not exist), ignore
-     end
-
-     def register(tool)
-       @tools[tool.name] = tool
-     end
-
      def set_workspace(roots)
        @safety.set_workspace(roots)
      end
@@ -182,16 +170,71 @@ module KairosMcp
        @tools.values.map(&:to_schema)
      end

-     def call_tool(name, arguments)
+     # Register a pre-built tool instance (e.g., proxy tools from mcp_client).
+     # Cannot overwrite local (non-proxy) tools to prevent accidental replacement.
+     def register_dynamic_tool(tool_instance)
+       name = tool_instance.name
+       existing = @tools[name]
+       if existing && !existing.respond_to?(:remote_name)
+         raise "Cannot override local tool '#{name}' with dynamic registration"
+       end
+       @tools[name] = tool_instance
+     end
+
+     # Remove a dynamically registered tool (e.g., on mcp_disconnect).
+     def unregister_tool(name)
+       @tools.delete(name)
+     end
+
+     def call_tool(name, arguments, invocation_context: nil)
        tool = @tools[name]
        unless tool
          raise "Tool not found: #{name}"
        end

+       # Defense-in-depth: enforce invocation policy at the registry boundary.
+       # This duplicates the check in BaseTool#invoke_tool so that direct
+       # call_tool calls with a context also respect whitelist/blacklist.
+       if invocation_context && !invocation_context.allowed?(name)
+         raise InvocationContext::PolicyDeniedError,
+               "Tool '#{name}' blocked by invocation policy at registry boundary"
+       end
+
        self.class.run_gates(name, arguments, @safety)
        tool.call(arguments)
      rescue GateDeniedError => e
        [{ type: 'text', text: JSON.pretty_generate({ error: 'forbidden', message: e.message }) }]
+     rescue InvocationContext::DepthExceededError, InvocationContext::PolicyDeniedError => e
+       [{ type: 'text', text: JSON.pretty_generate({ error: 'invocation_denied', message: e.message }) }]
+     end
+
+     private
+
+     def skill_tools_enabled?
+       SkillsConfig.load['skill_tools_enabled'] == true
+     end
+
+     def register_if_defined(class_name)
+       klass = Object.const_get(class_name)
+       register(klass.new(@safety, registry: self))
+     rescue NameError
+       # Class not defined yet (file might not exist), ignore
+     end
+
+     def register(tool)
+       @tools[tool.name] = tool
+     end
+
+     # Restore dynamic proxy tools from active mcp_client connections.
+     # Called at the end of register_tools so that HTTP-mode registries
+     # (which are recreated per request) pick up existing connections.
+     def restore_dynamic_tools
+       return unless defined?(KairosMcp::SkillSets::McpClient::ConnectionManager)
+
+       conn_mgr = KairosMcp::SkillSets::McpClient::ConnectionManager.instance
+       conn_mgr.restore_proxy_tools(self, @safety)
+     rescue StandardError
+       nil # mcp_client SkillSet may not be loaded
      end
    end
  end
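Note on `register_dynamic_tool`: the guard treats any tool that responds to `remote_name` as a proxy, so a proxy may replace another proxy but never a local tool. The sketch below reproduces only that rule with stand-in classes. `LocalTool`, `ProxyTool` (as a Struct), and `MiniRegistry` are hypothetical names for illustration; the real proxy class lives in the mcp_client SkillSet, which is not shown in this excerpt.

```ruby
# Minimal stand-ins (hypothetical): a local tool, and a proxy tool that
# additionally exposes remote_name, which is what the guard keys on.
LocalTool = Struct.new(:name)
ProxyTool = Struct.new(:name, :remote_name)

class MiniRegistry
  def initialize
    @tools = {}
  end

  def register(tool)
    @tools[tool.name] = tool
  end

  # Same rule as the diff: proxies may replace proxies, never local tools.
  def register_dynamic_tool(tool)
    existing = @tools[tool.name]
    if existing && !existing.respond_to?(:remote_name)
      raise "Cannot override local tool '#{tool.name}' with dynamic registration"
    end
    @tools[tool.name] = tool
  end

  def unregister_tool(name)
    @tools.delete(name)
  end

  def tool_names
    @tools.keys
  end
end

registry = MiniRegistry.new
registry.register(LocalTool.new('knowledge_get'))
registry.register_dynamic_tool(ProxyTool.new('peer1/knowledge_get', 'knowledge_get'))
registry.register_dynamic_tool(ProxyTool.new('peer1/knowledge_get', 'knowledge_get')) # re-register: fine

begin
  registry.register_dynamic_tool(ProxyTool.new('knowledge_get', 'knowledge_get'))
rescue RuntimeError => e
  puts e.message
end

registry.unregister_tool('peer1/knowledge_get')
puts registry.tool_names.inspect
```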
data/lib/kairos_mcp/tools/base_tool.rb CHANGED
@@ -1,8 +1,28 @@
+ require_relative '../invocation_context'
+
  module KairosMcp
    module Tools
      class BaseTool
-       def initialize(safety = nil)
+       def initialize(safety = nil, registry: nil)
          @safety = safety
+         @registry = registry
+       end
+
+       # Invoke another tool through the same ToolRegistry, preserving the
+       # full gate pipeline and invocation policy (whitelist/blacklist/depth).
+       # Only available when the tool was registered with a registry reference.
+       def invoke_tool(tool_name, arguments = {}, context: nil)
+         raise "Tool invocation not available (no registry)" unless @registry
+
+         ctx = context || InvocationContext.new
+         child_ctx = ctx.child(caller_tool: name)
+
+         unless child_ctx.allowed?(tool_name)
+           raise InvocationContext::PolicyDeniedError,
+                 "Tool '#{tool_name}' blocked by invocation policy (caller: #{name})"
+         end
+
+         @registry.call_tool(tool_name, arguments, invocation_context: child_ctx)
        end

        def name
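Note on depth limiting: each `invoke_tool` call creates a child context, and `child` raises once the parent is already at `MAX_DEPTH`. The sketch below isolates that arithmetic with a trimmed-down stand-in. `MiniContext` and `max_nested_calls` are hypothetical names, and the stand-in carries only depth and caller, not the policy fields of the real class.

```ruby
class DepthExceededError < StandardError; end

# Trimmed-down stand-in for InvocationContext (depth and caller only).
class MiniContext
  MAX_DEPTH = 10
  attr_reader :depth, :caller_tool

  def initialize(depth: 0, caller_tool: nil)
    @depth = depth
    @caller_tool = caller_tool
  end

  # Same guard as the diff: refuse to go deeper once depth has reached MAX_DEPTH.
  def child(caller_tool:)
    raise DepthExceededError, "Max invocation depth (#{MAX_DEPTH}) exceeded" if @depth >= MAX_DEPTH

    self.class.new(depth: @depth + 1, caller_tool: caller_tool)
  end
end

# Counts how many nested child contexts can be created from a root context.
def max_nested_calls
  ctx = MiniContext.new
  count = 0
  loop do
    ctx = ctx.child(caller_tool: "tool_#{count}")
    count += 1
  end
rescue DepthExceededError
  count
end

puts max_nested_calls # 10
```

Starting from a root context at depth 0, ten nested invocations succeed and the eleventh raises, which `ToolRegistry#call_tool` in this release converts into an `invocation_denied` response.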
data/lib/kairos_mcp/version.rb CHANGED
@@ -1,4 +1,4 @@
  module KairosMcp
-   VERSION = "3.5.0"
+   VERSION = "3.6.0"
    CHANGELOG_URL = "https://github.com/masaomi/KairosChain_2026/blob/main/CHANGELOG.md"
  end
data/templates/knowledge/design_to_implementation_workflow/design_to_implementation_workflow.md ADDED
@@ -0,0 +1,196 @@
+ ---
+ name: design_to_implementation_workflow
+ description: "Full-lifecycle workflow for complex features: design review, self-review, implementation review, and final merge gate. Derived from Service Grant + Attestation Nudge experiments."
+ version: "1.1"
+ tags:
+   - workflow
+   - implementation
+   - multi-llm
+   - design-review
+   - methodology
+   - self-review
+ ---
+
+ # Design-to-Implementation Workflow
+
+ ## Overview
+
+ A structured workflow for implementing complex features (Tier 2+) that maximizes
+ quality through multiple review checkpoints. Each checkpoint finds categorically
+ different bugs.
+
+ ## Full Lifecycle Model (v1.1)
+
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │  DESIGN PHASE                                               │
+ │                                                             │
+ │  Draft v0.1 ──→ Multi-LLM Review R1 ──→ Fix ──→ v0.2        │
+ │                 (structural gaps)                           │
+ │                                                             │
+ │  v0.2 ──→ Multi-LLM Review R2 ──→ Fix ──→ v0.3              │
+ │           (fix correctness)                                 │
+ │                                                             │
+ │  Convergence: 0 FAIL, 2/3+ APPROVE                          │
+ ├─────────────────────────────────────────────────────────────┤
+ │  IMPLEMENTATION PHASE                                       │
+ │                                                             │
+ │  Implement from v0.3 ──→ Tests pass                         │
+ │                                                             │
+ │  Self-Review (Agent subagent) ──→ Fix P0/P1                 │
+ │  (race conditions, edge cases, code quality)                │
+ │                                                             │
+ │  Tests pass again                                           │
+ ├─────────────────────────────────────────────────────────────┤
+ │  VERIFICATION PHASE                                         │
+ │                                                             │
+ │  Multi-LLM Implementation Review ──→ Fix                    │
+ │  (missing wiring, fail-open, integration gaps)              │
+ │                                                             │
+ │  Final Multi-LLM Review + Persona Assembly                  │
+ │  (merge gate: 3/3 APPROVE = merge-ready)                    │
+ └─────────────────────────────────────────────────────────────┘
+ ```
+
+ ## When to Use This Workflow
+
+ | Tier | Scope | Design Review | Self-Review | Impl Review | Final Review |
+ |------|-------|--------------|-------------|-------------|--------------|
+ | 1 | Single file, known pattern | Skip | Optional | Skip | Skip |
+ | 2 | Multi-file, SkillSet feature | 1-2 rounds | Recommended | 1 round | Optional |
+ | 3 | Cross-component, new subsystem | 2-3 rounds | Required | 1 round | Required |
+ | 3+ | Security-critical | 2-3 rounds | Required | 1 round | Required + Persona Assembly |
+
+ ## Phase Details
+
+ ### Design Phase
+
+ #### Solo Design (v0.1)
+ - Single LLM (Opus-class) produces initial design
+ - Include: architecture, component design, schema, error handling, phase boundaries
+ - Output: Complete design document with pseudocode
+
+ #### Multi-LLM Review Rounds
+ - **3 reviewers**: Claude Opus 4.6 + Codex GPT-5.4 + Composer-2
+ - **Convergence criteria**: 0 FAIL, 2/3+ APPROVE
+ - **Typical rounds**: 2-3 for Tier 3 complexity
+ - **Convergence curve**:
+   - R1: Structural gaps — "this is missing" (existence)
+   - R2: Fix correctness — "the fix is wrong" (accuracy)
+   - R3: Refinement — "minor adjustments" (polish)
+
+ ### Implementation Phase
+
+ #### Implementation
+ - Single Opus-class LLM for context preservation
+ - Follow design document's phase ordering
+ - Implement → test within each component before moving to next
+
+ #### Self-Review (NEW in v1.1)
+
+ Before requesting external multi-LLM review, run a self-review using an Agent subagent:
+
+ ```
+ Agent(subagent_type: "general-purpose"):
+   "Review [file] for bugs, race conditions, edge cases,
+    test coverage gaps. Categorize as P0/P1/P2."
+ ```
+
+ **Why self-review matters**:
+ - Finds P0 bugs cheaply (no external LLM cost)
+ - Catches implementation-level issues design review can't see
+ - Example: P0 race condition in `rebuild_indexes` (unlocked file read) — found by self-review, invisible to design review
+
+ **What self-review finds** (confirmed in Attestation Nudge session):
+ - Race conditions in file I/O patterns
+ - Index staleness after state transitions
+ - Missing error recovery paths (corrupted JSON)
+ - Test coverage gaps for edge cases
+
+ ### Verification Phase
+
+ #### Implementation Review (NEW in v1.1)
+
+ After self-review fixes, run full multi-LLM review of the **implemented code** (not design doc):
+
+ **Key difference from design review**: Implementation review finds **categorically different bugs**:
+
+ | Design Review Finds | Implementation Review Finds |
+ |--------------------|-----------------------------|
+ | "This API doesn't exist" | "This method has no call site" |
+ | "The key model is inconsistent" | "The fail-open path is exploitable" |
+ | "Session concept is undefined" | "The return type doesn't match the guard" |
+
+ **Attestation Nudge data point**:
+ - Design review: 8 findings across 2 rounds (structural + correctness)
+ - Implementation review: 5 findings in 1 round (wiring + integration)
+ - **Zero overlap** between design and implementation findings
+
+ #### Final Review + Persona Assembly
+
+ For Tier 3+ or pre-merge gates:
+
+ ```
+ Claude Persona Assembly (4 personas):
+   Kairos     — Philosophical alignment, layer boundaries
+   Guardian   — Security, fail-safe behavior, flock correctness
+   Pragmatist — Code quality, test coverage, performance
+   Skeptic    — What breaks first? Scale? Silent failures?
+ ```
+
+ **When to use Persona Assembly**:
+ - Final merge gate for Tier 3+ features
+ - Safety-critical components
+ - NOT for intermediate rounds (diminishing returns)
+
+ **Merge criteria**: 3/3 APPROVE with 0 FAIL. Codex APPROVE is the strongest signal (see `multi_llm_reviewer_evaluation`).
+
+ ## Effort Level Selection
+
+ | Phase | Effort | Rationale |
+ |-------|--------|-----------|
+ | Design review | High | Maximize gap detection |
+ | Implementation | Medium | Design is detailed; faithful translation |
+ | Self-review | Low | Quick Agent pass, fix obvious issues |
+ | Implementation review | High | Find wiring/integration bugs |
+ | Final review | High | Merge gate with Persona Assembly |
+
+ ## Tool Usage During Implementation
+
+ | Tool | Purpose | Timing |
+ |------|---------|--------|
+ | knowledge_get (L1) | Load domain context | Session start |
+ | context_save (L2) | Save session progress | Session end / milestone |
+ | Agent (subagent) | Self-review | After implementation, before external review |
+
+ ### What NOT to Use During Implementation
+ - **Autonomos**: Overhead of observe/orient/decide is wasteful when design document
+   already serves as roadmap. Save for exploratory phases.
+ - **autoexec**: Designed for structured JSON step plans, not free-form coding
+ - **Agent team**: Context fragmentation across agents. Single LLM preserves
+   cross-component coherence for tightly-coupled implementations.
+
+ ## Convergence Data
+
+ ### Service Grant (Tier 3, 2026-03-18)
+ - Design: v1.0 → v1.4, 3 review rounds, 3 LLMs
+ - Design review findings: R1: 8 P0/P1, R2: 2 FAIL + 28 CONCERN, R3: 0 FAIL
+ - Implementation: Phase 0-3, 2 rounds implementation review
+ - Total bugs found: 8 (design) + 13 (implementation review) + 2 (during coding)
+
+ ### Attestation Nudge (Tier 2, 2026-03-28)
+ - Design: v0.1 → v0.3, 2 review rounds, 3 LLMs
+ - Self-review: 4 fixes (P0-1 race, P1-4 staleness, P1-6 test gap, P2-2 recovery)
+ - Implementation review: 3 fixes (missing call site, fail-open attest, escaping)
+ - Final review (Persona Assembly): 0 FAIL, 3/3 APPROVE
+ - **Codex convergence**: REJECT → REJECT → REJECT → APPROVE (4 rounds)
+
+ ## Anti-Patterns
+
+ - Implementing Phase 2+ when Phase 1 prerequisites aren't met
+ - Using agent team for implementation (context fragmentation)
+ - Skipping self-review (misses cheap P0 fixes)
+ - Skipping implementation review (design review can't find wiring bugs)
+ - Treating Codex REJECT as "too strict" without investigating (usually substantive)
+ - Using Persona Assembly in every round (diminishing returns; save for final gate)
+ - Implementing without design review for Tier 3 complexity ("just implement it")
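Note on the two gates in the workflow document above: the design-phase convergence rule ("0 FAIL, 2/3+ APPROVE") and the merge rule ("3/3 APPROVE with 0 FAIL") can be written as predicates. The sketch below is one possible reading, not something the skill ships. The `Review` struct, its field names, and the reviewer labels are hypothetical, and "2/3+" is interpreted here as at least two thirds of the reviewers approving.

```ruby
# Hypothetical record of one reviewer's result in a round.
Review = Struct.new(:reviewer, :verdict, :fail_count)

# Design-phase convergence: no FAIL findings and at least 2/3 of reviewers approve.
def design_converged?(reviews)
  return false if reviews.empty?

  approvals = reviews.count { |r| r.verdict == :approve }
  reviews.sum(&:fail_count).zero? && approvals * 3 >= reviews.size * 2
end

# Merge gate: all three reviewers approve and nobody reports a FAIL.
def merge_ready?(reviews)
  reviews.size == 3 && reviews.all? { |r| r.verdict == :approve && r.fail_count.zero? }
end

round = [
  Review.new('reviewer_a', :approve, 0),
  Review.new('reviewer_b', :approve, 0),
  Review.new('reviewer_c', :reject, 0)
]

puts design_converged?(round) # true  (2 of 3 approve, 0 FAIL)
puts merge_ready?(round)      # false (merge needs 3 of 3)
```

The gap between the two predicates matches the document's point that a design can converge while one reviewer still rejects, but a merge cannot.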