@vpxa/aikit 0.1.74 → 0.1.75

Files changed (134)
  1. package/package.json +6 -1
  2. package/packages/cli/dist/index.js +2 -2
  3. package/packages/cli/dist/{init-DQkar6Es.js → init-CuRXmyD9.js} +1 -1
  4. package/packages/cli/dist/scaffold-WMQ2uQ48.js +2 -0
  5. package/packages/cli/dist/{user-CopNWxHP.js → user-vbJwa7x2.js} +1 -1
  6. package/scaffold/dist/adapters/claude-code.mjs +4 -0
  7. package/scaffold/dist/adapters/copilot.mjs +75 -0
  8. package/scaffold/dist/adapters/flows.mjs +1 -0
  9. package/scaffold/dist/adapters/skills.mjs +1 -0
  10. package/scaffold/{compiled → dist/compiled}/flows-data.mjs +304 -446
  11. package/scaffold/{compiled → dist/compiled}/skills-data.mjs +554 -2281
  12. package/scaffold/dist/definitions/agents.mjs +9 -0
  13. package/scaffold/{definitions → dist/definitions}/bodies.mjs +6 -229
  14. package/scaffold/dist/definitions/exclusions.mjs +1 -0
  15. package/scaffold/dist/definitions/hooks.mjs +1 -0
  16. package/scaffold/dist/definitions/models.mjs +1 -0
  17. package/scaffold/dist/definitions/plugins.mjs +1 -0
  18. package/scaffold/{definitions → dist/definitions}/prompts.mjs +9 -149
  19. package/scaffold/{definitions → dist/definitions}/protocols.mjs +9 -37
  20. package/scaffold/dist/definitions/tools.mjs +1 -0
  21. package/packages/cli/dist/scaffold-ukCDW3wQ.js +0 -2
  22. package/scaffold/_preview/agents/Architect-Reviewer-Alpha.agent.md +0 -132
  23. package/scaffold/_preview/agents/Architect-Reviewer-Beta.agent.md +0 -132
  24. package/scaffold/_preview/agents/Code-Reviewer-Alpha.agent.md +0 -112
  25. package/scaffold/_preview/agents/Code-Reviewer-Beta.agent.md +0 -112
  26. package/scaffold/_preview/agents/Debugger.agent.md +0 -412
  27. package/scaffold/_preview/agents/Documenter.agent.md +0 -468
  28. package/scaffold/_preview/agents/Explorer.agent.md +0 -76
  29. package/scaffold/_preview/agents/Frontend.agent.md +0 -440
  30. package/scaffold/_preview/agents/Implementer.agent.md +0 -425
  31. package/scaffold/_preview/agents/Orchestrator.agent.md +0 -452
  32. package/scaffold/_preview/agents/Planner.agent.md +0 -481
  33. package/scaffold/_preview/agents/README.md +0 -57
  34. package/scaffold/_preview/agents/Refactor.agent.md +0 -435
  35. package/scaffold/_preview/agents/Researcher-Alpha.agent.md +0 -151
  36. package/scaffold/_preview/agents/Researcher-Beta.agent.md +0 -152
  37. package/scaffold/_preview/agents/Researcher-Delta.agent.md +0 -153
  38. package/scaffold/_preview/agents/Researcher-Gamma.agent.md +0 -152
  39. package/scaffold/_preview/agents/Security.agent.md +0 -433
  40. package/scaffold/_preview/agents/_shared/architect-reviewer-base.md +0 -104
  41. package/scaffold/_preview/agents/_shared/code-agent-base.md +0 -366
  42. package/scaffold/_preview/agents/_shared/code-reviewer-base.md +0 -87
  43. package/scaffold/_preview/agents/_shared/decision-protocol.md +0 -27
  44. package/scaffold/_preview/agents/_shared/forge-protocol.md +0 -90
  45. package/scaffold/_preview/agents/_shared/researcher-base.md +0 -114
  46. package/scaffold/_preview/agents/templates/adr-template.md +0 -28
  47. package/scaffold/_preview/agents/templates/execution-state.md +0 -26
  48. package/scaffold/_preview/flows/_epilogue/steps/docs-sync/README.md +0 -120
  49. package/scaffold/_preview/flows/aikit-advanced/README.md +0 -70
  50. package/scaffold/_preview/flows/aikit-advanced/steps/design/README.md +0 -178
  51. package/scaffold/_preview/flows/aikit-advanced/steps/execute/README.md +0 -145
  52. package/scaffold/_preview/flows/aikit-advanced/steps/plan/README.md +0 -122
  53. package/scaffold/_preview/flows/aikit-advanced/steps/spec/README.md +0 -121
  54. package/scaffold/_preview/flows/aikit-advanced/steps/task/README.md +0 -119
  55. package/scaffold/_preview/flows/aikit-advanced/steps/verify/README.md +0 -145
  56. package/scaffold/_preview/flows/aikit-basic/README.md +0 -51
  57. package/scaffold/_preview/flows/aikit-basic/steps/assess/README.md +0 -109
  58. package/scaffold/_preview/flows/aikit-basic/steps/design/README.md +0 -116
  59. package/scaffold/_preview/flows/aikit-basic/steps/implement/README.md +0 -131
  60. package/scaffold/_preview/flows/aikit-basic/steps/verify/README.md +0 -123
  61. package/scaffold/_preview/prompts/aikit-ask.prompt.md +0 -13
  62. package/scaffold/_preview/prompts/aikit-debug.prompt.md +0 -15
  63. package/scaffold/_preview/prompts/aikit-design.prompt.md +0 -15
  64. package/scaffold/_preview/prompts/aikit-flow-add.prompt.md +0 -84
  65. package/scaffold/_preview/prompts/aikit-flow-create.prompt.md +0 -80
  66. package/scaffold/_preview/prompts/aikit-flow-manage.prompt.md +0 -24
  67. package/scaffold/_preview/prompts/aikit-implement.prompt.md +0 -17
  68. package/scaffold/_preview/prompts/aikit-plan.prompt.md +0 -15
  69. package/scaffold/_preview/prompts/aikit-review.prompt.md +0 -24
  70. package/scaffold/_preview/skills/adr-skill/SKILL.md +0 -335
  71. package/scaffold/_preview/skills/adr-skill/assets/templates/adr-madr.md +0 -89
  72. package/scaffold/_preview/skills/adr-skill/assets/templates/adr-readme.md +0 -20
  73. package/scaffold/_preview/skills/adr-skill/assets/templates/adr-simple.md +0 -46
  74. package/scaffold/_preview/skills/adr-skill/references/adr-conventions.md +0 -95
  75. package/scaffold/_preview/skills/adr-skill/references/examples.md +0 -193
  76. package/scaffold/_preview/skills/adr-skill/references/review-checklist.md +0 -77
  77. package/scaffold/_preview/skills/adr-skill/references/template-variants.md +0 -52
  78. package/scaffold/_preview/skills/adr-skill/scripts/bootstrap_adr.js +0 -259
  79. package/scaffold/_preview/skills/adr-skill/scripts/new_adr.js +0 -391
  80. package/scaffold/_preview/skills/adr-skill/scripts/set_adr_status.js +0 -169
  81. package/scaffold/_preview/skills/aikit/SKILL.md +0 -754
  82. package/scaffold/_preview/skills/brainstorming/SKILL.md +0 -265
  83. package/scaffold/_preview/skills/brainstorming/spec-document-reviewer-prompt.md +0 -49
  84. package/scaffold/_preview/skills/c4-architecture/SKILL.md +0 -389
  85. package/scaffold/_preview/skills/c4-architecture/references/advanced-patterns.md +0 -552
  86. package/scaffold/_preview/skills/c4-architecture/references/c4-syntax.md +0 -510
  87. package/scaffold/_preview/skills/c4-architecture/references/common-mistakes.md +0 -437
  88. package/scaffold/_preview/skills/c4-architecture/references/html-design-system.md +0 -337
  89. package/scaffold/_preview/skills/c4-architecture/references/html-template.html +0 -627
  90. package/scaffold/_preview/skills/docs/SKILL.md +0 -553
  91. package/scaffold/_preview/skills/docs/references/diataxis-anti-patterns.md +0 -147
  92. package/scaffold/_preview/skills/docs/references/diataxis-compass.md +0 -123
  93. package/scaffold/_preview/skills/docs/references/diataxis-quadrants.md +0 -192
  94. package/scaffold/_preview/skills/docs/references/diataxis-quality.md +0 -76
  95. package/scaffold/_preview/skills/docs/references/diataxis-templates.md +0 -120
  96. package/scaffold/_preview/skills/docs/references/flow-artifacts-guide.md +0 -70
  97. package/scaffold/_preview/skills/docs/references/project-knowledge-gotchas.md +0 -32
  98. package/scaffold/_preview/skills/docs/references/project-knowledge-templates.md +0 -281
  99. package/scaffold/_preview/skills/docs/references/project-knowledge-workflow.md +0 -80
  100. package/scaffold/_preview/skills/frontend-design/SKILL.md +0 -237
  101. package/scaffold/_preview/skills/lesson-learned/SKILL.md +0 -113
  102. package/scaffold/_preview/skills/lesson-learned/references/anti-patterns.md +0 -55
  103. package/scaffold/_preview/skills/lesson-learned/references/se-principles.md +0 -109
  104. package/scaffold/_preview/skills/multi-agents-development/SKILL.md +0 -448
  105. package/scaffold/_preview/skills/multi-agents-development/architecture-review-prompt.md +0 -81
  106. package/scaffold/_preview/skills/multi-agents-development/code-quality-review-prompt.md +0 -91
  107. package/scaffold/_preview/skills/multi-agents-development/implementer-prompt.md +0 -93
  108. package/scaffold/_preview/skills/multi-agents-development/parallel-dispatch-example.md +0 -167
  109. package/scaffold/_preview/skills/multi-agents-development/spec-review-prompt.md +0 -81
  110. package/scaffold/_preview/skills/present/SKILL.md +0 -616
  111. package/scaffold/_preview/skills/react/SKILL.md +0 -309
  112. package/scaffold/_preview/skills/repo-access/SKILL.md +0 -178
  113. package/scaffold/_preview/skills/repo-access/references/error-patterns.md +0 -116
  114. package/scaffold/_preview/skills/repo-access/references/platform-matrix.md +0 -142
  115. package/scaffold/_preview/skills/requirements-clarity/SKILL.md +0 -333
  116. package/scaffold/_preview/skills/session-handoff/SKILL.md +0 -199
  117. package/scaffold/_preview/skills/session-handoff/references/handoff-template.md +0 -139
  118. package/scaffold/_preview/skills/session-handoff/references/resume-checklist.md +0 -80
  119. package/scaffold/_preview/skills/session-handoff/scripts/check_staleness.js +0 -269
  120. package/scaffold/_preview/skills/session-handoff/scripts/create_handoff.js +0 -299
  121. package/scaffold/_preview/skills/session-handoff/scripts/list_handoffs.js +0 -113
  122. package/scaffold/_preview/skills/session-handoff/scripts/validate_handoff.js +0 -241
  123. package/scaffold/_preview/skills/typescript/SKILL.md +0 -405
  124. package/scaffold/adapters/claude-code.mjs +0 -73
  125. package/scaffold/adapters/copilot.mjs +0 -292
  126. package/scaffold/adapters/flows.mjs +0 -27
  127. package/scaffold/adapters/skills.mjs +0 -25
  128. package/scaffold/definitions/agents.mjs +0 -266
  129. package/scaffold/definitions/exclusions.mjs +0 -58
  130. package/scaffold/definitions/hooks.mjs +0 -43
  131. package/scaffold/definitions/models.mjs +0 -84
  132. package/scaffold/definitions/plugins.mjs +0 -147
  133. package/scaffold/definitions/tools.mjs +0 -250
  134. package/scaffold/generate.mjs +0 -92
@@ -1,448 +0,0 @@
1
- ---
2
- name: multi-agents-development
3
- description: "Comprehensive patterns for orchestrating multiple AI agents in parallel development workflows. Covers task decomposition, parallel dispatch, context crafting, status handling, review pipelines, and recovery."
4
- metadata:
5
- category: cross-cutting
6
- domain: general
7
- applicability: always
8
- inputs: [plan, codebase]
9
- outputs: [task-decomposition, dispatch-templates]
10
- requires: [aikit]
11
- relatedSkills: [session-handoff]
12
- ---
13
-
14
- # Multi-Agent Development
15
-
16
- Comprehensive patterns for orchestrating multiple AI agents in parallel development workflows. Covers task decomposition, parallel dispatch, context crafting, status handling, review pipelines, and recovery.
17
-
18
- **Core Principle**: Dispatch multiple agents for focused tasks. Each subagent gets fresh, focused context with explicit scope — never inherited session state.
19
-
20
- Load this skill when orchestrating multi-agent work: planning parallel batches, crafting delegation prompts, handling implementer status, running review pipelines, or recovering from agent failures.
21
-
22
- ---
23
-
24
- ## §1 Agent Roles & Model Selection
25
-
26
- ### Role Categories
27
-
28
- | Role | Agents | When to Use | Parallelizable |
29
- |------|--------|-------------|----------------|
30
- | **Orchestration** | Orchestrator, Planner | Workflow control, planning | No (sequential) |
31
- | **Implementation** | Implementer, Frontend, Refactor | Code creation/modification | Yes (disjoint files only) |
32
- | **Research** | Explorer, Researcher-Alpha/Beta/Gamma/Delta | Codebase exploration, decisions | Yes (always) |
33
- | **Review** | Code-Reviewer-Alpha/Beta, Architect-Reviewer-Alpha/Beta | Quality verification | Yes (always) |
34
- | **Diagnostics** | Debugger, Security | Issue tracing, vulnerability analysis | Yes (read-only) |
35
- | **Documentation** | Documenter | README, API docs, changelog | Yes (disjoint files) |
36
-
37
- ### Model Selection by Task Complexity
38
-
39
- Choose the **least powerful model that can handle the role**:
40
-
41
- | Complexity Signal | Model Tier | Example Agents |
42
- |-------------------|-----------|----------------|
43
- | Mechanical (rename, move, add field) | Fast model | Explorer (Gemini Flash) |
44
- | Standard (implement spec, write tests) | Mid-tier | Implementer (GPT-5.4), Refactor (GPT-5.4) |
45
- | Judgment-heavy (architecture, security, debug) | Strongest | Debugger (Opus 4.6), Security (Opus 4.6) |
46
- | Multi-model cross-validation | Mixed | Researcher-Alpha/Beta/Gamma/Delta (all different) |
47
-
48
- **Upgrade signal**: If an agent returns `BLOCKED` or `DONE_WITH_CONCERNS` on a task classified as "Standard", consider re-dispatching to a stronger model.
49
-
50
- ---
51
-
52
- ## §2 Task Decomposition Rules
53
-
54
- ### The Golden Rule
55
- > **One task = one focused problem domain = 1-3 files maximum.**
56
-
57
- ### Decomposition Checklist
58
-
59
- For each task, specify ALL of:
60
- - [ ] **Target files** — exact paths to create or modify
61
- - [ ] **Acceptance criteria** — what "done" looks like (testable)
62
- - [ ] **Agent assignment** — which agent handles this
63
- - [ ] **Dependencies** — which tasks must complete first (if any)
64
-
65
- ### Sizing Guide
66
-
67
- | Task Size | Files | Example | Agent |
68
- |-----------|-------|---------|-------|
69
- | **Micro** | 1 file | Add a utility function | Implementer |
70
- | **Small** | 1-2 files | New endpoint + test | Implementer |
71
- | **Standard** | 2-3 files | Feature with service + controller + test | Implementer |
72
- | **Too big** | 4+ files | **SPLIT IT** — decompose further | — |
73
-
74
- ### Splitting Strategies
75
- - **By layer**: Service logic (Implementer) + UI component (Frontend) + tests (Implementer)
76
- - **By feature boundary**: Auth endpoints (Implementer A) + Profile endpoints (Implementer B)
77
- - **By concern**: Data model changes (Implementer) + API route changes (Implementer) + UI updates (Frontend)
78
-
79
- ---
80
-
81
- ## §3 Independence Decision Tree
82
-
83
- Before marking tasks as parallel, walk this tree:
84
-
85
- ```
86
- Task A and Task B — can they run in parallel?
87
-
88
- ├─ Do they share ANY files? (create, modify, or delete the same file)
89
- │ ├─ YES → SEQUENTIAL (or merge into one task)
90
- │ └─ NO ↓
91
-
92
- ├─ Do they share mutable state? (env vars, globals, same DB table, shared config)
93
- │ ├─ YES → SEQUENTIAL
94
- │ └─ NO ↓
95
-
96
- ├─ Does B need A's output? (B reads a file A creates, B uses A's new export)
97
- │ ├─ YES → SEQUENTIAL (A before B)
98
- │ └─ NO ↓
99
-
100
- ├─ Would A's result change B's approach? (A discovers something that affects B)
101
- │ ├─ YES → SEQUENTIAL or single agent
102
- │ └─ NO ↓
103
-
104
- ├─ Resource contention? (same port, same build process, same lock file)
105
- │ ├─ YES → SEQUENTIAL
106
- │ └─ NO ↓
107
-
108
- └─ ✅ SAFE TO PARALLELIZE
109
- ```
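The decision tree above can be sketched as a pairwise check. This is a minimal illustration, not code from the package: the task shape (`files`, `sharedState`, `produces`, `needs`, `influences`, `resources`) and the function names are assumptions introduced here.

```javascript
// Hypothetical task shape (not defined by the skill):
// { id, files, sharedState, produces, needs, influences, resources }
// where every field except id is an array of strings.
function overlaps(a, b) {
  const seen = new Set(a);
  return b.some((item) => seen.has(item));
}

// Walks the questions in the same order as the decision tree.
function canRunInParallel(a, b) {
  if (overlaps(a.files, b.files)) {
    return { parallel: false, reason: "shared files" };
  }
  if (overlaps(a.sharedState, b.sharedState)) {
    return { parallel: false, reason: "shared mutable state" };
  }
  if (overlaps(a.produces, b.needs) || overlaps(b.produces, a.needs)) {
    return { parallel: false, reason: "output dependency" };
  }
  // "influences" lists task ids whose approach this task's result could change;
  // filling it in is a human judgment call, not something code can infer.
  if (a.influences.includes(b.id) || b.influences.includes(a.id)) {
    return { parallel: false, reason: "result may change approach" };
  }
  if (overlaps(a.resources, b.resources)) {
    return { parallel: false, reason: "resource contention" };
  }
  return { parallel: true, reason: "independent" };
}
```

Returning the reason alongside the verdict makes it easier to record why two tasks were serialized.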
110
-
111
- ### Edge Cases
112
-
113
- | Situation | Verdict | Why |
114
- |-----------|---------|-----|
115
- | Both import from same module (read-only) | ✅ Parallel | Reading shared code is fine |
116
- | Both add exports to same index file | ❌ Sequential | Concurrent index.ts edits will conflict |
117
- | A creates a type, B uses that type | ❌ Sequential | B depends on A's output |
118
- | Both modify different test files | ✅ Parallel | Disjoint file sets |
119
- | Both touch package.json | ❌ Sequential | Shared file |
120
- | A adds a route, B adds middleware | ⚠️ Check | If B's middleware affects A's route → sequential |
121
-
122
- ### Integration Verification (after parallel batch completes)
123
-
124
- 1. **Conflict check**: Did any agent unexpectedly modify a file assigned to another agent?
125
- 2. **Import check**: Do all new cross-references resolve?
126
- 3. **Full suite**: `check({})` + `test_run({})` — everything must pass
127
- 4. **Spot check**: Manually verify at least one task's output matches acceptance criteria
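The conflict check in step 1 can be automated once the planned and actual file sets are known. A minimal sketch, assuming both are available as plain objects keyed by agent name; `findConflicts` and its input shapes are hypothetical, not part of the package.

```javascript
// assignments: { [agentName]: string[] } files each agent was assigned (assumed shape)
// modified:    { [agentName]: string[] } files each agent actually changed (assumed shape)
function findConflicts(assignments, modified) {
  // Build a lookup from file path to the agent that owns it.
  const owner = new Map();
  for (const [agent, files] of Object.entries(assignments)) {
    for (const file of files) owner.set(file, agent);
  }
  // Report every file changed by an agent other than its owner.
  const conflicts = [];
  for (const [agent, files] of Object.entries(modified)) {
    for (const file of files) {
      const assignedTo = owner.get(file);
      if (assignedTo !== undefined && assignedTo !== agent) {
        conflicts.push({ file, assignedTo, modifiedBy: agent });
      }
    }
  }
  return conflicts;
}
```

Files that no agent owns are ignored here; an out-of-scope edit to an unassigned file would need a separate scope check.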
128
-
129
- ---
130
-
131
- ## §4 Parallel Dispatch Patterns
132
-
133
- ### Dispatch Rules
134
-
135
- 1. **Max 4 concurrent file-modifying agents** per batch
136
- 2. **Read-only agents have no limit** — Explorer, Researcher*, Reviewer*, Security can always run in parallel
137
- 3. **Build dependency graph first** — phases with no dependencies MUST be batched together
138
- 4. **Never dispatch two implementers to the same file** — even different sections
139
-
140
- ### Batch Strategy
141
-
142
- ```
143
- Phase Plan:
144
- Phase 1: [Task A, Task B, Task C] ← no dependencies between A/B/C
145
- Phase 2: [Task D, Task E] ← D depends on A, E depends on B
146
- Phase 3: [Task F] ← F depends on D and E
147
-
148
- Execution:
149
- Batch 1: dispatch(A, B, C) in parallel → review → gate
150
- Batch 2: dispatch(D, E) in parallel → review → gate
151
- Batch 3: dispatch(F) → review → gate
152
- ```
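The phase plan above amounts to a layered topological sort of the dependency graph, with each layer capped at the concurrency limit. A minimal sketch under that reading; `planBatches` and the `deps` object shape are assumptions for illustration, and the cap of 4 is the limit the dispatch rules give for file-modifying agents.

```javascript
// deps: { [taskId]: string[] } maps each task to the tasks it depends on (assumed shape).
function planBatches(deps, maxPerBatch = 4) {
  const remaining = new Set(Object.keys(deps));
  const done = new Set();
  const batches = [];
  while (remaining.size > 0) {
    // A task is ready once every one of its dependencies has completed.
    const ready = [...remaining].filter((task) => deps[task].every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error("dependency cycle or unknown dependency");
    }
    // Queue the overflow: split a phase into chunks of at most maxPerBatch.
    for (let i = 0; i < ready.length; i += maxPerBatch) {
      batches.push(ready.slice(i, i + maxPerBatch));
    }
    for (const task of ready) {
      remaining.delete(task);
      done.add(task);
    }
  }
  return batches;
}
```

Called with the example graph (`A`, `B`, `C` independent; `D` after `A`; `E` after `B`; `F` after `D` and `E`), this yields three batches in the same order as the execution listing.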
153
-
154
- ### Anti-Patterns
155
-
156
- | ❌ Don't | ✅ Do Instead |
157
- |----------|--------------|
158
- | Dispatch 6 implementers at once | Max 4, queue the rest |
159
- | Give one agent 10 files | Split into 3-4 focused tasks |
160
- | Let agents read the full plan | Give each agent ONLY its task context |
161
- | Retry same prompt on failure | Diagnose first, then re-prompt with fix |
162
- | Skip review after parallel batch | ALWAYS review + integration verify |
163
- | Inherit session context to subagent | Build fresh, focused context per dispatch |

164
-
165
- ---
166
-
167
- ## §5 Context Crafting Guide
168
-
169
- ### The Controller Principle
170
- > **The Orchestrator provides ALL context. Subagents never need to search for context themselves.**
171
-
172
- Each subagent gets a fresh, self-contained prompt. No inherited session state. No "read the plan first."
173
-
174
- ### The 6-Point Prompt Template
175
-
176
- Every delegation prompt MUST include:
177
-
178
- ```markdown
179
- ## 1. Scope
180
- Files to create/modify: [exact paths]
181
- Files to NOT touch: [boundaries]
182
-
183
- ## 2. Goal
184
- [What the code should do — acceptance criteria, testable outcomes]
185
-
186
- ## 3. Architectural Context
187
- [Relevant patterns, conventions, existing code structure]
188
- [Include actual code snippets from compact/digest — don't tell agent to "go read X"]
189
-
190
- ## 4. Constraints
191
- - Follow [pattern/convention]
192
- - Do NOT modify [boundary files]
193
- - Use [specific library/approach]
194
-
195
- ## 5. FORGE Context
196
- Tier: [Floor/Standard/Critical]
197
- Evidence requirements: [what evidence to collect]
198
-
199
- ## 6. Self-Review & Status
200
- Before declaring DONE, verify:
201
- - [ ] All acceptance criteria met
202
- - [ ] No files outside scope modified
203
- - [ ] Tests pass (if applicable)
204
- - [ ] Code follows stated conventions
205
-
206
- End with status: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED
207
- ```
208
-
209
- ### What to Include vs Omit
210
-
211
- | ✅ Include | ❌ Omit |
212
- |-----------|---------|
213
- | Exact file paths and code snippets | Full session history |
214
- | Acceptance criteria | Other agents' tasks |
215
- | Relevant conventions (from KB) | Unrelated architecture context |
216
- | Compact/digest of relevant files | Raw file contents of large files |
217
- | Error messages (if fixing a bug) | Previous failed attempts (unless relevant) |
218
- | FORGE tier and ceremony | Full FORGE protocol explanation |
219
-
220
- ### Context Size Budget
221
-
222
- | Task Complexity | Context Target | Approach |
223
- |-----------------|---------------|----------|
224
- | Micro (1 file) | ~500 tokens | Inline code snippet + goal |
225
- | Small (1-2 files) | ~1000 tokens | `compact` of target files + goal |
226
- | Standard (2-3 files) | ~2000 tokens | `digest` of related files + architectural context |
227
- | Complex (judgment-heavy) | ~3000 tokens | `digest` + relevant decisions from AI Kit |
228
-
229
- ---
230
-
231
- ## §6 Subagent Execution Cycle
232
-
233
- ### Lifecycle
234
-
235
- ```
236
- Orchestrator                           Subagent (fresh instance)
237
- │                                      │
238
- ├─ Craft focused prompt ──────────────►│
239
- │  (6-point template)                  │
240
- │                                      ├─ Understand scope
241
- │                                      ├─ Implement changes
242
- │                                      ├─ Self-review (checklist)
243
- │◄─────────────────── Return status ───┤
244
- │                                      │  (DONE/CONCERNS/NEEDS/BLOCKED)
245
- │                                      │
246
- ├─ Handle status (see §7)              × (subagent terminates)
247
-
248
- ├─ Automated gate (check/test_run)
249
-
250
- ├─ Dispatch reviewers (see §8)
251
-
252
- └─ FORGE evidence_map gate
253
- ```
254
-
255
- ### Key Rules
256
-
257
- 1. **One subagent = one task**. Never reuse a subagent for a different task.
258
- 2. **Controller provides context**. The subagent's prompt contains everything it needs — it should NOT need to search/explore the codebase.
259
- 3. **Self-review before handoff**. Every implementer must complete the self-review checklist before declaring DONE.
260
- 4. **Status is mandatory**. Every subagent response MUST end with exactly ONE status code.
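Rule 4 can be enforced mechanically by inspecting the last line of a subagent response. A minimal sketch, assuming the status appears as a bare token on the final non-empty line; `parseFinalStatus` is a hypothetical helper, and the skill does not prescribe an exact output syntax.

```javascript
// Longest alternative first so DONE_WITH_CONCERNS is never read as DONE.
const STATUS_PATTERN = /\b(DONE_WITH_CONCERNS|NEEDS_CONTEXT|BLOCKED|DONE)\b/g;

// Returns the status code if the final line carries exactly one, otherwise null.
function parseFinalStatus(response) {
  const lines = response.trim().split(/\r?\n/);
  const lastLine = lines[lines.length - 1];
  const matches = lastLine.match(STATUS_PATTERN);
  return matches !== null && matches.length === 1 ? matches[0] : null;
}
```

A `null` result covers both a missing status and an ambiguous one, and either way the response should go back to the subagent.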
261
-
262
- ---
263
-
264
- ## §7 Implementer Status Protocol
265
-
266
- ### Status Codes
267
-
268
- Every implementer (Implementer, Frontend, Refactor) MUST end their response with exactly ONE:
269
-
270
- | Status | Meaning | Orchestrator Action |
271
- |--------|---------|-------------------|
272
- | **DONE** | All tasks complete, self-review passed | → Automated gate → Review pipeline |
273
- | **DONE_WITH_CONCERNS** | Complete but flagging issues: [list] | → Surface concerns as `Assumed` claims in evidence_map → Likely HOLD → Address before review |
274
- | **NEEDS_CONTEXT** | Cannot proceed without: [specific question] | → Provide missing context → Re-dispatch same task (counts as retry) |
275
- | **BLOCKED** | Hit a wall: [description] | → Diagnose (see below) |
276
-
277
- ### BLOCKED Diagnosis Tree
278
-
279
- ```
280
- Agent returned BLOCKED
281
-
282
- ├─ Missing context? (needs info not in prompt)
283
- │ → Provide context, re-dispatch
284
-
285
- ├─ Wrong model? (task too complex for assigned model)
286
- │ → Re-dispatch to stronger model (e.g., Implementer → Debugger)
287
-
288
- ├─ Scope too broad? (agent overwhelmed)
289
- │ → Split task further, re-dispatch smaller pieces
290
-
291
- ├─ Plan wrong? (implementation approach won't work)
292
- │ → Re-plan this phase, check AI Kit for alternatives
293
-
294
- └─ External blocker? (dependency not ready, API unavailable)
295
- → Park task, proceed with independent work, revisit later
296
- ```
297
-
298
- ### FORGE Composition
299
-
300
- Status protocol and FORGE are **independent but composable**:
301
-
302
- - **Status** = subjective agent telemetry ("I think I'm done")
303
- - **FORGE** = objective quality evidence ("the evidence says it's done")
304
-
305
- ```
306
- DONE → proceed to automated gate → FORGE evidence_map
307
- DONE_WITH_CONCERNS → concerns become 'Assumed' claims → evidence_map likely HOLDs
308
- NEEDS_CONTEXT → provide context, re-dispatch (no FORGE yet)
309
- BLOCKED → diagnose:
310
- contract/security issue → HARD_BLOCK
311
- resource/scope issue → re-plan, no FORGE
312
- ```
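The mapping above can be sketched as a small dispatcher. This is an illustration only: `handleStatus`, the `blockedCause` argument, and the returned field names are assumptions, and the `forge` value merely records whether a FORGE gate applies.

```javascript
// Maps an implementer status to the orchestrator's next step (hypothetical shape).
// blockedCause is only consulted for BLOCKED, e.g. "contract", "security", "scope".
function handleStatus(status, blockedCause = null) {
  switch (status) {
    case "DONE":
      return { next: "automated-gate", forge: "gate" };
    case "DONE_WITH_CONCERNS":
      // Concerns are recorded as 'Assumed' claims, so the gate will likely HOLD.
      return { next: "record-assumed-claims", forge: "gate" };
    case "NEEDS_CONTEXT":
      return { next: "redispatch-with-context", forge: "none", countsAsRetry: true };
    case "BLOCKED":
      if (blockedCause === "contract" || blockedCause === "security") {
        return { next: "escalate", forge: "HARD_BLOCK" };
      }
      return { next: "diagnose-and-replan", forge: "none" };
    default:
      throw new Error(`unknown status: ${status}`);
  }
}
```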
313
-
314
- **Critical rule**: Every `DONE` status MUST be followed by `evidence_map({ action: "gate" })` before proceeding to review. No shortcuts.
315
-
316
- ---
317
-
318
- ## §8 Review Pipeline
319
-
320
- ### Four-Stage Pipeline
321
-
322
- ```
323
- Stage 1: Implementer Self-Review (embedded in agent output)
324
- └─ Checklist: scope respected, tests pass, conventions followed
325
-
326
- Stage 2: Orchestrator Automated Gate
327
- └─ check({}) + test_run({}) MUST pass
328
- └─ Validate self-review checklist present in output
329
- └─ FAIL → bounce back to implementer with specific gap
330
- └─ PASS ↓
331
-
332
- Stage 3: Dual Code Review (parallel)
333
- ├─ Code-Reviewer-Alpha (GPT-5.4): code quality + Spec Alignment
334
- └─ Code-Reviewer-Beta (Opus 4.6): code quality + Spec Alignment
335
- │ Both review same code, different model perspectives
336
- │ Spec Alignment = "Does this match what was asked?"
337
-
338
- Stage 4: Conditional Reviews (parallel if both needed)
339
- ├─ Architecture Review — if boundary changes, new modules, pattern shifts
340
- └─ Security Review — if auth, crypto, input handling, or external data
341
-
342
- FORGE Gate: evidence_map({ action: "gate" })
343
- └─ YIELD → proceed to commit
344
- └─ HOLD → address flagged items → re-gate (max 3 rounds)
345
- └─ HARD_BLOCK → escalate to user
346
- ```
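Stages 3 and 4 can be read as a rule for choosing which reviewers to dispatch. A minimal sketch, assuming the change is described by boolean flags; `selectReviewers` and the flag names are hypothetical, while the agent names come from the role table.

```javascript
// change: boolean flags describing what the batch touched (assumed shape).
function selectReviewers(change) {
  // Stage 3: dual code review always runs.
  const reviewers = ["Code-Reviewer-Alpha", "Code-Reviewer-Beta"];
  // Stage 4: architecture review only for structural changes.
  if (change.boundaryChange || change.newModule || change.patternShift) {
    reviewers.push("Architect-Reviewer-Alpha", "Architect-Reviewer-Beta");
  }
  // Stage 4: security review only for sensitive surfaces.
  if (change.touchesAuth || change.touchesCrypto || change.handlesExternalInput) {
    reviewers.push("Security");
  }
  return reviewers;
}
```

All selected reviewers are read-only, so they can be dispatched in one parallel batch.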
347
-
348
- ### Spec Alignment Dimension (for Code Reviewers)
349
-
350
- Both Code-Reviewer-Alpha and Code-Reviewer-Beta evaluate an explicit **Spec Alignment** dimension:
351
-
352
- 1. Does the implementation match the acceptance criteria from the task?
353
- 2. Are there over-builds (features not requested)?
354
- 3. Are there under-builds (requirements missed)?
355
- 4. Does the output match the expected file changes?
356
-
357
- This catches spec drift that automated tests might miss.
358
-
359
- ### When to Skip Stages
360
-
361
- | Stage | Skip When |
362
- |-------|-----------|
363
- | Architecture Review | No new modules, no boundary changes, no new patterns |
364
- | Security Review | No auth, no crypto, no external input handling |
365
- | FORGE Gate | Floor-tier tasks only (simple, mechanical changes) |
366
-
367
- ---
368
-
369
- ## §9 Recovery & Escalation
370
-
371
- ### Retry Policy
372
-
373
- - **Max 2 retries per agent per task** — after that, re-plan or escalate
374
- - Each retry MUST include the specific failure reason in the new prompt
375
- - Never retry with the same prompt — always add diagnostic context
376
-
377
- ### Loop Detection
378
-
379
- If an agent returns the same error/status 2+ times:
380
- 1. **STOP** — do not retry again
381
- 2. Check if the approach is fundamentally wrong
382
- 3. Consider: different agent, different model, different decomposition, or user escalation
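The retry policy and loop detection combine into one decision per failed attempt. A minimal sketch, assuming the orchestrator keeps a list of failure outcomes for one agent on one task; `decideRetry` and its return labels are hypothetical, and treating any repeated outcome (not only consecutive ones) as a loop is an assumption.

```javascript
// failures: outcome strings of the failed attempts so far, oldest first (assumed shape).
function decideRetry(failures, maxRetries = 2) {
  // Loop detection: the same error or status seen two or more times.
  const counts = new Map();
  for (const outcome of failures) {
    counts.set(outcome, (counts.get(outcome) ?? 0) + 1);
  }
  if ([...counts.values()].some((n) => n >= 2)) {
    return "stop-loop"; // rethink the approach, agent, model, or decomposition
  }
  // The first failure is the original attempt; every later one used up a retry.
  const retriesUsed = Math.max(0, failures.length - 1);
  if (retriesUsed >= maxRetries) {
    return "stop-budget"; // re-plan or escalate
  }
  return "retry"; // re-dispatch with the failure reason added to the prompt
}
```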
383
-
384
- ### Emergency Procedures
385
-
386
- When parallel batch causes cascading failures:
387
-
388
- ```
389
- STOP → Halt all running agents immediately
390
- ASSESS → git diff --stat + check({}) — how bad is it?
391
- CONTAIN → Limited (1-3 files): fix or re-delegate
392
- Widespread (10+ files): git stash to preserve for analysis
393
- RECOVER → Partial: git checkout -- {specific files}
394
- Full: git stash (preserves) or git checkout . (discards)
395
- Nuclear: git reset --hard HEAD (last resort)
396
- DOCUMENT → remember what went wrong, update plan
397
- ```
398
-
399
- ### Scope Tripwires
400
-
401
- | Signal | Action |
402
- |--------|--------|
403
- | Agent modified **at least twice as many files** as planned | Pause, review before continuing |
404
- | Agent returns `ESCALATE` or `BLOCKED` repeatedly | Do NOT re-delegate unchanged. Diagnose first |
405
- | Agent's output contradicts the plan | Stop, compare with plan, re-align |
406
- | Tests that were passing now fail | Immediate rollback of that agent's changes |
407
-
408
- ---
409
-
410
- ## §10 Common Mistakes & Red Flags
411
-
412
- ### Delegation Anti-Patterns
413
-
414
- | ❌ Mistake | Why It Fails | ✅ Fix |
415
- |-----------|-------------|--------|
416
- | **Too broad scope** — "implement the auth system" | Agent lacks clear boundaries, produces sprawling changes | Split: "add JWT middleware to auth.ts" + "add login endpoint to routes.ts" |
417
- | **No constraints** — "add a feature" | Agent invents architecture, conflicts with existing patterns | Include conventions, boundaries, existing patterns in prompt |
418
- | **Vague output** — "make it work" | No way to verify completion | Specific acceptance criteria: "endpoint returns 200 with {schema}" |
419
- | **Session context inheritance** — "continue from where we left off" | Subagent has stale/polluted context | Fresh prompt with 6-point template every time |
420
- | **Skipping reviews** — "it's a small change" | Small changes cause big regressions | ALWAYS run automated gate minimum |
421
- | **Parallel on shared files** — "both agents edit config.ts" | Merge conflicts, lost changes | Sequential, or merge into one task |
422
- | **Trusting the report** — "agent said DONE so it's done" | Agents are optimistic, miss edge cases | Automated gate + dual code review catches this |
423
- | **Brute-force retries** — same prompt 3 times | If it failed twice, it'll fail a third time | Diagnose, change approach, then retry |
424
- | **Orchestrator implements** — "just this one small fix" | Breaks the delegation contract, no review | ALWAYS delegate, no matter how small |
425
-
426
- ### Red Flags in Agent Output
427
-
428
- | Flag | What It Means | Action |
429
- |------|--------------|--------|
430
- | Agent modified files outside its scope | Scope creep or misunderstanding | Rollback out-of-scope files, re-delegate with tighter constraints |
431
- | Agent added dependencies not in plan | Unauthorized architectural decision | Review necessity, likely rollback |
432
- | Agent skipped self-review checklist | Rushing, likely incomplete | Bounce back with checklist requirement |
433
- | Agent reports DONE but tests fail | Didn't actually self-test | Bounce back with failing test output |
434
- | Agent asks questions in output instead of using NEEDS_CONTEXT | Misunderstands status protocol | Treat as NEEDS_CONTEXT, educate in next prompt |
435
-
436
- ---
437
-
438
- ## Prompt Template Reference
439
-
440
- Detailed prompt templates are provided as sidecar files:
441
-
442
- | Template | File | Use When |
443
- |----------|------|----------|
444
- | Implementer dispatch | [`implementer-prompt.md`](implementer-prompt.md) | Dispatching Implementer, Frontend, or Refactor agents |
445
- | Spec compliance review | [`spec-review-prompt.md`](spec-review-prompt.md) | Adversarial spec alignment check (Code-Reviewer-Alpha) |
446
- | Code quality review | [`code-quality-review-prompt.md`](code-quality-review-prompt.md) | Dual code quality review (Code-Reviewer-Beta) |
447
- | Architecture review | [`architecture-review-prompt.md`](architecture-review-prompt.md) | Boundary changes, pattern adherence review |
448
- | Parallel dispatch example | [`parallel-dispatch-example.md`](parallel-dispatch-example.md) | Worked example of decomposing a feature into parallel tasks |
@@ -1,81 +0,0 @@
1
- # Architecture Review Prompt Template
2
-
3
- Use this template when dispatching **Architect-Reviewer-Alpha** and **Architect-Reviewer-Beta** for architecture review. Only needed when changes cross module boundaries, introduce new patterns, or modify shared infrastructure.
4
-
5
- ---
6
-
7
- ## Prompt Template
8
-
9
- ```markdown
10
- You are performing an architecture review. Focus on structural decisions, not code quality (that's handled separately).
11
-
12
- ## Change Summary
13
- [What was changed and why — high-level description]
14
-
15
- ## Files Changed
16
- [List of files with module/boundary information]
17
-
18
- ## Architectural Context
19
- [Paste relevant architecture docs, module graph, dependency structure from compact/digest]
20
-
21
- ---
22
-
23
- ## Review Dimensions
24
-
25
- ### 1. Boundary Integrity
26
- - Do changes respect existing module boundaries?
27
- - Are there new cross-module dependencies that shouldn't exist?
28
- - Is the dependency direction correct (inner → outer, not reverse)?
29
-
30
- ### 2. Pattern Adherence
31
- - Do new components follow established architectural patterns?
32
- - Are there deviations from the documented architecture?
33
- - If a new pattern is introduced, is it justified and documented?
34
-
35
- ### 3. Coupling & Cohesion
36
- - Are new dependencies minimal and well-justified?
- - Could any coupling be reduced with interfaces or inversion?
- - Are related concerns grouped together?
-
- ### 4. Scalability Impact
- - Will this change create bottlenecks under load?
- - Are there single points of failure introduced?
- - Does this work with horizontal scaling?
-
- ### 5. Migration & Evolution
- - Does this change make future changes harder?
- - Are there implicit assumptions that will break?
- - Is the change reversible if needed?
-
- ### 6. API Surface
- - Are new public APIs minimal and well-designed?
- - Could any public surface be internal instead?
- - Are breaking changes flagged?
-
- ---
-
- ## Output Format
-
- ### Verdict: [APPROVE | APPROVE_WITH_NOTES | REQUEST_CHANGES]
-
- **APPROVE** — Architecture is sound.
- **APPROVE_WITH_NOTES** — Acceptable, but track these concerns.
- **REQUEST_CHANGES** — Structural issues that must be resolved.
-
- ### Findings:
- | # | Dimension | Severity | Description | Impact |
- |---|-----------|----------|-------------|--------|
- | 1 | [Dimension] | [Blocker/Concern/Note] | [Issue] | [What breaks/degrades] |
-
- ### Recommendation:
- [Structural guidance for improvement]
- ```
-
- ---
-
- ## Usage Notes
-
- - **Only trigger architecture review when**: changes cross module boundaries, new patterns are introduced, shared infrastructure is modified, or public API surface changes
- - Both Alpha and Beta reviewers run in parallel for multi-model perspective
- - Architecture blockers are HIGH priority — must resolve before merge
- - If both reviewers flag the same concern, it's almost certainly a real issue
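The usage notes of the removed architecture template can be read as a small decision procedure. The sketch below is one possible reading, not code from the package: `ChangeSummary`, `ArchFinding`, and all three function names are hypothetical, and matching findings by dimension name is only a rough stand-in for "the same concern".

```typescript
// Hypothetical description of a change; the field names are illustrative only.
interface ChangeSummary {
  crossesModuleBoundary: boolean;
  introducesNewPattern: boolean;
  modifiesSharedInfrastructure: boolean;
  changesPublicApi: boolean;
}

// One row of the "Findings" table from the template (Impact column omitted).
interface ArchFinding {
  dimension: string;
  severity: "Blocker" | "Concern" | "Note";
  description: string;
}

// True when any of the four trigger conditions in the usage notes holds.
function needsArchitectureReview(change: ChangeSummary): boolean {
  return (
    change.crossesModuleBoundary ||
    change.introducesNewPattern ||
    change.modifiesSharedInfrastructure ||
    change.changesPublicApi
  );
}

// Dimensions flagged by both reviewers, in Alpha's order, without duplicates.
// Assumption: two findings count as the same concern if they share a dimension.
function sharedDimensions(alpha: ArchFinding[], beta: ArchFinding[]): string[] {
  const betaDimensions = beta.map((f) => f.dimension);
  const shared: string[] = [];
  for (const finding of alpha) {
    const d = finding.dimension;
    if (betaDimensions.indexOf(d) !== -1 && shared.indexOf(d) === -1) {
      shared.push(d);
    }
  }
  return shared;
}

// A Blocker from either reviewer must be resolved before merge.
function hasArchitectureBlocker(alpha: ArchFinding[], beta: ArchFinding[]): boolean {
  return alpha.concat(beta).some((f) => f.severity === "Blocker");
}
```

In this reading, a change that only touches internals of a single module skips the architecture review entirely, and overlapping dimensions are surfaced first because agreement between the two reviewers is treated as strong evidence.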
@@ -1,91 +0,0 @@
- # Code Quality Review Prompt Template
-
- Use this template when dispatching **Code-Reviewer-Alpha** and **Code-Reviewer-Beta** for dual code quality review. Both reviewers get the same prompt — different models catch different issues.
-
- This review runs AFTER spec compliance passes.
-
- ---
-
- ## Prompt Template
-
- ```markdown
- You are performing a code quality review. The code has already passed spec compliance — it does what was asked. Your job is to evaluate HOW it does it.
-
- ## Task Context
- [Brief description of what was implemented and why]
-
- ## Files Changed
- [List of files with a one-line description of changes in each]
-
- ## Code Diff
- [Paste the actual code changes — full diff or new file contents]
-
- ---
-
- ## Review Dimensions
-
- ### 1. Readability & Maintainability
- - Are names clear and intention-revealing?
- - Is the code self-documenting or does it need comments?
- - Is complexity managed (small functions, single responsibility)?
- - Would a new team member understand this in 5 minutes?
-
- ### 2. Pattern Adherence
- - Does the code follow the project's established patterns?
- - Are there inconsistencies with surrounding code?
- - Are there framework idioms being violated?
-
- ### 3. Error Handling & Edge Cases
- - Are errors handled at appropriate boundaries?
- - Are edge cases considered (null, empty, overflow, concurrent access)?
- - Are error messages helpful for debugging?
-
- ### 4. Performance
- - Any obvious N+1 queries or unnecessary iterations?
- - Any blocking operations that should be async?
- - Any missing caching opportunities or unnecessary allocations?
-
- ### 5. Security
- - Input validation present for external data?
- - No secrets in code?
- - No injection vulnerabilities (SQL, XSS, command)?
- - Proper authorization checks?
-
- ### 6. Spec Alignment (Cross-Check)
- - Does the implementation match what was requested?
- - Any over-engineering beyond requirements?
- - Any subtle deviations from the intended behavior?
-
- ### 7. Testability
- - Is the code testable in isolation?
- - Are dependencies injectable?
- - Are there missing test cases?
-
- ---
-
- ## Output Format
-
- ### Verdict: [APPROVE | APPROVE_WITH_SUGGESTIONS | REQUEST_CHANGES]
-
- **APPROVE** — Code is production-ready.
- **APPROVE_WITH_SUGGESTIONS** — Good to merge, but consider these improvements.
- **REQUEST_CHANGES** — Must fix before proceeding. List blocking issues.
-
- ### Findings:
- | # | Dimension | Severity | Description | Location | Suggestion |
- |---|-----------|----------|-------------|----------|------------|
- | 1 | [Dimension] | [Blocker/Major/Minor/Nit] | [Issue] | [file:line] | [Fix] |
-
- ### Summary:
- [2-3 sentence overall assessment]
- ```
-
- ---
-
- ## Usage Notes
-
- - Both Alpha and Beta reviewers get this same template — multi-model cross-validation
- - Combine findings from both reviewers before deciding on action
- - **Blocker** findings = REQUEST_CHANGES (must fix)
- - **Major** findings = usually REQUEST_CHANGES unless truly optional
- - **Minor/Nit** = APPROVE_WITH_SUGGESTIONS (fix in follow-up or ignore)
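The severity rules in the removed usage notes amount to a mapping from combined findings to a verdict. A minimal sketch of that mapping follows, assuming findings from both reviewers are simply concatenated. `Finding`, `decideVerdict`, and the `optional` flag are hypothetical names introduced here; the template does not say how a "truly optional" Major would be recorded.

```typescript
type Severity = "Blocker" | "Major" | "Minor" | "Nit";
type Verdict = "APPROVE" | "APPROVE_WITH_SUGGESTIONS" | "REQUEST_CHANGES";

// One row of the "Findings" table from the template (Suggestion column omitted).
interface Finding {
  dimension: string;
  severity: Severity;
  description: string;
  location: string;
  // Assumption: marks a Major finding the coordinator judged truly optional.
  optional?: boolean;
}

// Combine findings from both reviewers, then apply the usage-note rules:
// any Blocker, or any Major not marked optional, yields REQUEST_CHANGES;
// any remaining findings yield APPROVE_WITH_SUGGESTIONS; none yields APPROVE.
function decideVerdict(alpha: Finding[], beta: Finding[]): Verdict {
  const all = alpha.concat(beta);
  if (all.length === 0) {
    return "APPROVE";
  }
  const mustFix = all.some(
    (f) => f.severity === "Blocker" || (f.severity === "Major" && f.optional !== true)
  );
  return mustFix ? "REQUEST_CHANGES" : "APPROVE_WITH_SUGGESTIONS";
}
```

Treating an optional Major like a Minor is a design choice of this sketch; a stricter coordinator could drop the `optional` flag and block on every Major.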