opencode-swarm 7.58.0 → 7.59.0
- package/.opencode/skills/brainstorm/SKILL.md +142 -0
- package/.opencode/skills/clarify/SKILL.md +103 -0
- package/.opencode/skills/clarify-spec/SKILL.md +58 -0
- package/.opencode/skills/codebase-review-swarm/INSTALL.md +75 -0
- package/.opencode/skills/codebase-review-swarm/README.md +44 -0
- package/.opencode/skills/codebase-review-swarm/SKILL.md +65 -0
- package/.opencode/skills/codebase-review-swarm/agents/openai.yaml +6 -0
- package/.opencode/skills/codebase-review-swarm/assets/jsonl-schemas.md +239 -0
- package/.opencode/skills/codebase-review-swarm/assets/review-report-template.md +244 -0
- package/.opencode/skills/codebase-review-swarm/references/compatibility-and-research-notes.md +25 -0
- package/.opencode/skills/codebase-review-swarm/references/full-v7-source-prompt.md +2373 -0
- package/.opencode/skills/codebase-review-swarm/references/review-protocol-v8.2.md +310 -0
- package/.opencode/skills/codebase-review-swarm/scripts/init-review-run.py +134 -0
- package/.opencode/skills/codebase-review-swarm/scripts/validate-skill-package.py +62 -0
- package/.opencode/skills/consult/SKILL.md +16 -0
- package/.opencode/skills/council/SKILL.md +147 -0
- package/.opencode/skills/critic-gate/SKILL.md +59 -0
- package/.opencode/skills/deep-dive/SKILL.md +142 -0
- package/.opencode/skills/design-docs/SKILL.md +81 -0
- package/.opencode/skills/discover/SKILL.md +20 -0
- package/.opencode/skills/execute/SKILL.md +191 -0
- package/.opencode/skills/issue-ingest/SKILL.md +64 -0
- package/.opencode/skills/phase-wrap/SKILL.md +123 -0
- package/.opencode/skills/plan/SKILL.md +293 -0
- package/.opencode/skills/pre-phase-briefing/SKILL.md +69 -0
- package/.opencode/skills/resume/SKILL.md +23 -0
- package/.opencode/skills/specify/SKILL.md +175 -0
- package/.opencode/skills/swarm-pr-feedback/SKILL.md +192 -0
- package/.opencode/skills/swarm-pr-review/SKILL.md +884 -0
- package/dist/agents/agent-output-schema.d.ts +1 -1
- package/dist/cli/index.js +1351 -1159
- package/dist/commands/command-dispatch.d.ts +1 -0
- package/dist/commands/index.d.ts +1 -0
- package/dist/commands/registry.d.ts +15 -14
- package/dist/config/bundled-skills.d.ts +25 -0
- package/dist/config/constants.d.ts +1 -1
- package/dist/config/schema.d.ts +42 -0
- package/dist/index.js +3517 -2673
- package/dist/memory/schema.d.ts +1 -1
- package/dist/tools/lean-turbo-run-phase.d.ts +2 -1
- package/dist/turbo/lean/index.d.ts +4 -1
- package/dist/turbo/lean/merge-back.d.ts +180 -0
- package/dist/turbo/lean/runner.d.ts +47 -1
- package/dist/turbo/lean/state.d.ts +10 -0
- package/dist/turbo/lean/worktree.d.ts +194 -0
- package/package.json +20 -1
@@ -0,0 +1,64 @@
+---
+name: issue-ingest
+description: >
+Full execution protocol for MODE: ISSUE_INGEST -- GitHub issue intake, localization, spec generation, and transition to planning or tracing.
+---
+
+# Issue Ingest Protocol
+
+This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
+
+### MODE: ISSUE_INGEST
+Activates when: user invokes `/swarm issue <url>`; OR architect receives `[MODE: ISSUE_INGEST issue="<url>"]` signal.
+
+Purpose: ingest a GitHub issue, localize root cause, and produce a resolution spec. The issue URL points to a GitHub issue that describes a bug, feature request, or task to be resolved.
+
+Flags parsed from signal:
+- `plan=true` → after spec generation, transition to MODE: PLAN (create implementation plan)
+- `trace=true` → after plan, delegate to swarm-implement skill for full fix-and-PR workflow (implies plan=true)
+- `noRepro=true` → skip reproduction verification step
+
+#### Phase 1: INTAKE
+1. Fetch the issue body using the GitHub CLI (`gh issue view <N> --repo <owner>/<repo> --json title,body,labels,assignees,comments`) or web fetch.
+2. Parse the issue into a normalized **Intake Note** with four required fields:
+- **Observed behavior**: what the issue reports
+- **Expected behavior**: what should happen instead
+- **Reproduction steps**: how to trigger the issue (may be absent; flag with `[NEEDS REPRO]` if missing)
+- **Environment**: platform, version, configuration context
+3. If any required field is missing and cannot be inferred from context, flag as `[NEEDS REPRO]`.
+4. If `--no-repro` flag is set, skip reproduction verification and proceed with available information.
+5. Exit when the Intake Note is complete or all missing fields are flagged.
+
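The intake rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `IntakeNote` and `flag_missing` are hypothetical names, not part of opencode-swarm, and the sketch treats an empty or whitespace-only field as "missing".

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple

NEEDS_REPRO = "[NEEDS REPRO]"  # marker string taken from the protocol text


@dataclass
class IntakeNote:
    """Hypothetical container for the four required Intake Note fields."""
    observed_behavior: Optional[str] = None
    expected_behavior: Optional[str] = None
    reproduction_steps: Optional[str] = None
    environment: Optional[str] = None


def flag_missing(note: IntakeNote) -> Tuple[IntakeNote, List[str]]:
    """Replace each empty field with the marker and report which were flagged."""
    flagged = []
    for field in fields(note):
        value = getattr(note, field.name)
        if value is None or not value.strip():
            setattr(note, field.name, NEEDS_REPRO)
            flagged.append(field.name)
    return note, flagged


# Example issue that gives no reproduction steps and no environment details.
note, flagged = flag_missing(IntakeNote(
    observed_behavior="CLI exits with code 1 on an empty config file",
    expected_behavior="CLI falls back to default settings",
))
print(flagged)
```

Intake can exit once `flag_missing` has run, since every field is then either filled in or explicitly flagged.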
+#### Phase 2: LOCALIZATION
+1. Delegate to `the active swarm's explorer agent` to scan the codebase for code areas related to the issue's observed behavior.
+2. Build 2–5 candidate hypotheses for root cause, each with:
+- **Location**: file(s) and function(s) most likely responsible
+- **Confidence**: composite score (stack-trace match 0.4, recency 0.25, call-graph proximity 0.2, test-failure correlation 0.15)
+- **Falsifiability**: a specific test or observation that would disprove this hypothesis
+3. Validate top-3 hypotheses in parallel using targeted `the active swarm's sme agent` consultations.
+4. Prune to a single root cause hypothesis with supporting evidence.
+5. Exit when a root cause is identified with ≥70% confidence, or when all hypotheses are exhausted (report ambiguity).
+
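A worked example of the composite confidence score may help. The four weights and the 70% exit threshold come from the protocol text. Everything else is an assumption for illustration: that each signal is normalized to the range 0 to 1, that the score is a plain weighted sum, and the hypothesis names and signal values used here.

```python
# Weights as listed in the protocol; they sum to 1.0.
WEIGHTS = {
    "stack_trace_match": 0.40,
    "recency": 0.25,
    "call_graph_proximity": 0.20,
    "test_failure_correlation": 0.15,
}
THRESHOLD = 0.70  # the ">=70% confidence" exit condition


def composite_confidence(signals: dict) -> float:
    """Weighted sum of per-signal scores; a missing signal counts as 0."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = signals.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        total += weight * value
    return total


# Hypothetical hypotheses with made-up signal values.
hypotheses = {
    "parser.normalize": {
        "stack_trace_match": 1.0,
        "recency": 0.8,
        "call_graph_proximity": 0.5,
        "test_failure_correlation": 1.0,
    },
    "cache.evict": {
        "stack_trace_match": 0.0,
        "recency": 1.0,
        "call_graph_proximity": 0.5,
        "test_failure_correlation": 0.0,
    },
}

scores = {name: composite_confidence(s) for name, s in hypotheses.items()}
accepted = [name for name, score in scores.items() if score >= THRESHOLD]
print(scores, accepted)
```

With these inputs the first hypothesis scores 0.40 + 0.20 + 0.10 + 0.15 = 0.85 and clears the threshold, while the second scores 0.25 + 0.10 = 0.35 and would be pruned.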
+#### Phase 3: SPEC GENERATION
+0. Include a **Root Cause** section derived from Phase 2 localization results: concise statement of the identified root cause, location, and confidence score. Include a **Fix Strategy** section at product/behavior level (what the fix must accomplish, not how to implement it).
+1. Generate `.swarm/spec.md` using the same SPEC CONTENT RULES as MODE: SPECIFY:
+- WHAT users need and WHY — never HOW to implement
+- FR-### / SC-### numbering, Given/When/Then scenarios
+- No technology stack, APIs, or code structure
+- `[NEEDS CLARIFICATION]` markers only for items that survive the clarification funnel: inventory all material uncertainties without numeric cap → classify each (self_resolved/critic_resolved/research_needed/user_decision/deferred_nonblocking) — **overconfidence guard:** if the default is not directly supported by user request, spec, or recorded context, classify as `user_decision` rather than `self_resolved` → consult critic_sounding_board — critic responds per SoundingBoardVerdict: UNNECESSARY→DROP, RESOLVE→RESOLVE, REPHRASE→REPHRASE, APPROVED→ASK_USER — **always-surface protection:** always-surface categories must not receive UNNECESSARY/DROP; override to APPROVED/ASK_USER → record resolved items as assumptions → surface only survivors as markers with decision packet format (grouped by category, recommended defaults, blocking vs optional markers)
+2. Cross-reference the spec against the issue's expected behavior to ensure alignment.
+3. If the issue is a bug: spec must describe the correct behavior, not the broken behavior.
+4. If the issue is a feature: spec must describe the user-facing outcome, not the implementation.
+5. QA GATE SELECTION: Ask user which QA gates to enable (same dialogue as MODE: SPECIFY). Write to `.swarm/context.md` under `## Pending QA Gate Selection`.
+
+#### Phase 4: TRANSITION
+Based on flags:
+- No flags → report spec summary and suggest `PLAN` or `CLARIFY-SPEC`
+- `plan=true` → transition to MODE: PLAN using the generated spec
+- `trace=true` → transition to MODE: PLAN, then delegate to swarm-implement skill for full fix workflow
+
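The flag handling in Phase 4 amounts to a small decision function. The sketch below is hypothetical: `resolve_transition` and the step labels are illustrative names rather than identifiers from the package. It encodes the stated rule that `trace=true` implies `plan=true`.

```python
from typing import List


def resolve_transition(plan: bool = False, trace: bool = False) -> List[str]:
    """Return the ordered follow-up steps for the parsed flags."""
    if trace:
        # trace implies plan: plan first, then hand off to the implement skill.
        return ["MODE: PLAN", "swarm-implement"]
    if plan:
        return ["MODE: PLAN"]
    # No flags: report the spec summary and suggest PLAN or CLARIFY-SPEC.
    return ["REPORT_SPEC_SUMMARY"]


print(resolve_transition())
print(resolve_transition(plan=True))
print(resolve_transition(trace=True))
```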
+RULES:
+- One question per message in INTAKE dialogue (max 6 questions)
+- Hypotheses must be falsifiable — no unfalsifiable hypotheses
+- Spec must be independently testable — each FR must have a verification path
+- The issue URL is already sanitized by the issue command — do not re-sanitize
@@ -0,0 +1,123 @@
+---
+name: phase-wrap
+description: >
+Full execution protocol for MODE: PHASE-WRAP -- phase boundary evidence, drift and hallucination gates, retrospectives, phase completion, and final council.
+---
+
+# Phase Wrap Protocol
+
+This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
+
+## ⛔ RETROSPECTIVE GATE
+
+**MANDATORY before calling phase_complete.** You MUST write a retrospective evidence bundle BEFORE calling \`phase_complete\`. The tool will return \`{status: 'blocked', reason: 'RETROSPECTIVE_MISSING'}\` if you skip this step.
+
+**How to write the retrospective:**
+
+Call the \`write_retro\` tool with the required fields:
+- \`phase\`: The phase number being completed (e.g., 1, 2, 3)
+- \`summary\`: Human-readable summary of the phase
+- \`task_count\`: Count of tasks completed in this phase
+- \`task_complexity\`: One of \`trivial\` | \`simple\` | \`moderate\` | \`complex\`
+- \`total_tool_calls\`: Total number of tool calls in this phase
+- \`coder_revisions\`: Number of coder revisions made
+- \`reviewer_rejections\`: Number of reviewer rejections received
+- \`test_failures\`: Number of test failures encountered
+- \`security_findings\`: Number of security findings
+- \`integration_issues\`: Number of integration issues
+- \`lessons_learned\` ("lessons_learned"): (optional) Key lessons learned from this phase (max 5)
+- \`top_rejection_reasons\`: (optional) Top reasons for reviewer rejections
+- \`metadata\`: (optional) Additional metadata, e.g., \`{ "plan_id": "<current plan title from .swarm/plan.json>" }\`
+
+The tool will automatically write the retrospective to \`.swarm/evidence/retro-{phase}/evidence.json\` with the correct schema wrapper. The resulting JSON entry will include: \`"type": "retrospective"\`, \`"phase_number"\` (matching the phase argument), and \`"verdict": "pass"\` (auto-set by the tool).
+
+**Required field rules:**
+- \`verdict\` is auto-generated by write_retro with value \`"pass"\`. The resulting retrospective entry will have verdict \`"pass"\`; this is required for phase_complete to succeed.
+- \`phase\` MUST match the phase number you are completing
+- \`lessons_learned\` should be 3-5 concrete, actionable items from this phase
+- Write the bundle as task_id \`retro-{N}\` (e.g., \`retro-1\` for Phase 1, \`retro-2\` for Phase 2)
+- \`metadata.plan_id\` should be set to the current project's plan title (from \`.swarm/plan.json\` header). This enables cross-project filtering in the retrospective injection system.
+
+### Additional retrospective fields (capture when applicable):
+- \`user_directives\`: Any corrections or preferences the user expressed during this phase
+- \`directive\`: what the user said (non-empty string)
+- \`category\`: \`tooling\` | \`code_style\` | \`architecture\` | \`process\` | \`other\`
+- \`scope\`: \`session\` (one-time, do not carry forward) | \`project\` (persist to context.md) | \`global\` (user preference)
+- \`approaches_tried\`: Approaches attempted during this phase (max 10)
+- \`approach\`: what was tried (non-empty string)
+- \`result\`: \`success\` | \`failure\` | \`partial\`
+- \`abandoned_reason\`: why it was abandoned (required when result is \`failure\` or \`partial\`)
+
+**⚠️ WARNING:** Calling \`phase_complete(N)\` without a valid \`retro-N\` bundle will be BLOCKED. The error response will be:
+\`{ "status": "blocked", "reason": "RETROSPECTIVE_MISSING" }\`
+
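To make the field rules above concrete, here is a minimal pre-flight check for a `write_retro` payload. It is a sketch only: `validate_retro` is a hypothetical helper, and the real tool's validation logic is not shown in this diff. The field names, allowed values, and limits are the ones listed above.

```python
from typing import List

REQUIRED_FIELDS = (
    "phase", "summary", "task_count", "task_complexity", "total_tool_calls",
    "coder_revisions", "reviewer_rejections", "test_failures",
    "security_findings", "integration_issues",
)
COMPLEXITIES = {"trivial", "simple", "moderate", "complex"}


def validate_retro(payload: dict) -> List[str]:
    """Return a list of rule violations; an empty list means the payload looks valid."""
    errors = [f"missing required field: {name}"
              for name in REQUIRED_FIELDS if name not in payload]
    if "task_complexity" in payload and payload["task_complexity"] not in COMPLEXITIES:
        errors.append("task_complexity: must be trivial, simple, moderate, or complex")
    if len(payload.get("lessons_learned", [])) > 5:
        errors.append("lessons_learned: max 5 items")
    approaches = payload.get("approaches_tried", [])
    if len(approaches) > 10:
        errors.append("approaches_tried: max 10 items")
    for index, attempt in enumerate(approaches):
        if attempt.get("result") in ("failure", "partial") and not attempt.get("abandoned_reason"):
            errors.append(f"approaches_tried[{index}]: abandoned_reason is required")
    return errors


good = {
    "phase": 1, "summary": "Scaffolded the CLI", "task_count": 4,
    "task_complexity": "simple", "total_tool_calls": 37, "coder_revisions": 2,
    "reviewer_rejections": 1, "test_failures": 0, "security_findings": 0,
    "integration_issues": 0,
    "lessons_learned": ["Run lint before review", "Pin test fixtures", "Split large tasks"],
}
bad = dict(good, task_complexity="hard",
           approaches_tried=[{"approach": "monkeypatch fs", "result": "failure"}])
del bad["summary"]

print(validate_retro(good))
print(validate_retro(bad))
```

Running a check like this before the tool call gives a clearer error than discovering a rejected bundle at `phase_complete` time.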
+### MODE: PHASE-WRAP
+1. the active swarm's explorer agent - Rescan
+2. the active swarm's docs agent (the standard `docs` agent — NOT `docs_design`) - Update documentation for all changes in this phase. Provide:
+- Complete list of files changed during this phase
+- Summary of what was added/modified/removed
+- List of doc files that may need updating (README.md, CONTRIBUTING.md, docs/)
+Do NOT dispatch `docs_design` here. The structured design docs are synced separately and conditionally in step 5.58.
+3. Update context.md
+4. Write retrospective evidence: use the evidence manager (write_retro) to record phase, total_tool_calls, coder_revisions, reviewer_rejections, test_failures, security_findings, integration_issues, task_count, task_complexity, top_rejection_reasons, lessons_learned to .swarm/evidence/. Reset Phase Metrics in context.md to 0.
+4.5. Run `evidence_check` to verify all completed tasks have required evidence (review + test). If gaps found, note in retrospective lessons_learned. Optionally run `pkg_audit` if dependencies were modified during this phase. Optionally run `schema_drift` if API routes were modified during this phase.
+5. Run `sbom_generate` with scope='changed' to capture post-implementation dependency snapshot (saved to `.swarm/evidence/sbom/`). This is a non-blocking step - always proceeds to summary.
+5.5. **Drift verification**: Conditional on .swarm/spec.md existence — if spec.md does not exist, skip silently and proceed to step 5.55. If spec.md exists, delegate to the active swarm's critic_drift_verifier agent with DRIFT-CHECK context:
+- Provide: phase number being completed, completed task IDs and their descriptions
+- Include evidence path (.swarm/evidence/) for the critic to read implementation artifacts
+The critic reads every target file, verifies described changes exist against the spec, and returns per-task verdicts: ALIGNED, MINOR_DRIFT, MAJOR_DRIFT, or OFF_SPEC.
+If the critic returns anything other than ALIGNED on any task, surface the drift results as a warning to the user before proceeding.
+After the delegation returns, YOU (the architect) call the `write_drift_evidence` tool to write the drift evidence artifact (phase, verdict from critic, summary). The critic does NOT write files — it is read-only. Only then proceed to step 5.55. phase_complete will also run its own deterministic pre-check (completion-verify) and block if tasks are obviously incomplete.
+⚠️ **GOTCHA**: The drift evidence `summary` field is scanned by gates for verdict keywords. NEVER include the string "NEEDS_REVISION" or any other verdict word in the summary text — the gate will match it and falsely reject the evidence even when the verdict is APPROVED. Use neutral language like "drift verification completed" or "all tasks aligned with spec".
+5.55. **Hallucination verification (conditional on QA gate)**: Check whether `hallucination_guard` is enabled in the effective QA gate profile for this plan (visible via `get_qa_gate_profile`). If disabled, skip silently and proceed to step 5.6.
+If `hallucination_guard` is enabled, delegate to the active swarm's critic_hallucination_verifier agent with HALLUCINATION-CHECK context:
+- Provide: phase number being completed, completed task IDs, every file touched this phase
+- Include evidence path (.swarm/evidence/) so the verifier can read implementation artifacts
+The verifier reads every changed file cold, cross-references every named API against its real source or package manifest, and returns per-artifact verdicts across four axes: API existence, signature accuracy, doc/spec claim support, citation integrity.
+If the verifier returns NEEDS_REVISION: STOP — do NOT call phase_complete.
+Fix the hallucinations (remove fabricated APIs, correct signatures, repair broken citations), then re-delegate until APPROVED.
+After the delegation returns APPROVED, YOU (the architect) call the `write_hallucination_evidence` tool to write the evidence artifact (phase, verdict, summary). The critic does NOT write files — it is read-only.
+NOTE: This step is enforced by the plugin. If `hallucination_guard` is enabled and `.swarm/evidence/{phase}/hallucination-guard.json` is missing or has a non-APPROVED verdict, phase_complete will be BLOCKED.
+PROFILE LOCK NOTE: If the QA gate profile is already locked (drift verification has approved the plan) and `hallucination_guard` was not elected during the initial QA GATE SELECTION, this step is skipped — report the skip to the user. A new plan cycle is required to enable the gate.
+5.56. **Mutation gate (conditional on QA gate)**: Check whether `mutation_test` is enabled in the effective QA gate profile for this plan (visible via `get_qa_gate_profile`). If disabled or turbo mode is active, skip silently and proceed to step 5.6.
+If `mutation_test` is enabled:
+1. Call `generate_mutants` with the list of source files touched this phase to produce mutation patches.
+2. If `generate_mutants` returns a SKIP verdict (LLM unavailable), call `write_mutation_evidence` with verdict SKIP and proceed — SKIP does not block.
+3. Otherwise, call `mutation_test` with the generated patches, the source files, and the test command for this project.
+4. Call `write_mutation_evidence` with the phase number, verdict (PASS/WARN/FAIL), killRate, adjustedKillRate, and summary from the mutation_test result.
+5. If verdict is FAIL: STOP — do NOT call phase_complete. Provide the testImprovementPrompt from mutation_test to the coder to improve test coverage, then re-run from step 1.
+6. If verdict is WARN: non-blocking — proceed to step 5.6 with a warning to the user.
+7. If verdict is PASS: proceed to step 5.6.
+NOTE: This step is enforced by the plugin. If `mutation_test` is enabled and `.swarm/evidence/{phase}/mutation-gate.json` is missing or has a 'fail' verdict, phase_complete will be BLOCKED.
+5.58. **Design-doc sync (conditional on `design_docs.enabled` — issue #1080)**: If `design_docs.enabled` is not true, skip silently. Otherwise: `phase_complete` runs a deterministic, non-blocking design-doc drift check and writes `.swarm/doc-drift-phase-{phase}.json`. If its verdict is `DOC_STALE`, enter MODE: DESIGN_DOCS in sync mode for the stale sections only — delegate to the active swarm's `docs_design` agent (NOT the standard `docs` agent) with the changed files + the stale section IDs, and have it update the affected docs and append a `design-changelog.md` entry. This is advisory and NON-BLOCKING — never hold up phase_complete on design-doc lag, and never write `.swarm/spec.md`, `CHANGELOG.md`, or `docs/releases/pending/*` here.
+5.6. **Mandatory gate evidence**: Before calling phase_complete, ensure:
+- `.swarm/evidence/{phase}/completion-verify.json` exists (written automatically by the completion-verify gate)
+- `.swarm/evidence/{phase}/drift-verifier.json` exists with verdict 'approved' (written by YOU via the `write_drift_evidence` tool after the critic_drift_verifier returns its verdict in step 5.5) — required when .swarm/spec.md exists
+- `.swarm/evidence/{phase}/hallucination-guard.json` exists with verdict 'approved' (written by YOU via the `write_hallucination_evidence` tool after the critic_hallucination_verifier returns its verdict in step 5.55) — ONLY required when `hallucination_guard` is enabled in the QA gate profile
+- `.swarm/evidence/{phase}/mutation-gate.json` exists with verdict 'pass' or 'warn' (written by YOU via the `write_mutation_evidence` tool after step 5.56) — ONLY required when `mutation_test` is enabled in the QA gate profile
+If any required file is missing, run the missing gate first. Turbo mode skips all gates automatically.
+NOTE: Steps 5.5, 5.55, and 5.56 are enforced by runtime hooks. If `hallucination_guard` is enabled and you skip the critic_hallucination_verifier delegation (or fail to call `write_hallucination_evidence`), phase_complete will be BLOCKED by the plugin. Similarly, if `mutation_test` is enabled and you skip step 5.56 (or fail to call `write_mutation_evidence`), phase_complete will be BLOCKED. These are not suggestions — they are hard enforcement mechanisms.
+5.7. **Final Council (conditional on QA gate - last phase only)**: Check whether `final_council` is enabled in the effective QA gate profile (visible via `get_qa_gate_profile`). If disabled, skip silently and proceed to step 6.
+If enabled AND this is the LAST phase in the plan (all other phases have status 'complete' and no more phases remain):
+1. Build a PROJECT DOSSIER from the completed plan, all phase summaries, changed-file summaries, and all relevant evidence artifacts. This is a completed-project review, not General Council mode.
+2. Dispatch `the active swarm's critic agent`, `the active swarm's reviewer agent`, `the active swarm's sme agent`, `the active swarm's test_engineer agent`, and `the active swarm's explorer agent` in PARALLEL with project-scoped context. Each member must review the entire completed body of work and return a `CouncilMemberVerdict` JSON object using `agent`, `verdict` (APPROVE|CONCERNS|REJECT), `confidence`, `findings[]`, `criteriaAssessed[]`, `criteriaUnmet[]`, and `durationMs`.
+3. Collect the five returned verdict objects. Do NOT fabricate, infer, or substitute verdicts. If a member does not return valid JSON, re-dispatch that member.
+4. Call `write_final_council_evidence` with `phase`, `projectSummary`, `roundNumber`, and the collected `verdicts` array. This writes `.swarm/evidence/final-council.json` with plan binding, member verdicts, and quorum metadata.
+⚠️ **GOTCHA**: `write_final_council_evidence` normalizes CONCERNS verdicts to "rejected" internally. A CONCERNS verdict in the **final council** still blocks `phase_complete` even with zero required fixes. You MUST either address the concerns and get APPROVE on a second council round, or surface the non-blocking advisory to the user before proceeding. (Note: the **phase-level** council has a `phaseConcernsAllowComplete` flag that makes CONCERNS advisory; the final council does not.)
+5. Do NOT call `convene_general_council`, do NOT dispatch `council_generalist`, `council_skeptic`, or `council_domain_expert`, and do NOT require `council.general.enabled` for this gate. `final_council` is the same five-member phase council rerun at project scope.
+6. Do NOT call `phase_complete` or `/swarm close` until `.swarm/evidence/final-council.json` exists with an approved, plan-bound, quorumed final-council verdict. When `final_council` is enabled, `phase_complete` will block until that evidence exists.
+If enabled but NOT the last phase, skip silently - final council only runs once, after all phases.
+6. Summarize to user
+7. Ask: "Ready for Phase [N+1]?"
+
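The mandatory gate evidence check in step 5.6 can be approximated with a small script that inspects the evidence directory before `phase_complete` is called. This is a sketch under several assumptions that the diff does not confirm: that `{phase}` in the path is the plain phase number, that each evidence file is a JSON object with a top-level `verdict` key, and that verdicts compare case-insensitively. `missing_gate_evidence` is a hypothetical helper, not a plugin API.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def missing_gate_evidence(swarm_dir: Path, phase: int,
                          hallucination_guard: bool = False,
                          mutation_test: bool = False,
                          turbo: bool = False) -> List[str]:
    """List evidence problems that would need fixing before phase_complete."""
    if turbo:
        return []  # the protocol states that turbo mode skips all gates
    evidence = swarm_dir / "evidence" / str(phase)
    spec_exists = (swarm_dir / "spec.md").is_file()
    # (file name, accepted verdicts or None for existence only, required?)
    checks = [
        ("completion-verify.json", None, True),
        ("drift-verifier.json", {"approved"}, spec_exists),
        ("hallucination-guard.json", {"approved"}, hallucination_guard),
        ("mutation-gate.json", {"pass", "warn"}, mutation_test),
    ]
    problems = []
    for name, accepted, required in checks:
        if not required:
            continue
        path = evidence / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        if accepted is None:
            continue
        verdict = str(json.loads(path.read_text()).get("verdict", "")).lower()
        if verdict not in accepted:
            problems.append(f"{name}: verdict {verdict!r} not accepted")
    return problems


# Build a throwaway .swarm tree: drift evidence is fine, the hallucination
# evidence was never written, and the mutation gate recorded a failure.
with tempfile.TemporaryDirectory() as tmp:
    swarm = Path(tmp) / ".swarm"
    phase_dir = swarm / "evidence" / "1"
    phase_dir.mkdir(parents=True)
    (swarm / "spec.md").write_text("# spec\n")
    (phase_dir / "completion-verify.json").write_text("{}")
    (phase_dir / "drift-verifier.json").write_text(json.dumps({"verdict": "approved"}))
    (phase_dir / "mutation-gate.json").write_text(json.dumps({"verdict": "fail"}))
    problems = missing_gate_evidence(swarm, 1, hallucination_guard=True, mutation_test=True)
    skipped = missing_gate_evidence(swarm, 1, hallucination_guard=True, turbo=True)

print(problems)
```

The plugin's runtime hooks remain the actual enforcement; a check like this only surfaces the gaps earlier.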
+CATASTROPHIC VIOLATION CHECK — ask yourself at EVERY phase boundary (MODE: PHASE-WRAP):
+"Have I delegated to the active swarm's reviewer agent at least once this phase?"
+If the answer is NO: you have a catastrophic process violation.
+STOP. Do not proceed to the next phase. Inform the user:
+"⛔ PROCESS VIOLATION: Phase [N] completed with zero reviewer-agent delegations in the active swarm.
+All code changes in this phase are unreviewed. Recommend retrospective review before proceeding."
+This is not optional. Zero active-swarm reviewer calls in a phase is always a violation.
+There is no project where code ships without review.
+
+### Blockers
+Mark [BLOCKED] in plan.md, skip to next unblocked task, inform user.
@@ -0,0 +1,293 @@
+---
+name: plan
+description: >
+Full execution protocol for MODE: PLAN -- plan creation, external plan ingestion, QA gate persistence, task granularity, and traceability checks.
+---
+
+# Plan Protocol
+
+This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
+
+### MODE: PLAN
+
+SPEC GATE (soft — check before planning):
+- If `.swarm/spec.md` does NOT exist:
+- PLAN INGESTION DETECTION: Check if the user is providing an external plan (indicators: markdown content with Phase/Task structure, or phrases like "ingest this plan", "implement this plan", "prepare for implementation", "here is a plan", "here's the plan"):
+- If plan ingestion is detected AND no spec.md exists: offer this choice FIRST before any planning:
+1. "Generate spec from this plan first" → enter EXTERNAL PLAN IMPORT PATH in MODE: SPECIFY to reverse-engineer a spec.md from the provided plan, then return to planning
+2. "Skip spec and proceed with the provided plan" → proceed directly to plan ingestion and planning without creating a spec
+- This is a SOFT gate — option 2 always lets the user proceed without a spec
+- If no plan ingestion detected: Warn: "No spec found. A spec helps ensure the plan covers all requirements and gives the critic something to verify against. Would you like to create one first?"
+- Offer two options:
+1. "Create a spec first" → transition to MODE: SPECIFY
+2. "Skip and plan directly" → continue with the steps below unchanged
+- If `.swarm/spec.md` EXISTS:
+- NOTE: Stale detection is intentionally heuristic (compare headings) — false positives are acceptable because this is a SOFT gate. When in doubt, ask the user.
+- Read the spec and compare its first heading (or feature description) against the current planning context (the user's request and any existing plan.md title/phase names)
+- STALE SPEC DETECTION: If the spec heading or feature description does NOT match the current work being planned (e.g., spec describes "user authentication" but user is asking to plan "payment integration"), treat the spec as potentially stale and offer three options:
+1. **Archive and create new spec** → attempt to rename .swarm/spec.md to .swarm/spec-archive/spec-{YYYY-MM-DD}.md (create the directory if needed); if archival succeeds: enter MODE: SPECIFY and skip the "spec already exists" prompt; if archival fails: inform user of the failure and offer: retry archival, or proceed with option 2, or proceed with option 3
+2. **Keep existing spec** → use spec.md as-is and proceed with planning below
+3. **Skip spec entirely** → proceed to planning below ignoring the existing spec
+- If the spec appears current (heading matches the work being planned) OR user chose option 2 above, proceed with spec:
+- Read it and use it as the primary input for planning
+- Cross-reference requirements (FR-###) when decomposing tasks
+- Ensure every FR-### maps to at least one task
+- If a task has no corresponding FR-###, flag it as a potential gold-plating risk
+- If user chose option 3 above, proceed without spec: skip all spec-based steps and proceed directly to planning
+
+This is a SOFT gate. When the user chooses "Skip and plan directly", proceed to the steps below exactly as before — do NOT modify any planning behavior.
+
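The two traceability rules in the spec gate (every FR-### maps to at least one task, and a task with no FR-### is a gold-plating risk) can be checked mechanically. The sketch below assumes that FR identifiers use exactly three digits and that tasks cite their requirements inline in the task description. Both are illustrative assumptions, and `check_traceability` is a hypothetical helper.

```python
import re
from typing import Dict, List, Tuple

FR_PATTERN = re.compile(r"\bFR-\d{3}\b")  # assumes three-digit requirement IDs


def check_traceability(spec_text: str, tasks: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (requirements with no task, tasks with no requirement)."""
    spec_frs = set(FR_PATTERN.findall(spec_text))
    covered = set()
    gold_plating = []
    for task_id, description in tasks.items():
        refs = set(FR_PATTERN.findall(description)) & spec_frs
        if not refs:
            gold_plating.append(task_id)  # no matching FR: possible gold-plating
        covered |= refs
    uncovered = sorted(spec_frs - covered)
    return uncovered, gold_plating


spec = """
FR-001: Users can sign in with email and password.
FR-002: Sessions expire after 30 minutes of inactivity.
FR-003: Failed sign-ins are rate limited.
"""
tasks = {
    "1.1": "Implement sign-in form and handler (FR-001)",
    "1.2": "Add session expiry timer (FR-002)",
    "1.3": "Refactor the logging module",
}
uncovered, gold_plating = check_traceability(spec, tasks)
print(uncovered, gold_plating)
```

Here `FR-003` has no task yet and task `1.3` cites no requirement, so both would be raised before the plan is saved.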
+Run CODEBASE REALITY CHECK scoped to codebase elements referenced in spec.md or user constraints. Discrepancies must be reflected in the generated plan.
+
+### CLARIFICATION FUNNEL (pre-save_plan)
+
+Before calling `save_plan` — whether creating a new plan or finalizing an external plan ingestion — the architect MUST run this four-stage clarification funnel. The goal is to limit unnecessary user interruption, not planning completeness.
+
+#### Stage 1: Inventory All Material Uncertainties
+
+Identify ALL uncertainties that could affect the plan. There is NO hard cap on the internal inventory. Cover at minimum:
+
+- Scope boundaries: what is in or out
+- Data loss or destructive behavior
+- Security/privacy risk tolerance
+- Backward compatibility or migration policy
+- Cost/performance tradeoffs
+- User-visible behavior and UX choices
+- Release/rollout strategy
+- QA policy: gate selection and enforcement strictness
+- Architecture choices among materially different paths
+- Dependency or platform assumptions
+- Operational complexity
+
+#### Stage 2: Classify Each Uncertainty
+
+Classify each item as exactly one of:
+
+- `self_resolved`: answered from the user request, spec, plan, codebase reality check, `.swarm/context.md`, repo conventions, or an informed default. **If the default is not directly supported by user request, spec, or recorded context, classify as `user_decision` rather than `self_resolved`.**
+- `critic_resolved`: sent to Critic Sounding Board and resolved by the critic.
+- `research_needed`: needs SME/explorer/domain lookup before user escalation.
+- `user_decision`: only the user can decide because it affects product scope, risk tolerance, policy, budget, UX, rollout, or destructive behavior.
+- `deferred_nonblocking`: useful follow-up detail that does not block a correct initial plan and can be explicitly recorded as an assumption or follow-up.
+
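The bolded rule in the `self_resolved` bullet is easy to lose in prose, so here is a small sketch of it. The five classification values come from the list above. The `classify_default` helper and the source labels (`user_request`, `spec`, `recorded_context`) are hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Set


class Classification(str, Enum):
    SELF_RESOLVED = "self_resolved"
    CRITIC_RESOLVED = "critic_resolved"
    RESEARCH_NEEDED = "research_needed"
    USER_DECISION = "user_decision"
    DEFERRED_NONBLOCKING = "deferred_nonblocking"


# Sources that count as direct support for an informed default.
DIRECT_SUPPORT = frozenset({"user_request", "spec", "recorded_context"})


def classify_default(supported_by: Set[str]) -> Classification:
    """Classify an uncertainty that the architect wants to settle with a default."""
    if DIRECT_SUPPORT & supported_by:
        return Classification.SELF_RESOLVED
    # Unsupported default: escalate instead of silently accepting it.
    return Classification.USER_DECISION


print(classify_default({"spec"}).value)
print(classify_default(set()).value)
```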
72
|
+
#### Stage 3: Consult Critic Sounding Board Before User Escalation

Before asking the user any planning clarification question, the architect MUST consult `critic_sounding_board` with the candidate question set and context.

For each item classified as `research_needed` or `user_decision` in Stage 2, send it to the critic. The critic responds with a verdict from `SoundingBoardVerdict` (see `src/agents/critic.ts`). The mapping between critic verdicts and funnel actions is:

| Critic Verdict (SoundingBoardVerdict) | Funnel Action | Meaning |
|---|---|---|
| `UNNECESSARY` | DROP | Item is unnecessary or answerable from existing context |
| `RESOLVE` | RESOLVE | Critic supplies the answer or recommended default |
| `REPHRASE` | REPHRASE | Question is valid but should be clearer, narrower, or grouped |
| `APPROVED` | ASK_USER | User decision is genuinely required |

**Hard constraint:** Items in the Always-Surface Categories list (below) MUST NOT receive `UNNECESSARY`/`DROP` from the critic; only `REPHRASE` or `APPROVED`/`ASK_USER` are allowed. If the critic returns `UNNECESSARY`/`DROP` for an always-surface item, override the verdict to `APPROVED`/`ASK_USER`.

**Overconfidence guard:** If the critic resolves an item by supplying an answer (verdict `RESOLVE`) but the underlying default is not directly supported by user request, spec, or recorded context, the architect MUST classify the item as `user_decision` rather than `critic_resolved`. Unsupported defaults must not be silently accepted.

Update classifications based on critic response:

- `UNNECESSARY`/`DROP` → reclassify as `self_resolved` and record the reason.
- `RESOLVE` → reclassify as `critic_resolved` and record the answer as an assumption.
- `REPHRASE` → update the question wording and keep as candidate.
- `APPROVED`/`ASK_USER` → confirm as `user_decision`.
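
The verdict handling above can be sketched as a small pure function. This is only an illustration, not the code in `src/agents/critic.ts`: the verdict names come from the table, while `FunnelItem`, its two boolean flags, the `"candidate"` label, and `applyVerdict` are hypothetical names introduced here. The sketch also assumes the hard constraint covers `RESOLVE` on always-surface items, since only `REPHRASE` and `APPROVED` are listed as allowed.

```typescript
// Verdict names are taken from the table above; everything else is illustrative.
type SoundingBoardVerdict = "UNNECESSARY" | "RESOLVE" | "REPHRASE" | "APPROVED";
type Classification = "self_resolved" | "critic_resolved" | "user_decision" | "candidate";

interface FunnelItem {
  question: string;
  alwaysSurface: boolean;    // hypothetical: item falls in an Always-Surface category
  defaultSupported: boolean; // hypothetical: default is backed by request, spec, or recorded context
}

function applyVerdict(item: FunnelItem, verdict: SoundingBoardVerdict): Classification {
  switch (verdict) {
    case "UNNECESSARY":
      // Hard constraint: an always-surface item cannot be dropped.
      return item.alwaysSurface ? "user_decision" : "self_resolved";
    case "RESOLVE":
      // Assumption: always-surface items may not be resolved by the critic either.
      if (item.alwaysSurface) return "user_decision";
      // Overconfidence guard: unsupported defaults go to the user.
      return item.defaultSupported ? "critic_resolved" : "user_decision";
    case "REPHRASE":
      // The wording is updated elsewhere; the item stays a candidate question.
      return "candidate";
    case "APPROVED":
      return "user_decision";
  }
}
```

Keeping the function pure makes it easy to record the outcome of every item, which the protocol requires before `save_plan` is called.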

The architect MUST update the plan's assumptions with all resolved items before proceeding to Stage 4.

Exception: QA gate selection questions are already mandatory user decisions (enforced by the save_plan tool itself) and do NOT need to go through the funnel. QA gate selection is always a direct user dialogue.

#### Stage 4: Surface User Decision Packet

If any items remain classified as `user_decision` after Stage 3, present them as a structured decision packet — NOT as an arbitrary subset or a single question.

The packet MUST include for each decision:

- Category grouping (scope, security, compatibility, performance, UX, rollout, QA policy)
- Why the decision matters
- Recommended default when safe
- Options being weighed
- Impact of accepting the default
- Blocking vs optional marker

The architect MAY ask questions one at a time in interactive mode, but MUST preserve and report the full unresolved list. The architect MUST NOT drop unresolved decisions because of a session question cap.

#### Always-Surface Categories

The critic may improve wording or confirm prior context, but these categories MUST be surfaced to the user unless already explicitly answered by the user or by recorded context:

- Scope boundaries: what is in or out
- Data loss or destructive behavior
- Security/privacy risk tolerance
- Backward compatibility or migration policy
- Breaking changes to existing APIs, contracts, or interfaces
- New dependency additions or version changes
- Deprecation decisions for existing features or APIs
- Cross-platform impact (Windows/macOS/Linux differences)
- Cost/performance tradeoffs
- User-visible behavior and UX choices
- Release/rollout strategy
- Optional QA gates or stricter enforcement modes
- Any choice that changes whether the work is advisory vs hard-blocking

#### Assumptions Recording

All items resolved in Stages 2-3 (self_resolved, critic_resolved, deferred_nonblocking) MUST be recorded as explicit assumptions in `.swarm/context.md` under `## Decisions` before calling `save_plan`. Silently dropping resolved uncertainties is a protocol violation — every uncertainty that entered the funnel must have a recorded outcome.

The plan generated by `save_plan` MUST include explicit assumptions and remaining unresolved decisions in the task descriptions or acceptance criteria — not silently omit them.

Use the `save_plan` tool to create the implementation plan. Required parameters:
- `title`: The real project name from the spec (NOT a placeholder like [Project])
- `swarm_id`: The swarm identifier (e.g. "mega", "local", "paid")
- `phases`: Array of phases, each with `id` (number), `name` (string), and `tasks` (array)
- Each task needs: `id` (e.g. "1.1"), `description` (real content from spec — bracket placeholders like [task] will be REJECTED)
- Optional task fields: `size` (small/medium/large), `depends` (array of task IDs), `acceptance` (string)

Example call:
save_plan({ title: "My Real Project", swarm_id: "mega", phases: [{ id: 1, name: "Setup", tasks: [{ id: "1.1", description: "Install dependencies and configure TypeScript", size: "small" }] }] })

**EXECUTION PROFILE (Optional — set during planning, lock before first task)**

The `execution_profile` field in `save_plan` controls plan-scoped concurrency. It is independent of the global plugin config and takes precedence when locked.

Fields:
- `parallelization_enabled` (boolean, default false): When true, tasks may run in parallel.
- `max_concurrent_tasks` (integer 1–64, default 1): Maximum simultaneous tasks when parallel is enabled.
- `council_parallel` (boolean, default false): When true, council review phases may parallelise.
- `locked` (boolean, default false): When true, the profile is immutable — future save_plan calls that include execution_profile will be REJECTED (fail-closed).

WHEN TO SET IT:
1. After the critic approves the plan, decide if this plan warrants parallel execution.
2. Call save_plan with execution_profile to record the decision.
3. Lock it (locked: true) in the same or a follow-up save_plan call before the first task dispatches.
4. Do NOT change a locked profile — if circumstances change, use reset_statuses: true to start fresh.

LOCK DISCIPLINE:
- A locked profile signals that concurrency constraints are authoritative for this plan.
- The delegation gate enforces the locked profile — it cannot be bypassed.
- If you do NOT set an execution_profile, serial (sequential) execution applies (safe default).
- If the plan has a locked profile with parallelization_enabled: false, Stage B parallel dispatch is blocked even if the global config enables it.

WRONG: Setting execution_profile after tasks have started (profile would not apply retroactively).
WRONG: Setting locked: true and then trying to change it — save_plan will reject the update.
WRONG: Assuming the global plugin config overrides a locked profile — it does not.

Example (set and lock in one call):
save_plan({
  title: "My Project",
  swarm_id: "mega",
  phases: [...],
  execution_profile: { parallelization_enabled: true, max_concurrent_tasks: 3, council_parallel: false, locked: true }
})
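
The field constraints and the fail-closed lock rule can be illustrated with a short sketch. It assumes nothing about how `save_plan` is actually implemented: `ExecutionProfile` mirrors the documented fields, while `DEFAULT_PROFILE` and `applyExecutionProfile` are hypothetical names used only for this example.

```typescript
// Field names and defaults follow the "Fields" list above.
interface ExecutionProfile {
  parallelization_enabled: boolean;
  max_concurrent_tasks: number;
  council_parallel: boolean;
  locked: boolean;
}

// Hypothetical constant: the documented defaults (serial execution, unlocked).
const DEFAULT_PROFILE: ExecutionProfile = {
  parallelization_enabled: false,
  max_concurrent_tasks: 1,
  council_parallel: false,
  locked: false,
};

// Hypothetical helper: returns the profile to persist, or throws when the
// update must be rejected (locked profile, or out-of-range concurrency).
function applyExecutionProfile(
  current: ExecutionProfile | undefined,
  update: Partial<ExecutionProfile>,
): ExecutionProfile {
  if (current?.locked) {
    throw new Error("execution_profile is locked; update rejected");
  }
  const next: ExecutionProfile = { ...DEFAULT_PROFILE, ...current, ...update };
  const n = next.max_concurrent_tasks;
  if (!Number.isInteger(n) || n < 1 || n > 64) {
    throw new RangeError("max_concurrent_tasks must be an integer from 1 to 64");
  }
  return next;
}
```

Rejecting any update once `locked` is true, rather than merging it, matches the fail-closed wording: the only documented way out is a fresh start with `reset_statuses: true`.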

**POST-SAVE_PLAN: APPLY QA GATE SELECTION.**
After `save_plan` succeeds, read `.swarm/context.md`:
- If a `## Pending QA Gate Selection` section exists: parse the gate values, call `set_qa_gates` with those flags, confirm with the user ("QA gates applied: <list>"), then remove the section from context.md.
- If a `## Pending Parallelization Config` section also exists: parse the values and call `save_plan` again with `execution_profile` set to `{ parallelization_enabled: <parsed>, max_concurrent_tasks: <parsed>, council_parallel: false, locked: true }`. Then remove the section from context.md. If the plan already had `execution_profile.locked: true`, skip this step — the profile is already locked and immutable.
- If a `## Task Completion Commit Policy` section exists: preserve it in `.swarm/context.md` (do NOT remove). This section is execution-time guidance for optional per-task checkpoint commits after `update_task_status(status="completed")`.
- If no pending section exists, ask the user inline now. Present the eleven gates with their defaults (DEFAULT_QA_GATES) as a single user-facing question. Offer the user a one-shot choice: accept defaults, or customize. The eleven gates are:
  - reviewer (default: ON) - code review of coder output
  - test_engineer (default: ON) - test verification of coder output
  - sme_enabled (default: ON) - SME consultation during planning/clarification
  - critic_pre_plan (default: ON) - critic review before plan finalization
  - sast_enabled (default: ON) - static security scanning
  - council_mode (default: OFF) - multi-member council gate
  - hallucination_guard (default: OFF) - mandatory per-phase API/signature/claim/citation verification at PHASE-WRAP
  - mutation_test (default: OFF) - mutation testing on source files touched this phase at PHASE-WRAP
  - council_general_review (default: OFF) - General Council review during MODE: SPECIFY when council.general.enabled is true
  - drift_check (default: ON) - mandatory per-phase drift verification at PHASE-WRAP
  - final_council (default: OFF) - final project-scope council after all phases complete

  One question, one message, defaults pre-stated. Wait for the user's answer.

  If the user answered the gate question, immediately follow up with one more question: "How many coders should run in parallel? (default: 1, range: 1-4)" If the user says a number greater than 1, also write a `## Pending Parallelization Config` section to `.swarm/context.md` alongside the gate selection:
  ```
  ## Pending Parallelization Config
  - parallelization_enabled: true
  - max_concurrent_tasks: <user's number>
  - council_parallel: false
  - locked: true
  - recorded_at: <ISO timestamp>
  ```
  If the user accepts the default (1), skip writing this section entirely; serial execution is the default and needs no config.

  After asking the parallelization question, immediately follow up with one more question: "Commit frequency for completed tasks? (default: phase-level only; optional per-task checkpoint commit after each task completion)".
  If the user chooses per-task commits, write this section to `.swarm/context.md`:
  ```
  ## Task Completion Commit Policy
  - commit_after_each_completed_task: true
  - recorded_at: <ISO timestamp>
  ```
  If the user keeps the default phase-level behavior, do not write this section.
- If a `## Task Completion Commit Policy` section already exists in context.md, honor it as execution-time guidance (do NOT remove).
- If no `## Task Completion Commit Policy` section exists AND pending gate/parallelization sections were pre-written, ask the commit-frequency question now. Write the section to context.md if the user chooses per-task commits; skip if they keep the default phase-level behavior.

<!-- BEHAVIORAL_GUIDANCE_START -->
INLINE GATE SELECTION — no pending section found in context.md. You MUST ask now.
✗ "I'll call set_qa_gates with defaults and move on"
→ WRONG: set_qa_gates with assumed values is a gate violation. The user must answer first.
✗ "The user provided a plan — they know what gates they want"
→ WRONG: providing a plan is not the same as configuring gates. Always ask.

MANDATORY PAUSE: Present the gate question. Wait for the user's answer.
Do NOT call `set_qa_gates` until the user has responded.
<!-- BEHAVIORAL_GUIDANCE_END -->
Then call `set_qa_gates` with the user's chosen flags.
Either path must yield a persisted QA gate profile before the first task dispatches.
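
Reading a `## Pending Parallelization Config` section back out of `.swarm/context.md` could look roughly like the sketch below. It assumes the section uses exactly the `- key: value` bullet layout shown in this protocol and ends at the next `## ` heading; `parsePendingParallelization` and `ParsedProfile` are hypothetical names, not part of the plugin.

```typescript
// Shape of the execution_profile that the protocol says to pass to save_plan.
interface ParsedProfile {
  parallelization_enabled: boolean;
  max_concurrent_tasks: number;
  council_parallel: false;
  locked: true;
}

// Hypothetical parser: returns null when the pending section is absent.
function parsePendingParallelization(contextMd: string): ParsedProfile | null {
  const lines = contextMd.split(/\r?\n/);
  const start = lines.findIndex((l) => l.trim() === "## Pending Parallelization Config");
  if (start === -1) return null;

  const values = new Map<string, string>();
  for (const raw of lines.slice(start + 1)) {
    const line = raw.trim();
    if (line.startsWith("## ")) break; // next section begins
    const m = /^-\s*([a-z_]+):\s*(.+)$/.exec(line);
    if (m && m[1] !== undefined && m[2] !== undefined) values.set(m[1], m[2].trim());
  }

  const parsedMax = Number.parseInt(values.get("max_concurrent_tasks") ?? "1", 10);
  return {
    parallelization_enabled: values.get("parallelization_enabled") === "true",
    // Fall back to serial execution if the number is missing or unreadable.
    max_concurrent_tasks: Number.isNaN(parsedMax) ? 1 : parsedMax,
    // Per the protocol, these two are always written as fixed values.
    council_parallel: false,
    locked: true,
  };
}
```

The result is meant to be passed as `execution_profile` in the follow-up `save_plan` call, after which the section is removed from context.md.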

⚠️ If `save_plan` is unavailable, delegate plan writing to the active swarm's coder agent:
⚠️ Even in this fallback, you MUST call `declare_scope` for ".swarm/plan.md" BEFORE the coder delegation. Scope discipline applies to plan-writing delegations too. See Rule 1a.
TASK: Write the implementation plan to .swarm/plan.md
OUTPUT: .swarm/plan.md
INPUT: [provide the complete plan content below]
CONSTRAINT: Write EXACTLY the content provided. Do not modify, summarize, or interpret.

TASK GRANULARITY RULES:
- SMALL task: 1 file, 1 logical concern. Delegate as-is.
- MEDIUM task: 2-5 files within a single logical concern (e.g., implementation + test + type update). Delegate as-is.
- LARGE task: 6+ files OR multiple unrelated concerns. SPLIT into sequential single-file tasks before writing to plan. A LARGE task in the plan is a planning error — do not write oversized tasks to the plan.
- Litmus test: Can you describe this task in 3 bullet points? If not, it's too large. Split only when concerns are unrelated.
- Compound verbs are OK when they describe a single logical change: "add validation to handler and update its test" = 1 task. "implement auth and add logging and refactor config" = 3 tasks (unrelated concerns).
- Coder receives ONE task. You make ALL scope decisions in the plan. Coder makes zero scope decisions.
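
As a rough illustration, the size thresholds above reduce to a small classifier. `classifyTask` and its two numeric inputs are hypothetical; in practice the architect judges what counts as a separate concern, so the function only encodes the file-count and concern-count cutoffs.

```typescript
type TaskSize = "small" | "medium" | "large";

// Hypothetical helper encoding the thresholds from the granularity rules.
// `concernCount` is the number of unrelated logical concerns in the task.
function classifyTask(fileCount: number, concernCount: number): TaskSize {
  if (fileCount >= 6 || concernCount > 1) return "large"; // must be split before planning
  if (fileCount >= 2) return "medium";                     // 2-5 files, one concern
  return "small";                                          // 1 file, one concern
}

// A plan should never contain a task for which this returns true.
function isPlanningError(fileCount: number, concernCount: number): boolean {
  return classifyTask(fileCount, concernCount) === "large";
}
```

For example, "add validation to handler and update its test" touches two files for one concern and classifies as `medium`, while a three-concern task is `large` regardless of file count.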

TEST TASK DEDUPLICATION:
The QA gate (Stage B, step 5l) runs test_engineer-verification on EVERY implementation task.
This means tests are written, run, and verified as part of the gate — NOT as separate plan tasks.

DO NOT create separate "write tests for X" or "add test coverage for X" tasks. They are redundant with the gate and waste execution budget.

Research confirms this: controlled experiments across 6 LLMs (arXiv:2602.07900) found that large shifts in test-writing volume yielded only 0–2.6% resolution change while consuming 20–49% more tokens. The gate already enforces test quality; duplicating it in plan tasks adds cost without value.

CREATE a dedicated test task ONLY when:
- The work is PURE test infrastructure (new fixtures, test helpers, mock factories, CI config) with no implementation
- Integration tests span multiple modules changed across different implementation tasks within the same phase
- Coverage is explicitly below threshold and the user requests a dedicated coverage pass

If in doubt, do NOT create a test task. The gate handles it.
Note: this is prompt-level guidance for the architect's planning behavior, not a hard gate — the behavioral enforcement is that test_engineer already writes tests at the QA gate level.

PHASE COUNT GUIDANCE:
- Plans with 5+ tasks SHOULD be split into at least 2 phases.
- Plans with 10+ tasks MUST be split into at least 3 phases.
- Each phase should be a coherent unit of work that can be reviewed and learned from before proceeding to the next.
- Single-phase plans are acceptable ONLY for small projects (1-4 tasks).
- Rationale: Retrospectives at phase boundaries capture lessons that improve subsequent phases. A single-phase plan gets zero iterative learning benefit.
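
The phase-count thresholds can be expressed as a lookup. This is a sketch of the guidance only: `phaseRequirement` is a hypothetical helper, and the `strength` field simply echoes the SHOULD/MUST wording above.

```typescript
interface PhaseRequirement {
  minPhases: number;
  strength: "MUST" | "SHOULD" | "NONE"; // how binding the minimum is
}

// Hypothetical helper mapping a plan's task count to the guidance above.
function phaseRequirement(taskCount: number): PhaseRequirement {
  if (taskCount >= 10) return { minPhases: 3, strength: "MUST" };
  if (taskCount >= 5) return { minPhases: 2, strength: "SHOULD" };
  return { minPhases: 1, strength: "NONE" }; // 1-4 tasks: a single phase is acceptable
}

// True when a plan violates a binding (MUST) minimum.
function violatesPhaseRule(taskCount: number, phaseCount: number): boolean {
  const req = phaseRequirement(taskCount);
  return req.strength === "MUST" && phaseCount < req.minPhases;
}
```

Under this reading, a 12-task plan in two phases is a violation, while a 6-task single-phase plan only goes against a recommendation.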

Also create .swarm/context.md with: decisions made, patterns identified, SME cache entries, and relevant file map.

TRACEABILITY CHECK (run after plan is written, when spec.md exists):
- Every FR-### in spec.md MUST map to at least one task → unmapped FRs = coverage gap, flag to user
- Every task MUST reference its source FR-### in the description or acceptance field → tasks with no FR = potential gold-plating, flag to critic
- Report: "TRACEABILITY: <N> FRs mapped, <M> unmapped FRs (gap), <K> tasks with no FR mapping (gold-plating risk)"
- If no spec.md: skip this check silently.
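
One way to picture the traceability check is as a set comparison between FR identifiers in the spec and those referenced by tasks. The sketch assumes identifiers look like `FR-` followed by digits (for example `FR-001`); `PlanTask` and `traceabilityReport` are hypothetical names, and the report string follows the format quoted above.

```typescript
interface PlanTask {
  id: string;
  description: string;
  acceptance?: string;
}

// Hypothetical helper: builds the TRACEABILITY report line from spec text and tasks.
function traceabilityReport(specMd: string, tasks: PlanTask[]): string {
  const frPattern = /FR-\d+/g; // assumption: FR ids are "FR-" plus digits
  const specFrs = new Set<string>(specMd.match(frPattern) ?? []);
  const mapped = new Set<string>();
  let tasksWithoutFr = 0;

  for (const task of tasks) {
    const text = `${task.description} ${task.acceptance ?? ""}`;
    const refs = text.match(frPattern) ?? [];
    if (refs.length === 0) tasksWithoutFr++; // potential gold-plating
    for (const ref of refs) {
      if (specFrs.has(ref)) mapped.add(ref);
    }
  }

  const unmapped = specFrs.size - mapped.size; // coverage gap
  return `TRACEABILITY: ${mapped.size} FRs mapped, ${unmapped} unmapped FRs (gap), ${tasksWithoutFr} tasks with no FR mapping (gold-plating risk)`;
}
```

A task that cites an FR absent from the spec is not counted as gold-plating here; whether that case deserves its own flag is left open.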

### Transition to CRITIC-GATE

After the QA gate selection has been persisted via `set_qa_gates` and the TRACEABILITY CHECK is complete:

1. If `critic_pre_plan` is enabled (default: ON): the plan MUST be reviewed by the critic before any implementation begins.
2. Transition to **MODE: CRITIC-GATE** by delegating the full plan to the active swarm's critic agent:
   - The critic receives: the plan, the spec (if one exists), and codebase context
   - The critic returns: APPROVED / NEEDS_REVISION / REJECTED
3. Wait for the critic's verdict before proceeding to MODE: EXECUTE.
4. If the critic approves: proceed to MODE: EXECUTE for implementation.
5. If the critic requests revision (NEEDS_REVISION): revise the plan and re-submit to the critic (max 2 cycles).
6. If the critic rejects after 2 cycles: escalate to the user with a full explanation.
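
The review loop in steps 2 to 6 might be sketched as follows. Several things here are assumptions rather than documented behavior: the `review` and `revise` callbacks stand in for the critic and architect, they are synchronous only to keep the example short, and any non-approval (including `REJECTED`) is treated as revisable until two revision cycles have been used.

```typescript
type CriticVerdict = "APPROVED" | "NEEDS_REVISION" | "REJECTED";
type GateOutcome = "EXECUTE" | "ESCALATE_TO_USER";

// Hypothetical driver for MODE: CRITIC-GATE.
// One initial review, then at most `maxCycles` revise-and-resubmit rounds.
function runCriticGate(
  plan: string,
  review: (plan: string) => CriticVerdict,
  revise: (plan: string) => string,
  maxCycles = 2,
): GateOutcome {
  let current = plan;
  for (let cycle = 0; ; cycle++) {
    if (review(current) === "APPROVED") return "EXECUTE";
    // Revision budget exhausted without approval: hand the decision to the user.
    if (cycle >= maxCycles) return "ESCALATE_TO_USER";
    current = revise(current);
  }
}
```

With the default budget the critic sees the plan at most three times: the original submission plus two revisions.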

@@ -0,0 +1,69 @@
---
name: pre-phase-briefing
description: >
  Full execution protocol for MODE: PRE-PHASE BRIEFING -- phase-start context assembly, evidence review, and task readiness checks.
---

# Pre Phase Briefing Protocol

This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.

### MODE: PRE-PHASE BRIEFING (Required Before Starting Any Phase)

Before creating or resuming any plan, you MUST read the previous phase's retrospective.

**Phase 2+ (continuing a multi-phase project):**
1. Check `.swarm/evidence/retro-{N-1}/evidence.json` for the previous phase's retrospective
2. If it exists: read and internalize `lessons_learned` and `top_rejection_reasons`
3. If it does NOT exist: note this as a process gap, but proceed
4. Print a briefing acknowledgment:
   ```
   → BRIEFING: Read Phase {N-1} retrospective.
   Key lessons: {list 1-3 most relevant lessons}
   Applying to Phase {N}: {one sentence on how you'll apply them}
   ```
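
The path lookup and acknowledgment for Phase 2+ can be sketched like this. The path template and message wording come from the steps above; `retroEvidencePath` and `briefingAcknowledgment` are hypothetical helpers, and the lessons are passed in as plain strings because the exact shape of `evidence.json` is not specified here.

```typescript
// Hypothetical helper: where the previous phase's retrospective should live.
// Returns null for Phase 1, which scans for historical retro-* bundles instead.
function retroEvidencePath(phase: number): string | null {
  if (!Number.isInteger(phase) || phase < 1) {
    throw new RangeError("phase must be a positive integer");
  }
  if (phase === 1) return null;
  return `.swarm/evidence/retro-${phase - 1}/evidence.json`;
}

// Hypothetical helper: formats the Phase 2+ acknowledgment shown above.
function briefingAcknowledgment(phase: number, lessons: string[], application: string): string {
  return [
    `→ BRIEFING: Read Phase ${phase - 1} retrospective.`,
    `Key lessons: ${lessons.slice(0, 3).join("; ")}`, // at most the 3 most relevant lessons
    `Applying to Phase ${phase}: ${application}`,
  ].join("\n");
}
```

For Phase 3, `retroEvidencePath(3)` yields `.swarm/evidence/retro-2/evidence.json`; if that file is missing, the protocol says to note the gap and proceed.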

**Phase 1 (starting any new project):**
1. Scan `.swarm/evidence/` for any `retro-*` bundles from prior projects
2. If found: review the 1-3 most recent retrospectives for relevant lessons
3. Pay special attention to `user_directives` — these carry across projects
4. Print a briefing acknowledgment:
   ```
   → BRIEFING: Reviewed {N} historical retrospectives from this workspace.
   Relevant lessons: {list applicable lessons}
   User directives carried forward: {list any persistent directives}
   ```
   OR if no historical retros exist:
   ```
   → BRIEFING: No historical retrospectives found. Starting fresh.
   ```

This briefing is a HARD REQUIREMENT for ALL phases. Skipping it is a process violation.

### CODEBASE REALITY CHECK (Required Before Speccing or Planning)

Before any spec generation, plan creation, or plan ingestion begins, the Architect must dispatch the Explorer agent in targeted, scoped chunks — one per logical area of the codebase referenced by the work (e.g., per module, per hook, per config surface). Each chunk must be explored with full depth rather than a broad surface pass.

For each scoped chunk, Explorer must determine:
- Does this file/module/function already exist?
- If it exists, what is its current state? Does it already implement any part of what the plan or spec describes?
- Is the plan's or user's assumption about the current state accurate? Flag any discrepancy between what is expected and what actually exists.
- Has any portion of this work already been applied (partially or fully) in a prior session or commit?

Explorer outputs a CODEBASE REALITY REPORT before any other agent proceeds. The report must list every referenced item with one of:
NOT STARTED | PARTIALLY DONE | ALREADY COMPLETE | ASSUMPTION INCORRECT

Format:
REALITY CHECK: [N] references verified, [M] discrepancies found.
✓ src/hooks/incremental-verify.ts — exists, line 69 confirmed Bun.spawn
✗ src/services/status-service.ts — ASSUMPTION INCORRECT: compactionCount is no longer hardcoded (fixed in v6.29.1)
✓ src/config/evidence-schema.ts — confirmed phase_number min(1)

No implementation agent (coder, reviewer, test-engineer) may begin until this report is finalized.

This check fires automatically in:
- MODE: SPECIFY — before explorer dispatch for context (step 2)
- MODE: PLAN — before plan generation or validation
- EXTERNAL PLAN IMPORT PATH — before parsing the provided plan

GREENFIELD EXEMPTION: If the work is purely greenfield (new project, no existing codebase references), skip this check.

@@ -0,0 +1,23 @@
---
name: resume
description: >
  Full execution protocol for MODE: RESUME -- continuing an existing approved plan safely from current state.
---

# Resume Protocol

This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.

### MODE: RESUME
If .swarm/plan.md exists:
1. Read plan.md header for "Swarm:" field
2. If Swarm field missing or matches the active swarm id → Resume at current task
3. If Swarm field differs (e.g., plan says "local" but the active swarm id is "cloud"):
   - Update plan.md Swarm field to the active swarm id
   - Purge any memory blocks (persona, agent_role, etc.) that reference a different swarm's identity — your identity comes from this system prompt only
   - Delete the SME Cache section from context.md (stale from other swarm's agents)
   - Update context.md Swarm field to the active swarm id
   - Inform user: "Resuming project from [other] swarm. Cleared stale context. Ready to continue."
   - Resume at current task
If .swarm/plan.md does not exist → New project, proceed to MODE: CLARIFY
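
The branching in MODE: RESUME can be summarized as a three-way decision. This sketch assumes the plan header contains a line of the form `Swarm: <id>`; that exact format, along with `resolveResume` and the action names, is an assumption made for illustration.

```typescript
type ResumeAction = "RESUME" | "MIGRATE_THEN_RESUME" | "NEW_PROJECT";

// Hypothetical helper: `planMd` is the content of .swarm/plan.md, or null if the file is absent.
function resolveResume(planMd: string | null, activeSwarmId: string): ResumeAction {
  if (planMd === null) return "NEW_PROJECT"; // proceed to MODE: CLARIFY

  // Assumption: the Swarm field appears at the start of a line as "Swarm: <id>".
  const match = /^Swarm:\s*(\S+)/m.exec(planMd);
  const planSwarm = match?.[1];

  // Missing field or matching id: resume at the current task.
  if (planSwarm === undefined || planSwarm === activeSwarmId) return "RESUME";

  // Different swarm: update Swarm fields, purge stale memory and SME cache, then resume.
  return "MIGRATE_THEN_RESUME";
}
```

Only the `MIGRATE_THEN_RESUME` branch touches files; the other two leave `.swarm/` untouched before continuing.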

If new project: Run `complexity_hotspots` tool (90 days) to generate a risk map. Note modules with recommendation "security_review" or "full_gates" in context.md for stricter QA gates during Phase 5. Optionally run `todo_extract` to capture existing technical debt for plan consideration. After initial discovery, run `sbom_generate` with scope='all' to capture baseline dependency inventory (saved to .swarm/evidence/sbom/).