npm - opencode-swarm - Versions diffs - 7.58.0 → 7.59.0 - Mend

opencode-swarm 7.58.0 → 7.59.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (46)

package/.opencode/skills/brainstorm/SKILL.md +142 -0
package/.opencode/skills/clarify/SKILL.md +103 -0
package/.opencode/skills/clarify-spec/SKILL.md +58 -0
package/.opencode/skills/codebase-review-swarm/INSTALL.md +75 -0
package/.opencode/skills/codebase-review-swarm/README.md +44 -0
package/.opencode/skills/codebase-review-swarm/SKILL.md +65 -0
package/.opencode/skills/codebase-review-swarm/agents/openai.yaml +6 -0
package/.opencode/skills/codebase-review-swarm/assets/jsonl-schemas.md +239 -0
package/.opencode/skills/codebase-review-swarm/assets/review-report-template.md +244 -0
package/.opencode/skills/codebase-review-swarm/references/compatibility-and-research-notes.md +25 -0
package/.opencode/skills/codebase-review-swarm/references/full-v7-source-prompt.md +2373 -0
package/.opencode/skills/codebase-review-swarm/references/review-protocol-v8.2.md +310 -0
package/.opencode/skills/codebase-review-swarm/scripts/init-review-run.py +134 -0
package/.opencode/skills/codebase-review-swarm/scripts/validate-skill-package.py +62 -0
package/.opencode/skills/consult/SKILL.md +16 -0
package/.opencode/skills/council/SKILL.md +147 -0
package/.opencode/skills/critic-gate/SKILL.md +59 -0
package/.opencode/skills/deep-dive/SKILL.md +142 -0
package/.opencode/skills/design-docs/SKILL.md +81 -0
package/.opencode/skills/discover/SKILL.md +20 -0
package/.opencode/skills/execute/SKILL.md +191 -0
package/.opencode/skills/issue-ingest/SKILL.md +64 -0
package/.opencode/skills/phase-wrap/SKILL.md +123 -0
package/.opencode/skills/plan/SKILL.md +293 -0
package/.opencode/skills/pre-phase-briefing/SKILL.md +69 -0
package/.opencode/skills/resume/SKILL.md +23 -0
package/.opencode/skills/specify/SKILL.md +175 -0
package/.opencode/skills/swarm-pr-feedback/SKILL.md +192 -0
package/.opencode/skills/swarm-pr-review/SKILL.md +884 -0
package/dist/agents/agent-output-schema.d.ts +1 -1
package/dist/cli/index.js +1351 -1159
package/dist/commands/command-dispatch.d.ts +1 -0
package/dist/commands/index.d.ts +1 -0
package/dist/commands/registry.d.ts +15 -14
package/dist/config/bundled-skills.d.ts +25 -0
package/dist/config/constants.d.ts +1 -1
package/dist/config/schema.d.ts +42 -0
package/dist/index.js +3517 -2673
package/dist/memory/schema.d.ts +1 -1
package/dist/tools/lean-turbo-run-phase.d.ts +2 -1
package/dist/turbo/lean/index.d.ts +4 -1
package/dist/turbo/lean/merge-back.d.ts +180 -0
package/dist/turbo/lean/runner.d.ts +47 -1
package/dist/turbo/lean/state.d.ts +10 -0
package/dist/turbo/lean/worktree.d.ts +194 -0
package/package.json +20 -1

package/.opencode/skills/council/SKILL.md ADDED

@@ -0,0 +1,147 @@
+---
+name: council
+description: >
+  Full execution protocol for MODE: COUNCIL -- General Council research,
+  parallel member dispatch, disagreement handling, and synthesis.
+---
+# Council Protocol
+This protocol is loaded on demand by the architect stub in `src/agents/architect.ts`.
+The architect prompt keeps only activation, action, and hard safety constraints;
+the full execution details live here.
+### MODE: COUNCIL
+Activates when: user invokes `/swarm council <question>` (optionally with
+`--preset <name>` and/or `--spec-review`).
+Purpose: convene a fixed three-agent multi-model General Council
+(generalist / skeptic / domain expert) for an advisory deliberation. The
+architect runs a curated web research pass upfront, dispatches the three agents
+in parallel with the gathered RESEARCH CONTEXT, routes any disagreements back
+for one targeted reconciliation round, and synthesizes the final user-facing
+answer directly.
+This mode is ADVISORY. It does not block any other workflow and does not modify
+code, plans, or specs. The output is for the user (general mode) or for the spec
+being drafted in MODE: SPECIFY (spec_review mode, gated by
+`council_general_review`).
+#### Pre-flight (always run first)
+1. Read `council.general` from the resolved opencode-swarm config. Resolution
+   is global first (`~/.config/opencode/opencode-swarm.json`), then project
+   override (`.opencode/opencode-swarm.json`). A global config is valid and must
+   be used when no project override is present; do not fail after checking only
+   the project file. If `council.general.enabled` is not true OR no search API
+   key is configured (neither `council.general.searchApiKey` nor the
+   corresponding env var `TAVILY_API_KEY` / `BRAVE_SEARCH_API_KEY`),
+   surface to the user: "General Council is not enabled. Set
+   council.general.enabled: true and configure a search API key in
+   global ~/.config/opencode/opencode-swarm.json or project
+   .opencode/opencode-swarm.json." Then STOP.
+#### Research Phase (always run before dispatching council agents)
+2. Formulate 1-3 targeted `web_search` queries that best capture the
+   information needed to answer the question. Prefer specific, keyword-focused
+   queries over broad ones.
+   Hard grounding rules:
+   - Do not append a model training-cutoff year to searches.
+   - Use `web_search` with its default `freshness: "auto"` behavior for
+     current queries unless the user explicitly asked for a historical window.
+   - Preserve each `web_search` result's normalized `query`, `temporalIntent`,
+     `freshness`, and `removedStaleYears` metadata in RESEARCH CONTEXT audit
+     notes.
+   - For current, latest, today, now, state-of-the-art, pricing, release-status,
+     legal/regulatory, financial, security, or otherwise time-sensitive
+     questions, the Research Phase must produce usable current sources before
+     council dispatch.
+   - If `web_search` returns no results or an error for a time-sensitive
+     question, stop and surface the failed search result to the user instead of
+     dispatching ungrounded members.
+   - For stable/non-current questions, if `web_search` returns no results or an
+     error, note this in the dispatch message and proceed without a context
+     block. In that degraded mode, members may use stable background knowledge
+     only and must not make current-fact claims.
+   Compile all successful results into a RESEARCH CONTEXT block in this format:
+```text
+RESEARCH CONTEXT
+================
+[1] <title> - <url>
+    <snippet>
+    query: <normalized query>; temporalIntent: <current|historical|unspecified>; freshness: <day|week|month|year|none>; removedStaleYears: <comma-separated years or none>
+[2] <title> - <url>
+    <snippet>
+...
+```
+#### Round 1 - Parallel Independent Analysis
+3. Dispatch `the active swarm's council_generalist agent`,
+   `the active swarm's council_skeptic agent`, and
+   `the active swarm's council_domain_expert agent` in PARALLEL -- one message
+   per agent, then STOP and wait for all responses. Each dispatch message must
+   include:
+   - The question
+   - Round number: 1
+   - The CURRENT DATE in ISO `YYYY-MM-DD` form
+   - The full RESEARCH CONTEXT block from step 2
+   - Instruction: "Cite from the RESEARCH CONTEXT for external evidence. Your
+     memberId and role are hardcoded in your system prompt."
+Do NOT share other agents' responses at this stage.
+4. Collect all three JSON responses. The `round1Responses` array will contain
+   entries with `memberId` of `council_generalist`, `council_skeptic`, and
+   `council_domain_expert` and `role` of `generalist`, `skeptic`, and
+   `domain_expert` respectively. These come from the agents' JSON output; no
+   manual construction is needed.
+#### Synthesis and Deliberation (when council.general.deliberate is true; default true)
+5. Call `convene_general_council` with mode set from the command (`general` or
+   `spec_review`), `question`, and the collected `round1Responses` only (omit
+   `round2Responses`). Inspect the returned `disagreementsCount`.
+6. If `disagreementsCount > 0`:
+   a. For each disagreement in the tool's response, identify the disputing
+      agents (the agents listed in the disagreement's positions, identified by
+      memberId: `council_generalist`, `council_skeptic`, or
+      `council_domain_expert`).
+   b. Re-delegate ONLY to the disputing agents -- one message per agent --
+      passing: their Round 1 response, the disagreement topic, the opposing
+      position(s), round number 2, and the same RESEARCH CONTEXT block.
+   c. Collect the Round 2 responses.
+   d. Call `convene_general_council` AGAIN with both `round1Responses` AND
+      `round2Responses` populated.
+#### Output
+7. Present the final answer to the user from the `synthesis` returned by
+   `convene_general_council`. Apply these output rules directly:
+   - LEAD WITH CONSENSUS: open with the strongest consensus position.
+     Confidence-weighted: higher-confidence claims from multiple agents rank
+     first, but evidence quality outranks raw confidence. Never elevate a
+     single confident voice over a well-evidenced contrary majority.
+   - ACKNOWLEDGE DISAGREEMENT HONESTLY: for each persisting disagreement, write
+     "experts disagree on X because..." and present the strongest version of
+     each side. Do not pretend disagreements are resolved. Do not silently pick
+     a winner.
+   - CITE THE STRONGEST SOURCES: link key claims with `[title](url)` format from
+     the source list in the synthesis. Pick the most reputable source per claim;
+     do not cite duplicates.
+   - BE CONCISE: a few short paragraphs plus a bulleted summary. Expand only
+     when the question genuinely requires it.
+   - HARD CONSTRAINTS: You MUST NOT invent claims not present in the council's
+     responses. You MUST NOT add new web research. You MUST NOT favor a position
+     based on confidence alone.
+Preface the answer with one line listing the participating models (reviewer
+model as generalist, critic model as skeptic, SME model as domain expert). Do
+NOT present raw per-member JSON.
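
The pre-flight rule in the Council protocol above (global config first, then project override, then the enabled flag and search-key checks) can be sketched roughly as follows. This is an illustrative sketch, not the package's implementation: the function names are hypothetical, the recursive merge of the project file over the global file is an assumption about what "override" means, and either environment variable is accepted regardless of which search provider is configured.

```python
import json
from pathlib import Path

# Paths and env var names are taken from the protocol text.
GLOBAL_CONFIG = Path.home() / ".config" / "opencode" / "opencode-swarm.json"
PROJECT_CONFIG = Path(".opencode") / "opencode-swarm.json"
SEARCH_KEY_ENV_VARS = ("TAVILY_API_KEY", "BRAVE_SEARCH_API_KEY")


def load_json(path):
    """Return the parsed JSON object at path, or {} if the file is absent."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def deep_merge(base, override):
    """Recursively overlay override on base (assumed merge semantics)."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def council_preflight(global_cfg, project_cfg, env):
    """Return (ok, reason) for the General Council pre-flight check."""
    resolved = deep_merge(global_cfg, project_cfg)
    general = resolved.get("council", {}).get("general", {})
    if general.get("enabled") is not True:
        return False, "council.general.enabled is not true"
    has_key = bool(general.get("searchApiKey")) or any(
        env.get(name) for name in SEARCH_KEY_ENV_VARS
    )
    if not has_key:
        return False, "no search API key configured"
    return True, "ok"
```

A caller would pass `load_json(GLOBAL_CONFIG)`, `load_json(PROJECT_CONFIG)`, and `os.environ`. A missing project file then leaves the global config in effect, which matches the requirement not to fail after checking only the project file.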

package/.opencode/skills/critic-gate/SKILL.md ADDED

@@ -0,0 +1,59 @@
+---
+name: critic-gate
+description: >
+  Full execution protocol for MODE: CRITIC-GATE -- plan critic review, revision loops, and hard stop before execution.
+---
+# Critic Gate Protocol
+This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
+### MODE: CRITIC-GATE
+Delegate plan to the active swarm's critic agent for review BEFORE any implementation begins.
+- Send the full plan.md content and codebase context summary
+- **APPROVED** → Proceed to MODE: EXECUTE
+- **NEEDS_REVISION** → Revise the plan based on critic feedback, then resubmit (max 2 cycles)
+- **REJECTED** → Inform the user of fundamental issues and ask for guidance before proceeding
+⛔ HARD STOP — Print this checklist before advancing to MODE: EXECUTE:
+  [ ] the active swarm's critic agent returned a verdict
+  [ ] APPROVED → proceed to MODE: EXECUTE
+  [ ] NEEDS_REVISION → revised and resubmitted (attempt N of max 2)
+  [ ] REJECTED (any cycle) → informed user. STOP.
+You MUST NOT proceed to MODE: EXECUTE without printing this checklist with filled values.
+CRITIC-GATE TRIGGER: Run ONCE when you first write the complete .swarm/plan.md.
+Do NOT re-run CRITIC-GATE before every project phase.
+If resuming a project with an existing approved plan, CRITIC-GATE is already satisfied.
+6j. SPEC-GATE (Execute BEFORE any save_plan call):
+- The save_plan tool will REJECT if .swarm/spec.md does not exist (enforced at the tool level; bypassable via the SWARM_SKIP_SPEC_GATE env var).
+- Before calling save_plan, verify spec.md is present using lint_spec.
+- If spec.md is absent: do NOT call save_plan. Use /swarm specify to create a spec first, or inform the user.
+- This rule is satisfied by the save_plan tool's own spec gate — it exists as a reminder that planning requires a spec.
+6k. SPEC-STALENESS GUARD:
+- If _specStale or .swarm/spec-staleness.json exists, the Architect MUST stop
+  and SURFACE THE DRIFT TO THE USER. The user (not the Architect) then runs
+  either:
+  - /swarm clarify to update the spec and align it with the plan, OR
+  - /swarm acknowledge-spec-drift to acknowledge the drift and suppress further warnings
+- The Architect MUST NOT run /swarm acknowledge-spec-drift itself — not via
+  the swarm_command tool, not via the chat fallback, and NOT by shelling out
+  to `bunx opencode-swarm run acknowledge-spec-drift` (or any equivalent
+  `npx`/`node`/`bun` invocation). Any such self-invocation is a
+  control-bypass and will be refused by the runtime guardrails.
+- Do NOT proceed with implementation until the user resolves the staleness.
+- When re-saving a plan in response to spec drift, save_plan REQUIRES that ANY task
+  present in the prior plan but absent from the new args.phases be enumerated
+  in removed_task_ids with a removal_reason. save_plan will reject the call
+  otherwise (PLAN_TASK_REMOVAL_NOT_ACKNOWLEDGED). Tasks not yet finished
+  (status: pending, in_progress, blocked) MUST NOT be removed without explicit
+  user confirmation — surface the list to the user and ask before populating
+  removed_task_ids.
+- While .swarm/spec-staleness.json exists, the runtime STRUCTURALLY BLOCKS the
+  following tools (SPEC_DRIFT_BLOCKED_TOOLS): save_plan, update_task_status,
+  phase_complete, lean_turbo_run_phase, lean_turbo_acquire_locks. If a call
+  returns SPEC_DRIFT_BLOCK, do NOT retry; surface the drift to the user and
+  WAIT for them to run /swarm clarify or /swarm acknowledge-spec-drift.
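
The task-removal rule in the spec-staleness guard above can be illustrated with a small check. This is a hedged sketch only: `check_task_removals` and its argument shapes are hypothetical, the status value `completed` is an assumed name for a finished task, and the real `save_plan` validation may differ.

```python
# Statuses the protocol lists as "not yet finished".
UNFINISHED_STATUSES = {"pending", "in_progress", "blocked"}


def check_task_removals(prior_tasks, new_task_ids, removed_task_ids):
    """Compare a prior plan with a new one.

    prior_tasks: dict mapping task ID -> status in the prior plan.
    new_task_ids: set of task IDs present in the new phases.
    removed_task_ids: IDs the caller explicitly acknowledged as removed.

    Returns (unacknowledged, needs_user_confirmation):
    - unacknowledged: dropped tasks missing from removed_task_ids
      (the case the protocol says save_plan rejects).
    - needs_user_confirmation: dropped tasks that were not finished.
    """
    dropped = [tid for tid in prior_tasks if tid not in new_task_ids]
    unacknowledged = [tid for tid in dropped if tid not in removed_task_ids]
    needs_user_confirmation = [
        tid for tid in dropped if prior_tasks[tid] in UNFINISHED_STATUSES
    ]
    return unacknowledged, needs_user_confirmation
```

For example, dropping a `pending` task without listing it yields it in both result lists, so the architect would first ask the user and only then populate `removed_task_ids`.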

package/.opencode/skills/deep-dive/SKILL.md ADDED

@@ -0,0 +1,142 @@
+---
+name: deep-dive
+description: >
+  Full execution protocol for MODE: DEEP_DIVE — read-only codebase audit with
+  parallel explorer waves, 2 independent reviewers, and sequential critic
+  challenge for HIGH/CRITICAL findings. Loaded on demand by the architect when
+  the deep-dive command emits a [MODE: DEEP_DIVE ...] signal.
+---
+# Deep Dive Audit Protocol
+Read-only deep audit of a specified codebase scope using parallel explorer waves, always 2 parallel reviewers, and sequential critic challenge. This mode does NOT mutate source code, does NOT delegate to coder, and does NOT call declare_scope.
+### MODE: DEEP_DIVE
+## Step 0 — Parse Header
+Parse the MODE: DEEP_DIVE header to extract:
+- `scope`: the codebase area to audit (e.g., "auth", "payment flow", "src/hooks/")
+- `profile`: one of standard | security | ux | architecture | full (default: standard)
+- `max_explorers`: integer 1..8 (default: 6, or 8 for full profile)
+- `output`: markdown | json (default: markdown)
+- `update_main`: boolean (default: true) — whether to fetch/ff-only main before starting
+- `allow_dirty`: boolean (default: false) — whether to proceed with uncommitted changes
+If the header is malformed or missing required fields, report the error and stop.
+## Step 1 — Repo Readiness
+1. Check git working tree status. If dirty and `allow_dirty` is false, warn the user and ask whether to proceed. Do NOT proceed automatically.
+2. If `update_main` is true and tree is clean: check current branch. If not on `main`, report current branch to user and ASK FOR CONFIRMATION before switching. Only after explicit user approval: `git fetch origin main && git checkout main && git merge --ff-only origin/main`. If ff-only fails, warn the user and ask before proceeding.
+3. Record the current HEAD commit hash for the report.
+## Step 2 — Scope Resolution
+Use the following tools to map the audit scope:
+1. `repo_map` with action "build" to establish the code graph
+2. `repo_map` with action "localization" for the scope target
+3. `symbols` and `batch_symbols` on key files identified by localization
+4. `imports` to trace dependency boundaries
+5. `doc_scan` if documentation coverage is relevant
+6. `knowledge_recall` with query matching the scope domain
+Produce a SCOPE MAP: list of files, modules, and interfaces within the audit boundary. Cap at 50 files total.
+## Step 3 — Explorer Missions (Parallel Waves)
+Dispatch explorer waves using parallel Task calls. Each wave contains up to `max_explorers` missions.
+**File caps per mission:**
+- 8 files maximum per mission
+- ~3500 total lines across all files in a mission
+- Group files by import proximity (files that import each other go in the same mission)
+**Profile-based lane selection — each profile activates specific lanes:**
+| Lane | Template | standard | security | ux | architecture | full |
+|------|----------|----------|----------|----|-------------|------|
+| SCOPE_MAP | Map structure, exports, boundaries | ✓ | ✓ | ✓ | ✓ | ✓ |
+| WIRING_DATAFLOW | Trace data flow, API contracts, state propagation | ✓ | ✓ | | ✓ | ✓ |
+| RUNTIME_BEHAVIOR | Error handling, edge cases, lifecycle, async patterns | ✓ | | | ✓ | ✓ |
+| UX_FLOW | User-facing behavior, accessibility, responsiveness | | | ✓ | | ✓ |
+| SECURITY_TRUST | Auth boundaries, input validation, trust transitions | | ✓ | | | ✓ |
+| TEST_COVERAGE | Coverage gaps, flaky tests, missing assertions | ✓ | | | | ✓ |
+| PERFORMANCE_RELIABILITY | Resource leaks, N+1 queries, race conditions | | | | ✓ | ✓ |
+| DOCS_CONFIG_DEPLOYMENT | Config consistency, docs accuracy, deployment drift | | | | | ✓ |
+Each explorer mission receives:
+- Lane template name and description
+- Assigned files (8 max, grouped by import proximity)
+- The scope map context from Step 2
+- Instruction: "You are performing a [LANE] audit. Report findings as candidate observations with severity (INFO/LOW/MEDIUM/HIGH/CRITICAL), location, and evidence."
+Explorer missions are dispatched in parallel waves. Wait for ALL missions in a wave to complete before dispatching the next wave.
+Explorers generate CANDIDATE FINDINGS only — they do NOT make verdicts. All findings are unverified until Step 5.
+## Step 4 — Normalize Candidates
+1. Collect all candidate findings from all explorer missions.
+2. Deduplicate: merge findings that reference the same location and issue.
+3. Assign DD-C001 through DD-CNNN identifiers to unique findings.
+4. Cap at 10 findings per shard (see Step 5 for sharding).
+5. Sort by severity (CRITICAL → HIGH → MEDIUM → LOW → INFO).
+## Step 5 — Always 2 Parallel Reviewers
+Split the normalized candidates from Step 4 into 2 shards of ≤10 candidates each. Dispatch 2 parallel `the active swarm's reviewer agent` calls.
+Each reviewer receives:
+- Their shard of candidates (up to 10)
+- The scope map context
+- The original scope description
+- Instruction: "Verify or reject each candidate finding. For each: verdict (VERIFIED / REJECTED / NEEDS_MORE_EVIDENCE), confidence (0-1), and brief reasoning."
+Reviewers MUST NOT suggest fixes — they verify findings only.
+## Step 5b — Reviewer Merge/Dedup
+After both reviewers return, perform a lightweight sync pass:
+1. Cross-reference findings between reviewers — flag correlations
+2. Deduplicate any findings both reviewers verified independently
+3. For NEEDS_MORE_EVIDENCE findings: if the other reviewer verified a related finding, merge
+4. Produce a unified findings list with verified/rejected status
+## Step 6 — Critic Challenge (HIGH/CRITICAL only)
+For verified findings rated HIGH or CRITICAL, dispatch sequential critic passes:
+**Pass 1 — False-positive / root-cause challenge:**
+- `the active swarm's critic agent` receives each HIGH/CRITICAL finding
+- Challenge: "Is this a false positive? Is the root cause correctly identified? Provide verdict: SURVIVES / DOWNGRADE / REJECT"
+- Only findings that SURVIVE proceed to Pass 2
+**Pass 2 — Impact / severity challenge:**
+- `the active swarm's critic agent` receives surviving findings
+- Challenge: "Is the severity correctly rated? Could this be lower impact than claimed? Provide verdict: SURVIVES / DOWNGRADE / REJECT"
+- Final severity is the critic's assessed severity
+CRITICAL: Do NOT challenge MEDIUM/LOW/INFO findings. Only HIGH and CRITICAL go through critic review.
+## Step 7 — Final Report
+Assemble and present the audit report:
+1. **Wiring Map**: Visual summary of the scope's module structure and data flow
+2. **Functionality Assessment**: High-level summary of what the scope does and how well
+3. **Verified Findings Table**: DD-ID, severity, location, description, evidence
+4. **Rejected Candidates**: Brief list with rejection reasons
+5. **Enhancements**: Non-blocking improvement suggestions
+6. **Recommended Implementation Phases**: If findings suggest follow-up work, outline phases
+7. **JSON Block** (when output=json): Structured machine-readable findings
+## Important Constraints
+- Do NOT mutate source code under any circumstances
+- Do NOT delegate to coder
+- Do NOT call declare_scope
+- Do NOT create or modify any files outside .swarm/
+- No final finding may appear in the report without reviewer verification
+- Explorers generate candidate findings only — reviewers verify or reject
+- Critics challenge only HIGH/CRITICAL findings — do NOT waste cycles on lower severity
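
Steps 4 and 5 of the deep-dive protocol above (deduplicate, assign IDs, sort by severity, cap, split into two reviewer shards) might look roughly like this. The sketch rests on several assumptions not stated in the skill file: candidates are plain dicts with `location`, `issue`, and `severity` keys, the first duplicate wins rather than being merged, the cap is applied after sorting so the lowest-severity candidates are the ones dropped, and shards are filled round-robin.

```python
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"]
SHARD_COUNT = 2   # "always 2 parallel reviewers"
SHARD_CAP = 10    # at most 10 candidates per shard


def normalize_candidates(candidates):
    """Deduplicate, number, sort, and cap explorer candidates."""
    unique = {}
    for candidate in candidates:
        key = (candidate["location"], candidate["issue"])
        unique.setdefault(key, candidate)  # assumption: first duplicate wins
    numbered = [
        dict(candidate, id=f"DD-C{index:03d}")
        for index, candidate in enumerate(unique.values(), start=1)
    ]
    # Stable sort: candidates of equal severity keep their ID order.
    numbered.sort(key=lambda c: SEVERITY_ORDER.index(c["severity"]))
    return numbered[: SHARD_COUNT * SHARD_CAP]


def shard_for_reviewers(candidates):
    """Split candidates round-robin into SHARD_COUNT reviewer shards."""
    return [candidates[i::SHARD_COUNT] for i in range(SHARD_COUNT)]
```

Round-robin splitting is one way to give each reviewer a similar severity mix; the protocol itself does not say how the shards are formed.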

package/.opencode/skills/design-docs/SKILL.md ADDED

@@ -0,0 +1,81 @@
+---
+name: design-docs
+description: >
+  Full execution protocol for MODE: DESIGN_DOCS — generate or sync structured,
+  language-agnostic design docs (domain.md, technical-spec.md, behavior-spec.md,
+  reference/) for the project under build, with a stable section-ID registry and
+  a design changelog. Loaded on demand by the architect when the design-docs
+  command emits a [MODE: DESIGN_DOCS ...] signal (issue #1080).
+---
+# Design-Doc Generation & Sync Protocol
+Generate or maintain the project's structured design documentation. The work is delegated to the `docs_design` agent (a design-doc-author role variant of the docs agent). This mode authors a fixed set of version-controlled docs in the **target project repo** (NOT under `.swarm/`). It does NOT modify source code, does NOT call `declare_scope`, and does NOT touch `.swarm/spec.md`, `CHANGELOG.md`, or `docs/releases/pending/*`.
+### MODE: DESIGN_DOCS
+## Step 0 — Parse Header
+Parse the `[MODE: DESIGN_DOCS ...]` header to extract:
+- `out`: output directory, project-relative (default `docs`)
+- `lang`: target language for `reference/` docs, or `auto` (default `auto`)
+- `update`: boolean — `true` = sync existing docs to current code/spec; `false` = generate fresh
+- the trailing free text = the system description (required when `update=false`)
+If the header is malformed, report the error and stop.
+## Step 1 — Preconditions
+1. Confirm `design_docs.enabled` is true (the `docs_design` agent only exists when enabled). If it is not, tell the user to set `design_docs.enabled: true` in `opencode-swarm.json` and stop.
+2. If a spec-staleness block is active (`.swarm/spec-staleness.json` present), resolve/acknowledge spec staleness FIRST — otherwise design-doc writes may be blocked by the guardrail. Do not blindly retry on `SPEC_STALENESS_BLOCK`.
+3. Read `.swarm/spec.md` if present — it is the authoritative requirements source (FR-### IDs). The design docs must be consistent with it.
+## Step 2 — Index Existing State (always)
+Have the `docs_design` agent (or `doc_scan`) index `<out>/` to discover any existing design docs. If `<out>/reference/traceability.json` exists, it is the section-ID registry — load it. Existing section IDs MUST be preserved on regeneration.
+## Step 3 — Generate or Sync
+Dispatch the **`docs_design`** agent (the active swarm's `docs_design` — never the standard `docs` agent) with:
+- `TASK`, `MODE` (generate|sync), `OUT_DIR`, `LANGUAGE`
+- For sync: `FILES CHANGED` and `CHANGES SUMMARY` from the current phase/diff
+- `SKILLS: file:.opencode/skills/design-docs/SKILL.md` (this skill)
+The agent owns exactly these files under `<out>` and creates NOTHING else:
+```
+<out>/
+├── domain.md              # 100% language-agnostic. Entities in neutral notation
+│                          #   (field: type-class), domain invariants. ZERO framework
+│                          #   names in normative text. Section IDs: D-###
+├── technical-spec.md      # Language-agnostic architecture: layers, dependency rules,
+│                          #   contract SHAPES (inputs→outputs→error-kinds), algorithms,
+│                          #   invariants. + the traceability table. Section IDs: S-###
+├── behavior-spec.md       # 100% language-agnostic Given/When/Then specs. IDs: B-###
+├── design-changelog.md    # Keep-a-Changelog log of design-doc changes (NOT release notes)
+└── reference/             # ALL [INCIDENTAL] language/framework-specific material here.
+    ├── reference-impl.md  #   Exact signatures, CLI strings, SQL, code. Mapped to
+    │                      #   spec sections by ID. Section IDs: R-###
+    ├── idiom-notes.md     #   "Here is how the reference solved X" — examples only.
+    └── traceability.json  #   Machine-readable section-ID registry (source of truth)
+```
+## Step 4 — Invariants the docs MUST satisfy
+- **Language-agnostic normative text**: `domain.md`, `technical-spec.md`, and `behavior-spec.md` contain ZERO framework/library/language names in normative content. All such material lives ONLY in `reference/`.
+- **Version header** on every doc:
+  `<!-- design-doc: <name>  version: <phase-or-counter>  generated: <ISO-8601>  spec-hash: <8 chars> -->`
+- **Stable section IDs**: assigned once, never renumbered. `D-###` domain, `S-###` technical-spec, `B-###` behavior-spec, `R-###` reference. On sync, reuse every existing ID; mint new IDs only for genuinely new sections.
+- **Traceability footer** ending each section: `> Traceability: FR-012, FR-013 | invariant: <id-or-none>`.
+- **traceability.json** kept in sync: `{ "schema_version": 1, "sections": [ { "section_id", "doc", "title", "spec_frs": [], "invariants": [], "code_anchors": [] } ] }`. `technical-spec.md` renders a human-readable mirror table `| Doc Section | Spec FR | Invariant | Code anchors |`.
+- **design-changelog.md**: append one entry per generate/sync under `## [Unreleased]` (Added/Changed/Removed), e.g. `- <ISO date> phase <N>: <sections touched> (<FR refs>)`. This file is SEPARATE from release-please artifacts — never edit `CHANGELOG.md` or `docs/releases/pending/*` here.
+## Step 5 — Verify & Report
+1. Confirm the agent created/updated only the allowed files and `traceability.json` is consistent with the docs.
+2. Confirm no normative doc names a framework (spot-check) and every section has an ID + traceability footer.
+3. Report `UPDATED` / `ADDED` / `REMOVED` / `SUMMARY` back to the user.
+## Notes on the PHASE-WRAP sync path
+During PHASE-WRAP, the deterministic design-doc drift check (`runDesignDocDriftCheck`) writes `.swarm/doc-drift-phase-N.json`. If the verdict is `DOC_STALE` and `design_docs.enabled`, dispatch `docs_design` in **sync** mode for the affected sections only, then append a design-changelog entry. This is advisory and non-blocking — never block phase completion on design-doc lag.
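
The stable section-ID invariant from the design-docs skill above (reuse every existing ID on sync, mint new IDs only for new sections) could be sketched as below. Treat this as an assumption-laden illustration: matching sections by exact title, the `doc` values used as keys, and three-digit zero padding for `###` are all guesses, and `assign_section_ids` is a hypothetical helper rather than anything shipped in the package.

```python
# ID prefixes per doc, as listed in the skill file; the key strings are assumed.
PREFIXES = {
    "domain.md": "D",
    "technical-spec.md": "S",
    "behavior-spec.md": "B",
    "reference/reference-impl.md": "R",
}


def assign_section_ids(doc, titles, sections):
    """Map each section title to an ID, preserving IDs already registered.

    doc: which document the titles belong to (a key of PREFIXES).
    titles: section titles in the regenerated document.
    sections: the "sections" list loaded from traceability.json.
    """
    prefix = PREFIXES[doc]
    existing = {s["title"]: s["section_id"] for s in sections if s["doc"] == doc}
    # New IDs continue after the highest number seen, so nothing is renumbered.
    highest = max((int(sid.split("-")[1]) for sid in existing.values()), default=0)
    assigned = {}
    for title in titles:
        if title in existing:
            assigned[title] = existing[title]
        else:
            highest += 1
            assigned[title] = f"{prefix}-{highest:03d}"
    return assigned
```

Continuing from the highest registered number avoids renumbering, but it only protects IDs still present in the registry; keeping removed sections' IDs retired would need extra bookkeeping.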

package/.opencode/skills/discover/SKILL.md ADDED

@@ -0,0 +1,20 @@
+---
+name: discover
+description: >
+  Full execution protocol for MODE: DISCOVER -- read-only repository discovery and governance/context mapping.
+---
+# Discover Protocol
+This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
+### MODE: DISCOVER
+Delegate to the active swarm's explorer agent. Wait for response.
+For complex tasks, make a second explorer call focused on risk/gap analysis:
+- Hidden requirements, unstated assumptions, scope risks
+- Existing patterns that the implementation must follow
+After explorer returns:
+- Run `symbols` tool on key files identified by explorer to understand public API surfaces
+- For multi-file module surveys: prefer `batch_symbols` over sequential single-file symbols calls
+- Run `complexity_hotspots` if not already run in Phase 0 (check context.md for existing analysis). Note modules with recommendation "security_review" or "full_gates" in context.md.
+- Check for project governance files using the `glob` tool with patterns `project-instructions.md`, `docs/project-instructions.md`, `CONTRIBUTING.md`, and `INSTRUCTIONS.md` (checked in that priority order — first match wins). If a file is found: read it and extract all MUST (mandatory constraints) and SHOULD (recommended practices) rules. Write the extracted rules as a summary to `.swarm/context.md` under a `## Project Governance` section — append if the section already exists, create it if not. If no MUST or SHOULD rules are found in the file, skip writing. If no governance file is found: skip silently. Existing DISCOVER steps are unchanged.
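
The governance-file lookup in the discover protocol above (fixed priority order, first match wins, then MUST/SHOULD extraction) can be approximated as follows. This is a sketch under assumptions: the protocol uses the `glob` tool rather than direct filesystem checks, and treating each line that contains the bare word MUST or SHOULD as one rule is a simplification chosen for illustration.

```python
import re
from pathlib import Path

# Priority order from the protocol text: the first existing file wins.
GOVERNANCE_CANDIDATES = [
    "project-instructions.md",
    "docs/project-instructions.md",
    "CONTRIBUTING.md",
    "INSTRUCTIONS.md",
]


def find_governance_file(root):
    """Return the highest-priority governance file under root, or None."""
    for relative in GOVERNANCE_CANDIDATES:
        candidate = Path(root) / relative
        if candidate.is_file():
            return candidate
    return None


def extract_rules(text):
    """Collect lines containing MUST or SHOULD (uppercase, whole word)."""
    rules = {"MUST": [], "SHOULD": []}
    for line in text.splitlines():
        cleaned = line.strip().lstrip("-*").strip()  # drop list bullets
        for keyword in ("MUST", "SHOULD"):
            if re.search(rf"\b{keyword}\b", cleaned):
                rules[keyword].append(cleaned)
                break  # a line is filed under the first keyword it matches
    return rules
```

When both lists come back empty, the protocol says to skip writing the `## Project Governance` section entirely.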

package/.opencode/skills/execute/SKILL.md ADDED

@@ -0,0 +1,191 @@
+---
+name: execute
+description: >
+  Full execution protocol for MODE: EXECUTE -- task execution, coder retry handling, QA gates, completion evidence, and per-task closure.
+---
+# Execute Protocol
+This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
+### MODE: EXECUTE
+For each task (respecting dependencies):
+RETRY PROTOCOL — when returning to coder after any gate failure:
+1. Provide structured rejection: "GATE FAILED: [gate name] | REASON: [details] | REQUIRED FIX: [specific action required]"
+2. Re-enter at step 5b (the active swarm's coder agent) with full failure context
+3. Resume execution at the failed step (do not restart from 5a)
+   Exception: if coder modified files outside the original task scope, restart from step 5c
+4. Gates already PASSED may be skipped on retry if their input files are unchanged
+5. Print "Resuming at step [5X] after coder retry [N/configured QA retry limit]" before re-executing
+GATE FAILURE RESPONSE RULES — when ANY gate returns a failure:
+You MUST return to the active swarm's coder agent. You MUST NOT fix the code yourself.
+WRONG responses to gate failure:
+✗ Editing the file yourself to fix the syntax error
+✗ Running a tool to auto-fix and moving on without coder
+✗ "Installing" or "configuring" tools to work around the failure
+✗ Treating the failure as an environment issue and proceeding
+✗ Deciding the failure is a false positive and skipping the gate
+RIGHT response to gate failure:
+✓ Print "GATE FAILED: [gate name] | REASON: [details]"
+✓ BEFORE the retry delegation: call `declare_scope` with the file list the retry will touch. Re-declare even if the files are identical to the original task — retry scope persists per-call, not per-task. See Rule 1a.
+✓ Delegate to the active swarm's coder agent with:
+TASK: Fix [gate name] failure
+FILE: [affected file(s)]
+INPUT: [exact error output from the gate]
+CONSTRAINT: Fix ONLY the reported issue, do not modify other code
+✓ After coder returns, re-run the failed gate from the step that failed
+✓ Print "Coder attempt [N/configured QA retry limit] on task [X.Y]"
+The ONLY exception: lint tool in fix mode (step 5g) auto-corrects by design.
+All other gates: failure → return to coder. No self-fixes. No workarounds.
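
The structured rejection and retry delegation described in the gate-failure rules above are plain text templates, so they can be sketched as simple string builders. The helper names and the example gate output are hypothetical; only the field labels come from the skill file.

```python
def gate_failure_message(gate, reason, required_fix):
    """Build the structured rejection line from the retry protocol."""
    return f"GATE FAILED: {gate} | REASON: {reason} | REQUIRED FIX: {required_fix}"


def coder_retry_delegation(gate, files, error_output):
    """Build the delegation block sent back to the coder agent."""
    return "\n".join(
        [
            f"TASK: Fix {gate} failure",
            f"FILE: {', '.join(files)}",
            f"INPUT: {error_output}",
            "CONSTRAINT: Fix ONLY the reported issue, do not modify other code",
        ]
    )


# Hypothetical example values, for illustration only.
print(gate_failure_message("syntax_check", "1 error in src/a.ts", "close the brace"))
print(coder_retry_delegation("syntax_check", ["src/a.ts"], "src/a.ts:10 unexpected token"))
```

Keeping the templates in one place would make it harder for a retry to omit the exact gate output, which the rules require as INPUT.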
+5a. **UI DESIGN GATE** (conditional — Rule 9): If task matches UI trigger → the active swarm's designer agent produces scaffold → pass scaffold to coder as INPUT. If no match → skip.
+→ After step 5a (or immediately if no UI task applies): Call update_task_status with status in_progress for the current task. Then proceed to step 5b.
+5a-bis. **DARK MATTER CO-CHANGE DETECTION**: After declaring scope but BEFORE finalizing the task file list, call `knowledge_recall` with the query `hidden-coupling primaryFile`, where primaryFile is the first file in the task's FILE list. If results are found, add those files to the task's AFFECTS scope with a BLAST RADIUS note. If there are no results or `knowledge_recall` is unavailable, proceed without adding files. This step is advisory: the architect may exclude files from scope if they are unrelated to the current task. Delegate to the active swarm's coder agent only after scope is declared.
+5b-PRE (required): Call `declare_scope({ taskId, files })` with the EXACT file list for this task — including any co-change files surfaced by 5a-bis. Skipping this call will cause every coder write to be BLOCKED by scope-guard. No `declare_scope` → no 5b delegation. See Rule 1a.
+    5b-BASE (required, once per task): Call `sast_scan` with `{ capture_baseline: true, phase: <N>, changed_files: <files from 5b-PRE> }` where `<N>` is the current phase number (extract from current task ID: task "3.2" → phase 3, task "1.5" → phase 1). The tool maintains `.swarm/evidence/{phase}/sast-baseline.json` as a phase-scoped, incrementally merged baseline of pre-existing SAST findings. Calling twice for the same files is safe (idempotent merge). Do NOT re-capture mid-task.
+    → REQUIRED: Print "sast-baseline: [WRITTEN — N fingerprints | MERGED — N fingerprints | SKIPPED — gate disabled | ERROR — details]"
+    → Subsequent `pre_check_batch` calls with `phase: <N>` will automatically diff against this baseline — only NEW findings (not in baseline) drive the fail verdict.
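The phase extraction and baseline behaviour in 5b-BASE can be illustrated with the sketch below. It assumes the baseline is a flat list of fingerprint strings and that each finding carries a `fingerprint` field. The `Finding` shape and all three function names are hypothetical, and the real on-disk format of `sast-baseline.json` is not shown in this document.

```typescript
// Extract the phase number from a task ID such as "3.2" (phase 3) or "1.5" (phase 1).
function phaseFromTaskId(taskId: string): number {
  const match = /^(\d+)\.\d+$/.exec(taskId);
  if (match === null) {
    throw new Error(`Unrecognized task ID: ${taskId}`);
  }
  return Number(match[1]);
}

// Hypothetical finding shape; only the fingerprint matters for the diff.
interface Finding {
  fingerprint: string;
  ruleId: string;
  file: string;
}

// Idempotent merge: capturing the same fingerprints twice leaves the baseline unchanged.
function mergeBaseline(baseline: readonly string[], captured: readonly string[]): string[] {
  const merged = baseline.slice();
  for (const fingerprint of captured) {
    if (merged.indexOf(fingerprint) === -1) {
      merged.push(fingerprint);
    }
  }
  return merged;
}

// Only findings absent from the baseline count as new and can drive a fail verdict.
function splitAgainstBaseline(
  baseline: readonly string[],
  current: readonly Finding[],
): { newFindings: Finding[]; preExisting: Finding[] } {
  const newFindings: Finding[] = [];
  const preExisting: Finding[] = [];
  for (const finding of current) {
    if (baseline.indexOf(finding.fingerprint) === -1) {
      newFindings.push(finding);
    } else {
      preExisting.push(finding);
    }
  }
  return { newFindings, preExisting };
}
```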
+5b. the active swarm's coder agent - Implement (if designer scaffold produced, include it as INPUT).
+5c. Run `diff` tool. If `hasContractChanges` → the active swarm's explorer agent integration analysis. If COMPATIBILITY SIGNALS=INCOMPATIBLE or MIGRATION_SURFACE=yes → coder retry. If COMPATIBILITY SIGNALS=COMPATIBLE and MIGRATION_SURFACE=no → proceed.
+    → REQUIRED: Print "diff: [PASS | CONTRACT CHANGE — details]"
+    5d. Run `syntax_check` tool. SYNTACTIC ERRORS → return to coder. NO ERRORS → proceed to placeholder_scan.
+    → REQUIRED: Print "syntaxcheck: [PASS | FAIL — N errors]"
+    5e. Run `placeholder_scan` tool. PLACEHOLDER FINDINGS → return to coder. NO FINDINGS → proceed to imports.
+    → REQUIRED: Print "placeholderscan: [PASS | FAIL — N findings]"
+    5f. Run `imports` tool for dependency audit. ISSUES → return to coder.
+    → REQUIRED: Print "imports: [PASS | ISSUES — details]"
+    5g. Run `lint` tool with fix mode for auto-fixes. If issues remain → run `lint` tool with check mode. FAIL → return to coder.
+    → REQUIRED: Print "lint: [PASS | FAIL — details]"
+    5h. Run `build_check` tool. BUILD FAILS → return to coder. SUCCESS → proceed to pre_check_batch.
+    → REQUIRED: Print "buildcheck: [PASS | FAIL | SKIPPED — no toolchain]"
+    5i. Run `pre_check_batch` tool with `phase: <N>` (same phase number used in 5b-BASE) → runs four verification tools in parallel (max 4 concurrent):
+    - lint:check (code quality verification)
+    - secretscan (secret detection)
+    - sast_scan (static security analysis — diffs against phase baseline when phase provided)
+    - quality_budget (maintainability metrics)
+    → Returns { gates_passed, lint, secretscan, sast_scan, quality_budget, total_duration_ms }
+    → sast_scan result may include { new_findings, pre_existing_findings, baseline_used } when baseline diff is active.
+    → If ALL FOUR tools have ran === false (lint.ran === false && secretscan.ran === false && sast_scan.ran === false && quality_budget.ran === false):
+        → This is a SKIP - no tools actually ran. Print "pre_check_batch: SKIP — all tools ran===false (no files to check or tools not available)" and proceed to the active swarm's reviewer agent.
+    → Else if gates_passed === false: read individual tool results, identify which tool(s) failed, return structured rejection to the active swarm's coder agent with specific tool failures. Do NOT call the active swarm's reviewer agent.
+    → If gates_passed === true AND sast_preexisting_findings is present: proceed to the active swarm's reviewer agent. Include the pre-existing SAST findings in the reviewer delegation context with instruction: "SAST TRIAGE REQUIRED: The following SAST findings existed before this task began (from phase baseline or unchanged lines). Verify these are acceptable pre-existing conditions and do not interact with the new changes." Do NOT return to coder for pre-existing findings.
+    → If gates_passed === true (no sast_preexisting_findings): proceed to the active swarm's reviewer agent.
+    → REQUIRED: Print "pre_check_batch: [PASS — all gates passed | PASS — pre-existing SAST findings (N findings, reviewer triage) | FAIL — [gate]: [details]]"
+⚠️ pre_check_batch SCOPE BOUNDARY:
+pre_check_batch runs FOUR automated tools: lint:check, secretscan, sast_scan, quality_budget.
+pre_check_batch does NOT run and does NOT replace:
+- the active swarm's reviewer agent (logic review, correctness, edge cases, maintainability)
+- the active swarm's reviewer agent security-only pass (OWASP evaluation, auth/crypto review)
+- the active swarm's test_engineer agent verification tests (functional correctness)
+- the active swarm's test_engineer agent adversarial tests (attack vectors, boundary violations)
+- diff tool (contract change detection)
+- placeholder_scan (TODO/stub detection)
+- imports (dependency audit)
+gates_passed: true means "automated static checks passed."
+It does NOT mean "code is reviewed." It does NOT mean "code is tested."
+After pre_check_batch passes, you MUST STILL delegate to the active swarm's reviewer agent.
+Treating pre_check_batch as a substitute for the active swarm's reviewer agent is a PROCESS VIOLATION.
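The branching rules for step 5i can be sketched as a single classifier. The field names `gates_passed`, `ran`, and `sast_preexisting_findings` follow the text above. The per-tool `passed` flag, the status labels, and the reading of "present" as "defined and non-empty" are assumptions made for this sketch.

```typescript
// The per-tool `passed` flag is a hypothetical field added for this sketch.
interface ToolResult {
  ran: boolean;
  passed?: boolean;
}

interface PreCheckBatchResult {
  gates_passed: boolean;
  lint: ToolResult;
  secretscan: ToolResult;
  sast_scan: ToolResult;
  quality_budget: ToolResult;
  sast_preexisting_findings?: unknown[];
}

type BatchStatus = "SKIP" | "FAIL" | "PASS_WITH_SAST_TRIAGE" | "PASS";

interface BatchVerdict {
  status: BatchStatus;
  next: "reviewer" | "coder";
  failedTools: string[];
}

function classifyPreCheckBatch(result: PreCheckBatchResult): BatchVerdict {
  const tools: Array<[string, ToolResult]> = [
    ["lint", result.lint],
    ["secretscan", result.secretscan],
    ["sast_scan", result.sast_scan],
    ["quality_budget", result.quality_budget],
  ];

  // Checked first: if no tool ran, the batch is a skip rather than a pass or a fail.
  if (tools.every(([, tool]) => !tool.ran)) {
    return { status: "SKIP", next: "reviewer", failedTools: [] };
  }

  if (!result.gates_passed) {
    const failedTools = tools
      .filter(([, tool]) => tool.ran && tool.passed === false)
      .map(([name]) => name);
    return { status: "FAIL", next: "coder", failedTools };
  }

  // Pre-existing findings never send work back to the coder; they go to reviewer triage.
  const preExisting = result.sast_preexisting_findings;
  if (preExisting !== undefined && preExisting.length > 0) {
    return { status: "PASS_WITH_SAST_TRIAGE", next: "reviewer", failedTools: [] };
  }
  return { status: "PASS", next: "reviewer", failedTools: [] };
}
```

Every non-failing branch still routes to the reviewer, which reflects the scope boundary: a passing batch does not replace review or testing.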
+    5j. the active swarm's reviewer agent - General review. REJECTED before the configured QA retry limit → coder retry. REJECTED at the configured QA retry limit → escalate.
+    → REQUIRED: Print "reviewer: [APPROVED | REJECTED — reason]"
+    5k. Security gate: if change matches TIER 3 criteria OR content contains SECURITY_KEYWORDS OR secretscan has ANY findings OR sast_scan has ANY findings at or above threshold → MUST delegate the active swarm's reviewer agent security-only review. REJECTED before the configured QA retry limit → coder retry. REJECTED at the configured QA retry limit → escalate to user.
+    → REQUIRED: Print "security-reviewer: [TRIGGERED | NOT TRIGGERED — reason]"
+    → If TRIGGERED: Print "security-reviewer: [APPROVED | REJECTED — reason]"
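As a rough illustration of steps 5j and 5k, the sketch below evaluates the four security-gate triggers and the retry-versus-escalate rule. The input shape is hypothetical, the keyword list is a placeholder for the `SECURITY_KEYWORDS` defined elsewhere in the prompt, and case-insensitive substring matching is an assumption.

```typescript
// Hypothetical input shape; each field corresponds to one trigger named in step 5k.
interface SecurityGateInput {
  matchesTier3: boolean;
  content: string;
  securityKeywords: readonly string[]; // placeholder for SECURITY_KEYWORDS
  secretscanFindings: number;
  sastFindingsAtOrAboveThreshold: number;
}

// Returns the reasons the security-only review is required; an empty list means NOT TRIGGERED.
function securityReviewReasons(input: SecurityGateInput): string[] {
  const reasons: string[] = [];
  if (input.matchesTier3) {
    reasons.push("tier-3 change");
  }
  const haystack = input.content.toLowerCase();
  const hits = input.securityKeywords.filter(
    (keyword) => haystack.indexOf(keyword.toLowerCase()) !== -1,
  );
  if (hits.length > 0) {
    reasons.push(`security keywords: ${hits.join(", ")}`);
  }
  if (input.secretscanFindings > 0) {
    reasons.push("secretscan findings");
  }
  if (input.sastFindingsAtOrAboveThreshold > 0) {
    reasons.push("sast findings at or above threshold");
  }
  return reasons;
}

type RejectionAction = "coder_retry" | "escalate";

// A rejection before the configured limit goes back to the coder; at the limit it escalates.
function onRejection(attempt: number, retryLimit: number): RejectionAction {
  return attempt < retryLimit ? "coder_retry" : "escalate";
}
```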
+    5l. the active swarm's test_engineer agent - Verification tests. FAIL → coder retry from 5g.
+    → REQUIRED: Print "testengineer-verification: [PASS N/N | FAIL — details]"
+    5l-bis. REGRESSION SWEEP (automatic after test_engineer-verification PASS):
+    Run test_runner with { scope: "graph", files: [<all source files changed by coder in this task>] }.
+    scope:"graph" traces imports to discover test files beyond the task's own tests that may be affected by this change.
+    Outcomes (based on test_runner result.outcome field):
+    - outcome: "pass" → All tests passed. Print "regression-sweep: PASS [N additional tests, M files]"
+    - outcome: "regression" → Tests ran but some failed. Print "regression-sweep: FAIL — REGRESSION DETECTED in [files]. The failing tests are CORRECT — fix the source code, not the tests." Return to coder with retry from 5g.
+    - outcome: "skip" → No test files resolved (nothing to run). Print "regression-sweep: SKIPPED — no related tests beyond task scope"
+    - outcome: "scope_exceeded" → Too many files for graph scope. Print "regression-sweep: SKIPPED — broad scope, no related tests beyond task scope"
+    - outcome: "error" → Tool error (timeout, no framework, etc.). Print "regression-sweep: SKIPPED — test_runner error" and continue pipeline.
+    IMPORTANT: The regression sweep runs test_runner DIRECTLY (architect calls the tool). Do NOT delegate to test_engineer for this — the test_engineer's EXECUTION BOUNDARY restricts it to its own test files. The architect has unrestricted test_runner access.
+    → REQUIRED: Print "regression-sweep: [PASS | FAIL — REGRESSION DETECTED | SKIPPED — no related tests | SKIPPED — broad scope | SKIPPED — test_runner error]"
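The outcome handling for the regression sweep maps naturally onto an exhaustive switch. In the sketch below the five `outcome` values and the `{ scope: "graph", files }` arguments come from the text above, while `SweepDecision`, the action labels, and both function names are hypothetical.

```typescript
type SweepOutcome = "pass" | "regression" | "skip" | "scope_exceeded" | "error";

interface SweepDecision {
  status: "PASS" | "FAIL" | "SKIPPED";
  reason: string | null;
  action: "continue" | "coder_retry_from_5g";
}

// Arguments for the direct test_runner call made by the architect.
function regressionSweepArgs(changedFiles: readonly string[]): { scope: "graph"; files: string[] } {
  return { scope: "graph", files: changedFiles.slice() };
}

function decideRegressionSweep(outcome: SweepOutcome): SweepDecision {
  switch (outcome) {
    case "pass":
      return { status: "PASS", reason: null, action: "continue" };
    case "regression":
      // The failing tests are treated as correct: the source code is fixed, not the tests.
      return { status: "FAIL", reason: "REGRESSION DETECTED", action: "coder_retry_from_5g" };
    case "skip":
      return { status: "SKIPPED", reason: "no related tests", action: "continue" };
    case "scope_exceeded":
      return { status: "SKIPPED", reason: "broad scope", action: "continue" };
    case "error":
      return { status: "SKIPPED", reason: "test_runner error", action: "continue" };
  }
}
```

Only the `regression` outcome blocks the pipeline; the three skip variants are recorded and the task continues.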
+    5l-ter. TEST DRIFT CHECK (conditional): Run this step if the change involves any drift-prone area:
+    - Command/CLI behavior changed (shell command wrappers, CLI interfaces)
+    - Parsing or routing logic changed (argument parsing, route matching, file resolution)
+    - User-visible output changed (formatted output, error messages, JSON response structure)
+    - Public contracts or schemas changed (API types, tool argument schemas, return types)
+    - Assertion-heavy areas where output strings are tested (command/help output tests, error message tests)
+    - Helper behavior or lifecycle semantics changed (state machines, lifecycle hooks, initialization)
+    If NOT triggered: Print "test-drift: NOT TRIGGERED — no drift-prone change detected"
+    If TRIGGERED:
+    - Use grep/search to find test files that cover the affected functionality
+    - Run those tests via test_runner with scope:"convention" on the related test files
+    - If any FAIL → print "test-drift: DRIFT DETECTED in [N] tests" and escalate to reviewer/test_engineer
+    - If all PASS → print "test-drift: [N] related tests verified"
+    - If no related tests found → print "test-drift: NO RELATED TESTS FOUND" (not a failure)
+    → REQUIRED: Print "test-drift: [TRIGGERED | NOT TRIGGERED — reason]" and "[DRIFT DETECTED in N tests | N related tests verified | NO RELATED TESTS FOUND | NOT TRIGGERED]"
+    5n. TODO SCAN (advisory): Call todo_extract with paths=[list of files changed in this task]. If any results have priority HIGH → print "todo-scan: WARN — N high-priority TODOs in changed files: [list of TODO texts]". If no high-priority results → print "todo-scan: CLEAN". This is advisory only and does NOT block the pipeline.
+    → REQUIRED: Print "todo-scan: [WARN — N high-priority TODOs | CLEAN]"
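The advisory TODO scan reduces to a filter over the extracted items. The sketch below assumes each `todo_extract` result exposes `text` and `priority` fields. That shape, the non-HIGH priority values, and the function name are hypothetical.

```typescript
// Hypothetical result shape; only priority "HIGH" is named in the step above.
interface TodoItem {
  text: string;
  priority: "HIGH" | "MEDIUM" | "LOW";
}

interface TodoScanSummary {
  status: "WARN" | "CLEAN";
  highPriorityTexts: string[];
}

function summarizeTodoScan(items: readonly TodoItem[]): TodoScanSummary {
  const highPriorityTexts = items
    .filter((item) => item.priority === "HIGH")
    .map((item) => item.text);
  // Advisory only: a WARN is reported but never blocks the pipeline.
  return {
    status: highPriorityTexts.length > 0 ? "WARN" : "CLEAN",
    highPriorityTexts,
  };
}
```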
+    5m. ADVERSARIAL TEST STEP (config-specific): Use the rendered adversarial-test instruction from the MODE: EXECUTE architect stub. If the stub omits step 5m, skip this step.
+    5n. COVERAGE CHECK: If the active swarm's test_engineer agent reports coverage < 70% → delegate the active swarm's test_engineer agent for an additional test pass targeting uncovered paths. This is a soft guideline; use judgment for trivial tasks.
+PRE-COMMIT RULE — Before ANY commit or push:
+  You MUST answer YES to ALL of the following:
+  [ ] Did the active swarm's reviewer agent run and return APPROVED? (not "I reviewed it" — the agent must have run)
+  [ ] Did the active swarm's test_engineer agent run and return PASS? (not "the code looks correct" — the agent must have run)
+  [ ] Did pre_check_batch run with gates_passed true?
+  [ ] Did the diff step run?
+  [ ] Did regression-sweep run (or SKIP with no related tests or test_runner error)?
+  [ ] Did test-drift check run (or NOT TRIGGERED)?
+  If ANY box is unchecked: DO NOT COMMIT. Return to step 5b.
+  There is no override. A commit without a completed QA gate is a workflow violation.
+## ROLE-BOUNDARY CHANGE VALIDATION (mandatory for prompt changes)
+When a task modifies agent prompts (especially explorer, reviewer, critic, or any agent involved in the mapper/validator/challenge hierarchy), add an explicit test validation step:
+- If new prompt contract tests exist (e.g., explorer-role-boundary.test.ts, explorer-consumer-contract.test.ts): Run them via test_runner
+- If no specific tests exist for the changed prompt: Run test_runner with scope "convention" on the changed file
+- Verify the new tests pass before completing the task
+This step supplements, and does not replace, the existing regression-sweep and test-drift checks. It exists to catch prompt contract regressions that automated gates might miss.
+5o. ⛔ TASK COMPLETION GATE — You MUST print this checklist with filled values before marking ✓ in .swarm/plan.md:
+  [TOOL] diff: PASS / SKIP — value: ___
+  [TOOL] syntax_check: PASS — value: ___
+  [TOOL] placeholder_scan: PASS — value: ___
+  [TOOL] imports: PASS — value: ___
+  [TOOL] lint: PASS — value: ___
+  [TOOL] build_check: PASS / SKIPPED — value: ___
+  [TOOL] pre_check_batch: PASS (lint:check ✓ secretscan ✓ sast_scan ✓ quality_budget ✓) — value: ___
+  [GATE] reviewer: APPROVED — value: ___
+  [GATE] reuse_re_verification: VERIFIED / SKIPPED / DUPLICATION_DETECTED — value: ___
+  [GATE] security-reviewer: APPROVED / SKIPPED — value: ___
+  [GATE] test_engineer-verification: PASS — value: ___
+  [GATE] regression-sweep: PASS / SKIPPED — value: ___
+  [GATE] test-drift: TRIGGERED / NOT TRIGGERED — value: ___
+  [GATE] test_engineer-adversarial: use the rendered checklist entry from the MODE: EXECUTE architect stub
+  [GATE] coverage: ≥70% / soft-skip — value: ___
+  You MUST NOT mark a task complete without printing this checklist with filled values.
+  You MUST NOT fill "PASS" or "APPROVED" for a gate you did not actually run — that is fabrication.
+  Any blank "value: ___" field = gate was not run = task is NOT complete.
+  Filling this checklist from memory ("I think I ran it") is INVALID. Each value must come from actual tool/agent output in this session.
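One way to enforce the "no blank value" rule mechanically is to scan the printed checklist for unfilled fields. This is a sketch under the assumption that the checklist is available as plain text in the format shown above; `findUnfilledGates` is a hypothetical helper, not an existing tool.

```typescript
// Returns the names of [TOOL]/[GATE] entries whose "value:" field is empty or still underscores.
function findUnfilledGates(checklist: string): string[] {
  const unfilled: string[] = [];
  for (const line of checklist.split(/\r?\n/)) {
    const match = /^\s*\[(?:TOOL|GATE)\]\s+([^:]+):.*value:\s*(.*)$/.exec(line);
    if (match === null) {
      continue; // not a checklist entry, or an entry without a value field
    }
    const name = (match[1] ?? "").trim();
    const value = (match[2] ?? "").trim();
    if (value === "" || /^_+$/.test(value)) {
      unfilled.push(name);
    }
  }
  return unfilled;
}
```

A non-empty result means the task is not complete. A check like this cannot detect fabricated values, so it complements the rule that each value must come from actual tool or agent output.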
+    5p. Call update_task_status with status "completed".
+    5q. OPTIONAL TASK-COMPLETION COMMIT POLICY: read `.swarm/context.md`.
+        - If `## Task Completion Commit Policy` contains `commit_after_each_completed_task: true`, immediately call:
+          `checkpoint save task-<task-id>-complete`
+        - If the section is absent or false, skip this step.
+        - This optional commit policy NEVER bypasses PRE-COMMIT RULE checks above.
+        - If checkpoint save fails with "duplicate label", the task was already checkpointed from a prior completion or retry. Silently skip — the existing checkpoint is valid.
+    5r. Proceed to next task.
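The optional commit policy in step 5q depends on reading one flag from `.swarm/context.md`. The sketch below shows one possible parser, assuming the flag appears on its own line (optionally as a list item) under the `## Task Completion Commit Policy` heading. The exact file layout is not specified here, so the parsing rules and both function names are assumptions.

```typescript
// Returns true only when the policy section exists and sets the flag to true.
function commitAfterEachTask(contextMd: string): boolean {
  let inSection = false;
  for (const line of contextMd.split(/\r?\n/)) {
    if (/^#{1,2}\s/.test(line)) {
      // Any level-1 or level-2 heading starts a new section.
      inSection = /^##\s+Task Completion Commit Policy\s*$/.test(line);
      continue;
    }
    if (inSection && /^\s*[-*]?\s*commit_after_each_completed_task:\s*true\s*$/.test(line)) {
      return true;
    }
  }
  return false; // section absent, flag absent, or flag false: skip the checkpoint
}

// Label used with `checkpoint save`, assuming the task ID is inserted verbatim.
function checkpointLabel(taskId: string): string {
  return `task-${taskId}-complete`;
}
```

A caller would still run the PRE-COMMIT RULE checks before saving the checkpoint, and would treat a "duplicate label" failure as already done.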