opencode-swarm 7.58.0 → 7.59.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. package/.opencode/skills/brainstorm/SKILL.md +142 -0
  2. package/.opencode/skills/clarify/SKILL.md +103 -0
  3. package/.opencode/skills/clarify-spec/SKILL.md +58 -0
  4. package/.opencode/skills/codebase-review-swarm/INSTALL.md +75 -0
  5. package/.opencode/skills/codebase-review-swarm/README.md +44 -0
  6. package/.opencode/skills/codebase-review-swarm/SKILL.md +65 -0
  7. package/.opencode/skills/codebase-review-swarm/agents/openai.yaml +6 -0
  8. package/.opencode/skills/codebase-review-swarm/assets/jsonl-schemas.md +239 -0
  9. package/.opencode/skills/codebase-review-swarm/assets/review-report-template.md +244 -0
  10. package/.opencode/skills/codebase-review-swarm/references/compatibility-and-research-notes.md +25 -0
  11. package/.opencode/skills/codebase-review-swarm/references/full-v7-source-prompt.md +2373 -0
  12. package/.opencode/skills/codebase-review-swarm/references/review-protocol-v8.2.md +310 -0
  13. package/.opencode/skills/codebase-review-swarm/scripts/init-review-run.py +134 -0
  14. package/.opencode/skills/codebase-review-swarm/scripts/validate-skill-package.py +62 -0
  15. package/.opencode/skills/consult/SKILL.md +16 -0
  16. package/.opencode/skills/council/SKILL.md +147 -0
  17. package/.opencode/skills/critic-gate/SKILL.md +59 -0
  18. package/.opencode/skills/deep-dive/SKILL.md +142 -0
  19. package/.opencode/skills/design-docs/SKILL.md +81 -0
  20. package/.opencode/skills/discover/SKILL.md +20 -0
  21. package/.opencode/skills/execute/SKILL.md +191 -0
  22. package/.opencode/skills/issue-ingest/SKILL.md +64 -0
  23. package/.opencode/skills/phase-wrap/SKILL.md +123 -0
  24. package/.opencode/skills/plan/SKILL.md +293 -0
  25. package/.opencode/skills/pre-phase-briefing/SKILL.md +69 -0
  26. package/.opencode/skills/resume/SKILL.md +23 -0
  27. package/.opencode/skills/specify/SKILL.md +175 -0
  28. package/.opencode/skills/swarm-pr-feedback/SKILL.md +192 -0
  29. package/.opencode/skills/swarm-pr-review/SKILL.md +884 -0
  30. package/dist/agents/agent-output-schema.d.ts +1 -1
  31. package/dist/cli/index.js +1351 -1159
  32. package/dist/commands/command-dispatch.d.ts +1 -0
  33. package/dist/commands/index.d.ts +1 -0
  34. package/dist/commands/registry.d.ts +15 -14
  35. package/dist/config/bundled-skills.d.ts +25 -0
  36. package/dist/config/constants.d.ts +1 -1
  37. package/dist/config/schema.d.ts +42 -0
  38. package/dist/index.js +3517 -2673
  39. package/dist/memory/schema.d.ts +1 -1
  40. package/dist/tools/lean-turbo-run-phase.d.ts +2 -1
  41. package/dist/turbo/lean/index.d.ts +4 -1
  42. package/dist/turbo/lean/merge-back.d.ts +180 -0
  43. package/dist/turbo/lean/runner.d.ts +47 -1
  44. package/dist/turbo/lean/state.d.ts +10 -0
  45. package/dist/turbo/lean/worktree.d.ts +194 -0
  46. package/package.json +20 -1
@@ -0,0 +1,147 @@
1
+ ---
2
+ name: council
3
+ description: >
4
+ Full execution protocol for MODE: COUNCIL -- General Council research,
5
+ parallel member dispatch, disagreement handling, and synthesis.
6
+ ---
7
+
8
+ # Council Protocol
9
+
10
+ This protocol is loaded on demand by the architect stub in `src/agents/architect.ts`.
11
+ The architect prompt keeps only activation, action, and hard safety constraints;
12
+ the full execution details live here.
13
+
14
+ ### MODE: COUNCIL
15
+
16
+ Activates when: user invokes `/swarm council <question>` (optionally with
17
+ `--preset <name>` and/or `--spec-review`).
18
+
19
+ Purpose: convene a fixed three-agent multi-model General Council
20
+ (generalist / skeptic / domain expert) for an advisory deliberation. The
21
+ architect runs a curated web research pass upfront, dispatches the three agents
22
+ in parallel with the gathered RESEARCH CONTEXT, routes any disagreements back
23
+ for one targeted reconciliation round, and synthesizes the final user-facing
24
+ answer directly.
25
+
26
+ This mode is ADVISORY. It does not block any other workflow and does not modify
27
+ code, plans, or specs. The output is for the user (general mode) or for the spec
28
+ being drafted in MODE: SPECIFY (spec_review mode, gated by
29
+ `council_general_review`).
30
+
31
+ #### Pre-flight (always run first)
32
+
33
+ 1. Read `council.general` from the resolved opencode-swarm config. Resolution
34
+ is global first (`~/.config/opencode/opencode-swarm.json`), then project
35
+ override (`.opencode/opencode-swarm.json`). A global config is valid and must
36
+ be used when no project override is present; do not fail after checking only
37
+ the project file. If `council.general.enabled` is not true OR no search API
38
+ key is configured (neither `council.general.searchApiKey` nor the
39
+ corresponding env var `TAVILY_API_KEY` / `BRAVE_SEARCH_API_KEY`),
40
+ surface to the user: "General Council is not enabled. Set
41
+ council.general.enabled: true and configure a search API key in
42
+ global ~/.config/opencode/opencode-swarm.json or project
43
+ .opencode/opencode-swarm.json." Then STOP.
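
The resolution order in step 1 can be sketched as below. This is a minimal illustration, not the package's implementation: the helper names (`load_json`, `resolve_general_config`, `preflight_ok`) are hypothetical, and it assumes the project file overrides the global file key by key inside `council.general`, which the protocol does not spell out.

```python
import json
from pathlib import Path

# Paths and env var names come from the protocol text above.
GLOBAL_CONFIG = Path.home() / ".config" / "opencode" / "opencode-swarm.json"
PROJECT_CONFIG = Path(".opencode") / "opencode-swarm.json"
SEARCH_KEY_ENV_VARS = ("TAVILY_API_KEY", "BRAVE_SEARCH_API_KEY")


def load_json(path):
    """Return the parsed JSON object at path, or {} if the file is absent."""
    p = Path(path)
    if not p.is_file():
        return {}
    with p.open(encoding="utf-8") as fh:
        return json.load(fh)


def resolve_general_config(global_cfg, project_cfg):
    """Global values first, then project values override them (assumed shallow merge)."""
    merged = dict(global_cfg.get("council", {}).get("general", {}))
    merged.update(project_cfg.get("council", {}).get("general", {}))
    return merged


def preflight_ok(general, env):
    """True only if the council is enabled and some search API key is available."""
    has_key = bool(general.get("searchApiKey")) or any(env.get(v) for v in SEARCH_KEY_ENV_VARS)
    return general.get("enabled") is True and has_key
```

A caller would pass `load_json(GLOBAL_CONFIG)`, `load_json(PROJECT_CONFIG)`, and `os.environ`; a missing project file simply yields `{}`, so a global-only setup still passes.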
44
+
45
+ #### Research Phase (always run before dispatching council agents)
46
+
47
+ 2. Formulate 1-3 targeted `web_search` queries that best capture the
48
+ information needed to answer the question. Prefer specific, keyword-focused
49
+ queries over broad ones.
50
+
51
+ Hard grounding rules:
52
+ - Do not append a model training-cutoff year to searches.
53
+ - Use `web_search` with its default `freshness: "auto"` behavior for
54
+ current queries unless the user explicitly asked for a historical window.
55
+ - Preserve each `web_search` result's normalized `query`, `temporalIntent`,
56
+ `freshness`, and `removedStaleYears` metadata in RESEARCH CONTEXT audit
57
+ notes.
58
+ - For current, latest, today, now, state-of-the-art, pricing, release-status,
59
+ legal/regulatory, financial, security, or otherwise time-sensitive
60
+ questions, the Research Phase must produce usable current sources before
61
+ council dispatch.
62
+ - If `web_search` returns no results or an error for a time-sensitive
63
+ question, stop and surface the failed search result to the user instead of
64
+ dispatching ungrounded members.
65
+ - For stable/non-current questions, if `web_search` returns no results or an
66
+ error, note this in the dispatch message and proceed without a context
67
+ block. In that degraded mode, members may use stable background knowledge
68
+ only and must not make current-fact claims.
69
+
70
+ Compile all successful results into a RESEARCH CONTEXT block in this format:
71
+
72
+ ```text
+ RESEARCH CONTEXT
+ ================
+ [1] <title> - <url>
+ <snippet>
+ query: <normalized query>; temporalIntent: <current|historical|unspecified>; freshness: <day|week|month|year|none>; removedStaleYears: <comma-separated years or none>
+
+ [2] <title> - <url>
+ <snippet>
+ ...
+ ```
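
As an illustration of the layout above, a formatter might look like the following. The input shape (a list of dicts with `title`, `url`, `snippet`, and the four metadata keys) is an assumption of this sketch; the protocol names the metadata fields but not the result object's structure.

```python
def format_research_context(results):
    """Render search results into the RESEARCH CONTEXT layout (assumed dict input)."""
    lines = ["RESEARCH CONTEXT", "================"]
    for index, result in enumerate(results, start=1):
        # An empty list of removed years is rendered as "none".
        stale = ", ".join(str(y) for y in result.get("removedStaleYears", [])) or "none"
        lines.append(f"[{index}] {result['title']} - {result['url']}")
        lines.append(result["snippet"])
        lines.append(
            f"query: {result['query']}; temporalIntent: {result['temporalIntent']}; "
            f"freshness: {result['freshness']}; removedStaleYears: {stale}"
        )
        lines.append("")  # blank line between entries
    return "\n".join(lines).rstrip("\n")
```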
83
+
84
+ #### Round 1 - Parallel Independent Analysis
85
+
86
+ 3. Dispatch `the active swarm's council_generalist agent`,
87
+ `the active swarm's council_skeptic agent`, and
88
+ `the active swarm's council_domain_expert agent` in PARALLEL -- one message
89
+ per agent, then STOP and wait for all responses. Each dispatch message must
90
+ include:
91
+ - The question
92
+ - Round number: 1
93
+ - The CURRENT DATE in ISO `YYYY-MM-DD` form
94
+ - The full RESEARCH CONTEXT block from step 2
95
+ - Instruction: "Cite from the RESEARCH CONTEXT for external evidence. Your
96
+ memberId and role are hardcoded in your system prompt."
97
+
98
+ Do NOT share other agents' responses at this stage.
99
+
100
+ 4. Collect all three JSON responses. The `round1Responses` array will contain
101
+ entries with `memberId` of `council_generalist`, `council_skeptic`, and
102
+ `council_domain_expert` and `role` of `generalist`, `skeptic`, and
103
+ `domain_expert` respectively. These come from the agents' JSON output; no
104
+ manual construction is needed.
105
+
106
+ #### Synthesis and Deliberation (when council.general.deliberate is true; default true)
107
+
108
+ 5. Call `convene_general_council` with mode set from the command (`general` or
109
+ `spec_review`), `question`, and the collected `round1Responses` only (omit
110
+ `round2Responses`). Inspect the returned `disagreementsCount`.
111
+
112
+ 6. If `disagreementsCount > 0`:
113
+ a. For each disagreement in the tool's response, identify the disputing
114
+ agents (the agents listed in the disagreement's positions, identified by
115
+ memberId: `council_generalist`, `council_skeptic`, or
116
+ `council_domain_expert`).
117
+ b. Re-delegate ONLY to the disputing agents -- one message per agent --
118
+ passing: their Round 1 response, the disagreement topic, the opposing
119
+ position(s), round number 2, and the same RESEARCH CONTEXT block.
120
+ c. Collect the Round 2 responses.
121
+ d. Call `convene_general_council` AGAIN with both `round1Responses` AND
122
+ `round2Responses` populated.
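
Step 6 can be sketched as follows. The disagreement shape (`topic` plus a `positions` list whose entries carry `memberId`) and the payload keys are assumptions for illustration; the actual `convene_general_council` response format is not shown in this diff.

```python
COUNCIL_MEMBER_IDS = ("council_generalist", "council_skeptic", "council_domain_expert")


def disputing_members(disagreements):
    """Return the memberIds that appear in any disagreement, in fixed council order."""
    involved = {
        position["memberId"]
        for disagreement in disagreements
        for position in disagreement.get("positions", [])
    }
    return [member for member in COUNCIL_MEMBER_IDS if member in involved]


def round2_payload(member_id, round1_by_member, disagreement, research_context):
    """Build the Round 2 message for one disputing member (hypothetical key names)."""
    opposing = [p for p in disagreement["positions"] if p["memberId"] != member_id]
    return {
        "round": 2,
        "round1Response": round1_by_member[member_id],
        "topic": disagreement["topic"],
        "opposingPositions": opposing,
        "researchContext": research_context,
    }
```

Members that appear in no disagreement are never re-delegated, which matches the "ONLY to the disputing agents" rule.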
123
+
124
+ #### Output
125
+
126
+ 7. Present the final answer to the user from the `synthesis` returned by
127
+ `convene_general_council`. Apply these output rules directly:
128
+ - LEAD WITH CONSENSUS: open with the strongest consensus position.
129
+ Confidence-weighted: higher-confidence claims from multiple agents rank
130
+ first, but evidence quality outranks raw confidence. Never elevate a
131
+ single confident voice over a well-evidenced contrary majority.
132
+ - ACKNOWLEDGE DISAGREEMENT HONESTLY: for each persisting disagreement, write
133
+ "experts disagree on X because..." and present the strongest version of
134
+ each side. Do not pretend disagreements are resolved. Do not silently pick
135
+ a winner.
136
+ - CITE THE STRONGEST SOURCES: link key claims with `[title](url)` format from
137
+ the source list in the synthesis. Pick the most reputable source per claim;
138
+ do not cite duplicates.
139
+ - BE CONCISE: a few short paragraphs plus a bulleted summary. Expand only
140
+ when the question genuinely requires it.
141
+ - HARD CONSTRAINTS: You MUST NOT invent claims not present in the council's
142
+ responses. You MUST NOT add new web research. You MUST NOT favor a position
143
+ based on confidence alone.
144
+
145
+ Preface the answer with one line listing the participating models (reviewer
146
+ model as generalist, critic model as skeptic, SME model as domain expert). Do
147
+ NOT present raw per-member JSON.
@@ -0,0 +1,59 @@
1
+ ---
2
+ name: critic-gate
3
+ description: >
4
+ Full execution protocol for MODE: CRITIC-GATE -- plan critic review, revision loops, and hard stop before execution.
5
+ ---
6
+
7
+ # Critic Gate Protocol
8
+
9
+ This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
10
+
11
+ ### MODE: CRITIC-GATE
12
+ Delegate plan to the active swarm's critic agent for review BEFORE any implementation begins.
13
+ - Send the full plan.md content and codebase context summary
14
+ - **APPROVED** → Proceed to MODE: EXECUTE
15
+ - **NEEDS_REVISION** → Revise the plan based on critic feedback, then resubmit (max 2 cycles)
16
+ - **REJECTED** → Inform the user of fundamental issues and ask for guidance before proceeding
17
+
18
+ ⛔ HARD STOP — Print this checklist before advancing to MODE: EXECUTE:
19
+ [ ] the active swarm's critic agent returned a verdict
20
+ [ ] APPROVED → proceed to MODE: EXECUTE
21
+ [ ] NEEDS_REVISION → revised and resubmitted (attempt N of max 2)
22
+ [ ] REJECTED (any cycle) → informed user. STOP.
23
+
24
+ You MUST NOT proceed to MODE: EXECUTE without printing this checklist with filled values.
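
The verdict handling above can be sketched as a small loop. `review` and `revise` are hypothetical callables standing in for the critic delegation and the architect's revision step. Stopping and returning to the user once the two revision cycles are used up is an assumption; the protocol states only the maximum.

```python
MAX_REVISION_CYCLES = 2


def run_critic_gate(plan, review, revise):
    """Return ("EXECUTE", plan) on approval, otherwise ("STOP", reason)."""
    revisions = 0
    while True:
        verdict, feedback = review(plan)
        if verdict == "APPROVED":
            return "EXECUTE", plan
        if verdict == "REJECTED":
            return "STOP", "rejected: " + feedback
        if verdict != "NEEDS_REVISION":
            return "STOP", "unknown verdict: " + str(verdict)
        if revisions >= MAX_REVISION_CYCLES:
            # Assumed behavior: hand control back to the user at the limit.
            return "STOP", "revision limit reached: " + feedback
        plan = revise(plan, feedback)
        revisions += 1
```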
25
+
26
+ CRITIC-GATE TRIGGER: Run ONCE when you first write the complete .swarm/plan.md.
27
+ Do NOT re-run CRITIC-GATE before every project phase.
28
+ If resuming a project with an existing approved plan, CRITIC-GATE is already satisfied.
29
+
30
+ 6j. SPEC-GATE (Execute BEFORE any save_plan call):
31
+ - The save_plan tool will REJECT if .swarm/spec.md does not exist (enforced at the tool level, with a bypass via the SWARM_SKIP_SPEC_GATE env var).
32
+ - Before calling save_plan, verify spec.md is present using lint_spec.
33
+ - If spec.md is absent: do NOT call save_plan. Use /swarm specify to create a spec first, or inform the user.
34
+ - This rule is satisfied by the save_plan tool's own spec gate — it exists as a reminder that planning requires a spec.
35
+
36
+ 6k. SPEC-STALENESS GUARD:
37
+ - If _specStale or .swarm/spec-staleness.json exists, the Architect MUST stop
38
+ and SURFACE THE DRIFT TO THE USER. The user (not the Architect) then runs
39
+ either:
40
+ - /swarm clarify to update the spec and align it with the plan, OR
41
+ - /swarm acknowledge-spec-drift to acknowledge the drift and suppress further warnings
42
+ - The Architect MUST NOT run /swarm acknowledge-spec-drift itself — not via
43
+ the swarm_command tool, not via the chat fallback, and NOT by shelling out
44
+ to `bunx opencode-swarm run acknowledge-spec-drift` (or any equivalent
45
+ `npx`/`node`/`bun` invocation). Any such self-invocation is a
46
+ control-bypass and will be refused by the runtime guardrails.
47
+ - Do NOT proceed with implementation until the user resolves the staleness.
48
+ - When re-saving a plan in response to spec drift, save_plan REQUIRES that ANY task
49
+ present in the prior plan but absent from the new args.phases be enumerated
50
+ in removed_task_ids with a removal_reason. save_plan will reject the call
51
+ otherwise (PLAN_TASK_REMOVAL_NOT_ACKNOWLEDGED). Tasks not yet finished
52
+ (status: pending, in_progress, blocked) MUST NOT be removed without explicit
53
+ user confirmation — surface the list to the user and ask before populating
54
+ removed_task_ids.
55
+ - While .swarm/spec-staleness.json exists, the runtime STRUCTURALLY BLOCKS the
56
+ following tools (SPEC_DRIFT_BLOCKED_TOOLS): save_plan, update_task_status,
57
+ phase_complete, lean_turbo_run_phase, lean_turbo_acquire_locks. If a call
58
+ returns SPEC_DRIFT_BLOCK, do NOT retry; surface the drift to the user and
59
+ WAIT for them to run /swarm clarify or /swarm acknowledge-spec-drift.
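
The task-removal rule in 6k can be sketched as a pre-check before calling save_plan. The input shape (a mapping of task ID to status) and the `completed` status in the usage example are assumptions; the protocol lists only the unfinished statuses.

```python
UNFINISHED_STATUSES = {"pending", "in_progress", "blocked"}


def removed_tasks(prior_tasks, new_task_ids):
    """Split dropped tasks into (all_removed_ids, ids_needing_user_confirmation).

    prior_tasks maps task ID -> status; new_task_ids is the set of IDs in the new plan.
    """
    removed = [task_id for task_id in prior_tasks if task_id not in new_task_ids]
    needs_confirmation = [t for t in removed if prior_tasks[t] in UNFINISHED_STATUSES]
    return removed, needs_confirmation
```

Every ID in the first list would go into `removed_task_ids`; anything in the second list should be shown to the user before that happens.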
@@ -0,0 +1,142 @@
1
+ ---
2
+ name: deep-dive
3
+ description: >
4
+ Full execution protocol for MODE: DEEP_DIVE — read-only codebase audit with
5
+ parallel explorer waves, 2 independent reviewers, and sequential critic
6
+ challenge for HIGH/CRITICAL findings. Loaded on demand by the architect when
7
+ the deep-dive command emits a [MODE: DEEP_DIVE ...] signal.
8
+ ---
9
+
10
+ # Deep Dive Audit Protocol
11
+
12
+ Read-only deep audit of a specified codebase scope using parallel explorer waves, always 2 parallel reviewers, and sequential critic challenge. This mode does NOT mutate source code, does NOT delegate to coder, and does NOT call declare_scope.
13
+
14
+ ### MODE: DEEP_DIVE
15
+
16
+ ## Step 0 — Parse Header
17
+
18
+ Parse the MODE: DEEP_DIVE header to extract:
19
+ - `scope`: the codebase area to audit (e.g., "auth", "payment flow", "src/hooks/")
20
+ - `profile`: one of standard | security | ux | architecture | full (default: standard)
21
+ - `max_explorers`: integer 1..8 (default: 6, or 8 for full profile)
22
+ - `output`: markdown | json (default: markdown)
23
+ - `update_main`: boolean (default: true) — whether to fetch/ff-only main before starting
24
+ - `allow_dirty`: boolean (default: false) — whether to proceed with uncommitted changes
25
+
26
+ If the header is malformed or missing required fields, report the error and stop.
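
A parser for Step 0 might look like the sketch below. The header's exact syntax is not shown in this diff, so the `key=value` form (with optional double quotes) is an assumption, as is treating `scope` as the only required field; the defaults follow the list above.

```python
import re

PROFILES = {"standard", "security", "ux", "architecture", "full"}
OUTPUTS = {"markdown", "json"}


def parse_deep_dive_header(header):
    """Parse a header assumed to look like: [MODE: DEEP_DIVE scope="auth" profile=full]."""
    match = re.fullmatch(r"\[MODE: DEEP_DIVE\s+(.*)\]", header.strip())
    if not match:
        raise ValueError("malformed DEEP_DIVE header")
    pairs = re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', match.group(1))
    fields = {key: (quoted if quoted else bare) for key, quoted, bare in pairs}
    if not fields.get("scope"):
        raise ValueError("missing required field: scope")
    profile = fields.get("profile", "standard")
    if profile not in PROFILES:
        raise ValueError("unknown profile: " + profile)
    output = fields.get("output", "markdown")
    if output not in OUTPUTS:
        raise ValueError("unknown output: " + output)
    # Default explorer count depends on the profile: 8 for full, else 6.
    max_explorers = int(fields.get("max_explorers", 8 if profile == "full" else 6))
    if not 1 <= max_explorers <= 8:
        raise ValueError("max_explorers must be in 1..8")
    return {
        "scope": fields["scope"],
        "profile": profile,
        "max_explorers": max_explorers,
        "output": output,
        "update_main": fields.get("update_main", "true") == "true",
        "allow_dirty": fields.get("allow_dirty", "false") == "true",
    }
```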
27
+
28
+ ## Step 1 — Repo Readiness
29
+
30
+ 1. Check git working tree status. If dirty and `allow_dirty` is false, warn the user and ask whether to proceed. Do NOT proceed automatically.
31
+ 2. If `update_main` is true and tree is clean: check current branch. If not on `main`, report current branch to user and ASK FOR CONFIRMATION before switching. Only after explicit user approval: `git fetch origin main && git checkout main && git merge --ff-only origin/main`. If ff-only fails, warn the user and ask before proceeding.
32
+ 3. Record the current HEAD commit hash for the report.
33
+
34
+ ## Step 2 — Scope Resolution
35
+
36
+ Use the following tools to map the audit scope:
37
+ 1. `repo_map` with action "build" to establish the code graph
38
+ 2. `repo_map` with action "localization" for the scope target
39
+ 3. `symbols` and `batch_symbols` on key files identified by localization
40
+ 4. `imports` to trace dependency boundaries
41
+ 5. `doc_scan` if documentation coverage is relevant
42
+ 6. `knowledge_recall` with query matching the scope domain
43
+
44
+ Produce a SCOPE MAP: list of files, modules, and interfaces within the audit boundary. Cap at 50 files total.
45
+
46
+ ## Step 3 — Explorer Missions (Parallel Waves)
47
+
48
+ Dispatch explorer waves using parallel Task calls. Each wave contains up to `max_explorers` missions.
49
+
50
+ **File caps per mission:**
51
+ - 8 files maximum per mission
52
+ - ~3500 total lines across all files in a mission
53
+ - Group files by import proximity (files that import each other go in the same mission)
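
The file caps can be sketched as a greedy packer. It assumes the input is already ordered by import proximity, so neighbouring files land in the same mission; computing that ordering is out of scope here. Because the line cap is approximate ("~3500"), a single oversized file still gets a mission of its own.

```python
MAX_FILES_PER_MISSION = 8
MAX_LINES_PER_MISSION = 3500


def pack_missions(files):
    """Greedily pack (path, line_count) pairs, pre-ordered by import proximity."""
    missions, current, current_lines = [], [], 0
    for path, line_count in files:
        over_files = len(current) >= MAX_FILES_PER_MISSION
        over_lines = bool(current) and current_lines + line_count > MAX_LINES_PER_MISSION
        if over_files or over_lines:
            # Close the current mission and start a new one.
            missions.append(current)
            current, current_lines = [], 0
        current.append(path)
        current_lines += line_count
    if current:
        missions.append(current)
    return missions
```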
54
+
55
+ **Profile-based lane selection — each profile activates specific lanes:**
56
+
57
+ | Lane | Template | standard | security | ux | architecture | full |
+ |------|----------|----------|----------|----|--------------|------|
+ | SCOPE_MAP | Map structure, exports, boundaries | ✓ | ✓ | ✓ | ✓ | ✓ |
+ | WIRING_DATAFLOW | Trace data flow, API contracts, state propagation | ✓ | ✓ | | ✓ | ✓ |
+ | RUNTIME_BEHAVIOR | Error handling, edge cases, lifecycle, async patterns | ✓ | | | ✓ | ✓ |
+ | UX_FLOW | User-facing behavior, accessibility, responsiveness | | | ✓ | | ✓ |
+ | SECURITY_TRUST | Auth boundaries, input validation, trust transitions | | ✓ | | | ✓ |
+ | TEST_COVERAGE | Coverage gaps, flaky tests, missing assertions | ✓ | | | | ✓ |
+ | PERFORMANCE_RELIABILITY | Resource leaks, N+1 queries, race conditions | | | | ✓ | ✓ |
+ | DOCS_CONFIG_DEPLOYMENT | Config consistency, docs accuracy, deployment drift | | | | | ✓ |
67
+
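
Read as data, the lane table amounts to a lookup from profile to active lanes. The mapping below is transcribed from the table; `lanes_for` is a hypothetical helper name.

```python
ALL_LANES = [
    "SCOPE_MAP",
    "WIRING_DATAFLOW",
    "RUNTIME_BEHAVIOR",
    "UX_FLOW",
    "SECURITY_TRUST",
    "TEST_COVERAGE",
    "PERFORMANCE_RELIABILITY",
    "DOCS_CONFIG_DEPLOYMENT",
]

# One entry per profile column; lanes appear in table row order.
LANES_BY_PROFILE = {
    "standard": ["SCOPE_MAP", "WIRING_DATAFLOW", "RUNTIME_BEHAVIOR", "TEST_COVERAGE"],
    "security": ["SCOPE_MAP", "WIRING_DATAFLOW", "SECURITY_TRUST"],
    "ux": ["SCOPE_MAP", "UX_FLOW"],
    "architecture": ["SCOPE_MAP", "WIRING_DATAFLOW", "RUNTIME_BEHAVIOR", "PERFORMANCE_RELIABILITY"],
    "full": list(ALL_LANES),
}


def lanes_for(profile="standard"):
    """Return a copy of the lanes activated by the given profile."""
    return list(LANES_BY_PROFILE[profile])
```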
68
+ Each explorer mission receives:
69
+ - Lane template name and description
70
+ - Assigned files (8 max, grouped by import proximity)
71
+ - The scope map context from Step 2
72
+ - Instruction: "You are performing a [LANE] audit. Report findings as candidate observations with severity (INFO/LOW/MEDIUM/HIGH/CRITICAL), location, and evidence."
73
+
74
+ Explorer missions are dispatched in parallel waves. Wait for ALL missions in a wave to complete before dispatching the next wave.
75
+
76
+ Explorers generate CANDIDATE FINDINGS only — they do NOT make verdicts. All findings are unverified until Step 5.
77
+
78
+ ## Step 4 — Normalize Candidates
79
+
80
+ 1. Collect all candidate findings from all explorer missions.
81
+ 2. Deduplicate: merge findings that reference the same location and issue.
82
+ 3. Assign DD-C001 through DD-CNNN identifiers to unique findings.
83
+ 4. Cap at 10 findings per shard (see Step 5 for sharding).
84
+ 5. Sort by severity (CRITICAL → HIGH → MEDIUM → LOW → INFO).
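
The normalization steps can be sketched as below. The candidate shape (`location`, `issue`, `severity` keys) is an assumption, and so is keeping the more severe report when two candidates collide; the protocol says only to merge them. IDs are assigned in first-seen order before sorting, following the step order above.

```python
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"]


def normalize_candidates(candidates):
    """Deduplicate by (location, issue), assign DD-C IDs, then sort by severity."""
    unique = {}
    for candidate in candidates:
        key = (candidate["location"], candidate["issue"])
        kept = unique.get(key)
        # On a duplicate, keep the more severe report (an assumption of this sketch).
        if kept is None or SEVERITY_ORDER.index(candidate["severity"]) < SEVERITY_ORDER.index(kept["severity"]):
            unique[key] = dict(candidate)
    findings = list(unique.values())
    for number, finding in enumerate(findings, start=1):
        finding["id"] = f"DD-C{number:03d}"
    return sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
```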
85
+
86
+ ## Step 5 — Always 2 Parallel Reviewers
87
+
88
+ Split the normalized candidates into 2 shards of ≤10 candidates each. Dispatch 2 parallel `the active swarm's reviewer agent` calls.
89
+
90
+ Each reviewer receives:
91
+ - Their shard of candidates (up to 10)
92
+ - The scope map context
93
+ - The original scope description
94
+ - Instruction: "Verify or reject each candidate finding. For each: verdict (VERIFIED / REJECTED / NEEDS_MORE_EVIDENCE), confidence (0-1), and brief reasoning."
95
+
96
+ Reviewers MUST NOT suggest fixes — they verify findings only.
97
+
98
+ ## Step 5b — Reviewer Merge/Dedup
99
+
100
+ After both reviewers return, perform a lightweight sync pass:
101
+ 1. Cross-reference findings between reviewers — flag correlations
102
+ 2. Deduplicate any findings both reviewers verified independently
103
+ 3. For NEEDS_MORE_EVIDENCE findings: if the other reviewer verified a related finding, merge
104
+ 4. Produce a unified findings list with verified/rejected status
105
+
106
+ ## Step 6 — Critic Challenge (HIGH/CRITICAL only)
107
+
108
+ For verified findings rated HIGH or CRITICAL, dispatch sequential critic passes:
109
+
110
+ **Pass 1 — False-positive / root-cause challenge:**
111
+ - `the active swarm's critic agent` receives each HIGH/CRITICAL finding
112
+ - Challenge: "Is this a false positive? Is the root cause correctly identified? Provide verdict: SURVIVES / DOWNGRADE / REJECT"
113
+ - Only findings that SURVIVE proceed to Pass 2
114
+
115
+ **Pass 2 — Impact / severity challenge:**
116
+ - `the active swarm's critic agent` receives surviving findings
117
+ - Challenge: "Is the severity correctly rated? Could this be lower impact than claimed? Provide verdict: SURVIVES / DOWNGRADE / REJECT"
118
+ - Final severity is the critic's assessed severity
119
+
120
+ CRITICAL: Do NOT challenge MEDIUM/LOW/INFO findings. Only HIGH and CRITICAL go through critic review.
121
+
122
+ ## Step 7 — Final Report
123
+
124
+ Assemble and present the audit report:
125
+
126
+ 1. **Wiring Map**: Visual summary of the scope's module structure and data flow
127
+ 2. **Functionality Assessment**: High-level summary of what the scope does and how well
128
+ 3. **Verified Findings Table**: DD-ID, severity, location, description, evidence
129
+ 4. **Rejected Candidates**: Brief list with rejection reasons
130
+ 5. **Enhancements**: Non-blocking improvement suggestions
131
+ 6. **Recommended Implementation Phases**: If findings suggest follow-up work, outline phases
132
+ 7. **JSON Block** (when output=json): Structured machine-readable findings
133
+
134
+ ## Important Constraints
135
+
136
+ - Do NOT mutate source code under any circumstances
137
+ - Do NOT delegate to coder
138
+ - Do NOT call declare_scope
139
+ - Do NOT create or modify any files outside .swarm/
140
+ - No final finding may appear in the report without reviewer verification
141
+ - Explorers generate candidate findings only — reviewers verify or reject
142
+ - Critics challenge only HIGH/CRITICAL findings — do NOT waste cycles on lower severity
@@ -0,0 +1,81 @@
1
+ ---
2
+ name: design-docs
3
+ description: >
4
+ Full execution protocol for MODE: DESIGN_DOCS — generate or sync structured,
5
+ language-agnostic design docs (domain.md, technical-spec.md, behavior-spec.md,
6
+ reference/) for the project under build, with a stable section-ID registry and
7
+ a design changelog. Loaded on demand by the architect when the design-docs
8
+ command emits a [MODE: DESIGN_DOCS ...] signal (issue #1080).
9
+ ---
10
+
11
+ # Design-Doc Generation & Sync Protocol
12
+
13
+ Generate or maintain the project's structured design documentation. The work is delegated to the `docs_design` agent (a design-doc-author role variant of the docs agent). This mode authors a fixed set of version-controlled docs in the **target project repo** (NOT under `.swarm/`). It does NOT modify source code, does NOT call `declare_scope`, and does NOT touch `.swarm/spec.md`, `CHANGELOG.md`, or `docs/releases/pending/*`.
14
+
15
+ ### MODE: DESIGN_DOCS
16
+
17
+ ## Step 0 — Parse Header
18
+
19
+ Parse the `[MODE: DESIGN_DOCS ...]` header to extract:
20
+ - `out`: output directory, project-relative (default `docs`)
21
+ - `lang`: target language for `reference/` docs, or `auto` (default `auto`)
22
+ - `update`: boolean — `true` = sync existing docs to current code/spec; `false` = generate fresh
23
+ - the trailing free text = the system description (required when `update=false`)
24
+
25
+ If the header is malformed, report the error and stop.
26
+
27
+ ## Step 1 — Preconditions
28
+
29
+ 1. Confirm `design_docs.enabled` is true (the `docs_design` agent only exists when enabled). If it is not, tell the user to set `design_docs.enabled: true` in `opencode-swarm.json` and stop.
30
+ 2. If a spec-staleness block is active (`.swarm/spec-staleness.json` present), resolve/acknowledge spec staleness FIRST — otherwise design-doc writes may be blocked by the guardrail. Do not blindly retry on `SPEC_STALENESS_BLOCK`.
31
+ 3. Read `.swarm/spec.md` if present — it is the authoritative requirements source (FR-### IDs). The design docs must be consistent with it.
32
+
33
+ ## Step 2 — Index Existing State (always)
34
+
35
+ Have the `docs_design` agent (or `doc_scan`) index `<out>/` to discover any existing design docs. If `<out>/reference/traceability.json` exists, it is the section-ID registry — load it. Existing section IDs MUST be preserved on regeneration.
36
+
37
+ ## Step 3 — Generate or Sync
38
+
39
+ Dispatch the **`docs_design`** agent (the active swarm's `docs_design` — never the standard `docs` agent) with:
40
+ - `TASK`, `MODE` (generate|sync), `OUT_DIR`, `LANGUAGE`
41
+ - For sync: `FILES CHANGED` and `CHANGES SUMMARY` from the current phase/diff
42
+ - `SKILLS: file:.opencode/skills/design-docs/SKILL.md` (this skill)
43
+
44
+ The agent owns exactly these files under `<out>` and creates NOTHING else:
45
+
46
+ ```
+ <out>/
+ ├── domain.md            # 100% language-agnostic. Entities in neutral notation
+ │                        # (field: type-class), domain invariants. ZERO framework
+ │                        # names in normative text. Section IDs: D-###
+ ├── technical-spec.md    # Language-agnostic architecture: layers, dependency rules,
+ │                        # contract SHAPES (inputs→outputs→error-kinds), algorithms,
+ │                        # invariants. + the traceability table. Section IDs: S-###
+ ├── behavior-spec.md     # 100% language-agnostic Given/When/Then specs. IDs: B-###
+ ├── design-changelog.md  # Keep-a-Changelog log of design-doc changes (NOT release notes)
+ └── reference/           # ALL [INCIDENTAL] language/framework-specific material here.
+     ├── reference-impl.md   # Exact signatures, CLI strings, SQL, code. Mapped to
+     │                       # spec sections by ID. Section IDs: R-###
+     ├── idiom-notes.md      # "Here is how the reference solved X" — examples only.
+     └── traceability.json   # Machine-readable section-ID registry (source of truth)
+ ```
62
+
63
+ ## Step 4 — Invariants the docs MUST satisfy
64
+
65
+ - **Language-agnostic normative text**: `domain.md`, `technical-spec.md`, and `behavior-spec.md` contain ZERO framework/library/language names in normative content. All such material lives ONLY in `reference/`.
66
+ - **Version header** on every doc:
67
+ `<!-- design-doc: <name> version: <phase-or-counter> generated: <ISO-8601> spec-hash: <8 chars> -->`
68
+ - **Stable section IDs**: assigned once, never renumbered. `D-###` domain, `S-###` technical-spec, `B-###` behavior-spec, `R-###` reference. On sync, reuse every existing ID; mint new IDs only for genuinely new sections.
69
+ - **Traceability footer** ending each section: `> Traceability: FR-012, FR-013 | invariant: <id-or-none>`.
70
+ - **traceability.json** kept in sync: `{ "schema_version": 1, "sections": [ { "section_id", "doc", "title", "spec_frs": [], "invariants": [], "code_anchors": [] } ] }`. `technical-spec.md` renders a human-readable mirror table `| Doc Section | Spec FR | Invariant | Code anchors |`.
71
+ - **design-changelog.md**: append one entry per generate/sync under `## [Unreleased]` (Added/Changed/Removed), e.g. `- <ISO date> phase <N>: <sections touched> (<FR refs>)`. This file is SEPARATE from release-please artifacts — never edit `CHANGELOG.md` or `docs/releases/pending/*` here.
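
The version header and traceability footer can be produced with two small helpers. The header and footer formats follow the invariants above, but how `spec-hash` is computed is not stated in this diff: taking the first 8 hex characters of a SHA-256 digest of the spec text is purely an assumption, and both function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def version_header(name, version, spec_text, now=None):
    """Build the version header comment (spec-hash derivation is assumed)."""
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    spec_hash = hashlib.sha256(spec_text.encode("utf-8")).hexdigest()[:8]
    return f"<!-- design-doc: {name} version: {version} generated: {stamp} spec-hash: {spec_hash} -->"


def traceability_footer(spec_frs, invariant=None):
    """Build the footer line that ends each section."""
    return f"> Traceability: {', '.join(spec_frs)} | invariant: {invariant or 'none'}"
```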
72
+
73
+ ## Step 5 — Verify & Report
74
+
75
+ 1. Confirm the agent created/updated only the allowed files and `traceability.json` is consistent with the docs.
76
+ 2. Confirm no normative doc names a framework (spot-check) and every section has an ID + traceability footer.
77
+ 3. Report `UPDATED` / `ADDED` / `REMOVED` / `SUMMARY` back to the user.
78
+
79
+ ## Notes on the PHASE-WRAP sync path
80
+
81
+ During PHASE-WRAP, the deterministic design-doc drift check (`runDesignDocDriftCheck`) writes `.swarm/doc-drift-phase-N.json`. If the verdict is `DOC_STALE` and `design_docs.enabled`, dispatch `docs_design` in **sync** mode for the affected sections only, then append a design-changelog entry. This is advisory and non-blocking — never block phase completion on design-doc lag.
@@ -0,0 +1,20 @@
1
+ ---
2
+ name: discover
3
+ description: >
4
+ Full execution protocol for MODE: DISCOVER -- read-only repository discovery and governance/context mapping.
5
+ ---
6
+
7
+ # Discover Protocol
8
+
9
+ This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
10
+
11
+ ### MODE: DISCOVER
12
+ Delegate to the active swarm's explorer agent. Wait for response.
13
+ For complex tasks, make a second explorer call focused on risk/gap analysis:
14
+ - Hidden requirements, unstated assumptions, scope risks
15
+ - Existing patterns that the implementation must follow
16
+ After explorer returns:
17
+ - Run `symbols` tool on key files identified by explorer to understand public API surfaces
18
+ - For multi-file module surveys: prefer `batch_symbols` over sequential single-file symbols calls
19
+ - Run `complexity_hotspots` if not already run in Phase 0 (check context.md for existing analysis). Note modules with recommendation "security_review" or "full_gates" in context.md.
20
+ - Check for project governance files using the `glob` tool with patterns `project-instructions.md`, `docs/project-instructions.md`, `CONTRIBUTING.md`, and `INSTRUCTIONS.md` (checked in that priority order — first match wins). If a file is found: read it and extract all MUST (mandatory constraints) and SHOULD (recommended practices) rules. Write the extracted rules as a summary to `.swarm/context.md` under a `## Project Governance` section — append if the section already exists, create it if not. If no MUST or SHOULD rules are found in the file, skip writing. If no governance file is found: skip silently. Existing DISCOVER steps are unchanged.
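
The governance lookup can be sketched as a first-match search followed by a keyword scan. This uses direct path checks rather than the `glob` tool, and it treats any line containing the uppercase word MUST or SHOULD as a rule; both are simplifying assumptions, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Priority order from the protocol: first match wins.
GOVERNANCE_CANDIDATES = [
    "project-instructions.md",
    "docs/project-instructions.md",
    "CONTRIBUTING.md",
    "INSTRUCTIONS.md",
]


def find_governance_file(root):
    """Return the first candidate that exists under root, or None."""
    for relative in GOVERNANCE_CANDIDATES:
        candidate = Path(root) / relative
        if candidate.is_file():
            return candidate
    return None


def extract_rules(text):
    """Collect lines containing the uppercase keywords MUST or SHOULD."""
    rules = {"MUST": [], "SHOULD": []}
    for line in text.splitlines():
        for keyword in rules:
            if re.search(rf"\b{keyword}\b", line):
                rules[keyword].append(line.strip())
    return rules
```

If both lists come back empty, the protocol says to skip writing the `## Project Governance` section.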
@@ -0,0 +1,191 @@
1
+ ---
2
+ name: execute
3
+ description: >
4
+ Full execution protocol for MODE: EXECUTE -- task execution, coder retry handling, QA gates, completion evidence, and per-task closure.
5
+ ---
6
+
7
+ # Execute Protocol
8
+
9
+ This protocol is loaded on demand by the architect stub in src/agents/architect.ts. The architect prompt keeps only activation, action, and hard safety constraints; the full execution details live here.
10
+
11
+ ### MODE: EXECUTE
12
+ For each task (respecting dependencies):
13
+
14
+ RETRY PROTOCOL — when returning to coder after any gate failure:
15
+ 1. Provide structured rejection: "GATE FAILED: [gate name] | REASON: [details] | REQUIRED FIX: [specific action required]"
16
+ 2. Re-enter at step 5b (the active swarm's coder agent) with full failure context
17
+ 3. Resume execution at the failed step (do not restart from 5a)
18
+ Exception: if coder modified files outside the original task scope, restart from step 5c
19
+ 4. Gates already PASSED may be skipped on retry if their input files are unchanged
20
+ 5. Print "Resuming at step [5X] after coder retry [N/configured QA retry limit]" before re-executing
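
Two parts of the retry protocol can be sketched as below: the structured rejection string from item 1 and the skip rule from item 4. The gate record shape (`name`, `inputs`, `passed`) and the gate names in the usage example are hypothetical.

```python
def gate_rejection(gate, reason, required_fix):
    """Format the structured rejection handed back to the coder."""
    return f"GATE FAILED: {gate} | REASON: {reason} | REQUIRED FIX: {required_fix}"


def gates_to_rerun(gates, changed_files):
    """Re-run a gate unless it already passed and none of its input files changed."""
    changed = set(changed_files)
    return [
        gate["name"]
        for gate in gates
        if not gate["passed"] or changed.intersection(gate["inputs"])
    ]
```

For example, with gates `syntax` (inputs `a.ts`, passed), `lint` (inputs `b.ts`, passed), and `tests` (inputs `a.ts` and `b.ts`, failed), a retry that touches only `a.ts` would re-run `syntax` and `tests` and skip `lint`.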
21
+
22
+ GATE FAILURE RESPONSE RULES — when ANY gate returns a failure:
23
+ You MUST return to the active swarm's coder agent. You MUST NOT fix the code yourself.
24
+
25
+ WRONG responses to gate failure:
26
+ ✗ Editing the file yourself to fix the syntax error
27
+ ✗ Running a tool to auto-fix and moving on without coder
28
+ ✗ "Installing" or "configuring" tools to work around the failure
29
+ ✗ Treating the failure as an environment issue and proceeding
30
+ ✗ Deciding the failure is a false positive and skipping the gate
31
+
32
+ RIGHT response to gate failure:
33
+ ✓ Print "GATE FAILED: [gate name] | REASON: [details]"
34
+ ✓ BEFORE the retry delegation: call `declare_scope` with the file list the retry will touch. Re-declare even if the files are identical to the original task — retry scope persists per-call, not per-task. See Rule 1a.
35
+ ✓ Delegate to the active swarm's coder agent with:
36
+ TASK: Fix [gate name] failure
37
+ FILE: [affected file(s)]
38
+ INPUT: [exact error output from the gate]
39
+ CONSTRAINT: Fix ONLY the reported issue, do not modify other code
40
+ ✓ After coder returns, re-run the failed gate from the step that failed
41
+ ✓ Print "Coder attempt [N/configured QA retry limit] on task [X.Y]"
42
+
43
+ The ONLY exception: lint tool in fix mode (step 5g) auto-corrects by design.
44
+ All other gates: failure → return to coder. No self-fixes. No workarounds.
45
+
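The rejection and delegation formats above are plain strings, so they can be assembled mechanically. The following is a minimal sketch only: `GateFailure`, `formatRejection`, `formatCoderDelegation`, and `formatAttempt` are hypothetical names chosen for illustration and are not part of opencode-swarm.

```typescript
// Hypothetical shapes and helpers: only the output formats come from the protocol text.
interface GateFailure {
  gate: string;        // e.g. "syntax_check"
  reason: string;      // exact error output from the gate
  requiredFix: string; // specific action the coder must take
  files: string[];     // affected file(s)
}

// Structured rejection line from RETRY PROTOCOL step 1.
function formatRejection(f: GateFailure): string {
  return `GATE FAILED: ${f.gate} | REASON: ${f.reason} | REQUIRED FIX: ${f.requiredFix}`;
}

// Delegation payload from the "RIGHT response" block.
function formatCoderDelegation(f: GateFailure): string {
  return [
    `TASK: Fix ${f.gate} failure`,
    `FILE: ${f.files.join(", ")}`,
    `INPUT: ${f.reason}`,
    "CONSTRAINT: Fix ONLY the reported issue, do not modify other code",
  ].join("\n");
}

// Progress line printed after the coder returns.
function formatAttempt(attempt: number, retryLimit: number, taskId: string): string {
  return `Coder attempt ${attempt}/${retryLimit} on task ${taskId}`;
}
```

Keeping the formatting in one place makes it harder to drop a field such as REQUIRED FIX when a retry is issued under time pressure.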
46
+ 5a. **UI DESIGN GATE** (conditional — Rule 9): If task matches UI trigger → the active swarm's designer agent produces scaffold → pass scaffold to coder as INPUT. If no match → skip.
47
+
48
+ → After step 5a (or immediately if no UI task applies): Call update_task_status with status in_progress for the current task. Then proceed to step 5b.
49
+
50
+ 5a-bis. **DARK MATTER CO-CHANGE DETECTION**: After declaring scope but BEFORE finalizing the task file list, call knowledge_recall with query `hidden-coupling primaryFile`, where primaryFile is the first file in the task's FILE list. If results are found, add those files to the task's AFFECTS scope with a BLAST RADIUS note. If there are no results or knowledge_recall is unavailable, proceed without adding files. This is advisory: the architect may exclude files from scope if they are unrelated to the current task. Delegate to the active swarm's coder agent only after scope is declared.
51
+
52
+ 5b-PRE (required): Call `declare_scope({ taskId, files })` with the EXACT file list for this task — including any co-change files surfaced by 5a-bis. Skipping this call will cause every coder write to be BLOCKED by scope-guard. No `declare_scope` → no 5b delegation. See Rule 1a.
53
+ 5b-BASE (required, once per task): Call `sast_scan` with `{ capture_baseline: true, phase: <N>, changed_files: <files from 5b-PRE> }` where `<N>` is the current phase number (extract from current task ID: task "3.2" → phase 3, task "1.5" → phase 1). The tool maintains `.swarm/evidence/{phase}/sast-baseline.json` as a phase-scoped, incrementally merged baseline of pre-existing SAST findings. Calling twice for the same files is safe (idempotent merge). Do NOT re-capture mid-task.
54
+ → REQUIRED: Print "sast-baseline: [WRITTEN — N fingerprints | MERGED — N fingerprints | SKIPPED — gate disabled | ERROR — details]"
55
+ → Subsequent `pre_check_batch` calls with `phase: <N>` will automatically diff against this baseline — only NEW findings (not in baseline) drive the fail verdict.
56
+ 5b. the active swarm's coder agent - Implement (if designer scaffold produced, include it as INPUT).
57
+ 5c. Run `diff` tool. If `hasContractChanges` → the active swarm's explorer agent integration analysis. If COMPATIBILITY SIGNALS=INCOMPATIBLE or MIGRATION_SURFACE=yes → coder retry. If COMPATIBILITY SIGNALS=COMPATIBLE and MIGRATION_SURFACE=no → proceed.
58
+ → REQUIRED: Print "diff: [PASS | CONTRACT CHANGE — details]"
59
+ 5d. Run `syntax_check` tool. SYNTACTIC ERRORS → return to coder. NO ERRORS → proceed to placeholder_scan.
60
+ → REQUIRED: Print "syntaxcheck: [PASS | FAIL — N errors]"
61
+ 5e. Run `placeholder_scan` tool. PLACEHOLDER FINDINGS → return to coder. NO FINDINGS → proceed to imports.
62
+ → REQUIRED: Print "placeholderscan: [PASS | FAIL — N findings]"
63
+ 5f. Run `imports` tool for dependency audit. ISSUES → return to coder.
64
+ → REQUIRED: Print "imports: [PASS | ISSUES — details]"
65
+ 5g. Run `lint` tool with fix mode for auto-fixes. If issues remain → run `lint` tool with check mode. FAIL → return to coder.
66
+ → REQUIRED: Print "lint: [PASS | FAIL — details]"
67
+ 5h. Run `build_check` tool. BUILD FAILS → return to coder. SUCCESS → proceed to pre_check_batch.
68
+ → REQUIRED: Print "buildcheck: [PASS | FAIL | SKIPPED — no toolchain]"
69
+ 5i. Run `pre_check_batch` tool with `phase: <N>` (same phase number used in 5b-BASE) → runs four verification tools in parallel (max 4 concurrent):
70
+ - lint:check (code quality verification)
71
+ - secretscan (secret detection)
72
+ - sast_scan (static security analysis — diffs against phase baseline when phase provided)
73
+ - quality_budget (maintainability metrics)
74
+ → Returns { gates_passed, lint, secretscan, sast_scan, quality_budget, total_duration_ms }
75
+ → sast_scan result may include { new_findings, pre_existing_findings, baseline_used } when baseline diff is active.
76
+ → If ALL FOUR tools have ran === false (lint.ran === false && secretscan.ran === false && sast_scan.ran === false && quality_budget.ran === false):
77
+ → This is a SKIP - no tools actually ran. Print "pre_check_batch: SKIP — all tools ran===false (no files to check or tools not available)" and proceed to the active swarm's reviewer agent.
78
+ → Else if gates_passed === false: read individual tool results, identify which tool(s) failed, return structured rejection to the active swarm's coder agent with specific tool failures. Do NOT call the active swarm's reviewer agent.
79
+ → If gates_passed === true AND sast_preexisting_findings is present: proceed to the active swarm's reviewer agent. Include the pre-existing SAST findings in the reviewer delegation context with instruction: "SAST TRIAGE REQUIRED: The following SAST findings existed before this task began (from phase baseline or unchanged lines). Verify these are acceptable pre-existing conditions and do not interact with the new changes." Do NOT return to coder for pre-existing findings.
80
+ → If gates_passed === true (no sast_preexisting_findings): proceed to the active swarm's reviewer agent.
81
+ → REQUIRED: Print "pre_check_batch: [PASS — all gates passed | PASS — pre-existing SAST findings (N findings, reviewer triage) | FAIL — [gate]: [details]]"
82
+
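Steps 5b-BASE and 5i both take a phase number derived from the task ID (task "3.2" is phase 3). A minimal sketch of that derivation follows, assuming task IDs always have the two-part `<phase>.<task>` form shown in the examples; `phaseFromTaskId` and `sastBaselineArgs` are hypothetical helper names, and only the argument shape `{ capture_baseline, phase, changed_files }` comes from the protocol text.

```typescript
// Hypothetical helpers: only the sast_scan argument shape is taken from step 5b-BASE.
interface SastBaselineArgs {
  capture_baseline: true;
  phase: number;
  changed_files: string[];
}

// Extract the phase number from a task ID such as "3.2".
// Assumes the "<phase>.<task>" form; anything else is rejected.
function phaseFromTaskId(taskId: string): number {
  const match = /^(\d+)\.\d+$/.exec(taskId.trim());
  if (match === null) {
    throw new Error(`Unrecognized task ID: ${taskId}`);
  }
  return Number(match[1]);
}

// Build the arguments for the baseline capture, reusing the file list
// that was passed to declare_scope in step 5b-PRE.
function sastBaselineArgs(taskId: string, declaredFiles: string[]): SastBaselineArgs {
  return {
    capture_baseline: true,
    phase: phaseFromTaskId(taskId),
    changed_files: [...declaredFiles],
  };
}
```

Deriving the phase in one helper keeps the value passed to `pre_check_batch` in step 5i identical to the one used for the baseline, which is what the baseline diff depends on.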
83
+ ⚠️ pre_check_batch SCOPE BOUNDARY:
84
+ pre_check_batch runs FOUR automated tools: lint:check, secretscan, sast_scan, quality_budget.
85
+ pre_check_batch does NOT run and does NOT replace:
86
+ - the active swarm's reviewer agent (logic review, correctness, edge cases, maintainability)
87
+ - the active swarm's reviewer agent security-only pass (OWASP evaluation, auth/crypto review)
88
+ - the active swarm's test_engineer agent verification tests (functional correctness)
89
+ - the active swarm's test_engineer agent adversarial tests (attack vectors, boundary violations)
90
+ - diff tool (contract change detection)
91
+ - placeholder_scan (TODO/stub detection)
92
+ - imports (dependency audit)
93
+ gates_passed: true means "automated static checks passed."
94
+ It does NOT mean "code is reviewed." It does NOT mean "code is tested."
95
+ After pre_check_batch passes, you MUST STILL delegate to the active swarm's reviewer agent.
96
+ Treating pre_check_batch as a substitute for the active swarm's reviewer agent is a PROCESS VIOLATION.
97
+
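The branching on the `pre_check_batch` result in step 5i can be sketched as a pure function. This is an illustration under assumptions, not the plugin's implementation: the per-tool `passed` flag and the top-level placement of `sast_preexisting_findings` are guesses at the result shape, and `decidePreCheck` is a hypothetical name. Note that a SKIP or PASS still routes to the reviewer, matching the scope boundary above.

```typescript
// Assumed result shapes: `ran` and the top-level field names come from step 5i,
// while `passed` and the location of sast_preexisting_findings are hypothetical.
interface ToolResult {
  ran: boolean;
  passed?: boolean;
}

interface PreCheckResult {
  gates_passed: boolean;
  lint: ToolResult;
  secretscan: ToolResult;
  sast_scan: ToolResult;
  quality_budget: ToolResult;
  total_duration_ms: number;
  sast_preexisting_findings?: unknown[];
}

interface PreCheckDecision {
  verdict: "SKIP" | "FAIL" | "PASS";
  next: "reviewer" | "coder";
  failedTools: string[]; // filled only on FAIL
  sastTriage: boolean;   // true when the reviewer must triage pre-existing findings
}

function decidePreCheck(r: PreCheckResult): PreCheckDecision {
  const tools: Array<[string, ToolResult]> = [
    ["lint", r.lint],
    ["secretscan", r.secretscan],
    ["sast_scan", r.sast_scan],
    ["quality_budget", r.quality_budget],
  ];
  // All four tools report ran === false: nothing was checked, so this is a SKIP.
  if (tools.every(([, t]) => !t.ran)) {
    return { verdict: "SKIP", next: "reviewer", failedTools: [], sastTriage: false };
  }
  // Any gate failure goes back to the coder; the reviewer is not called.
  if (!r.gates_passed) {
    const failedTools = tools
      .filter(([, t]) => t.ran && t.passed === false)
      .map(([name]) => name);
    return { verdict: "FAIL", next: "coder", failedTools, sastTriage: false };
  }
  // Gates passed: always continue to the reviewer, flagging SAST triage if needed.
  const sastTriage = r.sast_preexisting_findings !== undefined;
  return { verdict: "PASS", next: "reviewer", failedTools: [], sastTriage };
}
```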
98
+ 5j. the active swarm's reviewer agent - General review. REJECTED before the configured QA retry limit → coder retry. REJECTED at the configured QA retry limit → escalate.
99
+ → REQUIRED: Print "reviewer: [APPROVED | REJECTED — reason]"
100
+ 5k. Security gate: if change matches TIER 3 criteria OR content contains SECURITY_KEYWORDS OR secretscan has ANY findings OR sast_scan has ANY findings at or above threshold → MUST delegate the active swarm's reviewer agent security-only review. REJECTED before the configured QA retry limit → coder retry. REJECTED at the configured QA retry limit → escalate to user.
101
+ → REQUIRED: Print "security-reviewer: [TRIGGERED | NOT TRIGGERED — reason]"
102
+ → If TRIGGERED: Print "security-reviewer: [APPROVED | REJECTED — reason]"
103
+ 5l. the active swarm's test_engineer agent - Verification tests. FAIL → coder retry from 5g.
104
+ → REQUIRED: Print "testengineer-verification: [PASS N/N | FAIL — details]"
105
+ 5l-bis. REGRESSION SWEEP (automatic after test_engineer-verification PASS):
106
+ Run test_runner with { scope: "graph", files: [<all source files changed by coder in this task>] }.
107
+ scope:"graph" traces imports to discover test files beyond the task's own tests that may be affected by this change.
108
+
109
+ Outcomes (based on test_runner result.outcome field):
110
+ - outcome: "pass" → All tests passed. Print "regression-sweep: PASS [N additional tests, M files]"
111
+ - outcome: "regression" → Tests ran but some failed. Print "regression-sweep: FAIL — REGRESSION DETECTED in [files]. The failing tests are CORRECT — fix the source code, not the tests." Return to coder with retry from 5g.
112
+ - outcome: "skip" → No test files resolved (nothing to run). Print "regression-sweep: SKIPPED — no related tests beyond task scope"
113
+ - outcome: "scope_exceeded" → Too many files for graph scope. Print "regression-sweep: SKIPPED — broad scope, no related tests beyond task scope"
114
+ - outcome: "error" → Tool error (timeout, no framework, etc.). Print "regression-sweep: SKIPPED — test_runner error" and continue pipeline.
115
+
116
+ IMPORTANT: The regression sweep runs test_runner DIRECTLY (architect calls the tool). Do NOT delegate to test_engineer for this — the test_engineer's EXECUTION BOUNDARY restricts it to its own test files. The architect has unrestricted test_runner access.
117
+ → REQUIRED: Print "regression-sweep: [PASS | FAIL — REGRESSION DETECTED | SKIPPED — no related tests | SKIPPED — broad scope | SKIPPED — test_runner error]"
118
+
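The five `test_runner` outcomes listed for the regression sweep collapse into three verdicts, and only one of them sends work back to the coder. A minimal sketch of that mapping follows; the outcome strings and the `{ scope: "graph", files }` arguments come from the text above, while `SweepDecision`, `sweepArgs`, and `handleSweep` are hypothetical names.

```typescript
// Outcome values are taken from step 5l-bis; everything else is illustrative.
type SweepOutcome = "pass" | "regression" | "skip" | "scope_exceeded" | "error";

interface SweepDecision {
  verdict: "PASS" | "FAIL" | "SKIPPED";
  returnToCoder: boolean;
  resumeAt?: string; // step label to retry from, set only on FAIL
}

// Arguments for the architect's direct test_runner call.
function sweepArgs(changedSourceFiles: string[]): { scope: "graph"; files: string[] } {
  return { scope: "graph", files: [...changedSourceFiles] };
}

function handleSweep(outcome: SweepOutcome): SweepDecision {
  switch (outcome) {
    case "pass":
      return { verdict: "PASS", returnToCoder: false };
    case "regression":
      // Failing tests are treated as correct: the source is fixed, not the tests.
      return { verdict: "FAIL", returnToCoder: true, resumeAt: "5g" };
    case "skip":
    case "scope_exceeded":
    case "error":
      // None of these block the pipeline.
      return { verdict: "SKIPPED", returnToCoder: false };
  }
}
```

Writing the mapping as an exhaustive `switch` over a union type means a newly added outcome value would surface as a compile-time error rather than silently falling through.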
119
+ 5l-ter. TEST DRIFT CHECK (conditional): Run this step if the change involves any drift-prone area:
120
+ - Command/CLI behavior changed (shell command wrappers, CLI interfaces)
121
+ - Parsing or routing logic changed (argument parsing, route matching, file resolution)
122
+ - User-visible output changed (formatted output, error messages, JSON response structure)
123
+ - Public contracts or schemas changed (API types, tool argument schemas, return types)
124
+ - Assertion-heavy areas where output strings are tested (command/help output tests, error message tests)
125
+ - Helper behavior or lifecycle semantics changed (state machines, lifecycle hooks, initialization)
126
+
127
+ If NOT triggered: Print "test-drift: NOT TRIGGERED — no drift-prone change detected"
128
+ If TRIGGERED:
129
+ - Use grep/search to find test files that cover the affected functionality
130
+ - Run those tests via test_runner with scope:"convention" on the related test files
131
+ - If any FAIL → print "test-drift: DRIFT DETECTED in [N] tests" and escalate to reviewer/test_engineer
132
+ - If all PASS → print "test-drift: [N] related tests verified"
133
+ - If no related tests found → print "test-drift: NO RELATED TESTS FOUND" (not a failure)
134
+ → REQUIRED: Print "test-drift: [TRIGGERED | NOT TRIGGERED — reason]" and "[DRIFT DETECTED in N tests | N related tests verified | NO RELATED TESTS FOUND | NOT TRIGGERED]"
135
+
136
+ 5l-quater. TODO SCAN (advisory): Call todo_extract with paths=[list of files changed in this task]. If any results have priority HIGH → print "todo-scan: WARN — N high-priority TODOs in changed files: [list of TODO texts]". If no high-priority results → print "todo-scan: CLEAN". This is advisory only and does NOT block the pipeline.
137
+ → REQUIRED: Print "todo-scan: [WARN — N high-priority TODOs | CLEAN]"
138
+
139
+ 5m. ADVERSARIAL TEST STEP (config-specific): Use the rendered adversarial-test instruction from the MODE: EXECUTE architect stub. If the stub omits step 5m, skip this step.
140
+ 5n. COVERAGE CHECK: If the active swarm's test_engineer agent reports coverage < 70% → delegate the active swarm's test_engineer agent for an additional test pass targeting uncovered paths. This is a soft guideline; use judgment for trivial tasks.
141
+
142
+ PRE-COMMIT RULE — Before ANY commit or push:
143
+ You MUST answer YES to ALL of the following:
144
+ [ ] Did the active swarm's reviewer agent run and return APPROVED? (not "I reviewed it" — the agent must have run)
145
+ [ ] Did the active swarm's test_engineer agent run and return PASS? (not "the code looks correct" — the agent must have run)
146
+ [ ] Did pre_check_batch run with gates_passed true?
147
+ [ ] Did the diff step run?
148
+ [ ] Did regression-sweep run (or SKIP with no related tests, broad scope, or test_runner error)?
149
+ [ ] Did test-drift check run (or NOT TRIGGERED)?
150
+
151
+ If ANY box is unchecked: DO NOT COMMIT. Return to step 5b.
152
+ There is no override. A commit without a completed QA gate is a workflow violation.
153
+
154
+ ## ROLE-BOUNDARY CHANGE VALIDATION (mandatory for prompt changes)
155
+ When a task modifies agent prompts (especially explorer, reviewer, critic, or any agent involved in the mapper/validator/challenge hierarchy), add an explicit test validation step:
156
+ - If new prompt contract tests exist (e.g., explorer-role-boundary.test.ts, explorer-consumer-contract.test.ts): Run them via test_runner
157
+ - If no specific tests exist for the changed prompt: Run test_runner with scope "convention" on the changed file
158
+ - Verify the new tests pass before completing the task
159
+
160
+ This step supplements (not replaces) the existing regression-sweep and test-drift checks. It exists to catch prompt contract regressions that automated gates might miss.
161
+
162
+ 5o. ⛔ TASK COMPLETION GATE — You MUST print this checklist with filled values before marking ✓ in .swarm/plan.md:
163
+ [TOOL] diff: PASS / SKIP — value: ___
164
+ [TOOL] syntax_check: PASS — value: ___
165
+ [TOOL] placeholder_scan: PASS — value: ___
166
+ [TOOL] imports: PASS — value: ___
167
+ [TOOL] lint: PASS — value: ___
168
+ [TOOL] build_check: PASS / SKIPPED — value: ___
169
+ [TOOL] pre_check_batch: PASS (lint:check ✓ secretscan ✓ sast_scan ✓ quality_budget ✓) — value: ___
170
+ [GATE] reviewer: APPROVED — value: ___
171
+ [GATE] reuse_re_verification: VERIFIED / SKIPPED / DUPLICATION_DETECTED — value: ___
172
+ [GATE] security-reviewer: APPROVED / SKIPPED — value: ___
173
+ [GATE] test_engineer-verification: PASS — value: ___
174
+ [GATE] regression-sweep: PASS / SKIPPED — value: ___
175
+ [GATE] test-drift: TRIGGERED / NOT TRIGGERED — value: ___
176
+ [GATE] test_engineer-adversarial: use the rendered checklist entry from the MODE: EXECUTE architect stub
177
+ [GATE] coverage: ≥70% / soft-skip — value: ___
178
+
179
+ You MUST NOT mark a task complete without printing this checklist with filled values.
180
+ You MUST NOT fill "PASS" or "APPROVED" for a gate you did not actually run — that is fabrication.
181
+ Any blank "value: ___" field = gate was not run = task is NOT complete.
182
+ Filling this checklist from memory ("I think I ran it") is INVALID. Each value must come from actual tool/agent output in this session.
183
+
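The rule that any blank `value: ___` field means the task is not complete lends itself to a simple check. The sketch below models the checklist as data rather than parsing the printed text; `ChecklistEntry`, `blankEntries`, and `isTaskComplete` are hypothetical names, and treating an empty string the same as `___` is an assumption.

```typescript
// Hypothetical model of the step 5o checklist; not part of opencode-swarm.
interface ChecklistEntry {
  kind: "TOOL" | "GATE";
  name: string;  // e.g. "syntax_check" or "reviewer"
  value: string; // copied from actual tool or agent output in this session
}

// Names of entries whose value was never filled in.
function blankEntries(entries: ChecklistEntry[]): string[] {
  return entries
    .filter((e) => e.value.trim() === "" || e.value.trim() === "___")
    .map((e) => e.name);
}

// A task may be marked complete only when every entry carries a value.
function isTaskComplete(entries: ChecklistEntry[]): boolean {
  return entries.length > 0 && blankEntries(entries).length === 0;
}
```

A check like this only catches blank fields. It cannot tell whether a filled value was fabricated, which is why the protocol separately requires each value to come from real output.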
184
+ 5p. Call update_task_status with status "completed".
185
+ 5q. OPTIONAL TASK-COMPLETION COMMIT POLICY: read `.swarm/context.md`.
186
+ - If `## Task Completion Commit Policy` contains `commit_after_each_completed_task: true`, immediately call:
187
+ `checkpoint save task-<task-id>-complete`
188
+ - If the section is absent or false, skip this step.
189
+ - This optional commit policy NEVER bypasses PRE-COMMIT RULE checks above.
190
+ - If checkpoint save fails with "duplicate label", the task was already checkpointed from a prior completion or retry. Silently skip — the existing checkpoint is valid.
191
+ 5r. Proceed to next task.
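Step 5q's policy lookup can be sketched as a function over the text of `.swarm/context.md`. This is a rough illustration that assumes the policy sits under a level-two heading and that the setting appears on a single line; the exact layout of the file is not specified in the protocol. `commitPolicyEnabled` and `checkpointLabel` are hypothetical names.

```typescript
// Hypothetical helpers: the heading text, setting name, and label format come from
// step 5q, while the line-based parsing is an assumption about the file layout.
function commitPolicyEnabled(contextMd: string): boolean {
  let inSection = false;
  for (const line of contextMd.split(/\r?\n/)) {
    // A level-two heading starts or ends the policy section.
    if (/^##\s/.test(line)) {
      inSection = line.trim() === "## Task Completion Commit Policy";
      continue;
    }
    if (inSection && line.includes("commit_after_each_completed_task: true")) {
      return true;
    }
  }
  // Section absent, or the setting is false or missing: skip the commit step.
  return false;
}

// Label passed to `checkpoint save` for a completed task.
function checkpointLabel(taskId: string): string {
  return `task-${taskId}-complete`;
}
```

Scoping the match to the named section avoids treating a stray mention of the setting elsewhere in the file as an opt-in.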