@mmerterden/multi-agent-pipeline 10.7.2 → 10.7.3

package/CHANGELOG.md CHANGED
@@ -14,6 +14,20 @@ Internal file-layout changes that don't affect the slash-command surface are sti
14
14
 
15
15
  ---
16
16
 
17
+ ## [10.7.3] - 2026-07-02
18
+
19
+ ### Changed
20
+
21
+ - **Zero external-tool residue.** Genericized every remaining editor/tool citation
22
+ in the pipeline's own files (Windsurf Cascade / Cursor Plan Mode / Cursor Bugbot /
23
+ Cline / Devin) across schemas, lib comments, refs, and the task-clarifier persona —
24
+ feature behavior is unchanged, only the third-party naming is dropped. (Vendored
25
+ `shared/external` knowledge and the `~/.codex` prune path in `update` are unaffected.)
26
+ - **README** gained a **Tokens & integrations** table (keychain-mapping model + the
27
+ services the pipeline talks to: Jira, GitHub, Bitbucket, Confluence, Figma, Fortify,
28
+ Firebase, Jenkins, npm) and an explicit **Platform support** section (macOS / Linux /
29
+ Windows, keychain backends).
30
+
17
31
  ## [10.7.2] - 2026-07-02
18
32
 
19
33
  ### Changed
package/README.md CHANGED
@@ -84,11 +84,27 @@ The pipeline runs natively on **Claude Code** and **Copilot CLI** — both insta
84
84
 
85
85
  Filter skills by stack with `--platform=ios\|android\|all`.
86
86
 
87
- ## Setup notes
87
+ ## Tokens & integrations
88
88
 
89
- - **Tokens** stay in the OS keychain (macOS Keychain / Windows Credential Manager / Linux libsecret) and are never committed or logged. `setup` maps them into `~/.claude/multi-agent-preferences.json`.
90
- - **Secret scan** runs as a `PreToolUse` hook on Claude Code (hard-blocks a commit on a hit) and as a pre-push check elsewhere.
91
- - **All tokens are optional** — the pipeline asks for any it needs at Phase 0.
89
+ `setup` scans your OS keychain and maps each token by a **logical name** (e.g. `jira`) to its real keychain entry — the pipeline resolves tokens through that mapping (`credential-store.sh`), so literal keychain names never appear in synced files. Tokens stay in the keychain (macOS Keychain / Windows Credential Manager / Linux libsecret), are **never committed or logged**, and are all **optional** — the pipeline asks for any it needs at Phase 0.
90
+
91
+ | Token | Used for | Phase |
92
+ |---|---|---|
93
+ | `jira` | fetch the issue · post the report comment | 0, 7 |
94
+ | `github` | issues · PRs · `gh` auth | 0, 6 |
95
+ | `bitbucket` | PR create/update (reviewer-preserving) · diff | 6 |
96
+ | `confluence` | publish analysis / wiki pages | 7 |
97
+ | `figma` + `figma_mcp` | fetch design context | analysis only |
98
+ | `fortify` | security-scan findings gate | 4 |
99
+ | `firebase` | Firebase config (base64 JSON) for Firebase projects | as needed |
100
+ | `jenkins` | CI trigger / status | build / deploy |
101
+ | `npm` | package publish (mostly CI) | release |
102
+
103
+ The **secret scan** runs as a `PreToolUse` hook on Claude Code (hard-blocks a commit on a hit) and as a pre-push check elsewhere.
104
+
105
+ ## Platform support
106
+
107
+ Runs on **macOS**, **Linux**, and **Windows** (Git Bash / WSL). Shell and credential access go through a platform-agnostic layer — the keychain resolves automatically to **macOS Keychain**, **Linux libsecret** (`secret-tool`), or **Windows Credential Manager**, and scripts fall back between BSD and GNU tool variants. Node.js 18 / 20 / 22.
92
108
 
93
109
  ## Companion repos
94
110
 
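Review bot: the new opening paragraph under `## Tokens & integrations` drops the mapping file's location (the removed line named `~/.claude/multi-agent-preferences.json`) while adding the claim that literal keychain names never appear in synced files. State where the logical-name mapping is written and whether that file is itself excluded from sync, otherwise a reader cannot check the claim. "`setup` scans your OS keychain" should also say which entries are read, since a scan is broader than the previous "maps them" wording. In the table, the Phase column mixes phase numbers with free text (`analysis only`, `as needed`, `release`); pick one form.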
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@mmerterden/multi-agent-pipeline",
3
- "version": "10.7.2",
3
+ "version": "10.7.3",
4
4
  "description": "8-phase AI development pipeline with full orchestration on Claude Code, Copilot CLI, Cursor, Antigravity, and VS Code Copilot Chat. Analysis, planning, TDD, CLI-aware parallel review with consensus surfacing + Opus triage, default-FAIL evidence gates, secret + intent guards, per-phase cost ledger, persistent learnings memory, wiki generation, commit automation. Token-preserving uninstall.",
5
5
  "type": "module",
6
6
  "main": "index.js",
@@ -7,7 +7,7 @@ modelRationale: "Ambiguity scoring is a low-stakes classification + targeted que
7
7
 
8
8
  # Task Clarifier Agent - Phase 0 Step 9
9
9
 
10
- You score how clearly the task is specified and, when score is below the threshold, emit up to N clarifying questions the user must answer before Phase 1 (Analysis) begins. Pattern source: Devin's clarifying-question loop documented at <https://docs.devin.ai/work-with-devin/devin-review> and the Devin Knowledge pattern (<https://docs.devin.ai/product-guides/knowledge>).
10
+ You score how clearly the task is specified and, when score is below the threshold, emit up to N clarifying questions the user must answer before Phase 1 (Analysis) begins.
11
11
 
12
12
  **You do NOT solve the task.** You only assess whether the task is solvable as stated. The user is the one who answers; you produce the questions.
13
13
 
@@ -108,6 +108,4 @@ Cost expectation on Haiku (input ~1.5k tokens issue body, output ~400 tokens JSO
108
108
 
109
109
  ## Pattern citation
110
110
 
111
- - Devin docs (clarifying via Knowledge triggers + Ask Devin sessions): <https://docs.devin.ai/work-with-devin/devin-review>
112
- - Cursor Plan Mode (asks clarifying questions before producing the plan): <https://cursor.com/docs/agent/planning>
113
111
  - Anthropic Building Effective Agents - orchestrator-workers pattern, prefer cheap classification before expensive synthesis: <https://www.anthropic.com/engineering/building-effective-agents>
@@ -1,6 +1,6 @@
1
1
  # Feature: Shadow-Git Checkpoints (Phase 3)
2
2
 
3
- **Gated by `prefs.global.shadowGit.enabled`** (default: `false`). The orchestrator snapshots the worktree via `pipeline/lib/shadow-git.sh` so sub-phase rollback is possible without polluting the project's real `.git` history. Pattern source: Cline checkpoints (<https://docs.cline.bot/features/checkpoints>).
3
+ **Gated by `prefs.global.shadowGit.enabled`** (default: `false`). The orchestrator snapshots the worktree via `pipeline/lib/shadow-git.sh` so sub-phase rollback is possible without polluting the project's real `.git` history.
4
4
 
5
5
  ```bash
6
6
  # Phase 0 (one-time per task): initialize shadow repo + baseline snapshot.
@@ -508,7 +508,7 @@ Log: `Phase 0 Step 7: taskType = {component|bugfix|feature|refactor|chore}`
508
508
 
509
509
  **Cost:** ~$0.0025 per Haiku call. The pipeline's other expensive phases (Phase 4 reviewers, Phase 3 Sonnet codegen) far outweigh this - the value is avoiding the ~30 min wasted when Phase 3 builds the wrong thing because Phase 0 didn't ask.
510
510
 
511
- **Reference:** see `pipeline/agents/task-clarifier.md` for the full scoring rubric, question quality rules, and prior-art citations (Devin Knowledge / Cursor Plan Mode).
511
+ **Reference:** see `pipeline/agents/task-clarifier.md` for the full scoring rubric and question-quality rules.
512
512
 
513
513
  **Why this fits Phase 0 (not a new phase):** clarification doesn't change what code gets written - it changes what gets understood before code is written. Phase 0 already collects identity / project / branch / maturity; ambiguity scoring fits naturally as the last contextual gate.
514
514
 
@@ -124,7 +124,7 @@ Log: "Phase 2: Plan - {N} tasks created, {M} with architecture review, validat
124
124
 
125
125
  #### Step 4.5 - Emit Plan Todo List (opt-in)
126
126
 
127
- **Gated by `prefs.global.planTodos.enabled`** (default: `false`). When enabled, after the planning-output JSON validates and BEFORE the approval gate, transform `tasks[]` into a structured Todo list conforming to `pipeline/schemas/plan-todos.schema.json` and persist into `agent-state.plan`. Pattern source: Windsurf Cascade's always-visible Todo list (<https://docs.windsurf.com/windsurf/cascade>) and Cursor Plan Mode's reviewable plan (<https://cursor.com/docs/agent/planning>).
127
+ **Gated by `prefs.global.planTodos.enabled`** (default: `false`). When enabled, after the planning-output JSON validates and BEFORE the approval gate, transform `tasks[]` into a structured Todo list conforming to `pipeline/schemas/plan-todos.schema.json` and persist into `agent-state.plan`. The plan is rendered as a live, always-visible Todo list.
128
128
 
129
129
  ```bash
130
130
  TODO_BLOB=$(jq '
@@ -41,7 +41,7 @@ Phase 3 consumes the Phase 2 output object conforming to `pipeline/schemas/plann
41
41
 
42
42
  **Plan Todo iteration (opt-in)**: gated by `prefs.global.planTodos.enabled` (default: `false`). When enabled and Phase 2 Step 4.5 emitted a `plan.todos[]`, Phase 3 iterates via `pipeline/lib/plan-todos.sh next/start/complete/fail` instead of walking `tasks[]` directly. When disabled, the loop walks `tasks[]` from `planning-output` - TDD contract is unchanged. Full helper loop + state semantics: `refs/features/plan-todos.md`.
43
43
 
44
- **Shadow-Git checkpoints (opt-in)**: gated by `prefs.global.shadowGit.enabled` (default: `false`). When enabled, the orchestrator snapshots the worktree via `pipeline/lib/shadow-git.sh` so sub-phase rollback is possible without touching the project's real `.git` history. Cline-style. Lifecycle: `shadow-git.sh init` (Phase 0 baseline), `shadow-git.sh snapshot` (per step after `plan-todos complete`), `shadow-git.sh restore <sha> --files` (rollback). Modes: `per-todo-step` (default) or `per-tool-call`. Full wiring + storage cap: `refs/features/shadow-git.md`.
44
+ **Shadow-Git checkpoints (opt-in)**: gated by `prefs.global.shadowGit.enabled` (default: `false`). When enabled, the orchestrator snapshots the worktree via `pipeline/lib/shadow-git.sh` so sub-phase rollback is possible without touching the project's real `.git` history. Lifecycle: `shadow-git.sh init` (Phase 0 baseline), `shadow-git.sh snapshot` (per step after `plan-todos complete`), `shadow-git.sh restore <sha> --files` (rollback). Modes: `per-todo-step` (default) or `per-tool-call`. Full wiring + storage cap: `refs/features/shadow-git.md`.
45
45
 
46
46
  #### Component tasks - delegated dispatch (taskType === "component")
47
47
 
@@ -25,7 +25,6 @@ The agent detects which CLI it's running in and uses the appropriate visual mech
25
25
  1. system prompt mentions "Claude Code" → claude-code
26
26
  2. system prompt mentions "Copilot" / "GitHub Copilot" → copilot
27
27
  3. system prompt mentions "Cursor" → cursor
28
- 4. system prompt mentions "Cascade" / "Windsurf" → windsurf
29
28
  5. None of the above → generic (bash stdout)
30
29
  ```
31
30
 
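Review bot: removing rule 4 leaves the detection list numbered 1, 2, 3, 5; renumber the fallback to `4. None of the above → generic (bash stdout)`. Dropping the `windsurf` branch also changes what the detector returns in that environment (it now falls through to the remaining rules), while the 10.7.3 changelog entry says feature behavior is unchanged and only naming is dropped. Either keep the branch or record the removal in the changelog.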
@@ -2,11 +2,8 @@
2
2
  #
3
3
  # plan-todos.sh - manage the Phase 2 plan as a live Todo list.
4
4
  #
5
- # Pattern source:
6
- # - Windsurf Cascade - https://docs.windsurf.com/windsurf/cascade
7
- # "renders a Todo list inside the conversation that updates as it works"
8
- # - Cursor Plan Mode - https://cursor.com/docs/agent/planning
9
- # "creates a detailed, reviewable, editable plan before writing any code"
5
+ # The Phase 2 plan is broken into a live, reviewable Todo list that updates
6
+ # step-by-step as Phase 3 works through it.
10
7
  #
11
8
  # State lives in `agent-state.json` under `.plan.todos[]` per
12
9
  # pipeline/schemas/plan-todos.schema.json. Phase 2 (Planning) emits the
@@ -137,7 +137,7 @@ render_inline_body() {
137
137
  printf '_%s_\n\n' "$rule_id"
138
138
  fi
139
139
  printf -- '---\n🤖 _Multi-Agent Review · iteration #%s_\n' "$ITERATION"
140
- # Dedupe marker - Bugbot-style. Re-runs of /multi-agent:review skip a
140
+ # Dedupe marker - dedupe-style. Re-runs of /multi-agent:review skip a
141
141
  # finding when an existing comment carries the same fingerprint.
142
142
  if [ -n "$fingerprint" ]; then
143
143
  printf '<!-- multi-agent-finding: %s -->\n' "$fingerprint"
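Review bot: in the comment above the `if [ -n "$fingerprint" ]` check, `Dedupe marker - dedupe-style.` is circular now that the product name is gone, and the rest of the comment already explains the purpose. Shorten the first line to `# Dedupe marker. Re-runs of /multi-agent:review skip a` and leave the following line as is.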
@@ -229,7 +229,7 @@ post_github() {
229
229
  rc_body="See inline comments above."
230
230
  fi
231
231
 
232
- # Dedupe gate - read pref, default ON (Bugbot-style). Loads existing
232
+ # Dedupe gate - read pref, default ON (dedupe-style). Loads existing
233
233
  # comments once if needed; per-finding check is in-process.
234
234
  local DEDUPE_ENABLED
235
235
  DEDUPE_ENABLED=$(jq -r '.global.review.dedupeInlineComments // true' \
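Review bot: the context line `jq -r '.global.review.dedupeInlineComments // true'` cannot return `false`. jq's `//` operator yields the right-hand side when the left is `false` as well as when it is `null`, so a user who sets the preference to `false` still gets `true` and the dedupe gate stays on, contrary to the schema text that says setting it to false restores the pre-v8.6 behavior. Use an explicit null test, for example `.global.review.dedupeInlineComments | if . == null then true else . end`. The reworded comment `default ON (dedupe-style)` is also circular; `default ON` alone is enough.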
@@ -3,12 +3,8 @@
3
3
  # review-watch.sh - poll watched GitHub repos for incoming PRs and
4
4
  # dispatch `/multi-agent:review` on each new/updated PR.
5
5
  #
6
- # Pattern source:
7
- # - Cursor Bugbot - https://cursor.com/docs/bugbot
8
- # "Bugbot automatically reviews every PR update; reads existing PR
9
- # comments to avoid duplicate feedback."
10
- # - Devin Review - https://docs.devin.ai/work-with-devin/devin-review
11
- # "Auto-Review on PR open / new commit / draft-ready / reviewer-add."
6
+ # Auto-reviews every new/updated PR the user did not author, reading existing
7
+ # PR comments to avoid duplicate feedback.
12
8
  #
13
9
  # What it does:
14
10
  # - Reads watched repos from prefs.global.reviewWatch.repos[] (or --repos).
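Review bot: the new header sentence `Auto-reviews every new/updated PR the user did not author` reads as always-on, but the prefs schema in this same release describes `reviewWatch` as disabled by default and configured through `.repos[]`. Say so in the header, e.g. `When prefs.global.reviewWatch.enabled is true, auto-reviews new/updated PRs in the watched repos that the user did not author`, so the script comment and the schema agree.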
@@ -2,11 +2,9 @@
2
2
  #
3
3
  # shadow-git.sh - per-tool-call checkpoints in a separate git repo.
4
4
  #
5
- # Pattern source:
6
- # - Cline checkpoints - https://docs.cline.bot/features/checkpoints
7
- # "Cline maintains a shadow Git repository separate from your project's
8
- # actual Git history... After each tool use, Cline commits the current
9
- # state of your files to this shadow repo."
5
+ # A shadow git repo, separate from the project's real .git, records a snapshot
6
+ # of the working tree after each tool use so a sub-phase can be rolled back
7
+ # without polluting the user's semantic commit history.
10
8
  #
11
9
  # Why a SHADOW repo (not the real .git): the real git tree holds the user's
12
10
  # semantic commits - clean history, intentional messages. Shadow snapshots
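Review bot: the replacement header says a snapshot is recorded `after each tool use`, which describes only the `per-tool-call` mode, while the schema lists `per-todo-step` as the default. Word it as `after each tool use or plan-todo step, depending on the configured mode`. The last clause, `without polluting the user's semantic commit history`, also repeats the `Why a SHADOW repo` paragraph that follows directly and can be dropped.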
@@ -2,7 +2,7 @@
2
2
  "$schema": "http://json-schema.org/draft-07/schema#",
3
3
  "$id": "https://example.com/pipeline/clarify-output.schema.json",
4
4
  "title": "Clarification Output",
5
- "description": "Schema for Phase 0 Step 9 task-clarifier sub-agent output. Source: pipeline/agents/task-clarifier.md. Pattern reference: Devin clarifying-question loop (https://docs.devin.ai/work-with-devin/devin-review) and Cursor Plan Mode (https://cursor.com/docs/agent/planning).",
5
+ "description": "Schema for Phase 0 Step 9 task-clarifier sub-agent output. Source: pipeline/agents/task-clarifier.md. A clarifying-question loop that runs before planning.",
6
6
  "type": "object",
7
7
  "additionalProperties": false,
8
8
  "required": ["clarityScore", "questions", "stopAndAsk"],
@@ -2,7 +2,7 @@
2
2
  "$schema": "http://json-schema.org/draft-07/schema#",
3
3
  "$id": "https://example.com/pipeline/plan-todos.schema.json",
4
4
  "title": "Plan Todo List",
5
- "description": "Structured representation of Phase 2's plan. Persists into agent-state under `.plan.todos[]` and survives across phases. Phase 3 (Dev) iterates step-by-step; Phase 4 (Review) verifies completed steps against criteria; Phase 7 (Report) renders a per-step rollup. Pattern source: Windsurf Cascade (https://docs.windsurf.com/windsurf/cascade) renders a live Todo list inside the conversation; Cursor Plan Mode (https://cursor.com/docs/agent/planning) builds an editable plan before any code.",
5
+ "description": "Structured representation of Phase 2's plan. Persists into agent-state under `.plan.todos[]` and survives across phases. Phase 3 (Dev) iterates step-by-step; Phase 4 (Review) verifies completed steps against criteria; Phase 7 (Report) renders a per-step rollup. The plan is broken into a live, structured Todo list for step-by-step tracking across phases.",
6
6
  "type": "object",
7
7
  "additionalProperties": false,
8
8
  "required": ["title", "todos"],
@@ -709,7 +709,7 @@
709
709
  "dedupeInlineComments": {
710
710
  "type": "boolean",
711
711
  "default": true,
712
- "description": "Bugbot/Devin Review behaviour: before posting an inline comment, scan existing PR comments for a stable fingerprint marker (sha-16 of path|line|issue). If a comment with the same marker exists, skip - preserves audit trail without flooding the PR on re-runs. Provider-agnostic: GitHub /pulls/{n}/comments + /issues/{n}/comments and Bitbucket Server /pull-requests/{id}/activities?fromType=COMMENT are both checked. Set to false to restore the pre-v8.6 behavior (every run posts fresh comments, original spec)."
712
+ "description": "Deduplicated PR review comments: before posting an inline comment, scan existing PR comments for a stable fingerprint marker (sha-16 of path|line|issue). If a comment with the same marker exists, skip - preserves audit trail without flooding the PR on re-runs. Provider-agnostic: GitHub /pulls/{n}/comments + /issues/{n}/comments and Bitbucket Server /pull-requests/{id}/activities?fromType=COMMENT are both checked. Set to false to restore the pre-v8.6 behavior (every run posts fresh comments, original spec)."
713
713
  }
714
714
  }
715
715
  },
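Review bot: the `description` for `dedupeInlineComments` says a finding is skipped when a comment with the same marker exists, without saying whose comment counts. The marker is plain text in the comment body, so document (and enforce in `comment_exists_with_fingerprint`) that only comments posted by the pipeline's own account are matched. `sha-16` is also ambiguous; name the hash function and the truncation length.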
@@ -727,7 +727,7 @@
727
727
  "type": "string",
728
728
  "enum": ["per-tool-call", "per-todo-step", "off"],
729
729
  "default": "per-todo-step",
730
- "description": "Snapshot frequency. per-tool-call mirrors Cline (one snapshot after each Edit/Write/Bash mutation); per-todo-step snapshots once per plan-todos.sh step boundary (cheaper, recommended). off is equivalent to enabled=false."
730
+ "description": "Snapshot frequency. per-tool-call snapshots after each Edit/Write/Bash mutation; per-todo-step snapshots once per plan-todos.sh step boundary (cheaper, recommended). off is equivalent to enabled=false."
731
731
  },
732
732
  "pruneAfterDays": {
733
733
  "type": "integer",
@@ -741,7 +741,7 @@
741
741
  "planTodos": {
742
742
  "type": "object",
743
743
  "additionalProperties": false,
744
- "description": "v8.6+ - Plan-as-live-Todo-list. Phase 2 emits agent-state.plan.todos[] conforming to pipeline/schemas/plan-todos.schema.json; Phase 3 iterates step-by-step via pipeline/lib/plan-todos.sh next/start/complete; Phase 7 renders the rollup into agent-log.md and the PR body. Pattern source: Windsurf Cascade (https://docs.windsurf.com/windsurf/cascade) Todo list inside the conversation; Cursor Plan Mode (https://cursor.com/docs/agent/planning) reviewable plan. Off by default - opt in to add status-transition writes per Phase 3 step in exchange for sub-step visibility + per-step notes.",
744
+ "description": "v8.6+ - Plan-as-live-Todo-list. Phase 2 emits agent-state.plan.todos[] conforming to pipeline/schemas/plan-todos.schema.json; Phase 3 iterates step-by-step via pipeline/lib/plan-todos.sh next/start/complete; Phase 7 renders the rollup into agent-log.md and the PR body. The plan is broken into a live, structured Todo list. Off by default - opt in to add status-transition writes per Phase 3 step in exchange for sub-step visibility + per-step notes.",
745
745
  "properties": {
746
746
  "enabled": {
747
747
  "type": "boolean",
@@ -753,7 +753,7 @@
753
753
  "clarifyAmbiguous": {
754
754
  "type": "object",
755
755
  "additionalProperties": false,
756
- "description": "v8.6+ - Phase 0 Step 9 clarifying-question loop. Before Phase 1 starts, a cheap Haiku classifier scores task ambiguity (0-10) and emits up to N questions if score < threshold. Pattern source: Devin Knowledge / Ask Devin (https://docs.devin.ai/work-with-devin/devin-review) and Cursor Plan Mode clarifying questions (https://cursor.com/docs/agent/planning). Cost: ~$0.0025 per Haiku call. Off by default - flip on for teams burned by ambiguity-driven rework or when working on cross-team issues where the spec lives in someone else's head.",
756
+ "description": "v8.6+ - Phase 0 Step 9 clarifying-question loop. Before Phase 1 starts, a cheap Haiku classifier scores task ambiguity (0-10) and emits up to N questions if score < threshold. Cost: ~$0.0025 per Haiku call. Off by default - flip on for teams burned by ambiguity-driven rework or when working on cross-team issues where the spec lives in someone else's head.",
757
757
  "properties": {
758
758
  "enabled": {
759
759
  "type": "boolean",
@@ -840,7 +840,7 @@
840
840
  "reviewWatch": {
841
841
  "type": "object",
842
842
  "additionalProperties": false,
843
- "description": "v8.6+ - Auto-review incoming PRs via gh CLI polling. Inspired by Cursor Bugbot (https://cursor.com/docs/bugbot) and Devin Review (https://docs.devin.ai/work-with-devin/devin-review): trigger /multi-agent:review on PRs the user did NOT author. Disabled by default. Configure repos via .repos[] then either run pipeline/lib/review-watch.sh --watch as a background process or schedule it via cron.",
843
+ "description": "v8.6+ - Auto-review incoming PRs via gh CLI polling. Auto-triggers /multi-agent:review on PRs the user did NOT author. Disabled by default. Configure repos via .repos[] then either run pipeline/lib/review-watch.sh --watch as a background process or schedule it via cron.",
844
844
  "properties": {
845
845
  "enabled": {
846
846
  "type": "boolean",
@@ -8,7 +8,7 @@
8
8
  # 4. State directory is created
9
9
  # 5. prefs schema exposes global.reviewWatch.{enabled,repos,intervalSeconds,labelFilter}
10
10
  # 6. global.reviewWatch.enabled defaults to false (opt-in)
11
- # 7. global.review.dedupeInlineComments default is true (Bugbot parity)
11
+ # 7. global.review.dedupeInlineComments default is true (dedupe parity)
12
12
  # 8. post-pr-review.sh exposes finding_fingerprint + comment_exists_with_fingerprint
13
13
  # 9. render_inline_body accepts a 5th fingerprint arg and embeds the marker
14
14
  # 10. Unknown command exits non-zero
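Review bot: item 7's parenthetical `(dedupe parity)` no longer says what the default is in parity with, and the same phrase is repeated in the `record_pass` message further down the file. Drop the parenthetical in both places, leaving `default is true`.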
@@ -81,7 +81,7 @@ fi
81
81
  # 7. dedupeInlineComments default true
82
82
  if jq -e '.properties.global.properties.review.properties.dedupeInlineComments
83
83
  | has("default") and .default == true' "$SCHEMA" >/dev/null 2>&1; then
84
- record_pass "review.dedupeInlineComments defaults to true (Bugbot parity)"
84
+ record_pass "review.dedupeInlineComments defaults to true (dedupe parity)"
85
85
  else
86
86
  record_fail "review.dedupeInlineComments should default to true"
87
87
  fi
@@ -32,7 +32,7 @@ failures=()
32
32
  record_pass() { pass=$((pass + 1)); printf ' \033[0;32mPASS\033[0m %s\n' "$1"; }
33
33
  record_fail() { fail=$((fail + 1)); failures+=("$1"); printf ' \033[0;31mFAIL\033[0m %s\n' "$1"; }
34
34
 
35
- printf '→ smoke-shadow-git: Cline-style per-tool-call checkpoint contract\n'
35
+ printf '→ smoke-shadow-git: per-tool-call checkpoint contract\n'
36
36
 
37
37
  # 1. Script exists + parses
38
38
  if [ ! -f "$SG" ]; then