@gempack/squad-mcp 0.5.0 → 0.6.0

Files changed (96)
  1. package/.claude-plugin/marketplace.json +2 -2
  2. package/.claude-plugin/plugin.json +3 -2
  3. package/CHANGELOG.md +260 -17
  4. package/INSTALL.md +156 -24
  5. package/README.md +279 -27
  6. package/agents/{PO.md → product-owner.md} +33 -1
  7. package/agents/{Senior-Architect.md → senior-architect.md} +33 -1
  8. package/agents/{Senior-DBA.md → senior-dba.md} +33 -1
  9. package/agents/{Senior-Dev-Reviewer.md → senior-dev-reviewer.md} +33 -1
  10. package/agents/{Senior-Dev-Security.md → senior-dev-security.md} +33 -1
  11. package/agents/{Senior-Developer.md → senior-developer.md} +33 -1
  12. package/agents/{Senior-QA.md → senior-qa.md} +33 -1
  13. package/agents/{TechLead-Consolidator.md → tech-lead-consolidator.md} +7 -1
  14. package/agents/{TechLead-Planner.md → tech-lead-planner.md} +7 -1
  15. package/commands/squad-review.md +10 -58
  16. package/commands/squad.md +11 -70
  17. package/dist/config/ownership-matrix.d.ts +24 -2
  18. package/dist/config/ownership-matrix.js +466 -139
  19. package/dist/config/ownership-matrix.js.map +1 -1
  20. package/dist/config/squad-yaml.d.ts +242 -0
  21. package/dist/config/squad-yaml.js +403 -0
  22. package/dist/config/squad-yaml.js.map +1 -0
  23. package/dist/errors.d.ts +1 -1
  24. package/dist/errors.js +1 -1
  25. package/dist/errors.js.map +1 -1
  26. package/dist/format/pr-review.d.ts +61 -0
  27. package/dist/format/pr-review.js +146 -0
  28. package/dist/format/pr-review.js.map +1 -0
  29. package/dist/index.js +19 -13
  30. package/dist/index.js.map +1 -1
  31. package/dist/learning/format.d.ts +29 -0
  32. package/dist/learning/format.js +55 -0
  33. package/dist/learning/format.js.map +1 -0
  34. package/dist/learning/store.d.ts +102 -0
  35. package/dist/learning/store.js +169 -0
  36. package/dist/learning/store.js.map +1 -0
  37. package/dist/resources/agent-loader.d.ts +1 -1
  38. package/dist/resources/agent-loader.js +53 -40
  39. package/dist/resources/agent-loader.js.map +1 -1
  40. package/dist/tasks/select.d.ts +64 -0
  41. package/dist/tasks/select.js +84 -0
  42. package/dist/tasks/select.js.map +1 -0
  43. package/dist/tasks/store.d.ts +338 -0
  44. package/dist/tasks/store.js +321 -0
  45. package/dist/tasks/store.js.map +1 -0
  46. package/dist/tools/compose-advisory-bundle.d.ts +5 -5
  47. package/dist/tools/compose-advisory-bundle.js +24 -12
  48. package/dist/tools/compose-advisory-bundle.js.map +1 -1
  49. package/dist/tools/compose-prd-parse.d.ts +53 -0
  50. package/dist/tools/compose-prd-parse.js +167 -0
  51. package/dist/tools/compose-prd-parse.js.map +1 -0
  52. package/dist/tools/compose-squad-workflow.d.ts +28 -10
  53. package/dist/tools/compose-squad-workflow.js +0 -0
  54. package/dist/tools/compose-squad-workflow.js.map +1 -1
  55. package/dist/tools/consolidate.d.ts +55 -4
  56. package/dist/tools/consolidate.js +87 -15
  57. package/dist/tools/consolidate.js.map +1 -1
  58. package/dist/tools/expand-task.d.ts +51 -0
  59. package/dist/tools/expand-task.js +35 -0
  60. package/dist/tools/expand-task.js.map +1 -0
  61. package/dist/tools/list-tasks.d.ts +31 -0
  62. package/dist/tools/list-tasks.js +50 -0
  63. package/dist/tools/list-tasks.js.map +1 -0
  64. package/dist/tools/next-task.d.ts +37 -0
  65. package/dist/tools/next-task.js +60 -0
  66. package/dist/tools/next-task.js.map +1 -0
  67. package/dist/tools/read-learnings.d.ts +53 -0
  68. package/dist/tools/read-learnings.js +72 -0
  69. package/dist/tools/read-learnings.js.map +1 -0
  70. package/dist/tools/read-squad-config.d.ts +23 -0
  71. package/dist/tools/read-squad-config.js +34 -0
  72. package/dist/tools/read-squad-config.js.map +1 -0
  73. package/dist/tools/record-learning.d.ts +62 -0
  74. package/dist/tools/record-learning.js +80 -0
  75. package/dist/tools/record-learning.js.map +1 -0
  76. package/dist/tools/record-tasks.d.ts +71 -0
  77. package/dist/tools/record-tasks.js +45 -0
  78. package/dist/tools/record-tasks.js.map +1 -0
  79. package/dist/tools/registry.d.ts +1 -1
  80. package/dist/tools/registry.js +71 -39
  81. package/dist/tools/registry.js.map +1 -1
  82. package/dist/tools/score-rubric.d.ts +74 -0
  83. package/dist/tools/score-rubric.js +140 -0
  84. package/dist/tools/score-rubric.js.map +1 -0
  85. package/dist/tools/slice-files-for-task.d.ts +31 -0
  86. package/dist/tools/slice-files-for-task.js +52 -0
  87. package/dist/tools/slice-files-for-task.js.map +1 -0
  88. package/dist/tools/update-task-status.d.ts +29 -0
  89. package/dist/tools/update-task-status.js +35 -0
  90. package/dist/tools/update-task-status.js.map +1 -0
  91. package/package.json +4 -1
  92. package/skills/squad/SKILL.md +454 -0
  93. package/tools/post-review.mjs +212 -0
  94. /package/agents/{Skill-Squad-Dev.md → _shared/Skill-Squad-Dev.md} +0 -0
  95. /package/agents/{Skill-Squad-Review.md → _shared/Skill-Squad-Review.md} +0 -0
  96. /package/agents/{_Severity-and-Ownership.md → _shared/_Severity-and-Ownership.md} +0 -0
package/README.md CHANGED
@@ -75,20 +75,31 @@ node dist/index.js
75
75
 
76
76
  ### Tools (deterministic, pure functions)
77
77
 
78
- | Tool | Purpose |
79
- |------|---------|
80
- | `detect_changed_files` | Hardened `git diff --name-status --no-renames` for a workspace. Allowlisted refs, 10s timeout, 1MB stdout cap. |
81
- | `classify_work_type` | Heuristic `WorkType` from prompt + paths (`Feature` / `Bug Fix` / `Refactor` / `Performance` / `Security` / `Business Rule`) with Low/Medium/High confidence. |
82
- | `score_risk` | Compute Low/Medium/High from boolean signals (auth, money, migration, files_count, new_module, api_change). |
83
- | `select_squad` | Select advisory agents for a work type. Combines matrix + path hints + content sniff. Returns evidence per file. |
84
- | `slice_files_for_agent` | Filter a file list to those owned by a single agent. Used to build sliced advisory prompts. |
85
- | `validate_plan_text` | Advisory check for inviolable-rule violations in a plan (commit/push fences, emojis in code blocks, non-English identifiers, impl-before-approval). |
86
- | `compose_squad_workflow` | One-call pipeline: `detect_changed_files` → `classify_work_type` → `score_risk` → `select_squad`. |
87
- | `compose_advisory_bundle` | One-call full bundle: `compose_squad_workflow` + `slice_files_for_agent` per selected agent + `validate_plan_text`. |
88
- | `apply_consolidation_rules` | Aggregate advisory reports → final verdict (APPROVED / CHANGES_REQUIRED / REJECTED). |
89
- | `list_agents` | List configured agents with role, ownership, naming conventions. |
90
- | `get_agent_definition` | Return the full markdown system prompt for an agent (local override → embedded default). |
91
- | `init_local_config` | Copy embedded defaults to the local override directory so they can be edited. |
78
+ | Tool | Purpose |
79
+ | --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
80
+ | `detect_changed_files` | Hardened `git diff --name-status --no-renames` for a workspace. Allowlisted refs, 10s timeout, 1MB stdout cap. |
81
+ | `classify_work_type` | Heuristic `WorkType` from prompt + paths (`Feature` / `Bug Fix` / `Refactor` / `Performance` / `Security` / `Business Rule`) with Low/Medium/High confidence. |
82
+ | `score_risk` | Compute Low/Medium/High from boolean signals (auth, money, migration, files_count, new_module, api_change). |
83
+ | `select_squad` | Select advisory agents for a work type. Combines matrix + path hints + content sniff. Returns evidence per file. |
84
+ | `slice_files_for_agent` | Filter a file list to those owned by a single agent. Used to build sliced advisory prompts. |
85
+ | `validate_plan_text` | Advisory check for inviolable-rule violations in a plan (commit/push fences, emojis in code blocks, non-English identifiers, impl-before-approval). |
86
+ | `compose_squad_workflow` | One-call pipeline: `detect_changed_files` → `classify_work_type` → `score_risk` → `select_squad`. |
87
+ | `compose_advisory_bundle` | One-call full bundle: `compose_squad_workflow` + `slice_files_for_agent` per selected agent + `validate_plan_text`. |
88
+ | `apply_consolidation_rules` | Aggregate advisory reports → final verdict (APPROVED / CHANGES_REQUIRED / REJECTED). Returns weighted rubric scorecard when reports carry per-dimension scores. |
89
+ | `score_rubric` | Pure rubric calculator. Takes per-agent scores (0-100) + optional weight overrides, returns weighted score, per-dimension breakdown, and pre-formatted ASCII scorecard. |
90
+ | `read_squad_config` | Read and resolve `.squad.yaml` (or `.squad.yml`) at workspace_root. Returns effective weights, threshold, min_score, skip_paths, disable_agents. |
91
+ | `read_learnings` | Load past accept/reject decisions from `.squad/learnings.jsonl`. Filters by agent / decision / changed-file scope. Returns entries plus a markdown block ready to inject into agent or consolidator prompts. |
92
+ | `record_learning` | Append one accept/reject decision to `.squad/learnings.jsonl`. Side-effecting; the skill (or CLI) is responsible for per-finding user authorisation. |
93
+ | `compose_prd_parse` | Build a prompt + JSON schema for the host LLM to decompose a PRD into atomic tasks. Pure-MCP: server does NO LLM calls. Caller (skill) feeds the prompt to its model, then calls `record_tasks` after user confirmation. |
94
+ | `list_tasks` | Read tasks from `.squad/tasks.json`. Filters: status, agent (matches `agent_hints`), changed_files (glob match against task `scope`). |
95
+ | `next_task` | Pick the next ready task: candidate status (default pending), all dependencies done, optional agent / changed_files filter. Tiebreak priority then id. Returns null + reason when none ready. |
96
+ | `record_tasks` | Bulk-create tasks. Allocates ids sequentially, validates dependencies resolve (forward refs in batch ok), rejects duplicates and self-deps. Atomic write. |
97
+ | `update_task_status` | Flip a task or subtask status: pending / in-progress / review / done / blocked / cancelled. |
98
+ | `expand_task` | Append subtasks to an existing task. Mechanical only — caller (skill or LLM) supplies the subtask inputs. |
99
+ | `slice_files_for_task` | Filter a file list to those matching a task's `scope` glob. Same glob primitive as `skip_paths` and learnings scope. |
100
+ | `list_agents` | List configured agents with role, ownership, naming conventions. |
101
+ | `get_agent_definition` | Return the full markdown system prompt for an agent (local override → embedded default). |
102
+ | `init_local_config` | Copy embedded defaults to the local override directory so they can be edited. |
92
103
 
93
104
  ### Prompts
94
105
 
@@ -98,20 +109,27 @@ node dist/index.js
98
109
 
99
110
  ### Resources
100
111
 
101
- - `agent://po`, `agent://tech-lead-planner`, `agent://tech-lead-consolidator`, `agent://senior-architect`, `agent://senior-dba`, `agent://senior-developer`, `agent://senior-dev-reviewer`, `agent://senior-dev-security`, `agent://senior-qa`.
112
+ - `agent://product-owner`, `agent://tech-lead-planner`, `agent://tech-lead-consolidator`, `agent://senior-architect`, `agent://senior-dba`, `agent://senior-developer`, `agent://senior-dev-reviewer`, `agent://senior-dev-security`, `agent://senior-qa`. (Renamed from PascalCase / `po` in v0.6.0 — older 0.5.x consumers must use `agent://po` instead.)
102
113
  - `severity://_severity-and-ownership` — severity matrix + ownership rules.
103
114
  - `severity://skill-squad-dev`, `severity://skill-squad-review` — full skill specs.
104
115
 
105
116
  ### Bundled skills
106
117
 
107
- The plugin auto-registers these skills via `skills/` (or sync them to `~/.claude/skills/` for non-plugin clients with `node tools/sync-agents.mjs`):
118
+ The plugin auto-registers these skills via `skills/`:
108
119
 
109
- | Skill | Trigger | Purpose |
110
- |-------|---------|---------|
111
- | `/squad` | implementation workflow | Builds an approved plan, distributes work to specialist agents in parallel, implements the change, consolidates via tech-lead. Optional `--codex` second-opinion. New `--quick` mode reduces to 1 specialist + tech-lead with terse prompts (mutually exclusive with `--codex`; auto-fallback to normal mode on security/data-layer scope). |
112
- | `/squad-review` | multi-perspective review | Auto-detects affected domains, spawns specialist agents in parallel, scores on a weighted rubric (Code Quality 20%, Security 20%, Maintainability 20%, Performance 20%, Async/Concurrency 8%, Error Handling 7%, Architecture Fit 5%), tech-lead consolidates the verdict. New `--quick` mode for fast iteration. |
113
- | `/brainstorm` | pre-implementation research | Web research in parallel + specialist agent perspectives → options matrix with cited sources and a recommendation. Produces no code. Position: `/brainstorm` decides what to build, `/squad` implements, `/squad-review` reviews. |
114
- | `/commit-suggest` | commit message generator | Read-only suggester for Conventional Commits messages. Runs only an allowlist of git commands; never executes mutations; never adds AI co-author trailers. The user runs the commit themselves. |
120
+ | Skill | Trigger | Purpose |
121
+ | ----------------- | --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
122
+ | `/squad` | implementation workflow | Single skill, two modes. `/squad <task>` builds an approved plan, distributes work to specialist subagents in parallel, implements the change, consolidates via tech-lead. `/squad-review [target]` is the same skill in review mode: it never implements, it only produces an advisory verdict on an existing diff/branch/PR. Optional `--codex` second opinion. |
123
+ | `/brainstorm` | pre-implementation research | Web research in parallel + specialist agent perspectives → options matrix with cited sources and a recommendation. Produces no code. Position: `/brainstorm` decides what to build, `/squad` implements, `/squad-review` reviews. |
124
+ | `/commit-suggest` | commit message generator | Read-only suggester for Conventional Commits messages. Runs only an allowlist of git commands; never executes mutations; never adds AI co-author trailers. The user runs the commit themselves. |
125
+
126
+ ### Bundled subagents
127
+
128
+ The plugin's `agents/` directory registers nine native Claude Code subagents you can also dispatch directly via `Task(subagent_type=…)`:
129
+
130
+ `product-owner`, `senior-architect`, `senior-dba`, `senior-developer`, `senior-dev-reviewer`, `senior-dev-security`, `senior-qa`, `tech-lead-planner`, `tech-lead-consolidator`.
131
+
132
+ The `/squad` skill orchestrates them. For non-Claude-Code MCP clients (Cursor, Claude Desktop, Warp), the same role markdowns are accessible through the MCP `agent://…` resources and `get_agent_definition` tool.
115
133
 
116
134
  Workflow positioning:
117
135
 
@@ -127,6 +145,236 @@ Workflow positioning:
127
145
 
128
146
  See [INSTALL.md](INSTALL.md#bundled-skills) for trigger examples and the optional `commit-msg` git hook + `permissions.deny` snippet that hard-enforce the read-only and no-AI-attribution invariants at the OS / Claude Code layer.
129
147
 
148
+ ## Repo configuration — `.squad.yaml`
149
+
150
+ Drop a `.squad.yaml` (or `.squad.yml`) at the repo root to override defaults per-project. Versioned with the code, picked up automatically by `compose_squad_workflow` and `compose_advisory_bundle`.
151
+
152
+ ```yaml
153
+ # .squad.yaml — example for a regulated fintech backend
154
+
155
+ # Rubric weights (must sum to 100 across the agents you list).
156
+ # Agents NOT listed are zeroed out — listing weights is an explicit choice
157
+ # of which dimensions count for this repo.
158
+ weights:
159
+ senior-dev-security: 30 # PCI compliance — security weighted higher
160
+ senior-dba: 22 # double-entry ledger, money on the line
161
+ senior-developer: 20
162
+ senior-architect: 15
163
+ senior-qa: 13
164
+
165
+ # Per-dimension flag threshold (default 75). Below this, the dimension is
166
+ # marked with ⚠ in the scorecard.
167
+ threshold: 80
168
+
169
+ # Quality floor: APPROVED with weighted score below this becomes
170
+ # CHANGES_REQUIRED. Severity rules (Blocker/Major) take precedence.
171
+ min_score: 75
172
+
173
+ # Files excluded from advisory. Glob syntax: ** for any depth, * for one
174
+ # segment, ? for one char. Useful for docs-only or generated paths.
175
+ skip_paths:
176
+ - "docs/**"
177
+ - "**/*.md"
178
+ - "**/generated/**"
179
+ - "vendor/**"
180
+
181
+ # Agents not relevant for this repo (e.g. internal tool, no PO involved).
182
+ disable_agents:
183
+ - product-owner
184
+ ```
185
+
186
+ All keys are optional; partial files merge with package defaults. `force_agents` in tool calls still wins over `disable_agents` (config is a default policy, not a veto over explicit caller intent). Validation is strict: weights that don't sum to 100, unknown agent names, or invalid threshold ranges are rejected with a clear error.
187
+
188
+ The reader is cached by mtime — long-running MCP servers automatically pick up edits without a restart.
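The strict validation described above can be pictured with a small sketch. This is not the package's actual `squad-yaml` implementation: `SquadConfig`, `KNOWN_AGENTS`, and `validateSquadConfig` are hypothetical names, and the accepted range for `threshold` and `min_score` (0 to 100) is an assumption, since the README only says invalid ranges are rejected.

```typescript
// Hypothetical config shape covering only keys shown in the example above.
interface SquadConfig {
  weights?: Record<string, number>;
  threshold?: number;
  min_score?: number;
  disable_agents?: string[];
}

// Agent names as listed in this README.
const KNOWN_AGENTS: readonly string[] = [
  "product-owner",
  "senior-architect",
  "senior-dba",
  "senior-developer",
  "senior-dev-reviewer",
  "senior-dev-security",
  "senior-qa",
  "tech-lead-planner",
  "tech-lead-consolidator",
];

// Assumption: threshold and min_score are percentages in [0, 100].
function inPercentRange(value: number): boolean {
  return Number.isFinite(value) && value >= 0 && value <= 100;
}

// Returns a list of human-readable errors; an empty list means the config is valid.
function validateSquadConfig(cfg: SquadConfig): string[] {
  const errors: string[] = [];
  if (cfg.weights !== undefined) {
    let total = 0;
    for (const [agent, weight] of Object.entries(cfg.weights)) {
      if (!KNOWN_AGENTS.includes(agent)) {
        errors.push(`unknown agent in weights: ${agent}`);
      }
      total += weight;
    }
    // Integer weights are assumed, so an exact comparison is safe here.
    if (total !== 100) {
      errors.push(`weights must sum to 100, got ${total}`);
    }
  }
  for (const agent of cfg.disable_agents ?? []) {
    if (!KNOWN_AGENTS.includes(agent)) {
      errors.push(`unknown agent in disable_agents: ${agent}`);
    }
  }
  if (cfg.threshold !== undefined && !inPercentRange(cfg.threshold)) {
    errors.push(`threshold out of range: ${cfg.threshold}`);
  }
  if (cfg.min_score !== undefined && !inPercentRange(cfg.min_score)) {
    errors.push(`min_score out of range: ${cfg.min_score}`);
  }
  return errors;
}
```

Collecting all errors instead of throwing on the first one is a design choice of this sketch; it lets a caller report every problem in a single pass.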
189
+
190
+ ## Learnings — persistent accept/reject memory
191
+
192
+ Each time the team accepts or rejects an advisory finding, the decision can be appended to `.squad/learnings.jsonl`. Future runs of the squad load recent decisions and inject them into per-agent and consolidator prompts so the squad stops re-raising findings the team has already considered.
193
+
194
+ ```jsonl
195
+ {"ts":"2026-04-12T15:02:31Z","pr":42,"agent":"senior-dev-security","severity":"Major","finding":"missing CSRF on POST /api/refund","decision":"reject","reason":"CSRF terminated at API gateway, see infra/edge.tf","scope":"src/api/**"}
196
+ {"ts":"2026-04-15T09:18:11Z","pr":47,"agent":"senior-architect","severity":"Major","finding":"cross-module coupling Auth → Billing","decision":"accept","reason":"refactored to event bus"}
197
+ ```
198
+
199
+ The file lives in git. Decisions are auditable in PR diffs.
200
+
201
+ ### Recording decisions
202
+
203
+ Inside Claude Code, after `/squad-review` produces the verdict, tell the skill to record:
204
+
205
+ ```
206
+ record reject senior-dev-security "missing CSRF on POST /api/refund"
207
+ reason: CSRF terminated at API gateway
208
+ scope: src/api/**
209
+ ```
210
+
211
+ The skill confirms each decision and calls the `record_learning` MCP tool. **Per-finding authorisation is required** — silence or "thanks" is not authorisation.
212
+
213
+ For non-MCP environments, use the CLI helper:
214
+
215
+ ```bash
216
+ node tools/record-learning.mjs --reject \
217
+ --agent senior-dev-security \
218
+ --finding "missing CSRF on POST /api/refund" \
219
+ --reason "CSRF terminated at API gateway" \
220
+ --scope "src/api/**" \
221
+ --pr 42
222
+ ```
223
+
224
+ ### How the squad uses them
225
+
226
+ In Phase 5 (per-agent advisory) the skill calls `read_learnings(workspace_root, agent, changed_files)` and injects the rendered `## Past team decisions` block into the agent's prompt. In Phase 10 (consolidator) it does the same without an agent filter — the consolidator sees the full picture across agents.
227
+
228
+ Each agent is told: when a current finding matches a previously **rejected** decision (similar agent + similar finding text + matching scope), suppress or downgrade severity unless the diff materially changes the rationale. When a finding contradicts a previously **accepted** decision, flag the contradiction explicitly.
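As an illustration of the read path described above, the sketch below parses a JSONL journal, filters by agent, keeps the most recent entries, and renders a markdown block. It is a sketch under assumptions, not the package's `read_learnings` code: the function names and the exact bullet format are hypothetical, and scope matching is left out for brevity. The field names follow the journal example earlier in this section.

```typescript
// Entry shape taken from the journal example; optional fields may be absent.
interface Learning {
  ts: string;
  pr?: number;
  agent: string;
  severity?: string;
  finding: string;
  decision: "accept" | "reject";
  reason?: string;
  scope?: string;
}

// Parse one JSON object per non-empty line.
function parseLearnings(jsonl: string): Learning[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as Learning);
}

// Keep the newest entries for one agent, or for all agents when agent is undefined.
// The hard cap of 200 mirrors the max_recent note in the configuration section.
function recentLearnings(entries: Learning[], agent?: string, maxRecent = 50): Learning[] {
  const limit = Math.min(Math.max(maxRecent, 0), 200);
  if (limit === 0) return [];
  const filtered = agent === undefined ? entries : entries.filter((e) => e.agent === agent);
  return filtered.slice(-limit); // journal is append-only, so the tail is the newest
}

// Hypothetical rendering of the "## Past team decisions" block.
function renderLearnings(entries: Learning[]): string {
  const bullets = entries.map((e) => {
    const reason = e.reason !== undefined ? ` (reason: ${e.reason})` : "";
    return `- [${e.decision}] ${e.agent}: ${e.finding}${reason}`;
  });
  return ["## Past team decisions", ...bullets].join("\n");
}
```

Calling `recentLearnings(entries)` without an agent corresponds to the consolidator view, which sees decisions across all agents.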
229
+
230
+ ### Configuration
231
+
232
+ Override defaults via `.squad.yaml`:
233
+
234
+ ```yaml
235
+ learnings:
236
+ path: .squad/learnings.jsonl # default
237
+ max_recent: 50 # how many recent entries to inject (hard cap 200)
238
+ enabled: true # set false to disable injection without deleting the journal
239
+ ```
240
+
241
+ The store reader is mtime-cached. The journal is append-only by design — the skill never amends or deletes past entries; correcting a stale decision means appending a new one.
242
+
243
+ ## Tasks — PRD-decomposed atomic work units
244
+
245
+ The biggest source of token bloat in a long-running squad session is the squad re-analysing the whole repo for every prompt. The tasks store fixes that by decomposing a PRD into atomic tasks up front, then running the squad on ONE task's narrowed scope at a time.
246
+
247
+ ```jsonc
248
+ // .squad/tasks.json (excerpt)
249
+ {
250
+ "version": 1,
251
+ "tasks": [
252
+ {
253
+ "id": 1,
254
+ "title": "Add CSRF token to checkout flow",
255
+ "status": "done",
256
+ "dependencies": [],
257
+ "priority": "high",
258
+ "scope": "src/api/checkout/**",
259
+ "agent_hints": ["senior-dev-security", "senior-developer"],
260
+ "test_strategy": "POST without token → 403; POST with token → 200.",
261
+ "subtasks": [],
262
+ "created_at": "2026-05-08T12:00:00Z",
263
+ "updated_at": "2026-05-09T15:30:00Z"
264
+ },
265
+ {
266
+ "id": 2,
267
+ "title": "Wire CSRF middleware into refund endpoint",
268
+ "status": "pending",
269
+ "dependencies": [1],
270
+ "priority": "high",
271
+ "scope": "src/api/refund/**",
272
+ "subtasks": [],
273
+ ...
274
+ }
275
+ ]
276
+ }
277
+ ```
278
+
279
+ `scope` (glob) and `agent_hints` are squad-mcp-specific additions on top of the claude-task-master shape — they let `slice_files_for_task` and `compose_squad_workflow` narrow the advisory automatically.
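To make the `scope` glob concrete, here is a minimal matcher that follows the syntax the README states for `skip_paths` (`**` for any depth, `*` for one segment, `?` for one char). It is an illustrative sketch, not the package's glob primitive: `globToRegExp` and `sliceFilesForScope` are hypothetical names, and treating `**/` as "zero or more directories" is an assumption.

```typescript
// Translate a glob into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      pattern += "(?:.*/)?"; // assumption: zero or more whole directory segments
      i += 3;
    } else if (glob.startsWith("**", i)) {
      pattern += ".*"; // any depth, including separators
      i += 2;
    } else if (glob.charAt(i) === "*") {
      pattern += "[^/]*"; // within a single path segment
      i += 1;
    } else if (glob.charAt(i) === "?") {
      pattern += "[^/]"; // exactly one non-separator character
      i += 1;
    } else {
      // Escape regex metacharacters in literal characters.
      pattern += glob.charAt(i).replace(/[.+^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + pattern + "$");
}

// Keep only the files that fall inside a task's scope.
function sliceFilesForScope(files: string[], scope: string): string[] {
  const matcher = globToRegExp(scope);
  return files.filter((file) => matcher.test(file));
}
```

For example, `sliceFilesForScope(changedFiles, "src/api/checkout/**")` keeps only paths under `src/api/checkout/`, which is the narrowing step the task workflow relies on.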
280
+
281
+ ### Decomposing a PRD
282
+
283
+ Inside Claude Code:
284
+
285
+ ```
286
+ /squad-tasks docs/prd-payments-refactor.md
287
+ ```
288
+
289
+ The skill (Phase 0.5):
290
+
291
+ 1. Calls `compose_prd_parse` with the PRD text.
292
+ 2. Receives a prompt + JSON schema and runs them through Claude.
293
+ 3. Shows you the parsed tasks — title, deps, priority, scope, agent_hints — for review.
294
+ 4. Calls `record_tasks` only after you say "record" / "go" / "yes".
295
+
296
+ The parse is **pure-MCP**: the squad-mcp server never makes LLM calls. The host (Claude Code, Cursor, Warp) does the inference. No provider keys in the server, no surprises for non-Claude clients.
297
+
298
+ ### Working tasks
299
+
300
+ ```
301
+ /squad-next # picks the highest-priority ready task
302
+ /squad-task 5 # explicit pick by id
303
+ ```
304
+
305
+ For each task:
306
+
307
+ - `slice_files_for_task` narrows the changed-files list to the task's `scope`.
308
+ - `compose_squad_workflow` runs against that slice; if `agent_hints` is set, only those agents wake up.
309
+ - Phase 1 onward proceeds normally, just with much less context.
310
+ - When done, the skill flips status to `done` via `update_task_status`.
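The selection rule behind `/squad-next` (pending status, all dependencies done, tiebreak on priority then id) can be sketched as follows. This is an illustration rather than the package's `next_task` code: `pickNextTask` is a hypothetical name, only a subset of task fields is modelled, and the `medium` and `low` priority values are assumptions because the README excerpt only shows `high`.

```typescript
type TaskStatus = "pending" | "in-progress" | "review" | "done" | "blocked" | "cancelled";
type TaskPriority = "high" | "medium" | "low"; // "medium" and "low" are assumed values

interface Task {
  id: number;
  status: TaskStatus;
  priority: TaskPriority;
  dependencies: number[];
}

// Lower rank sorts first.
const PRIORITY_RANK: Record<TaskPriority, number> = { high: 0, medium: 1, low: 2 };

// Return the next ready task, or null when nothing is ready.
function pickNextTask(tasks: Task[]): Task | null {
  const doneIds = new Set(tasks.filter((t) => t.status === "done").map((t) => t.id));
  const ready = tasks.filter(
    (t) => t.status === "pending" && t.dependencies.every((dep) => doneIds.has(dep)),
  );
  // Tiebreak: priority first, then the lowest id.
  ready.sort((a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority] || a.id - b.id);
  return ready[0] ?? null;
}
```

A task whose dependency is still `pending` never appears in `ready`, so a chain of tasks is worked strictly in dependency order.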
311
+
312
+ ### Configuration
313
+
314
+ Override defaults via `.squad.yaml`:
315
+
316
+ ```yaml
317
+ tasks:
318
+ path: .squad/tasks.json # default
319
+ enabled: true # set false to silence reads without deleting the file
320
+ ```
321
+
322
+ Writes (`record_tasks`, `update_task_status`, `expand_task`) stay open even when reads are disabled — same policy as learnings. Disabling injection should not throw away the journal.
323
+
324
+ ### CLI for non-MCP environments
325
+
326
+ Mirroring the post-review and record-learning helpers:
327
+
328
+ ```bash
329
+ # decompose offline (you generate the JSON yourself or via another tool)
330
+ echo '[{"title":"Add CSRF","scope":"src/api/**"}]' | node tools/record-tasks.mjs
331
+
332
+ # inspect
333
+ node tools/list-tasks.mjs --status pending
334
+ node tools/next-task.mjs --json
335
+
336
+ # flip status from CI
337
+ node tools/update-task-status.mjs --task 5 --status done
338
+ ```
339
+
340
+ The CLIs share `tools/_tasks-io.mjs` for read/write and require only node 18+. Schema validation is lighter than the MCP tool — production use should prefer the MCP path.
341
+
342
+ ## Posting reviews to GitHub PRs
343
+
344
+ Once the squad runs, you can post the verdict + scorecard as a `gh pr review` directly. The skill `/squad-review #42` runs the advisory and offers to post the result; default behaviour is **dry-run + confirmation** — Claude shows the exact `gh` command and the markdown body, then waits for your "go" before posting.
345
+
346
+ ```bash
347
+ # manual usage (outside the skill)
348
+ echo '<consolidation JSON>' | node tools/post-review.mjs --pr 42 --dry-run
349
+ # prints: gh pr review 42 --approve --body-file - <<'EOF' ... EOF
350
+
351
+ # actually post
352
+ echo '<consolidation JSON>' | node tools/post-review.mjs --pr 42
353
+ ```
354
+
355
+ The CLI maps verdict → `gh` action deterministically:
356
+
357
+ | Verdict | Score signal | `gh` action |
358
+ | -------------------------------------------------- | ---------------------- | ---------------------------------- |
359
+ | `REJECTED` | — | `--request-changes` (blocks merge) |
360
+ | `CHANGES_REQUIRED` | — | `--comment` (advisory) |
361
+ | `APPROVED` + `downgraded_by_score: true` | weighted < `min_score` | `--comment` |
362
+ | `APPROVED` + score < `request_changes_below_score` | (opt-in floor) | `--request-changes` |
363
+ | `APPROVED` otherwise | passes threshold | `--approve` |
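Read top to bottom, the table above amounts to a small decision function. The sketch below is one possible reading, not the code in `tools/post-review.mjs`: `ghActionFor` and the `weighted_score` field name are hypothetical, and evaluating the rows in table order (so `downgraded_by_score` is checked before the opt-in floor) is an assumption.

```typescript
type Verdict = "APPROVED" | "CHANGES_REQUIRED" | "REJECTED";
type GhAction = "--approve" | "--comment" | "--request-changes";

// Hypothetical subset of the consolidation JSON.
interface Consolidation {
  verdict: Verdict;
  weighted_score?: number; // assumed field name for the weighted rubric score
  downgraded_by_score?: boolean;
}

// requestChangesBelowScore mirrors pr_posting.request_changes_below_score (opt-in).
function ghActionFor(c: Consolidation, requestChangesBelowScore?: number): GhAction {
  if (c.verdict === "REJECTED") return "--request-changes";
  if (c.verdict === "CHANGES_REQUIRED") return "--comment";
  // From here on the verdict is APPROVED.
  if (c.downgraded_by_score === true) return "--comment";
  if (
    requestChangesBelowScore !== undefined &&
    c.weighted_score !== undefined &&
    c.weighted_score < requestChangesBelowScore
  ) {
    return "--request-changes";
  }
  return "--approve";
}
```

Because the function is pure, the same consolidation input always yields the same `gh` flag, which is what makes the dry-run preview trustworthy.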
364
+
365
+ ### Auto-post (opt-in)
366
+
367
+ If `.squad.yaml` has `pr_posting.auto_post: true`, the skill posts without the second confirmation prompt — but **always shows the body first**. Auto-post means "skip the second yes/no", not "skip the preview".
368
+
369
+ ```yaml
370
+ pr_posting:
371
+ auto_post: true # default false — always asks
372
+ request_changes_below_score: 50 # below this, post --request-changes instead of --approve
373
+ omit_attribution_footer: false # default false — footer present
374
+ ```
375
+
376
+ Requires the `gh` CLI in PATH and authenticated (`gh auth login`). The CLI exits with status 3 and a clear message if `gh` is missing.
377
+
130
378
  ## Detection strategy (`select_squad` / `slice_files_for_agent`)
131
379
 
132
380
  Three layers, in order of strength:
@@ -158,12 +406,17 @@ Run the `init_local_config` tool once to seed the local directory with editable
158
406
  squad-mcp/
159
407
  ├── .claude-plugin/ # Claude Code plugin manifest + marketplace
160
408
  ├── .github/workflows/ # CI + release workflows
161
- ├── agents/ # Bundled agent markdown defaults
162
- ├── commands/ # Plugin slash commands (/squad, /squad-review, /brainstorm, /commit-suggest)
163
- ├── skills/ # Bundled skills (commit-suggest, brainstorm)
409
+ ├── agents/ # Native subagents + shared docs
410
+ │ ├── *.md # 9 subagent definitions (kebab-case, with frontmatter)
411
+ │ └── _shared/ # severity matrix + skill specs (not loaded as subagents)
412
+ ├── commands/ # Slash commands (/squad, /squad-review, /brainstorm, /commit-suggest)
413
+ ├── skills/ # Bundled skills
414
+ │ ├── squad/ # single skill, two modes (implement | review)
415
+ │ ├── brainstorm/
416
+ │ └── commit-suggest/
164
417
  ├── src/
165
418
  │ ├── index.ts # stdio entry
166
- │ ├── tools/ # MCP tools (12 deterministic functions)
419
+ │ ├── tools/ # MCP tools (23 deterministic functions)
167
420
  │ ├── resources/ # MCP resources + agent loader
168
421
  │ ├── prompts/ # MCP prompt templates
169
422
  │ ├── exec/git.ts # hardened git execution layer
@@ -173,7 +426,6 @@ squad-mcp/
173
426
  │ └── ownership-matrix.ts # agents, work types, content/path patterns
174
427
  ├── tests/ # vitest unit + integration + stdio smoke
175
428
  ├── tools/
176
- │ ├── sync-agents.mjs # mirror agents + skills into ~/.claude/ for non-plugin clients
177
429
  │ └── git-hooks/commit-msg # opt-in hook rejecting AI-attribution trailers
178
430
  └── dist/ # compiled JS (gitignored, shipped via npm)
179
431
  ```
package/agents/{PO.md → product-owner.md} CHANGED
@@ -1,6 +1,12 @@
1
+ ---
2
+ name: product-owner
3
+ description: Product Owner. Validates business value, functional requirements, and UX. Use for features, business-rule changes, and user-facing surfaces.
4
+ model: inherit
5
+ ---
6
+
1
7
  # PO (Product Owner)
2
8
 
3
- > Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
9
+ > Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)
4
10
 
5
11
  ## Role
6
12
  Business representative in technical review. Ensures every implementation delivers real value to the end user and aligns with product goals.
@@ -82,3 +88,29 @@ Objective summary of the evaluation.
82
88
  - Be pragmatic: not every gap is a blocker, classify by severity
83
89
  - Frame impact in business terms, not technical ones
84
90
  - Without a user story, judge by observable behavior and product common sense
91
+
92
+ ## Score
93
+
94
+ At the end of your advisory output, emit exactly:
95
+
96
+ ```
97
+ Score: <NN>/100
98
+ Score rationale: <one sentence on what drove the score>
99
+ ```
100
+
101
+ The score is YOUR dimension's contribution to the squad rubric (`Business & UX`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
102
+
103
+ ### Calibration
104
+
105
+ - 90-100: requirement matches the change; UX clear; business value evident.
106
+ - 70-89: minor mismatch with stated requirement or UX awkwardness.
107
+ - **50-69: one Major — business rule contradicted, UX broken on critical flow, requirement absent.**
108
+ - 30-49: change does not deliver claimed value; conflicts with PO intent.
109
+ - 0-29: should not be built; halt.
110
+
111
+ ### Notes
112
+
113
+ - Score is per-agent. Do not score other dimensions.
114
+ - Score reflects the slice of files you reviewed, not the whole change.
115
+ - A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
116
+ - An honest 65 is more useful than a generous 80; the rubric is auditable.
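How the consolidator might combine per-agent scores into a weighted scorecard can be sketched as below. This is not the package's `score_rubric` implementation: `weightedScorecard` is a hypothetical name, and normalising by the total weight of the agents that actually reported is an assumption.

```typescript
interface Scorecard {
  weighted: number; // weighted average on the 0-100 scale
  flagged: string[]; // agents whose score is below the threshold
}

// scores: per-agent score (0-100); weights: per-agent weight (expected to sum to 100).
function weightedScorecard(
  scores: Record<string, number>,
  weights: Record<string, number>,
  threshold = 75,
): Scorecard {
  let weightedSum = 0;
  let totalWeight = 0;
  const flagged: string[] = [];
  for (const [agent, score] of Object.entries(scores)) {
    const weight = weights[agent] ?? 0; // agents without a weight do not count
    weightedSum += weight * score;
    totalWeight += weight;
    if (score < threshold) flagged.push(agent);
  }
  // Assumption: normalise by the weight of the agents that reported a score.
  const weighted = totalWeight === 0 ? 0 : weightedSum / totalWeight;
  return { weighted, flagged };
}
```

With weights 30/22/20/15/13 and scores 90/80/70/60/100, the weighted sum is 8060 over a total weight of 100, giving 80.6, and the two scores below 75 are flagged.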
package/agents/{Senior-Architect.md → senior-architect.md} CHANGED
@@ -1,6 +1,12 @@
1
+ ---
2
+ name: senior-architect
3
+ description: Senior Architect. Guards module boundaries, coupling, dependency direction, DI lifetimes, and scalability. Use for structural changes and new modules.
4
+ model: inherit
5
+ ---
6
+
1
7
  # Senior-Architect
2
8
 
3
- > Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
9
+ > Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)
4
10
 
5
11
  ## Role
6
12
  Guardian of architectural integrity. Evaluates design decisions with a long-term lens and keeps the solution from eroding boundaries.
@@ -119,3 +125,29 @@ Summary of the diagnosis and long-term view.
  - Distinguish "ideal" from "acceptable for now"
  - Avoid astronaut architecture — prefer pragmatic solutions
  - If the issue is implementation (not design), forward to the right agent
+
+ ## Score
+
+ At the end of your advisory output, emit exactly:
+
+ ```
+ Score: <NN>/100
+ Score rationale: <one sentence on what drove the score>
+ ```
+
+ The score is YOUR dimension's contribution to the squad rubric (`Architecture`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+ ### Calibration
+
+ - 90-100: clean module/domain boundaries, DI lifetimes correct, no coupling regression, extensibility clear.
+ - 70-89: minor issues (over-eager abstraction, ambiguous responsibility split) but no actionable Major.
+ - **50-69: at least one Major (cross-module coupling, wrong DI lifetime, hidden mutable state).**
+ - 30-49: multiple Majors or one Blocker that endangers structural integrity.
+ - 0-29: architecture-level break; halt.
+
+ ### Notes
+
+ - Score is per-agent. Do not score other dimensions.
+ - Score reflects the slice of files you reviewed, not the whole change.
+ - A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+ - An honest 65 is more useful than a generous 80; the rubric is auditable.
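The two-line trailer that each agent is told to emit (`Score: <NN>/100` followed by `Score rationale: ...`) is easy to extract mechanically. The following is a minimal sketch of such a parser, not the package's actual implementation: `parseScoreTrailer` and `AgentScore` are hypothetical names introduced here for illustration, and the sketch assumes the rationale line directly follows the score line.

```typescript
// Hypothetical helper for illustration; not part of the published package API.
interface AgentScore {
  score: number; // integer in the range 0..100
  rationale: string; // the one-sentence rationale
}

function parseScoreTrailer(output: string): AgentScore | null {
  // Match "Score: NN/100" on one line and "Score rationale: ..." on the next.
  const pattern = /^Score:\s*(\d{1,3})\/100\s*\r?\nScore rationale:\s*(.+)$/m;
  const match = pattern.exec(output);
  if (!match) {
    return null; // trailer missing or malformed
  }
  const score = Number(match[1]);
  if (score > 100) {
    return null; // out of range for an NN/100 score
  }
  return { score, rationale: match[2].trim() };
}

const sample =
  "Summary of the diagnosis.\n\n" +
  "Score: 65/100\n" +
  "Score rationale: One Major on DI lifetime.\n";
const parsed = parseScoreTrailer(sample);
```

Returning `null` instead of throwing leaves the caller free to decide how to treat an agent that omitted the trailer.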
@@ -1,6 +1,12 @@
+ ---
+ name: senior-dba
+ description: Senior DBA. Reviews queries, migrations, EF mappings, cache, concurrency, and persistence stack. Use for data-layer changes.
+ model: inherit
+ ---
+
  # Senior-DBA
 
- > Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+ > Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)
 
  ## Role
  Data specialist. Ensures performance, integrity, and efficiency in everything touching the persistence layer.
@@ -135,3 +141,29 @@ Summary and prioritized risks.
  - Be conservative with migrations — prefer additive operations
  - Challenge every query without WHERE or with SELECT *
  - Validate suggested indexes do not degrade write performance
+
+ ## Score
+
+ At the end of your advisory output, emit exactly:
+
+ ```
+ Score: <NN>/100
+ Score rationale: <one sentence on what drove the score>
+ ```
+
+ The score is YOUR dimension's contribution to the squad rubric (`Data Layer`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+ ### Calibration
+
+ - 90-100: queries efficient, migrations safe and reversible, EF mappings correct, no concurrency hazard.
+ - 70-89: minor inefficiencies or missing indexes; no data-integrity risk.
+ - **50-69: one Major — N+1 query, missing transaction, broken concurrency control, mismatched stack mix.**
+ - 30-49: data integrity at risk (race, lost update, irreversible migration without backout).
+ - 0-29: data corruption likely; halt.
+
+ ### Notes
+
+ - Score is per-agent. Do not score other dimensions.
+ - Score reflects the slice of files you reviewed, not the whole change.
+ - A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+ - An honest 65 is more useful than a generous 80; the rubric is auditable.
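The Score sections state that the consolidator weights each agent's score and compares the result against a threshold (default 75), but this part of the diff shows neither the weights nor the aggregation formula. The sketch below assumes a plain weighted mean, rounded to an integer, with a `>=` comparison; `consolidate`, `DimensionScore`, and the example weights are all hypothetical and may differ from what the package does.

```typescript
// Hypothetical shapes and weights for illustration only.
interface DimensionScore {
  dimension: string; // e.g. "Architecture", "Data Layer"
  score: number; // per-agent score, 0..100
  weight: number; // relative weight (assumed; not shown in the diff)
}

interface Scorecard {
  overall: number;
  threshold: number;
  passes: boolean;
}

function consolidate(scores: DimensionScore[], threshold = 75): Scorecard {
  const totalWeight = scores.reduce((sum, s) => sum + s.weight, 0);
  if (totalWeight <= 0) {
    throw new Error("at least one positive weight is required");
  }
  // Weighted mean of the per-agent scores, rounded to the nearest integer.
  const weightedSum = scores.reduce((sum, s) => sum + s.score * s.weight, 0);
  const overall = Math.round(weightedSum / totalWeight);
  // Assumption: a score equal to the threshold counts as passing.
  return { overall, threshold, passes: overall >= threshold };
}

const card = consolidate([
  { dimension: "Architecture", score: 65, weight: 2 },
  { dimension: "Data Layer", score: 80, weight: 1 },
  { dimension: "Security", score: 90, weight: 1 },
]);
// (65*2 + 80 + 90) / 4 = 75, so card.overall is 75 and card.passes is true.
```

With these assumed weights, a single Major in a heavily weighted dimension pulls the overall score down to the threshold even though the other two dimensions score well.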
@@ -1,6 +1,12 @@
+ ---
+ name: senior-dev-reviewer
+ description: Senior code reviewer. Focuses on readability, code smells, naming, idioms, async/await correctness, and error handling.
+ model: inherit
+ ---
+
  # Senior-Dev-Reviewer
 
- > Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+ > Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)
 
  ## Role
  Senior code reviewer focused on quality, readability, and maintainability. Performs detailed line-level review, applies the idiomatic checklist for the detected language/framework, and produces a numeric scorecard so reviewers and the tech-lead can see at a glance where the change stands.
@@ -606,3 +612,29 @@ Summary and decision. Restate the overall score and the top 1–3 things the aut
  - Be specific: always reference file and line
  - When the language idiom and the existing codebase conflict, side with the existing codebase consistency and flag the inconsistency for separate discussion
  - Remember: the goal is that the author learns, not just that they fix
+
+ ## Score
+
+ At the end of your advisory output, emit exactly:
+
+ ```
+ Score: <NN>/100
+ Score rationale: <one sentence on what drove the score>
+ ```
+
+ The score is YOUR dimension's contribution to the squad rubric (`Code Quality`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+ ### Calibration
+
+ - 90-100: idiomatic, readable, well-named, async/error patterns clean.
+ - 70-89: minor style or naming smells; no idiom violations of consequence.
+ - 50-69: one Major — wrong async pattern, swallowed exception, name that misleads readers.
+ - 30-49: multiple Majors; reviewer fatigue indicator.
+ - 0-29: code unmaintainable as-is; halt.
+
+ ### Notes
+
+ - Score is per-agent. Do not score other dimensions.
+ - Score reflects the slice of files you reviewed, not the whole change.
+ - A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+ - An honest 65 is more useful than a generous 80; the rubric is auditable.
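Every agent's Calibration list uses the same five bands (90-100, 70-89, 50-69, 30-49, 0-29), so a score can be mapped to its band with a simple lookup. This is an illustrative sketch only; `bandFor` and the `Band` type are hypothetical names, and nothing in the diff indicates the package exposes such a function.

```typescript
// Hypothetical lookup for illustration; band edges are taken from the Calibration lists.
type Band = "90-100" | "70-89" | "50-69" | "30-49" | "0-29";

function bandFor(score: number): Band {
  // Reject anything that is not a whole number in 0..100.
  if (Math.floor(score) !== score || score < 0 || score > 100) {
    throw new RangeError("score must be an integer between 0 and 100");
  }
  if (score >= 90) return "90-100";
  if (score >= 70) return "70-89";
  if (score >= 50) return "50-69"; // the band that signals one Major
  if (score >= 30) return "30-49";
  return "0-29"; // halt territory
}

const honest = bandFor(65); // "50-69"
const generous = bandFor(80); // "70-89"
```

The two sample values mirror the note about an honest 65 versus a generous 80: they fall into different bands, which is why the difference matters to the scorecard.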
@@ -1,6 +1,12 @@
+ ---
+ name: senior-dev-security
+ description: Application security specialist. Finds OWASP Top 10 vulnerabilities, validates authn/authz, sensitive data, input validation, and dependency CVEs.
+ model: inherit
+ ---
+
  # Senior-Dev-Security
 
- > Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+ > Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)
 
  ## Role
  Application security specialist. Identifies vulnerabilities, validates access controls, and ensures sensitive data is protected.
@@ -132,3 +138,29 @@ Summary of risks and prioritized recommendations.
  - Do not generate false positives — only report with real or highly likely evidence
  - Prioritize by real impact, not theoretical checklist
  - Explicitly record what could not be validated
+
+ ## Score
+
+ At the end of your advisory output, emit exactly:
+
+ ```
+ Score: <NN>/100
+ Score rationale: <one sentence on what drove the score>
+ ```
+
+ The score is YOUR dimension's contribution to the squad rubric (`Security`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+ ### Calibration
+
+ - 90-100: no OWASP issue; authn/authz tight; secrets handled; no new dependency risk.
+ - 70-89: minor concerns (missing input length cap, weak rate limit) — not exploitable.
+ - **50-69: one Major — IDOR, missing authz check, secret in log, unsafe dependency.**
+ - 30-49: exploitable today (auth bypass, SQLi, RCE); Blocker territory.
+ - 0-29: critical security break; halt.
+
+ ### Notes
+
+ - Score is per-agent. Do not score other dimensions.
+ - Score reflects the slice of files you reviewed, not the whole change.
+ - A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+ - An honest 65 is more useful than a generous 80; the rubric is auditable.
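The Notes rule that a score of 0 must be accompanied by a Blocker is a consistency constraint that could be checked before consolidation. The sketch below shows one way to express it; `zeroScoreViolation` and `AgentReport` are hypothetical, and the `"Minor"` severity label is an assumption (only Blocker and Major are named as severities in the lines shown here).

```typescript
// Hypothetical types for illustration; "Minor" is an assumed severity label.
type Severity = "Blocker" | "Major" | "Minor";

interface AgentReport {
  agent: string; // e.g. "senior-dev-security"
  score: number; // per-agent score, 0..100
  severities: Severity[]; // severities of the findings the agent raised
}

// Returns a description of the violation, or null when the report is consistent.
function zeroScoreViolation(report: AgentReport): string | null {
  const hasBlocker = report.severities.some((s) => s === "Blocker");
  if (report.score === 0 && !hasBlocker) {
    return `${report.agent}: score 0 emitted without a Blocker finding`;
  }
  return null;
}

const inconsistent = zeroScoreViolation({
  agent: "senior-dev-security",
  score: 0,
  severities: ["Major"],
});
const consistent = zeroScoreViolation({
  agent: "senior-dev-security",
  score: 0,
  severities: ["Blocker"],
});
```

The check is deliberately one-directional: the rule forbids a 0 without a Blocker, but a Blocker with a non-zero score (for example in the 30-49 band) is allowed by the Calibration lists.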
@@ -1,6 +1,12 @@
+ ---
+ name: senior-developer
+ description: Pragmatic senior developer. Reviews technical correctness, robustness, API contracts, external integrations, observability, and application performance.
+ model: inherit
+ ---
+
  # Senior-Developer
 
- > Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+ > Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)
 
  ## Role
  Pragmatic senior developer focused on robust implementation. Evaluates code from the perspective of someone who will maintain, debug, and evolve it day to day.
@@ -178,3 +184,29 @@ Summary of the analysis and confidence in the solution for production.
  - Focus on real, probable bugs — not unlikely theoretical scenarios
  - Production is hostile: anything that can go wrong, will
  - Moderate duplication is acceptable when the alternative is a premature abstraction
+
+ ## Score
+
+ At the end of your advisory output, emit exactly:
+
+ ```
+ Score: <NN>/100
+ Score rationale: <one sentence on what drove the score>
+ ```
+
+ The score is YOUR dimension's contribution to the squad rubric (`Application Code`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+ ### Calibration
+
+ - 90-100: correctness solid, robustness considered, API contract honoured, observability in place.
+ - 70-89: minor robustness gaps (one ambiguous error path, missing log) but no behavioural break.
+ - **50-69: one Major — broken contract, missing error handling, observability hole on critical path.**
+ - 30-49: multiple Majors or behaviour change with no test/log support.
+ - 0-29: ships broken; halt.
+
+ ### Notes
+
+ - Score is per-agent. Do not score other dimensions.
+ - Score reflects the slice of files you reviewed, not the whole change.
+ - A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+ - An honest 65 is more useful than a generous 80; the rubric is auditable.