@gempack/squad-mcp 0.5.0 → 0.6.0
- package/.claude-plugin/marketplace.json +2 -2
- package/.claude-plugin/plugin.json +3 -2
- package/CHANGELOG.md +260 -17
- package/INSTALL.md +156 -24
- package/README.md +279 -27
- package/agents/{PO.md → product-owner.md} +33 -1
- package/agents/{Senior-Architect.md → senior-architect.md} +33 -1
- package/agents/{Senior-DBA.md → senior-dba.md} +33 -1
- package/agents/{Senior-Dev-Reviewer.md → senior-dev-reviewer.md} +33 -1
- package/agents/{Senior-Dev-Security.md → senior-dev-security.md} +33 -1
- package/agents/{Senior-Developer.md → senior-developer.md} +33 -1
- package/agents/{Senior-QA.md → senior-qa.md} +33 -1
- package/agents/{TechLead-Consolidator.md → tech-lead-consolidator.md} +7 -1
- package/agents/{TechLead-Planner.md → tech-lead-planner.md} +7 -1
- package/commands/squad-review.md +10 -58
- package/commands/squad.md +11 -70
- package/dist/config/ownership-matrix.d.ts +24 -2
- package/dist/config/ownership-matrix.js +466 -139
- package/dist/config/ownership-matrix.js.map +1 -1
- package/dist/config/squad-yaml.d.ts +242 -0
- package/dist/config/squad-yaml.js +403 -0
- package/dist/config/squad-yaml.js.map +1 -0
- package/dist/errors.d.ts +1 -1
- package/dist/errors.js +1 -1
- package/dist/errors.js.map +1 -1
- package/dist/format/pr-review.d.ts +61 -0
- package/dist/format/pr-review.js +146 -0
- package/dist/format/pr-review.js.map +1 -0
- package/dist/index.js +19 -13
- package/dist/index.js.map +1 -1
- package/dist/learning/format.d.ts +29 -0
- package/dist/learning/format.js +55 -0
- package/dist/learning/format.js.map +1 -0
- package/dist/learning/store.d.ts +102 -0
- package/dist/learning/store.js +169 -0
- package/dist/learning/store.js.map +1 -0
- package/dist/resources/agent-loader.d.ts +1 -1
- package/dist/resources/agent-loader.js +53 -40
- package/dist/resources/agent-loader.js.map +1 -1
- package/dist/tasks/select.d.ts +64 -0
- package/dist/tasks/select.js +84 -0
- package/dist/tasks/select.js.map +1 -0
- package/dist/tasks/store.d.ts +338 -0
- package/dist/tasks/store.js +321 -0
- package/dist/tasks/store.js.map +1 -0
- package/dist/tools/compose-advisory-bundle.d.ts +5 -5
- package/dist/tools/compose-advisory-bundle.js +24 -12
- package/dist/tools/compose-advisory-bundle.js.map +1 -1
- package/dist/tools/compose-prd-parse.d.ts +53 -0
- package/dist/tools/compose-prd-parse.js +167 -0
- package/dist/tools/compose-prd-parse.js.map +1 -0
- package/dist/tools/compose-squad-workflow.d.ts +28 -10
- package/dist/tools/compose-squad-workflow.js +0 -0
- package/dist/tools/compose-squad-workflow.js.map +1 -1
- package/dist/tools/consolidate.d.ts +55 -4
- package/dist/tools/consolidate.js +87 -15
- package/dist/tools/consolidate.js.map +1 -1
- package/dist/tools/expand-task.d.ts +51 -0
- package/dist/tools/expand-task.js +35 -0
- package/dist/tools/expand-task.js.map +1 -0
- package/dist/tools/list-tasks.d.ts +31 -0
- package/dist/tools/list-tasks.js +50 -0
- package/dist/tools/list-tasks.js.map +1 -0
- package/dist/tools/next-task.d.ts +37 -0
- package/dist/tools/next-task.js +60 -0
- package/dist/tools/next-task.js.map +1 -0
- package/dist/tools/read-learnings.d.ts +53 -0
- package/dist/tools/read-learnings.js +72 -0
- package/dist/tools/read-learnings.js.map +1 -0
- package/dist/tools/read-squad-config.d.ts +23 -0
- package/dist/tools/read-squad-config.js +34 -0
- package/dist/tools/read-squad-config.js.map +1 -0
- package/dist/tools/record-learning.d.ts +62 -0
- package/dist/tools/record-learning.js +80 -0
- package/dist/tools/record-learning.js.map +1 -0
- package/dist/tools/record-tasks.d.ts +71 -0
- package/dist/tools/record-tasks.js +45 -0
- package/dist/tools/record-tasks.js.map +1 -0
- package/dist/tools/registry.d.ts +1 -1
- package/dist/tools/registry.js +71 -39
- package/dist/tools/registry.js.map +1 -1
- package/dist/tools/score-rubric.d.ts +74 -0
- package/dist/tools/score-rubric.js +140 -0
- package/dist/tools/score-rubric.js.map +1 -0
- package/dist/tools/slice-files-for-task.d.ts +31 -0
- package/dist/tools/slice-files-for-task.js +52 -0
- package/dist/tools/slice-files-for-task.js.map +1 -0
- package/dist/tools/update-task-status.d.ts +29 -0
- package/dist/tools/update-task-status.js +35 -0
- package/dist/tools/update-task-status.js.map +1 -0
- package/package.json +4 -1
- package/skills/squad/SKILL.md +454 -0
- package/tools/post-review.mjs +212 -0
- package/agents/{Skill-Squad-Dev.md → _shared/Skill-Squad-Dev.md} +0 -0
- package/agents/{Skill-Squad-Review.md → _shared/Skill-Squad-Review.md} +0 -0
- package/agents/{_Severity-and-Ownership.md → _shared/_Severity-and-Ownership.md} +0 -0
package/README.md
CHANGED
@@ -75,20 +75,31 @@ node dist/index.js

 ### Tools (deterministic, pure functions)

-| Tool
-
-| `detect_changed_files`
-| `classify_work_type`
-| `score_risk`
-| `select_squad`
-| `slice_files_for_agent`
-| `validate_plan_text`
-| `compose_squad_workflow`
-| `compose_advisory_bundle`
-| `apply_consolidation_rules` | Aggregate advisory reports → final verdict (APPROVED / CHANGES_REQUIRED / REJECTED). |
-| `
-| `
-| `
+| Tool | Purpose |
+| --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `detect_changed_files` | Hardened `git diff --name-status --no-renames` for a workspace. Allowlisted refs, 10s timeout, 1MB stdout cap. |
+| `classify_work_type` | Heuristic `WorkType` from prompt + paths (`Feature` / `Bug Fix` / `Refactor` / `Performance` / `Security` / `Business Rule`) with Low/Medium/High confidence. |
+| `score_risk` | Compute Low/Medium/High from boolean signals (auth, money, migration, files_count, new_module, api_change). |
+| `select_squad` | Select advisory agents for a work type. Combines matrix + path hints + content sniff. Returns evidence per file. |
+| `slice_files_for_agent` | Filter a file list to those owned by a single agent. Used to build sliced advisory prompts. |
+| `validate_plan_text` | Advisory check for inviolable-rule violations in a plan (commit/push fences, emojis in code blocks, non-English identifiers, impl-before-approval). |
+| `compose_squad_workflow` | One-call pipeline: `detect_changed_files` → `classify_work_type` → `score_risk` → `select_squad`. |
+| `compose_advisory_bundle` | One-call full bundle: `compose_squad_workflow` + `slice_files_for_agent` per selected agent + `validate_plan_text`. |
+| `apply_consolidation_rules` | Aggregate advisory reports → final verdict (APPROVED / CHANGES_REQUIRED / REJECTED). Returns weighted rubric scorecard when reports carry per-dimension scores. |
+| `score_rubric` | Pure rubric calculator. Takes per-agent scores (0-100) + optional weight overrides, returns weighted score, per-dimension breakdown, and pre-formatted ASCII scorecard. |
+| `read_squad_config` | Read and resolve `.squad.yaml` (or `.squad.yml`) at workspace_root. Returns effective weights, threshold, min_score, skip_paths, disable_agents. |
+| `read_learnings` | Load past accept/reject decisions from `.squad/learnings.jsonl`. Filters by agent / decision / changed-file scope. Returns entries plus a markdown block ready to inject into agent or consolidator prompts. |
+| `record_learning` | Append one accept/reject decision to `.squad/learnings.jsonl`. Side-effecting; the skill (or CLI) is responsible for per-finding user authorisation. |
+| `compose_prd_parse` | Build a prompt + JSON schema for the host LLM to decompose a PRD into atomic tasks. Pure-MCP: server does NO LLM calls. Caller (skill) feeds the prompt to its model, then calls `record_tasks` after user confirmation. |
+| `list_tasks` | Read tasks from `.squad/tasks.json`. Filters: status, agent (matches `agent_hints`), changed_files (glob match against task `scope`). |
+| `next_task` | Pick the next ready task: candidate status (default pending), all dependencies done, optional agent / changed_files filter. Tiebreak priority then id. Returns null + reason when none ready. |
+| `record_tasks` | Bulk-create tasks. Allocates ids sequentially, validates dependencies resolve (forward refs in batch ok), rejects duplicates and self-deps. Atomic write. |
+| `update_task_status` | Flip a task or subtask status: pending / in-progress / review / done / blocked / cancelled. |
+| `expand_task` | Append subtasks to an existing task. Mechanical only — caller (skill or LLM) supplies the subtask inputs. |
+| `slice_files_for_task` | Filter a file list to those matching a task's `scope` glob. Same glob primitive as `skip_paths` and learnings scope. |
+| `list_agents` | List configured agents with role, ownership, naming conventions. |
+| `get_agent_definition` | Return the full markdown system prompt for an agent (local override → embedded default). |
+| `init_local_config` | Copy embedded defaults to the local override directory so they can be edited. |

 ### Prompts

|
|
@@ -98,20 +109,27 @@ node dist/index.js
|
|
|
98
109
|
|
|
99
110
|
### Resources
|
|
100
111
|
|
|
101
|
-
- `agent://
|
|
112
|
+
- `agent://product-owner`, `agent://tech-lead-planner`, `agent://tech-lead-consolidator`, `agent://senior-architect`, `agent://senior-dba`, `agent://senior-developer`, `agent://senior-dev-reviewer`, `agent://senior-dev-security`, `agent://senior-qa`. (Renamed from PascalCase / `po` in v0.6.0 — older 0.5.x consumers must use `agent://po` instead.)
|
|
102
113
|
- `severity://_severity-and-ownership` — severity matrix + ownership rules.
|
|
103
114
|
- `severity://skill-squad-dev`, `severity://skill-squad-review` — full skill specs.
|
|
104
115
|
|
|
105
116
|
### Bundled skills
|
|
106
117
|
|
|
107
|
-
The plugin auto-registers these skills via `skills
|
|
118
|
+
The plugin auto-registers these skills via `skills/`:
|
|
108
119
|
|
|
109
|
-
| Skill
|
|
110
|
-
|
|
111
|
-
| `/squad`
|
|
112
|
-
| `/
|
|
113
|
-
| `/
|
|
114
|
-
|
|
120
|
+
| Skill | Trigger | Purpose |
|
|
121
|
+
| ----------------- | --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
122
|
+
| `/squad` | implementation workflow | Single skill, two modes. `/squad <task>` builds an approved plan, distributes work to specialist subagents in parallel, implements the change, consolidates via tech-lead. `/squad-review [target]` is the same skill in review mode — never implements, just produces an advisory verdict on an existing diff/branch/PR. Optional `--codex` second-opinion. |
|
|
123
|
+
| `/brainstorm` | pre-implementation research | Web research in parallel + specialist agent perspectives → options matrix with cited sources and a recommendation. Produces no code. Position: `/brainstorm` decides what to build, `/squad` implements, `/squad-review` reviews. |
|
|
124
|
+
| `/commit-suggest` | commit message generator | Read-only suggester for Conventional Commits messages. Runs only an allowlist of git commands; never executes mutations; never adds AI co-author trailers. The user runs the commit themselves. |
|
|
125
|
+
|
|
126
|
+
### Bundled subagents
|
|
127
|
+
|
|
128
|
+
The plugin's `agents/` directory registers nine native Claude Code subagents you can also dispatch directly via `Task(subagent_type=…)`:
|
|
129
|
+
|
|
130
|
+
`product-owner`, `senior-architect`, `senior-dba`, `senior-developer`, `senior-dev-reviewer`, `senior-dev-security`, `senior-qa`, `tech-lead-planner`, `tech-lead-consolidator`.
|
|
131
|
+
|
|
132
|
+
The `/squad` skill orchestrates them. For non-Claude-Code MCP clients (Cursor, Claude Desktop, Warp), the same role markdowns are accessible through the MCP `agent://…` resources and `get_agent_definition` tool.
|
|
115
133
|
|
|
116
134
|
Workflow positioning:
|
|
117
135
|
@@ -127,6 +145,236 @@ Workflow positioning:

 See [INSTALL.md](INSTALL.md#bundled-skills) for trigger examples and the optional `commit-msg` git hook + `permissions.deny` snippet that hard-enforce the read-only and no-AI-attribution invariants at the OS / Claude Code layer.

+## Repo configuration — `.squad.yaml`
+
+Drop a `.squad.yaml` (or `.squad.yml`) at the repo root to override defaults per-project. Versioned with the code, picked up automatically by `compose_squad_workflow` and `compose_advisory_bundle`.
+
+```yaml
+# .squad.yaml — example for a regulated fintech backend
+
+# Rubric weights (must sum to 100 across the agents you list).
+# Agents NOT listed are zeroed out — listing weights is an explicit choice
+# of which dimensions count for this repo.
+weights:
+  senior-dev-security: 30 # PCI compliance — security weighted higher
+  senior-dba: 22 # double-entry ledger, money on the line
+  senior-developer: 20
+  senior-architect: 15
+  senior-qa: 13
+
+# Per-dimension flag threshold (default 75). Below this, the dimension is
+# marked with ⚠ in the scorecard.
+threshold: 80
+
+# Quality floor: APPROVED with weighted score below this becomes
+# CHANGES_REQUIRED. Severity rules (Blocker/Major) take precedence.
+min_score: 75
+
+# Files excluded from advisory. Glob syntax: ** for any depth, * for one
+# segment, ? for one char. Useful for docs-only or generated paths.
+skip_paths:
+  - "docs/**"
+  - "**/*.md"
+  - "**/generated/**"
+  - "vendor/**"
+
+# Agents not relevant for this repo (e.g. internal tool, no PO involved).
+disable_agents:
+  - product-owner
+```
+
+All keys are optional; partial files merge with package defaults. `force_agents` in tool calls still wins over `disable_agents` (config is a default policy, not a veto over explicit caller intent). Validation is strict: weights that don't sum to 100, unknown agent names, or invalid threshold ranges are rejected with a clear error.
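
Editor's note: the strict validation described in the README text above could be sketched roughly as follows. This is a minimal Python illustration, not the package's implementation (which ships as compiled JS in `dist/config/squad-yaml.js`). The names `KNOWN_AGENTS` and `validate_squad_config`, the error wording, and the 0-100 range assumed for `threshold` and `min_score` are all assumptions made for the example.

```python
# Hypothetical agent registry; the nine names come from the README's subagent list.
KNOWN_AGENTS = {
    "product-owner", "senior-architect", "senior-dba", "senior-developer",
    "senior-dev-reviewer", "senior-dev-security", "senior-qa",
    "tech-lead-planner", "tech-lead-consolidator",
}


def validate_squad_config(config: dict) -> list[str]:
    """Return human-readable errors; an empty list means the config is accepted."""
    errors: list[str] = []

    weights = config.get("weights")
    if weights is not None:
        unknown = sorted(set(weights) - KNOWN_AGENTS)
        if unknown:
            errors.append(f"unknown agents in weights: {', '.join(unknown)}")
        total = sum(weights.values())
        if total != 100:
            errors.append(f"weights must sum to 100, got {total}")

    # Assumed range: scores are 0-100, so thresholds outside it are rejected.
    for key in ("threshold", "min_score"):
        value = config.get(key)
        if value is not None and not 0 <= value <= 100:
            errors.append(f"{key} must be between 0 and 100, got {value}")

    unknown_disabled = sorted(set(config.get("disable_agents", [])) - KNOWN_AGENTS)
    if unknown_disabled:
        errors.append(f"unknown agents in disable_agents: {', '.join(unknown_disabled)}")

    return errors
```

Collecting every error instead of stopping at the first is a design choice of this sketch; it lets a single run report all problems in a hand-edited file.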
+
+The reader is cached by mtime — long-running MCP servers automatically pick up edits without a restart.
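
Editor's note: mtime-based caching as mentioned above generally means re-reading a file only when its modification time changes. A minimal Python sketch of that idea follows; `read_cached` and the module-level `_cache` are hypothetical names, and the package's own reader may key or invalidate its cache differently.

```python
import os

# path -> (mtime in nanoseconds, file text); hypothetical in-process cache.
_cache: dict[str, tuple[int, str]] = {}


def read_cached(path: str) -> str:
    """Return the file's text, hitting the disk only when its mtime changed."""
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]  # unchanged since the last read
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    _cache[path] = (mtime, text)
    return text
```

One caveat of any mtime check: two writes that land within the filesystem's timestamp resolution can look identical, so an edit made immediately after a read may be missed until the next change.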
+
+## Learnings — persistent accept/reject memory
+
+Each time the team accepts or rejects an advisory finding, the decision can be appended to `.squad/learnings.jsonl`. Future runs of the squad load recent decisions and inject them into per-agent and consolidator prompts so the squad stops re-raising findings the team has already considered.
+
+```jsonl
+{"ts":"2026-04-12T15:02:31Z","pr":42,"agent":"senior-dev-security","severity":"Major","finding":"missing CSRF on POST /api/refund","decision":"reject","reason":"CSRF terminated at API gateway, see infra/edge.tf","scope":"src/api/**"}
+{"ts":"2026-04-15T09:18:11Z","pr":47,"agent":"senior-architect","severity":"Major","finding":"cross-module coupling Auth → Billing","decision":"accept","reason":"refactored to event bus"}
+```
+
+The file lives in git. Decisions are auditable in PR diffs.
+
+### Recording decisions
+
+Inside Claude Code, after `/squad-review` produces the verdict, tell the skill to record:
+
+```
+record reject senior-dev-security "missing CSRF on POST /api/refund"
+reason: CSRF terminated at API gateway
+scope: src/api/**
+```
+
+The skill confirms each decision and calls the `record_learning` MCP tool. **Per-finding authorisation is required** — silence or "thanks" is not authorisation.
+
+For non-MCP environments, use the CLI helper:
+
+```bash
+node tools/record-learning.mjs --reject \
+--agent senior-dev-security \
+--finding "missing CSRF on POST /api/refund" \
+--reason "CSRF terminated at API gateway" \
+--scope "src/api/**" \
+--pr 42
+```
+
+### How the squad uses them
+
+In Phase 5 (per-agent advisory) the skill calls `read_learnings(workspace_root, agent, changed_files)` and injects the rendered `## Past team decisions` block into the agent's prompt. In Phase 10 (consolidator) it does the same without an agent filter — the consolidator sees the full picture across agents.
+
+Each agent is told: when a current finding matches a previously **rejected** decision (similar agent + similar finding text + matching scope), suppress or downgrade severity unless the diff materially changes the rationale. When a finding contradicts a previously **accepted** decision, flag the contradiction explicitly.
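
Editor's note: the load, filter, and render flow described above can be illustrated with a short Python sketch. It assumes the JSONL fields shown in the README example, treats an entry without `scope` as applying everywhere, and uses the standard library's `fnmatchcase` as a stand-in matcher (its `*` also crosses `/`, which is looser than the package's glob primitive). `load_learnings` and `render_block` are hypothetical names, and the bullet layout of the rendered block is invented for illustration.

```python
import json
from fnmatch import fnmatchcase


def load_learnings(jsonl_text, agent=None, changed_files=None, max_recent=50):
    """Parse a learnings journal and keep the most recent matching entries."""
    entries = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    selected = []
    for entry in entries:
        if agent is not None and entry.get("agent") != agent:
            continue
        scope = entry.get("scope")
        # Assumption: an entry without a scope applies to every change.
        if scope and changed_files is not None:
            if not any(fnmatchcase(path, scope) for path in changed_files):
                continue
        selected.append(entry)
    return selected[-max_recent:] if max_recent > 0 else []


def render_block(entries):
    """Render a markdown block of the kind injected into agent prompts."""
    lines = ["## Past team decisions"]
    for e in entries:
        lines.append(f"- [{e['decision']}] {e['agent']}: {e['finding']} ({e['reason']})")
    return "\n".join(lines)
```

Calling `load_learnings` without an `agent` argument corresponds to the consolidator case, where decisions from all agents are included.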
+
+### Configuration
+
+Override defaults via `.squad.yaml`:
+
+```yaml
+learnings:
+  path: .squad/learnings.jsonl # default
+  max_recent: 50 # how many recent entries to inject (hard cap 200)
+  enabled: true # set false to disable injection without deleting the journal
+```
+
+The store reader is mtime-cached. The journal is append-only by design — the skill never amends or deletes past entries; correcting a stale decision means appending a new one.
+
+## Tasks — PRD-decomposed atomic work units
+
+The biggest source of token bloat in a long-running squad session is the squad re-analysing the whole repo for every prompt. The tasks store fixes that by decomposing a PRD into atomic tasks up front, then running the squad on ONE task's narrowed scope at a time.
+
+```jsonc
+// .squad/tasks.json (excerpt)
+{
+  "version": 1,
+  "tasks": [
+    {
+      "id": 1,
+      "title": "Add CSRF token to checkout flow",
+      "status": "done",
+      "dependencies": [],
+      "priority": "high",
+      "scope": "src/api/checkout/**",
+      "agent_hints": ["senior-dev-security", "senior-developer"],
+      "test_strategy": "POST without token → 403; POST with token → 200.",
+      "subtasks": [],
+      "created_at": "2026-05-08T12:00:00Z",
+      "updated_at": "2026-05-09T15:30:00Z"
+    },
+    {
+      "id": 2,
+      "title": "Wire CSRF middleware into refund endpoint",
+      "status": "pending",
+      "dependencies": [1],
+      "priority": "high",
+      "scope": "src/api/refund/**",
+      "subtasks": [],
+      ...
+    }
+  ]
+}
+```
+
+`scope` (glob) and `agent_hints` are squad-mcp-specific additions on top of the claude-task-master shape — they let `slice_files_for_task` and `compose_squad_workflow` narrow the advisory automatically.
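
Editor's note: to make the `scope` narrowing concrete, here is a small Python sketch of a segment-aware glob using the syntax the README gives for `skip_paths` (`**` for any depth, `*` for one segment, `?` for one character). The function names and the exact treatment of `**/` are assumptions; the package's glob primitive may differ at the edges.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a path glob into a compiled regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")        # anything, at any depth
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")     # stays within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")      # exactly one non-separator character
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def slice_files(files: list[str], scope: str) -> list[str]:
    """Keep only the files whose full path matches the scope glob."""
    rx = glob_to_regex(scope)
    return [f for f in files if rx.fullmatch(f)]
```

With the task excerpt above, `slice_files(changed, "src/api/checkout/**")` would keep only files under `src/api/checkout/`, which is the narrowing the paragraph describes.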
+
+### Decomposing a PRD
+
+Inside Claude Code:
+
+```
+/squad-tasks docs/prd-payments-refactor.md
+```
+
+The skill (Phase 0.5):
+
+1. Calls `compose_prd_parse` with the PRD text.
+2. Receives a prompt + JSON schema and runs them through Claude.
+3. Shows you the parsed tasks — title, deps, priority, scope, agent_hints — for review.
+4. Calls `record_tasks` only after you say "record" / "go" / "yes".
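
Editor's note: step 4 hands the confirmed tasks to `record_tasks`, which the tools table describes as allocating ids sequentially, validating that dependencies resolve (forward references within the batch allowed), and rejecting duplicates and self-dependencies. The Python sketch below illustrates that contract under loud assumptions: dependencies are expressed as numeric ids, "duplicate" means a repeated title, and validation finishes before anything is returned so a failing batch records nothing. None of this is confirmed by the package source.

```python
def record_tasks(existing: list[dict], batch: list[dict]) -> list[dict]:
    """Return existing + newly numbered tasks, or raise without recording anything."""
    next_id = max((t["id"] for t in existing), default=0) + 1
    new_tasks = [
        {**draft, "id": next_id + offset, "status": "pending",
         "dependencies": list(draft.get("dependencies", []))}
        for offset, draft in enumerate(batch)
    ]

    # Ids from the whole batch are known up front, so forward references resolve.
    known_ids = {t["id"] for t in existing} | {t["id"] for t in new_tasks}
    seen_titles = {t["title"] for t in existing}

    for task in new_tasks:
        if task["title"] in seen_titles:  # assumption: duplicates are judged by title
            raise ValueError(f"duplicate task title: {task['title']!r}")
        seen_titles.add(task["title"])
        for dep in task["dependencies"]:
            if dep == task["id"]:
                raise ValueError(f"task {task['id']} depends on itself")
            if dep not in known_ids:
                raise ValueError(f"task {task['id']} depends on unknown task {dep}")

    return existing + new_tasks
```

Because the input lists are never mutated, a raised error leaves the caller's task list exactly as it was, mirroring the all-or-nothing behaviour the table calls an atomic write.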
+
+The parse is **pure-MCP**: the squad-mcp server never makes LLM calls. The host (Claude Code, Cursor, Warp) does the inference. No provider keys in the server, no surprises for non-Claude clients.
+
+### Working tasks
+
+```
+/squad-next # picks the highest-priority ready task
+/squad-task 5 # explicit pick by id
+```
+
+For each task:
+
+- `slice_files_for_task` narrows the changed-files list to the task's `scope`.
+- `compose_squad_workflow` runs against that slice; if `agent_hints` is set, only those agents wake up.
+- Phase 1 onward proceeds normally, just with much less context.
+- When done, the skill flips status to `done` via `update_task_status`.
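
Editor's note: the selection behind `/squad-next` is described in the tools table as: candidate status (default pending), all dependencies done, optional agent filter, tiebreak by priority and then id, and null plus a reason when nothing is ready. A minimal Python sketch of that rule follows. Only `"high"` appears as a priority in the README excerpt, so the `"medium"` and `"low"` values, the default rank, and the reason wording are assumptions.

```python
# Assumed priority vocabulary; lower rank is picked first.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


def next_task(tasks, status="pending", agent=None):
    """Return (task, None) for the next ready task, or (None, reason)."""
    done = {t["id"] for t in tasks if t["status"] == "done"}
    ready = [
        t for t in tasks
        if t["status"] == status
        and all(dep in done for dep in t.get("dependencies", []))
        and (agent is None or agent in t.get("agent_hints", []))
    ]
    if not ready:
        return None, "no candidate task has all of its dependencies done"
    # Tiebreak: priority first, then the lower id.
    ready.sort(key=lambda t: (PRIORITY_RANK.get(t.get("priority", "medium"), 1), t["id"]))
    return ready[0], None
```

In the `tasks.json` excerpt above, task 2 depends on task 1, which is `done`, so this rule would select task 2.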
+
+### Configuration
+
+Override defaults via `.squad.yaml`:
+
+```yaml
+tasks:
+  path: .squad/tasks.json # default
+  enabled: true # set false to silence reads without deleting the file
+```
+
+Writes (`record_tasks`, `update_task_status`, `expand_task`) stay open even when reads are disabled — same policy as learnings. Disabling injection should not throw away the journal.
+
+### CLI for non-MCP environments
+
+Mirroring the post-review and record-learning helpers:
+
+```bash
+# decompose offline (you generate the JSON yourself or via another tool)
+echo '[{"title":"Add CSRF","scope":"src/api/**"}]' | node tools/record-tasks.mjs
+
+# inspect
+node tools/list-tasks.mjs --status pending
+node tools/next-task.mjs --json
+
+# flip status from CI
+node tools/update-task-status.mjs --task 5 --status done
+```
+
+The CLIs share `tools/_tasks-io.mjs` for read/write and require only node 18+. Schema validation is lighter than the MCP tool — production use should prefer the MCP path.
+
+## Posting reviews to GitHub PRs
+
+Once the squad runs, you can post the verdict + scorecard as a `gh pr review` directly. The skill `/squad-review #42` runs the advisory and offers to post the result; default behaviour is **dry-run + confirmation** — Claude shows the exact `gh` command and the markdown body, then waits for your "go" before posting.
+
+```bash
+# manual usage (outside the skill)
+echo '<consolidation JSON>' | node tools/post-review.mjs --pr 42 --dry-run
+# prints: gh pr review 42 --approve --body-file - <<'EOF' ... EOF
+
+# actually post
+echo '<consolidation JSON>' | node tools/post-review.mjs --pr 42
+```
+
+The CLI maps verdict → `gh` action deterministically:
+
+| Verdict | Score signal | `gh` action |
+| -------------------------------------------------- | ---------------------- | ---------------------------------- |
+| `REJECTED` | — | `--request-changes` (blocks merge) |
+| `CHANGES_REQUIRED` | — | `--comment` (advisory) |
+| `APPROVED` + `downgraded_by_score: true` | weighted < `min_score` | `--comment` |
+| `APPROVED` + score < `request_changes_below_score` | (opt-in floor) | `--request-changes` |
+| `APPROVED` otherwise | passes threshold | `--approve` |
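
Editor's note: the mapping table above can be read as a small decision function. The Python sketch below is one interpretation, not the code in `tools/post-review.mjs`. The function and parameter names are hypothetical, and it assumes the opt-in floor is checked before the `downgraded_by_score` row, on the reasoning that a floor set below `min_score` would otherwise never take effect. The table itself does not state which of those two rows wins.

```python
def gh_review_flag(verdict, weighted_score=None, downgraded_by_score=False,
                   request_changes_below_score=None):
    """Map a consolidation verdict to a `gh pr review` flag."""
    if verdict == "REJECTED":
        return "--request-changes"
    if verdict == "CHANGES_REQUIRED":
        return "--comment"
    if verdict != "APPROVED":
        raise ValueError(f"unknown verdict: {verdict!r}")

    # Assumed precedence: the opt-in floor beats the softer score downgrade.
    if (request_changes_below_score is not None
            and weighted_score is not None
            and weighted_score < request_changes_below_score):
        return "--request-changes"
    if downgraded_by_score:
        return "--comment"
    return "--approve"
```

For example, with `request_changes_below_score: 50`, an approved review scoring 40 would map to `--request-changes`, while one scoring 70 that was downgraded by `min_score` would map to `--comment`.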
+
+### Auto-post (opt-in)
+
+If `.squad.yaml` has `pr_posting.auto_post: true`, the skill posts without the second confirmation prompt — but **always shows the body first**. Auto-post means "skip the second yes/no", not "skip the preview".
+
+```yaml
+pr_posting:
+  auto_post: true # default false — always asks
+  request_changes_below_score: 50 # below this, post --request-changes instead of --approve
+  omit_attribution_footer: false # default false — footer present
+```
+
+Requires `gh` CLI in PATH and authenticated (`gh auth login`). The CLI exits 3 with a clear message if `gh` is missing.
+
 ## Detection strategy (`select_squad` / `slice_files_for_agent`)

 Three layers, in order of strength:
@@ -158,12 +406,17 @@ Run the `init_local_config` tool once to seed the local directory with editable
 squad-mcp/
 ├── .claude-plugin/ # Claude Code plugin manifest + marketplace
 ├── .github/workflows/ # CI + release workflows
-├── agents/ #
-├──
-
+├── agents/ # Native subagents + shared docs
+│ ├── *.md # 9 subagent definitions (kebab-case, with frontmatter)
+│ └── _shared/ # severity matrix + skill specs (not loaded as subagents)
+├── commands/ # Slash commands (/squad, /squad-review, /brainstorm, /commit-suggest)
+├── skills/ # Bundled skills
+│ ├── squad/ # single skill, two modes (implement | review)
+│ ├── brainstorm/
+│ └── commit-suggest/
 ├── src/
 │ ├── index.ts # stdio entry
-│ ├── tools/ # MCP tools (
+│ ├── tools/ # MCP tools (23 deterministic functions)
 │ ├── resources/ # MCP resources + agent loader
 │ ├── prompts/ # MCP prompt templates
 │ ├── exec/git.ts # hardened git execution layer
@@ -173,7 +426,6 @@ squad-mcp/
 │ └── ownership-matrix.ts # agents, work types, content/path patterns
 ├── tests/ # vitest unit + integration + stdio smoke
 ├── tools/
-│ ├── sync-agents.mjs # mirror agents + skills into ~/.claude/ for non-plugin clients
 │ └── git-hooks/commit-msg # opt-in hook rejecting AI-attribution trailers
 └── dist/ # compiled JS (gitignored, shipped via npm)
 ```
package/agents/{PO.md → product-owner.md}
@@ -1,6 +1,12 @@
+---
+name: product-owner
+description: Product Owner. Validates business value, functional requirements, and UX. Use for features, business-rule changes, and user-facing surfaces.
+model: inherit
+---
+
 # PO (Product Owner)

-> Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+> Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)

 ## Role
 Business representative in technical review. Ensures every implementation delivers real value to the end user and aligns with product goals.
@@ -82,3 +88,29 @@ Objective summary of the evaluation.
 - Be pragmatic: not every gap is a blocker, classify by severity
 - Frame impact in business terms, not technical ones
 - Without a user story, judge by observable behavior and product common sense
+
+## Score
+
+At the end of your advisory output, emit exactly:
+
+```
+Score: <NN>/100
+Score rationale: <one sentence on what drove the score>
+```
+
+The score is YOUR dimension's contribution to the squad rubric (`Business & UX`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+### Calibration
+
+- 90-100: requirement matches the change; UX clear; business value evident.
+- 70-89: minor mismatch with stated requirement or UX awkwardness.
+- **50-69: one Major — business rule contradicted, UX broken on critical flow, requirement absent.**
+- 30-49: change does not deliver claimed value; conflicts with PO intent.
+- 0-29: should not be built; halt.
+
+### Notes
+
+- Score is per-agent. Do not score other dimensions.
+- Score reflects the slice of files you reviewed, not the whole change.
+- A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+- An honest 65 is more useful than a generous 80; the rubric is auditable.
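
Editor's note: because each agent is told to emit an exact two-line footer, a consumer can extract the score mechanically. The Python sketch below shows one way to do that; `parse_score`, the choice to take the last footer in the report, and the rejection of values above 100 are assumptions rather than the package's parsing rules.

```python
import re

SCORE_RE = re.compile(r"^Score:[ \t]*(\d{1,3})/100[ \t]*$", re.MULTILINE)
RATIONALE_RE = re.compile(r"^Score rationale:[ \t]*(.+)$", re.MULTILINE)


def parse_score(report: str):
    """Return (score, rationale) from an advisory report, or None if absent or invalid."""
    matches = list(SCORE_RE.finditer(report))
    if not matches:
        return None
    score = int(matches[-1].group(1))  # the footer sits at the end of the output
    if score > 100:
        return None
    rationales = RATIONALE_RE.findall(report)
    return score, (rationales[-1].strip() if rationales else "")
```

Returning `None` for a missing or out-of-range score lets the caller decide whether to leave that agent out of the scorecard or ask it to re-emit the footer.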
package/agents/{Senior-Architect.md → senior-architect.md}
@@ -1,6 +1,12 @@
+---
+name: senior-architect
+description: Senior Architect. Guards module boundaries, coupling, dependency direction, DI lifetimes, and scalability. Use for structural changes and new modules.
+model: inherit
+---
+
 # Senior-Architect

-> Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+> Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)

 ## Role
 Guardian of architectural integrity. Evaluates design decisions with a long-term lens and keeps the solution from eroding boundaries.
@@ -119,3 +125,29 @@ Summary of the diagnosis and long-term view.
 - Distinguish "ideal" from "acceptable for now"
 - Avoid astronaut architecture — prefer pragmatic solutions
 - If the issue is implementation (not design), forward to the right agent
+
+## Score
+
+At the end of your advisory output, emit exactly:
+
+```
+Score: <NN>/100
+Score rationale: <one sentence on what drove the score>
+```
+
+The score is YOUR dimension's contribution to the squad rubric (`Architecture`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+### Calibration
+
+- 90-100: clean module/domain boundaries, DI lifetimes correct, no coupling regression, extensibility clear.
+- 70-89: minor issues (over-eager abstraction, ambiguous responsibility split) but no actionable Major.
+- **50-69: at least one Major (cross-module coupling, wrong DI lifetime, hidden mutable state).**
+- 30-49: multiple Majors or one Blocker that endangers structural integrity.
+- 0-29: architecture-level break; halt.
+
+### Notes
+
+- Score is per-agent. Do not score other dimensions.
+- Score reflects the slice of files you reviewed, not the whole change.
+- A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+- An honest 65 is more useful than a generous 80; the rubric is auditable.
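
Editor's note: the Score section says the consolidator weights each agent's score and compares it against the threshold. A minimal Python sketch of that arithmetic follows. It assumes the weighted score is a weight-normalised average over agents that have both a score and a weight, that a dimension is flagged when its score is strictly below `threshold`, and that an `APPROVED` verdict below `min_score` becomes `CHANGES_REQUIRED`, as the `.squad.yaml` comments in the README describe. The function name and return shape are hypothetical.

```python
def weighted_scorecard(scores, weights, threshold=75, min_score=None, verdict="APPROVED"):
    """Combine per-agent scores (0-100) into a weighted score and a verdict."""
    counted = [agent for agent in scores if weights.get(agent, 0) > 0]
    total_weight = sum(weights[agent] for agent in counted)
    if total_weight == 0:
        raise ValueError("no scored agent carries a weight")

    weighted = sum(scores[a] * weights[a] for a in counted) / total_weight
    flagged = sorted(a for a in counted if scores[a] < threshold)

    # Only an APPROVED verdict is downgraded; severity-driven verdicts stay as they are.
    downgraded = verdict == "APPROVED" and min_score is not None and weighted < min_score
    return {
        "weighted_score": round(weighted, 1),
        "flagged": flagged,
        "verdict": "CHANGES_REQUIRED" if downgraded else verdict,
        "downgraded_by_score": downgraded,
    }
```

Worked example with the README's fintech weights (30/22/20/15/13) and scores of 90, 80, 70, 60, and 100: the weighted score is (2700 + 1760 + 1400 + 900 + 1300) / 100 = 80.6, which clears a `min_score` of 75, while the two dimensions scoring 70 and 60 fall below a `threshold` of 80 and would be flagged.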
@@ -1,6 +1,12 @@
+---
+name: senior-dba
+description: Senior DBA. Reviews queries, migrations, EF mappings, cache, concurrency, and persistence stack. Use for data-layer changes.
+model: inherit
+---
+
 # Senior-DBA

-> Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+> Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)

 ## Role
 Data specialist. Ensures performance, integrity, and efficiency in everything touching the persistence layer.
@@ -135,3 +141,29 @@ Summary and prioritized risks.
 - Be conservative with migrations — prefer additive operations
 - Challenge every query without WHERE or with SELECT *
 - Validate suggested indexes do not degrade write performance
+
+## Score
+
+At the end of your advisory output, emit exactly:
+
+```
+Score: <NN>/100
+Score rationale: <one sentence on what drove the score>
+```
+
+The score is YOUR dimension's contribution to the squad rubric (`Data Layer`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+### Calibration
+
+- 90-100: queries efficient, migrations safe and reversible, EF mappings correct, no concurrency hazard.
+- 70-89: minor inefficiencies or missing indexes; no data-integrity risk.
+- **50-69: one Major — N+1 query, missing transaction, broken concurrency control, mismatched stack mix.**
+- 30-49: data integrity at risk (race, lost update, irreversible migration without backout).
+- 0-29: data corruption likely; halt.
+
+### Notes
+
+- Score is per-agent. Do not score other dimensions.
+- Score reflects the slice of files you reviewed, not the whole change.
+- A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+- An honest 65 is more useful than a generous 80; the rubric is auditable.
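The prompt text says the consolidator weights each agent's score and compares the result against a threshold (default 75), but the diff does not show the formula. As an illustration only, the sketch below assumes a weighted arithmetic mean and a pass condition of "greater than or equal to the threshold". The weights, the `DimensionScore` and `Scorecard` types, and the `consolidate` function are all assumptions.

```typescript
// Hypothetical consolidation sketch. The real weighting used by the
// consolidator is not shown in this diff; a weighted mean is assumed here.
interface DimensionScore {
  dimension: string; // e.g. "Architecture", "Data Layer", "Security"
  score: number; // 0..100, as emitted by one agent
  weight: number; // assumed relative weight, must be non-negative
}

interface Scorecard {
  weighted: number;
  threshold: number;
  passed: boolean;
}

function consolidate(scores: DimensionScore[], threshold = 75): Scorecard {
  const totalWeight = scores.reduce((sum, s) => sum + s.weight, 0);
  if (totalWeight <= 0) {
    throw new Error("at least one positive weight is required");
  }
  // Weighted arithmetic mean of the per-agent scores.
  const weighted =
    scores.reduce((sum, s) => sum + s.score * s.weight, 0) / totalWeight;
  // Assumption: meeting the threshold exactly counts as a pass.
  return { weighted, threshold, passed: weighted >= threshold };
}

// Example with made-up weights: (80*2 + 65*1 + 90*1) / 4 = 78.75
const card = consolidate([
  { dimension: "Architecture", score: 80, weight: 2 },
  { dimension: "Data Layer", score: 65, weight: 1 },
  { dimension: "Security", score: 90, weight: 1 },
]);
```

With these illustrative weights a single 65 does not sink the change, which is one reason a mean alone may be too forgiving. A real consolidator might also fail the scorecard on any Blocker regardless of the mean.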
@@ -1,6 +1,12 @@
+---
+name: senior-dev-reviewer
+description: Senior code reviewer. Focuses on readability, code smells, naming, idioms, async/await correctness, and error handling.
+model: inherit
+---
+
 # Senior-Dev-Reviewer

-> Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+> Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)

 ## Role
 Senior code reviewer focused on quality, readability, and maintainability. Performs detailed line-level review, applies the idiomatic checklist for the detected language/framework, and produces a numeric scorecard so reviewers and the tech-lead can see at a glance where the change stands.
@@ -606,3 +612,29 @@ Summary and decision. Restate the overall score and the top 1–3 things the aut
 - Be specific: always reference file and line
 - When the language idiom and the existing codebase conflict, side with the existing codebase consistency and flag the inconsistency for separate discussion
 - Remember: the goal is that the author learns, not just that they fix
+
+## Score
+
+At the end of your advisory output, emit exactly:
+
+```
+Score: <NN>/100
+Score rationale: <one sentence on what drove the score>
+```
+
+The score is YOUR dimension's contribution to the squad rubric (`Code Quality`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+### Calibration
+
+- 90-100: idiomatic, readable, well-named, async/error patterns clean.
+- 70-89: minor style or naming smells; no idiom violations of consequence.
+- 50-69: one Major — wrong async pattern, swallowed exception, name that misleads readers.
+- 30-49: multiple Majors; reviewer fatigue indicator.
+- 0-29: code unmaintainable as-is; halt.
+
+### Notes
+
+- Score is per-agent. Do not score other dimensions.
+- Score reflects the slice of files you reviewed, not the whole change.
+- A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+- An honest 65 is more useful than a generous 80; the rubric is auditable.
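Every agent file in this diff uses the same five calibration ranges (90-100, 70-89, 50-69, 30-49, 0-29) with dimension-specific wording. A minimal sketch of a lookup over those ranges follows. The `Band` type, the `bandFor` function, and the short `meaning` labels are hypothetical paraphrases for illustration, not strings from the package.

```typescript
// Hypothetical lookup over the calibration ranges shared by the agent prompts.
// The "meaning" labels paraphrase the bullets; they are not package strings.
type Band = { min: number; max: number; meaning: string };

const BANDS: Band[] = [
  { min: 90, max: 100, meaning: "clean" },
  { min: 70, max: 89, meaning: "minor issues only" },
  { min: 50, max: 69, meaning: "one Major" },
  { min: 30, max: 49, meaning: "multiple Majors or serious risk" },
  { min: 0, max: 29, meaning: "halt" },
];

function bandFor(score: number): Band {
  if (!Number.isInteger(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be an integer in 0..100, got ${score}`);
  }
  // The ranges are contiguous, so every valid integer falls in exactly one.
  const band = BANDS.find((b) => score >= b.min && score <= b.max);
  if (band === undefined) {
    throw new RangeError("no calibration band matched");
  }
  return band;
}
```

A lookup like this makes the "honest 65" note concrete: 65 sits in the 50-69 range, so an agent emitting it is asserting that at least one Major finding exists.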
@@ -1,6 +1,12 @@
+---
+name: senior-dev-security
+description: Application security specialist. Finds OWASP Top 10 vulnerabilities, validates authn/authz, sensitive data, input validation, and dependency CVEs.
+model: inherit
+---
+
 # Senior-Dev-Security

-> Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+> Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)

 ## Role
 Application security specialist. Identifies vulnerabilities, validates access controls, and ensures sensitive data is protected.
@@ -132,3 +138,29 @@ Summary of risks and prioritized recommendations.
 - Do not generate false positives — only report with real or highly likely evidence
 - Prioritize by real impact, not theoretical checklist
 - Explicitly record what could not be validated
+
+## Score
+
+At the end of your advisory output, emit exactly:
+
+```
+Score: <NN>/100
+Score rationale: <one sentence on what drove the score>
+```
+
+The score is YOUR dimension's contribution to the squad rubric (`Security`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+### Calibration
+
+- 90-100: no OWASP issue; authn/authz tight; secrets handled; no new dependency risk.
+- 70-89: minor concerns (missing input length cap, weak rate limit) — not exploitable.
+- **50-69: one Major — IDOR, missing authz check, secret in log, unsafe dependency.**
+- 30-49: exploitable today (auth bypass, SQLi, RCE); Blocker territory.
+- 0-29: critical security break; halt.
+
+### Notes
+
+- Score is per-agent. Do not score other dimensions.
+- Score reflects the slice of files you reviewed, not the whole change.
+- A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+- An honest 65 is more useful than a generous 80; the rubric is auditable.
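The note that a score of 0 is equivalent to a Blocker implies a consistency rule between an agent's score and its findings. Whether the package enforces this is not visible in the diff. The sketch below shows one way such a check could look, assuming findings carry a severity string such as `"Blocker"` or `"Major"`. The function name `violatesZeroScoreRule` is hypothetical.

```typescript
// Hypothetical consistency check for the "0 means halt" note.
// Assumes each finding exposes its severity as a plain string.
function violatesZeroScoreRule(
  score: number,
  severities: readonly string[],
): boolean {
  const hasBlocker = severities.some((s) => s === "Blocker");
  // Only one direction is stated in the prompt: a 0 requires a Blocker.
  // A Blocker with a non-zero score is not treated as a violation here.
  return score === 0 && !hasBlocker;
}
```

The check is deliberately one-directional. The Security calibration places "Blocker territory" in the 30-49 range, so a Blocker can legitimately accompany a non-zero score.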
@@ -1,6 +1,12 @@
+---
+name: senior-developer
+description: Pragmatic senior developer. Reviews technical correctness, robustness, API contracts, external integrations, observability, and application performance.
+model: inherit
+---
+
 # Senior-Developer

-> Reference: [Severity and Ownership Matrix](_Severity-and-Ownership.md)
+> Reference: [Severity and Ownership Matrix](_shared/_Severity-and-Ownership.md)

 ## Role
 Pragmatic senior developer focused on robust implementation. Evaluates code from the perspective of someone who will maintain, debug, and evolve it day to day.
@@ -178,3 +184,29 @@ Summary of the analysis and confidence in the solution for production.
 - Focus on real, probable bugs — not unlikely theoretical scenarios
 - Production is hostile: anything that can go wrong, will
 - Moderate duplication is acceptable when the alternative is a premature abstraction
+
+## Score
+
+At the end of your advisory output, emit exactly:
+
+```
+Score: <NN>/100
+Score rationale: <one sentence on what drove the score>
+```
+
+The score is YOUR dimension's contribution to the squad rubric (`Application Code`). The consolidator will weight it against other agents and compare against the threshold (default 75) to produce the final scorecard.
+
+### Calibration
+
+- 90-100: correctness solid, robustness considered, API contract honoured, observability in place.
+- 70-89: minor robustness gaps (one ambiguous error path, missing log) but no behavioural break.
+- **50-69: one Major — broken contract, missing error handling, observability hole on critical path.**
+- 30-49: multiple Majors or behaviour change with no test/log support.
+- 0-29: ships broken; halt.
+
+### Notes
+
+- Score is per-agent. Do not score other dimensions.
+- Score reflects the slice of files you reviewed, not the whole change.
+- A score of 0 means halt — equivalent to a Blocker. Do not emit 0 unless you would also raise a Blocker.
+- An honest 65 is more useful than a generous 80; the rubric is auditable.
|