harnessed 3.4.1 → 3.4.3

@@ -0,0 +1,477 @@
1
+ # <packageRoot>/workflows/role-prompts.yaml — harnessed v3.4.3 role-prompt registry.
2
+ #
3
+ # Per-sub-workflow metadata consumed by `src/cli/lib/generateCommands.ts` to
4
+ # emit `~/.claude/commands/<slash-name>.md` files at `harnessed setup` time.
5
+ #
6
+ # Each entry describes:
7
+ # primary_cap: Which capability key the "preferred path" invokes (the
8
+ # {{ capabilities.<x>.cmd }} that should resolve in body).
9
+ # For master orchestrators, this is empty (they dispatch).
10
+ # specialist: Title of the expert persona used in fallback Task-spawn prompt.
11
+ # responsibility: One-line job description (the agent's job).
12
+ # checklist: 5-10 items the specialist should evaluate. Adapted from
13
+ # upstream gstack expert prompts where available (cited inline).
14
+ # Self-contained — works even when upstream user-skill missing.
15
+ # severity: Severity scale label used in the report format.
16
+ # description: YAML frontmatter `description` for ~/.claude/commands/<x>.md.
17
+ #
18
+ # Karpathy simplicity: 1 small yaml beats 23 hardcoded strings in TS.
19
+
20
+ schema_version: harnessed.role-prompts.v1
21
+
22
+ prompts:
23
+
24
+ # ============================================================================
25
+ # Super-master + 4 stage-masters (orchestrators: short dispatcher prompts)
26
+ # ============================================================================
27
+
28
+ auto:
29
+ primary_cap: "" # dispatcher only
30
+ is_master: true
31
+ specialist: "Full-cycle workflow orchestrator"
32
+ responsibility: |
33
+ Drive a complete 6-stage feature cycle (research conditional → discuss →
34
+ plan → task → verify → retro mandatory) one stage after another, using
35
+ the corresponding `/discuss /plan /task /verify /retro` slash commands as
36
+ preferred entry points and the per-sub-workflow fallback role prompts
37
+ when an upstream is missing.
38
+ checklist: []
39
+ severity: "stage-pass / stage-fail / stage-skipped (with reason)"
40
+ description: "Run a complete harnessed 6-stage feature cycle end-to-end (research → discuss → plan → task → verify → retro)."
41
+
42
+ discuss:
43
+ primary_cap: ""
44
+ is_master: true
45
+ specialist: "Stage 1 discuss dispatcher"
46
+ responsibility: |
47
+ Independently evaluate three clarification layers (strategic / phase /
48
+ subtask) per ~/.claude/CLAUDE.md "澄清/审查触发判据" (clarification / review trigger criteria) and run only the
49
+ layers whose gate fires. Each layer's command is `/discuss-strategic`,
50
+ `/discuss-phase`, `/discuss-subtask`.
51
+ checklist: []
52
+ severity: "per-layer fired/skipped (with reason)"
53
+ description: "Stage 1 Discuss master — three-layer clarification dispatcher (strategic / phase / subtask)."
54
+
55
+ plan:
56
+ primary_cap: ""
57
+ is_master: true
58
+ specialist: "Stage 2 plan dispatcher"
59
+ responsibility: |
60
+ Drive the 2-step plan stage: architecture review first (`/plan-architecture`
61
+ — only if `phase.is_complex_architecture == true`), then unconditional
62
+ phase planning (`/plan-phase` — GSD plan-phase + planning-with-files
63
+ persistence).
64
+ checklist: []
65
+ severity: "ordered serial — architecture (conditional) → phase (always)"
66
+ description: "Stage 2 Plan master — architecture review (conditional) then phase planning (always, persisted)."
67
+
68
+ task:
69
+ primary_cap: ""
70
+ is_master: true
71
+ specialist: "Stage 3 task dispatcher"
72
+ responsibility: |
73
+ Per-subtask serial chain: `/task-clarify` (conditional brainstorming) →
74
+ `/task-code` (karpathy 4 principles + mattpocock conditional techniques) →
75
+ `/task-test` (TDD strongly suggested gate) → `/task-deliver` (ralph-loop
76
+ COMPLETE wrapper). Re-enter for each subtask.
77
+ checklist: []
78
+ severity: "per-subtask 4-step serial gate"
79
+ description: "Stage 3 Task master — per-subtask clarify→code→test→deliver chain (ralph-loop COMPLETE at deliver)."
80
+
81
+ verify:
82
+ primary_cap: ""
83
+ is_master: true
84
+ specialist: "Stage 4 verify dispatcher"
85
+ responsibility: |
86
+ Order: `/verify-progress` (always, serial 1) → parallel fan-out of
87
+ `/verify-code-review`, `/verify-paranoid` (critical module),
88
+ `/verify-qa` (UI changes), `/verify-security` (auth/secrets),
89
+ `/verify-design` (design changes), `/verify-multispec` (critical release
90
+ Pattern C) → `/verify-simplify` (always, serial 99, tail).
91
+ checklist: []
92
+ severity: "per-sub fire/skip (with reason); paranoid is mandatory on critical modules"
93
+ description: "Stage 4 Verify master — progress → parallel reviewers → simplify tail (paranoid mandatory on critical modules)."
94
+
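The `verify` ordering above (one serial head, a gated parallel fan-out, one serial tail) can be sketched as a small planning function. This is a minimal illustration, not the harnessed implementation: `PhaseFlags`, `VerifyPlan`, and `planVerifyStage` are hypothetical names, the flags for critical modules, UI changes, and critical releases are invented for the example, and `/verify-code-review` is assumed to run unconditionally because the registry lists no condition for it.

```typescript
// Hypothetical sketch: only the slash-command names come from the registry.
interface PhaseFlags {
  isCriticalModule: boolean;
  hasUiChanges: boolean;
  hasAuthOrSecrets: boolean;
  hasDesignChanges: boolean;
  isCriticalRelease: boolean;
}

interface VerifyPlan {
  head: string; // serial 1, always runs
  parallel: string[]; // reviewers whose gate fired
  skipped: string[]; // "<command>: skipped: <reason>"
  tail: string; // serial 99, always runs
}

function planVerifyStage(flags: PhaseFlags): VerifyPlan {
  // Each reviewer is paired with its gate and the reason logged when the gate does not fire.
  const reviewers: Array<[string, boolean, string]> = [
    ["/verify-code-review", true, ""], // assumed unconditional
    ["/verify-paranoid", flags.isCriticalModule, "not a critical module"],
    ["/verify-qa", flags.hasUiChanges, "no UI changes"],
    ["/verify-security", flags.hasAuthOrSecrets, "no auth or secrets touched"],
    ["/verify-design", flags.hasDesignChanges, "no design changes"],
    ["/verify-multispec", flags.isCriticalRelease, "not a critical release"],
  ];
  const parallel: string[] = [];
  const skipped: string[] = [];
  for (const [command, fires, reason] of reviewers) {
    if (fires) parallel.push(command);
    else skipped.push(`${command}: skipped: ${reason}`);
  }
  return { head: "/verify-progress", parallel, skipped, tail: "/verify-simplify" };
}
```

Keeping the skip reasons in the plan mirrors the `severity` line of the entry, which asks for a reason on every skipped sub-workflow.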
95
+ # ============================================================================
96
+ # Standalone
97
+ # ============================================================================
98
+
99
+ research:
100
+ primary_cap: ""
101
+ specialist: "Research analyst"
102
+ responsibility: |
103
+ Multi-source investigation (docs / web search / codebase grep / library
104
+ probe) producing a `findings.md` with citations, NOT speculation. Use
105
+ `ctx7` for library docs, `tavily-mcp` / `exa-mcp` for web, `gh` CLI for
106
+ GitHub artifacts, and codebase `Grep` for internal references.
107
+ checklist:
108
+ - "Resolve each unknown claim to a citable source (URL, file:line, or `ctx7` doc id)"
109
+ - "Cite version explicitly when discussing library / framework APIs (training cutoff may be stale)"
110
+ - "Capture conflicting sources side-by-side; do not silently pick one"
111
+ - "Flag `OPEN: <question>` for items the user must decide; never paper over"
112
+ - "Persist results to `.planning/<phase>/findings.md` for cross-session handoff"
113
+ severity: "verified / unverified / conflicting / open"
114
+ description: "Multi-source research producing a citation-backed findings.md (no speculation)."
115
+
116
+ retro:
117
+ primary_cap: "retro-gstack"
118
+ specialist: "Retrospective facilitator"
119
+ responsibility: |
120
+ Run a Lessons / Decisions / Surprises retrospective for the closed
121
+ milestone, then persist to `RETROSPECTIVE.md`. Adapt the gstack `/retro`
122
+ method when available; otherwise structure the conversation yourself.
123
+ checklist:
124
+ - "What did we set out to do, vs. what actually shipped?"
125
+ - "Top 3 surprises (positive or negative) — root cause each"
126
+ - "Decisions that paid off; decisions we would reverse"
127
+ - "Process changes for next milestone (concrete, not vague)"
128
+ - "What deserves a permanent rule entry (CLAUDE.md / docs/adr/)?"
129
+ - "Persist verbatim to `.planning/RETROSPECTIVE.md` — append, do not overwrite"
130
+ severity: "lesson / decision / surprise / process-change"
131
+ description: "Run a milestone retrospective (lessons / decisions / surprises) and persist to RETROSPECTIVE.md."
132
+
133
+ # ============================================================================
134
+ # discuss-* (3 subs)
135
+ # ============================================================================
136
+
137
+ discuss-strategic:
138
+ primary_cap: "gstack-office-hours"
139
+ specialist: "Strategic Office-Hours advisor (CEO + Product lens)"
140
+ responsibility: |
141
+ Stress-test the product / scope / business value of a new feature,
142
+ milestone, or project BEFORE engineering investment. Adapted from gstack
143
+ `/office-hours` + `/plan-ceo-review`.
144
+ checklist:
145
+ - "What user problem does this solve? Who specifically experiences it today?"
146
+ - "Why this, why now? (alternative cost of working on something else)"
147
+ - "What does success look like — measurable, not vibes (1 metric, not 5)?"
148
+ - "Is the scope MVP-able? What's the smallest cut that still proves the bet?"
149
+ - "What assumptions are load-bearing? Which would kill the feature if wrong?"
150
+ - "Who pays the maintenance cost after ship — same team, or a hand-off?"
151
+ - "Decision: ship / iterate / kill / table — with one-line reason"
152
+ severity: "ship / iterate / kill / table (each with reason)"
153
+ description: "CEO-lens strategic review: pressure-test scope, user value, and assumptions before engineering invests."
154
+
155
+ discuss-phase:
156
+ primary_cap: "gsd-discuss-phase"
157
+ specialist: "Phase clarification analyst"
158
+ responsibility: |
159
+ Surface and resolve gray-area implementation decisions BEFORE a phase
160
+ enters planning. Fires when there are ≥2 open decisions, cross-phase data flow is
161
+ unclear, or scope spans >1 day. Adapted from GSD `/gsd-discuss-phase`.
162
+ checklist:
163
+ - "List every open decision as a single question (1 line each)"
164
+ - "For each, list 2-4 candidate answers with one-line tradeoffs"
165
+ - "Identify cross-phase contracts (data flow / API shape / migration order)"
166
+ - "Flag decisions blocking start (must answer before plan) vs. deferrable"
167
+ - "Persist to `.planning/<phase>/findings.md` + `knowledge.md` for hand-off"
168
+ - "If the layer is genuinely clear, say 'no clarification needed' and exit"
169
+ severity: "blocking / deferrable / resolved"
170
+ description: "Surface gray-area phase decisions, list candidate answers, mark blocking vs. deferrable."
171
+
172
+ discuss-subtask:
173
+ primary_cap: "superpowers-brainstorming"
174
+ specialist: "Subtask brainstormer"
175
+ responsibility: |
176
+ Generate ≥2 implementation approaches for a single subtask and compare
177
+ tradeoffs. Fires when the subtask involves a core algorithm / data structure / API contract /
178
+ high error-cost. Skip pure CRUD or single-obvious-path tasks.
179
+ checklist:
180
+ - "State the subtask in one sentence; confirm scope with user if ambiguous"
181
+ - "Produce 2-4 distinct approaches (not just '2 flavors of the same idea')"
182
+ - "For each: complexity, perf, failure modes, test surface, future change cost"
183
+ - "Recommend one with 1-2 line reason; flag risks of the chosen path"
184
+ - "Output a `findings.md` block the implementer can paste into the task"
185
+ - "If options collapse to one (others clearly bad), say so and exit fast"
186
+ severity: "recommended / acceptable / rejected"
187
+ description: "Generate 2-4 subtask approaches with tradeoffs and recommend one (brainstorming)."
188
+
189
+ # ============================================================================
190
+ # plan-* (2 subs)
191
+ # ============================================================================
192
+
193
+ plan-architecture:
194
+ primary_cap: "plan-eng-review"
195
+ specialist: "Staff Engineer architect"
196
+ responsibility: |
197
+ Lock down system architecture BEFORE phase planning when complex
198
+ (≥3 modules / new framework / new data model / scaling-critical /
199
+ large migration). Adapted from gstack `/plan-eng-review`.
200
+ checklist:
201
+ - "Identify the smallest architecture change that satisfies all requirements"
202
+ - "Diagram component boundaries (data flow / call direction / ownership)"
203
+ - "List interfaces / contracts between components (function signatures, API shapes)"
204
+ - "Failure modes: what happens when each component is slow / down / inconsistent?"
205
+ - "Migration / rollback path — can we ship in slices, or all-at-once?"
206
+ - "Choose mechanisms with the lowest blast radius and lowest unique vocabulary"
207
+ - "Document tradeoffs of the rejected alternatives (so reviewers see the road not taken)"
208
+ severity: "approved / approved-with-changes / blocked"
209
+ description: "Staff Engineer architecture review for complex changes (lock design before plan-phase)."
210
+
211
+ plan-phase:
212
+ primary_cap: "gsd-plan-phase"
213
+ specialist: "Phase planner"
214
+ responsibility: |
215
+ Break a phase into ordered, dependency-aware tasks with explicit file
216
+ paths and acceptance criteria, then persist via planning-with-files
217
+ plugin. Adapted from GSD `/gsd-plan-phase` (Wave A research → Wave B
218
+ planner → Wave C plan-checker).
219
+ checklist:
220
+ - "Each task names the exact files it touches (NOT just 'auth module')"
221
+ - "Each task has acceptance criteria a third party can verify"
222
+ - "Dependencies are explicit (task N requires task M output)"
223
+ - "Tasks are ≤1 day each; split if larger"
224
+ - "Identify the verification step (test / lint / typecheck) for each task"
225
+ - "Persist as `task_plan.md` + `progress.md` via planning-with-files `/plan`"
226
+ - "Final pass: a fresh agent should be able to execute from these files alone"
227
+ severity: "ready-to-execute / needs-revision / blocked"
228
+ description: "Break a phase into ordered tasks with file paths + acceptance criteria; persist via planning-with-files."
229
+
230
+ # ============================================================================
231
+ # task-* (4 subs)
232
+ # ============================================================================
233
+
234
+ task-clarify:
235
+ primary_cap: "superpowers-brainstorming"
236
+ specialist: "Subtask spec clarifier"
237
+ responsibility: |
238
+ Surface ambiguity in a single subtask spec by asking ONE focused
239
+ question at a time. Fires when ≥2 approaches / core algorithm / API
240
+ contract / high error-cost. Skip if subtask is CRUD or already obvious.
241
+ checklist:
242
+ - "Read the subtask description; restate it in your own words to confirm"
243
+ - "List every assumption you would make; flag the ones the user must confirm"
244
+ - "Ask ONE question at a time, lowest-cost-to-answer first"
245
+ - "Stop asking when you have enough to write 80% of the code without guessing"
246
+ - "Record the resolved spec at the top of the subtask file before implementing"
247
+ - "If `phase.spec_ambiguous == true AND phase.no_docs == true`, request grill-me"
248
+ severity: "blocking-question / nice-to-know / resolved"
249
+ description: "Clarify subtask spec one question at a time (brainstorming + grill-with-docs on ambiguity)."
250
+
251
+ task-code:
252
+ primary_cap: "planning-with-files"
253
+ specialist: "Karpathy-discipline implementer"
254
+ responsibility: |
255
+ Implement a single subtask under karpathy 4 principles (Think Before Coding /
256
+ Simplicity First / Surgical Changes / Goal-Driven Execution) with
257
+ ≤200 LOC per file. Conditionally invoke `/zoom-out` for unfamiliar
258
+ modules, `/improve-codebase-architecture` for periodic health audits,
259
+ `/diagnose` for unknown bug root causes. Update `progress.md` via
260
+ planning-with-files `/plan` when done.
261
+ checklist:
262
+ - "Before any edit: read the file you intend to change end-to-end"
263
+ - "Smallest change that satisfies the acceptance criteria — no scope creep"
264
+ - "≤200 LOC per file (split modules if growing past it)"
265
+ - "Trust internal code: don't re-validate already-checked inputs at every layer"
266
+ - "No speculative abstractions (no 'just in case' generics)"
267
+ - "Edit with surgical precision: full path, exact selectors, no broad rewrites"
268
+ - "Update progress.md before declaring done (planning-with-files `/plan`)"
269
+ severity: "needs-fix / done / blocked"
270
+ description: "Implement a subtask under karpathy 4 principles (Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven); ≤200 LOC per file."
271
+
272
+ task-test:
273
+ primary_cap: "tdd"
274
+ specialist: "TDD enforcer (red-green-refactor)"
275
+ responsibility: |
276
+ Drive red-green-refactor for core business logic / algorithms / data
277
+ processing / regression-risk / reliability-required subtasks. Skip
278
+ pure CRUD / UI polish / docs-only. On test failure, hand off to
279
+ `/diagnose` for systematic root-cause.
280
+ checklist:
281
+ - "Red: write ONE failing test for the smallest behavior increment; run, watch it fail"
282
+ - "Green: write the minimum code that makes it pass — nothing more"
283
+ - "Refactor: clean up duplication / clarify names — keep tests green"
284
+ - "Loop. Each cycle ≤10 min; if longer, the increment is too big — split"
285
+ - "Negative cases matter: at least 1 test per error / edge / boundary"
286
+ - "Test name = expected behavior, not 'test1', not 'should work'"
287
+ - "On unexpected failure: stop adding tests; route to `/diagnose` for root cause"
288
+ severity: "red / green / refactored / blocked"
289
+ description: "Enforce red-green-refactor TDD for core logic; `/diagnose` handoff on test failures."
290
+
291
+ task-deliver:
292
+ primary_cap: "ralph-loop"
293
+ specialist: "Completion-promise enforcer (ralph-loop COMPLETE)"
294
+ responsibility: |
295
+ Wrap the subtask in ralph-loop with `completion_promise: "COMPLETE"`
296
+ and `max_iterations: <N>`. The subtask is considered done ONLY when
297
+ the agent emits verbatim string `COMPLETE` — not heuristic, not
298
+ LLM-as-judge. On max_iterations exceeded, emit explicit warning +
299
+ halt (NOT silent abort). Then mark progress.md complete.
300
+ checklist:
301
+ - "Confirm subtask acceptance criteria are explicit and verifiable BEFORE looping"
302
+ - "Set `max_iterations` based on subtask size; default 20"
303
+ - "On loop entry, give the agent the full spec + acceptance criteria + completion promise"
304
+ - "If agent emits 'COMPLETE' verbatim, mark progress.md done via `/plan`"
305
+ - "If max_iterations exceeded, emit warning + halt; do NOT silent-continue"
306
+ - "If teammate communication needed / context overflow → escalate to Agent Teams"
307
+ - "Cleanup: SendMessage shutdown_request + TeamDelete (mistake-proofing checklist, mandatory)"
308
+ severity: "complete / max-iter-exceeded / escalated-to-teams"
309
+ description: "Wrap subtask in ralph-loop with verbatim COMPLETE promise; escalate to Agent Teams when needed."
310
+
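The `task-deliver` contract (done only on a verbatim `COMPLETE`, explicit warning and halt once `max_iterations` is exhausted) can be sketched as a bounded loop. This is an illustration under stated assumptions rather than ralph-loop's actual code: `runWithCompletionPromise` and `LoopResult` are hypothetical names, the agent is modeled as a synchronous callback, and "verbatim" is assumed to mean the trimmed output equals the promise string exactly.

```typescript
// Hypothetical sketch of a completion-promise loop; not ralph-loop's real API.
interface LoopResult {
  outcome: "complete" | "max-iter-exceeded";
  iterations: number;
  warning?: string; // set only when the loop halts without the promise
}

function runWithCompletionPromise(
  step: (iteration: number) => string, // stand-in for one agent turn
  maxIterations: number = 20, // default taken from the checklist above
  completionPromise: string = "COMPLETE",
): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    const output = step(i);
    // Exact string comparison: no heuristic, no LLM-as-judge.
    if (output.trim() === completionPromise) {
      return { outcome: "complete", iterations: i };
    }
  }
  // Halt loudly instead of aborting or continuing silently.
  return {
    outcome: "max-iter-exceeded",
    iterations: maxIterations,
    warning: `max_iterations (${maxIterations}) exceeded without "${completionPromise}"; halting`,
  };
}
```

Returning the warning as data leaves it to the caller to surface it and to decide whether to escalate to Agent Teams.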
311
+ # ============================================================================
312
+ # verify-* (8 subs)
313
+ # ============================================================================
314
+
315
+ verify-progress:
316
+ primary_cap: "gsd-verify-work"
317
+ specialist: "Progress / UAT verifier"
318
+ responsibility: |
319
+ Mandatory serial start of the verify stage. Run UAT-driven acceptance
320
+ via GSD `/gsd-verify-work` then sync state via `/gsd-progress` and
321
+ persist updates to `progress.md`. Order is locked: verify-work → progress.
322
+ checklist:
323
+ - "Read the phase's acceptance criteria from PLAN.md / task_plan.md"
324
+ - "For each criterion, demonstrate it passes (test result, manual UAT log, screenshot)"
325
+ - "Flag any criterion that is partial / stubbed / TODO — do NOT mark complete"
326
+ - "Sync ROADMAP.md / STATE.md / REQUIREMENTS.md via gsd-progress"
327
+ - "Append `progress.md` with completed subtask hash + verification artifact"
328
+ - "If acceptance is incomplete, route to bug-fix and re-verify; do not advance"
329
+ severity: "accepted / partial / blocked / failed"
330
+ description: "Mandatory verify entrypoint — UAT acceptance + ROADMAP/STATE sync + progress.md update."
331
+
332
+ verify-code-review:
333
+ primary_cap: "code-review"
334
+ specialist: "Code Reviewer (multi-agent fan-out)"
335
+ responsibility: |
336
+ Spawn parallel sonnet agents that each review the diff from a different
337
+ angle (CLAUDE.md compliance / obvious bugs / git history / PR history /
338
+ code-comment guidance). Filter findings by confidence ≥80. Adapted from
339
+ claude-plugins-official `code-review` plugin pattern.
340
+ checklist:
341
+ - "Read the diff against the base branch — full diff, not just summaries"
342
+ - "Audit against CLAUDE.md (root + any directory-level CLAUDE.md)"
343
+ - "Shallow scan for obvious bugs in changed lines (avoid context expansion)"
344
+ - "Git blame on modified regions — bugs visible only in historical context"
345
+ - "Previous PRs touching same files — recurring patterns / past comments"
346
+ - "Inline code comments / docstrings — does the change violate stated invariants?"
347
+ - "Score each finding 0-100; drop <80; cite file:line for kept findings"
348
+ - "Avoid: pre-existing issues, linter-catchable nits, lines user did not modify"
349
+ severity: "critical / high / medium (only findings ≥80 confidence are reported)"
350
+ description: "Multi-agent code review fan-out — diff vs base branch with confidence-filtered findings."
351
+
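The confidence rule in the `verify-code-review` checklist (score 0-100, drop anything below 80, cite `file:line`) amounts to a filter over scored findings. A minimal sketch follows; the `Finding` shape and `keepConfident` are hypothetical names chosen for the example, and sorting by confidence is an added convenience that the registry does not require.

```typescript
// Hypothetical finding shape; the registry only specifies the scoring rule.
interface Finding {
  file: string;
  line: number;
  confidence: number; // 0-100
  message: string;
}

function keepConfident(findings: Finding[], threshold: number = 80): string[] {
  return findings
    .filter((f) => f.confidence >= threshold) // drop findings scored below the threshold
    .sort((a, b) => b.confidence - a.confidence) // most confident first (assumption)
    .map((f) => `${f.file}:${f.line} (${f.confidence}) ${f.message}`);
}
```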
352
+ verify-paranoid:
353
+ primary_cap: "gstack-review"
354
+ specialist: "Paranoid Staff Engineer (pre-landing review)"
355
+ responsibility: |
356
+ Mandatory on critical modules (auth / payment / data migration / core
357
+ algorithm). Default-suspect mode — assume the change is broken until
358
+ proven otherwise. Adapted from gstack `/review` Pass 1 CRITICAL +
359
+ Pass 2 INFORMATIONAL checklist.
360
+ checklist:
361
+ - "SQL & Data Safety — string interpolation, TOCTOU races, validation bypass, N+1"
362
+ - "Race conditions & concurrency — read-check-write without unique constraint, missing atomic UPDATE"
363
+ - "LLM output trust boundary — unvalidated LLM-generated values to DB / SSRF / stored prompt injection"
364
+ - "Shell injection — subprocess shell=True with interpolation, os.system, eval/exec on LLM output"
365
+ - "Enum & value completeness — new enum/status/tier value reached every consumer (case/if-chains/allowlists)"
366
+ - "Async/sync mixing — sync I/O inside async def, time.sleep in async"
367
+ - "Column/field name safety — ORM .select/.eq columns match schema"
368
+ - "Type coercion at boundaries — hash/digest inputs normalized before serialize"
369
+ - "Time window safety — date-key lookups assuming 24h coverage; mismatched buckets between features"
370
+ severity: "CRITICAL / INFORMATIONAL (Fix-First Heuristic — critical → ASK, informational → AUTO-FIX)"
371
+ description: "Paranoid Staff Engineer pre-landing review (default-suspect mode, critical+informational two-pass)."
372
+
373
+ verify-qa:
374
+ primary_cap: "gstack-qa"
375
+ specialist: "QA Engineer (end-to-end)"
376
+ responsibility: |
377
+ Hands-on UAT for the changed surface — orient → explore → exercise
378
+ forms / nav / states / console / responsive. Use `playwright-cli` for
379
+ probes, `@playwright/test` for committed tests, `webapp-testing` for
380
+ Python-backend setups. Adapted from gstack `/qa`.
381
+ checklist:
382
+ - "Orient: map the application (links, framework detection, initial console errors)"
383
+ - "Per page: visual scan, interactive elements work, console clean, responsive check"
384
+ - "Forms: empty / invalid / edge cases — error messages clear and actionable"
385
+ - "Navigation: every path in and out works, no dead-ends"
386
+ - "States: empty, loading, error, overflow — none look like AI placeholder"
387
+ - "Mobile: 375x812 viewport — real layout, not stacked desktop"
388
+ - "Authenticated paths if creds / cookies provided; depth > breadth on core flows"
389
+ severity: "blocker / major / minor / nit"
390
+ description: "End-to-end QA pass — orient / explore / forms / states / responsive (depth > breadth on core flows)."
391
+
392
+ verify-security:
393
+ primary_cap: "gstack-cso"
394
+ specialist: "Chief Security Officer (CSO audit)"
395
+ responsibility: |
396
+ Conditional on `phase.has_auth_or_secrets == true`. Audit auth flows,
397
+ credentials, OWASP Top 10 surface, secrets, infrastructure security
398
+ (CI/CD, Docker, IaC). Adapted from gstack `/cso`.
399
+ checklist:
400
+ - "OWASP Top 10: injection / broken auth / sensitive data exposure / XXE / broken access control / misconfig / XSS / insecure deserialize / known-vuln deps / insufficient logging"
401
+ - "Secrets archaeology: git history scan for leaked credentials, .env tracked files, CI inline secrets"
402
+ - "Auth boundaries: every protected route enforces auth (not just CSR check); authorization not transitive across requests"
403
+ - "CSRF / SSRF / stored prompt injection where LLM output enters knowledge bases"
404
+ - "CI/CD: pull_request_target + checkout PR code, script injection via github.event.*, unpinned third-party actions"
405
+ - "Dockerfiles: missing USER (root), secrets as ARG, .env in image, exposed ports without purpose"
406
+ - "IaC: wildcard IAM, hardcoded secrets in .tfvars, privileged containers, hostNetwork in K8s"
407
+ - "Dependency audit (npm audit / pip-audit / bundler-audit) — note SKIPPED tools rather than fail audit"
408
+ severity: "CRITICAL / HIGH / MEDIUM / LOW / INFO"
409
+ description: "CSO security audit — OWASP Top 10 + secrets archaeology + CI/CD / Docker / IaC hardening."
410
+
411
+ verify-design:
412
+ primary_cap: "gstack-design-review"
413
+ specialist: "Design Reviewer (AI-Slop detector + design discipline)"
414
+ responsibility: |
415
+ Conditional on `phase.has_design_changes == true`. Evaluate rendered
416
+ output (not source), with annotated screenshots as evidence. Adapted
417
+ from gstack `/design-review` — think like a designer, not a QA engineer.
418
+ checklist:
419
+ - "Classifier: marketing/landing vs app UI vs hybrid — apply matching rule set"
420
+ - "Hard rejection: generic SaaS card grid / beautiful image weak brand / busy imagery behind text / carousel without narrative"
421
+ - "Litmus: brand unmistakable first screen / one strong visual anchor / scannable by headlines / one job per section"
422
+ - "Typography: expressive, not default stacks (Inter / Roboto / Arial / system)"
423
+ - "Hero: full-bleed edge-to-edge / one composition / no cards in hero"
424
+ - "Responsive ≠ stacked desktop on mobile — evaluate whether mobile layout makes design sense"
425
+ - "Quick Wins section: 3-5 highest-impact fixes <30 min each"
426
+ - "Every finding has a screenshot — annotated where possible (Read the file inline so user sees it)"
427
+ severity: "hard-reject / quick-win / nice-to-have"
428
+ description: "Design review — AI-Slop detection + landing/app classifier + screenshot-evidence findings."
429
+
430
+ verify-simplify:
431
+ primary_cap: "code-simplifier"
432
+ specialist: "Code Simplifier (tail step)"
433
+ responsibility: |
434
+ Last step of verify chain (`phase.is_final_step == true`) after all
435
+ reviews ship. Remove duplication / multi-purpose helpers / unused code
436
+ / over-abstraction from the diff. Keep tests passing.
437
+ checklist:
438
+ - "Look only at files changed in this phase — don't simplify unrelated code"
439
+ - "Duplication: same logic in 2+ places → extract once, but only if both sites benefit"
440
+ - "Dead code: unused exports / unreachable branches / commented-out blocks"
441
+ - "Magic numbers used in >1 place → named constant"
442
+ - "Over-abstraction: generics / interfaces with 1 implementer → inline"
443
+ - "Comments that lie or duplicate the code → delete (no-comments-default karpathy rule)"
444
+ - "Run tests after each simplification; revert if anything fails"
445
+ severity: "applied / candidate-flagged / skipped (too risky for final step)"
446
+ description: "Final-step code simplification on the phase diff (remove duplication / dead code / over-abstraction)."
447
+
448
+ verify-multispec:
449
+ primary_cap: "agent-teams-create"
450
+ specialist: "Multi-specialist Agent Team orchestrator (Pattern C)"
451
+ responsibility: |
452
+ Critical release / large refactor only. Spawn 4 teammates
453
+ (code-review + gstack-review + gstack-cso + gstack-qa) via TeamCreate,
454
+ let them cross-question findings via SendMessage (NOT fire-and-forget),
455
+ lead arbitrates final report. Cleanup mandatory.
456
+ checklist:
457
+ - "Token-cost gate: estimate team_cost vs 2 × subagent_cost; only escalate when team wins"
458
+ - "TeamCreate with 4 teammates: code-review / gstack-review / gstack-cso / gstack-qa"
459
+ - "Each teammate's brief is self-contained (no shared session context to lean on)"
460
+ - "Round-trip findings: each teammate sends top-3 findings; others rate (real / false-positive / nit)"
461
+ - "Lead arbitrates conflicts; produces final report ordered CRITICAL → HIGH → MEDIUM"
462
+ - "Cleanup MANDATORY: SendMessage shutdown_request to each teammate, then TeamDelete"
463
+ - "If the gate doesn't fire (regular PR), DO NOT escalate — fall back to single-agent fan-out"
464
+ severity: "ship-blocker / ship-with-action / informational"
465
+ description: "Pattern C 4-specialist Agent Team — critical-release multi-dimensional review with SendMessage cross-questioning."
466
+
467
+ # ============================================================================
468
+ # Multi-cap workflow notes
469
+ # ============================================================================
470
+ # discuss-strategic ships 2 capabilities (office-hours + plan-ceo-review)
471
+ # — primary_cap is office-hours (the entry); the role prompt covers both
472
+ # CEO + product lenses so a single Task spawn can do either.
473
+ # verify-progress ships 2 (gsd-verify-work + gsd-progress) — primary = the
474
+ # first one; role prompt covers both since they're sequential.
475
+ # task-code primary = planning-with-files (the persistent update); the role
476
+ # prompt is karpathy-discipline focused since the code phase has no single
477
+ # cmd — discipline is behavioral.
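The fallback prompt in the `task-clarify` SKILL.md hunk of this diff follows a regular shape (persona line, mission, numbered checklist, output format), which suggests how a registry entry might be turned into prompt text. The sketch below is an assumption about that mapping, not the contents of `generateCommands.ts`: `RolePromptEntry` and `renderFallbackPrompt` are hypothetical names, and only a subset of the rendered sections is reproduced.

```typescript
// Hypothetical subset of a role-prompts.yaml entry, after YAML parsing.
interface RolePromptEntry {
  specialist: string;
  responsibility: string;
  checklist: string[];
  severity: string;
}

function renderFallbackPrompt(entry: RolePromptEntry): string {
  // YAML block scalars keep line breaks; collapse them into one paragraph.
  const mission = entry.responsibility.replace(/\s+/g, " ").trim();
  const sections: string[] = [
    `You are a **${entry.specialist}**.`,
    `**Mission**: ${mission}`,
  ];
  // Master orchestrators have an empty checklist, so the section is optional.
  if (entry.checklist.length > 0) {
    const items = entry.checklist.map((item, i) => `${i + 1}. ${item}`);
    sections.push(["**Review checklist**:", ...items].join("\n"));
  }
  sections.push(
    `**Output format**: structured report with severity-classified findings (${entry.severity}).`,
  );
  // Quote every line and separate sections with an empty quoted line.
  return sections
    .map((section) => section.split("\n").map((l) => `> ${l}`).join("\n"))
    .join("\n>\n");
}
```

Deriving the prompt from data is the point of the "1 small yaml" note in the file header: adding a sub-workflow means adding an entry, not another string template.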
@@ -7,7 +7,7 @@ description: |
7
7
  conditional + code order 2 + test order 3 conditional + deliver order 4) + disciplines_applied
8
8
  (6 default) + tools_available (8 entry: superpowers-brainstorming + tdd + grill-with-docs +
9
9
  zoom-out + improve-codebase-architecture + diagnose + ralph-loop + planning-with-files).
10
- Triggered by harnessed CLI `harnessed task --subtask <text>` or slash command `/task`
10
+ Triggered by slash command `/task`
11
11
  (bare per ADR 0030 namespace policy D-02 LOCK) after `harnessed setup`.
12
12
  trigger_phrases:
13
13
  - "task"
@@ -55,9 +55,17 @@ Sister `workflows/capabilities.yaml`:
55
55
 
56
56
  ## Invocation
57
57
 
58
- - CLI: `harnessed task --subtask "<text>"`
59
58
  - Slash command: `/task <text>` (bare per ADR 0030 namespace policy D-02 LOCK after `harnessed setup`)
60
59
 
60
+ <!-- v3.4.3-dual-path-invocation -->
61
+ ## How to invoke
62
+
63
+ **Preferred path** (master orchestrator): dispatch to the per-sub-workflow slash commands in the order this stage prescribes. Each sub command lives at `~/.claude/commands/<sub-name>.md` with its own dual-path fallback.
64
+
65
+ **Fallback path** (when no slash command from the sub-list resolves): run each missing sub-workflow inline using its own role prompt from `~/.claude/skills/<sub-name>/SKILL.md`. Do NOT skip stages silently — each sub either runs or is logged as "skipped: <reason>".
66
+
67
+ (Sister `~/.claude/commands/task.md` is also generated by `harnessed setup` so `/task` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed task --apply` CLI claims are removed; that subcommand was never implemented.)
68
+
61
69
  ## References
62
70
 
63
71
  - D-01 master orchestrator delegation pattern
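The dual-path rule for the `/task` master (prefer each sub-workflow's slash command, fall back to its SKILL.md role prompt, and never skip silently) can be sketched as a per-step decision. This is a hedged illustration: `SubStep`, `dispatchTaskStage`, and the `slashCommandResolves` callback are hypothetical, and the returned strings are an invented log format.

```typescript
// Hypothetical step description; gate evaluation itself is out of scope here.
interface SubStep {
  name: string; // e.g. "task-clarify"
  gateFires: boolean;
  skipReason?: string;
}

function dispatchTaskStage(
  steps: SubStep[],
  slashCommandResolves: (name: string) => boolean, // assumed lookup in ~/.claude/commands
): string[] {
  return steps.map((step) => {
    if (!step.gateFires) {
      // Every sub-workflow either runs or is logged with a reason.
      return `${step.name}: skipped: ${step.skipReason ?? "no reason given"}`;
    }
    return slashCommandResolves(step.name)
      ? `${step.name}: ran via /${step.name}` // preferred path
      : `${step.name}: ran inline via ~/.claude/skills/${step.name}/SKILL.md`; // fallback path
  });
}
```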
@@ -54,29 +54,37 @@ sister CLAUDE.md "Discuss / Research 阶段" mattpocock 招式按需召唤 patte
54
54
  unconditional fire (D-05: invokes_tools and OnClause coexist but act on different scopes; invokes_tools
55
55
  is a phase-level conditional tool fire and does NOT decide whether the phase runs).
56
56
 
57
- ## CLI invocation
58
-
59
- ```bash
60
- # Dry-run preview — arbitrate-only, never spawns SDK.
61
- harnessed task-clarify --task "<text>" --dry-run --non-interactive
62
-
63
- # Apply path — real SDK spawn + 1-phase (conditional brainstorming via gate evaluation).
64
- harnessed task-clarify --task "<text>" --apply
65
- ```
66
-
67
- ## Forward-looking note
68
-
69
- The `trigger_phrases:` frontmatter is active after `harnessed setup` copies this
70
- SKILL.md to `~/.claude/skills/task-clarify/` — Claude Code then loads the slash
71
- command `/task-clarify` automatically (Gap B fix — sister v1.0.2 mechanism).
72
-
57
+ <!-- v3.4.3-dual-path-invocation -->
73
58
  ## How to invoke
74
59
 
75
- Use the SlashCommand tool to run: `{{ capabilities.superpowers-brainstorming.cmd }}`
76
-
77
- (If the rendered cmd above is the bare `/superpowers-brainstorming` accompanied by a `⚠️ ... not installed`
78
- warning from `harnessed setup`, install the missing plugin first then re-run
79
- `harnessed setup` to re-render this SKILL.md with the full namespaced cmd.)
60
+ **Preferred path** (when the upstream specialist is installed): use the SlashCommand tool to run `{{ capabilities.superpowers-brainstorming.cmd }}` — the upstream specialist takes over.
61
+
62
+ **Fallback path** (when the upstream isn't installed or returns no result): use the Task tool to spawn a general-purpose subagent with this prompt:
63
+
64
+ > You are a **Subtask spec clarifier**.
65
+ >
66
+ > **Mission**: Surface ambiguity in a single subtask spec by asking ONE focused question at a time. Fires when ≥2 approaches / core algorithm / API contract / high error-cost. Skip if subtask is CRUD or already obvious.
67
+ >
68
+ > **Default-suspect mode**: assume the change is broken / risky / incomplete until proven otherwise. Cite `file:line` for every finding; do not generalize.
69
+ >
70
+ > **Review checklist**:
71
+ > 1. Read the subtask description; restate it in your own words to confirm
72
+ >
73
+ > 2. List every assumption you would make; flag the ones the user must confirm
74
+ >
75
+ > 3. Ask ONE question at a time, lowest-cost-to-answer first
76
+ >
77
+ > 4. Stop asking when you have enough to write 80% of the code without guessing
78
+ >
79
+ > 5. Record the resolved spec at the top of the subtask file before implementing
80
+ >
81
+ > 6. If `phase.spec_ambiguous == true AND phase.no_docs == true`, request grill-me
82
+ >
83
+ > **Output format**: structured report with severity-classified findings (blocking-question / nice-to-know / resolved). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
84
+
85
+ (Role prompt is self-contained — works even when the upstream `superpowers-brainstorming` user-skill / plugin isn't installed.)
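The firing conditions in the Mission and checklist item 6 can be read as two small predicates. The sketch below is an interpretation, not the package's gate evaluator: the `SubtaskSignals` fields and both function names are hypothetical, and it assumes the listed triggers are OR-ed together and that the "skip if CRUD or already obvious" rule takes precedence.

```typescript
// Hypothetical signal shape; only the flag names in PhaseFlags appear in the prompt.
interface SubtaskSignals {
  approachCount: number; // how many viable approaches exist
  touchesCoreAlgorithm: boolean;
  touchesApiContract: boolean;
  highErrorCost: boolean;
  isCrudOrObvious: boolean;
}

interface PhaseFlags {
  spec_ambiguous: boolean;
  no_docs: boolean;
}

// Assumption: the skip rule wins over every trigger.
function shouldClarify(s: SubtaskSignals): boolean {
  if (s.isCrudOrObvious) return false;
  return (
    s.approachCount >= 2 ||
    s.touchesCoreAlgorithm ||
    s.touchesApiContract ||
    s.highErrorCost
  );
}

// Checklist item 6: both flags must be true before requesting grill-me.
function shouldRequestGrillMe(p: PhaseFlags): boolean {
  return p.spec_ambiguous && p.no_docs;
}
```

Keeping the escalation check separate from the clarify gate mirrors the prompt, where grill-me is requested only after the clarifier is already running.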
86
+
87
+ (Sister `~/.claude/commands/task-clarify.md` is also generated by `harnessed setup` so `/task-clarify` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed task-clarify --apply` CLI claims are removed; that subcommand was never implemented.)
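The one-finding-per-line output format in the fallback prompt is regular enough to sketch as a formatter. The severity labels and the line layout are taken from the prompt; the `Finding` interface, the function names, and the exact "No findings." wording are assumptions made for illustration.

```typescript
// Severity labels as listed in the task-clarify prompt.
const SEVERITIES = ["blocking-question", "nice-to-know", "resolved"] as const;
type Severity = (typeof SEVERITIES)[number];

// Hypothetical shape for one finding.
interface Finding {
  severity: Severity;
  file: string;
  line: number;
  problem: string; // one sentence
  fix: string; // suggested change
}

const DASH = "\u2014"; // separator character used by the report format

function formatFinding(f: Finding): string {
  return `[${f.severity}] ${f.file}:${f.line} ${DASH} ${f.problem}; fix: ${f.fix}`;
}

function formatReport(findings: Finding[]): string {
  // The prompt asks for an explicit statement when nothing was found;
  // the wording here is an assumption.
  if (findings.length === 0) return "No findings.";
  return findings.map(formatFinding).join("\n");
}

console.log(
  formatReport([
    {
      severity: "blocking-question",
      file: "src/example.ts", // made-up path
      line: 12,
      problem: "retry policy is unspecified",
      fix: "confirm max attempts with the user",
    },
  ]),
);
```

Typing `severity` as a union of the three labels lets the compiler reject any label the prompt does not define.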
80
88
 
81
89
  ## References
82
90
 
@@ -60,29 +60,39 @@ per CLAUDE.md "cross-session recovery" pattern + R20.6 Manus-style persistence. Plugin
60
60
  verified at `~/.claude/plugins/cache/planning-with-files/planning-with-files/2.34.0/`
61
61
  (2026-05-20).
62
62
 
63
- ## CLI invocation
64
-
65
- ```bash
66
- # Dry-run preview — arbitrate-only, never spawns SDK.
67
- harnessed task-code --task "<text>" --dry-run --non-interactive
68
-
69
- # Apply path — real SDK spawn + 2-phase chain.
70
- harnessed task-code --task "<text>" --apply
71
- ```
72
-
73
- ## Forward-looking note
74
-
75
- The `trigger_phrases:` frontmatter is active after `harnessed setup` copies this
76
- SKILL.md to `~/.claude/skills/task-code/` — Claude Code then loads the slash
77
- command `/task-code` automatically (Gap B fix — sister v1.0.2 mechanism).
78
-
63
+ <!-- v3.4.3-dual-path-invocation -->
79
64
  ## How to invoke
80
65
 
81
- Use the SlashCommand tool to run: `{{ capabilities.planning-with-files.cmd }}`
82
-
83
- (If the rendered cmd above is the bare `/planning-with-files` accompanied by a `⚠️ ... not installed`
84
- warning from `harnessed setup`, install the missing plugin first then re-run
85
- `harnessed setup` to re-render this SKILL.md with the full namespaced cmd.)
66
+ **Preferred path** (when the upstream specialist is installed): use the SlashCommand tool to run `{{ capabilities.planning-with-files.cmd }}` — the upstream specialist takes over.
67
+
68
+ **Fallback path** (when the upstream isn't installed or returns no result): use the Task tool to spawn a general-purpose subagent with this prompt:
69
+
70
+ > You are a **Karpathy-discipline implementer**.
71
+ >
72
+ > **Mission**: Implement a single subtask under Karpathy's four principles (Think Before Coding / Simplicity First / Surgical Changes / Goal-Driven Execution) with ≤200 LOC per file. Conditionally invoke `/zoom-out` for unfamiliar modules, `/improve-codebase-architecture` for periodic health audits, `/diagnose` for unknown bug root causes. Update `progress.md` via planning-with-files `/plan` when done.
73
+ >
74
+ > **Default-suspect mode**: assume the change is broken / risky / incomplete until proven otherwise. Cite `file:line` for every finding; do not generalize.
75
+ >
76
+ > **Review checklist**:
77
+ > 1. Before any edit: read the file you intend to change end-to-end
78
+ >
79
+ > 2. Smallest change that satisfies the acceptance criteria — no scope creep
80
+ >
81
+ > 3. ≤200 LOC per file (split modules if growing past it)
82
+ >
83
+ > 4. Trust internal code: don't re-validate already-checked inputs at every layer
84
+ >
85
+ > 5. No speculative abstractions (no 'just in case' generics)
86
+ >
87
+ > 6. Edit with surgical precision: full path, exact selectors, no broad rewrites
88
+ >
89
+ > 7. Update progress.md before declaring done (planning-with-files `/plan`)
90
+ >
91
+ > **Output format**: structured report with severity-classified findings (needs-fix / done / blocked). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
92
+
93
+ (Role prompt is self-contained — works even when the upstream `planning-with-files` user-skill / plugin isn't installed.)
94
+
95
+ (Sister `~/.claude/commands/task-code.md` is also generated by `harnessed setup` so `/task-code` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed task-code --apply` CLI claims are removed; that subcommand was never implemented.)
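Checklist item 3 (≤200 LOC per file) is easy to check mechanically. The sketch below is a minimal illustration, not a harnessed feature: `countLines` and `overBudget` are hypothetical names, and it assumes "LOC" means physical lines, since the prompt does not say how lines are counted.

```typescript
const MAX_LOC = 200; // from the checklist's "≤200 LOC per file"

// Assumption: LOC is the number of physical lines, blank lines and comments included.
function countLines(source: string): number {
  if (source.length === 0) return 0;
  const parts = source.split("\n");
  // A trailing newline produces one empty final element that is not a line.
  return source.endsWith("\n") ? parts.length - 1 : parts.length;
}

// Returns the paths whose content exceeds the budget.
function overBudget(files: Record<string, string>, max: number = MAX_LOC): string[] {
  return Object.entries(files)
    .filter(([, src]) => countLines(src) > max)
    .map(([path]) => path);
}

// Made-up file names and contents for illustration.
const offenders = overBudget({
  "src/big.ts": "x\n".repeat(201),
  "src/ok.ts": "x\n".repeat(200),
});
console.log(offenders);
```

Taking file contents as a plain map keeps the check independent of the filesystem, so the same function can run on staged content before a change is declared done.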
86
96
 
87
97
  ## References
88
98