harnessed 3.4.3 → 3.4.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +3 -0
- package/dist/cli.mjs +1084 -745
- package/dist/cli.mjs.map +1 -1
- package/dist/index.mjs +1 -1
- package/dist/index.mjs.map +1 -1
- package/package.json +1 -1
- package/workflows/auto/SKILL.md +10 -4
- package/workflows/capabilities.yaml +1 -1
- package/workflows/discuss/auto/SKILL.md +9 -4
- package/workflows/discuss/phase/SKILL.md +11 -29
- package/workflows/discuss/strategic/SKILL.md +11 -31
- package/workflows/discuss/subtask/SKILL.md +11 -29
- package/workflows/execute-task/SKILL.md +7 -6
- package/workflows/execute-task/workflow.yaml +93 -0
- package/workflows/plan/architecture/SKILL.md +11 -31
- package/workflows/plan/auto/SKILL.md +9 -4
- package/workflows/plan/phase/SKILL.md +11 -31
- package/workflows/research/SKILL.md +14 -1
- package/workflows/retro/SKILL.md +11 -29
- package/workflows/task/auto/SKILL.md +9 -4
- package/workflows/task/clarify/SKILL.md +11 -29
- package/workflows/task/code/SKILL.md +11 -31
- package/workflows/task/deliver/SKILL.md +12 -32
- package/workflows/task/test/SKILL.md +11 -31
- package/workflows/verify/auto/SKILL.md +9 -4
- package/workflows/verify/code-review/SKILL.md +11 -33
- package/workflows/verify/design/SKILL.md +11 -33
- package/workflows/verify/multispec/SKILL.md +11 -31
- package/workflows/verify/paranoid/SKILL.md +11 -35
- package/workflows/verify/progress/SKILL.md +11 -29
- package/workflows/verify/qa/SKILL.md +11 -31
- package/workflows/verify/security/SKILL.md +11 -33
- package/workflows/verify/simplify/SKILL.md +11 -31
- package/workflows/execute-task/phases.yaml +0 -73
|
@@ -54,37 +54,19 @@ sister CLAUDE.md "Discuss / Research 阶段" mattpocock 招式按需召唤 patte
|
|
|
54
54
|
unconditional fire (D-05 invokes_tools 与 OnClause 并存, 但作用面不同 — invokes_tools
|
|
55
55
|
phase-level conditional tool fire NOT 决定 phase 是否走)。
|
|
56
56
|
|
|
57
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
58
57
|
## How to invoke
|
|
59
58
|
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
> 1. Read the subtask description; restate it in your own words to confirm
|
|
72
|
-
>
|
|
73
|
-
> 2. List every assumption you would make; flag the ones the user must confirm
|
|
74
|
-
>
|
|
75
|
-
> 3. Ask ONE question at a time, lowest-cost-to-answer first
|
|
76
|
-
>
|
|
77
|
-
> 4. Stop asking when you have enough to write 80% of the code without guessing
|
|
78
|
-
>
|
|
79
|
-
> 5. Record the resolved spec at the top of the subtask file before implementing
|
|
80
|
-
>
|
|
81
|
-
> 6. If `phase.spec_ambiguous == true AND phase.no_docs == true`, request grill-me
|
|
82
|
-
>
|
|
83
|
-
> **Output format**: structured report with severity-classified findings (blocking-question / nice-to-know / resolved). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
84
|
-
|
|
85
|
-
(Role prompt is self-contained — works even when the upstream `superpowers-brainstorming` user-skill / plugin isn't installed.)
|
|
86
|
-
|
|
87
|
-
(Sister `~/.claude/commands/task-clarify.md` is also generated by `harnessed setup` so `/task-clarify` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed task-clarify --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
59
|
+
Use the Bash tool to run:
|
|
60
|
+
|
|
61
|
+
```bash
|
|
62
|
+
echo "$ARGUMENTS" | harnessed run task-clarify --task-stdin
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
If `$ARGUMENTS` is empty, run `harnessed run task-clarify` (no stdin pipe).
|
|
66
|
+
|
|
67
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
68
|
+
|
|
69
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
88
70
|
|
|
89
71
|
## References
|
|
90
72
|
|
|
@@ -60,39 +60,19 @@ per CLAUDE.md "跨 session 恢复" 模式 + R20.6 Manus-style 持久化。Plugin
|
|
|
60
60
|
verified at `~/.claude/plugins/cache/planning-with-files/planning-with-files/2.34.0/`
|
|
61
61
|
(2026-05-20).
|
|
62
62
|
|
|
63
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
64
63
|
## How to invoke
|
|
65
64
|
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
> 1. Before any edit: read the file you intend to change end-to-end
|
|
78
|
-
>
|
|
79
|
-
> 2. Smallest change that satisfies the acceptance criteria — no scope creep
|
|
80
|
-
>
|
|
81
|
-
> 3. ≤200 LOC per file (split modules if growing past it)
|
|
82
|
-
>
|
|
83
|
-
> 4. Trust internal code: don't re-validate already-checked inputs at every layer
|
|
84
|
-
>
|
|
85
|
-
> 5. No speculative abstractions (no 'just in case' generics)
|
|
86
|
-
>
|
|
87
|
-
> 6. Edit with surgical precision: full path, exact selectors, no broad rewrites
|
|
88
|
-
>
|
|
89
|
-
> 7. Update progress.md before declaring done (planning-with-files `/plan`)
|
|
90
|
-
>
|
|
91
|
-
> **Output format**: structured report with severity-classified findings (needs-fix / done / blocked). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
92
|
-
|
|
93
|
-
(Role prompt is self-contained — works even when the upstream `planning-with-files` user-skill / plugin isn't installed.)
|
|
94
|
-
|
|
95
|
-
(Sister `~/.claude/commands/task-code.md` is also generated by `harnessed setup` so `/task-code` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed task-code --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
65
|
+
Use the Bash tool to run:
|
|
66
|
+
|
|
67
|
+
```bash
|
|
68
|
+
echo "$ARGUMENTS" | harnessed run task-code --task-stdin
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
If `$ARGUMENTS` is empty, run `harnessed run task-code` (no stdin pipe).
|
|
72
|
+
|
|
73
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
74
|
+
|
|
75
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
96
76
|
|
|
97
77
|
## References
|
|
98
78
|
|
|
@@ -41,7 +41,7 @@ spawns each phase as a sub-agent via `@anthropic-ai/claude-agent-sdk` 0.3.142+.
|
|
|
41
41
|
ralph-loop SDK wrapper 保 completion-promise verbatim string `"COMPLETE"` — sub-task
|
|
42
42
|
被认为完成的判据是子任务输出包含 verbatim "COMPLETE" string (NOT 启发式 / NOT
|
|
43
43
|
LLM-as-judge). Sister capabilities.yaml `ralph-loop` entry impl `bundled-skill` +
|
|
44
|
-
`sdk_ref: src/
|
|
44
|
+
`sdk_ref: src/workflow/lib/ralphLoop.ts` (Phase 2.2 v0.2.0 ship)。
|
|
45
45
|
|
|
46
46
|
### Parallelism — ralph-loop 正交 wrapper
|
|
47
47
|
|
|
@@ -82,39 +82,19 @@ in `progress.md` — sister Phase 01-code progress update pattern, last call in
|
|
|
82
82
|
③ task chain。Plugin path `~/.claude/plugins/cache/planning-with-files/
|
|
83
83
|
planning-with-files/2.34.0/` verified (2026-05-20)。
|
|
84
84
|
|
|
85
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
86
85
|
## How to invoke
|
|
87
86
|
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
> 1. Confirm subtask acceptance criteria are explicit and verifiable BEFORE looping
|
|
100
|
-
>
|
|
101
|
-
> 2. Set `max_iterations` based on subtask size; default 20
|
|
102
|
-
>
|
|
103
|
-
> 3. On loop entry, give the agent the full spec + acceptance criteria + completion promise
|
|
104
|
-
>
|
|
105
|
-
> 4. If agent emits 'COMPLETE' verbatim, mark progress.md done via `/plan`
|
|
106
|
-
>
|
|
107
|
-
> 5. If max_iterations exceeded, emit warning + halt; do NOT silent-continue
|
|
108
|
-
>
|
|
109
|
-
> 6. If teammate communication needed / context overflow → escalate to Agent Teams
|
|
110
|
-
>
|
|
111
|
-
> 7. Cleanup: SendMessage shutdown_request + TeamDelete (防呆清单 mandatory)
|
|
112
|
-
>
|
|
113
|
-
> **Output format**: structured report with severity-classified findings (complete / max-iter-exceeded / escalated-to-teams). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
114
|
-
|
|
115
|
-
(Role prompt is self-contained — works even when the upstream `ralph-loop` user-skill / plugin isn't installed.)
|
|
116
|
-
|
|
117
|
-
(Sister `~/.claude/commands/task-deliver.md` is also generated by `harnessed setup` so `/task-deliver` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed task-deliver --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
87
|
+
Use the Bash tool to run:
|
|
88
|
+
|
|
89
|
+
```bash
|
|
90
|
+
echo "$ARGUMENTS" | harnessed run task-deliver --task-stdin
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
If `$ARGUMENTS` is empty, run `harnessed run task-deliver` (no stdin pipe).
|
|
94
|
+
|
|
95
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
96
|
+
|
|
97
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
118
98
|
|
|
119
99
|
## References
|
|
120
100
|
|
|
@@ -63,39 +63,19 @@ Phase 01-test 条件性 fire `diagnose` (capabilities.yaml L55-64 mattpocock-ski
|
|
|
63
63
|
test fail 时进入 diagnose loop (reproduce → minimise → hypothesise → instrument →
|
|
64
64
|
fix → regression-test), 测试通过则 skip diagnose entirely。
|
|
65
65
|
|
|
66
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
67
66
|
## How to invoke
|
|
68
67
|
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
> 1. Red: write ONE failing test for the smallest behavior increment; run, watch it fail
|
|
81
|
-
>
|
|
82
|
-
> 2. Green: write the minimum code that makes it pass — nothing more
|
|
83
|
-
>
|
|
84
|
-
> 3. Refactor: clean up duplication / clarify names — keep tests green
|
|
85
|
-
>
|
|
86
|
-
> 4. Loop. Each cycle ≤10 min; if longer, the increment is too big — split
|
|
87
|
-
>
|
|
88
|
-
> 5. Negative cases matter: at least 1 test per error / edge / boundary
|
|
89
|
-
>
|
|
90
|
-
> 6. Test name = expected behavior, not 'test1', not 'should work'
|
|
91
|
-
>
|
|
92
|
-
> 7. On unexpected failure: stop adding tests; route to `/diagnose` for root cause
|
|
93
|
-
>
|
|
94
|
-
> **Output format**: structured report with severity-classified findings (red / green / refactored / blocked). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
95
|
-
|
|
96
|
-
(Role prompt is self-contained — works even when the upstream `tdd` user-skill / plugin isn't installed.)
|
|
97
|
-
|
|
98
|
-
(Sister `~/.claude/commands/task-test.md` is also generated by `harnessed setup` so `/task-test` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed task-test --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
68
|
+
Use the Bash tool to run:
|
|
69
|
+
|
|
70
|
+
```bash
|
|
71
|
+
echo "$ARGUMENTS" | harnessed run task-test --task-stdin
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
If `$ARGUMENTS` is empty, run `harnessed run task-test` (no stdin pipe).
|
|
75
|
+
|
|
76
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
77
|
+
|
|
78
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
99
79
|
|
|
100
80
|
## References
|
|
101
81
|
|
|
@@ -64,14 +64,19 @@ Sister `workflows/capabilities.yaml`:
|
|
|
64
64
|
|
|
65
65
|
- Slash command: `/verify` (bare per ADR 0030 namespace policy D-02 LOCK after `harnessed setup`)
|
|
66
66
|
|
|
67
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
68
67
|
## How to invoke
|
|
69
68
|
|
|
70
|
-
|
|
69
|
+
Use the Bash tool to run:
|
|
71
70
|
|
|
72
|
-
|
|
71
|
+
```bash
|
|
72
|
+
echo "$ARGUMENTS" | harnessed run verify --task-stdin
|
|
73
|
+
```
|
|
73
74
|
|
|
74
|
-
|
|
75
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify` (no stdin pipe).
|
|
76
|
+
|
|
77
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
78
|
+
|
|
79
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
75
80
|
|
|
76
81
|
## References
|
|
77
82
|
|
|
@@ -48,41 +48,19 @@ Sister `workflows/judgments/parallelism-gate.yaml`:
|
|
|
48
48
|
总 fire 当 `phase.stage == 'verify'` 后必跑串行 (verify-progress) 之后并行 fan-out。无 skip
|
|
49
49
|
条件 — code-review 多 agent 是 verify-work 第 3 phase 默认 fan-out (sister CLAUDE.md verbatim)。
|
|
50
50
|
|
|
51
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
52
51
|
## How to invoke
|
|
53
52
|
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
> 1. Read the diff against the base branch — full diff, not just summaries
|
|
66
|
-
>
|
|
67
|
-
> 2. Audit against CLAUDE.md (root + any directory-level CLAUDE.md)
|
|
68
|
-
>
|
|
69
|
-
> 3. Shallow scan for obvious bugs in changed lines (avoid context expansion)
|
|
70
|
-
>
|
|
71
|
-
> 4. Git blame on modified regions — bugs visible only in historical context
|
|
72
|
-
>
|
|
73
|
-
> 5. Previous PRs touching same files — recurring patterns / past comments
|
|
74
|
-
>
|
|
75
|
-
> 6. Inline code comments / docstrings — does the change violate stated invariants?
|
|
76
|
-
>
|
|
77
|
-
> 7. Score each finding 0-100; drop <80; cite file:line for kept findings
|
|
78
|
-
>
|
|
79
|
-
> 8. Avoid: pre-existing issues, linter-catchable nits, lines user did not modify
|
|
80
|
-
>
|
|
81
|
-
> **Output format**: structured report with severity-classified findings (critical / high / medium (only findings ≥80 confidence are reported)). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
82
|
-
|
|
83
|
-
(Role prompt is self-contained — works even when the upstream `code-review` user-skill / plugin isn't installed.)
|
|
84
|
-
|
|
85
|
-
(Sister `~/.claude/commands/verify-code-review.md` is also generated by `harnessed setup` so `/verify-code-review` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed verify-code-review --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
53
|
+
Use the Bash tool to run:
|
|
54
|
+
|
|
55
|
+
```bash
|
|
56
|
+
echo "$ARGUMENTS" | harnessed run verify-code-review --task-stdin
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify-code-review` (no stdin pipe).
|
|
60
|
+
|
|
61
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
62
|
+
|
|
63
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
86
64
|
|
|
87
65
|
## References
|
|
88
66
|
|
|
@@ -51,41 +51,19 @@ Sister `workflows/judgments/stage-routing.yaml`:
|
|
|
51
51
|
- 创意补充 / 不要 AI 味 → `frontend-design`
|
|
52
52
|
- 用户明示「独特 / 不要 AI 感」→ frontend-design 主导, 否则 ui-ux-pro-max 优先
|
|
53
53
|
|
|
54
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
55
54
|
## How to invoke
|
|
56
55
|
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
> 1. Classifier: marketing/landing vs app UI vs hybrid — apply matching rule set
|
|
69
|
-
>
|
|
70
|
-
> 2. Hard rejection: generic SaaS card grid / beautiful image weak brand / busy imagery behind text / carousel without narrative
|
|
71
|
-
>
|
|
72
|
-
> 3. Litmus: brand unmistakable first screen / one strong visual anchor / scannable by headlines / one job per section
|
|
73
|
-
>
|
|
74
|
-
> 4. Typography: expressive, not default stacks (Inter / Roboto / Arial / system)
|
|
75
|
-
>
|
|
76
|
-
> 5. Hero: full-bleed edge-to-edge / one composition / no cards in hero
|
|
77
|
-
>
|
|
78
|
-
> 6. Responsive ≠ stacked desktop on mobile — evaluate whether mobile layout makes design sense
|
|
79
|
-
>
|
|
80
|
-
> 7. Quick Wins section: 3-5 highest-impact fixes <30 min each
|
|
81
|
-
>
|
|
82
|
-
> 8. Every finding has a screenshot — annotated where possible (Read the file inline so user sees it)
|
|
83
|
-
>
|
|
84
|
-
> **Output format**: structured report with severity-classified findings (hard-reject / quick-win / nice-to-have). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
85
|
-
|
|
86
|
-
(Role prompt is self-contained — works even when the upstream `gstack-design-review` user-skill / plugin isn't installed.)
|
|
87
|
-
|
|
88
|
-
(Sister `~/.claude/commands/verify-design.md` is also generated by `harnessed setup` so `/verify-design` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed verify-design --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
56
|
+
Use the Bash tool to run:
|
|
57
|
+
|
|
58
|
+
```bash
|
|
59
|
+
echo "$ARGUMENTS" | harnessed run verify-design --task-stdin
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify-design` (no stdin pipe).
|
|
63
|
+
|
|
64
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
65
|
+
|
|
66
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
89
67
|
|
|
90
68
|
## References
|
|
91
69
|
|
|
@@ -64,39 +64,19 @@ Phase-level `on` clause (critical-release 升级触发):
|
|
|
64
64
|
- **Token 估算 prereq**: `team_cost < 2 × subagent_cost` (engine-level check per agent-teams.md L34)
|
|
65
65
|
- **Cleanup mandatory**: phase 02-team-cleanup `agent-teams-shutdown` 必跑 (防呆清单)
|
|
66
66
|
|
|
67
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
68
67
|
## How to invoke
|
|
69
68
|
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
> 1. Token-cost gate: estimate team_cost vs 2 × subagent_cost; only escalate when team wins
|
|
82
|
-
>
|
|
83
|
-
> 2. TeamCreate with 4 teammates: code-review / gstack-review / gstack-cso / gstack-qa
|
|
84
|
-
>
|
|
85
|
-
> 3. Each teammate's brief is self-contained (no shared session context to lean on)
|
|
86
|
-
>
|
|
87
|
-
> 4. Round-trip findings: each teammate sends top-3 findings; others rate (real / false-positive / nit)
|
|
88
|
-
>
|
|
89
|
-
> 5. Lead arbitrates conflicts; produces final report ordered CRITICAL → HIGH → MEDIUM
|
|
90
|
-
>
|
|
91
|
-
> 6. Cleanup MANDATORY: SendMessage shutdown_request to each teammate, then TeamDelete
|
|
92
|
-
>
|
|
93
|
-
> 7. If the gate doesn't fire (regular PR), DO NOT escalate — fall back to single-agent fan-out
|
|
94
|
-
>
|
|
95
|
-
> **Output format**: structured report with severity-classified findings (ship-blocker / ship-with-action / informational). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
96
|
-
|
|
97
|
-
(Role prompt is self-contained — works even when the upstream `agent-teams-create` user-skill / plugin isn't installed.)
|
|
98
|
-
|
|
99
|
-
(Sister `~/.claude/commands/verify-multispec.md` is also generated by `harnessed setup` so `/verify-multispec` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed verify-multispec --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
69
|
+
Use the Bash tool to run:
|
|
70
|
+
|
|
71
|
+
```bash
|
|
72
|
+
echo "$ARGUMENTS" | harnessed run verify-multispec --task-stdin
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify-multispec` (no stdin pipe).
|
|
76
|
+
|
|
77
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
78
|
+
|
|
79
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
100
80
|
|
|
101
81
|
## References
|
|
102
82
|
|
|
@@ -50,43 +50,19 @@ Sister `workflows/judgments/stage-routing.yaml`:
|
|
|
50
50
|
- ✅ **触发**: 关键模块 PR 前 (auth / payment / data migration / core algorithm 等)
|
|
51
51
|
- ❌ **跳过**: 常规 PR / docs / config / 非核心 module
|
|
52
52
|
|
|
53
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
54
53
|
## How to invoke
|
|
55
54
|
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
> 1. SQL & Data Safety — string interpolation, TOCTOU races, validation bypass, N+1
|
|
68
|
-
>
|
|
69
|
-
> 2. Race conditions & concurrency — read-check-write without unique constraint, missing atomic UPDATE
|
|
70
|
-
>
|
|
71
|
-
> 3. LLM output trust boundary — unvalidated LLM-generated values to DB / SSRF / stored prompt injection
|
|
72
|
-
>
|
|
73
|
-
> 4. Shell injection — subprocess shell=True with interpolation, os.system, eval/exec on LLM output
|
|
74
|
-
>
|
|
75
|
-
> 5. Enum & value completeness — new enum/status/tier value reached every consumer (case/if-chains/allowlists)
|
|
76
|
-
>
|
|
77
|
-
> 6. Async/sync mixing — sync I/O inside async def, time.sleep in async
|
|
78
|
-
>
|
|
79
|
-
> 7. Column/field name safety — ORM .select/.eq columns match schema
|
|
80
|
-
>
|
|
81
|
-
> 8. Type coercion at boundaries — hash/digest inputs normalized before serialize
|
|
82
|
-
>
|
|
83
|
-
> 9. Time window safety — date-key lookups assuming 24h coverage; mismatched buckets between features
|
|
84
|
-
>
|
|
85
|
-
> **Output format**: structured report with severity-classified findings (CRITICAL / INFORMATIONAL (Fix-First Heuristic — critical → ASK, informational → AUTO-FIX)). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
86
|
-
|
|
87
|
-
(Role prompt is self-contained — works even when the upstream `gstack-review` user-skill / plugin isn't installed.)
|
|
88
|
-
|
|
89
|
-
(Sister `~/.claude/commands/verify-paranoid.md` is also generated by `harnessed setup` so `/verify-paranoid` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed verify-paranoid --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
55
|
+
Use the Bash tool to run:
|
|
56
|
+
|
|
57
|
+
```bash
|
|
58
|
+
echo "$ARGUMENTS" | harnessed run verify-paranoid --task-stdin
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify-paranoid` (no stdin pipe).
|
|
62
|
+
|
|
63
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
64
|
+
|
|
65
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
90
66
|
|
|
91
67
|
## References
|
|
92
68
|
|
|
@@ -46,37 +46,19 @@ Sister `workflows/capabilities.yaml` entries:
|
|
|
46
46
|
总 fire 当 `phase.stage == 'verify'` (sister `workflows/judgments/stage-routing.yaml`
|
|
47
47
|
verify-progress-always trigger)。无 skip 条件 — verify-work 起点必跑。
|
|
48
48
|
|
|
49
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
50
49
|
## How to invoke
|
|
51
50
|
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
> 1. Read the phase's acceptance criteria from PLAN.md / task_plan.md
|
|
64
|
-
>
|
|
65
|
-
> 2. For each criterion, demonstrate it passes (test result, manual UAT log, screenshot)
|
|
66
|
-
>
|
|
67
|
-
> 3. Flag any criterion that is partial / stubbed / TODO — do NOT mark complete
|
|
68
|
-
>
|
|
69
|
-
> 4. Sync ROADMAP.md / STATE.md / REQUIREMENTS.md via gsd-progress
|
|
70
|
-
>
|
|
71
|
-
> 5. Append `progress.md` with completed subtask hash + verification artifact
|
|
72
|
-
>
|
|
73
|
-
> 6. If acceptance is incomplete, route to bug-fix and re-verify; do not advance
|
|
74
|
-
>
|
|
75
|
-
> **Output format**: structured report with severity-classified findings (accepted / partial / blocked / failed). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
76
|
-
|
|
77
|
-
(Role prompt is self-contained — works even when the upstream `gsd-verify-work` user-skill / plugin isn't installed.)
|
|
78
|
-
|
|
79
|
-
(Sister `~/.claude/commands/verify-progress.md` is also generated by `harnessed setup` so `/verify-progress` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed verify-progress --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
51
|
+
Use the Bash tool to run:
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
echo "$ARGUMENTS" | harnessed run verify-progress --task-stdin
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify-progress` (no stdin pipe).
|
|
58
|
+
|
|
59
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
60
|
+
|
|
61
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
80
62
|
|
|
81
63
|
## References
|
|
82
64
|
|
|
@@ -52,39 +52,19 @@ Sister `workflows/judgments/stage-routing.yaml`:
|
|
|
52
52
|
- setup 需 Python 后端 (Tortoise ORM / pandas) → `webapp-testing` skill
|
|
53
53
|
- 性能 / a11y / 内存诊断 → 不在此 sub-workflow,用 `chrome-devtools-mcp`
|
|
54
54
|
|
|
55
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
56
55
|
## How to invoke
|
|
57
56
|
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
> 1. Orient: map the application (links, framework detection, initial console errors)
|
|
70
|
-
>
|
|
71
|
-
> 2. Per page: visual scan, interactive elements work, console clean, responsive check
|
|
72
|
-
>
|
|
73
|
-
> 3. Forms: empty / invalid / edge cases — error messages clear and actionable
|
|
74
|
-
>
|
|
75
|
-
> 4. Navigation: every path in and out works, no dead-ends
|
|
76
|
-
>
|
|
77
|
-
> 5. States: empty, loading, error, overflow — none look like AI placeholder
|
|
78
|
-
>
|
|
79
|
-
> 6. Mobile: 375x812 viewport — real layout, not stacked desktop
|
|
80
|
-
>
|
|
81
|
-
> 7. Authenticated paths if creds / cookies provided; depth > breadth on core flows
|
|
82
|
-
>
|
|
83
|
-
> **Output format**: structured report with severity-classified findings (blocker / major / minor / nit). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
84
|
-
|
|
85
|
-
(Role prompt is self-contained — works even when the upstream `gstack-qa` user-skill / plugin isn't installed.)
|
|
86
|
-
|
|
87
|
-
(Sister `~/.claude/commands/verify-qa.md` is also generated by `harnessed setup` so `/verify-qa` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed verify-qa --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
57
|
+
Use the Bash tool to run:
|
|
58
|
+
|
|
59
|
+
```bash
|
|
60
|
+
echo "$ARGUMENTS" | harnessed run verify-qa --task-stdin
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify-qa` (no stdin pipe).
|
|
64
|
+
|
|
65
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
66
|
+
|
|
67
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
88
68
|
|
|
89
69
|
## References
|
|
90
70
|
|
|
@@ -47,41 +47,19 @@ Sister `workflows/judgments/stage-routing.yaml`:
|
|
|
47
47
|
- ✅ **触发**: auth flow / session / credentials / API keys / SQL injection 路径 / OWASP top 10 area
|
|
48
48
|
- ❌ **跳过**: docs / 纯 UI styling / 内部 refactor / non-security PR
|
|
49
49
|
|
|
50
|
-
<!-- v3.4.3-dual-path-invocation -->
|
|
51
50
|
## How to invoke
|
|
52
51
|
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
> 1. OWASP Top 10: injection / broken auth / sensitive data exposure / XXE / broken access control / misconfig / XSS / insecure deserialize / known-vuln deps / insufficient logging
|
|
65
|
-
>
|
|
66
|
-
> 2. Secrets archaeology: git history scan for leaked credentials, .env tracked files, CI inline secrets
|
|
67
|
-
>
|
|
68
|
-
> 3. Auth boundaries: every protected route enforces auth (not just CSR check); authorization not transitive across requests
|
|
69
|
-
>
|
|
70
|
-
> 4. CSRF / SSRF / stored prompt injection where LLM output enters knowledge bases
|
|
71
|
-
>
|
|
72
|
-
> 5. CI/CD: pull_request_target + checkout PR code, script injection via github.event.*, unpinned third-party actions
|
|
73
|
-
>
|
|
74
|
-
> 6. Dockerfiles: missing USER (root), secrets as ARG, .env in image, exposed ports without purpose
|
|
75
|
-
>
|
|
76
|
-
> 7. IaC: wildcard IAM, hardcoded secrets in .tfvars, privileged containers, hostNetwork in K8s
|
|
77
|
-
>
|
|
78
|
-
> 8. Dependency audit (npm audit / pip-audit / bundler-audit) — note SKIPPED tools rather than fail audit
|
|
79
|
-
>
|
|
80
|
-
> **Output format**: structured report with severity-classified findings (CRITICAL / HIGH / MEDIUM / LOW / INFO). One finding per line: `[severity] file:line — problem (one sentence); fix: suggested change`. If no findings, say so explicitly. No preamble, no end-of-report summary.
|
|
81
|
-
|
|
82
|
-
(Role prompt is self-contained — works even when the upstream `gstack-cso` user-skill / plugin isn't installed.)
|
|
83
|
-
|
|
84
|
-
(Sister `~/.claude/commands/verify-security.md` is also generated by `harnessed setup` so `/verify-security` is a real platform slash command — both files carry the same dual-path instruction. Previous v3.4.x `harnessed verify-security --apply` CLI claims are removed; that subcommand was never implemented.)
|
|
52
|
+
Use the Bash tool to run:
|
|
53
|
+
|
|
54
|
+
```bash
|
|
55
|
+
echo "$ARGUMENTS" | harnessed run verify-security --task-stdin
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
If `$ARGUMENTS` is empty, run `harnessed run verify-security` (no stdin pipe).
|
|
59
|
+
|
|
60
|
+
After completion, the Bash output prints a `Next:` hint on stderr suggesting the next stage. Decide whether to invoke based on conversation context — the hint is informational, not prescriptive.
|
|
61
|
+
|
|
62
|
+
<!-- harnessed-generated:v3.4.4 -->
|
|
85
63
|
|
|
86
64
|
## References
|
|
87
65
|
|