@event4u/agent-config 1.31.0 → 1.32.0
- package/.agent-src/skills/feature-planning/SKILL.md +43 -7
- package/.agent-src/skills/judge-test-coverage/SKILL.md +4 -0
- package/.agent-src/skills/pest-testing/SKILL.md +13 -6
- package/.agent-src/skills/quality-tools/SKILL.md +4 -0
- package/.agent-src/skills/refine-prompt/SKILL.md +10 -0
- package/.agent-src/skills/refine-ticket/SKILL.md +12 -0
- package/.agent-src/skills/subagent-orchestration/SKILL.md +77 -12
- package/.agent-src/skills/subagent-orchestration/prompts/README.md +29 -0
- package/.agent-src/skills/subagent-orchestration/prompts/do-and-judge-two-stage.md +121 -0
- package/.agent-src/skills/subagent-orchestration/prompts/do-and-judge.md +60 -0
- package/.agent-src/skills/subagent-orchestration/prompts/do-competitively.md +65 -0
- package/.agent-src/skills/subagent-orchestration/prompts/do-in-parallel.md +62 -0
- package/.agent-src/skills/subagent-orchestration/prompts/do-in-steps.md +62 -0
- package/.agent-src/skills/subagent-orchestration/prompts/do-in-worktrees.md +70 -0
- package/.agent-src/skills/subagent-orchestration/prompts/judge-with-debate.md +63 -0
- package/.agent-src/skills/subagent-orchestration/schemas/subagent-status.json +63 -0
- package/.agent-src/skills/test-driven-development/SKILL.md +25 -13
- package/.agent-src/skills/testing-anti-patterns/SKILL.md +7 -0
- package/.agent-src/skills/testing-anti-patterns/process-anti-patterns.md +67 -0
- package/.claude-plugin/marketplace.json +1 -1
- package/CHANGELOG.md +24 -0
- package/docs/catalog.md +1 -1
- package/docs/contracts/file-ownership-matrix.json +341 -0
- package/package.json +1 -1
- package/scripts/check_bite_sized_granularity.py +99 -0
|
@@ -0,0 +1,62 @@
|
|
|
1
|
+
# Prompt — do-in-parallel
|
|
2
|
+
|
|
3
|
+
Mode reference: [`../SKILL.md`](../SKILL.md) § *3. do-in-parallel*.
|
|
4
|
+
|
|
5
|
+
## Implementer prompt (per slice)
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
You are the implementer for SLICE {{slice_id}} in a parallel-dispatch
|
|
9
|
+
run. {{n_slices}} slices run concurrently. Slices are guaranteed
|
|
10
|
+
independent — different files, no shared state.
|
|
11
|
+
|
|
12
|
+
SLICE: {{slice_description}}
|
|
13
|
+
CONTEXT FILES (this slice only): {{file_paths}}
|
|
14
|
+
SHARED-STATE BAN: {{shared_paths_to_avoid}}
|
|
15
|
+
|
|
16
|
+
CONSTRAINTS:
|
|
17
|
+
- Do NOT touch any file outside the cited paths. The orchestrator
|
|
18
|
+
verified independence — violating it causes a merge race.
|
|
19
|
+
- Do NOT communicate with other slices. They are doing their own work.
|
|
20
|
+
- Write tests scoped to your slice; do not assert on cross-slice
|
|
21
|
+
behavior.
|
|
22
|
+
|
|
23
|
+
ON COMPLETION, return ONE envelope per schemas/subagent-status.json:
|
|
24
|
+
- DONE — slice shipped clean; evidence[] required.
|
|
25
|
+
- DONE_WITH_CONCERNS — shipped but mark concerns[] for the
|
|
26
|
+
aggregating judge to surface.
|
|
27
|
+
- NEEDS_CONTEXT — paused; orchestrator must answer
|
|
28
|
+
blocking_question. Other slices keep running.
|
|
29
|
+
- BLOCKED — slice cannot complete in isolation; explain
|
|
30
|
+
in blocking_reason. Other slices keep running;
|
|
31
|
+
aggregating judge handles partial outcome.
|
|
32
|
+
```
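The `{{...}}` tokens in the prompt above are placeholders that the orchestrator fills in before dispatch. The diff does not show how that substitution is done, so the following is only a minimal sketch, assuming plain string substitution; `render_prompt` and `PLACEHOLDER` are hypothetical names, not part of the package.

```python
from __future__ import annotations

import re

# Hypothetical helper: matches {{name}} tokens such as {{slice_id}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(template: str, values: dict[str, str]) -> str:
    """Fill every {{name}} token; refuse to render if any value is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(values))
    if missing:
        # An unfilled placeholder would reach the subagent verbatim, so fail early.
        raise KeyError(f"missing placeholder values: {missing}")
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)
```

Failing on a missing value, rather than leaving the token in place, is a design choice of this sketch: a prompt that still contains `{{file_paths}}` gives the implementer no usable scope.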
|
|
33
|
+
|
|
34
|
+
## Judge prompt (run once on aggregate)
|
|
35
|
+
|
|
36
|
+
```
|
|
37
|
+
You are the judge running ONCE over the merged output of N parallel
|
|
38
|
+
slices. Per-slice judges were skipped to keep cost linear.
|
|
39
|
+
|
|
40
|
+
SLICE ENVELOPES: {{envelopes_array}}
|
|
41
|
+
AGGREGATED DIFF: {{merged_diff}}
|
|
42
|
+
TEST OUTPUT (full suite): {{test_output}}
|
|
43
|
+
|
|
44
|
+
VERDICT (one envelope, schemas/subagent-status.json):
|
|
45
|
+
- DONE — every slice DONE or DONE_WITH_CONCERNS that
|
|
46
|
+
you accept; evidence[] cites the merge being
|
|
47
|
+
test-green.
|
|
48
|
+
- DONE_WITH_CONCERNS — accept the aggregate, but consolidated
|
|
49
|
+
concerns[] from all slices need caller action.
|
|
50
|
+
- NEEDS_CONTEXT — one or more slices need clarification before
|
|
51
|
+
the aggregate can land; cite which.
|
|
52
|
+
- BLOCKED — aggregate is broken; cite the slice(s) that
|
|
53
|
+
must be re-run.
|
|
54
|
+
|
|
55
|
+
INDEPENDENCE-VIOLATION CHECK: scan for files touched by more than one
|
|
56
|
+
slice. If found, return BLOCKED — the dispatch was unsafe.
|
|
57
|
+
```
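The independence-violation check in the judge prompt amounts to finding any file claimed by more than one slice. A minimal sketch follows, assuming the orchestrator can supply a mapping from slice id to the paths that slice touched; that input shape and the name `find_shared_files` are illustrative assumptions, not something the diff defines.

```python
from __future__ import annotations

from collections import defaultdict


def find_shared_files(files_by_slice: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each file touched by more than one slice to the slice ids that touched it."""
    owners: dict[str, set[str]] = defaultdict(set)
    for slice_id, paths in files_by_slice.items():
        for path in paths:
            owners[path].add(slice_id)
    # Only files with two or more owners violate independence.
    return {path: sorted(ids) for path, ids in sorted(owners.items()) if len(ids) > 1}
```

A non-empty result corresponds to the BLOCKED verdict described above; an empty dictionary means no file was shared between slices.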
|
|
58
|
+
|
|
59
|
+
## Failure-isolation rule
|
|
60
|
+
|
|
61
|
+
A slice returning BLOCKED does not abort the other slices. The
|
|
62
|
+
aggregating judge decides whether the partial result lands.
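One way to picture the failure-isolation rule is a dispatcher that treats BLOCKED as an ordinary return value rather than an error. This is a sketch only, assuming a thread pool and a hypothetical `dispatch` callable that always returns an envelope dictionary; the package's real dispatch mechanism is not shown in the diff.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_slices(slice_ids: list[str], dispatch: Callable[[str], dict]) -> list[dict]:
    """Run every slice to completion and return the envelopes in slice order."""
    with ThreadPoolExecutor() as pool:
        # map() collects a result for every slice; a BLOCKED envelope is just
        # another result and does not cancel the remaining slices.
        return list(pool.map(dispatch, slice_ids))


def blocked_slices(slice_ids: list[str], envelopes: list[dict]) -> list[str]:
    """List the slice ids the aggregating judge would need to cite as BLOCKED."""
    return [sid for sid, env in zip(slice_ids, envelopes) if env["status"] == "BLOCKED"]
```

Note the limit of the sketch: if `dispatch` raises instead of returning an envelope, the exception propagates out of `run_slices`, so a real dispatcher would need to convert failures into BLOCKED envelopes first.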
|
|
@@ -0,0 +1,62 @@
|
|
|
1
|
+
# Prompt — do-in-steps
|
|
2
|
+
|
|
3
|
+
Mode reference: [`../SKILL.md`](../SKILL.md) § *2. do-in-steps*.
|
|
4
|
+
|
|
5
|
+
## Implementer prompt (per step)
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
You are the implementer for STEP {{step_number}} of {{total_steps}} in a
|
|
9
|
+
sequential plan. Earlier steps that PASSED judgment are committed; their
|
|
10
|
+
diffs are read-only context.
|
|
11
|
+
|
|
12
|
+
PLAN: {{plan_summary}}
|
|
13
|
+
THIS STEP: {{step_description}}
|
|
14
|
+
PRIOR STEP DIFFS (read-only): {{prior_diffs}}
|
|
15
|
+
CONTEXT FILES: {{file_paths}}
|
|
16
|
+
|
|
17
|
+
CONSTRAINTS:
|
|
18
|
+
- Do NOT modify code from prior steps; their tests must still pass.
|
|
19
|
+
- Do NOT preempt later steps; one step at a time.
|
|
20
|
+
- Write the test for THIS step before the production code.
|
|
21
|
+
|
|
22
|
+
ON COMPLETION, return ONE envelope per schemas/subagent-status.json:
|
|
23
|
+
- DONE — step complete, gate green; cite evidence[].
|
|
24
|
+
- DONE_WITH_CONCERNS — step complete but flag carry-over concerns
|
|
25
|
+
for later steps.
|
|
26
|
+
- NEEDS_CONTEXT — paused; blocking_question must be answered
|
|
27
|
+
before this step can complete.
|
|
28
|
+
- BLOCKED — step cannot complete on the current plan;
|
|
29
|
+
blocking_reason explains why. The orchestrator
|
|
30
|
+
may revise the plan and re-dispatch.
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
## Judge prompt (between steps)
|
|
34
|
+
|
|
35
|
+
```
|
|
36
|
+
You are the judge reviewing STEP {{step_number}} before STEP
|
|
37
|
+
{{step_number_plus_one}} starts. A failing step here cascades into the
|
|
38
|
+
next, so verdicts are stricter than a one-shot do-and-judge.
|
|
39
|
+
|
|
40
|
+
STEP DIFF: {{diff}}
|
|
41
|
+
STEP TESTS: {{test_output}}
|
|
42
|
+
PRIOR STEPS: {{prior_step_summaries}}
|
|
43
|
+
NEXT STEP DESCRIPTION: {{next_step_description}}
|
|
44
|
+
|
|
45
|
+
VERDICT — return ONE envelope per schemas/subagent-status.json:
|
|
46
|
+
- DONE — proceed to next step; evidence[] required.
|
|
47
|
+
- DONE_WITH_CONCERNS — proceed, but next step's prompt MUST surface
|
|
48
|
+
the concerns[] so the implementer compensates.
|
|
49
|
+
- NEEDS_CONTEXT — pause; orchestrator answers blocking_question
|
|
50
|
+
before next step.
|
|
51
|
+
- BLOCKED — do not start next step; this step is wrong.
|
|
52
|
+
|
|
53
|
+
DOWNSTREAM IMPACT CHECK: name one way this diff could break the next
|
|
54
|
+
step. If you cannot, return DONE. If you can but the implementer
|
|
55
|
+
already mitigated, DONE. Otherwise DONE_WITH_CONCERNS.
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
## Cascade rule
|
|
59
|
+
|
|
60
|
+
A step that returns BLOCKED stops the chain. The orchestrator does not
|
|
61
|
+
"jump ahead" or re-order — it surfaces the BLOCKED envelope to the user
|
|
62
|
+
and waits.
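The cascade rule can be sketched as a loop that never looks past a halting envelope. The names `run_chain` and `dispatch` are hypothetical, and treating NEEDS_CONTEXT as a halt alongside BLOCKED is an assumption drawn from the judge prompt above, which pauses the chain until the blocking question is answered.

```python
from __future__ import annotations

from typing import Callable

# Statuses after which the next step must not start.
HALTING = {"BLOCKED", "NEEDS_CONTEXT"}


def run_chain(steps: list[str], dispatch: Callable[[str], dict]) -> list[dict]:
    """Run steps in order; stop at the first halting envelope and return what ran."""
    envelopes = []
    for step in steps:
        envelope = dispatch(step)
        envelopes.append(envelope)
        if envelope["status"] in HALTING:
            # No jumping ahead and no re-ordering: the caller surfaces this
            # envelope to the user and waits.
            break
    return envelopes
```

The last element of the returned list is the envelope to surface when the chain stopped early.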
|
|
@@ -0,0 +1,70 @@
|
|
|
1
|
+
# Prompt — do-in-worktrees
|
|
2
|
+
|
|
3
|
+
Mode reference: [`../SKILL.md`](../SKILL.md) § *6. do-in-worktrees*.
|
|
4
|
+
Worktree creation/destruction lives in [`../../using-git-worktrees/SKILL.md`](../../using-git-worktrees/SKILL.md).
|
|
5
|
+
|
|
6
|
+
## Implementer prompt (per worktree step)
|
|
7
|
+
|
|
8
|
+
```
|
|
9
|
+
You are the implementer for STEP {{step_id}} in a cross-wing chain.
|
|
10
|
+
You are running INSIDE a fresh git worktree at {{worktree_path}} on
|
|
11
|
+
branch {{branch_name}}. Prior step's open files / branch state cannot
|
|
12
|
+
leak into this worktree — that is the whole point.
|
|
13
|
+
|
|
14
|
+
STEP TYPED INPUT (from prior step's ## Output): {{typed_input}}
|
|
15
|
+
STEP DESCRIPTION: {{step_description}}
|
|
16
|
+
EXPECTED ## Output (next step's ## Input): {{expected_output_shape}}
|
|
17
|
+
|
|
18
|
+
CONSTRAINTS:
|
|
19
|
+
- Stay inside the worktree path. Do NOT cd to the parent repo.
|
|
20
|
+
- Do NOT touch branches other than {{branch_name}}.
|
|
21
|
+
- Produce the expected ## Output shape literally — the next worktree's
|
|
22
|
+
implementer consumes it as ## Input.
|
|
23
|
+
- Run the chain-end test for THIS step before signaling completion.
|
|
24
|
+
|
|
25
|
+
ON COMPLETION, return ONE envelope per schemas/subagent-status.json:
|
|
26
|
+
- DONE — step output produced and validated; evidence[]
|
|
27
|
+
cites the typed-output file path.
|
|
28
|
+
- DONE_WITH_CONCERNS — output produced but flag carry-over for next
|
|
29
|
+
worktree; concerns[] surfaces in next step's
|
|
30
|
+
dispatch.
|
|
31
|
+
- NEEDS_CONTEXT — paused; chain pauses until orchestrator
|
|
32
|
+
answers blocking_question. Other worktrees
|
|
33
|
+
are NOT running concurrently in this mode.
|
|
34
|
+
- BLOCKED — step cannot complete; chain halts. The
|
|
35
|
+
orchestrator decides whether to drop the
|
|
36
|
+
worktree or rescope.
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
## Chain-end judge prompt (run once after final worktree)
|
|
40
|
+
|
|
41
|
+
```
|
|
42
|
+
You are the chain-end judge. The chain produced N typed outputs, one
|
|
43
|
+
per worktree. Validate the final integration PR against the chain's
|
|
44
|
+
goal.
|
|
45
|
+
|
|
46
|
+
CHAIN STEPS: {{step_summaries_array}}
|
|
47
|
+
TYPED OUTPUTS: {{outputs_array}}
|
|
48
|
+
INTEGRATION PR DIFF: {{integration_diff}}
|
|
49
|
+
|
|
50
|
+
VERDICT (one envelope, schemas/subagent-status.json):
|
|
51
|
+
- DONE — chain landed cleanly; evidence[] cites each
|
|
52
|
+
step's typed output and the integration test
|
|
53
|
+
run.
|
|
54
|
+
- DONE_WITH_CONCERNS — chain landed but consolidated concerns[]
|
|
55
|
+
across steps need follow-up.
|
|
56
|
+
- NEEDS_CONTEXT — integration is unclear; cite which step(s)
|
|
57
|
+
need clarification.
|
|
58
|
+
- BLOCKED — integration is broken; cite the worktree(s)
|
|
59
|
+
that must be redone. Do NOT silently rewrite.
|
|
60
|
+
|
|
61
|
+
WORKTREE-LEAK CHECK: scan the integration diff for branch names or
|
|
62
|
+
files belonging to a different worktree's step. If found, BLOCKED —
|
|
63
|
+
isolation was violated.
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
## Sequential-not-parallel rule
|
|
67
|
+
|
|
68
|
+
`do-in-worktrees` runs steps sequentially across isolated worktrees.
|
|
69
|
+
Parallel concurrent worktrees are `do-in-parallel` with explicit
|
|
70
|
+
isolation, not this mode.
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
# Prompt — judge-with-debate
|
|
2
|
+
|
|
3
|
+
Mode reference: [`../SKILL.md`](../SKILL.md) § *5. judge-with-debate*.
|
|
4
|
+
|
|
5
|
+
## Judge-A / Judge-B prompt (run twice in parallel)
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
You are JUDGE {{judge_letter}} reviewing a high-stakes diff (security,
|
|
9
|
+
data integrity, public API). Another judge is independently reviewing
|
|
10
|
+
the same diff. A meta-judge will reconcile your verdicts.
|
|
11
|
+
|
|
12
|
+
DO NOT reach for the safe answer. Disagreement IS the value.
|
|
13
|
+
|
|
14
|
+
TASK: {{task_description}}
|
|
15
|
+
DIFF: {{diff}}
|
|
16
|
+
TEST OUTPUT: {{test_output}}
|
|
17
|
+
SENSITIVITY: {{security_or_data_or_api}}
|
|
18
|
+
|
|
19
|
+
VERDICT (one envelope, schemas/subagent-status.json):
|
|
20
|
+
- DONE — diff is correct and safe; evidence[] cites
|
|
21
|
+
the specific defenses you verified.
|
|
22
|
+
- DONE_WITH_CONCERNS — correct but the failure modes you can name
|
|
23
|
+
go in concerns[].
|
|
24
|
+
- NEEDS_CONTEXT — paused; meta-judge will adjudicate after
|
|
25
|
+
orchestrator answers blocking_question.
|
|
26
|
+
- BLOCKED — diff is wrong; explain in blocking_reason.
|
|
27
|
+
|
|
28
|
+
NAME ONE FAILURE MODE you actively looked for, even if you did not
|
|
29
|
+
find it. "I looked for X, did not find it" is stronger evidence than
|
|
30
|
+
"this looks fine".
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
## Meta-judge prompt (run once after Judge-A and Judge-B return)
|
|
34
|
+
|
|
35
|
+
```
|
|
36
|
+
You are the META-JUDGE reconciling two independent verdicts. Both
|
|
37
|
+
judges saw the same diff; their envelopes are below.
|
|
38
|
+
|
|
39
|
+
JUDGE-A ENVELOPE: {{envelope_a}}
|
|
40
|
+
JUDGE-B ENVELOPE: {{envelope_b}}
|
|
41
|
+
DIFF: {{diff}}
|
|
42
|
+
|
|
43
|
+
RECONCILIATION RULES:
|
|
44
|
+
1. Both DONE → your verdict is DONE.
|
|
45
|
+
2. Either BLOCKED → your verdict is BLOCKED. No tiebreaker.
|
|
46
|
+
3. One DONE, one DONE_WITH_CONCERNS → DONE_WITH_CONCERNS (carry the
|
|
47
|
+
concerns).
|
|
48
|
+
4. One NEEDS_CONTEXT → consolidate blocking_question(s); your status
|
|
49
|
+
is NEEDS_CONTEXT.
|
|
50
|
+
5. Mixed otherwise → DONE_WITH_CONCERNS, listing every concern from
|
|
51
|
+
both judges.
|
|
52
|
+
|
|
53
|
+
DISAGREEMENT IS THE VALUE: do NOT average. The stricter verdict wins.
|
|
54
|
+
|
|
55
|
+
VERDICT (one envelope, schemas/subagent-status.json) using the rules
|
|
56
|
+
above. Cite both judges' evidence[] in your evidence[].
|
|
57
|
+
```
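The five reconciliation rules in the meta-judge prompt can be read as an ordered decision function over the two statuses. The sketch below is one possible reading, not the package's implementation: `reconcile` is a hypothetical name, and rule 4 is interpreted as "at least one judge returned NEEDS_CONTEXT".

```python
def reconcile(status_a: str, status_b: str) -> str:
    """Apply the reconciliation rules in order and return the meta-verdict status."""
    statuses = {status_a, status_b}
    if statuses == {"DONE"}:  # rule 1: both DONE
        return "DONE"
    if "BLOCKED" in statuses:  # rule 2: either BLOCKED, no tiebreaker
        return "BLOCKED"
    if statuses == {"DONE", "DONE_WITH_CONCERNS"}:  # rule 3: carry the concerns
        return "DONE_WITH_CONCERNS"
    if "NEEDS_CONTEXT" in statuses:  # rule 4 (assumed: at least one judge)
        return "NEEDS_CONTEXT"
    return "DONE_WITH_CONCERNS"  # rule 5: mixed otherwise
```

Because the rules are checked in their listed order, a BLOCKED verdict from one judge overrides a NEEDS_CONTEXT from the other, which matches the prompt's instruction not to average the two verdicts.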
|
|
58
|
+
|
|
59
|
+
## High-stakes-only rule
|
|
60
|
+
|
|
61
|
+
`judge-with-debate` is two-judges-plus-meta = three subagent calls per
|
|
62
|
+
review. Reserve for security, data migration, public API, or
|
|
63
|
+
cross-tenant boundaries. Routine refactors use plain `do-and-judge`.
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
{
|
|
2
|
+
"$schema": "https://json-schema.org/draft-07/schema#",
|
|
3
|
+
"$id": "https://event4u.dev/agent-config/schemas/subagent-status.json",
|
|
4
|
+
"title": "Subagent Status Envelope",
|
|
5
|
+
"description": "Required wire format for every implementer or judge subagent return. Loaded by /do-and-judge, /do-in-steps, and the subagent-orchestration skill. Hand-validated by tests/test_subagent_status_schema.py (no jsonschema runtime dep).",
|
|
6
|
+
"type": "object",
|
|
7
|
+
"required": ["status", "summary"],
|
|
8
|
+
"additionalProperties": false,
|
|
9
|
+
"properties": {
|
|
10
|
+
"status": {
|
|
11
|
+
"type": "string",
|
|
12
|
+
"enum": ["DONE", "DONE_WITH_CONCERNS", "NEEDS_CONTEXT", "BLOCKED"],
|
|
13
|
+
"description": "DONE = work shipped, gate green. DONE_WITH_CONCERNS = work shipped but caller must read concerns[]. NEEDS_CONTEXT = paused, blocking_question must be answered. BLOCKED = halted, blocking_reason describes why no path forward."
|
|
14
|
+
},
|
|
15
|
+
"summary": {
|
|
16
|
+
"type": "string",
|
|
17
|
+
"minLength": 1,
|
|
18
|
+
"description": "One- or two-sentence outcome. Required for every status."
|
|
19
|
+
},
|
|
20
|
+
"evidence": {
|
|
21
|
+
"type": "array",
|
|
22
|
+
"items": {"type": "string", "minLength": 1},
|
|
23
|
+
"description": "Citations: file:line, command output, test name, contract section. Required for DONE / DONE_WITH_CONCERNS."
|
|
24
|
+
},
|
|
25
|
+
"concerns": {
|
|
26
|
+
"type": "array",
|
|
27
|
+
"items": {"type": "string", "minLength": 1},
|
|
28
|
+
"description": "Caller-actionable concerns. Required when status = DONE_WITH_CONCERNS, must be empty otherwise."
|
|
29
|
+
},
|
|
30
|
+
"blocking_question": {
|
|
31
|
+
"type": "string",
|
|
32
|
+
"minLength": 1,
|
|
33
|
+
"description": "Single specific question whose answer would unblock. Required when status = NEEDS_CONTEXT."
|
|
34
|
+
},
|
|
35
|
+
"blocking_reason": {
|
|
36
|
+
"type": "string",
|
|
37
|
+
"minLength": 1,
|
|
38
|
+
"description": "Why no path forward exists. Required when status = BLOCKED. Distinguish from NEEDS_CONTEXT (where caller can supply context)."
|
|
39
|
+
},
|
|
40
|
+
"next_action": {
|
|
41
|
+
"type": "string",
|
|
42
|
+
"description": "What the caller does now. Optional; orchestrator infers from status when omitted."
|
|
43
|
+
}
|
|
44
|
+
},
|
|
45
|
+
"allOf": [
|
|
46
|
+
{
|
|
47
|
+
"if": {"properties": {"status": {"const": "DONE"}}, "required": ["status"]},
|
|
48
|
+
"then": {"required": ["evidence"], "not": {"required": ["concerns"]}}
|
|
49
|
+
},
|
|
50
|
+
{
|
|
51
|
+
"if": {"properties": {"status": {"const": "DONE_WITH_CONCERNS"}}, "required": ["status"]},
|
|
52
|
+
"then": {"required": ["evidence", "concerns"]}
|
|
53
|
+
},
|
|
54
|
+
{
|
|
55
|
+
"if": {"properties": {"status": {"const": "NEEDS_CONTEXT"}}, "required": ["status"]},
|
|
56
|
+
"then": {"required": ["blocking_question"]}
|
|
57
|
+
},
|
|
58
|
+
{
|
|
59
|
+
"if": {"properties": {"status": {"const": "BLOCKED"}}, "required": ["status"]},
|
|
60
|
+
"then": {"required": ["blocking_reason"]}
|
|
61
|
+
}
|
|
62
|
+
]
|
|
63
|
+
}
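The schema's description says envelopes are hand-validated without a `jsonschema` runtime dependency. The contents of `tests/test_subagent_status_schema.py` are not part of this diff, so the following is only a sketch of what such a hand-written check might look like; `validate_envelope` and the error strings are hypothetical.

```python
from __future__ import annotations

from typing import Any

STATUSES = {"DONE", "DONE_WITH_CONCERNS", "NEEDS_CONTEXT", "BLOCKED"}
ALLOWED_KEYS = {
    "status", "summary", "evidence", "concerns",
    "blocking_question", "blocking_reason", "next_action",
}
# Fields each status requires, mirroring the schema's allOf / if-then rules.
REQUIRED_BY_STATUS = {
    "DONE": ["evidence"],
    "DONE_WITH_CONCERNS": ["evidence", "concerns"],
    "NEEDS_CONTEXT": ["blocking_question"],
    "BLOCKED": ["blocking_reason"],
}


def _is_nonempty_str(value: Any) -> bool:
    return isinstance(value, str) and len(value) >= 1


def validate_envelope(envelope: Any) -> list[str]:
    """Return a list of violations; an empty list means the envelope passes."""
    if not isinstance(envelope, dict):
        return ["envelope must be an object"]
    errors = []
    # additionalProperties: false
    for key in sorted(set(envelope) - ALLOWED_KEYS):
        errors.append(f"unexpected property: {key}")
    status = envelope.get("status")
    status_ok = isinstance(status, str) and status in STATUSES
    if not status_ok:
        errors.append("status must be one of the four enum values")
    if not _is_nonempty_str(envelope.get("summary")):
        errors.append("summary must be a non-empty string")
    for key in ("evidence", "concerns"):
        if key in envelope:
            items = envelope[key]
            if not isinstance(items, list) or not all(_is_nonempty_str(i) for i in items):
                errors.append(f"{key} must be an array of non-empty strings")
    for key in ("blocking_question", "blocking_reason"):
        if key in envelope and not _is_nonempty_str(envelope[key]):
            errors.append(f"{key} must be a non-empty string")
    if "next_action" in envelope and not isinstance(envelope["next_action"], str):
        errors.append("next_action must be a string")
    # Conditional requirements only apply once the status itself is valid.
    for key in REQUIRED_BY_STATUS[status] if status_ok else []:
        if key not in envelope:
            errors.append(f"{key} is required when status = {status}")
    if status == "DONE" and "concerns" in envelope:
        errors.append("concerns must be absent when status = DONE")
    return errors
```

The sketch follows the `allOf` rules as written: `concerns` is rejected only for DONE, even though the property's description says it must be empty for every status other than DONE_WITH_CONCERNS.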
|
|
@@ -64,6 +64,20 @@ stay on plain TDD — the section above.
|
|
|
64
64
|
If step 2 is skipped, the test is not trusted — a test that has never
|
|
65
65
|
failed proves nothing about the code under test.
|
|
66
66
|
|
|
67
|
+
### Iron Law — delete-and-restart over keep-as-reference
|
|
68
|
+
|
|
69
|
+
```
|
|
70
|
+
WHEN UNTESTED CODE EXISTS AND A TEST IS NEEDED — DELETE THE CODE,
|
|
71
|
+
WRITE THE TEST, REIMPLEMENT. NEVER KEEP IT "AS REFERENCE".
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
Reading the existing implementation while writing its test is
|
|
75
|
+
test-after-the-fact with extra steps. The 12-row anti-rationalization
|
|
76
|
+
table that follows expands the most common ways this Iron Law gets
|
|
77
|
+
talked around. It is externalized to
|
|
78
|
+
[`testing-anti-patterns/process-anti-patterns.md`](../testing-anti-patterns/process-anti-patterns.md)
|
|
79
|
+
to keep this skill under the 400-line sunset trigger.
|
|
80
|
+
|
|
67
81
|
## Procedure
|
|
68
82
|
|
|
69
83
|
### 1. Identify the behavior to test
|
|
@@ -140,22 +154,20 @@ Back to step 1 with the next single-sentence behavior.
|
|
|
140
154
|
3. Captured green-run output
|
|
141
155
|
4. Any refactor diff (optional)
|
|
142
156
|
|
|
143
|
-
## Anti-rationalizations
|
|
157
|
+
## Anti-rationalizations
|
|
144
158
|
|
|
145
|
-
|
|
146
|
-
the
|
|
159
|
+
Twelve common rationalizations that fire *before* the test is written —
|
|
160
|
+
plus the delete-and-restart Iron Law — live in
|
|
161
|
+
[`testing-anti-patterns/process-anti-patterns.md`](../testing-anti-patterns/process-anti-patterns.md).
|
|
162
|
+
Read the table when:
|
|
147
163
|
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
| "I already ran it manually" | Manual runs are not repeatable. The next edit breaks it silently. |
|
|
153
|
-
| "Deleting this code I just wrote is wasteful" | Sunk cost. The cheap path is: delete, write the test, reimplement minimally. |
|
|
154
|
-
| "I'll keep the code as reference while I write the test" | You will read it and adapt it. That is test-after-the-fact with extra steps. Delete it. |
|
|
155
|
-
| "I just need to explore the API first" | Spike it on a throwaway branch. Then delete it and restart with TDD. |
|
|
156
|
-
| "The test is too hard to write" | That signals a design problem in the code, not in the test. Listen to it. |
|
|
157
|
-
| "This bug is urgent, no time for a test" | The test **is** the fastest path to a verified fix. Guessing takes longer. |
|
|
164
|
+
- You catch yourself thinking "I'll add the test after" — row 2.
|
|
165
|
+
- You want to keep the code "as reference" while writing the test — row 5.
|
|
166
|
+
- "CI is red, patch first, test later" — row 9.
|
|
167
|
+
- "Follow-up PR will add the test" — row 12.
|
|
158
168
|
|
|
169
|
+
For mock-isolation failure modes (separate concern), see
|
|
170
|
+
[`testing-anti-patterns`](../testing-anti-patterns/SKILL.md).
|
|
159
171
|
|
|
160
172
|
## Examples
|
|
161
173
|
|
|
@@ -8,6 +8,13 @@ source: package
|
|
|
8
8
|
|
|
9
9
|
Tests must verify real behavior, not mock behavior. Mocks isolate; they are not the thing under test. This skill is the **prevention** layer; [`judge-test-coverage`](../judge-test-coverage/SKILL.md) catches what slips through afterwards.
|
|
10
10
|
|
|
11
|
+
For the **process / rationalization** failure modes that fire *before* a
|
|
12
|
+
test is written (the urges to skip TDD, keep code "as reference", patch
|
|
13
|
+
without a regression test), see the sibling reference table in
|
|
14
|
+
[`process-anti-patterns.md`](process-anti-patterns.md). Both layers are
|
|
15
|
+
required — a correctly-mocked test that was written *after* the code is
|
|
16
|
+
still test-after-the-fact.
|
|
17
|
+
|
|
11
18
|
## When to use
|
|
12
19
|
|
|
13
20
|
- About to write a new test that mocks a collaborator.
|
|
@@ -0,0 +1,67 @@
|
|
|
1
|
+
# Testing process anti-patterns — reference table
|
|
2
|
+
|
|
3
|
+
Sibling reference for [`testing-anti-patterns`](SKILL.md) and
|
|
4
|
+
[`test-driven-development`](../test-driven-development/SKILL.md).
|
|
5
|
+
|
|
6
|
+
`testing-anti-patterns/SKILL.md` covers **mock-isolation** failure modes
|
|
7
|
+
(mocking-the-mock, production pollution, partial mocks). This doc covers
|
|
8
|
+
the **process / rationalization** failure modes — the urges that fire
|
|
9
|
+
*before* the test is written and convince you to skip TDD entirely.
|
|
10
|
+
|
|
11
|
+
Both layers are required. A correctly-mocked test that was written *after*
|
|
12
|
+
the code is still test-after-the-fact. A TDD-first test that mocks itself
|
|
13
|
+
is still mocking the mock.
|
|
14
|
+
|
|
15
|
+
## The Iron Law (delete-and-restart)
|
|
16
|
+
|
|
17
|
+
```
|
|
18
|
+
WHEN YOU FIND YOURSELF KEEPING UNTESTED CODE "AS REFERENCE" WHILE WRITING
|
|
19
|
+
A TEST FOR IT — DELETE THE CODE. WRITE THE TEST. THEN REIMPLEMENT.
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
The cheap path is one extra round-trip. The expensive path is a test that
|
|
23
|
+
silently encodes the bug it was supposed to catch.
|
|
24
|
+
|
|
25
|
+
## 12-row anti-rationalization table
|
|
26
|
+
|
|
27
|
+
The urge to skip TDD is strongest on tasks where TDD matters most. Name
|
|
28
|
+
the rationalization, then reject it:
|
|
29
|
+
|
|
30
|
+
| # | Thought | Reality |
|
|
31
|
+
|---|---|---|
|
|
32
|
+
| 1 | "This is too simple to need a test" | Simple code still breaks. A test takes less time than one debug cycle. |
|
|
33
|
+
| 2 | "I'll add the test after the code works" | A test written after code that passed first try has never failed. It does not prove the code is correct. |
|
|
34
|
+
| 3 | "I already ran it manually" | Manual runs are not repeatable. The next edit breaks it silently. |
|
|
35
|
+
| 4 | "Deleting code I just wrote is wasteful" | Sunk cost. The cheap path: delete, write the test, reimplement minimally. |
|
|
36
|
+
| 5 | "I'll keep the code as reference while I write the test" | You will read it and adapt to it. That is test-after-the-fact with extra steps. Delete it. |
|
|
37
|
+
| 6 | "I just need to explore the API first" | Spike on a throwaway branch. Then delete the spike and restart with TDD. |
|
|
38
|
+
| 7 | "The test is too hard to write" | That signals a design problem in the code, not the test. Listen to it — refactor the seam, then test. |
|
|
39
|
+
| 8 | "This bug is urgent, no time for a test" | The test **is** the fastest path to a verified fix. Guessing takes longer and re-occurs. |
|
|
40
|
+
| 9 | "CI is red — patch first, test later" | A red CI is the cheapest moment to write the regression test. The patch without the test invites the same bug back. |
|
|
41
|
+
| 10 | "The test is just proof-of-work for the PR review" | A test that exists to placate review is not a test — it is theater. Either it asserts behavior or delete it. |
|
|
42
|
+
| 11 | "The dependency is too awkward to seam" | The seam discomfort *is* the design feedback. A constructor-injection refactor pays for itself the second time you change the dependency. |
|
|
43
|
+
| 12 | "We'll add the test in a follow-up PR" | Follow-up PRs that add tests to merged code arrive 0% of the time. The test ships with the change or never. |
|
|
44
|
+
|
|
45
|
+
## When to use this doc
|
|
46
|
+
|
|
47
|
+
- Reviewing your own draft before writing a test — read the table, check
|
|
48
|
+
none of the 12 are firing in your head.
|
|
49
|
+
- Reviewing a teammate's PR — if the PR description matches one of the 12
|
|
50
|
+
patterns, surface the row number in the review.
|
|
51
|
+
- Onboarding — pair with [`test-driven-development`](../test-driven-development/SKILL.md)
|
|
52
|
+
to give new devs the *why* behind the discipline.
|
|
53
|
+
|
|
54
|
+
## Cross-references
|
|
55
|
+
|
|
56
|
+
- Mock-specific anti-patterns: [`testing-anti-patterns`](SKILL.md)
|
|
57
|
+
- TDD discipline: [`test-driven-development`](../test-driven-development/SKILL.md)
|
|
58
|
+
- Coverage hygiene on a finished diff: [`judge-test-coverage`](../judge-test-coverage/SKILL.md)
|
|
59
|
+
- Pest conventions: [`pest-testing`](../pest-testing/SKILL.md)
|
|
60
|
+
- Quality tooling: [`quality-tools`](../quality-tools/SKILL.md)
|
|
61
|
+
|
|
62
|
+
## Provenance
|
|
63
|
+
|
|
64
|
+
- Adapted from `obra/superpowers@v5.1.0` `testing/anti-patterns.md`.
|
|
65
|
+
- Council convergence (anthropic/claude-sonnet-4-5 + openai/gpt-4o,
|
|
66
|
+
2026-05-07): both members ADOPT — the catalogue surfaces specific
|
|
67
|
+
rationalization patterns that would otherwise leak past code review.
|
package/CHANGELOG.md
CHANGED
|
@@ -318,6 +318,30 @@ our recommendation order, not its support status.
|
|
|
318
318
|
users" tension without removing any path that an existing user
|
|
319
319
|
might rely on.
|
|
320
320
|
|
|
321
|
+
## [1.32.0](https://github.com/event4u-app/agent-config/compare/1.31.0...1.32.0) (2026-05-09)
|
|
322
|
+
|
|
323
|
+
### Features
|
|
324
|
+
|
|
325
|
+
* **roadmap:** bite-sized task granularity gate for structural roadmaps ([b23683d](https://github.com/event4u-app/agent-config/commit/b23683df15dd43229a25cad33882f6a692d92a97))
|
|
326
|
+
* **skills:** add 3-scan self-review to planning skills ([6784fb8](https://github.com/event4u-app/agent-config/commit/6784fb8ad355ef5b2d7f2cebe5d5e26f114cbe4a))
|
|
327
|
+
* **subagent-orchestration:** status taxonomy + externalized prompts + two-stage mode ([6d846a7](https://github.com/event4u-app/agent-config/commit/6d846a74441196b98cabf8cd1c18ca40db0cec89))
|
|
328
|
+
* **skills:** TDD hardening with externalized anti-pattern catalogue ([db2b1a2](https://github.com/event4u-app/agent-config/commit/db2b1a2a550f10fa51742bef360044b9de1bb7ca))
|
|
329
|
+
|
|
330
|
+
### Bug Fixes
|
|
331
|
+
|
|
332
|
+
* **investigation:** inline council convergence (council files gitignored) ([ff93aa7](https://github.com/event4u-app/agent-config/commit/ff93aa7ee6f6cffff062838a644047470e1d462e))
|
|
333
|
+
* **skills:** inline council convergence in anti-patterns provenance ([7d7e663](https://github.com/event4u-app/agent-config/commit/7d7e663f8e619882fd59465dfbfdb268132eeb8d))
|
|
334
|
+
* **skills:** drop roadmap reference from anti-patterns provenance ([a49c010](https://github.com/event4u-app/agent-config/commit/a49c0107971a5b70a1fdf33ed1a19c91d075980a))
|
|
335
|
+
|
|
336
|
+
### Chores
|
|
337
|
+
|
|
338
|
+
* **ownership:** regenerate matrix after superpowers-harvest landing ([faf4794](https://github.com/event4u-app/agent-config/commit/faf479470db9c83d17e8332dd3498df1e5f4c34b))
|
|
339
|
+
* **index:** regenerate after superpowers-harvest landing ([946f3cc](https://github.com/event4u-app/agent-config/commit/946f3ccad54da3a3898d0aeb4474be0b87e66800))
|
|
340
|
+
* **roadmap:** remove old roadmap path (already archived) ([18f281b](https://github.com/event4u-app/agent-config/commit/18f281bbeb51e93feef40146af3a3b5e5cb916f2))
|
|
341
|
+
* **roadmap:** close superpowers-harvest — Phase 1 LANDED, P1.4b deferred ([a296106](https://github.com/event4u-app/agent-config/commit/a296106c6a846a54c3fe20728203fb3bbae7fffc))
|
|
342
|
+
|
|
343
|
+
Tests: 2560 (+74 since 1.31.0)
|
|
344
|
+
|
|
321
345
|
## [1.31.0](https://github.com/event4u-app/agent-config/compare/1.29.0...1.31.0) (2026-05-09)
|
|
322
346
|
|
|
323
347
|
### Features
|
package/docs/catalog.md
CHANGED
|
@@ -146,7 +146,7 @@ are excluded.
|
|
|
146
146
|
| skill | [`skill-reviewer`](../.agent-src/skills/skill-reviewer/SKILL.md) | | Use when reviewing, auditing, or optimizing skills — validates against the 7 Skill Killers checklist and produces fix recommendations. |
|
|
147
147
|
| skill | [`skill-writing`](../.agent-src/skills/skill-writing/SKILL.md) | | Use when deciding 'should this be a skill or a rule?', creating/improving/reviewing agent skills, SKILL.md frontmatter, or procedure sections — even without saying 'skill-writing'. |
|
|
148
148
|
| skill | [`sql-writing`](../.agent-src/skills/sql-writing/SKILL.md) | | Use when writing raw SQL — MariaDB/MySQL syntax, parameterization, raw migrations, seeders with `DB::statement` — even when the user just pastes a query and asks 'why is this slow' without naming SQL. |
|
|
149
|
-
| skill | [`subagent-orchestration`](../.agent-src/skills/subagent-orchestration/SKILL.md) | | Use when orchestrating implementer/judge subagents —
|
|
149
|
+
| skill | [`subagent-orchestration`](../.agent-src/skills/subagent-orchestration/SKILL.md) | | Use when orchestrating implementer/judge subagents — seven modes (do-and-judge ±two-stage, do-in-steps/parallel/worktrees, do-competitively, judge-with-debate) — models from .agent-settings.yml. |
|
|
150
150
|
| skill | [`systematic-debugging`](../.agent-src/skills/systematic-debugging/SKILL.md) | | Use when hitting a bug, test failure, crash, or unexpected behavior — enforces reproduce → isolate → hypothesize → verify before any fix — even when the user just says 'this is broken' or 'quick fix'. |
|
|
151
151
|
| skill | [`technical-specification`](../.agent-src/skills/technical-specification/SKILL.md) | | Use when the user says "write a spec", "create RFC", "write a PRD", or "document this decision". Writes technical specifications, PRDs, RFCs, and ADRs with clear structure. |
|
|
152
152
|
| skill | [`terraform`](../.agent-src/skills/terraform/SKILL.md) | | Use when writing Terraform — AWS modules, resources, variables, outputs, remote state — even when the user just says 'provision this infra' or 'add an S3 bucket' without naming Terraform. |
|