@event4u/agent-config 1.16.0 → 1.18.0
- package/.agent-src/commands/{agents-audit.md → agents/audit.md} +4 -3
- package/.agent-src/commands/{agents-cleanup.md → agents/cleanup.md} +12 -6
- package/.agent-src/commands/{agents-prepare.md → agents/prepare.md} +4 -3
- package/.agent-src/commands/agents.md +46 -0
- package/.agent-src/commands/{chat-history-checkpoint.md → chat-history/checkpoint.md} +4 -4
- package/.agent-src/commands/{chat-history-clear.md → chat-history/clear.md} +4 -4
- package/.agent-src/commands/{chat-history-resume.md → chat-history/resume.md} +4 -4
- package/.agent-src/commands/chat-history/show.md +107 -0
- package/.agent-src/commands/chat-history.md +33 -89
- package/.agent-src/commands/{commit-in-chunks.md → commit/in-chunks.md} +15 -13
- package/.agent-src/commands/commit.md +22 -2
- package/.agent-src/commands/{context-create.md → context/create.md} +4 -3
- package/.agent-src/commands/{context-refactor.md → context/refactor.md} +4 -3
- package/.agent-src/commands/context.md +44 -0
- package/.agent-src/commands/{copilot-agents-init.md → copilot-agents/init.md} +4 -3
- package/.agent-src/commands/{copilot-agents-optimize.md → copilot-agents/optimize.md} +4 -3
- package/.agent-src/commands/copilot-agents.md +44 -0
- package/.agent-src/commands/council/default.md +221 -0
- package/.agent-src/commands/{council-design.md → council/design.md} +6 -5
- package/.agent-src/commands/{council-optimize.md → council/optimize.md} +7 -6
- package/.agent-src/commands/{council-pr.md → council/pr.md} +6 -5
- package/.agent-src/commands/council.md +47 -212
- package/.agent-src/commands/{create-pr-description.md → create-pr/description-only.md} +4 -2
- package/.agent-src/commands/create-pr.md +26 -5
- package/.agent-src/commands/{feature-dev.md → feature/dev.md} +5 -10
- package/.agent-src/commands/{feature-explore.md → feature/explore.md} +4 -8
- package/.agent-src/commands/{feature-plan.md → feature/plan.md} +4 -8
- package/.agent-src/commands/{feature-refactor.md → feature/refactor.md} +4 -8
- package/.agent-src/commands/{feature-roadmap.md → feature/roadmap.md} +6 -10
- package/.agent-src/commands/feature.md +6 -12
- package/.agent-src/commands/{fix-ci.md → fix/ci.md} +4 -8
- package/.agent-src/commands/{fix-portability.md → fix/portability.md} +4 -8
- package/.agent-src/commands/{fix-pr-bot-comments.md → fix/pr-bots.md} +4 -8
- package/.agent-src/commands/{fix-pr-developer-comments.md → fix/pr-developers.md} +4 -8
- package/.agent-src/commands/{fix-pr-comments.md → fix/pr.md} +7 -11
- package/.agent-src/commands/{fix-references.md → fix/refs.md} +4 -8
- package/.agent-src/commands/{fix-seeder.md → fix/seeder.md} +4 -8
- package/.agent-src/commands/fix.md +7 -13
- package/.agent-src/commands/{do-and-judge.md → judge/on-diff.md} +4 -3
- package/.agent-src/commands/judge/solo.md +90 -0
- package/.agent-src/commands/{do-in-steps.md → judge/steps.md} +4 -3
- package/.agent-src/commands/judge.md +35 -70
- package/.agent-src/commands/{memory-add.md → memory/add.md} +4 -3
- package/.agent-src/commands/{memory-full.md → memory/load.md} +4 -3
- package/.agent-src/commands/{memory-promote.md → memory/promote.md} +4 -3
- package/.agent-src/commands/{propose-memory.md → memory/propose.md} +4 -3
- package/.agent-src/commands/memory.md +48 -0
- package/.agent-src/commands/{module-create.md → module/create.md} +4 -3
- package/.agent-src/commands/{module-explore.md → module/explore.md} +4 -3
- package/.agent-src/commands/module.md +44 -0
- package/.agent-src/commands/{optimize-agents.md → optimize/agents.md} +4 -8
- package/.agent-src/commands/{optimize-augmentignore.md → optimize/augmentignore.md} +4 -9
- package/.agent-src/commands/{optimize-rtk-filters.md → optimize/rtk.md} +4 -8
- package/.agent-src/commands/{optimize-skills.md → optimize/skills.md} +4 -8
- package/.agent-src/commands/optimize.md +4 -10
- package/.agent-src/commands/{override-create.md → override/create.md} +4 -3
- package/.agent-src/commands/{override-manage.md → override/manage.md} +4 -3
- package/.agent-src/commands/override.md +44 -0
- package/.agent-src/commands/{roadmap-create.md → roadmap/create.md} +4 -3
- package/.agent-src/commands/{roadmap-execute.md → roadmap/execute.md} +4 -3
- package/.agent-src/commands/roadmap.md +44 -0
- package/.agent-src/commands/{tests-create.md → tests/create.md} +4 -3
- package/.agent-src/commands/{tests-execute.md → tests/execute.md} +4 -3
- package/.agent-src/commands/tests.md +44 -0
- package/.agent-src/contexts/communication/rules-auto/artifact-engagement-recording-mechanics.md +72 -0
- package/.agent-src/contexts/communication/rules-auto/augment-portability-mechanics.md +79 -0
- package/.agent-src/contexts/communication/rules-auto/augment-source-of-truth-mechanics.md +98 -0
- package/.agent-src/contexts/communication/rules-auto/cli-output-handling-mechanics.md +87 -0
- package/.agent-src/contexts/communication/rules-auto/command-suggestion-policy-mechanics.md +62 -0
- package/.agent-src/contexts/communication/rules-auto/docs-sync-mechanics.md +78 -0
- package/.agent-src/contexts/communication/rules-auto/package-ci-checks-mechanics.md +85 -0
- package/.agent-src/contexts/communication/rules-auto/review-routing-awareness-mechanics.md +65 -0
- package/.agent-src/contexts/communication/rules-auto/roadmap-progress-sync-mechanics.md +78 -0
- package/.agent-src/contexts/communication/rules-auto/skill-quality-mechanics.md +62 -0
- package/.agent-src/contexts/communication/rules-auto/slash-command-routing-policy-mechanics.md +55 -0
- package/.agent-src/contexts/communication/rules-auto/ui-audit-gate-mechanics.md +53 -0
- package/.agent-src/contexts/communication/rules-auto/user-interaction-mechanics.md +77 -0
- package/.agent-src/contexts/judges/no-consolidate-rationale.md +102 -0
- package/.agent-src/contexts/judges/persona-voice-rubric.md +140 -0
- package/.agent-src/rules/artifact-engagement-recording.md +13 -69
- package/.agent-src/rules/ask-when-uncertain.md +27 -42
- package/.agent-src/rules/augment-portability.md +15 -61
- package/.agent-src/rules/augment-source-of-truth.md +27 -93
- package/.agent-src/rules/cli-output-handling.md +10 -76
- package/.agent-src/rules/command-suggestion-policy.md +18 -59
- package/.agent-src/rules/commit-conventions.md +17 -14
- package/.agent-src/rules/context-hygiene.md +6 -0
- package/.agent-src/rules/direct-answers.md +35 -59
- package/.agent-src/rules/docker-commands.md +5 -5
- package/.agent-src/rules/docs-sync.md +15 -69
- package/.agent-src/rules/language-and-tone.md +48 -72
- package/.agent-src/rules/missing-tool-handling.md +28 -22
- package/.agent-src/rules/no-cheap-questions.md +39 -53
- package/.agent-src/rules/no-roadmap-references.md +73 -0
- package/.agent-src/rules/onboarding-gate.md +7 -0
- package/.agent-src/rules/package-ci-checks.md +21 -61
- package/.agent-src/rules/preservation-guard.md +64 -29
- package/.agent-src/rules/review-routing-awareness.md +24 -43
- package/.agent-src/rules/roadmap-progress-sync.md +31 -65
- package/.agent-src/rules/rule-type-governance.md +28 -0
- package/.agent-src/rules/security-sensitive-stop.md +8 -8
- package/.agent-src/rules/skill-quality.md +16 -48
- package/.agent-src/rules/slash-command-routing-policy.md +7 -4
- package/.agent-src/rules/think-before-action.md +52 -42
- package/.agent-src/rules/tool-safety.md +19 -16
- package/.agent-src/rules/ui-audit-gate.md +24 -38
- package/.agent-src/rules/user-interaction.md +13 -68
- package/.agent-src/skills/ai-council/SKILL.md +2 -0
- package/.agent-src/skills/api-testing/SKILL.md +1 -1
- package/.agent-src/skills/check-refs/SKILL.md +59 -40
- package/.agent-src/skills/conventional-commits-writing/SKILL.md +86 -28
- package/.agent-src/skills/copilot-agents-optimization/SKILL.md +5 -5
- package/.agent-src/skills/developer-like-execution/SKILL.md +4 -4
- package/.agent-src/skills/finishing-a-development-branch/SKILL.md +101 -65
- package/.agent-src/skills/flux/SKILL.md +30 -10
- package/.agent-src/skills/github-ci/SKILL.md +2 -2
- package/.agent-src/skills/judge-code-quality/SKILL.md +7 -8
- package/.agent-src/skills/judge-security-auditor/SKILL.md +4 -5
- package/.agent-src/skills/judge-test-coverage/SKILL.md +3 -4
- package/.agent-src/skills/lint-skills/SKILL.md +57 -39
- package/.agent-src/skills/md-language-check/SKILL.md +61 -39
- package/.agent-src/skills/override-management/SKILL.md +5 -5
- package/.agent-src/skills/quality-tools/SKILL.md +2 -2
- package/.agent-src/skills/react-shadcn-ui/SKILL.md +116 -43
- package/.agent-src/skills/readme-reviewer/SKILL.md +30 -29
- package/.agent-src/skills/readme-writing/SKILL.md +78 -53
- package/.agent-src/skills/readme-writing-package/SKILL.md +50 -47
- package/.agent-src/skills/receiving-code-review/SKILL.md +52 -47
- package/.agent-src/skills/refine-prompt/SKILL.md +0 -1
- package/.agent-src/skills/requesting-code-review/SKILL.md +35 -30
- package/.agent-src/skills/security/SKILL.md +7 -2
- package/.agent-src/skills/security-audit/SKILL.md +7 -3
- package/.agent-src/skills/systematic-debugging/SKILL.md +68 -60
- package/.agent-src/skills/test-driven-development/SKILL.md +59 -57
- package/.agent-src/skills/test-performance/SKILL.md +0 -1
- package/.agent-src/skills/traefik/SKILL.md +4 -4
- package/.agent-src/skills/verify-completion-evidence/SKILL.md +28 -26
- package/.agent-src/templates/roadmaps.md +4 -0
- package/.claude-plugin/marketplace.json +22 -11
- package/AGENTS.md +2 -2
- package/CHANGELOG.md +125 -1
- package/README.md +18 -17
- package/docs/architecture.md +4 -6
- package/docs/catalog.md +67 -39
- package/docs/contracts/STABILITY.md +13 -7
- package/docs/contracts/adr-chat-history-split.md +1 -3
- package/docs/contracts/adr-command-suggestion.md +0 -2
- package/docs/contracts/adr-implement-ticket-runtime.md +1 -2
- package/docs/contracts/adr-product-ui-track.md +3 -6
- package/docs/contracts/adr-prompt-driven-execution.md +3 -4
- package/docs/contracts/agent-memory-contract.md +6 -11
- package/docs/contracts/artifact-engagement-flow.md +6 -9
- package/docs/contracts/command-clusters.md +56 -46
- package/docs/contracts/command-suggestion-flow.md +1 -3
- package/docs/contracts/context-paths.md +99 -0
- package/docs/contracts/file-ownership-matrix.json +6722 -0
- package/docs/contracts/file-ownership-matrix.md +134 -0
- package/docs/contracts/implement-ticket-flow.md +6 -9
- package/docs/contracts/linear-ai-rules-inclusion.md +0 -1
- package/docs/contracts/linear-ai-three-layers.md +0 -2
- package/docs/contracts/load-context-budget-model.md +258 -0
- package/docs/contracts/load-context-schema.md +21 -3
- package/docs/contracts/roadmap-complexity-standard.md +137 -0
- package/docs/contracts/rule-interactions.md +0 -1
- package/docs/contracts/rule-priority-hierarchy.md +1 -1
- package/docs/contracts/ui-track-flow.md +7 -17
- package/docs/customization.md +2 -0
- package/docs/getting-started.md +5 -4
- package/docs/guidelines/agent-infra/ask-when-uncertain-demos.md +134 -0
- package/docs/guidelines/agent-infra/asking-and-brevity-examples.md +100 -0
- package/docs/guidelines/agent-infra/direct-answers-demos.md +145 -0
- package/docs/guidelines/agent-infra/verify-before-complete-demos.md +128 -0
- package/package.json +1 -1
- package/scripts/_phase2_shim_helper.py +109 -0
- package/scripts/agent-config +30 -0
- package/scripts/ai_council/one_off_archive/2026-05/README.md +45 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_2a4_acceptance.py +208 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_budget_v2_audit.py +206 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_context_layer_v1_estimate.py +67 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_context_layer_v1_review.py +292 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_followups_review.py +259 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_nondestructive_inline_audit.py +209 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_phase4_dispatch_latency.py +108 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_phase6_trigger_jaccard.py +92 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_phase_2a_budget_rebalance.py +257 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_phase_2a_post_revert.py +197 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_rule_hardening_v1.py +251 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_structural_open_questions.py +232 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_structural_optimization.py +144 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_structural_v3_gaps.py +252 -0
- package/scripts/ai_council/one_off_archive/2026-05/_one_off_structural_v3_review.py +240 -0
- package/scripts/build_rule_trigger_matrix.py +360 -0
- package/scripts/check_always_budget.py +402 -45
- package/scripts/check_cluster_patterns.py +159 -0
- package/scripts/check_command_count_messaging.py +14 -7
- package/scripts/check_context_paths.py +201 -0
- package/scripts/check_no_roadmap_refs.py +155 -0
- package/scripts/check_one_off_location.py +81 -0
- package/scripts/check_phase_coupling.py +148 -0
- package/scripts/check_portability.py +2 -0
- package/scripts/check_references.py +35 -2
- package/scripts/check_safety_floor_untouched.py +125 -0
- package/scripts/command_suggester/loader.py +4 -1
- package/scripts/compress.py +64 -15
- package/scripts/context_hygiene_hook.py +173 -0
- package/scripts/generate_index.py +6 -2
- package/scripts/generate_ownership_matrix.py +323 -0
- package/scripts/hooks/augment-context-hygiene.sh +55 -0
- package/scripts/hooks/augment-onboarding-gate.sh +55 -0
- package/scripts/hooks/augment-roadmap-progress.sh +57 -0
- package/scripts/install.py +105 -45
- package/scripts/lint_examples.py +98 -0
- package/scripts/lint_no_new_atomic_commands.py +12 -11
- package/scripts/lint_roadmap_complexity.py +127 -0
- package/scripts/onboarding_gate_hook.py +137 -0
- package/scripts/requirements-evals.txt +1 -0
- package/scripts/roadmap_progress_hook.py +159 -0
- package/scripts/schemas/command.schema.json +4 -3
- package/scripts/schemas/rule.schema.json +5 -0
- package/scripts/skill_linter.py +1 -0
- package/scripts/sync_agent_settings.py +25 -2
- package/scripts/update_counts.py +7 -0
- package/scripts/ai_council/{_one_off_rebalancing_audit.py → one_off_archive/2026-05/_one_off_rebalancing_audit.py} +0 -0
- package/scripts/ai_council/{_one_off_roundtrip.py → one_off_archive/2026-05/_one_off_roundtrip.py} +0 -0
@@ -11,17 +11,19 @@ source: package
 ## When to use
 
 * About to run `/create-pr` or `/prepare-for-review`
-*
-
+* A feature or bug fix is code-complete and the next step is "get
+eyes on it"
+* A stacked PR is ready and you need the parent branch reviewer to
 context-switch smoothly
-* Asking a human for a quick sanity check on a specific commit or
+* Asking a human for a quick sanity check on a specific commit or
+diff
 
 Do NOT use when:
 
 * You are *processing* review feedback — use [`receiving-code-review`](../receiving-code-review/SKILL.md)
-*
-green tests and a clean diff
-*
+* The branch is not yet code-complete — the review-request gate
+requires green tests and a clean diff
+* The change is documentation-only and has no behavior impact
 
 ## Goal
 
@@ -36,9 +38,9 @@ process. A well-framed review request **halves** review time and
 NEVER REQUEST REVIEW FROM A BRANCH YOU HAVE NOT REVIEWED YOURSELF.
 ```
 
-Self-review is the single cheapest filter.
-reviewer would flag in round one, so the human can
-issues only they can see.
+Self-review is the single cheapest filter. It catches the issues a
+human reviewer would flag in round one, so the human reviewer can
+spend time on the issues only they can see.
 
 ## Procedure
 
@@ -46,18 +48,19 @@ issues only they can see.
 
 Before asking anyone else:
 
-* Read the full diff (`git diff <base>...<head>`), not just files
-remember touching
+* Read the full diff (`git diff <base>...<head>`), not just the files
+you remember touching
 * Check for accidental debug output, dead code, leftover `dd()`,
 `console.log`, commented-out blocks
 * Check for secrets in diff — API keys, connection strings, tokens
 * Check file-system side effects — generated files, lockfile churn,
 IDE configs, `.env` changes
 * Run the linter + tests (see [`verify-before-complete`](../verify-before-complete/SKILL.md))
-*
+* If you find issues → fix them, do **not** ship them and hope the
+reviewer flags them
 
-Use [`review-changes`](../../commands/review-changes.md)
-structured walk-through.
+Use the [`review-changes`](../../commands/review-changes.md) command
+as the structured walk-through.
 
 ### 2. Establish the diff baseline
 
@@ -78,8 +81,8 @@ review 80 unrelated commits.
 
 ### 3. Write the review request context
 
-Any review request must answer four questions.
-reviewer will ask
+Any review request must answer four questions. If any is missing, the
+reviewer will ask — and that round trip is preventable.
 
 | Question | Where it lives |
 |---|---|
@@ -96,22 +99,24 @@ for the title format.
 ### 4. Keep the PR reviewable in size
 
 * Target < 400 lines of real diff (excluding generated / lockfiles)
-*
-so reviewers can handle each in one sitting
-* Flag generated files explicitly in the description so reviewers
-
-
+* If bigger — consider splitting into a stack (refactor PR → feature
+PR) so reviewers can handle each in one sitting
+* Flag generated files explicitly in the description so reviewers
+skip them
+* Never mix a refactor + behavior change in the same PR — reviewers
+cannot isolate the risk
 
 ### 5. Pick the right reviewer set
 
-* **Architectural impact** → code owner for the affected area
-* **Security-sensitive** → a security-reviewer role if the project has
+* **Architectural impact** → the code owner for the affected area
+* **Security-sensitive** → a security-reviewer role if the project has
+one
 * **Bots** → let Copilot / Greptile / Augment run automatically; do not
 gate human review on bot completion
 * **Cross-team change** → each affected team's owner
 
-
-do not override without a reason.
+If the project has a `CODEOWNERS` file, GitHub handles this
+automatically — do not override without a reason.
 
 ### 6. Send and wait — do not nudge early
 
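The size target in section 4 of this hunk ("< 400 lines of real diff") can be checked against the output of `git diff --numstat`, which prints `added<TAB>deleted<TAB>path` per file and `-` for binary files. The following is a rough sketch under that assumption; `real_diff_size` and the `EXCLUDED` patterns are hypothetical and should be adapted to what the project treats as generated.

```python
from fnmatch import fnmatch

# Hypothetical exclusion list: lockfiles and typical generated artifacts.
EXCLUDED = ("*.lock", "package-lock.json", "*.min.js")


def real_diff_size(numstat: str, excluded: tuple[str, ...] = EXCLUDED) -> int:
    """Sum added + deleted lines from numstat text, skipping excluded files."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a numstat row
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary file: no line counts available
        name = path.rsplit("/", 1)[-1]
        if any(fnmatch(name, pattern) for pattern in excluded):
            continue
        total += int(added) + int(deleted)
    return total
```

Comparing the result with 400 gives an early signal that a split into a stack may be worth considering.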
@@ -119,8 +124,8 @@ After the PR is open:
 
 * Respond to questions, not to the implicit "where is my review?"
 schedule
-*
-do not re-open or force-push to bump the PR list
+* If the review is blocking and overdue → a single short nudge is
+appropriate; do not re-open or force-push to bump the PR list
 
 When review comments arrive → switch to
 [`receiving-code-review`](../receiving-code-review/SKILL.md).
@@ -144,9 +149,9 @@ When handing the review request to the reviewer (PR body, Slack, email):
 * A 1000-line PR with "no behavior change" still needs review — the
 reviewer has no way to confirm "no behavior change" without reading
 every line
-* Auto-merge on approval bypasses re-review after later force-pushes
-use deliberately, not by habit
-* A PR description
+* Auto-merge on approval bypasses re-review after later force-pushes
+— use deliberately, not by habit
+* A PR description that says "see the code" is not a description —
 reviewers need the why
 * Requesting review from someone without context (new hire, other
 team) without a longer pairing — they cannot do a deep review cold
@@ -11,8 +11,13 @@ source: package
 Use when implementing authentication, authorization, or any security-sensitive functionality.
 
 Do NOT use when:
-
-
+
+* Validation logic only — route to [`laravel-validation`](../laravel-validation/SKILL.md)
+* Full security audit — route to [`security-audit`](../security-audit/SKILL.md)
+* You need a pre-implementation threat model — route to
+[`threat-modeling`](../threat-modeling/SKILL.md)
+* You need end-to-end authorization analysis — route to
+[`authz-review`](../authz-review/SKILL.md)
 
 ## Procedure: Implement security for a feature
 
@@ -24,9 +24,13 @@ Use this skill when:
 
 Do NOT use when:
 
-
-
-
+* Writing new auth/policy code — route to [`security`](../security/SKILL.md)
+* Hunting for functional bugs — route to [`bug-analyzer`](../bug-analyzer/SKILL.md) (proactive mode)
+* Investigating performance — route to [`performance-analysis`](../performance-analysis/SKILL.md)
+* You need a pre-implementation threat model for a new feature — route to
+[`threat-modeling`](../threat-modeling/SKILL.md)
+* You need end-to-end authorization analysis for one route/action — route to
+[`authz-review`](../authz-review/SKILL.md)
 
 ## Procedure: Security audit
 
@@ -8,8 +8,8 @@ source: package
 
 ## When to use
 
-*
-*
+* A test fails and the failure is not self-explanatory
+* A bug is reported (Jira, Sentry, user message) and the root cause is not obvious
 * Production or staging shows unexpected behavior
 * Code behaves differently than the developer expected
 * A previous fix did not resolve the issue or introduced a new one
@@ -17,14 +17,19 @@ source: package
 
 Do NOT use when:
 
-*
+* The failure message already names the fix (typo, missing import, obvious
+off-by-one) — fix it and move on
 * Pure style / formatting / lint issues
 * Documentation-only questions
+* You need a static trace of a specific data element — route to
+[`data-flow-mapper`](../data-flow-mapper/SKILL.md)
+* You need to enumerate what a planned change will touch — route to
+[`blast-radius-analyzer`](../blast-radius-analyzer/SKILL.md)
 
 ## Goal
 
-Find the **root cause** before changing any code. A symptom fix
-over an unknown cause is a regression waiting to happen.
+Find the **root cause** before changing any code. A symptom fix that
+papers over an unknown cause is a regression waiting to happen.
 
 ## The Iron Law
 
@@ -37,30 +42,31 @@ diff, a reproduced failure — those are evidence.
 
 ## Procedure
 
-Complete each phase before the next. Skipping ahead is the
-biggest cause of wasted debug time.
+Complete each phase before starting the next. Skipping ahead is the
+single biggest cause of wasted debug time.
 
 ### Phase 1 — Reproduce
 
-Goal: make the failure happen on demand, smallest possible setup.
+Goal: make the failure happen on demand, with the smallest possible setup.
 
-1. Read the error message, stack trace, and logs **in full**. Note
-file, line, and the chain of calls above it.
-2. Identify the minimum input, state, or
-failure.
+1. Read the error message, stack trace, and logs **in full**. Note the
+exact file, line, and the chain of calls above it.
+2. Identify the minimum input, state, or sequence of actions that
+triggers the failure. If it is intermittent — gather more data before
+guessing.
 3. Capture the exact reproduction as a command or a test. Prefer a
 failing test (see [`test-driven-development`](../test-driven-development/SKILL.md))
-— turns Phase 4 into a verified fix.
+— it turns Phase 4 into a verified fix.
 
-
-re-run, collect more evidence.
+If you cannot reproduce, you do not yet understand the bug. Stop. Add
+logging, re-run, collect more evidence.
 
 ### Phase 2 — Isolate
 
 Goal: locate the failure in a single component, layer, or call site.
 
-1. Bisect the surface area.
-off/skip/mock adjacent features to narrow the window.
+1. Bisect the surface area. What is the smallest code path that still
+fails? Turn off/skip/mock adjacent features to narrow the window.
 2. For multi-component systems (frontend → API → service → DB, or
 CI → build → deploy), log at **each boundary**:
 
@@ -68,7 +74,8 @@ Goal: locate the failure in a single component, layer, or call site.
 * What leaves the component
 * What config/env the component actually sees
 
-
+The goal is not to fix — it is to answer "which boundary is the one
+where expected ≠ actual?".
 3. Check recent changes: `git log`, `git blame` on the failing line,
 recent dependency updates, config edits, infra changes.
 4. **Consult memory for prior matches.** Via
@@ -81,13 +88,13 @@ Goal: locate the failure in a single component, layer, or call site.
 limit=3,
 )
 ```
-A matching `incident-learning` may already name the root cause,
-and regression test. A matching `historical-pattern`
-hypothesis space before Phase 3. Cite matching `id`s in
-trail.
-5. Trace backwards from the symptom. `null` arrives at line 42 —
-does the value originate? Walk up the call stack until the
-found. Fix at origin, not at line 42.
+A matching `incident-learning` may already name the root cause, the
+fix, and the regression test. A matching `historical-pattern`
+narrows the hypothesis space before Phase 3. Cite matching `id`s in
+the Phase 1–4 evidence trail.
+5. Trace backwards from the symptom. If `null` arrives at line 42 —
+where does the value originate? Walk up the call stack until the
+origin is found. Fix at origin, not at line 42.
 
 ### Phase 3 — Hypothesize
 
@@ -95,35 +102,35 @@ Goal: one testable hypothesis at a time, rejected or confirmed by evidence.
 
 1. State the hypothesis in one sentence: *"The failure happens because
 X, which I can confirm by observing Y."*
-2. Design the smallest possible experiment that confirms or
-hypothesis. One variable at a time.
+2. Design the smallest possible experiment that either confirms or
+rejects the hypothesis. One variable at a time.
 3. Run it. Read the output.
-4.
+4. If confirmed → Phase 4. If rejected → back to Phase 2 with the new
 information, then form a new hypothesis.
 
-
-well enough yet, or the architecture
+If three hypotheses in a row fail, stop. You do not understand the
+system well enough yet, or the architecture is the problem itself — see
 "Three-strike rule" below.
 
 ### Phase 4 — Verify the fix
 
 Goal: the fix resolves the root cause, not just the observed symptom.
 
-1. Write or update a failing test
-done in Phase 1).
+1. Write or update a failing test that reproduces the bug (if not
+already done in Phase 1).
 2. Apply a single, minimal fix targeting the root cause. No bundled
 refactors, no "while I'm here".
-3. Re-run the reproduction — failure gone.
-4. Re-run the surrounding test suite — nothing adjacent turned red.
-5. Read output carefully — no new warnings, deprecations, or
-retries
+3. Re-run the reproduction — the failure is gone.
+4. Re-run the surrounding test suite — nothing adjacent has turned red.
+5. Read the output carefully — no new warnings, deprecations, or
+silent retries that would mask the same bug recurring.
 
-
-Phase 2, treat the failure as new evidence.
+If the fix does not work, **do not** stack a second fix on top. Go back
+to Phase 2, treat the failure as new evidence.
 
 ## Three-strike rule
 
-
+If you have tried **three** fixes and the bug is still present:
 
 * Stop attempting fixes.
 * Re-read phases 1–3 — something about the root cause is wrong.
@@ -150,7 +157,7 @@ right line beats five minutes of IDE breakpoints.
 ## Condition-based waiting (intermittent bugs)
 
 Intermittent tests and race conditions usually stem from waiting on
-time instead of a condition. Replace `sleep(100)` or
+time instead of on a condition. Replace `sleep(100)` or
 `setTimeout(r, 100)` with an explicit wait-for:
 
 ```ts
@@ -172,11 +179,12 @@ async function waitFor<T>(
 ```
 
 Only use an arbitrary timeout when the timing itself is the contract
-(debounce, throttle) — add a comment explaining **why** the exact
+(debounce, throttle) — and add a comment explaining **why** the exact
+value.
 
 ## Output format
 
-When reporting debug findings:
+When reporting debug findings to the user:
 
 1. **Symptom** — what was observed (one sentence + failure message)
 2. **Reproduction** — the command or test that triggers it
|
|
@@ -189,27 +197,27 @@ When reporting debug findings:
|
|
|
189
197
|
|
|
190
198
|
* Reading half a stack trace and jumping to a fix — the actual cause is
|
|
191
199
|
usually two or three frames above the one you read.
|
|
192
|
-
* "It works on my machine" — different env than the
|
|
193
|
-
Reproduce with exact conditions from the report.
|
|
194
|
-
* Adding a retry or sleep to mask an intermittent failure — hides
|
|
195
|
-
race condition, does not fix it. Use condition-based waiting.
|
|
196
|
-
* Fixing the first line that throws when the bad value came from
|
|
197
|
-
call chain. Trace backwards to the origin.
|
|
200
|
+
* "It works on my machine" — you are running a different env than the
|
|
201
|
+
bug report. Reproduce with the exact conditions from the report.
|
|
202
|
+
* Adding a retry or sleep to mask an intermittent failure — this hides
|
|
203
|
+
the race condition; it does not fix it. Use condition-based waiting.
|
|
204
|
+
* Fixing the first line that throws, when the bad value came from
|
|
205
|
+
somewhere up the call chain. Trace backwards to the origin.
|
|
198
206
|
* "The fix works, the test is just flaky" — flaky tests are bugs in the
|
|
199
207
|
test or the code. Diagnose them, do not retry-until-green.
|
|
200
|
-
* Turning a failing assertion into a softer one ("maybe 2 or 3
|
|
201
|
-
accept both") to make it pass.
|
|
202
|
-
* Bundling a bug fix with a refactor — test goes red again
|
|
203
|
-
which change broke it.
|
|
208
|
+
* Turning a failing assertion into a softer one ("maybe it's 2 or 3
|
|
209
|
+
retries, let's accept both") to make it pass.
|
|
210
|
+
* Bundling a bug fix with a refactor — if the test goes red again you
|
|
211
|
+
cannot tell which change broke it.
|
|
204
212
|
|
|
205
213
|
## Red flags — STOP and restart from Phase 1
|
|
206
214
|
|
|
207
215
|
* "Let me just try X and see if it works"
|
|
208
216
|
* "I don't fully understand it, but this probably fixes it"
|
|
209
217
|
* Proposing a fix without having reproduced the bug
|
|
210
|
-
* Bundling multiple changes in one attempt
|
|
218
|
+
* Bundling multiple changes in one attempt ("fixing this and refactoring that")
|
|
211
219
|
* "It's probably a race condition, let me add a sleep"
|
|
212
|
-
* A green test run after changes without having first seen it red
|
|
220
|
+
* A green test run after changes, without having first seen it red
|
|
213
221
|
* "This looks similar to bug X, so it's the same fix"
|
|
214
222
|
* Suppressing a log, warning, or exception instead of tracing its source
|
|
215
223
|
|
|
@@ -219,7 +227,7 @@ When reporting debug findings:
|
|
|
219
227
|
* Do NOT change two things at once in a single experiment
|
|
220
228
|
* Do NOT silence a warning, failing test, or noisy log as a "fix"
|
|
221
229
|
* Do NOT mark a bug as fixed without a regression test
|
|
222
|
-
* Do NOT attempt fix #4 after three failed fixes — surface the pattern
|
|
230
|
+
* Do NOT attempt fix #4 after three failed fixes — surface the pattern instead
|
|
223
231
|
|
|
224
232
|
## When to hand over to another skill
|
|
225
233
|
|
|
@@ -234,11 +242,11 @@ When reporting debug findings:
|
|
|
234
242
|
|
|
235
243
|
Before declaring a bug fixed:
|
|
236
244
|
|
|
237
|
-
* [ ]
|
|
238
|
-
* [ ]
|
|
245
|
+
* [ ] The failure was reproduced before any code changed
|
|
246
|
+
* [ ] The root cause is named explicitly, not "probably"
|
|
239
247
|
* [ ] Evidence (log, trace, diff) supports the named root cause
|
|
240
|
-
* [ ]
|
|
241
|
-
* [ ]
|
|
242
|
-
* [ ]
|
|
248
|
+
* [ ] A failing test reproducing the bug was added or updated
|
|
249
|
+
* [ ] The fix is minimal and targets the root cause, not the symptom
|
|
250
|
+
* [ ] The regression test now passes
|
|
243
251
|
* [ ] Adjacent tests still pass
|
|
244
252
|
* [ ] No warning or suppressed output hides a recurrence
|
|
@@ -9,23 +9,23 @@ source: package
|
|
|
9
9
|
## When to use
|
|
10
10
|
|
|
11
11
|
* Adding a new function, method, or behavior
|
|
12
|
-
* Fixing a bug (bug needs a regression test before the fix)
|
|
12
|
+
* Fixing a bug (the bug needs a regression test before the fix)
|
|
13
13
|
* Refactoring a unit whose current behavior is unclear
|
|
14
14
|
* Any task where expected behavior can be expressed as an assertion
|
|
15
15
|
|
|
16
16
|
Do NOT use when:
|
|
17
17
|
|
|
18
|
-
*
|
|
19
|
-
*
|
|
20
|
-
*
|
|
18
|
+
* Writing throwaway prototype or spike code explicitly marked as exploration
|
|
19
|
+
* Generating boilerplate (migrations, config files, scaffolding)
|
|
20
|
+
* Editing pure documentation (`.md`, `AGENTS.md`, README)
|
|
21
21
|
* Working inside this `agent-config` package on skill/rule markdown
|
|
22
22
|
|
|
23
23
|
## Goal
|
|
24
24
|
|
|
25
|
-
* Drive implementation from a verified-failing test, not the agent's
|
|
26
|
-
belief that code "should work".
|
|
27
|
-
* Catch edge cases **before** production bugs.
|
|
28
|
-
* Leave every change with a regression test
|
|
25
|
+
* Drive implementation from a verified-failing test, not from the agent's
|
|
26
|
+
belief that the code "should work".
|
|
27
|
+
* Catch edge cases **before** they become production bugs.
|
|
28
|
+
* Leave every change with a regression test that runs in CI.
|
|
29
29
|
|
|
30
30
|
## The core discipline
|
|
31
31
|
|
|
@@ -38,7 +38,7 @@ Do NOT use when:
|
|
|
38
38
|
```
|
|
39
39
|
|
|
40
40
|
If step 2 is skipped, the test is not trusted — a test that has never
|
|
41
|
-
failed proves nothing.
|
|
41
|
+
failed proves nothing about the code under test.
|
|
42
42
|
|
|
43
43
|
## Procedure
|
|
44
44
|
|
|
@@ -46,21 +46,21 @@ failed proves nothing.
|
|
|
46
46
|
|
|
47
47
|
State in one sentence: *"When X happens, the system should do Y."*
|
|
48
48
|
|
|
49
|
-
|
|
50
|
-
one sentence
|
|
49
|
+
If you cannot state it in one sentence, the scope is too big — split into
|
|
50
|
+
multiple tests, each covering one sentence.
|
|
51
51
|
|
|
52
52
|
### 2. Write the failing test first
|
|
53
53
|
|
|
54
|
-
|
|
54
|
+
Write the smallest test that expresses the sentence from step 1.
|
|
55
55
|
|
|
56
|
-
* One assertion per behavior (multiple assertions OK only when
|
|
57
|
-
the same single behavior).
|
|
56
|
+
* One assertion per behavior (multiple assertions are OK only when they
|
|
57
|
+
describe the same single behavior).
|
|
58
58
|
* Real code paths, not mocks — mock only at I/O boundaries (HTTP, DB, time).
|
|
59
|
-
*
|
|
59
|
+
* Use a descriptive name: `it_rejects_empty_email`, not `test_email_1`.
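The bullets above can be sketched without any test framework. This is a minimal illustration under stated assumptions: `isExpired`, the `Clock` type, and the test function name are hypothetical, and a thrown `Error` stands in for a framework assertion.

```ts
// Hypothetical unit under test. Time is its only I/O boundary, so the clock
// is injected; the function itself is never mocked.
type Clock = () => number;

function isExpired(expiresAtMs: number, now: Clock = Date.now): boolean {
  return now() >= expiresAtMs;
}

// Descriptive name: it states the behavior, not a sequence number.
// One behavior, one assertion.
function it_treats_token_as_expired_at_the_exact_expiry_instant(): void {
  const fixedClock: Clock = () => 1_000; // fake only the clock
  const result = isExpired(1_000, fixedClock); // real code path
  if (result !== true) {
    throw new Error("expected token to be expired at its expiry instant");
  }
}

it_treats_token_as_expired_at_the_exact_expiry_instant();
```

Injecting the clock keeps the test deterministic while the production default (`Date.now`) stays untouched.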
|
|
60
60
|
|
|
61
61
|
### 3. Run the test and watch it fail
|
|
62
62
|
|
|
63
|
-
Execute the single test (targeted, not full suite):
|
|
63
|
+
Execute the single test (targeted, not the full suite):
|
|
64
64
|
|
|
65
65
|
```bash
|
|
66
66
|
# PHP/Pest
|
|
@@ -72,26 +72,27 @@ npx vitest run --testNamePattern "rejects empty email"
|
|
|
72
72
|
|
|
73
73
|
Required observations **before proceeding**:
|
|
74
74
|
|
|
75
|
-
*
|
|
76
|
-
*
|
|
77
|
-
*
|
|
78
|
-
do not start writing production code.
|
|
75
|
+
* The test **fails** (not errors).
|
|
76
|
+
* The failure message matches what you expected (missing behavior, not typo).
|
|
77
|
+
* If the test passes immediately → it does not test what you think. Fix
|
|
78
|
+
the test; do not start writing production code.
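The difference between a test that fails and one that errors can be made concrete. The sketch below is a framework-free illustration: `assertionFailure`, `expectEqual`, and `classifyRun` are hypothetical helpers, not the API of Pest or Vitest. It treats only an assertion failure as a trustworthy RED.

```ts
// Hypothetical stand-in for a framework's assertion error, tagged by name.
function assertionFailure(message: string): Error {
  const err = new Error(message);
  err.name = "AssertionFailure";
  return err;
}

function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw assertionFailure(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

type RunResult = "passed" | "failed" | "errored";

// Classifies one run of a test body.
function classifyRun(test: () => void): RunResult {
  try {
    test();
    return "passed"; // passing immediately means the test needs fixing first
  } catch (e) {
    const isAssertion = e instanceof Error && e.name === "AssertionFailure";
    return isAssertion ? "failed" : "errored";
  }
}

// Missing behavior: the assertion itself reports the gap. A valid RED.
const missingBehavior = classifyRun(() => expectEqual("".length > 0, true));

// Typo or broken setup: the run never reaches the assertion. Not a valid RED.
const brokenSetup = classifyRun(() => {
  throw new ReferenceError("validateEmial is not defined");
});
```

Here `missingBehavior` is classified as `"failed"` and `brokenSetup` as `"errored"`; only the first is grounds for moving on to production code.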
|
|
79
79
|
|
|
80
80
|
### 4. Write minimum code to pass
|
|
81
81
|
|
|
82
|
-
**
|
|
82
|
+
Add **just enough** production code to make the test green. No extra
|
|
83
83
|
features, no unrelated refactoring, no "while I'm here" cleanups.
|
|
84
84
|
|
|
85
|
-
|
|
86
|
-
test
|
|
85
|
+
If you feel the urge to add a parameter, edge case, or helper not covered
|
|
86
|
+
by the current test — stop. That belongs in the next RED step, not this
|
|
87
|
+
GREEN step.
|
|
87
88
|
|
|
88
89
|
### 5. Run again and watch it pass
|
|
89
90
|
|
|
90
91
|
Re-run the same targeted command. Required:
|
|
91
92
|
|
|
92
|
-
*
|
|
93
|
-
* No previously green tests turned red.
|
|
94
|
-
*
|
|
93
|
+
* The new test passes.
|
|
94
|
+
* No previously green tests have turned red.
|
|
95
|
+
* Test output is clean (no new warnings, deprecations, or noise).
|
|
95
96
|
|
|
96
97
|
### 6. Refactor (only if green)
|
|
97
98
|
|
|
@@ -101,8 +102,8 @@ With all tests green, you may:
|
|
|
101
102
|
* Extract duplication into helpers
|
|
102
103
|
* Tighten types
|
|
103
104
|
|
|
104
|
-
Do **not** add new behavior during refactor — needs its own failing
|
|
105
|
-
first. Re-run tests after refactor to confirm still-green.
|
|
105
|
+
Do **not** add new behavior during refactor — that needs its own failing
|
|
106
|
+
test first. Re-run the tests after the refactor to confirm they are still green.
|
|
106
107
|
|
|
107
108
|
### 7. Repeat for the next behavior
|
|
108
109
|
|
|
@@ -110,25 +111,25 @@ Back to step 1 with the next single-sentence behavior.
|
|
|
110
111
|
|
|
111
112
|
## Output format
|
|
112
113
|
|
|
113
|
-
1.
|
|
114
|
-
2.
|
|
114
|
+
1. The failing test (file + test name) with captured failure output
|
|
115
|
+
2. The minimum-code diff that makes it pass
|
|
115
116
|
3. Captured green-run output
|
|
116
|
-
4.
|
|
117
|
+
4. Any refactor diff (optional)
|
|
117
118
|
|
|
118
119
|
## Anti-rationalizations table
|
|
119
120
|
|
|
120
|
-
The urge to skip TDD is strongest
|
|
121
|
-
rationalization and reject it:
|
|
121
|
+
The urge to skip TDD is strongest on tasks where TDD matters most. Name
|
|
122
|
+
the rationalization and reject it:
|
|
122
123
|
|
|
123
124
|
| Thought | Reality |
|
|
124
125
|
|---|---|
|
|
125
|
-
| "
|
|
126
|
-
| "I'll add the test after the code works" | A test written after code passes on first run — never failed.
|
|
127
|
-
| "I already ran it manually" | Manual runs not repeatable.
|
|
128
|
-
| "Deleting code I just wrote is wasteful" | Sunk cost.
|
|
129
|
-
| "I'll keep the code as reference while I write the test" | You will adapt it. That is test-after-the-fact with extra steps. Delete it. |
|
|
130
|
-
| "I just need to explore the API first" | Spike on a throwaway branch. Then delete and restart with TDD. |
|
|
131
|
-
| "The test is too hard to write" |
|
|
126
|
+
| "This is too simple to need a test" | Simple code still breaks. A test takes less time than one debug cycle. |
|
|
127
|
+
| "I'll add the test after the code works" | A test written after code passes on the first run — it has never failed. It does not prove the code is correct. |
|
|
128
|
+
| "I already ran it manually" | Manual runs are not repeatable. The next edit breaks it silently. |
|
|
129
|
+
| "Deleting this code I just wrote is wasteful" | Sunk cost. The cheap path is: delete, write the test, reimplement minimally. |
|
|
130
|
+
| "I'll keep the code as reference while I write the test" | You will read it and adapt it. That is test-after-the-fact with extra steps. Delete it. |
|
|
131
|
+
| "I just need to explore the API first" | Spike it on a throwaway branch. Then delete it and restart with TDD. |
|
|
132
|
+
| "The test is too hard to write" | That signals a design problem in the code, not in the test. Listen to it. |
|
|
132
133
|
| "This bug is urgent, no time for a test" | The test **is** the fastest path to a verified fix. Guessing takes longer. |
|
|
133
134
|
|
|
134
135
|
|
|
@@ -162,7 +163,7 @@ final class EmailValidator
|
|
|
162
163
|
}
|
|
163
164
|
```
|
|
164
165
|
|
|
165
|
-
Run filter again → passes. No additional rules (format, MX, length)
|
|
166
|
+
Run the filter again → passes. No additional rules (format, MX, length)
|
|
166
167
|
until a next failing test drives them.
|
|
167
168
|
|
|
168
169
|
### JS / Vitest
|
|
@@ -184,7 +185,7 @@ it('retries a failing operation up to 3 times', async () => {
|
|
|
184
185
|
```
|
|
185
186
|
|
|
186
187
|
Run: `npx vitest run --testNamePattern "retries a failing"` → fails
|
|
187
|
-
(`retry` undefined).
|
|
188
|
+
(`retry` is undefined).
|
|
188
189
|
|
|
189
190
|
```ts
|
|
190
191
|
// src/retry.ts — GREEN (minimum)
|
|
@@ -197,21 +198,22 @@ export async function retry<T>(op: () => Promise<T>): Promise<T> {
|
|
|
197
198
|
}
|
|
198
199
|
```
|
|
199
200
|
|
|
200
|
-
Run again → passes. Configurable attempt count, backoff, jitter all
|
|
201
|
+
Run again → passes. Configurable attempt count, backoff, and jitter all
|
|
201
202
|
wait for their own failing tests.
|
|
202
203
|
|
|
203
204
|
## Gotchas
|
|
204
205
|
|
|
205
206
|
* Running the full suite instead of a filtered test hides the RED→GREEN
|
|
206
207
|
signal in noise. Always target first.
|
|
207
|
-
* A test that passes on the very first run is not TDD — written
|
|
208
|
-
|
|
208
|
+
* A test that passes on the very first run is not TDD — it was written
|
|
209
|
+
against code that already exists.
|
|
209
210
|
* `expect()` with three or four assertions on unrelated fields describes
|
|
210
211
|
multiple behaviors. Split them.
|
|
211
|
-
* Snapshot tests invert the discipline — they generate the expected
|
|
212
|
-
|
|
213
|
-
contract (CLI, SQL strings).
|
|
214
|
-
* Mocking the thing under test (instead of its I/O) tests the mock
|
|
212
|
+
* Snapshot tests invert the discipline — they generate the expected value
|
|
213
|
+
from the code. Only use snapshots where human-readable output is the
|
|
214
|
+
contract (CLI output, SQL strings).
|
|
215
|
+
* Mocking the thing under test (instead of its I/O) tests the mock, not
|
|
216
|
+
the code.
|
|
215
217
|
|
|
216
218
|
## Do NOT
|
|
217
219
|
|
|
@@ -219,22 +221,22 @@ wait for their own failing tests.
|
|
|
219
221
|
and has been observed to fail
|
|
220
222
|
* Do NOT accept a test that never failed as evidence the code works
|
|
221
223
|
* Do NOT bundle refactors into the GREEN step
|
|
222
|
-
* Do NOT silence a flaky test — diagnose or delete it
|
|
223
|
-
* Do NOT skip the targeted RED-run because "I know it fails"
|
|
224
|
+
* Do NOT silence a flaky test — diagnose it, or delete it
|
|
225
|
+
* Do NOT skip the targeted RED-run because "I just wrote it, I know it fails"
|
|
224
226
|
|
|
225
227
|
## Anti-patterns
|
|
226
228
|
|
|
227
229
|
* `it('works')` — no behavior described
|
|
228
230
|
* One test covering "and/and/and" — split per behavior
|
|
229
|
-
* Test
|
|
230
|
-
* Test
|
|
231
|
+
* Test that reaches into private state instead of testing observable behavior
|
|
232
|
+
* Test that duplicates the production code's algorithm (tautology)
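The tautology anti-pattern in the last bullet is easy to miss, so here is a small hedged sketch. `applyDiscount` and its numbers are hypothetical and chosen only for illustration.

```ts
// Hypothetical production function: percentage discount on a price in cents.
function applyDiscount(priceCents: number, percent: number): number {
  return Math.round(priceCents * (1 - percent / 100));
}

// Tautological: the expected value re-runs the same formula, so a wrong
// formula would be wrong on both sides and the comparison would still hold.
const tautologicalExpected = Math.round(1999 * (1 - 15 / 100));
const tautologyPasses = applyDiscount(1999, 15) === tautologicalExpected;

// Behavioral: the expected value is a literal worked out by hand.
// 1999 * 0.85 = 1699.15, which rounds to 1699.
const handComputedExpected = 1699;

if (applyDiscount(1999, 15) !== handComputedExpected) {
  throw new Error("expected 15% off 1999 cents to be 1699 cents");
}
```

`tautologyPasses` is true by construction and says nothing about correctness; the literal `1699` is what would catch a mistake in the formula.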
|
|
231
233
|
|
|
232
234
|
## When to hand over to another skill
|
|
233
235
|
|
|
234
236
|
* Quality tools, PHPStan, ECS, Rector → [`quality-tools`](../quality-tools/SKILL.md)
|
|
235
237
|
* Full Pest conventions, Laravel testing helpers → [`pest-testing`](../pest-testing/SKILL.md)
|
|
236
238
|
* Running tests inside Docker → [`tests-execute`](../tests-execute/SKILL.md)
|
|
237
|
-
* Investigating why a test
|
|
239
|
+
* Investigating why a test is failing for non-obvious reasons →
|
|
238
240
|
[`systematic-debugging`](../systematic-debugging/SKILL.md)
|
|
239
241
|
|
|
240
242
|
## Validation checklist
|
|
@@ -243,10 +245,10 @@ Before marking TDD work complete:
|
|
|
243
245
|
|
|
244
246
|
* [ ] Every new behavior has a test
|
|
245
247
|
* [ ] Each test was observed to fail first, with a matching failure message
|
|
246
|
-
* [ ]
|
|
248
|
+
* [ ] The minimum code was written to turn each RED into GREEN
|
|
247
249
|
* [ ] All targeted tests pass
|
|
248
|
-
* [ ] No adjacent test turned red
|
|
249
|
-
* [ ]
|
|
250
|
+
* [ ] No adjacent test has turned red
|
|
251
|
+
* [ ] Test output is clean (no new warnings or deprecations)
|
|
250
252
|
|
|
251
253
|
See also [`developer-like-execution`](../developer-like-execution/SKILL.md)
|
|
252
254
|
for the broader think → analyze → verify loop this skill plugs into.
|