@techwavedev/agi-agent-kit 1.2.7 → 1.2.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,235 @@
1
+ ---
2
+ name: test-driven-development
3
+ description: RED-GREEN-REFACTOR enforcement. MANDATORY for new features, bug fixes, and behavior changes. Write the test first, watch it fail, then implement.
4
+ version: 1.0.0
5
+ ---
6
+
7
+ # Test-Driven Development (TDD)
8
+
9
+ > Adapted from obra/superpowers — platform-agnostic commands.
10
+
11
+ ## Overview
12
+
13
+ Write the test first. Watch it fail. Write minimal code to pass.
14
+
15
+ **Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
16
+
17
+ ---
18
+
19
+ ## The Iron Law
20
+
21
+ ```
22
+ NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
23
+ ```
24
+
25
+ Write code before the test? **Delete it.** Start over.
26
+
27
+ **No exceptions:**
28
+
29
+ - Don't keep it as "reference"
30
+ - Don't "adapt" it while writing tests
31
+ - Don't look at it
32
+ - Delete means delete
33
+
34
+ ---
35
+
36
+ ## When to Use
37
+
38
+ **Always:**
39
+
40
+ - New features
41
+ - Bug fixes
42
+ - Refactoring
43
+ - Behavior changes
44
+
45
+ **Exceptions (ask the user):**
46
+
47
+ - Throwaway prototypes
48
+ - Generated code
49
+ - Configuration-only changes
50
+
51
+ Thinking "skip TDD just this once"? Stop. That's rationalization.
52
+
53
+ ---
54
+
55
+ ## Red-Green-Refactor Cycle
56
+
57
+ ### 🔴 RED — Write Failing Test
58
+
59
+ Write one minimal test showing what **should** happen.
60
+
61
+ **Requirements:**
62
+
63
+ - One behavior per test
64
+ - Clear, descriptive name
65
+ - Real code (no mocks unless unavoidable)
66
+
67
+ ```
68
+ ✅ test('rejects empty email with error message')
69
+ ❌ test('test1') / test('validates email and domain and whitespace')
70
+ ```
71
+
72
+ ### 🔴 Verify RED — Watch It Fail
73
+
74
+ **MANDATORY. Never skip.**
75
+
76
+ Run the test. Confirm:
77
+
78
+ - Test **fails** (not errors — compilation errors are different from test failures)
79
+ - Failure message is expected (feature missing, not typos)
80
+
81
+ **Test passes immediately?** You're testing existing behavior. Fix the test.
82
+
83
+ ### 🟢 GREEN — Minimal Code
84
+
85
+ Write the **simplest** code to pass the test.
86
+
87
+ - Don't add features beyond what the test requires
88
+ - Don't refactor other code
89
+ - Don't "improve" beyond the test
90
+ - YAGNI ruthlessly
91
+
92
+ ### 🟢 Verify GREEN — Watch It Pass
93
+
94
+ **MANDATORY.**
95
+
96
+ Run the test. Confirm:
97
+
98
+ - New test passes
99
+ - All other tests still pass
100
+ - Output pristine (no errors, warnings)
101
+
102
+ **Test fails?** Fix the code, not the test.
103
+
104
+ ### 🔵 REFACTOR — Clean Up
105
+
106
+ After green only:
107
+
108
+ - Remove duplication
109
+ - Improve names
110
+ - Extract helpers
111
+
112
+ **Keep tests green during refactoring.** Don't add behavior.
113
+
114
+ ### 🔁 Repeat
115
+
116
+ Next failing test for next behavior.
117
+
118
+ ---
119
+
120
+ ## Good Tests vs Bad Tests
121
+
122
+ | Quality | Good | Bad |
123
+ | ---------------- | ---------------------------------- | --------------------------------------------------- |
124
+ | **Minimal** | One thing. "and" in name? Split it | `test('validates email and domain and whitespace')` |
125
+ | **Clear** | Name describes behavior | `test('test1')` |
126
+ | **Real** | Tests actual code | Tests mock behavior |
127
+ | **Shows intent** | Demonstrates desired API | Obscures what code should do |
128
+
129
+ ---
130
+
131
+ ## Why Order Matters
132
+
133
+ | Rationalization | Reality |
134
+ | -------------------------------------- | ----------------------------------------------------------------------- |
135
+ | "I'll write tests after to verify" | Tests written after pass immediately — proves nothing |
136
+ | "Already manually tested" | Manual = ad-hoc, no record, can't re-run |
137
+ | "Deleting X hours of work is wasteful" | Sunk cost fallacy. Unverified code is tech debt |
138
+ | "TDD is dogmatic" | TDD IS pragmatic: finds bugs before commit |
139
+ | "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
140
+
141
+ ---
142
+
143
+ ## Common Rationalizations
144
+
145
+ | Excuse | Reality |
146
+ | -------------------------------------- | ---------------------------------------------------------- |
147
+ | "Too simple to test" | Simple code breaks. Test takes 30 seconds |
148
+ | "I'll test after" | Tests passing immediately prove nothing |
149
+ | "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete |
150
+ | "Need to explore first" | Fine. Throw away exploration, start with TDD |
151
+ | "Test hard = skip it" | Hard to test = hard to use. Simplify the design |
152
+ | "TDD will slow me down" | TDD is faster than debugging. Always |
153
+ | "Existing code has no tests" | You're improving it. Add tests for what you touch |
154
+
155
+ ---
156
+
157
+ ## Red Flags — STOP and Start Over
158
+
159
+ - Code written before test
160
+ - Test passes immediately (never saw it fail)
161
+ - Can't explain why the test failed
162
+ - Rationalizing "just this once"
163
+ - "Keep as reference" or "adapt existing code"
164
+ - "This is different because..."
165
+
166
+ **All of these mean: Delete code. Start over with TDD.**
167
+
168
+ ---
169
+
170
+ ## Bug Fix Example
171
+
172
+ **Bug:** Empty email accepted
173
+
174
+ **🔴 RED:**
175
+
176
+ ```
177
+ test('rejects empty email') → assert result.error == 'Email required'
178
+ ```
179
+
180
+ **🔴 Verify RED:**
181
+
182
+ ```
183
+ $ run tests → FAIL: expected 'Email required', got undefined ✓
184
+ ```
185
+
186
+ **🟢 GREEN:**
187
+
188
+ ```
189
+ if not data.email: return error('Email required')
190
+ ```
191
+
192
+ **🟢 Verify GREEN:**
193
+
194
+ ```
195
+ $ run tests → PASS ✓
196
+ ```
197
+
198
+ ---
199
+
200
+ ## Verification Checklist
201
+
202
+ Before marking work complete:
203
+
204
+ - [ ] Every new function/method has a test
205
+ - [ ] Watched each test fail before implementing
206
+ - [ ] Each test failed for expected reason
207
+ - [ ] Wrote minimal code to pass each test
208
+ - [ ] All tests pass
209
+ - [ ] Output pristine (no errors, warnings)
210
+ - [ ] Tests use real code (mocks only if unavoidable)
211
+ - [ ] Edge cases and errors covered
212
+
213
+ Can't check all boxes? You skipped TDD. Start over.
214
+
215
+ ---
216
+
217
+ ## When Stuck
218
+
219
+ | Problem | Solution |
220
+ | ---------------------- | ------------------------------------------------------------- |
221
+ | Don't know how to test | Write the wished-for API. Write assertion first. Ask the user |
222
+ | Test too complicated | Design too complicated. Simplify the interface |
223
+ | Must mock everything | Code too coupled. Use dependency injection |
224
+ | Test setup is huge | Extract helpers. Still complex? Simplify the design |
225
+
226
+ ---
227
+
228
+ ## Integration
229
+
230
+ | Skill | Relationship |
231
+ | -------------------------------- | ------------------------------------------ |
232
+ | `executing-plans` | TDD cycle used during task implementation |
233
+ | `systematic-debugging` | Bug found → write failing test → TDD cycle |
234
+ | `verification-before-completion` | Gate before claiming tests pass |
235
+ | `plan-writing` | Plans include TDD step structure |
@@ -0,0 +1,157 @@
1
+ ---
2
+ name: verification-before-completion
3
+ description: Universal verification gate. MANDATORY before any completion claim, success assertion, commit, or PR. Evidence before claims, always.
4
+ version: 1.0.0
5
+ ---
6
+
7
+ # Verification Before Completion
8
+
9
+ > Adapted from obra/superpowers — integrated with agi verification scripts.
10
+
11
+ ## Overview
12
+
13
+ Claiming work is complete without verification is dishonesty, not efficiency.
14
+
15
+ **Core principle:** Evidence before claims, always.
16
+
17
+ **Violating the letter of this rule is violating the spirit of this rule.**
18
+
19
+ ---
20
+
21
+ ## The Iron Law
22
+
23
+ ```
24
+ NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
25
+ ```
26
+
27
+ If you haven't run the verification command in this message, you cannot claim it passes.
28
+
29
+ ---
30
+
31
+ ## The Gate Function
32
+
33
+ ```
34
+ BEFORE claiming any status or expressing satisfaction:
35
+
36
+ 1. IDENTIFY: What command proves this claim?
37
+ 2. RUN: Execute the FULL command (fresh, complete)
38
+ 3. READ: Full output, check exit code, count failures
39
+ 4. VERIFY: Does output confirm the claim?
40
+ - If NO: State actual status with evidence
41
+ - If YES: State claim WITH evidence
42
+ 5. ONLY THEN: Make the claim
43
+
44
+ Skip any step = unverified, not verified
45
+ ```
46
+
47
+ ---
48
+
49
+ ## Evidence Requirements
50
+
51
+ | Claim | Requires | Not Sufficient |
52
+ | --------------------- | ------------------------------- | ------------------------------ |
53
+ | Tests pass | Test command output: 0 failures | Previous run, "should pass" |
54
+ | Linter clean | Linter output: 0 errors | Partial check, extrapolation |
55
+ | Build succeeds | Build command: exit 0 | Linter passing, logs look good |
56
+ | Bug fixed | Test original symptom: passes | Code changed, assumed fixed |
57
+ | Regression test works | Red-green cycle verified | Test passes once |
58
+ | Agent completed | VCS diff shows changes | Agent reports "success" |
59
+ | Requirements met | Line-by-line checklist | Tests passing |
60
+
61
+ ---
62
+
63
+ ## Integration with Agi Scripts
64
+
65
+ When available, use the project's verification scripts:
66
+
67
+ | Verification | Script | Command |
68
+ | ------------- | ------------------ | --------------------------------------------------------------------- |
69
+ | Full audit | `checklist.py` | `python .agent/scripts/checklist.py .` |
70
+ | Security scan | `security_scan.py` | `python .agent/skills/vulnerability-scanner/scripts/security_scan.py` |
71
+ | Lint check | `lint_runner.py` | `python .agent/skills/lint-and-validate/scripts/lint_runner.py` |
72
+ | Tests | `test_runner.py` | `python .agent/skills/testing-patterns/scripts/test_runner.py` |
73
+ | UX audit | `ux_audit.py` | `python .agent/skills/frontend-design/scripts/ux_audit.py` |
74
+
75
+ > If no project scripts exist, use the project's native test/build commands.
76
+
77
+ ---
78
+
79
+ ## Red Flags — STOP
80
+
81
+ If you catch yourself thinking:
82
+
83
+ - Using "should", "probably", "seems to"
84
+ - Expressing satisfaction before verification ("Great!", "Perfect!", "Done!")
85
+ - About to commit/push/PR without verification
86
+ - Trusting agent success reports without checking
87
+ - Relying on partial verification
88
+ - Thinking "just this once"
89
+ - **ANY wording implying success without having run verification**
90
+
91
+ ---
92
+
93
+ ## Rationalization Prevention
94
+
95
+ | Excuse | Reality |
96
+ | --------------------------------------- | ------------------------- |
97
+ | "Should work now" | RUN the verification |
98
+ | "I'm confident" | Confidence ≠ evidence |
99
+ | "Just this once" | No exceptions |
100
+ | "Linter passed" | Linter ≠ compiler ≠ tests |
101
+ | "Agent said success" | Verify independently |
102
+ | "Partial check is enough" | Partial proves nothing |
103
+ | "Different words so rule doesn't apply" | Spirit over letter |
104
+
105
+ ---
106
+
107
+ ## Verification Patterns
108
+
109
+ **Tests:**
110
+
111
+ ```
112
+ ✅ [Run test command] [See: 34/34 pass] "All tests pass"
113
+ ❌ "Should pass now" / "Looks correct"
114
+ ```
115
+
116
+ **Build:**
117
+
118
+ ```
119
+ ✅ [Run build] [See: exit 0] "Build passes"
120
+ ❌ "Linter passed" (linter doesn't check compilation)
121
+ ```
122
+
123
+ **Requirements:**
124
+
125
+ ```
126
+ ✅ Re-read plan → Create checklist → Verify each → Report gaps or completion
127
+ ❌ "Tests pass, phase complete"
128
+ ```
129
+
130
+ **Agent delegation:**
131
+
132
+ ```
133
+ ✅ Agent reports success → Check VCS diff → Verify changes → Report actual state
134
+ ❌ Trust agent report
135
+ ```
136
+
137
+ ---
138
+
139
+ ## When to Apply
140
+
141
+ **ALWAYS before:**
142
+
143
+ - ANY variation of success/completion claims
144
+ - ANY expression of satisfaction
145
+ - Committing, PR creation, task completion
146
+ - Moving to next task
147
+ - Delegating to agents
148
+
149
+ ---
150
+
151
+ ## The Bottom Line
152
+
153
+ **No shortcuts for verification.**
154
+
155
+ Run the command. Read the output. THEN claim the result.
156
+
157
+ This is non-negotiable.