@northbridge-security/secureai 0.1.13
- package/.claude/README.md +122 -0
- package/.claude/commands/architect/clean.md +978 -0
- package/.claude/commands/architect/kiss.md +762 -0
- package/.claude/commands/architect/review.md +704 -0
- package/.claude/commands/catchup.md +90 -0
- package/.claude/commands/code.md +115 -0
- package/.claude/commands/commit.md +1218 -0
- package/.claude/commands/cover.md +1298 -0
- package/.claude/commands/fmea.md +275 -0
- package/.claude/commands/kaizen.md +312 -0
- package/.claude/commands/pr.md +503 -0
- package/.claude/commands/todo.md +99 -0
- package/.claude/commands/worktree.md +738 -0
- package/.claude/commands/wrapup.md +103 -0
- package/LICENSE +183 -0
- package/README.md +108 -0
- package/dist/cli.js +75634 -0
- package/docs/agents/devops-reviewer.md +889 -0
- package/docs/agents/kiss-simplifier.md +1088 -0
- package/docs/agents/typescript.md +8 -0
- package/docs/guides/README.md +109 -0
- package/docs/guides/agents.clean.arch.md +244 -0
- package/docs/guides/agents.clean.arch.ts.md +1314 -0
- package/docs/guides/agents.gotask.md +1037 -0
- package/docs/guides/agents.markdown.md +1209 -0
- package/docs/guides/agents.onepassword.md +285 -0
- package/docs/guides/agents.sonar.md +857 -0
- package/docs/guides/agents.tdd.md +838 -0
- package/docs/guides/agents.tdd.ts.md +1062 -0
- package/docs/guides/agents.typesript.md +1389 -0
- package/docs/guides/github-mcp.md +1075 -0
- package/package.json +130 -0
- package/packages/secureai-cli/src/cli.ts +21 -0
- package/tasks/README.md +880 -0
- package/tasks/aws.yml +64 -0
- package/tasks/bash.yml +118 -0
- package/tasks/bun.yml +738 -0
- package/tasks/claude.yml +183 -0
- package/tasks/docker.yml +420 -0
- package/tasks/docs.yml +127 -0
- package/tasks/git.yml +1336 -0
- package/tasks/gotask.yml +132 -0
- package/tasks/json.yml +77 -0
- package/tasks/markdown.yml +95 -0
- package/tasks/onepassword.yml +350 -0
- package/tasks/security.yml +102 -0
- package/tasks/sonar.yml +437 -0
- package/tasks/template.yml +74 -0
- package/tasks/vscode.yml +103 -0
- package/tasks/yaml.yml +121 -0
@@ -0,0 +1,838 @@

# Test-Driven Development for AI Agents

This guide establishes test-driven development (TDD) principles for AI agents working with codebases. These patterns apply across all languages and frameworks, ensuring testable, maintainable, and reliable code.

## Target Audience

AI agents (Claude Code, Cursor, GitHub Copilot, etc.) writing production code that requires automated testing, regression prevention, and quality assurance.

## Core Principles

### The TDD Cycle

**Red → Green → Refactor** is the fundamental TDD workflow:

```text
1. RED: Write a failing test
   ↓
2. GREEN: Write minimal code to pass the test
   ↓
3. REFACTOR: Improve code without changing behavior
   ↓
   (Repeat)
```

**Why this order matters:**

- **Red first** - Proves the test can fail (validates test correctness)
- **Green quickly** - Gets to working state fast (validates implementation)
- **Refactor safely** - Tests catch regressions (enables improvement)

### Test First, Code Second

**Always write tests before implementation:**

```text
❌ WRONG:
  1. Write function implementation
  2. Write tests to verify it works
  3. Find bugs, fix, repeat

✓ CORRECT:
  1. Write test describing expected behavior
  2. Run test (should fail - RED)
  3. Write minimal code to pass test (GREEN)
  4. Refactor for quality (REFACTOR)
```

**Benefits:**

- **Better design** - Writing tests first forces you to think about API design
- **Built-in coverage** - Code is written only to satisfy a test, so every new behavior starts out covered
- **No dead code** - Only write code needed to pass tests
- **Living documentation** - Tests document how code should be used

### Small Steps

**Make incremental progress with small, focused tests:**

```text
Testing a validator function:

Step 1: Test empty input
Step 2: Test valid input
Step 3: Test invalid format
Step 4: Test boundary conditions
Step 5: Test error messages
```

**Why small steps:**

- Easier to identify what broke when tests fail
- Faster feedback loop (run tests every few minutes)
- Reduced cognitive load (focus on one behavior at a time)
- Natural progression toward complete implementation

## Test Types

### Unit Tests

**Test individual functions/classes in isolation:**

**Characteristics:**

- Fast (< 10ms per test)
- No external dependencies (filesystem, network, database)
- Use mocks/stubs for dependencies
- Test one behavior per test

**Example scenarios:**

- Pure functions (input → output)
- Business logic calculations
- Data transformations
- Validation rules
- String parsing
- Math operations

**When to use:**

- Testing business logic
- Validating calculations
- Checking edge cases
- Regression prevention

### Integration Tests

**Test multiple components working together:**

**Characteristics:**

- Slower (100ms - 5 seconds per test)
- May use real dependencies (files, databases, APIs)
- Test interaction between components
- Verify end-to-end workflows

**Example scenarios:**

- Reading/writing files
- Database queries
- API calls
- Command execution
- Configuration loading
- Multi-step processes

**When to use:**

- Testing system boundaries (filesystem, network)
- Verifying component integration
- End-to-end workflow validation
- Infrastructure verification

### Functional/End-to-End Tests

**Test complete user workflows:**

**Characteristics:**

- Slowest (seconds to minutes per test)
- Full system deployment
- Real environment (staging/production-like)
- User-centric scenarios

**Example scenarios:**

- CLI command workflows
- Web application user flows
- API endpoint chains
- Installation processes
- Update/migration procedures

**When to use:**

- Critical user workflows
- Release verification
- Smoke testing deployments
- Regression testing major features

## Test Organization

### Folder Structure

**Organize tests parallel to source code:**

```text
project/
├── src/
│   ├── auth/
│   │   ├── authenticate.ts
│   │   ├── session.ts
│   │   └── tokens.ts
│   └── users/
│       ├── repository.ts
│       └── service.ts
├── tests/
│   ├── unit/
│   │   ├── auth/
│   │   │   ├── authenticate.test.ts
│   │   │   ├── session.test.ts
│   │   │   └── tokens.test.ts
│   │   └── users/
│   │       ├── repository.test.ts
│   │       └── service.test.ts
│   ├── integration/
│   │   ├── auth-flow.test.ts
│   │   └── user-management.test.ts
│   └── mocks/
│       ├── auth/
│       │   └── mock-session.ts
│       └── users/
│           └── mock-repository.ts
```

**Benefits:**

- Easy to find related tests
- Clear separation of test types
- Parallel structure to source code
- Shared mocks in dedicated folder

### Naming Conventions

**Test file names:**

| Pattern                | Example                    | Purpose            |
| ---------------------- | -------------------------- | ------------------ |
| `<module>.test.<ext>`  | `authenticate.test.ts`     | Unit tests         |
| `<feature>.test.<ext>` | `auth-flow.test.ts`        | Integration tests  |
| `<workflow>.e2e.<ext>` | `user-registration.e2e.ts` | End-to-end tests   |
| `mock-<module>.<ext>`  | `mock-repository.ts`       | Test doubles/mocks |

**Test case names:**

Use descriptive names that explain behavior:

```text
✓ GOOD:
- "should return user when valid credentials provided"
- "should throw error when password is too short"
- "should hash password before storing in database"

✗ BAD:
- "test1"
- "authentication"
- "it works"
```

## Writing Effective Tests

### Arrange-Act-Assert Pattern

**Structure every test with three sections:**

```text
// Arrange: Set up test data and preconditions
const input = "test@example.com";
const expected = { email: "test@example.com", valid: true };

// Act: Execute the code being tested
const result = validateEmail(input);

// Assert: Verify the outcome
expect(result).toEqual(expected);
```

**Why this structure:**

- Clear separation of setup, execution, and verification
- Easy to understand what's being tested
- Simple to debug when tests fail
- Consistent pattern across all tests
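
As a rough TypeScript sketch of the same three phases, the block below inlines a deliberately naive `validateEmail` so that it runs on its own. The `EmailResult` shape, the regular expression, and the small `check` helper (a stand-in for a framework's `expect`) are assumptions made for illustration, not part of any particular library.

```typescript
// Hypothetical result shape; the guide only shows `email` and `valid`.
interface EmailResult {
  email: string;
  valid: boolean;
}

// Deliberately naive validator so the example is self-contained.
function validateEmail(email: string): EmailResult {
  const valid = /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email);
  return { email, valid };
}

// Minimal stand-in for a test framework assertion.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Assertion failed: ${message}`);
}

function testAcceptsWellFormedAddress(): void {
  // Arrange: set up test data and preconditions
  const input = "test@example.com";
  const expected: EmailResult = { email: "test@example.com", valid: true };

  // Act: execute the code being tested
  const result = validateEmail(input);

  // Assert: verify the outcome
  check(result.email === expected.email, "email is echoed back");
  check(result.valid === expected.valid, "address is reported valid");
}

testAcceptsWellFormedAddress();
```

Keeping the three phases visually separate means a failure message points at one place: bad setup, a crash in the call, or a wrong outcome.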

### One Assertion Per Concept

**Test one behavior at a time:**

```text
✓ GOOD - Single concept:
test("should validate email format") {
  const result = validateEmail("test@example.com");
  expect(result.valid).toBe(true);
}

test("should extract email domain") {
  const result = validateEmail("test@example.com");
  expect(result.domain).toBe("example.com");
}

✗ BAD - Multiple concepts:
test("should validate email") {
  const result = validateEmail("test@example.com");
  expect(result.valid).toBe(true);
  expect(result.domain).toBe("example.com");
  expect(result.username).toBe("test");
}
```

**Exception:** Multiple assertions are acceptable when testing the same concept:

```text
✓ ACCEPTABLE - Same concept (object shape):
test("should return complete user object") {
  const user = createUser("John", "john@example.com");

  expect(user.name).toBe("John");
  expect(user.email).toBe("john@example.com");
  expect(user.id).toBeDefined();
  expect(user.createdAt).toBeInstanceOf(Date);
}
```

### Test Edge Cases

**Cover boundary conditions and error scenarios:**

**Input validation example:**

```text
Function: validateAge(age: number): boolean

Test cases:
  1. Valid age (18-120): expect true
  2. Minimum boundary (18): expect true
  3. Below minimum (17): expect false
  4. Maximum boundary (120): expect true
  5. Above maximum (121): expect false
  6. Zero: expect false
  7. Negative: expect false
  8. Decimal: expect false
  9. NaN: expect false
  10. Infinity: expect false
```

**Common edge cases:**

- Empty inputs (null, undefined, empty string, empty array)
- Boundary values (min, max, zero, one)
- Invalid types (wrong type, NaN, Infinity)
- Special characters (Unicode, emojis, control characters)
- Large inputs (performance, memory limits)
- Concurrent operations (race conditions)
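
The `validateAge` cases listed earlier lend themselves to a table-driven check. The sketch below assumes the rule is "whole numbers from 18 to 120 inclusive", which is how the ten cases read; the implementation and the `cases` table are illustrative only.

```typescript
// Assumed rule: whole numbers from 18 to 120 inclusive.
function validateAge(age: number): boolean {
  return Number.isInteger(age) && age >= 18 && age <= 120;
}

// Each row is one edge case: [label, input, expected result].
const cases: Array<[string, number, boolean]> = [
  ["typical valid age", 42, true],
  ["minimum boundary", 18, true],
  ["below minimum", 17, false],
  ["maximum boundary", 120, true],
  ["above maximum", 121, false],
  ["zero", 0, false],
  ["negative", -5, false],
  ["decimal", 18.5, false],
  ["NaN", Number.NaN, false],
  ["Infinity", Number.POSITIVE_INFINITY, false],
];

// Collect every failing label instead of stopping at the first one.
const failures = cases
  .filter(([, input, expected]) => validateAge(input) !== expected)
  .map(([label]) => label);

if (failures.length > 0) {
  throw new Error(`validateAge failed for: ${failures.join(", ")}`);
}
```

A table keeps each boundary on one line, so adding a newly discovered edge case is a one-row change.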

### Avoid Test Interdependence

**Each test should be independent:**

```text
✓ GOOD - Independent tests:
test("should add user") {
  const db = createTestDatabase();
  db.addUser({ name: "Alice" });
  expect(db.count()).toBe(1);
}

test("should remove user") {
  const db = createTestDatabase();
  db.addUser({ name: "Bob" });
  db.removeUser("Bob");
  expect(db.count()).toBe(0);
}

✗ BAD - Dependent tests:
let db;

test("should add user") {
  db = createTestDatabase();
  db.addUser({ name: "Alice" });
  expect(db.count()).toBe(1);
}

test("should remove user") {
  // DEPENDS ON PREVIOUS TEST
  db.removeUser("Alice");
  expect(db.count()).toBe(0);
}
```

**Why independence matters:**

- Tests can run in any order
- Tests can run in parallel
- Failures are isolated (one failure doesn't cascade)
- Tests can be run individually for debugging
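
One way to see the order-independence claim in action is to run the same tests in reverse. The sketch below assumes a hypothetical in-memory `createTestDatabase()` (the guide does not define one) and a small `check` helper in place of a framework's `expect`.

```typescript
// Hypothetical in-memory store standing in for `createTestDatabase()`.
interface TestDatabase {
  addUser(user: { name: string }): void;
  removeUser(name: string): void;
  count(): number;
}

function createTestDatabase(): TestDatabase {
  let names: string[] = [];
  return {
    addUser: (user) => { names.push(user.name); },
    removeUser: (name) => { names = names.filter((n) => n !== name); },
    count: () => names.length,
  };
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Assertion failed: ${message}`);
}

// Each test builds its own database, so no state leaks between them.
const tests: Array<{ name: string; run: () => void }> = [
  {
    name: "should add user",
    run: () => {
      const db = createTestDatabase();
      db.addUser({ name: "Alice" });
      check(db.count() === 1, "one user after add");
    },
  },
  {
    name: "should remove user",
    run: () => {
      const db = createTestDatabase();
      db.addUser({ name: "Bob" });
      db.removeUser("Bob");
      check(db.count() === 0, "no users after remove");
    },
  },
];

// Declaration order and reverse order both pass.
for (const t of tests) t.run();
for (const t of tests.slice().reverse()) t.run();
```

The dependent version from the example above would fail in the reversed run, because "should remove user" would execute before any database exists.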

## Mocking and Test Doubles

### When to Use Mocks

**Use mocks for external dependencies:**

**Mock these:**

- File system operations (read, write, delete)
- Network requests (HTTP, WebSocket, database)
- System commands (exec, spawn)
- Time-dependent code (Date.now(), timers)
- Random number generation
- External APIs

**Don't mock these:**

- Pure functions (no side effects)
- Data structures (objects, arrays)
- Simple utilities (string manipulation, math)
- Code you're testing directly

### Types of Test Doubles

**Different patterns for different needs:**

**Stub** - Returns canned responses:

```text
mockDatabase.getUser() → returns { id: 1, name: "Test User" }
```

**Spy** - Records how it was called:

```text
mockLogger.log("message")
→ Verify: called once with "message"
```

**Mock** - Programmable behavior with expectations:

```text
mockAPI
  .expect("POST", "/users")
  .withBody({ name: "Alice" })
  .respond({ id: 1 })
```

**Fake** - Working implementation (lightweight):

```text
InMemoryDatabase - Real database logic, but in-memory storage
```
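
Three of these doubles can be hand-rolled in a few lines of TypeScript. The sketch below is illustrative: `UserStore`, `createLoggerSpy`, and `InMemoryUserStore` are hypothetical names, and the expectation-style mock is left out because that fluent API usually comes from a mocking library rather than hand-written code.

```typescript
interface User {
  id: number;
  name: string;
}

// Hypothetical dependency that production code would talk to.
interface UserStore {
  getUser(id: number): User | undefined;
  saveUser(user: User): void;
}

// Stub: always returns the same canned response, whatever the input.
const stubStore: UserStore = {
  getUser: () => ({ id: 1, name: "Test User" }),
  saveUser: () => {},
};

// Spy: records each call so a test can inspect it afterwards.
function createLoggerSpy() {
  const calls: string[] = [];
  return {
    log: (message: string): void => {
      calls.push(message);
    },
    calls,
  };
}

// Fake: a working but lightweight implementation backed by memory.
class InMemoryUserStore implements UserStore {
  private users = new Map<number, User>();

  getUser(id: number): User | undefined {
    return this.users.get(id);
  }

  saveUser(user: User): void {
    this.users.set(user.id, user);
  }
}

const logger = createLoggerSpy();
logger.log("message");
// logger.calls now holds exactly one entry: "message".

const fakeStore = new InMemoryUserStore();
fakeStore.saveUser({ id: 7, name: "Alice" });
```

A fake costs more to write than a stub but behaves consistently across reads and writes, which matters once a test performs more than one operation.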

### Dependency Injection for Testability

**Pass dependencies as parameters:**

```text
✓ GOOD - Injectable dependency:
function saveUser(user, database) {
  return database.insert(user);
}

// In tests:
const mockDB = createMockDatabase();
saveUser({ name: "Alice" }, mockDB);

✗ BAD - Hard-coded dependency:
import { realDatabase } from './database';

function saveUser(user) {
  return realDatabase.insert(user);
  // Cannot test without real database
}
```

**For class-based code, use constructor injection:**

```text
✓ GOOD - Constructor injection:
class UserService {
  constructor(database, emailService) {
    this.database = database;
    this.emailService = emailService;
  }
}

// In tests:
const service = new UserService(mockDB, mockEmail);

✗ BAD - Hard-coded dependencies:
class UserService {
  constructor() {
    this.database = new RealDatabase();
    this.emailService = new RealEmailService();
  }
}
```
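
In TypeScript, constructor injection pairs naturally with interfaces, so a test double only has to satisfy the methods the service actually uses. The sketch below is a minimal illustration: the `Database` and `EmailService` interfaces and the `register` method are assumptions, not an API defined by this guide.

```typescript
interface User {
  name: string;
  email: string;
}

// Hypothetical ports the service depends on.
interface Database {
  insert(user: User): number;
}

interface EmailService {
  sendWelcome(address: string): void;
}

class UserService {
  private readonly database: Database;
  private readonly emailService: EmailService;

  constructor(database: Database, emailService: EmailService) {
    this.database = database;
    this.emailService = emailService;
  }

  // Stores the user, then sends a welcome message to their address.
  register(user: User): number {
    const id = this.database.insert(user);
    this.emailService.sendWelcome(user.email);
    return id;
  }
}

// In tests: plain objects that record what the service did.
const inserted: User[] = [];
const sentTo: string[] = [];

const mockDB: Database = {
  insert: (user) => {
    inserted.push(user);
    return inserted.length; // fake auto-increment id
  },
};

const mockEmail: EmailService = {
  sendWelcome: (address) => {
    sentTo.push(address);
  },
};

const service = new UserService(mockDB, mockEmail);
const newId = service.register({ name: "Alice", email: "alice@example.com" });
```

Because the service never constructs its own collaborators, the test needs no real database or mail server and can still verify both interactions.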

## Test Coverage

### Coverage Metrics

**Understand what coverage measures:**

| Metric             | Meaning                      | Target |
| ------------------ | ---------------------------- | ------ |
| Line coverage      | % of code lines executed     | 80%+   |
| Branch coverage    | % of if/else branches tested | 80%+   |
| Function coverage  | % of functions called        | 90%+   |
| Statement coverage | % of statements executed     | 80%+   |

**Coverage is not quality:**

- 100% coverage doesn't mean bug-free code
- Focus on meaningful tests, not coverage numbers
- Cover critical paths and edge cases thoroughly
- Low-value code (getters/setters) can have lower coverage

### What to Prioritize

**Test these thoroughly (aim for 100%):**

- Business logic and algorithms
- Security-critical code (authentication, authorization)
- Data validation and sanitization
- Error handling and edge cases
- Public APIs and interfaces

**Lower priority (aim for 60-80%):**

- Simple getters/setters
- Configuration loading
- Logging statements
- UI layout code
- Trivial utilities

### Excluding Code from Coverage

**Mark code that shouldn't be covered:**

```text
// Language-specific examples:

// TypeScript/JavaScript (Istanbul-based coverage tools)
/* istanbul ignore next */
function developmentOnlyHelper() { ... }

// Python (coverage.py)
def debug_helper():  # pragma: no cover
    ...

// Go (the standard toolchain has no per-function ignore comment;
// a build constraint drops the whole file when tests run with -tags test)
//go:build !test
```

**What to exclude:**

- Development/debug utilities
- Platform-specific code on other platforms
- Defensive assertions that should never happen
- Generated code

## Anti-Patterns to Avoid

### Testing Implementation Details

**Test behavior, not implementation:**

```text
✗ BAD - Tests internal state:
test("should increment counter") {
  const obj = new Counter();
  obj.increment();
  expect(obj._internalCounter).toBe(1); // Testing private state
}

✓ GOOD - Tests public behavior:
test("should return incremented value") {
  const counter = new Counter();
  counter.increment();
  expect(counter.getValue()).toBe(1);
}
```

**Why this matters:**

- Internal refactoring shouldn't break tests
- Tests document public contract, not implementation
- Enables changing internals without test changes

### Brittle Tests

**Avoid tests that break on unrelated changes:**

```text
✗ BAD - Hardcoded expectation against the current time:
test("should format date") {
  const result = formatDate(new Date());
  expect(result).toBe("2024-11-22 14:30:45"); // Fails whenever the clock reads anything else
}

✓ GOOD - Flexible matching:
test("should format date") {
  const result = formatDate(new Date());
  expect(result).toMatch(/^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/);
}
```
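
Pattern matching checks only the shape of the output. Where the exact value matters, another option is to pass in a fixed instant so the expected string never changes. The sketch below assumes a hypothetical `formatDate` that formats in UTC; the guide does not specify its time zone or implementation.

```typescript
// Hypothetical formatter producing "YYYY-MM-DD HH:mm:ss" in UTC.
function formatDate(date: Date): string {
  const pad = (n: number): string => (n < 10 ? `0${n}` : String(n));
  const day = `${date.getUTCFullYear()}-${pad(date.getUTCMonth() + 1)}-${pad(date.getUTCDate())}`;
  const time = `${pad(date.getUTCHours())}:${pad(date.getUTCMinutes())}:${pad(date.getUTCSeconds())}`;
  return `${day} ${time}`;
}

// A fixed instant keeps the expected string stable on every run
// and in every time zone. Month index 10 is November.
const fixedInstant = new Date(Date.UTC(2024, 10, 22, 14, 30, 45));
const formatted = formatDate(fixedInstant);

if (formatted !== "2024-11-22 14:30:45") {
  throw new Error(`Unexpected format: ${formatted}`);
}
```

This keeps the exact assertion from the "bad" example while removing its dependence on the wall clock.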

### Testing Multiple Things

**One test, one responsibility:**

```text
✗ BAD - Tests entire workflow:
test("user workflow") {
  const user = createUser();
  user.login();
  user.updateProfile();
  user.changePassword();
  user.logout();
  // If any step fails, which one?
}

✓ GOOD - Separate tests:
test("should create user")
test("should login user")
test("should update profile")
test("should change password")
test("should logout user")
```

### Slow Tests

**Keep tests fast:**

**Performance targets:**

- Unit test: < 10ms
- Integration test: < 1 second (heavier cases may take up to the 5 seconds noted under Integration Tests)
- E2E test: < 30 seconds

**Optimization strategies:**

- Use in-memory databases instead of real ones
- Mock slow external dependencies
- Parallelize test execution
- Use test data factories instead of hand-building fixtures in every test
- Share expensive setup across tests (carefully)

## TDD in Practice

### Starting a New Feature

**TDD workflow for new features:**

```text
1. Write first test for simplest case
   → Test fails (RED)

2. Write minimal code to pass
   → Test passes (GREEN)

3. Write test for next case
   → Test fails (RED)

4. Extend code to pass new test
   → All tests pass (GREEN)

5. Refactor if needed
   → Tests still pass (GREEN)

6. Repeat until feature complete
```

**Example: Building an email validator**

```text
Step 1: Test empty string
  Test: validateEmail("") → {valid: false}
  Code: function validateEmail(email) { return {valid: false}; }

Step 2: Test simple valid email
  Test: validateEmail("a@b.c") → {valid: true}
  Code: Add check for @ and .

Step 3: Test invalid format (no @)
  Test: validateEmail("invalid") → {valid: false}
  Code: Already passes

Step 4: Test invalid format (no domain)
  Test: validateEmail("test@") → {valid: false}
  Code: Add domain check

Step 5: Test complex valid email
  Test: validateEmail("user.name+tag@example.co.uk") → {valid: true}
  Code: Improve regex pattern

(Continue for all edge cases...)
```
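
The five steps might leave the code in roughly the following state. This is a sketch under assumptions: it uses string splitting rather than the regex the steps mention, and the `ValidationResult` shape carries only the `valid` flag shown in the steps.

```typescript
interface ValidationResult {
  valid: boolean;
}

// State of the validator after step 5. Each check was added only
// when a new failing test demanded it.
function validateEmail(email: string): ValidationResult {
  // Step 1: an empty string is invalid.
  if (email.length === 0) return { valid: false };

  // Steps 2-3: exactly one "@" separating a local part from a domain.
  const parts = email.split("@");
  if (parts.length !== 2) return { valid: false };
  const [local, domain] = parts;

  // Step 4: both sides must be non-empty.
  if (local.length === 0 || domain.length === 0) return { valid: false };

  // Steps 2 and 5: the domain needs at least two non-empty dot-separated labels.
  const labels = domain.split(".");
  const valid = labels.length >= 2 && labels.every((label) => label.length > 0);
  return { valid };
}

// The accumulated tests, one row per step: [input, expected validity].
const steps: Array<[string, boolean]> = [
  ["", false],
  ["a@b.c", true],
  ["invalid", false],
  ["test@", false],
  ["user.name+tag@example.co.uk", true],
];

for (const [input, expected] of steps) {
  if (validateEmail(input).valid !== expected) {
    throw new Error(`validateEmail(${JSON.stringify(input)}) should be ${expected}`);
  }
}
```

Every earlier row keeps running as later steps are added, which is what turns the step-by-step tests into a regression suite.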

### Fixing Bugs

**TDD workflow for bug fixes:**

```text
1. Write test that reproduces the bug
   → Test fails (confirms bug exists)

2. Fix the bug
   → Test passes (bug is fixed)

3. Ensure all other tests still pass
   → Regression test in place forever
```

**Example: Bug report: "App crashes on missing input"**

```text
1. Write failing test:
   test("should handle missing input") {
     expect(() => processInput(undefined)).not.toThrow();
   }
   → Test fails: TypeError: Cannot read property 'length' of undefined

2. Fix code:
   function processInput(input) {
     if (!input) return null;  // Guard against missing or empty input
     return input.length;
   }
   → Test passes

3. Run all tests:
   → All pass, regression prevented
```
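
A typed version of this regression test could look like the sketch below. It assumes the crash came from reading `.length` on a missing value, as the error message in the example suggests; the optional parameter and the `check` helper are illustrative choices.

```typescript
// Hypothetical signature: callers may pass nothing at all.
function processInput(input?: string | null): number | null {
  if (!input) return null; // guard added by the fix
  return input.length;
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Assertion failed: ${message}`);
}

// Regression tests: written to fail before the fix, kept afterwards.
function testHandlesMissingInput(): void {
  check(processInput(undefined) === null, "undefined input returns null");
  check(processInput(null) === null, "null input returns null");
  check(processInput("") === null, "empty string returns null");
}

// Guards against the fix breaking the normal path.
function testStillMeasuresRealInput(): void {
  check(processInput("abc") === 3, "non-empty input returns its length");
}

testHandlesMissingInput();
testStillMeasuresRealInput();
```

The second test matters as much as the first: a guard that is too broad would pass the regression test while silently changing the behavior for valid input.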

### Refactoring

**TDD enables safe refactoring:**

```text
1. Ensure comprehensive test coverage
   → All tests pass (GREEN)

2. Refactor code
   → Change structure, not behavior

3. Run tests frequently
   → Tests catch regressions immediately

4. If tests fail:
   → Either fix code or fix test (if test was wrong)

5. Repeat until refactoring complete
   → All tests still pass
```

**Refactoring example:**

```text
BEFORE:
function calculateTotal(items) {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    total += items[i].price * items[i].quantity;
  }
  return total;
}

Tests: ✓ All passing

AFTER (refactored):
function calculateTotal(items) {
  return items.reduce((sum, item) =>
    sum + (item.price * item.quantity), 0
  );
}

Tests: ✓ All still passing (behavior unchanged)
```
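
While a refactoring is in progress, one lightweight safety net is to run both versions side by side on the same inputs. The sketch below does that for the two `calculateTotal` variants; the `LineItem` type, the function suffixes, and the sample carts are assumptions added for illustration.

```typescript
interface LineItem {
  price: number;
  quantity: number;
}

// Original loop-based version.
function calculateTotalLoop(items: LineItem[]): number {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    total += items[i].price * items[i].quantity;
  }
  return total;
}

// Refactored version using reduce.
function calculateTotalReduce(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}

// Sample inputs, including the empty and zero-quantity edge cases.
const carts: LineItem[][] = [
  [],
  [{ price: 10, quantity: 1 }],
  [{ price: 5, quantity: 3 }, { price: 2, quantity: 4 }],
  [{ price: 100, quantity: 0 }],
];

// Both versions must agree on every cart.
for (const cart of carts) {
  const before = calculateTotalLoop(cart);
  const after = calculateTotalReduce(cart);
  if (before !== after) {
    throw new Error(`Refactoring changed behavior: ${before} !== ${after}`);
  }
}
```

Once the new version is trusted, the old one and the comparison loop can be deleted, leaving the ordinary tests in place.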

## Best Practices for AI Agents

### Always Run Tests Before Coding

**Workflow for AI agents:**

```text
1. Read existing tests
2. Understand expected behavior
3. Write new test for feature/fix
4. Run tests (should fail)
5. Write code to pass test
6. Run tests (should pass)
7. Refactor if needed
8. Run tests (should still pass)
```

**Never skip step 4** - Seeing the test fail first shows that it can actually detect the missing behavior.

### Communicate Test Results

**Report test status to users:**

```text
✓ GOOD:
"I've written a test for email validation. Running tests..."
[test output]
"Test failed as expected (RED). Now implementing the validator..."
[writes code]
"Running tests again..."
[test output]
"Test passes (GREEN). Email validation is working correctly."

✗ BAD:
"I've implemented email validation."
[no tests mentioned, no verification shown]
```

### Use Test Output for Debugging

**When tests fail, analyze output:**

```text
Test failure output:
  Expected: { valid: true, domain: "example.com" }
  Received: { valid: true, domain: undefined }

Analysis:
- valid flag is correct
- domain extraction is broken
- Focus debugging on domain parsing logic
```

### Maintain Test Quality

**Treat tests as production code:**

- Use descriptive names
- Keep tests simple and readable
- Refactor duplicate test code
- Delete obsolete tests
- Update tests when requirements change

## Language-Specific Guides

For implementation details in specific languages:

- **TypeScript/JavaScript**: See [agents.tdd.ts.md](./agents.tdd.ts.md)
- **Python**: See [agents.tdd.py.md](./agents.tdd.py.md) (coming soon)
- **Go**: See [agents.tdd.go.md](./agents.tdd.go.md) (coming soon)
- **Ruby**: See [agents.tdd.rb.md](./agents.tdd.rb.md) (coming soon)

## Summary

**TDD fundamentals:**

- Write tests first (RED → GREEN → REFACTOR)
- Test behavior, not implementation
- Keep tests fast and independent
- Use mocks for external dependencies
- Aim for 80%+ coverage overall and close to 100% on critical code
- One test, one behavior
- Run tests frequently

**For AI agents:**

- Always write tests before implementation
- Verify tests fail before writing code (RED)
- Show test output to users
- Use test failures for debugging
- Maintain test quality like production code

**Result**: Reliable, maintainable code with regression protection and living documentation.