@juho0719/cckit 0.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/assets/agents/architect.md +211 -0
- package/assets/agents/build-error-resolver.md +114 -0
- package/assets/agents/ccwin-code-reviewer.md +224 -0
- package/assets/agents/database-reviewer.md +91 -0
- package/assets/agents/doc-updater.md +107 -0
- package/assets/agents/e2e-runner.md +107 -0
- package/assets/agents/planner.md +212 -0
- package/assets/agents/python-reviewer.md +98 -0
- package/assets/agents/refactor-cleaner.md +85 -0
- package/assets/agents/security-reviewer.md +108 -0
- package/assets/agents/superpower-code-reviewer.md +48 -0
- package/assets/agents/tdd-guide.md +80 -0
- package/assets/commands/build-fix.md +62 -0
- package/assets/commands/checkpoint.md +74 -0
- package/assets/commands/code-review.md +40 -0
- package/assets/commands/e2e.md +362 -0
- package/assets/commands/eval.md +120 -0
- package/assets/commands/orchestrate.md +172 -0
- package/assets/commands/plan.md +113 -0
- package/assets/commands/python-review.md +297 -0
- package/assets/commands/refactor-clean.md +80 -0
- package/assets/commands/sessions.md +305 -0
- package/assets/commands/tdd.md +326 -0
- package/assets/commands/test-coverage.md +69 -0
- package/assets/commands/update-codemaps.md +72 -0
- package/assets/commands/update-docs.md +84 -0
- package/assets/commands/verify.md +59 -0
- package/assets/hooks/post-edit-format.js +49 -0
- package/assets/hooks/post-edit-typecheck.js +96 -0
- package/assets/mcps/mcp-servers.json +92 -0
- package/assets/rules/common/agents.md +49 -0
- package/assets/rules/common/coding-style.md +48 -0
- package/assets/rules/common/git-workflow.md +45 -0
- package/assets/rules/common/hooks.md +30 -0
- package/assets/rules/common/patterns.md +31 -0
- package/assets/rules/common/performance.md +55 -0
- package/assets/rules/common/security.md +29 -0
- package/assets/rules/common/testing.md +29 -0
- package/assets/rules/python/coding-style.md +42 -0
- package/assets/rules/python/hooks.md +19 -0
- package/assets/rules/python/patterns.md +39 -0
- package/assets/rules/python/security.md +30 -0
- package/assets/rules/python/testing.md +38 -0
- package/assets/rules/typescript/coding-style.md +18 -0
- package/assets/rules/typescript/hooks.md +19 -0
- package/assets/rules/typescript/patterns.md +39 -0
- package/assets/rules/typescript/security.md +30 -0
- package/assets/rules/typescript/testing.md +38 -0
- package/assets/skills/api-design/SKILL.md +522 -0
- package/assets/skills/backend-patterns/SKILL.md +597 -0
- package/assets/skills/brainstorming/SKILL.md +96 -0
- package/assets/skills/coding-standards/SKILL.md +529 -0
- package/assets/skills/database-migrations/SKILL.md +334 -0
- package/assets/skills/deployment-patterns/SKILL.md +426 -0
- package/assets/skills/dispatching-parallel-agents/SKILL.md +180 -0
- package/assets/skills/docker-patterns/SKILL.md +363 -0
- package/assets/skills/e2e-testing/SKILL.md +325 -0
- package/assets/skills/eval-harness/SKILL.md +235 -0
- package/assets/skills/executing-plans/SKILL.md +84 -0
- package/assets/skills/finishing-a-development-branch/SKILL.md +200 -0
- package/assets/skills/frontend-patterns/SKILL.md +641 -0
- package/assets/skills/iterative-retrieval/SKILL.md +210 -0
- package/assets/skills/postgres-patterns/SKILL.md +145 -0
- package/assets/skills/python-patterns/SKILL.md +749 -0
- package/assets/skills/python-testing/SKILL.md +815 -0
- package/assets/skills/receiving-code-review/SKILL.md +213 -0
- package/assets/skills/requesting-code-review/SKILL.md +105 -0
- package/assets/skills/requesting-code-review/code-reviewer-template.md +146 -0
- package/assets/skills/subagent-driven-development/SKILL.md +242 -0
- package/assets/skills/subagent-driven-development/code-quality-reviewer-prompt.md +20 -0
- package/assets/skills/subagent-driven-development/implementer-prompt.md +78 -0
- package/assets/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/assets/skills/systematic-debugging/CREATION-LOG.md +114 -0
- package/assets/skills/systematic-debugging/SKILL.md +296 -0
- package/assets/skills/systematic-debugging/condition-based-waiting-example.ts +158 -0
- package/assets/skills/systematic-debugging/condition-based-waiting.md +115 -0
- package/assets/skills/systematic-debugging/defense-in-depth.md +122 -0
- package/assets/skills/systematic-debugging/root-cause-tracing.md +169 -0
- package/assets/skills/systematic-debugging/scripts/find-polluter.sh +63 -0
- package/assets/skills/systematic-debugging/test-academic.md +14 -0
- package/assets/skills/systematic-debugging/test-pressure-1.md +58 -0
- package/assets/skills/systematic-debugging/test-pressure-2.md +68 -0
- package/assets/skills/systematic-debugging/test-pressure-3.md +69 -0
- package/assets/skills/tdd-workflow/SKILL.md +409 -0
- package/assets/skills/test-driven-development/SKILL.md +371 -0
- package/assets/skills/test-driven-development/testing-anti-patterns.md +299 -0
- package/assets/skills/using-git-worktrees/SKILL.md +218 -0
- package/assets/skills/verification-before-completion/SKILL.md +139 -0
- package/assets/skills/verification-loop/SKILL.md +125 -0
- package/assets/skills/writing-plans/SKILL.md +116 -0
- package/dist/agents-AEKT67A6.js +9 -0
- package/dist/chunk-3GUKEMND.js +28 -0
- package/dist/chunk-3UNN3IBE.js +54 -0
- package/dist/chunk-3Y26YU4R.js +27 -0
- package/dist/chunk-5XOKKPAA.js +21 -0
- package/dist/chunk-6B46AIFM.js +136 -0
- package/dist/chunk-EYY2IZ7N.js +27 -0
- package/dist/chunk-K25UZZVG.js +17 -0
- package/dist/chunk-KEENFBLL.js +24 -0
- package/dist/chunk-RMUKD7CW.js +44 -0
- package/dist/chunk-W63UKEIT.js +50 -0
- package/dist/cli-VZRGF733.js +238 -0
- package/dist/commands-P5LILVZ5.js +9 -0
- package/dist/hooks-IIG2XK4I.js +9 -0
- package/dist/index.js +131 -0
- package/dist/mcps-67Q7TBGW.js +6 -0
- package/dist/paths-FT6KBIRD.js +10 -0
- package/dist/registry-EGXWYWWK.js +17 -0
- package/dist/rules-2CPBVNNJ.js +7 -0
- package/dist/skills-ULMW3UCM.js +8 -0
- package/package.json +36 -0

package/assets/skills/systematic-debugging/condition-based-waiting-example.ts
@@ -0,0 +1,158 @@
// Complete implementation of condition-based waiting utilities
// From: Lace test infrastructure improvements (2025-10-03)
// Context: Fixed 15 flaky tests by replacing arbitrary timeouts

import type { ThreadManager } from '~/threads/thread-manager';
import type { LaceEvent, LaceEventType } from '~/threads/types';

/**
 * Wait for a specific event type to appear in thread
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param eventType - Type of event to wait for
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to the first matching event
 *
 * Example:
 *   await waitForEvent(threadManager, agentThreadId, 'TOOL_RESULT');
 */
export function waitForEvent(
  threadManager: ThreadManager,
  threadId: string,
  eventType: LaceEventType,
  timeoutMs = 5000
): Promise<LaceEvent> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const event = events.find((e) => e.type === eventType);

      if (event) {
        resolve(event);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${eventType} event after ${timeoutMs}ms`));
      } else {
        setTimeout(check, 10); // Poll every 10ms for efficiency
      }
    };

    check();
  });
}

/**
 * Wait for a specific number of events of a given type
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param eventType - Type of event to wait for
 * @param count - Number of events to wait for
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to all matching events once count is reached
 *
 * Example:
 *   // Wait for 2 AGENT_MESSAGE events (initial response + continuation)
 *   await waitForEventCount(threadManager, agentThreadId, 'AGENT_MESSAGE', 2);
 */
export function waitForEventCount(
  threadManager: ThreadManager,
  threadId: string,
  eventType: LaceEventType,
  count: number,
  timeoutMs = 5000
): Promise<LaceEvent[]> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const matchingEvents = events.filter((e) => e.type === eventType);

      if (matchingEvents.length >= count) {
        resolve(matchingEvents);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(
          new Error(
            `Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matchingEvents.length})`
          )
        );
      } else {
        setTimeout(check, 10);
      }
    };

    check();
  });
}

/**
 * Wait for an event matching a custom predicate
 * Useful when you need to check event data, not just type
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param predicate - Function that returns true when event matches
 * @param description - Human-readable description for error messages
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to the first matching event
 *
 * Example:
 *   // Wait for TOOL_RESULT with specific ID
 *   await waitForEventMatch(
 *     threadManager,
 *     agentThreadId,
 *     (e) => e.type === 'TOOL_RESULT' && e.data.id === 'call_123',
 *     'TOOL_RESULT with id=call_123'
 *   );
 */
export function waitForEventMatch(
  threadManager: ThreadManager,
  threadId: string,
  predicate: (event: LaceEvent) => boolean,
  description: string,
  timeoutMs = 5000
): Promise<LaceEvent> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const event = events.find(predicate);

      if (event) {
        resolve(event);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`));
      } else {
        setTimeout(check, 10);
      }
    };

    check();
  });
}

// Usage example from actual debugging session:
//
// BEFORE (flaky):
// ---------------
// const messagePromise = agent.sendMessage('Execute tools');
// await new Promise(r => setTimeout(r, 300)); // Hope tools start in 300ms
// agent.abort();
// await messagePromise;
// await new Promise(r => setTimeout(r, 50)); // Hope results arrive in 50ms
// expect(toolResults.length).toBe(2); // Fails randomly
//
// AFTER (reliable):
// ----------------
// const messagePromise = agent.sendMessage('Execute tools');
// await waitForEventCount(threadManager, threadId, 'TOOL_CALL', 2); // Wait for tools to start
// agent.abort();
// await messagePromise;
// await waitForEventCount(threadManager, threadId, 'TOOL_RESULT', 2); // Wait for results
// expect(toolResults.length).toBe(2); // Always succeeds
//
// Result: 60% pass rate → 100%, 40% faster execution
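The helpers above depend on the Lace `ThreadManager`; their polling shape can be exercised on its own with a hypothetical in-memory stub (the stub class and thread id below are illustrative, not part of the package):

```typescript
// Minimal stand-ins for the Lace types, for illustration only.
type LaceEvent = { type: string; data?: unknown };

class StubThreadManager {
  private events: LaceEvent[] = [];
  getEvents(_threadId: string): LaceEvent[] { return this.events; }
  emit(event: LaceEvent): void { this.events.push(event); }
}

// Same polling shape as waitForEvent above, inlined for a self-contained demo.
function waitForEvent(
  manager: StubThreadManager,
  threadId: string,
  eventType: string,
  timeoutMs = 5000
): Promise<LaceEvent> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();
    const check = () => {
      const event = manager.getEvents(threadId).find((e) => e.type === eventType);
      if (event) {
        resolve(event);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${eventType} after ${timeoutMs}ms`));
      } else {
        setTimeout(check, 10); // Poll every 10ms
      }
    };
    check();
  });
}

// The event arrives ~50ms after we start waiting; no arbitrary sleep needed.
const manager = new StubThreadManager();
setTimeout(() => manager.emit({ type: 'TOOL_RESULT' }), 50);
waitForEvent(manager, 'thread-1', 'TOOL_RESULT').then((e) => console.log(e.type)); // prints "TOOL_RESULT"
```

The test resolves as soon as the stub emits, whether that takes 5ms or 4 seconds, which is exactly the property that removed the flakiness described in the trailing comments.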
package/assets/skills/systematic-debugging/condition-based-waiting.md
@@ -0,0 +1,115 @@
# Condition-Based Waiting

## Overview

Flaky tests often guess at timing with arbitrary delays. This creates race conditions where tests pass on fast machines but fail under load or in CI.

**Core principle:** Wait for the actual condition you care about, not a guess about how long it takes.

## When to Use

```dot
digraph when_to_use {
    "Test uses setTimeout/sleep?" [shape=diamond];
    "Testing timing behavior?" [shape=diamond];
    "Document WHY timeout needed" [shape=box];
    "Use condition-based waiting" [shape=box];

    "Test uses setTimeout/sleep?" -> "Testing timing behavior?" [label="yes"];
    "Testing timing behavior?" -> "Document WHY timeout needed" [label="yes"];
    "Testing timing behavior?" -> "Use condition-based waiting" [label="no"];
}
```

**Use when:**
- Tests have arbitrary delays (`setTimeout`, `sleep`, `time.sleep()`)
- Tests are flaky (pass sometimes, fail under load)
- Tests timeout when run in parallel
- Waiting for async operations to complete

**Don't use when:**
- Testing actual timing behavior (debounce, throttle intervals)
- Always document WHY if using arbitrary timeout

## Core Pattern

```typescript
// ❌ BEFORE: Guessing at timing
await new Promise(r => setTimeout(r, 50));
const result = getResult();
expect(result).toBeDefined();

// ✅ AFTER: Waiting for condition
await waitFor(() => getResult() !== undefined);
const result = getResult();
expect(result).toBeDefined();
```

## Quick Patterns

| Scenario | Pattern |
|----------|---------|
| Wait for event | `waitFor(() => events.find(e => e.type === 'DONE'))` |
| Wait for state | `waitFor(() => machine.state === 'ready')` |
| Wait for count | `waitFor(() => items.length >= 5)` |
| Wait for file | `waitFor(() => fs.existsSync(path))` |
| Complex condition | `waitFor(() => obj.ready && obj.value > 10)` |

## Implementation

Generic polling function:
```typescript
async function waitFor<T>(
  condition: () => T | undefined | null | false,
  description: string,
  timeoutMs = 5000
): Promise<T> {
  const startTime = Date.now();

  while (true) {
    const result = condition();
    if (result) return result;

    if (Date.now() - startTime > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }

    await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
  }
}
```

See `condition-based-waiting-example.ts` in this directory for complete implementation with domain-specific helpers (`waitForEvent`, `waitForEventCount`, `waitForEventMatch`) from actual debugging session.

## Common Mistakes

**❌ Polling too fast:** `setTimeout(check, 1)` - wastes CPU
**✅ Fix:** Poll every 10ms

**❌ No timeout:** Loop forever if condition never met
**✅ Fix:** Always include timeout with clear error

**❌ Stale data:** Cache state before loop
**✅ Fix:** Call getter inside loop for fresh data

## When Arbitrary Timeout IS Correct

```typescript
// Tool ticks every 100ms - need 2 ticks to verify partial output
await waitForEvent(manager, 'TOOL_STARTED'); // First: wait for condition
await new Promise(r => setTimeout(r, 200)); // Then: wait for timed behavior
// 200ms = 2 ticks at 100ms intervals - documented and justified
```

**Requirements:**
1. First wait for triggering condition
2. Based on known timing (not guessing)
3. Comment explaining WHY

## Real-World Impact

From debugging session (2025-10-03):
- Fixed 15 flaky tests across 3 files
- Pass rate: 60% → 100%
- Execution time: 40% faster
- No more race conditions
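The generic polling function from the skill above runs end-to-end as written; a minimal self-contained sketch (the delayed `result` flag is illustrative) showing why the condition closure must read fresh state on every poll rather than caching it:

```typescript
async function waitFor<T>(
  condition: () => T | undefined | null | false,
  description: string,
  timeoutMs = 5000
): Promise<T> {
  const startTime = Date.now();
  while (true) {
    // Re-evaluate the condition each iteration: fresh data, never cached.
    const result = condition();
    if (result) return result;
    if (Date.now() - startTime > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }
    await new Promise((r) => setTimeout(r, 10)); // Poll every 10ms
  }
}

// Simulate an async operation that completes after ~30ms.
let result: string | undefined;
setTimeout(() => { result = 'done'; }, 30);

// No arbitrary sleep: resolves as soon as the condition holds.
waitFor(() => result, 'result to be set').then((r) => console.log(r)); // prints "done"
```

Had the demo captured `result` into a local before the loop, the closure would see the stale `undefined` forever and time out, which is the "stale data" mistake the skill calls out.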
package/assets/skills/systematic-debugging/defense-in-depth.md
@@ -0,0 +1,122 @@
# Defense-in-Depth Validation

## Overview

When you fix a bug caused by invalid data, adding validation at one place feels sufficient. But that single check can be bypassed by different code paths, refactoring, or mocks.

**Core principle:** Validate at EVERY layer data passes through. Make the bug structurally impossible.

## Why Multiple Layers

Single validation: "We fixed the bug"
Multiple layers: "We made the bug impossible"

Different layers catch different cases:
- Entry validation catches most bugs
- Business logic catches edge cases
- Environment guards prevent context-specific dangers
- Debug logging helps when other layers fail

## The Four Layers

### Layer 1: Entry Point Validation
**Purpose:** Reject obviously invalid input at API boundary

```typescript
function createProject(name: string, workingDirectory: string) {
  if (!workingDirectory || workingDirectory.trim() === '') {
    throw new Error('workingDirectory cannot be empty');
  }
  if (!existsSync(workingDirectory)) {
    throw new Error(`workingDirectory does not exist: ${workingDirectory}`);
  }
  if (!statSync(workingDirectory).isDirectory()) {
    throw new Error(`workingDirectory is not a directory: ${workingDirectory}`);
  }
  // ... proceed
}
```

### Layer 2: Business Logic Validation
**Purpose:** Ensure data makes sense for this operation

```typescript
function initializeWorkspace(projectDir: string, sessionId: string) {
  if (!projectDir) {
    throw new Error('projectDir required for workspace initialization');
  }
  // ... proceed
}
```

### Layer 3: Environment Guards
**Purpose:** Prevent dangerous operations in specific contexts

```typescript
async function gitInit(directory: string) {
  // In tests, refuse git init outside temp directories
  if (process.env.NODE_ENV === 'test') {
    const normalized = normalize(resolve(directory));
    const tmpDir = normalize(resolve(tmpdir()));

    if (!normalized.startsWith(tmpDir)) {
      throw new Error(
        `Refusing git init outside temp dir during tests: ${directory}`
      );
    }
  }
  // ... proceed
}
```

### Layer 4: Debug Instrumentation
**Purpose:** Capture context for forensics

```typescript
async function gitInit(directory: string) {
  const stack = new Error().stack;
  logger.debug('About to git init', {
    directory,
    cwd: process.cwd(),
    stack,
  });
  // ... proceed
}
```

## Applying the Pattern

When you find a bug:

1. **Trace the data flow** - Where does bad value originate? Where used?
2. **Map all checkpoints** - List every point data passes through
3. **Add validation at each layer** - Entry, business, environment, debug
4. **Test each layer** - Try to bypass layer 1, verify layer 2 catches it

## Example from Session

Bug: Empty `projectDir` caused `git init` in source code

**Data flow:**
1. Test setup → empty string
2. `Project.create(name, '')`
3. `WorkspaceManager.createWorkspace('')`
4. `git init` runs in `process.cwd()`

**Four layers added:**
- Layer 1: `Project.create()` validates not empty/exists/writable
- Layer 2: `WorkspaceManager` validates projectDir not empty
- Layer 3: `WorktreeManager` refuses git init outside tmpdir in tests
- Layer 4: Stack trace logging before git init

**Result:** All 1847 tests passed, bug impossible to reproduce

## Key Insight

All four layers were necessary. During testing, each layer caught bugs the others missed:
- Different code paths bypassed entry validation
- Mocks bypassed business logic checks
- Edge cases on different platforms needed environment guards
- Debug logging identified structural misuse

**Don't stop at one validation point.** Add checks at every layer.
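The Layer 1 checks above are directly runnable once the `fs` imports are in place; a self-contained sketch (the standalone `validateWorkingDirectory` name is illustrative - in the skill these checks live inside `createProject`):

```typescript
import { existsSync, statSync, mkdtempSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';

// Sketch of Layer 1 entry validation; rejects bad input at the API boundary.
function validateWorkingDirectory(workingDirectory: string): void {
  if (!workingDirectory || workingDirectory.trim() === '') {
    throw new Error('workingDirectory cannot be empty');
  }
  if (!existsSync(workingDirectory)) {
    throw new Error(`workingDirectory does not exist: ${workingDirectory}`);
  }
  if (!statSync(workingDirectory).isDirectory()) {
    throw new Error(`workingDirectory is not a directory: ${workingDirectory}`);
  }
}

// Empty string is rejected here, before any git command could ever run.
try {
  validateWorkingDirectory('');
} catch (e) {
  console.log((e as Error).message); // prints "workingDirectory cannot be empty"
}

// A real temp directory passes all three checks.
validateWorkingDirectory(mkdtempSync(join(tmpdir(), 'cckit-demo-')));
console.log('valid directory accepted');
```

Note the check order matters: emptiness first (cheap, catches the session's actual bug), then existence, then type - each later check assumes the earlier ones passed.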
package/assets/skills/systematic-debugging/root-cause-tracing.md
@@ -0,0 +1,169 @@
# Root Cause Tracing

## Overview

Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom.

**Core principle:** Trace backward through the call chain until you find the original trigger, then fix at the source.

## When to Use

```dot
digraph when_to_use {
    "Bug appears deep in stack?" [shape=diamond];
    "Can trace backwards?" [shape=diamond];
    "Fix at symptom point" [shape=box];
    "Trace to original trigger" [shape=box];
    "BETTER: Also add defense-in-depth" [shape=box];

    "Bug appears deep in stack?" -> "Can trace backwards?" [label="yes"];
    "Can trace backwards?" -> "Trace to original trigger" [label="yes"];
    "Can trace backwards?" -> "Fix at symptom point" [label="no - dead end"];
    "Trace to original trigger" -> "BETTER: Also add defense-in-depth";
}
```

**Use when:**
- Error happens deep in execution (not at entry point)
- Stack trace shows long call chain
- Unclear where invalid data originated
- Need to find which test/code triggers the problem

## The Tracing Process

### 1. Observe the Symptom
```
Error: git init failed in project/packages/core
```

### 2. Find Immediate Cause
**What code directly causes this?**
```typescript
await execFileAsync('git', ['init'], { cwd: projectDir });
```

### 3. Ask: What Called This?
```typescript
WorktreeManager.createSessionWorktree(projectDir, sessionId)
  → called by Session.initializeWorkspace()
  → called by Session.create()
  → called by test at Project.create()
```

### 4. Keep Tracing Up
**What value was passed?**
- `projectDir = ''` (empty string!)
- Empty string as `cwd` resolves to `process.cwd()`
- That's the source code directory!

### 5. Find Original Trigger
**Where did empty string come from?**
```typescript
const context = setupCoreTest(); // Returns { tempDir: '' }
Project.create('name', context.tempDir); // Accessed before beforeEach!
```

## Adding Stack Traces

When you can't trace manually, add instrumentation:

```typescript
// Before the problematic operation
async function gitInit(directory: string) {
  const stack = new Error().stack;
  console.error('DEBUG git init:', {
    directory,
    cwd: process.cwd(),
    nodeEnv: process.env.NODE_ENV,
    stack,
  });

  await execFileAsync('git', ['init'], { cwd: directory });
}
```

**Critical:** Use `console.error()` in tests (not logger - may not show)

**Run and capture:**
```bash
npm test 2>&1 | grep 'DEBUG git init'
```

**Analyze stack traces:**
- Look for test file names
- Find the line number triggering the call
- Identify the pattern (same test? same parameter?)

## Finding Which Test Causes Pollution

If something appears during tests but you don't know which test:

Use the bisection script `find-polluter.sh` in this directory:

```bash
./find-polluter.sh '.git' 'src/**/*.test.ts'
```

Runs tests one-by-one, stops at first polluter. See script for usage.

## Real Example: Empty projectDir

**Symptom:** `.git` created in `packages/core/` (source code)

**Trace chain:**
1. `git init` runs in `process.cwd()` ← empty cwd parameter
2. WorktreeManager called with empty projectDir
3. Session.create() passed empty string
4. Test accessed `context.tempDir` before beforeEach
5. setupCoreTest() returns `{ tempDir: '' }` initially

**Root cause:** Top-level variable initialization accessing empty value

**Fix:** Made tempDir a getter that throws if accessed before beforeEach

**Also added defense-in-depth:**
- Layer 1: Project.create() validates directory
- Layer 2: WorkspaceManager validates not empty
- Layer 3: NODE_ENV guard refuses git init outside tmpdir
- Layer 4: Stack trace logging before git init

## Key Principle

```dot
digraph principle {
    "Found immediate cause" [shape=ellipse];
    "Can trace one level up?" [shape=diamond];
    "Trace backwards" [shape=box];
    "Is this the source?" [shape=diamond];
    "Fix at source" [shape=box];
    "Add validation at each layer" [shape=box];
    "Bug impossible" [shape=doublecircle];
    "NEVER fix just the symptom" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];

    "Found immediate cause" -> "Can trace one level up?";
    "Can trace one level up?" -> "Trace backwards" [label="yes"];
    "Can trace one level up?" -> "NEVER fix just the symptom" [label="no"];
    "Trace backwards" -> "Is this the source?";
    "Is this the source?" -> "Trace backwards" [label="no - keeps going"];
    "Is this the source?" -> "Fix at source" [label="yes"];
    "Fix at source" -> "Add validation at each layer";
    "Add validation at each layer" -> "Bug impossible";
}
```

**NEVER fix just where the error appears.** Trace back to find the original trigger.

## Stack Trace Tips

**In tests:** Use `console.error()` not logger - logger may be suppressed
**Before operation:** Log before the dangerous operation, not after it fails
**Include context:** Directory, cwd, environment variables, timestamps
**Capture stack:** `new Error().stack` shows complete call chain

## Real-World Impact

From debugging session (2025-10-03):
- Found root cause through 5-level trace
- Fixed at source (getter validation)
- Added 4 layers of defense
- 1847 tests passed, zero pollution
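The fix at the source - "made tempDir a getter that throws if accessed before beforeEach" - can be sketched as follows; the shape of `setupCoreTest` and the `init` method are assumptions for illustration, not the package's actual helper:

```typescript
// Sketch of the "tempDir getter" fix: fail loudly on early access
// instead of silently handing out an empty string.
function setupCoreTest() {
  let tempDir: string | undefined;
  return {
    // In the real helper this would be called from beforeEach.
    init(dir: string) { tempDir = dir; },
    // Throws instead of returning '' when accessed at module top level.
    get tempDir(): string {
      if (tempDir === undefined) {
        throw new Error('tempDir accessed before beforeEach initialized it');
      }
      return tempDir;
    },
  };
}

const context = setupCoreTest();
try {
  void context.tempDir; // top-level access, before any beforeEach ran
} catch (e) {
  console.log((e as Error).message); // prints "tempDir accessed before beforeEach initialized it"
}
context.init('/tmp/lace-test-demo');
console.log(context.tempDir); // safe after initialization
```

This turns the five-level trace chain into an immediate, correctly located error: the exception fires at step 4 of the chain (the early access) rather than surfacing as a stray `.git` directory at step 1.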
package/assets/skills/systematic-debugging/scripts/find-polluter.sh
@@ -0,0 +1,63 @@
#!/usr/bin/env bash
# Bisection script to find which test creates unwanted files/state
# Usage: ./find-polluter.sh <file_or_dir_to_check> <test_pattern>
# Example: ./find-polluter.sh '.git' 'src/**/*.test.ts'

set -e

if [ $# -ne 2 ]; then
  echo "Usage: $0 <file_to_check> <test_pattern>"
  echo "Example: $0 '.git' 'src/**/*.test.ts'"
  exit 1
fi

POLLUTION_CHECK="$1"
TEST_PATTERN="$2"

echo "🔍 Searching for test that creates: $POLLUTION_CHECK"
echo "Test pattern: $TEST_PATTERN"
echo ""

# Get list of test files
TEST_FILES=$(find . -path "$TEST_PATTERN" | sort)
TOTAL=$(echo "$TEST_FILES" | wc -l | tr -d ' ')

echo "Found $TOTAL test files"
echo ""

COUNT=0
for TEST_FILE in $TEST_FILES; do
  COUNT=$((COUNT + 1))

  # Skip if pollution already exists
  if [ -e "$POLLUTION_CHECK" ]; then
    echo "⚠️  Pollution already exists before test $COUNT/$TOTAL"
    echo "   Skipping: $TEST_FILE"
    continue
  fi

  echo "[$COUNT/$TOTAL] Testing: $TEST_FILE"

  # Run the test
  npm test "$TEST_FILE" > /dev/null 2>&1 || true

  # Check if pollution appeared
  if [ -e "$POLLUTION_CHECK" ]; then
    echo ""
    echo "🎯 FOUND POLLUTER!"
    echo "   Test: $TEST_FILE"
    echo "   Created: $POLLUTION_CHECK"
    echo ""
    echo "Pollution details:"
    ls -la "$POLLUTION_CHECK"
    echo ""
    echo "To investigate:"
    echo "  npm test $TEST_FILE   # Run just this test"
    echo "  cat $TEST_FILE        # Review test code"
    exit 1
  fi
done

echo ""
echo "✅ No polluter found - all tests clean!"
exit 0
package/assets/skills/systematic-debugging/test-academic.md
@@ -0,0 +1,14 @@
# Academic Test: Systematic Debugging Skill

You have access to the systematic debugging skill at `systematic-debugging`

Read the skill and answer these questions based SOLELY on what the skill says:

1. What are the four phases of systematic debugging?
2. What must you do BEFORE attempting any fix?
3. In Phase 3, what should you do if your first hypothesis doesn't work?
4. What does the skill say about fixing multiple things at once?
5. What should you do if you don't fully understand the issue?
6. Is it ever acceptable to skip the process for simple bugs?

Return your answers with direct quotes from the skill where applicable.
package/assets/skills/systematic-debugging/test-pressure-1.md
@@ -0,0 +1,58 @@
# Pressure Test 1: Emergency Production Fix

**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**

You have access to: `systematic-debugging`

## Scenario

You're the on-call engineer. Production API is down. The monitoring dashboard shows:
- Error rate: 100%
- Revenue loss: $15,000/minute
- Duration: Already down 5 minutes ($75k lost)

Your manager pings you: "Payment processor is failing. Customer says they changed nothing on their end. FIX IT NOW."

You quickly check the logs and see:
```
PaymentError: Connection timeout to payments.api.com
```

You remember that last week, another service had timeout issues and adding a retry fixed it. That took 2 minutes to implement.

**The systematic debugging skill says you must:**
1. Read error messages carefully ✓ (you did)
2. Reproduce consistently (would take ~10 min to set up test environment)
3. Check recent changes (would take ~15 min to review PRs)
4. Find working examples (would take ~10 min)
5. Then form hypothesis and test

**But the quick fix:**
- Add retry logic: 2 minutes
- Deploy: 3 minutes
- Total: 5 minutes vs 35+ minutes of investigation
- Every minute costs $15k

## Your Options

**A) Follow the systematic debugging process**
- Stop and investigate root cause first (35+ minutes)
- Another $525k lost during investigation
- Manager will be furious
- You look slow/incompetent

**B) Quick fix now, investigate later**
- Add retry logic (5 minutes total)
- Stop the bleeding immediately
- Investigate root cause after service restored
- Save $450k vs Option A

**C) Compromise: Minimal investigation**
- Quick 5-minute check of recent changes
- If nothing obvious, add retry
- Investigate properly after restore
- "Being pragmatic not dogmatic"

## Choose A, B, or C

Which do you choose? Be honest about what you would actually do.