@tianhai/pi-workflow-kit 0.5.1 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (62) hide show
  1. package/README.md +44 -494
  2. package/docs/developer-usage-guide.md +41 -401
  3. package/docs/oversight-model.md +13 -34
  4. package/docs/workflow-phases.md +32 -46
  5. package/extensions/workflow-guard.ts +67 -0
  6. package/package.json +3 -7
  7. package/skills/brainstorming/SKILL.md +16 -59
  8. package/skills/executing-tasks/SKILL.md +26 -227
  9. package/skills/finalizing/SKILL.md +33 -0
  10. package/skills/writing-plans/SKILL.md +23 -132
  11. package/ROADMAP.md +0 -16
  12. package/agents/code-reviewer.md +0 -18
  13. package/agents/config.ts +0 -5
  14. package/agents/implementer.md +0 -26
  15. package/agents/spec-reviewer.md +0 -13
  16. package/agents/worker.md +0 -17
  17. package/docs/plans/completed/2026-04-09-cleanup-legacy-state-and-enforce-think-phases-design.md +0 -56
  18. package/docs/plans/completed/2026-04-09-cleanup-legacy-state-and-enforce-think-phases-implementation.md +0 -196
  19. package/docs/plans/completed/2026-04-09-workflow-next-autocomplete-design.md +0 -185
  20. package/docs/plans/completed/2026-04-09-workflow-next-autocomplete-implementation.md +0 -334
  21. package/docs/plans/completed/2026-04-09-workflow-next-handoff-state-design.md +0 -251
  22. package/docs/plans/completed/2026-04-09-workflow-next-handoff-state-implementation.md +0 -253
  23. package/extensions/constants.ts +0 -15
  24. package/extensions/lib/logging.ts +0 -138
  25. package/extensions/plan-tracker.ts +0 -502
  26. package/extensions/subagent/agents.ts +0 -144
  27. package/extensions/subagent/concurrency.ts +0 -52
  28. package/extensions/subagent/env.ts +0 -47
  29. package/extensions/subagent/index.ts +0 -1181
  30. package/extensions/subagent/lifecycle.ts +0 -25
  31. package/extensions/subagent/timeout.ts +0 -13
  32. package/extensions/workflow-monitor/debug-monitor.ts +0 -98
  33. package/extensions/workflow-monitor/git.ts +0 -31
  34. package/extensions/workflow-monitor/heuristics.ts +0 -58
  35. package/extensions/workflow-monitor/investigation.ts +0 -52
  36. package/extensions/workflow-monitor/reference-tool.ts +0 -42
  37. package/extensions/workflow-monitor/skip-confirmation.ts +0 -19
  38. package/extensions/workflow-monitor/tdd-monitor.ts +0 -137
  39. package/extensions/workflow-monitor/test-runner.ts +0 -37
  40. package/extensions/workflow-monitor/verification-monitor.ts +0 -61
  41. package/extensions/workflow-monitor/warnings.ts +0 -81
  42. package/extensions/workflow-monitor/workflow-handler.ts +0 -358
  43. package/extensions/workflow-monitor/workflow-next-completions.ts +0 -68
  44. package/extensions/workflow-monitor/workflow-next-state.ts +0 -112
  45. package/extensions/workflow-monitor/workflow-tracker.ts +0 -253
  46. package/extensions/workflow-monitor/workflow-transitions.ts +0 -55
  47. package/extensions/workflow-monitor.ts +0 -872
  48. package/skills/dispatching-parallel-agents/SKILL.md +0 -194
  49. package/skills/receiving-code-review/SKILL.md +0 -196
  50. package/skills/systematic-debugging/SKILL.md +0 -170
  51. package/skills/systematic-debugging/condition-based-waiting-example.ts +0 -158
  52. package/skills/systematic-debugging/condition-based-waiting.md +0 -115
  53. package/skills/systematic-debugging/defense-in-depth.md +0 -122
  54. package/skills/systematic-debugging/find-polluter.sh +0 -63
  55. package/skills/systematic-debugging/reference/rationalizations.md +0 -61
  56. package/skills/systematic-debugging/root-cause-tracing.md +0 -169
  57. package/skills/test-driven-development/SKILL.md +0 -266
  58. package/skills/test-driven-development/reference/examples.md +0 -101
  59. package/skills/test-driven-development/reference/rationalizations.md +0 -67
  60. package/skills/test-driven-development/reference/when-stuck.md +0 -33
  61. package/skills/test-driven-development/testing-anti-patterns.md +0 -299
  62. package/skills/using-git-worktrees/SKILL.md +0 -231
@@ -1,194 +0,0 @@
1
- ---
2
- name: dispatching-parallel-agents
3
- description: Use when facing 2+ independent tasks that can be worked on without shared state or sequential dependencies
4
- ---
5
-
6
- > **Related skills:** Debug each problem with `/skill:systematic-debugging`. Verify all fixes with `/skill:executing-tasks`.
7
-
8
- # Dispatching Parallel Agents
9
-
10
- ## Overview
11
-
12
- When you have multiple unrelated failures (different test files, different subsystems, different bugs), investigating them sequentially wastes time. Each investigation is independent and can happen in parallel.
13
-
14
- **Core principle:** Dispatch one agent per independent problem domain. Let them work concurrently.
15
-
16
- ## When to Use
17
-
18
- ```dot
19
- digraph when_to_use {
20
- "Multiple failures?" [shape=diamond];
21
- "Are they independent?" [shape=diamond];
22
- "Single agent investigates all" [shape=box];
23
- "One agent per problem domain" [shape=box];
24
- "Can they work in parallel?" [shape=diamond];
25
- "Sequential agents" [shape=box];
26
- "Parallel dispatch" [shape=box];
27
-
28
- "Multiple failures?" -> "Are they independent?" [label="yes"];
29
- "Are they independent?" -> "Single agent investigates all" [label="no - related"];
30
- "Are they independent?" -> "Can they work in parallel?" [label="yes"];
31
- "Can they work in parallel?" -> "Parallel dispatch" [label="yes"];
32
- "Can they work in parallel?" -> "Sequential agents" [label="no - shared state"];
33
- }
34
- ```
35
-
36
- **Use when:**
37
- - 3+ test files failing with different root causes
38
- - Multiple subsystems broken independently
39
- - Each problem can be understood without context from others
40
- - No shared state between investigations
41
-
42
- **Don't use when:**
43
- - Failures are related (fix one might fix others)
44
- - Need to understand full system state
45
- - Agents would interfere with each other
46
-
47
- ## The Pattern
48
-
49
- ### 1. Identify Independent Domains
50
-
51
- Group failures by what's broken:
52
- - File A tests: Tool approval flow
53
- - File B tests: Batch completion behavior
54
- - File C tests: Abort functionality
55
-
56
- Each domain is independent - fixing tool approval doesn't affect abort tests.
57
-
58
- ### 2. Create Focused Agent Tasks
59
-
60
- Each agent gets:
61
- - **Specific scope:** One test file or subsystem
62
- - **Clear goal:** Make these tests pass
63
- - **Constraints:** Don't change other code
64
- - **Expected output:** Summary of what you found and fixed
65
-
66
- ### 3. Dispatch in Parallel
67
-
68
- **How to dispatch:**
69
-
70
- Use the `subagent` tool in parallel mode:
71
-
72
- > **Agent scope:** The built-in agents (`worker`, `implementer`, `code-reviewer`, `spec-reviewer`)
73
- > are bundled with this package. To use them, set `agentScope: "both"`. The default scope `"user"`
74
- > only loads agents from `~/.pi/agent/agents/`.
75
-
76
- ```ts
77
- subagent({
78
- agentScope: "both", // include bundled agents (worker, implementer, etc.)
79
- tasks: [
80
- { agent: "worker", task: "Fix agent-tool-abort.test.ts failures" },
81
- { agent: "worker", task: "Fix batch-completion-behavior.test.ts failures" },
82
- { agent: "worker", task: "Fix tool-approval-race-conditions.test.ts failures" },
83
- ],
84
- })
85
- ```
86
-
87
- ### 4. Review and Integrate
88
-
89
- When agents return:
90
- - Read each summary
91
- - Verify fixes don't conflict
92
- - Run full test suite
93
- - Integrate all changes
94
-
95
- **If agents edited the same files:** Review manually. Pick the correct version per hunk, or re-run one agent with the other's changes as context. Don't blindly merge.
96
-
97
- **If some agents failed:** Integrate successful agents first (commit their work). Then retry the failed agent with fresh context that includes the integrated changes.
98
-
99
- ## Agent Prompt Structure
100
-
101
- Good agent prompts are:
102
- 1. **Focused** - One clear problem domain
103
- 2. **Self-contained** - All context needed to understand the problem
104
- 3. **Specific about output** - What should the agent return?
105
-
106
- ```markdown
107
- Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
108
-
109
- 1. "should abort tool with partial output capture" - expects 'interrupted at' in message
110
- 2. "should handle mixed completed and aborted tools" - fast tool aborted instead of completed
111
- 3. "should properly track pendingToolCount" - expects 3 results but gets 0
112
-
113
- These are timing/race condition issues. Your task:
114
-
115
- 1. Read the test file and understand what each test verifies
116
- 2. Identify root cause - timing issues or actual bugs?
117
- 3. Fix by:
118
- - Replacing arbitrary timeouts with event-based waiting
119
- - Fixing bugs in abort implementation if found
120
- - Adjusting test expectations if testing changed behavior
121
-
122
- Do NOT just increase timeouts - find the real issue.
123
-
124
- Return: Summary of what you found and what you fixed.
125
- ```
126
-
127
- ## Common Mistakes
128
-
129
- **❌ Too broad:** "Fix all the tests" - agent gets lost
130
- **✅ Specific:** "Fix agent-tool-abort.test.ts" - focused scope
131
-
132
- **❌ No context:** "Fix the race condition" - agent doesn't know where
133
- **✅ Context:** Paste the error messages and test names
134
-
135
- **❌ No constraints:** Agent might refactor everything
136
- **✅ Constraints:** "Do NOT change production code" or "Fix tests only"
137
-
138
- **❌ Vague output:** "Fix it" - you don't know what changed
139
- **✅ Specific:** "Return summary of root cause and changes"
140
-
141
- ## When NOT to Use
142
-
143
- **Related failures:** Fixing one might fix others - investigate together first
144
- **Need full context:** Understanding requires seeing entire system
145
- **Exploratory debugging:** You don't know what's broken yet
146
- **Shared state:** Agents would interfere (editing same files, using same resources)
147
-
148
- ## Real Example from Session
149
-
150
- **Scenario:** 6 test failures across 3 files after major refactoring
151
-
152
- **Failures:**
153
- - agent-tool-abort.test.ts: 3 failures (timing issues)
154
- - batch-completion-behavior.test.ts: 2 failures (tools not executing)
155
- - tool-approval-race-conditions.test.ts: 1 failure (execution count = 0)
156
-
157
- **Decision:** Independent domains - abort logic separate from batch completion separate from race conditions
158
-
159
- **Dispatch:**
160
- ```ts
161
- subagent({
162
- agentScope: "both", // include bundled agents (worker, implementer, etc.)
163
- tasks: [
164
- { agent: "worker", task: "Fix agent-tool-abort.test.ts" },
165
- { agent: "worker", task: "Fix batch-completion-behavior.test.ts" },
166
- { agent: "worker", task: "Fix tool-approval-race-conditions.test.ts" },
167
- ],
168
- })
169
- ```
170
-
171
- **Results:**
172
- - Agent 1: Replaced timeouts with event-based waiting
173
- - Agent 2: Fixed event structure bug (threadId in wrong place)
174
- - Agent 3: Added wait for async tool execution to complete
175
-
176
- **Integration:** All fixes independent, no conflicts, full suite green
177
-
178
- **Time saved:** 3 problems solved in parallel vs sequentially
179
-
180
- ## Verification
181
-
182
- After agents return:
183
- 1. **Review each summary** - Understand what changed
184
- 2. **Check for conflicts** - Did agents edit same code?
185
- 3. **Run full suite** - Verify all fixes work together
186
- 4. **Spot check** - Agents can make systematic errors
187
-
188
- > **Integration-mode note:** When integrating parallel agent results, run `git stash` if
189
- > needed before the integration test run to isolate any stash conflicts from true failures.
190
- > If tests fail during integration, rule out merge conflicts first before treating it as a
191
- > new bug. Only invoke `workflow_reference debug-rationalizations` if you have confirmed
192
- > the failure is not from a merge conflict.
193
-
194
-
@@ -1,196 +0,0 @@
1
- ---
2
- name: receiving-code-review
3
- description: Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement or blind implementation
4
- ---
5
-
6
- > **Related skills:** Verify each fix with `/skill:executing-tasks`. Use `/skill:test-driven-development` for regression tests.
7
-
8
- # Code Review Reception
9
-
10
- ## Overview
11
-
12
- Code review requires technical evaluation, not emotional performance.
13
-
14
- **Core principle:** Verify before implementing. Ask before assuming. Technical correctness over social comfort.
15
-
16
- ## The Response Pattern
17
-
18
- ```
19
- WHEN receiving code review feedback:
20
-
21
- 1. READ: Complete feedback without reacting
22
- 2. UNDERSTAND: Restate requirement in own words (or ask)
23
- 3. VERIFY: Check against codebase reality
24
- 4. EVALUATE: Technically sound for THIS codebase?
25
- 5. RESPOND: Technical acknowledgment or reasoned pushback
26
- 6. IMPLEMENT: One item at a time, test each
27
- ```
28
-
29
- ## How to Respond
30
-
31
- **Never performative:**
32
- - ❌ "You're absolutely right!" / "Great point!" / "Thanks for catching that!"
33
- - ❌ "Let me implement that now" (before verification)
34
-
35
- **Instead:**
36
- - ✅ "Fixed. [Brief description of what changed]"
37
- - ✅ "Good catch - [specific issue]. Fixed in [location]."
38
- - ✅ Just fix it silently — the code shows you heard the feedback
39
- - ✅ Push back with technical reasoning if wrong
40
- - ✅ Ask clarifying questions if unclear
41
-
42
- ## Handling Unclear Feedback
43
-
44
- ```
45
- IF any item is unclear:
46
- STOP - do not implement anything yet
47
- ASK for clarification on unclear items
48
-
49
- WHY: Items may be related. Partial understanding = wrong implementation.
50
- ```
51
-
52
- **Example:**
53
- ```
54
- your human partner: "Fix 1-6"
55
- You understand 1,2,3,6. Unclear on 4,5.
56
-
57
- ❌ WRONG: Implement 1,2,3,6 now, ask about 4,5 later
58
- ✅ RIGHT: "I understand items 1,2,3,6. Need clarification on 4 and 5 before proceeding."
59
- ```
60
-
61
- ## Source-Specific Handling
62
-
63
- ### From your human partner
64
- - **Trusted** - implement after understanding
65
- - **Still ask** if scope unclear
66
- - **No performative agreement**
67
- - **Skip to action** or technical acknowledgment
68
-
69
- ### From External Reviewers
70
- ```
71
- BEFORE implementing:
72
- 1. Check: Technically correct for THIS codebase?
73
- 2. Check: Breaks existing functionality?
74
- 3. Check: Reason for current implementation?
75
- 4. Check: Works on all platforms/versions?
76
- 5. Check: Does reviewer understand full context?
77
-
78
- IF suggestion seems wrong:
79
- Push back with technical reasoning
80
-
81
- IF can't easily verify:
82
- Say so: "I can't verify this without [X]. Should I [investigate/ask/proceed]?"
83
-
84
- IF conflicts with your human partner's prior decisions:
85
- Stop and discuss with your human partner first
86
- ```
87
-
88
- **your human partner's rule:** "External feedback - be skeptical, but check carefully"
89
-
90
- ## YAGNI Check for "Professional" Features
91
-
92
- ```
93
- IF reviewer suggests "implementing properly":
94
- grep codebase for actual usage
95
-
96
- IF unused: "This endpoint isn't called. Remove it (YAGNI)?"
97
- IF used: Then implement properly
98
- ```
99
-
100
- **your human partner's rule:** "You and reviewer both report to me. If we don't need this feature, don't add it."
101
-
102
- ## Implementation Order
103
-
104
- ```
105
- FOR multi-item feedback:
106
- 1. Clarify anything unclear FIRST
107
- 2. Then implement in this order:
108
- - Blocking issues (breaks, security)
109
- - Simple fixes (typos, imports)
110
- - Complex fixes (refactoring, logic)
111
- 3. Test each fix individually
112
- 4. Verify no regressions
113
- ```
114
-
115
- ## When To Push Back
116
-
117
- Push back when:
118
- - Suggestion breaks existing functionality
119
- - Reviewer lacks full context
120
- - Violates YAGNI (unused feature)
121
- - Technically incorrect for this stack
122
- - Legacy/compatibility reasons exist
123
- - Conflicts with your human partner's architectural decisions
124
-
125
- **How to push back:**
126
- - Use technical reasoning, not defensiveness
127
- - Ask specific questions
128
- - Reference working tests/code
129
- - Involve your human partner if architectural
130
-
131
- **Signal if uncomfortable pushing back out loud:** "Strange things are afoot at the Circle K"
132
-
133
- ## Gracefully Correcting Your Pushback
134
-
135
- If you pushed back and were wrong:
136
- ```
137
- ✅ "You were right - I checked [X] and it does [Y]. Implementing now."
138
- ✅ "Verified this and you're correct. My initial understanding was wrong because [reason]. Fixing."
139
-
140
- ❌ Long apology
141
- ❌ Defending why you pushed back
142
- ❌ Over-explaining
143
- ```
144
-
145
- State the correction factually and move on.
146
-
147
- ## Common Mistakes
148
-
149
- | Mistake | Fix |
150
- |---------|-----|
151
- | Performative agreement | State requirement or just act |
152
- | Blind implementation | Verify against codebase first |
153
- | Batch without testing | One at a time, test each |
154
- | Assuming reviewer is right | Check if breaks things |
155
- | Avoiding pushback | Technical correctness > comfort |
156
- | Partial implementation | Clarify all items first |
157
- | Can't verify, proceed anyway | State limitation, ask for direction |
158
-
159
- ## Real Examples
160
-
161
- **Performative Agreement (Bad):**
162
- ```
163
- Reviewer: "Remove legacy code"
164
- ❌ "You're absolutely right! Let me remove that..."
165
- ```
166
-
167
- **Technical Verification (Good):**
168
- ```
169
- Reviewer: "Remove legacy code"
170
- ✅ "Checking... build target is 10.15+, this API needs 13+. Need legacy for backward compat. Current impl has wrong bundle ID - fix it or drop pre-13 support?"
171
- ```
172
-
173
- **YAGNI (Good):**
174
- ```
175
- Reviewer: "Implement proper metrics tracking with database, date filters, CSV export"
176
- ✅ "Grepped codebase - nothing calls this endpoint. Remove it (YAGNI)? Or is there usage I'm missing?"
177
- ```
178
-
179
- **Unclear Item (Good):**
180
- ```
181
- your human partner: "Fix items 1-6"
182
- You understand 1,2,3,6. Unclear on 4,5.
183
- ✅ "Understand 1,2,3,6. Need clarification on 4 and 5 before implementing."
184
- ```
185
-
186
- ## GitHub Thread Replies
187
-
188
- When replying to inline review comments on GitHub, reply in the comment thread (`gh api repos/{owner}/{repo}/pulls/{pr}/comments/{id}/replies`), not as a top-level PR comment.
189
-
190
- ## The Bottom Line
191
-
192
- **External feedback = suggestions to evaluate, not orders to follow.**
193
-
194
- Verify. Question. Then implement.
195
-
196
- No performative agreement. Technical rigor always.
@@ -1,170 +0,0 @@
1
- ---
2
- name: systematic-debugging
3
- description: Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes
4
- ---
5
-
6
- > **Related skills:** Write a failing test for the bug with `/skill:test-driven-development`. Verify the fix with `/skill:executing-tasks`.
7
-
8
- # Systematic Debugging
9
-
10
- ## Overview
11
-
12
- Random fixes waste time and create new bugs. Quick patches mask underlying issues.
13
-
14
- **Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
15
-
16
- **Violating the letter of this process is violating the spirit of debugging.**
17
-
18
- > The workflow-monitor extension tracks your debugging: it detects fix-without-investigation and counts failed fix attempts, surfacing warnings in tool results. Use `workflow_reference` with debug topics for additional guidance.
19
-
20
- If a tool result contains a ⚠️ workflow warning, stop immediately and address it before continuing.
21
-
22
- ## The Iron Law
23
-
24
- ```
25
- NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
26
- ```
27
-
28
- If you haven't completed Phase 1, you cannot propose fixes.
29
-
30
- ## When to Use
31
-
32
- Use for ANY technical issue: test failures, bugs, unexpected behavior, performance problems, build failures, integration issues.
33
-
34
- **Use this ESPECIALLY when:**
35
- - Under time pressure (emergencies make guessing tempting)
36
- - "Just one quick fix" seems obvious
37
- - You've already tried multiple fixes
38
- - Previous fix didn't work
39
- - You don't fully understand the issue
40
-
41
- **Don't skip when:**
42
- - Issue seems simple (simple bugs have root causes too)
43
- - You're in a hurry (rushing guarantees rework)
44
-
45
- ## The Four Phases
46
-
47
- You MUST complete each phase before proceeding to the next.
48
-
49
- ### Phase 1: Root Cause Investigation
50
-
51
- **BEFORE attempting ANY fix:**
52
-
53
- 1. **Read Error Messages Carefully** — Don't skip past errors or warnings. Read stack traces completely. Note line numbers, file paths, error codes.
54
-
55
- 2. **Reproduce Consistently** — Can you trigger it reliably? What are the exact steps? If not reproducible → gather more data, don't guess.
56
-
57
- 3. **Check Recent Changes** — Git diff, recent commits, new dependencies, config changes, environmental differences.
58
-
59
- 4. **Gather Evidence in Multi-Component Systems** — For each component boundary: log what enters, what exits, verify config propagation. Run once to see WHERE it breaks, then investigate that component.
60
-
61
- **Example (multi-layer system):**
62
- ```bash
63
- # Layer 1: Workflow
64
- echo "=== Secrets available: ==="
65
- echo "IDENTITY: ${IDENTITY:+SET}${IDENTITY:-UNSET}"
66
-
67
- # Layer 2: Build script
68
- echo "=== Env vars in build script: ==="
69
- env | grep IDENTITY || echo "IDENTITY not in environment"
70
-
71
- # Layer 3: Signing
72
- echo "=== Keychain state: ==="
73
- security list-keychains
74
- security find-identity -v
75
- ```
76
- **This reveals:** Which layer fails (e.g., secrets → workflow ✓, workflow → build ✗)
77
-
78
- 5. **Trace Data Flow** — Where does the bad value originate? What called this with the bad value? Keep tracing up until you find the source. Fix at source, not at symptom. See `root-cause-tracing.md` for the complete technique.
79
-
80
- ### Phase 2: Pattern Analysis
81
-
82
- 1. **Find Working Examples** — Locate similar working code in same codebase.
83
- 2. **Compare Against References** — Read reference implementation COMPLETELY. Don't skim.
84
- 3. **Identify Differences** — List every difference, however small. Don't assume "that can't matter."
85
- 4. **Understand Dependencies** — What components, settings, config, environment does this need?
86
-
87
- ### Phase 3: Hypothesis and Testing
88
-
89
- 1. **Form Single Hypothesis** — State clearly: "I think X is the root cause because Y." Be specific, not vague.
90
- 2. **Test Minimally** — Make the SMALLEST possible change. One variable at a time. Don't fix multiple things at once.
91
- 3. **Verify Before Continuing** — Did it work? Yes → Phase 4. No → Form NEW hypothesis. DON'T add more fixes on top.
92
-
93
- ### Phase 4: Implementation
94
-
95
- 1. **Create Failing Test Case** — Use `/skill:test-driven-development` for writing proper failing tests. MUST have before fixing.
96
-
97
- 2. **Implement Single Fix** — ONE change at a time. No "while I'm here" improvements. No bundled refactoring.
98
-
99
- 3. **Verify Fix** — Test passes? No other tests broken? Issue actually resolved?
100
-
101
- 4. **If Fix Doesn't Work:**
102
- - If < 3 attempts: Return to Phase 1, re-analyze with new information
103
- - **If ≥ 3 attempts: STOP (see below)**
104
-
105
- ### When 3+ Fixes Fail: Question Architecture
106
-
107
- **This is NOT a failed hypothesis — it's a wrong architecture.**
108
-
109
- Pattern indicating architectural problem:
110
- - Each fix reveals new shared state/coupling in different places
111
- - Fixes require "massive refactoring" to implement
112
- - Each fix creates new symptoms elsewhere
113
-
114
- **STOP and question fundamentals:**
115
- - Is this pattern fundamentally sound?
116
- - Are we sticking with it through sheer inertia?
117
- - Should we refactor architecture vs. continue fixing symptoms?
118
-
119
- **Discuss with your human partner before attempting more fixes.**
120
-
121
- ## Red Flags — STOP and Follow Process
122
-
123
- If you catch yourself thinking:
124
- - "Quick fix for now, investigate later"
125
- - "Just try changing X and see if it works"
126
- - "Add multiple changes, run tests"
127
- - "Skip the test, I'll manually verify"
128
- - "It's probably X, let me fix that"
129
- - "I don't fully understand but this might work"
130
- - "Pattern says X but I'll adapt it differently"
131
- - "Here are the main problems: [lists fixes without investigation]"
132
- - Proposing solutions before tracing data flow
133
- - **"One more fix attempt" (when already tried 2+)**
134
- - **Each fix reveals new problem in different place**
135
-
136
- **ALL of these mean: STOP. Return to Phase 1.**
137
-
138
- **If 3+ fixes failed:** Question the architecture (see above).
139
-
140
- ## Common Rationalizations
141
-
142
- | Excuse | Reality |
143
- |--------|---------|
144
- | "Issue is simple, don't need process" | Simple issues have root causes too. Process is fast for simple bugs. |
145
- | "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check thrashing. |
146
- | "Just try this first, then investigate" | First fix sets the pattern. Do it right from the start. |
147
- | "I'll write test after confirming fix works" | Untested fixes don't stick. Test first proves it. |
148
- | "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. |
149
- | "Reference too long, I'll adapt the pattern" | Partial understanding guarantees bugs. Read it completely. |
150
- | "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. |
151
- | "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. Question pattern, don't fix again. |
152
-
153
- ## When Process Reveals "No Root Cause"
154
-
155
- If investigation reveals issue is truly environmental, timing-dependent, or external:
156
- 1. Document what you investigated
157
- 2. Implement appropriate handling (retry, timeout, error message)
158
- 3. Add monitoring/logging for future investigation
159
-
160
- **But:** 95% of "no root cause" cases are incomplete investigation.
161
-
162
- ## Supporting Techniques
163
-
164
- These techniques are part of systematic debugging and available in this directory:
165
-
166
- - **`root-cause-tracing.md`** — Trace bugs backward through call stack to find original trigger
167
- - **`defense-in-depth.md`** — Add validation at multiple layers after finding root cause
168
- - **`condition-based-waiting.md`** — Replace arbitrary timeouts with condition polling
169
-
170
- Use `workflow_reference` for: `debug-rationalizations`, `debug-tracing`, `debug-defense-in-depth`, `debug-condition-waiting`