olympus-ai 4.5.13 → 4.5.14

Files changed (49)
  1. package/.claude-plugin/plugin.json +1 -1
  2. package/dist/cli/index.js +63 -27
  3. package/dist/cli/index.js.map +1 -1
  4. package/dist/hooks/olympus-hooks.cjs +257 -257
  5. package/dist/installer/hooks.d.ts +47 -14
  6. package/dist/installer/hooks.d.ts.map +1 -1
  7. package/dist/installer/hooks.js +45 -77
  8. package/dist/installer/hooks.js.map +1 -1
  9. package/dist/installer/index.d.ts +8 -7
  10. package/dist/installer/index.d.ts.map +1 -1
  11. package/dist/installer/index.js +49 -46
  12. package/dist/installer/index.js.map +1 -1
  13. package/package.json +1 -1
  14. package/resources/config/risk-keywords.json +5 -5
  15. package/resources/rules/common/ascii-diagram-standards.md +115 -115
  16. package/resources/rules/common/content-validation.md +131 -131
  17. package/resources/rules/common/error-handling.md +430 -430
  18. package/resources/rules/common/markdown-formatting.md +170 -170
  19. package/resources/rules/common/overconfidence-prevention.md +100 -100
  20. package/resources/rules/common/pathway-behaviors.json +60 -60
  21. package/resources/rules/common/pathway-behaviors.md +100 -100
  22. package/resources/rules/common/process-overview.md +157 -157
  23. package/resources/rules/common/terminal-formatting.md +161 -161
  24. package/resources/rules/common/terminology.md +189 -189
  25. package/resources/rules/common/welcome-message.md +118 -118
  26. package/resources/rules/common/workflow-changes.md +285 -285
  27. package/resources/rules/construction/bolt-planning.md +153 -153
  28. package/resources/rules/construction/bolt-review.md +143 -143
  29. package/resources/rules/construction/build-and-test.md +527 -527
  30. package/resources/rules/construction/code-generation.md +414 -414
  31. package/resources/rules/construction/documentation.md +201 -201
  32. package/resources/rules/construction/functional-design.md +135 -135
  33. package/resources/rules/construction/infrastructure-design.md +110 -110
  34. package/resources/rules/construction/nfr-design.md +106 -106
  35. package/resources/rules/construction/nfr-requirements.md +118 -118
  36. package/resources/rules/construction/test-generation.md +112 -112
  37. package/resources/rules/core-workflow.md +196 -196
  38. package/resources/rules/inception/application-design.md +195 -195
  39. package/resources/rules/inception/bolt-planning.md +588 -588
  40. package/resources/rules/inception/reverse-engineering.md +354 -354
  41. package/resources/rules/inception/units-generation.md +505 -505
  42. package/resources/rules/inception/user-stories.md +527 -527
  43. package/resources/rules/inception/workspace-detection.md +82 -82
  44. package/resources/rules/operations/operations.md +19 -19
  45. package/resources/skills/brief/templates/ai-dlc-intent-brief-template.md +149 -149
  46. package/resources/skills/getting-started/SKILL.md +79 -79
  47. package/resources/templates/construction/bolt-spec-template.md +270 -270
  48. package/resources/templates/inception/unit-brief-template.md +188 -188
  49. package/resources/templates/inception/units-template.md +99 -99
package/resources/rules/common/error-handling.md +430 -430
@@ -1,430 +1,430 @@
1
- # Error Handling and Recovery Procedures
2
-
3
- ## General Error Handling Principles
4
-
5
- ### When Errors Occur
6
- 1. **Identify the error**: Clearly state what went wrong
7
- 2. **Assess impact**: Determine if the error is blocking or can be worked around
8
- 3. **Communicate**: Inform the user about the error and options
9
- 4. **Offer solutions**: Provide clear steps to resolve or work around the error
10
- 5. **Document**: Log the error and resolution in `audit.md`
11
-
12
- ### Error Severity Levels
13
-
14
- **Critical**: Workflow cannot continue
15
- - Missing required files or artifacts
16
- - Invalid user input that cannot be processed
17
- - System errors preventing file operations
18
-
19
- **High**: Phase cannot complete as planned
20
- - Incomplete answers to required questions
21
- - Contradictory user responses
22
- - Missing dependencies from prior phases
23
-
24
- **Medium**: Phase can continue with workarounds
25
- - Optional artifacts missing
26
- - Non-critical validation failures
27
- - Partial completion possible
28
-
29
- **Low**: Minor issues that don't block progress
30
- - Formatting inconsistencies
31
- - Optional information missing
32
- - Non-blocking warnings
33
-
34
- ## Phase-Specific Error Handling
35
-
36
- ### Context Assessment Errors
37
-
38
- **Error**: Cannot read workspace files
39
- - **Cause**: Permission issues, missing directories
40
- - **Solution**: Ask user to verify workspace path and permissions
41
- - **Workaround**: Proceed with user-provided information only
42
-
43
- **Error**: Existing `aidlc-state.md` is corrupted
44
- - **Cause**: Manual editing, incomplete previous run
45
- - **Solution**: Ask user if they want to start fresh or attempt recovery
46
- - **Recovery**: Create backup, start new state file
47
-
48
- **Error**: Cannot determine required phases
49
- - **Cause**: Insufficient information from user
50
- - **Solution**: Ask clarifying questions about intent and scope
51
- - **Workaround**: Default to comprehensive execution plan
52
-
53
- ### Requirements Assessment Errors
54
-
55
- **Error**: User provides contradictory requirements
56
- - **Cause**: Unclear understanding, changing needs
57
- - **Solution**: Create follow-up questions to resolve contradictions
58
- - **Do Not Proceed**: Until contradictions are resolved
59
-
60
- **Error**: Requirements document cannot be converted
61
- - **Cause**: Unsupported format, corrupted file
62
- - **Solution**: Ask user to provide requirements in supported format
63
- - **Workaround**: Work with user's verbal description
64
-
65
- **Error**: Incomplete answers to verification questions
66
- - **Cause**: User skipped questions, unclear what to answer
67
- - **Solution**: Highlight unanswered questions, provide examples
68
- - **Do Not Proceed**: Until all required questions are answered
69
-
70
- ### Story Development Errors
71
-
72
- **Error**: Cannot map requirements to stories
73
- - **Cause**: Requirements too vague, missing functional details
74
- - **Solution**: Return to Requirements Assessment for clarification
75
- - **Workaround**: Create stories based on available information, mark as incomplete
76
-
77
- **Error**: User provides ambiguous story planning answers
78
- - **Cause**: Unclear options, complex decision
79
- - **Solution**: Add follow-up questions with specific examples
80
- - **Do Not Proceed**: Until ambiguities are resolved
81
-
82
- **Error**: Story generation plan has uncompleted steps
83
- - **Cause**: Execution interrupted, steps skipped
84
- - **Solution**: Resume from first uncompleted step
85
- - **Recovery**: Review completed steps, continue from checkpoint
86
-
87
- ### Application Design Errors
88
-
89
- **Error**: Architectural decision is unclear or contradictory
90
- - **Cause**: Ambiguous answers, conflicting requirements
91
- - **Solution**: Add follow-up questions to clarify decision
92
- - **Do Not Proceed**: Until decision is clear and documented
93
-
94
- **Error**: Cannot determine number of services/units
95
- - **Cause**: Insufficient information about boundaries
96
- - **Solution**: Ask specific questions about deployment, team structure, scaling
97
- - **Workaround**: Default to monolith, allow change later
98
-
99
- ### Design Errors
100
-
101
- **Error**: Unit dependencies are circular
102
- - **Cause**: Poor boundary definition, tight coupling
103
- - **Solution**: Identify circular dependencies, suggest refactoring
104
- - **Recovery**: Revise unit boundaries to break cycles
105
-
106
- **Error**: Unit design plan has missing steps
107
- - **Cause**: Plan generation incomplete, template error
108
- - **Solution**: Regenerate plan with all required steps
109
- - **Recovery**: Add missing steps to existing plan
110
-
111
- **Error**: Cannot generate design artifacts
112
- - **Cause**: Missing unit information, unclear requirements
113
- - **Solution**: Return to Units Planning to clarify unit definition
114
- - **Workaround**: Generate partial design, mark gaps
115
-
116
- ### NFR Implementation Errors
117
-
118
- **Error**: Technology stack choices are incompatible
119
- - **Cause**: Conflicting requirements, platform limitations
120
- - **Solution**: Highlight incompatibilities, ask user to choose
121
- - **Do Not Proceed**: Until compatible choices are made
122
-
123
- **Error**: Organizational constraints cannot be met
124
- - **Cause**: Network restrictions, security policies
125
- - **Solution**: Document constraints, ask user for workarounds
126
- - **Escalation**: May require human intervention for setup
127
-
128
- **Error**: NFR implementation step requires human action
129
- - **Cause**: AI cannot perform certain tasks (network config, credentials)
130
- - **Solution**: Clearly mark as **HUMAN TASK**, provide instructions
131
- - **Wait**: For user confirmation before proceeding
132
-
133
- ### Code Planning Errors
134
-
135
- **Error**: Code generation plan is incomplete
136
- - **Cause**: Missing design artifacts, unclear requirements
137
- - **Solution**: Return to Design phase to complete artifacts
138
- - **Recovery**: Generate plan with available information, mark gaps
139
-
140
- **Error**: Unit dependencies not satisfied
141
- - **Cause**: Dependent units not yet generated
142
- - **Solution**: Reorder generation sequence to respect dependencies
143
- - **Workaround**: Generate with stub dependencies, integrate later
144
-
145
- ### Code Generation Errors
146
-
147
- **Error**: Cannot generate code for a step
148
- - **Cause**: Insufficient design information, unclear requirements
149
- - **Solution**: Skip step, document as incomplete, continue
150
- - **Recovery**: Return to step after gathering more information
151
-
152
- **Error**: Generated code has syntax errors
153
- - **Cause**: Template issues, language-specific problems
154
- - **Solution**: Fix syntax errors, regenerate if needed
155
- - **Validation**: Verify code compiles before proceeding
156
-
157
- **Error**: Test generation fails
158
- - **Cause**: Complex logic, missing test framework setup
159
- - **Solution**: Generate basic test structure, mark for manual completion
160
- - **Workaround**: Proceed without tests, add in Operations phase
161
-
162
- ### Operations Errors
163
-
164
- **Error**: Cannot determine build tool
165
- - **Cause**: Unusual project structure, multiple build systems
166
- - **Solution**: Ask user to specify build tool and commands
167
- - **Workaround**: Provide generic instructions, user adapts
168
-
169
- **Error**: Deployment target is unclear
170
- - **Cause**: Multiple environments, complex infrastructure
171
- - **Solution**: Ask user to specify deployment targets and methods
172
- - **Workaround**: Provide instructions for common platforms
173
-
174
- ## Recovery Procedures
175
-
176
- ### Partial Phase Completion
177
-
178
- **Scenario**: Phase was interrupted mid-execution
179
-
180
- **Recovery Steps**:
181
- 1. Load the phase plan file
182
- 2. Identify last completed step (last [x] checkbox)
183
- 3. Resume from next uncompleted step
184
- 4. Verify all prior steps are actually complete
185
- 5. Continue execution normally
186
-
187
- ### Corrupted State File
188
-
189
- **Scenario**: `aidlc-state.md` is corrupted or inconsistent
190
-
191
- **Recovery Steps**:
192
- 1. Create backup: `aidlc-state.md.backup`
193
- 2. Ask user which phase they're actually on
194
- 3. Regenerate state file from scratch
195
- 4. Mark completed phases based on existing artifacts
196
- 5. Resume from current phase
197
-
198
- ### Missing Artifacts
199
-
200
- **Scenario**: Required artifacts from prior phase are missing
201
-
202
- **Recovery Steps**:
203
- 1. Identify which artifacts are missing
204
- 2. Determine if they can be regenerated
205
- 3. If yes: Return to that phase, regenerate artifacts
206
- 4. If no: Ask user to provide information manually
207
- 5. Document the gap in `audit.md`
208
-
209
- ### User Wants to Restart Phase
210
-
211
- **Scenario**: User is unhappy with phase results and wants to redo
212
-
213
- **Recovery Steps**:
214
- 1. Confirm user wants to restart (data will be lost)
215
- 2. Archive existing artifacts: `{artifact}.backup`
216
- 3. Reset phase status in `aidlc-state.md`
217
- 4. Clear phase checkboxes in plan files
218
- 5. Re-execute phase from beginning
219
-
220
- ### User Wants to Skip Phase
221
-
222
- **Scenario**: User wants to skip a phase that was planned
223
-
224
- **Recovery Steps**:
225
- 1. Confirm user understands implications
226
- 2. Document skip reason in `audit.md`
227
- 3. Mark phase as "SKIPPED" in `aidlc-state.md`
228
- 4. Proceed to next phase
229
- 5. Note: May cause issues in later phases if dependencies missing
230
-
231
- ## Escalation Guidelines
232
-
233
- ### When to Ask for User Help
234
-
235
- **Immediately**:
236
- - Contradictory or ambiguous user input
237
- - Missing required information
238
- - Technical constraints AI cannot resolve
239
- - Decisions requiring business judgment
240
-
241
- **After Attempting Resolution**:
242
- - Repeated errors in same step
243
- - Complex technical issues
244
- - Unusual project structures
245
- - Integration with external systems
246
-
247
- ### When to Suggest Starting Over
248
-
249
- **Consider Fresh Start If**:
250
- - Multiple phases have errors
251
- - State file is severely corrupted
252
- - User cannot provide missing information
253
- - Artifacts are inconsistent across phases
254
-
255
- ## Session Resumption Errors
256
-
257
- ### Missing Artifacts During Resumption
258
-
259
- **Error**: Required artifacts from previous stages are missing
260
- - **Cause**: Files deleted, moved, or never created
261
- - **Solution**:
262
- 1. Identify which stage created the missing artifacts
263
- 2. Check if stage was marked complete in aidlc-state.md
264
- 3. If marked complete but artifacts missing: Regenerate that stage
265
- 4. If not marked complete: Resume from that stage
266
- - **Recovery**: Return to the stage that creates missing artifacts and re-execute
267
-
268
- **Error**: Artifact file exists but is empty or corrupted
269
- - **Cause**: Interrupted write, manual editing, file system issues
270
- - **Solution**:
271
- 1. Create backup of corrupted file
272
- 2. Attempt to regenerate from stage that creates it
273
- 3. If cannot regenerate: Ask user for information to recreate
274
- - **Recovery**: Re-execute the stage that creates the artifact
275
-
276
- ### Inconsistent State During Resumption
277
-
278
- **Error**: aidlc-state.md shows stage complete but artifacts don't exist
279
- - **Cause**: State file updated but artifact generation failed
280
- - **Solution**:
281
- 1. Mark stage as incomplete in aidlc-state.md
282
- 2. Re-execute the stage to generate artifacts
283
- 3. Verify artifacts exist before marking complete
284
- - **Recovery**: Reset stage status and re-execute
285
-
286
- **Error**: Artifacts exist but aidlc-state.md shows stage incomplete
287
- - **Cause**: Artifact generation succeeded but state update failed
288
- - **Solution**:
289
- 1. Verify artifacts are complete and valid
290
- 2. Update aidlc-state.md to mark stage complete
291
- 3. Proceed to next stage
292
- - **Recovery**: Update state file to reflect actual completion
293
-
294
- **Error**: Multiple stages marked as "current" in aidlc-state.md
295
- - **Cause**: State file corruption, manual editing
296
- - **Solution**:
297
- 1. Review artifacts to determine actual progress
298
- 2. Ask user which stage they're actually on
299
- 3. Correct aidlc-state.md to show single current stage
300
- - **Recovery**: Rebuild state file based on existing artifacts
301
-
302
- ### Context Loading Errors
303
-
304
- **Error**: Cannot load required context from previous stages
305
- - **Cause**: Missing files, corrupted content, wrong file paths
306
- - **Solution**:
307
- 1. List which artifacts are needed for current stage
308
- 2. Check which ones are missing or corrupted
309
- 3. Regenerate missing artifacts or ask user for information
310
- - **Recovery**: Complete prerequisite stages before resuming current stage
311
-
312
- **Error**: Loaded artifacts contain contradictory information
313
- - **Cause**: Manual editing, multiple people working, incomplete updates
314
- - **Solution**:
315
- 1. Identify contradictions and present to user
316
- 2. Ask user which information is correct
317
- 3. Update artifacts to resolve contradictions
318
- - **Recovery**: Reconcile contradictions before proceeding
319
-
320
- ### Resumption Best Practices
321
-
322
- 1. **Always validate state**: Check aidlc-state.md matches actual artifacts
323
- 2. **Load incrementally**: Load artifacts stage-by-stage, validate each
324
- 3. **Fail fast**: Stop immediately if critical artifacts are missing
325
- 4. **Communicate clearly**: Tell user exactly what's missing and why it's needed
326
- 5. **Offer options**: Regenerate, provide manually, or start fresh
327
- 6. **Document recovery**: Log all recovery actions in audit.md State file is severely corrupted
328
- - User requirements have changed significantly
329
- - Architectural decision needs to be reversed
330
-
331
- **Before Starting Over**:
332
- 1. Archive all existing work
333
- 2. Document lessons learned
334
- 3. Identify what to preserve
335
- 4. Get user confirmation
336
- 5. Create new execution plan
337
-
338
- ## Logging Requirements
339
-
340
- ### Error Logging Format
341
-
342
- ```markdown
343
- ## Error - [Phase Name]
344
- **Timestamp**: [ISO timestamp]
345
- **Error Type**: [Critical/High/Medium/Low]
346
- **Description**: [What went wrong]
347
- **Cause**: [Why it happened]
348
- **Resolution**: [How it was resolved]
349
- **Impact**: [Effect on workflow]
350
-
351
- ---
352
- ```
353
-
354
- ### Recovery Logging Format
355
-
356
- ```markdown
357
- ## Recovery - [Phase Name]
358
- **Timestamp**: [ISO timestamp]
359
- **Issue**: [What needed recovery]
360
- **Recovery Steps**: [What was done]
361
- **Outcome**: [Result of recovery]
362
- **Artifacts Affected**: [List of files]
363
-
364
- ---
365
- ```
366
-
367
- ## Agent Task Failure Recovery
368
-
369
- ### When a Delegated Task Fails or Is Lost
370
-
371
- **Scenario**: A task delegated via the Task tool returns "No task found with ID", empty output, or an error.
372
-
373
- **MANDATORY Rules**:
374
- 1. **NEVER silently do the work yourself** — if you delegated to an agent, the recovery must also use delegation
375
- 2. **NEVER claim the lost agent's work was sufficient** — if you cannot retrieve output, the work is lost
376
- 3. **Retry the delegation** — re-launch the same agent with the same (or refined) prompt
377
- 4. **Maximum 2 retries** — if the agent fails 3 times total, escalate to the user
378
- 5. **Log all failures** — record the task ID, error, and retry attempts in `audit.md`
379
-
380
- **Recovery Steps**:
381
- 1. Log the failure: task ID, error message, which agent was used
382
- 2. Re-launch the agent with `run_in_background: false` (foreground) to ensure visibility
383
- 3. If retry fails: try a different agent tier (e.g., escalate from `explore` to `explore-medium`)
384
- 4. If all retries fail: inform the user and ask how to proceed
385
- 5. Document the resolution in `audit.md`
386
-
387
- **Error Logging Format**:
388
- ```markdown
389
- ## Agent Task Failure
390
- **Timestamp**: [ISO timestamp]
391
- **Task ID**: [lost task ID]
392
- **Agent**: [agent type]
393
- **Error**: [error message]
394
- **Recovery**: [retry attempt or escalation]
395
-
396
- ---
397
- ```
398
-
399
- ### Background vs Foreground Task Execution
400
-
401
- **Default**: Run delegated tasks in the **foreground** (`run_in_background: false`).
402
-
403
- Foreground execution provides:
404
- - Full visibility into agent work (tool calls, progress, decisions)
405
- - Immediate error detection and recovery
406
- - Better user experience — the user can see what's happening
407
-
408
- **Background execution** (`run_in_background: true`) should ONLY be used for:
409
- - Non-agent operations: `npm install`, `npm test`, `docker build`, etc.
410
- - Operations where output is not needed until completion
411
-
412
- **NEVER run agent tasks (Task tool with subagent_type) in the background** unless the user explicitly requests it. The risk of silent task loss outweighs the parallelism benefit.
413
-
414
- ### Parallel Foreground Execution
415
-
416
- To run multiple agents in parallel WITHOUT background mode:
417
- - Launch all agent tasks in the **same response** (multiple Task calls)
418
- - Do NOT set `run_in_background: true`
419
- - The system will execute them concurrently while maintaining visibility
420
- - Wait for all results before proceeding
421
-
422
- This gives you parallelism benefits with full transparency.
423
-
424
- ## Prevention Best Practices
425
-
426
- 1. **Validate Early**: Check inputs and dependencies before starting work
427
- 2. **Checkpoint Often**: Update checkboxes immediately after completing steps
428
- 3. **Communicate Clearly**: Explain what you're doing and why
429
- 4. **Ask Questions**: Don't assume - clarify ambiguities immediately
430
- 5. **Document Everything**: Log all decisions and changes in `audit.md`
1
+ # Error Handling and Recovery Procedures
2
+
3
+ ## General Error Handling Principles
4
+
5
+ ### When Errors Occur
6
+ 1. **Identify the error**: Clearly state what went wrong
7
+ 2. **Assess impact**: Determine if the error is blocking or can be worked around
8
+ 3. **Communicate**: Inform the user about the error and options
9
+ 4. **Offer solutions**: Provide clear steps to resolve or work around the error
10
+ 5. **Document**: Log the error and resolution in `audit.md`
11
+
12
+ ### Error Severity Levels
13
+
14
+ **Critical**: Workflow cannot continue
15
+ - Missing required files or artifacts
16
+ - Invalid user input that cannot be processed
17
+ - System errors preventing file operations
18
+
19
+ **High**: Phase cannot complete as planned
20
+ - Incomplete answers to required questions
21
+ - Contradictory user responses
22
+ - Missing dependencies from prior phases
23
+
24
+ **Medium**: Phase can continue with workarounds
25
+ - Optional artifacts missing
26
+ - Non-critical validation failures
27
+ - Partial completion possible
28
+
29
+ **Low**: Minor issues that don't block progress
30
+ - Formatting inconsistencies
31
+ - Optional information missing
32
+ - Non-blocking warnings
33
+
34
+ ## Phase-Specific Error Handling
35
+
36
+ ### Context Assessment Errors
37
+
38
+ **Error**: Cannot read workspace files
39
+ - **Cause**: Permission issues, missing directories
40
+ - **Solution**: Ask user to verify workspace path and permissions
41
+ - **Workaround**: Proceed with user-provided information only
42
+
43
+ **Error**: Existing `aidlc-state.md` is corrupted
44
+ - **Cause**: Manual editing, incomplete previous run
45
+ - **Solution**: Ask user if they want to start fresh or attempt recovery
46
+ - **Recovery**: Create backup, start new state file
47
+
48
+ **Error**: Cannot determine required phases
49
+ - **Cause**: Insufficient information from user
50
+ - **Solution**: Ask clarifying questions about intent and scope
51
+ - **Workaround**: Default to comprehensive execution plan
52
+
53
+ ### Requirements Assessment Errors
54
+
55
+ **Error**: User provides contradictory requirements
56
+ - **Cause**: Unclear understanding, changing needs
57
+ - **Solution**: Create follow-up questions to resolve contradictions
58
+ - **Do Not Proceed**: Until contradictions are resolved
59
+
60
+ **Error**: Requirements document cannot be converted
61
+ - **Cause**: Unsupported format, corrupted file
62
+ - **Solution**: Ask user to provide requirements in supported format
63
+ - **Workaround**: Work with user's verbal description
64
+
65
+ **Error**: Incomplete answers to verification questions
66
+ - **Cause**: User skipped questions, unclear what to answer
67
+ - **Solution**: Highlight unanswered questions, provide examples
68
+ - **Do Not Proceed**: Until all required questions are answered
69
+
70
+ ### Story Development Errors
71
+
72
+ **Error**: Cannot map requirements to stories
73
+ - **Cause**: Requirements too vague, missing functional details
74
+ - **Solution**: Return to Requirements Assessment for clarification
75
+ - **Workaround**: Create stories based on available information, mark as incomplete
76
+
77
+ **Error**: User provides ambiguous story planning answers
78
+ - **Cause**: Unclear options, complex decision
79
+ - **Solution**: Add follow-up questions with specific examples
80
+ - **Do Not Proceed**: Until ambiguities are resolved
81
+
82
+ **Error**: Story generation plan has uncompleted steps
83
+ - **Cause**: Execution interrupted, steps skipped
84
+ - **Solution**: Resume from first uncompleted step
85
+ - **Recovery**: Review completed steps, continue from checkpoint
86
+
87
+ ### Application Design Errors
88
+
89
+ **Error**: Architectural decision is unclear or contradictory
90
+ - **Cause**: Ambiguous answers, conflicting requirements
91
+ - **Solution**: Add follow-up questions to clarify decision
92
+ - **Do Not Proceed**: Until decision is clear and documented
93
+
94
+ **Error**: Cannot determine number of services/units
95
+ - **Cause**: Insufficient information about boundaries
96
+ - **Solution**: Ask specific questions about deployment, team structure, scaling
97
+ - **Workaround**: Default to monolith, allow change later
98
+
99
+ ### Design Errors
100
+
101
+ **Error**: Unit dependencies are circular
102
+ - **Cause**: Poor boundary definition, tight coupling
103
+ - **Solution**: Identify circular dependencies, suggest refactoring
104
+ - **Recovery**: Revise unit boundaries to break cycles
105
+
106
+ **Error**: Unit design plan has missing steps
107
+ - **Cause**: Plan generation incomplete, template error
108
+ - **Solution**: Regenerate plan with all required steps
109
+ - **Recovery**: Add missing steps to existing plan
110
+
111
+ **Error**: Cannot generate design artifacts
112
+ - **Cause**: Missing unit information, unclear requirements
113
+ - **Solution**: Return to Units Planning to clarify unit definition
114
+ - **Workaround**: Generate partial design, mark gaps
115
+
116
+ ### NFR Implementation Errors
117
+
118
+ **Error**: Technology stack choices are incompatible
119
+ - **Cause**: Conflicting requirements, platform limitations
120
+ - **Solution**: Highlight incompatibilities, ask user to choose
121
+ - **Do Not Proceed**: Until compatible choices are made
122
+
123
+ **Error**: Organizational constraints cannot be met
124
+ - **Cause**: Network restrictions, security policies
125
+ - **Solution**: Document constraints, ask user for workarounds
126
+ - **Escalation**: May require human intervention for setup
127
+
128
+ **Error**: NFR implementation step requires human action
129
+ - **Cause**: AI cannot perform certain tasks (network config, credentials)
130
+ - **Solution**: Clearly mark as **HUMAN TASK**, provide instructions
131
+ - **Wait**: For user confirmation before proceeding
132
+
133
+ ### Code Planning Errors
134
+
135
+ **Error**: Code generation plan is incomplete
136
+ - **Cause**: Missing design artifacts, unclear requirements
137
+ - **Solution**: Return to Design phase to complete artifacts
138
+ - **Recovery**: Generate plan with available information, mark gaps
139
+
140
+ **Error**: Unit dependencies not satisfied
141
+ - **Cause**: Dependent units not yet generated
142
+ - **Solution**: Reorder generation sequence to respect dependencies
143
+ - **Workaround**: Generate with stub dependencies, integrate later
144
+
145
+ ### Code Generation Errors
146
+
147
+ **Error**: Cannot generate code for a step
148
+ - **Cause**: Insufficient design information, unclear requirements
149
+ - **Solution**: Skip step, document as incomplete, continue
150
+ - **Recovery**: Return to step after gathering more information
151
+
152
+ **Error**: Generated code has syntax errors
153
+ - **Cause**: Template issues, language-specific problems
154
+ - **Solution**: Fix syntax errors, regenerate if needed
155
+ - **Validation**: Verify code compiles before proceeding
156
+
157
+ **Error**: Test generation fails
158
+ - **Cause**: Complex logic, missing test framework setup
159
+ - **Solution**: Generate basic test structure, mark for manual completion
160
+ - **Workaround**: Proceed without tests, add in Operations phase
161
+
162
+ ### Operations Errors
163
+
164
+ **Error**: Cannot determine build tool
165
+ - **Cause**: Unusual project structure, multiple build systems
166
+ - **Solution**: Ask user to specify build tool and commands
167
+ - **Workaround**: Provide generic instructions, user adapts
168
+
169
+ **Error**: Deployment target is unclear
170
+ - **Cause**: Multiple environments, complex infrastructure
171
+ - **Solution**: Ask user to specify deployment targets and methods
172
+ - **Workaround**: Provide instructions for common platforms
173
+
174
+ ## Recovery Procedures
175
+
176
+ ### Partial Phase Completion
177
+
178
+ **Scenario**: Phase was interrupted mid-execution
179
+
180
+ **Recovery Steps**:
181
+ 1. Load the phase plan file
182
+ 2. Identify last completed step (last [x] checkbox)
183
+ 3. Resume from next uncompleted step
184
+ 4. Verify all prior steps are actually complete
185
+ 5. Continue execution normally
186
+
187
+ ### Corrupted State File
188
+
189
+ **Scenario**: `aidlc-state.md` is corrupted or inconsistent
190
+
191
+ **Recovery Steps**:
192
+ 1. Create backup: `aidlc-state.md.backup`
193
+ 2. Ask user which phase they're actually on
194
+ 3. Regenerate state file from scratch
195
+ 4. Mark completed phases based on existing artifacts
196
+ 5. Resume from current phase
197
+
198
+ ### Missing Artifacts
199
+
200
+ **Scenario**: Required artifacts from prior phase are missing
201
+
202
+ **Recovery Steps**:
203
+ 1. Identify which artifacts are missing
204
+ 2. Determine if they can be regenerated
205
+ 3. If yes: Return to that phase, regenerate artifacts
206
+ 4. If no: Ask user to provide information manually
207
+ 5. Document the gap in `audit.md`
208
+
209
+ ### User Wants to Restart Phase
210
+
211
+ **Scenario**: User is unhappy with phase results and wants to redo
212
+
213
+ **Recovery Steps**:
214
+ 1. Confirm user wants to restart (data will be lost)
215
+ 2. Archive existing artifacts: `{artifact}.backup`
216
+ 3. Reset phase status in `aidlc-state.md`
217
+ 4. Clear phase checkboxes in plan files
218
+ 5. Re-execute phase from beginning
219
+
220
+ ### User Wants to Skip Phase
221
+
222
+ **Scenario**: User wants to skip a phase that was planned
223
+
224
+ **Recovery Steps**:
225
+ 1. Confirm user understands implications
226
+ 2. Document skip reason in `audit.md`
227
+ 3. Mark phase as "SKIPPED" in `aidlc-state.md`
228
+ 4. Proceed to next phase
229
+ 5. Note: May cause issues in later phases if dependencies missing
230
+
231
+ ## Escalation Guidelines
232
+
233
+ ### When to Ask for User Help
234
+
235
+ **Immediately**:
236
+ - Contradictory or ambiguous user input
237
+ - Missing required information
238
+ - Technical constraints AI cannot resolve
239
+ - Decisions requiring business judgment
240
+
241
+ **After Attempting Resolution**:
242
+ - Repeated errors in same step
243
+ - Complex technical issues
244
+ - Unusual project structures
245
+ - Integration with external systems
246
+
247
+ ### When to Suggest Starting Over
248
+
249
+ **Consider Fresh Start If**:
250
+ - Multiple phases have errors
251
+ - State file is severely corrupted
252
+ - User cannot provide missing information
253
+ - Artifacts are inconsistent across phases
254
+
255
+ ## Session Resumption Errors
256
+
257
+ ### Missing Artifacts During Resumption
258
+
259
+ **Error**: Required artifacts from previous stages are missing
260
+ - **Cause**: Files deleted, moved, or never created
261
+ - **Solution**:
262
+ 1. Identify which stage created the missing artifacts
263
+ 2. Check if stage was marked complete in `aidlc-state.md`
264
+ 3. If marked complete but artifacts missing: Regenerate that stage
265
+ 4. If not marked complete: Resume from that stage
266
+ - **Recovery**: Return to the stage that creates missing artifacts and re-execute
267
+
268
+ **Error**: Artifact file exists but is empty or corrupted
269
+ - **Cause**: Interrupted write, manual editing, file system issues
270
+ - **Solution**:
271
+ 1. Create backup of corrupted file
272
+ 2. Attempt to regenerate from stage that creates it
273
+ 3. If cannot regenerate: Ask user for information to recreate
274
+ - **Recovery**: Re-execute the stage that creates the artifact
275
+
276
+ ### Inconsistent State During Resumption
277
+
278
+ **Error**: `aidlc-state.md` shows stage complete but artifacts don't exist
279
+ - **Cause**: State file updated but artifact generation failed
280
+ - **Solution**:
281
+ 1. Mark stage as incomplete in `aidlc-state.md`
282
+ 2. Re-execute the stage to generate artifacts
283
+ 3. Verify artifacts exist before marking complete
284
+ - **Recovery**: Reset stage status and re-execute
285
+
286
+ **Error**: Artifacts exist but `aidlc-state.md` shows stage incomplete
287
+ - **Cause**: Artifact generation succeeded but state update failed
288
+ - **Solution**:
289
+ 1. Verify artifacts are complete and valid
290
+ 2. Update `aidlc-state.md` to mark stage complete
291
+ 3. Proceed to next stage
292
+ - **Recovery**: Update state file to reflect actual completion
293
+
294
+ **Error**: Multiple stages marked as "current" in `aidlc-state.md`
295
+ - **Cause**: State file corruption, manual editing
296
+ - **Solution**:
297
+ 1. Review artifacts to determine actual progress
298
+ 2. Ask user which stage they're actually on
299
+ 3. Correct `aidlc-state.md` to show single current stage
300
+ - **Recovery**: Rebuild state file based on existing artifacts
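The three inconsistency cases above reduce to a comparison between what the state file claims and what exists on disk. The sketch below assumes the state file has already been parsed into simple records; `StageStatus` and `diagnose` are hypothetical names, and the real `aidlc-state.md` layout is not assumed here.

```python
from dataclasses import dataclass


@dataclass
class StageStatus:
    name: str
    marked_complete: bool      # what the state file claims
    artifacts_exist: bool      # what is actually on disk
    is_current: bool = False   # stage flagged as "current" in the state file


def diagnose(stages: list[StageStatus]) -> list[str]:
    """Return one recommended action per inconsistency found."""
    actions = []
    for s in stages:
        if s.marked_complete and not s.artifacts_exist:
            actions.append(f"{s.name}: mark incomplete and re-execute")
        elif s.artifacts_exist and not s.marked_complete:
            actions.append(f"{s.name}: verify artifacts, then mark complete")
    current = [s.name for s in stages if s.is_current]
    if len(current) > 1:
        actions.append("ask user which stage is current: " + ", ".join(current))
    return actions
```

An empty result means the state file and the artifacts agree, so resumption can proceed.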
301
+
302
+ ### Context Loading Errors
303
+
304
+ **Error**: Cannot load required context from previous stages
305
+ - **Cause**: Missing files, corrupted content, wrong file paths
306
+ - **Solution**:
307
+ 1. List which artifacts are needed for current stage
308
+ 2. Check which ones are missing or corrupted
309
+ 3. Regenerate missing artifacts or ask user for information
310
+ - **Recovery**: Complete prerequisite stages before resuming current stage
311
+
312
+ **Error**: Loaded artifacts contain contradictory information
313
+ - **Cause**: Manual editing, multiple people working, incomplete updates
314
+ - **Solution**:
315
+ 1. Identify contradictions and present to user
316
+ 2. Ask user which information is correct
317
+ 3. Update artifacts to resolve contradictions
318
+ - **Recovery**: Reconcile contradictions before proceeding
319
+
320
+ ### Resumption Best Practices
321
+
322
+ 1. **Always validate state**: Check `aidlc-state.md` matches actual artifacts
323
+ 2. **Load incrementally**: Load artifacts stage-by-stage, validate each
324
+ 3. **Fail fast**: Stop immediately if critical artifacts are missing
325
+ 4. **Communicate clearly**: Tell user exactly what's missing and why it's needed
326
+ 5. **Offer options**: Regenerate, provide manually, or start fresh
327
+ 6. **Document recovery**: Log all recovery actions in `audit.md`
328
+
329
+ **Also Consider Fresh Start If**: user requirements have changed significantly, or an architectural decision needs to be reversed
330
+
331
+ **Before Starting Over**:
332
+ 1. Archive all existing work
333
+ 2. Document lessons learned
334
+ 3. Identify what to preserve
335
+ 4. Get user confirmation
336
+ 5. Create new execution plan
337
+
338
+ ## Logging Requirements
339
+
340
+ ### Error Logging Format
341
+
342
+ ```markdown
343
+ ## Error - [Phase Name]
344
+ **Timestamp**: [ISO timestamp]
345
+ **Error Type**: [Critical/High/Medium/Low]
346
+ **Description**: [What went wrong]
347
+ **Cause**: [Why it happened]
348
+ **Resolution**: [How it was resolved]
349
+ **Impact**: [Effect on workflow]
350
+
351
+ ---
352
+ ```
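As an illustration of the template above, an entry could be rendered like this. It is a sketch, not the package's implementation: `format_error_entry` is a hypothetical helper, and rejecting unknown severities is an added assumption.

```python
from datetime import datetime, timezone
from typing import Optional

SEVERITIES = ("Critical", "High", "Medium", "Low")


def format_error_entry(
    phase: str,
    severity: str,
    description: str,
    cause: str,
    resolution: str,
    impact: str,
    timestamp: Optional[datetime] = None,
) -> str:
    """Render one entry in the Error Logging Format shown above."""
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {SEVERITIES}")
    # Default to the current time in UTC, formatted as an ISO 8601 string.
    ts = (timestamp or datetime.now(timezone.utc)).isoformat()
    lines = [
        f"## Error - {phase}",
        f"**Timestamp**: {ts}",
        f"**Error Type**: {severity}",
        f"**Description**: {description}",
        f"**Cause**: {cause}",
        f"**Resolution**: {resolution}",
        f"**Impact**: {impact}",
        "",
        "---",
        "",
    ]
    return "\n".join(lines)
```

The returned string can be appended to `audit.md` as is; the trailing `---` keeps entries visually separated.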
353
+
354
+ ### Recovery Logging Format
355
+
356
+ ```markdown
357
+ ## Recovery - [Phase Name]
358
+ **Timestamp**: [ISO timestamp]
359
+ **Issue**: [What needed recovery]
360
+ **Recovery Steps**: [What was done]
361
+ **Outcome**: [Result of recovery]
362
+ **Artifacts Affected**: [List of files]
363
+
364
+ ---
365
+ ```
366
+
367
+ ## Agent Task Failure Recovery
368
+
369
+ ### When a Delegated Task Fails or Is Lost
370
+
371
+ **Scenario**: A task delegated via the Task tool returns "No task found with ID", empty output, or an error.
372
+
373
+ **MANDATORY Rules**:
374
+ 1. **NEVER silently do the work yourself** — if you delegated to an agent, the recovery must also use delegation
375
+ 2. **NEVER claim the lost agent's work was sufficient** — if you cannot retrieve output, the work is lost
376
+ 3. **Retry the delegation** — re-launch the same agent with the same (or refined) prompt
377
+ 4. **Maximum 2 retries** — if the agent fails 3 times total, escalate to the user
378
+ 5. **Log all failures** — record the task ID, error, and retry attempts in `audit.md`
379
+
380
+ **Recovery Steps**:
381
+ 1. Log the failure: task ID, error message, which agent was used
382
+ 2. Re-launch the agent with `run_in_background: false` (foreground) to ensure visibility
383
+ 3. If retry fails: try a different agent tier (e.g., escalate from `explore` to `explore-medium`)
384
+ 4. If all retries fail: inform the user and ask how to proceed
385
+ 5. Document the resolution in `audit.md`
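The retry budget and tier escalation described above can be expressed as a small loop. This is a sketch under stated assumptions: `launch` is a hypothetical stand-in for whatever actually delegates the task (it is not the Task tool's API), `TaskFailed` is an invented exception, and `log` stands in for writing to `audit.md`.

```python
from typing import Callable, Optional


class TaskFailed(Exception):
    """Raised by the hypothetical launcher when a task errors or is lost."""


def delegate_with_retries(
    launch: Callable[[str], str],
    agent: str,
    fallback_agent: str,
    log: Callable[[str], None] = print,
) -> Optional[str]:
    """Delegate a task with at most 2 retries (3 attempts in total).

    Attempt 1 uses `agent`, retry 1 uses `agent` again, and retry 2 uses
    `fallback_agent` (a different tier). Returns None when every attempt
    fails, in which case the caller should escalate to the user.
    """
    attempts = [agent, agent, fallback_agent]
    for number, name in enumerate(attempts, start=1):
        try:
            output = launch(name)
        except TaskFailed as exc:
            log(f"attempt {number} with {name} failed: {exc}")
            continue
        if output:
            return output
        log(f"attempt {number} with {name} returned empty output")
    return None
```

The function never performs the work itself; every attempt goes back through `launch`, which mirrors the rule that recovery must also use delegation.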
386
+
387
+ **Error Logging Format**:
388
+ ```markdown
389
+ ## Agent Task Failure
390
+ **Timestamp**: [ISO timestamp]
391
+ **Task ID**: [lost task ID]
392
+ **Agent**: [agent type]
393
+ **Error**: [error message]
394
+ **Recovery**: [retry attempt or escalation]
395
+
396
+ ---
397
+ ```
398
+
399
+ ### Background vs Foreground Task Execution
400
+
401
+ **Default**: Run delegated tasks in the **foreground** (`run_in_background: false`).
402
+
403
+ Foreground execution provides:
404
+ - Full visibility into agent work (tool calls, progress, decisions)
405
+ - Immediate error detection and recovery
406
+ - Better user experience — the user can see what's happening
407
+
408
+ **Background execution** (`run_in_background: true`) should ONLY be used for:
409
+ - Non-agent operations: `npm install`, `npm test`, `docker build`, etc.
410
+ - Operations where output is not needed until completion
411
+
412
+ **NEVER run agent tasks (Task tool with subagent_type) in the background** unless the user explicitly requests it. The risk of silent task loss outweighs the parallelism benefit.
413
+
414
+ ### Parallel Foreground Execution
415
+
416
+ To run multiple agents in parallel WITHOUT background mode:
417
+ - Launch all agent tasks in the **same response** (multiple Task calls)
418
+ - Do NOT set `run_in_background: true`
419
+ - The system will execute them concurrently while maintaining visibility
420
+ - Wait for all results before proceeding
421
+
422
+ This gives you parallelism benefits with full transparency.
423
+
424
+ ## Prevention Best Practices
425
+
426
+ 1. **Validate Early**: Check inputs and dependencies before starting work
427
+ 2. **Checkpoint Often**: Update checkboxes immediately after completing steps
428
+ 3. **Communicate Clearly**: Explain what you're doing and why
429
+ 4. **Ask Questions**: Don't assume; clarify ambiguities immediately
430
+ 5. **Document Everything**: Log all decisions and changes in `audit.md`