olympus-ai 4.5.13 → 4.5.14
- package/.claude-plugin/plugin.json +1 -1
- package/dist/cli/index.js +63 -27
- package/dist/cli/index.js.map +1 -1
- package/dist/hooks/olympus-hooks.cjs +257 -257
- package/dist/installer/hooks.d.ts +47 -14
- package/dist/installer/hooks.d.ts.map +1 -1
- package/dist/installer/hooks.js +45 -77
- package/dist/installer/hooks.js.map +1 -1
- package/dist/installer/index.d.ts +8 -7
- package/dist/installer/index.d.ts.map +1 -1
- package/dist/installer/index.js +49 -46
- package/dist/installer/index.js.map +1 -1
- package/package.json +1 -1
- package/resources/config/risk-keywords.json +5 -5
- package/resources/rules/common/ascii-diagram-standards.md +115 -115
- package/resources/rules/common/content-validation.md +131 -131
- package/resources/rules/common/error-handling.md +430 -430
- package/resources/rules/common/markdown-formatting.md +170 -170
- package/resources/rules/common/overconfidence-prevention.md +100 -100
- package/resources/rules/common/pathway-behaviors.json +60 -60
- package/resources/rules/common/pathway-behaviors.md +100 -100
- package/resources/rules/common/process-overview.md +157 -157
- package/resources/rules/common/terminal-formatting.md +161 -161
- package/resources/rules/common/terminology.md +189 -189
- package/resources/rules/common/welcome-message.md +118 -118
- package/resources/rules/common/workflow-changes.md +285 -285
- package/resources/rules/construction/bolt-planning.md +153 -153
- package/resources/rules/construction/bolt-review.md +143 -143
- package/resources/rules/construction/build-and-test.md +527 -527
- package/resources/rules/construction/code-generation.md +414 -414
- package/resources/rules/construction/documentation.md +201 -201
- package/resources/rules/construction/functional-design.md +135 -135
- package/resources/rules/construction/infrastructure-design.md +110 -110
- package/resources/rules/construction/nfr-design.md +106 -106
- package/resources/rules/construction/nfr-requirements.md +118 -118
- package/resources/rules/construction/test-generation.md +112 -112
- package/resources/rules/core-workflow.md +196 -196
- package/resources/rules/inception/application-design.md +195 -195
- package/resources/rules/inception/bolt-planning.md +588 -588
- package/resources/rules/inception/reverse-engineering.md +354 -354
- package/resources/rules/inception/units-generation.md +505 -505
- package/resources/rules/inception/user-stories.md +527 -527
- package/resources/rules/inception/workspace-detection.md +82 -82
- package/resources/rules/operations/operations.md +19 -19
- package/resources/skills/brief/templates/ai-dlc-intent-brief-template.md +149 -149
- package/resources/skills/getting-started/SKILL.md +79 -79
- package/resources/templates/construction/bolt-spec-template.md +270 -270
- package/resources/templates/inception/unit-brief-template.md +188 -188
- package/resources/templates/inception/units-template.md +99 -99
|
@@ -1,430 +1,430 @@
|
|
|
1
|
-
# Error Handling and Recovery Procedures
|
|
2
|
-
|
|
3
|
-
## General Error Handling Principles
|
|
4
|
-
|
|
5
|
-
### When Errors Occur
|
|
6
|
-
1. **Identify the error**: Clearly state what went wrong
|
|
7
|
-
2. **Assess impact**: Determine if the error is blocking or can be worked around
|
|
8
|
-
3. **Communicate**: Inform the user about the error and options
|
|
9
|
-
4. **Offer solutions**: Provide clear steps to resolve or work around the error
|
|
10
|
-
5. **Document**: Log the error and resolution in `audit.md`
|
|
11
|
-
|
|
12
|
-
### Error Severity Levels
|
|
13
|
-
|
|
14
|
-
**Critical**: Workflow cannot continue
|
|
15
|
-
- Missing required files or artifacts
|
|
16
|
-
- Invalid user input that cannot be processed
|
|
17
|
-
- System errors preventing file operations
|
|
18
|
-
|
|
19
|
-
**High**: Phase cannot complete as planned
|
|
20
|
-
- Incomplete answers to required questions
|
|
21
|
-
- Contradictory user responses
|
|
22
|
-
- Missing dependencies from prior phases
|
|
23
|
-
|
|
24
|
-
**Medium**: Phase can continue with workarounds
|
|
25
|
-
- Optional artifacts missing
|
|
26
|
-
- Non-critical validation failures
|
|
27
|
-
- Partial completion possible
|
|
28
|
-
|
|
29
|
-
**Low**: Minor issues that don't block progress
|
|
30
|
-
- Formatting inconsistencies
|
|
31
|
-
- Optional information missing
|
|
32
|
-
- Non-blocking warnings
|
|
33
|
-
|
|
34
|
-
## Phase-Specific Error Handling
|
|
35
|
-
|
|
36
|
-
### Context Assessment Errors
|
|
37
|
-
|
|
38
|
-
**Error**: Cannot read workspace files
|
|
39
|
-
- **Cause**: Permission issues, missing directories
|
|
40
|
-
- **Solution**: Ask user to verify workspace path and permissions
|
|
41
|
-
- **Workaround**: Proceed with user-provided information only
|
|
42
|
-
|
|
43
|
-
**Error**: Existing `aidlc-state.md` is corrupted
|
|
44
|
-
- **Cause**: Manual editing, incomplete previous run
|
|
45
|
-
- **Solution**: Ask user if they want to start fresh or attempt recovery
|
|
46
|
-
- **Recovery**: Create backup, start new state file
|
|
47
|
-
|
|
48
|
-
**Error**: Cannot determine required phases
|
|
49
|
-
- **Cause**: Insufficient information from user
|
|
50
|
-
- **Solution**: Ask clarifying questions about intent and scope
|
|
51
|
-
- **Workaround**: Default to comprehensive execution plan
|
|
52
|
-
|
|
53
|
-
### Requirements Assessment Errors
|
|
54
|
-
|
|
55
|
-
**Error**: User provides contradictory requirements
|
|
56
|
-
- **Cause**: Unclear understanding, changing needs
|
|
57
|
-
- **Solution**: Create follow-up questions to resolve contradictions
|
|
58
|
-
- **Do Not Proceed**: Until contradictions are resolved
|
|
59
|
-
|
|
60
|
-
**Error**: Requirements document cannot be converted
|
|
61
|
-
- **Cause**: Unsupported format, corrupted file
|
|
62
|
-
- **Solution**: Ask user to provide requirements in supported format
|
|
63
|
-
- **Workaround**: Work with user's verbal description
|
|
64
|
-
|
|
65
|
-
**Error**: Incomplete answers to verification questions
|
|
66
|
-
- **Cause**: User skipped questions, unclear what to answer
|
|
67
|
-
- **Solution**: Highlight unanswered questions, provide examples
|
|
68
|
-
- **Do Not Proceed**: Until all required questions are answered
|
|
69
|
-
|
|
70
|
-
### Story Development Errors
|
|
71
|
-
|
|
72
|
-
**Error**: Cannot map requirements to stories
|
|
73
|
-
- **Cause**: Requirements too vague, missing functional details
|
|
74
|
-
- **Solution**: Return to Requirements Assessment for clarification
|
|
75
|
-
- **Workaround**: Create stories based on available information, mark as incomplete
|
|
76
|
-
|
|
77
|
-
**Error**: User provides ambiguous story planning answers
|
|
78
|
-
- **Cause**: Unclear options, complex decision
|
|
79
|
-
- **Solution**: Add follow-up questions with specific examples
|
|
80
|
-
- **Do Not Proceed**: Until ambiguities are resolved
|
|
81
|
-
|
|
82
|
-
**Error**: Story generation plan has uncompleted steps
|
|
83
|
-
- **Cause**: Execution interrupted, steps skipped
|
|
84
|
-
- **Solution**: Resume from first uncompleted step
|
|
85
|
-
- **Recovery**: Review completed steps, continue from checkpoint
|
|
86
|
-
|
|
87
|
-
### Application Design Errors
|
|
88
|
-
|
|
89
|
-
**Error**: Architectural decision is unclear or contradictory
|
|
90
|
-
- **Cause**: Ambiguous answers, conflicting requirements
|
|
91
|
-
- **Solution**: Add follow-up questions to clarify decision
|
|
92
|
-
- **Do Not Proceed**: Until decision is clear and documented
|
|
93
|
-
|
|
94
|
-
**Error**: Cannot determine number of services/units
|
|
95
|
-
- **Cause**: Insufficient information about boundaries
|
|
96
|
-
- **Solution**: Ask specific questions about deployment, team structure, scaling
|
|
97
|
-
- **Workaround**: Default to monolith, allow change later
|
|
98
|
-
|
|
99
|
-
### Design Errors
|
|
100
|
-
|
|
101
|
-
**Error**: Unit dependencies are circular
|
|
102
|
-
- **Cause**: Poor boundary definition, tight coupling
|
|
103
|
-
- **Solution**: Identify circular dependencies, suggest refactoring
|
|
104
|
-
- **Recovery**: Revise unit boundaries to break cycles
|
|
105
|
-
|
|
106
|
-
**Error**: Unit design plan has missing steps
|
|
107
|
-
- **Cause**: Plan generation incomplete, template error
|
|
108
|
-
- **Solution**: Regenerate plan with all required steps
|
|
109
|
-
- **Recovery**: Add missing steps to existing plan
|
|
110
|
-
|
|
111
|
-
**Error**: Cannot generate design artifacts
|
|
112
|
-
- **Cause**: Missing unit information, unclear requirements
|
|
113
|
-
- **Solution**: Return to Units Planning to clarify unit definition
|
|
114
|
-
- **Workaround**: Generate partial design, mark gaps
|
|
115
|
-
|
|
116
|
-
### NFR Implementation Errors
|
|
117
|
-
|
|
118
|
-
**Error**: Technology stack choices are incompatible
|
|
119
|
-
- **Cause**: Conflicting requirements, platform limitations
|
|
120
|
-
- **Solution**: Highlight incompatibilities, ask user to choose
|
|
121
|
-
- **Do Not Proceed**: Until compatible choices are made
|
|
122
|
-
|
|
123
|
-
**Error**: Organizational constraints cannot be met
|
|
124
|
-
- **Cause**: Network restrictions, security policies
|
|
125
|
-
- **Solution**: Document constraints, ask user for workarounds
|
|
126
|
-
- **Escalation**: May require human intervention for setup
|
|
127
|
-
|
|
128
|
-
**Error**: NFR implementation step requires human action
|
|
129
|
-
- **Cause**: AI cannot perform certain tasks (network config, credentials)
|
|
130
|
-
- **Solution**: Clearly mark as **HUMAN TASK**, provide instructions
|
|
131
|
-
- **Wait**: For user confirmation before proceeding
|
|
132
|
-
|
|
133
|
-
### Code Planning Errors
|
|
134
|
-
|
|
135
|
-
**Error**: Code generation plan is incomplete
|
|
136
|
-
- **Cause**: Missing design artifacts, unclear requirements
|
|
137
|
-
- **Solution**: Return to Design phase to complete artifacts
|
|
138
|
-
- **Recovery**: Generate plan with available information, mark gaps
|
|
139
|
-
|
|
140
|
-
**Error**: Unit dependencies not satisfied
|
|
141
|
-
- **Cause**: Dependent units not yet generated
|
|
142
|
-
- **Solution**: Reorder generation sequence to respect dependencies
|
|
143
|
-
- **Workaround**: Generate with stub dependencies, integrate later
|
|
144
|
-
|
|
145
|
-
### Code Generation Errors
|
|
146
|
-
|
|
147
|
-
**Error**: Cannot generate code for a step
|
|
148
|
-
- **Cause**: Insufficient design information, unclear requirements
|
|
149
|
-
- **Solution**: Skip step, document as incomplete, continue
|
|
150
|
-
- **Recovery**: Return to step after gathering more information
|
|
151
|
-
|
|
152
|
-
**Error**: Generated code has syntax errors
|
|
153
|
-
- **Cause**: Template issues, language-specific problems
|
|
154
|
-
- **Solution**: Fix syntax errors, regenerate if needed
|
|
155
|
-
- **Validation**: Verify code compiles before proceeding
|
|
156
|
-
|
|
157
|
-
**Error**: Test generation fails
|
|
158
|
-
- **Cause**: Complex logic, missing test framework setup
|
|
159
|
-
- **Solution**: Generate basic test structure, mark for manual completion
|
|
160
|
-
- **Workaround**: Proceed without tests, add in Operations phase
|
|
161
|
-
|
|
162
|
-
### Operations Errors
|
|
163
|
-
|
|
164
|
-
**Error**: Cannot determine build tool
|
|
165
|
-
- **Cause**: Unusual project structure, multiple build systems
|
|
166
|
-
- **Solution**: Ask user to specify build tool and commands
|
|
167
|
-
- **Workaround**: Provide generic instructions, user adapts
|
|
168
|
-
|
|
169
|
-
**Error**: Deployment target is unclear
|
|
170
|
-
- **Cause**: Multiple environments, complex infrastructure
|
|
171
|
-
- **Solution**: Ask user to specify deployment targets and methods
|
|
172
|
-
- **Workaround**: Provide instructions for common platforms
|
|
173
|
-
|
|
174
|
-
## Recovery Procedures
|
|
175
|
-
|
|
176
|
-
### Partial Phase Completion
|
|
177
|
-
|
|
178
|
-
**Scenario**: Phase was interrupted mid-execution
|
|
179
|
-
|
|
180
|
-
**Recovery Steps**:
|
|
181
|
-
1. Load the phase plan file
|
|
182
|
-
2. Identify last completed step (last [x] checkbox)
|
|
183
|
-
3. Resume from next uncompleted step
|
|
184
|
-
4. Verify all prior steps are actually complete
|
|
185
|
-
5. Continue execution normally
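
The recovery steps above can be sketched in Python. This is an illustrative sketch, not part of the package: it assumes the plan file uses Markdown task-list checkboxes (`- [x]` / `- [ ]`), and the name `find_resume_point` and the sample plan are hypothetical. It resumes at the first unchecked box, which is the more conservative reading when a skipped step is followed by completed ones.

```python
import re

# Matches Markdown task-list items such as "- [x] Step" or "- [ ] Step".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def find_resume_point(plan_text):
    """Return (completed_steps, next_step); next_step is None when all are done."""
    completed = []
    for line in plan_text.splitlines():
        match = CHECKBOX.match(line)
        if not match:
            continue  # Not a checkbox line (heading, prose, blank line).
        mark, label = match.groups()
        if mark == " ":
            # First unchecked box: this is where execution resumes.
            return completed, label.strip()
        completed.append(label.strip())
    return completed, None


# Hypothetical plan content, for illustration only.
plan = """\
- [x] Load requirements
- [x] Draft personas
- [ ] Generate stories
- [ ] Review with user
"""

done, next_step = find_resume_point(plan)
print("Completed:", done)
print("Resume at:", next_step)
```

The returned `done` list is what step 4 would verify against the actual artifacts before continuing.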
|
|
186
|
-
|
|
187
|
-
### Corrupted State File
|
|
188
|
-
|
|
189
|
-
**Scenario**: `aidlc-state.md` is corrupted or inconsistent
|
|
190
|
-
|
|
191
|
-
**Recovery Steps**:
|
|
192
|
-
1. Create backup: `aidlc-state.md.backup`
|
|
193
|
-
2. Ask user which phase they're actually on
|
|
194
|
-
3. Regenerate state file from scratch
|
|
195
|
-
4. Mark completed phases based on existing artifacts
|
|
196
|
-
5. Resume from current phase
|
|
197
|
-
|
|
198
|
-
### Missing Artifacts
|
|
199
|
-
|
|
200
|
-
**Scenario**: Required artifacts from prior phase are missing
|
|
201
|
-
|
|
202
|
-
**Recovery Steps**:
|
|
203
|
-
1. Identify which artifacts are missing
|
|
204
|
-
2. Determine if they can be regenerated
|
|
205
|
-
3. If yes: Return to that phase, regenerate artifacts
|
|
206
|
-
4. If no: Ask user to provide information manually
|
|
207
|
-
5. Document the gap in `audit.md`
|
|
208
|
-
|
|
209
|
-
### User Wants to Restart Phase
|
|
210
|
-
|
|
211
|
-
**Scenario**: User is unhappy with phase results and wants to redo
|
|
212
|
-
|
|
213
|
-
**Recovery Steps**:
|
|
214
|
-
1. Confirm user wants to restart (data will be lost)
|
|
215
|
-
2. Archive existing artifacts: `{artifact}.backup`
|
|
216
|
-
3. Reset phase status in `aidlc-state.md`
|
|
217
|
-
4. Clear phase checkboxes in plan files
|
|
218
|
-
5. Re-execute phase from beginning
|
|
219
|
-
|
|
220
|
-
### User Wants to Skip Phase
|
|
221
|
-
|
|
222
|
-
**Scenario**: User wants to skip a phase that was planned
|
|
223
|
-
|
|
224
|
-
**Recovery Steps**:
|
|
225
|
-
1. Confirm user understands implications
|
|
226
|
-
2. Document skip reason in `audit.md`
|
|
227
|
-
3. Mark phase as "SKIPPED" in `aidlc-state.md`
|
|
228
|
-
4. Proceed to next phase
|
|
229
|
-
5. Note: May cause issues in later phases if dependencies missing
|
|
230
|
-
|
|
231
|
-
## Escalation Guidelines
|
|
232
|
-
|
|
233
|
-
### When to Ask for User Help
|
|
234
|
-
|
|
235
|
-
**Immediately**:
|
|
236
|
-
- Contradictory or ambiguous user input
|
|
237
|
-
- Missing required information
|
|
238
|
-
- Technical constraints AI cannot resolve
|
|
239
|
-
- Decisions requiring business judgment
|
|
240
|
-
|
|
241
|
-
**After Attempting Resolution**:
|
|
242
|
-
- Repeated errors in same step
|
|
243
|
-
- Complex technical issues
|
|
244
|
-
- Unusual project structures
|
|
245
|
-
- Integration with external systems
|
|
246
|
-
|
|
247
|
-
### When to Suggest Starting Over
|
|
248
|
-
|
|
249
|
-
**Consider Fresh Start If**:
|
|
250
|
-
- Multiple phases have errors
|
|
251
|
-
- State file is severely corrupted
|
|
252
|
-
- User cannot provide missing information
|
|
253
|
-
- Artifacts are inconsistent across phases
|
|
254
|
-
|
|
255
|
-
## Session Resumption Errors
|
|
256
|
-
|
|
257
|
-
### Missing Artifacts During Resumption
|
|
258
|
-
|
|
259
|
-
**Error**: Required artifacts from previous stages are missing
|
|
260
|
-
- **Cause**: Files deleted, moved, or never created
|
|
261
|
-
- **Solution**:
|
|
262
|
-
1. Identify which stage created the missing artifacts
|
|
263
|
-
2. Check if stage was marked complete in aidlc-state.md
|
|
264
|
-
3. If marked complete but artifacts missing: Regenerate that stage
|
|
265
|
-
4. If not marked complete: Resume from that stage
|
|
266
|
-
- **Recovery**: Return to the stage that creates missing artifacts and re-execute
|
|
267
|
-
|
|
268
|
-
**Error**: Artifact file exists but is empty or corrupted
|
|
269
|
-
- **Cause**: Interrupted write, manual editing, file system issues
|
|
270
|
-
- **Solution**:
|
|
271
|
-
1. Create backup of corrupted file
|
|
272
|
-
2. Attempt to regenerate from stage that creates it
|
|
273
|
-
3. If cannot regenerate: Ask user for information to recreate
|
|
274
|
-
- **Recovery**: Re-execute the stage that creates the artifact
|
|
275
|
-
|
|
276
|
-
### Inconsistent State During Resumption
|
|
277
|
-
|
|
278
|
-
**Error**: aidlc-state.md shows stage complete but artifacts don't exist
|
|
279
|
-
- **Cause**: State file updated but artifact generation failed
|
|
280
|
-
- **Solution**:
|
|
281
|
-
1. Mark stage as incomplete in aidlc-state.md
|
|
282
|
-
2. Re-execute the stage to generate artifacts
|
|
283
|
-
3. Verify artifacts exist before marking complete
|
|
284
|
-
- **Recovery**: Reset stage status and re-execute
|
|
285
|
-
|
|
286
|
-
**Error**: Artifacts exist but aidlc-state.md shows stage incomplete
|
|
287
|
-
- **Cause**: Artifact generation succeeded but state update failed
|
|
288
|
-
- **Solution**:
|
|
289
|
-
1. Verify artifacts are complete and valid
|
|
290
|
-
2. Update aidlc-state.md to mark stage complete
|
|
291
|
-
3. Proceed to next stage
|
|
292
|
-
- **Recovery**: Update state file to reflect actual completion
|
|
293
|
-
|
|
294
|
-
**Error**: Multiple stages marked as "current" in aidlc-state.md
|
|
295
|
-
- **Cause**: State file corruption, manual editing
|
|
296
|
-
- **Solution**:
|
|
297
|
-
1. Review artifacts to determine actual progress
|
|
298
|
-
2. Ask user which stage they're actually on
|
|
299
|
-
3. Correct aidlc-state.md to show single current stage
|
|
300
|
-
- **Recovery**: Rebuild state file based on existing artifacts
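
The first two inconsistency cases above reduce to comparing two facts per stage: whether `aidlc-state.md` marks it complete, and whether its artifacts exist. A minimal Python sketch of that decision follows. The function name, the action strings, and the stage names are assumptions for illustration; how the state file is parsed is left out.

```python
def reconcile_stage(marked_complete: bool, artifacts_exist: bool) -> str:
    """Pick a recovery action for one stage from the state/artifact mismatch."""
    if marked_complete and not artifacts_exist:
        # State was updated but artifact generation failed.
        return "reset status and re-execute stage"
    if artifacts_exist and not marked_complete:
        # Artifacts were written but the state update failed.
        return "verify artifacts, then mark stage complete"
    if marked_complete:
        return "consistent: stage complete"
    return "consistent: stage pending"


# Hypothetical observations: stage -> (marked_complete, artifacts_exist).
observed = {
    "requirements": (True, True),
    "user-stories": (True, False),
    "application-design": (False, True),
}

actions = {stage: reconcile_stage(*flags) for stage, flags in observed.items()}
for stage, action in actions.items():
    print(f"{stage}: {action}")
```

The "multiple stages marked current" case is deliberately not covered here, since the text resolves it by asking the user.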
|
|
301
|
-
|
|
302
|
-
### Context Loading Errors
|
|
303
|
-
|
|
304
|
-
**Error**: Cannot load required context from previous stages
|
|
305
|
-
- **Cause**: Missing files, corrupted content, wrong file paths
|
|
306
|
-
- **Solution**:
|
|
307
|
-
1. List which artifacts are needed for current stage
|
|
308
|
-
2. Check which ones are missing or corrupted
|
|
309
|
-
3. Regenerate missing artifacts or ask user for information
|
|
310
|
-
- **Recovery**: Complete prerequisite stages before resuming current stage
|
|
311
|
-
|
|
312
|
-
**Error**: Loaded artifacts contain contradictory information
|
|
313
|
-
- **Cause**: Manual editing, multiple people working, incomplete updates
|
|
314
|
-
- **Solution**:
|
|
315
|
-
1. Identify contradictions and present to user
|
|
316
|
-
2. Ask user which information is correct
|
|
317
|
-
3. Update artifacts to resolve contradictions
|
|
318
|
-
- **Recovery**: Reconcile contradictions before proceeding
|
|
319
|
-
|
|
320
|
-
### Resumption Best Practices
|
|
321
|
-
|
|
322
|
-
1. **Always validate state**: Check aidlc-state.md matches actual artifacts
|
|
323
|
-
2. **Load incrementally**: Load artifacts stage-by-stage, validate each
|
|
324
|
-
3. **Fail fast**: Stop immediately if critical artifacts are missing
|
|
325
|
-
4. **Communicate clearly**: Tell user exactly what's missing and why it's needed
|
|
326
|
-
5. **Offer options**: Regenerate, provide manually, or start fresh
|
|
327
|
-
6. **Document recovery**: Log all recovery actions in audit.md
- State file is severely corrupted
|
|
328
|
-
- User requirements have changed significantly
|
|
329
|
-
- Architectural decision needs to be reversed
|
|
330
|
-
|
|
331
|
-
**Before Starting Over**:
|
|
332
|
-
1. Archive all existing work
|
|
333
|
-
2. Document lessons learned
|
|
334
|
-
3. Identify what to preserve
|
|
335
|
-
4. Get user confirmation
|
|
336
|
-
5. Create new execution plan
|
|
337
|
-
|
|
338
|
-
## Logging Requirements
|
|
339
|
-
|
|
340
|
-
### Error Logging Format
|
|
341
|
-
|
|
342
|
-
```markdown
|
|
343
|
-
## Error - [Phase Name]
|
|
344
|
-
**Timestamp**: [ISO timestamp]
|
|
345
|
-
**Error Type**: [Critical/High/Medium/Low]
|
|
346
|
-
**Description**: [What went wrong]
|
|
347
|
-
**Cause**: [Why it happened]
|
|
348
|
-
**Resolution**: [How it was resolved]
|
|
349
|
-
**Impact**: [Effect on workflow]
|
|
350
|
-
|
|
351
|
-
---
|
|
352
|
-
```
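
As a rough illustration of filling in the template above, the following Python sketch renders one entry as text ready to append to `audit.md`. The function name `format_error_entry` and the sample values are hypothetical. The severity check simply mirrors the four levels listed in the template.

```python
from datetime import datetime, timezone

SEVERITIES = ("Critical", "High", "Medium", "Low")


def format_error_entry(phase, severity, description, cause, resolution,
                       impact, timestamp=None):
    """Render one error entry in the logging format shown above."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    # Default to the current UTC time as an ISO 8601 timestamp.
    ts = timestamp or datetime.now(timezone.utc).isoformat()
    return "\n".join([
        f"## Error - {phase}",
        f"**Timestamp**: {ts}",
        f"**Error Type**: {severity}",
        f"**Description**: {description}",
        f"**Cause**: {cause}",
        f"**Resolution**: {resolution}",
        f"**Impact**: {impact}",
        "",
        "---",
        "",
    ])


# Sample values are made up for illustration.
entry = format_error_entry(
    phase="Story Development",
    severity="High",
    description="Story generation plan has uncompleted steps",
    cause="Execution interrupted",
    resolution="Resumed from first uncompleted step",
    impact="Phase delayed, no artifacts lost",
    timestamp="2025-01-01T00:00:00+00:00",
)
print(entry)
```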
|
|
353
|
-
|
|
354
|
-
### Recovery Logging Format
|
|
355
|
-
|
|
356
|
-
```markdown
|
|
357
|
-
## Recovery - [Phase Name]
|
|
358
|
-
**Timestamp**: [ISO timestamp]
|
|
359
|
-
**Issue**: [What needed recovery]
|
|
360
|
-
**Recovery Steps**: [What was done]
|
|
361
|
-
**Outcome**: [Result of recovery]
|
|
362
|
-
**Artifacts Affected**: [List of files]
|
|
363
|
-
|
|
364
|
-
---
|
|
365
|
-
```
|
|
366
|
-
|
|
367
|
-
## Agent Task Failure Recovery
|
|
368
|
-
|
|
369
|
-
### When a Delegated Task Fails or Is Lost
|
|
370
|
-
|
|
371
|
-
**Scenario**: A task delegated via the Task tool returns "No task found with ID", empty output, or an error.
|
|
372
|
-
|
|
373
|
-
**MANDATORY Rules**:
|
|
374
|
-
1. **NEVER silently do the work yourself** — if you delegated to an agent, the recovery must also use delegation
|
|
375
|
-
2. **NEVER claim the lost agent's work was sufficient** — if you cannot retrieve output, the work is lost
|
|
376
|
-
3. **Retry the delegation** — re-launch the same agent with the same (or refined) prompt
|
|
377
|
-
4. **Maximum 2 retries** — if the agent fails 3 times total, escalate to the user
|
|
378
|
-
5. **Log all failures** — record the task ID, error, and retry attempts in `audit.md`
|
|
379
|
-
|
|
380
|
-
**Recovery Steps**:
|
|
381
|
-
1. Log the failure: task ID, error message, which agent was used
|
|
382
|
-
2. Re-launch the agent with `run_in_background: false` (foreground) to ensure visibility
|
|
383
|
-
3. If retry fails: try a different agent tier (e.g., escalate from `explore` to `explore-medium`)
|
|
384
|
-
4. If all retries fail: inform the user and ask how to proceed
|
|
385
|
-
5. Document the resolution in `audit.md`
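
The retry rules and recovery steps above amount to at most three attempts: the initial delegation, one retry with the same agent, and one retry on a different tier. The Python sketch below shows that control flow only. `launch` is a hypothetical callable standing in for however a task is actually delegated; it is not the Task tool's real interface, and the log list stands in for `audit.md`.

```python
def delegate_with_recovery(launch, agent, fallback_agent, log):
    """Try a delegated task up to 3 times; return its output or None.

    `launch(agent_name)` is a hypothetical callable that returns the task
    output or raises on failure. A None result means every attempt failed
    and the caller should escalate to the user.
    """
    # Initial attempt, retry with the same agent, retry on another tier.
    attempts = [agent, agent, fallback_agent]
    for number, current in enumerate(attempts, start=1):
        try:
            output = launch(current)
        except Exception as exc:
            log.append(f"attempt {number} ({current}): {exc}")
            continue
        if output:
            return output
        # Empty output counts as a lost task, not as success.
        log.append(f"attempt {number} ({current}): empty output")
    return None


# Fake launcher that fails twice and then succeeds, for illustration.
calls = []


def fake_launch(agent_name):
    calls.append(agent_name)
    if len(calls) < 3:
        raise RuntimeError("No task found with ID")
    return "report"


audit_log = []
result = delegate_with_recovery(fake_launch, "explore", "explore-medium", audit_log)
print(result, calls, audit_log)
```

Note that the function never substitutes its own work for the agent's: it either returns delegated output or reports failure, in line with the mandatory rules.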
|
|
386
|
-
|
|
387
|
-
**Error Logging Format**:
|
|
388
|
-
```markdown
|
|
389
|
-
## Agent Task Failure
|
|
390
|
-
**Timestamp**: [ISO timestamp]
|
|
391
|
-
**Task ID**: [lost task ID]
|
|
392
|
-
**Agent**: [agent type]
|
|
393
|
-
**Error**: [error message]
|
|
394
|
-
**Recovery**: [retry attempt or escalation]
|
|
395
|
-
|
|
396
|
-
---
|
|
397
|
-
```
|
|
398
|
-
|
|
399
|
-
### Background vs Foreground Task Execution
|
|
400
|
-
|
|
401
|
-
**Default**: Run delegated tasks in the **foreground** (`run_in_background: false`).
|
|
402
|
-
|
|
403
|
-
Foreground execution provides:
|
|
404
|
-
- Full visibility into agent work (tool calls, progress, decisions)
|
|
405
|
-
- Immediate error detection and recovery
|
|
406
|
-
- Better user experience — the user can see what's happening
|
|
407
|
-
|
|
408
|
-
**Background execution** (`run_in_background: true`) should ONLY be used for:
|
|
409
|
-
- Non-agent operations: `npm install`, `npm test`, `docker build`, etc.
|
|
410
|
-
- Operations where output is not needed until completion
|
|
411
|
-
|
|
412
|
-
**NEVER run agent tasks (Task tool with subagent_type) in the background** unless the user explicitly requests it. The risk of silent task loss outweighs the parallelism benefit.
|
|
413
|
-
|
|
414
|
-
### Parallel Foreground Execution
|
|
415
|
-
|
|
416
|
-
To run multiple agents in parallel WITHOUT background mode:
|
|
417
|
-
- Launch all agent tasks in the **same response** (multiple Task calls)
|
|
418
|
-
- Do NOT set `run_in_background: true`
|
|
419
|
-
- The system will execute them concurrently while maintaining visibility
|
|
420
|
-
- Wait for all results before proceeding
|
|
421
|
-
|
|
422
|
-
This gives you parallelism benefits with full transparency.
|
|
423
|
-
|
|
424
|
-
## Prevention Best Practices
|
|
425
|
-
|
|
426
|
-
1. **Validate Early**: Check inputs and dependencies before starting work
|
|
427
|
-
2. **Checkpoint Often**: Update checkboxes immediately after completing steps
|
|
428
|
-
3. **Communicate Clearly**: Explain what you're doing and why
|
|
429
|
-
4. **Ask Questions**: Don't assume - clarify ambiguities immediately
|
|
430
|
-
5. **Document Everything**: Log all decisions and changes in `audit.md`
|
|
1
|
+
# Error Handling and Recovery Procedures
|
|
2
|
+
|
|
3
|
+
## General Error Handling Principles
|
|
4
|
+
|
|
5
|
+
### When Errors Occur
|
|
6
|
+
1. **Identify the error**: Clearly state what went wrong
|
|
7
|
+
2. **Assess impact**: Determine if the error is blocking or can be worked around
|
|
8
|
+
3. **Communicate**: Inform the user about the error and options
|
|
9
|
+
4. **Offer solutions**: Provide clear steps to resolve or work around the error
|
|
10
|
+
5. **Document**: Log the error and resolution in `audit.md`
|
|
11
|
+
|
|
12
|
+
### Error Severity Levels
|
|
13
|
+
|
|
14
|
+
**Critical**: Workflow cannot continue
|
|
15
|
+
- Missing required files or artifacts
|
|
16
|
+
- Invalid user input that cannot be processed
|
|
17
|
+
- System errors preventing file operations
|
|
18
|
+
|
|
19
|
+
**High**: Phase cannot complete as planned
|
|
20
|
+
- Incomplete answers to required questions
|
|
21
|
+
- Contradictory user responses
|
|
22
|
+
- Missing dependencies from prior phases
|
|
23
|
+
|
|
24
|
+
**Medium**: Phase can continue with workarounds
|
|
25
|
+
- Optional artifacts missing
|
|
26
|
+
- Non-critical validation failures
|
|
27
|
+
- Partial completion possible
|
|
28
|
+
|
|
29
|
+
**Low**: Minor issues that don't block progress
|
|
30
|
+
- Formatting inconsistencies
|
|
31
|
+
- Optional information missing
|
|
32
|
+
- Non-blocking warnings
|
|
33
|
+
|
|
34
|
+
## Phase-Specific Error Handling
|
|
35
|
+
|
|
36
|
+
### Context Assessment Errors
|
|
37
|
+
|
|
38
|
+
**Error**: Cannot read workspace files
|
|
39
|
+
- **Cause**: Permission issues, missing directories
|
|
40
|
+
- **Solution**: Ask user to verify workspace path and permissions
|
|
41
|
+
- **Workaround**: Proceed with user-provided information only
|
|
42
|
+
|
|
43
|
+
**Error**: Existing `aidlc-state.md` is corrupted
|
|
44
|
+
- **Cause**: Manual editing, incomplete previous run
|
|
45
|
+
- **Solution**: Ask user if they want to start fresh or attempt recovery
|
|
46
|
+
- **Recovery**: Create backup, start new state file
|
|
47
|
+
|
|
48
|
+
**Error**: Cannot determine required phases
|
|
49
|
+
- **Cause**: Insufficient information from user
|
|
50
|
+
- **Solution**: Ask clarifying questions about intent and scope
|
|
51
|
+
- **Workaround**: Default to comprehensive execution plan
|
|
52
|
+
|
|
53
|
+
### Requirements Assessment Errors
|
|
54
|
+
|
|
55
|
+
**Error**: User provides contradictory requirements
|
|
56
|
+
- **Cause**: Unclear understanding, changing needs
|
|
57
|
+
- **Solution**: Create follow-up questions to resolve contradictions
|
|
58
|
+
- **Do Not Proceed**: Until contradictions are resolved
|
|
59
|
+
|
|
60
|
+
**Error**: Requirements document cannot be converted
|
|
61
|
+
- **Cause**: Unsupported format, corrupted file
|
|
62
|
+
- **Solution**: Ask user to provide requirements in supported format
|
|
63
|
+
- **Workaround**: Work with user's verbal description
|
|
64
|
+
|
|
65
|
+
**Error**: Incomplete answers to verification questions
|
|
66
|
+
- **Cause**: User skipped questions, unclear what to answer
|
|
67
|
+
- **Solution**: Highlight unanswered questions, provide examples
|
|
68
|
+
- **Do Not Proceed**: Until all required questions are answered
|
|
69
|
+
|
|
70
|
+
### Story Development Errors
|
|
71
|
+
|
|
72
|
+
**Error**: Cannot map requirements to stories
|
|
73
|
+
- **Cause**: Requirements too vague, missing functional details
|
|
74
|
+
- **Solution**: Return to Requirements Assessment for clarification
|
|
75
|
+
- **Workaround**: Create stories based on available information, mark as incomplete
|
|
76
|
+
|
|
77
|
+
**Error**: User provides ambiguous story planning answers
|
|
78
|
+
- **Cause**: Unclear options, complex decision
|
|
79
|
+
- **Solution**: Add follow-up questions with specific examples
|
|
80
|
+
- **Do Not Proceed**: Until ambiguities are resolved
|
|
81
|
+
|
|
82
|
+
**Error**: Story generation plan has uncompleted steps
|
|
83
|
+
- **Cause**: Execution interrupted, steps skipped
|
|
84
|
+
- **Solution**: Resume from first uncompleted step
|
|
85
|
+
- **Recovery**: Review completed steps, continue from checkpoint
|
|
86
|
+
|
|
87
|
+
### Application Design Errors
|
|
88
|
+
|
|
89
|
+
**Error**: Architectural decision is unclear or contradictory
|
|
90
|
+
- **Cause**: Ambiguous answers, conflicting requirements
|
|
91
|
+
- **Solution**: Add follow-up questions to clarify decision
|
|
92
|
+
- **Do Not Proceed**: Until decision is clear and documented
|
|
93
|
+
|
|
94
|
+
**Error**: Cannot determine number of services/units
|
|
95
|
+
- **Cause**: Insufficient information about boundaries
|
|
96
|
+
- **Solution**: Ask specific questions about deployment, team structure, scaling
|
|
97
|
+
- **Workaround**: Default to monolith, allow change later
|
|
98
|
+
|
|
99
|
+
### Design Errors
|
|
100
|
+
|
|
101
|
+
**Error**: Unit dependencies are circular
|
|
102
|
+
- **Cause**: Poor boundary definition, tight coupling
|
|
103
|
+
- **Solution**: Identify circular dependencies, suggest refactoring
|
|
104
|
+
- **Recovery**: Revise unit boundaries to break cycles
|
|
105
|
+
|
|
106
|
+
**Error**: Unit design plan has missing steps
|
|
107
|
+
- **Cause**: Plan generation incomplete, template error
|
|
108
|
+
- **Solution**: Regenerate plan with all required steps
|
|
109
|
+
- **Recovery**: Add missing steps to existing plan
|
|
110
|
+
|
|
111
|
+
**Error**: Cannot generate design artifacts
|
|
112
|
+
- **Cause**: Missing unit information, unclear requirements
|
|
113
|
+
- **Solution**: Return to Units Planning to clarify unit definition
|
|
114
|
+
- **Workaround**: Generate partial design, mark gaps
|
|
115
|
+
|
|
116
|
+
### NFR Implementation Errors
|
|
117
|
+
|
|
118
|
+
**Error**: Technology stack choices are incompatible
|
|
119
|
+
- **Cause**: Conflicting requirements, platform limitations
|
|
120
|
+
- **Solution**: Highlight incompatibilities, ask user to choose
|
|
121
|
+
- **Do Not Proceed**: Until compatible choices are made
|
|
122
|
+
|
|
123
|
+
**Error**: Organizational constraints cannot be met
|
|
124
|
+
- **Cause**: Network restrictions, security policies
|
|
125
|
+
- **Solution**: Document constraints, ask user for workarounds
|
|
126
|
+
- **Escalation**: May require human intervention for setup
|
|
127
|
+
|
|
128
|
+
**Error**: NFR implementation step requires human action
|
|
129
|
+
- **Cause**: AI cannot perform certain tasks (network config, credentials)
|
|
130
|
+
- **Solution**: Clearly mark as **HUMAN TASK**, provide instructions
|
|
131
|
+
- **Wait**: For user confirmation before proceeding
|
|
132
|
+
|
|
133
|
+
### Code Planning Errors
|
|
134
|
+
|
|
135
|
+
**Error**: Code generation plan is incomplete
|
|
136
|
+
- **Cause**: Missing design artifacts, unclear requirements
|
|
137
|
+
- **Solution**: Return to Design phase to complete artifacts
|
|
138
|
+
- **Recovery**: Generate plan with available information, mark gaps
|
|
139
|
+
|
|
140
|
+
**Error**: Unit dependencies not satisfied
|
|
141
|
+
- **Cause**: Dependent units not yet generated
|
|
142
|
+
- **Solution**: Reorder generation sequence to respect dependencies
|
|
143
|
+
- **Workaround**: Generate with stub dependencies, integrate later
|
|
144
|
+
|
|
145
|
+
### Code Generation Errors
|
|
146
|
+
|
|
147
|
+
**Error**: Cannot generate code for a step
|
|
148
|
+
- **Cause**: Insufficient design information, unclear requirements
|
|
149
|
+
- **Solution**: Skip step, document as incomplete, continue
|
|
150
|
+
- **Recovery**: Return to step after gathering more information
|
|
151
|
+
|
|
152
|
+
**Error**: Generated code has syntax errors
|
|
153
|
+
- **Cause**: Template issues, language-specific problems
|
|
154
|
+
- **Solution**: Fix syntax errors, regenerate if needed
|
|
155
|
+
- **Validation**: Verify code compiles before proceeding
|
|
156
|
+
|
|
157
|
+
**Error**: Test generation fails
|
|
158
|
+
- **Cause**: Complex logic, missing test framework setup
|
|
159
|
+
- **Solution**: Generate basic test structure, mark for manual completion
|
|
160
|
+
- **Workaround**: Proceed without tests, add in Operations phase
|
|
161
|
+
|
|
162
|
+
### Operations Errors
|
|
163
|
+
|
|
164
|
+
**Error**: Cannot determine build tool
|
|
165
|
+
- **Cause**: Unusual project structure, multiple build systems
|
|
166
|
+
- **Solution**: Ask user to specify build tool and commands
|
|
167
|
+
- **Workaround**: Provide generic instructions, user adapts
|
|
168
|
+
|
|
169
|
+
**Error**: Deployment target is unclear
|
|
170
|
+
- **Cause**: Multiple environments, complex infrastructure
|
|
171
|
+
- **Solution**: Ask user to specify deployment targets and methods
|
|
172
|
+
- **Workaround**: Provide instructions for common platforms
|
|
173
|
+
|
|
174
|
+
## Recovery Procedures
|
|
175
|
+
|
|
176
|
+
### Partial Phase Completion
|
|
177
|
+
|
|
178
|
+
**Scenario**: Phase was interrupted mid-execution
|
|
179
|
+
|
|
180
|
+
**Recovery Steps**:
|
|
181
|
+
1. Load the phase plan file
|
|
182
|
+
2. Identify last completed step (last [x] checkbox)
|
|
183
|
+
3. Resume from next uncompleted step
|
|
184
|
+
4. Verify all prior steps are actually complete
|
|
185
|
+
5. Continue execution normally
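
The recovery steps above can be sketched as a small parser over the plan file. This assumes plan steps are Markdown task-list items such as `- [x] step`; the function name `find_resume_point` and the sample plan are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumed checkbox syntax: "- [x] step" or "- [ ] step" ("*" and "1." markers also accepted).
CHECKBOX = re.compile(r"^\s*(?:[-*]|\d+\.)\s+\[([ xX])\]\s+(.*)$")


def find_resume_point(plan_text: str) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Return (last completed step, next step to run, unchecked steps before the last [x])."""
    steps = [
        (m.group(1).lower() == "x", m.group(2).strip())
        for m in map(CHECKBOX.match, plan_text.splitlines())
        if m
    ]
    done_idx = [i for i, (done, _) in enumerate(steps) if done]
    last = done_idx[-1] if done_idx else -1
    last_done = steps[last][1] if last >= 0 else None
    # The step right after the last [x] is unchecked by construction.
    next_step = steps[last + 1][1] if last + 1 < len(steps) else None
    # Unchecked steps before the last [x] are gaps to verify (step 4).
    gaps = [text for done, text in steps[:max(last, 0)] if not done]
    return last_done, next_step, gaps


plan = (
    "- [x] Step 1: Analyze requirements\n"
    "- [ ] Step 2: Draft design\n"
    "- [x] Step 3: Review design\n"
    "- [ ] Step 4: Generate code\n"
)
last_done, next_step, gaps = find_resume_point(plan)
```

A non-empty `gaps` list means an earlier step was never checked off, so it should be verified before resuming at `next_step`.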
|
|
186
|
+
|
|
187
|
+
### Corrupted State File
|
|
188
|
+
|
|
189
|
+
**Scenario**: `aidlc-state.md` is corrupted or inconsistent
|
|
190
|
+
|
|
191
|
+
**Recovery Steps**:
|
|
192
|
+
1. Create backup: `aidlc-state.md.backup`
|
|
193
|
+
2. Ask user which phase they're actually on
|
|
194
|
+
3. Regenerate state file from scratch
|
|
195
|
+
4. Mark completed phases based on existing artifacts
|
|
196
|
+
5. Resume from current phase
|
|
197
|
+
|
|
198
|
+
### Missing Artifacts
|
|
199
|
+
|
|
200
|
+
**Scenario**: Required artifacts from prior phase are missing
|
|
201
|
+
|
|
202
|
+
**Recovery Steps**:
|
|
203
|
+
1. Identify which artifacts are missing
|
|
204
|
+
2. Determine if they can be regenerated
|
|
205
|
+
3. If yes: Return to that phase, regenerate artifacts
|
|
206
|
+
4. If no: Ask user to provide information manually
|
|
207
|
+
5. Document the gap in `audit.md`
|
|
208
|
+
|
|
209
|
+
### User Wants to Restart Phase
|
|
210
|
+
|
|
211
|
+
**Scenario**: User is unhappy with phase results and wants to redo the phase
|
|
212
|
+
|
|
213
|
+
**Recovery Steps**:
|
|
214
|
+
1. Confirm user wants to restart (current phase results will be replaced)
|
|
215
|
+
2. Archive existing artifacts: `{artifact}.backup`
|
|
216
|
+
3. Reset phase status in `aidlc-state.md`
|
|
217
|
+
4. Clear phase checkboxes in plan files
|
|
218
|
+
5. Re-execute phase from beginning
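
Steps 2 and 4 are mechanical and could be sketched as below. This assumes artifacts are plain files and plan steps use Markdown task-list checkboxes; `archive_artifact` and `clear_checkboxes` are hypothetical helper names.

```python
import re
import shutil
from pathlib import Path


def archive_artifact(path: Path) -> Path:
    """Copy an artifact to `<name>.backup` next to the original and return the copy's path."""
    backup = path.with_name(path.name + ".backup")
    shutil.copy2(path, backup)  # copies content and file metadata
    return backup


def clear_checkboxes(plan_text: str) -> str:
    """Reset every checked task-list box ("[x]" or "[X]") to "[ ]"."""
    return re.sub(
        r"^(\s*(?:[-*]|\d+\.)\s+)\[[xX]\]",
        r"\1[ ]",
        plan_text,
        flags=re.MULTILINE,
    )


cleared = clear_checkboxes("- [x] a\n- [ ] b\n  1. [X] c\n")
```

Note that a second restart would overwrite an earlier `.backup` file under this naming scheme; adding a timestamp to the backup name is one way to avoid that.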
|
|
219
|
+
|
|
220
|
+
### User Wants to Skip Phase
|
|
221
|
+
|
|
222
|
+
**Scenario**: User wants to skip a phase that was planned
|
|
223
|
+
|
|
224
|
+
**Recovery Steps**:
|
|
225
|
+
1. Confirm user understands implications
|
|
226
|
+
2. Document skip reason in `audit.md`
|
|
227
|
+
3. Mark phase as "SKIPPED" in `aidlc-state.md`
|
|
228
|
+
4. Proceed to next phase
|
|
229
|
+
5. Note: May cause issues in later phases if dependencies are missing
|
|
230
|
+
|
|
231
|
+
## Escalation Guidelines
|
|
232
|
+
|
|
233
|
+
### When to Ask for User Help
|
|
234
|
+
|
|
235
|
+
**Immediately**:
|
|
236
|
+
- Contradictory or ambiguous user input
|
|
237
|
+
- Missing required information
|
|
238
|
+
- Technical constraints AI cannot resolve
|
|
239
|
+
- Decisions requiring business judgment
|
|
240
|
+
|
|
241
|
+
**After Attempting Resolution**:
|
|
242
|
+
- Repeated errors in same step
|
|
243
|
+
- Complex technical issues
|
|
244
|
+
- Unusual project structures
|
|
245
|
+
- Integration with external systems
|
|
246
|
+
|
|
247
|
+
### When to Suggest Starting Over
|
|
248
|
+
|
|
249
|
+
**Consider Fresh Start If**:
|
|
250
|
+
- Multiple phases have errors
|
|
251
|
+
- State file is severely corrupted
|
|
252
|
+
- User cannot provide missing information
|
|
253
|
+
- Artifacts are inconsistent across phases
|
|
254
|
+
|
|
255
|
+
## Session Resumption Errors
|
|
256
|
+
|
|
257
|
+
### Missing Artifacts During Resumption
|
|
258
|
+
|
|
259
|
+
**Error**: Required artifacts from previous stages are missing
|
|
260
|
+
- **Cause**: Files deleted, moved, or never created
|
|
261
|
+
- **Solution**:
|
|
262
|
+
1. Identify which stage created the missing artifacts
|
|
263
|
+
2. Check if stage was marked complete in aidlc-state.md
|
|
264
|
+
3. If marked complete but artifacts missing: Regenerate that stage
|
|
265
|
+
4. If not marked complete: Resume from that stage
|
|
266
|
+
- **Recovery**: Return to the stage that creates missing artifacts and re-execute
|
|
267
|
+
|
|
268
|
+
**Error**: Artifact file exists but is empty or corrupted
|
|
269
|
+
- **Cause**: Interrupted write, manual editing, file system issues
|
|
270
|
+
- **Solution**:
|
|
271
|
+
1. Create backup of corrupted file
|
|
272
|
+
2. Attempt to regenerate from stage that creates it
|
|
273
|
+
3. If it cannot be regenerated: Ask user for the information needed to recreate it
|
|
274
|
+
- **Recovery**: Re-execute the stage that creates the artifact
|
|
275
|
+
|
|
276
|
+
### Inconsistent State During Resumption
|
|
277
|
+
|
|
278
|
+
**Error**: aidlc-state.md shows stage complete but artifacts don't exist
|
|
279
|
+
- **Cause**: State file updated but artifact generation failed
|
|
280
|
+
- **Solution**:
|
|
281
|
+
1. Mark stage as incomplete in aidlc-state.md
|
|
282
|
+
2. Re-execute the stage to generate artifacts
|
|
283
|
+
3. Verify artifacts exist before marking complete
|
|
284
|
+
- **Recovery**: Reset stage status and re-execute
|
|
285
|
+
|
|
286
|
+
**Error**: Artifacts exist but aidlc-state.md shows stage incomplete
|
|
287
|
+
- **Cause**: Artifact generation succeeded but state update failed
|
|
288
|
+
- **Solution**:
|
|
289
|
+
1. Verify artifacts are complete and valid
|
|
290
|
+
2. Update aidlc-state.md to mark stage complete
|
|
291
|
+
3. Proceed to next stage
|
|
292
|
+
- **Recovery**: Update state file to reflect actual completion
|
|
293
|
+
|
|
294
|
+
**Error**: Multiple stages marked as "current" in aidlc-state.md
|
|
295
|
+
- **Cause**: State file corruption, manual editing
|
|
296
|
+
- **Solution**:
|
|
297
|
+
1. Review artifacts to determine actual progress
|
|
298
|
+
2. Ask user which stage they're actually on
|
|
299
|
+
3. Correct aidlc-state.md to show single current stage
|
|
300
|
+
- **Recovery**: Rebuild state file based on existing artifacts
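
The three inconsistency cases above reduce to comparing what the state file claims with what exists on disk. A minimal sketch, assuming the state file has already been parsed into per-stage records; the `Stage` record, the `reconcile` function, and the stage names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str
    marked_complete: bool    # what aidlc-state.md claims
    artifacts_present: bool  # what actually exists on disk
    is_current: bool = False


def reconcile(stages: List[Stage]) -> List[str]:
    """Return one recommended action per inconsistency, in stage order."""
    actions = []
    for s in stages:
        if s.marked_complete and not s.artifacts_present:
            actions.append(f"{s.name}: mark incomplete and re-execute")
        elif s.artifacts_present and not s.marked_complete:
            actions.append(f"{s.name}: verify artifacts, then mark complete")
    current = [s.name for s in stages if s.is_current]
    if len(current) > 1:
        actions.append(
            "multiple current stages (" + ", ".join(current) + "): ask user which applies"
        )
    return actions


actions = reconcile([
    Stage("requirements", True, True),
    Stage("design", True, False, is_current=True),
    Stage("code-generation", False, True, is_current=True),
])
```

The function only recommends actions. Whether artifacts are "complete and valid" still needs a content check, and the choice of the single current stage stays with the user.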
|
|
301
|
+
|
|
302
|
+
### Context Loading Errors
|
|
303
|
+
|
|
304
|
+
**Error**: Cannot load required context from previous stages
|
|
305
|
+
- **Cause**: Missing files, corrupted content, wrong file paths
|
|
306
|
+
- **Solution**:
|
|
307
|
+
1. List which artifacts are needed for current stage
|
|
308
|
+
2. Check which ones are missing or corrupted
|
|
309
|
+
3. Regenerate missing artifacts or ask user for information
|
|
310
|
+
- **Recovery**: Complete prerequisite stages before resuming current stage
|
|
311
|
+
|
|
312
|
+
**Error**: Loaded artifacts contain contradictory information
|
|
313
|
+
- **Cause**: Manual editing, multiple people working, incomplete updates
|
|
314
|
+
- **Solution**:
|
|
315
|
+
1. Identify contradictions and present to user
|
|
316
|
+
2. Ask user which information is correct
|
|
317
|
+
3. Update artifacts to resolve contradictions
|
|
318
|
+
- **Recovery**: Reconcile contradictions before proceeding
|
|
319
|
+
|
|
320
|
+
### Resumption Best Practices
|
|
321
|
+
|
|
322
|
+
1. **Always validate state**: Check aidlc-state.md matches actual artifacts
|
|
323
|
+
2. **Load incrementally**: Load artifacts stage-by-stage, validate each
|
|
324
|
+
3. **Fail fast**: Stop immediately if critical artifacts are missing
|
|
325
|
+
4. **Communicate clearly**: Tell user exactly what's missing and why it's needed
|
|
326
|
+
5. **Offer options**: Regenerate, provide manually, or start fresh
|
|
327
|
+
6. **Document recovery**: Log all recovery actions in audit.md

**Also Consider Fresh Start If**:
- User requirements have changed significantly
- Architectural decision needs to be reversed
|
|
330
|
+
|
|
331
|
+
**Before Starting Over**:
|
|
332
|
+
1. Archive all existing work
|
|
333
|
+
2. Document lessons learned
|
|
334
|
+
3. Identify what to preserve
|
|
335
|
+
4. Get user confirmation
|
|
336
|
+
5. Create new execution plan
|
|
337
|
+
|
|
338
|
+
## Logging Requirements
|
|
339
|
+
|
|
340
|
+
### Error Logging Format
|
|
341
|
+
|
|
342
|
+
```markdown
|
|
343
|
+
## Error - [Phase Name]
|
|
344
|
+
**Timestamp**: [ISO timestamp]
|
|
345
|
+
**Error Type**: [Critical/High/Medium/Low]
|
|
346
|
+
**Description**: [What went wrong]
|
|
347
|
+
**Cause**: [Why it happened]
|
|
348
|
+
**Resolution**: [How it was resolved]
|
|
349
|
+
**Impact**: [Effect on workflow]
|
|
350
|
+
|
|
351
|
+
---
|
|
352
|
+
```
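
An entry in this format could be rendered programmatically as follows. This is a sketch only: `format_error_entry` is a hypothetical helper, and writing the result to `audit.md` is left to the caller.

```python
from datetime import datetime, timezone
from typing import Optional

SEVERITIES = ("Critical", "High", "Medium", "Low")


def format_error_entry(phase: str, severity: str, description: str, cause: str,
                       resolution: str, impact: str,
                       when: Optional[datetime] = None) -> str:
    """Render one error entry using the template's field order."""
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {SEVERITIES}")
    when = when or datetime.now(timezone.utc)
    lines = [
        f"## Error - {phase}",
        f"**Timestamp**: {when.isoformat()}",
        f"**Error Type**: {severity}",
        f"**Description**: {description}",
        f"**Cause**: {cause}",
        f"**Resolution**: {resolution}",
        f"**Impact**: {impact}",
        "",
        "---",
        "",
    ]
    return "\n".join(lines)


entry = format_error_entry(
    "Code Generation", "High",
    "Generated code has syntax errors", "Template issues",
    "Fixed syntax errors and regenerated", "Step repeated once",
    when=datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
```

Rejecting unknown severities keeps the **Error Type** field limited to the four levels the template lists.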
|
|
353
|
+
|
|
354
|
+
### Recovery Logging Format
|
|
355
|
+
|
|
356
|
+
```markdown
|
|
357
|
+
## Recovery - [Phase Name]
|
|
358
|
+
**Timestamp**: [ISO timestamp]
|
|
359
|
+
**Issue**: [What needed recovery]
|
|
360
|
+
**Recovery Steps**: [What was done]
|
|
361
|
+
**Outcome**: [Result of recovery]
|
|
362
|
+
**Artifacts Affected**: [List of files]
|
|
363
|
+
|
|
364
|
+
---
|
|
365
|
+
```
|
|
366
|
+
|
|
367
|
+
## Agent Task Failure Recovery
|
|
368
|
+
|
|
369
|
+
### When a Delegated Task Fails or Is Lost
|
|
370
|
+
|
|
371
|
+
**Scenario**: A task delegated via the Task tool returns "No task found with ID", empty output, or an error.
|
|
372
|
+
|
|
373
|
+
**MANDATORY Rules**:
|
|
374
|
+
1. **NEVER silently do the work yourself** — if you delegated to an agent, the recovery must also use delegation
|
|
375
|
+
2. **NEVER claim the lost agent's work was sufficient** — if you cannot retrieve output, the work is lost
|
|
376
|
+
3. **Retry the delegation** — re-launch the same agent with the same (or refined) prompt
|
|
377
|
+
4. **Maximum 2 retries** — if the agent fails 3 times total, escalate to the user
|
|
378
|
+
5. **Log all failures** — record the task ID, error, and retry attempts in `audit.md`
|
|
379
|
+
|
|
380
|
+
**Recovery Steps**:
|
|
381
|
+
1. Log the failure: task ID, error message, which agent was used
|
|
382
|
+
2. Re-launch the agent with `run_in_background: false` (foreground) to ensure visibility
|
|
383
|
+
3. If retry fails: try a different agent tier (e.g., escalate from `explore` to `explore-medium`)
|
|
384
|
+
4. If all retries fail: inform the user and ask how to proceed
|
|
385
|
+
5. Document the resolution in `audit.md`
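
The retry budget and tier escalation can be sketched as a loop. The `delegate` callable below is a hypothetical stand-in for launching an agent in the foreground; it is not the Task tool's actual interface, and the log list stands in for entries written to `audit.md`.

```python
from typing import Callable, List, Optional

MAX_RETRIES = 2  # one initial attempt plus two retries = three attempts in total


def run_with_retries(delegate: Callable[[str], Optional[str]],
                     tiers: List[str], log: List[str]) -> Optional[str]:
    """Run `delegate(agent)` until it returns output or the retry budget is spent.

    Returns the agent output, or None when every attempt failed and the
    caller should escalate to the user. Each failure is appended to `log`.
    """
    for attempt in range(1, MAX_RETRIES + 2):
        # Attempts 1 and 2 use the original agent; the final retry moves up a tier if one exists.
        agent = tiers[1] if attempt > 2 and len(tiers) > 1 else tiers[0]
        try:
            output = delegate(agent)
            error = "empty output"
        except Exception as exc:  # a lost task or tool error counts as a failure
            output, error = None, str(exc)
        if output:
            return output
        log.append(f"attempt {attempt} ({agent}) failed: {error}")
    return None


calls: List[str] = []
failures: List[str] = []


def always_lost(agent: str) -> Optional[str]:
    """Simulated agent launch that always loses its task."""
    calls.append(agent)
    raise RuntimeError("No task found with ID")


result = run_with_retries(always_lost, ["explore", "explore-medium"], failures)
```

Here all three attempts fail, so `result` is `None` and the caller would inform the user, with `failures` holding one line per attempt.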
|
|
386
|
+
|
|
387
|
+
**Error Logging Format**:
|
|
388
|
+
```markdown
|
|
389
|
+
## Agent Task Failure
|
|
390
|
+
**Timestamp**: [ISO timestamp]
|
|
391
|
+
**Task ID**: [lost task ID]
|
|
392
|
+
**Agent**: [agent type]
|
|
393
|
+
**Error**: [error message]
|
|
394
|
+
**Recovery**: [retry attempt or escalation]
|
|
395
|
+
|
|
396
|
+
---
|
|
397
|
+
```
|
|
398
|
+
|
|
399
|
+
### Background vs Foreground Task Execution
|
|
400
|
+
|
|
401
|
+
**Default**: Run delegated tasks in the **foreground** (`run_in_background: false`).
|
|
402
|
+
|
|
403
|
+
Foreground execution provides:
|
|
404
|
+
- Full visibility into agent work (tool calls, progress, decisions)
|
|
405
|
+
- Immediate error detection and recovery
|
|
406
|
+
- Better user experience — the user can see what's happening
|
|
407
|
+
|
|
408
|
+
**Background execution** (`run_in_background: true`) should ONLY be used for:
|
|
409
|
+
- Non-agent operations: `npm install`, `npm test`, `docker build`, etc.
|
|
410
|
+
- Operations where output is not needed until completion
|
|
411
|
+
|
|
412
|
+
**NEVER run agent tasks (Task tool with `subagent_type`) in the background** unless the user explicitly requests it. The risk of silent task loss outweighs the parallelism benefit.
|
|
413
|
+
|
|
414
|
+
### Parallel Foreground Execution
|
|
415
|
+
|
|
416
|
+
To run multiple agents in parallel WITHOUT background mode:
|
|
417
|
+
- Launch all agent tasks in the **same response** (multiple Task calls)
|
|
418
|
+
- Do NOT set `run_in_background: true`
|
|
419
|
+
- The system will execute them concurrently while maintaining visibility
|
|
420
|
+
- Wait for all results before proceeding
|
|
421
|
+
|
|
422
|
+
This gives you parallelism benefits with full transparency.
|
|
423
|
+
|
|
424
|
+
## Prevention Best Practices
|
|
425
|
+
|
|
426
|
+
1. **Validate Early**: Check inputs and dependencies before starting work
|
|
427
|
+
2. **Checkpoint Often**: Update checkboxes immediately after completing steps
|
|
428
|
+
3. **Communicate Clearly**: Explain what you're doing and why
|
|
429
|
+
4. **Ask Questions**: Don't assume; clarify ambiguities immediately
|
|
430
|
+
5. **Document Everything**: Log all decisions and changes in `audit.md`
|