@exaudeus/workrail 0.8.0 → 0.8.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (28)
  1. package/dist/application/app.d.ts +0 -1
  2. package/dist/application/app.js +0 -6
  3. package/dist/application/services/workflow-service.js +56 -4
  4. package/dist/mcp-server.js +0 -35
  5. package/package.json +1 -1
  6. package/workflows/bug-investigation.agentic.json +112 -0
  7. package/workflows/document-creation-workflow.json +1 -1
  8. package/workflows/documentation-update-workflow.json +1 -1
  9. package/workflows/routines/plan-analysis.json +139 -0
  10. package/workflows/scoped-documentation-workflow.json +252 -0
  11. package/workflows/workflow-diagnose-environment.json +24 -0
  12. package/spec/mcp-compliance-summary.md +0 -211
  13. package/spec/mcp-protocol-handshake.md +0 -604
  14. package/web/DESIGN_SYSTEM_INTEGRATION.md +0 -305
  15. package/web/assets/images/favicon-amber-16.png +0 -0
  16. package/web/assets/images/favicon-amber-32.png +0 -0
  17. package/web/assets/images/favicon-white-16-clean.png +0 -0
  18. package/web/assets/images/favicon-white-32-clean.png +0 -0
  19. package/web/assets/images/icon-amber-192.png +0 -0
  20. package/web/assets/images/icon-amber-512.png +0 -0
  21. package/web/assets/images/icon-amber.svg +0 -27
  22. package/web/assets/images/icon-white-192-clean.png +0 -0
  23. package/web/assets/images/icon-white-512-clean.png +0 -0
  24. package/web/assets/images/icon-white.svg +0 -27
  25. package/web/examples/BEFORE_AFTER.md +0 -691
  26. package/workflows/IMPROVEMENTS-simplified.md +0 -122
  27. package/workflows/systematic-bug-investigation-simplified.backup-20251106-155300.json +0 -117
  28. package/workflows/systematic-bug-investigation-with-loops.backup-20251106-162241.json +0 -731
@@ -1,122 +0,0 @@
- # Improvements to Simplified Bug Investigation Workflow
-
- ## Problem Reported
-
- **Issue 1**: "What about having it follow the flow of code to help track down what could be happening?"
-
- **Issue 2**: "The agent stopped after phase two because it was 'very confident' that it had found the issue"
-
- ## Root Cause
-
- The agent stopped after Phase 2 (Hypothesis Formation) because it felt confident it had found the bug. But at that point, it only had a **theory** based on reading code, not **proof** from evidence. This is the #1 failure mode we're trying to prevent.
-
- ## Changes Made
-
- ### 1. Enhanced Phase 1 - Execution Flow Tracing
-
- **Before**: Vague guidance about "understanding how code is reached" and "tracing data flow"
-
- **After**: Concrete, step-by-step execution flow tracing:
- - Start at entry point (API call, test, event)
- - Trace the call chain function-by-function
- - Track state changes at each step
- - Follow data transformations
- - Document the complete path from entry to error
-
- **Why**: This gives agents a **concrete technique** rather than abstract guidance. Following actual execution flow prevents surface-level code reading.
-
- **Output**: `ExecutionFlow.md` with:
- - Entry point
- - Step-by-step call chain with file:line references
- - Data flow diagram
- - State changes
- - Decision points
-
- ### 2. Added Explicit Anti-Early-Exit Warning in Phase 2
-
- Added at the end of Phase 2 prompt:
-
- ```
- 🚨 CRITICAL - DO NOT STOP HERE:
-
- Even if you have a hypothesis with 10/10 confidence, you do NOT have proof yet.
- You have an educated guess based on reading code.
-
- You MUST continue to Phase 3 (Instrumentation) and Phase 4 (Evidence Collection)
- to gather actual proof.
-
- Having "very high confidence" after reading code is NOT the same as having
- evidence from running instrumented code.
-
- Call workflow_next to continue to Phase 3. This is not optional.
- ```
-
- **Why**: Catches agents right at the moment they're tempted to stop. Makes it explicit that confidence ≠ completion.
-
- ### 3. Strengthened MetaGuidance
-
- Enhanced the "Finding vs Proving" section:
-
- **Before**:
- - "When you look at code and think 'I found the bug!', you have formed a hypothesis..."
- - "This is why you must complete all phases even when Phase 1 makes the bug 'obvious'."
-
- **After** (added):
- - "Reading code and feeling confident = THEORY. Running instrumented code and collecting evidence = PROOF."
- - "Even with 10/10 confidence after Phase 1 or 2, you have ZERO proof. Continue to Phases 3-5 to gather evidence. This is NOT negotiable."
- - "Common mistake: 'I'm very confident so I'll skip instrumentation.' This fails ~90% of the time. High confidence without evidence = educated guess, not diagnosis."
-
- **Why**: Uses clearer language about the distinction. Explicitly calls out the "I'm confident" mistake.
-
- ### 4. Updated Phase 1 Closing
-
- **Before**: "You're building understanding, not diagnosing yet."
-
- **After**: "This analysis builds understanding. You do NOT have a diagnosis yet. You're mapping the terrain before forming theories."
-
- **Why**: More forceful language to prevent premature conclusions.
-
- ## Why This Matters
-
- ### The Core Problem
-
- Agents (like humans) naturally:
- 1. Pattern match quickly when reading code
- 2. Form confident conclusions based on that pattern matching
- 3. Feel like they've "solved it" and want to move on
-
- But bugs often have:
- - Alternative explanations
- - Edge cases not visible from reading code
- - Unexpected interactions only visible at runtime
- - Environmental factors
-
- ### The Solution
-
- The workflow now:
- 1. **Provides concrete technique** (execution flow tracing) vs abstract "analyze code"
- 2. **Intercepts at the decision point** (end of Phase 2) with explicit warning
- 3. **Explains WHY** phases matter in metaGuidance
- 4. **Uses clear language** about theory vs proof
-
- ## Testing Recommendations
-
- When testing this workflow:
-
- 1. **Watch for Phase 2 exits**: Does the agent try to stop after forming hypotheses?
- 2. **Check for execution flow**: Does Phase 1 produce a detailed call chain, or just general analysis?
- 3. **Look for instrumentation**: Does Phase 3 actually add logging/debugging, or skip it?
- 4. **Verify evidence collection**: Does Phase 4 run instrumented code and collect real data?
-
- ## Remaining Challenges
-
- Even with these improvements, agents may still try to exit early if:
- - They have extremely high confidence
- - The bug seems "obvious"
- - The codebase is small/simple
-
- If this continues to be an issue, we may need to:
- - Add a "commitment checkpoint" that requires explicit acknowledgment
- - Make workflow_next calls more automatic (less agent discretion)
- - Add validation that checks for completed artifacts before allowing progression
-
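The phase-by-phase loop the deleted notes describe (call workflow_next, do the phase's work, repeat until the workflow reports completion) can be sketched as a small driver. This is a hypothetical illustration only: `StepResult`, `runWorkflow`, and the callback shapes are assumptions, not the package's actual `workflow_next` MCP tool signature, which this diff does not show.

```typescript
// Hypothetical driver for the loop described above. The point of the
// structure: the loop exits only when the workflow says isComplete,
// never because the agent "feels confident" mid-investigation.
type StepResult = { step: string; isComplete: boolean };

function runWorkflow(
  workflowNext: () => StepResult,          // stand-in for the workflow_next tool
  executePhase: (step: string) => void     // the agent's work for one phase
): string[] {
  const completed: string[] = [];
  while (true) {
    const result = workflowNext();
    if (result.isComplete) break;          // the ONLY exit condition
    executePhase(result.step);
    completed.push(result.step);
  }
  return completed;
}
```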
@@ -1,117 +0,0 @@
- {
-   "id": "systematic-bug-investigation-simplified",
-   "name": "Bug Investigation (Simplified)",
-   "version": "2.0.0-alpha.1",
-   "description": "A streamlined bug investigation workflow that guides agents through systematic analysis without excessive prescription. Focuses on reflective practice - agents design their approach, then execute it.",
-   "clarificationPrompts": [
-     "What type of system is this? (web app, backend service, CLI tool, etc.)",
-     "How reproducible is this bug? (always, sometimes, rarely)",
-     "What access do you have? (full codebase, logs, tests, etc.)"
-   ],
-   "preconditions": [
-     "User has a specific bug or failing test to investigate",
-     "Agent has codebase access and can run tests/build",
-     "Bug is reproducible with specific steps"
-   ],
-   "metaGuidance": [
-     "WHY THIS WORKFLOW EXISTS: Without structure, agents naturally jump to conclusions after seeing a few lines of code. This feels efficient but leads to wrong diagnoses ~90% of the time.",
-     "Bugs that seem obvious often have deeper root causes, alternative explanations, or critical context that only emerges through systematic investigation.",
-     "WHAT THIS WORKFLOW DOES: Separates investigation into distinct phases that prevent premature conclusions: understand code → form hypotheses → gather evidence → validate → document.",
-     "Each phase builds on the previous and serves a specific purpose in building confidence from theory to proof.",
-     "YOUR GOAL: Produce a comprehensive diagnostic writeup that explains what's happening, why, and provides evidence. Think 'investigative journalist' not 'quick fix developer'.",
-     "The deliverable is NOT a fix, but understanding so complete that someone else could fix it confidently.",
-     "WHY ALL 6 PHASES MATTER:",
-     "Phase 0 (Setup): Prevents misunderstanding the problem you're solving",
-     "Phase 1 (Analysis): Builds context needed to form good hypotheses, not just first impressions",
-     "Phase 2 (Hypotheses): Forces consideration of multiple explanations before committing to one",
-     "Phase 3 (Instrumentation): Sets up the ability to gather real evidence, not just theories",
-     "Phase 4 (Evidence): Collects actual data about what's happening, not assumptions",
-     "Phase 5 (Validation): Challenges your conclusion to catch confirmation bias and alternative explanations",
-     "Phase 6 (Writeup): Synthesizes everything into actionable knowledge for whoever fixes this",
-     "THE CRITICAL DISTINCTION - FINDING VS PROVING:",
-     "When you look at code and think 'I found the bug!', you have formed a hypothesis based on pattern matching. This is valuable but not sufficient.",
-     "Reading code and feeling confident = THEORY. Running instrumented code and collecting evidence = PROOF. Only proof completes the investigation.",
-     "Even with 10/10 confidence after Phase 1 or 2, you have ZERO proof. Continue to Phases 3-5 to gather evidence. This is NOT negotiable.",
-     "Common mistake: 'I'm very confident so I'll skip instrumentation.' This fails ~90% of the time. High confidence without evidence = educated guess, not diagnosis.",
-     "HOW TO USE THIS WORKFLOW: Call workflow_next to get each phase. Complete that phase's work (including all documentation). Call workflow_next again. Repeat until isComplete=true.",
-     "Each phase will guide you through what to do and what to produce.",
-     "REFLECTIVE PRACTICE: This workflow asks you to design your approach for each phase, then execute it. Think through 'what would be most effective here?' before diving in.",
-     "Your expertise matters, but within the structure of gathering evidence systematically rather than jumping to conclusions.",
-     "SUCCESS LOOKS LIKE: Someone reading your writeup understands: what the bug is, why it occurs, how you know (evidence), what was ruled out, how to reproduce, and what to consider when fixing.",
-     "They should feel confident proceeding with a fix based on your thorough investigation."
-   ],
-   "steps": [
-     {
-       "id": "phase-0-setup",
-       "title": "Phase 0: Investigation Setup",
-       "prompt": "**SETUP YOUR INVESTIGATION**\n\nBefore diving into code, establish your investigation context:\n\n1. **Triage**: Understand the bug report\n - What's the reported problem?\n - What's the expected vs actual behavior?\n - What error messages or symptoms exist?\n - How is it reproduced?\n\n2. **Context Gathering**: Collect initial information\n - Stack traces, error logs, or test failures\n - Recent changes that might be related\n - System type and architecture\n - Your access level and available tools\n\n3. **Investigation Workspace**: Set up\n - Create a branch or investigation directory if appropriate\n - Document initial understanding in INVESTIGATION_CONTEXT.md\n - Note any early assumptions to verify later\n\n4. **User Preferences**: Clarify\n - How should I handle large log volumes?\n - Should I proceed automatically between phases or check in?\n - Any specific areas you want me to focus on or avoid?\n\n**OUTPUT**: Create INVESTIGATION_CONTEXT.md with:\n- Bug description and symptoms\n- Reproduction steps\n- Initial context\n- Investigation workspace location\n- User preferences\n\n**Self-Assessment**: On a scale of 1-10, how well do you understand what you're investigating? (Should be 6-8 after this phase)",
-       "agentRole": "You are setting up a systematic investigation. Focus on understanding the problem and establishing your workspace.",
-       "requireConfirmation": false
-     },
-     {
-       "id": "phase-1-analysis",
-       "title": "Phase 1: Codebase Analysis",
-       "prompt": "**ANALYZE THE CODEBASE BY FOLLOWING EXECUTION FLOW**\n\n**Your Task**: Understand the code around this bug by tracing how execution flows from entry point to error.\n\n**STEP 1 - Design Your Analysis Approach**\n\nBefore you start reading code, think through:\n- Where does execution start? (user action, API call, test, scheduled job)\n- What's the path from entry point to the error location?\n- What are the key decision points along that path?\n- Where could data be transformed or corrupted?\n- What would prevent you from missing the real cause?\n\n**Document your approach in INVESTIGATION_CONTEXT.md under \"Phase 1 Analysis Plan\"**\n\n**STEP 2 - Trace Execution Flow**\n\nFollow the code execution step-by-step from entry to error:\n\n1. **Entry Point**: Where does execution begin for this bug?\n - API endpoint, CLI command, event handler, test case?\n - What are the initial inputs/parameters?\n\n2. **Execution Path**: Trace the call chain step-by-step\n - List each function/method call in order\n - Note what data flows between calls\n - Identify branches/conditionals and which path is taken\n - Mark where the error occurs\n\n3. **State Changes**: Track how state evolves\n - What variables are created/modified?\n - What database/file operations happen?\n - What gets cached or stored?\n\n4. **Data Transformations**: Follow data through the system\n - Input format → transformations → output format\n - Where could data become invalid?\n - What validation happens (or doesn't)?\n\n**STEP 3 - Document Findings**\n\nCreate ExecutionFlow.md with:\n- **Entry Point**: Where execution starts\n- **Call Chain**: Step-by-step execution path with file:line references\n- **Data Flow**: How data transforms from input to error point\n- **State Changes**: What gets modified along the way\n- **Decision Points**: Conditionals/branches that affect the path\n- **Suspicious Points**: Where things could go wrong\n\n**STEP 4 - Analyze Code at Each Step**\n\nFor the key steps in your execution path:\n- Read the actual implementation\n- Check for error handling (or lack of it)\n- Look for validation logic\n- Note assumptions in the code\n- Identify patterns and deviations\n\n**STEP 5 - Self-Critique**\n\nAnswer honestly:\n- Did I trace the complete execution path from entry to error?\n- Are there alternative paths I didn't consider?\n- Did I understand what each step does?\n- What am I still uncertain about?\n- Did I skip any steps in the call chain?\n\n**Confidence Check**: Rate 1-10 how well you understand the execution flow. (Should be 7-9 to proceed)\n\n**CRITICAL**: This analysis builds understanding. You do NOT have a diagnosis yet. You're mapping the terrain before forming theories.",
-       "agentRole": "You are a systematic investigator tracing code execution like following breadcrumbs. Focus on the actual path the code takes, step by step.",
-       "guidance": [
-         "Execution flow tracing is concrete: follow function calls, not just read code",
-         "The goal is to see what ACTUALLY happens, not what should happen",
-         "This creates a foundation for hypotheses in Phase 2",
-         "Don't jump to conclusions yet - just map the flow"
-       ],
-       "requireConfirmation": false
-     },
-     {
-       "id": "phase-2-hypotheses",
-       "title": "Phase 2: Hypothesis Formation",
-       "prompt": "**FORM HYPOTHESES ABOUT THE BUG**\n\n**Your Task**: Based on your analysis, develop testable hypotheses about what's causing the bug.\n\n**STEP 1 - Brainstorm Possible Causes**\n\nFrom your analysis, what could be causing this bug? Consider:\n- Code defects (logic errors, missing validation, race conditions)\n- Data issues (corruption, unexpected formats, missing data)\n- Environment factors (config, timing, resource limits)\n- Integration problems (API changes, dependency issues)\n\nGenerate 3-7 possible causes. Be creative but grounded in your analysis.\n\n**STEP 2 - Develop Testable Hypotheses**\n\nFor each possible cause, formulate a testable hypothesis:\n\n**Hypothesis Template**:\n- **ID**: H1, H2, etc.\n- **Statement**: \"The bug occurs because [specific cause]\"\n- **Evidence For**: What from your analysis supports this?\n- **Evidence Against**: What contradicts or weakens this?\n- **How to Test**: What evidence would prove/disprove this?\n- **Likelihood**: 1-10 based on current evidence\n\n**STEP 3 - Prioritize**\n\nRank your hypotheses by:\n1. Likelihood (based on evidence)\n2. Testability (can you validate it easily?)\n3. Impact (does it fully explain the symptoms?)\n\nFocus on top 3-5 hypotheses.\n\n**STEP 4 - Plan Validation Strategy**\n\nFor your top hypotheses, design how you'll gather evidence:\n- What instrumentation/logging do you need?\n- What tests should you run?\n- What code experiments could prove/disprove?\n- What data should you examine?\n\n**OUTPUT**: Create Hypotheses.md with:\n- All hypotheses (using template above)\n- Priority ranking with justification\n- Validation strategy for top 3-5\n- Questions that would help narrow down\n\n**Self-Assessment**:\n- Do your hypotheses explain all the symptoms?\n- Are they specific enough to be testable?\n- Have you considered alternative explanations?\n- Are you anchoring too much on your first impression?\n\n**🚨 CRITICAL - DO NOT STOP HERE:**\n\nEven if you have a hypothesis with 10/10 confidence, you do NOT have proof yet. You have an educated guess based on reading code.\n\nYou MUST continue to Phase 3 (Instrumentation) and Phase 4 (Evidence Collection) to gather actual proof.\n\nHaving \"very high confidence\" after reading code is NOT the same as having evidence from running instrumented code.\n\nCall workflow_next to continue to Phase 3. This is not optional.",
-       "agentRole": "You are forming testable hypotheses based on evidence, not jumping to conclusions. Multiple competing hypotheses are healthy at this stage.",
-       "guidance": [
-         "Agents should generate multiple hypotheses, not just their first idea",
-         "Forcing 'Evidence Against' helps combat confirmation bias",
-         "The validation strategy prepares for next phase",
-         "CRITICAL: Agents must not stop after Phase 2 even with high confidence"
-       ],
-       "requireConfirmation": false
-     },
-     {
-       "id": "phase-3-instrumentation",
-       "title": "Phase 3: Instrumentation & Test Setup",
-       "prompt": "**INSTRUMENT THE CODE FOR EVIDENCE COLLECTION**\n\n**Your Task**: Add instrumentation that will generate evidence to test your hypotheses.\n\n**STEP 1 - Design Your Instrumentation Strategy**\n\nBefore adding any logging or test modifications, think through:\n- What specific data points would prove/disprove each hypothesis?\n- Where in the code should you add instrumentation?\n- What's the right level of detail? (too much = noise, too little = gaps)\n- How will you organize/label output to distinguish between hypotheses?\n- Are there existing tests you can enhance instead of adding new logging?\n\nDocument your strategy.\n\n**STEP 2 - Implement Instrumentation**\n\nAdd the instrumentation you designed. This might include:\n- Debug logging at key points\n- Assertions to catch violations\n- Test modifications to expose state\n- Controlled code experiments (add guards, inject failures)\n- Enhanced error messages\n\nLabel your instrumentation clearly (e.g., \"[H1]\" for hypothesis 1 evidence)\n\n**STEP 3 - Prepare Test Scenarios**\n\nSet up scenarios that will trigger the bug and generate evidence:\n- Minimal reproduction case\n- Edge cases that might behave differently\n- Known working scenarios for comparison\n- Variations that test specific hypotheses\n\n**OUTPUT**: Update INVESTIGATION_CONTEXT.md with:\n- Instrumentation points (what/where/why)\n- Test scenarios prepared\n- Expected outcomes for each hypothesis\n- How you'll analyze the results\n\n**Readiness Check**: Are you confident this instrumentation will generate useful evidence? What might you be missing?",
-       "agentRole": "You are a detective setting up surveillance. Good instrumentation makes the evidence collection phase productive.",
-       "requireConfirmation": false
-     },
-     {
-       "id": "phase-4-evidence",
-       "title": "Phase 4: Evidence Collection",
-       "prompt": "**COLLECT EVIDENCE BY RUNNING INSTRUMENTED CODE**\n\n**Your Task**: Execute your test scenarios and collect evidence about your hypotheses.\n\n**STEP 1 - Run Test Scenarios**\n\nExecute the scenarios you prepared:\n- Run the minimal reproduction case\n- Run edge cases and variations\n- Run working cases for comparison\n- Capture all output (logs, errors, test results)\n\n**STEP 2 - Organize Evidence**\n\nFor each hypothesis, collect the evidence:\n- What does the instrumentation reveal?\n- Does behavior match predictions?\n- What unexpected findings emerged?\n- What questions remain unanswered?\n\nCreate evidence files: Evidence_H1.md, Evidence_H2.md, etc.\n\n**STEP 3 - Analyze Patterns**\n\nLook across all evidence:\n- Which hypotheses are supported?\n- Which are contradicted?\n- Are there patterns you didn't predict?\n- Do you need additional instrumentation?\n- Should you form new hypotheses?\n\n**STEP 4 - Evidence Quality Assessment**\n\nFor each hypothesis, rate evidence quality (1-10):\n- How direct is the evidence?\n- How reproducible?\n- Are there alternative explanations?\n- Do multiple independent sources confirm?\n\n**OUTPUT**: Update Hypotheses.md with:\n- Evidence collected for each hypothesis\n- Updated likelihood scores\n- Evidence quality ratings\n- New insights or questions\n\n**Decision Point**: \n- Do you have strong evidence (8+/10) for one hypothesis? → Proceed to validation\n- Do you need more instrumentation? → Document what's needed\n- Do you need to revise hypotheses? → Update and continue",
-       "agentRole": "You are gathering and analyzing evidence systematically. Let the data guide you, not your initial assumptions.",
-       "guidance": [
-         "Evidence quality matters - weak evidence shouldn't drive conclusions",
-         "Multiple independent sources of evidence are stronger than one",
-         "Be open to unexpected findings that suggest new hypotheses"
-       ],
-       "requireConfirmation": false
-     },
-     {
-       "id": "phase-5-validation",
-       "title": "Phase 5: Hypothesis Validation",
-       "prompt": "**VALIDATE YOUR LEADING HYPOTHESIS**\n\n**Your Task**: Rigorously validate your strongest hypothesis before writing it up.\n\n**STEP 1 - State Your Leading Hypothesis**\n\nBased on evidence from Phase 4:\n- What hypothesis has the strongest support?\n- What's your confidence level (1-10)?\n- What evidence supports it?\n\n**STEP 2 - Adversarial Review**\n\nChallenge your own conclusion:\n- **Alternative Explanations**: What else could explain the evidence?\n- **Contradicting Evidence**: What evidence doesn't fit?\n- **Bias Check**: Are you anchoring on your first impression?\n- **Completeness**: Does this explain ALL symptoms?\n- **Edge Cases**: Does it hold for all scenarios?\n\n**STEP 3 - Additional Validation**\n\nIf confidence is below 9/10, gather more evidence:\n- What specific test would raise confidence?\n- What alternative hypothesis should you rule out?\n- What code experiment would be definitive?\n\nExecute these additional validations.\n\n**STEP 4 - Final Confidence Assessment**\n\nAnswer these questions:\n- Does this hypothesis explain all observed symptoms? (Yes/No)\n- Is there contradicting evidence? (Yes/No)\n- Have you ruled out major alternatives? (Yes/No)\n- Can you reproduce the bug based on this understanding? (Yes/No)\n- Would you bet your reputation on this diagnosis? (Yes/No)\n\n**OUTPUT**: Create ValidationReport.md with:\n- Leading hypothesis statement\n- Supporting evidence (with quality ratings)\n- Alternative explanations considered and why ruled out\n- Adversarial review findings\n- Final confidence score (1-10)\n- Remaining uncertainties\n\n**Threshold**: You should have 9+/10 confidence to proceed to writeup. If not, identify what's missing and continue investigation.",
-       "agentRole": "You are rigorously validating your conclusion. Be your own harshest critic.",
-       "guidance": [
-         "Adversarial review helps catch confirmation bias",
-         "9/10 confidence threshold prevents premature conclusions",
-         "Being explicit about remaining uncertainties is valuable"
-       ],
-       "requireConfirmation": false
-     },
-     {
-       "id": "phase-6-writeup",
-       "title": "Phase 6: Diagnostic Writeup",
-       "prompt": "**CREATE COMPREHENSIVE DIAGNOSTIC WRITEUP**\n\n**Your Task**: Document your investigation in a clear, actionable writeup.\n\n**Structure Your Writeup**:\n\n**1. EXECUTIVE SUMMARY** (3-5 sentences)\n- What is the bug?\n- What causes it?\n- How confident are you?\n- What's the impact?\n\n**2. ROOT CAUSE ANALYSIS**\n- Detailed explanation of the root cause\n- Why this causes the observed symptoms\n- Code locations involved (file:line references)\n- Relevant code snippets\n\n**3. EVIDENCE**\n- Key evidence that proves the diagnosis\n- Evidence quality and sources\n- How you validated the hypothesis\n- Alternative explanations considered and ruled out\n\n**4. REPRODUCTION**\n- Minimal steps to reproduce\n- What to observe that confirms the diagnosis\n- Conditions required (environment, data, timing)\n\n**5. INVESTIGATION SUMMARY**\n- What you analyzed\n- Hypotheses you tested\n- How you arrived at the conclusion\n- Time spent and key turning points\n\n**6. NEXT STEPS (for whoever fixes this)**\n- Suggested fix approach (conceptual, not implementation)\n- Risks or considerations for the fix\n- How to verify the fix works\n- Tests that should be added\n\n**7. REMAINING UNCERTAINTIES**\n- What you're still unsure about\n- What couldn't be fully validated\n- Edge cases that need more investigation\n\n**OUTPUT**: Create DIAGNOSTIC_WRITEUP.md with the above structure.\n\n**Quality Check**:\n- Could someone unfamiliar with this investigation understand the bug from reading this?\n- Is it clear enough to enable an effective fix?\n- Have you provided sufficient evidence?\n- Have you been honest about uncertainties?\n\n**WORKFLOW COMPLETE**: Once writeup is created, set isWorkflowComplete=true.",
-       "agentRole": "You are documenting your investigation for others. Clarity and completeness matter. This is the deliverable.",
-       "requireConfirmation": false
-     }
-   ]
- }
-
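The "Remaining Challenges" section of the deleted notes suggests validating that each phase's artifacts exist before allowing progression. A minimal sketch of that guard follows; the phase ids and artifact filenames come from the workflow JSON above, but `nextPhase` itself and its signature are hypothetical, not part of the published workflow-service.

```typescript
// Hypothetical progression guard: refuse to advance past a phase whose
// required output artifact has not been produced. Phases 3 and 4 update
// existing or variable files (Evidence_H*.md), so they carry no fixed gate.
type Phase = { id: string; artifact: string | null };

const phases: Phase[] = [
  { id: "phase-0-setup", artifact: "INVESTIGATION_CONTEXT.md" },
  { id: "phase-1-analysis", artifact: "ExecutionFlow.md" },
  { id: "phase-2-hypotheses", artifact: "Hypotheses.md" },
  { id: "phase-3-instrumentation", artifact: null },
  { id: "phase-4-evidence", artifact: null },
  { id: "phase-5-validation", artifact: "ValidationReport.md" },
  { id: "phase-6-writeup", artifact: "DIAGNOSTIC_WRITEUP.md" },
];

// Returns the next phase id, null when the workflow is complete,
// and throws if the current phase's required artifact is missing.
function nextPhase(currentIndex: number, produced: Set<string>): string | null {
  const current = phases[currentIndex];
  if (current.artifact && !produced.has(current.artifact)) {
    throw new Error(`Cannot advance past ${current.id}: missing ${current.artifact}`);
  }
  const next = phases[currentIndex + 1];
  return next ? next.id : null;
}
```

Wired into the server's step dispatch, a guard like this would make "very confident" exits after Phase 2 impossible without a Hypotheses.md to show for it.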