@exaudeus/workrail 0.1.0 → 0.1.2
- package/README.md +153 -189
- package/dist/application/services/classification-engine.d.ts +33 -0
- package/dist/application/services/classification-engine.js +258 -0
- package/dist/application/services/compression-service.d.ts +20 -0
- package/dist/application/services/compression-service.js +312 -0
- package/dist/application/services/context-management-service.d.ts +38 -0
- package/dist/application/services/context-management-service.js +301 -0
- package/dist/application/services/context-persistence-service.d.ts +45 -0
- package/dist/application/services/context-persistence-service.js +273 -0
- package/dist/cli/migrate-workflow.js +3 -2
- package/dist/infrastructure/storage/context-storage.d.ts +150 -0
- package/dist/infrastructure/storage/context-storage.js +40 -0
- package/dist/infrastructure/storage/filesystem-blob-storage.d.ts +27 -0
- package/dist/infrastructure/storage/filesystem-blob-storage.js +363 -0
- package/dist/infrastructure/storage/hybrid-context-storage.d.ts +29 -0
- package/dist/infrastructure/storage/hybrid-context-storage.js +400 -0
- package/dist/infrastructure/storage/migrations/001_initial_schema.sql +38 -0
- package/dist/infrastructure/storage/migrations/002_context_concurrency_enhancements.sql +234 -0
- package/dist/infrastructure/storage/migrations/003_classification_overrides.sql +20 -0
- package/dist/infrastructure/storage/sqlite-metadata-storage.d.ts +35 -0
- package/dist/infrastructure/storage/sqlite-metadata-storage.js +410 -0
- package/dist/infrastructure/storage/sqlite-migrator.d.ts +46 -0
- package/dist/infrastructure/storage/sqlite-migrator.js +293 -0
- package/dist/types/context-types.d.ts +236 -0
- package/dist/types/context-types.js +10 -0
- package/dist/utils/storage-security.js +1 -1
- package/package.json +4 -1
- package/workflows/coding-task-workflow-with-loops.json +434 -0
- package/workflows/mr-review-workflow.json +75 -26
- package/workflows/systemic-bug-investigation-with-loops.json +423 -0
@@ -1,7 +1,7 @@
 {
 "id": "mr-review-workflow",
 "name": "Adaptive MR Review Workflow",
-"version": "0.
+"version": "0.2.0",
 "description": "An adaptive workflow to guide an AI agent in performing a comprehensive code review. It adjusts its rigor based on MR complexity and includes checkpoints for architectural and self-critique to provide deep, actionable feedback.",
 "preconditions": [
 "User has the full code diff accessible (e.g., as text in a file).",
@@ -9,17 +9,21 @@
 "The agent has access to file-reading tools."
 ],
 "metaGuidance": [
-"The ultimate goal is to assist, not replace, a human reviewer.",
-"All feedback should be constructive and actionable.",
+"The ultimate goal is to assist, not replace, a human reviewer. The human owns the final merge decision.",
+"All feedback should be constructive and actionable. Explain the 'why' behind suggestions.",
+"The goal is continuous improvement, not perfection. Approve changes that are 'better' to maintain velocity.",
+"Foster a blameless culture of collective ownership. The code is a shared asset.",
+"Use prefixes like 'Nit:' for non-blocking, minor suggestions to keep focus on important issues.",
+"Embrace small, single-purpose pull requests for faster, more thorough reviews.",
 "When citing issues, always try to provide specific file paths and line numbers from the diff.",
 "Maintain the persona of a helpful, collaborative senior engineer.",
-"If at any point you determine that a crucial piece of information is missing
+"If at any point you determine that a crucial piece of information is missing, you must pause and ask the user to provide it."
 ],
 "steps": [
 {
 "id": "phase-0-triage",
 "title": "Phase 0: Triage & Review Focus",
-"prompt": "To begin the Merge Request review, please provide the full context below and classify the MR's complexity. This will tailor the depth of the review.\n\n**1. MR Context:**\n* **MR Title/Purpose:** [User provides the title and a brief description of its purpose.]\n* **Related Ticket(s):** [User provides ticket numbers or links.]\n* **Key Requirements/Acceptance Criteria:** [User lists key requirements from the ticket(s).]\n\n**2. Code Diff:**\n[User pastes the full `git diff` output or provides a path to a file containing the diff.]\n\n**3. Complexity Classification & Focus:**\n* **Classification:** Please choose one: **[Trivial]**, **[Standard]**, or **[High-Risk]**.\n* **Areas of Focus (Optional):** Are there specific areas you want me to pay close attention to? (e.g., 'performance implications', 'API design', 'data integrity').",
+"prompt": "To begin the Merge Request review, please provide the full context below and classify the MR's complexity. This will tailor the depth of the review.\n\n**1. MR Context:**\n* **MR Title/Purpose:** [User provides the title and a brief description of its purpose.]\n* **Related Ticket(s):** [User provides ticket numbers or links.]\n* **Key Requirements/Acceptance Criteria:** [User lists key requirements from the ticket(s).]\n\n**2. Code Diff:**\n[User pastes the full `git diff` output or provides a path to a file containing the diff.]\n\n**3. Complexity Classification & Focus:**\n* **Classification:** Please choose one: **[Trivial]**, **[Standard]**, or **[High-Risk]**.\n* **PR Size:** Is this a small, focused change (<400 lines)? If not, does it have a single, clear purpose?\n* **Areas of Focus (Optional):** Are there specific areas you want me to pay close attention to? (e.g., 'performance implications', 'API design', 'data integrity').",
 "agentRole": "You are a code review coordinator and triage specialist with expertise in assessing merge request complexity and risk. Your role is to efficiently classify reviews and establish appropriate focus areas to ensure the right level of scrutiny for each change.",
 "guidance": [
 "**[Trivial]:** For minor fixes (typos, docs). This will run a condensed, single-phase review.",
@@ -35,24 +39,55 @@
 "guidance": [
 "This is a critical sanity check. If the agent's summary is incorrect, correct it now to prevent a flawed review. This step is skipped for 'Trivial' reviews."
 ],
+"runCondition": { "var": "complexity", "not_equals": "Trivial" },
 "requireConfirmation": true
 },
 {
-"id": "phase-
-"title": "Phase
-"prompt": "
-"agentRole": "You are a
-"guidance": [
-
-]
+"id": "phase-1-context-setup",
+"title": "Phase 1: Initializing Analysis State",
+"prompt": "Initializing state for progressive depth analysis. Setting analysisDepth = 1, analysisComplete = false, and majorIssuesFound = false.",
+"agentRole": "You are a state manager preparing the workflow for an iterative review process.",
+"guidance": ["This is an automated step to prepare for the analysis loop."],
+"runCondition": { "var": "complexity", "not_equals": "Trivial" }
 },
 {
-"id": "phase-
-"title": "Phase
-"prompt": "
-"agentRole": "You are a
-"guidance": [
-
+"id": "phase-1a-llm-context-gathering",
+"title": "Phase 1a: Comprehensive Context Gathering",
+"prompt": "To perform a thorough review, I need more than just the code diff. Please provide the following:\n\n1. **Business Context:** Paste the full text of the associated Jira/GitHub ticket, including the requirements and acceptance criteria. This helps me validate the 'why' behind the change.\n2. **Technical Context:** If this change relies on other parts of the codebase, please provide the relevant code snippets or file contents. Also, include any relevant architectural diagrams or coding standards documents.\n\nI will synthesize this information to build a comprehensive context for my review.",
+"agentRole": "You are a context-aware analyst ensuring you have all necessary information before starting a deep review.",
+"guidance": ["A high-quality review depends on high-quality context. The more information provided here, the more accurate the review will be."],
+"runCondition": { "var": "complexity", "not_equals": "Trivial" },
+"requireConfirmation": true
+},
+{
+"id": "phase-2-depth-analysis-loop",
+"type": "loop",
+"title": "Phase 2: Progressive Depth Analysis",
+"runCondition": { "var": "complexity", "not_equals": "Trivial" },
+"loop": {
+"type": "until",
+"condition": { "var": "analysisComplete", "equals": true },
+"maxIterations": 3,
+"iterationVar": "analysisDepth"
+},
+"body": [
+{
+"id": "perform-analysis-pass",
+"title": "Analysis Pass {{analysisDepth}} of 3",
+"prompt": "Act as a Senior Staff Engineer. Your task is to review the code based on the checklist for the current analysis depth. Think step-by-step.\n\n**Current Depth: {{analysisDepth}}**\n\n**Checklist:**\n* **Depth 1 (Basic Scan):** Check for style guide violations, simple bugs (e.g., typos, unused variables), and common security vulnerabilities (OWASP Top 10).\n* **Depth 2 (Standard Review):** Check for logical errors, edge cases, adherence to SOLID principles, and maintainability issues.\n* **Depth 3 (Deep Architectural Review):** Check for alignment with system architecture, long-term impact, performance bottlenecks, and dependency risks.\n\n**Instructions:**\n1. **Summarize Focus:** State which depth level you are on and what you will focus on.\n2. **Analyze:** Perform the review based on the checklist for depth {{analysisDepth}}.\n3. **List Findings:** Document your findings, categorizing them as 'Critical', 'Major', or 'Minor'.\n4. **Set Flag:** If you find any 'Critical' or 'Major' issues, set the context variable `majorIssuesFound = true`.",
+"agentRole": "You are a Senior Staff Engineer performing a structured, multi-pass code review with increasing levels of scrutiny.",
+"guidance": [
+"At each depth, focus only on the items in that checklist.",
+"Use Chain-of-Thought reasoning to explain your findings."
+]
+},
+{
+"id": "check-analysis-completion",
+"title": "Check Analysis Completion",
+"prompt": "Checking if the analysis is complete. If no 'Major' or 'Critical' issues were found in the last pass, or if we have reached the maximum depth (3), I will set `analysisComplete = true` to exit the loop. Otherwise, I will increment the `analysisDepth` and continue to the next level of review.",
+"agentRole": "You are an automated process controller determining whether to deepen the analysis or conclude this phase.",
+"guidance": ["This step determines the exit condition for the progressive depth loop."]
+}
 ]
 },
 {
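The `phase-2-depth-analysis-loop` step in this hunk pairs an `until` condition with `maxIterations: 3` and an `iterationVar`. The following is a minimal sketch of how such a loop could be driven. It is an assumption about the semantics, not workrail's actual engine: `run_until_loop`, `condition_met`, `example_body`, and the 1-based counter are hypothetical names and choices made for illustration.

```python
from typing import Any, Callable, Dict

Context = Dict[str, Any]


def condition_met(condition: Dict[str, Any], context: Context) -> bool:
    """Evaluate a {"var": ..., "equals": ...} condition against the context."""
    return context.get(condition["var"]) == condition["equals"]


def run_until_loop(
    loop: Dict[str, Any],
    body: Callable[[Context], None],
    context: Context,
) -> int:
    """Run `body` until the condition holds or maxIterations is reached.

    Returns the number of passes executed. The counter is written to the
    variable named by `iterationVar` (1-based here, which is an assumption).
    """
    passes = 0
    for iteration in range(1, loop["maxIterations"] + 1):
        context[loop["iterationVar"]] = iteration
        body(context)
        passes += 1
        if condition_met(loop["condition"], context):
            break
    return passes


# Loop settings copied from the phase-2-depth-analysis-loop step.
DEPTH_LOOP = {
    "type": "until",
    "condition": {"var": "analysisComplete", "equals": True},
    "maxIterations": 3,
    "iterationVar": "analysisDepth",
}


def example_body(context: Context) -> None:
    """Hypothetical stand-in for the two body steps: stop after a clean pass."""
    found_major = context["analysisDepth"] in context["depths_with_major_issues"]
    context["majorIssuesFound"] = found_major
    if not found_major or context["analysisDepth"] == 3:
        context["analysisComplete"] = True


ctx: Context = {"analysisComplete": False, "depths_with_major_issues": {1}}
passes = run_until_loop(DEPTH_LOOP, example_body, ctx)
print(passes, ctx["analysisDepth"])
```

In this sketch only the loop driver sets `analysisDepth`. The `check-analysis-completion` prompt also says it will increment `analysisDepth`, so whether the engine or the agent owns that counter is worth confirming against the real implementation.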
@@ -62,15 +97,28 @@
 "agentRole": "You are a software quality engineer specializing in testing strategy and impact analysis. Your expertise includes identifying testing gaps, documentation requirements, and potential breaking changes that could affect system stability or user experience.",
 "guidance": [
 "Assessing test coverage is critical. A lack of tests for new logic is often a 'Major' or 'Critical' concern."
-]
+],
+"runCondition": { "var": "complexity", "not_equals": "Trivial" }
 },
 {
-"id": "phase-4-
-"
-"
-"
-"
-"
+"id": "phase-4-refinement-loop",
+"type": "loop",
+"title": "Phase 4: Iterative Summary & Refinement",
+"runCondition": { "var": "complexity", "not_equals": "Trivial" },
+"loop": {
+"type": "until",
+"condition": { "var": "critiqueComplete", "equals": true },
+"maxIterations": 2,
+"iterationVar": "critiquePass"
+},
+"body": [
+{
+"id": "generate-summary-and-critique",
+"title": "Generate Summary & Self-Critique (Pass {{critiquePass}})",
+"prompt": "Consolidate all findings into a final, actionable report.\n\n**1. Devil's Advocate Self-Critique:**\nReview your own findings. Are any of your 'Major' concerns actually minor preferences? Is there a pattern among 'Minor' suggestions that points to a larger problem? Does this critique reveal a fundamental flaw in your analysis that requires another pass?\n\n**2. Final Report:**\n* **Overall Assessment:** A 2-3 sentence summary.\n* **Key Positive Aspects:** Highlights of what was done well.\n* **Concerns (Categorized):** Critical, Major, and Minor/Nit.\n* **Questions for Author:** Clarifying questions.\n* **Actionable Recommendations:** Concrete suggestions.\n\n**3. Set Flag:** Based on your self-critique, set `critiqueComplete = true` if the summary is robust, or `critiqueComplete = false` if you need one more pass to refine it.",
+"agentRole": "You are a facilitator synthesizing feedback and performing a self-critique to ensure the final report is balanced and high-quality.",
+"guidance": ["Use 'Nit:' for minor issues. Balance criticism with praise. The goal is a collaborative and constructive report."]
+}
 ]
 },
 {
@@ -80,7 +128,8 @@
 "agentRole": "You are an efficient code reviewer specializing in rapid assessment of low-risk changes. Your role is to quickly validate that trivial changes are safe and appropriate while avoiding over-analysis of simple modifications.",
 "guidance": [
 "This is the condensed summary for 'Trivial' reviews. The agent skips all other analytical steps and provides a simple confirmation for straightforward changes."
-]
+],
+"runCondition": { "var": "complexity", "equals": "Trivial" }
 }
 ]
 }
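The main behavioural change in the `mr-review-workflow` hunks above is that routing by complexity is now expressed through `runCondition` objects using `equals` and `not_equals`. The sketch below shows one way such conditions could be evaluated to pick the steps that run. It is illustrative only: `should_run` and `steps_to_run` are hypothetical helpers, the step id `trivial-summary-placeholder` is invented because the real id is not visible in the diff, and treating a step with no `runCondition` as always running is an assumption.

```python
from typing import Any, Dict, List, Optional

Condition = Dict[str, Any]
Context = Dict[str, Any]


def should_run(run_condition: Optional[Condition], context: Context) -> bool:
    """Return True if a step's runCondition is satisfied.

    Steps without a runCondition are assumed to always run.
    """
    if run_condition is None:
        return True
    value = context.get(run_condition["var"])
    if "equals" in run_condition:
        return value == run_condition["equals"]
    if "not_equals" in run_condition:
        return value != run_condition["not_equals"]
    raise ValueError(f"unsupported runCondition: {run_condition!r}")


NOT_TRIVIAL: Condition = {"var": "complexity", "not_equals": "Trivial"}

# Step ids taken from the diff, except the last one, which is a placeholder.
STEPS: List[Dict[str, Any]] = [
    {"id": "phase-0-triage"},  # assumed to have no runCondition
    {"id": "phase-1-context-setup", "runCondition": NOT_TRIVIAL},
    {"id": "phase-1a-llm-context-gathering", "runCondition": NOT_TRIVIAL},
    {"id": "phase-2-depth-analysis-loop", "runCondition": NOT_TRIVIAL},
    {"id": "phase-4-refinement-loop", "runCondition": NOT_TRIVIAL},
    {
        "id": "trivial-summary-placeholder",  # hypothetical id
        "runCondition": {"var": "complexity", "equals": "Trivial"},
    },
]


def steps_to_run(steps: List[Dict[str, Any]], context: Context) -> List[str]:
    """List the ids of the steps whose runCondition holds for this context."""
    return [s["id"] for s in steps if should_run(s.get("runCondition"), context)]


print(steps_to_run(STEPS, {"complexity": "Trivial"}))
print(steps_to_run(STEPS, {"complexity": "Standard"}))
```

With `complexity` set to `Trivial`, only the triage step and the condensed summary survive the filter, which matches the guidance text stating that the agent skips all other analytical steps for trivial reviews.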
@@ -0,0 +1,423 @@
+{
+"id": "systematic-bug-investigation-with-loops",
+"name": "Systematic Bug Investigation Workflow",
+"version": "1.0.0",
+"description": "A comprehensive workflow for systematic bug and failing test investigation that prevents LLMs from jumping to conclusions. Enforces thorough evidence gathering, hypothesis formation, debugging instrumentation, and validation to achieve near 100% certainty about root causes. This workflow does NOT fix bugs - it produces detailed diagnostic writeups that enable effective fixing by providing complete understanding of what is happening, why it's happening, and supporting evidence.",
+"clarificationPrompts": [
+"What type of system is this? (web app, mobile app, backend service, desktop app, etc.)",
+"How consistently can you reproduce this bug? (always reproducible, sometimes reproducible, rarely reproducible)",
+"What was the last known working version or state if applicable?",
+"Are there any time constraints or urgency factors for this investigation?",
+"What level of system access do you have? (full codebase, limited access, production logs only)",
+"Do you have preferences for handling large log volumes? (sub-chat analysis, inline summaries only, or no preference for automatic decision)"
+],
+"preconditions": [
+"User has identified a specific bug or failing test to investigate",
+"Agent has access to codebase analysis tools (grep, file readers, etc.)",
+"Agent has access to build/test execution tools for the project type",
+"User can provide error messages, stack traces, or test failure output",
+"Bug is reproducible with specific steps or a minimal test case"
+],
+"metaGuidance": [
+"INVESTIGATION DISCIPLINE: Never propose fixes or solutions until Phase 6 (Comprehensive Diagnostic Writeup). Focus entirely on systematic evidence gathering and analysis.",
+"HYPOTHESIS RIGOR: All hypotheses must be based on concrete evidence from code analysis with quantified scoring (1-10 scales). Maximum 5 hypotheses per investigation.",
+"DEBUGGING INSTRUMENTATION: Always implement debugging mechanisms before running tests - logs, print statements, or test modifications that will provide evidence.",
+"EVIDENCE THRESHOLD: Require minimum 3 independent sources of evidence before confirming any hypothesis. Use objective verification criteria.",
+"SYSTEMATIC PROGRESSION: Complete each investigation phase fully before proceeding. Each phase builds critical context for the next with structured documentation.",
+"CONFIDENCE CALIBRATION: Use mathematical confidence framework with 9.0/10 minimum threshold. Actively challenge conclusions with adversarial analysis.",
+"UNCERTAINTY ACKNOWLEDGMENT: Explicitly document all remaining unknowns and their potential impact. No subjective confidence assessments.",
+"THOROUGHNESS: For complex bugs, recursively analyze dependencies and internals of identified components to ensure full picture.",
+"TEST INTEGRATION: Leverage existing tests to validate hypotheses where possible.",
+"LOG ENHANCEMENTS: Include class/function names. For repetitive logs, implement deduplication by tracking counts ('x[count]') and grouping related sequential logs for readability. See Phase 3 for detailed implementation patterns and examples.",
+"LOG ANALYSIS OFFLOADING: For voluminous logs (>500 lines), offload analysis to sub-chats with structured prompts. See Phase 4 for detailed sub-analysis implementation.",
+"RECURSION DEPTH: Limit recursive analysis to 3 levels deep to prevent analysis paralysis while ensuring thoroughness.",
+"INVESTIGATION BOUNDS: If investigation exceeds 20 steps or 4 hours without root cause, pause and reassess approach with user.",
+"AUTOMATION LEVELS: High=auto-approve >8.0 confidence decisions, Medium=standard confirmations, Low=extra confirmations for safety. Control workflow autonomy based on user preference.",
+"CONTEXT DOCUMENTATION: Maintain INVESTIGATION_CONTEXT.md throughout. Update after major milestones, failures, or user interventions to enable seamless handoffs between sessions.",
+"GIT FALLBACK STRATEGY: If git unavailable, gracefully skip commits/branches, log changes manually in CONTEXT.md with timestamps, warn user, document modifications for manual control.",
+"GIT ERROR HANDLING: Use run_terminal_cmd for git operations; if fails, output exact command for user manual execution. Never halt investigation due to git unavailability.",
+"TOOL AVAILABILITY AWARENESS: Check debugging tool availability before investigation design. Have fallbacks for when primary tools unavailable (grep→file_search, etc).",
+"SECURITY PROTOCOLS: Sanitize sensitive data in logs/reproduction steps. Be mindful of exposing credentials, PII, or system internals during evidence collection phases.",
+"DYNAMIC RE-TRIAGE: Allow complexity upgrades during investigation if evidence reveals deeper issues. Safe downgrades only with explicit user confirmation after evidence review.",
+"DEVIL'S ADVOCATE REVIEW: Actively challenge primary hypothesis with available evidence. Seek alternative explanations and rate alternative likelihood before final confidence assessment.",
+"COLLABORATIVE HANDOFFS: Structure documentation for peer review and team coordination. Include methodology, reasoning, and complete evidence chain for knowledge transfer.",
+"FAILURE BOUNDS: Track investigation progress. If >20 steps or >4 hours without breakthrough, pause for user guidance. Document dead ends to prevent redundant work in future sessions.",
+"COGNITIVE BREAKS: After 10 investigation steps, pause and summarize progress to reset perspective.",
+"RUBBER DUCK: Verbalize hypotheses in sub-prompts to externalize reasoning and catch logical gaps.",
+"COLLABORATION READY: Document clearly for handoffs when stuck beyond iteration limits."
+],
+"steps": [
+{
+"id": "phase-0-triage",
+"title": "Phase 0: Initial Triage & Context Gathering",
+"prompt": "**SYSTEMATIC INVESTIGATION BEGINS** - Your mission is to achieve near 100% certainty about this bug's root cause through systematic evidence gathering. NO FIXES will be proposed until Phase 6.\n\n**STEP 1: Bug Report Analysis**\nPlease provide the complete bug context:\n- **Bug Description**: What is the observed behavior vs expected behavior?\n- **Error Messages/Stack Traces**: Paste the complete error output\n- **Reproduction Steps**: How can this bug be consistently reproduced?\n- **Environment Details**: OS, language version, framework version, etc.\n- **Recent Changes**: Any recent commits, deployments, or configuration changes?\n\n**STEP 2: Project Type Classification**\nBased on the information provided, I will classify the project type and set debugging strategies:\n- **Languages/Frameworks**: Primary tech stack\n- **Build System**: Maven, Gradle, npm, etc.\n- **Testing Framework**: JUnit, Jest, pytest, etc.\n- **Logging System**: Available logging mechanisms\n- **Architecture**: Monolithic, microservices, distributed, serverless, etc.\n\n**STEP 3: Complexity Assessment**\nI will analyze the bug complexity using these criteria:\n- **Simple**: Single function/method, clear error path, minimal dependencies\n- **Standard**: Multiple components, moderate investigation required\n- **Complex**: Cross-system issues, race conditions, complex state management\n\n**OUTPUTS**: Set `projectType`, `bugComplexity`, `debuggingMechanism`, and `isDistributed` (true if architecture involves microservices/distributed systems) context variables.",
+"agentRole": "You are a senior debugging specialist and bug triage expert with 15+ years of experience across multiple technology stacks. Your expertise lies in quickly classifying bugs, understanding project architectures, and determining appropriate investigation strategies. You excel at extracting critical information from bug reports and setting up systematic investigation approaches.",
+"guidance": [
+"CLASSIFICATION ACCURACY: Proper complexity assessment determines investigation depth - be thorough but decisive",
+"CONTEXT CAPTURE: Gather complete environmental and situational context now to avoid gaps later",
+"DEBUGGING STRATEGY: Choose debugging mechanisms appropriate for the project type and bug complexity",
+"NO ASSUMPTIONS: If critical information is missing, explicitly request it before proceeding"
+]
+},
+{
+"id": "phase-0a-assumption-check",
+"title": "Phase 0a: Assumption Verification Checkpoint",
+"prompt": "**ASSUMPTION CHECK** - Before proceeding, verify key assumptions to prevent bias.\n\n**VERIFY**:\n1. **Data State**: Confirm variable types and null handling\n2. **API/Library**: Check documentation for actual vs assumed behavior\n3. **Environment**: Verify bug exists in clean environment\n4. **Recent Changes**: Review last 5 commits for relevance\n\n**OUTPUT**: List verified assumptions with evidence sources.",
+"agentRole": "You are a skeptical analyst who challenges every assumption. Question everything that hasn't been explicitly verified.",
+"guidance": [
+"Use analysis tools to verify, don't assume",
+"Document each assumption with its verification method",
+"Flag any unverifiable assumptions for tracking",
+"CHECK API DOCS: Never assume function behavior from names - verify actual documentation",
+"VERIFY DATA TYPES: Use debugger or logs to confirm actual runtime types and values",
+"TEST ENVIRONMENT: Reproduce in minimal environment to rule out configuration issues"
+]
+},
+{
+"id": "phase-0b-reproducibility-loop",
+"type": "loop",
+"title": "Phase 0b: Reproducibility Verification Loop",
+"loop": {
+"type": "for",
+"count": 3,
+"maxIterations": 3,
+"iterationVar": "reproductionAttempt"
+},
+"body": [
+{
+"id": "reproduce-bug",
+"title": "Reproduction Attempt {{reproductionAttempt}}/3",
+"prompt": "**REPRODUCTION ATTEMPT {{reproductionAttempt}}/3**\n\nExecute the provided reproduction steps:\n1. Follow exact steps from bug report\n2. Document outcome (Success/Failure)\n3. Note any variations in behavior\n4. Capture error messages/stack traces\n\n**Update context:**\n- Set `reproductionResults[{{reproductionAttempt - 1}}]` = true/false\n- If failed, document why\n- Track any intermittent patterns",
+"agentRole": "You are systematically verifying bug reproducibility to ensure solid investigation foundation.",
+"guidance": [
+"Execute exactly as specified",
+"Document any deviations",
+"Capture all error details"
+],
+"requireConfirmation": false
+}
+],
+"requireConfirmation": false
+},
+{
+"id": "phase-0c-reproducibility-assessment",
+"title": "Phase 0c: Reproducibility Assessment",
+"prompt": "**ASSESS REPRODUCIBILITY**\n\nBased on 3 reproduction attempts:\n- **Success Rate**: Calculate percentage\n- **Pattern Analysis**: Identify any intermittent patterns\n- **Minimal Reproduction**: Create simplified test case if needed\n\n**DECISION:**\n- If 100% reproducible: Proceed to Phase 1\n- If intermittent: Apply stress techniques and document patterns\n- If 0% reproducible: Request more information from user\n\n**Set `isReproducible` = true/false based on assessment**",
+"agentRole": "You are assessing reproduction results to determine investigation viability.",
+"guidance": [
+"100% reproduction is ideal but not always required",
+"Document intermittent patterns for investigation",
+"Create minimal test case for complex scenarios"
+],
+"validationCriteria": [
+{
+"type": "contains",
+"value": "reproducib",
+"message": "Must make reproducibility determination"
+}
+],
+"hasValidation": true,
+"runCondition": {
+"var": "reproductionAttempt",
+"equals": 3
+}
+},
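The `phase-0c-reproducibility-assessment` step asks for a success rate over the three attempts recorded in `reproductionResults` and a three-way decision. A minimal sketch of that arithmetic follows. The function name `assess_reproducibility` and the decision strings are illustrative choices, and how `isReproducible` should be set for the intermittent case is not specified in the workflow, so it is left out here.

```python
from typing import List, Tuple


def assess_reproducibility(results: List[bool]) -> Tuple[float, str]:
    """Return (success rate in percent, decision) for reproduction attempts.

    `results` mirrors the workflow's `reproductionResults` list: one boolean
    per attempt, True when the bug was reproduced.
    """
    if not results:
        raise ValueError("at least one reproduction attempt is required")

    successes = sum(results)  # True counts as 1, False as 0
    rate = 100.0 * successes / len(results)

    # Compare integer counts rather than floats to pick the branch.
    if successes == len(results):
        decision = "proceed to Phase 1"
    elif successes == 0:
        decision = "request more information from user"
    else:
        decision = "intermittent: apply stress techniques and document patterns"
    return rate, decision


for attempts in ([True, True, True], [True, False, True], [False, False, False]):
    rate, decision = assess_reproducibility(attempts)
    print(f"{rate:.0f}% -> {decision}")
```

Two successes out of three give a rate of roughly 67%, which falls into the intermittent branch rather than blocking the investigation, in line with the guidance that 100% reproduction is ideal but not always required.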
+{
+"id": "phase-1-streamlined-analysis",
+"runCondition": {
+"var": "bugComplexity",
+"equals": "simple"
+},
+"title": "Phase 1: Streamlined Analysis (Simple Bugs)",
+"prompt": "**STREAMLINED CODEBASE INVESTIGATION** - For simple bugs, I will perform focused analysis of the core issue.\n\n**STEP 1: Direct Component Analysis**\nI will examine the specific component involved:\n- **Primary Function/Method**: Direct analysis of the failing code\n- **Input/Output Analysis**: What data enters and exits the component\n- **Logic Flow**: Step-by-step execution path\n- **Error Point**: Exact location where failure occurs\n\n**STEP 2: Immediate Context Review**\n- **Recent Changes**: Git commits affecting this specific component\n- **Related Tests**: Existing test coverage for this functionality\n- **Dependencies**: Direct dependencies that could affect this component\n\n**STEP 3: Quick Hypothesis Formation**\nI will generate 1-3 focused hypotheses based on:\n- **Obvious Error Patterns**: Common failure modes for this type of component\n- **Change Impact**: How recent modifications could cause this issue\n- **Input Validation**: Whether invalid inputs are causing the failure\n\n**OUTPUTS**: Focused understanding of the simple bug with 1-3 targeted hypotheses ready for validation.",
+"agentRole": "You are an experienced debugging specialist who excels at quickly identifying and resolving straightforward technical issues. Your strength lies in pattern recognition and efficient root cause analysis for simple bugs. You focus on the most likely causes while avoiding over-analysis.",
+"guidance": [
+"FOCUSED ANALYSIS: Concentrate on the specific failing component, avoid deep architectural analysis",
+"PATTERN RECOGNITION: Use experience to identify common failure modes quickly",
+"EFFICIENT HYPOTHESIS: Generate 1-3 focused hypotheses, not exhaustive possibilities",
+"DIRECT APPROACH: Skip complex dependency analysis unless directly relevant"
+]
+},
+{
+"id": "phase-1-comprehensive-analysis",
+"runCondition": {
+"or": [
+{
+"var": "bugComplexity",
+"equals": "standard"
+},
+{
+"var": "bugComplexity",
+"equals": "complex"
+}
+]
+},
+"title": "Phase 1: Deep Codebase Analysis (Standard/Complex Bugs)",
+"prompt": "**SYSTEMATIC CODEBASE INVESTIGATION** - I will now perform comprehensive analysis of the relevant codebase components.\n\n**STEP 1: Affected Component Identification**\nBased on the bug report, I will identify and analyze:\n- **Primary Components**: Classes, functions, modules directly involved\n- **Dependency Chain**: Related components that could influence the bug\n- **Data Flow**: How data moves through the affected systems\n- **Error Propagation Paths**: Where and how errors can originate and propagate\n\n**STEP 2: Code Structure Analysis**\nFor each relevant component, I will examine:\n- **Implementation Logic**: Step-by-step code execution flow\n- **State Management**: How state is created, modified, and shared\n- **Error Handling**: Existing error handling mechanisms\n- **External Dependencies**: Third-party libraries, APIs, database interactions\n- **Concurrency Patterns**: Threading, async operations, shared resources\n\n**STEP 3: Historical Context Review**\nI will analyze:\n- **Recent Changes**: Git history around the affected components\n- **Test Coverage**: Existing tests and their coverage of the bug area\n- **Known Issues**: TODO comments, FIXME notes, or similar patterns\n\n**STEP 4: Recursive Dependency Dive**\nFor key components, analyze dependencies and internals recursively to uncover hidden issues.\n\n**OUTPUTS**: Comprehensive understanding of the codebase architecture and potential failure points.",
+"agentRole": "You are a principal software architect and code analysis expert specializing in systematic codebase investigation. Your strength lies in quickly understanding complex system architectures, identifying failure points, and tracing execution flows. You excel at connecting code patterns to potential runtime behaviors.",
+"guidance": [
+"SYSTEMATIC COVERAGE: Analyze all relevant components, not just the obvious ones",
+"EXECUTION FLOW FOCUS: Trace the actual code execution path that leads to the bug",
+"STATE ANALYSIS: Pay special attention to state management and mutation patterns",
+"DEPENDENCY MAPPING: Understand how external dependencies could contribute to the issue"
+]
+},
+{
+"id": "phase-1a-binary-search",
+"title": "Phase 1a: Binary Search Isolation",
+"runCondition": {
+"or": [
+{"var": "bugType", "equals": "regression"},
+{"var": "searchSpace", "equals": "large"}
+]
+},
+"prompt": "**BINARY SEARCH** - Apply divide-and-conquer:\n\n1. Identify GOOD state (working) and BAD state (broken)\n2. Find midpoint in history/code/data\n3. Test midpoint state\n4. Narrow to relevant half\n5. Document reduced search space\n\n**OUTPUT**: Narrowed location with evidence.",
+"agentRole": "You are a systematic investigator using algorithmic search to efficiently isolate issues.",
+"guidance": [
+"VERSION CONTROL: Use 'git bisect' or equivalent for commit history searches",
+"DATA PIPELINE: Test data at pipeline midpoints to isolate transformation issues",
+"TIME WINDOWS: For time-based issues, binary search through timestamps",
+"DOCUMENT BOUNDARIES: Clearly record each tested boundary and result",
+"EFFICIENCY: Each test should eliminate ~50% of remaining search space"
+]
+},
184
|
+
{
|
|
185
|
+
"id": "phase-1b-test-reduction",
|
|
186
|
+
"title": "Phase 1b: Test Case Minimization",
|
|
187
|
+
"runCondition": {
|
|
188
|
+
"var": "bugSource",
|
|
189
|
+
"equals": "failing_test"
|
|
190
|
+
},
|
|
191
|
+
"prompt": "**TEST REDUCTION** - Simplify failing test:\n\n1. Inline called methods into test\n2. Add earlier assertion to fail sooner\n3. Remove code after new failure point\n4. Repeat until minimal\n\n**OUTPUT**: Minimal failing test case.",
|
|
192
|
+
"agentRole": "You are a surgical debugger who strips away layers to reveal core issues.",
|
|
193
|
+
"guidance": [
|
|
194
|
+
"PRESERVE FAILURE: Each reduction must maintain the original failure mode",
|
|
195
|
+
"INLINE AGGRESSIVELY: Replace method calls with their actual implementation",
|
|
196
|
+
"FAIL EARLY: Move assertions up to find earliest deviation from expected state",
|
|
197
|
+
"REMOVE RUTHLESSLY: Delete all code that doesn't contribute to the failure",
|
|
198
|
+
"CLARITY GOAL: Final test should make the bug obvious to any reader"
|
|
199
|
+
]
|
|
200
|
+
},
|
|
201
|
+
{
|
|
202
|
+
"id": "phase-2a-hypothesis-development",
|
|
203
|
+
"title": "Phase 2a: Hypothesis Development & Prioritization",
|
|
204
|
+
"prompt": "**HYPOTHESIS GENERATION** - Based on codebase analysis, formulate testable hypotheses about the bug's root cause.\n\n**STEP 1: Evidence-Based Hypothesis Development**\nCreate a maximum of 5 prioritized hypotheses. Each includes:\n- **Root Cause Theory**: Specific technical explanation\n- **Supporting Evidence**: Code patterns/logic flows supporting this theory\n- **Failure Mechanism**: Exact sequence leading to observed bug\n- **Testability Score**: Quantified assessment (1-10) of validation ease\n- **Evidence Strength Score**: Quantified assessment (1-10) based on code findings\n\n**STEP 2: Hypothesis Prioritization Matrix**\nRank hypotheses using weighted scoring:\n- **Evidence Strength** (40%): Code analysis support for theory\n- **Testability** (35%): Validation ease with debugging instruments\n- **Impact Scope** (25%): How well this explains all symptoms\n\n**CRITICAL RULE**: All hypotheses must be based on concrete evidence from code analysis.\n\n**OUTPUTS**: Maximum 5 hypotheses with quantified scoring, ranked by priority.",
|
|
205
|
+
"agentRole": "You are a senior software detective and root cause analysis expert with deep expertise in systematic hypothesis formation. Your strength lies in connecting code evidence to potential failure mechanisms and creating testable theories. You excel at logical reasoning and evidence-based deduction. You must maintain rigorous quantitative standards and reject any hypothesis not grounded in concrete code evidence.",
|
|
206
|
+
"guidance": [
|
|
207
|
+
"EVIDENCE-BASED ONLY: Every hypothesis must be grounded in concrete code analysis findings with quantified evidence scores",
|
|
208
|
+
"HYPOTHESIS LIMITS: Generate a maximum of 5 hypotheses to prevent analysis paralysis",
|
|
209
|
+
"QUANTIFIED SCORING: Use 1-10 scales for evidence strength and testability with clear criteria"
|
|
210
|
+
],
|
|
211
|
+
"validationCriteria": [
|
|
212
|
+
{
|
|
213
|
+
"type": "contains",
|
|
214
|
+
"value": "Evidence Strength Score",
|
|
215
|
+
"message": "Must include quantified evidence strength scoring (1-10) for each hypothesis"
|
|
216
|
+
},
|
|
217
|
+
{
|
|
218
|
+
"type": "contains",
|
|
219
|
+
"value": "Testability Score",
|
|
220
|
+
"message": "Must include quantified testability scoring (1-10) for each hypothesis"
|
|
221
|
+
}
|
|
222
|
+
],
|
|
223
|
+
"hasValidation": true
|
|
224
|
+
},
|
|
225
|
+
{
|
|
226
|
+
"id": "phase-2b-hypothesis-validation-strategy",
|
|
227
|
+
"title": "Phase 2b: Hypothesis Validation Strategy & Documentation",
|
|
228
|
+
"prompt": "**HYPOTHESIS VALIDATION PLANNING** - For the top 3 hypotheses, create validation strategies and documentation.\n\n**STEP 1: Hypothesis Validation Strategy**\nFor top 3 hypotheses, define:\n- **Required Evidence**: Specific evidence to confirm/refute hypothesis\n- **Debugging Approach**: Instrumentation/tests providing evidence\n- **Success Criteria**: Results proving hypothesis correct\n- **Confidence Threshold**: Minimum evidence quality needed\n\n**STEP 2: Hypothesis Documentation**\nCreate structured registry:\n- **Hypothesis ID**: H1, H2, H3 for tracking\n- **Status**: Active, Refuted, Confirmed\n- **Evidence Log**: Supporting and contradicting evidence\n- **Validation Plan**: Specific testing approach\n\n**STEP 3: Coverage Check**\nEnsure hypotheses cover diverse categories (logic, state, dependencies) with deep analysis.\n\n**OUTPUTS**: Top 3 hypotheses selected for validation with structured documentation and validation plans.",
|
|
229
|
+
"agentRole": "You are a systematic testing strategist and documentation expert. Your strength lies in creating clear validation plans and maintaining rigorous documentation standards for hypothesis tracking and evidence collection.",
|
|
230
|
+
"guidance": [
|
|
231
|
+
"STRUCTURED DOCUMENTATION: Create formal hypothesis registry with tracking IDs and status",
|
|
232
|
+
"VALIDATION RIGOR: Only proceed with top 3 hypotheses that meet minimum evidence thresholds",
|
|
233
|
+
"COMPREHENSIVE PLANNING: Each hypothesis must have clear validation approach and success criteria"
|
|
234
|
+
],
|
|
235
|
+
"validationCriteria": [
|
|
236
|
+
{
|
|
237
|
+
"type": "contains",
|
|
238
|
+
"value": "Hypothesis ID",
|
|
239
|
+
"message": "Must assign tracking IDs (H1, H2, H3) to each hypothesis"
|
|
240
|
+
},
|
|
241
|
+
{
|
|
242
|
+
"type": "regex",
|
|
243
|
+
"pattern": "H[1-3]",
|
|
244
|
+
"message": "Must use proper hypothesis ID format (H1, H2, H3)"
|
|
245
|
+
}
|
|
246
|
+
],
|
|
247
|
+
"hasValidation": true
|
|
248
|
+
},
|
|
249
|
+
{
|
|
250
|
+
"id": "phase-2c-prepare-validation",
|
|
251
|
+
"title": "Phase 2c: Prepare Hypothesis Validation",
|
|
252
|
+
"prompt": "**PREPARE VALIDATION ARRAY** - Extract the top 3 hypotheses for systematic validation.\n\n**Create `hypothesesToValidate` array with:**\n```json\n[\n {\n \"id\": \"H1\",\n \"description\": \"[Hypothesis description]\",\n \"evidenceStrength\": [score],\n \"testability\": [score],\n \"validationPlan\": \"[Specific testing approach]\"\n },\n // ... H2, H3\n]\n```\n\n**Set context variables:**\n- `hypothesesToValidate`: Array of top 3 hypotheses\n- `currentConfidence`: 0 (will be updated during validation)\n- `validationIterations`: 0 (tracks validation cycles)",
|
|
253
|
+
"agentRole": "You are preparing the systematic validation process by structuring hypotheses for iteration.",
|
|
254
|
+
"guidance": [
|
|
255
|
+
"Extract only the top 3 hypotheses from Phase 2b",
|
|
256
|
+
"Ensure each has complete validation information",
|
|
257
|
+
"Initialize tracking variables for the validation loop"
|
|
258
|
+
],
|
|
259
|
+
"requireConfirmation": false
|
|
260
|
+
},
|
|
261
|
+
{
|
|
262
|
+
"id": "phase-3-4-5-validation-loop",
|
|
263
|
+
"type": "loop",
|
|
264
|
+
"title": "Hypothesis Validation Loop (Phases 3-4-5)",
|
|
265
|
+
"loop": {
|
|
266
|
+
"type": "forEach",
|
|
267
|
+
"items": "hypothesesToValidate",
|
|
268
|
+
"itemVar": "currentHypothesis",
|
|
269
|
+
"indexVar": "hypothesisIndex",
|
|
270
|
+
"maxIterations": 5
|
|
271
|
+
},
|
|
272
|
+
"body": [
|
|
273
|
+
{
|
|
274
|
+
"id": "loop-phase-3-instrumentation",
|
|
275
|
+
"title": "Phase 3: Debug Instrumentation for {{currentHypothesis.id}}",
|
|
276
|
+
"prompt": "**DEBUGGING INSTRUMENTATION for {{currentHypothesis.id}}**\n\n**Hypothesis**: {{currentHypothesis.description}}\n**Validation Plan**: {{currentHypothesis.validationPlan}}\n\n**IMPLEMENT INSTRUMENTATION:**\n1. **Strategy**: Choose based on hypothesis needs (logging, debug prints, test mods)\n2. **Coverage**: Instrument all paths related to {{currentHypothesis.id}}\n3. **Evidence Points**: Focus on gathering evidence that will confirm/refute this specific hypothesis\n\n**LOG OPTIMIZATION:**\n- Use '{{currentHypothesis.id}}_DEBUG:' prefix for all logs\n- Implement deduplication for high-frequency events\n- Group related operations within 50-100ms windows\n\n**OUTPUT**: Instrumented code ready to validate {{currentHypothesis.id}}",
|
|
277
|
+
"agentRole": "You are instrumenting code specifically to validate hypothesis {{currentHypothesis.id}}. Focus on targeted evidence collection.",
|
|
278
|
+
"guidance": [
|
|
279
|
+
"This is hypothesis {{hypothesisIndex + 1}} of {{hypothesesToValidate.length}}",
|
|
280
|
+
"Tailor instrumentation to the specific hypothesis",
|
|
281
|
+
"Ensure non-intrusive implementation"
|
|
282
|
+
],
|
|
283
|
+
"requireConfirmation": false
|
|
284
|
+
},
|
|
285
|
+
{
|
|
286
|
+
"id": "loop-phase-4-evidence",
|
|
287
|
+
"title": "Phase 4: Evidence Collection for {{currentHypothesis.id}}",
|
|
288
|
+
"prompt": "**EVIDENCE COLLECTION for {{currentHypothesis.id}}**\n\n**Execute instrumented code and collect evidence:**\n1. Run the instrumented test/reproduction\n2. Collect all {{currentHypothesis.id}}_DEBUG logs\n3. Analyze results against validation criteria\n4. Document evidence quality and relevance\n\n**EVIDENCE ASSESSMENT:**\n- Does evidence support {{currentHypothesis.id}}? (Yes/No/Partial)\n- Evidence quality score (1-10)\n- Contradicting evidence found?\n- Additional evidence needed?\n\n**If log volume >500 lines, create sub-analysis prompt.**\n\n**OUTPUT**: Evidence assessment for {{currentHypothesis.id}} with quality scoring",
|
|
289
|
+
"agentRole": "You are collecting and analyzing evidence specifically for hypothesis {{currentHypothesis.id}}.",
|
|
290
|
+
"guidance": [
|
|
291
|
+
"Focus on evidence directly related to this hypothesis",
|
|
292
|
+
"Be objective in assessment - negative evidence is valuable",
|
|
293
|
+
"Track evidence quality quantitatively"
|
|
294
|
+
],
|
|
295
|
+
"requireConfirmation": false
|
|
296
|
+
},
|
|
297
|
+
{
|
|
298
|
+
"id": "loop-phase-5-synthesis",
|
|
299
|
+
"title": "Phase 5: Evidence Synthesis for {{currentHypothesis.id}}",
|
|
300
|
+
"prompt": "**EVIDENCE SYNTHESIS for {{currentHypothesis.id}}**\n\n**Synthesize findings:**\n1. **Evidence Summary**: What did we learn about {{currentHypothesis.id}}?\n2. **Confidence Update**: Based on evidence, rate confidence this is the root cause (0-10)\n3. **Status Update**: Mark hypothesis as Confirmed/Refuted/Needs-More-Evidence\n4. **Iteration Tracking**: Increment `validationIterations` by 1\n\n**If {{currentHypothesis.id}} is confirmed with high confidence (>8.0):**\n- Set `rootCauseFound` = true\n- Set `rootCauseHypothesis` = {{currentHypothesis.id}}\n- Update `currentConfidence` with confidence score\n\n**If all hypotheses validated but confidence <9.0:**\n- Consider additional investigation needs\n- Document what evidence is still missing",
|
|
301
|
+
"agentRole": "You are synthesizing evidence to determine if {{currentHypothesis.id}} is the root cause.",
|
|
302
|
+
"guidance": [
|
|
303
|
+
"Update hypothesis status based on evidence",
|
|
304
|
+
"Track overall investigation confidence",
|
|
305
|
+
"Be ready to exit loop if root cause found with high confidence"
|
|
306
|
+
],
|
|
307
|
+
"requireConfirmation": false
|
|
308
|
+
}
|
|
309
|
+
],
|
|
310
|
+
"runCondition": {
|
|
311
|
+
"and": [
|
|
312
|
+
{ "var": "rootCauseFound", "not_equals": true },
|
|
313
|
+
{ "var": "validationIterations", "lt": 3 }
|
|
314
|
+
]
|
|
315
|
+
},
|
|
316
|
+
"requireConfirmation": false
|
|
317
|
+
},
|
|
318
|
+
{
|
|
319
|
+
"id": "phase-3a-observability-setup",
|
|
320
|
+
"title": "Phase 3a: Distributed System Observability",
|
|
321
|
+
"runCondition": {
|
|
322
|
+
"var": "isDistributed",
|
|
323
|
+
"equals": true
|
|
324
|
+
},
|
|
325
|
+
"prompt": "**OBSERVABILITY** - Set up three-pillar strategy:\n\n**METRICS**: Identify key indicators (latency, errors)\n**TRACES**: Enable request path tracking\n**LOGS**: Ensure correlation IDs present\n\n**OUTPUT**: Observability checklist completed.",
|
|
326
|
+
"agentRole": "You are a distributed systems expert who thinks in terms of emergent behaviors and system-wide patterns.",
|
|
327
|
+
"guidance": [
|
|
328
|
+
"METRICS SELECTION: Focus on RED metrics (Rate, Errors, Duration) for each service",
|
|
329
|
+
"TRACE COVERAGE: Ensure spans cover all service boundaries and key operations",
|
|
330
|
+
"CORRELATION IDS: Verify IDs propagate through entire request lifecycle",
|
|
331
|
+
"AGGREGATION READY: Set up centralized collection for cross-service analysis",
|
|
332
|
+
"BASELINE ESTABLISHMENT: Capture normal behavior metrics for comparison"
|
|
333
|
+
]
|
|
334
|
+
},
|
|
335
|
+
{
|
|
336
|
+
"id": "phase-4a-distributed-evidence",
|
|
337
|
+
"title": "Phase 4a: Multi-Service Evidence Collection",
|
|
338
|
+
"runCondition": {
|
|
339
|
+
"var": "isDistributed",
|
|
340
|
+
"equals": true
|
|
341
|
+
},
|
|
342
|
+
"prompt": "**DISTRIBUTED ANALYSIS**:\n\n1. Check METRICS for anomalies\n2. Follow TRACES for request path\n3. Correlate LOGS across services\n4. Identify cascade points\n\n**OUTPUT**: Service interaction map with failure points.",
|
|
343
|
+
"agentRole": "You are a systems detective who can trace failures across service boundaries.",
|
|
344
|
+
"guidance": [
|
|
345
|
+
"ANOMALY DETECTION: Look for deviations in latency, error rates, or traffic patterns",
|
|
346
|
+
"TRACE ANALYSIS: Follow request ID through all services to find failure point",
|
|
347
|
+
"LOG CORRELATION: Use timestamp windows and correlation IDs to link events",
|
|
348
|
+
"CASCADE IDENTIFICATION: Look for timeout chains or error propagation patterns",
|
|
349
|
+
"VISUAL MAPPING: Create service dependency diagram with failure annotations"
|
|
350
|
+
]
|
|
351
|
+
},
|
|
352
|
+
{
|
|
353
|
+
"id": "phase-4b-cognitive-reset",
|
|
354
|
+
"title": "Phase 4b: Cognitive Reset & Progress Review",
|
|
355
|
+
"runCondition": {
|
|
356
|
+
"var": "validationIterations",
|
|
357
|
+
"gte": 2
|
|
358
|
+
},
|
|
359
|
+
"prompt": "**COGNITIVE RESET** - Step back and review:\n\n1. Summarize findings so far\n2. List eliminated possibilities\n3. Identify investigation blind spots\n4. Reformulate approach if needed\n\n**DECIDE**: Continue current path or pivot strategy?",
|
|
360
|
+
"agentRole": "You are a strategic advisor who helps maintain perspective during complex investigations.",
|
|
361
|
+
"guidance": [
|
|
362
|
+
"PROGRESS SUMMARY: Write concise bullet points of key findings and eliminations",
|
|
363
|
+
"BLIND SPOT CHECK: What areas haven't been investigated? What assumptions remain?",
|
|
364
|
+
"PATTERN RECOGNITION: Look for investigation loops or repeated dead ends",
|
|
365
|
+
"STRATEGY EVALUATION: Is current approach yielding diminishing returns?",
|
|
366
|
+
"PIVOT CRITERIA: Consider new approach if last 3 iterations provided no new insights"
|
|
367
|
+
]
|
|
368
|
+
},
|
|
369
|
+
{
|
|
370
|
+
"id": "phase-5a-final-confidence",
|
|
371
|
+
"title": "Phase 5a: Final Confidence Assessment",
|
|
372
|
+
"prompt": "**FINAL CONFIDENCE ASSESSMENT** - Evaluate the investigation results.\n\n**If root cause found (rootCauseFound = true):**\n- Review all evidence for {{rootCauseHypothesis}}\n- Perform adversarial challenge\n- Calculate final confidence score\n\n**If no high-confidence root cause:**\n- Document what was learned\n- Identify remaining unknowns\n- Recommend next investigation steps\n\n**CONFIDENCE CALCULATION:**\n- Evidence Quality (1-10)\n- Explanation Completeness (1-10)\n- Alternative Likelihood (1-10, inverted)\n- Final = (Quality × 0.4) + (Completeness × 0.4) + (Alternative × 0.2)\n\n**OUTPUT**: Final confidence assessment with recommendations",
|
|
373
|
+
"agentRole": "You are making the final determination about the root cause with rigorous confidence assessment.",
|
|
374
|
+
"guidance": [
|
|
375
|
+
"Be honest about confidence levels",
|
|
376
|
+
"Document all remaining uncertainties",
|
|
377
|
+
"Provide clear next steps if confidence is low"
|
|
378
|
+
],
|
|
379
|
+
"validationCriteria": [
|
|
380
|
+
{
|
|
381
|
+
"type": "regex",
|
|
382
|
+
"pattern": "Final.*=.*[0-9\\.]+",
|
|
383
|
+
"message": "Must calculate final confidence score"
|
|
384
|
+
}
|
|
385
|
+
],
|
|
386
|
+
"hasValidation": true
|
|
387
|
+
},
|
|
388
|
+
{
|
|
389
|
+
"id": "phase-2c-hypothesis-assumptions",
|
|
390
|
+
"title": "Phase 2c: Hypothesis Assumption Audit",
|
|
391
|
+
"prompt": "**AUDIT** each hypothesis for hidden assumptions:\n\n**FOR EACH HYPOTHESIS**:\n- List implicit assumptions\n- Rate assumption confidence (1-10)\n- Identify verification approach\n\n**REJECT** hypotheses built on unverified assumptions.",
|
|
392
|
+
"agentRole": "You are a rigorous scientist who rejects any hypothesis not grounded in verified facts.",
|
|
393
|
+
"guidance": [
|
|
394
|
+
"EXPLICIT LISTING: Write out every assumption, no matter how obvious it seems",
|
|
395
|
+
"CONFIDENCE SCORING: Rate 1-10 based on evidence quality, not intuition",
|
|
396
|
+
"VERIFICATION PLAN: For each assumption, specify how it can be tested",
|
|
397
|
+
"REJECTION CRITERIA: Any assumption with confidence <7 requires verification",
|
|
398
|
+
"DOCUMENT RATIONALE: Explain why each assumption is accepted or needs testing"
|
|
399
|
+
],
|
|
400
|
+
"validationCriteria": [
|
|
401
|
+
{
|
|
402
|
+
"type": "contains",
|
|
403
|
+
"value": "Assumption confidence",
|
|
404
|
+
"message": "Must rate assumption confidence for each hypothesis"
|
|
405
|
+
}
|
|
406
|
+
],
|
|
407
|
+
"hasValidation": true
|
|
408
|
+
},
|
|
409
|
+
{
|
|
410
|
+
"id": "phase-6-diagnostic-writeup",
|
|
411
|
+
"title": "Phase 6: Comprehensive Diagnostic Writeup",
|
|
412
|
+
"prompt": "**FINAL DIAGNOSTIC DOCUMENTATION** - I will create a comprehensive writeup enabling effective bug fixing and knowledge transfer.\n\n**STEP 1: Executive Summary**\n- **Bug Summary**: Concise description of issue and impact\n- **Root Cause**: Clear, non-technical explanation of what is happening\n- **Confidence Level**: Final confidence assessment with calculation methodology\n- **Scope**: What systems, users, or scenarios are affected\n\n**STEP 2: Technical Deep Dive**\n- **Root Cause Analysis**: Detailed technical explanation of failure mechanism\n- **Code Component Analysis**: Specific files, functions, and lines with exact locations\n- **Execution Flow**: Step-by-step sequence of events leading to bug\n- **State Analysis**: How system state contributes to failure\n\n**STEP 3: Investigation Methodology**\n- **Investigation Timeline**: Chronological summary with phase time investments\n- **Hypothesis Evolution**: Complete record of hypotheses (H1-H5) with status changes\n- **Evidence Assessment**: Rating and reliability of evidence sources with key citations\n\n**STEP 4: Knowledge Transfer & Action Plan**\n- **Skill Requirements**: Technical expertise needed for understanding and fixing\n- **Prevention & Review**: Specific measures and code review checklist items\n- **Action Items**: Immediate mitigation steps and permanent fix areas with timelines\n- **Testing Strategy**: Comprehensive verification approach for fixes\n\n**DELIVERABLE**: Enterprise-grade diagnostic report enabling confident bug fixing, knowledge transfer, and organizational learning.",
|
|
413
|
+
"agentRole": "You are a senior technical writer and diagnostic documentation specialist with expertise in creating comprehensive, actionable bug reports for enterprise environments. Your strength lies in translating complex technical investigations into clear, structured documentation that enables effective problem resolution, knowledge transfer, and organizational learning. You excel at creating reports that serve immediate fixing needs, long-term system improvement, and team collaboration.",
|
|
414
|
+
"guidance": [
|
|
415
|
+
"ENTERPRISE FOCUS: Write for multiple stakeholders including developers, managers, and future team members",
|
|
416
|
+
"KNOWLEDGE TRANSFER: Include methodology and reasoning, not just conclusions",
|
|
417
|
+
"COLLABORATIVE DESIGN: Structure content for peer review and team coordination",
|
|
418
|
+
"COMPREHENSIVE COVERAGE: Include all information needed for resolution and prevention",
|
|
419
|
+
"ACTIONABLE DOCUMENTATION: Provide specific, concrete next steps with clear ownership"
|
|
420
|
+
]
|
|
421
|
+
}
|
|
422
|
+
]
|
|
423
|
+
}
|
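The `runCondition` objects and the Phase 5a confidence formula in the workflow above can be read as small pure functions. The following is a minimal sketch of that reading, not the package's actual implementation: the operator names (`equals`, `not_equals`, `lt`, `gte`, `and`, `or`) and the 0.4/0.4/0.2 weights are taken from the workflow JSON, while `Condition`, `Context`, `evaluateCondition`, and `finalConfidence` are hypothetical names introduced only for illustration. It also assumes that a variable missing from the context counts as "not equal" to any value, and that the alternative-likelihood score is passed in already inverted.

```typescript
// Hypothetical types: the shapes mirror the runCondition objects in the workflow JSON.
type Context = Record<string, unknown>;

type Condition =
  | { and: Condition[] }
  | { or: Condition[] }
  | { var: string; equals?: unknown; not_equals?: unknown; lt?: number; gte?: number };

// Recursively evaluate a condition against the current context variables.
function evaluateCondition(cond: Condition, ctx: Context): boolean {
  if ("and" in cond) return cond.and.every((c) => evaluateCondition(c, ctx));
  if ("or" in cond) return cond.or.some((c) => evaluateCondition(c, ctx));

  const value = ctx[cond.var];
  if ("equals" in cond) return value === cond.equals;
  // Assumption: an unset variable is treated as "not equal" to the given value.
  if ("not_equals" in cond) return value !== cond.not_equals;
  if (cond.lt !== undefined) return typeof value === "number" && value < cond.lt;
  if (cond.gte !== undefined) return typeof value === "number" && value >= cond.gte;
  return false; // No recognized operator: treat the condition as unmet.
}

// Weighted score from the Phase 5a prompt:
// Final = (Quality x 0.4) + (Completeness x 0.4) + (Alternative x 0.2).
// `alternative` is assumed to be the already-inverted alternative-likelihood score.
function finalConfidence(quality: number, completeness: number, alternative: number): number {
  return quality * 0.4 + completeness * 0.4 + alternative * 0.2;
}

// Usage: the validation loop's runCondition, checked before any hypothesis is validated.
const validationLoopCondition: Condition = {
  and: [
    { var: "rootCauseFound", not_equals: true },
    { var: "validationIterations", lt: 3 },
  ],
};
console.log(evaluateCondition(validationLoopCondition, { validationIterations: 0 }));
```

Under these assumptions the loop condition holds until either `rootCauseFound` becomes `true` or `validationIterations` reaches 3, which is why the variable has to be updated somewhere during validation for the `lt` and `gte` checks to have any effect.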