@exaudeus/workrail 0.0.19 → 0.0.20
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json
CHANGED
@@ -15,7 +15,8 @@
  "User has identified a specific bug or failing test to investigate",
  "Agent has access to codebase analysis tools (grep, file readers, etc.)",
  "Agent has access to build/test execution tools for the project type",
- "User can provide error messages, stack traces, or test failure output"
+ "User can provide error messages, stack traces, or test failure output",
+ "Bug is reproducible with specific steps or a minimal test case"
  ],
  "metaGuidance": [
  "INVESTIGATION DISCIPLINE: Never propose fixes or solutions until Phase 6 (Comprehensive Diagnostic Writeup). Focus entirely on systematic evidence gathering and analysis.",
@@ -35,18 +36,21 @@
  "CONTEXT DOCUMENTATION: Maintain INVESTIGATION_CONTEXT.md throughout. Update after major milestones, failures, or user interventions to enable seamless handoffs between sessions.",
  "GIT FALLBACK STRATEGY: If git unavailable, gracefully skip commits/branches, log changes manually in CONTEXT.md with timestamps, warn user, document modifications for manual control.",
  "GIT ERROR HANDLING: Use run_terminal_cmd for git operations; if fails, output exact command for user manual execution. Never halt investigation due to git unavailability.",
- "TOOL AVAILABILITY AWARENESS: Check debugging tool availability before investigation design. Have fallbacks for when primary tools unavailable (grep…
+ "TOOL AVAILABILITY AWARENESS: Check debugging tool availability before investigation design. Have fallbacks for when primary tools unavailable (grep→file_search, etc).",
  "SECURITY PROTOCOLS: Sanitize sensitive data in logs/reproduction steps. Be mindful of exposing credentials, PII, or system internals during evidence collection phases.",
  "DYNAMIC RE-TRIAGE: Allow complexity upgrades during investigation if evidence reveals deeper issues. Safe downgrades only with explicit user confirmation after evidence review.",
  "DEVIL'S ADVOCATE REVIEW: Actively challenge primary hypothesis with available evidence. Seek alternative explanations and rate alternative likelihood before final confidence assessment.",
  "COLLABORATIVE HANDOFFS: Structure documentation for peer review and team coordination. Include methodology, reasoning, and complete evidence chain for knowledge transfer.",
- "FAILURE BOUNDS: Track investigation progress. If >20 steps or >4 hours without breakthrough, pause for user guidance. Document dead ends to prevent redundant work in future sessions."
+ "FAILURE BOUNDS: Track investigation progress. If >20 steps or >4 hours without breakthrough, pause for user guidance. Document dead ends to prevent redundant work in future sessions.",
+ "COGNITIVE BREAKS: After 10 investigation steps, pause and summarize progress to reset perspective.",
+ "RUBBER DUCK: Verbalize hypotheses in sub-prompts to externalize reasoning and catch logical gaps.",
+ "COLLABORATION READY: Document clearly for handoffs when stuck beyond iteration limits."
  ],
  "steps": [
  {
  "id": "phase-0-triage",
  "title": "Phase 0: Initial Triage & Context Gathering",
- "prompt": "**SYSTEMATIC INVESTIGATION BEGINS** - Your mission is to achieve near 100% certainty about this bug's root cause through systematic evidence gathering. NO FIXES will be proposed until Phase 6.\n\n**STEP 1: Bug Report Analysis**\nPlease provide the complete bug context:\n- **Bug Description**: What is the observed behavior vs expected behavior?\n- **Error Messages/Stack Traces**: Paste the complete error output\n- **Reproduction Steps**: How can this bug be consistently reproduced?\n- **Environment Details**: OS, language version, framework version, etc.\n- **Recent Changes**: Any recent commits, deployments, or configuration changes?\n\n**STEP 2: Project Type Classification**\nBased on the information provided, I will classify the project type and set debugging strategies:\n- **Languages/Frameworks**: Primary tech stack\n- **Build System**: Maven, Gradle, npm, etc.\n- **Testing Framework**: JUnit, Jest, pytest, etc.\n- **Logging System**: Available logging mechanisms\n\n**STEP 3: Complexity Assessment**\nI will analyze the bug complexity using these criteria:\n- **Simple**: Single function/method, clear error path, minimal dependencies\n- **Standard**: Multiple components, moderate investigation required\n- **Complex**: Cross-system issues, race conditions, complex state management\n\n**OUTPUTS**: Set `projectType`, `bugComplexity`, and `…
+ "prompt": "**SYSTEMATIC INVESTIGATION BEGINS** - Your mission is to achieve near 100% certainty about this bug's root cause through systematic evidence gathering. NO FIXES will be proposed until Phase 6.\n\n**STEP 1: Bug Report Analysis**\nPlease provide the complete bug context:\n- **Bug Description**: What is the observed behavior vs expected behavior?\n- **Error Messages/Stack Traces**: Paste the complete error output\n- **Reproduction Steps**: How can this bug be consistently reproduced?\n- **Environment Details**: OS, language version, framework version, etc.\n- **Recent Changes**: Any recent commits, deployments, or configuration changes?\n\n**STEP 2: Project Type Classification**\nBased on the information provided, I will classify the project type and set debugging strategies:\n- **Languages/Frameworks**: Primary tech stack\n- **Build System**: Maven, Gradle, npm, etc.\n- **Testing Framework**: JUnit, Jest, pytest, etc.\n- **Logging System**: Available logging mechanisms\n- **Architecture**: Monolithic, microservices, distributed, serverless, etc.\n\n**STEP 3: Complexity Assessment**\nI will analyze the bug complexity using these criteria:\n- **Simple**: Single function/method, clear error path, minimal dependencies\n- **Standard**: Multiple components, moderate investigation required\n- **Complex**: Cross-system issues, race conditions, complex state management\n\n**OUTPUTS**: Set `projectType`, `bugComplexity`, `debuggingMechanism`, and `isDistributed` (true if architecture involves microservices/distributed systems) context variables.",
  "agentRole": "You are a senior debugging specialist and bug triage expert with 15+ years of experience across multiple technology stacks. Your expertise lies in quickly classifying bugs, understanding project architectures, and determining appropriate investigation strategies. You excel at extracting critical information from bug reports and setting up systematic investigation approaches.",
  "guidance": [
  "CLASSIFICATION ACCURACY: Proper complexity assessment determines investigation depth - be thorough but decisive",
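Phase 0's new OUTPUTS line is what wires the later additions together: `isDistributed` gates the new Phase 3a/4a steps further down. A rough sketch of the shape these context variables take, assuming TypeScript; the field names come from the prompt, but the types and example values are assumptions, not the published workrail schema:

    // Rough sketch of the Phase 0 triage outputs referenced by later runCondition
    // blocks. Field names come from the prompt; types and example values are
    // assumptions, not the published workrail schema.
    interface TriageContext {
      projectType: string;                              // e.g. "node-service"
      bugComplexity: "simple" | "standard" | "complex"; // from the Step 3 criteria
      debuggingMechanism: string;                       // e.g. "logging", "debug-tests"
      isDistributed: boolean;                           // true => Phase 3a/4a steps run
    }

    const exampleContext: TriageContext = {
      projectType: "node-service",
      bugComplexity: "standard",
      debuggingMechanism: "logging",
      isDistributed: true,
    };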
@@ -55,6 +59,41 @@
  "NO ASSUMPTIONS: If critical information is missing, explicitly request it before proceeding"
  ]
  },
+ {
+ "id": "phase-0a-assumption-check",
+ "title": "Phase 0a: Assumption Verification Checkpoint",
+ "prompt": "**ASSUMPTION CHECK** - Before proceeding, verify key assumptions to prevent bias.\n\n**VERIFY**:\n1. **Data State**: Confirm variable types and null handling\n2. **API/Library**: Check documentation for actual vs assumed behavior\n3. **Environment**: Verify bug exists in clean environment\n4. **Recent Changes**: Review last 5 commits for relevance\n\n**OUTPUT**: List verified assumptions with evidence sources.",
+ "agentRole": "You are a skeptical analyst who challenges every assumption. Question everything that hasn't been explicitly verified.",
+ "guidance": [
+ "Use analysis tools to verify, don't assume",
+ "Document each assumption with its verification method",
+ "Flag any unverifiable assumptions for tracking",
+ "CHECK API DOCS: Never assume function behavior from names - verify actual documentation",
+ "VERIFY DATA TYPES: Use debugger or logs to confirm actual runtime types and values",
+ "TEST ENVIRONMENT: Reproduce in minimal environment to rule out configuration issues"
+ ]
+ },
+ {
+ "id": "phase-0b-reproducibility-lock",
+ "title": "Phase 0b: Reproducibility Verification",
+ "prompt": "**REPRODUCIBILITY** - Confirm reliable reproduction:\n\n1. Execute provided steps 3 times\n2. Document success rate\n3. If intermittent, apply stress techniques\n4. Create minimal reproduction script/test\n\n**GATE**: Only proceed with 100% reproduction.",
+ "agentRole": "You are a quality gatekeeper who ensures solid foundation before investigation.",
+ "guidance": [
+ "MINIMAL EXAMPLE: Strip away all non-essential code to isolate the issue",
+ "CONTAINERIZE: Use Docker or similar to ensure consistent environment",
+ "INTERMITTENT BUGS: Apply fuzzing, stress testing, or timing variations to force reproduction",
+ "DOCUMENT PRECISELY: Record exact steps, inputs, and environment for future reference",
+ "FAIL FAST: If not reproducible after reasonable effort, request more information from user"
+ ],
+ "validationCriteria": [
+ {
+ "type": "contains",
+ "value": "100%",
+ "message": "Must confirm 100% reproducibility before proceeding"
+ }
+ ],
+ "hasValidation": true
+ },
  {
  "id": "phase-1-streamlined-analysis",
  "runCondition": {
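Phase 0b is the first new step to carry a `validationCriteria` gate: the step output must contain the literal string "100%" before the workflow proceeds. A minimal sketch of how such a "contains" criterion could be evaluated, assuming plain substring semantics; workrail's actual validation engine is not part of this diff:

    // Minimal sketch of a "contains"-type validation check, assuming plain
    // substring semantics; the real workrail validation engine is not shown here.
    interface ContainsCriterion {
      type: "contains";
      value: string;
      message: string;
    }

    function validateStepOutput(output: string, criteria: ContainsCriterion[]): string[] {
      // Return the messages of every criterion the output fails to satisfy.
      return criteria
        .filter((c) => !output.includes(c.value))
        .map((c) => c.message);
    }

    const errors = validateStepOutput(
      "Reproduced 3/3 runs - 100% reproducibility confirmed",
      [{ type: "contains", value: "100%", message: "Must confirm 100% reproducibility before proceeding" }]
    );
    // errors.length === 0 means the gate passes and the workflow may proceed.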
@@ -95,6 +134,42 @@
  "DEPENDENCY MAPPING: Understand how external dependencies could contribute to the issue"
  ]
  },
+ {
+ "id": "phase-1a-binary-search",
+ "title": "Phase 1a: Binary Search Isolation",
+ "runCondition": {
+ "or": [
+ {"var": "bugType", "equals": "regression"},
+ {"var": "searchSpace", "equals": "large"}
+ ]
+ },
+ "prompt": "**BINARY SEARCH** - Apply divide-and-conquer:\n\n1. Identify GOOD state (working) and BAD state (broken)\n2. Find midpoint in history/code/data\n3. Test midpoint state\n4. Narrow to relevant half\n5. Document reduced search space\n\n**OUTPUT**: Narrowed location with evidence.",
+ "agentRole": "You are a systematic investigator using algorithmic search to efficiently isolate issues.",
+ "guidance": [
+ "VERSION CONTROL: Use 'git bisect' or equivalent for commit history searches",
+ "DATA PIPELINE: Test data at pipeline midpoints to isolate transformation issues",
+ "TIME WINDOWS: For time-based issues, binary search through timestamps",
+ "DOCUMENT BOUNDARIES: Clearly record each tested boundary and result",
+ "EFFICIENCY: Each test should eliminate ~50% of remaining search space"
+ ]
+ },
+ {
+ "id": "phase-1b-test-reduction",
+ "title": "Phase 1b: Test Case Minimization",
+ "runCondition": {
+ "var": "bugSource",
+ "equals": "failing_test"
+ },
+ "prompt": "**TEST REDUCTION** - Simplify failing test:\n\n1. Inline called methods into test\n2. Add earlier assertion to fail sooner\n3. Remove code after new failure point\n4. Repeat until minimal\n\n**OUTPUT**: Minimal failing test case.",
+ "agentRole": "You are a surgical debugger who strips away layers to reveal core issues.",
+ "guidance": [
+ "PRESERVE FAILURE: Each reduction must maintain the original failure mode",
+ "INLINE AGGRESSIVELY: Replace method calls with their actual implementation",
+ "FAIL EARLY: Move assertions up to find earliest deviation from expected state",
+ "REMOVE RUTHLESSLY: Delete all code that doesn't contribute to the failure",
+ "CLARITY GOAL: Final test should make the bug obvious to any reader"
+ ]
+ },
  {
  "id": "phase-2a-hypothesis-development",
  "title": "Phase 2a: Hypothesis Development & Prioritization",
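Phase 1a's halving strategy applies the same way to commit history, data pipelines, or time windows. A generic sketch of the loop over any ordered list of candidate states, assuming a caller-supplied isBad probe; the names are illustrative, not part of the package:

    // Generic binary search over an ordered list of states (commits, pipeline
    // stages, timestamps). `isBad` is a caller-supplied probe, e.g. "does the
    // failing test fail at this state?". Precondition: states[0] is known good
    // and states[states.length - 1] is known bad. Returns the first bad index.
    async function findFirstBad<T>(
      states: T[],
      isBad: (state: T) => Promise<boolean>
    ): Promise<number> {
      let lo = 0;                    // known-good index
      let hi = states.length - 1;    // known-bad index
      while (hi - lo > 1) {
        const mid = Math.floor((lo + hi) / 2);
        if (await isBad(states[mid])) {
          hi = mid;                  // failure already present: search the earlier half
        } else {
          lo = mid;                  // still good: search the later half
        }
      }
      return hi;                     // first state where the failure appears
    }

Each probe eliminates roughly half of the remaining search space, which is the ~50% efficiency target in the guidance; over commit history, git bisect automates the same loop.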
@@ -143,10 +218,31 @@
  ],
  "hasValidation": true
  },
+ {
+ "id": "phase-2c-hypothesis-assumptions",
+ "title": "Phase 2c: Hypothesis Assumption Audit",
+ "prompt": "**AUDIT** each hypothesis for hidden assumptions:\n\n**FOR EACH HYPOTHESIS**:\n- List implicit assumptions\n- Rate assumption confidence (1-10)\n- Identify verification approach\n\n**REJECT** hypotheses built on unverified assumptions.",
+ "agentRole": "You are a rigorous scientist who rejects any hypothesis not grounded in verified facts.",
+ "guidance": [
+ "EXPLICIT LISTING: Write out every assumption, no matter how obvious it seems",
+ "CONFIDENCE SCORING: Rate 1-10 based on evidence quality, not intuition",
+ "VERIFICATION PLAN: For each assumption, specify how it can be tested",
+ "REJECTION CRITERIA: Any assumption with confidence <7 requires verification",
+ "DOCUMENT RATIONALE: Explain why each assumption is accepted or needs testing"
+ ],
+ "validationCriteria": [
+ {
+ "type": "contains",
+ "value": "Assumption confidence",
+ "message": "Must rate assumption confidence for each hypothesis"
+ }
+ ],
+ "hasValidation": true
+ },
  {
  "id": "phase-3-debugging-instrumentation",
  "title": "Phase 3: Debugging Instrumentation Setup",
- "prompt": "**DEBUGGING INSTRUMENTATION** - Implement mechanisms to gather evidence for hypothesis validation.\n\n**STEP 1: Strategy Selection**\nChoose approach based on `projectType`:\n- **Logging**: Strategic state/flow capture\n- **Print Debug**: Console output\n- **Test Mods**: Enhanced test assertions\n- **Debug Tests**: New validation tests\n- **Profiling**: Performance monitoring\n\n**STEP 2: Implementation**\nFor top hypotheses:\n- **Entry/Exit**: Function calls with params/returns\n- **State**: Variable values at decision points\n- **Flow**: Execution path tracking\n- **Error Context**: Enhanced error messages\n- **Timing**: Timestamps for race conditions\n\n**LOG DEDUPLICATION**:\n- **Pattern Groups**: Similar logs (validateUser)\n- **Count Track**: Indicators ('x10 - last: user123')\n- **Grouping**: Sequential ('auth…
+ "prompt": "**DEBUGGING INSTRUMENTATION** - Implement mechanisms to gather evidence for hypothesis validation.\n\n**STEP 1: Strategy Selection**\nChoose approach based on `projectType`:\n- **Logging**: Strategic state/flow capture\n- **Print Debug**: Console output\n- **Test Mods**: Enhanced test assertions\n- **Debug Tests**: New validation tests\n- **Profiling**: Performance monitoring\n\n**STEP 2: Implementation**\nFor top hypotheses:\n- **Entry/Exit**: Function calls with params/returns\n- **State**: Variable values at decision points\n- **Flow**: Execution path tracking\n- **Error Context**: Enhanced error messages\n- **Timing**: Timestamps for race conditions\n\n**LOG DEDUPLICATION**:\n- **Pattern Groups**: Similar logs (validateUser)\n- **Count Track**: Indicators ('x10 - last: user123')\n- **Grouping**: Sequential ('auth→token→db')\n- **Windows**: 50-100ms for related ops\n- **Summaries**: Total counts ('x47 total')\n\n**SUB-ANALYSIS META**:\n- **H-Tags**: Prefix with hypothesis IDs ('H1_DEBUG:')\n- **Format**: Consistent patterns (timestamp, component)\n- **Context**: Standalone analysis info\n- **Searchable**: Clear terms and regex patterns\n\n**STEP 3: Validation**\n- All hypotheses covered\n- Code safety maintained\n- Clear evidence provided\n- Edge cases handled\n\n**STEP 4: Execution**\n- Running commands\n- Expected patterns\n- Result capture\n\n**OUTPUTS**: Instrumented code ready.",
  "agentRole": "You are a debugging instrumentation specialist and diagnostic expert with extensive experience in systematic evidence collection. Your expertise lies in implementing non-intrusive debugging mechanisms that provide clear evidence for hypothesis validation. You excel at strategic instrumentation that maximizes diagnostic value.",
  "guidance": [
  "STRATEGIC PLACEMENT: Place instrumentation at points that will provide maximum diagnostic value",
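The restored Phase 3 prompt describes count-tracked, time-windowed log deduplication ('x10 - last: user123', 50-100ms windows) with hypothesis tags ('H1_DEBUG:'). A sketch of that strategy using an in-memory map keyed by log pattern; the helper names are illustrative, not part of the package:

    // Sketch of the count-tracking deduplication described in the Phase 3 prompt:
    // repeated logs for the same pattern inside a short window collapse into one
    // "pattern x<count> - last: <detail>" summary line tagged by hypothesis ID.
    const WINDOW_MS = 100; // the prompt suggests 50-100ms windows for related ops

    interface Bucket {
      count: number;
      lastDetail: string;
      windowStart: number;
    }

    const buckets = new Map<string, Bucket>();

    function dedupLog(pattern: string, detail: string): void {
      const now = Date.now();
      const bucket = buckets.get(pattern);
      if (bucket && now - bucket.windowStart <= WINDOW_MS) {
        bucket.count += 1;          // same window: count the repeat, emit nothing
        bucket.lastDetail = detail;
        return;
      }
      if (bucket && bucket.count > 1) {
        // Previous window closed with repeats: emit one summary line for the group.
        console.log(`H1_DEBUG: ${pattern} x${bucket.count} - last: ${bucket.lastDetail}`);
      }
      console.log(`H1_DEBUG: ${pattern} - ${detail}`); // first occurrence in a new window
      buckets.set(pattern, { count: 1, lastDetail: detail, windowStart: now });
    }

    // Usage: dedupLog("validateUser", "user123");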
@@ -156,6 +252,23 @@
  "LOG DEDUPLICATION FOCUS: Implement pattern-based log deduplication for high-frequency scenarios to reduce noise while preserving diagnostic value. Use counting, grouping, and time-window strategies as detailed in metaGuidance LOG ENHANCEMENTS."
  ]
  },
+ {
+ "id": "phase-3a-observability-setup",
+ "title": "Phase 3a: Distributed System Observability",
+ "runCondition": {
+ "var": "isDistributed",
+ "equals": true
+ },
+ "prompt": "**OBSERVABILITY** - Set up three-pillar strategy:\n\n**METRICS**: Identify key indicators (latency, errors)\n**TRACES**: Enable request path tracking\n**LOGS**: Ensure correlation IDs present\n\n**OUTPUT**: Observability checklist completed.",
+ "agentRole": "You are a distributed systems expert who thinks in terms of emergent behaviors and system-wide patterns.",
+ "guidance": [
+ "METRICS SELECTION: Focus on RED metrics (Rate, Errors, Duration) for each service",
+ "TRACE COVERAGE: Ensure spans cover all service boundaries and key operations",
+ "CORRELATION IDS: Verify IDs propagate through entire request lifecycle",
+ "AGGREGATION READY: Set up centralized collection for cross-service analysis",
+ "BASELINE ESTABLISHMENT: Capture normal behavior metrics for comparison"
+ ]
+ },
  {
  "id": "phase-4-evidence-collection",
  "title": "Phase 4: Evidence Collection & Analysis",
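Phase 3a's CORRELATION IDS item is what makes the later Phase 4a correlation possible. A sketch of reusing an inbound correlation ID on outbound calls and log lines, assuming a runtime with the global fetch API (Node 18+); the "x-correlation-id" header name is a common convention and an assumption here, not something defined by this package:

    import { randomUUID } from "node:crypto";

    // Reuse the inbound correlation ID (or mint one) so traces, metrics, and logs
    // from every service in the request path can be joined later.
    function correlationIdFrom(headers: Record<string, string | undefined>): string {
      return headers["x-correlation-id"] ?? randomUUID();
    }

    async function callDownstream(url: string, correlationId: string): Promise<Response> {
      // Forward the same ID on every outbound hop.
      return fetch(url, { headers: { "x-correlation-id": correlationId } });
    }

    function logWithCorrelation(correlationId: string, message: string): void {
      // Every log line in the request path carries the same ID.
      console.log(`[cid=${correlationId}] ${message}`);
    }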
@@ -167,6 +280,40 @@
  "SUB-ANALYSIS: For Step 1.5, assess log volume >500 lines and generate sub-analysis prompts."
  ]
  },
+ {
+ "id": "phase-4a-distributed-evidence",
+ "title": "Phase 4a: Multi-Service Evidence Collection",
+ "runCondition": {
+ "var": "isDistributed",
+ "equals": true
+ },
+ "prompt": "**DISTRIBUTED ANALYSIS**:\n\n1. Check METRICS for anomalies\n2. Follow TRACES for request path\n3. Correlate LOGS across services\n4. Identify cascade points\n\n**OUTPUT**: Service interaction map with failure points.",
+ "agentRole": "You are a systems detective who can trace failures across service boundaries.",
+ "guidance": [
+ "ANOMALY DETECTION: Look for deviations in latency, error rates, or traffic patterns",
+ "TRACE ANALYSIS: Follow request ID through all services to find failure point",
+ "LOG CORRELATION: Use timestamp windows and correlation IDs to link events",
+ "CASCADE IDENTIFICATION: Look for timeout chains or error propagation patterns",
+ "VISUAL MAPPING: Create service dependency diagram with failure annotations"
+ ]
+ },
+ {
+ "id": "phase-4b-cognitive-reset",
+ "title": "Phase 4b: Cognitive Reset & Progress Review",
+ "runCondition": {
+ "var": "iterationCount",
+ "gt": 10
+ },
+ "prompt": "**COGNITIVE RESET** - Step back and review:\n\n1. Summarize findings so far\n2. List eliminated possibilities\n3. Identify investigation blind spots\n4. Reformulate approach if needed\n\n**DECIDE**: Continue current path or pivot strategy?",
+ "agentRole": "You are a strategic advisor who helps maintain perspective during complex investigations.",
+ "guidance": [
+ "PROGRESS SUMMARY: Write concise bullet points of key findings and eliminations",
+ "BLIND SPOT CHECK: What areas haven't been investigated? What assumptions remain?",
+ "PATTERN RECOGNITION: Look for investigation loops or repeated dead ends",
+ "STRATEGY EVALUATION: Is current approach yielding diminishing returns?",
+ "PIVOT CRITERIA: Consider new approach if last 3 iterations provided no new insights"
+ ]
+ },
  {
  "id": "phase-5a-evidence-synthesis",
  "title": "Phase 5a: Evidence Synthesis & Verification",
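Phase 4a's LOG CORRELATION step is mechanical once IDs are in place: group collected records by correlation ID and sort by timestamp to expose cascade points. A sketch with an assumed record shape:

    // Sketch of the Phase 4a correlation step: group already-collected log records
    // by correlation ID, then order each group chronologically to reconstruct the
    // request path across services. The record shape is an assumption.
    interface LogRecord {
      service: string;
      correlationId: string;
      timestamp: number;   // epoch millis
      level: "info" | "warn" | "error";
      message: string;
    }

    function groupByCorrelation(records: LogRecord[]): Map<string, LogRecord[]> {
      const groups = new Map<string, LogRecord[]>();
      for (const record of records) {
        const group = groups.get(record.correlationId) ?? [];
        group.push(record);
        groups.set(record.correlationId, group);
      }
      // Sort each request's records so cascade points stand out, e.g. a timeout
      // in one service followed by errors in its callers.
      for (const group of groups.values()) {
        group.sort((a, b) => a.timestamp - b.timestamp);
      }
      return groups;
    }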
@@ -181,7 +328,7 @@
  {
  "id": "phase-5b-confidence-assessment",
  "title": "Phase 5b: Adversarial Challenge & Confidence Assessment",
- "prompt": "**CONFIDENCE ASSESSMENT** - Challenge the root cause conclusion and quantify confidence.\n\n**STEP 1: Adversarial Challenge Protocol**\n- **Devil's Advocate Analysis**: Argue against primary hypothesis\n- **Alternative Explanations**: Identify 2+ alternative explanations for evidence\n- **Confidence Calibration**: Rate certainty on calibrated scale with reasoning\n- **Uncertainty Documentation**: List remaining unknowns and impact\n\n**STEP 2: Confidence Assessment Matrix**\n- **Evidence Quality Score** (1-10): Reliability and completeness of supporting evidence\n- **Explanation Completeness** (1-10): How well root cause explains all symptoms\n- **Alternative Likelihood** (1-10): Probability alternatives are correct (inverted)\n- **Final Confidence** = (Evidence Quality…
+ "prompt": "**CONFIDENCE ASSESSMENT** - Challenge the root cause conclusion and quantify confidence.\n\n**STEP 1: Adversarial Challenge Protocol**\n- **Devil's Advocate Analysis**: Argue against primary hypothesis\n- **Alternative Explanations**: Identify 2+ alternative explanations for evidence\n- **Confidence Calibration**: Rate certainty on calibrated scale with reasoning\n- **Uncertainty Documentation**: List remaining unknowns and impact\n\n**STEP 2: Confidence Assessment Matrix**\n- **Evidence Quality Score** (1-10): Reliability and completeness of supporting evidence\n- **Explanation Completeness** (1-10): How well root cause explains all symptoms\n- **Alternative Likelihood** (1-10): Probability alternatives are correct (inverted)\n- **Final Confidence** = (Evidence Quality × 0.4) + (Completeness × 0.4) + (Alternative × 0.2)\n\n**CONFIDENCE THRESHOLD**: Proceed only if Final Confidence ≥ 9.0/10. If below, recommend additional investigation with specific evidence gaps.\n\n**OUTPUTS**: High-confidence root cause with quantified assessment and adversarial validation.",
  "agentRole": "You are a senior root cause analysis expert and forensic investigator with deep expertise in systematic evidence evaluation and definitive conclusion formation. Your strength lies in synthesizing complex evidence into clear, confident determinations. You excel at maintaining rigorous standards for certainty while providing actionable insights. You must actively challenge your own conclusions and maintain objective, quantified confidence assessments.",
  "guidance": [
  "ADVERSARIAL MINDSET: Actively challenge your own conclusions with available evidence",
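The restored Phase 5b prompt spells out the weighting and the 9.0 gate, so the arithmetic is worth seeing once in full. A direct transcription of the formula from the prompt; the function name is illustrative:

    // Phase 5b formula as written in the prompt:
    // Final Confidence = (Evidence Quality × 0.4) + (Completeness × 0.4) + (Alternative × 0.2)
    // All three inputs are 1-10 scores; "alternative" is already inverted
    // (10 = alternative explanations very unlikely). Proceed only at >= 9.0.
    function finalConfidence(evidenceQuality: number, completeness: number, alternative: number): number {
      return evidenceQuality * 0.4 + completeness * 0.4 + alternative * 0.2;
    }

    const score = finalConfidence(9, 10, 8);   // 3.6 + 4.0 + 1.6 = 9.2
    const mayProceed = score >= 9.0;           // true: the confidence threshold is met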
@@ -212,6 +359,27 @@
  ],
  "hasValidation": true
  },
+ {
+ "id": "phase-5c-prevention-scan",
+ "title": "Phase 5c: Anti-Pattern & Prevention Analysis",
+ "prompt": "**PREVENTION SCAN** - Identify systemic issues:\n\n**ANTI-PATTERNS**:\n- Tight coupling indicators\n- State management issues\n- Missing error handling\n\n**RECOMMENDATIONS**:\n- Modularization opportunities\n- Immutability improvements\n- Test coverage gaps\n\n**OUTPUT**: Prevention checklist for writeup.",
+ "agentRole": "You are a software architect who identifies systemic improvements beyond the immediate bug.",
+ "guidance": [
+ "COUPLING ANALYSIS: Look for God Objects, circular dependencies, or tangled interfaces",
+ "STATE AUDIT: Identify mutable shared state, unclear ownership, or race conditions",
+ "ERROR HANDLING: Check for silent failures, generic catches, or missing validation",
+ "ARCHITECTURAL DEBT: Note violations of SOLID principles or design patterns",
+ "ACTIONABLE RECOMMENDATIONS: Provide specific refactoring suggestions with examples"
+ ],
+ "validationCriteria": [
+ {
+ "type": "contains",
+ "value": "Anti-pattern",
+ "message": "Must identify at least one anti-pattern or improvement area"
+ }
+ ],
+ "hasValidation": true
+ },
  {
  "id": "phase-6-diagnostic-writeup",
  "title": "Phase 6: Comprehensive Diagnostic Writeup",
|