@exaudeus/workrail 0.0.11 → 0.0.13
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/infrastructure/storage/git-workflow-storage.d.ts +65 -0
- package/dist/infrastructure/storage/git-workflow-storage.d.ts.map +1 -0
- package/dist/infrastructure/storage/git-workflow-storage.js +284 -0
- package/dist/infrastructure/storage/git-workflow-storage.js.map +1 -0
- package/dist/infrastructure/storage/plugin-workflow-storage.d.ts +102 -0
- package/dist/infrastructure/storage/plugin-workflow-storage.d.ts.map +1 -0
- package/dist/infrastructure/storage/plugin-workflow-storage.js +319 -0
- package/dist/infrastructure/storage/plugin-workflow-storage.js.map +1 -0
- package/dist/infrastructure/storage/remote-workflow-storage.d.ts +44 -0
- package/dist/infrastructure/storage/remote-workflow-storage.d.ts.map +1 -0
- package/dist/infrastructure/storage/remote-workflow-storage.js +321 -0
- package/dist/infrastructure/storage/remote-workflow-storage.js.map +1 -0
- package/dist/tools/mcp_initialize.js +1 -1
- package/dist/utils/storage-security.d.ts +74 -0
- package/dist/utils/storage-security.d.ts.map +1 -0
- package/dist/utils/storage-security.js +134 -0
- package/dist/utils/storage-security.js.map +1 -0
- package/package.json +1 -1
- package/workflows/coding-task-workflow.json +59 -15
- package/workflows/systemic-bug-investigation.json +190 -0
package/workflows/coding-task-workflow.json
@@ -1,7 +1,7 @@
 {
   "id": "coding-task-workflow",
   "name": "Excellent Adaptive Coding Workflow with Devil's Advocate Review",
-  "version": "0.
+  "version": "0.2.0",
   "description": "A comprehensive and resilient workflow for AI-assisted coding. It adaptively sizes tasks, performs a critical self-review of its own plans, provides efficiency options, enforces closed-loop validation, and defines a robust protocol for handling failures.",
   "preconditions": [
     "User has a clear task description (e.g., from Jira, a dev doc, or a BRD).",
@@ -31,15 +31,17 @@
       "requireConfirmation": true
     },
     {
-      "id": "phase-1-
+      "id": "phase-1-specification",
       "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
-      "title": "Phase 1:
-      "prompt": "Your first goal is to understand the task. Analyze the
-      "agentRole": "You are a senior business analyst and technical lead specializing in requirement gathering and scope definition.
+      "title": "Phase 1: Create Specification",
+      "prompt": "Your first goal is to understand the task and create a specification document. Analyze the request, summarize your understanding, ask clarifying questions, and assess which parts of the codebase are relevant. The output of this step should be a formal specification.\n\n**Task Description:**\n[User inserts detailed task description here]\n\n**Key Objectives & Success Criteria:**\n[User lists specific, measurable success criteria here]\n\n**Scope and Constraints:**\n[User defines boundaries or areas to avoid here]\n\nFinally, based on your analysis, perform a sanity check on the initial complexity sizing. If you believe the classification is incorrect, state your reasoning and ask for confirmation before proceeding. For example: 'You classified this as Medium, but my analysis shows it impacts several core architectural components. I recommend we upgrade to the Large path to perform a Deep Analysis. Do you agree?'",
+      "agentRole": "You are a senior business analyst and technical lead specializing in requirement gathering and scope definition. Your goal is to produce a clear, comprehensive `spec.md` file that will serve as the foundation for design and implementation.",
       "guidance": [
         "Provide a complete task description. Vague requests will lead to poor plans and wasted effort.",
+        "The output of this step should be the content for `spec.md`.",
         "This step is automatically skipped for Small tasks based on the complexity classification"
-      ]
+      ],
+      "requireConfirmation": false
     },
     {
       "id": "phase-1b-deep-analysis-mandatory",
@@ -52,7 +54,8 @@
         "This step is mandatory for Large tasks due to their complexity and risk",
         "Ensure all relevant source files are attached or accessible to the agent before running this step",
         "Be thorough - this analysis will inform the entire implementation strategy"
-      ]
+      ],
+      "requireConfirmation": false
     },
     {
       "id": "phase-1b-deep-analysis-optional",
@@ -70,25 +73,39 @@
         "This optional analysis was requested for a Medium task",
         "Ensure all relevant source files are attached or accessible to the agent before running this step",
         "Focus on areas most relevant to the current task"
-      ]
+      ],
+      "requireConfirmation": false
+    },
+    {
+      "id": "phase-1c-architectural-design",
+      "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
+      "title": "Phase 1c: Architectural Design",
+      "prompt": "Using the `spec.md` from the previous step and your codebase analysis, create a high-level architectural design. Your output should be a `design.md` document that includes:\n1. **High-Level Approach:** A summary of the proposed solution.\n2. **Component Breakdown:** Identify new or modified components, classes, or modules.\n3. **Data Models:** Describe any changes to data structures or database schemas.\n4. **API Contracts:** Define any new or changed API endpoints, including request/response formats.\n5. **Key Interactions:** A diagram or description of how the major components will interact.",
+      "agentRole": "You are a software architect specializing in translating business requirements into robust and scalable technical designs. Your task is to create a clear and comprehensive `design.md` that guides the implementation.",
+      "guidance": [
+        "The `design.md` should be detailed enough for an engineer to write an implementation plan from it.",
+        "This step is automatically skipped for Small tasks."
+      ],
+      "requireConfirmation": false
     },
     {
       "id": "phase-2-planning",
       "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
       "title": "Phase 2: Create Detailed Implementation Plan",
-      "prompt": "Your goal is to produce a thorough and actionable
+      "prompt": "Your goal is to produce a thorough and actionable `implementation_plan.md` based on the `spec.md` and `design.md`. Do not write any code. Your plan must be detailed, broken into committable phases, and justified.\n\nYour plan must include these sections:\n1. **Goal Clarification:** Your understanding of the goal, assumptions, and success criteria from the spec.\n2. **Impact Assessment:** Affected codebase parts, dependencies, and risks based on the design.\n3. **Implementation Strategy:** A list of discrete, actionable steps. Each step must detail the task, its rationale, inputs, and outputs.\n4. **Testing Strategy:** Describe how the changes will be tested (e.g., unit tests, integration tests, manual QA).\n5. **Final Review Checklist:** A specific checklist of items that must be verified to consider this entire task complete. This will be used in the final review phase.\n\nPresent this as a formal proposal.",
       "agentRole": "You are an experienced technical architect and project planner with expertise in breaking down complex development tasks into manageable, logical phases. Your strength is creating detailed, actionable plans that minimize risk while maximizing development efficiency and code quality.",
       "guidance": [
         "The agent will now proceed to critique its own plan in the next step. Withhold your final approval until after that critique.",
         "This step is automatically skipped for Small tasks based on the complexity classification"
-      ]
+      ],
+      "requireConfirmation": false
     },
     {
       "id": "phase-2b-devil-advocate-review",
       "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
       "title": "Phase 2b: Devil's Advocate Plan Review",
       "prompt": "Your task is to perform a 'devil's advocate' review of the implementation plan you just created in Phase 2. The objective is not to discard the plan, but to rigorously stress-test it and make it stronger. Your critique must be balanced and evidence-based.\n\nAnalyze the plan through the following lenses. For every point you make (positive or negative), you must cite specific evidence from the plan, the codebase, or the initial task description.\n\n1. **Hidden Assumptions:** What assumptions does this plan make about the codebase, user behavior, or existing data that might be incorrect?\n2. **Potential Risks & Unintended Side Effects:** What is the biggest risk of this plan? Could it impact performance, security, or another feature in a negative way?\n3. **Overlooked Complexities or Edge Cases:** What specific edge cases (e.g., empty states, invalid inputs, race conditions) does the plan fail to explicitly address?\n4. **Alternative Approaches:** Briefly propose at least one alternative technical approach. What are the pros and cons of the alternative versus the current plan?\n5. **Plan Strengths:** To ensure a balanced review, explicitly state the strongest parts of the plan. What aspects are well-thought-out and likely to succeed?\n\nConclude with a balanced summary. If you found issues, provide concrete suggestions for how to amend the plan. Finally, give a confidence score (1-10) for the plan *if* your suggestions are implemented.",
-      "agentRole": "You are a skeptical but fair senior principal engineer with 15+ years of experience in critical system development. Your
+      "agentRole": "You are a skeptical but fair senior principal engineer with 15+ years of experience in critical system development. Your role is to identify potential failure points, hidden assumptions, and overlooked complexities in technical plans. You excel at constructive criticism that strengthens plans rather than destroys them. Approach this with the rigor of a senior engineer reviewing a mission-critical system design.",
       "guidance": [
         "This is a critical thinking step. The agent's goal is to find weaknesses in its *own* prior work to improve it. This is a sign of a high-functioning process.",
         "Evaluate the agent's points. Not all 'risks' it identifies may be realistic. Use your judgment to decide which suggestions to incorporate into the plan.",
@@ -97,22 +114,49 @@
       ],
       "requireConfirmation": true
     },
+    {
+      "id": "phase-2c-finalize-plan",
+      "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
+      "title": "Phase 2c: Finalize Implementation Plan",
+      "prompt": "Review the initial `implementation_plan.md` from Phase 2 and the Devil's Advocate critique from Phase 2b. Your task is to create a final, consolidated implementation plan that incorporates the valid feedback from the review.\n\nYour output must be the final `implementation_plan.md`.\n\nAdditionally, explicitly list any suggestions from the review that you believe are valuable but out-of-scope for the current task. These should be formatted as potential tickets for future work.",
+      "agentRole": "You are a pragmatic technical project manager. Your goal is to synthesize feedback, make decisive trade-offs, and produce a final, actionable plan that is ready for execution.",
+      "guidance": [
+        "This is the final plan that will be executed. Ensure it is clear, actionable, and reflects the best path forward.",
+        "The list of out-of-scope items helps capture valuable ideas without derailing the current task."
+      ],
+      "requireConfirmation": true
+    },
+    {
+      "id": "phase-2d-plan-sanity-check",
+      "runCondition": { "var": "taskComplexity", "not_equals": "Small" },
+      "title": "Phase 2d: Plan Sanity Check",
+      "prompt": "Before starting implementation, perform a sanity check on the final `implementation_plan.md`. Your goal is to use your tools to quickly verify the plan's core assumptions against the current codebase. This is not a deep analysis, but a quick check for obvious errors.\n\nFor the key files, functions, classes, or APIs mentioned in the plan, perform the following checks:\n1. **Existence Check:** Use tools like `grep` or `ls` to confirm that the files and primary functions/classes you plan to modify actually exist where you expect them to.\n2. **Signature/API Check (if applicable):** Briefly check the function signatures or API endpoints you intend to use. Do they match the plan's assumptions? For example, if the plan assumes a function takes two arguments, verify that.\n3. **Dependency Check:** If the plan relies on a specific library or module being available, quickly verify its presence (e.g., check `package.json`, `requirements.txt`, etc.).\n\n**Report your findings as a simple checklist:**\n- [✅ or ❌] File `src/example.js` exists.\n- [✅ or ❌] Function `calculateTotal()` found in `src/utils.js`.\n- [✅ or ❌] Dependency `moment` is listed in `package.json`.\n\nIf any check fails (❌), briefly explain the discrepancy and ask the user if the plan needs to be revised before proceeding.",
+      "agentRole": "You are a pragmatic QA engineer double-checking a plan before the development team starts work. Your job is to be quick, efficient, and focused on verifying concrete facts using tools. You are not re-evaluating the plan's logic, only its tangible connection to the codebase.",
+      "guidance": [
+        "This is a quick verification step, not a full re-analysis. The goal is to catch glaring errors before implementation begins.",
+        "Use your code-browsing tools (`grep`, `ls`) to validate the plan's assumptions.",
+        "If a check fails, it's crucial to pause and get human confirmation before proceeding with a potentially flawed plan."
+      ],
+      "requireConfirmation": false
+    },
     {
       "id": "phase-3-iterative-implementation",
       "title": "Phase 3: Iterative Implementation (PREP -> IMPLEMENT -> VERIFY)",
-      "prompt": "The implementation phase has now begun.
+      "prompt": "The implementation phase has now begun. You will now execute the approved `implementation_plan.md` step-by-step. Announce which step you are starting.\n\nYou will execute each step using the PREP -> IMPLEMENT -> VERIFY cycle defined in the guidance below. This process will repeat until all steps in the plan are complete.",
       "agentRole": "You are a meticulous senior software engineer focused on high-quality implementation. Your approach emphasizes careful preparation, precise execution, and thorough verification. You excel at following plans while adapting to unexpected discoveries during implementation.",
       "guidance": [
+        "**IMPORTANT**: Do not proceed to the next workflow phase (Final Review) until all steps in the `implementation_plan.md` are marked as complete and verified.",
         "**Efficiency Tip:** For high-confidence plans, you may provide multiple step instructions at once. I will execute them sequentially, performing the P->I->V cycle for each, and will only pause to ask for input if I encounter a verification failure or ambiguity.",
         "**PREP:** Before implementing each step, you must first PREPARE. Re-read the step's description, confirm the previous step was completed correctly, verify the plan for this step is still valid in the current codebase, and list all required inputs or files. Do not proceed if anything is unclear.",
         "**IMPLEMENT:** After preparation is confirmed, you will IMPLEMENT the step. Focus only on this single step. Use your tools to make the necessary code changes, adhering to all quality standards. Provide a commit message upon completion.",
-        "**VERIFY:** Immediately after implementation, you must VERIFY your work. Your verification for this step **is not complete until you have**:\n1. **Written necessary unit/integration tests** for the new logic.\n2. **Run the full test suite** to ensure no regressions were introduced.\n3. **Performed a critical self-review** of the changes against the plan, checking for code quality, side effects, and architectural alignment.\n\n**Failure Protocol:** If a verification failure cannot be resolved after two attempts, you must halt. Do not try a third time. Instead, present a summary of the problem, detail your failed attempts, and recommend a course of action to the user (e.g., 'revert this step and re-plan', 'request more information', 'proceed with a known issue')."
-      ]
+        "**VERIFY:** Immediately after implementation, you must VERIFY your work. Your verification for this step **is not complete until you have**:\n1. **Written necessary unit/integration tests** for the new logic (as per the testing strategy).\n2. **Run the full test suite** to ensure no regressions were introduced.\n3. **Performed a critical self-review** of the changes against the plan, checking for code quality, side effects, and architectural alignment.\n\n**Failure Protocol:** If a verification failure cannot be resolved after two attempts, you must halt. Do not try a third time. Instead, present a summary of the problem, detail your failed attempts, and recommend a course of action to the user (e.g., 'revert this step and re-plan', 'request more information', 'proceed with a known issue')."
+      ],
+      "requireConfirmation": false
     },
     {
       "id": "phase-4-final-review",
       "title": "Phase 4: Final Review & Completion",
-      "prompt": "All planned steps have been implemented and verified. Your final goal is to perform a holistic review by validating the work against the **'Final Review Checklist'**
+      "prompt": "All planned steps have been implemented and verified. Your final goal is to perform a holistic review by validating the work against the **'Final Review Checklist'** from the `implementation_plan.md`.\n\nFor each item on that checklist, provide a confirmation and evidence that it has been met. Conclude with a summary of any potential follow-ups or new dependencies to note.",
       "agentRole": "You are a quality assurance specialist and technical lead responsible for final project validation. Your expertise lies in comprehensive system testing, requirement verification, and ensuring deliverables meet all specified criteria. Approach this with the thoroughness of a senior engineer conducting a final release review.",
       "guidance": [
         "This is the final quality check. Ensure the agent's summary and checklist validation align with your understanding of the completed work."
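Most steps added to this workflow are gated by `runCondition` objects such as `{"var": "taskComplexity", "not_equals": "Small"}`, and the new bug-investigation workflow below uses an `"or"` form over `bugComplexity`. WorkRail's actual evaluator is not part of this diff; the sketch below only illustrates a plausible semantics for these condition shapes (the `runConditionMet` function name and the exact narrowing rules are assumptions):

```typescript
// Hypothetical evaluator for the runCondition shapes seen in this diff.
// WorkRail's real implementation may differ; this is illustrative only.
type Condition =
  | { var: string; equals?: string; not_equals?: string }
  | { or: Condition[] };

function runConditionMet(cond: Condition, ctx: Record<string, string>): boolean {
  // "or" succeeds if any branch succeeds.
  if ("or" in cond) return cond.or.some((c) => runConditionMet(c, ctx));
  const value = ctx[cond.var];
  if (cond.equals !== undefined) return value === cond.equals;
  if (cond.not_equals !== undefined) return value !== cond.not_equals;
  return true; // no constraint given: the step always runs
}

// Example: Phase 1 is skipped when taskComplexity is "Small".
const gate: Condition = { var: "taskComplexity", not_equals: "Small" };
console.log(runConditionMet(gate, { taskComplexity: "Small" })); // false -> step skipped
console.log(runConditionMet(gate, { taskComplexity: "Large" })); // true -> step runs
```

Under this reading, the `"or"` condition on `bugComplexity` runs the deep-analysis phase for either "standard" or "complex" bugs, while the streamlined phase runs only for "simple".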
package/workflows/systemic-bug-investigation.json
@@ -0,0 +1,190 @@
+{
+  "id": "systematic-bug-investigation",
+  "name": "Systematic Bug Investigation Workflow",
+  "version": "1.0.0",
+  "description": "A comprehensive workflow for systematic bug and failing test investigation that prevents LLMs from jumping to conclusions. Enforces thorough evidence gathering, hypothesis formation, debugging instrumentation, and validation to achieve near 100% certainty about root causes. This workflow does NOT fix bugs - it produces detailed diagnostic writeups that enable effective fixing by providing complete understanding of what is happening, why it's happening, and supporting evidence.",
+  "clarificationPrompts": [
+    "What type of system is this? (web app, mobile app, backend service, desktop app, etc.)",
+    "How consistently can you reproduce this bug? (always reproducible, sometimes reproducible, rarely reproducible)",
+    "What was the last known working version or state if applicable?",
+    "Are there any time constraints or urgency factors for this investigation?",
+    "What level of system access do you have? (full codebase, limited access, production logs only)"
+  ],
+  "preconditions": [
+    "User has identified a specific bug or failing test to investigate",
+    "Agent has access to codebase analysis tools (grep, file readers, etc.)",
+    "Agent has access to build/test execution tools for the project type",
+    "User can provide error messages, stack traces, or test failure output"
+  ],
+  "metaGuidance": [
+    "INVESTIGATION DISCIPLINE: Never propose fixes or solutions until Phase 6 (Comprehensive Diagnostic Writeup). Focus entirely on systematic evidence gathering and analysis.",
+    "HYPOTHESIS RIGOR: All hypotheses must be based on concrete evidence from code analysis with quantified scoring (1-10 scales). Maximum 5 hypotheses per investigation.",
+    "DEBUGGING INSTRUMENTATION: Always implement debugging mechanisms before running tests - logs, print statements, or test modifications that will provide evidence.",
+    "EVIDENCE THRESHOLD: Require minimum 3 independent sources of evidence before confirming any hypothesis. Use objective verification criteria.",
+    "SYSTEMATIC PROGRESSION: Complete each investigation phase fully before proceeding. Each phase builds critical context for the next with structured documentation.",
+    "CONFIDENCE CALIBRATION: Use mathematical confidence framework with 9.0/10 minimum threshold. Actively challenge conclusions with adversarial analysis.",
+    "UNCERTAINTY ACKNOWLEDGMENT: Explicitly document all remaining unknowns and their potential impact. No subjective confidence assessments."
+  ],
+  "steps": [
+    {
+      "id": "phase-0-triage",
+      "title": "Phase 0: Initial Triage & Context Gathering",
+      "prompt": "**SYSTEMATIC INVESTIGATION BEGINS** - Your mission is to achieve near 100% certainty about this bug's root cause through systematic evidence gathering. NO FIXES will be proposed until Phase 6.\n\n**STEP 1: Bug Report Analysis**\nPlease provide the complete bug context:\n- **Bug Description**: What is the observed behavior vs expected behavior?\n- **Error Messages/Stack Traces**: Paste the complete error output\n- **Reproduction Steps**: How can this bug be consistently reproduced?\n- **Environment Details**: OS, language version, framework version, etc.\n- **Recent Changes**: Any recent commits, deployments, or configuration changes?\n\n**STEP 2: Project Type Classification**\nBased on the information provided, I will classify the project type and set debugging strategies:\n- **Languages/Frameworks**: Primary tech stack\n- **Build System**: Maven, Gradle, npm, etc.\n- **Testing Framework**: JUnit, Jest, pytest, etc.\n- **Logging System**: Available logging mechanisms\n\n**STEP 3: Complexity Assessment**\nI will analyze the bug complexity using these criteria:\n- **Simple**: Single function/method, clear error path, minimal dependencies\n- **Standard**: Multiple components, moderate investigation required\n- **Complex**: Cross-system issues, race conditions, complex state management\n\n**OUTPUTS**: Set `projectType`, `bugComplexity`, and `debuggingMechanism` context variables.",
+      "agentRole": "You are a senior debugging specialist and bug triage expert with 15+ years of experience across multiple technology stacks. Your expertise lies in quickly classifying bugs, understanding project architectures, and determining appropriate investigation strategies. You excel at extracting critical information from bug reports and setting up systematic investigation approaches.",
+      "guidance": [
+        "CLASSIFICATION ACCURACY: Proper complexity assessment determines investigation depth - be thorough but decisive",
+        "CONTEXT CAPTURE: Gather complete environmental and situational context now to avoid gaps later",
+        "DEBUGGING STRATEGY: Choose debugging mechanisms appropriate for the project type and bug complexity",
+        "NO ASSUMPTIONS: If critical information is missing, explicitly request it before proceeding"
+      ]
+    },
+    {
+      "id": "phase-1-streamlined-analysis",
+      "runCondition": {
+        "var": "bugComplexity",
+        "equals": "simple"
+      },
+      "title": "Phase 1: Streamlined Analysis (Simple Bugs)",
+      "prompt": "**STREAMLINED CODEBASE INVESTIGATION** - For simple bugs, I will perform focused analysis of the core issue.\n\n**STEP 1: Direct Component Analysis**\nI will examine the specific component involved:\n- **Primary Function/Method**: Direct analysis of the failing code\n- **Input/Output Analysis**: What data enters and exits the component\n- **Logic Flow**: Step-by-step execution path\n- **Error Point**: Exact location where failure occurs\n\n**STEP 2: Immediate Context Review**\n- **Recent Changes**: Git commits affecting this specific component\n- **Related Tests**: Existing test coverage for this functionality\n- **Dependencies**: Direct dependencies that could affect this component\n\n**STEP 3: Quick Hypothesis Formation**\nI will generate 1-3 focused hypotheses based on:\n- **Obvious Error Patterns**: Common failure modes for this type of component\n- **Change Impact**: How recent modifications could cause this issue\n- **Input Validation**: Whether invalid inputs are causing the failure\n\n**OUTPUTS**: Focused understanding of the simple bug with 1-3 targeted hypotheses ready for validation.",
+      "agentRole": "You are an experienced debugging specialist who excels at quickly identifying and resolving straightforward technical issues. Your strength lies in pattern recognition and efficient root cause analysis for simple bugs. You focus on the most likely causes while avoiding over-analysis.",
+      "guidance": [
+        "FOCUSED ANALYSIS: Concentrate on the specific failing component, avoid deep architectural analysis",
+        "PATTERN RECOGNITION: Use experience to identify common failure modes quickly",
+        "EFFICIENT HYPOTHESIS: Generate 1-3 focused hypotheses, not exhaustive possibilities",
+        "DIRECT APPROACH: Skip complex dependency analysis unless directly relevant"
+      ]
+    },
+    {
+      "id": "phase-1-comprehensive-analysis",
+      "runCondition": {
+        "or": [
+          {
+            "var": "bugComplexity",
+            "equals": "standard"
+          },
+          {
+            "var": "bugComplexity",
+            "equals": "complex"
+          }
+        ]
+      },
+      "title": "Phase 1: Deep Codebase Analysis (Standard/Complex Bugs)",
+      "prompt": "**SYSTEMATIC CODEBASE INVESTIGATION** - I will now perform comprehensive analysis of the relevant codebase components.\n\n**STEP 1: Affected Component Identification**\nBased on the bug report, I will identify and analyze:\n- **Primary Components**: Classes, functions, modules directly involved\n- **Dependency Chain**: Related components that could influence the bug\n- **Data Flow**: How data moves through the affected systems\n- **Error Propagation Paths**: Where and how errors can originate and propagate\n\n**STEP 2: Code Structure Analysis**\nFor each relevant component, I will examine:\n- **Implementation Logic**: Step-by-step code execution flow\n- **State Management**: How state is created, modified, and shared\n- **Error Handling**: Existing error handling mechanisms\n- **External Dependencies**: Third-party libraries, APIs, database interactions\n- **Concurrency Patterns**: Threading, async operations, shared resources\n\n**STEP 3: Historical Context Review**\nI will analyze:\n- **Recent Changes**: Git history around the affected components\n- **Test Coverage**: Existing tests and their coverage of the bug area\n- **Known Issues**: TODO comments, FIXME notes, or similar patterns\n\n**OUTPUTS**: Comprehensive understanding of the codebase architecture and potential failure points.",
+      "agentRole": "You are a principal software architect and code analysis expert specializing in systematic codebase investigation. Your strength lies in quickly understanding complex system architectures, identifying failure points, and tracing execution flows. You excel at connecting code patterns to potential runtime behaviors.",
+      "guidance": [
+        "SYSTEMATIC COVERAGE: Analyze all relevant components, not just the obvious ones",
+        "EXECUTION FLOW FOCUS: Trace the actual code execution path that leads to the bug",
+        "STATE ANALYSIS: Pay special attention to state management and mutation patterns",
+        "DEPENDENCY MAPPING: Understand how external dependencies could contribute to the issue"
+      ]
+    },
+    {
+      "id": "phase-2-hypothesis-formation",
+      "title": "Phase 2: Evidence-Based Hypothesis Formation",
+      "prompt": "**HYPOTHESIS GENERATION FROM EVIDENCE** - Based on the codebase analysis, I will now formulate testable hypotheses about the bug's root cause.\n\n**STEP 1: Evidence-Based Hypothesis Development**\nI will create a maximum of 5 prioritized hypotheses. For each potential root cause, I will create a hypothesis that includes:\n- **Root Cause Theory**: Specific technical explanation of what is happening\n- **Supporting Evidence**: Code patterns, architectural decisions, or logic flows that support this theory\n- **Failure Mechanism**: Exact sequence of events that leads to the observed bug\n- **Testability Score**: Quantified assessment (1-10) of how easily this can be validated\n- **Evidence Strength Score**: Quantified assessment (1-10) based on concrete code findings\n\n**STEP 2: Hypothesis Prioritization Matrix**\nI will rank hypotheses using this weighted scoring system:\n- **Evidence Strength** (40%): How much concrete code analysis supports this theory\n- **Testability** (35%): How easily this can be validated with debugging instruments\n- **Impact Scope** (25%): How well this explains all observed symptoms\n\n**STEP 3: Hypothesis Validation Strategy**\nFor the top 3 hypotheses, I will define:\n- **Required Evidence**: What specific evidence would confirm or refute this hypothesis\n- **Debugging Approach**: What instrumentation or tests would provide this evidence\n- **Success Criteria**: What results would prove this hypothesis correct\n- **Confidence Threshold**: Minimum evidence quality needed to validate\n\n**STEP 4: Hypothesis Documentation**\nI will create a structured hypothesis registry:\n- **Hypothesis ID**: H1, H2, H3 for tracking\n- **Status**: Active, Refuted, Confirmed\n- **Evidence Log**: All supporting and contradicting evidence\n- **Validation Plan**: Specific testing approach\n\n**CRITICAL RULE**: All hypotheses must be based on concrete evidence from code analysis, not assumptions or common patterns.\n\n**OUTPUTS**: Maximum 5 hypotheses with quantified scoring, top 3 selected for validation with structured documentation.",
+      "agentRole": "You are a senior software detective and root cause analysis expert with deep expertise in systematic hypothesis formation. Your strength lies in connecting code evidence to potential failure mechanisms and creating testable theories. You excel at logical reasoning and evidence-based deduction. You must maintain rigorous quantitative standards and reject any hypothesis not grounded in concrete code evidence.",
+      "guidance": [
+        "EVIDENCE-BASED ONLY: Every hypothesis must be grounded in concrete code analysis findings with quantified evidence scores",
+        "HYPOTHESIS LIMITS: Generate maximum 5 hypotheses to prevent analysis paralysis",
+        "QUANTIFIED SCORING: Use 1-10 scales for evidence strength and testability with clear criteria",
+        "STRUCTURED DOCUMENTATION: Create formal hypothesis registry with tracking IDs and status",
+        "VALIDATION RIGOR: Only proceed with top 3 hypotheses that meet minimum evidence thresholds"
+      ],
+      "validationCriteria": [
+        {
+          "type": "contains",
+          "value": "Evidence Strength Score",
+          "message": "Must include quantified evidence strength scoring (1-10) for each hypothesis"
+        },
+        {
+          "type": "contains",
+          "value": "Testability Score",
+          "message": "Must include quantified testability scoring (1-10) for each hypothesis"
+        },
+        {
+          "type": "contains",
+          "value": "Hypothesis ID",
+          "message": "Must assign tracking IDs (H1, H2, H3, etc.) to each hypothesis"
+        },
+        {
+          "type": "regex",
+          "pattern": "H[1-5]",
+          "message": "Must use proper hypothesis ID format (H1, H2, H3, H4, H5)"
+        }
+      ]
+    },
|
|
116
|
+
    {
      "id": "phase-3-debugging-instrumentation",
      "title": "Phase 3: Debugging Instrumentation Setup",
      "prompt": "**SYSTEMATIC DEBUGGING INSTRUMENTATION** - I will now implement debugging mechanisms to gather evidence for hypothesis validation.\n\n**STEP 1: Instrumentation Strategy Selection**\nBased on the `projectType` and `debuggingMechanism` context, I will choose appropriate debugging approaches:\n- **Logging**: Strategic log statements to capture state and flow\n- **Print Debugging**: Console output for immediate feedback\n- **Test Modifications**: Enhanced test cases with additional assertions\n- **Debugging Tests**: New test cases specifically designed to validate hypotheses\n- **Profiling**: Performance monitoring if relevant to the bug\n\n**STEP 2: Strategic Instrumentation Implementation**\nFor each top-priority hypothesis, I will implement:\n- **Entry/Exit Logging**: Function entry and exit points with parameter/return values\n- **State Capture**: Critical variable values at key decision points\n- **Flow Tracing**: Execution path tracking through complex logic\n- **Error Context**: Enhanced error messages with additional diagnostic information\n- **Timing Information**: Timestamps for race condition or performance-related issues\n\n**STEP 3: Instrumentation Validation**\nI will verify that the instrumentation:\n- **Covers All Hypotheses**: Each hypothesis has corresponding debugging output\n- **Maintains Code Safety**: Debugging code doesn't alter production behavior\n- **Provides Clear Evidence**: Output will clearly confirm or refute hypotheses\n- **Handles Edge Cases**: Instrumentation works for all potential execution paths\n\n**STEP 4: Execution Instructions**\nI will provide clear instructions for:\n- **How to run the instrumented code**: Specific commands or procedures\n- **What to look for**: Expected output patterns for each hypothesis\n- **How to capture results**: Ensuring complete log/output collection\n\n**OUTPUTS**: Instrumented code ready for execution with clear validation criteria.",
      "agentRole": "You are a debugging instrumentation specialist and diagnostic expert with extensive experience in systematic evidence collection. Your expertise lies in implementing non-intrusive debugging mechanisms that provide clear evidence for hypothesis validation. You excel at strategic instrumentation that maximizes diagnostic value.",
      "guidance": [
        "STRATEGIC PLACEMENT: Place instrumentation at points that will provide maximum diagnostic value",
        "NON-INTRUSIVE: Ensure debugging code doesn't alter the bug's behavior",
        "COMPREHENSIVE COVERAGE: Instrument all critical paths related to the hypotheses",
        "CLEAR OUTPUT: Design instrumentation to provide unambiguous evidence"
      ]
    },
    {
      "id": "phase-4-evidence-collection",
      "title": "Phase 4: Evidence Collection & Analysis",
      "prompt": "**EVIDENCE COLLECTION PHASE** - Time to execute the instrumented code and gather evidence for hypothesis validation.\n\n**STEP 1: Execution Coordination**\nI will guide you through:\n- **Execution Commands**: Precise commands to run the instrumented code\n- **Data Collection**: How to capture all relevant output, logs, and results\n- **Multiple Runs**: Instructions for running different scenarios if needed\n- **Failure Scenarios**: How to handle execution failures or unexpected results\n\n**STEP 2: Evidence Analysis Framework**\nOnce you provide the execution results, I will systematically analyze:\n- **Hypothesis Validation**: Which hypotheses are confirmed or refuted by the evidence\n- **Unexpected Findings**: Any results that don't match our predictions\n- **Evidence Quality**: Strength and reliability of the collected evidence\n- **Confidence Assessment**: Current confidence level in each hypothesis\n\n**STEP 3: Evidence Correlation**\nI will examine:\n- **Pattern Recognition**: Consistent patterns across multiple execution runs\n- **Timing Analysis**: Sequence of events leading to the bug\n- **State Evolution**: How system state changes during bug reproduction\n- **Error Propagation**: How errors cascade through the system\n\n**STEP 4: Confidence Evaluation**\nI will assess:\n- **Evidence Strength**: How conclusively the evidence supports each hypothesis\n- **Remaining Uncertainties**: What questions remain unanswered\n- **Additional Evidence Needs**: Whether more debugging is required\n\n**CRITICAL THRESHOLD**: If confidence level is below 90%, I will recommend additional instrumentation or evidence collection.\n\n**OUTPUTS**: Evidence-based validation of hypotheses with confidence assessment.",
      "agentRole": "You are a forensic evidence analyst and systematic debugging expert specializing in evidence collection and hypothesis validation. Your expertise lies in coordinating debugging execution, analyzing complex diagnostic output, and drawing reliable conclusions from evidence. You excel at maintaining objectivity and rigor in evidence evaluation.",
      "guidance": [
        "SYSTEMATIC ANALYSIS: Analyze evidence methodically against each hypothesis",
        "OBJECTIVE EVALUATION: Remain objective - let evidence drive conclusions, not preferences",
        "CONFIDENCE THRESHOLDS: Don't proceed to conclusions without sufficient evidence",
        "MULTIPLE PERSPECTIVES: Consider alternative interpretations of the evidence"
      ]
    },
    {
      "id": "phase-5-root-cause-confirmation",
      "title": "Phase 5: Root Cause Confirmation",
      "prompt": "**ROOT CAUSE CONFIRMATION** - Based on collected evidence, I will confirm the definitive root cause with high confidence.\n\n**STEP 1: Evidence Synthesis**\n- **Confirm Primary Hypothesis**: Identify strongest evidence-supported hypothesis\n- **Eliminate Alternatives**: Rule out other hypotheses based on evidence\n- **Address Contradictions**: Resolve conflicting evidence or unexpected findings\n- **Validate Completeness**: Ensure hypothesis explains all observed symptoms\n\n**STEP 2: Objective Evidence Verification**\n- **Evidence Diversity**: Minimum 3 independent supporting sources\n- **Reproducibility**: Evidence consistently reproducible across test runs\n- **Specificity**: Evidence directly relates to hypothesis, not circumstantial\n- **Contradiction Resolution**: Conflicting evidence explicitly addressed\n\n**STEP 3: Adversarial Challenge Protocol**\n- **Devil's Advocate Analysis**: Argue against primary hypothesis with available evidence\n- **Alternative Explanation Search**: Identify 2+ alternative explanations for evidence\n- **Confidence Calibration**: Rate certainty on calibrated scale with explicit reasoning\n- **Uncertainty Documentation**: List remaining unknowns and their potential impact\n\n**STEP 4: Confidence Assessment Matrix**\n- **Evidence Quality Score** (1-10): Reliability and completeness of supporting evidence\n- **Explanation Completeness** (1-10): How well root cause explains all symptoms\n- **Alternative Likelihood** (1-10): Probability alternatives are correct (inverted)\n- **Final Confidence** = (Evidence Quality × 0.4) + (Completeness × 0.4) + (Alternative × 0.2)\n\n**CONFIDENCE THRESHOLD**: Proceed only if Final Confidence ≥ 9.0/10. If below, recommend additional investigation with specific evidence gaps.\n\n**OUTPUTS**: High-confidence root cause with quantified assessment and adversarial validation.",
      "agentRole": "You are a senior root cause analysis expert and forensic investigator with deep expertise in systematic evidence evaluation and definitive conclusion formation. Your strength lies in synthesizing complex evidence into clear, confident determinations. You excel at maintaining rigorous standards for certainty while providing actionable insights. You must actively challenge your own conclusions and maintain objective, quantified confidence assessments.",
      "guidance": [
        "OBJECTIVE VERIFICATION: Use quantified evidence quality criteria, not subjective assessments",
        "ADVERSARIAL MINDSET: Actively challenge your own conclusions with available evidence",
        "CONFIDENCE CALIBRATION: Use mathematical framework for confidence scoring, not intuition",
        "UNCERTAINTY DOCUMENTATION: Explicitly list all remaining unknowns and their impact",
        "EVIDENCE CITATION: Support every conclusion with specific, reproducible evidence"
      ],
      "validationCriteria": [
        {
          "type": "contains",
          "value": "Evidence Quality Score",
          "message": "Must include quantified evidence quality scoring (1-10) for root cause confirmation"
        },
        {
          "type": "contains",
          "value": "Explanation Completeness",
          "message": "Must include explanation completeness scoring (1-10) for root cause confirmation"
        },
        {
          "type": "contains",
          "value": "Alternative Likelihood",
          "message": "Must include alternative likelihood scoring (1-10) for root cause confirmation"
        },
        {
          "type": "regex",
          "pattern": "Final Confidence = [0-9\\.]+",
          "message": "Must calculate and report final confidence score using the specified formula"
        }
      ]
    },
    {
      "id": "phase-6-diagnostic-writeup",
      "title": "Phase 6: Comprehensive Diagnostic Writeup",
      "prompt": "**FINAL DIAGNOSTIC DOCUMENTATION** - I will create a comprehensive writeup enabling effective bug fixing and knowledge transfer.\n\n**STEP 1: Executive Summary**\n- **Bug Summary**: Concise description of issue and impact\n- **Root Cause**: Clear, non-technical explanation of what is happening\n- **Confidence Level**: Final confidence assessment with calculation methodology\n- **Scope**: What systems, users, or scenarios are affected\n\n**STEP 2: Technical Deep Dive**\n- **Root Cause Analysis**: Detailed technical explanation of failure mechanism\n- **Code Component Analysis**: Specific files, functions, and lines with exact locations\n- **Execution Flow**: Step-by-step sequence of events leading to bug\n- **State Analysis**: How system state contributes to failure\n\n**STEP 3: Investigation Methodology**\n- **Investigation Timeline**: Chronological summary with time investments per phase\n- **Hypothesis Evolution**: Complete record of all hypotheses (H1-H5) with status changes\n- **Evidence Quality Assessment**: Rating and reliability of each evidence source\n- **Key Evidence**: Most important evidence that led to root cause with citations\n\n**STEP 4: Knowledge Transfer & Action Plan**\n- **Skill Requirements**: Technical expertise needed to understand and fix issue\n- **Prevention Strategies**: Specific measures to prevent similar issues\n- **Code Review Checklist**: Items to check during reviews to catch similar problems\n- **Immediate Actions**: Steps to mitigate issue temporarily with owners and timelines\n- **Root Cause Remediation**: Areas needing permanent fixes with complexity estimates\n- **Testing Strategy**: Comprehensive approach to verify fixes work correctly\n\n**DELIVERABLE**: Enterprise-grade diagnostic report enabling confident bug fixing, knowledge transfer, and organizational learning.",
      "agentRole": "You are a senior technical writer and diagnostic documentation specialist with expertise in creating comprehensive, actionable bug reports for enterprise environments. Your strength lies in translating complex technical investigations into clear, structured documentation that enables effective problem resolution, knowledge transfer, and organizational learning. You excel at creating reports that serve immediate fixing needs, long-term system improvement, and team collaboration.",
      "guidance": [
        "ENTERPRISE FOCUS: Write for multiple stakeholders including developers, managers, and future team members",
        "KNOWLEDGE TRANSFER: Include methodology and reasoning, not just conclusions",
        "COLLABORATIVE DESIGN: Structure content for peer review and team coordination",
        "COMPREHENSIVE COVERAGE: Include all information needed for resolution and prevention",
        "ACTIONABLE DOCUMENTATION: Provide specific, concrete next steps with clear ownership"
      ]
    }
  ]
}