@exaudeus/workrail 0.1.2 → 0.1.3

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@exaudeus/workrail",
3
- "version": "0.1.2",
3
+ "version": "0.1.3",
4
4
  "description": "MCP server for structured workflow orchestration and step-by-step task guidance",
5
5
  "license": "MIT",
6
6
  "bin": {
@@ -11,6 +11,16 @@
11
11
  "Git repository is recommended for version control and commits (workflow degrades gracefully if unavailable)."
12
12
  ],
13
13
  "metaGuidance": [
14
+ "**FUNCTION DEFINITIONS:** fun updateDecisionLog() = 'Update Decision Log in CONTEXT.md: file paths/ranges, excerpts, why important, outcome impact. Limit 3-5 files/decision.'",
15
+ "fun useTools() = 'Use tools to verify—never guess. Expand file reads to imports/models/interfaces/classes/deps. Trace all dependencies.'",
16
+ "fun createFile(filename) = 'Use edit_file to create/update {filename}. NEVER output full content in chat—only summarize. If fails, request user help & log command.'",
17
+ "fun applyUserRules() = 'Apply & reference user-defined rules, patterns & preferences. Document alignment in Decision Log. Explain rule influence in decisions.'",
18
+ "fun matchPatterns() = 'Use codebase_search/grep to find similar patterns. Reference Decision Log patterns. Match target area unless user rules override.'",
19
+ "fun addResumptionJson(phase) = 'Update CONTEXT.md resumption section with: 1) workflow_get instructions (id: coding-task-workflow-with-loops, mode: preview), 2) workflow_next JSON with workflowId, completedSteps up to {phase}, all context variables.'",
20
+ "fun gitCommit(type, msg) = 'If git available: commit with {type}: {msg}. If unavailable: log in CONTEXT.md with timestamp.'",
21
+ "fun verifyImplementation() = '1) Test coverage >80%, 2) Run full test suite, 3) Self-review. Max 2 attempts before failure protocol.'",
22
+ "fun checkAutomation(action) = 'High: auto-{action} if confidence >8. Medium: request confirmation. Low: extra confirmations.'",
23
+ "fun trackProgress(completed, current) = '✅ Completed: {completed}, 🔄 Current: {current}, ⏳ Remaining phases, 📁 Files created.'",
14
24
  "This workflow follows the ANALYZE -> CLARIFY -> PREP -> IMPLEMENT -> VERIFY pattern with bidirectional dynamic re-triage capabilities.",
15
25
  "Deep codebase analysis occurs early to inform intelligent requirements clarification and all subsequent planning phases.",
16
26
  "Dynamic re-triage allows complexity upgrades and safe downgrades based on new insights from analysis and clarifications.",
@@ -26,13 +36,14 @@
26
36
  "The agent should never guess or assume. Always ask for clarification or use tools to find missing information.",
27
37
  "If you fail to get test results or other tool outputs on the first attempt, ask the user to run it manually.",
28
38
  "Document all user interventions, change requests, and feedback immediately in context documentation to ensure continuity.",
29
- "**Git Fallback Strategy:** If git not initialized or tools fail, gracefully skip commits/branches, log changes manually in CONTEXT.md with timestamps, and warn user at phase-6 start. Document all file modifications for manual version control.",
39
+ "**Git Fallback:** If git unavailable/fails, skip commits/branches, log changes in CONTEXT.md with timestamps. Warn at phase-6. Document file mods for manual VC.",
30
40
  "**Git Error Handling:** Use run_terminal_cmd for git operations; if fails, output exact command for user manual execution. Never halt workflow due to git unavailability.",
31
- "Persist all variables across steps using conversation context, external state, or by including them in every response metadata; reset flags like proposedDowngrade after confirmation.",
32
- "Use safe git practices: Create feature branches for isolation, commit on successful verification, generate clear commit messages, revert on failures. Log commit hashes in CONTEXT.md for traceability.",
41
+ "Persist variables across steps via context/external state/response metadata. Reset flags (e.g., proposedDowngrade) after confirmation.",
42
+ "Use safe git: feature branches, commit on success, clear messages, revert on fail. Log hashes in CONTEXT.md for traceability.",
33
43
  "Maintain existing coding conventions and architectural patterns found in the codebase.",
34
44
  "COMMIT STRATEGY: Auto-commit after successful steps for High automation; suggest for Medium/Low. Use conventional format: type(scope): description. Commit at milestones and after verification passes.",
35
- "**USER RULES:** Always incorporate user-defined rules, patterns & preferences (architecture, coding style, libraries, etc). These override generic practices when specified."
45
+ "When you see function calls like updateDecisionLog() or createFile(spec.md), refer to the function definitions above for full instructions.",
46
+ "For resumption: Include function definitions in CONTEXT.md so new sessions understand these references. Always provide explicit workflow_get and workflow_next instructions."
36
47
  ],
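The `fun name(params) = '...'` entries added to `metaGuidance` act as shorthand that later prompts reference (for example `createFile(spec.md)` or `gitCommit(type, message)`). The diff gives no sign that the package expands these itself; the agent appears to read them as plain text. As a rough illustration of the intended substitution only, the sketch below parses one definition and fills its `{param}` placeholders. `FunDef`, `parseFunDef`, and `expand` are hypothetical names, not part of `@exaudeus/workrail`.

```typescript
// Hypothetical helpers for illustration; not part of @exaudeus/workrail.
interface FunDef {
  name: string;
  params: string[];
  body: string;
}

// Parse a metaGuidance entry of the form: fun name(a, b) = 'body'
// Any prefix before "fun" (e.g. "**FUNCTION DEFINITIONS:**") is ignored.
function parseFunDef(entry: string): FunDef | null {
  const match = /fun\s+(\w+)\(([^)]*)\)\s*=\s*'([\s\S]*)'\s*$/.exec(entry);
  if (!match) return null;
  const [, name = "", rawParams = "", body = ""] = match;
  const params = rawParams
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  return { name, params, body };
}

// Replace each {param} placeholder with the matching positional argument.
// Placeholders without an argument are left untouched.
function expand(def: FunDef, args: string[]): string {
  return def.params.reduce(
    (text, param, i) => text.split(`{${param}}`).join(args[i] ?? `{${param}}`),
    def.body
  );
}

const gitCommit = parseFunDef(
  "fun gitCommit(type, msg) = 'If git available: commit with {type}: {msg}. If unavailable: log in CONTEXT.md with timestamp.'"
);
const expanded = gitCommit ? expand(gitCommit, ["fix", "handle empty input"]) : "";
console.log(expanded);
// If git available: commit with fix: handle empty input. If unavailable: log in CONTEXT.md with timestamp.
```

The net effect of the change is that each long instruction is stated once in `metaGuidance` and then referenced by a short call in the step prompts, which is why several prompts later in this diff shrink.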
37
48
  "steps": [
38
49
  {
@@ -64,6 +75,31 @@
64
75
  ],
65
76
  "requireConfirmation": false
66
77
  },
78
+ {
79
+ "id": "phase-0c-overview-gathering",
80
+ "runCondition": {
81
+ "or": [
82
+ {"var": "taskComplexity", "equals": "Large"},
83
+ {
84
+ "and": [
85
+ {"var": "taskComplexity", "equals": "Medium"},
86
+ {"var": "requestDeepAnalysis", "equals": true}
87
+ ]
88
+ }
89
+ ]
90
+ },
91
+ "title": "Phase 0c: High-Level Architecture Overview",
92
+ "prompt": "Before deep analysis, gather a high-level overview of the codebase architecture to guide subsequent focused explorations.\n\n**OVERVIEW TASKS:**\n1. **Project Structure:** Use `list_dir` to map the directory structure and identify key areas\n2. **Entry Points:** Use `grep_search` and `read_file` to locate and understand main entry points (e.g., main.ts, index.js, app.js)\n3. **Configuration:** Read package.json, tsconfig.json, or similar files to understand dependencies and setup\n4. **Key Patterns:** Use `grep_search` to identify architectural patterns (e.g., dependency injection, MVC, microservices)\n5. **Task Relevance:** Note which modules/areas seem most relevant to the current task\n\n**OUTPUT REQUIREMENTS:**\n- Summarize the overall architecture in 500 words or less\n- Identify 3-5 key areas that warrant deeper investigation for this task\n- Note any discovered patterns that align with or conflict with user rules\n- Set `architectureOverview` context variable with your findings\n\n**Start the Decision Log:** createFile(CONTEXT.md) and updateDecisionLog() with top 3-5 files, why important, how they'll guide analysis.\n\n**Remember:** useTools() for every architectural insight.",
93
+ "agentRole": "You are an architectural scout mapping the codebase landscape. Your role is to quickly identify the project's structure, key patterns, and areas relevant to the task at hand. You excel at using tools efficiently to build a mental map that will guide deeper exploration.",
94
+ "guidance": [
95
+ "This step provides context for more focused deep dives",
96
+ "Use tools liberally but efficiently - broad strokes first",
97
+ "The overview should inform where to focus in subsequent analysis",
98
+ "Start the Decision Log that will be maintained throughout the workflow",
99
+ "Note alignment/conflicts with user rules from the start"
100
+ ],
101
+ "requireConfirmation": false
102
+ },
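The `runCondition` on the new `phase-0c-overview-gathering` step nests `or`, `and`, and `var`/`equals` clauses; other steps in this diff also use `not_equals`. A minimal sketch of how such a condition could be evaluated against the workflow context follows. The `Condition` type and `evaluate` function are assumptions made for illustration and may not match the package's real evaluator.

```typescript
// Hypothetical types and evaluator; the package's real implementation may differ.
type Context = Record<string, unknown>;

type Condition =
  | { var: string; equals: unknown }
  | { var: string; not_equals: unknown }
  | { and: Condition[] }
  | { or: Condition[] };

function evaluate(cond: Condition, ctx: Context): boolean {
  if ("and" in cond) return cond.and.every((c) => evaluate(c, ctx));
  if ("or" in cond) return cond.or.some((c) => evaluate(c, ctx));
  if ("equals" in cond) return ctx[cond.var] === cond.equals;
  return ctx[cond.var] !== cond.not_equals;
}

// The condition added for phase-0c in this diff.
const phase0c: Condition = {
  or: [
    { var: "taskComplexity", equals: "Large" },
    {
      and: [
        { var: "taskComplexity", equals: "Medium" },
        { var: "requestDeepAnalysis", equals: true },
      ],
    },
  ],
};

console.log(evaluate(phase0c, { taskComplexity: "Large" })); // true
console.log(evaluate(phase0c, { taskComplexity: "Medium", requestDeepAnalysis: true })); // true
console.log(evaluate(phase0c, { taskComplexity: "Medium" })); // false
console.log(evaluate(phase0c, { taskComplexity: "Small" })); // false
```

Read this way, the step runs for every Large task and for Medium tasks only when deep analysis was requested, which is the same gate applied to `phase-1-multi-analysis`.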
67
103
  {
68
104
  "id": "phase-small-prep",
69
105
  "runCondition": {"var": "taskComplexity", "equals": "Small"},
@@ -83,7 +119,7 @@
83
119
  "id": "phase-small-implement",
84
120
  "runCondition": {"var": "taskComplexity", "equals": "Small"},
85
121
  "title": "Phase Small: Simple Implementation",
86
- "prompt": "Execute the simple plan you created in the preparation phase.\n\n**IMPLEMENTATION:**\n1. Review the user rules identified in Phase 0b\n2. Work through your 2-5 step plan sequentially\n3. Make the necessary code changes following user rules\n4. Write basic tests as needed\n5. Verify each change works correctly\n\n**GUIDELINES:**\n- Apply all relevant user rules and patterns\n- Keep changes focused and minimal\n- If complexity emerges, pause and recommend upgrading to Medium/Large\n- Commit after successful completion (or log changes if git unavailable)\n\n**VERIFICATION:**\n- Run relevant tests\n- Do a quick self-review\n- Ensure the task objectives are met\n- Confirm user rules were followed\n\nThis is a streamlined process for genuinely simple tasks.",
122
+ "prompt": "Execute the simple plan you created in the preparation phase.\n\n**IMPLEMENTATION:**\n1. Review user rules from Phase 0b\n2. Work through your 2-5 step plan\n3. applyUserRules() in all changes\n4. Write basic tests as needed\n5. Verify each change works\n\n**GUIDELINES:**\n- matchPatterns() from codebase\n- Keep changes focused and minimal\n- If complexity emerges, recommend upgrade to Medium/Large\n- gitCommit(type, message) after success\n\n**VERIFICATION:**\n- Run relevant tests\n- Quick self-review\n- Ensure objectives met\n- Confirm applyUserRules()\n\nThis is streamlined for genuinely simple tasks.",
87
123
  "agentRole": "You are implementing a simple, low-risk task. Execute efficiently while maintaining quality. If unexpected complexity arises, escalate rather than proceeding with inadequate rigor.",
88
124
  "guidance": [
89
125
  "Execute the plan from phase-small-prep",
@@ -94,43 +130,42 @@
94
130
  "requireConfirmation": false
95
131
  },
96
132
  {
97
- "id": "phase-1-deep-analysis-mandatory",
98
- "runCondition": {"var": "taskComplexity", "equals": "Large"},
99
- "title": "Phase 1: Mandatory Deep Codebase Analysis",
100
- "prompt": "Your goal is to become an expert on the attached codebase before any planning begins. This deep analysis is mandatory for Large tasks due to their complexity and risk.\n\n**ANALYSIS BOUNDS: Limit output to 1500 words; prioritize task-relevant sections.**\n\n**IMPORTANT: First check for any user-defined rules or preferences that should guide your analysis. These may include architectural patterns, coding standards, or specific approaches the user prefers.**\n\nYour analysis must include:\n1. **Architecture:** Main modules, layers, and patterns.\n2. **Key Concepts:** Core models, conventions, and important components.\n3. **Execution Flow:** Trace major features or entry points.\n4. **Code Quality Assessment:** Note maintainability, readability, or coupling issues.\n5. **Testing Strategy:** Describe how the code is tested.\n6. **Opportunities:** Suggest refactorings or improvements.\n7. **Task Relevance:** Identify which parts of the codebase are most relevant to the current task.\n8. **Potential Ambiguities:** Note areas where the initial task description might be unclear given the codebase structure.\n9. **Complexity Indicators:** Note any discoveries that might affect the initial complexity assessment.\n10. **User Rules Alignment:** Note how the codebase aligns or conflicts with any user-defined rules, preferences, or patterns.\n\nProvide summaries and code examples to illustrate your findings. Be exhaustive within the word limit, as if preparing onboarding documentation for a senior engineer. This analysis will inform all subsequent requirements clarification, specification, and design work.",
101
- "agentRole": "You are an expert codebase analyst with 10+ years of experience in software architecture and legacy system analysis. Your specialty is quickly understanding complex codebases and identifying architectural patterns, risks, and opportunities. Approach this with the thoroughness of a senior engineer conducting a technical due diligence review.",
102
- "askForFiles": true,
103
- "guidance": [
104
- "This step is mandatory for Large tasks due to their complexity and risk",
105
- "Ensure all relevant source files are attached or accessible to the agent before running this step",
106
- "Be thorough but respect the 1500-word limit - focus on task-relevant insights",
107
- "This analysis will inform requirements clarification, specification, design, and implementation strategy",
108
- "Pay special attention to areas of the codebase relevant to the current task",
109
- "Note potential ambiguities in the task description that become apparent after understanding the codebase",
110
- "Flag any complexity indicators that might warrant re-triaging the task complexity"
111
- ],
112
- "requireConfirmation": false
113
- },
114
- {
115
- "id": "phase-1-deep-analysis-optional",
133
+ "id": "phase-1-multi-analysis",
134
+ "type": "loop",
135
+ "title": "Phase 1: Multi-Step Focused Codebase Analysis",
116
136
  "runCondition": {
117
- "and": [
118
- {"var": "taskComplexity", "equals": "Medium"},
119
- {"var": "requestDeepAnalysis", "equals": true}
137
+ "or": [
138
+ {"var": "taskComplexity", "equals": "Large"},
139
+ {
140
+ "and": [
141
+ {"var": "taskComplexity", "equals": "Medium"},
142
+ {"var": "requestDeepAnalysis", "equals": true}
143
+ ]
144
+ }
120
145
  ]
121
146
  },
122
- "title": "Phase 1: Optional Deep Codebase Analysis",
123
- "prompt": "You requested optional deep analysis for this Medium task. Your goal is to become an expert on the attached codebase before planning begins.\n\n**ANALYSIS BOUNDS: Limit output to 1500 words; prioritize task-relevant sections.**\n\nYour analysis must include:\n1. **Architecture:** Main modules, layers, and patterns.\n2. **Key Concepts:** Core models, conventions, and important components.\n3. **Execution Flow:** Trace major features or entry points.\n4. **Code Quality Assessment:** Note maintainability, readability, or coupling issues.\n5. **Testing Strategy:** Describe how the code is tested.\n6. **Opportunities:** Suggest refactorings or improvements.\n7. **Task Relevance:** Identify which parts of the codebase are most relevant to the current task.\n8. **Potential Ambiguities:** Note areas where the initial task description might be unclear given the codebase structure.\n9. **Complexity Indicators:** Note any discoveries that might affect the initial complexity assessment.\n\nProvide summaries and code examples to illustrate your findings. Focus on areas most relevant to the current task while maintaining architectural awareness. This analysis will inform all subsequent requirements clarification, specification, and design work.",
124
- "agentRole": "You are a focused codebase analyst specializing in targeted technical analysis for medium-complexity projects. Your approach balances thoroughness with efficiency, focusing on areas most relevant to the current task while maintaining architectural awareness.",
125
- "askForFiles": true,
126
- "guidance": [
127
- "This optional analysis was requested for a Medium task",
128
- "Ensure all relevant source files are attached or accessible to the agent before running this step",
129
- "Focus on areas most relevant to the current task while maintaining broader architectural context",
130
- "Respect the 1500-word limit - prioritize task-relevant insights",
131
- "This analysis will inform requirements clarification, specification, and design phases",
132
- "Note potential ambiguities in the task description that become apparent after understanding the codebase",
133
- "Flag any complexity indicators that might warrant re-triaging the task complexity"
147
+ "loop": {
148
+ "type": "for",
149
+ "count": 3,
150
+ "maxIterations": 4,
151
+ "iterationVar": "analysisStep"
152
+ },
153
+ "body": [
154
+ {
155
+ "id": "phase-1-sub-analysis",
156
+ "title": "Analysis #{{analysisStep}}: {{analysisStep === 1 ? 'Structure' : analysisStep === 2 ? 'Modules' : 'Dependencies'}}",
157
+ "prompt": "{{analysisStep === 1 ? '**STEP 1: STRUCTURAL MAPPING**\\n\\nBuild on phase-0c overview, dive deeper into structure:\\n\\n1. Module organization (packages/services)\\n2. Core components (controllers/services/models)\\n3. Architectural patterns from overview\\n4. File naming conventions\\n5. Code organization\\n\\n**Actions:** useTools() with list_dir, grep_search (class/interface/export), read 2-3 files\\n\\n**Output (400 words):**\\n- Structure summary\\n- User rules alignment\\n- Areas for next steps\\n\\nupdateDecisionLog() with 3-5 key files' : analysisStep === 2 ? '**STEP 2: TASK-RELEVANT MODULES**\\n\\nFocus on task-specific modules:\\n\\n1. Target areas from mapping\\n2. Core business logic\\n3. Data models (interfaces/types/schemas)\\n4. API contracts\\n5. Pattern implementation\\n\\n**Actions:** useTools() and matchPatterns() with codebase_search, read complete files (with imports), trace flows\\n\\n**Output (400 words):**\\n- Module responsibilities\\n- Patterns to match\\n- Integration points\\n\\nupdateDecisionLog() with core logic files' : '**STEP 3: DEPENDENCIES & FLOWS**\\n\\nTrace dependencies and execution:\\n\\n1. Import mapping\\n2. Data flow tracing\\n3. Integration points\\n4. Side effects\\n5. Testing patterns\\n\\n**Actions:** useTools() to follow imports, find test files, trace error handling\\n\\n**Output (400 words):**\\n- Dependency map\\n- Integration challenges\\n- Testing strategies\\n- Risk indicators\\n\\nupdateDecisionLog() with dependencies and test approaches'}}",
158
+ "agentRole": "You are conducting focused analysis step {{analysisStep}} of 3. Your expertise lies in {{analysisStep === 1 ? 'understanding code structure and organization' : analysisStep === 2 ? 'identifying and analyzing task-specific components' : 'tracing dependencies and system flows'}}. Use tools extensively and never make assumptions.",
159
+ "guidance": [
160
+ "This is step {{analysisStep}} of a 3-step analysis process",
161
+ "Each step builds on the previous findings",
162
+ "Use tools liberally - verify everything",
163
+ "Update the Decision Log with key discoveries",
164
+ "Respect word limits to prevent context bloat",
165
+ "Note alignment/conflicts with user rules"
166
+ ],
167
+ "requireConfirmation": false
168
+ }
134
169
  ],
135
170
  "requireConfirmation": false
136
171
  },
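The two former Phase 1 steps are replaced by a single `for` loop whose body selects its title, prompt, and role with `{{analysisStep === 1 ? ... }}` ternaries. The sketch below shows one plausible reading of that configuration. It assumes the iteration variable is 1-based and that `maxIterations` is only a safety cap on `count`; neither is confirmed by the diff. `ForLoop`, `iterationContexts`, and `subAnalysisTitle` are hypothetical names.

```typescript
// Hypothetical model of the loop block; field names are taken from the diff,
// but the semantics (1-based counter, maxIterations as a cap) are assumptions.
interface ForLoop {
  type: "for";
  count: number;
  maxIterations: number;
  iterationVar: string;
}

// Produce the context fragment the body step would see on each pass.
function iterationContexts(loop: ForLoop): Array<Record<string, number>> {
  const total = Math.min(loop.count, loop.maxIterations);
  const contexts: Array<Record<string, number>> = [];
  for (let i = 1; i <= total; i++) {
    contexts.push({ [loop.iterationVar]: i });
  }
  return contexts;
}

// Mirrors the ternary in the body step's "title" template.
function subAnalysisTitle(analysisStep: number): string {
  const label =
    analysisStep === 1 ? "Structure" : analysisStep === 2 ? "Modules" : "Dependencies";
  return `Analysis #${analysisStep}: ${label}`;
}

const loop: ForLoop = { type: "for", count: 3, maxIterations: 4, iterationVar: "analysisStep" };
const titles = iterationContexts(loop).map((ctx) => subAnalysisTitle(ctx["analysisStep"] ?? 0));
console.log(titles);
// [ 'Analysis #1: Structure', 'Analysis #2: Modules', 'Analysis #3: Dependencies' ]
```

Under these assumptions each pass is limited to 400 words of output, so the three passes together stay close to the 1500-word bound of the steps they replace.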
@@ -182,7 +217,7 @@
182
217
  "id": "phase-3-specification",
183
218
  "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
184
219
  "title": "Phase 3: Create Specification",
185
- "prompt": "Using your codebase analysis from Phase 1, the clarified requirements from Phase 2, and any complexity re-assessment from Phase 2b, create a specification document that aligns with the existing system and addresses all identified ambiguities. Your specification should be precise, unambiguous, and fully implementable.\n\n**Task Description:**\n[Updated based on clarifications from Phase 2]\n\n**Key Objectives & Success Criteria:**\n[Refined based on codebase understanding and clarifications]\n\n**Scope and Constraints:**\n[Updated to reflect codebase realities and clarified boundaries]\n\n**IMPORTANT**: Your specification must consider:\n- Existing architectural patterns and conventions identified in your codebase analysis\n- How the proposed changes fit within the current system design\n- Potential impacts on existing components and workflows\n- Alignment with current testing strategies and code quality standards\n- All clarifications and decisions made in Phase 2\n- Any complexity insights from the re-triage assessment\n\nOutput markdown content in response; use edit_file tool to create actual spec.md file if available, else request user upload.\n\nFinally, perform a sanity check on the current complexity classification. If you believe further adjustment is needed given your comprehensive understanding, state your reasoning and ask for confirmation before proceeding.",
220
+ "prompt": "Create a precise specification from Phase 1 analysis, Phase 2 clarifications, and any re-triage insights.\n\n**Spec Sections:**\n- Task Description [from clarifications]\n- Key Objectives & Success Criteria\n- Scope and Constraints\n\n**Must Include:**\n- Existing patterns/conventions from analysis\n- System integration approach\n- Impact on components/workflows\n- Testing/quality alignment\n- Phase 2 decisions\n- Complexity insights\n- applyUserRules()\n- matchPatterns()\n\n**Actions:**\n- createFile(spec.md)\n- updateDecisionLog()\n- Sanity check complexity level",
186
221
  "agentRole": "You are a senior business analyst and technical lead specializing in requirement gathering and scope definition. Your goal is to produce a clear, comprehensive `spec.md` file that leverages your deep understanding of the existing codebase and incorporates all clarified requirements to serve as an unambiguous foundation for design and implementation.",
187
222
  "guidance": [
188
223
  "Provide a complete task description based on all previous analysis and clarifications",
@@ -198,7 +233,7 @@
198
233
  "id": "phase-3b-create-context-doc",
199
234
  "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
200
235
  "title": "Phase 3b: Create Context Documentation",
201
- "prompt": "Create a comprehensive context documentation file (`CONTEXT.md`) that captures all critical information from the workflow so far. This document is essential for enabling seamless handoffs between chat sessions when context limits are reached.\n\n**For automationLevel=High, generate a summary-only version (limit 1000 words); otherwise, full update (limit 2000 words).**\n\n**Your `CONTEXT.md` must include:**\n\n## 1. ORIGINAL TASK CONTEXT\n- Original task description and requirements\n- Complexity classification (Small/Medium/Large) and reasoning\n- Any re-triage decisions made and why\n- Automation level selected and its implications\n\n## 2. USER RULES AND PREFERENCES\n- Complete list of identified user rules from Phase 0b\n- How each rule impacts this specific task\n- Any rules discovered during analysis\n\n## 3. CODEBASE ANALYSIS SUMMARY\n- Key architectural patterns and conventions found\n- Relevant components, modules, and their locations\n- Testing strategies and patterns in use\n- Critical dependencies and integration points\n- Any complexity indicators discovered\n\n## 4. CLARIFICATIONS AND DECISIONS\n- Questions asked and answers received\n- Ambiguities resolved and how\n- Scope boundaries clearly defined\n- Technical approach decisions made\n\n## 5. SPECIFICATION SUMMARY\n- Core objectives and success criteria\n- Key constraints and requirements\n- Design principles to follow\n- Integration requirements\n\n## 6. WORKFLOW PROGRESS TRACKING\n- ✅ Completed phases (0, 0b, 1, 2, 2b, 3, 3b)\n- 🔄 Current phase: Architecture Design (Phase 4)\n- ⏳ Remaining phases: 4, 5, 5b, 5c, 5d, 5e, 6, 7\n- 📋 Context variables set (taskComplexity, automationLevel, userRules, etc.)\n\n## 7. HANDOFF INSTRUCTIONS\n- Files to attach when resuming (e.g., spec.md via tool or user upload)\n- Key context to provide to new chat session\n- Critical decisions that must not be forgotten\n\n**Format this as a clear, scannable document using bullet points that a new agent could quickly read to understand the full project context.**",
236
+ "prompt": "Create CONTEXT.md capturing workflow progress. High automation: 1000 words; else 2000.\n\n**Include Sections:**\n\n1. ORIGINAL TASK CONTEXT\n- Task description, complexity level, re-triage decisions, automation level\n\n2. USER RULES AND PREFERENCES\n- Rules from Phase 0b, task impact, new discoveries\n\n3. CODEBASE ANALYSIS SUMMARY\n- Patterns, components, testing, dependencies, complexity indicators\n\n4. DECISION LOG\n- Phase 0c: Files/patterns/impact\n- Phase 1: Key files per sub-step\n- Phase 2: Clarification decisions\n- Phase 3: Spec influences\n\n5. CLARIFICATIONS AND DECISIONS\n- Q&A, resolved ambiguities, scope, technical approach\n\n6. SPECIFICATION SUMMARY\n- Objectives, constraints, design principles, integration\n\n7. WORKFLOW PROGRESS\n- trackProgress(0-3b, Phase 4)\n- ⏳ Remaining: 4-7\n- 📋 Context vars set\n\n8. RESUMPTION INSTRUCTIONS\n**How to Resume:**\n1. Call workflow_get with id: \"coding-task-workflow-with-loops\", mode: \"preview\"\n2. Call workflow_next with the JSON from addResumptionJson(phase-3b)\n3. Include function definitions from metaGuidance for reference\n\n9. HANDOFF INSTRUCTIONS\n- Files to attach: spec.md, CONTEXT.md\n- Key files from Decision Log\n- Critical decisions\n\ncreateFile(CONTEXT.md)",
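The new prompt asks for resumption instructions built from `addResumptionJson(phase)`: a `workflow_get` call in `preview` mode followed by a `workflow_next` call carrying `workflowId`, the completed steps, and all context variables. The sketch below shows what such a payload might look like. The `context` key, the `ResumptionPayload` type, and `buildResumption` are assumptions for illustration, and the step list contains only ids visible in this diff, so it is incomplete.

```typescript
// Hypothetical payload shape. The diff names workflowId, completedSteps and
// "all context variables"; the `context` key used here is an assumption.
interface ResumptionPayload {
  workflowId: string;
  completedSteps: string[];
  context: Record<string, unknown>;
}

const WORKFLOW_ID = "coding-task-workflow-with-loops";

// Collect every step id up to and including the last completed one.
function buildResumption(
  stepOrder: readonly string[],
  lastCompleted: string,
  context: Record<string, unknown>
): ResumptionPayload {
  const index = stepOrder.indexOf(lastCompleted);
  if (index < 0) throw new Error(`Unknown step id: ${lastCompleted}`);
  return {
    workflowId: WORKFLOW_ID,
    completedSteps: stepOrder.slice(0, index + 1),
    context,
  };
}

// Partial list: only step ids that appear in this diff, in diff order.
const visibleSteps = [
  "phase-0c-overview-gathering",
  "phase-1-multi-analysis",
  "phase-3-specification",
  "phase-3b-create-context-doc",
  "phase-4-architectural-design",
];

const payload = buildResumption(visibleSteps, "phase-3b-create-context-doc", {
  taskComplexity: "Large",
  automationLevel: "High",
});
console.log(JSON.stringify(payload, null, 2));
```

Storing this JSON in `CONTEXT.md`, alongside the function definitions from `metaGuidance`, is what lets a fresh session pick up at Phase 4 without the earlier conversation.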
202
237
  "agentRole": "You are a meticulous technical documentation specialist with expertise in creating comprehensive project handoff documents. Your role is to capture all critical context in a way that enables seamless continuity across different team members or chat sessions. You excel at synthesizing complex technical information into clear, actionable documentation.",
203
238
  "guidance": [
204
239
  "This step is automatically skipped for Small tasks",
@@ -215,7 +250,7 @@
215
250
  "id": "phase-4-architectural-design",
216
251
  "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
217
252
  "title": "Phase 4: Architectural Design",
218
- "prompt": "Using the `spec.md` from the previous step, your deep codebase analysis, and all clarified requirements, create a high-level architectural design that seamlessly integrates with the existing system. Your output should be a `design.md` document that includes:\n1. **High-Level Approach:** A summary of the proposed solution that builds on existing patterns.\n2. **Component Breakdown:** Identify new or modified components, classes, or modules, showing how they fit within the current architecture.\n3. **Data Models:** Describe any changes to data structures or database schemas, considering existing data patterns.\n4. **API Contracts:** Define any new or changed API endpoints, following existing API conventions and patterns.\n5. **Key Interactions:** A diagram or description of how the major components will interact, both new and existing.\n6. **Integration Points:** Clearly identify how new components will integrate with existing systems and workflows.\n7. **Clarification Decisions:** Reference how the clarified requirements from Phase 2 influenced design decisions.\n8. **Complexity Considerations:** Address any complexity factors identified during re-triage.\n\nOutput markdown content in response; use edit_file tool to create actual design.md file if available, else request user upload.",
253
+ "prompt": "Create architectural design from spec.md, analysis, and requirements.\n\n**Design Sections:**\n1. High-Level Approach (builds on patterns)\n2. Component Breakdown (new/modified)\n3. Data Models (schemas/structures)\n4. API Contracts (follow conventions)\n5. Key Interactions (components diagram)\n6. Integration Points\n7. Phase 2 Decisions Impact\n8. Complexity Factors\n9. Pattern Alignment (cite files)\n10. applyUserRules()\n\n**Actions:**\n- matchPatterns()\n- useTools()\n- createFile(design.md)\n- updateDecisionLog()",
219
254
  "agentRole": "You are a software architect specializing in translating business requirements into robust and scalable technical designs that seamlessly integrate with existing systems. Your task is to create a clear and comprehensive `design.md` that leverages existing architectural patterns while introducing necessary changes and incorporating all clarified requirements.",
220
255
  "guidance": [
221
256
  "The `design.md` should be detailed enough for an engineer to write an implementation plan from it.",
@@ -230,7 +265,7 @@
230
265
  "id": "phase-5-planning",
231
266
  "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
232
267
  "title": "Phase 5: Create Detailed Implementation Plan",
233
- "prompt": "Your goal is to produce a thorough and actionable `implementation_plan.md` based on the `spec.md`, `design.md`, your deep codebase analysis, and all clarified requirements. Do not write any code. Your plan must be detailed, broken into committable phases, and justified.\n\n**CRITICAL: Review and incorporate any user-defined rules, preferences, or patterns. These take precedence over generic best practices.**\n\nYour plan must include these sections:\n1. **Goal Clarification:** Your understanding of the goal, assumptions, and success criteria from the spec and clarifications.\n2. **User Rules Compliance:** Explicitly list how the plan adheres to user-defined rules, patterns, and preferences.\n3. **Impact Assessment:** Affected codebase parts, dependencies, and risks based on the design, codebase analysis, and clarified requirements.\n4. **Implementation Strategy:** A list of discrete, actionable steps. Each step must detail the task, its rationale, inputs, and outputs.\n5. **Testing Strategy:** Describe how the changes will be tested, building on existing testing patterns identified in your codebase analysis. Mandate achieving >80% test coverage where applicable.\n6. **Failure Handling:** Define what to do if tests fail, tools don't work, or unexpected issues arise.\n7. **Final Review Checklist:** A specific checklist of items that must be verified to consider this entire task complete. This will be used in the final review phase.\n\nOutput markdown content in response; use edit_file tool to create actual implementation_plan.md file if available, else request user upload.\n\nPresent this as a formal proposal that demonstrates deep understanding of the requirements, clarifications, and the existing codebase.",
268
+ "prompt": "Create detailed implementation_plan.md from spec.md, design.md, analysis.\n\n**Plan Sections:**\n1. Goal Clarification - understanding from spec/clarifications\n2. applyUserRules() - how plan follows user patterns\n3. Pattern Matching Strategy - existing code templates per step\n4. Impact Assessment - affected parts, deps, risks\n5. Implementation Strategy - discrete steps with rationale/I/O\n6. Testing Strategy - follow existing patterns (cite files)\n7. Failure Handling - test fails, tool issues\n8. Final Review Checklist - completion criteria\n\n**Actions:**\n- matchPatterns() for each step\n- useTools() to find examples\n- createFile(implementation_plan.md)\n- updateDecisionLog()",
234
269
  "agentRole": "You are an experienced technical architect and project planner with expertise in breaking down complex development tasks into manageable, logical phases. Your strength is creating detailed, actionable plans that minimize risk while maximizing development efficiency and code quality, all while working within existing system constraints and incorporating all clarified requirements.",
235
270
  "guidance": [
236
271
  "The agent will now proceed to critique its own plan in the next step. Withhold your final approval until after that critique.",
@@ -270,7 +305,7 @@
270
305
  "id": "phase-5c-finalize-plan",
271
306
  "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
272
307
  "title": "Phase 5c: Finalize Implementation Plan",
273
- "prompt": "Review the initial `implementation_plan.md` from Phase 5 and the Devil's Advocate critique from Phase 5b. Your task is to create a final, consolidated implementation plan that incorporates the valid feedback from the review.\n\nYour output must be the final `implementation_plan.md`.\n\nAdditionally, explicitly list any suggestions from the review that you believe are valuable but out-of-scope for the current task. These should be formatted as potential tickets for future work.\n\nOutput updated markdown content in response; use edit_file tool to update actual implementation_plan.md file if available, else request user upload.",
308
+ "prompt": "Review initial `implementation_plan.md` from Phase 5 and Devil's Advocate critique from Phase 5b. Create final, consolidated plan incorporating valid feedback.\n\n**Output:**\n- Final implementation_plan.md\n- List valuable but out-of-scope suggestions as future tickets\n\ncreateFile(implementation_plan.md) with final version.",
274
309
  "agentRole": "You are a pragmatic technical project manager. Your goal is to synthesize feedback, make decisive trade-offs, and produce a final, actionable plan that is ready for execution.",
275
310
  "guidance": [
276
311
  "This is the final plan that will be executed. Ensure it is clear, actionable, and reflects the best path forward.",
@@ -301,7 +336,7 @@
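Review note on the resumption instructions added in this hunk: the prompt asks for a `workflow_next` JSON carrying `workflowId`, `completedSteps` up to the current phase, and all context variables. A minimal sketch of assembling that object follows. The `context` field name, the helper name `buildResumptionPayload`, and the choice to include the named phase itself in `completedSteps` are assumptions; the diff does not show the actual `workflow_next` schema.

```typescript
// Hypothetical shape; only workflowId and completedSteps are named in the diff.
interface ResumptionPayload {
  workflowId: string;
  completedSteps: string[];
  context: Record<string, unknown>;
}

function buildResumptionPayload(
  workflowId: string,
  orderedStepIds: readonly string[],
  lastCompletedStepId: string,
  context: Record<string, unknown>,
): ResumptionPayload {
  const index = orderedStepIds.indexOf(lastCompletedStepId);
  if (index === -1) {
    throw new Error(`Unknown step id: ${lastCompletedStepId}`);
  }
  return {
    workflowId,
    // Assumption: "up to {phase}" is inclusive of the named phase.
    completedSteps: orderedStepIds.slice(0, index + 1),
    context: { ...context },
  };
}

// Example using step ids that appear in this diff.
const payload = buildResumptionPayload(
  "coding-task-workflow-with-loops",
  ["phase-5c-finalize-plan", "phase-5e-update-context-doc", "phase-6-implement"],
  "phase-5e-update-context-doc",
  { taskComplexity: "Medium" },
);
console.log(JSON.stringify(payload, null, 2));
```

The serialized object is what would be pasted into the resumption section of CONTEXT.md.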
301
336
  "id": "phase-5e-update-context-doc",
302
337
  "runCondition": {"var": "taskComplexity", "not_equals": "Small"},
303
338
  "title": "Phase 5e: Update Context Documentation with Final Plans",
304
- "prompt": "Update the `CONTEXT.md` file with all the planning work completed in Phases 4-5. This ensures the context document remains current and comprehensive for potential handoffs.\n\n**For automationLevel=High, generate a summary-only version (limit 500 words); otherwise, full update.**\n\n**Add/Update these sections in `CONTEXT.md`:**\n\n## 4. ARCHITECTURAL DESIGN SUMMARY\n- High-level approach and rationale\n- Key components being added/modified\n- Integration points with existing systems\n- Design decisions and alternatives considered\n\n## 5. IMPLEMENTATION PLAN OVERVIEW\n- Goal clarification and success criteria\n- Implementation strategy overview\n- Key risks identified and mitigation strategies\n- Testing approach and patterns to follow\n- Failure handling protocols\n\n## 6. DEVILS ADVOCATE REVIEW INSIGHTS\n- Key concerns raised and how they were addressed\n- Plan improvements made based on the review\n- Confidence score and reasoning\n- Out-of-scope items identified for future work\n\n## 7. UPDATED WORKFLOW PROGRESS\n- ✅ Completed phases (0, 1, 2, 2b, 3, 3b, 4, 5, 5b, 5c, 5d, 5e)\n- 🔄 Current phase: Implementation (Phase 6)\n- ⏳ Remaining phases: 6, 7\n- 📁 Key files created: spec.md, design.md, implementation_plan.md, CONTEXT.md\n\n## 8. IMPLEMENTATION READINESS\n- Plan sanity check results\n- Files and dependencies verified\n- Ready-to-execute implementation steps\n- Potential handoff points during implementation\n\n**Use bullet points for scannability and maintain all previous sections while adding the new planning context.**",
339
+ "prompt": "Update CONTEXT.md with Phases 4-5 work. checkAutomation(summary): 500 words for High, else full.\n\n**Add/Update Sections:**\n\n4. DECISION LOG (EXPANDED)\n- Phase 4 Design: Files/patterns shaping architecture\n- Phase 5 Planning: Code examples for implementation\n- Pattern Matches: Template files\n\n5. ARCHITECTURAL DESIGN SUMMARY\n- Approach & rationale\n- Components added/modified\n- Integration points\n- Design decisions\n- Pattern alignment\n\n6. IMPLEMENTATION PLAN OVERVIEW\n- Goals & success criteria\n- Strategy overview\n- Risks & mitigation\n- Testing approach\n- Failure handling\n\n7. DEVILS ADVOCATE INSIGHTS\n- Concerns addressed\n- Plan improvements\n- Confidence score\n- Out-of-scope items\n\n8. WORKFLOW PROGRESS\n- trackProgress(0-5e, Implementation)\n- ⏳ Remaining: 6, 7\n- 📁 Files: spec/design/plan/CONTEXT.md\n\n9. RESUMPTION INSTRUCTIONS\n**How to Resume:**\n1. Call workflow_get with id: \"coding-task-workflow-with-loops\", mode: \"preview\"\n2. Call workflow_next with the JSON from addResumptionJson(phase-5e)\n3. Include function definitions from metaGuidance for reference\n\n10. IMPLEMENTATION READINESS\n- Sanity check results\n- Verified files/deps\n- Key files to re-read\n\ncreateFile(CONTEXT.md) with bullets for scannability.",
305
340
  "agentRole": "You are a meticulous technical documentation specialist focused on maintaining comprehensive project context. Your expertise lies in synthesizing complex planning work into clear, actionable documentation that enables seamless workflow continuation.",
306
341
  "guidance": [
307
342
  "This step is automatically skipped for Small tasks",
@@ -358,7 +393,7 @@
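Review note on the progress rules in this hunk: the prompt updates CONTEXT.md when `{{stepIteration}} % 3 === 0` and pauses once the total exceeds 20 steps. The sketch below encodes both rules literally. The function and field names are hypothetical, and the diff does not show whether `stepIteration` starts at 0 or 1; if it starts at 0, the modulo rule as written also fires on the very first step.

```typescript
// Hypothetical helper mirroring the two numeric rules in the phase-6-implement prompt.
interface StepBookkeeping {
  updateContext: boolean; // every 3rd step: stepIteration % 3 === 0
  pauseForUser: boolean; // more than 20 steps executed
}

function bookkeepingFor(stepIteration: number): StepBookkeeping {
  return {
    updateContext: stepIteration % 3 === 0,
    pauseForUser: stepIteration > 20,
  };
}
```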
358
393
  {
359
394
  "id": "phase-6-implement",
360
395
  "title": "IMPLEMENT: Execute {{currentStep.title}}",
361
- "prompt": "**IMPLEMENTATION PHASE for {{currentStep.title}}**\n\nNow implement this specific step:\n\n**Step Details:**\n- Title: {{currentStep.title}}\n- Description: {{currentStep.description}}\n- Expected Outputs: {{currentStep.outputs}}\n\n**USER RULES REMINDER:** Apply all user-defined patterns, conventions, and preferences including:\n- Architectural patterns specified by the user\n- Naming conventions and code style preferences\n- Library/framework preferences\n- Any other project-specific guidelines\n\n**Instructions:**\n1. Focus only on this single step\n2. Use your tools to make the necessary code changes\n3. Adhere to all quality standards and conventions\n4. Follow the plan precisely while adapting to unexpected discoveries\n5. Ensure all code follows user-defined rules and patterns\n\n**Progress Tracking:**\n- This is step {{stepIndex + 1}} of {{implementationSteps.length}}\n- Total steps executed so far: {{stepIteration}}\n- If total steps > 20 without completion, pause for user intervention\n\n**CONTEXT UPDATES:** If this is every 3rd step ({{stepIteration}} % 3 === 0), update CONTEXT.md with:\n- Progress summary\n- Current status\n- Files modified\n- Remaining work",
396
+ "prompt": "**IMPLEMENTATION PHASE for {{currentStep.title}}**\n\nNow implement this specific step:\n\n**Step Details:**\n- Title: {{currentStep.title}}\n- Description: {{currentStep.description}}\n- Expected Outputs: {{currentStep.outputs}}\n\n**Remember:** applyUserRules() and matchPatterns() throughout.\n\n**Instructions:**\n1. Focus only on this single step\n2. useTools() to make code changes\n3. Follow quality standards\n4. Adapt to unexpected discoveries\n5. createFile() for ALL code changes\n\n**Progress Tracking:**\n- This is step {{stepIndex + 1}} of {{implementationSteps.length}}\n- Total steps executed so far: {{stepIteration}}\n- If total steps > 20, pause for user intervention\n\n**CONTEXT UPDATES:** If this is every 3rd step ({{stepIteration}} % 3 === 0):\n- Update CONTEXT.md\n- addResumptionJson(phase-6-implement)\n- updateDecisionLog()\n- List files modified with line ranges",
362
397
  "agentRole": "You are implementing a specific step from the approved plan. Focus on precise execution while maintaining code quality.",
363
398
  "guidance": [
364
399
  "Implement only what this step requires",
@@ -370,7 +405,7 @@
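Review note on the failure protocol in this hunk: verification is bounded to two attempts, after which the `verificationFailed` context variable is set and no third attempt is made. A minimal sketch of that bound follows, assuming a caller-supplied `check` callback that stands in for whatever `verifyImplementation()` does; the helper name and result shape are illustrative only.

```typescript
// Hypothetical bounded-retry wrapper for the "2 attempts, never a third" rule.
interface VerifyOutcome {
  passed: boolean;
  attempts: number;
  verificationFailed: boolean;
}

function verifyWithBound(
  check: (attempt: number) => boolean,
  maxAttempts = 2,
): VerifyOutcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (check(attempt)) {
      return { passed: true, attempts: attempt, verificationFailed: false };
    }
  }
  // Bound reached: record the failure instead of trying again.
  return { passed: false, attempts: maxAttempts, verificationFailed: true };
}
```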
370
405
  {
371
406
  "id": "phase-6-verify",
372
407
  "title": "VERIFY: Validate {{currentStep.title}}",
373
- "prompt": "**VERIFICATION PHASE for {{currentStep.title}}**\n\nVerify the implementation of this step is complete and correct:\n\n**Required Verification Steps:**\n1. **Test Coverage**: Write necessary unit/integration tests for the new logic\n - Target: >80% coverage for new code\n - Report metrics after test creation\n\n2. **Run Tests**: Execute the full test suite\n - Use --quiet flags and output to files\n - Ensure 0 regressions introduced\n - If failures occur, follow failure protocol\n\n3. **Self-Review**: Critical review of changes\n - Check against the plan\n - Verify code quality\n - Assess side effects\n - Confirm architectural alignment\n\n**COMMIT Decision (if all verification passes):**\n- For automationLevel=High: Auto-commit with message\n- For automationLevel=Medium/Low: Suggest commit and confirm\n- Use conventional format: type(scope): description\n- If git unavailable: Log in CONTEXT.md\n\n**FAILURE PROTOCOL:** If verification fails after 2 attempts:\n1. Do not try a third time\n2. Fall back to alternative tools if needed\n3. Update CONTEXT.md with failure details\n4. Present summary and recommendations\n5. Set 'verificationFailed' context variable to true",
408
+ "prompt": "**VERIFICATION PHASE for {{currentStep.title}}**\n\nVerify the implementation is complete and correct:\n\n**Required:** verifyImplementation()\n\n**COMMIT Decision (if all passes):**\n- checkAutomation(commit)\n- gitCommit(type, scope: description)\n- If git unavailable: Log in CONTEXT.md\n\n**FAILURE PROTOCOL:** If verification fails after 2 attempts:\n1. Do not try a third time\n2. Fall back to alternative tools\n3. updateDecisionLog() with failure details\n4. Present summary and recommendations\n5. Set 'verificationFailed' context variable to true",
374
409
  "agentRole": "You are verifying the implementation meets all quality standards. Be thorough but respect failure bounds.",
375
410
  "guidance": [
376
411
  "All three verification steps must pass",
@@ -419,7 +454,7 @@
419
454
  {
420
455
  "id": "phase-7-final-review",
421
456
  "title": "Phase 7: Final Review & Completion",
422
- "prompt": "Perform final review by validating work against the **'Final Review Checklist'** from `implementation_plan.md`.\n\n**USER RULES CHECK:** Verify all code follows user-defined rules, patterns, and preferences.\n\nFor each checklist item, provide confirmation and evidence. Validate metrics (coverage %, test results).\n\n**Additional validation:**\n- User rules/preferences compliance\n- Architectural patterns match specifications\n- Naming conventions and code style\n- Library/framework usage alignment\n\n**Final commit:**\n- Format: 'feat|fix|refactor(scope): final implementation - <summary>'\n- Include metrics in body\n- Auto-commit for High automation; confirm for Medium/Low\n\n**Branch cleanup (if git):**\n1. Merge: 'git checkout main && git merge --squash [featureBranch]'\n2. Delete: 'git branch -d [featureBranch]'\n3. Auto-execute if High automation & confidence >8\n4. Pause for conflicts\n5. Log in CONTEXT.md if no git\n\n**Update `CONTEXT.md` final sections:**\n\n## 9. FINAL STATUS\n- Steps completed\n- ✅ Tests passing (metrics)\n- ✅ Checklist validated\n- ✅ User rules verified\n- 📁 Files modified\n- 📋 Known issues\n- 🔄 Next steps\n- 📜 Git history\n\n## 10. HANDOFF\n- Accomplishments\n- Architecture decisions\n- Implementation details\n- Follow-up work\n\nConclude with follow-ups or dependencies.",
457
+ "prompt": "Perform final review against **'Final Review Checklist'** from `implementation_plan.md`.\n\n**Validate:** applyUserRules() compliance throughout. For each checklist item, provide confirmation and evidence.\n\n**Additional validation:**\n- User rules/preferences compliance\n- Architectural patterns match specs\n- Naming conventions and code style\n- Library/framework usage\n- matchPatterns() verification\n\n**Final commit:**\n- gitCommit(feat|fix|refactor, final implementation - summary)\n- Include metrics in body\n- checkAutomation(commit)\n\n**Branch cleanup (if git):**\n1. Merge: git checkout main && git merge --squash [featureBranch]\n2. Delete: git branch -d [featureBranch]\n3. checkAutomation(merge) with confidence >8\n4. Pause for conflicts\n5. Log in CONTEXT.md if no git\n\n**Update CONTEXT.md final sections:**\n\n## 9. COMPLETE DECISION LOG\nFull updateDecisionLog() with all files, patterns, user rules, design decisions\n\n## 10. FINAL STATUS\n- trackProgress(ALL, Complete)\n- ✅ Tests passing (metrics)\n- ✅ Checklist validated\n- ✅ applyUserRules() verified\n- 📁 Files modified (line ranges)\n- 📋 Known issues\n- 📜 Git history\n\n## 11. FINAL RESUMPTION\n**How to Resume (if needed for extensions):**\n1. Call workflow_get with id: \"coding-task-workflow-with-loops\", mode: \"preview\"\n2. Call workflow_next with the JSON from addResumptionJson(phase-7-final-review)\n**Note:** Task complete. This is for future extensions or follow-up work.\n\n## 12. HANDOFF\n- Accomplishments with file refs\n- Architecture decisions\n- Patterns established/followed\n- Follow-up recommendations\n\ncreateFile(CONTEXT.md) with final content.",
423
458
  "agentRole": "You are a quality assurance specialist and technical lead responsible for final project validation and comprehensive handoff documentation. Your expertise lies in comprehensive system testing, requirement verification, and ensuring deliverables meet all specified criteria while creating documentation that enables seamless future maintenance.",
424
459
  "guidance": [
425
460
  "This is the final quality check. Ensure the agent's summary and checklist validation align with your understanding of the completed work.",
@@ -19,6 +19,23 @@
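Review note on the `fun name(params) = '...'` entries added in this hunk: they are prompt conventions the agent is told to expand when it sees a call such as `instrumentCode()`, not executable code. To make the intended substitution concrete, the sketch below parses one such line and fills in the `{param}` placeholders. The parser, its regular expression, and the names `parseMacro` and `expandMacro` are assumptions for illustration; nothing in the diff indicates that the package performs this expansion itself.

```typescript
// Hypothetical parser/expander for lines of the form:
//   fun name(a, b=default) = 'template with {a} and {b}'
interface MacroParam {
  name: string;
  defaultValue?: string;
}

interface MacroDef {
  name: string;
  params: MacroParam[];
  template: string;
}

function parseMacro(line: string): MacroDef | null {
  const m = /^fun\s+(\w+)\(([^)]*)\)\s*=\s*'([\s\S]*)'$/.exec(line.trim());
  if (m === null) return null;
  const name = m[1] ?? "";
  const rawParams = m[2] ?? "";
  const template = m[3] ?? "";
  const params: MacroParam[] = rawParams
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((p): MacroParam => {
      const eq = p.indexOf("=");
      return eq === -1
        ? { name: p }
        : { name: p.slice(0, eq).trim(), defaultValue: p.slice(eq + 1).trim() };
    });
  return { name, params, template };
}

function expandMacro(def: MacroDef, args: readonly string[]): string {
  let out = def.template;
  def.params.forEach((param, i) => {
    const value = args[i] ?? param.defaultValue;
    if (value === undefined) return; // leave the placeholder untouched
    out = out.split(`{${param.name}}`).join(value);
  });
  return out;
}
```

Placeholders that are not parameters, such as `{count}` in `aggregateDebugLogs`, are left as written in this sketch.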
19
19
  "Bug is reproducible with specific steps or a minimal test case"
20
20
  ],
21
21
  "metaGuidance": [
22
+ "**FUNCTION DEFINITIONS:**",
23
+ "fun instrumentCode(location, hypothesis) = 'Add debug logs at {location} for {hypothesis}. Format: ClassName.method [{hypothesis}]: message. Include timestamp, thread ID if concurrent.'",
24
+ "fun collectEvidence(hypothesis) = 'Run instrumented code, collect logs, analyze results. Score evidence quality 1-10. Document in Evidence/{hypothesis}.md.'",
25
+ "fun updateHypothesisLog(id, status, evidence) = 'Update INVESTIGATION_CONTEXT.md section {id} with {status} and {evidence}. Include confidence score.'",
26
+ "fun analyzeTests(component) = 'Find all tests for {component} using grep_search. Check coverage, recent changes, what they validate vs miss. Run with --debug flag.'",
27
+ "fun recursiveAnalysis(component, depth=3) = 'Analyze {component} to {depth} levels. L1: implementation, L2: direct deps, L3: transitive deps. Document each level.'",
28
+ "fun controlledModification(type, location) = 'Make {type} change at {location}. Types: guard (add logging), assert (add assertion), fix (minimal fix), break (controlled failure). Commit: DEBUG: {type} at {location}'",
29
+ "fun checkHypothesisInTests(hypothesis) = 'Search existing tests for evidence. Direct: tests of suspected components. Indirect: tests that would fail if true. Document in TestEvidence/{hypothesis}.md'",
30
+ "fun aggregateDebugLogs(pattern, timeWindow=100) = 'Deduplicate logs matching {pattern}. Output: {pattern} x{count} in {timeWindow}ms, variations: {unique_values}'",
31
+ "fun createInvestigationBranch() = 'git checkout -b investigate/{bug-id}-{timestamp}. If git unavailable, create Investigation/{timestamp}/ directory for artifacts.'",
32
+ "fun trackInvestigation(phase, status) = 'Update INVESTIGATION_CONTEXT.md progress: ✅ {completed}, 🔄 {phase}, ⏳ Remaining: {list}, 📊 Confidence: {score}/10'",
33
+ "fun updateInvestigationContext(section, content) = 'Update INVESTIGATION_CONTEXT.md {section} with {content}. Include timestamp. If section doesn\\'t exist, create it. Preserve all other sections.'",
34
+ "fun findSimilarBugs() = 'Search for: 1) Similar error patterns in codebase, 2) Previous fixes in git history, 3) Related test cases. Document in SimilarPatterns.md'",
35
+ "fun visualProgress() = 'Show: ✅ Phase 0 | ✅ Phase 1 | 🔄 Phase 2 | ⏳ Phase 3-5 | ⏳ Phase 6 | 📊 35% Complete. Include time spent per phase.'",
36
+ "fun applyDebugPreferences() = 'Apply user debugging preferences from userDebugPreferences context variable. Adapt logging verbosity, tool selection, output format.'",
37
+ "fun addResumptionJson(phase) = 'Update INVESTIGATION_CONTEXT.md resumption section with: workflowId, completedSteps up to {phase}, all context variables. Include workflow_get and workflow_next instructions.'",
38
+ "**USAGE:** When you see function calls like instrumentCode() or analyzeTests(), execute the full instructions defined above.",
22
39
  "INVESTIGATION DISCIPLINE: Never propose fixes or solutions until Phase 6 (Comprehensive Diagnostic Writeup). Focus entirely on systematic evidence gathering and analysis.",
23
40
  "HYPOTHESIS RIGOR: All hypotheses must be based on concrete evidence from code analysis with quantified scoring (1-10 scales). Maximum 5 hypotheses per investigation.",
24
41
  "DEBUGGING INSTRUMENTATION: Always implement debugging mechanisms before running tests - logs, print statements, or test modifications that will provide evidence.",
@@ -28,12 +45,17 @@
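Review note on the LOG DEDUPLICATION entry added in this hunk: the one-line snippet compares `lastMsg` with `currentMsg` but never shows `lastMsg` being updated, and it logs `msg` where only `currentMsg` is defined. The sketch below is a runnable reading of the same idea, assuming `lastMsg` is meant to be updated whenever a new message arrives. The factory name `createDedupLogger`, the `sink` callback, and the `every` parameter are hypothetical.

```typescript
// Hypothetical runnable version of the LOG DEDUPLICATION one-liner.
function createDedupLogger(sink: (line: string) => void, every = 10) {
  let lastMsg: string | null = null;
  let count = 0;

  return function log(currentMsg: string): void {
    if (currentMsg === lastMsg) {
      count++;
      // Emit a running total on every Nth repeat rather than each repeat.
      if (count % every === 0) sink(`${currentMsg} x${count}`);
    } else {
      // Close out the previous run before starting a new one.
      if (count > 1) sink(`Previous: x${count}`);
      sink(currentMsg);
      lastMsg = currentMsg; // assumed: not shown in the original snippet
      count = 1;
    }
  };
}

// Usage: twelve identical messages followed by a different one.
const lines: string[] = [];
const log = createDedupLogger((line) => lines.push(line));
for (let i = 0; i < 12; i++) log("Cache.get [H1]: miss");
log("Cache.get [H1]: hit");
console.log(lines);
```

A run that ends without a different message following it never gets its final `Previous: x…` line in this form, so a caller may want an explicit flush at shutdown.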
28
45
  "UNCERTAINTY ACKNOWLEDGMENT: Explicitly document all remaining unknowns and their potential impact. No subjective confidence assessments.",
29
46
  "THOROUGHNESS: For complex bugs, recursively analyze dependencies and internals of identified components to ensure full picture.",
30
47
  "TEST INTEGRATION: Leverage existing tests to validate hypotheses where possible.",
31
- "LOG ENHANCEMENTS: Include class/function names. For repetitive logs, implement deduplication by tracking counts ('x[count]') and grouping related sequential logs for readability. See Phase 3 for detailed implementation patterns and examples.",
48
+ "**LOGGING STANDARDS:**",
49
+ "LOG FORMAT: Always use 'ClassName.methodName [hypothesisId] {timestamp}: message'. For concurrent code, add thread/worker ID.",
50
+ "LOG DEDUPLICATION: Implement in debug code: if (lastMsg === currentMsg) { count++; if (count % 10 === 0) log(`${msg} x${count}`); } else { if (count > 1) log(`Previous: x${count}`); log(currentMsg); count = 1; }",
51
+ "LOG AGGREGATION: For high-frequency events, create summaries: 'Event X occurred 847 times between 10:23:45-10:23:47, unique values: [val1: 623, val2: 224]'",
52
+ "LOG WINDOWS: Group related logs within 50-100ms. Mark groups with '=== Operation: XYZ Start ===' and '=== Operation: XYZ End (duration: 73ms) ==='",
53
+ "LOG CONTEXT: Include hypothesis ID in all debug logs. Use prefixes like 'H1_DEBUG:', 'H2_TRACE:', 'H3_ERROR:'",
32
54
  "LOG ANALYSIS OFFLOADING: For voluminous logs (>500 lines), offload analysis to sub-chats with structured prompts. See Phase 4 for detailed sub-analysis implementation.",
33
55
  "RECURSION DEPTH: Limit recursive analysis to 3 levels deep to prevent analysis paralysis while ensuring thoroughness.",
34
56
  "INVESTIGATION BOUNDS: If investigation exceeds 20 steps or 4 hours without root cause, pause and reassess approach with user.",
35
57
  "AUTOMATION LEVELS: High=auto-approve >8.0 confidence decisions, Medium=standard confirmations, Low=extra confirmations for safety. Control workflow autonomy based on user preference.",
36
- "CONTEXT DOCUMENTATION: Maintain INVESTIGATION_CONTEXT.md throughout. Update after major milestones, failures, or user interventions to enable seamless handoffs between sessions.",
58
+ "CONTEXT DOCUMENTATION: Maintain INVESTIGATION_CONTEXT.md throughout. Update after major milestones, failures, or user interventions to enable seamless handoffs between sessions. Include explicit resumption instructions using workflow_get and workflow_next.",
37
59
  "GIT FALLBACK STRATEGY: If git unavailable, gracefully skip commits/branches, log changes manually in CONTEXT.md with timestamps, warn user, document modifications for manual control.",
38
60
  "GIT ERROR HANDLING: Use run_terminal_cmd for git operations; if fails, output exact command for user manual execution. Never halt investigation due to git unavailability.",
39
61
  "TOOL AVAILABILITY AWARENESS: Check debugging tool availability before investigation design. Have fallbacks for when primary tools unavailable (grep→file_search, etc).",
@@ -50,7 +72,7 @@
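Review note on STEP 4 added in this hunk: the three automation levels differ in when the agent must stop for confirmation. The sketch below is one possible reading: High auto-approves only above 8.0 confidence, Medium confirms key decisions, Low confirms everything. The function name, the `isKeyDecision` flag, and the choice to confirm at exactly 8.0 are assumptions drawn from the wording "confidence >8.0".

```typescript
// Hypothetical gate derived from the High/Medium/Low descriptions in the prompt.
type AutomationLevel = "High" | "Medium" | "Low";

const AUTO_APPROVE_THRESHOLD = 8.0;

function requiresConfirmation(
  level: AutomationLevel,
  confidence: number,
  isKeyDecision: boolean,
): boolean {
  switch (level) {
    case "High":
      // Auto-approve only when confidence is strictly above the threshold.
      return !(confidence > AUTO_APPROVE_THRESHOLD);
    case "Medium":
      return isKeyDecision;
    case "Low":
      return true;
  }
}
```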
50
72
  {
51
73
  "id": "phase-0-triage",
52
74
  "title": "Phase 0: Initial Triage & Context Gathering",
53
- "prompt": "**SYSTEMATIC INVESTIGATION BEGINS** - Your mission is to achieve near 100% certainty about this bug's root cause through systematic evidence gathering. NO FIXES will be proposed until Phase 6.\n\n**STEP 1: Bug Report Analysis**\nPlease provide the complete bug context:\n- **Bug Description**: What is the observed behavior vs expected behavior?\n- **Error Messages/Stack Traces**: Paste the complete error output\n- **Reproduction Steps**: How can this bug be consistently reproduced?\n- **Environment Details**: OS, language version, framework version, etc.\n- **Recent Changes**: Any recent commits, deployments, or configuration changes?\n\n**STEP 2: Project Type Classification**\nBased on the information provided, I will classify the project type and set debugging strategies:\n- **Languages/Frameworks**: Primary tech stack\n- **Build System**: Maven, Gradle, npm, etc.\n- **Testing Framework**: JUnit, Jest, pytest, etc.\n- **Logging System**: Available logging mechanisms\n- **Architecture**: Monolithic, microservices, distributed, serverless, etc.\n\n**STEP 3: Complexity Assessment**\nI will analyze the bug complexity using these criteria:\n- **Simple**: Single function/method, clear error path, minimal dependencies\n- **Standard**: Multiple components, moderate investigation required\n- **Complex**: Cross-system issues, race conditions, complex state management\n\n**OUTPUTS**: Set `projectType`, `bugComplexity`, `debuggingMechanism`, and `isDistributed` (true if architecture involves microservices/distributed systems) context variables.",
75
+ "prompt": "**SYSTEMATIC INVESTIGATION BEGINS** - Your mission is to achieve near 100% certainty about this bug's root cause through systematic evidence gathering. NO FIXES will be proposed until Phase 6.\n\n**STEP 1: Bug Report Analysis**\nPlease provide the complete bug context:\n- **Bug Description**: What is the observed behavior vs expected behavior?\n- **Error Messages/Stack Traces**: Paste the complete error output\n- **Reproduction Steps**: How can this bug be consistently reproduced?\n- **Environment Details**: OS, language version, framework version, etc.\n- **Recent Changes**: Any recent commits, deployments, or configuration changes?\n\n**STEP 2: Project Type Classification**\nBased on the information provided, I will classify the project type and set debugging strategies:\n- **Languages/Frameworks**: Primary tech stack\n- **Build System**: Maven, Gradle, npm, etc.\n- **Testing Framework**: JUnit, Jest, pytest, etc.\n- **Logging System**: Available logging mechanisms\n- **Architecture**: Monolithic, microservices, distributed, serverless, etc.\n\n**STEP 3: Complexity Assessment**\nI will analyze the bug complexity using these criteria:\n- **Simple**: Single function/method, clear error path, minimal dependencies\n- **Standard**: Multiple components, moderate investigation required\n- **Complex**: Cross-system issues, race conditions, complex state management\n\n**STEP 4: Automation Level Selection**\nAsk the user: \"What automation level would you prefer for this investigation?\"\n- **High**: Auto-approve decisions with confidence >8.0, minimal confirmations\n- **Medium**: Standard confirmations for key decisions\n- **Low**: Extra confirmations for safety, manual approval for all changes\n\n**OUTPUTS**: Set context variables:\n- `projectType`, `bugComplexity`, `debuggingMechanism`\n- `isDistributed` (true if architecture involves microservices/distributed systems)\n- `automationLevel` (High/Medium/Low based on user preference)",
54
76
  "agentRole": "You are a senior debugging specialist and bug triage expert with 15+ years of experience across multiple technology stacks. Your expertise lies in quickly classifying bugs, understanding project architectures, and determining appropriate investigation strategies. You excel at extracting critical information from bug reports and setting up systematic investigation approaches.",
55
77
  "guidance": [
56
78
  "CLASSIFICATION ACCURACY: Proper complexity assessment determines investigation depth - be thorough but decisive",
@@ -74,9 +96,22 @@
74
96
  ]
75
97
  },
76
98
  {
77
- "id": "phase-0b-reproducibility-loop",
99
+ "id": "phase-0b-user-preferences",
100
+ "title": "Phase 0b: Identify User Debugging Preferences",
101
+ "prompt": "**USER DEBUGGING PREFERENCES** - Identify and document user-specific debugging preferences.\n\n**CHECK FOR PREFERENCES IN:**\n1. **User Settings/Memory**: Any stored debugging preferences\n2. **Project Documentation**: Team debugging standards\n3. **Previous Instructions**: Past user guidance on debugging approach\n\n**CATEGORIZE PREFERENCES:**\n- **Debugging Tools**: Preference for debugger vs logs vs traces\n- **Log Verbosity**: Detailed vs concise output\n- **Output Format**: Structured logs vs human-readable\n- **Testing Approach**: Unit tests vs integration tests focus\n- **Commit Style**: Conventional commits vs descriptive\n- **Documentation**: Inline comments vs separate docs\n- **Error Handling**: Fail fast vs defensive programming\n\n**IF NO EXPLICIT PREFERENCES:**\nAsk user:\n- \"Do you prefer verbose logging or concise summaries?\"\n- \"Should I use interactive debuggers or rely on log analysis?\"\n- \"Any specific tools or approaches your team prefers?\"\n\n**OUTPUT**: Set `userDebugPreferences` context variable with categorized preferences.\n\n**APPLY**: Use applyDebugPreferences() throughout investigation to adapt approach.",
102
+ "agentRole": "You are a debugging preferences specialist who understands how different teams and developers approach problem-solving. You excel at identifying and applying user-specific debugging styles.",
103
+ "guidance": [
104
+ "This step ensures the investigation aligns with user/team practices",
105
+ "Capture both explicit and implicit preferences",
106
+ "Default to standard practices if no preferences found",
107
+ "These preferences will be applied throughout the workflow"
108
+ ],
109
+ "requireConfirmation": false
110
+ },
111
+ {
112
+ "id": "phase-0c-reproducibility-loop",
78
113
  "type": "loop",
79
- "title": "Phase 0b: Reproducibility Verification Loop",
114
+ "title": "Phase 0c: Reproducibility Verification Loop",
80
115
  "loop": {
81
116
  "type": "for",
82
117
  "count": 3,
@@ -100,8 +135,8 @@
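Review note on the reproducibility assessment in this hunk: the prompt maps the success rate over three reproduction attempts to one of three decisions. A minimal sketch of that mapping follows. The decision labels and function name are hypothetical, and the sketch does not set `isReproducible`, because the prompt does not say which value an intermittent bug should receive.

```typescript
// Hypothetical mapping from reproduction attempts to the three listed decisions.
type ReproDecision =
  | "proceed-to-phase-1"
  | "apply-stress-techniques"
  | "request-more-information";

interface ReproAssessment {
  successRatePercent: number;
  decision: ReproDecision;
}

function assessReproducibility(attempts: readonly boolean[]): ReproAssessment {
  if (attempts.length === 0) {
    throw new Error("At least one reproduction attempt is required");
  }
  const successes = attempts.filter((ok) => ok).length;
  const successRatePercent = (successes / attempts.length) * 100;

  let decision: ReproDecision;
  if (successes === attempts.length) {
    decision = "proceed-to-phase-1"; // 100% reproducible
  } else if (successes === 0) {
    decision = "request-more-information"; // 0% reproducible
  } else {
    decision = "apply-stress-techniques"; // intermittent
  }
  return { successRatePercent, decision };
}
```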
100
135
  "requireConfirmation": false
101
136
  },
102
137
  {
103
- "id": "phase-0c-reproducibility-assessment",
104
- "title": "Phase 0c: Reproducibility Assessment",
138
+ "id": "phase-0d-reproducibility-assessment",
139
+ "title": "Phase 0d: Reproducibility Assessment",
105
140
  "prompt": "**ASSESS REPRODUCIBILITY**\n\nBased on 3 reproduction attempts:\n- **Success Rate**: Calculate percentage\n- **Pattern Analysis**: Identify any intermittent patterns\n- **Minimal Reproduction**: Create simplified test case if needed\n\n**DECISION:**\n- If 100% reproducible: Proceed to Phase 1\n- If intermittent: Apply stress techniques and document patterns\n- If 0% reproducible: Request more information from user\n\n**Set `isReproducible` = true/false based on assessment**",
106
141
  "agentRole": "You are assessing reproduction results to determine investigation viability.",
107
142
  "guidance": [
@@ -123,44 +158,76 @@
123
158
  }
124
159
  },
125
160
  {
126
- "id": "phase-1-streamlined-analysis",
161
+ "id": "phase-0e-tool-check",
162
+ "title": "Phase 0e: Tool Availability Verification",
127
163
  "runCondition": {
128
- "var": "bugComplexity",
129
- "equals": "simple"
164
+ "var": "isReproducible",
165
+ "equals": true
130
166
  },
131
- "title": "Phase 1: Streamlined Analysis (Simple Bugs)",
132
- "prompt": "**STREAMLINED CODEBASE INVESTIGATION** - For simple bugs, I will perform focused analysis of the core issue.\n\n**STEP 1: Direct Component Analysis**\nI will examine the specific component involved:\n- **Primary Function/Method**: Direct analysis of the failing code\n- **Input/Output Analysis**: What data enters and exits the component\n- **Logic Flow**: Step-by-step execution path\n- **Error Point**: Exact location where failure occurs\n\n**STEP 2: Immediate Context Review**\n- **Recent Changes**: Git commits affecting this specific component\n- **Related Tests**: Existing test coverage for this functionality\n- **Dependencies**: Direct dependencies that could affect this component\n\n**STEP 3: Quick Hypothesis Formation**\nI will generate 1-3 focused hypotheses based on:\n- **Obvious Error Patterns**: Common failure modes for this type of component\n- **Change Impact**: How recent modifications could cause this issue\n- **Input Validation**: Whether invalid inputs are causing the failure\n\n**OUTPUTS**: Focused understanding of the simple bug with 1-3 targeted hypotheses ready for validation.",
133
- "agentRole": "You are an experienced debugging specialist who excels at quickly identifying and resolving straightforward technical issues. Your strength lies in pattern recognition and efficient root cause analysis for simple bugs. You focus on the most likely causes while avoiding over-analysis.",
167
+ "prompt": "**TOOL AVAILABILITY CHECK** - Verify required debugging tools before investigation.\n\n**CORE TOOLS CHECK:**\n1. **Analysis Tools**:\n - grep_search: Text pattern searching\n - read_file: File content reading\n - codebase_search: Semantic code search\n - Test availability, note any failures\n\n2. **Git Operations**:\n - Check git availability: `git --version`\n - If unavailable, set `gitAvailable = false`\n - Plan fallback: manual change tracking\n\n3. **Build/Test Tools** (based on projectType):\n - npm/yarn for JavaScript\n - Maven/Gradle for Java\n - pytest/unittest for Python\n - Document which are available\n\n4. **Debugging Tools**:\n - Language-specific debuggers\n - Profilers if needed\n - Log aggregation tools\n\n**FALLBACK STRATEGIES:**\n- grep_search fails → use file_search\n- codebase_search fails → use grep_search with context\n- Git unavailable → track changes in INVESTIGATION_CONTEXT.md\n- Build tools missing → focus on static analysis\n\n**OUTPUT**:\n- Set `availableTools` context variable\n- Set `toolLimitations` with any restrictions\n- Document fallback strategies in context\n\n**ADAPTATION**: Adjust investigation approach based on available tools.",
168
+ "agentRole": "You are a tool availability specialist ensuring the investigation can proceed smoothly with available resources. You excel at creating fallback strategies.",
134
169
  "guidance": [
135
- "FOCUSED ANALYSIS: Concentrate on the specific failing component, avoid deep architectural analysis",
136
- "PATTERN RECOGNITION: Use experience to identify common failure modes quickly",
137
- "EFFICIENT HYPOTHESIS: Generate 1-3 focused hypotheses, not exhaustive possibilities",
138
- "DIRECT APPROACH: Skip complex dependency analysis unless directly relevant"
139
- ]
170
+ "Test each tool category systematically",
171
+ "Don't fail if some tools are unavailable - adapt",
172
+ "Document limitations clearly for user awareness",
173
+ "Prefer degraded functionality over investigation failure"
174
+ ],
175
+ "requireConfirmation": false
140
176
  },
141
177
  {
142
- "id": "phase-1-comprehensive-analysis",
178
+ "id": "phase-0f-create-context",
179
+ "title": "Phase 0f: Initialize Investigation Context",
143
180
  "runCondition": {
144
- "or": [
145
- {
146
- "var": "bugComplexity",
147
- "equals": "standard"
148
- },
149
- {
150
- "var": "bugComplexity",
151
- "equals": "complex"
152
- }
153
- ]
181
+ "var": "isReproducible",
182
+ "equals": true
154
183
  },
155
- "title": "Phase 1: Deep Codebase Analysis (Standard/Complex Bugs)",
156
- "prompt": "**SYSTEMATIC CODEBASE INVESTIGATION** - I will now perform comprehensive analysis of the relevant codebase components.\n\n**STEP 1: Affected Component Identification**\nBased on the bug report, I will identify and analyze:\n- **Primary Components**: Classes, functions, modules directly involved\n- **Dependency Chain**: Related components that could influence the bug\n- **Data Flow**: How data moves through the affected systems\n- **Error Propagation Paths**: Where and how errors can originate and propagate\n\n**STEP 2: Code Structure Analysis**\nFor each relevant component, I will examine:\n- **Implementation Logic**: Step-by-step code execution flow\n- **State Management**: How state is created, modified, and shared\n- **Error Handling**: Existing error handling mechanisms\n- **External Dependencies**: Third-party libraries, APIs, database interactions\n- **Concurrency Patterns**: Threading, async operations, shared resources\n\n**STEP 3: Historical Context Review**\nI will analyze:\n- **Recent Changes**: Git history around the affected components\n- **Test Coverage**: Existing tests and their coverage of the bug area\n- **Known Issues**: TODO comments, FIXME notes, or similar patterns\n\n**STEP 4: Recursive Dependency Dive**\nFor key components, analyze dependencies and internals recursively to uncover hidden issues.\n\n**OUTPUTS**: Comprehensive understanding of the codebase architecture and potential failure points.",
157
- "agentRole": "You are a principal software architect and code analysis expert specializing in systematic codebase investigation. Your strength lies in quickly understanding complex system architectures, identifying failure points, and tracing execution flows. You excel at connecting code patterns to potential runtime behaviors.",
184
+ "prompt": "**CREATE INVESTIGATION CONTEXT**\n\nUse createInvestigationBranch(), then create INVESTIGATION_CONTEXT.md with:\n\n1. **Bug Summary**: ID, description, complexity, reproducibility, status, automation level\n2. **Progress Tracking**: Use visualProgress() to show phases completed/remaining\n3. **Environment**: Project type, debugging mechanism, architecture, tools, user preferences\n4-8. **Section Placeholders**: Analysis, Hypotheses, Evidence, Experiments, Dead Ends\n9. **Function Definitions**: Include all from metaGuidance\n10. **Resumption Instructions**:\n - workflow_get: id=\"systematic-bug-investigation-with-loops\", mode=\"preview\"\n - workflow_next: JSON with workflowId, completedSteps, context variables\n\n**Key Variables**: bugComplexity, projectType, isReproducible, debuggingMechanism, isDistributed, automationLevel, userDebugPreferences, availableTools\n\n**Set contextInitialized = true**",
185
+ "agentRole": "You are creating the central documentation hub for this investigation. This document will track all progress, findings, and enable seamless handoffs.",
158
186
  "guidance": [
159
- "SYSTEMATIC COVERAGE: Analyze all relevant components, not just the obvious ones",
160
- "EXECUTION FLOW FOCUS: Trace the actual code execution path that leads to the bug",
161
- "STATE ANALYSIS: Pay special attention to state management and mutation patterns",
162
- "DEPENDENCY MAPPING: Understand how external dependencies could contribute to the issue"
163
- ]
187
+ "Create a comprehensive but scannable document",
188
+ "Include all context variables discovered so far",
189
+ "Set up structure for future updates",
190
+ "Include function definitions for reference",
191
+ "Update the resumption JSON after each major phase using addResumptionJson()",
192
+ "Always include the workflow_get and workflow_next instructions for proper resumption"
193
+ ],
194
+ "requireConfirmation": false
195
+ },
196
+ {
197
+ "id": "phase-1-iterative-analysis",
198
+ "type": "loop",
199
+ "title": "Phase 1: Multi-Dimensional Codebase Analysis",
200
+ "runCondition": {
201
+ "var": "isReproducible",
202
+ "equals": true
203
+ },
204
+ "loop": {
205
+ "type": "for",
206
+ "count": 4,
207
+ "maxIterations": 4,
208
+ "iterationVar": "analysisPhase"
209
+ },
210
+ "body": [
211
+ {
212
+ "id": "analysis-iteration",
213
+ "title": "Analysis {{analysisPhase}}/4",
214
+ "prompt": "{{analysisPhase === 1 ? '**BREADTH SCAN**\\n\\n1. **Error Mapping**: grep_search errors, trace logs, map stack traces\\n2. **Component Discovery**: Find all interacting components using codebase_search\\n3. **Data Flow**: Trace data through bug area, transformations, persistence\\n4. **Recent Changes**: Git history last 10 commits\\n\\n**Output**: BreadthAnalysis.md with interaction map' : analysisPhase === 2 ? '**COMPONENT DEEP DIVE**\\n\\nUse recursiveAnalysis(component, 3) on top 5 suspicious components:\\n\\n1. **L1 Direct**: Read complete file, state management, error handling\\n2. **L2 Dependencies**: Follow imports, contracts, version compatibility\\n3. **L3 Integration**: System fit, side effects, concurrency, resources\\n\\n**Output**: ComponentAnalysis.md with deep insights' : analysisPhase === 3 ? '**DEPENDENCY & FLOW ANALYSIS**\\n\\n1. **Static Graph**: Import tree, circular deps, hidden dependencies\\n2. **Runtime Flow**: Execution paths, async flows, state changes\\n3. **Data Pipeline**: Track transformations, validation, corruption points\\n4. **Integration**: External services, DB, queues, filesystem\\n\\n**Output**: FlowAnalysis.md with diagrams' : '**TEST COVERAGE ANALYSIS**\\n\\nUse analyzeTests(component) for each suspicious component:\\n\\n1. **Direct Coverage**: Find tests, analyze coverage gaps, quality\\n2. **Integration Tests**: Bug area tests, assumptions, flaky tests\\n3. **History**: When added/modified, correlation with bug\\n4. **Debug Execution**: Run with debug flags, instrument, compare\\n\\n**Output**: TestAnalysis.md with coverage gaps matrix'}}",
215
+ "agentRole": "You are performing systematic analysis phase {{analysisPhase}} of 4. Your focus is {{analysisPhase === 1 ? 'casting a wide net to find all potentially related components' : analysisPhase === 2 ? 'deep diving into the most suspicious components to understand their internals' : analysisPhase === 3 ? 'tracing how components connect and data flows between them' : 'leveraging existing tests to understand expected behavior and find coverage gaps'}}.",
216
+ "guidance": [
217
+ "This is analysis phase {{analysisPhase}} of 4 total phases",
218
+ "Phase 1 = Breadth Scan, Phase 2 = Deep Dive, Phase 3 = Dependencies, Phase 4 = Tests",
219
+ "Each phase builds on previous findings",
220
+ "Create a structured markdown file for each phase output",
221
+ "Use the function definitions for standardized operations",
222
+ "If you discover the bug's root cause with high confidence, note it but complete all analysis phases for thoroughness",
223
+ "Update INVESTIGATION_CONTEXT.md after each phase: use updateInvestigationContext('Analysis Findings', phase-specific findings)",
224
+ "In Phase 1 (Breadth Scan): Use findSimilarBugs() to search for historical patterns",
225
+ "After all 4 phases complete, use trackInvestigation('Phase 1 Complete', 'Moving to Hypothesis Development')"
226
+ ],
227
+ "requireConfirmation": false
228
+ }
229
+ ],
230
+ "requireConfirmation": false
164
231
  },
165
232
  {
166
233
  "id": "phase-1a-binary-search",
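Reviewer note: a rough sketch of how the `for` loop added in `phase-1-iterative-analysis` might unroll into four analysis steps. The helper `expandForLoop` and the 1-based counter are assumptions made for illustration (the ternaries in the prompt compare `analysisPhase === 1`, which suggests counting from 1); this is not WorkRail's engine code. The phase names are taken from the step's guidance.

```javascript
// Hypothetical helper: unrolls a "for" loop definition into step titles.
// Assumes the iteration variable starts at 1 and that maxIterations caps count.
const PHASE_NAMES = ["Breadth Scan", "Deep Dive", "Dependencies", "Tests"];

function expandForLoop(loop) {
  const iterations = Math.min(loop.count, loop.maxIterations);
  const titles = [];
  for (let i = 1; i <= iterations; i++) {
    // Bind the counter under the name given by iterationVar, as a template context would.
    const context = { [loop.iterationVar]: i };
    const phase = context[loop.iterationVar];
    titles.push(`Analysis ${phase}/${loop.count}: ${PHASE_NAMES[phase - 1]}`);
  }
  return titles;
}

// Loop definition copied from the hunk above.
const analysisLoop = { type: "for", count: 4, maxIterations: 4, iterationVar: "analysisPhase" };
expandForLoop(analysisLoop).forEach((title) => console.log(title));
```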
@@ -201,7 +268,7 @@
201
268
  {
202
269
  "id": "phase-2a-hypothesis-development",
203
270
  "title": "Phase 2a: Hypothesis Development & Prioritization",
204
- "prompt": "**HYPOTHESIS GENERATION** - Based on codebase analysis, formulate testable hypotheses about the bug's root cause.\n\n**STEP 1: Evidence-Based Hypothesis Development**\nCreate maximum 5 prioritized hypotheses. Each includes:\n- **Root Cause Theory**: Specific technical explanation\n- **Supporting Evidence**: Code patterns/logic flows supporting this theory\n- **Failure Mechanism**: Exact sequence leading to observed bug\n- **Testability Score**: Quantified assessment (1-10) of validation ease\n- **Evidence Strength Score**: Quantified assessment (1-10) based on code findings\n\n**STEP 2: Hypothesis Prioritization Matrix**\nRank hypotheses using weighted scoring:\n- **Evidence Strength** (40%): Code analysis support for theory\n- **Testability** (35%): Validation ease with debugging instruments\n- **Impact Scope** (25%): How well this explains all symptoms\n\n**CRITICAL RULE**: All hypotheses must be based on concrete evidence from code analysis.\n\n**OUTPUTS**: Maximum 5 hypotheses with quantified scoring, ranked by priority.",
271
+ "prompt": "**HYPOTHESIS GENERATION** - Based on codebase analysis, formulate testable hypotheses about the bug's root cause.\n\n**STEP 1: Evidence-Based Hypothesis Development**\nCreate maximum 5 prioritized hypotheses. Each includes:\n- **Root Cause Theory**: Specific technical explanation\n- **Supporting Evidence**: Code patterns/logic flows supporting this theory\n- **Failure Mechanism**: Exact sequence leading to observed bug\n- **Testability Score**: Quantified assessment (1-10) of validation ease\n- **Evidence Strength Score**: Quantified assessment (1-10) based on code findings\n\n**STEP 2: Hypothesis Prioritization Matrix**\nRank hypotheses using weighted scoring:\n- **Evidence Strength** (40%): Code analysis support for theory\n- **Testability** (35%): Validation ease with debugging instruments\n- **Impact Scope** (25%): How well this explains all symptoms\n\n**STEP 3: Pattern Integration**\nIncorporate findings from findSimilarBugs():\n- **Historical Patterns**: Similar bugs fixed previously\n- **Known Issues**: Related problems in the codebase\n- **Test Failures**: Similar test failure patterns\n- Adjust hypothesis confidence based on pattern matches\n\n**CRITICAL RULE**: All hypotheses must be based on concrete evidence from code analysis.\n\n**OUTPUTS**: Maximum 5 hypotheses with quantified scoring, ranked by priority.",
205
272
  "agentRole": "You are a senior software detective and root cause analysis expert with deep expertise in systematic hypothesis formation. Your strength lies in connecting code evidence to potential failure mechanisms and creating testable theories. You excel at logical reasoning and evidence-based deduction. You must maintain rigorous quantitative standards and reject any hypothesis not grounded in concrete code evidence.",
206
273
  "guidance": [
207
274
  "EVIDENCE-BASED ONLY: Every hypothesis must be grounded in concrete code analysis findings with quantified evidence scores",
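Reviewer note: the prioritization matrix in the Phase 2a prompt (Evidence Strength 40%, Testability 35%, Impact Scope 25%) can be sketched as a weighted sum. The function name `priorityScore`, the field names, and the sample hypotheses are hypothetical; the prompt gives a 1-10 scale for evidence strength and testability, and the sketch assumes the same scale for impact scope.

```javascript
// Weights from the Phase 2a prompt, kept as integer percentages
// so the arithmetic stays exact until the final division.
const WEIGHTS = { evidenceStrength: 40, testability: 35, impactScope: 25 };

// Hypothetical scoring helper: each input is assumed to be on a 1-10 scale.
function priorityScore(hypothesis) {
  const total =
    WEIGHTS.evidenceStrength * hypothesis.evidenceStrength +
    WEIGHTS.testability * hypothesis.testability +
    WEIGHTS.impactScope * hypothesis.impactScope;
  return total / 100;
}

// Sort a copy of the list so the highest-priority hypothesis comes first.
function rankHypotheses(hypotheses) {
  return [...hypotheses].sort((a, b) => priorityScore(b) - priorityScore(a));
}

// Made-up sample data for illustration only.
const sampleHypotheses = [
  { id: "H1", evidenceStrength: 6, testability: 9, impactScope: 5 },
  { id: "H2", evidenceStrength: 8, testability: 6, impactScope: 7 },
];
console.log(rankHypotheses(sampleHypotheses).map((h) => h.id).join(", "));
```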
@@ -225,7 +292,7 @@
225
292
  {
226
293
  "id": "phase-2b-hypothesis-validation-strategy",
227
294
  "title": "Phase 2b: Hypothesis Validation Strategy & Documentation",
228
- "prompt": "**HYPOTHESIS VALIDATION PLANNING** - For the top 3 hypotheses, create validation strategies and documentation.\n\n**STEP 1: Hypothesis Validation Strategy**\nFor top 3 hypotheses, define:\n- **Required Evidence**: Specific evidence to confirm/refute hypothesis\n- **Debugging Approach**: Instrumentation/tests providing evidence\n- **Success Criteria**: Results proving hypothesis correct\n- **Confidence Threshold**: Minimum evidence quality needed\n\n**STEP 2: Hypothesis Documentation**\nCreate structured registry:\n- **Hypothesis ID**: H1, H2, H3 for tracking\n- **Status**: Active, Refuted, Confirmed\n- **Evidence Log**: Supporting and contradicting evidence\n- **Validation Plan**: Specific testing approach\n\n**STEP 3: Coverage Check**\nEnsure hypotheses cover diverse categories (logic, state, dependencies) with deep analysis.\n\n**OUTPUTS**: Top 3 hypotheses selected for validation with structured documentation and validation plans.",
295
+ "prompt": "**HYPOTHESIS VALIDATION PLANNING** - For the top 3 hypotheses, create validation strategies and documentation.\n\n**STEP 1: Hypothesis Validation Strategy**\nFor top 3 hypotheses, define:\n- **Required Evidence**: Specific evidence to confirm/refute hypothesis\n- **Debugging Approach**: Instrumentation/tests providing evidence\n- **Success Criteria**: Results proving hypothesis correct\n- **Confidence Threshold**: Minimum evidence quality needed\n\n**STEP 2: Hypothesis Documentation**\nCreate structured registry:\n- **Hypothesis ID**: H1, H2, H3 for tracking\n- **Status**: Active, Refuted, Confirmed\n- **Evidence Log**: Supporting and contradicting evidence\n- **Validation Plan**: Specific testing approach\n\n**STEP 3: Coverage Check**\nEnsure hypotheses cover diverse categories (logic, state, dependencies) with deep analysis.\n\n**STEP 4: Update Investigation Context**\nUse updateInvestigationContext('Hypothesis Registry', formatted hypothesis table with all details)\n\n**OUTPUTS**: Top 3 hypotheses selected for validation with structured documentation and validation plans.",
229
296
  "agentRole": "You are a systematic testing strategist and documentation expert. Your strength lies in creating clear validation plans and maintaining rigorous documentation standards for hypothesis tracking and evidence collection.",
230
297
  "guidance": [
231
298
  "STRUCTURED DOCUMENTATION: Create formal hypothesis registry with tracking IDs and status",
@@ -258,6 +325,78 @@
258
325
  ],
259
326
  "requireConfirmation": false
260
327
  },
328
+ {
329
+ "id": "phase-2d-test-evidence-gathering",
330
+ "title": "Phase 2d: Test-Based Hypothesis Evidence",
331
+ "runCondition": {
332
+ "var": "hypothesesToValidate",
333
+ "not_equals": null
334
+ },
335
+ "prompt": "**TEST-DRIVEN HYPOTHESIS VALIDATION**\n\nFor each hypothesis in hypothesesToValidate, use checkHypothesisInTests(hypothesis):\n\n**1. Direct Test Evidence**:\n- Find tests that directly test suspected components\n- Analyze test names, descriptions, and assertions\n- Check if tests actually validate what we think\n\n**2. Indirect Test Evidence**:\n- Find tests that would fail if hypothesis is true\n- Look for integration tests touching the area\n- Check for tests that assume opposite behavior\n\n**3. Test Coverage Gaps**:\n- What aspects of hypothesis are NOT tested?\n- Where would a test have caught this bug?\n- What assumptions do tests make?\n\n**4. Test Execution Analysis**:\n- Run tests with debug instrumentation\n- Add temporary logging to tests\n- Compare test expectations vs reality\n\n**5. Historical Test Analysis**:\n- When were relevant tests last modified?\n- Were any tests disabled recently?\n- Do test changes correlate with bug appearance?\n\n**Create TestEvidence Matrix**:\n```\n| Hypothesis | Supporting Tests | Contradicting Tests | Coverage Gaps | Confidence Impact |\n|------------|------------------|---------------------|---------------|-------------------|\n| H1 | TestA, TestB | TestC (partially) | Edge case X | +2 confidence |\n```\n\n**Update each hypothesis** with test evidence findings.",
336
+ "agentRole": "You are a test analysis specialist validating hypotheses against the existing test suite. Your goal is to use tests as objective evidence for or against each hypothesis.",
337
+ "guidance": [
338
+ "Tests are the codified understanding of system behavior",
339
+ "A hypothesis contradicted by passing tests needs reconsideration",
340
+ "Missing test coverage often indicates where bugs hide",
341
+ "Update hypothesis confidence based on test evidence"
342
+ ],
343
+ "requireConfirmation": false
344
+ },
345
+ {
346
+ "id": "phase-2e-hypothesis-verification",
347
+ "type": "loop",
348
+ "title": "Phase 2e: Hypothesis Verification & Refinement",
349
+ "runCondition": {
350
+ "var": "hypothesesToValidate",
351
+ "not_equals": null
352
+ },
353
+ "loop": {
354
+ "type": "forEach",
355
+ "items": "hypothesesToValidate",
356
+ "itemVar": "hypothesis",
357
+ "indexVar": "hypothesisIndex",
358
+ "maxIterations": 10
359
+ },
360
+ "body": [
361
+ {
362
+ "id": "verify-against-code",
363
+ "title": "Deep Code Verification for {{hypothesis.id}}",
364
+ "prompt": "**DEEP VERIFICATION for {{hypothesis.id}}**\n\n**Goal**: Verify hypothesis assumptions through deep code analysis.\n\nUse recursiveAnalysis() on key components:\n\n1. **Component Analysis (3 levels deep)**:\n - Level 1: Direct implementation of suspected component\n - Level 2: All direct dependencies and callers\n - Level 3: Transitive dependencies and integration points\n\n2. **State & Data Flow Verification**:\n - How does data actually flow through this component?\n - What state transformations occur?\n - Are there hidden side effects?\n\n3. **Error Path Analysis**:\n - Trace all error handling paths\n - Find where errors could originate\n - Check error propagation matches hypothesis\n\n4. **Concurrency Check** (if applicable):\n - Race conditions possible?\n - Shared state issues?\n - Timing dependencies?\n\n**Output**: Deep verification findings for {{hypothesis.id}}",
365
+ "agentRole": "You are performing deep verification of hypothesis {{hypothesis.id}}, diving 3+ levels deep to ensure thorough understanding.",
366
+ "guidance": [
367
+ "This is verification step 1 of 3 for {{hypothesis.id}}",
368
+ "Go deeper than the initial analysis - follow every lead",
369
+ "Document any new discoveries that affect the hypothesis"
370
+ ],
371
+ "requireConfirmation": false
372
+ },
373
+ {
374
+ "id": "check-contradictions",
375
+ "title": "Search for Contradicting Evidence",
376
+ "prompt": "**CONTRADICTION SEARCH for {{hypothesis.id}}**\n\n**Goal**: Actively search for evidence that contradicts this hypothesis.\n\n1. **Code Pattern Contradictions**:\n - Search for code that assumes opposite behavior\n - Find defensive checks that prevent this scenario\n - Look for comments indicating different understanding\n\n2. **Test Contradictions**:\n - Tests that would fail if hypothesis were true\n - Tests that explicitly verify opposite behavior\n - Integration tests showing different flow\n\n3. **Historical Contradictions**:\n - Git history showing intentional design decisions\n - PRs or issues discussing this behavior\n - Documentation stating different intent\n\n4. **Runtime Contradictions**:\n - Logs showing successful execution through suspected path\n - Metrics indicating normal behavior\n - Other systems depending on current behavior\n\n**Be a skeptic** - try to disprove {{hypothesis.id}}",
377
+ "agentRole": "You are a skeptical investigator trying to find flaws in hypothesis {{hypothesis.id}}.",
378
+ "guidance": [
379
+ "Actively search for contradicting evidence",
380
+ "Check assumptions against reality",
381
+ "Consider alternative explanations"
382
+ ],
383
+ "requireConfirmation": false
384
+ },
385
+ {
386
+ "id": "refine-or-replace",
387
+ "title": "Refine Hypothesis {{hypothesis.id}}",
388
+ "prompt": "**REFINEMENT DECISION for {{hypothesis.id}}**\n\nBased on deep verification and contradiction search:\n\n1. **Assessment**:\n - New evidence supporting: [list]\n - New evidence contradicting: [list]\n - Unverified assumptions: [list]\n - Confidence change: [+/- points]\n\n2. **Refinement Options**:\n - **Keep as-is**: Evidence strongly supports current formulation\n - **Refine**: Adjust hypothesis based on new understanding\n - **Replace**: Fundamentally flawed, create new hypothesis\n - **Merge**: Combine with another hypothesis\n\n3. **If Refining/Replacing**:\n - Update hypothesis description\n - Adjust evidence strength score\n - Revise validation plan\n - Document why changed\n\n4. **Update Context**:\n - Use updateInvestigationContext('Hypothesis Registry', updated hypothesis)\n - Note verification findings\n\n**Output**: Updated hypothesis with refined understanding",
389
+ "agentRole": "You are making the final decision on hypothesis {{hypothesis.id}} based on verification findings.",
390
+ "guidance": [
391
+ "Be willing to change hypotheses based on evidence",
392
+ "Document all changes and reasoning",
393
+ "Update confidence scores appropriately"
394
+ ],
395
+ "requireConfirmation": false
396
+ }
397
+ ],
398
+ "requireConfirmation": false
399
+ },
261
400
  {
262
401
  "id": "phase-3-4-5-validation-loop",
263
402
  "type": "loop",
@@ -273,7 +412,7 @@
273
412
  {
274
413
  "id": "loop-phase-3-instrumentation",
275
414
  "title": "Phase 3: Debug Instrumentation for {{currentHypothesis.id}}",
276
- "prompt": "**DEBUGGING INSTRUMENTATION for {{currentHypothesis.id}}**\n\n**Hypothesis**: {{currentHypothesis.description}}\n**Validation Plan**: {{currentHypothesis.validationPlan}}\n\n**IMPLEMENT INSTRUMENTATION:**\n1. **Strategy**: Choose based on hypothesis needs (logging, debug prints, test mods)\n2. **Coverage**: Instrument all paths related to {{currentHypothesis.id}}\n3. **Evidence Points**: Focus on gathering evidence that will confirm/refute this specific hypothesis\n\n**LOG OPTIMIZATION:**\n- Use '{{currentHypothesis.id}}_DEBUG:' prefix for all logs\n- Implement deduplication for high-frequency events\n- Group related operations within 50-100ms windows\n\n**OUTPUT**: Instrumented code ready to validate {{currentHypothesis.id}}",
415
+ "prompt": "**DEBUGGING INSTRUMENTATION for {{currentHypothesis.id}}**\n\n**Hypothesis**: {{currentHypothesis.description}}\n\n**IMPLEMENT SMART LOGGING**:\n\n1. **Standard Format**: Use instrumentCode(location, '{{currentHypothesis.id}}')\n ```\n className.methodName [{{currentHypothesis.id}}] {timestamp}: Specific message\n ```\n\n2. **Deduplication Implementation**:\n ```javascript\n // Add to each instrumentation point\n const debugState = { lastMsg: '', count: 0 };\n function smartLog(msg) {\n if (debugState.lastMsg === msg) {\n debugState.count++;\n if (debugState.count % 10 === 0) {\n console.log(`[{{currentHypothesis.id}}] ${msg} x${debugState.count}`);\n }\n } else {\n if (debugState.count > 1) {\n console.log(`[{{currentHypothesis.id}}] Previous message x${debugState.count}`);\n }\n console.log(`[{{currentHypothesis.id}}] ${msg}`);\n debugState.lastMsg = msg;\n debugState.count = 1;\n }\n }\n ```\n\n3. **Operation Grouping**:\n ```javascript\n console.log(`=== {{currentHypothesis.id}}: Operation ${opName} Start ===`);\n const startTime = Date.now();\n // ... operation code with smartLog() calls ...\n console.log(`=== {{currentHypothesis.id}}: Operation ${opName} End (${Date.now() - startTime}ms) ===`);\n ```\n\n4. **Test Instrumentation**:\n - Add debugging to relevant test files\n - Instrument test setup/teardown\n - Log test assumptions vs actual behavior\n\n5. **High-Frequency Aggregation**:\n - For loops/iterations, log summary every 100 iterations\n - For events, create time-window summaries\n - Track unique values and their counts\n\n**OUTPUT**: Instrumented code ready to produce clean, manageable logs for {{currentHypothesis.id}}",
277
416
  "agentRole": "You are instrumenting code specifically to validate hypothesis {{currentHypothesis.id}}. Focus on targeted evidence collection.",
278
417
  "guidance": [
279
418
  "This is hypothesis {{hypothesisIndex + 1}} of {{hypothesesToValidate.length}}",
@@ -285,7 +424,7 @@
285
424
  {
286
425
  "id": "loop-phase-4-evidence",
287
426
  "title": "Phase 4: Evidence Collection for {{currentHypothesis.id}}",
288
- "prompt": "**EVIDENCE COLLECTION for {{currentHypothesis.id}}**\n\n**Execute instrumented code and collect evidence:**\n1. Run the instrumented test/reproduction\n2. Collect all {{currentHypothesis.id}}_DEBUG logs\n3. Analyze results against validation criteria\n4. Document evidence quality and relevance\n\n**EVIDENCE ASSESSMENT:**\n- Does evidence support {{currentHypothesis.id}}? (Yes/No/Partial)\n- Evidence quality score (1-10)\n- Contradicting evidence found?\n- Additional evidence needed?\n\n**If log volume >500 lines, create sub-analysis prompt.**\n\n**OUTPUT**: Evidence assessment for {{currentHypothesis.id}} with quality scoring",
427
+ "prompt": "**EVIDENCE COLLECTION for {{currentHypothesis.id}}**\n\n**Execute instrumented code and collect evidence:**\n1. Run the instrumented test/reproduction\n2. Collect all {{currentHypothesis.id}}_DEBUG logs\n3. Analyze results against validation criteria\n4. Document evidence quality and relevance\n\n**TEST EXECUTION EVIDENCE**:\n- Run instrumented tests for {{currentHypothesis.id}}\n- Collect test debug output\n- Note any test failures or unexpected behavior\n- Compare with production bug behavior\n\n**EVIDENCE ASSESSMENT:**\n- Does evidence support {{currentHypothesis.id}}? (Yes/No/Partial)\n- Evidence quality score (1-10)\n- Contradicting evidence found?\n- Additional evidence needed?\n\n**If log volume >500 lines, use aggregateDebugLogs() and create sub-analysis prompt.**\n\n**OUTPUT**: Evidence assessment for {{currentHypothesis.id}} with quality scoring",
289
428
  "agentRole": "You are collecting and analyzing evidence specifically for hypothesis {{currentHypothesis.id}}.",
290
429
  "guidance": [
291
430
  "Focus on evidence directly related to this hypothesis",
@@ -297,7 +436,7 @@
297
436
  {
298
437
  "id": "loop-phase-5-synthesis",
299
438
  "title": "Phase 5: Evidence Synthesis for {{currentHypothesis.id}}",
300
- "prompt": "**EVIDENCE SYNTHESIS for {{currentHypothesis.id}}**\n\n**Synthesize findings:**\n1. **Evidence Summary**: What did we learn about {{currentHypothesis.id}}?\n2. **Confidence Update**: Based on evidence, rate confidence this is the root cause (0-10)\n3. **Status Update**: Mark hypothesis as Confirmed/Refuted/Needs-More-Evidence\n\n**If {{currentHypothesis.id}} is confirmed with high confidence (>8.0):**\n- Set `rootCauseFound` = true\n- Set `rootCauseHypothesis` = {{currentHypothesis.id}}\n- Update `currentConfidence` with confidence score\n\n**If all hypotheses validated but confidence <9.0:**\n- Consider additional investigation needs\n- Document what evidence is still missing",
439
+ "prompt": "**EVIDENCE SYNTHESIS for {{currentHypothesis.id}}**\n\n**Synthesize findings:**\n1. **Evidence Summary**: What did we learn about {{currentHypothesis.id}}?\n2. **Confidence Update**: Based on evidence, rate confidence this is the root cause (0-10)\n3. **Status Update**: Mark hypothesis as Confirmed/Refuted/Needs-More-Evidence\n\n**If {{currentHypothesis.id}} is confirmed with high confidence (>8.0):**\n- Set `rootCauseFound` = true\n- Set `rootCauseHypothesis` = {{currentHypothesis.id}}\n- Update `currentConfidence` with confidence score\n\n**If all hypotheses validated but confidence <9.0:**\n- Consider additional investigation needs\n- Document what evidence is still missing\n\n**Context Update**:\n- Use updateInvestigationContext('Evidence Log', evidence summary for {{currentHypothesis.id}})\n- Every 3 iterations: Use trackInvestigation('Validation Progress', '{{hypothesisIndex + 1}}/{{hypothesesToValidate.length}} hypotheses validated')",
301
440
  "agentRole": "You are synthesizing evidence to determine if {{currentHypothesis.id}} is the root cause.",
302
441
  "guidance": [
303
442
  "Update hypothesis status based on evidence",
@@ -315,6 +454,40 @@
315
454
  },
316
455
  "requireConfirmation": false
317
456
  },
457
+ {
458
+ "id": "phase-4a-controlled-experimentation",
459
+ "title": "Phase 4a: Controlled Code Experiments",
460
+ "runCondition": {
461
+ "var": "currentConfidence",
462
+ "lt": 8.0
463
+ },
464
+ "prompt": "**CONTROLLED EXPERIMENTATION** - When observation isn't enough, experiment!\n\n**Current Top Hypothesis**: {{hypothesesToValidate[0].id}} (Confidence: {{currentConfidence}}/10)\n\n**EXPERIMENT TYPES** (use controlledModification()):\n\n1. **Guard Additions (Non-Breaking)**:\n ```javascript\n // Add defensive check that logs but doesn't change behavior\n if (unexpectedCondition) {\n console.error('[H1_GUARD] Unexpected state detected:', state);\n // Continue normal execution\n }\n ```\n\n2. **Assertion Injections**:\n ```javascript\n // Add assertion that would fail if hypothesis is correct\n console.assert(expectedCondition, '[H1_ASSERT] Hypothesis H1 violated!');\n ```\n\n3. **Minimal Fix Test**:\n ```javascript\n // Apply minimal fix for hypothesis, see if bug disappears\n if (process.env.DEBUG_FIX_H1 === 'true') {\n // Apply hypothesized fix\n return fixedBehavior();\n }\n ```\n\n4. **Controlled Breaking**:\n ```javascript\n // Temporarily break suspected component to verify involvement\n if (process.env.DEBUG_BREAK_H1 === 'true') {\n throw new Error('[H1_BREAK] Intentionally breaking to test hypothesis');\n }\n ```\n\n**PROTOCOL**:\n1. Choose experiment type based on confidence and risk\n2. Implement modification with clear DEBUG markers\n3. Use createInvestigationBranch() if not already on investigation branch\n4. Commit: `git commit -m \"DEBUG: {{experiment_type}} for {{currentHypothesis.id}}\"`\n5. Run reproduction steps\n6. Use collectEvidence() to gather results\n7. Revert changes: `git revert HEAD`\n8. Document results in ExperimentResults/{{currentHypothesis.id}}.md\n\n**SAFETY LIMITS**:\n- Max 3 experiments per hypothesis\n- Each experiment in separate commit\n- Always revert after evidence collection\n- Document everything in INVESTIGATION_CONTEXT.md\n\n**UPDATE**:\n- Hypothesis confidence based on experimental results\n- Use updateInvestigationContext('Experiment Results', experiment details and outcomes)\n- Track failed experiments in 'Dead Ends & Lessons' section",
465
+ "agentRole": "You are a careful experimenter using controlled code modifications to validate hypotheses. Safety and reversibility are paramount.",
466
+ "guidance": [
467
+ "Start with non-breaking experiments (guards, logs)",
468
+ "Only use breaking experiments if essential",
469
+ "Every change must be easily reversible",
470
+ "Document rationale for each experiment type",
471
+ "Consider test environment experiments first"
472
+ ],
473
+ "requireConfirmation": {
474
+ "or": [
475
+ {"var": "automationLevel", "equals": "Low"},
476
+ {"var": "automationLevel", "equals": "Medium"},
477
+ {"and": [
478
+ {"var": "automationLevel", "equals": "High"},
479
+ {"var": "currentConfidence", "lt": 6.0}
480
+ ]}
481
+ ]
482
+ },
483
+ "validationCriteria": [
484
+ {
485
+ "type": "contains",
486
+ "value": "commit",
487
+ "message": "Must specify commit message for experiment"
488
+ }
489
+ ]
490
+ },
318
491
  {
319
492
  "id": "phase-3a-observability-setup",
320
493
  "title": "Phase 3a: Distributed System Observability",
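Reviewer note: the conditional `requireConfirmation` added to `phase-4a-controlled-experimentation` nests `or`, `and`, `equals`, and `lt`. A minimal sketch of how such a condition could be evaluated against context variables follows; `evaluateCondition` is a hypothetical helper written for this note, not WorkRail's evaluator, and it handles only the operators that appear in that condition.

```javascript
// Hypothetical evaluator for the nested condition shape used in phase-4a.
function evaluateCondition(condition, context) {
  if (Array.isArray(condition.or)) {
    return condition.or.some((c) => evaluateCondition(c, context));
  }
  if (Array.isArray(condition.and)) {
    return condition.and.every((c) => evaluateCondition(c, context));
  }
  const value = context[condition.var];
  if ("equals" in condition) {
    return value === condition.equals;
  }
  if ("lt" in condition) {
    // Assumption: a missing or non-numeric value never satisfies "lt".
    return typeof value === "number" && value < condition.lt;
  }
  throw new Error("Unsupported condition: " + JSON.stringify(condition));
}

// Condition copied from the hunk above.
const confirmCondition = {
  or: [
    { var: "automationLevel", equals: "Low" },
    { var: "automationLevel", equals: "Medium" },
    { and: [
      { var: "automationLevel", equals: "High" },
      { var: "currentConfidence", lt: 6.0 },
    ] },
  ],
};

console.log(evaluateCondition(confirmCondition, { automationLevel: "High", currentConfidence: 7.5 }));
```

Under this reading, a run at automation level "High" asks for confirmation only while confidence is below 6.0, whereas "Low" and "Medium" always ask.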
@@ -369,7 +542,7 @@
369
542
  {
370
543
  "id": "phase-5a-final-confidence",
371
544
  "title": "Phase 5a: Final Confidence Assessment",
372
- "prompt": "**FINAL CONFIDENCE ASSESSMENT** - Evaluate the investigation results.\n\n**If root cause found (rootCauseFound = true):**\n- Review all evidence for {{rootCauseHypothesis}}\n- Perform adversarial challenge\n- Calculate final confidence score\n\n**If no high-confidence root cause:**\n- Document what was learned\n- Identify remaining unknowns\n- Recommend next investigation steps\n\n**CONFIDENCE CALCULATION:**\n- Evidence Quality (1-10)\n- Explanation Completeness (1-10)\n- Alternative Likelihood (1-10, inverted)\n- Final = (Quality × 0.4) + (Completeness × 0.4) + (Alternative × 0.2)\n\n**OUTPUT**: Final confidence assessment with recommendations",
545
+ "prompt": "**FINAL CONFIDENCE ASSESSMENT** - Evaluate the investigation results.\n\n**If root cause found (rootCauseFound = true):**\n- Review all evidence for {{rootCauseHypothesis}}\n- Perform adversarial challenge\n- Calculate final confidence score\n\n**If no high-confidence root cause:**\n- Document what was learned\n- Identify remaining unknowns\n- Recommend next investigation steps\n\n**CONFIDENCE CALCULATION:**\n- Evidence Quality (1-10)\n- Explanation Completeness (1-10)\n- Alternative Likelihood (1-10, inverted)\n- Final = (Quality × 0.4) + (Completeness × 0.4) + (Alternative × 0.2)\n\n**CONTEXT UPDATE**:\n- Use trackInvestigation('Investigation Complete', 'Confidence: {{finalConfidence}}/10')\n- Use addResumptionJson('phase-5a-final-confidence')\n- Document lessons learned in 'Dead Ends & Lessons' section\n\n**OUTPUT**: Final confidence assessment with recommendations",
373
546
  "agentRole": "You are making the final determination about the root cause with rigorous confidence assessment.",
374
547
  "guidance": [
375
548
  "Be honest about confidence levels",
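Reviewer note: a worked sketch of the confidence formula in the Phase 5a prompt, Final = (Quality × 0.4) + (Completeness × 0.4) + (Alternative × 0.2). The prompt says Alternative Likelihood is "inverted" without defining the inversion; the sketch assumes 11 minus the raw 1-10 score, so an unlikely alternative raises confidence. The function name and the sample inputs are hypothetical.

```javascript
// Hypothetical helper for the Phase 5a confidence formula.
// Assumption: "inverted" means 11 - alternativeLikelihood on a 1-10 scale.
function finalConfidence(evidenceQuality, explanationCompleteness, alternativeLikelihood) {
  const invertedAlternative = 11 - alternativeLikelihood;
  // Weights 0.4 / 0.4 / 0.2 expressed as tenths to keep the sum exact before dividing.
  return (4 * evidenceQuality + 4 * explanationCompleteness + 2 * invertedAlternative) / 10;
}

// Made-up inputs: strong evidence, fairly complete explanation, unlikely alternatives.
console.log(finalConfidence(9, 8, 3).toFixed(1)); // prints "8.4"
```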
@@ -409,7 +582,7 @@
409
582
  {
410
583
  "id": "phase-6-diagnostic-writeup",
411
584
  "title": "Phase 6: Comprehensive Diagnostic Writeup",
412
- "prompt": "**FINAL DIAGNOSTIC DOCUMENTATION** - I will create comprehensive writeup enabling effective bug fixing and knowledge transfer.\n\n**STEP 1: Executive Summary**\n- **Bug Summary**: Concise description of issue and impact\n- **Root Cause**: Clear, non-technical explanation of what is happening\n- **Confidence Level**: Final confidence assessment with calculation methodology\n- **Scope**: What systems, users, or scenarios are affected\n\n**STEP 2: Technical Deep Dive**\n- **Root Cause Analysis**: Detailed technical explanation of failure mechanism\n- **Code Component Analysis**: Specific files, functions, and lines with exact locations\n- **Execution Flow**: Step-by-step sequence of events leading to bug\n- **State Analysis**: How system state contributes to failure\n\n**STEP 3: Investigation Methodology**\n- **Investigation Timeline**: Chronological summary with phase time investments\n- **Hypothesis Evolution**: Complete record of hypotheses (H1-H5) with status changes\n- **Evidence Assessment**: Rating and reliability of evidence sources with key citations\n\n**STEP 4: Knowledge Transfer & Action Plan**\n- **Skill Requirements**: Technical expertise needed for understanding and fixing\n- **Prevention & Review**: Specific measures and code review checklist items\n- **Action Items**: Immediate mitigation steps and permanent fix areas with timelines\n- **Testing Strategy**: Comprehensive verification approach for fixes\n\n**DELIVERABLE**: Enterprise-grade diagnostic report enabling confident bug fixing, knowledge transfer, and organizational learning.",
585
+ "prompt": "**DIAGNOSTIC WRITEUP** - Create DIAGNOSTIC_REPORT.md:\n\n1. **Executive Summary**: Bug description, root cause, confidence, scope\n2. **Technical Deep Dive**: Root cause analysis, code locations, execution flow, state\n3. **Investigation Methodology**: Timeline, hypothesis evolution (H1-H5), evidence ratings\n4. **Historical Context**: findSimilarBugs() results, previous fixes, patterns, lessons\n5. **Knowledge Transfer**: Skills needed, prevention measures, action items, testing strategy\n6. **Context Finalization**: updateInvestigationContext('Final'), archive complete context\n\n**Format**: Clear sections, code snippets, 1500-3000 words\n**Goal**: Enable bug fixing, knowledge transfer, and organizational learning",
413
586
  "agentRole": "You are a senior technical writer and diagnostic documentation specialist with expertise in creating comprehensive, actionable bug reports for enterprise environments. Your strength lies in translating complex technical investigations into clear, structured documentation that enables effective problem resolution, knowledge transfer, and organizational learning. You excel at creating reports that serve immediate fixing needs, long-term system improvement, and team collaboration.",
414
587
  "guidance": [
415
588
  "ENTERPRISE FOCUS: Write for multiple stakeholders including developers, managers, and future team members",
@@ -15,37 +15,130 @@
15
15
  "The agent has access to 'create_file', 'edit_file', 'run_terminal_cmd', 'workflow_validate_json', and 'workflow_validate' tools."
16
16
  ],
17
17
  "metaGuidance": [
18
- "PROGRESSIVE LEARNING: Adapt to user experience level. Use learningPath variable for guidance depth - detailed for 'basic', balanced for 'intermediate', expert for 'advanced'.",
19
- "QUALITY FOCUS: All learning paths can produce sophisticated workflows. The difference is in HOW features are taught, not WHICH features are available. Introduce advanced features when the use case demands it, regardless of path.",
18
+ "fun adaptToPath(content) = 'Deliver {content} with appropriate depth based on learningPath: basic=detailed explanations, intermediate=efficient guidance, advanced=expert context.'",
19
+ "fun createWorkflowFile(template) = 'After getting filename, create file using {template}. Use adaptToPath() for explanations as you guide through structure.'",
20
+ "fun runValidation(depth) = 'Execute workflow_validate_json for validation. Use adaptToPath() for error explanations: basic=step-by-step teaching, intermediate=systematic analysis, advanced=architectural review.'",
21
+ "fun agentInstruct(action) = 'Agent: {action}. Confirm with user before proceeding with file modifications.'",
22
+ "fun useEditFile(purpose) = 'Use edit_file for {purpose}. Explain changes and validate with workflow_validate.'",
23
+ "fun teachFeature(feature, context) = 'Introduce {feature} progressively. Use adaptToPath() for explanation depth. Even basic users learn advanced features when {context} demands it.'",
24
+ "fun gatherDiscovery(focus) = 'Ask 2-3 focused questions about {focus}. Store responses in context.discoveryData. Be conversational and build on previous answers.'",
25
+ "fun analyzeGaps() = 'Review discoveryData for missing critical information. Identify next area to explore. Set context.questionFocus for next iteration.'",
26
+ "fun checkConvergence() = 'Evaluate if discovery is sufficient: clear problem, user needs, success criteria defined. Set context.discoveryComplete=true if ready.'",
27
+ "fun analyzeComplexity(data) = 'Score workflow complexity 1-10 based on {data}. Consider: number of steps, decision branches, integrations, error handling needs.'",
28
+ "fun suggestSteps(outline) = 'Expand {outline} into full step with prompt, agentRole, guidance using AI pattern recognition from similar workflows.'",
29
+ "fun visualizeWorkflow() = 'Generate Mermaid diagram showing complete workflow flow, decision points, loops, and context variable usage.'",
30
+ "fun getSchema() = 'Use workflow_get_schema to retrieve current schema. Reference when building workflow structure to ensure compliance.'",
31
+ "PROGRESSIVE LEARNING: adaptToPath() throughout - detailed for basic, balanced for intermediate, expert for advanced.",
32
+ "QUALITY FOCUS: All paths produce sophisticated workflows. Difference is in teaching approach, not final capability.",
20
33
  "The goal is to create a *reusable template*, not a single-use script. Use placeholders like [User provides X] where appropriate.",
21
34
  "Prompts should define goals and roles for the agent, not a rigid script. This allows the agent to use its intelligence to best achieve the task.",
22
- "At each step, the agent should confirm with the user before proceeding with a file modification or command.",
35
+ "agentInstruct() before file modifications or commands.",
23
36
  "Maintain a clear distinction between the workflow being created and this meta-workflow.",
24
37
  "Save progress frequently by confirming file edits.",
25
- "TOOL INTEGRATION: Leverage MCP tools throughout - Use 'workflow_list' and 'workflow_get' for template discovery, 'workflow_validate_json' for comprehensive validation, and 'workflow_validate' for step-by-step output validation.",
26
- "When validation fails, the MCP tools provide detailed error messages and actionable suggestions - use these to guide improvements rather than guessing at fixes.",
27
- "PATH-SPECIFIC GUIDANCE: Tailor explanation depth by learningPath. Basic: detailed explanations. Intermediate: balanced with examples. Advanced: comprehensive with expert context.",
28
- "FEATURE TEACHING: Introduce features progressively within each path. Even basic users should learn about conditional steps and context variables when their workflow needs them - just with more explanation of why and how."
38
+ "TOOL INTEGRATION: Use workflow_list/workflow_get for discovery, runValidation() for comprehensive checks, workflow_validate for step validation.",
39
+ "When validation fails, use detailed error messages to guide improvements rather than guessing at fixes.",
40
+ "teachFeature() progressively within each path based on use case needs.",
41
+ "When you see function calls like adaptToPath() or createWorkflowFile(), refer to the function definitions above for full instructions.",
42
+ "Phase 6 DSL Optimization: For intermediate/advanced users with complex workflows, consider Function Reference Pattern to reduce duplication.",
43
+ "SCHEMA FIRST: Always getSchema() at start of Phase 2 to ensure compliance with current workflow schema version and structure."
29
44
  ],
30
45
  "steps": [
31
46
  {
32
- "id": "phase-0-discovery",
33
- "title": "Phase 0: Comprehensive Discovery & Requirements Analysis",
34
- "prompt": "**STEP 1: Define the Core Problem & Goal**\n\n- **Problem Statement**: What is the specific, recurring task or problem this new workflow will solve?\n- **Primary Objective**: What is the single most important outcome you want users to achieve by completing this workflow?\n- **Critical Failure Mode**: What is the most critical error or negative outcome this workflow is designed to prevent?\n\n**STEP 2: Understand the Users & Context**\n\n- **Target Audience**: Who is this workflow for? What is their role, expertise level, and what do they need to succeed?\n- **Usage Context**: In what situation or environment will this workflow be used? (e.g., during code review, content planning, incident response, customer onboarding).\n\n**STEP 3: Gather Materials & Define Success**\n\n- **Supporting Materials**: What supporting documents, data, or examples can you provide that could inform the design?\n- **Constraints & Requirements**: Are there any specific constraints, required tools, or other absolute requirements to consider?\n- **Success Metrics**: How will you measure if this workflow is successful? What does a high-quality result look like?\n\n**Agent Guidance**: After gathering the user's answers, your goal is to synthesize them into a refined problem statement. Use `workflow_list` and `workflow_get` to find suitable templates based on the problem *structure*, not necessarily the domain, and present your findings for confirmation.",
35
- "agentRole": "You are a workflow requirements analyst and discovery specialist with expertise in understanding complex business processes and user needs. Your primary goal is to ask the user the questions below to understand their needs. Do not answer these questions yourself. Wait for the user's detailed response, then synthesize their answers and confirm your understanding before proceeding. Your role is to ask insightful questions, synthesize information effectively, and identify suitable workflow patterns that can be adapted to new use cases.",
47
+ "id": "phase-0-discovery-loop",
48
+ "type": "loop",
49
+ "title": "Phase 0: Adaptive Discovery & Requirements Analysis",
50
+ "loop": {
51
+ "type": "while",
52
+ "condition": {
53
+ "and": [
54
+ {"var": "discoveryComplete", "not_equals": true},
55
+ {"var": "discoveryIteration", "lt": 5}
56
+ ]
57
+ },
58
+ "maxIterations": 5
59
+ },
60
+ "body": [
61
+ {
62
+ "id": "discovery-iteration",
63
+ "title": "Discovery Iteration: Gathering Requirements",
64
+ "prompt": "gatherDiscovery(current focus area)\n\n**Current Focus:** Understanding the workflow requirements\n\nAgent: For iteration 1, ask the 3 core questions about problem, users, and success. For later iterations, ask targeted follow-ups based on context.questionFocus and context.focusedQuestions.\n\n**Core Questions (Iteration 1):**\n1. What specific, recurring task or problem will this workflow solve?\n2. Who will use this workflow and approximately how often?\n3. What does a successful outcome look like?\n\n**Follow-up Areas (Iterations 2+):**\n- Technical: Integration points, tools, error handling\n- Process: Decision branches, approvals, handoffs\n- Scale: Frequency, user count, growth expectations\n- Constraints: Time limits, compliance, resources",
65
+ "agentRole": "You are an adaptive discovery specialist. For iteration 1, ask the core questions. For subsequent iterations, analyze context.discoveryData to identify gaps and ask targeted follow-ups. Store all responses in context.discoveryData. Be conversational and acknowledge previous answers.",
66
+ "guidance": [
67
+ "ITERATION 1: Focus on problem, users, and success criteria only",
68
+ "ITERATION 2+: Based on problem type, explore: technical requirements OR process flows OR creative iterations OR data handling",
69
+ "BUILD CONTEXT: Reference previous answers when asking follow-ups",
70
+ "STAY FOCUSED: Maximum 3 questions per iteration to avoid overwhelming",
71
+ "CONVERGENCE: After each iteration, analyzeGaps() and checkConvergence()"
72
+ ],
73
+ "requireConfirmation": false
74
+ },
75
+ {
76
+ "id": "discovery-analysis",
77
+ "title": "Analyzing Discovery Progress",
78
+ "prompt": "analyzeGaps() to identify what's still needed.\n\n**AI-Powered Analysis:**\n- Pattern recognition: Based on similar workflows, what requirements are commonly missed?\n- Implicit needs: What hasn't been stated but is likely needed?\n- Optimization opportunities: Where could this workflow benefit from advanced features?\n\nReview the discovery data collected so far and determine:\n1. What critical information is still missing?\n2. Which area should we explore next?\n3. Are we ready to proceed to synthesis?\n\ncheckConvergence() to determine if we have enough information.\n\nAgent: Use pattern recognition to suggest unexplored areas. Set context.questionFocus for next iteration OR set context.discoveryComplete=true if ready. Increment context.discoveryIteration.",
79
+ "agentRole": "You are a discovery analyst. Review all collected data, identify critical gaps, and determine the next area to explore. Common follow-up areas: integration points, error handling, decision branches, constraints, scale/frequency, evolution needs.",
80
+ "guidance": [
81
+ "CHECK COMPLETENESS: Problem clarity, user needs, success criteria, basic constraints",
82
+ "IDENTIFY GAPS: What's critical but missing? Focus on one area at a time",
83
+ "SET FOCUS: Update questionFocus with specific area (e.g., 'integration requirements', 'error handling', 'decision points')",
84
+ "GENERATE QUESTIONS: Set context.focusedQuestions with 2-3 specific questions for next iteration",
85
+ "KNOW WHEN TO STOP: Don't over-analyze; 80% complete is often enough"
86
+ ],
87
+ "requireConfirmation": false
88
+ }
89
+ ]
90
+ },
91
+ {
92
+ "id": "discovery-synthesis",
93
+ "title": "Discovery Synthesis & Template Matching",
94
+ "prompt": "Excellent! I now have a comprehensive understanding of your workflow needs.\n\n**Discovery Summary:**\nAgent: Summarize the key findings from context.discoveryData in a structured format.\n\n**Next Steps:**\n1. I'll synthesize this into a refined problem statement\n2. Use workflow_list and workflow_get to find suitable templates\n3. getSchema() to ensure we build on the latest workflow structure\n4. Present recommendations based on structural patterns\n\nAgent: Create a concise problem statement. Search for workflows with similar patterns (not necessarily same domain). A 'bug investigation' workflow might be perfect for a 'customer complaint' workflow if they share similar structures.",
95
+ "agentRole": "You are a workflow synthesis expert. Compile all discovery data into a clear, actionable problem statement. Use workflow_list and workflow_get to find templates based on structural similarity. Present 2-3 best matches with explanations.",
36
96
  "guidance": [
37
- "BE SPECIFIC: Vague goals lead to vague workflows. The more precise the user's answers are here, the better the final product.",
38
- "THINK ABOUT THE USER: A workflow for an expert looks very different from one for a novice. Encourage the user to be detailed about their audience.",
39
- "REQUEST EXAMPLES: If the user mentions examples of the problem or desired outcome, ask them to share them.",
40
- "TEMPLATE STRATEGY: Focus on structural similarity, not domain similarity, when recommending templates. A good 'bug investigation' workflow might be a great template for a 'customer complaint' workflow."
97
+ "SYNTHESIZE: Create clear, one-paragraph problem statement",
98
+ "PATTERN MATCH: Focus on workflow structure, not domain",
99
+ "RECOMMEND: Present 2-3 template options with rationale",
100
+ "CONFIRM: Get user agreement before proceeding to learning path selection"
41
101
  ],
42
102
  "askForFiles": true,
43
103
  "requireConfirmation": true
44
104
  },
105
+ {
106
+ "id": "phase-1-25-complexity-analysis",
107
+ "title": "Phase 1.25: Workflow Complexity Analysis",
108
+ "prompt": "Based on the discovery data, I'll analyze the workflow complexity to recommend the best learning path.\n\nanalyzeComplexity(discoveryData)\n\n**Complexity Factors Being Evaluated:**\n- Problem domain complexity (technical depth required)\n- Number of potential decision points\n- Integration requirements (tools, systems, APIs)\n- Error handling and recovery needs\n- Data transformation requirements\n- User expertise vs. task complexity gap\n\n**Scoring Guide:**\n- 1-3: Simple, linear workflows (Basic path ideal)\n- 4-7: Moderate complexity with some branching (Intermediate path recommended)\n- 8-10: Complex with multiple integrations/decisions (Advanced path beneficial)\n\nAgent: Calculate complexity score and set context.complexityScore. Determine context.recommendedPath based on score AND user experience. Explain reasoning.",
109
+ "agentRole": "You are a workflow complexity analyst. Evaluate the discovered requirements objectively. Consider technical complexity, decision density, integration needs, and error handling requirements. Balance complexity score with user's stated experience level.",
110
+ "guidance": [
111
+ "OBJECTIVE SCORING: Don't over-complicate simple workflows",
112
+ "FACTOR WEIGHTS: Integrations and branching logic add more complexity than step count",
113
+ "PATH MATCHING: Score 1-3→basic, 4-7→intermediate, 8-10→advanced, but adjust for user experience",
114
+ "EXPLAIN REASONING: Always explain why you scored as you did",
115
+ "SET CONTEXT: Store complexityScore, recommendedPath, and complexityFactors in context"
116
+ ],
117
+ "requireConfirmation": false
118
+ },
119
+ {
120
+ "id": "phase-0-5-research",
121
+ "title": "Phase 0.5: Deep Research & Inspiration Gathering",
122
+ "prompt": "adaptToPath('Enhance your workflow with external knowledge! I'll generate a research prompt for deep analysis on your workflow type's best practices, patterns, and examples. Copy-paste it into ChatGPT/Claude/Gemini, then share the results. We'll synthesize key insights into your workflow ideas.')",
123
+ "agentRole": "You are a research specialist with expertise in generating targeted research prompts and synthesizing external knowledge. Generate a tailored prompt like: 'Research top 10 best practices for [workflow type] workflows, focusing on [problem]. Include real-world examples, step patterns, and innovations. Output in structured format: {insights: [], patterns: [], suggestedSteps: []}'. After user provides results, synthesize into workflow recommendations and store in context.researchInsights.",
124
+ "guidance": [
125
+ "Make research optional for basic path; mandatory for intermediate/advanced.",
126
+ "Store synthesized results in context variable 'researchInsights'.",
127
+ "Generate research prompts that target external best practices and innovations.",
128
+ "Focus on structural patterns that can be adapted to the user's specific domain."
129
+ ],
130
+ "runCondition": {
131
+ "or": [
132
+ {"var": "learningPath", "equals": "intermediate"},
133
+ {"var": "learningPath", "equals": "advanced"}
134
+ ]
135
+ },
136
+ "requireConfirmation": true
137
+ },
45
138
  {
46
139
  "id": "phase-1-assessment",
47
140
  "title": "Phase 1: Personalized Learning Path Selection",
48
- "prompt": "Excellent! With a clear understanding of your goals, let's choose the best way to build your workflow.\n\nPlease select your experience level with workflow creation to get a personalized-path:\n\n\ud83c\udf31 **Basic Path - \"Learn by Doing with Explanation\"**\n - New to workflows or want thorough understanding\n - Step-by-step guidance with detailed explanations\n - Progressive introduction of advanced features with context\n - Focus: Understanding concepts and building confidence\n\n\ud83d\ude80 **Intermediate Path - \"Balanced Guidance with Examples\"**\n - Some experience with automation or process design\n - Structured approach with practical examples\n - Feature recommendations based on your use case\n - Focus: Efficient workflow creation with best practices\n\n\ud83c\udfc6 **Advanced Path - \"Full Features with Expert Context\"**\n - Experienced with workflow/automation tools\n - Comprehensive feature access from the start\n - Architectural guidance and performance considerations\n - Focus: Sophisticated workflow engineering\n\n**Agent Guidance**: Based on the user's selection, set the `learningPath` context variable to 'basic', 'intermediate', or 'advanced'.",
141
+ "prompt": "Excellent! With a clear understanding of your goals, let's choose the best way to build your workflow.\n\n**Complexity Analysis Results:**\nAgent: Display the complexity score, recommended path, and key factors from context.\n\nPlease select your experience level with workflow creation:\n\n\ud83c\udf31 **Basic Path - \"Learn by Doing with Explanation\"**\n - New to workflows or want thorough understanding\n - Step-by-step guidance with detailed explanations\n - Progressive introduction of advanced features with context\n - Focus: Understanding concepts and building confidence\n\n\ud83d\ude80 **Intermediate Path - \"Balanced Guidance with Examples\"**\n - Some experience with automation or process design\n - Structured approach with practical examples\n - Feature recommendations based on your use case\n - Focus: Efficient workflow creation with best practices\n\n\ud83c\udfc6 **Advanced Path - \"Full Features with Expert Context\"**\n - Experienced with workflow/automation tools\n - Comprehensive feature access from the start\n - Architectural guidance and performance considerations\n - Focus: Sophisticated workflow engineering\n\nagentInstruct(set learningPath context variable to user's selection. Highlight which path matches the complexity analysis recommendation.)",
49
142
  "agentRole": "You are a workflow education specialist and learning path advisor with expertise in adapting technical instruction to different experience levels. Your role is to help users choose the most appropriate learning approach based on their background and goals.",
50
143
  "guidance": [
51
144
  "PATH EXPLANATION: Clearly explain what each learning path offers so users can make an informed choice based on the discovery from Phase 0.",
@@ -53,6 +146,19 @@
53
146
  ],
54
147
  "requireConfirmation": true
55
148
  },
149
+ {
150
+ "id": "phase-1-5-ideation",
151
+ "title": "Phase 1.5: Brainstorm & Ideate Workflow Elements",
152
+ "prompt": "teachFeature(brainstorming, using researchInsights). Based on your goals and any research insights gathered, let's brainstorm innovative workflow elements!\n\nI'll generate 3-5 creative ideas for:\n- Unique step sequences that could make your workflow more effective\n- Advanced features that could enhance user experience\n- Innovative patterns adapted from other domains\n- Potential safeguards or quality checks\n\nAfter presenting ideas, we'll discuss pros/cons and refine based on your feedback.",
153
+ "agentRole": "You are an ideation expert specializing in workflow innovation. Use research insights (if available) and cross-domain thinking to propose creative, non-generic workflow designs. Present ideas with clear benefits and trade-offs. Store refined ideas in context.ideationNotes.",
154
+ "guidance": [
155
+ "Integrate 'researchInsights' if available from Phase 0.5",
156
+ "Propose both conventional and unconventional approaches",
157
+ "Explain how each idea addresses the user's specific goals",
158
+ "Store refined/selected ideas in 'ideationNotes' context variable"
159
+ ],
160
+ "requireConfirmation": true
161
+ },
56
162
  {
57
163
  "id": "phase-2-basic",
58
164
  "runCondition": {
@@ -60,14 +166,15 @@
60
166
  "equals": "basic"
61
167
  },
62
168
  "title": "Phase 2: Guided Workflow Creation (Basic Path)",
63
- "prompt": "Let's create your workflow step-by-step with detailed explanations! \ud83c\udf31\n\n**STEP 1: Create Your Workflow File**\nFirst, what would you like to name your new workflow file? (e.g., `my-workflow.json`)\n\n(Agent: After getting the filename, create the file using the template from Phase 1, then explain each field as you help the user fill it out.)\n\n**STEP 2: Build the Structure Together**\nNow, let's go through each part of your workflow one by one:\n\n1. **Basic Info** (`id`, `name`, `version`, `description`)\n - (Agent: For each field, explain why it matters and help the user craft clear, descriptive content.)\n2. **Setup Requirements** (`preconditions`)\n - (Agent: Explain what preconditions are, why they prevent problems, and ask the user what's needed.)\n3. **Global Rules** (`metaGuidance`)\n - (Agent: Explain the difference between global rules and step-specific instructions, and ask for input.)\n4. **The Action Steps** (`steps`)\n - (Agent: Guide the user in creating simple, linear steps first. Then, explain that workflows can also have more advanced features. If their workflow seems to need them, introduce concepts like conditional steps, context variables, and validation with clear examples and ask if they'd like to add them.)\n\n**LEARNING FOCUS:** We'll focus on understanding what each piece does and why it's useful.",
169
+ "prompt": "Let's create your workflow step-by-step with detailed explanations! \ud83c\udf31\n\n**STEP 0: Schema Foundation**\nFirst, I'll getSchema() to ensure we build on the current workflow structure.\n\n**STEP 1: Create Your Workflow File**\nWhat would you like to name your new workflow file? (e.g., `my-workflow.json`)\n\ncreateWorkflowFile(template from Phase 1)\n\n**STEP 2: Build the Structure Together**\nNow, let's go through each part of your workflow one by one:\n\n1. **Basic Info** (`id`, `name`, `version`, `description`)\n - (Agent: For each field, explain why it matters and help the user craft clear, descriptive content.)\n2. **Setup Requirements** (`preconditions`)\n - (Agent: Explain what preconditions are, why they prevent problems, and ask the user what's needed.)\n3. **Global Rules** (`metaGuidance`)\n - (Agent: Explain the difference between global rules and step-specific instructions, and ask for input.)\n4. **The Action Steps** (`steps`)\n - (Agent: Guide the user in creating simple, linear steps first. When user provides step outline, suggestSteps() to expand into full format. Analyze goal/research for needed features (e.g., conditions for branching, loops for repetition) and teachFeature() if appropriate.)\n5. **Quality Integration**\n - (Agent: useTools() to read docs like naming-conventions.md and suggest pattern integrations based on ideationNotes if available.)\n\n**AI ASSISTANCE:** When you describe a step briefly, I'll help expand it into a complete step with prompt, agentRole, and guidance!\n\n**LEARNING FOCUS:** We'll focus on understanding what each piece does and why it's useful.",
64
170
  "agentRole": "You are a patient and thorough workflow education instructor specializing in teaching beginners. Your goal is to guide the user collaboratively. Ask for one piece of information at a time, explain the concepts, and wait for their input before proceeding. Your expertise lies in breaking down complex concepts into understandable steps, providing clear explanations for why each element matters, and building user confidence through hands-on learning.",
65
171
  "guidance": [
66
172
  "EXPLAIN EVERYTHING: This user is learning. Explain the purpose of each JSON field and workflow concept.",
67
173
  "PROGRESSIVE FEATURES: Start with basics, introduce advanced features (conditional steps, context variables) when the use case needs them - with full explanations.",
68
174
  "USE ANALOGIES: Compare workflow concepts to familiar things (recipes, instruction manuals, etc.).",
69
175
  "ENCOURAGE QUESTIONS: Invite the user to ask about anything that's unclear.",
70
- "QUALITY TEACHING: Even though this is the basic path, don't compromise on workflow quality - just explain more."
176
+ "QUALITY TEACHING: Even though this is the basic path, don't compromise on workflow quality - just explain more.",
177
+ "AI ASSIST: When user provides step outline, use suggestSteps() to expand into full format with all required fields"
71
178
  ],
72
179
  "validationCriteria": [
73
180
  {
@@ -100,14 +207,16 @@
100
207
  "equals": "intermediate"
101
208
  },
102
209
  "title": "Phase 2: Structured Workflow Development (Intermediate Path)",
103
- "prompt": "Let's build your workflow with a structured approach and best practices! \ud83d\ude80\n\n**STEP 1: Initialize Workflow File**\nFirst, what would you like to name your new workflow file?\n\n(Agent: After getting the filename, create the file using the template from Phase 1. Use efficient, clear explanations as you guide the user through the core structure.)\n\n**STEP 2: Core Structure Development**\nWe'll build your workflow systematically. For each section, I will explain its purpose and best practices, then ask for your input:\n\n1. **Metadata & Identity** (`id`, `name`, `version`, `description`)\n - Focus on discoverability and clear communication.\n2. **Operational Requirements** (`preconditions`, `metaGuidance`)\n - Define what's needed before starting and establish global rules.\n3. **Step Architecture**\n - Design clear, actionable steps.\n\n**STEP 3: Enhanced Feature Recommendations**\nBased on your workflow's purpose, I may recommend specific features like:\n- **Clarification Prompts**: To gather better input upfront.\n- **Validation Criteria**: For automatic quality checks.\n- **Context Variables**: When you need information to flow between steps.\n- **Conditional Logic**: For workflows with decision points.\n\n(Agent: As you build the steps, proactively recommend these features where they seem appropriate, explaining the benefits.)\n\n**EFFICIENCY FOCUS:** We'll focus on building a professionally structured workflow with the right features for the job.",
210
+ "prompt": "Let's build your workflow with a structured approach and best practices! \ud83d\ude80\n\n**STEP 0: Schema & Foundation**\ngetSchema() to ensure compliance with current workflow structure.\n\n**STEP 1: Initialize Workflow File**\nWhat would you like to name your new workflow file?\n\ncreateWorkflowFile(template from Phase 1)\n\n**STEP 2: Core Structure Development**\nWe'll build your workflow systematically:\n\n1. **Metadata & Identity** (`id`, `name`, `version`, `description`)\n - Focus on discoverability and clear communication.\n2. **Operational Requirements** (`preconditions`, `metaGuidance`)\n - Define what's needed before starting and establish global rules.\n3. **Step Architecture with AI Assistance**\n - Design clear, actionable steps.\n - When you outline a step, I'll suggestSteps() to expand it professionally.\n - Analyze goal/research for needed features (e.g., loops for multi-item processing) and teachFeature() proactively.\n4. **Context Variable Intelligence**\n - I'll analyze your steps and suggest which context variables you'll need for data flow.\n5. **Quality Patterns**\n - useTools() to examine existing workflows and docs to suggest proven patterns from ideationNotes.\n\n**STEP 3: Enhanced Feature Recommendations**\nBased on your workflow's purpose, I may recommend specific features like:\n- **Clarification Prompts**: To gather better input upfront.\n- **Validation Criteria**: For automatic quality checks.\n- **Context Variables**: When you need information to flow between steps.\n- **Conditional Logic**: For workflows with decision points.\n\n(Agent: As you build the steps, proactively recommend these features where they seem appropriate, explaining the benefits. Integrate researchInsights and ideationNotes into recommendations.)\n\n**EFFICIENCY FOCUS:** We'll focus on building a professionally structured workflow with the right features for the job.",
104
211
  "agentRole": "You are an experienced workflow development consultant with expertise in efficient workflow creation and best practices. You will act as a collaborator, guiding the user through the creation process efficiently. Your role is to guide users through structured development while recommending appropriate features and maintaining professional standards.",
105
212
  "guidance": [
106
213
  "STRUCTURED APPROACH: Follow a logical sequence with clear reasoning for each decision.",
107
214
  "FEATURE RECOMMENDATIONS: Suggest appropriate features based on workflow type and use case.",
108
215
  "BEST PRACTICES: Share proven patterns and explain why they work.",
109
216
  "PRACTICAL EXAMPLES: Use real-world scenarios to illustrate concepts.",
110
- "BALANCED DEPTH: Provide enough detail to understand without overwhelming with basics."
217
+ "BALANCED DEPTH: Provide enough detail to understand without overwhelming with basics.",
218
+ "AI OPTIMIZATION: Use suggestSteps() to expand outlines; analyze steps to suggest context variables",
219
+ "PATTERN MATCHING: Leverage AI to identify similar workflow patterns and suggest optimizations"
111
220
  ],
112
221
  "validationCriteria": [
113
222
  {
@@ -140,14 +249,16 @@
140
249
  "equals": "advanced"
141
250
  },
142
251
  "title": "Phase 2: Comprehensive Workflow Architecture (Advanced Path)",
143
- "prompt": "Let's architect a sophisticated workflow with full feature utilization. \ud83c\udfc6\n\n**STEP 1: Rapid File Initialization**\nTo begin, what filename shall we use for the new workflow?\n\n(Agent: After getting the filename, create the file with the template foundation. Then, shift the focus to architectural decisions.)\n\n**STEP 2: Architectural Design Discussion**\nLet's discuss the high-level architecture. For each area, I will present advanced considerations and ask for your design choices:\n\n1. **Core Architecture** (`id`, `name`, `version`, `description`)\n - (Agent: Discuss semantic versioning strategy, namespace considerations, and treating the description as API documentation.)\n2. **Operational Design** (`preconditions`, `metaGuidance`, `contextVariables`)\n - (Agent: Discuss comprehensive precondition modeling, sophisticated metaGuidance for complex scenarios, and context variable architecture for data flow.)\n3. **Advanced Feature Implementation & Step Design**\n - (Agent: Discuss trade-offs and design patterns for conditional logic, validation strategy, user experience, performance, and enterprise-grade features like error handling and scalability.)\n\n**MASTERY FOCUS:** Our goal is to collaborate on creating an enterprise-grade workflow with a sophisticated and intentional architecture.",
252
+ "prompt": "Let's architect a sophisticated workflow with full feature utilization. \ud83c\udfc6\n\n**STEP 0: Schema & Architecture Foundation**\ngetSchema() to ensure we leverage all available features properly.\n\n**STEP 1: Rapid File Initialization**\nWhat filename shall we use for the new workflow?\n\ncreateWorkflowFile(template foundation) and shift to architectural decisions\n\n**STEP 2: Architectural Design with AI Enhancement**\nLet's discuss the high-level architecture:\n\n1. **Core Architecture** (`id`, `name`, `version`, `description`)\n - Semantic versioning strategy, namespace considerations, API documentation approach.\n2. **Operational Design** (`preconditions`, `metaGuidance`, `contextVariables`)\n - Comprehensive precondition modeling, sophisticated metaGuidance patterns.\n - AI-powered context variable architecture based on data flow analysis.\n3. **Advanced Step Design with Optimization**\n - Trade-offs for conditional logic, validation strategy, performance.\n - suggestSteps() with enterprise patterns when you provide high-level design.\n - AI-suggested step ordering for optimal execution flow.\n4. **Innovation Integration**\n - useTools() to analyze existing enterprise workflows.\n - AI-powered pattern recognition for non-obvious optimizations.\n - Leverage researchInsights for cutting-edge approaches.\n\n**AI ARCHITECTURE ASSISTANT:** I'll analyze your design decisions and suggest:\n- Optimal step sequencing\n- Advanced validation criteria\n- Performance optimizations\n- Scalability considerations\n\n**MASTERY FOCUS:** Creating an enterprise-grade workflow with sophisticated, AI-enhanced architecture.",
144
253
  "agentRole": "You are a senior workflow architect and systems design expert with deep expertise in enterprise-grade workflow engineering. Your role is to act as an architectural consultant. You will propose design patterns and discuss trade-offs with the user. Your role is to engage in sophisticated technical discussions, propose advanced design patterns, and help users create workflows that meet enterprise standards for scalability, maintainability, and performance.",
145
254
  "guidance": [
146
255
  "ARCHITECTURAL THINKING: Focus on design patterns, scalability, and maintainability.",
147
256
  "FULL FEATURE ACCESS: Leverage the complete feature set appropriately for the use case.",
148
257
  "PERFORMANCE AWARENESS: Consider efficiency and resource implications.",
149
258
  "ENTERPRISE PATTERNS: Apply proven enterprise workflow patterns.",
150
- "EXPERT CONTEXT: Assume understanding of complex concepts, focus on sophisticated applications."
259
+ "EXPERT CONTEXT: Assume understanding of complex concepts, focus on sophisticated applications.",
260
+ "AI ARCHITECTURE: Use AI to suggest optimal patterns, step ordering, and performance optimizations",
261
+ "INNOVATION ENGINE: Leverage AI pattern recognition for non-obvious workflow enhancements"
151
262
  ],
152
263
  "validationCriteria": [
153
264
  {
@@ -174,13 +285,29 @@
174
285
  "hasValidation": true
175
286
  },
176
287
  {
177
- "id": "phase-3-basic",
288
+ "id": "phase-3-4-refinement-loop",
289
+ "type": "loop",
290
+ "title": "Phases 3-4: Iterative Validation & Refinement Loop",
291
+ "loop": {
292
+ "type": "while",
293
+ "condition": {
294
+ "and": [
295
+ {"var": "satisfactionScore", "lt": 9},
296
+ {"var": "iterationCount", "lt": 3}
297
+ ]
298
+ },
299
+ "maxIterations": 3
300
+ },
301
+
302
+ "body": [
303
+ {
304
+ "id": "phase-3-basic",
178
305
  "runCondition": {
179
306
  "var": "learningPath",
180
307
  "equals": "basic"
181
308
  },
182
309
  "title": "Phase 3: Learning Through Validation (Basic Path)",
183
- "prompt": "Great! Your workflow draft is complete. Now let's make sure it works perfectly! \ud83c\udf31\n\n**UNDERSTANDING VALIDATION:**\nValidation is like proofreading - it catches mistakes before they cause problems. Our validation tool will check for issues with syntax, structure, and logic.\n\n**STEP 1: Run Your First Validation**\nI will now use the `workflow_validate_json` tool to check your workflow. I'll explain what the tool is doing and why each check matters.\n\n(Agent: Run the tool.)\n\n**STEP 2: Learning from Errors (Don't worry - errors are normal!)**\nIf there are errors, I will guide you through them one at a time:\n1. **Explain the error** in simple terms.\n2. **Show you where** the problem is in the file.\n3. **Explain why** it's a problem.\n4. **Help you fix it** step-by-step.\n5. Then, we'll **rerun validation** to see our progress.\n\n**LEARNING GOAL:** Our goal is to understand what makes a workflow valid and why each rule exists. Every error is a learning opportunity!",
310
+ "prompt": "Great! Your workflow draft is complete. Now let's make sure it works perfectly! \ud83c\udf31\n\n**UNDERSTANDING VALIDATION:**\nValidation is like proofreading - it catches mistakes before they cause problems. Our validation tool will check for issues with syntax, structure, and logic.\n\n**STEP 1: Run Your First Validation**\nrunValidation(basic) with detailed explanations\n\n**STEP 2: Learning from Errors (Don't worry - errors are normal!)**\nIf there are errors, I will guide you through them one at a time:\n1. **Explain the error** in simple terms.\n2. **Show you where** the problem is in the file.\n3. **Explain why** it's a problem.\n4. **Help you fix it** step-by-step.\n5. Then, we'll **rerun validation** to see our progress.\n\n**LEARNING GOAL:** Our goal is to understand what makes a workflow valid and why each rule exists. Every error is a learning opportunity!",
184
311
  "agentRole": "You are a supportive workflow validation instructor with expertise in teaching through problem-solving. Your goal is to run the validation tool and then guide the user through fixing any errors one by one, explaining each concept as you go. Your role is to turn validation errors into learning opportunities, explaining technical concepts in accessible terms while building user confidence in workflow creation.",
185
312
  "guidance": [
186
313
  "EDUCATIONAL APPROACH: Treat each error as a teaching moment. Explain what went wrong and why the rule exists.",
@@ -198,7 +325,7 @@
198
325
  "equals": "intermediate"
199
326
  },
200
327
  "title": "Phase 3: Systematic Validation & Quality Assurance (Intermediate Path)",
201
- "prompt": "Time to validate and refine your workflow with systematic quality checks. \ud83d\ude80\n\n**STEP 1: Comprehensive Validation**\nI will now execute `workflow_validate_json` for a complete structural and logical validation of your workflow.\n\n(Agent: Run the tool.)\n\n**STEP 2: Error Pattern Analysis & Resolution**\nWhen issues are found, we will address them systematically:\n1. **Categorize errors** by type (e.g., syntax, logic, reference).\n2. **Identify patterns** that might indicate a recurring misunderstanding.\n3. **Prioritize fixes**, starting with critical syntax errors.\n4. **Apply fixes efficiently**, grouping related changes.\n5. **Re-validate incrementally** to confirm progress.\n\n(Agent: Guide the user through this analysis and resolution process, explaining your reasoning at each step.)\n\n**EFFICIENCY FOCUS:** Our goal is systematic error resolution and pattern recognition for faster iteration.",
328
+ "prompt": "Time to validate and refine your workflow with systematic quality checks. \ud83d\ude80\n\n**STEP 1: Comprehensive Validation**\nrunValidation(intermediate) for systematic analysis\n\n**STEP 2: Error Pattern Analysis & Resolution**\nWhen issues are found, we will address them systematically:\n1. **Categorize errors** by type (e.g., syntax, logic, reference).\n2. **Identify patterns** that might indicate a recurring misunderstanding.\n3. **Prioritize fixes**, starting with critical syntax errors.\n4. **Apply fixes efficiently**, grouping related changes.\n5. **Re-validate incrementally** to confirm progress.\n\n(Agent: Guide the user through this analysis and resolution process, explaining your reasoning at each step.)\n\n**EFFICIENCY FOCUS:** Our goal is systematic error resolution and pattern recognition for faster iteration.",
202
329
  "agentRole": "You are a workflow quality assurance specialist with expertise in systematic validation and error pattern analysis. Your role is to run the validator, analyze the results for patterns, and then guide the user through a systematic fix process. Your role is to efficiently identify and resolve validation issues while teaching users to recognize and prevent common problems in future workflow development.",
203
330
  "guidance": [
204
331
  "SYSTEMATIC APPROACH: Handle errors methodically, grouping similar issues for efficient resolution.",
@@ -216,7 +343,7 @@
216
343
  "equals": "advanced"
217
344
  },
218
345
  "title": "Phase 3: Advanced Validation & Architectural Review (Advanced Path)",
219
- "prompt": "Let's execute a comprehensive validation with an eye for performance and architectural integrity. \ud83c\udfc6\n\n**STEP 1: Comprehensive Technical Validation**\nI will now deploy `workflow_validate_json` for complete validation coverage.\n\n(Agent: Run the tool.)\n\n**STEP 2: Advanced Error Analysis & Architectural Resolution**\nFor any issues found, we will perform a deep analysis:\n1. **Root Cause Analysis**: Let's trace the errors back to their architectural source.\n2. **Impact Assessment**: We'll evaluate the consequences for scalability and maintainability.\n3. **Strategic Resolution**: I will help you fix the underlying patterns, not just the symptoms.\n4. **Architectural Refinement**: We'll ensure the fixes align with our overall design principles.\n\n(Agent: Lead the user through this discussion, proposing and implementing fixes collaboratively.)\n\n**MASTERY FOCUS:** We will use validation feedback to refine architectural decisions and optimize for enterprise deployment.",
346
+ "prompt": "Let's execute a comprehensive validation with an eye for performance and architectural integrity. \ud83c\udfc6\n\n**STEP 1: Comprehensive Technical Validation**\nrunValidation(advanced) for architectural review\n\n**STEP 2: Advanced Error Analysis & Architectural Resolution**\nFor any issues found, we will perform a deep analysis:\n1. **Root Cause Analysis**: Let's trace the errors back to their architectural source.\n2. **Impact Assessment**: We'll evaluate the consequences for scalability and maintainability.\n3. **Strategic Resolution**: I will help you fix the underlying patterns, not just the symptoms.\n4. **Architectural Refinement**: We'll ensure the fixes align with our overall design principles.\n\n(Agent: Lead the user through this discussion, proposing and implementing fixes collaboratively.)\n\n**MASTERY FOCUS:** We will use validation feedback to refine architectural decisions and optimize for enterprise deployment.",
220
347
  "agentRole": "You are a principal workflow architect and validation expert with deep expertise in enterprise-grade quality assurance. Your role is to run the validator and then lead an advanced analysis of the errors, connecting them back to architectural decisions. Your role is to conduct sophisticated technical analysis, identify architectural implications of validation issues, and guide strategic resolution that enhances overall workflow design.",
221
348
  "guidance": [
222
349
  "ARCHITECTURAL PERSPECTIVE: View validation through the lens of overall system design and long-term maintainability.",
@@ -278,10 +405,19 @@
278
405
  "COMPREHENSIVE ANALYSIS: Apply enterprise-grade evaluation frameworks for thorough assessment."
279
406
  ]
280
407
  },
408
+ {
409
+ "id": "satisfaction-check",
410
+ "title": "Iteration Satisfaction Check",
411
+ "prompt": "Let's assess your satisfaction with the workflow so far.\n\n**Rate your satisfaction (1-10):**\n- 10: Perfect! Ready to deploy\n- 8-9: Very good, minor tweaks only\n- 6-7: Good foundation, needs refinement\n- 4-5: Major improvements needed\n- 1-3: Significant rework required\n\nBased on your rating:\n- Score 9+: We'll proceed to completion\n- Score 7-8: For advanced users, checkAutomation(auto-continue)\n- Score <7: We'll iterate again (up to 3 times total)\n\nAgent: Set context.satisfactionScore and increment context.iterationCount",
412
+ "agentRole": "You are a quality assessment specialist. Guide the user through evaluating their workflow objectively. Store the score in context for loop control.",
413
+ "requireConfirmation": false
414
+ }
415
+ ]
416
+ },
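Review note: the loop step added above pairs a `while` condition (`satisfactionScore` below 9 and `iterationCount` below 3) with `maxIterations: 3`, and the `satisfaction-check` step is what updates both variables. The sketch below shows one way such a condition could be evaluated and how the loop would then terminate. It is an illustration under stated assumptions, not WorkRail's engine: the names `evaluate` and `run_loop`, the zero starting values, and the rule that an unset variable fails a comparison are all hypothetical.

```python
from typing import Any, Dict, List

Condition = Dict[str, Any]


def evaluate(condition: Condition, context: Dict[str, Any]) -> bool:
    """Evaluate the condition shapes seen in this diff: and / or / lt / equals."""
    if "and" in condition:
        return all(evaluate(c, context) for c in condition["and"])
    if "or" in condition:
        return any(evaluate(c, context) for c in condition["or"])
    value = context.get(condition["var"])
    if "lt" in condition:
        # Assumption: an unset variable never satisfies a numeric comparison.
        return value is not None and value < condition["lt"]
    if "equals" in condition:
        return value == condition["equals"]
    raise ValueError(f"unsupported condition: {condition!r}")


# Copied from the "loop" block of phase-3-4-refinement-loop.
LOOP_CONDITION: Condition = {
    "and": [
        {"var": "satisfactionScore", "lt": 9},
        {"var": "iterationCount", "lt": 3},
    ]
}
MAX_ITERATIONS = 3


def run_loop(scores: List[int]) -> Dict[str, Any]:
    """Simulate the refinement loop with a scripted list of user ratings."""
    # Assumption: both variables start at 0 so that the first pass is entered.
    context: Dict[str, Any] = {"satisfactionScore": 0, "iterationCount": 0}
    passes = 0
    while passes < MAX_ITERATIONS and evaluate(LOOP_CONDITION, context):
        # The satisfaction-check step stores the rating and bumps the counter.
        context["satisfactionScore"] = scores[passes]
        context["iterationCount"] += 1
        passes += 1
    return context
```

Under these assumptions a rating of 8 still satisfies `lt: 9`, so the loop keeps iterating for scores of 7 or 8 until the iteration cap is reached. The same `var`/`equals` and `or` shapes appear in the `runCondition` blocks elsewhere in this diff.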
281
417
  {
282
418
  "id": "phase-5-completion",
283
419
  "title": "Phase 5: Celebration & Growth",
284
- "prompt": "\ud83c\udf89 **WORKFLOW CREATION COMPLETE!** \ud83c\udf89\n\n**STEP 1: Final Review**\nAgent: Review the workflow's `name` and `description`, then run final validation with `workflow_validate_json`.\n\n**STEP 2: Path-Specific Celebration**\n\nAgent: Provide appropriate celebration based on learning path:\n\n**\ud83c\udf31 BASIC PATH:**\nCongratulations! You've created your first workflow with advanced features!\nLearned: Workflow structure, conditional steps, validation, clear guidance.\nNext: Try intermediate path for advanced patterns and sophisticated testing.\n\n**\ud83d\ude80 INTERMEDIATE PATH:**\nExcellent work! You've created a professionally structured workflow!\nMastered: Efficient authoring, strategic features, systematic validation, design patterns.\nNext: Create domain workflows, contribute templates, explore advanced logic.\n\n**\ud83c\udfc6 ADVANCED PATH:**\nOutstanding! You've demonstrated workflow architecture mastery!\nAchieved: Sophisticated design, enterprise validation, advanced patterns, expert engineering.\nNext: Lead design, contribute advanced templates, mentor others, explore innovations.\n\n**STEP 3: Completion**\nAgent: Confirm workflow is deployment-ready. For basic/intermediate users, offer level-up opportunities for future workflows.\n\n**UNIVERSAL TRUTH:** Workflow mastery continues with each template. Every workflow is an opportunity to improve!",
420
+ "prompt": "\ud83c\udf89 **WORKFLOW CREATION COMPLETE!** \ud83c\udf89\n\n**STEP 1: Final Review**\nagentInstruct(review workflow name and description, then runValidation(final))\n\n**STEP 2: Path-Specific Celebration**\n\nagentInstruct(provide adaptToPath(celebration) based on learning path):\n\n**\ud83c\udf31 BASIC PATH:**\nCongratulations! You've created your first workflow with advanced features!\nLearned: Workflow structure, conditional steps, validation, clear guidance.\nNext: Try intermediate path for advanced patterns and sophisticated testing.\n\n**\ud83d\ude80 INTERMEDIATE PATH:**\nExcellent work! You've created a professionally structured workflow!\nMastered: Efficient authoring, strategic features, systematic validation, design patterns.\nNext: Consider Phase 6 optimization, create domain workflows, explore advanced logic.\n\n**\ud83c\udfc6 ADVANCED PATH:**\nOutstanding! You've demonstrated workflow architecture mastery!\nAchieved: Sophisticated design, enterprise validation, advanced patterns, expert engineering.\nNext: Lead design, contribute advanced templates, mentor others, explore innovations.\n\n**STEP 3: Documentation & Quality Metrics**\ncreateFile(README.md) with usage instructions generated from workflow content.\n\n**Quality Rubric - Rate your workflow:**\n- Clarity: How clear are the instructions? (1-10)\n- Adaptability: How well does it handle edge cases? (1-10)\n- Innovation: How creative is the approach? (1-10)\n- Maintainability: How easy to update? (1-10)\n\nAgent: If any score < 8, suggest specific improvements.\n\n**STEP 4: Completion**\nAgent: Confirm workflow is deployment-ready. For basic/intermediate users, offer level-up opportunities for future workflows.\n\n**UNIVERSAL TRUTH:** Workflow mastery continues with each template. Every workflow is an opportunity to improve!",
285
421
  "guidance": [
286
422
  "ADAPTIVE CELEBRATION: Match the celebration intensity and language to the user's learning path and achievement level.",
287
423
  "GROWTH ORIENTATION: Always provide clear next steps that encourage continued learning and skill development.",
@@ -303,6 +439,40 @@
303
439
  ],
304
440
  "hasValidation": true
305
441
  },
442
+ {
443
+ "id": "phase-5-5-visual-preview",
444
+ "title": "Phase 5.5: Visual Workflow Preview",
445
+ "prompt": "Let's visualize your workflow to see the complete flow! 🎨\n\nvisualizeWorkflow() to generate a comprehensive diagram.\n\n**Creating Mermaid Diagram:**\nI'll generate a flowchart showing:\n- Complete step sequence\n- Decision branches (conditional steps)\n- Loop structures\n- Context variable flow\n- Learning path variations\n\nAgent: Use create_diagram to generate a Mermaid flowchart. Include:\n1. All workflow steps as nodes\n2. Conditional branches with decision labels\n3. Loop indicators with iteration info\n4. Context variables as data flow annotations\n5. Different colors for different phases or complexity levels\n\n**Example Structure:**\n```mermaid\nflowchart TD\n Start([\"🚀 Workflow Start\"])\n Discovery[\"Phase 0: Discovery Loop<br/>Gather Requirements\"]\n Complexity[\"Phase 1.25: Complexity Analysis<br/>Score: context.complexityScore\"]\n PathSelect{{\"Learning Path Selection<br/>Basic/Intermediate/Advanced\"}}\n ...\n End([\"✅ Workflow Complete\"])\n```\n\n**Benefits of Visualization:**\n- See the complete workflow structure at a glance\n- Verify logical flow and dependencies\n- Identify optimization opportunities\n- Great for documentation and team sharing",
446
+ "agentRole": "You are a workflow visualization expert. Create clear, comprehensive Mermaid diagrams that accurately represent the workflow structure, including all paths, conditions, and data flows.",
447
+ "guidance": [
448
+ "COMPLETE COVERAGE: Include every step, even conditional ones",
449
+ "CLEAR LABELING: Use descriptive labels for all nodes and edges",
450
+ "VISUAL HIERARCHY: Use subgraphs for loops and phases",
451
+ "DATA FLOW: Show context variable usage with annotations",
452
+ "ACCESSIBILITY: Ensure diagram is readable and well-organized"
453
+ ],
454
+ "requireConfirmation": false
455
+ },
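Review note: the Phase 5.5 prompt asks the agent to produce a Mermaid flowchart with one node per step and labelled conditional branches. As a rough sketch of what that output might look like when derived from step definitions, the hypothetical helper `to_mermaid` below walks a list of step objects and emits `flowchart TD` text. It is not part of the package, it chains steps linearly rather than modelling real branching or loops, and it only labels simple `var`/`equals` run conditions.

```python
from typing import Any, Dict, List


def node_id(step_id: str) -> str:
    """Keep node ids to letters, digits, and underscores."""
    return step_id.replace("-", "_")


def to_mermaid(steps: List[Dict[str, Any]]) -> str:
    """Render steps as a linear Mermaid flowchart (hypothetical helper)."""
    lines = ["flowchart TD", '    Start(["Workflow Start"])']
    previous = "Start"
    for step in steps:
        current = node_id(step["id"])
        # Double quotes would end the Mermaid label early, so swap them.
        title = step["title"].replace('"', "'")
        lines.append(f'    {current}["{title}"]')
        condition = step.get("runCondition") or {}
        if "var" in condition and "equals" in condition:
            # Conditional step: put the run condition on the incoming edge.
            label = f'{condition["var"]} = {condition["equals"]}'
            lines.append(f"    {previous} -->|{label}| {current}")
        else:
            lines.append(f"    {previous} --> {current}")
        previous = current
    lines.append('    End(["Workflow Complete"])')
    lines.append(f"    {previous} --> End")
    return "\n".join(lines)


if __name__ == "__main__":
    example_steps = [
        {
            "id": "phase-3-basic",
            "title": "Phase 3: Learning Through Validation (Basic Path)",
            "runCondition": {"var": "learningPath", "equals": "basic"},
        },
        {"id": "phase-5-completion", "title": "Phase 5: Celebration & Growth"},
    ]
    print(to_mermaid(example_steps))
```

A fuller version would need subgraphs for loop bodies and separate branches per learning path, as the step's guidance entries request.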
456
+ {
457
+ "id": "phase-6-dsl-optimization",
458
+ "title": "Phase 6: Optimize with Function Reference Pattern (Advanced)",
459
+ "prompt": "**WORKFLOW OPTIMIZATION OPPORTUNITY** 🔧\n\nYour workflow is complete and functional! Now let's make it more maintainable and efficient using the **Function Reference Pattern** - an advanced technique for reducing duplication.\n\n**STEP 1: Analyze for Duplication**\nLet's examine your workflow for repeated instruction patterns:\n- Look for similar phrases across multiple step prompts\n- Identify repeated tool usage instructions (edit_file, workflow_validate, etc.)\n- Find common guidance patterns that appear in multiple places\n\n**STEP 2: Identify Function Candidates**\nBased on your workflow, we might create functions like:\n```\nfun createFile(filename) = 'Use edit_file to create {filename}. Explain the structure as you build it.'\nfun runValidation() = 'Execute workflow_validate_json. Explain any errors found and guide through fixes step-by-step.'\nfun adaptToPath(content) = 'Deliver {content} with appropriate depth based on learningPath: basic=detailed explanations, intermediate=efficient guidance, advanced=expert context.'\n```\n\n**STEP 3: Implementation Decision**\nFunction references work best for workflows with:\n✅ 10+ steps with shared patterns\n✅ Repeated instruction blocks\n✅ Long prompts hitting character limits\n✅ Team workflows needing consistency\n\n**Agent Instructions:**\n1. Analyze the created workflow for duplication patterns\n2. If suitable (complex workflow with repeated patterns), propose specific function definitions\n3. Show examples of how 2-3 steps would look with function references\n4. Calculate potential file size reduction\n5. Let user decide whether to implement this optimization\n\nThis is an **optional advanced technique** - the workflow is already complete and functional!",
460
+ "agentRole": "You are a workflow optimization specialist with expertise in the Function Reference Pattern (DSL approach). Your role is to analyze the completed workflow for duplication opportunities and guide the user through the decision of whether to apply this advanced optimization technique.",
461
+ "guidance": [
462
+ "PATTERN RECOGNITION: Look for repeated instruction blocks, tool usage patterns, and similar guidance across steps.",
463
+ "BENEFIT ANALYSIS: Calculate concrete benefits (file size reduction, consistency improvement, maintenance ease).",
464
+ "OPTIONAL OPTIMIZATION: Make it clear this is an advanced, optional technique - the workflow already works.",
465
+ "PRACTICAL EXAMPLES: Show concrete before/after examples from their actual workflow.",
466
+ "USER CHOICE: Let them decide based on their workflow complexity and maintenance needs."
467
+ ],
468
+ "runCondition": {
469
+ "or": [
470
+ {"var": "learningPath", "equals": "intermediate"},
471
+ {"var": "learningPath", "equals": "advanced"}
472
+ ]
473
+ },
474
+ "requireConfirmation": true
475
+ },
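Review note: the Function Reference Pattern introduced in Phase 6 defines short `fun name(param) = '...{param}...'` templates and then refers to them from step prompts, for example `createFile(README.md)`. Nothing in this diff indicates that the server expands these references itself; they may simply be read by the agent. The sketch below only illustrates the substitution the pattern implies. The helpers `parse_definition` and `expand` and both regular expressions are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Matches: fun name(param1, param2) = 'template body'
DEFINITION = re.compile(r"fun\s+(\w+)\(([^)]*)\)\s*=\s*'(.*)'\s*$", re.DOTALL)
# Matches a non-nested call such as createFile(README.md)
CALL = re.compile(r"\b(\w+)\(([^()]*)\)")


def parse_definition(text: str) -> Tuple[str, List[str], str]:
    """Split one definition string into (name, parameter names, body)."""
    match = DEFINITION.search(text)
    if match is None:
        raise ValueError(f"not a function definition: {text!r}")
    name, params, body = match.groups()
    names = [p.strip() for p in params.split(",") if p.strip()]
    return name, names, body


def expand(prompt: str, functions: Dict[str, Tuple[List[str], str]]) -> str:
    """Replace known calls in a prompt with their filled-in template body."""

    def substitute(match: re.Match) -> str:
        name, raw_args = match.group(1), match.group(2)
        if name not in functions:
            return match.group(0)  # leave ordinary parenthesised text alone
        params, body = functions[name]
        args = [a.strip() for a in raw_args.split(",")] if raw_args.strip() else []
        if len(args) != len(params):
            return match.group(0)  # argument count mismatch: do not guess
        for param, arg in zip(params, args):
            body = body.replace("{" + param + "}", arg)
        return body

    return CALL.sub(substitute, prompt)


if __name__ == "__main__":
    name, params, body = parse_definition(
        "fun createFile(filename) = 'Use edit_file to create {filename}. "
        "Explain the structure as you build it.'"
    )
    print(expand("STEP 3: createFile(README.md)", {name: (params, body)}))
```

Nested references such as `agentInstruct(... runValidation(final))` in the Phase 5 prompt are outside what this single-pass sketch handles.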
306
476
  {
307
477
  "id": "level-up-opportunity",
308
478
  "runCondition": {