@exaudeus/workrail 0.7.2-beta.2 → 0.7.2-beta.4

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@exaudeus/workrail",
- "version": "0.7.2-beta.2",
+ "version": "0.7.2-beta.4",
  "description": "MCP server for structured workflow orchestration and step-by-step task guidance",
  "license": "MIT",
  "bin": {
@@ -1,5 +1,30 @@
  # Changelog - Systematic Bug Investigation Workflow
 
+ ## [1.1.0-beta.22] - 2025-01-06
+
+ ### CRITICAL FIX - Invalid Loop Step Schema
+ - **ROOT CAUSE**: In beta.19, we added `guidance` to the loop step, but loop steps DON'T support guidance in the schema
+   - Schema allows: `id`, `type`, `title`, `loop`, `body`, `functionDefinitions`, `requireConfirmation`, `runCondition`
+   - Does NOT allow: `guidance`, `prompt`, `agentRole`
+ - **Fix**: Moved loop enforcement guidance to first body step (`analysis-neighborhood-contracts`)
+   - "USER SAYS: This loop MUST complete ALL 5 iterations..."
+   - Now properly enforced on each iteration
+ - **Validation**: ✅ Workflow now passes full schema validation
+
+ ### Why This Matters
+ Without proper validation, the MCP server couldn't load the workflow at all. Beta.19-21 were broken due to schema violations.
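
The schema constraint described in this entry can be illustrated with a small check. This is a minimal sketch, not WorkRail's actual validator: the allowed-key list is copied from the changelog bullet above, while the helper name `findInvalidLoopStepKeys`, the `type: "loop"` discriminator value, and the sample step are assumptions made for illustration.

```typescript
// Keys the changelog says the loop step schema allows.
// Copied from the changelog entry, not from the real schema file.
const ALLOWED_LOOP_STEP_KEYS: ReadonlySet<string> = new Set([
  "id",
  "type",
  "title",
  "loop",
  "body",
  "functionDefinitions",
  "requireConfirmation",
  "runCondition",
]);

// Hypothetical helper: returns the keys on a loop step that the schema would reject.
// Assumes loop steps are marked with type === "loop"; other step types are not checked here.
function findInvalidLoopStepKeys(step: Record<string, unknown>): string[] {
  if (step["type"] !== "loop") {
    return [];
  }
  return Object.keys(step).filter((key) => !ALLOWED_LOOP_STEP_KEYS.has(key));
}

// Hypothetical step shaped like the mistake described above: guidance placed on the loop itself.
const brokenLoopStep: Record<string, unknown> = {
  id: "example-analysis-loop", // illustrative id, not taken from the workflow
  type: "loop",
  title: "Example loop",
  loop: {},
  body: [],
  requireConfirmation: false,
  guidance: ["This loop MUST complete ALL 5 iterations."],
};

const invalidKeys = findInvalidLoopStepKeys(brokenLoopStep);
console.log(invalidKeys); // the only key outside the allowed set is "guidance"
```

A check like this only covers top-level keys; the real schema presumably also constrains the shape of `loop` and `body`, which this sketch does not attempt.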
+
+ ## [1.1.0-beta.21] - 2025-01-06
+
+ ### HOTFIX - metaGuidance Schema Violations
+ - **Fixed**: metaGuidance entry 35 exceeded 256 character limit (266 chars)
+   - Split "HIGH AUTO MODE DISCIPLINE" into 3 separate entries
+ - **Fixed**: Duplicate metaGuidance entries after split
+   - Removed duplicates, cleaned to 89 unique entries
+ - **Note**: CLI validator reports loop step errors (false positive - loops have different schema)
+   - Workflow loads successfully in MCP server
+   - Same loop structure as beta.18 which worked fine
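
The two metaGuidance problems named in this entry (an entry over the 256 character limit, and duplicates left behind after a split) could be caught with a check along these lines. This is a sketch under stated assumptions: the limit value comes from the changelog, but the function name `checkMetaGuidance`, the issue shape, and the use of JavaScript string length (UTF-16 code units) as the character count are illustrative and may differ from how the real schema validator counts.

```typescript
// Limit quoted in the changelog entry above.
const MAX_META_GUIDANCE_LENGTH = 256;

// Hypothetical report shape for this sketch.
interface MetaGuidanceIssue {
  index: number;
  problem: "too-long" | "duplicate";
  length: number;
}

// Hypothetical helper: flags entries over the limit and exact repeats of earlier entries.
// Note: entry.length counts UTF-16 code units, which is an assumption about the limit's unit.
function checkMetaGuidance(entries: string[]): MetaGuidanceIssue[] {
  const issues: MetaGuidanceIssue[] = [];
  const seen = new Set<string>();
  entries.forEach((entry, index) => {
    if (entry.length > MAX_META_GUIDANCE_LENGTH) {
      issues.push({ index, problem: "too-long", length: entry.length });
    }
    if (seen.has(entry)) {
      issues.push({ index, problem: "duplicate", length: entry.length });
    }
    seen.add(entry);
  });
  return issues;
}

// Made-up sample data: one over-long entry and one repeated heading.
const sampleEntries: string[] = [
  "**HIGH AUTO MODE DISCIPLINE:**",
  "x".repeat(266),
  "**HIGH AUTO MODE DISCIPLINE:**",
];

const metaGuidanceIssues = checkMetaGuidance(sampleEntries);
for (const issue of metaGuidanceIssues) {
  console.log(`entry ${issue.index}: ${issue.problem} (${issue.length} chars)`);
}
```

Running a check like this before publishing would surface both classes of violation in one pass instead of across two releases.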
27
+
3
28
  ## [1.1.0-beta.20] - 2025-01-06
4
29
 
5
30
  ### CRITICAL FIX - Dangerous "Autonomy" Language
@@ -1,7 +1,7 @@
  {
  "id": "systematic-bug-investigation-with-loops",
  "name": "Systematic Bug Investigation Workflow",
- "version": "1.1.0-beta.20",
+ "version": "1.1.0-beta.22",
  "description": "A comprehensive workflow for systematic bug and failing test investigation that prevents LLMs from jumping to conclusions. Enforces thorough evidence gathering, hypothesis formation, debugging instrumentation, and validation to achieve near 100% certainty about root causes. This workflow does NOT fix bugs - it produces detailed diagnostic writeups that enable effective fixing by providing complete understanding of what is happening, why it's happening, and supporting evidence.",
  "clarificationPrompts": [
  "What type of system is this? (web app, mobile app, backend service, desktop app, etc.)",
@@ -56,7 +56,8 @@
  "DO NOT SKIP PHASES: Even with high confidence, you must complete hypothesis generation (Phase 2), instrumentation (Phase 3), evidence collection (Phase 4), analysis (Phase 5), and writeup (Phase 6).",
  "PHASE PROGRESSION: An investigation that stops at triage (Phase 0) or hypothesis formation (Phase 2) or evidence collection (Phase 4) is INCOMPLETE - the diagnostic writeup is the required deliverable.",
  "**HIGH AUTO MODE DISCIPLINE:**",
- "**HIGH AUTO MODE DISCIPLINE:**\nIn HIGH automation mode, agents must execute phases WITHOUT asking permission between phases. This means: proceed automatically from Phase 1\u21922\u21923\u21924\u21925\u21926. HIGH AUTO \u2260 PERMISSION TO SKIP PHASES. HIGH AUTO = NO INTERRUPTIONS, NOT NO PHASES.",
+ "In HIGH automation mode, agents must execute phases WITHOUT asking permission between phases. This means: proceed automatically from Phase 1\u21922\u21923\u21924\u21925\u21926.",
+ "HIGH AUTO \u2260 PERMISSION TO SKIP PHASES. HIGH AUTO = NO INTERRUPTIONS, NOT NO PHASES.",
  "**CRITICAL: HIGH AUTOMATION \u2260 AUTONOMY TO SKIP:**",
  "USER SAYS: 'High automation mode' means you DON'T ASK PERMISSION. It does NOT mean you have autonomy to decide which phases to skip.",
  "High auto = Faster execution of ALL phases. NOT = Smarter agent gets to skip phases it thinks are unnecessary.",
@@ -164,6 +165,10 @@
  "prompt": "**NEIGHBORHOOD & CONTRACTS DISCOVERY - Build Structural Foundation**\n\nGoal: Build lightweight understanding of code structure, relationships, and contracts BEFORE diving into details. This provides the scaffolding for all subsequent analysis.\n\n**STEP 1: Compute Module Root**\n- Find nearest common ancestor of error stack trace files\n- Clamp to package boundary or src/ directory\n- This defines your investigation scope\n- Set `moduleRoot` context variable\n\n**STEP 2: Neighborhood Map** (cap per file to prevent analysis paralysis)\n- For each file in error stack trace:\n - List immediate neighbors (same directory, max 8)\n - Find imports/exports directly used (max 10)\n - Locate co-located tests (same name pattern)\n - Identify closest entry points: routes, endpoints, CLI commands (max 5)\n- Produce table: File | Neighbors | Tests | Entry Points\n\n**STEP 3: Bounded Call Graph** (Small Multiples with HOT Path Ranking)\n- For each failing function/class in stack trace:\n - Build call graph \u22642 hops deep (inbound and outbound)\n - Cap total nodes at \u226415 per failing symbol\n - Score edges for HOT path ranking:\n * Error location in path: +3\n * Entry point to path: +2 \n * Test coverage exists: +1\n * Mentioned in ticket/error message: +1\n - Tag paths as HOT if score \u22653\n - Use Small Multiples ASCII visualization:\n * Width \u2264100 chars per path\n * Format: `EntryPoint -> Caller -> [*FailingSymbol*] -> Callee`\n * Mark changed/failing code as `[*name*]`\n * Add HOT tag for high-impact paths\n * \u22648 total paths, prioritize HOT paths first\n - If graph exceeds caps, use Adjacency Summary instead:\n * Table: Node | Inbound | Outbound | Notes\n * Top-K by degree/frequency\n- Create Alias Legend for repeated subpaths:\n * A1 = common.validation.validateInput\n * A2 = database.connection.getPool\n * Reuse aliases across all paths\n\n**STEP 4: Flow Anchors** (Entry Points to Bug)\n- Map how users/systems trigger the bug:\n - HTTP routes 
\u2192 handlers \u2192 failing code\n - CLI commands \u2192 execution \u2192 failing code \n - Scheduled jobs \u2192 workers \u2192 failing code\n - Event handlers \u2192 callbacks \u2192 failing code\n- Produce table: Anchor Type | Entry Point | Target Symbol | User Action\n- Cap at \u22645 most relevant anchors\n- Note: This tells us HOW the bug is reached\n\n**STEP 5: Contracts & Invariants**\n- Within `moduleRoot` and immediate neighbors:\n - List public API symbols (exported functions/classes)\n - Document API endpoints (REST/GraphQL/RPC)\n - Identify database tables/collections touched\n - Note message queue topics/events\n - Extract stated invariants from:\n * JSDoc/docstrings with @invariant\n * Assertions in code\n * Validation logic patterns\n * Comments describing guarantees\n- Produce table: Symbol/API | Contract | Invariant | Location\n- Focus on contracts related to failing code\n\n**STEP 6: Assumption Verification** (NOW that you've seen the code)\nNow that you understand the code structure, verify assumptions from the bug report:\n\n1. **Bug Report Assumptions**:\n - Is the described behavior actually a bug, or might it be expected based on what you've seen?\n - Are the reproduction steps accurate given the code paths you've mapped?\n - Is the error message consistent with the actual code flow?\n - Are there missing steps or context in the bug report?\n\n2. **API/Library Assumptions**:\n - Check documentation for any APIs/libraries mentioned in stack trace\n - Verify actual behavior vs assumed behavior\n - Note any version-specific behavior that might matter\n\n3. **Environment Assumptions**:\n - Based on code, could this be environment-specific?\n - Are there configuration dependencies visible in the code?\n - Could timing/concurrency be a factor (based on code structure)?\n\n4. 
**Recent Changes Impact**:\n - Review last 5 commits affecting the failing code\n - Do they relate to the bug or point to alternative causes?\n\n**Document**: Create AssumptionVerification.md with verified/challenged assumptions.\n\n---\n\n**OUTPUT: Create StructuralAnalysis.md with:**\n- Module Root declaration\n- Neighborhood Map table\n- Bounded Call Graph (Small Multiples ASCII or Adjacency Summary)\n- Alias Legend (for call graph subpaths)\n- Flow Anchors table\n- Contracts & Invariants table\n- Self-Critique: 1-2 areas of uncertainty\n\n**CAPS (strictly enforce to prevent analysis paralysis):**\n- \u22648 neighbors per file\n- \u226410 imports per file\n- \u22645 entry points total\n- \u226415 call graph nodes per failing symbol\n- \u22648 total call graph paths\n- \u22645 flow anchors\n- \u2264100 chars width for ASCII paths",
  "agentRole": "You are a codebase navigator building structural understanding. Your focus is mapping relationships, entry points, and contracts WITHOUT diving into implementation details yet.",
  "guidance": [
+ "\ud83d\udea8 USER SAYS: This loop MUST complete ALL 5 iterations. Do NOT exit early even if you think you found the bug.",
+ "DO NOT rationalize: 'I have high confidence so I can do a targeted Phase 2.' NO. Complete all 5 iterations FIRST.",
+ "Agents who skip analysis iterations are wrong ~95% of the time. The later iterations catch edge cases and alternative explanations.",
+ "Iteration 2/5 is NOT enough. Iteration 3/5 is NOT enough. Complete 5/5.",
  "This is analysis phase 1 of 5 total phases",
  "Phase 1a = Structure + Assumption Verification - Build the map, THEN question the bug report",
  "Initialize majorIssuesFound = false",
@@ -287,13 +292,7 @@
  "requireConfirmation": false
  }
  ],
- "requireConfirmation": false,
- "guidance": [
- "\ud83d\udea8 USER SAYS: This loop MUST complete ALL 5 iterations. Do NOT exit early even if you think you found the bug.",
- "DO NOT rationalize: 'I have high confidence so I can do a targeted Phase 2.' NO. Complete all 5 iterations FIRST.",
- "Agents who skip analysis iterations are wrong ~95% of the time. The later iterations catch edge cases and alternative explanations.",
- "Iteration 2/5 is NOT enough. Iteration 3/5 is NOT enough. Complete 5/5."
- ]
+ "requireConfirmation": false
  },
  {
  "id": "phase-1a-binary-search",