@exaudeus/workrail 0.7.2-beta.1 → 0.7.2-beta.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json
CHANGED
|
@@ -1,5 +1,27 @@
|
|
|
1
1
|
# Changelog - Systematic Bug Investigation Workflow
|
|
2
2
|
|
|
3
|
+
## [1.1.0-beta.20] - 2025-01-06
|
|
4
|
+
|
|
5
|
+
### CRITICAL FIX - Dangerous "Autonomy" Language
|
|
6
|
+
- **ROOT CAUSE IDENTIFIED**: Our automation level descriptions were giving agents permission to skip!
|
|
7
|
+
- OLD: "High=**auto-approve >8.0 confidence decisions**" ❌
|
|
8
|
+
- Interpreted as: "I have 9/10 confidence → I can approve my decision to skip phases"
|
|
9
|
+
- OLD: "Control workflow **autonomy**" ❌
|
|
10
|
+
- Interpreted as: "High mode gives me autonomy to decide what to skip"
|
|
11
|
+
|
|
12
|
+
### Language Fixes
|
|
13
|
+
1. **Removed "auto-approve decisions"**: Changed to "execute phases automatically WITHOUT asking permission between phases"
|
|
14
|
+
2. **Removed "autonomy"**: Changed to "Control confirmation frequency"
|
|
15
|
+
3. **Clarified HIGH AUTO MODE**:
|
|
16
|
+
- NEW: "HIGH AUTO = NO INTERRUPTIONS, NOT NO PHASES"
|
|
17
|
+
- NEW: "HIGH AUTO ≠ PERMISSION TO SKIP PHASES"
|
|
18
|
+
4. **Explicit USER SAYS**:
|
|
19
|
+
- "USER SAYS: 'High automation mode' means you DON'T ASK PERMISSION. It does NOT mean you have autonomy to decide which phases to skip."
|
|
20
|
+
- "High auto = Faster execution of ALL phases. NOT = Smarter agent gets to skip phases."
|
|
21
|
+
|
|
22
|
+
### Credit
|
|
23
|
+
User insight: "Could the high automation be causing it to do this? do we frame it as letting it do whatever it wants?" - YES, we were!
|
|
24
|
+
|
|
3
25
|
## [1.1.0-beta.19] - 2025-01-06
|
|
4
26
|
|
|
5
27
|
### CRITICAL FIX - Anti-Rationalization
|
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"id": "systematic-bug-investigation-with-loops",
|
|
3
3
|
"name": "Systematic Bug Investigation Workflow",
|
|
4
|
-
"version": "1.1.0-beta.
|
|
4
|
+
"version": "1.1.0-beta.20",
|
|
5
5
|
"description": "A comprehensive workflow for systematic bug and failing test investigation that prevents LLMs from jumping to conclusions. Enforces thorough evidence gathering, hypothesis formation, debugging instrumentation, and validation to achieve near 100% certainty about root causes. This workflow does NOT fix bugs - it produces detailed diagnostic writeups that enable effective fixing by providing complete understanding of what is happening, why it's happening, and supporting evidence.",
|
|
6
6
|
"clarificationPrompts": [
|
|
7
7
|
"What type of system is this? (web app, mobile app, backend service, desktop app, etc.)",
|
|
@@ -56,7 +56,10 @@
|
|
|
56
56
|
"DO NOT SKIP PHASES: Even with high confidence, you must complete hypothesis generation (Phase 2), instrumentation (Phase 3), evidence collection (Phase 4), analysis (Phase 5), and writeup (Phase 6).",
|
|
57
57
|
"PHASE PROGRESSION: An investigation that stops at triage (Phase 0) or hypothesis formation (Phase 2) or evidence collection (Phase 4) is INCOMPLETE - the diagnostic writeup is the required deliverable.",
|
|
58
58
|
"**HIGH AUTO MODE DISCIPLINE:**",
|
|
59
|
-
"
|
|
59
|
+
"**HIGH AUTO MODE DISCIPLINE:**\nIn HIGH automation mode, agents must execute phases WITHOUT asking permission between phases. This means: proceed automatically from Phase 1\u21922\u21923\u21924\u21925\u21926. HIGH AUTO \u2260 PERMISSION TO SKIP PHASES. HIGH AUTO = NO INTERRUPTIONS, NOT NO PHASES.",
|
|
60
|
+
"**CRITICAL: HIGH AUTOMATION \u2260 AUTONOMY TO SKIP:**",
|
|
61
|
+
"USER SAYS: 'High automation mode' means you DON'T ASK PERMISSION. It does NOT mean you have autonomy to decide which phases to skip.",
|
|
62
|
+
"High auto = Faster execution of ALL phases. NOT = Smarter agent gets to skip phases it thinks are unnecessary.",
|
|
60
63
|
"are: (1) Phase 0e early termination decision, (2) Phase 4a controlled experiments. All other phases execute automatically based on the systematic workflow structure.",
|
|
61
64
|
"**FUNCTION DEFINITIONS:**",
|
|
62
65
|
"fun instrumentCode(location, hypothesis) = 'Add debug logs at {location} for {hypothesis}. Format: ClassName.method [{hypothesis}]: message. Include timestamp, thread ID if concurrent.'",
|
|
@@ -93,7 +96,7 @@
|
|
|
93
96
|
"LOG ANALYSIS OFFLOADING: For voluminous logs (>500 lines), offload analysis to sub-chats with structured prompts. See Phase 4 for detailed sub-analysis implementation.",
|
|
94
97
|
"RECURSION DEPTH: Limit recursive analysis to 3 levels deep to prevent analysis paralysis while ensuring thoroughness.",
|
|
95
98
|
"INVESTIGATION BOUNDS: If investigation exceeds 20 steps or 4 hours without root cause, pause and reassess approach with user.",
|
|
96
|
-
"AUTOMATION LEVELS: High=
|
|
99
|
+
"AUTOMATION LEVELS: High=execute phases automatically WITHOUT asking permission between phases (but MUST complete ALL phases), Medium=standard confirmations, Low=extra confirmations for safety.",
|
|
97
100
|
"CONTEXT DOCUMENTATION: Maintain INVESTIGATION_CONTEXT.md throughout. Update after major milestones, failures, or user interventions to enable seamless resumption and handoffs. Include explicit resumption instructions using workflow_get and workflow_next.",
|
|
98
101
|
"GIT FALLBACK STRATEGY: If git unavailable, gracefully skip commits/branches, log changes manually in CONTEXT.md with timestamps, warn user, document modifications for manual control.",
|
|
99
102
|
"GIT ERROR HANDLING: Use run_terminal_cmd for git operations; if fails, output exact command for user manual execution. Never halt investigation due to git unavailability.",
|