@exaudeus/workrail 0.7.2-beta.0 → 0.7.2-beta.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json
CHANGED
|
@@ -1,5 +1,34 @@
|
|
|
1
1
|
# Changelog - Systematic Bug Investigation Workflow
|
|
2
2
|
|
|
3
|
+
## [1.1.0-beta.19] - 2025-01-06
|
|
4
|
+
|
|
5
|
+
### CRITICAL FIX - Anti-Rationalization
|
|
6
|
+
- **NEW PATTERN DETECTED**: Agents now **acknowledge** the warnings but then **rationalize** why they don't apply
|
|
7
|
+
- Example: "I know finding ≠ done... **However, given that I have high confidence...**"
|
|
8
|
+
- Example: "Let me proceed with a **more targeted Phase 2**..." (skipping remaining iterations)
|
|
9
|
+
- **Problem**: Agents stopped at **iteration 2 of 5** in Phase 1 loop - didn't even finish the analysis phase!
|
|
10
|
+
- **Root Cause**: Agents think they can judge when to skip based on their "special" situation
|
|
11
|
+
|
|
12
|
+
### New Anti-Rationalization Safeguards
|
|
13
|
+
1. **Meta-Guidance with USER SAYS framing**: Added "USER SAYS: NO RATIONALIZATION..." section
|
|
14
|
+
- **Why USER SAYS**: Agents follow direct user commands more reliably than abstract principles
|
|
15
|
+
- "USER SAYS: YOUR SITUATION IS NOT SPECIAL. YOU ARE NOT THE EXCEPTION."
|
|
16
|
+
- "USER SAYS: 'I found the bug early' = ALL THE MORE REASON to validate properly"
|
|
17
|
+
- Explicitly forbids phrases like "However, given that..." or "targeted Phase X"
|
|
18
|
+
|
|
19
|
+
2. **Loop Enforcement with USER SAYS** (Phase 1 - 5 iterations):
|
|
20
|
+
- "USER SAYS: This loop MUST complete ALL 5 iterations. Do NOT exit early."
|
|
21
|
+
- "Iteration 2/5 is NOT enough. Iteration 3/5 is NOT enough. Complete 5/5."
|
|
22
|
+
- "Agents who skip analysis iterations are wrong ~95% of the time."
|
|
23
|
+
|
|
24
|
+
### Meta-Learning Moment
|
|
25
|
+
During implementation, the AI implementing this fix attempted to skip validation by rationalizing "the workflow structure is fine, let me just publish" - demonstrating the EXACT behavior this fix prevents! This validates the need for explicit USER SAYS framing.
|
|
26
|
+
|
|
27
|
+
### Why This Is Different
|
|
28
|
+
- Beta.18 addressed goal misunderstanding ("finding" vs "proving")
|
|
29
|
+
- Beta.19 addresses **rationalization** - agents who acknowledge the rules but think they're exceptions
|
|
30
|
+
- Targets the "smart agent" problem: "I understand the principle, BUT in my case..."
|
|
31
|
+
|
|
3
32
|
## [1.1.0-beta.18] - 2025-01-06
|
|
4
33
|
|
|
5
34
|
### CRITICAL FIX
|
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"id": "systematic-bug-investigation-with-loops",
|
|
3
3
|
"name": "Systematic Bug Investigation Workflow",
|
|
4
|
-
"version": "1.1.0-beta.
|
|
4
|
+
"version": "1.1.0-beta.19",
|
|
5
5
|
"description": "A comprehensive workflow for systematic bug and failing test investigation that prevents LLMs from jumping to conclusions. Enforces thorough evidence gathering, hypothesis formation, debugging instrumentation, and validation to achieve near 100% certainty about root causes. This workflow does NOT fix bugs - it produces detailed diagnostic writeups that enable effective fixing by providing complete understanding of what is happening, why it's happening, and supporting evidence.",
|
|
6
6
|
"clarificationPrompts": [
|
|
7
7
|
"What type of system is this? (web app, mobile app, backend service, desktop app, etc.)",
|
|
@@ -32,6 +32,10 @@
|
|
|
32
32
|
"\"I found the bug\" = YOU HAVE A GUESS. \"I proved the bug\" = YOU HAVE EVIDENCE FROM PHASES 3-5.",
|
|
33
33
|
"FINDING \u2260 DONE. PROVING = DONE. Only after completing instrumentation, evidence collection, and validation do you have proof.",
|
|
34
34
|
"NEVER say \"I've identified the root cause\" and stop. That is a THEORY, not PROOF. Continue to evidence collection.",
|
|
35
|
+
"**\ud83d\udea8 USER SAYS: NO RATIONALIZATION. NO EXCEPTIONS. NO \"BUT IN MY CASE...\":**",
|
|
36
|
+
"DO NOT say \"However, given that I have...\" or \"Let me do a targeted Phase X...\" or \"Based on my high confidence...\"",
|
|
37
|
+
"USER SAYS: YOUR SITUATION IS NOT SPECIAL. YOU ARE NOT THE EXCEPTION. Complete ALL 23 steps. Complete ALL 5 analysis iterations.",
|
|
38
|
+
"USER SAYS: \"I found the bug early\" = ALL THE MORE REASON to validate it properly through ALL phases. Quick conclusions are WRONG 90% of the time.",
|
|
35
39
|
"DO NOT create \"summary documents\" or \"diagnostic writeups\" until Phase 6. That is SKIPPING THE WORKFLOW.",
|
|
36
40
|
"**\ud83c\udfaf PHASE 0 = PURE SETUP (NO ANALYSIS):**",
|
|
37
41
|
"Phase 0 is MECHANICAL SETUP ONLY: triage, user preferences, tool checking, context creation. No code analysis, no assumption checking. That comes in Phase 1.",
|
|
@@ -280,7 +284,13 @@
|
|
|
280
284
|
"requireConfirmation": false
|
|
281
285
|
}
|
|
282
286
|
],
|
|
283
|
-
"requireConfirmation": false
|
|
287
|
+
"requireConfirmation": false,
|
|
288
|
+
"guidance": [
|
|
289
|
+
"\ud83d\udea8 USER SAYS: This loop MUST complete ALL 5 iterations. Do NOT exit early even if you think you found the bug.",
|
|
290
|
+
"DO NOT rationalize: 'I have high confidence so I can do a targeted Phase 2.' NO. Complete all 5 iterations FIRST.",
|
|
291
|
+
"Agents who skip analysis iterations are wrong ~95% of the time. The later iterations catch edge cases and alternative explanations.",
|
|
292
|
+
"Iteration 2/5 is NOT enough. Iteration 3/5 is NOT enough. Complete 5/5."
|
|
293
|
+
]
|
|
284
294
|
},
|
|
285
295
|
{
|
|
286
296
|
"id": "phase-1a-binary-search",
|