@thierrynakoa/fire-flow 10.0.0
- package/.claude-plugin/plugin.json +64 -0
- package/ARCHITECTURE-DIAGRAM.md +440 -0
- package/COMMAND-REFERENCE.md +172 -0
- package/DOMINION-FLOW-OVERVIEW.md +421 -0
- package/LICENSE +21 -0
- package/QUICK-START.md +351 -0
- package/README.md +398 -0
- package/TROUBLESHOOTING.md +264 -0
- package/agents/fire-codebase-mapper.md +484 -0
- package/agents/fire-debugger.md +535 -0
- package/agents/fire-executor.md +949 -0
- package/agents/fire-fact-checker.md +276 -0
- package/agents/fire-learncoding-explainer.md +237 -0
- package/agents/fire-learncoding-walker.md +147 -0
- package/agents/fire-planner.md +675 -0
- package/agents/fire-project-researcher.md +155 -0
- package/agents/fire-research-synthesizer.md +166 -0
- package/agents/fire-researcher.md +723 -0
- package/agents/fire-reviewer.md +499 -0
- package/agents/fire-roadmapper.md +203 -0
- package/agents/fire-verifier.md +880 -0
- package/bin/cli.js +208 -0
- package/commands/fire-0-orient.md +476 -0
- package/commands/fire-1-new.md +281 -0
- package/commands/fire-1a-discuss.md +455 -0
- package/commands/fire-2-plan.md +527 -0
- package/commands/fire-3-execute.md +1303 -0
- package/commands/fire-4-verify.md +845 -0
- package/commands/fire-5-handoff.md +515 -0
- package/commands/fire-6-resume.md +501 -0
- package/commands/fire-7-review.md +409 -0
- package/commands/fire-add-new-skill.md +598 -0
- package/commands/fire-analytics.md +499 -0
- package/commands/fire-assumptions.md +78 -0
- package/commands/fire-autonomous.md +528 -0
- package/commands/fire-brainstorm.md +413 -0
- package/commands/fire-complete-milestone.md +270 -0
- package/commands/fire-dashboard.md +375 -0
- package/commands/fire-debug.md +663 -0
- package/commands/fire-discover.md +616 -0
- package/commands/fire-double-check.md +460 -0
- package/commands/fire-execute-plan.md +182 -0
- package/commands/fire-learncoding.md +242 -0
- package/commands/fire-loop-resume.md +272 -0
- package/commands/fire-loop-stop.md +198 -0
- package/commands/fire-loop.md +1168 -0
- package/commands/fire-map-codebase.md +313 -0
- package/commands/fire-new-milestone.md +356 -0
- package/commands/fire-reflect.md +235 -0
- package/commands/fire-research.md +246 -0
- package/commands/fire-search.md +330 -0
- package/commands/fire-security-audit-repo.md +293 -0
- package/commands/fire-security-scan.md +484 -0
- package/commands/fire-session-summary.md +252 -0
- package/commands/fire-skills-diff.md +506 -0
- package/commands/fire-skills-history.md +388 -0
- package/commands/fire-skills-rollback.md +408 -0
- package/commands/fire-skills-sync.md +470 -0
- package/commands/fire-test.md +520 -0
- package/commands/fire-todos.md +335 -0
- package/commands/fire-transition.md +186 -0
- package/commands/fire-update.md +312 -0
- package/commands/fire-verify-uat.md +146 -0
- package/commands/fire-vuln-scan.md +493 -0
- package/hooks/hooks.json +16 -0
- package/hooks/run-hook.cmd +69 -0
- package/hooks/run-hook.sh +8 -0
- package/hooks/run-session-end.cmd +49 -0
- package/hooks/run-session-end.sh +7 -0
- package/hooks/session-end.sh +90 -0
- package/hooks/session-start.sh +111 -0
- package/package.json +52 -0
- package/plugin.json +7 -0
- package/references/auto-skill-extraction.md +136 -0
- package/references/behavioral-directives.md +365 -0
- package/references/blocker-tracking.md +155 -0
- package/references/checkpoints.md +165 -0
- package/references/circuit-breaker.md +410 -0
- package/references/context-engineering.md +587 -0
- package/references/decision-time-guidance.md +289 -0
- package/references/error-classification.md +326 -0
- package/references/execution-mode-intelligence.md +242 -0
- package/references/git-integration.md +217 -0
- package/references/honesty-protocols.md +304 -0
- package/references/integration-architecture.md +470 -0
- package/references/issue-to-pr-pipeline.md +150 -0
- package/references/metrics-and-trends.md +234 -0
- package/references/playwright-e2e-testing.md +326 -0
- package/references/questioning.md +125 -0
- package/references/research-improvements.md +110 -0
- package/references/skills-usage-guide.md +429 -0
- package/references/tdd.md +131 -0
- package/references/testing-enforcement.md +192 -0
- package/references/ui-brand.md +383 -0
- package/references/validation-checklist.md +456 -0
- package/references/verification-patterns.md +187 -0
- package/references/warrior-principles.md +173 -0
- package/skills-library/SKILLS-INDEX.md +588 -0
- package/skills-library/_general/frontend/html-visual-reports.md +292 -0
- package/skills-library/_general/methodology/debug-swarm-researcher-escape-hatch.md +240 -0
- package/skills-library/_general/methodology/learncoding-agentic-pattern.md +114 -0
- package/skills-library/_general/methodology/shell-autonomous-loop-fixplan.md +238 -0
- package/skills-library/basics/api-rest-basics.md +162 -0
- package/skills-library/basics/env-variables.md +96 -0
- package/skills-library/basics/error-handling-basics.md +125 -0
- package/skills-library/basics/git-commit-conventions.md +106 -0
- package/skills-library/basics/readme-template.md +108 -0
- package/skills-library/common-tasks/async-await-patterns.md +157 -0
- package/skills-library/common-tasks/auth-jwt-basics.md +164 -0
- package/skills-library/common-tasks/database-schema-design.md +166 -0
- package/skills-library/common-tasks/file-upload-basics.md +166 -0
- package/skills-library/common-tasks/form-validation.md +159 -0
- package/skills-library/debugging/FAILURE_TAXONOMY_CLASSIFICATION.md +117 -0
- package/skills-library/debugging/THREE_AGENT_HYPOTHESIS_DEBUGGING.md +86 -0
- package/skills-library/methodology/BREATH_BASED_PARALLEL_EXECUTION.md +678 -0
- package/skills-library/methodology/CONFIDENCE_GATED_EXECUTION.md +243 -0
- package/skills-library/methodology/EVIDENCE_BASED_VALIDATION.md +308 -0
- package/skills-library/methodology/MULTI_PERSPECTIVE_CODE_REVIEW.md +330 -0
- package/skills-library/methodology/PATH_VERIFICATION_GATE.md +211 -0
- package/skills-library/methodology/REFLEXION_MEMORY_PATTERN.md +183 -0
- package/skills-library/methodology/RESEARCH_BACKED_WORKFLOW_UPGRADE.md +263 -0
- package/skills-library/methodology/SABBATH_REST_PATTERN.md +267 -0
- package/skills-library/methodology/STONE_AND_SCAFFOLD.md +220 -0
- package/skills-library/performance/cache-augmented-generation.md +172 -0
- package/skills-library/quality-safety/debugging-steps.md +147 -0
- package/skills-library/quality-safety/deployment-checklist.md +155 -0
- package/skills-library/quality-safety/security-checklist.md +204 -0
- package/skills-library/quality-safety/testing-basics.md +180 -0
- package/skills-library/security/agent-security-scanner.md +445 -0
- package/skills-library/specialists/api-architecture/api-designer.md +49 -0
- package/skills-library/specialists/api-architecture/graphql-architect.md +49 -0
- package/skills-library/specialists/api-architecture/mcp-developer.md +51 -0
- package/skills-library/specialists/api-architecture/microservices-architect.md +50 -0
- package/skills-library/specialists/api-architecture/websocket-engineer.md +48 -0
- package/skills-library/specialists/backend/django-expert.md +52 -0
- package/skills-library/specialists/backend/fastapi-expert.md +52 -0
- package/skills-library/specialists/backend/laravel-specialist.md +52 -0
- package/skills-library/specialists/backend/nestjs-expert.md +51 -0
- package/skills-library/specialists/backend/rails-expert.md +53 -0
- package/skills-library/specialists/backend/spring-boot-engineer.md +56 -0
- package/skills-library/specialists/data-ml/fine-tuning-expert.md +48 -0
- package/skills-library/specialists/data-ml/ml-pipeline.md +47 -0
- package/skills-library/specialists/data-ml/pandas-pro.md +47 -0
- package/skills-library/specialists/data-ml/rag-architect.md +51 -0
- package/skills-library/specialists/data-ml/spark-engineer.md +47 -0
- package/skills-library/specialists/frontend/angular-architect.md +52 -0
- package/skills-library/specialists/frontend/flutter-expert.md +51 -0
- package/skills-library/specialists/frontend/nextjs-developer.md +54 -0
- package/skills-library/specialists/frontend/react-native-expert.md +50 -0
- package/skills-library/specialists/frontend/vue-expert.md +51 -0
- package/skills-library/specialists/infrastructure/chaos-engineer.md +74 -0
- package/skills-library/specialists/infrastructure/cloud-architect.md +70 -0
- package/skills-library/specialists/infrastructure/database-optimizer.md +64 -0
- package/skills-library/specialists/infrastructure/devops-engineer.md +70 -0
- package/skills-library/specialists/infrastructure/kubernetes-specialist.md +52 -0
- package/skills-library/specialists/infrastructure/monitoring-expert.md +70 -0
- package/skills-library/specialists/infrastructure/sre-engineer.md +70 -0
- package/skills-library/specialists/infrastructure/terraform-engineer.md +51 -0
- package/skills-library/specialists/languages/cpp-pro.md +74 -0
- package/skills-library/specialists/languages/csharp-developer.md +69 -0
- package/skills-library/specialists/languages/dotnet-core-expert.md +54 -0
- package/skills-library/specialists/languages/golang-pro.md +51 -0
- package/skills-library/specialists/languages/java-architect.md +49 -0
- package/skills-library/specialists/languages/javascript-pro.md +68 -0
- package/skills-library/specialists/languages/kotlin-specialist.md +68 -0
- package/skills-library/specialists/languages/php-pro.md +49 -0
- package/skills-library/specialists/languages/python-pro.md +52 -0
- package/skills-library/specialists/languages/react-expert.md +51 -0
- package/skills-library/specialists/languages/rust-engineer.md +50 -0
- package/skills-library/specialists/languages/sql-pro.md +56 -0
- package/skills-library/specialists/languages/swift-expert.md +69 -0
- package/skills-library/specialists/languages/typescript-pro.md +51 -0
- package/skills-library/specialists/platform/atlassian-mcp.md +52 -0
- package/skills-library/specialists/platform/embedded-systems.md +53 -0
- package/skills-library/specialists/platform/game-developer.md +53 -0
- package/skills-library/specialists/platform/salesforce-developer.md +53 -0
- package/skills-library/specialists/platform/shopify-expert.md +49 -0
- package/skills-library/specialists/platform/wordpress-pro.md +49 -0
- package/skills-library/specialists/quality/code-documenter.md +51 -0
- package/skills-library/specialists/quality/code-reviewer.md +67 -0
- package/skills-library/specialists/quality/debugging-wizard.md +51 -0
- package/skills-library/specialists/quality/fullstack-guardian.md +51 -0
- package/skills-library/specialists/quality/legacy-modernizer.md +50 -0
- package/skills-library/specialists/quality/playwright-expert.md +65 -0
- package/skills-library/specialists/quality/spec-miner.md +56 -0
- package/skills-library/specialists/quality/test-master.md +65 -0
- package/skills-library/specialists/security/secure-code-guardian.md +55 -0
- package/skills-library/specialists/security/security-reviewer.md +53 -0
- package/skills-library/specialists/workflow/architecture-designer.md +53 -0
- package/skills-library/specialists/workflow/cli-developer.md +70 -0
- package/skills-library/specialists/workflow/feature-forge.md +65 -0
- package/skills-library/specialists/workflow/prompt-engineer.md +54 -0
- package/skills-library/specialists/workflow/the-fool.md +62 -0
- package/templates/ASSUMPTIONS.md +125 -0
- package/templates/BLOCKERS.md +73 -0
- package/templates/DECISION_LOG.md +116 -0
- package/templates/UAT.md +96 -0
- package/templates/blueprint.md +94 -0
- package/templates/brainstorm.md +185 -0
- package/templates/conscience.md +92 -0
- package/templates/fire-handoff.md +159 -0
- package/templates/metrics.md +67 -0
- package/templates/phase-prompt.md +142 -0
- package/templates/record.md +131 -0
- package/templates/review-report.md +117 -0
- package/templates/skills-index.md +157 -0
- package/templates/verification.md +149 -0
- package/templates/vision.md +79 -0
- package/validation-config.yml +793 -0
- package/version.json +7 -0
- package/workflows/execute-phase.md +732 -0
- package/workflows/handoff-session.md +678 -0
- package/workflows/new-project.md +578 -0
- package/workflows/plan-phase.md +592 -0
- package/workflows/verify-phase.md +874 -0
@@ -0,0 +1,410 @@
# Quantitative Circuit Breaker

> Hard numerical thresholds that detect when loops are spinning, stalling, or degrading. Inspired by frankbria's Ralph loop fork and Manus error preservation.

---

## Overview

The Sabbath Rest system detects context rot through subjective signals. The Circuit Breaker adds **quantitative, measurable thresholds** that trigger automatically, with no judgment calls needed. It measures three dimensions: file changes, error repetition, and output volume.

**Core principle:** If the numbers say you're stuck, you're stuck, even if it doesn't feel that way.

---

## Three Thresholds

### 1. No File Changes (Stall Detection)

```
THRESHOLD: NO_FILE_CHANGES
  trigger:  3 consecutive iterations with zero file modifications
  severity: WARNING after 3, BREAK after 5
  response: Force approach rotation

  measure():
    files_changed = git diff --stat HEAD~1   (after each iteration)
    IF files_changed == 0:
      stall_counter++
    ELSE:
      stall_counter = 0

    # Check the stricter limit first so BREAK is reachable
    IF stall_counter >= 5:
      RETURN BREAK: "No file changes in 5 iterations, stopping"
    IF stall_counter >= 3:
      RETURN WARNING: "No file changes in 3 iterations"
```

**Why this matters:** If code isn't changing, nothing is being produced. The agent may be reading files or thinking, but it is not actually producing work.
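
The stall counter can be sketched in Python as below. This is a minimal illustration, not the package's implementation: the class name `StallDetector` and its fields are hypothetical, and the caller is assumed to supply the per-iteration count of changed files (for example, parsed from `git diff` output).

```python
from dataclasses import dataclass


@dataclass
class StallDetector:
    """Counts consecutive iterations that changed no files (hypothetical helper)."""
    warning_threshold: int = 3
    break_threshold: int = 5
    stall_counter: int = 0

    def measure(self, files_changed: int) -> str:
        # Any file change resets the streak; otherwise the streak grows.
        if files_changed > 0:
            self.stall_counter = 0
        else:
            self.stall_counter += 1
        # Test the stricter limit first so WARNING cannot shadow BREAK.
        if self.stall_counter >= self.break_threshold:
            return "BREAK"
        if self.stall_counter >= self.warning_threshold:
            return "WARNING"
        return "HEALTHY"


detector = StallDetector()
# One productive iteration, five idle ones, then one productive iteration.
states = [detector.measure(n) for n in [2, 0, 0, 0, 0, 0, 1]]
print(states)
```

The final productive iteration returns the detector to `HEALTHY`, which mirrors the `stall_counter = 0` reset in the pseudocode.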

### 2. Same Error Hash (Spin Detection)

```
THRESHOLD: SAME_ERROR_HASH
  trigger:  Same error signature appears 3 times
  severity: WARNING after 3, BREAK after 5
  response: Force different approach, escalate to user

  measure():
    error_hash = hash(normalize(error_message))
    error_history[error_hash]++

    # Check the stricter limit first so BREAK is reachable
    IF error_history[error_hash] >= 5:
      RETURN BREAK: "Same error 5 times, current approach is not working"
    IF error_history[error_hash] >= 3:
      RETURN WARNING: "Same error seen 3 times"

  normalize(error):
    # Strip line numbers, timestamps, dynamic values
    # Keep error type, message pattern, file reference
    stripped = remove_line_numbers(error)
    stripped = remove_timestamps(stripped)
    stripped = remove_dynamic_ids(stripped)
    RETURN stripped
```

**Error hash normalization examples:**

| Raw Error | Normalized | Hash |
|-----------|-----------|------|
| `TypeError: Cannot read property 'id' of undefined at line 42` | `TypeError: Cannot read property * of undefined at *` | `a3f2...` |
| `TypeError: Cannot read property 'name' of undefined at line 88` | `TypeError: Cannot read property * of undefined at *` | `a3f2...` (same!) |
| `ECONNREFUSED 127.0.0.1:5432` | `ECONNREFUSED *:5432` | `b7c1...` |
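
A possible Python rendering of the normalization and counting steps is shown below. The regular expressions, the SHA-256 truncation, and the `ErrorTracker` class are assumptions made for illustration; the source only specifies that line numbers, timestamps, and dynamic values are stripped. The hashes it produces will not match the placeholder values (`a3f2...`, `b7c1...`) in the table.

```python
import hashlib
import re
from collections import Counter

# Illustrative patterns only; a real normalizer would need more cases.
_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?")
_IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
_QUOTED = re.compile(r"'[^']*'|\"[^\"]*\"")
_LINE_REF = re.compile(r"\bat line \d+\b")


def normalize(error: str) -> str:
    """Replace dynamic parts of an error message with '*'."""
    s = _TIMESTAMP.sub("*", error)
    s = _IPV4.sub("*", s)
    s = _QUOTED.sub("*", s)
    s = _LINE_REF.sub("at *", s)
    return s.strip()


def error_hash(error: str) -> str:
    """Short, stable signature of the normalized message."""
    return hashlib.sha256(normalize(error).encode("utf-8")).hexdigest()[:8]


class ErrorTracker:
    """Counts occurrences of each error signature (hypothetical helper)."""

    def __init__(self, warning_threshold: int = 3, break_threshold: int = 5):
        self.warning_threshold = warning_threshold
        self.break_threshold = break_threshold
        self.history = Counter()

    def record(self, error: str) -> str:
        key = error_hash(error)
        self.history[key] += 1
        count = self.history[key]
        # Stricter limit first so BREAK is reachable.
        if count >= self.break_threshold:
            return "BREAK"
        if count >= self.warning_threshold:
            return "WARNING"
        return "HEALTHY"


print(normalize("TypeError: Cannot read property 'id' of undefined at line 42"))
# TypeError: Cannot read property * of undefined at *
print(normalize("ECONNREFUSED 127.0.0.1:5432"))
# ECONNREFUSED *:5432
```

Hashing the normalized text rather than the raw message is what lets the two `TypeError` rows in the table collapse to one signature.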

### 3. Output Volume Decline (Degradation Detection)

```
THRESHOLD: OUTPUT_DECLINE
  trigger:  Output volume drops 50% or more below the baseline of the first 3 iterations
  severity: WARNING at 50% decline, BREAK at 70%
  response: Trigger Sabbath Rest (context is exhausted)

  measure():
    current_output_lines = count_lines(iteration_output)
    baseline = average(first_3_iterations_output_lines)

    decline_pct = (baseline - current_output_lines) / baseline * 100

    # Check the stricter limit first so BREAK is reachable
    IF decline_pct >= 70:
      RETURN BREAK: "Output volume 70% below baseline, context exhausted"
    IF decline_pct >= 50:
      RETURN WARNING: "Output volume 50% below baseline"
```

**Why this matters:** As context fills up, Claude tends to produce shorter, less detailed responses, so falling output volume is a measurable proxy for context rot.
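
The decline calculation can be sketched as follows. The function name and its return shape are hypothetical; the guards for "baseline not yet established" and "baseline of zero" are assumptions added so the sketch cannot divide by zero, since the source does not say how those cases are handled.

```python
from statistics import mean


def output_decline_state(line_counts, baseline_iterations=3,
                         warning_pct=50.0, break_pct=70.0):
    """Classify the latest iteration's output volume against the baseline.

    line_counts holds the output line count of each iteration, oldest first.
    Returns (state, decline_pct).
    """
    # Assumption: no verdict until the baseline iterations are complete.
    if len(line_counts) <= baseline_iterations:
        return "HEALTHY", 0.0
    baseline = mean(line_counts[:baseline_iterations])
    # Assumption: an empty baseline cannot decline any further.
    if baseline == 0:
        return "HEALTHY", 0.0
    decline_pct = (baseline - line_counts[-1]) / baseline * 100
    # Stricter limit first so BREAK is reachable.
    if decline_pct >= break_pct:
        return "BREAK", decline_pct
    if decline_pct >= warning_pct:
        return "WARNING", decline_pct
    return "HEALTHY", decline_pct


for history in ([45, 42, 38, 22], [45, 42, 38, 20], [45, 42, 38, 12]):
    state, pct = output_decline_state(history)
    print(history[-1], state, round(pct, 1))
```

With a baseline of about 41.7 lines, 22 lines is a 47% decline and stays healthy, 20 lines crosses the 50% warning limit, and 12 lines crosses the 70% break limit.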

---

## Circuit Breaker State Machine

```
┌──────────┐
│ HEALTHY  │ ◄── All counters at 0
└────┬─────┘
     │
  Any WARNING threshold
     │
┌────▼─────┐
│ WARNING  │ ◄── Counter(s) approaching limit
└────┬─────┘
     │
     ├── User says CONTINUE → reset warning, continue
     │
     ├── Auto-action: rotate approach
     │
  Any BREAK threshold
     │
┌────▼─────┐
│ TRIPPED  │ ◄── Hard stop
└────┬─────┘
     │
     ├── Save state, trigger Sabbath Rest
     ├── Or: escalate to user with diagnosis
     │
┌────▼─────┐
│ RECOVERY │ ◄── After Sabbath Rest or approach change
└────┬─────┘
     │
  Reset counters → HEALTHY
```
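
One way to encode the diagram is a transition table keyed by `(state, event)`. The event names (`"warning"`, `"continue"`, `"rotate"`, `"break"`, `"rest"`, `"reset"`) are hypothetical labels for the arrows above, and mapping "User says CONTINUE" back to `HEALTHY` is an interpretation of "reset warning, continue" rather than something the source states outright.

```python
from enum import Enum


class State(Enum):
    HEALTHY = "HEALTHY"
    WARNING = "WARNING"
    TRIPPED = "TRIPPED"
    RECOVERY = "RECOVERY"


# (current state, event) -> next state, following the arrows in the diagram.
TRANSITIONS = {
    (State.HEALTHY, "warning"): State.WARNING,    # any WARNING threshold
    (State.WARNING, "continue"): State.HEALTHY,   # user override resets the warning
    (State.WARNING, "rotate"): State.WARNING,     # auto-action, still under watch
    (State.WARNING, "break"): State.TRIPPED,      # any BREAK threshold
    (State.TRIPPED, "rest"): State.RECOVERY,      # Sabbath Rest or approach change
    (State.RECOVERY, "reset"): State.HEALTHY,     # counters reset
}


def step(state: State, event: str) -> State:
    """Return the next state; events with no arrow leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = State.HEALTHY
for event in ["warning", "rotate", "break", "rest", "reset"]:
    state = step(state, event)
    print(event, "->", state.value)
```

Keeping the transitions in a table makes it easy to check that every arrow in the diagram has exactly one entry.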

---

## Approach Rotation

When WARNING triggers, the circuit breaker forces a fundamentally different approach:

```
ON circuit_breaker_warning(threshold_type):

  approaches_tried = load_from_loop_state()

  IF threshold_type == "SAME_ERROR_HASH":
    # The current fix strategy isn't working
    rotation_suggestions:
      1. "Tried editing file directly → Try creating new file instead"
      2. "Tried fixing the function → Try replacing the function"
      3. "Tried the same dependency → Try alternative library"
      4. "Tried patching → Try refactoring the caller"

  IF threshold_type == "NO_FILE_CHANGES":
    # Agent is stuck in analysis paralysis
    rotation_suggestions:
      1. "Tried reading more files → Start writing a minimal solution"
      2. "Tried understanding the full system → Focus on one component"
      3. "Tried the complex approach → Try the simplest possible fix"

  IF threshold_type == "OUTPUT_DECLINE":
    # Context is degraded: need a fresh start
    rotation_suggestions:
      1. "Save findings to loop file and take Sabbath Rest"
      2. "Summarize progress and restart with fresh context"

  inject_into_context:
    "CIRCUIT BREAKER: {threshold_type} warning.
     Approaches already tried: {approaches_tried}
     You MUST try a fundamentally different approach.
     Suggestions: {rotation_suggestions}
     DO NOT repeat previous approaches."
```
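
The injection step amounts to a lookup plus string formatting. The sketch below shows one way to build the message; `ROTATION_SUGGESTIONS` and `build_rotation_prompt` are hypothetical names, and the list of approaches already tried is assumed to come from the loop file.

```python
# Suggestion text copied from the pseudocode above, keyed by threshold type.
ROTATION_SUGGESTIONS = {
    "SAME_ERROR_HASH": [
        "Tried editing file directly → Try creating new file instead",
        "Tried fixing the function → Try replacing the function",
        "Tried the same dependency → Try alternative library",
        "Tried patching → Try refactoring the caller",
    ],
    "NO_FILE_CHANGES": [
        "Tried reading more files → Start writing a minimal solution",
        "Tried understanding the full system → Focus on one component",
        "Tried the complex approach → Try the simplest possible fix",
    ],
    "OUTPUT_DECLINE": [
        "Save findings to loop file and take Sabbath Rest",
        "Summarize progress and restart with fresh context",
    ],
}


def build_rotation_prompt(threshold_type: str, approaches_tried: list) -> str:
    """Format the warning text that gets injected into the agent's context."""
    suggestions = ROTATION_SUGGESTIONS[threshold_type]
    tried = "; ".join(approaches_tried) if approaches_tried else "none recorded"
    lines = [
        f"CIRCUIT BREAKER: {threshold_type} warning.",
        f"Approaches already tried: {tried}",
        "You MUST try a fundamentally different approach.",
        "Suggestions:",
        *[f"  {i}. {text}" for i, text in enumerate(suggestions, start=1)],
        "DO NOT repeat previous approaches.",
    ]
    return "\n".join(lines)


prompt = build_rotation_prompt("NO_FILE_CHANGES", ["Read every file under src/"])
print(prompt)
```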

---

## Loop File Tracking

The circuit breaker state is tracked in the loop file:

```markdown
## Circuit Breaker State

| Iteration | Files Changed | Error Hash | Output Lines | State |
|-----------|--------------|------------|--------------|-------|
| 1 | 3 | - | 45 | HEALTHY |
| 2 | 2 | a3f2 | 42 | HEALTHY |
| 3 | 0 | a3f2 | 38 | HEALTHY |
| 4 | 0 | a3f2 | 30 | WARNING (same error 3x) |
| 5 | 0 | a3f2 | 22 | WARNING (no files 3x + same error) |
| 6 | - | a3f2 | - | TRIPPED (same error 5x) |

### Approaches Tried
1. [Iteration 1-3] Direct fix in auth.ts: TypeError persists
2. [Iteration 4-5] Tried null check wrapper: same underlying issue
3. [Iteration 6] CIRCUIT BREAKER TRIPPED: rotating approach

### Rotation Applied
- Previous: Patching the consumer of the null value
- New: Fix the data source that produces null values
```

---

### 4. Confidence-Outcome Divergence (v7.0: Process Reward Hacking Detection)

> **Research basis:** AUQ (Jan 2026) + AgentPRM (Feb 2025). When step-level confidence
> rises while actual outcomes degrade, the agent is "reward hacking": optimizing for
> feeling confident rather than producing results.

```
THRESHOLD: CONFIDENCE_OUTCOME_DIVERGENCE
  trigger:  Confidence trend positive AND reward trend negative over 3+ iterations
  severity: WARNING immediately, BREAK if divergence persists 2 more iterations
  response: Force external verification, re-assess approach

  measure():
    IF iteration_count < 3: SKIP (not enough data)

    confidence_scores = [iter_N-2.confidence, iter_N-1.confidence, iter_N.confidence]
    reward_scores     = [iter_N-2.turn_reward, iter_N-1.turn_reward, iter_N.turn_reward]

    confidence_trend = linear_slope(confidence_scores)
    reward_trend     = linear_slope(reward_scores)

    IF confidence_trend > 0 AND reward_trend < 0:
      # Check the stricter condition first so BREAK is reachable
      IF divergence was already flagged and has persisted for 2+ additional iterations:
        RETURN BREAK: "Sustained confidence-outcome divergence: agent is optimizing for confidence, not results"

      RETURN WARNING: "Confidence rising ({confidence_trend:+.1f}) but rewards falling ({reward_trend:+.1f})"
      ACTION:
        1. Run tests immediately (don't trust self-assessment)
        2. Check git diff for actual progress (not perceived progress)
        3. If tests fail or no real progress: force approach rotation
```

**Why this matters:** This is the agent equivalent of the Dunning-Kruger effect: the less progress the agent makes, the more confident it feels about its approach. The divergence detector catches this before more context is wasted.
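
The source does not define `linear_slope`; one reasonable reading is an ordinary least-squares slope over the iteration index. The sketch below makes that assumption, and the score scales (confidence as a percentage, reward between 0 and 1) are illustrative only.

```python
def linear_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ...

    Requires at least two values.
    """
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def is_diverging(confidence_scores, reward_scores, window=3):
    """True when confidence trends up while reward trends down over the window."""
    if len(confidence_scores) < window or len(reward_scores) < window:
        return False  # not enough data yet
    confidence_trend = linear_slope(confidence_scores[-window:])
    reward_trend = linear_slope(reward_scores[-window:])
    return confidence_trend > 0 and reward_trend < 0


# Confidence climbs while reward falls: the pattern this threshold targets.
print(is_diverging([60, 70, 85], [0.8, 0.5, 0.2]))
# Both improve together: no divergence.
print(is_diverging([60, 70, 85], [0.2, 0.5, 0.8]))
```

Using a slope rather than comparing the first and last values makes the check less sensitive to a single noisy iteration inside the window.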

---

### 5. KPI Drift Bounds (v11.0: Agent Behavioral Contracts)

> **Research basis:** ABC: Agent Behavioral Contracts (Feb 2026). Monitoring KPI drift
> during execution detects when the agent is active but drifting from plan goals.
> Unlike stall/spin detection, this catches "productive but wrong" behavior.

```
THRESHOLD: KPI_DRIFT
  trigger:  Execution KPIs diverge from plan expectations over 3+ tasks
  severity: WARNING when drift detected, BREAK if uncorrected after 2 tasks
  response: Re-read plan objectives, realign task approach

  measure():
    IF tasks_completed < 3: SKIP (not enough data)

    # Track 3 KPIs per execution session
    kpi_plan_alignment      = tasks_matching_plan_objectives / tasks_completed
    kpi_scope_creep         = files_modified_outside_plan / total_files_modified
    kpi_test_coverage_delta = current_coverage - pre_execution_coverage   # percentage points

    # Check the stricter condition first so BREAK is reachable
    IF any WARNING persists for 2+ additional tasks after being flagged:
      RETURN BREAK: "KPI drift uncorrected, agent drifting from plan"

    IF kpi_plan_alignment < 0.6:
      RETURN WARNING: "Only {pct}% of tasks align with plan objectives"
    IF kpi_scope_creep > 0.4:
      RETURN WARNING: "{pct}% of file changes are outside plan scope"
    IF kpi_test_coverage_delta < -5:
      RETURN WARNING: "Test coverage dropped {delta}% during execution"
```

**Why this matters:** An agent can pass all other thresholds (files changing, no error loops, output volume steady) while doing useful-looking work that doesn't advance the plan. ABC drift bounds catch this divergence.
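
The three KPIs reduce to simple ratios, sketched below. `SessionStats` and `kpi_warnings` are hypothetical names, the guard against zero modified files is an added assumption, and unlike the pseudocode (which returns on the first warning) this sketch collects every warning that applies.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Per-session counters (hypothetical structure)."""
    tasks_completed: int
    tasks_matching_plan: int
    files_modified: int
    files_outside_plan: int
    coverage_before: float  # percent
    coverage_now: float     # percent


def kpi_warnings(s: SessionStats) -> list:
    """Return a warning message for each KPI outside its bound."""
    warnings = []
    if s.tasks_completed < 3:
        return warnings  # trigger requires 3+ tasks
    alignment = s.tasks_matching_plan / s.tasks_completed
    # Assumption: with no modified files there is no scope creep to measure.
    scope_creep = s.files_outside_plan / s.files_modified if s.files_modified else 0.0
    coverage_delta = s.coverage_now - s.coverage_before
    if alignment < 0.6:
        warnings.append(f"Only {alignment:.0%} of tasks align with plan objectives")
    if scope_creep > 0.4:
        warnings.append(f"{scope_creep:.0%} of file changes are outside plan scope")
    if coverage_delta < -5:
        warnings.append(
            f"Test coverage dropped {abs(coverage_delta):.1f} points during execution"
        )
    return warnings


drifting = SessionStats(5, 2, 10, 5, 80.0, 72.0)
for message in kpi_warnings(drifting):
    print(message)
```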

---

## Configuration

```yaml
circuit_breaker:
  # Stall detection
  no_file_changes:
    warning_threshold: 3
    break_threshold: 5

  # Spin detection
  same_error_hash:
    warning_threshold: 3
    break_threshold: 5

  # Degradation detection
  output_decline:
    warning_pct: 50
    break_pct: 70
    baseline_iterations: 3

  # Behavior
  on_warning: rotate_approach   # rotate_approach | pause | ask_user
  on_break: sabbath_rest        # sabbath_rest | stop | ask_user

  # Override
  force_continue: false         # Set to true to skip the circuit breaker
```
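
A configuration like this is easy to get subtly wrong, for example a warning limit set above the break limit. The sketch below shows one way to sanity-check the values once they have been loaded into a dictionary; the `validate` function and the specific rules it enforces are assumptions, not part of the package.

```python
# Mirrors the keys under `circuit_breaker:` in the YAML above.
DEFAULTS = {
    "no_file_changes": {"warning_threshold": 3, "break_threshold": 5},
    "same_error_hash": {"warning_threshold": 3, "break_threshold": 5},
    "output_decline": {"warning_pct": 50, "break_pct": 70, "baseline_iterations": 3},
    "on_warning": "rotate_approach",
    "on_break": "sabbath_rest",
    "force_continue": False,
}

ON_WARNING_CHOICES = {"rotate_approach", "pause", "ask_user"}
ON_BREAK_CHOICES = {"sabbath_rest", "stop", "ask_user"}


def validate(cfg: dict) -> list:
    """Return a list of human-readable problems; empty means the config is sane."""
    problems = []
    for name in ("no_file_changes", "same_error_hash"):
        limits = cfg[name]
        if not 0 < limits["warning_threshold"] < limits["break_threshold"]:
            problems.append(f"{name}: warning_threshold must be below break_threshold")
    decline = cfg["output_decline"]
    if not 0 < decline["warning_pct"] < decline["break_pct"] <= 100:
        problems.append("output_decline: need 0 < warning_pct < break_pct <= 100")
    if decline["baseline_iterations"] < 1:
        problems.append("output_decline: baseline_iterations must be at least 1")
    if cfg["on_warning"] not in ON_WARNING_CHOICES:
        problems.append(f"on_warning: unknown action {cfg['on_warning']!r}")
    if cfg["on_break"] not in ON_BREAK_CHOICES:
        problems.append(f"on_break: unknown action {cfg['on_break']!r}")
    return problems


print(validate(DEFAULTS))
```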

---

## Integration Points

### /fire-loop (primary consumer)

```markdown
# After Step 7 (Execute Task), before Step 8 (Iteration Tracking):

## Circuit Breaker Check
cb_state = circuit_breaker.measure(
  files_changed = git_diff_stat(),
  error_output = last_error_if_any(),
  output_lines = count_lines(iteration_output)
)

IF cb_state == HEALTHY:
  continue normally

IF cb_state == WARNING:
  display warning banner
  apply approach rotation
  continue with rotated approach

IF cb_state == TRIPPED:
  display break banner
  save loop state
  trigger sabbath rest OR stop loop
```
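
Because several thresholds are measured in the same check, their individual results have to be reduced to one state. The source does not say how; the sketch below assumes the most severe result wins. The `combine` function and the lowercase threshold keys are hypothetical.

```python
SEVERITY = {"HEALTHY": 0, "WARNING": 1, "BREAK": 2}


def combine(results: dict):
    """Reduce per-threshold results to (overall state, thresholds responsible).

    results maps a threshold name to "HEALTHY", "WARNING", or "BREAK".
    """
    worst = max(results.values(), key=SEVERITY.__getitem__, default="HEALTHY")
    if worst == "HEALTHY":
        return "HEALTHY", []
    triggered = sorted(name for name, result in results.items() if result == worst)
    # A BREAK result puts the breaker in the TRIPPED state.
    state = "TRIPPED" if worst == "BREAK" else "WARNING"
    return state, triggered


print(combine({
    "no_file_changes": "WARNING",
    "same_error_hash": "BREAK",
    "output_decline": "HEALTHY",
}))
```

Returning the responsible thresholds alongside the state gives the banner something concrete to report.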

### /fire-debug

```markdown
# After each debug cycle:

## Circuit Breaker Check
Same measurement, but focused on the same_error_hash threshold.
Debug sessions are expected to have no file changes between
diagnosis steps, so the no_file_changes threshold is relaxed to 5/8
(WARNING/BREAK).
```

### /fire-3-execute

```markdown
# Per-task monitoring during plan execution:

## Circuit Breaker (Lightweight)
Only track same_error_hash at plan execution level.
If the same build/test error appears 3 times across tasks,
pause execution and ask the user.
```

---

## Display Banners

### WARNING Banner

```
┌─────────────────────────────────────────────────────────────────────────────┐
│  CIRCUIT BREAKER WARNING                                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Threshold: [NO_FILE_CHANGES | SAME_ERROR_HASH | OUTPUT_DECLINE]            │
│  Count: [N] / [limit]                                                       │
│  Iteration: [current] / [max]                                               │
│                                                                             │
│  Diagnosis:                                                                 │
│    [Human-readable explanation of what's happening]                         │
│                                                                             │
│  Action: ROTATING APPROACH                                                  │
│    Previous: [what was being tried]                                         │
│    New: [what will be tried next]                                           │
│                                                                             │
│  To override: Reply "FORCE CONTINUE"                                        │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

### TRIPPED Banner

```
┌─────────────────────────────────────────────────────────────────────────────┐
│  CIRCUIT BREAKER TRIPPED                                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Threshold: [type] exceeded at iteration [N]                                │
│                                                                             │
│  Summary:                                                                   │
│    Iterations run: [N]                                                      │
│    Files changed: [N] total                                                 │
│    Unique errors: [N]                                                       │
│    Approaches tried: [N]                                                    │
│                                                                             │
│  Loop state saved to: .planning/loops/fire-loop-{ID}.md                     │
│                                                                             │
│  Options:                                                                   │
│    A) /fire-loop-resume {ID} (fresh context, same state)                    │
│    B) /fire-debug (switch to structured debugging)                          │
│    C) Fix manually, then /fire-loop-resume {ID}                             │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## References

- **Inspiration:** frankbria's Ralph loop fork (quantitative convergence detection)
- **Related:** `references/error-classification.md` (state machine driving responses)
- **Related:** `references/metrics-and-trends.md` (bottleneck detection algorithm)
- **Consumer:** `commands/fire-loop.md` (primary integration point)
- **Consumer:** `commands/fire-debug.md` (debug cycle integration)