@thierrynakoa/fire-flow 10.0.0 → 12.2.0

Files changed (94)
  1. package/.claude-plugin/plugin.json +9 -9
  2. package/ARCHITECTURE-DIAGRAM.md +7 -4
  3. package/COMMAND-REFERENCE.md +33 -13
  4. package/DOMINION-FLOW-OVERVIEW.md +581 -421
  5. package/QUICK-START.md +3 -3
  6. package/README.md +102 -45
  7. package/TROUBLESHOOTING.md +264 -264
  8. package/agents/fire-executor.md +200 -116
  9. package/agents/fire-fact-checker.md +276 -276
  10. package/agents/fire-phoenix-analyst.md +394 -0
  11. package/agents/fire-planner.md +145 -53
  12. package/agents/fire-project-researcher.md +155 -155
  13. package/agents/fire-research-synthesizer.md +166 -166
  14. package/agents/fire-researcher.md +144 -59
  15. package/agents/fire-roadmapper.md +215 -203
  16. package/agents/fire-verifier.md +247 -65
  17. package/agents/fire-vision-architect.md +381 -0
  18. package/commands/fire-0-orient.md +476 -476
  19. package/commands/fire-1a-new.md +216 -0
  20. package/commands/fire-1b-research.md +210 -0
  21. package/commands/fire-1c-setup.md +254 -0
  22. package/commands/{fire-1a-discuss.md → fire-1d-discuss.md} +35 -7
  23. package/commands/fire-3-execute.md +55 -2
  24. package/commands/fire-4-verify.md +61 -0
  25. package/commands/fire-5-handoff.md +2 -2
  26. package/commands/fire-6-resume.md +37 -2
  27. package/commands/fire-add-new-skill.md +2 -2
  28. package/commands/fire-autonomous.md +20 -3
  29. package/commands/fire-brainstorm.md +1 -1
  30. package/commands/fire-complete-milestone.md +2 -2
  31. package/commands/fire-cost.md +183 -0
  32. package/commands/fire-dashboard.md +2 -2
  33. package/commands/fire-debug.md +663 -663
  34. package/commands/fire-loop-resume.md +2 -2
  35. package/commands/fire-loop-stop.md +1 -1
  36. package/commands/fire-loop.md +1168 -1168
  37. package/commands/fire-map-codebase.md +3 -3
  38. package/commands/fire-new-milestone.md +356 -356
  39. package/commands/fire-phoenix.md +603 -0
  40. package/commands/fire-reflect.md +235 -235
  41. package/commands/fire-research.md +246 -246
  42. package/commands/fire-search.md +1 -1
  43. package/commands/fire-skills-diff.md +3 -3
  44. package/commands/fire-skills-history.md +3 -3
  45. package/commands/fire-skills-rollback.md +7 -7
  46. package/commands/fire-skills-sync.md +5 -5
  47. package/commands/fire-test.md +9 -9
  48. package/commands/fire-todos.md +1 -1
  49. package/commands/fire-update.md +5 -5
  50. package/hooks/hooks.json +16 -16
  51. package/hooks/run-hook.sh +8 -8
  52. package/hooks/run-session-end.sh +7 -7
  53. package/hooks/session-end.sh +90 -90
  54. package/hooks/session-start.sh +1 -1
  55. package/package.json +2 -2
  56. package/plugin.json +7 -7
  57. package/references/metrics-and-trends.md +1 -1
  58. package/skills-library/SKILLS-INDEX.md +588 -588
  59. package/skills-library/_general/methodology/AUTONOMOUS_ORCHESTRATION.md +182 -0
  60. package/skills-library/_general/methodology/BACKWARD_PLANNING_INTERVIEW.md +307 -0
  61. package/skills-library/_general/methodology/CIRCUIT_BREAKER_INTELLIGENCE.md +163 -0
  62. package/skills-library/_general/methodology/CONTEXT_ROTATION.md +151 -0
  63. package/skills-library/_general/methodology/DEAD_ENDS_SHELF.md +188 -0
  64. package/skills-library/_general/methodology/DESIGN_PHILOSOPHY_ENFORCEMENT.md +152 -0
  65. package/skills-library/_general/methodology/INTERNAL_CONSISTENCY_AUDIT.md +212 -0
  66. package/skills-library/_general/methodology/LIVE_BREADCRUMB_PROTOCOL.md +242 -0
  67. package/skills-library/_general/methodology/PHOENIX_REBUILD_METHODOLOGY.md +251 -0
  68. package/skills-library/_general/methodology/QUALITY_GATES_AND_VERIFICATION.md +157 -0
  69. package/skills-library/_general/methodology/RELIABILITY_PREDICTION.md +104 -0
  70. package/skills-library/_general/methodology/REQUIREMENTS_DECOMPOSITION.md +155 -0
  71. package/skills-library/_general/methodology/SELF_TESTING_FEEDBACK_LOOP.md +143 -0
  72. package/skills-library/_general/methodology/STACK_COMPATIBILITY_MATRIX.md +178 -0
  73. package/skills-library/_general/methodology/TIERED_CONTEXT_ARCHITECTURE.md +118 -0
  74. package/skills-library/_general/methodology/ZERO_FRICTION_CLI_SETUP.md +312 -0
  75. package/skills-library/_general/methodology/autonomous-multi-phase-build.md +133 -0
  76. package/skills-library/_general/methodology/claude-md-archival.md +280 -0
  77. package/skills-library/_general/methodology/debug-swarm-researcher-escape-hatch.md +240 -240
  78. package/skills-library/_general/methodology/git-worktrees-parallel.md +232 -0
  79. package/skills-library/_general/methodology/llm-judge-memory-crud.md +241 -0
  80. package/skills-library/_general/methodology/multi-project-autonomous-build.md +360 -0
  81. package/skills-library/_general/methodology/shell-autonomous-loop-fixplan.md +238 -238
  82. package/skills-library/_general/patterns-standards/GOF_DESIGN_PATTERNS_FOR_AI_AGENTS.md +358 -0
  83. package/skills-library/methodology/BREATH_BASED_PARALLEL_EXECUTION.md +1 -1
  84. package/skills-library/methodology/RESEARCH_BACKED_WORKFLOW_UPGRADE.md +1 -1
  85. package/skills-library/methodology/SABBATH_REST_PATTERN.md +1 -1
  86. package/templates/ASSUMPTIONS.md +1 -1
  87. package/templates/BLOCKERS.md +1 -1
  88. package/templates/DECISION_LOG.md +1 -1
  89. package/templates/phase-prompt.md +1 -1
  90. package/templates/phoenix-comparison.md +80 -0
  91. package/version.json +2 -2
  92. package/workflows/handoff-session.md +1 -1
  93. package/workflows/new-project.md +2 -2
  94. package/commands/fire-1-new.md +0 -281
package/skills-library/_general/methodology/CONTEXT_ROTATION.md
@@ -0,0 +1,151 @@
1
+ ---
2
+ name: CONTEXT_ROTATION
3
+ category: methodology
4
+ description: Fresh-eyes debugging science — when and how to rotate context to break fixation, with cognitive science backing and practical protocols for AI agent handoffs
5
+ version: 1.0.0
6
+ tags: [context-rotation, fresh-eyes, fixation, incubation, debugging, handoff]
7
+ sources:
8
+ - "Duncker (1945) — Functional Fixedness"
9
+ - "Springer Memory & Cognition — Interrupted distributed effort and incubation"
10
+ - "Psychology Today — Rubber Duck Debugging Psychology"
11
+ - "Frontiers in Education — Functional Fixedness in Problem Solving"
12
+ - "The Decision Lab — Functional Fixedness bias"
13
+ ---
14
+
15
+ # Context Rotation Protocol
16
+
17
+ > **Core insight:** A stuck agent with degraded context is the worst possible solver. A fresh agent with full context and documented prior attempts is the best. Context window length is a fixation risk, not just a token limit.
18
+
19
+ ---
20
+
21
+ ## 1. The Science of Getting Stuck
22
+
23
+ ### Functional Fixedness (Duncker, 1945)
24
+ Once a problem-solver commits to a conceptual framing, they become blind to alternative framings — even when the solution is obvious in retrospect. Five-year-olds solve certain problems faster than adults because they have less fixation on conventional uses.
25
+
26
+ **For AI agents:** An agent that has spent 4+ exchanges approaching a bug as a data model issue will continue framing it as a data model issue even when symptoms clearly suggest concurrency. The longer the context window stays focused on one framing, the deeper the fixation.
27
+
28
+ ### Incubation Effect
29
+ Research shows approaching problems "briefly and repeatedly" rather than continuously reduces fixation:
30
+ - Stopping work BEFORE hitting a wall preserves the ability to restructure
31
+ - Incubation specifically benefits **insight problems** (where the wrong approach actively blocks the right one)
32
+ - Less effective for **analytical problems** (where grinding produces incremental progress)
33
+
34
+ **Classification question before rotation:** Is this a grinding problem (try harder) or an insight problem (fixation is the enemy)? The answer determines whether iteration or rotation is correct.
35
+
36
+ ---
37
+
38
+ ## 2. When to Rotate Context
39
+
40
+ | Signal | Type | Action |
41
+ |--------|------|--------|
42
+ | Same approach, 3+ syntax variations | Fixation | Rotate immediately |
43
+ | Context compaction just happened | Degradation | Consider rotation |
44
+ | Agent confidence dropping each attempt | Diminishing returns | Rotate after next failure |
45
+ | "I've tried everything I can think of" | Exhaustion | Rotate with full dead-end map |
46
+ | Different approach, same class of error | Deeper issue | Research first, then rotate if no resolution |
47
+
48
+ ### When NOT to Rotate
49
+ - Transient errors (API timeout, build cache) — retry is cheaper
50
+ - Missing information (credentials, config) — human input needed, not fresh eyes
51
+ - Making steady progress — don't fix what isn't broken
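The signal table and the exclusions above can be read as a small lookup. The following is a minimal Python sketch, not part of the package: the `Signal` enum, the `choose_action` function, and the action strings for the three "do not rotate" cases are hypothetical names chosen for illustration.

```python
from enum import Enum


class Signal(Enum):
    """Stuck signals from the table, plus the three 'do not rotate' cases."""
    SAME_APPROACH_VARIATIONS = "same approach, 3+ syntax variations"
    CONTEXT_COMPACTED = "context compaction just happened"
    CONFIDENCE_DROPPING = "agent confidence dropping each attempt"
    IDEAS_EXHAUSTED = "tried everything I can think of"
    SAME_ERROR_CLASS = "different approach, same class of error"
    TRANSIENT_ERROR = "transient error (API timeout, build cache)"
    MISSING_INFORMATION = "missing information (credentials, config)"
    STEADY_PROGRESS = "making steady progress"


# Actions for the first five signals are copied from the table;
# the last three paraphrase the "When NOT to Rotate" list.
ACTIONS = {
    Signal.SAME_APPROACH_VARIATIONS: "rotate immediately",
    Signal.CONTEXT_COMPACTED: "consider rotation",
    Signal.CONFIDENCE_DROPPING: "rotate after next failure",
    Signal.IDEAS_EXHAUSTED: "rotate with full dead-end map",
    Signal.SAME_ERROR_CLASS: "research first, then rotate if no resolution",
    Signal.TRANSIENT_ERROR: "retry",
    Signal.MISSING_INFORMATION: "ask the human",
    Signal.STEADY_PROGRESS: "keep going",
}


def choose_action(signal: Signal) -> str:
    """Return the recommended action for one observed signal."""
    return ACTIONS[signal]
```

Keeping the mapping in a plain dictionary makes it easy to review the policy against the table line by line.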
52
+
53
+ ---
54
+
55
+ ## 3. What the Fresh Agent Receives
56
+
57
+ **Critical rule:** Give the fresh agent the dead-end MAP, not the dead-end JOURNEY.
58
+
59
+ ### Give:
60
+ ```markdown
61
+ ## Problem Context for Fresh Instance
62
+
63
+ **Goal:** {what needs to be accomplished}
64
+ **Current state:** {what exists, what's working}
65
+
66
+ **Approaches tried (outcome map):**
67
+ 1. {approach} → Failed because: {root cause}
68
+ 2. {approach} → Failed because: {root cause}
69
+ 3. {approach} → Failed because: {root cause}
70
+
71
+ **Constraints identified:**
72
+ - {constraint 1 — verified, not assumed}
73
+ - {constraint 2 — verified, not assumed}
74
+
75
+ **Untested hypotheses:**
76
+ - {idea 1 — why it might work}
77
+ - {idea 2 — why it might work}
78
+
79
+ **Relevant files:** {specific paths}
80
+ ```
81
+
82
+ ### Do NOT give:
83
+ - Full conversation history (propagates the original framing/fixation)
84
+ - Emotional language ("this is really frustrating", "nothing works")
85
+ - Vague descriptions ("I tried a bunch of things")
86
+
87
+ **Why:** The fresh agent needs boundary knowledge (where NOT to walk) and starting context (where TO start). It does not need the journey narrative — that's what created the fixation in the first place.
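One way to keep the handoff limited to the map is to build it from structured fields rather than from the transcript. The sketch below is an illustration under that assumption; `DeadEndMap` and its field names are hypothetical, and only the rendered headings come from the template above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DeadEndMap:
    """Structured handoff: outcomes only, no conversation history."""
    goal: str
    current_state: str
    attempts: List[Tuple[str, str]] = field(default_factory=list)  # (approach, root cause)
    constraints: List[str] = field(default_factory=list)
    hypotheses: List[str] = field(default_factory=list)
    relevant_files: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the map using the headings from the template."""
        lines = [
            "## Problem Context for Fresh Instance",
            "",
            f"**Goal:** {self.goal}",
            f"**Current state:** {self.current_state}",
            "",
            "**Approaches tried (outcome map):**",
        ]
        for number, (approach, cause) in enumerate(self.attempts, start=1):
            lines.append(f"{number}. {approach} → Failed because: {cause}")
        lines += ["", "**Constraints identified:**"]
        lines += [f"- {item}" for item in self.constraints]
        lines += ["", "**Untested hypotheses:**"]
        lines += [f"- {item}" for item in self.hypotheses]
        lines += ["", f"**Relevant files:** {', '.join(self.relevant_files)}"]
        return "\n".join(lines)
```

Because the structure has no free-text "history" field, the journey narrative has nowhere to leak in.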
88
+
89
+ ---
90
+
91
+ ## 4. Articulation Protocol (Rubber Duck Step)
92
+
93
+ Before any rotation, require the stuck agent to produce a structured articulation:
94
+
95
+ ```markdown
96
+ ## Articulation (Pre-Rotation)
97
+
98
+ 1. What I was trying to do: {goal in one sentence}
99
+ 2. What I expected to happen: {specific expected behavior}
100
+ 3. What actually happened: {specific actual behavior}
101
+ 4. What I believe the constraint is: {my theory}
102
+ 5. What assumption am I making that might be wrong: {honest assessment}
103
+ ```
104
+
105
+ **Why this works:** Explaining the problem forces sequential, explicit reconstruction of assumptions. The language encoding activates different cognitive pathways than pattern-matching. Research shows this catches 30-40% of stuck cases without needing rotation — the stuck agent solves it by articulating it.
106
+
107
+ **Agent action:** Always run the articulation step BEFORE spawning a fresh agent. If the articulation reveals the issue, save the rotation cost.
108
+
109
+ ---
110
+
111
+ ## 5. Context Window as Fixation Accumulator
112
+
113
+ Every exchange in a long session adds fixation weight:
114
+
115
+ ```
116
+ Session start: fixation_risk = LOW
117
+ After 10 exchanges on same topic: fixation_risk = MEDIUM
118
+ After 20 exchanges on same topic: fixation_risk = HIGH
119
+ After context compaction: fixation_risk = ELEVATED
120
+ (compaction removes details but preserves framing bias)
121
+ ```
122
+
123
+ **Mitigation strategies (in order of preference):**
124
+ 1. **Articulation protocol** — cheapest, catches 30-40%
125
+ 2. **Scope switch** — work on a different task, come back later
126
+ 3. **Research injection** — read external docs/code to introduce new framing
127
+ 4. **Fresh agent with dead-end map** — full context rotation
128
+ 5. **Human clarification** — the human may see what both agents missed
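The thresholds and the mitigation order above could be encoded roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, and because the skill does not say how ELEVATED ranks against HIGH, the sketch simply lets a compaction override the exchange count.

```python
from typing import Iterable, Optional

# Mitigations in the order of preference given in the list.
MITIGATIONS = [
    "articulation protocol",
    "scope switch",
    "research injection",
    "fresh agent with dead-end map",
    "human clarification",
]


def fixation_risk(exchanges_on_topic: int, compacted: bool = False) -> str:
    """Classify fixation risk from session length and compaction."""
    if compacted:
        # Assumption: compaction takes precedence over the exchange count.
        return "ELEVATED"
    if exchanges_on_topic >= 20:
        return "HIGH"
    if exchanges_on_topic >= 10:
        return "MEDIUM"
    return "LOW"


def next_mitigation(already_tried: Iterable[str]) -> Optional[str]:
    """Return the cheapest mitigation not yet tried, or None if all were."""
    tried = set(already_tried)
    for mitigation in MITIGATIONS:
        if mitigation not in tried:
            return mitigation
    return None
```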
129
+
130
+ ---
131
+
132
+ ## 6. The Navigator Pattern (from Pair Programming)
133
+
134
+ In pair programming research, the **navigator role** (observer, not typist) has the highest fresh-eyes effect — they see what the driver's fixation hides.
135
+
136
+ **Applied to Dominion Flow:** The fire-verifier IS the navigator. It reads the executor's output from outside the execution context. This is why verifier isolation (separate instance, fresh context) is architecturally important — not just for objectivity, but for fixation prevention.
137
+
138
+ **Role rotation frequency:**
139
+ - Every task boundary = natural rotation point
140
+ - Every phase boundary = mandatory rotation point (WARRIOR handoff)
141
+ - Mid-task rotation = only when stuck signals detected
142
+
143
+ ---
144
+
145
+ ## When Agents Should Reference This Skill
146
+
147
+ - **fire-executor:** Run articulation protocol when stuck, classify grinding vs. insight before retrying
148
+ - **fire-verifier:** Leverage navigator advantage — flag issues the executor's fixation may hide
149
+ - **fire-6-resume:** Fresh instance = natural context rotation. Read dead-end map, not prior session transcript
150
+ - **fire-autonomous:** Monitor fixation risk in long sessions, trigger rotation at HIGH
151
+ - **Any agent writing a WARRIOR handoff:** Structure handoff as a dead-end map (outcomes, not journey)
package/skills-library/_general/methodology/DEAD_ENDS_SHELF.md
@@ -0,0 +1,188 @@
1
+ ---
2
+ name: DEAD_ENDS_SHELF
3
+ category: methodology
4
+ description: Tag unsolved problems in FAILURES.md for fresh Claude instances — stop burning tokens on dead ends, move on, let cleaner context solve it later
5
+ version: 2.0.0
6
+ tags: [dead-ends, autonomous, cost-optimization, shelving, context-rotation]
7
+ ---
8
+
9
+ # Dead Ends Protocol (v11.3)
10
+
11
+ When an agent hits a wall — code that won't compile, an API that won't respond, a logic puzzle that burns 3+ attempts — **tag it `[DEAD-END]` in FAILURES.md and move on**. A fresh Claude instance with clean context will attempt it on the next session start.
12
+
13
+ > **Philosophy:** A stuck agent with degraded context is the worst possible solver. A fresh agent with full context and documented prior attempts is the best. Stop burning tokens on dead ends — rotate the problem to a fresh mind.
14
+
15
+ > **v11.3 change:** Dead ends are now tagged entries in `breadcrumbs/FAILURES.md` instead of a separate `DEAD-ENDS.md` file. This reduces file count and aligns with the on-demand breadcrumb protocol.
16
+
17
+ ---
18
+
19
+ ## Location: `.planning/breadcrumbs/FAILURES.md`
20
+
21
+ Dead-end entries live alongside regular failure entries, distinguished by the `[DEAD-END]` tag. They are:
22
+ - **Written by:** Any agent that hits a dead end (executor, planner, verifier, researcher)
23
+ - **Read by:** `/fire-6-resume` on session start — fresh Claude filters for `[DEAD-END]` entries and attempts solutions
24
+ - **Cleared by:** The agent that solves the problem (move entry to `LESSONS.md` with `[DEAD-END-SOLVED]` tag, remove `[DEAD-END]` tag from FAILURES.md)
25
+
26
+ ---
27
+
28
+ ## When to Shelve (Trigger Rules)
29
+
30
+ | Condition | Action |
31
+ |-----------|--------|
32
+ | **3+ failed attempts** at the same problem | Tag `[DEAD-END]` immediately |
33
+ | **Blocked by missing info** (credentials, API keys, external dependency) | Tag with `[DEAD-END] [NEEDS-HUMAN]` |
34
+ | **Context degradation detected** (compaction happened, losing details) | Tag before context loss |
35
+ | **Confidence drops below 30%** after research | Tag — you're guessing |
36
+ | **Circular dependency** (fix A breaks B, fix B breaks A) | Tag with both sides documented |
37
+ | **Time sink** — 15+ minutes on a single non-critical issue | Tag and move to next task |
38
+
39
+ ### When NOT to Shelve
40
+
41
+ - Critical path blockers (nothing else can proceed without this)
42
+ - Security vulnerabilities (must fix now)
43
+ - Data loss risks (must fix now)
44
+
45
+ For critical blockers: escalate to the user via checkpoint, don't shelve.
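The trigger rules and exceptions above can be combined into one decision function. The following Python sketch is illustrative only; `TaskState`, its fields, and the returned strings are hypothetical, and the precedence between rules (exceptions first, then missing info, then the remaining triggers) is an assumption the skill does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskState:
    """Observable state of one task (field names are illustrative)."""
    failed_attempts: int = 0
    blocked_by_missing_info: bool = False
    context_degraded: bool = False
    confidence_after_research: Optional[float] = None  # 0.0 to 1.0
    circular_dependency: bool = False
    minutes_spent: float = 0.0
    on_critical_path: bool = False
    security_or_data_loss_risk: bool = False


def shelve_decision(state: TaskState) -> str:
    """Apply the shelve triggers, then the 'When NOT to Shelve' exceptions."""
    low_confidence = (
        state.confidence_after_research is not None
        and state.confidence_after_research < 0.30
    )
    stuck = any((
        state.failed_attempts >= 3,
        state.blocked_by_missing_info,
        state.context_degraded,
        low_confidence,
        state.circular_dependency,
        state.minutes_spent >= 15,
    ))
    if not stuck:
        return "continue"
    if state.security_or_data_loss_risk:
        return "fix now"
    if state.on_critical_path:
        return "escalate to user via checkpoint"
    if state.blocked_by_missing_info:
        return "[DEAD-END] [NEEDS-HUMAN]"
    return "[DEAD-END]"
```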
46
+
47
+ ---
48
+
49
+ ## Dead-End Entry Format
50
+
51
+ ```markdown
52
+ ### [DEAD-END] {short title}
53
+
54
+ **Shelved by:** {agent name} at {timestamp}
55
+ **Phase:** {phase number} / **Plan:** {plan number}
56
+ **Priority:** {critical | high | medium | low}
57
+ **Tags:** {code-bug, api-integration, dependency, logic, config, performance, unknown}
58
+
59
+ **Problem:** {What you were trying to do and what went wrong — be specific}
60
+
61
+ **What Was Tried** (prevents next instance from repeating):
62
+ 1. {what you tried} → {what happened}
63
+ 2. {what you tried} → {what happened}
64
+ 3. {what you tried} → {what happened}
65
+
66
+ **Relevant Files:**
67
+ - `{file path}` — {what matters}
68
+
69
+ **Error:** `{exact error message}`
70
+
71
+ **Hypotheses Not Yet Tested:**
72
+ - {idea 1 — why it might work}
73
+ - {idea 2 — why it might work}
74
+
75
+ **Impact if unsolved:** {what breaks or degrades}
76
+ ```
77
+
78
+ ---
79
+
80
+ ## Agent Integration
81
+
82
+ ### fire-executor (writes to FAILURES.md)
83
+
84
+ When a task hits the shelve trigger:
85
+
86
+ 1. **Stop working on this task** — don't burn more tokens
87
+ 2. **Write a `[DEAD-END]` entry** to `.planning/breadcrumbs/FAILURES.md` using the format above
88
+ 3. **On first write:** If FAILURES.md doesn't exist, create it with a `# Failures` header first
89
+ 4. **Move to next task** — the agent continues with remaining work
90
+ 5. **Note in RECORD.md** — "Task X tagged [DEAD-END], moved to Task Y"
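Steps 2 and 3 above (write the entry, creating the file with a `# Failures` header on first write) could be sketched as below. The function name `append_dead_end` is hypothetical; the path and header text are taken from the steps.

```python
from pathlib import Path


def append_dead_end(entry_markdown: str, failures_path: Path) -> None:
    """Append one [DEAD-END] entry, creating FAILURES.md on first write."""
    failures_path.parent.mkdir(parents=True, exist_ok=True)
    if not failures_path.exists():
        # First write: start the file with the header named in step 3.
        failures_path.write_text("# Failures\n", encoding="utf-8")
    with failures_path.open("a", encoding="utf-8") as handle:
        # A leading blank line keeps entries visually separated.
        handle.write("\n" + entry_markdown.rstrip() + "\n")
```

A call such as `append_dead_end(entry, Path(".planning/breadcrumbs/FAILURES.md"))` would then cover both the first write and later appends.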
91
+
92
+ ### fire-planner (writes to FAILURES.md)
93
+
94
+ When a planning problem can't be resolved:
95
+
96
+ 1. Write `[DEAD-END]` entry with `Tags: logic` or `Tags: dependency`
97
+ 2. Plan around the dead end — design alternative approach that doesn't depend on it
98
+ 3. Mark the plan with `dead_end_dependency: true` in frontmatter
99
+
100
+ ### fire-verifier (writes to FAILURES.md)
101
+
102
+ When verification finds an unfixable issue:
103
+
104
+ 1. Write `[DEAD-END]` entry with exact failing test/check details
105
+ 2. Mark verification as CONDITIONAL PASS with `shelved_issues: N`
106
+
107
+ ### fire-researcher (reads + writes FAILURES.md)
108
+
109
+ In recovery mode, read FAILURES.md and filter for `[DEAD-END]` entries:
110
+ - If match found: include prior attempts in research context (avoid repeating)
111
+ - If no match: tag a new `[DEAD-END]` entry once research has failed through the 3-tier cascade
112
+
113
+ ### fire-6-resume (reads FAILURES.md — THE KEY STEP)
114
+
115
+ **On every session start, a fresh Claude instance:**
116
+
117
+ 1. Read `.planning/breadcrumbs/FAILURES.md` (if it exists)
118
+ 2. Filter for `[DEAD-END]` tagged entries
119
+ 3. For the highest-priority entry:
120
+ - Review the problem, prior attempts, and untested hypotheses
121
+ - With fresh context, attempt the top untested hypothesis
122
+ - If solved: move entry to `LESSONS.md` with `[DEAD-END-SOLVED]` tag, remove `[DEAD-END]` tag from FAILURES.md
123
+ - If still stuck: update the entry with new attempts, re-tag
124
+ 4. Then proceed with normal work
125
+
126
+ > **Why this works:** A fresh Claude instance has full context window, no degradation from prior compactions, and zero emotional attachment to the failed approach. It sees the problem with completely new eyes.
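The read-filter-select part of the resume step could look like the following sketch. It assumes entries use the heading and `**Priority:**` line from the entry format shown earlier; the function names and the default priority of `low` for entries without a priority line are hypothetical choices.

```python
import re
from typing import Dict, List, Optional

PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
HEADING = "### [DEAD-END] "


def dead_end_entries(failures_text: str) -> List[Dict[str, str]]:
    """Collect title and priority for every open [DEAD-END] entry."""
    entries: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for line in failures_text.splitlines():
        if line.startswith("#"):
            # Any heading ends the previous entry; only tagged ones open a new one.
            current = None
            if line.startswith(HEADING):
                current = {"title": line[len(HEADING):].strip(), "priority": "low"}
                entries.append(current)
        elif current is not None:
            match = re.match(r"\*\*Priority:\*\*\s*(\w+)", line)
            if match:
                current["priority"] = match.group(1).lower()
    return entries


def next_dead_end(entries: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Pick the highest-priority entry; ties keep file order."""
    if not entries:
        return None
    return min(entries, key=lambda e: PRIORITY_RANK.get(e["priority"], len(PRIORITY_RANK)))
```

Matching on the exact prefix `### [DEAD-END] ` means headings tagged `[DEAD-END-SOLVED]` are not picked up as open entries.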
127
+
128
+ ---
129
+
130
+ ## Autonomous Mode Integration
131
+
132
+ In `/fire-autonomous`, dead-end tagging is critical for cost efficiency:
133
+
134
+ ```
135
+ Agent hits dead end on Task 5
136
+ → Tags [DEAD-END] in FAILURES.md (30 seconds)
137
+ → Moves to Task 6, 7, 8 (continues productive work)
138
+ → Phase completes with 1 tagged dead end
139
+ → Handoff documents it
140
+ → Next session: fresh Claude reads FAILURES.md, solves Task 5 with clean context
141
+ ```
142
+
143
+ **Without tagging:** Agent burns 15+ minutes on Task 5, context degrades, Tasks 6-8 suffer. Cost: HIGH, quality: LOW.
144
+
145
+ **With tagging:** Agent spends 30 seconds documenting Task 5, full context preserved for Tasks 6-8. Cost: LOW, quality: HIGH.
146
+
147
+ ---
148
+
149
+ ## Hygiene
150
+
151
+ - **Max dead-end entries:** 10 (if more accumulate, oldest low-priority entries get resolved or removed)
152
+ - **Stale check:** If an entry is 5+ sessions old and low-priority, consider removing
153
+ - **Dedup:** Before writing a new entry, grep FAILURES.md for `[DEAD-END]` entries with similar titles and update the existing entry instead of adding a duplicate
154
+ - **Resolution celebration:** When solved, log to LESSONS.md with `[DEAD-END-SOLVED]` tag — document what the fresh context saw that the burned context missed
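The dedup check above depends on what counts as a "similar" title, which the skill leaves open. One possible sketch uses the standard library's `difflib.SequenceMatcher`; the function name and the 0.8 threshold are assumptions, not values from the package.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def find_similar_title(
    new_title: str,
    existing_titles: Iterable[str],
    threshold: float = 0.8,  # assumed cutoff; tune for the project
) -> Optional[str]:
    """Return the closest existing title at or above the threshold, if any."""
    best_title: Optional[str] = None
    best_score = threshold
    for title in existing_titles:
        score = SequenceMatcher(None, new_title.lower(), title.lower()).ratio()
        if score >= best_score:
            best_title, best_score = title, score
    return best_title
```

If the function returns a title, the agent would update that entry; if it returns `None`, a new entry is written.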
155
+
156
+ ---
157
+
158
+ ## Example: Real Dead-End Entry
159
+
160
+ ```markdown
161
+ ### [DEAD-END] Podcast feed XML parsing timeout on large feeds
162
+
163
+ **Shelved by:** fire-executor at 2026-03-04
164
+ **Phase:** 2 / **Plan:** 3
165
+ **Priority:** medium
166
+ **Tags:** api-integration, performance
167
+
168
+ **Problem:** RSS feed parser hangs on podcast feeds with 500+ episodes.
169
+ `podcastService.getFeed('live-teachings')` takes 45+ seconds and times out.
170
+
171
+ **What Was Tried:**
172
+ 1. Increased timeout to 60s → Still times out on 800-episode feeds
173
+ 2. Added `{ limit: 100 }` parameter → Parameter ignored by xml2js parser
174
+ 3. Tried streaming parser (sax-js) → Import error, needs different build config
175
+
176
+ **Relevant Files:**
177
+ - `server/services/podcastService.js` — getFeed method, line 45
178
+ - `server/config/podcast-feeds.json` — feed URLs
179
+
180
+ **Error:** `TimeoutError: Request timed out after 60000ms at podcastService.getFeed`
181
+
182
+ **Hypotheses Not Yet Tested:**
183
+ - Use `fast-xml-parser` instead of `xml2js` (reportedly 10x faster)
184
+ - Pre-fetch and cache feeds with a cron job instead of on-demand parsing
185
+ - Paginate at the API level — only parse first N items from XML stream
186
+
187
+ **Impact if unsolved:** Podcast dropdown shows loading spinner forever for large feeds. Workaround: hardcoded limit of 50 episodes.
188
+ ```
package/skills-library/_general/methodology/DESIGN_PHILOSOPHY_ENFORCEMENT.md
@@ -0,0 +1,152 @@
1
+ ---
2
+ name: design-philosophy-enforcement
3
+ category: methodology
4
+ version: 1.0.0
5
+ contributed: 2026-03-06
6
+ contributor: dominion-flow
7
+ last_updated: 2026-03-06
8
+ contributors:
9
+ - dominion-flow
10
+ tags: [architecture, audit, principles, enforcement, agents, meta-design]
11
+ difficulty: hard
12
+ usage_count: 0
13
+ success_rate: 100
14
+ ---
15
+
16
+ # Design Philosophy Enforcement
17
+
18
+ ## Problem
19
+
20
+ Multi-agent systems document design principles (honesty, research-first, plan-before-execute) but fail to **structurally enforce** them. Principles live in philosophy docs while the actual agent wiring contradicts or ignores them. The gap between stated values and implemented behavior grows silently over versions until the system's behavior no longer matches its stated identity.
21
+
22
+ Common symptoms:
23
+ - Agents document uncertainty but don't route to research
24
+ - "Plan right, execute once" is stated but execution can skip planning
25
+ - Failure states escalate to the user instead of triggering the system's own research capabilities
26
+ - One-size-fits-all processes (e.g., 70-point checklists) contradict "on-demand over ceremony"
27
+ - Circuit breakers stop the system instead of routing to recovery paths
28
+
29
+ ## Solution Pattern
30
+
31
+ Audit every principle against three dimensions, then wire missing enforcement into the architecture:
32
+
33
+ ### The Three-Dimensional Audit
34
+
35
+ For each stated design principle, answer:
36
+
37
+ 1. **STRONG AT** — Where is this principle already structurally enforced? (file:section reference + why it works)
38
+ 2. **WEAK AT** — Where is it documented but not wired? (file:section reference + what's missing)
39
+ 3. **CONTRADICTED AT** — Where does the system accidentally work against it? (file:section reference + what conflicts)
40
+
41
+ ### Enforcement Categories
42
+
43
+ Principles can be enforced at four levels (weakest to strongest):
44
+
45
+ | Level | Name | Example | Strength |
46
+ |-------|------|---------|----------|
47
+ | 1 | **Documented** | "Agents should research when stuck" | Weakest — ignored under pressure |
48
+ | 2 | **Prompted** | Honesty Gate asks "Am I tempted to rush?" | Medium — agent can answer dishonestly |
49
+ | 3 | **Gated** | Plan-checker must approve before execution starts | Strong — blocks progress without compliance |
50
+ | 4 | **Structural** | Executor literally cannot start without BLUEPRINT files existing | Strongest — architecturally impossible to violate |
51
+
52
+ **The audit goal:** Identify principles stuck at Level 1-2 that should be at Level 3-4.
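The audit goal can be expressed as a comparison between the level a principle currently has and the level it requires. The sketch below is a hypothetical illustration: `Level`, `enforcement_gaps`, and the rule that an unlisted principle counts as merely documented are assumptions.

```python
from enum import IntEnum
from typing import Dict, Tuple


class Level(IntEnum):
    """The four enforcement levels from the table, weakest to strongest."""
    DOCUMENTED = 1
    PROMPTED = 2
    GATED = 3
    STRUCTURAL = 4


def enforcement_gaps(
    current: Dict[str, Level],
    required: Dict[str, Level],
) -> Dict[str, Tuple[Level, Level]]:
    """Map each under-enforced principle to (current level, required level)."""
    gaps = {}
    for principle, needed in required.items():
        # Assumption: a principle missing from `current` is only documented.
        actual = current.get(principle, Level.DOCUMENTED)
        if actual < needed:
            gaps[principle] = (actual, needed)
    return gaps
```

Using `IntEnum` lets the levels be compared directly with `<`, which matches the weakest-to-strongest ordering in the table.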
53
+
54
+ ### The Probe Questions
55
+
56
+ Use these specific questions to find enforcement gaps:
57
+
58
+ 1. **Failure-path routing:** When an agent hits a wall, does the system route to its own research/recovery capabilities, or does it just stop/escalate?
59
+ 2. **Honesty reward loop:** When an agent admits "I don't know," does the architecture automatically trigger research, or just document the admission?
60
+ 3. **Ceremony detection:** Are there mandatory processes that run regardless of scope? Do small changes get the same heavyweight treatment as large features?
61
+ 4. **Skip-ability:** Can agents bypass stated requirements via flags (--skip-verify, --quick) that exist "for convenience" but undermine the principle?
62
+ 5. **Capability matching:** Do agents that encounter problems have the tools to solve them? (e.g., does the verifier have web search when it encounters unfamiliar failures?)
63
+ 6. **Cross-gap clustering:** Do enforcement gaps cluster in one area (e.g., all at failure-time transitions, all in one agent, all in one phase)?
64
+
65
+ ### The Fix Pattern
66
+
67
+ For each gap found, apply the minimum enforcement level that closes it:
68
+
69
+ ```
70
+ IF principle is violated because agents don't know about it:
71
+ → Level 2: Add to agent prompt/honesty gate
72
+
73
+ IF principle is violated because agents can skip it:
74
+ → Level 3: Add a gate (file existence check, status field, approval step)
75
+
76
+ IF principle is violated because the architecture allows bypass:
77
+ → Level 4: Remove the bypass, restructure the flow
78
+
79
+ IF principle is contradicted by another mechanism:
80
+ → Resolve the contradiction: one must yield
81
+ → Usually the contradicting mechanism is older and was never updated
82
+ ```
83
+
84
+ ## Code Example
85
+
86
+ ```markdown
87
+ // Before (principle documented but not enforced)
88
+
89
+ ## Design Philosophy
90
+ - "Research when hitting a wall" — don't brute-force
91
+
92
+ ## Circuit Breaker (in executor)
93
+ IF same error 5+ times:
94
+ → STOP. Escalate to user.
95
+ // Gap: system has a researcher agent but doesn't use it here
96
+
97
+ // After (principle structurally enforced)
98
+
99
+ ## Circuit Breaker (in executor)
100
+ IF same error 3+ times:
101
+ → WARNING: Spawn fire-researcher with error context
102
+ → Researcher returns 2-3 alternatives (ranked by confidence)
103
+ → Retry with top alternative
104
+
105
+ IF same error 5+ times AND researcher exhausted:
106
+ → STOP. Escalate to user with research findings attached.
107
+ // Now the principle is enforced: research happens before escalation
108
+ ```
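For readers who prefer executable form, the "After" circuit breaker could be sketched in Python as follows. The thresholds of 3 and 5 come from the example; the function name and return strings are hypothetical, and the case of an exhausted researcher below five errors is not covered by the example, so the sketch keeps routing to research there.

```python
RESEARCH_THRESHOLD = 3  # same error 3+ times: route to research
STOP_THRESHOLD = 5      # same error 5+ times and research exhausted: stop


def circuit_breaker(same_error_count: int, researcher_exhausted: bool) -> str:
    """Decide the next step after repeated occurrences of one error."""
    if same_error_count >= STOP_THRESHOLD and researcher_exhausted:
        return "stop and escalate with research findings"
    if same_error_count >= RESEARCH_THRESHOLD:
        # Research happens before any escalation to the user.
        return "spawn fire-researcher"
    return "retry"
```

The ordering of the two checks is what enforces the principle: escalation is reachable only after the research path has been tried.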
109
+
110
+ ## Implementation Steps
111
+
112
+ 1. **List all stated design principles** — extract from philosophy docs, READMEs, foundational documents
113
+ 2. **For each principle, audit all agents and commands** — use the STRONG/WEAK/CONTRADICTED framework
114
+ 3. **Identify enforcement level** for each principle in each location (Level 1-4)
115
+ 4. **Flag gaps** where enforcement level is below what the principle requires
116
+ 5. **Check for cross-gap patterns** — do gaps cluster in failure paths? in one agent? in one phase?
117
+ 6. **Prioritize fixes** by how many principles a single change reinforces simultaneously
118
+ 7. **Apply minimum viable enforcement** — don't over-gate; the lightest enforcement that closes the gap
119
+
120
+ ## When to Use
121
+
122
+ - After major version releases of an agent framework (for example, a v11 audit before v12 work begins)
123
+ - When users report that the system "doesn't feel like it follows its own rules"
124
+ - When onboarding new agents or commands — verify they inherit all principle enforcement
125
+ - When a post-mortem reveals the system had the capability to prevent a failure but didn't use it
126
+ - Quarterly health checks on multi-agent architectures
127
+
128
+ ## When NOT to Use
129
+
130
+ - On systems too small to have stated principles (single-script tools)
131
+ - During active feature development — audit between milestones, not during sprints
132
+ - When principles themselves need revision — update the principles first, then audit enforcement
133
+
134
+ ## Common Mistakes
135
+
136
+ - **Auditing only the happy path** — most enforcement gaps appear at failure-time transitions (circuit breaker trips, low confidence, verification failures), not during successful execution
137
+ - **Adding more documentation instead of structural gates** — if agents already ignore Level 1-2 enforcement, adding more docs won't help; move to Level 3-4
138
+ - **Over-gating** — not every principle needs Level 4 enforcement; some are genuinely better as prompted guidelines
139
+ - **Fixing symptoms not patterns** — if gaps cluster in one area (e.g., all failure-path routing), fix the systemic pattern rather than patching each gap individually
140
+
141
+ ## Related Skills
142
+
143
+ - [EVIDENCE_BASED_VALIDATION](../../methodology/EVIDENCE_BASED_VALIDATION.md)
144
+ - [INSTRUMENTATION_OVER_RESTRICTION](../../methodology/INSTRUMENTATION_OVER_RESTRICTION.md)
145
+ - [CONFIDENCE_GATED_EXECUTION](../../methodology/CONFIDENCE_GATED_EXECUTION.md)
146
+
147
+ ## References
148
+
149
+ - Dominion Flow v11.3 principles audit (2026-03-06) — first application of this methodology
150
+ - The "Three Questions" honesty gate pattern — example of Level 2 enforcement done well
151
+ - Recovery Research Mode (4-tier cascade) — example of Level 4 structural enforcement
152
+ - ACE: Agentic Context Engineering (ICLR 2026) — adaptive playbooks as runtime principle enforcement