@thierrynakoa/fire-flow 10.0.0 → 12.2.0
- package/.claude-plugin/plugin.json +9 -9
- package/ARCHITECTURE-DIAGRAM.md +7 -4
- package/COMMAND-REFERENCE.md +33 -13
- package/DOMINION-FLOW-OVERVIEW.md +581 -421
- package/QUICK-START.md +3 -3
- package/README.md +102 -45
- package/TROUBLESHOOTING.md +264 -264
- package/agents/fire-executor.md +200 -116
- package/agents/fire-fact-checker.md +276 -276
- package/agents/fire-phoenix-analyst.md +394 -0
- package/agents/fire-planner.md +145 -53
- package/agents/fire-project-researcher.md +155 -155
- package/agents/fire-research-synthesizer.md +166 -166
- package/agents/fire-researcher.md +144 -59
- package/agents/fire-roadmapper.md +215 -203
- package/agents/fire-verifier.md +247 -65
- package/agents/fire-vision-architect.md +381 -0
- package/commands/fire-0-orient.md +476 -476
- package/commands/fire-1a-new.md +216 -0
- package/commands/fire-1b-research.md +210 -0
- package/commands/fire-1c-setup.md +254 -0
- package/commands/{fire-1a-discuss.md → fire-1d-discuss.md} +35 -7
- package/commands/fire-3-execute.md +55 -2
- package/commands/fire-4-verify.md +61 -0
- package/commands/fire-5-handoff.md +2 -2
- package/commands/fire-6-resume.md +37 -2
- package/commands/fire-add-new-skill.md +2 -2
- package/commands/fire-autonomous.md +20 -3
- package/commands/fire-brainstorm.md +1 -1
- package/commands/fire-complete-milestone.md +2 -2
- package/commands/fire-cost.md +183 -0
- package/commands/fire-dashboard.md +2 -2
- package/commands/fire-debug.md +663 -663
- package/commands/fire-loop-resume.md +2 -2
- package/commands/fire-loop-stop.md +1 -1
- package/commands/fire-loop.md +1168 -1168
- package/commands/fire-map-codebase.md +3 -3
- package/commands/fire-new-milestone.md +356 -356
- package/commands/fire-phoenix.md +603 -0
- package/commands/fire-reflect.md +235 -235
- package/commands/fire-research.md +246 -246
- package/commands/fire-search.md +1 -1
- package/commands/fire-skills-diff.md +3 -3
- package/commands/fire-skills-history.md +3 -3
- package/commands/fire-skills-rollback.md +7 -7
- package/commands/fire-skills-sync.md +5 -5
- package/commands/fire-test.md +9 -9
- package/commands/fire-todos.md +1 -1
- package/commands/fire-update.md +5 -5
- package/hooks/hooks.json +16 -16
- package/hooks/run-hook.sh +8 -8
- package/hooks/run-session-end.sh +7 -7
- package/hooks/session-end.sh +90 -90
- package/hooks/session-start.sh +1 -1
- package/package.json +2 -2
- package/plugin.json +7 -7
- package/references/metrics-and-trends.md +1 -1
- package/skills-library/SKILLS-INDEX.md +588 -588
- package/skills-library/_general/methodology/AUTONOMOUS_ORCHESTRATION.md +182 -0
- package/skills-library/_general/methodology/BACKWARD_PLANNING_INTERVIEW.md +307 -0
- package/skills-library/_general/methodology/CIRCUIT_BREAKER_INTELLIGENCE.md +163 -0
- package/skills-library/_general/methodology/CONTEXT_ROTATION.md +151 -0
- package/skills-library/_general/methodology/DEAD_ENDS_SHELF.md +188 -0
- package/skills-library/_general/methodology/DESIGN_PHILOSOPHY_ENFORCEMENT.md +152 -0
- package/skills-library/_general/methodology/INTERNAL_CONSISTENCY_AUDIT.md +212 -0
- package/skills-library/_general/methodology/LIVE_BREADCRUMB_PROTOCOL.md +242 -0
- package/skills-library/_general/methodology/PHOENIX_REBUILD_METHODOLOGY.md +251 -0
- package/skills-library/_general/methodology/QUALITY_GATES_AND_VERIFICATION.md +157 -0
- package/skills-library/_general/methodology/RELIABILITY_PREDICTION.md +104 -0
- package/skills-library/_general/methodology/REQUIREMENTS_DECOMPOSITION.md +155 -0
- package/skills-library/_general/methodology/SELF_TESTING_FEEDBACK_LOOP.md +143 -0
- package/skills-library/_general/methodology/STACK_COMPATIBILITY_MATRIX.md +178 -0
- package/skills-library/_general/methodology/TIERED_CONTEXT_ARCHITECTURE.md +118 -0
- package/skills-library/_general/methodology/ZERO_FRICTION_CLI_SETUP.md +312 -0
- package/skills-library/_general/methodology/autonomous-multi-phase-build.md +133 -0
- package/skills-library/_general/methodology/claude-md-archival.md +280 -0
- package/skills-library/_general/methodology/debug-swarm-researcher-escape-hatch.md +240 -240
- package/skills-library/_general/methodology/git-worktrees-parallel.md +232 -0
- package/skills-library/_general/methodology/llm-judge-memory-crud.md +241 -0
- package/skills-library/_general/methodology/multi-project-autonomous-build.md +360 -0
- package/skills-library/_general/methodology/shell-autonomous-loop-fixplan.md +238 -238
- package/skills-library/_general/patterns-standards/GOF_DESIGN_PATTERNS_FOR_AI_AGENTS.md +358 -0
- package/skills-library/methodology/BREATH_BASED_PARALLEL_EXECUTION.md +1 -1
- package/skills-library/methodology/RESEARCH_BACKED_WORKFLOW_UPGRADE.md +1 -1
- package/skills-library/methodology/SABBATH_REST_PATTERN.md +1 -1
- package/templates/ASSUMPTIONS.md +1 -1
- package/templates/BLOCKERS.md +1 -1
- package/templates/DECISION_LOG.md +1 -1
- package/templates/phase-prompt.md +1 -1
- package/templates/phoenix-comparison.md +80 -0
- package/version.json +2 -2
- package/workflows/handoff-session.md +1 -1
- package/workflows/new-project.md +2 -2
- package/commands/fire-1-new.md +0 -281
|
@@ -0,0 +1,151 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: CONTEXT_ROTATION
|
|
3
|
+
category: methodology
|
|
4
|
+
description: Fresh-eyes debugging science — when and how to rotate context to break fixation, with cognitive science backing and practical protocols for AI agent handoffs
|
|
5
|
+
version: 1.0.0
|
|
6
|
+
tags: [context-rotation, fresh-eyes, fixation, incubation, debugging, handoff]
|
|
7
|
+
sources:
|
|
8
|
+
- "Duncker (1945) — Functional Fixedness"
|
|
9
|
+
- "Springer Memory & Cognition — Interrupted distributed effort and incubation"
|
|
10
|
+
- "Psychology Today — Rubber Duck Debugging Psychology"
|
|
11
|
+
- "Frontiers in Education — Functional Fixedness in Problem Solving"
|
|
12
|
+
- "The Decision Lab — Functional Fixedness bias"
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
# Context Rotation Protocol
|
|
16
|
+
|
|
17
|
+
> **Core insight:** A stuck agent with degraded context is the worst possible solver. A fresh agent with full context and documented prior attempts is the best. Context window length is a fixation risk, not just a token limit.
|
|
18
|
+
|
|
19
|
+
---
|
|
20
|
+
|
|
21
|
+
## 1. The Science of Getting Stuck
|
|
22
|
+
|
|
23
|
+
### Functional Fixedness (Duncker, 1945)
|
|
24
|
+
Once a problem-solver commits to a conceptual framing, they become blind to alternative framings, even when the solution is obvious in retrospect. Young children (around age five) have been reported to show less functional fixedness than older children on some object-use tasks, plausibly because they are less committed to an object's conventional use.
|
|
25
|
+
|
|
26
|
+
**For AI agents:** An agent that has spent 4+ exchanges approaching a bug as a data model issue will continue framing it as a data model issue even when symptoms clearly suggest concurrency. The longer the context window stays focused on one framing, the deeper the fixation.
|
|
27
|
+
|
|
28
|
+
### Incubation Effect
|
|
29
|
+
Research shows approaching problems "briefly and repeatedly" rather than continuously reduces fixation:
|
|
30
|
+
- Stopping work BEFORE hitting a wall preserves the ability to restructure
|
|
31
|
+
- Incubation specifically benefits **insight problems** (where the wrong approach actively blocks the right one)
|
|
32
|
+
- Less effective for **analytical problems** (where grinding produces incremental progress)
|
|
33
|
+
|
|
34
|
+
**Classification question before rotation:** Is this a grinding problem (try harder) or an insight problem (fixation is the enemy)? The answer determines whether iteration or rotation is correct.
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## 2. When to Rotate Context
|
|
39
|
+
|
|
40
|
+
| Signal | Type | Action |
|--------|------|--------|
| Same approach, 3+ syntax variations | Fixation | Rotate immediately |
| Context compaction just happened | Degradation | Consider rotation |
| Agent confidence dropping each attempt | Diminishing returns | Rotate after next failure |
| "I've tried everything I can think of" | Exhaustion | Rotate with full dead-end map |
| Different approach, same class of error | Deeper issue | Research first, then rotate if no resolution |
|
|
47
|
+
|
|
48
|
+
### When NOT to Rotate
|
|
49
|
+
- Transient errors (API timeout, build cache) — retry is cheaper
|
|
50
|
+
- Missing information (credentials, config) — human input needed, not fresh eyes
|
|
51
|
+
- Making steady progress — don't fix what isn't broken
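
The signal table and the "When NOT to Rotate" list can be read as a single lookup from observed signal to recommended action. The sketch below is one possible encoding, not part of the package; the `Signal` enum, the `ACTIONS` mapping, and `recommend` are hypothetical names, and the action strings paraphrase the text above.

```python
from enum import Enum


class Signal(Enum):
    """Stuck signals and non-rotation cases (hypothetical labels)."""
    SAME_APPROACH_3_VARIATIONS = "same approach, 3+ syntax variations"
    CONTEXT_COMPACTED = "context compaction just happened"
    CONFIDENCE_DROPPING = "agent confidence dropping each attempt"
    OUT_OF_IDEAS = "I've tried everything I can think of"
    SAME_ERROR_CLASS = "different approach, same class of error"
    TRANSIENT_ERROR = "transient error (API timeout, build cache)"
    MISSING_INFO = "missing information (credentials, config)"
    STEADY_PROGRESS = "making steady progress"


# Recommended action per signal, paraphrased from the table and list above.
ACTIONS = {
    Signal.SAME_APPROACH_3_VARIATIONS: "rotate immediately",
    Signal.CONTEXT_COMPACTED: "consider rotation",
    Signal.CONFIDENCE_DROPPING: "rotate after next failure",
    Signal.OUT_OF_IDEAS: "rotate with full dead-end map",
    Signal.SAME_ERROR_CLASS: "research first, then rotate if no resolution",
    Signal.TRANSIENT_ERROR: "retry",
    Signal.MISSING_INFO: "ask the human",
    Signal.STEADY_PROGRESS: "keep going",
}


def recommend(signal: Signal) -> str:
    """Return the recommended action for one observed signal."""
    return ACTIONS[signal]
```

Keeping the non-rotation cases in the same mapping makes "do nothing different" an explicit outcome rather than a missing key.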
|
|
52
|
+
|
|
53
|
+
---
|
|
54
|
+
|
|
55
|
+
## 3. What the Fresh Agent Receives
|
|
56
|
+
|
|
57
|
+
**Critical rule:** Give the fresh agent the dead-end MAP, not the dead-end JOURNEY.
|
|
58
|
+
|
|
59
|
+
### Give:
|
|
60
|
+
```markdown
## Problem Context for Fresh Instance

**Goal:** {what needs to be accomplished}
**Current state:** {what exists, what's working}

**Approaches tried (outcome map):**
1. {approach} → Failed because: {root cause}
2. {approach} → Failed because: {root cause}
3. {approach} → Failed because: {root cause}

**Constraints identified:**
- {constraint 1 (verified, not assumed)}
- {constraint 2 (verified, not assumed)}

**Untested hypotheses:**
- {idea 1: why it might work}
- {idea 2: why it might work}

**Relevant files:** {specific paths}
```
|
|
81
|
+
|
|
82
|
+
### Do NOT give:
|
|
83
|
+
- Full conversation history (propagates the original framing/fixation)
|
|
84
|
+
- Emotional language ("this is really frustrating", "nothing works")
|
|
85
|
+
- Vague descriptions ("I tried a bunch of things")
|
|
86
|
+
|
|
87
|
+
**Why:** The fresh agent needs boundary knowledge (where NOT to walk) and starting context (where TO start). It does not need the journey narrative — that's what created the fixation in the first place.
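
One way to keep the journey narrative out of a handoff is to build it from structured fields instead of free text. The sketch below assumes hypothetical `Attempt` and `DeadEndMap` dataclasses (neither exists in the package) and renders them into the "Problem Context for Fresh Instance" layout shown earlier in this section.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    approach: str
    root_cause: str  # why it failed, not how the session went


@dataclass
class DeadEndMap:
    goal: str
    current_state: str
    attempts: list[Attempt] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    hypotheses: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render outcomes only; there is no field for conversation history."""
        lines = [
            "## Problem Context for Fresh Instance",
            "",
            f"**Goal:** {self.goal}",
            f"**Current state:** {self.current_state}",
            "",
            "**Approaches tried (outcome map):**",
        ]
        lines += [
            f"{i}. {a.approach} → Failed because: {a.root_cause}"
            for i, a in enumerate(self.attempts, start=1)
        ]
        lines += ["", "**Constraints identified:**"]
        lines += [f"- {c}" for c in self.constraints]
        lines += ["", "**Untested hypotheses:**"]
        lines += [f"- {h}" for h in self.hypotheses]
        lines += ["", f"**Relevant files:** {', '.join(self.files)}"]
        return "\n".join(lines)
```

Because the structure has no slot for transcripts or emotional commentary, the "Do NOT give" items cannot leak into the handoff by accident.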
|
|
88
|
+
|
|
89
|
+
---
|
|
90
|
+
|
|
91
|
+
## 4. Articulation Protocol (Rubber Duck Step)
|
|
92
|
+
|
|
93
|
+
Before any rotation, require the stuck agent to produce a structured articulation:
|
|
94
|
+
|
|
95
|
+
```markdown
## Articulation (Pre-Rotation)

1. What I was trying to do: {goal in one sentence}
2. What I expected to happen: {specific expected behavior}
3. What actually happened: {specific actual behavior}
4. What I believe the constraint is: {my theory}
5. What assumption am I making that might be wrong: {honest assessment}
```
|
|
104
|
+
|
|
105
|
+
**Why this works:** Explaining the problem forces a sequential, explicit reconstruction of assumptions, and putting it into words engages different reasoning than pattern-matching on the code. By this skill's estimate, articulation alone resolves roughly 30-40% of stuck cases, so the stuck agent often solves the problem without any rotation.
|
|
106
|
+
|
|
107
|
+
**Agent action:** Always run the articulation step BEFORE spawning a fresh agent. If the articulation reveals the issue, save the rotation cost.
|
|
108
|
+
|
|
109
|
+
---
|
|
110
|
+
|
|
111
|
+
## 5. Context Window as Fixation Accumulator
|
|
112
|
+
|
|
113
|
+
Every exchange in a long session adds fixation weight:
|
|
114
|
+
|
|
115
|
+
```
Session start: fixation_risk = LOW
After 10 exchanges on same topic: fixation_risk = MEDIUM
After 20 exchanges on same topic: fixation_risk = HIGH
After context compaction: fixation_risk = ELEVATED
(compaction removes details but preserves framing bias)
```
|
|
122
|
+
|
|
123
|
+
**Mitigation strategies (in order of preference):**
|
|
124
|
+
1. **Articulation protocol**: cheapest; estimated to resolve 30-40% of stuck cases on its own
|
|
125
|
+
2. **Scope switch** — work on a different task, come back later
|
|
126
|
+
3. **Research injection** — read external docs/code to introduce new framing
|
|
127
|
+
4. **Fresh agent with dead-end map** — full context rotation
|
|
128
|
+
5. **Human clarification** — the human may see what both agents missed
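
The risk levels above can be sketched as a small function of exchange count and compaction state. This is an illustration under assumptions, not the package's logic: `fixation_risk` is a hypothetical name, and since the text does not say how `ELEVATED` ranks against `HIGH`, the sketch assumes compaction raises `LOW` or `MEDIUM` to `ELEVATED` and leaves `HIGH` unchanged.

```python
def fixation_risk(exchanges_on_topic: int, compacted: bool = False) -> str:
    """Map session state to a fixation risk label.

    Thresholds (10 and 20 exchanges) come from the text above.
    Assumption: compaction never lowers an existing HIGH rating.
    """
    if exchanges_on_topic >= 20:
        level = "HIGH"
    elif exchanges_on_topic >= 10:
        level = "MEDIUM"
    else:
        level = "LOW"
    if compacted and level != "HIGH":
        return "ELEVATED"
    return level
```

A caller could walk the mitigation list in order once the label leaves `LOW`, starting with the articulation protocol because it is the cheapest step.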
|
|
129
|
+
|
|
130
|
+
---
|
|
131
|
+
|
|
132
|
+
## 6. The Navigator Pattern (from Pair Programming)
|
|
133
|
+
|
|
134
|
+
In pair programming research, the **navigator role** (observer, not typist) has the highest fresh-eyes effect — they see what the driver's fixation hides.
|
|
135
|
+
|
|
136
|
+
**Applied to Dominion Flow:** The fire-verifier IS the navigator. It reads the executor's output from outside the execution context. This is why verifier isolation (separate instance, fresh context) is architecturally important — not just for objectivity, but for fixation prevention.
|
|
137
|
+
|
|
138
|
+
**Role rotation frequency:**
|
|
139
|
+
- Every task boundary = natural rotation point
|
|
140
|
+
- Every phase boundary = mandatory rotation point (WARRIOR handoff)
|
|
141
|
+
- Mid-task rotation = only when stuck signals detected
|
|
142
|
+
|
|
143
|
+
---
|
|
144
|
+
|
|
145
|
+
## When Agents Should Reference This Skill
|
|
146
|
+
|
|
147
|
+
- **fire-executor:** Run articulation protocol when stuck, classify grinding vs. insight before retrying
|
|
148
|
+
- **fire-verifier:** Leverage navigator advantage — flag issues the executor's fixation may hide
|
|
149
|
+
- **fire-6-resume:** Fresh instance = natural context rotation. Read dead-end map, not prior session transcript
|
|
150
|
+
- **fire-autonomous:** Monitor fixation risk in long sessions, trigger rotation at HIGH
|
|
151
|
+
- **Any agent writing a WARRIOR handoff:** Structure handoff as a dead-end map (outcomes, not journey)
|
|
@@ -0,0 +1,188 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: DEAD_ENDS_SHELF
|
|
3
|
+
category: methodology
|
|
4
|
+
description: Tag unsolved problems in FAILURES.md for fresh Claude instances — stop burning tokens on dead ends, move on, let cleaner context solve it later
|
|
5
|
+
version: 2.0.0
|
|
6
|
+
tags: [dead-ends, autonomous, cost-optimization, shelving, context-rotation]
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Dead Ends Protocol (v11.3)
|
|
10
|
+
|
|
11
|
+
When an agent hits a wall — code that won't compile, an API that won't respond, a logic puzzle that burns 3+ attempts — **tag it `[DEAD-END]` in FAILURES.md and move on**. A fresh Claude instance with clean context will attempt it on the next session start.
|
|
12
|
+
|
|
13
|
+
> **Philosophy:** A stuck agent with degraded context is the worst possible solver. A fresh agent with full context and documented prior attempts is the best. Stop burning tokens on dead ends — rotate the problem to a fresh mind.
|
|
14
|
+
|
|
15
|
+
> **v11.3 change:** Dead ends are now tagged entries in `breadcrumbs/FAILURES.md` instead of a separate `DEAD-ENDS.md` file. This reduces file count and aligns with the on-demand breadcrumb protocol.
|
|
16
|
+
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
## Location: `.planning/breadcrumbs/FAILURES.md`
|
|
20
|
+
|
|
21
|
+
Dead-end entries live alongside regular failure entries, distinguished by the `[DEAD-END]` tag. They are:
|
|
22
|
+
- **Written by:** Any agent that hits a dead end (executor, planner, verifier, researcher)
|
|
23
|
+
- **Read by:** `/fire-6-resume` on session start — fresh Claude filters for `[DEAD-END]` entries and attempts solutions
|
|
24
|
+
- **Cleared by:** The agent that solves the problem (move entry to `LESSONS.md` with `[DEAD-END-SOLVED]` tag, remove `[DEAD-END]` tag from FAILURES.md)
|
|
25
|
+
|
|
26
|
+
---
|
|
27
|
+
|
|
28
|
+
## When to Shelve (Trigger Rules)
|
|
29
|
+
|
|
30
|
+
| Condition | Action |
|-----------|--------|
| **3+ failed attempts** at the same problem | Tag `[DEAD-END]` immediately |
| **Blocked by missing info** (credentials, API keys, external dependency) | Tag with `[DEAD-END] [NEEDS-HUMAN]` |
| **Context degradation detected** (compaction happened, losing details) | Tag before context loss |
| **Confidence drops below 30%** after research | Tag (you're guessing) |
| **Circular dependency** (fix A breaks B, fix B breaks A) | Tag with both sides documented |
| **Time sink**: 15+ minutes on a single non-critical issue | Tag and move to next task |
|
|
38
|
+
|
|
39
|
+
### When NOT to Shelve
|
|
40
|
+
|
|
41
|
+
- Critical path blockers (nothing else can proceed without this)
|
|
42
|
+
- Security vulnerabilities (must fix now)
|
|
43
|
+
- Data loss risks (must fix now)
|
|
44
|
+
|
|
45
|
+
For critical blockers: escalate to the user via checkpoint, don't shelve.
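
The trigger table and its exceptions can be combined into one decision function. The sketch below is a hedged reading of those rules, not code from the package: `TaskState`, `shelve_decision`, and the returned labels `CONTINUE`, `FIX-NOW`, and `ESCALATE` are hypothetical, and confidence is assumed to be a fraction between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    failed_attempts: int = 0
    missing_info: bool = False
    context_compacted: bool = False
    confidence: float = 1.0          # assumed scale: 0.0 to 1.0, after research
    circular_dependency: bool = False
    minutes_spent: float = 0.0
    critical_path: bool = False
    security_or_data_loss: bool = False


def shelve_decision(t: TaskState) -> str:
    """Apply the shelve triggers, then the 'When NOT to Shelve' exceptions."""
    triggered = (
        t.failed_attempts >= 3
        or t.missing_info
        or t.context_compacted
        or t.confidence < 0.30
        or t.circular_dependency
        or t.minutes_spent >= 15
    )
    if not triggered:
        return "CONTINUE"
    if t.security_or_data_loss:
        return "FIX-NOW"             # must fix now, never shelved
    if t.critical_path:
        return "ESCALATE"            # checkpoint with the user instead
    if t.missing_info:
        return "[DEAD-END] [NEEDS-HUMAN]"
    return "[DEAD-END]"
```

Checking the exceptions after the triggers keeps the common case cheap: a task that is going fine never reaches the escalation logic.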
|
|
46
|
+
|
|
47
|
+
---
|
|
48
|
+
|
|
49
|
+
## Dead-End Entry Format
|
|
50
|
+
|
|
51
|
+
```markdown
### [DEAD-END] {short title}

**Shelved by:** {agent name} at {timestamp}
**Phase:** {phase number} / **Plan:** {plan number}
**Priority:** {critical | high | medium | low}
**Tags:** {code-bug, api-integration, dependency, logic, config, performance, unknown}

**Problem:** {What you were trying to do and what went wrong. Be specific.}

**What Was Tried** (prevents next instance from repeating):
1. {what you tried} → {what happened}
2. {what you tried} → {what happened}
3. {what you tried} → {what happened}

**Relevant Files:**
- `{file path}`: {what matters}

**Error:** `{exact error message}`

**Hypotheses Not Yet Tested:**
- {idea 1: why it might work}
- {idea 2: why it might work}

**Impact if unsolved:** {what breaks or degrades}
```
|
|
77
|
+
|
|
78
|
+
---
|
|
79
|
+
|
|
80
|
+
## Agent Integration
|
|
81
|
+
|
|
82
|
+
### fire-executor (writes to FAILURES.md)
|
|
83
|
+
|
|
84
|
+
When a task hits the shelve trigger:
|
|
85
|
+
|
|
86
|
+
1. **Stop working on this task** — don't burn more tokens
|
|
87
|
+
2. **Write a `[DEAD-END]` entry** to `.planning/breadcrumbs/FAILURES.md` using the format above
|
|
88
|
+
3. **On first write:** If FAILURES.md doesn't exist, create it with a `# Failures` header first
|
|
89
|
+
4. **Move to next task** — the agent continues with remaining work
|
|
90
|
+
5. **Note in RECORD.md** — "Task X tagged [DEAD-END], moved to Task Y"
|
|
91
|
+
|
|
92
|
+
### fire-planner (writes to FAILURES.md)
|
|
93
|
+
|
|
94
|
+
When a planning problem can't be resolved:
|
|
95
|
+
|
|
96
|
+
1. Write `[DEAD-END]` entry with `Tags: logic` or `Tags: dependency`
|
|
97
|
+
2. Plan around the dead end — design alternative approach that doesn't depend on it
|
|
98
|
+
3. Mark the plan with `dead_end_dependency: true` in frontmatter
|
|
99
|
+
|
|
100
|
+
### fire-verifier (writes to FAILURES.md)
|
|
101
|
+
|
|
102
|
+
When verification finds an unfixable issue:
|
|
103
|
+
|
|
104
|
+
1. Write `[DEAD-END]` entry with exact failing test/check details
|
|
105
|
+
2. Mark verification as CONDITIONAL PASS with `shelved_issues: N`
|
|
106
|
+
|
|
107
|
+
### fire-researcher (reads + writes FAILURES.md)
|
|
108
|
+
|
|
109
|
+
In recovery mode, read FAILURES.md and filter for `[DEAD-END]` entries:
|
|
110
|
+
- If match found: include prior attempts in research context (avoid repeating)
|
|
111
|
+
- If no match: tag `[DEAD-END]` if research fails after 3-tier cascade
|
|
112
|
+
|
|
113
|
+
### fire-6-resume (reads FAILURES.md — THE KEY STEP)
|
|
114
|
+
|
|
115
|
+
**On every session start, a fresh Claude instance should:**

1. Read `.planning/breadcrumbs/FAILURES.md` (if it exists)
2. Filter for `[DEAD-END]` tagged entries
3. For the highest-priority entry:
   - Review the problem, prior attempts, and untested hypotheses
   - With fresh context, attempt the top untested hypothesis
   - If solved: move entry to `LESSONS.md` with `[DEAD-END-SOLVED]` tag, remove `[DEAD-END]` tag from FAILURES.md
   - If still stuck: update the entry with new attempts, re-tag
|
|
124
|
+
4. Then proceed with normal work
|
|
125
|
+
|
|
126
|
+
> **Why this works:** A fresh Claude instance has a full context window, no degradation from prior compactions, and zero emotional attachment to the failed approach. It sees the problem with completely new eyes.
|
|
127
|
+
|
|
128
|
+
---
|
|
129
|
+
|
|
130
|
+
## Autonomous Mode Integration
|
|
131
|
+
|
|
132
|
+
In `/fire-autonomous`, dead-end tagging is critical for cost efficiency:
|
|
133
|
+
|
|
134
|
+
```
Agent hits dead end on Task 5
→ Tags [DEAD-END] in FAILURES.md (30 seconds)
→ Moves to Task 6, 7, 8 (continues productive work)
→ Phase completes with 1 tagged dead end
→ Handoff documents it
→ Next session: fresh Claude reads FAILURES.md, solves Task 5 with clean context
```
|
|
142
|
+
|
|
143
|
+
**Without tagging:** Agent burns 15+ minutes on Task 5, context degrades, Tasks 6-8 suffer. Cost: HIGH, quality: LOW.
|
|
144
|
+
|
|
145
|
+
**With tagging:** Agent spends 30 seconds documenting Task 5, full context preserved for Tasks 6-8. Cost: LOW, quality: HIGH.
|
|
146
|
+
|
|
147
|
+
---
|
|
148
|
+
|
|
149
|
+
## Hygiene
|
|
150
|
+
|
|
151
|
+
- **Max dead-end entries:** 10 (if more accumulate, oldest low-priority entries get resolved or removed)
|
|
152
|
+
- **Stale check:** If an entry is 5+ sessions old and low-priority, consider removing
|
|
153
|
+
- **Dedup:** Before writing a new entry, grep FAILURES.md for `[DEAD-END]` entries with similar titles — update instead of duplicate
|
|
154
|
+
- **Resolution celebration:** When solved, log to LESSONS.md with `[DEAD-END-SOLVED]` tag — document what the fresh context saw that the burned context missed
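
The dedup rule above relies on grepping for similar titles. As one possible way to make "similar" concrete, the sketch below swaps in fuzzy matching from Python's standard `difflib` module; this is a substitute technique, not what the skill prescribes, and the function name `find_duplicate` and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher
from typing import Optional


def find_duplicate(
    new_title: str,
    existing_titles: list[str],
    threshold: float = 0.8,  # assumed cutoff; tune for your entries
) -> Optional[str]:
    """Return the most similar existing title at or above the threshold."""
    best_title, best_ratio = None, 0.0
    for title in existing_titles:
        ratio = SequenceMatcher(None, new_title.lower(), title.lower()).ratio()
        if ratio >= threshold and ratio > best_ratio:
            best_title, best_ratio = title, ratio
    return best_title
```

If a match comes back, the agent updates that entry with its new attempts instead of appending a second one.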
|
|
155
|
+
|
|
156
|
+
---
|
|
157
|
+
|
|
158
|
+
## Example: Real Dead-End Entry
|
|
159
|
+
|
|
160
|
+
```markdown
### [DEAD-END] Podcast feed XML parsing timeout on large feeds

**Shelved by:** fire-executor at 2026-03-04
**Phase:** 2 / **Plan:** 3
**Priority:** medium
**Tags:** api-integration, performance

**Problem:** RSS feed parser hangs on podcast feeds with 500+ episodes.
`podcastService.getFeed('live-teachings')` takes 45+ seconds and times out.

**What Was Tried:**
1. Increased timeout to 60s → Still times out on 800-episode feeds
2. Added `{ limit: 100 }` parameter → Parameter ignored by xml2js parser
3. Tried streaming parser (sax-js) → Import error, needs different build config

**Relevant Files:**
- `server/services/podcastService.js`: getFeed method, line 45
- `server/config/podcast-feeds.json`: feed URLs

**Error:** `TimeoutError: Request timed out after 60000ms at podcastService.getFeed`

**Hypotheses Not Yet Tested:**
- Use `fast-xml-parser` instead of `xml2js` (reportedly 10x faster)
- Pre-fetch and cache feeds with a cron job instead of on-demand parsing
- Paginate at the API level: only parse first N items from XML stream

**Impact if unsolved:** Podcast dropdown shows loading spinner forever for large feeds. Workaround: hardcoded limit of 50 episodes.
```
|
|
@@ -0,0 +1,152 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: design-philosophy-enforcement
|
|
3
|
+
category: methodology
|
|
4
|
+
version: 1.0.0
|
|
5
|
+
contributed: 2026-03-06
|
|
6
|
+
contributor: dominion-flow
|
|
7
|
+
last_updated: 2026-03-06
|
|
8
|
+
contributors:
|
|
9
|
+
- dominion-flow
|
|
10
|
+
tags: [architecture, audit, principles, enforcement, agents, meta-design]
|
|
11
|
+
difficulty: hard
|
|
12
|
+
usage_count: 0
|
|
13
|
+
success_rate: 100
|
|
14
|
+
---
|
|
15
|
+
|
|
16
|
+
# Design Philosophy Enforcement
|
|
17
|
+
|
|
18
|
+
## Problem
|
|
19
|
+
|
|
20
|
+
Multi-agent systems document design principles (honesty, research-first, plan-before-execute) but fail to **structurally enforce** them. Principles live in philosophy docs while the actual agent wiring contradicts or ignores them. The gap between stated values and implemented behavior grows silently over versions until the system's behavior no longer matches its stated identity.
|
|
21
|
+
|
|
22
|
+
Common symptoms:
|
|
23
|
+
- Agents document uncertainty but don't route to research
|
|
24
|
+
- "Plan right, execute once" is stated but execution can skip planning
|
|
25
|
+
- Failure states escalate to the user instead of triggering the system's own research capabilities
|
|
26
|
+
- One-size-fits-all processes (e.g., 70-point checklists) contradict "on-demand over ceremony"
|
|
27
|
+
- Circuit breakers stop the system instead of routing to recovery paths
|
|
28
|
+
|
|
29
|
+
## Solution Pattern
|
|
30
|
+
|
|
31
|
+
Audit every principle against three dimensions, then wire missing enforcement into the architecture:
|
|
32
|
+
|
|
33
|
+
### The Three-Dimensional Audit
|
|
34
|
+
|
|
35
|
+
For each stated design principle, answer:
|
|
36
|
+
|
|
37
|
+
1. **STRONG AT** — Where is this principle already structurally enforced? (file:section reference + why it works)
|
|
38
|
+
2. **WEAK AT** — Where is it documented but not wired? (file:section reference + what's missing)
|
|
39
|
+
3. **CONTRADICTED AT** — Where does the system accidentally work against it? (file:section reference + what conflicts)
|
|
40
|
+
|
|
41
|
+
### Enforcement Categories
|
|
42
|
+
|
|
43
|
+
Principles can be enforced at four levels (weakest to strongest):
|
|
44
|
+
|
|
45
|
+
| Level | Name | Example | Strength |
|-------|------|---------|----------|
| 1 | **Documented** | "Agents should research when stuck" | Weakest: ignored under pressure |
| 2 | **Prompted** | Honesty Gate asks "Am I tempted to rush?" | Medium: agent can answer dishonestly |
| 3 | **Gated** | Plan-checker must approve before execution starts | Strong: blocks progress without compliance |
| 4 | **Structural** | Executor literally cannot start without BLUEPRINT files existing | Strongest: architecturally impossible to violate |
|
|
51
|
+
|
|
52
|
+
**The audit goal:** Identify principles stuck at Level 1-2 that should be at Level 3-4.
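
To illustrate what Level 4 means in practice, the sketch below makes the plan files the only input an executor can obtain, so running without them is not a code path that exists. It is a minimal illustration under assumptions: the `*BLUEPRINT*.md` naming pattern, `MissingBlueprintError`, `load_blueprints`, and `execute_phase` are hypothetical and not taken from the package.

```python
from pathlib import Path


class MissingBlueprintError(RuntimeError):
    """Raised when a phase directory contains no plan files."""


def load_blueprints(phase_dir: Path) -> list[Path]:
    # Hypothetical naming convention: any Markdown file with BLUEPRINT in its name.
    blueprints = sorted(phase_dir.glob("*BLUEPRINT*.md"))
    if not blueprints:
        raise MissingBlueprintError(f"no BLUEPRINT files in {phase_dir}")
    return blueprints


def execute_phase(phase_dir: Path) -> list[str]:
    """The executor's work list is derived from the blueprints themselves."""
    blueprints = load_blueprints(phase_dir)
    return [f"executing {path.name}" for path in blueprints]
```

Compare this with a Level 2 version, where the executor is merely asked whether a plan exists: here there is no flag to pass and no prompt to answer dishonestly.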
|
|
53
|
+
|
|
54
|
+
### The Probe Questions
|
|
55
|
+
|
|
56
|
+
Use these specific questions to find enforcement gaps:
|
|
57
|
+
|
|
58
|
+
1. **Failure-path routing:** When an agent hits a wall, does the system route to its own research/recovery capabilities, or does it just stop/escalate?
|
|
59
|
+
2. **Honesty reward loop:** When an agent admits "I don't know," does the architecture automatically trigger research, or just document the admission?
|
|
60
|
+
3. **Ceremony detection:** Are there mandatory processes that run regardless of scope? Do small changes get the same heavyweight treatment as large features?
|
|
61
|
+
4. **Skip-ability:** Can agents bypass stated requirements via flags (--skip-verify, --quick) that exist "for convenience" but undermine the principle?
|
|
62
|
+
5. **Capability matching:** Do agents that encounter problems have the tools to solve them? (e.g., does the verifier have web search when it encounters unfamiliar failures?)
|
|
63
|
+
6. **Cross-gap clustering:** Do enforcement gaps cluster in one area (e.g., all at failure-time transitions, all in one agent, all in one phase)?
|
|
64
|
+
|
|
65
|
+
### The Fix Pattern
|
|
66
|
+
|
|
67
|
+
For each gap found, apply the minimum enforcement level that closes it:
|
|
68
|
+
|
|
69
|
+
```
IF principle is violated because agents don't know about it:
  → Level 2: Add to agent prompt/honesty gate

IF principle is violated because agents can skip it:
  → Level 3: Add a gate (file existence check, status field, approval step)

IF principle is violated because the architecture allows bypass:
  → Level 4: Remove the bypass, restructure the flow

IF principle is contradicted by another mechanism:
  → Resolve the contradiction: one must yield
  → Usually the contradicting mechanism is older and was never updated
```
|
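
The fix pattern can be expressed as a lookup from the cause of a violation to the minimum enforcement level that closes it. The sketch below is illustrative only; the cause keys, `Level`, `MINIMUM_LEVEL`, and `plan_fix` are hypothetical names chosen to mirror the four branches above.

```python
from enum import IntEnum


class Level(IntEnum):
    DOCUMENTED = 1
    PROMPTED = 2
    GATED = 3
    STRUCTURAL = 4


# Cause of the violation -> minimum level that closes the gap.
MINIMUM_LEVEL = {
    "agents_unaware": Level.PROMPTED,
    "agents_can_skip": Level.GATED,
    "architecture_allows_bypass": Level.STRUCTURAL,
}


def plan_fix(cause: str, current: Level) -> str:
    """Suggest the lightest change that closes one enforcement gap."""
    if cause == "contradicted_by_other_mechanism":
        return "resolve the contradiction: one mechanism must yield"
    target = MINIMUM_LEVEL[cause]
    if current >= target:
        return f"already at Level {int(current)}; look for another cause"
    return (
        f"raise from Level {int(current)} to "
        f"Level {int(target)} ({target.name.lower()})"
    )
```

Returning the minimum level rather than always jumping to Level 4 reflects the warning against over-gating.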
|
83
|
+
|
|
84
|
+
## Code Example
|
|
85
|
+
|
|
86
|
+
```markdown
// Before (principle documented but not enforced)

## Design Philosophy
- "Research when hitting a wall": don't brute-force

## Circuit Breaker (in executor)
IF same error 5+ times:
  → STOP. Escalate to user.
// Gap: system has a researcher agent but doesn't use it here

// After (principle structurally enforced)

## Circuit Breaker (in executor)
IF same error 3+ times:
  → WARNING: Spawn fire-researcher with error context
  → Researcher returns 2-3 alternatives (ranked by confidence)
  → Retry with top alternative

IF same error 5+ times AND researcher exhausted:
  → STOP. Escalate to user with research findings attached.
// Now the principle is enforced: research happens before escalation
```
|
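
The "after" behaviour can be sketched as a retry loop in which escalation is unreachable until research has been consulted. This is a simplified illustration, not the executor's implementation: `run_with_breaker`, its callback parameters, and the returned strings are hypothetical, and every failure is assumed to be the same error.

```python
from typing import Callable, Optional


def run_with_breaker(
    attempt: Callable[[Optional[str]], bool],
    research: Callable[[], list[str]],
    research_at: int = 3,
    stop_at: int = 5,
) -> str:
    """Retry a task, consulting research before any escalation.

    attempt(approach) returns True on success; approach=None means the
    original approach. research() returns alternatives, best first.
    """
    failures = 0
    alternatives: Optional[list[str]] = None  # None = researcher not consulted
    approach: Optional[str] = None
    while True:
        if attempt(approach):
            return f"success after {failures} failure(s)"
        failures += 1
        if failures >= research_at and alternatives is None:
            alternatives = list(research())
        if alternatives:
            approach = alternatives.pop(0)  # retry with top remaining alternative
        elif alternatives is not None and failures >= stop_at:
            return "escalate to user with research findings attached"
```

The escalation branch requires `alternatives is not None`, so the loop cannot stop and hand off to the user without the researcher having run first.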
|
109
|
+
|
|
110
|
+
## Implementation Steps
|
|
111
|
+
|
|
112
|
+
1. **List all stated design principles** — extract from philosophy docs, READMEs, foundational documents
|
|
113
|
+
2. **For each principle, audit all agents and commands** — use the STRONG/WEAK/CONTRADICTED framework
|
|
114
|
+
3. **Identify enforcement level** for each principle in each location (Level 1-4)
|
|
115
|
+
4. **Flag gaps** where enforcement level is below what the principle requires
|
|
116
|
+
5. **Check for cross-gap patterns** — do gaps cluster in failure paths? in one agent? in one phase?
|
|
117
|
+
6. **Prioritize fixes** by how many principles a single change reinforces simultaneously
|
|
118
|
+
7. **Apply minimum viable enforcement** — don't over-gate; the lightest enforcement that closes the gap
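
Step 6 amounts to a sort over candidate fixes. The sketch below assumes the audit produced a mapping from each candidate fix to the set of principles it reinforces; `prioritize` and the example names used with it are hypothetical.

```python
def prioritize(fixes: dict[str, set[str]]) -> list[str]:
    """Order fixes by how many principles each one reinforces.

    Ties are broken alphabetically so the ordering is stable.
    """
    return sorted(fixes, key=lambda name: (-len(fixes[name]), name))
```

For example, a change that routes circuit-breaker trips to research would rank above a prompt tweak if it reinforces both "research when stuck" and "honesty" while the tweak reinforces only one of them.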
|
|
119
|
+
|
|
120
|
+
## When to Use
|
|
121
|
+
|
|
122
|
+
- After major version releases of an agent framework (v11 audit before v12 work begins)
|
|
123
|
+
- When users report that the system "doesn't feel like it follows its own rules"
|
|
124
|
+
- When onboarding new agents or commands — verify they inherit all principle enforcement
|
|
125
|
+
- When a post-mortem reveals the system had the capability to prevent a failure but didn't use it
|
|
126
|
+
- Quarterly health checks on multi-agent architectures
|
|
127
|
+
|
|
128
|
+
## When NOT to Use
|
|
129
|
+
|
|
130
|
+
- On systems too small to have stated principles (single-script tools)
|
|
131
|
+
- During active feature development — audit between milestones, not during sprints
|
|
132
|
+
- When principles themselves need revision — update the principles first, then audit enforcement
|
|
133
|
+
|
|
134
|
+
## Common Mistakes
|
|
135
|
+
|
|
136
|
+
- **Auditing only the happy path** — most enforcement gaps appear at failure-time transitions (circuit breaker trips, low confidence, verification failures), not during successful execution
|
|
137
|
+
- **Adding more documentation instead of structural gates** — if agents already ignore Level 1-2 enforcement, adding more docs won't help; move to Level 3-4
|
|
138
|
+
- **Over-gating** — not every principle needs Level 4 enforcement; some are genuinely better as prompted guidelines
|
|
139
|
+
- **Fixing symptoms not patterns** — if gaps cluster in one area (e.g., all failure-path routing), fix the systemic pattern rather than patching each gap individually
|
|
140
|
+
|
|
141
|
+
## Related Skills
|
|
142
|
+
|
|
143
|
+
- [EVIDENCE_BASED_VALIDATION](../../methodology/EVIDENCE_BASED_VALIDATION.md)
|
|
144
|
+
- [INSTRUMENTATION_OVER_RESTRICTION](../../methodology/INSTRUMENTATION_OVER_RESTRICTION.md)
|
|
145
|
+
- [CONFIDENCE_GATED_EXECUTION](../../methodology/CONFIDENCE_GATED_EXECUTION.md)
|
|
146
|
+
|
|
147
|
+
## References
|
|
148
|
+
|
|
149
|
+
- Dominion Flow v11.3 principles audit (2026-03-06) — first application of this methodology
|
|
150
|
+
- The "Three Questions" honesty gate pattern — example of Level 2 enforcement done well
|
|
151
|
+
- Recovery Research Mode (4-tier cascade) — example of Level 4 structural enforcement
|
|
152
|
+
- ACE: Agentic Context Engineering (ICLR 2026) — adaptive playbooks as runtime principle enforcement
|