@thierrynakoa/fire-flow 10.0.0 → 12.2.1

Files changed (94)
  1. package/.claude-plugin/plugin.json +8 -8
  2. package/ARCHITECTURE-DIAGRAM.md +7 -4
  3. package/COMMAND-REFERENCE.md +33 -13
  4. package/DOMINION-FLOW-OVERVIEW.md +581 -421
  5. package/QUICK-START.md +3 -3
  6. package/README.md +101 -44
  7. package/TROUBLESHOOTING.md +264 -264
  8. package/agents/fire-executor.md +200 -116
  9. package/agents/fire-fact-checker.md +276 -276
  10. package/agents/fire-phoenix-analyst.md +394 -0
  11. package/agents/fire-planner.md +145 -53
  12. package/agents/fire-project-researcher.md +155 -155
  13. package/agents/fire-research-synthesizer.md +166 -166
  14. package/agents/fire-researcher.md +144 -59
  15. package/agents/fire-roadmapper.md +215 -203
  16. package/agents/fire-verifier.md +247 -65
  17. package/agents/fire-vision-architect.md +381 -0
  18. package/commands/fire-0-orient.md +476 -476
  19. package/commands/fire-1a-new.md +216 -0
  20. package/commands/fire-1b-research.md +210 -0
  21. package/commands/fire-1c-setup.md +254 -0
  22. package/commands/{fire-1a-discuss.md → fire-1d-discuss.md} +35 -7
  23. package/commands/fire-3-execute.md +55 -2
  24. package/commands/fire-4-verify.md +61 -0
  25. package/commands/fire-5-handoff.md +2 -2
  26. package/commands/fire-6-resume.md +37 -2
  27. package/commands/fire-add-new-skill.md +2 -2
  28. package/commands/fire-autonomous.md +20 -3
  29. package/commands/fire-brainstorm.md +1 -1
  30. package/commands/fire-complete-milestone.md +2 -2
  31. package/commands/fire-cost.md +183 -0
  32. package/commands/fire-dashboard.md +2 -2
  33. package/commands/fire-debug.md +663 -663
  34. package/commands/fire-loop-resume.md +2 -2
  35. package/commands/fire-loop-stop.md +1 -1
  36. package/commands/fire-loop.md +1168 -1168
  37. package/commands/fire-map-codebase.md +3 -3
  38. package/commands/fire-new-milestone.md +356 -356
  39. package/commands/fire-phoenix.md +603 -0
  40. package/commands/fire-reflect.md +235 -235
  41. package/commands/fire-research.md +246 -246
  42. package/commands/fire-search.md +1 -1
  43. package/commands/fire-skills-diff.md +3 -3
  44. package/commands/fire-skills-history.md +3 -3
  45. package/commands/fire-skills-rollback.md +7 -7
  46. package/commands/fire-skills-sync.md +5 -5
  47. package/commands/fire-test.md +9 -9
  48. package/commands/fire-todos.md +1 -1
  49. package/commands/fire-update.md +5 -5
  50. package/hooks/hooks.json +16 -16
  51. package/hooks/run-hook.sh +8 -8
  52. package/hooks/run-session-end.sh +7 -7
  53. package/hooks/session-end.sh +90 -90
  54. package/hooks/session-start.sh +1 -1
  55. package/package.json +4 -2
  56. package/plugin.json +7 -7
  57. package/references/metrics-and-trends.md +1 -1
  58. package/skills-library/SKILLS-INDEX.md +588 -588
  59. package/skills-library/_general/methodology/AUTONOMOUS_ORCHESTRATION.md +182 -0
  60. package/skills-library/_general/methodology/BACKWARD_PLANNING_INTERVIEW.md +307 -0
  61. package/skills-library/_general/methodology/CIRCUIT_BREAKER_INTELLIGENCE.md +163 -0
  62. package/skills-library/_general/methodology/CONTEXT_ROTATION.md +151 -0
  63. package/skills-library/_general/methodology/DEAD_ENDS_SHELF.md +188 -0
  64. package/skills-library/_general/methodology/DESIGN_PHILOSOPHY_ENFORCEMENT.md +152 -0
  65. package/skills-library/_general/methodology/INTERNAL_CONSISTENCY_AUDIT.md +212 -0
  66. package/skills-library/_general/methodology/LIVE_BREADCRUMB_PROTOCOL.md +242 -0
  67. package/skills-library/_general/methodology/PHOENIX_REBUILD_METHODOLOGY.md +251 -0
  68. package/skills-library/_general/methodology/QUALITY_GATES_AND_VERIFICATION.md +157 -0
  69. package/skills-library/_general/methodology/RELIABILITY_PREDICTION.md +104 -0
  70. package/skills-library/_general/methodology/REQUIREMENTS_DECOMPOSITION.md +155 -0
  71. package/skills-library/_general/methodology/SELF_TESTING_FEEDBACK_LOOP.md +143 -0
  72. package/skills-library/_general/methodology/STACK_COMPATIBILITY_MATRIX.md +178 -0
  73. package/skills-library/_general/methodology/TIERED_CONTEXT_ARCHITECTURE.md +118 -0
  74. package/skills-library/_general/methodology/ZERO_FRICTION_CLI_SETUP.md +312 -0
  75. package/skills-library/_general/methodology/autonomous-multi-phase-build.md +133 -0
  76. package/skills-library/_general/methodology/claude-md-archival.md +280 -0
  77. package/skills-library/_general/methodology/debug-swarm-researcher-escape-hatch.md +240 -240
  78. package/skills-library/_general/methodology/git-worktrees-parallel.md +232 -0
  79. package/skills-library/_general/methodology/llm-judge-memory-crud.md +241 -0
  80. package/skills-library/_general/methodology/multi-project-autonomous-build.md +360 -0
  81. package/skills-library/_general/methodology/shell-autonomous-loop-fixplan.md +238 -238
  82. package/skills-library/_general/patterns-standards/GOF_DESIGN_PATTERNS_FOR_AI_AGENTS.md +358 -0
  83. package/skills-library/methodology/BREATH_BASED_PARALLEL_EXECUTION.md +1 -1
  84. package/skills-library/methodology/RESEARCH_BACKED_WORKFLOW_UPGRADE.md +1 -1
  85. package/skills-library/methodology/SABBATH_REST_PATTERN.md +1 -1
  86. package/templates/ASSUMPTIONS.md +1 -1
  87. package/templates/BLOCKERS.md +1 -1
  88. package/templates/DECISION_LOG.md +1 -1
  89. package/templates/phase-prompt.md +1 -1
  90. package/templates/phoenix-comparison.md +80 -0
  91. package/version.json +2 -2
  92. package/workflows/handoff-session.md +1 -1
  93. package/workflows/new-project.md +2 -2
  94. package/commands/fire-1-new.md +0 -281
package/skills-library/_general/methodology/SELF_TESTING_FEEDBACK_LOOP.md
@@ -0,0 +1,143 @@
+ ---
+ name: self-testing-feedback-loop
+ category: methodology
+ version: 1.0.0
+ contributed: 2026-03-06
+ contributor: dominion-flow
+ last_updated: 2026-03-06
+ contributors:
+ - dominion-flow
+ tags: [circuit-breaker, testing, error-recovery, ai-agent, feedback-loop]
+ difficulty: medium
+ usage_count: 0
+ success_rate: 100
+ ---
+
+ # Self-Testing Feedback Loop for Circuit Breakers
+
+ ## Problem
+
+ When AI agents hit errors during code generation, they typically rotate to a different approach — but never verify whether the rotation actually fixed anything before continuing. This creates "blind rotation" where the agent tries approach after approach without feedback, wasting context and iterations. The circuit breaker detects spinning (same error repeated), but the rotation itself has no verification step.
+
+ Symptoms:
+ - Agent rotates approaches but keeps hitting the same underlying error
+ - Circuit breaker trips after 5 identical errors despite "different" approaches
+ - Agent reports "trying a new approach" but the test output is identical
+ - Iterations are wasted on approaches that look different but fail identically
+
+ ## Solution Pattern
+
+ After every approach rotation triggered by the circuit breaker, enforce a **test-classify-feedback** cycle before continuing:
+
+ 1. **TEST** — Run the most specific available test (failing test > module test > full suite)
+ 2. **CLASSIFY** — Compare test result against the previous error hash
+ 3. **FEEDBACK** — Inject the classification back into the agent's context for the next iteration
+
+ The classification has four outcomes:
+ - **PASS** → Rotation worked. Clear warning state. Continue.
+ - **NEW ERROR** → Different failure. That's progress. Feed new error into context.
+ - **SAME ERROR** → Rotation didn't help. Double-increment spin counter (accelerate circuit break).
+ - **NO TESTS** → Can't verify. Log warning, rely on file-change metrics only.
+
+ The key insight: **double-incrementing the spin counter** when a rotation produces the same error. This accelerates the circuit break for approaches that look different but are functionally identical — preventing the agent from exhausting 5 rotation attempts on variations of the same broken strategy.
+
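The four outcomes and the double-increment rule can be sketched in a few lines of Python. This is only an illustration, not code from the package: `SelfTestResult`, `classify`, `update_spin_counter`, and the threshold of 5 (taken from the "5 identical errors" symptom) are hypothetical names and values, and leaving the counter unchanged on NO TESTS is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PASS = "PASS"
    NEW_ERROR = "NEW_ERROR"
    SAME_ERROR = "SAME_ERROR"
    NO_TESTS = "NO_TESTS"


@dataclass
class SelfTestResult:
    """Hypothetical container for the result of the post-rotation test run."""
    ran: bool                        # False when no test could be run
    passed: bool = False
    error_hash: Optional[str] = None


def classify(result: SelfTestResult, previous_error_hash: Optional[str]) -> Outcome:
    """Map a test result to one of the four outcomes listed in the skill."""
    if not result.ran:
        return Outcome.NO_TESTS      # checked first: nothing to compare
    if result.passed:
        return Outcome.PASS
    if result.error_hash == previous_error_hash:
        return Outcome.SAME_ERROR
    return Outcome.NEW_ERROR


def update_spin_counter(spin_counter: int, outcome: Outcome) -> int:
    """Reset on progress, double-increment on an ineffective rotation."""
    if outcome in (Outcome.PASS, Outcome.NEW_ERROR):
        return 0
    if outcome is Outcome.SAME_ERROR:
        return spin_counter + 2
    return spin_counter              # NO_TESTS: unchanged (assumption)


TRIP_THRESHOLD = 5                   # assumed from "trips after 5 identical errors"

# Count how many ineffective rotations it takes to reach the threshold.
counter = 0
rotations = 0
while counter < TRIP_THRESHOLD:
    same = SelfTestResult(ran=True, passed=False, error_hash="abc")
    counter = update_spin_counter(counter, classify(same, "abc"))
    rotations += 1
print(rotations)  # prints 3
```

Under these assumptions the breaker trips after three ineffective rotations instead of five.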
+ ## Code Example
+
+ ```
+ // Before (problematic) — blind rotation
+ ON circuit_breaker_warning:
+   rotation = suggest_new_approach(approaches_tried)
+   inject_into_context(rotation)
+   continue_execution()  // Hope for the best
+
+ // After (solution) — verified rotation with feedback
+ ON circuit_breaker_warning:
+   rotation = suggest_new_approach(approaches_tried)
+   inject_into_context(rotation)
+
+   // Execute one iteration with new approach
+   result = execute_iteration()
+
+   // TEST: Run most specific available test
+   test_result = run_tests(priority=[
+     specific_failing_test,  // Best: exact test that's failing
+     module_test_file,       // Good: tests for the modified module
+     full_suite_if_quick,    // OK: full suite if < 30 seconds
+   ])
+
+   // CLASSIFY: Compare against previous error
+   IF no_tests_available:    // Checked first: with no test run there is nothing to compare
+     health = UNKNOWN
+     record("No tests — manual verification needed")
+
+   ELIF test_result.passed:
+     health = PROGRESS
+     spin_counter = 0        // Clear — rotation worked
+     record("Rotation successful: {approach}")
+
+   ELIF test_result.error_hash != previous_error_hash:
+     health = PROGRESS       // Different error = forward movement
+     spin_counter = 0
+     feed_new_error(test_result)  // New error feeds into next iteration
+     previous_error_hash = test_result.error_hash  // Compare the next rotation against this error
+     record("New error after rotation: {new_hash}")
+
+   ELSE:                     // test_result.error_hash == previous_error_hash
+     health = SPINNING
+     spin_counter += 2       // DOUBLE increment — rotation failed
+     record("Rotation ineffective: same error {hash}")
+
+   // FEEDBACK: Inject result into next iteration
+   inject_into_context(
+     "SELF-TEST after rotation: {test_result.status}
+      Diagnosis: {health}
+      Action: {recommended_next_step}"
+   )
+ ```
+
+ ## Implementation Steps
+
+ 1. Hook into the circuit breaker's approach rotation trigger
+ 2. After rotation, execute exactly ONE iteration before testing
+ 3. Run the most specific available test (don't waste time on full suite if a specific test exists)
+ 4. Hash-compare the new error against the previous error
+ 5. Classify as PASS / NEW_ERROR / SAME_ERROR / NO_TESTS
+ 6. If SAME_ERROR: double-increment the spin counter to accelerate circuit break
+ 7. Inject the test result and classification into the agent's context
+ 8. Record the classification in the loop tracking file
+
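Steps 3 and 7 above can be illustrated with a small Python sketch. It assumes the caller already knows which test targets exist and how long the full suite takes. `pick_test_target`, `feedback_message`, and the `"<full suite>"` sentinel are hypothetical names, not part of the package.

```python
from typing import Optional

FULL_SUITE_BUDGET_SECONDS = 30.0  # "full suite if < 30 seconds" in the skill


def pick_test_target(
    failing_test: Optional[str],
    module_test_file: Optional[str],
    full_suite_seconds: Optional[float],
) -> Optional[str]:
    """Return the most specific test target, or None when nothing can run."""
    if failing_test:
        return failing_test                # best: the exact failing test
    if module_test_file:
        return module_test_file            # good: tests for the modified module
    if full_suite_seconds is not None and full_suite_seconds < FULL_SUITE_BUDGET_SECONDS:
        return "<full suite>"              # acceptable only when the suite is quick
    return None                            # maps to the NO_TESTS outcome


def feedback_message(status: str, health: str, next_step: str) -> str:
    """Build the three-line block injected into the agent's context."""
    return (
        f"SELF-TEST after rotation: {status}\n"
        f"Diagnosis: {health}\n"
        f"Action: {next_step}"
    )


target = pick_test_target(None, "tests/test_parser.py", 12.0)
print(target)
print(feedback_message("FAILED", "SPINNING", "abandon this strategy family"))
```

Returning `None` rather than raising keeps the NO TESTS path an ordinary classification outcome instead of an error.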
+ ## When to Use
+
+ - Any AI agent system with a circuit breaker or loop detection
+ - Code generation agents that run tests after changes
+ - Autonomous debugging loops (test-diagnose-fix patterns)
+ - Any system where "trying a different approach" needs verification
+ - Long-running execution pipelines with error recovery
+
+ ## When NOT to Use
+
+ - Projects with no test suite (the loop relies on test output)
+ - Documentation-only changes (no testable code)
+ - When the circuit breaker is already TRIPPED (we're stopping, not testing)
+ - Interactive/manual debugging where the human provides feedback
+ - Single-shot code generation (no iteration loop)
+
+ ## Common Mistakes
+
+ - Running the full test suite when a specific failing test exists — wastes 10x the time
+ - Not double-incrementing on SAME_ERROR — allows 5 useless rotations before breaking
+ - Testing before the rotation is applied — tests the old approach, not the new one
+ - Classifying "different line number, same error type" as progress — normalize error hashes first
+ - Skipping the feedback injection — agent doesn't know its rotation failed
+
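The "normalize error hashes first" point could look roughly like the Python sketch below. The package does not specify a normalization scheme, so the patterns here (line numbers, `file:line:col` positions, hex addresses) and the names `normalize_error` and `error_hash` are assumptions chosen for illustration.

```python
import hashlib
import re

# Volatile details that change between runs without changing the error itself.
_HEX_ADDR = re.compile(r"0x[0-9a-fA-F]+")
_LINE_NO = re.compile(r"\bline \d+\b", re.IGNORECASE)
_FILE_POS = re.compile(r":\d+(?::\d+)?")   # file.py:42 or file.ts:42:7


def normalize_error(message: str) -> str:
    """Replace volatile details with placeholders and collapse whitespace."""
    text = _HEX_ADDR.sub("0xADDR", message)
    text = _LINE_NO.sub("line N", text)
    text = _FILE_POS.sub(":N", text)
    return " ".join(text.split())


def error_hash(message: str) -> str:
    """Short, stable hash of the normalized error text."""
    digest = hashlib.sha256(normalize_error(message).encode("utf-8"))
    return digest.hexdigest()[:12]


first = 'File "app.py", line 10, in load\nTypeError: config is None'
second = 'File "app.py", line 27, in load\nTypeError: config is None'
other = 'File "app.py", line 27, in load\nValueError: bad port'

print(error_hash(first) == error_hash(second))  # True: only the line number moved
print(error_hash(first) == error_hash(other))   # False: a different error
```

A rotation that merely shifts the failing line then still classifies as SAME_ERROR.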
+ ## Related Skills
+
+ - [TIERED_CONTEXT_ARCHITECTURE](../methodology/TIERED_CONTEXT_ARCHITECTURE.md) - Context management that preserves error state
+ - [AGENT_SELF_IMPROVEMENT_LOOP](../methodology/AGENT_SELF_IMPROVEMENT_LOOP.md) - Broader agent improvement patterns
+ - [DIFFICULTY_AWARE_AGENT_ROUTING](../methodology/DIFFICULTY_AWARE_AGENT_ROUTING.md) - Route by difficulty level
+
+ ## References
+
+ - SWE-Agent + Reflexion "Test-Diagnose-Fix" loop (NeurIPS 2024) — 37% higher fix rates
+ - Manus AI error preservation pattern (Feb 2026) — errors are most valuable context
+ - frankbria's Ralph loop fork — quantitative convergence detection
+ - Contributed from: dominion-flow v10.1 research session
package/skills-library/_general/methodology/STACK_COMPATIBILITY_MATRIX.md
@@ -0,0 +1,178 @@
+ ---
+ name: STACK_COMPATIBILITY_MATRIX
+ category: methodology
+ description: Reference data for proven stack combinations, known incompatibilities, and project-type mapping
+ version: 1.0.0
+ tags: [architecture, stack, compatibility, vision]
+ ---
+
+ # Stack Compatibility Matrix
+
+ Reference data for the `fire-vision-architect` agent. Contains proven combinations, known conflicts, and project-type recommendations.
+
+ ---
+
+ ## Proven Stack Combinations
+
+ ### MERN (MongoDB + Express + React + Node.js)
+ - **Best for:** Rapid prototyping, real-time apps, flexible-schema projects
+ - **Industry:** Netflix, Uber (parts)
+ - **Auth pairing:** Passport.js, JWT custom, Auth0
+ - **Hosting:** Heroku, Railway, DigitalOcean, AWS EC2
+ - **DB extensions:** Redis (caching), Mongoose ODM
+ - **Strengths:** Single language (JS) everywhere, huge npm ecosystem, fast iteration
+ - **Weaknesses:** Multi-document ACID transactions are opt-in (MongoDB 4.0+), schema discipline required, callback complexity
+
+ ### PERN (PostgreSQL + Express + React + Node.js)
+ - **Best for:** Data-integrity-critical apps, financial, CRM, LMS
+ - **Industry:** Many YC startups, internal tools
+ - **Auth pairing:** Passport.js, JWT custom, better-auth
+ - **Hosting:** Heroku, Railway, Render, AWS RDS + EC2
+ - **DB extensions:** Prisma ORM, Drizzle ORM, pg-promise
+ - **Strengths:** ACID compliance, complex queries, relational integrity, mature tooling
+ - **Weaknesses:** Schema migrations needed, slightly slower prototyping than MongoDB
+
+ ### Next.js + Supabase
+ - **Best for:** Solo devs, MVPs, startups wanting speed-to-market
+ - **Industry:** cal.com, Resend, many indie SaaS
+ - **Auth pairing:** Supabase Auth (built-in), NextAuth
+ - **Hosting:** Vercel (native), Netlify, Cloudflare Pages
+ - **DB extensions:** Supabase Realtime, Edge Functions, Row Level Security
+ - **Strengths:** Fastest time-to-market, built-in auth/storage/realtime, generous free tier
+ - **Weaknesses:** Vendor lock-in risk, limited custom backend logic, Supabase scaling costs
+
+ ### Next.js + PostgreSQL + Prisma
+ - **Best for:** Full-stack teams wanting type safety and control
+ - **Industry:** Vercel apps, many B2B SaaS
+ - **Auth pairing:** NextAuth/Auth.js, better-auth, Clerk
+ - **Hosting:** Vercel + Neon/Supabase DB, Railway
+ - **DB extensions:** Prisma Studio, connection pooling (PgBouncer)
+ - **Strengths:** Type-safe DB access, excellent DX, SSR/SSG flexibility, full control
+ - **Weaknesses:** More setup than Supabase all-in-one, Prisma cold start in serverless
+
+ ### NestJS + PostgreSQL + Redis
+ - **Best for:** Enterprise, microservices, teams > 5 developers
+ - **Industry:** Enterprise Node.js projects, fintech
+ - **Auth pairing:** Passport.js (NestJS module), custom JWT, Keycloak
+ - **Hosting:** AWS ECS/EKS, GCP Cloud Run, DigitalOcean Kubernetes
+ - **DB extensions:** TypeORM, MikroORM, Bull (job queues via Redis)
+ - **Strengths:** Angular-like structure, dependency injection, microservice-ready, testable
+ - **Weaknesses:** Steeper learning curve, more boilerplate, overkill for small projects
+
+ ### Django + PostgreSQL + React/Vue
+ - **Best for:** Content-heavy apps, admin-intensive, rapid CRUD
+ - **Industry:** Instagram (early), Pinterest, Eventbrite
+ - **Auth pairing:** Django Auth (built-in), django-allauth
+ - **Hosting:** Heroku, Railway, AWS Elastic Beanstalk, DigitalOcean
+ - **DB extensions:** Django ORM, django-rest-framework, Celery (async tasks)
+ - **Strengths:** Batteries-included, admin panel free, excellent ORM, security defaults
+ - **Weaknesses:** Python + JS split, frontend integration friction, monolith tendency
+
+ ### Rails + PostgreSQL + Hotwire/React
+ - **Best for:** Rapid development, startups, content platforms
+ - **Industry:** GitHub, Shopify, Basecamp, Airbnb (early)
+ - **Auth pairing:** Devise, OmniAuth
+ - **Hosting:** Heroku, Render, Fly.io, Hatchbox
+ - **DB extensions:** ActiveRecord, Sidekiq (background jobs), ActionCable (WebSockets)
+ - **Strengths:** Convention over configuration, fastest scaffolding, mature ecosystem
+ - **Weaknesses:** Ruby learning curve, performance ceiling, fewer JS devs know it
+
+ ### Remix + PostgreSQL + Prisma
+ - **Best for:** Form-heavy apps, progressive enhancement, accessibility-first
+ - **Industry:** Shopify (Hydrogen), newer startups
+ - **Auth pairing:** Remix Auth, custom session-based
+ - **Hosting:** Fly.io, Vercel, Cloudflare Workers
+ - **Strengths:** Web standards first, nested routing, excellent forms, progressive enhancement
+ - **Weaknesses:** Smaller ecosystem than Next.js, fewer tutorials, less community support
+
+ ### Astro + Headless CMS
+ - **Best for:** Content sites, blogs, documentation, marketing
+ - **Industry:** Content-focused startups, developer docs
+ - **Auth pairing:** Usually not needed; if needed, Auth.js
+ - **Hosting:** Vercel, Netlify, Cloudflare Pages
+ - **CMS options:** Sanity, Contentful, Strapi, WordPress headless
+ - **Strengths:** Zero JS by default, island architecture, fastest page loads
+ - **Weaknesses:** Not for app-like interactivity, limited client-side state
+
+ ---
+
+ ## Known Incompatibilities
+
+ ### Database Conflicts
+ | Combination | Problem | Resolution |
+ |-------------|---------|------------|
+ | MongoDB + PostgreSQL (both as primary) | Redundant primary DBs, split data model, double migration burden | Pick one based on data shape: relational → PostgreSQL, flexible → MongoDB |
+ | Firebase Firestore + PostgreSQL | Two databases with different paradigms, sync nightmares | Use Firebase only for real-time features alongside PostgreSQL, or go all-in on one |
+ | SQLite + production multi-user | SQLite has write-locking, not suitable for concurrent production use | SQLite for dev/embedded only; PostgreSQL or MySQL for production |
+
+ ### Frontend Conflicts
+ | Combination | Problem | Resolution |
+ |-------------|---------|------------|
+ | React + Vue (same project) | Two virtual DOMs, double bundle size, conflicting state management | Pick one. React has larger ecosystem; Vue has simpler API |
+ | React + jQuery | jQuery DOM manipulation conflicts with React's virtual DOM | Remove jQuery; use React refs for DOM access |
+ | Next.js + Create React App | CRA is client-only; Next.js handles routing and SSR differently | Use Next.js alone (it supersedes CRA) |
+
+ ### Auth Conflicts
+ | Combination | Problem | Resolution |
+ |-------------|---------|------------|
+ | Firebase Auth + Auth0 + custom JWT | Three auth systems = three user tables, token confusion | Pick one: Firebase Auth (free tier), Auth0 (enterprise), or custom (full control) |
+ | Supabase Auth + NextAuth | Both manage sessions; double middleware, token conflicts | Use one: Supabase Auth if using Supabase DB, NextAuth if using custom DB |
+
+ ### Hosting Conflicts
+ | Combination | Problem | Resolution |
+ |-------------|---------|------------|
+ | Serverless (Vercel/Netlify) + long-running processes | Serverless has execution time limits (10-60s), can't run background jobs | Use a separate worker service (Railway, Render background) or switch to container hosting |
+ | Static hosting + server-side rendering | Static hosts (GitHub Pages, S3) can't run SSR | Use Vercel/Netlify/Cloudflare which support both, or go pure static |
+
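One way an agent might consume the incompatibility tables above is as a lookup over component sets. The Python sketch below encodes only three of the rows. The lowercase component identifiers, `Conflict`, `KNOWN_CONFLICTS`, and `find_conflicts` are hypothetical, and how `fire-vision-architect` actually reads this data is not shown in the diff.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Conflict(NamedTuple):
    components: FrozenSet[str]   # all of these must be present to trigger
    problem: str
    resolution: str


# A subset of the rows above, keyed by assumed lowercase identifiers.
KNOWN_CONFLICTS: List[Conflict] = [
    Conflict(
        frozenset({"react", "vue"}),
        "Two virtual DOMs, double bundle size, conflicting state management",
        "Pick one",
    ),
    Conflict(
        frozenset({"react", "jquery"}),
        "jQuery DOM manipulation conflicts with React's virtual DOM",
        "Remove jQuery; use React refs for DOM access",
    ),
    Conflict(
        frozenset({"supabase-auth", "nextauth"}),
        "Both manage sessions; double middleware, token conflicts",
        "Use one",
    ),
]


def find_conflicts(stack: Iterable[str]) -> List[Conflict]:
    """Return every known conflict whose components all appear in the stack."""
    chosen = {name.lower() for name in stack}
    return [c for c in KNOWN_CONFLICTS if c.components <= chosen]


for conflict in find_conflicts(["React", "Vue", "PostgreSQL"]):
    print(sorted(conflict.components), "->", conflict.resolution)
```

Subset matching keeps rows with three components (such as the three-way auth conflict) expressible without special cases, though it would not flag any two of those three on their own.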
+ ---
+
+ ## Project-Type → Stack Mapping
+
+ ### Simple Projects (≤3 features, 1 user role)
+ | Type | Recommended Stack | Why |
+ |------|------------------|-----|
+ | Portfolio/Landing | Astro + Tailwind + Vercel | Zero JS, fast loads, cheap hosting |
+ | Blog | Astro + Headless CMS + Vercel | Content-first, markdown support |
+ | Todo/Notes | Next.js + Supabase | Quick CRUD with auth, free tier |
+
+ ### Standard Projects (4-10 features, multiple roles)
+ | Type | Recommended Stack | Why |
+ |------|------------------|-----|
+ | SaaS | Next.js + PostgreSQL + Prisma + Vercel | Type safety, SSR, scalable |
+ | E-commerce | Next.js + Supabase + Stripe | Built-in auth, real-time inventory |
+ | LMS | PERN or Next.js + PostgreSQL | Relational data (courses→lessons→users) |
+ | CRM | PERN + Redis | Complex queries, caching for dashboards |
+
+ ### Enterprise/Complex (10+ features, compliance, scale)
+ | Type | Recommended Stack | Why |
+ |------|------------------|-----|
+ | Multi-tenant SaaS | NestJS + PostgreSQL + Redis + AWS | Row-level security, job queues, horizontal scale |
+ | Real-time Collab | Next.js + Supabase Realtime or Socket.io | Built-in WebSocket support |
+ | ML Pipeline | Django + PostgreSQL + Celery + Redis | Python ML ecosystem, async task processing |
+ | Marketplace | NestJS + PostgreSQL + Stripe Connect + S3 | Complex payment splits, file storage |
+
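The three complexity bands above can be read as a small decision function. The following Python sketch is an interpretation, not the package's logic: the headings overlap at exactly 10 features and say nothing about a small project with several roles, so treating 10 features as "standard" and multi-role small projects as "standard" are assumptions. `classify_project`, `recommend`, and the lowercase type keys are hypothetical.

```python
from typing import Dict, Optional

# A few rows from the tables above, grouped by complexity band.
RECOMMENDED: Dict[str, Dict[str, str]] = {
    "simple": {
        "blog": "Astro + Headless CMS + Vercel",
        "todo": "Next.js + Supabase",
    },
    "standard": {
        "saas": "Next.js + PostgreSQL + Prisma + Vercel",
        "crm": "PERN + Redis",
    },
    "enterprise": {
        "marketplace": "NestJS + PostgreSQL + Stripe Connect + S3",
    },
}


def classify_project(features: int, roles: int, needs_compliance: bool = False) -> str:
    """Pick a complexity band from feature count, role count, and compliance."""
    if needs_compliance or features > 10:   # exactly 10 stays "standard" (assumption)
        return "enterprise"
    if features <= 3 and roles <= 1:
        return "simple"
    return "standard"


def recommend(project_type: str, features: int, roles: int,
              needs_compliance: bool = False) -> Optional[str]:
    """Look up a stack for the type within its band; None if the table has no row."""
    band = classify_project(features, roles, needs_compliance)
    return RECOMMENDED[band].get(project_type.lower())


print(recommend("Blog", features=2, roles=1))
print(recommend("CRM", features=6, roles=3))
```

Returning `None` for an unlisted type and band pair leaves the fallback decision to the caller rather than guessing a stack.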
+ ---
+
+ ## 2026 Industry Trends
+
+ ### Rising
+ - **Supabase** — Replacing Firebase for new projects (open source, PostgreSQL-based)
+ - **Drizzle ORM** — Lighter alternative to Prisma, better serverless performance
+ - **better-auth** — Rising auth library for Node.js (simpler than NextAuth)
+ - **Bun** — Faster Node.js alternative, gaining production adoption
+ - **Cloudflare Workers** — Edge-first deployment for global latency
+ - **Turso/LibSQL** — SQLite for production (embedded replicas, edge-ready)
+
+ ### Stable
+ - **Next.js** — Dominant full-stack React framework
+ - **PostgreSQL** — Default database for new projects
+ - **Tailwind CSS** — Default styling approach
+ - **Vercel/Railway** — Default hosting for Node.js apps
+ - **Stripe** — Default payment processing
+
+ ### Declining
+ - **Create React App** — Deprecated, replaced by Vite or Next.js
+ - **Firebase (for new projects)** — Supabase taking market share
+ - **Heroku (free tier)** — Gone; Railway/Render filling the gap
+ - **Webpack (manual config)** — Vite/Turbopack replacing
+ - **MongoDB (as default)** — PostgreSQL preferred unless schema flexibility is critical
package/skills-library/_general/methodology/TIERED_CONTEXT_ARCHITECTURE.md
@@ -0,0 +1,118 @@
+ ---
+ name: tiered-context-architecture
+ category: methodology
+ version: 1.0.0
+ contributed: 2026-03-06
+ contributor: dominion-flow
+ last_updated: 2026-03-06
+ contributors:
+ - dominion-flow
+ tags: [context-management, ai-agent, llm, token-optimization, memory-tiers]
+ difficulty: medium
+ usage_count: 0
+ success_rate: 100
+ ---
+
+ # Tiered Context Architecture (Hot/Warm/Cold)
+
+ ## Problem
+
+ AI agents working on long-running tasks fill their context window with a mix of critical and stale information. Without explicit categorization, all context is treated equally — leading to premature context exhaustion, irrelevant information competing with critical state, and poor compaction decisions that drop important details while preserving noise.
+
+ Symptoms:
+ - Agent "forgets" current task details while retaining old file contents
+ - Context compaction drops error messages but keeps completed task descriptions
+ - Agent hits context limits mid-task with no clear eviction strategy
+ - Output quality degrades because reasoning competes with stale data
+
+ ## Solution Pattern
+
+ Categorize every context segment into three tiers based on access recency and task relevance, then apply tier-specific retention policies:
+
+ **HOT** (never compress, ~15% budget): Current task, active errors, recitation block, circuit breaker state, failed approaches list. This is the "working memory" — losing any of it causes immediate task failure.
+
+ **WARM** (compressible, ~45% budget): Plan context, loaded skills, recently-read files, recent decisions, episodic recall. Useful for current phase but can be compressed to key points when space is needed.
+
+ **COLD** (evictable, 0% budget in window): Files read 5+ iterations ago, completed task details, resolved errors, unused skills. Saved to disk, retrievable on demand, but not occupying context window.
+
+ The key insight: tier assignment is **dynamic** — segments promote (COLD→WARM when re-referenced) and demote (HOT→WARM when task changes) based on actual usage patterns, not static rules.
+
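The promote and demote rules can be sketched as a pure function over a segment record. This Python example is a hedged illustration: `Segment`, `reassign`, and the field names are hypothetical, HOT segments are assumed exempt from the staleness rule, and a resolved error is assumed to step down one tier per pass.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    HOT = "HOT"
    WARM = "WARM"
    COLD = "COLD"


@dataclass
class Segment:
    """Hypothetical record for one piece of context."""
    name: str
    tier: Tier
    last_accessed_iteration: int
    is_active_error: bool = False
    error_resolved: bool = False
    referenced_by_current_task: bool = False


STALE_AFTER_ITERATIONS = 5  # "files read 5+ iterations ago" are COLD


def reassign(segment: Segment, current_iteration: int) -> Tier:
    """Return the tier a segment should move to on this reassignment pass."""
    age = current_iteration - segment.last_accessed_iteration
    if segment.is_active_error:
        return Tier.HOT                        # active errors are working memory
    if segment.tier is Tier.HOT and segment.error_resolved:
        return Tier.WARM                       # one step down per pass (assumption)
    if segment.tier is Tier.COLD and segment.referenced_by_current_task:
        return Tier.WARM                       # promote on re-reference
    if segment.tier is Tier.WARM and age >= STALE_AFTER_ITERATIONS:
        return Tier.COLD                       # demote stale WARM segments
    return segment.tier


old_file = Segment("utils.py contents", Tier.WARM, last_accessed_iteration=1)
print(reassign(old_file, current_iteration=7).value)  # prints COLD
```

Keeping `reassign` free of side effects makes it easy to log each transition before applying it.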
+ ## Code Example
+
+ ```
+ // Before (problematic) — flat context, no tiers
+ context = [
+   system_prompt,           // critical
+   file_read_10_turns_ago,  // stale — wastes space
+   current_task,            // critical
+   old_error_resolved,      // stale — wastes space
+   active_error,            // critical
+   completed_task_1,        // stale
+   completed_task_2,        // stale
+   skill_never_used,        // stale
+ ]
+ // Result: 60% of context is stale. Compaction randomly drops items.
+
+ // After (solution) — tiered context with explicit budgets
+ HOT  = [current_task, active_error, recitation, circuit_breaker, failed_approaches]
+ WARM = [plan_context, loaded_skills, recent_files, decisions]
+ COLD = []  // evicted to disk: old_files, completed_tasks, resolved_errors
+
+ // Budget enforcement (30K and 90K are 15% and 45% of a 200K-token window):
+ IF hot_tokens > 30K: ERROR — hot tier should never exceed budget
+ IF warm_tokens > 90K: compress WARM to 50% (keep key points)
+ IF total > 70% of window: evict all COLD, compress WARM to 30%
+ IF total > 85% of window: keep only HOT, trigger handoff
+
+ // Dynamic tier transitions:
+ IF segment.last_accessed >= 5 iterations ago: demote to COLD
+ IF cold_segment.referenced_by_current_task: promote to WARM
+ IF warm_segment.is_active_error: promote to HOT
+ IF hot_segment.error_resolved: demote to WARM → COLD
+ ```
+
+ ## Implementation Steps
+
+ 1. Define tier assignment function based on segment type and recency
+ 2. Set token budgets per tier (15% HOT, 45% WARM, 0% COLD)
+ 3. Tag each context segment with its tier on creation/injection
+ 4. Run tier reassignment every 3 iterations (not every iteration — overhead)
+ 5. When context exceeds 70%, evict COLD first (it is already saved to disk), then compress WARM
+ 6. Preserve HOT tier unconditionally — never compress or evict
+ 7. Log tier transitions for debugging context management issues
+
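The budget thresholds from the steps above can be turned into a small planner. The Python sketch below assumes a 200,000-token window, which is what the 30K and 90K figures imply at 15% and 45%. `plan_actions` and the action strings are hypothetical, and the rule that the 85% check takes precedence over the 70% check is an assumption.

```python
from typing import List

CONTEXT_WINDOW_TOKENS = 200_000   # assumed: 30K / 0.15 and 90K / 0.45
HOT_BUDGET_TOKENS = 30_000        # 15% of the window
WARM_BUDGET_TOKENS = 90_000       # 45% of the window


def plan_actions(hot_tokens: int, warm_tokens: int, cold_tokens_in_window: int) -> List[str]:
    """Return the ordered list of actions for the current token usage."""
    actions: List[str] = []
    total = hot_tokens + warm_tokens + cold_tokens_in_window
    usage = total / CONTEXT_WINDOW_TOKENS

    if hot_tokens > HOT_BUDGET_TOKENS:
        actions.append("error: HOT tier over budget")

    if usage > 0.85:
        actions.append("keep only HOT and trigger handoff")
    elif usage > 0.70:
        actions.append("evict COLD to disk")
        actions.append("compress WARM to 30%")
    elif warm_tokens > WARM_BUDGET_TOKENS:
        actions.append("compress WARM to 50%")
    return actions


print(plan_actions(20_000, 100_000, 30_000))
# prints ['evict COLD to disk', 'compress WARM to 30%']
```

Returning a plan rather than mutating context directly makes each decision loggable, in line with step 7.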
+ ## When to Use
+
+ - Any AI agent system with long-running tasks (10+ iterations)
+ - Multi-phase execution pipelines where old phase context becomes stale
+ - Agents that read many files but only work on a few at a time
+ - Systems where context compaction causes "amnesia" of critical state
+ - When you need to extend useful context life before forced handoff
+
+ ## When NOT to Use
+
+ - Short conversations (< 5 turns) — overhead isn't worth it
+ - Single-file edits with no accumulated context
+ - Systems with unlimited context windows (if such a thing existed)
+ - When all context segments are equally critical (rare but possible)
+
+ ## Common Mistakes
+
+ - Setting HOT budget too large — defeats the purpose of tiering. HOT should be < 20% of window
+ - Never demoting segments — HOT tier grows unbounded if old errors aren't demoted after resolution
+ - Compressing HOT tier during context pressure — this causes immediate task failure
+ - Evicting COLD without saving to disk — you lose the ability to retrieve if needed later
+ - Running tier reassignment every iteration — the overhead reduces net context benefit
+
+ ## Related Skills
+
+ - [RESEARCH_BACKED_WORKFLOW_UPGRADE](../methodology/RESEARCH_BACKED_WORKFLOW_UPGRADE.md) - Research methodology that discovered this pattern
+ - [AGENT_SELF_IMPROVEMENT_LOOP](../methodology/AGENT_SELF_IMPROVEMENT_LOOP.md) - Agent improvement patterns
+
+ ## References
+
+ - Spotify "Honk" Architecture (2024) — tiered context management reduces failures by 40%
+ - ACON "Active Context Compression" (2025-2026) — 26-54% token reduction via selective compression
+ - PAACE "Plan-Aware Agent Context Engineering" (2025) — plan-aware preservation during compression
+ - Focus (ACL 2025) — active forgetting for proactive context management
+ - Contributed from: dominion-flow v10.1 research session