@thierrynakoa/fire-flow 10.0.0 → 12.2.1
- package/.claude-plugin/plugin.json +8 -8
- package/ARCHITECTURE-DIAGRAM.md +7 -4
- package/COMMAND-REFERENCE.md +33 -13
- package/DOMINION-FLOW-OVERVIEW.md +581 -421
- package/QUICK-START.md +3 -3
- package/README.md +101 -44
- package/TROUBLESHOOTING.md +264 -264
- package/agents/fire-executor.md +200 -116
- package/agents/fire-fact-checker.md +276 -276
- package/agents/fire-phoenix-analyst.md +394 -0
- package/agents/fire-planner.md +145 -53
- package/agents/fire-project-researcher.md +155 -155
- package/agents/fire-research-synthesizer.md +166 -166
- package/agents/fire-researcher.md +144 -59
- package/agents/fire-roadmapper.md +215 -203
- package/agents/fire-verifier.md +247 -65
- package/agents/fire-vision-architect.md +381 -0
- package/commands/fire-0-orient.md +476 -476
- package/commands/fire-1a-new.md +216 -0
- package/commands/fire-1b-research.md +210 -0
- package/commands/fire-1c-setup.md +254 -0
- package/commands/{fire-1a-discuss.md → fire-1d-discuss.md} +35 -7
- package/commands/fire-3-execute.md +55 -2
- package/commands/fire-4-verify.md +61 -0
- package/commands/fire-5-handoff.md +2 -2
- package/commands/fire-6-resume.md +37 -2
- package/commands/fire-add-new-skill.md +2 -2
- package/commands/fire-autonomous.md +20 -3
- package/commands/fire-brainstorm.md +1 -1
- package/commands/fire-complete-milestone.md +2 -2
- package/commands/fire-cost.md +183 -0
- package/commands/fire-dashboard.md +2 -2
- package/commands/fire-debug.md +663 -663
- package/commands/fire-loop-resume.md +2 -2
- package/commands/fire-loop-stop.md +1 -1
- package/commands/fire-loop.md +1168 -1168
- package/commands/fire-map-codebase.md +3 -3
- package/commands/fire-new-milestone.md +356 -356
- package/commands/fire-phoenix.md +603 -0
- package/commands/fire-reflect.md +235 -235
- package/commands/fire-research.md +246 -246
- package/commands/fire-search.md +1 -1
- package/commands/fire-skills-diff.md +3 -3
- package/commands/fire-skills-history.md +3 -3
- package/commands/fire-skills-rollback.md +7 -7
- package/commands/fire-skills-sync.md +5 -5
- package/commands/fire-test.md +9 -9
- package/commands/fire-todos.md +1 -1
- package/commands/fire-update.md +5 -5
- package/hooks/hooks.json +16 -16
- package/hooks/run-hook.sh +8 -8
- package/hooks/run-session-end.sh +7 -7
- package/hooks/session-end.sh +90 -90
- package/hooks/session-start.sh +1 -1
- package/package.json +4 -2
- package/plugin.json +7 -7
- package/references/metrics-and-trends.md +1 -1
- package/skills-library/SKILLS-INDEX.md +588 -588
- package/skills-library/_general/methodology/AUTONOMOUS_ORCHESTRATION.md +182 -0
- package/skills-library/_general/methodology/BACKWARD_PLANNING_INTERVIEW.md +307 -0
- package/skills-library/_general/methodology/CIRCUIT_BREAKER_INTELLIGENCE.md +163 -0
- package/skills-library/_general/methodology/CONTEXT_ROTATION.md +151 -0
- package/skills-library/_general/methodology/DEAD_ENDS_SHELF.md +188 -0
- package/skills-library/_general/methodology/DESIGN_PHILOSOPHY_ENFORCEMENT.md +152 -0
- package/skills-library/_general/methodology/INTERNAL_CONSISTENCY_AUDIT.md +212 -0
- package/skills-library/_general/methodology/LIVE_BREADCRUMB_PROTOCOL.md +242 -0
- package/skills-library/_general/methodology/PHOENIX_REBUILD_METHODOLOGY.md +251 -0
- package/skills-library/_general/methodology/QUALITY_GATES_AND_VERIFICATION.md +157 -0
- package/skills-library/_general/methodology/RELIABILITY_PREDICTION.md +104 -0
- package/skills-library/_general/methodology/REQUIREMENTS_DECOMPOSITION.md +155 -0
- package/skills-library/_general/methodology/SELF_TESTING_FEEDBACK_LOOP.md +143 -0
- package/skills-library/_general/methodology/STACK_COMPATIBILITY_MATRIX.md +178 -0
- package/skills-library/_general/methodology/TIERED_CONTEXT_ARCHITECTURE.md +118 -0
- package/skills-library/_general/methodology/ZERO_FRICTION_CLI_SETUP.md +312 -0
- package/skills-library/_general/methodology/autonomous-multi-phase-build.md +133 -0
- package/skills-library/_general/methodology/claude-md-archival.md +280 -0
- package/skills-library/_general/methodology/debug-swarm-researcher-escape-hatch.md +240 -240
- package/skills-library/_general/methodology/git-worktrees-parallel.md +232 -0
- package/skills-library/_general/methodology/llm-judge-memory-crud.md +241 -0
- package/skills-library/_general/methodology/multi-project-autonomous-build.md +360 -0
- package/skills-library/_general/methodology/shell-autonomous-loop-fixplan.md +238 -238
- package/skills-library/_general/patterns-standards/GOF_DESIGN_PATTERNS_FOR_AI_AGENTS.md +358 -0
- package/skills-library/methodology/BREATH_BASED_PARALLEL_EXECUTION.md +1 -1
- package/skills-library/methodology/RESEARCH_BACKED_WORKFLOW_UPGRADE.md +1 -1
- package/skills-library/methodology/SABBATH_REST_PATTERN.md +1 -1
- package/templates/ASSUMPTIONS.md +1 -1
- package/templates/BLOCKERS.md +1 -1
- package/templates/DECISION_LOG.md +1 -1
- package/templates/phase-prompt.md +1 -1
- package/templates/phoenix-comparison.md +80 -0
- package/version.json +2 -2
- package/workflows/handoff-session.md +1 -1
- package/workflows/new-project.md +2 -2
- package/commands/fire-1-new.md +0 -281
package/skills-library/_general/methodology/SELF_TESTING_FEEDBACK_LOOP.md
@@ -0,0 +1,143 @@
---
name: self-testing-feedback-loop
category: methodology
version: 1.0.0
contributed: 2026-03-06
contributor: dominion-flow
last_updated: 2026-03-06
contributors:
  - dominion-flow
tags: [circuit-breaker, testing, error-recovery, ai-agent, feedback-loop]
difficulty: medium
usage_count: 0
success_rate: 100
---

# Self-Testing Feedback Loop for Circuit Breakers

## Problem

When AI agents hit errors during code generation, they typically rotate to a different approach — but never verify whether the rotation actually fixed anything before continuing. This creates "blind rotation" where the agent tries approach after approach without feedback, wasting context and iterations. The circuit breaker detects spinning (same error repeated), but the rotation itself has no verification step.

Symptoms:
- Agent rotates approaches but keeps hitting the same underlying error
- Circuit breaker trips after 5 identical errors despite "different" approaches
- Agent reports "trying a new approach" but the test output is identical
- Iterations are wasted on approaches that look different but fail identically

## Solution Pattern

After every approach rotation triggered by the circuit breaker, enforce a **test-classify-feedback** cycle before continuing:

1. **TEST** — Run the most specific available test (failing test > module test > full suite)
2. **CLASSIFY** — Compare test result against the previous error hash
3. **FEEDBACK** — Inject the classification back into the agent's context for the next iteration

The classification has four outcomes:
- **PASS** → Rotation worked. Clear warning state. Continue.
- **NEW ERROR** → Different failure. That's progress. Feed new error into context.
- **SAME ERROR** → Rotation didn't help. Double-increment spin counter (accelerate circuit break).
- **NO TESTS** → Can't verify. Log warning, rely on file-change metrics only.

The key insight: **double-incrementing the spin counter** when a rotation produces the same error. This accelerates the circuit break for approaches that look different but are functionally identical — preventing the agent from exhausting 5 rotation attempts on variations of the same broken strategy.
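
The four outcomes and the double increment can be sketched in Python as follows. This is a minimal illustration, not the package's implementation: `Outcome`, `SelfTestResult`, `classify`, and `update_spin_counter` are hypothetical names, and leaving the counter unchanged on NO TESTS is an assumption based on the pattern described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PASS = "PASS"
    NEW_ERROR = "NEW_ERROR"
    SAME_ERROR = "SAME_ERROR"
    NO_TESTS = "NO_TESTS"


@dataclass
class SelfTestResult:
    """Hypothetical container for one self-test run after a rotation."""
    passed: bool
    error_hash: Optional[str] = None  # None when the test passed


def classify(result: Optional[SelfTestResult],
             previous_error_hash: Optional[str]) -> Outcome:
    """Map a test result onto one of the four outcomes."""
    if result is None:  # no test could be run at all
        return Outcome.NO_TESTS
    if result.passed:
        return Outcome.PASS
    if result.error_hash == previous_error_hash:
        return Outcome.SAME_ERROR
    return Outcome.NEW_ERROR


def update_spin_counter(spin_counter: int, outcome: Outcome) -> int:
    """Reset on progress, double-increment on an ineffective rotation."""
    if outcome in (Outcome.PASS, Outcome.NEW_ERROR):
        return 0
    if outcome is Outcome.SAME_ERROR:
        return spin_counter + 2
    return spin_counter  # NO_TESTS: assumed to leave the counter unchanged
```

Assuming the counter starts at 0 and the breaker trips at 5, three ineffective rotations in a row (0, 2, 4, 6) are enough to trip it, rather than five.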

## Code Example

```
// Before (problematic) — blind rotation
ON circuit_breaker_warning:
  rotation = suggest_new_approach(approaches_tried)
  inject_into_context(rotation)
  continue_execution() // Hope for the best

// After (solution) — verified rotation with feedback
ON circuit_breaker_warning:
  rotation = suggest_new_approach(approaches_tried)
  inject_into_context(rotation)

  // Execute one iteration with new approach
  result = execute_iteration()

  // TEST: Run most specific available test
  test_result = run_tests(priority=[
    specific_failing_test, // Best: exact test that's failing
    module_test_file, // Good: tests for the modified module
    full_suite_if_quick, // OK: full suite if < 30 seconds
  ])

  // CLASSIFY: Compare against previous error
  IF no_tests_available: // Checked first: without a result, the hash comparisons below cannot be trusted
    health = UNKNOWN
    record("No tests — manual verification needed")

  ELIF test_result.passed:
    health = PROGRESS
    spin_counter = 0 // Clear — rotation worked
    record("Rotation successful: {approach}")

  ELIF test_result.error_hash != previous_error_hash:
    health = PROGRESS // Different error = forward movement
    spin_counter = 0
    feed_new_error(test_result) // New error feeds into next iteration
    record("New error after rotation: {new_hash}")

  ELIF test_result.error_hash == previous_error_hash:
    health = SPINNING
    spin_counter += 2 // DOUBLE increment — rotation failed
    record("Rotation ineffective: same error {hash}")

  // FEEDBACK: Inject result into next iteration
  inject_into_context(
    "SELF-TEST after rotation: {test_result.status}
     Diagnosis: {health}
     Action: {recommended_next_step}"
  )
```

## Implementation Steps

1. Hook into the circuit breaker's approach rotation trigger
2. After rotation, execute exactly ONE iteration before testing
3. Run the most specific available test (don't waste time on full suite if a specific test exists)
4. Hash-compare the new error against the previous error
5. Classify as PASS / NEW_ERROR / SAME_ERROR / NO_TESTS
6. If SAME_ERROR: double-increment the spin counter to accelerate circuit break
7. Inject the test result and classification into the agent's context
8. Record the classification in the loop tracking file
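
Step 3 (choose the most specific available test) might look like the sketch below. It assumes pytest as the test runner, which the skill does not specify; `pick_test_command` and its parameters are hypothetical. The 30-second cutoff for running the full suite comes from the code example in this skill.

```python
from typing import List, Optional


def pick_test_command(
    failing_test: Optional[str],
    module_test_file: Optional[str],
    full_suite_seconds: Optional[float],
    quick_limit_seconds: float = 30.0,
) -> Optional[List[str]]:
    """Return the most specific test command available, or None (NO_TESTS)."""
    if failing_test:
        # Best: the exact failing test, e.g. "tests/test_api.py::test_login"
        return ["pytest", failing_test]
    if module_test_file:
        # Good: the tests for the module that was just modified
        return ["pytest", module_test_file]
    if full_suite_seconds is not None and full_suite_seconds < quick_limit_seconds:
        # OK: the full suite, but only when it is known to be quick
        return ["pytest"]
    return None
```

Returning `None` rather than raising keeps the NO_TESTS outcome an ordinary classification result instead of an error path.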

## When to Use

- Any AI agent system with a circuit breaker or loop detection
- Code generation agents that run tests after changes
- Autonomous debugging loops (test-diagnose-fix patterns)
- Any system where "trying a different approach" needs verification
- Long-running execution pipelines with error recovery

## When NOT to Use

- Projects with no test suite (the loop relies on test output)
- Documentation-only changes (no testable code)
- When the circuit breaker is already TRIPPED (we're stopping, not testing)
- Interactive/manual debugging where the human provides feedback
- Single-shot code generation (no iteration loop)

## Common Mistakes

- Running the full test suite when a specific failing test exists — wastes 10x the time
- Not double-incrementing on SAME_ERROR — allows 5 useless rotations before breaking
- Testing before the rotation is applied — tests the old approach, not the new one
- Classifying "different line number, same error type" as progress — normalize error hashes first
- Skipping the feedback injection — agent doesn't know its rotation failed
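
The "normalize error hashes first" advice can be illustrated with a small sketch. The normalization rules here (memory addresses, line and column numbers, whitespace) are assumptions about what usually varies between otherwise identical failures; `normalize_error` and `error_hash` are hypothetical helper names.

```python
import hashlib
import re


def normalize_error(text: str) -> str:
    """Strip details that change between runs without changing the failure."""
    text = re.sub(r"0x[0-9a-fA-F]+", "<addr>", text)  # memory addresses
    text = re.sub(r"\bline \d+", "line <n>", text)    # Python-style "line 42"
    text = re.sub(r":\d+(:\d+)?", ":<n>", text)       # "file.ts:42" or "file.ts:42:7"
    return re.sub(r"\s+", " ", text).strip()          # collapse whitespace


def error_hash(text: str) -> str:
    """Short, stable hash of the normalized error text."""
    digest = hashlib.sha256(normalize_error(text).encode("utf-8"))
    return digest.hexdigest()[:16]
```

With this in place, the same `TypeError` raised ten lines further down after an edit hashes to the same value and is classified as SAME_ERROR, not as progress.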

## Related Skills

- [TIERED_CONTEXT_ARCHITECTURE](../methodology/TIERED_CONTEXT_ARCHITECTURE.md) - Context management that preserves error state
- [AGENT_SELF_IMPROVEMENT_LOOP](../methodology/AGENT_SELF_IMPROVEMENT_LOOP.md) - Broader agent improvement patterns
- [DIFFICULTY_AWARE_AGENT_ROUTING](../methodology/DIFFICULTY_AWARE_AGENT_ROUTING.md) - Route by difficulty level

## References

- SWE-Agent + Reflexion "Test-Diagnose-Fix" loop (NeurIPS 2024) — 37% higher fix rates
- Manus AI error preservation pattern (Feb 2026) — errors are most valuable context
- frankbria's Ralph loop fork — quantitative convergence detection
- Contributed from: dominion-flow v10.1 research session
package/skills-library/_general/methodology/STACK_COMPATIBILITY_MATRIX.md
@@ -0,0 +1,178 @@
---
name: STACK_COMPATIBILITY_MATRIX
category: methodology
description: Reference data for proven stack combinations, known incompatibilities, and project-type mapping
version: 1.0.0
tags: [architecture, stack, compatibility, vision]
---

# Stack Compatibility Matrix

Reference data for the `fire-vision-architect` agent. Contains proven combinations, known conflicts, and project-type recommendations.

---

## Proven Stack Combinations

### MERN (MongoDB + Express + React + Node.js)
- **Best for:** Rapid prototyping, real-time apps, flexible-schema projects
- **Industry:** Netflix, Uber (parts)
- **Auth pairing:** Passport.js, JWT custom, Auth0
- **Hosting:** Heroku, Railway, DigitalOcean, AWS EC2
- **DB extensions:** Redis (caching), Mongoose ODM
- **Strengths:** Single language (JS) everywhere, huge npm ecosystem, fast iteration
- **Weaknesses:** No ACID by default, schema discipline required, callback complexity

### PERN (PostgreSQL + Express + React + Node.js)
- **Best for:** Data-integrity-critical apps, financial, CRM, LMS
- **Industry:** Many YC startups, internal tools
- **Auth pairing:** Passport.js, JWT custom, better-auth
- **Hosting:** Heroku, Railway, Render, AWS RDS + EC2
- **DB extensions:** Prisma ORM, Drizzle ORM, pg-promise
- **Strengths:** ACID compliance, complex queries, relational integrity, mature tooling
- **Weaknesses:** Schema migrations needed, slightly slower prototyping than MongoDB

### Next.js + Supabase
- **Best for:** Solo devs, MVPs, startups wanting speed-to-market
- **Industry:** cal.com, Resend, many indie SaaS
- **Auth pairing:** Supabase Auth (built-in), NextAuth
- **Hosting:** Vercel (native), Netlify, Cloudflare Pages
- **DB extensions:** Supabase Realtime, Edge Functions, Row Level Security
- **Strengths:** Fastest time-to-market, built-in auth/storage/realtime, generous free tier
- **Weaknesses:** Vendor lock-in risk, limited custom backend logic, Supabase scaling costs

### Next.js + PostgreSQL + Prisma
- **Best for:** Full-stack teams wanting type safety and control
- **Industry:** Vercel apps, many B2B SaaS
- **Auth pairing:** NextAuth/Auth.js, better-auth, Clerk
- **Hosting:** Vercel + Neon/Supabase DB, Railway
- **DB extensions:** Prisma Studio, connection pooling (PgBouncer)
- **Strengths:** Type-safe DB access, excellent DX, SSR/SSG flexibility, full control
- **Weaknesses:** More setup than Supabase all-in-one, Prisma cold start in serverless

### NestJS + PostgreSQL + Redis
- **Best for:** Enterprise, microservices, teams > 5 developers
- **Industry:** Enterprise Node.js projects, fintech
- **Auth pairing:** Passport.js (NestJS module), custom JWT, Keycloak
- **Hosting:** AWS ECS/EKS, GCP Cloud Run, DigitalOcean Kubernetes
- **DB extensions:** TypeORM, MikroORM, Bull (job queues via Redis)
- **Strengths:** Angular-like structure, dependency injection, microservice-ready, testable
- **Weaknesses:** Steeper learning curve, more boilerplate, overkill for small projects

### Django + PostgreSQL + React/Vue
- **Best for:** Content-heavy apps, admin-intensive, rapid CRUD
- **Industry:** Instagram (early), Pinterest, Eventbrite
- **Auth pairing:** Django Auth (built-in), django-allauth
- **Hosting:** Heroku, Railway, AWS Elastic Beanstalk, DigitalOcean
- **DB extensions:** Django ORM, django-rest-framework, Celery (async tasks)
- **Strengths:** Batteries-included, admin panel free, excellent ORM, security defaults
- **Weaknesses:** Python + JS split, frontend integration friction, monolith tendency

### Rails + PostgreSQL + Hotwire/React
- **Best for:** Rapid development, startups, content platforms
- **Industry:** GitHub, Shopify, Basecamp, Airbnb (early)
- **Auth pairing:** Devise, OmniAuth
- **Hosting:** Heroku, Render, Fly.io, Hatchbox
- **DB extensions:** ActiveRecord, Sidekiq (background jobs), ActionCable (WebSockets)
- **Strengths:** Convention over configuration, fastest scaffolding, mature ecosystem
- **Weaknesses:** Ruby learning curve, performance ceiling, fewer JS devs know it

### Remix + PostgreSQL + Prisma
- **Best for:** Form-heavy apps, progressive enhancement, accessibility-first
- **Industry:** Shopify (Hydrogen), newer startups
- **Auth pairing:** Remix Auth, custom session-based
- **Hosting:** Fly.io, Vercel, Cloudflare Workers
- **Strengths:** Web standards first, nested routing, excellent forms, progressive enhancement
- **Weaknesses:** Smaller ecosystem than Next.js, fewer tutorials, less community support

### Astro + Headless CMS
- **Best for:** Content sites, blogs, documentation, marketing
- **Industry:** Content-focused startups, developer docs
- **Auth pairing:** Usually not needed; if needed, Auth.js
- **Hosting:** Vercel, Netlify, Cloudflare Pages
- **CMS options:** Sanity, Contentful, Strapi, WordPress headless
- **Strengths:** Zero JS by default, island architecture, fastest page loads
- **Weaknesses:** Not for app-like interactivity, limited client-side state

---

## Known Incompatibilities

### Database Conflicts
| Combination | Problem | Resolution |
|-------------|---------|------------|
| MongoDB + PostgreSQL (both as primary) | Redundant primary DBs, split data model, double migration burden | Pick one based on data shape: relational → PostgreSQL, flexible → MongoDB |
| Firebase Firestore + PostgreSQL | Two databases with different paradigms, sync nightmares | Use Firebase only for real-time features alongside PostgreSQL, or go all-in on one |
| SQLite + production multi-user | SQLite serializes writes (one writer at a time), so it is not suitable for concurrent multi-user production writes | SQLite for dev/embedded only; PostgreSQL or MySQL for production |

### Frontend Conflicts
| Combination | Problem | Resolution |
|-------------|---------|------------|
| React + Vue (same project) | Two virtual DOMs, double bundle size, conflicting state management | Pick one. React has larger ecosystem; Vue has simpler API |
| React + jQuery | jQuery DOM manipulation conflicts with React's virtual DOM | Remove jQuery; use React refs for DOM access |
| Next.js + Create React App | CRA is client-only; Next.js handles routing and SSR differently | Use Next.js alone (it supersedes CRA) |

### Auth Conflicts
| Combination | Problem | Resolution |
|-------------|---------|------------|
| Firebase Auth + Auth0 + custom JWT | Three auth systems = three user tables, token confusion | Pick one: Firebase Auth (free tier), Auth0 (enterprise), or custom (full control) |
| Supabase Auth + NextAuth | Both manage sessions; double middleware, token conflicts | Use one: Supabase Auth if using Supabase DB, NextAuth if using custom DB |

### Hosting Conflicts
| Combination | Problem | Resolution |
|-------------|---------|------------|
| Serverless (Vercel/Netlify) + long-running processes | Serverless has execution time limits (10-60s), can't run background jobs | Use a separate worker service (Railway, Render background) or switch to container hosting |
| Static hosting + server-side rendering | Static hosts (GitHub Pages, S3) can't run SSR | Use Vercel/Netlify/Cloudflare which support both, or go pure static |
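
One way an agent could consume these tables is as a lookup keyed on component sets. The sketch below is an illustration only: `CONFLICTS`, `find_conflicts`, and the lowercase component keys are hypothetical, and only four rows are encoded. Rows whose conflict depends on how a component is used (for example MongoDB and PostgreSQL both acting as the primary database) need more than set membership and are left out.

```python
from typing import FrozenSet, List, Set, Tuple

# Each entry: (components that conflict when all are present, resolution).
CONFLICTS: List[Tuple[FrozenSet[str], str]] = [
    (frozenset({"react", "vue"}), "Pick one frontend framework"),
    (frozenset({"react", "jquery"}), "Remove jQuery; use React refs for DOM access"),
    (frozenset({"next.js", "create-react-app"}), "Use Next.js alone"),
    (frozenset({"supabase-auth", "nextauth"}), "Use one session manager"),
]


def find_conflicts(stack: Set[str]) -> List[str]:
    """Return the resolution for every known conflict present in the stack."""
    normalized = {name.lower() for name in stack}
    return [fix for combo, fix in CONFLICTS if combo <= normalized]
```

For example, `find_conflicts({"React", "jQuery", "PostgreSQL"})` returns the jQuery resolution, while a stack with no listed conflict returns an empty list.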

---

## Project-Type → Stack Mapping

### Simple Projects (≤3 features, 1 user role)
| Type | Recommended Stack | Why |
|------|------------------|-----|
| Portfolio/Landing | Astro + Tailwind + Vercel | Zero JS, fast loads, cheap hosting |
| Blog | Astro + Headless CMS + Vercel | Content-first, markdown support |
| Todo/Notes | Next.js + Supabase | Quick CRUD with auth, free tier |

### Standard Projects (4-10 features, multiple roles)
| Type | Recommended Stack | Why |
|------|------------------|-----|
| SaaS | Next.js + PostgreSQL + Prisma + Vercel | Type safety, SSR, scalable |
| E-commerce | Next.js + Supabase + Stripe | Built-in auth, real-time inventory |
| LMS | PERN or Next.js + PostgreSQL | Relational data (courses→lessons→users) |
| CRM | PERN + Redis | Complex queries, caching for dashboards |

### Enterprise/Complex (10+ features, compliance, scale)
| Type | Recommended Stack | Why |
|------|------------------|-----|
| Multi-tenant SaaS | NestJS + PostgreSQL + Redis + AWS | Row-level security, job queues, horizontal scale |
| Real-time Collab | Next.js + Supabase Realtime or Socket.io | Built-in WebSocket support |
| ML Pipeline | Django + PostgreSQL + Celery + Redis | Python ML ecosystem, async task processing |
| Marketplace | NestJS + PostgreSQL + Stripe Connect + S3 | Complex payment splits, file storage |

---

## 2026 Industry Trends

### Rising
- **Supabase** — Replacing Firebase for new projects (open source, PostgreSQL-based)
- **Drizzle ORM** — Lighter alternative to Prisma, better serverless performance
- **better-auth** — Rising auth library for Node.js (simpler than NextAuth)
- **Bun** — Faster Node.js alternative, gaining production adoption
- **Cloudflare Workers** — Edge-first deployment for global latency
- **Turso/LibSQL** — SQLite for production (embedded replicas, edge-ready)

### Stable
- **Next.js** — Dominant full-stack React framework
- **PostgreSQL** — Default database for new projects
- **Tailwind CSS** — Default styling approach
- **Vercel/Railway** — Default hosting for Node.js apps
- **Stripe** — Default payment processing

### Declining
- **Create React App** — Deprecated, replaced by Vite or Next.js
- **Firebase (for new projects)** — Supabase taking market share
- **Heroku (free tier)** — Gone; Railway/Render filling the gap
- **Webpack (manual config)** — Vite/Turbopack replacing
- **MongoDB (as default)** — PostgreSQL preferred unless schema flexibility is critical
package/skills-library/_general/methodology/TIERED_CONTEXT_ARCHITECTURE.md
@@ -0,0 +1,118 @@
---
name: tiered-context-architecture
category: methodology
version: 1.0.0
contributed: 2026-03-06
contributor: dominion-flow
last_updated: 2026-03-06
contributors:
  - dominion-flow
tags: [context-management, ai-agent, llm, token-optimization, memory-tiers]
difficulty: medium
usage_count: 0
success_rate: 100
---

# Tiered Context Architecture (Hot/Warm/Cold)

## Problem

AI agents working on long-running tasks fill their context window with a mix of critical and stale information. Without explicit categorization, all context is treated equally — leading to premature context exhaustion, irrelevant information competing with critical state, and poor compaction decisions that drop important details while preserving noise.

Symptoms:
- Agent "forgets" current task details while retaining old file contents
- Context compaction drops error messages but keeps completed task descriptions
- Agent hits context limits mid-task with no clear eviction strategy
- Output quality degrades because reasoning competes with stale data

## Solution Pattern

Categorize every context segment into three tiers based on access recency and task relevance, then apply tier-specific retention policies:

**HOT** (never compress, ~15% budget): Current task, active errors, recitation block, circuit breaker state, failed approaches list. This is the "working memory" — losing any of it causes immediate task failure.

**WARM** (compressible, ~45% budget): Plan context, loaded skills, recently-read files, recent decisions, episodic recall. Useful for current phase but can be compressed to key points when space is needed.

**COLD** (evictable, 0% budget in window): Files read 5+ iterations ago, completed task details, resolved errors, unused skills. Saved to disk, retrievable on demand, but not occupying context window.

The key insight: tier assignment is **dynamic** — segments promote (COLD→WARM when re-referenced) and demote (HOT→WARM when task changes) based on actual usage patterns, not static rules.
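
The promotion and demotion rules can be sketched as a single reassignment function. This is a minimal illustration under stated assumptions: `Tier`, `Segment`, and `reassign` are hypothetical names, "5+ iterations" is read as five or more iterations since last access, and HOT segments are assumed never to age out on recency alone. The task-change demotion (HOT→WARM) is not modeled.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


@dataclass
class Segment:
    name: str
    tier: Tier
    last_accessed: int  # iteration index of the most recent access
    is_active_error: bool = False
    error_resolved: bool = False
    referenced_by_current_task: bool = False


def reassign(segment: Segment, iteration: int, stale_after: int = 5) -> Tier:
    """Return the tier a segment should hold at the given iteration."""
    if segment.is_active_error and not segment.error_resolved:
        return Tier.HOT   # active errors are working memory
    if segment.tier is Tier.HOT and segment.error_resolved:
        return Tier.WARM  # first step of HOT -> WARM -> COLD
    if segment.tier is Tier.COLD and segment.referenced_by_current_task:
        return Tier.WARM  # re-referenced cold data is promoted
    if segment.tier is not Tier.HOT and iteration - segment.last_accessed >= stale_after:
        return Tier.COLD  # stale data is demoted (saved to disk, out of window)
    return segment.tier
```

The rule order matters: the re-reference check runs before the staleness check, so a cold segment the current task needs is promoted even though it has not been touched recently.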

## Code Example

```
// Before (problematic) — flat context, no tiers
context = [
  system_prompt, // critical
  file_read_10_turns_ago, // stale — wastes space
  current_task, // critical
  old_error_resolved, // stale — wastes space
  active_error, // critical
  completed_task_1, // stale
  completed_task_2, // stale
  skill_never_used, // stale
]
// Result: 60% of context is stale. Compaction randomly drops items.

// After (solution) — tiered context with explicit budgets
HOT = [current_task, active_error, recitation, circuit_breaker, failed_approaches]
WARM = [plan_context, loaded_skills, recent_files, decisions]
COLD = [] // evicted to disk: old_files, completed_tasks, resolved_errors

// Budget enforcement (the absolute thresholds correspond to a 200K-token window: 15% = 30K, 45% = 90K):
IF hot_tokens > 30K: ERROR — hot tier should never exceed budget
IF warm_tokens > 90K: compress WARM to 50% (keep key points)
IF total > 70%: evict all COLD, compress WARM to 30%
IF total > 85%: keep only HOT, trigger handoff

// Dynamic tier transitions:
IF segment.last_accessed >= 5 iterations ago: demote to COLD
IF cold_segment.referenced_by_current_task: promote to WARM
IF warm_segment.is_active_error: promote to HOT
IF hot_segment.error_resolved: demote to WARM → COLD
```

## Implementation Steps

1. Define tier assignment function based on segment type and recency
2. Set token budgets per tier (15% HOT, 45% WARM, 0% COLD)
3. Tag each context segment with its tier on creation/injection
4. Run tier reassignment every 3 iterations (not every iteration — overhead)
5. When context exceeds 70%, compress WARM tier first, then evict COLD
6. Preserve HOT tier unconditionally — never compress or evict
7. Log tier transitions for debugging context management issues
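
The budget checks behind steps 2, 5, and 6 could be expressed as a function that returns the actions to take. This is a sketch, not the package's code: `pressure_actions` is a hypothetical name, and the 200,000-token default window is inferred from the 30K and 90K thresholds in the code example (15% and 45% of 200K). Integer arithmetic is used so the percentage comparisons are exact.

```python
from typing import List


def pressure_actions(hot_tokens: int, warm_tokens: int, cold_tokens: int,
                     window_tokens: int = 200_000) -> List[str]:
    """List the context-management actions implied by the current token usage."""
    total = hot_tokens + warm_tokens + cold_tokens
    actions: List[str] = []

    if hot_tokens * 100 > 15 * window_tokens:
        actions.append("error: HOT tier over budget")

    if total * 100 > 85 * window_tokens:
        actions.append("keep only HOT and trigger handoff")
    elif total * 100 > 70 * window_tokens:
        actions.append("evict COLD to disk")
        actions.append("compress WARM to 30%")
    elif warm_tokens * 100 > 45 * window_tokens:
        actions.append("compress WARM to 50%")

    return actions
```

Note that no branch ever compresses or evicts HOT content: an over-budget HOT tier is reported as an error for the caller to resolve, in line with step 6.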

## When to Use

- Any AI agent system with long-running tasks (10+ iterations)
- Multi-phase execution pipelines where old phase context becomes stale
- Agents that read many files but only work on a few at a time
- Systems where context compaction causes "amnesia" of critical state
- When you need to extend useful context life before forced handoff

## When NOT to Use

- Short conversations (< 5 turns) — overhead isn't worth it
- Single-file edits with no accumulated context
- Systems with unlimited context windows (if such a thing existed)
- When all context segments are equally critical (rare but possible)

## Common Mistakes

- Setting HOT budget too large — defeats the purpose of tiering. HOT should be < 20% of window
- Never demoting segments — HOT tier grows unbounded if old errors aren't demoted after resolution
- Compressing HOT tier during context pressure — this causes immediate task failure
- Evicting COLD without saving to disk — you lose the ability to retrieve if needed later
- Running tier reassignment every iteration — the overhead reduces net context benefit

## Related Skills

- [RESEARCH_BACKED_WORKFLOW_UPGRADE](../methodology/RESEARCH_BACKED_WORKFLOW_UPGRADE.md) - Research methodology that discovered this pattern
- [AGENT_SELF_IMPROVEMENT_LOOP](../methodology/AGENT_SELF_IMPROVEMENT_LOOP.md) - Agent improvement patterns

## References

- Spotify "Honk" Architecture (2024) — tiered context management reduces failures by 40%
- ACON "Active Context Compression" (2025-2026) — 26-54% token reduction via selective compression
- PAACE "Plan-Aware Agent Context Engineering" (2025) — plan-aware preservation during compression
- Focus (ACL 2025) — active forgetting for proactive context management
- Contributed from: dominion-flow v10.1 research session