npm - agentfootprint - Versions diffs - 2.11.4 → 2.11.6 - Mend

agentfootprint 2.11.4 → 2.11.6

Files changed (65)

package/README.md +321 -165
package/dist/core/Agent.js +55 -1
package/dist/core/Agent.js.map +1 -1
package/dist/core/agent/AgentBuilder.js +67 -1
package/dist/core/agent/AgentBuilder.js.map +1 -1
package/dist/core/agent/stages/callLLM.js +45 -17
package/dist/core/agent/stages/callLLM.js.map +1 -1
package/dist/core/agent/stages/reliabilityExecution.js +291 -0
package/dist/core/agent/stages/reliabilityExecution.js.map +1 -0
package/dist/core/agent/stages/toolCalls.js +9 -17
package/dist/core/agent/stages/toolCalls.js.map +1 -1
package/dist/core/slots/buildToolsSlot.js +101 -33
package/dist/core/slots/buildToolsSlot.js.map +1 -1
package/dist/esm/core/Agent.js +55 -1
package/dist/esm/core/Agent.js.map +1 -1
package/dist/esm/core/agent/AgentBuilder.js +67 -1
package/dist/esm/core/agent/AgentBuilder.js.map +1 -1
package/dist/esm/core/agent/stages/callLLM.js +45 -17
package/dist/esm/core/agent/stages/callLLM.js.map +1 -1
package/dist/esm/core/agent/stages/reliabilityExecution.js +287 -0
package/dist/esm/core/agent/stages/reliabilityExecution.js.map +1 -0
package/dist/esm/core/agent/stages/toolCalls.js +9 -17
package/dist/esm/core/agent/stages/toolCalls.js.map +1 -1
package/dist/esm/core/slots/buildToolsSlot.js +101 -33
package/dist/esm/core/slots/buildToolsSlot.js.map +1 -1
package/dist/esm/events/registry.js +6 -0
package/dist/esm/events/registry.js.map +1 -1
package/dist/esm/index.js +1 -0
package/dist/esm/index.js.map +1 -1
package/dist/esm/recorders/core/ToolsRecorder.js +23 -0
package/dist/esm/recorders/core/ToolsRecorder.js.map +1 -0
package/dist/esm/tool-providers/gatedTools.js +8 -3
package/dist/esm/tool-providers/gatedTools.js.map +1 -1
package/dist/events/registry.js +6 -0
package/dist/events/registry.js.map +1 -1
package/dist/index.js +5 -3
package/dist/index.js.map +1 -1
package/dist/recorders/core/ToolsRecorder.js +27 -0
package/dist/recorders/core/ToolsRecorder.js.map +1 -0
package/dist/tool-providers/gatedTools.js +8 -3
package/dist/tool-providers/gatedTools.js.map +1 -1
package/dist/types/core/Agent.d.ts +9 -1
package/dist/types/core/Agent.d.ts.map +1 -1
package/dist/types/core/agent/AgentBuilder.d.ts +61 -0
package/dist/types/core/agent/AgentBuilder.d.ts.map +1 -1
package/dist/types/core/agent/stages/callLLM.d.ts +8 -0
package/dist/types/core/agent/stages/callLLM.d.ts.map +1 -1
package/dist/types/core/agent/stages/reliabilityExecution.d.ts +66 -0
package/dist/types/core/agent/stages/reliabilityExecution.d.ts.map +1 -0
package/dist/types/core/agent/stages/toolCalls.d.ts +8 -0
package/dist/types/core/agent/stages/toolCalls.d.ts.map +1 -1
package/dist/types/core/slots/buildToolsSlot.d.ts +24 -4
package/dist/types/core/slots/buildToolsSlot.d.ts.map +1 -1
package/dist/types/events/payloads.d.ts +39 -0
package/dist/types/events/payloads.d.ts.map +1 -1
package/dist/types/events/registry.d.ts +7 -1
package/dist/types/events/registry.d.ts.map +1 -1
package/dist/types/index.d.ts +1 -0
package/dist/types/index.d.ts.map +1 -1
package/dist/types/recorders/core/ToolsRecorder.d.ts +19 -0
package/dist/types/recorders/core/ToolsRecorder.d.ts.map +1 -0
package/dist/types/tool-providers/gatedTools.d.ts.map +1 -1
package/dist/types/tool-providers/types.d.ts +43 -7
package/dist/types/tool-providers/types.d.ts.map +1 -1
package/package.json +6 -1

package/README.md CHANGED

@@ -1,12 +1,17 @@
 <p align="center">
-  <img width="220" alt="agentfootprint logo" src="https://github.com/user-attachments/assets/d548e2f4-cd49-4b9b-bdc2-2e6cbc2817ab" />
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/hero-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/hero-light.svg">
+    <img alt="agentfootprint mascot composing context flavors (Skills, Steering, Guardrails, RAG, Tool APIs, Memory) into three structured LLM slots (system, messages, tools) — the central abstraction, visualized." src="docs/assets/hero-light.svg" width="100%"/>
+  </picture>
 </p>
-<h1 align="center">agentfootprint</h1>
+<h1 align="center">Agentfootprint</h1>
 <p align="center">
-  <strong>Context engineering, abstracted.</strong>
+  <strong>We abstract context engineering — and hand back the trace.</strong><br/>
+  <strong>Live</strong> to develop · <strong>offline</strong> to monitor · <strong>detailed</strong> to improve.
 </p>
 <p align="center">
@@ -19,94 +24,314 @@
 ---
-## What is agentfootprint?
+## 1. What we abstract
-**A framework for building AI agents by treating context as a first-class runtime system.**
+When you build an Agentic Application, you collect domain-specific data and instructions, then wire them up based on what your system receives.
-Most agent code becomes context plumbing: which instructions go in `system`, which messages get added after a tool returns, which tools should be exposed right now, which memory to load for this tenant, which parts of the prompt are stable enough to cache.
+That data and those instructions wear many names — **Skills · Steering · Guardrails · RAG · Tool APIs · Memory** — with more on the way. But they all do the same thing: they **inject into one of three slots** in the LLM call (`system`, `messages`, `tools`).
-Without a framework, every agent hand-rolls this logic. Over time it becomes a fragile mix of prompt concatenation, tool routing, memory loading, cache markers, observability hooks, and retry logic.
+So we abstracted the injection itself.
-**agentfootprint abstracts that bookkeeping.** You declare what context to inject, where it lands, and when it activates. The framework owns the agent loop, recomposes the LLM call every iteration, records typed events, applies caching, and persists replayable checkpoints.
+<p align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/triggers-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/triggers-light.svg">
+    <img alt="agentfootprint — Every LLM call has 3 fixed slots (system, messages, tools). Every flavor lands in one slot under one of 4 fixed triggers (always · rule · on-tool-return · llm-activated). Sparkle streams flow from each trigger lane down to a specific pill inside its destination slot — same slot can hold pills from different triggers (RAG via rule, Instruction via on-tool-return), and the same flavor (Skill) can land in different slots." src="docs/assets/triggers-light.svg" width="100%"/>
+  </picture>
+</p>
+The abstraction is three rules:
+1. **Three slots are fixed.** `system`, `messages`, `tools` — the LLM API surface.
+2. **N flavors are open.** You declare what you have. Tomorrow's flavor (few-shot, reflection, persona, A2A handoff…) plugs in the same way.
+3. **Rules decide *where* and *when*.** You provide the rules. We collect your data, fire the right one, land it in the right slot at the right iteration.
+That's the whole model: `Injection = slot × trigger × cache`.
+- **Slot** — which of the 3 LLM API regions the content lands in (`system` / `messages` / `tools`).
+- **Trigger** — when the content fires (see below).
+- **Cache** — how stable the content is across iterations. The framework places provider cache markers for you — stable content gets 80–90% cheaper prefixes.
+### The 4 triggers
+| Trigger | Flavor | Fires when | Builder example | Default slot |
+|---|---|---|---|---|
+| `always` | static | Every iteration | `.steering('You are a triage agent…')` | `system` |
+| `rule` | runtime — predicate | Your rule returns true | `.rag({ when: s => /price\|refund/.test(s.userQuery) })` | `messages` |
+| `on-tool-return` | runtime — lifecycle | After a specific tool returns | `.instruction({ after: 'search_db', text: 'Cite source IDs.' })` | `messages` |
+| `llm-activated` | runtime — agent-driven | LLM calls `read_skill('id')` | `.skill({ id: 'refund-policy', activatedBy: 'read_skill' })` | `messages` (body) |
-> You write the intent. agentfootprint owns the context loop.
+> [!NOTE]
+> Slot is a default, not a coupling — the same `Skill` can live in `tools` (schema only, discovered via `read_skill`), `messages` (body injected on activation), or `system` (baked into the prompt as steering).
+**3 slots × 4 triggers × N flavors = the entire context-engineering surface.**
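The slot × trigger model described above can be sketched in a few lines of plain TypeScript. This is an illustrative sketch only: `Injection`, `LoopState`, `fires`, and `compose` are hypothetical names invented here, not the agentfootprint API, and the cache dimension is left out for brevity.

```typescript
// Hypothetical types for illustration; not the agentfootprint API.
type Slot = 'system' | 'messages' | 'tools';

interface LoopState {
  iteration: number;
  userQuery: string;
  lastTool?: string;          // name of the tool that just returned, if any
  activatedSkills: string[];  // skill ids the LLM has asked for so far
}

// The four triggers from the table above, as a discriminated union.
type Trigger =
  | { kind: 'always' }
  | { kind: 'rule'; when: (s: LoopState) => boolean }
  | { kind: 'on-tool-return'; after: string }
  | { kind: 'llm-activated'; skillId: string };

interface Injection {
  id: string;
  slot: Slot;
  trigger: Trigger;
  content: string;
}

// Decide whether one trigger fires for the current loop state.
function fires(t: Trigger, s: LoopState): boolean {
  switch (t.kind) {
    case 'always': return true;
    case 'rule': return t.when(s);
    case 'on-tool-return': return s.lastTool === t.after;
    case 'llm-activated': return s.activatedSkills.indexOf(t.skillId) !== -1;
  }
}

// Recompose all three slots from scratch for one iteration.
function compose(injections: Injection[], s: LoopState): Record<Slot, string[]> {
  const slots: Record<Slot, string[]> = { system: [], messages: [], tools: [] };
  for (const inj of injections) {
    if (fires(inj.trigger, s)) slots[inj.slot].push(inj.content);
  }
  return slots;
}

const injections: Injection[] = [
  { id: 'tone', slot: 'system', trigger: { kind: 'always' }, content: 'You are a triage agent.' },
  { id: 'pricing-rag', slot: 'messages',
    trigger: { kind: 'rule', when: s => /price|refund/.test(s.userQuery) },
    content: '<pricing docs>' },
  { id: 'cite', slot: 'messages',
    trigger: { kind: 'on-tool-return', after: 'search_db' },
    content: 'Cite source IDs.' },
  { id: 'refund-policy', slot: 'messages',
    trigger: { kind: 'llm-activated', skillId: 'refund-policy' },
    content: '<refund policy body>' },
];

// Iteration 1: only the always-on steering and the matching rule fire.
const slots = compose(injections, { iteration: 1, userQuery: 'refund please', activatedSkills: [] });
```

Because `compose` is called once per iteration with fresh state, the same declarations can yield different slot contents on each pass, which is the behavior the README attributes to the framework.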
 ---
-## The lineage
+## 2. Why we chose this abstraction
-Every load-bearing dev tool of the last decade made the same move:
+The agent space has many credible primary abstractions:
-| Framework | You write | The framework abstracts |
-|---|---|---|
-| **PyTorch (autograd)** | Forward graph | Gradient computation, backward pass |
-| **Express / Fastify** | Routes + handlers | HTTP loop, middleware chain |
-| **Prisma** | Schema + query intent | SQL generation, migrations |
-| **React** | Components + state | DOM diffing, render path |
-| **agentfootprint** | Injections (slot × trigger × cache) | Slot composition, iteration loop, caching, observation, replay |
+| Framework | What it abstracts |
+|---|---|
+| **LangChain** | Pipelines of composable components |
+| **LangGraph** | State machines of nodes and edges |
+| **CrewAI · AutoGen** | Crews of role-playing agents |
+| **Mastra · Genkit · Pydantic AI** | Typed full-stack bundles |
+| **DSPy** | Compiled prompts |
+| **Inngest AgentKit** | Durable workflows |
+We didn't have to choose between them.
+agentfootprint is built on **footprintjs** — the flowchart pattern for backend code. footprintjs gives us every one of those abstractions out of the box:
+| Capability | What footprintjs hands us |
+|---|---|
+| Composition | `Sequence` · `Parallel` · `Conditional` · `Loop` |
+| State machines | The ReAct loop *is* a flowchart |
+| Multi-agent crews | Compose Agents through control flow — no special class needed |
+| Durable workflows | `pauseHere()` plus JSON-portable `resume()` |
+| Typed observation | 60+ events for free, because the framework owns the loop |
+So we used the budget those abstractions would have cost us to invest deeply in something they all leave to the developer: **the injection loop.**
-The closest structural parallel is **autograd**: you describe the graph, the framework traverses it, and *because the framework owns the traversal it can record everything for free*. Same idea here — typed events, replayable checkpoints, and provider-agnostic prompt caching are consequences of owning the loop, not extra features.
+> [!IMPORTANT]
+> **We abstract context engineering — and hand back the trace.**
+> Live to develop · offline to monitor · detailed to improve.
+### The reason — agents have a new class of bug
+For fifty years, software bugs have been **logic errors**. A wrong condition, a missed edge case, an off-by-one. You step through the code until you find the bad branch.
+LLM-powered apps add a second class of bug: **contextual errors.** The code is correct. The model is correct. The answer is wrong because **the LLM's decision rests on context that was ambiguous, confusing, or misleading at the moment of inference.**
+Tracking *which content the model actually saw, and why,* is the entire debugging job. Without it, the failure mode is invisible:
+| What got injected wrong | What the model did |
+|---|---|
+| Wrong instruction landed in the `system` slot | Followed the wrong rule |
+| Predicate fired one iteration too early | Reasoned with stale assumptions |
+| Skill body missing when the LLM called `read_skill` | Invented its own |
+| Cache prefix invalidated mid-iteration | Saw a silently rewritten stale version |
+| Tool returned but the `on-tool-return` injection didn't fire | Couldn't interpret the result |
+> [!IMPORTANT]
+> **The model doesn't tell you which of these went wrong. It just gives you the wrong answer.**
+You can't step through that with a debugger. By the time you read the response, the context that produced it is gone unless something recorded it.
+That's the gap agentfootprint fills. A framework that owns the control flow can debug logic errors. A framework that owns the *injection* can debug contextual errors — because every injection is a typed event with a where, when, why, and how-it-cached.
+### What that buys you
+Because we own the injection, every LLM call backtracks to four typed answers:
+- **What** was injected
+- **Who** triggered it (which rule)
+- **When** it fired
+- **How** it landed — slot, position, cache
+Same trace, three workflows:
+- **Live — debug as you build.** See exactly which injection produced which token, which predicate fired this iteration, which prefix actually got cached.
+- **Offline — monitor what shipped.** Replay any past run from its trace. Alert on drift. Attribute cost per injection.
+- **Detailed — improve via export.** Every successful trajectory is labeled training data for SFT, DPO, or RL — no separate data-collection phase.
+And a fourth, novel: **the agent can read its own trace.** Six months after the agent rejected loan #42, *"why did you reject it?"* answers from the recorded evidence (`creditScore=580`, `threshold=600`), not a rerun. Causal memory turns the trace into the agent's working memory.
 ---
-## The core idea
+## 3. How do I design my agent or system of agents?
+Two scales — same alphabet. Four control flows are the entire vocabulary.
-Every LLM call has three slots:
+<table>
+<tr>
+<td width="50%" align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/sequence-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/sequence-light.svg">
+    <img alt="Sequence — linear chain A → B → C." src="docs/assets/sequence-light.svg" width="100%"/>
+  </picture>
+</td>
+<td width="50%">
-```text
-system     messages     tools
+```typescript
+import { Sequence } from 'agentfootprint';
+const flow = Sequence.create()
+  .step('a', stageA)
+  .step('b', stageB)
+  .step('c', stageC)
+  .build();
 ```
-Every agent feature — steering, instructions, skills, facts, memory, RAG, tool schemas — is content flowing into one of those slots. agentfootprint models all of them as one primitive:
+</td>
+</tr>
+<tr>
+<td width="50%" align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/parallel-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/parallel-light.svg">
+    <img alt="Parallel — fan-out then fan-in across N agents." src="docs/assets/parallel-light.svg" width="100%"/>
+  </picture>
+</td>
+<td width="50%">
+```typescript
+import { Parallel } from 'agentfootprint';
-```text
-Injection = slot × trigger × cache
+const fan = Parallel.create()
+  .branch('web', searchWeb)
+  .branch('docs', searchDocs)
+  .mergeWithFn(synthesizer)
+  .build();
 ```
-An Injection answers three questions:
-1. **Where does this content land?** `system`, `messages`, or `tools`
-2. **When does it activate?** `always` · `rule` · `on-tool-return` · `llm-activated`
-3. **How is it cached?** `always` · `never` · `while-active` · predicate
-That is the whole abstraction. Every named pattern in the agent literature — Reflexion, Tree-of-Thoughts, Skills, RAG, Constitutional AI — reduces to *which slot* + *which trigger*. You learn one model; the field's growth lands as new factories on the same primitive.
-```text
-                         LLM call
-        ┌────────────────────────────────────┐
-        │   system      messages      tools  │
-        │      ▲            ▲            ▲   │
-        └──────┼────────────┼────────────┼───┘
-               │            │            │
-          Injection     Injection     Injection
-               ▲
-               │
-      always · rule · on-tool-return · llm-activated
+</td>
+</tr>
+<tr>
+<td width="50%" align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/conditional-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/conditional-light.svg">
+    <img alt="Conditional — diamond gate routes to one of N branches based on a predicate." src="docs/assets/conditional-light.svg" width="100%"/>
+  </picture>
+</td>
+<td width="50%">
+```typescript
+import { Conditional } from 'agentfootprint';
+const router = Conditional.create()
+  .when('billing', s => s.intent === 'billing', billingAgent)
+  .when('tech',    s => s.intent === 'tech',    techAgent)
+  .otherwise('default', defaultAgent)
+  .build();
 ```
----
+</td>
+</tr>
+<tr>
+<td width="50%" align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/loop-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/loop-light.svg">
+    <img alt="Loop — body cycles back from end to start until a condition is met." src="docs/assets/loop-light.svg" width="100%"/>
+  </picture>
+</td>
+<td width="50%">
+```typescript
+import { Loop } from 'agentfootprint';
+const reflexion = Loop.create()
+  .repeat(thinkAgent)
+  .until(s => s.satisfied)
+  .build();
+```
-## Why this isn't just an ergonomics win — Dynamic ReAct
+</td>
+</tr>
+</table>
-Because the framework owns the loop, **all three slots recompose every iteration based on what just happened.**
+### Inside one agent — Dynamic vs Classic ReAct
-- **LangChain** assembles prompts once per turn.
-- **LangGraph** composes state per node, not per loop iteration.
-- **agentfootprint** recomposes per iteration.
+<p align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/dynamic-vs-classic-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/dynamic-vs-classic-light.svg">
+    <img alt="Classic ReAct vs Dynamic ReAct loop topology — same 5 stages (SystemPrompt, Messages, Tools, CallLLM, Route → ExecuteTools/Finalize), but the loop edge differs: Classic returns to CallLLM only (slots frozen at 12 tools every iteration), Dynamic returns to SystemPrompt (slots recompose, tools grow from 1 to 5 as skills activate)." src="docs/assets/dynamic-vs-classic-light.svg" width="100%"/>
+  </picture>
+</p>
+**Same five stages on both sides. Only one thing differs — where the loop returns.** Classic ReAct loops back to `CallLLM` and slots stay frozen. Dynamic ReAct (agentfootprint) loops back to `SystemPrompt`, so injections that fired on the previous tool result recompose the next prompt. Per-iteration recomposition is also the structural prerequisite for the cache layer.
+| Iteration | Classic ReAct | Dynamic ReAct (agentfootprint) |
+|---|---|---|
+| 1 | 12 tools shown | **1 tool** (`read_skill`) |
+| 2 | 12 tools shown | **5 tools** (skill activated) |
+| 3 | 12 tools shown | 5 tools |
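The iteration table above can be reproduced with a small, self-contained sketch. It assumes that the five tools at iteration 2 are `read_skill` plus four tools owned by the activated skill; the README does not spell out that split. `SkillDef`, `toolsForIteration`, and the tool names are hypothetical, not the agentfootprint API.

```typescript
// Hypothetical shape for illustration; not the agentfootprint API.
interface SkillDef {
  id: string;
  tools: string[]; // tools exposed only after the skill is activated
}

const refundSkill: SkillDef = {
  id: 'refund-policy',
  tools: ['lookup_order', 'check_eligibility', 'issue_refund', 'notify_customer'],
};

// Dynamic ReAct: the tools slot is rebuilt each iteration from the
// skills the LLM has activated so far.
function toolsForIteration(skills: SkillDef[], activated: string[]): string[] {
  const tools = ['read_skill']; // the discovery tool is always visible
  for (const skill of skills) {
    if (activated.indexOf(skill.id) !== -1) tools.push(...skill.tools);
  }
  return tools;
}

const iter1 = toolsForIteration([refundSkill], []);                // before activation
const iter2 = toolsForIteration([refundSkill], ['refund-policy']); // after read_skill('refund-policy')
```

In a Classic ReAct loop the equivalent function would ignore `activated` and return the full catalog every time, which is why the left column stays at 12.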
+> 📖 [Dynamic ReAct guide](https://footprintjs.github.io/agentfootprint/guides/dynamic-react/) · [Key concepts](https://footprintjs.github.io/agentfootprint/getting-started/key-concepts/)
+### Multi-agent — compose with the alphabet
+<p align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/compose-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/compose-light.svg">
+    <img alt="A custom research agent built from the same 4 control flows: input flows into a Conditional gate (plan more research?), which fans out to a Parallel block (search_web, search_docs, search_kb), then chains into a Sequence (synthesize → critique), and a Loop arrow returns from the end back to the Conditional gate so the agent iterates until satisfied. Formula: Loop( Conditional(plan?) → Parallel(search_web, search_docs, search_kb) → Sequence(synth → critique) )." src="docs/assets/compose-light.svg" width="100%"/>
+  </picture>
+</p>
-Per-iteration recomposition is what makes context engineering compositional instead of static. It's also the structural prerequisite for the cache layer — cache markers can't track active injections in lockstep without it.
+Pick the flows that match your problem. Chain them. **That's your Agentic Application.**
-```text
-Classic ReAct                    Dynamic ReAct
-───────────────                  ─────────────
-iter 1: 12 tools shown           iter 1: 1 tool  (read_skill)
-iter 2: 12 tools shown           iter 2: 5 tools (skill activated)
-iter 3: 12 tools shown           iter 3: 5 tools
+```typescript
+const research = Loop.create()
+  .repeat(Sequence.create().step('plan', plan).step('search', searchAll).build())
+  .until(s => s.satisfied).build();
 ```
-Use Dynamic ReAct when your tools have dependencies (one tool's output implies which tool to call next). Use Classic ReAct when all tools are independent and ordering doesn't matter.
+Same `.create().method().build()` shape as the four rows above — just composed.
+### Named patterns — also compositions of the same 4
-> 📖 Deep dive: [Dynamic ReAct guide](https://footprintjs.github.io/agentfootprint/guides/dynamic-react/) · [Cache layer](https://footprintjs.github.io/agentfootprint/guides/caching/)
+<p align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/patterns-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/patterns-light.svg">
+    <img alt="6 named multi-agent patterns reduce to compositions of the same 4 control flows: Swarm = Loop(Parallel(Agent×N) → merge); Tree-of-Thoughts = Loop(Parallel(Agent×N) → Conditional(score)); Reflexion = Loop(Agent → Conditional(critique) → Agent); Debate = Parallel(Agent_pro, Agent_con) → Agent_judge; Router = Conditional → Agent_A | Agent_B | Agent_C; Hierarchical = Agent_planner → Sequence(Agent_worker×N) → synth." src="docs/assets/patterns-light.svg" width="100%"/>
+  </picture>
+</p>
+The patterns the field knows reduce to the same alphabet:
+| Pattern | Composition |
+|---|---|
+| **Swarm** | `Loop( Parallel( Agent×N ) → merge )` |
+| **Tree-of-Thoughts** | `Loop( Parallel( Agent×N ) → Conditional(score) )` |
+| **Reflexion** | `Loop( Agent → Conditional(critique) → Agent )` |
+| **Debate** | `Parallel( Agent_pro, Agent_con ) → Agent_judge` |
+| **Router** | `Conditional → Agent_A \| Agent_B \| Agent_C` |
+| **Hierarchical** | `Agent_planner → Sequence( Agent_worker×N ) → synth` |
+Same trick as section 1: instead of N libraries for N patterns, we found the four building blocks all N patterns are made of.
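To make the reduction concrete, here is a minimal sketch of the four control flows as plain synchronous functions, with the Debate row composed from them. The combinators `sequence`, `parallel`, `conditional`, and `loop` are hypothetical stand-ins written for this example; they are not the `Sequence` / `Parallel` / `Conditional` / `Loop` builders the package exports, and the real agents would be asynchronous.

```typescript
// A stage maps state to new state. Hypothetical combinators, not the library API.
type Stage<S> = (state: S) => S;

const sequence = <S>(...stages: Stage<S>[]): Stage<S> =>
  s => stages.reduce((acc, stage) => stage(acc), s);

const parallel = <S>(branches: Stage<S>[], merge: (results: S[]) => S): Stage<S> =>
  s => merge(branches.map(branch => branch(s)));

const conditional = <S>(cases: Array<[(s: S) => boolean, Stage<S>]>, otherwise: Stage<S>): Stage<S> =>
  s => {
    for (const [when, stage] of cases) {
      if (when(s)) return stage(s);
    }
    return otherwise(s);
  };

const loop = <S>(body: Stage<S>, until: (s: S) => boolean, maxIterations = 10): Stage<S> =>
  s => {
    let state = s;
    for (let i = 0; i < maxIterations; i++) {
      state = body(state);
      if (until(state)) break; // stop as soon as the exit condition holds
    }
    return state;
  };

// Debate = Parallel(Agent_pro, Agent_con) -> Agent_judge
interface Debate { topic: string; notes: string[] }

const pro: Stage<Debate> = s => ({ topic: s.topic, notes: s.notes.concat('pro: ' + s.topic) });
const con: Stage<Debate> = s => ({ topic: s.topic, notes: s.notes.concat('con: ' + s.topic) });
const judge: Stage<Debate> = s => ({
  topic: s.topic,
  notes: s.notes.concat('verdict after ' + s.notes.length + ' arguments'),
});

const debate = sequence(
  parallel([pro, con], results => ({
    topic: results[0].topic,
    notes: results[0].notes.concat(results[1].notes),
  })),
  judge,
);

const result = debate({ topic: 'ship it', notes: [] });
```

The other rows of the table follow the same shape: wrap `parallel` or `conditional` in `loop` and supply an exit predicate.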
+> 📖 Compare: [hand-rolled vs declarative](https://footprintjs.github.io/agentfootprint/getting-started/why/) · [migration from LangChain / CrewAI / LangGraph](https://footprintjs.github.io/agentfootprint/getting-started/vs/)
+---
+## 4. How do I see what my agent did?
+Because we own the loop (Beat 2), every decision and execution is captured during traversal — not bolted on. The default capture is the **causal trace**: every stage, read, write, and decision evidence, as a JSON-portable, scrubbable, queryable, exportable artifact. Beyond the default, wire custom recorders for cost, latency, or quality scoring — any observation hook fires on the same stream.
+<p align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="docs/assets/causal-memory-dark.svg">
+    <source media="(prefers-color-scheme: light)" srcset="docs/assets/causal-memory-light.svg">
+    <img alt="agentfootprint causal memory — Each agent run produces a JSON-portable causal trace: a scrubbable timeline of every stage with reads, writes, and captured decision evidence. The trace card shows a time-travel slider (Step 5 of 17, Live), an execution timeline with stage-duration bars, and the captured decision evidence pill (riskTier eq high → reject). Two built-in lenses view it: Lens (agent-centric) and Explainable Trace (structural). Three programmatic consumers fan out from it: audit replay (GDPR Article 22 adverse-action notice answered from chain, no LLM call, $15/1M to $0.25/1M tokens), cheap-model triage (Sonnet trace fed to Haiku for follow-ups), and training data export (every chain is a labeled trajectory ready for SFT/DPO/process-RL). One recording, two lenses, three consumers, zero extra instrumentation. Powered by footprintjs causalChain()." src="docs/assets/causal-memory-light.svg" width="100%"/>
+  </picture>
+</p>
+The same trace serves three downstream consumers — no extra instrumentation:
+1. **Audit / compliance.** Six months later, *"why was loan #42 rejected?"* answers from the chain (`creditScore=580 < 620 ∧ dti=0.6 > 0.43 → riskTier=high → REJECTED`). No LLM call. GDPR Art. 22, ECOA, and EU AI Act adverse-action notices write themselves from the captured decision evidence.
+2. **Cheap-model triage.** A Sonnet trace becomes good *input* for Haiku to answer follow-ups. ~200 tokens at any model ($0.25/1M) vs ~2,500 tokens at a reasoning model ($15/1M). Memoization for agent thinking — no agent rerun.
+3. **Training data — the substrate is already there.** Every successful chain is a labeled trajectory. SFT pairs (`{prompt, completion}`) fall out of the snapshot's history field; the export wrapper is roadmap work tracked in [GitHub issues](https://github.com/footprintjs/agentfootprint/issues). DPO and process-RL need additional collection layers (preference feedback, per-step reward annotation) that don't ship today.
+Two built-in lenses view the same trace:
+| Lens | View | When to use |
+|---|---|---|
+| **Lens** | Agent-centric — User/Agent[3 slots]/Tool flowchart with iteration scrubber and round commentary | Live debugging, "what did Neo see at step 5?" |
+| **Explainable Trace** | Structural — subflow tree, full flowchart, memory inspector, per-stage execution timeline | Architecture review, root-cause analysis |
+> 📖 Powered by [footprintjs `causalChain()`](https://footprintjs.github.io/footPrint/blog/backward-causal-chain/) — backward thin-slicing on the commit log. [Causal memory deep dive](https://footprintjs.github.io/agentfootprint/causal-deep-dive/) · [Explainability & compliance](https://footprintjs.github.io/footPrint/blog/explainability-compliance/)
+**One recording. Two lenses. Three consumers. Zero extra instrumentation.**
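The audit consumer described above (answering "why was loan #42 rejected?" from recorded evidence rather than a rerun) can be sketched as follows. The record shape and the `assessLoan` / `explain` helpers are hypothetical and only illustrate the idea of JSON-portable decision evidence; the actual trace format and `causalChain()` API are not shown in this README excerpt. The thresholds come from the loan example in the audit bullet.

```typescript
// Hypothetical record shape for illustration; not the footprintjs trace format.
interface Evidence {
  field: string;
  value: number;
  op: '<' | '>';
  threshold: number;
}

interface DecisionRecord {
  stage: string;
  outcome: string;
  evidence: Evidence[]; // the conditions that held when the decision was made
}

// Make the decision and capture the evidence at the moment of the decision.
function assessLoan(app: { creditScore: number; dti: number }): DecisionRecord {
  const checks: Evidence[] = [
    { field: 'creditScore', value: app.creditScore, op: '<', threshold: 620 },
    { field: 'dti', value: app.dti, op: '>', threshold: 0.43 },
  ];
  const held = checks.filter(c => (c.op === '<' ? c.value < c.threshold : c.value > c.threshold));
  const outcome = held.length === checks.length ? 'REJECTED' : 'APPROVED';
  return { stage: 'assessRisk', outcome, evidence: held };
}

// Answer "why?" from the stored record alone: no model call, no rerun.
function explain(record: DecisionRecord): string {
  const reasons = record.evidence.map(e => `${e.field}=${e.value} ${e.op} ${e.threshold}`);
  return `${reasons.join(' AND ')} -> ${record.outcome}`;
}

// Persist as JSON, then reload later and explain from the stored copy.
const stored = JSON.stringify([assessLoan({ creditScore: 580, dti: 0.6 })]);
const replayed: DecisionRecord[] = JSON.parse(stored);
const answer = explain(replayed[0]);
```

The point of the round trip through `JSON.stringify` is that the explanation depends only on the serialized record, so it can be produced months later on a different machine.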
 ---
@@ -146,69 +371,6 @@ Swap `mock(...)` for `anthropic(...)` / `openai(...)` / `bedrock(...)` / `ollama
 ---
-## A real agent in 8 lines
-```typescript
-const agent = Agent.create({ provider, model: 'claude-sonnet-4-5-20250929' })
-  .system('You are a support assistant.')
-  .steering(toneRule)            // always-on
-  .instruction(urgentRule)       // rule-gated
-  .skill(billingSkill)           // LLM-activated
-  .memory(conversationMemory)    // cross-run, multi-tenant
-  .tool(weather)
-  .build();
-await agent.run({ message: userInput, identity: { conversationId } });
-```
-The hand-rolled equivalent is ~80 lines of slot management, trigger evaluation, memory loading, and cache marker placement — and growing with every feature. The declarative version stays at 8.
-> 📖 Compare: [hand-rolled vs declarative](https://footprintjs.github.io/agentfootprint/getting-started/why/) · [migration from LangChain / CrewAI / LangGraph](https://footprintjs.github.io/agentfootprint/getting-started/vs/)
----
-## The differentiator: the trace is a cache of the agent's thinking
-Other agent frameworks remember *what was said*. agentfootprint's causal memory records the **decision evidence** — every value the flowchart captured during the run, persisted as a JSON-portable snapshot.
-That changes the cost structure of everything that happens after the agent runs:
-1. **Audit / explain** — six months later, "why was loan #42 rejected?" answers from the original evidence (creditScore=580, threshold=600), not reconstruction.
-2. **Cheap-model triage** — a trace from Sonnet is good *input* for Haiku to answer follow-up questions about that run. Memoization for agent reasoning.
-3. **Training data** — every successful production run is a labeled trajectory for SFT/DPO/process-RL, no separate data-collection phase.
-One recording, three downstream consumers, no extra instrumentation.
-> 📖 Deep dive: [Causal memory guide](https://footprintjs.github.io/agentfootprint/guides/causal-memory/)
----
-## What you can build
-```typescript
-// Customer support — skills + memory + audit + cache
-const agent = Agent.create({ provider, model })
-  .system('You are a friendly support assistant.')
-  .skill(billingSkill)
-  .steering(toneGuidelines)
-  .memory(conversationMemory)
-  .build();
-// Research pipeline — multi-agent fan-out + merge
-const research = Parallel.create()
-  .branch(optimist).branch(skeptic).branch(historian)
-  .merge(synthesizer)
-  .build();
-// Streaming chat — token-by-token to a browser via SSE
-agent.on('agentfootprint.stream.token', (e) => res.write(toSSE(e)));
-await agent.run({ message: req.query.message });
-```
-> 📖 Full examples: [examples gallery](https://github.com/footprintjs/agentfootprint/tree/main/examples) · every example is also a CI test.
----
 ## Mocks first, production second
 Build the entire app against in-memory mocks with **zero API cost**, then swap real infrastructure one boundary at a time.
@@ -216,7 +378,7 @@ Build the entire app against in-memory mocks with **zero API cost**, then swap r
 | Boundary | Dev | Prod |
 |---|---|---|
 | LLM provider | `mock(...)` | `anthropic()` · `openai()` · `bedrock()` · `ollama()` |
-| Memory store | `InMemoryStore` | `RedisStore` · `AgentCoreStore` · DynamoDB / Postgres / Pinecone |
+| Memory store | `InMemoryStore` | `RedisStore` · `AgentCoreStore` |
 | MCP | `mockMcpClient(...)` | `mcpClient({ transport })` |
 | Cache strategy | `NoOpCacheStrategy` | auto-selected per provider |
@@ -226,47 +388,41 @@ The flowchart, recorders, and tests don't change between dev and prod.
 ## What ships today
-- **2 primitives** — `LLMCall`, `Agent` (the ReAct loop)
-- **4 compositions** — `Sequence`, `Parallel`, `Conditional`, `Loop`
-- **7 LLM providers** — Anthropic · OpenAI · Bedrock · Ollama · Browser-Anthropic · Browser-OpenAI · Mock
-- **One Injection primitive** — `defineSkill` / `defineSteering` / `defineInstruction` / `defineFact`
-- **One Memory factory** — 4 types × 7 strategies including **Causal**
-- **Provider-agnostic prompt caching** — declarative per-injection, per-iteration marker recomputation
-- **RAG · MCP · Memory store adapters** — InMemory · Redis · AgentCore
-- **48+ typed observability events** across context · stream · agent · cost · skill · permission · eval · memory · cache · embedding · error
-- **Pause / resume** — JSON-serializable checkpoints; resume hours later on a different server
-- **Resilience** — `withRetry`, `withFallback`, `resilientProvider`
-- **AI-coding-tool support** — Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot
-> 📖 [Full feature list & API reference](https://footprintjs.github.io/agentfootprint/reference/) · [CHANGELOG](./CHANGELOG.md)
+**Core**
+- 2 primitives — `LLMCall`, `Agent` (the ReAct loop)
+- 4 control flows — `Sequence`, `Parallel`, `Conditional`, `Loop`
+- 1 Injection primitive — `defineSkill` / `defineSteering` / `defineInstruction` / `defineFact`
+- 1 reliability gate — `.reliability({ preCheck, postDecide, providers, circuitBreaker, fallback })`
+- 1 tool dispatch primitive — `ToolProvider` (sync OR async) — `staticTools` · `gatedTools` · `skillScopedTools` · custom `discoveryProvider` over hubs / MCP / per-tenant catalogs
----
-## Roadmap
+**LLM providers** (7)
-| Theme | Focus |
+| Factory | Use for |
 |---|---|
-| Reliability | Circuit breaker, output fallback, auto-resume-on-error |
-| Causal exports | `causalMemory.exportForTraining({ format: 'sft' \| 'dpo' \| 'process' })` |
-| Governance | Policies, budget tracking, production memory adapters |
-| Cache v2 | Gemini handle-based caching, cost attribution |
-| Deep agents | Planning-before-execution, A2A protocol, Lens UI |
-Roadmap items are *not* current API claims. If a feature isn't in `npm install agentfootprint` today, it's listed here, not in the docs.
----
-## Design philosophy
-Two principles shape the runtime:
-**Connected data (Palantir, 2003).** Enterprise insight is bottlenecked by data fragmentation, not analyst skill. Agents face the same problem at runtime — disconnected tool state, lost decision evidence, scattered execution context. agentfootprint connects state, decisions, execution, and memory into one runtime footprint so the next iteration compounds the connection instead of paying for it again.
-**Modular boundaries (Liskov, 1974).** Every framework boundary — `LLMProvider`, `ToolProvider`, `CacheStrategy`, `Recorder`, `MemoryStore` — is an LSP-substitutable interface. Swap implementations without changing agent code.
-Connected data alone is fast but unmaintainable. Modular boundaries alone are clean but dumb. Together: a runtime that's both fast and reasonable.
-> 📖 Long-form: [the Palantir lineage](https://footprintjs.github.io/agentfootprint/inspiration/connected-data/) · [the Liskov lineage](https://footprintjs.github.io/agentfootprint/inspiration/modularity/)
+| `anthropic` | Claude (Sonnet, Opus, Haiku) via `@anthropic-ai/sdk` |
+| `openai` | GPT-4o, GPT-4-turbo via `openai` SDK |
+| `bedrock` | Claude / Titan / Mistral via AWS Bedrock runtime |
+| `ollama` | Local models (OpenAI-compatible endpoint) |
+| `browserAnthropic` | Browser-side Claude calls (no proxy server) |
+| `browserOpenai` | Browser-side OpenAI calls (no proxy server) |
+| `mock` | Deterministic dev/test (zero API cost) |
+**Memory + adapters**
+- Memory factory — 4 types (`episodic` / `semantic` / `narrative` / `causal`) × 7 strategies (`window` / `budget` / `summarize` / `topK` / `extract` / `decay` / `hybrid`)
+- Memory stores — `InMemoryStore`, `RedisStore` (peer-dep `ioredis`), `AgentCoreStore` (peer-dep AWS SDK)
+- RAG · MCP adapters — `mockMcpClient(...)` / `mcpClient({ transport })`
+**Operability**
+- Provider-agnostic prompt caching — declarative per-injection, per-iteration marker recomputation
+- Pause / resume — JSON-serializable checkpoints; resume hours later on a different server
+- Resilience primitives — `withRetry`, `withFallback`, `withCircuitBreaker`, `.outputFallback`, `agent.resumeOnError`
+- 60+ typed observability events — `agent` · `composition` · `context` · `stream` · `tools` · `skill` · `memory` · `cache` · `cost` · `permission` · `eval` · `embedding` · `pause` · `error` · `fallback` · `resilience` · `reliability` · `risk`
+**Tooling**
+- **Lens** · **Explainable Trace** — two visual replays of the causal trace (separate `agentfootprint-lens` package)
+- AI-coding-tool support — Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot
+> 📖 [Agent API reference](https://footprintjs.github.io/agentfootprint/api/agent/) · [CHANGELOG](./CHANGELOG.md)
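The resilience primitives listed under Operability (`withRetry`, `withFallback`, `withCircuitBreaker`) are named in this README, but their signatures are not shown in this diff. The sketch below only illustrates the general retry, circuit-breaker, and fallback pattern that such wrappers usually implement. Every name in it (`retrySketch`, `circuitBreakerSketch`, `fallbackSketch`, `Call`) is hypothetical and is not agentfootprint's API. The calls are kept synchronous for brevity, whereas real provider calls would be asynchronous.

```typescript
// Hypothetical sketch: these helpers are NOT agentfootprint exports.
// They illustrate the retry / circuit-breaker / fallback pattern only.
type Call<T> = () => T;

// Retry a call up to `attempts` times, rethrowing the last error seen.
function retrySketch<T>(call: Call<T>, attempts: number): Call<T> {
  return () => {
    let lastError: unknown = new Error("attempts must be >= 1");
    for (let i = 0; i < attempts; i++) {
      try {
        return call();
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  };
}

// Open the circuit after `threshold` consecutive failures. While open, fail
// fast without invoking the wrapped call. (No half-open/reset timer here;
// a production breaker would normally add one.)
function circuitBreakerSketch<T>(call: Call<T>, threshold: number): Call<T> {
  let consecutiveFailures = 0;
  return () => {
    if (consecutiveFailures >= threshold) {
      throw new Error("circuit open");
    }
    try {
      const result = call();
      consecutiveFailures = 0;
      return result;
    } catch (err) {
      consecutiveFailures++;
      throw err;
    }
  };
}

// Try the primary call; on any error, use the secondary instead.
function fallbackSketch<T>(primary: Call<T>, secondary: Call<T>): Call<T> {
  return () => {
    try {
      return primary();
    } catch {
      return secondary();
    }
  };
}

// Usage: a provider stand-in that always fails, and a backup that always works.
let flakyCalls = 0;
const flaky: Call<string> = () => {
  flakyCalls++;
  throw new Error("provider down");
};
const backup: Call<string> = () => "backup answer";

const resilient = fallbackSketch(
  circuitBreakerSketch(retrySketch(flaky, 2), 1),
  backup,
);

const first = resilient();  // flaky is tried twice, then the backup answers
const second = resilient(); // circuit is open, so flaky is skipped entirely
```

The wrapping order matters in this sketch: retries sit innermost so that one exhausted retry budget counts as a single failure for the breaker, and the fallback sits outermost so it also covers the fail-fast "circuit open" error.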
 ---
@@ -277,8 +433,8 @@ Connected data alone is fast but unmaintainable. Modular boundaries alone are cl
 | New to agents | [5-minute quick start](https://footprintjs.github.io/agentfootprint/getting-started/quick-start/) |
 | Coming from LangChain / CrewAI / LangGraph | [Migration guide](https://footprintjs.github.io/agentfootprint/getting-started/vs/) |
 | Architecting an enterprise rollout | [Production guide](https://footprintjs.github.io/agentfootprint/guides/deployment/) |
-| Doing due diligence | [Architecture overview](https://footprintjs.github.io/agentfootprint/architecture/) |
-| Researcher / extending | [Extension guide](https://footprintjs.github.io/agentfootprint/contributing/extension-guide/) |
+| Doing due diligence | [Architecture overview](https://footprintjs.github.io/agentfootprint/architecture/dependency-graph/) |
+| Researcher / academic background | [Citations & prior art](https://footprintjs.github.io/agentfootprint/research/citations/) |
 | Curious about design | [Inspiration docs](https://footprintjs.github.io/agentfootprint/inspiration/) |
 Or jump into the [examples gallery](https://github.com/footprintjs/agentfootprint/tree/main/examples) — every example is also an end-to-end CI test.
@@ -287,7 +443,7 @@ Or jump into the [examples gallery](https://github.com/footprintjs/agentfootprin
 ## Built on
-[footprintjs](https://github.com/footprintjs/footPrint) — the flowchart pattern for backend code. The decision-evidence capture, narrative recording, and time-travel checkpointing this library uses are footprintjs primitives. The same way autograd's forward-pass traversal is what makes gradient inspection automatic, footprintjs's flowchart traversal is what makes agentfootprint's typed-event stream and replayable traces automatic.
+[footprintjs](https://github.com/footprintjs/footPrint) — the flowchart pattern for backend code. agentfootprint's decision-evidence capture, narrative recording, and time-travel checkpointing are footprintjs primitives at the runtime layer.
 You don't need to learn footprintjs to use agentfootprint — but if you want to build your own primitives at this depth, [start there](https://footprintjs.github.io/footPrint/).