agentfootprint 2.11.4 → 2.11.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (65)
  1. package/README.md +321 -165
  2. package/dist/core/Agent.js +55 -1
  3. package/dist/core/Agent.js.map +1 -1
  4. package/dist/core/agent/AgentBuilder.js +67 -1
  5. package/dist/core/agent/AgentBuilder.js.map +1 -1
  6. package/dist/core/agent/stages/callLLM.js +45 -17
  7. package/dist/core/agent/stages/callLLM.js.map +1 -1
  8. package/dist/core/agent/stages/reliabilityExecution.js +291 -0
  9. package/dist/core/agent/stages/reliabilityExecution.js.map +1 -0
  10. package/dist/core/agent/stages/toolCalls.js +9 -17
  11. package/dist/core/agent/stages/toolCalls.js.map +1 -1
  12. package/dist/core/slots/buildToolsSlot.js +101 -33
  13. package/dist/core/slots/buildToolsSlot.js.map +1 -1
  14. package/dist/esm/core/Agent.js +55 -1
  15. package/dist/esm/core/Agent.js.map +1 -1
  16. package/dist/esm/core/agent/AgentBuilder.js +67 -1
  17. package/dist/esm/core/agent/AgentBuilder.js.map +1 -1
  18. package/dist/esm/core/agent/stages/callLLM.js +45 -17
  19. package/dist/esm/core/agent/stages/callLLM.js.map +1 -1
  20. package/dist/esm/core/agent/stages/reliabilityExecution.js +287 -0
  21. package/dist/esm/core/agent/stages/reliabilityExecution.js.map +1 -0
  22. package/dist/esm/core/agent/stages/toolCalls.js +9 -17
  23. package/dist/esm/core/agent/stages/toolCalls.js.map +1 -1
  24. package/dist/esm/core/slots/buildToolsSlot.js +101 -33
  25. package/dist/esm/core/slots/buildToolsSlot.js.map +1 -1
  26. package/dist/esm/events/registry.js +6 -0
  27. package/dist/esm/events/registry.js.map +1 -1
  28. package/dist/esm/index.js +1 -0
  29. package/dist/esm/index.js.map +1 -1
  30. package/dist/esm/recorders/core/ToolsRecorder.js +23 -0
  31. package/dist/esm/recorders/core/ToolsRecorder.js.map +1 -0
  32. package/dist/esm/tool-providers/gatedTools.js +8 -3
  33. package/dist/esm/tool-providers/gatedTools.js.map +1 -1
  34. package/dist/events/registry.js +6 -0
  35. package/dist/events/registry.js.map +1 -1
  36. package/dist/index.js +5 -3
  37. package/dist/index.js.map +1 -1
  38. package/dist/recorders/core/ToolsRecorder.js +27 -0
  39. package/dist/recorders/core/ToolsRecorder.js.map +1 -0
  40. package/dist/tool-providers/gatedTools.js +8 -3
  41. package/dist/tool-providers/gatedTools.js.map +1 -1
  42. package/dist/types/core/Agent.d.ts +9 -1
  43. package/dist/types/core/Agent.d.ts.map +1 -1
  44. package/dist/types/core/agent/AgentBuilder.d.ts +61 -0
  45. package/dist/types/core/agent/AgentBuilder.d.ts.map +1 -1
  46. package/dist/types/core/agent/stages/callLLM.d.ts +8 -0
  47. package/dist/types/core/agent/stages/callLLM.d.ts.map +1 -1
  48. package/dist/types/core/agent/stages/reliabilityExecution.d.ts +66 -0
  49. package/dist/types/core/agent/stages/reliabilityExecution.d.ts.map +1 -0
  50. package/dist/types/core/agent/stages/toolCalls.d.ts +8 -0
  51. package/dist/types/core/agent/stages/toolCalls.d.ts.map +1 -1
  52. package/dist/types/core/slots/buildToolsSlot.d.ts +24 -4
  53. package/dist/types/core/slots/buildToolsSlot.d.ts.map +1 -1
  54. package/dist/types/events/payloads.d.ts +39 -0
  55. package/dist/types/events/payloads.d.ts.map +1 -1
  56. package/dist/types/events/registry.d.ts +7 -1
  57. package/dist/types/events/registry.d.ts.map +1 -1
  58. package/dist/types/index.d.ts +1 -0
  59. package/dist/types/index.d.ts.map +1 -1
  60. package/dist/types/recorders/core/ToolsRecorder.d.ts +19 -0
  61. package/dist/types/recorders/core/ToolsRecorder.d.ts.map +1 -0
  62. package/dist/types/tool-providers/gatedTools.d.ts.map +1 -1
  63. package/dist/types/tool-providers/types.d.ts +43 -7
  64. package/dist/types/tool-providers/types.d.ts.map +1 -1
  65. package/package.json +6 -1
package/README.md CHANGED
@@ -1,12 +1,17 @@
1
1
 
2
2
  <p align="center">
3
- <img width="220" alt="agentfootprint logo" src="https://github.com/user-attachments/assets/d548e2f4-cd49-4b9b-bdc2-2e6cbc2817ab" />
3
+ <picture>
4
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/hero-dark.svg">
5
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/hero-light.svg">
6
+ <img alt="agentfootprint mascot composing context flavors (Skills, Steering, Guardrails, RAG, Tool APIs, Memory) into three structured LLM slots (system, messages, tools) — the central abstraction, visualized." src="docs/assets/hero-light.svg" width="100%"/>
7
+ </picture>
4
8
  </p>
5
9
 
6
- <h1 align="center">agentfootprint</h1>
10
+ <h1 align="center">Agentfootprint</h1>
7
11
 
8
12
  <p align="center">
9
- <strong>Context engineering, abstracted.</strong>
13
+ <strong>We abstract context engineering — and hand back the trace.</strong><br/>
14
+ <strong>Live</strong> to develop · <strong>offline</strong> to monitor · <strong>detailed</strong> to improve.
10
15
  </p>
11
16
 
12
17
  <p align="center">
@@ -19,94 +24,314 @@
19
24
 
20
25
  ---
21
26
 
22
- ## What is agentfootprint?
27
+ ## 1. What we abstract
23
28
 
24
- **A framework for building AI agents by treating context as a first-class runtime system.**
29
+ When you build an Agentic Application, you collect domain-specific data and instructions, then wire them up based on what your system receives.
25
30
 
26
- Most agent code becomes context plumbing: which instructions go in `system`, which messages get added after a tool returns, which tools should be exposed right now, which memory to load for this tenant, which parts of the prompt are stable enough to cache.
31
+ That data and those instructions wear many names (**Skills · Steering · Guardrails · RAG · Tool APIs · Memory**, with more on the way). But they all do the same thing: they **inject into one of three slots** in the LLM call (`system`, `messages`, `tools`).
27
32
 
28
- Without a framework, every agent hand-rolls this logic. Over time it becomes a fragile mix of prompt concatenation, tool routing, memory loading, cache markers, observability hooks, and retry logic.
33
+ So we abstracted the injection itself.
29
34
 
30
- **agentfootprint abstracts that bookkeeping.** You declare what context to inject, where it lands, and when it activates. The framework owns the agent loop, recomposes the LLM call every iteration, records typed events, applies caching, and persists replayable checkpoints.
35
+ <p align="center">
36
+ <picture>
37
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/triggers-dark.svg">
38
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/triggers-light.svg">
39
+ <img alt="agentfootprint — Every LLM call has 3 fixed slots (system, messages, tools). Every flavor lands in one slot under one of 4 fixed triggers (always · rule · on-tool-return · llm-activated). Sparkle streams flow from each trigger lane down to a specific pill inside its destination slot — same slot can hold pills from different triggers (RAG via rule, Instruction via on-tool-return), and the same flavor (Skill) can land in different slots." src="docs/assets/triggers-light.svg" width="100%"/>
40
+ </picture>
41
+ </p>
42
+
43
+ The abstraction is three rules:
44
+
45
+ 1. **Three slots are fixed.** `system`, `messages`, `tools` — the LLM API surface.
46
+ 2. **N flavors are open.** You declare what you have. Tomorrow's flavor (few-shot, reflection, persona, A2A handoff…) plugs in the same way.
47
+ 3. **Rules decide *where* and *when*.** You provide the rules. We collect your data, fire the right one, land it in the right slot at the right iteration.
48
+
49
+ That's the whole model: `Injection = slot × trigger × cache`.
50
+
51
+ - **Slot** — which of the 3 LLM API regions the content lands in (`system` / `messages` / `tools`).
52
+ - **Trigger** — when the content fires (see below).
53
+ - **Cache** — how stable the content is across iterations. The framework places provider cache markers for you — stable content gets 80–90% cheaper prefixes.
54
+
55
+ ### The 4 triggers
56
+
57
+ | Trigger | Flavor | Fires when | Builder example | Default slot |
58
+ |---|---|---|---|---|
59
+ | `always` | static | Every iteration | `.steering('You are a triage agent…')` | `system` |
60
+ | `rule` | runtime — predicate | Your rule returns true | `.rag({ when: s => /price\|refund/.test(s.userQuery) })` | `messages` |
61
+ | `on-tool-return` | runtime — lifecycle | After a specific tool returns | `.instruction({ after: 'search_db', text: 'Cite source IDs.' })` | `messages` |
62
+ | `llm-activated` | runtime — agent-driven | LLM calls `read_skill('id')` | `.skill({ id: 'refund-policy', activatedBy: 'read_skill' })` | `messages` (body) |
31
63
 
32
- > You write the intent. agentfootprint owns the context loop.
64
+ > [!NOTE]
65
+ > Slot is a default, not a coupling — the same `Skill` can live in `tools` (schema only, discovered via `read_skill`), `messages` (body injected on activation), or `system` (baked into the prompt as steering).
66
+
67
+ **3 slots × 4 triggers × N flavors = the entire context-engineering surface.**
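The `Injection = slot × trigger × cache` model added in this README section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not agentfootprint's implementation: the `Injection`, `Trigger`, `LoopState`, `fires`, and `composeSlots` names are hypothetical, and only the slot names, the four trigger kinds, and the cache policy names come from the README text.

```typescript
// Illustrative model only: these types are assumptions, not agentfootprint's API.
type Slot = 'system' | 'messages' | 'tools';
type CachePolicy = 'always' | 'never' | 'while-active';

interface LoopState {
  userQuery: string;
  lastToolReturned?: string; // name of the tool that just returned, if any
  activatedSkills: string[]; // skill ids the LLM requested via read_skill
}

type Trigger =
  | { kind: 'always' }
  | { kind: 'rule'; when: (s: LoopState) => boolean }
  | { kind: 'on-tool-return'; after: string }
  | { kind: 'llm-activated'; skillId: string };

interface Injection {
  id: string;
  slot: Slot;
  trigger: Trigger;
  cache: CachePolicy; // recorded here; a real cache layer would read it
}

// Decide whether one trigger fires for the current iteration.
function fires(t: Trigger, s: LoopState): boolean {
  if (t.kind === 'always') return true;
  if (t.kind === 'rule') return t.when(s);
  if (t.kind === 'on-tool-return') return s.lastToolReturned === t.after;
  return s.activatedSkills.indexOf(t.skillId) !== -1; // 'llm-activated'
}

// Recompose the three slots from scratch for one iteration.
function composeSlots(all: Injection[], s: LoopState): Record<Slot, string[]> {
  const slots: Record<Slot, string[]> = { system: [], messages: [], tools: [] };
  for (const inj of all) {
    if (fires(inj.trigger, s)) slots[inj.slot].push(inj.id);
  }
  return slots;
}

const injections: Injection[] = [
  { id: 'steering', slot: 'system', trigger: { kind: 'always' }, cache: 'always' },
  { id: 'rag', slot: 'messages', cache: 'never',
    trigger: { kind: 'rule', when: (s: LoopState) => /price|refund/.test(s.userQuery) } },
  { id: 'cite-sources', slot: 'messages', cache: 'never',
    trigger: { kind: 'on-tool-return', after: 'search_db' } },
  { id: 'refund-policy', slot: 'messages', cache: 'while-active',
    trigger: { kind: 'llm-activated', skillId: 'refund-policy' } },
];

const iter1 = composeSlots(injections, {
  userQuery: 'Can I get a refund?',
  activatedSkills: [],
});
const iter2 = composeSlots(injections, {
  userQuery: 'Can I get a refund?',
  lastToolReturned: 'search_db',
  activatedSkills: ['refund-policy'],
});
```

In this sketch, `iter1` carries only the always-on steering and the rule-gated RAG entry, while `iter2` also picks up the post-tool instruction and the activated skill. Recomputing the slots on every iteration is the behaviour the README calls per-iteration recomposition.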
33
68
 
34
69
  ---
35
70
 
36
- ## The lineage
71
+ ## 2. Why we chose this abstraction
37
72
 
38
- Every load-bearing dev tool of the last decade made the same move:
73
+ The agent space has many credible primary abstractions:
39
74
 
40
- | Framework | You write | The framework abstracts |
41
- |---|---|---|
42
- | **PyTorch (autograd)** | Forward graph | Gradient computation, backward pass |
43
- | **Express / Fastify** | Routes + handlers | HTTP loop, middleware chain |
44
- | **Prisma** | Schema + query intent | SQL generation, migrations |
45
- | **React** | Components + state | DOM diffing, render path |
46
- | **agentfootprint** | Injections (slot × trigger × cache) | Slot composition, iteration loop, caching, observation, replay |
75
+ | Framework | What it abstracts |
76
+ |---|---|
77
+ | **LangChain** | Pipelines of composable components |
78
+ | **LangGraph** | State machines of nodes and edges |
79
+ | **CrewAI · AutoGen** | Crews of role-playing agents |
80
+ | **Mastra · Genkit · Pydantic AI** | Typed full-stack bundles |
81
+ | **DSPy** | Compiled prompts |
82
+ | **Inngest AgentKit** | Durable workflows |
83
+
84
+ We didn't have to choose between them.
85
+
86
+ agentfootprint is built on **footprintjs** — the flowchart pattern for backend code. footprintjs gives us every one of those abstractions out of the box:
87
+
88
+ | Capability | What footprintjs hands us |
89
+ |---|---|
90
+ | Composition | `Sequence` · `Parallel` · `Conditional` · `Loop` |
91
+ | State machines | The ReAct loop *is* a flowchart |
92
+ | Multi-agent crews | Compose Agents through control flow — no special class needed |
93
+ | Durable workflows | `pauseHere()` plus JSON-portable `resume()` |
94
+ | Typed observation | 60+ events for free, because the framework owns the loop |
95
+
96
+ So we used the budget those abstractions would have cost us to invest deeply in something they all leave to the developer: **the injection loop.**
47
97
 
48
- The closest structural parallel is **autograd**: you describe the graph, the framework traverses it, and *because the framework owns the traversal it can record everything for free*. Same idea here — typed events, replayable checkpoints, and provider-agnostic prompt caching are consequences of owning the loop, not extra features.
98
+ > [!IMPORTANT]
99
+ > **We abstract context engineering — and hand back the trace.**
100
+ > Live to develop · offline to monitor · detailed to improve.
101
+
102
+ ### The reason — agents have a new class of bug
103
+
104
+ For fifty years, software bugs have been **logic errors**. A wrong condition, a missed edge case, an off-by-one. You step through the code until you find the bad branch.
105
+
106
+ LLM-powered apps add a second class of bug: **contextual errors.** The code is correct. The model is correct. The answer is wrong because **the LLM's decision rests on context that was ambiguous, confusing, or misleading at the moment of inference.**
107
+
108
+ Tracking *which content the model actually saw, and why,* is the entire debugging job. Without it, the failure mode is invisible:
109
+
110
+ | What got injected wrong | What the model did |
111
+ |---|---|
112
+ | Wrong instruction landed in the `system` slot | Followed the wrong rule |
113
+ | Predicate fired one iteration too early | Reasoned with stale assumptions |
114
+ | Skill body missing when the LLM called `read_skill` | Invented its own |
115
+ | Cache prefix invalidated mid-iteration | Saw a silently rewritten stale version |
116
+ | Tool returned but the `on-tool-return` injection didn't fire | Couldn't interpret the result |
117
+
118
+ > [!IMPORTANT]
119
+ > **The model doesn't tell you which of these went wrong. It just gives you the wrong answer.**
120
+
121
+ You can't step through that with a debugger. By the time you read the response, the context that produced it is gone unless something recorded it.
122
+
123
+ That's the gap agentfootprint fills. A framework that owns the control flow can debug logic errors. A framework that owns the *injection* can debug contextual errors — because every injection is a typed event with a where, when, why, and how-it-cached.
124
+
125
+ ### What that buys you
126
+
127
+ Because we own the injection, every LLM call backtracks to four typed answers:
128
+
129
+ - **What** was injected
130
+ - **Who** triggered it (which rule)
131
+ - **When** it fired
132
+ - **How** it landed — slot, position, cache
133
+
134
+ Same trace, three workflows:
135
+
136
+ - **Live — debug as you build.** See exactly which injection produced which token, which predicate fired this iteration, which prefix actually got cached.
137
+ - **Offline — monitor what shipped.** Replay any past run from its trace. Alert on drift. Attribute cost per injection.
138
+ - **Detailed — improve via export.** Every successful trajectory is labeled training data for SFT, DPO, or RL — no separate data-collection phase.
139
+
140
+ And a fourth, novel: **the agent can read its own trace.** Six months after the agent rejected loan #42, *"why did you reject it?"* answers from the recorded evidence (`creditScore=580`, `threshold=600`), not a rerun. Causal memory turns the trace into the agent's working memory.
49
141
 
50
142
  ---
51
143
 
52
- ## The core idea
144
+ ## 3. How do I design my agent or system of agents?
145
+
146
+ Two scales — same alphabet. Four control flows are the entire vocabulary.
53
147
 
54
- Every LLM call has three slots:
148
+ <table>
149
+ <tr>
150
+ <td width="50%" align="center">
151
+ <picture>
152
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/sequence-dark.svg">
153
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/sequence-light.svg">
154
+ <img alt="Sequence — linear chain A → B → C." src="docs/assets/sequence-light.svg" width="100%"/>
155
+ </picture>
156
+ </td>
157
+ <td width="50%">
55
158
 
56
- ```text
57
- system messages tools
159
+ ```typescript
160
+ import { Sequence } from 'agentfootprint';
161
+
162
+ const flow = Sequence.create()
163
+ .step('a', stageA)
164
+ .step('b', stageB)
165
+ .step('c', stageC)
166
+ .build();
58
167
  ```
59
168
 
60
- Every agent feature — steering, instructions, skills, facts, memory, RAG, tool schemas — is content flowing into one of those slots. agentfootprint models all of them as one primitive:
169
+ </td>
170
+ </tr>
171
+ <tr>
172
+ <td width="50%" align="center">
173
+ <picture>
174
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/parallel-dark.svg">
175
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/parallel-light.svg">
176
+ <img alt="Parallel — fan-out then fan-in across N agents." src="docs/assets/parallel-light.svg" width="100%"/>
177
+ </picture>
178
+ </td>
179
+ <td width="50%">
180
+
181
+ ```typescript
182
+ import { Parallel } from 'agentfootprint';
61
183
 
62
- ```text
63
- Injection = slot × trigger × cache
184
+ const fan = Parallel.create()
185
+ .branch('web', searchWeb)
186
+ .branch('docs', searchDocs)
187
+ .mergeWithFn(synthesizer)
188
+ .build();
64
189
  ```
65
190
 
66
- An Injection answers three questions:
67
-
68
- 1. **Where does this content land?** `system`, `messages`, or `tools`
69
- 2. **When does it activate?** `always` · `rule` · `on-tool-return` · `llm-activated`
70
- 3. **How is it cached?** `always` · `never` · `while-active` · predicate
71
-
72
- That is the whole abstraction. Every named pattern in the agent literature — Reflexion, Tree-of-Thoughts, Skills, RAG, Constitutional AI — reduces to *which slot* + *which trigger*. You learn one model; the field's growth lands as new factories on the same primitive.
73
-
74
- ```text
75
- LLM call
76
- ┌────────────────────────────────────┐
77
- │ system messages tools │
78
- │ ▲ ▲ ▲ │
79
- └──────┼────────────┼────────────┼───┘
80
- │ │ │
81
- Injection Injection Injection
82
-
83
-
84
- always · rule · on-tool-return · llm-activated
191
+ </td>
192
+ </tr>
193
+ <tr>
194
+ <td width="50%" align="center">
195
+ <picture>
196
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/conditional-dark.svg">
197
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/conditional-light.svg">
198
+ <img alt="Conditional — diamond gate routes to one of N branches based on a predicate." src="docs/assets/conditional-light.svg" width="100%"/>
199
+ </picture>
200
+ </td>
201
+ <td width="50%">
202
+
203
+ ```typescript
204
+ import { Conditional } from 'agentfootprint';
205
+
206
+ const router = Conditional.create()
207
+ .when('billing', s => s.intent === 'billing', billingAgent)
208
+ .when('tech', s => s.intent === 'tech', techAgent)
209
+ .otherwise('default', defaultAgent)
210
+ .build();
85
211
  ```
86
212
 
87
- ---
213
+ </td>
214
+ </tr>
215
+ <tr>
216
+ <td width="50%" align="center">
217
+ <picture>
218
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/loop-dark.svg">
219
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/loop-light.svg">
220
+ <img alt="Loop — body cycles back from end to start until a condition is met." src="docs/assets/loop-light.svg" width="100%"/>
221
+ </picture>
222
+ </td>
223
+ <td width="50%">
224
+
225
+ ```typescript
226
+ import { Loop } from 'agentfootprint';
227
+
228
+ const reflexion = Loop.create()
229
+ .repeat(thinkAgent)
230
+ .until(s => s.satisfied)
231
+ .build();
232
+ ```
88
233
 
89
- ## Why this isn't just an ergonomics win — Dynamic ReAct
234
+ </td>
235
+ </tr>
236
+ </table>
90
237
 
91
- Because the framework owns the loop, **all three slots recompose every iteration based on what just happened.**
238
+ ### Inside one agent: Dynamic vs Classic ReAct
92
239
 
93
- - **LangChain** assembles prompts once per turn.
94
- - **LangGraph** composes state per node, not per loop iteration.
95
- - **agentfootprint** recomposes per iteration.
240
+ <p align="center">
241
+ <picture>
242
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/dynamic-vs-classic-dark.svg">
243
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/dynamic-vs-classic-light.svg">
244
+ <img alt="Classic ReAct vs Dynamic ReAct loop topology — same 5 stages (SystemPrompt, Messages, Tools, CallLLM, Route → ExecuteTools/Finalize), but the loop edge differs: Classic returns to CallLLM only (slots frozen at 12 tools every iteration), Dynamic returns to SystemPrompt (slots recompose, tools drop to 1, then grow to 5 as skills activate)." src="docs/assets/dynamic-vs-classic-light.svg" width="100%"/>
245
+ </picture>
246
+ </p>
247
+
248
+ **Same five stages on both sides. Only one thing differs — where the loop returns.** Classic ReAct loops back to `CallLLM` and slots stay frozen. Dynamic ReAct (agentfootprint) loops back to `SystemPrompt`, so injections that fired on the previous tool result recompose the next prompt. Per-iteration recomposition is also the structural prerequisite for the cache layer.
249
+
250
+ | Iteration | Classic ReAct | Dynamic ReAct (agentfootprint) |
251
+ |---|---|---|
252
+ | 1 | 12 tools shown | **1 tool** (`read_skill`) |
253
+ | 2 | 12 tools shown | **5 tools** (skill activated) |
254
+ | 3 | 12 tools shown | 5 tools |
255
+
256
+ > 📖 [Dynamic ReAct guide](https://footprintjs.github.io/agentfootprint/guides/dynamic-react/) · [Key concepts](https://footprintjs.github.io/agentfootprint/getting-started/key-concepts/)
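The iteration table above can be reproduced with a small sketch. It assumes a hypothetical catalog of three skills with four tools each (12 tools in total); the `Skill` shape, the tool names, and both functions are illustrative assumptions rather than agentfootprint code. Only `read_skill` and the 12 / 1 / 5 counts come from the README.

```typescript
// Hypothetical catalog: names are illustrative, not agentfootprint's API.
interface Skill {
  id: string;
  tools: string[];
}

const catalog: Skill[] = [
  { id: 'refund-policy', tools: ['lookup_order', 'check_eligibility', 'issue_refund', 'notify_customer'] },
  { id: 'shipping', tools: ['track_parcel', 'change_address', 'schedule_pickup', 'quote_rate'] },
  { id: 'account', tools: ['reset_password', 'update_email', 'close_account', 'export_data'] },
];

// Classic ReAct: the tools slot is built once and every tool is always exposed.
function classicTools(skills: Skill[]): string[] {
  const out: string[] = [];
  for (const skill of skills) {
    for (const tool of skill.tools) out.push(tool);
  }
  return out;
}

// Dynamic ReAct: the tools slot is rebuilt each iteration from the active skills.
function dynamicTools(skills: Skill[], activeSkillIds: string[]): string[] {
  const out: string[] = ['read_skill']; // the discovery tool is always present
  for (const skill of skills) {
    if (activeSkillIds.indexOf(skill.id) === -1) continue;
    for (const tool of skill.tools) out.push(tool);
  }
  return out;
}

const classicEveryIteration = classicTools(catalog);      // 12 tools
const dynamicIter1 = dynamicTools(catalog, []);                 // read_skill only
const dynamicIter2 = dynamicTools(catalog, ['refund-policy']);  // read_skill + 4 skill tools
```

The only structural difference between the two functions is the `activeSkillIds` argument: the dynamic variant needs the loop state as input, which is why the loop edge has to return to slot composition and not straight to the LLM call.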
257
+
258
+ ### Multi-agent — compose with the alphabet
259
+
260
+ <p align="center">
261
+ <picture>
262
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/compose-dark.svg">
263
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/compose-light.svg">
264
+ <img alt="A custom research agent built from the same 4 control flows: input flows into a Conditional gate (plan more research?), which fans out to a Parallel block (search_web, search_docs, search_kb), then chains into a Sequence (synthesize → critique), and a Loop arrow returns from the end back to the Conditional gate so the agent iterates until satisfied. Formula: Loop( Conditional(plan?) → Parallel(search_web, search_docs, search_kb) → Sequence(synth → critique) )." src="docs/assets/compose-light.svg" width="100%"/>
265
+ </picture>
266
+ </p>
96
267
 
97
- Per-iteration recomposition is what makes context engineering compositional instead of static. It's also the structural prerequisite for the cache layer — cache markers can't track active injections in lockstep without it.
268
+ Pick the flows that match your problem. Chain them. **That's your Agentic Application.**
98
269
 
99
- ```text
100
- Classic ReAct Dynamic ReAct
101
- ─────────────── ─────────────
102
- iter 1: 12 tools shown iter 1: 1 tool (read_skill)
103
- iter 2: 12 tools shown iter 2: 5 tools (skill activated)
104
- iter 3: 12 tools shown iter 3: 5 tools
270
+ ```typescript
271
+ const research = Loop.create()
272
+ .repeat(Sequence.create().step('plan', plan).step('search', searchAll).build())
273
+ .until(s => s.satisfied).build();
105
274
  ```
106
275
 
107
- Use Dynamic ReAct when your tools have dependencies (one tool's output implies which tool to call next). Use Classic ReAct when all tools are independent and ordering doesn't matter.
276
+ Same `.create().method().build()` shape as the four rows above, just composed.
277
+
278
+ ### Named patterns — also compositions of the same 4
108
279
 
109
- > 📖 Deep dive: [Dynamic ReAct guide](https://footprintjs.github.io/agentfootprint/guides/dynamic-react/) · [Cache layer](https://footprintjs.github.io/agentfootprint/guides/caching/)
280
+ <p align="center">
281
+ <picture>
282
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/patterns-dark.svg">
283
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/patterns-light.svg">
284
+ <img alt="6 named multi-agent patterns reduce to compositions of the same 4 control flows: Swarm = Loop(Parallel(Agent×N) → merge); Tree-of-Thoughts = Loop(Parallel(Agent×N) → Conditional(score)); Reflexion = Loop(Agent → Conditional(critique) → Agent); Debate = Parallel(Agent_pro, Agent_con) → Agent_judge; Router = Conditional → Agent_A | Agent_B | Agent_C; Hierarchical = Agent_planner → Sequence(Agent_worker×N) → synth." src="docs/assets/patterns-light.svg" width="100%"/>
285
+ </picture>
286
+ </p>
287
+
288
+ The patterns the field knows reduce to the same alphabet:
289
+
290
+ | Pattern | Composition |
291
+ |---|---|
292
+ | **Swarm** | `Loop( Parallel( Agent×N ) → merge )` |
293
+ | **Tree-of-Thoughts** | `Loop( Parallel( Agent×N ) → Conditional(score) )` |
294
+ | **Reflexion** | `Loop( Agent → Conditional(critique) → Agent )` |
295
+ | **Debate** | `Parallel( Agent_pro, Agent_con ) → Agent_judge` |
296
+ | **Router** | `Conditional → Agent_A \| Agent_B \| Agent_C` |
297
+ | **Hierarchical** | `Agent_planner → Sequence( Agent_worker×N ) → synth` |
298
+
299
+ Same trick as Beat 1: instead of N libraries for N patterns, we found the M building blocks all N patterns are made of.
300
+
301
+ > 📖 Compare: [hand-rolled vs declarative](https://footprintjs.github.io/agentfootprint/getting-started/why/) · [migration from LangChain / CrewAI / LangGraph](https://footprintjs.github.io/agentfootprint/getting-started/vs/)
302
+
303
+ ---
304
+
305
+ ## 4. How do I see what my agent did?
306
+
307
+ Because we own the loop (Beat 2), every decision and execution is captured during traversal — not bolted on. The default capture is the **causal trace**: every stage, read, write, and decision evidence, as a JSON-portable, scrubbable, queryable, exportable artifact. Beyond the default, wire custom recorders for cost, latency, or quality scoring — any observation hook fires on the same stream.
308
+
309
+ <p align="center">
310
+ <picture>
311
+ <source media="(prefers-color-scheme: dark)" srcset="docs/assets/causal-memory-dark.svg">
312
+ <source media="(prefers-color-scheme: light)" srcset="docs/assets/causal-memory-light.svg">
313
+ <img alt="agentfootprint causal memory — Each agent run produces a JSON-portable causal trace: a scrubbable timeline of every stage with reads, writes, and captured decision evidence. The trace card shows a time-travel slider (Step 5 of 17, Live), an execution timeline with stage-duration bars, and the captured decision evidence pill (riskTier eq high → reject). Two built-in lenses view it: Lens (agent-centric) and Explainable Trace (structural). Three programmatic consumers fan out from it: audit replay (GDPR Article 22 adverse-action notice answered from chain, no LLM call, $15/1M to $0.25/1M tokens), cheap-model triage (Sonnet trace fed to Haiku for follow-ups), and training data export (every chain is a labeled trajectory ready for SFT/DPO/process-RL). One recording, two lenses, three consumers, zero extra instrumentation. Powered by footprintjs causalChain()." src="docs/assets/causal-memory-light.svg" width="100%"/>
314
+ </picture>
315
+ </p>
316
+
317
+ The same trace serves three downstream consumers — no extra instrumentation:
318
+
319
+ 1. **Audit / compliance.** Six months later, *"why was loan #42 rejected?"* answers from the chain (`creditScore=580 < 620 ∧ dti=0.6 > 0.43 → riskTier=high → REJECTED`). No LLM call. GDPR Art. 22, ECOA, and EU AI Act adverse-action notices write themselves from the captured decision evidence.
320
+
321
+ 2. **Cheap-model triage.** A Sonnet trace becomes good *input* for Haiku to answer follow-ups. ~200 tokens at any model ($0.25/1M) vs ~2,500 tokens at a reasoning model ($15/1M). Memoization for agent thinking — no agent rerun.
322
+
323
+ 3. **Training data — the substrate is already there.** Every successful chain is a labeled trajectory. SFT pairs (`{prompt, completion}`) fall out of the snapshot's history field; the export wrapper is roadmap work tracked in [GitHub issues](https://github.com/footprintjs/agentfootprint/issues). DPO and process-RL need additional collection layers (preference feedback, per-step reward annotation) that don't ship today.
324
+
325
+ Two built-in lenses view the same trace:
326
+
327
+ | Lens | View | When to use |
328
+ |---|---|---|
329
+ | **Lens** | Agent-centric — User/Agent[3 slots]/Tool flowchart with iteration scrubber and round commentary | Live debugging, "what did Neo see at step 5?" |
330
+ | **Explainable Trace** | Structural — subflow tree, full flowchart, memory inspector, per-stage execution timeline | Architecture review, root-cause analysis |
331
+
332
+ > 📖 Powered by [footprintjs `causalChain()`](https://footprintjs.github.io/footPrint/blog/backward-causal-chain/) — backward thin-slicing on the commit log. [Causal memory deep dive](https://footprintjs.github.io/agentfootprint/causal-deep-dive/) · [Explainability & compliance](https://footprintjs.github.io/footPrint/blog/explainability-compliance/)
333
+
334
+ **One recording. Two lenses. Three consumers. Zero extra instrumentation.**
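The audit example in this section (answering "why was loan #42 rejected?" without an LLM call) rests on walking recorded reads and writes backwards from the final decision. The sketch below shows that idea on a hand-written trace. It is an assumption-laden illustration: `StageRecord`, `explainBackward`, and the stage names are hypothetical, and this is not the footprintjs `causalChain()` implementation, whose API is not shown in this diff.

```typescript
// Hypothetical trace shape for illustration; not the footprintjs data model.
interface StageRecord {
  stage: string;
  reads: Record<string, number | string>;
  writes: Record<string, number | string>;
}

// Walk the trace from last stage to first, keeping only the stages that
// wrote a value the target key (transitively) depends on.
function explainBackward(trace: StageRecord[], targetKey: string): StageRecord[] {
  const wanted: string[] = [targetKey];
  const chain: StageRecord[] = [];
  for (let i = trace.length - 1; i >= 0; i--) {
    const rec = trace[i];
    const relevant = Object.keys(rec.writes).filter(k => wanted.indexOf(k) !== -1);
    if (relevant.length === 0) continue; // stage did not contribute
    chain.unshift(rec);
    for (const key of Object.keys(rec.reads)) {
      if (wanted.indexOf(key) === -1) wanted.push(key);
    }
  }
  return chain;
}

// Values mirror the README's example chain for loan #42.
const loanTrace: StageRecord[] = [
  { stage: 'loadApplicant', reads: {}, writes: { creditScore: 580, dti: 0.6 } },
  { stage: 'logRequest', reads: {}, writes: { requestId: 'r-1' } },
  { stage: 'assessRisk', reads: { creditScore: 580, dti: 0.6 }, writes: { riskTier: 'high' } },
  { stage: 'decide', reads: { riskTier: 'high' }, writes: { decision: 'REJECTED' } },
];

const why = explainBackward(loanTrace, 'decision');
const whyStages = why.map(r => r.stage);
```

Here `whyStages` lists `loadApplicant`, `assessRisk`, and `decide`, and skips `logRequest` because nothing downstream read its output. The evidence values (`creditScore`, `dti`, `riskTier`) sit on the returned records, so the explanation can be assembled from the recording alone.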
110
335
 
111
336
  ---
112
337
 
@@ -146,69 +371,6 @@ Swap `mock(...)` for `anthropic(...)` / `openai(...)` / `bedrock(...)` / `ollama
146
371
 
147
372
  ---
148
373
 
149
- ## A real agent in 8 lines
150
-
151
- ```typescript
152
- const agent = Agent.create({ provider, model: 'claude-sonnet-4-5-20250929' })
153
- .system('You are a support assistant.')
154
- .steering(toneRule) // always-on
155
- .instruction(urgentRule) // rule-gated
156
- .skill(billingSkill) // LLM-activated
157
- .memory(conversationMemory) // cross-run, multi-tenant
158
- .tool(weather)
159
- .build();
160
-
161
- await agent.run({ message: userInput, identity: { conversationId } });
162
- ```
163
-
164
- The hand-rolled equivalent is ~80 lines of slot management, trigger evaluation, memory loading, and cache marker placement — and growing with every feature. The declarative version stays at 8.
165
-
166
- > 📖 Compare: [hand-rolled vs declarative](https://footprintjs.github.io/agentfootprint/getting-started/why/) · [migration from LangChain / CrewAI / LangGraph](https://footprintjs.github.io/agentfootprint/getting-started/vs/)
167
-
168
- ---
169
-
170
- ## The differentiator: the trace is a cache of the agent's thinking
171
-
172
- Other agent frameworks remember *what was said*. agentfootprint's causal memory records the **decision evidence** — every value the flowchart captured during the run, persisted as a JSON-portable snapshot.
173
-
174
- That changes the cost structure of everything that happens after the agent runs:
175
-
176
- 1. **Audit / explain** — six months later, "why was loan #42 rejected?" answers from the original evidence (creditScore=580, threshold=600), not reconstruction.
177
- 2. **Cheap-model triage** — a trace from Sonnet is good *input* for Haiku to answer follow-up questions about that run. Memoization for agent reasoning.
178
- 3. **Training data** — every successful production run is a labeled trajectory for SFT/DPO/process-RL, no separate data-collection phase.
179
-
180
- One recording, three downstream consumers, no extra instrumentation.
181
-
182
- > 📖 Deep dive: [Causal memory guide](https://footprintjs.github.io/agentfootprint/guides/causal-memory/)
183
-
184
- ---
185
-
186
- ## What you can build
187
-
188
- ```typescript
189
- // Customer support — skills + memory + audit + cache
190
- const agent = Agent.create({ provider, model })
191
- .system('You are a friendly support assistant.')
192
- .skill(billingSkill)
193
- .steering(toneGuidelines)
194
- .memory(conversationMemory)
195
- .build();
196
-
197
- // Research pipeline — multi-agent fan-out + merge
198
- const research = Parallel.create()
199
- .branch(optimist).branch(skeptic).branch(historian)
200
- .merge(synthesizer)
201
- .build();
202
-
203
- // Streaming chat — token-by-token to a browser via SSE
204
- agent.on('agentfootprint.stream.token', (e) => res.write(toSSE(e)));
205
- await agent.run({ message: req.query.message });
206
- ```
207
-
208
- > 📖 Full examples: [examples gallery](https://github.com/footprintjs/agentfootprint/tree/main/examples) · every example is also a CI test.
209
-
210
- ---
211
-
212
374
  ## Mocks first, production second
213
375
 
214
376
  Build the entire app against in-memory mocks with **zero API cost**, then swap real infrastructure one boundary at a time.
@@ -216,7 +378,7 @@ Build the entire app against in-memory mocks with **zero API cost**, then swap r
216
378
  | Boundary | Dev | Prod |
217
379
  |---|---|---|
218
380
  | LLM provider | `mock(...)` | `anthropic()` · `openai()` · `bedrock()` · `ollama()` |
219
- | Memory store | `InMemoryStore` | `RedisStore` · `AgentCoreStore` · DynamoDB / Postgres / Pinecone |
381
+ | Memory store | `InMemoryStore` | `RedisStore` · `AgentCoreStore` |
220
382
  | MCP | `mockMcpClient(...)` | `mcpClient({ transport })` |
221
383
  | Cache strategy | `NoOpCacheStrategy` | auto-selected per provider |
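
The boundary swap in the table above can be pictured with a minimal, self-contained sketch. Everything here is hypothetical and for illustration only: `KeyValueStore`, `MapStore`, and `rememberGreeting` are not agentfootprint APIs, and the real `MemoryStore` interface may look different. The point is only that application code depends on the interface, so the dev implementation can be replaced later without touching it.

```typescript
// Hypothetical boundary interface (NOT the library's MemoryStore).
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Dev implementation: a plain in-process Map, no network and no cost.
class MapStore implements KeyValueStore {
  private readonly data = new Map<string, string>();
  get(key: string): string | undefined {
    return this.data.get(key);
  }
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Application code sees only the interface, never the concrete store.
function rememberGreeting(store: KeyValueStore, user: string): string {
  const seen = store.get(user);
  store.set(user, "seen");
  return seen === undefined ? `Hello, ${user}!` : `Welcome back, ${user}!`;
}

const devStore = new MapStore();
const first = rememberGreeting(devStore, "ada");
const second = rememberGreeting(devStore, "ada");
console.log(first, second);
```

A production store would implement the same interface over a real backend; `rememberGreeting` would stay unchanged.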
222
384
 
@@ -226,47 +388,41 @@ The flowchart, recorders, and tests don't change between dev and prod.
226
388
 
227
389
  ## What ships today
228
390
 
229
- - **2 primitives** — `LLMCall`, `Agent` (the ReAct loop)
230
- - **4 compositions** — `Sequence`, `Parallel`, `Conditional`, `Loop`
231
- - **7 LLM providers** — Anthropic · OpenAI · Bedrock · Ollama · Browser-Anthropic · Browser-OpenAI · Mock
232
- - **One Injection primitive** — `defineSkill` / `defineSteering` / `defineInstruction` / `defineFact`
233
- - **One Memory factory** — 4 types × 7 strategies including **Causal**
234
- - **Provider-agnostic prompt caching** — declarative per-injection, per-iteration marker recomputation
235
- - **RAG · MCP · Memory store adapters** — InMemory · Redis · AgentCore
236
- - **48+ typed observability events** across context · stream · agent · cost · skill · permission · eval · memory · cache · embedding · error
237
- - **Pause / resume** — JSON-serializable checkpoints; resume hours later on a different server
238
- - **Resilience** — `withRetry`, `withFallback`, `resilientProvider`
239
- - **AI-coding-tool support** — Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot
240
-
241
- > 📖 [Full feature list & API reference](https://footprintjs.github.io/agentfootprint/reference/) · [CHANGELOG](./CHANGELOG.md)
391
+ **Core**
392
+ - 2 primitives — `LLMCall`, `Agent` (the ReAct loop)
393
+ - 4 control flows — `Sequence`, `Parallel`, `Conditional`, `Loop`
394
+ - 1 Injection primitive — `defineSkill` / `defineSteering` / `defineInstruction` / `defineFact`
395
+ - 1 reliability gate — `.reliability({ preCheck, postDecide, providers, circuitBreaker, fallback })`
396
+ - 1 tool dispatch primitive — `ToolProvider` (sync OR async) — `staticTools` · `gatedTools` · `skillScopedTools` · custom `discoveryProvider` over hubs / MCP / per-tenant catalogs
242
397
 
243
- ---
244
-
245
- ## Roadmap
398
+ **LLM providers** (7)
246
399
 
247
- | Theme | Focus |
400
+ | Factory | Use for |
248
401
  |---|---|
249
- | Reliability | Circuit breaker, output fallback, auto-resume-on-error |
250
- | Causal exports | `causalMemory.exportForTraining({ format: 'sft' \| 'dpo' \| 'process' })` |
251
- | Governance | Policies, budget tracking, production memory adapters |
252
- | Cache v2 | Gemini handle-based caching, cost attribution |
253
- | Deep agents | Planning-before-execution, A2A protocol, Lens UI |
254
-
255
- Roadmap items are *not* current API claims. If a feature isn't in `npm install agentfootprint` today, it's listed here, not in the docs.
256
-
257
- ---
258
-
259
- ## Design philosophy
260
-
261
- Two principles shape the runtime:
262
-
263
- **Connected data (Palantir, 2003).** Enterprise insight is bottlenecked by data fragmentation, not analyst skill. Agents face the same problem at runtime: disconnected tool state, lost decision evidence, scattered execution context. agentfootprint connects state, decisions, execution, and memory into one runtime footprint so the next iteration compounds the connection instead of paying for it again.
264
-
265
- **Modular boundaries (Liskov, 1974).** Every framework boundary — `LLMProvider`, `ToolProvider`, `CacheStrategy`, `Recorder`, `MemoryStore` — is an LSP-substitutable interface. Swap implementations without changing agent code.
266
-
267
- Connected data alone is fast but unmaintainable. Modular boundaries alone are clean but dumb. Together: a runtime that's both fast and reasonable.
268
-
269
- > 📖 Long-form: [the Palantir lineage](https://footprintjs.github.io/agentfootprint/inspiration/connected-data/) · [the Liskov lineage](https://footprintjs.github.io/agentfootprint/inspiration/modularity/)
402
+ | `anthropic` | Claude (Sonnet, Opus, Haiku) via `@anthropic-ai/sdk` |
403
+ | `openai` | GPT-4o, GPT-4-turbo via `openai` SDK |
404
+ | `bedrock` | Claude / Titan / Mistral via AWS Bedrock runtime |
405
+ | `ollama` | Local models (OpenAI-compatible endpoint) |
406
+ | `browserAnthropic` | Browser-side Claude calls (no proxy server) |
407
+ | `browserOpenai` | Browser-side OpenAI calls (no proxy server) |
408
+ | `mock` | Deterministic dev/test (zero API cost) |
409
+
410
+ **Memory + adapters**
411
+ - Memory factory — 4 types (`episodic` / `semantic` / `narrative` / `causal`) × 7 strategies (`window` / `budget` / `summarize` / `topK` / `extract` / `decay` / `hybrid`)
412
+ - Memory stores — `InMemoryStore`, `RedisStore` (peer-dep `ioredis`), `AgentCoreStore` (peer-dep AWS SDK)
413
+ - RAG · MCP adapters — `mockMcpClient(...)` / `mcpClient({ transport })`
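
As a rough illustration of what a `window` strategy from the list above might do, here is a minimal sketch that keeps only the most recent messages. This is an assumption about the general technique, not the library's implementation: `Message` and `windowStrategy` are hypothetical names, and agentfootprint's actual strategy may count tokens or turns differently.

```typescript
// Hypothetical message shape, for illustration only.
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Keep the last `size` messages; older ones fall out of the window.
function windowStrategy(history: readonly Message[], size: number): Message[] {
  // Guard: slice(-0) would return the whole array, so handle size <= 0 explicitly.
  if (size <= 0) return [];
  return history.slice(-size);
}

const history: Message[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello! How can I help?" },
  { role: "user", content: "Where is my refund?" },
];

const windowed = windowStrategy(history, 2);
console.log(windowed.map((m) => m.content));
```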
414
+
415
+ **Operability**
416
+ - Provider-agnostic prompt caching — declarative per-injection, per-iteration marker recomputation
417
+ - Pause / resume — JSON-serializable checkpoints; resume hours later on a different server
418
+ - Resilience primitives — `withRetry`, `withFallback`, `withCircuitBreaker`, `.outputFallback`, `agent.resumeOnError`
419
+ - 60+ typed observability events — `agent` · `composition` · `context` · `stream` · `tools` · `skill` · `memory` · `cache` · `cost` · `permission` · `eval` · `embedding` · `pause` · `error` · `fallback` · `resilience` · `reliability` · `risk`
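
The circuit-breaker idea behind the resilience primitives above can be sketched in a few lines. This is a generic consecutive-failure breaker written for illustration, under the assumption that the pattern is the familiar one; the `CircuitBreaker` class below is hypothetical and is not the library's `withCircuitBreaker`, whose options and behavior may differ.

```typescript
type Call<T> = () => T;

// Generic consecutive-failure circuit breaker (illustrative, not the library's).
class CircuitBreaker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // The circuit is open once `threshold` consecutive failures have been seen.
  get open(): boolean {
    return this.failures >= this.threshold;
  }

  run<T>(primary: Call<T>, fallback: Call<T>): T {
    // While open, skip the primary entirely and go straight to the fallback.
    if (this.open) return fallback();
    try {
      const result = primary();
      this.failures = 0; // a success resets the failure streak
      return result;
    } catch {
      this.failures += 1;
      return fallback();
    }
  }
}

const breaker = new CircuitBreaker(2);
let primaryCalls = 0;
const flaky = (): string => {
  primaryCalls += 1;
  throw new Error("provider down");
};

const results = [1, 2, 3].map(() => breaker.run(flaky, () => "fallback answer"));
console.log(results, primaryCalls, breaker.open);
```

After two consecutive failures the third call never reaches the failing primary. A production breaker would normally also include a cool-down after which the primary is retried; that is omitted here to keep the sketch short.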
420
+
421
+ **Tooling**
422
+ - **Lens** · **Explainable Trace** — two visual replays of the causal trace (separate `agentfootprint-lens` package)
423
+ - AI-coding-tool support — Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot
424
+
425
+ > 📖 [Agent API reference](https://footprintjs.github.io/agentfootprint/api/agent/) · [CHANGELOG](./CHANGELOG.md)
270
426
 
271
427
  ---
272
428
 
@@ -277,8 +433,8 @@ Connected data alone is fast but unmaintainable. Modular boundaries alone are cl
277
433
  | New to agents | [5-minute quick start](https://footprintjs.github.io/agentfootprint/getting-started/quick-start/) |
278
434
  | Coming from LangChain / CrewAI / LangGraph | [Migration guide](https://footprintjs.github.io/agentfootprint/getting-started/vs/) |
279
435
  | Architecting an enterprise rollout | [Production guide](https://footprintjs.github.io/agentfootprint/guides/deployment/) |
280
- | Doing due diligence | [Architecture overview](https://footprintjs.github.io/agentfootprint/architecture/) |
281
- | Researcher / extending | [Extension guide](https://footprintjs.github.io/agentfootprint/contributing/extension-guide/) |
436
+ | Doing due diligence | [Architecture overview](https://footprintjs.github.io/agentfootprint/architecture/dependency-graph/) |
437
+ | Researcher / academic background | [Citations & prior art](https://footprintjs.github.io/agentfootprint/research/citations/) |
282
438
  | Curious about design | [Inspiration docs](https://footprintjs.github.io/agentfootprint/inspiration/) |
283
439
 
284
440
  Or jump into the [examples gallery](https://github.com/footprintjs/agentfootprint/tree/main/examples) — every example is also an end-to-end CI test.
@@ -287,7 +443,7 @@ Or jump into the [examples gallery](https://github.com/footprintjs/agentfootprin
287
443
 
288
444
  ## Built on
289
445
 
290
- [footprintjs](https://github.com/footprintjs/footPrint) — the flowchart pattern for backend code. The decision-evidence capture, narrative recording, and time-travel checkpointing this library uses are footprintjs primitives. The same way autograd's forward-pass traversal is what makes gradient inspection automatic, footprintjs's flowchart traversal is what makes agentfootprint's typed-event stream and replayable traces automatic.
446
+ [footprintjs](https://github.com/footprintjs/footPrint) — the flowchart pattern for backend code. agentfootprint's decision-evidence capture, narrative recording, and time-travel checkpointing are footprintjs primitives at the runtime layer.
291
447
 
292
448
  You don't need to learn footprintjs to use agentfootprint — but if you want to build your own primitives at this depth, [start there](https://footprintjs.github.io/footPrint/).
293
449