agentfootprint 2.3.0 → 2.4.0
- package/README.md +277 -249
- package/dist/core/Agent.js +16 -0
- package/dist/core/Agent.js.map +1 -1
- package/dist/esm/core/Agent.js +16 -0
- package/dist/esm/core/Agent.js.map +1 -1
- package/dist/esm/index.js +1 -1
- package/dist/esm/index.js.map +1 -1
- package/dist/esm/lib/injection-engine/SkillRegistry.js +83 -0
- package/dist/esm/lib/injection-engine/SkillRegistry.js.map +1 -0
- package/dist/esm/lib/injection-engine/factories/defineSkill.js +34 -0
- package/dist/esm/lib/injection-engine/factories/defineSkill.js.map +1 -1
- package/dist/esm/lib/injection-engine/index.js +2 -1
- package/dist/esm/lib/injection-engine/index.js.map +1 -1
- package/dist/esm/lib/injection-engine/types.js.map +1 -1
- package/dist/index.js +3 -1
- package/dist/index.js.map +1 -1
- package/dist/lib/injection-engine/SkillRegistry.js +87 -0
- package/dist/lib/injection-engine/SkillRegistry.js.map +1 -0
- package/dist/lib/injection-engine/factories/defineSkill.js +36 -1
- package/dist/lib/injection-engine/factories/defineSkill.js.map +1 -1
- package/dist/lib/injection-engine/index.js +4 -1
- package/dist/lib/injection-engine/index.js.map +1 -1
- package/dist/lib/injection-engine/types.js.map +1 -1
- package/dist/types/core/Agent.d.ts +14 -0
- package/dist/types/core/Agent.d.ts.map +1 -1
- package/dist/types/index.d.ts +1 -1
- package/dist/types/index.d.ts.map +1 -1
- package/dist/types/lib/injection-engine/SkillRegistry.d.ts +50 -0
- package/dist/types/lib/injection-engine/SkillRegistry.d.ts.map +1 -0
- package/dist/types/lib/injection-engine/factories/defineSkill.d.ts +72 -0
- package/dist/types/lib/injection-engine/factories/defineSkill.d.ts.map +1 -1
- package/dist/types/lib/injection-engine/index.d.ts +2 -1
- package/dist/types/lib/injection-engine/index.d.ts.map +1 -1
- package/dist/types/lib/injection-engine/types.d.ts +10 -0
- package/dist/types/lib/injection-engine/types.d.ts.map +1 -1
- package/package.json +27 -8
- package/README.proposed.md +0 -258
- package/dist/instructions.js +0 -21
- package/dist/instructions.js.map +0 -1
- package/dist/lib/instructions/defineInstruction.js +0 -35
- package/dist/lib/instructions/defineInstruction.js.map +0 -1
- package/dist/lib/instructions/evaluator.js +0 -38
- package/dist/lib/instructions/evaluator.js.map +0 -1
- package/dist/lib/instructions/index.js +0 -48
- package/dist/lib/instructions/index.js.map +0 -1
- package/dist/lib/instructions/types.js +0 -22
- package/dist/lib/instructions/types.js.map +0 -1
- package/dist/memory/conversationHelpers.js +0 -39
- package/dist/memory/conversationHelpers.js.map +0 -1
- package/dist/types/instructions.d.ts +0 -5
- package/dist/types/instructions.d.ts.map +0 -1
- package/dist/types/lib/instructions/defineInstruction.d.ts +0 -22
- package/dist/types/lib/instructions/defineInstruction.d.ts.map +0 -1
- package/dist/types/lib/instructions/evaluator.d.ts +0 -11
- package/dist/types/lib/instructions/evaluator.d.ts.map +0 -1
- package/dist/types/lib/instructions/index.d.ts +0 -44
- package/dist/types/lib/instructions/index.d.ts.map +0 -1
- package/dist/types/lib/instructions/types.d.ts +0 -100
- package/dist/types/lib/instructions/types.d.ts.map +0 -1
- package/dist/types/memory/conversationHelpers.d.ts +0 -19
- package/dist/types/memory/conversationHelpers.d.ts.map +0 -1
package/README.md
CHANGED
|
@@ -1,351 +1,379 @@
|
|
|
1
1
|
<p align="center">
|
|
2
|
-
<h1 align="center">
|
|
2
|
+
<h1 align="center">agentfootprint</h1>
|
|
3
3
|
<p align="center">
|
|
4
|
-
<strong>Context engineering,
|
|
4
|
+
<strong>Context engineering, abstracted.</strong>
|
|
5
5
|
</p>
|
|
6
6
|
</p>
|
|
7
7
|
|
|
8
8
|
<p align="center">
|
|
9
9
|
<a href="https://github.com/footprintjs/agentfootprint/actions"><img src="https://github.com/footprintjs/agentfootprint/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
|
|
10
|
+
<a href="https://codecov.io/gh/footprintjs/agentfootprint"><img src="https://codecov.io/gh/footprintjs/agentfootprint/branch/main/graph/badge.svg" alt="Coverage"></a>
|
|
10
11
|
<a href="https://www.npmjs.com/package/agentfootprint"><img src="https://img.shields.io/npm/v/agentfootprint.svg?style=flat" alt="npm version"></a>
|
|
11
|
-
<a href="https://github.com/footprintjs/agentfootprint/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT License"></a>
|
|
12
12
|
<a href="https://www.npmjs.com/package/agentfootprint"><img src="https://img.shields.io/npm/dm/agentfootprint.svg" alt="Downloads"></a>
|
|
13
|
-
<a href="https://
|
|
14
|
-
<a href="https://github.com/footprintjs/footPrint"><img src="https://img.shields.io/badge/Built_on-footprintjs-ca8a04?style=flat" alt="Built on footprintjs"></a>
|
|
13
|
+
<a href="https://github.com/footprintjs/agentfootprint/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT"></a>
|
|
15
14
|
</p>
|
|
16
15
|
|
|
17
16
|
<br>
|
|
18
17
|
|
|
19
|
-
**
|
|
18
|
+
> **PyTorch's autograd abstracted gradient computation. Express abstracted the HTTP request loop. Prisma abstracted SQL CRUD. Kubernetes abstracted reconciliation. React abstracted the DOM.**
|
|
19
|
+
>
|
|
20
|
+
> Every load-bearing dev tool of the last decade is *the same kind of move* — abstract one specific kind of bookkeeping that practitioners were doing by hand, so they can spend their attention on intent instead of plumbing.
|
|
21
|
+
>
|
|
22
|
+
> **agentfootprint is that move applied to context engineering** — the discipline of deciding what content lands in which slot of an LLM call, when, and why. You describe injections declaratively. The framework evaluates every trigger every iteration, composes the `system` / `messages` / `tools` slots, observes every decision as a typed event, and persists checkpoints you can replay six months later. So you write the **intent**, not 200 lines of slot-management bookkeeping per agent.
|
|
20
23
|
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
24
|
+
| Framework | You write declaratively | The framework abstracts |
|
|
25
|
+
|---|---|---|
|
|
26
|
+
| **PyTorch (autograd)** | Forward computation graph | Gradient computation, backward pass, parameter bookkeeping |
|
|
27
|
+
| **Express / Fastify** | Routes + handlers | HTTP request loop, middleware chain, response serialization |
|
|
28
|
+
| **Prisma / SQLAlchemy** | Schema + query intent | SQL generation, connection pooling, migrations |
|
|
29
|
+
| **Kubernetes** | Desired state (manifests) | Scheduling, health checks, reconciliation loop |
|
|
30
|
+
| **React** | Components + state | DOM diffing, render path, event delegation |
|
|
31
|
+
| **agentfootprint** | Injections (slot × trigger) | Slot composition, iteration loop, observation, replay |
|
|
32
|
+
|
|
33
|
+
The closest structural parallel is **autograd**: you describe the graph, the framework traverses it, and *because the framework owns the traversal it can record everything that happens for free*. Same idea here — you describe Injections, agentfootprint runs the iteration loop, and the typed-event stream + replayable checkpoints are a consequence, not an extra feature.
|
|
34
|
+
|
|
35
|
+
<!-- ┌────────────────────────────────────────────────────────────────┐
|
|
36
|
+
│ 📹 30-second demo video here. │
|
|
37
|
+
│ Embed: paste-trace → drag time-travel slider → │
|
|
38
|
+
│ every slot, every decision, every tool call visible. │
|
|
39
|
+
│ Frame this as "agent DevTools" — the React DevTools moment.│
|
|
40
|
+
└────────────────────────────────────────────────────────────────┘ -->
|
|
33
41
|
|
|
34
42
|
---
|
|
35
43
|
|
|
36
|
-
##
|
|
44
|
+
## The abstraction, concretely
|
|
37
45
|
|
|
38
|
-
|
|
46
|
+
### Without agentfootprint — context engineering by hand
|
|
39
47
|
|
|
40
|
-
```
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
48
|
+
```typescript
|
|
49
|
+
async function runAgentTurn(userMsg, state) {
|
|
50
|
+
let systemPrompt = baseSystem;
|
|
51
|
+
const messages = [...state.history, { role: 'user', content: userMsg }];
|
|
52
|
+
let activeTools = [...baseTools];
|
|
53
|
+
|
|
54
|
+
// 1. Apply always-on steering rules
|
|
55
|
+
for (const rule of steeringRules) systemPrompt += '\n' + rule.text;
|
|
56
|
+
|
|
57
|
+
// 2. Evaluate conditional instructions
|
|
58
|
+
for (const inst of instructions) {
|
|
59
|
+
if (inst.activeWhen(state)) systemPrompt += '\n' + inst.prompt;
|
|
60
|
+
}
|
|
61
|
+
|
|
62
|
+
// 3. Check on-tool-return triggers
|
|
63
|
+
if (state.lastToolResult?.toolName === 'redact_pii') {
|
|
64
|
+
messages.push({ role: 'system', content: 'Use redacted text only.' });
|
|
65
|
+
}
|
|
66
|
+
|
|
67
|
+
// 4. Resolve LLM-activated skills
|
|
68
|
+
for (const id of state.activatedSkills) {
|
|
69
|
+
systemPrompt += '\n' + skills[id].body;
|
|
70
|
+
activeTools.push(...skills[id].tools);
|
|
71
|
+
}
|
|
72
|
+
|
|
73
|
+
// 5. Load + format memory for this tenant
|
|
74
|
+
const memEntries = await store.list({ tenant, conversationId });
|
|
75
|
+
messages.unshift({ role: 'system', content: formatMemory(memEntries.slice(-10)) });
|
|
76
|
+
|
|
77
|
+
// 6. Call LLM, route tool calls, loop, capture state for resume...
|
|
78
|
+
// 7. Persist new turn back to memory tagged with identity...
|
|
79
|
+
// 8. Wire SSE for streaming, attach observability hooks...
|
|
80
|
+
|
|
81
|
+
// No replay. No audit trail. Per agent, hundreds of lines.
|
|
82
|
+
// Every refactor risks a slot-ordering bug nobody catches until prod.
|
|
83
|
+
}
|
|
49
84
|
```
|
|
50
85
|
|
|
51
|
-
|
|
86
|
+
### With agentfootprint — declarative
|
|
52
87
|
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
88
|
+
```typescript
|
|
89
|
+
const agent = Agent.create({ provider, model: 'claude-sonnet-4-5-20250929' })
|
|
90
|
+
.system('You are a support assistant.')
|
|
91
|
+
.steering(toneRule) // always-on
|
|
92
|
+
.instruction(urgentRule) // rule-gated
|
|
93
|
+
.skill(billingSkill) // LLM-activated
|
|
94
|
+
.memory(conversationMemory) // cross-run, multi-tenant
|
|
95
|
+
.tool(weather)
|
|
96
|
+
.build();
|
|
61
97
|
|
|
62
|
-
|
|
98
|
+
await agent.run({ message: userInput, identity: { conversationId } });
|
|
63
99
|
|
|
64
|
-
|
|
100
|
+
// Every iteration is a replayable typed event stream — for free.
|
|
101
|
+
agent.on('agentfootprint.context.injected', (e) =>
|
|
102
|
+
console.log(`[${e.payload.source}] landed in ${e.payload.slot}`));
|
|
103
|
+
```
|
|
65
104
|
|
|
66
|
-
|
|
105
|
+
Same agent. The hand-rolled version is ~80 lines and growing; the declarative version is ~8 and stable. **The framework owns the wiring** — which is exactly why it can observe, replay, and audit it for you.
|
|
67
106
|
|
|
68
|
-
|
|
107
|
+
---
|
|
69
108
|
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
109
|
+
## In 30 seconds — runs offline, no API key
|
|
110
|
+
|
|
111
|
+
```bash
|
|
112
|
+
npm install agentfootprint footprintjs
|
|
113
|
+
```
|
|
75
114
|
|
|
76
|
-
|
|
115
|
+
```typescript
|
|
116
|
+
import { Agent, defineTool, mock } from 'agentfootprint';
|
|
77
117
|
|
|
78
|
-
|
|
118
|
+
const weather = defineTool({
|
|
119
|
+
name: 'weather',
|
|
120
|
+
description: 'Get current weather for a city.',
|
|
121
|
+
inputSchema: {
|
|
122
|
+
type: 'object',
|
|
123
|
+
properties: { city: { type: 'string' } },
|
|
124
|
+
required: ['city'],
|
|
125
|
+
},
|
|
126
|
+
execute: async ({ city }: { city: string }) => `${city}: 72°F, sunny`,
|
|
127
|
+
});
|
|
79
128
|
|
|
80
|
-
|
|
129
|
+
const agent = Agent.create({
|
|
130
|
+
provider: mock({ reply: 'I checked: it is 72°F and sunny.' }), // ← deterministic, no API key
|
|
131
|
+
model: 'mock',
|
|
132
|
+
})
|
|
133
|
+
.system('You answer weather questions using the weather tool.')
|
|
134
|
+
.tool(weather)
|
|
135
|
+
.build();
|
|
81
136
|
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
↓
|
|
85
|
-
2. Agent (ReAct loop) ← inject before EVERY iteration.
|
|
86
|
-
↓
|
|
87
|
-
3. Dynamic Agent ← inject DIFFERENTLY per iteration based on
|
|
88
|
-
tool results, reasoning state, or user input.
|
|
137
|
+
const result = await agent.run({ message: 'Weather in Paris?' });
|
|
138
|
+
console.log(result); // → "I checked: it is 72°F and sunny."
|
|
89
139
|
```
|
|
90
140
|
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
agentfootprint handles all three timing levels through the **same** primitive (`Injection`), evaluated by the **same** engine, observed by the **same** event (`agentfootprint.context.injected`).
|
|
141
|
+
Swap `mock(...)` for `anthropic(...)` / `openai(...)` / `bedrock(...)` / `ollama(...)` for production. Nothing else changes.
|
|
94
142
|
|
|
95
143
|
---
|
|
96
144
|
|
|
97
|
-
##
|
|
98
|
-
|
|
99
|
-
That's the discipline. agentfootprint abstracts it for you in two layers, both built on the [footprintjs](https://github.com/footprintjs/footPrint) flowchart substrate:
|
|
100
|
-
|
|
101
|
-
### Layer 1 — Single agent: one `Injection` primitive
|
|
145
|
+
## The mental model — three slots, four triggers, one Injection
|
|
102
146
|
|
|
103
|
-
|
|
147
|
+
Every LLM call has three slots. **Every "agent feature" — Skill, Steering doc, Instruction, Fact, Memory replay, RAG chunk — is content flowing into one of them, under one of four triggers.** That's the entire abstraction.
|
|
104
148
|
|
|
105
149
|
```
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
150
|
+
┌─────────────────────────────────────┐
|
|
151
|
+
│ │
|
|
152
|
+
│ Your LLM call has 3 slots: │
|
|
153
|
+
│ │
|
|
154
|
+
│ system messages tools │
|
|
155
|
+
│ ▲ ▲ ▲ │
|
|
156
|
+
└───────┼──────────┼──────────┼───────┘
|
|
157
|
+
│ │ │
|
|
158
|
+
│ one │ one │
|
|
159
|
+
│ Injection│ Injection│
|
|
160
|
+
│ fires… │ fires… │
|
|
161
|
+
│ │ │
|
|
162
|
+
┌──────────────┴────┐ ┌──┴───┐ ┌──┴────┐
|
|
163
|
+
│ defineSteering │ │ ... │ │ ... │
|
|
164
|
+
│ defineInstruction │ │ │ │ │
|
|
165
|
+
│ defineSkill │ │ │ │ │
|
|
166
|
+
│ defineFact │ │ │ │ │
|
|
167
|
+
│ defineMemory(read) │ │ │ │ │
|
|
168
|
+
│ defineRAG │ │ │ │ │
|
|
169
|
+
│ …your next idea │ │ │ │ │
|
|
170
|
+
└────────────────────┘ └──────┘ └───────┘
|
|
171
|
+
▲
|
|
172
|
+
…under one of:
|
|
173
|
+
always · rule · on-tool-return · llm-activated
|
|
109
174
|
```
|
|
110
175
|
|
|
111
|
-
|
|
176
|
+
There's no fourth slot. There won't be. Every named pattern in the agent literature — Reflexion, Tree-of-Thoughts, Skills, RAG, Constitutional AI — reduces to *which slot* + *which trigger*. **You learn one model; the field's growth lands as new factories on the same primitive.**
|
|
112
177
|
|
|
113
|
-
|
|
178
|
+
### The four triggers — *who decides* this injection is needed right now?
|
|
114
179
|
|
|
115
|
-
|
|
|
116
|
-
|
|
117
|
-
|
|
|
118
|
-
|
|
|
119
|
-
| **
|
|
120
|
-
| **
|
|
180
|
+
| Trigger | Who decides | Fires when | Real-world example |
|
|
181
|
+
|---|---|---|---|
|
|
182
|
+
| `always` | nobody (always on) | Every iteration, every turn | *"Be friendly and concise."* — `defineSteering` |
|
|
183
|
+
| `rule` | **you**, via predicate | A `(ctx) => boolean` you wrote returns true | *"If user wrote 'urgent', prioritize fastest path."* — `defineInstruction({ activeWhen })` |
|
|
184
|
+
| `on-tool-return` | **the system** | A specific tool just returned (recency-first injection on the next iteration) | *"After `redact_pii` ran, use redacted text only."* — Dynamic ReAct |
|
|
185
|
+
| `llm-activated` | **the LLM** | The LLM called your activation tool (e.g. `read_skill('billing')`) | Skill body + unlocked tools land next iteration — `defineSkill` |
|
|
121
186
|
|
|
122
|
-
|
|
187
|
+
Why exactly four? Because *who decides activation* is a closed axis: nobody / the developer / the system / the LLM. Together those four exhaust the meaningful "when does this content matter?" cases. A fifth would require introducing a new agent of decision — and there isn't one. That's why the primitive surface stays this small even as named patterns proliferate above it.
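
As a rough illustration of the slot × trigger model described above, the sketch below evaluates a list of injections against a context and composes the three slots. The names `Injection`, `Trigger`, `Ctx`, `fires`, and `composeSlots` are hypothetical and written only for this example; they are not agentfootprint's actual types or API.

```typescript
// Hypothetical sketch of "slot x trigger"; not agentfootprint's real API.
type Slot = 'system' | 'messages' | 'tools';

interface Ctx {
  userMessage: string;
  lastToolName?: string;      // tool that returned on the previous iteration
  activatedIds: Set<string>;  // ids the LLM asked for via an activation tool
}

type Trigger =
  | { kind: 'always' }
  | { kind: 'rule'; activeWhen: (ctx: Ctx) => boolean }
  | { kind: 'on-tool-return'; toolName: string }
  | { kind: 'llm-activated' };

interface Injection {
  id: string;
  slot: Slot;
  trigger: Trigger;
  content: string;
}

// Who decides? nobody / the developer / the system / the LLM.
function fires(inj: Injection, ctx: Ctx): boolean {
  const t = inj.trigger;
  switch (t.kind) {
    case 'always':
      return true;
    case 'rule':
      return t.activeWhen(ctx);
    case 'on-tool-return':
      return ctx.lastToolName === t.toolName;
    case 'llm-activated':
      return ctx.activatedIds.has(inj.id);
  }
}

// Run once per iteration: every trigger is re-evaluated and the slots are rebuilt.
function composeSlots(injections: Injection[], ctx: Ctx): Record<Slot, string[]> {
  const slots: Record<Slot, string[]> = { system: [], messages: [], tools: [] };
  for (const inj of injections) {
    if (fires(inj, ctx)) slots[inj.slot].push(inj.content);
  }
  return slots;
}

const injections: Injection[] = [
  { id: 'tone', slot: 'system', trigger: { kind: 'always' }, content: 'Be friendly and concise.' },
  {
    id: 'urgent',
    slot: 'system',
    trigger: { kind: 'rule', activeWhen: (c) => /urgent/i.test(c.userMessage) },
    content: 'Prioritize the fastest path.',
  },
  {
    id: 'redacted',
    slot: 'messages',
    trigger: { kind: 'on-tool-return', toolName: 'redact_pii' },
    content: 'Use redacted text only.',
  },
  { id: 'billing', slot: 'tools', trigger: { kind: 'llm-activated' }, content: 'refund_tool' },
];

const slots = composeSlots(injections, {
  userMessage: 'This is urgent: my invoice is wrong',
  lastToolName: 'redact_pii',
  activatedIds: new Set<string>(),
});
console.log(slots);
```

In this sketch, adding a new "flavor" means adding one more entry to the list; the evaluation loop does not change. That is the property the four-trigger argument relies on.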
|
|
123
188
|
|
|
124
|
-
|
|
189
|
+
---
|
|
125
190
|
|
|
126
|
-
|
|
191
|
+
## Why this isn't just an ergonomics win
|
|
127
192
|
|
|
128
|
-
|
|
129
|
-
|---|---|---|
|
|
130
|
-
| **ReAct** | `Agent` with the default loop | Yao 2022 |
|
|
131
|
-
| **Reflexion** | `Sequence(Agent, critique-LLM, Agent)` | Shinn 2023 |
|
|
132
|
-
| **Tree-of-Thoughts** | `Parallel(Agent × N) + rank` | Yao 2023 |
|
|
133
|
-
| **Self-Consistency** | `Parallel(Agent × N) + majority-vote` | Wang 2022 |
|
|
134
|
-
| **Debate** | `Loop(Agent × 2 + judge)` | Du 2023 |
|
|
135
|
-
| **Map-Reduce** | `Parallel(Agent × N) + merge` | Dean 2004 |
|
|
136
|
-
| **Swarm** | `Agent` whose tools are other `Agent`s | OpenAI 2024 |
|
|
193
|
+
The React parallel goes one layer deeper than "less code." Because the framework owns the wiring, the framework can do things you couldn't do by hand:
|
|
137
194
|
|
|
138
|
-
|
|
195
|
+
| You write declaratively | The framework does for you |
|
|
196
|
+
|---|---|
|
|
197
|
+
| `.steering(rule)` | Evaluates every iteration, composes into `system` slot |
|
|
198
|
+
| `.instruction(activeWhen, prompt)` | Re-evaluates predicate per iteration; routes to `system` or `messages` for attention positioning |
|
|
199
|
+
| `.skill(billing)` | Auto-attaches `read_skill` tool; LLM activates by id; body + unlocked tools land in next iteration |
|
|
200
|
+
| `.memory(causal)` | Persists footprintjs decision-evidence snapshots; embeds queries; cosine-matches on follow-up runs |
|
|
201
|
+
| `.tool(weather)` | Schemas to LLM, dispatches calls, captures args/results, gates by permission policy |
|
|
202
|
+
| `.attach(recorder)` | Subscribes to 47 typed events across 13 domains as the chart traverses |
|
|
203
|
+
| `agent.run({...})` | Captures every decision, every commit, every tool call as a JSON checkpoint that's replayable cross-server |
|
|
139
204
|
|
|
140
|
-
|
|
205
|
+
**The flowchart-pattern substrate** ([footprintjs](https://github.com/footprintjs/footPrint)) is what makes the observation automatic. Every stage execution is a typed event during one DFS traversal — no instrumentation, no post-processing. Same way React DevTools shows you the component tree because React owns the render path, agentfootprint shows you the slot composition because agentfootprint owns the prompt path.
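
To make the "owns the traversal, therefore can record it" idea concrete, here is a minimal sketch of a depth-first traversal that emits typed events as a side effect of running stages. `StageNode`, `TraceEvent`, and `traverse` are assumptions made for this illustration; footprintjs's real stage and event shapes may differ.

```typescript
// Hypothetical sketch; footprintjs's real stage and event types may differ.
interface StageNode {
  name: string;
  run: (input: string) => string;
  children?: StageNode[];
}

type TraceEvent =
  | { type: 'stage.entered'; stage: string; depth: number }
  | { type: 'stage.completed'; stage: string; depth: number; output: string };

// The traversal is the only place stages are executed, so it is also the
// only place that needs to emit events. Stages contain no logging code.
function traverse(
  node: StageNode,
  input: string,
  emit: (e: TraceEvent) => void,
  depth = 0,
): string {
  emit({ type: 'stage.entered', stage: node.name, depth });
  let output = node.run(input);
  for (const child of node.children ?? []) {
    output = traverse(child, output, emit, depth + 1); // depth-first
  }
  emit({ type: 'stage.completed', stage: node.name, depth, output });
  return output;
}

const chart: StageNode = {
  name: 'agent',
  run: (s) => s.trim(),
  children: [
    { name: 'compose-system', run: (s) => `system: be concise | user: ${s}` },
    { name: 'call-llm', run: (s) => `reply(${s.length} chars in)` },
  ],
};

const trace: TraceEvent[] = [];
const finalOutput = traverse(chart, '  Weather in Paris?  ', (e) => trace.push(e));

// Print an indented view of the recorded traversal.
console.log(trace.map((e) => `${'  '.repeat(e.depth)}${e.type} ${e.stage}`).join('\n'));
console.log(finalOutput);
```

Because `trace` is plain data, it can be serialized with `JSON.stringify` and inspected later, which is the same reason a checkpoint can be replayed.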
|
|
141
206
|
|
|
142
207
|
---
|
|
143
208
|
|
|
144
|
-
##
|
|
209
|
+
## What you can build
|
|
145
210
|
|
|
146
|
-
|
|
211
|
+
Three example shapes, all runnable end-to-end with `npm run example examples/<file>.ts`.
|
|
147
212
|
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
213
|
+
### Customer support agent (skills + memory + audit trail)
|
|
214
|
+
|
|
215
|
+
```typescript
|
|
216
|
+
const agent = Agent.create({ provider, model: 'claude-sonnet-4-5-20250929' })
|
|
217
|
+
.system('You are a friendly support assistant.')
|
|
218
|
+
.skill(billingSkill) // LLM activates with read_skill('billing')
|
|
219
|
+
.steering(toneGuidelines) // always-on
|
|
220
|
+
.memory(conversationMemory) // remembers across .run() calls, per-tenant
|
|
221
|
+
.build();
|
|
156
222
|
```
|
|
157
223
|
|
|
158
|
-
|
|
224
|
+
→ [`examples/context-engineering/06-mixed-flavors.ts`](examples/context-engineering/06-mixed-flavors.ts)
|
|
159
225
|
|
|
160
|
-
|
|
161
|
-
|---|---|---|
|
|
162
|
-
| Adding a new flavor (e.g. *guardrail*) | New `GuardrailAgent` class, new event type, new UI surface | One factory file, same `Injection` shape, same `context.injected` event |
|
|
163
|
-
| Cross-run "why was X rejected?" | LLM reconstructs from messages | Replay EXACT past decisions from causal snapshots |
|
|
164
|
-
| Training-data export | Manual, lossy, optional | Same snapshot shape → SFT / DPO / process-RL ready (v2.1+) |
|
|
165
|
-
| Decision evidence | Lost — only the final answer survives | First-class events from `decide()` / `select()` captured during traversal |
|
|
226
|
+
### Research pipeline (multi-agent fan-out + merge)
|
|
166
227
|
|
|
167
|
-
|
|
228
|
+
```typescript
|
|
229
|
+
const research = Parallel.create()
|
|
230
|
+
.branch(optimist).branch(skeptic).branch(historian)
|
|
231
|
+
.merge(synthesizer)
|
|
232
|
+
.build();
|
|
168
233
|
|
|
169
|
-
|
|
234
|
+
await research.run({ message: 'Should we adopt microservices?' });
|
|
235
|
+
```
|
|
236
|
+
|
|
237
|
+
→ [`examples/patterns/05-tot.ts`](examples/patterns/05-tot.ts) (Tree-of-Thoughts) · [`examples/patterns/01-self-consistency.ts`](examples/patterns/01-self-consistency.ts)
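
For readers who want to see what fan-out plus merge amounts to underneath, the sketch below runs several branches concurrently on the same input and hands their outputs to a merge function. `Branch`, `fanOut`, and the stub branches are hypothetical stand-ins for this illustration and say nothing about how `Parallel` is implemented.

```typescript
// Hypothetical sketch; `Branch`, `fanOut`, and the stub branches are illustrative only.
type Branch = (message: string) => Promise<string>;

async function fanOut(
  branches: Record<string, Branch>,
  merge: (results: Record<string, string>) => string,
  message: string,
): Promise<string> {
  // Every branch sees the same input; Promise.all runs them concurrently.
  const pairs = await Promise.all(
    Object.entries(branches).map(
      async ([name, run]): Promise<[string, string]> => [name, await run(message)],
    ),
  );
  const results: Record<string, string> = {};
  for (const [name, output] of pairs) results[name] = output;
  return merge(results);
}

// Stub branches stand in for real agents so the sketch runs offline.
const researchBranches: Record<string, Branch> = {
  optimist: async (m) => `upside of "${m}"`,
  skeptic: async (m) => `risks of "${m}"`,
  historian: async (m) => `precedents for "${m}"`,
};

// The merge step here just lists each perspective in a stable order.
const synthesize = (r: Record<string, string>): string =>
  Object.keys(r)
    .sort()
    .map((name) => `${name}: ${r[name]}`)
    .join('\n');

const verdictPromise = fanOut(researchBranches, synthesize, 'Should we adopt microservices?');
verdictPromise.then((verdict) => console.log(verdict));
```

The same shape covers Self-Consistency (merge by majority vote) and Tree-of-Thoughts (merge by ranking); only the merge function differs.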
|
|
238
|
+
|
|
239
|
+
### Streaming chat agent (token-by-token to a browser)
|
|
240
|
+
|
|
241
|
+
<!-- ┌────────────────────────────────────────────────────────────────┐
|
|
242
|
+
│ 📹 Streaming demo clip here. │
|
|
243
|
+
│ Short loop: user types → tokens stream → tool call │
|
|
244
|
+
│ surfaces mid-stream → final answer. │
|
|
245
|
+
└────────────────────────────────────────────────────────────────┘ -->
|
|
170
246
|
|
|
171
247
|
```typescript
|
|
172
|
-
|
|
173
|
-
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
} from 'agentfootprint';
|
|
248
|
+
agent.on('agentfootprint.stream.token', (e) => res.write(e.payload.content));
|
|
249
|
+
agent.on('agentfootprint.stream.tool_start', (e) => res.write(`\n→ ${e.payload.toolName}...\n`));
|
|
250
|
+
await agent.run({ message: userInput });
|
|
251
|
+
```
|
|
177
252
|
|
|
178
|
-
|
|
179
|
-
// — same agent, same flowchart, no API key needed.
|
|
253
|
+
→ [`docs-site/guides/streaming/`](docs-site/src/content/docs/guides/streaming.mdx)
|
|
180
254
|
|
|
181
|
-
|
|
182
|
-
const weather = defineTool({
|
|
183
|
-
schema: {
|
|
184
|
-
name: 'weather',
|
|
185
|
-
description: 'Current weather for a city.',
|
|
186
|
-
inputSchema: {
|
|
187
|
-
type: 'object',
|
|
188
|
-
properties: { city: { type: 'string' } },
|
|
189
|
-
required: ['city'],
|
|
190
|
-
},
|
|
191
|
-
},
|
|
192
|
-
execute: async (args) => `${(args as { city: string }).city}: 72°F, sunny`,
|
|
193
|
-
});
|
|
255
|
+
---
|
|
194
256
|
|
|
195
|
-
|
|
196
|
-
const tone = defineSteering({
|
|
197
|
-
id: 'tone',
|
|
198
|
-
prompt: 'Be friendly and concise. Acknowledge feelings before facts.',
|
|
199
|
-
});
|
|
257
|
+
## The differentiator: the trace is a cache of the agent's thinking
|
|
200
258
|
|
|
201
|
-
|
|
202
|
-
id: 'urgent',
|
|
203
|
-
activeWhen: (ctx) => /urgent|asap|emergency/i.test(ctx.userMessage),
|
|
204
|
-
prompt: 'The user marked this urgent. Prioritize the fastest resolution.',
|
|
205
|
-
});
|
|
259
|
+
Other agent frameworks' memory remembers *what was said*. agentfootprint's `defineMemory({ type: CAUSAL })` records the **decision evidence** — every value the agent's flowchart captured during the run, persisted as a JSON-portable snapshot.
|
|
206
260
|
|
|
207
|
-
|
|
208
|
-
const memory = defineMemory({
|
|
209
|
-
id: 'short-term',
|
|
210
|
-
type: MEMORY_TYPES.EPISODIC,
|
|
211
|
-
strategy: { kind: MEMORY_STRATEGIES.WINDOW, size: 10 },
|
|
212
|
-
store: new InMemoryStore(),
|
|
213
|
-
});
|
|
261
|
+
That changes the cost structure of *everything that happens after the agent runs.* The expensive thinking happened once; the recorded trace makes consuming that thinking cheap, three different ways:
|
|
214
262
|
|
|
215
|
-
|
|
216
|
-
const agent = Agent.create({
|
|
217
|
-
provider: anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! }),
|
|
218
|
-
model: 'claude-sonnet-4-5-20250929',
|
|
219
|
-
})
|
|
220
|
-
.system('You are a helpful weather assistant.')
|
|
221
|
-
.tool(weather)
|
|
222
|
-
.steering(tone)
|
|
223
|
-
.instruction(urgent)
|
|
224
|
-
.memory(memory)
|
|
225
|
-
.build();
|
|
263
|
+
### 1. Audit / explain — cross-run, six months later, exact past facts
|
|
226
264
|
|
|
227
|
-
|
|
228
|
-
const
|
|
229
|
-
|
|
265
|
+
```typescript
|
|
266
|
+
const causal = defineMemory({
|
|
267
|
+
id: 'causal',
|
|
268
|
+
type: MEMORY_TYPES.CAUSAL,
|
|
269
|
+
strategy: { kind: MEMORY_STRATEGIES.TOP_K, topK: 1, threshold: 0.7, embedder },
|
|
270
|
+
store,
|
|
271
|
+
projection: SNAPSHOT_PROJECTIONS.DECISIONS, // inject "why" only, not "what"
|
|
272
|
+
});
|
|
273
|
+
|
|
274
|
+
// Monday: agent decides loan #42 should be rejected (creditScore=580, threshold=600).
|
|
275
|
+
// Friday: user asks "Why was my application rejected?"
|
|
276
|
+
// → Causal memory loads the exact decision evidence from Monday.
|
|
277
|
+
// → LLM answers from the SOURCE, not reconstruction.
|
|
230
278
|
```
|
|
231
279
|
|
|
232
|
-
|
|
280
|
+
→ [`examples/memory/06-causal-snapshot.ts`](examples/memory/06-causal-snapshot.ts) — runs end-to-end with mock embedder, ~50 lines.
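
The `TOP_K` strategy with `topK: 1, threshold: 0.7` suggests score-thresholded cosine retrieval. The following is a minimal sketch of that idea under stated assumptions: the `StoredSnapshot` shape, the `cosine` and `retrieve` helpers, and the toy 3-dimensional vectors are all invented for illustration and are not the library's implementation.

```typescript
// Hypothetical sketch of threshold + top-K retrieval; not the library's implementation.
interface StoredSnapshot {
  query: string;        // the question asked when the snapshot was recorded
  embedding: number[];  // embedding of that question
  evidence: string;     // recorded decision evidence
}

// Cosine similarity; assumes both vectors have the same length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every snapshot, drop those below the threshold, keep the best K.
function retrieve(
  queryEmbedding: number[],
  snapshots: StoredSnapshot[],
  topK: number,
  threshold: number,
): { snapshot: StoredSnapshot; score: number }[] {
  return snapshots
    .map((snapshot) => ({ snapshot, score: cosine(queryEmbedding, snapshot.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy vectors stand in for real embeddings.
const recorded: StoredSnapshot[] = [
  {
    query: 'Should we approve loan #42?',
    embedding: [0.9, 0.1, 0.0],
    evidence: 'creditScore=580, threshold=600, decision=reject',
  },
  {
    query: 'Weather in Paris?',
    embedding: [0.0, 0.2, 0.9],
    evidence: 'weather tool returned sunny',
  },
];

// Pretend this is the embedding of "Why was my application rejected?"
const hits = retrieve([0.8, 0.2, 0.0], recorded, 1, 0.7);
console.log(hits.map((h) => h.snapshot.evidence));

// An unrelated question clears no threshold, so nothing is injected.
const misses = retrieve([0.0, 1.0, 0.0], recorded, 1, 0.7);
console.log(misses.length);
```

The threshold matters as much as K: without it, the best match is always injected even when it is irrelevant, and the model would be handed evidence from the wrong decision.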
|
|
233
281
|
|
|
234
|
-
|
|
282
|
+
### 2. Cheap-model triage — the trace *is* the reasoning
|
|
235
283
|
|
|
236
|
-
|
|
284
|
+
A trace recorded from your expensive production model (Sonnet-4, GPT-4) is a perfectly good *input* for a small, fast, cheap model (Haiku, GPT-4o-mini) answering follow-up questions about that run. The expensive model already did the work; the cheap model just **reads what's in the trace**.
|
|
237
285
|
|
|
238
|
-
|
|
286
|
+
Reading recorded decision evidence is structurally simpler than re-deriving the answer from first principles, so a smaller model is enough. You can compose the routing yourself: when causal memory injects a snapshot on a follow-up turn, send that turn to a cheaper provider.
|
|
239
287
|
|
|
240
288
|
```typescript
|
|
241
|
-
|
|
242
|
-
|
|
243
|
-
MEMORY_TYPES, MEMORY_STRATEGIES,
|
|
244
|
-
mock, InMemoryStore, // ← the mock surfaces
|
|
245
|
-
} from 'agentfootprint';
|
|
289
|
+
const heavy = anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });
|
|
290
|
+
const cheap = anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });
|
|
246
291
|
|
|
247
|
-
|
|
248
|
-
|
|
249
|
-
|
|
250
|
-
})
|
|
251
|
-
.steering(defineSteering({ id: 'tone', prompt: 'Be friendly.' }))
|
|
252
|
-
.tool(defineTool({
|
|
253
|
-
schema: { name: 'lookup', description: '...', inputSchema: {} },
|
|
254
|
-
execute: async () => 'mock data', // ← inline mock
|
|
255
|
-
}))
|
|
256
|
-
.memory(defineMemory({
|
|
257
|
-
id: 'short-term',
|
|
258
|
-
type: MEMORY_TYPES.EPISODIC,
|
|
259
|
-
strategy: { kind: MEMORY_STRATEGIES.WINDOW, size: 10 },
|
|
260
|
-
store: new InMemoryStore(), // ← ephemeral
|
|
261
|
-
}))
|
|
292
|
+
// Production turn — heavy model, full reasoning, snapshot persisted.
|
|
293
|
+
const productionAgent = Agent.create({ provider: heavy, model: 'claude-sonnet-4-5-20250929' })
|
|
294
|
+
.memory(causal)
|
|
262
295
|
.build();
|
|
296
|
+
await productionAgent.run({ message: 'Should we approve loan #42?', identity });
|
|
263
297
|
|
|
264
|
-
|
|
298
|
+
// Follow-up turn — cheaper model reads the snapshot, lower cost per turn.
|
|
299
|
+
const followUpAgent = Agent.create({ provider: cheap, model: 'claude-haiku-4-5-20251001' })
|
|
300
|
+
.memory(causal)
|
|
301
|
+
.build();
|
|
302
|
+
await followUpAgent.run({ message: 'Why was loan #42 rejected?', identity });
|
|
265
303
|
```
|
|
266
304
|
|
|
267
|
-
|
|
305
|
+
This is memoization for agent reasoning — do the expensive work once, serve many queries from the cached result. Across a production system that handles audit / explain / "why did the agent do X?" traffic, this is real money.
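
The routing decision described above can be sketched as a small pure function. Everything here is an assumption for illustration: `TurnContext`, `Route`, and `routeTurn` are hypothetical names, and how you detect that a snapshot was injected for a turn is left open. The model ids are the ones used in the example above.

```typescript
// Hypothetical routing helper; not part of agentfootprint.
interface TurnContext {
  injectedSnapshotIds: string[]; // snapshots causal memory injected for this turn
}

interface Route {
  tier: 'heavy' | 'cheap';
  model: string;
}

const HEAVY: Route = { tier: 'heavy', model: 'claude-sonnet-4-5-20250929' };
const CHEAP: Route = { tier: 'cheap', model: 'claude-haiku-4-5-20251001' };

// If recorded evidence is already in the prompt, the turn is a "read the
// trace" task, so the cheaper model is chosen. Otherwise reason from scratch.
function routeTurn(turn: TurnContext): Route {
  return turn.injectedSnapshotIds.length > 0 ? CHEAP : HEAVY;
}

const firstTurn = routeTurn({ injectedSnapshotIds: [] });
const followUp = routeTurn({ injectedSnapshotIds: ['loan-42-monday'] });
console.log(firstTurn.tier, followUp.tier);
```

Keeping the rule a pure function makes it easy to unit-test and to tighten later, for example by also requiring a minimum retrieval score before downgrading.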
|
|
268
306
|
|
|
269
|
-
|
|
307
|
+
### 3. Training data — every successful run becomes a labeled trajectory
|
|
270
308
|
|
|
271
|
-
|
|
|
272
|
-
|---|---|---|
|
|
273
|
-
| **LLM provider** | `mock({ reply })` · `mock({ replies })` for scripted ReAct | `anthropic()` · `openai()` · `bedrock()` · `ollama()` |
|
|
274
|
-
| **Embedder** | `mockEmbedder()` | OpenAI / Cohere / Bedrock embedder (factories on roadmap) |
|
|
275
|
-
| **Memory store** | `InMemoryStore` | `RedisStore` (`agentfootprint/memory-redis`) · `AgentCoreStore` (`agentfootprint/memory-agentcore`) · DynamoDB / Postgres / Pinecone (planned) |
|
|
276
|
-
| **MCP server** | `mockMcpClient({ tools })` — in-memory, no SDK | `mcpClient({ transport })` to a real server |
|
|
277
|
-
| **Tool execution** | `defineTool({ execute: async () => '...' })` | Same `defineTool`, real implementation |
|
|
309
|
+
The same snapshot data shape is the input to SFT / DPO / process-RL training pipelines (`causalMemory.exportForTraining({ format: 'sft' | 'dpo' | 'process' })` is on the roadmap — see below). You don't run a separate data-collection phase — **your production traffic IS your training set.** Every successful customer interaction is a positive trajectory; every escalation or override is a counter-example.
|
|
278
310
|
|
|
279
|
-
|
|
280
|
-
|
|
281
|
-
> Why this matters: it's the difference between *learning context engineering by trying things* and *learning by burning your API budget*. The library treats $0 development as a first-class workflow, not an afterthought.
|
|
311
|
+
The same JSON shape that powered the audit trail and the cheap-model follow-up is the training payload. One recording, three downstream consumers, no extra instrumentation.
|
|
282
312
|
|
|
283
313
|
---
|
|
284
314
|
|
|
285
|
-
##
|
|
315
|
+
## Mocks first, prod second
|
|
286
316
|
|
|
287
|
-
|
|
317
|
+
Generative AI development is expensive when every iteration hits a paid API. agentfootprint is designed so you build the entire app — agent, context engineering, memory, RAG — against in-memory mocks, prove the logic end-to-end with **zero API cost**, then swap real infrastructure in one boundary at a time.
|
|
288
318
|
|
|
289
|
-
|
|
|
290
|
-
|
|
291
|
-
| `
|
|
292
|
-
| `
|
|
293
|
-
| `
|
|
294
|
-
|
|
|
295
|
-
|
|
296
|
-
| Strategy | How content is selected |
|
|
297
|
-
|---|---|
|
|
298
|
-
| `WINDOW` | Last N entries (rule, no LLM, no embeddings) |
|
|
299
|
-
| `BUDGET` | Fit-to-tokens (decider) |
|
|
300
|
-
| `SUMMARIZE` | LLM compresses older turns |
|
|
301
|
-
| `TOP_K` | Score-threshold semantic retrieval |
|
|
302
|
-
| `EXTRACT` | LLM distills facts/beats on write |
|
|
303
|
-
| `DECAY` | Recency-weighted (planned) |
|
|
304
|
-
| `HYBRID` | Compose multiple |
|
|
305
|
-
|
|
306
|
-
**Causal memory** is the differentiator: footprintjs's `decide()` and `select()` capture decision evidence as first-class events. Causal memory persists those snapshots tagged with the user's original query. New questions cosine-match past queries → inject the prior decision evidence → the LLM answers from EXACT past facts. Cross-run "why was X rejected last week?" follow-ups answer correctly without reconstruction.
|
|
319
|
+
| Boundary | Dev (mock) | Prod (swap one line) |
|
|
320
|
+
|---|---|---|
|
|
321
|
+
| LLM provider | `mock({ reply })` · `mock({ replies })` for scripted multi-turn | `anthropic()` · `openai()` · `bedrock()` · `ollama()` |
|
|
322
|
+
| Embedder | `mockEmbedder()` | OpenAI / Cohere / Bedrock embedder (factories on roadmap) |
|
|
323
|
+
| Memory store | `InMemoryStore` | `RedisStore` (`agentfootprint/memory-redis`) · `AgentCoreStore` (`agentfootprint/memory-agentcore`) · DynamoDB / Postgres / Pinecone (planned) |
|
|
324
|
+
| MCP server | `mockMcpClient({ tools })` — in-memory, no SDK | `mcpClient({ transport })` to a real server |
|
|
325
|
+
| Tool execution | inline closure | real implementation |
|
|
307
326
|
|
|
308
|
-
The
|
|
327
|
+
The flowchart, recorders, narrative, and tests don't change between dev and prod. **Ship the patterns first; pay for tokens last.**
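
The reason the swap stays a one-line change is that application code depends on a boundary, not on a vendor. Below is a minimal sketch of that boundary with a scripted mock behind it. `ChatProvider`, `scriptedMock`, and `answer` are hypothetical names for this illustration; agentfootprint's real provider interface and its `mock(...)` factory may look different.

```typescript
// Hypothetical provider boundary; agentfootprint's real provider interface may differ.
interface ChatProvider {
  complete(prompt: string): Promise<string>;
}

// Scripted mock: returns the queued replies in order, then repeats the last one.
function scriptedMock(replies: string[]): ChatProvider {
  let turn = 0;
  return {
    async complete() {
      const reply = replies[Math.min(turn, replies.length - 1)] ?? '';
      turn += 1;
      return reply;
    },
  };
}

// Application logic depends only on the interface, so it does not change
// when the mock is replaced by a real provider.
async function answer(provider: ChatProvider, question: string): Promise<string> {
  const reply = await provider.complete(`You are a support assistant.\nUser: ${question}`);
  return reply.trim();
}

// Dev wiring: deterministic, offline, no API key.
const devProvider: ChatProvider = scriptedMock(['  It is 72°F and sunny.  ', 'Anything else?']);

const transcript = (async () => [
  await answer(devProvider, 'Weather in Paris?'),
  await answer(devProvider, 'Thanks'),
  await answer(devProvider, 'Bye'),
])();
transcript.then((lines) => console.log(lines));
```

Because the mock is deterministic, the same script can back an end-to-end test; a production build would construct a different `ChatProvider` and leave `answer` untouched.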
|
|
309
328
|
|
|
310
329
|
---
|
|
311
330
|
|
|
312
|
-
##
|
|
331
|
+
## Pick your starting door
|
|
313
332
|
|
|
314
|
-
|
|
|
333
|
+
| If you are... | Start here |
|
|
315
334
|
|---|---|
|
|
316
|
-
| **
|
|
317
|
-
| **
|
|
318
|
-
| **
|
|
319
|
-
| **
|
|
320
|
-
|
|
335
|
+
| 🎓 **New to agents** | [5-minute Quick Start](https://footprintjs.github.io/agentfootprint/getting-started/quick-start/) → first agent runs offline |
|
|
336
|
+
| 🛠️ **A LangChain / CrewAI / LangGraph user** | [Migration sketch](https://footprintjs.github.io/agentfootprint/getting-started/vs/) — same patterns, fewer classes |
|
|
337
|
+
| 🏗️ **Architecting an enterprise rollout** | [Production guide](https://footprintjs.github.io/agentfootprint/guides/deployment/) — multi-tenant identity, audit trails, redaction, OTel |
|
|
338
|
+
| 🔬 **Researcher / extending the framework** | [Extension guide](https://footprintjs.github.io/agentfootprint/contributing/extension-guide/) — add a new flavor in 50 lines |
|
|
339
|
+
|
|
340
|
+
Every code snippet on the docs site is imported from a real, runnable file in [`examples/`](examples/) — every example is also an end-to-end test in CI. There is no docs-only code in this repo.
|
|
321
341
|
|
|
322
342
|
---
|
|
323
343
|
|
|
324
|
-
## What
|
|
344
|
+
## What ships today
|
|
345
|
+
|
|
346
|
+
- **2 primitives** — `LLMCall`, `Agent` (the ReAct loop)
|
|
347
|
+
- **4 compositions** — `Sequence`, `Parallel`, `Conditional`, `Loop`
|
|
348
|
+
- **6 LLM providers plus a mock** — Anthropic · OpenAI · Bedrock · Ollama · Browser-Anthropic · Browser-OpenAI, and Mock (with `mock({ replies })` for scripted multi-turn)
|
|
349
|
+
- **One Injection primitive** — `defineSkill` / `defineSteering` / `defineInstruction` / `defineFact` (one engine, four typed factories, all reduce to `{ trigger, slot }`)
|
|
350
|
+
- **One Memory factory** — `defineMemory({ type, strategy, store })` — 4 types (including **Causal**) × 7 strategies
|
|
351
|
+
- **RAG** — `defineRAG()` + `indexDocuments()` (sugar over Semantic + TopK)
|
|
352
|
+
- **MCP** — `mcpClient({ transport })` for real servers · `mockMcpClient({ tools })` for in-memory development
|
|
353
|
+
- **Memory store adapters** — `InMemoryStore` · `RedisStore` (subpath `agentfootprint/memory-redis`) · `AgentCoreStore` (subpath `agentfootprint/memory-agentcore`)
|
|
354
|
+
- **47 typed observability events** across 13 domains — context · stream · agent · cost · skill · permission · eval · memory · …
|
|
355
|
+
- **Pause / resume** — JSON-serializable checkpoints; pause via `askHuman` / `pauseHere`, resume hours later on a different server
|
|
356
|
+
- **Resilience** — `withRetry`, `withFallback`, `resilientProvider`
|
|
357
|
+
- **AI-coding-tool support** — bundled instructions for Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot
|
|
358
|
+
- **Runnable examples** organized by DNA layer (core · core-flow · patterns · context-engineering · memory · features) — every example is also an end-to-end CI test
|
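The resilience item in the list above names retry and fallback helpers. The general technique can be sketched as follows, under stated assumptions: `retrying` and `fallingBack` are hypothetical stand-ins written for this example and do not reproduce the signatures of the library's `withRetry` or `withFallback`. Calls are synchronous here for brevity.

```typescript
// A zero-argument call that either returns a value or throws.
type Call<T> = () => T;

// Wrap a call so it is attempted up to maxAttempts times before giving up.
function retrying<T>(call: Call<T>, maxAttempts: number): Call<T> {
  return () => {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call();
      } catch (err) {
        lastError = err; // remember the failure and try again
      }
    }
    throw lastError;
  };
}

// Wrap a call so a secondary call is used when the primary throws.
function fallingBack<T>(primary: Call<T>, secondary: Call<T>): Call<T> {
  return () => {
    try {
      return primary();
    } catch {
      return secondary();
    }
  };
}

// Toy calls: one fails twice then succeeds, one always fails, one always works.
let flakyCalls = 0;
const flaky: Call<string> = () => {
  flakyCalls++;
  if (flakyCalls < 3) throw new Error("transient failure");
  return "primary ok";
};
const alwaysDown: Call<string> = () => {
  throw new Error("provider down");
};
const backup: Call<string> = () => "backup ok";

const recovered = retrying(flaky, 3)();
const degraded = fallingBack(retrying(alwaysDown, 2), backup)();
console.log(recovered, degraded);
```

Composing the two wrappers, retry inside and fallback outside, means the backup is consulted only after the primary has exhausted its attempts.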
|
359
|
+
|
|
360
|
+
## What's next (clearly marked roadmap)
|
|
361
|
+
|
|
362
|
+
| Theme | Focus |
|
|
363
|
+
|---|---|
|
|
364
|
+
| **Reliability subsystem** | `CircuitBreaker` · 3-tier output fallback · auto-resume-on-error · Skills upgrades (`surfaceMode`, `refreshPolicy`) · `MockEnvironment` composer |
|
|
365
|
+
| **Causal training-data exports** | `causalMemory.exportForTraining({ format: 'sft' \| 'dpo' \| 'process' })` — production traffic becomes labeled SFT / DPO / process-RL trajectories |
|
|
366
|
+
| **Governance** | `Policy` · `BudgetTracker` · DynamoDB / Postgres / Pinecone memory adapters · production embedder factories |
|
|
367
|
+
| **Deep Agents · A2A protocol** | Planning-before-execution · agent-to-agent protocol · Lens UI deep-link |
|
|
325
368
|
|
|
326
|
-
|
|
327
|
-
- **3 compositions + Loop** — Sequence · Parallel · Conditional · Loop
|
|
328
|
-
- **6 LLM providers** — Anthropic · OpenAI · Bedrock · Ollama · Browser-Anthropic · Browser-OpenAI · Mock (for $0 testing)
|
|
329
|
-
- **InjectionEngine** — one `Injection` primitive + 4 typed factories (`defineSkill` / `defineSteering` / `defineInstruction` / `defineFact`); covers Dynamic ReAct via `on-tool-return` triggers
|
|
330
|
-
- **Memory subsystem** — `defineMemory` factory, 4 types (Episodic / Semantic / Narrative / **Causal** ⭐) × 7 strategies (Window / Budget / Summarize / TopK / Extract / Decay / Hybrid)
|
|
331
|
-
- **Multi-agent through control flow** — no separate `MultiAgentSystem` class; agents compose via Sequence / Parallel / Conditional / Loop
|
|
332
|
-
- **6 canonical patterns** runnable as examples — ReAct · Reflexion · ToT · Self-Consistency · Debate · Map-Reduce · Swarm
|
|
333
|
-
- **Observability** — 47 typed events × 13 domains; recorders for context · stream · agent · cost · skill · permission · eval · memory
|
|
334
|
-
- **Resilience helpers** — `withRetry`, `withFallback`, `resilientProvider`
|
|
335
|
-
- **Pause / resume** — JSON-serializable checkpoints; agent can pause via `askHuman`/`pauseHere` and resume hours later on a different server
|
|
336
|
-
- **AI-coding-tool support** — bundled instructions for Claude Code / Cursor / Windsurf / Cline / Kiro / Copilot
|
|
337
|
-
- **33 runnable end-to-end examples** — every example is a real test exercising the documented surface
|
|
369
|
+
For shipped features per release see [CHANGELOG.md](./CHANGELOG.md). Roadmap items are *not* claims about the current API — if a feature isn't in `npm install agentfootprint` today, it's listed here, not in the documentation.
|
|
338
370
|
|
|
339
|
-
|
|
371
|
+
---
|
|
340
372
|
|
|
341
|
-
|
|
342
|
-
|---|---|
|
|
343
|
-
| ~~v2.1~~ ✓ | RAG flavor (`defineRAG`) — shipped in 2.1.0 |
|
|
344
|
-
| v2.2 | MCP integration (`mcpClient`) ✓ · Redis memory store adapter · CircuitBreaker primitive · 3-tier structured-output fallback |
|
|
345
|
-
| v2.2 | Governance subsystem (`Policy`, `BudgetTracker`, role-based access) · DynamoDB / Postgres / Pinecone store adapters |
|
|
346
|
-
| v2.3 | Causal training-data exports — `causalMemory.exportForTraining({ format: 'sft' \| 'dpo' \| 'process' })` for HuggingFace / OpenAI / Anthropic batch fine-tune |
|
|
347
|
-
| v2.4+ | Deep Agents (planning-before-execution) · A2A protocol · Lens UI deep-link |
|
|
373
|
+
## Built on
|
|
348
374
|
|
|
349
|
-
|
|
375
|
+
[footprintjs](https://github.com/footprintjs/footPrint) — the flowchart pattern for backend code. The decision-evidence capture, narrative recording, and time-travel checkpointing this library uses are footprintjs primitives. The same way autograd's forward-pass traversal is what makes gradient inspection automatic, footprintjs's flowchart traversal is what makes agentfootprint's typed-event stream and replayable traces automatic. You don't need to learn footprintjs to use agentfootprint — but if you want to build your own primitives at this depth, [start there](https://footprintjs.github.io/footPrint/).
|
|
376
|
+
|
|
377
|
+
## License
|
|
350
378
|
|
|
351
|
-
[MIT](./LICENSE)
|
|
379
|
+
[MIT](./LICENSE) © [Sanjay Krishna Anbalagan](https://github.com/sanjay1909)
|