agentfootprint 2.6.1 → 2.6.3
- package/README.md +143 -93
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -28,16 +28,38 @@
|
|
|
28
28
|
| **Prisma / SQLAlchemy** | Schema + query intent | SQL generation, connection pooling, migrations |
|
|
29
29
|
| **Kubernetes** | Desired state (manifests) | Scheduling, health checks, reconciliation loop |
|
|
30
30
|
| **React** | Components + state | DOM diffing, render path, event delegation |
|
|
31
|
-
| **agentfootprint** | Injections (slot × trigger) | Slot composition, iteration loop, observation, replay |
|
|
31
|
+
| **agentfootprint** | Injections (slot × trigger × cache) | Slot composition, iteration loop, prompt caching, observation, replay |
|
|
32
32
|
|
|
33
|
-
The closest structural parallel is **autograd**: you describe the graph, the framework traverses it, and *because the framework owns the traversal it can record everything that happens for free*. Same idea here — you describe Injections, agentfootprint runs the iteration loop, and the typed-event stream + replayable checkpoints are
|
|
33
|
+
The closest structural parallel is **autograd**: you describe the graph, the framework traverses it, and *because the framework owns the traversal it can record everything that happens for free*. Same idea here — you describe Injections, agentfootprint runs the iteration loop, and the typed-event stream + replayable checkpoints + provider-agnostic prompt caching are consequences, not extra features.
|
|
34
34
|
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## Why it's shaped this way — two pillars
|
|
38
|
+
|
|
39
|
+
The abstraction lineage above tells you *what* this library is. The two pillars below explain *why* it's structured the way it is. Neither is decorative — both are operationalized in the runtime.
|
|
40
|
+
|
|
41
|
+
### THE WHY — connected data (the user-visible win)
|
|
42
|
+
|
|
43
|
+
Palantir's 2003 thesis: enterprise insight is bottlenecked by **data fragmentation**, not analyst skill. Connecting siloed data into one ontology collapses weeks of manual correlation into minutes.
|
|
44
|
+
|
|
45
|
+
LLM agents face the same fragmentation problem at *runtime*. Disconnected tool state, lost decision evidence, scattered execution context — the agent re-discovers relationships every iteration, burning tokens. agentfootprint connects four classes of agent data so the next token compounds the connection instead of paying for it again:
|
|
46
|
+
|
|
47
|
+
| Class | Mechanism |
|
|
48
|
+
|---|---|
|
|
49
|
+
| **State** | `TypedScope<S>` — single typed shared state, every read/write tracked |
|
|
50
|
+
| **Decisions** | `decide()` evidence — every branch carries the inputs that triggered it |
|
|
51
|
+
| **Execution** | `commitLog` + `runtimeStageId` — every state mutation keyed to its writing stage |
|
|
52
|
+
| **Memory** | Causal memory — full footprintjs snapshots persisted, cosine-matched on follow-up runs |
|
|
53
|
+
|
|
54
|
+
**Connected data → fewer iterations → fewer tokens.** Same arithmetic Palantir was attacking in 2003, different decade, different layer.
|
|
55
|
+
|
|
56
|
+
### THE HOW — modular boundaries (the engineering discipline)
|
|
57
|
+
|
|
58
|
+
Liskov's ADT (1974) and LSP (1987) work gives a vocabulary for boundaries that don't leak. Every framework boundary in agentfootprint is an LSP-substitutable interface — `LLMProvider`, `ToolProvider`, `CacheStrategy`, `Recorder`, `MemoryStore` — so you can swap implementations without changing agent code. Subflows are CLU clusters with explicit input/output mappers; nothing leaks across the boundary.
|
|
59
|
+
|
|
60
|
+
Together: **clean modules + connected data = a runtime that's both fast (Palantir multiplier) and reasonable (Liskov locality).** Boundaries alone produce a clean but dumb library. Connections alone produce a fast but unmaintainable one.
|
|
61
|
+
|
|
62
|
+
Detailed write-ups: [`docs/inspiration/`](./docs/inspiration/) — *"Connected Data — the Palantir lineage"* and *"Modularity — the Liskov lineage"*. Not required reading for using the library; required reading for extending or evaluating it.
|
|
41
63
|
|
|
42
64
|
---
|
|
43
65
|
|
|
@@ -74,9 +96,10 @@ async function runAgentTurn(userMsg, state) {
|
|
|
74
96
|
const memEntries = await store.list({ tenant, conversationId });
|
|
75
97
|
messages.unshift({ role: 'system', content: formatMemory(memEntries.slice(-10)) });
|
|
76
98
|
|
|
77
|
-
// 6.
|
|
78
|
-
// 7.
|
|
79
|
-
// 8.
|
|
99
|
+
// 6. Decide what's cacheable; place provider-specific cache_control markers...
|
|
100
|
+
// 7. Call LLM, route tool calls, loop, capture state for resume...
|
|
101
|
+
// 8. Persist new turn back to memory tagged with identity...
|
|
102
|
+
// 9. Wire SSE for streaming, attach observability hooks...
|
|
80
103
|
|
|
81
104
|
// No replay. No audit trail. Per agent, hundreds of lines.
|
|
82
105
|
// Every refactor risks a slot-ordering bug nobody catches until prod.
|
|
@@ -102,7 +125,7 @@ agent.on('agentfootprint.context.injected', (e) =>
|
|
|
102
125
|
console.log(`[${e.payload.source}] landed in ${e.payload.slot}`));
|
|
103
126
|
```
|
|
104
127
|
|
|
105
|
-
Same agent. The hand-rolled version is ~80 lines and growing; the declarative version is ~8 and stable. **The framework owns the wiring** — which is exactly why it can observe, replay, and
|
|
128
|
+
Same agent. The hand-rolled version is ~80 lines and growing; the declarative version is ~8 and stable. **The framework owns the wiring** — which is exactly why it can observe, replay, audit, and cache it for you.
|
|
106
129
|
|
|
107
130
|
---
|
|
108
131
|
|
|
@@ -199,34 +222,37 @@ The React parallel goes one layer deeper than "less code." Because the framework
|
|
|
199
222
|
| `.skill(billing)` | Auto-attaches `read_skill` tool; LLM activates by id; body + unlocked tools land in next iteration |
|
|
200
223
|
| `.memory(causal)` | Persists footprintjs decision-evidence snapshots; embeds queries; cosine-matches on follow-up runs |
|
|
201
224
|
| `.tool(weather)` | Schemas to LLM, dispatches calls, captures args/results, gates by permission policy |
|
|
202
|
-
| `.attach(recorder)` | Subscribes to
|
|
225
|
+
| `.attach(recorder)` | Subscribes to typed events across many domains as the chart traverses |
|
|
203
226
|
| `agent.run({...})` | Captures every decision, every commit, every tool call as a JSON checkpoint that's replayable cross-server |
|
|
204
227
|
|
|
205
|
-
LangChain assembles prompts once per turn. LangGraph composes state per node, not per loop iteration. CrewAI's Agent is tool-aware but not iteration-aware. **Per-iteration recomposition of all three slots based on the latest tool result + accumulated state is structurally distinct.**
|
|
228
|
+
LangChain assembles prompts once per turn. LangGraph composes state per node, not per loop iteration. CrewAI's Agent is tool-aware but not iteration-aware. **Per-iteration recomposition of all three slots based on the latest tool result + accumulated state is structurally distinct.** Frameworks that compose state per-node rather than per-loop-iteration can't recompute cache markers in lockstep with the active injection set — the structural prerequisite for the cache layer below.
|
|
206
229
|
|
|
207
230
|
### What "every iteration" makes possible
|
|
208
231
|
|
|
209
|
-
Use cases that emerge once the loop re-evaluates everything:
|
|
210
|
-
|
|
211
232
|
| Use case | The mechanism |
|
|
212
233
|
|---|---|
|
|
213
234
|
| **Tool-by-tool LLM steering** — agent called `redact_pii` → next iter, system prompt gets *"use redacted text, don't paraphrase original"* | `defineInstruction({ activeWhen: (ctx) => ctx.lastToolResult?.toolName === 'redact_pii' })` |
|
|
214
|
-
| **Adaptive tool exposure** — agent activated `billing` skill → next iter, tool list switches to billing-only set (3× context-budget reduction) | `defineSkill({...}) +
|
|
235
|
+
| **Adaptive tool exposure** — agent activated `billing` skill → next iter, tool list switches to billing-only set (3× context-budget reduction) | `defineSkill({...})` + LLM-activated trigger |
|
|
215
236
|
| **Cost guardrails** — accumulated cost > threshold → next iter, system prompt adds *"be concise"* | `defineInstruction({ activeWhen: (ctx) => ctx.accumulatedCostUsd > 0.50 })` |
|
|
216
237
|
| **Iterative format refinement** — iter 1 emitted JSON → iter 2 prompt adds *"continue this format"*; iter 5 prompt drops it | predicate over `ctx.iteration` + `ctx.history` |
|
|
217
238
|
| **Failure adaptation** — tool X returned an error → next iter, prompt adds *"don't try X again; use Y"* | `on-tool-return` predicate inspecting `ctx.lastToolResult` for error markers |
|
|
218
239
|
| **Few-shot evolution** — iter 1 prompt has example for the rare case → iter 2 drops it because example is consumed | predicate that tracks which examples have already fired |
|
|
219
|
-
| **Skill body refresh** — long-context run, system-prompt skill body decayed → re-inject via tool result | `defineSkill({ refreshPolicy: { afterTokens: 50_000 } })` (v2.5+) |
|
|
220
240
|
|
|
221
241
|
The framework owns the loop. The framework re-evaluates triggers every iteration. Tool results reshape the next iteration's prompt. **That's what makes context engineering compositional instead of static.**
|
|
222
242
|
|
|
223
243
|
**The flowchart-pattern substrate** ([footprintjs](https://github.com/footprintjs/footPrint)) is what makes the observation automatic. Every stage execution is a typed event during one DFS traversal — no instrumentation, no post-processing. Same way React DevTools shows you the component tree because React owns the render path, agentfootprint shows you the slot composition because agentfootprint owns the prompt path.
|
|
224
244
|
|
|
225
|
-
###
|
|
245
|
+
### When to use Dynamic ReAct
|
|
246
|
+
|
|
247
|
+
Use it when **your tools have dependencies** — when one tool's output implies which tool to call next.
|
|
248
|
+
|
|
249
|
+
A skill body like *"if `get_port_errors` reports CRC > 0, call `get_sfp_diag` next; if it reports `signal_loss`, call `get_flogi` next"* IS a dependency graph. The skill encodes the workflow; Dynamic ReAct gates the tool surface to that workflow at runtime.
|
|
250
|
+
|
|
251
|
+
If your tools are independent (the LLM can call any of them at any time, ordering doesn't matter), Classic ReAct is fine and simpler — don't reach for Skills.
|
|
252
|
+
|
|
253
|
+
### Side-by-side example
|
|
226
254
|
|
|
227
|
-
[`examples/dynamic-react/`](./examples/dynamic-react/) ships two
|
|
228
|
-
mock-backed scripts solving the same SRE task with the same scripted
|
|
229
|
-
answers. Per-iteration tool-count progression makes the shape clear:
|
|
255
|
+
[`examples/dynamic-react/`](./examples/dynamic-react/) ships two mock-backed scripts solving the same task. Per-iteration tool-count progression makes the shape clear:
|
|
230
256
|
|
|
231
257
|
```
|
|
232
258
|
Classic ReAct Dynamic ReAct
|
|
@@ -238,75 +264,94 @@ iter 4: 12 tools shown iter 4: 5 tools
|
|
|
238
264
|
iter 5: 5 tools (final answer)
|
|
239
265
|
```
|
|
240
266
|
|
|
241
|
-
The
|
|
242
|
-
**Classic ReAct has no equivalent**: every registered tool ships
|
|
243
|
-
on every call.
|
|
267
|
+
The unactivated skills' tools never enter the LLM context. Classic ReAct has no equivalent — every registered tool ships on every call.
|
|
244
268
|
|
|
245
|
-
|
|
269
|
+
What Dynamic gives you that Classic doesn't:
|
|
270
|
+
|
|
271
|
+
1. **Constant per-call payload** bounded by active-skill size, not registry size. Scales to 50+ tool catalogs.
|
|
272
|
+
2. **Deterministic routing** — `read_skill` forces scope before data tools fire. LLM can't drift to off-topic tools.
|
|
273
|
+
3. **Auditability** — each iteration's tool list is a pure function of `activatedInjectionIds`. Recorded, replayable, diff-able across runs.
|
|
274
|
+
4. **Less hallucination** — fewer tools per call = more in-distribution on the active task.
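The auditability point above can be made concrete with a minimal sketch. It assumes a hypothetical `Skill` shape and a hypothetical `visibleTools` helper (neither is taken from the library); only `read_skill` and `activatedInjectionIds` come from the README. The idea it illustrates is that the per-iteration tool list depends on nothing except the set of activated ids, so it can be recomputed and compared across runs.

```typescript
// Hypothetical shapes for illustration; the library's real types may differ.
interface Skill {
  id: string;
  tools: string[];
}

const READ_SKILL = 'read_skill';

// Pure function: the same activated ids always yield the same tool list.
function visibleTools(skills: Skill[], activatedInjectionIds: string[]): string[] {
  const tools: string[] = [READ_SKILL]; // the activation tool is always exposed
  for (const skill of skills) {
    if (activatedInjectionIds.indexOf(skill.id) === -1) continue; // skill not activated
    for (const tool of skill.tools) {
      if (tools.indexOf(tool) === -1) tools.push(tool); // de-duplicate shared tools
    }
  }
  return tools;
}

const catalog: Skill[] = [
  { id: 'billing', tools: ['get_invoice', 'refund'] },
  { id: 'network', tools: ['get_port_errors', 'get_sfp_diag'] },
];

console.log(visibleTools(catalog, []));          // [ 'read_skill' ]
console.log(visibleTools(catalog, ['billing'])); // [ 'read_skill', 'get_invoice', 'refund' ]
```

Because the function has no hidden inputs, replaying a recorded list of activated ids reproduces the exact tool surface the LLM saw on that iteration.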
|
|
275
|
+
|
|
276
|
+
> **Compounds with the cache layer (next section).** Because the framework owns both the per-iteration slot recomposition AND the cache marker placement, cache invalidation tracks the live skill state — when a skill deactivates, only its prefix invalidates; the rest of the cached system prompt stays warm.
|
|
277
|
+
|
|
278
|
+
Run it:
|
|
279
|
+
|
|
280
|
+
```sh
|
|
281
|
+
TSX_TSCONFIG_PATH=examples/runtime.tsconfig.json npx tsx examples/dynamic-react/01-classic-react.ts
|
|
282
|
+
TSX_TSCONFIG_PATH=examples/runtime.tsconfig.json npx tsx examples/dynamic-react/02-dynamic-react.ts
|
|
283
|
+
```
|
|
246
284
|
|
|
247
|
-
|
|
248
|
-
against Anthropic with Haiku 4.5, Sonnet 4.5, and Opus 4.5 in both
|
|
249
|
-
modes. Same prompt, same scenario data, real `usage.input_tokens`:
|
|
285
|
+
---
|
|
250
286
|
|
|
251
|
-
|
|
252
|
-
| ----------- | ---------: | ---------: | -----: | ---------------------------------- |
|
|
253
|
-
| Haiku 4.5 | 25,755 | 36,341 | +41% | Classic 4 iters / Dynamic 6 iters |
|
|
254
|
-
| Sonnet 4.5 | 36,690 | 28,486 | −22% | Classic went serial; Dynamic wins |
|
|
255
|
-
| Opus 4.5 | 20,114 | 28,401 | +41% | Opus's parallel batching is best |
|
|
287
|
+
## The cache layer — provider-agnostic prompt caching
|
|
256
288
|
|
|
257
|
-
|
|
258
|
-
total input-token cost depends on **how aggressively the model
|
|
259
|
-
parallelizes Classic mode**. Opus parallelizes best (3 iters, all
|
|
260
|
-
data tools in one round) so Classic minimizes iterations and wins.
|
|
261
|
-
Sonnet went serial that turn (5 iters) so Dynamic won. Haiku
|
|
262
|
-
parallelized well (4 iters) so Classic won.
|
|
289
|
+
Anthropic gives you `cache_control` blocks. OpenAI auto-caches. Bedrock has its own format. Each provider's docs are 30+ pages, the wire formats are different, and the right cache placement depends on what's stable across iterations vs what's volatile.
|
|
263
290
|
|
|
264
|
-
|
|
265
|
-
mood). Dynamic stays predictable at 5–6 iters across all models.
|
|
291
|
+
agentfootprint gives you **one declarative API across all three** (and a `NoOp` wildcard for the rest). You annotate intent at the injection level; the framework computes the cacheable boundary every iteration; per-provider strategies translate to the right wire format.
|
|
266
292
|
|
|
267
|
-
###
|
|
293
|
+
### Declarative cache directives
|
|
268
294
|
|
|
269
|
-
|
|
270
|
-
`[18, 18, 18]` for Classic, regardless of registry size. Scales
|
|
271
|
-
to 50+ tool catalogs without ballooning per-call cost.
|
|
272
|
-
2. **Deterministic routing**: `read_skill` forces scope before data
|
|
273
|
-
tools fire. LLM can't drift to off-topic tools.
|
|
274
|
-
3. **Auditability**: each iteration's tool list is a function of
|
|
275
|
-
`activatedInjectionIds` — recorded, replayable, diff-able across
|
|
276
|
-
runs. Classic mode has no equivalent artifact.
|
|
277
|
-
4. **Predictable cost**: Dynamic varies <5% across model sizes (28K-36K).
|
|
278
|
-
Classic varies 80%+ run-to-run depending on parallelization.
|
|
295
|
+
Every injection factory has a `cache:` field. Four forms:
|
|
279
296
|
|
|
280
|
-
|
|
297
|
+
| Policy | Meaning |
|
|
298
|
+
|---|---|
|
|
299
|
+
| `'always'` | Cache whenever this injection is in `activeInjections`. |
|
|
300
|
+
| `'never'` | Never cache — volatile content (timestamps, per-request IDs). |
|
|
301
|
+
| `'while-active'` | Cache while the injection is active; invalidates the moment it becomes inactive. |
|
|
302
|
+
| `{ until: ctx => boolean }` | Predicate-driven invalidation (Turing-complete escape hatch). |
|
|
281
303
|
|
|
282
|
-
|
|
283
|
-
classic mode often wins on raw tokens. Above it, Dynamic dominates
|
|
284
|
-
because Classic's tool-description payload grows linearly with the
|
|
285
|
-
catalog while Dynamic stays flat at active-skill size:
|
|
304
|
+
**Smart defaults per factory** — most consumers never write `cache:` explicitly:
|
|
286
305
|
|
|
306
|
+
```typescript
|
|
307
|
+
defineSteering({ id: 'tone', prompt: '...' }); // default: 'always'
|
|
308
|
+
defineFact({ id: 'profile', data: '...' }); // default: 'always'
|
|
309
|
+
defineSkill({ id: 'billing', body: '...', tools: [...] }); // default: 'while-active'
|
|
310
|
+
defineInstruction({ id: 'urgent', activeWhen: ..., prompt: '...' }); // default: 'never'
|
|
311
|
+
defineMemory({ id: 'causal', type: MEMORY_TYPES.CAUSAL, ... }); // default: 'while-active'
|
|
287
312
|
```
|
|
288
|
-
|
|
289
|
-
|
|
290
|
-
|
|
291
|
-
|
|
292
|
-
|
|
313
|
+
|
|
314
|
+
For composition beyond the three string sentinels, use the predicate form:
|
|
315
|
+
|
|
316
|
+
```typescript
|
|
317
|
+
// Stable for the first 5 iterations, then flush:
|
|
318
|
+
defineSteering({ id: 'examples', prompt: '...', cache: { until: ctx => ctx.iteration > 5 } });
|
|
319
|
+
|
|
320
|
+
// Invalidate when cumulative spend exceeds budget:
|
|
321
|
+
defineFact({ id: 'rules', data: '...', cache: { until: ctx => ctx.cumulativeInputTokens > 50_000 } });
|
|
293
322
|
```
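As a rough sketch of how the four policy forms might resolve for an injection that is active in the current iteration: the `CachePolicy` and `CacheContext` types and the `isCacheable` helper below are hypothetical names written for this illustration, not the library's implementation. The sketch assumes `until` means "keep caching while the predicate is false", and it does not model the difference between `'always'` and `'while-active'`, which only shows up once an injection becomes inactive.

```typescript
// Hypothetical types for illustration; not the library's actual definitions.
type CacheContext = { iteration: number; cumulativeInputTokens: number };

type CachePolicy =
  | 'always'
  | 'never'
  | 'while-active'
  | { until: (ctx: CacheContext) => boolean };

// Decide whether an injection that is ACTIVE this iteration should be cached.
function isCacheable(policy: CachePolicy, ctx: CacheContext): boolean {
  if (policy === 'never') return false;
  // For an active injection, both string policies resolve to "cache".
  if (policy === 'always' || policy === 'while-active') return true;
  // Predicate form: cacheable only while the predicate still returns false.
  return !policy.until(ctx);
}

const examplesPolicy: CachePolicy = { until: (ctx) => ctx.iteration > 5 };
const early = isCacheable(examplesPolicy, { iteration: 3, cumulativeInputTokens: 0 });
const late = isCacheable(examplesPolicy, { iteration: 6, cumulativeInputTokens: 0 });
console.log(early, late); // true false
```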
|
|
294
323
|
|
|
295
|
-
|
|
296
|
-
to hallucinate or pick the wrong tool**. Narrower context = more
|
|
297
|
-
in-distribution on the active task. Increasingly load-bearing as
|
|
298
|
-
catalogs grow.
|
|
324
|
+
### What the framework does every iteration
|
|
299
325
|
|
|
300
|
-
|
|
326
|
+
1. **`CacheDecisionSubflow`** walks `activeInjections`, evaluates each one's cache directive, and emits provider-independent `CacheMarker[]`.
|
|
327
|
+
2. **`CacheGate decider`** uses footprintjs `decide()` with three rules — kill switch, hit-rate floor (skip when recent hit-rate < 0.3), skill-churn (skip when ≥3 unique skills in the last 5 iters). Decision evidence captured for free.
|
|
328
|
+
3. **The active provider strategy** (registered automatically per `LLMProvider.name`) translates markers to wire format:
|
|
329
|
+
- `AnthropicCacheStrategy` → `cache_control` on system blocks (4-marker clamp)
|
|
330
|
+
- `OpenAICacheStrategy` → no-op writes (auto-cached); extracts metrics from `prompt_tokens_details.cached_tokens`
|
|
331
|
+
- `BedrockCacheStrategy` → model-aware (Anthropic-style for Claude, pass-through else)
|
|
332
|
+
- `NoOpCacheStrategy` → wildcard fallback
|
|
333
|
+
4. **`cacheRecorder`** emits typed events: hit rate, fresh-input tokens, cache-read tokens, cache-write tokens, markers applied. Same observability surface as every other event domain.
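The three gate rules in step 2 can be sketched as a plain function. This is an assumption-laden illustration rather than the library's `decide()`-based code: the `GateInput` field names, the rule ordering, and the `reason` strings are all hypothetical. Only the thresholds (hit-rate below 0.3, three or more unique skills in the last 5 iterations) come from the text above.

```typescript
// Hypothetical input shape; field names are assumptions for this sketch.
interface GateInput {
  cachingDisabled: boolean;     // kill switch
  recentHitRate: number | null; // null when there is no history yet
  recentSkillIds: string[];     // active skill id per iteration, oldest first
}

type GateDecision = { apply: boolean; reason: string };

function cacheGate(input: GateInput): GateDecision {
  // Rule 1: kill switch wins over everything else.
  if (input.cachingDisabled) return { apply: false, reason: 'kill-switch' };

  // Rule 2: skip markers when recent hit-rate is below the 0.3 floor.
  if (input.recentHitRate !== null && input.recentHitRate < 0.3) {
    return { apply: false, reason: 'hit-rate-floor' };
  }

  // Rule 3: skip when 3 or more unique skills appeared in the last 5 iterations.
  const lastFive = input.recentSkillIds.slice(-5);
  const unique = lastFive.filter((id, i) => lastFive.indexOf(id) === i);
  if (unique.length >= 3) return { apply: false, reason: 'skill-churn' };

  return { apply: true, reason: 'default' };
}

const decision = cacheGate({
  cachingDisabled: false,
  recentHitRate: 0.71,
  recentSkillIds: ['billing', 'billing', 'network'],
});
console.log(decision); // { apply: true, reason: 'default' }
```

Returning the reason alongside the boolean mirrors the "decision evidence" idea: the record says which rule fired as well as what was decided.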
|
|
301
334
|
|
|
302
|
-
|
|
303
|
-
|
|
304
|
-
|
|
335
|
+
For the per-iteration cache invalidation walkthrough and the full benchmark numbers, see [`docs/guides/caching.md`](./docs/guides/caching.md).
|
|
336
|
+
|
|
337
|
+
### When to use it
|
|
338
|
+
|
|
339
|
+
Always — it's on by default. The smart defaults handle 80% of cases.
|
|
340
|
+
|
|
341
|
+
To audit it:
|
|
342
|
+
|
|
343
|
+
```typescript
|
|
344
|
+
import { cacheRecorder } from 'agentfootprint';
|
|
345
|
+
|
|
346
|
+
agent.attach(cacheRecorder({ onTurnEnd: (m) => console.log(m) }));
|
|
347
|
+
// → { hitRate: 0.71, freshInput: 1240, cacheRead: 9180, cacheWrite: 0, markersApplied: 2 }
|
|
305
348
|
```
|
|
306
349
|
|
|
307
|
-
|
|
308
|
-
|
|
309
|
-
|
|
350
|
+
To turn caching off entirely for a given agent:
|
|
351
|
+
|
|
352
|
+
```typescript
|
|
353
|
+
const agent = Agent.create({ provider, caching: 'off', ... }).build();
|
|
354
|
+
```
|
|
310
355
|
|
|
311
356
|
---
|
|
312
357
|
|
|
@@ -314,13 +359,13 @@ in the Neo repo.
|
|
|
314
359
|
|
|
315
360
|
Three example shapes, all runnable end-to-end with `npm run example examples/<file>.ts`.
|
|
316
361
|
|
|
317
|
-
### Customer support agent (skills + memory + audit trail)
|
|
362
|
+
### Customer support agent (skills + memory + audit trail + cache)
|
|
318
363
|
|
|
319
364
|
```typescript
|
|
320
365
|
const agent = Agent.create({ provider, model: 'claude-sonnet-4-5-20250929' })
|
|
321
366
|
.system('You are a friendly support assistant.')
|
|
322
|
-
.skill(billingSkill) // LLM activates with read_skill('billing')
|
|
323
|
-
.steering(toneGuidelines) // always-on
|
|
367
|
+
.skill(billingSkill) // LLM activates with read_skill('billing'); cached while active
|
|
368
|
+
.steering(toneGuidelines) // always-on; cached forever
|
|
324
369
|
.memory(conversationMemory) // remembers across .run() calls, per-tenant
|
|
325
370
|
.build();
|
|
326
371
|
```
|
|
@@ -342,20 +387,20 @@ await research.run({ message: 'Should we adopt microservices?' });
|
|
|
342
387
|
|
|
343
388
|
### Streaming chat agent (token-by-token to a browser)
|
|
344
389
|
|
|
345
|
-
<!-- ┌────────────────────────────────────────────────────────────────┐
|
|
346
|
-
│ 📹 Streaming demo clip here. │
|
|
347
|
-
│ Short loop: user types → tokens stream → tool call │
|
|
348
|
-
│ surfaces mid-stream → final answer. │
|
|
349
|
-
└────────────────────────────────────────────────────────────────┘ -->
|
|
350
|
-
|
|
351
390
|
```typescript
|
|
352
|
-
|
|
353
|
-
|
|
354
|
-
|
|
391
|
+
import express from 'express';
|
|
392
|
+
import { toSSE } from 'agentfootprint';
|
|
393
|
+
|
|
394
|
+
app.get('/chat', async (req, res) => {
|
|
395
|
+
res.setHeader('Content-Type', 'text/event-stream');
|
|
396
|
+
agent.on('agentfootprint.stream.token', (e) => res.write(toSSE(e)));
|
|
397
|
+
agent.on('agentfootprint.stream.tool_start', (e) => res.write(toSSE(e)));
|
|
398
|
+
agent.on('agentfootprint.stream.tool_end', (e) => res.write(toSSE(e)));
|
|
399
|
+
await agent.run({ message: req.query.message as string });
|
|
400
|
+
res.end();
|
|
401
|
+
});
|
|
355
402
|
```
|
|
356
403
|
|
|
357
|
-
→ [`docs-site/guides/streaming/`](docs-site/src/content/docs/guides/streaming.mdx)
|
|
358
|
-
|
|
359
404
|
---
|
|
360
405
|
|
|
361
406
|
## The differentiator: the trace is a cache of the agent's thinking
|
|
@@ -410,7 +455,7 @@ This is memoization for agent reasoning — do the expensive work once, serve ma
|
|
|
410
455
|
|
|
411
456
|
### 3. Training data — every successful run becomes a labeled trajectory
|
|
412
457
|
|
|
413
|
-
The same snapshot data shape is the input to SFT / DPO / process-RL training pipelines (`causalMemory.exportForTraining({ format: 'sft' | 'dpo' | 'process' })` is on the roadmap
|
|
458
|
+
The same snapshot data shape is the input to SFT / DPO / process-RL training pipelines (`causalMemory.exportForTraining({ format: 'sft' | 'dpo' | 'process' })` is on the roadmap). You don't run a separate data-collection phase — **your production traffic IS your training set.** Every successful customer interaction is a positive trajectory; every escalation or override is a counter-example.
|
|
414
459
|
|
|
415
460
|
The same JSON shape that powered the audit trail and the cheap-model follow-up is the training payload. One recording, three downstream consumers, no extra instrumentation.
|
|
416
461
|
|
|
@@ -427,6 +472,7 @@ Generative AI development is expensive when every iteration hits a paid API. age
|
|
|
427
472
|
| Memory store | `InMemoryStore` | `RedisStore` (`agentfootprint/memory-redis`) · `AgentCoreStore` (`agentfootprint/memory-agentcore`) · DynamoDB / Postgres / Pinecone (planned) |
|
|
428
473
|
| MCP server | `mockMcpClient({ tools })` — in-memory, no SDK | `mcpClient({ transport })` to a real server |
|
|
429
474
|
| Tool execution | inline closure | real implementation |
|
|
475
|
+
| Cache strategy | `NoOpCacheStrategy` (when `mock` provider) | Auto-selected by provider: `AnthropicCacheStrategy` / `OpenAICacheStrategy` / `BedrockCacheStrategy` |
|
|
430
476
|
|
|
431
477
|
The flowchart, recorders, narrative, and tests don't change between dev and prod. **Ship the patterns first; pay for tokens last.**
|
|
432
478
|
|
|
@@ -439,6 +485,8 @@ The flowchart, recorders, narrative, and tests don't change between dev and prod
|
|
|
439
485
|
| 🎓 **New to agents** | [5-minute Quick Start](https://footprintjs.github.io/agentfootprint/getting-started/quick-start/) → first agent runs offline |
|
|
440
486
|
| 🛠️ **A LangChain / CrewAI / LangGraph user** | [Migration sketch](https://footprintjs.github.io/agentfootprint/getting-started/vs/) — same patterns, fewer classes |
|
|
441
487
|
| 🏗️ **Architecting an enterprise rollout** | [Production guide](https://footprintjs.github.io/agentfootprint/guides/deployment/) — multi-tenant identity, audit trails, redaction, OTel |
|
|
488
|
+
| 🏛️ **Doing production due diligence** | [Architecture page](https://footprintjs.github.io/agentfootprint/architecture/dependency-graph/) — 8-layer stack, hexagonal ports, the conventions SSOT |
|
|
489
|
+
| 💡 **Curious about the design philosophy** | [Inspiration](./docs/inspiration/) — Palantir-style connected data + Liskov-style modular boundaries |
|
|
442
490
|
| 🔬 **Researcher / extending the framework** | [Extension guide](https://footprintjs.github.io/agentfootprint/contributing/extension-guide/) — add a new flavor in 50 lines |
|
|
443
491
|
|
|
444
492
|
Every code snippet on the docs site is imported from a real, runnable file in [`examples/`](examples/) — every example is also an end-to-end test in CI. There is no docs-only code in this repo.
|
|
@@ -449,16 +497,17 @@ Every code snippet on the docs site is imported from a real, runnable file in [`
|
|
|
449
497
|
|
|
450
498
|
- **2 primitives** — `LLMCall`, `Agent` (the ReAct loop)
|
|
451
499
|
- **4 compositions** — `Sequence`, `Parallel`, `Conditional`, `Loop`
|
|
452
|
-
- **
|
|
453
|
-
- **One Injection primitive** — `defineSkill` / `defineSteering` / `defineInstruction` / `defineFact` (one engine, four typed factories, all reduce to `{ trigger, slot }`)
|
|
500
|
+
- **7 LLM providers** — Anthropic · OpenAI · Bedrock · Ollama · Browser-Anthropic · Browser-OpenAI · Mock (with `mock({ replies })` for scripted multi-turn)
|
|
501
|
+
- **One Injection primitive** — `defineSkill` / `defineSteering` / `defineInstruction` / `defineFact` (one engine, four typed factories, all reduce to `{ trigger, slot, cache }`)
|
|
454
502
|
- **One Memory factory** — `defineMemory({ type, strategy, store })` — 4 types × 7 strategies including **Causal**
|
|
503
|
+
- **Provider-agnostic prompt caching** — declarative `cache:` field per injection · per-iteration marker recomputation via `CacheDecisionSubflow` · registered strategies for Anthropic / OpenAI / Bedrock with `NoOp` wildcard fallback · `cacheRecorder` for hit-rate observability
|
|
455
504
|
- **RAG** — `defineRAG()` + `indexDocuments()` (sugar over Semantic + TopK)
|
|
456
505
|
- **MCP** — `mcpClient({ transport })` for real servers · `mockMcpClient({ tools })` for in-memory development
|
|
457
506
|
- **Memory store adapters** — `InMemoryStore` · `RedisStore` (subpath `agentfootprint/memory-redis`) · `AgentCoreStore` (subpath `agentfootprint/memory-agentcore`)
|
|
458
|
-
- **
|
|
507
|
+
- **48+ typed observability events** across context · stream · agent · cost · skill · permission · eval · memory · cache · embedding · error · …
|
|
459
508
|
- **Pause / resume** — JSON-serializable checkpoints; pause via `askHuman` / `pauseHere`, resume hours later on a different server
|
|
460
509
|
- **Resilience** — `withRetry`, `withFallback`, `resilientProvider`
|
|
461
|
-
- **AI-coding-tool support** — bundled instructions for Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot
|
|
510
|
+
- **AI-coding-tool support** — bundled instructions for Claude Code · Cursor · Windsurf · Cline · Kiro · Copilot (see `ai-instructions/`)
|
|
462
511
|
- **Runnable examples** organized by DNA layer (core · core-flow · patterns · context-engineering · memory · features) — every example is also an end-to-end CI test
|
|
463
512
|
|
|
464
513
|
## What's next (clearly marked roadmap)
|
|
@@ -468,6 +517,7 @@ Every code snippet on the docs site is imported from a real, runnable file in [`
|
|
|
468
517
|
| **Reliability subsystem** | `CircuitBreaker` · 3-tier output fallback · auto-resume-on-error · Skills upgrades (`surfaceMode`, `refreshPolicy`) · `MockEnvironment` composer |
|
|
469
518
|
| **Causal training-data exports** | `causalMemory.exportForTraining({ format: 'sft' \| 'dpo' \| 'process' })` — production traffic becomes labeled SFT / DPO / process-RL trajectories |
|
|
470
519
|
| **Governance** | `Policy` · `BudgetTracker` · DynamoDB / Postgres / Pinecone memory adapters · production embedder factories |
|
|
520
|
+
| **Cache layer v2** | Gemini handle-based caching · automatic provider routing based on causal-memory state · `cacheRecorder` cost-attribution |
|
|
471
521
|
| **Deep Agents · A2A protocol** | Planning-before-execution · agent-to-agent protocol · Lens UI deep-link |
|
|
472
522
|
|
|
473
523
|
For shipped features per release see [CHANGELOG.md](./CHANGELOG.md). Roadmap items are *not* claims about the current API — if a feature isn't in `npm install agentfootprint` today, it's listed here, not in the documentation.
|
package/package.json
CHANGED