agentfootprint 1.4.1 → 1.4.2

Files changed (2)
  1. package/README.md +115 -157
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -16,21 +16,17 @@
16
16
  <a href="https://footprintjs.github.io/footPrint/"><img src="https://img.shields.io/badge/Built_on-footprintjs-ca8a04?style=flat" alt="Built on footprintjs"></a>
17
17
  </p>
18
18
 
19
- <br>
20
-
21
- Most agent frameworks give you execution. agentfootprint gives you **connected evidence** — grounded, auditable, LLM-readable. The LLM can explain its own decisions. You can verify it wasn't hallucinating.
19
+ > **Most agent frameworks give you execution. agentfootprint gives you connected evidence** — grounded, auditable, LLM-readable. The LLM can explain its own decisions. You can verify it wasn't hallucinating.
22
20
 
23
21
  ```bash
24
22
  npm install agentfootprint
25
23
  ```
26
24
 
27
- Import what you need — each capability is a subpath:
28
-
29
25
  ```typescript
30
26
  import { Agent, defineTool } from 'agentfootprint'; // Build agents
31
27
  import { mock, anthropic } from 'agentfootprint/providers'; // Connect providers
32
- import { defineInstruction } from 'agentfootprint/instructions'; // Smart behavior
33
- import { agentObservability } from 'agentfootprint/observe'; // Monitor execution
28
+ import { defineInstruction } from 'agentfootprint/instructions'; // Conditional behavior
29
+ import { agentObservability } from 'agentfootprint/observe'; // Observability
34
30
  import { withRetry } from 'agentfootprint/resilience'; // Reliability
35
31
  import { gatedTools } from 'agentfootprint/security'; // Tool safety
36
32
  import { ExplainRecorder } from 'agentfootprint/explain'; // Grounding analysis
@@ -41,30 +37,101 @@ import { SSEFormatter } from 'agentfootprint/stream'; // Real-time ev
41
37
 
42
38
  ## Start Simple, Compose Up
43
39
 
44
- Five concepts. Each adds one capability. No upfront graph DSL start with a function call and grow.
40
+ Six concepts. Start with a single LLM call, compose up to multi-agent. No upfront graph DSL.
45
41
 
46
42
  ```typescript
47
- import { Agent, defineTool, mock } from 'agentfootprint';
43
+ import { Agent, defineTool } from 'agentfootprint';
44
+ import { mock } from 'agentfootprint/providers';
45
+ import { agentObservability } from 'agentfootprint/observe';
48
46
 
47
+ const obs = agentObservability();
49
48
  const agent = Agent.create({ provider: mock([...]) })
50
49
  .system('You are a research assistant.')
51
50
  .tool(searchTool)
51
+ .recorder(obs)
52
52
  .build();
53
53
 
54
54
  const result = await agent.run('Find AI trends');
55
- console.log(result.content);
56
- console.log(agent.getNarrative()); // connected execution trace
55
+ console.log(result.content); // LLM response
56
+ console.log(obs.explain().iterations); // per-iteration evaluation data ← the differentiator
57
57
  ```
58
58
 
59
+ **Single LLM** (one agent, one task):
60
+
59
61
  | Concept | What it adds | Use case |
60
62
  |---------|-------------|----------|
61
63
  | **LLMCall** | Single LLM invocation | Summarization, classification |
62
64
  | **Agent** | + Tool use loop (ReAct) | Research, code generation |
63
65
  | **RAG** | + Retrieval | Q&A over documents |
64
- | **FlowChart** | + Sequential pipeline | Approval flows, ETL |
65
- | **Swarm** | + LLM-driven routing | Customer support, triage |
66
66
 
67
- All five share one interface: `.build()` → `.run()`, `.getNarrative()`, `.getSnapshot()`.
67
+ **Multi-Agent** (compose agents):
68
+
69
+ | Concept | What it adds | Use case |
70
+ |---------|-------------|----------|
71
+ | **FlowChart** | Sequential pipeline | Approval flows, ETL — output of one feeds the next |
72
+ | **Parallel** | Concurrent execution | Analysis from multiple perspectives — merged by LLM |
73
+ | **Swarm** | LLM-driven routing | Customer support — orchestrator delegates to specialists |
74
+
75
+ All six share one interface: `.build()` → `.run()`, `.getNarrative()`, `.getSnapshot()`.
76
+
77
+ ---
78
+
79
+ ## Architecture — 5 Layers
80
+
81
+ ```
82
+ Layer 1: BUILD → concepts/ Single LLM (LLMCall, Agent, RAG)
83
+ Multi-Agent (FlowChart, Parallel, Swarm)
84
+ tools/ defineTool, ToolRegistry, askHuman
85
+
86
+ Layer 2: COMPOSE → lib/loop/ buildAgentLoop — the ReAct engine
87
+ lib/slots/ SystemPrompt, Messages, Tools subflows
88
+
89
+ Layer 3: EVALUATE → recorders/ ExplainRecorder — per-iteration evaluation
90
+ explain obs.explain() → { iterations, sources, claims, context }
91
+
92
+ Layer 4: MONITOR → recorders/ TokenRecorder, CostRecorder, ToolUsageRecorder
93
+ streaming/ AgentStreamEvent, SSEFormatter
94
+ narrative Human-readable execution story (footprintjs)
95
+
96
+ Layer 5: INFRASTRUCTURE → adapters/ Anthropic, OpenAI, Bedrock, Mock, MCP, A2A
97
+ providers/ Prompt, Message, Tool strategies
98
+ memory/ Conversation stores (Redis, Postgres, DynamoDB)
99
+ ```
100
+
101
+ Each folder has a README. Start at Layer 1, add layers as you need them.
102
+
103
+ Built on [footprintjs](https://github.com/footprintjs/footPrint) — the flowchart pattern for backend code. One DFS traversal, three observer systems (scope/flow/agent), connected data out.
104
+
105
+ ---
106
+
107
+ ## What's Different
108
+
109
+ Features no other agent framework provides — and why they matter.
110
+
111
+ **Quality:**
112
+
113
+ | Feature | What |
114
+ |---------|------|
115
+ | **Dynamic ReAct** | All 3 slots (prompt, tools, messages) re-evaluate EACH loop iteration. Agent adapts mid-conversation. |
116
+ | **Conditional Behavior** | `defineInstruction({ activeWhen })` — rules activate based on accumulated decision state. |
117
+ | **Tool Result Recency** | Instructions inject into the recency window AFTER tool calls — guidance at the right moment. |
118
+ | **Per-Iteration Evaluation** | `obs.explain().iterations` — context + decisions + sources + claims connected per loop. |
119
+
120
+ **Safety & Cost:**
121
+
122
+ | Feature | What |
123
+ |---------|------|
124
+ | **Permission-Gated Tools** | LLM never SEES blocked tools — filtered at resolve time. Can't hallucinate a tool it never saw. |
125
+ | **$0 Testing** | `mock()` adapter — same interface as Anthropic/OpenAI. Full test suite, zero API spend. |
126
+
127
+ **UX & Debugging:**
128
+
129
+ | Feature | What |
130
+ |---------|------|
131
+ | **Human-in-the-Loop** | Agent pauses, serializes to JSON, resumes hours later on a different server. `askHuman()`. |
132
+ | **Streaming Events** | 9-event discriminated union. Build React/Next.js real-time UI. SSEFormatter for SSE. |
133
+ | **Narrative Traces** | Human-readable execution story a follow-up LLM can reason about. |
134
+ | **Single Traversal** | 3 observer systems fire during ONE DFS pass → all data connected. No post-processing. |
68
135
 
69
136
  ---
70
137
 
@@ -73,6 +140,8 @@ All five share one interface: `.build()` → `.run()`, `.getNarrative()`, `.getS
73
140
  Write tests with `mock()`. Deploy with `anthropic()`. Same code. $0 test runs.
74
141
 
75
142
  ```typescript
143
+ import { mock, createProvider, anthropic } from 'agentfootprint/providers';
144
+
76
145
  // test — deterministic, free, instant
77
146
  const provider = mock([{ content: 'Paris.' }]);
78
147
 
@@ -87,12 +156,15 @@ Works with Anthropic, OpenAI, Bedrock, Ollama. No lock-in.
87
156
 
88
157
  ---
89
158
 
90
- ## Instructions — Conditional Context Injection
159
+ ## Features
160
+
161
+ ### Conditional Behavior
91
162
 
92
- One concept. Three LLM API positions. Define a rule once — it injects into system prompt, tools, AND tool-result recency window. Driven by accumulated state.
163
+ Define rules that inject into system prompt, tools, AND tool-result recency window. Driven by accumulated state. All 3 slots re-evaluate each iteration in Dynamic mode — progressive tool authorization, context-aware prompts, state-driven behavior.
93
164
 
94
165
  ```typescript
95
- import { defineInstruction, Agent, AgentPattern } from 'agentfootprint';
166
+ import { defineInstruction } from 'agentfootprint/instructions';
167
+ import { Agent, AgentPattern } from 'agentfootprint';
96
168
 
97
169
  const refund = defineInstruction({
98
170
  id: 'refund-handling',
@@ -100,103 +172,52 @@ const refund = defineInstruction({
100
172
  prompt: 'Handle denied orders with empathy. Follow refund policy.',
101
173
  tools: [processRefund],
102
174
  onToolResult: [{ id: 'empathy', text: 'Do NOT promise reversal.' }],
175
+ safety: true, // fail-closed: fires even when predicate throws
103
176
  });
104
177
 
105
178
  const agent = Agent.create({ provider })
106
179
  .tool(lookupOrder)
107
180
  .instruction(refund)
108
181
  .decision({ orderStatus: null })
109
- .pattern(AgentPattern.Dynamic) // re-evaluate each iteration
182
+ .pattern(AgentPattern.Dynamic)
110
183
  .build();
111
184
  ```
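Reviewer note: the `safety: true` comment in this hunk ("fail-closed: fires even when predicate throws") can be illustrated with a small standalone sketch. It does not import agentfootprint and is not the library's implementation. `SketchInstruction`, `activeInstructions`, and the shape of the decision state are hypothetical names, and the `activeWhen` signature is an assumption, since the diff only mentions it in the feature table.

```typescript
// Standalone sketch, NOT agentfootprint's implementation.
// Every name here is hypothetical and exists only for illustration.
type Decision = Record<string, unknown>;

interface SketchInstruction {
  id: string;
  prompt: string;
  safety?: boolean; // fail-closed when true
  activeWhen?: (decision: Decision) => boolean; // assumed predicate shape
}

// Returns the instructions that apply to the current decision state.
function activeInstructions(
  all: SketchInstruction[],
  decision: Decision,
): SketchInstruction[] {
  return all.filter((instruction) => {
    if (!instruction.activeWhen) return true; // no predicate: always active
    try {
      return instruction.activeWhen(decision);
    } catch {
      // A throwing predicate keeps a safety instruction active (fail-closed)
      // and drops an ordinary one.
      return instruction.safety === true;
    }
  });
}

const instructions: SketchInstruction[] = [
  {
    id: 'refund-handling',
    prompt: 'Handle denied orders with empathy.',
    activeWhen: (d) => d.orderStatus === 'denied',
  },
  {
    id: 'compliance',
    prompt: 'GDPR compliance required.',
    safety: true,
    activeWhen: () => { throw new Error('predicate bug'); },
  },
  {
    id: 'ordinary',
    prompt: 'Optional guidance.',
    activeWhen: () => { throw new Error('predicate bug'); },
  },
];

// Re-evaluating against updated state changes which instructions apply.
const beforeLookup = activeInstructions(instructions, { orderStatus: null }).map((i) => i.id);
const afterLookup = activeInstructions(instructions, { orderStatus: 'denied' }).map((i) => i.id);
```

In this sketch `beforeLookup` holds only `compliance`, and `afterLookup` holds `refund-handling` and `compliance`. Re-running the filter on every loop iteration is the idea the README calls Dynamic mode.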
112
185
 
113
- Tool results update the decision scope via `decide()`. Next iteration, different instructions activate. Progressive tool authorization, context-aware prompts, state-driven behavior — all declarative.
114
-
115
- See [Instructions Guide](docs/guides/instructions.md).
186
+ ### Narrative Traces
116
187
 
117
- ---
118
-
119
- ## LLM Narrative — Connected Evidence
120
-
121
- Not disconnected spans. Not logs. **Connected entries** with key, value, stageId — collected during the single traversal pass. The LLM can read its own trace and answer follow-up questions.
188
+ Connected entries with key, value, stageId — collected during traversal. Feed to a follow-up LLM for debugging.
122
189
 
123
190
  ```typescript
124
- const result = await agent.run('Check order ORD-1003');
125
-
126
- // Human-readable narrative
127
191
  agent.getNarrative();
128
192
  // [
129
193
  // "[Seed] Initialized agent state",
130
194
  // "[CallLLM] claude-sonnet-4 (127in / 45out)",
131
195
  // "[ExecuteToolCalls] lookup_order({orderId: 'ORD-1003'})",
132
- // " Tool results: {status: 'denied', amount: 5000}",
133
- // "[CallLLM] claude-sonnet-4 (312in / 89out)",
134
196
  // "[Finalize] Your order was denied..."
135
197
  // ]
136
-
137
- // Structured entries for programmatic access
138
- agent.getNarrativeEntries();
139
- // Each entry: { type, text, key, rawValue, stageId, subflowId }
140
198
  ```
141
199
 
142
- ### Grounding Analysis
200
+ ### Human-in-the-Loop
143
201
 
144
- Compare what tools returned vs what the LLM said. Hallucination detection without a separate eval pipeline.
145
-
146
- ```typescript
147
- import { ExplainRecorder } from 'agentfootprint/explain';
148
-
149
- const explain = new ExplainRecorder();
150
- const agent = Agent.create({ provider }).tool(orderTool).recorder(explain).build();
151
- await agent.run('Check order status');
152
-
153
- const report = explain.explain();
154
- report.sources; // what tools returned (ground truth)
155
- report.claims; // what the LLM said (to verify)
156
- report.decisions; // what tool calls the LLM made
157
- ```
158
-
159
- ---
160
-
161
- ## Dynamic ReAct
162
-
163
- All three slots (system prompt, tools, messages) re-evaluate each iteration. Instructions re-evaluate against updated decision scope. Progressive tool authorization:
164
-
165
- ```
166
- Turn 1: basic tools → LLM calls verify_identity → decision.verified = true
167
- Turn 2: InstructionsToLLM re-evaluates → admin tools unlocked → refund tools available
168
- Turn 3: LLM sees admin tools → can process refund
169
- ```
170
-
171
- The LLM's capabilities change based on what happened — not what you hardcoded.
172
-
173
- ---
174
-
175
- ## Pausable — Human-in-the-Loop
176
-
177
- Long-running agent pauses, serializes state to JSON, resumes hours later on a different server.
202
+ Agent pauses, serializes state to JSON, resumes hours later on a different server.
178
203
 
179
204
  ```typescript
180
205
  import { Agent, askHuman } from 'agentfootprint';
181
206
 
182
207
  const agent = Agent.create({ provider })
183
- .tool(askHuman()) // special tool that pauses execution
208
+ .tool(askHuman())
184
209
  .build();
185
210
 
186
211
  const result = await agent.run('Process my refund');
187
212
  if (result.paused) {
188
- // Store checkpoint in Redis/Postgres/anywhere
189
- const checkpoint = result.pauseData;
190
- // ... hours later, different server ...
191
- const final = await agent.resume(humanResponse);
213
+ const checkpoint = result.pauseData; // store in Redis/Postgres/anywhere
214
+ const final = await agent.resume(humanResponse); // hours later, different server
192
215
  }
193
216
  ```
194
217
 
195
- ---
218
+ ### Streaming Events
196
219
 
197
- ## Streaming Lifecycle Events
198
-
199
- 9-event discriminated union. Build any UX — CLI, web, mobile. Tool lifecycle fires even without streaming mode.
220
+ 9-event discriminated union. Build any UX — CLI, web, mobile.
200
221
 
201
222
  ```typescript
202
223
  await agent.run('Check order', {
@@ -205,7 +226,6 @@ await agent.run('Check order', {
205
226
  case 'token': process.stdout.write(event.content); break;
206
227
  case 'tool_start': console.log(`Running ${event.toolName}...`); break;
207
228
  case 'tool_end': console.log(`Done (${event.latencyMs}ms)`); break;
208
- case 'llm_end': console.log(`[${event.model}, ${event.latencyMs}ms]`); break;
209
229
  }
210
230
  },
211
231
  });
@@ -213,47 +233,36 @@ await agent.run('Check order', {
213
233
 
214
234
  Events: `turn_start` · `llm_start` · `thinking` · `token` · `llm_end` · `tool_start` · `tool_end` · `turn_end` · `error`
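Reviewer note: a discriminated union over the nine event names listed above lets the compiler check that a handler covers every case. The sketch below is standalone and assumes the payload fields. Only `content`, `toolName`, `latencyMs`, and `model` appear in this diff; the `message` field and the type name `SketchEvent` are hypothetical. `toSse` shows the standard server-sent-events wire format and is not the library's `SSEFormatter`.

```typescript
// Standalone sketch, NOT the library's AgentStreamEvent type.
// Field names beyond those visible in the diff are assumptions.
type SketchEvent =
  | { type: 'turn_start' }
  | { type: 'llm_start' }
  | { type: 'thinking'; content: string }
  | { type: 'token'; content: string }
  | { type: 'llm_end'; model: string; latencyMs: number }
  | { type: 'tool_start'; toolName: string }
  | { type: 'tool_end'; latencyMs: number }
  | { type: 'turn_end' }
  | { type: 'error'; message: string };

// Switching on `type` narrows the event, so each branch sees only its own fields.
function describeEvent(event: SketchEvent): string {
  switch (event.type) {
    case 'turn_start': return 'Turn started';
    case 'llm_start': return 'Calling model';
    case 'thinking': return `Thinking: ${event.content}`;
    case 'token': return event.content;
    case 'llm_end': return `[${event.model}, ${event.latencyMs}ms]`;
    case 'tool_start': return `Running ${event.toolName}...`;
    case 'tool_end': return `Done (${event.latencyMs}ms)`;
    case 'turn_end': return 'Turn finished';
    case 'error': return `Error: ${event.message}`;
    default: {
      // If a tenth event type is added to the union, this assignment
      // stops compiling until a matching case is written.
      const unreachable: never = event;
      return unreachable;
    }
  }
}

// Hypothetical formatter: one SSE message per event, ended by a blank line.
function toSse(event: SketchEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}
```

For example, `describeEvent({ type: 'tool_start', toolName: 'search' })` returns `'Running search...'`.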
215
235
 
216
- SSE for web backends: `res.write(SSEFormatter.format(event))`
217
-
218
- ---
219
-
220
- ## Recorders — Passive Observation
236
+ ### Observability
221
237
 
222
- Observe without shaping behavior. Collect during traversal. One call for everything:
238
+ One call for everything. Collect during traversal, never post-process.
223
239
 
224
240
  ```typescript
225
241
  import { agentObservability } from 'agentfootprint/observe';
226
242
 
227
243
  const obs = agentObservability();
228
- const agent = Agent.create({ provider }).tool(searchTool).recorder(obs).build();
244
+ agent.recorder(obs).build();
229
245
  await agent.run('Hello');
230
246
 
231
247
  obs.tokens(); // metrics: { totalCalls, totalInputTokens, totalOutputTokens, calls[] }
232
248
  obs.tools(); // metrics: { totalCalls, byTool: { search: { calls, errors, latency } } }
233
249
  obs.cost(); // metrics: USD amount
234
- obs.explain(); // evaluation: { iterations, sources, claims, decisions, context, summary }
250
+ obs.explain(); // evaluation: { iterations, sources, claims, decisions, context }
235
251
  ```
236
252
 
237
- ### 5 Categories
238
-
239
253
  | Category | Recorders | Audience |
240
254
  |----------|-----------|----------|
241
255
  | **Evaluation** | `ExplainRecorder` | LLM evaluator — faithfulness, hallucination, grounding |
242
256
  | **Metrics** | `TokenRecorder`, `CostRecorder`, `ToolUsageRecorder`, `TurnRecorder` | Ops dashboard, billing |
243
257
  | **Safety** | `GuardrailRecorder`, `PermissionRecorder`, `QualityRecorder` | Security, compliance |
244
258
  | **Export** | `OTelRecorder` | Datadog, Grafana, any OTel backend |
245
- | **Composition** | `CompositeRecorder`, `agentObservability()` | Bundle recorders |
246
-
247
- `obs.explain()` is the differentiator — per-iteration evaluation units with connected context. See [`recorders/README.md`](src/recorders/README.md).
248
-
249
- ---
250
259
 
251
- ## Tool Gating — Defense-in-Depth
260
+ ### Tool Gating — Defense-in-Depth
252
261
 
253
- The LLM never sees tools it can't use. Can't hallucinate a tool it never saw.
262
+ The LLM never sees tools it can't use. Two layers: resolve-time filtering + execute-time rejection.
254
263
 
255
264
  ```typescript
256
- import { gatedTools, PermissionPolicy } from 'agentfootprint';
265
+ import { gatedTools, PermissionPolicy } from 'agentfootprint/security';
257
266
 
258
267
  const policy = PermissionPolicy.fromRoles({
259
268
  user: ['search', 'calc'],
@@ -264,88 +273,37 @@ const agent = Agent.create({ provider })
264
273
  .toolProvider(gatedTools(allTools, policy.checker()))
265
274
  .build();
266
275
 
267
- // Upgrade mid-conversation
268
- policy.setRole('admin');
276
+ policy.setRole('admin'); // upgrade mid-conversation
269
277
  ```
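Reviewer note: the two layers named in this section (resolve-time filtering and execute-time rejection) can be sketched without the library. Everything below is hypothetical: `SketchTool`, `RolePolicy`, `resolveTools`, and `executeTool` are illustration names, not the `gatedTools` or `PermissionPolicy` API, and the role table is invented.

```typescript
// Standalone sketch, NOT agentfootprint's gatedTools or PermissionPolicy.
interface SketchTool {
  name: string;
  run: (input: string) => string;
}

type Checker = (toolName: string) => boolean;

class RolePolicy {
  private role: string;
  private readonly roles: Record<string, string[]>;

  constructor(roles: Record<string, string[]>, initialRole: string) {
    this.roles = roles;
    this.role = initialRole;
  }

  setRole(role: string): void {
    this.role = role;
  }

  // The checker reads the current role on every call, so a later
  // setRole() changes what the same checker allows.
  checker(): Checker {
    return (toolName) => (this.roles[this.role] ?? []).includes(toolName);
  }
}

// Layer 1, resolve time: only permitted tools are offered to the model.
function resolveTools(all: SketchTool[], allowed: Checker): SketchTool[] {
  return all.filter((tool) => allowed(tool.name));
}

// Layer 2, execute time: a call naming a hidden or unknown tool is rejected.
function executeTool(
  all: SketchTool[],
  allowed: Checker,
  name: string,
  input: string,
): string {
  const tool = all.find((t) => t.name === name);
  if (!tool || !allowed(name)) {
    throw new Error(`Tool "${name}" is not available`);
  }
  return tool.run(input);
}

const allTools: SketchTool[] = [
  { name: 'search', run: (q) => `results for ${q}` },
  { name: 'refund', run: (id) => `refunded ${id}` },
];

const policy = new RolePolicy({ user: ['search'], admin: ['search', 'refund'] }, 'user');
const allowed = policy.checker();

const visibleAsUser = resolveTools(allTools, allowed).map((t) => t.name);
policy.setRole('admin'); // upgrade mid-conversation
const visibleAsAdmin = resolveTools(allTools, allowed).map((t) => t.name);
```

In this sketch `visibleAsUser` lists only `search`, while `visibleAsAdmin` lists both tools. The second layer exists because a model can still emit a tool name it was never shown.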
270
278
 
271
- Two layers: resolve-time filtering (hidden from LLM) + execute-time rejection (hallucinated names caught).
272
-
273
- ---
274
-
275
- ## Safety Instructions
276
-
277
- ```typescript
278
- defineInstruction({
279
- id: 'compliance',
280
- safety: true, // fail-closed: fires even when predicate throws
281
- prompt: 'GDPR compliance required.',
282
- });
283
- ```
284
-
285
- Safety instructions: unsuppressable, fail-closed, sorted last (highest LLM attention position).
286
-
287
- ---
288
-
289
- ## Orchestration
279
+ ### Resilience
290
280
 
291
281
  ```typescript
292
- import { withRetry, withFallback, withCircuitBreaker, resilientProvider } from 'agentfootprint';
282
+ import { withRetry, withFallback, resilientProvider } from 'agentfootprint/resilience';
293
283
 
294
284
  const reliable = withRetry(agent, { maxRetries: 3 });
295
285
  const resilient = withFallback(primaryAgent, cheapAgent);
296
- const guarded = withCircuitBreaker(agent, { failureThreshold: 5 });
297
-
298
- // Cross-family provider failover: Claude → GPT-4o → local Ollama
299
286
  const provider = resilientProvider([anthropicAdapter, openaiAdapter, ollamaAdapter]);
300
287
  ```
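Reviewer note: for readers unfamiliar with the wrappers imported above, here is a standalone sketch of the two underlying patterns, retry and ordered fallback. It is not the library's `withRetry`, `withFallback`, or `resilientProvider`. The names `retry` and `firstSuccessful` are hypothetical, and reading `maxRetries: 3` as one initial attempt plus up to three retries is an assumption. Backoff between attempts is omitted for brevity.

```typescript
// Standalone sketch, NOT agentfootprint's resilience module.
type Task<T> = () => Promise<T>;

// Runs the task once, then retries up to `maxRetries` more times on failure.
async function retry<T>(task: Task<T>, maxRetries: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}

// Tries each task in order and returns the first result that succeeds.
async function firstSuccessful<T>(tasks: Task<T>[]): Promise<T> {
  let lastError: unknown = new Error('no tasks provided');
  for (const task of tasks) {
    try {
      return await task();
    } catch (err) {
      lastError = err; // fall through to the next task
    }
  }
  throw lastError;
}

// A task that fails twice and succeeds on the third call.
let calls = 0;
const flaky: Task<string> = async () => {
  calls += 1;
  if (calls < 3) throw new Error(`attempt ${calls} failed`);
  return 'ok';
};

const retried = retry(flaky, 3);
const failedOver = firstSuccessful<string>([
  async () => { throw new Error('primary down'); },
  async () => 'from fallback',
]);
```

Here `retried` resolves to `'ok'` after three calls, and `failedOver` resolves to `'from fallback'`. A cross-provider failover list is the same ordered-fallback loop with provider calls as the tasks.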
301
288
 
302
289
  ---
303
290
 
304
- ## 26 Samples
291
+ ## Samples
305
292
 
306
293
  `test/samples/` — runnable with `vitest`:
307
294
 
308
295
  | # | Sample | What it demonstrates |
309
296
  |---|--------|---------------------|
310
297
  | 01-16 | Core patterns | LLMCall, Agent, RAG, FlowChart, Swarm, recorders, tools, security, errors, multi-modal |
311
- | 17 | **Instructions** | defineInstruction, decide(), conditional activation, Decision Scope |
298
+ | 17 | **Conditional Behavior** | defineInstruction, decide(), conditional activation, Decision Scope |
312
299
  | 18 | **Streaming Events** | AgentStreamEvent lifecycle, tool events, SSE |
313
- | 19 | **Security** | gatedTools, PermissionPolicy, role-based tool access |
314
- | 20 | **Grounding** | ExplainRecorder — sources, claims, decisions |
300
+ | 19 | **Tool Gating** | gatedTools, PermissionPolicy, role-based tool access |
315
301
  | 21 | **SSE Server** | Express SSE endpoint with SSEFormatter |
316
302
  | 22 | **Resilience** | withRetry, withFallback, provider failover |
317
303
  | 23 | **Memory Stores** | redisStore, postgresStore, dynamoStore adapters |
318
304
  | 24 | **Structured Output** | outputSchema, Zod auto-convert, zodToJsonSchema |
319
- | 25 | **OTel Recorder** | OpenTelemetry spans with mock tracer |
320
- | 26 | **Explain Recorder** | ExplainRecorder: sources, claims, decisions during traversal |
321
-
322
- ---
323
-
324
- ## Architecture — 5 Layers
325
-
326
- ```
327
- Layer 1: BUILD → concepts/ Single LLM (LLMCall, Agent, RAG)
328
- Multi-Agent (FlowChart, Parallel, Swarm)
329
- tools/ defineTool, ToolRegistry, askHuman
330
-
331
- Layer 2: COMPOSE → lib/loop/ buildAgentLoop — the ReAct engine
332
- lib/slots/ SystemPrompt, Messages, Tools subflows
333
-
334
- Layer 3: EVALUATE → recorders/ ExplainRecorder — per-iteration evaluation
335
- explain obs.explain() → { iterations, sources, claims, context }
336
-
337
- Layer 4: MONITOR → recorders/ TokenRecorder, CostRecorder, ToolUsageRecorder
338
- streaming/ AgentStreamEvent, SSEFormatter
339
- narrative Human-readable execution story (footprintjs)
340
-
341
- Layer 5: INFRASTRUCTURE → adapters/ Anthropic, OpenAI, Bedrock, Mock, MCP, A2A
342
- providers/ Prompt, Message, Tool strategies
343
- memory/ Conversation stores (Redis, Postgres, DynamoDB)
344
- ```
345
-
346
- Each folder has a README explaining what, when, and how. Start at Layer 1, add layers as you need them.
347
-
348
- Built on [footprintjs](https://github.com/footprintjs/footPrint) — the flowchart pattern for backend code. One DFS traversal, three observer systems (scope/flow/agent), connected data out.
305
+ | 25 | **OTel** | OpenTelemetry spans with mock tracer |
306
+ | 26 | **Explain Recorder** | ExplainRecorder: sources, claims, decisions, per-iteration eval |
349
307
 
350
308
  ---
351
309
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agentfootprint",
3
- "version": "1.4.1",
3
+ "version": "1.4.2",
4
4
  "description": "The explainable agent framework — build AI agents you can explain, audit, and trust. Built on footprintjs.",
5
5
  "license": "MIT",
6
6
  "author": "Sanjay Krishna Anbalagan",