agentfootprint 2.6.1 → 2.6.2
- package/README.md +28 -63
- package/package.json +1 -1
package/README.md
CHANGED
@@ -222,11 +222,25 @@ The framework owns the loop. The framework re-evaluates triggers every iteration
 
 **The flowchart-pattern substrate** ([footprintjs](https://github.com/footprintjs/footPrint)) is what makes the observation automatic. Every stage execution is a typed event during one DFS traversal — no instrumentation, no post-processing. Same way React DevTools shows you the component tree because React owns the render path, agentfootprint shows you the slot composition because agentfootprint owns the prompt path.
 
-###
+### When to use Dynamic ReAct
+
+Use it when **your tools have dependencies** — when one tool's output
+implies which tool to call next.
+
+A skill body like *"if `get_port_errors` reports CRC > 0, call
+`get_sfp_diag` next; if it reports `signal_loss`, call `get_flogi`
+next"* IS a dependency graph. The skill encodes the workflow; Dynamic
+ReAct gates the tool surface to that workflow at runtime.
+
+If your tools are independent (the LLM can call any of them at any
+time, ordering doesn't matter), Classic ReAct is fine and simpler —
+don't reach for Skills.
+
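To make the "skill body IS a dependency graph" claim concrete, the quoted rule can be written as an edge list. This is only a sketch: the `Edge` type, the `nextTools` helper, and the `PortErrorsResult` shape (including modelling `signal_loss` as a boolean) are assumptions for illustration, not agentfootprint APIs. Only the three tool names come from the README text.

```typescript
// Hypothetical encoding of the quoted skill body as a dependency graph.
// None of these types or helpers are agentfootprint APIs.
type ToolName = "get_port_errors" | "get_sfp_diag" | "get_flogi";

// Assumed result shape for get_port_errors (signal_loss modelled as a flag).
interface PortErrorsResult {
  crc: number;
  signalLoss: boolean;
}

// One edge: after `from` returns, call `to` if `when(result)` holds.
interface Edge {
  from: ToolName;
  to: ToolName;
  when: (result: PortErrorsResult) => boolean;
}

const edges: Edge[] = [
  { from: "get_port_errors", to: "get_sfp_diag", when: (r) => r.crc > 0 },
  { from: "get_port_errors", to: "get_flogi", when: (r) => r.signalLoss },
];

// Returns the follow-up tools the rule selects, in declaration order.
function nextTools(from: ToolName, result: PortErrorsResult): ToolName[] {
  return edges
    .filter((e) => e.from === from && e.when(result))
    .map((e) => e.to);
}

const crcOnly = nextTools("get_port_errors", { crc: 3, signalLoss: false });
// crcOnly is ["get_sfp_diag"]
```

When no edge leaves a tool, the result is an empty list, which is the independent-tools case where ordering does not matter.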
+### Side-by-side example
 
 [`examples/dynamic-react/`](./examples/dynamic-react/) ships two
-mock-backed scripts solving the same
-
+mock-backed scripts solving the same task. Per-iteration tool-count
+progression makes the shape clear:
 
 ```
 Classic ReAct Dynamic ReAct
@@ -238,76 +252,27 @@ iter 4: 12 tools shown iter 4: 5 tools
 iter 5: 5 tools (final answer)
 ```
 
-The
-
-on every call.
-
-### Real Anthropic benchmark — 3 models × 2 modes
-
-We ran a real production-shaped agent (10 skills, 18 tools after dedup)
-against Anthropic with Haiku 4.5, Sonnet 4.5, and Opus 4.5 in both
-modes. Same prompt, same scenario data, real `usage.input_tokens`:
-
-| Model | Classic in | Dynamic in | Δ | Notes |
-| ----------- | ---------: | ---------: | -----: | ---------------------------------- |
-| Haiku 4.5 | 25,755 | 36,341 | +41% | Classic 4 iters / Dynamic 6 iters |
-| Sonnet 4.5 | 36,690 | 28,486 | −22% | Classic went serial; Dynamic wins |
-| Opus 4.5 | 20,114 | 28,401 | +41% | Opus's parallel batching is best |
-
-**Reality check**: at Neo's 18-tool / 10-skill scale, Dynamic ReAct's
-total input-token cost depends on **how aggressively the model
-parallelizes Classic mode**. Opus parallelizes best (3 iters, all
-data tools in one round) so Classic minimizes iterations and wins.
-Sonnet went serial that turn (5 iters) so Dynamic won. Haiku
-parallelized well (4 iters) so Classic won.
+The unactivated skills' tools never enter the LLM context. Classic
+ReAct has no equivalent — every registered tool ships on every call.
 
-
-mood). Dynamic stays predictable at 5–6 iters across all models.
+What Dynamic gives you that Classic doesn't:
 
-
-
-
-`[18, 18, 18]` for Classic, regardless of registry size. Scales
-to 50+ tool catalogs without ballooning per-call cost.
-2. **Deterministic routing**: `read_skill` forces scope before data
+1. **Constant per-call payload** bounded by active-skill size, not
+registry size. Scales to 50+ tool catalogs.
+2. **Deterministic routing** — `read_skill` forces scope before data
 tools fire. LLM can't drift to off-topic tools.
-3. **Auditability
-`activatedInjectionIds
-
-
-Classic varies 80%+ run-to-run depending on parallelization.
-
-### Where Dynamic ReAct WILL win on cost
-
-The break-even is roughly **30+ tools across 8+ skills**. Below that,
-classic mode often wins on raw tokens. Above it, Dynamic dominates
-because Classic's tool-description payload grows linearly with the
-catalog while Dynamic stays flat at active-skill size:
-
-```
-Classic ReAct: 5 iters × 50 tools = 250 descriptions
-Dynamic ReAct: 1 iter × 1 tool (read_skill)
-             + 4 iters × 5 tools (1 active) = 21 descriptions
-             ────────────
-             −92% on tool payload
-```
+3. **Auditability** — each iteration's tool list is a pure function of
+`activatedInjectionIds`. Recorded, replayable, diff-able across runs.
+4. **Less hallucination** — fewer tools per call = more in-distribution
+on the active task.
 
-
-to hallucinate or pick the wrong tool**. Narrower context = more
-in-distribution on the active task. Increasingly load-bearing as
-catalogs grow.
-
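The "pure function of `activatedInjectionIds`" point and the 250-versus-21 description count can be checked together with a small sketch. Everything here except the names `read_skill` and `activatedInjectionIds` is hypothetical: the `Skill` shape, the `toolsFor` helper, the 10 skills of 5 tools each, and the choice to expose `read_skill` only while no skill is active are assumptions made to mirror the arithmetic in the diff, not the library's actual behaviour.

```typescript
// Hypothetical registry shape, for illustration only.
interface Skill {
  id: string;
  tools: string[];
}

// Pure function: the same skills and activated ids always yield the same list.
function toolsFor(skills: Skill[], activatedInjectionIds: string[]): string[] {
  const names: string[] = [];
  for (const skill of skills) {
    if (!activatedInjectionIds.includes(skill.id)) continue;
    for (const tool of skill.tools) {
      if (!names.includes(tool)) names.push(tool); // dedupe shared tools
    }
  }
  // Assumption: before any skill is active, only `read_skill` is exposed.
  return names.length === 0 ? ["read_skill"] : names;
}

// Assumed catalog: 10 skills × 5 tools = 50 registered tools.
const skills: Skill[] = Array.from({ length: 10 }, (_, s) => ({
  id: `skill_${s}`,
  tools: Array.from({ length: 5 }, (_, t) => `skill_${s}_tool_${t}`),
}));

// Activated ids per iteration: none on iteration 1, then one active skill.
const history: string[][] = [[], ["skill_3"], ["skill_3"], ["skill_3"], ["skill_3"]];

const dynamicCounts = history.map((ids) => toolsFor(skills, ids).length); // [1, 5, 5, 5, 5]
const dynamicTotal = dynamicCounts.reduce((a, b) => a + b, 0); // 21
const classicTotal = history.length * 50; // 5 iters × 50 tools = 250
const savingPct = Math.round((1 - dynamicTotal / classicTotal) * 100); // 92
```

Because `toolsFor` reads nothing but its arguments, replaying a recorded `history` reproduces each iteration's tool list exactly, which is what makes the runs diff-able.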
-Run the side-by-side yourself:
+Run it:
 
 ```sh
 TSX_TSCONFIG_PATH=examples/runtime.tsconfig.json npx tsx examples/dynamic-react/01-classic-react.ts
 TSX_TSCONFIG_PATH=examples/runtime.tsconfig.json npx tsx examples/dynamic-react/02-dynamic-react.ts
 ```
 
-Or for the real-Anthropic version, see
-[`scripts/run-comparison.ts`](https://github.com/footprintjs/neo-mds-triage/blob/main/scripts/run-comparison.ts)
-in the Neo repo.
-
 ---
 
 ## What you can build
package/package.json
CHANGED