bare-agent 0.11.0 → 0.12.1

Files changed (68)
  1. package/README.md +1 -0
  2. package/bareagent.context.md +1149 -0
  3. package/bin/cli.d.ts +4 -0
  4. package/bin/cli.js +40 -10
  5. package/bin/test-provider.d.ts +2 -0
  6. package/bin/test-provider.js +5 -1
  7. package/index.d.ts +20 -0
  8. package/package.json +46 -10
  9. package/src/bareguard-adapter.d.ts +118 -0
  10. package/src/bareguard-adapter.js +75 -3
  11. package/src/checkpoint.d.ts +61 -0
  12. package/src/checkpoint.js +17 -8
  13. package/src/circuit-breaker.d.ts +70 -0
  14. package/src/circuit-breaker.js +20 -4
  15. package/src/errors.d.ts +106 -0
  16. package/src/errors.js +50 -1
  17. package/src/loop.d.ts +135 -0
  18. package/src/loop.js +73 -17
  19. package/src/mcp-bridge.d.ts +133 -0
  20. package/src/mcp-bridge.js +179 -27
  21. package/src/mcp.d.ts +4 -0
  22. package/src/memory.d.ts +50 -0
  23. package/src/memory.js +22 -2
  24. package/src/planner.d.ts +62 -0
  25. package/src/planner.js +26 -7
  26. package/src/provider-anthropic.d.ts +55 -0
  27. package/src/provider-anthropic.js +32 -11
  28. package/src/provider-clipipe.d.ts +86 -0
  29. package/src/provider-clipipe.js +28 -18
  30. package/src/provider-fallback.d.ts +44 -0
  31. package/src/provider-fallback.js +18 -8
  32. package/src/provider-ollama.d.ts +41 -0
  33. package/src/provider-ollama.js +27 -7
  34. package/src/provider-openai.d.ts +57 -0
  35. package/src/provider-openai.js +31 -16
  36. package/src/providers.d.ts +6 -0
  37. package/src/providers.js +8 -0
  38. package/src/retry.d.ts +44 -0
  39. package/src/retry.js +15 -1
  40. package/src/run-plan.d.ts +126 -0
  41. package/src/run-plan.js +46 -13
  42. package/src/scheduler.d.ts +102 -0
  43. package/src/scheduler.js +32 -4
  44. package/src/state.d.ts +45 -0
  45. package/src/state.js +18 -2
  46. package/src/store-jsonfile.d.ts +85 -0
  47. package/src/store-jsonfile.js +33 -8
  48. package/src/store-sqlite.d.ts +90 -0
  49. package/src/store-sqlite.js +31 -7
  50. package/src/stores.d.ts +3 -0
  51. package/src/stream.d.ts +79 -0
  52. package/src/stream.js +32 -0
  53. package/src/tools.d.ts +8 -0
  54. package/src/transport-jsonl.d.ts +30 -0
  55. package/src/transport-jsonl.js +13 -0
  56. package/src/transports.d.ts +2 -0
  57. package/tools/browse.d.ts +10 -0
  58. package/tools/browse.js +2 -0
  59. package/tools/defer.d.ts +33 -0
  60. package/tools/defer.js +12 -3
  61. package/tools/mobile.d.ts +34 -0
  62. package/tools/mobile.js +28 -15
  63. package/tools/shell.d.ts +31 -0
  64. package/tools/shell.js +55 -6
  65. package/tools/spawn.d.ts +107 -0
  66. package/tools/spawn.js +24 -5
  67. package/types/index.d.ts +66 -0
  68. package/types/shims.d.ts +16 -0
@@ -0,0 +1,1149 @@
1
+ # bareagent — Integration Guide
2
+
3
+ > For AI assistants and developers wiring bareagent into a project.
4
+ > v0.12.1 | Node.js >= 18 | one required dep (`bareguard ^0.4.2`) | Apache 2.0
5
+ >
6
+ > Full human guide with composition examples, design philosophy, and recipes: [Usage Guide](docs/02-features/usage-guide.md)
7
+
8
+ ## What this is
9
+
10
+ bareagent is a lightweight agent orchestration library (~2.4K lines of core, one required dep). It provides composable components for LLM tool-calling loops, goal planning, state tracking, scheduled actions, human approval gates, persistent memory, circuit breaking, provider fallback, single-gate governance via [bareguard](https://npmjs.com/package/bareguard), cross-platform shell tools, and an MCP bridge. All components are independent — use one, use all, or bring your own.
11
+
12
+ ```
13
+ npm install bare-agent
14
+ ```
15
+
16
+ Eight entry points:
17
+ - `require('bare-agent')` — Loop, Planner, StateMachine, Scheduler, Checkpoint, Memory, Stream, Retry, runPlan, CircuitBreaker, wireGate, defaultActionTranslator, BareAgentError, ProviderError, ToolError, TimeoutError, ValidationError, CircuitOpenError, **HaltError**
18
+ - `require('bare-agent/errors')` — same error classes via a stable subpath (v0.10.1+) for adopters who want to import only the error surface
19
+ - `require('bare-agent/providers')` — OpenAI, Anthropic, Ollama, CLIPipe, Fallback (the canonical short names; `*Provider` aliases — `OpenAIProvider`, `AnthropicProvider`, etc. — are also exported and match the class names, so either destructure works, v0.12.1+)
20
+ - `require('bare-agent/stores')` — SQLite (FTS5), JsonFile
21
+ - `require('bare-agent/transports')` — JsonlTransport
22
+ - `require('bare-agent/tools')` — createBrowsingTools, createMobileTools, createShellTools, createSpawnTool, createDeferTool, spawnChild, readDeferQueue
23
+ - `require('bare-agent/mcp')` — createMCPBridge (returns `tools` + `metaTools`), discoverServers, buildMetaTools
24
+ - `require('bare-agent/bareguard')` — wireGate (one-line bareguard Gate integration), defaultActionTranslator
25
+
26
+ **TypeScript:** bareagent is pure JS + JSDoc but ships `.d.ts` declarations (generated from that JSDoc, v0.11+). Every entry point above resolves types automatically — `import { Loop } from 'bare-agent'` and `import { OpenAI } from 'bare-agent/providers'` give full autocomplete and type-checking, including required-option enforcement (e.g. `new Loop({})` is a type error: `provider` is required). No `@types/bare-agent` needed. Shared shapes (`Provider`, `Message`, `ToolDef`, `ToolCall`, `Usage`, `GenerateResult`, `Store`) are exported from the package's `types/`. The repo itself runs `tsc --checkJs` in CI on every push/PR so the JSDoc and the code can't drift.
27
+
28
+ ## Which components do I need?
29
+
30
+ | I want to... | Use these |
31
+ |---|---|
32
+ | Call an LLM with tools and get a result | Loop + a Provider |
33
+ | Break a goal into steps | Planner + a Provider |
34
+ | Execute a step DAG with parallelism | runPlan + executeFn |
35
+ | Track task state (pending/running/done/failed) | StateMachine |
36
+ | Run agent turns on a schedule (cron, timers) | Scheduler |
37
+ | Require human approval before dangerous actions | Checkpoint |
38
+ | Persist context across turns/sessions | Memory + a Store |
39
+ | Observe what the agent is doing | Stream |
40
+ | Retry on transient failures (429, timeouts) | Retry |
41
+ | Add jitter to backoff delays | Retry({ jitter: 'full' }) |
42
+ | Fail fast on repeated provider errors | CircuitBreaker |
43
+ | Fall back to another provider on failure | FallbackProvider |
44
+ | Retry individual plan steps | runPlan({ stepRetry }) |
45
+ | Use a CLI tool as an LLM provider | CLIPipe |
46
+ | Health-check provider, store, and tools | Loop.validate() |
47
+ | Track cost per run | Automatic — `result.cost` and `loop:done` event |
48
+ | Catch typed errors programmatically | ProviderError, ToolError, TimeoutError, CircuitOpenError |
49
+ | Cache identical planner calls | Planner({ cacheTTL: 60000 }) |
50
+ | Stream CLIPipe output in real-time | CLIPipeProvider({ onChunk: fn }) |
51
+ | Browse the web (inline snapshots) | createBrowsingTools + Loop |
52
+ | Browse the web (token-efficient, disk-based) | `barebrowse` CLI session — snapshots to `.barebrowse/*.yml` |
53
+ | Assess website privacy risk | createBrowsingTools + Loop (requires `npm install wearehere`) |
54
+ | Control Android/iOS devices | createMobileTools + Loop |
55
+ | Control mobile (token-efficient, disk-based) | `baremobile` CLI session — snapshots to `.baremobile/*.yml` |
56
+ | Read files, list directories, run shell commands, grep | createShellTools + Loop({ policy }) |
57
+ | Auto-discover MCP servers from IDE configs | createMCPBridge |
58
+ | Gate MCP tools with allow/deny lists | createMCPBridge + `.mcp-bridge.json` |
59
+ | Gate every tool call with one policy hook | `wireGate(gate).policy` → `Loop({ policy })` |
60
+ | Route policy decisions per user / tenant / chat | `wireGate(gate).policy` + `loop.run(msgs, tools, { ctx })` (ctx routes to bareguard's check via `_ctx`) |
61
+ | Cap total USD spend per run | `new Gate({ budget: { maxCostUsd: 0.50 } })` |
62
+ | Cap total tool-calling rounds | `new Gate({ limits: { maxTurns: 20 } })` |
63
+ | Audit every gated event to JSONL | `new Gate({ audit: { path: './audit.jsonl' } })` |
64
+ | Allowlist filesystem paths for shell tools | `new Gate({ fs: { readScope, writeScope, deny } })` |
65
+ | Allowlist `argv[0]` for shell_run | `new Gate({ bash: { allow: [...], denyPatterns: [...] } })` |
66
+ | Auto-deny Checkpoint prompts that never get a reply | Checkpoint({ timeout: 300000 }) |
67
+ | Get one hook for every silent-ish failure | Loop({ onError }) + `loop:error` stream events |
68
+ | Send messages across WhatsApp/iMessage/Signal/Discord/Slack/Telegram | createMCPBridge + beeperbox |
69
+ | **Spawn a child specialist agent** | createSpawnTool + bin/cli.js --config (v0.9+) |
70
+ | **Defer an action for later (cron-fired)** | createDeferTool + examples/wake.sh (v0.9+) |
71
+ | **Expose a large MCP catalog dynamically** | createMCPBridge → bridge.metaTools (v0.9+) |
72
+
73
+ **Most projects start with Loop + Provider.** Add components as needed.
74
+
75
+ ## Minimal wiring: Loop + Provider + Tool
76
+
77
+ ```javascript
78
+ const { Loop } = require('bare-agent');
79
+ const { OpenAI } = require('bare-agent/providers');
80
+
81
+ const provider = new OpenAI({
82
+   apiKey: process.env.OPENAI_API_KEY,
83
+   model: 'gpt-4o-mini',
84
+ });
85
+
86
+ const tools = [{
87
+   name: 'get_weather',
88
+   description: 'Get weather for a city',
89
+   parameters: {
90
+     type: 'object',
91
+     properties: { city: { type: 'string' } },
92
+     required: ['city'],
93
+   },
94
+   execute: async ({ city }) => ({ temp: 22, city, conditions: 'sunny' }),
95
+ }];
96
+
97
+ const loop = new Loop({ provider });
98
+ const result = await loop.run(
99
+   [{ role: 'user', content: 'What is the weather in Berlin?' }],
100
+   tools
101
+ );
102
+ // result: { text: "The weather in Berlin is 22°C and sunny.", toolCalls: [], usage: {...}, cost: 0.00045, error: null }
103
+ // cost = estimated USD based on model + token usage. Throws on error by default.
104
+ ```
105
+
106
+ ## Health check with validate()
107
+
108
+ ```javascript
109
+ const result = await loop.validate(tools);
110
+ // result: {
111
+ //   provider: { ok: true },
112
+ //   store: { ok: true, skipped: false },
113
+ //   tools: { ok: true }
114
+ // }
115
+ // Never throws — all failures captured in the return structure.
116
+ // Store check skipped if no store was passed to Loop constructor.
117
+ ```
118
+
119
+ ## Wiring with Memory
120
+
121
+ ```javascript
122
+ const { Loop, Memory } = require('bare-agent');
123
+ const { OpenAI } = require('bare-agent/providers');
124
+ const { SQLite } = require('bare-agent/stores');
125
+
126
+ const store = new SQLite({ path: './agent-memory.db' });
127
+ const memory = new Memory({ store });
128
+
129
+ // Store context
130
+ memory.store('User prefers window seats on flights', { type: 'preference' });
131
+
132
+ // Search before a turn — inject results as system context
133
+ const relevant = memory.search('flight preferences', { limit: 5 });
134
+ const context = relevant.map(r => r.content).join('\n');
135
+
136
+ const loop = new Loop({
137
+   provider: new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
138
+   system: `Use this context:\n${context}`,
139
+ });
140
+ ```
141
+
142
+ ## Multi-agent: spawn + defer + wake (v0.9)
143
+
144
+ Three primitives, no framework. The "always-on" feeling of multi-agent
145
+ systems is an illusion produced by *frequent stateless wakeups over
146
+ persistent JSONL*. UNIX figured this out in 1973.
147
+
148
+ ```javascript
149
+ const { Loop, wireGate } = require('bare-agent');
150
+ const { Gate } = require('bareguard');
151
+ const { createSpawnTool, createDeferTool } = require('bare-agent/tools');
152
+
153
+ const gate = new Gate({
154
+   budget: { maxCostUsd: 0.50 }, // shared across the family via BAREGUARD_BUDGET_FILE
155
+   limits: { maxTurns: 20, maxChildren: 3, maxDepth: 2 },
156
+   spawn: { ratePerMinute: 5 }, // bareguard 0.2, per-family
157
+   defer: { ratePerMinute: 10 }, // bareguard 0.2, per-family
158
+   audit: { path: './bareagent-audit.jsonl' },
159
+   humanChannel: async () => ({ decision: 'deny' }),
160
+ });
161
+ await gate.init();
162
+
163
+ const { policy, onLlmResult, onToolResult, filterTools } = wireGate(gate);
164
+ const { tool: spawn } = createSpawnTool();
165
+ const { tool: defer } = createDeferTool();
166
+
167
+ const tools = await filterTools([spawn, defer, ...otherTools]);
168
+ const loop = new Loop({ provider, policy, onLlmResult, onToolResult });
169
+ await loop.run(messages, tools);
170
+ ```
171
+
172
+ **`spawn({ config, input? })`** — fork a child bareagent process with the
173
+ given config file path (a JSON specialist definition). Blocks until the
174
+ child exits; returns `{ text, usage, cost, error, events }`. The child is
175
+ invoked as `bare-agent --config <path>` (see `bin/cli.js` config-mode);
176
+ env-vars `BAREGUARD_AUDIT_PATH`, `BAREGUARD_PARENT_RUN_ID`,
177
+ `BAREGUARD_BUDGET_FILE`, `BAREGUARD_SPAWN_DEPTH+1` are threaded
178
+ automatically. Child stderr is captured and re-emitted as
179
+ `{type: 'child:stderr', text, ts}` events on the parent's stream — one
180
+ JSONL channel per child, no two-stream split. **The child config must
181
+ declare a `gate` block (v0.11.0)** — `bin/cli.js` refuses a gate-less
182
+ config (`exit 1`) rather than run a child with no policy/budget/depth
183
+ limits, since the parent gate only sees the config *path*, not its
184
+ contents. Set `"ungoverned": true` in the config to explicitly opt out.
185
+
186
+ **`defer({ action, when })`** — append a JSONL record to the queue file
187
+ (default `./bareagent-defers.jsonl`, override `BAREAGENT_DEFER_QUEUE`).
188
+ bareagent does NOT wake up later; the running process exits when the
189
+ loop ends. An external scheduler (cron + `examples/wake.sh`) reads the
190
+ queue and re-invokes bareagent at fire time. Returns `{ id }`.
191
+
192
+ **Two-phase defer (defense in depth):**
193
+
194
+ 1. **Emit** (the `defer` tool): one `gate.check` on `{ type: 'defer', args: { action, when } }`.
195
+ Runs the full pipeline — `defer.ratePerMinute` cap, `tools.allowlist`
196
+ on `defer`, `content.*` over the JSON-serialized form. Bareguard does
197
+ NOT extract `args.action` and run a second pipeline against it at
198
+ emit time.
199
+ 2. **Fire** (`wake.sh` invokes bareagent): a fresh `gate.check` on the
200
+ inner action — full pipeline against it as if it had been called
201
+ directly. Two distinct gate.check calls, two distinct audit lines,
202
+ reconstructable via `parent_run_id`.
203
+
204
+ **Per-family rate caps.** `spawn.ratePerMinute` and `defer.ratePerMinute`
205
+ count audit-log records in a trailing 60s window keyed by the root
206
+ `run_id`. A fork-bombing child can't evade the parent's cap by spawning
207
+ its own children — they all share the family count. Defaults: defer
208
+ 15/min, spawn 10/min.
209
+
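The family-wide window count described above can be sketched roughly as follows. This only illustrates the counting idea and is not bareguard's implementation; the record fields (`type`, `root_run_id`, `ts`) and the helper names `countInWindow` and `underCap` are assumptions made for the example.

```javascript
// Hypothetical sketch: count audit records of one type for one family
// (keyed by the root run id) inside a trailing 60-second window.
// Field names (type, root_run_id, ts) are assumptions, not bareguard's schema.
const WINDOW_MS = 60_000;

function countInWindow(records, { type, rootRunId, now }) {
  return records.filter((r) =>
    r.type === type &&
    r.root_run_id === rootRunId &&
    r.ts > now - WINDOW_MS &&
    r.ts <= now
  ).length;
}

// A new action is allowed only while the family count is below the cap.
function underCap(records, opts, ratePerMinute) {
  return countInWindow(records, opts) < ratePerMinute;
}

const now = 1_000_000;
const audit = [
  { type: 'spawn', root_run_id: 'root-1', ts: now - 90_000 }, // outside the window
  { type: 'spawn', root_run_id: 'root-1', ts: now - 30_000 }, // spawned by the parent
  { type: 'spawn', root_run_id: 'root-1', ts: now - 5_000 },  // spawned by a child, same family
  { type: 'spawn', root_run_id: 'root-2', ts: now - 1_000 },  // different family
  { type: 'defer', root_run_id: 'root-1', ts: now - 1_000 },  // different action type
];

const spawnCount = countInWindow(audit, { type: 'spawn', rootRunId: 'root-1', now });
```

Because the key is the root `run_id`, the child's spawn and the parent's spawn land in the same bucket, which is what stops a child from resetting the count by forking.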
210
+ **Reference cron + wake script:** `examples/wake.sh` (with
211
+ `examples/wake.md` for setup). The script folds the defer queue with
212
+ `jq -n` (null-input, so `inputs` reads *every* record — without `-n` the
213
+ first queue line is consumed as `jq`'s implicit input and silently
214
+ skipped), picks records where `when <= now() AND status === 'pending'`,
215
+ appends a `'fired'` line, and shells out to `bare-agent --config
216
+ <orchestrator>` with the inner action as stdin.
217
+
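For readers who prefer JavaScript to `jq`, the fold that the wake script performs might look like the sketch below. The record shape (`id`, `action`, `when` as an ISO timestamp, `status`) and the "last line for an id wins" merge rule are assumptions for illustration; check `examples/wake.sh` for the actual queue format.

```javascript
// Hypothetical sketch of the fold: the last line seen for an id wins,
// so an appended 'fired' line supersedes the earlier 'pending' one.
// The record shape ({ id, action, when, status }) is an assumption.
function dueDefers(jsonlText, nowMs) {
  const latest = new Map();
  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue; // skip blank lines
    const rec = JSON.parse(line);
    latest.set(rec.id, { ...latest.get(rec.id), ...rec });
  }
  return [...latest.values()].filter(
    (r) => r.status === 'pending' && Date.parse(r.when) <= nowMs
  );
}

const queue = [
  '{"id":"a","action":"check inbox","when":"2024-01-01T09:00:00Z","status":"pending"}',
  '{"id":"b","action":"send report","when":"2024-01-01T10:00:00Z","status":"pending"}',
  '{"id":"c","action":"archive logs","when":"2024-01-01T08:00:00Z","status":"pending"}',
  '{"id":"c","status":"fired"}',
].join('\n');

// At 09:30, "a" is due, "b" is still in the future, and "c" has already fired.
const due = dueDefers(queue, Date.parse('2024-01-01T09:30:00Z'));
```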
218
+ **End-to-end orchestrator example:** `examples/orchestrator/` ships a
219
+ parent + two specialists (summarizer, researcher). The orchestrator's
220
+ "intelligence" is its system prompt — there's no `class Orchestrator`,
221
+ no `dispatch_to_specialist()`. Roles are configs, not types. Adding a
222
+ new specialist is one JSON file.
223
+
224
+ ### MCP catalog: bulk vs metaTools (v0.9)
225
+
226
+ `createMCPBridge()` now returns BOTH surfaces. Pick by catalog size:
227
+
228
+ ```javascript
229
+ const bridge = await createMCPBridge();
230
+ // bridge.tools — bulk-loaded array (every MCP tool, name-prefixed).
231
+ // LLM sees them all upfront. Token-cheap upfront, token-
232
+ // expensive per turn if catalog is big.
233
+ // bridge.metaTools — [mcp_discover, mcp_invoke] LLM-callable pair.
234
+ // Two tool slots in the LLM's view; LLM calls
235
+ // mcp_discover() to list, then mcp_invoke({ name, args })
236
+ // to use. Token-cheap per turn, slightly more turns
237
+ // if the LLM needs to discover.
238
+ ```
239
+
240
+ Wire one or the other into Loop's tool array — never both (the LLM would
241
+ see the same MCP tool twice). Same RPC connections under the hood; one
242
+ factory, one source of truth, two output forms. **Lean: ~10 tools or
243
+ fewer → bulk. ~50+ tools → metaTools.**
244
+
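One way to apply the rule of thumb above is a small selector that returns exactly one of the two surfaces. The helper name `pickMcpSurface` and the default threshold are hypothetical; catalogs between roughly 10 and 50 tools are a judgment call, and this sketch simply sends them to metaTools.

```javascript
// Hypothetical helper: pick one MCP surface by catalog size, never both.
// bulkMax = 10 follows the rule of thumb; mid-sized catalogs default to metaTools here.
function pickMcpSurface(bridge, bulkMax = 10) {
  return bridge.tools.length <= bulkMax ? bridge.tools : bridge.metaTools;
}

// Stand-in bridge objects with the { tools, metaTools } shape described above.
const metaTools = [{ name: 'mcp_discover' }, { name: 'mcp_invoke' }];
const makeCatalog = (n) => Array.from({ length: n }, (_, i) => ({ name: `srv_tool_${i}` }));

const small = { tools: makeCatalog(4), metaTools };
const large = { tools: makeCatalog(60), metaTools };

const smallPick = pickMcpSurface(small); // the bulk-loaded tools
const largePick = pickMcpSurface(large); // the discover/invoke pair
```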
245
+ Bareguard governs both forms, with one quirk for metaTools: it sees
246
+ `action.type === 'mcp_invoke'` (not the canonical inner name), and the
247
+ invoked tool name lives in `args.name`. To deny specific MCP tools when
248
+ using metaTools, use `tools.denyArgPatterns: { mcp_invoke: [/"name":"linear_admin_/] }`
249
+ or `content.denyPatterns` over the serialized action.
250
+
251
+ **Vetting server commands (v0.11.0).** Connecting to a server runs its
252
+ `command`, and discovery reads `.mcp.json` from the cwd (an untrusted
253
+ repo) as well as your home/IDE configs. Pass `confirmServer(name, def)
254
+ => boolean` to `createMCPBridge` to approve each server **before its
255
+ command is spawned** (return `false` to skip it; a throw fails closed).
256
+ Default trusts all discovered servers — unchanged behavior.
257
+
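A `confirmServer` callback could be as simple as a command allowlist. The sketch below assumes that `def.command` holds the executable string, as in typical `.mcp.json` entries; the allowlist contents are illustrative only.

```javascript
// Hypothetical confirmServer callback: approve a server only when its
// command is a bare executable name on a short allowlist.
// Assumes def.command is the executable string from the server definition.
const path = require('node:path');

const TRUSTED_COMMANDS = new Set(['npx', 'node', 'uvx']);

function confirmServer(name, def) {
  if (!def || typeof def.command !== 'string') return false;
  // A bare name like "npx" passes; a path like "./evil/npx" does not.
  const isBareName = path.basename(def.command) === def.command;
  return isBareName && TRUSTED_COMMANDS.has(def.command);
}
```

The function would then be handed to `createMCPBridge` as described above (presumably as an option; the exact option shape is not shown here). Rejecting anything that is not a bare, allowlisted name means a repo-local script checked into an untrusted `.mcp.json` is skipped rather than spawned.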
258
+ ## Wiring with bareguard
259
+
260
+ Every tool call (native, MCP, browsing, mobile, user-defined) flows through `Loop.run()`. The `policy` option is the single chokepoint; the recommended wiring delegates every decision to a [bareguard](https://github.com/hamr0/bareguard) `Gate`. Bareguard owns the audit log, budget caps, content rules, fs/net/bash primitives, and humanChannel — bareagent just respects the verdict.
261
+
262
+ ```javascript
263
+ const { Gate } = require('bareguard');
264
+ const { Loop, wireGate } = require('bare-agent');
265
+ const { OpenAI } = require('bare-agent/providers');
266
+ const { createShellTools } = require('bare-agent/tools');
267
+
268
+ const gate = new Gate({
269
+   budget: { maxCostUsd: 0.50 },
270
+   limits: { maxTurns: 20 },
271
+   fs: { readScope: ['/tmp', '~/Projects'], deny: ['/etc'] },
272
+   bash: { allow: ['ls', 'cat', 'grep', 'ps', 'df'] }, // argv[0] allowlist
273
+   audit: { path: './audit.jsonl' },
274
+   humanChannel: async (event) => ({ decision: 'deny' }), // wire to your UI
275
+   // humanChannelTimeoutMs: 60_000, // optional (bareguard ≥0.3): timeout-deny if your channel hangs
276
+ });
277
+ await gate.init();
278
+
279
+ const { policy, onLlmResult, onToolResult, filterTools } = wireGate(gate);
280
+ const { tools: shellTools } = createShellTools();
281
+ const tools = await filterTools(shellTools); // drop denied tools from the LLM's view
282
+
283
+ const loop = new Loop({
284
+   provider: new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
285
+   policy,
286
+   onLlmResult, // forwards LLM cost to gate.record (BA1)
287
+   onToolResult, // forwards tool result + ctx (BA1)
288
+ });
289
+
290
+ const result = await loop.run(messages, tools, { ctx: { userId: 42 } });
291
+ if (result.error?.startsWith('halt:')) {
292
+   // budget / turn cap / gate terminated; handled cleanly, no [HALT:] reached the LLM (BA2)
293
+ }
294
+ ```
295
+
296
+ **Why four pieces (`policy` + `onLlmResult` + `onToolResult` + `filterTools`).** `policy` runs `gate.check` *before* every tool call. `onLlmResult` fires after every successful `provider.generate` — without it, `budget.maxCostUsd` never sees LLM cost and is silently undercounted for token-heavy / tool-light workloads (every chatbot). `onToolResult` fires after every `tool.execute` and carries the per-run `ctx` opaque blob into `gate.record` so per-principal accounting works. `filterTools` is a `gate.allows` pre-filter — denied tools are dropped from the catalog the LLM ever sees, no `gate.check` round-trip per call.
297
+
298
+ Halt-severity decisions exit the loop cleanly via a typed `HaltError` — full mechanics (sealed `msgs`, `halt:<rule>` error token, `loop:done{halted:true}` event, `throwOnError:true` interaction, `halt:unknown` coalesce) are in the **Halt decisions throw `HaltError`** paragraph below. Short version: check `result.error?.startsWith('halt:')` after the run.
299
+
300
+ Legacy `wrapTool` / `wrapTools` are retained as deprecation shims (one-shot console warning, removal in 1.0). Migration: replace `wrapTools(tools)` at `loop.run()` with `filterTools(tools)` once upfront + `onLlmResult` / `onToolResult` on `new Loop({...})` to pick up LLM-cost recording and `_ctx` threading.
301
+
302
+ **`actionTranslator` for bash/fs primitive activation (v0.10.1+).** Bareguard's `bashCheck` / `fsCheck` / `netCheck` only fire when `action.type === 'bash'` / `'read'` / `'write'` / `'fetch'`. The default action shape is `{type: toolName, args, _ctx}` which matches `tools.denylist` / `tools.allowlist` but does NOT activate those primitives. Adopters who want both pass `wireGate(gate, { actionTranslator })`. Since bareguard 0.4.1+, the primitives read fields from either flat (`action.cmd`) or nested (`action.args.cmd` / `.command`) shapes, so you can pass args through verbatim:
303
+
304
+ ```javascript
305
+ const { policy, onToolResult } = wireGate(gate, {
306
+   actionTranslator: (toolName, args, ctx) => {
307
+     if (toolName === 'shell_exec') return { type: 'bash', args, _ctx: ctx }; // bareguard 0.4.1+ reads args.command
308
+     if (toolName === 'shell_run') return { type: 'bash', args, _ctx: ctx }; // reads args.argv → joins to cmd
309
+     if (toolName === 'shell_read') return { type: 'read', args, _ctx: ctx }; // reads args.path
310
+     return { type: toolName, args, _ctx: ctx }; // fall through to defaultActionTranslator
311
+   },
312
+ });
313
+ ```
314
+
315
+ `onLlmResult` always uses `{type:'llm'}` regardless of the translator (so budget rules match without translator collusion). `defaultActionTranslator` is exported for composition.
316
+
317
+ **Bounding tool rounds — use `limits.maxToolRounds` (bareguard 0.4.2+), not doubled `maxTurns`.** `limits.maxTurns` ticks on every `gate.record` (LLM + tool), so an "N LLM-tool round" cap is `maxTurns: N*2`. `limits.maxToolRounds: N` ticks only on non-`llm` records and gives the natural semantic — pairs cleanly with our split `onLlmResult` / `onToolResult` (the LLM side writes `{type:'llm'}` records which the counter skips). Halt severity, same shape as `maxTurns`, rebuilt from audit on cold-start.
318
+
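The difference between the two counters can be seen in a toy tally over `gate.record` entries. This is a sketch of the counting rule as described above, not bareguard's code; the record shape (`{ type }`) is an assumption.

```javascript
// Hypothetical sketch of the two counters over a list of recorded actions.
// maxTurns-style: every record ticks. maxToolRounds-style: 'llm' records are skipped.
function countTurns(records) {
  return records.length;
}

function countToolRounds(records) {
  return records.filter((r) => r.type !== 'llm').length;
}

// Three LLM-tool rounds: each round writes one 'llm' record and one tool record.
const records = [
  { type: 'llm' }, { type: 'shell_read' },
  { type: 'llm' }, { type: 'shell_run' },
  { type: 'llm' }, { type: 'shell_read' },
];

const turns = countTurns(records);           // what a maxTurns cap compares against
const toolRounds = countToolRounds(records); // what a maxToolRounds cap compares against
```

Three rounds produce six turn ticks but three tool-round ticks, which is why a cap of N rounds needs `maxTurns: N*2` but only `maxToolRounds: N`.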
319
+ **`HaltError` reachable from the public API (v0.10.1+).** `require('bare-agent').HaltError`, `require('bare-agent/errors').HaltError`. Adopters whose policy shim throws `HaltError` get identity-equal class across module boundaries — Loop's `instanceof HaltError` catches it cleanly.
320
+
321
+ **`Loop({ maxRounds })` throws (v0.10.1+).** The pre-v0.8 option is now an explicit error pointing at bareguard's `limits.maxTurns`. Silent-ignore migration foot-gun removed.
322
+
323
+ **Halt decisions throw `HaltError` and Loop exits cleanly (v0.10.0+).** When bareguard halts (budget exhausted, `limits.maxTurns`/`maxToolRounds` hit, content rule fired with `severity: 'halt'`), `wireGate.policy` throws a typed `HaltError`. Loop's outer handler catches it, emits `loop:error{source:'halt'}` + `loop:done{halted:true, rule, cost}`, calls `onError`, and returns `{ error: 'halt:<rule>', msgs }` — **even when `throwOnError:true`** (halt is a governed exit, not a runtime failure). The halt **never** reaches the LLM as a tool message. Adopters react via `result.error?.startsWith('halt:')` or the `loop:done{halted:true}` event. When the throwing code path lacks a rule, the token is the stable `halt:unknown` (not `halt:null`). On a mid-round halt, the returned `result.msgs` is sealed: every dangling assistant `tool_calls` id gets a synthetic `[halted:<rule>]` `role:'tool'` reply so the transcript is valid OpenAI shape (safe to feed back into another provider call). The `[halted:]` lowercase tag is distinct from the legacy `[HALT:]` string — that one is reserved for the pre-0.10 deny-string mode and is now dead code in bareagent.
324
+
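The sealing step can be pictured with a small standalone function. It assumes OpenAI-style messages (assistant messages carry `tool_calls[].id`, tool replies carry `tool_call_id`) and is an illustration of the idea, not bareagent's actual code; `sealTranscript` is a hypothetical name.

```javascript
// Hypothetical sketch of sealing a transcript after a mid-round halt:
// every assistant tool call without a matching tool reply gets a
// synthetic [halted:<rule>] reply so the transcript stays well-formed.
function sealTranscript(msgs, rule) {
  const answered = new Set(
    msgs.filter((m) => m.role === 'tool').map((m) => m.tool_call_id)
  );
  const sealed = [...msgs];
  for (const m of msgs) {
    if (m.role !== 'assistant' || !Array.isArray(m.tool_calls)) continue;
    for (const call of m.tool_calls) {
      if (answered.has(call.id)) continue;
      sealed.push({
        role: 'tool',
        tool_call_id: call.id,
        content: `[halted:${rule || 'unknown'}]`, // missing rule coalesces to "unknown"
      });
    }
  }
  return sealed;
}

// The gate halted after call_1 was answered but before call_2 ran.
const halted = [
  { role: 'user', content: 'Summarize both files' },
  { role: 'assistant', content: null, tool_calls: [{ id: 'call_1' }, { id: 'call_2' }] },
  { role: 'tool', tool_call_id: 'call_1', content: 'file one contents' },
];

const sealedMsgs = sealTranscript(halted, 'budget.maxCostUsd');
```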
325
+ **Same gate covers every tool source.** MCP tools from `createMCPBridge`, browsing tools from `createBrowsingTools`, mobile tools from `createMobileTools`, and any user-defined tool all flow through `policy` (`gate.check`) before invocation and `onToolResult` (`gate.record`) after — bareguard does no MCP-specific parsing, just glob-matches `tools.allowlist` / `tools.denylist` on the canonical name string.
326
+
327
+ **Migration map (v0.7 → v0.8):**
328
+
329
+ | You had | Move to |
330
+ |---|---|
331
+ | `new Loop({ maxCost: 0.50 })` | `new Gate({ budget: { maxCostUsd: 0.50 } })` |
332
+ | `new Loop({ maxRounds: 20 })` | `new Gate({ limits: { maxTurns: 20 } })` |
333
+ | `new Loop({ audit: './x.jsonl' })` | `new Gate({ audit: { path: './x.jsonl' } })` |
334
+ | `pathAllowlist({ allow, deny })` | `new Gate({ fs: { readScope: allow, deny } })` |
335
+ | `commandAllowlist({ allow })` | `new Gate({ bash: { allow } })` |
336
+ | `combinePolicies(a, b, c)` | Stack primitives in one Gate config — they compose as one eval |
337
+ | `MaxCostError` / `MaxRoundsError` | `try { ... } catch { ... }` → check `result.error?.startsWith('halt:')` (e.g. `halt:budget.maxCostUsd`, `halt:limits.maxTurns`). HaltError is also catchable via `require('bare-agent').HaltError` if your wiring throws it explicitly. |
338
+
339
+ **Policy return values (Loop's contract is unchanged):**
340
+
341
+ | Return | Effect |
342
+ |---|---|
343
+ | `true` | Tool executes normally. |
344
+ | `false` | Tool call aborted. Generic `[Loop] Tool "X" denied by policy` returned to the LLM as tool result. |
345
+ | `string` | Returned verbatim to the LLM as the deny reason. `wireGate` produces these for every gate deny. |
346
+ | throws | Treated as a deny. The thrown message becomes the reason. Loop continues. |
347
+ | omitted | Allow-all. Useful for development; never in production — that's what bareguard is for. |
348
+
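A hand-written policy that exercises the first three rows of the table might look like this. It assumes the `(toolName, args, ctx)` argument order and reuses the `isOwner` field from the `ctx` example elsewhere in this guide; the rules themselves are made up for illustration, and in production the guide's recommendation is still to delegate to a bareguard `Gate`.

```javascript
// Hypothetical hand-written policy, assuming the (toolName, args, ctx) order.
// true   -> tool executes
// string -> returned to the LLM as the deny reason
// false  -> generic "denied by policy" message
function policy(toolName, args, ctx) {
  if (toolName === 'shell_read') return true;
  if (toolName === 'shell_exec') {
    return ctx && ctx.isOwner === true
      ? true
      : 'shell_exec is restricted to the owner';
  }
  return false;
}

const ownerVerdict = policy('shell_exec', { command: 'ls' }, { isOwner: true });
const guestVerdict = policy('shell_exec', { command: 'ls' }, { isOwner: false });
const otherVerdict = policy('send_email', {}, { isOwner: true });
```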
349
+ ### Per-caller governance with ctx (multi-user, multi-tenant)
350
+
351
+ The policy signature accepts a third arg `ctx` — an opaque blob you pass per-call via `loop.run(msgs, tools, { ctx })`. `wireGate` forwards it as `_ctx` on every `gate.check({ type, args, _ctx })`, and you can branch on it inside bareguard's `humanChannel` callback or via custom primitives.
352
+
353
+ ```javascript
354
+ // Tools were pre-filtered once at startup via filterTools (catalog pre-filter);
355
+ // runtime per-call governance comes from policy + onToolResult, both wired on
356
+ // the Loop constructor. ctx is forwarded as the third arg to policy and into
357
+ // onLlmResult / onToolResult → bareguard's gate.record as `_ctx`.
358
+ await loop.run(messages, tools, {
359
+   ctx: { senderId, chatId, isOwner, adminGroupIds },
360
+ });
361
+ ```
362
+
363
+ For routing rules that don't fit bareguard's primitives (e.g. "owner can do anything; user can only read"), you can layer a custom closure on top of `wireGate(gate).policy` — but the cleaner pattern is one source of truth: encode the rules as bareguard primitives and let the gate evaluate them.
364
+
365
+ ### Catalog pre-filter (omit denied tools from the LLM's view)
366
+
367
+ `wireGate(gate).filterTools(tools)` drops denied tools from the catalog before the LLM sees them. It calls `gate.allows(name)` in parallel for every tool — a pure predicate (no audit write, no budget delta) — and returns the filtered array. Bulk only: handles tools registered by name (native, MCP bulk-loaded, shell, browsing, mobile). For MCP meta-tools (`mcp_invoke`), inner names live inside `args.name` and are gated via `tools.denyArgPatterns` instead — see the MCP recipe below.
368
+
369
+ ```javascript
370
+ const { filterTools } = wireGate(gate);
371
+ const visibleTools = await filterTools(allTools);
372
+ const result = await loop.run(messages, visibleTools);
373
+ ```
374
+
375
+ For arg-aware *filtering* (rare; usually you want arg-aware *gating*, which is policy's job), drop to `gate.allows({ type: 'send_message', args: { chat_id } })` directly. The pre-filter is a context optimization; governance decisions still happen at invoke time via `policy` (= `gate.check`).
376
+
377
+ ### Checkpoint vs bareguard's humanChannel
378
+
379
+ - **`humanChannel`** (bareguard) — fires for *policy-driven* asks/halts (budget about to overrun, content rule wants a confirm, halt-severity event needs ack). One callback, one place to wire your UI.
380
+ - **`Checkpoint`** (bareagent) — fires for *always-prompt* flows that aren't policy-driven (e.g. "always confirm before sending an email", regardless of who or why). Stays for that case.
381
+
382
+ Both can route to the same underlying chat / terminal / Slack helper. Both also support a deadline so a hung UI can't pin the agent forever: bareguard ≥0.3 takes `humanChannelTimeoutMs` (a timeout always denies, never allows), and bareagent's Checkpoint takes `timeout` (default 5 min, throws → auto-deny).
383
+
384
+ ### Checkpoint timeout — no silent hangs
385
+
386
+ `Checkpoint.waitForReply()` is async and used to hang forever if the user never replied. As of v0.7.0, Checkpoint accepts a `timeout` option (default 5 minutes). On expiry it throws `TimeoutError`; the Loop catches it, auto-denies the tool call with reason `"Checkpoint failed: ... auto-denied"`, and routes the error through `loop:error` + `onError`.
387
+
388
+ ```javascript
389
+ const checkpoint = new Checkpoint({
390
+   tools: ['send_email', 'shell_exec'],
391
+   send: async (q) => await platform.send(chatId, q),
392
+   waitForReply: async () => await waitForChatReply(chatId),
393
+   timeout: 10 * 60 * 1000, // 10 minutes (default is 5)
394
+ });
395
+
396
+ const loop = new Loop({ provider, checkpoint });
397
+ ```
398
+
399
+ Set `timeout: 0` to opt out and keep the old "hang forever" behaviour.
400
+
401
+ **Approval is fail-closed (v0.11.0).** The Loop proceeds **only** when `waitForReply` resolves to an explicit affirmative — `"yes"`, `"y"`, `"approve"`, or `"approved"` (trimmed, case-insensitive). Every other reply denies, including unrecognized strings (`"ok"`, `"sure"`, `"denied"`), empty, and non-strings. Wire your transport to return one of the affirmatives on approval; before v0.11 any non-`"no"` reply was treated as approval.
402
+
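The fail-closed rule above amounts to a strict membership test. The sketch below mirrors the described behavior (trimmed, case-insensitive match against four affirmatives); `isApproved` is a hypothetical name and this is not the Loop's internal code.

```javascript
// Hypothetical sketch of the fail-closed approval check described above.
const AFFIRMATIVES = new Set(['yes', 'y', 'approve', 'approved']);

function isApproved(reply) {
  if (typeof reply !== 'string') return false; // non-strings deny
  return AFFIRMATIVES.has(reply.trim().toLowerCase()); // anything unrecognized denies
}
```

A transport that collects free-form replies (button labels, emoji, "ok") would need to map them onto one of these four strings before resolving `waitForReply`.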
403
+ ### Unified error surfacing — three hooks, one principle
404
+
405
+ *No silent failures.* Every previously-silent failure path in bareagent now routes through one of three operator hooks:
406
+
407
+ | Hook | Use for | Fires on |
408
+ |---|---|---|
409
+ | `Gate({ audit: { path } })` | Forensic replay, compliance, billing | Every gated event (check + record) — bareguard owns this |
410
+ | `stream` + a transport | Live telemetry (Datadog, Sentry, Loki) | Every loop event including `loop:error` |
411
+ | `onError(err, { source, ...meta })` | Pager-style alerts (one function, one-liner) | Provider errors, callback throws, Checkpoint timeouts, stream listener exceptions |
412
+
413
+ ```javascript
414
+ const loop = new Loop({
415
+   provider,
416
+   policy, // from wireGate(gate)
417
+   stream,
418
+   onError: (err, meta) => {
419
+     // Fires for every silent-ish failure with { source, ...extra }
420
+     // source ∈ {'provider', 'callback:onToolCall', 'callback:onText',
421
+     //           'checkpoint', 'stream'}
422
+     pager.send({ level: 'warn', source: meta.source, err: err.message });
423
+   },
424
+ });
425
+ ```
426
+
427
+ If you run bareagent headless, **wire at least `onError`, a `Gate` with an audit path, and a `humanChannel` callback** (the latter is required by bareguard — without it, ask/halt events return silent denies). Otherwise you are flying blind.
428
+
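One way to keep an `onError` handler small is to route on `meta.source`. The sketch below assumes a `pager` object with a `send` method and invents the alert levels; only the `source` values come from the list documented above.

```javascript
// Hypothetical router: maps the documented `source` values to alert levels.
// The level names and the `pager` shape are assumptions for illustration.
const LEVEL_BY_SOURCE = new Map([
  ['provider', 'error'],
  ['checkpoint', 'warn'],
  ['stream', 'warn'],
]);

function makeOnError(pager) {
  return (err, meta = {}) => {
    const source = meta.source || 'unknown';
    // 'callback:onToolCall' and 'callback:onText' share one level.
    const level = source.startsWith('callback:')
      ? 'warn'
      : LEVEL_BY_SOURCE.get(source) || 'error';
    pager.send({ level, source, message: err.message });
  };
}

// Usage with an in-memory stand-in for the pager:
const sent = [];
const onError = makeOnError({ send: (alert) => sent.push(alert) });
onError(new Error('rate limited'), { source: 'provider' });
onError(new Error('no reply'), { source: 'checkpoint' });
console.log(sent.map((a) => `${a.level}:${a.source}`)); // [ 'error:provider', 'warn:checkpoint' ]
```

Unrecognized sources fall through to `error` so that a newly added source is never silently downgraded.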
429
+ ## Wiring with Checkpoint (human approval)
430
+
431
+ ```javascript
432
+ const { Loop, Checkpoint } = require('bare-agent');
433
+
434
+ const checkpoint = new Checkpoint({
435
+ tools: ['send_email', 'purchase'], // these tools require approval
436
+ send: async (question) => console.log(question),
437
+ waitForReply: async () => {
438
+ // wire to your chat platform, readline, etc.
439
+ return 'yes';
440
+ },
441
+ });
442
+
443
+ const loop = new Loop({ provider, checkpoint });
444
+ ```
445
+
446
+ ## Wiring with Scheduler
447
+
448
+ ```javascript
449
+ const { Scheduler } = require('bare-agent');
450
+
451
+ const scheduler = new Scheduler({
452
+ file: './jobs.json', // persist across restarts
453
+ interval: 60000, // tick every 60s
454
+ onError: (err, job) => console.error(`Job ${job.id} failed:`, err.message),
455
+ });
456
+
457
+ scheduler.add({ schedule: '2h', action: 'check inbox', type: 'recurring' });
458
+ scheduler.add({ schedule: '0 9 * * 1-5', action: 'morning briefing', type: 'recurring' }); // cron requires cron-parser
459
+
460
+ scheduler.start(async (job) => {
461
+ try {
462
+ const result = await loop.run(
463
+ [{ role: 'user', content: job.action }],
464
+ tools
465
+ );
466
+ // do something with result
467
+ } catch (err) {
468
+ console.error(`Job ${job.id} failed:`, err.message);
469
+ }
470
+ });
471
+ ```
472
+
473
+ ## Wiring with Planner + StateMachine
474
+
475
+ ```javascript
476
+ const { Planner, StateMachine, Loop } = require('bare-agent');
477
+
478
+ const planner = new Planner({ provider });
479
+ const state = new StateMachine({ file: './tasks.json' });
+ const loop = new Loop({ provider });
480
+
481
+ const steps = await planner.plan('Book a trip to Berlin');
482
+ // steps: [{ id: 's1', action: 'Search flights', dependsOn: [], status: 'pending' }, ...]
483
+
484
+ // Option A: manual sequential execution
485
+ for (const step of steps) {
486
+ state.transition(step.id, 'start');
487
+ try {
488
+ const result = await loop.run(
489
+ [{ role: 'user', content: step.action }],
490
+ tools
491
+ );
492
+ state.transition(step.id, 'complete', result.text);
493
+ } catch (err) {
494
+ state.transition(step.id, 'fail', err.message);
495
+ }
496
+ }
497
+ ```
498
+
499
+ ## Wiring with runPlan (parallel execution)
500
+
501
+ ```javascript
502
+ const { Planner, runPlan, StateMachine } = require('bare-agent');
503
+
504
+ const planner = new Planner({ provider });
505
+ const steps = await planner.plan('Book a trip to Berlin');
506
+
507
+ // runPlan executes steps in dependency-respecting waves with parallelism
508
+ const results = await runPlan(steps, async (step) => {
509
+ const result = await loop.run(
510
+ [{ role: 'user', content: step.action }],
511
+ tools
512
+ );
513
+ return result.text;
514
+ }, {
515
+ concurrency: 3, // max 3 parallel steps per wave
516
+ stateMachine: new StateMachine(), // optional lifecycle tracking
517
+ onWaveStart: (num, wave) => console.log(`[Wave ${num}]: ${wave.map(s => s.id).join(', ')}`),
518
+ onStepStart: (step) => console.log(`Starting: ${step.action}`),
519
+ onStepDone: (step, result) => console.log(`Done: ${step.id}`),
520
+ onStepFail: (step, err) => console.error(`Failed: ${step.id}: ${err.message}`),
521
+ });
522
+ // results: [{ id: 's1', status: 'done', result: '...' }, { id: 's2', status: 'failed', error: '...' }, ...]
523
+ ```
524
+
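To make "dependency-respecting waves" concrete, here is a minimal sketch of how steps with `dependsOn` could be grouped. `computeWaves` is a hypothetical function written for illustration; `runPlan`'s internals may differ, and the `concurrency` cap is not modelled here.

```javascript
// Hypothetical sketch of wave grouping; not runPlan's actual implementation.
function computeWaves(steps) {
  const done = new Set();
  const remaining = new Map(steps.map((s) => [s.id, s]));
  const waves = [];
  while (remaining.size > 0) {
    // A step is ready once every dependency finished in an earlier wave.
    const ready = [...remaining.values()].filter((s) =>
      (s.dependsOn || []).every((dep) => done.has(dep))
    );
    if (ready.length === 0) {
      throw new Error('Cycle or missing dependency in plan');
    }
    for (const s of ready) {
      remaining.delete(s.id);
      done.add(s.id);
    }
    waves.push(ready.map((s) => s.id));
  }
  return waves;
}

const waves = computeWaves([
  { id: 's1', action: 'Search flights', dependsOn: [] },
  { id: 's2', action: 'Search hotels', dependsOn: [] },
  { id: 's3', action: 'Book both', dependsOn: ['s1', 's2'] },
]);
console.log(waves); // [ [ 's1', 's2' ], [ 's3' ] ]
```

Steps `s1` and `s2` have no dependencies, so they share the first wave; `s3` waits for both.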
525
+ ## Provider options
526
+
527
+ ```javascript
528
+ // OpenAI (also works with OpenRouter, Together, Groq, vLLM, LM Studio)
529
+ new OpenAI({ apiKey, model: 'gpt-4o-mini', baseUrl: 'https://api.openai.com/v1' })
530
+
531
+ // Anthropic
532
+ new Anthropic({ apiKey, model: 'claude-haiku-4-5-20251001' })
533
+
534
+ // Ollama (local, no key needed)
535
+ new Ollama({ model: 'llama3.2', url: 'http://localhost:11434' })
536
+
537
+ // CLIPipe — pipe prompts to any CLI tool via stdin/stdout
538
+ new CLIPipe({ command: 'claude', args: ['--print'], systemPromptFlag: '--system-prompt', timeout: 30000 })
539
+ new CLIPipe({ command: 'ollama', args: ['run', 'llama3.2'] })
540
+ ```
541
+
542
+ All return `{ text, toolCalls, usage: { inputTokens, outputTokens } }`. CLIPipe always returns `toolCalls: []` and zero usage (CLI tools don't report tokens).
543
+
544
+ **Error body (v0.11.0):** on an HTTP error the OpenAI/Anthropic/Ollama providers throw a `ProviderError` whose `message` carries the upstream error string. The full parsed response is **not** attached to `err.body` by default (so an unexpected field can't leak through logs that dump the error object). Pass `{ exposeErrorBody: true }` to attach it for debugging.
545
+
546
+ **Cost estimation:** Loop automatically estimates USD cost per run based on model and token usage. The `cost` field appears in every `loop.run()` result and in `loop:done` stream events. Pricing covers OpenAI and Anthropic models; unknown models use a default average. To adjust rates, edit `COST_PER_1K` at the top of `src/loop.js`.
547
+
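The arithmetic behind a per-1K-token estimate is simple enough to sketch. The table shape and the `estimateCost` name below are assumptions; only the `_default` rates ($0.002 in / $0.008 out per 1K) are taken from this document, and the real `COST_PER_1K` in `src/loop.js` may be structured differently.

```javascript
// Hypothetical rate table; the real COST_PER_1K shape may differ.
const RATES_PER_1K = {
  _default: { input: 0.002, output: 0.008 }, // USD per 1K tokens
};

function estimateCost(model, usage, rates = RATES_PER_1K) {
  // Unknown models fall back to the default rates.
  const rate = rates[model] || rates._default;
  return (
    (usage.inputTokens / 1000) * rate.input +
    (usage.outputTokens / 1000) * rate.output
  );
}

const cost = estimateCost('some-unlisted-model', { inputTokens: 1000, outputTokens: 500 });
console.log(cost.toFixed(4)); // 0.0060
```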
548
+ ## Store options
549
+
550
+ ```javascript
551
+ // SQLite FTS5 — full-text search with BM25 ranking (requires: npm install better-sqlite3)
552
+ new SQLite({ path: './memory.db' })
553
+
554
+ // JSON file — zero deps, substring search
555
+ new JsonFile({ path: './memory.json' })
556
+
557
+ // Custom — implement { store, search, get, delete }
558
+ ```
559
+
560
+ **JsonFile scaling:** `search()` is an O(n) substring scan (no index) and every `store()`/`delete()` rewrites the whole file. Fine for hundreds–low-thousands of entries; for larger or write-heavy memory use `SQLite` (FTS5 index, incremental writes). JsonFile warns once past ~10k entries.
561
+
562
+ ## Tool format
563
+
564
+ Every tool passed to `Loop.run()` must have:
565
+
566
+ | Field | Type | Required | Notes |
567
+ |---|---|---|---|
568
+ | `name` | string | yes | Non-empty |
569
+ | `execute` | function | yes | `async (args) => result` — string or JSON-serializable |
570
+ | `description` | string | no | Providers pass this to the LLM |
571
+ | `parameters` | object | no | JSON Schema for the tool's arguments |
572
+
573
+ Tools are validated at the start of `run()`. Missing `name` or `execute` throws immediately with a clear `[Loop]` error.
574
+
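As an illustration of the table above, here is a tool carrying all four fields, plus a pre-flight check that mirrors the two required-field rules. `get_weather` and `findInvalidTools` are made-up names; the check is a sketch, not the validation `run()` performs internally.

```javascript
// A tool with all four documented fields. `get_weather` is a made-up example.
const weatherTool = {
  name: 'get_weather',
  description: 'Look up the current weather for a city',
  parameters: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city'],
  },
  execute: async ({ city }) => ({ city, tempC: 21 }), // canned result
};

// Hypothetical pre-flight check: returns the indexes of malformed tools.
function findInvalidTools(tools) {
  return tools
    .map((tool, index) => ({ tool, index }))
    .filter(({ tool }) =>
      !tool ||
      typeof tool.name !== 'string' ||
      tool.name.length === 0 ||
      typeof tool.execute !== 'function'
    )
    .map(({ index }) => index);
}

console.log(findInvalidTools([weatherTool, { name: '' }, { name: 'x' }])); // [ 1, 2 ]
```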
575
+ ## Error handling
576
+
577
+ - **Loop throws by default** (v0.3.0+) — provider errors re-thrown as-is. Use `try/catch` or `.catch()`.
578
+ - **Loop `throwOnError: false`** — opt into v0.2.x behavior where errors are returned in `result.error` instead of thrown.
579
+ - **Loop throws at setup** — missing provider, malformed tools.
580
+ - **Halt decisions throw `HaltError` but Loop catches it cleanly (v0.10.0+)** — turn cap (`limits.maxTurns`/`maxToolRounds`), budget cap (`budget.maxCostUsd`), and content rules with `severity:'halt'` throw `HaltError` from `wireGate.policy`; Loop catches it, emits `loop:done{halted:true, rule}`, and returns `{ error: 'halt:<rule>' }`. **Even with `throwOnError:true`** Loop does NOT propagate — halt is a governed exit. Halt never reaches the LLM as a tool message. Detect via `result.error?.startsWith('halt:')`, the `loop:done{halted:true}` event, the `loop:error{source:'halt'}` event, or wire `humanChannel` for ask-then-decide flows.
581
+ - All errors are prefixed `[ComponentName]` for easy identification.
582
+ - See `docs/errors.md` in the repo for a full error reference with triggers and fixes.
583
+
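Since a halt comes back as a result rather than an exception, callers may want one place that recognizes it. The helper below is a hypothetical sketch that relies only on the documented `'halt:<rule>'` string form of `result.error`.

```javascript
// Hypothetical helper; relies only on the documented 'halt:<rule>' string.
function parseHalt(result) {
  const error = result && result.error;
  if (typeof error !== 'string' || !error.startsWith('halt:')) return null;
  return { rule: error.slice('halt:'.length) };
}

console.log(parseHalt({ error: 'halt:unknown' })); // { rule: 'unknown' }
console.log(parseHalt({ text: 'done' }));          // null
```

Returning `null` for non-halt results keeps the check usable in a plain `if (parseHalt(result))` guard.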
584
+ ### Typed error hierarchy
585
+
586
+ ```
587
+ Error
588
+ └── BareAgentError { code, retryable, context }
589
+ ├── ProviderError { status, body } — auto retryable for 429/5xx
590
+ ├── ToolError code: 'TOOL_ERROR', retryable: false
591
+ ├── TimeoutError code: 'ETIMEDOUT', retryable: true
592
+ ├── ValidationError code: 'VALIDATION_ERROR', retryable: false
593
+ ├── CircuitOpenError code: 'CIRCUIT_OPEN', retryable: true
594
+ └── HaltError code: 'HALT', retryable: false — { rule, decision }
595
+ ```
596
+
597
+ Halt classes (`MaxCostError`, `MaxRoundsError`) were removed in v0.8.0. **In v0.10.0**, `HaltError` was added and `wireGate.policy` throws it on halt-severity decisions. Loop catches it in its outer handler and returns a clean exit (see the halt-mechanics paragraphs above) — adopters who *want* to catch the halt class explicitly can import it from `require('bare-agent').HaltError` or `require('bare-agent/errors').HaltError` (identity-equal across module boundaries, v0.10.1+). `err.rule` and `err.decision` are the stable public surface; do not read from `err.context.rule` / `err.context.decision` — those were removed in 0.10.3 as redundant.
598
+
599
+ All error classes extend `Error` — `instanceof Error` always works. The `retryable` property integrates with `Retry`'s fast path: `err.retryable === true` auto-retries, `err.retryable === false` bails immediately.
600
+
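The fast path described above amounts to a three-way decision on `err.retryable`. The following sketch is an assumption about how a caller might classify errors; `retryDecision` is a hypothetical name and the `'undecided'` branch stands in for whatever `Retry` does when the flag is absent.

```javascript
// Hypothetical classifier for the documented fast path.
function retryDecision(err) {
  if (err && err.retryable === true) return 'retry';
  if (err && err.retryable === false) return 'bail';
  return 'undecided'; // no flag: left to whatever policy the caller applies
}

// Plain Errors decorated with the fields shown in the hierarchy above.
const timedOut = Object.assign(new Error('timed out'), { code: 'ETIMEDOUT', retryable: true });
const invalid = Object.assign(new Error('bad input'), { code: 'VALIDATION_ERROR', retryable: false });

console.log(retryDecision(timedOut));            // retry
console.log(retryDecision(invalid));             // bail
console.log(retryDecision(new Error('other')));  // undecided
```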
601
+ ## Key contracts
602
+
603
+ - Loop builds messages in OpenAI format internally. Each provider normalizes to its native format.
604
+ - `provider.generate(messages, tools, options)` must return `{ text, toolCalls, usage }`.
605
+ - Store must implement `store(content, metadata) → id`, `search(query, options) → [{id, content, metadata, score}]`, `get(id)`, `delete(id)`.
606
+ - Components are independent: Memory doesn't know Loop, Scheduler doesn't know Planner. You compose them.
607
+
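A custom store only has to satisfy the four-method contract listed above. The in-memory sketch below is hypothetical: `MapStore`, the `limit` option, the flat `score`, and the return values of `get` and `delete` are assumptions, and the methods are synchronous for brevity even though a real store may need to be async.

```javascript
// Hypothetical in-memory store satisfying store/search/get/delete.
class MapStore {
  constructor() {
    this.entries = new Map();
    this.nextId = 1;
  }
  store(content, metadata = {}) {
    const id = String(this.nextId++);
    this.entries.set(id, { id, content, metadata });
    return id;
  }
  search(query, options = {}) {
    const needle = String(query).toLowerCase();
    const limit = options.limit ?? 10; // `limit` is an assumed option name
    const hits = [];
    for (const entry of this.entries.values()) {
      if (entry.content.toLowerCase().includes(needle)) {
        hits.push({ ...entry, score: 1 }); // flat score: substring match only
      }
    }
    return hits.slice(0, limit);
  }
  get(id) {
    return this.entries.get(id) ?? null;
  }
  delete(id) {
    return this.entries.delete(id);
  }
}

const store = new MapStore();
const id = store.store('Berlin trip notes', { tag: 'travel' });
console.log(store.search('berlin').length); // 1
console.log(store.delete(id));              // true
```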
608
+ ## Patterns, not features
609
+
610
+ These are deliberately NOT in bare-agent. Don't look for them — build them from existing primitives.
611
+
612
+ | Pattern | Not built in because | How to do it |
613
+ |---|---|---|
614
+ | **Multi-agent orchestration** | Routing, handoffs, shared state are app logic | Multiple Loop instances with different system prompts/tools. Your app routes. Share state via a common Memory/store. |
615
+ | **Structured output / named phases** | Domain-specific (trip planner ≠ code reviewer) | System prompts with format instructions, Planner with custom phase names, or tools with JSON Schema enforcing structure. |
616
+ | **Output limiting / token budgets** | Per-provider, per-plan, per-UX | Provider `maxTokens` option, system prompt guidance, or post-process `result.usage.outputTokens`. |
617
+ | **Rate limiting** | Per-provider, per-endpoint | Wrap `provider.generate` with a rate-limiting function. |
618
+ | **Hooks (lifecycle events)** | You own the code — add behavior directly | Stream subscriptions for after-the-fact hooks. Wrap tool `execute` functions for before/after semantics. |
619
+ | **Heartbeat (ambient awareness)** | "Check if anything needs attention" scope is your domain | Scheduler recurring job where the LLM triages: `scheduler.add({ type: 'recurring', schedule: '30m', action: 'Check if anything needs attention' })`. |
620
+ | **Cron** | **This IS built in** | Scheduler supports cron expressions (requires `cron-parser` peer dep) and relative schedules (`5s`, `30m`, `2h`, `1d`) natively. |
621
+
622
+ For full recipes with code examples, see `docs/02-features/usage-guide.md` § "Patterns, Not Features".
623
+
624
+ ## Production usage
625
+
626
+ | Component | aurora (SOAR2 pipeline) | multis (personal assistant) |
627
+ |---|---|---|
628
+ | Loop | ✓ | ✓ |
629
+ | Planner | ✓ | ✓ |
630
+ | runPlan | ✓ | — (sequential execution) |
631
+ | Retry | ✓ | ✓ |
632
+ | CircuitBreaker | — | ✓ |
633
+ | Fallback | — | — (deferred) |
634
+ | Memory | — (own BM25 store) | — (own SQLite FTS5 store) |
635
+ | StateMachine | — | — (deferred) |
636
+ | Scheduler | — | ✓ |
637
+ | Checkpoint | — | ✓ |
638
+ | Stream | — | — (deferred) |
639
+ | CLIPipe | ✓ | — |
640
+
641
+ Both projects kept their own memory/store implementations. Neither needed multi-agent routing. Full multis eval: `docs/03-logs/bareagent-eval-multis.md`.
642
+
643
+ ## Examples
644
+
645
+ Runnable reference scripts in `examples/`. Each is self-contained; the top-of-file docstring documents flags and required env vars. Recipes below are the prose form; these are the executable form.
646
+
647
+ | File | Demonstrates | Recipe cross-ref |
648
+ |---|---|---|
649
+ | `examples/with-bareguard.mjs` | Loop + bareguard end-to-end: budget cap, fs scope, bash allowlist, audit log, humanChannel. Canonical governed-loop reference. | § Wiring with bareguard |
650
+ | `examples/mcp-bridge-poc.js` | Auto-discover MCP servers from IDE configs, run a Loop with the discovered tools, persist allow/deny in `.mcp-bridge.json`. | Recipe 9 |
651
+ | `examples/mcp-bridge-concurrent.js` | Concurrent stress test against real public domains via `barebrowse_browse` (Amazon, Wikipedia, GitHub, a dead host). Validates bridge resilience under fan-out. | Recipe 9 |
652
+ | `examples/orchestrator/` | Multi-agent dispatch via `spawn`. Three configs (orchestrator + two specialists), one system prompt — no orchestrator class, no role types. Roles are JSON files. Wired with bareguard via `cfg.gate`. | § Multi-agent: spawn + defer + wake |
653
+ | `examples/wake.sh` + `examples/wake.md` | Reference cron + `jq` script for firing deferred actions. The runtime half of `createDeferTool`. | § Multi-agent: spawn + defer + wake |
654
+ | `examples/replay-job.js` | Supervised replay POC: record a browser task once with the LLM driving, then replay against fresh snapshots with the LLM acting only as a locator. On locator miss, falls back to full Loop reasoning and patches the trace. Stub points for fingerprint fast-path, postState assertion, and trace-confidence are inline in the file header. | Recipe 7 |
655
+
656
+ Stale example removed in 0.10.4: `examples/mcp-bridge-gov.js` (used a hard-coded path to the retired `mcp-gov` project; superseded by `with-bareguard.mjs` and bareguard's `policy` + `tools.denyArgPatterns` for MCP gating).
657
+
658
+ ## Gotchas
659
+
660
+ 1. **Anthropic requires apiKey** — OpenAI and Ollama don't (for local/keyless endpoints).
661
+ 2. **Cron schedules require `cron-parser`** — it's an optional dep. Relative schedules (`5s`, `30m`, `2h`, `1d`) work without it.
662
+ 3. **SQLiteStore requires `better-sqlite3`** — it's a peer dep. JsonFileStore has zero deps.
663
+ 4. **Scheduler runs jobs sequentially within a tick** — if one handler takes 5s, others wait. Use short handlers or offload work.
664
+ 5. **Ollama tool call IDs are synthetic** — `call_${Date.now()}`. Works fine but IDs aren't stable across retries.
665
+ 6. **Loop's `chat()` is stateful** — it accumulates the full conversation history including tool calls and tool results across turns. For long conversations, use `run()` with your own message management to control what stays in context.
666
+ 7. **CLIPipe `_formatPrompt()` flattens all messages**: system messages become `System: content` plaintext in stdin. If your CLI tool expects system prompts via a dedicated flag (e.g. `claude --system-prompt`), use `systemPromptFlag` to separate them. Without it, structured-output prompts embedded in system messages will break.
667
+ 8. **Loop `run()` throws by default (v0.3.0+)** — Provider errors throw instead of returning `result.error`. Use `try/catch` or pass `throwOnError: false` for the old behavior. **Halt-severity governance exits are the exception**: even with `throwOnError:true`, Loop catches `HaltError` and returns `{ error: 'halt:<rule>' }` cleanly (v0.10.0+). `Loop({ maxRounds })` was removed in v0.8 and now throws at construction (v0.10.1+) with a migration message pointing at `new Gate({ limits: { maxTurns | maxToolRounds } })`. The internal `HARD_ROUND_LIMIT = 100` is a safety net only — wire bareguard for real iteration bounds.
668
+ 9. **StateMachine `getStatus()` returns `null` for unregistered IDs** — It does not throw. Always null-check before accessing `.status`.
669
+ 10. **Planner expects JSON array `[{id, action, dependsOn}]`** — Not `{steps: [...]}`. If the LLM wraps steps in an object, Planner's parser will reject it.
670
+ 11. **Loop injects system prompt as a message, not an option** — `{ role: 'system', content: '...' }` is prepended at index 0 of the messages array passed to `provider.generate()`. It is NOT passed in `options.system`. If your tests assert on `options.system`, they will break — assert on `messages[0]` instead.
671
+ 12. **JsonlTransport must be imported from `bare-agent/transports`** — Not from `bare-agent` main export. Importing from main will throw `ERR_PACKAGE_PATH_NOT_EXPORTED`.
672
+ 13. **Browsing tools require `close()`** — `createBrowsingTools()` launches a browser (20 tools as of barebrowse v0.9.0: browse, goto, snapshot, click, type, press, scroll, select, hover, back, forward, **reload**, drag, upload, tabs, switchTab, pdf, screenshot, **wait_for**, **downloads**, plus **assess** if `wearehere` is installed → 21 total). Always call `close()` in a `finally` block to release resources. Returns `null` if `barebrowse` is not installed. Action tools (click, type, press, scroll, hover, goto, back, forward, **reload**, drag, upload, select, switchTab, **wait_for**) auto-return a fresh snapshot with a 300ms settle delay so the LLM always sees the result. `downloads` returns a JSON snapshot of `page.downloads` frozen at request time (stable view, not a live reference). `onDialog` is intentionally not exposed as a tool — its callback shape doesn't fit a request/response loop; drop to `import { connect }` directly if a flow needs to override confirm/prompt replies, or read `page.dialogLog` after the fact. `browse` and `snapshot` accept `pruneMode: 'act'|'read'` — `act` (default) keeps interactive elements for clicking/filling; pass `'read'` when the page is paragraph-heavy (article, doc, blog) to keep prose. If `act` collapses a content-heavy page, the snapshot includes a hint to retry with `pruneMode: 'read'`. For multi-step flows, CLI session mode (`npx barebrowse open/click/snapshot/close`) is more token-efficient — snapshots go to `.barebrowse/*.yml`, agent reads only when needed instead of inline in conversation.
673
+ 14. **Mobile tools require `close()`**: `createMobileTools()` connects to a device. Always call `close()` in a `finally` block. Returns `null` if `baremobile` is not installed. Action tools auto-return a fresh snapshot, as the browsing action tools in gotcha 13 do. Refs reset on every snapshot, so never cache them.
674
+ 15. **`bin/cli.js` (`spawn` child agents) fails closed on gate-wiring errors (v0.10.3+)**: when a child config sets `cfg.gate` but `Gate` init or `wireGate` throws, the CLI now calls `process.exit(1)` with `[cli] failed to wire bareguard: ... Refusing to run ungoverned (cfg.gate set).` instead of continuing with `policy=null`. Children without `cfg.gate` are unchanged (no governance asked, none enforced). If you previously relied on the silent fallback (you shouldn't have; it was a silent escape hatch), drop `cfg.gate` from the child config to run ungoverned explicitly.
675
+ 16. **`bin/cli.js` uses BA1 callbacks since v0.10.3** — children record LLM cost into bareguard's audit, thread `_ctx`, and pre-filter tools via `gate.allows`. Pre-0.10.3 children used the deprecated `wrapTools` path which silently dropped LLM cost from `budget.maxCostUsd` and printed a deprecation warning per first tool call. No config change needed; the upgrade is transparent.
676
+ 17. **Halt-path `msgs` is sealed (v0.10.3+)** — when Loop catches `HaltError` mid-round, every dangling assistant `tool_calls.id` from the halted round gets a synthetic `{ role:'tool', tool_call_id, content: '[halted:<rule>]' }` appended so the returned `result.msgs` is valid OpenAI shape. Safe to feed back into another provider call without protocol errors. The `[halted:<rule>]` tag is lowercase — distinct from the legacy `[HALT:]` deny strings (removed in 0.10.0, do not match on the old form).
677
+ 18. **`HaltError` with no `rule` resolves to `halt:unknown` (v0.10.3+)** — `new HaltError('msg')` without a `{ rule }` option still produces a stable `result.error = 'halt:unknown'` and `loop:done{halted:true, rule:'unknown'}`. Pre-0.10.3 produced the literal `'halt:null'` which broke string-matching consumers. The `_reportError('halt', ...)` extra carries the same `rule:'unknown'` token.
678
+ 19. **Cost table is hand-curated** (`src/loop.js:COST_PER_1K`) — refreshed 2026-05-18 for Claude 4.x (`claude-opus-4-7`, `claude-sonnet-4-6`, `claude-haiku-4-5`) and the GPT-4.1 / o3-mini line. Unknown models fall through to `_default` ($0.002 in / $0.008 out per 1K). If you use a model not in the table and care about `result.cost` accuracy or `budget.maxCostUsd` enforcement via `onLlmResult`, add it.
679
+
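The sealing behaviour from gotcha 17 can be illustrated with a small stand-alone function. This is a hypothetical re-implementation for clarity, assuming OpenAI-shaped messages; Loop does the equivalent internally and its exact logic may differ.

```javascript
// Hypothetical illustration of sealing dangling tool calls after a halt.
function sealDanglingToolCalls(msgs, rule = 'unknown') {
  // Collect the ids that already have a tool response.
  const answered = new Set(
    msgs.filter((m) => m.role === 'tool').map((m) => m.tool_call_id)
  );
  const sealed = [...msgs];
  for (const m of msgs) {
    if (m.role !== 'assistant' || !Array.isArray(m.tool_calls)) continue;
    for (const call of m.tool_calls) {
      if (!answered.has(call.id)) {
        // Append a synthetic response so every tool call is paired.
        sealed.push({ role: 'tool', tool_call_id: call.id, content: `[halted:${rule}]` });
      }
    }
  }
  return sealed;
}

const msgs = [
  { role: 'user', content: 'Send the report' },
  { role: 'assistant', content: null, tool_calls: [{ id: 'call_a' }, { id: 'call_b' }] },
  { role: 'tool', tool_call_id: 'call_a', content: 'ok' },
];
const sealedMsgs = sealDanglingToolCalls(msgs);
console.log(sealedMsgs[sealedMsgs.length - 1].content); // [halted:unknown]
```

Appending at the end is only valid when the dangling calls belong to the last assistant message, which is the halted-round case the gotcha describes.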
680
+ ## Cross-language SDKs
681
+
682
+ Tested, importable wrappers for Python, Go, Rust, Ruby, and Java in `contrib/`. Each spawns `npx bare-agent --jsonl` and communicates via JSONL over stdin/stdout. Consistent API: constructor → `run(goal)` → `close()`.
683
+
684
+ ```python
685
+ # Python — contrib/python/bareagent.py (stdlib only)
686
+ from bareagent import BareAgent
687
+ agent = BareAgent(provider="openai", model="gpt-4o-mini")
688
+ result = agent.run("What is the capital of France?")
689
+ print(result["text"])
690
+ agent.close()
691
+ ```
692
+
693
+ See `contrib/README.md` for all 5 languages and protocol reference.
694
+
695
+ ## Recipes
696
+
697
+ ### Recipe 1: Planner → runPlan (main use case)
698
+
699
+ ```javascript
700
+ const { Planner, runPlan, StateMachine, Loop } = require('bare-agent');
701
+ const { OpenAI } = require('bare-agent/providers');
702
+
703
+ const provider = new OpenAI({ apiKey: process.env.OPENAI_API_KEY, model: 'gpt-4o-mini' });
704
+ const loop = new Loop({ provider });
705
+
706
+ // Plan
707
+ const planner = new Planner({ provider });
708
+ const steps = await planner.plan('Book a trip to Berlin');
709
+
710
+ // Execute with wave progress
711
+ const results = await runPlan(steps, async (step) => {
712
+ const result = await loop.run(
713
+ [{ role: 'user', content: step.action }],
714
+ tools
715
+ );
716
+ return result.text; // throws on error by default (v0.3.0+)
717
+ }, {
718
+ concurrency: 3,
719
+ stateMachine: new StateMachine(),
720
+ onWaveStart: (num, wave) => console.log(`[Wave ${num}]: ${wave.map(s => s.id).join(', ')}`),
721
+ onStepDone: (step, result) => console.log(`Done: ${step.id}`),
722
+ onStepFail: (step, err) => console.error(`Failed: ${step.id}: ${err.message}`),
723
+ });
724
+ ```
725
+
726
+ ### Recipe 2: Loop + CLIPipe with systemPromptFlag
727
+
728
+ ```javascript
729
+ const { Loop } = require('bare-agent');
730
+ const { CLIPipe } = require('bare-agent/providers');
731
+
732
+ // Without systemPromptFlag: system messages become "System: ..." in stdin (breaks structured output)
733
+ // With systemPromptFlag: system content is passed via that flag (here --system-prompt); only user/assistant messages go to stdin
734
+ const provider = new CLIPipe({
735
+ command: 'claude',
736
+ args: ['--print'],
737
+ systemPromptFlag: '--system-prompt',
738
+ });
739
+
740
+ const loop = new Loop({ provider });
741
+ const result = await loop.run([
742
+ { role: 'user', content: 'List 3 facts about Berlin' }
743
+ ]);
744
+ console.log(result.text);
745
+ ```
746
+
747
+ ### Recipe 3: CircuitBreaker + Fallback + Retry (resilient multi-provider)
748
+
749
+ ```javascript
750
+ const { Loop, Retry, CircuitBreaker } = require('bare-agent');
751
+ const { OpenAI, Anthropic, Fallback } = require('bare-agent/providers');
752
+
753
+ const cb = new CircuitBreaker({
754
+ threshold: 3,
755
+ resetAfter: 30000,
756
+ onStateChange: (key, from, to) => console.log(`[${key}] ${from} → ${to}`),
757
+ });
758
+
759
+ const provider = new Fallback([
760
+ cb.wrapProvider(new OpenAI({ apiKey: process.env.OPENAI_API_KEY }), 'openai'),
761
+ cb.wrapProvider(new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }), 'anthropic'),
762
+ ], {
763
+ onFallback: (err, from, to) => console.warn(`Provider ${from} failed, trying ${to}`),
764
+ });
765
+
766
+ const loop = new Loop({
767
+ provider,
768
+ retry: new Retry({ maxAttempts: 3, jitter: 'full' }),
769
+ });
770
+ ```
771
+
772
+ ### Recipe 4: Stream + JsonlTransport
773
+
774
+ ```javascript
775
+ const { Loop, Stream } = require('bare-agent');
776
+ const { JsonlTransport } = require('bare-agent/transports');
777
+ const { OpenAI } = require('bare-agent/providers');
778
+
779
+ // JSONL events to stdout — pipe to any consumer
780
+ const stream = new Stream({ transport: new JsonlTransport() });
781
+ const loop = new Loop({
782
+ provider: new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
783
+ stream,
784
+ });
785
+
786
+ // Subscribe for in-process handling
787
+ stream.subscribe((event) => {
788
+ if (event.type === 'loop:tool_call') {
789
+ console.error(`[debug] Tool: ${event.data.name}`);
790
+ }
791
+ });
792
+
793
+ const result = await loop.run(
794
+ [{ role: 'user', content: 'What is the weather in Berlin?' }],
795
+ [weatherTool]
796
+ );
797
+ ```
798
+
799
+ ### Recipe 5: Tool context adapter (ctx closure)
800
+
801
+ ```javascript
802
+ // Your tools need execution context (senderId, chatId, permissions, etc.)
803
+ // bareagent tools get execute(args) — just LLM arguments.
804
+ // Solution: closure that captures ctx.
805
+
806
+ function adaptTools(tools, ctx) {
807
+ return tools.map(tool => ({
808
+ name: tool.name,
809
+ description: tool.description,
810
+ parameters: tool.input_schema || tool.parameters,
811
+ execute: async (args) => tool.execute(args, ctx),
812
+ }));
813
+ }
814
+
815
+ // In your message handler:
816
+ const tools = adaptTools(myTools, { chatId, senderId, isOwner, platform });
817
+ const result = await loop.run([{ role: 'user', content: msg }], tools);
818
+ ```
819
+
820
+ ### Recipe 6: Checkpoint on a chat platform
821
+
822
+ ```javascript
823
+ const { Checkpoint } = require('bare-agent');
824
+
825
+ const pendingApprovals = new Map(); // chatId → resolve function
826
+
827
+ const checkpoint = new Checkpoint({
828
+ tools: ['send_email', 'purchase'],
829
+ send: async (question) => platform.send(chatId, `Approval needed: ${question}\nReply yes/no.`),
830
+ waitForReply: () => new Promise(resolve => pendingApprovals.set(chatId, resolve)),
831
+ });
832
+
833
+ // In your message router — intercept approval replies
834
+ function onMessage(chatId, text) {
835
+ if (pendingApprovals.has(chatId)) {
836
+ const resolve = pendingApprovals.get(chatId);
837
+ pendingApprovals.delete(chatId);
838
+ resolve(text); // unblocks waitForReply()
839
+ return;
840
+ }
841
+ // ... normal agent handling
842
+ }
843
+ ```
844
+
845
+ ### Recipe 7: Loop + Browsing Tools
846
+
847
+ ```javascript
848
+ const { Loop } = require('bare-agent');
849
+ const { OpenAI } = require('bare-agent/providers');
850
+ const { createBrowsingTools } = require('bare-agent/tools');
851
+
852
+ const provider = new OpenAI({ apiKey: process.env.OPENAI_API_KEY, model: 'gpt-4o-mini' });
853
+ const browsing = await createBrowsingTools();
854
+ if (!browsing) throw new Error('barebrowse not installed');
855
+
856
+ const loop = new Loop({ provider });
857
+ try {
858
+ const result = await loop.run(
859
+ [{ role: 'user', content: 'Go to example.com and tell me what you see' }],
860
+ browsing.tools
861
+ );
862
+ console.log(result.text);
863
+ } finally {
864
+ await browsing.close(); // always close — releases browser resources
865
+ }
866
+ ```
867
+
868
+ **Privacy assessment:** If `wearehere` is installed (`npm install wearehere`), a 21st tool `assess` is automatically available. It scans any URL for privacy risks and returns a compact JSON:
869
+
870
+ ```javascript
871
+ // The assess tool is included in browsing.tools automatically
872
+ // Agent can call it like any other tool:
873
+ // assess({ url: "https://example.com" })
874
+ // Returns: { site, score (0-100), risk, recommendation, concerns, categories }
875
+ ```
876
+
877
+ Categories: cookies, network trackers, hidden tracking elements, dark patterns, data brokers, device fingerprinting, stored data, form surveillance, link tracking, terms of service. Score thresholds: 0-19 low, 20-39 moderate, 40-69 high, 70+ critical.
878
+
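The score thresholds above translate directly into a lookup. `riskLevel` is a hypothetical helper; whether the tool's own `risk` field uses exactly these strings is an assumption.

```javascript
// Hypothetical mapping from an assess score (0-100) to the documented bands.
function riskLevel(score) {
  if (typeof score !== 'number' || Number.isNaN(score) || score < 0 || score > 100) {
    throw new RangeError('score must be a number between 0 and 100');
  }
  if (score >= 70) return 'critical';
  if (score >= 40) return 'high';
  if (score >= 20) return 'moderate';
  return 'low';
}

console.log(riskLevel(19)); // low
console.log(riskLevel(40)); // high
console.log(riskLevel(70)); // critical
```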
879
+ ### Recipe 7b: CLI Browsing (token-efficient)
880
+
881
+ Two browsing strategies — pick based on your use case:
882
+
883
+ | | Library tools (Recipe 7) | CLI session (this recipe) |
884
+ |---|---|---|
885
+ | **How** | `createBrowsingTools()` → Loop tools | `npx barebrowse` CLI commands |
886
+ | **Snapshots** | Inline in tool results (conversation context) | Written to `.barebrowse/*.yml` on disk |
887
+ | **Token cost** | Higher — every snapshot in LLM context | Lower — agent reads files only at decision points |
888
+ | **Best for** | Single-page reads, simple interactions | Multi-page workflows, research, token-constrained envs |
889
+
890
+ **CLI workflow pattern:**
891
+
892
+ ```bash
893
+ # Install: npm install barebrowse (CLI available via npx)
894
+
895
+ # 1. Open a URL (starts session)
896
+ npx barebrowse open https://example.com
897
+
898
+ # 2. Take a snapshot → writes .barebrowse/<session>/<timestamp>.yml
899
+ npx barebrowse snapshot
900
+
901
+ # 3. Agent reads the .yml file, finds [ref=N] markers for interactive elements
902
+
903
+ # 4. Click a link or button by ref number
904
+ npx barebrowse click 5
905
+
906
+ # 5. Snapshot again at the new page
907
+ npx barebrowse snapshot
908
+
909
+ # 6. Close session when done
910
+ npx barebrowse close
911
+ ```
912
+
913
+ **CLI command reference:**
914
+
915
+ | Category | Commands |
916
+ |---|---|
917
+ | **Session** | `open <url> [flags]`, `close`, `status` |
918
+ | **Navigation** | `goto <url>`, `back`, `forward`, `reload [--no-cache]`, `snapshot [--mode=act\|read]`, `screenshot`, `pdf` |
919
+ | **Interaction** | `click <ref>`, `type <ref> <text>`, `fill <ref> <text>`, `press <key>`, `scroll <dy>`, `hover <ref>`, `select <ref> <value>`, `drag <from> <to>`, `upload <ref> <files..>` |
920
+ | **Tabs** | `tabs`, `tab <index>` |
921
+ | **Downloads** | `downloads` (JSON array of captured downloads — `savedPath`, `state`, ...) |
922
+ | **Debugging** | `eval <expr>`, `wait-idle`, `wait-for --text=X --selector=Y`, `console-logs`, `network-log`, `dialog-log`, `save-state` |
923
+
924
+ **Open flags:** `--mode=headless|headed|hybrid`, `--port=N` (attach to running browser), `--proxy=URL`, `--viewport=WxH`, `--storage-state=FILE`, `--download-path=DIR`, `--no-cookies`, `--browser=firefox|chromium`, `--timeout=N`
925
+
926
+ **Snapshot `.yml` format** contains page content with `[ref=N]` markers on interactive elements (links, buttons, inputs). The ref numbers are stable within a snapshot — use them with `click`, `type`, `drag`, `upload`, and other ref-based commands.
927
+
928
+ **Key insight:** Don't read every snapshot. Take snapshots freely, but only read the `.yml` file at decision points where you need to choose what to click or verify page content.
929
+
930
+ ### Recipe 8: Loop + Mobile Tools
931
+
932
+ ```javascript
933
+ const { Loop } = require('bare-agent');
934
+ const { OpenAI } = require('bare-agent/providers');
935
+ const { createMobileTools } = require('bare-agent/tools');
936
+
937
+ const provider = new OpenAI({ apiKey: process.env.OPENAI_API_KEY, model: 'gpt-4o-mini' });
938
+
939
+ // Android (default)
940
+ const mobile = await createMobileTools();
941
+ // iOS: await createMobileTools({ platform: 'ios' })
942
+ // Termux on-device: await createMobileTools({ termux: true })
943
+ if (!mobile) throw new Error('baremobile not installed');
944
+
945
+ const loop = new Loop({ provider });
946
+ try {
947
+ const result = await loop.run(
948
+ [{ role: 'user', content: 'Open Settings and turn on Bluetooth' }],
949
+ mobile.tools
950
+ );
951
+ console.log(result.text);
952
+ } finally {
953
+ await mobile.close(); // always close — releases device connection
954
+ }
955
+ ```
956
+
957
+ Mobile tools follow the observe-act pattern: action tools auto-return a fresh snapshot so the LLM sees the result immediately. Tools: `mobile_snapshot`, `mobile_tap`, `mobile_type`, `mobile_press`, `mobile_scroll`, `mobile_swipe`, `mobile_long_press`, `mobile_launch`, `mobile_back`, `mobile_home`, `mobile_screenshot`, `mobile_tap_xy`, `mobile_find_text`, `mobile_wait_text`, `mobile_wait_state`. Android-only: `mobile_intent`, `mobile_tap_grid`, `mobile_grid`. iOS-only: `mobile_unlock`.
958
+
959
+ ### Recipe 8b: Loop + Shell Tools (cross-platform primitives)
960
+
961
+ `createShellTools()` returns four pure-Node tools that work identically on Linux, macOS, and Windows, with no external binaries and no platform detection.
962
+
963
+ | Tool | Purpose |
964
+ |---|---|
965
+ | `shell_read` | Read a file (UTF-8, 256 KB cap) or list a directory (tab-separated). `~` expands to the home directory. |
966
+ | `shell_grep` | JavaScript regex search across files. Walks directories, skips binary files, returns `{hits: [{file, line, text}], truncated, fileCount}`. |
967
+ | `shell_run` | Run a command with an **argv array** via `child_process.execFile` (no shell, no metacharacter interpretation). Returns `{stdout, stderr, code, timedOut}`. **Use this when you need a policy allowlist.** |
968
+ | `shell_exec` | Run a raw shell command string via `/bin/sh -c` (or `cmd.exe`). Returns the same shape. **Shell metacharacters are interpreted — naive allowlists are bypassable.** Use only when you genuinely need shell features (pipes, redirects, globs). |
969
+
970
+ **Zero baked-in allowlist.** The library ships the primitives; gating is bareguard's job via the standard `wireGate(gate)` wiring.
971
+
972
+ > **⚠️ `shell_exec` injection caveat.** `"ls"` passes a base-command allowlist check such as `args.command.split(/\s+/)[0]`, but so does `"ls ; rm -rf /tmp/x"`, and the shell runs both commands. **A base-command allowlist is NOT safe for `shell_exec`.** For policy-gated use, prefer `shell_run({argv})` and allowlist on `args.argv[0]`: there is no shell in that path, so metacharacters are just literal argument bytes. Use `shell_exec` only when the agent needs pipes, redirects, or globs, and gate it at a higher level (human approval, narrow intent).
973
+
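The difference between the two checks can be seen without running any command. The sketch below is not bareguard's implementation; `naiveStringCheck` and `argvCheck` are hypothetical helpers that only mimic a first-token allowlist and an `argv[0]` allowlist.

```javascript
const allowed = new Set(['ls', 'cat', 'grep']);

// Naive check for a raw command string: inspects only the first whitespace-delimited token.
function naiveStringCheck(command) {
  return allowed.has(command.trim().split(/\s+/)[0]);
}

// Check for an argv array: argv[0] is the program, everything after it is a literal argument.
function argvCheck(argv) {
  return Array.isArray(argv) && argv.length > 0 && allowed.has(argv[0]);
}

console.log(naiveStringCheck('ls'));                         // true
console.log(naiveStringCheck('ls ; rm -rf /tmp/x'));         // true, yet a shell would also run rm
console.log(argvCheck(['ls', ';', 'rm', '-rf', '/tmp/x']));  // true, but ';' and 'rm' are only arguments to ls
console.log(argvCheck(['rm', '-rf', '/tmp/x']));             // false
```

With an argv array the allowlist decision and the executed program are the same value, which is why the gate can reason about `shell_run` but not about a raw `shell_exec` string.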
974
+ ```javascript
975
+ const { Gate } = require('bareguard');
976
+ const { Loop, wireGate } = require('bare-agent');
977
+ const { OpenAI } = require('bare-agent/providers');
978
+ const { createShellTools } = require('bare-agent/tools');
979
+
980
+ const gate = new Gate({
981
+ // argv[0] allowlist for shell_run — bareguard's `bash` primitive enforces this.
982
+ bash: { allow: ['ls', 'cat', 'grep', 'ps', 'df', 'uname', 'node', 'git'] },
983
+ // Hard-deny shell_exec for this agent. tools.denylist short-circuits before content checks.
984
+ tools: { denylist: ['shell_exec'] },
985
+ // fs scope for shell_read / shell_grep.
986
+ fs: { readScope: ['/home/', '/tmp/'] },
987
+ audit: { path: './shell-audit.jsonl' },
988
+ humanChannel: async (event) => ({ decision: 'deny' }),
989
+ });
990
+ await gate.init();
991
+
992
+ const { policy, onLlmResult, onToolResult, filterTools } = wireGate(gate);
993
+ const { tools: shellTools } = createShellTools();
994
+ const tools = await filterTools(shellTools); // drop shell_exec (denylisted above) before the LLM sees it
995
+
996
+ const loop = new Loop({
997
+ provider: new OpenAI({ apiKey: process.env.OPENAI_API_KEY, model: 'gpt-4o-mini' }),
998
+ policy,
999
+ onLlmResult,
1000
+ onToolResult,
1001
+ });
1002
+
1003
+ const result = await loop.run(
1004
+ [{ role: 'user', content: 'What is in /tmp and how many README files are there under /home/me/code?' }],
1005
+ tools,
1006
+ );
1007
+ ```
1008
+
1009
+ **Allowlist is platform-specific on purpose.** `ls`/`cat`/`grep` work on Linux and macOS, `dir`/`type`/`findstr` on Windows. The primitives are cross-platform; the *gate config you write* picks the commands appropriate for your OS. The library stays out of that decision.
1010
+
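One way to keep a single gate config across operating systems is to pick the allowlist at startup. This is a minimal sketch: `allowlistFor` is a hypothetical helper, not part of bare-agent or bareguard, and the command lists are just the examples named above.

```javascript
// Hypothetical helper: choose argv[0] allowlist entries for the host OS.
function allowlistFor(platform) {
  return platform === 'win32'
    ? ['dir', 'type', 'findstr']
    : ['ls', 'cat', 'grep'];
}

// Same shape as the `bash` entry in the Gate config shown earlier.
const bashPolicy = { allow: allowlistFor(process.platform) };
console.log(bashPolicy);
```

The resulting object would be passed as the `bash` field of the `Gate` options, leaving the rest of the config unchanged.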
1011
+ **Why JavaScript regex for `shell_grep` instead of shelling out to `grep`/`rg`:** pure-Node means no dependency on external binaries being installed, identical behaviour on Windows, and governance covers the implementation (no hidden `child_process.spawn` bypassing the Loop policy).
1012
+
1013
+ ### Recipe 9: Loop + MCP Bridge (auto-discover + governance)
1014
+
1015
+ `createMCPBridge` reads MCP server definitions from standard IDE config locations (`.mcp.json`, `~/.mcp.json`, `~/.claude/mcp_servers.json`, `~/.config/Claude/claude_desktop_config.json`, `~/.cursor/mcp.json`), spawns each server over stdio, lists its tools, and returns a ready-to-use bareagent tool array. Any MCP-speaking server is consumable — zero glue code per server.
1016
+
1017
+ ```javascript
1018
+ const { Loop } = require('bare-agent');
1019
+ const { OpenAI } = require('bare-agent/providers');
1020
+ const { createMCPBridge } = require('bare-agent/mcp');
1021
+
1022
+ const provider = new OpenAI({ apiKey: process.env.OPENAI_API_KEY, model: 'gpt-4o-mini' });
1023
+
1024
+ const bridge = await createMCPBridge();
1025
+ // bridge = { tools, servers, denied, systemContext, errors, close }
1026
+
1027
+ const loop = new Loop({
1028
+ provider,
1029
+ system: bridge.systemContext, // tells the LLM which tools exist and which are restricted
1030
+ });
1031
+
1032
+ try {
1033
+ const result = await loop.run(
1034
+ [{ role: 'user', content: 'Summarise my unread messages.' }],
1035
+ bridge.tools,
1036
+ );
1037
+ console.log(result.text);
1038
+ } finally {
1039
+ await bridge.close(); // always close — kills spawned MCP subprocesses
1040
+ }
1041
+ ```
1042
+
1043
+ **Governance via `.mcp-bridge.json`.** On first run, the bridge writes `.mcp-bridge.json` in the cwd listing every discovered server and tool with permission `"allow"`. Edit any entry to `"deny"` and the tool is dropped from the next run's tool array; the LLM sees it listed in `systemContext` as restricted, with instructions not to retry it. Re-discovery happens automatically after TTL expiry (default `24h`, settable via `ttl` field in the file).
1044
+
1045
+ ```json
1046
+ {
1047
+ "discovered": "2026-04-13T12:00:00.000Z",
1048
+ "ttl": "24h",
1049
+ "servers": {
1050
+ "beeperbox": {
1051
+ "command": "docker",
1052
+ "args": ["exec", "-i", "beeperbox", "node", "/opt/mcp/server.js", "--stdio"],
1053
+ "tools": {
1054
+ "list_inbox": "allow",
1055
+ "read_chat": "allow",
1056
+ "send_message": "deny",
1057
+ "archive_chat": "allow"
1058
+ }
1059
+ }
1060
+ }
1061
+ }
1062
+ ```
1063
+
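To make the file's semantics concrete, the sketch below reads the same structure and reports which tools are denied and whether the discovery is stale. It is not the bridge's implementation: `ttlToMs`, `isStale`, and `partitionTools` are hypothetical helpers, the TTL parser assumes the `"<number>h"` form only, and the `server_tool` names follow the prefix convention the bridge uses for tool names (for example `beeperbox_send_message`).

```javascript
// Same structure as the example file above; normally loaded with fs.readFileSync + JSON.parse.
const bridgeFile = {
  discovered: '2026-04-13T12:00:00.000Z',
  ttl: '24h',
  servers: {
    beeperbox: {
      tools: { list_inbox: 'allow', read_chat: 'allow', send_message: 'deny', archive_chat: 'allow' },
    },
  },
};

// Assumption: ttl is "<number>h"; other units are not handled in this sketch.
function ttlToMs(ttl) {
  const match = /^(\d+)h$/.exec(ttl);
  if (!match) throw new Error(`unsupported ttl: ${ttl}`);
  return Number(match[1]) * 60 * 60 * 1000;
}

// True once the discovery timestamp is at least one TTL old.
function isStale(file, now = Date.now()) {
  return now - Date.parse(file.discovered) >= ttlToMs(file.ttl);
}

// Split tools into allowed and denied lists, using server-prefixed names.
function partitionTools(file) {
  const result = { allowed: [], denied: [] };
  for (const [server, { tools }] of Object.entries(file.servers)) {
    for (const [tool, permission] of Object.entries(tools)) {
      (permission === 'deny' ? result.denied : result.allowed).push(`${server}_${tool}`);
    }
  }
  return result;
}

console.log(partitionTools(bridgeFile).denied); // [ 'beeperbox_send_message' ]
console.log(isStale(bridgeFile, Date.parse('2026-04-13T18:00:00.000Z'))); // false (6 hours old)
console.log(isStale(bridgeFile, Date.parse('2026-04-14T12:00:00.000Z'))); // true (24 hours old)
```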
1064
+ **Runtime policy (arg-dependent checks).** Static allow/deny in the file handles coarse-grained permissions. For checks that depend on arguments (e.g. deny `send_message` only when `chat_id` matches a specific group), express them in your bareguard `Gate` config — `tools.denyArgPatterns` and `content.denyPatterns` cover most cases, and the `wireGate(gate).policy` adapter applies them to every tool source uniformly:
1065
+
1066
+ ```javascript
1067
+ const { Gate } = require('bareguard');
1068
+ const { Loop, wireGate } = require('bare-agent');
1069
+ const { createMCPBridge } = require('bare-agent/mcp');
1070
+
1071
+ const bridge = await createMCPBridge();
1072
+
1073
+ const gate = new Gate({
1074
+ tools: {
1075
+ denyArgPatterns: {
1076
+ // Per-tool arg patterns. Matches against JSON-stringified args.
1077
+ beeperbox_send_message: [/"chat_id"\s*:\s*"[^"]*finance[^"]*"/],
1078
+ },
1079
+ },
1080
+ humanChannel: async (event) => ({ decision: 'deny' }),
1081
+ });
1082
+ await gate.init();
1083
+
1084
+ const { policy, onLlmResult, onToolResult, filterTools } = wireGate(gate);
1085
+ const gatedTools = await filterTools(bridge.tools);
1086
+
1087
+ const loop = new Loop({
1088
+ provider, // provider instance, constructed as in the first Recipe 9 example
1089
+ system: bridge.systemContext,
1090
+ policy,
1091
+ onLlmResult,
1092
+ onToolResult,
1093
+ });
1094
+
1095
+ await loop.run(messages, gatedTools); // messages: array of { role, content } chat messages
1096
+ ```
1097
+
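A `denyArgPatterns` regex is easiest to validate in isolation before it goes into the gate. The sketch below assumes the args are serialized with `JSON.stringify` before matching, as the comment in the config suggests; `matchesDenyPattern` is a hypothetical helper, not a bareguard API.

```javascript
// Pattern copied from the Gate config above.
const financePattern = /"chat_id"\s*:\s*"[^"]*finance[^"]*"/;

// Assumption: args are serialized with JSON.stringify before the patterns are tested.
function matchesDenyPattern(args, patterns) {
  const serialized = JSON.stringify(args);
  return patterns.some((pattern) => pattern.test(serialized));
}

// chat_id contains "finance": denied.
console.log(matchesDenyPattern({ chat_id: 'team-finance-2026', text: 'hi' }, [financePattern])); // true
// "finance" appears only in the message text, not in chat_id: allowed.
console.log(matchesDenyPattern({ chat_id: 'family', text: 'finance report attached' }, [financePattern])); // false
```

Anchoring the pattern on the `"chat_id"` key, as the config does, keeps the rule from firing on message bodies that merely mention the word.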
1098
+ MCP tools arrive with the server name prepended (`beeperbox_send_message`, not `send_message`). Bareguard glob-matches the canonical name string against `tools.allowlist` / `tools.denylist`; no MCP-specific parsing.
1099
+
1100
+ > **v0.6.0 migration:** `createMCPBridge({ policy })` was removed. Runtime policy is Loop-level now, not mcp-bridge-level. Passing `policy` to `createMCPBridge` throws with a migration message.
1101
+ >
1102
+ > **v0.8.0 migration:** All policy/audit/budget decisions moved to bareguard. `Loop({ maxCost })`, `Loop({ maxRounds })`, `Loop({ audit })`, and the `bare-agent/policy` helpers are gone. Wire bareguard via `wireGate(gate)`; see "Wiring with bareguard" above.
1103
+
1104
+ **Options:**
1105
+
1106
+ | Option | Default | Purpose |
1107
+ |---|---|---|
1108
+ | `bridgePath` | `./.mcp-bridge.json` | Override the config file location |
1109
+ | `configPaths` | IDE defaults | Custom list of config files to scan |
1110
+ | `servers` | all discovered | Limit to a subset by name |
1111
+ | `timeout` | `15000` | Per-server init timeout in ms |
1112
+ | `refresh` | `false` | Force re-discovery regardless of TTL |
1113
+
1114
+ ### Recipe 10: beeperbox — multi-messenger reach via MCP bridge
1115
+
1116
+ [beeperbox](https://github.com/hamr0/beeperbox) is a headless Beeper Desktop in Docker that exposes an MCP server on stdio and HTTP. Wiring it into bareagent is a two-step process: drop its launch command into any MCP config file, then call `createMCPBridge`. No beeperbox-specific code in bareagent.
1117
+
1118
+ **Step 1** — add beeperbox to `.mcp.json` in your project root (or any of the IDE-standard locations):
1119
+
1120
+ ```json
1121
+ {
1122
+ "mcpServers": {
1123
+ "beeperbox": {
1124
+ "command": "docker",
1125
+ "args": ["exec", "-i", "beeperbox", "node", "/opt/mcp/server.js", "--stdio"]
1126
+ }
1127
+ }
1128
+ }
1129
+ ```
1130
+
1131
+ **Step 2** — use the bridge as in Recipe 9. beeperbox tools are namespaced `beeperbox_*`:
1132
+
1133
+ ```javascript
1134
+ const bridge = await createMCPBridge({ servers: ['beeperbox'] });
1135
+ const loop = new Loop({ provider, system: bridge.systemContext });
1136
+
1137
+ try {
1138
+ await loop.run(
1139
+ [{ role: 'user', content: 'Check my WhatsApp unread and reply to Sara that I\'ll call her at 5.' }],
1140
+ bridge.tools,
1141
+ );
1142
+ } finally {
1143
+ await bridge.close();
1144
+ }
1145
+ ```
1146
+
1147
+ beeperbox exposes 10 semantic tools covering every Beeper-connected bridge (WhatsApp, iMessage, Signal, Telegram, Discord, Slack, Messenger, Instagram, LinkedIn, Google Messages, Matrix): `list_accounts`, `list_inbox`, `list_unread`, `get_chat`, `read_chat`, `search_messages`, `send_message`, `note_to_self`, `react_to_message`, `archive_chat`. See [beeperbox.context.md](https://github.com/hamr0/beeperbox/blob/main/beeperbox.context.md) for full tool signatures, schemas, and network slugs.
1148
+
1149
+ **Least-privilege pattern:** beeperbox tokens have a read-only mode (Beeper Desktop → Settings → Developers → uncheck "Allow sensitive actions"). Combine a read-only token with `.mcp-bridge.json` deny entries on `send_message` / `archive_chat` for defence-in-depth — token scope enforced server-side, allow/deny enforced client-side before the LLM ever sees the tool.