npm - bare-agent - Versions diffs - 0.16.0 → 0.16.1 - Mend

bare-agent 0.16.0 → 0.16.1


Files changed (17)

package/README.md CHANGED

@@ -66,7 +66,7 @@ Every piece works alone — take what you need, ignore the rest.
 | Component | What it does |
 |---|---|
-| **Loop** | Think → act → observe → repeat. Calls any LLM, executes your tools, loops until done. Returns estimated USD cost per run. Governance via `Loop({ policy })` — wire bareguard's `Gate` through `wireGate(gate)` and every tool call (native, MCP, browsing, mobile) traverses one chokepoint with per-caller `ctx` routing. Bareguard owns the audit log, budget caps, and halt decisions; Loop respects the verdict. Context engineering via `Loop({ assemble })` — a per-round `assemble(msgs, ctx)` chokepoint to recall/compress/trim the window sent to the model (the seam litectx plugs into); returns a view, the canonical transcript stays intact, fail-open. The exported `unitAssembler`/`toUnits`/`fromUnits` adapter lets a consumer work over a neutral unit `{id, role, content, kind, pinned, atomic, tokensApprox}` — bareagent owns the grammar (atomic tool-pair bundling, pinned system/task, a pairing seatbelt), the consumer owns content + relevance. The CE function reads its inputs from the per-run `ctx` — litectx's budget-fitter uses `ctx.budget` (and `ctx.task`), so you **must** populate it via `run(msgs, tools, { ctx })`: an unset `ctx.budget` means the fitter has no budget, keeps everything, and returns the window unchanged — a silent no-op, not a bug (see `examples/litectx-assemble.mjs`). For summary-window compaction the Loop also lends a provider-bound `ctx.summarize(excerpt) => Promise<string>` (R-C6): the consumer owns when/what to summarize and the splice, bareagent makes the one model call (counted against the budget via `onLlmResult`, tagged `kind:'summarize'`). 
For an unbounded long-running agent there's the **destructive** counterpart `Loop({ trim })` (RT-2) — a per-round bound on the canonical transcript that evicts old turns *after* harvesting them; wire it with the exported `unitTrimmer({ trim, onHarvest, policy })` over litectx's `trim` verb (harvest-before-evict, fail-open; `harvestKey` gives the stable upsert id), opt-in (requires a consumer on litectx ≥ 0.16.0). `onError` + `loop:error` surface every silent-ish failure (callback throw, Checkpoint timeout) |
+| **Loop** | Think → act → observe → repeat. Calls any LLM, runs your tools, loops until done, returns estimated USD cost per run. Three opt-in seams hook external libraries in without touching your code: **`policy`** (governance — wire bareguard for one gated chokepoint over every tool call), **`assemble`** (context engineering — recall/compress/trim the window per round; the seam [litectx](https://npmjs.com/package/litectx) plugs into, transcript untouched), and **`trim`** (destructively bound the transcript for unbounded runs, harvesting turns before eviction). Each is a single chokepoint, fail-open, off by default. `onError` + `loop:error` surface every silent failure |
 | **Planner** | Break a goal into a step DAG via LLM. Built-in caching (`cacheTTL`) |
 | **assessComplexity** | Pure-code pre-planner (no LLM): rates a goal `simple`/`medium`/`complex`/`critical` from its text via keyword scoring + a critical safety override. `needsPlanning` gates whether to spend a Planner pass; `critical` flags security/production/compliance work for extra scrutiny. Free, instant, debuggable via `signals` |
 | **runPlan** | Execute steps in parallel waves. Dependency-aware, failure propagation, per-step retry |
@@ -79,13 +79,13 @@ Every piece works alone — take what you need, ignore the rest.
 | **Scheduler** | Cron (`0 9 * * 1-5`) or relative (`2h`, `30m`). Persisted jobs survive restarts |
 | **Stream** | Structured event emitter. Pipe as JSONL, subscribe in-process, or custom transport |
 | **Errors** | Typed hierarchy — `ProviderError`, `ToolError`, `TimeoutError`, `CircuitOpenError`, `ValidationError`. Halt decisions (turn cap, budget cap, content rules) come from bareguard, not Loop |
-| **bareguard adapter** | `wireGate(gate)` returns `{ policy, onLlmResult, onToolResult, filterTools, formatDeny }` — one-line wiring to bareguard's `Gate`. `policy` maps gate decisions to Loop's policy contract; `onLlmResult` + `onToolResult` forward every LLM and tool result to `gate.record` (so `budget.maxCostUsd` covers token-only workloads); `filterTools` drops denied tools from the catalog the LLM ever sees. Halt-severity decisions throw a typed `HaltError` and Loop exits cleanly — never leaks `[HALT: ...]` to the LLM. `require('bare-agent/bareguard')` |
+| **bareguard adapter** | `wireGate(gate)` → `{ policy, onLlmResult, onToolResult, filterTools, formatDeny }`: one-line wiring to bareguard's `Gate`. Routes every LLM + tool result through the gate so budget caps cover token-heavy workloads, drops denied tools before the LLM ever sees them, and turns halts into a clean exit. `require('bare-agent/bareguard')` |
 | **Browsing** | Web navigation, clicking, typing, reading via `barebrowse` (17 tools). Two modes: library tools (inline snapshots, pass to Loop) or CLI session (disk-based snapshots, token-efficient for multi-step flows). Optional `assess` tool (privacy scan) when `wearehere` is installed |
 | **Mobile** | Android + iOS device control via `baremobile`. Same two modes: library tools (`createMobileTools` — action tools auto-return snapshots) or CLI session (`baremobile` CLI — disk-based snapshots) |
 | **Shell** | Cross-platform `shell_read`, `shell_grep`, `shell_run` (argv, no shell), `shell_exec` (raw shell). Pure Node — no `grep`/`rg`/`findstr` dependency. Injection-proof `shell_run` for policy-gated use |
-| **MCP Bridge** | Auto-discover MCP servers from IDE configs (Claude Code, Cursor, etc.), expose as bareagent tools. Static allow/deny via `.mcp-bridge.json`, `systemContext` for LLM awareness. Runtime policy lives in `Loop({ policy })` — one hook for MCP + native tools alike. Returns both bulk `tools` (one per MCP tool) and `metaTools` (`mcp_discover` + `mcp_invoke` for token-thrifty access to large catalogs). Connecting runs a server's `command` (which may come from a cwd `.mcp.json`): pass `confirmServer` to vet each before it spawns — otherwise the bridge warns naming every command it runs. Every RPC is time-bounded (`timeout` for the handshake, `callTimeout` for `tools/call`), and a server that breaks its stdin pipe fails the connection instead of crashing the host. Zero deps |
-| **Spawn** | Fork a child bareagent process as a specialist agent. LLM-callable form blocks until child exits; library form returns a handle (`wait`, `onLine`, `kill`). One JSONL channel per child — child stderr captured and re-emitted as `child:stderr` events on the parent stream. Threads `BAREGUARD_AUDIT_PATH` / `BAREGUARD_PARENT_RUN_ID` / `BAREGUARD_BUDGET_FILE` / `BAREGUARD_SPAWN_DEPTH` so the family stitches into one audit + budget. `bareguard ^0.2.0` adds `spawn.ratePerMinute` + `limits.maxDepth` per-family caps. `timeoutMs` is the wall-clock ceiling; opt-in `idleTimeoutMs` is a heartbeat watchdog that kills a child gone silent on both stdio streams (resets on each line, so slow-but-working children survive; result carries `idleKilled`) |
-| **Defer** | Append a `{action, when}` record to a JSONL queue for a separate waker (cron / systemd timer / `examples/wake.sh`) to fire later. Two-phase governance: emit-time `gate.check` on the `defer` action; fire-time `gate.check` on the inner action when the waker re-invokes. `bareguard ^0.2.0` adds `defer.ratePerMinute` family-wide cap |
+| **MCP Bridge** | Auto-discover MCP servers from your $HOME/IDE configs (Claude Code, Cursor, …) and expose them as bareagent tools — bulk (`tools`) or token-thrifty meta-tools (`mcp_discover` + `mcp_invoke`) for large catalogs. Same `Loop({ policy })` hook governs MCP and native tools alike. The project-cwd `.mcp.json` is **opt-in** (untrusted-repo safety); vet every server spawn with `confirmServer`; every RPC is time-bounded. Zero deps |
+| **Spawn** | Fork a child bareagent as a specialist agent — LLM-callable (blocks until exit) or a library handle (`wait`, `onLine`, `kill`). The whole family stitches into one audit log + budget; `bareguard ^0.2.0` adds per-family rate + depth caps. `timeoutMs` caps wall-clock, opt-in `idleTimeoutMs` kills a child gone silent (slow-but-working children survive) |
+| **Defer** | Queue a `{action, when}` record for a separate waker (cron / `examples/wake.sh`) to fire later. Governed twice — once when emitted, again when it fires. `bareguard ^0.2.0` adds a family-wide rate cap |
 **Providers:** OpenAI-compatible (OpenAI, OpenRouter, Groq, vLLM, LM Studio), Anthropic, Ollama, CLIPipe (any CLI tool via stdin/stdout with real-time streaming), Fallback, or bring your own (one method: `generate`). All return the same shape — swap freely. The OpenAI provider warns if it would send your key over plaintext `http://` to a non-loopback host (use `https`, or drop `apiKey` for keyless local endpoints).
@@ -95,6 +95,8 @@ Every piece works alone — take what you need, ignore the rest.
 **Deps:** 1 required (`bareguard ^0.2.0` for governance — single-gate policy + audit + budget + per-family rate caps). Optional: `cron-parser` (cron expressions), `better-sqlite3` (SQLite store), `barebrowse` (web browsing), `baremobile` (Android + iOS device control), `wearehere` (privacy assessment via barebrowse).
+This table is the map, not the manual — per-component wiring and API detail live in the [Integration Guide](bareagent.context.md) and [Usage Guide](docs/02-features/usage-guide.md).
 ---
 ## Recipes

package/bareagent.context.md CHANGED

@@ -1,7 +1,7 @@
 # bareagent — Integration Guide
 > For AI assistants and developers wiring bareagent into a project.
-> v0.16.0 | Node.js >= 18 | one required dep (`bareguard ^0.4.2`) | Apache 2.0
+> v0.16.1 | Node.js >= 18 | one required dep (`bareguard ^0.4.2`) | Apache 2.0
 >
 > Full human guide with composition examples, design philosophy, and recipes: [Usage Guide](docs/02-features/usage-guide.md)
@@ -251,16 +251,17 @@ invoked tool name lives in `args.name`. To deny specific MCP tools when
 using metaTools, use `tools.denyArgPatterns: { mcp_invoke: [/"name":"linear_admin_/] }`
 or `content.denyPatterns` over the serialized action.
-**Vetting server commands (v0.11.0).** Connecting to a server runs its
-`command`, and discovery reads `.mcp.json` from the cwd (an untrusted
-repo) as well as your home/IDE configs. Pass `confirmServer(name, def)
-=> boolean` to `createMCPBridge` to approve each server **before its
-command is spawned** (return `false` to skip it; a throw fails closed).
-Default trusts all discovered servers — unchanged behavior. **When no
-`confirmServer` is set, the bridge prints a one-time warning naming every
-command it is about to spawn** (before the first spawn, discovery included),
-so a cwd `.mcp.json` can't run a command unannounced — `confirmServer` is
-still how you actually *gate* it.
+**Vetting server commands (v0.11.0; cwd default tightened v0.16.1).** Connecting
+to a server runs its `command`. **Default discovery now scans only your
+$HOME/IDE configs — NOT the project-cwd `./.mcp.json`** (v0.16.1): a checked-in
+config in an untrusted repo would otherwise auto-spawn arbitrary commands. To
+include the project config, pass `createMCPBridge({ includeProjectConfig: true })`,
+or a `confirmServer` hook (which implies it, since the hook vets every command).
+Explicitly-passed `configPaths` are honored verbatim. Pass `confirmServer(name,
+def) => boolean` to approve each server **before its command is spawned** (return
+`false` to skip it; a throw fails closed). When no `confirmServer` is set, the
+bridge still trusts all *discovered* servers and prints a one-time warning naming
+every command it is about to spawn — `confirmServer` is how you actually *gate* it.
 **RPC timeouts (Unreleased).** Every JSON-RPC round-trip is now bounded, so a
 server that never answers can't hang the bridge or the loop. `opts.timeout`
@@ -554,13 +555,13 @@ new CLIPipe({ command: 'claude', args: ['--print'], systemPromptFlag: '--system-
 new CLIPipe({ command: 'ollama', args: ['run', 'llama3.2'] })
 ```
-All return `{ text, toolCalls, usage: { inputTokens, outputTokens } }`. CLIPipe always returns `toolCalls: []` and zero usage (CLI tools don't report tokens).
+All return `{ text, toolCalls, usage: { inputTokens, outputTokens }, model? }`. The optional `model` (v0.16.1+) is the id the response was produced by — Loop prefers it over `provider.model` for cost accounting. CLIPipe always returns `toolCalls: []` and zero usage (CLI tools don't report tokens), and omits `model`.
 **Error body (v0.11.0):** on an HTTP error the OpenAI/Anthropic/Ollama providers throw a `ProviderError` whose `message` carries the upstream error string. The full parsed response is **not** attached to `err.body` by default (so an unexpected field can't leak through logs that dump the error object). Pass `{ exposeErrorBody: true }` to attach it for debugging.
 **Plaintext-key warning (Unreleased):** the OpenAI provider's `baseUrl` accepts `http://` (for local/OpenAI-compatible endpoints), but a `Bearer` key sent over plaintext http to a **non-loopback** host is exposed on the wire. The provider now warns once when that happens. Loopback hosts (`localhost`/`127.0.0.0/8`/`::1` — local proxies, Ollama-style endpoints) stay silent, since that's the legitimate keyless-local case. The header is **not** stripped (some local proxies want a key), so use `https` for any remote endpoint, or drop `apiKey` when the local endpoint needs none.
-**Cost estimation:** Loop automatically estimates USD cost per run based on model and token usage. The `cost` field appears in every `loop.run()` result and in `loop:done` stream events. Pricing covers OpenAI and Anthropic models; unknown models use a default average. To adjust rates, edit `COST_PER_1K` at the top of `src/loop.js`.
+**Cost estimation:** Loop automatically estimates USD cost per run based on model and token usage. The `cost` field appears in every `loop.run()` result and in `loop:done` stream events. Pricing covers OpenAI and Anthropic models; unknown models use a default average. To adjust rates, edit `COST_PER_1K` at the top of `src/loop.js`. The model is resolved as `result.model || provider.model` (v0.16.1+) — providers now echo the model in their `generate()` result, so cost accounting holds even when `provider.model` is absent or varies per response, e.g. behind `FallbackProvider` or `CircuitBreaker.wrapProvider` (the wrapper also preserves `model`/`name` passthrough props). Wire `onLlmResult` (via `wireGate`) and a `budget.maxCostUsd` cap then halts on token-heavy workloads too.
 ## Store options
@@ -642,7 +643,7 @@ All error classes extend `Error` — `instanceof Error` always works. The `retry
 ## Key contracts
 - Loop builds messages in OpenAI format internally. Each provider normalizes to its native format.
-- `provider.generate(messages, tools, options)` must return `{ text, toolCalls, usage }`.
+- `provider.generate(messages, tools, options)` must return `{ text, toolCalls, usage }` (and may include `model` for accurate cost accounting).
 - Store must implement `store(content, metadata) → id`, `search(query, options) → [{id, content, metadata, score}]`, `get(id)`, `delete(id)`.
 - Components are independent: Memory doesn't know Loop, Scheduler doesn't know Planner. You compose them.
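
The `generate` contract in the hunk above can be met by a very small object. The following is a minimal sketch, assuming nothing beyond the return shape quoted in the diff; `EchoProvider` and the model id `echo-1` are hypothetical names used for illustration and are not part of bare-agent.

```javascript
// Hypothetical bring-your-own provider: echoes the last message back.
// Only the return shape { text, toolCalls, usage, model? } comes from the diff.
class EchoProvider {
  constructor(model = 'echo-1') {
    this.model = model; // illustrative id, not a real model name
  }

  async generate(messages, tools = [], options = {}) {
    const last = messages[messages.length - 1];
    const text = last && typeof last.content === 'string' ? last.content : '';
    return {
      text,
      toolCalls: [],                              // this sketch never requests a tool
      usage: { inputTokens: 0, outputTokens: 0 }, // nothing to report
      model: this.model,                          // optional field, new in 0.16.1
    };
  }
}
```

Returning `model` is optional under the contract; a provider that omits it leaves cost accounting to fall back on `provider.model`.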

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "bare-agent",
-  "version": "0.16.0",
+  "version": "0.16.1",
   "files": [
     "index.js",
     "index.d.ts",

package/src/circuit-breaker.d.ts CHANGED

@@ -57,14 +57,18 @@ export class CircuitBreaker {
      */
     reset(key?: string): void;
     /**
-     * Wrap a provider so generate() goes through the circuit breaker.
-     * @param {{ generate: (...args: any[]) => Promise<any> }} provider - Provider with generate().
+     * Wrap a provider so generate() goes through the circuit breaker. Passthrough props (e.g.
+     * `model`, `name`) are preserved so Loop cost accounting — which reads `provider.model` —
+     * keeps working through the wrapper.
+     * @param {{ generate: (...args: any[]) => Promise<any>, [k: string]: any }} provider - Provider with generate().
      * @param {string} [key] - Circuit key.
-     * @returns {{ generate: (...args: any[]) => Promise<any> }} Wrapped provider with generate().
+     * @returns {{ generate: (...args: any[]) => Promise<any>, [k: string]: any }} Wrapped provider.
      */
     wrapProvider(provider: {
         generate: (...args: any[]) => Promise<any>;
+        [k: string]: any;
     }, key?: string): {
         generate: (...args: any[]) => Promise<any>;
+        [k: string]: any;
     };
 }

package/src/circuit-breaker.js CHANGED

@@ -112,13 +112,16 @@ class CircuitBreaker {
   }
   /**
-   * Wrap a provider so generate() goes through the circuit breaker.
-   * @param {{ generate: (...args: any[]) => Promise<any> }} provider - Provider with generate().
+   * Wrap a provider so generate() goes through the circuit breaker. Passthrough props (e.g.
+   * `model`, `name`) are preserved so Loop cost accounting — which reads `provider.model` —
+   * keeps working through the wrapper.
+   * @param {{ generate: (...args: any[]) => Promise<any>, [k: string]: any }} provider - Provider with generate().
    * @param {string} [key] - Circuit key.
-   * @returns {{ generate: (...args: any[]) => Promise<any> }} Wrapped provider with generate().
+   * @returns {{ generate: (...args: any[]) => Promise<any>, [k: string]: any }} Wrapped provider.
    */
   wrapProvider(provider, key) {
     return {
+      ...provider,
       /** @param {...any} args */
       generate: (...args) => this.call(() => provider.generate(...args), key),
     };
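
The effect of the added `...provider` spread can be seen in isolation. This is a sketch only: `breaker` is a hypothetical stand-in whose `call` simply invokes the function, whereas the real `CircuitBreaker` tracks failures and circuit state. `wrapOld` and `wrapNew` are illustrative names for the pre- and post-0.16.1 shapes of `wrapProvider`.

```javascript
// Stand-in for CircuitBreaker: `call` just runs the function.
// Failure counting and open/half-open state are deliberately omitted.
const breaker = {
  call(fn, key) {
    return fn();
  },
  // 0.16.0 shape: only generate() survives the wrap.
  wrapOld(provider, key) {
    return { generate: (...args) => this.call(() => provider.generate(...args), key) };
  },
  // 0.16.1 shape: own enumerable props are copied first, then generate() is overridden.
  wrapNew(provider, key) {
    return {
      ...provider,
      generate: (...args) => this.call(() => provider.generate(...args), key),
    };
  },
};

const provider = {
  name: 'demo',
  model: 'demo-model', // hypothetical id
  generate: async () => ({ text: 'ok', toolCalls: [], usage: { inputTokens: 1, outputTokens: 1 } }),
};

const oldWrapped = breaker.wrapOld(provider);
const newWrapped = breaker.wrapNew(provider);
console.log(oldWrapped.model); // undefined
console.log(newWrapped.model); // demo-model
```

Object spread copies own enumerable properties only, so a `model` assigned in a constructor is carried over, while methods defined on a class prototype are not.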

package/src/loop.js CHANGED

@@ -331,7 +331,7 @@ class Loop {
         const startedAt = Date.now();
         const result = await loop.provider.generate(prompt, [], { temperature: 0, ...genOpts });
         const usage = (result && result.usage) || null;
-        const model = loop.provider.model || null;
+        const model = (result && result.model) || loop.provider.model || null;
         const cost = estimateCost(model, usage);
         if (cost !== null) totalCost += cost;
         loop._safeEmit({ type: 'loop:summarize', data: { usage, costUsd: cost, durationMs: Date.now() - startedAt } });
@@ -421,7 +421,9 @@ class Loop {
       }
       lastUsage = result.usage || lastUsage;
-      const model = this.provider.model || null;
+      // Prefer the model the response reports (robust when provider.model is absent or varies per
+      // response — e.g. FallbackProvider, or a CircuitBreaker-wrapped provider that drops .model).
+      const model = result.model || this.provider.model || null;
       const roundCost = estimateCost(model, lastUsage);
       if (roundCost !== null) totalCost += roundCost;
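
The new resolution order in both hunks is `result.model`, then `provider.model`, then `null`. A minimal sketch of that precedence, assuming a hypothetical helper name `resolveModel` (Loop inlines the expression) and made-up model ids:

```javascript
// Mirrors the expression from the diff: prefer what the response reports.
function resolveModel(result, provider) {
  return (result && result.model) || (provider && provider.model) || null;
}

// The response names the model that actually answered (e.g. after a fallback).
console.log(resolveModel({ model: 'backup-model' }, { model: 'primary-model' })); // backup-model

// The response carries no model: fall back to the provider's configured one.
console.log(resolveModel({ text: 'hi' }, { model: 'primary-model' })); // primary-model

// Neither side knows: null, which the cost estimator has to handle.
console.log(resolveModel({ text: 'hi' }, {})); // null
```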

package/src/mcp-bridge.d.ts CHANGED

@@ -74,7 +74,11 @@ export type RpcClient = {
  *
  * @param {object} [opts]
  * @param {string} [opts.bridgePath] - Path to .mcp-bridge.json. Default: .mcp-bridge.json in cwd.
- * @param {string[]} [opts.configPaths] - IDE config paths for discovery.
+ * @param {string[]} [opts.configPaths] - Explicit config paths for discovery. When given, honored
+ *   verbatim. When omitted, only the trusted $HOME/IDE defaults are scanned (NOT `./.mcp.json`).
+ * @param {boolean} [opts.includeProjectConfig=false] - Also scan the project-cwd `./.mcp.json`
+ *   during default discovery. Off by default: a project config in an untrusted repo can auto-spawn
+ *   arbitrary commands. Implied true when a `confirmServer` hook is present (it vets each command).
  * @param {string[]} [opts.servers] - Limit to these server names.
  * @param {number} [opts.timeout=15000] - Per-server handshake timeout in ms (initialize + tools/list).
  * @param {number} [opts.callTimeout=120000] - Per-invocation timeout in ms for tools/call. Bounds a
@@ -82,16 +86,17 @@ export type RpcClient = {
  * @param {boolean} [opts.refresh=false] - Force re-discovery regardless of TTL.
  * @param {(name: string, def: ServerDef) => boolean | Promise<boolean>} [opts.confirmServer]
  *   Vet each discovered server BEFORE its `command` is spawned. Connecting to an
- *   MCP server runs its command, and discovery reads configs from the cwd (a
- *   `.mcp.json` in an untrusted repo) as well as the user's home/IDE configs.
- *   Return false to skip a server (its command is never executed). A throw is
- *   treated as a deny (fail-closed). Default: every discovered server is trusted
- *   (unchanged behavior) — pass this to gate command execution.
+ *   MCP server runs its command. Return false to skip a server (its command is
+ *   never executed). A throw is treated as a deny (fail-closed). Default: every
+ *   discovered server is trusted — pass this to gate command execution. Presence
+ *   of this hook also opts default discovery into the project-cwd `./.mcp.json`,
+ *   since each command is then vetted regardless of source.
  * @returns {Promise<{tools: ToolDef[], metaTools?: ToolDef[], servers: string[], systemContext: string, denied: DeniedTool[], errors?: Array<{server: string, error: string}>, close: Function}>}
  */
 export function createMCPBridge(opts?: {
     bridgePath?: string | undefined;
     configPaths?: string[] | undefined;
+    includeProjectConfig?: boolean | undefined;
     servers?: string[] | undefined;
     timeout?: number | undefined;
     callTimeout?: number | undefined;
@@ -110,10 +115,15 @@ export function createMCPBridge(opts?: {
     close: Function;
 }>;
 /**
- * @param {string[]} [configPaths]
+ * @param {string[]} [configPaths] - Explicit config paths. When given, honored verbatim (the
+ *   caller owns the choice). When omitted, the trusted $HOME/IDE defaults are scanned.
+ * @param {{ includeProjectConfig?: boolean }} [opts] - When no explicit `configPaths` are given,
+ *   set `includeProjectConfig: true` to also scan `./.mcp.json`. Default false — see PROJECT_CONFIG_PATH.
  * @returns {Map<string, ServerDef>}
  */
-export function discoverServers(configPaths?: string[]): Map<string, ServerDef>;
+export function discoverServers(configPaths?: string[], { includeProjectConfig }?: {
+    includeProjectConfig?: boolean;
+}): Map<string, ServerDef>;
 /**
  * Build the LLM-callable meta-tool surface from a fully-connected bridge.
  * Shares the underlying tool array and RPC clients with the bulk surface —

package/src/mcp-bridge.js CHANGED

@@ -62,8 +62,12 @@ const { ToolError } = require('./errors');
 // --- Config discovery (from IDE configs) ---
-const DEFAULT_CONFIG_PATHS = [
-  () => join(process.cwd(), '.mcp.json'),                              // project
+// The project-cwd `.mcp.json` is the untrusted-repo vector: discovering it auto-spawns its
+// `command`, so cloning a hostile repo and running an agent inside it would be arbitrary code
+// execution. It is therefore NOT in the trusted defaults — these are user/IDE-authored configs
+// under $HOME, which the user owns. The project config is opt-in (see `includeProjectConfig`).
+const PROJECT_CONFIG_PATH = () => join(process.cwd(), '.mcp.json');
+const TRUSTED_CONFIG_PATHS = [
   () => join(homedir(), '.mcp.json'),                                  // home
   () => join(homedir(), '.claude', 'mcp_servers.json'),                // Claude Code
   () => join(homedir(), '.config', 'Claude', 'claude_desktop_config.json'), // Claude Desktop
@@ -71,11 +75,22 @@ const DEFAULT_CONFIG_PATHS = [
 ];
 /**
- * @param {string[]} [configPaths]
+ * @param {string[]} [configPaths] - Explicit config paths. When given, honored verbatim (the
+ *   caller owns the choice). When omitted, the trusted $HOME/IDE defaults are scanned.
+ * @param {{ includeProjectConfig?: boolean }} [opts] - When no explicit `configPaths` are given,
+ *   set `includeProjectConfig: true` to also scan `./.mcp.json`. Default false — see PROJECT_CONFIG_PATH.
  * @returns {Map<string, ServerDef>}
  */
-function discoverServers(configPaths) {
-  const paths = configPaths || DEFAULT_CONFIG_PATHS.map(fn => fn());
+function discoverServers(configPaths, { includeProjectConfig = false } = {}) {
+  let paths;
+  if (configPaths) {
+    paths = configPaths;
+  } else {
+    paths = TRUSTED_CONFIG_PATHS.map(fn => fn());
+    // Project config kept at highest precedence (front) when explicitly opted in — preserves the
+    // historical "project overrides home" ordering for callers that want it.
+    if (includeProjectConfig) paths.unshift(PROJECT_CONFIG_PATH());
+  }
   /** @type {Map<string, ServerDef>} */
   const servers = new Map();
@@ -594,7 +609,11 @@ function buildMetaTools(tools, discoveredAt) {
  *
  * @param {object} [opts]
  * @param {string} [opts.bridgePath] - Path to .mcp-bridge.json. Default: .mcp-bridge.json in cwd.
- * @param {string[]} [opts.configPaths] - IDE config paths for discovery.
+ * @param {string[]} [opts.configPaths] - Explicit config paths for discovery. When given, honored
+ *   verbatim. When omitted, only the trusted $HOME/IDE defaults are scanned (NOT `./.mcp.json`).
+ * @param {boolean} [opts.includeProjectConfig=false] - Also scan the project-cwd `./.mcp.json`
+ *   during default discovery. Off by default: a project config in an untrusted repo can auto-spawn
+ *   arbitrary commands. Implied true when a `confirmServer` hook is present (it vets each command).
  * @param {string[]} [opts.servers] - Limit to these server names.
  * @param {number} [opts.timeout=15000] - Per-server handshake timeout in ms (initialize + tools/list).
  * @param {number} [opts.callTimeout=120000] - Per-invocation timeout in ms for tools/call. Bounds a
@@ -602,11 +621,11 @@ function buildMetaTools(tools, discoveredAt) {
  * @param {boolean} [opts.refresh=false] - Force re-discovery regardless of TTL.
  * @param {(name: string, def: ServerDef) => boolean | Promise<boolean>} [opts.confirmServer]
  *   Vet each discovered server BEFORE its `command` is spawned. Connecting to an
- *   MCP server runs its command, and discovery reads configs from the cwd (a
- *   `.mcp.json` in an untrusted repo) as well as the user's home/IDE configs.
- *   Return false to skip a server (its command is never executed). A throw is
- *   treated as a deny (fail-closed). Default: every discovered server is trusted
- *   (unchanged behavior) — pass this to gate command execution.
+ *   MCP server runs its command. Return false to skip a server (its command is
+ *   never executed). A throw is treated as a deny (fail-closed). Default: every
+ *   discovered server is trusted — pass this to gate command execution. Presence
+ *   of this hook also opts default discovery into the project-cwd `./.mcp.json`,
+ *   since each command is then vetted regardless of source.
  * @returns {Promise<{tools: ToolDef[], metaTools?: ToolDef[], servers: string[], systemContext: string, denied: DeniedTool[], errors?: Array<{server: string, error: string}>, close: Function}>}
  */
 async function createMCPBridge(opts = {}) {
@@ -632,9 +651,10 @@ async function createMCPBridge(opts = {}) {
     catch { return false; }
   };
-  // Connecting to a server EXECUTES its `command`, which can originate from a
-  // cwd-relative .mcp.json in an untrusted repo (discoverServers reads project
-  // configs). With no confirmServer hook, every discovered command runs unvetted.
+  // Connecting to a server EXECUTES its `command`. The project-cwd `.mcp.json` is excluded from
+  // default discovery (see TRUSTED_CONFIG_PATHS / includeProjectConfig), so the untrusted-repo
+  // path is closed by default; this warning covers the residual case where home/IDE configs (or an
+  // explicit opt-in) contribute commands and no confirmServer hook is present to vet them.
   // Warn ONCE per call, BEFORE the first spawn — and the first spawn is the
   // discovery phase on a cold/refresh run, not the main-connect phase below.
   let warnedUnvetted = false;
@@ -654,8 +674,11 @@ async function createMCPBridge(opts = {}) {
   const needsRefresh = opts.refresh || !config || isExpired(config);
   if (needsRefresh) {
-    // Discover from IDE configs
-    const discovered = discoverServers(opts.configPaths);
+    // Discover from IDE configs. The project-cwd `.mcp.json` is excluded by default (untrusted-repo
+    // RCE vector); it is scanned only on explicit opt-in, or when a `confirmServer` hook is present
+    // (which vets every command before it spawns, so cwd discovery is safe under it).
+    const includeProjectConfig = opts.includeProjectConfig === true || !!confirmServer;
+    const discovered = discoverServers(opts.configPaths, { includeProjectConfig });
     if (discovered.size === 0 && !config) {
       return { tools: [], servers: [], systemContext: '', denied: [], close: async () => {} };
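
Taken together, the `discoverServers` and `createMCPBridge` hunks define which config paths get scanned. The sketch below is a simplified model of that decision, not the library code: `resolveConfigPaths` is a hypothetical name, `home` and `cwd` are injected as plain strings so no imports are needed, and only two of the trusted paths visible in the diff are listed.

```javascript
// Simplified model of 0.16.1 discovery. Returns the list of paths that would be scanned.
function resolveConfigPaths(opts = {}, env = { home: '/home/me', cwd: '/work/repo' }) {
  // Explicit paths are honored verbatim: the caller owns the choice.
  if (opts.configPaths) return opts.configPaths;

  // Subset of the trusted $HOME/IDE defaults shown in the diff.
  const paths = [
    `${env.home}/.mcp.json`,
    `${env.home}/.claude/mcp_servers.json`,
  ];

  // The project config is scanned only on explicit opt-in,
  // or when a confirmServer hook is present to vet every command.
  const includeProjectConfig = opts.includeProjectConfig === true || !!opts.confirmServer;
  if (includeProjectConfig) paths.unshift(`${env.cwd}/.mcp.json`); // front = highest precedence

  return paths;
}

console.log(resolveConfigPaths().includes('/work/repo/.mcp.json')); // false
console.log(resolveConfigPaths({ includeProjectConfig: true })[0]); // /work/repo/.mcp.json
console.log(resolveConfigPaths({ confirmServer: () => true })[0]);  // /work/repo/.mcp.json
```

Note that a truthy value other than `true` for `includeProjectConfig` does not opt in, because the diff compares with `=== true`.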

package/src/provider-anthropic.js CHANGED

@@ -83,6 +83,7 @@ class AnthropicProvider {
     return {
       text,
       toolCalls,
+      model: data.model || this.model,
       usage: {
         inputTokens: data.usage?.input_tokens || 0,
         outputTokens: data.usage?.output_tokens || 0,

package/src/provider-ollama.js CHANGED

@@ -60,6 +60,7 @@ class OllamaProvider {
           ? JSON.parse(tc.function.arguments)
           : tc.function.arguments,
       })),
+      model: data.model || this.model,
       usage: {
         inputTokens: data.prompt_eval_count || 0,
         outputTokens: data.eval_count || 0,

package/src/provider-openai.js CHANGED

@@ -72,6 +72,7 @@ class OpenAIProvider {
         name: tc.function.name,
         arguments: JSON.parse(tc.function.arguments),
       })),
+      model: data.model || this.model,
       usage: {
         inputTokens: data.usage?.prompt_tokens || 0,
         outputTokens: data.usage?.completion_tokens || 0,

package/tools/defer.js CHANGED

@@ -133,16 +133,19 @@ async function readQueue(queuePath) {
   const path = resolveQueuePath(queuePath);
   try {
     const text = await fsp.readFile(path, 'utf8');
-    /** @type {Record<string, Record<string, any>>} */
-    const records = {};
+    // Fold append-only status lines by id (latest wins). A Map — not a plain object — so an
+    // attacker-influenced id from a tampered queue file (e.g. "__proto__", "constructor") is just
+    // an ordinary key and cannot reach the prototype-setter path. Also require a string id.
+    /** @type {Map<string, Record<string, any>>} */
+    const records = new Map();
     for (const line of text.split('\n')) {
       if (!line.trim()) continue;
       let r;
       try { r = JSON.parse(line); } catch { continue; }
-      if (!r.id) continue;
-      records[r.id] = { ...records[r.id], ...r };
+      if (typeof r.id !== 'string' || !r.id) continue;
+      records.set(r.id, { ...records.get(r.id), ...r });
     }
-    return Object.values(records);
+    return [...records.values()];
   } catch (/** @type {any} */ err) {
     if (err.code === 'ENOENT') return [];
     throw err;
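
The difference between the two folding strategies shows up with a queue line whose id is `__proto__`. This is a sketch of the folding loop only, with the file I/O left out; `foldWithObject` and `foldWithMap` are hypothetical names for the 0.16.0 and 0.16.1 shapes, and the queue lines are invented sample data.

```javascript
const lines = [
  '{"id":"a","status":"queued"}',
  '{"id":"a","status":"fired"}',          // later line for the same id wins
  '{"id":"__proto__","status":"queued"}', // hostile id from a tampered file
  '{"id":42,"status":"queued"}',          // non-string id
];

// 0.16.0 shape: plain object keyed by id.
function foldWithObject(input) {
  const records = {};
  for (const line of input) {
    let r;
    try { r = JSON.parse(line); } catch { continue; }
    if (!r.id) continue;
    // For "__proto__" this assignment hits the prototype setter instead of storing a key.
    records[r.id] = { ...records[r.id], ...r };
  }
  return Object.values(records);
}

// 0.16.1 shape: Map keyed by id, string ids only.
function foldWithMap(input) {
  const records = new Map();
  for (const line of input) {
    let r;
    try { r = JSON.parse(line); } catch { continue; }
    if (typeof r.id !== 'string' || !r.id) continue;
    records.set(r.id, { ...records.get(r.id), ...r });
  }
  return [...records.values()];
}

console.log(JSON.stringify(foldWithObject(lines).map((r) => r.id))); // [42,"a"]
console.log(JSON.stringify(foldWithMap(lines).map((r) => r.id)));    // ["a","__proto__"]
```

With the plain object, the `__proto__` record silently disappears from the result and the accumulator's prototype is replaced; with the Map it is an ordinary entry, and the numeric id is rejected up front.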

package/tools/grep-worker.d.ts ADDED

@@ -0,0 +1 @@
+export {};

package/tools/grep-worker.js ADDED

@@ -0,0 +1,20 @@
+'use strict';
+/**
+ * Worker thread for shell_grep's matching phase. Runs the (potentially expensive) regex search
+ * off the main thread so the parent can enforce a hard timeout via `worker.terminate()` — JS
+ * regex backtracking is uninterruptible on its own thread, so isolation is the only sound bound
+ * against catastrophic patterns that slip past the static guard. See tools/shell.js `grepPath`.
+ */
+const { workerData, parentPort } = require('node:worker_threads');
+const { _grepCore } = require('./shell.js');
+// parentPort is non-null inside a worker, but the type is nullable for the main-thread case.
+if (!parentPort) throw new Error('grep-worker.js must be run as a worker thread');
+const port = parentPort;
+_grepCore(workerData).then(
+  (result) => port.postMessage({ ok: true, result }),
+  (err) => port.postMessage({ ok: false, error: err && err.message ? err.message : String(err) }),
+);

package/tools/shell.d.ts CHANGED

@@ -4,6 +4,12 @@ export type GrepArgs = {
     recursive?: boolean | undefined;
     maxMatches?: number | undefined;
     flags?: string | undefined;
+    /**
+     * - Hard wall-clock ceiling in ms (default 5000). The match runs in a
+     * worker thread; on overrun the worker is terminated and the call rejects, so a pattern that slips
+     * past `looksCatastrophic` can no longer hang the host event loop.
+     */
+    timeout?: number | undefined;
 };
 export type RunArgvArgs = {
     argv: string[];
@@ -29,3 +35,31 @@ export type ToolDef = import("../types").ToolDef;
 export function createShellTools(): {
     tools: ToolDef[];
 };
+/**
+ * @typedef {object} GrepArgs
+ * @property {string} pattern
+ * @property {string} path
+ * @property {boolean} [recursive]
+ * @property {number} [maxMatches]
+ * @property {string} [flags]
+ * @property {number} [timeout] - Hard wall-clock ceiling in ms (default 5000). The match runs in a
+ *   worker thread; on overrun the worker is terminated and the call rejects, so a pattern that slips
+ *   past `looksCatastrophic` can no longer hang the host event loop.
+ */
+/**
+ * The actual search: walk, skip binaries, regex-test each line. Runs in a worker thread (see
+ * grep-worker.js) so a runaway regex is killable via `worker.terminate()`. JS RegExp has no
+ * execution timeout and backtracking is uninterruptible on its own thread — isolation is the
+ * only sound bound (the static `looksCatastrophic` guard is a best-effort fast-reject, not a
+ * guarantee; a grounded bypass like `(a|a|a)*` passes it yet backtracks exponentially).
+ * @param {GrepArgs} args
+ */
+export function _grepCore({ pattern, path: rawPath, recursive, maxMatches, flags }: GrepArgs): Promise<{
+    hits: {
+        file: string;
+        line: number;
+        text: string;
+    }[];
+    truncated: boolean;
+    fileCount: number;
+}>;

package/tools/shell.js CHANGED

@@ -17,9 +17,11 @@
 const fs = require('node:fs/promises');
 const path = require('node:path');
 const { exec, execFile } = require('node:child_process');
+const { Worker } = require('node:worker_threads');
 const DEFAULT_READ_MAX_BYTES = 256 * 1024;       // 256 KB
 const DEFAULT_GREP_MAX_MATCHES = 200;
+const DEFAULT_GREP_TIMEOUT_MS = 5_000;           // hard ceiling on a single grep — bounds ReDoS
 const DEFAULT_EXEC_TIMEOUT_MS = 30_000;
 const DEFAULT_EXEC_MAX_BUFFER = 1024 * 1024;     // 1 MB
@@ -137,18 +139,22 @@ function looksCatastrophic(pattern) {
  * @property {boolean} [recursive]
  * @property {number} [maxMatches]
  * @property {string} [flags]
+ * @property {number} [timeout] - Hard wall-clock ceiling in ms (default 5000). The match runs in a
+ *   worker thread; on overrun the worker is terminated and the call rejects, so a pattern that slips
+ *   past `looksCatastrophic` can no longer hang the host event loop.
  */
-/** @param {GrepArgs} args */
-async function grepPath({ pattern, path: rawPath, recursive = true, maxMatches, flags = 'i' }) {
+/**
+ * The actual search: walk, skip binaries, regex-test each line. Runs in a worker thread (see
+ * grep-worker.js) so a runaway regex is killable via `worker.terminate()`. JS RegExp has no
+ * execution timeout and backtracking is uninterruptible on its own thread — isolation is the
+ * only sound bound (the static `looksCatastrophic` guard is a best-effort fast-reject, not a
+ * guarantee; a grounded bypass like `(a|a|a)*` passes it yet backtracks exponentially).
+ * @param {GrepArgs} args
+ */
+async function _grepCore({ pattern, path: rawPath, recursive = true, maxMatches, flags = 'i' }) {
   const resolved = path.resolve(expandHome(rawPath));
   const cap = maxMatches || DEFAULT_GREP_MAX_MATCHES;
-  if (looksCatastrophic(pattern)) {
-    throw new Error(
-      `shell_grep: pattern rejected — nested unbounded quantifier (e.g. "(a+)+") risks catastrophic ` +
-      `backtracking that would block the process. Simplify the regex.`,
-    );
-  }
   let re;
   try {
     re = new RegExp(pattern, flags);
@@ -190,6 +196,53 @@ async function grepPath({ pattern, path: rawPath, recursive = true, maxMatches,
   return { hits, truncated, fileCount: files.length };
 }
+/**
+ * Public grep entry. Fast-rejects obviously catastrophic patterns without paying for a worker,
+ * then runs the search in a worker thread bounded by a hard timeout — so even a pattern that
+ * defeats the static guard degrades to a bounded rejection instead of an event-loop hang.
+ * @param {GrepArgs} args
+ */
+function grepPath(args) {
+  const { pattern, flags = 'i', timeout } = args;
+  if (looksCatastrophic(pattern)) {
+    return Promise.reject(new Error(
+      `shell_grep: pattern rejected — nested unbounded quantifier (e.g. "(a+)+") risks catastrophic ` +
+      `backtracking that would block the process. Simplify the regex.`,
+    ));
+  }
+  // Cheap up-front validation so a syntactically invalid regex fails clearly without a worker spin-up.
+  try {
+    new RegExp(pattern, flags);
+  } catch (/** @type {any} */ err) {
+    return Promise.reject(new Error(`shell_grep: invalid regex — ${err.message}`));
+  }
+  const budgetMs = timeout && timeout > 0 ? timeout : DEFAULT_GREP_TIMEOUT_MS;
+  return new Promise((resolve, reject) => {
+    const worker = new Worker(path.join(__dirname, 'grep-worker.js'), { workerData: args });
+    let settled = false;
+    const done = (fn, val) => {
+      if (settled) return;
+      settled = true;
+      clearTimeout(timer);
+      worker.terminate();
+      fn(val);
+    };
+    const timer = setTimeout(() => {
+      done(reject, new Error(
+        `shell_grep: pattern exceeded ${budgetMs}ms time budget — likely catastrophic backtracking. ` +
+        `Simplify the regex.`,
+      ));
+    }, budgetMs);
+    timer.unref?.();
+    worker.once('message', (msg) => {
+      if (msg && msg.ok) done(resolve, msg.result);
+      else done(reject, new Error((msg && msg.error) || 'shell_grep: worker failed'));
+    });
+    worker.once('error', (err) => done(reject, err));
+  });
+}
 /**
  * @typedef {object} RunArgvArgs
  * @property {string[]} argv
@@ -360,4 +413,4 @@ function createShellTools() {
   return { tools };
 }
-module.exports = { createShellTools };
+module.exports = { createShellTools, _grepCore };
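
Reviewer sketch (not part of the package): the settle-once, terminate-on-timeout pattern that the new `grepPath` uses can be reproduced on its own. `runBounded` and `MATCH_SOURCE` are hypothetical names. Unlike the package, which loads `grep-worker.js` from disk, this sketch passes the worker body inline through the `eval: true` option of `Worker` so that it stays self-contained.

```javascript
'use strict';
const { Worker } = require('node:worker_threads');

// Hypothetical helper: run `source` in a worker thread and reject if it overruns `budgetMs`.
function runBounded(source, workerData, budgetMs) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData });
    let settled = false;
    // Settle exactly once, then always clear the timer and stop the worker.
    const done = (fn, val) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      worker.terminate(); // stops the thread even if it is stuck in synchronous work
      fn(val);
    };
    const timer = setTimeout(
      () => done(reject, new Error(`exceeded ${budgetMs}ms time budget`)),
      budgetMs,
    );
    worker.once('message', (msg) => done(resolve, msg));
    worker.once('error', (err) => done(reject, err));
  });
}

// Worker body: test one regex against one input string and post the boolean result.
const MATCH_SOURCE = `
  const { workerData, parentPort } = require('node:worker_threads');
  const { pattern, flags, input } = workerData;
  parentPort.postMessage(new RegExp(pattern, flags).test(input));
`;

// Usage: a benign pattern resolves well within the budget; a runaway one would reject instead.
runBounded(MATCH_SOURCE, { pattern: 'fo+', flags: 'i', input: 'FOO' }, 5000)
  .then((matched) => console.log('matched:', matched))
  .catch((err) => console.error(err.message));
```

The main thread never runs the regex itself, so its event loop stays responsive, and the timer is the only thing that bounds how long the caller waits.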

package/types/index.d.ts CHANGED

@@ -22,6 +22,8 @@ export interface GenerateResult {
   text: string;
   toolCalls: ToolCall[];
   usage: Usage;
+  /** Model id the response was produced by; preferred over Provider.model for cost accounting. */
+  model?: string | null;
 }
 /** A conversation message in OpenAI chat format. */