npm: freddie 0.0.89 → 0.0.91

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/AGENTS.md +63 -2
package/package.json +5 -2
package/plugins/core-cli/plugin.js +2 -2
package/src/agent/llm_resolver.js +12 -10
package/src/agent/machine.js +46 -15

package/AGENTS.md CHANGED

@@ -9,7 +9,7 @@ Instructions for AI coding assistants working on Freddie.
 - `@mariozechner/pi-ai` — `complete`, `completeSimple`, `AssistantMessageEventStream`, `registerApiProvider`, `getModel`, `calculateCost`, `parseStreamingJson`, `isContextOverflow`. THE provider layer.
 - `@mariozechner/pi-tui` — TUI primitives (Ink-equivalent).
 - `floosie` v0.6.14 — `ProcessorMachine` (xstate). Use for gateway pipelines.
-- `anentrypoint-design` v0.0.27 — webjsx + ripple-ui. Use for any web UI; do NOT add React. Source in C:/dev/anentrypoint-design; freddie links via `file:../anentrypoint-design`.
+- `anentrypoint-design` ^0.0.94 — webjsx + ripple-ui. Use for any web UI; do NOT add React. Source in C:/dev/anentrypoint-design; freddie depends on the registry build (^0.0.94). For local SDK iteration, swap to `file:../anentrypoint-design` and rebuild via `node scripts/build.mjs`.
 - `xstate` v5 — every long-lived state machine (agent turns, gateway lifecycle, approvals).
 ## Plugin architecture (2026-05-03, pre-v1, no compat shims)
@@ -242,7 +242,7 @@ All 21 named integration tests in `test.js` pass (exit 0). Subsystem coverage:
 ## LLM backends and acptoapi
 - **acptoapi bridge** — Integrated at `src/agent/acptoapi-bridge.js` + `src/agent/llm_resolver.js` (commit 5f55f1e). Localhost API (default port 4800) converting OpenAI/Anthropic SDK calls to multiple backends: Kilo Code, opencode, Claude CLI, Anthropic API, Gemini, Ollama, Bedrock. Endpoint `/v1/chat/completions`, OpenAI-compatible, accepts `Bearer none` auth.
-- **acptoapi dep pattern** (2026-05-10) — `package.json` pins `"acptoapi": "file:../acptoapi"` (same pattern as `anentrypoint-design`). Always tracks local SDK without publish cycles. CJS/ESM boundary bridged via `createRequire(import.meta.url)` in freddie ESM files that import acptoapi CJS exports.
+- **acptoapi dep pattern** (2026-05-12) — `package.json` now pins `"acptoapi": "^1.0.55"` from the npm registry (CI auto-bumps on each acptoapi push; restore-package.cjs ROOT FIX prevents file: regressions). For local SDK iteration, swap to `file:../acptoapi` temporarily. CJS/ESM boundary bridged via `createRequire(import.meta.url)` in freddie ESM files that import acptoapi CJS exports.
 - **LLM resolver priority** (2026-05-10) — (1) explicit provider+key, (2) acptoapi if `/v1/models` returns 200, (3) `agent.model_preference` config array (ordered failover, sampler-gated), (4) `sdk.buildAutoChain()` env-key scan, (5) throw. `PROVIDER_KEYS` and `PROVIDER_DEFAULTS` imported from `acptoapi` — not maintained in freddie. `sdk.chat()` returns OpenAI `{choices:[{message}]}` format; `sdkChat()` adapter in llm_resolver converts to freddie's `{content, tool_calls, raw}`.
 - **Model sampler — re-export shim** (2026-05-10) — `src/agent/model-sampler.js` is a 13-line re-export shim over acptoapi sampler. Sampler logic (5-step backoff 30s→480s, createSampler factory, singleton) lives in `c:\dev\acptoapi\lib\sampler.js`. Exports: `isAvailable`, `markFailed`, `markOk`, `resetAvailability`, `getStatus`, `probe`, `startSampler`, `stopSampler`, `createSampler`.
 - **model_preference config key** (2026-05-10) — `agent.model_preference: []` in `~/.freddie/config.yaml`. Array of `{ provider, model? }` objects; `resolveCallLLM` tries each in order, skipping unavailable (sampler-gated) and marking failures with backoff. Config v2 migration adds the key on upgrade from v1.
@@ -302,3 +302,64 @@ To implement:
 4. Register `window.__debug.agents()` observability global
 **Blocked on**: Design decision (what metrics? count only? session associations? perf data?). Deferred pending user clarification.
+## Trajectory recorder schema v2 (2026-05-12)
+`src/agent/machine.js::writeTrajectory()` writes one JSON per turn under `<FREDDIE_HOME>/trajectories/<ts>-<slug>.json` whenever `agent.save_trajectories=true` OR `--witness <path>` is set on `freddie exec`. Schema (`schema_version: 2`):
+```
+{
+  schema_version: 2, ts, prompt, provider, model, skill, cwd,
+  iterations, result, error, error_stack,
+  state_transitions: ["PLAN"|"EXECUTE"|"VERIFY"|"COMPLETE", ...],
+  tool_calls: [{name, arguments, id}],
+  tool_results: [{tool_call_id, content}],
+  llm_calls: [{ok, durationMs, provider, model, content_length, tool_calls_count, ts, error?, stack?}],
+  llm_chunks_count, compressor_invocations, events, messages
+}
+```
+Optional `--witness <path>` writes a parallel JSONL stream with one event per line (`session_start`, `message`, `llm_call`, `session_end`) for downstream tail/grep. `runTurn({witnessPath})` is the in-code equivalent.
+Captured fields per acceptance bar:
+- (a) tool_call args → `tool_calls[].arguments` + `messages[].tool_calls`
+- (b) tool_result → `tool_results[]` + `messages[role:'tool']`
+- (c) LLM call timing/duration/provider/content_length/tool_calls_count → `llm_calls[]`
+- (d) compressor invocations → counted from `messages[role:'system']` matching `[trajectory.compressed]`
+- (e) errors with stack → `error_stack` + per-`llm_call` `stack` field
+Witnessed 2026-05-12: mistral-large 4-iteration loop on penguins repo produced 4 successful llm_calls + 1 failing (429 rate-limit) all captured with full stack trace. See `.gm/agent-loop-witness.jsonl` for canonical example.
+## LLM validation witness format (.gm/llm-validation.json)
+`.gm/llm-validation.json` is the canonical witness for provider reachability. Generated by an out-of-band validator script (per session) that probes every key in `process.env` matching `<PROVIDER>_API_KEY`, plus the acp-daemon endpoints (kilo on 4780, opencode on 4790) and claude-cli subprocess. Shape:
+```
+{
+  timestamp, env_keys: [...], targets: [{provider, source, daemonUp}],
+  results: [{provider, ok, ms, excerpt, error, source}],
+  sampler: [{provider, ok, failCount, nextCheckIn}],
+  pass_count, total
+}
+```
+Witnessed 2026-05-12: 7/15 pass — groq, google, mistral, openrouter, sambanova, kilo, claude-cli green. opencode-via-acp daemonUp=false until `opencode serve --port 4790` is started (see opencode caveat below).
+## opencode CLI shim caveat (2026-05-12)
+`opencode-ai@1.14.48` installs successfully via `npm install -g opencode-ai` and exposes a working binary at `C:\Users\user\AppData\Roaming\npm\opencode.cmd` (Windows). The bun-installed shim at `C:\Users\user\.bun\bin\opencode.exe` is BROKEN — its wrapper looks for `C:\Users\user\node_modules\opencode-ai\bin\opencode` (wrong path) and fails with `MODULE_NOT_FOUND`. **Workflow**: use npm version; do NOT `bun install -g opencode-ai`. To start ACP daemon: `& 'C:\Users\user\AppData\Roaming\npm\opencode.cmd' serve --port 4790 --hostname 127.0.0.1`. Verified by GET http://127.0.0.1:4790/ returning 200 (HTML shell). Warning `OPENCODE_SERVER_PASSWORD is not set` is harmless for localhost use.
+## gm-cc skill registry (2026-05-12)
+`plugins/gm-cc/plugin.js` auto-discovers 12 SKILL.md files from npm `gm-cc` package and registers them via `pi.skills.register({name: 'gm:<name>', description, content, source: 'gm-cc'})`. Registered skills: browser, code-search, create-lang-plugin, gm, gm-complete, gm-emit, gm-execute, governance, pages, planning, ssh, update-docs. Inspect with `node bin/freddie.js skills | findstr gm:`.
+## kilo ACP integration (2026-05-12)
+`src/agent/llm_resolver.js::acpChat()` (line 33) speaks the kilo ACP protocol: POST `/session` → GET `/event` (SSE) → POST `/session/<id>/message`. Streams `message.part.updated` events to assemble content; terminates on `session.idle`. Required ordering: `/event` must be opened BEFORE `/message` POST or messages drop. Configured in `ACP_BACKENDS` at line 14: kilo on `http://localhost:4780`, opencode on `http://localhost:4790`.
+Note: kilo + opencode ACP backends return content only, no tool_calls (the LLM-side tool layer is opaque to freddie). For multi-iteration tool-using loops, use OpenAI-compatible providers (mistral, openrouter, sambanova, groq) instead.
+## scripts/sync-upstream.mjs (2026-05-12)
+`node scripts/sync-upstream.mjs [--dry-run] [pkg ...]` bumps sibling dep entries (plugsdk, acptoapi, anentrypoint-design, gm-cc) in package.json to `^<latest>` from npm registry, then runs `npm install --package-lock-only`. Skips `file:` deps (local-dev pattern). Wired into `.github/workflows/sync-upstream.yml` (weekly cron + workflow_dispatch) which opens a PR via peter-evans/create-pull-request@v6 when changes land. Dry-run validated: detected `acptoapi: ^1.0.52 -> ^1.0.54`.
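
Note on the witness stream: the AGENTS.md text added above describes a JSONL file with one event per line (`session_start`, `message`, `llm_call`, `session_end`). A minimal sketch of how a downstream consumer might summarize such a file follows. The function name `summarizeWitness` and the sample data are hypothetical and not part of freddie; only the event names and the `ok` / `durationMs` fields are taken from the diff.

```javascript
// Summarize a witness JSONL stream (one JSON object per line).
// Hypothetical helper: only the event names and llm_call fields come from the diff.
function summarizeWitness(jsonlText) {
  const summary = { messages: 0, llmCalls: 0, failedLlmCalls: 0, totalLlmMs: 0, ended: false }
  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue
    let ev
    try { ev = JSON.parse(line) } catch { continue } // skip malformed lines
    if (ev.event === 'message') summary.messages += 1
    else if (ev.event === 'llm_call') {
      summary.llmCalls += 1
      if (!ev.ok) summary.failedLlmCalls += 1
      summary.totalLlmMs += Number(ev.durationMs) || 0
    } else if (ev.event === 'session_end') summary.ended = true
  }
  return summary
}

// Invented sample data, shaped like the events described above.
const sample = [
  JSON.stringify({ event: 'session_start', prompt: 'hi' }),
  JSON.stringify({ event: 'message', index: 0, role: 'user', content: 'hi' }),
  JSON.stringify({ event: 'llm_call', ok: true, durationMs: 120 }),
  JSON.stringify({ event: 'llm_call', ok: false, durationMs: 30, error: '429' }),
  JSON.stringify({ event: 'session_end', iterations: 1 }),
].join('\n')

console.log(summarizeWitness(sample))
```

Because the file is written in one piece at the end of the turn (see `writeTrajectory` in the machine.js diff), a consumer like this would typically run after `freddie exec` exits rather than tail the file live.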

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "freddie",
-  "version": "0.0.89",
+  "version": "0.0.91",
   "type": "module",
   "description": "Open JS agent harness built on pi-mono, floosie, xstate, and anentrypoint-design",
   "bin": {
@@ -27,7 +27,7 @@
     "xstate": "^5.31.0",
     "zod": "^4.0.0",
     "anentrypoint-design": "^0.0.94",
-    "acptoapi": "^1.0.51"
+    "acptoapi": "^1.0.56"
   },
   "optionalDependencies": {
     "@libsql/darwin-arm64": "0.3.19",
@@ -41,6 +41,9 @@
   "engines": {
     "node": ">=20.6.0"
   },
+  "overrides": {
+    "fast-xml-builder": "^1.1.7"
+  },
   "files": [
     "bin/",
     "src/",

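Note on the caret ranges: both sibling dependencies in this hunk use caret ranges, and under npm's semver rules a caret allows changes that do not modify the left-most non-zero component. That makes `^1.0.56` accept any 1.x.y at or above 1.0.56, while `^0.0.94` accepts only 0.0.94. The sketch below illustrates that rule for plain `major.minor.patch` versions; `caretAllows` is a hypothetical helper, it ignores prerelease tags and build metadata, and it is not how npm itself is implemented.

```javascript
// Hypothetical helper illustrating npm caret-range semantics for x.y.z versions only.
function parseVersion(v) { return v.split('.').map(Number) }

function caretAllows(range, version) {
  const [rMaj, rMin, rPat] = parseVersion(range.replace(/^\^/, ''))
  const [maj, min, pat] = parseVersion(version)
  // Lower bound: the version must be at least the range's base version.
  const atLeastBase = maj > rMaj || (maj === rMaj && (min > rMin || (min === rMin && pat >= rPat)))
  if (!atLeastBase) return false
  // Upper bound: the left-most non-zero component must stay fixed.
  if (rMaj > 0) return maj === rMaj
  if (rMin > 0) return maj === 0 && min === rMin
  return maj === 0 && min === 0 && pat === rPat
}

console.log(caretAllows('^1.0.56', '1.2.0'))   // true
console.log(caretAllows('^1.0.56', '2.0.0'))   // false
console.log(caretAllows('^0.0.94', '0.0.95'))  // false
```
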
package/plugins/core-cli/plugin.js CHANGED

@@ -56,12 +56,12 @@ export default {
             try { ({ callLLM } = await import('../../src/agent/pi-bridge.js')) } catch {}
             await interactive({ callLLM })
         } })
-        C({ name: 'exec', description: 'Run a single prompt through the agent and exit', options: [{ flag: '--prompt <prompt>', required: true }, { flag: '--model <model>', default: '' }, { flag: '--provider <provider>', default: '' }, { flag: '--skill <skill>', default: '' }, { flag: '--cwd <cwd>', default: '' }, { flag: '--timeout <ms>', default: '60000' }], action: async (opts) => {
+        C({ name: 'exec', description: 'Run a single prompt through the agent and exit', options: [{ flag: '--prompt <prompt>', required: true }, { flag: '--model <model>', default: '' }, { flag: '--provider <provider>', default: '' }, { flag: '--skill <skill>', default: '' }, { flag: '--cwd <cwd>', default: '' }, { flag: '--timeout <ms>', default: '60000' }, { flag: '--witness <path>', default: '' }], action: async (opts) => {
             const { runTurn } = await import('../../src/agent/machine.js')
             let provider = opts.provider || undefined
             let model = opts.model || undefined
             if (!provider && model && /^[a-z][a-z0-9-]*\//.test(model)) { provider = model.split('/')[0]; model = model.slice(provider.length + 1) }
-            const out = await runTurn({ prompt: opts.prompt, provider, model, skill: opts.skill || undefined, cwd: opts.cwd || undefined, timeoutMs: Number(opts.timeout) })
+            const out = await runTurn({ prompt: opts.prompt, provider, model, skill: opts.skill || undefined, cwd: opts.cwd || undefined, timeoutMs: Number(opts.timeout), witnessPath: opts.witness || undefined })
             if (out.error) { console.error('error:', out.error); process.exit(1) }
             console.log(out.result || out.messages?.at(-1)?.content || '')
             process.exit(0)
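
Note on the `provider/model` shorthand: the unchanged context lines of this hunk split a `--model` value such as `kilo/openrouter/free` into a provider and a model when no `--provider` is given. The sketch below isolates that logic so its edge cases are easier to see. `splitProviderModel` is a hypothetical name, and the model strings are illustrative; the regex and slicing are copied from the hunk.

```javascript
// Hypothetical standalone version of the provider/model split in the exec action.
function splitProviderModel({ provider, model }) {
  let p = provider || undefined
  let m = model || undefined
  // Only split when no explicit provider was given and the model starts with "<lowercase-id>/".
  if (!p && m && /^[a-z][a-z0-9-]*\//.test(m)) {
    p = m.split('/')[0]
    m = m.slice(p.length + 1) // keep everything after the first slash, including later slashes
  }
  return { provider: p, model: m }
}

console.log(splitProviderModel({ provider: '', model: 'kilo/openrouter/free' }))
console.log(splitProviderModel({ provider: 'mistral', model: 'x-ai/some-model' }))
console.log(splitProviderModel({ provider: '', model: 'mistral-large' }))
```

Only the first path segment is treated as the provider, so model IDs that themselves contain slashes survive intact, and an explicit `--provider` always wins over the prefix.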

package/src/agent/llm_resolver.js CHANGED

@@ -12,7 +12,7 @@ export const PROVIDER_KEYS = sdk.PROVIDER_KEYS
 export const DEFAULTS = sdk.PROVIDER_DEFAULTS
 const ACP_BACKENDS = {
-    kilo: { base: 'http://localhost:4780', providerID: 'kilo', defaultModel: 'x-ai/grok-code-fast-1:optimized:free' },
+    kilo: { base: 'http://localhost:4780', providerID: 'kilo', defaultModel: 'openrouter/free' },
     opencode: { base: 'http://localhost:4790', providerID: 'opencode', defaultModel: 'minimax-m2.5-free' },
 }
@@ -36,13 +36,12 @@ async function acpChat(prefix, model, input) {
     if (!sessRes.ok) throw new Error(`ACP ${prefix} /session ${sessRes.status}`)
     const sessionId = (await sessRes.json()).id
     const userMsg = input.messages.filter(m => m.role === 'user').slice(-1)[0]?.content || ''
-    const body = { parts: [{ type: 'text', text: String(userMsg) }] }
-    if (b.providerID === 'opencode') body.model = { providerID: 'opencode', modelID: model || b.defaultModel }
-    else { body.providerID = 'kilo'; body.modelID = model || b.defaultModel }
-    await fetch(`${b.base}/session/${sessionId}/message`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify(body), signal: AbortSignal.timeout(120000) })
-    let content = ''
+    const body = { parts: [{ type: 'text', text: String(userMsg) }], model: { providerID: b.providerID, modelID: model || b.defaultModel } }
     const evRes = await fetch(`${b.base}/event`, { method: 'GET', signal: AbortSignal.timeout(120000) })
     if (!evRes.ok) throw new Error(`ACP ${prefix} /event ${evRes.status}`)
+    const msgRes = await fetch(`${b.base}/session/${sessionId}/message`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify(body), signal: AbortSignal.timeout(120000) })
+    if (!msgRes.ok) throw new Error(`ACP ${prefix} /message ${msgRes.status}: ${(await msgRes.text()).slice(0,200)}`)
+    let content = ''; let sawAssistantText = false
     const reader = evRes.body.getReader(); const dec = new TextDecoder(); let buf = ''
     while (true) {
         const { value, done } = await reader.read(); if (done) break
@@ -52,9 +51,10 @@ async function acpChat(prefix, model, input) {
             if (!raw.startsWith('data: ')) continue
             try { const ev = JSON.parse(raw.slice(6))
                 if (ev.properties?.sessionID && ev.properties.sessionID !== sessionId) continue
-                if (ev.properties?.part?.type === 'text' && ev.properties.part.text) content += ev.properties.part.text
-                if (ev.event === 'session.complete' || ev.properties?.complete) return { content: content.trim(), tool_calls: [], raw: { provider: prefix, model } }
-            } catch {}
+                if (ev.type === 'message.part.updated' && ev.properties?.part?.type === 'text' && ev.properties.part.text) { content = ev.properties.part.text; sawAssistantText = true }
+                if (ev.type === 'session.error') throw new Error(`ACP ${prefix} session.error: ${JSON.stringify(ev.properties?.error || {}).slice(0,200)}`)
+                if (ev.type === 'session.idle') return { content: content.trim(), tool_calls: [], raw: { provider: prefix, model } }
+            } catch (e) { if (/session.error/.test(e.message)) throw e }
         }
     }
     return { content: content.trim(), tool_calls: [], raw: { provider: prefix, model } }
@@ -115,7 +115,9 @@ function tryParseJson(s) { try { return typeof s === 'string' ? JSON.parse(s) :
 async function hasKey(provider) {
     if (provider === 'claude-cli' || provider === 'kilo' || provider === 'opencode') return true
     const resolved = await resolveKey(provider).catch(() => ({ value: null }))
-    return !!resolved.value
+    if (!resolved.value) return false
+    if (provider === 'cloudflare' && !process.env.CLOUDFLARE_ACCOUNT_ID) return false
+    return true
 }
 function defaultModel(provider) {
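
Note on the event handling in `acpChat`: 0.0.91 replaces the accumulated text on every `message.part.updated` event instead of appending, raises on `session.error`, and finishes on `session.idle`. The sketch below reproduces that state handling over pre-decoded text chunks so it can run without a daemon. `assembleAcpContent` is a hypothetical name, the newline-based buffering is an assumption (those lines are elided from the hunk), and the sample events are invented.

```javascript
// Hypothetical, network-free version of the SSE handling in acpChat.
// chunks: decoded text chunks as they might arrive from the /event stream.
function assembleAcpContent(chunks, sessionId) {
  let buf = ''
  let content = ''
  for (const chunk of chunks) {
    buf += chunk
    const lines = buf.split('\n') // assumption: events are newline-delimited "data: ..." lines
    buf = lines.pop()             // keep the trailing partial line for the next chunk
    for (const raw of lines) {
      if (!raw.startsWith('data: ')) continue
      let ev
      try { ev = JSON.parse(raw.slice(6)) } catch { continue } // ignore unparsable lines
      const props = ev.properties || {}
      if (props.sessionID && props.sessionID !== sessionId) continue // other sessions share the stream
      if (ev.type === 'message.part.updated' && props.part?.type === 'text' && props.part.text) {
        content = props.part.text // replace, as 0.0.91 does; assumes each update carries the full text so far
      }
      if (ev.type === 'session.error') throw new Error('session.error: ' + JSON.stringify(props.error || {}))
      if (ev.type === 'session.idle') return { content: content.trim(), done: true }
    }
  }
  return { content: content.trim(), done: false }
}

// Invented sample stream, split mid-event to mimic a chunk boundary.
const sse = (o) => 'data: ' + JSON.stringify(o) + '\n\n'
const stream = [
  sse({ type: 'message.part.updated', properties: { sessionID: 's1', part: { type: 'text', text: 'Hel' } } }),
  sse({ type: 'message.part.updated', properties: { sessionID: 's2', part: { type: 'text', text: 'other session' } } }),
  sse({ type: 'message.part.updated', properties: { sessionID: 's1', part: { type: 'text', text: 'Hello' } } }),
  sse({ type: 'session.idle', properties: { sessionID: 's1' } }),
].join('')
const chunks = [stream.slice(0, 50), stream.slice(50)]

console.log(assembleAcpContent(chunks, 's1'))
```

Parsing in its own `try` and throwing outside it avoids having to re-identify the `session.error` exception inside a catch-all handler, which the shipped code does by matching on the error message.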

package/src/agent/machine.js CHANGED

@@ -6,8 +6,19 @@ import { resolveCallLLM } from './llm_resolver.js'
 const log = logger('agent')
-export function createAgentMachine({ provider, model, maxIterations = 90, callLLM, enabledToolsets = ['core'], disabledToolsets = [] } = {}) {
-    const llm = callLLM || resolveCallLLM({ provider, model })
+export function createAgentMachine({ provider, model, maxIterations = 90, callLLM, enabledToolsets = ['core'], disabledToolsets = [], events } = {}) {
+    const baseLLM = callLLM || resolveCallLLM({ provider, model })
+    const llm = events ? async (input) => {
+        const t0 = Date.now()
+        try {
+            const out = await baseLLM(input)
+            events.push({ type: 'llm_call', ok: true, durationMs: Date.now() - t0, provider: out?.raw?.provider || provider, model: out?.raw?.model || model, content_length: (out?.content || '').length, tool_calls_count: (out?.tool_calls || []).length, ts: new Date().toISOString() })
+            return out
+        } catch (e) {
+            events.push({ type: 'llm_call', ok: false, durationMs: Date.now() - t0, provider, model, error: String(e?.message || e), stack: e?.stack || null, ts: new Date().toISOString() })
+            throw e
+        }
+    } : baseLLM
     return createMachine({
         id: 'freddie-agent',
         initial: 'idle',
@@ -85,10 +96,10 @@ export function createAgentMachine({ provider, model, maxIterations = 90, callLL
     })
 }
-async function writeTrajectory(out, { prompt, provider, model, skill, cwd }) {
+async function writeTrajectory(out, { prompt, provider, model, skill, cwd, events = [], errorStack = null, witnessPath = null }) {
     try {
         const { getConfigValue } = await import('../config.js')
-        if (!getConfigValue('agent.save_trajectories', false)) return
+        if (!getConfigValue('agent.save_trajectories', false) && !witnessPath) return
         const { getFreddieHome } = await import('../home.js')
         const fs = await import('node:fs')
         const path = await import('node:path')
@@ -96,25 +107,44 @@ async function writeTrajectory(out, { prompt, provider, model, skill, cwd }) {
         fs.mkdirSync(dir, { recursive: true })
         const states = []
         const toolCalls = []
+        const toolResults = []
+        let compressorInvocations = 0
         for (const m of out.messages || []) {
-            if (m.role === 'assistant' && m.tool_calls?.length) { states.push('EXECUTE'); for (const tc of m.tool_calls) toolCalls.push({ name: tc.name || tc.function?.name, arguments: tc.arguments || tc.function?.arguments || {} }) }
+            if (m.role === 'assistant' && m.tool_calls?.length) { states.push('EXECUTE'); for (const tc of m.tool_calls) toolCalls.push({ name: tc.name || tc.function?.name, arguments: tc.arguments || tc.function?.arguments || {}, id: tc.id }) }
             else if (m.role === 'user') states.push('PLAN')
             else if (m.role === 'assistant') states.push('COMPLETE')
-            else if (m.role === 'tool') states.push('VERIFY')
+            else if (m.role === 'tool') { states.push('VERIFY'); toolResults.push({ tool_call_id: m.tool_call_id, content: typeof m.content === 'string' ? m.content : JSON.stringify(m.content) }) }
+            if (m.role === 'system' && typeof m.content === 'string' && /\[trajectory\.compressed\]/.test(m.content)) compressorInvocations += 1
         }
         const ts = new Date().toISOString().replace(/[:.]/g, '-').replace(/Z$/, '')
         const slug = (prompt || 'turn').slice(0, 40).replace(/[^a-zA-Z0-9-]+/g, '-').replace(/^-+|-+$/g, '').toLowerCase()
+        const llmCalls = events.filter(e => e.type === 'llm_call')
+        const streamChunks = events.filter(e => e.type === 'llm_chunk')
+        const payload = {
+            schema_version: 2, ts, prompt, provider, model, skill, cwd,
+            iterations: out.iterations, result: out.result, error: out.error, error_stack: errorStack,
+            state_transitions: states, tool_calls: toolCalls, tool_results: toolResults,
+            llm_calls: llmCalls, llm_chunks_count: streamChunks.length,
+            compressor_invocations: compressorInvocations,
+            events, messages: out.messages,
+        }
         const file = path.join(dir, `${ts}-${slug}.json`)
-        fs.writeFileSync(file, JSON.stringify({
-            ts, prompt, provider, model, skill, cwd,
-            iterations: out.iterations, result: out.result, error: out.error,
-            state_transitions: states, tool_calls: toolCalls,
-            messages: out.messages,
-        }, null, 2))
+        fs.writeFileSync(file, JSON.stringify(payload, null, 2))
+        if (witnessPath) {
+            const jsonl = [
+                JSON.stringify({ event: 'session_start', ts, prompt, provider, model, skill, cwd }),
+                ...(out.messages || []).map((m, i) => JSON.stringify({ event: 'message', index: i, role: m.role, content: m.content, tool_calls: m.tool_calls || null, tool_call_id: m.tool_call_id || null })),
+                ...llmCalls.map(e => JSON.stringify({ event: 'llm_call', ...e })),
+                JSON.stringify({ event: 'session_end', iterations: out.iterations, error: out.error, error_stack: errorStack, compressor_invocations: compressorInvocations }),
+            ].join('\n')
+            fs.mkdirSync(path.dirname(witnessPath), { recursive: true })
+            fs.writeFileSync(witnessPath, jsonl)
+        }
     } catch (_) {}
 }
-export async function runTurn({ prompt, messages = [], model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations = 90, timeoutMs = 30000, cwd, skill } = {}) {
+export async function runTurn({ prompt, messages = [], model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations = 90, timeoutMs = 30000, cwd, skill, witnessPath } = {}) {
+    const events = []
     const initMessages = [...messages]
     const systemParts = []
     if (cwd) systemParts.push(`Working directory: ${cwd}. Always pass cwd="${cwd}" to bash tool calls. When reading or writing files use paths relative to this directory or absolute paths under it.`)
@@ -124,7 +154,7 @@ export async function runTurn({ prompt, messages = [], model, provider, callLLM,
         if (skillDef?.content) systemParts.push('Skill context:\n' + skillDef.content)
     }
     if (systemParts.length > 0) initMessages.unshift({ role: 'user', content: systemParts.join('\n\n') })
-    const machine = createAgentMachine({ model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations })
+    const machine = createAgentMachine({ model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations, events })
     const actor = createActor(machine, { input: { messages: initMessages } })
     actor.start()
     actor.send({ type: 'SUBMIT', prompt })
@@ -133,7 +163,8 @@ export async function runTurn({ prompt, messages = [], model, provider, callLLM,
         actor.subscribe(snap => {
             if (snap.status === 'done') {
                 clearTimeout(t)
-                writeTrajectory(snap.output, { prompt, provider, model, skill, cwd }).finally(() => resolve(snap.output))
+                const errorStack = snap.output?.error ? (events.find(e => e.type === 'llm_call' && !e.ok)?.stack || null) : null
+                writeTrajectory(snap.output, { prompt, provider, model, skill, cwd, events, errorStack, witnessPath }).finally(() => resolve(snap.output))
             }
         })
     })
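
Note on the instrumentation pattern: `createAgentMachine` now wraps the LLM callable so every call pushes an `llm_call` event with timing and outcome, and `runTurn` takes `error_stack` from the first failed call when the turn ends with an error. A minimal sketch of the same pattern outside the state machine follows. `withLlmEvents`, `firstFailureStack`, and the fake LLM are hypothetical names for illustration; the event fields mirror the diff, minus the `ts` timestamp.

```javascript
// Hypothetical standalone version of the llm_call event wrapper.
function withLlmEvents(baseLLM, events, { provider, model } = {}) {
  return async (input) => {
    const t0 = Date.now()
    try {
      const out = await baseLLM(input)
      events.push({
        type: 'llm_call', ok: true, durationMs: Date.now() - t0,
        provider: out?.raw?.provider || provider, model: out?.raw?.model || model,
        content_length: (out?.content || '').length,
        tool_calls_count: (out?.tool_calls || []).length,
      })
      return out
    } catch (e) {
      events.push({
        type: 'llm_call', ok: false, durationMs: Date.now() - t0, provider, model,
        error: String(e?.message || e), stack: e?.stack || null,
      })
      throw e // recording must not swallow the failure
    }
  }
}

// Stack of the first failed call, or null when every call succeeded.
function firstFailureStack(events) {
  return events.find(e => e.type === 'llm_call' && !e.ok)?.stack || null
}

async function demo() {
  const events = []
  let n = 0
  // Fake LLM that fails on its second call.
  const fakeLLM = async () => {
    if (++n === 2) throw new Error('429 rate limit')
    return { content: 'ok', tool_calls: [] }
  }
  const llm = withLlmEvents(fakeLLM, events, { provider: 'fake', model: 'fake-1' })
  await llm({ messages: [] })
  await llm({ messages: [] }).catch(() => {})
  return { events, stack: firstFailureStack(events) }
}

demo().then(({ events }) => console.log(events.map(e => e.ok)))
```

In the shipped code the stack is attached only when the turn's output carries an error, so a turn that recovers from a failed call keeps the failure in `llm_calls[]` but leaves `error_stack` null.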