freddie 0.0.88 → 0.0.90
- package/AGENTS.md +61 -0
- package/package.json +5 -2
- package/plugins/core-cli/plugin.js +10 -3
- package/src/agent/llm_resolver.js +12 -10
- package/src/agent/machine.js +46 -15
package/AGENTS.md
CHANGED
|
@@ -302,3 +302,64 @@ To implement:
|
|
|
302
302
|
4. Register `window.__debug.agents()` observability global
|
|
303
303
|
|
|
304
304
|
**Blocked on**: Design decision (what metrics? count only? session associations? perf data?). Deferred pending user clarification.
|
|
305
|
+
|
|
306
|
+
## Trajectory recorder schema v2 (2026-05-12)
|
|
307
|
+
|
|
308
|
+
`src/agent/machine.js::writeTrajectory()` writes one JSON per turn under `<FREDDIE_HOME>/trajectories/<ts>-<slug>.json` whenever `agent.save_trajectories=true` OR `--witness <path>` is set on `freddie exec`. Schema (`schema_version: 2`):
|
|
309
|
+
|
|
310
|
+
```
|
|
311
|
+
{
|
|
312
|
+
schema_version: 2, ts, prompt, provider, model, skill, cwd,
|
|
313
|
+
iterations, result, error, error_stack,
|
|
314
|
+
state_transitions: ["PLAN"|"EXECUTE"|"VERIFY"|"COMPLETE", ...],
|
|
315
|
+
tool_calls: [{name, arguments, id}],
|
|
316
|
+
tool_results: [{tool_call_id, content}],
|
|
317
|
+
llm_calls: [{ok, durationMs, provider, model, content_length, tool_calls_count, ts, error?, stack?}],
|
|
318
|
+
llm_chunks_count, compressor_invocations, events, messages
|
|
319
|
+
}
|
|
320
|
+
```
|
|
321
|
+
|
|
322
|
+
Optional `--witness <path>` writes a parallel JSONL stream with one event per line (`session_start`, `message`, `llm_call`, `session_end`) for downstream tail/grep. `runTurn({witnessPath})` is the in-code equivalent.
|
|
323
|
+
|
|
324
|
+
Captured fields per acceptance bar:
|
|
325
|
+
- (a) tool_call args → `tool_calls[].arguments` + `messages[].tool_calls`
|
|
326
|
+
- (b) tool_result → `tool_results[]` + `messages[role:'tool']`
|
|
327
|
+
- (c) LLM call timing/duration/provider/content_length/tool_calls_count → `llm_calls[]`
|
|
328
|
+
- (d) compressor invocations → counted from `messages[role:'system']` matching `[trajectory.compressed]`
|
|
329
|
+
- (e) errors with stack → `error_stack` + per-`llm_call` `stack` field
|
|
330
|
+
|
|
331
|
+
Witnessed 2026-05-12: mistral-large 4-iteration loop on penguins repo produced 4 successful llm_calls + 1 failing (429 rate-limit) all captured with full stack trace. See `.gm/agent-loop-witness.jsonl` for canonical example.
|
|
332
|
+
|
|
333
|
+
## LLM validation witness format (.gm/llm-validation.json)
|
|
334
|
+
|
|
335
|
+
`.gm/llm-validation.json` is the canonical witness for provider reachability. Generated by an out-of-band validator script (per session) that probes every key in `process.env` matching `<PROVIDER>_API_KEY`, plus the acp-daemon endpoints (kilo on 4780, opencode on 4790) and claude-cli subprocess. Shape:
|
|
336
|
+
|
|
337
|
+
```
|
|
338
|
+
{
|
|
339
|
+
timestamp, env_keys: [...], targets: [{provider, source, daemonUp}],
|
|
340
|
+
results: [{provider, ok, ms, excerpt, error, source}],
|
|
341
|
+
sampler: [{provider, ok, failCount, nextCheckIn}],
|
|
342
|
+
pass_count, total
|
|
343
|
+
}
|
|
344
|
+
```
|
|
345
|
+
|
|
346
|
+
Witnessed 2026-05-12: 7/15 pass — groq, google, mistral, openrouter, sambanova, kilo, claude-cli green. opencode-via-acp daemonUp=false until `opencode serve --port 4790` is started (see opencode caveat below).
|
|
347
|
+
|
|
348
|
+
## opencode CLI shim caveat (2026-05-12)
|
|
349
|
+
|
|
350
|
+
`opencode-ai@1.14.48` installs successfully via `npm install -g opencode-ai` and exposes a working binary at `C:\Users\user\AppData\Roaming\npm\opencode.cmd` (Windows). The bun-installed shim at `C:\Users\user\.bun\bin\opencode.exe` is BROKEN — its wrapper looks for `C:\Users\user\node_modules\opencode-ai\bin\opencode` (wrong path) and fails with `MODULE_NOT_FOUND`. **Workflow**: use npm version; do NOT `bun install -g opencode-ai`. To start ACP daemon: `& 'C:\Users\user\AppData\Roaming\npm\opencode.cmd' serve --port 4790 --hostname 127.0.0.1`. Verified by GET http://127.0.0.1:4790/ returning 200 (HTML shell). Warning `OPENCODE_SERVER_PASSWORD is not set` is harmless for localhost use.
|
|
351
|
+
|
|
352
|
+
## gm-cc skill registry (2026-05-12)
|
|
353
|
+
|
|
354
|
+
`plugins/gm-cc/plugin.js` auto-discovers 12 SKILL.md files from npm `gm-cc` package and registers them via `pi.skills.register({name: 'gm:<name>', description, content, source: 'gm-cc'})`. Registered skills: browser, code-search, create-lang-plugin, gm, gm-complete, gm-emit, gm-execute, governance, pages, planning, ssh, update-docs. Inspect with `node bin/freddie.js skills | findstr gm:`.
|
|
355
|
+
|
|
356
|
+
## kilo ACP integration (2026-05-12)
|
|
357
|
+
|
|
358
|
+
`src/agent/llm_resolver.js::acpChat()` (line 33) speaks the kilo ACP protocol: POST `/session` → GET `/event` (SSE) → POST `/session/<id>/message`. Streams `message.part.updated` events to assemble content; terminates on `session.idle`. Required ordering: `/event` must be opened BEFORE `/message` POST or messages drop. Configured in `ACP_BACKENDS` at line 14: kilo on `http://localhost:4780`, opencode on `http://localhost:4790`.
|
|
359
|
+
|
|
360
|
+
Note: kilo + opencode ACP backends return content only, no tool_calls (the LLM-side tool layer is opaque to freddie). For multi-iteration tool-using loops, use OpenAI-compatible providers (mistral, openrouter, sambanova, groq) instead.
|
|
361
|
+
|
|
362
|
+
## scripts/sync-upstream.mjs (2026-05-12)
|
|
363
|
+
|
|
364
|
+
`node scripts/sync-upstream.mjs [--dry-run] [pkg ...]` bumps sibling dep entries (plugsdk, acptoapi, anentrypoint-design, gm-cc) in package.json to `^<latest>` from npm registry, then runs `npm install --package-lock-only`. Skips `file:` deps (local-dev pattern). Wired into `.github/workflows/sync-upstream.yml` (weekly cron + workflow_dispatch) which opens a PR via peter-evans/create-pull-request@v6 when changes land. Dry-run validated: detected `acptoapi: ^1.0.52 -> ^1.0.54`.
|
|
365
|
+
|
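The witness stream described in the AGENTS.md hunk above is one JSON object per line, so it can be consumed with a plain line split. A minimal sketch, assuming the event names and fields shown in this diff (`session_start`, `message`, `llm_call`, `session_end`); `summarizeWitness` and the sample data are hypothetical and not part of freddie.

```javascript
// Hypothetical helper: summarize a --witness JSONL stream.
// Field names (event, ok, durationMs, iterations) follow the schema shown in the diff.
function summarizeWitness(jsonl) {
  const summary = { messages: 0, llmCallsOk: 0, llmCallsFailed: 0, totalDurationMs: 0, iterations: null }
  for (const line of jsonl.split('\n')) {
    if (!line.trim()) continue // tolerate blank lines
    const ev = JSON.parse(line)
    if (ev.event === 'message') {
      summary.messages += 1
    } else if (ev.event === 'llm_call') {
      if (ev.ok) summary.llmCallsOk += 1
      else summary.llmCallsFailed += 1
      summary.totalDurationMs += ev.durationMs || 0
    } else if (ev.event === 'session_end') {
      summary.iterations = ev.iterations ?? null
    }
  }
  return summary
}

// Made-up sample stream, shaped like the events the diff writes.
const sample = [
  JSON.stringify({ event: 'session_start', prompt: 'hi' }),
  JSON.stringify({ event: 'message', index: 0, role: 'user', content: 'hi' }),
  JSON.stringify({ event: 'llm_call', ok: true, durationMs: 120 }),
  JSON.stringify({ event: 'llm_call', ok: false, durationMs: 30, error: '429' }),
  JSON.stringify({ event: 'session_end', iterations: 2 }),
].join('\n')

console.log(summarizeWitness(sample))
```

Because `writeTrajectory()` writes the whole file in a single `writeFileSync` call at the end of the turn, a reader like this sees either the complete stream or nothing.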
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "freddie",
|
|
3
|
-
"version": "0.0.
|
|
3
|
+
"version": "0.0.90",
|
|
4
4
|
"type": "module",
|
|
5
5
|
"description": "Open JS agent harness built on pi-mono, floosie, xstate, and anentrypoint-design",
|
|
6
6
|
"bin": {
|
|
@@ -27,7 +27,7 @@
|
|
|
27
27
|
"xstate": "^5.31.0",
|
|
28
28
|
"zod": "^4.0.0",
|
|
29
29
|
"anentrypoint-design": "^0.0.94",
|
|
30
|
-
"acptoapi": "^1.0.
|
|
30
|
+
"acptoapi": "^1.0.55"
|
|
31
31
|
},
|
|
32
32
|
"optionalDependencies": {
|
|
33
33
|
"@libsql/darwin-arm64": "0.3.19",
|
|
@@ -41,6 +41,9 @@
|
|
|
41
41
|
"engines": {
|
|
42
42
|
"node": ">=20.6.0"
|
|
43
43
|
},
|
|
44
|
+
"overrides": {
|
|
45
|
+
"fast-xml-builder": "^1.1.7"
|
|
46
|
+
},
|
|
44
47
|
"files": [
|
|
45
48
|
"bin/",
|
|
46
49
|
"src/",
|
|
package/plugins/core-cli/plugin.js
CHANGED
|
@@ -15,7 +15,14 @@ export default {
|
|
|
15
15
|
if (action === 'get' && name) { console.log(JSON.stringify(host.pi.tools.get(name)?.schema, null, 2)); return }
|
|
16
16
|
for (const t of host.pi.tools.list()) console.log(`${(t.toolset || 'core').padEnd(10)} ${t.name}\t${(t.schema?.description || '').slice(0, 60)}`)
|
|
17
17
|
} })
|
|
18
|
-
C({ name: 'skills', description: 'List skills', args: [{ name: 'action', default: 'list' }
|
|
18
|
+
C({ name: 'skills', description: 'List/show skills (filesystem + registered via pi.skills)', args: [{ name: 'action', default: 'list' }, { name: 'name' }], action: (action, name) => {
|
|
19
|
+
const fsSkills = listSkills().map(s => ({ name: s.name, description: s.description || '', source: 'fs', body: s.body, file: s.file }))
|
|
20
|
+
const piSkills = host.pi.skills.list().map(s => ({ name: s.name, description: s.description || '', source: s.source || 'pi', body: s.content || s.body || '', file: s.file }))
|
|
21
|
+
const seen = new Set(); const all = []
|
|
22
|
+
for (const s of [...piSkills, ...fsSkills]) { if (seen.has(s.name)) continue; seen.add(s.name); all.push(s) }
|
|
23
|
+
if (action === 'show' && name) { const s = all.find(x => x.name === name); if (!s) { console.error('skill not found:', name); process.exit(1) } console.log(s.body); return }
|
|
24
|
+
for (const s of all) console.log(`${(s.source || 'fs').padEnd(8)} ${s.name}\t${s.description.slice(0, 80)}`)
|
|
25
|
+
} })
|
|
19
26
|
C({ name: 'profile', description: 'Manage profiles', args: [{ name: 'action', default: 'list' }, { name: 'name' }], action: (action, name) => {
|
|
20
27
|
if (action === 'list') { for (const p of listAllProfiles()) console.log(p); return }
|
|
21
28
|
if (action === 'create') { createProfile(name); console.log('created:', name); return }
|
|
@@ -49,12 +56,12 @@ export default {
|
|
|
49
56
|
try { ({ callLLM } = await import('../../src/agent/pi-bridge.js')) } catch {}
|
|
50
57
|
await interactive({ callLLM })
|
|
51
58
|
} })
|
|
52
|
-
C({ name: 'exec', description: 'Run a single prompt through the agent and exit', options: [{ flag: '--prompt <prompt>', required: true }, { flag: '--model <model>', default: '' }, { flag: '--provider <provider>', default: '' }, { flag: '--skill <skill>', default: '' }, { flag: '--cwd <cwd>', default: '' }, { flag: '--timeout <ms>', default: '60000' }], action: async (opts) => {
|
|
59
|
+
C({ name: 'exec', description: 'Run a single prompt through the agent and exit', options: [{ flag: '--prompt <prompt>', required: true }, { flag: '--model <model>', default: '' }, { flag: '--provider <provider>', default: '' }, { flag: '--skill <skill>', default: '' }, { flag: '--cwd <cwd>', default: '' }, { flag: '--timeout <ms>', default: '60000' }, { flag: '--witness <path>', default: '' }], action: async (opts) => {
|
|
53
60
|
const { runTurn } = await import('../../src/agent/machine.js')
|
|
54
61
|
let provider = opts.provider || undefined
|
|
55
62
|
let model = opts.model || undefined
|
|
56
63
|
if (!provider && model && /^[a-z][a-z0-9-]*\//.test(model)) { provider = model.split('/')[0]; model = model.slice(provider.length + 1) }
|
|
57
|
-
const out = await runTurn({ prompt: opts.prompt, provider, model, skill: opts.skill || undefined, cwd: opts.cwd || undefined, timeoutMs: Number(opts.timeout) })
|
|
64
|
+
const out = await runTurn({ prompt: opts.prompt, provider, model, skill: opts.skill || undefined, cwd: opts.cwd || undefined, timeoutMs: Number(opts.timeout), witnessPath: opts.witness || undefined })
|
|
58
65
|
if (out.error) { console.error('error:', out.error); process.exit(1) }
|
|
59
66
|
console.log(out.result || out.messages?.at(-1)?.content || '')
|
|
60
67
|
process.exit(0)
|
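The reworked `skills` action merges two lists and keeps the first entry per name, so a skill registered through `pi.skills` shadows a filesystem skill with the same name. A minimal standalone sketch of that precedence rule; `mergeSkills` and the sample entries are hypothetical.

```javascript
// Hypothetical standalone version of the first-wins merge in the `skills` command.
function mergeSkills(piSkills, fsSkills) {
  const seen = new Set()
  const all = []
  // Registered (pi) skills are iterated first, so they win name collisions.
  for (const s of [...piSkills, ...fsSkills]) {
    if (seen.has(s.name)) continue
    seen.add(s.name)
    all.push(s)
  }
  return all
}

const merged = mergeSkills(
  [{ name: 'gm:planning', source: 'gm-cc' }],
  [{ name: 'gm:planning', source: 'fs' }, { name: 'notes', source: 'fs' }]
)
// The gm-cc entry is kept; the fs duplicate of gm:planning is dropped.
console.log(merged.map(s => `${s.source} ${s.name}`))
```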
|
package/src/agent/llm_resolver.js
CHANGED
|
@@ -12,7 +12,7 @@ export const PROVIDER_KEYS = sdk.PROVIDER_KEYS
|
|
|
12
12
|
export const DEFAULTS = sdk.PROVIDER_DEFAULTS
|
|
13
13
|
|
|
14
14
|
const ACP_BACKENDS = {
|
|
15
|
-
kilo: { base: 'http://localhost:4780', providerID: 'kilo', defaultModel: '
|
|
15
|
+
kilo: { base: 'http://localhost:4780', providerID: 'kilo', defaultModel: 'openrouter/free' },
|
|
16
16
|
opencode: { base: 'http://localhost:4790', providerID: 'opencode', defaultModel: 'minimax-m2.5-free' },
|
|
17
17
|
}
|
|
18
18
|
|
|
@@ -36,13 +36,12 @@ async function acpChat(prefix, model, input) {
|
|
|
36
36
|
if (!sessRes.ok) throw new Error(`ACP ${prefix} /session ${sessRes.status}`)
|
|
37
37
|
const sessionId = (await sessRes.json()).id
|
|
38
38
|
const userMsg = input.messages.filter(m => m.role === 'user').slice(-1)[0]?.content || ''
|
|
39
|
-
const body = { parts: [{ type: 'text', text: String(userMsg) }] }
|
|
40
|
-
if (b.providerID === 'opencode') body.model = { providerID: 'opencode', modelID: model || b.defaultModel }
|
|
41
|
-
else { body.providerID = 'kilo'; body.modelID = model || b.defaultModel }
|
|
42
|
-
await fetch(`${b.base}/session/${sessionId}/message`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify(body), signal: AbortSignal.timeout(120000) })
|
|
43
|
-
let content = ''
|
|
39
|
+
const body = { parts: [{ type: 'text', text: String(userMsg) }], model: { providerID: b.providerID, modelID: model || b.defaultModel } }
|
|
44
40
|
const evRes = await fetch(`${b.base}/event`, { method: 'GET', signal: AbortSignal.timeout(120000) })
|
|
45
41
|
if (!evRes.ok) throw new Error(`ACP ${prefix} /event ${evRes.status}`)
|
|
42
|
+
const msgRes = await fetch(`${b.base}/session/${sessionId}/message`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify(body), signal: AbortSignal.timeout(120000) })
|
|
43
|
+
if (!msgRes.ok) throw new Error(`ACP ${prefix} /message ${msgRes.status}: ${(await msgRes.text()).slice(0,200)}`)
|
|
44
|
+
let content = ''; let sawAssistantText = false
|
|
46
45
|
const reader = evRes.body.getReader(); const dec = new TextDecoder(); let buf = ''
|
|
47
46
|
while (true) {
|
|
48
47
|
const { value, done } = await reader.read(); if (done) break
|
|
@@ -52,9 +51,10 @@ async function acpChat(prefix, model, input) {
|
|
|
52
51
|
if (!raw.startsWith('data: ')) continue
|
|
53
52
|
try { const ev = JSON.parse(raw.slice(6))
|
|
54
53
|
if (ev.properties?.sessionID && ev.properties.sessionID !== sessionId) continue
|
|
55
|
-
if (ev.properties?.part?.type === 'text' && ev.properties.part.text) content
|
|
56
|
-
if (ev.
|
|
57
|
-
|
|
54
|
+
if (ev.type === 'message.part.updated' && ev.properties?.part?.type === 'text' && ev.properties.part.text) { content = ev.properties.part.text; sawAssistantText = true }
|
|
55
|
+
if (ev.type === 'session.error') throw new Error(`ACP ${prefix} session.error: ${JSON.stringify(ev.properties?.error || {}).slice(0,200)}`)
|
|
56
|
+
if (ev.type === 'session.idle') return { content: content.trim(), tool_calls: [], raw: { provider: prefix, model } }
|
|
57
|
+
} catch (e) { if (/session.error/.test(e.message)) throw e }
|
|
58
58
|
}
|
|
59
59
|
}
|
|
60
60
|
return { content: content.trim(), tool_calls: [], raw: { provider: prefix, model } }
|
|
@@ -115,7 +115,9 @@ function tryParseJson(s) { try { return typeof s === 'string' ? JSON.parse(s) :
|
|
|
115
115
|
async function hasKey(provider) {
|
|
116
116
|
if (provider === 'claude-cli' || provider === 'kilo' || provider === 'opencode') return true
|
|
117
117
|
const resolved = await resolveKey(provider).catch(() => ({ value: null }))
|
|
118
|
-
|
|
118
|
+
if (!resolved.value) return false
|
|
119
|
+
if (provider === 'cloudflare' && !process.env.CLOUDFLARE_ACCOUNT_ID) return false
|
|
120
|
+
return true
|
|
119
121
|
}
|
|
120
122
|
|
|
121
123
|
function defaultModel(provider) {
|
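The event handling in `acpChat()` can be exercised without a running daemon by feeding it pre-split SSE lines. A minimal sketch, assuming the event shapes visible in the hunk above (`message.part.updated`, `session.error`, `session.idle`); `assembleAcpContent` and `frame` are hypothetical helpers and do no network I/O.

```javascript
// Hypothetical helper: fold pre-split SSE lines into the final assistant text.
// Event names and property paths follow the acpChat() hunk; nothing here touches the network.
function assembleAcpContent(lines, sessionId) {
  let content = ''
  for (const raw of lines) {
    if (!raw.startsWith('data: ')) continue
    let ev
    try { ev = JSON.parse(raw.slice(6)) } catch { continue } // skip malformed frames
    // Ignore events that belong to another session on the shared /event stream.
    if (ev.properties?.sessionID && ev.properties.sessionID !== sessionId) continue
    if (ev.type === 'message.part.updated' && ev.properties?.part?.type === 'text' && ev.properties.part.text) {
      content = ev.properties.part.text // assigned, not appended, as in the diff
    }
    if (ev.type === 'session.error') {
      throw new Error(`session.error: ${JSON.stringify(ev.properties?.error || {}).slice(0, 200)}`)
    }
    if (ev.type === 'session.idle') return { content: content.trim(), done: true }
  }
  return { content: content.trim(), done: false } // stream ended without session.idle
}

const frame = (ev) => 'data: ' + JSON.stringify(ev)
const result = assembleAcpContent([
  frame({ type: 'message.part.updated', properties: { sessionID: 's1', part: { type: 'text', text: 'Hel' } } }),
  frame({ type: 'message.part.updated', properties: { sessionID: 's2', part: { type: 'text', text: 'other session' } } }),
  frame({ type: 'message.part.updated', properties: { sessionID: 's1', part: { type: 'text', text: 'Hello ' } } }),
  frame({ type: 'session.idle', properties: { sessionID: 's1' } }),
], 's1')
console.log(result)
```

Parsing each frame in its own `try` and throwing `session.error` outside it is one way to avoid re-matching the error message in a `catch`, as the diffed code does. This is a design alternative, not a description of the shipped behavior.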
package/src/agent/machine.js
CHANGED
|
@@ -6,8 +6,19 @@ import { resolveCallLLM } from './llm_resolver.js'
|
|
|
6
6
|
|
|
7
7
|
const log = logger('agent')
|
|
8
8
|
|
|
9
|
-
export function createAgentMachine({ provider, model, maxIterations = 90, callLLM, enabledToolsets = ['core'], disabledToolsets = [] } = {}) {
|
|
10
|
-
const
|
|
9
|
+
export function createAgentMachine({ provider, model, maxIterations = 90, callLLM, enabledToolsets = ['core'], disabledToolsets = [], events } = {}) {
|
|
10
|
+
const baseLLM = callLLM || resolveCallLLM({ provider, model })
|
|
11
|
+
const llm = events ? async (input) => {
|
|
12
|
+
const t0 = Date.now()
|
|
13
|
+
try {
|
|
14
|
+
const out = await baseLLM(input)
|
|
15
|
+
events.push({ type: 'llm_call', ok: true, durationMs: Date.now() - t0, provider: out?.raw?.provider || provider, model: out?.raw?.model || model, content_length: (out?.content || '').length, tool_calls_count: (out?.tool_calls || []).length, ts: new Date().toISOString() })
|
|
16
|
+
return out
|
|
17
|
+
} catch (e) {
|
|
18
|
+
events.push({ type: 'llm_call', ok: false, durationMs: Date.now() - t0, provider, model, error: String(e?.message || e), stack: e?.stack || null, ts: new Date().toISOString() })
|
|
19
|
+
throw e
|
|
20
|
+
}
|
|
21
|
+
} : baseLLM
|
|
11
22
|
return createMachine({
|
|
12
23
|
id: 'freddie-agent',
|
|
13
24
|
initial: 'idle',
|
|
@@ -85,10 +96,10 @@ export function createAgentMachine({ provider, model, maxIterations = 90, callLL
|
|
|
85
96
|
})
|
|
86
97
|
}
|
|
87
98
|
|
|
88
|
-
async function writeTrajectory(out, { prompt, provider, model, skill, cwd }) {
|
|
99
|
+
async function writeTrajectory(out, { prompt, provider, model, skill, cwd, events = [], errorStack = null, witnessPath = null }) {
|
|
89
100
|
try {
|
|
90
101
|
const { getConfigValue } = await import('../config.js')
|
|
91
|
-
if (!getConfigValue('agent.save_trajectories', false)) return
|
|
102
|
+
if (!getConfigValue('agent.save_trajectories', false) && !witnessPath) return
|
|
92
103
|
const { getFreddieHome } = await import('../home.js')
|
|
93
104
|
const fs = await import('node:fs')
|
|
94
105
|
const path = await import('node:path')
|
|
@@ -96,25 +107,44 @@ async function writeTrajectory(out, { prompt, provider, model, skill, cwd }) {
|
|
|
96
107
|
fs.mkdirSync(dir, { recursive: true })
|
|
97
108
|
const states = []
|
|
98
109
|
const toolCalls = []
|
|
110
|
+
const toolResults = []
|
|
111
|
+
let compressorInvocations = 0
|
|
99
112
|
for (const m of out.messages || []) {
|
|
100
|
-
if (m.role === 'assistant' && m.tool_calls?.length) { states.push('EXECUTE'); for (const tc of m.tool_calls) toolCalls.push({ name: tc.name || tc.function?.name, arguments: tc.arguments || tc.function?.arguments || {} }) }
|
|
113
|
+
if (m.role === 'assistant' && m.tool_calls?.length) { states.push('EXECUTE'); for (const tc of m.tool_calls) toolCalls.push({ name: tc.name || tc.function?.name, arguments: tc.arguments || tc.function?.arguments || {}, id: tc.id }) }
|
|
101
114
|
else if (m.role === 'user') states.push('PLAN')
|
|
102
115
|
else if (m.role === 'assistant') states.push('COMPLETE')
|
|
103
|
-
else if (m.role === 'tool') states.push('VERIFY')
|
|
116
|
+
else if (m.role === 'tool') { states.push('VERIFY'); toolResults.push({ tool_call_id: m.tool_call_id, content: typeof m.content === 'string' ? m.content : JSON.stringify(m.content) }) }
|
|
117
|
+
if (m.role === 'system' && typeof m.content === 'string' && /\[trajectory\.compressed\]/.test(m.content)) compressorInvocations += 1
|
|
104
118
|
}
|
|
105
119
|
const ts = new Date().toISOString().replace(/[:.]/g, '-').replace(/Z$/, '')
|
|
106
120
|
const slug = (prompt || 'turn').slice(0, 40).replace(/[^a-zA-Z0-9-]+/g, '-').replace(/^-+|-+$/g, '').toLowerCase()
|
|
121
|
+
const llmCalls = events.filter(e => e.type === 'llm_call')
|
|
122
|
+
const streamChunks = events.filter(e => e.type === 'llm_chunk')
|
|
123
|
+
const payload = {
|
|
124
|
+
schema_version: 2, ts, prompt, provider, model, skill, cwd,
|
|
125
|
+
iterations: out.iterations, result: out.result, error: out.error, error_stack: errorStack,
|
|
126
|
+
state_transitions: states, tool_calls: toolCalls, tool_results: toolResults,
|
|
127
|
+
llm_calls: llmCalls, llm_chunks_count: streamChunks.length,
|
|
128
|
+
compressor_invocations: compressorInvocations,
|
|
129
|
+
events, messages: out.messages,
|
|
130
|
+
}
|
|
107
131
|
const file = path.join(dir, `${ts}-${slug}.json`)
|
|
108
|
-
fs.writeFileSync(file, JSON.stringify(
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
|
|
132
|
+
fs.writeFileSync(file, JSON.stringify(payload, null, 2))
|
|
133
|
+
if (witnessPath) {
|
|
134
|
+
const jsonl = [
|
|
135
|
+
JSON.stringify({ event: 'session_start', ts, prompt, provider, model, skill, cwd }),
|
|
136
|
+
...(out.messages || []).map((m, i) => JSON.stringify({ event: 'message', index: i, role: m.role, content: m.content, tool_calls: m.tool_calls || null, tool_call_id: m.tool_call_id || null })),
|
|
137
|
+
...llmCalls.map(e => JSON.stringify({ event: 'llm_call', ...e })),
|
|
138
|
+
JSON.stringify({ event: 'session_end', iterations: out.iterations, error: out.error, error_stack: errorStack, compressor_invocations: compressorInvocations }),
|
|
139
|
+
].join('\n')
|
|
140
|
+
fs.mkdirSync(path.dirname(witnessPath), { recursive: true })
|
|
141
|
+
fs.writeFileSync(witnessPath, jsonl)
|
|
142
|
+
}
|
|
114
143
|
} catch (_) {}
|
|
115
144
|
}
|
|
116
145
|
|
|
117
|
-
export async function runTurn({ prompt, messages = [], model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations = 90, timeoutMs = 30000, cwd, skill } = {}) {
|
|
146
|
+
export async function runTurn({ prompt, messages = [], model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations = 90, timeoutMs = 30000, cwd, skill, witnessPath } = {}) {
|
|
147
|
+
const events = []
|
|
118
148
|
const initMessages = [...messages]
|
|
119
149
|
const systemParts = []
|
|
120
150
|
if (cwd) systemParts.push(`Working directory: ${cwd}. Always pass cwd="${cwd}" to bash tool calls. When reading or writing files use paths relative to this directory or absolute paths under it.`)
|
|
@@ -124,7 +154,7 @@ export async function runTurn({ prompt, messages = [], model, provider, callLLM,
|
|
|
124
154
|
if (skillDef?.content) systemParts.push('Skill context:\n' + skillDef.content)
|
|
125
155
|
}
|
|
126
156
|
if (systemParts.length > 0) initMessages.unshift({ role: 'user', content: systemParts.join('\n\n') })
|
|
127
|
-
const machine = createAgentMachine({ model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations })
|
|
157
|
+
const machine = createAgentMachine({ model, provider, callLLM, enabledToolsets, disabledToolsets, maxIterations, events })
|
|
128
158
|
const actor = createActor(machine, { input: { messages: initMessages } })
|
|
129
159
|
actor.start()
|
|
130
160
|
actor.send({ type: 'SUBMIT', prompt })
|
|
@@ -133,7 +163,8 @@ export async function runTurn({ prompt, messages = [], model, provider, callLLM,
|
|
|
133
163
|
actor.subscribe(snap => {
|
|
134
164
|
if (snap.status === 'done') {
|
|
135
165
|
clearTimeout(t)
|
|
136
|
-
|
|
166
|
+
const errorStack = snap.output?.error ? (events.find(e => e.type === 'llm_call' && !e.ok)?.stack || null) : null
|
|
167
|
+
writeTrajectory(snap.output, { prompt, provider, model, skill, cwd, events, errorStack, witnessPath }).finally(() => resolve(snap.output))
|
|
137
168
|
}
|
|
138
169
|
})
|
|
139
170
|
})
|