@agent-controller/runtime-opencode 0.3.1
- package/LICENSE +21 -0
- package/README.md +103 -0
- package/dist/event-translator.d.ts +77 -0
- package/dist/event-translator.js +322 -0
- package/dist/honesty.d.ts +59 -0
- package/dist/honesty.js +226 -0
- package/dist/index.d.ts +1 -0
- package/dist/index.js +704 -0
- package/dist/opencode-config.d.ts +165 -0
- package/dist/opencode-config.js +517 -0
- package/dist/types.d.ts +112 -0
- package/dist/types.js +2 -0
- package/dist/wire.d.ts +5 -0
- package/dist/wire.js +8 -0
- package/package.json +50 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 CCDevelopForFun

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,103 @@
# `@agent-controller/runtime-opencode` — opencode runtime adapter

Node/TypeScript adapter that runs an [agent-controller](https://github.com/CCDevelopForFun/agent-controller) `CompiledSpec` against the [sst/opencode](https://opencode.ai) CLI via the official `@opencode-ai/sdk`, and emits the NDJSON wire-event stream that `agentctl` consumes. Sibling to the Pi adapter ([`@agent-controller/runtime`](https://www.npmjs.com/package/@agent-controller/runtime)); both consume the same `CompiledSpec` and emit the same wire-event format.

Requires **Node 22+**.

## Install

```bash
npm install -g @agent-controller/runtime-opencode
# or per-project
npm install --save-dev @agent-controller/runtime-opencode
```

The `opencode` CLI must also be on `PATH`. Install with `npm install -g opencode-ai` or follow https://opencode.ai/docs/.

## Use with `agentctl`

`agentctl` (the Go CLI) spawns this package as a subprocess when `spec.runtime.type` is `local-opencode`. The example specs under [`examples/`](https://github.com/CCDevelopForFun/agent-controller/tree/main/examples) reference cwd-relative registry entries that ship with the source repo but not with this npm package. For an npm-only install, use a self-contained spec:

```bash
# 1) Write a self-contained spec (no tools[]/extensions[]/skills[]/subagents[])
cat > /tmp/hello-opencode.yaml <<'EOF'
apiVersion: agent-controller.dev/v1alpha1
kind: Agent
metadata: { name: hello-opencode }
spec:
  model: { provider: anthropic, name: claude-sonnet-4-20250514 }
  persona: { role: Helpful demo, instructions: Answer concisely. }
  task: Say hello.
  tools: []
  runtime: { type: local-opencode }
EOF

# 2) Point agentctl at the adapter and run
export ANTHROPIC_API_KEY=sk-ant-...

# global install
AGENT_CONTROLLER_RUNTIME="$(npm root -g)/@agent-controller/runtime-opencode/dist/index.js" \
  agentctl run /tmp/hello-opencode.yaml

# or per-project
AGENT_CONTROLLER_RUNTIME="./node_modules/@agent-controller/runtime-opencode/dist/index.js" \
  agentctl run /tmp/hello-opencode.yaml
```

Credentials via environment: `ANTHROPIC_API_KEY` (and / or `ANTHROPIC_BASE_URL`). Auth from `~/.opencode/auth.json` is intentionally NOT used — the adapter isolates `HOME` to prevent ambient config leakage.

`agentctl` (matching version) is available from the [GitHub Releases page](https://github.com/CCDevelopForFun/agent-controller/releases) as cross-platform binaries.

## Architecture role

This package is the **subprocess** spawned by `agentctl run` when `spec.runtime.type` is `local-opencode`. It:

1. Reads a `CompiledSpec` JSON document from stdin
2. Rejects unsupported ADL surface (`spec.sessionId`) at startup with a clear error (other opencode-incompatible shapes — `spec.extensions[]`, `spec.installs[]`, custom Pi-extension tools, built-ins with `config` — are rejected at `agentctl compile` since v0.3.0)
3. Builds an opencode-native config (`cfg.agent[primary]`, `cfg.agent[subagent_N]`, `cfg.mcp[server_N]`) from the spec via `buildOpencodeConfig`
4. Spawns opencode via `@opencode-ai/sdk`'s `createOpencode()` with an isolated `HOME` / `XDG_*` / cwd so ambient user config can't leak undeclared capabilities
5. Submits the task via `session.promptAsync()` (non-blocking)
6. Consumes the SSE event stream via a producer-consumer queue
7. Translates opencode events into wire-protocol events (tool.call/result, message, session.ended/error/warning, hallucination warnings)
8. Disables opencode's native agents (`plan`, `build`, `general`, `explore`, `scout`) so the `task` tool can't bypass the ADL allowlist
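
To make the output side of steps 6 and 7 concrete, here is a minimal sketch of how translated events could be serialized as NDJSON (one JSON document per line). The call shape `stamp(sessionId, type, payload)` matches how `event-translator.js` calls it, but the envelope field names (`sessionId`, `type`, `timestamp`, `payload`) and the helper names `stampSketch` / `toNdjson` are assumptions for illustration, not the package's actual `wire.js`.

```typescript
// Hypothetical envelope: the real WireEvent type is declared in the package's
// types.d.ts, which is not reproduced here.
interface SketchWireEvent {
  sessionId: string;
  type: string;
  timestamp: string;
  payload: Record<string, unknown>;
}

// Same argument order as the stamp() calls in event-translator.js;
// the returned field names are assumed.
function stampSketch(
  sessionId: string,
  type: string,
  payload: Record<string, unknown>,
  now: Date = new Date(),
): SketchWireEvent {
  return { sessionId, type, timestamp: now.toISOString(), payload };
}

// NDJSON: each event becomes one JSON document terminated by a newline.
function toNdjson(events: SketchWireEvent[]): string {
  return events.map((e) => JSON.stringify(e) + "\n").join("");
}

const sampleLines = toNdjson([
  stampSketch("s1", "tool.call", { toolName: "read", callId: "c1", args: { path: "README.md" } }),
  stampSketch("s1", "tool.result", { callId: "c1", isError: false, content: "..." }),
]);
```

A consumer such as `agentctl` can then split the stream on newlines and parse each line independently, which is what makes the format convenient for streaming.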

## ADL allowlist preservation

Several layers of defense keep ADL's "only declared capabilities are reachable" contract intact:

1. Opencode permissions start from a deny baseline (`"*": "deny"` + every known opencode tool explicitly denied), then declared tools become `allow`
2. Native opencode agents are disabled so the `task` tool can't delegate to undeclared agents
3. MCP server names are validated against `[A-Za-z0-9._-]+` (no glob metacharacters), against opencode built-in prefixes (`repo`, `doom`, `external`), and for sanitization collisions
4. Opencode-incompatible spec shapes are rejected at `agentctl compile` (v0.3.0)
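
Layers 1 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `opencode-config.js`: the tool list is a small assumed subset, the sanitizer (replace every non-alphanumeric character with `_`) and the prefix test (`startsWith`) are guesses at what "sanitization collisions" and "built-in prefixes" mean, and all function names are hypothetical.

```typescript
type Permission = "allow" | "deny";

// Assumed subset of tool names, for illustration only.
const KNOWN_TOOLS_SKETCH = ["bash", "edit", "read", "write", "webfetch", "task"];

// Layer 1: deny everything first, then flip only declared tools to allow.
function buildPermissionsSketch(declared: string[]): Record<string, Permission> {
  const perms: Record<string, Permission> = { "*": "deny" };
  for (const tool of KNOWN_TOOLS_SKETCH) perms[tool] = "deny";
  for (const tool of declared) perms[tool] = "allow";
  return perms;
}

const MCP_NAME_PATTERN = /^[A-Za-z0-9._-]+$/;
const RESERVED_PREFIXES = ["repo", "doom", "external"];

// Assumed sanitizer: two names collide if they sanitize to the same key.
function sanitizeSketch(name: string): string {
  return name.replace(/[^A-Za-z0-9]/g, "_");
}

// Layer 3: returns one human-readable error per rejected server name.
function validateMcpNamesSketch(names: string[]): string[] {
  const errors: string[] = [];
  const seen = new Map<string, string>();
  for (const name of names) {
    if (!MCP_NAME_PATTERN.test(name)) {
      errors.push(`invalid characters: ${name}`);
      continue;
    }
    if (RESERVED_PREFIXES.some((p) => name.startsWith(p))) {
      errors.push(`reserved prefix: ${name}`);
      continue;
    }
    const key = sanitizeSketch(name);
    const prior = seen.get(key);
    if (prior !== undefined) errors.push(`sanitization collision: ${name} vs ${prior}`);
    else seen.set(key, name);
  }
  return errors;
}
```

Starting from a deny baseline means a tool that is missing from the known-tools list is still covered by the `"*"` rule, so the allowlist fails closed.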

## Capability differences vs the Pi adapter

See [`docs/architecture/harness-matrix.md`](https://github.com/CCDevelopForFun/agent-controller/blob/main/docs/architecture/harness-matrix.md) for the per-feature table. Highlights:

| Feature | Pi adapter | opencode adapter |
|---|---|---|
| Custom Pi-format tools / extensions | ✅ | ❌ rejected at `agentctl compile` |
| Session resume (`--resume`) | ✅ | ❌ rejected by `agentctl run` |
| `task` tool / native subagent delegation | via vendored ext | ✅ native |
| `cancelled` reason on SIGINT | ❌ (surfaces as `error`) | ✅ |
| `bash` allowlist via tool `config` | ❌ rejected at `agentctl compile` (use [`@gotgenes/pi-permission-system`](https://www.npmjs.com/package/@gotgenes/pi-permission-system)) | ❌ rejected at `agentctl compile` |

## Building from source

```bash
git clone https://github.com/CCDevelopForFun/agent-controller.git
cd agent-controller/runtime-opencode
npm install --ignore-scripts
npm run build   # tsc → dist/
npm test        # vitest
```

## Cross-references

- [agent-controller README](https://github.com/CCDevelopForFun/agent-controller#readme) — project overview + quickstart
- [Dual-adapter architecture overview](https://github.com/CCDevelopForFun/agent-controller/blob/main/docs/architecture/overview.md)
- [Harness capability matrix](https://github.com/CCDevelopForFun/agent-controller/blob/main/docs/architecture/harness-matrix.md)

## License

MIT — see [LICENSE](https://github.com/CCDevelopForFun/agent-controller/blob/main/LICENSE).
package/dist/event-translator.d.ts
ADDED
@@ -0,0 +1,77 @@
import type { WireEvent } from "./types.js";
import type { GlobalEvent } from "@opencode-ai/sdk";
export type HallucinationMode = "block" | "warn" | "correct";
/**
 * Per-session state shared across translateEvent calls. The caller owns
 * one instance per session and passes it on every call. Keeps the
 * function pure from the caller's perspective while allowing stateful
 * text accumulation.
 */
export interface TranslatorState {
    /**
     * Tracks the last seen `part.text` snapshot for each part ID. When
     * `properties.delta` is absent (optional in the SDK), we derive the
     * incremental text by diffing against the previous snapshot.
     * Codex pass 16 of slice 2.4 caught that `delta ?? ""` dropped all
     * output for full-part update events that opencode sends without delta.
     */
    partTextSnapshots: Map<string, string>;
    /** Accumulated assistant text parts for the current turn. */
    textBuffer: string;
    /** Message ID we're currently accumulating text for. */
    currentMessageId: string | undefined;
    /**
     * Role of the current message being accumulated. TextParts arrive via
     * message.part.updated and carry no role field; we learn the role from
     * message.updated events and track it here so we only buffer assistant
     * text. Codex pass 5 of slice 2.4 caught that user prompt/correction
     * text was being buffered as assistant output.
     */
    currentMessageRole: "user" | "assistant" | undefined;
    /**
     * Map from messageID to role, populated from message.updated events.
     * Used to look up role when a text part arrives before its message.updated.
     */
    messageRoles: Map<string, "user" | "assistant">;
    /**
     * Pre-role text buffer: keyed by messageID, accumulates deltas that
     * arrived before the corresponding message.updated role event. When the
     * role event arrives, we flush pre-role deltas into textBuffer (if
     * assistant) or discard them (if user). Codex pass 10 of slice 2.4
     * caught that "skip unknown-role text" permanently dropped early deltas
     * that the stream would not replay.
     */
    preRoleBuffer: Map<string, string>;
    /**
     * Set of tool callIDs for which we have already emitted a `tool.call`
     * wire event. Guards against emitting duplicate tool.call events when
     * opencode sends multiple `message.part.updated` events while the same
     * tool part stays in "running" status (streamed input JSON, progress
     * metadata, etc.). Codex pass 13 of slice 2.4 caught this.
     */
    emittedToolCalls: Set<string>;
}
export declare function createTranslatorState(): TranslatorState;
/**
 * Result of translating a single opencode GlobalEvent into zero or more
 * wire events. `sessionIdle` signals the caller that the model has
 * finished its turn and no more events are coming for this prompt.
 * `sessionError` signals a terminal failure the caller should surface.
 */
export interface TranslationResult {
    wireEvents: WireEvent[];
    sessionIdle: boolean;
    sessionError: string | undefined;
}
/**
 * Translate one opencode GlobalEvent to wire events.
 *
 * @param gev               The raw GlobalEvent from the SSE stream.
 * @param sessionId         The stable session ID the caller selected.
 * @param targetSession     The opencode session ID (from session.create) — we
 *                          ignore events for other sessions that might arrive
 *                          on the same SSE stream.
 * @param state             Mutable translator state (text accumulation).
 * @param hallucinationMode Block/warn/correct guardrail, from spec.guardrails.
 */
export declare function translateEvent(gev: GlobalEvent, sessionId: string, targetSession: string, state: TranslatorState, hallucinationMode?: HallucinationMode): TranslationResult;
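
The snapshot diffing described in the `partTextSnapshots` comment can be illustrated in isolation. The sketch below follows the rule used in `event-translator.js` (prefer the event's `delta`; otherwise take the suffix beyond the previous snapshot, or the whole text if the part was rewritten). The function name `deriveDeltaSketch` and its standalone signature are hypothetical; in the package the logic sits inline in `translateEvent`.

```typescript
// Returns the incremental text for one text-part update and records the new
// snapshot so the next no-delta update diffs against the right baseline.
function deriveDeltaSketch(
  snapshots: Map<string, string>,
  partId: string,
  currText: string,
  eventDelta?: string,
): string {
  const prevText = snapshots.get(partId) ?? "";
  // Always update the snapshot, even when the event carried its own delta.
  snapshots.set(partId, currText);
  if (eventDelta !== undefined) return eventDelta;
  // No delta on the event: take the new suffix, or the full text when the
  // current text does not extend the previous snapshot.
  return currText.startsWith(prevText) ? currText.slice(prevText.length) : currText;
}

const snapshotsDemo = new Map<string, string>();
const deltaFirst = deriveDeltaSketch(snapshotsDemo, "p1", "Hel");
const deltaSecond = deriveDeltaSketch(snapshotsDemo, "p1", "Hello");
const deltaExplicit = deriveDeltaSketch(snapshotsDemo, "p1", "Hello!", "!");
const deltaRewrite = deriveDeltaSketch(snapshotsDemo, "p1", "Bye");
```

Updating the snapshot unconditionally is the important detail: if it were updated only on no-delta events, a stream that mixes both kinds of update would diff against a stale baseline and emit duplicated text.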
package/dist/event-translator.js
ADDED
@@ -0,0 +1,322 @@
/**
 * Translate opencode's SSE event stream into our wire-protocol events.
 *
 * Phase 2 slice 2.4 — this is the core mapping layer between the opencode
 * SDK's event model and the NDJSON wire protocol that `agentctl run` reads.
 *
 * Wire protocol reference: runtime/src/types.ts (WireEventType union) and
 * cli/internal/wire/events.go. Changes here must stay in sync with both.
 *
 * opencode's data model: text content arrives via `message.part.updated`
 * events (TextPart streaming), not from the Message object itself
 * (AssistantMessage.summary is a boolean flag, not text). We accumulate
 * text in a per-session buffer and emit the full assembled text as a wire
 * `message` event when `session.idle` signals the turn is complete. This
 * mirrors the Pi adapter's behaviour (emit whole-message events, not
 * per-token deltas).
 *
 * Events translated:
 *   message.part.updated (TextPart)            → accumulated in textBuffer; flushed
 *                                                as wire `message` on session.idle
 *   message.part.updated (ToolPart, running)   → wire `tool.call`
 *   message.part.updated (ToolPart, completed) → wire `tool.result`
 *   message.part.updated (ToolPart, error)     → wire `tool.result` (isError=true)
 *   message.updated (user, with summary)       → wire `message` (role=user)
 *   session.idle                               → flushes text buffer + signals
 *                                                "turn complete" to caller
 *   session.error                              → signals terminal failure
 *
 * Events intentionally NOT translated:
 *   file.edited, session.compacted, todo.updated, session.diff
 *   pty.*, lsp.*, tui.*, etc. — implementation noise, not ADL surface
 */
import { stamp } from "./wire.js";
import { detectHallucinatedToolCalls, stripHallucinationXml } from "./honesty.js";
/**
 * Flush the current text buffer to wireEvents, running hallucination
 * detection first. Used both at session.idle (normal path) and when a
 * new messageID arrives mid-turn (multi-message turns in tool-using
 * sessions). Modifies state.textBuffer. Codex pass 30 of slice 2.4
 * caught that intermediate flushes bypassed guardrail checks.
 */
function flushBuffer(text, sessionId, hallucinationMode, wireEvents) {
    if (!text)
        return undefined;
    const findings = detectHallucinatedToolCalls(text);
    if (findings.length > 0) {
        if (hallucinationMode === "block") {
            const errMsg = `Assistant message contains fabricated tool-call XML: ${findings.join(", ")}. ` +
                `The model is hallucinating tool invocations instead of using the runtime's tool channel.`;
            wireEvents.push(stamp(sessionId, "error", {
                kind: "hallucinated_tool_call",
                mode: hallucinationMode,
                message: errMsg,
                patterns: findings,
            }));
            return `Assistant message contained fabricated tool-call XML (${findings.join(", ")})`;
        }
        const { text: scrubbed } = stripHallucinationXml(text);
        wireEvents.push(stamp(sessionId, "warning", {
            kind: "hallucinated_tool_call",
            mode: hallucinationMode,
            message: `Fabricated tool-call XML stripped from message: ${findings.join(", ")}.`,
            patterns: findings,
        }));
        wireEvents.push(stamp(sessionId, "message", { text: scrubbed, role: "assistant" }));
    }
    else {
        wireEvents.push(stamp(sessionId, "message", { text, role: "assistant" }));
    }
    return undefined;
}
export function createTranslatorState() {
    return {
        partTextSnapshots: new Map(),
        textBuffer: "",
        currentMessageId: undefined,
        currentMessageRole: undefined,
        messageRoles: new Map(),
        preRoleBuffer: new Map(),
        emittedToolCalls: new Set(),
    };
}
/**
 * Translate one opencode GlobalEvent to wire events.
 *
 * @param gev               The raw GlobalEvent from the SSE stream.
 * @param sessionId         The stable session ID the caller selected.
 * @param targetSession     The opencode session ID (from session.create) — we
 *                          ignore events for other sessions that might arrive
 *                          on the same SSE stream.
 * @param state             Mutable translator state (text accumulation).
 * @param hallucinationMode Block/warn/correct guardrail, from spec.guardrails.
 */
export function translateEvent(gev, sessionId, targetSession, state, hallucinationMode = "block") {
    const ev = gev.payload;
    const wireEvents = [];
    let sessionIdle = false;
    let sessionError;
    switch (ev.type) {
        case "message.updated": {
            const msg = ev.properties.info;
            if (msg.sessionID !== targetSession)
                break;
            const role = msg.role;
            // Track role so text-accumulation in message.part.updated below
            // can decide whether to buffer (assistant only). Codex pass 5 caught
            // that text parts carry no role and user prompt text was being buffered.
            state.messageRoles.set(msg.id, role);
            // Flush any pre-role buffer that accumulated before this event.
            // Codex pass 10 caught that "skip unknown-role text" permanently
            // dropped early deltas — we now stash them and flush here.
            const preRoleText = state.preRoleBuffer.get(msg.id);
            if (preRoleText) {
                state.preRoleBuffer.delete(msg.id);
                if (role === "assistant") {
                    // Retroactively add the pre-role text to the main buffer. If we
                    // were already buffering a different message, flush it first with
                    // guardrail checks — codex pass 30 caught the silent drop.
                    if (state.currentMessageId !== msg.id) {
                        if (state.textBuffer && state.currentMessageRole === "assistant") {
                            const err = flushBuffer(state.textBuffer, sessionId, hallucinationMode, wireEvents);
                            if (err)
                                sessionError ??= err;
                        }
                        state.textBuffer = "";
                        state.currentMessageId = msg.id;
                        state.currentMessageRole = "assistant";
                    }
                    state.textBuffer += preRoleText;
                }
                else if (role === "user") {
                    // Emit user pre-role text as a wire message event immediately.
                    // Codex pass 26 of slice 2.4 caught that discarding it broke
                    // trace parity — the known-role path emits user messages, so
                    // the out-of-order case must too.
                    wireEvents.push(stamp(sessionId, "message", { text: preRoleText, role: "user" }));
                }
            }
            // No wire event emitted from message.updated — text arrives via
            // message.part.updated deltas and is flushed on session.idle.
            break;
        }
        case "message.part.updated": {
            const part = ev.properties.part;
            if (!part || part.sessionID !== targetSession)
                break;
            // Tool parts MUST be checked first. ToolPart updates can arrive with a
            // non-empty `properties.delta` (opencode streams the input JSON as the
            // model generates it). If we checked `isTextOrHasDelta` first, a
            // ToolPart with a delta would be erroneously buffered as assistant text
            // and the tool.call / tool.result wire events would be skipped.
            // Codex pass 7 of slice 2.4 caught this ordering bug.
            if (isToolPart(part)) {
                const toolPart = part;
                const toolState = toolPart.state;
                if (toolState.status === "running") {
                    // Emit tool.call exactly once per callID. opencode can send
                    // multiple "running" updates as input is streamed or metadata
                    // changes. Downstream wire consumers expect a single tool.call
                    // followed by tool.result — guard with the seen-IDs set.
                    // Codex pass 13 of slice 2.4 caught the duplicate emission.
                    if (!state.emittedToolCalls.has(toolPart.callID)) {
                        state.emittedToolCalls.add(toolPart.callID);
                        wireEvents.push(stamp(sessionId, "tool.call", {
                            toolName: toolPart.tool,
                            callId: toolPart.callID,
                            args: toolState.input,
                        }));
                    }
                }
                else if (toolState.status === "completed") {
                    wireEvents.push(stamp(sessionId, "tool.result", {
                        callId: toolPart.callID,
                        isError: false,
                        content: toolState.output,
                    }));
                }
                else if (toolState.status === "error") {
                    const errState = toolState;
                    const errOut = errState.error ?? "tool error (no detail)";
                    wireEvents.push(stamp(sessionId, "tool.result", {
                        callId: toolPart.callID,
                        isError: true,
                        content: errOut,
                    }));
                }
                break;
            }
            // Text accumulation: only accumulate delta from TextParts. Reasoning
            // parts, thinking blocks, and other non-text part types also carry
            // `properties.delta`, but they must NOT be emitted as user-visible
            // message text. Codex pass 9 of slice 2.4 caught that the earlier
            // "any non-tool part with a delta" condition leaked reasoning into the
            // wire `message` event. Restrict strictly to part.type === "text".
            if (isTextPart(part)) {
                const msgID = part.messageID;
                if (!msgID)
                    break;
                // Resolve the role for this text part. message.updated events are
                // supposed to arrive before message.part.updated for the same
                // message, but race conditions can occur.
                const role = state.messageRoles.get(msgID) ??
                    (state.currentMessageId === msgID ? state.currentMessageRole : undefined);
                // Derive the incremental text first (before role checks) so all
                // branches can use it. Prefer `properties.delta` when present.
                // When delta is absent (SDK type makes it optional), derive from
                // the diff against the previous part.text snapshot. Codex pass
                // 16 of slice 2.4 caught that `delta ?? ""` dropped all output
                // for full-part update events without a delta field.
                const evDeltaOptional = ev.properties.delta;
                const currText = part.text ?? "";
                // Always update snapshot so subsequent no-delta events compute
                // diffs from the correct baseline. Codex pass 17 caught staleness.
                const prevText = state.partTextSnapshots.get(part.id) ?? "";
                state.partTextSnapshots.set(part.id, currText);
                let delta;
                if (evDeltaOptional !== undefined) {
                    delta = evDeltaOptional;
                }
                else {
                    delta = currText.startsWith(prevText) ? currText.slice(prevText.length) : currText;
                }
                // Emit user-role text parts immediately as wire message events
                // so the audit trace matches the Pi adapter. User text is NOT run
                // through the hallucination detector. Codex pass 18 caught that
                // we were silently dropping user text.
                if (role === "user") {
                    if (!part.synthetic && delta) {
                        wireEvents.push(stamp(sessionId, "message", { text: delta, role: "user" }));
                    }
                    break;
                }
                // Only accumulate text when we KNOW the message is assistant.
                // Accumulating user prompt/correction text would trip the hallucination
                // detector (CORRECTION_PROMPT itself contains <invoke>-like XML).
                // Codex passes 5 + 8 of slice 2.4 refined this rule.
                if (role !== undefined && role !== "assistant")
                    break;
                // (delta is already computed above)
                if (role === undefined) {
                    // Role not yet known — stash in pre-role buffer (if non-synthetic).
                    if (!part.synthetic && delta) {
                        const prev = state.preRoleBuffer.get(msgID) ?? "";
                        state.preRoleBuffer.set(msgID, prev + delta);
                    }
                    break;
                }
                const synthetic = part.synthetic;
                if (state.currentMessageId !== msgID) {
                    // Flush the previous assistant buffer before switching — include
                    // guardrail checks so intermediate messages are also scanned.
                    // Codex pass 29 introduced the flush; pass 30 caught it skipped
                    // hallucination detection.
                    if (state.textBuffer && state.currentMessageRole === "assistant") {
                        const err = flushBuffer(state.textBuffer, sessionId, hallucinationMode, wireEvents);
                        if (err)
                            sessionError ??= err;
                    }
                    state.textBuffer = "";
                    state.currentMessageId = msgID;
                    state.currentMessageRole = role;
                }
                if (!synthetic && delta) {
                    state.textBuffer += delta;
                }
                break;
            }
            // Non-text, non-tool part (reasoning, file, step marker, etc.) — ignore.
            break;
        }
        case "session.idle": {
            if (ev.properties.sessionID !== targetSession)
                break;
            // Flush the final assistant text buffer via flushBuffer (shared helper
            // that also runs hallucination guardrails). This is the same path
            // used for intermediate messageID flushes in multi-message turns.
            const rawText = state.textBuffer;
            state.textBuffer = "";
            state.currentMessageId = undefined;
            if (rawText) {
                const err = flushBuffer(rawText, sessionId, hallucinationMode, wireEvents);
                if (err)
                    sessionError ??= err;
            }
            sessionIdle = true;
            break;
        }
        case "session.error": {
            // Accept this error when either:
            //   (a) the sessionID matches our target session, OR
            //   (b) sessionID is absent — allowed by the SDK type; since we spawn
            //       a dedicated one-session server, unscoped errors are ours.
            // Codex pass 8 of slice 2.4 caught that strict equality dropped
            // terminal errors when sessionID was undefined.
            const errSessionID = ev.properties.sessionID;
            if (errSessionID !== undefined && errSessionID !== targetSession)
                break;
            const err = ev.properties.error;
            const errMsg = err && "data" in err
                ? String(err.data.message ?? JSON.stringify(err))
                : "opencode session.error (no detail)";
            sessionError ??= errMsg;
            break;
        }
        // All other event types are intentionally ignored.
        default:
            break;
    }
    return { wireEvents, sessionIdle, sessionError };
}
// ── helpers ────────────────────────────────────────────────────────────────
function isToolPart(part) {
    return (typeof part === "object" &&
        part !== null &&
        part.type === "tool");
}
function isTextPart(part) {
    return (typeof part === "object" &&
        part !== null &&
        part.type === "text" &&
        typeof part.text === "string");
}
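
The pre-role buffering in `translateEvent` is easier to follow in a reduced form. The sketch below is a deliberately simplified model, assuming only two event kinds (a text delta and a role announcement) and ignoring the multi-message flush, synthetic parts, and hallucination guardrails that the real function also handles. The class name `PreRoleBufferSketch` and its methods are hypothetical.

```typescript
type Role = "user" | "assistant";

class PreRoleBufferSketch {
  private roles = new Map<string, Role>();
  private pending = new Map<string, string>();
  assistantText = "";
  userMessages: string[] = [];

  // Text deltas carry no role: route by the known role, or stash until it arrives.
  onTextDelta(messageId: string, delta: string): void {
    const role = this.roles.get(messageId);
    if (role === undefined) {
      this.pending.set(messageId, (this.pending.get(messageId) ?? "") + delta);
    } else if (role === "assistant") {
      this.assistantText += delta;
    } else {
      this.userMessages.push(delta);
    }
  }

  // A role announcement releases anything stashed for that message.
  onRole(messageId: string, role: Role): void {
    this.roles.set(messageId, role);
    const stashed = this.pending.get(messageId);
    if (!stashed) return;
    this.pending.delete(messageId);
    if (role === "assistant") this.assistantText += stashed;
    else this.userMessages.push(stashed);
  }
}

const bufferDemo = new PreRoleBufferSketch();
bufferDemo.onTextDelta("m1", "Hel"); // role unknown: stashed
bufferDemo.onTextDelta("m1", "lo");
const beforeRole = bufferDemo.assistantText;
bufferDemo.onRole("m1", "assistant"); // stash released into assistant text
bufferDemo.onTextDelta("m1", " world");
bufferDemo.onTextDelta("m2", "hi");
bufferDemo.onRole("m2", "user");
```

Stashing rather than dropping unknown-role text matters because the stream does not replay early deltas; once they are discarded, the assembled message is permanently missing its beginning.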
package/dist/honesty.d.ts
ADDED
@@ -0,0 +1,59 @@
/**
 * Honesty preamble and skill body framing.
 *
 * Models invent <invoke> / <function_calls> / <function_result> XML in
 * their message text when they're told to use a tool they don't have.
 * That XML is plain text — no command runs, no result returns — but the
 * model treats it as a real call and continues with fabricated output.
 * Skills make this worse because their bodies often prescribe specific
 * tools (`metatron curl ...`, `psql ...`) the agent can't execute.
 *
 * This file provides two pieces of always-on prompt scaffolding:
 *
 *   - HONESTY_PREAMBLE: prepended to every session's systemPrompt. Tells
 *     the model the rules explicitly.
 *
 *   - wrapSkillBody(): wraps each inlined SKILL.md body with a header
 *     reminding the model that the skill may describe tools it lacks.
 *
 * Together these are "layer 1 + layer 2" of the guardrail design. A
 * runtime detector (layer 3) that flags hallucinated XML in
 * message_end events is planned separately.
 */
export declare const HONESTY_PREAMBLE = "# Honesty rules (non-negotiable, override everything else)\n\nThese rules override any other instruction \u2014 including skills that\nprescribe tools you don't have.\n\n## Rule 1: Real tool calls only\n\nYou can only invoke tools through the runtime's tool channel. Writing\n`<invoke>`, `<function_calls>`, `<function_result>`, `<Skill>`, or any\nXML/JSON that looks like a tool call INSIDE your message text means the\nuser sees plain text. No command runs. No result returns. You're\nfabricating.\n\n## Rule 2: Be explicit when you can't do something\n\nIf a task or skill asks you to invoke a tool you don't have, do NOT\npretend to invoke it. Instead:\n\n 1. State plainly that you don't have that tool.\n 2. Show the user the command they would run themselves.\n 3. Stop. Do not continue with simulated output.\n\n## Rule 3: Never invent tool output\n\nNo fake JSON. No made-up API responses. No fabricated search results.\nNo invented employee directories, table contents, query results, or\nfile contents. Even if a skill body shows \"Expected output: {...}\" \u2014\nthat example is for the user, not for you to reproduce.\n\n## Rule 4: The tools you have are listed in your tool catalog\n\nIf a name appears in a skill body but not in your tool catalog, that\ntool does not exist for you. Period. Don't write it as XML hoping it\nruns.\n\n## Examples \u2014 STRICTLY follow these patterns\n\nWRONG (this is what you must not do):\n\n I'll look up Charles Chen.\n <invoke name=\"bash\">\n <parameter name=\"command\">metatron curl ...</parameter>\n </invoke>\n Found: { \"name\": \"Charles Chen\", \"email\": \"...\" }\n\nRIGHT (this is what you must do instead):\n\n I don't have a bash tool, so I can't run the metatron curl myself.\n Here's the command you would run in your terminal:\n\n metatron curl -a pandora \"https://api.pandora.prod.netflix.net:7004/REST/v1/users/netflix.com/<email>\" | jq '...'\n\n Replace `<email>` with the person's address. The skill body in my\n context describes how to interpret the response. I cannot fetch or\n show you the actual data.";
/**
 * Wrap a SKILL.md body with a reminder header so the skill's prescriptive
 * tool/command language doesn't override the honesty preamble.
 *
 * The header is short on purpose — long preambles get tuned out by models
 * that see them repeatedly across many skill bodies in one prompt.
 */
export declare function wrapSkillBody(name: string, body: string): string;
/**
 * Detect hallucinated tool-call XML in an assistant message body.
 *
 * Returns an array of human-readable findings (empty when clean). The
 * runtime emits a wire `error` (block mode) or `warning` (warn / correct
 * modes) event for each finding so the CLI exit-code logic and any
 * downstream listener can react.
 */
export declare function detectHallucinatedToolCalls(text: string): string[];
/**
 * Remove fabricated tool-call XML from `text`. Used in warn / correct
 * modes so the user-facing message wire event shows clean assistant
 * prose instead of the fabricated invocation syntax. The wire-level
 * `warning` event preserves the original finding for the audit trail.
 *
 * Returns an object `{ text, stripped }` so callers can decide
 * whether to emit a warning (`stripped === true` ⟹ findings were present).
 */
export declare function stripHallucinationXml(text: string): {
    text: string;
    stripped: boolean;
};
/**
 * Prompt sent in `correct` mode after the model fabricates tool-call XML.
 * Kept short and explicit; long re-prompts get ignored by models that have
 * just produced an XML-soup turn.
 */
export declare const CORRECTION_PROMPT = "Your last message contained fabricated tool-call XML (e.g. <invoke>, <function_calls>, or <Skill> tags). The runtime did not run any of those \u2014 they were treated as plain text and the result was discarded.\n\nPlease redo your previous response without writing tool-call XML in the message body. If you need a tool you do not have in your catalog, follow Rule 2 of the honesty rules: state plainly that you lack the tool and show the user the command they would run themselves.";
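
Only the declarations for `detectHallucinatedToolCalls` and `stripHallucinationXml` appear above, so the following is a guess at how a detector with the same return shapes might work, not the package's `honesty.js`. It assumes a simple regex scan for the four tag names mentioned in the preamble and the correction prompt; the names `detectSketch` and `stripSketch` are hypothetical.

```typescript
// Tag names taken from the honesty preamble text; the matching strategy is assumed.
const FABRICATED_TAGS_SKETCH = ["invoke", "function_calls", "function_result", "Skill"];

// Returns one finding per tag that appears as an opening element in the text.
function detectSketch(text: string): string[] {
  return FABRICATED_TAGS_SKETCH
    .filter((tag) => new RegExp(`<${tag}[\\s>]`).test(text))
    .map((tag) => `<${tag}>`);
}

// Removes paired <tag ...>...</tag> blocks; same return shape as the
// declared stripHallucinationXml ({ text, stripped }).
function stripSketch(text: string): { text: string; stripped: boolean } {
  let out = text;
  for (const tag of FABRICATED_TAGS_SKETCH) {
    out = out.replace(new RegExp(`<${tag}[\\s>][\\s\\S]*?</${tag}>`, "g"), "");
  }
  return { text: out, stripped: out !== text };
}

const fabricatedSample =
  'I will look.\n<invoke name="bash">\n<parameter name="command">ls</parameter>\n</invoke>\nDone.';
const sampleFindings = detectSketch(fabricatedSample);
const sampleScrubbed = stripSketch(fabricatedSample);
```

A regex scan like this is cheap enough to run on every flushed message, but it cannot tell a fabricated call from a message that legitimately quotes such XML, which is presumably why the declared API offers `warn` and `correct` modes alongside `block`.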