opencode-goal-mode 0.1.0 → 0.2.1

Files changed (54)
  1. package/ARCHITECTURE.md +180 -0
  2. package/README.md +158 -52
  3. package/agents/goal-api-reviewer.md +0 -2
  4. package/agents/goal-architect.md +0 -2
  5. package/agents/goal-commentator.md +0 -2
  6. package/agents/goal-completion-guard.md +0 -2
  7. package/agents/goal-coordinator.md +0 -2
  8. package/agents/goal-data-reviewer.md +0 -2
  9. package/agents/goal-deep-researcher.md +0 -2
  10. package/agents/goal-diff-reviewer.md +0 -2
  11. package/agents/goal-doc-reviewer.md +0 -2
  12. package/agents/goal-doc-writer.md +0 -2
  13. package/agents/goal-explorer.md +9 -8
  14. package/agents/goal-final-auditor.md +0 -2
  15. package/agents/goal-implementer.md +0 -2
  16. package/agents/goal-mapper.md +0 -2
  17. package/agents/goal-ops-reviewer.md +0 -2
  18. package/agents/goal-perf-reviewer.md +0 -2
  19. package/agents/goal-planner.md +10 -5
  20. package/agents/goal-prompt-auditor.md +0 -2
  21. package/agents/goal-quality-gate.md +0 -2
  22. package/agents/goal-researcher.md +8 -7
  23. package/agents/goal-reviewer.md +0 -2
  24. package/agents/goal-security-reviewer.md +0 -2
  25. package/agents/goal-test-reviewer.md +0 -2
  26. package/agents/goal-ux-reviewer.md +0 -2
  27. package/agents/goal-verifier.md +0 -2
  28. package/agents/goal-web-researcher.md +0 -2
  29. package/agents/goal.md +9 -8
  30. package/package.json +13 -9
  31. package/plugins/goal-guard/agents.js +132 -0
  32. package/plugins/goal-guard/completion.js +64 -0
  33. package/plugins/goal-guard/config.js +87 -0
  34. package/plugins/goal-guard/events.js +65 -0
  35. package/plugins/goal-guard/gates.js +85 -0
  36. package/plugins/goal-guard/logger.js +36 -0
  37. package/plugins/goal-guard/persistence.js +122 -0
  38. package/plugins/goal-guard/shell.js +1159 -0
  39. package/plugins/goal-guard/state.js +182 -0
  40. package/plugins/goal-guard/summary.js +46 -0
  41. package/plugins/goal-guard/system.js +43 -0
  42. package/plugins/goal-guard/tools.js +129 -0
  43. package/plugins/goal-guard/verdicts.js +87 -0
  44. package/plugins/goal-guard.js +267 -379
  45. package/plugins/package.json +3 -0
  46. package/scripts/install.mjs +170 -36
  47. package/docs/research-report.md +0 -37
  48. package/scripts/check-npm-publish-ready.mjs +0 -54
  49. package/scripts/validate-opencode-config.mjs +0 -82
  50. package/tests/agents.test.mjs +0 -70
  51. package/tests/commands.test.mjs +0 -23
  52. package/tests/helpers.mjs +0 -23
  53. package/tests/install.test.mjs +0 -64
  54. package/tests/plugin.test.mjs +0 -195
@@ -0,0 +1,180 @@
1
+ # Architecture
2
+
3
+ OpenCode Goal Mode is three cooperating layers installed into an OpenCode
4
+ configuration directory:
5
+
6
+ 1. **Agents** (`agents/*.md`) — a primary `goal` agent plus specialist
7
+ subagents (researchers, mappers, planners, and a matrix of strict review
8
+ gates). Each is a Markdown file: YAML frontmatter (mode, permissions, color,
9
+ temperature) over a system-prompt body.
10
+ 2. **Commands** (`commands/*.md`) — slash commands (`/goal`, `/goal-contract`,
11
+ `/goal-review`, `/goal-status`, `/goal-repair`, `/goal-final`) that bind a
12
+ prompt template to an agent, some forced to run as subtasks.
13
+ 3. **The `goal-guard` plugin** (`plugins/goal-guard.js` + `plugins/goal-guard/`)
14
+ — a runtime guard that enforces review discipline, blocks destructive shell
15
+ commands, preserves state across compaction and restarts, and exposes
16
+ first-class `goal_*` tools.
17
+
18
+ This document focuses on the plugin, where the engineering lives.
19
+
20
+ ## Why a plugin at all
21
+
22
+ A prompt alone cannot guarantee discipline across a long session: the model can
23
+ forget the Goal Contract after compaction, claim completion without running the
24
+ required reviews, or run a destructive command. The plugin closes those gaps
25
+ using OpenCode's hook system as enforcement points that the model cannot talk
26
+ its way around.
27
+
28
+ ## Module layout
29
+
30
+ The entry file `plugins/goal-guard.js` is deliberately thin — it wires hooks to
31
+ modules and contains no business logic. OpenCode's plugin discovery glob is
32
+ `{plugin,plugins}/*.{ts,js}` (a single level), so the helper modules under
33
+ `plugins/goal-guard/` are imported relatively but are **not** themselves loaded
34
+ as plugins. Each module is independently unit-tested.
35
+
36
+ | Module | Responsibility |
37
+ | --- | --- |
38
+ | `goal-guard.js` | Hook wiring, state-mutation orchestration, tool registration. |
39
+ | `goal-guard/shell.js` | Quote-aware shell tokenizer + command classifier. |
40
+ | `goal-guard/agents.js` | Canonical agent sets, base gates, contextual-gate keyword map. |
41
+ | `goal-guard/config.js` | Config resolution (defaults < env vars < plugin options). |
42
+ | `goal-guard/state.js` | Per-session state records + the store (monotonic seq, LRU, persistence hooks). |
43
+ | `goal-guard/persistence.js` | Atomic, debounced JSON persistence under the XDG state dir. |
44
+ | `goal-guard/verdicts.js` | Verdict extraction (last-wins, anchored) and recording. |
45
+ | `goal-guard/gates.js` | Required-gate computation and freshness. |
46
+ | `goal-guard/completion.js` | `Goal Completed` claim evaluation. |
47
+ | `goal-guard/events.js` | Shared edit/verification/evidence mutators. |
48
+ | `goal-guard/summary.js` | State summaries and structured status reports. |
49
+ | `goal-guard/system.js` | Live state block injected into the system prompt. |
50
+ | `goal-guard/tools.js` | The `goal_status` / `goal_contract` / `goal_evidence` / `goal_reset` tools. |
51
+ | `goal-guard/logger.js` | Best-effort logging/toasts over the OpenCode client. |
52
+
53
+ ## Hooks used
54
+
55
+ Verified against `@opencode-ai/plugin@1.15.13` source.
56
+
57
+ | Hook | Purpose in the guard |
58
+ | --- | --- |
59
+ | `chat.message` | Capture the user's goal text (drives contextual review gates). |
60
+ | `chat.params` | Track the current agent; activate goal sessions. |
61
+ | `experimental.chat.system.transform` | Inject the live Goal Guard state block. |
62
+ | `tool.execute.before` | Block destructive / remote-exec bash by throwing. |
63
+ | `tool.execute.after` | Record edits, verification, mutations, and review verdicts. |
64
+ | `experimental.text.complete` | Rewrite premature `Goal Completed` claims. |
65
+ | `experimental.session.compacting` | Preserve guard state across compaction. |
66
+ | `event` | Track `file.edited` (subagent edits), flush state on `session.idle`. |
67
+ | `tool` | Register the custom `goal_*` tools. |
68
+ | `dispose` | Flush persisted state. |
69
+
70
+ `permission.ask` is intentionally **not** used: in 1.15.13 it is declared in the
71
+ type but never triggered by the runtime, so destructive blocking is done by
72
+ throwing in `tool.execute.before` (the throw surfaces to the model as the tool's
73
+ error result).
74
+
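A minimal sketch of blocking by throwing in `tool.execute.before`, not taken from the package source. It assumes the hook receives `(input, output)` with the tool name on `input.tool` and the bash command on `output.args.command`. `looksDestructive` is a hypothetical toy stand-in for the real analyzer in `shell.js`.

```javascript
// Hypothetical stand-in for the analyzer in shell.js; a toy check only.
function looksDestructive(command) {
  return /^\s*rm\s+-rf\b/.test(command);
}

// Assumed hook shape: (input, output), with the command on output.args.command.
const hooks = {
  "tool.execute.before": async (input, output) => {
    if (input.tool !== "bash") return; // only bash calls are inspected
    const command = String(output.args?.command ?? "");
    if (looksDestructive(command)) {
      // Throwing aborts the call; the message becomes the tool's error result.
      throw new Error(`Goal Guard blocked a destructive command: ${command}`);
    }
  },
};
```

Because the rejection happens inside the harness, the model only ever sees the error text and has no way to approve the call itself.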
75
+ ## State model
76
+
77
+ State is created **per plugin instance** (a closure), not a module global, so
78
+ two OpenCode projects can never cross-contaminate each other's verdicts or dirty
79
+ flags. Within an instance, state is keyed by session id.
80
+
81
+ Every state-changing event draws from a single **monotonic `seq` counter** owned
82
+ by the store. Review freshness ("is this PASS newer than the latest edit?") is
83
+ decided by comparing seq numbers, not millisecond ISO timestamps — so two events
84
+ in the same millisecond cannot tie, and a review can never be accepted as fresh
85
+ against an edit it did not actually follow. Edits invalidate prior reviews;
86
+ re-running verification does not.
87
+
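The freshness rule above can be sketched with a single counter. This is an illustration under assumptions, not the code in `state.js`: the names `createStore`, `recordEdit`, `recordVerdict`, and `isFresh` are hypothetical.

```javascript
// Hypothetical sketch of a store whose events share one monotonic counter.
function createStore() {
  let seq = 0; // incremented for every state-changing event
  const sessions = new Map();

  function session(id) {
    if (!sessions.has(id)) {
      sessions.set(id, { lastEditSeq: 0, verdicts: new Map() });
    }
    return sessions.get(id);
  }

  return {
    recordEdit(id) {
      session(id).lastEditSeq = ++seq;
    },
    recordVerdict(id, agent, verdict) {
      session(id).verdicts.set(agent, { verdict, seq: ++seq });
    },
    // A gate is fresh only if its latest verdict is PASS and follows the last edit.
    isFresh(id, agent) {
      const s = session(id);
      const latest = s.verdicts.get(agent);
      return Boolean(latest) && latest.verdict === "PASS" && latest.seq > s.lastEditSeq;
    },
  };
}

const store = createStore();
store.recordEdit("s1");
store.recordVerdict("s1", "goal-reviewer", "PASS");
const freshBeforeEdit = store.isFresh("s1", "goal-reviewer");
store.recordEdit("s1"); // a later edit invalidates the earlier PASS
const freshAfterEdit = store.isFresh("s1", "goal-reviewer");
```

Since every event takes the next integer, two events can never share a seq, which is the property wall-clock timestamps cannot guarantee.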
88
+ A session record tracks: active flag, captured goal text, the Goal Contract,
89
+ dirty flag and reasons, changed files, review-cycle count, the last edit/review/
90
+ verification seq and timestamps, the verdict log and per-agent latest verdict,
91
+ recorded evidence, and completion-rejection history.
92
+
93
+ ### Persistence
94
+
95
+ OpenCode exposes no key/value store to plugins and discards in-memory plugin
96
+ state on restart. `persistence.js` writes the store snapshot as JSON under
97
+ `$XDG_STATE_HOME/opencode/goal-guard/<sha256(worktree)>.json`, atomically (temp
98
+ file + rename) and debounced. On load the store rehydrates and the seq counter
99
+ is restored so ordering stays monotonic across restarts. A read-only or sandboxed
100
+ filesystem degrades to pure in-memory operation rather than failing a tool call.
101
+
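A sketch of the path derivation and the temp-file-plus-rename write, under stated assumptions: the hash input is taken to be the worktree path string, the `~/.local/state` fallback follows the XDG convention rather than anything stated here, and debouncing is left out. `statePath` and `writeAtomic` are hypothetical names.

```javascript
// Derive the snapshot path for a worktree (assumes the path string is what gets hashed).
async function statePath(worktree, env = process.env) {
  const { createHash } = await import("node:crypto");
  const { homedir } = await import("node:os");
  const path = await import("node:path");
  // Assumed fallback when XDG_STATE_HOME is unset.
  const base = env.XDG_STATE_HOME || path.join(homedir(), ".local", "state");
  const digest = createHash("sha256").update(worktree).digest("hex");
  return path.join(base, "opencode", "goal-guard", `${digest}.json`);
}

// Write the snapshot to a sibling temp file, then rename it over the target.
async function writeAtomic(file, snapshot) {
  const { mkdir, writeFile, rename } = await import("node:fs/promises");
  const path = await import("node:path");
  await mkdir(path.dirname(file), { recursive: true });
  const tmp = `${file}.${process.pid}.tmp`;
  await writeFile(tmp, JSON.stringify(snapshot));
  await rename(tmp, file); // readers see either the old file or the new one
}
```

A caller that wants the in-memory fallback described above would wrap `writeAtomic` in a `try`/`catch` and continue without persistence on failure.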
102
+ ## Shell command analysis
103
+
104
+ `shell.js` replaces boundary-anchored regexes (which were trivially bypassed)
105
+ with a real lexer. It respects single/double quotes and backslash escapes,
106
+ recurses into `$( … )` / backtick substitutions, `eval`, and `-c` strings,
107
+ unwraps `sudo`/`env`/`xargs`/`timeout`/`nice`, resolves `/bin/rm` to `rm`, and
108
+ classifies each *simple* command by its resolved binary into four independent
109
+ signals:
110
+
111
+ - **destructive** — irreversible loss (`rm -rf`, `git reset --hard`, `dd of=/dev`,
112
+ `curl | sh`, interpreter `os.remove`, …); blocked before execution.
113
+ - **mutating** — writes to the tree (`npm install`, `tee`, `> file`, `git commit`);
114
+ marks the session dirty.
115
+ - **verification** — test/build/lint/typecheck commands; counts as evidence.
116
+ - **networkExec** — piping untrusted network output into a shell.
117
+
118
+ This catches the documented bypass corpus (`$(rm -rf /)`, `bash -c "rm -rf /"`,
119
+ `git -C /r reset --hard`, env-prefixes, newlines, interpreter deletions) while
120
+ clearing false positives such as `git checkout -b feature` and quoted text like
121
+ `echo "rm -rf /"`.
122
+
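To show why quote awareness matters, here is a deliberately small tokenizer. It is far simpler than what `shell.js` is described as doing: there is no recursion into substitutions, no wrapper unwrapping, and redirections are not parsed. `tokenize` and `isDestructive` are hypothetical names, and only `rm` with both `-r` and `-f` is flagged.

```javascript
// Split a command line into simple commands, each a list of words.
// Quotes and backslashes are honoured; substitutions are not expanded.
function tokenize(input) {
  const commands = [];
  let words = [];
  let word = "";
  let hasWord = false;
  let quote = null; // "'" or '"' while inside a quoted string

  const endWord = () => {
    if (hasWord) words.push(word);
    word = "";
    hasWord = false;
  };
  const endCommand = () => {
    endWord();
    if (words.length) commands.push(words);
    words = [];
  };

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (quote) {
      if (ch === quote) quote = null;
      // Simplification: inside double quotes a backslash escapes any next character.
      else if (ch === "\\" && quote === '"' && i + 1 < input.length) word += input[++i];
      else word += ch;
    } else if (ch === "'" || ch === '"') {
      quote = ch;
      hasWord = true;
    } else if (ch === "\\" && i + 1 < input.length) {
      word += input[++i];
      hasWord = true;
    } else if (ch === ";" || ch === "|" || ch === "&" || ch === "\n") {
      endCommand();
    } else if (ch === " " || ch === "\t") {
      endWord();
    } else {
      word += ch;
      hasWord = true;
    }
  }
  endCommand();
  return commands;
}

// Flag any simple command whose resolved binary is rm with both -r and -f.
function isDestructive(input) {
  return tokenize(input).some((words) => {
    const binary = words[0].split("/").pop(); // "/bin/rm" resolves to "rm"
    if (binary !== "rm") return false;
    const flags = words
      .slice(1)
      .filter((w) => w.startsWith("-") && !w.startsWith("--"))
      .join("");
    return flags.includes("r") && flags.includes("f");
  });
}
```

With this sketch, `echo "rm -rf /"` yields the words `echo` and `rm -rf /`, so the quoted text is an argument and never a command, while `/bin/rm -rf /tmp/x` and `ls; rm -r -f build` are both flagged.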
123
+ ## Gating and completion
124
+
125
+ `gates.js` derives the required review gates from a fixed base set plus
126
+ contextual specialists selected by whole-word keyword matches against the goal
127
+ text, the recorded Goal Contract, and the set of changed files (so a goal about
128
+ "auth tokens" requires the security reviewer; "capital city" does not pull in the
129
+ api reviewer). A gate is satisfied only when its latest verdict is `PASS` with a
130
+ seq newer than the last edit.
131
+
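Whole-word matching is what keeps "capital" from triggering the `api` keyword. The sketch below assumes a keyword map of this shape; the real map lives in `agents.js` and is not shown in this diff, so the keywords listed are hypothetical.

```javascript
// Hypothetical keyword map, for illustration only.
const CONTEXTUAL_GATES = {
  "goal-security-reviewer": ["auth", "token", "tokens", "password", "secret"],
  "goal-api-reviewer": ["api", "endpoint", "schema"],
};

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the specialist gates whose keywords appear as whole words in any source.
function contextualGates(...sources) {
  const haystack = sources.join("\n").toLowerCase();
  return Object.entries(CONTEXTUAL_GATES)
    .filter(([, keywords]) =>
      keywords.some((k) => new RegExp(`\\b${escapeRegExp(k)}\\b`).test(haystack))
    )
    .map(([agent]) => agent);
}
```

Callers would pass the goal text, the contract, and the changed file paths as separate arguments, for example `contextualGates(goalText, contract, changedFiles.join("\n"))`.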
132
+ `completion.js` evaluates a finished message that claims `Goal Completed`. Only
133
+ active goal sessions are policed. The claim is rewritten to `Goal Not Completed`
134
+ — with the specific missing gates appended — when the `Review cycles: N` line is
135
+ absent, no cycle was recorded, the claimed N does not match the recorded count,
136
+ or any required gate is missing/stale.
137
+
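The rewrite conditions listed above can be sketched as a pure function. This is an assumption-laden illustration, not `completion.js`: the `state` fields (`active`, `reviewCycles`, `missingGates`) and the appended message format are hypothetical.

```javascript
// Return the message unchanged, or rewritten with the reasons it was rejected.
function evaluateCompletion(text, state) {
  // Only active goal sessions that actually claim completion are policed.
  if (!state.active || !/\bGoal Completed\b/.test(text)) return text;

  const problems = [];
  const match = text.match(/^Review cycles:\s*(\d+)\s*$/m);
  if (!match) {
    problems.push("missing `Review cycles: N` line");
  } else if (state.reviewCycles === 0) {
    problems.push("no review cycle recorded");
  } else if (Number(match[1]) !== state.reviewCycles) {
    problems.push(`claimed ${match[1]} review cycles, recorded ${state.reviewCycles}`);
  }
  for (const gate of state.missingGates) {
    problems.push(`gate missing or stale: ${gate}`);
  }

  if (problems.length === 0) return text;
  return (
    text.replace(/\bGoal Completed\b/, "Goal Not Completed") +
    "\n\nGoal Guard:\n" +
    problems.map((p) => `- ${p}`).join("\n")
  );
}
```

Keeping the check pure makes it easy to test each rejection reason in isolation before wiring it into the `experimental.text.complete` hook.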
138
+ ## Custom tools
139
+
140
+ The `tool` hook registers four tools (names are verbatim object keys):
141
+
142
+ - `goal_contract` — record the Goal Contract; activates enforcement and fixes the
143
+ required specialist gates.
144
+ - `goal_evidence` — log a verification command + result into the ledger.
145
+ - `goal_status` — return the authoritative gate/dirty/completion status.
146
+ - `goal_reset` — clear the session's goal state (requires `confirm: true`).
147
+
148
+ The `@opencode-ai/plugin` import they need is isolated to `tools.js` and loaded
149
+ via a guarded dynamic import, so if the host cannot resolve it the core guard
150
+ hooks still load.
151
+
152
+ ## Configuration
153
+
154
+ `config.js` merges, in increasing precedence: built-in defaults, environment
155
+ variables (`GOAL_GUARD_*`), and the plugin `options` object passed via the
156
+ `["./plugins/goal-guard.js", { … }]` form in `opencode.json`. Toggles cover
157
+ destructive blocking, network-exec blocking, completion enforcement, system-state
158
+ injection, persistence, contextual gates, session cache size/TTL, and toasts.
159
+
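The precedence order can be sketched as follows. Only three of the documented options are included, the accepted boolean spellings are an assumption, and `resolveConfig`, `envName`, and `parseEnv` are hypothetical names rather than the exports of `config.js`.

```javascript
// A subset of the documented defaults, for illustration.
const DEFAULTS = { blockDestructive: true, persist: true, maxSessions: 200 };

// maxSessions -> GOAL_GUARD_MAX_SESSIONS
function envName(key) {
  return "GOAL_GUARD_" + key.replace(/[A-Z]/g, (c) => "_" + c).toUpperCase();
}

// Coerce an env string to the type of the default; unparseable values fall back.
function parseEnv(raw, fallback) {
  if (typeof fallback === "boolean") {
    if (/^(1|true|yes|on)$/i.test(raw)) return true; // assumed spellings
    if (/^(0|false|no|off)$/i.test(raw)) return false;
    return fallback;
  }
  const n = Number(raw);
  return raw.trim() !== "" && Number.isFinite(n) ? n : fallback;
}

// Precedence, lowest to highest: defaults, environment, plugin options.
function resolveConfig(options = {}, env = {}) {
  const config = { ...DEFAULTS };
  for (const key of Object.keys(DEFAULTS)) {
    const raw = env[envName(key)];
    if (raw !== undefined) config[key] = parseEnv(raw, DEFAULTS[key]);
    if (options[key] !== undefined) config[key] = options[key];
  }
  return config;
}
```

Passing `env` as a parameter instead of reading `process.env` directly keeps the function deterministic under test.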
160
+ ## Installer
161
+
162
+ `scripts/install.mjs` recursively copies `agents/`, `commands/`, and `plugins/`
163
+ (including the nested module directory) into the target config dir, and records a
164
+ manifest of the file hashes it wrote. On upgrade it distinguishes files it owns
165
+ (safe to replace) from files the user has customized (a conflict requiring
166
+ `--force`), prunes files from prior versions that no longer ship, and supports
167
+ `--uninstall` (which leaves locally-modified files in place).
168
+
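The ownership decision can be sketched as a comparison of three hashes per file. This is a hypothetical model of the behavior described above, not the logic in `scripts/install.mjs`. `planFile` and `filesToPrune` are illustrative names, and the hashes are treated as opaque strings.

```javascript
// Decide what to do with one destination file during an install or upgrade.
// recordedHash: hash stored in the previous manifest (undefined if never installed)
// diskHash:     hash of the file currently on disk (undefined if absent)
// incomingHash: hash of the file shipped by the new version
function planFile({ recordedHash, diskHash, incomingHash, force = false }) {
  if (diskHash === undefined) return "write"; // nothing on disk yet
  if (diskHash === incomingHash) return "skip"; // already up to date
  if (recordedHash !== undefined && diskHash === recordedHash) {
    return "replace"; // installer-owned and untouched since the last install
  }
  return force ? "replace" : "conflict"; // locally modified or unknown file
}

// Files recorded by a prior install that the new version no longer ships.
function filesToPrune(previousManifest, incomingFiles) {
  const shipped = new Set(incomingFiles);
  return Object.keys(previousManifest).filter((file) => !shipped.has(file));
}
```

An uninstall pass could reuse the same comparison: remove a file only when its disk hash still equals the recorded hash, which leaves locally modified files in place.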
169
+ ## Testing
170
+
171
+ `node --test` runs the suite:
172
+
173
+ - `tests/shell.test.mjs` — the analyzer against the bypass and false-positive corpora.
174
+ - `tests/plugin.test.mjs` — hook behavior, gating, verdicts, completion, tools, isolation.
175
+ - `tests/state.test.mjs` — store, seq ordering, eviction, persistence round-trips.
176
+ - `tests/agents.test.mjs` / `tests/commands.test.mjs` — frontmatter and contracts.
177
+ - `tests/install.test.mjs` — recursive copy, manifest upgrades, uninstall.
178
+
179
+ `npm run validate` runs the tests, the structural config validator, the publish
180
+ readiness check, and an `npm pack --dry-run`.
package/README.md CHANGED
@@ -1,21 +1,95 @@
1
1
  # OpenCode Goal Mode
2
2
 
3
- Strict Goal Mode for OpenCode: a primary `goal` mode, specialized subagents, slash commands, and a guard plugin that preserves review discipline across long sessions.
3
+ Strict Goal Mode for OpenCode: a primary `goal` agent, a matrix of specialized
4
+ review subagents, slash commands, and a `goal-guard` plugin that enforces review
5
+ discipline, blocks destructive shell commands, and preserves goal state across
6
+ compaction **and** restarts.
7
+
8
+ See [ARCHITECTURE.md](ARCHITECTURE.md) for the design and [research/](research/)
9
+ for the platform reference, comparison, and threat model.
10
+
11
+ ## Why it's different
12
+
13
+ Most "goal mode" / agentic setups are **prompt-only**: the model is *asked* to
14
+ review its work and to keep going until done. Goal Mode adds a guard plugin that
15
+ makes that discipline **mechanical at the harness layer** — the model cannot
16
+ declare `Goal Completed` until the required reviews actually passed, and it
17
+ cannot run a destructive command that a regex guard would miss.
18
+
19
+ ![Mechanically-enforced goal discipline vs. Claude Code and Codex](docs/benchmarks/capability-matrix.svg)
20
+
21
+ Compared to Claude Code and OpenAI Codex (full analysis, with citations and
22
+ honest caveats, in [research/goal-mode-comparison.md](research/goal-mode-comparison.md)):
23
+
24
+ - **It is the only one of the three that mechanically blocks a premature
25
+ completion claim by default.** Goal Mode intercepts the finished message and
26
+ rewrites `Goal Completed` → `Goal Not Completed` unless every required reviewer
27
+ gate has a *fresh* PASS and the claimed `Review cycles: N` matches the recorded
28
+ counter. Claude Code can do this only via a user-authored Stop hook; Codex's
29
+ code review is advisory.
30
+ - **An edit automatically invalidates prior approvals.** A reviewer gate counts
31
+ only when its PASS is newer (by a monotonic integer sequence) than the last
32
+ edit — so any change forces the relevant reviews to re-run. Neither Claude Code
33
+ nor Codex ships this stale-review invariant.
34
+ - **Required specialist reviews are auto-selected and enforced** (security, api,
35
+ data, performance …) from the goal text, contract, and changed files — not left
36
+ to the model's discretion.
37
+ - **Destructive commands are blocked by a real shell tokenizer**, not a regex.
38
+ Claude Code's own docs call Bash argument-matching *"fragile"*.
39
+
40
+ ### Benchmark: shell-guard accuracy
41
+
42
+ The guard replaced a boundary-anchored regex classifier. On a labeled corpus of
43
+ 71 real commands (`npm run bench`, reproducible — see
44
+ [research/benchmarks.md](research/benchmarks.md)):
45
+
46
+ ![Destructive-command detection rate by family](docs/benchmarks/detection-by-family.svg)
47
+
48
+ ![Overall guard accuracy: detection rate vs false-positive rate](docs/benchmarks/overall-scorecard.svg)
49
+
50
+ | | Legacy regex guard | Goal Mode analyzer |
51
+ | --- | --- | --- |
52
+ | Destructive-command detection | **20.8%** | **100%** |
53
+ | False positives on safe commands | **21.7%** | **0%** |
54
+ | Obfuscated bypasses caught (`$(…)`, `bash -c`, `sudo -u`, interpreters) | 0% | 100% |
55
+ | Remote exec (`curl \| sh`) caught | 0% | 100% |
56
+
57
+ The deeper analysis costs ~0.6 µs more per command (~500,000 classifications/
58
+ second) — negligible for a per-tool-call guard:
59
+
60
+ ![Per-command analysis latency](docs/benchmarks/latency.svg)
4
61
 
5
62
  ## Requirements
6
63
 
7
64
  - Node.js 20.11 or newer.
8
65
  - OpenCode configured to load local agents, commands, and plugins.
9
66
 
10
- ## What It Adds
11
-
12
- - A primary `goal` agent that owns implementation but delegates research, discovery, verification planning, and reviews to subagents.
13
- - Strict review agents for prompt compliance, diff review, verification, security, UX, operations, and final completion.
14
- - Slash commands for `/goal`, `/goal-contract`, `/goal-review`, `/goal-status`, `/goal-repair`, and `/goal-final`.
15
- - A `goal-guard` OpenCode plugin that tracks dirty sessions, review cycles, review verdicts, and injects goal state into compaction.
16
- - Tests that validate agent frontmatter, command frontmatter, plugin behavior, install safety, and config compatibility.
17
-
18
- ## Install Globally
67
+ ## What it adds
68
+
69
+ - A primary `goal` agent that owns implementation but delegates research,
70
+ discovery, verification planning, and reviews to subagents.
71
+ - Strict review gates for prompt compliance, diff review, verification, security,
72
+ UX, operations, data, API, performance, tests, docs, quality, and final audit.
73
+ - Slash commands: `/goal`, `/goal-contract`, `/goal-review`, `/goal-status`,
74
+ `/goal-repair`, `/goal-final`.
75
+ - The `goal-guard` plugin:
76
+ - **Quote-aware shell analysis** that blocks destructive and remote-exec
77
+ commands (including ones that evade naive regexes — `$(rm -rf …)`,
78
+ `bash -c "…"`, `/bin/rm`, `git -C … reset --hard`, `curl | sh`) without
79
+ false-positiving harmless commands like `git checkout -b`.
80
+ - **Completion enforcement**: a premature `Goal Completed` is rewritten to
81
+ `Goal Not Completed` with the exact missing review gates.
82
+ - **Contextual gating**: the goal text and changed files determine which
83
+ specialist reviewers are required.
84
+ - **Disk persistence**: review ledgers survive OpenCode restarts.
85
+ - **Custom tools**: `goal_contract`, `goal_evidence`, `goal_status`,
86
+ `goal_reset`.
87
+ - **Live state injection** into the system prompt so the model always knows
88
+ what the guard requires.
89
+ - A test suite validating the analyzer, plugin hooks, state store, install
90
+ safety, and config compatibility.
91
+
92
+ ## Install globally
19
93
 
20
94
  ```bash
21
95
  npm ci
@@ -23,9 +97,10 @@ npm run validate
23
97
  npm run install:global
24
98
  ```
25
99
 
26
- Restart OpenCode after installation. OpenCode loads agents, commands, and plugins at startup.
100
+ Restart OpenCode after installation. OpenCode loads agents, commands, and
101
+ plugins at startup.
27
102
 
28
- ## Install Into One Project
103
+ ## Install into one project
29
104
 
30
105
  ```bash
31
106
  npm ci
@@ -35,15 +110,56 @@ npm run install:local
35
110
 
36
111
  This writes to `./.opencode` in the current project.
37
112
 
38
- ## Installer Options
113
+ ## Installer options
39
114
 
40
115
  ```bash
41
116
  node scripts/install.mjs --dry-run
42
117
  node scripts/install.mjs --target /path/to/opencode-config
43
118
  node scripts/install.mjs --global --force
119
+ node scripts/install.mjs --global --uninstall
44
120
  ```
45
121
 
46
- The installer refuses to overwrite changed destination files unless `--force` is passed.
122
+ The installer records a manifest of the files it writes. On upgrade it replaces
123
+ files it owns but refuses to clobber files you have locally modified unless
124
+ `--force` is passed. `--uninstall` removes only the files it installed and leaves
125
+ your local edits in place.
126
+
127
+ ## Configuration
128
+
129
+ The guard works with zero configuration. To tune it, add options in
130
+ `opencode.json`:
131
+
132
+ ```jsonc
133
+ {
134
+ "plugin": [
135
+ ["./plugins/goal-guard.js", { "blockDestructive": true, "contextualGates": true }]
136
+ ]
137
+ }
138
+ ```
139
+
140
+ Or via environment variables (`GOAL_GUARD_*`):
141
+
142
+ | Option / env | Default | Effect |
143
+ | --- | --- | --- |
144
+ | `blockDestructive` / `GOAL_GUARD_BLOCK_DESTRUCTIVE` | `true` | Block destructive bash before execution. |
145
+ | `blockNetworkExec` / `GOAL_GUARD_BLOCK_NETWORK_EXEC` | `true` | Block `curl \| sh`-style remote execution. |
146
+ | `enforceCompletion` / `GOAL_GUARD_ENFORCE_COMPLETION` | `true` | Rewrite premature `Goal Completed`. |
147
+ | `injectSystemState` / `GOAL_GUARD_INJECT_SYSTEM_STATE` | `true` | Inject live state into the prompt. |
148
+ | `persist` / `GOAL_GUARD_PERSIST` | `true` | Persist state under the XDG state dir. |
149
+ | `contextualGates` / `GOAL_GUARD_CONTEXTUAL_GATES` | `true` | Require specialist gates by goal keywords. |
150
+ | `maxSessions` / `GOAL_GUARD_MAX_SESSIONS` | `200` | Session cache size. |
151
+ | `sessionTtlMs` / `GOAL_GUARD_SESSION_TTL_MS` | `86400000` | Idle session TTL. |
152
+ | `toastOnBlock` / `GOAL_GUARD_TOAST_ON_BLOCK` | `true` | Toast when something is blocked. |
153
+
154
+ ## Custom tools
155
+
156
+ The plugin registers four tools the model can call directly:
157
+
158
+ - `goal_contract` — record the Goal Contract (requirements, non-goals,
159
+ acceptance criteria). Activates enforcement and fixes the required gates.
160
+ - `goal_evidence` — record a verification command and result.
161
+ - `goal_status` — return the authoritative gate/dirty/completion status.
162
+ - `goal_reset` — clear the session's goal state (requires `confirm: true`).
47
163
 
48
164
  ## Validation
49
165
 
@@ -54,40 +170,40 @@ npm run audit
54
170
  npm run publish:check
55
171
  ```
56
172
 
57
- `npm run validate` runs the test suite, checks the OpenCode package structure, verifies the guard plugin hooks, and performs an npm package dry run.
173
+ `npm run validate` runs the test suite, the structural config validator, the
174
+ publish readiness check, and an `npm pack --dry-run`.
58
175
 
59
- ## npm Publishing
176
+ ## Models
60
177
 
61
- Install from npm after the first publish:
178
+ Agents do not pin a provider-specific model, so they inherit the model OpenCode
179
+ is configured to use. To give a particular agent a specific model, add a
180
+ `model:` (and optional `variant:`) line to that agent's frontmatter in your
181
+ installed copy.
62
182
 
63
- ```bash
64
- npm install -g opencode-goal-mode
65
- opencode-goal-mode-install --global
66
- ```
67
-
68
- Publishing is handled by `.github/workflows/publish.yml`.
183
+ ## Safety
69
184
 
70
- First publish:
185
+ The installer copies only `agents/*.md`, `commands/*.md`, and the `plugins/`
186
+ tree — never auth files, session files, tokens, or personal provider config.
71
187
 
72
- ```bash
73
- npm publish --access public --otp <2fa-code>
74
- ```
188
+ The guard blocks destructive shell commands, marks real file mutations dirty,
189
+ keeps read-only inspection from dirtying the session, preserves goal state during
190
+ compaction and across restarts, and blocks premature `Goal Completed` responses
191
+ when review gates are missing or stale.
75
192
 
76
- npm requires 2FA proof or a granular access token with bypass 2FA enabled for creating and publishing packages. After the package exists on npm, configure Trusted Publishing for tokenless releases:
193
+ ## npm publishing
77
194
 
78
- - Provider: GitHub Actions
79
- - Organization/user: `devinoldenburg`
80
- - Repository: `opencode-goal-mode`
81
- - Workflow filename: `publish.yml`
82
- - Allowed action: `npm publish`
83
-
84
- The workflow already has `id-token: write`, runs on Node 24, uses npm 11, and publishes with:
195
+ Install from npm after the first publish:
85
196
 
86
197
  ```bash
87
- npm publish --access public
198
+ npm install -g opencode-goal-mode
199
+ opencode-goal-mode-install --global
88
200
  ```
89
201
 
90
- If you prefer token-based publishing instead of Trusted Publishing, add a repository secret named `NPM_TOKEN` with a granular npm token that has publish rights and bypass 2FA enabled.
202
+ Publishing is handled by `.github/workflows/publish.yml`, which runs on Node 24
203
+ with `id-token: write` for Trusted Publishing. The workflow validates the
204
+ package, checks the tag matches `package.json`, verifies the version is not
205
+ already on npm, then publishes. Manual workflow dispatch defaults to
206
+ `npm publish --dry-run`.
91
207
 
92
208
  Release flow:
93
209
 
@@ -96,19 +212,9 @@ npm version patch
96
212
  git push --follow-tags
97
213
  ```
98
214
 
99
- Create a GitHub Release from the pushed tag, for example `v0.1.1`. The publish workflow validates the package, checks that the tag matches `package.json`, verifies that the version is not already on npm, then publishes to npm.
100
-
101
- Manual workflow dispatch defaults to `npm publish --dry-run`.
102
-
103
- ## Safety
104
-
105
- This repository intentionally does not include auth files, session files, tokens, or personal OpenCode provider config. The installer copies only:
106
-
107
- - `agents/*.md`
108
- - `commands/*.md`
109
- - `plugins/goal-guard.js`
110
-
111
- The guard plugin blocks destructive shell commands, marks real file mutations dirty, avoids dirtying sessions for read-only inspection commands, preserves Goal state during compaction, and blocks premature `Goal Completed` responses when review gates are missing or stale.
215
+ Then create a GitHub Release from the pushed tag (e.g. `v0.1.1`). For
216
+ token-based publishing instead of Trusted Publishing, add a repository secret
217
+ `NPM_TOKEN` with publish rights.
112
218
 
113
219
  ## Goal Completion Contract
114
220
 
@@ -116,6 +222,6 @@ The guard plugin blocks destructive shell commands, marks real file mutations di
116
222
 
117
223
  - All acceptance criteria are mapped to evidence.
118
224
  - Required verification passed or is credibly accounted for.
119
- - Latest edit is not newer than latest required review cycle.
225
+ - No edit is newer than the latest required review cycle.
120
226
  - Required reviewers return `Verdict: PASS`.
121
- - Final answer includes `Review cycles: N`.
227
+ - The final answer includes an accurate `Review cycles: N`.
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for API design review, endpoint contracts, request/response schemas, backward compatibility, versioning, authentication boundaries, and client impact.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: xhigh
6
4
  temperature: 0
7
5
  color: error
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for system design, architectural decision records, technology selection, tradeoff analysis, data flow, module boundaries, and integration contracts.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: xhigh
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively to add, improve, or standardize code comments, inline documentation, parameter descriptions, and developer-facing annotations without changing behavior.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use at completion time to enforce that every required contextual review gate has passed after the latest edit and verification. Prevents premature Goal Completed claims.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: xhigh
6
4
  temperature: 0
7
5
  color: error
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively to orchestrate multiple subagents, manage dependencies, sequence parallel workstreams, aggregate results, and keep complex multi-part goals on track.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for data model review, database schema, migrations, seed data, constraints, indexes, consistency rules, and data integrity checks.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: xhigh
6
4
  temperature: 0
7
5
  color: error
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for deep web research, external documentation, specs, RFCs, academic sources, competitor analysis, and authoritative references. Complements file/code research with full web-scale evidence gathering.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: xhigh
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use after any file change to inspect diffs, side effects, regressions, unintended edits, and scope creep.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: error
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use for documentation, README, command help, install instructions, and maintainability of Goal Mode guidance.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for generating, updating, and improving documentation: READMEs, API docs, manuals, runbooks, inline help, release notes, and ADRs.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -1,7 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for local codebase exploration, file discovery, structure mapping, dependency tracing, and convention detection before Goal Mode implementation.
3
3
  mode: subagent
4
- model: ordis/minimax/minimax-m3
5
4
  color: secondary
6
5
  permission:
7
6
  read: allow
@@ -25,11 +24,13 @@ permission:
25
24
 
26
25
  You are a fast local exploration agent for Goal Mode. Build implementation context without changing files.
27
26
 
28
- Return only concise actionable context:
27
+ Discipline: return distilled conclusions, not raw material. Never paste large file bodies, full command output, or long search logs — cite `path:line` and summarize. Your job is to protect the main agent's context, so keep the response tight and actionable.
29
28
 
30
- - Relevant files
31
- - Current behavior
32
- - Constraints and conventions
33
- - Suggested edit points
34
- - Verification commands
35
- - Risks to preserve
29
+ Return only concise actionable context, in exactly these sections:
30
+
31
+ - Relevant files: each as `path:line` with a one-line reason.
32
+ - Current behavior: how the relevant code works today.
33
+ - Constraints and conventions: patterns the implementation must follow.
34
+ - Suggested edit points: the specific files/functions to change.
35
+ - Verification commands: how a change here is tested.
36
+ - Risks to preserve: behavior that must not regress.
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use as the final read-only completion gate before any Goal Mode answer may start with Goal Completed.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: xhigh
6
4
  temperature: 0
7
5
  color: error
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use only for isolated bounded implementation subtasks when the main Goal agent explicitly delegates a narrow edit.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  color: warning
7
5
  hidden: true
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for codebase structure mapping, entry points, dependency tracing, callgraph analysis, symbol resolution, test mapping, and configuration trail following.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use for config-time changes, install scripts, restarts, migrations, environment assumptions, GitHub/CI operations, and deployment/release risk.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: warning
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for performance, scalability, resource usage, latency, throughput, memory, CPU, I/O, algorithmic complexity, and observability review.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: xhigh
6
4
  temperature: 0
7
5
  color: error
8
6
  permission:
@@ -1,8 +1,6 @@
1
1
  ---
2
2
  description: Use proactively for breaking goals into executable tasks, sequencing, priority assignment, risk estimation, and acceptance-criteria alignment checks.
3
3
  mode: subagent
4
- model: ordis/chatgpt/gpt-5.5
5
- variant: high
6
4
  temperature: 0
7
5
  color: info
8
6
  permission:
@@ -35,6 +33,13 @@ Planning rules:
35
33
  - For each task, include: objective, inputs, outputs, verification command, rollback option, and acceptance check.
36
34
  - Estimate complexity and flag blockers that require human input.
37
35
  - Identify risks per task and propose mitigations.
38
- - M
39
- </think>
40
- I need to adjust the `meta.json` mapping in `validate-opencode-config.mjs` and the agent/command lists so that the new agent(s) load cleanly.
36
+ - Map every task back to at least one acceptance criterion; flag any criterion no task covers.
37
+ - Name the required review gates each task will need (diff, verifier, security, etc.).
38
+
39
+ Output format (return only this, no file dumps):
40
+
41
+ - Task list: numbered, each with objective, inputs, outputs, verification command, rollback, acceptance check, and dependency IDs.
42
+ - Execution order: the sequence with rationale (dependencies and risk first).
43
+ - Coverage map: acceptance criterion -> task IDs that satisfy it; list any uncovered criteria.
44
+ - Risks and mitigations.
45
+ - Open blockers requiring human input, or "none".