opencodekit 0.18.7 → 0.18.8

package/dist/index.js CHANGED
@@ -18,7 +18,7 @@ var __require = /* @__PURE__ */ createRequire(import.meta.url);
 
  //#endregion
  //#region package.json
- var version = "0.18.7";
+ var version = "0.18.8";
 
  //#endregion
  //#region src/utils/errors.ts
@@ -100,6 +100,14 @@ If a lookup, search, or tool call returns empty, partial, or suspiciously narrow
  - For lists, batches, or paginated results: determine expected scope, track processed items, confirm full coverage
  - If any item is blocked by missing data, mark it `[blocked]` and state exactly what is missing
 
+ ### Plan Quality Gate
+
+ Before approving or executing any implementation plan:
+
+ 1. Plan MUST contain a `## Discovery` section with substantive research findings (>100 characters)
+ 2. Plans without documented discovery skip the research phase and produce worse implementations
+ 3. If discovery is missing or boilerplate, reject the plan and research first
+
  ---
 
  ## Hard Constraints (Never Violate)
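
Reviewer note on the Plan Quality Gate added in the hunk above: the rule could be checked mechanically rather than by eye. The following is a minimal sketch under stated assumptions, not code from the package. The function name `checkDiscoverySection` and the `GateResult` shape are hypothetical, and the check only covers the "missing" and ">100 characters" conditions; it cannot tell substantive findings from boilerplate.

```typescript
// Hypothetical helper (not part of opencodekit): checks that a plan has a
// "## Discovery" section whose body is longer than a minimum length.
interface GateResult {
  ok: boolean;
  reason: string;
}

function checkDiscoverySection(planMarkdown: string, minChars = 100): GateResult {
  const lines = planMarkdown.split("\n");
  const start = lines.findIndex((line) => line.trim() === "## Discovery");
  if (start === -1) {
    return { ok: false, reason: "missing ## Discovery section" };
  }

  // Collect body lines until the next level-1 or level-2 heading.
  const body: string[] = [];
  for (const line of lines.slice(start + 1)) {
    if (/^#{1,2}\s/.test(line)) break;
    body.push(line);
  }

  const text = body.join("\n").trim();
  if (text.length <= minChars) {
    return { ok: false, reason: `discovery too short (${text.length} chars)` };
  }
  return { ok: true, reason: "discovery present" };
}
```

A plan that fails this check would be rejected and sent back for research, as step 3 of the gate describes.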
@@ -157,6 +165,74 @@ Use specialist agents by intent:
 
  **Parallelism rule**: Use parallel subagents for 3+ independent tasks; otherwise work sequentially.
 
+ ### Worker Distrust Protocol
+
+ Subagent self-reports are **approximately 50% accurate**. After every `task()` returns:
+
+ 1. **Read changed files directly** — don't trust the summary; `git diff` or read modified files
+ 2. **Run verification on modified files** — typecheck + lint at minimum; tests if the change touches behavior
+ 3. **Check acceptance criteria** — compare actual output against the original task spec, not the agent's claims
+ 4. **Verify nothing was broken** — check that files outside the agent's scope weren't unexpectedly modified
+
+ ```
+ ✅ Agent reports success → Read diff → Run verification → Confirm criteria → Accept
+ ❌ Agent reports success → Trust it → Move on
+ ❌ Agent reports success → Skim summary → Accept
+ ```
+
+ This applies to ALL subagent types (`@general`, `@explore`, `@review`, `@scout`), not just implementation agents.
+
+ ### Structured Termination Contract
+
+ Every subagent task MUST return a structured response. When dispatching, include this in the prompt:
+
+ ```
+ Return your results in this exact format:
+
+ ## Result
+ - **Status:** completed | blocked | failed
+ - **Files Modified:** [list of file paths]
+ - **Files Read:** [list of file paths consulted]
+
+ ## Verification
+ - [What you verified and how]
+ - [Command output or evidence]
+
+ ## Summary
+ [2-5 sentences: what was done, key decisions, anything unexpected]
+
+ ## Blockers (if status is blocked/failed)
+ - [What's blocking]
+ - [What was tried]
+ - [Recommended next step]
+ ```
+
+ When a subagent returns WITHOUT this structure, treat the response with extra skepticism — unstructured reports are more likely to omit failures or exaggerate completion.
+
+ ### Context File Pattern
+
+ For complex delegations, write context to a file instead of inlining it in the `task()` prompt:
+
+ ```typescript
+ // ❌ Token-expensive: inlining large context
+ task({
+ prompt: `Here is the full plan:\n${longPlanContent}\n\nImplement task 3...`
+ });
+
+ // ✅ Token-efficient: reference by path
+ // Write context file first:
+ write('.beads/artifacts/<id>/worker-context.md', contextContent);
+ // Then reference it:
+ task({
+ prompt: `Read the context file at .beads/artifacts/<id>/worker-context.md\n\nImplement task 3 as described in that file.`
+ });
+ ```
+
+ Use this pattern when:
+ - Context exceeds ~500 tokens
+ - Multiple subagents need the same context
+ - Plan content, research findings, or specs need to be passed to workers
+
  ---
 
  ## Question Policy
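
Reviewer note on the Structured Termination Contract added in the hunk above: a dispatcher could detect unstructured reports before deciding how much skepticism to apply. The sketch below is an illustration only. `parseTerminationReport`, `ParsedReport`, and `REQUIRED_SECTIONS` are hypothetical names, and the conditional `## Blockers` section is deliberately not checked.

```typescript
// Hypothetical parser (not part of opencodekit) for the report format
// described in the Structured Termination Contract.
type Status = "completed" | "blocked" | "failed";

interface ParsedReport {
  structured: boolean; // true only if all required headings and a status exist
  status: Status | null;
  missingSections: string[];
}

const REQUIRED_SECTIONS = ["## Result", "## Verification", "## Summary"];

function parseTerminationReport(report: string): ParsedReport {
  const lines = report.split("\n").map((line) => line.trim());

  // A section counts as present only if its heading appears on its own line.
  const missingSections = REQUIRED_SECTIONS.filter(
    (heading) => !lines.includes(heading),
  );

  // Matches the "- **Status:** completed | blocked | failed" line.
  const match = report.match(/\*\*Status:\*\*\s*(completed|blocked|failed)\b/);
  const status = match ? (match[1] as Status) : null;

  return {
    structured: missingSections.length === 0 && status !== null,
    status,
    missingSections,
  };
}
```

Even a fully structured report still goes through the Worker Distrust Protocol; the parse result only indicates which reports deserve the extra skepticism the text calls for.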
@@ -350,6 +350,15 @@ task({ subagent_type: "general", description: "...", prompt: "..." });
 
  Then synthesize results, verify locally, and report with file-level evidence.
 
+ **Mandatory post-delegation steps** (Worker Distrust Protocol):
+
+ 1. Read changed files directly — don't trust the agent summary
+ 2. Run verification on modified files (typecheck + lint minimum)
+ 3. Check acceptance criteria against the original task spec
+ 4. Only then accept the work
+
+ Include the **Structured Termination Contract** in every subagent prompt (Result/Verification/Summary/Blockers format). See AGENTS.md delegation policy for the template.
+
  ## Output
 
  Report in this order:
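
Reviewer note on the post-delegation steps added in the hunk above: step 1 amounts to comparing what the agent says it changed with what actually changed. A minimal sketch follows, assuming the actual list comes from something like `git diff --name-only` and the reported list from the report's "Files Modified" field. `compareReportedToActual` and `ScopeCheck` are hypothetical names, not package APIs.

```typescript
// Hypothetical helper (not part of opencodekit): diffs the agent's claimed
// file list against the files that really changed.
interface ScopeCheck {
  unreported: string[]; // changed on disk but not listed by the agent
  phantom: string[]; // listed by the agent but not actually changed
}

function compareReportedToActual(reported: string[], actual: string[]): ScopeCheck {
  const reportedSet = new Set(reported);
  const actualSet = new Set(actual);
  return {
    unreported: actual.filter((path) => !reportedSet.has(path)),
    phantom: reported.filter((path) => !actualSet.has(path)),
  };
}
```

A non-empty `unreported` list is the "files outside the agent's scope" case from the protocol; a non-empty `phantom` list suggests the summary overstates the work.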
@@ -6,12 +6,17 @@
  "pruneNotification": "detailed",
  // "chat" (in-conversation) or "toast" (system notification)
  "pruneNotificationType": "toast",
- // Commands: /dcp context, /dcp stats, /dcp sweep, /dcp decompress, /dcp recompress
+ // Commands: /dcp context, /dcp stats, /dcp sweep, /dcp compress, /dcp decompress, /dcp recompress
  "commands": {
  "enabled": true,
  // Protect these from /dcp sweep
  "protectedTools": ["observation", "memory-update", "memory-search"]
  },
+ // Manual mode: disables autonomous context management
+ "manualMode": {
+ "enabled": false,
+ "automaticStrategies": true
+ },
  "turnProtection": {
  "enabled": true,
  "turns": 4
@@ -27,32 +32,26 @@
  "**/biome.json"
  ],
  "compress": {
- // v2.2.x beta: compress is the primary context management tool
+ // v3.0.0: single compress tool replaces the old 3-tool system
  "permission": "allow",
  "showCompression": false,
  "maxContextLimit": "80%",
  "minContextLimit": 30000,
+ "nudgeFrequency": 5,
+ "iterationNudgeThreshold": 15,
+ "nudgeForce": "soft",
  "flatSchema": false,
  "protectUserMessages": true,
- "protectedTools": [
- "write",
- "edit",
- "memory-*",
- "observation",
- "skill",
- "skill_mcp",
- "task",
- "batch",
- "todowrite",
- "todoread",
- "tilth_*"
- ]
+ // v3.0.0 auto-protects: task, skill, todowrite, todoread, compress, batch, plan_enter, plan_exit
+ // Only list additional tools here
+ "protectedTools": ["write", "edit", "memory-*", "observation", "tilth_*"]
  },
- // v2.2.5-beta0: experimental subagent support
+ // Experimental features
  "experimental": {
- "allowSubAgents": true
+ "allowSubAgents": true,
+ "customPrompts": false
  },
- // Auto strategies - TOP LEVEL, not under tools
+ // Auto strategies
  "strategies": {
  // Dedup = zero LLM cost, high impact - always enable
  "deduplication": {
Binary file
@@ -151,7 +151,7 @@
  }
  },
  "plugin": [
- "@tarquinen/opencode-dcp@beta",
+ "@tarquinen/opencode-dcp@latest",
  "@franlol/opencode-md-table-formatter@0.0.3",
  "openslimedit@latest"
  ],
@@ -11,7 +11,7 @@
  "type-check": "tsc --noEmit"
  },
  "dependencies": {
- "@opencode-ai/plugin": "1.2.22"
+ "@opencode-ai/plugin": "1.2.24"
  },
  "devDependencies": {
  "@types/node": "^25.3.0",
@@ -82,6 +82,7 @@ Good agent prompts are:
  1. **Focused** - One clear problem domain
  2. **Self-contained** - All context needed to understand the problem
  3. **Specific about output** - What should the agent return?
+ 4. **Structured termination** - Include the Structured Termination Contract from AGENTS.md (Result/Verification/Summary/Blockers format)
 
  ```markdown
  Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
@@ -101,9 +102,11 @@ These are timing/race condition issues. Your task:
 
  Do NOT just increase timeouts - find the real issue.
 
- Return: Summary of what you found and what you fixed.
+ [Include Structured Termination Contract here]
  ```
 
+ For large investigations (context >500 tokens), use the **Context File Pattern** from AGENTS.md — write context to `.beads/artifacts/<id>/investigation-context.md` and reference by path in the dispatch prompt.
+
  ## Common Mistakes
 
  **❌ Too broad:** "Fix all the tests" - agent gets lost
@@ -94,6 +94,36 @@ Choose the smallest review depth that still covers the risk.
 
  Do **not** dispatch 5 reviewers by reflex for every tiny change. That produces noise, not rigor.
 
+ ## Review Scope Self-Check
+
+ Before writing ANY review finding, the reviewer MUST ask:
+
+ > **"Am I questioning APPROACH or DOCUMENTATION/CORRECTNESS?"**
+
+ | Answer | Action |
+ | --- | --- |
+ | **APPROACH** — "I would have done it differently" | **Stay silent.** The approach was decided during planning. Review doesn't re-litigate design. |
+ | **DOCUMENTATION** — "A new developer couldn't execute this" | **Raise it.** Missing file paths, vague acceptance criteria, assumed context. |
+ | **CORRECTNESS** — "This will break at runtime" | **Raise it.** Bugs, security holes, type errors, logic flaws. |
+
+ ### Red Flags for Scope Creep
+
+ - Suggesting alternative libraries or frameworks → **out of scope** (approach)
+ - "I would rename this to..." without a concrete bug → **out of scope** (style preference)
+ - "Consider adding feature X" that wasn't in requirements → **out of scope** (scope inflation)
+ - "This could be more performant" without evidence of a real problem → **out of scope** (premature optimization)
+
+ ### What Reviewers SHOULD Flag
+
+ - Missing error handling that will crash in production → **correctness**
+ - Vague acceptance criteria that workers can't verify → **documentation**
+ - Hardcoded values with no explanation → **documentation**
+ - Missing file paths for tasks that require edits → **documentation**
+ - "Use the existing pattern" without specifying which pattern → **documentation**
+ - Security vulnerabilities (injection, auth bypass, secrets) → **correctness**
+
+ Include this self-check instruction in EVERY review dispatch prompt to prevent bikeshedding.
+
  ## Step 1: Get Git Context
 
  ### Setup Checklist
@@ -35,6 +35,8 @@ dependencies: [executing-plans]
 
  Read plan file, create TodoWrite with all tasks.
 
+ **Context file pattern:** If the plan exceeds ~500 tokens, write it to `.beads/artifacts/<id>/plan-context.md` and reference by path in subagent prompts instead of inlining. This saves tokens when dispatching multiple subagents from the same plan.
+
  ### 2. Execute Task with Subagent
 
  For each task:
@@ -56,10 +58,15 @@ Task tool (general-purpose):
 
  Work from: [directory]
 
- Report: What you implemented, what you tested, test results, files changed, any issues
+ [Include Structured Termination Contract from AGENTS.md]
  ```
 
- **Subagent reports back** with summary of work.
+ **After subagent reports back** follow the **Worker Distrust Protocol** from AGENTS.md:
+
+ 1. Read changed files directly (don't trust the report)
+ 2. Run verification on modified files (typecheck + lint minimum)
+ 3. Check acceptance criteria against original task spec
+ 4. Only then mark the task as complete
 
  ### 3. Review Subagent's Work
 
@@ -168,6 +168,37 @@ If you just verified and nothing changed, don't re-verify:
 
  This matters when other commands need verification (e.g., closing beads, `/ship`). If you verified 30 seconds ago and made no changes, the cache lets you skip.
 
+ ## Enforcement Gates
+
+ Prompt-level rules get ignored under pressure. These gates are **hard blocks** — they must be checked at the tool/action level, not just remembered.
+
+ ### Gate 1: Completion Claims Require verify.log
+
+ Before ANY completion claim (bead close, PR creation, `/ship`, task completion):
+
+ 1. Check `.beads/verify.log` exists and contains a recent `PASS` stamp
+ 2. If verify.log is missing or stale (older than last file change) → **BLOCK** — run verification first
+ 3. If verify.log shows `FAIL` → **BLOCK** — do not proceed
+
+ ```
+ ✅ verify.log exists, PASS within last edit window → proceed
+ ❌ verify.log missing → STOP: "Run verification first"
+ ❌ verify.log shows FAIL → STOP: "Verification failed, fix before claiming complete"
+ ❌ verify.log stale (files changed since last PASS) → STOP: "Re-run verification"
+ ```
+
+ ### Gate 2: Agent Delegation Requires Post-Verification
+
+ After ANY `task()` subagent returns with "success", follow the **Worker Distrust Protocol** from AGENTS.md — read changed files, run verification, check acceptance criteria. Do not trust agent self-reports.
+
+ ### Enforcement Principle
+
+ > **Prompt rules fail under pressure. Gates fail safe.**
+ >
+ > When a constraint matters enough to be an iron law, enforce it at the action level:
+ > check a file, verify a condition, reject if unmet. Don't rely on the agent
+ > "remembering" to follow the rule.
+
  ## Why This Matters
 
  From 24 failure memories:
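
Reviewer note on Gate 1 added in the hunk above: the four outcomes listed there map onto a small decision function. The sketch below assumes the caller has already read `.beads/verify.log` and reduced it to its most recent stamp. The diff does not specify the log's on-disk format, so `VerifyLogState`, `GateDecision`, and `checkCompletionGate` are hypothetical; only the STOP messages are taken from the added text.

```typescript
// Hypothetical gate check (not part of opencodekit). Timestamps are epoch ms.
type GateDecision =
  | { proceed: true }
  | { proceed: false; message: string };

interface VerifyLogState {
  exists: boolean;
  // Most recent stamp in the log, if any could be read.
  lastStamp?: { result: "PASS" | "FAIL"; at: number };
}

function checkCompletionGate(log: VerifyLogState, lastFileChangeAt: number): GateDecision {
  // Missing log (or a log with no stamp) blocks the completion claim.
  if (!log.exists || !log.lastStamp) {
    return { proceed: false, message: "Run verification first" };
  }
  if (log.lastStamp.result === "FAIL") {
    return { proceed: false, message: "Verification failed, fix before claiming complete" };
  }
  // Stale: files changed after the last PASS stamp.
  if (log.lastStamp.at < lastFileChangeAt) {
    return { proceed: false, message: "Re-run verification" };
  }
  return { proceed: true };
}
```

Every path other than a fresh `PASS` returns a block, which is the sense in which the added text says gates "fail safe".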
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "opencodekit",
- "version": "0.18.7",
+ "version": "0.18.8",
  "description": "CLI tool for bootstrapping and managing OpenCodeKit projects",
  "keywords": [
  "agents",