openhermes 1.5.6 → 1.12.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +217 -111
- package/autorecall.mjs +2 -12
- package/bootstrap.mjs +158 -8
- package/curator.mjs +1 -5
- package/harness/commands/checkpoint.md +68 -0
- package/harness/commands/eval.md +89 -0
- package/harness/commands/go-build.md +87 -0
- package/harness/commands/go-review.md +71 -0
- package/harness/commands/harness-audit.md +90 -0
- package/harness/commands/learn.md +2 -2
- package/harness/commands/loop-start.md +38 -0
- package/harness/commands/loop-status.md +30 -0
- package/harness/commands/memory-search.md +2 -2
- package/harness/commands/model-route.md +32 -0
- package/harness/commands/orchestrate.md +88 -0
- package/harness/commands/quality-gate.md +35 -0
- package/harness/commands/refactor-clean.md +102 -0
- package/harness/commands/rust-build.md +78 -0
- package/harness/commands/rust-review.md +65 -0
- package/harness/commands/setup-pm.md +65 -0
- package/harness/commands/skill-create.md +99 -0
- package/harness/commands/test-coverage.md +80 -0
- package/harness/commands/update-codemaps.md +81 -0
- package/harness/commands/update-docs.md +67 -0
- package/harness/commands/verify.md +68 -0
- package/harness/instructions/CONVENTIONS.md +206 -0
- package/harness/instructions/RUNTIME.md +8 -1
- package/harness/prompts/build-cpp.md +84 -0
- package/harness/prompts/build-error-resolver.md +2 -1
- package/harness/prompts/build-go.md +326 -0
- package/harness/prompts/build-java.md +126 -0
- package/harness/prompts/build-kotlin.md +123 -0
- package/harness/prompts/build-rust.md +94 -0
- package/harness/prompts/code-reviewer.md +2 -1
- package/harness/prompts/doc-updater.md +193 -0
- package/harness/prompts/docs-lookup.md +60 -0
- package/harness/prompts/explore.md +1 -0
- package/harness/prompts/harness-optimizer.md +30 -0
- package/harness/prompts/loop-operator.md +42 -0
- package/harness/prompts/planner.md +3 -2
- package/harness/prompts/refactor-cleaner.md +242 -0
- package/harness/prompts/review-cpp.md +68 -0
- package/harness/prompts/review-database.md +248 -0
- package/harness/prompts/review-go.md +244 -0
- package/harness/prompts/review-java.md +100 -0
- package/harness/prompts/review-kotlin.md +130 -0
- package/harness/prompts/review-python.md +88 -0
- package/harness/prompts/review-rust.md +64 -0
- package/harness/prompts/security-reviewer.md +3 -2
- package/harness/prompts/tdd-guide.md +214 -0
- package/harness/rules/delegation.md +28 -22
- package/harness/rules/memory-management.md +4 -4
- package/harness/rules/retrieval.md +5 -5
- package/harness/rules/runtime-guards.md +1 -1
- package/harness/rules/session-start.md +4 -4
- package/harness/rules/skills-management.md +2 -2
- package/harness/rules/state-drift.md +1 -1
- package/harness/rules/verification.md +4 -4
- package/harness/skills/coding-standards/SKILL.md +1 -1
- package/index.mjs +25 -4
- package/lib/hardening.mjs +11 -1
- package/lib/memory-tools-plugin.mjs +84 -71
- package/lib/ohc/config.mjs +30 -0
- package/lib/ohc/pruner.mjs +239 -0
- package/lib/ohc/reaper.mjs +61 -0
- package/lib/ohc/state.mjs +32 -0
- package/lib/ohc/updater.mjs +110 -0
- package/package.json +1 -1
- package/skill-builder.mjs +2 -6
package/bootstrap.mjs
CHANGED
|
@@ -33,9 +33,9 @@ Snapshot before mutation. Never delete unrelated files. Never assume \`%USERPROF
|
|
|
33
33
|
| Category | Items |
|
|
34
34
|
|----------|-------|
|
|
35
35
|
| **Native tools** | \`read\`, \`write\`, \`edit\`, \`glob\`, \`grep\`, \`bash\`, \`task\`, \`webfetch\`, \`skill\`, \`todowrite\`, \`todoread\` |
|
|
36
|
-
| **In-process tools** | \`
|
|
36
|
+
| **In-process tools** | \`add_memory\`, \`fetch_memory\`, \`list_memory\`, \`latest_memory\`, \`search_memory\`, \`archive_memory\` |
|
|
37
37
|
| **Memory recall cache** | \`openhermes/memory/recall/cache.json\` — read on session start, no MCP round-trip |
|
|
38
|
-
| **Subagents** | \`explore\` (read-only), \`general\` (multi-step), \`architect\`, \`planner\`, \`build-error-resolver\`, \`code-reviewer\`, \`security-reviewer\`, \`e2e-runner\` |
|
|
38
|
+
| **Subagents** | \`explore\` (read-only), \`general\` (multi-step), \`architect\`, \`planner\`, \`build-error-resolver\`, \`code-reviewer\`, \`security-reviewer\`, \`e2e-runner\`, \`docs-lookup\`, \`doc-updater\`, \`refactor-cleaner\`, \`loop-operator\`, \`harness-optimizer\`, \`tdd-guide\`, \`review-go\`, \`build-go\`, \`review-database\`, \`review-cpp\`, \`build-cpp\`, \`review-java\`, \`build-java\`, \`review-kotlin\`, \`build-kotlin\`, \`review-python\`, \`review-rust\`, \`build-rust\` |
|
|
39
39
|
| **Plugins** | \`curator\` (checkpoints, mistakes, audit, compaction), \`autorecall\` (recall cache on \`session.created\`), \`skill-builder\` (complex session detection) |
|
|
40
40
|
|
|
41
41
|
## Skills (available via \`skill\` tool)
|
|
@@ -56,17 +56,33 @@ Main context = coordination + verification only. Substantive work → subagent.
|
|
|
56
56
|
| Security audit | \`security-reviewer\` |
|
|
57
57
|
| E2E testing | \`e2e-runner\` |
|
|
58
58
|
| Multi-file search/exploration | \`explore\` or \`general\` |
|
|
59
|
+
| Documentation lookup | \`docs-lookup\` |
|
|
60
|
+
| Doc/codemap update | \`doc-updater\` |
|
|
61
|
+
| Dead code cleanup | \`refactor-cleaner\` |
|
|
62
|
+
| TDD workflow | \`tdd-guide\` |
|
|
63
|
+
| Autonomous loop | \`loop-operator\` |
|
|
64
|
+
| Go review | \`review-go\` |
|
|
65
|
+
| Go build fix | \`build-go\` |
|
|
66
|
+
| Database review | \`review-database\` |
|
|
67
|
+
| C++ review | \`review-cpp\` |
|
|
68
|
+
| Java review | \`review-java\` |
|
|
69
|
+
| Java build fix | \`build-java\` |
|
|
70
|
+
| Kotlin review | \`review-kotlin\` |
|
|
71
|
+
| Kotlin build fix | \`build-kotlin\` |
|
|
72
|
+
| Python review | \`review-python\` |
|
|
73
|
+
| Rust review | \`review-rust\` |
|
|
74
|
+
| Rust build fix | \`build-rust\` |
|
|
59
75
|
| Any non-trivial multi-step | appropriate specialist |
|
|
60
76
|
|
|
61
77
|
Never delegate trivial single-step ops. Subagent returns diff + summary + verification; inspect return only. Full ref: \`${RULES_DIR}\\\\delegation.md\`.
|
|
62
78
|
|
|
63
79
|
## Memory — Gated & Precision-First
|
|
64
80
|
|
|
65
|
-
- **Start**: Read recall cache first. If stale/missing → \`
|
|
66
|
-
- **Before work**: Narrow \`
|
|
81
|
+
- **Start**: Read recall cache first. If stale/missing → \`latest_memory\` for relevant classes.
|
|
82
|
+
- **Before work**: Narrow \`search_memory\` by class, scope, keywords. Never read full indexes.
|
|
67
83
|
- **Before close**: Query same-type mistakes (7 days). Match → \`code-reviewer\` or \`security-reviewer\`.
|
|
68
|
-
- **On failure**: \`
|
|
69
|
-
- **Precision ladder**: \`
|
|
84
|
+
- **On failure**: \`search_memory\` for similar incidents. Search memory before asking user.
|
|
85
|
+
- **Precision ladder**: \`latest_memory\` → \`search_memory\` → \`fetch_memory\` → \`list_memory\` (last resort). Full index reads only for explicit audit/repair tasks.
|
|
70
86
|
- **Anti-spam**: No obvious facts, no one-off prefs, no temp state, no low-risk mistakes. Supersede, don't duplicate. Full rules: \`${RULES_DIR}\\\\retrieval.md\`, \`${RULES_DIR}\\\\memory-management.md\`.
|
|
71
87
|
|
|
72
88
|
## Self-Edit Authority
|
|
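The "precision ladder" in the hunk above reads as an ordered fallback: try the cheapest lookup first and fall through to a full index read only when explicitly allowed. The sketch below illustrates that reading with stubbed steps. `recallWithLadder`, the `steps` array shape, and the `allowFullIndex` option are hypothetical names introduced here for illustration; the real memory tools are invoked by the agent and their signatures are not shown in this diff.

```javascript
// Hypothetical sketch of the fallback order described in the bootstrap text.
// Each step is a stub returning an array of records (empty means "nothing useful").
function recallWithLadder(steps, { allowFullIndex = false } = {}) {
  for (const { name, run, lastResort = false } of steps) {
    // Full index reads (list_memory) are skipped unless explicitly allowed,
    // mirroring "Full index reads only for explicit audit/repair tasks".
    if (lastResort && !allowFullIndex) continue
    const records = run()
    if (records.length > 0) return { source: name, records }
  }
  return { source: null, records: [] }
}

// Stubbed steps, in the order the bootstrap text lists them.
const steps = [
  { name: "latest_memory", run: () => [] },
  { name: "search_memory", run: () => [{ id: "m-1", class: "mistakes" }] },
  { name: "fetch_memory", run: () => [] },
  { name: "list_memory", run: () => [{ id: "full-index" }], lastResort: true },
]

const result = recallWithLadder(steps)
console.log(result.source) // search_memory
```

Gating the last step behind a flag keeps the default path from ever reading the full index, which is the behaviour the "Never read full indexes" rule asks for.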
@@ -86,7 +102,7 @@ Full tiers: \`${RULES_DIR}\\\\self-heal.md\`.
|
|
|
86
102
|
- Checkpoint on meaningful boundaries. Compress closed segments immediately.
|
|
87
103
|
- After subagent return: verify → compress that block.
|
|
88
104
|
- Compress proactively.
|
|
89
|
-
- Skill candidates → \`/learn\` only if repeated pattern + \`
|
|
105
|
+
- Skill candidates → \`/learn\` only if repeated pattern + \`search_memory\` confirms no dup. See \`${RULES_DIR}\\\\skills-management.md\`.
|
|
90
106
|
- Audit triggers: openhermes/config change, repeated failures, session start when last audit >7 days. See \`${RULES_DIR}\\\\audit.md\`.
|
|
91
107
|
|
|
92
108
|
## Escalation
|
|
@@ -100,7 +116,7 @@ T0: observe → log mistake → smallest fix. T1: add prevention rule → verify
|
|
|
100
116
|
- **Forensic ledger**: \`%USERPROFILE%\\\\.local\\\\share\\\\opencode\\\\opencode.db\``
|
|
101
117
|
|
|
102
118
|
return [
|
|
103
|
-
`<OPENHERMES_BOOTSTRAP>\nOpenHermes v${getOwnVersion()} active. Harness: \`${HARNESS_DIR}\\\`. Memory at \`~/.
|
|
119
|
+
`<OPENHERMES_BOOTSTRAP>\nOpenHermes v${getOwnVersion()} active. Harness: \`${HARNESS_DIR}\\\`. Memory at \`~/.local/share/opencode/openhermes/memory/\`. Rules at \`${RULES_DIR}\\\`. Skills discoverable via \`skill\` tool — use \`skill\` tool to list/load them.`,
|
|
104
120
|
`<OPENHERMES_CONSTITUTION>\n${constitution}\n</OPENHERMES_CONSTITUTION>`,
|
|
105
121
|
`<OPENHERMES_RUNTIME>\n${runtime}\n</OPENHERMES_RUNTIME>`,
|
|
106
122
|
`<OPENHERMES_ROUTER>\n${router}\n</OPENHERMES_ROUTER>`
|
|
@@ -152,6 +168,32 @@ export const BootstrapPlugin = async ({ client, directory }) => {
|
|
|
152
168
|
"doctor": { agent: "OpenHermes", description: "Run OpenCode OpenHermes health diagnostics", subtask: true, template: ct("doctor.md") },
|
|
153
169
|
"memory-search": { agent: "OpenHermes", description: "Search OpenHermes memory with LLM summarization", subtask: true, template: ct("memory-search.md") },
|
|
154
170
|
"learn": { agent: "OpenHermes", description: "Create a new skill from recent work patterns", subtask: true, template: ct("learn.md") },
|
|
171
|
+
"ohc": { template: "", description: "OHC context management: /ohc status, /ohc compress [focus]" },
|
|
172
|
+
"orchestrate": { agent: "planner", description: "Orchestrate multiple agents for complex tasks", subtask: true, template: ct("orchestrate.md") },
|
|
173
|
+
"eval": { agent: "planner", description: "Evaluate implementation against acceptance criteria", subtask: true, template: ct("eval.md") },
|
|
174
|
+
"model-route": { agent: "OpenHermes", description: "Recommend model tier by task complexity and budget", subtask: true, template: ct("model-route.md") },
|
|
175
|
+
"quality-gate": { agent: "OpenHermes", description: "Run quality pipeline (format, lint, type check)", subtask: true, template: ct("quality-gate.md") },
|
|
176
|
+
"test-coverage": { agent: "tdd-guide", description: "Analyze coverage reports and identify gaps", subtask: true, template: ct("test-coverage.md") },
|
|
177
|
+
"update-docs": { agent: "doc-updater", description: "Update documentation for recent code changes", subtask: true, template: ct("update-docs.md") },
|
|
178
|
+
"update-codemaps": { agent: "doc-updater", description: "Generate/update architecture codemaps", subtask: true, template: ct("update-codemaps.md") },
|
|
179
|
+
"refactor-clean": { agent: "refactor-cleaner", description: "Remove dead code and consolidate duplicates", subtask: true, template: ct("refactor-clean.md") },
|
|
180
|
+
"verify": { agent: "OpenHermes", description: "Run comprehensive verification loop (typecheck, lint, test, build)", subtask: true, template: ct("verify.md") },
|
|
181
|
+
"checkpoint": { agent: "OpenHermes", description: "Save verification state and progress checkpoint", subtask: true, template: ct("checkpoint.md") },
|
|
182
|
+
"loop-start": { agent: "loop-operator", description: "Start managed autonomous loop with safety defaults", subtask: true, template: ct("loop-start.md") },
|
|
183
|
+
"loop-status": { agent: "OpenHermes", description: "Inspect active loop state, progress, and failure signals", subtask: true, template: ct("loop-status.md") },
|
|
184
|
+
"harness-audit": { agent: "harness-optimizer", description: "Run harness self-audit across 7 categories", subtask: true, template: ct("harness-audit.md") },
|
|
185
|
+
"setup-pm": { agent: "OpenHermes", description: "Configure package manager preference for the project", subtask: true, template: ct("setup-pm.md") },
|
|
186
|
+
"go-build": { agent: "build-go", description: "Fix Go build, vet, and compilation errors", subtask: true, template: ct("go-build.md") },
|
|
187
|
+
"go-review": { agent: "review-go", description: "Review Go code for idiomatic patterns and best practices", subtask: true, template: ct("go-review.md") },
|
|
188
|
+
"rust-build": { agent: "build-rust", description: "Fix Rust build, clippy, and dependency errors", subtask: true, template: ct("rust-build.md") },
|
|
189
|
+
"rust-review": { agent: "review-rust", description: "Review Rust code for safety, ownership, and idioms", subtask: true, template: ct("rust-review.md") },
|
|
190
|
+
"skill-create": { agent: "OpenHermes", description: "Generate a new skill from git history analysis", subtask: true, template: ct("skill-create.md") },
|
|
191
|
+
}
|
|
192
|
+
|
|
193
|
+
config.experimental ??= {}
|
|
194
|
+
config.experimental.primary_tools ??= []
|
|
195
|
+
if (!config.experimental.primary_tools.includes("compress")) {
|
|
196
|
+
config.experimental.primary_tools.push("compress")
|
|
155
197
|
}
|
|
156
198
|
|
|
157
199
|
config.agent = {
|
|
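The added lines in this hunk register `compress` as a primary tool without clobbering whatever the user already configured. The same pattern can be sketched as a standalone helper; `ensurePrimaryTool` is a hypothetical name used here for illustration, and only the `config.experimental.primary_tools` shape is taken from the diff.

```javascript
// Minimal sketch of the idempotent registration shown in the hunk above.
// `ensurePrimaryTool` is a hypothetical helper, not part of the package.
function ensurePrimaryTool(config, toolName) {
  // ??= only assigns when the left side is null or undefined,
  // so existing user settings are preserved.
  config.experimental ??= {}
  config.experimental.primary_tools ??= []
  if (!config.experimental.primary_tools.includes(toolName)) {
    config.experimental.primary_tools.push(toolName)
  }
  return config
}

const config = { experimental: { primary_tools: ["read"] } }
ensurePrimaryTool(config, "compress")
ensurePrimaryTool(config, "compress") // second call is a no-op
console.log(config.experimental.primary_tools) // [ 'read', 'compress' ]
```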
@@ -210,6 +252,114 @@ export const BootstrapPlugin = async ({ client, directory }) => {
|
|
|
210
252
|
prompt: p("security-reviewer.md"),
|
|
211
253
|
permission: { read: "allow", edit: "deny", bash: "deny", task: { "*": "allow" } },
|
|
212
254
|
},
|
|
255
|
+
"docs-lookup": {
|
|
256
|
+
description: "Documentation lookup via MCP — query any library docs in real-time",
|
|
257
|
+
mode: "subagent",
|
|
258
|
+
prompt: p("docs-lookup.md"),
|
|
259
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
260
|
+
},
|
|
261
|
+
"doc-updater": {
|
|
262
|
+
description: "Documentation and codemap generation/update specialist",
|
|
263
|
+
mode: "subagent",
|
|
264
|
+
prompt: p("doc-updater.md"),
|
|
265
|
+
permission: { read: "allow", edit: "allow", bash: "allow" },
|
|
266
|
+
},
|
|
267
|
+
"refactor-cleaner": {
|
|
268
|
+
description: "Dead code detection and safe removal specialist",
|
|
269
|
+
mode: "subagent",
|
|
270
|
+
prompt: p("refactor-cleaner.md"),
|
|
271
|
+
permission: { read: "allow", edit: "allow" },
|
|
272
|
+
},
|
|
273
|
+
"loop-operator": {
|
|
274
|
+
description: "Autonomous agent loop operator — safe iteration with stop conditions",
|
|
275
|
+
mode: "subagent",
|
|
276
|
+
prompt: p("loop-operator.md"),
|
|
277
|
+
permission: { read: "allow", edit: "allow", bash: "allow", task: { "*": "allow" } },
|
|
278
|
+
},
|
|
279
|
+
"harness-optimizer": {
|
|
280
|
+
description: "OpenHermes harness configuration optimizer — audit, tune, measure",
|
|
281
|
+
mode: "subagent",
|
|
282
|
+
prompt: p("harness-optimizer.md"),
|
|
283
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
284
|
+
},
|
|
285
|
+
"tdd-guide": {
|
|
286
|
+
description: "Test-Driven Development coach — red-green-refactor cycle enforcement",
|
|
287
|
+
mode: "subagent",
|
|
288
|
+
prompt: p("tdd-guide.md"),
|
|
289
|
+
permission: { read: "allow", edit: "allow", bash: "allow" },
|
|
290
|
+
},
|
|
291
|
+
"review-go": {
|
|
292
|
+
description: "Go code review specialist — idiomatic Go, concurrency, error handling",
|
|
293
|
+
mode: "subagent",
|
|
294
|
+
prompt: p("review-go.md"),
|
|
295
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
296
|
+
},
|
|
297
|
+
"build-go": {
|
|
298
|
+
description: "Go build error resolution specialist — go build, vet, staticcheck fixes",
|
|
299
|
+
mode: "subagent",
|
|
300
|
+
prompt: p("build-go.md"),
|
|
301
|
+
permission: { read: "allow", edit: "allow", bash: "allow" },
|
|
302
|
+
},
|
|
303
|
+
"review-database": {
|
|
304
|
+
description: "PostgreSQL database specialist — query optimization, schema, RLS, indexes",
|
|
305
|
+
mode: "subagent",
|
|
306
|
+
prompt: p("review-database.md"),
|
|
307
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
308
|
+
},
|
|
309
|
+
"review-cpp": {
|
|
310
|
+
description: "C++ code review specialist — memory safety, modern C++, RAII",
|
|
311
|
+
mode: "subagent",
|
|
312
|
+
prompt: p("review-cpp.md"),
|
|
313
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
314
|
+
},
|
|
315
|
+
"build-cpp": {
|
|
316
|
+
description: "C++ build error resolution specialist — CMake, linker, template errors",
|
|
317
|
+
mode: "subagent",
|
|
318
|
+
prompt: p("build-cpp.md"),
|
|
319
|
+
permission: { read: "allow", edit: "allow", bash: "allow" },
|
|
320
|
+
},
|
|
321
|
+
"review-java": {
|
|
322
|
+
description: "Java/Spring Boot review specialist — JPA, architecture, security",
|
|
323
|
+
mode: "subagent",
|
|
324
|
+
prompt: p("review-java.md"),
|
|
325
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
326
|
+
},
|
|
327
|
+
"build-java": {
|
|
328
|
+
description: "Java/Maven/Gradle build error resolution specialist",
|
|
329
|
+
mode: "subagent",
|
|
330
|
+
prompt: p("build-java.md"),
|
|
331
|
+
permission: { read: "allow", edit: "allow", bash: "allow" },
|
|
332
|
+
},
|
|
333
|
+
"review-kotlin": {
|
|
334
|
+
description: "Kotlin/Android review specialist — coroutines, Compose, architecture",
|
|
335
|
+
mode: "subagent",
|
|
336
|
+
prompt: p("review-kotlin.md"),
|
|
337
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
338
|
+
},
|
|
339
|
+
"build-kotlin": {
|
|
340
|
+
description: "Kotlin/Gradle build error resolution specialist",
|
|
341
|
+
mode: "subagent",
|
|
342
|
+
prompt: p("build-kotlin.md"),
|
|
343
|
+
permission: { read: "allow", edit: "allow", bash: "allow" },
|
|
344
|
+
},
|
|
345
|
+
"review-python": {
|
|
346
|
+
description: "Python code review specialist — PEP 8, type hints, security",
|
|
347
|
+
mode: "subagent",
|
|
348
|
+
prompt: p("review-python.md"),
|
|
349
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
350
|
+
},
|
|
351
|
+
"review-rust": {
|
|
352
|
+
description: "Rust code review specialist — ownership, lifetimes, safety",
|
|
353
|
+
mode: "subagent",
|
|
354
|
+
prompt: p("review-rust.md"),
|
|
355
|
+
permission: { read: "allow", bash: "allow", edit: "deny" },
|
|
356
|
+
},
|
|
357
|
+
"build-rust": {
|
|
358
|
+
description: "Rust build error resolution specialist — cargo, borrow checker, clippy",
|
|
359
|
+
mode: "subagent",
|
|
360
|
+
prompt: p("build-rust.md"),
|
|
361
|
+
permission: { read: "allow", edit: "allow", bash: "allow" },
|
|
362
|
+
},
|
|
213
363
|
}
|
|
214
364
|
|
|
215
365
|
config.default_agent = "OpenHermes"
|
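The language specialists added in this hunk follow a visible pattern: every `review-*` agent gets `edit: "deny"`, while every `build-*` agent gets `edit: "allow"`. The sketch below derives those permission sets from the agent kind. The `specialist` helper, the two constants, and the `promptFile` field are hypothetical; the package itself writes each entry out by hand and loads prompts through `p(...)`, whose implementation is not part of this diff.

```javascript
// Hypothetical sketch: derive the permission block from the agent kind.
const READ_ONLY = { read: "allow", bash: "allow", edit: "deny" }
const READ_WRITE = { read: "allow", edit: "allow", bash: "allow" }

function specialist(kind, lang, description) {
  const name = `${kind}-${lang}`
  return [name, {
    description,
    mode: "subagent",
    // Placeholder field: the diff uses prompt: p("<name>.md") instead.
    promptFile: `${name}.md`,
    // Copy the template so agents never share one mutable object.
    permission: { ...(kind === "review" ? READ_ONLY : READ_WRITE) },
  }]
}

const agents = Object.fromEntries([
  specialist("review", "go", "Go code review specialist"),
  specialist("build", "go", "Go build error resolution specialist"),
])

console.log(agents["review-go"].permission.edit) // deny
console.log(agents["build-go"].permission.edit) // allow
```

Agents outside this pattern (for example `refactor-cleaner`, which has no `bash` entry, or `loop-operator`, which also allows `task`) would still need explicit definitions.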
package/curator.mjs
CHANGED
|
@@ -2,7 +2,7 @@ import path from "node:path"
|
|
|
2
2
|
import fs from "node:fs"
|
|
3
3
|
import os from "node:os"
|
|
4
4
|
import { findUnsupportedSchemaKeywords, validateSchema } from "./lib/schema-validator.mjs"
|
|
5
|
-
import { atomicWriteJson, fingerprintEnvironment, fingerprintFile,
|
|
5
|
+
import { atomicWriteJson, fingerprintEnvironment, fingerprintFile, readJson, redactSensitiveText, sanitizeRecord, truncateText } from "./lib/hardening.mjs"
|
|
6
6
|
import { fileURLToPath } from "node:url"
|
|
7
7
|
import { dirname } from "node:path"
|
|
8
8
|
import { getDataRoot, getMemoryRoot, getRuntimeRoot, getArchiveRoot } from "./lib/paths.mjs"
|
|
@@ -20,10 +20,6 @@ function curatorLog(message) {
|
|
|
20
20
|
process.stderr.write(`${message}\n`)
|
|
21
21
|
}
|
|
22
22
|
|
|
23
|
-
function readJson(fp, fallback) {
|
|
24
|
-
try { return JSON.parse(fs.readFileSync(fp, "utf8")) } catch { return fallback }
|
|
25
|
-
}
|
|
26
|
-
|
|
27
23
|
function buildEnvironmentFingerprint(root, directory, project) {
|
|
28
24
|
return fingerprintEnvironment({
|
|
29
25
|
cwd: directory,
|
|
package/harness/commands/checkpoint.md
@@ -0,0 +1,68 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Save verification state and progress checkpoint
|
|
3
|
+
agent: OpenHermes
|
|
4
|
+
subtask: true
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Checkpoint Command
|
|
8
|
+
|
|
9
|
+
Save current verification state and create progress checkpoint: $ARGUMENTS
|
|
10
|
+
|
|
11
|
+
## Your Task
|
|
12
|
+
|
|
13
|
+
Create a snapshot of current progress including:
|
|
14
|
+
|
|
15
|
+
1. **Tests status** - Which tests pass/fail
|
|
16
|
+
2. **Coverage** - Current coverage metrics
|
|
17
|
+
3. **Build status** - Build succeeds or errors
|
|
18
|
+
4. **Code changes** - Summary of modifications
|
|
19
|
+
5. **Next steps** - What remains to be done
|
|
20
|
+
|
|
21
|
+
## Checkpoint Format
|
|
22
|
+
|
|
23
|
+
### Checkpoint: [Timestamp]
|
|
24
|
+
|
|
25
|
+
**Tests**
|
|
26
|
+
- Total: X
|
|
27
|
+
- Passing: Y
|
|
28
|
+
- Failing: Z
|
|
29
|
+
- Coverage: XX%
|
|
30
|
+
|
|
31
|
+
**Build**
|
|
32
|
+
- Status: PASS: Passing / FAIL: Failing
|
|
33
|
+
- Errors: [if any]
|
|
34
|
+
|
|
35
|
+
**Changes Since Last Checkpoint**
|
|
36
|
+
```
|
|
37
|
+
git diff --stat [last-checkpoint-commit]
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
**Completed Tasks**
|
|
41
|
+
- [x] Task 1
|
|
42
|
+
- [x] Task 2
|
|
43
|
+
- [ ] Task 3 (in progress)
|
|
44
|
+
|
|
45
|
+
**Blocking Issues**
|
|
46
|
+
- [Issue description]
|
|
47
|
+
|
|
48
|
+
**Next Steps**
|
|
49
|
+
1. Step 1
|
|
50
|
+
2. Step 2
|
|
51
|
+
|
|
52
|
+
## Usage with Verification Loop
|
|
53
|
+
|
|
54
|
+
Checkpoints integrate with the verification loop:
|
|
55
|
+
|
|
56
|
+
```
|
|
57
|
+
/plan → implement → /checkpoint → /verify → /checkpoint → implement → ...
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
Use checkpoints to:
|
|
61
|
+
- Save state before risky changes
|
|
62
|
+
- Track progress through phases
|
|
63
|
+
- Enable rollback if needed
|
|
64
|
+
- Document verification points
|
|
65
|
+
|
|
66
|
+
---
|
|
67
|
+
|
|
68
|
+
**TIP**: Create checkpoints at natural breakpoints: after each phase, before major refactoring, after fixing critical bugs.
|
|
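The checkpoint format in this file is regular enough to be filled in from structured data. The sketch below renders a subset of it (tests, build, next steps). `renderCheckpoint` and the shape of its input object are assumptions made for illustration; the command itself only instructs the agent to write the checkpoint and ships no such function.

```javascript
// Hypothetical sketch: render part of the checkpoint template from plain data.
function renderCheckpoint({ timestamp, tests, build, nextSteps }) {
  const lines = [
    `### Checkpoint: ${timestamp}`,
    "",
    "**Tests**",
    `- Total: ${tests.passing + tests.failing}`,
    `- Passing: ${tests.passing}`,
    `- Failing: ${tests.failing}`,
    `- Coverage: ${tests.coverage}%`,
    "",
    "**Build**",
    `- Status: ${build.ok ? "PASS: Passing" : "FAIL: Failing"}`,
  ]
  // Only list errors when the build actually failed.
  if (!build.ok) lines.push(`- Errors: ${build.errors.join("; ")}`)
  lines.push("", "**Next Steps**")
  nextSteps.forEach((step, i) => lines.push(`${i + 1}. ${step}`))
  return lines.join("\n")
}

console.log(renderCheckpoint({
  timestamp: "2025-01-01T12:00:00Z", // sample value
  tests: { passing: 40, failing: 2, coverage: 81 },
  build: { ok: true, errors: [] },
  nextSteps: ["Fix failing tests", "Run /verify"],
}))
```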
package/harness/commands/eval.md
@@ -0,0 +1,89 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Run evaluation against acceptance criteria
|
|
3
|
+
agent: planner
|
|
4
|
+
subtask: true
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Eval Command
|
|
8
|
+
|
|
9
|
+
Evaluate implementation against acceptance criteria: $ARGUMENTS
|
|
10
|
+
|
|
11
|
+
## Your Task
|
|
12
|
+
|
|
13
|
+
Run structured evaluation to verify the implementation meets requirements.
|
|
14
|
+
|
|
15
|
+
## Evaluation Framework
|
|
16
|
+
|
|
17
|
+
### Grader Types
|
|
18
|
+
|
|
19
|
+
1. **Binary Grader** - Pass/Fail
|
|
20
|
+
- Does it work? Yes/No
|
|
21
|
+
- Good for: feature completion, bug fixes
|
|
22
|
+
|
|
23
|
+
2. **Scalar Grader** - Score 0-100
|
|
24
|
+
- How well does it work?
|
|
25
|
+
- Good for: performance, quality metrics
|
|
26
|
+
|
|
27
|
+
3. **Rubric Grader** - Category scores
|
|
28
|
+
- Multiple dimensions evaluated
|
|
29
|
+
- Good for: comprehensive review
|
|
30
|
+
|
|
31
|
+
## Evaluation Process
|
|
32
|
+
|
|
33
|
+
### Step 1: Define Criteria
|
|
34
|
+
|
|
35
|
+
```
|
|
36
|
+
Acceptance Criteria:
|
|
37
|
+
1. [Criterion 1] - [weight]
|
|
38
|
+
2. [Criterion 2] - [weight]
|
|
39
|
+
3. [Criterion 3] - [weight]
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
### Step 2: Run Tests
|
|
43
|
+
|
|
44
|
+
For each criterion:
|
|
45
|
+
- Execute relevant test
|
|
46
|
+
- Collect evidence
|
|
47
|
+
- Score result
|
|
48
|
+
|
|
49
|
+
### Step 3: Calculate Score
|
|
50
|
+
|
|
51
|
+
```
|
|
52
|
+
Final Score = Σ (criterion_score × weight) / total_weight
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
### Step 4: Report
|
|
56
|
+
|
|
57
|
+
## Evaluation Report
|
|
58
|
+
|
|
59
|
+
### Overall: [PASS/FAIL] (Score: X/100)
|
|
60
|
+
|
|
61
|
+
### Criterion Breakdown
|
|
62
|
+
|
|
63
|
+
| Criterion | Score | Weight | Weighted |
|
|
64
|
+
|-----------|-------|--------|----------|
|
|
65
|
+
| [Criterion 1] | X/10 | 30% | X |
|
|
66
|
+
| [Criterion 2] | X/10 | 40% | X |
|
|
67
|
+
| [Criterion 3] | X/10 | 30% | X |
|
|
68
|
+
|
|
69
|
+
### Evidence
|
|
70
|
+
|
|
71
|
+
**Criterion 1: [Name]**
|
|
72
|
+
- Test: [what was tested]
|
|
73
|
+
- Result: [outcome]
|
|
74
|
+
- Evidence: [screenshot, log, output]
|
|
75
|
+
|
|
76
|
+
### Recommendations
|
|
77
|
+
|
|
78
|
+
[If not passing, what needs to change]
|
|
79
|
+
|
|
80
|
+
## Pass@K Metrics
|
|
81
|
+
|
|
82
|
+
For non-deterministic evaluations:
|
|
83
|
+
- Run K times
|
|
84
|
+
- Calculate pass rate
|
|
85
|
+
- Report: "Pass@K = X/K"
|
|
86
|
+
|
|
87
|
+
---
|
|
88
|
+
|
|
89
|
+
**TIP**: Use eval for acceptance testing before marking features complete.
|
|
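The scoring formula in Step 3 and the Pass@K report line can be worked through with a small example. The sketch below uses the 30/40/30 weights from the report template; the criterion scores and the function names `finalScore` and `passAtK` are sample values and hypothetical helpers, not part of the command file. `passAtK` reports the plain pass rate in the "X/K" form the template asks for.

```javascript
// Final Score = Σ (criterion_score × weight) / total_weight
function finalScore(criteria) {
  const totalWeight = criteria.reduce((sum, c) => sum + c.weight, 0)
  if (totalWeight <= 0) throw new Error("total weight must be positive")
  const weighted = criteria.reduce((sum, c) => sum + c.score * c.weight, 0)
  return weighted / totalWeight
}

// Pass rate over K runs of a non-deterministic evaluation.
function passAtK(results) {
  const passes = results.filter(Boolean).length
  return `Pass@${results.length} = ${passes}/${results.length}`
}

// Sample scores on the template's X/10 scale with its 30/40/30 weights.
const criteria = [
  { name: "Criterion 1", score: 8, weight: 30 },
  { name: "Criterion 2", score: 6, weight: 40 },
  { name: "Criterion 3", score: 10, weight: 30 },
]

console.log(finalScore(criteria)) // 7.8
console.log(passAtK([true, true, false, true, true])) // Pass@5 = 4/5
```

Because the result is divided by the total weight, the weights do not need to sum to 100; a 7.8 on the X/10 scale corresponds to 78 on the report's X/100 scale.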
package/harness/commands/go-build.md
@@ -0,0 +1,87 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Fix Go build and vet errors
|
|
3
|
+
agent: build-go
|
|
4
|
+
subtask: true
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Go Build Command
|
|
8
|
+
|
|
9
|
+
Fix Go build, vet, and compilation errors: $ARGUMENTS
|
|
10
|
+
|
|
11
|
+
## Your Task
|
|
12
|
+
|
|
13
|
+
1. **Run go build**: `go build ./...`
|
|
14
|
+
2. **Run go vet**: `go vet ./...`
|
|
15
|
+
3. **Fix errors** one by one
|
|
16
|
+
4. **Verify fixes** don't introduce new errors
|
|
17
|
+
|
|
18
|
+
## Common Go Errors
|
|
19
|
+
|
|
20
|
+
### Import Errors
|
|
21
|
+
```
|
|
22
|
+
imported and not used: "package"
|
|
23
|
+
```
|
|
24
|
+
**Fix**: Remove unused import or use `_` prefix
|
|
25
|
+
|
|
26
|
+
### Type Errors
|
|
27
|
+
```
|
|
28
|
+
cannot use x (type T) as type U
|
|
29
|
+
```
|
|
30
|
+
**Fix**: Add type conversion or fix type definition
|
|
31
|
+
|
|
32
|
+
### Undefined Errors
|
|
33
|
+
```
|
|
34
|
+
undefined: identifier
|
|
35
|
+
```
|
|
36
|
+
**Fix**: Import package, define variable, or fix typo
|
|
37
|
+
|
|
38
|
+
### Vet Errors
|
|
39
|
+
```
|
|
40
|
+
printf: call has arguments but no formatting directives
|
|
41
|
+
```
|
|
42
|
+
**Fix**: Add format directive or remove arguments
|
|
43
|
+
|
|
44
|
+
## Fix Order
|
|
45
|
+
|
|
46
|
+
1. **Import errors** - Fix or remove imports
|
|
47
|
+
2. **Type definitions** - Ensure types exist
|
|
48
|
+
3. **Function signatures** - Match parameters
|
|
49
|
+
4. **Vet warnings** - Address static analysis
|
|
50
|
+
|
|
51
|
+
## Build Commands
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
# Build all packages
|
|
55
|
+
go build ./...
|
|
56
|
+
|
|
57
|
+
# Build with race detector
|
|
58
|
+
go build -race ./...
|
|
59
|
+
|
|
60
|
+
# Build for specific OS/arch
|
|
61
|
+
GOOS=linux GOARCH=amd64 go build ./...
|
|
62
|
+
|
|
63
|
+
# Run go vet
|
|
64
|
+
go vet ./...
|
|
65
|
+
|
|
66
|
+
# Run staticcheck
|
|
67
|
+
staticcheck ./...
|
|
68
|
+
|
|
69
|
+
# Format code
|
|
70
|
+
gofmt -w .
|
|
71
|
+
|
|
72
|
+
# Tidy dependencies
|
|
73
|
+
go mod tidy
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
## Verification
|
|
77
|
+
|
|
78
|
+
After fixes:
|
|
79
|
+
```bash
|
|
80
|
+
go build ./... # Should succeed
|
|
81
|
+
go vet ./... # Should have no warnings
|
|
82
|
+
go test ./... # Tests should pass
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
---
|
|
86
|
+
|
|
87
|
+
**IMPORTANT**: Fix errors only. No refactoring, no improvements. Get the build green with minimal changes.
|
|
package/harness/commands/go-review.md
@@ -0,0 +1,71 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Go code review for idiomatic patterns
|
|
3
|
+
agent: review-go
|
|
4
|
+
subtask: true
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Go Review Command
|
|
8
|
+
|
|
9
|
+
Review Go code for idiomatic patterns and best practices: $ARGUMENTS
|
|
10
|
+
|
|
11
|
+
## Your Task
|
|
12
|
+
|
|
13
|
+
1. **Analyze Go code** for idioms and patterns
|
|
14
|
+
2. **Check concurrency** - goroutines, channels, mutexes
|
|
15
|
+
3. **Review error handling** - proper error wrapping
|
|
16
|
+
4. **Verify performance** - allocations, bottlenecks
|
|
17
|
+
|
|
18
|
+
## Review Checklist
|
|
19
|
+
|
|
20
|
+
### Idiomatic Go
|
|
21
|
+
- [ ] Package naming (lowercase, no underscores)
|
|
22
|
+
- [ ] Variable naming (camelCase, short)
|
|
23
|
+
- [ ] Interface naming (ends with -er)
|
|
24
|
+
- [ ] Error naming (starts with Err)
|
|
25
|
+
|
|
26
|
+
### Error Handling
|
|
27
|
+
- [ ] Errors are checked, not ignored
|
|
28
|
+
- [ ] Errors wrapped with context (`fmt.Errorf("...: %w", err)`)
|
|
29
|
+
- [ ] Sentinel errors used appropriately
|
|
30
|
+
- [ ] Custom error types when needed
|
|
31
|
+
|
|
32
|
+
### Concurrency
|
|
33
|
+
- [ ] Goroutines properly managed
|
|
34
|
+
- [ ] Channels buffered appropriately
|
|
35
|
+
- [ ] No data races (use `-race` flag)
|
|
36
|
+
- [ ] Context passed for cancellation
|
|
37
|
+
- [ ] WaitGroups used correctly
|
|
38
|
+
|
|
39
|
+
### Performance
|
|
40
|
+
- [ ] Avoid unnecessary allocations
|
|
41
|
+
- [ ] Use `sync.Pool` for frequent allocations
|
|
42
|
+
- [ ] Prefer value receivers for small structs
|
|
43
|
+
- [ ] Buffer I/O operations
|
|
44
|
+
|
|
45
|
+
### Code Organization
|
|
46
|
+
- [ ] Small, focused packages
|
|
47
|
+
- [ ] Clear dependency direction
|
|
48
|
+
- [ ] Internal packages for private code
|
|
49
|
+
- [ ] Godoc comments on exports
|
|
50
|
+
|
|
51
|
+
## Report Format
|
|
52
|
+
|
|
53
|
+
### Idiomatic Issues
|
|
54
|
+
- [file:line] Issue description
|
|
55
|
+
Suggestion: How to fix
|
|
56
|
+
|
|
57
|
+
### Error Handling Issues
|
|
58
|
+
- [file:line] Issue description
|
|
59
|
+
Suggestion: How to fix
|
|
60
|
+
|
|
61
|
+
### Concurrency Issues
|
|
62
|
+
- [file:line] Issue description
|
|
63
|
+
Suggestion: How to fix
|
|
64
|
+
|
|
65
|
+
### Performance Issues
|
|
66
|
+
- [file:line] Issue description
|
|
67
|
+
Suggestion: How to fix
|
|
68
|
+
|
|
69
|
+
---
|
|
70
|
+
|
|
71
|
+
**TIP**: Run `go vet` and `staticcheck` for additional automated checks.
|
|
package/harness/commands/harness-audit.md
@@ -0,0 +1,90 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Run a self-evaluation audit of the OpenHermes harness
|
|
3
|
+
agent: harness-optimizer
|
|
4
|
+
subtask: true
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Harness Audit Command
|
|
8
|
+
|
|
9
|
+
Run a self-evaluation audit of the OpenHermes harness and return a prioritized scorecard.
|
|
10
|
+
|
|
11
|
+
## Usage
|
|
12
|
+
|
|
13
|
+
`/harness-audit [scope] [--format text|json] [--root path]`
|
|
14
|
+
|
|
15
|
+
- `scope` (optional): `repo` (default), `hooks`, `skills`, `commands`, `agents`
|
|
16
|
+
- `--format`: output style (`text` default, `json` for automation)
|
|
17
|
+
- `--root`: audit a specific path instead of the current working directory
|
|
18
|
+
|
|
19
|
+
## Self-Evaluation Checklist
|
|
20
|
+
|
|
21
|
+
Evaluate each category by inspecting the harness files directly. No external script needed.
|
|
22
|
+
|
|
23
|
+
### 1. Tool Coverage (0-10)
|
|
24
|
+
- [ ] Commands exist for all subagent types
|
|
25
|
+
- [ ] Each command has correct frontmatter (description, agent, subtask)
|
|
26
|
+
- [ ] Agent mapping table is complete and accurate
|
|
27
|
+
- [ ] Language-specific agents exist (Go, Rust)
|
|
28
|
+
- [ ] All command files are discoverable
|
|
29
|
+
|
|
30
|
+
### 2. Context Efficiency (0-10)
|
|
31
|
+
- [ ] Commands are concise (under 100 lines each)
|
|
32
|
+
- [ ] No redundant or overlapping command content
|
|
33
|
+
- [ ] Delegation instructions are clear
|
|
34
|
+
- [ ] Subagent handoffs minimize context overhead
|
|
35
|
+
- [ ] Templates provide structured output formats
|
|
36
|
+
|
|
37
|
+
### 3. Quality Gates (0-10)
|
|
38
|
+
- [ ] verify.md has comprehensive checklist
|
|
39
|
+
- [ ] quality-gate.md covers lint/type/build
|
|
40
|
+
- [ ] test-coverage.md has meaningful targets
|
|
41
|
+
- [ ] eval.md has structured scoring framework
|
|
42
|
+
- [ ] checkpoint.md enables state tracking
|
|
43
|
+
|
|
44
|
+
### 4. Memory Persistence (0-10)
|
|
45
|
+
- [ ] Memory tools documented (add_memory/fetch_memory/list_memory/latest_memory/search_memory)
|
|
46
|
+
- [ ] Checkpoint command references memory persistence
|
|
47
|
+
- [ ] Mistake/audit logging workflow documented
|
|
48
|
+
- [ ] Recall cache strategy defined
|
|
49
|
+
|
|
50
|
+
### 5. Eval Coverage (0-10)
|
|
51
|
+
- [ ] eval.md supports binary/scalar/rubric grading
|
|
52
|
+
- [ ] Acceptance criteria framework in place
|
|
53
|
+
- [ ] Pass@K metrics for non-deterministic evals
|
|
54
|
+
- [ ] Evaluation report format defined
|
|
55
|
+
|
|
56
|
+
### 6. Security Guardrails (0-10)
|
|
57
|
+
- [ ] verify.md includes security checklist items
|
|
58
|
+
- [ ] No hardcoded secrets guidance present
|
|
59
|
+
- [ ] Input validation guidance included
|
|
60
|
+
- [ ] SQL injection / XSS risks addressed
|
|
61
|
+
|
|
62
|
+
### 7. Cost Efficiency (0-10)
|
|
63
|
+
- [ ] model-route.md has budget tiers
|
|
64
|
+
- [ ] Model routing heuristic defined
|
|
65
|
+
- [ ] Subagent usage reduces main-context tokens
|
|
66
|
+
- [ ] Compression workflow documented
|
|
67
|
+
|
|
68
|
+
## Output Contract
|
|
69
|
+
|
|
70
|
+
Return:
|
|
71
|
+
|
|
72
|
+
1. `overall_score` out of `max_score` (70 for `repo`; smaller for scoped audits)
|
|
73
|
+
2. Category scores and concrete findings
|
|
74
|
+
3. Failed checks with exact file paths
|
|
75
|
+
4. Top 3 actions to improve
|
|
76
|
+
5. Suggested commands to apply next
|
|
77
|
+
|
|
78
|
+
## Example Result
|
|
79
|
+
|
|
80
|
+
```text
|
|
81
|
+
Harness Audit (repo): 66/70
|
|
82
|
+
- Tool Coverage: 10/10 (10/10 pts)
|
|
83
|
+
- Context Efficiency: 9/10 (9/10 pts)
|
|
84
|
+
- Quality Gates: 10/10 (10/10 pts)
|
|
85
|
+
|
|
86
|
+
Top 3 Actions:
|
|
87
|
+
1) [Security Guardrails] Add preflight security guard instructions in verify.md.
|
|
88
|
+
2) [Tool Coverage] Ensure all subagent types have corresponding command files.
|
|
89
|
+
3) [Eval Coverage] Add evaluation templates and examples.
|
|
90
|
+
```
|
|
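The output contract above amounts to summing seven 0-10 category scores into an `overall_score` out of 70 and naming the top three actions. The sketch below shows one way to compute that. The `scorecard` function is hypothetical, ranking actions by lowest category score is an assumption (the command file does not say how to prioritize), and only the first three category scores come from the example result; the remaining four are sample values chosen so the total matches the example's 66/70.

```javascript
// Hypothetical sketch: aggregate category scores into the audit scorecard.
function scorecard(categories) {
  const maxScore = categories.length * 10
  const overall = categories.reduce((sum, c) => sum + c.score, 0)
  // Assumption: the weakest categories become the top actions.
  const topActions = [...categories]
    .filter((c) => c.score < 10)
    .sort((a, b) => a.score - b.score)
    .slice(0, 3)
    .map((c) => c.name)
  return { overall, maxScore, topActions }
}

const categories = [
  { name: "Tool Coverage", score: 10 },
  { name: "Context Efficiency", score: 9 },
  { name: "Quality Gates", score: 10 },
  { name: "Memory Persistence", score: 10 }, // sample value
  { name: "Eval Coverage", score: 9 },       // sample value
  { name: "Security Guardrails", score: 8 }, // sample value
  { name: "Cost Efficiency", score: 10 },    // sample value
]

const card = scorecard(categories)
console.log(`Harness Audit (repo): ${card.overall}/${card.maxScore}`) // Harness Audit (repo): 66/70
console.log(card.topActions[0]) // Security Guardrails
```

Deriving `maxScore` from the number of categories also covers scoped audits, where fewer categories apply and the maximum is smaller than 70.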
package/harness/commands/learn.md
CHANGED
@@ -11,7 +11,7 @@ Create a new reusable skill from recent work patterns. $ARGUMENTS
|
|
|
11
11
|
## Your Task
|
|
12
12
|
|
|
13
13
|
1. **Search backlog** for pending skill candidates:
|
|
14
|
-
- Use `
|
|
14
|
+
- Use `search_memory` with query="skill-candidate" classes=["backlog"]
|
|
15
15
|
- If $ARGUMENTS is non-empty, narrow search to that topic
|
|
16
16
|
2. **Analyze the candidate** — what pattern did the session reveal?
|
|
17
17
|
3. **Create the skill**:
|
|
@@ -19,7 +19,7 @@ Create a new reusable skill from recent work patterns. $ARGUMENTS
|
|
|
19
19
|
- Follow its instructions to create a new SKILL.md
|
|
20
20
|
- Target: `%USERPROFILE%\.config\opencode\skills\<name>\SKILL.md`
|
|
21
21
|
- Naming: lowercase, hyphenated, descriptive
|
|
22
|
-
4. **Close the backlog entry**: `
|
|
22
|
+
4. **Close the backlog entry**: `add_memory(class="backlog", id="<candidate-id>", data={..., status:"closed"})`
|
|
23
23
|
5. **Report**: What skill was created, where, and what it does
|
|
24
24
|
|
|
25
25
|
## Skill Requirements
|