openhermes 2.6.1 → 4.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (158)
  1. package/CONTEXT.md +18 -0
  2. package/ETHOS.md +15 -0
  3. package/README.md +135 -292
  4. package/bootstrap.mjs +174 -499
  5. package/harness/agents/openhermes.md +87 -0
  6. package/harness/codex/CONSTITUTION.md +70 -148
  7. package/harness/codex/ROUTING.md +126 -0
  8. package/harness/commands/oh-doctor.md +26 -0
  9. package/harness/instructions/CONVENTIONS.md +206 -206
  10. package/harness/instructions/RUNTIME.md +54 -31
  11. package/harness/skills/oh-builder/SKILL.md +98 -0
  12. package/harness/skills/oh-caveman/SKILL.md +33 -0
  13. package/harness/skills/oh-expert/SKILL.md +121 -0
  14. package/harness/skills/oh-freeze/SKILL.md +28 -0
  15. package/harness/skills/oh-gauntlet/SKILL.md +119 -0
  16. package/harness/skills/oh-grill/SKILL.md +77 -0
  17. package/harness/skills/oh-guard/SKILL.md +33 -0
  18. package/harness/skills/oh-handoff/SKILL.md +33 -0
  19. package/harness/skills/oh-health/SKILL.md +90 -0
  20. package/harness/skills/oh-init/SKILL.md +78 -0
  21. package/harness/skills/oh-investigate/SKILL.md +35 -0
  22. package/harness/skills/oh-issue/SKILL.md +36 -0
  23. package/harness/skills/oh-learn/SKILL.md +28 -0
  24. package/harness/skills/oh-manifest/SKILL.md +84 -0
  25. package/harness/skills/oh-plan-review/SKILL.md +128 -0
  26. package/harness/skills/oh-planner/SKILL.md +157 -0
  27. package/harness/skills/oh-prd/SKILL.md +35 -0
  28. package/harness/skills/oh-retro/SKILL.md +33 -0
  29. package/harness/skills/oh-review/SKILL.md +110 -0
  30. package/harness/skills/oh-security/SKILL.md +110 -0
  31. package/harness/skills/oh-ship/SKILL.md +39 -0
  32. package/harness/skills/oh-skill-craft/SKILL.md +107 -0
  33. package/harness/skills/oh-skills-link/SKILL.md +29 -0
  34. package/harness/skills/oh-skills-list/SKILL.md +31 -0
  35. package/harness/skills/oh-triage/SKILL.md +36 -0
  36. package/index.mjs +3 -58
  37. package/lib/harness-resolver.mjs +77 -0
  38. package/lib/logger.mjs +62 -0
  39. package/package.json +49 -53
  40. package/test/plugins-behavioral.test.mjs +64 -0
  41. package/test/plugins.test.mjs +62 -0
  42. package/autorecall.mjs +0 -237
  43. package/curator.mjs +0 -455
  44. package/harness/commands/build-fix.md +0 -60
  45. package/harness/commands/checkpoint.md +0 -68
  46. package/harness/commands/code-review.md +0 -71
  47. package/harness/commands/doctor.md +0 -42
  48. package/harness/commands/eval.md +0 -89
  49. package/harness/commands/go-build.md +0 -87
  50. package/harness/commands/go-review.md +0 -71
  51. package/harness/commands/harness-audit.md +0 -90
  52. package/harness/commands/learn.md +0 -37
  53. package/harness/commands/loop-start.md +0 -38
  54. package/harness/commands/loop-status.md +0 -30
  55. package/harness/commands/memory-search.md +0 -37
  56. package/harness/commands/model-route.md +0 -32
  57. package/harness/commands/ohc.md +0 -13
  58. package/harness/commands/orchestrate.md +0 -88
  59. package/harness/commands/plan.md +0 -53
  60. package/harness/commands/quality-gate.md +0 -35
  61. package/harness/commands/refactor-clean.md +0 -102
  62. package/harness/commands/rust-build.md +0 -78
  63. package/harness/commands/rust-review.md +0 -65
  64. package/harness/commands/security.md +0 -93
  65. package/harness/commands/setup-pm.md +0 -65
  66. package/harness/commands/skill-create.md +0 -99
  67. package/harness/commands/test-coverage.md +0 -80
  68. package/harness/commands/update-codemaps.md +0 -81
  69. package/harness/commands/update-docs.md +0 -67
  70. package/harness/commands/verify.md +0 -68
  71. package/harness/prompts/architect.txt +0 -189
  72. package/harness/prompts/build-cpp.md +0 -98
  73. package/harness/prompts/build-error-resolver.md +0 -44
  74. package/harness/prompts/build-go.md +0 -340
  75. package/harness/prompts/build-java.md +0 -140
  76. package/harness/prompts/build-kotlin.md +0 -137
  77. package/harness/prompts/build-rust.md +0 -108
  78. package/harness/prompts/code-reviewer.md +0 -40
  79. package/harness/prompts/doc-updater.md +0 -206
  80. package/harness/prompts/docs-lookup.md +0 -71
  81. package/harness/prompts/e2e-runner.txt +0 -317
  82. package/harness/prompts/explore.md +0 -42
  83. package/harness/prompts/harness-optimizer.md +0 -42
  84. package/harness/prompts/loop-operator.md +0 -53
  85. package/harness/prompts/planner.md +0 -37
  86. package/harness/prompts/refactor-cleaner.md +0 -256
  87. package/harness/prompts/review-cpp.md +0 -81
  88. package/harness/prompts/review-database.md +0 -261
  89. package/harness/prompts/review-go.md +0 -257
  90. package/harness/prompts/review-java.md +0 -113
  91. package/harness/prompts/review-kotlin.md +0 -143
  92. package/harness/prompts/review-python.md +0 -101
  93. package/harness/prompts/review-rust.md +0 -77
  94. package/harness/prompts/security-reviewer.md +0 -42
  95. package/harness/prompts/tdd-guide.md +0 -228
  96. package/harness/rules/audit.md +0 -84
  97. package/harness/rules/checkpointing.md +0 -75
  98. package/harness/rules/context-loading.md +0 -33
  99. package/harness/rules/credential-exposure.md +0 -0
  100. package/harness/rules/delegation.md +0 -80
  101. package/harness/rules/handoff.md +0 -267
  102. package/harness/rules/memory-management.md +0 -28
  103. package/harness/rules/precedence.md +0 -52
  104. package/harness/rules/promotion.md +0 -46
  105. package/harness/rules/ranking.md +0 -64
  106. package/harness/rules/retrieval.md +0 -94
  107. package/harness/rules/runtime-guards.md +0 -196
  108. package/harness/rules/self-heal.md +0 -79
  109. package/harness/rules/session-start.md +0 -34
  110. package/harness/rules/skills-management.md +0 -165
  111. package/harness/rules/state-drift.md +0 -192
  112. package/harness/rules/verification.md +0 -88
  113. package/harness/scripts/sync-commands.mjs +0 -259
  114. package/harness/skills/.bundled_manifest +0 -17
  115. package/harness/skills/.usage.json +0 -6
  116. package/harness/skills/api-design/SKILL.md +0 -523
  117. package/harness/skills/backend-patterns/SKILL.md +0 -598
  118. package/harness/skills/coding-standards/SKILL.md +0 -549
  119. package/harness/skills/e2e-testing/SKILL.md +0 -326
  120. package/harness/skills/frontend-patterns/SKILL.md +0 -642
  121. package/harness/skills/frontend-slides/SKILL.md +0 -184
  122. package/harness/skills/security-review/SKILL.md +0 -495
  123. package/harness/skills/strategic-compact/SKILL.md +0 -131
  124. package/harness/skills/tdd-workflow/SKILL.md +0 -463
  125. package/harness/skills/verification-loop/SKILL.md +0 -126
  126. package/lib/ambient-memory.mjs +0 -167
  127. package/lib/handoff.mjs +0 -176
  128. package/lib/hardening.mjs +0 -128
  129. package/lib/memory-tools-plugin.mjs +0 -365
  130. package/lib/ohc/block-sync.mjs +0 -69
  131. package/lib/ohc/compress/search.mjs +0 -152
  132. package/lib/ohc/compress/state.mjs +0 -76
  133. package/lib/ohc/config.mjs +0 -186
  134. package/lib/ohc/message-ids.mjs +0 -168
  135. package/lib/ohc/notify.mjs +0 -154
  136. package/lib/ohc/protected-patterns.mjs +0 -54
  137. package/lib/ohc/prune-apply.mjs +0 -134
  138. package/lib/ohc/pruner.mjs +0 -610
  139. package/lib/ohc/reaper.mjs +0 -70
  140. package/lib/ohc/state.mjs +0 -266
  141. package/lib/ohc/strategies/deduplication.mjs +0 -72
  142. package/lib/ohc/strategies/index.mjs +0 -2
  143. package/lib/ohc/strategies/purge-errors.mjs +0 -43
  144. package/lib/ohc/token-utils.mjs +0 -26
  145. package/lib/ohc/updater.mjs +0 -133
  146. package/lib/paths.mjs +0 -50
  147. package/lib/schema-validator.mjs +0 -77
  148. package/lib/search.mjs +0 -48
  149. package/schemas/audit.schema.json +0 -82
  150. package/schemas/backlog.schema.json +0 -63
  151. package/schemas/checkpoint.schema.json +0 -65
  152. package/schemas/constraint.schema.json +0 -62
  153. package/schemas/decision.schema.json +0 -63
  154. package/schemas/instinct.schema.json +0 -63
  155. package/schemas/loop-state.schema.json +0 -33
  156. package/schemas/mistake.schema.json +0 -64
  157. package/schemas/verification_receipt.schema.json +0 -88
  158. package/skill-builder.mjs +0 -88
@@ -0,0 +1,28 @@
1
+ ---
2
+ name: oh-learn
3
+ description: "Review, search, prune, and export session learnings"
4
+ ---
5
+
6
+ # oh-learn
7
+
8
+ ## When to Use
9
+ To review what the agent has learned across sessions, search for specific patterns, prune stale knowledge, or export learnings for documentation.
10
+
11
+ ## Workflow
12
+ 1. **Review** — show recent learnings with context
13
+ 2. **Search** — find learnings matching specific topics or patterns
14
+ 3. **Prune** — remove stale, redundant, or superseded learnings
15
+ 4. **Export** — format learnings for documentation or sharing
16
+
17
+ ## Anti-patterns
18
+ - Hoarding every observation (most things aren't learnings)
19
+ - Never pruning (stale knowledge is worse than no knowledge)
20
+ - Storing what, not why (context-less facts are forgettable)
21
+
22
+ ## Routing
23
+
24
+ | Outcome | Route |
25
+ |---------|-------|
26
+ | pass | → [done — read-only report] |
27
+ | fail | → [surface gaps to user] |
28
+ | blocker | → surface to user |
@@ -0,0 +1,84 @@
1
+ ---
2
+ name: oh-manifest
3
+ description: "Full build loop: plan → build → verify → loop until done or blocker. Orchestrates oh-planner + oh-builder with auto-decisions."
4
+ tier: 4
5
+ benefits-from: [oh-planner, oh-builder, oh-expert]
6
+ triggers:
7
+ - "manifest"
8
+ - "full build"
9
+ - "build loop"
10
+ - "build until done"
11
+ - "orchestrate"
12
+ - "pipeline"
13
+ - "run the plan"
14
+ ---
15
+
16
+ # oh-manifest
17
+
18
+ Full build orchestration loop. Runs planner → builder → verify → repeat until done or a blocker is surfaced. Uses gstack decision principles to auto-resolve intermediate questions. Only interrupts the user for genuine blockers.
19
+
20
+ ## Pipeline
21
+
22
+ ### Step 1: Plan
23
+ - If `.opencode/plan.md` exists, load and verify it is current
24
+ - If not, run `oh-planner` (Mode A, B, or C depending on context)
25
+ - Auto-decide minor scope decisions using decision principles
26
+ - Surface only: premises that need human judgment, or plan/alternative conflicts
27
+
28
+ ### Step 2: Build
29
+ - For each phase in plan.md, run `oh-builder` (Mode D: From Plan)
30
+ - Implements phases in dependency order
31
+ - Parallelizable phases may be delegated to sub-agents
32
+ - Auto-decide implementation choices using decision principles
33
+
34
+ ### Step 3: Verify
35
+ - Check each phase against its verification criteria in plan.md
36
+ - Run tests if they exist
37
+ - If phase passes: mark complete in plan.md, proceed to next
38
+ - If phase fails: diagnose (use oh-expert self-diagnosis), fix, re-verify
39
+ - If fix is impossible within scope: surface blocker
40
+
41
+ ### Step 4: Loop or Done
42
+ - All phases complete and verified → DONE
43
+ - Phase failed and cannot be fixed → BLOCKER (surface to user with context)
44
+ - Phase passed but new work discovered → add to plan, continue loop
45
+
46
+ ## Decision Principles
47
+
48
+ Auto-resolve these without asking the user:
49
+
50
+ 1. **Completeness over cleverness** — cover more cases
51
+ 2. **Boil the lake** — fix blast radius, not symptom
52
+ 3. **Pragmatic over perfect** — cleaner option that ships today
53
+ 4. **DRY but not premature** — third instance is the time to abstract
54
+ 5. **Explicit over implicit** — clear code over magic
55
+ 6. **Bias toward action** — when in doubt, make progress
56
+
57
+ Surface to user only:
58
+ - **Premises** — fundamental assumptions that change the nature of the build
59
+ - **Dead end** — all viable paths have significant trade-offs
60
+ - **Cross-model disagreement** — two approaches both have strong arguments
61
+
62
+ ## Blocker Protocol
63
+
64
+ When a blocker is encountered:
65
+
66
+ 1. **Describe the blocker** — what was attempted, what failed, why it cannot proceed
67
+ 2. **Propose alternatives** — scope reduction, dependency change, architectural shift
68
+ 3. **Surface to user** with: `BLOCKER: <description> | Options: <A, B, C>`
69
+ 4. **Wait for user decision** before continuing
70
+
71
+ ## Anti-patterns
72
+ - Auto-deciding premises (fundamental assumptions need user input)
73
+ - Pushing through blockers (surface immediately, don't try 5 workarounds silently)
74
+ - Skipping verification (verify every phase, not just the final result)
75
+ - Parallelizing dependent phases (respect the dependency order in plan.md)
76
+ - Forgetting to update plan.md with completion status
77
+
78
+ ## Routing
79
+
80
+ | Outcome | Route |
81
+ |---------|-------|
82
+ | pass | → pipeline continues (planner→builder→gauntlet→ship) |
83
+ | fail | → oh-expert (diagnose loop failure) |
84
+ | blocker | → surface to user with context and options |
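
The oh-manifest pipeline in the hunk above reads as a small control loop. Below is a minimal Python sketch of that loop, not the package's implementation. The `Phase` record, the `build`/`verify` callables, and the `max_fix_attempts` limit are all hypothetical; the skill file specifies none of them. The sketch only illustrates dependency ordering, verification after every phase, and the `BLOCKER: <description> | Options: <A, B, C>` message shape.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Phase:
    # Hypothetical stand-in for one phase entry in plan.md.
    name: str
    deps: List[str] = field(default_factory=list)
    done: bool = False


def format_blocker(description: str, options: List[str]) -> str:
    # Mirrors the "BLOCKER: <description> | Options: <A, B, C>" shape.
    return f"BLOCKER: {description} | Options: {', '.join(options)}"


def run_pipeline(
    phases: List[Phase],
    build: Callable[[Phase], None],
    verify: Callable[[Phase], bool],
    max_fix_attempts: int = 1,  # assumed limit; the skill does not give one
) -> str:
    by_name = {p.name: p for p in phases}
    while any(not p.done for p in phases):
        # A phase is ready once every dependency is known and complete.
        ready = [
            p for p in phases
            if not p.done
            and all(d in by_name and by_name[d].done for d in p.deps)
        ]
        if not ready:
            return format_blocker(
                "no runnable phase (unmet or cyclic dependencies)",
                ["reorder plan", "drop phase"],
            )
        phase = ready[0]
        # Build, verify, and retry a bounded number of times before surfacing.
        for _ in range(1 + max_fix_attempts):
            build(phase)
            if verify(phase):
                phase.done = True
                break
        else:
            return format_blocker(
                f"phase '{phase.name}' failed verification",
                ["reduce scope", "change dependency"],
            )
    return "DONE"
```

Returning the blocker string instead of raising keeps the "surface to user, then wait" step in the caller's hands, which matches the Blocker Protocol's requirement to stop before continuing.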
@@ -0,0 +1,128 @@
1
+ ---
2
+ name: oh-plan-review
3
+ description: "Multi-lens plan review: 4 perspectives in one skill. Choose Engineering (architecture/scope), Design (UX/interaction), DX (API/CLI ergonomics), or Strategy (product/CEO). Interactive — walks through findings one section at a time."
4
+ tier: 3
5
+ benefits-from: [oh-planner, oh-expert]
6
+ triggers:
7
+ - "plan review"
8
+ - "review the plan"
9
+ - "architecture review"
10
+ - "design review"
11
+ - "ux review"
12
+ - "dx review"
13
+ - "strategy review"
14
+ - "eng review"
15
+ - "ceo review"
16
+ ---
17
+
18
+ # oh-plan-review
19
+
20
+ Four review lenses in one skill. Pick the lens that fits the plan's scope — or run multiple lenses in sequence for thorough coverage.
21
+
22
+ **Interactive.** Walk through findings one section at a time, with opinionated recommendations and AskUserQuestion gates. Never dump all findings at once.
23
+
24
+ **Read-only.** No code changes. The output is a better plan, not a document about the plan.
25
+
26
+ ## Lens Selection
27
+
28
+ Ask the user which lens fits, or auto-detect from plan content:
29
+
30
+ | Trigger keywords | Recommended lens |
31
+ |---|---|
32
+ | architecture, data model, API design, file structure, types, modules | Engineering |
33
+ | UI, layout, colors, components, screens, mockups, user interface | Design |
34
+ | CLI, SDK, developer tool, API, npm package, documentation, onboarding | DX |
35
+ | product, strategy, scope, roadmap, competition, business model | Strategy |
36
+
37
+ ### Engineering Lens
38
+ Scope challenge, architecture review, cognitive patterns for eng managers.
39
+
40
+ **Scope Challenge** — Before reviewing anything:
41
+ 1. What existing code already partially solves each sub-problem?
42
+ 2. What is the minimum set of changes that achieves the stated goal?
43
+ 3. Complexity check: 8+ files or 2+ new classes/services → smell. Challenge it.
44
+ 4. Search check: does the runtime/framework have built-in support for each pattern the plan introduces?
45
+ 5. Completeness check: with AI-assisted coding, completeness costs 10-100x less than it does for human teams. Recommend complete lakes over shortcuts.
46
+ 6. Distribution check: new artifact types need build/publish pipelines.
47
+
48
+ **Architecture Review** — Walk through one section at a time: Architecture → Code Quality → Tests → Performance. Max 8 top issues per section. Use AskUserQuestion to discuss each finding.
49
+
50
+ **Anti-skip rule:** Never condense or skip a section. If a section has zero findings, say so — but evaluate it.
51
+
52
+ **Cognitive patterns** (internalize, don't enumerate):
53
+ - State diagnosis (Larson) — Is your team falling behind, treading water, repaying debt, or innovating?
54
+ - Blast radius instinct — What's the worst case and how many systems does it affect?
55
+ - Boring by default (McKinley) — Proven technology unless you have innovation tokens to spend.
56
+ - Reversibility preference — Feature flags, incremental rollouts. Make wrong answers cheap.
57
+ - Essential vs accidental complexity (Brooks) — Is this solving a real problem or one we created?
58
+
59
+ ### Design Lens
60
+ UX review, interaction state coverage, AI slop detection.
61
+
62
+ **Evaluate:**
63
+ - Empty states — every screen without data needs warmth, action, context
64
+ - Visual hierarchy — what does the user see first, second, third?
65
+ - Edge cases — 47-char names, zero results, error states, first-time vs power user
66
+ - AI slop — generic card grids, hero sections, 3-column features? Flag them.
67
+ - Responsive — every viewport gets intentional design, not just stack-on-mobile
68
+ - Accessibility — keyboard nav, screen readers, contrast, touch targets
69
+
70
+ **Principle:** Specificity over vibes. "Clean, modern UI" is not a design decision. Name the font, spacing scale, interaction pattern, and motion.
71
+
72
+ ### DX Lens
73
+ Developer experience audit for APIs, CLIs, SDKs, libraries, platforms.
74
+
75
+ **Evaluate:**
76
+ - Time to Hello World — target < 2 minutes. Every extra minute drops adoption 20-30%.
77
+ - Error quality — every error = problem + cause + fix. No "something went wrong."
78
+ - First five minutes — one click to start. No credit card. No demo call.
79
+ - Progressive disclosure — simple case is production-ready. Complex case uses the same API.
80
+ - Pit of Success — make the right thing easy, the wrong thing hard.
81
+
82
+ **Three modes:**
83
+ - **DX Expansion** — competitive advantage. Design magical moments. Benchmark competitors.
84
+ - **DX Polish** — bulletproof every touchpoint. No friction, no uncertainty.
85
+ - **DX Triage** — critical gaps only. Minimum viable DX investment.
86
+
87
+ ### Strategy Lens
88
+ Product/CEO review with 4 scope modes.
89
+
90
+ **Select mode:**
91
+ - **Scope Expansion** — "What would make this 10x better for 2x the effort?" Push scope up. Present each expansion as an AskUserQuestion. The user opts in or out.
92
+ - **Selective Expansion** — Hold the baseline. Surface expansion opportunities for cherry-picking. Neutral recommendation posture.
93
+ - **Hold Scope** — Make it bulletproof. Catch every failure mode. No silent reduction or expansion.
94
+ - **Scope Reduction** — Find the minimum viable version. Be ruthless. Cut everything non-essential.
95
+
96
+ **Cognitive patterns** (internalize):
97
+ - Classification instinct (Bezos) — One-way vs two-way doors. Most things are two-way; move fast.
98
+ - Inversion reflex (Munger) — For every "how do we win?" also ask "what would make us fail?"
99
+ - Focus as subtraction (Jobs) — Default: do fewer things, better. 350 products → 10.
100
+ - Proxy skepticism (Bezos) — Are our metrics still serving users or self-referential?
101
+ - Temporal depth — Think in 5-10 year arcs. Apply regret minimization for major bets.
102
+
103
+ **Prime directives:**
104
+ - Zero silent failures. Every failure mode must be visible.
105
+ - Every error has a name. Don't say "handle errors." Name the exception class, trigger, catch, user-facing message.
106
+ - Every data flow has a happy path plus three shadow paths: nil, empty, upstream error. Trace all four.
107
+ - Observability is scope, not afterthought. New dashboards and alerts are first-class deliverables.
108
+ - Everything deferred must be written down. TODOS.md or it doesn't exist.
109
+ - You have permission to say "scrap it and do this instead."
110
+
111
+ ## Output
112
+
113
+ After each lens, the plan file (`.opencode/plan.md`) is updated with findings and decisions. The user reviews and accepts changes interactively.
114
+
115
+ ## Rules
116
+
117
+ - **Interactive only.** One section at a time. Use AskUserQuestion to discuss findings before writing.
118
+ - **Anti-skip:** Every section must be evaluated. If zero findings, say "No issues found" and move on.
119
+ - **Anti-shortcut:** The plan file is the OUTPUT of the interactive review, not a substitute for it. Findings go through AskUserQuestion before writing.
120
+ - **Commit to the chosen lens.** Once scope is agreed, don't re-argue earlier decisions in later sections.
121
+
122
+ ## Routing
123
+
124
+ | Outcome | Route |
125
+ |---------|-------|
126
+ | pass | → oh-grill (if concerns remain) or oh-manifest (execute plan) |
127
+ | fail | → oh-planner (revise plan based on findings) |
128
+ | blocker | → surface to user |
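
The "auto-detect from plan content" step in the Lens Selection table of oh-plan-review could work roughly as follows. This is a sketch under stated assumptions. The keyword lists are copied from the table, but the whole-word matching, the hit-count scoring, and the "ask the user on a tie or no match" fallback are illustrative choices that the skill file does not specify.

```python
import re
from typing import Dict, Optional

# Keyword lists copied from the Lens Selection table (lower-cased).
LENS_KEYWORDS = {
    "Engineering": ["architecture", "data model", "api design",
                    "file structure", "types", "modules"],
    "Design": ["ui", "layout", "colors", "components", "screens",
               "mockups", "user interface"],
    "DX": ["cli", "sdk", "developer tool", "api", "npm package",
           "documentation", "onboarding"],
    "Strategy": ["product", "strategy", "scope", "roadmap",
                 "competition", "business model"],
}


def score_lenses(plan_text: str) -> Dict[str, int]:
    # Count how many of each lens's keywords occur as whole words/phrases.
    text = plan_text.lower()
    return {
        lens: sum(
            1 for kw in keywords
            if re.search(rf"\b{re.escape(kw)}\b", text)
        )
        for lens, keywords in LENS_KEYWORDS.items()
    }


def recommend_lens(plan_text: str) -> Optional[str]:
    # Return the single best-scoring lens, or None when the caller
    # should fall back to asking the user (no hits, or a tie).
    scores = score_lenses(plan_text)
    best = max(scores.values())
    if best == 0:
        return None
    winners = [lens for lens, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None
```

Whole-word matching matters here because short keywords such as `ui` would otherwise match inside unrelated words like "build".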
@@ -0,0 +1,157 @@
1
+ ---
2
+ name: oh-planner
3
+ description: "ALL-arounder planner — brainstorm, architect, autoplan, decision pipeline. Produces a consumable plan artifact."
4
+ tier: 3
5
+ benefits-from: [oh-expert, oh-grill]
6
+ triggers:
7
+ - "plan this"
8
+ - "how should I build"
9
+ - "architecture"
10
+ - "design this feature"
11
+ - "brainstorm"
12
+ - "autoplan"
13
+ - "strategy"
14
+ - "scope this"
15
+ ---
16
+
17
+ # oh-planner
18
+
19
+ The ALL-arounder planner. Merges brainstorm, architecture analysis, strategy review, and automatic plan review into one skill. Produces `.opencode/plan.md` that oh-builder consumes.
20
+
21
+ ## Entry Modes
22
+
23
+ Use the mode that matches the user's starting point:
24
+
25
+ ### Mode A: Brainstorm (exploratory)
26
+ When the idea is fuzzy and needs shaping.
27
+
28
+ 1. **Demand reality** — who specifically needs this?
29
+ 2. **Status quo** — what do they do today?
30
+ 3. **Desperate specificity** — what's the one concrete thing they can't do?
31
+ 4. **Narrowest wedge** — what's the smallest useful version?
32
+ 5. **Observation** — what will you see/hear when it works?
33
+ 6. **Future-fit** — does this compound or plateau?
34
+
35
+ Output: structured design doc.
36
+
37
+ ### Mode B: Architecture Analysis (existing codebase)
38
+ When the codebase feels messy or you need to understand the surface.
39
+
40
+ 1. **Read the domain** — load CONTEXT.md, understand the language
41
+ 2. **Map the surface** — identify modules, boundaries, dependencies
42
+ 3. **Find deepening opportunities** — duplication, over-coupling, grown-beyond-purpose functions, missing abstractions
43
+ 4. **Rank by impact** — effort vs value, dependencies, risk
44
+
45
+ Output: ranked refactoring candidates.
46
+
47
+ ### Mode C: Structured Plan (non-trivial feature)
48
+ When requirements exist but need a plan document.
49
+
50
+ 1. **Scope challenge** — before reviewing anything, answer:
51
+ - What existing code already partially solves each sub-problem?
52
+ - What is the minimum set of changes that achieves the stated goal?
53
+ - **Complexity check:** 8+ files or 2+ new classes/services in a single phase → smell. Propose splitting or simplifying.
54
+ - **Search check:** for each architectural pattern or infrastructure component the plan introduces, check whether the runtime/framework has a built-in. Search for: `{framework} {pattern} built-in`. Flag custom solutions where built-ins exist.
55
+ - **Completeness check:** with AI-assisted coding, completeness is 10-100x cheaper than with human teams. If the plan shortcuts something to save human hours that only saves minutes with AI, recommend the complete version.
56
+ 2. **Strategy review** — challenge premises, identify scope decisions, consider 10x alternatives
57
+ 3. **Architecture review** — data flow, component boundaries, API surface, state model
58
+ 4. **Edge case analysis** — error states, concurrency, failure modes, security implications
59
+ 5. **Dependency mapping** — what blocks what, parallelizable work
60
+ 6. **Write plan.md** — structured artifact with phases, deps, verification steps
61
+
62
+ ### Mode D: Autoplan (plan exists, needs full review)
63
+ When a plan file exists and needs the full gauntlet. Auto-decides 90% of questions using decision principles. Surfaces only taste decisions at a final approval gate.
64
+
65
+ Runs in order: **Strategy → Architecture → Design → Engineering → DX**
66
+ Each phase must complete before the next begins.
67
+
68
+ ## Decision Principles
69
+
70
+ Use these to auto-resolve intermediate questions. Only surface to the user when options are genuinely close (taste decisions):
71
+
72
+ 1. **Completeness over cleverness** — Choose the option that covers more cases
73
+ 2. **Boil the lake** — Fix the blast radius, not the symptom
74
+ 3. **Pragmatic over perfect** — Cleaner option that ships today wins
75
+ 4. **DRY but not premature** — Reuse over rebuild, but don't abstract before the third instance
76
+ 5. **Explicit over implicit** — Clear code over magic
77
+ 6. **Bias toward action** — When in doubt, make progress
78
+
79
+ Never auto-decide: premises (need human judgment) or cases where both the plan and the alternative have strong arguments.
80
+
81
+ ## Plan Artifact
82
+
83
+ Output goes in `.opencode/plan.md` with this structure (matching the global AGENTS.md schema):
84
+
85
+ ```markdown
86
+ # PLAN: <project-name>
87
+
88
+ Plan ID: <project-name>-plan-<nnn>
89
+ Project: <project-name>
90
+ Status: active
91
+ Created: <local-date-time>
92
+ Updated: <local-date-time>
93
+ Project Path: <absolute-project-path>
94
+ Plan Path: .opencode/plan.md
95
+ Objective: <short objective>
96
+
97
+ ## Current State
98
+
99
+ <what exists now, what's missing>
100
+
101
+ ## Assumptions
102
+
103
+ - <assumption 1>
104
+ - <assumption 2>
105
+
106
+ ## Tasks
107
+
108
+ - [ ] Task 1
109
+ - [ ] Subtask 1.1
110
+
111
+ ## Active Task
112
+
113
+ <what's being worked on now>
114
+
115
+ ## Subagents
116
+
117
+ | Agent | Purpose | Status | Findings |
118
+ |---|---|---|---|
119
+
120
+ ## Completed
121
+
122
+ - <what's done>
123
+
124
+ ## Blockers
125
+
126
+ - None
127
+
128
+ ## Validation
129
+
130
+ - [ ] Static checks
131
+ - [ ] Unit tests
132
+ - [ ] Manual verification
133
+
134
+ ## Decisions
135
+
136
+ - <decision> — <rationale>
137
+
138
+ ## Notes
139
+
140
+ <anything else>
141
+ ```
142
+
143
+ ## Anti-patterns
144
+ - Skipping strategy review for complex features (architecture mistakes compound)
145
+ - Plans at wrong granularity — too vague to execute or too detailed to read
146
+ - Re-opening already-decided debates ("what if we rewrite in Rust?")
147
+ - Perfect being the enemy of shipped (progress > polish)
148
+ - Failing to flag taste decisions to the user
149
+ - Big bang rewrites — plan increments, not overhauls
150
+
151
+ ## Routing
152
+
153
+ | Outcome | Route |
154
+ |---------|-------|
155
+ | pass | → oh-grill (stress-test plan) |
156
+ | fail | → oh-planner (revise gaps) |
157
+ | blocker | → surface to user |
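
Since oh-builder consumes `.opencode/plan.md`, a consumer needs some way to read progress out of the `## Tasks` section of the template above. A minimal sketch, assuming completed items are marked with the usual markdown task-list `- [x]` syntax (the template itself only shows unchecked `- [ ]` items) and that the helper name `task_progress` is hypothetical:

```python
import re
from typing import Tuple

# Matches "- [ ] text" or "- [x] text", with optional leading indentation
# so that subtasks are counted too.
TASK_RE = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def task_progress(plan_md: str) -> Tuple[int, int]:
    """Return (completed, total) for checkboxes under '## Tasks' only."""
    completed = 0
    total = 0
    in_tasks = False
    for line in plan_md.splitlines():
        if line.startswith("## "):
            # Any level-2 heading opens or closes the Tasks section.
            in_tasks = line.strip() == "## Tasks"
            continue
        if not in_tasks:
            continue
        match = TASK_RE.match(line)
        if match:
            total += 1
            if match.group(1).lower() == "x":
                completed += 1
    return completed, total
```

Restricting the count to `## Tasks` keeps the checkboxes under `## Validation` from being reported as unfinished build work.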
@@ -0,0 +1,35 @@
1
+ ---
2
+ name: oh-prd
3
+ description: "Turn conversation context into a PRD and publish as GitHub issue"
4
+ ---
5
+
6
+ # oh-prd
7
+
8
+ ## When to Use
9
+ When a feature discussion has produced enough context to write a product requirements document. Captures the decision tree and outputs a structured issue.
10
+
11
+ ## Workflow
12
+ 1. Extract requirements from conversation history
13
+ 2. Structure into PRD format: problem statement, target users, requirements (must/should/could), out of scope
14
+ 3. Create as GitHub issue with `gh issue create`
15
+ 4. Add triage label for prioritisation
16
+
17
+ ## PRD Structure
18
+ - **Problem** — what problem does this solve?
19
+ - **Target users** — who benefits?
20
+ - **Requirements** — must have / should have / could have
21
+ - **Out of scope** — explicitly what's NOT included
22
+ - **Success metrics** — how will we know it works?
23
+
24
+ ## Anti-patterns
25
+ - Writing a PRD before understanding the problem
26
+ - Requirements that aren't testable ("fast" vs "loads in <200ms")
27
+ - Gold-plating — every feature is "must have"
28
+
29
+ ## Routing
30
+
31
+ | Outcome | Route |
32
+ |---------|-------|
33
+ | pass | → oh-issue (break PRD into actionable issues) |
34
+ | fail | → oh-grill (stress-test unclear requirements) |
35
+ | blocker | → surface to user |
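
Steps 2 to 4 of the oh-prd workflow (structure the PRD, create the issue, add a label) might be wired together as sketched below. The function names and the markdown heading layout are hypothetical. The label name `triage` is an assumption, and it is assumed to exist in the repository already. Only the `--title`, `--body`, and `--label` flags of `gh issue create` are relied on.

```python
from typing import Dict, List


def _bullets(items: List[str]) -> str:
    # Render a list as markdown bullets; an empty list becomes "- None".
    return "\n".join(f"- {item}" for item in items) if items else "- None"


def render_prd(
    problem: str,
    users: str,
    requirements: Dict[str, List[str]],  # keys: "must", "should", "could"
    out_of_scope: List[str],
    metrics: List[str],
) -> str:
    parts = [
        "## Problem", problem, "",
        "## Target users", users, "",
        "## Requirements",
    ]
    for level in ("must", "should", "could"):
        parts.append(f"### {level.capitalize()} have")
        parts.append(_bullets(requirements.get(level, [])))
    parts += ["", "## Out of scope", _bullets(out_of_scope)]
    parts += ["", "## Success metrics", _bullets(metrics)]
    return "\n".join(parts)


def gh_issue_command(title: str, body: str, label: str = "triage") -> List[str]:
    # Build an argv list (no shell quoting needed) for `gh issue create`.
    # The default label name is an assumption, not taken from the skill file.
    return ["gh", "issue", "create",
            "--title", title, "--body", body, "--label", label]
```

The argv list can be passed to `subprocess.run(cmd, check=True)`. Building a list instead of a shell string avoids quoting problems when the PRD body contains backticks or quotes.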
@@ -0,0 +1,33 @@
1
+ ---
2
+ name: oh-retro
3
+ description: "Weekly engineering retrospective — analyze commit history and work patterns"
4
+ ---
5
+
6
+ # oh-retro
7
+
8
+ ## When to Use
9
+ At the end of a sprint or work week. Analyzes what was shipped, how it went, and what to improve.
10
+
11
+ ## Workflow
12
+ 1. **Analyze commits** — read git log since last retro
13
+ 2. **Categorize work** — features, fixes, refactors, docs, chores
14
+ 3. **Pattern analysis** — recurring themes, bottlenecks, types of bugs
15
+ 4. **Praise** — call out good work, good patterns, good decisions
16
+ 5. **Growth areas** — what could be better, with specific suggestions
17
+ 6. **Trend tracking** — compare against previous retros
18
+
19
+ ## Output
20
+ Structured retro report with: shipped items, metrics, praise, growth areas, action items.
21
+
22
+ ## Anti-patterns
23
+ - Blame-focused retro (it's about process, not people)
24
+ - Action items without owners (no follow-through)
25
+ - Same retro every week (if nothing changed, why?)
26
+
27
+ ## Routing
28
+
29
+ | Outcome | Route |
30
+ |---------|-------|
31
+ | pass | → oh-planner (start next cycle with retro insights) |
32
+ | fail | → oh-handoff (document blockers for next session) |
33
+ | blocker | → surface to user |
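
The "Categorize work" step of oh-retro names five buckets (features, fixes, refactors, docs, chores) but not how commits are assigned to them. One possible approach, assuming the repository follows Conventional Commits prefixes (an assumption; the skill file does not say so), is to map the subject-line prefix to a bucket:

```python
from collections import Counter
from typing import List

# Assumed mapping from Conventional Commits types to the retro categories.
PREFIX_TO_CATEGORY = {
    "feat": "features",
    "fix": "fixes",
    "refactor": "refactors",
    "docs": "docs",
    "chore": "chores",
}


def categorize(subject: str) -> str:
    # "feat(ui)!: add x" -> type "feat"; subjects without "type:" fall through.
    head = subject.split(":", 1)[0] if ":" in subject else ""
    commit_type = head.split("(", 1)[0].rstrip("!").strip().lower()
    return PREFIX_TO_CATEGORY.get(commit_type, "uncategorized")


def summarize(subjects: List[str]) -> Counter:
    # Tally commit subjects per category for the retro report.
    return Counter(categorize(s) for s in subjects)
```

The subjects could come from something like `git log --since=<date> --pretty=%s`. The explicit `uncategorized` bucket makes it visible when a repository does not follow the assumed convention.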
@@ -0,0 +1,110 @@
1
+ ---
2
+ name: oh-review
3
+ description: "Two-axis code and design review: Standards (conformance) + Spec (fidelity) in parallel sub-agents. Includes architecture deepening analysis."
4
+ tier: 3
5
+ benefits-from: [oh-expert]
6
+ triggers:
7
+ - "review"
8
+ - "code review"
9
+ - "review since"
10
+ - "review changes"
11
+ - "pr review"
12
+ - "design review"
13
+ ---
14
+
15
+ # oh-review
16
+
17
+ Two-axis review of the diff between HEAD and a fixed point. Both axes run as parallel sub-agents, then findings are aggregated. Three modes: **Diff Review**, **Architecture Deepening**, or both in sequence.
18
+
19
+ ## When to Use
20
+ Before merging any PR or landing changes. When you need a quality gate that catches both code-quality violations and spec deviations.
21
+
22
+ ## Mode Selection
23
+ - **Diff Review** (default) — Standards + Spec review of a changeset
24
+ - **Architecture Deepening** — Surface refactoring opportunities in the codebase
25
+ - **Full Review** — Both: diff review first, then architecture deepening pass
26
+
27
+ ---
28
+
29
+ ## Mode A: Diff Review
30
+
31
+ ### 1. Pin the Fixed Point
32
+ The user provides a branch, commit SHA, or tag. Capture `git diff <fixed-point>...HEAD` and `git log <fixed-point>..HEAD --oneline`.
33
+
34
+ ### 2. Identify the Spec Source
35
+ Look for the originating spec in this order:
36
+ 1. Issue references in commit messages (`#123`, `Closes #45`) — fetch via `docs/agents/issue-tracker.md`
37
+ 2. A path the user passed as an argument
38
+ 3. A PRD/spec file under `docs/`, `specs/`, or `.scratch/`
39
+ 4. Ask the user
40
+
41
+ If no spec exists, the Spec sub-agent skips and reports "no spec available."
42
+
43
+ ### 3. Identify the Standards Sources
44
+ Collect all files documenting how code should be written:
45
+ - AGENTS.md, CLAUDE.md, CONTRIBUTING.md
46
+ - CONTEXT.md, ADRs
47
+ - eslint/biome/prettier config (note tool-enforced ones — don't re-check)
48
+ - Any STYLE.md, STANDARDS.md, STYLEGUIDE.md
49
+
50
+ ### 4. Spawn Both Sub-Agents (parallel)
51
+
52
+ **Standards sub-agent:** Read the standards docs and the diff. Report per-file/hunk every place the diff violates a documented standard. Cite the standard source + rule. Distinguish hard violations from judgement calls. Skip anything tooling enforces.
53
+
54
+ **Spec sub-agent:** Read the spec and the diff. Report: (a) requirements missing or partial, (b) scope creep, (c) requirements implemented but wrong. Quote the spec line for each finding.
55
+
56
+ ### 5. Aggregate
57
+ Present findings under `## Standards` and `## Spec` headings. Do NOT merge or rerank; the two axes are deliberately separate. End with a one-line summary: total findings per axis and the worst single issue.
58
+
59
+ ### Safety Check (always run inline before spawning sub-agents)
60
+ - SQL injection vectors
61
+ - LLM trust boundary violations
62
+ - Conditional side effects (test vs prod)
63
+ - Hardcoded secrets
64
+
65
+ Block immediately if a critical safety issue is found; do not spawn sub-agents.
66
+
67
+ ---
68
+
69
+ ## Mode B: Architecture Deepening
70
+
71
+ Surface deepening opportunities — refactors that turn shallow modules into deep ones. Uses the **deletion test**: if deleting a module would concentrate complexity (not just move it), the module is earning its keep. If complexity vanishes, the module was a pass-through.
72
+
73
+ ### Vocabulary
74
+ Use these terms exactly:
75
+ - **Module** — anything with an interface and an implementation
76
+ - **Depth** — leverage at the interface: lots of behavior behind a small interface
77
+ - **Seam** — where an interface lives; a place behavior can be altered without editing in place
78
+ - **Leverage** — what callers get from depth
79
+ - **Locality** — what maintainers get from depth: change concentrated in one place
80
+
81
+ ### Process
82
+ 1. **Explore** — Read CONTEXT.md and ADRs. Walk the codebase noting friction:
83
+ - Where does understanding one concept require bouncing between many small modules?
84
+ - Where are modules shallow (interface as complex as implementation)?
85
+ - Where are pure functions extracted for testability but real bugs hide in how they're called?
86
+ - Apply the deletion test to suspected shallow modules
87
+ 2. **Present candidates** — Numbered list. For each: files, problem, solution, benefits in terms of locality/leverage. Flag ADR conflicts.
88
+ 3. **Grilling loop** — Walk the design tree with the user. Side effects: update CONTEXT.md for new terms, offer ADRs for rejected candidates.
89
+ 4. **Output** — Ranked refactoring candidates with collision warnings.
90
+
91
+ ## Scoring
92
+ - Critical safety issue → block immediately (before sub-agents)
93
+ - Structural concern → changes requested
94
+ - Spec deviation → changes requested (with reference)
95
+ - Style/nit → note for follow-up
96
+
97
+ ## Anti-patterns
98
+ - Reviewing style before safety (wrong priority order)
99
+ - Rubber-stamping without reading the diff
100
+ - Requesting changes for subjective preferences
101
+ - Merging Standards and Spec findings (one axis masks the other)
102
+ - Proposing interfaces in deepening mode before the user picks a candidate
103
+
104
+ ## Routing
105
+
106
+ | Outcome | Route |
107
+ |---------|-------|
108
+ | pass | → oh-gauntlet (if code changes needed) or oh-ship |
109
+ | fail | → oh-builder (fix violations found) |
110
+ | blocker | → surface to user |
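
Two mechanical pieces of oh-review lend themselves to a short sketch: pulling issue references such as `#123` or `Closes #45` out of the `git log` output (Mode A, step 2), and turning finding kinds into an overall result per the Scoring list. The function names, the finding-kind labels, and the `pass` result for an empty findings list are assumptions made for illustration.

```python
import re
from typing import List

# "#123" not preceded by a word character or "/", so same-repo references
# match while "owner/repo#12" (cross-repo) and "abc#1" are skipped.
ISSUE_RE = re.compile(r"(?<![\w/])#(\d+)\b")


def issue_refs(log_text: str) -> List[int]:
    # Return referenced issue numbers in first-seen order, without repeats.
    seen: List[int] = []
    for match in ISSUE_RE.finditer(log_text):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


# Finding kind -> outcome, following the Scoring list.
KIND_TO_OUTCOME = {
    "safety": "block",
    "structural": "changes requested",
    "spec": "changes requested",
    "style": "note for follow-up",
}
OUTCOME_ORDER = ["block", "changes requested", "note for follow-up"]


def overall_outcome(kinds: List[str]) -> str:
    # The most severe outcome wins; no findings yields "pass" (assumed).
    outcomes = {KIND_TO_OUTCOME[kind] for kind in kinds}
    for outcome in OUTCOME_ORDER:
        if outcome in outcomes:
            return outcome
    return "pass"
```

Ordering the outcomes by severity reflects the anti-pattern noted above: style findings never outrank a safety or spec problem.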