openhermes (npm) 4.0.1 → 4.3.0

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between versions as they appear in their public registries.

Files changed (50)

package/ETHOS.md +6 -3
package/LICENSE +21 -21
package/README.md +111 -81
package/bootstrap.ts +405 -0
package/harness/agents/openhermes.md +45 -55
package/harness/codex/AUTOPILOT.md +126 -0
package/harness/codex/CONSTITUTION.md +14 -11
package/harness/codex/ROUTING.md +35 -69
package/harness/commands/oh-log.md +18 -0
package/harness/instructions/RUNTIME.md +27 -51
package/harness/skills/oh-builder/SKILL.md +27 -16
package/harness/skills/oh-caveman/SKILL.md +9 -0
package/harness/skills/oh-expert/SKILL.md +6 -0
package/harness/skills/oh-facade/SKILL.md +298 -0
package/harness/skills/oh-freeze/SKILL.md +9 -0
package/harness/skills/oh-full-output/SKILL.md +81 -0
package/harness/skills/oh-fusion/SKILL.md +314 -0
package/harness/skills/oh-gauntlet/SKILL.md +10 -6
package/harness/skills/oh-grill/SKILL.md +9 -5
package/harness/skills/oh-guard/SKILL.md +9 -0
package/harness/skills/oh-handoff/SKILL.md +9 -0
package/harness/skills/oh-health/SKILL.md +8 -4
package/harness/skills/oh-init/SKILL.md +80 -13
package/harness/skills/oh-investigate/SKILL.md +57 -8
package/harness/skills/oh-issue/SKILL.md +9 -0
package/harness/skills/oh-learn/SKILL.md +81 -8
package/harness/skills/oh-manifest/SKILL.md +55 -11
package/harness/skills/oh-plan-review/SKILL.md +15 -8
package/harness/skills/oh-planner/SKILL.md +18 -8
package/harness/skills/oh-prd/SKILL.md +9 -0
package/harness/skills/oh-refactor/SKILL.md +426 -0
package/harness/skills/oh-retro/SKILL.md +9 -0
package/harness/skills/oh-review/SKILL.md +12 -5
package/harness/skills/oh-security/SKILL.md +4 -0
package/harness/skills/oh-ship/SKILL.md +10 -0
package/harness/skills/oh-skill-craft/SKILL.md +88 -0
package/harness/skills/oh-skills-link/SKILL.md +9 -0
package/harness/skills/oh-skills-list/SKILL.md +9 -0
package/harness/skills/oh-triage/SKILL.md +11 -0
package/index.ts +3 -0
package/lib/{harness-resolver.mjs → harness-resolver.ts} +16 -12
package/lib/logger.ts +75 -0
package/package.json +16 -10
package/tsconfig.json +16 -0
package/bootstrap.mjs +0 -174
package/harness/instructions/CONVENTIONS.md +0 -206
package/index.mjs +0 -3
package/lib/logger.mjs +0 -62
package/test/plugins-behavioral.test.mjs +0 -64
package/test/plugins.test.mjs +0 -62

package/harness/skills/oh-fusion/SKILL.md ADDED

@@ -0,0 +1,314 @@
+---
+name: oh-fusion
+description: "Skill ingestion pipeline: discover, analyze, filter, adapt, fuse, and integrate external skills into the OH harness. Use when the user has an existing skill, finds a skill in their .agents/skills, or wants to bring an external capability into OH."
+tier: 3
+benefits-from: [oh-skill-craft, oh-skills-link, oh-expert]
+triggers:
+  - "import skill"
+  - "ingest skill"
+  - "fuse skill"
+  - "merge skills"
+  - "port skill"
+  - "add skill from"
+  - "make this OH-native"
+  - "skill fusion"
+  - "oh-fusion"
+  - "integrate skill"
+  - "convert skill"
+  - "bring in a skill"
+  - "transfer skill"
+  - "copy skill"
+  - "adopt skill"
+route:
+  pass:
+    - oh-skills-link
+    - oh-skill-craft
+  fail: oh-skill-craft
+  blocker: surface
+---
+# oh-fusion
+The skill ingestion pipeline: discover external skills, evaluate signal quality, filter out noise, adapt to OH conventions, fuse multiple into one, and integrate into the harness.
+Every skill you run through `oh-fusion` becomes part of the closed loop — wired into AUTOPILOT, ROUTING.md, AGENTS.md, and the self-driving engine.
+## When to Use
+- The user points at a skill in `.agents/skills` and says "make this OH-native"
+- The user has a skill from the `npx skills` ecosystem that they want integrated
+- The user provides raw skill content and asks "is this worth keeping?"
+- Multiple skills need fusing into one (like the `oh-facade` fusion in this session)
+- Any external capability needs to become an `oh-*` skill with full wiring
+## Pipeline
+6-phase closed loop:
+```
+Discovery → Analysis → Decision → Adaptation → Fusion (opt) → Integration
+                                                                      ↓
+                                                            oh-skills-link (verify)
+```
+---
+## Phase 1: Discovery
+Input: user's skill source
+Output: raw skill content loaded for analysis
+### Sources
+| Source | How to access |
+|---|---|
+| `.agents/skills/<name>/SKILL.md` | Read the file directly |
+| `npx skills` package | Run `npx skills find <query>` or check `skills.sh` |
+| URL to a skill | Fetch the content via web fetch |
+| User-provided path | Resolve and read |
+| User-provided content inline | Capture the raw text |
+| Multiple skills (for fusion) | Load all, enter Phase 2 on each |
+### Discovery Checklist
+Before proceeding, confirm:
+- [ ] Skill content is loaded and readable
+- [ ] Frontmatter is present (name, description)
+- [ ] There are no access restrictions or permissions needed
+- [ ] For multiple skills: all are loaded and ready for comparison
+---
+## Phase 2: Analysis
+Input: raw skill content
+Output: structured analysis report with signal score
+### 2a. Depth Scoring
+Measure the skill's substantive content:
+| Metric | How to assess |
+|---|---|
+| Total lines | SKILL.md length |
+| Concrete rules count | Number of "must", "never", "always", "banned" directives |
+| Example count | Number of code blocks showing before/after or usage |
+| Anti-patterns listed | Explicit "don't do this" sections |
+| Workflow steps | Number of sequential, actionable steps |
+| Routing table | Does it define pass/fail/blocker routing? |
+**Scoring:**
+- **High signal** (70-100): Multiple concrete rules, examples, anti-patterns, workflow steps, routing
+- **Medium signal** (30-69): Some structure but thin on specifics, few examples
+- **Low signal** (0-29): Vague descriptions, no concrete rules, no anti-patterns, "be creative" level
+### 2b. Overlap Detection
+Compare against all existing OH skills (`harness/skills/oh-*/SKILL.md`):
+- Does any existing OH skill cover the same domain?
+- Is the overlap partial (complementary) or complete (redundant)?
+- Does the external skill have unique content OH lacks?
+### 2c. Convention Check
+Does the skill follow good practices?
+- [ ] Has clear description for triggering
+- [ ] Has concrete, actionable instructions (not just philosophy)
+- [ ] Has anti-patterns or failure modes documented
+- [ ] Has examples or code blocks
+- [ ] Has measurable outcomes (not subjective "make it good")
+- [ ] Avoids time-sensitive references (dates, version numbers)
+- [ ] Avoids platform-specific assumptions that don't apply
+### 2d. Report
+Output a structured report:
+```markdown
+## Analysis: <skill-name>
+**Source:** <path or origin>
+**Depth score:** <0-100> — <High/Medium/Low>
+**Total lines:** <N>  |  Concrete rules: <N>  |  Examples: <N>  |  Anti-patterns: <N>
+**Overlap:** <existing OH skill> — <none/partial/complete>
+**Verdict:** <keep / fuse / discard / ask>
+**Strengths:**
+- <what this skill does well>
+**Weaknesses:**
+- <what is missing or weak>
+**Recommended action:** <port directly / fuse with <X> / discard>
+```
+---
+## Phase 3: Decision
+Based on the analysis, decide what to do:
+| Verdict | Action |
+|---|---|
+| **Keep** | High signal, no overlap, OH conventions missing. Port directly to `oh-<name>`. |
+| **Fuse** | Medium-high signal, partial overlap with existing OH skill(s). Merge complementary DNA. |
+| **Discard** | Low signal, complete overlap, too niche, or no actionable content. Surface reasoning. |
+| **Ask** | Ambiguous quality, unclear domain fit, or user needs to choose between approaches. Surface findings. |
+**Decision principles:**
+- When in doubt between keep and fuse, prefer fuse — conserves routing slots and reduces surface area
+- When in doubt between keep and discard, prefer keep if there is ANY unique signal — the autopilot won't load it unless triggered
+- Never fuse incompatible domains (e.g., UI design into a security skill) — the result is confusing
+---
+## Phase 4: Adaptation
+Input: raw skill content to keep/fuse
+Output: OH-native SKILL.md
+### 4a. Rewrite Frontmatter
+```markdown
+---
+name: oh-<new-name>
+description: "Adapted from <source>. <Core function>. Use when <triggers>."
+tier: <2|3|4>
+benefits-from: [<relevant oh- skills this depends on>]
+triggers:
+  - "<trigger phrase from original, adapted>"
+  - "<new trigger phrases for OH context>"
+---
+```
+### 4b. Structure the Body
+OH skill structure:
+1. **Summary** — one paragraph of what the skill does
+2. **When to Use** — clear triggering context
+3. **Workflow** — numbered steps (the core of the skill)
+4. **Anti-patterns** — what NOT to do
+5. **Routing** — pass/fail/blocker table
+Adaptation rules:
+- Remove all emojis from content
+- Replace ecosystem-specific terminology with OH equivalents
+- Convert relative paths to OH harness conventions
+- Add routing table based on skill's purpose
+- Keep all concrete rules, examples, and anti-patterns from the original
+- Discard fluff, philosophy, and motivational language
+- Preserve the original's unique signal — that's why you're importing it
+### 4c. Naming
+- Name must match `^[a-z0-9]+(-[a-z0-9]+)*$`
+- Prefix with `oh-`
+- Use the original name if it maps well, adapt if not
+- For fusions: invent a new name that captures the combined purpose
+---
+## Phase 5: Fusion (optional — skip for single-skill imports)
+Input: 2+ analyzed skill contents with "fuse" verdict
+Output: one unified skill that merges complementary DNA
+### 5a. Identify Complementary DNA
+For each skill being fused, identify:
+- **Unique rules/concepts** — content that only this skill has
+- **Overlapping content** — same idea expressed differently (keep the better version)
+- **Conflicting directives** — skills that say opposite things (surface to user)
+### 5b. Merge Architecture
+Structure the fused skill so each source contributes its strength:
+```markdown
+## <Combined Workflow>
+### Phase A: <from skill 1>
+<what skill 1 contributes>
+### Phase B: <from skill 2>
+<what skill 2 contributes>
+### Phase C: <from skill 3>
+<what skill 3 contributes>
+```
+Do NOT just concatenate. The fused skill must read as a single coherent workflow, not three documents glued together.
+### 5c. Name the Fusion
+The name should signal the combined purpose, not the individual sources.
+- `oh-facade` (from redesign + design-taste + high-end-visual) — not `oh-redesign-plus-taste`
+- Apply the same principle here
+---
+## Phase 6: Integration
+Input: OH-native SKILL.md
+Output: skill fully wired into the harness
+### 6a. Create the Skill File
+Write to `~/.config/opencode/skills/oh-<name>/SKILL.md` (user dir, survives npm updates).
+If the user has an alternative preference (`~/.agents/skills/`), use that instead.
+The file structure follows the standard OH skill template.
+### 6b. Wire into AUTOPILOT
+Add an entry to the auto-classify matrix in `harness/codex/AUTOPILOT.md`:
+- Signal keywords that should trigger this skill
+- Classification label
+- Action: "Load **oh-<name>**. Do not ask."
+### 6c. Wire routing into frontmatter
+Add `route:` frontmatter to the skill — no ROUTING.md edit needed. The dynamic routing system reads `route.pass`, `route.fail`, and `route.blocker` directly from the skill's own `SKILL.md`. The skill becomes routable automatically:
+```yaml
+route:
+  pass: <next skill or done>
+  fail: <fallback skill or surface>
+  blocker: surface
+```
+### 6d. Wire into AGENTS.md
+Add to the skills table in `AGENTS.md`:
+- Skill, tier, purpose
+- Increment the total count
+### 6e. Wire into openhermes.md
+Add to the orchestrator's skill list in `harness/agents/openhermes.md`.
+### 6f. Verify Discovery
+Route to `oh-skills-link` to confirm the skill is discoverable by OpenCode.
+---
+## Routing
+| Outcome | Route |
+|---|---|
+| integration complete | -> oh-skills-link (verify discovery) |
+| fusion with iteration needed | -> oh-skill-craft (optimize via eval loop) |
+| analysis: discard | -> surface findings to user |
+| analysis: ask | -> surface findings + recommendations to user |
+| blocker | -> surface to user |
+## Anti-patterns
+- Importing a skill without analyzing it first — always run Phase 2
+- Keeping everything from the source — 50% of most external skills is fluff. Be ruthless.
+- Fusing incompatible domains — the result confuses both the model and the user
+- Naming after the source ("oh-tailwind-v2") instead of the capability ("oh-styles")
+- Skipping route frontmatter — a skill without `route.pass`/`route.fail`/`route.blocker` won't auto-route
+- Overwriting existing routing entries without checking for collisions
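
The naming rule in Phase 4c and the scoring bands in Phase 2a of the added oh-fusion skill can be sketched as two small checks. This is a minimal illustration, not code from the package: `isValidSkillName` and `signalBand` are hypothetical names, and treating the depth score as an integer is an assumption (the skill only gives the ranges 0-29, 30-69, and 70-100).

```typescript
// Hypothetical helpers for illustration; not part of the openhermes package.
type SignalBand = "high" | "medium" | "low";

// Pattern quoted in Phase 4c: lowercase alphanumeric segments joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// A name passes when it matches the pattern and carries the `oh-` prefix.
function isValidSkillName(name: string): boolean {
  return name.startsWith("oh-") && KEBAB_CASE.test(name);
}

// Bands quoted in Phase 2a: 70-100 high, 30-69 medium, 0-29 low.
// Assumption: scores are integers, so no value falls between two bands.
function signalBand(score: number): SignalBand {
  if (!Number.isInteger(score) || score < 0 || score > 100) {
    throw new RangeError(`depth score must be an integer in 0..100, got ${score}`);
  }
  if (score >= 70) return "high";
  if (score >= 30) return "medium";
  return "low";
}
```

Note that a name such as `oh-tailwind-v2` passes the pattern check even though the skill's anti-patterns advise against naming after the source; that judgement is left to the reviewer.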

package/harness/skills/oh-gauntlet/SKILL.md CHANGED

@@ -4,14 +4,18 @@ description: "Rigorous multi-axis testing gauntlet: unit, integration, edge case
 tier: 4
 benefits-from: [oh-expert, oh-builder]
 triggers:
-  - "gauntlet"
+  - "run the gauntlet on"
   - "test everything"
   - "rigorous testing"
   - "review all angles"
-  - "qa"
-  - "full review"
-  - "run the gauntlet"
-  - "validate"
+  - "qa the feature"
+  - "full review of the code"
+  - "validate this feature"
+  - "thorough testing"
+route:
+  pass: oh-ship
+  fail: oh-builder
+  blocker: surface
 ---
 # oh-gauntlet
@@ -34,7 +38,7 @@ If tests are missing or weak, flag what should be added. Do not add them here
 Spawn two sub-agents simultaneously:
-**Standards sub-agent:** Read the repo's documented standards (CONTEXT.md, AGENTS.md, eslint config, ADRs, STYLE.md, CONVENTIONS.md). Then read the diff. Report every place the diff violates a documented standard. Cite the standard source. Distinguish hard violations from judgement calls.
+**Standards sub-agent:** Read the repo's documented standards (CONTEXT.md, AGENTS.md, eslint config, ADRs). Then read the diff. Report every place the diff violates a documented standard. Cite the standard source. Distinguish hard violations from judgement calls.
 **Spec sub-agent:** Read the spec source (plan.md, issue, PRD, or user's description). Then read the diff. Report: (a) requirements that are missing or partial, (b) scope creep (behavior not asked for), (c) requirements that look implemented but wrong. Quote the spec.
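
The `route:` frontmatter added to oh-gauntlet in the hunk above (and to most other skills in this release) maps an outcome to the next step. The sketch below shows one way such a block could be read. It is an assumption-laden illustration: the routing implementation itself is not part of this diff, `Route`, `nextTargets`, and `isSkill` are hypothetical names, and treating values without an `oh-` prefix (`surface`, `done`, `mode`) as non-skill terminals is a guess based on the values that appear in the diff.

```typescript
// Illustrative types; the real routing code is not shown in this diff.
type Outcome = "pass" | "fail" | "blocker";
type RouteTarget = string | string[];

interface Route {
  pass: RouteTarget;
  fail: RouteTarget;
  blocker: RouteTarget;
}

// The route block added to oh-gauntlet in 4.3.0.
const gauntletRoute: Route = { pass: "oh-ship", fail: "oh-builder", blocker: "surface" };

// Normalise a target to a list, because oh-fusion declares `pass` as a list.
function nextTargets(route: Route, outcome: Outcome): string[] {
  const target = route[outcome];
  return Array.isArray(target) ? [...target] : [target];
}

// Assumption: targets starting with `oh-` are skills; anything else is a terminal.
function isSkill(target: string): boolean {
  return target.startsWith("oh-");
}
```

Under these assumptions, `nextTargets(gauntletRoute, "fail")` yields a single-element list containing `oh-builder`, and `isSkill("surface")` is false.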

package/harness/skills/oh-grill/SKILL.md CHANGED

@@ -4,12 +4,16 @@ description: "Stress-test plans and designs through relentless Socratic question
 tier: 3
 benefits-from: [oh-expert, oh-planner]
 triggers:
-  - "grill"
   - "stress test this plan"
-  - "challenge this"
-  - "grill me"
-  - "poke holes"
-  - "interrogate"
+  - "challenge this plan"
+  - "grill me on this"
+  - "poke holes in this plan"
+  - "interrogate this plan"
+  - "stress test this design"
+route:
+  pass: oh-planner
+  fail: oh-expert
+  blocker: surface
 ---
 # oh-grill

package/harness/skills/oh-guard/SKILL.md CHANGED

@@ -1,6 +1,15 @@
 ---
 name: oh-guard
 description: "Safety confirmation mode — warn before destructive operations"
+tier: 2
+triggers:
+  - "confirm before"
+  - "safety confirmation"
+  - "guard mode"
+route:
+  pass: mode
+  fail: mode
+  blocker: surface
 ---
 # oh-guard

package/harness/skills/oh-handoff/SKILL.md CHANGED

@@ -1,6 +1,15 @@
 ---
 name: oh-handoff
 description: "Compact session state into a structured handoff document"
+tier: 2
+triggers:
+  - "session handoff"
+  - "handoff to another agent"
+  - "handoff the session"
+route:
+  pass: done
+  fail: surface
+  blocker: surface
 ---
 # oh-handoff

package/harness/skills/oh-health/SKILL.md CHANGED

@@ -3,12 +3,16 @@ name: oh-health
 description: "Code quality dashboard: runs project tools (typecheck, lint, test, dead code detection), computes weighted composite 0-10 score, persists history, shows trend. Read-only — no fixes."
 tier: 2
 triggers:
-  - "health check"
-  - "code quality"
+  - "health check the codebase"
+  - "code quality check"
   - "quality dashboard"
   - "how healthy is the codebase"
-  - "run all checks"
-  - "health"
+  - "run all project checks"
+  - "code health"
+route:
+  pass: surface
+  fail: oh-investigate
+  blocker: surface
 ---
 # oh-health

package/harness/skills/oh-init/SKILL.md CHANGED

@@ -1,22 +1,86 @@
 ---
 name: oh-init
-description: "Initialize project for agent-assisted development: scaffold CONTEXT.md, AGENTS.md, docs/adr/, configure issue tracker and triage labels."
+description: "Initialize project for OpenHermes: wire AGENTS.md, configure domain docs, issue tracker, and triage labels. Does NOT create .opencode/ directory."
 tier: 2
 triggers:
-  - "init project"
-  - "setup project"
-  - "initialize"
-  - "onboard"
-  - "scaffold"
+  - "init this project for oh"
+  - "setup project for openhermes"
+  - "initialize openhermes setup"
+  - "onboard this project"
+  - "scaffold project setup"
+  - "oh takeover this project"
+route:
+  pass: done
+  fail: oh-init
+  blocker: surface
 ---
 # oh-init
-Per-repo setup for agent-assisted development. Run once per repo. Walks through configuration decisions one at a time.
+Per-repo setup for OpenHermes-assisted development. Run once per repo. Wires AGENTS.md, configures domain docs, issue tracker, and triage labels. Does NOT create a `.opencode/` directory — plan files go to `~/.local/share/opencode/openhermes/plans/`.
+Complements OpenCode's built-in `/init` command (which creates `AGENTS.md` with project build/test/architecture notes). Run oh-init after or instead — they serve different layers.
 ## Process
-### 1. Issue Tracker
+### Phase 0: Check Existing State
+Before writing anything, detect what already exists:
+- ☐ `AGENTS.md` exists? (If yes, was it created by OpenCode `/init` or manually?)
+- ☐ `opencode.json` / `opencode.jsonc` present?
+- ☐ Canonical plan files (`~/.local/share/opencode/openhermes/plans/<project-name>-plan-*.md`)?
+- ☐ `CONTEXT.md` exists?
+- ☐ `docs/agents/` directory exists?
+Report findings. If everything exists, offer to skip or verify and exit.
+### Phase 1: AGENTS.md Wiring
+Check if AGENTS.md exists:
+**If AGENTS.md does not exist:**
+Create it with OpenHermes orchestrator header + prompts for project info:
+```markdown
+# <project-name>
+OpenHermes is the primary orchestrator. All routing, planning, and delegation flows through oh-* skills.
+## Project Context
+- **Language**: <fill in>
+- **Package manager**: <fill in>
+- **Build command**: <fill in>
+- **Test command**: <fill in>
+- **Lint/type check**: <fill in>
+## Key Directives
+- Plan first. Write to `~/.local/share/opencode/openhermes/plans/<project-name>-plan-<nnn>.md` before multi-file changes.
+- **OpenHermes never executes tasks directly. It talks/reports to the user and delegates everything to sub-agents.**
+- Verify before claiming success. Read files, run commands, confirm output.
+- Never write code, run tests, or edit files in the main context — always delegate.
+- Use oh-* skills on demand. Load via OpenCode's skill tool when relevant.
+- Plan file is self-contained (Tasks, Completed, Work Log sections).
+```
+Then ask the user to fill in the Project Context fields. Offer to auto-detect from package manifests.
+**If AGENTS.md exists** (e.g., created by OpenCode `/init`):
+Append an `## OpenHermes Orchestrator` section to the end:
+```markdown
+## OpenHermes Orchestrator
+OpenHermes is the primary orchestrator for this session.
+- **Orchestrator**: OpenHermes — hub-and-spoke routing through oh-* skills
+- **Plan**: `~/.local/share/opencode/openhermes/plans/<project-name>-plan-<nnn>.md` — always check before starting work
+- **Never execute**: OpenHermes talks/reports to the user and delegates everything to sub-agents
+- **Verify before claim**: read files, run commands, confirm output
+```
+### Phase 2: Issue Tracker
 Detect the git hosting platform:
 - **GitHub** — `gh` CLI
 - **GitLab** — `glab` CLI
@@ -25,7 +89,7 @@ Detect the git hosting platform:
 Confirm with the user. Write the result to `docs/agents/issue-tracker.md`.
-### 2. Triage Labels
+### Phase 3: Triage Labels
 The `triage` skill uses these label strings to move issues through a state machine:
 - `needs-triage` — maintainer needs to evaluate
 - `needs-info` — waiting on reporter
@@ -35,7 +99,7 @@ The `triage` skill uses these label strings to move issues through a state machi
 If the repo already has different label names, map them. Write to `docs/agents/triage-labels.md`.
-### 3. Domain Docs
+### Phase 4: Domain Docs
 Configure how the project organizes domain language:
 - **Single-context** — one `CONTEXT.md` + `docs/adr/` at repo root
 - **Multi-context** — `CONTEXT-MAP.md` pointing to per-context files
@@ -44,7 +108,7 @@ Scaffold `CONTEXT.md` with project name, domain description, and placeholder glo
 Write to `docs/agents/domain.md`.
-### 4. Agent Skills Block
+### Phase 5: Agent Skills Block
 Add a `## Agent skills` section to `AGENTS.md` (or `CLAUDE.md` if it exists):
 ```markdown
@@ -60,14 +124,17 @@ Add a `## Agent skills` section to `AGENTS.md` (or `CLAUDE.md` if it exists):
 <summary>. See docs/agents/domain.md.
 ```
-### 5. Decision Record
-Record: "oh-init completed for project \<name\> on \<date\>."
+### Phase 6: Decision Record
+Record: "oh-init completed for project <name> on <date>."
 ## Anti-patterns
 - Running init without understanding the project domain
 - Scaffolding CONTEXT.md without populating any terms
 - Creating ADR directory but never writing ADRs
 - Creating both AGENTS.md and CLAUDE.md — edit the one that exists
+- Overwriting an existing AGENTS.md created by OpenCode `/init` (append instead)
+- Creating `.opencode/` directory — plan files go to OpenCode's canonical storage, not a hidden project dir
+- Empty instinct file never getting populated (run oh-learn extract periodically)
 ## Routing
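
Phase 1 of the revised oh-init distinguishes two cases: create `AGENTS.md` when it is missing, or append an `## OpenHermes Orchestrator` section when it already exists. The create-or-append decision can be sketched as a pure function over the file's current content. This is a hedged illustration only: `wireAgentsMd` is a hypothetical name, the generated text is abbreviated from the templates in the diff, and skipping the append when the heading is already present is an added assumption to keep re-runs idempotent.

```typescript
// Hypothetical helper for illustration; not part of the openhermes package.
const ORCHESTRATOR_HEADING = "## OpenHermes Orchestrator";

// `existing` is the current AGENTS.md content, or null when the file is absent.
function wireAgentsMd(existing: string | null, projectName: string): string {
  if (existing === null) {
    // No file yet: start from the orchestrator header of the Phase 1 template.
    return (
      `# ${projectName}\n` +
      "OpenHermes is the primary orchestrator. " +
      "All routing, planning, and delegation flows through oh-* skills.\n"
    );
  }
  if (existing.includes(ORCHESTRATOR_HEADING)) {
    // Assumption: already wired, so leave the file untouched.
    return existing;
  }
  // File exists (for example from OpenCode `/init`): append, never overwrite.
  const separator = existing.endsWith("\n") ? "\n" : "\n\n";
  return (
    `${existing}${separator}${ORCHESTRATOR_HEADING}\n` +
    "OpenHermes is the primary orchestrator for this session.\n"
  );
}
```

Keeping the function free of filesystem calls makes the append-instead-of-overwrite rule easy to check in isolation; a caller would read and write the file around it.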

package/harness/skills/oh-investigate/SKILL.md CHANGED

@@ -1,6 +1,16 @@
 ---
 name: oh-investigate
 description: "Systematic bug diagnosis with root cause investigation"
+tier: 2
+triggers:
+  - "investigate this bug"
+  - "debug this"
+  - "why is this broken"
+  - "root cause"
+route:
+  pass: oh-builder
+  fail: oh-expert
+  blocker: surface
 ---
 # oh-investigate
@@ -8,14 +18,52 @@ description: "Systematic bug diagnosis with root cause investigation"
 ## When to Use
 When a bug is reported, a test fails, or unexpected behavior occurs. Use this before attempting any fix.
-## Workflow
-1. **Reproduce** — get a reliable reproduction case (script, test, or steps)
-2. **Minimise** — strip away unrelated code until the minimal reproduction remains
-3. **Hypothesise** — list possible root causes, rank by likelihood
-4. **Instrument** — add logging, assertions, or debug output to test hypothesis
-5. **Fix** — implement the smallest correct change addressing root cause
-6. **Regression test** — verify fix doesn't break existing behavior
-7. **Document** — log the root cause and fix in the handoff, issue, or docs that are actually in scope
+## Phase 0 — Build a feedback loop
+**This is the actual skill. Everything else is mechanical.**
+If you have a fast, deterministic, agent-runnable pass/fail signal for the bug, you will find the cause — bisection, hypothesis-testing, and instrumentation are just consuming that signal. If you don't have one, no amount of staring at code will save you.
+Spend disproportionate effort here. **Be aggressive. Be creative. Refuse to give up.**
+### Ways to construct a feedback loop (try in this order)
+1. **Failing test** at whatever seam reaches the bug.
+2. **Curl / HTTP script** against a running dev server.
+3. **CLI invocation** with a fixture input, diffing stdout against a known-good snapshot.
+4. **Headless browser script** — drive the UI, assert on DOM/console/network.
+5. **Replay a captured trace** — save a real payload/event log, replay it in isolation.
+6. **Throwaway harness** — minimal subset of the system exercising the bug code path with a single call.
+7. **Property / fuzz loop** — run 1000 random inputs, look for the failure mode.
+8. **Bisection harness** — automate "boot at state X, check, repeat" so you can `git bisect run` it.
+9. **Differential loop** — run same input through old-version vs new-version, diff outputs.
+10. **HITL script** — last resort. Drive a human with a structured loop.
+### Iterate on the loop itself
+- Can I make it faster? (Cache setup, skip unrelated init, narrow the scope.)
+- Can I make the signal sharper? (Assert on the specific symptom, not "didn't crash".)
+- Can I make it more deterministic? (Pin time, seed RNG, isolate filesystem.)
+A 30-second flaky loop is barely better than no loop. A 2-second deterministic loop is a debugging superpower.
+### Non-deterministic bugs
+The goal is not a clean repro but a **higher reproduction rate**. Loop the trigger 100×, parallelise, add stress, narrow timing windows. A 50%-flake bug is debuggable; 1% is not.
+### When you genuinely cannot build a loop
+Stop and say so explicitly. List what you tried. Do **not** proceed to hypothesise without a loop.
+## Workflow (consumes the loop)
+1. **Reproduce** — run the loop, confirm the bug appears. The loop must match the user's described failure, not a different nearby failure.
+2. **Minimise** — strip away unrelated code until the minimal reproduction remains.
+3. **Hypothesise** — generate 3–5 ranked falsifiable hypotheses before testing any. Each must state a prediction: "If X is the cause, then changing Y will make the bug disappear".
+4. **Instrument** — one probe per hypothesis. Change one variable at a time. Tag every debug log with a unique prefix (e.g. `[DEBUG-a4f2]`) for easy cleanup.
+5. **Fix** — write the regression test at a correct seam first. Watch it fail. Apply the smallest correct change. Watch it pass. Re-run the Phase 0 loop against the original scenario.
+6. **Regression test** — verify fix doesn't break existing behavior. If no correct seam exists for a regression test, that itself is a finding — flag the architecture gap.
+7. **Document** — log the root cause and fix in the handoff, issue, or relevant docs. State which hypothesis was correct so the next debugger learns.
 ## Iron Law
 No fixes without root cause. Surface-level fixes compound into technical debt.
@@ -25,6 +73,7 @@ No fixes without root cause. Surface-level fixes compound into technical debt.
 - Changing code without reproducing the bug first
 - "Shotgun" debugging — changing multiple things hoping one sticks
 - Not documenting root cause for future reference
+- Proceeding to hypothesise without a feedback loop
 ## Routing
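
For non-deterministic bugs, the revised oh-investigate skill asks for a higher reproduction rate by looping the trigger many times. A minimal sketch of measuring that rate follows. The names are hypothetical: `reproductionRate` is not from the package, and `trigger` stands in for whatever script or call exercises the bug and reports whether the failure appeared.

```typescript
// Hypothetical harness for illustration; not part of the openhermes package.
// `trigger` should return true when the failure under investigation appears.
function reproductionRate(trigger: () => boolean, runs: number = 100): number {
  if (!Number.isInteger(runs) || runs <= 0) {
    throw new RangeError(`runs must be a positive integer, got ${runs}`);
  }
  let failures = 0;
  for (let i = 0; i < runs; i++) {
    if (trigger()) failures++;
  }
  // Fraction of runs in which the bug reproduced.
  return failures / runs;
}

// Stand-in trigger that fails on every fourth call, so the example is deterministic.
let calls = 0;
const everyFourthCallFails = (): boolean => ++calls % 4 === 0;

const rate = reproductionRate(everyFourthCallFails, 100);
console.log(`reproduced in ${rate * 100}% of runs`);
```

Re-measuring the rate after each change to the loop (added stress, narrower timing window) gives a concrete number to compare, which is the signal the skill asks for before moving on to hypotheses.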

package/harness/skills/oh-issue/SKILL.md CHANGED

@@ -1,6 +1,15 @@
 ---
 name: oh-issue
 description: "Break a plan, spec, or PRD into independently-grabbable GitHub issues"
+tier: 2
+triggers:
+  - "break into issues"
+  - "create issues from plan"
+  - "issue breakdown"
+route:
+  pass: done
+  fail: oh-planner
+  blocker: surface
 ---
 # oh-issue