npm - @heart-of-gold/toolkit - Versions diffs - 0.1.47 → 0.1.49 - Mend

@heart-of-gold/toolkit 0.1.47 → 0.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/.claude-plugin/marketplace.json +2 -2
package/package.json +2 -1
package/plugins/guide/.claude-plugin/plugin.json +1 -1
package/plugins/guide/skills/codex/SKILL.md +6 -5
package/plugins/marvin/.claude-plugin/plugin.json +1 -1
package/plugins/marvin/agents/harness-author.md +110 -0
package/plugins/marvin/skills/harness-up/SKILL.md +175 -0

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -9,7 +9,7 @@
       "name": "guide",
       "source": "./plugins/guide",
       "description": "The Hitchhiker's Guide — content creation suite with automated pipeline, daily briefs, and blog writing",
-      "version": "0.3.2"
+      "version": "0.3.3"
     },
     {
       "name": "deep-thought",
@@ -21,7 +21,7 @@
       "name": "marvin",
       "source": "./plugins/marvin",
       "description": "The Paranoid Android — quality tools for code review, knowledge compounding, and work execution",
-      "version": "0.3.10"
+      "version": "0.3.11"
     },
     {
       "name": "babel-fish",

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@heart-of-gold/toolkit",
-  "version": "0.1.47",
+  "version": "0.1.49",
   "type": "module",
   "description": "Cross-platform installer for Heart of Gold skills \u2014 works with Codex, OpenCode, Pi, Claude Code, and more",
   "bin": {
@@ -68,6 +68,7 @@
       "./plugins/guide/skills/setup",
       "./plugins/guide/skills/write-post",
       "./plugins/marvin/skills/compound",
+      "./plugins/marvin/skills/harness-up",
       "./plugins/marvin/skills/share-html",
       "./plugins/marvin/skills/share-server-control",
       "./plugins/marvin/skills/share-server-setup",

package/plugins/guide/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "guide",
-  "version": "0.3.2",
+  "version": "0.3.3",
   "description": "The Hitchhiker's Guide — content creation suite with automated pipeline, daily briefs, and blog writing",
   "author": {
     "name": "ondrej-svec",

package/plugins/guide/skills/codex/SKILL.md CHANGED Viewed

@@ -9,15 +9,16 @@ description: Use when the user asks to run Codex CLI (codex exec, codex resume)
 | Model | Best for |
 | --- | --- |
-| `gpt-5.4` | Flagship — complex software engineering, strongest coding + reasoning |
+| `gpt-5.5` | Flagship (released 2026-04-23) — complex coding, computer use, knowledge work, research workflows. **ChatGPT sign-in only — not available with API-key auth.** Fall back to `gpt-5.4` if not in the user's model picker yet. |
+| `gpt-5.4` | Professional coding with strong reasoning + tool use. Default fallback when `gpt-5.5` is unavailable. |
 | `gpt-5.4-mini` | Faster/cheaper for lighter coding tasks and subagents |
-| `gpt-5.4-nano` | Optimized for latency and cost on lightweight workloads |
-| `gpt-5.3-codex-spark` | Near-instant real-time coding iteration (research preview) |
+| `gpt-5.3-codex` | Complex software engineering with industry-leading coding capabilities |
+| `gpt-5.3-codex-spark` | Near-instant real-time coding iteration (ChatGPT Pro research preview) |
-Default recommendation: `gpt-5.4` for complex tasks, `gpt-5.4-mini` for speed.
+Default recommendation: `gpt-5.5` for complex tasks (with `gpt-5.4` fallback), `gpt-5.4-mini` for speed.
 ## Running a Task
-1. Ask the user (via `AskUserQuestion`) which model to use (default: `gpt-5.4`) AND which reasoning effort (`xhigh`, `high`, `medium`, or `low`) in a **single prompt with two questions**.
+1. Ask the user (via `AskUserQuestion`) which model to use (default: `gpt-5.5`, fallback `gpt-5.4`) AND which reasoning effort (`xhigh`, `high`, `medium`, or `low`) in a **single prompt with two questions**.
 2. Select the sandbox mode required for the task; default to `--sandbox read-only` unless edits or network access are necessary.
 3. Assemble the command with the appropriate options:
    - `-m, --model <MODEL>`

package/plugins/marvin/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "marvin",
-  "version": "0.3.10",
+  "version": "0.3.11",
   "description": "The Paranoid Android — quality tools for code review, knowledge compounding, and work execution",
   "author": {
     "name": "ondrej-svec",

package/plugins/marvin/agents/harness-author.md ADDED Viewed

@@ -0,0 +1,110 @@
+---
+name: harness-author
+description: >
+  Drafts repo doctrine documents — root AGENTS.md, subtree AGENTS.md, and the
+  paired docs/agents-md-standard.md — adapted to a specific repository's
+  mission, stack, and verification surfaces. Called by marvin:harness-up
+  during Phase 4. Knowledge-grounded in harness-engineering conventions
+  (OpenAI harness-engineering guidance, harness-lab AGENTS.md standard,
+  agents.md spec). Use when a repository needs a tight, opinionated
+  AGENTS.md that routes outward instead of restating itself.
+model: sonnet
+tools: Read, Glob, Grep, Bash
+---
+# harness-author
+You are a repo doctrine author who has internalized the harness-engineering brief. The orchestrating skill (`/marvin:harness-up`) hands you a repository's survey output, mission paragraph, and target file path; you return a tight, route-out-not-restate document that fits the repo it was written for.
+The skill calls you because templated string-substitution produces AGENTS.md files that read like résumés. You produce ones that read like a senior engineer left a map for the next person.
+## Scope Boundaries
+**You DO:**
+- Read every reference template the skill points you at (in priority order: the user's repo, `~/projects/Bobo/harness-lab/AGENTS.md`, `~/projects/Bobo/harness-lab/docs/agents-md-standard.md`)
+- Read the survey output the skill captured — framework, existing docs/ shape, `.claude/settings.json` state
+- Adapt task-routing entries to surfaces the repo *actually has*, not surfaces the template assumes
+- Write a "First-time setup" section listing the marketplace `add` commands the skill captured
+- Write the document to the path the skill specifies
+**You do NOT:**
+- Invent surfaces or routing entries the survey didn't find
+- Copy harness-lab's workshop-specific sections (workshop-blueprint, workshop-content) verbatim
+- Exceed ~180 lines for a root AGENTS.md (the `map_not_dump` heuristic from the standard)
+- Restate ADRs, security policies, or volatile state in AGENTS.md — link to them
+- Include a "Made with care" line, "Generated by AI" stamp, or any other fluff
+- Modify any file other than the target path the skill provided
+- Run tests, install dependencies, or write source code
+## Inputs You Receive From the Skill
+The dispatching prompt contains:
+1. **Mode** — `root-agents-md` | `subtree-agents-md` | `agents-md-standard`
+2. **Target path** — exact write location (e.g. `AGENTS.md`, `dashboard/AGENTS.md`)
+3. **Mission paragraph** — captured during the skill's Phase 1
+4. **Survey** — what exists in the repo (framework, existing docs/, plugin selections, verification stack)
+5. **Selected surfaces** — which `docs/` subdirs, plugin marketplaces, hooks the skill is also installing in this run
+6. **Existing AGENTS.md content** (brownfield only) — must be preserved or merged, never silently overwritten
+## Method
+### 1. Pre-flight — load conventions
+Read once before drafting:
+- `~/projects/Bobo/harness-lab/docs/agents-md-standard.md` — the contract you're writing toward (12 PASS/FAIL checks; the `map_not_dump` heuristic; the "five jobs" of a root AGENTS.md)
+- `~/projects/Bobo/harness-lab/AGENTS.md` — the canonical shape (mission → read first → task routing → repo map → language/trust → verification → done)
+- The user's repo `package.json` / `pyproject.toml` / `Cargo.toml` to confirm the stack — never assume
+### 2. Draft
+Write the document with these jobs in order:
+1. **Mission** — one paragraph. The repo's actual purpose, in the team's voice. Not the framework, not the deploy target.
+2. **Read First** — a numbered list of files to read before non-trivial edits. Only files that *exist* (cross-reference against the survey).
+3. **Task Routing** — group of routes from "if you touch X, read Y first." One bullet per route. At least three areas. Repo-relative markdown links.
+4. **Repo Map** — code block listing every load-bearing top-level directory. Skip if the repo is too small to need one.
+5. **Language / Trust Boundaries** — what's user-facing vs internal, what's secret vs public, anything a contributor must NOT commit.
+6. **Verification Boundary** — the actual command(s) that decide "done." If the repo has no test/lint commands yet, say so explicitly instead of inventing.
+7. **Done Criteria** — operational. "Lint, typecheck, test, build green; agent-driven UI loop ran on UI changes."
+8. **First-time Setup** (only if the skill installed plugins) — the marketplace `add` commands collaborators run once on clone.
+For brownfield: read the existing AGENTS.md fully. Preserve any team-specific paragraphs. Merge new sections in; never replace whole-file.
+### 3. Verify against `agents-md-standard.md`
+Before returning, walk the 12 PASS/FAIL checks:
+- `agents_md_exists` · `map_not_dump` (under ~180 lines) · `read_first_present` · `task_routing_present` (≥3 areas) · `verification_boundary_stated` · `done_criteria_explicit` · `next_safe_move_obvious` · `repo_map_matches_reality` · `links_repo_relative` · `no_chat_only_rules` · `public_private_boundaries_current` · `maintenance_triggers_named`
+If any would FAIL, fix before delivering. Do not deliver a doc that fails its own standard.
+## Output
+Return the rendered document body (markdown, no frontmatter unless the skill specifies otherwise) ready for the skill to write to the target path. Append a one-line provenance comment at the very end so future maintainers can trace it:
+```
+<!-- Drafted by marvin:harness-author for {target-path} on YYYY-MM-DD. Edit freely — this comment is informational. -->
+```
+## Rules
+1. **Verify before you cite.** Every file path you mention in Read First or Task Routing must exist in the survey. If the survey doesn't list it, do not link to it.
+2. **Route, don't restate.** If a piece of detail belongs in `docs/foo.md`, link to it. The root file is a map.
+3. **No invented commands.** If the survey didn't surface `pnpm test`, do not write `pnpm test` in Verification. Write what the repo actually has, or write "Verification commands: TBD — add when the test surface lands."
+4. **Brownfield is a merge, not a replace.** Read the existing file. If it has team-specific phrasing, voice, or sections, preserve them.
+5. **Length is a feature.** Aim for tight. The standard caps root AGENTS.md at ~180 lines; subtree AGENTS.md at ~120.
+6. **No AI smell.** No "leveraging," no "comprehensive," no "enables you to." Write the way a senior engineer writes a runbook.
+## Failure Modes (self-correct before returning)
+If your draft matches any row below, do not return it — fix it in place. The dispatching skill checks the same rubric and will reject a failing draft anyway, so the only outcome of returning one is wasted tokens.
+| If you produce… | The reviewer will reject because… |
+|-----------------|-----------------------------------|
+| A 250-line AGENTS.md | Violates `map_not_dump` |
+| Routes pointing at files the survey didn't list | Violates `links_repo_relative` (broken links) |
+| Generic "use Vitest for testing" without checking the repo | Violates `verification_boundary_stated` |
+| A Read First section with abstract advice instead of actual paths | Violates `read_first_present` |
+| Replacing the brownfield AGENTS.md instead of merging | Loses team voice, fails the merge constraint |

package/plugins/marvin/skills/harness-up/SKILL.md ADDED Viewed

@@ -0,0 +1,175 @@
+---
+name: harness-up
+description: >
+  Install harness-engineering doctrine into a repository — short root AGENTS.md,
+  docs/ taxonomy (plans, ADRs, solutions), verification rules, and
+  project-scoped Claude Code plugins. Works on empty repos (greenfield
+  scaffold), existing repos (survey + upgrade), and ports of existing systems.
+  Distinct from `marvin:scaffold` (which installs deps and configs); distinct
+  from `cc-lab:cc-lab-diagnose` (which observes but doesn't change anything).
+  Triggers: harness up, harness this repo, set up AGENTS.md, agent doctrine,
+  make this repo agent-ready, init harness, scaffold agent context.
+allowed-tools:
+  - Read
+  - Grep
+  - Glob
+  - Bash
+  - Write
+  - Edit
+  - AskUserQuestion
+  - Agent
+---
+# Harness Up
+A repository that confuses an agent confuses a teammate too. The fix isn't a longer prompt — it's the smallest durable operating surface that points the next pair of hands, human or model, at the next safe move. This skill installs that surface.
+## Boundaries
+**This skill MAY:** read repo state, ask scoping questions, write doctrine docs (`AGENTS.md`, `CLAUDE.md`, `docs/agents-md-standard.md`, `docs/plans/README.md`, ADR/solutions templates), create `.claude/settings.json` (or merge), record marketplace add commands in a setup section, scaffold empty `docs/` subdirectories with README stubs.
+**This skill MAY NOT:** write application source code, write tests, install npm/pnpm/pip dependencies (that is `marvin:scaffold`'s job), run tests or builds, deploy, modify production state, overwrite an existing `AGENTS.md` without merging, replicate `cc-lab:cc-lab-diagnose`'s observation logic.
+**You install doctrine. You do not write the application.**
+## Common Rationalizations
+| Shortcut | Why It Fails | The Cost |
+|----------|--------------|----------|
+| "Skip the survey — drop in the template" | Templates bury real conventions; downstream links go dead | AGENTS.md routes agents to files that don't exist; the first session breaks on a 404 |
+| "Add every plugin — they're free" | Each plugin loads on every session; cognitive load compounds | Slow starts, abandoned skills, contributors disabling the lot |
+| "Empty repo, no brainstorm — fill in later" | Doctrine without mission reads generic | Agent has no routing anchor; produces boilerplate every invocation |
+| "Mirror harness-lab 1:1" | That repo is a workshop; yours probably isn't | Sections nobody owns rot into doctrine debt within weeks |
+## When NOT to Use
+- The repo already has a passing `cc-lab:cc-lab-diagnose` and the user wants a targeted fix, not a doctrine refresh — edit directly instead.
+- Application scaffolding (deps, source, tests) is what's actually needed — use `marvin:scaffold`.
+- A single section needs a rewrite — edit that file; running the full skill is overkill.
+- The repo is a one-off prototype with no team or future maintainers — doctrine is overhead it won't pay back.
+---
+## Phase 0: Detect Mode
+**Entry:** User invoked the skill.
+Run `git rev-parse --is-inside-work-tree`, `ls AGENTS.md CLAUDE.md docs package.json pyproject.toml Cargo.toml go.mod .claude`. Classify as **greenfield-blank** (empty), **greenfield-with-concept** (manifest/README, no doctrine), or **brownfield** (`AGENTS.md` or `docs/` already present). Announce in one line. If ambiguous, **AskUserQuestion** (header: "Mode", options matching the three classes plus "let me describe").
+**Exit:** Mode known.
+---
+## Phase 1: Establish the Mission
+**Entry:** Mode known.
+A doctrine without a mission is filler. **AskUserQuestion** (header: "Mission") with: *clear concept* (describe now), *brainstorm first* (exit to `/deep-thought:brainstorm`, re-invoke this skill once `docs/brainstorms/...` lands), or *it's a port* (describe source — mine it for mission and content shape).
+**If brownfield with a strong existing AGENTS.md:** skip — mission is already there. Confirm with the user in one sentence.
+**Exit:** Mission captured, or skill exits to brainstorm without writing partial doctrine.
+---
+## Phase 2: Survey
+**Entry:** Mode + mission known.
+Inventory without re-implementing `cc-lab:cc-lab-diagnose`: read any existing `AGENTS.md` / `CLAUDE.md` fully (never overwrite blindly), detect framework from `package.json`, note `docs/` shape and `.claude/settings.json` state, note git remote. Produce a one-paragraph "what exists / what's missing / what conflicts" report. If `cc-lab:cc-lab-diagnose` is installed and this is brownfield, suggest running it first for deeper observations.
+**Exit:** Inventory surfaced to the user.
+---
+## Phase 3: Pick the Surfaces
+**Entry:** Survey complete.
+Use **AskUserQuestion** with `multiSelect: true` (header: "Doctrine", question: "Which doctrine surfaces should I install or upgrade? Pick all that apply.") — defaults reflect the harness-engineering standard:
+1. **Root AGENTS.md** — operating map: mission, read-first, task routing, verification, done — *recommended for every repo*
+2. **CLAUDE.md mirror** — single-line `@AGENTS.md` for tools that look there
+3. **`docs/agents-md-standard.md`** — what AGENTS.md commits to (the 12 PASS/FAIL checks)
+4. **`docs/plans/`** — plan lifecycle (`approved | in_progress | complete | superseded | captured`) with archive rules
+5. **`docs/adr/`** — architecture decision records template
+6. **`docs/solutions/`** — searchable problem→fix log, paired with `marvin:compound`
+7. **`docs/agent-ui-testing.md`** — Chrome DevTools MCP loop for UI-bearing repos
+8. **`.claude/settings.json`** — permission allowlist defaults, optional lint-on-save hook
+9. **Subtree AGENTS.md** for one major directory (ask which)
+Then **AskUserQuestion** with `multiSelect: true` (header: "Plugins", question: "Which Claude Code plugins should be project-scoped? Project-scoped means anyone cloning the repo gets them, regardless of their global setup."):
+1. **heart-of-gold-toolkit** — deep-thought, marvin, babel-fish, guide, quellis
+2. **cc-lab** — `/cc-lab-diagnose` for ongoing setup checks
+3. **compound-engineering** — review/plan/work agents
+4. **vercel** — for Next.js / Vercel-hosted projects (auto-suggest if Next.js detected)
+5. **chrome-devtools-mcp** — required by the UI-testing surface above
+6. **None** — leave plugin choice to each developer's global config
+Then **AskUserQuestion** with `multiSelect: false` (header: "Scope", question: "Where should plugin selections live?"):
+1. **Project — `.claude/settings.json`** (recommended; collaborators get the same setup)
+2. **User — each developer manages their own `~/.claude`**
+3. **Both — install at project scope, document fallback in AGENTS.md**
+**Exit:** Decisions captured.
+---
+## Phase 4: Generate
+**Entry:** Decisions captured.
+Source templates from `../knowledge/harness-doctrine.md` if it exists; otherwise synthesize from the reference repos (`~/projects/Bobo/harness-lab/AGENTS.md`, `docs/agents-md-standard.md`, `docs/plan-lifecycle-standard.md`). Substitute mission and stack from Phases 1-2. Cap AGENTS.md at ~180 lines (the `map_not_dump` heuristic).
+Write in this order — each step depends on the prior:
+1. `AGENTS.md` — root operating map; every other doc routes back to it
+2. `CLAUDE.md` mirror (`@AGENTS.md`) if selected
+3. `docs/` skeleton — `plans/`, `adr/`, `solutions/` as empty directories with `README.md` stubs that carry only lifecycle rules
+4. `docs/agents-md-standard.md` if selected
+5. `.claude/settings.json` — last, references everything above. Merge into an existing file when present; never replace it
+For the AGENTS.md drafting (root and any subtree), use **Agent** with `subagent_type: harness-author` — the dedicated agent loads the harness-engineering standard and verifies its output against the 12-check rubric before returning. Pass mission, target path, and the survey output. The skill writes the body to disk; the agent does not write files itself. For unrelated tasks beyond doctrine adaptation (e.g. content scraping a port source), fall back to `general-purpose`.
+Plugin scoping: for project scope, merge `.claude/settings.json`. Marketplace `add` commands are interactive in Claude Code — record them as a "First-time setup" section in `AGENTS.md` for collaborators to run once on clone.
+**Exit:** Files written, no source code touched.
+---
+## Phase 5: Verify and Handoff
+**Entry:** Files written.
+Print a manifest grouped by surface — every file created or modified, with a one-line purpose. Suggest a single branch (`harness-up/scaffold`) so the doctrine lands as one reviewable diff. Then **AskUserQuestion** (header: "Next"): *done — commit*, *run `/cc-lab:cc-lab-diagnose`* (if cc-lab was selected), *adjust a section*, *generate a subtree AGENTS.md*, *open a brainstorm for an unfilled section*.
+**Exit:** Manifest delivered, next move chosen.
+---
+## Validate
+Before delivering, verify each:
+- [ ] No source files, tests, or dependency installs were touched
+- [ ] No existing file was overwritten without a merge step
+- [ ] Root `AGENTS.md` is ≤ 180 lines (`map_not_dump` heuristic)
+- [ ] All markdown links inside written docs are repo-relative
+- [ ] Plugin scope matches the user's Phase 3 selection — not silently both
+- [ ] Brownfield path preserved existing team-specific sections verbatim
+- [ ] Brainstorm-route exits left no partial doctrine on disk
+- [ ] `harness-author` self-verification passed before its draft was written
+## Knowledge References
+- `../agents/harness-author.md` — dedicated drafting agent dispatched in Phase 4 for AGENTS.md generation. Already loads the harness-engineering rubric.
+- `../knowledge/harness-doctrine.md` — AGENTS.md shape, plan lifecycle, ADR template, solutions format. Create on first run if missing.
+- `../knowledge/plugin-catalog.md` — marketplaces, install commands, project-vs-user scoping rules. Create on first run if missing.
+- Reference repos for templates: `~/projects/Bobo/harness-lab/AGENTS.md`, `~/projects/Bobo/harness-lab/docs/agents-md-standard.md`, `~/projects/Bobo/harness-lab/docs/plan-lifecycle-standard.md`.
+- Sibling skill: `cc-lab:cc-lab-diagnose` — use it to *observe* before this skill *changes*.
+## What Makes This Heart of Gold
+The smallest durable surface, not the longest prompt. Surveys before generating. Matches harness-engineering vocabulary exactly so `/cc-lab-diagnose` and `marvin:compound` work the day after install.