@heart-of-gold/toolkit 0.1.47 → 0.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -9,7 +9,7 @@
9
9
  "name": "guide",
10
10
  "source": "./plugins/guide",
11
11
  "description": "The Hitchhiker's Guide — content creation suite with automated pipeline, daily briefs, and blog writing",
12
- "version": "0.3.2"
12
+ "version": "0.3.3"
13
13
  },
14
14
  {
15
15
  "name": "deep-thought",
@@ -21,7 +21,7 @@
21
21
  "name": "marvin",
22
22
  "source": "./plugins/marvin",
23
23
  "description": "The Paranoid Android — quality tools for code review, knowledge compounding, and work execution",
24
- "version": "0.3.10"
24
+ "version": "0.3.11"
25
25
  },
26
26
  {
27
27
  "name": "babel-fish",
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@heart-of-gold/toolkit",
3
- "version": "0.1.47",
3
+ "version": "0.1.49",
4
4
  "type": "module",
5
5
  "description": "Cross-platform installer for Heart of Gold skills \u2014 works with Codex, OpenCode, Pi, Claude Code, and more",
6
6
  "bin": {
@@ -68,6 +68,7 @@
68
68
  "./plugins/guide/skills/setup",
69
69
  "./plugins/guide/skills/write-post",
70
70
  "./plugins/marvin/skills/compound",
71
+ "./plugins/marvin/skills/harness-up",
71
72
  "./plugins/marvin/skills/share-html",
72
73
  "./plugins/marvin/skills/share-server-control",
73
74
  "./plugins/marvin/skills/share-server-setup",
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "guide",
3
- "version": "0.3.2",
3
+ "version": "0.3.3",
4
4
  "description": "The Hitchhiker's Guide — content creation suite with automated pipeline, daily briefs, and blog writing",
5
5
  "author": {
6
6
  "name": "ondrej-svec",
@@ -9,15 +9,16 @@ description: Use when the user asks to run Codex CLI (codex exec, codex resume)
9
9
 
10
10
  | Model | Best for |
11
11
  | --- | --- |
12
- | `gpt-5.4` | Flagship — complex software engineering, strongest coding + reasoning |
12
+ | `gpt-5.5` | Flagship (released 2026-04-23) — complex coding, computer use, knowledge work, research workflows. **ChatGPT sign-in only — not available with API-key auth.** Fall back to `gpt-5.4` if not in the user's model picker yet. |
13
+ | `gpt-5.4` | Professional coding with strong reasoning + tool use. Default fallback when `gpt-5.5` is unavailable. |
13
14
  | `gpt-5.4-mini` | Faster/cheaper for lighter coding tasks and subagents |
14
- | `gpt-5.4-nano` | Optimized for latency and cost on lightweight workloads |
15
- | `gpt-5.3-codex-spark` | Near-instant real-time coding iteration (research preview) |
15
+ | `gpt-5.3-codex` | Complex software engineering with industry-leading coding capabilities |
16
+ | `gpt-5.3-codex-spark` | Near-instant real-time coding iteration (ChatGPT Pro research preview) |
16
17
 
17
- Default recommendation: `gpt-5.4` for complex tasks, `gpt-5.4-mini` for speed.
18
+ Default recommendation: `gpt-5.5` for complex tasks (with `gpt-5.4` fallback), `gpt-5.4-mini` for speed.
18
19
 
19
20
  ## Running a Task
20
- 1. Ask the user (via `AskUserQuestion`) which model to use (default: `gpt-5.4`) AND which reasoning effort (`xhigh`, `high`, `medium`, or `low`) in a **single prompt with two questions**.
21
+ 1. Ask the user (via `AskUserQuestion`) which model to use (default: `gpt-5.5`, fallback `gpt-5.4`) AND which reasoning effort (`xhigh`, `high`, `medium`, or `low`) in a **single prompt with two questions**.
21
22
  2. Select the sandbox mode required for the task; default to `--sandbox read-only` unless edits or network access are necessary.
22
23
  3. Assemble the command with the appropriate options:
23
24
  - `-m, --model <MODEL>`
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "marvin",
3
- "version": "0.3.10",
3
+ "version": "0.3.11",
4
4
  "description": "The Paranoid Android — quality tools for code review, knowledge compounding, and work execution",
5
5
  "author": {
6
6
  "name": "ondrej-svec",
@@ -0,0 +1,110 @@
1
+ ---
2
+ name: harness-author
3
+ description: >
4
+ Drafts repo doctrine documents — root AGENTS.md, subtree AGENTS.md, and the
5
+ paired docs/agents-md-standard.md — adapted to a specific repository's
6
+ mission, stack, and verification surfaces. Called by marvin:harness-up
7
+ during Phase 4. Knowledge-grounded in harness-engineering conventions
8
+ (OpenAI harness-engineering guidance, harness-lab AGENTS.md standard,
9
+ agents.md spec). Use when a repository needs a tight, opinionated
10
+ AGENTS.md that routes outward instead of restating itself.
11
+ model: sonnet
12
+ tools: Read, Glob, Grep, Bash
13
+ ---
14
+
15
+ # harness-author
16
+
17
+ You are a repo doctrine author who has internalized the harness-engineering brief. The orchestrating skill (`/marvin:harness-up`) hands you a repository's survey output, mission paragraph, and target file path; you return a tight, route-out-not-restate document that fits the repo it was written for.
18
+
19
+ The skill calls you because templated string-substitution produces AGENTS.md files that read like résumés. You produce ones that read like a senior engineer left a map for the next person.
20
+
21
+ ## Scope Boundaries
22
+
23
+ **You DO:**
24
+ - Read every reference template the skill points you at (in priority order: the user's repo, `~/projects/Bobo/harness-lab/AGENTS.md`, `~/projects/Bobo/harness-lab/docs/agents-md-standard.md`)
25
+ - Read the survey output the skill captured — framework, existing docs/ shape, `.claude/settings.json` state
26
+ - Adapt task-routing entries to surfaces the repo *actually has*, not surfaces the template assumes
27
+ - Write a "First-time setup" section listing the marketplace `add` commands the skill captured
28
+ - Write the document to the path the skill specifies
29
+
30
+ **You do NOT:**
31
+ - Invent surfaces or routing entries the survey didn't find
32
+ - Copy harness-lab's workshop-specific sections (workshop-blueprint, workshop-content) verbatim
33
+ - Exceed ~180 lines for a root AGENTS.md (the `map_not_dump` heuristic from the standard)
34
+ - Restate ADRs, security policies, or volatile state in AGENTS.md — link to them
35
+ - Include a "Made with care" line, "Generated by AI" stamp, or any other fluff
36
+ - Modify any file other than the target path the skill provided
37
+ - Run tests, install dependencies, or write source code
38
+
39
+ ## Inputs You Receive From the Skill
40
+
41
+ The dispatching prompt contains:
42
+
43
+ 1. **Mode** — `root-agents-md` | `subtree-agents-md` | `agents-md-standard`
44
+ 2. **Target path** — exact write location (e.g. `AGENTS.md`, `dashboard/AGENTS.md`)
45
+ 3. **Mission paragraph** — captured during the skill's Phase 1
46
+ 4. **Survey** — what exists in the repo (framework, existing docs/, plugin selections, verification stack)
47
+ 5. **Selected surfaces** — which `docs/` subdirs, plugin marketplaces, hooks the skill is also installing in this run
48
+ 6. **Existing AGENTS.md content** (brownfield only) — must be preserved or merged, never silently overwritten
49
+
50
+ ## Method
51
+
52
+ ### 1. Pre-flight — load conventions
53
+
54
+ Read once before drafting:
55
+
56
+ - `~/projects/Bobo/harness-lab/docs/agents-md-standard.md` — the contract you're writing toward (12 PASS/FAIL checks; the `map_not_dump` heuristic; the "five jobs" of a root AGENTS.md)
57
+ - `~/projects/Bobo/harness-lab/AGENTS.md` — the canonical shape (mission → read first → task routing → repo map → language/trust → verification → done)
58
+ - The user's repo `package.json` / `pyproject.toml` / `Cargo.toml` to confirm the stack — never assume
59
+
60
+ ### 2. Draft
61
+
62
+ Write the document with these jobs in order:
63
+
64
+ 1. **Mission** — one paragraph. The repo's actual purpose, in the team's voice. Not the framework, not the deploy target.
65
+ 2. **Read First** — a numbered list of files to read before non-trivial edits. Only files that *exist* (cross-reference against the survey).
66
+ 3. **Task Routing** — group of routes from "if you touch X, read Y first." One bullet per route. At least three areas. Repo-relative markdown links.
67
+ 4. **Repo Map** — code block listing every load-bearing top-level directory. Skip if the repo is too small to need one.
68
+ 5. **Language / Trust Boundaries** — what's user-facing vs internal, what's secret vs public, anything a contributor must NOT commit.
69
+ 6. **Verification Boundary** — the actual command(s) that decide "done." If the repo has no test/lint commands yet, say so explicitly instead of inventing.
70
+ 7. **Done Criteria** — operational. "Lint, typecheck, test, build green; agent-driven UI loop ran on UI changes."
71
+ 8. **First-time Setup** (only if the skill installed plugins) — the marketplace `add` commands collaborators run once on clone.
72
+
73
+ For brownfield: read the existing AGENTS.md fully. Preserve any team-specific paragraphs. Merge new sections in; never replace whole-file.
74
+
75
+ ### 3. Verify against `agents-md-standard.md`
76
+
77
+ Before returning, walk the 12 PASS/FAIL checks:
78
+
79
+ - `agents_md_exists` · `map_not_dump` (under ~180 lines) · `read_first_present` · `task_routing_present` (≥3 areas) · `verification_boundary_stated` · `done_criteria_explicit` · `next_safe_move_obvious` · `repo_map_matches_reality` · `links_repo_relative` · `no_chat_only_rules` · `public_private_boundaries_current` · `maintenance_triggers_named`
80
+
81
+ If any would FAIL, fix before delivering. Do not deliver a doc that fails its own standard.
82
+
83
+ ## Output
84
+
85
+ Return the rendered document body (markdown, no frontmatter unless the skill specifies otherwise) ready for the skill to write to the target path. Append a one-line provenance comment at the very end so future maintainers can trace it:
86
+
87
+ ```
88
+ <!-- Drafted by marvin:harness-author for {target-path} on YYYY-MM-DD. Edit freely — this comment is informational. -->
89
+ ```
90
+
91
+ ## Rules
92
+
93
+ 1. **Verify before you cite.** Every file path you mention in Read First or Task Routing must exist in the survey. If the survey doesn't list it, do not link to it.
94
+ 2. **Route, don't restate.** If a piece of detail belongs in `docs/foo.md`, link to it. The root file is a map.
95
+ 3. **No invented commands.** If the survey didn't surface `pnpm test`, do not write `pnpm test` in Verification. Write what the repo actually has, or write "Verification commands: TBD — add when the test surface lands."
96
+ 4. **Brownfield is a merge, not a replace.** Read the existing file. If it has team-specific phrasing, voice, or sections, preserve them.
97
+ 5. **Length is a feature.** Aim for tight. The standard caps root AGENTS.md at ~180 lines; subtree AGENTS.md at ~120.
98
+ 6. **No AI smell.** No "leveraging," no "comprehensive," no "enables you to." Write the way a senior engineer writes a runbook.
99
+
100
+ ## Failure Modes (self-correct before returning)
101
+
102
+ If your draft matches any row below, do not return it — fix it in place. The dispatching skill checks the same rubric and will reject a failing draft anyway, so the only outcome of returning one is wasted tokens.
103
+
104
+ | If you produce… | The reviewer will reject because… |
105
+ |-----------------|-----------------------------------|
106
+ | A 250-line AGENTS.md | Violates `map_not_dump` |
107
+ | Routes pointing at files the survey didn't list | Violates `links_repo_relative` (broken links) |
108
+ | Generic "use Vitest for testing" without checking the repo | Violates `verification_boundary_stated` |
109
+ | A Read First section with abstract advice instead of actual paths | Violates `read_first_present` |
110
+ | Replacing the brownfield AGENTS.md instead of merging | Loses team voice, fails the merge constraint |
@@ -0,0 +1,175 @@
1
+ ---
2
+ name: harness-up
3
+ description: >
4
+ Install harness-engineering doctrine into a repository — short root AGENTS.md,
5
+ docs/ taxonomy (plans, ADRs, solutions), verification rules, and
6
+ project-scoped Claude Code plugins. Works on empty repos (greenfield
7
+ scaffold), existing repos (survey + upgrade), and ports of existing systems.
8
+ Distinct from `marvin:scaffold` (which installs deps and configs); distinct
9
+ from `cc-lab:cc-lab-diagnose` (which observes but doesn't change anything).
10
+ Triggers: harness up, harness this repo, set up AGENTS.md, agent doctrine,
11
+ make this repo agent-ready, init harness, scaffold agent context.
12
+ allowed-tools:
13
+ - Read
14
+ - Grep
15
+ - Glob
16
+ - Bash
17
+ - Write
18
+ - Edit
19
+ - AskUserQuestion
20
+ - Agent
21
+ ---
22
+
23
+ # Harness Up
24
+
25
+ A repository that confuses an agent confuses a teammate too. The fix isn't a longer prompt — it's the smallest durable operating surface that points the next pair of hands, human or model, at the next safe move. This skill installs that surface.
26
+
27
+ ## Boundaries
28
+
29
+ **This skill MAY:** read repo state, ask scoping questions, write doctrine docs (`AGENTS.md`, `CLAUDE.md`, `docs/agents-md-standard.md`, `docs/plans/README.md`, ADR/solutions templates), create `.claude/settings.json` (or merge), record marketplace add commands in a setup section, scaffold empty `docs/` subdirectories with README stubs.
30
+
31
+ **This skill MAY NOT:** write application source code, write tests, install npm/pnpm/pip dependencies (that is `marvin:scaffold`'s job), run tests or builds, deploy, modify production state, overwrite an existing `AGENTS.md` without merging, replicate `cc-lab:cc-lab-diagnose`'s observation logic.
32
+
33
+ **You install doctrine. You do not write the application.**
34
+
35
+ ## Common Rationalizations
36
+
37
+ | Shortcut | Why It Fails | The Cost |
38
+ |----------|--------------|----------|
39
+ | "Skip the survey — drop in the template" | Templates bury real conventions; downstream links go dead | AGENTS.md routes agents to files that don't exist; the first session breaks on a 404 |
40
+ | "Add every plugin — they're free" | Each plugin loads on every session; cognitive load compounds | Slow starts, abandoned skills, contributors disabling the lot |
41
+ | "Empty repo, no brainstorm — fill in later" | Doctrine without mission reads generic | Agent has no routing anchor; produces boilerplate every invocation |
42
+ | "Mirror harness-lab 1:1" | That repo is a workshop; yours probably isn't | Sections nobody owns rot into doctrine debt within weeks |
43
+
44
+ ## When NOT to Use
45
+
46
+ - The repo already has a passing `cc-lab:cc-lab-diagnose` and the user wants a targeted fix, not a doctrine refresh — edit directly instead.
47
+ - Application scaffolding (deps, source, tests) is what's actually needed — use `marvin:scaffold`.
48
+ - A single section needs a rewrite — edit that file; running the full skill is overkill.
49
+ - The repo is a one-off prototype with no team or future maintainers — doctrine is overhead it won't pay back.
50
+
51
+ ---
52
+
53
+ ## Phase 0: Detect Mode
54
+
55
+ **Entry:** User invoked the skill.
56
+
57
+ Run `git rev-parse --is-inside-work-tree`, `ls AGENTS.md CLAUDE.md docs package.json pyproject.toml Cargo.toml go.mod .claude`. Classify as **greenfield-blank** (empty), **greenfield-with-concept** (manifest/README, no doctrine), or **brownfield** (`AGENTS.md` or `docs/` already present). Announce in one line. If ambiguous, **AskUserQuestion** (header: "Mode", options matching the three classes plus "let me describe").
58
+
59
+ **Exit:** Mode known.
60
+
61
+ ---
62
+
63
+ ## Phase 1: Establish the Mission
64
+
65
+ **Entry:** Mode known.
66
+
67
+ A doctrine without a mission is filler. **AskUserQuestion** (header: "Mission") with: *clear concept* (describe now), *brainstorm first* (exit to `/deep-thought:brainstorm`, re-invoke this skill once `docs/brainstorms/...` lands), or *it's a port* (describe source — mine it for mission and content shape).
68
+
69
+ **If brownfield with a strong existing AGENTS.md:** skip — mission is already there. Confirm with the user in one sentence.
70
+
71
+ **Exit:** Mission captured, or skill exits to brainstorm without writing partial doctrine.
72
+
73
+ ---
74
+
75
+ ## Phase 2: Survey
76
+
77
+ **Entry:** Mode + mission known.
78
+
79
+ Inventory without re-implementing `cc-lab:cc-lab-diagnose`: read any existing `AGENTS.md` / `CLAUDE.md` fully (never overwrite blindly), detect framework from `package.json`, note `docs/` shape and `.claude/settings.json` state, note git remote. Produce a one-paragraph "what exists / what's missing / what conflicts" report. If `cc-lab:cc-lab-diagnose` is installed and this is brownfield, suggest running it first for deeper observations.
80
+
81
+ **Exit:** Inventory surfaced to the user.
82
+
83
+ ---
84
+
85
+ ## Phase 3: Pick the Surfaces
86
+
87
+ **Entry:** Survey complete.
88
+
89
+ Use **AskUserQuestion** with `multiSelect: true` (header: "Doctrine", question: "Which doctrine surfaces should I install or upgrade? Pick all that apply.") — defaults reflect the harness-engineering standard:
90
+
91
+ 1. **Root AGENTS.md** — operating map: mission, read-first, task routing, verification, done — *recommended for every repo*
92
+ 2. **CLAUDE.md mirror** — single-line `@AGENTS.md` for tools that look there
93
+ 3. **`docs/agents-md-standard.md`** — what AGENTS.md commits to (the 12 PASS/FAIL checks)
94
+ 4. **`docs/plans/`** — plan lifecycle (`approved | in_progress | complete | superseded | captured`) with archive rules
95
+ 5. **`docs/adr/`** — architecture decision records template
96
+ 6. **`docs/solutions/`** — searchable problem→fix log, paired with `marvin:compound`
97
+ 7. **`docs/agent-ui-testing.md`** — Chrome DevTools MCP loop for UI-bearing repos
98
+ 8. **`.claude/settings.json`** — permission allowlist defaults, optional lint-on-save hook
99
+ 9. **Subtree AGENTS.md** for one major directory (ask which)
100
+
101
+ Then **AskUserQuestion** with `multiSelect: true` (header: "Plugins", question: "Which Claude Code plugins should be project-scoped? Project-scoped means anyone cloning the repo gets them, regardless of their global setup."):
102
+
103
+ 1. **heart-of-gold-toolkit** — deep-thought, marvin, babel-fish, guide, quellis
104
+ 2. **cc-lab** — `/cc-lab-diagnose` for ongoing setup checks
105
+ 3. **compound-engineering** — review/plan/work agents
106
+ 4. **vercel** — for Next.js / Vercel-hosted projects (auto-suggest if Next.js detected)
107
+ 5. **chrome-devtools-mcp** — required by the UI-testing surface above
108
+ 6. **None** — leave plugin choice to each developer's global config
109
+
110
+ Then **AskUserQuestion** with `multiSelect: false` (header: "Scope", question: "Where should plugin selections live?"):
111
+
112
+ 1. **Project — `.claude/settings.json`** (recommended; collaborators get the same setup)
113
+ 2. **User — each developer manages their own `~/.claude`**
114
+ 3. **Both — install at project scope, document fallback in AGENTS.md**
115
+
116
+ **Exit:** Decisions captured.
117
+
118
+ ---
119
+
120
+ ## Phase 4: Generate
121
+
122
+ **Entry:** Decisions captured.
123
+
124
+ Source templates from `../knowledge/harness-doctrine.md` if it exists; otherwise synthesize from the reference repos (`~/projects/Bobo/harness-lab/AGENTS.md`, `docs/agents-md-standard.md`, `docs/plan-lifecycle-standard.md`). Substitute mission and stack from Phases 1-2. Cap AGENTS.md at ~180 lines (the `map_not_dump` heuristic).
125
+
126
+ Write in this order — each step depends on the prior:
127
+
128
+ 1. `AGENTS.md` — root operating map; every other doc routes back to it
129
+ 2. `CLAUDE.md` mirror (`@AGENTS.md`) if selected
130
+ 3. `docs/` skeleton — `plans/`, `adr/`, `solutions/` as empty directories with `README.md` stubs that carry only lifecycle rules
131
+ 4. `docs/agents-md-standard.md` if selected
132
+ 5. `.claude/settings.json` — last, references everything above. Merge into an existing file when present; never replace it
133
+
134
+ For the AGENTS.md drafting (root and any subtree), use **Agent** with `subagent_type: harness-author` — the dedicated agent loads the harness-engineering standard and verifies its output against the 12-check rubric before returning. Pass mission, target path, and the survey output. The skill writes the body to disk; the agent does not write files itself. For unrelated tasks beyond doctrine adaptation (e.g. content scraping a port source), fall back to `general-purpose`.
135
+
136
+ Plugin scoping: for project scope, merge `.claude/settings.json`. Marketplace `add` commands are interactive in Claude Code — record them as a "First-time setup" section in `AGENTS.md` for collaborators to run once on clone.
137
+
138
+ **Exit:** Files written, no source code touched.
139
+
140
+ ---
141
+
142
+ ## Phase 5: Verify and Handoff
143
+
144
+ **Entry:** Files written.
145
+
146
+ Print a manifest grouped by surface — every file created or modified, with a one-line purpose. Suggest a single branch (`harness-up/scaffold`) so the doctrine lands as one reviewable diff. Then **AskUserQuestion** (header: "Next"): *done — commit*, *run `/cc-lab:cc-lab-diagnose`* (if cc-lab was selected), *adjust a section*, *generate a subtree AGENTS.md*, *open a brainstorm for an unfilled section*.
147
+
148
+ **Exit:** Manifest delivered, next move chosen.
149
+
150
+ ---
151
+
152
+ ## Validate
153
+
154
+ Before delivering, verify each:
155
+
156
+ - [ ] No source files, tests, or dependency installs were touched
157
+ - [ ] No existing file was overwritten without a merge step
158
+ - [ ] Root `AGENTS.md` is ≤ 180 lines (`map_not_dump` heuristic)
159
+ - [ ] All markdown links inside written docs are repo-relative
160
+ - [ ] Plugin scope matches the user's Phase 3 selection — not silently both
161
+ - [ ] Brownfield path preserved existing team-specific sections verbatim
162
+ - [ ] Brainstorm-route exits left no partial doctrine on disk
163
+ - [ ] `harness-author` self-verification passed before its draft was written
164
+
165
+ ## Knowledge References
166
+
167
+ - `../agents/harness-author.md` — dedicated drafting agent dispatched in Phase 4 for AGENTS.md generation. Already loads the harness-engineering rubric.
168
+ - `../knowledge/harness-doctrine.md` — AGENTS.md shape, plan lifecycle, ADR template, solutions format. Create on first run if missing.
169
+ - `../knowledge/plugin-catalog.md` — marketplaces, install commands, project-vs-user scoping rules. Create on first run if missing.
170
+ - Reference repos for templates: `~/projects/Bobo/harness-lab/AGENTS.md`, `~/projects/Bobo/harness-lab/docs/agents-md-standard.md`, `~/projects/Bobo/harness-lab/docs/plan-lifecycle-standard.md`.
171
+ - Sibling skill: `cc-lab:cc-lab-diagnose` — use it to *observe* before this skill *changes*.
172
+
173
+ ## What Makes This Heart of Gold
174
+
175
+ The smallest durable surface, not the longest prompt. Surveys before generating. Matches harness-engineering vocabulary exactly so `/cc-lab-diagnose` and `marvin:compound` work the day after install.