harnessed 4.3.0 → 4.5.0

package/README.md CHANGED
@@ -11,12 +11,31 @@
  
  > Not affiliated with, endorsed by, or sponsored by Harness Inc. (see [NOTICE](./NOTICE))
  
+ > **How it compares** to [comet](https://github.com/rpamis/comet) and [Trellis](https://github.com/mindfold-ai/Trellis) — an honest, snapshot-dated comparison (including where harnessed lags): [`docs/comparison.md`](./docs/comparison.md).
+
  ---
  
  ## ✨ TL;DR
  
  **Best-practice orchestration for Harness Engineering on Claude Code** — assembles the best open-source Claude Code ecosystem components, weaving them into a unified workflow via opinionated composition skills; does not vendor upstream code — manifests describe install/check, and composition skills orchestrate multi-upstream collaboration.
  
+ ### 🔁 The operating loop
+
+ > **Discuss → Plan → Build → Verify → Ship → Learn** — one repeatable loop, machine-executed across the three-layer stack (gstack governance · GSD orchestration · superpowers TDD · checkpoint evidence). Raw agent work drifts; harnessed turns it into a source-of-truth path where progress, evidence, and learnings persist instead of living in chat.
+
+ ```mermaid
+ flowchart LR
+ R(["⓪ Research<br/>multi-source investigate<br/>(optional)"]):::opt --> D
+ D(["① Discuss<br/>3-layer clarify"]) --> P(["② Plan<br/>persist spec + tasks"])
+ P --> T(["③ Task<br/>TDD build + checkpoint"])
+ T --> V(["④ Verify<br/>independent review + evidence gate"])
+ V --> S(["⑤ Ship<br/>release-preflight → tag-ready (publish via CI)"])
+ S --> L(["⑥ Retro<br/>capture learnings → next session smarter"])
+ V -. "fail / gap" .-> T
+ L -. "next requirement" .-> D
+ classDef opt stroke-dasharray:5,opacity:0.8
+ ```
+
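Editor's illustration: the loop in the flowchart above can be read as a small transition table. The sketch below is only a reading aid under that assumption. `TRANSITIONS` and `nextStage` are hypothetical names that do not appear in harnessed; the stage labels and the two dashed edges (Verify back to Task on failure, Retro back to Discuss) are taken from the diagram.

```javascript
// Hypothetical sketch, not harnessed code: the operating loop as a transition table.
// Keys are stage names from the flowchart; values map an outcome to the next stage.
const TRANSITIONS = {
  research: { ok: 'discuss' }, // optional entry point
  discuss: { ok: 'plan' },
  plan: { ok: 'task' },
  task: { ok: 'verify' },
  verify: { ok: 'ship', fail: 'task' }, // "fail / gap" edge loops back to Task
  ship: { ok: 'retro' },
  retro: { ok: 'discuss' }, // "next requirement" edge restarts the loop
}

// Return the stage that follows `stage` for the given outcome ('ok' or 'fail').
function nextStage(stage, outcome = 'ok') {
  const edges = TRANSITIONS[stage]
  if (!edges) throw new Error(`unknown stage: ${stage}`)
  const next = edges[outcome]
  if (!next) throw new Error(`stage '${stage}' has no '${outcome}' edge`)
  return next
}
```

In this reading, Verify is the only stage with a failure edge, which matches the single dashed "fail / gap" arrow in the diagram.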
  ---
  
  > Wait — can harnessed really go toe-to-toe with upstream giants like superpowers / gstack / GSD?
@@ -30,10 +49,10 @@
  
  - **Three-layer stack machine-executed** — `gstack governance` + `GSD project manager` + `superpowers senior engineer` + `karpathy 4 principles` + `mattpocock 23 moves`, 5 pillars at 100% capture
  - **No vendoring of upstream** — manifests describe install/check; on upstream upgrade users just re-install to get the latest version
- - **Composition Skill** — in-house workflow skills act as the conductor's baton, orchestrating multiple upstreams in concert. **1 super-master `/auto` + 4 stage masters + 18 sub-workflows + 2 standalones = 25 namespace-layered workflows**, full 4-stage machine-execution (`/auto` one-shot across stages / `/discuss /plan /task /verify` single stage / 18 three-layer-stack subs / `/research /retro` 2 standalones)
+ - **Composition Skill** — in-house workflow skills act as the conductor's baton, orchestrating multiple upstreams in concert. **1 super-master `/auto` + 5 stage masters + 19 sub-workflows + 2 standalones = 27 namespace-layered workflows**, full 5-stage machine-execution (`/auto` one-shot across stages / `/discuss /plan /task /verify /ship` single stage / 19 three-layer-stack subs / `/research /retro` 2 standalones)
  - **L0 Discipline Substrate** — global cross-stage behavior baseline (karpathy principles + output-style + language + operational + priority + protocols), applied universally
  - **Package manager mindset** — install dependency graph auto-resolves, doctor health check, install-base one-shot full install
- - **Unified entry point** — users face `/discuss /plan /task /verify` master slash commands without learning each upstream's terminology; sub commands explicitly invoke a single stage (e.g. `/discuss-strategic` runs only the strategic-layer clarification)
+ - **Unified entry point** — users face `/discuss /plan /task /verify /ship` master slash commands without learning each upstream's terminology; sub commands explicitly invoke a single stage (e.g. `/discuss-strategic` runs only the strategic-layer clarification)
  
  ---
  
@@ -90,14 +109,14 @@ In order of increasing user intervention:
  /discuss-phase "..." # Run only Phase-layer clarification
  /plan-architecture "..." # Run only architecture review
  /verify-paranoid "..." # Run only the Paranoid Staff Engineer review
- # ... pick any of the other 18 sub-workflows
+ # ... pick any of the other 19 sub-workflows
  ```
  
  > "I'm an expert, I'll decide myself" — skip the master, invoke a sub-workflow directly. Suits advanced users who know exactly which sub they need, or reuse of a single step.
  
  ---
  
- ## 📐 4-Stage Flow Diagram
+ ## 📐 5-Stage Flow Diagram
  
  ```mermaid
  graph TD
@@ -135,16 +154,21 @@ graph TD
  VM[verify-multispec]
  VMs --> VP & VC & VPa & VQ & VS & VD & VSi & VM
  end
- RT([⑤ /retro<br/>milestone summary, optional]):::optional
+ subgraph Ship[⑤ Ship<br/>Release]
+ SMs[/ship master/]
+ SP[ship-preflight]
+ SMs --> SP
+ end
+ RT([⑥ /retro — milestone summary, optional]):::optional
  RS --> Discuss
- Discuss --> Plan --> Task --> Verify
- Verify --> RT
+ Discuss --> Plan --> Task --> Verify --> Ship
+ Ship --> RT
  classDef optional stroke-dasharray:5 5,fill:#f5f5f5,color:#666
  ```
  
- > Dashed boxes = optional standalones (`/research` pre-strategic investigation / `/retro` post-milestone summary); solid boxes = main 4-stage cadence.
+ > Dashed boxes = optional standalones (`/research` pre-strategic investigation / `/retro` post-milestone summary); solid boxes = main 5-stage cadence (Ship stops at tag-ready; `publish.yml` CI does the actual publish).
  
- ### 25-Workflow Overview Table
+ ### 27-Workflow Overview Table
  
  | Slash cmd | Stage | Type | Capability / Upstream | Brief |
  |-----------|-------|------|----------------------|-------|
@@ -170,6 +194,8 @@ graph TD
  | `/verify-design` | ④ Verify | Sub | gstack `/design-review` + ui-ux-pro-max + frontend-design | Design system consistency (has_design_changes conditional) |
  | `/verify-simplify` | ④ Verify | Sub | `code-simplifier` | Final serial simplification |
  | `/verify-multispec` | ④ Verify | Sub | 4-specialist Agent Team Pattern C | Critical release / large refactor PR escalation (mutual SendMessage cross-examination) |
+ | `/ship` | ⑤ Ship | Master | masterOrchestrator | Release stage after Verify — preflight → delegate PR/deploy to gstack `/ship` → publish via CI (tag-ready boundary) |
+ | `/ship-preflight` | ⑤ Ship | Sub | `harnessed release-preflight` | Read-only release-readiness gate (CHANGELOG `[Unreleased]` / version / git-clean / tag-absent); blocks on failure |
  | `/research` | Standalone | Standalone | Tavily / Exa MCP + ctx7 + GSD `/gsd-discuss-phase` | Multi-source investigation (Stage ① alternate) |
  | `/retro` | Standalone | Standalone | gstack `/retro` + planning-with-files RETROSPECTIVE.md | Project / milestone close-out summary |
  
@@ -180,11 +206,11 @@ graph TD
  
  ## ⚡ Usage Flow
  
- 4-stage three-layer-stack methodology — recommended driving via the 4 master orchestrators in series:
+ 5-stage three-layer-stack methodology — recommended driving via the 5 master orchestrators in series:
  
  ```
- /discuss → /plan → /task → /verify
- ① ② ③ ④
+ /discuss → /plan → /task → /verify → /ship
+ ① ② ③ ④
  ```
  
  | Stage | Master | Main sub-workflows | Upstream collaboration |
@@ -193,6 +219,7 @@ graph TD
  | ② **Plan** | `/plan` | architecture (conditional) → phase | gstack `/plan-eng-review` + GSD `/gsd-plan-phase` + planning-with-files |
  | ③ **Task** | `/task` | clarify → code → test → deliver (4 serial per subtask) | karpathy principles + mattpocock moves + superpowers TDD + `ralph-loop` |
  | ④ **Verify** | `/verify` | progress → 5 parallel conditional → simplify (+ multispec critical) | GSD `/gsd-verify-work` + code-review + gstack `/review` / `/qa` / `/cso` / `/design-review` + code-simplifier |
+ | ⑤ **Ship** | `/ship` | preflight (release-readiness gate) → delegate PR/deploy | `harnessed release-preflight` + gstack `/ship` + `publish.yml` CI (tag-ready boundary) |
  
  Practical example:
  
@@ -200,11 +227,12 @@ Practical example:
  # 1. Install workflow upstreams (one line installs gstack + GSD + superpowers + planning-with-files)
  harnessed setup
  
- # 2. Run the 4-stage cadence inside Claude Code
+ # 2. Run the 5-stage cadence inside Claude Code
  /discuss "new feature X" # Strategic + Phase + Subtask 3-layer clarification
  /plan "new feature X" # Architecture (conditional) + plan (task graph persisted)
  /task "subtask-1: API contract" # 4 subs serial per subtask
  /verify "phase-1" # 7 subs conditional
+ /ship # release-preflight gate → PR/deploy (tag-ready; publish via CI)
  
  # 3. Resume after interruption (any time)
  harnessed resume
@@ -216,14 +244,14 @@ harnessed resume
  
  ---
  
- ## 🗂️ Architecture (4-stage namespace-layered)
+ ## 🗂️ Architecture (5-stage namespace-layered)
  
  ### 1. Directory Structure
  
  ```
  harnessed/
  ├── manifests/ # L1: upstream description layer (NOT vendored)
- ├── workflows/ # L6: composition skills (4-stage conductor's baton)
+ ├── workflows/ # L6: composition skills (5-stage conductor's baton)
  │ ├── discuss/ # Stage ① 3 layers (strategic + phase + subtask)
  │ │ ├── auto/ # /discuss master gate-route
  │ │ ├── strategic/ # /discuss-strategic (gstack /office-hours + /plan-ceo-review)
@@ -232,9 +260,10 @@ harnessed/
  │ ├── plan/ # Stage ② (architecture + phase task graph)
  │ ├── task/ # Stage ③ (clarify + code + test + deliver)
  │ ├── verify/ # Stage ④ (progress + code-review + paranoid + qa + cso + design + simplify + multispec)
+ │ ├── ship/ # Stage ⑤ (preflight release-readiness gate → delegate PR/deploy to gstack /ship; tag-ready)
  │ ├── research/ # standalone Stage ① alternate
- │ ├── retro/ # standalone post-④ milestone close
- │ ├── capabilities.yaml # L5a: ~70 entries, 7 categories SoT
+ │ ├── retro/ # standalone post-⑤ milestone close
+ │ ├── capabilities.yaml # L5a: ~100 entries, 7 categories SoT
  │ ├── defaults.yaml # ralph_max_iterations per workflow phase
  │ ├── judgments/ # L5a: three-layer-stack criteria + parallelism + tdd + fallback + rules-routing
  │ │ ├── strategic-gate.yaml
@@ -268,7 +297,7 @@ harnessed/
  ```
  ┌────────────────────────────────────────────────────────────┐
  │ L7 User-facing slash cmd + harnessed CLI │
- │ /discuss /plan /task /verify (master) + 18 sub + /research /retro + /auto super-master
+ │ /discuss /plan /task /verify /ship (master) + 19 sub + /research /retro + /auto super-master
  │ + direct gstack invoke (30+ optional): /office-hours /review /qa /...
  ├────────────────────────────────────────────────────────────┤
  │ L6 Workflow orchestration (workflows/<stage>/<sub>/) │
@@ -296,7 +325,7 @@ harnessed/
  └────────────────────────────────────────────────────────────┘
  ```
  
- ### 3. Cross-cutting Capabilities (capabilities.yaml — 7 categories, ~83 entries)
+ ### 3. Cross-cutting Capabilities (capabilities.yaml — 7 categories, ~100 entries)
  
  ```
  behavioral (6): karpathy-guidelines + output-style + language + operational + priority + protocols
@@ -450,7 +479,7 @@ Think `brew install <formula>` pulling the full dependency set — you don't nee
  | Orchestration | GSD | High-level phase task graph + dependency analysis |
  | Persistence | planning-with-files | Persists `task_plan.md` / `progress.md` / `findings.md` |
  
- `/discuss /plan /task /verify` — the 4 masters string the 4 stages together; each master internally delegates to its sub. Each stage does a different thing and feeds the next. **No merging**.
+ `/discuss /plan /task /verify /ship` — the 5 masters string the 5 stages together; each master internally delegates to its sub. Each stage does a different thing and feeds the next. **No merging**.
  
  </details>
  
@@ -0,0 +1,157 @@
+ #!/usr/bin/env node
+ // G4 UserPromptSubmit hook — print the per-turn injection for the active harnessed
+ // workflow: a <workflow-state> breadcrumb + (Phase 17) a relevance-filtered
+ // <project-context> block (recent, phase/sub-relevant learnings from the repo's
+ // .planning/LEARNINGS.md + the current phase's CONTEXT.md excerpt). Silent exit 0
+ // on any error (fail-soft — a hook must never block the prompt).
+ //
+ // Self-contained plain JS (no project imports, no subprocess, no LLM) for hot-path
+ // speed. This MUST stay equivalent to src/checkpoint/injectState.ts `buildInjection`
+ // — the parity test in tests/checkpoint/injectState.test.ts runs this file and
+ // compares its stdout to the TS builder.
+ //
+ // Phase 15 repo-aware: resolves the active repo's slot from
+ // workflows.json[repoKey(cwd)] (legacy current-workflow.json as a fallback).
+ // Root: HARNESSED_ROOT_OVERRIDE if set, else <homedir>/.claude/harnessed.
+ import { existsSync, readdirSync, readFileSync } from 'node:fs'
+ import { homedir } from 'node:os'
+ import { dirname, join, resolve } from 'node:path'
+
+ const DEFAULT_INJECT_BUDGET = 1500
+ const tok = (s) => Math.ceil(Buffer.byteLength(s, 'utf8') / 4)
+
+ function repoKey(cwd) {
+   let dir = resolve(cwd)
+   for (;;) {
+     if (existsSync(join(dir, '.git'))) return dir
+     const parent = dirname(dir)
+     if (parent === dir) break
+     dir = parent
+   }
+   return resolve(cwd)
+ }
+
+ function harnessedRoot() {
+   const override = process.env.HARNESSED_ROOT_OVERRIDE
+   return override !== undefined && override !== ''
+     ? override
+     : join(homedir(), '.claude', 'harnessed')
+ }
+
+ // workflows.json[repoKey] first, then the legacy singleton (dual-write anchor).
+ function readWorkflow(root, key) {
+   try {
+     const store = JSON.parse(readFileSync(join(root, 'workflows.json'), 'utf8'))
+     if (store && store.workflows && store.workflows[key]) return store.workflows[key]
+   } catch {}
+   try {
+     return JSON.parse(readFileSync(join(root, 'current-workflow.json'), 'utf8'))
+   } catch {}
+   return null
+ }
+
+ function workflowStateBlock(wf) {
+   const ledger = wf.sub_progress ?? []
+   const next = ledger.find((e) => e.status === 'pending')?.sub ?? null
+   const lines = [
+     '<workflow-state>',
+     `phase: ${wf.phase}`,
+     `status: ${wf.status}`,
+     next ? `next: ${next}` : 'next: (none — all subs resolved)',
+   ]
+   for (const e of ledger) {
+     if ((e.fail_count ?? 0) >= 3)
+       lines.push(
+         `BREAK-LOOP: sub '${e.sub}' failed ${e.fail_count}x — stop retrying, run break-loop skill`,
+       )
+   }
+   lines.push('</workflow-state>')
+   return lines.join('\n')
+ }
+
+ function parseLearnings(md) {
+   const blocks = md.split(/^### /m).slice(1)
+   return blocks.map((b) => {
+     const raw = `### ${b}`.trimEnd()
+     const phase = /phase (\S+)/.exec(b)?.[1] ?? ''
+     const subs = []
+     for (const m of b.matchAll(/^- (?:looped|rejected|failed): (\S+)/gm)) subs.push(m[1])
+     return { raw, phase, subs }
+   })
+ }
+
+ function filterRelevant(entries, phase, ledgerSubs) {
+   const rel = entries.filter((e) => e.phase === phase || e.subs.some((s) => ledgerSubs.includes(s)))
+   const ordered = [...rel].reverse()
+   if (ordered.length === 0 && entries.length > 0) return [entries[entries.length - 1]]
+   return ordered
+ }
+
+ function selectWithinBudget(entries, budget) {
+   const out = []
+   let acc = 0
+   for (const e of entries) {
+     const cost = tok(e.raw)
+     if (acc + cost > budget) break
+     acc += cost
+     out.push(e)
+   }
+   return out
+ }
+
+ function findPhaseContextExcerpt(repoRoot, phase, budget) {
+   try {
+     const phasesDir = join(repoRoot, '.planning', 'phases')
+     if (!existsSync(phasesDir)) return null
+     for (const dir of readdirSync(phasesDir)) {
+       const num = /^(\d+)/.exec(dir)?.[1]
+       if (!num || !phase.includes(num)) continue
+       const ctxFile = join(phasesDir, dir, `${num}-CONTEXT.md`)
+       if (!existsSync(ctxFile)) continue
+       const body = readFileSync(ctxFile, 'utf8')
+       const goalIdx = body.indexOf('## Goal')
+       const slice = goalIdx >= 0 ? body.slice(goalIdx) : body
+       const nextH = slice.indexOf('\n## ', 1)
+       const excerpt = (nextH > 0 ? slice.slice(0, nextH) : slice).trim()
+       return excerpt.length > budget * 4 ? excerpt.slice(0, budget * 4) : excerpt
+     }
+   } catch {}
+   return null
+ }
+
+ function projectContextBlock(learnings, contextExcerpt) {
+   const parts = []
+   for (const l of learnings) parts.push(l.raw.trim())
+   if (contextExcerpt) parts.push(contextExcerpt.trim())
+   if (parts.length === 0) return ''
+   return ['<project-context>', ...parts, '</project-context>'].join('\n')
+ }
+
+ try {
+   const root = harnessedRoot()
+   const key = repoKey(process.cwd())
+   const wf = readWorkflow(root, key)
+   if (!wf) process.exit(0)
+
+   const budget = Number(process.env.HARNESSED_INJECT_BUDGET) || DEFAULT_INJECT_BUDGET
+   const ws = workflowStateBlock(wf)
+
+   let learningsMd = ''
+   try {
+     learningsMd = readFileSync(join(key, '.planning', 'LEARNINGS.md'), 'utf8')
+   } catch {}
+
+   const ledgerSubs = (wf.sub_progress ?? []).map((e) => e.sub)
+   const rel = selectWithinBudget(
+     filterRelevant(parseLearnings(learningsMd), wf.phase, ledgerSubs),
+     budget,
+   )
+   const used = rel.reduce((a, e) => a + tok(e.raw), 0)
+   const ctx = findPhaseContextExcerpt(key, wf.phase, Math.max(0, budget - used))
+   const pc = projectContextBlock(rel, ctx ?? undefined)
+
+   process.stdout.write(`${pc ? `${ws}\n${pc}` : ws}\n`)
+ } catch {
+   // no state / corrupt / not a harnessed session -> inject nothing
+ }
+ process.exit(0)
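Editor's illustration: the learnings pipeline in the hook above (parse, filter by relevance, trim to a token budget) may be easier to follow on a tiny input. The four helpers below are copied from the hook with indentation restored. The sample text, the phase labels `p16` to `p18`, and the sub names are invented for this example; the real layout of `.planning/LEARNINGS.md` is only inferred here from the regular expressions in the hook.

```javascript
// Helpers copied from the hook above; `Buffer` is a Node.js global.
const tok = (s) => Math.ceil(Buffer.byteLength(s, 'utf8') / 4)

function parseLearnings(md) {
  const blocks = md.split(/^### /m).slice(1)
  return blocks.map((b) => {
    const raw = `### ${b}`.trimEnd()
    const phase = /phase (\S+)/.exec(b)?.[1] ?? ''
    const subs = []
    for (const m of b.matchAll(/^- (?:looped|rejected|failed): (\S+)/gm)) subs.push(m[1])
    return { raw, phase, subs }
  })
}

function filterRelevant(entries, phase, ledgerSubs) {
  const rel = entries.filter((e) => e.phase === phase || e.subs.some((s) => ledgerSubs.includes(s)))
  const ordered = [...rel].reverse()
  if (ordered.length === 0 && entries.length > 0) return [entries[entries.length - 1]]
  return ordered
}

function selectWithinBudget(entries, budget) {
  const out = []
  let acc = 0
  for (const e of entries) {
    const cost = tok(e.raw)
    if (acc + cost > budget) break
    acc += cost
    out.push(e)
  }
  return out
}

// Invented sample: three entries, oldest first (an assumption about file order).
const sample = [
  '### first entry, phase p16',
  '- looped: verify-qa',
  '',
  '### second entry, phase p17',
  '- failed: task-test',
  '',
  '### third entry, phase p18',
  'free-form note',
].join('\n')

const entries = parseLearnings(sample)

// Current phase p17 with verify-qa in the ledger: entries two and one match,
// and the later entry in the file comes first.
const relevant = filterRelevant(entries, 'p17', ['verify-qa'])
console.log(relevant.map((e) => e.phase)) // → [ 'p17', 'p16' ]

// Each matching entry costs 12 estimated tokens, so a budget of 15 keeps only one.
const picked = selectWithinBudget(relevant, 15)
console.log(picked.map((e) => e.phase)) // → [ 'p17' ]

// Nothing matches: the filter falls back to the last entry in the file.
const fallback = filterRelevant(entries, 'p99', [])
console.log(fallback.map((e) => e.phase)) // → [ 'p18' ]
```

Two behaviours stand out on this input. The budget loop stops at the first entry that does not fit instead of skipping it and trying later ones, and the token cost is a byte-length estimate (bytes divided by four, rounded up), not a tokenizer count.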