@harness-engineering/cli 1.12.0 → 1.13.0
- package/dist/agents/skills/claude-code/harness-autopilot/SKILL.md +57 -9
- package/dist/agents/skills/claude-code/harness-brainstorming/SKILL.md +1 -1
- package/dist/agents/skills/claude-code/harness-code-review/SKILL.md +19 -2
- package/dist/agents/skills/claude-code/harness-execution/SKILL.md +39 -12
- package/dist/agents/skills/claude-code/harness-planning/SKILL.md +28 -11
- package/dist/agents/skills/claude-code/harness-roadmap/SKILL.md +34 -0
- package/dist/agents/skills/claude-code/harness-verification/SKILL.md +42 -0
- package/dist/agents/skills/gemini-cli/harness-autopilot/SKILL.md +57 -9
- package/dist/agents/skills/gemini-cli/harness-brainstorming/SKILL.md +1 -1
- package/dist/agents/skills/gemini-cli/harness-code-review/SKILL.md +19 -2
- package/dist/agents/skills/gemini-cli/harness-execution/SKILL.md +39 -12
- package/dist/agents/skills/gemini-cli/harness-planning/SKILL.md +28 -11
- package/dist/agents/skills/gemini-cli/harness-roadmap/SKILL.md +34 -0
- package/dist/agents/skills/gemini-cli/harness-verification/SKILL.md +42 -0
- package/dist/{agents-md-KIS2RSMG.js → agents-md-P2RHSUV7.js} +1 -1
- package/dist/{architecture-AJAUDRQQ.js → architecture-ESOOE26S.js} +2 -2
- package/dist/bin/harness-mcp.js +10 -10
- package/dist/bin/harness.js +12 -12
- package/dist/{check-phase-gate-K7QCSYRJ.js → check-phase-gate-S2MZKLFQ.js} +2 -2
- package/dist/{chunk-2SWJ4VO7.js → chunk-2VU4MFM3.js} +4 -4
- package/dist/{chunk-ZU2UBYBY.js → chunk-3KOLLWWE.js} +1 -1
- package/dist/{chunk-EAURF4LH.js → chunk-5VY23YK3.js} +1 -1
- package/dist/{chunk-747VBPA4.js → chunk-7KQSUZVG.js} +96 -50
- package/dist/{chunk-FLOEMHDF.js → chunk-7PZWR4LI.js} +3 -3
- package/dist/{chunk-AE2OWWDH.js → chunk-KELT6K6M.js} +590 -253
- package/dist/{chunk-TJVVU3HB.js → chunk-LD3DKUK5.js} +1 -1
- package/dist/{chunk-JLXOEO5C.js → chunk-MACVXDZK.js} +2 -2
- package/dist/{chunk-CTTFXXKJ.js → chunk-MI5XJQDY.js} +3 -3
- package/dist/{chunk-YXOG2277.js → chunk-PSNN4LWX.js} +2 -2
- package/dist/{chunk-B5SBNH4S.js → chunk-RZSUJBZZ.js} +74 -14
- package/dist/{chunk-OIGVQF5V.js → chunk-WPPDRIJL.js} +1 -1
- package/dist/{ci-workflow-NBL4OT4A.js → ci-workflow-4NYBUG6R.js} +1 -1
- package/dist/{dist-IJ4J4C5G.js → dist-WF4C7A4A.js} +25 -1
- package/dist/{docs-CPTMH3VY.js → docs-BPYCN2DR.js} +2 -2
- package/dist/{engine-BUWPAAGD.js → engine-LXLIWQQ3.js} +1 -1
- package/dist/{entropy-Z4FYVQ7L.js → entropy-4VDVV5CR.js} +2 -2
- package/dist/{feedback-TT6WF5YX.js → feedback-63QB5RCA.js} +1 -1
- package/dist/{generate-agent-definitions-J5HANRNR.js → generate-agent-definitions-QABOJG56.js} +1 -1
- package/dist/index.d.ts +41 -41
- package/dist/index.js +12 -12
- package/dist/{loader-PCU5YWRH.js → loader-Z2IT7QX3.js} +1 -1
- package/dist/{mcp-YM6QLHLZ.js → mcp-KQHEL5IF.js} +10 -10
- package/dist/{performance-YJVXOKIB.js → performance-26BH47O4.js} +2 -2
- package/dist/{review-pipeline-KGMIMLIE.js → review-pipeline-GHR3WFBI.js} +1 -1
- package/dist/{runtime-F6R27LD6.js → runtime-PDWD7UIK.js} +1 -1
- package/dist/{security-MX5VVXBC.js → security-UQFUZXEN.js} +1 -1
- package/dist/{validate-EFNMSFKD.js → validate-N7QJOKFZ.js} +2 -2
- package/dist/{validate-cross-check-LJX65SBS.js → validate-cross-check-EDQ5QGTM.js} +1 -1
- package/package.json +4 -4
|
@@ -102,20 +102,26 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
|
|
|
102
102
|
path: "<project-root>",
|
|
103
103
|
intent: "Autopilot phase execution for <spec name>",
|
|
104
104
|
skill: "harness-autopilot",
|
|
105
|
+
session: "<session-slug>",
|
|
105
106
|
include: ["state", "learnings", "handoff", "validation"]
|
|
106
107
|
})
|
|
107
108
|
```
|
|
108
109
|
|
|
109
|
-
This loads learnings
|
|
110
|
+
This loads session-scoped learnings, handoff, state, and validation results in a single call. The `session` parameter ensures all reads come from the session directory (`.harness/sessions/<slug>/`), isolating this workstream from others. Note any relevant learnings or known dead ends for the current phase from the returned `learnings` array.
|
|
110
111
|
|
|
111
|
-
6. **Load
|
|
112
|
+
6. **Load session summary for cold start.** If resuming (existing `autopilot-state.json` found):
|
|
113
|
+
- Call `loadSessionSummary()` for the session slug to get quick orientation context (~200 tokens).
|
|
114
|
+
- The summary provides the last skill, phase, status, and next step — enough to understand where the autopilot left off without re-reading the full state machine.
|
|
115
|
+
- If no summary exists (first run), skip — the full INIT handles context loading.
|
|
116
|
+
|
|
117
|
+
7. **Load roadmap context.** If `docs/roadmap.md` exists, read it to understand:
|
|
112
118
|
- Current project priorities (which features are `in-progress`)
|
|
113
119
|
- Blockers that may affect the upcoming phases
|
|
114
120
|
- Overall project status and milestone progress
|
|
115
121
|
|
|
116
122
|
This provides the autopilot with project-level context beyond the individual spec being executed. If the roadmap does not exist, skip this step — the autopilot operates normally without it.
|
|
117
123
|
|
|
118
|
-
|
|
124
|
+
8. **Transition to ASSESS.**
|
|
119
125
|
|
|
120
126
|
---
|
|
121
127
|
|
|
@@ -155,9 +161,11 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
|
|
|
155
161
|
|
|
156
162
|
Spec: {specPath}
|
|
157
163
|
Session directory: {sessionDir}
|
|
164
|
+
Session slug: {sessionSlug}
|
|
158
165
|
Phase description: {phase description from spec}
|
|
159
|
-
|
|
160
|
-
|
|
166
|
+
|
|
167
|
+
On startup, call gather_context({ session: "{sessionSlug}" }) to load
|
|
168
|
+
session-scoped learnings, state, and validation context.
|
|
161
169
|
|
|
162
170
|
Follow the harness-planning skill process exactly. Write the plan to
|
|
163
171
|
docs/plans/{date}-{phase-name}-plan.md. Write {sessionDir}/handoff.json when done.
|
|
@@ -221,9 +229,11 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
|
|
|
221
229
|
|
|
222
230
|
Plan: {planPath}
|
|
223
231
|
Session directory: {sessionDir}
|
|
232
|
+
Session slug: {sessionSlug}
|
|
224
233
|
State: {sessionDir}/state.json
|
|
225
|
-
|
|
226
|
-
|
|
234
|
+
|
|
235
|
+
On startup, call gather_context({ session: "{sessionSlug}" }) to load
|
|
236
|
+
session-scoped learnings, state, and validation context.
|
|
227
237
|
|
|
228
238
|
Follow the harness-execution skill process exactly.
|
|
229
239
|
Update {sessionDir}/state.json after each task.
|
|
@@ -268,6 +278,10 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
|
|
|
268
278
|
You are running harness-verification for phase {N}: {name}.
|
|
269
279
|
|
|
270
280
|
Session directory: {sessionDir}
|
|
281
|
+
Session slug: {sessionSlug}
|
|
282
|
+
|
|
283
|
+
On startup, call gather_context({ session: "{sessionSlug}" }) to load
|
|
284
|
+
session-scoped learnings, state, and validation context.
|
|
271
285
|
|
|
272
286
|
Follow the harness-verification skill process exactly.
|
|
273
287
|
Report pass/fail with findings.
|
|
@@ -296,6 +310,10 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
|
|
|
296
310
|
You are running harness-code-review for phase {N}: {name}.
|
|
297
311
|
|
|
298
312
|
Session directory: {sessionDir}
|
|
313
|
+
Session slug: {sessionSlug}
|
|
314
|
+
|
|
315
|
+
On startup, call gather_context({ session: "{sessionSlug}" }) to load
|
|
316
|
+
session-scoped learnings, state, and validation context.
|
|
299
317
|
|
|
300
318
|
Follow the harness-code-review skill process exactly.
|
|
301
319
|
Report findings with severity (blocking / warning / note).
|
|
@@ -341,7 +359,23 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
|
|
|
341
359
|
|
|
342
360
|
4. **Sync roadmap.** If `docs/roadmap.md` exists, call `manage_roadmap` with action `sync` and `apply: true`. This reflects the just-completed phase in the roadmap (e.g., updating the feature from `planned` to `in-progress`). If `manage_roadmap` is unavailable, fall back to direct file manipulation using `syncRoadmap()` from core. Skip silently if no roadmap exists. Do not use `force_sync: true` — the human-always-wins rule applies.
|
|
343
361
|
|
|
344
|
-
5. **
|
|
362
|
+
5. **Write session summary.** Update the session summary to reflect the completed phase:
|
|
363
|
+
|
|
364
|
+
```json
|
|
365
|
+
writeSessionSummary(projectPath, sessionSlug, {
|
|
366
|
+
session: "<session-slug>",
|
|
367
|
+
lastActive: "<ISO timestamp>",
|
|
368
|
+
skill: "harness-autopilot",
|
|
369
|
+
phase: "<completed phase number> of <total phases>",
|
|
370
|
+
status: "Phase <N> complete. <tasks completed>/<total> tasks.",
|
|
371
|
+
spec: "<spec path>",
|
|
372
|
+
plan: "<current plan path>",
|
|
373
|
+
keyContext: "<1-2 sentences: what this phase accomplished, key decisions>",
|
|
374
|
+
nextStep: "<e.g., Continue to Phase N+1: <name>, or DONE>"
|
|
375
|
+
})
|
|
376
|
+
```
|
|
377
|
+
|
|
378
|
+
6. **Check for next phase:**
|
|
345
379
|
- If more phases remain: "Phase {N} complete. Next: Phase {N+1}: {name} (complexity: {level}). Continue? (yes / stop)"
|
|
346
380
|
- **yes** — Increment `currentPhase`, reset `retryBudget`, transition to ASSESS.
|
|
347
381
|
- **stop** — Save state and exit.
|
|
@@ -387,7 +421,21 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
|
|
|
387
421
|
|
|
388
422
|
5. **Update roadmap to done.** If `docs/roadmap.md` exists and the current spec maps to a roadmap feature, call `manage_roadmap` with action `update` to set the feature status to `done`. Derive the feature name from the spec title (H1 heading) or the session's `handoff.json` `summary` field. If `manage_roadmap` is unavailable, fall back to direct file manipulation using `updateFeature()` from core. Skip silently if no roadmap exists or if the feature is not found. Do not use `force_sync: true`.
|
|
389
423
|
|
|
390
|
-
6. **
|
|
424
|
+
6. **Write final session summary.** Update the session summary to reflect completion:
|
|
425
|
+
|
|
426
|
+
```json
|
|
427
|
+
writeSessionSummary(projectPath, sessionSlug, {
|
|
428
|
+
session: "<session-slug>",
|
|
429
|
+
lastActive: "<ISO timestamp>",
|
|
430
|
+
skill: "harness-autopilot",
|
|
431
|
+
status: "DONE. <total phases> phases, <total tasks> tasks complete.",
|
|
432
|
+
spec: "<spec path>",
|
|
433
|
+
keyContext: "<1-2 sentences: overall summary of what was built>",
|
|
434
|
+
nextStep: "All phases complete. Create PR or close session."
|
|
435
|
+
})
|
|
436
|
+
```
|
|
437
|
+
|
|
438
|
+
7. **Clean up state:** Set `currentState: "DONE"` in `{sessionDir}/autopilot-state.json`. Do not delete the file — it serves as a record.
|
|
391
439
|
|
|
392
440
|
## Harness Integration
|
|
393
441
|
|
|
@@ -161,7 +161,7 @@ These keywords flow into the `handoff.json` `contextKeywords` field when the spe
|
|
|
161
161
|
- Call `manage_roadmap` with action `add`, `status: "planned"`, `milestone: "Current Work"`, and the spec path. Include a one-line summary from the spec overview.
|
|
162
162
|
- If the feature already exists in the roadmap (duplicate name), skip silently — the feature was likely added manually or by a prior brainstorming session.
|
|
163
163
|
- Log: `"Added '<feature-name>' to roadmap as planned"` (informational, not a prompt).
|
|
164
|
-
- If `manage_roadmap` is unavailable, fall back to direct file manipulation using `
|
|
164
|
+
- If `manage_roadmap` is unavailable, fall back to direct file manipulation using `parseRoadmap`/`serializeRoadmap` from core to read, modify, and write `docs/roadmap.md`.
|
|
165
165
|
- If no roadmap exists, skip this step silently.
|
|
166
166
|
|
|
167
167
|
7. **Write handoff and suggest transition.** After the human approves the spec:
|
|
@@ -212,12 +212,13 @@ gather_context({
|
|
|
212
212
|
path: "<project-root>",
|
|
213
213
|
intent: "Code review of <change description>",
|
|
214
214
|
skill: "harness-code-review",
|
|
215
|
+
session: "<session-slug-if-provided>",
|
|
215
216
|
tokenBudget: 8000,
|
|
216
217
|
include: ["graph", "learnings", "validation"]
|
|
217
218
|
})
|
|
218
219
|
```
|
|
219
220
|
|
|
220
|
-
This replaces manual `query_graph` + `get_impact` + `find_context_for` calls with a single composite call that assembles review context in parallel, ranked by relevance. Falls back gracefully when no graph is available (`meta.graphAvailable: false`).
|
|
221
|
+
This replaces manual `query_graph` + `get_impact` + `find_context_for` calls with a single composite call that assembles review context in parallel, ranked by relevance. Falls back gracefully when no graph is available (`meta.graphAvailable: false`). When `session` is provided (e.g., via autopilot dispatch), learnings and state are scoped to the session directory. If no session is known, omit the parameter — `gather_context` falls back to global files.
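The "omit the parameter when no session is known" rule can be sketched as a small argument builder. This is only an illustration: the `GatherContextArgs` interface and `buildReviewContextArgs` helper are hypothetical names, and the field set mirrors only what the call above shows, not a confirmed `gather_context` schema.

```typescript
// Hypothetical argument shape, mirroring only the fields visible in the call above.
interface GatherContextArgs {
  path: string;
  intent: string;
  skill: string;
  session?: string; // left out entirely when no session slug is known
  tokenBudget?: number;
  include: string[];
}

// Build the review-context arguments, adding `session` only when a slug is known.
function buildReviewContextArgs(
  projectRoot: string,
  changeDescription: string,
  sessionSlug?: string
): GatherContextArgs {
  const args: GatherContextArgs = {
    path: projectRoot,
    intent: `Code review of ${changeDescription}`,
    skill: "harness-code-review",
    tokenBudget: 8000,
    include: ["graph", "learnings", "validation"],
  };
  // An empty or whitespace-only slug is treated as "no session" (an assumption).
  if (sessionSlug !== undefined && sessionSlug.trim() !== "") {
    args.session = sessionSlug;
  }
  return args;
}
```

Leaving the key out, rather than passing an empty string, keeps the global-file fallback the responsibility of `gather_context` itself.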
|
|
221
222
|
|
|
222
223
|
For domain-specific scoping (compliance, bug detection, security, architecture), supplement `gather_context` output with targeted `query_graph` calls as needed.
|
|
223
224
|
|
|
@@ -528,6 +529,22 @@ Write `.harness/handoff.json`:
|
|
|
528
529
|
}
|
|
529
530
|
```
|
|
530
531
|
|
|
532
|
+
**Write session summary (if session is known).** If running within a session context, update the session summary:
|
|
533
|
+
|
|
534
|
+
```json
|
|
535
|
+
writeSessionSummary(projectPath, sessionSlug, {
|
|
536
|
+
session: "<session-slug>",
|
|
537
|
+
lastActive: "<ISO timestamp>",
|
|
538
|
+
skill: "harness-code-review",
|
|
539
|
+
status: "Review complete. Assessment: <approve|request-changes|comment>. <N> findings.",
|
|
540
|
+
spec: "<spec path if known>",
|
|
541
|
+
keyContext: "<1-2 sentences: review outcome, key findings>",
|
|
542
|
+
nextStep: "<e.g., Address blocking findings / Ready to merge / Observations delivered>"
|
|
543
|
+
})
|
|
544
|
+
```
|
|
545
|
+
|
|
546
|
+
If no session slug is known, skip this step.
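A minimal sketch of the skip-or-write behaviour described above, assuming a writer with the same three-argument call shape as `writeSessionSummary`. The `SessionSummary` interface, the `SummaryWriter` type, and the `writeReviewSummary` helper are hypothetical; which fields are optional is a guess based on the "if known" placeholders.

```typescript
// Fields taken from the payload shown above; optionality is an assumption.
interface SessionSummary {
  session: string;
  lastActive: string; // ISO timestamp
  skill: string;
  status: string;
  spec?: string;
  keyContext: string;
  nextStep: string;
}

// Hypothetical writer signature; the real function is provided by the harness.
type SummaryWriter = (projectPath: string, slug: string, summary: SessionSummary) => void;

type Assessment = "approve" | "request-changes" | "comment";

// Next-step wording copied from the examples in the payload above.
const NEXT_STEP: Record<Assessment, string> = {
  "approve": "Ready to merge",
  "request-changes": "Address blocking findings",
  "comment": "Observations delivered",
};

// Returns true when a summary was written, false when the step was skipped.
function writeReviewSummary(
  write: SummaryWriter,
  projectPath: string,
  sessionSlug: string | undefined,
  assessment: Assessment,
  findingCount: number,
  keyContext: string,
  now: Date = new Date()
): boolean {
  if (!sessionSlug) return false; // no session known: skip this step
  write(projectPath, sessionSlug, {
    session: sessionSlug,
    lastActive: now.toISOString(),
    skill: "harness-code-review",
    status: `Review complete. Assessment: ${assessment}. ${findingCount} findings.`,
    keyContext: keyContext,
    nextStep: NEXT_STEP[assessment],
  });
  return true;
}
```

Passing the writer in as a parameter keeps the sketch independent of how the real function is imported.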
|
|
547
|
+
|
|
531
548
|
**If assessment is "approve":**
|
|
532
549
|
|
|
533
550
|
Call `emit_interaction`:
|
|
@@ -614,7 +631,7 @@ _This section is not part of the pipeline. It documents the process for respondi
|
|
|
614
631
|
## Harness Integration
|
|
615
632
|
|
|
616
633
|
- **`assess_project`** — Used in Phase 2 (MECHANICAL) to run `validate`, `deps`, and `docs` checks in parallel. Must pass for the pipeline to continue to AI review. Failures are Critical issues that stop the pipeline.
|
|
617
|
-
- **`gather_context`** — Used in Phase 3 (CONTEXT) for efficient parallel context assembly. Replaces separate graph query calls.
|
|
634
|
+
- **`gather_context`** — Used in Phase 3 (CONTEXT) for efficient parallel context assembly. The `session` parameter scopes learnings and state to the session directory when provided by autopilot dispatch. Replaces separate graph query calls.
|
|
618
635
|
- **`harness cleanup`** — Optional check during Phase 2 for entropy accumulation in changed files.
|
|
619
636
|
- **Graph queries** — Used in Phase 3 (CONTEXT) for dependency-scoped context and in Phase 5 (VALIDATE) for reachability verification. Graceful fallback when no graph exists.
|
|
620
637
|
- **`emit_interaction`** -- Call after review approval to suggest transitioning to merge/PR creation. Only emitted on APPROVE assessment. Uses confirmed transition (waits for user approval).
|
|
@@ -33,20 +33,29 @@ Deviating from the plan mid-execution introduces untested assumptions, breaks ta
|
|
|
33
33
|
path: "<project-root>",
|
|
34
34
|
intent: "Execute plan tasks starting from current position",
|
|
35
35
|
skill: "harness-execution",
|
|
36
|
+
session: "<session-slug-if-known>",
|
|
36
37
|
include: ["state", "learnings", "handoff", "validation"]
|
|
37
38
|
})
|
|
38
39
|
```
|
|
39
40
|
|
|
41
|
+
**Session resolution:** If a session directory is known (passed via autopilot dispatch or available from a previous handoff), include the `session` parameter. This scopes all state reads/writes to `.harness/sessions/<slug>/`. If no session is known, omit it — `gather_context` falls back to global files at `.harness/`.
|
|
42
|
+
|
|
40
43
|
This returns `state` (current position — if null, this is a fresh start at Task 1), `learnings` (hard-won insights from previous sessions — do not ignore them), `handoff` (structured context from the previous skill), and `validation` (current project health). If any constituent fails, its field is null and the error is reported in `meta.errors`.
|
|
41
44
|
|
|
42
|
-
3. **
|
|
45
|
+
3. **Load session summary for cold start.** If resuming a session (session slug is known), read the session summary for quick orientation:
|
|
46
|
+
- Call `listActiveSessions()` to read the session index (~100 tokens).
|
|
47
|
+
- If the target session is known, call `loadSessionSummary()` for that session (~200 tokens).
|
|
48
|
+
- If ambiguous (multiple active sessions, no clear target), present the index to the user and ask which session to resume.
|
|
49
|
+
- The summary provides skill, phase, status, key context, and next step — enough to orient without re-reading full state + learnings + plan.
|
|
50
|
+
|
|
51
|
+
4. **Check for known dead ends.** Review `learnings` entries tagged `[outcome:failure]`. If any match approaches in the current plan, surface warnings before proceeding.
|
|
43
52
|
|
|
44
|
-
|
|
53
|
+
5. **Verify prerequisites.** For the current task:
|
|
45
54
|
- Are dependency tasks marked complete in state?
|
|
46
55
|
- Do the files referenced in the task exist as expected?
|
|
47
56
|
- Does the test suite pass? Run `harness validate` to confirm a clean baseline.
|
|
48
57
|
|
|
49
|
-
|
|
58
|
+
6. **If prerequisites fail,** do not proceed. Report what is missing and which task is blocked.
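The dead-end check in step 4 can be approximated with a tag filter. This sketch assumes each entry in `learnings` is a plain string carrying bracketed tags such as `[outcome:failure]`; the actual shape returned by `gather_context` is not shown in this diff. Keyword matching is a crude stand-in for "matches an approach in the current plan", and `findDeadEnds` is a hypothetical helper.

```typescript
// Assumption: learnings are plain strings with bracketed tags, for example
// "- [skill:harness-execution] [outcome:failure] Tried X; it failed because Y".
function findDeadEnds(learnings: string[], planKeywords: string[]): string[] {
  const keywords = planKeywords
    .map((k) => k.toLowerCase())
    .filter((k) => k.length > 0);
  return learnings.filter((entry) => {
    // Only entries tagged as failures count as known dead ends.
    if (entry.indexOf("[outcome:failure]") === -1) return false;
    const text = entry.toLowerCase();
    // Surface the entry if any plan keyword appears in it.
    return keywords.some((k) => text.indexOf(k) !== -1);
  });
}
```

Any entries returned would be surfaced as warnings before proceeding; an empty result means no recorded failure overlaps the supplied keywords.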
|
|
50
59
|
|
|
51
60
|
### Graph-Enhanced Context (when available)
|
|
52
61
|
|
|
@@ -220,7 +229,7 @@ emit_interaction({
|
|
|
220
229
|
|
|
221
230
|
Between tasks (especially between sessions):
|
|
222
231
|
|
|
223
|
-
1. **Update `.harness/state.json
|
|
232
|
+
1. **Update state (session-scoped `{sessionDir}/state.json` if session is known, otherwise `.harness/state.json`)** with current position, progress, and `lastSession` context:
|
|
224
233
|
|
|
225
234
|
```json
|
|
226
235
|
{
|
|
@@ -241,7 +250,7 @@ harness scan [path]
|
|
|
241
250
|
|
|
242
251
|
Skipping this step means subsequent graph queries (impact analysis, dependency health, test advisor) may return stale results.
|
|
243
252
|
|
|
244
|
-
2. **Append tagged learnings to `.harness/learnings.md
|
|
253
|
+
2. **Append tagged learnings to the session-scoped learnings file (`{sessionDir}/learnings.md` if session is known, otherwise `.harness/learnings.md`).** Tag every entry with skill and outcome:
|
|
245
254
|
|
|
246
255
|
```markdown
|
|
247
256
|
## YYYY-MM-DD — Task N: <task name>
|
|
@@ -251,9 +260,9 @@ Skipping this step means subsequent graph queries (impact analysis, dependency h
|
|
|
251
260
|
- [skill:harness-execution] [outcome:decision] What was decided and why
|
|
252
261
|
```
|
|
253
262
|
|
|
254
|
-
3. **Record failures in `.harness/failures.md
|
|
263
|
+
3. **Record failures in the session-scoped failures file (`{sessionDir}/failures.md` if session is known, otherwise `.harness/failures.md`)** if any task was escalated after retry exhaustion (from Phase 2 Step 5). Include the approach attempted and why it failed, so future sessions avoid the same dead end.
|
|
255
264
|
|
|
256
|
-
4. **Write `.harness/handoff.json
|
|
265
|
+
4. **Write the session-scoped handoff (`{sessionDir}/handoff.json` if session is known, otherwise `.harness/handoff.json`)** with structured context for the next skill or session:
|
|
257
266
|
|
|
258
267
|
```json
|
|
259
268
|
{
|
|
@@ -266,11 +275,29 @@ Skipping this step means subsequent graph queries (impact analysis, dependency h
|
|
|
266
275
|
}
|
|
267
276
|
```
|
|
268
277
|
|
|
269
|
-
5. **
|
|
278
|
+
5. **Write session summary.** Write/update the session summary for cold-start context restoration:
|
|
279
|
+
|
|
280
|
+
```json
|
|
281
|
+
writeSessionSummary(projectPath, sessionSlug, {
|
|
282
|
+
session: "<session-slug>",
|
|
283
|
+
lastActive: "<ISO timestamp>",
|
|
284
|
+
skill: "harness-execution",
|
|
285
|
+
phase: "<current phase of plan>",
|
|
286
|
+
status: "<e.g., Task 4/6 complete, paused at CHECKPOINT>",
|
|
287
|
+
spec: "<spec path if known>",
|
|
288
|
+
plan: "<plan path>",
|
|
289
|
+
keyContext: "<1-2 sentences: what was accomplished, key decisions made>",
|
|
290
|
+
nextStep: "<what to do next when resuming>"
|
|
291
|
+
})
|
|
292
|
+
```
|
|
293
|
+
|
|
294
|
+
This overwrites any previous summary for this session. The index.md is updated automatically.
|
|
295
|
+
|
|
296
|
+
6. **Sync roadmap (mandatory when present).** If `docs/roadmap.md` exists, call `manage_roadmap` with action `sync` and `apply: true` to update linked feature statuses from the just-completed execution state. Do not use `force_sync: true` — the human-always-wins rule applies. If `manage_roadmap` is unavailable, fall back to direct file manipulation using `syncRoadmap()` from core. If no roadmap exists, skip silently.
|
|
270
297
|
|
|
271
|
-
|
|
298
|
+
7. **Learnings are append-only.** Never edit or delete previous learnings. They are a chronological record.
|
|
272
299
|
|
|
273
|
-
|
|
300
|
+
8. **Auto-transition to verification.** When ALL tasks in the plan are complete (not when stopping mid-plan):
|
|
274
301
|
|
|
275
302
|
Call `emit_interaction`:
|
|
276
303
|
|
|
@@ -325,8 +352,8 @@ These are non-negotiable. When any condition is met, stop immediately.
|
|
|
325
352
|
- **`harness check-deps`** — Run when tasks add new imports or modules. Catches boundary violations early.
|
|
326
353
|
- **`harness state show`** — View current execution position and progress.
|
|
327
354
|
- **`harness state learn "<message>"`** — Append a learning from the command line.
|
|
328
|
-
-
|
|
329
|
-
-
|
|
355
|
+
- **State file** — Session-scoped at `{sessionDir}/state.json` when session is known, otherwise `.harness/state.json`. Read at session start to resume position. Updated after every task.
|
|
356
|
+
- **Learnings file** — Session-scoped at `{sessionDir}/learnings.md` when session is known, otherwise `.harness/learnings.md`. Append-only knowledge capture. Read at session start for prior context.
|
|
330
357
|
- **Roadmap sync** — After completing plan execution, call `manage_roadmap` with action `sync` and `apply: true` to update roadmap status. Mandatory when `docs/roadmap.md` exists. Do not use `force_sync: true`. Falls back to `syncRoadmap()` from core if MCP tool is unavailable.
|
|
331
358
|
- **`emit_interaction`** -- Call at plan completion to auto-transition to harness-verification. Uses auto-transition (proceeds immediately without user confirmation).
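The session-scoped versus global fallback described for the state and learnings files follows one rule, sketched here. The `resolveHarnessFile` helper and the `HarnessFile` type are hypothetical; the paths are the ones named in this diff (`.harness/sessions/<slug>/` and `.harness/`), returned relative to the project root.

```typescript
// File names mentioned in the execution skill's session-scoped steps.
type HarnessFile = "state.json" | "learnings.md" | "failures.md" | "handoff.json";

// Resolve a harness file relative to the project root:
// session-scoped when a slug is known, global otherwise.
function resolveHarnessFile(file: HarnessFile, sessionSlug?: string): string {
  const base = sessionSlug ? `.harness/sessions/${sessionSlug}` : ".harness";
  return `${base}/${file}`;
}
```

Routing every read and write through one resolver makes it harder for a session run to touch the global files by accident.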
|
|
332
359
|
|
|
@@ -193,22 +193,39 @@ When presenting the task breakdown, use progress markers:
|
|
|
193
193
|
}
|
|
194
194
|
```
|
|
195
195
|
|
|
196
|
-
9. **
|
|
196
|
+
9. **Write session summary (if session is known).** If running within a session (autopilot dispatch or standalone with session context), write the session summary:
|
|
197
197
|
|
|
198
198
|
```json
|
|
199
|
-
|
|
200
|
-
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
|
|
199
|
+
writeSessionSummary(projectPath, sessionSlug, {
|
|
200
|
+
session: "<session-slug>",
|
|
201
|
+
lastActive: "<ISO timestamp>",
|
|
202
|
+
skill: "harness-planning",
|
|
203
|
+
status: "Plan complete. <N> tasks defined.",
|
|
204
|
+
spec: "<spec path if known>",
|
|
205
|
+
plan: "<plan file path>",
|
|
206
|
+
keyContext: "<1-2 sentences: what was planned, key decisions>",
|
|
207
|
+
nextStep: "Approve plan and begin execution."
|
|
208
208
|
})
|
|
209
209
|
```
|
|
210
210
|
|
|
211
|
-
|
|
211
|
+
If no session slug is known (standalone invocation without session context), skip this step.
|
|
212
|
+
|
|
213
|
+
10. **Request plan sign-off:**
|
|
214
|
+
|
|
215
|
+
```json
|
|
216
|
+
emit_interaction({
|
|
217
|
+
path: "<project-root>",
|
|
218
|
+
type: "confirmation",
|
|
219
|
+
confirmation: {
|
|
220
|
+
text: "Approve plan at <plan-file-path>?",
|
|
221
|
+
context: "<task count> tasks, <estimated time> minutes. <one-sentence summary>",
|
|
222
|
+
impact: "Approving unlocks task-by-task execution. Plan defines exact file paths, code, and commands.",
|
|
223
|
+
risk: "low"
|
|
224
|
+
}
|
|
225
|
+
})
|
|
226
|
+
```
|
|
227
|
+
|
|
228
|
+
11. **Suggest transition to execution.** After the human approves the plan:
|
|
212
229
|
|
|
213
230
|
Call `emit_interaction`:
|
|
214
231
|
|
|
@@ -395,6 +395,38 @@ Choice?
|
|
|
395
395
|
harness validate: passed
|
|
396
396
|
```
|
|
397
397
|
|
|
398
|
+
---
|
|
399
|
+
|
|
400
|
+
### Command: `--query <filter>` -- Query Features by Filter
|
|
401
|
+
|
|
402
|
+
#### Phase 1: SCAN -- Load Roadmap
|
|
403
|
+
|
|
404
|
+
1. Check if `docs/roadmap.md` exists.
|
|
405
|
+
- If missing: error with clear message. "No roadmap found at docs/roadmap.md. Run `--create` first to bootstrap one."
|
|
406
|
+
2. Parse the roadmap (via `manage_roadmap query` or direct read).
|
|
407
|
+
|
|
408
|
+
#### Phase 2: FILTER -- Apply Query
|
|
409
|
+
|
|
410
|
+
1. Accept filter patterns:
|
|
411
|
+
- **Status filter:** `backlog`, `planned`, `in-progress`, `done`, `blocked` -- returns all features with that status
|
|
412
|
+
- **Milestone filter:** `milestone:<name>` -- returns all features in the named milestone (partial match)
|
|
413
|
+
|
|
414
|
+
2. Display matching features with their milestone context:
|
|
415
|
+
|
|
416
|
+
```
|
|
417
|
+
QUERY: <filter>
|
|
418
|
+
|
|
419
|
+
Results (N matches):
|
|
420
|
+
- Feature A (Current Work) .................. in-progress
|
|
421
|
+
- Feature B (Backlog) ....................... planned
|
|
422
|
+
|
|
423
|
+
Total: N matches
|
|
424
|
+
```
|
|
425
|
+
|
|
426
|
+
3. No file writes. This is a read-only operation.
|
|
427
|
+
|
|
428
|
+
---
|
|
429
|
+
|
|
398
430
|
## Harness Integration
|
|
399
431
|
|
|
400
432
|
- **`manage_roadmap` MCP tool** -- Primary read/write interface for roadmap operations. Supports `show`, `add`, `update`, `remove`, and `query` actions. Use this when MCP is available for structured CRUD.
|
|
@@ -422,6 +454,8 @@ Choice?
|
|
|
422
454
|
16. `--edit` updates `last_manual_edit` timestamp (since changes are human-driven)
|
|
423
455
|
17. Output matches the roadmap markdown format exactly (frontmatter, H2 milestones, H3 features, 5 fields each)
|
|
424
456
|
18. `harness validate` passes after all operations
|
|
457
|
+
19. `--query` filters features by status or milestone and displays results with milestone context
|
|
458
|
+
20. `--query` errors gracefully when no roadmap exists, directing the user to `--create`
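The two `--query` filter forms (a bare status, or `milestone:<name>` with partial matching) can be sketched as follows. The `Feature` shape and `queryFeatures` helper are hypothetical, and treating "partial match" as a case-insensitive substring test is an assumption; the skill text does not define it further.

```typescript
type FeatureStatus = "backlog" | "planned" | "in-progress" | "done" | "blocked";

// Hypothetical in-memory shape for a parsed roadmap feature.
interface Feature {
  name: string;
  milestone: string;
  status: FeatureStatus;
}

const STATUSES: readonly string[] = ["backlog", "planned", "in-progress", "done", "blocked"];

// Apply a status filter or a `milestone:<name>` filter. Read-only: nothing is written.
function queryFeatures(features: Feature[], filter: string): Feature[] {
  const prefix = "milestone:";
  if (filter.indexOf(prefix) === 0) {
    // Assumption: partial match means case-insensitive substring.
    const needle = filter.slice(prefix.length).toLowerCase();
    return features.filter((f) => f.milestone.toLowerCase().indexOf(needle) !== -1);
  }
  if (STATUSES.indexOf(filter) !== -1) {
    return features.filter((f) => f.status === filter);
  }
  throw new Error(`Unsupported filter: ${filter}`);
}
```

Throwing on an unrecognised filter is a design choice for this sketch only; the skill text specifies an error only for the missing-roadmap case.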
|
|
425
459
|
|
|
426
460
|
## Examples
|
|
427
461
|
|
|
@@ -35,6 +35,26 @@ The words "should", "probably", "seems to", and "I believe" are forbidden in ver
|
|
|
35
35
|
|
|
36
36
|
---
|
|
37
37
|
|
|
38
|
+
### Context Loading
|
|
39
|
+
|
|
40
|
+
Before running verification levels, load session context if a session slug was provided (e.g., by autopilot dispatch):
|
|
41
|
+
|
|
42
|
+
```json
|
|
43
|
+
gather_context({
|
|
44
|
+
path: "<project-root>",
|
|
45
|
+
intent: "Verify phase deliverables",
|
|
46
|
+
skill: "harness-verification",
|
|
47
|
+
session: "<session-slug-if-provided>",
|
|
48
|
+
include: ["state", "learnings", "validation"]
|
|
49
|
+
})
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
**Session resolution:** If a session slug is known (passed via autopilot dispatch or available from a previous handoff), include the `session` parameter. This scopes all state reads to `.harness/sessions/<slug>/`. If no session is known, omit it — `gather_context` falls back to global files at `.harness/`.
|
|
53
|
+
|
|
54
|
+
Use the returned learnings to check for known failures and dead ends relevant to the artifacts being verified.
|
|
55
|
+
|
|
56
|
+
---
|
|
57
|
+
|
|
38
58
|
### Level 1: EXISTS — The Artifact Is Present
|
|
39
59
|
|
|
40
60
|
For every artifact that was supposed to be created or modified:
|
|
@@ -201,6 +221,22 @@ Write `.harness/handoff.json`:
|
|
|
201
221
|
}
|
|
202
222
|
```
|
|
203
223
|
|
|
224
|
+
**Write session summary (if session is known).** If running within a session context, update the session summary:
|
|
225
|
+
|
|
226
|
+
```json
|
|
227
|
+
writeSessionSummary(projectPath, sessionSlug, {
|
|
228
|
+
session: "<session-slug>",
|
|
229
|
+
lastActive: "<ISO timestamp>",
|
|
230
|
+
skill: "harness-verification",
|
|
231
|
+
status: "Verification <PASS|FAIL|INCOMPLETE>. <N> artifacts checked, <N> gaps.",
|
|
232
|
+
spec: "<spec path if known>",
|
|
233
|
+
keyContext: "<1-2 sentences: verification outcome, any gaps found>",
|
|
234
|
+
nextStep: "<e.g., Proceed to code review / Resolve gaps>"
|
|
235
|
+
})
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
If no session slug is known, skip this step.
|
|
239
|
+
|
|
204
240
|
**If verdict is PASS (all levels passed, no gaps):**
|
|
205
241
|
|
|
206
242
|
Call `emit_interaction`:
|
|
@@ -339,6 +375,12 @@ Task: "Create UserService with create, read, update, delete operations."
|
|
|
339
375
|
- **When verification reveals the spec itself is incomplete:** Do not fill in the gaps yourself. Escalate to the human: "Verification found that the spec does not define behavior for [scenario]. How should this be handled?"
|
|
340
376
|
- **When you cannot run harness checks:** If `harness validate` or `harness check-deps` cannot be run (missing configuration, broken tooling), this is a blocking issue. Do not skip verification — fix the tooling or escalate.
|
|
341
377
|
|
|
378
|
+
## Harness Integration
|
|
379
|
+
|
|
380
|
+
- **`gather_context`** — Used in Context Loading phase (before Level 1) to load session-scoped state, learnings, and validation in a single call. The `session` parameter scopes reads to the session directory when provided by autopilot dispatch.
|
|
381
|
+
- **`harness validate`** — Run during Level 3 (WIRED) to verify artifact integration.
|
|
382
|
+
- **`harness check-deps`** — Run during Level 3 (WIRED) to verify dependency boundaries.
|
|
383
|
+
|
|
342
384
|
After verification completes, append a tagged learning:
|
|
343
385
|
|
|
344
386
|
- **YYYY-MM-DD [skill:harness-verification] [outcome:pass/fail]:** Verified [feature]. [Brief note on what was found or confirmed.]
|
|
@@ -102,20 +102,26 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 path: "<project-root>",
 intent: "Autopilot phase execution for <spec name>",
 skill: "harness-autopilot",
+session: "<session-slug>",
 include: ["state", "learnings", "handoff", "validation"]
 })
 ```
 
-This loads learnings
+This loads session-scoped learnings, handoff, state, and validation results in a single call. The `session` parameter ensures all reads come from the session directory (`.harness/sessions/<slug>/`), isolating this workstream from others. Note any relevant learnings or known dead ends for the current phase from the returned `learnings` array.
 
-6. **Load
+6. **Load session summary for cold start.** If resuming (existing `autopilot-state.json` found):
+- Call `loadSessionSummary()` for the session slug to get quick orientation context (~200 tokens).
+- The summary provides the last skill, phase, status, and next step — enough to understand where the autopilot left off without re-reading the full state machine.
+- If no summary exists (first run), skip — the full INIT handles context loading.
+
+7. **Load roadmap context.** If `docs/roadmap.md` exists, read it to understand:
 - Current project priorities (which features are `in-progress`)
 - Blockers that may affect the upcoming phases
 - Overall project status and milestone progress
 
 This provides the autopilot with project-level context beyond the individual spec being executed. If the roadmap does not exist, skip this step — the autopilot operates normally without it.
 
-
+8. **Transition to ASSESS.**
 
 ---
 
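The session scoping and cold-start steps in the hunk above can be sketched as follows. This is a minimal illustration, not the harness implementation: `resolveStateDir`, `SummaryLike`, and `orient` are hypothetical names, and the global fallback to `.harness/` is an assumption drawn from the `.harness/handoff.json` path mentioned elsewhere in this diff.

```typescript
// Hypothetical helper, not part of the harness API.
// Maps an optional session slug to the directory that reads are scoped to.
function resolveStateDir(projectRoot: string, sessionSlug?: string): string {
  const base = `${projectRoot.replace(/\/+$/, "")}/.harness`;
  // With a slug, reads are isolated under .harness/sessions/<slug>;
  // without one, this sketch assumes the global .harness directory.
  return sessionSlug ? `${base}/sessions/${sessionSlug}` : base;
}

// Hypothetical summary shape, limited to fields the cold-start step names.
interface SummaryLike {
  skill: string;
  phase?: string;
  status: string;
  nextStep: string;
}

// Cold-start decision: use the summary for orientation when resuming,
// otherwise fall through to the full INIT context loading.
function orient(summary: SummaryLike | null): string {
  if (summary === null) return "no summary: run full INIT";
  return `resume after ${summary.skill}: ${summary.status} -> ${summary.nextStep}`;
}
```

Keeping the path resolution in one place means every skill dispatched with the same slug reads from the same directory, which is the isolation property the added text describes.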
@@ -155,9 +161,11 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 
 Spec: {specPath}
 Session directory: {sessionDir}
+Session slug: {sessionSlug}
 Phase description: {phase description from spec}
-
-
+
+On startup, call gather_context({ session: "{sessionSlug}" }) to load
+session-scoped learnings, state, and validation context.
 
 Follow the harness-planning skill process exactly. Write the plan to
 docs/plans/{date}-{phase-name}-plan.md. Write {sessionDir}/handoff.json when done.
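The dispatch prompts in these hunks carry `{sessionSlug}` and `{sessionDir}` placeholders. How the autopilot fills them is not shown in the diff; one possible approach, assuming simple identifier placeholders and a hypothetical `renderPrompt` helper, looks like this:

```typescript
// Hypothetical renderer for {placeholder} tokens in a dispatch prompt.
// Only simple identifiers such as {sessionSlug} are matched; tokens with
// spaces or hyphens, and the object literal braces in the gather_context
// call, are left untouched.
function renderPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) => {
    const value: unknown = values[name];
    // Keep unknown placeholders visible instead of silently blanking them.
    return typeof value === "string" ? value : match;
  });
}

const renderedPrompt = renderPrompt(
  'Session slug: {sessionSlug}\nOn startup, call gather_context({ session: "{sessionSlug}" })',
  { sessionSlug: "auth-refactor" }
);
```

Leaving unmatched placeholders in place makes a missing value easy to spot in the dispatched prompt.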
@@ -221,9 +229,11 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 
 Plan: {planPath}
 Session directory: {sessionDir}
+Session slug: {sessionSlug}
 State: {sessionDir}/state.json
-
-
+
+On startup, call gather_context({ session: "{sessionSlug}" }) to load
+session-scoped learnings, state, and validation context.
 
 Follow the harness-execution skill process exactly.
 Update {sessionDir}/state.json after each task.
@@ -268,6 +278,10 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 You are running harness-verification for phase {N}: {name}.
 
 Session directory: {sessionDir}
+Session slug: {sessionSlug}
+
+On startup, call gather_context({ session: "{sessionSlug}" }) to load
+session-scoped learnings, state, and validation context.
 
 Follow the harness-verification skill process exactly.
 Report pass/fail with findings.
@@ -296,6 +310,10 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 You are running harness-code-review for phase {N}: {name}.
 
 Session directory: {sessionDir}
+Session slug: {sessionSlug}
+
+On startup, call gather_context({ session: "{sessionSlug}" }) to load
+session-scoped learnings, state, and validation context.
 
 Follow the harness-code-review skill process exactly.
 Report findings with severity (blocking / warning / note).
@@ -341,7 +359,23 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 
 4. **Sync roadmap.** If `docs/roadmap.md` exists, call `manage_roadmap` with action `sync` and `apply: true`. This reflects the just-completed phase in the roadmap (e.g., updating the feature from `planned` to `in-progress`). If `manage_roadmap` is unavailable, fall back to direct file manipulation using `syncRoadmap()` from core. Skip silently if no roadmap exists. Do not use `force_sync: true` — the human-always-wins rule applies.
 
-5. **
+5. **Write session summary.** Update the session summary to reflect the completed phase:
+
+```json
+writeSessionSummary(projectPath, sessionSlug, {
+session: "<session-slug>",
+lastActive: "<ISO timestamp>",
+skill: "harness-autopilot",
+phase: "<completed phase number> of <total phases>",
+status: "Phase <N> complete. <tasks completed>/<total> tasks.",
+spec: "<spec path>",
+plan: "<current plan path>",
+keyContext: "<1-2 sentences: what this phase accomplished, key decisions>",
+nextStep: "<e.g., Continue to Phase N+1: <name>, or DONE>"
+})
+```
+
+6. **Check for next phase:**
 - If more phases remain: "Phase {N} complete. Next: Phase {N+1}: {name} (complexity: {level}). Continue? (yes / stop)"
 - **yes** — Increment `currentPhase`, reset `retryBudget`, transition to ASSESS.
 - **stop** — Save state and exit.
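The summary object passed to `writeSessionSummary` in the hunk above is written as a template with angle-bracket placeholders. A minimal sketch of filling it from phase results follows; the `SessionSummary` type and `buildPhaseSummary` helper are hypothetical and only mirror the field names shown in the diff, not a confirmed signature from core.

```typescript
// Hypothetical type mirroring the fields shown in the writeSessionSummary call.
interface SessionSummary {
  session: string;
  lastActive: string; // ISO timestamp
  skill: string;
  phase?: string;
  status: string;
  spec: string;
  plan?: string;
  keyContext: string;
  nextStep: string;
}

// Hypothetical input shape for one completed autopilot phase.
interface PhaseResult {
  session: string;
  now: Date;
  phase: number;
  totalPhases: number;
  tasksDone: number;
  tasksTotal: number;
  spec: string;
  plan: string;
  keyContext: string;
  nextPhaseName?: string;
}

// Fills the template strings from the diff with concrete values.
function buildPhaseSummary(p: PhaseResult): SessionSummary {
  const hasNext = p.phase < p.totalPhases && p.nextPhaseName !== undefined;
  return {
    session: p.session,
    lastActive: p.now.toISOString(),
    skill: "harness-autopilot",
    phase: `${p.phase} of ${p.totalPhases}`,
    status: `Phase ${p.phase} complete. ${p.tasksDone}/${p.tasksTotal} tasks.`,
    spec: p.spec,
    plan: p.plan,
    keyContext: p.keyContext,
    nextStep: hasNext ? `Continue to Phase ${p.phase + 1}: ${p.nextPhaseName}` : "DONE",
  };
}
```

Because the summary is meant as short orientation context, the sketch keeps every field a plain string rather than nesting phase or task objects.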
@@ -387,7 +421,21 @@ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW →
 
 5. **Update roadmap to done.** If `docs/roadmap.md` exists and the current spec maps to a roadmap feature, call `manage_roadmap` with action `update` to set the feature status to `done`. Derive the feature name from the spec title (H1 heading) or the session's `handoff.json` `summary` field. If `manage_roadmap` is unavailable, fall back to direct file manipulation using `updateFeature()` from core. Skip silently if no roadmap exists or if the feature is not found. Do not use `force_sync: true`.
 
-6. **
+6. **Write final session summary.** Update the session summary to reflect completion:
+
+```json
+writeSessionSummary(projectPath, sessionSlug, {
+session: "<session-slug>",
+lastActive: "<ISO timestamp>",
+skill: "harness-autopilot",
+status: "DONE. <total phases> phases, <total tasks> tasks complete.",
+spec: "<spec path>",
+keyContext: "<1-2 sentences: overall summary of what was built>",
+nextStep: "All phases complete. Create PR or close session."
+})
+```
+
+7. **Clean up state:** Set `currentState: "DONE"` in `{sessionDir}/autopilot-state.json`. Do not delete the file — it serves as a record.
 
 ## Harness Integration
 
@@ -161,7 +161,7 @@ These keywords flow into the `handoff.json` `contextKeywords` field when the spe
 - Call `manage_roadmap` with action `add`, `status: "planned"`, `milestone: "Current Work"`, and the spec path. Include a one-line summary from the spec overview.
 - If the feature already exists in the roadmap (duplicate name), skip silently — the feature was likely added manually or by a prior brainstorming session.
 - Log: `"Added '<feature-name>' to roadmap as planned"` (informational, not a prompt).
-- If `manage_roadmap` is unavailable, fall back to direct file manipulation using `
+- If `manage_roadmap` is unavailable, fall back to direct file manipulation using `parseRoadmap`/`serializeRoadmap` from core to read, modify, and write `docs/roadmap.md`.
 - If no roadmap exists, skip this step silently.
 
 7. **Write handoff and suggest transition.** After the human approves the spec:
@@ -212,12 +212,13 @@ gather_context({
 path: "<project-root>",
 intent: "Code review of <change description>",
 skill: "harness-code-review",
+session: "<session-slug-if-provided>",
 tokenBudget: 8000,
 include: ["graph", "learnings", "validation"]
 })
 ```
 
-This replaces manual `query_graph` + `get_impact` + `find_context_for` calls with a single composite call that assembles review context in parallel, ranked by relevance. Falls back gracefully when no graph is available (`meta.graphAvailable: false`).
+This replaces manual `query_graph` + `get_impact` + `find_context_for` calls with a single composite call that assembles review context in parallel, ranked by relevance. Falls back gracefully when no graph is available (`meta.graphAvailable: false`). When `session` is provided (e.g., via autopilot dispatch), learnings and state are scoped to the session directory. If no session is known, omit the parameter — `gather_context` falls back to global files.
 
 For domain-specific scoping (compliance, bug detection, security, architecture), supplement `gather_context` output with targeted `query_graph` calls as needed.
 
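The added text says to omit `session` when no slug is known rather than pass an empty value. A minimal sketch of building the arguments that way is shown below; `GatherContextArgs` and `reviewContextArgs` are hypothetical names that mirror the call in this hunk, not a documented type from the package.

```typescript
// Hypothetical argument shape mirroring the gather_context call in the hunk.
interface GatherContextArgs {
  path: string;
  intent: string;
  skill: string;
  session?: string;
  tokenBudget: number;
  include: string[];
}

function reviewContextArgs(
  projectRoot: string,
  change: string,
  sessionSlug?: string
): GatherContextArgs {
  const args: GatherContextArgs = {
    path: projectRoot,
    intent: `Code review of ${change}`,
    skill: "harness-code-review",
    tokenBudget: 8000,
    include: ["graph", "learnings", "validation"],
  };
  // Leave the key out entirely when no session is known, so the callee
  // can fall back to global files instead of receiving "" or undefined.
  if (sessionSlug) args.session = sessionSlug;
  return args;
}
```

Called with a slug, the result carries `session`; called without one, the key is absent, which matches the fallback behaviour described in the changed paragraph.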
@@ -528,6 +529,22 @@ Write `.harness/handoff.json`:
 }
 ```
 
+**Write session summary (if session is known).** If running within a session context, update the session summary:
+
+```json
+writeSessionSummary(projectPath, sessionSlug, {
+session: "<session-slug>",
+lastActive: "<ISO timestamp>",
+skill: "harness-code-review",
+status: "Review complete. Assessment: <approve|request-changes|comment>. <N> findings.",
+spec: "<spec path if known>",
+keyContext: "<1-2 sentences: review outcome, key findings>",
+nextStep: "<e.g., Address blocking findings / Ready to merge / Observations delivered>"
+})
+```
+
+If no session slug is known, skip this step.
+
 **If assessment is "approve":**
 
 Call `emit_interaction`:
@@ -614,7 +631,7 @@ _This section is not part of the pipeline. It documents the process for respondi
 ## Harness Integration
 
 - **`assess_project`** — Used in Phase 2 (MECHANICAL) to run `validate`, `deps`, and `docs` checks in parallel. Must pass for the pipeline to continue to AI review. Failures are Critical issues that stop the pipeline.
-- **`gather_context`** — Used in Phase 3 (CONTEXT) for efficient parallel context assembly. Replaces separate graph query calls.
+- **`gather_context`** — Used in Phase 3 (CONTEXT) for efficient parallel context assembly. The `session` parameter scopes learnings and state to the session directory when provided by autopilot dispatch. Replaces separate graph query calls.
 - **`harness cleanup`** — Optional check during Phase 2 for entropy accumulation in changed files.
 - **Graph queries** — Used in Phase 3 (CONTEXT) for dependency-scoped context and in Phase 5 (VALIDATE) for reachability verification. Graceful fallback when no graph exists.
 - **`emit_interaction`** -- Call after review approval to suggest transitioning to merge/PR creation. Only emitted on APPROVE assessment. Uses confirmed transition (waits for user approval).