@harness-engineering/cli 1.6.0 → 1.6.1

Files changed (35)
  1. package/dist/agents/personas/code-reviewer.yaml +2 -0
  2. package/dist/agents/personas/codebase-health-analyst.yaml +5 -0
  3. package/dist/agents/personas/performance-guardian.yaml +26 -0
  4. package/dist/agents/personas/security-reviewer.yaml +35 -0
  5. package/dist/agents/skills/claude-code/harness-autopilot/SKILL.md +494 -0
  6. package/dist/agents/skills/claude-code/harness-autopilot/skill.yaml +52 -0
  7. package/dist/agents/skills/claude-code/harness-code-review/SKILL.md +15 -0
  8. package/dist/agents/skills/claude-code/harness-integrity/SKILL.md +20 -6
  9. package/dist/agents/skills/claude-code/harness-perf/SKILL.md +231 -0
  10. package/dist/agents/skills/claude-code/harness-perf/skill.yaml +47 -0
  11. package/dist/agents/skills/claude-code/harness-perf-tdd/SKILL.md +236 -0
  12. package/dist/agents/skills/claude-code/harness-perf-tdd/skill.yaml +47 -0
  13. package/dist/agents/skills/claude-code/harness-pre-commit-review/SKILL.md +27 -2
  14. package/dist/agents/skills/claude-code/harness-release-readiness/SKILL.md +657 -0
  15. package/dist/agents/skills/claude-code/harness-release-readiness/skill.yaml +57 -0
  16. package/dist/agents/skills/claude-code/harness-security-review/SKILL.md +206 -0
  17. package/dist/agents/skills/claude-code/harness-security-review/skill.yaml +50 -0
  18. package/dist/agents/skills/claude-code/harness-security-scan/SKILL.md +102 -0
  19. package/dist/agents/skills/claude-code/harness-security-scan/skill.yaml +41 -0
  20. package/dist/agents/skills/claude-code/harness-state-management/SKILL.md +22 -8
  21. package/dist/agents/skills/gemini-cli/harness-autopilot/SKILL.md +494 -0
  22. package/dist/agents/skills/gemini-cli/harness-autopilot/skill.yaml +52 -0
  23. package/dist/agents/skills/gemini-cli/harness-perf/SKILL.md +231 -0
  24. package/dist/agents/skills/gemini-cli/harness-perf/skill.yaml +47 -0
  25. package/dist/agents/skills/gemini-cli/harness-perf-tdd/SKILL.md +236 -0
  26. package/dist/agents/skills/gemini-cli/harness-perf-tdd/skill.yaml +47 -0
  27. package/dist/agents/skills/gemini-cli/harness-release-readiness/SKILL.md +657 -0
  28. package/dist/agents/skills/gemini-cli/harness-release-readiness/skill.yaml +57 -0
  29. package/dist/agents/skills/gemini-cli/harness-security-review/skill.yaml +50 -0
  30. package/dist/agents/skills/gemini-cli/harness-security-scan/SKILL.md +102 -0
  31. package/dist/agents/skills/gemini-cli/harness-security-scan/skill.yaml +41 -0
  32. package/dist/bin/harness.js +1 -1
  33. package/dist/{chunk-VS4OTOKZ.js → chunk-O6NEKDYP.js} +789 -299
  34. package/dist/index.js +1 -1
  35. package/package.json +2 -2
@@ -0,0 +1,494 @@
1
+ # Harness Autopilot
2
+
3
+ > Autonomous phase execution loop — chains planning, execution, verification, and review across multi-phase projects, pausing only at human decision points.
4
+
5
+ ## When to Use
6
+
7
+ - After a multi-phase spec is approved and you want automated execution across all phases
8
+ - When a project has 2+ implementation phases that would require repeated manual skill invocations
9
+ - When you want the Ralph Loop pattern (fresh context per iteration, append-only learnings) applied at the phase level
10
+ - NOT for single-phase work (use harness-execution directly)
11
+ - NOT when the spec is not yet approved (use harness-brainstorming first)
12
+ - NOT for CI/headless execution (this is a conversational skill)
13
+
14
+ ## Relationship to Other Skills
15
+
16
+ | Skill | Role in Autopilot |
17
+ | -------------------- | -------------------------------------------- |
18
+ | harness-planning | Delegated to for phase plan creation |
19
+ | harness-execution | Delegated to for task-by-task implementation |
20
+ | harness-verification | Delegated to for post-execution validation |
21
+ | harness-code-review | Delegated to for post-verification review |
22
+
23
+ Autopilot orchestrates these skills — it never reimplements their logic.
24
+
25
+ ## Iron Law
26
+
27
+ **Autopilot delegates, never reimplements.** If you find yourself writing planning logic, execution logic, or review logic inside the autopilot loop, STOP. Delegate to the appropriate skill via subagent.
28
+
29
+ **Human always approves plans.** No plan executes without explicit human sign-off, regardless of complexity level. The difference is whether autopilot generates the plan automatically or asks the human to drive planning interactively.
30
+
31
+ ## Process
32
+
33
+ ### State Machine
34
+
35
+ ```
36
+ INIT → ASSESS → PLAN → APPROVE_PLAN → EXECUTE → VERIFY → REVIEW → PHASE_COMPLETE
37
+
38
+ [next phase?]
39
+ ↓ ↓
40
+ ASSESS DONE
41
+ ```
42
+
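The diagram above can be read as a transition table. The following is a minimal sketch, not code from the package: the type name `AutopilotState`, the table `NEXT`, and the helper `canTransition` are hypothetical, and only the state labels come from the diagram. It covers the happy path only; the per-state sections add further branches (for example, re-entering EXECUTE from VERIFY or REVIEW).

```typescript
// Hypothetical names; only the state labels are taken from the diagram.
type AutopilotState =
  | "INIT" | "ASSESS" | "PLAN" | "APPROVE_PLAN" | "EXECUTE"
  | "VERIFY" | "REVIEW" | "PHASE_COMPLETE" | "DONE";

// Happy-path successors read off the diagram.
// PHASE_COMPLETE branches on whether another phase remains.
const NEXT: Record<AutopilotState, AutopilotState[]> = {
  INIT: ["ASSESS"],
  ASSESS: ["PLAN"],
  PLAN: ["APPROVE_PLAN"],
  APPROVE_PLAN: ["EXECUTE"],
  EXECUTE: ["VERIFY"],
  VERIFY: ["REVIEW"],
  REVIEW: ["PHASE_COMPLETE"],
  PHASE_COMPLETE: ["ASSESS", "DONE"],
  DONE: [],
};

// True when `to` is a listed successor of `from`.
function canTransition(from: AutopilotState, to: AutopilotState): boolean {
  return NEXT[from].indexOf(to) !== -1;
}
```

A table like this makes it cheap to reject a corrupted `currentState` value on resume, rather than continuing from an unreachable state.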
43
+ ---
44
+
45
+ ### Phase 1: INIT — Load Spec and Restore State
46
+
47
+ 1. **Check for existing state.** Read `.harness/autopilot-state.json`. If it exists and `currentState` is not `DONE`:
48
+ - Report: "Resuming autopilot from state `{currentState}`, phase {currentPhase}: {phaseName}."
49
+ - Skip to the recorded `currentState` and continue from there.
50
+
51
+ 2. **If no existing state (fresh start):**
52
+ - Read the spec file (provided as argument or found via `.harness/handoff.json`). If neither is available, ask the user for the spec path.
53
+ - Parse the `## Implementation Order` section to extract phases.
54
+ - For each phase heading (`### Phase N: Name`), extract:
55
+ - Phase name
56
+ - Complexity annotation (`<!-- complexity: low|medium|high -->`, default: `medium`)
57
+ - Create `.harness/autopilot-state.json`:
58
+ ```json
59
+ {
60
+ "schemaVersion": 1,
61
+ "specPath": "<path to spec>",
62
+ "currentState": "ASSESS",
63
+ "currentPhase": 0,
64
+ "phases": [
65
+ {
66
+ "name": "<phase name>",
67
+ "complexity": "<low|medium|high>",
68
+ "complexityOverride": null,
69
+ "planPath": null,
70
+ "status": "pending"
71
+ }
72
+ ],
73
+ "retryBudget": {
74
+ "maxAttempts": 3,
75
+ "currentTask": null
76
+ },
77
+ "history": []
78
+ }
79
+ ```
80
+
81
+ 3. **Load context.** Read `.harness/learnings.md` and `.harness/failures.md` if they exist. Note any relevant learnings or known dead ends for the current phase.
82
+
83
+ 4. **Transition to ASSESS.**
84
+
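The phase extraction in step 2 could look roughly like the sketch below. It is an illustration under assumptions, not the package's parser: `parsePhases` and `PhaseEntry` are hypothetical names, and the skill text does not say whether the complexity annotation sits on the heading line or on a following line, so this sketch accepts both.

```typescript
type Complexity = "low" | "medium" | "high";

// Mirrors one entry of the `phases` array in autopilot-state.json.
interface PhaseEntry {
  name: string;
  complexity: Complexity;
  complexityOverride: Complexity | null;
  planPath: string | null;
  status: string;
}

const ANNOTATION = /<!--\s*complexity:\s*(low|medium|high)\s*-->/;
const HEADING = /^### Phase \d+: (.+)$/;

// Hypothetical helper: collects phases listed under "## Implementation Order".
function parsePhases(specText: string): PhaseEntry[] {
  const phases: PhaseEntry[] = [];
  let inSection = false;
  for (const line of specText.split("\n")) {
    if (/^## /.test(line)) {
      // Any other level-2 heading ends the section.
      inSection = line.trim() === "## Implementation Order";
      continue;
    }
    if (!inSection) continue;
    const note = ANNOTATION.exec(line);
    const heading = HEADING.exec(line.replace(ANNOTATION, "").trim());
    if (heading) {
      phases.push({
        name: heading[1].trim(),
        complexity: "medium", // documented default
        complexityOverride: null,
        planPath: null,
        status: "pending",
      });
    }
    // An annotation applies to the most recently seen phase heading.
    if (note && phases.length > 0) {
      phases[phases.length - 1].complexity = note[1] as Complexity;
    }
  }
  return phases;
}
```

The returned entries can be dropped directly into the `phases` array of the state file shown above.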
85
+ ---
86
+
87
+ ### ASSESS — Determine Phase Approach
88
+
89
+ 1. **Read the current phase** from `autopilot-state.json` at index `currentPhase`.
90
+
91
+ 2. **Check if plan already exists.** If `planPath` is set and the file exists, skip to `APPROVE_PLAN`.
92
+
93
+ 3. **Evaluate complexity:**
94
+ - Read the phase's `complexity` field from state.
95
+ - If `complexityOverride` is set, use it instead.
96
+ - Decision matrix:
97
+
98
+ | Effective Complexity | Action |
99
+ | -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
100
+ | `low` | Auto-plan via subagent. Proceed to PLAN. |
101
+ | `medium` | Auto-plan via subagent. Proceed to PLAN. Present with extra scrutiny note. |
102
+ | `high` | Pause. Tell the user: "Phase {N}: {name} is marked high-complexity. Run `/harness:planning` interactively for this phase, then re-invoke `/harness:autopilot` to continue." Transition to PLAN with `awaitingInteractivePlan: true`. |
103
+
104
+ 4. **Update state** with `currentState: "PLAN"` and save.
105
+
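The ASSESS decision matrix reduces to a small pure function. The sketch below is an assumption-laden illustration: `assess`, `PhaseState`, and `AssessResult` are hypothetical names, and `planFileExists` stands in for whatever file check the agent performs.

```typescript
type Complexity = "low" | "medium" | "high";

interface PhaseState {
  complexity: Complexity;
  complexityOverride: Complexity | null;
  planPath: string | null;
}

// Hypothetical result shape; field names echo the skill text where possible.
interface AssessResult {
  next: "PLAN" | "APPROVE_PLAN";
  autoPlan: boolean;
  extraScrutiny: boolean;
  awaitingInteractivePlan: boolean;
}

function assess(phase: PhaseState, planFileExists: boolean): AssessResult {
  // Step 2: an existing plan skips straight to the approval gate.
  if (phase.planPath !== null && planFileExists) {
    return { next: "APPROVE_PLAN", autoPlan: false, extraScrutiny: false, awaitingInteractivePlan: false };
  }
  // Step 3: an override, when set, wins over the spec's annotation.
  const effective = phase.complexityOverride ?? phase.complexity;
  if (effective === "high") {
    return { next: "PLAN", autoPlan: false, extraScrutiny: false, awaitingInteractivePlan: true };
  }
  return { next: "PLAN", autoPlan: true, extraScrutiny: effective === "medium", awaitingInteractivePlan: false };
}
```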
106
+ ---
107
+
108
+ ### PLAN — Generate or Await Plan
109
+
110
+ **If auto-planning (low/medium complexity):**
111
+
112
+ 1. Dispatch a subagent with the following prompt:
113
+
114
+ ```
115
+ You are running harness-planning for phase {N}: {name}.
116
+
117
+ Spec: {specPath}
118
+ Phase description: {phase description from spec}
119
+ Previous phase learnings: {relevant learnings from .harness/learnings.md}
120
+ Known failures to avoid: {relevant entries from .harness/failures.md}
121
+
122
+ Follow the harness-planning skill process exactly. Write the plan to
123
+ docs/plans/{date}-{phase-name}-plan.md. Write .harness/handoff.json when done.
124
+ ```
125
+
126
+ 2. When the subagent returns:
127
+ - Read the generated plan path from `.harness/handoff.json`.
128
+ - **Apply complexity override check:**
129
+ - Count tasks in the plan.
130
+ - Count `[checkpoint:*]` markers.
131
+ - If `spec_complexity == "low"` AND (`task_count > 10` OR `checkpoint_count > 3`):
132
+ Set `complexityOverride: "medium"` in state. Note to user: "Planning produced {N} tasks — more than expected for low complexity. Reviewing with extra scrutiny."
133
+ - If `spec_complexity == "low"` AND (`task_count > 20` OR `checkpoint_count > 6`):
134
+ Set `complexityOverride: "high"` in state. Note to user: "This phase is significantly larger than expected. Consider breaking it down."
135
+ - Update state: set `planPath` for the current phase.
136
+ - Transition to `APPROVE_PLAN`.
137
+
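The complexity override check can be sketched as follows. As written in the skill, a plan with more than 20 tasks satisfies both conditions; this sketch assumes the stricter rule is meant to win and therefore tests it first. The function names `complexityOverride` and `countCheckpoints` are hypothetical.

```typescript
type Complexity = "low" | "medium" | "high";

// Counts `[checkpoint:*]` markers in the plan text.
function countCheckpoints(planText: string): number {
  const matches = planText.match(/\[checkpoint:[^\]]+\]/g);
  return matches ? matches.length : 0;
}

// Returns the override to record in state, or null when none applies.
// Only phases annotated `low` in the spec are ever bumped.
function complexityOverride(
  specComplexity: Complexity,
  taskCount: number,
  checkpointCount: number,
): Complexity | null {
  if (specComplexity !== "low") return null;
  // Assumption: check the larger thresholds first so "high" is not masked by "medium".
  if (taskCount > 20 || checkpointCount > 6) return "high";
  if (taskCount > 10 || checkpointCount > 3) return "medium";
  return null;
}
```

For the 8-task, 1-checkpoint plan in the example further down, this returns `null`, so the phase stays at `low`.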
138
+ **If awaiting interactive plan (high complexity):**
139
+
140
+ 1. Check if a plan file now exists for this phase (user ran planning separately).
141
+ - Look for files matching `docs/plans/*{phase-name}*` or check `.harness/handoff.json` for a planning handoff.
142
+ 2. If found: update `planPath` in state, transition to `APPROVE_PLAN`.
143
+ 3. If not found: remind the user and wait.
144
+
145
+ ---
146
+
147
+ ### APPROVE_PLAN — Human Review Gate
148
+
149
+ **This state always pauses for human input.**
150
+
151
+ 1. **Present the plan summary:**
152
+ - Phase name and number
153
+ - Task count
154
+ - Checkpoint count
155
+ - Estimated time (task count × 3 minutes)
156
+ - Effective complexity (original + any override)
157
+ - Any concerns from the planning handoff
158
+
159
+ 2. **Ask:** "Approve this plan and begin execution? (yes / revise / skip phase / stop)"
160
+ - **yes** — Transition to EXECUTE.
161
+ - **revise** — Tell user to edit the plan file directly, then re-present.
162
+ - **skip phase** — Mark phase as `skipped` in state, transition to PHASE_COMPLETE.
163
+ - **stop** — Save state and exit. User can resume later.
164
+
165
+ 3. **Record the decision** in state: `decisions` array.
166
+
167
+ 4. **Update state** with `currentState: "EXECUTE"` and save.
168
+
169
+ ---
170
+
171
+ ### EXECUTE — Run the Plan
172
+
173
+ 1. **Dispatch execution subagent:**
174
+
175
+ ```
176
+ You are running harness-execution for phase {N}: {name}.
177
+
178
+ Plan: {planPath}
179
+ State: .harness/state.json
180
+ Learnings: .harness/learnings.md
181
+ Failures: .harness/failures.md
182
+
183
+ Follow the harness-execution skill process exactly.
184
+ Update .harness/state.json after each task.
185
+ Write .harness/handoff.json when done or when blocked.
186
+ ```
187
+
188
+ 2. **When the subagent returns, check the outcome:**
189
+ - **All tasks complete:** Transition to VERIFY.
190
+ - **Checkpoint reached:** Surface the checkpoint to the user in the main conversation. Handle the checkpoint type:
191
+ - `[checkpoint:human-verify]` — Show output, ask for confirmation, then resume execution subagent.
192
+ - `[checkpoint:decision]` — Present options, record choice, resume execution subagent.
193
+ - `[checkpoint:human-action]` — Instruct user, wait for confirmation, resume execution subagent.
194
+ - **Task failed:** Enter retry logic (see below).
195
+
196
+ 3. **Retry logic on failure:**
197
+ - Read `retryBudget` from state.
198
+ - If `attemptsUsed < maxAttempts`:
199
+ - Increment `attemptsUsed`.
200
+ - Record the attempt (timestamp, error, fix attempted, result).
201
+ - **Attempt 1:** Read error output, apply obvious fix, re-dispatch execution subagent for the failed task only.
202
+ - **Attempt 2:** Expand context — read related files, check `learnings.md` for similar failures, re-dispatch with additional context.
203
+ - **Attempt 3:** Full context gather — read test output, imports, plan instructions for ambiguity. Re-dispatch with maximum context.
204
+ - If budget exhausted:
205
+ - **Stop.** Present all 3 attempts with full context to the user.
206
+ - Record failure in `.harness/failures.md`.
207
+ - Ask: "How should we proceed? (fix manually and continue / revise plan / stop)"
208
+ - Save state. User's choice determines next transition.
209
+
210
+ 4. **Update state** after each execution cycle and save.
211
+
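The retry budget with escalating context might be modelled as below. This is a sketch under assumptions: the skill refers to `attemptsUsed`, but the state schema shown earlier lists only `maxAttempts` and `currentTask`, so placing `attemptsUsed` inside `retryBudget` is a guess. `nextRetry` and the `ContextLevel` labels are hypothetical names.

```typescript
interface RetryBudget {
  maxAttempts: number;
  attemptsUsed: number; // assumed to live alongside maxAttempts
  currentTask: string | null;
}

// Hypothetical labels for the three escalation levels described above.
type ContextLevel = "error-output" | "expanded" | "full";

interface RetryDecision {
  retry: boolean;
  attempt: number;
  context: ContextLevel | null;
}

// Consumes one attempt from the budget and says how much context to gather.
// Mutates `budget` so the caller can persist it to state afterwards.
function nextRetry(budget: RetryBudget): RetryDecision {
  if (budget.attemptsUsed >= budget.maxAttempts) {
    // Budget exhausted: the caller stops and surfaces all attempts to the human.
    return { retry: false, attempt: budget.attemptsUsed, context: null };
  }
  budget.attemptsUsed += 1;
  const attempt = budget.attemptsUsed;
  const context: ContextLevel =
    attempt === 1 ? "error-output" : attempt === 2 ? "expanded" : "full";
  return { retry: true, attempt, context };
}
```

Keeping the decision separate from the dispatch step means the escalation order can be checked without running a subagent.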
212
+ ---
213
+
214
+ ### VERIFY — Post-Execution Validation
215
+
216
+ 1. **Dispatch verification subagent:**
217
+
218
+ ```
219
+ You are running harness-verification for phase {N}: {name}.
220
+
221
+ Follow the harness-verification skill process exactly.
222
+ Report pass/fail with findings.
223
+ ```
224
+
225
+ 2. **When the subagent returns:**
226
+ - **All checks pass:** Transition to REVIEW.
227
+ - **Failures found:** Surface findings to the user. Ask: "Fix these issues before review? (yes / skip verification / stop)"
228
+ - **yes** — Re-enter EXECUTE with targeted fixes (retry budget resets for verification fixes).
229
+ - **skip** — Proceed to REVIEW with verification warnings noted.
230
+ - **stop** — Save state and exit.
231
+
232
+ 3. **Update state** with `currentState: "REVIEW"` and save.
233
+
234
+ ---
235
+
236
+ ### REVIEW — Code Review
237
+
238
+ 1. **Dispatch review subagent:**
239
+
240
+ ```
241
+ You are running harness-code-review for phase {N}: {name}.
242
+
243
+ Follow the harness-code-review skill process exactly.
244
+ Report findings with severity (blocking / warning / note).
245
+ ```
246
+
247
+ 2. **When the subagent returns:**
248
+ - **No blocking findings:** Report summary, transition to PHASE_COMPLETE.
249
+ - **Blocking findings:** Surface to user. Ask: "Address blocking findings before completing this phase? (yes / override / stop)"
250
+ - **yes** — Re-enter EXECUTE with review fixes.
251
+ - **override** — Record override decision, transition to PHASE_COMPLETE.
252
+ - **stop** — Save state and exit.
253
+
254
+ 3. **Update state** with `currentState: "PHASE_COMPLETE"` and save.
255
+
256
+ ---
257
+
258
+ ### PHASE_COMPLETE — Summary and Transition
259
+
260
+ 1. **Present phase summary:**
261
+ - Phase name and number
262
+ - Tasks completed
263
+ - Retries used
264
+ - Verification result (pass/fail/skipped)
265
+ - Review findings count (blocking/warning/note)
266
+ - Time from phase start to completion (from history timestamps)
267
+
268
+ 2. **Record phase in history:**
269
+
270
+ ```json
271
+ {
272
+ "phase": 0,
273
+ "name": "<phase name>",
274
+ "startedAt": "<timestamp>",
275
+ "completedAt": "<now>",
276
+ "tasksCompleted": 8,
277
+ "retriesUsed": 1,
278
+ "verificationPassed": true,
279
+ "reviewFindings": { "blocking": 0, "warning": 1, "note": 3 }
280
+ }
281
+ ```
282
+
283
+ 3. **Mark phase as `complete`** in state.
284
+
285
+ 4. **Check for next phase:**
286
+ - If more phases remain: "Phase {N} complete. Next: Phase {N+1}: {name} (complexity: {level}). Continue? (yes / stop)"
287
+ - **yes** — Increment `currentPhase`, reset `retryBudget`, transition to ASSESS.
288
+ - **stop** — Save state and exit.
289
+ - If no more phases: Transition to DONE.
290
+
291
+ ---
292
+
293
+ ### DONE — Final Summary
294
+
295
+ 1. **Present project summary:**
296
+ - Total phases completed
297
+ - Total tasks across all phases
298
+ - Total retries used
299
+ - Total time (first phase start to last phase completion)
300
+ - Any overridden review findings
301
+
302
+ 2. **Offer next steps:**
303
+ - "Create a PR? (yes / no)"
304
+ - If yes: assemble commit history, suggest PR title and description.
305
+
306
+ 3. **Write final handoff:**
307
+
308
+ ```json
309
+ {
310
+ "fromSkill": "harness-autopilot",
311
+ "phase": "DONE",
312
+ "summary": "Completed {N} phases with {M} total tasks",
313
+ "completed": ["Phase 1: ...", "Phase 2: ..."],
314
+ "pending": [],
315
+ "concerns": [],
316
+ "decisions": ["<all decisions from all phases>"],
317
+ "contextKeywords": ["<merged from spec>"]
318
+ }
319
+ ```
320
+
321
+ 4. **Append learnings** to `.harness/learnings.md`:
322
+
323
+ ```
324
+ ## {date} — Autopilot: {spec name}
325
+ - [skill:harness-autopilot] [outcome:complete] Executed {N} phases, {M} tasks, {R} retries
326
+ - [skill:harness-autopilot] [outcome:observation] {any notable patterns from the run}
327
+ ```
328
+
329
+ 5. **Clean up state:** Set `currentState: "DONE"` in `autopilot-state.json`. Do not delete the file — it serves as a record.
330
+
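The totals in the project summary can be derived from the `history` records written at PHASE_COMPLETE. The sketch below assumes the timestamps are ISO 8601 strings, which the skill does not state; `summarize` and `PhaseRecord` are hypothetical names.

```typescript
// Shape of one history record, following the PHASE_COMPLETE example.
interface PhaseRecord {
  phase: number;
  name: string;
  startedAt: string;   // assumed ISO 8601
  completedAt: string; // assumed ISO 8601
  tasksCompleted: number;
  retriesUsed: number;
  verificationPassed: boolean;
  reviewFindings: { blocking: number; warning: number; note: number };
}

interface ProjectSummary {
  phases: number;
  tasks: number;
  retries: number;
  elapsedMs: number;
}

// Aggregates totals from first phase start to last phase completion.
function summarize(history: PhaseRecord[]): ProjectSummary {
  let tasks = 0;
  let retries = 0;
  for (const record of history) {
    tasks += record.tasksCompleted;
    retries += record.retriesUsed;
  }
  const elapsedMs =
    history.length === 0
      ? 0
      : Date.parse(history[history.length - 1].completedAt) - Date.parse(history[0].startedAt);
  return { phases: history.length, tasks, retries, elapsedMs };
}
```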
331
+ ## Harness Integration
332
+
333
+ - **`harness validate`** — Run during INIT to verify project health. Included in every execution task via harness-execution delegation.
334
+ - **`harness check-deps`** — Delegated to harness-execution (included in task steps).
335
+ - **State file** — `.harness/autopilot-state.json` tracks the orchestration state machine. `.harness/state.json` tracks task-level execution state (managed by harness-execution).
336
+ - **Handoff** — `.harness/handoff.json` is written by each delegated skill and read by the next. Autopilot writes a final handoff on DONE.
337
+ - **Learnings** — `.harness/learnings.md` is appended by both delegated skills and autopilot itself.
338
+
339
+ ## Success Criteria
340
+
341
+ - Single `/harness:autopilot` invocation executes all phases through to completion
342
+ - Resume from any state after context reset via `.harness/autopilot-state.json`
343
+ - Low-complexity phases auto-plan; high-complexity phases pause for interactive planning
344
+ - Planning override bumps complexity upward when task signals disagree
345
+ - Retry budget (3 attempts) with escalating context before surfacing failures
346
+ - Existing skills (planning, execution, verification, review) are unchanged
347
+ - Human approves every plan before execution begins
348
+ - Phase completion summary shown between every phase
349
+
350
+ ## Examples
351
+
352
+ ### Example: 3-Phase Security Scanner
353
+
354
+ **User invokes:** `/harness:autopilot docs/specs/2026-03-19-security-scanner.md`
355
+
356
+ **INIT:**
357
+
358
+ ```
359
+ Read spec — found 3 phases:
360
+ Phase 1: Core Scanner (complexity: low)
361
+ Phase 2: Rule Engine (complexity: high)
362
+ Phase 3: CLI Integration (complexity: low)
363
+ Created .harness/autopilot-state.json. Starting Phase 1.
364
+ ```
365
+
366
+ **Phase 1 — ASSESS:**
367
+
368
+ ```
369
+ Phase 1: Core Scanner — complexity: low. Auto-planning.
370
+ ```
371
+
372
+ **Phase 1 — PLAN:**
373
+
374
+ ```
375
+ [Subagent runs harness-planning]
376
+ Plan generated: docs/plans/2026-03-19-core-scanner-plan.md (8 tasks, ~24 min)
377
+ ```
378
+
379
+ **Phase 1 — APPROVE_PLAN:**
380
+
381
+ ```
382
+ Phase 1: Core Scanner
383
+ Tasks: 8 | Checkpoints: 1 | Est. time: 24 min | Complexity: low
384
+ Approve this plan and begin execution? (yes / revise / skip / stop)
385
+ → User: "yes"
386
+ ```
387
+
388
+ **Phase 1 — EXECUTE → VERIFY → REVIEW:**
389
+
390
+ ```
391
+ [Subagent executes 8 tasks... all pass]
392
+ [Subagent runs verification... pass]
393
+ [Subagent runs code review... 0 blocking, 2 notes]
394
+ ```
395
+
396
+ **Phase 1 — PHASE_COMPLETE:**
397
+
398
+ ```
399
+ Phase 1: Core Scanner — COMPLETE
400
+ Tasks: 8/8 | Retries: 0 | Verification: pass | Review: 0 blocking
401
+ Next: Phase 2: Rule Engine (complexity: high). Continue? (yes / stop)
402
+ → User: "yes"
403
+ ```
404
+
405
+ **Phase 2 — ASSESS:**
406
+
407
+ ```
408
+ Phase 2: Rule Engine — complexity: high.
409
+ This phase requires interactive planning. Run /harness:planning with the
410
+ Rule Engine section of the spec, then re-invoke /harness:autopilot to continue.
411
+ ```
412
+
413
+ **[User runs /harness:planning interactively, produces plan]**
414
+
415
+ **User re-invokes:** `/harness:autopilot`
416
+
417
+ **INIT (resume):**
418
+
419
+ ```
420
+ Resuming autopilot from state PLAN, phase 2: Rule Engine.
421
+ Found plan: docs/plans/2026-03-19-rule-engine-plan.md
422
+ ```
423
+
424
+ **Phase 2 — APPROVE_PLAN → EXECUTE → VERIFY → REVIEW → PHASE_COMPLETE**
425
+
426
+ ```
427
+ [Same flow as Phase 1, with checkpoint pauses as needed]
428
+ Phase 2: Rule Engine — COMPLETE
429
+ Tasks: 14/14 | Retries: 1 | Verification: pass | Review: 0 blocking
430
+ Next: Phase 3: CLI Integration (complexity: low). Continue? (yes / stop)
431
+ → User: "yes"
432
+ ```
433
+
434
+ **Phase 3 — [auto-plans, executes, completes]**
435
+
436
+ **DONE:**
437
+
438
+ ```
439
+ All phases complete.
440
+ Total: 3 phases, 30 tasks, 1 retry
441
+ Create a PR? (yes / no)
442
+ → User: "yes"
443
+ ```
444
+
445
+ ### Example: Retry Budget Exhaustion
446
+
447
+ **During Phase 1 — EXECUTE:**
448
+
449
+ ```
450
+ Task 4: wire parser — FAILED
451
+ Error: Type error in parser.ts:42 — missing generic parameter
452
+
453
+ Retry 1/3: Added generic parameter to parseRule<T>. Re-running...
454
+ Task 4: wire parser — FAILED
455
+ Error: Test assertion failure — expected 3 rules, got 0
456
+
457
+ Retry 2/3: Expanded context — read rule-loader.ts, checked learnings. Fixed import path. Re-running...
458
+ Task 4: wire parser — FAILED
459
+ Error: Test timeout after 5000ms
460
+
461
+ Retry 3/3: Full context — re-read plan instructions, checked for ambiguity. Found test was missing async/await. Re-running...
462
+ Task 4: wire parser — PASSED
463
+
464
+ Continuing execution...
465
+ ```
466
+
467
+ **If all 3 retries fail:**
468
+
469
+ ```
470
+ Task 4: wire parser — FAILED (budget exhausted)
471
+
472
+ Attempt 1: Added generic parameter → Type error persists
473
+ Attempt 2: Fixed import path → Tests still timeout
474
+ Attempt 3: Added async/await → New error: connection refused
475
+
476
+ Recorded in .harness/failures.md.
477
+ How should we proceed? (fix manually and continue / revise plan / stop)
478
+ ```
479
+
480
+ ## Gates
481
+
482
+ - **No reimplementing delegated skills.** Autopilot orchestrates. If you are writing planning logic, execution logic, verification logic, or review logic, STOP. Delegate to the appropriate skill.
483
+ - **No executing without plan approval.** Every plan must be explicitly approved by the human before execution begins. No exceptions, regardless of complexity level.
484
+ - **No skipping VERIFY or REVIEW.** Every phase goes through verification and review. The human can override findings, but the steps cannot be skipped.
485
+ - **No infinite retries.** The retry budget is 3 attempts. If exhausted, STOP and surface to the human. Do not extend the budget without explicit human instruction.
486
+ - **No modifying autopilot-state.json manually.** The state file is managed by the skill. If the state appears corrupted, start fresh rather than patching it.
487
+
488
+ ## Escalation
489
+
490
+ - **When the spec has no Implementation Order section:** Cannot identify phases. Ask the user to add phase annotations to the spec or provide a roadmap file.
491
+ - **When a delegated skill fails to produce expected output:** Check that handoff.json was written correctly. If the subagent failed, report the failure and ask the user whether to retry the entire phase step or stop.
492
+ - **When the user wants to reorder phases mid-run:** Update the phases array in autopilot-state.json (mark skipped phases, adjust currentPhase). Do not re-run completed phases.
493
+ - **When context limits are approaching:** Persist state immediately and inform the user: "Context limit approaching. State saved. Re-invoke /harness:autopilot to continue from this point."
494
+ - **When multiple phases fail in sequence:** After 2 consecutive phase failures (retry budget exhausted in both), suggest the user review the spec for systemic issues rather than continuing.
@@ -0,0 +1,52 @@
1
+ name: harness-autopilot
2
+ version: "1.0.0"
3
+ description: Autonomous phase execution loop — chains planning, execution, verification, and review, pausing only at human decision points
4
+ cognitive_mode: constructive-architect
5
+ triggers:
6
+ - manual
7
+ platforms:
8
+ - claude-code
9
+ - gemini-cli
10
+ tools:
11
+ - Bash
12
+ - Read
13
+ - Write
14
+ - Edit
15
+ - Glob
16
+ - Grep
17
+ cli:
18
+ command: harness skill run harness-autopilot
19
+ args:
20
+ - name: spec
21
+ description: Path to approved spec document
22
+ required: false
23
+ - name: path
24
+ description: Project root path
25
+ required: false
26
+ mcp:
27
+ tool: run_skill
28
+ input:
29
+ skill: harness-autopilot
30
+ path: string
31
+ type: rigid
32
+ phases:
33
+ - name: init
34
+ description: Load spec, identify phases, restore state if resuming
35
+ required: true
36
+ - name: loop
37
+ description: Execute state machine — assess, plan, execute, verify, review per phase
38
+ required: true
39
+ - name: complete
40
+ description: Final summary and PR offering
41
+ required: true
42
+ state:
43
+ persistent: true
44
+ files:
45
+ - .harness/autopilot-state.json
46
+ - .harness/state.json
47
+ - .harness/learnings.md
48
+ depends_on:
49
+ - harness-planning
50
+ - harness-execution
51
+ - harness-verification
52
+ - harness-code-review