coding-agent-harness 1.0.4 → 1.0.5
- package/CHANGELOG.md +7 -0
- package/LICENSE +661 -21
- package/LICENSE-EXCEPTION.md +37 -0
- package/README.md +33 -1
- package/README.zh-CN.md +23 -1
- package/SKILL.md +9 -8
- package/docs-release/architecture/overview.md +1 -1
- package/docs-release/architecture/overview.zh-CN.md +1 -1
- package/docs-release/architecture/system-explainer/01-system-overview.md +217 -0
- package/docs-release/architecture/system-explainer/02-module-dependency.md +257 -0
- package/docs-release/architecture/system-explainer/03-task-lifecycle.md +304 -0
- package/docs-release/architecture/system-explainer/04-check-and-governance.md +239 -0
- package/docs-release/architecture/system-explainer/05-data-flow.md +276 -0
- package/docs-release/architecture/system-explainer/06-preset-and-migration.md +303 -0
- package/docs-release/architecture/system-explainer/README.md +67 -0
- package/docs-release/architecture/system-explainer/en-US/01-system-overview.md +226 -0
- package/docs-release/architecture/system-explainer/en-US/02-module-dependency.md +263 -0
- package/docs-release/architecture/system-explainer/en-US/03-task-lifecycle.md +319 -0
- package/docs-release/architecture/system-explainer/en-US/04-check-and-governance.md +250 -0
- package/docs-release/architecture/system-explainer/en-US/05-data-flow.md +290 -0
- package/docs-release/architecture/system-explainer/en-US/06-preset-and-migration.md +323 -0
- package/docs-release/architecture/system-explainer/en-US/README.md +70 -0
- package/docs-release/guides/agent-installation.en-US.md +8 -7
- package/docs-release/guides/agent-installation.md +9 -7
- package/docs-release/guides/preset-development.md +26 -2
- package/docs-release/guides/task-state-machine.en-US.md +30 -13
- package/docs-release/guides/task-state-machine.md +30 -13
- package/examples/minimal-project/docs/09-PLANNING/TASKS/demo-task/INDEX.md +60 -0
- package/package.json +3 -2
- package/references/harness-ledger.md +1 -1
- package/scripts/commands/migration-command.mjs +30 -0
- package/scripts/commands/task-command.mjs +26 -25
- package/scripts/harness.mjs +7 -3
- package/scripts/lib/capability-registry.mjs +17 -21
- package/scripts/lib/check-module-parallel.mjs +9 -16
- package/scripts/lib/check-profiles.mjs +35 -81
- package/scripts/lib/check-task-contracts.mjs +13 -5
- package/scripts/lib/core-shared.mjs +55 -2
- package/scripts/lib/dashboard-data.mjs +126 -18
- package/scripts/lib/dashboard-workbench.mjs +80 -1
- package/scripts/lib/dashboard-writer.mjs +6 -2
- package/scripts/lib/git-status-summary.mjs +1 -1
- package/scripts/lib/governance-sync.mjs +180 -83
- package/scripts/lib/harness-core.mjs +1 -0
- package/scripts/lib/markdown-utils.mjs +33 -0
- package/scripts/lib/migration-planner.mjs +4 -6
- package/scripts/lib/phase-kind.mjs +50 -0
- package/scripts/lib/preset-engine.mjs +5 -8
- package/scripts/lib/preset-registry.mjs +188 -39
- package/scripts/lib/review-confirm-git-gate.mjs +1 -1
- package/scripts/lib/status-builder.mjs +88 -0
- package/scripts/lib/status-dashboard-renderer.mjs +7 -4
- package/scripts/lib/task-audit-metadata.mjs +385 -0
- package/scripts/lib/task-audit-migration.mjs +350 -0
- package/scripts/lib/task-completion-consistency.mjs +11 -1
- package/scripts/lib/task-lifecycle/create-task-helpers.mjs +67 -0
- package/scripts/lib/task-lifecycle/phase-sync.mjs +88 -0
- package/scripts/lib/task-lifecycle/review-confirm.mjs +40 -29
- package/scripts/lib/task-lifecycle/review-gates.mjs +13 -10
- package/scripts/lib/task-lifecycle/review-submission.mjs +63 -0
- package/scripts/lib/task-lifecycle/scaffold-provenance.mjs +49 -0
- package/scripts/lib/task-lifecycle/template-files.mjs +53 -0
- package/scripts/lib/task-lifecycle.mjs +114 -147
- package/scripts/lib/task-metadata.mjs +118 -0
- package/scripts/lib/task-review-model.mjs +54 -68
- package/scripts/lib/task-scanner.mjs +70 -143
- package/skills/preset-creator/references/complex-task-skeleton/brief.md +11 -0
- package/templates/AGENTS.md.template +7 -5
- package/templates/dashboard/assets/app-src/00-state.js +12 -0
- package/templates/dashboard/assets/app-src/10-router.js +3 -0
- package/templates/dashboard/assets/app-src/20-overview.js +7 -3
- package/templates/dashboard/assets/app-src/35-task-detail.js +46 -6
- package/templates/dashboard/assets/app-src/55-presets.js +375 -0
- package/templates/dashboard/assets/app-src/60-shared.js +3 -1
- package/templates/dashboard/assets/app-src/90-bindings.js +131 -0
- package/templates/dashboard/assets/app.css +583 -0
- package/templates/dashboard/assets/app.css.manifest.json +1 -0
- package/templates/dashboard/assets/app.js +578 -10
- package/templates/dashboard/assets/app.manifest.json +1 -0
- package/templates/dashboard/assets/css-src/00-foundation.css +4 -0
- package/templates/dashboard/assets/css-src/40-detail-modules-migration.css +62 -0
- package/templates/dashboard/assets/css-src/45-presets.css +516 -0
- package/templates/dashboard/assets/i18n.js +140 -2
- package/templates/planning/INDEX.md +87 -0
- package/templates/planning/brief.md +1 -1
- package/templates/planning/module_session_prompt.md +1 -0
- package/templates/planning/review.md +0 -18
- package/templates/planning/task_plan.md +4 -43
- package/templates/planning/visual_map.md +13 -9
- package/templates/planning/visual_map.simple.md +52 -0
- package/templates/reference/execution-workflow-standard.md +29 -2
- package/templates-zh-CN/AGENTS.md.template +7 -5
- package/templates-zh-CN/planning/INDEX.md +87 -0
- package/templates-zh-CN/planning/brief.md +1 -1
- package/templates-zh-CN/planning/module_session_prompt.md +1 -0
- package/templates-zh-CN/planning/review.md +0 -18
- package/templates-zh-CN/planning/task_plan.md +3 -63
- package/templates-zh-CN/planning/visual_map.md +14 -7
- package/templates-zh-CN/planning/visual_map.simple.md +48 -0
- package/templates-zh-CN/reference/execution-workflow-standard.md +31 -6
|
@@ -0,0 +1,319 @@
|
|
|
1
|
+
# 03 — Task Lifecycle
|
|
2
|
+
|
|
3
|
+
## Level 0 — A task's life
|
|
4
|
+
|
|
5
|
+
A task goes through six states from creation to closeout:
|
|
6
|
+
|
|
7
|
+
```mermaid
|
|
8
|
+
flowchart LR
|
|
9
|
+
A["not_started\nDirectory created"] --> B["planned\nPlan filled in"] --> C["in_progress\nExecuting"]
|
|
10
|
+
C --> D["review\nAwaiting human review"]
|
|
11
|
+
D --> E["done\nCloseout complete"]
|
|
12
|
+
C -->|"external block"| F["blocked"] -->|"unblocked"| C
|
|
13
|
+
D -->|"sent back for rework"| C
|
|
14
|
+
```
|
|
15
|
+
|
|
16
|
+
Each state transition is triggered by a corresponding CLI command. The `planned` state is
|
|
17
|
+
typically skipped in practice — Agents create a task and go directly to `in_progress`.
|
|
18
|
+
|
|
19
|
+
---
|
|
20
|
+
|
|
21
|
+
## Level 1 — States and their corresponding commands
|
|
22
|
+
|
|
23
|
+
```mermaid
|
|
24
|
+
flowchart TD
|
|
25
|
+
NS["not_started\nTask directory created\nFiles scaffolded"]
|
|
26
|
+
IP["in_progress\nExecuting"]
|
|
27
|
+
R["review\nAwaiting human review"]
|
|
28
|
+
D["done\nCloseout complete"]
|
|
29
|
+
BL["blocked\nBlocked by external dependency"]
|
|
30
|
+
|
|
31
|
+
NS -->|"harness task-start"| IP
|
|
32
|
+
IP -->|"harness task-review"| R
|
|
33
|
+
R -->|"harness review-confirm\n(manual operation)"| D
|
|
34
|
+
IP -->|"harness task-phase --blocked"| BL
|
|
35
|
+
BL -->|"harness task-start"| IP
|
|
36
|
+
R -->|"sent back for rework"| IP
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
**Key point**: `review-confirm` is the **only command in the entire system that cannot be
|
|
40
|
+
automatically executed by an Agent**. It requires a real human operation and writes an
|
|
41
|
+
auditable confirmation block with Git `user.name` / `user.email`.
|
|
42
|
+
|
|
43
|
+
---
|
|
44
|
+
|
|
45
|
+
## Level 2 — Budget determines gate strictness
|
|
46
|
+
|
|
47
|
+
Budget is the task's complexity level, and it directly determines how strict the review gates are:
|
|
48
|
+
|
|
49
|
+
| Gate | simple | standard | complex |
|
|
50
|
+
| --- | --- | --- | --- |
|
|
51
|
+
| Requires Visual Map phase progress | ✗ | ✓ | ✓ |
|
|
52
|
+
| Requires lesson_candidates.md | ✗ | ✓ | ✓ |
|
|
53
|
+
| Requires Agent to write review.md | ✗ | ✓ | ✓ |
|
|
54
|
+
| Requires all blocking findings closed | ✗ | ✓ | ✓ |
|
|
55
|
+
| Requires Walkthrough link | ✗ | ✓ | ✓ |
|
|
56
|
+
| Requires Lesson decision complete | ✗ | ✓ | ✓ |
|
|
57
|
+
| Requires human review-confirm | ✗ | ✓ | ✓ |
|
|
58
|
+
|
|
59
|
+
`simple` tasks can jump directly from `in_progress` to `done` with no gates.
|
|
60
|
+
`standard` and `complex` have identical gates — the difference is that `complex` tasks
|
|
61
|
+
typically require subagent authorization and adversarial review.
|
|
62
|
+
|
|
63
|
+
---
|
|
64
|
+
|
|
65
|
+
## Level 3 — Gate details for task-review
|
|
66
|
+
|
|
67
|
+
When an Agent runs `harness task-review`, the system performs three checks
|
|
68
|
+
**before entering review state** (`review-gates.mjs`):
|
|
69
|
+
|
|
70
|
+
```mermaid
|
|
71
|
+
flowchart TD
|
|
72
|
+
Trigger["harness task-review task-123"]
|
|
73
|
+
|
|
74
|
+
Trigger --> C1{"Current state == in_progress?"}
|
|
75
|
+
C1 -->|"no"| E1["❌ Rejected\nWrong state"]
|
|
76
|
+
C1 -->|"yes (and budget != simple)"| C2{"lesson_candidates.md exists?"}
|
|
77
|
+
C2 -->|"no"| E2["❌ Rejected\nMissing lesson candidates"]
|
|
78
|
+
C2 -->|"yes"| C3{"Visual Map has at least one phase\nwith recorded progress or evidence?"}
|
|
79
|
+
C3 -->|"no"| E3["❌ Rejected\nNo phase progress recorded"]
|
|
80
|
+
C3 -->|"yes"| OK["✅ Enter review state"]
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
How "phase has recorded progress" is determined (`review-gates.mjs`):
|
|
84
|
+
- `phase.completion > 0`, or
|
|
85
|
+
- `phase.state` is `in_progress / review / blocked / done`, or
|
|
86
|
+
- `phase.evidenceStatus` is `partial / present / waived`
|
|
87
|
+
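The three conditions above can be read as a single predicate. The following is a minimal sketch, not the code in `review-gates.mjs`: the function names and the plain-object phase shape are assumptions made for illustration, and only the field names (`completion`, `state`, `evidenceStatus`) and their values come from the list above.

```javascript
// Hypothetical sketch of the "phase has recorded progress" rule.
// Function names and the phase object shape are illustrative assumptions.
const PROGRESS_STATES = new Set(["in_progress", "review", "blocked", "done"]);
const PROGRESS_EVIDENCE = new Set(["partial", "present", "waived"]);

function phaseHasRecordedProgress(phase) {
  return (
    Number(phase.completion) > 0 ||
    PROGRESS_STATES.has(phase.state) ||
    PROGRESS_EVIDENCE.has(phase.evidenceStatus)
  );
}

// The Visual Map gate passes when at least one phase qualifies.
function visualMapHasProgress(phases) {
  return phases.some(phaseHasRecordedProgress);
}
```

Any one of the three signals is enough, so a phase that is still at 0% but already has waived evidence would count as progress under this reading.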
|
|
88
|
+
After entering review state, the Agent needs to write `review.md` and fill in the findings table.
|
|
89
|
+
|
|
90
|
+
---
|
|
91
|
+
|
|
92
|
+
## Level 3 — Gate details for review-confirm
|
|
93
|
+
|
|
94
|
+
When a human runs `harness review-confirm`, the system performs the following checks
|
|
95
|
+
**before executing the confirmation**:
|
|
96
|
+
|
|
97
|
+
```mermaid
|
|
98
|
+
flowchart TD
|
|
99
|
+
Trigger["harness review-confirm task-123"]
|
|
100
|
+
|
|
101
|
+
Trigger --> C1{"Confirmation text matches task ID?"}
|
|
102
|
+
C1 -->|"no"| E1["❌ Rejected\nWrong confirmation text"]
|
|
103
|
+
C1 -->|"yes"| C2{"No blocking review findings?\n(P0/P1/P2 open findings)"}
|
|
104
|
+
C2 -->|"no"| E2["❌ Rejected\nStill has unclosed blocking findings"]
|
|
105
|
+
C2 -->|"yes"| C3{"Git working tree clean?"}
|
|
106
|
+
C3 -->|"no"| E3["❌ Rejected\nHas uncommitted changes"]
|
|
107
|
+
C3 -->|"yes"| Exec["✅ Execute confirmation"]
|
|
108
|
+
|
|
109
|
+
Exec --> Write1["Write confirmation audit fields\nto INDEX.md"]
|
|
110
|
+
Write1 --> Commit1["Git commit #1\nchore: confirm review task-123"]
|
|
111
|
+
Commit1 --> Commit2["Git commit #2\nchore: record review confirmation audit task-123"]
|
|
112
|
+
```
|
|
113
|
+
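The gate order in the diagram can be sketched as a short function. This is an illustration under stated assumptions, not the harness implementation: the function name, the input and return shapes, the reason strings, and the strict-equality comparison of confirmation text against the task ID are all hypothetical.

```javascript
// Hypothetical sketch of the review-confirm gate order shown above.
// Input shape, reason strings, and exact-match comparison are assumptions.
const BLOCKING_SEVERITIES = new Set(["P0", "P1", "P2"]);

function checkReviewConfirmGates({ taskId, confirmText, findings, workingTreeClean }) {
  if (confirmText !== taskId) {
    return { ok: false, reason: "wrong-confirmation-text" };
  }
  // A finding blocks confirmation when it is still open and is P0, P1, or P2.
  const blocking = findings.filter(
    (f) => f.open && BLOCKING_SEVERITIES.has(f.severity)
  );
  if (blocking.length > 0) {
    return { ok: false, reason: "open-blocking-findings" };
  }
  if (!workingTreeClean) {
    return { ok: false, reason: "dirty-working-tree" };
  }
  return { ok: true, reason: null };
}
```

Checks run in the diagram's order and stop at the first rejection, so a dirty working tree is only reported once the confirmation text and findings have passed.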
|
|
114
|
+
**Two-commit strategy**: The first commit covers confirmation fields in `INDEX.md`; the second
|
|
115
|
+
commits the final audit metadata. Even if the second commit fails, the first commit has
|
|
116
|
+
already locked in the confirmation record.
|
|
117
|
+
|
|
118
|
+
**Task Audit Metadata confirmation fields** (written to `INDEX.md`):
|
|
119
|
+
|
|
120
|
+
```markdown
|
|
121
|
+
## Task Audit Metadata
|
|
122
|
+
|
|
123
|
+
| Field | Value |
|
|
124
|
+
| --- | --- |
|
|
125
|
+
| Human Review Status | confirmed |
|
|
126
|
+
| Confirmation ID | HRC-<timestamp> |
|
|
127
|
+
| Confirmed At | <ISO timestamp> |
|
|
128
|
+
| Reviewer | <git user.name> |
|
|
129
|
+
| Reviewer Email | <git user.email> |
|
|
130
|
+
| Confirm Text | <task id confirmation> |
|
|
131
|
+
| Evidence Checked | <evidence path> |
|
|
132
|
+
| Review Commit SHA | <git commit sha> |
|
|
133
|
+
| Audit Status | committed |
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
---
|
|
137
|
+
|
|
138
|
+
## Level 3 — lifecycleState derivation logic
|
|
139
|
+
|
|
140
|
+
`lifecycleState` is derived from the task state, review status, and closeout status together. It is not stored in files
|
|
141
|
+
and is recalculated on every run.
|
|
142
|
+
|
|
143
|
+
The complete decision tree for the derivation function `deriveLifecycleState()` (in priority order):
|
|
144
|
+
|
|
145
|
+
| Condition | lifecycleState |
|
|
146
|
+
| --- | --- |
|
|
147
|
+
| `reviewStatus == "blocked-open-findings"` | `review-blocked` |
|
|
148
|
+
| `closeoutStatus == "closed"` and `reviewStatus != "confirmed"` | `closed-review-pending` |
|
|
149
|
+
| `closeoutStatus == "closed"` | `closed` |
|
|
150
|
+
| `state == "blocked"` | `blocked` |
|
|
151
|
+
| `state == "done"` | `closing` |
|
|
152
|
+
| `state == "review"` | `in_review` |
|
|
153
|
+
| `state == "in_progress"` | `active` |
|
|
154
|
+
| `state == "planned"` or `"not_started"` | `ready` |
|
|
155
|
+
| other | `unknown` |
|
|
156
|
+
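Read top to bottom, the table is a chain of early returns. The sketch below restates it in code as an illustration only: the real `deriveLifecycleState()` may take different arguments, and the single-object parameter used here is an assumption.

```javascript
// Hypothetical restatement of the priority table above.
// The parameter shape is an assumption; conditions and results are from the table.
function deriveLifecycleState({ state, reviewStatus, closeoutStatus }) {
  if (reviewStatus === "blocked-open-findings") return "review-blocked";
  if (closeoutStatus === "closed") {
    return reviewStatus === "confirmed" ? "closed" : "closed-review-pending";
  }
  if (state === "blocked") return "blocked";
  if (state === "done") return "closing";
  if (state === "review") return "in_review";
  if (state === "in_progress") return "active";
  if (state === "planned" || state === "not_started") return "ready";
  return "unknown";
}
```

Because the rows are evaluated in priority order, a task with open blocking findings is reported as `review-blocked` even if its closeout status is already `closed`.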
|
|
157
|
+
---
|
|
158
|
+
|
|
159
|
+
## Level 3 — Lifecycle queues
|
|
160
|
+
|
|
161
|
+
Tasks are automatically assigned to different queues based on their current state, and these
|
|
162
|
+
queues are visible in the Dashboard. **A task can belong to multiple queues simultaneously**
|
|
163
|
+
(e.g., both `missing-materials` and `blocked` at the same time).
|
|
164
|
+
|
|
165
|
+
Queue assignment logic (`deriveTaskQueues()`):
|
|
166
|
+
|
|
167
|
+
```mermaid
|
|
168
|
+
flowchart TD
|
|
169
|
+
Start["Task"]
|
|
170
|
+
|
|
171
|
+
Start --> T1{"tombstone.deletionState != active?"}
|
|
172
|
+
T1 -->|"yes"| Q_DEL["soft-deleted-superseded queue"]
|
|
173
|
+
T1 -->|"no"| T2{"Has materials issues?"}
|
|
174
|
+
|
|
175
|
+
T2 -->|"yes"| Q_MAT["missing-materials queue"]
|
|
176
|
+
T2 -->|"no"| T3{"Has blocking reasons?"}
|
|
177
|
+
|
|
178
|
+
T3 -->|"yes"| Q_BLK["blocked queue"]
|
|
179
|
+
T3 -->|"no"| T4{"Review submitted + ready for confirmation\n+ no lesson work\n+ no other queues?"}
|
|
180
|
+
|
|
181
|
+
T4 -->|"yes"| Q_REV["review queue"]
|
|
182
|
+
T4 -->|"no"| T5{"Has lesson work?"}
|
|
183
|
+
|
|
184
|
+
T5 -->|"yes"| Q_LES["lessons queue"]
|
|
185
|
+
T5 -->|"no"| T6{"Review confirmed?"}
|
|
186
|
+
|
|
187
|
+
T6 -->|"yes, closeoutStatus=closed"| Q_FIN["finalized queue"]
|
|
188
|
+
T6 -->|"yes, other"| Q_CON["confirmed queue"]
|
|
189
|
+
T6 -->|"no, queue empty"| Q_ACT["active queue"]
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
**Sources of blocking reasons**: materials issues, P0-P2 blocking findings, state conflicts,
|
|
193
|
+
outdated scanner version.
|
|
194
|
+
|
|
195
|
+
---
|
|
196
|
+
|
|
197
|
+
## Level 4 — Governance Sync: how state changes are written to the ledger
|
|
198
|
+
|
|
199
|
+
Every task state change triggers `syncTaskGovernance()`, which atomically updates `Harness-Ledger.md`.
|
|
200
|
+
|
|
201
|
+
**Lock mechanism** (`governance-sync.mjs`):
|
|
202
|
+
|
|
203
|
+
```mermaid
|
|
204
|
+
sequenceDiagram
|
|
205
|
+
participant CLI as harness CLI
|
|
206
|
+
participant Lock as .harness/locks/governance-sync.lock
|
|
207
|
+
participant Ledger as docs/Harness-Ledger.md
|
|
208
|
+
participant Git
|
|
209
|
+
|
|
210
|
+
CLI->>Lock: fs.openSync(lockPath, "wx")\n(exclusive write, throws EEXIST if exists)
|
|
211
|
+
Note over Lock: If lock exists → GovernanceSyncError\ncode: governance-lock-exists
|
|
212
|
+
CLI->>Ledger: Update the row for task-123
|
|
213
|
+
CLI->>Git: git add + git commit
|
|
214
|
+
Git-->>CLI: commit SHA
|
|
215
|
+
CLI->>Lock: fs.unlinkSync(lockPath)\n(delete lock file)
|
|
216
|
+
```
|
|
217
|
+
|
|
218
|
+
The lock file is created with the `wx` flag (write + exclusive) — this is an atomic Node.js
|
|
219
|
+
filesystem operation. If the file already exists, `openSync` throws `EEXIST` and won't overwrite.
|
|
220
|
+
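As an illustration of the exclusive-create pattern described above, here is a minimal sketch. It is not the code in `governance-sync.mjs`: the helper name `withGovernanceLock`, the plain `Error` carrying the `governance-lock-exists` code, and the temporary lock path are assumptions, and the snippet uses CommonJS `require` only so it can be pasted into a scratch file.

```javascript
// Minimal sketch of an exclusive-create lock (CommonJS for easy pasting).
// withGovernanceLock and the plain Error wrapper are hypothetical.
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

function withGovernanceLock(lockPath, work) {
  let fd;
  try {
    // "wx": open for writing, fail with EEXIST if the path already exists.
    fd = fs.openSync(lockPath, "wx");
  } catch (err) {
    if (err.code === "EEXIST") {
      const lockErr = new Error(`lock already held: ${lockPath}`);
      lockErr.code = "governance-lock-exists";
      throw lockErr;
    }
    throw err;
  }
  try {
    return work();
  } finally {
    // Always release the lock, even when work() throws.
    fs.closeSync(fd);
    fs.unlinkSync(lockPath);
  }
}

// Usage: the lock file lives in a throwaway temp directory for this demo.
const lockDir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-demo-"));
const lockPath = path.join(lockDir, "governance-sync.lock");
const result = withGovernanceLock(lockPath, () => "ledger updated");
```

Releasing the lock in `finally` is a design choice of this sketch; the source does not say how the real implementation handles a crash between acquiring the lock and deleting it.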
|
|
221
|
+
**Difference from `governance rebuild`**:
|
|
222
|
+
|
|
223
|
+
| Operation | How triggered | Write target | Frequency |
|
|
224
|
+
| --- | --- | --- | --- |
|
|
225
|
+
| `syncTaskGovernance` | Automatic (on every state change) | Corresponding row in `Harness-Ledger.md` | High frequency |
|
|
226
|
+
| `rebuildGovernanceIndexes` | Manual (`harness governance rebuild`) | `docs/09-PLANNING/generated/` index tables | Low frequency |
|
|
227
|
+
|
|
228
|
+
---
|
|
229
|
+
|
|
230
|
+
## Level 3 — Tombstone: soft-delete and merge
|
|
231
|
+
|
|
232
|
+
Tasks can be soft-deleted, merged, or superseded rather than physically deleted.
|
|
233
|
+
The Tombstone block is appended to the end of `task_plan.md` (not replacing existing content),
|
|
234
|
+
preserving the historical audit trail.
|
|
235
|
+
|
|
236
|
+
Supported operations:
|
|
237
|
+
- `supersedeTask()`: mark as replaced by a new task
|
|
238
|
+
- `softDeleteTask()`: soft-delete
|
|
239
|
+
- `archiveTask()`: archive
|
|
240
|
+
- `reopenTask()`: remove the Tombstone block and reactivate the task
|
|
241
|
+
|
|
242
|
+
**Tombstone block format**:
|
|
243
|
+
|
|
244
|
+
```markdown
|
|
245
|
+
## Task Tombstone
|
|
246
|
+
|
|
247
|
+
| Field | Value |
|
|
248
|
+
| --- | --- |
|
|
249
|
+
| State | superseded |
|
|
250
|
+
| Superseded By | new-task-id |
|
|
251
|
+
| Reason | <reason text> |
|
|
252
|
+
| Operator | coordinator |
|
|
253
|
+
| Timestamp | <ISO timestamp> |
|
|
254
|
+
| Reopen Eligible | yes |
|
|
255
|
+
| Archive Eligible | no |
|
|
256
|
+
```
|
|
257
|
+
|
|
258
|
+
---
|
|
259
|
+
|
|
260
|
+
## Level 2 — Design decisions
|
|
261
|
+
|
|
262
|
+
### Why lifecycleState is needed as a derived state
|
|
263
|
+
|
|
264
|
+
`task.state` is the raw execution phase that Agents write into `progress.md` — it only has
|
|
265
|
+
coarse-grained values and has many historical aliases (`complete`, `completed`, `doing`,
|
|
266
|
+
`active`, etc.). This field can't distinguish "Agent says it's done" from "human confirmed
|
|
267
|
+
it's done", and can't distinguish "waiting for human review" from "missing materials".
|
|
268
|
+
|
|
269
|
+
`lifecycleState` is derived from multiple files and is the primary lifecycle semantic for
|
|
270
|
+
the Dashboard. The core scenario driving this design: a task with `task.state = review`
|
|
271
|
+
might actually be in three completely different governance states — "missing materials",
|
|
272
|
+
"has open P0 finding", or "waiting for human review" — but the old model lumped all three
|
|
273
|
+
into the same review queue.
|
|
274
|
+
|
|
275
|
+
### Why a task can belong to multiple queues simultaneously
|
|
276
|
+
|
|
277
|
+
A task can simultaneously be "waiting for human review" (Review queue) and "has pending
|
|
278
|
+
lesson candidate" (Lessons queue). These two things have different responsible parties
|
|
279
|
+
(the former is the human reviewer, the latter is the coordinator) and different exit
|
|
280
|
+
conditions — they shouldn't be merged into one state. The multi-queue model lets each
|
|
281
|
+
governance concern be tracked independently.
|
|
282
|
+
|
|
283
|
+
### Why Tombstone doesn't physically delete the task directory
|
|
284
|
+
|
|
285
|
+
The document library has no database-level foreign keys. Physical deletion would leave
|
|
286
|
+
orphan references (the Ledger, Closeout SSoT, and other tasks' `Supersedes` fields may
|
|
287
|
+
all point to the deleted task). Tombstone markers let the Soft-deleted / Superseded queue
|
|
288
|
+
provide read-only traceability for "why isn't this task in the active queue".
|
|
289
|
+
|
|
290
|
+
### Why review-confirm requires two Git commits
|
|
291
|
+
|
|
292
|
+
Two commits turn the confirmation commit's SHA into an immutable, verifiable reference. The first commit covers
|
|
293
|
+
the confirmation itself; the second commit contains an audit record with the first commit's
|
|
294
|
+
SHA. If only files were written without committing, Agents could forge confirmation state
|
|
295
|
+
without leaving a Git history. The validator can verify that the confirmation commit
|
|
296
|
+
actually exists in Git history.
|
|
297
|
+
|
|
298
|
+
### Why governance-sync uses a file lock instead of Git's own lock
|
|
299
|
+
|
|
300
|
+
Git's own lock (`.git/index.lock`) only protects index operations, not the read-modify-write
|
|
301
|
+
sequence on Markdown files. Two concurrent CLI processes could simultaneously read the same
|
|
302
|
+
governance table, each modify it, then commit one after the other — the second would
|
|
303
|
+
overwrite the first's row updates. The file lock's granularity is "the entire governance
|
|
304
|
+
sync operation", not a single git command.
|
|
305
|
+
|
|
306
|
+
### Why simple budget skips all gates
|
|
307
|
+
|
|
308
|
+
Simple tasks correspond to trivial changes (doc corrections, config adjustments). Forcing
|
|
309
|
+
them through `task-review → review-confirm → task-complete` would make the overhead exceed
|
|
310
|
+
the value of the task itself. This is an intentional fast path, not an oversight.
|
|
311
|
+
|
|
312
|
+
### The design intent of the Lesson system
|
|
313
|
+
|
|
314
|
+
The Lesson system transforms reusable knowledge discovered during task execution from
|
|
315
|
+
"mentioned in chat" into a governance object that is "trackable, reviewable, and
|
|
316
|
+
promotable into standard docs". Lesson candidate decisions must be completed before
|
|
317
|
+
`review-confirm`, because `review-confirm` is the responsibility transfer point — once
|
|
318
|
+
human confirmation is done, the task enters finalization, and requiring the Agent to
|
|
319
|
+
add lesson decisions at that point would create accountability confusion.
|
|
@@ -0,0 +1,250 @@
|
|
|
1
|
+
# 04 — Check System and Governance
|
|
2
|
+
|
|
3
|
+
## Level 0 — The purpose of checks
|
|
4
|
+
|
|
5
|
+
The core question behind `harness check` and `harness status` is:
|
|
6
|
+
|
|
7
|
+
> **Is this repository's documentation state compliant?**
|
|
8
|
+
|
|
9
|
+
The definition of "compliant" varies by context, which is why there are three profiles.
|
|
10
|
+
Each profile corresponds to a different use case and runs a different subset of validators.
|
|
11
|
+
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
## Level 1 — Three check profiles
|
|
15
|
+
|
|
16
|
+
```mermaid
|
|
17
|
+
flowchart LR
|
|
18
|
+
subgraph "source-package"
|
|
19
|
+
SP["Validates harness itself\nPublished package integrity\n(for CI)"]
|
|
20
|
+
end
|
|
21
|
+
|
|
22
|
+
subgraph "private-harness"
|
|
23
|
+
PH["Validates private operating repo\n(e.g. .harness-private)"]
|
|
24
|
+
end
|
|
25
|
+
|
|
26
|
+
subgraph "target-project"
|
|
27
|
+
TP["Validates user's target project\n(most common, default)"]
|
|
28
|
+
end
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
| Profile | Typical command | Purpose |
|
|
32
|
+
| --- | --- | --- |
|
|
33
|
+
| `source-package` | `harness check --profile source-package .` | CI validation of harness's own published package; checks staged file boundaries (`.harness-private/` or generated dashboard must not be tracked) |
|
|
34
|
+
| `private-harness` | `harness check --profile private-harness .harness-private` | Validates private operating records |
|
|
35
|
+
| `target-project` | `harness check ~/my-app` (default) | Validates user project compliance, runs the full set of 9 validators |
|
|
36
|
+
|
|
37
|
+
**Special check for source-package**: In addition to running validators, it calls
|
|
38
|
+
`validateSourcePackageBoundary()` to check whether git staged files contain content
|
|
39
|
+
that shouldn't be published (`.harness-private/`, generated dashboard, etc.).
|
|
40
|
+
|
|
41
|
+
---
|
|
42
|
+
|
|
43
|
+
## Level 2 — Which validators buildStatus() calls
|
|
44
|
+
|
|
45
|
+
`buildStatus()` is the core check function. It calls 9 validators in sequence:
|
|
46
|
+
|
|
47
|
+
```mermaid
|
|
48
|
+
flowchart TD
|
|
49
|
+
BS["buildStatus()"]
|
|
50
|
+
|
|
51
|
+
BS --> V1["① validateCapabilities\nCapability registry vs actual files consistency\nDependency requirements satisfied"]
|
|
52
|
+
BS --> V2["② validateReviewSchema\nreview.md has required sections\nFindings table format is compliant"]
|
|
53
|
+
BS --> V3["③ validateVisualMaps\nvisual_map.md format is compliant\nPhase fields are complete"]
|
|
54
|
+
BS --> V4["④ validatePlanContracts\nTask files have Task Contract marker"]
|
|
55
|
+
BS --> V5["⑤ validateTaskPresetContracts\nPreset task contracts are complete\nResource declarations exist"]
|
|
56
|
+
BS --> V6["⑥ validateContextDocs\nContext docs (brief etc.) exist"]
|
|
57
|
+
BS --> V7["⑦ validateGovernanceTableBoundaries\nGovernance table content is compliant\n(no execution logs, temp prompts, etc.)"]
|
|
58
|
+
BS --> V8["⑧ validateSubagentAuthorization\nSubagent authorization state is recorded\nAuthorization fields are complete"]
|
|
59
|
+
BS --> V9["⑨ validateTaskCompletionConsistency\nCompleted tasks' Visual Map phases are also complete"]
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
Each validator returns `failures` (hard failures, must fix) and `warnings` (soft warnings, recommended to fix).
|
|
63
|
+
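One way to picture the "call in sequence, collect two lists" behaviour is the sketch below. It is a simplified assumption about the shape of the aggregation, not the body of `buildStatus()`: the `runValidators` name, the `context` argument, and the tolerance for missing arrays are all hypothetical.

```javascript
// Hypothetical sketch: run validators in order and merge their results.
// Each validator is assumed to return { failures: [...], warnings: [...] }.
function runValidators(validators, context) {
  const failures = [];
  const warnings = [];
  for (const validate of validators) {
    const result = validate(context);
    failures.push(...(result.failures ?? []));
    warnings.push(...(result.warnings ?? []));
  }
  return { failures, warnings };
}
```

Under this reading, a CI wrapper would treat a non-empty `failures` list as a failed check and only report `warnings`.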
|
|
64
|
+
> **Note**: `check-module-parallel.mjs` exists in `scripts/lib/` but is **not** in `buildStatus()`'s
|
|
65
|
+
> call chain — it's a standalone tool for validating worktree isolation in module parallel work.
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
69
|
+
## Level 3 — What each validator checks
|
|
70
|
+
|
|
71
|
+
### ① validateCapabilities
|
|
72
|
+
|
|
73
|
+
Reads `.harness-capabilities.json` and checks:
|
|
74
|
+
- Whether declared capabilities are all valid capability names (in the `allowedCapabilities` enum)
|
|
75
|
+
- Whether capability dependencies are all enabled (e.g. `subagent-worker` requires `module-parallel` first)
|
|
76
|
+
- Whether the artifact paths corresponding to capabilities exist
|
|
77
|
+
|
|
78
|
+
### ② validateReviewSchema
|
|
79
|
+
|
|
80
|
+
Scans all `review.md` files and checks whether each contains 4 required sections
|
|
81
|
+
(string matching, supports both English and Chinese):
|
|
82
|
+
|
|
83
|
+
1. `Reviewer Identity`
|
|
84
|
+
2. `Confidence Challenge`
|
|
85
|
+
3. `Evidence Checked`
|
|
86
|
+
4. `Final Confidence Basis`
|
|
87
|
+
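The string-matching check for these four sections could look roughly like the following sketch. It is illustrative only: the function name is hypothetical, and because the Chinese section titles are not listed in this document, only the English names are matched here.

```javascript
// Hypothetical sketch of the required-section check (English titles only;
// the Chinese equivalents mentioned above are not listed in this document).
const REQUIRED_REVIEW_SECTIONS = [
  "Reviewer Identity",
  "Confidence Challenge",
  "Evidence Checked",
  "Final Confidence Basis",
];

function missingReviewSections(reviewMarkdown) {
  return REQUIRED_REVIEW_SECTIONS.filter(
    (section) => !reviewMarkdown.includes(section)
  );
}
```

An empty result means every required section title appears somewhere in the file; plain substring matching does not verify that the title is actually a heading.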
|
|
88
|
+
For findings tables, also checks:
|
|
89
|
+
- Must have Severity (P0-P3), Open (yes/no), Disposition, Blocks Release columns
|
|
90
|
+
- **P0/P1 severity findings cannot have open=yes or blocks=yes simultaneously** (hard failure)
|
|
91
|
+
- `accepted-risk` / `deferred` disposition must have follow-up routing
|
|
92
|
+
- Evidence ID references (`E-\d+`) must exist in the Evidence ID table
|
|
93
|
+
|
|
94
|
+
For verifier-backed reviews, also requires `template_id: harness-verifier/v1` and
|
|
95
|
+
`verdict: pass|fail|inconclusive`.
|
|
96
|
+
|
|
97
|
+
### ③ validateVisualMaps
|
|
98
|
+
|
|
99
|
+
Checks that the Phase ID table in `visual_map.md` contains all 9 required columns:
|
|
100
|
+
`Phase ID, Depends On, State, Completion, Output, Required Evidence, Evidence Status, Blocking Risk, Owner / Handoff`
|
|
101
|
+
|
|
102
|
+
Validation rules:
|
|
103
|
+
- `State` must be in `allowedPhaseStates`
|
|
104
|
+
- `Evidence Status` must be in `allowedEvidenceStatus`
|
|
105
|
+
- `Completion` must be an integer from 0-100
|
|
106
|
+
- When `state=done`, `completion` must equal 100
|
|
107
|
+
- When `state=planned`, `completion` must equal 0
|
|
108
|
+
- Visual maps from canonical sources require a `Visual Map Contract: v1.0` marker
|
|
109
|
+
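The per-row rules above can be sketched as a small validator. This is an illustration, not `validateVisualMaps` itself: the function name and row shape are hypothetical, `completion` is assumed to be already parsed to a number, and the two allow-lists are passed in because the actual contents of `allowedPhaseStates` and `allowedEvidenceStatus` are not listed in this document.

```javascript
// Hypothetical sketch of the row rules above. The allow-lists are parameters
// because their real contents are not given here; row.completion is assumed
// to be a number already.
function validatePhaseRow(row, allowedPhaseStates, allowedEvidenceStatus) {
  const failures = [];
  if (!allowedPhaseStates.includes(row.state)) {
    failures.push(`invalid State: ${row.state}`);
  }
  if (!allowedEvidenceStatus.includes(row.evidenceStatus)) {
    failures.push(`invalid Evidence Status: ${row.evidenceStatus}`);
  }
  const c = row.completion;
  if (!Number.isInteger(c) || c < 0 || c > 100) {
    failures.push(`Completion must be an integer from 0-100: ${c}`);
  } else {
    if (row.state === "done" && c !== 100) {
      failures.push("state=done requires completion=100");
    }
    if (row.state === "planned" && c !== 0) {
      failures.push("state=planned requires completion=0");
    }
  }
  return failures;
}
```

The cross-field rules are only evaluated once `Completion` itself is valid, which keeps a single bad cell from producing two overlapping messages.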
|
|
110
|
+
### ④ validatePlanContracts
|
|
111
|
+
|
|
112
|
+
Checks whether `task_plan.md` contains a `Task Contract: harness-task/v1` marker line.
|
|
113
|
+
This is the most basic requirement for a task to be recognized by harness.
|
|
114
|
+
|
|
115
|
+
### ⑤ validateTaskPresetContracts
|
|
116
|
+
|
|
117
|
+
For tasks that use a Preset, checks:
|
|
118
|
+
- Whether the resource files declared by the Preset exist
|
|
119
|
+
- Whether `references/INDEX.md` has the corresponding index row
|
|
120
|
+
- Whether the "Preset Required Reads" in `task_plan.md` lists the required reads
|
|
121
|
+
|
|
122
|
+
### ⑦ validateGovernanceTableBoundaries
|
|
123
|
+
|
|
124
|
+
Checks content compliance for 5 global governance tables:
|
|
125
|
+
|
|
126
|
+
| Table | Content not allowed |
|
|
127
|
+
| --- | --- |
|
|
128
|
+
| Feature-SSoT | Module-level details, overly long evidence descriptions |
|
|
129
|
+
| Harness-Ledger | Execution logs, temporary fix prompts, raw conversation records |
|
|
130
|
+
| Closeout-SSoT | Execution logs, raw conversation records |
|
|
131
|
+
| Regression-SSoT | Execution logs, temporary prompts |
|
|
132
|
+
| Cadence-Ledger | Raw conversation records, temporary prompts |
|
|
133
|
+
|
|
134
|
+
**Time boundary**: Rows before 2026-05-24 are marked as legacy and only produce warnings;
|
|
135
|
+
rows after that date produce failures.
|
|
136
|
+
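The time boundary amounts to a date comparison per row. The sketch below is an assumption-heavy illustration: the function name is hypothetical, row dates are assumed to be ISO `YYYY-MM-DD` strings, and treating a row dated exactly on the boundary as non-legacy is a guess, since the text only says "before" and "after".

```javascript
// Hypothetical sketch of the legacy boundary. Row dates are assumed to be
// ISO "YYYY-MM-DD" strings; the handling of the boundary day itself is a guess.
const LEGACY_BOUNDARY = "2026-05-24";

function severityForGovernanceRow(rowDate) {
  // ISO dates in the same format compare correctly as plain strings.
  return rowDate < LEGACY_BOUNDARY ? "warning" : "failure";
}
```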
|
|
137
|
+
### ⑧ validateSubagentAuthorization
|
|
138
|
+
|
|
139
|
+
Scans all `execution_strategy.md` files for the **Subagent Authorization** table.
|
|
140
|
+
|
|
141
|
+
For rows with worker role and status `authorized`, checks completeness of 4 fields:
|
|
142
|
+
`Authorized By, Authorized At, Scope, Worktree / Branch`
|
|
143
|
+
|
|
144
|
+
Field values must be "concrete" (non-empty, not placeholder `[...]`, not `pending/n/a/none/—` etc.).
|
|
145
|
+
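A "concrete value" test along these lines might look like the sketch below. It is not the validator's code: the function name is hypothetical, the rejected words are only the examples named above (the original list ends in "etc."), and the case-insensitive comparison is an assumption.

```javascript
// Hypothetical sketch of the "concrete value" rule. Only the examples named
// above are rejected; case-insensitive matching is an assumption.
const NON_CONCRETE_VALUES = new Set(["pending", "n/a", "none", "—"]);

function isConcreteValue(value) {
  const text = String(value ?? "").trim();
  if (text === "") return false;
  if (/^\[.*\]$/.test(text)) return false; // bracketed placeholder such as [...]
  return !NON_CONCRETE_VALUES.has(text.toLowerCase());
}
```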
|
|
146
|
+
If the **Subagent Delegation Decision** table has a worker with decision=`ask-user`,
|
|
147
|
+
there must be a corresponding resolved row in the **User Authorization Decision** table.
|
|
148
|
+
|
|
149
|
+
**warning vs failure**: produces failure in strict mode; produces warning in adoption mode.
|
|
150
|
+
|
|
151
|
+
### ⑨ validateTaskCompletionConsistency
|
|
152
|
+
|
|
153
|
+
Checks tasks with `task.state=done` to see whether all phases in their Visual Map are also complete.
|
|
154
|
+
|
|
155
|
+
"Complete" is determined as: `phase.state=skipped` or `(phase.state=done and phase.completion=100)`.
|
|
156
|
+
|
|
157
|
+
If incomplete phases exist:
|
|
158
|
+
- `closeoutStatus=closed` → **failure**
|
|
159
|
+
- Otherwise → **warning**
|
|
160
|
+
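Put together, the completeness rule and the failure/warning split can be sketched as follows. The function names, the `phases[].id` field, and the return shape are assumptions for illustration; the conditions themselves are taken from the text above.

```javascript
// Hypothetical sketch of the completion-consistency rule described above.
// Names, the phase "id" field, and the return shape are assumptions.
function isPhaseComplete(phase) {
  return (
    phase.state === "skipped" ||
    (phase.state === "done" && phase.completion === 100)
  );
}

function checkCompletionConsistency(task) {
  if (task.state !== "done") return null; // only done tasks are checked
  const incomplete = task.phases.filter((p) => !isPhaseComplete(p));
  if (incomplete.length === 0) return null;
  return {
    level: task.closeoutStatus === "closed" ? "failure" : "warning",
    phaseIds: incomplete.map((p) => p.id),
  };
}
```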
|
|
161
|
+
---
|
|
162
|
+
|
|
163
|
+
## Level 2 — Check output structure
|
|
164
|
+
|
|
165
|
+
Core fields in `harness status --json` output:
|
|
166
|
+
|
|
167
|
+
```mermaid
|
|
168
|
+
flowchart TD
|
|
169
|
+
Output["status JSON"]
|
|
170
|
+
|
|
171
|
+
Output --> F["failures[]\nHard failures, must fix\n(blocks CI)"]
|
|
172
|
+
Output --> W["warnings[]\nSoft warnings, recommended to fix\n(doesn't block CI)"]
|
|
173
|
+
Output --> T["tasks[]\nStructured data for all tasks"]
|
|
174
|
+
Output --> C["capabilities[]\nCapability registry state"]
|
|
175
|
+
Output --> G["git\nGit status summary (dirty files etc.)"]
|
|
176
|
+
|
|
177
|
+
T --> TF["Each task contains:\nid / title / state / budget\nlifecycleState / reviewQueueState\ntaskQueues[] / phases[]\ncloseoutStatus / tombstone\nlessonCandidateDecisionComplete\nbriefSource / visualMapSource\ntaskPreset / presetVersion\nhandoffs[]"]
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
---
|
|
181
|
+
|
|
182
|
+
## Level 3 — Governance index rebuild
|
|
183
|
+
|
|
184
|
+
`harness governance rebuild --apply` rebuilds global index tables from task scan results:
|
|
185
|
+
|
|
186
|
+
```mermaid
|
|
187
|
+
flowchart TD
|
|
188
|
+
Cmd["harness governance rebuild --apply"]
|
|
189
|
+
|
|
190
|
+
Cmd --> Scan["collectTasks()\nScan all tasks"]
|
|
191
|
+
Scan --> Filter["Filter out tasks where deletionState == deleted"]
|
|
192
|
+
Filter --> Sort["Sort alphabetically by id"]
|
|
193
|
+
Sort --> Gen["Generate governance surfaces"]
|
|
194
|
+
|
|
195
|
+
Gen --> TI["docs/09-PLANNING/generated/task-index.md\nSummary index table for all tasks"]
|
|
196
|
+
Gen --> MI["docs/09-PLANNING/generated/module-index.md\nModule step index table"]
|
|
197
|
+
|
|
198
|
+
TI --> Atomic["Atomic write\n(governance-sync lock + git commit)"]
|
|
199
|
+
MI --> Atomic
|
|
200
|
+
```
|
|
201
|
+
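The filter and sort steps of the rebuild can be illustrated with a short sketch. It is not the `rebuildGovernanceIndexes` implementation: the function name, the task shape (`id`, `title`, `state`, `tombstone`), and the three-column row layout are assumptions.

```javascript
// Hypothetical sketch of the filter and sort steps in the diagram above.
// Task shape and the row layout are illustrative assumptions.
function buildTaskIndexRows(tasks) {
  return tasks
    .filter((t) => t.tombstone?.deletionState !== "deleted")
    .sort((a, b) => a.id.localeCompare(b.id))
    .map((t) => `| ${t.id} | ${t.title} | ${t.state} |`);
}
```

Because `filter` returns a new array, the in-place `sort` does not reorder the caller's task list.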
|
|
202
|
+
This operation is **manually triggered** and does not run automatically on every task state change.
|
|
203
|
+
What runs automatically is `syncTaskGovernance()`, which only updates the corresponding row
|
|
204
|
+
in `Harness-Ledger.md`.
|
|
205
|
+
|
|
206
|
+
**Why they're separate**: `Harness-Ledger.md` is a high-frequency write ledger (updated on
|
|
207
|
+
every state change), while the `generated/` index tables are low-frequency full rebuilds
|
|
208
|
+
(require scanning all tasks, which is costly). Keeping them separate avoids triggering a
|
|
209
|
+
full scan on every state change.
|
|
210
|
+
|
|
211
|
+
---
|
|
212
|
+
|
|
213
|
+
## Level 2 — Design decisions
|
|
214
|
+
|
|
215
|
+
### Why the validator has two levels: failures and warnings
|
|
216
|
+
|
|
217
|
+
Both levels existed from the start. The design motivation was **migration compatibility**:
|
|
218
|
+
- Newly installed projects (`strict=true`) report failure for missing files and block CI
|
|
219
|
+
- Legacy projects in `safe-adoption` mode report the same missing files as
|
|
220
|
+
`adoption-needed: ...` warnings without blocking
|
|
221
|
+
|
|
222
|
+
This lets harness gradually tighten standards without breaking existing users.
|
|
223
|
+
Three or more levels were never considered — two levels are sufficient to distinguish
|
|
224
|
+
"must fix" from "recommended migration".
|
|
225
|
+
|
|
226
|
+
### Why governance tables have a time boundary
|
|
227
|
+
|
|
228
|
+
Rows before 2026-05-24 are marked as legacy and only produce warnings; rows after produce failures.
|
|
229
|
+
This is because governance table content standards were introduced later, and historical data
|
|
230
|
+
can't suddenly become hard failures that block all operations. The time boundary requires
|
|
231
|
+
newly written rows to be compliant while giving historical data a migration window.
|
|
232
|
+
|
|
233
|
+
### Why subagent authorization distinguishes strict and adoption modes
|
|
234
|
+
|
|
235
|
+
The strictness of subagent authorization checks depends on project maturity:
|
|
236
|
+
- New projects require complete authorization records from the start (strict → failure)
|
|
237
|
+
- Legacy projects during migration may have many historical tasks missing authorization
|
|
238
|
+
records (adoption → warning)
|
|
239
|
+
|
|
240
|
+
This avoids the poor user experience of "all historical tasks suddenly reporting errors
|
|
241
|
+
after adopting harness".
|
|
242
|
+
|
|
243
|
+
### Why validateTaskCompletionConsistency distinguishes closed vs non-closed
|
|
244
|
+
|
|
245
|
+
If a task already has `closeoutStatus=closed` (human-confirmed closeout) but still has
|
|
246
|
+
incomplete phases in the Visual Map, this is a serious inconsistency — it means the
|
|
247
|
+
closeout confirmation missed a check, so the validator must report a failure.
|
|
248
|
+
|
|
249
|
+
If the task isn't closed yet, the same inconsistency is only a warning — the Agent may
|
|
250
|
+
still be working, or some phases may be marked as skipped later.
|