@lannguyensi/harness 0.6.0 → 0.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (139)
  1. package/CHANGELOG.md +255 -0
  2. package/README.md +189 -148
  3. package/dist/cli/apply/apply.d.ts +13 -0
  4. package/dist/cli/apply/apply.js +59 -3
  5. package/dist/cli/apply/apply.js.map +1 -1
  6. package/dist/cli/apply/generate-codex-config.d.ts +6 -0
  7. package/dist/cli/apply/generate-codex-config.js +149 -0
  8. package/dist/cli/apply/generate-codex-config.js.map +1 -0
  9. package/dist/cli/apply/generate-settings.d.ts +15 -1
  10. package/dist/cli/apply/generate-settings.js +16 -1
  11. package/dist/cli/apply/generate-settings.js.map +1 -1
  12. package/dist/cli/apply/index.d.ts +2 -1
  13. package/dist/cli/apply/index.js +2 -1
  14. package/dist/cli/apply/index.js.map +1 -1
  15. package/dist/cli/approve/understanding.d.ts +39 -0
  16. package/dist/cli/approve/understanding.js +122 -0
  17. package/dist/cli/approve/understanding.js.map +1 -0
  18. package/dist/cli/describe.d.ts +1 -1
  19. package/dist/cli/describe.js +2 -0
  20. package/dist/cli/describe.js.map +1 -1
  21. package/dist/cli/doctor/codex.d.ts +34 -0
  22. package/dist/cli/doctor/codex.js +331 -0
  23. package/dist/cli/doctor/codex.js.map +1 -0
  24. package/dist/cli/doctor/format.js +29 -1
  25. package/dist/cli/doctor/format.js.map +1 -1
  26. package/dist/cli/doctor/index.d.ts +13 -1
  27. package/dist/cli/doctor/index.js +49 -1
  28. package/dist/cli/doctor/index.js.map +1 -1
  29. package/dist/cli/doctor/types.d.ts +35 -1
  30. package/dist/cli/doctor/types.js +12 -1
  31. package/dist/cli/doctor/types.js.map +1 -1
  32. package/dist/cli/explain.d.ts +10 -1
  33. package/dist/cli/explain.js +44 -18
  34. package/dist/cli/explain.js.map +1 -1
  35. package/dist/cli/index.js +315 -8
  36. package/dist/cli/index.js.map +1 -1
  37. package/dist/cli/list.d.ts +1 -1
  38. package/dist/cli/list.js +17 -0
  39. package/dist/cli/list.js.map +1 -1
  40. package/dist/cli/pack/add.d.ts +13 -0
  41. package/dist/cli/pack/add.js +71 -0
  42. package/dist/cli/pack/add.js.map +1 -0
  43. package/dist/cli/pack/hook-codex-pre-tool-use.d.ts +30 -0
  44. package/dist/cli/pack/hook-codex-pre-tool-use.js +149 -0
  45. package/dist/cli/pack/hook-codex-pre-tool-use.js.map +1 -0
  46. package/dist/cli/pack/hook-codex-stop.d.ts +31 -0
  47. package/dist/cli/pack/hook-codex-stop.js +332 -0
  48. package/dist/cli/pack/hook-codex-stop.js.map +1 -0
  49. package/dist/cli/pack/hook-codex-user-prompt-submit.d.ts +18 -0
  50. package/dist/cli/pack/hook-codex-user-prompt-submit.js +92 -0
  51. package/dist/cli/pack/hook-codex-user-prompt-submit.js.map +1 -0
  52. package/dist/cli/pack/hook-pre-tool-use.d.ts +32 -0
  53. package/dist/cli/pack/hook-pre-tool-use.js +181 -0
  54. package/dist/cli/pack/hook-pre-tool-use.js.map +1 -0
  55. package/dist/cli/pack/index.d.ts +4 -0
  56. package/dist/cli/pack/index.js +5 -0
  57. package/dist/cli/pack/index.js.map +1 -0
  58. package/dist/cli/pack/list.d.ts +10 -0
  59. package/dist/cli/pack/list.js +43 -0
  60. package/dist/cli/pack/list.js.map +1 -0
  61. package/dist/cli/pack/mutate.d.ts +14 -0
  62. package/dist/cli/pack/mutate.js +76 -0
  63. package/dist/cli/pack/mutate.js.map +1 -0
  64. package/dist/cli/pack/remove.d.ts +15 -0
  65. package/dist/cli/pack/remove.js +153 -0
  66. package/dist/cli/pack/remove.js.map +1 -0
  67. package/dist/cli/session-export/index.d.ts +46 -0
  68. package/dist/cli/session-export/index.js +169 -0
  69. package/dist/cli/session-export/index.js.map +1 -0
  70. package/dist/cli/session-export/redact.d.ts +22 -0
  71. package/dist/cli/session-export/redact.js +47 -0
  72. package/dist/cli/session-export/redact.js.map +1 -0
  73. package/dist/cli/session-export/transcript.d.ts +24 -0
  74. package/dist/cli/session-export/transcript.js +162 -0
  75. package/dist/cli/session-export/transcript.js.map +1 -0
  76. package/dist/cli/validate/checks.js +32 -0
  77. package/dist/cli/validate/checks.js.map +1 -1
  78. package/dist/policies/ledger-client.js +2 -1
  79. package/dist/policies/ledger-client.js.map +1 -1
  80. package/dist/policy-packs/builtin/permission-profiles.d.ts +11 -0
  81. package/dist/policy-packs/builtin/permission-profiles.js +74 -0
  82. package/dist/policy-packs/builtin/permission-profiles.js.map +1 -0
  83. package/dist/policy-packs/builtin/understanding-before-execution-runtime.d.ts +56 -0
  84. package/dist/policy-packs/builtin/understanding-before-execution-runtime.js +186 -0
  85. package/dist/policy-packs/builtin/understanding-before-execution-runtime.js.map +1 -0
  86. package/dist/policy-packs/builtin/understanding-before-execution.d.ts +15 -0
  87. package/dist/policy-packs/builtin/understanding-before-execution.js +254 -0
  88. package/dist/policy-packs/builtin/understanding-before-execution.js.map +1 -0
  89. package/dist/policy-packs/expand.d.ts +4 -0
  90. package/dist/policy-packs/expand.js +90 -0
  91. package/dist/policy-packs/expand.js.map +1 -0
  92. package/dist/policy-packs/index.d.ts +5 -0
  93. package/dist/policy-packs/index.js +5 -0
  94. package/dist/policy-packs/index.js.map +1 -0
  95. package/dist/policy-packs/permission-translator.d.ts +9 -0
  96. package/dist/policy-packs/permission-translator.js +76 -0
  97. package/dist/policy-packs/permission-translator.js.map +1 -0
  98. package/dist/policy-packs/registry.d.ts +11 -0
  99. package/dist/policy-packs/registry.js +20 -0
  100. package/dist/policy-packs/registry.js.map +1 -0
  101. package/dist/policy-packs/runtime.d.ts +8 -0
  102. package/dist/policy-packs/runtime.js +30 -0
  103. package/dist/policy-packs/runtime.js.map +1 -0
  104. package/dist/policy-packs/source.d.ts +6 -0
  105. package/dist/policy-packs/source.js +10 -0
  106. package/dist/policy-packs/source.js.map +1 -0
  107. package/dist/policy-packs/types.d.ts +41 -0
  108. package/dist/policy-packs/types.js +11 -0
  109. package/dist/policy-packs/types.js.map +1 -0
  110. package/dist/probes/mcp.js +2 -1
  111. package/dist/probes/mcp.js.map +1 -1
  112. package/dist/runtime/index.d.ts +1 -0
  113. package/dist/runtime/index.js +1 -0
  114. package/dist/runtime/index.js.map +1 -1
  115. package/dist/runtime/ledger-add.d.ts +16 -0
  116. package/dist/runtime/ledger-add.js +139 -0
  117. package/dist/runtime/ledger-add.js.map +1 -0
  118. package/dist/runtime/ledger-record.js +2 -1
  119. package/dist/runtime/ledger-record.js.map +1 -1
  120. package/dist/schema/audit.d.ts +71 -0
  121. package/dist/schema/audit.js +32 -0
  122. package/dist/schema/audit.js.map +1 -0
  123. package/dist/schema/index.d.ts +1893 -10
  124. package/dist/schema/index.js +27 -0
  125. package/dist/schema/index.js.map +1 -1
  126. package/dist/schema/permission-profiles.d.ts +2161 -0
  127. package/dist/schema/permission-profiles.js +60 -0
  128. package/dist/schema/permission-profiles.js.map +1 -0
  129. package/dist/schema/policy-packs.d.ts +52 -0
  130. package/dist/schema/policy-packs.js +35 -0
  131. package/dist/schema/policy-packs.js.map +1 -0
  132. package/dist/schema/tools.d.ts +8 -8
  133. package/dist/schema/workflows.d.ts +519 -0
  134. package/dist/schema/workflows.js +81 -0
  135. package/dist/schema/workflows.js.map +1 -0
  136. package/dist/version.d.ts +1 -0
  137. package/dist/version.js +3 -0
  138. package/dist/version.js.map +1 -0
  139. package/package.json +1 -1
package/CHANGELOG.md CHANGED
@@ -7,6 +7,261 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ## [0.8.0] - 2026-05-10
11
+
12
+ **Headline: Understanding-Gate Policy Pack, end-to-end.** Phase 6 lands
13
+ the *Policy Pack* concept as a first-class harness unit: a reusable
14
+ bundle of instruction template, hooks, policies, and permission
15
+ profiles that ships under one name and is referenced from
16
+ `harness.yaml` with one key. The first showcase pack,
17
+ `understanding-before-execution`, forces an agent to expose its task
18
+ interpretation, an Understanding Report, before any write-capable
19
+ tool fires. The user confirms or corrects; only after explicit
20
+ approval is recorded as evidence may the agent edit, run shell,
21
+ commit, push, or open a PR. The pack ships across two runtimes
22
+ (Claude Code and Codex), three permission profiles
23
+ (`safe-start` / `implementation-after-approval` / `high-risk-grill-me`),
24
+ a CLI surface (`harness pack add / remove / list`,
25
+ `harness apply --runtime <runtime>`, `harness approve understanding`,
26
+ `harness doctor --target codex`), and a synthetic-stdin dogfood smoke
27
+ under `dogfood/phase6-6/` that exercises block, allow, capture, and
28
+ approve round-trips without a real Codex binary.
29
+
30
+ Operator note: no schema bump (still `version: 1`). New manifest blocks
31
+ (`policy_packs:`, `permission_profiles:`) are additive and default to
32
+ empty, so `0.7.0` manifests parse byte-identically. Manifests with the
33
+ pack enabled need a one-time `harness apply` after upgrade so the new
34
+ `harness pack hook pre-tool-use` blocker replaces the npm package's
35
+ standalone bin in the rendered `settings.json`. Ensure `harness` is on
36
+ `$PATH` (`npm i -g @lannguyensi/harness@0.8.0`) before the next session
37
+ starts.
38
+
39
+ ### Added
40
+ - Phase 6 #6 follow-up: `harness pack hook codex-stop` captures the
41
+ agent's Understanding Report into
42
+ `.understanding-gate/reports/<iso>-codex-<sessionhash>.json` with
43
+ `approvalStatus: "pending"`. Wire format on stdin accepts either
44
+ `last_assistant_message` directly or a `messages[]` array (the last
45
+ assistant entry is used). The parser recognises markdown headings,
46
+ bold labels, and plain colon-prefixed labels for the six report
47
+ fields (interpretation, assumptions, openQuestions, outOfScope,
48
+ risks, verificationPlan), with synonym support (Questions,
49
+ Exclusions, Validation). Failure modes (malformed input, missing
50
+ session id, unwritable reports dir, no recognisable fields)
51
+ resolve to exit 0 + a stderr diagnostic; capture must never block
52
+ the agent's stop path. The Codex pack now contributes a Stop hook
53
+ alongside UserPromptSubmit and PreToolUse. Closes agent-tasks
54
+ `adf356a0`.
55
+ - Phase 6 #6 follow-up: `harness doctor --target codex` evaluates the
56
+ harness side of the Codex adapter (binary resolution, harness-managed
57
+ `harness.generated/codex/config.toml` presence + banner, contributed
58
+ `[[hooks.*]]` command resolution, and persisted-report directory
59
+ writability). Codex error/warning counts roll into the top-level
60
+ totals; `--json` adds a structured `codexTarget` block to the
61
+ `DoctorReport`. The default `harness doctor` invocation is
62
+ unchanged. Closes agent-tasks `125fd02b`.
63
+ - Phase 6 #6: Codex adapter for the `understanding-before-execution`
64
+ policy pack. New CLI flag `harness apply --runtime codex` emits
65
+ `harness.generated/codex/config.toml` (TOML hook stanzas) instead of
66
+ `settings.json`; operators copy or include the generated TOML into
67
+ their own `~/.codex/config.toml`. Two new pack hook subcommands ship:
68
+ `harness pack hook codex-pre-tool-use` (PreToolUse blocker on
69
+ `apply_patch|Bash|shell`: exit 2 + reason on stderr when no source
70
+ has approved, exit 0 otherwise) and `harness pack hook
71
+ codex-user-prompt-submit` (instruction-template injector that emits
72
+ the Understanding-Gate prompt on stdout for Codex to prepend to
73
+ `additional_instructions`). The Codex blocker shares the
74
+ approval-check pipeline with the Claude Code blocker (ledger source
75
+ via grounding-mcp + persisted report under
76
+ `.understanding-gate/reports/`, either approves). Synthetic-stdin
77
+ smoke under `dogfood/phase6-6/` exercises block + allow paths
78
+ end-to-end without a Codex binary. `--target` is rejected with
79
+ `--runtime codex` (`--target` wires Claude Code's `settings.json`, which is not
80
+ produced under the Codex runtime). Phase 6 #6 follow-ups filed as separate
81
+ agent-tasks entries: `harness doctor --target codex` adapter-health
82
+ check; Codex Stop-equivalent for transcript capture; permission
83
+ profile translator into Codex's sandbox shape.
84
+ - Phase 6 anchor: additive `policy_packs:` manifest block (schema-only;
85
+ no runtime behaviour yet). Each entry has `name`, `source`
86
+ (default `builtin`), `enabled`, optional `description`, and an
87
+ opaque `config:` record validated by the pack itself at resolve
88
+ time. Duplicate names rejected at parse time; `.strict()` rejects
89
+ unknown keys per entry. The block defaults to `[]` so manifests
90
+ written for `0.7.0` parse byte-identically.
91
+ - `docs/policy-packs/understanding-before-execution.md`: canonical
92
+ documentation for the first Policy Pack, covering target
93
+ architecture, manifest reference, mode semantics, permission-profile
94
+ sketches, adapter notes for Claude Code / OpenCode / Codex, and the
95
+ two-source approval-state model (evidence-ledger tag for harnessed
96
+ sessions; persisted JSON report for solo `@lannguyensi/understanding-gate`
97
+ users). Phase 6 #2 through #6 will wire the surfaces this doc
98
+ describes; see `docs/ROADMAP.md` for the sub-task decomposition.
99
+ - `docs/examples/full-manifest.yaml` carries the canonical
100
+ `understanding-before-execution` pack as a worked example; the
101
+ byte-for-byte `harness describe` golden test covers the resulting
102
+ output.
103
+ - Phase 6 #2: `harness apply` now expands enabled `policy_packs[]`
104
+ entries into hook contributions and an operator audit copy. For the
105
+ builtin `understanding-before-execution` pack this writes three
106
+ namespaced hooks into the generated `settings.json`
107
+ (`UserPromptSubmit` injector, `Stop` capture, `PreToolUse` blocker
108
+ matching `Edit|Write|Bash`, all pointing at the
109
+ `@lannguyensi/understanding-gate` bins) and an audit copy at
110
+ `harness.generated/policy-packs/<name>/instructions.md`. Pack files
111
+ flow through the existing three-state-compare + lock pipeline, so
112
+ drift on the audit copy is caught by `harness apply` and surfaced in
113
+ `harness diff --since-apply`. `enabled: false` skips the pack
114
+ entirely. `harness validate` rejects an enabled pack with an
115
+ unrecognised source (only `builtin` resolves in v1) or an unknown
116
+ builtin name. Phase 6 #4 will add the harness-side ledger-aware
117
+ PreToolUse blocker; the standalone blocker shipped in
118
+ `@lannguyensi/understanding-gate@>=0.2.0` is already wired today.
119
+ - Phase 6 #3: new `harness pack` CLI subtree for managing `policy_packs[]`
120
+ declaratively. `harness pack add <name>` performs a schema-validated
121
+ insert (rejects unknown source/name pre-flight, then the schema
122
+ superRefine catches duplicates). `harness pack remove <name>` is
123
+ reference-checked against `.last-apply`: it refuses without `--force`
124
+ when applied state is present, and `--force` removes the manifest
125
+ entry, deletes the on-disk pack files under
126
+ `harness.generated/policy-packs/<name>/`, and prunes the
127
+ corresponding `.last-apply` entries so a follow-up `harness apply`
128
+ reconverges in one step. `harness pack list [--enabled-only] [--json]`
129
+ prints a flat table or pipeable JSON.
130
+ - Phase 6 #4: harness-side PreToolUse blocker + approve flow. The
131
+ `understanding-before-execution` pack now ships its `PreToolUse` hook
132
+ pointing at the new `harness pack hook pre-tool-use` runtime verb
133
+ (was: the npm package's standalone bin). The harness blocker is the
134
+ superset: it consults BOTH the evidence-ledger tag
135
+ `understanding-approved:${SESSION_ID}` (via grounding-mcp's
136
+ `ledger_summary`, canonical for harnessed sessions) AND the
137
+ persisted JSON report under `.understanding-gate/reports/` (fallback
138
+ for sessions without grounding-mcp wired). Either source approves;
139
+ if neither does, the tool call is blocked with a Claude-Code-shaped deny JSON
140
+ containing the actionable next step (`run \`harness approve
141
+ understanding\``). Failure modes (manifest unreadable, pack disabled,
142
+ no session id) resolve to allow with a stderr diagnostic, so the
143
+ Understanding Gate never bricks a session. Ledger matching filters
144
+ out `policy_decision` audit rows (typed and legacy-prefix backstop)
145
+ so a policy decision whose serialised reason field happens to
146
+ contain the approval substring cannot falsely approve.
147
+
148
+ **Breaking change for users with `understanding-before-execution`
149
+ enabled**: the regenerated `settings.json` calls `harness pack hook
150
+ pre-tool-use` instead of the npm bin. Run `harness apply` after
151
+ upgrading, and ensure `harness` is on `$PATH` (e.g.
152
+ `npm i -g @lannguyensi/harness`) before the next session starts.
153
+ - New `harness approve understanding [--session <id>] [--reports-dir
154
+ <path>] [--approved-by <actor>]` CLI verb that round-trips both
155
+ approval sources: writes the `understanding-approved:${SESSION_ID}`
156
+ ledger tag via `grounding-mcp`'s `ledger_add` AND flips
157
+ `approvalStatus: "approved"` on the latest matching persisted JSON
158
+ report (atomic rewrite). A degraded ledger surfaces as a one-line
159
+ warning, not a hard failure, so a solo
160
+ `@lannguyensi/understanding-gate` user without `grounding-mcp` wired
161
+ still benefits from the persisted-report path.
162
+ - New generic `runtime/ledger-add.ts` writer mirroring the structural
163
+ shape of `recordPolicyDecision` but exposed for non-policy-decision
164
+ fact rows. Used by `harness approve understanding`; available to any
165
+ future pack that wants to emit a session-tagged ledger entry without
166
+ encoding a policy-decision payload.
167
+ - Phase 6 #5: permission profiles. New top-level `permission_profiles:`
168
+ manifest block (additive, defaults to `{}`), with three v1 builtins
169
+ bundled with the `understanding-before-execution` pack: `safe-start`
170
+ (pre-approval default), `implementation-after-approval` (post-
171
+ approval working profile), and `high-risk-grill-me` (high-friction
172
+ for security / infra surfaces). Selection via the pack's
173
+ `config.permission_profile`. Profile actions (`read` / `edit` /
174
+ `bash` / `commit` / `push` / `pr` / `deploy`) translate to Claude
175
+ Code's `permissions: { allow, ask, deny }` block at apply time;
176
+ the new translator emits canonical tool patterns
177
+ (`Edit`/`Write`/`MultiEdit` for `edit`, `Bash(git commit*)` for
178
+ `commit`, etc.). `limited` and `ask_or_deny` collapse onto `ask`
179
+ for v1 (Claude Code does not natively distinguish them); finer-
180
+ grained shaping is a Phase 6 #5 follow-up. When multiple packs
181
+ contribute permissions, the merge follows
182
+ deny-wins-over-ask-wins-over-allow precedence: a stricter intent
183
+ from any pack is not silently relaxed by a more permissive
184
+ sibling. Profiles compose with the Phase 6 #4 PreToolUse blocker:
185
+ the static permissions block is the always-applies floor, the
186
+ blocker handles the conditional approval gate on top.
187
+
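The merge rule in the permission-profiles entry above can be pictured with a small sketch. It is illustrative only: `Decision`, `Contribution`, and `mergePermissions` are hypothetical names, and the real translator's types and API are not shown in this diff. Only the precedence (deny over ask over allow) is taken from the changelog.

```typescript
// Hypothetical sketch of deny > ask > allow merging; not the package's real API.
type Decision = "allow" | "ask" | "deny";

// One pack's contribution: tool pattern -> decision (assumed shape).
type Contribution = Record<string, Decision>;

// Higher rank is stricter, so the strictest intent wins.
const RANK: Record<Decision, number> = { allow: 0, ask: 1, deny: 2 };

function mergePermissions(contributions: Contribution[]): Contribution {
  const merged: Contribution = {};
  for (const contribution of contributions) {
    for (const [pattern, decision] of Object.entries(contribution)) {
      const current = merged[pattern];
      // Keep the stricter decision; a permissive sibling never relaxes it.
      if (current === undefined || RANK[decision] > RANK[current]) {
        merged[pattern] = decision;
      }
    }
  }
  return merged;
}

const merged = mergePermissions([
  { Edit: "allow", "Bash(git commit*)": "ask" },
  { Edit: "ask", "Bash(git commit*)": "deny" },
]);
console.log(merged); // Edit ends up "ask", the commit pattern ends up "deny"
```

Because the comparison is by rank rather than by arrival order, the result does not depend on which pack is listed first.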
188
+ ## [0.7.0] - 2026-05-06
189
+
190
+ **Headline: workflows-as-data and full-session audit forensics.** The
191
+ `workflows:` block (PR #66) lets adopters declare branch policy,
192
+ review-subagent gating, and merge method as schema-validated data
193
+ instead of prose in memory files. `harness session-export <sessionId>`
194
+ (PR #67) joins the on-disk Claude Code transcript JSONL with the
195
+ evidence ledger for the same session and emits a single chronologically
196
+ ordered audit artifact, with default-on regex redaction extended by a
197
+ new optional `audit.redact[]` manifest block. The README is split into
198
+ audience-specific guides (`docs/for-humans.md`, `docs/for-agents.md`)
199
+ and gains a control-loop flowchart that both audiences read
200
+ identically. `harness explain --last` closes the "what just denied me?"
201
+ loop without needing the policy name. No runtime enforcement of
202
+ `workflows:` yet; that ships as a follow-up.
203
+
204
+ Operator note: no schema bump (still `version: 1`). All new manifest
205
+ fields are optional and additive; manifests written for `0.6.0` parse
206
+ under `0.7.0` byte-identically. The new `audit.redact[]` defaults to a
207
+ denylist that catches the four obvious key/secret patterns even when
208
+ the operator declares no `audit:` block, so existing operators get
209
+ redaction-on-by-default for `session-export` for free.
210
+
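A rough sketch of how `audit.redact[]` rules might be applied at export time. The two entry shapes (`{ regex, replacement? }` and `{ env_var, replacement? }`) come from the changelog entry for this release. The `redact` function, the `[REDACTED]` default placeholder, and passing the environment in as a plain object are assumptions made for illustration.

```typescript
// Hypothetical sketch; the package's actual redaction code is not shown in this diff.
type RedactRule =
  | { regex: string; replacement?: string }
  | { env_var: string; replacement?: string };

const DEFAULT_REPLACEMENT = "[REDACTED]"; // assumed placeholder text

function redact(
  text: string,
  rules: RedactRule[],
  env: Record<string, string | undefined>,
): string {
  let out = text;
  for (const rule of rules) {
    const replacement = rule.replacement ?? DEFAULT_REPLACEMENT;
    if ("regex" in rule) {
      // Function replacer avoids "$" in the replacement being treated specially.
      out = out.replace(new RegExp(rule.regex, "g"), () => replacement);
    } else {
      const value = env[rule.env_var];
      // An unset or empty variable has nothing to redact.
      if (value) out = out.split(value).join(replacement);
    }
  }
  return out;
}

const sample = redact(
  "token=abc123 key=s3cret",
  [{ regex: "token=\\w+", replacement: "token=***" }, { env_var: "API_KEY" }],
  { API_KEY: "s3cret" },
);
console.log(sample); // "token=*** key=[REDACTED]"
```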
211
+ ### Changed
212
+ - `docs/for-agents.md` workflow lifecycle stateDiagram is now anchored
213
+ on the four step kinds the `workflows:` schema actually defines
214
+ (`branch`, `review_subagent`, `ci_gate`, `merge`) instead of
215
+ agent-tasks-MCP-specific verbs (`task_start`, `open` / `in_progress` /
216
+ `done`). A new "If you use agent-tasks MCP" footnote below the
217
+ diagram maps the lifecycle markers to the concrete MCP verbs as one
218
+ example integration; other task systems fit the same lifecycle.
219
+ Spotted right after the audience split landed (PR #69).
220
+ - Root `README.md` gains a control-loop flowchart ("What harness does":
221
+ declare, apply, enforce, record, observe, refine) that both
222
+ audiences read identically. No audience-specific verbs (PR #69).
223
+ - Docs split into two audience-specific surfaces:
224
+ `docs/for-humans.md` (operator guide: install, mental model, first
225
+ hour, diagnostics cheat sheet) and `docs/for-agents.md` (workflow
226
+ lifecycle, policy/ledger sequence, CLI cheat sheet by side-effect
227
+ class, audit triumvirate). README shrunk to a landing page that
228
+ picks audience, with the `Try it in 60 seconds` block, status
229
+ checklist, and `Why this exists` preserved. Three mermaid diagrams
230
+ added: a system flowchart in `for-humans.md`, a workflow
231
+ stateDiagram and a policy/ledger sequenceDiagram in
232
+ `for-agents.md`. Docs-only, no source changes (PR #68).
233
+
234
+ ### Added
235
+ - `harness explain --last` traces the most recent policy decision in the
236
+ evidence ledger without needing the policy name, closing the common
237
+ "I just got a deny, what fired?" loop in one command instead of three.
238
+ Pair with `--decision allow|deny|warn-degraded` to skip past intervening
239
+ outcomes. `<policy>` and `--last` are mutually exclusive (PR #65).
240
+ - `harness session-export <sessionId>` joins the on-disk Claude Code
241
+ transcript JSONL (`~/.claude/projects/<projectDir>/<sessionId>.jsonl`)
242
+ with evidence-ledger rows for the same session and emits a single
243
+ chronologically-ordered audit artifact. `--format json` (default) and
244
+ `--format jsonl` ship in v1; `-o <file>` writes to disk. Each event
245
+ carries an explicit `source: "transcript" | "ledger"` marker so the
246
+ export is traceable back to its inputs (PR #67).
247
+ - New optional `audit.redact[]` block in the manifest. Each entry is
248
+ either `{ regex, replacement? }` or `{ env_var, replacement? }`;
249
+ `env_var:` resolves to the actual value at export time and
250
+ string-replaces it. A default denylist (token / secret / password /
251
+ api_key) ships even when the manifest declares no `audit:` block, so
252
+ redaction is on by default. Manifests without `audit:` parse
253
+ unchanged (PR #67).
254
+ - Additive `workflows:` and `review_templates:` top-level blocks in the
255
+ manifest (still `version: 1`). Lets adopters declare review-subagent
256
+ gating, branch policy, CI gate, and merge method as data instead of
257
+ prose in memory files. The schema rejects duplicate workflow names,
258
+ unknown step `kind` values, `spawn: required` without a `template`,
259
+ and `template:` references not defined in `review_templates`. Surfaces
260
+ via `harness describe --pillar workflows`, `harness list workflows`,
261
+ and a new `Workflows` section in `harness doctor`. No runtime
262
+ enforcement yet; that ships as a follow-up. Manifests without
263
+ `workflows:` parse identically to before (PR #66).
264
+
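The `session-export` entry above describes joining two event streams into one chronologically ordered artifact with a `source` marker on each event. A minimal sketch of that join follows. The `source` values are from the changelog, while `RawEvent`, `ExportEvent`, `mergeEvents`, and the `timestamp` / `payload` field names are hypothetical.

```typescript
// Hypothetical sketch; field names other than `source` are assumptions.
type RawEvent = { timestamp: string; payload: unknown }; // ISO 8601 timestamp assumed
type ExportEvent = RawEvent & { source: "transcript" | "ledger" };

function mergeEvents(transcript: RawEvent[], ledger: RawEvent[]): ExportEvent[] {
  // Tag every event with its origin so the export stays traceable to its inputs.
  const tagged: ExportEvent[] = [
    ...transcript.map((e) => ({ ...e, source: "transcript" as const })),
    ...ledger.map((e) => ({ ...e, source: "ledger" as const })),
  ];
  // Array.prototype.sort is stable, so events with equal timestamps keep input order.
  return tagged.sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
}

const events = mergeEvents(
  [
    { timestamp: "2026-05-06T10:00:00Z", payload: "user prompt" },
    { timestamp: "2026-05-06T10:00:02Z", payload: "tool call" },
  ],
  [{ timestamp: "2026-05-06T10:00:01Z", payload: "policy_decision" }],
);
console.log(events.map((e) => e.source)); // transcript, ledger, transcript
```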
10
265
  ## [0.6.0] - 2026-05-03
11
266
 
12
267
  **Headline: the Phase-5 adoption-blocker cycle closes end-to-end.**
package/README.md CHANGED
@@ -2,11 +2,62 @@
2
2
 
3
3
  **Declarative control plane for agent harnesses.**
4
4
 
5
- One zod-validated YAML manifest for grounding, tools, memory, hooks, and policies — plus a CLI that describes, validates, diffs, applies, audits, and *enforces*.
6
-
7
- > Most config tools tell you what an agent is configured to use. `harness` tells you what an agent is *allowed to do*, under this exact context, and why.
5
+ One zod-validated YAML manifest for grounding, tools, memory, hooks,
6
+ policies, and workflows, plus a CLI that describes, validates, diffs,
7
+ applies, audits, and *enforces*.
8
+
9
+ > Most config tools tell you what an agent is configured to use.
10
+ > `harness` tells you what an agent is *allowed to do*, under this
11
+ > exact context, and why.
12
+
13
+ `harness` collapses the six-to-eight surfaces a working agent harness
14
+ leaks across (`settings.json`, `CLAUDE.md`, memory frontmatter, MCP
15
+ registrations, per-project overrides, hook scripts) into a single
16
+ source of truth. Today (`v0.8.0`) policies fire end-to-end and ship as
17
+ reusable *Policy Packs*: a
18
+ `mcp__agent-tasks__pull_requests_merge` call against a session
19
+ without a `review:${PR_NUMBER}` ledger entry refuses; an `Edit` /
20
+ `apply_patch` against a session without an approved Understanding
21
+ Report refuses; `harness explain --last --trace` shows exactly why.
22
+ The Understanding Gate ships across both Claude Code and Codex
23
+ runtimes via `harness apply --runtime <claude-code|codex>`.
24
+
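The Understanding Report refusal mentioned above rests on the two-source approval check described in the 0.8.0 changelog entry (Phase 6 #4). The sketch below shows the shape of that check under stated assumptions. The tag format and the `approvalStatus` values are from the changelog. `LedgerRow`, `PersistedReport`, `isApproved`, and their field names are hypothetical, and the real blocker reads the ledger through grounding-mcp rather than from an in-memory array.

```typescript
// Hypothetical sketch; row and report shapes are illustrative assumptions.
interface LedgerRow {
  type: string; // e.g. "fact" or "policy_decision" (assumed field name)
  content: string; // serialised row text (assumed field name)
}

interface PersistedReport {
  sessionId: string; // assumed field name
  approvalStatus: "pending" | "approved";
}

function isApproved(
  sessionId: string,
  ledger: LedgerRow[],
  reports: PersistedReport[],
): boolean {
  const tag = `understanding-approved:${sessionId}`;
  // Skip policy_decision audit rows: a decision whose reason text happens to
  // contain the tag must not count as an approval.
  const ledgerApproves = ledger.some(
    (row) => row.type !== "policy_decision" && row.content.includes(tag),
  );
  const reportApproves = reports.some(
    (r) => r.sessionId === sessionId && r.approvalStatus === "approved",
  );
  // Either source is sufficient.
  return ledgerApproves || reportApproves;
}

console.log(isApproved("s1", [], [{ sessionId: "s1", approvalStatus: "pending" }])); // false
```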
25
+ ## What harness does
26
+
27
+ ```mermaid
28
+ flowchart LR
29
+ declare["1. Declare<br/><code>harness.yaml</code>"]
30
+ apply["2. Apply<br/><code>harness apply</code>"]
31
+ enforce["3. Enforce<br/>hooks + policies<br/>at runtime"]
32
+ record[("4. Record<br/>evidence ledger")]
33
+ observe["5. Observe<br/><code>audit</code> / <code>explain</code> /<br/><code>session-export</code>"]
34
+
35
+ declare --> apply
36
+ apply --> enforce
37
+ enforce --> record
38
+ record --> observe
39
+ observe -. refine .-> declare
40
+ ```
8
41
 
9
- `harness` collapses the six-to-eight surfaces a working agent harness leaks across (`settings.json`, `CLAUDE.md`, memory frontmatter, MCP registrations, per-project overrides, hook scripts) into a single source of truth. Today (`v0.5.0`) policies fire end-to-end: a `mcp__agent-tasks__pull_requests_merge` call against a session without a `review:${PR_NUMBER}` ledger entry refuses; `harness explain review-before-merge --trace` shows exactly why. Phase 6 adds an *Understanding Gate* (agents confirm task interpretation before editing); Phase 7 adds a *Risk Gate* that blocks `DROP TABLE` against a prod target, even when the model would happily run it.
42
+ One manifest declares grounding, tools, memory, hooks, policies, and
43
+ workflows. `apply` materialises that into the files Claude Code
44
+ actually reads. At runtime, hooks and policies enforce the contract
45
+ and write decision rows to the evidence ledger. The read-side
46
+ surfaces (`audit`, `explain --trace`, `session-export`) replay those
47
+ rows so you can see what fired, why, and across which session.
48
+ Whatever you learn from observing flows back into the manifest. That
49
+ loop is the whole product.
50
+
51
+ ## Pick your audience
52
+
53
+ - **Operator?** Read [`docs/for-humans.md`](docs/for-humans.md). It
54
+ walks from `npm i -g @lannguyensi/harness` through your first
55
+ `apply`, your first real policy, and the diagnostics cheat sheet.
56
+ - **Agent (or onboarding one)?** Read
57
+ [`docs/for-agents.md`](docs/for-agents.md). It defines the
58
+ workflow lifecycle, the policy / ledger sequence, the CLI cheat
59
+ sheet split by side-effect class, and the audit triumvirate
60
+ (`audit` vs `explain --trace` vs `session-export`).
10
61
 
11
62
  ## Install
12
63
 
@@ -14,21 +65,10 @@ One zod-validated YAML manifest for grounding, tools, memory, hooks, and policie
14
65
  npm i -g @lannguyensi/harness
15
66
  ```
16
67
 
17
- The CLI binary is `harness`. Node 20 required.
68
+ The CLI binary is `harness`. Node 20 or newer required.
18
69
 
19
70
  ## Try it in 60 seconds
20
71
 
21
- ```bash
22
- # Statically predict which policies fire for a tool call (no ledger, no LLM).
23
- # Uses the bundled reference manifest from the npm package.
24
- harness dry-run "merge PR 42" \
25
- --tool mcp__agent-tasks__pull_requests_merge \
26
- --tool-args '{"prNumber":42}' \
27
- --config "$(npm root -g)/@lannguyensi/harness/dist/../docs/examples/full-manifest.yaml"
28
- ```
29
-
30
- Or from a checkout:
31
-
32
72
  ```bash
33
73
  git clone https://github.com/LanNguyenSi/harness && cd harness
34
74
  npm install && npm run build
@@ -38,157 +78,158 @@ node dist/cli/main.js dry-run "merge PR 42" \
38
78
  --config docs/examples/full-manifest.yaml
39
79
  ```
40
80
 
41
- `dry-run` reads the reference manifest, runs the trigger matcher, substitutes `${PR_NUMBER}=42` through the JSONPath-restricted extract DSL, and tells you exactly which hooks would fire and which policies would match — before any ledger I/O.
81
+ `dry-run` reads the reference manifest, runs the trigger matcher,
82
+ substitutes `${PR_NUMBER}=42` through the JSONPath-restricted extract
83
+ DSL, and tells you exactly which hooks would fire and which policies
84
+ would match, before any ledger I/O.
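To make the substitution step concrete, here is a minimal sketch of `${NAME}` placeholder replacement. It assumes the variables have already been extracted from the tool arguments. The extract DSL itself is not modelled, and `substitute` is a hypothetical helper rather than a function exported by the package.

```typescript
// Hypothetical sketch of placeholder substitution; the extract DSL is out of scope.
function substitute(
  template: string,
  vars: Record<string, string | number>,
): string {
  // Replace ${UPPER_SNAKE} placeholders; unknown names are left untouched.
  return template.replace(
    /\$\{([A-Z_][A-Z0-9_]*)\}/g,
    (match: string, name: string) => (name in vars ? String(vars[name]) : match),
  );
}

// Plain string on purpose: the placeholder must reach substitute() unexpanded.
const ledgerQuery = substitute("review:${PR_NUMBER}", { PR_NUMBER: 42 });
console.log(ledgerQuery); // "review:42"
```

Leaving unknown placeholders intact, rather than replacing them with an empty string, keeps a missing extraction visible in the resulting query.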
42
85
 
43
- ## What a run looks like
44
-
45
- ```yaml
46
- prompt: merge PR 42
47
- tool: mcp__agent-tasks__pull_requests_merge
48
- toolArgs:
49
- prNumber: 42
50
- Hooks that would fire:
51
- - event: SessionStart
52
- name: git-preflight
53
- - event: PreToolUse
54
- name: require-review-evidence
55
- - event: PreToolUse
56
- name: require-dogfood-evidence
57
- - event: PreToolUse
58
- name: require-preflight-evidence
59
- Policies that match:
60
- - name: review-before-merge
61
- ledgerQuery: review:42
62
- requires:
63
- ledger_tag: review:${PR_NUMBER}
64
- enforcement: block
65
- triggerEvent: PreToolUse
66
- - name: two-reviewers-required
67
- ledgerQuery: review:42
68
- requires:
69
- ledger_tag: review:${PR_NUMBER}
70
- count:
71
- min: 2
72
- enforcement: warn
73
- triggerEvent: PreToolUse
74
- Policies that COULD match (need --tool):
75
- - name: dogfood-before-release
76
- triggerEvent: PreToolUse
77
- reason: --tool "mcp__agent-tasks__pull_requests_merge" does not contain trigger.match "Bash"
78
- - name: preflight-before-investigation
79
- triggerEvent: PreToolUse
80
- reason: --tool "mcp__agent-tasks__pull_requests_merge" does not contain trigger.match "Bash"
81
- Memories that would route:
82
- - path: ~/.claude/projects/{project}/memory
83
- scope: project
84
- ```
86
+ ## Status
85
87
 
86
- When the matching policy actually fires (via `harness policy intercept`, wired by `harness apply` into `settings.json` as a `PreToolUse` hook), and the evidence ledger has no `review:42` entry, the runtime emits Claude Code's deny shape on stdout:
88
+ - [x] Phase 1, read-only inventory (`describe`, `validate`, `doctor`,
89
+ `list`, `explain`, `diff`), released as
90
+ [`v0.1.0`](CHANGELOG.md#010---2026-04-29).
91
+ - [x] Phase 2, managed edits (`init`, `add`, `remove`, `adopt`,
92
+ `export`), released as [`v0.2.0`](CHANGELOG.md#020---2026-04-29).
93
+ - [x] Phase 3, declarative truth (`apply`, `diff --since-apply`,
94
+ `harness.lock`), released as
95
+ [`v0.3.0`](CHANGELOG.md#030---2026-04-30).
96
+ - [x] Phase 4, policy layer (`policy intercept`, `explain --trace`,
97
+ `audit`, `dry-run`, requires-evaluator + extract DSL +
98
+ grounding-mcp adapter), released as
99
+ [`v0.4.0`](CHANGELOG.md#040---2026-04-30).
100
+ - [x] Phase 5, polish + dogfood lessons (`--verbose` policy
101
+ diagnostics, `$CLAUDE_SESSION_ID` env fallback, server-side
102
+ `audit` filter pushdown, `policy_decision` first-class entry
103
+ type, npm distribution as `@lannguyensi/harness`), released as
104
+ [`v0.5.0`](CHANGELOG.md#050---2026-05-01).
105
+ - [x] Apply-into-settings cycle, `harness adopt`, `apply --target /
106
+ --merge`, `harness.lock` target tracking, released as
107
+ [`v0.6.0`](CHANGELOG.md#060---2026-05-03).
108
+ - [x] Workflows-as-data + full-session audit forensics: additive
109
+ `workflows:` / `review_templates:` / `audit.redact[]` manifest
110
+ blocks, `harness session-export`, `explain --last`, audience-
111
+ specific docs surfaces, released as
112
+ [`v0.7.0`](CHANGELOG.md#070---2026-05-06).
113
+ - [x] Phase 6, Understanding Gate Policy Pack: `policy_packs:`
114
+ manifest block, the canonical `understanding-before-execution`
115
+ pack, `harness pack add / remove / list`,
116
+ `harness apply --runtime <claude-code|codex>` with TOML config
117
+ output for Codex, three permission profiles
118
+ (`safe-start` / `implementation-after-approval` /
119
+ `high-risk-grill-me`), a harness-side PreToolUse blocker that
120
+ consults both the evidence-ledger tag and the persisted JSON
121
+ report, `harness approve understanding`,
122
+ `harness doctor --target codex`, and a Codex Stop-equivalent
123
+ that captures Understanding Reports into
124
+ `.understanding-gate/reports/`. Released as
125
+ [`v0.8.0`](CHANGELOG.md#080---2026-05-10).
126
+ - [ ] Phase 7, Risk Gate: Action Envelope + Risk Classifier +
127
+ `allow / warn / require_approval / deny` for destructive-action
128
+ prevention.
129
+
130
+ ## Policy Packs (v0.8.0)
131
+
132
+ A *Policy Pack* is a reusable bundle of instruction template, hooks,
133
+ policies, and permission profiles that ships under one name and is
134
+ referenced from `harness.yaml` with a single key. The first pack,
135
+ `understanding-before-execution`, forces agents to expose and confirm
136
+ their task interpretation before any write-capable tool fires.
87
137
 
88
- ```json
89
- {"decision":"deny","reason":"review-before-merge: no matching ledger entry for tag `review:42`"}
138
+ ```yaml
139
+ policy_packs:
140
+   - name: understanding-before-execution
141
+     config:
142
+       mode: grill_me # fast_confirm | grill_me | strict
143
+       permission_profile: safe-start # safe-start | implementation-after-approval | high-risk-grill-me
90
144
  ```
91
145
 
92
- With `--verbose` (or `HARNESS_POLICY_VERBOSE=1`), stderr also carries a structured diagnostic block — policy name, ledger_tag, matched count, reason, sorted extract values — so the user sees *why* without a follow-up `explain --trace`.
93
-
94
- After the entry is recorded, the same call is silently allowed. Every fire writes a `policy_decision` row that `harness audit` and `harness explain --trace` replay:
95
-
96
- ```
97
- $ harness audit --since 1h --policy review-before-merge
146
+ Manage packs with `harness pack add / remove / list`. Apply against
147
+ either runtime:
98
148
 
99
- timestamp                policy              outcome reason
100
- ------------------------ ------------------- ------- ---------------------------------------------
101
- 2026-04-30T18:30:00.000Z review-before-merge deny    no matching ledger entry for tag `review:42`
102
- 2026-04-30T18:31:00.000Z review-before-merge allow   1 matching ledger entries for tag `review:42`
149
+ ```sh
150
+ harness apply --runtime claude-code # default; writes harness.generated/settings.json
151
+ harness apply --runtime codex # writes harness.generated/codex/config.toml
103
152
  ```
104
153
 
105
- Inside a Claude Code session, `--session` defaults to `$CLAUDE_SESSION_ID`, so the read path automatically lines up with what the runtime hook wrote.
106
-
107
- ## Wire into Claude Code
108
-
109
- By default, `harness apply` writes the rendered settings to `harness.generated/settings.json` next to your manifest. To make Claude Code actually use it, point `apply` at a settings discovery path with `--target`:
110
-
111
- ```bash
112
- # Project scope: write straight to .claude/settings.local.json (created if missing).
113
- harness apply --target .claude/settings.local.json
114
-
115
- # User scope: merge harness-owned keys into your existing ~/.claude/settings.json,
116
- # preserving env, permissions, enabledPlugins, and any other top-level keys.
117
- harness apply --target ~/.claude/settings.json --merge
118
- ```
119
-
120
- `--merge` does a 3-way merge: harness-owned top-level keys (today: `hooks`) get replaced wholesale; everything else in the existing target file is preserved verbatim. Re-applying is idempotent: running twice produces the same target, and the second run reports `no changes`.
121
-
122
- If the target exists and you pass neither `--merge` nor `--force`, apply refuses with a clear hint instead of clobbering. `--force` overwrites with the generated content as-is (no merge).
123
-
124
- `harness.lock` records the target path + a sha256 of the merged output, so `harness validate --check-lock` flags out-of-band edits.
125
-
126
- ## Next steps
127
-
128
- | If you want to... | Read |
129
- |------|------|
130
- | Understand the YAML shape, CLI surface, drift handling, `requires` schema | [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) |
131
- | See phase-by-phase scope, deliverables, acceptance criteria, exit gates | [`docs/ROADMAP.md`](docs/ROADMAP.md) |
132
- | Read the long-form positioning (three pillars, ecosystem map, gaps) | [`docs/VISION.md`](docs/VISION.md) |
133
- | Browse a manifest covering every field | [`docs/examples/full-manifest.yaml`](docs/examples/full-manifest.yaml) |
134
- | Track what's shipping and what's deferred | [`CHANGELOG.md`](CHANGELOG.md) |
135
-
136
- ## Common commands
137
-
138
- ```bash
139
- harness init --template full --config /tmp/harness-demo/harness.yaml
140
- harness describe --config /tmp/harness-demo/harness.yaml --pillar tools
141
- harness doctor --config /tmp/harness-demo/harness.yaml --shallow
142
- harness validate --config /tmp/harness-demo/harness.yaml
143
- harness apply --config /tmp/harness-demo/harness.yaml # regenerate settings.json + MEMORY.md, write harness.lock
144
- harness diff --since-apply --config /tmp/harness-demo/harness.yaml
145
- harness explain review-before-merge --trace
146
- harness audit --since 24h
147
- ```
154
+ Approve a session's Understanding Report via
155
+ `harness approve understanding --session <id>` (round-trips both the
156
+ evidence-ledger tag and the persisted JSON report). Verify the
157
+ adapter wiring with `harness doctor --target codex` (`--json` for
158
+ machine-readable). The full reference lives in
159
+ [`docs/policy-packs/understanding-before-execution.md`](docs/policy-packs/understanding-before-execution.md);
160
+ synthetic-stdin dogfood under
161
+ [`dogfood/phase6-6/`](dogfood/phase6-6/run-smoke.sh) exercises the
162
+ block / allow / capture / approve round-trip without a real Codex
163
+ binary.
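
The added README text describes a PreToolUse blocker that consults both the evidence-ledger tag and the persisted JSON report. The following is a minimal sketch of that double check, not the package's actual implementation. Every name here (`GateEvidence`, `evaluateUnderstandingGate`, `WRITE_CAPABLE_TOOLS`, the `reportStatus` values) is hypothetical, and the sketch assumes that both sources must agree before a write-capable tool is allowed; the diff does not state how the real blocker combines them.

```typescript
// Hypothetical types and names; nothing here is taken from the harness source.
type GateDecision =
  | { decision: "allow" }
  | { decision: "deny"; reason: string };

interface GateEvidence {
  /** True when the evidence ledger holds an approval tag for the session. */
  ledgerTagPresent: boolean;
  /** Status from the persisted JSON report, or null when no report exists. */
  reportStatus: "pending" | "approved" | null;
}

// Assumed set of write-capable tool names; the real pack may use a different list.
const WRITE_CAPABLE_TOOLS = new Set(["Bash", "Edit", "Write"]);

function evaluateUnderstandingGate(
  tool: string,
  evidence: GateEvidence,
): GateDecision {
  // Read-only tools are never gated in this sketch.
  if (!WRITE_CAPABLE_TOOLS.has(tool)) return { decision: "allow" };

  if (evidence.reportStatus === null) {
    return { decision: "deny", reason: "no Understanding Report captured for this session" };
  }
  if (evidence.reportStatus !== "approved") {
    return { decision: "deny", reason: "Understanding Report is not approved yet" };
  }
  // Assumption: an approved report alone is not enough; the ledger tag must exist too.
  if (!evidence.ledgerTagPresent) {
    return { decision: "deny", reason: "no approval tag in the evidence ledger" };
  }
  return { decision: "allow" };
}
```

Requiring both signals means a report edited by hand on disk, or a ledger tag left over without a matching report, would still be denied under this sketch. Whether the real blocker is this strict is not confirmed by the diff.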
148
164
 
149
165
  ## What's next
150
166
 
151
- Two structurally larger themes are queued after Phase 5's polish:
152
-
153
- **Phase 6 Understanding Gate.** Before an agent edits files, runs shell, commits, or opens a PR, it must produce an *Understanding Report* (its interpretation of the task: derived todos, acceptance criteria, assumptions, out-of-scope, risks). The user confirms, corrects, or "grills me until precise enough". Only after explicit approval is recorded in the evidence ledger may write-capable tools fire. Ships as the first `harness` *Policy Pack* — a reusable bundle of instruction template + hooks + policies + permission profiles. Long-form design lives in the internal `lava-ice-logs` logbook (2026-04-30).
154
-
155
- **Phase 7 — Risk Gate.** Today's policy model evaluates a rule per matching trigger and returns a binary block/allow. Phase 7 makes harness reason about *the action itself*: an Action Envelope (tool + raw input + session + runtime context) is enriched by a Context Resolver (production / staging / dev / unknown), classified by a Risk Classifier (severity + categories + reversibility), then matched against policies whose `when:` clauses can reference `risk.severity_at_least`, `environment.name`, and similar. The decision space extends to `allow / warn / require_approval / deny`. Motivating use case: prevent `DROP TABLE users`, `kubectl delete namespace prod`, `terraform destroy` against an unverified production target before they reach the runtime — even if the model would have happily run them. Long-form design lives in the internal `lava-ice-logs` logbook (2026-04-30).
156
-
157
- Both build on Phase 4's `policy intercept` runtime backbone; neither replaces it.
167
+ **Phase 7, Risk Gate.** Today's policy model evaluates a rule per
168
+ matching trigger and returns a binary block/allow. Phase 7 makes
169
+ harness reason about *the action itself*: an Action Envelope (tool +
170
+ raw input + session + runtime context) is enriched by a Context
171
+ Resolver (production / staging / dev / unknown), classified by a Risk
172
+ Classifier (severity + categories + reversibility), then matched
173
+ against policies whose `when:` clauses can reference
174
+ `risk.severity_at_least`, `environment.name`, and similar. The
175
+ decision space extends to `allow / warn / require_approval / deny`.
176
+ Motivating use case: prevent `DROP TABLE users`, `kubectl delete
177
+ namespace prod`, `terraform destroy` against an unverified production
178
+ target, even if the model would have happily run them.
179
+
180
+ Phase 7 builds on Phase 4's `policy intercept` runtime backbone and
181
+ Phase 6's Policy Pack distribution surface; neither is replaced.
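
Phase 7 is unreleased, so the following is only an illustrative sketch of how the four-way decision space described above might be evaluated. All types and functions (`classify`, `decide`, `RiskPolicy`, the `severityAtLeast` and `environment` fields) are hypothetical stand-ins for the `risk.severity_at_least` and `environment.name` clauses named in the README. The regex classifier is a toy, and the "most restrictive matching policy wins" rule is an assumption.

```typescript
// Hypothetical model of the Phase 7 pipeline; not the harness API.
type Severity = "low" | "medium" | "high" | "critical";
type Environment = "production" | "staging" | "dev" | "unknown";
type Decision = "allow" | "warn" | "require_approval" | "deny";

interface RiskAssessment {
  severity: Severity;
  reversible: boolean;
}

interface RiskPolicy {
  name: string;
  when: { severityAtLeast?: Severity; environment?: Environment };
  decision: Decision;
}

// Orderings used to compare severities and to pick the strictest decision.
const SEVERITY_ORDER: Severity[] = ["low", "medium", "high", "critical"];
const DECISION_ORDER: Decision[] = ["allow", "warn", "require_approval", "deny"];

// Toy classifier: flags the three destructive commands named in the README.
function classify(rawInput: string): RiskAssessment {
  const destructive =
    /\bDROP\s+TABLE\b|\bkubectl\s+delete\s+namespace\b|\bterraform\s+destroy\b/i;
  return destructive.test(rawInput)
    ? { severity: "critical", reversible: false }
    : { severity: "low", reversible: true };
}

// Assumption: among all matching policies, the most restrictive decision wins.
function decide(
  risk: RiskAssessment,
  env: Environment,
  policies: RiskPolicy[],
): Decision {
  let result: Decision = "allow";
  for (const policy of policies) {
    const { severityAtLeast, environment } = policy.when;
    const severityOk =
      severityAtLeast === undefined ||
      SEVERITY_ORDER.indexOf(risk.severity) >= SEVERITY_ORDER.indexOf(severityAtLeast);
    const environmentOk = environment === undefined || environment === env;
    if (
      severityOk &&
      environmentOk &&
      DECISION_ORDER.indexOf(policy.decision) > DECISION_ORDER.indexOf(result)
    ) {
      result = policy.decision;
    }
  }
  return result;
}
```

Under these assumptions, a policy such as `{ when: { severityAtLeast: "critical", environment: "unknown" }, decision: "deny" }` would stop `terraform destroy` against an unverified target, while the same command in a known `dev` environment could fall through to a weaker `require_approval` rule.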
158
182
 
159
183
  > Bring your favorite agent harness. Add governance.
160
184
 
161
- ## Status
162
-
163
- - [x] Repo bootstrap (LICENSE, .gitignore)
164
- - [x] README + VISION — repo legible
165
- - [x] ARCHITECTURE — YAML shape + CLI surface agreed
166
- - [x] ROADMAP — phases 1–7 with acceptance criteria
167
- - [x] Phase 1 — read-only inventory (`describe`, `validate`, `doctor`, `list`, `explain`, `diff`) — released as [`v0.1.0`](CHANGELOG.md#010---2026-04-29)
168
- - [x] Phase 2 — managed edits (`init`, `add`, `remove`, `adopt`, `export`) — released as [`v0.2.0`](CHANGELOG.md#020---2026-04-29)
169
- - [x] Phase 3 — declarative truth (`apply`, `diff --since-apply`, `harness.lock`) — released as [`v0.3.0`](CHANGELOG.md#030---2026-04-30)
170
- - [x] Phase 4 — policy layer (`policy intercept`, `explain --trace`, `audit`, `dry-run`, requires-evaluator + extract DSL + grounding-mcp adapter) — released as [`v0.4.0`](CHANGELOG.md#040---2026-04-30)
171
- - [x] Phase 5 — polish + dogfood lessons (`--verbose` policy diagnostics, `$CLAUDE_SESSION_ID` env fallback, server-side `audit` filter pushdown, `policy_decision` first-class entry type, audit `--since` UTC parse fix, `explain --trace` ms-precision sort, npm distribution as `@lannguyensi/harness`) — released as [`v0.5.0`](CHANGELOG.md#050---2026-05-01)
172
- - [ ] Phase 6 — Understanding Gate Policy Pack (agents must expose and confirm task understanding before write-capable tools fire)
173
- - [ ] Phase 7 — Risk Gate (Action Envelope + Risk Classifier + `allow / warn / require_approval / deny` for destructive-action prevention)
174
-
175
185
  ## Why this exists
176
186
 
177
- A working agent harness today has six to eight configuration surfaces, each with its own schema and lifecycle: `~/.claude/settings.json`, `CLAUDE.md` (per repo + root), `~/.claude/projects/*/memory/*.md` with frontmatter, `~/.claude/keybindings.json`, MCP server registrations in `~/.claude.json`, skill directories, per-project overrides, and external CLIs that behave differently per project.
178
-
179
- There is no single place that answers *"what can this agent do right now, and why is that configured that way?"*. Drift between sessions is invisible until it breaks something. Humans editing one surface don't know which other surfaces they need to touch. A fresh agent instance has no way to audit its own setup.
180
-
181
- Our entry point into this problem: on 2026-04-23, an `agent-grounding` checkout that was 16 commits behind origin led two tasks to be incorrectly called "stale". The check that would have caught it already exists — [`agent-preflight`](https://github.com/LanNguyenSi/agent-preflight) runs `git fetch` + `git status` (alongside lint, typecheck, test, audit) and emits a structured `ready` + confidence-score result. The missing piece wasn't the check itself, it was the deterministic *trigger*: a `SessionStart` hook that invokes `preflight run` and a policy that gates further work on the result. Building that wiring needs an agreed-upon place for harness config to live first. That conversation is the origin of this repo.
187
+ A working agent harness today has six to eight configuration
188
+ surfaces, each with its own schema and lifecycle: `~/.claude/settings.json`,
189
+ `CLAUDE.md` (per repo + root), `~/.claude/projects/*/memory/*.md`
190
+ with frontmatter, `~/.claude/keybindings.json`, MCP server
191
+ registrations in `~/.claude.json`, skill directories, per-project
192
+ overrides, and external CLIs that behave differently per project.
193
+
194
+ There is no single place that answers *"what can this agent do right
195
+ now, and why is that configured that way?"*. Drift between sessions
196
+ is invisible until it breaks something. Humans editing one surface
197
+ do not know which other surfaces they need to touch. A fresh agent
198
+ instance has no way to audit its own setup.
199
+
200
+ Our entry point into this problem: on 2026-04-23, an
201
+ `agent-grounding` checkout that was 16 commits behind origin led two
202
+ tasks to be incorrectly called "stale". The check that would have
203
+ caught it already exists,
204
+ [`agent-preflight`](https://github.com/LanNguyenSi/agent-preflight)
205
+ runs `git fetch` + `git status` (alongside lint, typecheck, test,
206
+ audit) and emits a structured `ready` + confidence-score result. The
207
+ missing piece was not the check itself, it was the deterministic
208
+ *trigger*: a `SessionStart` hook that invokes `preflight run` and a
209
+ policy that gates further work on the result. Building that wiring
210
+ needs an agreed-upon place for harness config to live first. That
211
+ conversation is the origin of this repo.
182
212
 
183
213
  ## Related
184
214
 
185
- - [`agent-grounding`](https://github.com/LanNguyenSi/agent-grounding) — grounding primitives (evidence-ledger, claim-gate, review-claim-gate); `grounding-mcp` is the canonical client surface harness queries through `queryLedgerByTag` (Phase 4 #3).
186
- - [`agent-memory`](https://github.com/LanNguyenSi/agent-memory) — memory surfaces the control plane inventories.
187
- - [`agent-tasks`](https://github.com/LanNguyenSi/agent-tasks) the MCP-registered task platform whose registration + health appear in `harness describe`.
188
- - [`agent-preflight`](https://github.com/LanNguyenSi/agent-preflight) — local preflight validator; the canonical implementation of preflight-hook content harness wires (see `docs/ARCHITECTURE.md` §5 for the canonical hook-script shape and §6 for the Phase 4 policy that gates further work on a `preflight:${REPO}` ledger entry).
189
- - [`codebase-oracle`](https://github.com/LanNguyenSi/codebase-oracle) — one of the MCP surfaces being registered.
190
- - [`agent-dx`](https://github.com/LanNguyenSi/agent-dx) ships `git-batch-cli` (under `packages/git-batch-cli`), a day-to-day tool whose inventory appears in `harness describe`.
215
+ - [`agent-grounding`](https://github.com/LanNguyenSi/agent-grounding):
216
+ grounding primitives (evidence-ledger, claim-gate,
217
+ review-claim-gate); `grounding-mcp` is the canonical client surface
218
+ harness queries through `queryLedgerByTag`.
219
+ - [`agent-memory`](https://github.com/LanNguyenSi/agent-memory):
220
+ memory surfaces the control plane inventories.
221
+ - [`agent-tasks`](https://github.com/LanNguyenSi/agent-tasks): the
222
+ MCP-registered task platform whose registration + health appear in
223
+ `harness describe`.
224
+ - [`agent-preflight`](https://github.com/LanNguyenSi/agent-preflight):
225
+ local preflight validator; the canonical implementation of
226
+ preflight-hook content harness wires.
227
+ - [`codebase-oracle`](https://github.com/LanNguyenSi/codebase-oracle):
228
+ one of the MCP surfaces being registered.
229
+ - [`agent-dx`](https://github.com/LanNguyenSi/agent-dx): ships
230
+ `git-batch-cli`, a day-to-day tool whose inventory appears in
231
+ `harness describe`.
191
232
 
192
233
  ## License
193
234
 
194
- MIT see [LICENSE](LICENSE).
235
+ MIT, see [LICENSE](LICENSE).