ultimate-pi 0.17.0 → 0.18.1
- package/.agents/skills/harness-context/SKILL.md +13 -6
- package/.agents/skills/harness-debate-plan/SKILL.md +37 -20
- package/.agents/skills/harness-decisions/SKILL.md +1 -1
- package/.agents/skills/harness-eval/SKILL.md +6 -21
- package/.agents/skills/harness-governor/SKILL.md +4 -3
- package/.agents/skills/harness-orchestration/SKILL.md +41 -53
- package/.agents/skills/harness-plan/SKILL.md +23 -12
- package/.agents/skills/harness-review/SKILL.md +52 -0
- package/.agents/skills/harness-sentrux-setup/SKILL.md +16 -3
- package/.agents/skills/harness-steer/SKILL.md +14 -0
- package/.agents/skills/sentrux/SKILL.md +9 -9
- package/.pi/agents/harness/planning/decompose.md +7 -4
- package/.pi/agents/harness/planning/hypothesis-validator.md +2 -0
- package/.pi/agents/harness/planning/hypothesis.md +3 -1
- package/.pi/agents/harness/planning/plan-adversary.md +2 -0
- package/.pi/agents/harness/planning/plan-evaluator.md +2 -0
- package/.pi/agents/harness/planning/plan-synthesizer.md +25 -0
- package/.pi/agents/harness/planning/planning-context.md +48 -0
- package/.pi/agents/harness/planning/review-integrator.md +2 -0
- package/.pi/agents/harness/planning/sprint-contract-auditor.md +2 -0
- package/.pi/agents/harness/{adversary.md → reviewing/adversary.md} +3 -10
- package/.pi/agents/harness/{evaluator.md → reviewing/evaluator.md} +3 -12
- package/.pi/agents/harness/running/executor.md +45 -0
- package/.pi/agents/harness/sentrux-steward.md +51 -0
- package/.pi/extensions/00-harness-project-control.ts +133 -0
- package/.pi/extensions/00-posthog-network-bootstrap.ts +11 -0
- package/.pi/extensions/budget-guard.ts +2 -0
- package/.pi/extensions/debate-orchestrator.ts +2 -0
- package/.pi/extensions/harness-ask-user.ts +2 -2
- package/.pi/extensions/harness-debate-tools.ts +2 -2
- package/.pi/extensions/harness-live-widget.ts +60 -3
- package/.pi/extensions/harness-plan-approval.ts +64 -58
- package/.pi/extensions/harness-run-context.ts +715 -90
- package/.pi/extensions/harness-subagent-submit.ts +46 -12
- package/.pi/extensions/harness-subagents.ts +2 -2
- package/.pi/extensions/harness-telemetry.ts +2 -0
- package/.pi/extensions/harness-web-tools.ts +2 -2
- package/.pi/extensions/lib/extension-load-guard.ts +10 -0
- package/.pi/extensions/lib/harness-artifact-gate.ts +172 -0
- package/.pi/extensions/lib/harness-posthog.ts +9 -5
- package/.pi/extensions/lib/harness-spawn-topology.ts +165 -0
- package/.pi/extensions/lib/harness-subagent-auth.ts +1 -2
- package/.pi/extensions/lib/harness-subagent-policy.ts +28 -24
- package/.pi/extensions/lib/harness-subagent-precheck.ts +36 -10
- package/.pi/extensions/lib/harness-subagent-submit-pipeline.ts +66 -2
- package/.pi/extensions/lib/harness-subagent-submit-registry.ts +22 -22
- package/.pi/extensions/lib/harness-subagents-bridge.ts +7 -29
- package/.pi/extensions/lib/harness-subprocess-bootstrap.ts +73 -0
- package/.pi/extensions/lib/plan-approval/create-plan.ts +2 -3
- package/.pi/extensions/lib/plan-approval/resolve-disk.ts +102 -0
- package/.pi/extensions/lib/plan-approval/schema.ts +22 -8
- package/.pi/extensions/lib/plan-approval/types.ts +1 -1
- package/.pi/extensions/lib/plan-approval/validate.ts +2 -2
- package/.pi/extensions/lib/plan-approval-readiness.ts +192 -0
- package/.pi/extensions/lib/plan-debate-eligibility.ts +12 -5
- package/.pi/extensions/lib/plan-debate-gate.ts +22 -1
- package/.pi/extensions/lib/plan-debate-lanes.ts +32 -2
- package/.pi/extensions/lib/plan-review-gate.ts +8 -0
- package/.pi/extensions/lib/posthog-client.ts +76 -0
- package/.pi/extensions/lib/spawn-policy.ts +3 -3
- package/.pi/extensions/observation-bus.ts +2 -0
- package/.pi/extensions/policy-gate.ts +26 -19
- package/.pi/extensions/review-integrity.ts +91 -10
- package/.pi/extensions/sentrux-rules-sync.ts +2 -0
- package/.pi/extensions/test-diff-integrity.ts +1 -0
- package/.pi/extensions/trace-recorder.ts +2 -0
- package/.pi/harness/agents.manifest.json +37 -37
- package/.pi/harness/corpus/cron.example +8 -0
- package/.pi/harness/corpus/graphify-kb-updater.config.json +214 -0
- package/.pi/harness/corpus/systemd/graphify-kb-updater.env.template +4 -0
- package/.pi/harness/corpus/systemd/graphify-kb-updater.service +17 -0
- package/.pi/harness/corpus/systemd/graphify-kb-updater.timer +11 -0
- package/.pi/harness/docs/adrs/0001-harness-constitution.md +2 -1
- package/.pi/harness/docs/adrs/0006-sentrux-dual-layer.md +8 -6
- package/.pi/harness/docs/adrs/0009-sentrux-rules-lifecycle.md +6 -1
- package/.pi/harness/docs/adrs/0031-harness-run-context.md +1 -1
- package/.pi/harness/docs/adrs/0032-harness-command-orchestration.md +7 -0
- package/.pi/harness/docs/adrs/0034-darwin-plan-research-pipeline.md +3 -3
- package/.pi/harness/docs/adrs/0036-implementation-research-and-selective-debate.md +8 -5
- package/.pi/harness/docs/adrs/0039-harness-post-run-review-gate.md +47 -0
- package/.pi/harness/docs/adrs/0040-practice-grounded-orchestration.md +40 -0
- package/.pi/harness/docs/adrs/0041-intelligent-planning-reconnaissance.md +39 -0
- package/.pi/harness/docs/adrs/0042-agent-native-orchestration.md +35 -0
- package/.pi/harness/docs/adrs/0043-path-first-harness-tools.md +38 -0
- package/.pi/harness/docs/adrs/0044-harness-steer-loop.md +37 -0
- package/.pi/harness/docs/adrs/0045-phase-scoped-agent-directories.md +33 -0
- package/.pi/harness/docs/adrs/README.md +11 -0
- package/.pi/harness/docs/graphify-kb-updater-runbook.md +163 -0
- package/.pi/harness/docs/practice-map.md +110 -0
- package/.pi/harness/env.harness.template +5 -3
- package/.pi/harness/evals/smoke/sentrux-stub.json +1 -1
- package/.pi/harness/evals/smoke/smoke-harness-plan.mjs +5 -2
- package/.pi/harness/specs/README.md +1 -1
- package/.pi/harness/specs/harness-run-context.schema.json +11 -0
- package/.pi/harness/specs/harness-spawn-context.schema.json +15 -1
- package/.pi/harness/specs/plan-execution-plan.schema.json +39 -1
- package/.pi/harness/specs/plan-packet.schema.json +4 -0
- package/.pi/harness/specs/plan-phase-status.schema.json +17 -0
- package/.pi/harness/specs/plan-phase-waiver.schema.json +25 -0
- package/.pi/harness/specs/plan-planning-context.schema.json +50 -0
- package/.pi/harness/specs/repair-brief.schema.json +45 -0
- package/.pi/harness/specs/review-outcome.schema.json +46 -0
- package/.pi/harness/specs/sentrux-manifest-proposal.schema.json +80 -0
- package/.pi/harness/specs/sentrux-signal.schema.json +43 -0
- package/.pi/harness/specs/steer-state.schema.json +20 -0
- package/.pi/lib/harness-context-mode-policy.ts +256 -0
- package/.pi/lib/harness-project-config.ts +91 -0
- package/.pi/lib/harness-repair-brief.ts +145 -0
- package/.pi/lib/harness-run-context.ts +591 -32
- package/.pi/lib/harness-ui-state.ts +114 -21
- package/.pi/prompts/harness-auto.md +10 -10
- package/.pi/prompts/harness-critic.md +3 -30
- package/.pi/prompts/harness-eval.md +4 -37
- package/.pi/prompts/harness-plan.md +116 -54
- package/.pi/prompts/harness-review.md +150 -15
- package/.pi/prompts/harness-run.md +62 -10
- package/.pi/prompts/harness-sentrux-steward.md +55 -0
- package/.pi/prompts/harness-setup.md +5 -4
- package/.pi/prompts/harness-steer.md +30 -0
- package/.pi/scripts/README.md +1 -0
- package/.pi/scripts/graphify-kb-updater.mjs +398 -0
- package/.pi/scripts/harness-agents-manifest.mjs +1 -1
- package/.pi/scripts/harness-project-toggle.mjs +129 -0
- package/.pi/scripts/harness-sentrux-cli.mjs +142 -0
- package/.pi/scripts/harness-verify.mjs +22 -6
- package/.pi/scripts/harness-web-policy-guard.mjs +68 -0
- package/.pi/scripts/validate-plan-dag.mjs +3 -3
- package/AGENTS.md +1 -0
- package/CHANGELOG.md +23 -0
- package/README.md +94 -58
- package/package.json +5 -4
- package/.pi/agents/harness/executor.md +0 -47
- package/.pi/agents/harness/planning/scout-graphify.md +0 -37
- package/.pi/agents/harness/planning/scout-semantic.md +0 -39
- package/.pi/agents/harness/planning/scout-structure.md +0 -35
- package/.pi/prompts/git-sync.md +0 -124
- package/.pi/agents/harness/{tie-breaker.md → reviewing/tie-breaker.md} +0 -0
@@ -1,37 +1,172 @@
 ---
-description:
-argument-hint: "[--run <run-id>] [--trace <trace-ref>]"
+description: Post-run verification gate — deterministic checks, benchmark eval, policy verdict, adversary review (master orchestrator).
+argument-hint: "[--run <run-id>] [--quick] [--readonly] [--trace <trace-ref>]"
 ---
 
 # harness-review
 
-
+You are the **post-run verification PM** (PMBOK Monitoring and Controlling). Run measure → judge → red team in one command. Parent owns `ask_user`, deterministic scripts, `harness_artifact_ready`, and run ownership (`--claim` on resume). Subagents persist via **`submit_*`** only (no parent `write` to verdict artifacts).
 
-
+**Practice map:** `.pi/harness/docs/practice-map.md`
 
--
+Read **harness-orchestration** and **harness-review** skills before spawning.
+
+## Allowed subagents
+
+- `harness/reviewing/evaluator` (`mode: benchmark` then `mode: verdict`)
+- `harness/reviewing/adversary` (independent red team)
+- `harness/reviewing/tie-breaker` (escalation only when adversary blocks and eval was `conditional_pass`; skip when `--quick`)
+
+## Performance rules
+
+1. Use `subagent` with `agentScope: "both"`.
+2. Run benchmark and verdict evaluator passes **sequentially** (verdict depends on benchmark gate).
+3. Adversary runs only after benchmark + policy verdict pass.
+4. Do **not** set `timeoutMs` unless the user requests a cap.
+5. Compact task text: embed `HarnessSpawnContext={"run_id":"…","run_dir":"…","plan_packet_path":"…",…}` — `run_id` is required.
+
+## Step 0 — Parse `$ARGUMENTS`
+
+- optional: `--run <run-id>` (recovery)
+- optional: `--quick` (tailoring — skip adversary + tie-breaker when risk accepted)
+- optional: `--readonly` (inspect only — do not claim ownership)
 - optional: `--trace <trace-ref>`
 
 Happy path: omit `--run`; use `[HarnessRunContext]`.
 
-
+Prerequisites:
+
+- `plan_ready: true` on disk
+- Execute completed (`handoff/executor-summary.yaml` or `last_completed_step: execute`)
+
+If execute not complete:
+
+`Execute not finished. Run /harness-run first.`
+
+Ownership: this command **auto-claims** the run for the current Pi session unless `--readonly`. Cross-session recovery: `/harness-use-run <run-id> --claim` first.
+
+## Phase 1 — Automated QC / deterministic shell (parent)
+
+**Practice:** Harness engineering; interleave deterministic checks before agent judgment (Stripe Minions pattern).
+
+```bash
+node "$UP_PKG/.pi/scripts/harness-verify.mjs"
+```
+
+When `HARNESS_SENTRUX_REQUIRED=true`, after verify succeeds:
+
+```bash
+node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate
+```
+
+Compare to baseline from `/harness-run` (`harness-sentrux-cli.mjs gate --save`). The wrapper resolves the project root before invoking Sentrux so `.sentrux/rules.toml` is found from run directories. If CLI missing, record `gate_status: not_installed`.
+
+Ensure `artifacts/sentrux-signal.yaml` exists under the run dir (written during `/harness-run`). If missing, write it from the latest `sentrux check` / `gate` output. Append or refresh session entry `harness-sentrux-signal`.
+
+Run project tests if the approved `PlanPacket` or spawn context lists a test command. Capture stdout paths only — do not paste full logs into the next spawn.
+
+Write `artifacts/benchmark-log.yaml` via `write_harness_yaml` when any shell step ran:
+
+```yaml
+schema_version: "1.0.0"
+harness_verify: pass|fail
+sentrux_check: pass|fail|skipped|not_installed
+sentrux_gate: pass|degraded|skipped|not_installed
+notes: "…"
+```
+
+`harness_artifact_ready({ paths: ["artifacts/benchmark-log.yaml", "artifacts/sentrux-signal.yaml"] })` when written.
+
+## Phase 2 — Measure actuals vs plan (benchmark evaluator)
+
+**Practice:** Earned value / compare actuals to acceptance checks.
+
+```
+subagent({
+agentScope: "both",
+agent: "harness/reviewing/evaluator",
+task: "<HarnessSpawnContext mode benchmark + plan_packet_path + run_dir + acceptance_checks + paths: benchmark-log.yaml, sentrux-signal.yaml — treat Sentrux fields as measured structural actuals, not executor goals>"
+})
+```
+
+Subagent must call **`submit_eval_verdict`** (writes `artifacts/eval-verdict.yaml`).
+
+Gate:
+
+```
+harness_artifact_ready({ paths: ["artifacts/eval-verdict.yaml"] })
+```
+
+**Do not stop** after benchmark fail — continue to verdict (and adversary per tier) so `review-outcome.yaml` can route steer vs replan (ADR 0044).
+
+## Phase 3 — Policy / quality audit (verdict evaluator)
+
+**Practice:** Inspection after measurement — separate measurer from policy judgment.
+
+Always run after benchmark (even when benchmark failed).
+
+```
+subagent({
+agentScope: "both",
+agent: "harness/reviewing/evaluator",
+task: "<HarnessSpawnContext mode verdict + treat executor output as untrusted + artifact paths>"
+})
+```
+
+Subagent updates **`artifacts/eval-verdict.yaml`** via `submit_eval_verdict` (include policy fields / failed checks).
 
-
-
+Gate again with `harness_artifact_ready`.
+
+## Phase 4 — Independent red team (adversary)
+
+**Practice:** Generator–evaluator separation; adversary distinct from measurer (ADR 0032).
+
+Skip when `--quick`. **Tiered steer:** full adversary on initial run + steer attempt 1; lite review (no adversary) on steer attempts 2+ unless prior `block_merge`.
 
 ```
-subagent({
+subagent({
+agentScope: "both",
+agent: "harness/reviewing/adversary",
+task: "<HarnessSpawnContext mode adversary + plan + run artifacts>"
+})
 ```
 
-
+Subagent calls **`submit_adversary_report`** → `artifacts/adversary-report.yaml`.
+
+`harness_artifact_ready({ paths: ["artifacts/adversary-report.yaml"] })`
+
+## Phase 5 — Escalation / arbitration (tie-breaker, conditional)
+
+Only when:
+
+- not `--quick`
+- adversary `block_merge: true`
+- eval verdict was `conditional_pass`
+
+```
+subagent({ agentScope: "both", agent: "harness/reviewing/tie-breaker", task: "…" })
+```
 
 ## Parent rules
 
--
--
+- **Never** parse subprocess JSON to write `eval-verdict.yaml` or `adversary-report.yaml` — use `submit_*` + `harness_artifact_ready` only.
+- Do not edit `plan-packet.yaml`.
+- Do not run inline review checks in this session (subagent isolation per ADR 0032).
+- Same Pi session as `/harness-run` is preferred; `--claim` makes cross-session resume work.
+
+## Phase 6 — Review outcome + repair brief (parent)
+
+Write **`artifacts/review-outcome.yaml`** and **`artifacts/repair-brief.yaml`** via `write_harness_yaml` (path pointers in brief, not pasted bodies).
+
+| `remediation_class` | `recommended_next` |
+|---------------------|-------------------|
+| `pass` | `/harness-policy-status` |
+| `implementation_gap` | `/harness-steer` |
+| `plan_gap` | `/harness-plan` (mode: revise) |
+| `rollback` | `/harness-incident` |
+
+One `ask_user` steer gate when not pass (unless `steer_approved` on run-context).
 
 ## Completion
 
-
-- `recommended_action`: `proceed_to_adversary`, `replan`, or `rollback`
-- Evidence list for each failed check
+Report eval status, remediation class, and `next_command` from `review-outcome.yaml`.
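Reviewer note: the Phase 5 escalation conditions and the Phase 6 routing table in the harness-review hunk above can be read as two small pure functions. The following is a minimal sketch under stated assumptions, not the package's implementation: `RemediationClass`, `recommendedNext`, `ReviewSignals`, and `needsTieBreaker` are hypothetical names introduced here for illustration, and only the string values are taken from the prompt text.

```typescript
// Illustrative sketch only: these names are hypothetical, not the package's API.
type RemediationClass = "pass" | "implementation_gap" | "plan_gap" | "rollback";

// Mirrors the remediation_class -> recommended_next table in Phase 6.
const RECOMMENDED_NEXT: Record<RemediationClass, string> = {
  pass: "/harness-policy-status",
  implementation_gap: "/harness-steer",
  plan_gap: "/harness-plan (mode: revise)",
  rollback: "/harness-incident",
};

function recommendedNext(remediation: RemediationClass): string {
  return RECOMMENDED_NEXT[remediation];
}

interface ReviewSignals {
  quick: boolean;               // --quick was passed
  adversaryBlockMerge: boolean; // adversary report has block_merge: true
  evalVerdict: string;          // assumed to hold values such as "conditional_pass"
}

// Phase 5 runs only when all three listed conditions hold.
function needsTieBreaker(signals: ReviewSignals): boolean {
  return (
    !signals.quick &&
    signals.adversaryBlockMerge &&
    signals.evalVerdict === "conditional_pass"
  );
}
```

Keeping the table as a typed record means a new remediation class would fail to compile until a route is added for it, which is one way to keep the prompt table and any code that consumes it in step.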
@@ -5,7 +5,9 @@ argument-hint: ""
 
 # harness-run
 
-
+**Practice map:** `.pi/harness/docs/practice-map.md`
+
+You orchestrate the **Executing Process Group** — spawn `harness/running/executor` only. Do **not** implement inline.
 
 ## Step 0 — Parse arguments
 
@@ -16,28 +18,78 @@ If plan not ready:
 
 `Run /harness-plan first — no approved plan in active run context.`
 
-##
+## Gate — No execution without baseline (change control)
+
+**Practice:** PMBOK integrated change control — refuse work without an approved baseline.
+
+Refuse if `plan_ready` is false.
+
+## Pre-work — Architectural fitness baseline (parent)
+
+**Practice:** Fitness functions (architecture governance) — save structural baseline before the executor mutates the tree.
+
+When `HARNESS_SENTRUX_REQUIRED=true` (see `.env.example`), run the bundled root-resolving wrapper:
+
+```bash
+node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate --save
+```
+
+The wrapper passes the resolved project root explicitly so Sentrux can find `.sentrux/rules.toml` even if the active shell is under `.pi/harness/runs/*`. If `sentrux` is not installed, note `gate_baseline: skipped` in run notes and continue (harness-verify may still pass rules-sync checks).
+
+Do **not** ask the executor to optimize Sentrux metrics — observation is for `/harness-review` only.
+
+## Orchestration — Single jelled implementer
+
+**Practice:** Peopleware — one accountable team owns delivery; generator–evaluator separation (executor does not self-certify).
 
 1. Confirm `[HarnessActivePlan]` / extension reports plan ready.
 2. Build `HarnessSpawnContext` with `mode: execute`, `plan_packet_path`, `run_dir`, `acceptance_checks` from plan file.
-3.
+3. Include **`critical_path_work_item_ids`** from `execution_plan.schedule_metadata` in spawn task when present — executor should tackle limiting-step items first (Grove).
+4. Spawn (max **1** agent per call):
 
 ```
-subagent({ agentScope: "both", agent: "harness/executor", task: "<HarnessSpawnContext + handoff>" })
+subagent({ agentScope: "both", agent: "harness/running/executor", task: "<HarnessSpawnContext + handoff + critical path hint>" })
 ```
 
-
-
+5. Parse subprocess output JSON (`execution_status`, validations, rollback refs) from tool result text.
+6. Parent persists trace/handoff artifacts under run dir if needed; do not self-review.
+
+## Post-work — Structural observation (parent)
+
+**Practice:** Monitoring actuals vs baseline — in-process fitness functions after generator work.
+
+After executor subprocess completes:
+
+```bash
+node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" check
+node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate
+```
+
+- If `sentrux check` exits non-zero or `gate` reports degradation → set `execution_status: scope_drift` (or `blocked` if unrecoverable); parent runs **`/harness-review`** next (not immediate replan).
+- Write `artifacts/sentrux-signal.yaml` via `write_harness_yaml`:
+
+```yaml
+schema_version: "1.0.0"
+run_id: "<run_id>"
+check_pass: true|false
+gate_status: pass|degraded|skipped|not_installed
+quality_signal_summary: "<one line from CLI output>"
+recorded_at: "<ISO8601>"
+phase: execute
+```
+
+- Append session custom entry `harness-sentrux-signal` with the same fields (observation bus / telemetry).
+
+`harness_artifact_ready({ paths: ["artifacts/sentrux-signal.yaml"] })` when written.
 
 ## Parent rules
 
--
-- On `scope_drift`, stop and recommend `/harness-plan`.
+- On `scope_drift`, finish handoff and recommend **`/harness-review`** (review classifies `plan_gap` vs `implementation_gap` — ADR 0044).
 - Do not call `ask_user` for plan-level ambiguity — return to plan command.
 
 ## Completion
 
 - `execution_status`: `completed`, `blocked`, or `scope_drift`
 - `validation_summary` with command evidence
-- `handoff_ready` for
-- `next_command`: `/harness-
+- `handoff_ready` for post-run review
+- `next_command`: `/harness-review` (Monitoring and Controlling — measure then judge; same session preferred)
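Reviewer note: the post-work rule in the harness-run hunk above (a non-zero `sentrux check` exit or a degraded `gate` sets `execution_status: scope_drift`) pairs naturally with the `sentrux-signal.yaml` field list. The sketch below is one possible reading, not the package's code. `buildSentruxSignal` and `statusAfterObservation` are hypothetical helpers, the sketch assumes Sentrux is installed (so a non-zero exit really means a failed check), and it leaves the "unrecoverable means `blocked`" decision to the caller.

```typescript
// Illustrative sketch only: function and type names are hypothetical.
type GateStatus = "pass" | "degraded" | "skipped" | "not_installed";
type ExecutionStatus = "completed" | "blocked" | "scope_drift";

// Field names follow the sentrux-signal.yaml example in the prompt.
interface SentruxSignal {
  schema_version: "1.0.0";
  run_id: string;
  check_pass: boolean;
  gate_status: GateStatus;
  quality_signal_summary: string;
  recorded_at: string; // ISO 8601 timestamp
  phase: "execute";
}

function buildSentruxSignal(
  runId: string,
  checkExitCode: number,
  gateStatus: GateStatus,
  summary: string,
  recordedAt: string = new Date().toISOString(),
): SentruxSignal {
  return {
    schema_version: "1.0.0",
    run_id: runId,
    check_pass: checkExitCode === 0, // assumption: exit code 0 means the check passed
    gate_status: gateStatus,
    quality_signal_summary: summary,
    recorded_at: recordedAt,
    phase: "execute",
  };
}

// A failed check or a degraded gate downgrades the status to scope_drift.
// An executor-reported "blocked" is kept as-is (assumption of this sketch).
function statusAfterObservation(
  reported: ExecutionStatus,
  signal: SentruxSignal,
): ExecutionStatus {
  if (reported === "blocked") return "blocked";
  const drift = !signal.check_pass || signal.gate_status === "degraded";
  return drift ? "scope_drift" : reported;
}
```

Passing `recordedAt` explicitly keeps the function deterministic in tests; in a real run the default timestamp would be used.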
@@ -0,0 +1,55 @@
+---
+description: Ad-hoc architectural intent review — spawn harness/sentrux-steward with graphify evidence.
+argument-hint: "[--run <run-id>]"
+---
+
+# harness-sentrux-steward
+
+You are the **chair** for Sentrux **intent** evolution (manifest → rules.toml). Spawn **`harness/sentrux-steward`** only — do not edit the manifest inline without a proposal artifact.
+
+**Skill:** `harness-sentrux-setup` — bootstrap vs steward vs sync.
+
+## When to use
+
+- User requests manifest / rules refresh
+- After `/harness-plan` when execution plan adds top-level paths not covered by manifest layer globs
+- Debate `quality` focus flags structural risk
+- Post-run `sentrux check` failures suggesting missing boundaries (before replan)
+
+Do **not** spawn on every `/harness-review`.
+
+## Step 0 — Context
+
+Use `[HarnessRunContext]` / `[HarnessActivePlan]`. Optional `--run <run-id>` for recovery.
+
+## Step 1 — Spawn steward
+
+```
+subagent({
+agentScope: "both",
+agent: "harness/sentrux-steward",
+task: "<HarnessSpawnContext + plan_packet_path + planning-context.yaml + execution-plan paths + scope hint>"
+})
+```
+
+Gate: `harness_artifact_ready({ paths: ["artifacts/sentrux-manifest-proposal.yaml"] })`
+
+## Step 2 — Chair decision
+
+Read `artifacts/sentrux-manifest-proposal.yaml`.
+
+- `change_class: none` → report no manifest change; stop.
+- Otherwise → `ask_user` with summary, evidence bullets, and `adr_draft` if `adr_required`.
+
+On approval:
+
+1. Apply `manifest_patch` to `.pi/harness/sentrux/architecture.manifest.json` (parent `write` or manual edit).
+2. `node "$UP_PKG/.pi/scripts/harness-sentrux-bootstrap.mjs" --force`
+3. Append session custom entry `harness-architecture-changed` (triggers rules sync extension).
+4. If `adr_required`, file harness ADR snippet or `docs/adr/` entry per team convention.
+
+On reject: keep manifest unchanged; document decision in run notes.
+
+## Completion
+
+Report `change_class`, whether manifest was updated, and `sentrux check` outcome if run after sync.
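Reviewer note: Step 2 of the steward prompt above is a two-way branch on the proposal artifact. The following is a rough sketch of that branch, assuming the proposal has already been parsed into an object. `ManifestProposal`, `ChairAction`, and `chairDecision` are hypothetical names; only the field names `change_class`, `adr_required`, and `adr_draft` come from the prompt, and the set of non-`none` `change_class` values is not enumerated in this diff.

```typescript
// Illustrative sketch only: type and function names are hypothetical.
interface ManifestProposal {
  change_class: string;   // "none" means no manifest change; other values are not listed in this diff
  adr_required?: boolean;
  adr_draft?: string;
}

type ChairAction =
  | { kind: "stop"; message: string }
  | { kind: "ask_user"; includeAdrDraft: boolean };

function chairDecision(proposal: ManifestProposal): ChairAction {
  if (proposal.change_class === "none") {
    return { kind: "stop", message: "no manifest change" };
  }
  // The ADR draft is attached only when the proposal requires an ADR and provides one.
  const includeAdrDraft =
    proposal.adr_required === true && proposal.adr_draft !== undefined;
  return { kind: "ask_user", includeAdrDraft };
}
```

The manifest patch, bootstrap re-run, and session entry listed under "On approval" would follow only after the user accepts the `ask_user` prompt; they are deliberately left out of this sketch.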
@@ -387,11 +387,11 @@ Manual override: **`/router profile auto`** or **`/router profile opencode-go`**
 
 ## Step 3.6 — Harness agents (package-resolved)
 
-`harness-subagents` loads agents from the installed **`ultimate-pi`** package (`$UP_PKG/.pi/agents/**`) with namespaced ids (`harness/executor`, `harness/
+`harness-subagents` loads agents from the installed **`ultimate-pi`** package (`$UP_PKG/.pi/agents/**`) with namespaced ids (`harness/running/executor`, `harness/reviewing/evaluator`, `pi-pi/agent-expert`). **Do not copy** agents into the project unless you want a deliberate override.
 
 **Slash commands are orchestrators:** `/harness-plan`, `/harness-run`, etc. spawn `harness/*` agents via the `Agent` tool — bootstrap stays **script-first**; only optionally spawn `harness/sentrux-bootstrap` for Sentrux (see Step 4.2).
 
-Optional per-repo overrides: place `.md` files at the **same relative path** (e.g. `.pi/agents/harness/
+Optional per-repo overrides: place `.md` files at the **same relative path** (e.g. `.pi/agents/harness/running/executor.md` overrides the package executor).
 
 Verify manifest drift after `pi update ultimate-pi`:
 
@@ -531,18 +531,19 @@ node "$UP_PKG/.pi/scripts/harness-sentrux-bootstrap.mjs" --force
 | `harness-sentrux-bootstrap.mjs` (no flags) | `/harness-setup`, first install, re-run safe |
 | `harness-sentrux-bootstrap.mjs --force` | Manifest layers/boundaries/constraints changed |
 | `sentrux-rules-sync.mjs --check` | CI / harness-verify drift only |
+| `harness-sentrux-cli.mjs check` / `gate` | Root-resolving Sentrux checks from harness run dirs |
 | `/harness-sentrux-sync` | Interactive re-sync from pi |
 
 `harness-seed-project-contracts.mjs` (Step 0.5) may copy `architecture.manifest.json` early; bootstrap still personalizes `project` on first seed and writes `rules.toml`.
 
 Verify rules:
 ```bash
-sentrux check
+node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" check && echo "✓ sentrux rules pass" || echo "✗ sentrux check failed"
 ```
 
 Set up structural regression baseline (optional):
 ```bash
-sentrux gate --save
+node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate --save 2>/dev/null || echo "Baseline will be saved on first gate run"
 ```
 
 ### 4.3 — Project AGENTS.md
@@ -0,0 +1,30 @@
+---
+description: Post-review repair pass — executor reads repair-brief.yaml, then re-verify via /harness-review.
+argument-hint: "[--attempt N]"
+---
+
+# harness-steer
+
+Thin orchestrator for the **steer loop** (ADR 0044). Run only after `/harness-review` produced `artifacts/review-outcome.yaml` and `artifacts/repair-brief.yaml` with `remediation_class: implementation_gap`.
+
+## Preconditions
+
+- Active run with `plan_ready` and `plan_packet_path`
+- `review-outcome.remediation_class` is `implementation_gap` (review outcome wins over executor `scope_drift` for routing)
+- `steer_attempt < HARNESS_STEER_MAX_ATTEMPTS` (default 3)
+
+## Steps
+
+1. Read `artifacts/review-outcome.yaml`, `artifacts/repair-brief.yaml`, `plan_packet_path` (paths only — do not paste bodies into tool args).
+2. Update `artifacts/steer-state.yaml` (`attempt`, `max_attempts`, `active: true`).
+3. Set policy phase to **execute** before spawning executor (required for mutating tools).
+4. One `ask_user` steer gate unless `run-context.steer_approved` is already true.
+5. Spawn **`harness/running/executor`** with `HarnessSpawnContext.mode: repair` and `repair_brief_path: artifacts/repair-brief.yaml`.
+6. Optional: `node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate --save` after repair to refresh baseline (ADR 0044).
+7. `next_command`: **`/harness-review`** (always re-verify; tiered adversary on attempts 2+ per practice-map).
+
+## Forbidden
+
+- Re-call `approve_plan` unless `plan-packet.yaml` structure changed
+- Widen scope beyond approved packet
+- Skip review after repair
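Reviewer note: the three preconditions in the harness-steer prompt above amount to a guard that must pass before the executor is spawned in repair mode. A minimal sketch follows, assuming the run context and review outcome have already been loaded. `SteerInputs` and `steerBlockers` are hypothetical names, and the attempt cap is passed as a parameter (defaulting to 3, as the prompt states for `HARNESS_STEER_MAX_ATTEMPTS`) rather than read from the environment.

```typescript
// Illustrative sketch only: names are hypothetical, not the package's API.
interface SteerInputs {
  planReady: boolean;
  planPacketPath?: string;
  remediationClass: string; // review-outcome.remediation_class
  steerAttempt: number;     // attempts already made
  maxAttempts?: number;     // HARNESS_STEER_MAX_ATTEMPTS; the prompt gives 3 as the default
}

// Returns the unmet preconditions; an empty list means steering may proceed.
function steerBlockers(inputs: SteerInputs): string[] {
  const blockers: string[] = [];
  const max = inputs.maxAttempts ?? 3;
  if (!inputs.planReady || !inputs.planPacketPath) {
    blockers.push("no active run with plan_ready and plan_packet_path");
  }
  if (inputs.remediationClass !== "implementation_gap") {
    blockers.push("remediation_class is not implementation_gap");
  }
  if (!(inputs.steerAttempt < max)) {
    blockers.push("steer_attempt has reached the maximum");
  }
  return blockers;
}
```

Returning the list of failures, rather than a single boolean, lets the orchestrator report every unmet precondition in one message.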
package/.pi/scripts/README.md CHANGED
@@ -27,6 +27,7 @@ From **Typescript extensions**, use `resolveHarnessScript()` / `getHarnessPackag
 | Sentrux rules bootstrap (harness-setup) | `node "$UP_PKG/.pi/scripts/harness-sentrux-bootstrap.mjs"` |
 | Sentrux rules re-sync after manifest edit | `node "$UP_PKG/.pi/scripts/harness-sentrux-bootstrap.mjs" --force` or `/harness-sentrux-sync` |
 | Sentrux rules drift check (CI) | `node "$UP_PKG/.pi/scripts/sentrux-rules-sync.mjs" --check` |
+| Sentrux run/review check or gate (root-resolving) | `node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" check` / `gate [--save]` |
 | Resolve package root (`UP_PKG`) | `node "$UP_PKG/.pi/scripts/harness-resolve-up-pkg.mjs"` |
 | Model-router config (Pi auth) | `node "$UP_PKG/.pi/scripts/harness-generate-model-router.mjs"` |
 | Project `.env` (append-only) | `node "$UP_PKG/.pi/scripts/harness-sync-env.mjs"` (`--create-missing` after user confirms) |