ultimate-pi 0.22.0 → 0.22.1
- package/.agents/skills/harness-context/SKILL.md +3 -3
- package/.agents/skills/harness-debate-plan/SKILL.md +2 -2
- package/.agents/skills/harness-decisions/SKILL.md +2 -2
- package/.agents/skills/harness-eval/SKILL.md +1 -1
- package/.agents/skills/harness-git-commit/SKILL.md +1 -1
- package/.agents/skills/harness-governor/SKILL.md +5 -5
- package/.agents/skills/harness-ls-lint-setup/SKILL.md +2 -2
- package/.agents/skills/harness-orchestration/SKILL.md +4 -4
- package/.agents/skills/harness-plan/SKILL.md +2 -2
- package/.agents/skills/harness-review/SKILL.md +2 -2
- package/.agents/skills/harness-sentrux-repair/SKILL.md +1 -1
- package/.agents/skills/harness-sentrux-setup/SKILL.md +2 -2
- package/.agents/skills/harness-spec/SKILL.md +1 -1
- package/.agents/skills/harness-steer/SKILL.md +2 -2
- package/.agents/skills/posthog-analyst/SKILL.md +1 -1
- package/.agents/skills/sentrux/SKILL.md +4 -4
- package/.agents/skills/web-retrieval/SKILL.md +1 -1
- package/.pi/agents/harness/ls-lint-steward.md +3 -3
- package/.pi/agents/harness/planning/decompose.md +1 -1
- package/.pi/agents/harness/planning/execution-plan-author.md +1 -1
- package/.pi/agents/harness/planning/hypothesis-validator.md +1 -1
- package/.pi/agents/harness/planning/hypothesis.md +1 -1
- package/.pi/agents/harness/planning/plan-adversary.md +1 -1
- package/.pi/agents/harness/planning/plan-evaluator.md +2 -2
- package/.pi/agents/harness/planning/plan-synthesizer.md +2 -2
- package/.pi/agents/harness/planning/review-integrator.md +1 -1
- package/.pi/agents/harness/planning/sprint-contract-auditor.md +5 -5
- package/.pi/agents/harness/running/executor.md +1 -1
- package/.pi/agents/harness/sentrux-repair-advisor.md +1 -1
- package/.pi/agents/harness/sentrux-steward.md +2 -2
- package/.pi/harness/agents.manifest.json +15 -15
- package/.pi/prompts/harness-plan.md +7 -7
- package/.pi/prompts/harness-review.md +5 -5
- package/.pi/prompts/harness-run.md +2 -2
- package/.pi/prompts/harness-sentrux-steward.md +2 -2
- package/.pi/prompts/harness-setup.md +3 -3
- package/.pi/prompts/harness-steer.md +5 -5
- package/.pi/scripts/harness-verify.mjs +73 -0
- package/AGENTS.md +1 -0
- package/CHANGELOG.md +7 -0
- package/package.json +9 -6
@@ -8,7 +8,7 @@ description: Compile task-specific harness context using context-mode and graphi
 ## When to use

 - Preparing context for `/harness-plan`, `/harness-run`, or `/harness-auto`
-- Navigating harness-related code and
+- Navigating harness-related code and governance decisions without reading entire repos

 ## Mandatory: context-mode only

@@ -25,7 +25,7 @@ Use these in rough priority order — not every tool on every task:
 | Structural code patterns | `sg -p '…'` (ast-grep) |
 | Semantic implementation search | `ccc search` (harness pre-indexes before subprocess spawns) |
 | File detail | context-mode maps/signatures, then targeted reads |
-| Harness governance |
+| Harness governance | approved policies and decision logs in the target project |

 For `/harness-plan` Phase 1, parent compiles findings into `artifacts/planning-context.yaml` — see **harness-plan** skill.

@@ -33,7 +33,7 @@ For `/harness-plan` Phase 1, parent compiles findings into `artifacts/planning-c

 Compact context block:

-- Relevant
+- Relevant governance decisions (id/title + one-line decision)
 - Extension entry points (policy-gate, trace-recorder, harness-telemetry)
 - Schema versions in play

@@ -5,7 +5,7 @@ description: Plan-phase Review Gate debate — pi-messenger threads, lane YAML,

 # harness-debate-plan

-
+Review Gate RACI: parent is chair; lane agents provide structured evidence in sequence.

 Use when running **Phase 5** of `/harness-plan` — **Fagan-style structured inspection** per focus (`spec` | `wbs` | `schedule` | `quality`). Parent is **chair**; within-round dialogue (claims → rebuttals → clarifications → counters → integrate).

@@ -78,4 +78,4 @@ Resume: `harness_debate_round_status({ round_index: N })` → run listed `next_t

 Do not `approve_plan` on `policy_decision: block`. On `human_required` → `ask_user` first.

-Rubrics:
+Rubrics: use the focus-specific checklist ids passed by the parent for the active round.
@@ -67,7 +67,7 @@ Use during **`/harness-plan` Phase 0** only. Purpose: disambiguate the **task**
 "options": [
 { "title": "Harness contract only", "description": "Changes under .pi/harness and prompts; harness-verify passes" },
 { "title": "End-to-end feature", "description": "User-visible behavior + tests in the app repo" },
-{ "title": "Docs /
+{ "title": "Docs / decision-record only", "description": "No runtime code changes" }
 ],
 "allowFreeform": true
 }
@@ -94,7 +94,7 @@ Use **`questions[]`** when ≥2 independent dimensions must be resolved together
 ```json
 {
 "question": "Lock the task contract before reconnaissance",
-"context": "Phase 0
+"context": "Phase 0 task-clarification gate. Answer both forks to set scope and acceptance.",
 "questions": [
 {
 "title": "Scope surface",
@@ -9,4 +9,4 @@ description: >-

 Use **`harness-review`** skill and **`/harness-review`** instead.

-The master command runs benchmark + policy verdict (+ adversary unless `--quick`) with `submit_eval_verdict` / `submit_adversary_report` and parent `harness_artifact_ready` gates
+The master command runs benchmark + policy verdict (+ adversary unless `--quick`) with `submit_eval_verdict` / `submit_adversary_report` and parent `harness_artifact_ready` gates.
@@ -67,6 +67,6 @@ Edit project file to change format or co-author for external repos.

 ## References

--
+- Auto-commit lifecycle policy: use bootstrap + commit CLI so co-author and message format stay consistent.
 - Scripts — `harness-git-commit.mjs`, `harness-auto-commit-bootstrap.mjs`
 - Library — `.pi/lib/harness-auto-commit-config.mjs`
@@ -14,10 +14,10 @@ description: Enforce harness governance phases, policy gates, budgets, and promo
 ## Workflow

 1. Read current phase from `/harness-policy-status` or session `harness-policy-state`.
-2. Check
+2. Check governance policies: phase constitution, eval promotion rules, Sentrux requirements, drift handling, rules lifecycle, and AGT policy/security layers.
 3. Tool allow/deny is enforced by AGT `PolicyEngine` + `.pi/harness/policies/*.yaml` (parent `policy-gate`, subprocess `harness-subagent-governance`). Disable with `HARNESS_AGT_POLICY=0`. Audit: `.pi/harness/runs/<run_id>/agt-audit.jsonl`.
 4. For promotion: require eval pass, no abort lock, debate consensus if escalated, Sentrux when `HARNESS_SENTRUX_REQUIRED=true` (`artifacts/sentrux-signal.yaml` from `/harness-run`, not executor self-report).
-5. **Intent vs observation:** Sentrux manifest changes → `/harness-sentrux-steward` + chair +
+5. **Intent vs observation:** Sentrux manifest changes → `/harness-sentrux-steward` + chair + formal decision record when material, then `sentrux-rules-sync --force`. Naming manifest changes → `/harness-ls-lint-steward` + chair, then `ls-lint-rules-sync --force`. CLI degradation after execute → fix paths or replan — do not tune manifest on a single noisy run.
 6. After approved Sentrux edits: `harness-sentrux-bootstrap.mjs --force` or `/harness-sentrux-sync`; emit `harness-architecture-changed`. After naming edits: `harness-ls-lint-bootstrap.mjs --force` or `/harness-ls-lint-sync`; emit `harness-naming-changed`.
 7. Run `node "$UP_PKG/.pi/scripts/harness-verify.mjs"` before claiming release readiness (includes AGT policy doctor).

@@ -30,13 +30,13 @@ When refining plans from noisy requirements:
 3. When gates return `human_required` or promotion is blocked, the orchestrator calls `ask_user` — do not guess scope.
 4. Reference graphify wiki or `graphify query` for architecture constraints before execute.

-## Budgets
+## Budgets

 - Default: **`HARNESS_BUDGET_ENFORCE` off** — token/debate caps are telemetry-only (`harness-budget-telemetry`, `harness-budget-soft-limit`). They do **not** block phases or debate lanes.
 - Do **not** skip reconnaissance artifacts (`planning-context.yaml`), debate rounds, or `approve_plan` because of soft budget hints in the widget.
 - Re-enable hard caps only with `HARNESS_BUDGET_ENFORCE=1` and `HARNESS_BUDGET_HARD_STOP` / `HARNESS_DEBATE_HARD_STOP`.

-## Subagent artifacts
+## Subagent artifacts

 - Subagents call scoped **`submit_*`** tools; parent verifies with **`harness_artifact_ready`**, not JSON parsing from `finalOutput`.
 - Parent **`write_harness_yaml`** is for merges (`research-brief.yaml`, plan shell) — not subagent payloads.
@@ -44,4 +44,4 @@ When refining plans from noisy requirements:
 ## Rules

 - Never auto-merge; harness-auto may open PR only when all gates pass (see release-readiness-report).
-- Do not invoke posthog-analyst in Phase 2
+- Do not invoke posthog-analyst in Phase 2.
@@ -20,7 +20,7 @@ description: Bootstrap ls-lint filename rules for harness projects — seed nami
 | **Sync** | `ls-lint-rules-sync.mjs`, `/harness-ls-lint-sync` | Regenerates `.ls-lint.yml` from manifest after intent change |
 | **Observation** | `/harness-run`, `/harness-review` | `harness-ls-lint-cli.mjs` → `artifacts/ls-lint-signal.yaml` |

-Never auto-sync manifest from directory trees. Material manifest edits need steward evidence + chair approval
+Never auto-sync manifest from directory trees. Material manifest edits need steward evidence + chair approval.

 ## Canonical layout

@@ -54,6 +54,6 @@ Custom YAML **outside** `# --- harness:managed:start/end ---` is preserved on ev

 ## References

--
+- Naming lifecycle policy: steward proposal + chair approval before material manifest changes.
 - Scripts — `ls-lint-rules-sync.mjs`, `harness-ls-lint-bootstrap.mjs`, `harness-ls-lint-cli.mjs`
 - Agent — `harness/ls-lint-steward`
@@ -8,14 +8,14 @@ description: >-

 # Harness orchestration

-
+Follow the orchestration rules and phase sequence in this skill directly.

 ## Team management rules

 1. **Parallelism law** — Parallel `tasks` only when outputs are independent inputs to a later merge (implementation ∥ stack). Never parallelize debate lanes or decompose ∥ hypothesis.
 2. **Two-pizza cap per batch** — Max 2 research lanes, 1 optional `planning-context` subagent, 1 executor, 1 debate agent per `subagent` call.
 3. **No redundant thinkers** — Downstream agents read artifacts; do not re-derive.
-4. **Sequential dependency chain** — planning context → decompose → hypothesis → research → author → DAG → debate → approve → execute → **/harness-review** → optional **/harness-steer** loop
+4. **Sequential dependency chain** — planning context → decompose → hypothesis → research → author → DAG → debate → approve → execute → **/harness-review** → optional **/harness-steer** loop.
 5. **Path-first parent tools** — `approve_plan`, `create_plan`, `submit_*` via `source_path`, `merge_harness_yaml`, `harness_synthesize_repair_brief`.
 6. **Debate = meeting** — Parent is chair; parallel_probes allows evaluator ∥ adversary per batch.
 7. **Tool intelligence** — Parent uses graphify, sg, ccc, and reads by task need; subprocesses optional.
@@ -41,7 +41,7 @@ Harness subprocesses load **`harness-subagent-submit`** (`PI_HARNESS_SUBPROCESS=
 |---------|---------|
 | `/harness-plan` | Parent: planning context (tools) → decompose → hypothesis → Phase 3.5 artifacts → PlanPacket → eligibility + Review Gate → `approve_plan` + `create_plan` |
 | `/harness-run` | `harness/running/executor` (single worker) |
-| `/harness-review` | Parent verify → `evaluator` benchmark → `evaluator` verdict → `adversary` → optional `tie-breaker`
+| `/harness-review` | Parent verify → `evaluator` benchmark → `evaluator` verdict → `adversary` → optional `tie-breaker` |
 | `/harness-eval` | **Deprecated** → `/harness-review` |
 | `/harness-critic` | **Deprecated** → `/harness-review` |
 | `/harness-auto` | plan per `/harness-plan`; `--quick` skips adversary + tie-breaker in review |
@@ -80,5 +80,5 @@ Then execution-plan-author, DAG gate, debate eligibility, sequential debate roun

 ## References

--
+- Subagent isolation, submit-tool artifact flow, and spawn-context contract: `.pi/harness/specs/harness-spawn-context.schema.json`
 - `node "$UP_PKG/.pi/scripts/harness-agents-manifest.mjs" --check`
@@ -5,7 +5,7 @@ description: Agent-native harness plans — lakes/context bundles, planning cont

 # harness-plan

-
+Use this skill's phase order, spawn laws, and artifact contract directly.

 ## When to use

@@ -21,7 +21,7 @@ description: Agent-native harness plans — lakes/context bundles, planning cont

 ## Workflow (parent orchestrator)

-1. **Phase 0:** `artifacts/task-clarification.yaml` — investigate (code + web OK), `ask_user` until unambiguous, gate before any planning subagent
+1. **Phase 0:** `artifacts/task-clarification.yaml` — investigate (code + web OK), `ask_user` until unambiguous, gate before any planning subagent.
 2. **Phase 1:** Compile `artifacts/planning-context.yaml` with tools (default) or optional `planning-context` subagent; inherit Phase 0 grounding.
 3. **Sequential** decompose → gate `artifacts/decomposition.yaml`.
 4. **Sequential** hypothesis (requires decomposition).
@@ -9,7 +9,7 @@ description: >-

 # harness-review

-
+Monitoring and Controlling flow: measure → judge → red team.

 ## When to use

@@ -42,7 +42,7 @@ Pass `sentrux-signal.yaml` path to evaluator `mode: benchmark` spawn context. Ev

 ## Rules

-- Parent never writes eval/adversary YAML — subprocess `submit_*` only
+- Parent never writes eval/adversary YAML — subprocess `submit_*` only.
 - Auto-claim run ownership unless `--readonly`.
 - Disk verdict drives `next_recommended_command` (`resolveCompletionStatuses`).

@@ -20,7 +20,7 @@ description: Bootstrap Sentrux architectural rules for harness projects — seed
 | **Sync** | `sentrux-rules-sync.mjs`, `/harness-sentrux-sync` | Regenerates `rules.toml` from manifest after intent change |
 | **Observation** | `/harness-run`, `/harness-review` | `harness-sentrux-cli.mjs gate --save` / `check` / `gate` → `artifacts/sentrux-signal.yaml` |

-Never auto-sync manifest from directory trees. Material manifest edits need steward evidence + chair approval
+Never auto-sync manifest from directory trees. Material manifest edits need steward evidence + chair approval.

 ## Canonical layout

@@ -63,7 +63,7 @@ Do **not** copy ultimate-pi's layer paths blindly into unrelated layouts — edi

 ## References

--
+- Rules lifecycle policy: manifest is source of truth; bootstrap/sync regenerate rules from approved intent.
 - Scripts — `.pi/scripts/sentrux-rules-sync.mjs`, `harness-sentrux-bootstrap.mjs`, `harness-sentrux-cli.mjs`
 - Agents — `harness/sentrux-bootstrap` (setup), `harness/sentrux-steward` (intent proposals)
 - Specs — `sentrux-manifest-proposal.schema.json`, `sentrux-signal.schema.json`
@@ -17,7 +17,7 @@ description: Draft or refine harness artifact contracts under .pi/harness/specs.
 2. Edit or add schema under `.pi/harness/specs/`.
 3. Update affected extensions to emit matching custom entries.
 4. Run `node "$UP_PKG/.pi/scripts/harness-verify.mjs"` (see `.pi/scripts/README.md`).
-5. Add or update
+5. Add or update a formal decision record in the target project's standard decision-log location for breaking changes.

 ## Rules

@@ -1,6 +1,6 @@
 ---
 name: harness-steer
-description: Post-review repair loop via harness-steer and executor repair mode
+description: Post-review repair loop via harness-steer and executor repair mode.
 ---

 # harness-steer
@@ -11,4 +11,4 @@ Use after `/harness-review` when `artifacts/review-outcome.yaml` has `remediatio
 2. Set policy phase `execute`; spawn `harness/running/executor` with `mode: repair`.
 3. Always follow with `/harness-review`.

-See `.pi/prompts/harness-steer.md`
+See `.pi/prompts/harness-steer.md` for the steer-loop procedure and guardrails.
@@ -264,7 +264,7 @@ status: complete
 | Medium | ... | ... | ... | ... |

 ## Next Steps
-[What to do with these findings. Suggest
+[What to do with these findings. Suggest a formal decision record update if recommendations are significant.]
 ```

 After filing, update `vault/wiki/index.md` (add to analyses if category exists, or note inline), update `vault/wiki/log.md` (append entry at TOP), and update `vault/wiki/hot.md` (add key findings to Recent Context).
@@ -42,7 +42,7 @@ Run from the **target repo root** (where `.sentrux/rules.toml` lives), or prefer
 | CI / pre-commit | `node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" check` | Exit 0 = pass, 1 = violations |
 | Before agent work | `node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate --save` | Save session baseline |
 | After agent work | `node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate` | Detect degradation vs baseline |
-| Harness run/review capture | `harness-sentrux-report.mjs` + `harness-sentrux-diagnostics.mjs` | Single scan → JSON artifacts
+| Harness run/review capture | `harness-sentrux-report.mjs` + `harness-sentrux-diagnostics.mjs` | Single scan → JSON artifacts |
 | Explore structure | `sentrux` or `sentrux .` | GUI treemap (optional) |

 Typical agent loop:
@@ -77,7 +77,7 @@ Custom TOML outside `# --- harness:managed:start/end ---` is preserved on sync.
 | `harness-verify.mjs` | Runs rules sync and Sentrux checks when rules are present |
 | **observation-bus** | Maps `harness-sentrux-signal` custom entries → evaluator observations |
 | **harness-sentrux-repair** skill | Report/diagnostics scripts + `sentrux-repair-advisor` + repair plan artifact |
-| **harness-eval** | Evaluate phase may require a Sentrux quality signal
+| **harness-eval** | Evaluate phase may require a Sentrux quality signal before promotion |

 High level: **execute** runs one capture (`sentrux-report.json`, `sentrux-diagnostics.json`, signal v1.1.0); **review** may spawn **sentrux-repair-advisor** (Phase 1b); **steer** merges repair plan into `repair-brief.yaml`. No Sentrux Pro or MCP in Pi sessions.

@@ -96,6 +96,6 @@ High level: **execute** runs one capture (`sentrux-report.json`, `sentrux-diagno

 ## References

--
--
+- Quality gate policy: require a structural signal for evaluate/promotion decisions when configured.
+- Rules lifecycle policy: manifest is source of truth; sync rules from manifest after approved intent changes.
 - `CONTRIBUTING.md` — Sentrux quick start
@@ -160,4 +160,4 @@ Diagnostics: `python3 "$UP_PKG/.pi/scripts/harness-web.py" status` (JSON).
 | `HARNESS_WEB_HEURISTIC_ANGLES_FILE` | — | Extra heuristic angles YAML |
 | `HARNESS_WEB_FAST_MODEL` / `EXPANDER` / `QUALITY` | — | Web subagent models |

-
+Internal implementation notes are package-maintainer-only; this skill already contains the external-facing operating guidance.
@@ -7,7 +7,7 @@ max_turns: 16

 You are the **Harness ls-lint Steward** — filesystem **naming intent** governance, not setup or execution.

-**Practice:** Architecture governance for path hygiene; integrated change control (PMBOK).
+**Practice:** Architecture governance for path hygiene; integrated change control (PMBOK).

 ## Mission

@@ -27,7 +27,7 @@ Read `HarnessSpawnContext` (`run_id`, `run_dir`, `plan_packet_path`, `task_summa
 4. Optional: `node "$UP_PKG/.pi/scripts/harness-ls-lint-cli.mjs"` — cite violation messages only; do not rename files.
 5. Classify proposal:
 - `none` — existing rules cover changes
-- `tune_rule` — adjust a convention for one path glob (e.g. regex for
+- `tune_rule` — adjust a convention for one path glob (e.g. regex for decision-record filenames)
 - `add_scoped_rule` — new directory-specific rules
 - `add_ignore` — exclude generated or third-party trees
 - `change_global` — repo-wide default convention change (material)
@@ -38,7 +38,7 @@ Call **`submit_ls_lint_manifest_proposal`** before exit with document matching `

 - `manifest_patch`: JSON Merge Patch against current manifest (minimal diff).
 - `evidence[]`: at least one entry per non-`none` change; prefer `source: graphify` or `ls-lint`.
--
+- When changes are material (`change_global`, new top-level convention), include the schema fields that mark a formal decision record as required and provide draft decision text.
 - `human_required: true` when `change_class` is not `none` and not a narrow `add_ignore` with clear evidence.

 ## Guardrails
@@ -7,7 +7,7 @@ max_turns: 12

 You are the **Harness problem-framing agent (Phase 2a — lakes / scope)**.

-**Inspection role:** Outcome author (lake-sized units, not ticket WBS).
+**Inspection role:** Outcome author (lake-sized units, not ticket WBS).

 ## Mission

@@ -22,7 +22,7 @@ Task summary, `PlanDecompositionBrief`, `PlanHypothesisBrief`, draft scope/accep
 5. **Schedule** — `schedule_metadata.critical_path_work_item_ids` for med/high risk tasks.
 6. **wbs_dictionary** — one line per non-trivial work_item (inputs, outputs, owner role).
 7. **risk_register** — ≥3 risks for med/high with mitigation and trigger.
-8. **sprint_contract** —
+8. **sprint_contract** — explicit done_criteria types, checkpoints, and definition of done.
 9. **Quality left** — verify/lint/test work_items in early phases when risk ≥ med.
 10. **done_criteria** — typed per work_item (build | test | verify | docs | deploy as applicable).

@@ -7,7 +7,7 @@ max_turns: 14

 You are the **Harness planning hypothesis generator (Phase 2b — DARWIN)**.

-**Role:** Approach author after WBS (Lean hypothesis-driven planning). Requires `artifacts/decomposition.yaml`.
+**Role:** Approach author after WBS (Lean hypothesis-driven planning). Requires `artifacts/decomposition.yaml`.

 ## Mission

@@ -5,13 +5,13 @@ thinking: medium
 max_turns: 14
 ---

-**Inspection role:** Inspector (neutral Fagan-style checklist).
+**Inspection role:** Inspector (neutral Fagan-style checklist).

 ## Your task

 Score the ExecutionPlan against Validation Checks for one Review Gate round. Emit stable `checks[]` with ids and messenger-ready `claim_ids`. You are not an advocate for the plan.

-Parent passes `debate_round_focus`: `spec` | `wbs` | `schedule` | `quality`. Use rubric ids
+Parent passes `debate_round_focus`: `spec` | `wbs` | `schedule` | `quality`. Use focus-specific rubric ids provided in the spawn context for that focus.

 ## Process

@@ -5,7 +5,7 @@ description: Lake-first plan synthesis for low/med risk — problem framing, hyp

 # Plan synthesizer

-You produce **lake-sized** outcomes
+You produce **lake-sized** outcomes, not ticket-granularity WBS. Read `artifacts/planning-context.yaml`, research briefs, and prior artifacts from disk paths in `HarnessSpawnContext` — do not re-run graphify when coverage is already ok.

 ## Outputs (all required on disk)

@@ -15,7 +15,7 @@ You produce **lake-sized** outcomes (ADR 0042), not ticket-granularity WBS. Read

 ## Rules

-- Use **`submit_*({ source_path })`** when drafts exist on disk
+- Use **`submit_*({ source_path })`** when drafts exist on disk; otherwise `document`.
 - Do not spawn subprocesses; you are the subprocess.
 - Match schemas under `.pi/harness/specs/`.
 - Parent runs `validate-plan-dag.mjs` after merge into `plan-packet.yaml`.
@@ -5,7 +5,7 @@ thinking: medium
 max_turns: 12
 ---

-**Inspection role:** Recorder / integration PM (round synthesis). Parent is chair.
+**Inspection role:** Recorder / integration PM (round synthesis). Parent is chair.

 ## Your task

@@ -1,22 +1,22 @@
 ---
-description: Plan-phase
+description: Plan-phase sprint contract auditor.
 extensions: false
 thinking: medium
 max_turns: 12
 ---

-**Inspection role:** Definition of Done auditor (sprint contract).
+**Inspection role:** Definition of Done auditor (sprint contract).

 ## Your task

-Audit `execution_plan.sprint_contract` and work_item `done_criteria` against
+Audit `execution_plan.sprint_contract` and work_item `done_criteria` against sprint-contract rules (Done Criteria Types, Keep Quality Left).

 Required when `debate_round_focus` is `quality` or round_index ≥ 4. Optional spot-check on round 2 if done_criteria are sparse.

 ## Process

 1. Read `plan-packet.yaml` execution_plan section and sprint_contract block.
-2. Verify done_criteria types cover: build, test, verify, docs (as applicable
+2. Verify done_criteria types cover: build, test, verify, docs (as applicable).
 3. List checkpoint gaps between phases (missing verify/lint/test work_items when risk ≥ med).
 4. Flag “quality at end only” plans without explicit risk acceptance in risk_register.
 5. Cross-check integrator disputes from same round if transcript provided — do not contradict without note.
@@ -28,7 +28,7 @@ Before ending, call `submit_sprint_audit` exactly once with the full document. P

 ## Guardrails

-- Cite
+- Cite sprint-contract rule ids in rationale fields.
 - Read-only; parent persists artifact.

 Bus label: `SprintContractAuditorAgent`.
@@ -71,7 +71,7 @@ harness-lens may fix indentation on anchored `edit.text` before apply.
 2. **Read** anchored regions you will change.
 3. **Edit** minimally with batched anchored `edit`.

-Never use `replace_symbol`, `rename_symbol`, or similar — use `sg` + anchored edit only
+Never use `replace_symbol`, `rename_symbol`, or similar — use `sg` + anchored edit only.

 ## Post-edit verification (before handoff)

@@ -7,7 +7,7 @@ max_turns: 14

 You are the **Harness Sentrux Repair Advisor** — turn measured structural debt into a bounded repair plan for steer/executor.

-**Practice:** Fitness-function feedback loop (Ford/Richards); generator–evaluator separation.
+**Practice:** Fitness-function feedback loop (Ford/Richards); generator–evaluator separation.

 ## Mission

@@ -7,7 +7,7 @@ max_turns: 16

 You are the **Harness Sentrux Steward** — architectural **intent** governance, not setup or execution.

-**Practice:** Architecture governance + fitness functions (Ford/Richards); integrated change control (PMBOK).
+**Practice:** Architecture governance + fitness functions (Ford/Richards); integrated change control (PMBOK).

 ## Mission

@@ -38,7 +38,7 @@ Call **`submit_sentrux_manifest_proposal`** before exit with document matching `

 - `manifest_patch`: JSON Merge Patch against current manifest (minimal diff).
 - `evidence[]`: at least one entry per non-`none` change; prefer `source: graphify`.
--
+- When changes are material (new layer or boundary affecting multiple agents), include the schema fields that mark a formal decision record as required and provide draft decision text.
 - `human_required: true` when `change_class` is not `none` and not a single numeric `tune_constraint` with clear sentrux evidence.

 ## Guardrails
@@ -1,8 +1,8 @@
 {
 "schema_version": "1.0.0",
 "package": "ultimate-pi",
-"package_version": "0.
-"generated_at": "2026-05-
+"package_version": "0.22.0",
+"generated_at": "2026-05-27T12:07:19.624Z",
 "policy_sha256": "799782453e74a1d2d15a28715c985c1b5dc4566701ddcce475ec4725294437e4",
 "agents": {
 "pi-pi/agent-expert": {
@@ -51,7 +51,7 @@
 },
 "harness/ls-lint-steward": {
 "path": ".pi/agents/harness/ls-lint-steward.md",
-"sha256": "
+"sha256": "abbb43da45a2c9c080cd2043384d481a143acaf440e52b657493425218cb951f"
 },
 "harness/sentrux-bootstrap": {
 "path": ".pi/agents/harness/sentrux-bootstrap.md",
@@ -59,11 +59,11 @@
|
|
|
59
59
|
},
|
|
60
60
|
"harness/sentrux-repair-advisor": {
|
|
61
61
|
"path": ".pi/agents/harness/sentrux-repair-advisor.md",
|
|
62
|
-
"sha256": "
|
|
62
|
+
"sha256": "3c9722806338f680db3d165e633a877e087c1485ad74c2b38781d7ba989b48f0"
|
|
63
63
|
},
|
|
64
64
|
"harness/sentrux-steward": {
|
|
65
65
|
"path": ".pi/agents/harness/sentrux-steward.md",
|
|
66
|
-
"sha256": "
|
|
66
|
+
"sha256": "019740bd4313426edaa36f7fa96471088731abe616282a6e5109e9c538ae34eb"
|
|
67
67
|
},
|
|
68
68
|
"harness/trace-librarian": {
|
|
69
69
|
"path": ".pi/agents/harness/trace-librarian.md",
|
|
@@ -95,7 +95,7 @@
|
|
|
95
95
|
},
|
|
96
96
|
"harness/running/executor": {
|
|
97
97
|
"path": ".pi/agents/harness/running/executor.md",
|
|
98
|
-
"sha256": "
|
|
98
|
+
"sha256": "e8710179def62a9adaa63ba5b05c3f36dee95da6fd751ef34be773bbee65a5c2"
|
|
99
99
|
},
|
|
100
100
|
"harness/reviewing/adversary": {
|
|
101
101
|
"path": ".pi/agents/harness/reviewing/adversary.md",
|
|
@@ -111,19 +111,19 @@
|
|
|
111
111
|
},
|
|
112
112
|
"harness/planning/decompose": {
|
|
113
113
|
"path": ".pi/agents/harness/planning/decompose.md",
|
|
114
|
-
"sha256": "
|
|
114
|
+
"sha256": "35965f8f8eaa19caee72de1b708178f5ce7bba185f25bed28b9b2ff66c51eaed"
|
|
115
115
|
},
|
|
116
116
|
"harness/planning/execution-plan-author": {
|
|
117
117
|
"path": ".pi/agents/harness/planning/execution-plan-author.md",
|
|
118
|
-
"sha256": "
|
|
118
|
+
"sha256": "f0251ac5fb423dda3d6b0b4cff1f63a8e5adfa40806b99454a649c1b0fe3adae"
|
|
119
119
|
},
|
|
120
120
|
"harness/planning/hypothesis-validator": {
|
|
121
121
|
"path": ".pi/agents/harness/planning/hypothesis-validator.md",
|
|
122
|
-
"sha256": "
|
|
122
|
+
"sha256": "70d755da14e146755932c2cb3eb9b828ccd6406b4962d61baeba27c38d9f73dc"
|
|
123
123
|
},
|
|
124
124
|
"harness/planning/hypothesis": {
|
|
125
125
|
"path": ".pi/agents/harness/planning/hypothesis.md",
|
|
126
|
-
"sha256": "
|
|
126
|
+
"sha256": "5dac1020bc1d5a4100150959d52d983a4860210aa9cbda9120214191c2a17f1d"
|
|
127
127
|
},
|
|
128
128
|
"harness/planning/implementation-researcher": {
|
|
129
129
|
"path": ".pi/agents/harness/planning/implementation-researcher.md",
|
|
@@ -131,15 +131,15 @@
|
|
|
131
131
|
},
|
|
132
132
|
"harness/planning/plan-adversary": {
|
|
133
133
|
"path": ".pi/agents/harness/planning/plan-adversary.md",
|
|
134
|
-
"sha256": "
|
|
134
|
+
"sha256": "0c9abc088ced31705598baff143df992d435df751b01a9faad9d4af94df16c5a"
|
|
135
135
|
},
|
|
136
136
|
"harness/planning/plan-evaluator": {
|
|
137
137
|
"path": ".pi/agents/harness/planning/plan-evaluator.md",
|
|
138
|
-
"sha256": "
|
|
138
|
+
"sha256": "f85aba0adbbc7e726a51bfe2d6aa857ab39c8240cd08f06493bc1883ff387c5e"
|
|
139
139
|
},
|
|
140
140
|
"harness/planning/plan-synthesizer": {
|
|
141
141
|
"path": ".pi/agents/harness/planning/plan-synthesizer.md",
|
|
142
|
-
"sha256": "
|
|
142
|
+
"sha256": "3508126385d338b03f583aaa1f5d75f1cd1fcac8559fed52cd7db11ba1205536"
|
|
143
143
|
},
|
|
144
144
|
"harness/planning/planning-context": {
|
|
145
145
|
"path": ".pi/agents/harness/planning/planning-context.md",
|
|
@@ -147,11 +147,11 @@
|
|
|
147
147
|
},
|
|
148
148
|
"harness/planning/review-integrator": {
|
|
149
149
|
"path": ".pi/agents/harness/planning/review-integrator.md",
|
|
150
|
-
"sha256": "
|
|
150
|
+
"sha256": "ea810e3ecf50afe16cf70d12ca3107af94f2272434a72acc5f206bdf7ee89699"
|
|
151
151
|
},
|
|
152
152
|
"harness/planning/sprint-contract-auditor": {
|
|
153
153
|
"path": ".pi/agents/harness/planning/sprint-contract-auditor.md",
|
|
154
|
-
"sha256": "
|
|
154
|
+
"sha256": "615feafa48481af0a94b3d6ea9903d58b532575447f5482e60c65e6eb780bfc0"
|
|
155
155
|
},
|
|
156
156
|
"harness/planning/stack-researcher": {
|
|
157
157
|
"path": ".pi/agents/harness/planning/stack-researcher.md",
|
|
@@ -5,13 +5,13 @@ argument-hint: "\"<task>\" [--risk low|med|high] [--quick]"
 
 # harness-plan
 
-You are the **planning orchestrator
+You are the **planning orchestrator**. Produce an execution baseline (`plan-packet.yaml` + `plan-review.md`) with **lake-sized** outcomes and path-first tools. Parent owns gates: `ask_user`, `approve_plan({ human_summary? })`, `create_plan()`, plan-verify, and scoped writes under `.pi/harness/runs/<run_id>/`.
 
-
+Use the phase order and spawn topology defined in this prompt directly.
 
 Subagents persist artifacts via scoped **`submit_*`** tools (deterministic YAML under the run dir). Parent uses **`harness_artifact_ready`** to gate phases (no JSON parsing). Parent merges still use **`write_harness_yaml`** for `research-brief.yaml`, `plan-packet.yaml`, `planning-context.yaml`, and integrator patches.
 
-**Phase 0 is mandatory** before reconnaissance or any planning subagent. `write_harness_yaml` and spawn topology enforce `artifacts/task-clarification.yaml` with `status: ready
+**Phase 0 is mandatory** before reconnaissance or any planning subagent. `write_harness_yaml` and spawn topology enforce `artifacts/task-clarification.yaml` with `status: ready`.
 
 ## Allowed subagents
 
@@ -34,7 +34,7 @@ Read **harness-debate-plan** skill before Review Gate rounds.
 
 1. Parallel `tasks` only for **independent** merges (implementation ∥ stack research; plan-evaluator ∥ plan-adversary for `parallel_probes`). **Never** parallelize decompose ∥ hypothesis.
 2. Max **2** research lanes, **1** debate agent, **1** optional `planning-context` subagent per `subagent` call.
-3. Downstream agents **read** upstream artifacts — do not re-derive
+3. Downstream agents **read** upstream artifacts — do not re-derive upstream work.
 
 ## Performance rules
 
@@ -57,7 +57,7 @@ Use `[HarnessActivePlan]` / `[HarnessRunContext]` only. On revise: preserve `pla
 
 ## Phase 0 — Task clarification (mandatory; parent-led)
 
-**Practice:** Collect requirements
+**Practice:** Collect requirements and shared meaning before WBS (PMBOK; Crucial Conversations).
 
 **Goal:** `artifacts/task-clarification.yaml` with `status: ready`, `unresolved_questions: []`, and a canonical `clarified_task`. No full planning until gated.
 
@@ -121,7 +121,7 @@ Decompose treats **`task-clarification.yaml` as authoritative** for scope; §1.1
 
 ## Phase 2b — Hypothesis-driven approach (sequential)
 
-**Practice:** Lean exploration — falsifiable claim before plan detail
+**Practice:** Lean exploration — require a falsifiable claim before plan detail.
 
 **Requires** `artifacts/decomposition.yaml`. Do **not** spawn in parallel with decompose.
 
@@ -262,7 +262,7 @@ Med/low non-fork plans with clear stack and no implementation `open_questions` d
 
 ## Phase 5 — Structured inspection / Review Gate (Fagan-style)
 
-**Practice:** Code Complete collaborative construction
+**Practice:** Code Complete collaborative construction with Fagan-style inspection criteria. Parent is **chair**; one debate agent per `subagent` batch.
 
 **Forbidden:** parallel `subagent` calls for any debate lane agent in one batch.
 
@@ -7,7 +7,7 @@ argument-hint: "[--run <run-id>] [--quick] [--readonly] [--trace <trace-ref>]"
 
 You are the **post-run verification PM** (PMBOK Monitoring and Controlling). Run measure → judge → red team in one command. Parent owns `ask_user`, deterministic scripts, `harness_artifact_ready`, and run ownership (`--claim` on resume). Subagents persist via **`submit_*`** only (no parent `write` to verdict artifacts).
 
-
+Follow the review sequence in this prompt directly: deterministic checks → benchmark evaluator → verdict evaluator → adversary → optional tie-breaker.
 
 Read **harness-orchestration** and **harness-review** skills before spawning.
 
@@ -91,7 +91,7 @@ notes: "…"
 
 ## Phase 1b — Sentrux repair advisor (subagent)
 
-**Practice:** Close the loop from fitness-function observation
+**Practice:** Close the loop from fitness-function observation to bounded repair directives. Skip when `artifacts/sentrux-repair-plan.yaml` already exists and `HARNESS_SENTRUX_RESCAN` is unset.
 
 Spawn when **any**:
 
@@ -131,7 +131,7 @@ Gate:
 harness_artifact_ready({ paths: ["artifacts/eval-verdict.yaml"] })
 ```
 
-**Do not stop** after benchmark fail — continue to verdict (and adversary per tier) so `review-outcome.yaml` can route steer vs replan
+**Do not stop** after benchmark fail — continue to verdict (and adversary per tier) so `review-outcome.yaml` can route steer vs replan.
 
 ## Phase 3 — Policy / quality audit (verdict evaluator)
 
@@ -153,7 +153,7 @@ Gate again with `harness_artifact_ready`.
 
 ## Phase 4 — Independent red team (adversary)
 
-**Practice:** Generator–evaluator separation; adversary distinct from measurer
+**Practice:** Generator–evaluator separation; adversary stays distinct from the measurer.
 
 Skip when `--quick`. **Tiered steer:** full adversary on initial run + steer attempt 1; lite review (no adversary) on steer attempts 2+ unless prior `block_merge`.
 
@@ -185,7 +185,7 @@ subagent({ agentScope: "both", agent: "harness/reviewing/tie-breaker", task: "
 
 - **Never** parse subprocess JSON to write `eval-verdict.yaml` or `adversary-report.yaml` — use `submit_*` + `harness_artifact_ready` only.
 - Do not edit `plan-packet.yaml`.
-- Do not run inline review checks in this session (
+- Do not run inline review checks in this session (keep review work isolated to subagents).
 - Same Pi session as `/harness-run` is preferred; `--claim` makes cross-session resume work.
 
 ## Phase 6 — Review outcome + repair brief (parent)
@@ -4,7 +4,7 @@ description: Execute only against an approved PlanPacket with strict phase gates
 
 # harness-run
 
-
+Follow this prompt's execution flow directly: baseline gate → executor spawn → structural observation → review handoff.
 
 You orchestrate the **Executing Process Group** — spawn `harness/running/executor` only. Do **not** implement inline.
 
@@ -106,7 +106,7 @@ phase: execute
 
 ## Parent rules
 
-- On `scope_drift`, finish handoff and recommend **`/harness-review`** (review classifies
+- On `scope_drift`, finish handoff and recommend **`/harness-review`** (review classifies whether the gap is planning or implementation).
 - Do not call `ask_user` for plan-level ambiguity — return to plan command.
 
 ## Completion
@@ -39,14 +39,14 @@ Gate: `harness_artifact_ready({ paths: ["artifacts/sentrux-manifest-proposal.yam
 Read `artifacts/sentrux-manifest-proposal.yaml`.
 
 - `change_class: none` → report no manifest change; stop.
-- Otherwise → `ask_user` with summary, evidence bullets, and
+- Otherwise → `ask_user` with summary, evidence bullets, and any draft decision text when a formal decision record is required.
 
 On approval:
 
 1. Apply `manifest_patch` to `.pi/harness/sentrux/architecture.manifest.json` (parent `write` or manual edit).
 2. `node "$UP_PKG/.pi/scripts/harness-sentrux-bootstrap.mjs" --force`
 3. Append session custom entry `harness-architecture-changed` (triggers rules sync extension).
-4. If
+4. If a formal decision record is required, file it in the target project's standard decision-log location.
 
 On reject: keep manifest unchanged; document decision in run notes.
 
@@ -289,7 +289,7 @@ Quick smoke test:
 sg -p 'function $NAME($$$ARGS) { $$$BODY }' --json 2>/dev/null | head -5 && echo "✓ ast-grep pattern matching works" || echo "! ast-grep smoke test — may need language-specific config"
 ```
 
-### 2.7 — gh CLI (GitHub Issues Spec Storage
+### 2.7 — gh CLI (GitHub Issues Spec Storage)
 
 ```bash
 if ! command -v gh &>/dev/null || [ "$FORCE" = "true" ]; then
@@ -335,7 +335,7 @@ Installed and smoke-tested by `harness-cli-verify.sh` (`npm install -g @ls-lint/
 
 ## Step 3 — Pi Extension Packages
 
-Bundled extensions load from the installed `ultimate-pi` package. The harness lens wrapper at `.pi/extensions/harness-lens.ts` loads `.pi/extensions/lib/harness-lens/` for edit autopatch, secrets blocking, deferred format, and LSP tools. Structural search uses shell `sg` (installed globally by setup); architecture gates use Sentrux.
+Bundled extensions load from the installed `ultimate-pi` package. The harness lens wrapper at `.pi/extensions/harness-lens.ts` loads `.pi/extensions/lib/harness-lens/` for edit autopatch, secrets blocking, deferred format, and LSP tools. Structural search uses shell `sg` (installed globally by setup); architecture gates use Sentrux.
 
 Harness lens findings are **complementary** to Sentrux:
 
@@ -471,7 +471,7 @@ Harness `ask_user` supports terminal (TUI), headless (CI), and Glimpse WebView (
 | **Desktop Linux / macOS / WSLg** | `auto` or `glimpse` for richer questionnaires |
 | **CI / `--non-interactive`** | Prompts skipped; do not expect WebView |
 
-Append `HARNESS_ASK_USER_UI=tui` to `.env` when WebView is unavailable. The first real `ask_user` reports `ui_backend` and `ui_degraded` in tool details.
+Append `HARNESS_ASK_USER_UI=tui` to `.env` when WebView is unavailable. The first real `ask_user` reports `ui_backend` and `ui_degraded` in tool details.
 
 Template keys (placeholders — user fills secrets): `HARNESS_TELEMETRY_ENABLED`, `HARNESS_WEB_*`, `HARNESS_VCC_COMPACTION`, `HARNESS_VCC_DEBUG`, plus commented optional PostHog / Graphify vars.
 
@@ -5,7 +5,7 @@ argument-hint: "[--attempt N]"
 
 # harness-steer
 
-Thin orchestrator for the **steer loop
+Thin orchestrator for the **steer loop**. Run only after `/harness-review` produced `artifacts/review-outcome.yaml` and `artifacts/repair-brief.yaml` with `remediation_class: implementation_gap`.
 
 ## Preconditions
 
@@ -19,10 +19,10 @@ Thin orchestrator for the **steer loop** (ADR 0044). Run only after `/harness-re
 2. Update `artifacts/steer-state.yaml` (`attempt`, `max_attempts`, `active: true`).
 3. Set policy phase to **execute** before spawning executor (required for mutating tools).
 4. One `ask_user` steer gate unless `run-context.steer_approved` is already true.
-5. Spawn **`harness/running/executor`** with `HarnessSpawnContext.mode: repair` and `repair_brief_path: artifacts/repair-brief.yaml`. Repair uses the same hash-anchored `read`/`edit`, batching, and pre-handoff verification rules as `/harness-run
-6. Optional: `node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate --save` after repair to refresh
-7. Optional: `node "$UP_PKG/.pi/scripts/harness-ls-lint-cli.mjs"` after repair to confirm filename conventions
-7. `next_command`: **`/harness-review`** (always re-verify; tiered adversary on attempts 2+
+5. Spawn **`harness/running/executor`** with `HarnessSpawnContext.mode: repair` and `repair_brief_path: artifacts/repair-brief.yaml`. Repair uses the same hash-anchored `read`/`edit`, batching, and pre-handoff verification rules as `/harness-run`.
+6. Optional: `node "$UP_PKG/.pi/scripts/harness-sentrux-cli.mjs" gate --save` after repair to refresh the structural baseline.
+7. Optional: `node "$UP_PKG/.pi/scripts/harness-ls-lint-cli.mjs"` after repair to confirm filename conventions.
+7. `next_command`: **`/harness-review`** (always re-verify; use tiered adversary on attempts 2+).
 
 ## Forbidden
 
@@ -158,6 +158,34 @@ async function runNodeScript(scriptPath, args = []) {
 }
 
 const PROMPT_EXCLUDE = new Set(["release.md"]);
+const INTERNAL_PROMPT_SURFACE_ROOTS = [
+  {
+    label: ".pi/prompts",
+    dir: join(ROOT, ".pi", "prompts"),
+    recursive: false,
+    include: (name) => name.endsWith(".md"),
+  },
+  {
+    label: ".pi/agents",
+    dir: join(ROOT, ".pi", "agents"),
+    recursive: true,
+    include: (name) => name.endsWith(".md"),
+  },
+  {
+    label: ".agents/skills",
+    dir: join(ROOT, ".agents", "skills"),
+    recursive: true,
+    include: (name) => name === "SKILL.md",
+  },
+];
+
+const FORBIDDEN_INTERNAL_PROMPT_REFS = [
+  { label: "ADR token", regex: /\bADR\b/i },
+  { label: "internal ADR path", regex: /(?:^|\W)(?:docs\/adr|\.pi\/harness\/docs\/adrs)(?:\W|$)/i },
+  { label: "internal practice-map path", regex: /(?:^|\W)(?:\.pi\/harness\/docs\/practice-map\.md|practice-map)(?:\W|$)/i },
+  { label: "internal planning rubrics path", regex: /(?:^|\W)(?:\.pi\/harness\/docs\/planning-rubrics\.md|planning-rubrics)(?:\W|$)/i },
+  { label: "internal docs path", regex: /(?:^|\W)\.pi\/harness\/docs\//i },
+];
 
 function parsePromptFrontmatter(raw) {
   const match = raw.match(/^---\r?\n([\s\S]*?)\r?\n---/);
@@ -179,6 +207,50 @@ function parsePromptFrontmatter(raw) {
   return fields;
 }
 
+function relPath(path) {
+  if (path.startsWith(`${ROOT}/`)) return path.slice(ROOT.length + 1);
+  return path;
+}
+
+async function collectMarkdownFiles(dir, { recursive, include }) {
+  const out = [];
+  const entries = await readdir(dir, { withFileTypes: true });
+  for (const entry of entries) {
+    const fullPath = join(dir, entry.name);
+    if (entry.isDirectory()) {
+      if (recursive) {
+        out.push(...(await collectMarkdownFiles(fullPath, { recursive, include })));
+      }
+      continue;
+    }
+    if (!entry.isFile()) continue;
+    if (!entry.name.endsWith(".md")) continue;
+    if (include && !include(entry.name, fullPath)) continue;
+    out.push(fullPath);
+  }
+  return out;
+}
+
+async function checkInternalPromptReferencePolicy() {
+  for (const root of INTERNAL_PROMPT_SURFACE_ROOTS) {
+    if (!(await fileExists(root.dir))) continue;
+    const files = await collectMarkdownFiles(root.dir, {
+      recursive: root.recursive,
+      include: root.include,
+    });
+    for (const file of files) {
+      const raw = await readFile(file, "utf-8");
+      for (const rule of FORBIDDEN_INTERNAL_PROMPT_REFS) {
+        if (rule.regex.test(raw)) {
+          fail(
+            `internal prompt/agent/skill policy: ${relPath(file)} contains forbidden reference (${rule.label})`,
+          );
+        }
+      }
+    }
+    ok(`internal prompt-surface reference policy (${root.label})`);
+  }
+}
 async function checkPromptFrontmatter() {
   const promptsDir = join(ROOT, ".pi", "prompts");
   const names = await readdir(promptsDir);
@@ -596,6 +668,7 @@ async function main() {
   await verifySchemaAdrAndExtensions();
   await verifyCoreSurfaceFiles();
   await checkPromptFrontmatter();
+  await checkInternalPromptReferencePolicy();
   const pkgJson = JSON.parse(await readFile(join(ROOT, "package.json"), "utf-8"));
   await checkHarnessLens(pkgJson);
   await checkHarnessAnchoredEdit(pkgJson);
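The reference-policy check added in the hunks above comes down to testing each file's text against a list of labelled regular expressions. The sketch below isolates that step so it can be tried on a string without touching the file system. It reuses two of the regexes shown in the diff; `findForbiddenRefs` and `SAMPLE_RULES` are hypothetical names for this illustration and do not exist in the package.

```javascript
// Hypothetical, standalone illustration of the rule-matching step.
// The two regexes are copied from FORBIDDEN_INTERNAL_PROMPT_REFS in the diff.
const SAMPLE_RULES = [
  { label: "ADR token", regex: /\bADR\b/i },
  { label: "internal docs path", regex: /(?:^|\W)\.pi\/harness\/docs\//i },
];

// Returns the labels of every rule whose regex matches the given text.
function findForbiddenRefs(text, rules = SAMPLE_RULES) {
  return rules.filter((rule) => rule.regex.test(text)).map((rule) => rule.label);
}

const oldSteerLine = "Thin orchestrator for the **steer loop** (ADR 0044).";
const newSteerLine = "Thin orchestrator for the **steer loop**.";
const docsLine = "See .pi/harness/docs/practice-map.md for background.";

const oldHits = findForbiddenRefs(oldSteerLine);
const newHits = findForbiddenRefs(newSteerLine);
const docsHits = findForbiddenRefs(docsLine);
```

Because none of the patterns carry the `g` flag, `RegExp.prototype.test` keeps no state between calls, so the same rule objects can be reused across every file. The `i` flag also means a standalone lowercase `adr` would be flagged, while words that merely contain those letters would not, thanks to the `\b` boundaries.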
package/AGENTS.md
CHANGED
@@ -33,6 +33,7 @@ Created: 2026-05-14
 - ./raw/ is source storage for graphify
 - ADRs in docs/adr/ (repo) and .pi/harness/docs/adrs/ (harness) with structured format
 - `node "$UP_PKG/.pi/scripts/harness-verify.mjs"` for deterministic harness contract checks (`UP_PKG` — see `.pi/scripts/README.md`)
+- Internal prompt surfaces only (`.pi/prompts/**`, `.pi/agents/**`, `.agents/skills/*/SKILL.md`): do not reference ADRs or internal-doc paths; write intended behavior directly. `harness-verify` enforces this policy.
 - Harness context: **context-mode only** — never lean-ctx on harness paths (see harness-context skill)
 - `graphify update .` after significant code changes
 - ast-grep (`sg`) is the default code search tool — use `sg -p 'pattern'` for structural search, never grep for code
package/CHANGELOG.md
CHANGED
package/package.json
CHANGED
@@ -1,20 +1,23 @@
 {
   "name": "ultimate-pi",
-  "version": "0.22.
-  "description": "
+  "version": "0.22.1",
+  "description": "Governed AI coding harness for pi.dev — bootstrap, plan, execute, review, and steer with deterministic policy gates",
   "keywords": [
     "pi-package",
     "pi-mono",
     "pi",
     "ai-harness",
+    "agentic-harness",
     "coding-agent",
-    "
-    "
-    "
+    "governed-workflow",
+    "plan-execute-review",
+    "policy-gates",
     "agent-skills",
-    "
+    "graphify",
     "harness-web",
+    "scrapling",
     "context-mode",
+    "sentrux",
     "vcc"
   ],
   "license": "MIT",