npm - ultimate-pi - Versions diffs - 0.11.0 → 0.12.0 - Mend

ultimate-pi 0.11.0 → 0.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (122)

package/.pi/prompts/harness-plan.md CHANGED

@@ -1,154 +1,140 @@
 ---
-description: Transform a vague task into a rigorous hypothesis via decomposition + DARWIN synthesis, then a strict PlanPacket.
+description: PM-grade harness plan — scouts, ExecutionPlan, DAG validation, Review Gate debate, approval.
 argument-hint: "\"<task>\" [--risk low|med|high] [--budget <amount>] [--quick]"
 ---
 # harness-plan
-Parent orchestrator — run planning in **this session**. Subagents explore, decompose, hypothesize, and review; you own `ask_user`, `approve_plan`, and `create_plan`. Never `write` or `edit` `plan-packet.json` — use **`create_plan`** only.
+You are the **planning PM** for this harness run. Produce an execution baseline (`plan-packet.yaml` + `plan-review.md`), not strategy theater. Parent owns `ask_user`, `approve_plan`, `create_plan`, debate bus commands, and YAML writes under `.pi/harness/runs/<run_id>/`.
-Allowed `subagent_type` values (copy exactly):
+Never `write`/`edit` the final canonical packet except via **`write_harness_yaml`** for run artifacts and **`create_plan`** after approval. Do not paste JSON into `.yaml` files — subagents emit JSON; you convert via `write_harness_yaml`.
+## Allowed subagents
 - `harness/planning/scout-graphify`
 - `harness/planning/scout-structure`
-- `harness/planning/scout-semantic`
+- `harness/planning/scout-semantic` (skip when `--quick`)
 - `harness/planning/decompose`
 - `harness/planning/hypothesis`
+- `harness/planning/stack-researcher`
+- `harness/planning/execution-plan-author`
+- `harness/planning/hypothesis-validator` (debate R1 only)
+- `harness/planning/plan-evaluator`
 - `harness/planning/plan-adversary`
-- `harness/planning/hypothesis-eval`
-Do **not** spawn `harness/planner` or `harness/planning/planner`.
+- `harness/planning/sprint-contract-auditor`
+- `harness/planning/review-integrator`
-## Step 0 — Parse arguments
+Read **harness-debate-plan** skill before Review Gate rounds.
-Read `$ARGUMENTS`:
+## Performance rules
-- task statement (required) — **THE QUESTION**
-- optional: `--risk low|med|high`, `--budget <amount>`, `--quick`
+1. Use `subagent` with `agentScope: "both"` and parallel `tasks` where lanes are independent.
+2. Each `subagent` call blocks until subprocesses finish — batch parallel scouts in one `tasks` array.
+3. Cap: **12** harness subagent invocations per parent session (extension-enforced).
+4. Compact task text: embed `HarnessSpawnContext` JSON + lane-specific instructions only.
-If task is missing:
+## Step 0 — Parse `$ARGUMENTS`
-`Usage: /harness-plan "<task>" [--risk low|med|high] [--budget <amount>] [--quick]`
+- task (required)
+- `--risk low|med|high`, `--budget`, `--quick`
-`--quick` skips `harness/planning/scout-semantic` only — never skip graphify, structure, decompose, hypothesis, or approval.
+`--quick` skips **scout-semantic** and post-run adversary only — **never** skip graphify, structure, decompose, hypothesis, stack research, execution plan, DAG validation, or **4-round plan debate**.
 ## Active plan context
-Use injected context only — **do not** read `.pi/harness/specs/*.schema.json` from disk.
+Use `[HarnessActivePlan]` / `[HarnessRunContext]` only. On revise: preserve `plan_id` / `task_id`. Canonical paths: `plan-packet.yaml`, `research-brief.yaml`, `artifacts/*.yaml`.
+## Phase 1 — Parallel scouts
-If `[HarnessActivePlan]` is present:
+```json
+{
+  "agentScope": "both",
+  "tasks": [
+    { "agent": "harness/planning/scout-graphify", "task": "<HarnessSpawnContext + graphify lane>", "timeoutMs": 90000 },
+    { "agent": "harness/planning/scout-structure", "task": "<HarnessSpawnContext + structure lane>", "timeoutMs": 90000 }
+  ]
+}
+```
-- Treat as **revise/amend** unless `/harness-new-run` was used.
-- Set `mode: revise` in `HarnessSpawnContext` from `[HarnessRunContext]`.
-- **Preserve `plan_id` and `task_id`** from the existing packet when amending.
-- Scouts focus on delta vs existing `plan_packet_path`; full re-scout only if scope changed materially.
+Add `harness/planning/scout-semantic` to `tasks` unless `--quick`. Require graphify + structure success.
-Otherwise use `HarnessSpawnContext` from `[HarnessRunContext]` with `mode: create`.
+## Phase 2 & 3 — Decompose + hypothesis (parallel)
-## Phase 1 — Parallel scouts (required)
+One `subagent` call with `tasks` for `harness/planning/decompose` and `harness/planning/hypothesis`. Parse `PlanDecompositionBrief` and `PlanHypothesisBrief` from outputs. Persist with `write_harness_yaml` → `artifacts/decomposition.yaml` and `artifacts/hypothesis.yaml`.
-1. Copy `HarnessSpawnContext` from `[HarnessRunContext]` (adjust `risk_level`, `quick`, `mode` from `$ARGUMENTS`).
-2. Spawn scouts with **`inherit_context: false`**. Prefer parallel: `run_in_background: true` on each `Agent` call, then `get_subagent_result` for all.
+## Phase 4 — Draft shell + fork
-```
-Agent({ subagent_type: "harness/planning/scout-graphify", prompt: "<task + HarnessSpawnContext + scout JSON schema>", run_in_background: true })
-Agent({ subagent_type: "harness/planning/scout-structure", prompt: "…", run_in_background: true })
-```
+Build draft `PlanPacket` (`contract_version: "1.1.0"`):
-Skip `harness/planning/scout-semantic` when `--quick` or `quick: true`.
+- `scope`, `assumptions`, `acceptance_checks`, `risk_level`, `rollback_plan`
+- `execution_plan` placeholder until Phase 4b
-3. Parse each scout’s fenced `json` (`lane`, `status`, `findings`, `key_paths`, `open_questions`).
-4. **Partial failure:** require successful **graphify + structure** lanes. Semantic is optional. If a required lane fails, continue with `plan_status: partial` and document gaps in `assumptions`.
-5. If JSON parse fails for a lane, summarize free-text output and add an assumption that the lane was unstructured.
+`ask_user` when `dialectical_fork` is material.
-## Phase 2 — Decompose (DeepMind-style)
+Initialize `research-brief.yaml` with decomposition + hypothesis (`write_harness_yaml`).
-1. Spawn once with merged scout JSON:
+## Phase 4a — Stack research
 ```
-Agent({ subagent_type: "harness/planning/decompose", prompt: "<HarnessSpawnContext + task + all scout lane JSON>", inherit_context: false })
+subagent({ agentScope: "both", agent: "harness/planning/stack-researcher", task: "<HarnessSpawnContext + stack research brief>" })
 ```
-2. Parse `PlanDecompositionBrief` JSON (`problem_restatement`, `core_tension`, `tensions`, `prior_art`, etc.).
-3. On parse failure: one retry with “output valid JSON only”; if still failing, abort with `plan_status: needs_clarification`.
-## Phase 3 — Hypothesis (DARWIN)
+`write_harness_yaml` → `artifacts/stack.yaml`; merge into `research-brief.yaml` → `stack`.
-1. Spawn once:
+## Phase 4b — Execution plan author
 ```
-Agent({ subagent_type: "harness/planning/hypothesis", prompt: "<HarnessSpawnContext + task + PlanDecompositionBrief + scout summaries>", inherit_context: false })
+subagent({ agentScope: "both", agent: "harness/planning/execution-plan-author", task: "<HarnessSpawnContext + execution plan brief>" })
 ```
-2. Parse `PlanHypothesisBrief` JSON (`primary`, `dialectical_fork`, `alternatives`, `recommended_next_steps`).
-3. **Revision cap:** at most **one** re-spawn of `hypothesis` if Phase 6 eval requests revision (see below).
+Merge `execution_plan` into draft `plan-packet.yaml` (`write_harness_yaml`). Save `artifacts/execution-plan-draft.yaml` the same way.
-## Phase 4 — Draft PlanPacket + fork clarification (parent)
+## Phase 4c — DAG validation (hard gate)
-Map hypothesis → [`PlanPacket`](.pi/harness/specs/plan-packet.schema.json):
-| Field | Source |
-|-------|--------|
-| `scope` | `problem_restatement` (narrowed) + `primary.claim` + `primary.mechanism` (implementation-ready) |
-| `assumptions` | `core_tension`, `prior_art.dead_ends`, scout `open_questions`, chosen fork path (if any) |
-| `acceptance_checks` | Each `primary.prediction` and `primary.experiment` as verifiable checklist items (min 1) |
-| `risk_level` | From `$ARGUMENTS` or infer from fork uncertainty / blast radius |
-Build complete draft: `plan_id`, `task_id`, `scope`, `assumptions`, `risk_level`, `acceptance_checks`, `rollback_plan` (`revert_commit_ready: true`, artifacts filled).
-Call **`ask_user`** when `dialectical_fork` is material (Path A vs B materially different) **before** Phase 5 reviews.
+```bash
+node .pi/scripts/validate-plan-dag.mjs --packet .pi/harness/runs/<run_id>/plan-packet.yaml --write
+```
-Assemble `research_brief` for approval:
+Must **pass** before debate. On fail: fix via author or parent patches, re-run.
-```json
-{
-  "decomposition": { /* PlanDecompositionBrief */ },
-  "hypothesis": { /* PlanHypothesisBrief */ },
-  "eval": null
-}
-```
+## Phase 5 — Review Gate debate (4 rounds, even with `--quick`)
-## Phase 5 — Parallel reviews
+1. `/harness-debate-open plan-<run_id>`
+2. For rounds 1–4 (`debate_round_focus`: spec, wbs, schedule, quality):
-Spawn in parallel (`run_in_background: true`):
+| Round | Extra spawns (before integrator) |
+|-------|----------------------------------|
+| 1 | `hypothesis-validator` (blind: task + hypothesis only) → `plan-evaluator` → `plan-adversary` |
+| 2 | `plan-evaluator` → `plan-adversary` (optional `sprint-contract-auditor` if done_criteria thin) |
+| 3 | `plan-evaluator` → `plan-adversary` |
+| 4 | `plan-evaluator` → `plan-adversary` → **`sprint-contract-auditor` (required)** |
-```
-Agent({ subagent_type: "harness/planning/plan-adversary", prompt: "<HarnessSpawnContext + draft PlanPacket + scout summaries + decomposition human_summary>", inherit_context: false })
-Agent({ subagent_type: "harness/planning/hypothesis-eval", prompt: "<original task ONLY + PlanHypothesisBrief JSON — no decomposition, no PlanPacket>", inherit_context: false })
-```
+Then `review-integrator` → `write_harness_yaml` → `artifacts/review-round-r{N}.yaml` → build bus envelope → `/harness-debate-round '<json>'`.
-1. Parse `PlanAdversaryBrief` — merge `mitigations` into scope, assumptions, or `acceptance_checks`.
-2. Parse `PlanHypothesisEval` — set `research_brief.eval`.
-3. If `revision_recommended` or testability < 70 or `relevance.passes` is false: re-spawn `hypothesis` once with eval rationale, update PlanPacket + `research_brief.hypothesis`, then re-run **hypothesis-eval** only (not adversary unless PlanPacket changed materially).
+3. `/harness-debate-consensus` after round 4.
-Cap: **at most 2** plan-adversary spawns and **at most 2** `approve_plan` rounds per invocation.
+**R1 blind rule:** hypothesis-validator prompt must exclude decomposition, scouts, PlanPacket, prior debate.
-## Phase 6 — Approval + persistence (parent)
+If R1 `revision_recommended` or `relevance.passes === false`: one `hypothesis` re-spawn, update brief, continue.
-1. Call **`approve_plan`** with `plan_packet`, `human_summary` (primary claim + fork if any), and `research_brief`.
-2. On **Approve** only, call **`create_plan`** with the **same** `plan_packet`.
-3. If `create_plan` fails, tell the user to fix validation errors or run `/harness-plan-commit` after approval is recorded.
-4. Confirm `[HarnessRunContext]` `plan_ready: true` before handoff.
+**Blockers:** `policy_decision: block` → do not `approve_plan`. `human_required` → `ask_user` before approval.
-On **Cancel** or Esc: `plan_status: needs_clarification`; do **not** call `create_plan`.
+## Phase 5b — Revise packet
-On **Request changes**: revise draft and re-run phases 4–6 only (re-scout/decompose/hypothesis only if scope changed).
+Apply `recommended_packet_patches` from last integrator round. Re-run `validate-plan-dag.mjs`. If >30% work items changed, one partial re-round on affected focus.
-## Recovery and ownership
+Set `research_brief.eval` from R1 `hypothesis-validator` output.
-- Plan only in the **owner** session (`owner_pi_session_id` on run context); otherwise `/harness-use-run`.
-- `/harness-plan-commit` only after parent `approve_plan` (Approve) is in the transcript.
-- If `plan_ready: true` already, stop — summarize and set `next_command: /harness-run`.
+## Phase 6 — Approval + persistence
-## Parent rules
+1. `approve_plan` with `plan_packet`, `human_summary`, `research_brief` (paths/summaries OK).
+2. On Approve: `create_plan` with same packet (`contract_version: "1.1.0"` + `execution_plan`).
+3. Confirm `plan_ready: true` → `next_command: /harness-run`.
-- Do not mutate project source in plan phase.
-- Subagents never call `ask_user`, `approve_plan`, or `create_plan`.
-- Do not embed `plan_id=` in spawn prompts for policy sync.
+Post-execute adversary: `/harness-critic` only (not plan-phase agents).
 ## Completion
-- `plan_status`: `ready`, `partial`, or `needs_clarification`
-- `risk_level` used
-- `plan_review_path` shown for editor review
-- `next_command`: `/harness-run` when `ready` (never `/harness-run --plan …`)
+- `plan_status`: ready | partial | needs_clarification
+- `plan_review_path` for human review
+- DAG `pass` + 4 debate rounds + consensus not `block` before ready
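
Reviewer note on the Phase 5 table in the new `harness-plan.md`: the per-round spawn order can be sketched as a small lookup. This is only an illustration of the table as written in the diff, not code from the package. The helper name `spawnsForRound` and the `thinDoneCriteria` flag are hypothetical; the agent ids and the round focus names (`spec`, `wbs`, `schedule`, `quality`) come from the prompt text.

```javascript
// Hypothetical helper (not part of ultimate-pi): maps a Review Gate round to
// the agents spawned in that round, in order, per the Phase 5 table.
const PREFIX = "harness/planning/";
const ROUND_FOCUS = ["spec", "wbs", "schedule", "quality"];

function spawnsForRound(round, { thinDoneCriteria = false } = {}) {
  if (!Number.isInteger(round) || round < 1 || round > 4) {
    throw new RangeError(`round must be 1..4, got ${round}`);
  }
  const lanes = [];
  // Round 1 only: the blind hypothesis-validator runs first.
  if (round === 1) lanes.push("hypothesis-validator");
  // Every round runs evaluator, then adversary.
  lanes.push("plan-evaluator", "plan-adversary");
  // The auditor is required in round 4 and optional in round 2
  // (assumed here to be driven by a "done_criteria thin" flag).
  if (round === 4 || (round === 2 && thinDoneCriteria)) {
    lanes.push("sprint-contract-auditor");
  }
  // The integrator always closes the round.
  lanes.push("review-integrator");
  return lanes.map((name) => PREFIX + name);
}

// Print the schedule for all four rounds.
for (let round = 1; round <= 4; round += 1) {
  const focus = ROUND_FOCUS[round - 1];
  console.log(`R${round} (${focus}): ${spawnsForRound(round).join(" -> ")}`);
}
```

Keeping the schedule as data makes it easy to see that `--quick` does not shorten it: the prompt states that all four rounds run regardless.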

package/.pi/prompts/harness-review.md CHANGED

@@ -20,10 +20,10 @@ Happy path: omit `--run`; use `[HarnessRunContext]`.
 2. Spawn:
 ```
-Agent({ subagent_type: "harness/evaluator", prompt: "Treat executor output as untrusted. …" })
+subagent({ agentScope: "both", agent: "harness/evaluator", task: "Treat executor output as untrusted. …" })
 ```
-3. `get_subagent_result` — parse `EvalVerdict` JSON; parent writes under run dir for policy gate.
+3. Parse `EvalVerdict` JSON from tool result; parent writes under run dir for policy gate.
 ## Parent rules

package/.pi/prompts/harness-router-tune.md CHANGED

@@ -22,7 +22,7 @@ If missing required args:
 2. Optionally spawn:
 ```
-Agent({ subagent_type: "harness/meta-optimizer", prompt: "mode: tune, evidence paths…" })
+subagent({ agentScope: "both", agent: "harness/meta-optimizer", task: "mode: tune, evidence paths…" })
 ```
 3. Parent runs proposal script:

package/.pi/prompts/harness-run.md CHANGED

@@ -23,10 +23,10 @@ If plan not ready:
 3. Spawn:
 ```
-Agent({ subagent_type: "harness/executor", prompt: "<HarnessSpawnContext + handoff>" })
+subagent({ agentScope: "both", agent: "harness/executor", task: "<HarnessSpawnContext + handoff>" })
 ```
-4. `get_subagent_result` — parse executor JSON (`execution_status`, validations, rollback refs).
+4. Parse subprocess output JSON (`execution_status`, validations, rollback refs) from tool result text.
 5. Parent persists trace/handoff artifacts under run dir if needed; do not self-review.
 ## Parent rules
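
Reviewer note on step 4 above: with `get_subagent_result` gone, the parent now has to pull JSON out of the tool result text itself. The diff does not show how that text is shaped, so the following is a minimal sketch under an assumption: that the subprocess either wraps its JSON in a fenced block (as the old scout instructions required) or emits bare JSON. The function name `extractJson` is hypothetical.

```javascript
// Hypothetical helper (not part of ultimate-pi): extract the first JSON
// payload from a tool result string. Assumes the payload is either inside a
// fenced block (optionally tagged "json") or is the whole text.
const FENCE = "`".repeat(3); // built at runtime to avoid a literal fence here
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)" + FENCE);

function extractJson(text) {
  const match = FENCED_BLOCK.exec(text);
  // Fall back to the full text when no fenced block is present.
  const candidate = (match ? match[1] : text).trim();
  try {
    return JSON.parse(candidate);
  } catch {
    // Signal a parse failure to the caller instead of throwing.
    return null;
  }
}

// Usage with a made-up executor reply; only `execution_status` is a field
// named in the prompt, the value is illustrative.
const reply = [
  "Executor finished.",
  FENCE + "json",
  '{ "execution_status": "ok" }',
  FENCE,
].join("\n");
console.log(extractJson(reply));
```

Returning `null` on failure leaves the retry or abort decision to the parent session, where the prompts place the other recovery rules.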

package/.pi/prompts/harness-setup.md CHANGED

@@ -345,7 +345,7 @@ Verify each package:
 |---------|---------|-------|
 | `@posthog/pi` | Analytics event capture | F0 |
 | `pi-lean-ctx` | Context runtime (read/bash/find/grep/MCP bridge) | F0 |
-| `harness-subagents` (bundled extension) | L4 sub-agent spawn, blackboard, package agents | P16 |
+| `harness-subagents` (bundled extension) | L4 `subagent` tool, subprocess spawns, package agents | P16 |
 | Vendored `pi-vcc` (`vendor/pi-vcc`, `.pi/extensions/ultimate-pi-vcc.ts`) | VCC compaction / `vcc_recall` — env-only: `HARNESS_VCC_COMPACTION` (default on), `HARNESS_VCC_DEBUG` | Shipped |
 | `pi-model-router` | Vendored (`vendor/`); activates after `.pi/model-router.json` exists | F0 |
@@ -383,11 +383,11 @@ Manual override: **`/router profile auto`** anytime after reload if they changed
 ## Step 3.6 — Harness agents (package-resolved)
-`harness-subagents` loads agents from the installed **`ultimate-pi`** package (`$UP_PKG/.pi/agents/**`) with namespaced ids (`harness/planner`, `pi-pi/agent-expert`). **Do not copy** agents into the project unless you want a deliberate override.
+`harness-subagents` loads agents from the installed **`ultimate-pi`** package (`$UP_PKG/.pi/agents/**`) with namespaced ids (`harness/executor`, `harness/planning/scout-graphify`, `pi-pi/agent-expert`). **Do not copy** agents into the project unless you want a deliberate override.
 **Slash commands are orchestrators:** `/harness-plan`, `/harness-run`, etc. spawn `harness/*` agents via the `Agent` tool — bootstrap stays **script-first**; only optionally spawn `harness/sentrux-bootstrap` for Sentrux (see Step 4.2).
-Optional per-repo overrides: place `.md` files at the **same relative path** (e.g. `.pi/agents/harness/planning/scout-graphify.md` overrides the package scout). Deprecated: `harness/planner.md` — use `harness/planning/` agents instead.
+Optional per-repo overrides: place `.md` files at the **same relative path** (e.g. `.pi/agents/harness/planning/scout-graphify.md` overrides the package scout).
 Verify manifest drift after `pi update ultimate-pi`:
@@ -478,16 +478,25 @@ Template keys (placeholders — user fills secrets): `HARNESS_TELEMETRY_ENABLED`
 ### 4.1 — .gitignore Entries
-Ensure `.gitignore` contains:
+Ensure `.gitignore` contains harness runtime entries (see repo root `.gitignore` — **do not** ignore `.pi/harness/specs/`; JSON schemas are shared contracts):
 ```
 .env
 .web/
 .searxng/
 .raw/
 .vault-meta/
-.pi/harness/critics/
+.pi/harness/active-run.json
+.pi/harness/release-readiness-report.md
 .pi/harness/plans/
-.pi/harness/specs/
+.pi/harness/critics/
+.pi/harness/runs/**
+!.pi/harness/runs/README.md
+.pi/harness/incidents/*
+!.pi/harness/incidents/README.md
+.pi/harness/debates/*
+!.pi/harness/debates/README.md
+.pi/harness/router/proposals/*
 # Model router config (user-specific — generated from env)
 .pi/model-router.json

package/.pi/prompts/harness-trace.md CHANGED

@@ -20,10 +20,10 @@ Happy path: omit `--run`.
 2. Spawn:
 ```
-Agent({ subagent_type: "harness/trace-librarian", prompt: "…" })
+subagent({ agentScope: "both", agent: "harness/trace-librarian", task: "…" })
 ```
-3. `get_subagent_result` — present timeline and artifact index to user.
+3. Present timeline and artifact index from tool result to user.
 ## Completion

package/.pi/scripts/harness-agents-manifest.mjs CHANGED

@@ -14,7 +14,7 @@ import {
 	isSafeAgentId,
 	sha256Content,
 	walkAgentsDir,
-} from "../../test/harness-subagents-loader.core.mjs";
+} from "../lib/harness-agent-discovery.mjs";
 const ROOT = join(dirname(fileURLToPath(import.meta.url)), "..", "..");
 const MANIFEST_PATH = join(ROOT, ".pi", "harness", "agents.manifest.json");

package/.pi/scripts/harness-verify.mjs CHANGED

@@ -202,32 +202,41 @@ async function main() {
 	if (!(await fileExists(runCtxLib))) fail("missing lib/harness-run-context.ts");
 	ok("lib/harness-run-context.ts");
-	const vendoredIndex = join(
+	const subagentsVendor = join(
+		ROOT,
+		"vendor",
+		"pi-subagents",
+		"src",
+		"subagents.ts",
+	);
+	if (!(await fileExists(subagentsVendor))) {
+		fail("missing vendor/pi-subagents/src/subagents.ts");
+	}
+	const bridgePath = join(
 		ROOT,
 		".pi",
 		"extensions",
 		"lib",
-		"harness-subagents",
-		"vendored",
-		"index.ts",
+		"harness-subagents-bridge.ts",
 	);
-	const vendoredSrc = await readFile(vendoredIndex, "utf-8");
-	const runCtxImport = vendoredSrc.match(
-		/from ["']([^"']*harness-run-context\.js)["']/,
-	);
-	if (!runCtxImport) {
-		fail("vendored/index.ts must import harness-run-context.js");
+	if (!(await fileExists(bridgePath))) {
+		fail("missing harness-subagents-bridge.ts");
 	}
-	const runCtxImportPath = resolve(
-		dirname(vendoredIndex),
-		runCtxImport[1].replace(/\.js$/, ".ts"),
-	);
-	if (runCtxImportPath !== runCtxLib) {
-		fail(
-			`vendored/index.ts harness-run-context import resolves to ${runCtxImportPath}, expected ${runCtxLib}`,
-		);
+	const bridgeSrc = await readFile(bridgePath, "utf-8");
+	if (!bridgeSrc.includes("precheckHarnessSubagentSpawn")) {
+		fail("harness-subagents-bridge must run precheckHarnessSubagentSpawn");
+	}
+	if (!bridgeSrc.includes("packageRoot")) {
+		fail("harness-subagents-bridge must pass packageRoot for agent discovery");
+	}
+	const subagentsSrc = await readFile(subagentsVendor, "utf-8");
+	if (!subagentsSrc.includes("discoverAgents")) {
+		fail("vendor subagents.ts must implement discoverAgents");
+	}
+	if (!subagentsSrc.includes("packageRoot")) {
+		fail("vendor subagents.ts must pass packageRoot into discovery");
 	}
-	ok("vendored/index.ts harness-run-context import path");
+	ok("vendor pi-subagents + harness bridge");
 	const policyGateSrc = await readFile(
 		join(ROOT, ".pi", "extensions", "policy-gate.ts"),