ultimate-pi 0.11.0 → 0.13.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.agents/skills/ck-search/SKILL.md +11 -87
- package/.agents/skills/cocoindex-search/SKILL.md +35 -0
- package/.agents/skills/harness-debate-plan/SKILL.md +44 -0
- package/.agents/skills/harness-decisions/SKILL.md +1 -1
- package/.agents/skills/harness-orchestration/SKILL.md +54 -28
- package/.agents/skills/harness-plan/SKILL.md +15 -20
- package/.pi/PACKAGING.md +1 -0
- package/.pi/SYSTEM.md +21 -20
- package/.pi/agents/harness/adversary.md +0 -1
- package/.pi/agents/harness/evaluator.md +0 -1
- package/.pi/agents/harness/executor.md +1 -2
- package/.pi/agents/harness/incident-recorder.md +0 -1
- package/.pi/agents/harness/meta-optimizer.md +0 -1
- package/.pi/agents/harness/planning/decompose.md +3 -4
- package/.pi/agents/harness/planning/execution-plan-author.md +30 -0
- package/.pi/agents/harness/planning/hypothesis-validator.md +23 -0
- package/.pi/agents/harness/planning/hypothesis.md +3 -4
- package/.pi/agents/harness/planning/plan-adversary.md +10 -42
- package/.pi/agents/harness/planning/plan-evaluator.md +18 -0
- package/.pi/agents/harness/planning/review-integrator.md +23 -0
- package/.pi/agents/harness/planning/scout-graphify.md +13 -5
- package/.pi/agents/harness/planning/scout-semantic.md +23 -11
- package/.pi/agents/harness/planning/scout-structure.md +12 -6
- package/.pi/agents/harness/planning/sprint-contract-auditor.md +18 -0
- package/.pi/agents/harness/planning/stack-researcher.md +24 -0
- package/.pi/agents/harness/tie-breaker.md +0 -1
- package/.pi/agents/harness/trace-librarian.md +0 -1
- package/.pi/extensions/debate-orchestrator.ts +90 -53
- package/.pi/extensions/harness-plan-approval.ts +2 -2
- package/.pi/extensions/harness-run-context.ts +150 -5
- package/.pi/extensions/harness-subagents.ts +17 -6
- package/.pi/extensions/lib/harness-cocoindex-refresh.ts +49 -0
- package/.pi/extensions/lib/harness-posthog.ts +6 -1
- package/.pi/extensions/lib/harness-spawn-budget.ts +75 -0
- package/.pi/extensions/lib/harness-subagent-auth.ts +123 -0
- package/.pi/extensions/lib/{harness-subagents/harness-subagent-policy.ts → harness-subagent-policy.ts} +8 -7
- package/.pi/extensions/lib/harness-subagent-precheck.ts +95 -0
- package/.pi/extensions/lib/harness-subagents-bridge.ts +122 -0
- package/.pi/extensions/lib/plan-approval/create-plan.ts +4 -7
- package/.pi/extensions/lib/plan-approval/plan-review.ts +1 -1
- package/.pi/extensions/lib/plan-approval/types.ts +7 -1
- package/.pi/extensions/lib/plan-debate-envelope.ts +84 -0
- package/.pi/extensions/lib/{harness-subagents/spawn-policy.ts → spawn-policy.ts} +1 -0
- package/.pi/extensions/policy-gate.ts +1 -1
- package/.pi/extensions/review-integrity.ts +48 -29
- package/.pi/harness/agents.manifest.json +37 -25
- package/.pi/harness/docs/adrs/0032-harness-command-orchestration.md +4 -3
- package/.pi/harness/docs/adrs/0033-parent-orchestrated-planning.md +2 -2
- package/.pi/harness/docs/adrs/0035-plan-phase-review-gate.md +27 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/artifacts/review-round-r1.yaml +25 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/artifacts/review-round-r4.yaml +26 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/artifacts/sprint-audit-r4.yaml +5 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/plan-packet.yaml +196 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/plan-review.md +14 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/research-brief.yaml +32 -0
- package/.pi/harness/evals/smoke/run-context.fixture.json +1 -1
- package/.pi/harness/evals/smoke/smoke-harness-plan.mjs +88 -0
- package/.pi/harness/specs/harness-posthog-event.schema.json +6 -1
- package/.pi/harness/specs/plan-execution-plan-brief.schema.json +13 -0
- package/.pi/harness/specs/plan-execution-plan.schema.json +255 -0
- package/.pi/harness/specs/plan-packet.schema.json +14 -5
- package/.pi/harness/specs/plan-review-round-draft.schema.json +68 -0
- package/.pi/harness/specs/plan-sprint-audit-turn.schema.json +29 -0
- package/.pi/harness/specs/plan-stack-brief.schema.json +65 -0
- package/.pi/harness/specs/plan-validation-turn.schema.json +42 -0
- package/.pi/harness/specs/round-result.schema.json +16 -9
- package/.pi/lib/debate-orchestrator-types.ts +38 -0
- package/.pi/lib/harness-agent-discovery.mjs +81 -0
- package/.pi/lib/harness-run-context.ts +64 -38
- package/.pi/lib/harness-yaml.mjs +73 -0
- package/.pi/lib/harness-yaml.ts +90 -0
- package/.pi/prompts/harness-auto.md +13 -11
- package/.pi/prompts/harness-critic.md +2 -2
- package/.pi/prompts/harness-eval.md +3 -3
- package/.pi/prompts/harness-incident.md +2 -2
- package/.pi/prompts/harness-plan.md +83 -92
- package/.pi/prompts/harness-review.md +2 -2
- package/.pi/prompts/harness-router-tune.md +1 -1
- package/.pi/prompts/harness-run.md +2 -2
- package/.pi/prompts/harness-setup.md +30 -17
- package/.pi/prompts/harness-trace.md +2 -2
- package/.pi/scripts/README.md +1 -0
- package/.pi/scripts/harness-agents-manifest.mjs +1 -1
- package/.pi/scripts/harness-cli-verify.sh +24 -14
- package/.pi/scripts/harness-cocoindex-bootstrap.sh +182 -0
- package/.pi/scripts/harness-verify.mjs +38 -19
- package/.pi/scripts/validate-plan-dag.mjs +258 -0
- package/.pi/scripts/vendor-sync-pi-subagents.sh +19 -0
- package/.pi/skills/ast-grep/SKILL.md +2 -2
- package/.pi/skills/ccc/SKILL.md +142 -0
- package/.pi/skills/ccc/references/management.md +110 -0
- package/CHANGELOG.md +22 -0
- package/THIRD_PARTY_NOTICES.md +15 -0
- package/biome.json +2 -2
- package/package.json +7 -4
- package/vendor/pi-subagents/LICENSE +21 -0
- package/vendor/pi-subagents/UPSTREAM_PIN.md +11 -0
- package/vendor/pi-subagents/src/agents.ts +357 -0
- package/vendor/pi-subagents/src/subagents.ts +1463 -0
- package/.pi/agents/harness/planner.md +0 -13
- package/.pi/agents/harness/planning/hypothesis-eval.md +0 -59
- package/.pi/agents/harness/planning/planner.md +0 -20
- package/.pi/extensions/lib/harness-subagents/agent-loader.ts +0 -126
- package/.pi/extensions/lib/harness-subagents/agent-manifest.ts +0 -119
- package/.pi/extensions/lib/harness-subagents/agent-parser.ts +0 -87
- package/.pi/extensions/lib/harness-subagents/blackboard-tool.ts +0 -118
- package/.pi/extensions/lib/harness-subagents/blackboard.ts +0 -175
- package/.pi/extensions/lib/harness-subagents/parent-ask-user-bridge.ts +0 -10
- package/.pi/extensions/lib/harness-subagents/parent-harness-ui-bridge.ts +0 -137
- package/.pi/extensions/lib/harness-subagents/parent-harness-ui-hooks.ts +0 -77
- package/.pi/extensions/lib/harness-subagents/types-blackboard.ts +0 -27
- package/.pi/extensions/lib/harness-subagents/vendored/agent-manager.ts +0 -558
- package/.pi/extensions/lib/harness-subagents/vendored/agent-runner.ts +0 -666
- package/.pi/extensions/lib/harness-subagents/vendored/agent-types.ts +0 -175
- package/.pi/extensions/lib/harness-subagents/vendored/context.ts +0 -59
- package/.pi/extensions/lib/harness-subagents/vendored/cross-extension-rpc.ts +0 -134
- package/.pi/extensions/lib/harness-subagents/vendored/custom-agents.ts +0 -5
- package/.pi/extensions/lib/harness-subagents/vendored/default-agents.ts +0 -123
- package/.pi/extensions/lib/harness-subagents/vendored/env.ts +0 -43
- package/.pi/extensions/lib/harness-subagents/vendored/group-join.ts +0 -144
- package/.pi/extensions/lib/harness-subagents/vendored/index.ts +0 -2460
- package/.pi/extensions/lib/harness-subagents/vendored/invocation-config.ts +0 -52
- package/.pi/extensions/lib/harness-subagents/vendored/memory.ts +0 -182
- package/.pi/extensions/lib/harness-subagents/vendored/model-resolver.ts +0 -92
- package/.pi/extensions/lib/harness-subagents/vendored/output-file.ts +0 -115
- package/.pi/extensions/lib/harness-subagents/vendored/prompts.ts +0 -103
- package/.pi/extensions/lib/harness-subagents/vendored/schedule-store.ts +0 -177
- package/.pi/extensions/lib/harness-subagents/vendored/schedule.ts +0 -416
- package/.pi/extensions/lib/harness-subagents/vendored/settings.ts +0 -210
- package/.pi/extensions/lib/harness-subagents/vendored/skill-loader.ts +0 -108
- package/.pi/extensions/lib/harness-subagents/vendored/types.ts +0 -187
- package/.pi/extensions/lib/harness-subagents/vendored/ui/agent-widget.ts +0 -639
- package/.pi/extensions/lib/harness-subagents/vendored/ui/conversation-viewer.ts +0 -324
- package/.pi/extensions/lib/harness-subagents/vendored/ui/schedule-menu.ts +0 -110
- package/.pi/extensions/lib/harness-subagents/vendored/usage.ts +0 -71
- package/.pi/extensions/lib/harness-subagents/vendored/worktree.ts +0 -195
- package/.pi/extensions/{00-ultimate-pi-system-prompt.ts → custom-system-prompt.ts} +0 -0
package/.pi/lib/harness-yaml.mjs
@@ -0,0 +1,73 @@
+/**
+ * YAML read/write for harness plan artifacts (no JSON plan fallbacks).
+ */
+
+import { readFile, rename, writeFile } from "node:fs/promises";
+import { parse, stringify } from "yaml";
+
+const CODE_FENCE_RE = /^```(?:ya?ml|json)?\s*\n?([\s\S]*?)```\s*$/im;
+
+export function stripYamlFences(text) {
+  return stripCodeFences(text);
+}
+
+export function stripCodeFences(text) {
+  const trimmed = text.trim();
+  const m = CODE_FENCE_RE.exec(trimmed);
+  return m ? m[1].trim() : trimmed;
+}
+
+export function parseStructuredDocument(text, label = "document") {
+  const body = stripCodeFences(text);
+  if (!body.trim()) {
+    throw new Error(`${label}: empty document`);
+  }
+
+  try {
+    const yamlDoc = parse(body, { uniqueKeys: true });
+    if (yamlDoc !== null && yamlDoc !== undefined) {
+      return yamlDoc;
+    }
+  } catch {
+    /* try JSON below */
+  }
+
+  const trimmed = body.trim();
+  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
+    try {
+      return JSON.parse(trimmed);
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err);
+      throw new Error(`${label}: JSON parse failed — ${msg}`);
+    }
+  }
+
+  throw new Error(
+    `${label}: not valid YAML or JSON (use write_harness_yaml with a schema-shaped object)`,
+  );
+}
+
+export function parseYaml(text, label = "yaml") {
+  return parseStructuredDocument(text, label);
+}
+
+export async function readYamlFile(path, label) {
+  const raw = await readFile(path, "utf-8");
+  return parseStructuredDocument(raw, label ?? path);
+}
+
+export async function writeYamlFile(path, data) {
+  const tmp = `${path}.tmp`;
+  const content = `${stringify(data, { indent: 2 })}\n`;
+  await writeFile(tmp, content, "utf-8");
+  await rename(tmp, path);
+}
+
+export function stringifyYaml(data) {
+  return `${stringify(data, { indent: 2 })}\n`;
+}
+
+export function normalizeHarnessYamlContent(text, label = "yaml") {
+  const doc = parseStructuredDocument(text, label);
+  return stringifyYaml(doc);
+}
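The fence-stripping helper in the hunk above can be tried in isolation. The sketch below is a TypeScript transcription of `stripCodeFences`, not the packaged file itself. The `FENCE` constant and the sample inputs are illustrative assumptions, and the pattern is assembled with `new RegExp` from the same pieces as `CODE_FENCE_RE`. Because the pattern carries the `m` flag, prose before or after a fenced block is dropped together with the fence.

```typescript
// Three backticks, built by repetition so this example contains no literal fence marker.
const FENCE = "`".repeat(3);

// Same pattern and flags as CODE_FENCE_RE in the hunk above.
const CODE_FENCE_RE = new RegExp(
  "^" + FENCE + "(?:ya?ml|json)?\\s*\\n?([\\s\\S]*?)" + FENCE + "\\s*$",
  "im",
);

function stripCodeFences(text: string): string {
  const trimmed = text.trim();
  const m = CODE_FENCE_RE.exec(trimmed);
  // On a match, keep only the fenced body; otherwise return the trimmed input.
  return m ? m[1].trim() : trimmed;
}

// Illustrative inputs (assumed, not taken from the package).
const fenced = stripCodeFences(FENCE + "yaml\nplan_id: p1\n" + FENCE);
const bare = stripCodeFences("  plan_id: p1  ");
const withLeadingProse = stripCodeFences(
  "Here is the plan:\n" + FENCE + "json\n{\"plan_id\": \"p1\"}\n" + FENCE,
);
const withTrailingProse = stripCodeFences(
  FENCE + "yaml\nplan_id: p1\n" + FENCE + "\nDone.",
);
```

Here `fenced`, `bare`, and `withTrailingProse` all reduce to the body `plan_id: p1`, while `withLeadingProse` reduces to the JSON text inside the fence.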
package/.pi/lib/harness-yaml.ts
@@ -0,0 +1,90 @@
+/**
+ * YAML read/write for harness plan artifacts (no JSON plan fallbacks).
+ */
+
+import { readFile, rename, writeFile } from "node:fs/promises";
+import { parse, stringify } from "yaml";
+
+const CODE_FENCE_RE = /^```(?:ya?ml|json)?\s*\n?([\s\S]*?)```\s*$/im;
+
+/** @deprecated Use stripCodeFences */
+export function stripYamlFences(text: string): string {
+  return stripCodeFences(text);
+}
+
+export function stripCodeFences(text: string): string {
+  const trimmed = text.trim();
+  const m = CODE_FENCE_RE.exec(trimmed);
+  return m ? m[1].trim() : trimmed;
+}
+
+/**
+ * Parse agent output or file body: fenced YAML/JSON, raw YAML, or raw JSON object/array.
+ */
+export function parseStructuredDocument(
+  text: string,
+  label = "document",
+): unknown {
+  const body = stripCodeFences(text);
+  if (!body.trim()) {
+    throw new Error(`${label}: empty document`);
+  }
+
+  try {
+    const yamlDoc = parse(body, { uniqueKeys: true });
+    if (yamlDoc !== null && yamlDoc !== undefined) {
+      return yamlDoc;
+    }
+  } catch {
+    /* try JSON below */
+  }
+
+  const trimmed = body.trim();
+  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
+    try {
+      return JSON.parse(trimmed) as unknown;
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err);
+      throw new Error(`${label}: JSON parse failed — ${msg}`);
+    }
+  }
+
+  throw new Error(
+    `${label}: not valid YAML or JSON (use write_harness_yaml with a schema-shaped object)`,
+  );
+}
+
+export function parseYaml(text: string, label = "yaml"): unknown {
+  return parseStructuredDocument(text, label);
+}
+
+export async function readYamlFile(
+  path: string,
+  label?: string,
+): Promise<unknown> {
+  const raw = await readFile(path, "utf-8");
+  return parseStructuredDocument(raw, label ?? path);
+}
+
+export async function writeYamlFile(
+  path: string,
+  data: unknown,
+): Promise<void> {
+  const tmp = `${path}.tmp`;
+  const content = `${stringify(data, { indent: 2 })}\n`;
+  await writeFile(tmp, content, "utf-8");
+  await rename(tmp, path);
+}
+
+export function stringifyYaml(data: unknown): string {
+  return `${stringify(data, { indent: 2 })}\n`;
+}
+
+/** Normalize arbitrary agent text to canonical YAML file bytes. */
+export function normalizeHarnessYamlContent(
+  text: string,
+  label = "yaml",
+): string {
+  const doc = parseStructuredDocument(text, label);
+  return stringifyYaml(doc);
+}
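The parse order in `parseStructuredDocument` above (YAML first, then JSON only when the body starts with `{` or `[`) can be sketched without the `yaml` dependency. In the version below the YAML parser is injected as a parameter. The names `parseWithFallback`, `YamlParse`, and `alwaysFails` are hypothetical and exist only for this illustration, and the final error message is shortened.

```typescript
// Hypothetical: the YAML parser is passed in so the sketch needs no external package.
type YamlParse = (body: string) => unknown;

function parseWithFallback(
  body: string,
  label: string,
  yamlParse: YamlParse,
): unknown {
  if (!body.trim()) {
    throw new Error(`${label}: empty document`);
  }

  try {
    const doc = yamlParse(body);
    // A null or undefined YAML result falls through to the JSON attempt.
    if (doc !== null && doc !== undefined) {
      return doc;
    }
  } catch {
    // YAML parsing failed; try JSON below.
  }

  const trimmed = body.trim();
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
    try {
      return JSON.parse(trimmed) as unknown;
    } catch (err) {
      const msg = err instanceof Error ? err.message : String(err);
      throw new Error(`${label}: JSON parse failed: ${msg}`);
    }
  }

  throw new Error(`${label}: not valid YAML or JSON`);
}

// Stub standing in for the real YAML parser; it always throws.
const alwaysFails: YamlParse = () => {
  throw new Error("not YAML");
};

// With the stub, only the JSON branch can succeed.
const viaJson = parseWithFallback('{"plan_id": "p1"}', "demo", alwaysFails);
```

Injecting the parser keeps the control flow testable on its own. In the packaged module the YAML step is the `parse` call with `uniqueKeys: true`.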
package/.pi/prompts/harness-auto.md
@@ -5,7 +5,7 @@ argument-hint: "\"<task>\" [--quick] [--risk low|med|high] [--budget <amount>]"

 # harness-auto

-Pipeline orchestrator — one session, sequential
+Pipeline orchestrator — one session, sequential phase handoffs. Invoke **harness-orchestration** skill for agent IDs. Do **not** implement or review inline.

 ## Step 0 — Parse arguments

@@ -18,20 +18,22 @@ If task missing:

 ## Orchestration (required) — same session

-
-
-
-
-
-
+Follow **harness-plan** performance rules (`subagent` with parallel `tasks`, `agentScope: "both"`).
+
+1. **Plan** — follow `/harness-plan` (parallel scouts → parallel decompose/hypothesis → draft PlanPacket → debate rounds → parent `approve_plan` + `create_plan`). No second approval pass.
+2. **Execute** — `subagent({ agent: "harness/executor", task: "<HarnessSpawnContext mode execute>" })`; summarize handoff bullets only (do not paste full subprocess log).
+3. **Eval** — `subagent({ agent: "harness/evaluator", task: "<mode benchmark>" })` after parent scripts if needed.
+4. **Review** — `subagent({ agent: "harness/evaluator", task: "<mode verdict>" })` when strict gates require.
+5. **Adversary** — `subagent({ agent: "harness/adversary", ... })`. **Skip when `--quick`**.
+6. **Tie-breaker** — `subagent({ agent: "harness/tie-breaker", ... })` only if debate unresolved and **not** `--quick`.
 7. **Parent** — apply locked strict gates below; commit/PR only if all pass.

-
+Review agents run in isolated subprocesses via `subagent` (same parent session).

 ## Locked decisions (do not change)

 - Always produce and approve plan before mutation.
-- Adversarial review always required.
+- Adversarial review always required **except** `--quick` (evaluator-only gate).
 - Severity-policy-engine blocks merge.
 - Router tuning propose-and-approve only.
 - Plan ambiguity → parent `ask_user` (harness-decisions).
@@ -41,11 +43,11 @@ No new Pi session for review — subagents use isolated context (`inherit_contex

 ## Strict gates

-Block commit/PR if any fails: plan gate, execution in scope, evaluator pass, adversary complete, severity-policy pass/conditional_pass, benchmark deltas, rollback artifacts.
+Block commit/PR if any fails: plan gate, execution in scope, evaluator pass, adversary complete (unless `--quick`), severity-policy pass/conditional_pass, benchmark deltas, rollback artifacts.

 ## Notes

-- `--quick` reduces breadth, never safety gates.
+- `--quick` reduces breadth (skips semantic scout, post-run adversary, tie-breaker), never core safety gates on plan approval or evaluator.
 - High risk/ambiguity → stop and recommend manual `/harness-plan` with `ask_user`.
 - Interrupt: `/harness-abort [reason]` then `/harness-plan`.
 - Artifact refs under active run dir; `/harness-run-status` or `/harness-trace-last` for handoff.
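The orchestration order and strict gates in the harness-auto hunks above can be read as two small decision functions. The sketch below is one possible encoding, not code shipped in the package. The names `phasesFor`, `mayCommit`, `AutoFlags`, and `GateResults`, and the flags `strictReview` and `debateUnresolved`, are hypothetical stand-ins for conditions the prompt states in prose.

```typescript
type Phase =
  | "plan"
  | "execute"
  | "eval"
  | "review"
  | "adversary"
  | "tie-breaker"
  | "parent-gates";

// Hypothetical flags mirroring the conditions named in the prompt text.
interface AutoFlags {
  quick: boolean;            // --quick was passed
  strictReview: boolean;     // "when strict gates require"
  debateUnresolved: boolean; // "only if debate unresolved"
}

function phasesFor(flags: AutoFlags): Phase[] {
  const phases: Phase[] = ["plan", "execute", "eval"];
  if (flags.strictReview) phases.push("review");
  if (!flags.quick) {
    // --quick skips both the adversary and the tie-breaker.
    phases.push("adversary");
    if (flags.debateUnresolved) phases.push("tie-breaker");
  }
  phases.push("parent-gates");
  return phases;
}

// Hypothetical result record: one field per strict gate listed in the prompt.
interface GateResults {
  planGate: boolean;
  executionInScope: boolean;
  evaluatorPass: boolean;
  adversaryComplete: boolean;
  severityPolicy: string;
  benchmarkDeltas: boolean;
  rollbackArtifacts: boolean;
}

function mayCommit(g: GateResults, quick: boolean): boolean {
  const severityOk =
    g.severityPolicy === "pass" || g.severityPolicy === "conditional_pass";
  // The adversary gate is waived only under --quick.
  const adversaryOk = quick || g.adversaryComplete;
  return (
    g.planGate &&
    g.executionInScope &&
    g.evaluatorPass &&
    adversaryOk &&
    severityOk &&
    g.benchmarkDeltas &&
    g.rollbackArtifacts
  );
}
```

Under this encoding `--quick` shortens the phase list and waives only the adversary gate; plan approval and the evaluator gate still block a commit.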
package/.pi/prompts/harness-critic.md
@@ -20,10 +20,10 @@ Happy path: omit `--run`.
 2. Spawn:

 ```
-
+subagent({ agentScope: "both", agent: "harness/adversary", task: "…" })
 ```

-3. `
+3. Parse `AdversaryReport` JSON from tool result; parent persists for severity policy.

 ## Parent rules

package/.pi/prompts/harness-eval.md
@@ -26,11 +26,11 @@ If no active run:
 4. Spawn:

 ```
-
+subagent({ agentScope: "both", agent: "harness/evaluator", task: "<HarnessSpawnContext + eval brief>" })
 ```

-5.
-6. Do not edit `plan-packet.
+5. Parse eval JSON from tool result; parent writes structured artifacts under run dir.
+6. Do not edit `plan-packet.yaml`.

 ## Parent rules

package/.pi/prompts/harness-incident.md
@@ -22,10 +22,10 @@ If `--trigger` missing:
 2. Spawn:

 ```
-
+subagent({ agentScope: "both", agent: "harness/incident-recorder", task: "…" })
 ```

-3. `
+3. Parse `IncidentRecord` JSON from tool result; parent writes under `.pi/harness/incidents/`.

 ## Completion

package/.pi/prompts/harness-plan.md
@@ -1,154 +1,145 @@
 ---
-description:
+description: PM-grade harness plan — scouts, ExecutionPlan, DAG validation, Review Gate debate, approval.
 argument-hint: "\"<task>\" [--risk low|med|high] [--budget <amount>] [--quick]"
 ---

 # harness-plan

-
+You are the **planning PM** for this harness run. Produce an execution baseline (`plan-packet.yaml` + `plan-review.md`), not strategy theater. Parent owns `ask_user`, `approve_plan`, `create_plan`, debate bus commands, and YAML writes under `.pi/harness/runs/<run_id>/`.

-
+Never `write`/`edit` the final canonical packet except via **`write_harness_yaml`** for run artifacts and **`create_plan`** after approval. Do not paste JSON into `.yaml` files — subagents emit JSON; you convert via `write_harness_yaml`.
+
+## Allowed subagents

 - `harness/planning/scout-graphify`
 - `harness/planning/scout-structure`
-- `harness/planning/scout-semantic`
+- `harness/planning/scout-semantic` (skip when `--quick`)
 - `harness/planning/decompose`
 - `harness/planning/hypothesis`
+- `harness/planning/stack-researcher`
+- `harness/planning/execution-plan-author`
+- `harness/planning/hypothesis-validator` (debate R1 only)
+- `harness/planning/plan-evaluator`
 - `harness/planning/plan-adversary`
-- `harness/planning/
-
-Do **not** spawn `harness/planner` or `harness/planning/planner`.
+- `harness/planning/sprint-contract-auditor`
+- `harness/planning/review-integrator`

-
+Read **harness-debate-plan** skill before Review Gate rounds.

-
+## Performance rules

-
-
+1. Use `subagent` with `agentScope: "both"` and parallel `tasks` where lanes are independent.
+2. Each `subagent` call blocks until subprocesses finish — batch parallel scouts in one `tasks` array.
+3. Do **not** set `timeoutMs` unless the user explicitly requests a cap — subagents run until natural completion (optional backstop: `PI_SUBAGENT_TIMEOUT_MS`).
+4. Cap: **12** harness subagent invocations per parent session (extension-enforced).
+5. Compact task text: embed `HarnessSpawnContext` JSON + lane-specific instructions only.

-
+## Step 0 — Parse `$ARGUMENTS`

-
+- task (required)
+- `--risk low|med|high`, `--budget`, `--quick`

-`--quick` skips
+`--quick` skips **scout-semantic** and post-run adversary only — **never** skip graphify, structure, decompose, hypothesis, stack research, execution plan, DAG validation, or **4-round plan debate**.

 ## Active plan context

-Use
-
-If `[HarnessActivePlan]` is present:
+Use `[HarnessActivePlan]` / `[HarnessRunContext]` only. On revise: preserve `plan_id` / `task_id`. Canonical paths: `plan-packet.yaml`, `research-brief.yaml`, `artifacts/*.yaml`.

-
-- Set `mode: revise` in `HarnessSpawnContext` from `[HarnessRunContext]`.
-- **Preserve `plan_id` and `task_id`** from the existing packet when amending.
-- Scouts focus on delta vs existing `plan_packet_path`; full re-scout only if scope changed materially.
+## Phase 0 — Semantic index (automatic)

-
+Do **not** run `ccc index` or `ccc search --refresh`. The harness runs incremental `ccc index` before subagent spawns. Proceed directly to Phase 1 scouts.

-## Phase 1 — Parallel scouts
+## Phase 1 — Parallel scouts

-
-
-
-
-
-
+```json
+{
+  "agentScope": "both",
+  "tasks": [
+    { "agent": "harness/planning/scout-graphify", "task": "<HarnessSpawnContext + graphify lane>" },
+    { "agent": "harness/planning/scout-structure", "task": "<HarnessSpawnContext + structure lane>" }
+  ]
+}
 ```

-
+Add `harness/planning/scout-semantic` to `tasks` unless `--quick`. Require graphify + structure success. Semantic lane uses `ccc search` only (see `scout-semantic` agent).

-
-4. **Partial failure:** require successful **graphify + structure** lanes. Semantic is optional. If a required lane fails, continue with `plan_status: partial` and document gaps in `assumptions`.
-5. If JSON parse fails for a lane, summarize free-text output and add an assumption that the lane was unstructured.
+## Phase 2 & 3 — Decompose + hypothesis (parallel)

-
+One `subagent` call with `tasks` for `harness/planning/decompose` and `harness/planning/hypothesis`. Parse `PlanDecompositionBrief` and `PlanHypothesisBrief` from outputs. Persist with `write_harness_yaml` → `artifacts/decomposition.yaml` and `artifacts/hypothesis.yaml`.

-
+## Phase 4 — Draft shell + fork

-
-Agent({ subagent_type: "harness/planning/decompose", prompt: "<HarnessSpawnContext + task + all scout lane JSON>", inherit_context: false })
-```
+Build draft `PlanPacket` (`contract_version: "1.1.0"`):

-
-
+- `scope`, `assumptions`, `acceptance_checks`, `risk_level`, `rollback_plan`
+- `execution_plan` placeholder until Phase 4b

-
+`ask_user` when `dialectical_fork` is material.

-
+Initialize `research-brief.yaml` with decomposition + hypothesis (`write_harness_yaml`).
+
+## Phase 4a — Stack research

 ```
-
+subagent({ agentScope: "both", agent: "harness/planning/stack-researcher", task: "<HarnessSpawnContext + stack research brief>" })
 ```

-
-3. **Revision cap:** at most **one** re-spawn of `hypothesis` if Phase 6 eval requests revision (see below).
+`write_harness_yaml` → `artifacts/stack.yaml`; merge into `research-brief.yaml` → `stack`.

-## Phase
+## Phase 4b — Execution plan author

-
+```
+subagent({ agentScope: "both", agent: "harness/planning/execution-plan-author", task: "<HarnessSpawnContext + execution plan brief>" })
+```

-
-|-------|--------|
-| `scope` | `problem_restatement` (narrowed) + `primary.claim` + `primary.mechanism` (implementation-ready) |
-| `assumptions` | `core_tension`, `prior_art.dead_ends`, scout `open_questions`, chosen fork path (if any) |
-| `acceptance_checks` | Each `primary.prediction` and `primary.experiment` as verifiable checklist items (min 1) |
-| `risk_level` | From `$ARGUMENTS` or infer from fork uncertainty / blast radius |
+Merge `execution_plan` into draft `plan-packet.yaml` (`write_harness_yaml`). Save `artifacts/execution-plan-draft.yaml` the same way.

-
+## Phase 4c — DAG validation (hard gate)

-
+```bash
+node .pi/scripts/validate-plan-dag.mjs --packet .pi/harness/runs/<run_id>/plan-packet.yaml --write
+```

-
+Must **pass** before debate. On fail: fix via author or parent patches, re-run.

-
-{
-  "decomposition": { /* PlanDecompositionBrief */ },
-  "hypothesis": { /* PlanHypothesisBrief */ },
-  "eval": null
-}
-```
+## Phase 5 — Review Gate debate (4 rounds, even with `--quick`)

-
+1. `/harness-debate-open plan-<run_id>`
+2. For rounds 1–4 (`debate_round_focus`: spec, wbs, schedule, quality):

-
+| Round | Extra spawns (before integrator) |
+|-------|----------------------------------|
+| 1 | `hypothesis-validator` (blind: task + hypothesis only) → `plan-evaluator` → `plan-adversary` |
+| 2 | `plan-evaluator` → `plan-adversary` (optional `sprint-contract-auditor` if done_criteria thin) |
+| 3 | `plan-evaluator` → `plan-adversary` |
+| 4 | `plan-evaluator` → `plan-adversary` → **`sprint-contract-auditor` (required)** |

-
-Agent({ subagent_type: "harness/planning/plan-adversary", prompt: "<HarnessSpawnContext + draft PlanPacket + scout summaries + decomposition human_summary>", inherit_context: false })
-Agent({ subagent_type: "harness/planning/hypothesis-eval", prompt: "<original task ONLY + PlanHypothesisBrief JSON — no decomposition, no PlanPacket>", inherit_context: false })
-```
+Then `review-integrator` → `write_harness_yaml` → `artifacts/review-round-r{N}.yaml` → build bus envelope → `/harness-debate-round '<json>'`.

-
-2. Parse `PlanHypothesisEval` — set `research_brief.eval`.
-3. If `revision_recommended` or testability < 70 or `relevance.passes` is false: re-spawn `hypothesis` once with eval rationale, update PlanPacket + `research_brief.hypothesis`, then re-run **hypothesis-eval** only (not adversary unless PlanPacket changed materially).
+3. `/harness-debate-consensus` after round 4.

-
+**R1 blind rule:** hypothesis-validator prompt must exclude decomposition, scouts, PlanPacket, prior debate.

-
+If R1 `revision_recommended` or `relevance.passes === false`: one `hypothesis` re-spawn, update brief, continue.

-
-2. On **Approve** only, call **`create_plan`** with the **same** `plan_packet`.
-3. If `create_plan` fails, tell the user to fix validation errors or run `/harness-plan-commit` after approval is recorded.
-4. Confirm `[HarnessRunContext]` `plan_ready: true` before handoff.
+**Blockers:** `policy_decision: block` → do not `approve_plan`. `human_required` → `ask_user` before approval.

-
+## Phase 5b — Revise packet

-
+Apply `recommended_packet_patches` from last integrator round. Re-run `validate-plan-dag.mjs`. If >30% work items changed, one partial re-round on affected focus.

-
+Set `research_brief.eval` from R1 `hypothesis-validator` output.

-
-- `/harness-plan-commit` only after parent `approve_plan` (Approve) is in the transcript.
-- If `plan_ready: true` already, stop — summarize and set `next_command: /harness-run`.
+## Phase 6 — Approval + persistence

-
+1. `approve_plan` with `plan_packet`, `human_summary`, `research_brief` (paths/summaries OK).
+2. On Approve: `create_plan` with same packet (`contract_version: "1.1.0"` + `execution_plan`).
+3. Confirm `plan_ready: true` → `next_command: /harness-run`.

--
-- Subagents never call `ask_user`, `approve_plan`, or `create_plan`.
-- Do not embed `plan_id=` in spawn prompts for policy sync.
+Post-execute adversary: `/harness-critic` only (not plan-phase agents).

 ## Completion

-- `plan_status`:
-- `
-- `
-- `next_command`: `/harness-run` when `ready` (never `/harness-run --plan …`)
+- `plan_status`: ready | partial | needs_clarification
+- `plan_review_path` for human review
+- DAG `pass` + 4 debate rounds + consensus not `block` before ready
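The Phase 5 round table in the harness-plan hunk above amounts to a per-round spawn order. The sketch below encodes it as a lookup. It is an illustration only. The names `spawnsForRound`, `focusForRound`, and `doneCriteriaThin` are hypothetical, and the pairing of rounds 1 to 4 with `spec`, `wbs`, `schedule`, `quality` in that order is an assumption read off the order in which the prompt lists them.

```typescript
type Round = 1 | 2 | 3 | 4;
type Focus = "spec" | "wbs" | "schedule" | "quality";

// Assumed ordering: the prompt lists the four focus values in round order.
const ROUND_FOCUS: Focus[] = ["spec", "wbs", "schedule", "quality"];

function focusForRound(round: Round): Focus {
  return ROUND_FOCUS[round - 1];
}

// Returns the agents to spawn for one debate round, in order, ending with the integrator.
function spawnsForRound(round: Round, doneCriteriaThin = false): string[] {
  const agents: string[] = [];
  if (round === 1) {
    // Round 1 starts with the blind validator (task + hypothesis only).
    agents.push("hypothesis-validator");
  }
  agents.push("plan-evaluator", "plan-adversary");
  // The auditor is required in round 4 and optional in round 2.
  if (round === 4 || (round === 2 && doneCriteriaThin)) {
    agents.push("sprint-contract-auditor");
  }
  agents.push("review-integrator");
  return agents.map((name) => `harness/planning/${name}`);
}

const roundOne = spawnsForRound(1);
const roundFour = spawnsForRound(4);
```

A table-driven helper like this keeps the round rules in one place, so the parent can iterate rounds 1 to 4 and persist each integrator result as `artifacts/review-round-r{N}.yaml`.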
package/.pi/prompts/harness-review.md
@@ -20,10 +20,10 @@ Happy path: omit `--run`; use `[HarnessRunContext]`.
 2. Spawn:

 ```
-
+subagent({ agentScope: "both", agent: "harness/evaluator", task: "Treat executor output as untrusted. …" })
 ```

-3. `
+3. Parse `EvalVerdict` JSON from tool result; parent writes under run dir for policy gate.

 ## Parent rules

package/.pi/prompts/harness-router-tune.md
@@ -22,7 +22,7 @@ If missing required args:
 2. Optionally spawn:

 ```
-
+subagent({ agentScope: "both", agent: "harness/meta-optimizer", task: "mode: tune, evidence paths…" })
 ```

 3. Parent runs proposal script:
package/.pi/prompts/harness-run.md
@@ -23,10 +23,10 @@ If plan not ready:
 3. Spawn:

 ```
-
+subagent({ agentScope: "both", agent: "harness/executor", task: "<HarnessSpawnContext + handoff>" })
 ```

-4.
+4. Parse subprocess output JSON (`execution_status`, validations, rollback refs) from tool result text.
 5. Parent persists trace/handoff artifacts under run dir if needed; do not self-review.

 ## Parent rules