ultimate-pi 0.10.1 → 0.12.0
- package/.agents/skills/harness-debate-plan/SKILL.md +44 -0
- package/.agents/skills/harness-decisions/SKILL.md +3 -3
- package/.agents/skills/harness-orchestration/SKILL.md +59 -25
- package/.agents/skills/harness-plan/SKILL.md +16 -15
- package/.pi/agents/harness/adversary.md +0 -1
- package/.pi/agents/harness/evaluator.md +0 -1
- package/.pi/agents/harness/executor.md +1 -2
- package/.pi/agents/harness/incident-recorder.md +0 -1
- package/.pi/agents/harness/meta-optimizer.md +0 -1
- package/.pi/agents/harness/planning/decompose.md +83 -0
- package/.pi/agents/harness/planning/execution-plan-author.md +30 -0
- package/.pi/agents/harness/planning/hypothesis-validator.md +23 -0
- package/.pi/agents/harness/planning/hypothesis.md +89 -0
- package/.pi/agents/harness/planning/plan-adversary.md +18 -0
- package/.pi/agents/harness/planning/plan-evaluator.md +18 -0
- package/.pi/agents/harness/planning/review-integrator.md +23 -0
- package/.pi/agents/harness/planning/scout-graphify.md +54 -0
- package/.pi/agents/harness/planning/scout-semantic.md +47 -0
- package/.pi/agents/harness/planning/scout-structure.md +50 -0
- package/.pi/agents/harness/planning/sprint-contract-auditor.md +18 -0
- package/.pi/agents/harness/planning/stack-researcher.md +24 -0
- package/.pi/agents/harness/tie-breaker.md +0 -1
- package/.pi/agents/harness/trace-librarian.md +0 -1
- package/.pi/extensions/debate-orchestrator.ts +90 -53
- package/.pi/extensions/harness-ask-user.ts +5 -0
- package/.pi/extensions/harness-plan-approval.ts +137 -3
- package/.pi/extensions/harness-run-context.ts +146 -6
- package/.pi/extensions/harness-subagents.ts +10 -5
- package/.pi/extensions/harness-web-tools.ts +2 -0
- package/.pi/extensions/lib/extension-load-guard.ts +39 -0
- package/.pi/extensions/lib/harness-posthog.ts +6 -1
- package/.pi/extensions/lib/harness-spawn-budget.ts +75 -0
- package/.pi/extensions/lib/harness-subagent-auth.ts +123 -0
- package/.pi/extensions/lib/{harness-subagents/harness-subagent-policy.ts → harness-subagent-policy.ts} +34 -9
- package/.pi/extensions/lib/harness-subagent-precheck.ts +95 -0
- package/.pi/extensions/lib/harness-subagents-bridge.ts +176 -0
- package/.pi/extensions/lib/plan-approval/create-plan.ts +9 -7
- package/.pi/extensions/lib/plan-approval/plan-review.ts +393 -0
- package/.pi/extensions/lib/plan-approval/schema.ts +16 -1
- package/.pi/extensions/lib/plan-approval/types.ts +16 -0
- package/.pi/extensions/lib/plan-approval/validate.ts +2 -0
- package/.pi/extensions/lib/plan-debate-envelope.ts +84 -0
- package/.pi/extensions/lib/{harness-subagents/spawn-policy.ts → spawn-policy.ts} +2 -5
- package/.pi/extensions/policy-gate.ts +1 -1
- package/.pi/extensions/review-integrity.ts +48 -29
- package/.pi/extensions/ultimate-pi-vcc.ts +5 -0
- package/.pi/harness/agents.manifest.json +126 -82
- package/.pi/harness/docs/adrs/0032-harness-command-orchestration.md +7 -6
- package/.pi/harness/docs/adrs/0033-parent-orchestrated-planning.md +34 -0
- package/.pi/harness/docs/adrs/0034-darwin-plan-research-pipeline.md +41 -0
- package/.pi/harness/docs/adrs/0035-plan-phase-review-gate.md +27 -0
- package/.pi/harness/docs/adrs/README.md +2 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/artifacts/review-round-r1.yaml +25 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/artifacts/review-round-r4.yaml +26 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/artifacts/sprint-audit-r4.yaml +5 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/plan-packet.yaml +196 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/plan-review.md +14 -0
- package/.pi/harness/evals/smoke/fixtures/plan-phase/minimal-med/research-brief.yaml +32 -0
- package/.pi/harness/evals/smoke/run-context.fixture.json +1 -1
- package/.pi/harness/evals/smoke/smoke-harness-plan.mjs +88 -0
- package/.pi/harness/specs/README.md +1 -1
- package/.pi/harness/specs/harness-posthog-event.schema.json +6 -1
- package/.pi/harness/specs/harness-spawn-context.schema.json +2 -1
- package/.pi/harness/specs/plan-adversary-brief.schema.json +45 -0
- package/.pi/harness/specs/plan-decomposition-brief.schema.json +108 -0
- package/.pi/harness/specs/plan-execution-plan-brief.schema.json +13 -0
- package/.pi/harness/specs/plan-execution-plan.schema.json +255 -0
- package/.pi/harness/specs/plan-hypothesis-brief.schema.json +96 -0
- package/.pi/harness/specs/plan-hypothesis-eval.schema.json +61 -0
- package/.pi/harness/specs/plan-packet.schema.json +14 -5
- package/.pi/harness/specs/plan-review-round-draft.schema.json +68 -0
- package/.pi/harness/specs/plan-sprint-audit-turn.schema.json +29 -0
- package/.pi/harness/specs/plan-stack-brief.schema.json +65 -0
- package/.pi/harness/specs/plan-validation-turn.schema.json +42 -0
- package/.pi/harness/specs/round-result.schema.json +16 -9
- package/.pi/lib/debate-orchestrator-types.ts +38 -0
- package/.pi/lib/harness-agent-discovery.mjs +81 -0
- package/.pi/lib/harness-run-context.ts +76 -38
- package/.pi/lib/harness-yaml.mjs +73 -0
- package/.pi/lib/harness-yaml.ts +90 -0
- package/.pi/prompts/harness-auto.md +13 -11
- package/.pi/prompts/harness-critic.md +2 -2
- package/.pi/prompts/harness-eval.md +3 -3
- package/.pi/prompts/harness-incident.md +2 -2
- package/.pi/prompts/harness-plan.md +106 -37
- package/.pi/prompts/harness-review.md +2 -2
- package/.pi/prompts/harness-router-tune.md +1 -1
- package/.pi/prompts/harness-run.md +2 -2
- package/.pi/prompts/harness-setup.md +15 -6
- package/.pi/prompts/harness-trace.md +2 -2
- package/.pi/scripts/harness-agents-manifest.mjs +1 -1
- package/.pi/scripts/harness-resolve-up-pkg.mjs +13 -0
- package/.pi/scripts/harness-verify.mjs +28 -19
- package/.pi/scripts/validate-plan-dag.mjs +258 -0
- package/.pi/scripts/vendor-sync-pi-subagents.sh +19 -0
- package/CHANGELOG.md +24 -0
- package/THIRD_PARTY_NOTICES.md +8 -0
- package/biome.json +4 -1
- package/package.json +6 -4
- package/.pi/agents/harness/planner.md +0 -54
- package/.pi/extensions/lib/harness-subagents/agent-loader.ts +0 -126
- package/.pi/extensions/lib/harness-subagents/agent-manifest.ts +0 -119
- package/.pi/extensions/lib/harness-subagents/agent-parser.ts +0 -87
- package/.pi/extensions/lib/harness-subagents/blackboard-tool.ts +0 -118
- package/.pi/extensions/lib/harness-subagents/blackboard.ts +0 -175
- package/.pi/extensions/lib/harness-subagents/parent-ask-user-bridge.ts +0 -10
- package/.pi/extensions/lib/harness-subagents/parent-harness-ui-bridge.ts +0 -310
- package/.pi/extensions/lib/harness-subagents/parent-harness-ui-hooks.ts +0 -59
- package/.pi/extensions/lib/harness-subagents/types-blackboard.ts +0 -27
- package/.pi/extensions/lib/harness-subagents/vendored/agent-manager.ts +0 -558
- package/.pi/extensions/lib/harness-subagents/vendored/agent-runner.ts +0 -684
- package/.pi/extensions/lib/harness-subagents/vendored/agent-types.ts +0 -175
- package/.pi/extensions/lib/harness-subagents/vendored/context.ts +0 -59
- package/.pi/extensions/lib/harness-subagents/vendored/cross-extension-rpc.ts +0 -134
- package/.pi/extensions/lib/harness-subagents/vendored/custom-agents.ts +0 -5
- package/.pi/extensions/lib/harness-subagents/vendored/default-agents.ts +0 -123
- package/.pi/extensions/lib/harness-subagents/vendored/env.ts +0 -43
- package/.pi/extensions/lib/harness-subagents/vendored/group-join.ts +0 -144
- package/.pi/extensions/lib/harness-subagents/vendored/index.ts +0 -2494
- package/.pi/extensions/lib/harness-subagents/vendored/invocation-config.ts +0 -52
- package/.pi/extensions/lib/harness-subagents/vendored/memory.ts +0 -182
- package/.pi/extensions/lib/harness-subagents/vendored/model-resolver.ts +0 -92
- package/.pi/extensions/lib/harness-subagents/vendored/output-file.ts +0 -115
- package/.pi/extensions/lib/harness-subagents/vendored/prompts.ts +0 -103
- package/.pi/extensions/lib/harness-subagents/vendored/schedule-store.ts +0 -177
- package/.pi/extensions/lib/harness-subagents/vendored/schedule.ts +0 -416
- package/.pi/extensions/lib/harness-subagents/vendored/settings.ts +0 -210
- package/.pi/extensions/lib/harness-subagents/vendored/skill-loader.ts +0 -108
- package/.pi/extensions/lib/harness-subagents/vendored/types.ts +0 -187
- package/.pi/extensions/lib/harness-subagents/vendored/ui/agent-widget.ts +0 -639
- package/.pi/extensions/lib/harness-subagents/vendored/ui/conversation-viewer.ts +0 -324
- package/.pi/extensions/lib/harness-subagents/vendored/ui/schedule-menu.ts +0 -110
- package/.pi/extensions/lib/harness-subagents/vendored/usage.ts +0 -71
- package/.pi/extensions/lib/harness-subagents/vendored/worktree.ts +0 -195
- /package/.pi/extensions/{00-ultimate-pi-system-prompt.ts → custom-system-prompt.ts} +0 -0
@@ -46,7 +46,17 @@
   "minItems": 2,
   "items": {
     "type": "string",
-    "enum": [
+    "enum": [
+      "EvaluatorAgent",
+      "AdversaryAgent",
+      "TieBreakerAgent",
+      "PlanEvaluatorAgent",
+      "PlanAdversaryAgent",
+      "HypothesisValidatorAgent",
+      "SprintContractAuditorAgent",
+      "ReviewIntegratorAgent",
+      "StackResearchAgent"
+    ]
   }
 },
 "claims": {
@@ -80,7 +90,7 @@
   "additionalProperties": {
     "type": "integer",
     "minimum": 0,
-    "maximum":
+    "maximum": 35000
   }
 },
 "round_total": {
@@ -101,19 +111,16 @@
   "properties": {
     "name": {
       "type": "string",
-      "
+      "enum": ["aggressive", "plan"]
     },
     "max_rounds": {
-      "type": "integer"
-      "const": 6
+      "type": "integer"
     },
     "round_token_cap": {
-      "type": "integer"
-      "const": 2500
+      "type": "integer"
     },
     "debate_global_cap": {
-      "type": "integer"
-      "const": 35000
+      "type": "integer"
     }
   }
 },
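The schema hunks above replace the pinned `const` values for `max_rounds`, `round_token_cap`, and `debate_global_cap` with plain integers and restrict `name` to `aggressive` or `plan`. A minimal TypeScript sketch of a check with the same shape follows. The names `DebateBudgetSketch` and `isDebateBudgetSketch` are hypothetical and not part of the package, the sketch assumes all four fields are required, and the `plan` numbers are invented for illustration.

```typescript
// Hypothetical shape mirroring the four properties visible in the hunk.
interface DebateBudgetSketch {
  name: "aggressive" | "plan";
  max_rounds: number;
  round_token_cap: number;
  debate_global_cap: number;
}

const ALLOWED_NAMES: readonly string[] = ["aggressive", "plan"];
const INTEGER_FIELDS = ["max_rounds", "round_token_cap", "debate_global_cap"];

// True for finite numbers without a fractional part.
function isWholeNumber(value: unknown): boolean {
  return typeof value === "number" && isFinite(value) && Math.floor(value) === value;
}

// Assumption: every field is required; the diff does not show a `required` list.
function isDebateBudgetSketch(value: unknown): value is DebateBudgetSketch {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const nameOk =
    typeof record.name === "string" && ALLOWED_NAMES.indexOf(record.name) !== -1;
  return nameOk && INTEGER_FIELDS.every((key) => isWholeNumber(record[key]));
}

// The numbers that the old schema pinned with `const`; the name is an assumption.
const formerlyPinned = {
  name: "aggressive",
  max_rounds: 6,
  round_token_cap: 2500,
  debate_global_cap: 35000,
};

// Invented numbers that only the relaxed schema could accept.
const planBudget = {
  name: "plan",
  max_rounds: 4,
  round_token_cap: 3000,
  debate_global_cap: 20000,
};

console.log(isDebateBudgetSketch(formerlyPinned), isDebateBudgetSketch(planBudget));
```

Dropping `const` moves the concrete budget numbers out of the schema, so whatever supplies them (not shown in this diff) becomes the place where limits are enforced.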
@@ -0,0 +1,38 @@
+/** Shared debate bus participant types (plan + post-execute). */
+
+export type PostExecuteDebateParticipant =
+  | "EvaluatorAgent"
+  | "AdversaryAgent"
+  | "TieBreakerAgent";
+
+export type PlanDebateParticipant =
+  | "PlanEvaluatorAgent"
+  | "PlanAdversaryAgent"
+  | "HypothesisValidatorAgent"
+  | "SprintContractAuditorAgent"
+  | "ReviewIntegratorAgent"
+  | "StackResearchAgent";
+
+export type DebateParticipant =
+  | PostExecuteDebateParticipant
+  | PlanDebateParticipant;
+
+export const PLAN_DEBATE_PARTICIPANTS: PlanDebateParticipant[] = [
+  "PlanEvaluatorAgent",
+  "PlanAdversaryAgent",
+  "HypothesisValidatorAgent",
+  "SprintContractAuditorAgent",
+  "ReviewIntegratorAgent",
+  "StackResearchAgent",
+];
+
+export const POST_EXECUTE_DEBATE_PARTICIPANTS: PostExecuteDebateParticipant[] =
+  ["EvaluatorAgent", "AdversaryAgent", "TieBreakerAgent"];
+
+export function isPlanDebateId(debateId: string): boolean {
+  return debateId.startsWith("plan-");
+}
+
+export function debatePhaseFromId(debateId: string): "plan" | "post_execute" {
+  return isPlanDebateId(debateId) ? "plan" : "post_execute";
+}
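The phase of a debate is derived purely from the `plan-` prefix of its id. The sketch below re-declares the two functions from the new file so it runs on its own, then adds a hypothetical helper, `participantsFor`, that is not in the package. The debate ids `plan-r1`, `exec-42`, and `planning` are invented.

```typescript
// Re-declared so the sketch is self-contained; bodies match the added file.
function isPlanDebateId(debateId: string): boolean {
  return debateId.startsWith("plan-");
}

function debatePhaseFromId(debateId: string): "plan" | "post_execute" {
  return isPlanDebateId(debateId) ? "plan" : "post_execute";
}

const PLAN_PARTICIPANTS: readonly string[] = [
  "PlanEvaluatorAgent",
  "PlanAdversaryAgent",
  "HypothesisValidatorAgent",
  "SprintContractAuditorAgent",
  "ReviewIntegratorAgent",
  "StackResearchAgent",
];

const POST_EXECUTE_PARTICIPANTS: readonly string[] = [
  "EvaluatorAgent",
  "AdversaryAgent",
  "TieBreakerAgent",
];

// Hypothetical helper (not in the package): choose the participant list by phase.
function participantsFor(debateId: string): readonly string[] {
  return debatePhaseFromId(debateId) === "plan"
    ? PLAN_PARTICIPANTS
    : POST_EXECUTE_PARTICIPANTS;
}

console.log(debatePhaseFromId("plan-r1"), debatePhaseFromId("exec-42"));
console.log(participantsFor("planning").length);
```

Because the check is a prefix match on `plan-` including the hyphen, an id such as `planning` falls through to `post_execute`.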
@@ -0,0 +1,81 @@
+/**
+ * Shared agent discovery helpers (manifest + tests).
+ */
+
+import { createHash } from "node:crypto";
+import { existsSync, readdirSync, readFileSync } from "node:fs";
+import { join, relative } from "node:path";
+
+export function isSafeAgentId(id) {
+  if (!id || id.includes("..") || id.startsWith("/") || id.includes("\\")) {
+    return false;
+  }
+  return /^[a-zA-Z0-9][a-zA-Z0-9/_-]*$/.test(id);
+}
+
+export function sha256Content(content) {
+  return createHash("sha256").update(content, "utf8").digest("hex");
+}
+
+export function walkAgentsDir(rootDir, source, out) {
+  if (!existsSync(rootDir)) return;
+  const stack = [rootDir];
+  while (stack.length > 0) {
+    const dir = stack.pop();
+    let entries;
+    try {
+      entries = readdirSync(dir, { withFileTypes: true });
+    } catch {
+      continue;
+    }
+    for (const entry of entries) {
+      const full = join(dir, entry.name);
+      if (entry.isDirectory()) {
+        stack.push(full);
+        continue;
+      }
+      if (!entry.name.endsWith(".md")) continue;
+      const rel = relative(rootDir, full).replace(/\\/g, "/");
+      const id = rel.replace(/\.md$/i, "");
+      if (!isSafeAgentId(id)) continue;
+      let content;
+      try {
+        content = readFileSync(full, "utf-8");
+      } catch {
+        continue;
+      }
+      out.set(id, { id, path: full, source, content });
+    }
+  }
+}
+
+export function discoverFromRoots(packageAgentsDir, projectAgentsDir, globalAgentsDir) {
+  const files = new Map();
+  walkAgentsDir(packageAgentsDir, "package", files);
+  if (globalAgentsDir) walkAgentsDir(globalAgentsDir, "global", files);
+  walkAgentsDir(projectAgentsDir, "project", files);
+  return files;
+}
+
+export function getDriftReport(manifest, onDiskHashes) {
+  const items = [];
+  if (!manifest) {
+    return { ok: false, items: [{ id: "*", kind: "missing_on_disk" }] };
+  }
+  for (const [id, entry] of onDiskHashes) {
+    const expected = manifest.agents[id];
+    if (!expected) {
+      items.push({ id, kind: "missing_in_manifest" });
+      continue;
+    }
+    if (expected.sha256 !== entry.sha256) {
+      items.push({ id, kind: "hash_mismatch" });
+    }
+  }
+  for (const id of Object.keys(manifest.agents)) {
+    if (!onDiskHashes.has(id)) {
+      items.push({ id, kind: "missing_on_disk" });
+    }
+  }
+  return { ok: items.length === 0, items };
+}
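To illustrate how the drift report classifies differences, here is a typed TypeScript sketch that follows the logic of `isSafeAgentId` and `getDriftReport` from the added module. The type names (`DriftItem`, `HashEntry`, `ManifestSketch`) are assumptions, since the original file is untyped JavaScript, and the manifest entries and hashes are invented sample data.

```typescript
type DriftKind = "missing_in_manifest" | "hash_mismatch" | "missing_on_disk";

interface DriftItem {
  id: string;
  kind: DriftKind;
}

interface HashEntry {
  sha256: string;
}

// Assumed manifest shape: only the `agents` map is used by the report.
interface ManifestSketch {
  agents: Record<string, HashEntry>;
}

function isSafeAgentId(id: string): boolean {
  if (!id || id.includes("..") || id.startsWith("/") || id.includes("\\")) {
    return false;
  }
  return /^[a-zA-Z0-9][a-zA-Z0-9/_-]*$/.test(id);
}

function getDriftReport(
  manifest: ManifestSketch | null,
  onDiskHashes: Map<string, HashEntry>,
): { ok: boolean; items: DriftItem[] } {
  const items: DriftItem[] = [];
  if (!manifest) {
    return { ok: false, items: [{ id: "*", kind: "missing_on_disk" }] };
  }
  const agents = manifest.agents;
  // Pass 1: files on disk that are unknown to, or differ from, the manifest.
  onDiskHashes.forEach((entry, id) => {
    const expected = agents[id];
    if (!expected) {
      items.push({ id, kind: "missing_in_manifest" });
      return;
    }
    if (expected.sha256 !== entry.sha256) {
      items.push({ id, kind: "hash_mismatch" });
    }
  });
  // Pass 2: manifest entries with no file on disk.
  Object.keys(agents).forEach((id) => {
    if (!onDiskHashes.has(id)) {
      items.push({ id, kind: "missing_on_disk" });
    }
  });
  return { ok: items.length === 0, items };
}

// Invented sample data.
const manifest: ManifestSketch = {
  agents: {
    "harness/executor": { sha256: "aaa" },
    "harness/planner": { sha256: "bbb" },
  },
};
const onDisk = new Map<string, HashEntry>([
  ["harness/executor", { sha256: "zzz" }],
  ["harness/planning/decompose", { sha256: "ccc" }],
]);

const report = getDriftReport(manifest, onDisk);
console.log(report.ok, report.items);
```

With this data the report is not `ok` and lists one item of each kind: a hash mismatch for `harness/executor`, `harness/planning/decompose` missing in the manifest, and `harness/planner` missing on disk.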
@@ -2,12 +2,13 @@
  * harness-run-context — shared types and helpers for active harness runs.
  *
  * Session entry `harness-run-context` is the live source of truth; disk mirrors:
- * - `.pi/harness/runs/<run_id>/run-context.
+ * - `.pi/harness/runs/<run_id>/run-context.yaml`
  * - `.pi/harness/active-run.json` (cross-session pointer)
  */
 
 import { mkdir, readFile, realpath, writeFile } from "node:fs/promises";
 import { isAbsolute, join, relative, resolve } from "node:path";
+import { readYamlFile, writeYamlFile } from "./harness-yaml.js";
 
 export type HarnessPhase =
   | "plan"
@@ -67,6 +68,7 @@ export interface PlanPacketLike {
   risk_level?: string;
   assumptions?: unknown[];
   rollback_plan?: unknown;
+  execution_plan?: unknown;
 }
 
 interface SessionEntryLike {
@@ -107,14 +109,43 @@ export function activeRunPointerPath(projectRoot: string): string {
 }
 
 export function runContextDiskPath(runId: string, projectRoot: string): string {
-  return join(harnessRunsRoot(projectRoot), runId,
+  return join(harnessRunsRoot(projectRoot), runId, RUN_CONTEXT_BASENAME);
 }
 
 export function canonicalPlanPath(runId: string, projectRoot: string): string {
-  return join(harnessRunsRoot(projectRoot), runId,
+  return join(harnessRunsRoot(projectRoot), runId, PLAN_PACKET_BASENAME);
 }
 
-
+export function canonicalResearchBriefPath(
+  runId: string,
+  projectRoot: string,
+): string {
+  return join(harnessRunsRoot(projectRoot), runId, RESEARCH_BRIEF_BASENAME);
+}
+
+export function runArtifactsDir(runId: string, projectRoot: string): string {
+  return join(harnessRunsRoot(projectRoot), runId, "artifacts");
+}
+
+export const PLAN_REVIEW_BASENAME = "plan-review.md";
+
+export function canonicalPlanReviewPath(
+  runId: string,
+  projectRoot: string,
+): string {
+  return join(harnessRunsRoot(projectRoot), runId, PLAN_REVIEW_BASENAME);
+}
+
+export const PLAN_PACKET_BASENAME = "plan-packet.yaml";
+export const RUN_CONTEXT_BASENAME = "run-context.yaml";
+export const RESEARCH_BRIEF_BASENAME = "research-brief.yaml";
+
+const PLAN_RUN_SCOPED_ROOT_FILES = new Set([
+  PLAN_PACKET_BASENAME,
+  RESEARCH_BRIEF_BASENAME,
+  "plan-dag-validation.yaml",
+  PLAN_REVIEW_BASENAME,
+]);
 
 const MUTATING_FILE_TOOLS = new Set(["write", "edit"]);
 
@@ -199,7 +230,21 @@ export function extractWritePathFromToolInput(
   return raw.trim();
 }
 
-/** True when absPath is
+/** True when absPath is a plan-phase artifact under the active run directory. */
+export function isPlanRunScopedRelativePath(rel: string): boolean {
+  if (rel.startsWith("..") || isAbsolute(rel)) return false;
+  const parts = rel.split(/[/\\]/);
+  if (parts.length === 2 && PLAN_RUN_SCOPED_ROOT_FILES.has(parts[1])) {
+    return true;
+  }
+  if (parts.length === 3 && parts[1] === "artifacts") {
+    const file = parts[2];
+    return file.endsWith(".yaml") || file.endsWith(".yml");
+  }
+  return false;
+}
+
+/** True when absPath is a writable plan-run artifact for the active run. */
 export async function isPlanPhaseScopedWrite(
   absPath: string,
   runCtx: HarnessRunContext | null,
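The new `isPlanRunScopedRelativePath` accepts exactly two layouts relative to the runs root: `<run_id>/<one of four root files>` and `<run_id>/artifacts/<file>.yaml` or `.yml`. The sketch below reproduces that logic so the accepted and rejected paths can be tried directly. Two things are assumptions: `looksAbsolute` is a simplified stand-in for `isAbsolute` from `node:path`, and the run id `run-1` and the file names are invented.

```typescript
const PLAN_RUN_SCOPED_ROOT_FILES = new Set<string>([
  "plan-packet.yaml",
  "research-brief.yaml",
  "plan-dag-validation.yaml",
  "plan-review.md",
]);

// Stand-in for node:path isAbsolute (assumption: POSIX roots and drive letters only).
function looksAbsolute(p: string): boolean {
  return p.startsWith("/") || /^[a-zA-Z]:[\\/]/.test(p);
}

function isPlanRunScopedRelativePath(rel: string): boolean {
  if (rel.startsWith("..") || looksAbsolute(rel)) return false;
  const parts = rel.split(/[/\\]/);
  // <run_id>/<root file>
  if (parts.length === 2 && PLAN_RUN_SCOPED_ROOT_FILES.has(parts[1])) {
    return true;
  }
  // <run_id>/artifacts/<name>.yaml or .yml, with no deeper nesting
  if (parts.length === 3 && parts[1] === "artifacts") {
    const file = parts[2];
    return file.endsWith(".yaml") || file.endsWith(".yml");
  }
  return false;
}

const samples = [
  "run-1/plan-packet.yaml",
  "run-1/artifacts/review-round-r1.yaml",
  "run-1/run-context.yaml",
  "run-1/artifacts/notes.md",
  "run-1/artifacts/nested/extra.yaml",
  "../run-1/plan-packet.yaml",
];

for (const sample of samples) {
  console.log(sample, isPlanRunScopedRelativePath(sample));
}
```

Note that `run-context.yaml` is not in the allowed set, and that this function does not compare the first segment against the active run id; in the diff, the caller `isPlanPhaseScopedWrite` performs that check with `parts[0] === runCtx.run_id`.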
@@ -220,11 +265,9 @@ export async function isPlanPhaseScopedWrite(
     runsReal = runsRoot;
   }
   const rel = relative(runsReal, resolved);
-  if (
+  if (!isPlanRunScopedRelativePath(rel)) return false;
   const parts = rel.split(/[/\\]/);
-
-  if (parts[0] !== runCtx.run_id) return false;
-  return isCanonicalPlanPacketPath(resolved, projectRoot, runCtx.run_id);
+  return parts[0] === runCtx.run_id;
 }
 
 export function getLatestHarnessTurn(
@@ -488,19 +531,6 @@ export async function isPlanPhaseAllowedMutation(
         'policy-gate: no active harness run. Run /harness-plan "<task>" first.',
     };
   }
-  if (
-    !hasPlanUserApproval(opts.entries, {
-      sincePlanCommand: true,
-      planId: runCtx.plan_id,
-    })
-  ) {
-    return {
-      allowed: false,
-      isScopedPlanWrite: true,
-      reason:
-        "policy-gate: plan-packet.json write blocked until the user approves via approve_plan or ask_user (present the full plan, then Approve).",
-    };
-  }
   if (opts.aborted) {
     return { allowed: true, isScopedPlanWrite: true };
   }
@@ -513,7 +543,7 @@ export async function isPlanPhaseAllowedMutation(
     return {
       allowed: false,
       isScopedPlanWrite: true,
-      reason: `harness-run-context: plan-packet.
+      reason: `harness-run-context: plan-packet.yaml is read-only in phase '${phase}'.`,
     };
   }
 
@@ -521,7 +551,7 @@ export async function isPlanPhaseAllowedMutation(
     return {
       allowed: false,
       reason:
-        "policy-gate: mutating tool blocked because harness-abort lock is active. Attach a new approved plan via plan-packet.
+        "policy-gate: mutating tool blocked because harness-abort lock is active. Attach a new approved plan via plan-packet.yaml first.",
     };
   }
 
@@ -539,7 +569,7 @@ export async function isPlanPhaseAllowedMutation(
 
   const allowedPath = runCtx?.run_id
     ? canonicalPlanPath(runCtx.run_id, projectRoot)
-    :
+    : `.pi/harness/runs/<run_id>/${PLAN_PACKET_BASENAME}`;
   return {
     allowed: false,
     reason: `policy-gate: ${toolName} blocked in phase '${phase}'. In plan phase only ${allowedPath} is writable after ask_user approval.`,
@@ -745,8 +775,11 @@ export async function loadRunContextFromDisk(
   projectRoot: string,
 ): Promise<HarnessRunContext | null> {
   try {
-    const
-
+    const doc = await readYamlFile(
+      runContextDiskPath(runId, projectRoot),
+      "run-context",
+    );
+    return normalizeRunContext(doc as Partial<HarnessRunContext>);
   } catch {
     return null;
   }
@@ -757,11 +790,7 @@ export async function saveRunContextToDisk(
 ): Promise<void> {
   const dir = join(harnessRunsRoot(ctx.project_root), ctx.run_id);
   await mkdir(dir, { recursive: true });
-  await
-    runContextDiskPath(ctx.run_id, ctx.project_root),
-    `${JSON.stringify(ctx, null, 2)}\n`,
-    "utf-8",
-  );
+  await writeYamlFile(runContextDiskPath(ctx.run_id, ctx.project_root), ctx);
 }
 
 export async function loadProjectActiveRun(
@@ -819,8 +848,8 @@ export async function readPlanPacketFromPath(
   planPath: string,
 ): Promise<PlanPacketLike | null> {
   try {
-    const
-    return
+    const doc = await readYamlFile(planPath, planPath);
+    return doc as PlanPacketLike;
   } catch {
     return null;
   }
@@ -835,8 +864,11 @@ export function validatePlanPacket(packet: PlanPacketLike | null): {
   const errors: string[] = [];
   if (packet.schema_version !== "1.0.0")
     errors.push("schema_version must be 1.0.0");
-  if (
-
+  if (
+    packet.contract_version !== "1.0.0" &&
+    packet.contract_version !== "1.1.0"
+  )
+    errors.push("contract_version must be 1.0.0 or 1.1.0");
   if (!packet.plan_id || typeof packet.plan_id !== "string")
     errors.push("plan_id required");
   if (!packet.task_id || typeof packet.task_id !== "string")
@@ -850,6 +882,9 @@ export function validatePlanPacket(packet: PlanPacketLike | null): {
     errors.push("acceptance_checks required");
   if (!packet.risk_level) errors.push("risk_level required");
   if (!packet.rollback_plan) errors.push("rollback_plan required");
+  if (packet.contract_version === "1.1.0" && !packet.execution_plan) {
+    errors.push("execution_plan required for contract_version 1.1.0");
+  }
   return { valid: errors.length === 0, errors };
 }
 
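The two `validatePlanPacket` hunks add a second accepted `contract_version` and make `execution_plan` mandatory only for `1.1.0`. The sketch below isolates just those two checks; `contractVersionErrors` and `PacketSketch` are hypothetical names, and the other checks in the real function (`schema_version`, `plan_id`, and so on) are deliberately left out.

```typescript
// Hypothetical subset of PlanPacketLike with only the fields used here.
interface PacketSketch {
  contract_version?: string;
  execution_plan?: unknown;
}

// Returns the error strings that the two new checks in the diff would produce.
function contractVersionErrors(packet: PacketSketch): string[] {
  const errors: string[] = [];
  if (
    packet.contract_version !== "1.0.0" &&
    packet.contract_version !== "1.1.0"
  ) {
    errors.push("contract_version must be 1.0.0 or 1.1.0");
  }
  if (packet.contract_version === "1.1.0" && !packet.execution_plan) {
    errors.push("execution_plan required for contract_version 1.1.0");
  }
  return errors;
}

console.log(contractVersionErrors({ contract_version: "1.0.0" }));
console.log(contractVersionErrors({ contract_version: "1.1.0" }));
console.log(contractVersionErrors({ contract_version: "1.1.0", execution_plan: {} }));
```

A `1.0.0` packet passes without an execution plan, while a `1.1.0` packet needs a truthy `execution_plan`. The check does not inspect the plan's contents, which presumably is left to the separate `plan-execution-plan.schema.json` listed in the file summary.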
@@ -910,6 +945,9 @@ export function formatPlanContextBlock(
   ];
   if (ctx.plan_packet_path) {
     lines.push(`plan_packet_path=${ctx.plan_packet_path}`);
+    lines.push(
+      `plan_review_path=${canonicalPlanReviewPath(ctx.run_id, ctx.project_root)}`,
+    );
   }
   if (ctx.task_summary) {
     lines.push(`task_summary=${ctx.task_summary}`);
@@ -945,7 +983,7 @@ export function formatActivePlanBlock(
     );
   } else {
     lines.push(
-      "Plan is read-only in this phase. Do not edit plan-packet.
+      "Plan is read-only in this phase. Do not edit plan-packet.yaml.",
     );
   }
   if (ctx.plan_packet_path) {
@@ -1004,7 +1042,7 @@ export function validatePlanOverridePath(
   if (!isCanonicalPlanPacketPath(absPlan, projectRoot, runId)) {
     return {
       ok: false,
-      reason: `--plan must be runs/${runId}
+      reason: `--plan must be runs/${runId}/${PLAN_PACKET_BASENAME} (canonical plan packet only)`,
     };
   }
   return { ok: true };
@@ -0,0 +1,73 @@
+/**
+ * YAML read/write for harness plan artifacts (no JSON plan fallbacks).
+ */
+
+import { readFile, rename, writeFile } from "node:fs/promises";
+import { parse, stringify } from "yaml";
+
+const CODE_FENCE_RE = /^```(?:ya?ml|json)?\s*\n?([\s\S]*?)```\s*$/im;
+
+export function stripYamlFences(text) {
+  return stripCodeFences(text);
+}
+
+export function stripCodeFences(text) {
+  const trimmed = text.trim();
+  const m = CODE_FENCE_RE.exec(trimmed);
+  return m ? m[1].trim() : trimmed;
+}
+
+export function parseStructuredDocument(text, label = "document") {
+  const body = stripCodeFences(text);
+  if (!body.trim()) {
+    throw new Error(`${label}: empty document`);
+  }
+
+  try {
+    const yamlDoc = parse(body, { uniqueKeys: true });
+    if (yamlDoc !== null && yamlDoc !== undefined) {
+      return yamlDoc;
+    }
+  } catch {
+    /* try JSON below */
+  }
+
+  const trimmed = body.trim();
+  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
+    try {
+      return JSON.parse(trimmed);
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err);
+      throw new Error(`${label}: JSON parse failed — ${msg}`);
+    }
+  }
+
+  throw new Error(
+    `${label}: not valid YAML or JSON (use write_harness_yaml with a schema-shaped object)`,
+  );
+}
+
+export function parseYaml(text, label = "yaml") {
+  return parseStructuredDocument(text, label);
+}
+
+export async function readYamlFile(path, label) {
+  const raw = await readFile(path, "utf-8");
+  return parseStructuredDocument(raw, label ?? path);
+}
+
+export async function writeYamlFile(path, data) {
+  const tmp = `${path}.tmp`;
+  const content = `${stringify(data, { indent: 2 })}\n`;
+  await writeFile(tmp, content, "utf-8");
+  await rename(tmp, path);
+}
+
+export function stringifyYaml(data) {
+  return `${stringify(data, { indent: 2 })}\n`;
+}
+
+export function normalizeHarnessYamlContent(text, label = "yaml") {
+  const doc = parseStructuredDocument(text, label);
+  return stringifyYaml(doc);
+}
@@ -0,0 +1,90 @@
+/**
+ * YAML read/write for harness plan artifacts (no JSON plan fallbacks).
+ */
+
+import { readFile, rename, writeFile } from "node:fs/promises";
+import { parse, stringify } from "yaml";
+
+const CODE_FENCE_RE = /^```(?:ya?ml|json)?\s*\n?([\s\S]*?)```\s*$/im;
+
+/** @deprecated Use stripCodeFences */
+export function stripYamlFences(text: string): string {
+  return stripCodeFences(text);
+}
+
+export function stripCodeFences(text: string): string {
+  const trimmed = text.trim();
+  const m = CODE_FENCE_RE.exec(trimmed);
+  return m ? m[1].trim() : trimmed;
+}
+
+/**
+ * Parse agent output or file body: fenced YAML/JSON, raw YAML, or raw JSON object/array.
+ */
+export function parseStructuredDocument(
+  text: string,
+  label = "document",
+): unknown {
+  const body = stripCodeFences(text);
+  if (!body.trim()) {
+    throw new Error(`${label}: empty document`);
+  }
+
+  try {
+    const yamlDoc = parse(body, { uniqueKeys: true });
+    if (yamlDoc !== null && yamlDoc !== undefined) {
+      return yamlDoc;
+    }
+  } catch {
+    /* try JSON below */
+  }
+
+  const trimmed = body.trim();
+  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
+    try {
+      return JSON.parse(trimmed) as unknown;
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err);
+      throw new Error(`${label}: JSON parse failed — ${msg}`);
+    }
+  }
+
+  throw new Error(
+    `${label}: not valid YAML or JSON (use write_harness_yaml with a schema-shaped object)`,
+  );
+}
+
+export function parseYaml(text: string, label = "yaml"): unknown {
+  return parseStructuredDocument(text, label);
+}
+
+export async function readYamlFile(
+  path: string,
+  label?: string,
+): Promise<unknown> {
+  const raw = await readFile(path, "utf-8");
+  return parseStructuredDocument(raw, label ?? path);
+}
+
+export async function writeYamlFile(
+  path: string,
+  data: unknown,
+): Promise<void> {
+  const tmp = `${path}.tmp`;
+  const content = `${stringify(data, { indent: 2 })}\n`;
+  await writeFile(tmp, content, "utf-8");
+  await rename(tmp, path);
+}
+
+export function stringifyYaml(data: unknown): string {
+  return `${stringify(data, { indent: 2 })}\n`;
+}
+
+/** Normalize arbitrary agent text to canonical YAML file bytes. */
+export function normalizeHarnessYamlContent(
+  text: string,
+  label = "yaml",
+): string {
+  const doc = parseStructuredDocument(text, label);
+  return stringifyYaml(doc);
+}
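The fence-stripping step is the part of this module that can be tried without the `yaml` dependency. The sketch below reproduces `stripCodeFences` with the same pattern and then applies only the JSON step of `parseStructuredDocument`; the YAML attempt that comes first in the real function is omitted. Building the three-backtick marker with `repeat` is a choice made here so the example can sit inside a fenced block (the original uses a regex literal), and the `run_id` payload is invented.

```typescript
// Three backticks, built at runtime (assumption of this sketch, see lead-in).
const FENCE = "`".repeat(3);

// Same pattern as CODE_FENCE_RE in the diff: optional yaml/yml/json tag,
// lazily captured body, closing fence at the end of a line.
const CODE_FENCE_RE = new RegExp(
  "^" + FENCE + "(?:ya?ml|json)?\\s*\\n?([\\s\\S]*?)" + FENCE + "\\s*$",
  "im",
);

function stripCodeFences(text: string): string {
  const trimmed = text.trim();
  const m = CODE_FENCE_RE.exec(trimmed);
  return m ? m[1].trim() : trimmed;
}

// Agent-style output: a JSON object wrapped in a fenced block.
const fenced = FENCE + "json\n{\"run_id\": \"demo\"}\n" + FENCE;
const body = stripCodeFences(fenced);

// JSON step only; the real function tries the YAML parser before this.
const parsed = JSON.parse(body) as { run_id: string };

console.log(body);
console.log(parsed.run_id);
console.log(stripCodeFences("  key: value  "));
```

Input without a fence is returned trimmed and otherwise unchanged, which is why raw YAML files and fenced agent output can share one entry point.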
@@ -5,7 +5,7 @@ argument-hint: "\"<task>\" [--quick] [--risk low|med|high] [--budget <amount>]"
 
 # harness-auto
 
-Pipeline orchestrator — one session, sequential
+Pipeline orchestrator — one session, sequential phase handoffs. Invoke **harness-orchestration** skill for agent IDs. Do **not** implement or review inline.
 
 ## Step 0 — Parse arguments
 
@@ -18,20 +18,22 @@ If task missing:
 
 ## Orchestration (required) — same session
 
-
-
-
-
-
-
+Follow **harness-plan** performance rules (`subagent` with parallel `tasks`, `agentScope: "both"`).
+
+1. **Plan** — follow `/harness-plan` (parallel scouts → parallel decompose/hypothesis → draft PlanPacket → debate rounds → parent `approve_plan` + `create_plan`). No second approval pass.
+2. **Execute** — `subagent({ agent: "harness/executor", task: "<HarnessSpawnContext mode execute>" })`; summarize handoff bullets only (do not paste full subprocess log).
+3. **Eval** — `subagent({ agent: "harness/evaluator", task: "<mode benchmark>" })` after parent scripts if needed.
+4. **Review** — `subagent({ agent: "harness/evaluator", task: "<mode verdict>" })` when strict gates require.
+5. **Adversary** — `subagent({ agent: "harness/adversary", ... })`. **Skip when `--quick`**.
+6. **Tie-breaker** — `subagent({ agent: "harness/tie-breaker", ... })` only if debate unresolved and **not** `--quick`.
 7. **Parent** — apply locked strict gates below; commit/PR only if all pass.
 
-
+Review agents run in isolated subprocesses via `subagent` (same parent session).
 
 ## Locked decisions (do not change)
 
 - Always produce and approve plan before mutation.
-- Adversarial review always required.
+- Adversarial review always required **except** `--quick` (evaluator-only gate).
 - Severity-policy-engine blocks merge.
 - Router tuning propose-and-approve only.
 - Plan ambiguity → parent `ask_user` (harness-decisions).
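The seven orchestration steps in this hunk amount to a fixed phase order with three conditional phases. The following is a rough sketch of that ordering, not the harness's actual code: `Phase`, `AutoOptions`, `selectPhases`, and the two boolean inputs other than `quick` are hypothetical names introduced for illustration.

```typescript
type Phase =
  | "plan"
  | "execute"
  | "eval"
  | "review"
  | "adversary"
  | "tie-breaker"
  | "parent-gates";

interface AutoOptions {
  quick: boolean; // the --quick flag
  strictReview: boolean; // assumption: strict gates require an evaluator verdict
  debateUnresolved: boolean; // assumption: the debate ended without agreement
}

// Return the phases in the order the hunk lists them, dropping the ones
// that the text says to skip.
function selectPhases(opts: AutoOptions): Phase[] {
  const phases: Phase[] = ["plan", "execute", "eval"];
  if (opts.strictReview) phases.push("review");
  if (!opts.quick) {
    phases.push("adversary");
    // Tie-breaker runs only when the debate is unresolved and --quick is off.
    if (opts.debateUnresolved) phases.push("tie-breaker");
  }
  phases.push("parent-gates"); // the parent always applies the strict gates last
  return phases;
}
```

With `quick: true`, the adversary and tie-breaker phases are dropped while plan, execute, eval, and the parent gate step remain, which matches the "evaluator-only gate" wording in the locked decisions.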
@@ -41,11 +43,11 @@ No new Pi session for review — subagents use isolated context (`inherit_contex
 
 ## Strict gates
 
-Block commit/PR if any fails: plan gate, execution in scope, evaluator pass, adversary complete, severity-policy pass/conditional_pass, benchmark deltas, rollback artifacts.
+Block commit/PR if any fails: plan gate, execution in scope, evaluator pass, adversary complete (unless `--quick`), severity-policy pass/conditional_pass, benchmark deltas, rollback artifacts.
 
 ## Notes
 
-- `--quick` reduces breadth, never safety gates.
+- `--quick` reduces breadth (skips semantic scout, post-run adversary, tie-breaker), never core safety gates on plan approval or evaluator.
 - High risk/ambiguity → stop and recommend manual `/harness-plan` with `ask_user`.
 - Interrupt: `/harness-abort [reason]` then `/harness-plan`.
 - Artifact refs under active run dir; `/harness-run-status` or `/harness-trace-last` for handoff.
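The revised strict-gates sentence can be read as a checklist in which only the adversary gate depends on `--quick`. The sketch below shows one way to express it. The `GateResults` shape and the `failedGates` helper are hypothetical; only the gate names and the `pass` / `conditional_pass` values come from the text.

```typescript
interface GateResults {
  planGate: boolean;
  executionInScope: boolean;
  evaluatorPass: boolean;
  adversaryComplete: boolean;
  severityPolicy: string; // "pass" and "conditional_pass" are the accepted values
  benchmarkDeltas: boolean;
  rollbackArtifacts: boolean;
}

// List every gate that blocks commit/PR. An empty list means all gates passed.
function failedGates(r: GateResults, quick: boolean): string[] {
  const failed: string[] = [];
  if (!r.planGate) failed.push("plan gate");
  if (!r.executionInScope) failed.push("execution in scope");
  if (!r.evaluatorPass) failed.push("evaluator pass");
  // The adversary gate is the only one waived by --quick.
  if (!quick && !r.adversaryComplete) failed.push("adversary complete");
  if (r.severityPolicy !== "pass" && r.severityPolicy !== "conditional_pass") {
    failed.push("severity-policy");
  }
  if (!r.benchmarkDeltas) failed.push("benchmark deltas");
  if (!r.rollbackArtifacts) failed.push("rollback artifacts");
  return failed;
}
```

Returning the list of failures rather than a single boolean lets the parent report which gate blocked the commit.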
@@ -20,10 +20,10 @@ Happy path: omit `--run`.
 2. Spawn:
 
 ```
-
+subagent({ agentScope: "both", agent: "harness/adversary", task: "…" })
 ```
 
-3. `
+3. Parse `AdversaryReport` JSON from tool result; parent persists for severity policy.
 
 ## Parent rules
 
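Step 3 in this hunk has the parent parse a JSON report out of the subagent's tool result. The diff does not show how that result is laid out, so the sketch below assumes the text contains either a fenced `json` block or a bare JSON object somewhere in the prose. `extractJson` is a hypothetical helper, and no `AdversaryReport` schema is implied.

```typescript
// Build the fence marker at runtime so this example can itself sit in a fenced block.
const FENCE = "`".repeat(3);
const FENCED_JSON = new RegExp(`${FENCE}(?:json)?\\s*([\\s\\S]*?)${FENCE}`);

// Pull a JSON value out of free-form tool output (assumed layout, see lead-in).
function extractJson(toolResult: string, label = "tool result"): unknown {
  let candidate = FENCED_JSON.exec(toolResult)?.[1];
  if (candidate === undefined) {
    // No fenced block: fall back to the outermost {...} span in the text.
    const start = toolResult.indexOf("{");
    const end = toolResult.lastIndexOf("}");
    if (start === -1 || end < start) {
      throw new Error(`${label}: no JSON object found`);
    }
    candidate = toolResult.slice(start, end + 1);
  }
  try {
    return JSON.parse(candidate);
  } catch (err) {
    const msg = err instanceof Error ? err.message : String(err);
    throw new Error(`${label}: JSON parse failed: ${msg}`);
  }
}
```

The function returns `unknown`, so the caller still has to validate the shape before persisting it for the severity policy.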
@@ -26,11 +26,11 @@ If no active run:
 4. Spawn:
 
 ```
-
+subagent({ agentScope: "both", agent: "harness/evaluator", task: "<HarnessSpawnContext + eval brief>" })
 ```
 
-5.
-6. Do not edit `plan-packet.
+5. Parse eval JSON from tool result; parent writes structured artifacts under run dir.
+6. Do not edit `plan-packet.yaml`.
 
 ## Parent rules
 
@@ -22,10 +22,10 @@ If `--trigger` missing:
 2. Spawn:
 
 ```
-
+subagent({ agentScope: "both", agent: "harness/incident-recorder", task: "…" })
 ```
 
-3. `
+3. Parse `IncidentRecord` JSON from tool result; parent writes under `.pi/harness/incidents/`.
 
 ## Completion
 