clean-room-skill 0.2.0 → 0.2.2
- package/.claude-plugin/marketplace.json +1 -1
- package/.claude-plugin/plugin.json +1 -1
- package/.codex-plugin/plugin.json +1 -1
- package/README.md +20 -2
- package/agents/clean-polish-reviewer.md +5 -3
- package/docs/ARCHITECTURE.md +1 -1
- package/docs/HOOKS.md +1 -0
- package/docs/REFERENCE.md +12 -3
- package/hooks/check-artifact-leakage.py +1 -0
- package/lib/bootstrap.cjs +2 -1
- package/lib/install-options.cjs +1 -0
- package/lib/run-claude-agent-runtime.cjs +2 -0
- package/lib/run-controller.cjs +21 -0
- package/lib/run-polish-commit.cjs +259 -0
- package/lib/run-results.cjs +23 -11
- package/lib/runtime-layout.cjs +7 -0
- package/package.json +1 -1
- package/plugin.json +1 -1
- package/skills/clean-room/SKILL.md +3 -3
- package/skills/clean-room/assets/polish-report.schema.json +84 -1
- package/skills/clean-room/examples/minimal-spec-package/implementation-report.json +24 -7
- package/skills/clean-room/examples/minimal-spec-package/polish-report.json +2 -0
- package/skills/clean-room/references/PROCESS.md +4 -2
- package/skills/clean-room/references/SPEC-SCHEMA.md +1 -1
- package/skills/init/SKILL.md +1 -1
- package/skills/preflight/SKILL.md +2 -2
- package/skills/refocus/SKILL.md +1 -1
- package/skills/resume-cr/SKILL.md +1 -1
- package/skills/unattended/SKILL.md +1 -1
package/README.md
CHANGED
|
@@ -31,7 +31,19 @@ For the full boundary model, see [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md). F
|
|
|
31
31
|
|
|
32
32
|
Requires Node.js `>=22`.
|
|
33
33
|
|
|
34
|
-
|
|
34
|
+
You can either install the CLI globally on your system, or run the commands on-demand using `npx`.
|
|
35
|
+
|
|
36
|
+
### Global Installation (npm)
|
|
37
|
+
|
|
38
|
+
To install the `clean-room-skill` executable globally:
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
npm install -g clean-room-skill
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
### Direct On-Demand Execution (npx)
|
|
45
|
+
|
|
46
|
+
Preferred interactive install/onboarding flow:
|
|
35
47
|
|
|
36
48
|
```bash
|
|
37
49
|
npx clean-room-skill@latest
|
|
@@ -42,6 +54,7 @@ Non-interactive installs:
|
|
|
42
54
|
```bash
|
|
43
55
|
npx clean-room-skill@latest --codex --global --yes
|
|
44
56
|
npx clean-room-skill@latest --claude --global --yes
|
|
57
|
+
npx clean-room-skill@latest --pi --global --yes
|
|
45
58
|
npx clean-room-skill@latest --all --global --yes
|
|
46
59
|
```
|
|
47
60
|
|
|
@@ -77,9 +90,14 @@ Pi:
|
|
|
77
90
|
```bash
|
|
78
91
|
pi install npm:clean-room-skill@latest
|
|
79
92
|
pi install https://github.com/whit3rabbit/clean-room-skill
|
|
93
|
+
npx clean-room-skill@latest --pi --global --yes
|
|
80
94
|
```
|
|
81
95
|
|
|
82
|
-
Pi
|
|
96
|
+
Pi-native package install is preferred. This package declares `pi.skills: ["./skills"]`, so `pi install npm:clean-room-skill@latest` lets Pi discover the bundled `SKILL.md` entry points directly. Use the `npx ... --pi` installer only when you want this repo's compatibility installer to manage the same files alongside other runtimes. Global Pi compatibility installs target `~/.pi/agent`; local installs target `.pi`.
|
|
97
|
+
|
|
98
|
+
Both Pi install paths load bundled skills as `/skill:<name>`, for example `/skill:clean-room`. Pi installs do not currently register clean-room hooks. Installer-managed Pi layouts copy the hook scripts to `hooks/clean-room/` for inspection and future bridge work, but those files are not active enforcement in Pi.
|
|
99
|
+
|
|
100
|
+
Pi hook enforcement would need a Pi extension, not a `settings.json` edit. Pi extensions can subscribe to tool events such as `tool_call` and `tool_result`, block or mutate tool calls, and are declared with `pi.extensions` in `package.json`; see the [Pi extension docs](https://pi.dev/docs/latest/extensions). This package does not ship that extension yet, so clean-room safety in Pi still depends on role separation, path isolation, schema validation, and any supported hook runtime used for enforcement.
|
|
83
101
|
|
|
84
102
|
## How To Run
|
|
85
103
|
|
|
package/agents/clean-polish-reviewer.md
CHANGED
|
@@ -15,7 +15,7 @@ Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMP
|
|
|
15
15
|
|
|
16
16
|
Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-polish-reviewer`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.
|
|
17
17
|
|
|
18
|
-
This default profile has no shell-style tools. If final verification or commit is required, use an isolated polish profile where strict hooks are installed, `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1` is intentional, and the only allowed terminal command invokes the installed `agent4-polish-runner.py` from an implementation root. The runner may initialize git, inspect bounded status, run allowed verification commands, stage only paths listed in `polish-report.json`, and create one local commit. Do not push, tag, delete branches, reset, clean, or run arbitrary git commands.
|
|
18
|
+
This default profile has no shell-style tools. If final verification or commit is required, use an isolated polish profile where strict hooks are installed, `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1` is intentional, and the only allowed terminal command invokes the installed `agent4-polish-runner.py` from an implementation root. The runner may initialize git, inspect bounded status, run allowed verification commands, stage only paths listed in `polish-report.json` `git.include_paths`, and create one local commit. Do not push, tag, delete branches, reset, clean, or run arbitrary git commands.
|
|
19
19
|
|
|
20
20
|
## Required Handoff Inputs
|
|
21
21
|
|
|
@@ -37,8 +37,10 @@ Responsibilities:
|
|
|
37
37
|
- Update implementation-root `.gitignore` only for real generated outputs, dependency folders, local caches, or build/test artifacts relevant to the clean implementation stack.
|
|
38
38
|
- Do not add speculative ignores, speculative docs, broad refactors, new dependencies, or new behavior.
|
|
39
39
|
- Re-run relevant verification through `agent4-polish-runner.py` only when shell verification is enabled for this role.
|
|
40
|
-
- Record findings, changed relative paths, verification results, residual risks, git status, commit message, commit hash/status, and abstract delta tickets in `polish-report.json`.
|
|
41
|
-
-
|
|
40
|
+
- Record findings, Agent 4 changed relative paths, verification results, residual risks, git status, commit message, commit hash/status, and abstract delta tickets in `polish-report.json`.
|
|
41
|
+
- Set `git.include_paths` to the union of terminal `implementation-report.json` `changed_paths` and Agent 4 `polish-report.json` `changed_paths`; do not include unreported dirty files.
|
|
42
|
+
- When the controller must create the commit, write a pre-commit report with `final_status: "blocked"`, `git.commit_required: true`, and `git.commit_status: "not-run"`.
|
|
43
|
+
- Mark `final_status` as `passed` only when high/blocker security, correctness, exception, resource, race, leakage, and verification findings are resolved and either the constrained local commit succeeded or clean-run-context explicitly disables Agent 4 commits with `git.commit_status: "not-needed"`.
|
|
42
44
|
- Convert major behavior gaps or scope expansion into abstract delta tickets instead of implementing new scope.
|
|
43
45
|
|
|
44
46
|
If contamination is found, mark `polish-report.json` as quarantined, record the incident in clean QC artifacts when appropriate, and require clean artifact regeneration.
|
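The reviewer rules above ask for `git.include_paths` to be the union of the terminal `implementation-report.json` `changed_paths` and the Agent 4 `polish-report.json` `changed_paths`, and for a pre-commit report with `final_status: "blocked"`, `git.commit_required: true`, and `git.commit_status: "not-run"` when the controller creates the commit. The following is a minimal sketch of assembling that block, not the package's implementation. The helper name `buildPreCommitGit`, the sample paths, and the `created`/`modified` action values are assumptions made for illustration; only the `unchanged` action appears in the diffed code.

```javascript
'use strict';

// Hypothetical helper, not part of the package. It assumes every
// changed_paths entry is an object with a string `path` and an
// optional `action` field.
function buildPreCommitGit(implementationReport, polishReport) {
  const paths = new Set();
  for (const entry of implementationReport.changed_paths || []) {
    paths.add(entry.path);
  }
  for (const entry of polishReport.changed_paths || []) {
    // Entries marked unchanged were not modified by Agent 4, so skip them.
    if (entry.action !== 'unchanged') paths.add(entry.path);
  }
  return {
    commit_required: true,
    commit_status: 'not-run',
    include_paths: [...paths].sort(),
  };
}

// Sample reports (assumed shapes and paths).
const implementationReport = {
  changed_paths: [
    { path: 'src/parser.js', action: 'created' },
    { path: 'README.md', action: 'modified' },
  ],
};
const polishReport = {
  final_status: 'blocked', // stays blocked until the commit succeeds
  changed_paths: [
    { path: 'README.md', action: 'modified' },
    { path: '.gitignore', action: 'created' },
    { path: 'src/util.js', action: 'unchanged' },
  ],
};

polishReport.git = buildPreCommitGit(implementationReport, polishReport);
console.log(polishReport.git.include_paths);
```

Using a `Set` removes paths reported by both agents, and leaving unreported dirty files out of both inputs keeps them out of the commit list by construction.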
package/docs/ARCHITECTURE.md
CHANGED
|
@@ -241,7 +241,7 @@ The architecture delegates work across six distinct custom role agents to enforc
|
|
|
241
241
|
* Updates `.gitignore` only for real generated outputs, dependencies, caches, or build/test artifacts.
|
|
242
242
|
* Writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`.
|
|
243
243
|
* Uses `agent4-polish-runner.py` only with `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1`, cwd under implementation roots, and strict hooks.
|
|
244
|
-
* May initialize git and create one local commit containing only paths listed in `polish-report.json`;
|
|
244
|
+
* May initialize git and create one local commit containing only paths listed in `polish-report.json` `git.include_paths`; that list must include terminal Agent 3 implementation changes plus Agent 4 polish changes. It must not push, tag, reset, clean, or delete branches.
|
|
245
245
|
|
|
246
246
|
### Nested Controller Loop
|
|
247
247
|
|
package/docs/HOOKS.md
CHANGED
|
@@ -15,6 +15,7 @@ The installer copies the Python hook files for every supported runtime layout. R
|
|
|
15
15
|
| Antigravity | `<targetRoot>/hooks/clean-room/*.py` | Unsupported, copy only |
|
|
16
16
|
| Gemini CLI | `<targetRoot>/hooks/clean-room/*.py` | Unsupported, copy only |
|
|
17
17
|
| OpenCode | `<targetRoot>/hooks/clean-room/*.py` | `<targetRoot>/plugins/clean-room.ts` |
|
|
18
|
+
| Pi | `<targetRoot>/hooks/clean-room/*.py` | Unsupported, copy only |
|
|
18
19
|
| Kilo | `<targetRoot>/hooks/clean-room/*.py` | Unsupported, copy only |
|
|
19
20
|
| Cursor | `<targetRoot>/hooks/clean-room/*.py` | Unsupported, copy only |
|
|
20
21
|
| GitHub Copilot | `<targetRoot>/hooks/clean-room/*.py` | Unsupported, copy only |
|
package/docs/REFERENCE.md
CHANGED
|
@@ -21,6 +21,7 @@ Runtime flags:
|
|
|
21
21
|
| `--antigravity` | Antigravity |
|
|
22
22
|
| `--gemini` | Gemini CLI |
|
|
23
23
|
| `--opencode` | OpenCode |
|
|
24
|
+
| `--pi` | Pi |
|
|
24
25
|
| `--kilo` | Kilo |
|
|
25
26
|
| `--cursor` | Cursor |
|
|
26
27
|
| `--copilot` | GitHub Copilot |
|
|
@@ -70,6 +71,7 @@ Layout-only or experimental:
|
|
|
70
71
|
|
|
71
72
|
- Antigravity
|
|
72
73
|
- Gemini CLI
|
|
74
|
+
- Pi
|
|
73
75
|
- Kilo
|
|
74
76
|
- Cursor
|
|
75
77
|
- GitHub Copilot
|
|
@@ -82,7 +84,7 @@ Layout-only or experimental:
|
|
|
82
84
|
|
|
83
85
|
Layout-only installs write files to expected runtime locations, but this repository does not verify that those hosts load the files or emit all hook events needed for clean-room enforcement. OpenCode installs are verified through a generated local plugin bridge at `plugins/clean-room.ts`; `doctor` verifies that bridge and the Python guardrails, not every OpenCode tool surface.
|
|
84
86
|
|
|
85
|
-
### Pi
|
|
87
|
+
### Pi Compatibility
|
|
86
88
|
|
|
87
89
|
Pi can install this package and load the bundled skills from the package metadata:
|
|
88
90
|
|
|
@@ -91,7 +93,13 @@ pi install npm:clean-room-skill@latest
|
|
|
91
93
|
pi install https://github.com/whit3rabbit/clean-room-skill
|
|
92
94
|
```
|
|
93
95
|
|
|
94
|
-
Pi
|
|
96
|
+
Pi-native package install is preferred. The installer also supports a layout target:
|
|
97
|
+
|
|
98
|
+
```bash
|
|
99
|
+
npx clean-room-skill@latest --pi --global --yes
|
|
100
|
+
```
|
|
101
|
+
|
|
102
|
+
Pi invokes skills as `/skill:<name>`. Use `/skill:init` for the setup pass, `/skill:clean-room` for the startup wizard, `/skill:attended` for attended controller mode, and `/skill:unattended` for bounded unattended mode. Pi installs do not register clean-room hooks; installer-managed Pi layouts copy hook scripts only. Clean-room safety still depends on role separation, path isolation, schema validation, and supported hook runtimes.
|
|
95
103
|
|
|
96
104
|
Global install roots:
|
|
97
105
|
|
|
@@ -102,6 +110,7 @@ Global install roots:
|
|
|
102
110
|
| Antigravity | `ANTIGRAVITY_PLUGIN_DIR`, `ANTIGRAVITY_CLI_PLUGIN_DIR`, `ANTIGRAVITY_CONFIG_DIR/plugins/clean-room`, or `~/.gemini/antigravity-cli/plugins/clean-room` |
|
|
103
111
|
| Gemini CLI | `GEMINI_CONFIG_DIR` or `~/.gemini` |
|
|
104
112
|
| OpenCode | `OPENCODE_CONFIG_DIR`, `OPENCODE_CONFIG`, `XDG_CONFIG_HOME/opencode`, or `~/.config/opencode` |
|
|
113
|
+
| Pi | `~/.pi/agent` |
|
|
105
114
|
| Kilo | `KILO_CONFIG_DIR`, `KILO_CONFIG`, `XDG_CONFIG_HOME/kilo`, or `~/.config/kilo` |
|
|
106
115
|
| Cursor | `CURSOR_CONFIG_DIR` or `~/.cursor` |
|
|
107
116
|
| GitHub Copilot | `COPILOT_CONFIG_DIR` or `~/.copilot` |
|
|
@@ -112,7 +121,7 @@ Global install roots:
|
|
|
112
121
|
| Hermes Agent | `HERMES_HOME` or `~/.hermes` |
|
|
113
122
|
| CodeBuddy | `CODEBUDDY_CONFIG_DIR` or `~/.codebuddy` |
|
|
114
123
|
|
|
115
|
-
Local installs use each runtime's project config directory. Antigravity local installs write `.agents/plugins/clean-room/`.
|
|
124
|
+
Local installs use each runtime's project config directory. Pi local installs write `.pi/`. Antigravity local installs write `.agents/plugins/clean-room/`.
|
|
116
125
|
|
|
117
126
|
## Agent Metadata Compatibility
|
|
118
127
|
|
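The reference text above states that global Pi compatibility installs target `~/.pi/agent`, local installs write `.pi/`, and hook scripts are copied under `hooks/clean-room/` without being registered. A small sketch of that path mapping follows. The function names `piTargetRoot` and `piHookCopyDir` and the sample directories are hypothetical; this is not the installer's actual code.

```javascript
'use strict';

const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: maps the documented Pi targets to a directory.
// Global installs use ~/.pi/agent; local installs use <cwd>/.pi.
function piTargetRoot({ global = false, cwd = process.cwd(), home = os.homedir() } = {}) {
  return global ? path.join(home, '.pi', 'agent') : path.join(cwd, '.pi');
}

// Hook scripts are copied here for inspection only; Pi does not run them.
function piHookCopyDir(targetRoot) {
  return path.join(targetRoot, 'hooks', 'clean-room');
}

// Sample directories are assumptions for the demonstration.
const globalRoot = piTargetRoot({ global: true, home: '/home/dev' });
const localRoot = piTargetRoot({ cwd: '/work/project' });
console.log(globalRoot);
console.log(piHookCopyDir(localRoot));
```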
package/lib/bootstrap.cjs
CHANGED
|
@@ -410,8 +410,9 @@ function printInitResult(options) {
|
|
|
410
410
|
console.log(' uninstall runtime install: npx clean-room-skill@latest --claude --global --uninstall --yes');
|
|
411
411
|
console.log(' Pi:');
|
|
412
412
|
console.log(' install package skills: pi install npm:clean-room-skill@latest');
|
|
413
|
+
console.log(' installer compatibility: npx clean-room-skill@latest --pi --global --yes');
|
|
413
414
|
console.log(' start in Pi: /skill:init, then /skill:clean-room or /skill:attended');
|
|
414
|
-
console.log(' Pi
|
|
415
|
+
console.log(' Pi installs do not register clean-room hooks');
|
|
415
416
|
console.log(' strict hooks are only for dedicated clean-room Codex, Claude, or OpenCode homes');
|
|
416
417
|
}
|
|
417
418
|
|
package/lib/run-claude-agent-runtime.cjs
CHANGED
|
@@ -6,6 +6,7 @@ const { assertClaudeAgentsAvailable, defaultClaudeConfigDir } = require('./claud
|
|
|
6
6
|
const { resolveClaudeExecutable } = require('./install-claude-plugin.cjs');
|
|
7
7
|
const {
|
|
8
8
|
MANAGER_PREPARE_PHASE,
|
|
9
|
+
POLISH_PHASE,
|
|
9
10
|
REQUIRED_COVERAGE_PHASE,
|
|
10
11
|
ROLE_BY_PHASE,
|
|
11
12
|
} = require('./run-constants.cjs');
|
|
@@ -46,6 +47,7 @@ function claudeStages(roots, executable, env, pluginArgs) {
|
|
|
46
47
|
claudeStage('sanitize-handoff', contaminatedCwd, executable, env, pluginArgs),
|
|
47
48
|
claudeStage('clean-plan', cleanCwd, executable, env, pluginArgs),
|
|
48
49
|
claudeStage('clean-implement-qc', implementationCwd, executable, env, pluginArgs),
|
|
50
|
+
claudeStage(POLISH_PHASE, implementationCwd, executable, env, pluginArgs),
|
|
49
51
|
claudeStage(REQUIRED_COVERAGE_PHASE, contaminatedCwd, executable, env, pluginArgs),
|
|
50
52
|
];
|
|
51
53
|
}
|
package/lib/run-controller.cjs
CHANGED
|
@@ -37,6 +37,7 @@ const {
|
|
|
37
37
|
snapshotsEqual,
|
|
38
38
|
validateImplementationArtifactPlacement,
|
|
39
39
|
} = require('./run-progress.cjs');
|
|
40
|
+
const { finalizeAgent4PolishCommit } = require('./run-polish-commit.cjs');
|
|
40
41
|
const {
|
|
41
42
|
defaultSchemaDir,
|
|
42
43
|
resolvePath,
|
|
@@ -135,6 +136,14 @@ function shouldContinueAfterUnitComplete(manifest, coverageLedger) {
|
|
|
135
136
|
return Boolean(selectUnit(manifest, coverageLedger));
|
|
136
137
|
}
|
|
137
138
|
|
|
139
|
+
function markStageFailed(stageResult, error) {
|
|
140
|
+
stageResult.status = 'failed';
|
|
141
|
+
const message = error?.message || String(error);
|
|
142
|
+
stageResult.stderr = stageResult.stderr
|
|
143
|
+
? `${stageResult.stderr}\n${message}`
|
|
144
|
+
: message;
|
|
145
|
+
}
|
|
146
|
+
|
|
138
147
|
async function runCleanRoom(options, context = {}) {
|
|
139
148
|
if (options.help) {
|
|
140
149
|
printRunHelp();
|
|
@@ -256,6 +265,18 @@ async function runCleanRoom(options, context = {}) {
|
|
|
256
265
|
if (stage.phase === REQUIRED_COVERAGE_PHASE && stageResult.status === 'passed') {
|
|
257
266
|
coveragePhaseRan = true;
|
|
258
267
|
}
|
|
268
|
+
if (stage.phase === POLISH_PHASE && stageResult.status === 'passed') {
|
|
269
|
+
try {
|
|
270
|
+
const commitResult = finalizeAgent4PolishCommit(options.python, roots, currentManifest, selected);
|
|
271
|
+
stageResult.agent4_commit = commitResult;
|
|
272
|
+
const afterCommit = artifactSnapshot(taskManifestPath, roots);
|
|
273
|
+
validateImplementationArtifactPlacement(roots);
|
|
274
|
+
validateArtifacts(options.python, taskManifestPath, roots, changedSnapshotPaths(afterStage, afterCommit));
|
|
275
|
+
validateCleanRunContextReferences(options.python, roots);
|
|
276
|
+
} catch (err) {
|
|
277
|
+
markStageFailed(stageResult, err);
|
|
278
|
+
}
|
|
279
|
+
}
|
|
259
280
|
if (stageResult.status !== 'passed') {
|
|
260
281
|
failedStage = stageResult;
|
|
261
282
|
break;
|
|
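The controller change above wraps the post-polish commit step in a `try`/`catch` and routes any error through `markStageFailed`, so a commit failure turns a passed stage into a failed one and the loop stops. The sketch below reproduces `markStageFailed` from the diff and exercises it with a hypothetical wrapper, `runPostStageStep`, which stands in for the real commit and validation calls.

```javascript
'use strict';

// Reproduced from the run-controller.cjs diff.
function markStageFailed(stageResult, error) {
  stageResult.status = 'failed';
  const message = error?.message || String(error);
  // Append to existing stderr so earlier diagnostics are preserved.
  stageResult.stderr = stageResult.stderr
    ? `${stageResult.stderr}\n${message}`
    : message;
}

// Hypothetical stand-in for the post-stage commit and validation block.
function runPostStageStep(stageResult, step) {
  try {
    stageResult.agent4_commit = step();
  } catch (err) {
    markStageFailed(stageResult, err);
  }
  return stageResult;
}

const ok = runPostStageStep({ status: 'passed', stderr: '' }, () => ({ status: 'committed' }));
const failed = runPostStageStep({ status: 'passed', stderr: 'warning: slow' }, () => {
  throw new Error('pre-commit polish-report final_status must be blocked');
});

console.log(ok.status, ok.agent4_commit);
console.log(failed.status);
console.log(failed.stderr);
```

Because the failure is recorded on the stage result rather than rethrown, the existing `stageResult.status !== 'passed'` check that follows can handle commit failures the same way it handles any other failed stage.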
package/lib/run-polish-commit.cjs
ADDED
|
@@ -0,0 +1,259 @@
|
|
|
1
|
+
'use strict';
|
|
2
|
+
|
|
3
|
+
const fs = require('node:fs');
|
|
4
|
+
const path = require('node:path');
|
|
5
|
+
const { spawnSync } = require('node:child_process');
|
|
6
|
+
|
|
7
|
+
const {
|
|
8
|
+
readJsonFile,
|
|
9
|
+
writeJsonFile,
|
|
10
|
+
} = require('./fs-utils.cjs');
|
|
11
|
+
const {
|
|
12
|
+
DEFAULT_TIMEOUT_MS,
|
|
13
|
+
MAX_OUTPUT_BYTES,
|
|
14
|
+
POLISH_REPORT_NAME,
|
|
15
|
+
} = require('./run-constants.cjs');
|
|
16
|
+
const {
|
|
17
|
+
readCleanCompletionArtifact,
|
|
18
|
+
readCleanRunContext,
|
|
19
|
+
} = require('./run-clean-artifacts.cjs');
|
|
20
|
+
const {
|
|
21
|
+
envFromAllowlist,
|
|
22
|
+
hookPath,
|
|
23
|
+
} = require('./run-roots.cjs');
|
|
24
|
+
|
|
25
|
+
const COMMIT_HASH_RE = /^[a-fA-F0-9]{40,64}$/;
|
|
26
|
+
|
|
27
|
+
function normalizeCommitPath(rawPath) {
|
|
28
|
+
if (typeof rawPath !== 'string' || rawPath.trim() === '') {
|
|
29
|
+
throw new Error('polish commit paths must be non-empty strings');
|
|
30
|
+
}
|
|
31
|
+
const normalized = rawPath.replace(/\\/g, '/').replace(/^\.\//, '').replace(/\/+/g, '/').replace(/\/$/, '');
|
|
32
|
+
if (
|
|
33
|
+
normalized === '' ||
|
|
34
|
+
normalized.startsWith('/') ||
|
|
35
|
+
normalized.startsWith('~') ||
|
|
36
|
+
/^[A-Za-z]:/.test(normalized)
|
|
37
|
+
) {
|
|
38
|
+
throw new Error(`polish commit path must be relative: ${rawPath}`);
|
|
39
|
+
}
|
|
40
|
+
const parts = normalized.split('/');
|
|
41
|
+
if (parts.includes('..') || parts.includes('.git')) {
|
|
42
|
+
throw new Error(`polish commit path must not contain '..' or '.git': ${rawPath}`);
|
|
43
|
+
}
|
|
44
|
+
return normalized;
|
|
45
|
+
}
|
|
46
|
+
|
|
47
|
+
function changedPathSet(entries, options = {}) {
|
|
48
|
+
const paths = new Set();
|
|
49
|
+
for (const entry of entries || []) {
|
|
50
|
+
if (!entry || typeof entry !== 'object') continue;
|
|
51
|
+
if (options.skipUnchanged && entry.action === 'unchanged') continue;
|
|
52
|
+
paths.add(normalizeCommitPath(entry.path));
|
|
53
|
+
}
|
|
54
|
+
return paths;
|
|
55
|
+
}
|
|
56
|
+
|
|
57
|
+
function sortedPathSet(paths) {
|
|
58
|
+
return [...paths].sort((left, right) => left.localeCompare(right));
|
|
59
|
+
}
|
|
60
|
+
|
|
61
|
+
function expectedPolishCommitPaths(implementationReport, polishReport) {
|
|
62
|
+
return sortedPathSet(new Set([
|
|
63
|
+
...changedPathSet(implementationReport?.changed_paths),
|
|
64
|
+
...changedPathSet(polishReport?.changed_paths, { skipUnchanged: true }),
|
|
65
|
+
]));
|
|
66
|
+
}
|
|
67
|
+
|
|
68
|
+
function polishIncludePaths(polishReport) {
|
|
69
|
+
return sortedPathSet(new Set((polishReport?.git?.include_paths || []).map((item) => normalizeCommitPath(item))));
|
|
70
|
+
}
|
|
71
|
+
|
|
72
|
+
function diffPaths(left, right) {
|
|
73
|
+
const rightSet = new Set(right);
|
|
74
|
+
return left.filter((item) => !rightSet.has(item));
|
|
75
|
+
}
|
|
76
|
+
|
|
77
|
+
function polishCommitPathGap(implementationReport, polishReport) {
|
|
78
|
+
const expected = expectedPolishCommitPaths(implementationReport, polishReport);
|
|
79
|
+
const included = polishIncludePaths(polishReport);
|
|
80
|
+
const missing = diffPaths(expected, included);
|
|
81
|
+
if (missing.length > 0) {
|
|
82
|
+
return `Final clean polish commit is missing changed implementation path: ${missing[0]}`;
|
|
83
|
+
}
|
|
84
|
+
const unexpected = diffPaths(included, expected);
|
|
85
|
+
if (unexpected.length > 0) {
|
|
86
|
+
return `Final clean polish commit includes an unreported implementation path: ${unexpected[0]}`;
|
|
87
|
+
}
|
|
88
|
+
return null;
|
|
89
|
+
}
|
|
90
|
+
|
|
91
|
+
function polishCommitCompletionGap(implementationReport, polishReport) {
|
|
92
|
+
if (!polishReport) return null;
|
|
93
|
+
const git = polishReport.git || {};
|
|
94
|
+
if (git.commit_required === true) {
|
|
95
|
+
if (git.commit_status !== 'committed') {
|
|
96
|
+
return 'Final clean polish commit has not completed.';
|
|
97
|
+
}
|
|
98
|
+
if (typeof git.commit_hash !== 'string' || !COMMIT_HASH_RE.test(git.commit_hash)) {
|
|
99
|
+
return 'Final clean polish commit hash is missing.';
|
|
100
|
+
}
|
|
101
|
+
return polishCommitPathGap(implementationReport, polishReport);
|
|
102
|
+
}
|
|
103
|
+
if (git.commit_required === false) {
|
|
104
|
+
if (git.commit_status !== 'not-needed') {
|
|
105
|
+
return 'Final clean polish commit status does not match commit_required=false.';
|
|
106
|
+
}
|
|
107
|
+
if (git.commit_hash !== null) {
|
|
108
|
+
return 'Final clean polish commit hash must be null when commit_required=false.';
|
|
109
|
+
}
|
|
110
|
+
}
|
|
111
|
+
return null;
|
|
112
|
+
}
|
|
113
|
+
|
|
114
|
+
function unresolvedPolishItems(polishReport) {
|
|
115
|
+
const unresolvedFinding = (polishReport.findings || []).find((item) => item?.status !== 'resolved');
|
|
116
|
+
if (unresolvedFinding) {
|
|
117
|
+
return 'polish-report has unresolved findings';
|
|
118
|
+
}
|
|
119
|
+
const unresolvedTicket = (polishReport.abstract_delta_tickets || []).find((item) => item?.status !== 'resolved');
|
|
120
|
+
if (unresolvedTicket) {
|
|
121
|
+
return 'polish-report has unresolved abstract delta tickets';
|
|
122
|
+
}
|
|
123
|
+
const unpassedVerification = (polishReport.verification_results || []).find((item) => item?.status !== 'passed');
|
|
124
|
+
if (unpassedVerification) {
|
|
125
|
+
return 'polish-report has verification results that did not pass';
|
|
126
|
+
}
|
|
127
|
+
return null;
|
|
128
|
+
}
|
|
129
|
+
|
|
130
|
+
function boundedOutput(value) {
|
|
131
|
+
const text = String(value || '');
|
|
132
|
+
if (Buffer.byteLength(text, 'utf8') <= 4096) {
|
|
133
|
+
return text;
|
|
134
|
+
}
|
|
135
|
+
return `${text.slice(0, 4096)}\n[truncated]`;
|
|
136
|
+
}
|
|
137
|
+
|
|
138
|
+
function runnerEnv(roots, manifest, selectedUnit) {
|
|
139
|
+
return {
|
|
140
|
+
...envFromAllowlist(),
|
|
141
|
+
CLEAN_ROOM_ROLE: 'clean-polish-reviewer',
|
|
142
|
+
CLEAN_ROOM_ALLOW_AGENT4_SHELL: '1',
|
|
143
|
+
CLEAN_ROOM_SOURCE_ROOTS: roots.sourceRoots.join(path.delimiter),
|
|
144
|
+
CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS: roots.contaminatedRoot,
|
|
145
|
+
CLEAN_ROOM_CLEAN_ROOTS: roots.cleanRoot,
|
|
146
|
+
CLEAN_ROOM_IMPLEMENTATION_ROOTS: roots.implementationRoots.join(path.delimiter),
|
|
147
|
+
CLEAN_ROOM_ALLOWED_READ_ROOTS: roots.allowedReadRoots.join(path.delimiter),
|
|
148
|
+
CLEAN_ROOM_SCHEMA_DIR: roots.schemaDir,
|
|
149
|
+
CLEAN_ROOM_SELECTED_UNIT_ID: selectedUnit.unit_id,
|
|
150
|
+
CLEAN_ROOM_SPEC_SLICE_REF: manifest.loop_context.spec_slice_ref,
|
|
151
|
+
};
|
|
152
|
+
}
|
|
153
|
+
|
|
154
|
+
function parseRunnerOutput(result) {
|
|
155
|
+
if (result.error) {
|
|
156
|
+
throw new Error(`Agent 4 polish commit runner failed: ${result.error.message}`);
|
|
157
|
+
}
|
|
158
|
+
if (result.status !== 0) {
|
|
159
|
+
const output = boundedOutput(result.stderr || result.stdout);
|
|
160
|
+
throw new Error(`Agent 4 polish commit runner failed: ${output}`);
|
|
161
|
+
}
|
|
162
|
+
const parsed = JSON.parse(result.stdout || '{}');
|
|
163
|
+
if (parsed?.commit?.commit_status !== 'committed' || typeof parsed.commit.commit_hash !== 'string') {
|
|
164
|
+
throw new Error('Agent 4 polish commit runner did not report a committed result');
|
|
165
|
+
}
|
|
166
|
+
return parsed.commit;
|
|
167
|
+
}
|
|
168
|
+
|
|
169
|
+
function updatePolishReportAfterCommit(polishReportPath, commit) {
|
|
170
|
+
const polishReport = readJsonFile(polishReportPath, null);
|
|
171
|
+
const priorStatus = polishReport.git?.repository_status;
|
|
172
|
+
polishReport.git = {
|
|
173
|
+
...polishReport.git,
|
|
174
|
+
repository_status: priorStatus === 'existing' ? 'existing' : 'initialized',
|
|
175
|
+
commit_required: true,
|
|
176
|
+
commit_status: 'committed',
|
|
177
|
+
include_paths: commit.staged_paths || polishReport.git.include_paths,
|
|
178
|
+
commit_hash: commit.commit_hash,
|
|
179
|
+
status_summary: 'Committed listed implementation-root paths only.',
|
|
180
|
+
};
|
|
181
|
+
polishReport.final_status = 'passed';
|
|
182
|
+
writeJsonFile(polishReportPath, polishReport);
|
|
183
|
+
return polishReport;
|
|
184
|
+
}
|
|
185
|
+
|
|
186
|
+
function finalizeAgent4PolishCommit(python, roots, manifest, selectedUnit) {
|
|
187
|
+
const context = readCleanRunContext(roots);
|
|
188
|
+
const policy = context?.implementation?.polish_commit || null;
|
|
189
|
+
const polishReportPath = path.join(roots.cleanRoot, POLISH_REPORT_NAME);
|
|
190
|
+
if (!fs.existsSync(polishReportPath)) {
|
|
191
|
+
return { status: 'not-needed' };
|
|
192
|
+
}
|
|
193
|
+
const polishReport = readJsonFile(polishReportPath, null);
|
|
194
|
+
const git = polishReport.git || {};
|
|
195
|
+
|
|
196
|
+
if (git.commit_required !== true) {
|
|
197
|
+
if (git.commit_required !== false) {
|
|
198
|
+
throw new Error('polish-report git.commit_required must be true or false');
|
|
199
|
+
}
|
|
200
|
+
if (policy?.git_policy !== 'disabled') {
|
|
201
|
+
throw new Error('polish-report sets commit_required=false, but clean-run-context does not disable Agent 4 commits');
|
|
202
|
+
}
|
|
203
|
+
const commitGap = polishCommitCompletionGap(null, polishReport);
|
|
204
|
+
if (commitGap) {
|
|
205
|
+
throw new Error(commitGap);
|
|
206
|
+
}
|
|
207
|
+
return { status: 'not-needed' };
|
|
208
|
+
}
|
|
209
|
+
const { artifact: implementationReport } = readCleanCompletionArtifact(
|
|
210
|
+
roots,
|
|
211
|
+
'implementation_report',
|
|
212
|
+
'implementation-report.json',
|
|
213
|
+
'clean-run-context implementation_report'
|
|
214
|
+
);
|
|
215
|
+
if (!implementationReport) {
|
|
216
|
+
throw new Error('Agent 4 commit requires terminal implementation-report.json');
|
|
217
|
+
}
|
|
218
|
+
const pathGap = polishCommitPathGap(implementationReport, polishReport);
|
|
219
|
+
if (pathGap) {
|
|
220
|
+
throw new Error(pathGap);
|
|
221
|
+
}
|
|
222
|
+
if (git.commit_status === 'committed') {
|
|
223
|
+
return { status: 'already-committed' };
|
|
224
|
+
}
|
|
225
|
+
if (git.commit_status !== 'not-run') {
|
|
226
|
+
throw new Error(`polish-report git.commit_status must be not-run before controller commit, got ${git.commit_status}`);
|
|
227
|
+
}
|
|
228
|
+
if (policy?.git_policy !== 'local-init-and-commit-only') {
|
|
229
|
+
throw new Error('clean-run-context does not allow Agent 4 local init-and-commit');
|
|
230
|
+
}
|
|
231
|
+
if (policy.agent4_shell_allowed !== true || policy.cwd_policy !== 'implementation-root') {
|
|
232
|
+
throw new Error('clean-run-context Agent 4 commit policy does not allow the bounded polish runner');
|
|
233
|
+
}
|
|
234
|
+
if (polishReport.final_status !== 'blocked') {
|
|
235
|
+
throw new Error('pre-commit polish-report final_status must be blocked');
|
|
236
|
+
}
|
|
237
|
+
const unresolved = unresolvedPolishItems(polishReport);
|
|
238
|
+
if (unresolved) {
|
|
239
|
+
throw new Error(unresolved);
|
|
240
|
+
}
|
|
241
|
+
|
|
242
|
+
const result = spawnSync(python, [hookPath('agent4-polish-runner.py'), '--report', polishReportPath, '--commit'], {
|
|
243
|
+
cwd: roots.implementationRoots[0],
|
|
244
|
+
env: runnerEnv(roots, manifest, selectedUnit),
|
|
245
|
+
encoding: 'utf8',
|
|
246
|
+
shell: false,
|
|
247
|
+
timeout: DEFAULT_TIMEOUT_MS,
|
|
248
|
+
maxBuffer: MAX_OUTPUT_BYTES,
|
|
249
|
+
});
|
|
250
|
+
const commit = parseRunnerOutput(result);
|
|
251
|
+
updatePolishReportAfterCommit(polishReportPath, commit);
|
|
252
|
+
return { status: 'committed', commit_hash: commit.commit_hash };
|
|
253
|
+
}
|
|
254
|
+
|
|
255
|
+
module.exports = {
|
|
256
|
+
expectedPolishCommitPaths,
|
|
257
|
+
finalizeAgent4PolishCommit,
|
|
258
|
+
polishCommitCompletionGap,
|
|
259
|
+
};
|
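To see which paths the new module accepts for a polish commit, the sketch below reproduces `normalizeCommitPath` from the diff above and runs it over a few inputs. The `checkPath` wrapper and the sample paths are hypothetical additions for illustration; they are not exported by the package.

```javascript
'use strict';

// Reproduced from the run-polish-commit.cjs diff so the snippet runs alone.
function normalizeCommitPath(rawPath) {
  if (typeof rawPath !== 'string' || rawPath.trim() === '') {
    throw new Error('polish commit paths must be non-empty strings');
  }
  // Convert backslashes, drop a leading "./", collapse repeated slashes,
  // and strip one trailing slash.
  const normalized = rawPath.replace(/\\/g, '/').replace(/^\.\//, '').replace(/\/+/g, '/').replace(/\/$/, '');
  if (
    normalized === '' ||
    normalized.startsWith('/') ||
    normalized.startsWith('~') ||
    /^[A-Za-z]:/.test(normalized)
  ) {
    throw new Error(`polish commit path must be relative: ${rawPath}`);
  }
  const parts = normalized.split('/');
  if (parts.includes('..') || parts.includes('.git')) {
    throw new Error(`polish commit path must not contain '..' or '.git': ${rawPath}`);
  }
  return normalized;
}

// Hypothetical wrapper: report acceptance instead of throwing.
function checkPath(rawPath) {
  try {
    return { ok: true, path: normalizeCommitPath(rawPath) };
  } catch (err) {
    return { ok: false, reason: err.message };
  }
}

const samples = ['./src//index.js', 'src\\lib\\a.js', '.gitignore', '/etc/passwd', '../secret', '.git/config', 'C:\\temp\\x'];
for (const sample of samples) {
  console.log(JSON.stringify(sample), checkPath(sample));
}
```

Note that the `.git` check compares whole path segments, so `.gitignore` is accepted while `.git/config` is rejected.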
package/lib/run-results.cjs
CHANGED
|
@@ -28,6 +28,10 @@ const {
|
|
|
28
28
|
skeletonAreaMap,
|
|
29
29
|
validatePathsOwnedByAreas,
|
|
30
30
|
} = require('./run-clean-artifacts.cjs');
|
|
31
|
+
const {
|
|
32
|
+
expectedPolishCommitPaths,
|
|
33
|
+
polishCommitCompletionGap,
|
|
34
|
+
} = require('./run-polish-commit.cjs');
|
|
31
35
|
const {
|
|
32
36
|
approvedUnitIds,
|
|
33
37
|
coverageMap,
|
|
@@ -94,7 +98,7 @@ function validateImplementationReportArchitecture(report, plan, skeleton) {
|
|
|
94
98
|
}
|
|
95
99
|
}
|
|
96
100
|
|
|
97
|
-
function implementationReportArchitectureTickets(roots, observedChangedPaths = null) {
|
|
101
|
+
function implementationReportArchitectureTickets(roots, observedChangedPaths = null, polish = null) {
|
|
98
102
|
const { artifact: report } = readCleanCompletionArtifact(roots, 'implementation_report', 'implementation-report.json', 'clean-run-context implementation_report');
|
|
99
103
|
if (!report || !Array.isArray(report.changed_paths)) {
|
|
100
104
|
return [];
|
|
@@ -119,7 +123,10 @@ function implementationReportArchitectureTickets(roots, observedChangedPaths = n
|
|
|
119
123
|
}
|
|
120
124
|
}
|
|
121
125
|
if (Array.isArray(observedChangedPaths)) {
|
|
122
|
-
const
|
|
126
|
+
const reportedPaths = polish
|
|
127
|
+
? expectedPolishCommitPaths(report, polish)
|
|
128
|
+
: report.changed_paths.map((entry) => entry?.path).filter((value) => typeof value === 'string' && value.trim() !== '');
|
|
129
|
+
const normalizedReported = [...new Set(reportedPaths)].sort();
|
|
123
130
|
const normalizedObserved = [...new Set(observedChangedPaths.filter((value) => typeof value === 'string' && value.trim() !== ''))].sort();
|
|
124
131
|
if (normalizedReported.length !== normalizedObserved.length || normalizedReported.some((value, index) => value !== normalizedObserved[index])) {
|
|
125
132
|
return [architectureDeltaTicket('Implementation report changed paths did not match observed implementation-root file changes. Re-run clean implementation with accurate changed_paths.')];
|
|
@@ -164,10 +171,10 @@ function completionQualityTickets(qc) {
|
|
|
164
171
|
return tickets;
|
|
165
172
|
}
|
|
166
173
|
|
|
167
|
-
function architectureDeltaTickets(roots, qc, observedChangedPaths = null) {
|
|
174
|
+
function architectureDeltaTickets(roots, qc, observedChangedPaths = null, polish = null) {
|
|
168
175
|
return [
|
|
169
176
|
...qcArchitectureTickets(qc),
|
|
170
|
-
...implementationReportArchitectureTickets(roots, observedChangedPaths),
|
|
177
|
+
...implementationReportArchitectureTickets(roots, observedChangedPaths, polish),
|
|
171
178
|
];
|
|
172
179
|
}
|
|
173
180
|
|
|
@@ -181,7 +188,7 @@ function polishDeltaTicket(summary) {
|
|
|
181
188
|
};
|
|
182
189
|
}
|
|
183
190
|
|
|
184
|
-
function polishReviewTickets(polish, polishRequired) {
|
|
191
|
+
function polishReviewTickets(polish, polishRequired, implementationReport = null) {
|
|
185
192
|
if (!polish) {
|
|
186
193
|
return polishRequired
|
|
187
194
|
? [polishDeltaTicket('The configured clean polish review stage did not produce polish-report.json.')]
|
|
@@ -195,12 +202,16 @@ function polishReviewTickets(polish, polishRequired) {
|
|
|
195
202
|
} else if (polish.final_status === 'passed-with-gaps') {
|
|
196
203
|
tickets.push(polishDeltaTicket('Final clean polish review passed with unresolved gaps.'));
|
|
197
204
|
}
|
|
205
|
+
const commitGap = polishCommitCompletionGap(implementationReport, polish);
|
|
206
|
+
if (commitGap) {
|
|
207
|
+
tickets.push(polishDeltaTicket(commitGap));
|
|
208
|
+
}
|
|
198
209
|
return tickets;
|
|
199
210
|
}
|
|
200
211
|
|
|
201
|
-
function polishBlocksCompletion(polish, polishRequired) {
|
|
212
|
+
function polishBlocksCompletion(polish, polishRequired, implementationReport = null) {
|
|
202
213
|
if (!polish) return polishRequired;
|
|
203
|
-
return polish.final_status !== 'passed';
|
|
214
|
+
return polish.final_status !== 'passed' || Boolean(polishCommitCompletionGap(implementationReport, polish));
|
|
204
215
|
}
|
|
205
216
|
|
|
206
217
|
function validateTerminalCompletionArtifacts(roots) {
|
|
@@ -261,9 +272,9 @@ function inferTerminalResult(manifest, roots, selectedUnit, options = {}) {
|
|
|
261
272
|
polish,
|
|
262
273
|
coverage,
|
|
263
274
|
{ abstract_delta_tickets: behaviorSpecOpenQuestionTickets(roots) },
|
|
264
|
-
{ abstract_delta_tickets: architectureDeltaTickets(roots, qc, options.observedChangedPaths || null) },
|
|
275
|
+
{ abstract_delta_tickets: architectureDeltaTickets(roots, qc, options.observedChangedPaths || null, polish) },
|
|
265
276
|
{ abstract_delta_tickets: completionQualityTickets(qc) },
|
|
266
|
-
{ abstract_delta_tickets: polishReviewTickets(polish, polishRequired) }
|
|
277
|
+
{ abstract_delta_tickets: polishReviewTickets(polish, polishRequired, report) }
|
|
267
278
|
);
|
|
268
279
|
|
|
269
280
|
if (
|
|
@@ -280,7 +291,7 @@ function inferTerminalResult(manifest, roots, selectedUnit, options = {}) {
|
|
|
280
291
|
if (report?.final_status === 'blocked' || qc?.final_status === 'blocked' || polish?.final_status === 'blocked' || selectedUnit.status === 'blocked') {
|
|
281
292
|
return buildResult(manifest, 'spec-slice-blocked', coverageState(state, qc), report, qc, tickets, polish);
|
|
282
293
|
}
|
|
283
|
-
if (polishBlocksCompletion(polish, polishRequired)) {
|
|
294
|
+
if (polishBlocksCompletion(polish, polishRequired, report)) {
|
|
284
295
|
return null;
|
|
285
296
|
}
|
|
286
297
|
if (state === 'covered' || (qc?.coverage_status === 'complete' && qc?.final_status === 'passed')) {
|
|
@@ -302,7 +313,8 @@ function completeResultOrSpecDelta(manifest, roots, coverageLedger, coverageStat
|
|
|
302
313
|
coverageLedger,
|
|
303
314
|
{ abstract_delta_tickets: behaviorSpecOpenQuestionTickets(roots) },
|
|
304
315
|
{ abstract_delta_tickets: architectureDeltaTickets(roots, qc) },
|
|
305
|
-
{ abstract_delta_tickets: completionQualityTickets(qc) }
|
|
316
|
+
{ abstract_delta_tickets: completionQualityTickets(qc) },
|
|
317
|
+
{ abstract_delta_tickets: polishReviewTickets(polish, Boolean(polish), report) }
|
|
306
318
|
);
|
|
307
319
|
if (tickets.some((ticket) => ticket.status !== 'resolved')) {
|
|
308
320
|
return buildResult(manifest, 'spec-delta-required', coverageStateValue, null, null, tickets);
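The gating change above can be sketched in isolation. The body of `polishCommitCompletionGap` is not shown in this part of the diff, so the sketch below swaps in a hypothetical stand-in (`commitGapStandIn`) and assumes a gap exists whenever a commit is required but no committed hash is recorded.

```javascript
// Stand-in for the package's polishCommitCompletionGap, whose body is not shown here.
// Assumption: a gap exists when a commit is required but no committed hash is recorded.
function commitGapStandIn(implementationReport, polish) {
  const git = polish?.git ?? {};
  if (git.commit_required !== true) return null;
  if (git.commit_status === 'committed' && typeof git.commit_hash === 'string') return null;
  return 'Required polish commit has not been recorded.';
}

// Same shape as the updated polishBlocksCompletion in the diff:
// a missing report blocks only when polish is required; otherwise a
// non-passed status or a commit gap blocks completion.
function polishBlocksCompletion(polish, polishRequired, implementationReport = null) {
  if (!polish) return polishRequired;
  return polish.final_status !== 'passed'
    || Boolean(commitGapStandIn(implementationReport, polish));
}
```

Under these assumptions, a report with `final_status: "passed"` still blocks completion until the commit data is present, which matches the intent of passing `report` into the gate.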
|
package/lib/runtime-layout.cjs
CHANGED
|
@@ -9,6 +9,7 @@ const RUNTIMES = Object.freeze([
|
|
|
9
9
|
'antigravity',
|
|
10
10
|
'gemini',
|
|
11
11
|
'opencode',
|
|
12
|
+
'pi',
|
|
12
13
|
'kilo',
|
|
13
14
|
'cursor',
|
|
14
15
|
'copilot',
|
|
@@ -104,6 +105,12 @@ const RUNTIME_DEFS = Object.freeze({
|
|
|
104
105
|
HOOKS,
|
|
105
106
|
],
|
|
106
107
|
},
|
|
108
|
+
pi: {
|
|
109
|
+
globalDefault: ['.pi', 'agent'],
|
|
110
|
+
localDir: '.pi',
|
|
111
|
+
hooks: false,
|
|
112
|
+
artifacts: [STANDARD_SKILLS, HOOKS],
|
|
113
|
+
},
|
|
107
114
|
kilo: {
|
|
108
115
|
globalResolver: resolveKiloGlobalRoot,
|
|
109
116
|
localDir: '.kilo',
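As a rough illustration of how a definition like the new `pi` entry might be consumed, the sketch below resolves an install root from `globalDefault` and `localDir`. The resolver `resolveInstallRoot`, its options, and the assumption that `globalDefault` is relative to the user's home directory are all hypothetical; the package's real resolution logic is not shown in this diff.

```javascript
const path = require('node:path');

// Trimmed copy of the new entry; the `artifacts` list is omitted because its
// constants (STANDARD_SKILLS, HOOKS) are defined elsewhere in the package.
const RUNTIME_DEFS = Object.freeze({
  pi: { globalDefault: ['.pi', 'agent'], localDir: '.pi', hooks: false },
});

// Hypothetical resolver: global installs go under the home directory,
// local installs go under the project directory.
function resolveInstallRoot(runtime, { scope, homeDir, projectDir }) {
  const def = RUNTIME_DEFS[runtime];
  if (!def) throw new Error(`Unknown runtime: ${runtime}`);
  return scope === 'global'
    ? path.join(homeDir, ...def.globalDefault)
    : path.join(projectDir, def.localDir);
}
```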
|
package/package.json
CHANGED
package/plugin.json
CHANGED
package/skills/clean-room/SKILL.md
CHANGED
|
@@ -34,7 +34,7 @@ Use these roles conceptually. If the host supports subagents, map each role to a
|
|
|
34
34
|
- Agent 1.5 / contaminated handoff sanitizer: works in a fresh source-denied contaminated context, reads only Agent 0's neutral brief plus assigned draft artifacts, scrubs identifying material, and approves or quarantines handoff candidates.
|
|
35
35
|
- Agent 2 / clean architect/planner: starts from the clean workspace, reads `clean-run-context.json`, approved clean handoff artifacts, the completed foundation spec, and the clean destination foundation under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`; then writes `CLEAN_ROOM_CLEAN_ROOTS/implementation-plan.json` with relative destination paths, tests, constraints, risks, and argv-array verification commands. It writes no code.
|
|
36
36
|
- Agent 3 / clean implementer/verifier: starts in the clean domain, reads `implementation-plan.json` and clean artifacts, writes code and tests only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, writes reports under `CLEAN_ROOM_CLEAN_ROOTS`, records verification status, and emits exactly one terminal report for Agent 0 only after the assigned plan or task is complete, blocked, or quarantined. Run verification only through the installed Agent 3 verification runner; optional Docker or Podman verification must not mount source or contaminated artifact roots.
|
|
37
|
-
- Agent 4 / clean polish reviewer: starts in the clean domain after Agent 3 terminal reports, reviews final code for security, comments/docs, exception handling, resource leaks, race conditions, missing tests, and repo hygiene, writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`, may update implementation-root `AGENTS.md` and `.gitignore`, and may create one local implementation-root commit only through the installed Agent 4 polish runner.
|
|
37
|
+
- Agent 4 / clean polish reviewer: starts in the clean domain after Agent 3 terminal reports, reviews final code for security, comments/docs, exception handling, resource leaks, race conditions, missing tests, and repo hygiene, writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`, may update implementation-root `AGENTS.md` and `.gitignore`, and may create one local implementation-root commit only through the installed Agent 4 polish runner. The commit path list must cover terminal Agent 3 changed paths plus Agent 4 polish changed paths.
|
|
38
38
|
|
|
39
39
|
## Workflow
|
|
40
40
|
|
|
@@ -54,7 +54,7 @@ Optional AST/indexing helpers are detected before the controller loop through `s
|
|
|
54
54
|
|
|
55
55
|
Controller mode defaults to `attended` when `task-manifest.json` has no `controller_policy`. The outer loop evolves specs and selects one approved spec slice. Code-development runs start with exactly one `unit_kind: "foundation"` unit named by `loop_context.foundation_unit_ref`; non-foundation behavior slices wait until that unit is covered. The inner clean-room loop completes the approved slice through sanitized handoff, implementation, QC, optional final polish review, and contaminated-side coverage verification, then returns `clean-room-result.json` to the outer loop. In `attended` mode, agent zero pauses for human review at scope gate, handoff, QC deltas, polish deltas, blocked units, and final coverage. In `unattended` mode, agent zero may run a bounded inner loop: reload durable artifacts for each iteration, select at most one pending or gap unit inside `loop_context.approved_scope_refs`, start each role from fresh context with the required environment block, validate before advancing, and stop on any configured safety or ambiguity condition.
|
|
56
56
|
|
|
57
|
-
In Claude Code unattended mode, launch the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude` when possible. The main conversation must not do Agent 1, Agent 2, Agent 3, or Agent 4 work, and must not ask to continue while unattended policy still allows bounded progress. If role-agent dispatch is unavailable, fail closed with a blocker.
|
|
57
|
+
In Claude Code unattended mode, launch the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude` (or `npx clean-room-skill@latest run --task-manifest <path> --agent-runtime claude` if the binary is not available) when possible. The main conversation must not do Agent 1, Agent 2, Agent 3, or Agent 4 work, and must not ask to continue while unattended policy still allows bounded progress. If role-agent dispatch is unavailable, fail closed with a blocker.
|
|
58
58
|
|
|
59
59
|
Do not grant shell-style tools to Agent 0, Agent 1, Agent 1.5, Agent 2, or the default Agent 3/4 role sessions. Agent 3 terminal verification may use shell-style tools only when `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`, the command cwd is under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and the command invokes the installed `agent3-verification-runner.py`. Agent 4 polish verification and commit may use shell-style tools only when `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1`, cwd is under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and the command invokes the installed `agent4-polish-runner.py`. Use `--hooks=strict` for dedicated Codex, Claude, or OpenCode clean-room homes so hooks fail closed if required environment is missing or shell tools are invoked outside the allowed runner boundaries. Safe hook installs are compatibility-only between runs; during init/onboarding, prepare the role environment block and pass it into every clean-room role session so safe hooks enforce during active work.
|
|
60
60
|
|
|
@@ -124,7 +124,7 @@ Default sequence:
|
|
|
124
124
|
7. Clean handoff: move only Agent 1.5-approved structured artifacts plus `clean-run-context.json` to the clean workspace. Do not hand off the full `task-manifest.json`. For each role launch, Agent 0 writes a compact `role-session-brief.json` for that role and phase; the brief carries status, next action, allowed artifact refs with hashes, and forbidden inputs. It is not a replacement for durable artifacts.
|
|
125
125
|
8. Clean planning: Agent 2 starts from the clean artifact root, reads `clean-run-context.json`, approved handoff artifacts, any existing `skeleton-manifest.json`, and the clean implementation foundation, then updates `skeleton-manifest.json` as the durable destination architecture map and produces `implementation-plan.json` with code hygiene policy. Use `implementation-plan.json` as the code-development work contract, and require every planned target/test path to be owned by a referenced architecture area.
|
|
126
126
|
9. Clean implementation and QC: Agent 3 reads `implementation-plan.json`, writes code and tests only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, writes `implementation-report.json` under `CLEAN_ROOM_CLEAN_ROOTS`, maintains `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`, and loops without Agent 0 guidance until selected-slice work items are complete, blocked, or quarantined.
|
|
127
|
-
10. Clean polish review: when configured, Agent 4 reviews final code, updates only implementation-root polish files such as `AGENTS.md` or `.gitignore` when needed, writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`, and commits only through `agent4-polish-runner.py`.
|
|
127
|
+
10. Clean polish review: when configured, Agent 4 reviews final code, updates only implementation-root polish files such as `AGENTS.md` or `.gitignore` when needed, writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`, and commits only through `agent4-polish-runner.py`. If the controller finalizes the commit, Agent 4 records `git.commit_status: "not-run"` and `final_status: "blocked"` until the bounded runner records the real commit hash.
|
|
128
128
|
11. Contaminated coverage verification: only after Agent 3 marks the report as terminal and any configured Agent 4 polish review passes may Agent 0 consume `implementation-report.json`, `qc-report.json`, `polish-report.json`, and `coverage-ledger.json`, compare against source coverage, and write `clean-room-result.json`. Exact-public-contract and behavior-compatible public-surface items must map item by item from behavior spec test coverage to implementation-plan `public_contract_refs`, terminal report completion, and coverage-ledger `public_surface_coverage`.
|
|
129
129
|
12. Repeat clean planning, implementation, and polish only from updated durable artifacts, never by steering an in-progress Agent 2, Agent 3, or Agent 4 session.
|
|
130
130
|
|
package/skills/clean-room/assets/polish-report.schema.json
CHANGED
|
@@ -214,10 +214,21 @@
|
|
|
214
214
|
"properties": {
|
|
215
215
|
"final_status": {
|
|
216
216
|
"const": "passed"
|
|
217
|
+
},
|
|
218
|
+
"git": {
|
|
219
|
+
"properties": {
|
|
220
|
+
"commit_required": {
|
|
221
|
+
"const": true
|
|
222
|
+
}
|
|
223
|
+
},
|
|
224
|
+
"required": [
|
|
225
|
+
"commit_required"
|
|
226
|
+
]
|
|
217
227
|
}
|
|
218
228
|
},
|
|
219
229
|
"required": [
|
|
220
|
-
"final_status"
|
|
230
|
+
"final_status",
|
|
231
|
+
"git"
|
|
221
232
|
]
|
|
222
233
|
},
|
|
223
234
|
"then": {
|
|
@@ -226,6 +237,78 @@
|
|
|
226
237
|
"properties": {
|
|
227
238
|
"commit_status": {
|
|
228
239
|
"const": "committed"
|
|
240
|
+
},
|
|
241
|
+
"commit_hash": {
|
|
242
|
+
"type": "string",
|
|
243
|
+
"pattern": "^[a-fA-F0-9]{40,64}$"
|
|
244
|
+
}
|
|
245
|
+
}
|
|
246
|
+
}
|
|
247
|
+
}
|
|
248
|
+
}
|
|
249
|
+
},
|
|
250
|
+
{
|
|
251
|
+
"if": {
|
|
252
|
+
"properties": {
|
|
253
|
+
"final_status": {
|
|
254
|
+
"const": "passed"
|
|
255
|
+
},
|
|
256
|
+
"git": {
|
|
257
|
+
"properties": {
|
|
258
|
+
"commit_required": {
|
|
259
|
+
"const": false
|
|
260
|
+
}
|
|
261
|
+
},
|
|
262
|
+
"required": [
|
|
263
|
+
"commit_required"
|
|
264
|
+
]
|
|
265
|
+
}
|
|
266
|
+
},
|
|
267
|
+
"required": [
|
|
268
|
+
"final_status",
|
|
269
|
+
"git"
|
|
270
|
+
]
|
|
271
|
+
},
|
|
272
|
+
"then": {
|
|
273
|
+
"properties": {
|
|
274
|
+
"git": {
|
|
275
|
+
"properties": {
|
|
276
|
+
"commit_status": {
|
|
277
|
+
"const": "not-needed"
|
|
278
|
+
},
|
|
279
|
+
"commit_hash": {
|
|
280
|
+
"const": null
|
|
281
|
+
}
|
|
282
|
+
}
|
|
283
|
+
}
|
|
284
|
+
}
|
|
285
|
+
}
|
|
286
|
+
},
|
|
287
|
+
{
|
|
288
|
+
"if": {
|
|
289
|
+
"properties": {
|
|
290
|
+
"git": {
|
|
291
|
+
"properties": {
|
|
292
|
+
"commit_status": {
|
|
293
|
+
"const": "committed"
|
|
294
|
+
}
|
|
295
|
+
},
|
|
296
|
+
"required": [
|
|
297
|
+
"commit_status"
|
|
298
|
+
]
|
|
299
|
+
}
|
|
300
|
+
},
|
|
301
|
+
"required": [
|
|
302
|
+
"git"
|
|
303
|
+
]
|
|
304
|
+
},
|
|
305
|
+
"then": {
|
|
306
|
+
"properties": {
|
|
307
|
+
"git": {
|
|
308
|
+
"properties": {
|
|
309
|
+
"commit_hash": {
|
|
310
|
+
"type": "string",
|
|
311
|
+
"pattern": "^[a-fA-F0-9]{40,64}$"
|
|
229
312
|
}
|
|
230
313
|
}
|
|
231
314
|
}
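The three conditional rules added to the schema can be paraphrased in plain code. The following is a sketch rather than a schema validator: the function name `checkPolishGitRules` is hypothetical, and it is slightly stricter than the schema because it treats a missing `commit_status` or `commit_hash` as a violation, whereas JSON Schema `properties` only constrain keys that are present.

```javascript
// Hash pattern copied from the schema hunk above.
const COMMIT_HASH = /^[a-fA-F0-9]{40,64}$/;

// Returns a list of human-readable rule violations (empty when the report is consistent).
function checkPolishGitRules(report) {
  const git = report.git ?? {};
  const passed = report.final_status === 'passed';
  const hashOk = typeof git.commit_hash === 'string' && COMMIT_HASH.test(git.commit_hash);
  const problems = [];

  // Rule 1: passed + commit_required:true requires a committed status.
  if (passed && git.commit_required === true && git.commit_status !== 'committed') {
    problems.push('passed with commit_required:true needs commit_status "committed"');
  }
  // Rule 2: passed + commit_required:false requires "not-needed" and a null hash.
  if (passed && git.commit_required === false) {
    if (git.commit_status !== 'not-needed') problems.push('commit_status must be "not-needed"');
    if (git.commit_hash !== null) problems.push('commit_hash must be null');
  }
  // Rule 3: any committed status requires a 40-64 character hex hash.
  if (git.commit_status === 'committed' && !hashOk) {
    problems.push('commit_status "committed" needs a 40-64 character hex commit_hash');
  }
  return problems;
}
```

A pre-commit report with `final_status: "blocked"` and `commit_status: "not-run"` yields no violations here, since none of the three conditions apply to it.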
|
package/skills/clean-room/examples/minimal-spec-package/implementation-report.json
CHANGED
|
@@ -4,15 +4,32 @@
|
|
|
4
4
|
"plan_ref": "implementation-plan.json",
|
|
5
5
|
"implementer_role": "clean-qa-editor",
|
|
6
6
|
"updated_at": "2024-01-01T00:00:00Z",
|
|
7
|
-
"implementation_status": "
|
|
7
|
+
"implementation_status": "complete",
|
|
8
8
|
"agent0_reporting": {
|
|
9
|
-
"report_state": "
|
|
9
|
+
"report_state": "terminal-report",
|
|
10
10
|
"terminal_report_target": "agent_0",
|
|
11
11
|
"interim_updates_allowed": false
|
|
12
12
|
},
|
|
13
|
-
"completed_work_items": [
|
|
13
|
+
"completed_work_items": [
|
|
14
|
+
"work-example-flow"
|
|
15
|
+
],
|
|
14
16
|
"blocked_work_items": [],
|
|
15
|
-
"changed_paths": [
|
|
17
|
+
"changed_paths": [
|
|
18
|
+
{
|
|
19
|
+
"path": "src/example-flow.js",
|
|
20
|
+
"kind": "code",
|
|
21
|
+
"work_item_ids": [
|
|
22
|
+
"work-example-flow"
|
|
23
|
+
]
|
|
24
|
+
},
|
|
25
|
+
{
|
|
26
|
+
"path": "test/example-flow.test.js",
|
|
27
|
+
"kind": "test",
|
|
28
|
+
"work_item_ids": [
|
|
29
|
+
"work-example-flow"
|
|
30
|
+
]
|
|
31
|
+
}
|
|
32
|
+
],
|
|
16
33
|
"verification_results": [
|
|
17
34
|
{
|
|
18
35
|
"command": [
|
|
@@ -20,11 +37,11 @@
|
|
|
20
37
|
"test"
|
|
21
38
|
],
|
|
22
39
|
"cwd": "CLEAN_ROOM_IMPLEMENTATION_ROOTS[0]",
|
|
23
|
-
"status": "
|
|
24
|
-
"output_summary": "Example
|
|
40
|
+
"status": "passed",
|
|
41
|
+
"output_summary": "Example verification passed."
|
|
25
42
|
}
|
|
26
43
|
],
|
|
27
44
|
"findings": [],
|
|
28
45
|
"abstract_delta_tickets": [],
|
|
29
|
-
"final_status": "
|
|
46
|
+
"final_status": "complete"
|
|
30
47
|
}
|
package/skills/clean-room/references/PROCESS.md
CHANGED
|
@@ -197,8 +197,10 @@ Clean polish reviewer:
|
|
|
197
197
|
- Update implementation-root `AGENTS.md` with gotchas and build/test/dev commands discovered from clean files.
|
|
198
198
|
- Update implementation-root `.gitignore` only for real generated outputs, dependencies, caches, or build/test artifacts.
|
|
199
199
|
- Run verification and commit only through `agent4-polish-runner.py` with `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1`.
|
|
200
|
-
- Stage only paths listed in `polish-report.json` and create at most one local implementation-root commit.
|
|
200
|
+
- Stage only paths listed in `polish-report.json` `git.include_paths` and create at most one local implementation-root commit.
|
|
201
|
+
- Set `git.include_paths` to the union of terminal Agent 3 `implementation-report.json` `changed_paths` and Agent 4 `polish-report.json` `changed_paths`; leave unreported dirty files uncommitted.
|
|
201
202
|
- Write `polish-report.json` with findings, changed paths, verification results, git status, commit hash/status, residual risks, and abstract delta tickets.
|
|
203
|
+
- For controller-finalized commits, write a pre-commit `polish-report.json` with `final_status: "blocked"`, `git.commit_required: true`, and `git.commit_status: "not-run"`.
|
|
202
204
|
- Do not report progress or ask Agent 0 for guidance while implementing. Mark `implementation-report.json` as terminal only after the selected slice work is complete, blocked, or quarantined.
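The `git.include_paths` rule in the list above amounts to a set union over two reports. A minimal sketch follows, assuming `polish-report.json` `changed_paths` entries carry a `path` string like the `implementation-report.json` entries do; the helper names and the sorted output are illustrative choices, not the package's code.

```javascript
// Collects the non-empty `path` strings from a report's changed_paths entries.
function changedPathStrings(report) {
  const entries = Array.isArray(report?.changed_paths) ? report.changed_paths : [];
  return entries
    .map((entry) => entry?.path)
    .filter((value) => typeof value === 'string' && value.trim() !== '');
}

// Union of Agent 3 and Agent 4 changed paths, de-duplicated and sorted.
// Dirty files that neither report lists are never added.
function buildIncludePaths(implementationReport, polishReport) {
  return [...new Set([
    ...changedPathStrings(implementationReport),
    ...changedPathStrings(polishReport),
  ])].sort();
}

const implementationReport = {
  changed_paths: [
    { path: 'src/example-flow.js', kind: 'code' },
    { path: 'test/example-flow.test.js', kind: 'test' },
  ],
};
const polishReport = {
  changed_paths: [{ path: 'AGENTS.md' }, { path: 'src/example-flow.js' }],
};
console.log(buildIncludePaths(implementationReport, polishReport));
// [ 'AGENTS.md', 'src/example-flow.js', 'test/example-flow.test.js' ]
```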
|
|
203
205
|
|
|
204
206
|
## Workflow
|
|
@@ -284,7 +286,7 @@ Clean polish reviewer:
|
|
|
284
286
|
- Start from a fresh role session brief when context management is enabled.
|
|
285
287
|
- Agent 4 starts from the clean domain, reviews only clean implementation-root files and clean artifacts, and writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`.
|
|
286
288
|
- Create or update implementation-root `AGENTS.md` and `.gitignore` only when the clean implementation actually needs them.
|
|
287
|
-
- Commit only through `agent4-polish-runner.py`, with no push, tag, reset, clean, branch deletion, or arbitrary git commands.
|
|
289
|
+
- Commit only through `agent4-polish-runner.py`, with `git.include_paths` covering terminal Agent 3 changed paths plus Agent 4 polish paths, and with no push, tag, reset, clean, branch deletion, or arbitrary git commands.
|
|
288
290
|
13. Verify coverage:
|
|
289
291
|
- Contaminated manager checks gaps against source behavior, discovered source tests, equal-output requirements, public contract compatibility, terminal implementation reports, and terminal polish reports when configured.
|
|
290
292
|
- Reject completion when any required public-surface obligation is missing from behavior spec test coverage, implementation-plan `public_contract_refs`, terminal implementation completion, or coverage-ledger `public_surface_coverage`.
|
package/skills/clean-room/references/SPEC-SCHEMA.md
CHANGED
|
@@ -344,7 +344,7 @@ Do not include raw source excerpts, contaminated evidence, or source stack trace
|
|
|
344
344
|
- residual risks and abstract delta tickets
|
|
345
345
|
- final status
|
|
346
346
|
|
|
347
|
-
Do not include source excerpts, contaminated evidence, source paths, private identifiers, raw diffs, or source-shaped pseudocode. A passing polish report requires the constrained local commit to have succeeded.
|
|
347
|
+
Do not include source excerpts, contaminated evidence, source paths, private identifiers, raw diffs, or source-shaped pseudocode. A passing polish report with `git.commit_required: true` requires the constrained local commit to have succeeded and a real commit hash to be recorded. A passing report with `git.commit_required: false` is valid only when the clean-run-context commit policy is disabled and `git.commit_status` is `not-needed`.
|
|
348
348
|
|
|
349
349
|
## Clean-Room Result Content
|
|
350
350
|
|
package/skills/init/SKILL.md
CHANGED
|
@@ -19,7 +19,7 @@ Keep `preflight-goal.json` in the controller/contaminated artifact domain. Clean
|
|
|
19
19
|
|
|
20
20
|
Use the canonical `clean-room` skill workflow and references in this plugin. Preserve the clean-room boundary, role separation, artifact schemas, leakage rules, implementation-root rules, and hook expectations.
|
|
21
21
|
|
|
22
|
-
The CLI command `clean-room-skill init` may have pre-created neutral external folders and a clean-safe `.clean-room/README.md` stub in the target repository. The bootstrap task root must contain `contaminated/`, `clean/`, `implementation/`, and `quarantine/`. Treat that bootstrap output as convenience scaffolding only. It does not replace this skill's initialization workflow, and it must not be treated as an active `preflight-goal.json`, `init-config.json`, `task-manifest.json`, or `clean-run-context.json`.
|
|
22
|
+
The CLI command `clean-room-skill init` (or `npx clean-room-skill@latest init` if the binary is not available) may have pre-created neutral external folders and a clean-safe `.clean-room/README.md` stub in the target repository. The bootstrap task root must contain `contaminated/`, `clean/`, `implementation/`, and `quarantine/`. Treat that bootstrap output as convenience scaffolding only. It does not replace this skill's initialization workflow, and it must not be treated as an active `preflight-goal.json`, `init-config.json`, `task-manifest.json`, or `clean-run-context.json`.
|
|
23
23
|
|
|
24
24
|
When using an existing CLI bootstrap, check `clean-room-bootstrap.json`, `contaminated/`, `clean/`, `implementation/`, `quarantine/`, and the target repo `.clean-room/README.md` before recording active init preferences. Stop if metadata is missing, invalid, mismatched with the task root, or any generated path is missing or the wrong type. Do not infer active workflow state from those bootstrap files.
|
|
25
25
|
|
package/skills/preflight/SKILL.md
CHANGED
|
@@ -11,7 +11,7 @@ Create or validate `preflight-goal.json` before active clean-room artifacts star
|
|
|
11
11
|
|
|
12
12
|
Use the canonical `clean-room` workflow and read `skills/clean-room/references/PREFLIGHT.md` when collecting missing goal details. Preserve the clean-room boundary: `preflight-goal.json` is a controller/contaminated-side artifact and must not be placed in clean-role readable roots.
|
|
13
13
|
|
|
14
|
-
If the user provides output from CLI `clean-room-skill init
|
|
14
|
+
If the user provides output from CLI `clean-room-skill init` (or `npx clean-room-skill@latest init` if the binary is not available), check the generated bootstrap scaffold before creating or copying `preflight-goal.json`: `clean-room-bootstrap.json`, `contaminated/`, `clean/`, `implementation/`, `quarantine/`, and the target repo `.clean-room/README.md` must exist and agree. Treat that scaffold as convenience output only; it is not an active `preflight-goal.json`, `init-config.json`, `task-manifest.json`, or `clean-run-context.json`.
|
|
15
15
|
|
|
16
16
|
## Required Contract
|
|
17
17
|
|
|
@@ -46,7 +46,7 @@ Do not infer target language, license, dependency policy, exactness policy, outp
|
|
|
46
46
|
|
|
47
47
|
## CLI Helper
|
|
48
48
|
|
|
49
|
-
Use the CLI only for template creation or validation/copying:
|
|
49
|
+
Use the CLI (`clean-room-skill` if installed, or `npx clean-room-skill@latest` as fallback) only for template creation or validation/copying:
|
|
50
50
|
|
|
51
51
|
```bash
|
|
52
52
|
clean-room-skill preflight --template --output ~/Documents/CleanRoom/task-xxxxxxxx/contaminated/preflight-goal.json
|
package/skills/refocus/SKILL.md
CHANGED
|
@@ -53,7 +53,7 @@ Emit missed-gate findings only:
|
|
|
53
53
|
- Stale implementation report compared with latest implementation plan.
|
|
54
54
|
- Controller policy not preserved.
|
|
55
55
|
- Missing, invalid, or drifted preflight goal.
|
|
56
|
-
- Noncanonical manifests, reports, ledgers, or manual result summaries used as completion evidence. Mark these `not verified` unless `clean-room-skill run --dry-run` succeeds against the canonical `task-manifest.json`.
|
|
56
|
+
- Noncanonical manifests, reports, ledgers, or manual result summaries used as completion evidence. Mark these `not verified` unless `clean-room-skill run --dry-run` (or `npx clean-room-skill@latest run --dry-run` if the binary is not available) succeeds against the canonical `task-manifest.json`.
|
|
57
57
|
- Missing public-surface inventory parity: required public commands, APIs, config keys, protocol entries, or user-visible behaviors listed in approved specs are not mapped through behavior spec tests, implementation-plan `public_contract_refs`, terminal implementation reports, and coverage-ledger `public_surface_coverage`.
|
|
58
58
|
|
|
59
59
|
Do not suggest speculative improvements. Do not change source scope, target profile, public API, or implementation plan.
|
package/skills/resume-cr/SKILL.md
CHANGED
|
@@ -11,7 +11,7 @@ Resume an existing clean-room run from durable artifacts. Never use prior chat h
|
|
|
11
11
|
|
|
12
12
|
Use the canonical `clean-room` skill workflow and references in this plugin. Read `skills/clean-room/references/CONTROLLER-LOOP.md` when the manifest records `loop_context` or unattended mode. Preserve the same clean-room boundary, role separation, artifact schemas, leakage rules, implementation-root rules, and hook expectations.
|
|
13
13
|
|
|
14
|
-
If `task-manifest.json` records `controller_policy.mode: "unattended"` in Claude Code, prefer launching `clean-room-skill run --task-manifest <path> --agent-runtime claude` and let the durable runner assign role agents. The main conversation must not perform Agent 1, Agent 2, Agent 3, or Agent 4 work. Do not ask to continue while unattended policy, iteration budget, and approved pending or gap units still permit progress. If the runner or Claude role-agent dispatch is unavailable, stop with `BLOCKERS: Claude role-agent dispatch unavailable` rather than silently continuing in the main chat.
|
|
14
|
+
If `task-manifest.json` records `controller_policy.mode: "unattended"` in Claude Code, prefer launching `clean-room-skill run --task-manifest <path> --agent-runtime claude` (or `npx clean-room-skill@latest run --task-manifest <path> --agent-runtime claude` if the binary is not available) and let the durable runner assign role agents. The main conversation must not perform Agent 1, Agent 2, Agent 3, or Agent 4 work. Do not ask to continue while unattended policy, iteration budget, and approved pending or gap units still permit progress. If the runner or Claude role-agent dispatch is unavailable, stop with `BLOCKERS: Claude role-agent dispatch unavailable` rather than silently continuing in the main chat.
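The launch preference described above (installed binary first, `npx` fallback second, fail closed otherwise) can be sketched as a small decision function. The function name `resolveRunnerArgv` and its availability flags are hypothetical; only the command names, the `--task-manifest` and `--agent-runtime claude` flags, and the blocker text come from the skill document.

```javascript
// Hypothetical helper: picks the argv for the durable runner, or a blocker message.
// The caller is assumed to have probed for the binary and for npx beforehand.
function resolveRunnerArgv({ binaryAvailable, npxAvailable, manifestPath }) {
  const runArgs = ['run', '--task-manifest', manifestPath, '--agent-runtime', 'claude'];
  if (binaryAvailable) return { argv: ['clean-room-skill', ...runArgs] };
  if (npxAvailable) return { argv: ['npx', 'clean-room-skill@latest', ...runArgs] };
  // Fail closed rather than continuing role work in the main conversation.
  return { blocker: 'BLOCKERS: Claude role-agent dispatch unavailable' };
}
```

Returning an argv array instead of a shell string keeps the manifest path from being re-parsed by a shell, in line with the argv-array convention the skill uses for verification commands.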
|
|
15
15
|
|
|
16
16
|
## Load Order
|
|
17
17
|
|
package/skills/unattended/SKILL.md
CHANGED
|
@@ -15,7 +15,7 @@ Use the canonical `clean-room` skill workflow and references in this plugin. Rea
|
|
|
15
15
|
|
|
16
16
|
Before asking setup or preflight questions, use the canonical `clean-room` "Run State Discovery Before Wizard" rules. Resolve explicit artifact paths first, then configured clean-room roots, then bounded `~/Documents/CleanRoom/task-*` candidates. If a valid `task-manifest.json` exists, route to `resume-cr`. If a valid canonical `preflight-goal.json` exists without a manifest, continue at source/destination discovery and manifest creation. If a preflight artifact exists but is invalid, stop with schema errors instead of restarting preflight. If multiple candidates are found without an explicit path, list them and stop for selection.
|
|
17
17
|
|
|
18
|
-
When resuming a valid unattended `task-manifest.json` in Claude Code, prefer launching the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude
|
|
18
|
+
When resuming a valid unattended `task-manifest.json` in Claude Code, prefer launching the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude` (or `npx clean-room-skill@latest run --task-manifest <path> --agent-runtime claude` if the binary is not available). The main conversation must not perform Agent 1, Agent 2, Agent 3, or Agent 4 work. Do not ask to continue while `controller_policy.mode` is `unattended`, the iteration budget remains, and approved pending or gap units remain. If Claude role-agent dispatch or the runner is unavailable, stop with `BLOCKERS: Claude role-agent dispatch unavailable` instead of falling back to main-chat execution.
|
|
19
19
|
|
|
20
20
|
Load or create `preflight-goal.json` first. Unattended mode requires a complete goal contract with no blocking or non-blocking `open_questions`, `controller_policy.unattended_allowed_after_preflight: true`, and a finite `controller_policy.max_iterations`.
|
|
21
21
|
|