clean-room-skill 0.4.0 → 0.4.1
- package/.claude-plugin/marketplace.json +1 -1
- package/.claude-plugin/plugin.json +1 -1
- package/.codex-plugin/plugin.json +1 -1
- package/agents/clean-architect.md +1 -0
- package/agents/clean-implementer-verifier-shell.md +1 -0
- package/agents/clean-polish-reviewer.md +1 -0
- package/agents/clean-qa-editor.md +1 -0
- package/agents/contaminated-manager-verifier.md +5 -4
- package/agents/contaminated-source-analyst.md +2 -0
- package/hooks/check-artifact-leakage.py +102 -26
- package/lib/preflight-validation.cjs +56 -0
- package/lib/run-coverage.cjs +20 -3
- package/lib/run-roots.cjs +10 -1
- package/package.json +1 -1
- package/plugin.json +1 -1
- package/skills/clean-room/SKILL.md +5 -5
- package/skills/clean-room/assets/evidence-ledger.schema.json +2 -0
- package/skills/clean-room/assets/preflight-goal.schema.json +38 -0
- package/skills/clean-room/examples/contaminated-side/preflight-goal.json +8 -0
- package/skills/clean-room/examples/contaminated-side/task-manifest.json +1 -1
- package/skills/clean-room/references/PREFLIGHT.md +2 -1
- package/skills/clean-room/references/PROCESS.md +1 -1
- package/skills/clean-room/scripts/build_visual_index.py +6 -5
- package/skills/init/SKILL.md +4 -2
- package/skills/preflight/SKILL.md +4 -2
- package/skills/resume-cr/SKILL.md +1 -1
- package/skills/unattended/SKILL.md +4 -4
@@ -49,6 +49,7 @@ Responsibilities:
 - Carry the preflight-derived code hygiene policy into `implementation-plan.json`.
 - Keep `skeleton-manifest.json` valid and current for code-development runs. Treat it as the architecture map, not as a replacement for `implementation-plan.json`.
 - Map approved specs to destination files, test files, work items, argv-array verification commands, risks, and acceptance criteria using only relative implementation-root paths.
+- In clean artifact prose fields, use plain language instead of implementation syntax such as scoped identifiers, dotted module paths, call expressions, exact test function names, or type constructor text. Put paths only in structured path fields and commands only in structured argv arrays.
 - Preserve public contract refs, dependency constraints, test mappings, and open decisions.
 - Do not choose dependencies by copying source manifests. Add or preserve dependencies only when clean artifacts, destination evidence, or preflight policy justify them.
 - Map every exact-public-contract or behavior-compatible public surface obligation to at least one `implementation-plan.json` work item through `public_contract_refs`; do not replace a public command/API inventory with one generic dispatch work item unless every obligation ref is listed.
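The prose-field rule in this hunk could be approximated by a small linter along the following lines. This is a rough sketch only: `SYNTAX_PATTERNS`, its pattern names, and the regular expressions are illustrative assumptions, not the patterns the package's leakage hook actually applies.

```python
import re

# Hypothetical patterns for "implementation syntax" in prose fields.
# These are illustrative assumptions, not taken from the package.
SYNTAX_PATTERNS = {
    "scoped_identifier": re.compile(r"\b\w+::\w+"),
    "dotted_module_path": re.compile(r"\b[a-z_]\w*(?:\.[a-z_]\w*){2,}\b"),
    "call_expression": re.compile(r"\b[A-Za-z_]\w*\([^)]*\)"),
}


def prose_syntax_findings(text: str) -> list[str]:
    """Return the sorted names of the patterns that match the prose text."""
    return sorted(name for name, pattern in SYNTAX_PATTERNS.items() if pattern.search(text))


plain = "Parse the configuration file and report invalid entries."
leaky = "Call config.loader.parse_file(path) and handle parser::Error."
print(prose_syntax_findings(plain))
print(prose_syntax_findings(leaky))
```

The plain sentence yields no findings, while the second sentence trips all three hypothetical patterns. Paths and commands would live in the structured fields the rule names rather than in prose.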
@@ -36,6 +36,7 @@ Responsibilities:
 - Review leakage risk using `LEAKAGE-RULES.md`.
 - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
 - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+- In implementation and QC report prose fields, use plain language instead of implementation syntax such as scoped identifiers, dotted module paths, call expressions, exact test function names, or type constructor text. Put changed paths in `changed_paths`, test paths in `test_paths`, and commands in `verification_results.command`.
 - Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
 - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
 - Verify public-surface inventory parity item by item. Every required `public_surface:<spec_id>:<kind>:<name>` ref must be covered by tests, mapped to a completed work item, and represented in terminal verification; passing test counts or broad command-dispatch coverage is not enough.
@@ -42,6 +42,7 @@ Responsibilities:
 - Do not add speculative ignores, speculative docs, broad refactors, new dependencies, or new behavior.
 - Re-run relevant verification through `agent4-polish-runner.py` only when shell verification is enabled for this role.
 - Record findings, Agent 4 changed relative paths, verification results, residual risks, git status, commit message, commit hash/status, and abstract delta tickets in `polish-report.json`.
+- In polish report prose fields, use plain language instead of implementation syntax such as scoped identifiers, dotted module paths, call expressions, exact test function names, or type constructor text. Put changed paths in `changed_paths`, included commit paths in `git.include_paths`, and commands in `verification_results.command`.
 - Set `git.include_paths` to the union of terminal `implementation-report.json` `changed_paths` and Agent 4 `polish-report.json` `changed_paths`; do not include unreported dirty files.
 - When the controller must create the commit, write a pre-commit report with `final_status: "blocked"`, `git.commit_required: true`, and `git.commit_status: "not-run"`.
 - Mark `final_status` as `passed` only when high/blocker security, correctness, exception, resource, race, leakage, and verification findings are resolved and either the constrained local commit succeeded or clean-run-context explicitly disables Agent 4 commits with `git.commit_status: "not-needed"`.
@@ -57,6 +57,7 @@ Responsibilities:
 - Review leakage risk using `LEAKAGE-RULES.md`.
 - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
 - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+- In implementation and QC report prose fields, use plain language instead of implementation syntax such as scoped identifiers, dotted module paths, call expressions, exact test function names, or type constructor text. Put changed paths in `changed_paths`, test paths in `test_paths`, and commands in `verification_results.command`.
 - Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
 - Record architecture alignment in `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`. Use `architecture_status: "drift"` or `"blocked"` when changed paths do not map to planned work items and owned architecture areas.
 - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
@@ -24,12 +24,12 @@ Before source discovery, decomposition, or role launch, verify:
 - `preflight-goal.json` exists, validates, and is recorded by hash in `task-manifest.json`.
 - `handoff_sequence` is present and starts with `preflight`.
 - Attended mode records unresolved preflight questions as pause gates.
-- Unattended mode has no open preflight questions
+- Unattended mode has no open preflight questions, `unattended_allowed_after_preflight: true`, and `intent_confirmation` showing the end goal, target stack, and controller mode came from explicit user answers.
 
 Responsibilities:
 
 - Confirm authorization, source scope, clean output scope, and prohibited actions before assigning work.
-- Do not infer target language, dependency policy, license policy, exactness policy, output directory, or feature add/remove policy from source.
+- Do not infer end goal, target language, runtime, framework, package manager, test framework, dependency policy, license policy, exactness policy, output directory, or feature add/remove policy from source. If goal or target stack is unknown, leave blocking `open_questions`, keep unattended disabled, and do not write runner-ready `task-manifest.json` or `clean-run-context.json`.
 - Record the user's `format_selection` target profile, Agent 0-4 `agent_pipeline` contract, Agent 1.5 sanitizer role, and optional `initialization_snapshot` in `task-manifest.json`.
 - Produce `clean-run-context.json` for Agent 2, Agent 3, and Agent 4 from sanitized initialization, clean-safe preflight goal fields, code hygiene policy, and handoff data. Do not send the full `task-manifest.json` or `preflight-goal.json` to clean roles.
 - Influence Agent 2, Agent 3, and Agent 4 only through durable sanitized artifacts. Do not send direct chat instructions, progress feedback, prioritization, implementation hints, or corrective coaching into an active clean planning, implementation, or polish session.
@@ -40,7 +40,8 @@ Responsibilities:
 - When no indexable source code exists and screenshots/images are the authorized evidence, consume contaminated `visual-index.json` as fallback input only. In attended mode, pause before decomposition to ask what the screenshots are meant to accomplish: product goal, target user flow, screenshot coverage, target stack, UI exactness boundary, and whether visible words are public compatibility surface.
 - Split source scope into the durable tasklist as bounded `task-manifest.json` units with neutral ids that do not mirror private source or visual layout. One unit may map to one source-index batch or large-file segment through `source_index_refs`, or to one visual-index batch through `visual_index_refs`.
 - Create exactly one `unit_kind: "foundation"` unit before behavior units. Set `loop_context.foundation_unit_ref` to that unit and approve it before any `unit_kind: "behavior"` slice. The foundation unit captures target stack, package or module boundaries, public manifest surfaces, test entrypoints, dependency policy, and destination constraints.
-- Maintain `coverage-ledger.json` and `evidence-ledger.json` in the contaminated artifact workspace.
+- Maintain `coverage-ledger.json` and exactly one canonical `evidence-ledger.json` in the contaminated artifact workspace. Preserve existing evidence entries across units; do not allow per-unit evidence-ledger filenames.
+- Require every evidence-ledger entry `source_unit_ref` to be the assigned task-manifest unit id or accepted unit alias. Source paths, source-index refs, visual-index refs, and observation locations belong in `evidence_location_ref` or unit index refs, not in `source_unit_ref`.
 - Maintain a private identifier denylist for hook scanning when practical; never send the denylist contents to Agent 1.5, clean roles, or clean artifacts.
 - Provide Agent 1.5 only a neutral sanitizer brief with domain purpose, target profile, unit intent, public compatibility allowlist, and blocked categories.
 - Send Agent 1 draft specs to Agent 1.5 for independent source-denied sanitization before clean handoff.
@@ -49,7 +50,7 @@ Responsibilities:
 - When Agent 1 records `discovery_leads`, create neutral follow-up task units only when the lead is inside authorized scope. Do not silently expand `loop_context.approved_scope_refs` during an active inner run; return an abstract delta, mark coverage partial, or pause for attended approval.
 - For multi-segment source work, you may include a previous contaminated draft behavior spec in a later contaminated-analysis role-session brief only when it is under the contaminated artifact root, hash-checked, within context budgets, and still forbidden to clean or source-denied roles.
 - Compare clean artifacts and terminal implementation or polish reports against source behavior, discovered source tests, equal-output requirements, and public API/schema compatibility for coverage gaps.
-- Do not mark a unit complete from summaries, claimed test counts, or progress prose alone. Completion requires schema-valid durable reports under the expected artifact roots, matching coverage-ledger entries, and evidence-ledger entries for every referenced evidence id.
+- Do not mark a unit complete from summaries, claimed test counts, or progress prose alone. Completion requires schema-valid durable reports under the expected artifact roots, matching coverage-ledger entries, and canonical evidence-ledger entries for every referenced evidence id.
 - For exact-public-contract or behavior-compatible units, split broad public surfaces into smaller units or maintain `coverage-ledger.json` `public_surface_coverage` entries for every required `public_surface:<spec_id>:<kind>:<name>` obligation. A covered unit requires each obligation to be covered, mapped to clean work, and verified.
 - Source-backed units with `source_index_refs` or `visual_index_refs` must have durable source/evidence coverage before `coverage_state: "covered"`. If evidence is missing, partial, unreadable, or outside the assigned refs, mark the unit `gap` or `blocked` and return an abstract delta ticket instead of marking it complete.
 - For full-parity runs, do not defer TUI, command, CLI, protocol, streaming, MCP, tool, public error, or config behavior while reporting completion. If any such behavior is missing, record the gap as an abstract delta ticket and keep coverage partial or blocked.
@@ -49,6 +49,8 @@ Responsibilities:
 - Treat discovered source tests as behavioral evidence and convert them into clean `test_scenarios` that validate the same observable outputs.
 - Record equal-output expectations for public return values, serialized data, CLI or API responses, errors, state changes, ordering, and compatibility-relevant side effects.
 - Use `evidence_refs` that point to contaminated-side ledger entries instead of including source text.
+- Maintain exactly one canonical contaminated-side `evidence-ledger.json`. Preserve existing entries and append or update entries by stable `evidence_id`; do not create per-unit evidence-ledger filenames.
+- Set each evidence entry `source_unit_ref` to the assigned task-manifest unit id or accepted unit alias, preferably `CLEAN_ROOM_SELECTED_UNIT_ID` when set. Put source file paths, source-index refs, visual-index refs, and observation locations in `evidence_location_ref`, not in `source_unit_ref`.
 - Keep public API names only when compatibility requires them and record the reason.
 - Capture public API, protocol, config, and data/schema compatibility using existing behavior spec fields.
 - Do not mirror source dependency lists, package manifests, or private module layout. Mention a dependency only when it is public compatibility surface, destination evidence, or explicitly allowed by preflight policy.
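The "append or update by stable `evidence_id`" rule in this hunk can be pictured with a small helper. This is a minimal sketch under stated assumptions: the top-level `entries` key, the `upsert_evidence` name, and the sample ids are hypothetical, since the ledger schema is not shown in this diff, and unit aliases are left out for brevity.

```python
def upsert_evidence(ledger: dict, entry: dict, unit_id: str) -> dict:
    """Append or update one entry in the single canonical ledger, keyed by evidence_id."""
    # Assumption: only the assigned unit id is accepted here; aliases are omitted.
    if entry.get("source_unit_ref") != unit_id:
        raise ValueError("source_unit_ref must be the assigned task-manifest unit id")
    entries = ledger.setdefault("entries", [])  # "entries" is a hypothetical key
    for index, existing in enumerate(entries):
        if existing["evidence_id"] == entry["evidence_id"]:
            entries[index] = entry  # update in place, keeping ledger order
            return ledger
    entries.append(entry)  # new id: everything already recorded is preserved
    return ledger


ledger = {"entries": [
    {"evidence_id": "ev-1", "source_unit_ref": "unit-a", "evidence_location_ref": "index-batch-1"},
]}
upsert_evidence(ledger, {"evidence_id": "ev-2", "source_unit_ref": "unit-b",
                         "evidence_location_ref": "index-batch-2"}, "unit-b")
upsert_evidence(ledger, {"evidence_id": "ev-1", "source_unit_ref": "unit-a",
                         "evidence_location_ref": "index-batch-3"}, "unit-a")
print([e["evidence_id"] for e in ledger["entries"]])
```

Entries from an earlier unit survive when a later unit writes to the same file, which is the property that per-unit ledger filenames would lose.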
@@ -126,6 +126,45 @@ SCAN_LIGHT_JSON_STRING_KEYS = {
     "action",
     "formatting_rules",
 }
+JSON_PATH_KEY_ALLOWLIST = NEVER_SCAN_JSON_STRING_KEYS | DENYLIST_ONLY_JSON_STRING_KEYS | SCAN_LIGHT_JSON_STRING_KEYS | {
+    "acceptance_criteria",
+    "architecture_findings",
+    "architecture_summary",
+    "claim",
+    "constraints",
+    "dependency_constraints",
+    "description",
+    "expected_result",
+    "findings",
+    "formatting_rules",
+    "implementation_forbidden_material",
+    "invariants",
+    "leakage_review",
+    "leakage_scan_summary",
+    "local_patterns",
+    "name",
+    "negative_behaviors",
+    "notes",
+    "observable_behaviors",
+    "observable_surface",
+    "open_decisions",
+    "open_questions",
+    "output_summary",
+    "outputs",
+    "purpose",
+    "reason",
+    "requirements",
+    "residual_risks",
+    "responsibilities",
+    "risks",
+    "scenario",
+    "state_transitions",
+    "summary",
+    "target_constraints",
+    "test_obligations",
+    "test_scenarios",
+    "timing_or_ordering",
+}
 IMPLEMENTATION_METADATA_MANIFESTS = {
     "Cargo.toml",
     "go.mod",
@@ -344,14 +383,43 @@ def strip_allowed_text(text: str, allowed_names: set[str]) -> str:
     return stripped
 
 
+def json_path(path: tuple[str | int, ...]) -> str:
+    if not path:
+        return "$"
+    rendered = "$"
+    for item in path:
+        if isinstance(item, int):
+            rendered += f"[{item}]"
+        elif item in JSON_PATH_KEY_ALLOWLIST:
+            rendered += f".{item}"
+        else:
+            rendered += ".<field>"
+    return rendered
+
+
+def format_finding_details(details: list[tuple[str, str]]) -> str:
+    grouped: dict[str, set[str]] = {}
+    for name, location in details:
+        grouped.setdefault(name, set()).add(location)
+    parts: list[str] = []
+    for name in sorted(grouped):
+        locations = sorted(grouped[name])
+        shown = locations[:3]
+        suffix = f" at {', '.join(shown)}"
+        if len(locations) > len(shown):
+            suffix += f", +{len(locations) - len(shown)} more"
+        parts.append(f"{name}{suffix}")
+    return ", ".join(parts)
+
+
 def json_scan_strings(
     value: object,
     allowed_names: set[str],
     path: tuple[str | int, ...] = (),
-) -> tuple[list[str], list[str], list[str]]:
-    full_scan: list[str] = []
-    light_scan: list[str] = []
-    denylist_scan: list[str] = []
+) -> tuple[list[tuple[str, str]], list[tuple[str, str]], list[tuple[str, str]]]:
+    full_scan: list[tuple[str, str]] = []
+    light_scan: list[tuple[str, str]] = []
+    denylist_scan: list[tuple[str, str]] = []
     if isinstance(value, dict):
         for key, item in value.items():
             child_full, child_light, child_denylist = json_scan_strings(item, allowed_names, path + (key,))
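To see what the two new helpers in this hunk produce, the sketch below copies their logic and runs it against a tiny stand-in allowlist. The three-key `JSON_PATH_KEY_ALLOWLIST` and the sample key `private_parser_state` are illustrative assumptions; the real allowlist is the larger set defined in the hook.

```python
# Stand-in allowlist for illustration only; the hook's real set is much larger.
JSON_PATH_KEY_ALLOWLIST = {"test_scenarios", "description", "notes"}


def json_path(path: tuple) -> str:
    """Render a JSON location, masking any key that is not on the allowlist."""
    rendered = "$"
    for item in path:
        if isinstance(item, int):
            rendered += f"[{item}]"
        elif item in JSON_PATH_KEY_ALLOWLIST:
            rendered += f".{item}"
        else:
            rendered += ".<field>"  # unknown keys could themselves leak identifiers
    return rendered


def format_finding_details(details: list[tuple[str, str]]) -> str:
    """Group (name, location) findings by name and show at most three locations each."""
    grouped: dict[str, set[str]] = {}
    for name, location in details:
        grouped.setdefault(name, set()).add(location)
    parts: list[str] = []
    for name in sorted(grouped):
        locations = sorted(grouped[name])
        shown = locations[:3]
        suffix = f" at {', '.join(shown)}"
        if len(locations) > len(shown):
            suffix += f", +{len(locations) - len(shown)} more"
        parts.append(f"{name}{suffix}")
    return ", ".join(parts)


print(json_path(("test_scenarios", 2, "description")))  # $.test_scenarios[2].description
print(json_path(("private_parser_state", "notes")))     # $.<field>.notes
details = [("private_identifier_denylist", f"$.notes[{i}]") for i in range(5)]
print(format_finding_details(details))
# private_identifier_denylist at $.notes[0], $.notes[1], $.notes[2], +2 more
```

Masking unlisted keys as `<field>` keeps the error message useful for locating a finding without echoing a key name that might itself be source-derived.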
@@ -369,62 +437,69 @@ def json_scan_strings(
         if leaf_key in NEVER_SCAN_JSON_STRING_KEYS:
             return full_scan, light_scan, denylist_scan
         stripped = strip_allowed_text(value, allowed_names)
+        location = json_path(path)
         if leaf_key in DENYLIST_ONLY_JSON_STRING_KEYS:
-            denylist_scan.append(stripped)
+            denylist_scan.append((location, stripped))
         elif leaf_key in SCAN_LIGHT_JSON_STRING_KEYS:
-            light_scan.append(stripped)
+            light_scan.append((location, stripped))
         else:
-            full_scan.append(stripped)
+            full_scan.append((location, stripped))
     return full_scan, light_scan, denylist_scan
 
 
-def scan_private_identifier_denylist(
-
-
+def scan_private_identifier_denylist(
+    texts: list[tuple[str, str]],
+    private_patterns: list[tuple[str, re.Pattern[str]]],
+) -> list[tuple[str, str]]:
+    findings: set[tuple[str, str]] = set()
+    for location, text in texts:
         for _term, pattern in private_patterns:
             if pattern.search(text):
-                findings.add("private_identifier_denylist")
+                findings.add(("private_identifier_denylist", location))
                 break
     return sorted(findings)
 
 
-def scan_source_derived_names(
-
-
+def scan_source_derived_names(
+    texts: list[tuple[str, str]],
+    source_patterns: list[tuple[str, re.Pattern[str]]],
+) -> list[tuple[str, str]]:
+    findings: set[tuple[str, str]] = set()
+    for location, text in texts:
         for _term, pattern in source_patterns:
             if pattern.search(text):
-                findings.add("source_derived_name")
+                findings.add(("source_derived_name", location))
                 break
     return sorted(findings)
 
 
 def scan_identifier_patterns(
-    texts: list[str],
+    texts: list[tuple[str, str]],
     private_patterns: list[tuple[str, re.Pattern[str]]],
     skipped_patterns: set[str] | None = None,
-) -> list[str]:
-    findings: set[str] = set()
+) -> list[tuple[str, str]]:
+    findings: set[tuple[str, str]] = set()
     skipped_patterns = skipped_patterns or set()
-    for text in texts:
+    for location, text in texts:
         for _term, pattern in private_patterns:
             if pattern.search(text):
-                findings.add("private_identifier_denylist")
+                findings.add(("private_identifier_denylist", location))
                 break
         for name, pattern in IDENTIFIER_PATTERNS.items():
             if name in skipped_patterns:
                 continue
             if any(identifier_match_is_finding(name, text, match) for match in pattern.finditer(text)):
-                findings.add(name)
+                findings.add((name, location))
     return sorted(findings)
 
 
-def identifier_scan_texts(path: Path, text: str) -> tuple[list[tuple[str, str]], list[tuple[str, str]], list[tuple[str, str]]]:
+def identifier_scan_texts(path: Path, text: str) -> tuple[list[tuple[str, str]], list[tuple[str, str]], list[tuple[str, str]]]:
     if path.suffix.lower() != ".json":
-        return [strip_allowed_text(text, set())], [], []
+        return [("$", strip_allowed_text(text, set()))], [], []
     try:
         data = json.loads(text)
     except json.JSONDecodeError:
-        return [strip_allowed_text(text, set())], [], []
+        return [("$", strip_allowed_text(text, set()))], [], []
     allowed_names = public_names(data)
     return json_scan_strings(data, allowed_names)
 
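The shift in this hunk from bare strings to `(location, text)` pairs can be exercised in isolation. The sketch below mirrors the denylist scan shown in the diff; the function name `scan_denylist`, the sample texts, and the `widget` term are made up for illustration.

```python
import re


def scan_denylist(
    texts: list[tuple[str, str]],
    private_patterns: list[tuple[str, re.Pattern]],
) -> list[tuple[str, str]]:
    """Return sorted (finding name, location) pairs for texts that hit the denylist."""
    findings: set[tuple[str, str]] = set()
    for location, text in texts:
        for _term, pattern in private_patterns:
            if pattern.search(text):
                # Record where the hit was, never which private term matched.
                findings.add(("private_identifier_denylist", location))
                break
    return sorted(findings)


texts = [
    ("$.notes[0]", "uses the widget cache"),
    ("$.summary", "plain summary"),
    ("$.notes[1]", "widget again"),
]
patterns = [("widget", re.compile(r"\bwidget\b"))]
print(scan_denylist(texts, patterns))
```

Each finding names a category and a location only, so the report can point an author at the offending field without repeating denylist contents.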
@@ -465,7 +540,7 @@ def main() -> int:
         print(f"clean-room leakage scan failed: {redact_text(read_error)}", file=sys.stderr)
         return 1
     text = data.decode("utf-8", errors="replace")
-    findings = [name for name, pattern in BLOCKED_PATTERNS.items() if pattern.search(text)]
+    findings = [(name, "$") for name, pattern in BLOCKED_PATTERNS.items() if pattern.search(text)]
     full_scan_texts, light_scan_texts, denylist_scan_texts = identifier_scan_texts(path, text)
     findings.extend(scan_identifier_patterns(full_scan_texts, private_patterns))
     findings.extend(
@@ -484,7 +559,8 @@ def main() -> int:
     )
     if findings:
         print(
-            f"clean-room leakage scan failed for {describe_path(path)}:
+            f"clean-room leakage scan failed for {describe_path(path)}: "
+            f"{format_finding_details(sorted(set(findings)))}",
             file=sys.stderr,
         )
         return 1
@@ -9,6 +9,8 @@ const {
   VALID_NETWORK_POLICIES,
 } = require('./preflight-constants.cjs');
 
+const EXPLICIT_USER_ANSWER = 'explicit-user-answer';
+
 /**
  * Assert that a value is an object (not null and not an array), appending errors on failure.
  * @param {any} value - Value to check.
@@ -98,6 +100,43 @@ function validateStringArray(root, field, errors) {
   }
 }
 
+function isPlaceholderText(value) {
+  if (typeof value !== 'string') return false;
+  const normalized = value.trim().toLowerCase();
+  return normalized === '' ||
+    normalized === 'tbd' ||
+    normalized.startsWith('tbd:') ||
+    normalized === 'todo' ||
+    normalized.startsWith('todo:') ||
+    normalized === 'unknown';
+}
+
+function validateCompletedGoalFields(goal, errors) {
+  if (isPlaceholderText(goal?.end_goal?.success_definition)) {
+    errors.push('completed preflight input requires user-confirmed end_goal.success_definition, not a placeholder');
+  }
+  if (expectObject(goal?.target_stack, 'target_stack', errors)) {
+    for (const field of ['language', 'runtime', 'framework', 'package_manager', 'test_framework']) {
+      const value = goal.target_stack[field];
+      if (value !== null && isPlaceholderText(value)) {
+        errors.push(`completed preflight input requires user-confirmed target_stack.${field}, not a placeholder`);
+      }
+    }
+  }
+}
+
+function validateIntentConfirmation(goal, errors) {
+  if (!expectObject(goal.intent_confirmation, 'intent_confirmation', errors)) return;
+  expectString(goal.intent_confirmation.confirmed_at, 'intent_confirmation.confirmed_at', errors);
+  for (const field of ['end_goal_source', 'target_stack_source', 'controller_mode_source']) {
+    if (goal.intent_confirmation[field] !== EXPLICIT_USER_ANSWER) {
+      errors.push(`intent_confirmation.${field} must be "${EXPLICIT_USER_ANSWER}"`);
+    }
+  }
+  expectString(goal.intent_confirmation.user_goal_summary, 'intent_confirmation.user_goal_summary', errors);
+  expectString(goal.intent_confirmation.user_target_stack_summary, 'intent_confirmation.user_target_stack_summary', errors);
+}
+
 /**
  * Validate a preflight goal contract object.
  * @param {object} goal - Goal contract object to validate.
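For readers following the placeholder check in this hunk, here is a Python rendering of the same rule. It is a sketch that mirrors `isPlaceholderText` as shown in the diff, not code shipped in the package; the name `is_placeholder_text` and the sample inputs are illustrative.

```python
def is_placeholder_text(value: object) -> bool:
    """Return True for strings that stand in for a missing answer."""
    if not isinstance(value, str):
        return False  # non-strings (including None) are not treated as placeholders
    normalized = value.strip().lower()
    return (
        normalized in {"", "tbd", "todo", "unknown"}
        or normalized.startswith(("tbd:", "todo:"))
    )


for sample in ["  TBD ", "todo: pick a runtime", "unknown", "Python", "unknowns", None]:
    print(repr(sample), is_placeholder_text(sample))
```

Note that `validateCompletedGoalFields` only applies this check to `target_stack` fields that are not `null`, so an explicit `null` passes while a string such as `unknown` is rejected.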
@@ -215,6 +254,23 @@ function validateGoalContract(goal, options = {}) {
     if (blocking.length > 0) {
       errors.push('completed preflight input must not contain blocking open_questions');
     }
+    validateCompletedGoalFields(goal, errors);
+    if (goal.intent_confirmation === undefined) {
+      errors.push('completed preflight input requires intent_confirmation with explicit user-confirmed end goal and target stack');
+    } else {
+      validateIntentConfirmation(goal, errors);
+    }
+  } else if (goal.intent_confirmation !== undefined) {
+    validateIntentConfirmation(goal, errors);
+  }
+
+  if (options.requireUnattended) {
+    if (goal.controller_policy?.mode !== 'unattended') {
+      errors.push('runner-ready preflight requires controller_policy.mode="unattended"');
+    }
+    if (!options.requireComplete && goal.intent_confirmation === undefined) {
+      errors.push('runner-ready preflight requires intent_confirmation with explicit user-confirmed end goal and target stack');
+    }
   }
 
   return errors;
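A goal contract that satisfies the `intent_confirmation` checks might look like the dictionary below. This is a hedged sketch: the field names come from the validator in this diff, but the sample values, the `intent_confirmation_errors` helper, and its "must be an object" message are assumptions made for illustration.

```python
EXPLICIT_USER_ANSWER = "explicit-user-answer"
SOURCE_FIELDS = ("end_goal_source", "target_stack_source", "controller_mode_source")


def intent_confirmation_errors(goal: dict) -> list[str]:
    """Collect errors when any provenance field is not an explicit user answer."""
    confirmation = goal.get("intent_confirmation")
    if not isinstance(confirmation, dict):
        return ["intent_confirmation must be an object"]  # hypothetical wording
    errors = []
    for field in SOURCE_FIELDS:
        if confirmation.get(field) != EXPLICIT_USER_ANSWER:
            errors.append(f'intent_confirmation.{field} must be "{EXPLICIT_USER_ANSWER}"')
    return errors


# Sample values are invented; only the field names come from the diff.
goal = {
    "intent_confirmation": {
        "confirmed_at": "2025-01-01T00:00:00Z",
        "end_goal_source": EXPLICIT_USER_ANSWER,
        "target_stack_source": EXPLICIT_USER_ANSWER,
        "controller_mode_source": EXPLICIT_USER_ANSWER,
        "user_goal_summary": "Rebuild the tool with matching public behavior.",
        "user_target_stack_summary": "Language and test framework chosen by the user.",
    }
}
print(intent_confirmation_errors(goal))

inferred = {"intent_confirmation": dict(goal["intent_confirmation"], target_stack_source="inferred")}
print(intent_confirmation_errors(inferred))
```

The second call reports one error because a target stack that was inferred rather than stated by the user does not count as confirmed.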
package/lib/run-coverage.cjs
CHANGED
@@ -90,6 +90,23 @@ function evidenceEntryMap(roots) {
   return { evidence, map };
 }
 
+function evidenceLedgerMissingMessage(ref) {
+  return [
+    `coverage-ledger references evidence but canonical evidence-ledger.json is missing: ${ref}`,
+    'write one contaminated-side evidence-ledger.json; do not use per-unit evidence-ledger filenames',
+  ].join('; ');
+}
+
+function evidenceSourceUnitMismatchMessage(ref, unitId) {
+  return [
+    `coverage-ledger evidence ref points at a different source unit: ${ref}`,
+    'evidence source_unit_ref was rejected; value not shown because it may contain a source path or private identifier',
+    `coverage unit_id=${unitId}`,
+    `source_unit_ref must be the task-manifest unit id or accepted unit alias (${[...unitRefValues(unitId)].join(', ')})`,
+    'source paths belong in evidence_location_ref or source_index_refs/visual_index_refs, not source_unit_ref',
+  ].join('; ');
+}
+
 function hasUnresolvedCoverageTicket(coverageLedger, unitId) {
   return (coverageLedger?.abstract_delta_tickets || []).some((ticket) => {
     return (!ticket.unit_id || ticket.unit_id === unitId) && ticket.status !== 'resolved';
@@ -304,15 +321,15 @@ function validateCoverageLedgerIntegrity(manifest, roots, coverageLedger) {
   if (evidenceRefs.length > 0) {
     const { evidence, map } = evidenceEntryMap(roots);
     if (!evidence) {
-      throw new Error(
+      throw new Error(evidenceLedgerMissingMessage(evidenceRefs[0].ref));
     }
     for (const { ref, evidenceId, unitId } of evidenceRefs) {
       const entry = map.get(evidenceId);
       if (!entry) {
-        throw new Error(`coverage-ledger references missing evidence-ledger item: ${ref}`);
+        throw new Error(`coverage-ledger references missing evidence-ledger item in canonical evidence-ledger.json: ${ref}`);
       }
       if (entry.source_unit_ref && !unitRefValues(unitId).has(entry.source_unit_ref)) {
-        throw new Error(
+        throw new Error(evidenceSourceUnitMismatchMessage(ref, unitId));
       }
     }
   }
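The mismatch error in this file deliberately leaves out the rejected `source_unit_ref` value. The Python sketch below illustrates that design under assumptions: the `accepted` set stands in for whatever `unitRefValues(unitId)` returns, which this diff does not show, and the sample ref and path are invented.

```python
def source_unit_mismatch_message(ref: str, unit_id: str, accepted: set[str]) -> str:
    """Build an actionable error that never echoes the rejected value."""
    return "; ".join([
        f"coverage-ledger evidence ref points at a different source unit: {ref}",
        "evidence source_unit_ref was rejected; value not shown because it may "
        "contain a source path or private identifier",
        f"coverage unit_id={unit_id}",
        "source_unit_ref must be the task-manifest unit id or accepted unit alias "
        f"({', '.join(sorted(accepted))})",
    ])


def check_entry(entry: dict, ref: str, unit_id: str, accepted: set[str]) -> None:
    """Raise when an entry names a source unit outside the accepted set."""
    value = entry.get("source_unit_ref")
    if value and value not in accepted:
        raise ValueError(source_unit_mismatch_message(ref, unit_id, accepted))


try:
    check_entry({"source_unit_ref": "src/private/module.py"}, "evidence:ev-7", "unit-a", {"unit-a"})
except ValueError as exc:
    message = str(exc)
    print(message)
```

The message lists the accepted values and the offending ref, which is enough to repair the ledger, while the source path that was wrongly stored in `source_unit_ref` stays out of logs.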
package/lib/run-roots.cjs
CHANGED
@@ -4,7 +4,8 @@ const fs = require('node:fs');
 const os = require('node:os');
 const path = require('node:path');
 
-const { fileHash } = require('./fs-utils.cjs');
+const { fileHash, readJsonFile } = require('./fs-utils.cjs');
+const { validateGoalContract } = require('./preflight-validation.cjs');
 const {
   BASE_ENV_ALLOWLIST,
   CI_ENV_ALLOWLIST,
@@ -224,6 +225,14 @@ function verifyPreflightGoal(manifest, manifestDir, roots) {
   if (actual !== expectedHash) {
     throw new Error(`preflight goal sha256 mismatch: ${preflightGoalPath}`);
   }
+  const preflightGoal = readJsonFile(preflightGoalRealPath, null);
+  const errors = validateGoalContract(preflightGoal, { requireComplete: true, requireUnattended: true });
+  if (preflightGoal?.controller_policy?.mode !== manifest.controller_policy?.mode) {
+    errors.push('preflight goal controller_policy.mode must match task-manifest controller_policy.mode');
+  }
+  if (errors.length > 0) {
+    throw new Error(`preflight goal is not runner-ready:\n ${errors.join('\n ')}`);
+  }
 }
 
 function pathIsUnder(child, parent) {
package/package.json
CHANGED
package/plugin.json
CHANGED
@@ -44,7 +44,7 @@ Agent zero/controller must set and pass the clean-room environment block into ev
 
 When `context_management.mode` is `role-session-briefs`, every role session starts from `CLEAN_ROOM_SESSION_BRIEF_PATH` plus the environment block. In `strict` enforcement, the controller must start a fresh model session, profile, or thread for each role, pass `CLEAN_ROOM_FRESH_CONTEXT_REQUIRED=1`, and keep the stage prompt, session brief, artifact ref count, and referenced artifact bytes inside the recorded budgets. Do not clear or delete durable artifacts to save tokens. Clear only model/chat context between roles.
 
-`preflight-goal.json` is required before source indexing, visual indexing, or Agent 0 decomposition. It records the end goal, target stack, license policy, dependency policy, compatibility/exactness policy, feature policy, code hygiene limits, output policy, and controller mode. It is controller/contaminated-side only; clean roles receive only the clean-safe `goal_contract` subset and `code_hygiene_policy` through `clean-run-context.json`.
+`preflight-goal.json` is required before source indexing, visual indexing, or Agent 0 decomposition. It records the end goal, target stack, license policy, dependency policy, compatibility/exactness policy, feature policy, code hygiene limits, output policy, and controller mode. Completed preflight inputs and unattended contracts also record `intent_confirmation` with explicit user-confirmed end goal, target stack, and controller mode. It is controller/contaminated-side only; clean roles receive only the clean-safe `goal_contract` subset and `code_hygiene_policy` through `clean-run-context.json`.
 
 When source scope is larger than a single obvious unit, run `scripts/build_source_index.py` as source-index preflight before starting clean-room role sessions. The resulting `source-index.json` is contaminated-only input for Agent 0. It may contain source paths, import/export names, dependency relationships, large-file segment spans, and optional local AST/indexing tool status, so do not place it in clean handoff packages or expose it to Agent 1.5, Agent 2, Agent 3, or Agent 4.
|
|
50
50
|
|
|
@@ -54,7 +54,7 @@ Optional AST/indexing helpers are detected before the controller loop through `s
|
|
|
54
54
|
|
|
55
55
|
Controller mode defaults to `attended` when `task-manifest.json` has no `controller_policy`. The outer loop evolves specs and selects one approved spec slice. Code-development runs start with exactly one `unit_kind: "foundation"` unit named by `loop_context.foundation_unit_ref`; non-foundation behavior slices wait until that unit is covered. The inner clean-room loop completes the approved slice through sanitized handoff, implementation, QC, optional final polish review, and contaminated-side coverage verification, then returns `clean-room-result.json` to the outer loop. In `attended` mode, agent zero pauses for human review at scope gate, handoff, QC deltas, polish deltas, blocked units, and final coverage. In `unattended` mode, agent zero may run a bounded inner loop: reload durable artifacts for each iteration, select at most one pending or gap unit inside `loop_context.approved_scope_refs`, start each role from fresh context with the required environment block, validate before advancing, and stop on any configured safety or ambiguity condition.
|
|
56
56
|
|
|
57
|
-
In Claude Code unattended mode, launch the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude` when possible. If `clean-room-skill` is not on `PATH`, immediately use `npx clean-room-skill@latest run --task-manifest <path> --agent-runtime claude`. Do not search plugin cache paths for schema files, and do not pass `--schema-dir /dev/null`; the runner uses bundled schemas by default. The main conversation must not do Agent 1, Agent 2, Agent 3, or Agent 4 work, and must not ask to continue while unattended policy still allows bounded progress. If role-agent dispatch is unavailable, fail closed with a blocker.
|
|
57
|
+
In Claude Code unattended mode, launch the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude` when possible and only after `task-manifest.json` has `loop_context` naming an approved pending or gap unit. If an unattended manifest lacks `loop_context`, treat it as incomplete outer-loop state and finish selected-slice approval before the runner is invoked. If `clean-room-skill` is not on `PATH`, immediately use `npx clean-room-skill@latest run --task-manifest <path> --agent-runtime claude`. Do not search plugin cache paths for schema files, and do not pass `--schema-dir /dev/null`; the runner uses bundled schemas by default. The main conversation must not do Agent 1, Agent 2, Agent 3, or Agent 4 work once runner-ready unattended state exists, and must not ask to continue while unattended policy still allows bounded progress. If role-agent dispatch is unavailable, fail closed with a blocker.
|
|
58
58
|
|
|
59
59
|
Do not grant shell-style tools to Agent 0, Agent 1, Agent 1.5, Agent 2, or the default Agent 3/4 role sessions. Agent 3 terminal verification may use shell-style tools only when `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`, the command cwd is under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and the command invokes the installed `agent3-verification-runner.py`. Agent 4 polish verification and commit may use shell-style tools only when `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1`, cwd is under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and the command invokes the installed `agent4-polish-runner.py`. Use `--hooks=strict` for dedicated Codex, Claude, or OpenCode clean-room homes so hooks fail closed if required environment is missing or shell tools are invoked outside the allowed runner boundaries. Safe hook installs are compatibility-only between runs; during init/onboarding, prepare the role environment block and pass it into every clean-room role session so safe hooks enforce during active work.
|
|
60
60
|
|
|
@@ -93,7 +93,7 @@ Classify the selected candidate before starting the wizard:
|
|
|
93
93
|
- Invalid `preflight-goal.json`: stop, report canonical schema or required-field errors, and do not create a replacement preflight.
|
|
94
94
|
- No artifacts found: start the normal preflight wizard.
|
|
95
95
|
|
|
96
|
-
Load or create `preflight-goal.json` only after this discovery step. Do not start attended or unattended execution until the goal contract records the end goal, target stack, license policy, dependency policy, compatibility/exactness policy, feature add/remove policy, code hygiene limits, output policy, existing destination policy, and controller mode.
|
|
96
|
+
Load or create `preflight-goal.json` only after this discovery step. Do not start attended or unattended execution until the goal contract records the end goal, target stack, license policy, dependency policy, compatibility/exactness policy, feature add/remove policy, code hygiene limits, output policy, existing destination policy, and controller mode. Do not infer end goal, target language, runtime, framework, package manager, or test framework from source contents. If the user's end goal or target stack is unknown, record blocking `open_questions`, keep unattended disabled, and do not write runner-ready `task-manifest.json` or `clean-run-context.json`.
|
|
97
97
|
|
|
98
98
|
Gather only the setup facts needed to decide whether the workflow may start, or invoke `init` when the user wants a dedicated setup pass:
|
|
99
99
|
|
|
@@ -101,7 +101,7 @@ Gather only the setup facts needed to decide whether the workflow may start, or
|
|
|
101
101
|
- Artifact base root. Default the task root to `~/Documents/CleanRoom/<project>/tasks/<task-id>/`. If the user does not provide an explicitly approved neutral task ID, generate one as `task-` plus 8 lowercase hex characters. Do not derive task IDs or output directory names from source folder names.
|
|
102
102
|
- Project grouping. Default to the clean-room project layout: `<base>/<project>/tasks/<task-id>/` with one shared `<base>/<project>/implementation/` root for every task in the project. When the user does not supply an approved neutral project name, generate `proj-` plus 8 lowercase hex characters; it must match `[a-z0-9][a-z0-9-]{0,63}`, must never be derived from source or destination folder basenames or meaningful source-name tokens, and appears in paths clean roles can see. Use the legacy flat `<base>/<task-id>/` layout only when the user explicitly chooses single-task compatibility. Only one task per project may run at a time because tasks share the implementation root; the durable runner enforces this with an advisory `.clean-room-implementation.lock` in each implementation root.
|
|
103
103
|
- Source roots or fallback visual evidence roots, contaminated artifact root, clean artifact root, clean implementation root, quarantine root, and optional public or destination reference roots.
|
|
104
|
-
-
|
|
104
|
+
- Explicit user-confirmed end goal, target stack, and destination constraints from `preflight-goal.json`.
|
|
105
105
|
- Target schema profile: `openspec-delta`, `gsd-planning-package`, `speckit-feature-folder`, or `kiro-spec-folder`.
|
|
106
106
|
- Default model plus optional clean, contaminated, or per-role overrides.
|
|
107
107
|
- Additional user rules split into clean-safe and contaminated-only rules.
|
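The neutral naming rules in this hunk (`task-` or `proj-` plus 8 lowercase hex characters, project names matching `[a-z0-9][a-z0-9-]{0,63}`) can be illustrated with a short Python sketch. The function names are hypothetical and are not part of the package; the point is only that the identifier comes from a random source and never from source folder names.

```python
import re
import secrets

# Pattern quoted in the project grouping rule.
PROJECT_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{0,63}")


def generate_neutral_id(prefix):
    """Return e.g. 'task-1a2b3c4d': the prefix plus 8 random lowercase hex chars."""
    # token_hex(4) yields 4 random bytes rendered as 8 lowercase hex digits.
    return f"{prefix}-{secrets.token_hex(4)}"


def is_valid_project_name(name):
    """Check a user-supplied or generated project name against the pattern."""
    return PROJECT_NAME_RE.fullmatch(name) is not None
```

A generated `proj-` identifier always satisfies the pattern, so the validation only matters for names the user supplies.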
|
@@ -113,7 +113,7 @@ Before indexing or artifact generation, confirm that source roots, contaminated
|
|
|
113
113
|
|
|
114
114
|
For `attended` mode, record a `controller_policy` that pauses for human review at scope gate, clean handoff, terminal implementation deltas, blocked units, and final coverage. Include stop conditions for `authorization-missing`, `scope-change`, `contamination-suspected`, `schema-validation-failed`, `leakage-scan-failed`, `unit-blocked`, `implementation-complete`, and `coverage-complete`; attended mode does not add an iteration-limit stop unless the user explicitly sets one.
|
|
115
115
|
|
|
116
|
-
For `unattended` mode, require explicit authorization, separated roots, finite bounds, `loop_context`, and a complete `preflight-goal.json` with no `open_questions` and `unattended_allowed_after_preflight: true` before work starts. Record `controller_policy.mode` as `unattended`, `max_units_per_iteration` as `1`, `max_iterations` from preflight, and include these stop conditions: `authorization-missing`, `scope-change`, `contamination-suspected`, `schema-validation-failed`, `leakage-scan-failed`, `unit-blocked`, `implementation-complete`, `coverage-complete`, `iteration-limit-reached`, `spec-slice-complete`, `spec-slice-blocked`, `spec-delta-required`, `no-progress-detected`, `repeated-unit-selection`, and `clean-room-returned`.
|
|
116
|
+
For `unattended` mode, require explicit authorization, separated roots, finite bounds, `loop_context`, and a complete `preflight-goal.json` with no `open_questions`, `intent_confirmation` for explicit user-confirmed goal and target stack, and `unattended_allowed_after_preflight: true` before work starts. Record `controller_policy.mode` as `unattended`, `max_units_per_iteration` as `1`, `max_iterations` from preflight, and include these stop conditions: `authorization-missing`, `scope-change`, `contamination-suspected`, `schema-validation-failed`, `leakage-scan-failed`, `unit-blocked`, `implementation-complete`, `coverage-complete`, `iteration-limit-reached`, `spec-slice-complete`, `spec-slice-blocked`, `spec-delta-required`, `no-progress-detected`, `repeated-unit-selection`, and `clean-room-returned`.
|
|
117
117
|
|
|
118
118
|
Default sequence:
|
|
119
119
|
|
package/skills/clean-room/assets/evidence-ledger.schema.json
CHANGED
|
@@ -41,6 +41,7 @@
|
|
|
41
41
|
},
|
|
42
42
|
"source_unit_ref": {
|
|
43
43
|
"type": "string",
|
|
44
|
+
"description": "Task-manifest unit id or accepted unit alias for the assigned unit. Do not put source paths here; use evidence_location_ref and source_index_refs or visual_index_refs for source/index location details.",
|
|
44
45
|
"minLength": 1
|
|
45
46
|
},
|
|
46
47
|
"evidence_type": {
|
|
@@ -58,6 +59,7 @@
|
|
|
58
59
|
},
|
|
59
60
|
"evidence_location_ref": {
|
|
60
61
|
"type": "string",
|
|
62
|
+
"description": "Contaminated-only pointer to where the evidence was observed, such as a source-index ref, visual-index ref, or other non-clean location reference.",
|
|
61
63
|
"minLength": 1
|
|
62
64
|
},
|
|
63
65
|
"source_hash": {
|
package/skills/clean-room/assets/preflight-goal.schema.json
CHANGED
|
@@ -26,6 +26,41 @@
|
|
|
26
26
|
"type": "string",
|
|
27
27
|
"format": "date-time"
|
|
28
28
|
},
|
|
29
|
+
"intent_confirmation": {
|
|
30
|
+
"type": "object",
|
|
31
|
+
"additionalProperties": false,
|
|
32
|
+
"required": [
|
|
33
|
+
"confirmed_at",
|
|
34
|
+
"end_goal_source",
|
|
35
|
+
"target_stack_source",
|
|
36
|
+
"controller_mode_source",
|
|
37
|
+
"user_goal_summary",
|
|
38
|
+
"user_target_stack_summary"
|
|
39
|
+
],
|
|
40
|
+
"properties": {
|
|
41
|
+
"confirmed_at": {
|
|
42
|
+
"type": "string",
|
|
43
|
+
"format": "date-time"
|
|
44
|
+
},
|
|
45
|
+
"end_goal_source": {
|
|
46
|
+
"const": "explicit-user-answer"
|
|
47
|
+
},
|
|
48
|
+
"target_stack_source": {
|
|
49
|
+
"const": "explicit-user-answer"
|
|
50
|
+
},
|
|
51
|
+
"controller_mode_source": {
|
|
52
|
+
"const": "explicit-user-answer"
|
|
53
|
+
},
|
|
54
|
+
"user_goal_summary": {
|
|
55
|
+
"type": "string",
|
|
56
|
+
"minLength": 1
|
|
57
|
+
},
|
|
58
|
+
"user_target_stack_summary": {
|
|
59
|
+
"type": "string",
|
|
60
|
+
"minLength": 1
|
|
61
|
+
}
|
|
62
|
+
}
|
|
63
|
+
},
|
|
29
64
|
"end_goal": {
|
|
30
65
|
"type": "object",
|
|
31
66
|
"additionalProperties": false,
|
|
@@ -371,6 +406,9 @@
|
|
|
371
406
|
]
|
|
372
407
|
},
|
|
373
408
|
"then": {
|
|
409
|
+
"required": [
|
|
410
|
+
"intent_confirmation"
|
|
411
|
+
],
|
|
374
412
|
"properties": {
|
|
375
413
|
"controller_policy": {
|
|
376
414
|
"properties": {
|
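The `intent_confirmation` constraints added above (six required keys, no additional properties, three `const` sources, two non-empty summaries) can be checked by hand as in the Python sketch below. This is an illustration using only the standard library, not the package's validator; the function names are hypothetical, and the `date-time` check is a loose approximation, since JSON Schema validators may treat `format` as an annotation only.

```python
from datetime import datetime

SOURCE_KEYS = ("end_goal_source", "target_stack_source", "controller_mode_source")
SUMMARY_KEYS = ("user_goal_summary", "user_target_stack_summary")
REQUIRED_KEYS = ("confirmed_at",) + SOURCE_KEYS + SUMMARY_KEYS


def _looks_like_date_time(text):
    """Loose ISO 8601 check; accepts a trailing 'Z' as UTC."""
    if not isinstance(text, str):
        return False
    try:
        datetime.fromisoformat(text[:-1] + "+00:00" if text.endswith("Z") else text)
    except ValueError:
        return False
    return True


def intent_confirmation_errors(value):
    """Return a list of problems; an empty list means the object passes."""
    if not isinstance(value, dict):
        return ["intent_confirmation must be an object"]
    errors = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in value]
    # additionalProperties is false, so unknown keys are rejected.
    errors += [f"unexpected key: {key}" for key in value if key not in REQUIRED_KEYS]
    for key in SOURCE_KEYS:
        if key in value and value[key] != "explicit-user-answer":
            errors.append(f"{key} must be 'explicit-user-answer'")
    for key in SUMMARY_KEYS:
        if key in value and not (isinstance(value[key], str) and value[key]):
            errors.append(f"{key} must be a non-empty string")
    if "confirmed_at" in value and not _looks_like_date_time(value["confirmed_at"]):
        errors.append("confirmed_at must be a date-time string")
    return errors
```

Because each source field is a `const`, a contract whose goal or stack was inferred from source code cannot pass by recording a different provenance value.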
package/skills/clean-room/examples/contaminated-side/preflight-goal.json
CHANGED
|
@@ -1,6 +1,14 @@
|
|
|
1
1
|
{
|
|
2
2
|
"goal_id": "goal-task-example",
|
|
3
3
|
"created_at": "2024-01-01T00:00:00Z",
|
|
4
|
+
"intent_confirmation": {
|
|
5
|
+
"confirmed_at": "2024-01-01T00:00:00Z",
|
|
6
|
+
"end_goal_source": "explicit-user-answer",
|
|
7
|
+
"target_stack_source": "explicit-user-answer",
|
|
8
|
+
"controller_mode_source": "explicit-user-answer",
|
|
9
|
+
"user_goal_summary": "Build a behavior-compatible clean implementation from approved clean specs.",
|
|
10
|
+
"user_target_stack_summary": "JavaScript on Node.js with npm and node:test."
|
|
11
|
+
},
|
|
4
12
|
"end_goal": {
|
|
5
13
|
"intent": "clean-room-reimplementation",
|
|
6
14
|
"success_definition": "Build a behavior-compatible clean implementation from approved clean specs.",
|
package/skills/clean-room/examples/contaminated-side/task-manifest.json
CHANGED
|
@@ -17,7 +17,7 @@
|
|
|
17
17
|
"source_acquisition_basis": "Authorized local source access.",
|
|
18
18
|
"license_contract_notes": "No legal conclusion recorded.",
|
|
19
19
|
"preflight_goal_ref": "preflight-goal.json",
|
|
20
|
-
"preflight_goal_sha256": "
|
|
20
|
+
"preflight_goal_sha256": "a168b62605d5e3e262ba388e9fc75d99b1413450cd8aa0d96861e6d6496e9420",
|
|
21
21
|
"source_index_ref": "source-index.json",
|
|
22
22
|
"run_state": {
|
|
23
23
|
"generation": 1,
|
package/skills/clean-room/references/PREFLIGHT.md
CHANGED
|
@@ -8,6 +8,7 @@ Ask only enough to fill `preflight-goal.json`:
|
|
|
8
8
|
|
|
9
9
|
- End goal: clean reimplementation, behavior-compatible port, API-compatible clone, modernization, partial extraction, or spec/test generation only.
|
|
10
10
|
- Target stack: language, runtime, framework, package manager, and test framework.
|
|
11
|
+
- Intent confirmation: completed and unattended contracts must record that end goal, target stack, and controller mode came from explicit user answers.
|
|
11
12
|
- Exactness: public APIs, CLI behavior, config files, output formats, error codes, UI behavior, or behavior-only.
|
|
12
13
|
- Visual fallback: when no source code is available, confirm what authorized screenshots are meant to accomplish, the target user flow, screenshot coverage, target stack, UI exactness boundary, and whether visible words are public compatibility surface.
|
|
13
14
|
- Forbidden mirroring: internal names, private structure, comments, source file layout, private helper behavior, and dependencies.
|
|
@@ -20,7 +21,7 @@ Ask only enough to fill `preflight-goal.json`:
|
|
|
20
21
|
|
|
21
22
|
## Defaults
|
|
22
23
|
|
|
23
|
-
Record every default as an assumption. Good defaults:
|
|
24
|
+
Record every default as an assumption. Do not default the end goal or target stack from source code. Source language, runtime, framework, package manager, and test framework describe the input, not the user's requested destination. If either the end goal or target stack is unknown, keep a blocking `open_questions` entry and do not mark an unattended contract complete. Good defaults:
|
|
24
25
|
|
|
25
26
|
- Artifact base: `~/Documents/CleanRoom/<project>/tasks/<task-id>/`.
|
|
26
27
|
- Implementation root: `~/Documents/CleanRoom/<project>/implementation/`.
|
package/skills/clean-room/references/PROCESS.md
CHANGED
|
@@ -118,7 +118,7 @@ Contaminated manager/verifier:
|
|
|
118
118
|
|
|
119
119
|
- Confirm authorization and source scope.
|
|
120
120
|
- Create or validate `preflight-goal.json` before source discovery and record its ref/hash in `task-manifest.json`.
|
|
121
|
-
- Do not infer target language, dependency policy, license policy, exactness policy, output directory, or feature add/remove policy from source.
|
|
121
|
+
- Do not infer end goal, target language, runtime, framework, package manager, test framework, dependency policy, license policy, exactness policy, output directory, or feature add/remove policy from source. Completed and unattended preflight contracts require explicit user intent confirmation.
|
|
122
122
|
- Create or update controller-side `init-config.json` when the user invokes initialization, then snapshot effective preferences into `task-manifest.json`.
|
|
123
123
|
- Produce sanitized `clean-run-context.json` for Agent 2, Agent 3, and Agent 4. Include clean artifact paths, implementation root environment references, target profile, clean-safe goal contract fields, code hygiene policy, approved public refs, clean-safe rules, clean-side model preferences, and artifact-only coordination policy only.
|
|
124
124
|
- Record optional `context_management` budgets in `task-manifest.json` and `clean-run-context.json` when low-context handoffs are enabled.
|
package/skills/clean-room/scripts/build_visual_index.py
CHANGED
|
@@ -260,7 +260,7 @@ def collect_images(
|
|
|
260
260
|
ignore_dirs = set(DEFAULT_IGNORE_DIRS) | set(args.ignore_dir)
|
|
261
261
|
images: list[dict[str, Any]] = []
|
|
262
262
|
skipped_entries: list[dict[str, str]] = []
|
|
263
|
-
counters = {"skipped_count": 0, "total_bytes": 0}
|
|
263
|
+
counters = {"skipped_count": 0, "total_bytes": 0, "attempted_total_bytes": 0}
|
|
264
264
|
next_image_id = 1
|
|
265
265
|
|
|
266
266
|
for root in roots:
|
|
@@ -270,7 +270,7 @@ def collect_images(
|
|
|
270
270
|
def limit_reached_reason() -> str | None:
|
|
271
271
|
if len(images) >= args.max_files:
|
|
272
272
|
return "file-count-limit"
|
|
273
|
-
if counters["
|
|
273
|
+
if counters["attempted_total_bytes"] >= args.max_total_bytes:
|
|
274
274
|
return "total-byte-limit"
|
|
275
275
|
return None
|
|
276
276
|
|
|
@@ -347,7 +347,7 @@ def collect_images(
|
|
|
347
347
|
if stat.st_size > args.max_file_bytes:
|
|
348
348
|
add_skipped(skipped_entries, counters, rel, "file-byte-limit", "file")
|
|
349
349
|
continue
|
|
350
|
-
if counters["
|
|
350
|
+
if counters["attempted_total_bytes"] + stat.st_size > args.max_total_bytes:
|
|
351
351
|
add_skipped(skipped_entries, counters, rel, "total-byte-limit", "file")
|
|
352
352
|
continue
|
|
353
353
|
|
|
@@ -367,9 +367,10 @@ def collect_images(
|
|
|
367
367
|
if len(data) > args.max_file_bytes:
|
|
368
368
|
add_skipped(skipped_entries, counters, rel, "file-byte-limit-after-read", "file")
|
|
369
369
|
continue
|
|
370
|
-
|
|
370
|
+
counters["attempted_total_bytes"] += len(data)
|
|
371
|
+
if counters["attempted_total_bytes"] > args.max_total_bytes:
|
|
371
372
|
add_skipped(skipped_entries, counters, rel, "total-byte-limit-after-read", "file")
|
|
372
|
-
|
|
373
|
+
break
|
|
373
374
|
|
|
374
375
|
metadata = image_metadata(data, suffix)
|
|
375
376
|
if metadata is None:
|
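The byte-budget change in `collect_images` tracks bytes that were attempted (read), not only bytes that were kept, and stops scanning once a read pushes that counter past the limit. The sketch below reproduces only that budget logic in isolation. It is a simplified illustration under stated assumptions: the function name and the `(name, stat_size, data)` tuples are hypothetical stand-ins for the real directory walk, `stat` call and file read, and the rest of `collect_images` is not modeled.

```python
def select_within_budget(files, max_file_bytes, max_total_bytes):
    """Apply the per-file and total byte limits to (name, stat_size, data) tuples."""
    kept, skipped = [], []
    attempted_total_bytes = 0
    for name, stat_size, data in files:
        if stat_size > max_file_bytes:
            skipped.append((name, "file-byte-limit"))
            continue
        # Pre-read check: skip files whose reported size would exceed the budget.
        if attempted_total_bytes + stat_size > max_total_bytes:
            skipped.append((name, "total-byte-limit"))
            continue
        if len(data) > max_file_bytes:
            skipped.append((name, "file-byte-limit-after-read"))
            continue
        # Count the bytes as soon as they are read, whether or not the file is kept.
        attempted_total_bytes += len(data)
        if attempted_total_bytes > max_total_bytes:
            # The file grew between stat and read; stop scanning entirely.
            skipped.append((name, "total-byte-limit-after-read"))
            break
        kept.append(name)
    return kept, skipped, attempted_total_bytes
```

Counting attempted bytes bounds the total amount of data read in one run even when many files are read and then rejected by later checks.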
package/skills/init/SKILL.md
CHANGED
|
@@ -13,7 +13,9 @@ Initialize or revise durable Clean Room run preferences before source analysis s
|
|
|
13
13
|
|
|
14
14
|
## Preflight Goal Contract
|
|
15
15
|
|
|
16
|
-
Before creating active artifacts, collect or confirm `preflight-goal.json`. Do not start attended or unattended execution until the goal contract records end goal, target stack, license policy, dependency policy, compatibility/exactness policy, feature add/remove policy, code hygiene limits, output policy, existing destination policy, and controller mode.
|
|
16
|
+
Before creating active artifacts, collect or confirm `preflight-goal.json`. Do not start attended or unattended execution until the goal contract records end goal, target stack, license policy, dependency policy, compatibility/exactness policy, feature add/remove policy, code hygiene limits, output policy, existing destination policy, and controller mode. Completed preflight inputs and unattended contracts must also record `intent_confirmation` proving the end goal, target stack, and controller mode came from explicit user answers.
|
|
17
|
+
|
|
18
|
+
Do not infer the user's end goal or target stack from the source repository. A source stack is not a destination stack; ports and rewrites often intentionally change language, runtime, framework, package manager, and test framework. If end goal or target stack is unknown, leave blocking `open_questions`, keep `controller_policy.unattended_allowed_after_preflight` false, and do not write runner-ready `task-manifest.json` or `clean-run-context.json`.
|
|
17
19
|
|
|
18
20
|
Keep `preflight-goal.json` in the controller/contaminated artifact domain. Clean roles receive only the clean-safe `goal_contract` subset, `code_hygiene_policy`, and optional Agent 4 local commit policy through `clean-run-context.json`.
|
|
19
21
|
|
|
@@ -32,7 +34,7 @@ Collect only setup decisions that affect correctness, safety, resumability, or o
|
|
|
32
34
|
- Artifact base root. Default the task root to `~/Documents/CleanRoom/<project>/tasks/<task-id>/`, never to the source workspace or a temporary directory unless the user explicitly chooses it. If the user does not provide an explicitly approved neutral task ID, generate one as `task-` plus 8 lowercase hex characters. Do not derive task IDs or output directory names from source folder names.
|
|
33
35
|
- Project grouping. Default to a clean-room project with shared `~/Documents/CleanRoom/<project>/implementation/`. When adding a task to an existing destination project, record the user-supplied `project_id` and `project_root`; otherwise generate a neutral `proj-` plus 8 lowercase hex project id. Project names follow the same neutrality rules as task IDs, match `[a-z0-9][a-z0-9-]{0,63}`, and are never derived from source folder names. Record both fields in `init-config.json` and the manifest `initialization_snapshot`. Use the legacy flat `~/Documents/CleanRoom/<task-id>/` layout only when the user explicitly chooses single-task compatibility.
|
|
34
36
|
- Target schema profile: `openspec-delta`, `gsd-planning-package`, `speckit-feature-folder`, or `kiro-spec-folder`.
|
|
35
|
-
- Goal contract choices from `preflight-goal.json`, including target stack, dependency/license policy, exactness policy, feature policy, code hygiene, output policy,
|
|
37
|
+
- Goal contract choices from `preflight-goal.json`, including explicit user-confirmed end goal, target stack, dependency/license policy, exactness policy, feature policy, code hygiene, output policy, controller mode, and `intent_confirmation`.
|
|
36
38
|
- Default model plus optional overrides for contaminated roles, clean roles, or individual roles. Keep model ids as runtime-specific strings.
|
|
37
39
|
- Additional user rules split into `clean_safe` and `contaminated_only`. Put anything containing source paths, private identifiers, private dependency names, or source-derived specifics into `contaminated_only`.
|
|
38
40
|
- Role hook environment values derived from the approved roots: `CLEAN_ROOM_ROLE`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, `CLEAN_ROOM_SCHEMA_DIR`, and optional hook-only denylist paths. The controller must pass these into each role session; do not require the user to set `CLEAN_ROOM_HOOK_ENFORCE` for normal safe-hook runs.
|
package/skills/preflight/SKILL.md
CHANGED
|
@@ -25,9 +25,10 @@ Record these decisions:
|
|
|
25
25
|
- Code hygiene policy: file line caps, max files per iteration, split strategy, exceptions, and forbidden patterns.
|
|
26
26
|
- Output policy: artifact base root, implementation root, assumed output directory, and write mode.
|
|
27
27
|
- Controller policy: attended or unattended, iteration cap, and whether unattended is allowed after preflight.
|
|
28
|
+
- Intent confirmation: `intent_confirmation` with explicit-user-answer sources for end goal, target stack, and controller mode, plus user-facing summaries of the goal and target stack.
|
|
28
29
|
- Open questions, with blocking questions clearly marked.
|
|
29
30
|
|
|
30
|
-
The artifact must use the canonical `preflight-goal.schema.json` shape. Required top-level keys are `goal_id`, `created_at`, `end_goal`, `target_stack`, `license_policy`, `dependency_policy`, `compatibility_policy`, `feature_policy`, `code_hygiene_policy`, `output_policy`, `controller_policy`, and `open_questions`.
|
|
31
|
+
The artifact must use the canonical `preflight-goal.schema.json` shape. Required top-level keys are `goal_id`, `created_at`, `end_goal`, `target_stack`, `license_policy`, `dependency_policy`, `compatibility_policy`, `feature_policy`, `code_hygiene_policy`, `output_policy`, `controller_policy`, and `open_questions`. Completed preflight inputs and unattended contracts also require `intent_confirmation`.
|
|
31
32
|
|
|
32
33
|
Reject non-canonical or legacy-shaped preflight artifacts instead of treating them as complete. Do not accept invented fields such as `version`, `created`, `source`, `destination`, `exactness_policy`, `output_policy.artifact_base`, `output_policy.contaminated_root`, `output_policy.clean_root`, or `output_policy.quarantine_root` as substitutes for canonical fields. Report the missing or invalid canonical fields and stop for review.
|
|
33
34
|
|
|
@@ -40,9 +41,10 @@ Unattended runs require a complete `preflight-goal.json` with:
|
|
|
40
41
|
- `controller_policy.mode: "unattended"`
|
|
41
42
|
- `controller_policy.unattended_allowed_after_preflight: true`
|
|
42
43
|
- finite `controller_policy.max_iterations`
|
|
44
|
+
- `intent_confirmation` showing the end goal, target stack, and controller mode came from explicit user answers
|
|
43
45
|
- empty `open_questions`
|
|
44
46
|
|
|
45
|
-
Do not infer target language, license, dependency policy, exactness policy, output directory, or feature add/remove policy from source code.
|
|
47
|
+
Do not infer end goal, target language, runtime, framework, package manager, test framework, license, dependency policy, exactness policy, output directory, or feature add/remove policy from source code. If the user's end goal or target stack is unknown, leave blocking `open_questions`, keep unattended disabled, and do not write runner-ready `task-manifest.json` or `clean-run-context.json`.
|
|
46
48
|
|
|
47
49
|
## CLI Helper
|
|
48
50
|
|
package/skills/resume-cr/SKILL.md
CHANGED
|
@@ -11,7 +11,7 @@ Resume an existing clean-room run from durable artifacts. Never use prior chat h
|
|
|
11
11
|
|
|
12
12
|
Use the canonical `clean-room` skill workflow and references in this plugin. Read `skills/clean-room/references/CONTROLLER-LOOP.md` when the manifest records `loop_context` or unattended mode. Preserve the same clean-room boundary, role separation, artifact schemas, leakage rules, implementation-root rules, and hook expectations.
|
|
13
13
|
|
|
14
|
-
If `task-manifest.json` records `controller_policy.mode: "unattended"` in Claude Code, prefer launching `clean-room-skill run --task-manifest <path> --agent-runtime claude` and
|
|
14
|
+
If `task-manifest.json` records `controller_policy.mode: "unattended"` in Claude Code, prefer launching `clean-room-skill run --task-manifest <path> --agent-runtime claude` only when `loop_context` exists and names approved pending or gap units. If an unattended manifest lacks `loop_context`, treat it as incomplete outer-loop state: finish decomposition or selected-slice approval first, or stop with the missing outer-loop fields instead of launching the runner. If `clean-room-skill` is not on `PATH`, immediately use `npx clean-room-skill@latest run --task-manifest <path> --agent-runtime claude` instead of searching for the installed package. Do not search plugin cache paths for schema files, and do not pass `--schema-dir /dev/null`. The runner uses bundled schemas by default; pass `--schema-dir` only when the user provides a real schema directory. The main conversation must not perform Agent 1, Agent 2, Agent 3, or Agent 4 work once runner-ready unattended state exists. Do not ask to continue while unattended policy, iteration budget, and approved pending or gap units still permit progress. If the runner or Claude role-agent dispatch is unavailable, stop with `BLOCKERS: Claude role-agent dispatch unavailable` rather than silently continuing in the main chat.
|
|
15
15
|
|
|
16
16
|
## Load Order
|
|
17
17
|
|
package/skills/unattended/SKILL.md
CHANGED
|
@@ -15,11 +15,11 @@ Use the canonical `clean-room` skill workflow and references in this plugin. Rea
|
|
|
15
15
|
|
|
16
16
|
Before asking setup or preflight questions, use the canonical `clean-room` "Run State Discovery Before Wizard" rules. Resolve explicit artifact paths first, then configured clean-room roots, then bounded `~/Documents/CleanRoom/task-*` (legacy) and `~/Documents/CleanRoom/*/tasks/task-*` (project layout) candidates. If a valid `task-manifest.json` exists, route to `resume-cr`. If a valid canonical `preflight-goal.json` exists without a manifest, continue at source/destination discovery and manifest creation. If a preflight artifact exists but is invalid, stop with schema errors instead of restarting preflight. If multiple candidates are found without an explicit path, list them and stop for selection.
|
|
17
17
|
|
|
18
|
-
When resuming a valid unattended `task-manifest.json` in Claude Code, prefer launching the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude
|
|
18
|
+
When resuming a valid unattended `task-manifest.json` in Claude Code, prefer launching the durable runner with `clean-room-skill run --task-manifest <path> --agent-runtime claude` only after the manifest has `loop_context` with an approved pending or gap unit. If an unattended manifest lacks `loop_context`, treat it as incomplete outer-loop state: finish decomposition or selected-slice approval first, or stop with the missing outer-loop fields instead of launching the runner. If `clean-room-skill` is not on `PATH`, immediately use `npx clean-room-skill@latest run --task-manifest <path> --agent-runtime claude` instead of searching for the installed package. Do not search plugin cache paths for schema files, and do not pass `--schema-dir /dev/null`. The runner uses bundled schemas by default; pass `--schema-dir` only when the user provides a real schema directory. The main conversation must not perform Agent 1, Agent 2, Agent 3, or Agent 4 work once runner-ready unattended state exists. Do not ask to continue while `controller_policy.mode` is `unattended`, the iteration budget remains, and approved pending or gap units remain. If Claude role-agent dispatch or the runner is unavailable, stop with `BLOCKERS: Claude role-agent dispatch unavailable` instead of falling back to main-chat execution.
|
|
19
19
|
|
|
20
|
-
Load or create `preflight-goal.json` first. Unattended mode requires a complete goal contract with no blocking or non-blocking `open_questions`, `controller_policy.unattended_allowed_after_preflight: true`,
|
|
20
|
+
Load or create `preflight-goal.json` first. Unattended mode requires a complete goal contract with no blocking or non-blocking `open_questions`, `controller_policy.unattended_allowed_after_preflight: true`, finite `controller_policy.max_iterations`, and `intent_confirmation` showing the end goal, target stack, and controller mode came from explicit user answers.
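The unattended preconditions listed above could be checked roughly as follows. This is a sketch under stated assumptions, not the package's validator: it treats `open_questions` as a flat list, reads "finite" as a positive integer, and only checks that `intent_confirmation` is present, because the diff does not show that field's structure. The function name and the blocker messages are hypothetical.

```python
from typing import Any, List, Mapping


def unattended_blockers(goal: Mapping[str, Any]) -> List[str]:
    """Return reasons why a parsed preflight-goal.json cannot run unattended."""
    blockers: List[str] = []

    # Assumption: open_questions is a list; any entry, blocking or not, blocks.
    if goal.get("open_questions"):
        blockers.append("open_questions is not empty")

    policy = goal.get("controller_policy") or {}
    if policy.get("unattended_allowed_after_preflight") is not True:
        blockers.append("controller_policy.unattended_allowed_after_preflight is not true")

    # Assumption: "finite" means a positive integer (bool is excluded explicitly).
    max_iterations = policy.get("max_iterations")
    if (
        isinstance(max_iterations, bool)
        or not isinstance(max_iterations, int)
        or max_iterations < 1
    ):
        blockers.append("controller_policy.max_iterations is not a positive integer")

    # Presence only; the contents of intent_confirmation are not inspected here.
    if not goal.get("intent_confirmation"):
        blockers.append("intent_confirmation is missing")

    return blockers
```

An empty result means the checks in this sketch passed; it does not replace schema validation against `preflight-goal.schema.json`.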
|
|
21
21
|
|
|
22
|
-
Do not assume target language, license policy, dependency policy, exactness policy, output directory, or feature add/remove policy during the unattended loop.
|
|
22
|
+
Do not assume end goal, target language, runtime, framework, package manager, test framework, license policy, dependency policy, exactness policy, output directory, or feature add/remove policy during the unattended loop. Source language and build tooling are not destination choices. If the user's end goal or target stack is unknown, leave blocking `open_questions`, keep unattended disabled, and stop on ambiguity instead of inventing product decisions.
|
|
23
23
|
|
|
24
24
|
Gather only required setup facts:
|
|
25
25
|
|
|
@@ -27,7 +27,7 @@ Gather only required setup facts:
|
|
|
27
27
|
- Artifact base root, defaulting the task root to `~/Documents/CleanRoom/<project>/tasks/<task-id>/`. If the user does not provide an explicitly approved neutral task ID, generate one as `task-` plus 8 lowercase hex characters. Do not derive task IDs or output directory names from source folder names.
|
|
28
28
|
- Project grouping, following the canonical `clean-room` project layout rules: `<base>/<project>/tasks/<task-id>/` with one shared `<base>/<project>/implementation/` root, a neutral project name (`proj-` plus 8 lowercase hex unless the user supplies an approved neutral name, matching `[a-z0-9][a-z0-9-]{0,63}`, never source-derived), and at most one active task per project. Use legacy flat `<base>/<task-id>/` roots only when the user explicitly chooses single-task compatibility.
|
|
29
29
|
- Source roots, contaminated artifact root, clean artifact root, clean implementation root, quarantine root, and optional public or destination reference roots.
|
|
30
|
-
-
|
|
30
|
+
- Explicit user-confirmed end goal, target stack, destination constraints, dependency/license policy, exactness policy, feature policy, code hygiene policy, and output policy from `preflight-goal.json`.
|
|
31
31
|
- Target schema profile: `openspec-delta`, `gsd-planning-package`, `speckit-feature-folder`, or `kiro-spec-folder`.
|
|
32
32
|
- Default model plus optional clean, contaminated, or per-role overrides.
|
|
33
33
|
- Finite maximum iteration count for the inner clean-room loop from `preflight-goal.json`.
|
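The naming rules in the setup list above (`task-` or `proj-` plus 8 lowercase hex characters, project names matching `[a-z0-9][a-z0-9-]{0,63}`, and the `<base>/<project>/tasks/<task-id>/` layout) can be sketched as below. The helper names are hypothetical, and using `secrets.token_hex` as the source of the hex characters is an assumption; the text does not say how the package generates IDs.

```python
import re
import secrets
from pathlib import Path

# Pattern quoted from the project-name rule in the setup list.
NEUTRAL_NAME = re.compile(r"[a-z0-9][a-z0-9-]{0,63}")


def new_task_id() -> str:
    """'task-' plus 8 lowercase hex characters (4 random bytes)."""
    return "task-" + secrets.token_hex(4)


def new_project_name() -> str:
    """'proj-' plus 8 lowercase hex characters (4 random bytes)."""
    return "proj-" + secrets.token_hex(4)


def is_neutral_name(name: str) -> bool:
    """Check the character pattern only; it cannot detect source-derived names."""
    return NEUTRAL_NAME.fullmatch(name) is not None


def task_root(base: Path, project: str, task_id: str) -> Path:
    """Project layout: <base>/<project>/tasks/<task-id>/."""
    return base / project / "tasks" / task_id
```

Generating IDs from random bytes keeps them independent of source folder names, which is the property the list asks for; a user-supplied name would still need a human check that it is not derived from the source.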