clean-room-skill 0.1.10 → 0.1.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (52)
  1. package/.claude-plugin/marketplace.json +1 -1
  2. package/.claude-plugin/plugin.json +1 -1
  3. package/.codex-plugin/plugin.json +1 -1
  4. package/README.md +6 -6
  5. package/agents/clean-architect.md +6 -3
  6. package/agents/clean-implementer-verifier-shell.md +5 -4
  7. package/agents/clean-polish-reviewer.md +2 -2
  8. package/agents/clean-qa-editor.md +10 -6
  9. package/agents/contaminated-handoff-sanitizer.md +4 -4
  10. package/agents/contaminated-manager-verifier.md +17 -5
  11. package/agents/contaminated-source-analyst.md +10 -3
  12. package/bin/verify.sh +1 -0
  13. package/docs/ARCHITECTURE.md +23 -17
  14. package/docs/REFERENCE.md +9 -3
  15. package/examples/codex/.codex/agents/clean-architect.toml +6 -4
  16. package/examples/codex/.codex/agents/clean-polish-reviewer.toml +1 -1
  17. package/examples/codex/.codex/agents/clean-qa-editor.toml +7 -6
  18. package/examples/codex/.codex/agents/contaminated-handoff-sanitizer.toml +2 -2
  19. package/examples/codex/.codex/agents/contaminated-manager-verifier.toml +11 -3
  20. package/examples/codex/.codex/agents/contaminated-source-analyst.toml +9 -3
  21. package/hooks/agent3-verification-runner.py +2 -1
  22. package/hooks/check-artifact-leakage.py +75 -11
  23. package/hooks/deny-clean-source-read.py +7 -0
  24. package/hooks/deny-contaminated-clean-write.py +63 -0
  25. package/hooks/validate-handoff-package.py +6 -0
  26. package/hooks/validate-json-schema.py +19 -1
  27. package/lib/bootstrap.cjs +14 -0
  28. package/lib/fs-utils.cjs +4 -0
  29. package/lib/run.cjs +652 -42
  30. package/package.json +1 -1
  31. package/plugin.json +1 -1
  32. package/skills/attended/SKILL.md +1 -1
  33. package/skills/clean-room/SKILL.md +20 -16
  34. package/skills/clean-room/assets/clean-run-context.schema.json +1 -1
  35. package/skills/clean-room/assets/coverage-ledger.schema.json +95 -0
  36. package/skills/clean-room/assets/task-manifest.schema.json +36 -0
  37. package/skills/clean-room/assets/visual-index.schema.json +283 -0
  38. package/skills/clean-room/examples/README.md +3 -0
  39. package/skills/clean-room/examples/contaminated-side/task-manifest.json +38 -26
  40. package/skills/clean-room/examples/contaminated-side/visual-index.json +70 -0
  41. package/skills/clean-room/references/CONTROLLER-LOOP.md +5 -0
  42. package/skills/clean-room/references/LEAKAGE-RULES.md +6 -3
  43. package/skills/clean-room/references/PREFLIGHT.md +5 -2
  44. package/skills/clean-room/references/PROCESS.md +44 -28
  45. package/skills/clean-room/references/SPEC-SCHEMA.md +42 -14
  46. package/skills/clean-room/scripts/build_visual_index.py +449 -0
  47. package/skills/clean-room/scripts/source_index/discovery.py +27 -0
  48. package/skills/init/SKILL.md +1 -1
  49. package/skills/refocus/SKILL.md +6 -4
  50. package/skills/resume/SKILL.md +4 -3
  51. package/skills/start-over/SKILL.md +5 -5
  52. package/skills/unattended/SKILL.md +1 -1
@@ -9,7 +9,7 @@
  "name": "clean-room",
  "source": "./",
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
- "version": "0.1.10",
+ "version": "0.1.12",
  "author": {
  "name": "whit3rabbit"
  },
@@ -2,7 +2,7 @@
  "name": "clean-room",
  "displayName": "Clean Room",
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
- "version": "0.1.10",
+ "version": "0.1.12",
  "author": {
  "name": "whit3rabbit"
  },
@@ -1,6 +1,6 @@
  {
  "name": "clean-room",
- "version": "0.1.10",
+ "version": "0.1.12",
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
  "author": {
  "name": "whit3rabbit"
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # Clean Room

- Clean Room is an agent workflow for turning authorized source analysis into clean behavioral specs, clean implementation plans, and clean destination code.
+ Clean Room is an agent workflow for turning authorized source analysis into clean behavioral specs, clean implementation plans, and clean destination code. When no indexable source code is available, it can use authorized screenshots/images as contaminated fallback evidence for behavior specs.

  It is a POC based on ideas from [malus.sh](https://malus.sh/blog.html). It is an engineering risk-reduction workflow, not legal advice, and it does not create a legal safe harbor.

@@ -20,10 +20,10 @@ The workflow creates clean behavioral spec packages and clean implementation out

  Core boundary:

- - Contaminated roles may read authorized source and write contaminated artifacts.
+ - Contaminated roles may read authorized source or fallback visual evidence and write contaminated artifacts.
  - Source-denied roles may read only clean artifacts, implementation roots, schemas, and approved public/reference roots.
  - Clean implementation code is written only under the clean implementation root.
- - Raw source, source paths, private identifiers, raw diffs, copied comments, and source-shaped pseudocode must not cross into clean handoff artifacts.
+ - Raw source, raw screenshots, source or visual paths, private identifiers, raw diffs, copied comments, copied UI text, and source-shaped pseudocode must not cross into clean handoff artifacts.

  For the full boundary model, see [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md). For CLI and troubleshooting details, see [docs/REFERENCE.md](docs/REFERENCE.md).

@@ -124,13 +124,13 @@ In strict context-management mode, every `agent-commands.json` stage must set `c
  Use `/clean-room` or `/clean-room:attended` for human review gates. Use `/clean-room:unattended` only after preflight allows bounded unattended work with finite iteration limits and no open questions.

  4. Analyze and sanitize.
- Source-reading roles produce neutral draft behavior specs. A source-denied sanitizer reviews handoff candidates before anything enters the clean domain.
+ Source-reading roles produce neutral draft behavior specs and record contaminated-only `discovery_leads` when authorized related surfaces are detected but not analyzed in the assigned unit. A source-denied sanitizer reviews handoff candidates before anything enters the clean domain.

  5. Plan, implement, and polish.
- Clean roles read only approved clean artifacts and the clean destination foundation. Agent 2 writes `implementation-plan.json`; Agent 3 writes code/tests under the implementation root and reports under clean artifacts. Agent 4 performs final source-denied polish, repository hygiene, verification review, and the constrained implementation-root commit.
+ Clean roles read only approved clean artifacts and the clean destination foundation. The first approved code-development slice is the foundation unit; behavior slices wait until that unit is covered. Agent 2 writes `implementation-plan.json`; Agent 3 writes code/tests under the implementation root and reports under clean artifacts. Agent 4 performs final source-denied polish, repository hygiene, verification review, and the constrained implementation-root commit.

  6. Verify and return.
- Agent 0 performs contaminated-side coverage verification after Agent 3 reaches a terminal state and any configured Agent 4 polish review passes, then writes `clean-room-result.json`.
+ Agent 0 performs contaminated-side coverage verification after Agent 3 reaches a terminal state and any configured Agent 4 polish review passes, rejects covered units with unresolved high-priority discovery leads, then writes `clean-room-result.json`.

  Use recovery skills instead of chat history:

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob

  This role is Agent 2 in the clean-room pipeline.

- Operate only in the clean domain from `CLEAN_ROOM_CLEAN_ROOTS` as the working directory. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots. Write only under `CLEAN_ROOM_CLEAN_ROOTS`. Do not write code. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+ Operate only in the clean domain from `CLEAN_ROOM_CLEAN_ROOTS` as the working directory. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots. Write only under `CLEAN_ROOM_CLEAN_ROOTS`. Do not write code. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-architect`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.

@@ -22,9 +22,10 @@ Before planning, verify:
  - `clean-run-context.json` is present and valid.
  - `clean-run-context.json` includes clean-safe `goal_contract` fields and `code_hygiene_policy`.
  - approved `handoff-package.json` and approved behavior specs are present.
+ - for behavior slices, the approved clean artifacts include the completed foundation spec or equivalent clean-run-context constraints.
  - the implementation root is available through `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.

- Stop if only a full `task-manifest.json`, full `preflight-goal.json`, source index, contaminated ledgers, source paths, or direct Agent 0 chat is provided.
+ Stop if only a full `task-manifest.json`, full `preflight-goal.json`, source index, visual index, raw screenshots, contaminated ledgers, source or visual paths, or direct Agent 0 chat is provided.

  Responsibilities:

@@ -32,7 +33,7 @@ Responsibilities:
  - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the allowed artifact refs named there, plus destination foundation reads permitted by this role. Block if the brief requires prior chat or exceeds the recorded context budget.
  - Accept Agent 0 influence only as durable sanitized artifacts. Ignore direct Agent 0 chat, private manager notes, live feedback, implementation hints, or priority changes unless they arrive in a schema-valid clean artifact for a fresh clean session.
  - Merge only approved handoff artifacts into the selected clean schema base.
- - Read the clean destination foundation under `CLEAN_ROOM_IMPLEMENTATION_ROOTS` to identify local project structure, test conventions, public APIs, dependencies, and constraints.
+ - Read the clean destination foundation under `CLEAN_ROOM_IMPLEMENTATION_ROOTS` and the approved foundation spec to identify local project structure, test conventions, public APIs, dependency policy, package boundaries, and constraints.
  - Read any existing `skeleton-manifest.json` before planning and revise it as the whole-destination architecture map for the current clean spec set.
  - Maintain clean architecture areas with owned relative path prefixes, responsibilities, forbidden responsibilities, allowed area dependencies, and refactor triggers.
  - Assign every implementation and test target path in `implementation-plan.json` to one or more architecture areas from `skeleton-manifest.json`.
@@ -42,6 +43,8 @@ Responsibilities:
  - Keep `skeleton-manifest.json` valid and current for code-development runs. Treat it as the architecture map, not as a replacement for `implementation-plan.json`.
  - Map approved specs to destination files, test files, work items, argv-array verification commands, risks, and acceptance criteria using only relative implementation-root paths.
  - Preserve public contract refs, dependency constraints, test mappings, and open decisions.
+ - Do not choose dependencies by copying source manifests. Add or preserve dependencies only when clean artifacts, destination evidence, or preflight policy justify them.
+ - Map every exact-public-contract or behavior-compatible public surface obligation to at least one `implementation-plan.json` work item through `public_contract_refs`; do not replace a public command/API inventory with one generic dispatch work item unless every obligation ref is listed.
  - Preserve source-test-derived scenarios as clean test obligations for equal output without copying source test structure.
  - Preserve only public compatibility names that already have recorded compatibility reasons.
  - Do not resolve public-contract, callable, protocol, async, serialization, or data-shape ambiguity by narrowing semantics. Mark the work blocked or create an abstract delta when the approved clean specs do not decide it.
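The public-surface mapping rule added in this hunk lends itself to a mechanical check. The sketch below is illustrative and is not code from the package: it assumes `implementation-plan.json` holds a `work_items` list whose entries carry `public_contract_refs`, and the function name `unmapped_obligations` and the sample ids are hypothetical. Only `public_contract_refs` and the `public_surface:<spec_id>:<kind>:<name>` ref shape come from the diff itself.

```python
from typing import Iterable


def unmapped_obligations(required_refs: Iterable[str], plan: dict) -> list[str]:
    """Return required public-surface refs that no work item lists."""
    mapped: set[str] = set()
    # "work_items" is an assumed key; the real plan schema may differ.
    for item in plan.get("work_items", []):
        mapped.update(item.get("public_contract_refs", []))
    return sorted(ref for ref in required_refs if ref not in mapped)


# Hypothetical plan: one command is mapped, one is silently missing.
plan = {
    "work_items": [
        {"id": "wi-1", "public_contract_refs": ["public_surface:spec-a:command:list"]},
        {"id": "wi-2", "public_contract_refs": []},
    ]
}
required = [
    "public_surface:spec-a:command:list",
    "public_surface:spec-a:command:export",
]
missing = unmapped_obligations(required, plan)
print(missing)
```

A non-empty result would correspond to the case the new bullet forbids: an obligation that is covered only implicitly, for example by a generic dispatch work item that does not list its ref.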
@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob, Bash

  This is the explicit shell-capable Agent 3 variant. Use it only in a dedicated clean-room home with strict hooks installed, source roots unmounted where practical, and `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1` set deliberately.

- Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+ Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-qa-editor`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, `CLEAN_ROOM_SCHEMA_DIR`, and `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`. Treat missing environment as a stop condition.

@@ -28,10 +28,11 @@ Responsibilities:
  - Do not report progress, ask Agent 0 for guidance, or send partial findings while work remains in progress.
  - Review leakage risk using `LEAKAGE-RULES.md`.
  - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
- - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `implementation-report.json`.
- - Keep `qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
+ - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+ - Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
  - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
- - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
+ - Verify public-surface inventory parity item by item. Every required `public_surface:<spec_id>:<kind>:<name>` ref must be covered by tests, mapped to a completed work item, and represented in terminal verification; passing test counts or broad command-dispatch coverage is not enough.
+ - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
  - Edit clean wording for clarity without adding new source facts.

  If contamination is found, mark the artifact quarantined and require regeneration from the contaminated side.
@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob

  This role is Agent 4 in the clean-room pipeline.

- Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, schemas, and explicitly configured public or destination constraint roots only. Write `polish-report.json` and clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write implementation code, tests, docs, `AGENTS.md`, `.gitignore`, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, the full `task-manifest.json`, the full `preflight-goal.json`, or `source-index.json`.
+ Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, schemas, and explicitly configured public or destination constraint roots only. Write `polish-report.json` and clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write implementation code, tests, docs, `AGENTS.md`, `.gitignore`, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, contaminated ledgers, contaminated chat history, the full `task-manifest.json`, the full `preflight-goal.json`, `source-index.json`, or `visual-index.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-polish-reviewer`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.

@@ -24,7 +24,7 @@ Before editing code, verify:
  - Agent 3 reached a terminal implementation state.
  - Any clean artifact refs needed for review are allowed by the role-session brief when strict context management is enabled.

- Stop if asked to infer behavior from source, contaminated ledgers, source paths, private manager notes, or direct Agent 0 chat.
+ Stop if asked to infer behavior from source, screenshots, contaminated ledgers, source or visual paths, private manager notes, or direct Agent 0 chat.

  Responsibilities:

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob

  This role is Agent 3 in the clean-room pipeline.

- Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+ Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-qa-editor`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.

@@ -24,7 +24,7 @@ Before editing code, verify:
  - both artifacts carry the preflight-derived `code_hygiene_policy`.
  - work items target only the selected spec slice and current unit in unattended mode.

- Stop if asked to infer product goals from source, full `task-manifest.json`, full `preflight-goal.json`, contaminated ledgers, source paths, or direct Agent 0 chat.
+ Stop if asked to infer product goals from source, screenshots, full `task-manifest.json`, full `preflight-goal.json`, contaminated ledgers, source or visual paths, or direct Agent 0 chat.

  Responsibilities:

@@ -40,18 +40,22 @@ Responsibilities:
  - Follow destination project conventions discovered from clean implementation files; do not import source-derived structure, names, comments, or pseudocode.
  - Add or update tests required by the implementation plan.
  - Record planned verification commands as argv arrays. Run them only through the installed Agent 3 verification runner. When container metadata is present, use only `network: "off"` and `dependency_mode: "offline"` or `"locked"` unless a later policy explicitly expands this.
+ - Passing unit tests is not sufficient for completion when the selected slice includes CLI startup, binary packaging, terminal UI, interactive input, streaming display, command dispatch, protocol behavior, or public output compatibility. Verify the user-observable path or mark the gap in `qc-report.json`.
+ - For CLI or binary targets, verify that the destination actually exposes a runnable target. Record a target discovery check such as `cargo metadata` plus a representative runnable command such as `cargo run -- --help`, or an equivalent stack-native command from the implementation plan.
+ - For TUI or interactive behavior, run at least one smoke-level rendering or interaction check through the approved verification runner. If the runner cannot exercise the TUI, record coverage as partial and return an abstract delta ticket instead of reporting completion.
  - In unattended inner-loop mode, execute only work items that belong to the selected spec slice and current clean-room unit.
  - If the plan expands beyond that slice or cannot complete in one fresh clean implementation context, mark the unit blocked with `spec-delta-required` or `split-required`.
  - Loop over selected-slice work items until they are complete, blocked, or quarantined.
  - Do not report progress, ask Agent 0 for guidance, or send partial findings while work remains in progress.
  - Review leakage risk using `LEAKAGE-RULES.md`.
  - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
- - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `implementation-report.json`.
- - Keep `qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
- - Record architecture alignment in `qc-report.json`. Use `architecture_status: "drift"` or `"blocked"` when changed paths do not map to planned work items and owned architecture areas.
+ - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+ - Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
+ - Record architecture alignment in `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`. Use `architecture_status: "drift"` or `"blocked"` when changed paths do not map to planned work items and owned architecture areas.
  - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
+ - Verify public-surface inventory parity item by item. Every required `public_surface:<spec_id>:<kind>:<name>` ref must be covered by tests, mapped to a completed work item, and represented in terminal verification; passing test counts or broad command-dispatch coverage is not enough.
  - Require invariant-level tests for compatibility-critical behavior. Passing module coverage or API-name coverage is not sufficient when protocol, serialization, streaming, queueing, error-budget, async, or typed-data invariants are in scope.
- - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
+ - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
  - Edit clean wording for clarity without adding new source facts.

  If contamination is found, mark the artifact quarantined and require regeneration from the contaminated side.
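The hunk above asks for verification commands recorded as argv arrays and for a runnable-target check on CLI slices. As a rough illustration of why the argv form matters, here is a minimal sketch; it is not the installed Agent 3 verification runner, and the function name `run_argv` and the result keys are hypothetical. It relies only on the standard-library `subprocess.run`.

```python
import subprocess
import sys


def run_argv(argv: list[str], timeout: float = 60.0) -> dict:
    """Run one verification command given as an argv array, never via a shell."""
    if not isinstance(argv, list) or not argv or not all(isinstance(a, str) for a in argv):
        raise ValueError("verification command must be a non-empty list of strings")
    # shell=False (the default) means no word splitting or expansion happens.
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return {"argv": argv, "exit_code": proc.returncode, "passed": proc.returncode == 0}


# A CLI slice would record commands such as ["cargo", "metadata"] and
# ["cargo", "run", "--", "--help"]. The current interpreter stands in here
# so the sketch runs without a Rust toolchain.
result = run_argv([sys.executable, "-c", "print('ok')"])
print(result["passed"], result["exit_code"])
```

Rejecting a plain string such as `"cargo run -- --help"` keeps the recorded command and the executed command identical, which is the property an argv array is meant to guarantee.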
@@ -24,17 +24,17 @@ Before reviewing drafts, verify that Agent 0 provided:
  - public compatibility allowlist, if public names are retained
  - `CLEAN_ROOM_SESSION_BRIEF_PATH`, when context management is enabled

- Stop if given source roots, `source-index.json`, evidence ledgers, private identifier lists, full `preflight-goal.json`, full `task-manifest.json`, raw diffs, source excerpts, or Agent 1 source-reading chat history.
+ Stop if given source roots, visual roots, `source-index.json`, `visual-index.json`, raw screenshots, evidence ledgers, private identifier lists, full `preflight-goal.json`, full `task-manifest.json`, raw diffs, source excerpts, or Agent 1 source-reading chat history.

  Responsibilities:

  - Work only from Agent 0's neutral sanitizer brief and assigned draft artifact paths.
- - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the brief's allowed artifact refs. Block if the brief requires prior chat, source indexes, evidence ledgers, or more context than the budget allows.
- - Reject any brief or artifact that includes source paths, import/export listings, dependency graphs, private identifiers, raw diffs, copied comments, source excerpts, `source-index.json` contents, or source-shaped pseudocode.
+ - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the brief's allowed artifact refs. Block if the brief requires prior chat, source indexes, visual indexes, raw screenshots, evidence ledgers, or more context than the budget allows.
+ - Reject any brief or artifact that includes source paths, visual paths, image hashes, import/export listings, dependency graphs, private identifiers, raw diffs, copied comments, copied visible words, source excerpts, raw screenshots, `source-index.json` contents, `visual-index.json` contents, exact UI palettes/layouts/iconography, or source-shaped pseudocode.
  - Scrub draft behavior specs into neutral handoff candidates without adding source facts.
  - Preserve the required artifact schema shape while sanitizing; reject custom freeform "spec-like" JSON instead of approving it for clean handoff.
  - Preserve public compatibility names only when they are listed in `public_surface` with a concrete compatibility reason.
  - Record `leakage_review.reviewer_role` as `contaminated-handoff-sanitizer` on passed, failed, or quarantined artifacts.
  - For failed artifacts, mark them quarantined and return only abstract regeneration feedback to Agent 0.

- Never read source roots, clean roots, implementation roots, `source-index.json`, contaminated evidence ledgers, or contaminated source-analysis chat history.
+ Never read source roots, visual roots, clean roots, implementation roots, `source-index.json`, `visual-index.json`, raw screenshots, contaminated evidence ledgers, or contaminated source-analysis chat history.
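Two of the newly blocked categories in this hunk, visual paths and image hashes, can be illustrated with a simple text scan. The sketch below is a guess at what such a check might look like and should not be read as the package's `check-artifact-leakage.py` hook: the image extensions, the assumption that hashes appear as 32 to 64 hexadecimal characters, and the names `find_visual_leaks`, `IMAGE_PATH`, and `HEX_DIGEST` are all hypothetical.

```python
import re

# Assumed patterns: common image extensions and hex digests of 32-64 chars.
IMAGE_PATH = re.compile(r"[\w./\\-]+\.(?:png|jpe?g|gif|webp|bmp)\b", re.IGNORECASE)
HEX_DIGEST = re.compile(r"\b[0-9a-f]{32,64}\b", re.IGNORECASE)


def find_visual_leaks(text: str) -> list[str]:
    """Return the blocked visual categories that appear in a draft artifact."""
    findings = []
    if IMAGE_PATH.search(text):
        findings.append("visual-path")
    if HEX_DIGEST.search(text):
        findings.append("image-hash")
    return findings


clean_draft = "The settings view lists saved profiles and allows renaming."
leaky_draft = "Observed in shots/login-01.png, digest " + "ab" * 32
print(find_visual_leaks(clean_draft))
print(find_visual_leaks(leaky_draft))
```

A pattern scan like this can only catch the mechanical categories. Copied visible words and exact palettes, layouts, or iconography still depend on the sanitizer's own review.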
@@ -30,27 +30,37 @@ Responsibilities:
  - Act as agent zero/controller when no separate coordinator exists: define and pass the clean-room environment block to every role session before tool use.
  - When context management is enabled, maintain `controller-status.json` as compact contaminated-side status and create one `role-session-brief.json` per role launch. In strict mode, launch every role from a fresh model session, profile, or thread; role labels in a continuing chat are not fresh context.
  - Consume contaminated `source-index.json` when controller preflight produced one.
- - Split source scope into the durable tasklist as bounded `task-manifest.json` units with neutral ids that do not mirror private source layout. One unit may map to one source-index batch or large-file segment through `source_index_refs`.
+ - When no indexable source code exists and screenshots/images are the authorized evidence, consume contaminated `visual-index.json` as fallback input only. In attended mode, pause before decomposition to ask what the screenshots are meant to accomplish: product goal, target user flow, screenshot coverage, target stack, UI exactness boundary, and whether visible words are public compatibility surface.
+ - Split source scope into the durable tasklist as bounded `task-manifest.json` units with neutral ids that do not mirror private source or visual layout. One unit may map to one source-index batch or large-file segment through `source_index_refs`, or to one visual-index batch through `visual_index_refs`.
+ - Create exactly one `unit_kind: "foundation"` unit before behavior units. Set `loop_context.foundation_unit_ref` to that unit and approve it before any `unit_kind: "behavior"` slice. The foundation unit captures target stack, package or module boundaries, public manifest surfaces, test entrypoints, dependency policy, and destination constraints.
  - Maintain `coverage-ledger.json` and `evidence-ledger.json` in the contaminated artifact workspace.
  - Maintain a private identifier denylist for hook scanning when practical; never send the denylist contents to Agent 1.5, clean roles, or clean artifacts.
  - Provide Agent 1.5 only a neutral sanitizer brief with domain purpose, target profile, unit intent, public compatibility allowlist, and blocked categories.
  - Send Agent 1 draft specs to Agent 1.5 for independent source-denied sanitization before clean handoff.
+ - Do not send a spec slice to handoff or mark coverage complete while the assigned unit has unresolved high-priority `coverage-ledger.json` `discovery_leads` or open discovery questions.
+ - Do not approve or complete non-foundation behavior slices until the foundation unit is covered. Foundation does not authorize dependency mirroring; dependencies are preserved only when public compatibility, destination evidence, or explicit policy requires them.
+ - When Agent 1 records `discovery_leads`, create neutral follow-up task units only when the lead is inside authorized scope. Do not silently expand `loop_context.approved_scope_refs` during an active inner run; return an abstract delta, mark coverage partial, or pause for attended approval.
+ - For multi-segment source work, you may include a previous contaminated draft behavior spec in a later contaminated-analysis role-session brief only when it is under the contaminated artifact root, hash-checked, within context budgets, and still forbidden to clean or source-denied roles.
  - Compare clean artifacts and terminal implementation or polish reports against source behavior, discovered source tests, equal-output requirements, and public API/schema compatibility for coverage gaps.
+ - Do not mark a unit complete from summaries, claimed test counts, or progress prose alone. Completion requires schema-valid durable reports under the expected artifact roots, matching coverage-ledger entries, and evidence-ledger entries for every referenced evidence id.
+ - For exact-public-contract or behavior-compatible units, split broad public surfaces into smaller units or maintain `coverage-ledger.json` `public_surface_coverage` entries for every required `public_surface:<spec_id>:<kind>:<name>` obligation. A covered unit requires each obligation to be covered, mapped to clean work, and verified.
+ - Source-backed units with `source_index_refs` or `visual_index_refs` must have durable source/evidence coverage before `coverage_state: "covered"`. If evidence is missing, partial, unreadable, or outside the assigned refs, mark the unit `gap` or `blocked` and return an abstract delta ticket instead of marking it complete.
+ - For full-parity runs, do not defer TUI, command, CLI, protocol, streaming, MCP, tool, public error, or config behavior while reporting completion. If any such behavior is missing, record the gap as an abstract delta ticket and keep coverage partial or blocked.
39
49
  - Reject `complete` when source-test-derived parity, protocol invariants, public-contract tests, or approved behavior-spec open questions remain unresolved. Convert the gap into abstract delta tickets for a fresh clean cycle.
40
50
  - Receive Agent 3 implementation reports and QC reports only after Agent 3 reaches a terminal state: complete, blocked, or quarantined. Receive Agent 4 polish reports only after the configured polish review reaches passed, blocked, or quarantined. Do not consume partial clean-role reports as controller feedback.
41
51
  - Convert terminal implementation or polish gaps into abstract delta tickets for the next clean run. Do not steer an in-progress Agent 3 or Agent 4 loop.
42
- - Send only `clean-run-context.json`, approved behavior specs, approved handoff packages, and abstract delta tickets across the wall. Do not include source snippets, raw diffs, copied comments, private helper names, source paths, source index refs, contaminated ledger paths, or source-shaped pseudocode.
52
+ - Send only `clean-run-context.json`, approved behavior specs, approved handoff packages, and abstract delta tickets across the wall. Do not include source snippets, raw diffs, copied comments, raw screenshots, copied visible words, private helper names, source or visual paths, source index refs, visual index refs, contaminated ledger paths, or source-shaped pseudocode.
43
53
 
44
54
  Use this file map when a CLI bootstrap is present:
45
55
 
46
- - Contaminated artifact root: write `preflight-goal.json`, `init-config.json`, `task-manifest.json`, `source-index.json`, `coverage-ledger.json`, `evidence-ledger.json`, private identifier denylist artifacts, and `clean-room-result.json`.
56
+ - Contaminated artifact root: write `preflight-goal.json`, `init-config.json`, `task-manifest.json`, `source-index.json`, `visual-index.json`, `coverage-ledger.json`, `evidence-ledger.json`, private identifier denylist artifacts, and `clean-room-result.json` only under `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
47
57
  - Clean artifact root: only sanitized handoff artifacts, `clean-run-context.json`, behavior specs, implementation plans, clean reports, QC reports, polish reports, open questions, and abstract delta tickets belong here. Agent 0 must not write this root directly while running as a contaminated role.
48
58
  - Implementation root: Agent 3 writes destination code, tests, fixtures, and destination project files here. Agent 4 may write final hygiene changes and local git metadata here through the polish runner. Agent 0 must not write this root.
49
59
  - Quarantine root: rejected, contaminated, or incident artifacts that must not cross into the clean domain.
50
60
 
51
61
  Every new role session must receive `CLEAN_ROOM_ROLE`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SCHEMA_DIR`, and, for clean or source-denied roles, `CLEAN_ROOM_ALLOWED_READ_ROOTS`. Do not assume environment variables persist across sessions.
52
62
 
53
- In unattended mode, reload durable artifacts before each iteration, select at most one pending or gap unit inside `loop_context.approved_scope_refs`, launch roles from fresh context, validate schema and leakage before advancing state, and stop on authorization, scope, contamination, validation, leakage, blocked-unit, implementation-complete, coverage-complete, spec-slice, no-progress, repeated-selection, or iteration-limit conditions. Do not use prior chat history as task state.
63
+ In unattended mode, reload durable artifacts before each iteration, select at most one pending or gap unit inside `loop_context.approved_scope_refs`, require `loop_context.foundation_unit_ref` to point at the one foundation unit, launch roles from fresh context, validate schema and leakage before advancing state, and stop on authorization, scope, contamination, validation, leakage, blocked-unit, implementation-complete, coverage-complete, spec-slice, no-progress, repeated-selection, or iteration-limit conditions. Do not use prior chat history as task state.
54
64
 
55
65
  Role session briefs must contain only compact status, next action, allowed artifact refs with hashes, and forbidden inputs. Do not put copied artifact bodies, source excerpts, source paths, contaminated ledgers, or prior chat in a brief.
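The unattended-mode selection rule above can be sketched as follows. This is a minimal illustration and not the runner's actual code: `unit_kind`, `coverage_state`, `unit_id`, `loop_context.approved_scope_refs`, and `loop_context.foundation_unit_ref` come from the text, while `select_next_unit`, the `scope_ref` key, and the in-memory shape of the units are assumptions made for the example.

```python
def select_next_unit(units, loop_context):
    """Return at most one eligible unit, or None when nothing can run."""
    # Exactly one foundation unit must exist, and loop_context must name it.
    foundations = [u for u in units if u["unit_kind"] == "foundation"]
    if len(foundations) != 1:
        raise ValueError("expected exactly one foundation unit")
    foundation = foundations[0]
    if loop_context["foundation_unit_ref"] != foundation["unit_id"]:
        raise ValueError("foundation_unit_ref does not name the foundation unit")

    approved = set(loop_context["approved_scope_refs"])
    foundation_covered = foundation["coverage_state"] == "covered"

    for unit in units:
        # Only pending or gap units are candidates.
        if unit["coverage_state"] not in ("pending", "gap"):
            continue
        # Never select outside the approved scope ("scope_ref" is hypothetical).
        if unit["scope_ref"] not in approved:
            continue
        # Behavior units wait until the foundation unit is covered.
        if unit["unit_kind"] == "behavior" and not foundation_covered:
            continue
        return unit
    return None
```

Returning `None` rather than widening the scope keeps the sketch in line with the rule that approved scope is never expanded silently during an inner run.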
56
66
 
@@ -60,6 +70,8 @@ Do not grant shell-style tools to Agent 0, Agent 1, Agent 1.5, Agent 2, or the d
60
70
 
61
71
  If a multi-file scope needs relationship-aware batching and `source-index.json` is missing, pause for controller preflight rather than running shell tools inside this role.
62
72
 
73
+ If a visual fallback scope needs screenshot/image batching and `visual-index.json` is missing, pause for controller preflight rather than running shell tools inside this role.
74
+
63
75
  Stop if clean roles received contaminated material. Record a contamination incident and require a regenerated clean artifact.
64
76
 
65
- Stop if Agent 1.5 receives source roots, source-index contents, contaminated evidence ledger contents, private identifier lists, raw diffs, source excerpts, or Agent 1 source-reading chat history. Record a contamination incident and start Agent 1.5 again from a fresh context with a neutral brief.
77
+ Stop if Agent 1.5 receives source roots, source-index contents, visual-index contents, raw screenshots, visual paths, image hashes, copied visible words, exact UI palettes/layouts/iconography, contaminated evidence ledger contents, private identifier lists, raw diffs, source excerpts, or Agent 1 source-reading chat history. Record a contamination incident and start Agent 1.5 again from a fresh context with a neutral brief.
package/agents/contaminated-source-analyst.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: contaminated-source-analyst
3
3
  description: Reads authorized source in a contaminated workspace and produces neutral draft task slices plus behavioral specs with evidence references, not replacement code.
4
- tools: Read, Write, Edit, Glob, Grep
4
+ tools: Read, Write, Edit, Glob, Grep, view_image
5
5
  ---
6
6
 
7
7
  # Contaminated Source Analyst
@@ -19,6 +19,7 @@ Before reading source, verify that Agent 0 provided:
19
19
  - active `task-manifest.json` with `preflight_goal_ref` and `preflight_goal_sha256`
20
20
  - one assigned `unit_id`
21
21
  - authorized `source_index_refs`, when used
22
+ - authorized `visual_index_refs`, when visual fallback is used
22
23
  - evidence handling policy
23
24
  - target stack and compatibility policy from preflight
24
25
  - neutral sanitizer brief requirements
@@ -28,21 +29,27 @@ Do not infer target language, dependency policy, license policy, or exactness po
28
29
 
29
30
  Responsibilities:
30
31
 
31
- - Read the minimum source needed for the assigned unit.
32
+ - Read the bounded source needed to fully inventory the assigned unit's observable surface. Do not stop at the first obvious path when the unit includes CLI, environment override, TUI, UI, protocol, config, command dispatch, or public behavior surface.
32
33
  - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the allowed artifact refs named there, except for direct source reads already permitted by the assigned unit and role policy.
33
34
  - When the unit has `source_index_refs`, stay within the referenced batch unless Agent 0 explicitly assigns a related gap.
35
+ - When the unit has `visual_index_refs`, use `view_image` only in this contaminated role and stay within the referenced visual batch unless Agent 0 explicitly assigns a related gap.
34
36
  - Generate neutral draft task slices and behavioral spec material for Agent 0-controlled units.
35
37
  - Write neutral behavioral requirements covering inputs, outputs, state transitions, edge cases, error conditions, invariants, and tests.
38
+ - For a `unit_kind: "foundation"` assignment, inventory target stack, package or module boundaries, public manifest surfaces, test entrypoints, dependency policy, and destination constraints. Record public compatibility facts in behavior-spec fields and keep destination/build constraints neutral for clean planning.
39
+ - When relevant to the assigned unit, locate and account for every observable CLI argument, flag, environment variable override, TUI command, keyboard shortcut, menu state, associated UI element, view state, accessibility expectation, config key, protocol entry point, and public user-visible behavior.
40
+ - If you detect related files, modules, visual components, or public surfaces that are inside authorized scope but outside the assigned refs or too large to analyze in the current context, record contaminated `coverage-ledger.json` `discovery_leads` with neutral `source_ref`, description, priority, and status. Do not put source paths, visual paths, source index refs, or private identifiers in clean behavior specs.
41
+ - For visual fallback units, write UI behavior/spec claims about intent, screen states, hierarchy, accessibility expectations, interaction purpose, and broad style goals. Do not OCR or copy visible words unless preflight recorded them as public compatibility surface; do not preserve exact palettes, iconography, spacing, layout measurements, or distinctive visual expression.
36
42
  - Treat discovered source tests as behavioral evidence and convert them into clean `test_scenarios` that validate the same observable outputs.
37
43
  - Record equal-output expectations for public return values, serialized data, CLI or API responses, errors, state changes, ordering, and compatibility-relevant side effects.
38
44
  - Use `evidence_refs` that point to contaminated-side ledger entries instead of including source text.
39
45
  - Keep public API names only when compatibility requires them and record the reason.
40
46
  - Capture public API, protocol, config, and data/schema compatibility using existing behavior spec fields.
47
+ - Do not mirror source dependency lists, package manifests, or private module layout. Mention a dependency only when it is public compatibility surface, destination evidence, or explicitly allowed by preflight policy.
41
48
  - For behavior-compatible ports, extract compatibility-critical invariants into `invariants`, `compatibility_notes`, and `test_scenarios`; broad module coverage is not enough.
42
49
  - When present, treat protocol transcript shape, request/response ID pairing, error budgets, streaming order, queue bounds, sampling registry aliases, async behavior, and typed JSON argument preservation as first-class observable behavior.
43
50
  - Treat package, namespace, module, class, function, method, variable, constant, field, and internal event names as private identifiers unless they are public compatibility surface.
44
51
  - Flag suspected leakage before returning drafts, but do not approve your own work for clean handoff.
45
52
 
46
- Never produce implementation code, copied comments, source excerpts, raw diffs, source test names, fixture structure, private helper names, or source-shaped pseudocode.
53
+ Never produce implementation code, copied comments, source excerpts, raw diffs, raw screenshots, visual paths, image hashes, copied visible text, exact UI palettes/layouts/iconography, source test names, fixture structure, private helper names, or source-shaped pseudocode.
47
54
 
48
55
  Agent 1.5 owns independent sanitization and leakage pass/fail review from a fresh source-denied context.
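The discovery-lead rule in these role files (a unit cannot go to handoff or be marked covered while high-priority leads are unresolved) might be checked along these lines. The sketch assumes each coverage-ledger unit entry carries a `discovery_leads` list whose items have `priority` and `status` keys, as the text describes; the literal values `"high"` and `"resolved"` and both function names are hypothetical.

```python
def blocking_leads(ledger_unit):
    """Return the discovery leads that still block a 'covered' state."""
    return [
        lead
        for lead in ledger_unit.get("discovery_leads", [])
        # Assumed vocabulary: priority "high", status "resolved".
        if lead.get("priority") == "high" and lead.get("status") != "resolved"
    ]


def can_mark_covered(ledger_unit):
    """A unit may be marked covered only when no high-priority lead is open."""
    return not blocking_leads(ledger_unit)
```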
package/bin/verify.sh CHANGED
@@ -49,6 +49,7 @@ echo "Compiling Python hooks and scripts..."
49
49
 
50
50
  echo "Smoke testing source index CLI..."
51
51
  "$python_cmd" skills/clean-room/scripts/build_source_index.py --help >/dev/null
52
+ "$python_cmd" skills/clean-room/scripts/build_visual_index.py --help >/dev/null
52
53
 
53
54
  echo "Validating example schemas..."
54
55
  for dir in skills/clean-room/examples/minimal-spec-package skills/clean-room/examples/contaminated-side; do
package/docs/ARCHITECTURE.md CHANGED
@@ -19,7 +19,7 @@ The Clean Room workflow acts as an engineering risk-reduction process by establi
19
19
  To maintain compliance and mitigate leakage risks, the workflow utilizes strictly separated workspaces, worktrees, repositories, or profiles for contaminated and clean work:
20
20
 
21
21
  * **Contaminated Source Workspace**: Source-readable, read-only where practical. Contains the codebase under analysis.
22
- * **Contaminated Artifact Workspace**: Holds intermediate outputs like preflight goals, init configs, source indexes, task manifests, controller status, coverage ledgers, evidence ledgers, draft specs, contaminated-role session briefs, and abstract delta tickets. Configure via `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
22
+ * **Contaminated Artifact Workspace**: Holds intermediate outputs like preflight goals, init configs, source indexes, visual indexes, task manifests, controller status, coverage ledgers, evidence ledgers, draft specs, contaminated-role session briefs, and abstract delta tickets. Configure via `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
23
23
  * **Clean Artifact Workspace**: Houses sanitized clean run contexts, approved behavioral specifications, handoff packages, clean-role session briefs, skeleton manifests, implementation plans, implementation reports, QC reports, and test plans. Configure via `CLEAN_ROOM_CLEAN_ROOTS`.
24
24
  * **Clean Implementation Workspace**: Houses clean destination code and tests. Configure via `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
25
25
  * **Clean Allowed Reference Workspace**: Public documentation, specifications, or destination constraints explicitly approved for clean and source-denied role reads. Configure via `CLEAN_ROOM_ALLOWED_READ_ROOTS`.
@@ -41,17 +41,18 @@ The initialization wizard and `require-clean-room-env.py` audit clean, implement
41
41
 
42
42
  ![Stage 0 Goal Contract](assets/3.png)
43
43
 
44
- Every new run starts with `preflight-goal.json` before source discovery, source indexing, Agent 0 decomposition, attended execution, or unattended execution. The contract records end goal, target stack, license policy, dependency policy, compatibility exactness, feature changes, code hygiene, output policy, controller mode, and open questions.
44
+ Every new run starts with `preflight-goal.json` before source discovery, source indexing, visual indexing, Agent 0 decomposition, attended execution, or unattended execution. The contract records end goal, target stack, license policy, dependency policy, compatibility exactness, feature changes, code hygiene, output policy, controller mode, and open questions.
45
45
 
46
46
  `preflight-goal.json` is controller/contaminated-side only. Clean roles receive only clean-safe `goal_contract` fields and `code_hygiene_policy` through `clean-run-context.json`.
47
47
 
48
48
  ### Contaminated Source-Index Preflight Tooling
49
49
 
50
- To assist in logical unit decomposition, the workflow supports an optional source-index preflight stage using `build_source_index.py` and `clean_room_tool_manager.py`.
50
+ To assist in logical unit decomposition, the workflow supports an optional source-index preflight stage using `build_source_index.py` and `clean_room_tool_manager.py`. When no indexable source code exists and screenshots/images are the authorized evidence, the workflow supports a fallback visual-index preflight stage using `build_visual_index.py`.
51
51
 
52
52
  * **Execution Boundary**: This tooling runs exclusively in the contaminated domain before clean-room role sessions are initialized.
53
53
  * **Traversal Bounds**: Source indexing enforces file count, per-file byte, total byte, batch token, and segment caps. It validates file size again after reading, skips files that change during read, records directory walk errors, and prunes traversal after global limits are exhausted with an aggregate skipped entry.
54
- * **Agent 0 Use**: Agent 0 consumes `source-index.json` only to create neutral `task-manifest.json` units and per-unit `source_index_refs`. The index stays contaminated-only and does not cross to Agent 1.5, Agent 2, Agent 3, Agent 4, or clean handoff packages.
54
+ * **Agent 0 Use**: Agent 0 consumes `source-index.json` only to create neutral `task-manifest.json` units and per-unit `source_index_refs`. In visual fallback runs, Agent 0 consumes `visual-index.json` only to create neutral units and per-unit `visual_index_refs`. Both indexes stay contaminated-only and do not cross to Agent 1.5, Agent 2, Agent 3, Agent 4, or clean handoff packages.
55
+ * **Discovery Leads**: When Agent 1 detects an authorized related surface that cannot be analyzed inside the assigned unit, Agent 0 tracks it in contaminated `coverage-ledger.json` `discovery_leads`. High-priority leads must be resolved before the unit can be marked covered; the runner does not expand approved scope automatically.
55
56
  * **Tool Trust Policy**: By default, tool discovery operates in `stat-only` mode and does not execute third-party binaries. It queries version strings only when explicitly invoked with `--probe-tools`. Tools discovered under `/opt/homebrew` or `/usr/local` remain stat-only unless `--allow-user-toolchain-probes` is also supplied. Project-local directories (such as `.bin` or `node_modules/.bin`) are ignored unless the environment variable `RE_SKILLS_TRUST_PROJECT_TOOLS=1` or the flag `--allow-working-project-tools` is supplied.
56
57
  * **Local Tool Install Safety**: Explicit npm-backed helper installs are strict-version pinned and serialized with a cache-local lock before mutating `~/.cache/re-skills/clean-room-tools/npm`. Prefix creation failures, subprocess timeouts, and subprocess launch errors are returned as structured JSON facts instead of raw tracebacks.
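The tool trust policy above reads as a small decision table, sketched here under stated assumptions. The flag names and the `RE_SKILLS_TRUST_PROJECT_TOOLS` variable are taken from the text; the function name, the three return labels, and the choice to apply the normal probe rules to trusted project-local tools are illustrative guesses, not the behavior of `clean_room_tool_manager.py`.

```python
from pathlib import PurePosixPath


def tool_mode(tool_path, flags, env):
    """Classify a discovered tool as 'ignored', 'stat-only', or 'probe'."""
    # Project-local tools (.bin, node_modules/.bin) are ignored unless trusted.
    if ".bin" in PurePosixPath(tool_path).parts:
        trusted = (
            env.get("RE_SKILLS_TRUST_PROJECT_TOOLS") == "1"
            or "--allow-working-project-tools" in flags
        )
        if not trusted:
            return "ignored"

    # Default: never execute the binary, only stat it.
    if "--probe-tools" not in flags:
        return "stat-only"

    # User toolchain prefixes need a second, explicit opt-in before probing.
    in_user_toolchain = tool_path.startswith(("/opt/homebrew/", "/usr/local/"))
    if in_user_toolchain and "--allow-user-toolchain-probes" not in flags:
        return "stat-only"

    return "probe"
```

For example, `tool_mode("/opt/homebrew/bin/rg", {"--probe-tools"}, {})` stays `"stat-only"` until `--allow-user-toolchain-probes` is also present.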
57
58
 
@@ -78,7 +79,7 @@ flowchart LR
78
79
  sanitizer["Agent 1.5: contaminated-handoff-sanitizer<br/>Source-denied, scrub identifying material"]
79
80
  brief["Neutral sanitizer brief<br/>domain, target profile, unit intent,<br/>public allowlist, blocked categories"]
80
81
  preflight["preflight-goal.json<br/>goal, stack, policy, hygiene"]
81
- ledgers["Contaminated artifacts<br/>CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS<br/>init-config.json<br/>source-index.json<br/>task-manifest.json<br/>coverage-ledger.json<br/>evidence-ledger.json"]
82
+ ledgers["Contaminated artifacts<br/>CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS<br/>init-config.json<br/>source-index.json<br/>visual-index.json<br/>task-manifest.json<br/>coverage-ledger.json<br/>evidence-ledger.json"]
82
83
  drafts["Agent 1 draft specs<br/>assigned paths only for Agent 1.5"]
83
84
  staged["Sanitized handoff candidates<br/>Agent 1.5-reviewed behavior-spec.json"]
84
85
  end
@@ -95,8 +96,8 @@ flowchart LR
95
96
  architect["Agent 2: clean-architect<br/>Plan implementation from clean specs and foundation"]
96
97
  qa["Agent 3: clean-qa-editor<br/>Implement, record verification, terminal report"]
97
98
  polish["Agent 4: clean-polish-reviewer<br/>Final code polish, repo hygiene, local commit"]
98
- outputs["Clean artifacts<br/>implementation-plan.json<br/>qc-report.json<br/>test plan notes"]
99
- imploutputs["Implementation outputs<br/>code, tests, AGENTS.md, .gitignore<br/>implementation-report.json<br/>polish-report.json"]
99
+ outputs["Clean artifacts<br/>implementation-plan.json<br/>implementation-report.json<br/>qc-report.json<br/>polish-report.json<br/>test plan notes"]
100
+ imploutputs["Implementation outputs<br/>code, tests, fixtures, AGENTS.md, .gitignore"]
100
101
  end
101
102
 
102
103
  subgraph guardrails["Guardrails and audit"]
@@ -140,10 +141,10 @@ flowchart LR
140
141
  env -. required for every role session .-> architect
141
142
  denyread -. clean and source-denied roles cannot read source roots .-> cleanroots
142
143
  denyread -. clean roles may read implementation roots .-> implroots
143
- denyread -. Agent 1.5 cannot read source roots, clean roots, implementation roots, source-index.json, or preflight-goal.json .-> sanitizer
144
+ denyread -. Agent 1.5 cannot read source/visual roots, clean roots, implementation roots, source-index.json, visual-index.json, or preflight-goal.json .-> sanitizer
144
145
  denywrite -. contaminated writes only to contaminated artifact roots .-> ledgers
145
- denywrite -. Agent 2 writes clean artifacts only; Agents 3 and 4 write implementation roots .-> cleanroots
146
- denywrite -. Agents 3 and 4 write code, tests, docs, and repo hygiene only here .-> implroots
146
+ denywrite -. Agent 2 writes clean artifacts; Agents 3 and 4 write clean reports .-> cleanroots
147
+ denywrite -. Agents 3 and 4 write destination files only here; no clean-room artifact JSON .-> implroots
147
148
  denyshell -. no shell-style tools in role sessions .-> manager
148
149
  denyshell -. no shell for Agent 2; explicit Agent 3 and Agent 4 runners only .-> architect
149
150
  scan -. post-write checks .-> outputs
@@ -177,6 +178,7 @@ The architecture delegates work across six distinct custom role agents to enforc
177
178
  * Produces `clean-run-context.json` for Agent 2, Agent 3, and Agent 4 instead of handing over the full `task-manifest.json` or full `preflight-goal.json`.
178
179
  * Influences Agent 2, Agent 3, and Agent 4 only through durable sanitized artifacts, never direct chat, progress feedback, implementation hints, or priority changes.
179
180
  * Performs final verification of clean specification and implementation coverage against the source scope.
181
+ * Blocks handoff or coverage completion when high-priority contaminated discovery leads remain unresolved.
180
182
  * Writes the inner-loop `clean-room-result.json` only after contaminated-side coverage verification.
181
183
  * Consumes Agent 3 reports only after Agent 3 reaches a terminal state, and consumes Agent 4 reports only after the configured polish review reaches a terminal state, then sends only abstract delta tickets into a fresh clean artifact cycle.
182
184
 
@@ -187,6 +189,8 @@ The architecture delegates work across six distinct custom role agents to enforc
187
189
  * Analyzes the authorized source code within assigned units or batches.
188
190
  * Uses target stack and compatibility policy from preflight instead of inferring product goals from source.
189
191
  * Writes neutral draft behavioral specifications based on observed behavior, public contracts, invariants, state transitions, and errors.
192
+ * Inventories the assigned unit's observable CLI, env, TUI, UI, protocol, config, command, and public behavior surfaces when relevant.
193
+ * Records authorized related surfaces that cannot be analyzed in the assigned context as contaminated `discovery_leads`, not clean spec fields.
190
194
  * Generates evidence references pointing to contaminated ledgers instead of copying raw source code or comments.
191
195
  * Flags suspected leakage but does not approve its own work for clean handoff.
192
196
 
@@ -223,7 +227,7 @@ The architecture delegates work across six distinct custom role agents to enforc
223
227
  * Records code hygiene violations as `code-hygiene` findings in `qc-report.json`.
224
228
  * Writes code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
225
229
  * Runs bounded verification only through the installed Agent 3 verification runner, with `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`, strict hooks, and cwd under implementation roots.
226
- * Writes `implementation-report.json` and maintains `qc-report.json`.
230
+ * Writes `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` and maintains `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`.
227
231
  * Does not report progress or ask Agent 0 for guidance during implementation.
228
232
  * Emits one terminal report for Agent 0 only when the assigned spec slice is complete, blocked, or quarantined.
229
233
 
@@ -234,7 +238,7 @@ The architecture delegates work across six distinct custom role agents to enforc
234
238
  * Reviews final clean implementation for security, docs/comments, exception handling, resource leaks, race conditions, missing tests, and repository hygiene.
235
239
  * Creates or updates implementation-root `AGENTS.md` with gotchas and build/test/dev commands discovered from clean implementation files.
236
240
  * Updates `.gitignore` only for real generated outputs, dependencies, caches, or build/test artifacts.
237
- * Writes `polish-report.json`.
241
+ * Writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`.
238
242
  * Uses `agent4-polish-runner.py` only with `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1`, cwd under implementation roots, and strict hooks.
239
243
  * May initialize git and create one local commit containing only paths listed in `polish-report.json`; it must not push, tag, reset, clean, or delete branches.
240
244
 
@@ -251,14 +255,16 @@ Agent 3's terminal report is not enough to return. If configured, Agent 4 must p
251
255
  * Locks the contaminated artifact root with `.clean-room-run.lock`.
252
256
  * Reloads durable artifacts before each iteration.
253
257
  * Selects at most one pending or gap unit inside `loop_context.approved_scope_refs`.
258
+ * Requires exactly one `unit_kind: "foundation"` unit, named by `loop_context.foundation_unit_ref`; behavior units cannot run or complete until that foundation unit is covered.
254
259
  * Spawns configured role commands with `shell: false`, bounded output, and bounded timeout.
255
260
  * In strict context-management mode, requires each configured stage to provide `context.fresh_session: true` and `context.brief_path`, then validates the session brief before spawn.
256
261
  * Supports the optional `clean-polish-review` phase between `clean-implement-qc` and `contaminated-coverage-verify`.
257
262
  * Validates schema, leakage, and handoff integrity before advancing state.
263
+ * Rejects `covered` coverage-ledger units that still have unresolved high-priority `discovery_leads`.
258
264
  * Records controller memory in contaminated-side `controller-run-ledger.json`.
259
265
  * Writes `clean-room-result.json` before returning to the outer spec loop.
260
266
 
261
- Progress is durable-artifact based. `clean-room-skill run` compares semantic JSON artifact hashes that ignore volatile timestamp and artifact-hash fields, plus raw file hashes under implementation roots. Chat output, timestamp-only artifact churn, and `controller-status.json` updates alone do not count as progress.
267
+ Progress is durable-artifact based. `clean-room-skill run` compares semantic JSON artifact hashes that ignore volatile timestamp and artifact-hash fields, plus raw file hashes under implementation roots while ignoring generated directories such as `target/`. Chat output, timestamp-only artifact churn, Cargo build metadata, and `controller-status.json` updates alone do not count as progress.
262
268
 
263
269
  ---
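The progress rule above depends on hashing artifacts in a way that ignores volatile fields. A minimal sketch of that idea follows, assuming hypothetical volatile key names (`generated_at`, `updated_at`, `artifact_sha256`); the real field list used by `clean-room-skill run` is not shown in this document.

```python
import hashlib
import json

# Hypothetical names for timestamp and artifact-hash fields.
VOLATILE_KEYS = {"generated_at", "updated_at", "artifact_sha256"}


def strip_volatile(value):
    """Recursively drop volatile keys from dicts nested anywhere in the value."""
    if isinstance(value, dict):
        return {k: strip_volatile(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [strip_volatile(v) for v in value]
    return value


def semantic_hash(artifact):
    """Hash a canonical JSON form so key order and volatile fields do not matter."""
    canonical = json.dumps(strip_volatile(artifact), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two ledger snapshots that differ only in a timestamp then hash identically, so timestamp-only churn does not register as progress.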
264
270
 
@@ -270,7 +276,7 @@ Every clean-room role session requires a populated environment block before any
270
276
  * `CLEAN_ROOM_SOURCE_ROOTS`: Source roots (only readable by source-reading contaminated roles, not Agent 1.5).
271
277
  * `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`: Target write directory for contaminated roles.
272
278
  * `CLEAN_ROOM_CLEAN_ROOTS`: Target write directory for clean artifacts and reports.
273
- * `CLEAN_ROOM_IMPLEMENTATION_ROOTS`: Target write directory for Agent 3 clean implementation code and tests, plus Agent 4 implementation-root hygiene changes and local git metadata.
279
+ * `CLEAN_ROOM_IMPLEMENTATION_ROOTS`: Target write directory for Agent 3 clean implementation code, tests, fixtures, real destination project files, plus Agent 4 implementation-root hygiene changes and local git metadata. Clean-room artifact JSON files stay out of this root.
274
280
  * `CLEAN_ROOM_ALLOWED_READ_ROOTS`: Approved reference docs or constraints readable by clean and source-denied roles.
275
281
  * `CLEAN_ROOM_SCHEMA_DIR`: Path to the directory containing JSON schema assets.
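A fail-closed presence check for the environment block listed above could look like this sketch. The variable names are the ones documented here; the function name and the `needs_allowed_read_roots` parameter (standing in for "clean or source-denied role") are assumptions, and the real `require-clean-room-env.py` hook also checks root overlap and source-derived names, which this example omits.

```python
REQUIRED_VARS = (
    "CLEAN_ROOM_ROLE",
    "CLEAN_ROOM_SOURCE_ROOTS",
    "CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS",
    "CLEAN_ROOM_CLEAN_ROOTS",
    "CLEAN_ROOM_IMPLEMENTATION_ROOTS",
    "CLEAN_ROOM_SCHEMA_DIR",
)


def missing_env(env, needs_allowed_read_roots):
    """Return the names of required variables that are unset or blank."""
    required = list(REQUIRED_VARS)
    # Clean and source-denied roles additionally need an allowed-read root.
    if needs_allowed_read_roots:
        required.append("CLEAN_ROOM_ALLOWED_READ_ROOTS")
    return [name for name in required if not env.get(name, "").strip()]
```

A caller would pass `os.environ` at the start of every role session and stop when the returned list is non-empty, since the variables are not assumed to persist across sessions.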
276
282
 
@@ -293,11 +299,11 @@ Post-write hook failures are deny-by-default and redacted. If an artifact disapp
293
299
  * [agent4-polish-runner.py](../hooks/agent4-polish-runner.py): Runs Agent 4 bounded status, verification, git init, staging, and one local commit from implementation roots only, using paths and policy recorded in `polish-report.json`.
294
300
  * [require-clean-room-env.py](../hooks/require-clean-room-env.py): Fails closed if the required role and root environment variables are missing, if trust-domain roots overlap, or if clean, implementation, or contaminated artifact root names appear source-derived.
295
301
  * [deny-clean-room-shell.py](../hooks/deny-clean-room-shell.py): Denies shell-style tool execution inside clean-room role sessions except installed Agent 3 verification-runner invocations under implementation roots and installed Agent 4 polish-runner invocations under implementation roots.
296
- * [deny-clean-source-read.py](../hooks/deny-clean-source-read.py): Enforces that clean roles and Agent 1.5 cannot read source roots or unapproved paths; clean roles may read implementation roots, and source-denied roles are denied direct `preflight-goal.json` reads. Agent 1.5 is also denied clean roots, implementation roots, and direct `source-index.json` reads.
297
- * [deny-contaminated-clean-write.py](../hooks/deny-contaminated-clean-write.py): Enforces role write roots. Agent 2 writes clean artifacts only, Agent 3 writes implementation files and clean reports, Agent 4 writes clean polish reports and implementation-root polish changes, and contaminated roles write only to `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
302
+ * [deny-clean-source-read.py](../hooks/deny-clean-source-read.py): Enforces that clean roles and Agent 1.5 cannot read source or visual roots or unapproved paths; clean roles may read implementation roots, and source-denied roles are denied direct `preflight-goal.json` reads. Agent 1.5 is also denied clean roots, implementation roots, and direct `source-index.json` or `visual-index.json` reads.
303
+ * [deny-contaminated-clean-write.py](../hooks/deny-contaminated-clean-write.py): Enforces role write roots. Agent 2 writes clean artifacts only, Agent 3 writes implementation files and clean reports, Agent 4 writes clean polish reports and implementation-root polish changes, contaminated roles write only to `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, and clean-room artifact JSON files are denied under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
298
304
  * [check-artifact-leakage.py](../hooks/check-artifact-leakage.py): Scans clean artifacts and Agent 1.5 staged contaminated artifacts for high-risk leakage markers, source-like identifiers, and private identifier denylist terms. The private identifier denylist (loaded via `CLEAN_ROOM_PRIVATE_IDENTIFIER_DENYLIST`) is subject to hard limits to protect hook execution performance: a maximum of 1,000,000 bytes per file, 20,000 total terms, and 512 characters per individual term.
299
305
  * [validate-json-schema.py](../hooks/validate-json-schema.py): Verifies JSON syntax and structural conformance against schemas under `CLEAN_ROOM_SCHEMA_DIR`, including controller-side `preflight-goal.schema.json` and `init-config.schema.json`. Under clean roots, any unrecognized JSON files that do not conform to canonical schemas will trigger a failure unless they are explicitly registered in the path-separated `CLEAN_ROOM_AUXILIARY_JSON_ALLOWLIST` environment variable.
300
- * [validate-handoff-package.py](../hooks/validate-handoff-package.py): Verifies that handoff packages stay within clean roots, do not reference contaminated paths, `task-manifest.json`, `preflight-goal.json`, or `source-index.json`, and match declared `sha256` checksums.
306
+ * [validate-handoff-package.py](../hooks/validate-handoff-package.py): Verifies that handoff packages stay within clean roots, do not reference contaminated paths, `task-manifest.json`, `preflight-goal.json`, `source-index.json`, or `visual-index.json`, and match declared `sha256` checksums.
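
The hard limits listed for the private identifier denylist in the `check-artifact-leakage.py` entry above can be illustrated with a minimal sketch. This is not the hook's actual code: the one-term-per-line UTF-8 file format, the `load_denylist` and `DenylistLimitError` names, and the choice to raise on a breach (rather than truncate) are all assumptions made for illustration. Only the three numeric limits come from the text.

```python
from pathlib import Path

# Limits quoted in the hook description above.
MAX_FILE_BYTES = 1_000_000
MAX_TOTAL_TERMS = 20_000
MAX_TERM_CHARS = 512


class DenylistLimitError(ValueError):
    """Hypothetical error raised when a denylist breaches a hard limit."""


def load_denylist(paths):
    """Load denylist terms from files, failing closed on any limit breach.

    Assumes one term per line in UTF-8 text; blank lines are skipped.
    """
    terms = []
    for path in paths:
        data = Path(path).read_bytes()
        if len(data) > MAX_FILE_BYTES:
            raise DenylistLimitError(f"{path}: larger than {MAX_FILE_BYTES} bytes")
        for line in data.decode("utf-8").splitlines():
            term = line.strip()
            if not term:
                continue
            if len(term) > MAX_TERM_CHARS:
                raise DenylistLimitError(f"{path}: term longer than {MAX_TERM_CHARS} characters")
            terms.append(term)
            if len(terms) > MAX_TOTAL_TERMS:
                raise DenylistLimitError(f"more than {MAX_TOTAL_TERMS} terms in total")
    return terms
```

A caller would typically split `CLEAN_ROOM_PRIVATE_IDENTIFIER_DENYLIST` into file paths first; how that variable is delimited is not stated here, so the sketch takes an already-split list.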
301
307
 
302
308
  For detailed guidelines on the clean-room process, refer to:
303
309
  * [CONTROLLER-LOOP.md](../skills/clean-room/references/CONTROLLER-LOOP.md)
package/docs/REFERENCE.md CHANGED
@@ -210,7 +210,11 @@ Options:
210
210
  | `--schema-dir <path>` | Override bundled schema directory. |
211
211
  | `--python <path>` | Python executable for validation hooks; default is `python3`. |
212
212
 
213
- The task manifest must already include preflight references, the required handoff sequence, unattended controller policy, finite iteration bounds, and `loop_context.approved_scope_refs`.
213
+ The task manifest must already include preflight references, the required handoff sequence, unattended controller policy, finite iteration bounds, `loop_context.foundation_unit_ref`, and `loop_context.approved_scope_refs`.
214
+
215
+ Unattended code-development manifests must include exactly one `unit_kind: "foundation"` unit. The runner rejects non-foundation approved slices until that unit is covered.
216
+
217
+ `coverage-ledger.json` may record contaminated-only `source_units[].discovery_leads` for authorized related surfaces that were detected but not analyzed in the assigned unit. The runner rejects a `covered` unit while any high-priority discovery lead remains open or deferred. It does not add follow-up units or expand `loop_context.approved_scope_refs`; Agent 0 must return an abstract delta, mark coverage partial or blocked, or pause for attended approval.
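
The rule that a `covered` unit is rejected while a high-priority discovery lead is open or deferred can be sketched as follows. This is an illustrative check, not the runner's implementation. The `source_units[].discovery_leads` path comes from the text; the per-unit `id` and `status` keys, the per-lead `priority` and `status` keys, and their string values are assumptions.

```python
def blocking_leads(unit):
    """Return leads that would block a unit from counting as covered.

    Assumes each lead is a dict with hypothetical "priority" and "status" keys.
    """
    return [
        lead
        for lead in unit.get("discovery_leads", [])
        if lead.get("priority") == "high"
        and lead.get("status") in {"open", "deferred"}
    ]


def check_covered_units(ledger):
    """Collect one error message per covered unit with unresolved leads."""
    errors = []
    for unit in ledger.get("source_units", []):
        if unit.get("status") != "covered":
            continue  # partial or blocked units may keep open leads
        leads = blocking_leads(unit)
        if leads:
            errors.append(
                f"{unit.get('id')}: {len(leads)} unresolved high-priority discovery lead(s)"
            )
    return errors
```

Note that the check only reports. In line with the text, it does not add follow-up units or widen `loop_context.approved_scope_refs`.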
214
218
 
215
219
  Minimal agent command adapter shape for advisory or disabled context management:
216
220
 
@@ -310,7 +314,7 @@ Strict context-management adapter example:
310
314
  }
311
315
  ```
312
316
 
313
- Relative `context.brief_path` values resolve relative to the `agent-commands.json` directory. Contaminated phases must point to briefs under the contaminated artifact root. Clean phases must point to briefs under the clean artifact root. A clean-stage brief may reference allowed clean artifacts, implementation artifacts, and approved public references, but not source indexes, contaminated ledgers, full manifests, controller status, or prior chat state.
317
+ Relative `context.brief_path` values resolve relative to the `agent-commands.json` directory. Contaminated phases must point to briefs under the contaminated artifact root. Clean phases must point to briefs under the clean artifact root. A clean-stage brief may reference allowed clean artifacts, implementation artifacts, and approved public references, but not source indexes, visual indexes, raw screenshots, contaminated ledgers, full manifests, controller status, or prior chat state.
314
318
 
315
319
  The runner exports `CLEAN_ROOM_SESSION_BRIEF_PATH`, `CLEAN_ROOM_ROLE_SESSION_ID`, and `CLEAN_ROOM_FRESH_CONTEXT_REQUIRED=1` for strict stages. The adapter still owns the actual fresh-context behavior: it must open a new model session, profile, or thread for that stage. Setting `fresh_session` while reusing one long chat is not a clean-room boundary.
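
The `context.brief_path` resolution rule described above (relative values resolve against the `agent-commands.json` directory, and the brief must sit under the artifact root for its phase) can be sketched like this. The function name and its parameters are hypothetical, and the runner's real validation may differ.

```python
from pathlib import Path


def resolve_brief_path(commands_file, brief_path, allowed_root):
    """Resolve a brief path and require it to stay under allowed_root.

    commands_file: path to agent-commands.json
    brief_path:    the stage's context.brief_path value
    allowed_root:  clean or contaminated artifact root for the phase
    """
    candidate = Path(brief_path)
    if not candidate.is_absolute():
        # Relative values are anchored at the agent-commands.json directory.
        candidate = Path(commands_file).resolve().parent / candidate
    resolved = candidate.resolve()
    root = Path(allowed_root).resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        raise ValueError(f"brief path {resolved} is outside {root}") from None
    return resolved
```

Resolving both paths before comparing them means `..` segments and symlinks cannot be used to step outside the allowed root.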
316
320
 
@@ -323,12 +327,14 @@ The runner exports `CLEAN_ROOM_SESSION_BRIEF_PATH`, `CLEAN_ROOM_ROLE_SESSION_ID`
323
327
  | `install lock is held` | Another install or uninstall is mutating the same target root | Wait for the other process to finish; stale locks are handled conservatively. |
324
328
  | Hook config write failed after files copied | Partial installer state | Fix the filesystem error, then re-run the same installer command. |
325
329
  | Install manifest remains `installing` | The previous install did not complete | Re-run the same installer command for that runtime and target root. |
326
- | `clean-room run` rejects the manifest | Invalid or incomplete unattended loop metadata | Fix `controller_policy`, `loop_context`, and `approved_scope_refs`, then retry `--dry-run`. |
330
+ | `clean-room run` rejects the manifest | Invalid or incomplete unattended loop metadata | Fix `controller_policy`, `loop_context.foundation_unit_ref`, and `approved_scope_refs`, then retry `--dry-run`. |
331
+ | `clean-room run` rejects a covered unit with `discovery_leads` | A high-priority contaminated discovery lead is still unresolved | Analyze the lead in an authorized follow-up unit, mark it resolved, or keep coverage partial/blocked and return an abstract delta. |
327
332
  | `clean-room run` rejects an agent command stage in strict context mode | The stage is missing `context.fresh_session: true`, missing `context.brief_path`, or points the brief outside the allowed artifact root | Fix the stage context and regenerate the role-session brief for the selected unit. |
328
333
  | `clean-room run` reports no progress | Configured stages exited without durable artifact changes | Check role command cwd/argv, selected unit, and artifact write roots. |
329
334
  | `clean-room run` reports repeated unit selection | Same unit selected after a no-progress iteration | Resolve the blocker or update durable artifacts before retrying. |
330
335
  | Hook reports `could not read` or `could not stat` | Artifact disappeared, permissions changed, or path was replaced during validation | Restore readable artifact state and retry. |
331
336
  | `source-index.json` is missing files | Limits, unreadable directories, ignored directories, binary files, changed files, or outside-root symlinks | Inspect `skipped_entries` and adjust limits or permissions if omissions matter. |
337
+ | `visual-index.json` is missing screenshots | Limits, unsupported formats, unreadable directories, changed files, invalid image headers, or outside-root symlinks | Inspect `skipped_entries`, keep visual roots in the contaminated/source domain, and rerun `build_visual_index.py` only as fallback evidence preflight. |
332
338
 
333
339
  ## Local Verification
334
340
 
@@ -10,14 +10,14 @@ Run only from the clean workspace.
10
10
  Before tool use, require CLEAN_ROOM_ROLE=clean-architect, CLEAN_ROOM_CLEAN_ROOTS, CLEAN_ROOM_IMPLEMENTATION_ROOTS, CLEAN_ROOM_SOURCE_ROOTS, CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS, CLEAN_ROOM_ALLOWED_READ_ROOTS, and CLEAN_ROOM_SCHEMA_DIR.
11
11
  Read approved clean artifacts, CLEAN_ROOM_IMPLEMENTATION_ROOTS, and explicitly configured public or destination constraint roots only.
12
12
  Write only under CLEAN_ROOM_CLEAN_ROOTS. Do not write code.
13
- Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
13
+ Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
14
14
  Stop if only a full task-manifest.json is provided as run context.
15
- Before planning, require valid clean-run-context.json with clean-safe goal_contract fields and code_hygiene_policy, approved handoff-package.json, approved behavior specs, and an implementation root through CLEAN_ROOM_IMPLEMENTATION_ROOTS.
15
+ Before planning, require valid clean-run-context.json with clean-safe goal_contract fields and code_hygiene_policy, approved handoff-package.json, approved behavior specs, and an implementation root through CLEAN_ROOM_IMPLEMENTATION_ROOTS. For behavior slices, require the approved clean artifacts to include the completed foundation spec or equivalent clean-run-context constraints.
16
16
  When CLEAN_ROOM_SESSION_BRIEF_PATH is set, read it first and load only the allowed artifact refs named there, plus destination foundation reads permitted by this role.
17
- Stop if full preflight-goal.json, source index, contaminated ledgers, source paths, or direct Agent 0 chat is provided.
17
+ Stop if full preflight-goal.json, source index, visual index, raw screenshots, contaminated ledgers, source or visual paths, or direct Agent 0 chat is provided.
18
18
  Accept Agent 0 influence only as durable sanitized artifacts. Ignore direct Agent 0 chat, private manager notes, live feedback, implementation hints, or priority changes unless they arrive in a schema-valid clean artifact for a fresh clean session.
19
19
  Merge only approved handoff artifacts into the selected clean schema base.
20
- Read the clean destination foundation to identify local structure, conventions, tests, dependencies, and constraints.
20
+ Read the clean destination foundation and approved foundation spec to identify local structure, conventions, tests, dependency policy, package boundaries, and constraints.
21
21
  Read any existing skeleton-manifest.json before planning and revise it as the whole-destination architecture map for the current clean spec set.
22
22
  Maintain architecture areas with owned relative path prefixes, responsibilities, forbidden responsibilities, allowed area dependencies, and refactor triggers.
23
23
  Assign every target and test path in implementation-plan.json to one or more skeleton-manifest.json architecture areas.
@@ -26,6 +26,8 @@ Create or update implementation-plan.json as the primary output for code-develop
26
26
  Carry the preflight-derived code hygiene policy into implementation-plan.json.
27
27
  Keep skeleton-manifest.json valid and current for code-development runs. Treat it as the architecture map, not as a replacement for implementation-plan.json.
28
28
  Map approved specs to destination files, test files, work items, argv-array verification commands, risks, and acceptance criteria using relative implementation-root paths.
29
+ Map every exact-public-contract or behavior-compatible public surface obligation to at least one implementation-plan.json work item through public_contract_refs; do not replace a public command/API inventory with one generic dispatch work item unless every obligation ref is listed.
30
+ Do not choose dependencies by copying source manifests. Add or preserve dependencies only when clean artifacts, destination evidence, or preflight policy justify them.
29
31
  Preserve source-test-derived scenarios as clean test obligations for equal output without copying source test structure.
30
32
  Do not resolve public-contract, callable, protocol, async, serialization, or data-shape ambiguity by narrowing semantics. Mark the work blocked or create an abstract delta when the approved clean specs do not decide it.
31
33
  Stop if contaminated material appears in clean inputs.