clean-room-skill 0.1.9 → 0.1.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (51)
  1. package/.claude-plugin/marketplace.json +1 -1
  2. package/.claude-plugin/plugin.json +1 -1
  3. package/.codex-plugin/plugin.json +1 -1
  4. package/README.md +3 -3
  5. package/agents/clean-architect.md +2 -2
  6. package/agents/clean-implementer-verifier-shell.md +4 -4
  7. package/agents/clean-polish-reviewer.md +2 -2
  8. package/agents/clean-qa-editor.md +9 -6
  9. package/agents/contaminated-handoff-sanitizer.md +4 -4
  10. package/agents/contaminated-manager-verifier.md +10 -4
  11. package/agents/contaminated-source-analyst.md +5 -2
  12. package/bin/verify.sh +1 -0
  13. package/docs/ARCHITECTURE.md +17 -17
  14. package/docs/REFERENCE.md +2 -1
  15. package/examples/codex/.codex/agents/clean-architect.toml +2 -2
  16. package/examples/codex/.codex/agents/clean-polish-reviewer.toml +1 -1
  17. package/examples/codex/.codex/agents/clean-qa-editor.toml +6 -6
  18. package/examples/codex/.codex/agents/contaminated-handoff-sanitizer.toml +2 -2
  19. package/examples/codex/.codex/agents/contaminated-manager-verifier.toml +4 -2
  20. package/examples/codex/.codex/agents/contaminated-source-analyst.toml +4 -2
  21. package/hooks/agent3-verification-runner.py +2 -1
  22. package/hooks/agent4-polish-runner.py +3 -0
  23. package/hooks/check-artifact-leakage.py +75 -11
  24. package/hooks/deny-clean-source-read.py +23 -0
  25. package/hooks/deny-contaminated-clean-write.py +63 -0
  26. package/hooks/validate-handoff-package.py +6 -0
  27. package/hooks/validate-json-schema.py +5 -1
  28. package/lib/bootstrap.cjs +14 -0
  29. package/lib/fs-utils.cjs +4 -0
  30. package/lib/run.cjs +387 -33
  31. package/package.json +1 -1
  32. package/plugin.json +1 -1
  33. package/skills/attended/SKILL.md +1 -1
  34. package/skills/clean-room/SKILL.md +18 -14
  35. package/skills/clean-room/assets/clean-run-context.schema.json +1 -1
  36. package/skills/clean-room/assets/task-manifest.schema.json +11 -0
  37. package/skills/clean-room/assets/visual-index.schema.json +283 -0
  38. package/skills/clean-room/examples/README.md +3 -0
  39. package/skills/clean-room/examples/contaminated-side/task-manifest.json +24 -24
  40. package/skills/clean-room/examples/contaminated-side/visual-index.json +70 -0
  41. package/skills/clean-room/references/LEAKAGE-RULES.md +6 -3
  42. package/skills/clean-room/references/PREFLIGHT.md +5 -2
  43. package/skills/clean-room/references/PROCESS.md +33 -25
  44. package/skills/clean-room/references/SPEC-SCHEMA.md +31 -12
  45. package/skills/clean-room/scripts/build_visual_index.py +449 -0
  46. package/skills/clean-room/scripts/source_index/discovery.py +27 -0
  47. package/skills/init/SKILL.md +1 -1
  48. package/skills/refocus/SKILL.md +4 -4
  49. package/skills/resume/SKILL.md +4 -3
  50. package/skills/start-over/SKILL.md +5 -5
  51. package/skills/unattended/SKILL.md +1 -1
@@ -9,7 +9,7 @@
  "name": "clean-room",
  "source": "./",
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
- "version": "0.1.9",
+ "version": "0.1.11",
  "author": {
  "name": "whit3rabbit"
  },
@@ -2,7 +2,7 @@
  "name": "clean-room",
  "displayName": "Clean Room",
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
- "version": "0.1.9",
+ "version": "0.1.11",
  "author": {
  "name": "whit3rabbit"
  },
@@ -1,6 +1,6 @@
  {
  "name": "clean-room",
- "version": "0.1.9",
+ "version": "0.1.11",
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
  "author": {
  "name": "whit3rabbit"
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # Clean Room

- Clean Room is an agent workflow for turning authorized source analysis into clean behavioral specs, clean implementation plans, and clean destination code.
+ Clean Room is an agent workflow for turning authorized source analysis into clean behavioral specs, clean implementation plans, and clean destination code. When no indexable source code is available, it can use authorized screenshots/images as contaminated fallback evidence for behavior specs.

  It is a POC based on ideas from [malus.sh](https://malus.sh/blog.html). It is an engineering risk-reduction workflow, not legal advice, and it does not create a legal safe harbor.

@@ -20,10 +20,10 @@ The workflow creates clean behavioral spec packages and clean implementation out

  Core boundary:

- - Contaminated roles may read authorized source and write contaminated artifacts.
+ - Contaminated roles may read authorized source or fallback visual evidence and write contaminated artifacts.
  - Source-denied roles may read only clean artifacts, implementation roots, schemas, and approved public/reference roots.
  - Clean implementation code is written only under the clean implementation root.
- - Raw source, source paths, private identifiers, raw diffs, copied comments, and source-shaped pseudocode must not cross into clean handoff artifacts.
+ - Raw source, raw screenshots, source or visual paths, private identifiers, raw diffs, copied comments, copied UI text, and source-shaped pseudocode must not cross into clean handoff artifacts.

  For the full boundary model, see [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md). For CLI and troubleshooting details, see [docs/REFERENCE.md](docs/REFERENCE.md).

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob

  This role is Agent 2 in the clean-room pipeline.

- Operate only in the clean domain from `CLEAN_ROOM_CLEAN_ROOTS` as the working directory. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots. Write only under `CLEAN_ROOM_CLEAN_ROOTS`. Do not write code. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+ Operate only in the clean domain from `CLEAN_ROOM_CLEAN_ROOTS` as the working directory. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots. Write only under `CLEAN_ROOM_CLEAN_ROOTS`. Do not write code. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-architect`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.

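The pre-tool-use environment check described in this hunk could be sketched roughly as follows. This is only an illustration and not code from the package: the helper name `missing_clean_room_env` is hypothetical, and only the variable names are taken from the agent text above.

```python
# Variable names come from the agent instructions above; the helper itself is
# a hypothetical illustration and is not part of the package.
REQUIRED_ENV = (
    "CLEAN_ROOM_ROLE",
    "CLEAN_ROOM_CLEAN_ROOTS",
    "CLEAN_ROOM_IMPLEMENTATION_ROOTS",
    "CLEAN_ROOM_SOURCE_ROOTS",
    "CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS",
    "CLEAN_ROOM_ALLOWED_READ_ROOTS",
    "CLEAN_ROOM_SCHEMA_DIR",
)


def missing_clean_room_env(env, expected_role):
    """Return a list of problems; an empty list means the session may proceed."""
    # Unset and empty variables are both treated as missing.
    problems = [name for name in REQUIRED_ENV if not env.get(name)]
    role = env.get("CLEAN_ROOM_ROLE")
    if role and role != expected_role:
        problems.append(f"CLEAN_ROOM_ROLE is {role!r}, expected {expected_role!r}")
    return problems
```

A role session could call `missing_clean_room_env(os.environ, "clean-architect")` before its first tool use and stop if the returned list is non-empty.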
@@ -24,7 +24,7 @@ Before planning, verify:
  - approved `handoff-package.json` and approved behavior specs are present.
  - the implementation root is available through `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.

- Stop if only a full `task-manifest.json`, full `preflight-goal.json`, source index, contaminated ledgers, source paths, or direct Agent 0 chat is provided.
+ Stop if only a full `task-manifest.json`, full `preflight-goal.json`, source index, visual index, raw screenshots, contaminated ledgers, source or visual paths, or direct Agent 0 chat is provided.

  Responsibilities:

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob, Bash

  This is the explicit shell-capable Agent 3 variant. Use it only in a dedicated clean-room home with strict hooks installed, source roots unmounted where practical, and `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1` set deliberately.

- Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+ Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-qa-editor`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, `CLEAN_ROOM_SCHEMA_DIR`, and `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`. Treat missing environment as a stop condition.

@@ -28,10 +28,10 @@ Responsibilities:
  - Do not report progress, ask Agent 0 for guidance, or send partial findings while work remains in progress.
  - Review leakage risk using `LEAKAGE-RULES.md`.
  - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
- - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `implementation-report.json`.
- - Keep `qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
+ - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+ - Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
  - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
- - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
+ - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
  - Edit clean wording for clarity without adding new source facts.

  If contamination is found, mark the artifact quarantined and require regeneration from the contaminated side.
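The rule that reports are written only under `CLEAN_ROOM_CLEAN_ROOTS` amounts to a path containment check. A minimal sketch follows, under assumptions: the helper name `is_under_any_root` is hypothetical, and joining multiple roots with `os.pathsep` is a guess, since the diff does not show how the package encodes a list of roots.

```python
import os
from pathlib import Path


def is_under_any_root(candidate, roots_value):
    """Return True if candidate resolves inside one of the roots in roots_value.

    Assumption: multiple roots are joined with os.pathsep. The package may
    encode them differently.
    """
    # resolve() normalizes ".." segments and follows symlinks where they exist,
    # so a path that escapes the root lexically is rejected.
    target = Path(candidate).resolve()
    for raw in roots_value.split(os.pathsep):
        if not raw:
            continue
        root = Path(raw).resolve()
        if target == root or root in target.parents:
            return True
    return False
```

A write hook could refuse any report path for which `is_under_any_root(path, os.environ["CLEAN_ROOM_CLEAN_ROOTS"])` is false.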
@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob

  This role is Agent 4 in the clean-room pipeline.

- Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, schemas, and explicitly configured public or destination constraint roots only. Write `polish-report.json` and clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write implementation code, tests, docs, `AGENTS.md`, `.gitignore`, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, the full `task-manifest.json`, the full `preflight-goal.json`, or `source-index.json`.
+ Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, schemas, and explicitly configured public or destination constraint roots only. Write `polish-report.json` and clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write implementation code, tests, docs, `AGENTS.md`, `.gitignore`, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, contaminated ledgers, contaminated chat history, the full `task-manifest.json`, the full `preflight-goal.json`, `source-index.json`, or `visual-index.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-polish-reviewer`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.

@@ -24,7 +24,7 @@ Before editing code, verify:
  - Agent 3 reached a terminal implementation state.
  - Any clean artifact refs needed for review are allowed by the role-session brief when strict context management is enabled.

- Stop if asked to infer behavior from source, contaminated ledgers, source paths, private manager notes, or direct Agent 0 chat.
+ Stop if asked to infer behavior from source, screenshots, contaminated ledgers, source or visual paths, private manager notes, or direct Agent 0 chat.

  Responsibilities:

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob

  This role is Agent 3 in the clean-room pipeline.

- Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+ Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.

  Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-qa-editor`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.

@@ -24,7 +24,7 @@ Before editing code, verify:
  - both artifacts carry the preflight-derived `code_hygiene_policy`.
  - work items target only the selected spec slice and current unit in unattended mode.

- Stop if asked to infer product goals from source, full `task-manifest.json`, full `preflight-goal.json`, contaminated ledgers, source paths, or direct Agent 0 chat.
+ Stop if asked to infer product goals from source, screenshots, full `task-manifest.json`, full `preflight-goal.json`, contaminated ledgers, source or visual paths, or direct Agent 0 chat.

  Responsibilities:

@@ -40,18 +40,21 @@ Responsibilities:
  - Follow destination project conventions discovered from clean implementation files; do not import source-derived structure, names, comments, or pseudocode.
  - Add or update tests required by the implementation plan.
  - Record planned verification commands as argv arrays. Run them only through the installed Agent 3 verification runner. When container metadata is present, use only `network: "off"` and `dependency_mode: "offline"` or `"locked"` unless a later policy explicitly expands this.
+ - Passing unit tests is not sufficient for completion when the selected slice includes CLI startup, binary packaging, terminal UI, interactive input, streaming display, command dispatch, protocol behavior, or public output compatibility. Verify the user-observable path or mark the gap in `qc-report.json`.
+ - For CLI or binary targets, verify that the destination actually exposes a runnable target. Record a target discovery check such as `cargo metadata` plus a representative runnable command such as `cargo run -- --help`, or an equivalent stack-native command from the implementation plan.
+ - For TUI or interactive behavior, run at least one smoke-level rendering or interaction check through the approved verification runner. If the runner cannot exercise the TUI, record coverage as partial and return an abstract delta ticket instead of reporting completion.
  - In unattended inner-loop mode, execute only work items that belong to the selected spec slice and current clean-room unit.
  - If the plan expands beyond that slice or cannot complete in one fresh clean implementation context, mark the unit blocked with `spec-delta-required` or `split-required`.
  - Loop over selected-slice work items until they are complete, blocked, or quarantined.
  - Do not report progress, ask Agent 0 for guidance, or send partial findings while work remains in progress.
  - Review leakage risk using `LEAKAGE-RULES.md`.
  - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
- - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `implementation-report.json`.
- - Keep `qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
- - Record architecture alignment in `qc-report.json`. Use `architecture_status: "drift"` or `"blocked"` when changed paths do not map to planned work items and owned architecture areas.
+ - Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+ - Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
+ - Record architecture alignment in `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`. Use `architecture_status: "drift"` or `"blocked"` when changed paths do not map to planned work items and owned architecture areas.
  - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
  - Require invariant-level tests for compatibility-critical behavior. Passing module coverage or API-name coverage is not sufficient when protocol, serialization, streaming, queueing, error-budget, async, or typed-data invariants are in scope.
- - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
+ - Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
  - Edit clean wording for clarity without adding new source facts.

  If contamination is found, mark the artifact quarantined and require regeneration from the contaminated side.
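The verification-command policy in this hunk (argv arrays, `network: "off"`, `dependency_mode` of `"offline"` or `"locked"`) might be checked along these lines. The record shape is an assumption made for illustration: the `argv` and `container` keys and the helper name `check_verification_command` do not appear in the diff; only the `network` and `dependency_mode` values do.

```python
ALLOWED_DEPENDENCY_MODES = {"offline", "locked"}


def check_verification_command(record):
    """Return a list of policy violations for one planned verification command.

    Assumed (hypothetical) record shape:
    {"argv": [...], "container": {"network": ..., "dependency_mode": ...}}
    """
    errors = []
    argv = record.get("argv")
    # Commands are recorded as argv arrays, never as a single shell string.
    if not isinstance(argv, list) or not argv or not all(isinstance(a, str) for a in argv):
        errors.append("argv must be a non-empty list of strings")
    container = record.get("container")
    # Container rules apply only when container metadata is present.
    if container is not None:
        if container.get("network") != "off":
            errors.append('container network must be "off"')
        if container.get("dependency_mode") not in ALLOWED_DEPENDENCY_MODES:
            errors.append('dependency_mode must be "offline" or "locked"')
    return errors
```

For example, a record whose `argv` is `["cargo", "run", "--", "--help"]` with no container metadata would pass, while a shell string such as `"cargo test"` would be rejected.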
@@ -24,17 +24,17 @@ Before reviewing drafts, verify that Agent 0 provided:
  - public compatibility allowlist, if public names are retained
  - `CLEAN_ROOM_SESSION_BRIEF_PATH`, when context management is enabled

- Stop if given source roots, `source-index.json`, evidence ledgers, private identifier lists, full `preflight-goal.json`, full `task-manifest.json`, raw diffs, source excerpts, or Agent 1 source-reading chat history.
+ Stop if given source roots, visual roots, `source-index.json`, `visual-index.json`, raw screenshots, evidence ledgers, private identifier lists, full `preflight-goal.json`, full `task-manifest.json`, raw diffs, source excerpts, or Agent 1 source-reading chat history.

  Responsibilities:

  - Work only from Agent 0's neutral sanitizer brief and assigned draft artifact paths.
- - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the brief's allowed artifact refs. Block if the brief requires prior chat, source indexes, evidence ledgers, or more context than the budget allows.
- - Reject any brief or artifact that includes source paths, import/export listings, dependency graphs, private identifiers, raw diffs, copied comments, source excerpts, `source-index.json` contents, or source-shaped pseudocode.
+ - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the brief's allowed artifact refs. Block if the brief requires prior chat, source indexes, visual indexes, raw screenshots, evidence ledgers, or more context than the budget allows.
+ - Reject any brief or artifact that includes source paths, visual paths, image hashes, import/export listings, dependency graphs, private identifiers, raw diffs, copied comments, copied visible words, source excerpts, raw screenshots, `source-index.json` contents, `visual-index.json` contents, exact UI palettes/layouts/iconography, or source-shaped pseudocode.
  - Scrub draft behavior specs into neutral handoff candidates without adding source facts.
  - Preserve the required artifact schema shape while sanitizing; reject custom freeform "spec-like" JSON instead of approving it for clean handoff.
  - Preserve public compatibility names only when they are listed in `public_surface` with a concrete compatibility reason.
  - Record `leakage_review.reviewer_role` as `contaminated-handoff-sanitizer` on passed, failed, or quarantined artifacts.
  - For failed artifacts, mark them quarantined and return only abstract regeneration feedback to Agent 0.

- Never read source roots, clean roots, implementation roots, `source-index.json`, contaminated evidence ledgers, or contaminated source-analysis chat history.
+ Never read source roots, visual roots, clean roots, implementation roots, `source-index.json`, `visual-index.json`, raw screenshots, contaminated evidence ledgers, or contaminated source-analysis chat history.
@@ -30,20 +30,24 @@ Responsibilities:
  - Act as agent zero/controller when no separate coordinator exists: define and pass the clean-room environment block to every role session before tool use.
  - When context management is enabled, maintain `controller-status.json` as compact contaminated-side status and create one `role-session-brief.json` per role launch. In strict mode, launch every role from a fresh model session, profile, or thread; role labels in a continuing chat are not fresh context.
  - Consume contaminated `source-index.json` when controller preflight produced one.
- - Split source scope into the durable tasklist as bounded `task-manifest.json` units with neutral ids that do not mirror private source layout. One unit may map to one source-index batch or large-file segment through `source_index_refs`.
+ - When no indexable source code exists and screenshots/images are the authorized evidence, consume contaminated `visual-index.json` as fallback input only. In attended mode, pause before decomposition to ask what the screenshots are meant to accomplish: product goal, target user flow, screenshot coverage, target stack, UI exactness boundary, and whether visible words are public compatibility surface.
+ - Split source scope into the durable tasklist as bounded `task-manifest.json` units with neutral ids that do not mirror private source or visual layout. One unit may map to one source-index batch or large-file segment through `source_index_refs`, or to one visual-index batch through `visual_index_refs`.
  - Maintain `coverage-ledger.json` and `evidence-ledger.json` in the contaminated artifact workspace.
  - Maintain a private identifier denylist for hook scanning when practical; never send the denylist contents to Agent 1.5, clean roles, or clean artifacts.
  - Provide Agent 1.5 only a neutral sanitizer brief with domain purpose, target profile, unit intent, public compatibility allowlist, and blocked categories.
  - Send Agent 1 draft specs to Agent 1.5 for independent source-denied sanitization before clean handoff.
  - Compare clean artifacts and terminal implementation or polish reports against source behavior, discovered source tests, equal-output requirements, and public API/schema compatibility for coverage gaps.
+ - Do not mark a unit complete from summaries, claimed test counts, or progress prose alone. Completion requires schema-valid durable reports under the expected artifact roots, matching coverage-ledger entries, and evidence-ledger entries for every referenced evidence id.
+ - Source-backed units with `source_index_refs` or `visual_index_refs` must have durable source/evidence coverage before `coverage_state: "covered"`. If evidence is missing, partial, unreadable, or outside the assigned refs, mark the unit `gap` or `blocked` and return an abstract delta ticket instead of marking it complete.
+ - For full-parity runs, do not defer TUI, command, CLI, protocol, streaming, MCP, tool, public error, or config behavior while reporting completion. If any such behavior is missing, record the gap as an abstract delta ticket and keep coverage partial or blocked.
  - Reject `complete` when source-test-derived parity, protocol invariants, public-contract tests, or approved behavior-spec open questions remain unresolved. Convert the gap into abstract delta tickets for a fresh clean cycle.
  - Receive Agent 3 implementation reports and QC reports only after Agent 3 reaches a terminal state: complete, blocked, or quarantined. Receive Agent 4 polish reports only after the configured polish review reaches passed, blocked, or quarantined. Do not consume partial clean-role reports as controller feedback.
  - Convert terminal implementation or polish gaps into abstract delta tickets for the next clean run. Do not steer an in-progress Agent 3 or Agent 4 loop.
- - Send only `clean-run-context.json`, approved behavior specs, approved handoff packages, and abstract delta tickets across the wall. Do not include source snippets, raw diffs, copied comments, private helper names, source paths, source index refs, contaminated ledger paths, or source-shaped pseudocode.
+ - Send only `clean-run-context.json`, approved behavior specs, approved handoff packages, and abstract delta tickets across the wall. Do not include source snippets, raw diffs, copied comments, raw screenshots, copied visible words, private helper names, source or visual paths, source index refs, visual index refs, contaminated ledger paths, or source-shaped pseudocode.

  Use this file map when a CLI bootstrap is present:

- - Contaminated artifact root: write `preflight-goal.json`, `init-config.json`, `task-manifest.json`, `source-index.json`, `coverage-ledger.json`, `evidence-ledger.json`, private identifier denylist artifacts, and `clean-room-result.json`.
+ - Contaminated artifact root: write `preflight-goal.json`, `init-config.json`, `task-manifest.json`, `source-index.json`, `visual-index.json`, `coverage-ledger.json`, `evidence-ledger.json`, private identifier denylist artifacts, and `clean-room-result.json` only under `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
  - Clean artifact root: only sanitized handoff artifacts, `clean-run-context.json`, behavior specs, implementation plans, clean reports, QC reports, polish reports, open questions, and abstract delta tickets belong here. Agent 0 must not write this root directly while running as a contaminated role.
  - Implementation root: Agent 3 writes destination code, tests, fixtures, and destination project files here. Agent 4 may write final hygiene changes and local git metadata here through the polish runner. Agent 0 must not write this root.
  - Quarantine root: rejected, contaminated, or incident artifacts that must not cross into the clean domain.
@@ -60,6 +64,8 @@ Do not grant shell-style tools to Agent 0, Agent 1, Agent 1.5, Agent 2, or the d

  If a multi-file scope needs relationship-aware batching and `source-index.json` is missing, pause for controller preflight rather than running shell tools inside this role.

+ If a visual fallback scope needs screenshot/image batching and `visual-index.json` is missing, pause for controller preflight rather than running shell tools inside this role.
+
  Stop if clean roles received contaminated material. Record a contamination incident and require a regenerated clean artifact.

- Stop if Agent 1.5 receives source roots, source-index contents, contaminated evidence ledger contents, private identifier lists, raw diffs, source excerpts, or Agent 1 source-reading chat history. Record a contamination incident and start Agent 1.5 again from a fresh context with a neutral brief.
+ Stop if Agent 1.5 receives source roots, source-index contents, visual-index contents, raw screenshots, visual paths, image hashes, copied visible words, exact UI palettes/layouts/iconography, contaminated evidence ledger contents, private identifier lists, raw diffs, source excerpts, or Agent 1 source-reading chat history. Record a contamination incident and start Agent 1.5 again from a fresh context with a neutral brief.
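The completion rule added for Agent 0 in the hunks above (durable schema-valid reports, plus ledger entries for every referenced evidence id, before `coverage_state: "covered"`) can be illustrated with a small decision function. This is a sketch under stated assumptions: the unit keys `evidence_refs` and `assigned_refs`, the helper name, and the choice of which failure maps to `"gap"` versus `"blocked"` are all hypothetical; the diff only says the unit must be marked `gap` or `blocked`.

```python
def coverage_state(unit, ledger_evidence_ids, reports_valid):
    """Decide a unit's coverage state under the completion rule quoted above.

    unit: dict with hypothetical keys "evidence_refs" (ids the unit cites)
          and "assigned_refs" (ids the unit is allowed to cite).
    ledger_evidence_ids: set of ids present in the evidence ledger.
    reports_valid: True only if durable reports exist and pass schema validation.
    """
    cited = set(unit.get("evidence_refs", []))
    assigned = set(unit.get("assigned_refs", []))
    # Summaries or claimed test counts are never enough without durable reports.
    if not reports_valid:
        return "blocked"
    # Evidence that is missing, absent from the ledger, or outside the
    # assigned refs leaves a gap (assumed mapping, see lead-in).
    if not cited or not cited <= ledger_evidence_ids or not cited <= assigned:
        return "gap"
    return "covered"
```

Any result other than `"covered"` would then be turned into an abstract delta ticket rather than a completion report.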
@@ -1,7 +1,7 @@
  ---
  name: contaminated-source-analyst
  description: Reads authorized source in a contaminated workspace and produces neutral draft task slices plus behavioral specs with evidence references, not replacement code.
- tools: Read, Write, Edit, Glob, Grep
+ tools: Read, Write, Edit, Glob, Grep, view_image
  ---

  # Contaminated Source Analyst
@@ -19,6 +19,7 @@ Before reading source, verify that Agent 0 provided:
  - active `task-manifest.json` with `preflight_goal_ref` and `preflight_goal_sha256`
  - one assigned `unit_id`
  - authorized `source_index_refs`, when used
+ - authorized `visual_index_refs`, when visual fallback is used
  - evidence handling policy
  - target stack and compatibility policy from preflight
  - neutral sanitizer brief requirements
@@ -31,8 +32,10 @@ Responsibilities:
31
32
  - Read the minimum source needed for the assigned unit.
32
33
  - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the allowed artifact refs named there, except for direct source reads already permitted by the assigned unit and role policy.
33
34
  - When the unit has `source_index_refs`, stay within the referenced batch unless Agent 0 explicitly assigns a related gap.
35
+ - When the unit has `visual_index_refs`, use `view_image` only in this contaminated role and stay within the referenced visual batch unless Agent 0 explicitly assigns a related gap.
34
36
  - Generate neutral draft task slices and behavioral spec material for Agent 0-controlled units.
35
37
  - Write neutral behavioral requirements covering inputs, outputs, state transitions, edge cases, error conditions, invariants, and tests.
38
+ - For visual fallback units, write UI behavior/spec claims about intent, screen states, hierarchy, accessibility expectations, interaction purpose, and broad style goals. Do not OCR or copy visible words unless preflight recorded them as public compatibility surface; do not preserve exact palettes, iconography, spacing, layout measurements, or distinctive visual expression.
  - Treat discovered source tests as behavioral evidence and convert them into clean `test_scenarios` that validate the same observable outputs.
  - Record equal-output expectations for public return values, serialized data, CLI or API responses, errors, state changes, ordering, and compatibility-relevant side effects.
  - Use `evidence_refs` that point to contaminated-side ledger entries instead of including source text.
@@ -43,6 +46,6 @@ Responsibilities:
  - Treat package, namespace, module, class, function, method, variable, constant, field, and internal event names as private identifiers unless they are public compatibility surface.
  - Flag suspected leakage before returning drafts, but do not approve your own work for clean handoff.
 
- Never produce implementation code, copied comments, source excerpts, raw diffs, source test names, fixture structure, private helper names, or source-shaped pseudocode.
+ Never produce implementation code, copied comments, source excerpts, raw diffs, raw screenshots, visual paths, image hashes, copied visible text, exact UI palettes/layouts/iconography, source test names, fixture structure, private helper names, or source-shaped pseudocode.
 
  Agent 1.5 owns independent sanitization and leakage pass/fail review from a fresh source-denied context.
package/bin/verify.sh CHANGED
@@ -49,6 +49,7 @@ echo "Compiling Python hooks and scripts..."
 
  echo "Smoke testing source index CLI..."
  "$python_cmd" skills/clean-room/scripts/build_source_index.py --help >/dev/null
+ "$python_cmd" skills/clean-room/scripts/build_visual_index.py --help >/dev/null
 
  echo "Validating example schemas..."
  for dir in skills/clean-room/examples/minimal-spec-package skills/clean-room/examples/contaminated-side; do
package/docs/ARCHITECTURE.md CHANGED
@@ -19,7 +19,7 @@ The Clean Room workflow acts as an engineering risk-reduction process by establi
  To maintain compliance and mitigate leakage risks, the workflow utilizes strictly separated workspaces, worktrees, repositories, or profiles for contaminated and clean work:
 
  * **Contaminated Source Workspace**: Source-readable, read-only where practical. Contains the codebase under analysis.
- * **Contaminated Artifact Workspace**: Holds intermediate outputs like preflight goals, init configs, source indexes, task manifests, controller status, coverage ledgers, evidence ledgers, draft specs, contaminated-role session briefs, and abstract delta tickets. Configure via `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
+ * **Contaminated Artifact Workspace**: Holds intermediate outputs like preflight goals, init configs, source indexes, visual indexes, task manifests, controller status, coverage ledgers, evidence ledgers, draft specs, contaminated-role session briefs, and abstract delta tickets. Configure via `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
  * **Clean Artifact Workspace**: Houses sanitized clean run contexts, approved behavioral specifications, handoff packages, clean-role session briefs, skeleton manifests, implementation plans, implementation reports, QC reports, and test plans. Configure via `CLEAN_ROOM_CLEAN_ROOTS`.
  * **Clean Implementation Workspace**: Houses clean destination code and tests. Configure via `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
  * **Clean Allowed Reference Workspace**: Public documentation, specifications, or destination constraints explicitly approved for clean and source-denied role reads. Configure via `CLEAN_ROOM_ALLOWED_READ_ROOTS`.
@@ -41,17 +41,17 @@ The initialization wizard and `require-clean-room-env.py` audit clean, implement
 
  ![Stage 0 Goal Contract](assets/3.png)
 
- Every new run starts with `preflight-goal.json` before source discovery, source indexing, Agent 0 decomposition, attended execution, or unattended execution. The contract records end goal, target stack, license policy, dependency policy, compatibility exactness, feature changes, code hygiene, output policy, controller mode, and open questions.
+ Every new run starts with `preflight-goal.json` before source discovery, source indexing, visual indexing, Agent 0 decomposition, attended execution, or unattended execution. The contract records end goal, target stack, license policy, dependency policy, compatibility exactness, feature changes, code hygiene, output policy, controller mode, and open questions.
 
  `preflight-goal.json` is controller/contaminated-side only. Clean roles receive only clean-safe `goal_contract` fields and `code_hygiene_policy` through `clean-run-context.json`.
 
  ### Contaminated Source-Index Preflight Tooling
 
- To assist in logical unit decomposition, the workflow supports an optional source-index preflight stage using `build_source_index.py` and `clean_room_tool_manager.py`.
+ To assist in logical unit decomposition, the workflow supports an optional source-index preflight stage using `build_source_index.py` and `clean_room_tool_manager.py`. When no indexable source code exists and screenshots/images are the authorized evidence, the workflow supports a fallback visual-index preflight stage using `build_visual_index.py`.
 
  * **Execution Boundary**: This tooling runs exclusively in the contaminated domain before clean-room role sessions are initialized.
  * **Traversal Bounds**: Source indexing enforces file count, per-file byte, total byte, batch token, and segment caps. It validates file size again after reading, skips files that change during read, records directory walk errors, and prunes traversal after global limits are exhausted with an aggregate skipped entry.
- * **Agent 0 Use**: Agent 0 consumes `source-index.json` only to create neutral `task-manifest.json` units and per-unit `source_index_refs`. The index stays contaminated-only and does not cross to Agent 1.5, Agent 2, Agent 3, Agent 4, or clean handoff packages.
+ * **Agent 0 Use**: Agent 0 consumes `source-index.json` only to create neutral `task-manifest.json` units and per-unit `source_index_refs`. In visual fallback runs, Agent 0 consumes `visual-index.json` only to create neutral units and per-unit `visual_index_refs`. Both indexes stay contaminated-only and do not cross to Agent 1.5, Agent 2, Agent 3, Agent 4, or clean handoff packages.
  * **Tool Trust Policy**: By default, tool discovery operates in `stat-only` mode and does not execute third-party binaries. It queries version strings only when explicitly invoked with `--probe-tools`. Tools discovered under `/opt/homebrew` or `/usr/local` remain stat-only unless `--allow-user-toolchain-probes` is also supplied. Project-local directories (such as `.bin` or `node_modules/.bin`) are ignored unless the environment variable `RE_SKILLS_TRUST_PROJECT_TOOLS=1` or the flag `--allow-working-project-tools` is supplied.
  * **Local Tool Install Safety**: Explicit npm-backed helper installs are strict-version pinned and serialized with a cache-local lock before mutating `~/.cache/re-skills/clean-room-tools/npm`. Prefix creation failures, subprocess timeouts, and subprocess launch errors are returned as structured JSON facts instead of raw tracebacks.
 
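The traversal bounds described in this hunk (a per-file byte cap, size re-validation after reading, and skipping files that change during the read) can be illustrated with a minimal Python sketch. This is not the code in `build_source_index.py` or `build_visual_index.py`; the function name `read_bounded`, the `ReadResult` type, and the skip-reason strings are hypothetical and chosen only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadResult:
    data: Optional[bytes]           # file bytes, or None when the file was skipped
    skipped_reason: Optional[str]   # hypothetical reason label, None on success


def read_bounded(path: str, max_bytes: int) -> ReadResult:
    """Read a file only if it fits the cap and did not change during the read."""
    try:
        before = os.stat(path)
    except OSError:
        return ReadResult(None, "stat-failed")
    if before.st_size > max_bytes:
        return ReadResult(None, "over-byte-cap")
    try:
        with open(path, "rb") as handle:
            # Read one byte past the cap so growth after the first stat is detectable.
            data = handle.read(max_bytes + 1)
        after = os.stat(path)
    except OSError:
        return ReadResult(None, "read-failed")
    if len(data) > max_bytes:
        return ReadResult(None, "over-byte-cap")
    # Re-validate: size and mtime must match the pre-read stat, and the
    # number of bytes actually read must match the reported size.
    unchanged = (after.st_size, after.st_mtime_ns) == (before.st_size, before.st_mtime_ns)
    if not unchanged or len(data) != after.st_size:
        return ReadResult(None, "changed-during-read")
    return ReadResult(data, None)
```

A caller would typically append the skipped reason to a list such as `skipped_entries` and keep a running total of bytes read, so that the total-byte cap can stop traversal early.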
@@ -78,7 +78,7 @@ flowchart LR
  sanitizer["Agent 1.5: contaminated-handoff-sanitizer<br/>Source-denied, scrub identifying material"]
  brief["Neutral sanitizer brief<br/>domain, target profile, unit intent,<br/>public allowlist, blocked categories"]
  preflight["preflight-goal.json<br/>goal, stack, policy, hygiene"]
- ledgers["Contaminated artifacts<br/>CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS<br/>init-config.json<br/>source-index.json<br/>task-manifest.json<br/>coverage-ledger.json<br/>evidence-ledger.json"]
+ ledgers["Contaminated artifacts<br/>CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS<br/>init-config.json<br/>source-index.json<br/>visual-index.json<br/>task-manifest.json<br/>coverage-ledger.json<br/>evidence-ledger.json"]
  drafts["Agent 1 draft specs<br/>assigned paths only for Agent 1.5"]
  staged["Sanitized handoff candidates<br/>Agent 1.5-reviewed behavior-spec.json"]
  end
@@ -95,8 +95,8 @@ flowchart LR
  architect["Agent 2: clean-architect<br/>Plan implementation from clean specs and foundation"]
  qa["Agent 3: clean-qa-editor<br/>Implement, record verification, terminal report"]
  polish["Agent 4: clean-polish-reviewer<br/>Final code polish, repo hygiene, local commit"]
- outputs["Clean artifacts<br/>implementation-plan.json<br/>qc-report.json<br/>test plan notes"]
- imploutputs["Implementation outputs<br/>code, tests, AGENTS.md, .gitignore<br/>implementation-report.json<br/>polish-report.json"]
+ outputs["Clean artifacts<br/>implementation-plan.json<br/>implementation-report.json<br/>qc-report.json<br/>polish-report.json<br/>test plan notes"]
+ imploutputs["Implementation outputs<br/>code, tests, fixtures, AGENTS.md, .gitignore"]
  end
 
  subgraph guardrails["Guardrails and audit"]
@@ -140,10 +140,10 @@ flowchart LR
  env -. required for every role session .-> architect
  denyread -. clean and source-denied roles cannot read source roots .-> cleanroots
  denyread -. clean roles may read implementation roots .-> implroots
- denyread -. Agent 1.5 cannot read source roots, clean roots, implementation roots, source-index.json, or preflight-goal.json .-> sanitizer
+ denyread -. Agent 1.5 cannot read source/visual roots, clean roots, implementation roots, source-index.json, visual-index.json, or preflight-goal.json .-> sanitizer
  denywrite -. contaminated writes only to contaminated artifact roots .-> ledgers
- denywrite -. Agent 2 writes clean artifacts only; Agents 3 and 4 write implementation roots .-> cleanroots
- denywrite -. Agents 3 and 4 write code, tests, docs, and repo hygiene only here .-> implroots
+ denywrite -. Agent 2 writes clean artifacts; Agents 3 and 4 write clean reports .-> cleanroots
+ denywrite -. Agents 3 and 4 write destination files only here; no clean-room artifact JSON .-> implroots
  denyshell -. no shell-style tools in role sessions .-> manager
  denyshell -. no shell for Agent 2; explicit Agent 3 and Agent 4 runners only .-> architect
  scan -. post-write checks .-> outputs
@@ -223,7 +223,7 @@ The architecture delegates work across six distinct custom role agents to enforc
  * Records code hygiene violations as `code-hygiene` findings in `qc-report.json`.
  * Writes code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
  * Runs bounded verification only through the installed Agent 3 verification runner, with `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`, strict hooks, and cwd under implementation roots.
- * Writes `implementation-report.json` and maintains `qc-report.json`.
+ * Writes `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` and maintains `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`.
  * Does not report progress or ask Agent 0 for guidance during implementation.
  * Emits one terminal report for Agent 0 only when the assigned spec slice is complete, blocked, or quarantined.
 
@@ -234,7 +234,7 @@ The architecture delegates work across six distinct custom role agents to enforc
  * Reviews final clean implementation for security, docs/comments, exception handling, resource leaks, race conditions, missing tests, and repository hygiene.
  * Creates or updates implementation-root `AGENTS.md` with gotchas and build/test/dev commands discovered from clean implementation files.
  * Updates `.gitignore` only for real generated outputs, dependencies, caches, or build/test artifacts.
- * Writes `polish-report.json`.
+ * Writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`.
  * Uses `agent4-polish-runner.py` only with `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1`, cwd under implementation roots, and strict hooks.
  * May initialize git and create one local commit containing only paths listed in `polish-report.json`; it must not push, tag, reset, clean, or delete branches.
 
@@ -258,7 +258,7 @@ Agent 3's terminal report is not enough to return. If configured, Agent 4 must p
  * Records controller memory in contaminated-side `controller-run-ledger.json`.
  * Writes `clean-room-result.json` before returning to the outer spec loop.
 
- Progress is durable-artifact based. `clean-room-skill run` compares semantic JSON artifact hashes that ignore volatile timestamp and artifact-hash fields, plus raw file hashes under implementation roots. Chat output, timestamp-only artifact churn, and `controller-status.json` updates alone do not count as progress.
+ Progress is durable-artifact based. `clean-room-skill run` compares semantic JSON artifact hashes that ignore volatile timestamp and artifact-hash fields, plus raw file hashes under implementation roots while ignoring generated directories such as `target/`. Chat output, timestamp-only artifact churn, Cargo build metadata, and `controller-status.json` updates alone do not count as progress.
 
  ---
 
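The progress rule in this hunk compares semantic JSON artifact hashes that ignore volatile timestamp and artifact-hash fields. A rough Python sketch of that idea follows. The key names in `VOLATILE_KEYS` are assumptions made for illustration; the actual field list used by `clean-room-skill run` is not shown in this diff.

```python
import hashlib
import json
from typing import Any

# Hypothetical volatile field names; the real list lives in the runner.
VOLATILE_KEYS = frozenset({"updated_at", "created_at", "artifact_sha256"})


def strip_volatile(value: Any) -> Any:
    """Recursively drop volatile keys from dicts, descending into lists."""
    if isinstance(value, dict):
        return {k: strip_volatile(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [strip_volatile(v) for v in value]
    return value


def semantic_hash(document: Any) -> str:
    """Hash a canonical JSON encoding so key order and volatile fields do not matter."""
    canonical = json.dumps(strip_volatile(document), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this scheme, two snapshots of an artifact that differ only in a timestamp hash identically, so timestamp-only churn does not register as progress, while a changed status field does.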
@@ -270,7 +270,7 @@ Every clean-room role session requires a populated environment block before any
  * `CLEAN_ROOM_SOURCE_ROOTS`: Source roots (only readable by source-reading contaminated roles, not Agent 1.5).
  * `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`: Target write directory for contaminated roles.
  * `CLEAN_ROOM_CLEAN_ROOTS`: Target write directory for clean artifacts and reports.
- * `CLEAN_ROOM_IMPLEMENTATION_ROOTS`: Target write directory for Agent 3 clean implementation code and tests, plus Agent 4 implementation-root hygiene changes and local git metadata.
+ * `CLEAN_ROOM_IMPLEMENTATION_ROOTS`: Target write directory for Agent 3 clean implementation code, tests, fixtures, real destination project files, plus Agent 4 implementation-root hygiene changes and local git metadata. Clean-room artifact JSON files stay out of this root.
  * `CLEAN_ROOM_ALLOWED_READ_ROOTS`: Approved reference docs or constraints readable by clean and source-denied roles.
  * `CLEAN_ROOM_SCHEMA_DIR`: Path to the directory containing JSON schema assets.
 
@@ -293,11 +293,11 @@ Post-write hook failures are deny-by-default and redacted. If an artifact disapp
  * [agent4-polish-runner.py](../hooks/agent4-polish-runner.py): Runs Agent 4 bounded status, verification, git init, staging, and one local commit from implementation roots only, using paths and policy recorded in `polish-report.json`.
  * [require-clean-room-env.py](../hooks/require-clean-room-env.py): Fails closed if the required role and root environment variables are missing, if trust-domain roots overlap, or if clean, implementation, or contaminated artifact root names appear source-derived.
  * [deny-clean-room-shell.py](../hooks/deny-clean-room-shell.py): Denies shell-style tool execution inside clean-room role sessions except installed Agent 3 verification-runner invocations under implementation roots and installed Agent 4 polish-runner invocations under implementation roots.
- * [deny-clean-source-read.py](../hooks/deny-clean-source-read.py): Enforces that clean roles and Agent 1.5 cannot read source roots or unapproved paths; clean roles may read implementation roots, and source-denied roles are denied direct `preflight-goal.json` reads. Agent 1.5 is also denied clean roots, implementation roots, and direct `source-index.json` reads.
- * [deny-contaminated-clean-write.py](../hooks/deny-contaminated-clean-write.py): Enforces role write roots. Agent 2 writes clean artifacts only, Agent 3 writes implementation files and clean reports, Agent 4 writes clean polish reports and implementation-root polish changes, and contaminated roles write only to `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
+ * [deny-clean-source-read.py](../hooks/deny-clean-source-read.py): Enforces that clean roles and Agent 1.5 cannot read source or visual roots or unapproved paths; clean roles may read implementation roots, and source-denied roles are denied direct `preflight-goal.json` reads. Agent 1.5 is also denied clean roots, implementation roots, and direct `source-index.json` or `visual-index.json` reads.
+ * [deny-contaminated-clean-write.py](../hooks/deny-contaminated-clean-write.py): Enforces role write roots. Agent 2 writes clean artifacts only, Agent 3 writes implementation files and clean reports, Agent 4 writes clean polish reports and implementation-root polish changes, contaminated roles write only to `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, and clean-room artifact JSON files are denied under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
  * [check-artifact-leakage.py](../hooks/check-artifact-leakage.py): Scans clean artifacts and Agent 1.5 staged contaminated artifacts for high-risk leakage markers, source-like identifiers, and private identifier denylist terms. The private identifier denylist (loaded via `CLEAN_ROOM_PRIVATE_IDENTIFIER_DENYLIST`) is subject to hard limits to protect hook execution performance: a maximum of 1,000,000 bytes per file, 20,000 total terms, and 512 characters per individual term.
  * [validate-json-schema.py](../hooks/validate-json-schema.py): Verifies JSON syntax and structural conformance against schemas under `CLEAN_ROOM_SCHEMA_DIR`, including controller-side `preflight-goal.schema.json` and `init-config.schema.json`. Under clean roots, any unrecognized JSON files that do not conform to canonical schemas will trigger a failure unless they are explicitly registered in the path-separated `CLEAN_ROOM_AUXILIARY_JSON_ALLOWLIST` environment variable.
- * [validate-handoff-package.py](../hooks/validate-handoff-package.py): Verifies that handoff packages stay within clean roots, do not reference contaminated paths, `task-manifest.json`, `preflight-goal.json`, or `source-index.json`, and match declared `sha256` checksums.
+ * [validate-handoff-package.py](../hooks/validate-handoff-package.py): Verifies that handoff packages stay within clean roots, do not reference contaminated paths, `task-manifest.json`, `preflight-goal.json`, `source-index.json`, or `visual-index.json`, and match declared `sha256` checksums.
 
  For detailed guidelines on the clean-room process, refer to:
  * [CONTROLLER-LOOP.md](../skills/clean-room/references/CONTROLLER-LOOP.md)
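The new write rule for `deny-contaminated-clean-write.py` in the hunk above (clean-room artifact JSON files are denied under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`) might look roughly like the following Python sketch. It is an illustration only, not the hook's implementation: the function name `deny_artifact_json_write` is hypothetical, and `ARTIFACT_JSON_NAMES` is an assumed subset built from artifact names mentioned in this diff.

```python
from pathlib import Path
from typing import Iterable

# Assumed subset of clean-room artifact file names; the hook's real list may differ.
ARTIFACT_JSON_NAMES = frozenset(
    {
        "implementation-plan.json",
        "implementation-report.json",
        "qc-report.json",
        "polish-report.json",
    }
)


def deny_artifact_json_write(target: str, implementation_roots: Iterable[str]) -> bool:
    """Return True when a write should be denied under this one rule.

    Only the artifact-JSON-under-implementation-root rule is modeled here;
    role checks and the other write-root rules are out of scope.
    """
    resolved = Path(target).resolve()
    if resolved.name not in ARTIFACT_JSON_NAMES:
        return False
    roots = [Path(root).resolve() for root in implementation_roots]
    # A path is inside a root when that root is one of its ancestors.
    return any(root in resolved.parents for root in roots)
```

Resolving both the target and the roots before comparing is a deliberate choice in this sketch: it keeps `..` segments and symlinked directories from slipping a report file into an implementation root.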
package/docs/REFERENCE.md CHANGED
@@ -310,7 +310,7 @@ Strict context-management adapter example:
  }
  ```
 
- Relative `context.brief_path` values resolve relative to the `agent-commands.json` directory. Contaminated phases must point to briefs under the contaminated artifact root. Clean phases must point to briefs under the clean artifact root. A clean-stage brief may reference allowed clean artifacts, implementation artifacts, and approved public references, but not source indexes, contaminated ledgers, full manifests, controller status, or prior chat state.
+ Relative `context.brief_path` values resolve relative to the `agent-commands.json` directory. Contaminated phases must point to briefs under the contaminated artifact root. Clean phases must point to briefs under the clean artifact root. A clean-stage brief may reference allowed clean artifacts, implementation artifacts, and approved public references, but not source indexes, visual indexes, raw screenshots, contaminated ledgers, full manifests, controller status, or prior chat state.
 
  The runner exports `CLEAN_ROOM_SESSION_BRIEF_PATH`, `CLEAN_ROOM_ROLE_SESSION_ID`, and `CLEAN_ROOM_FRESH_CONTEXT_REQUIRED=1` for strict stages. The adapter still owns the actual fresh-context behavior: it must open a new model session, profile, or thread for that stage. Setting `fresh_session` while reusing one long chat is not a clean-room boundary.
 
@@ -329,6 +329,7 @@ The runner exports `CLEAN_ROOM_SESSION_BRIEF_PATH`, `CLEAN_ROOM_ROLE_SESSION_ID`
  | `clean-room run` reports repeated unit selection | Same unit selected after a no-progress iteration | Resolve the blocker or update durable artifacts before retrying. |
  | Hook reports `could not read` or `could not stat` | Artifact disappeared, permissions changed, or path was replaced during validation | Restore readable artifact state and retry. |
  | `source-index.json` is missing files | Limits, unreadable directories, ignored directories, binary files, changed files, or outside-root symlinks | Inspect `skipped_entries` and adjust limits or permissions if omissions matter. |
+ | `visual-index.json` is missing screenshots | Limits, unsupported formats, unreadable directories, changed files, invalid image headers, or outside-root symlinks | Inspect `skipped_entries`, keep visual roots in the contaminated/source domain, and rerun `build_visual_index.py` only as fallback evidence preflight. |
 
  ## Local Verification
 
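The troubleshooting rows in this hunk advise inspecting `skipped_entries` when `source-index.json` or `visual-index.json` is missing files. A small Python sketch of such an inspection follows. The diff does not show the shape of `skipped_entries`, so the sketch assumes a list of objects carrying a `reason` string; that key and the function name `summarize_skips` are hypothetical.

```python
import json
from collections import Counter
from typing import Dict


def summarize_skips(index_text: str) -> Dict[str, int]:
    """Count skipped entries per reason in an index document.

    Assumes `skipped_entries` is a list of objects with a `reason` key;
    entries that do not fit that shape are counted as "unknown".
    """
    index = json.loads(index_text)
    entries = index.get("skipped_entries", [])
    counts = Counter(
        str(entry.get("reason", "unknown")) if isinstance(entry, dict) else "unknown"
        for entry in entries
    )
    return dict(counts)
```

Such a helper would be run only in the contaminated domain, since both indexes are contaminated-only artifacts.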
package/examples/codex/.codex/agents/clean-architect.toml CHANGED
@@ -10,11 +10,11 @@ Run only from the clean workspace.
  Before tool use, require CLEAN_ROOM_ROLE=clean-architect, CLEAN_ROOM_CLEAN_ROOTS, CLEAN_ROOM_IMPLEMENTATION_ROOTS, CLEAN_ROOM_SOURCE_ROOTS, CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS, CLEAN_ROOM_ALLOWED_READ_ROOTS, and CLEAN_ROOM_SCHEMA_DIR.
  Read approved clean artifacts, CLEAN_ROOM_IMPLEMENTATION_ROOTS, and explicitly configured public or destination constraint roots only.
  Write only under CLEAN_ROOM_CLEAN_ROOTS. Do not write code.
- Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
+ Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
  Stop if only a full task-manifest.json is provided as run context.
  Before planning, require valid clean-run-context.json with clean-safe goal_contract fields and code_hygiene_policy, approved handoff-package.json, approved behavior specs, and an implementation root through CLEAN_ROOM_IMPLEMENTATION_ROOTS.
  When CLEAN_ROOM_SESSION_BRIEF_PATH is set, read it first and load only the allowed artifact refs named there, plus destination foundation reads permitted by this role.
- Stop if full preflight-goal.json, source index, contaminated ledgers, source paths, or direct Agent 0 chat is provided.
+ Stop if full preflight-goal.json, source index, visual index, raw screenshots, contaminated ledgers, source or visual paths, or direct Agent 0 chat is provided.
  Accept Agent 0 influence only as durable sanitized artifacts. Ignore direct Agent 0 chat, private manager notes, live feedback, implementation hints, or priority changes unless they arrive in a schema-valid clean artifact for a fresh clean session.
  Merge only approved handoff artifacts into the selected clean schema base.
  Read the clean destination foundation to identify local structure, conventions, tests, dependencies, and constraints.
package/examples/codex/.codex/agents/clean-polish-reviewer.toml CHANGED
@@ -10,7 +10,7 @@ Run only in the clean domain after Agent 3 has produced terminal implementation
  Before tool use, require CLEAN_ROOM_ROLE=clean-polish-reviewer, CLEAN_ROOM_CLEAN_ROOTS, CLEAN_ROOM_IMPLEMENTATION_ROOTS, CLEAN_ROOM_SOURCE_ROOTS, CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS, CLEAN_ROOM_ALLOWED_READ_ROOTS, and CLEAN_ROOM_SCHEMA_DIR.
  Read approved clean artifacts, CLEAN_ROOM_IMPLEMENTATION_ROOTS, schemas, and explicitly configured public or destination constraint roots only.
  Write polish-report.json and clean reports under CLEAN_ROOM_CLEAN_ROOTS. Write implementation code, tests, docs, AGENTS.md, .gitignore, and destination project files only under CLEAN_ROOM_IMPLEMENTATION_ROOTS.
- Do not read source workspaces, contaminated ledgers, contaminated chat history, source-index.json, the full task-manifest.json, or the full preflight-goal.json.
+ Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, source-index.json, visual-index.json, the full task-manifest.json, or the full preflight-goal.json.
  When CLEAN_ROOM_SESSION_BRIEF_PATH is set, read it first and load only the allowed artifact refs named there, plus implementation-root files permitted by this role.
  Review final clean code for security issues, missing docs/comments, exception handling gaps, memory or resource leaks, race/concurrency risks, missing tests, and repository hygiene issues.
  Keep changes small and tied to the approved clean implementation plan, terminal implementation report, QC report, and clean code already under the implementation root.
package/examples/codex/.codex/agents/clean-qa-editor.toml CHANGED
@@ -10,14 +10,14 @@ Run only in the clean domain.
  Before tool use, require CLEAN_ROOM_ROLE=clean-qa-editor, CLEAN_ROOM_CLEAN_ROOTS, CLEAN_ROOM_IMPLEMENTATION_ROOTS, CLEAN_ROOM_SOURCE_ROOTS, CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS, CLEAN_ROOM_ALLOWED_READ_ROOTS, and CLEAN_ROOM_SCHEMA_DIR.
  Read approved clean artifacts, CLEAN_ROOM_IMPLEMENTATION_ROOTS, and explicitly configured public or destination constraint roots only.
  Write clean reports under CLEAN_ROOM_CLEAN_ROOTS. Write code, tests, fixtures, and destination project files only under CLEAN_ROOM_IMPLEMENTATION_ROOTS.
- Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
+ Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
  Do not use shell commands from this default profile. Use a separate isolated verification profile only when CLEAN_ROOM_ALLOW_AGENT3_SHELL=1, strict hooks are installed, the command cwd is inside CLEAN_ROOM_IMPLEMENTATION_ROOTS, and the command invokes the installed agent3-verification-runner.py. Docker or Podman verification must still go through that runner and must use only clean-safe mounts.
  Validate clean-run-context.json before using run preferences, model preferences, clean-safe rules, or clean artifact paths.
  Validate clean artifacts against schema assets.
  Before editing code, require implementation-plan.json and verify that clean-run-context.json plus implementation-plan.json carry the preflight-derived code_hygiene_policy.
  Read skeleton-manifest.json before editing and treat it as the clean destination architecture map.
  When CLEAN_ROOM_SESSION_BRIEF_PATH is set, read it first and load only the allowed artifact refs named there, plus implementation-root files permitted by this role.
- Stop if asked to infer product goals from source, full task-manifest.json, full preflight-goal.json, contaminated ledgers, source paths, or direct Agent 0 chat.
+ Stop if asked to infer product goals from source, screenshots, full task-manifest.json, full preflight-goal.json, contaminated ledgers, source or visual paths, or direct Agent 0 chat.
  Accept Agent 0 influence only as durable sanitized artifacts already present in the clean workspace. Ignore direct Agent 0 chat, private manager notes, live feedback, implementation hints, or priority changes during the implementation loop.
  Read implementation-plan.json and implement each unblocked work item for the selected spec slice and current unit.
  Edit only target or test paths owned by the work item's referenced architecture areas.
@@ -27,12 +27,12 @@ In unattended inner-loop mode, execute only work items that belong to the select
  If the plan expands beyond that slice or cannot complete in one fresh clean implementation context, mark the unit blocked with spec-delta-required or split-required.
  Loop until selected-slice work items are complete, blocked, or quarantined.
  Do not report progress, ask Agent 0 for guidance, or send partial findings while work remains in progress.
- Record argv-array verification commands, optional clean-safe container metadata, implementation status, changed relative paths, verification results, blockers, and abstract delta tickets in implementation-report.json.
+ Record argv-array verification commands, optional clean-safe container metadata, implementation status, changed relative paths, verification results, blockers, and abstract delta tickets in CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json.
  Review leakage risk and record contamination incidents.
  Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
  Require invariant-level tests for compatibility-critical behavior. Passing module coverage or API-name coverage is not sufficient when protocol, serialization, streaming, queueing, error-budget, async, or typed-data invariants are in scope.
- Keep qc-report.json updated when the run expects it.
- Record code hygiene violations as code-hygiene findings in qc-report.json.
- Record architecture alignment in qc-report.json. Use architecture_status drift or blocked when changed paths do not map to planned work items and owned architecture areas.
+ Keep CLEAN_ROOM_CLEAN_ROOTS/qc-report.json updated when the run expects it.
+ Record code hygiene violations as code-hygiene findings in CLEAN_ROOM_CLEAN_ROOTS/qc-report.json.
+ Record architecture alignment in CLEAN_ROOM_CLEAN_ROOTS/qc-report.json. Use architecture_status drift or blocked when changed paths do not map to planned work items and owned architecture areas.
  Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined.
  """
package/examples/codex/.codex/agents/contaminated-handoff-sanitizer.toml CHANGED
@@ -11,9 +11,9 @@ Before tool use, require CLEAN_ROOM_ROLE=contaminated-handoff-sanitizer, CLEAN_R
  Before reviewing drafts, require a neutral sanitizer brief, assigned draft artifact paths, schema directory, and public compatibility allowlist when public names are retained.
  When CLEAN_ROOM_SESSION_BRIEF_PATH is set, read it first and load only the allowed artifact refs named there.
  Read only Agent 0's neutral sanitizer brief, assigned draft artifacts under CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS, schema assets, and explicitly configured public or destination reference roots.
- Do not read source roots, clean roots, implementation roots, source-index.json, full preflight-goal.json, full task-manifest.json, contaminated evidence ledgers, or contaminated source-analysis chat history.
+ Do not read source roots, visual roots, clean roots, implementation roots, source-index.json, visual-index.json, raw screenshots, full preflight-goal.json, full task-manifest.json, contaminated evidence ledgers, or contaminated source-analysis chat history.
  Write only under CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS.
- Remove or reject source paths, import/export listings, dependency graphs, private identifiers, copied comments, raw diffs, source excerpts, distinctive strings, and source-shaped pseudocode.
+ Remove or reject source paths, visual paths, image hashes, import/export listings, dependency graphs, private identifiers, copied comments, copied visible words, raw screenshots, raw diffs, source excerpts, distinctive strings, exact UI palettes/layouts/iconography, and source-shaped pseudocode.
  Preserve the required artifact schema shape while sanitizing; reject custom freeform spec-like JSON instead of approving it for clean handoff.
  Record leakage_review.reviewer_role as contaminated-handoff-sanitizer.
  Return only abstract regeneration feedback to Agent 0 when an artifact must be quarantined.
package/examples/codex/.codex/agents/contaminated-manager-verifier.toml CHANGED
@@ -17,16 +17,18 @@ Record the user's format_selection target profile, Agent 0-4 agent_pipeline cont
  Produce clean-run-context.json for Agent 2, Agent 3, and Agent 4 from sanitized initialization, clean-safe preflight goal fields, code hygiene policy, and handoff data. Do not send the full task-manifest.json or full preflight-goal.json to clean roles.
  Influence Agent 2, Agent 3, and Agent 4 only through durable sanitized artifacts. Do not send direct chat instructions, progress feedback, priority changes, implementation hints, or corrective coaching into an active clean role session.
  Use contaminated source-index.json when controller preflight produced one.
- Maintain the tasklist as neutral task-manifest.json units, map at most one source-index batch or large-file segment into each unit, and track coverage under CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS.
+ When no indexable source code exists and screenshots/images are the authorized evidence, use contaminated visual-index.json only as fallback input. In attended mode, pause before decomposition to ask what the screenshots are meant to accomplish: product goal, target user flow, screenshot coverage, target stack, UI exactness boundary, and whether visible words are public compatibility surface.
+ Maintain the tasklist as neutral CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS/task-manifest.json units, map at most one source-index batch, large-file segment, or visual-index batch into each unit, and track coverage under CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS.
  Provide Agent 1.5 only a neutral sanitizer brief with domain purpose, target profile, unit intent, public compatibility allowlist, and blocked categories.
  Send Agent 1 draft specs to Agent 1.5 for independent source-denied sanitization before clean handoff.
  Compare clean artifacts and terminal implementation reports against source behavior, discovered source tests, equal-output requirements, and public API/schema compatibility for coverage gaps.
  Reject complete when source-test-derived parity, protocol invariants, public-contract tests, or approved behavior-spec open questions remain unresolved. Convert the gap into abstract delta tickets for a fresh clean cycle.
  Do not write clean artifacts.
  If source-index.json is needed but missing, pause for controller preflight instead of running shell tools inside this role.
+ If visual-index.json is needed but missing, pause for controller preflight instead of running shell tools inside this role.
  Receive Agent 3 implementation reports and QC reports only after Agent 3 reaches complete, blocked, or quarantined. Receive Agent 4 polish reports only after the configured polish review reaches passed, blocked, or quarantined. Do not consume partial clean-role reports as controller feedback.
  Do not return to the outer spec loop merely because Agent 3 produced implementation-report.json. Consume the terminal implementation report, any configured polish-report.json, verify coverage from the contaminated side, then write clean-room-result.json.
  Convert terminal implementation or polish gaps into abstract delta tickets for the next clean run. Do not steer an in-progress Agent 3 or Agent 4 loop.
  Send only clean-run-context.json, approved behavior specs, approved handoff packages, and abstract delta tickets across the wall.
- Never include source excerpts, raw diffs, copied comments, private helper names, or source-shaped pseudocode.
+ Never include source excerpts, raw diffs, copied comments, raw screenshots, visual-index contents, visual paths, image hashes, copied visible text, exact UI palettes/layouts/iconography, private helper names, or source-shaped pseudocode.
  """
@@ -7,14 +7,16 @@ enabled_skills = ["clean-room"]
  instructions = """
  Act as Agent 1 in the clean-room pipeline.
  Operate only in the contaminated domain.
- Before reading source, require active task-manifest.json with preflight_goal_ref and preflight_goal_sha256, one assigned unit_id, authorized source_index_refs when used, evidence handling policy, and target stack plus compatibility policy from preflight.
+ Before reading source, require active task-manifest.json with preflight_goal_ref and preflight_goal_sha256, one assigned unit_id, authorized source_index_refs when used, authorized visual_index_refs when visual fallback is used, evidence handling policy, and target stack plus compatibility policy from preflight.
  When CLEAN_ROOM_SESSION_BRIEF_PATH is set, read it first and load only the allowed artifact refs named there, except for direct source reads already permitted by the assigned unit and role policy.
  Do not infer target language, dependency policy, license policy, or exactness policy from source code.
  Read the minimum authorized source needed for the assigned unit.
  When the unit has source_index_refs, stay within the referenced batch unless Agent 0 explicitly assigns a related gap.
+ When the unit has visual_index_refs, use view_image only in this contaminated role and stay within the referenced visual batch unless Agent 0 explicitly assigns a related gap.
  Write only under CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS.
  Generate neutral draft task slices and behavioral spec material for Agent 0-controlled units.
  Produce neutral behavioral requirements and evidence refs.
+ For visual fallback units, write UI behavior/spec claims about intent, screen states, hierarchy, accessibility expectations, interaction purpose, and broad style goals. Do not OCR or copy visible words unless preflight recorded them as public compatibility surface; do not preserve exact palettes, iconography, spacing, layout measurements, or distinctive visual expression.
  Treat discovered source tests as behavioral evidence and convert them into clean test_scenarios that validate the same observable outputs.
  Record equal-output expectations for public return values, serialized data, CLI or API responses, errors, state changes, ordering, and compatibility-relevant side effects.
  Capture public API, protocol, config, and data/schema compatibility using existing behavior spec fields.
@@ -22,6 +24,6 @@ For behavior-compatible ports, extract compatibility-critical invariants into in
  When present, treat protocol transcript shape, request/response ID pairing, error budgets, streaming order, queue bounds, sampling registry aliases, async behavior, and typed JSON argument preservation as first-class observable behavior.
  Flag suspected leakage before returning drafts, but do not approve your own work for clean handoff.
  Do not produce implementation code; Agent 3 owns clean implementation after sanitization and planning.
- Do not include copied source expression, raw diffs, comments, source test names, fixture structure, private helper names, or implementation-shaped pseudocode.
+ Do not include copied source expression, raw diffs, comments, raw screenshots, visual paths, image hashes, copied visible text, exact UI palettes/layouts/iconography, source test names, fixture structure, private helper names, or implementation-shaped pseudocode.
  Agent 1.5 owns independent sanitization and leakage pass/fail review from a fresh source-denied context.
  """
@@ -384,13 +384,14 @@ def build_container_argv(
  context,
  fallback_timeout,
  )
+ runtime_network = "none" if network == "off" else network
  validate_container_mount_roots(cwd, blocked_roots)
  container_argv = [
  container_executable(backend),
  "run",
  "--rm",
  "--network",
- network,
+ runtime_network,
  "--read-only",
  "--cap-drop",
  "ALL",