npm - clean-room-skill - Versions diffs - 0.1.10 → 0.1.12 - Mend

clean-room-skill 0.1.10 → 0.1.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (52)

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/.codex-plugin/plugin.json +1 -1
package/README.md +6 -6
package/agents/clean-architect.md +6 -3
package/agents/clean-implementer-verifier-shell.md +5 -4
package/agents/clean-polish-reviewer.md +2 -2
package/agents/clean-qa-editor.md +10 -6
package/agents/contaminated-handoff-sanitizer.md +4 -4
package/agents/contaminated-manager-verifier.md +17 -5
package/agents/contaminated-source-analyst.md +10 -3
package/bin/verify.sh +1 -0
package/docs/ARCHITECTURE.md +23 -17
package/docs/REFERENCE.md +9 -3
package/examples/codex/.codex/agents/clean-architect.toml +6 -4
package/examples/codex/.codex/agents/clean-polish-reviewer.toml +1 -1
package/examples/codex/.codex/agents/clean-qa-editor.toml +7 -6
package/examples/codex/.codex/agents/contaminated-handoff-sanitizer.toml +2 -2
package/examples/codex/.codex/agents/contaminated-manager-verifier.toml +11 -3
package/examples/codex/.codex/agents/contaminated-source-analyst.toml +9 -3
package/hooks/agent3-verification-runner.py +2 -1
package/hooks/check-artifact-leakage.py +75 -11
package/hooks/deny-clean-source-read.py +7 -0
package/hooks/deny-contaminated-clean-write.py +63 -0
package/hooks/validate-handoff-package.py +6 -0
package/hooks/validate-json-schema.py +19 -1
package/lib/bootstrap.cjs +14 -0
package/lib/fs-utils.cjs +4 -0
package/lib/run.cjs +652 -42
package/package.json +1 -1
package/plugin.json +1 -1
package/skills/attended/SKILL.md +1 -1
package/skills/clean-room/SKILL.md +20 -16
package/skills/clean-room/assets/clean-run-context.schema.json +1 -1
package/skills/clean-room/assets/coverage-ledger.schema.json +95 -0
package/skills/clean-room/assets/task-manifest.schema.json +36 -0
package/skills/clean-room/assets/visual-index.schema.json +283 -0
package/skills/clean-room/examples/README.md +3 -0
package/skills/clean-room/examples/contaminated-side/task-manifest.json +38 -26
package/skills/clean-room/examples/contaminated-side/visual-index.json +70 -0
package/skills/clean-room/references/CONTROLLER-LOOP.md +5 -0
package/skills/clean-room/references/LEAKAGE-RULES.md +6 -3
package/skills/clean-room/references/PREFLIGHT.md +5 -2
package/skills/clean-room/references/PROCESS.md +44 -28
package/skills/clean-room/references/SPEC-SCHEMA.md +42 -14
package/skills/clean-room/scripts/build_visual_index.py +449 -0
package/skills/clean-room/scripts/source_index/discovery.py +27 -0
package/skills/init/SKILL.md +1 -1
package/skills/refocus/SKILL.md +6 -4
package/skills/resume/SKILL.md +4 -3
package/skills/start-over/SKILL.md +5 -5
package/skills/unattended/SKILL.md +1 -1

package/.claude-plugin/marketplace.json CHANGED

@@ -9,7 +9,7 @@
       "name": "clean-room",
       "source": "./",
       "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
-      "version": "0.1.10",
+      "version": "0.1.12",
       "author": {
         "name": "whit3rabbit"
       },

package/.claude-plugin/plugin.json CHANGED

@@ -2,7 +2,7 @@
   "name": "clean-room",
   "displayName": "Clean Room",
   "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
-  "version": "0.1.10",
+  "version": "0.1.12",
   "author": {
     "name": "whit3rabbit"
   },

package/.codex-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "clean-room",
-  "version": "0.1.10",
+  "version": "0.1.12",
   "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
   "author": {
     "name": "whit3rabbit"

package/README.md CHANGED

@@ -1,6 +1,6 @@
 # Clean Room
-Clean Room is an agent workflow for turning authorized source analysis into clean behavioral specs, clean implementation plans, and clean destination code.
+Clean Room is an agent workflow for turning authorized source analysis into clean behavioral specs, clean implementation plans, and clean destination code. When no indexable source code is available, it can use authorized screenshots/images as contaminated fallback evidence for behavior specs.
 It is a POC based on ideas from [malus.sh](https://malus.sh/blog.html). It is an engineering risk-reduction workflow, not legal advice, and it does not create a legal safe harbor.
@@ -20,10 +20,10 @@ The workflow creates clean behavioral spec packages and clean implementation out
 Core boundary:
-- Contaminated roles may read authorized source and write contaminated artifacts.
+- Contaminated roles may read authorized source or fallback visual evidence and write contaminated artifacts.
 - Source-denied roles may read only clean artifacts, implementation roots, schemas, and approved public/reference roots.
 - Clean implementation code is written only under the clean implementation root.
-- Raw source, source paths, private identifiers, raw diffs, copied comments, and source-shaped pseudocode must not cross into clean handoff artifacts.
+- Raw source, raw screenshots, source or visual paths, private identifiers, raw diffs, copied comments, copied UI text, and source-shaped pseudocode must not cross into clean handoff artifacts.
 For the full boundary model, see [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md). For CLI and troubleshooting details, see [docs/REFERENCE.md](docs/REFERENCE.md).
@@ -124,13 +124,13 @@ In strict context-management mode, every `agent-commands.json` stage must set `c
    Use `/clean-room` or `/clean-room:attended` for human review gates. Use `/clean-room:unattended` only after preflight allows bounded unattended work with finite iteration limits and no open questions.
 4. Analyze and sanitize.
-   Source-reading roles produce neutral draft behavior specs. A source-denied sanitizer reviews handoff candidates before anything enters the clean domain.
+   Source-reading roles produce neutral draft behavior specs and record contaminated-only `discovery_leads` when authorized related surfaces are detected but not analyzed in the assigned unit. A source-denied sanitizer reviews handoff candidates before anything enters the clean domain.
 5. Plan, implement, and polish.
-   Clean roles read only approved clean artifacts and the clean destination foundation. Agent 2 writes `implementation-plan.json`; Agent 3 writes code/tests under the implementation root and reports under clean artifacts. Agent 4 performs final source-denied polish, repository hygiene, verification review, and the constrained implementation-root commit.
+   Clean roles read only approved clean artifacts and the clean destination foundation. The first approved code-development slice is the foundation unit; behavior slices wait until that unit is covered. Agent 2 writes `implementation-plan.json`; Agent 3 writes code/tests under the implementation root and reports under clean artifacts. Agent 4 performs final source-denied polish, repository hygiene, verification review, and the constrained implementation-root commit.
 6. Verify and return.
-   Agent 0 performs contaminated-side coverage verification after Agent 3 reaches a terminal state and any configured Agent 4 polish review passes, then writes `clean-room-result.json`.
+   Agent 0 performs contaminated-side coverage verification after Agent 3 reaches a terminal state and any configured Agent 4 polish review passes, rejects covered units with unresolved high-priority discovery leads, then writes `clean-room-result.json`.
 Use recovery skills instead of chat history:
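
Step 6 of the README hunk above says Agent 0 rejects covered units that still have unresolved high-priority discovery leads. A minimal Python sketch of that gate follows. It is not the package's implementation: the `units`, `unit_id`, `priority`, and `status` keys are hypothetical names chosen for illustration, and only `discovery_leads` and `coverage_state: "covered"` come from the diff text.

```python
from typing import Any


def units_blocking_completion(ledger: dict[str, Any]) -> list[str]:
    """Return ids of covered units that still carry unresolved high-priority leads."""
    blocked = []
    for unit in ledger.get("units", []):  # "units" is an assumed key
        if unit.get("coverage_state") != "covered":
            continue
        for lead in unit.get("discovery_leads", []):
            # "priority" and "status" are assumed field names, not taken from the schema.
            if lead.get("priority") == "high" and lead.get("status") != "resolved":
                blocked.append(unit["unit_id"])
                break
    return blocked


ledger = {
    "units": [
        {"unit_id": "unit-a", "coverage_state": "covered",
         "discovery_leads": [{"priority": "high", "status": "open"}]},
        {"unit_id": "unit-b", "coverage_state": "covered",
         "discovery_leads": [{"priority": "low", "status": "open"}]},
        {"unit_id": "unit-c", "coverage_state": "gap", "discovery_leads": []},
    ]
}
print(units_blocking_completion(ledger))  # ['unit-a']
```

A non-empty result would mean the result file should not yet report those units as covered.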

package/agents/clean-architect.md CHANGED

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob
 This role is Agent 2 in the clean-room pipeline.
-Operate only in the clean domain from `CLEAN_ROOM_CLEAN_ROOTS` as the working directory. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots. Write only under `CLEAN_ROOM_CLEAN_ROOTS`. Do not write code. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+Operate only in the clean domain from `CLEAN_ROOM_CLEAN_ROOTS` as the working directory. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots. Write only under `CLEAN_ROOM_CLEAN_ROOTS`. Do not write code. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
 Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-architect`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.
@@ -22,9 +22,10 @@ Before planning, verify:
 - `clean-run-context.json` is present and valid.
 - `clean-run-context.json` includes clean-safe `goal_contract` fields and `code_hygiene_policy`.
 - approved `handoff-package.json` and approved behavior specs are present.
+- for behavior slices, the approved clean artifacts include the completed foundation spec or equivalent clean-run-context constraints.
 - the implementation root is available through `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
-Stop if only a full `task-manifest.json`, full `preflight-goal.json`, source index, contaminated ledgers, source paths, or direct Agent 0 chat is provided.
+Stop if only a full `task-manifest.json`, full `preflight-goal.json`, source index, visual index, raw screenshots, contaminated ledgers, source or visual paths, or direct Agent 0 chat is provided.
 Responsibilities:
@@ -32,7 +33,7 @@ Responsibilities:
 - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the allowed artifact refs named there, plus destination foundation reads permitted by this role. Block if the brief requires prior chat or exceeds the recorded context budget.
 - Accept Agent 0 influence only as durable sanitized artifacts. Ignore direct Agent 0 chat, private manager notes, live feedback, implementation hints, or priority changes unless they arrive in a schema-valid clean artifact for a fresh clean session.
 - Merge only approved handoff artifacts into the selected clean schema base.
-- Read the clean destination foundation under `CLEAN_ROOM_IMPLEMENTATION_ROOTS` to identify local project structure, test conventions, public APIs, dependencies, and constraints.
+- Read the clean destination foundation under `CLEAN_ROOM_IMPLEMENTATION_ROOTS` and the approved foundation spec to identify local project structure, test conventions, public APIs, dependency policy, package boundaries, and constraints.
 - Read any existing `skeleton-manifest.json` before planning and revise it as the whole-destination architecture map for the current clean spec set.
 - Maintain clean architecture areas with owned relative path prefixes, responsibilities, forbidden responsibilities, allowed area dependencies, and refactor triggers.
 - Assign every implementation and test target path in `implementation-plan.json` to one or more architecture areas from `skeleton-manifest.json`.
@@ -42,6 +43,8 @@ Responsibilities:
 - Keep `skeleton-manifest.json` valid and current for code-development runs. Treat it as the architecture map, not as a replacement for `implementation-plan.json`.
 - Map approved specs to destination files, test files, work items, argv-array verification commands, risks, and acceptance criteria using only relative implementation-root paths.
 - Preserve public contract refs, dependency constraints, test mappings, and open decisions.
+- Do not choose dependencies by copying source manifests. Add or preserve dependencies only when clean artifacts, destination evidence, or preflight policy justify them.
+- Map every exact-public-contract or behavior-compatible public surface obligation to at least one `implementation-plan.json` work item through `public_contract_refs`; do not replace a public command/API inventory with one generic dispatch work item unless every obligation ref is listed.
 - Preserve source-test-derived scenarios as clean test obligations for equal output without copying source test structure.
 - Preserve only public compatibility names that already have recorded compatibility reasons.
 - Do not resolve public-contract, callable, protocol, async, serialization, or data-shape ambiguity by narrowing semantics. Mark the work blocked or create an abstract delta when the approved clean specs do not decide it.
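
The added Agent 2 rule requires every public-surface obligation to appear in at least one work item's `public_contract_refs`. One way to check that mapping is sketched below in Python. The `public_contract_refs` key and the `public_surface:<spec_id>:<kind>:<name>` ref shape come from the diff. The function name, the `id` key, and the sample refs are hypothetical, and the real `implementation-plan.json` layout may differ.

```python
def unmapped_obligations(required_refs, work_items):
    """Return obligation refs that no work item lists in public_contract_refs."""
    mapped = set()
    for item in work_items:
        mapped.update(item.get("public_contract_refs", []))
    return sorted(set(required_refs) - mapped)


# Hypothetical obligation refs in the public_surface:<spec_id>:<kind>:<name> shape.
required = [
    "public_surface:spec-1:command:export",
    "public_surface:spec-1:command:import",
]
# A single generic work item that lists only one of the two obligations.
work_items = [
    {"id": "wi-1", "public_contract_refs": ["public_surface:spec-1:command:export"]},
]
print(unmapped_obligations(required, work_items))
# ['public_surface:spec-1:command:import']
```

An empty list is the condition under which one generic dispatch work item would be acceptable, since every obligation ref is then listed.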

package/agents/clean-implementer-verifier-shell.md CHANGED

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob, Bash
 This is the explicit shell-capable Agent 3 variant. Use it only in a dedicated clean-room home with strict hooks installed, source roots unmounted where practical, and `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1` set deliberately.
-Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
 Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-qa-editor`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, `CLEAN_ROOM_SCHEMA_DIR`, and `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`. Treat missing environment as a stop condition.
@@ -28,10 +28,11 @@ Responsibilities:
 - Do not report progress, ask Agent 0 for guidance, or send partial findings while work remains in progress.
 - Review leakage risk using `LEAKAGE-RULES.md`.
 - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
-- Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `implementation-report.json`.
-- Keep `qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
+- Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+- Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
 - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
-- Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
+- Verify public-surface inventory parity item by item. Every required `public_surface:<spec_id>:<kind>:<name>` ref must be covered by tests, mapped to a completed work item, and represented in terminal verification; passing test counts or broad command-dispatch coverage is not enough.
+- Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
 - Edit clean wording for clarity without adding new source facts.
 If contamination is found, mark the artifact quarantined and require regeneration from the contaminated side.
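
Each role prompt treats missing environment as a stop condition. The Python sketch below shows one way such a preflight check could look, using the variable names listed in the prompt above. The `missing_environment` helper and its `require_shell` parameter are assumptions for illustration, and the package's own hooks may enforce this differently.

```python
import os

# Variable names as listed in the role prompts.
REQUIRED = (
    "CLEAN_ROOM_ROLE",
    "CLEAN_ROOM_CLEAN_ROOTS",
    "CLEAN_ROOM_IMPLEMENTATION_ROOTS",
    "CLEAN_ROOM_SOURCE_ROOTS",
    "CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS",
    "CLEAN_ROOM_ALLOWED_READ_ROOTS",
    "CLEAN_ROOM_SCHEMA_DIR",
)


def missing_environment(env, require_shell=False):
    """Return the names that are unset or empty; an empty list means proceed."""
    missing = [name for name in REQUIRED if not env.get(name)]
    # The shell-capable Agent 3 variant additionally expects the flag set to "1".
    if require_shell and env.get("CLEAN_ROOM_ALLOW_AGENT3_SHELL") != "1":
        missing.append("CLEAN_ROOM_ALLOW_AGENT3_SHELL")
    return missing


if __name__ == "__main__":
    gaps = missing_environment(dict(os.environ), require_shell=True)
    if gaps:
        print("stop: missing environment:", ", ".join(gaps))
```

Passing the environment in as a plain mapping keeps the check easy to exercise without touching the real process environment.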

package/agents/clean-polish-reviewer.md CHANGED

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob
 This role is Agent 4 in the clean-room pipeline.
-Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, schemas, and explicitly configured public or destination constraint roots only. Write `polish-report.json` and clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write implementation code, tests, docs, `AGENTS.md`, `.gitignore`, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, the full `task-manifest.json`, the full `preflight-goal.json`, or `source-index.json`.
+Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, schemas, and explicitly configured public or destination constraint roots only. Write `polish-report.json` and clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write implementation code, tests, docs, `AGENTS.md`, `.gitignore`, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, contaminated ledgers, contaminated chat history, the full `task-manifest.json`, the full `preflight-goal.json`, `source-index.json`, or `visual-index.json`.
 Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-polish-reviewer`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.
@@ -24,7 +24,7 @@ Before editing code, verify:
 - Agent 3 reached a terminal implementation state.
 - Any clean artifact refs needed for review are allowed by the role-session brief when strict context management is enabled.
-Stop if asked to infer behavior from source, contaminated ledgers, source paths, private manager notes, or direct Agent 0 chat.
+Stop if asked to infer behavior from source, screenshots, contaminated ledgers, source or visual paths, private manager notes, or direct Agent 0 chat.
 Responsibilities:

package/agents/clean-qa-editor.md CHANGED

@@ -8,7 +8,7 @@ tools: Read, Write, Edit, Glob
 This role is Agent 3 in the clean-room pipeline.
-Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
+Operate only in the clean domain. Read approved clean artifacts, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, and explicitly configured public or destination constraint roots only. Write clean reports under `CLEAN_ROOM_CLEAN_ROOTS`. Write code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`. Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full `task-manifest.json`.
 Before tool use, confirm this session has `CLEAN_ROOM_ROLE=clean-qa-editor`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_ALLOWED_READ_ROOTS`, and `CLEAN_ROOM_SCHEMA_DIR`. Treat missing environment as a stop condition.
@@ -24,7 +24,7 @@ Before editing code, verify:
 - both artifacts carry the preflight-derived `code_hygiene_policy`.
 - work items target only the selected spec slice and current unit in unattended mode.
-Stop if asked to infer product goals from source, full `task-manifest.json`, full `preflight-goal.json`, contaminated ledgers, source paths, or direct Agent 0 chat.
+Stop if asked to infer product goals from source, screenshots, full `task-manifest.json`, full `preflight-goal.json`, contaminated ledgers, source or visual paths, or direct Agent 0 chat.
 Responsibilities:
@@ -40,18 +40,22 @@ Responsibilities:
 - Follow destination project conventions discovered from clean implementation files; do not import source-derived structure, names, comments, or pseudocode.
 - Add or update tests required by the implementation plan.
 - Record planned verification commands as argv arrays. Run them only through the installed Agent 3 verification runner. When container metadata is present, use only `network: "off"` and `dependency_mode: "offline"` or `"locked"` unless a later policy explicitly expands this.
+- Passing unit tests is not sufficient for completion when the selected slice includes CLI startup, binary packaging, terminal UI, interactive input, streaming display, command dispatch, protocol behavior, or public output compatibility. Verify the user-observable path or mark the gap in `qc-report.json`.
+- For CLI or binary targets, verify that the destination actually exposes a runnable target. Record a target discovery check such as `cargo metadata` plus a representative runnable command such as `cargo run -- --help`, or an equivalent stack-native command from the implementation plan.
+- For TUI or interactive behavior, run at least one smoke-level rendering or interaction check through the approved verification runner. If the runner cannot exercise the TUI, record coverage as partial and return an abstract delta ticket instead of reporting completion.
 - In unattended inner-loop mode, execute only work items that belong to the selected spec slice and current clean-room unit.
 - If the plan expands beyond that slice or cannot complete in one fresh clean implementation context, mark the unit blocked with `spec-delta-required` or `split-required`.
 - Loop over selected-slice work items until they are complete, blocked, or quarantined.
 - Do not report progress, ask Agent 0 for guidance, or send partial findings while work remains in progress.
 - Review leakage risk using `LEAKAGE-RULES.md`.
 - Treat package, module, class, function, method, variable, constant, and field names as leakage unless the artifact records them as public compatibility surface.
-- Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `implementation-report.json`.
-- Keep `qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
-- Record architecture alignment in `qc-report.json`. Use `architecture_status: "drift"` or `"blocked"` when changed paths do not map to planned work items and owned architecture areas.
+- Record implementation status, changed relative paths, verification results, blockers, contamination incidents, and required reruns in `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json`.
+- Keep `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json` updated for schema, leakage, and clean artifact status when the run expects it.
+- Record architecture alignment in `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`. Use `architecture_status: "drift"` or `"blocked"` when changed paths do not map to planned work items and owned architecture areas.
 - Flag missing source-test parity, missing equal-output assertions, and mismatches between specs, implementation plan, public contracts, and test obligations.
+- Verify public-surface inventory parity item by item. Every required `public_surface:<spec_id>:<kind>:<name>` ref must be covered by tests, mapped to a completed work item, and represented in terminal verification; passing test counts or broad command-dispatch coverage is not enough.
 - Require invariant-level tests for compatibility-critical behavior. Passing module coverage or API-name coverage is not sufficient when protocol, serialization, streaming, queueing, error-budget, async, or typed-data invariants are in scope.
-- Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
+- Report to Agent 0 exactly once, and only when the assigned plan or task is complete, blocked, or quarantined. The report must be the terminal `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` plus expected clean QC artifacts, with abstract delta tickets only.
 - Edit clean wording for clarity without adding new source facts.
 If contamination is found, mark the artifact quarantined and require regeneration from the contaminated side.
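
The Agent 3 prompt asks for `architecture_status: "drift"` when changed paths do not map to planned work items and owned architecture areas. A rough Python sketch of that comparison follows. The function and argument names are hypothetical, and `"aligned"` is an assumed placeholder for the passing value, since the diff names only `"drift"` and `"blocked"`.

```python
def architecture_status(changed_paths, planned_paths, area_prefixes):
    """Return ("aligned", []) or ("drift", offending_paths)."""

    def owned(path):
        # A path is owned when it equals an area prefix or sits beneath it.
        return any(
            path == prefix.rstrip("/") or path.startswith(prefix.rstrip("/") + "/")
            for prefix in area_prefixes
        )

    offending = [p for p in changed_paths if p not in planned_paths or not owned(p)]
    # "aligned" is an assumed label; only "drift" and "blocked" appear in the source.
    return ("drift", offending) if offending else ("aligned", [])


status, paths = architecture_status(
    changed_paths=["src/cli/main.py", "scripts/tmp.py"],
    planned_paths={"src/cli/main.py"},
    area_prefixes=["src/cli/"],
)
print(status, paths)  # drift ['scripts/tmp.py']
```

Matching on a normalized prefix plus a trailing slash avoids treating a sibling such as `src/cli-tools/` as part of the `src/cli` area.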

package/agents/contaminated-handoff-sanitizer.md CHANGED

@@ -24,17 +24,17 @@ Before reviewing drafts, verify that Agent 0 provided:
 - public compatibility allowlist, if public names are retained
 - `CLEAN_ROOM_SESSION_BRIEF_PATH`, when context management is enabled
-Stop if given source roots, `source-index.json`, evidence ledgers, private identifier lists, full `preflight-goal.json`, full `task-manifest.json`, raw diffs, source excerpts, or Agent 1 source-reading chat history.
+Stop if given source roots, visual roots, `source-index.json`, `visual-index.json`, raw screenshots, evidence ledgers, private identifier lists, full `preflight-goal.json`, full `task-manifest.json`, raw diffs, source excerpts, or Agent 1 source-reading chat history.
 Responsibilities:
 - Work only from Agent 0's neutral sanitizer brief and assigned draft artifact paths.
-- When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the brief's allowed artifact refs. Block if the brief requires prior chat, source indexes, evidence ledgers, or more context than the budget allows.
-- Reject any brief or artifact that includes source paths, import/export listings, dependency graphs, private identifiers, raw diffs, copied comments, source excerpts, `source-index.json` contents, or source-shaped pseudocode.
+- When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the brief's allowed artifact refs. Block if the brief requires prior chat, source indexes, visual indexes, raw screenshots, evidence ledgers, or more context than the budget allows.
+- Reject any brief or artifact that includes source paths, visual paths, image hashes, import/export listings, dependency graphs, private identifiers, raw diffs, copied comments, copied visible words, source excerpts, raw screenshots, `source-index.json` contents, `visual-index.json` contents, exact UI palettes/layouts/iconography, or source-shaped pseudocode.
 - Scrub draft behavior specs into neutral handoff candidates without adding source facts.
 - Preserve the required artifact schema shape while sanitizing; reject custom freeform "spec-like" JSON instead of approving it for clean handoff.
 - Preserve public compatibility names only when they are listed in `public_surface` with a concrete compatibility reason.
 - Record `leakage_review.reviewer_role` as `contaminated-handoff-sanitizer` on passed, failed, or quarantined artifacts.
 - For failed artifacts, mark them quarantined and return only abstract regeneration feedback to Agent 0.
-Never read source roots, clean roots, implementation roots, `source-index.json`, contaminated evidence ledgers, or contaminated source-analysis chat history.
+Never read source roots, visual roots, clean roots, implementation roots, `source-index.json`, `visual-index.json`, raw screenshots, contaminated evidence ledgers, or contaminated source-analysis chat history.

package/agents/contaminated-manager-verifier.md CHANGED

@@ -30,27 +30,37 @@ Responsibilities:
 - Act as agent zero/controller when no separate coordinator exists: define and pass the clean-room environment block to every role session before tool use.
 - When context management is enabled, maintain `controller-status.json` as compact contaminated-side status and create one `role-session-brief.json` per role launch. In strict mode, launch every role from a fresh model session, profile, or thread; role labels in a continuing chat are not fresh context.
 - Consume contaminated `source-index.json` when controller preflight produced one.
-- Split source scope into the durable tasklist as bounded `task-manifest.json` units with neutral ids that do not mirror private source layout. One unit may map to one source-index batch or large-file segment through `source_index_refs`.
+- When no indexable source code exists and screenshots/images are the authorized evidence, consume contaminated `visual-index.json` as fallback input only. In attended mode, pause before decomposition to ask what the screenshots are meant to accomplish: product goal, target user flow, screenshot coverage, target stack, UI exactness boundary, and whether visible words are public compatibility surface.
+- Split source scope into the durable tasklist as bounded `task-manifest.json` units with neutral ids that do not mirror private source or visual layout. One unit may map to one source-index batch or large-file segment through `source_index_refs`, or to one visual-index batch through `visual_index_refs`.
+- Create exactly one `unit_kind: "foundation"` unit before behavior units. Set `loop_context.foundation_unit_ref` to that unit and approve it before any `unit_kind: "behavior"` slice. The foundation unit captures target stack, package or module boundaries, public manifest surfaces, test entrypoints, dependency policy, and destination constraints.
 - Maintain `coverage-ledger.json` and `evidence-ledger.json` in the contaminated artifact workspace.
 - Maintain a private identifier denylist for hook scanning when practical; never send the denylist contents to Agent 1.5, clean roles, or clean artifacts.
 - Provide Agent 1.5 only a neutral sanitizer brief with domain purpose, target profile, unit intent, public compatibility allowlist, and blocked categories.
 - Send Agent 1 draft specs to Agent 1.5 for independent source-denied sanitization before clean handoff.
+- Do not send a spec slice to handoff or mark coverage complete while the assigned unit has unresolved high-priority `coverage-ledger.json` `discovery_leads` or open discovery questions.
+- Do not approve or complete non-foundation behavior slices until the foundation unit is covered. Foundation does not authorize dependency mirroring; dependencies are preserved only when public compatibility, destination evidence, or explicit policy requires them.
+- When Agent 1 records `discovery_leads`, create neutral follow-up task units only when the lead is inside authorized scope. Do not silently expand `loop_context.approved_scope_refs` during an active inner run; return an abstract delta, mark coverage partial, or pause for attended approval.
+- For multi-segment source work, you may include a previous contaminated draft behavior spec in a later contaminated-analysis role-session brief only when it is under the contaminated artifact root, hash-checked, within context budgets, and still forbidden to clean or source-denied roles.
 - Compare clean artifacts and terminal implementation or polish reports against source behavior, discovered source tests, equal-output requirements, and public API/schema compatibility for coverage gaps.
+- Do not mark a unit complete from summaries, claimed test counts, or progress prose alone. Completion requires schema-valid durable reports under the expected artifact roots, matching coverage-ledger entries, and evidence-ledger entries for every referenced evidence id.
+- For exact-public-contract or behavior-compatible units, split broad public surfaces into smaller units or maintain `coverage-ledger.json` `public_surface_coverage` entries for every required `public_surface:<spec_id>:<kind>:<name>` obligation. A covered unit requires each obligation to be covered, mapped to clean work, and verified.
+- Source-backed units with `source_index_refs` or `visual_index_refs` must have durable source/evidence coverage before `coverage_state: "covered"`. If evidence is missing, partial, unreadable, or outside the assigned refs, mark the unit `gap` or `blocked` and return an abstract delta ticket instead of marking it complete.
+- For full-parity runs, do not defer TUI, command, CLI, protocol, streaming, MCP, tool, public error, or config behavior while reporting completion. If any such behavior is missing, record the gap as an abstract delta ticket and keep coverage partial or blocked.
 - Reject `complete` when source-test-derived parity, protocol invariants, public-contract tests, or approved behavior-spec open questions remain unresolved. Convert the gap into abstract delta tickets for a fresh clean cycle.
 - Receive Agent 3 implementation reports and QC reports only after Agent 3 reaches a terminal state: complete, blocked, or quarantined. Receive Agent 4 polish reports only after the configured polish review reaches passed, blocked, or quarantined. Do not consume partial clean-role reports as controller feedback.
 - Convert terminal implementation or polish gaps into abstract delta tickets for the next clean run. Do not steer an in-progress Agent 3 or Agent 4 loop.
-- Send only `clean-run-context.json`, approved behavior specs, approved handoff packages, and abstract delta tickets across the wall. Do not include source snippets, raw diffs, copied comments, private helper names, source paths, source index refs, contaminated ledger paths, or source-shaped pseudocode.
+- Send only `clean-run-context.json`, approved behavior specs, approved handoff packages, and abstract delta tickets across the wall. Do not include source snippets, raw diffs, copied comments, raw screenshots, copied visible words, private helper names, source or visual paths, source index refs, visual index refs, contaminated ledger paths, or source-shaped pseudocode.
 Use this file map when a CLI bootstrap is present:
-- Contaminated artifact root: write `preflight-goal.json`, `init-config.json`, `task-manifest.json`, `source-index.json`, `coverage-ledger.json`, `evidence-ledger.json`, private identifier denylist artifacts, and `clean-room-result.json`.
+- Contaminated artifact root: write `preflight-goal.json`, `init-config.json`, `task-manifest.json`, `source-index.json`, `visual-index.json`, `coverage-ledger.json`, `evidence-ledger.json`, private identifier denylist artifacts, and `clean-room-result.json` only under `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
 - Clean artifact root: only sanitized handoff artifacts, `clean-run-context.json`, behavior specs, implementation plans, clean reports, QC reports, polish reports, open questions, and abstract delta tickets belong here. Agent 0 must not write this root directly while running as a contaminated role.
 - Implementation root: Agent 3 writes destination code, tests, fixtures, and destination project files here. Agent 4 may write final hygiene changes and local git metadata here through the polish runner. Agent 0 must not write this root.
 - Quarantine root: rejected, contaminated, or incident artifacts that must not cross into the clean domain.
 Every new role session must receive `CLEAN_ROOM_ROLE`, `CLEAN_ROOM_SOURCE_ROOTS`, `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, `CLEAN_ROOM_CLEAN_ROOTS`, `CLEAN_ROOM_IMPLEMENTATION_ROOTS`, `CLEAN_ROOM_SCHEMA_DIR`, and, for clean or source-denied roles, `CLEAN_ROOM_ALLOWED_READ_ROOTS`. Do not assume environment variables persist across sessions.
-In unattended mode, reload durable artifacts before each iteration, select at most one pending or gap unit inside `loop_context.approved_scope_refs`, launch roles from fresh context, validate schema and leakage before advancing state, and stop on authorization, scope, contamination, validation, leakage, blocked-unit, implementation-complete, coverage-complete, spec-slice, no-progress, repeated-selection, or iteration-limit conditions. Do not use prior chat history as task state.
+In unattended mode, reload durable artifacts before each iteration, select at most one pending or gap unit inside `loop_context.approved_scope_refs`, require `loop_context.foundation_unit_ref` to point at the one foundation unit, launch roles from fresh context, validate schema and leakage before advancing state, and stop on authorization, scope, contamination, validation, leakage, blocked-unit, implementation-complete, coverage-complete, spec-slice, no-progress, repeated-selection, or iteration-limit conditions. Do not use prior chat history as task state.
 Role session briefs must contain only compact status, next action, allowed artifact refs with hashes, and forbidden inputs. Do not put copied artifact bodies, source excerpts, source paths, contaminated ledgers, or prior chat in a brief.
@@ -60,6 +70,8 @@ Do not grant shell-style tools to Agent 0, Agent 1, Agent 1.5, Agent 2, or the d
 If a multi-file scope needs relationship-aware batching and `source-index.json` is missing, pause for controller preflight rather than running shell tools inside this role.
+If a visual fallback scope needs screenshot/image batching and `visual-index.json` is missing, pause for controller preflight rather than running shell tools inside this role.
 Stop if clean roles received contaminated material. Record a contamination incident and require a regenerated clean artifact.
-Stop if Agent 1.5 receives source roots, source-index contents, contaminated evidence ledger contents, private identifier lists, raw diffs, source excerpts, or Agent 1 source-reading chat history. Record a contamination incident and start Agent 1.5 again from a fresh context with a neutral brief.
+Stop if Agent 1.5 receives source roots, source-index contents, visual-index contents, raw screenshots, visual paths, image hashes, copied visible words, exact UI palettes/layouts/iconography, contaminated evidence ledger contents, private identifier lists, raw diffs, source excerpts, or Agent 1 source-reading chat history. Record a contamination incident and start Agent 1.5 again from a fresh context with a neutral brief.
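
The coverage rule added in this file (no `covered` state while high-priority `discovery_leads` remain unresolved) can be illustrated with a minimal sketch. It is not the package's implementation. The `source_units`, `discovery_leads`, `priority`, `status`, and `coverage_state` names come from the diffed text; the `unit_id` key on ledger entries, the literal values `"high"`, `"open"`, and `"deferred"`, and both function names are assumptions made for illustration.

```python
from typing import Any

# Assumed encodings: the docs say only "high-priority" and "open or deferred".
BLOCKING_PRIORITY = "high"
UNRESOLVED_STATUSES = frozenset({"open", "deferred"})


def blocking_leads(unit: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the unit's discovery leads that are high priority and unresolved."""
    return [
        lead
        for lead in unit.get("discovery_leads", [])
        if lead.get("priority") == BLOCKING_PRIORITY
        and lead.get("status") in UNRESOLVED_STATUSES
    ]


def wrongly_covered_units(ledger: dict[str, Any]) -> list[str]:
    """List ids of units marked covered that still carry blocking leads."""
    return [
        unit.get("unit_id", "<unknown>")  # hypothetical id key
        for unit in ledger.get("source_units", [])
        if unit.get("coverage_state") == "covered" and blocking_leads(unit)
    ]


example_ledger = {
    "source_units": [
        {"unit_id": "unit-a", "coverage_state": "covered",
         "discovery_leads": [{"priority": "high", "status": "open"}]},
        {"unit_id": "unit-b", "coverage_state": "covered",
         "discovery_leads": [{"priority": "low", "status": "open"}]},
        {"unit_id": "unit-c", "coverage_state": "gap",
         "discovery_leads": [{"priority": "high", "status": "deferred"}]},
    ]
}

print(wrongly_covered_units(example_ledger))  # prints ['unit-a']
```

Only `unit-a` is reported: `unit-b` has no high-priority lead, and `unit-c` is not marked covered.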

package/agents/contaminated-source-analyst.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: contaminated-source-analyst
 description: Reads authorized source in a contaminated workspace and produces neutral draft task slices plus behavioral specs with evidence references, not replacement code.
-tools: Read, Write, Edit, Glob, Grep
+tools: Read, Write, Edit, Glob, Grep, view_image
 ---
 # Contaminated Source Analyst
@@ -19,6 +19,7 @@ Before reading source, verify that Agent 0 provided:
 - active `task-manifest.json` with `preflight_goal_ref` and `preflight_goal_sha256`
 - one assigned `unit_id`
 - authorized `source_index_refs`, when used
+- authorized `visual_index_refs`, when visual fallback is used
 - evidence handling policy
 - target stack and compatibility policy from preflight
 - neutral sanitizer brief requirements
@@ -28,21 +29,27 @@ Do not infer target language, dependency policy, license policy, or exactness po
 Responsibilities:
-- Read the minimum source needed for the assigned unit.
+- Read the bounded source needed to fully inventory the assigned unit's observable surface. Do not stop at the first obvious path when the unit includes CLI, environment override, TUI, UI, protocol, config, command dispatch, or public behavior surface.
 - When `CLEAN_ROOM_SESSION_BRIEF_PATH` is set, read it first and load only the allowed artifact refs named there, except for direct source reads already permitted by the assigned unit and role policy.
 - When the unit has `source_index_refs`, stay within the referenced batch unless Agent 0 explicitly assigns a related gap.
+- When the unit has `visual_index_refs`, use `view_image` only in this contaminated role and stay within the referenced visual batch unless Agent 0 explicitly assigns a related gap.
 - Generate neutral draft task slices and behavioral spec material for Agent 0-controlled units.
 - Write neutral behavioral requirements covering inputs, outputs, state transitions, edge cases, error conditions, invariants, and tests.
+- For a `unit_kind: "foundation"` assignment, inventory target stack, package or module boundaries, public manifest surfaces, test entrypoints, dependency policy, and destination constraints. Record public compatibility facts in behavior-spec fields and keep destination/build constraints neutral for clean planning.
+- When relevant to the assigned unit, locate and account for every observable CLI argument, flag, environment variable override, TUI command, keyboard shortcut, menu state, associated UI element, view state, accessibility expectation, config key, protocol entry point, and public user-visible behavior.
+- If you detect related files, modules, visual components, or public surfaces that are inside authorized scope but outside the assigned refs or too large to analyze in the current context, record contaminated `coverage-ledger.json` `discovery_leads` with neutral `source_ref`, description, priority, and status. Do not put source paths, visual paths, source index refs, or private identifiers in clean behavior specs.
+- For visual fallback units, write UI behavior/spec claims about intent, screen states, hierarchy, accessibility expectations, interaction purpose, and broad style goals. Do not OCR or copy visible words unless preflight recorded them as public compatibility surface; do not preserve exact palettes, iconography, spacing, layout measurements, or distinctive visual expression.
 - Treat discovered source tests as behavioral evidence and convert them into clean `test_scenarios` that validate the same observable outputs.
 - Record equal-output expectations for public return values, serialized data, CLI or API responses, errors, state changes, ordering, and compatibility-relevant side effects.
 - Use `evidence_refs` that point to contaminated-side ledger entries instead of including source text.
 - Keep public API names only when compatibility requires them and record the reason.
 - Capture public API, protocol, config, and data/schema compatibility using existing behavior spec fields.
+- Do not mirror source dependency lists, package manifests, or private module layout. Mention a dependency only when it is public compatibility surface, destination evidence, or explicitly allowed by preflight policy.
 - For behavior-compatible ports, extract compatibility-critical invariants into `invariants`, `compatibility_notes`, and `test_scenarios`; broad module coverage is not enough.
 - When present, treat protocol transcript shape, request/response ID pairing, error budgets, streaming order, queue bounds, sampling registry aliases, async behavior, and typed JSON argument preservation as first-class observable behavior.
 - Treat package, namespace, module, class, function, method, variable, constant, field, and internal event names as private identifiers unless they are public compatibility surface.
 - Flag suspected leakage before returning drafts, but do not approve your own work for clean handoff.
-Never produce implementation code, copied comments, source excerpts, raw diffs, source test names, fixture structure, private helper names, or source-shaped pseudocode.
+Never produce implementation code, copied comments, source excerpts, raw diffs, raw screenshots, visual paths, image hashes, copied visible text, exact UI palettes/layouts/iconography, source test names, fixture structure, private helper names, or source-shaped pseudocode.
 Agent 1.5 owns independent sanitization and leakage pass/fail review from a fresh source-denied context.

package/bin/verify.sh CHANGED

@@ -49,6 +49,7 @@ echo "Compiling Python hooks and scripts..."
 echo "Smoke testing source index CLI..."
 "$python_cmd" skills/clean-room/scripts/build_source_index.py --help >/dev/null
+"$python_cmd" skills/clean-room/scripts/build_visual_index.py --help >/dev/null
 echo "Validating example schemas..."
 for dir in skills/clean-room/examples/minimal-spec-package skills/clean-room/examples/contaminated-side; do

package/docs/ARCHITECTURE.md CHANGED

@@ -19,7 +19,7 @@ The Clean Room workflow acts as an engineering risk-reduction process by establi
 To maintain compliance and mitigate leakage risks, the workflow utilizes strictly separated workspaces, worktrees, repositories, or profiles for contaminated and clean work:
 *   **Contaminated Source Workspace**: Source-readable, read-only where practical. Contains the codebase under analysis.
-*   **Contaminated Artifact Workspace**: Holds intermediate outputs like preflight goals, init configs, source indexes, task manifests, controller status, coverage ledgers, evidence ledgers, draft specs, contaminated-role session briefs, and abstract delta tickets. Configure via `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
+*   **Contaminated Artifact Workspace**: Holds intermediate outputs like preflight goals, init configs, source indexes, visual indexes, task manifests, controller status, coverage ledgers, evidence ledgers, draft specs, contaminated-role session briefs, and abstract delta tickets. Configure via `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
 *   **Clean Artifact Workspace**: Houses sanitized clean run contexts, approved behavioral specifications, handoff packages, clean-role session briefs, skeleton manifests, implementation plans, implementation reports, QC reports, and test plans. Configure via `CLEAN_ROOM_CLEAN_ROOTS`.
 *   **Clean Implementation Workspace**: Houses clean destination code and tests. Configure via `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
 *   **Clean Allowed Reference Workspace**: Public documentation, specifications, or destination constraints explicitly approved for clean and source-denied role reads. Configure via `CLEAN_ROOM_ALLOWED_READ_ROOTS`.
@@ -41,17 +41,18 @@ The initialization wizard and `require-clean-room-env.py` audit clean, implement
 ![Stage 0 Goal Contract](assets/3.png)
-Every new run starts with `preflight-goal.json` before source discovery, source indexing, Agent 0 decomposition, attended execution, or unattended execution. The contract records end goal, target stack, license policy, dependency policy, compatibility exactness, feature changes, code hygiene, output policy, controller mode, and open questions.
+Every new run starts with `preflight-goal.json` before source discovery, source indexing, visual indexing, Agent 0 decomposition, attended execution, or unattended execution. The contract records end goal, target stack, license policy, dependency policy, compatibility exactness, feature changes, code hygiene, output policy, controller mode, and open questions.
 `preflight-goal.json` is controller/contaminated-side only. Clean roles receive only clean-safe `goal_contract` fields and `code_hygiene_policy` through `clean-run-context.json`.
 ### Contaminated Source-Index Preflight Tooling
-To assist in logical unit decomposition, the workflow supports an optional source-index preflight stage using `build_source_index.py` and `clean_room_tool_manager.py`.
+To assist in logical unit decomposition, the workflow supports an optional source-index preflight stage using `build_source_index.py` and `clean_room_tool_manager.py`. When no indexable source code exists and screenshots/images are the authorized evidence, the workflow supports a fallback visual-index preflight stage using `build_visual_index.py`.
 *   **Execution Boundary**: This tooling runs exclusively in the contaminated domain before clean-room role sessions are initialized.
 *   **Traversal Bounds**: Source indexing enforces file count, per-file byte, total byte, batch token, and segment caps. It validates file size again after reading, skips files that change during read, records directory walk errors, and prunes traversal after global limits are exhausted with an aggregate skipped entry.
-*   **Agent 0 Use**: Agent 0 consumes `source-index.json` only to create neutral `task-manifest.json` units and per-unit `source_index_refs`. The index stays contaminated-only and does not cross to Agent 1.5, Agent 2, Agent 3, Agent 4, or clean handoff packages.
+*   **Agent 0 Use**: Agent 0 consumes `source-index.json` only to create neutral `task-manifest.json` units and per-unit `source_index_refs`. In visual fallback runs, Agent 0 consumes `visual-index.json` only to create neutral units and per-unit `visual_index_refs`. Both indexes stay contaminated-only and do not cross to Agent 1.5, Agent 2, Agent 3, Agent 4, or clean handoff packages.
+*   **Discovery Leads**: When Agent 1 detects an authorized related surface that cannot be analyzed inside the assigned unit, Agent 0 tracks it in contaminated `coverage-ledger.json` `discovery_leads`. High-priority leads must be resolved before the unit can be marked covered; the runner does not expand approved scope automatically.
 *   **Tool Trust Policy**: By default, tool discovery operates in `stat-only` mode and does not execute third-party binaries. It queries version strings only when explicitly invoked with `--probe-tools`. Tools discovered under `/opt/homebrew` or `/usr/local` remain stat-only unless `--allow-user-toolchain-probes` is also supplied. Project-local directories (such as `.bin` or `node_modules/.bin`) are ignored unless the environment variable `RE_SKILLS_TRUST_PROJECT_TOOLS=1` or the flag `--allow-working-project-tools` is supplied.
 *   **Local Tool Install Safety**: Explicit npm-backed helper installs are strict-version pinned and serialized with a cache-local lock before mutating `~/.cache/re-skills/clean-room-tools/npm`. Prefix creation failures, subprocess timeouts, and subprocess launch errors are returned as structured JSON facts instead of raw tracebacks.
@@ -78,7 +79,7 @@ flowchart LR
     sanitizer["Agent 1.5: contaminated-handoff-sanitizer<br/>Source-denied, scrub identifying material"]
     brief["Neutral sanitizer brief<br/>domain, target profile, unit intent,<br/>public allowlist, blocked categories"]
     preflight["preflight-goal.json<br/>goal, stack, policy, hygiene"]
-    ledgers["Contaminated artifacts<br/>CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS<br/>init-config.json<br/>source-index.json<br/>task-manifest.json<br/>coverage-ledger.json<br/>evidence-ledger.json"]
+    ledgers["Contaminated artifacts<br/>CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS<br/>init-config.json<br/>source-index.json<br/>visual-index.json<br/>task-manifest.json<br/>coverage-ledger.json<br/>evidence-ledger.json"]
     drafts["Agent 1 draft specs<br/>assigned paths only for Agent 1.5"]
     staged["Sanitized handoff candidates<br/>Agent 1.5-reviewed behavior-spec.json"]
   end
@@ -95,8 +96,8 @@ flowchart LR
     architect["Agent 2: clean-architect<br/>Plan implementation from clean specs and foundation"]
     qa["Agent 3: clean-qa-editor<br/>Implement, record verification, terminal report"]
     polish["Agent 4: clean-polish-reviewer<br/>Final code polish, repo hygiene, local commit"]
-    outputs["Clean artifacts<br/>implementation-plan.json<br/>qc-report.json<br/>test plan notes"]
-    imploutputs["Implementation outputs<br/>code, tests, AGENTS.md, .gitignore<br/>implementation-report.json<br/>polish-report.json"]
+    outputs["Clean artifacts<br/>implementation-plan.json<br/>implementation-report.json<br/>qc-report.json<br/>polish-report.json<br/>test plan notes"]
+    imploutputs["Implementation outputs<br/>code, tests, fixtures, AGENTS.md, .gitignore"]
   end
   subgraph guardrails["Guardrails and audit"]
@@ -140,10 +141,10 @@ flowchart LR
   env -. required for every role session .-> architect
   denyread -. clean and source-denied roles cannot read source roots .-> cleanroots
   denyread -. clean roles may read implementation roots .-> implroots
-  denyread -. Agent 1.5 cannot read source roots, clean roots, implementation roots, source-index.json, or preflight-goal.json .-> sanitizer
+  denyread -. Agent 1.5 cannot read source/visual roots, clean roots, implementation roots, source-index.json, visual-index.json, or preflight-goal.json .-> sanitizer
   denywrite -. contaminated writes only to contaminated artifact roots .-> ledgers
-  denywrite -. Agent 2 writes clean artifacts only; Agents 3 and 4 write implementation roots .-> cleanroots
-  denywrite -. Agents 3 and 4 write code, tests, docs, and repo hygiene only here .-> implroots
+  denywrite -. Agent 2 writes clean artifacts; Agents 3 and 4 write clean reports .-> cleanroots
+  denywrite -. Agents 3 and 4 write destination files only here; no clean-room artifact JSON .-> implroots
   denyshell -. no shell-style tools in role sessions .-> manager
   denyshell -. no shell for Agent 2; explicit Agent 3 and Agent 4 runners only .-> architect
   scan -. post-write checks .-> outputs
@@ -177,6 +178,7 @@ The architecture delegates work across six distinct custom role agents to enforc
     *   Produces `clean-run-context.json` for Agent 2, Agent 3, and Agent 4 instead of handing over the full `task-manifest.json` or full `preflight-goal.json`.
     *   Influences Agent 2, Agent 3, and Agent 4 only through durable sanitized artifacts, never direct chat, progress feedback, implementation hints, or priority changes.
     *   Performs final verification of clean specification and implementation coverage against the source scope.
+    *   Blocks handoff or coverage completion when high-priority contaminated discovery leads remain unresolved.
     *   Writes the inner-loop `clean-room-result.json` only after contaminated-side coverage verification.
     *   Consumes Agent 3 reports only after Agent 3 reaches a terminal state, and consumes Agent 4 reports only after the configured polish review reaches a terminal state, then sends only abstract delta tickets into a fresh clean artifact cycle.
@@ -187,6 +189,8 @@ The architecture delegates work across six distinct custom role agents to enforc
     *   Analyzes the authorized source code within assigned units or batches.
     *   Uses target stack and compatibility policy from preflight instead of inferring product goals from source.
     *   Writes neutral draft behavioral specifications based on observed behavior, public contracts, invariants, state transitions, and errors.
+    *   Inventories the assigned unit's observable CLI, env, TUI, UI, protocol, config, command, and public behavior surfaces when relevant.
+    *   Records authorized related surfaces that cannot be analyzed in the assigned context as contaminated `discovery_leads`, not clean spec fields.
     *   Generates evidence references pointing to contaminated ledgers instead of copying raw source code or comments.
     *   Flags suspected leakage but does not approve its own work for clean handoff.
@@ -223,7 +227,7 @@ The architecture delegates work across six distinct custom role agents to enforc
     *   Records code hygiene violations as `code-hygiene` findings in `qc-report.json`.
     *   Writes code, tests, fixtures, and destination project files only under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
     *   Runs bounded verification only through the installed Agent 3 verification runner, with `CLEAN_ROOM_ALLOW_AGENT3_SHELL=1`, strict hooks, and cwd under implementation roots.
-    *   Writes `implementation-report.json` and maintains `qc-report.json`.
+    *   Writes `CLEAN_ROOM_CLEAN_ROOTS/implementation-report.json` and maintains `CLEAN_ROOM_CLEAN_ROOTS/qc-report.json`.
     *   Does not report progress or ask Agent 0 for guidance during implementation.
     *   Emits one terminal report for Agent 0 only when the assigned spec slice is complete, blocked, or quarantined.
@@ -234,7 +238,7 @@ The architecture delegates work across six distinct custom role agents to enforc
     *   Reviews final clean implementation for security, docs/comments, exception handling, resource leaks, race conditions, missing tests, and repository hygiene.
     *   Creates or updates implementation-root `AGENTS.md` with gotchas and build/test/dev commands discovered from clean implementation files.
     *   Updates `.gitignore` only for real generated outputs, dependencies, caches, or build/test artifacts.
-    *   Writes `polish-report.json`.
+    *   Writes `CLEAN_ROOM_CLEAN_ROOTS/polish-report.json`.
     *   Uses `agent4-polish-runner.py` only with `CLEAN_ROOM_ALLOW_AGENT4_SHELL=1`, cwd under implementation roots, and strict hooks.
     *   May initialize git and create one local commit containing only paths listed in `polish-report.json`; it must not push, tag, reset, clean, or delete branches.
@@ -251,14 +255,16 @@ Agent 3's terminal report is not enough to return. If configured, Agent 4 must p
 *   Locks the contaminated artifact root with `.clean-room-run.lock`.
 *   Reloads durable artifacts before each iteration.
 *   Selects at most one pending or gap unit inside `loop_context.approved_scope_refs`.
+*   Requires exactly one `unit_kind: "foundation"` unit, named by `loop_context.foundation_unit_ref`; behavior units cannot run or complete until that foundation unit is covered.
 *   Spawns configured role commands with `shell: false`, bounded output, and bounded timeout.
 *   In strict context-management mode, requires each configured stage to provide `context.fresh_session: true` and `context.brief_path`, then validates the session brief before spawn.
 *   Supports the optional `clean-polish-review` phase between `clean-implement-qc` and `contaminated-coverage-verify`.
 *   Validates schema, leakage, and handoff integrity before advancing state.
+*   Rejects `covered` coverage-ledger units that still have unresolved high-priority `discovery_leads`.
 *   Records controller memory in contaminated-side `controller-run-ledger.json`.
 *   Writes `clean-room-result.json` before returning to the outer spec loop.
-Progress is durable-artifact based. `clean-room-skill run` compares semantic JSON artifact hashes that ignore volatile timestamp and artifact-hash fields, plus raw file hashes under implementation roots. Chat output, timestamp-only artifact churn, and `controller-status.json` updates alone do not count as progress.
+Progress is durable-artifact based. `clean-room-skill run` compares semantic JSON artifact hashes that ignore volatile timestamp and artifact-hash fields, plus raw file hashes under implementation roots while ignoring generated directories such as `target/`. Chat output, timestamp-only artifact churn, Cargo build metadata, and `controller-status.json` updates alone do not count as progress.
 ---
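
The progress rule above (semantic JSON hashes that ignore volatile timestamp and artifact-hash fields) can be sketched as follows. This is a rough illustration in Python, not the logic in `lib/run.cjs`. The key names in `VOLATILE_KEYS` are hypothetical, since the diff does not name the ignored fields.

```python
import hashlib
import json
from typing import Any

# Hypothetical key names: the docs only say that timestamp and
# artifact-hash fields are ignored when comparing artifacts.
VOLATILE_KEYS = frozenset({"generated_at", "updated_at", "artifact_sha256"})


def strip_volatile(value: Any) -> Any:
    """Recursively drop volatile keys from dicts, including dicts inside lists."""
    if isinstance(value, dict):
        return {k: strip_volatile(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [strip_volatile(item) for item in value]
    return value


def semantic_hash(artifact: Any) -> str:
    """Hash a canonical JSON form so key order and volatile fields do not matter."""
    canonical = json.dumps(strip_volatile(artifact), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


before = {"unit_id": "u1", "coverage_state": "gap", "updated_at": "2024-01-01T00:00:00Z"}
churn = {"coverage_state": "gap", "unit_id": "u1", "updated_at": "2024-02-02T00:00:00Z"}
after = {"unit_id": "u1", "coverage_state": "covered", "updated_at": "2024-02-02T00:00:00Z"}

print(semantic_hash(before) == semantic_hash(churn))  # prints True
print(semantic_hash(before) == semantic_hash(after))  # prints False
```

Under these assumptions a timestamp-only rewrite leaves the hash unchanged, so it would not count as progress, while a real state change does.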
@@ -270,7 +276,7 @@ Every clean-room role session requires a populated environment block before any
 *   `CLEAN_ROOM_SOURCE_ROOTS`: Source roots (only readable by source-reading contaminated roles, not Agent 1.5).
 *   `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`: Target write directory for contaminated roles.
 *   `CLEAN_ROOM_CLEAN_ROOTS`: Target write directory for clean artifacts and reports.
-*   `CLEAN_ROOM_IMPLEMENTATION_ROOTS`: Target write directory for Agent 3 clean implementation code and tests, plus Agent 4 implementation-root hygiene changes and local git metadata.
+*   `CLEAN_ROOM_IMPLEMENTATION_ROOTS`: Target write directory for Agent 3 clean implementation code, tests, fixtures, real destination project files, plus Agent 4 implementation-root hygiene changes and local git metadata. Clean-room artifact JSON files stay out of this root.
 *   `CLEAN_ROOM_ALLOWED_READ_ROOTS`: Approved reference docs or constraints readable by clean and source-denied roles.
 *   `CLEAN_ROOM_SCHEMA_DIR`: Path to the directory containing JSON schema assets.
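
As a hedged illustration of the environment block listed above, the sketch below checks for missing variables and for overlapping trust-domain roots, two of the conditions `require-clean-room-env.py` is described as failing closed on. It is not that hook's code. The choice of which four variables count as trust-domain roots, the path-separator convention, the role value in the example, and all function names are assumptions. The role-specific `CLEAN_ROOM_ALLOWED_READ_ROOTS` requirement is left out for brevity.

```python
import os
from itertools import combinations
from pathlib import PurePosixPath
from typing import Mapping

REQUIRED_VARS = (
    "CLEAN_ROOM_ROLE",
    "CLEAN_ROOM_SOURCE_ROOTS",
    "CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS",
    "CLEAN_ROOM_CLEAN_ROOTS",
    "CLEAN_ROOM_IMPLEMENTATION_ROOTS",
    "CLEAN_ROOM_SCHEMA_DIR",
)
# Assumption: these four variables are the trust-domain roots that must not overlap.
ROOT_VARS = REQUIRED_VARS[1:5]


def missing_vars(env: Mapping[str, str]) -> list[str]:
    """Return required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def split_roots(value: str, sep: str) -> list[PurePosixPath]:
    """Split a separator-joined root list into lexical POSIX paths."""
    return [PurePosixPath(part) for part in value.split(sep) if part]


def paths_overlap(a: PurePosixPath, b: PurePosixPath) -> bool:
    """Two roots overlap when they are equal or one contains the other."""
    return a == b or a in b.parents or b in a.parents


def overlapping_domains(env: Mapping[str, str], sep: str = os.pathsep) -> list[tuple[str, str]]:
    """Return pairs of root variables whose paths overlap."""
    clashes = []
    for first, second in combinations(ROOT_VARS, 2):
        roots_a = split_roots(env.get(first, ""), sep)
        roots_b = split_roots(env.get(second, ""), sep)
        if any(paths_overlap(a, b) for a in roots_a for b in roots_b):
            clashes.append((first, second))
    return clashes


example_env = {
    "CLEAN_ROOM_ROLE": "clean-architect",  # placeholder role value
    "CLEAN_ROOM_SOURCE_ROOTS": "/work/source",
    "CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS": "/work/contaminated",
    "CLEAN_ROOM_CLEAN_ROOTS": "/work/source/clean",  # nested in a source root
    "CLEAN_ROOM_IMPLEMENTATION_ROOTS": "/work/impl",
}

print(missing_vars(example_env))
print(overlapping_domains(example_env, sep=":"))
```

The comparison here is purely lexical. A real check would likely resolve symlinks and relative segments before comparing roots.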
@@ -293,11 +299,11 @@ Post-write hook failures are deny-by-default and redacted. If an artifact disapp
 *   [agent4-polish-runner.py](../hooks/agent4-polish-runner.py): Runs Agent 4 bounded status, verification, git init, staging, and one local commit from implementation roots only, using paths and policy recorded in `polish-report.json`.
 *   [require-clean-room-env.py](../hooks/require-clean-room-env.py): Fails closed if the required role and root environment variables are missing, if trust-domain roots overlap, or if clean, implementation, or contaminated artifact root names appear source-derived.
 *   [deny-clean-room-shell.py](../hooks/deny-clean-room-shell.py): Denies shell-style tool execution inside clean-room role sessions except installed Agent 3 verification-runner invocations under implementation roots and installed Agent 4 polish-runner invocations under implementation roots.
-*   [deny-clean-source-read.py](../hooks/deny-clean-source-read.py): Enforces that clean roles and Agent 1.5 cannot read source roots or unapproved paths; clean roles may read implementation roots, and source-denied roles are denied direct `preflight-goal.json` reads. Agent 1.5 is also denied clean roots, implementation roots, and direct `source-index.json` reads.
-*   [deny-contaminated-clean-write.py](../hooks/deny-contaminated-clean-write.py): Enforces role write roots. Agent 2 writes clean artifacts only, Agent 3 writes implementation files and clean reports, Agent 4 writes clean polish reports and implementation-root polish changes, and contaminated roles write only to `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`.
+*   [deny-clean-source-read.py](../hooks/deny-clean-source-read.py): Enforces that clean roles and Agent 1.5 cannot read source or visual roots or unapproved paths; clean roles may read implementation roots, and source-denied roles are denied direct `preflight-goal.json` reads. Agent 1.5 is also denied clean roots, implementation roots, and direct `source-index.json` or `visual-index.json` reads.
+*   [deny-contaminated-clean-write.py](../hooks/deny-contaminated-clean-write.py): Enforces role write roots. Agent 2 writes clean artifacts only, Agent 3 writes implementation files and clean reports, Agent 4 writes clean polish reports and implementation-root polish changes, contaminated roles write only to `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, and clean-room artifact JSON files are denied under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
 *   [check-artifact-leakage.py](../hooks/check-artifact-leakage.py): Scans clean artifacts and Agent 1.5 staged contaminated artifacts for high-risk leakage markers, source-like identifiers, and private identifier denylist terms. The private identifier denylist (loaded via `CLEAN_ROOM_PRIVATE_IDENTIFIER_DENYLIST`) is subject to hard limits to protect hook execution performance: a maximum of 1,000,000 bytes per file, 20,000 total terms, and 512 characters per individual term.
 *   [validate-json-schema.py](../hooks/validate-json-schema.py): Verifies JSON syntax and structural conformance against schemas under `CLEAN_ROOM_SCHEMA_DIR`, including controller-side `preflight-goal.schema.json` and `init-config.schema.json`. Under clean roots, any unrecognized JSON files that do not conform to canonical schemas will trigger a failure unless they are explicitly registered in the path-separated `CLEAN_ROOM_AUXILIARY_JSON_ALLOWLIST` environment variable.
-*   [validate-handoff-package.py](../hooks/validate-handoff-package.py): Verifies that handoff packages stay within clean roots, do not reference contaminated paths, `task-manifest.json`, `preflight-goal.json`, or `source-index.json`, and match declared `sha256` checksums.
+*   [validate-handoff-package.py](../hooks/validate-handoff-package.py): Verifies that handoff packages stay within clean roots, do not reference contaminated paths, `task-manifest.json`, `preflight-goal.json`, `source-index.json`, or `visual-index.json`, and match declared `sha256` checksums.
 For detailed guidelines on the clean-room process, refer to:
 *   [CONTROLLER-LOOP.md](../skills/clean-room/references/CONTROLLER-LOOP.md)
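
The denylist limits quoted for `check-artifact-leakage.py` (1,000,000 bytes per file, 20,000 total terms, 512 characters per term) can be sketched as a loader. This is illustrative only. The one-term-per-line file format, the choice to raise instead of truncating, and the names `DenylistLimitError` and `load_denylist_terms` are assumptions, not the hook's documented behavior.

```python
MAX_FILE_BYTES = 1_000_000
MAX_TOTAL_TERMS = 20_000
MAX_TERM_CHARS = 512


class DenylistLimitError(ValueError):
    """Raised when a denylist exceeds a hard limit (hypothetical error type)."""


def load_denylist_terms(blobs: list[bytes]) -> set[str]:
    """Collect unique terms from raw denylist files, enforcing all three limits.

    Error messages never include term contents, since the denylist is private.
    """
    terms: set[str] = set()
    for raw in blobs:
        if len(raw) > MAX_FILE_BYTES:
            raise DenylistLimitError("denylist file exceeds byte limit")
        # Assumed format: UTF-8 text with one term per line.
        for line in raw.decode("utf-8").splitlines():
            term = line.strip()
            if not term:
                continue
            if len(term) > MAX_TERM_CHARS:
                raise DenylistLimitError("denylist term exceeds length limit")
            terms.add(term)
            if len(terms) > MAX_TOTAL_TERMS:
                raise DenylistLimitError("denylist exceeds total term limit")
    return terms


print(sorted(load_denylist_terms([b"alpha\n\nbeta\n", b"beta\ngamma\n"])))
# prints ['alpha', 'beta', 'gamma']
```

Checking the byte size before decoding keeps an oversized file from being parsed at all.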

package/docs/REFERENCE.md CHANGED

@@ -210,7 +210,11 @@ Options:
 | `--schema-dir <path>` | Override bundled schema directory. |
 | `--python <path>` | Python executable for validation hooks; default is `python3`. |
-The task manifest must already include preflight references, the required handoff sequence, unattended controller policy, finite iteration bounds, and `loop_context.approved_scope_refs`.
+The task manifest must already include preflight references, the required handoff sequence, unattended controller policy, finite iteration bounds, `loop_context.foundation_unit_ref`, and `loop_context.approved_scope_refs`.
+Unattended code-development manifests must include exactly one `unit_kind: "foundation"` unit. The runner rejects non-foundation approved slices until that unit is covered.
+`coverage-ledger.json` may record contaminated-only `source_units[].discovery_leads` for authorized related surfaces that were detected but not analyzed in the assigned unit. The runner rejects a `covered` unit while any high-priority discovery lead remains open or deferred. It does not add follow-up units or expand `loop_context.approved_scope_refs`; Agent 0 must return an abstract delta, mark coverage partial or blocked, or pause for attended approval.
 Minimal agent command adapter shape for advisory or disabled context management:
@@ -310,7 +314,7 @@ Strict context-management adapter example:
 }
 ```
-Relative `context.brief_path` values resolve relative to the `agent-commands.json` directory. Contaminated phases must point to briefs under the contaminated artifact root. Clean phases must point to briefs under the clean artifact root. A clean-stage brief may reference allowed clean artifacts, implementation artifacts, and approved public references, but not source indexes, contaminated ledgers, full manifests, controller status, or prior chat state.
+Relative `context.brief_path` values resolve relative to the `agent-commands.json` directory. Contaminated phases must point to briefs under the contaminated artifact root. Clean phases must point to briefs under the clean artifact root. A clean-stage brief may reference allowed clean artifacts, implementation artifacts, and approved public references, but not source indexes, visual indexes, raw screenshots, contaminated ledgers, full manifests, controller status, or prior chat state.
 The runner exports `CLEAN_ROOM_SESSION_BRIEF_PATH`, `CLEAN_ROOM_ROLE_SESSION_ID`, and `CLEAN_ROOM_FRESH_CONTEXT_REQUIRED=1` for strict stages. The adapter still owns the actual fresh-context behavior: it must open a new model session, profile, or thread for that stage. Setting `fresh_session` while reusing one long chat is not a clean-room boundary.
@@ -323,12 +327,14 @@ The runner exports `CLEAN_ROOM_SESSION_BRIEF_PATH`, `CLEAN_ROOM_ROLE_SESSION_ID`
 | `install lock is held` | Another install or uninstall is mutating the same target root | Wait for the other process to finish; stale locks are handled conservatively. |
 | Hook config write failed after files copied | Partial installer state | Fix the filesystem error, then re-run the same installer command. |
 | Install manifest remains `installing` | The previous install did not complete | Re-run the same installer command for that runtime and target root. |
-| `clean-room run` rejects the manifest | Invalid or incomplete unattended loop metadata | Fix `controller_policy`, `loop_context`, and `approved_scope_refs`, then retry `--dry-run`. |
+| `clean-room run` rejects the manifest | Invalid or incomplete unattended loop metadata | Fix `controller_policy`, `loop_context.foundation_unit_ref`, and `approved_scope_refs`, then retry `--dry-run`. |
+| `clean-room run` rejects a covered unit with `discovery_leads` | A high-priority contaminated discovery lead is still unresolved | Analyze the lead in an authorized follow-up unit, mark it resolved, or keep coverage partial/blocked and return an abstract delta. |
 | `clean-room run` rejects an agent command stage in strict context mode | The stage is missing `context.fresh_session: true`, missing `context.brief_path`, or points the brief outside the allowed artifact root | Fix the stage context and regenerate the role-session brief for the selected unit. |
 | `clean-room run` reports no progress | Configured stages exited without durable artifact changes | Check role command cwd/argv, selected unit, and artifact write roots. |
 | `clean-room run` reports repeated unit selection | Same unit selected after a no-progress iteration | Resolve the blocker or update durable artifacts before retrying. |
 | Hook reports `could not read` or `could not stat` | Artifact disappeared, permissions changed, or path was replaced during validation | Restore readable artifact state and retry. |
 | `source-index.json` is missing files | Limits, unreadable directories, ignored directories, binary files, changed files, or outside-root symlinks | Inspect `skipped_entries` and adjust limits or permissions if omissions matter. |
+| `visual-index.json` is missing screenshots | Limits, unsupported formats, unreadable directories, changed files, invalid image headers, or outside-root symlinks | Inspect `skipped_entries`, keep visual roots in the contaminated/source domain, and rerun `build_visual_index.py` only as fallback evidence preflight. |
 ## Local Verification

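The discovery-lead rule added to REFERENCE.md above (a `covered` unit is rejected while any high-priority discovery lead remains open or deferred) can be sketched as a ledger check. This is a minimal sketch under stated assumptions: only `source_units[].discovery_leads` appears in the diff; the field names `unit_id`, `status`, `priority`, and `lead_id`, and their string values, are hypothetical.

```python
# Illustrative sketch; not the runner's actual validation code.
# Field names other than source_units / discovery_leads are assumptions.
UNRESOLVED_STATUSES = {"open", "deferred"}


def coverage_violations(ledger: dict) -> list[str]:
    """Return one message per covered unit that still has an unresolved
    high-priority discovery lead."""
    violations: list[str] = []
    for unit in ledger.get("source_units", []):
        if unit.get("status") != "covered":
            continue  # partial or blocked units may keep open leads
        for lead in unit.get("discovery_leads", []):
            if lead.get("priority") == "high" and lead.get("status") in UNRESOLVED_STATUSES:
                violations.append(
                    f"unit {unit.get('unit_id')}: lead {lead.get('lead_id')} is {lead.get('status')}"
                )
    return violations
```

In line with the documented behavior, the check only reports violations. It does not add follow-up units or widen `loop_context.approved_scope_refs`.
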
package/examples/codex/.codex/agents/clean-architect.toml CHANGED

@@ -10,14 +10,14 @@ Run only from the clean workspace.
 Before tool use, require CLEAN_ROOM_ROLE=clean-architect, CLEAN_ROOM_CLEAN_ROOTS, CLEAN_ROOM_IMPLEMENTATION_ROOTS, CLEAN_ROOM_SOURCE_ROOTS, CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS, CLEAN_ROOM_ALLOWED_READ_ROOTS, and CLEAN_ROOM_SCHEMA_DIR.
 Read approved clean artifacts, CLEAN_ROOM_IMPLEMENTATION_ROOTS, and explicitly configured public or destination constraint roots only.
 Write only under CLEAN_ROOM_CLEAN_ROOTS. Do not write code.
-Do not read source workspaces, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
+Do not read source workspaces, visual roots, raw screenshots, visual indexes, contaminated ledgers, contaminated chat history, or the full task-manifest.json.
 Stop if only a full task-manifest.json is provided as run context.
-Before planning, require valid clean-run-context.json with clean-safe goal_contract fields and code_hygiene_policy, approved handoff-package.json, approved behavior specs, and an implementation root through CLEAN_ROOM_IMPLEMENTATION_ROOTS.
+Before planning, require valid clean-run-context.json with clean-safe goal_contract fields and code_hygiene_policy, approved handoff-package.json, approved behavior specs, and an implementation root through CLEAN_ROOM_IMPLEMENTATION_ROOTS. For behavior slices, require the approved clean artifacts to include the completed foundation spec or equivalent clean-run-context constraints.
 When CLEAN_ROOM_SESSION_BRIEF_PATH is set, read it first and load only the allowed artifact refs named there, plus destination foundation reads permitted by this role.
-Stop if full preflight-goal.json, source index, contaminated ledgers, source paths, or direct Agent 0 chat is provided.
+Stop if full preflight-goal.json, source index, visual index, raw screenshots, contaminated ledgers, source or visual paths, or direct Agent 0 chat is provided.
 Accept Agent 0 influence only as durable sanitized artifacts. Ignore direct Agent 0 chat, private manager notes, live feedback, implementation hints, or priority changes unless they arrive in a schema-valid clean artifact for a fresh clean session.
 Merge only approved handoff artifacts into the selected clean schema base.
-Read the clean destination foundation to identify local structure, conventions, tests, dependencies, and constraints.
+Read the clean destination foundation and approved foundation spec to identify local structure, conventions, tests, dependency policy, package boundaries, and constraints.
 Read any existing skeleton-manifest.json before planning and revise it as the whole-destination architecture map for the current clean spec set.
 Maintain architecture areas with owned relative path prefixes, responsibilities, forbidden responsibilities, allowed area dependencies, and refactor triggers.
 Assign every target and test path in implementation-plan.json to one or more skeleton-manifest.json architecture areas.
@@ -26,6 +26,8 @@ Create or update implementation-plan.json as the primary output for code-develop
 Carry the preflight-derived code hygiene policy into implementation-plan.json.
 Keep skeleton-manifest.json valid and current for code-development runs. Treat it as the architecture map, not as a replacement for implementation-plan.json.
 Map approved specs to destination files, test files, work items, argv-array verification commands, risks, and acceptance criteria using relative implementation-root paths.
+Map every exact-public-contract or behavior-compatible public surface obligation to at least one implementation-plan.json work item through public_contract_refs; do not replace a public command/API inventory with one generic dispatch work item unless every obligation ref is listed.
+Do not choose dependencies by copying source manifests. Add or preserve dependencies only when clean artifacts, destination evidence, or preflight policy justify them.
 Preserve source-test-derived scenarios as clean test obligations for equal output without copying source test structure.
 Do not resolve public-contract, callable, protocol, async, serialization, or data-shape ambiguity by narrowing semantics. Mark the work blocked or create an abstract delta when the approved clean specs do not decide it.
 Stop if contaminated material appears in clean inputs.
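
The new instruction above, that every public surface obligation must map to at least one `implementation-plan.json` work item through `public_contract_refs`, can be checked mechanically. The sketch below is an assumption-laden illustration: `public_contract_refs` comes from the diff, while the `work_items` key, the shape of the obligation refs, and the function name are hypothetical.

```python
# Illustrative sketch; the plan layout is assumed, not taken from the schema.
def uncovered_contract_refs(obligation_refs: list[str], plan: dict) -> list[str]:
    """Return the obligation refs that no work item lists in public_contract_refs."""
    covered: set[str] = set()
    for item in plan.get("work_items", []):
        covered.update(item.get("public_contract_refs", []))
    # Preserve the input order so the report is stable.
    return [ref for ref in obligation_refs if ref not in covered]
```

A single generic dispatch work item passes this check only if it lists every obligation ref, which matches the wording of the instruction.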