clean-room-skill 0.1.15 → 0.2.0

@@ -9,7 +9,7 @@
9
9
  "name": "clean-room",
10
10
  "source": "./",
11
11
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
12
- "version": "0.1.15",
12
+ "version": "0.2.0",
13
13
  "author": {
14
14
  "name": "whit3rabbit"
15
15
  },
@@ -2,7 +2,7 @@
2
2
  "name": "clean-room",
3
3
  "displayName": "Clean Room",
4
4
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
5
- "version": "0.1.15",
5
+ "version": "0.2.0",
6
6
  "author": {
7
7
  "name": "whit3rabbit"
8
8
  },
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "clean-room",
3
- "version": "0.1.15",
3
+ "version": "0.2.0",
4
4
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
5
5
  "author": {
6
6
  "name": "whit3rabbit"
@@ -179,6 +179,7 @@ The architecture delegates work across six distinct custom role agents to enforc
179
179
  * Influences Agent 2, Agent 3, and Agent 4 only through durable sanitized artifacts, never direct chat, progress feedback, implementation hints, or priority changes.
180
180
  * Performs final verification of clean specification and implementation coverage against the source scope.
181
181
  * Blocks handoff or coverage completion when high-priority contaminated discovery leads remain unresolved.
182
+ * Treats completion as deny-by-default unless durable canonical artifacts prove the clean behavior gate.
182
183
  * Writes the inner-loop `clean-room-result.json` only after contaminated-side coverage verification.
183
184
  * Consumes Agent 3 reports only after Agent 3 reaches a terminal state, and consumes Agent 4 reports only after the configured polish review reaches a terminal state, then sends only abstract delta tickets into a fresh clean artifact cycle.
184
185
 
@@ -250,6 +251,8 @@ The outer loop owns spec development: scope, behavior specs, acceptance criteria
250
251
 
251
252
  Agent 3's terminal report is not enough to return. If configured, Agent 4 must produce a passing `polish-report.json`. Agent 0 must then consume the terminal clean reports, verify contaminated-side coverage, and write `clean-room-result.json`.
252
253
 
254
+ Completion is deny-by-default. `task-manifest.json`, `coverage-ledger.json`, and `clean-room-result*.json` writes that claim completion must be backed by durable canonical clean artifacts: a matching clean behavior spec, implementation plan mappings, a terminal implementation report, a passed QC report, valid evidence references, and required public-surface mappings. Synthetic or manual completion summaries are not completion evidence.
255
+
253
256
  `clean-room-skill run` is the executable v1 inner-loop runner. It requires preflight refs, the required handoff sequence, unattended `controller_policy`, schema-valid `loop_context`, and either a user-supplied agent command adapter or the built-in Claude Code agent runtime. It does not automate outer spec development. The runner:
254
257
 
255
258
  * Locks the contaminated artifact root with `.clean-room-run.lock`.
@@ -261,6 +264,7 @@ Agent 3's terminal report is not enough to return. If configured, Agent 4 must p
261
264
  * Supports the optional `clean-polish-review` phase between `clean-implement-qc` and `contaminated-coverage-verify`.
262
265
  * Validates schema, leakage, and handoff integrity before advancing state.
263
266
  * Rejects `covered` coverage-ledger units that still have unresolved high-priority `discovery_leads`.
267
+ * Rejects completion claims that lack canonical clean specs, plans, terminal reports, QC, evidence, or public-surface coverage mappings.
264
268
  * Records controller memory in contaminated-side `controller-run-ledger.json`.
265
269
  * Writes `clean-room-result.json` before returning to the outer spec loop.
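
The deny-by-default completion rule added in these README hunks can be pictured as a gate that starts closed and opens only when every required artifact has been proven. The following is a minimal sketch, not the runner's code: the `CompletionEvidence` class, its field names, and both helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletionEvidence:
    """Hypothetical summary of which canonical artifacts were found for one unit."""

    has_behavior_spec: bool = False
    has_plan_mapping: bool = False
    has_terminal_report: bool = False
    has_passed_qc: bool = False
    has_valid_evidence_refs: bool = False
    has_public_surface_mappings: bool = False


def missing_completion_proof(evidence: CompletionEvidence) -> list[str]:
    """Return the requirements that are not yet proven; empty means the gate opens."""
    checks = {
        "clean behavior spec": evidence.has_behavior_spec,
        "implementation plan mapping": evidence.has_plan_mapping,
        "terminal implementation report": evidence.has_terminal_report,
        "passed QC report": evidence.has_passed_qc,
        "valid evidence references": evidence.has_valid_evidence_refs,
        "public-surface mappings": evidence.has_public_surface_mappings,
    }
    return [name for name, proven in checks.items() if not proven]


def completion_allowed(evidence: CompletionEvidence) -> bool:
    """Deny by default: every flag starts False, so nothing passes without proof."""
    return not missing_completion_proof(evidence)
```

Because every field defaults to `False`, a completion claim with no supporting artifacts is rejected without any special-case handling, which is the property the README describes.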
266
270
 
@@ -302,7 +306,7 @@ Post-write hook failures are deny-by-default and redacted. If an artifact disapp
302
306
  * [deny-clean-source-read.py](../hooks/deny-clean-source-read.py): Enforces that clean roles and Agent 1.5 cannot read source or visual roots or unapproved paths; clean roles may read implementation roots, and source-denied roles are denied direct `preflight-goal.json` reads. Agent 1.5 is also denied clean roots, implementation roots, and direct `source-index.json` or `visual-index.json` reads.
303
307
  * [deny-contaminated-clean-write.py](../hooks/deny-contaminated-clean-write.py): Enforces role write roots. Agent 2 writes clean artifacts only, Agent 3 writes implementation files and clean reports, Agent 4 writes clean polish reports and implementation-root polish changes, contaminated roles write only to `CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS`, and clean-room artifact JSON files are denied under `CLEAN_ROOM_IMPLEMENTATION_ROOTS`.
304
308
  * [check-artifact-leakage.py](../hooks/check-artifact-leakage.py): Scans clean artifacts and Agent 1.5 staged contaminated artifacts for high-risk leakage markers, source-like identifiers, and private identifier denylist terms. The private identifier denylist (loaded via `CLEAN_ROOM_PRIVATE_IDENTIFIER_DENYLIST`) is subject to hard limits to protect hook execution performance: a maximum of 1,000,000 bytes per file, 20,000 total terms, and 512 characters per individual term.
305
- * [validate-json-schema.py](../hooks/validate-json-schema.py): Verifies JSON syntax and structural conformance against schemas under `CLEAN_ROOM_SCHEMA_DIR`, including controller-side `preflight-goal.schema.json` and `init-config.schema.json`. Under clean roots, any unrecognized JSON files that do not conform to canonical schemas will trigger a failure unless they are explicitly registered in the path-separated `CLEAN_ROOM_AUXILIARY_JSON_ALLOWLIST` environment variable.
309
+ * [validate-json-schema.py](../hooks/validate-json-schema.py): Verifies JSON syntax and structural conformance against schemas under `CLEAN_ROOM_SCHEMA_DIR`, including controller-side `preflight-goal.schema.json` and `init-config.schema.json`. Under clean roots, any unrecognized JSON files that do not conform to canonical schemas will trigger a failure unless they are explicitly registered in the path-separated `CLEAN_ROOM_AUXILIARY_JSON_ALLOWLIST` environment variable. Post-write validation also rejects completion claims in `task-manifest.json`, `coverage-ledger.json`, and `clean-room-result*.json` unless durable canonical completion artifacts prove the gate.
306
310
  * [validate-handoff-package.py](../hooks/validate-handoff-package.py): Verifies that handoff packages stay within clean roots, do not reference contaminated paths, `task-manifest.json`, `preflight-goal.json`, `source-index.json`, or `visual-index.json`, and match declared `sha256` checksums.
307
311
 
308
312
  For detailed guidelines on the clean-room process, refer to:
package/docs/HOOKS.md CHANGED
@@ -171,6 +171,8 @@ Post-write JSON artifact validator.
171
171
  - Implements the lightweight schema keywords used by bundled schemas, including object and array constraints, required fields, enum/const, patterns, `format: date-time`, `$ref`, `allOf`, `anyOf`, `oneOf`, and `if`/`then`/`else`.
172
172
  - Adds clean-run-context path checks so clean artifact paths stay relative, do not use `~`, do not contain `..`, and do not resolve into source or contaminated roots.
173
173
  - Requires task manifest handoff stages to match the expected clean-room sequence when validating the task manifest schema.
174
+ - Performs semantic completion validation for `task-manifest.json`, `coverage-ledger.json`, and `clean-room-result*.json` post-write payloads.
175
+ - Rejects completion claims unless canonical durable artifacts prove the gate: matching clean behavior specs, implementation-plan work item mappings, terminal implementation reports, passed QC reports, valid evidence references, and required public-surface coverage mappings.
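
To make "valid evidence references" concrete, here is a condensed sketch modeled on the `validate_evidence_refs` helper that appears in the Python hunk of this diff. The function name `evidence_ref_problems`, the message wording, and the sample identifiers are hypothetical; the `evidence-ledger:` prefix, the `entries`, `evidence_id`, and `source_unit_ref` fields, and the accepted unit ref forms are taken from the diff.

```python
from __future__ import annotations

from typing import Any

EVIDENCE_PREFIX = "evidence-ledger:"


def accepted_unit_refs(unit_id: str) -> set[str]:
    """Ref spellings that count as pointing at the given unit."""
    return {unit_id, f"unit:{unit_id}", f"task-manifest:{unit_id}", f"behavior-spec:{unit_id}"}


def evidence_ref_problems(unit_id: str, refs: Any, ledger: dict[str, Any] | None) -> list[str]:
    """Return human-readable problems; an empty list means the refs look valid."""
    if not isinstance(refs, list) or not refs:
        return [f"no evidence_refs: {unit_id}"]
    entries = {
        entry["evidence_id"]: entry
        for entry in (ledger or {}).get("entries") or []
        if isinstance(entry, dict) and isinstance(entry.get("evidence_id"), str)
    }
    if not entries:
        return [f"evidence ledger missing or empty: {unit_id}"]
    problems: list[str] = []
    for ref in refs:
        is_ledger_ref = isinstance(ref, str) and ref.startswith(EVIDENCE_PREFIX)
        evidence_id = ref.removeprefix(EVIDENCE_PREFIX) if is_ledger_ref else None
        if not evidence_id:
            continue  # refs outside the evidence-ledger namespace are not checked here
        entry = entries.get(evidence_id)
        if entry is None:
            problems.append(f"missing evidence-ledger item: {ref}")
            continue
        source_ref = entry.get("source_unit_ref")
        if isinstance(source_ref, str) and source_ref not in accepted_unit_refs(unit_id):
            problems.append(f"evidence points at a different source unit: {ref}")
    return problems


# Hypothetical sample ledger for illustration only.
sample_ledger = {
    "entries": [
        {"evidence_id": "ev-1", "source_unit_ref": "unit:unit-auth"},
        {"evidence_id": "ev-2", "source_unit_ref": "unit:unit-other"},
    ]
}
```

With this sample, `evidence-ledger:ev-1` is accepted for `unit-auth`, while `evidence-ledger:ev-2` is flagged because its ledger entry belongs to another unit.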
174
176
 
175
177
  ### `validate-handoff-package.py`
176
178
 
package/docs/REFERENCE.md CHANGED
@@ -238,6 +238,8 @@ Unattended code-development manifests must include exactly one `unit_kind: "foun
238
238
 
239
239
  `coverage-ledger.json` may record contaminated-only `source_units[].discovery_leads` for authorized related surfaces that were detected but not analyzed in the assigned unit. The runner rejects a `covered` unit while any high-priority discovery lead remains open or deferred. It does not add follow-up units or expand `loop_context.approved_scope_refs`; Agent 0 must return an abstract delta, mark coverage partial or blocked, or pause for attended approval.
240
240
 
241
+ Completion is valid only when canonical durable artifacts prove the gate. A completed behavior unit or `spec-slice-complete` result must have a matching clean behavior spec, implementation-plan work item mapping, terminal implementation report, passed QC report, valid contaminated evidence refs, and required public-surface mappings across the behavior spec, coverage ledger, implementation plan, and terminal report. Manual summaries or synthetic result files are not completion evidence.
242
+
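
As an illustration of the "required public-surface mappings" in the paragraph above, the sketch below builds the `public_surface:<spec_id>:<kind>:<name>` refs used by the validator in this diff and checks that each one appears in all four places. Field names follow the Python hunk; the function `unmapped_obligations` and every sample value (`spec-auth`, `function`, `login`, `wi-1`) are hypothetical.

```python
from typing import Any

REQUIRED_LEVELS = {"exact-public-contract", "behavior-compatible"}


def public_surface_refs(spec: dict[str, Any]) -> list[str]:
    """Build obligation refs; specs at other compatibility levels carry none."""
    if spec.get("compatibility_level") not in REQUIRED_LEVELS:
        return []
    return [
        f"public_surface:{spec.get('spec_id')}:{item['kind']}:{item['name']}"
        for item in spec.get("public_surface") or []
        if isinstance(item, dict)
        and isinstance(item.get("kind"), str)
        and isinstance(item.get("name"), str)
    ]


def unmapped_obligations(
    spec: dict[str, Any],
    source_unit: dict[str, Any],
    plan: dict[str, Any],
    report: dict[str, Any],
) -> list[str]:
    """Return obligations missing from the spec tests, ledger, plan, or report."""
    tested = {
        ref
        for scenario in spec.get("test_scenarios") or []
        for ref in scenario.get("coverage") or []
    }
    covered = {
        item.get("ref")
        for item in source_unit.get("public_surface_coverage") or []
        if item.get("status") == "covered"
    }
    done = set(report.get("completed_work_items") or [])
    planned_and_done = {
        ref
        for work_item in plan.get("work_items") or []
        if work_item.get("work_item_id") in done
        for ref in work_item.get("public_contract_refs") or []
    }
    return [
        ref
        for ref in public_surface_refs(spec)
        if not (ref in tested and ref in covered and ref in planned_and_done)
    ]


# Hypothetical sample artifacts that map one obligation everywhere.
REF = "public_surface:spec-auth:function:login"
spec = {
    "spec_id": "spec-auth",
    "compatibility_level": "exact-public-contract",
    "public_surface": [{"kind": "function", "name": "login"}],
    "test_scenarios": [{"coverage": [REF]}],
}
source_unit = {"public_surface_coverage": [{"ref": REF, "status": "covered"}]}
plan = {"work_items": [{"work_item_id": "wi-1", "public_contract_refs": [REF]}]}
report = {"completed_work_items": ["wi-1"]}
```

If the terminal report omits `wi-1`, the same obligation is reported as unmapped even though the spec, ledger, and plan all mention it.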
241
243
  Minimal agent command adapter shape for advisory or disabled context management:
242
244
 
243
245
  ```json
@@ -71,6 +71,8 @@ TASK_MANIFEST_HANDOFF_SEQUENCE_WITH_POLISH = [
71
71
  "clean-polish-review",
72
72
  TASK_MANIFEST_HANDOFF_SEQUENCE[-1],
73
73
  ]
74
+ PUBLIC_SURFACE_COMPLETION_LEVELS = {"exact-public-contract", "behavior-compatible"}
75
+ MAX_COMPLETION_ARTIFACT_SCAN = 500
74
76
  MAX_REPORTED_ERRORS = 20
75
77
  MAX_VALIDATION_ERRORS = MAX_REPORTED_ERRORS + 1
76
78
  REPAIR_HINT = "Fix or update the JSON artifact to satisfy the reported schema errors, then write it again."
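
The `MAX_COMPLETION_ARTIFACT_SCAN` constant added in this hunk caps how many JSON files the completion guard will visit. A minimal sketch of that kind of bounded scan follows, assuming a single root; the function name `bounded_json_scan`, the small demonstration cap, and the sorting step are assumptions added here for a deterministic example and are not taken from the package.

```python
import tempfile
from pathlib import Path

DEMO_SCAN_LIMIT = 3  # hypothetical small cap for the demo; the diff sets 500


def bounded_json_scan(root: Path, limit: int = DEMO_SCAN_LIMIT) -> list[Path]:
    """Collect JSON files under root, stopping once `limit` candidates were visited."""
    found: list[Path] = []
    # Sorting is an addition for reproducible output in this sketch.
    for visited, candidate in enumerate(sorted(root.rglob("*.json")), start=1):
        if visited > limit:
            break
        if candidate.is_file():
            found.append(candidate)
    return found


def demo_scan_count(file_count: int = 5) -> int:
    """Create `file_count` JSON files in a temp directory and scan them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for index in range(file_count):
            (root / f"artifact-{index}.json").write_text("{}", encoding="utf-8")
        return len(bounded_json_scan(root))
```

One consequence of any such cap is that artifacts beyond the limit are never examined, so a very large clean root could hide a valid spec or report from the scan.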
@@ -318,6 +320,500 @@ def task_manifest_handoff_sequence_errors(data: dict[str, Any]) -> list[str]:
318
320
  return []
319
321
 
320
322
 
323
+ def completion_guard_enabled(payload: Any) -> bool:
324
+ if not isinstance(payload, dict):
325
+ return False
326
+ tool = payload.get("tool_name") or payload.get("tool")
327
+ if not isinstance(tool, str):
328
+ return False
329
+ return tool.lower() in {"write", "edit", "multiedit", "notebookedit", "apply_patch"}
330
+
331
+
332
+ def read_json_artifact(path: Path, label: str) -> tuple[dict[str, Any] | None, str | None]:
333
+ text, read_error = read_artifact_text(path, label)
334
+ if read_error:
335
+ return None, read_error
336
+ try:
337
+ data = json.loads(text)
338
+ except json.JSONDecodeError as exc:
339
+ return None, f"{label} JSON parse failed for {describe_path(path)}: {redact_text(exc)}"
340
+ if not isinstance(data, dict):
341
+ return None, f"{label} must be a JSON object: {describe_path(path)}"
342
+ return data, None
343
+
344
+
345
+ def relative_ref_candidates(ref: str) -> list[str]:
346
+ refs = [ref]
347
+ for prefix in ("clean/", "contaminated/"):
348
+ if ref.startswith(prefix):
349
+ refs.append(ref.removeprefix(prefix))
350
+ return refs
351
+
352
+
353
+ def find_json_by_ref(ref: Any, roots: list[Path], label: str) -> tuple[dict[str, Any] | None, str | None]:
354
+ if not isinstance(ref, str) or not ref:
355
+ return None, f"{label} ref is missing"
356
+ try:
357
+ raw = Path(ref).expanduser()
358
+ except OSError as exc:
359
+ return None, f"{label} ref is invalid: {redact_text(exc)}"
360
+ candidates: list[Path] = []
361
+ if raw.is_absolute():
362
+ try:
363
+ candidates.append(raw.resolve())
364
+ except OSError as exc:
365
+ return None, f"{label} ref is invalid: {redact_text(exc)}"
366
+ else:
367
+ for root in roots:
368
+ for candidate_ref in relative_ref_candidates(ref):
369
+ try:
370
+ candidates.append((root / candidate_ref).resolve())
371
+ except OSError as exc:
372
+ return None, f"{label} ref is invalid: {redact_text(exc)}"
373
+ for candidate in candidates:
374
+ try:
375
+ if candidate.is_file():
376
+ return read_json_artifact(candidate, label)
377
+ except OSError as exc:
378
+ return None, f"{label} could not stat {describe_path(candidate)}: {redact_text(exc)}"
379
+ return None, f"{label} does not exist: {ref}"
380
+
381
+
382
+ def scan_json_artifacts(roots: list[Path], wanted_kind: str) -> list[tuple[Path, dict[str, Any]]]:
383
+ matches: list[tuple[Path, dict[str, Any]]] = []
384
+ scanned = 0
385
+ for root in roots:
386
+ try:
387
+ for candidate in root.rglob("*.json"):
388
+ scanned += 1
389
+ if scanned > MAX_COMPLETION_ARTIFACT_SCAN:
390
+ return matches
391
+ try:
392
+ if not candidate.is_file():
393
+ continue
394
+ except OSError:
395
+ continue
396
+ data, error = read_json_artifact(candidate, f"{wanted_kind} artifact")
397
+ if error or not data:
398
+ continue
399
+ if artifact_kind(candidate, data) == wanted_kind:
400
+ matches.append((candidate, data))
401
+ except OSError:
402
+ continue
403
+ return matches
404
+
405
+
406
+ def first_json_artifact(roots: list[Path], name: str, label: str) -> tuple[dict[str, Any] | None, str | None]:
407
+ for root in roots:
408
+ candidate = root / name
409
+ try:
410
+ if candidate.is_file():
411
+ return read_json_artifact(candidate, label)
412
+ except OSError as exc:
413
+ return None, f"{label} could not stat {describe_path(candidate)}: {redact_text(exc)}"
414
+ return None, None
415
+
416
+
417
+ def unit_ref_values(unit_id: str) -> set[str]:
418
+ return {unit_id, f"unit:{unit_id}", f"task-manifest:{unit_id}", f"behavior-spec:{unit_id}"}
419
+
420
+
421
+ def evidence_id_from_ref(ref: Any) -> str | None:
422
+ prefix = "evidence-ledger:"
423
+ if isinstance(ref, str) and ref.startswith(prefix):
424
+ return ref.removeprefix(prefix)
425
+ return None
426
+
427
+
428
+ def evidence_entry_map(evidence_ledger: dict[str, Any] | None) -> dict[str, dict[str, Any]]:
429
+ entries: dict[str, dict[str, Any]] = {}
430
+ if not isinstance(evidence_ledger, dict):
431
+ return entries
432
+ for entry in evidence_ledger.get("entries") or []:
433
+ if isinstance(entry, dict) and isinstance(entry.get("evidence_id"), str):
434
+ entries[entry["evidence_id"]] = entry
435
+ return entries
436
+
437
+
438
+ def public_surface_ref(spec: dict[str, Any], item: dict[str, Any]) -> str:
439
+ return f"public_surface:{spec.get('spec_id')}:{item.get('kind')}:{item.get('name')}"
440
+
441
+
442
+ def required_public_surface_obligations(spec: dict[str, Any]) -> list[str]:
443
+ if spec.get("compatibility_level") not in PUBLIC_SURFACE_COMPLETION_LEVELS:
444
+ return []
445
+ obligations: list[str] = []
446
+ for item in spec.get("public_surface") or []:
447
+ if isinstance(item, dict) and isinstance(item.get("name"), str) and isinstance(item.get("kind"), str):
448
+ obligations.append(public_surface_ref(spec, item))
449
+ return obligations
450
+
451
+
452
+ def behavior_spec_test_coverage_refs(spec: dict[str, Any]) -> set[str]:
453
+ refs: set[str] = set()
454
+ for scenario in spec.get("test_scenarios") or []:
455
+ if not isinstance(scenario, dict):
456
+ continue
457
+ for ref in scenario.get("coverage") or []:
458
+ if isinstance(ref, str):
459
+ refs.add(ref)
460
+ return refs
461
+
462
+
463
+ def matching_behavior_specs(
464
+ specs: list[tuple[Path, dict[str, Any]]],
465
+ unit_id: str,
466
+ spec_slice_ref: str | None = None,
467
+ ) -> list[tuple[Path, dict[str, Any]]]:
468
+ matches: list[tuple[Path, dict[str, Any]]] = []
469
+ accepted_refs = unit_ref_values(unit_id)
470
+ if spec_slice_ref:
471
+ accepted_refs.add(spec_slice_ref)
472
+ for spec_path, spec in specs:
473
+ source_refs = spec.get("source_unit_refs") if isinstance(spec.get("source_unit_refs"), list) else []
474
+ spec_refs = {
475
+ ref
476
+ for ref in [spec.get("spec_id"), spec.get("unit_id"), *source_refs]
477
+ if isinstance(ref, str)
478
+ }
479
+ if spec_refs & accepted_refs or spec.get("unit_id") == unit_id or unit_id in source_refs:
480
+ matches.append((spec_path, spec))
481
+ return matches
482
+
483
+
484
+ def unit_id_from_spec_slice_ref(spec_slice_ref: Any, specs: list[tuple[Path, dict[str, Any]]]) -> str | None:
485
+ if not isinstance(spec_slice_ref, str) or not spec_slice_ref:
486
+ return None
487
+ for spec_ref, unit_id_prefix in (("unit:", "unit:"), ("task-manifest:", "task-manifest:"), ("behavior-spec:", "behavior-spec:")):
488
+ if spec_slice_ref.startswith(spec_ref):
489
+ return spec_slice_ref.removeprefix(unit_id_prefix)
490
+ if spec_slice_ref.startswith("unit-"):
491
+ return spec_slice_ref
492
+ for _spec_path, spec in specs:
493
+ if spec.get("spec_id") == spec_slice_ref:
494
+ return spec.get("unit_id") if isinstance(spec.get("unit_id"), str) else None
495
+ return None
496
+
497
+
498
+ def plan_work_items_by_public_ref(plans: list[tuple[Path, dict[str, Any]]]) -> dict[str, list[str]]:
499
+ refs: dict[str, list[str]] = {}
500
+ for _plan_path, plan in plans:
501
+ for work_item in plan.get("work_items") or []:
502
+ if not isinstance(work_item, dict) or not isinstance(work_item.get("work_item_id"), str):
503
+ continue
504
+ for ref in work_item.get("public_contract_refs") or []:
505
+ if isinstance(ref, str):
506
+ refs.setdefault(ref, []).append(work_item["work_item_id"])
507
+ return refs
508
+
509
+
510
+ def plan_work_items_for_specs(plans: list[tuple[Path, dict[str, Any]]], specs: list[dict[str, Any]]) -> set[str]:
511
+ spec_ids = {spec.get("spec_id") for spec in specs if isinstance(spec.get("spec_id"), str)}
512
+ work_items: set[str] = set()
513
+ for _plan_path, plan in plans:
514
+ for work_item in plan.get("work_items") or []:
515
+ if not isinstance(work_item, dict) or not isinstance(work_item.get("work_item_id"), str):
516
+ continue
517
+ refs = {ref for ref in work_item.get("spec_ids") or [] if isinstance(ref, str)}
518
+ if refs & spec_ids:
519
+ work_items.add(work_item["work_item_id"])
520
+ return work_items
521
+
522
+
523
+ def completed_work_items(reports: list[tuple[Path, dict[str, Any]]]) -> set[str]:
524
+ completed: set[str] = set()
525
+ for _report_path, report in reports:
526
+ for work_item_id in report.get("completed_work_items") or []:
527
+ if isinstance(work_item_id, str):
528
+ completed.add(work_item_id)
529
+ return completed
530
+
531
+
532
+ def terminal_implementation_reports(reports: list[tuple[Path, dict[str, Any]]]) -> list[tuple[Path, dict[str, Any]]]:
533
+ terminal: list[tuple[Path, dict[str, Any]]] = []
534
+ for report_path, report in reports:
535
+ if (
536
+ report.get("implementation_status") == "complete"
537
+ and report.get("final_status") == "complete"
538
+ and isinstance(report.get("agent0_reporting"), dict)
539
+ and report["agent0_reporting"].get("report_state") == "terminal-report"
540
+ ):
541
+ terminal.append((report_path, report))
542
+ return terminal
543
+
544
+
545
+ def passed_qc_reports(qc_reports: list[tuple[Path, dict[str, Any]]]) -> list[tuple[Path, dict[str, Any]]]:
546
+ passed: list[tuple[Path, dict[str, Any]]] = []
547
+ for report_path, report in qc_reports:
548
+ if (
549
+ report.get("final_status") in {"passed", "passed-with-gaps"}
550
+ and report.get("coverage_status") == "complete"
551
+ and report.get("schema_status") == "passed"
552
+ and report.get("leakage_status") == "passed"
553
+ and report.get("required_rerun") is False
554
+ ):
555
+ passed.append((report_path, report))
556
+ return passed
557
+
558
+
559
+ def source_unit_for_unit(coverage_ledger: dict[str, Any] | None, unit_id: str) -> dict[str, Any] | None:
560
+ if not isinstance(coverage_ledger, dict):
561
+ return None
562
+ for source_unit in coverage_ledger.get("source_units") or []:
563
+ if isinstance(source_unit, dict) and source_unit.get("unit_id") == unit_id:
564
+ return source_unit
565
+ return None
566
+
567
+
568
+ def validate_evidence_refs(
569
+ errors: list[str],
570
+ unit_id: str,
571
+ refs: Any,
572
+ evidence_ledger: dict[str, Any] | None,
573
+ label: str,
574
+ ) -> None:
575
+ if not isinstance(refs, list) or not refs:
576
+ add_error(errors, f"{label} has no evidence_refs: {unit_id}")
577
+ return
578
+ entries = evidence_entry_map(evidence_ledger)
579
+ if not entries:
580
+ add_error(errors, f"{label} references evidence but evidence-ledger.json is missing or empty: {unit_id}")
581
+ return
582
+ for ref in refs:
583
+ evidence_id = evidence_id_from_ref(ref)
584
+ if not evidence_id:
585
+ continue
586
+ entry = entries.get(evidence_id)
587
+ if not entry:
588
+ add_error(errors, f"{label} references missing evidence-ledger item: {ref}")
589
+ continue
590
+ source_ref = entry.get("source_unit_ref")
591
+ if isinstance(source_ref, str) and source_ref not in unit_ref_values(unit_id):
592
+ add_error(errors, f"{label} evidence ref points at a different source unit: {ref}")
593
+
594
+
595
+ def manifest_behavior_unit_ids(manifest: dict[str, Any] | None) -> set[str]:
596
+ ids: set[str] = set()
597
+ if not isinstance(manifest, dict):
598
+ return ids
599
+ for unit in manifest.get("units") or []:
600
+ if isinstance(unit, dict) and unit.get("unit_kind") == "behavior" and isinstance(unit.get("unit_id"), str):
601
+ ids.add(unit["unit_id"])
602
+ return ids
603
+
604
+
605
+ def manifest_completion_behavior_unit_ids(manifest: dict[str, Any] | None) -> set[str]:
606
+ ids: set[str] = set()
607
+ if not isinstance(manifest, dict):
608
+ return ids
609
+ for unit in manifest.get("units") or []:
610
+ if (
611
+ isinstance(unit, dict)
612
+ and unit.get("unit_kind") == "behavior"
613
+ and unit.get("status") != "out-of-scope"
614
+ and isinstance(unit.get("unit_id"), str)
615
+ ):
616
+ ids.add(unit["unit_id"])
617
+ return ids
618
+
619
+
620
+ def behavior_unit_is_in_scope(unit_id: str, manifest: dict[str, Any] | None, specs: list[tuple[Path, dict[str, Any]]]) -> bool:
621
+ behavior_ids = manifest_behavior_unit_ids(manifest)
622
+ if behavior_ids:
623
+ return unit_id in behavior_ids
624
+ if matching_behavior_specs(specs, unit_id):
625
+ return True
626
+ return unit_id != "unit-foundation"
627
+
628
+
629
+ def completion_context(path: Path, kind: str, data: dict[str, Any]) -> dict[str, Any]:
630
+ clean_roots = env_roots("CLEAN_ROOM_CLEAN_ROOTS")
631
+ contaminated_roots = env_roots("CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS")
632
+ manifest = data if kind == "task-manifest" else first_json_artifact(contaminated_roots, "task-manifest.json", "task-manifest")[0]
633
+ coverage = data if kind == "coverage-ledger" else first_json_artifact(contaminated_roots, "coverage-ledger.json", "coverage-ledger")[0]
634
+ evidence = first_json_artifact(contaminated_roots, "evidence-ledger.json", "evidence-ledger")[0]
635
+ specs = scan_json_artifacts(clean_roots, "behavior-spec")
636
+ plans = scan_json_artifacts(clean_roots, "implementation-plan")
637
+ reports = scan_json_artifacts(clean_roots, "implementation-report")
638
+ qcs = scan_json_artifacts(clean_roots, "qc-report")
639
+ if kind == "clean-room-result" and data.get("result") == "spec-slice-complete":
640
+ report, report_error = find_json_by_ref(data.get("terminal_report_ref"), clean_roots, "clean-room-result terminal_report_ref")
641
+ qc, qc_error = find_json_by_ref(data.get("qc_report_ref"), clean_roots, "clean-room-result qc_report_ref")
642
+ if report:
643
+ reports = [(path, report)]
644
+ if qc:
645
+ qcs = [(path, qc)]
646
+ return {
647
+ "clean_roots": clean_roots,
648
+ "contaminated_roots": contaminated_roots,
649
+ "manifest": manifest,
650
+ "coverage": coverage,
651
+ "evidence": evidence,
652
+ "specs": specs,
653
+ "plans": plans,
654
+ "reports": reports,
655
+ "qcs": qcs,
656
+ "report_error": report_error if not report else None,
657
+ "qc_error": qc_error if not qc else None,
658
+ }
659
+ return {
660
+ "clean_roots": clean_roots,
661
+ "contaminated_roots": contaminated_roots,
662
+ "manifest": manifest,
663
+ "coverage": coverage,
664
+ "evidence": evidence,
665
+ "specs": specs,
666
+ "plans": plans,
667
+ "reports": reports,
668
+ "qcs": qcs,
669
+ "report_error": None,
670
+ "qc_error": None,
671
+ }
672
+
673
+
674
+ def validate_behavior_unit_completion(
675
+ errors: list[str],
676
+ unit_id: str,
677
+ context: dict[str, Any],
678
+ spec_slice_ref: str | None = None,
679
+ ) -> None:
680
+ specs = matching_behavior_specs(context["specs"], unit_id, spec_slice_ref)
681
+ if not specs:
682
+ add_error(errors, f"completion claim has no clean behavior spec: {unit_id}")
683
+ return
684
+ spec_data = [spec for _spec_path, spec in specs]
685
+ terminal_reports = terminal_implementation_reports(context["reports"])
686
+ if not terminal_reports:
687
+ add_error(errors, f"completion claim has no terminal implementation report: {unit_id}")
688
+ passed_qcs = passed_qc_reports(context["qcs"])
689
+ if not passed_qcs:
690
+ add_error(errors, f"completion claim has no passed QC report: {unit_id}")
691
+ work_item_ids = plan_work_items_for_specs(context["plans"], spec_data)
692
+ if not work_item_ids:
693
+ add_error(errors, f"completion claim has no implementation-plan work item for clean behavior spec: {unit_id}")
694
+ elif not (work_item_ids & completed_work_items(terminal_reports)):
695
+ add_error(errors, f"completion claim has no completed implementation work item for clean behavior spec: {unit_id}")
696
+
697
+ source_unit = source_unit_for_unit(context["coverage"], unit_id)
698
+ if not source_unit or source_unit.get("coverage_state") != "covered":
699
+ add_error(errors, f"completion claim has no covered coverage-ledger source unit: {unit_id}")
700
+ else:
701
+ validate_evidence_refs(errors, unit_id, source_unit.get("evidence_refs"), context["evidence"], "coverage-ledger source unit")
702
+
703
+ public_coverage_by_ref = {
704
+ item.get("ref"): item
705
+ for item in (source_unit or {}).get("public_surface_coverage") or []
706
+ if isinstance(item, dict) and isinstance(item.get("ref"), str)
707
+ }
708
+ plan_refs = plan_work_items_by_public_ref(context["plans"])
709
+ completed = completed_work_items(terminal_reports)
710
+ for spec_path, spec in specs:
711
+ coverage_refs = behavior_spec_test_coverage_refs(spec)
712
+ for obligation in required_public_surface_obligations(spec):
713
+ if obligation not in coverage_refs:
714
+ add_error(errors, f"public_surface obligation missing from behavior spec test coverage: {obligation} ({describe_path(spec_path)})")
715
+ coverage = public_coverage_by_ref.get(obligation)
716
+ if not coverage:
717
+ add_error(errors, f"coverage-ledger missing public_surface_coverage for: {obligation}")
718
+ continue
719
+ if coverage.get("status") != "covered":
720
+ add_error(errors, f"coverage-ledger public_surface_coverage is not covered: {obligation}")
721
+ validate_evidence_refs(errors, unit_id, coverage.get("evidence_refs"), context["evidence"], "coverage-ledger public_surface_coverage")
722
+ mapped_items = set(plan_refs.get(obligation) or [])
723
+ if not mapped_items:
724
+ add_error(errors, f"public_surface obligation missing from implementation plan: {obligation}")
725
+ elif not (mapped_items & completed):
726
+ add_error(errors, f"public_surface obligation work item is not complete: {obligation}")
727
+ if error_limit_reached(errors):
728
+ return
729
+
730
+
731
+ def completion_guard_errors(path: Path, kind: str, data: dict[str, Any]) -> list[str]:
732
+ if kind not in {"task-manifest", "coverage-ledger", "clean-room-result"}:
733
+ return []
734
+ if not env_roots("CLEAN_ROOM_CLEAN_ROOTS") or not env_roots("CLEAN_ROOM_CONTAMINATED_ARTIFACT_ROOTS"):
735
+ return []
736
+ errors: list[str] = []
737
+ context = completion_context(path, kind, data)
738
+ if context.get("report_error"):
739
+ add_error(errors, context["report_error"])
740
+ if context.get("qc_error"):
741
+ add_error(errors, context["qc_error"])
742
+
743
+ if kind == "task-manifest":
744
+ manifest_complete = isinstance(data.get("implementation_status"), dict) and data["implementation_status"].get("state") == "complete"
745
+ completed_behavior_ids: set[str] = set()
746
+ for unit in data.get("units") or []:
747
+ if not isinstance(unit, dict):
748
+ continue
749
+ if unit.get("unit_kind") != "behavior" or not isinstance(unit.get("unit_id"), str):
750
+ continue
751
+ unit_id = unit["unit_id"]
752
+ if unit.get("status") == "complete":
753
+ completed_behavior_ids.add(unit_id)
754
+ validate_behavior_unit_completion(errors, unit_id, context)
755
+ elif manifest_complete and unit.get("status") != "out-of-scope":
756
+ add_error(errors, f"task-manifest implementation_status complete but behavior unit is not complete: {unit_id}")
757
+ if error_limit_reached(errors):
758
+ break
759
+ if manifest_complete and not completed_behavior_ids:
760
+ add_error(errors, "task-manifest implementation_status complete has no completed behavior units")
761
+ elif kind == "coverage-ledger":
762
+ if data.get("coverage_status") != "complete":
763
+ return errors
764
+ if not isinstance(context["manifest"], dict):
765
+ add_error(errors, "coverage-ledger completion has no task-manifest.json")
766
+ if not data.get("behavior_spec_refs"):
767
+ add_error(errors, "coverage-ledger completion has no behavior_spec_refs")
768
+ required_behavior_ids = manifest_completion_behavior_unit_ids(context["manifest"])
769
+ if isinstance(context["manifest"], dict) and not required_behavior_ids:
770
+ add_error(errors, "coverage-ledger completion has no behavior units to complete")
771
+ covered_behavior_ids: set[str] = set()
772
+ behavior_spec_refs = {ref for ref in data.get("behavior_spec_refs") or [] if isinstance(ref, str)}
773
+ for source_unit in data.get("source_units") or []:
774
+ if not isinstance(source_unit, dict):
775
+ continue
776
+ unit_id = source_unit.get("unit_id")
777
+ if not isinstance(unit_id, str):
778
+ continue
779
+ if unit_id in required_behavior_ids and source_unit.get("coverage_state") != "covered":
780
+ add_error(errors, f"coverage-ledger completion does not cover behavior unit: {unit_id}")
781
+ if source_unit.get("coverage_state") != "covered":
782
+ continue
783
+ validate_evidence_refs(errors, unit_id, source_unit.get("evidence_refs"), context["evidence"], "coverage-ledger source unit")
784
+ is_behavior_completion = (
785
+ unit_id in required_behavior_ids
786
+ if required_behavior_ids
787
+ else behavior_unit_is_in_scope(unit_id, context["manifest"], context["specs"])
788
+ )
789
+ if is_behavior_completion:
790
+ covered_behavior_ids.add(unit_id)
791
+ validate_behavior_unit_completion(errors, unit_id, context)
792
+ for _spec_path, spec in matching_behavior_specs(context["specs"], unit_id):
793
+ spec_id = spec.get("spec_id")
794
+ if isinstance(spec_id, str) and spec_id not in behavior_spec_refs:
795
+ add_error(errors, f"coverage-ledger completion missing behavior_spec_refs entry: {spec_id}")
796
+ if error_limit_reached(errors):
797
+ break
798
+ for unit_id in sorted(required_behavior_ids - covered_behavior_ids):
799
+ add_error(errors, f"coverage-ledger completion does not cover behavior unit: {unit_id}")
800
+ if error_limit_reached(errors):
801
+ break
802
+ elif kind == "clean-room-result" and data.get("result") == "spec-slice-complete":
803
+ if not isinstance(context["manifest"], dict):
804
+ add_error(errors, "clean-room-result completion has no task-manifest.json")
805
+ if not isinstance(context["coverage"], dict):
806
+ add_error(errors, "clean-room-result completion has no coverage-ledger.json")
807
+ if data.get("coverage_state") != "complete":
808
+ add_error(errors, "clean-room-result spec-slice-complete must have coverage_state complete")
809
+ unit_id = unit_id_from_spec_slice_ref(data.get("spec_slice_ref"), context["specs"])
810
+ if not unit_id:
811
+ add_error(errors, "clean-room-result spec_slice_ref does not resolve to a behavior unit")
812
+ elif behavior_unit_is_in_scope(unit_id, context["manifest"], context["specs"]):
813
+ validate_behavior_unit_completion(errors, unit_id, context, data.get("spec_slice_ref"))
814
+ return errors
815
+
816
+
321
817
  def is_clean_room_task_manifest_schema(schema: dict[str, Any]) -> bool:
322
818
  properties = schema.get("properties")
323
819
  return isinstance(properties, dict) and "handoff_sequence" in properties and "agent_pipeline" in properties
@@ -525,6 +1021,7 @@ def main() -> int:
525
1021
  for error in path_errors:
526
1022
  print(f"clean-room schema check failed: {redact_text(error)}", file=sys.stderr)
527
1023
  return 1
1024
+ run_completion_guard = completion_guard_enabled(payload)
528
1025
  for path in paths:
529
1026
  if path.suffix.lower() != ".json" or not path.is_file():
530
1027
  continue
@@ -587,6 +1084,8 @@ def main() -> int:
587
1084
  extend_errors(errors, role_session_brief_path_errors(data))
588
1085
  if kind == "task-manifest" and is_clean_room_task_manifest_schema(schema):
589
1086
  extend_errors(errors, task_manifest_handoff_sequence_errors(data))
1087
+ if run_completion_guard:
1088
+ extend_errors(errors, completion_guard_errors(path, kind, data))
590
1089
  if errors:
591
1090
  print(f"clean-room schema check failed for {describe_path(path)}:", file=sys.stderr)
592
1091
  for error in errors[:MAX_REPORTED_ERRORS]:
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "clean-room-skill",
3
- "version": "0.1.15",
3
+ "version": "0.2.0",
4
4
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
5
5
  "bin": {
6
6
  "clean-room-skill": "bin/install.js"
package/plugin.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "clean-room",
3
- "version": "0.1.15",
3
+ "version": "0.2.0",
4
4
  "description": "Spec-first clean-room workflow for authorized source analysis without replacement code.",
5
5
  "author": {
6
6
  "name": "whit3rabbit"