npm - auditor-lambda - Versions diffs - 0.2.5 → 0.2.8 - Mend

Files changed (71)

package/README.md +35 -7
package/audit-code-wrapper-lib.mjs +1612 -331
package/dist/cli.js +397 -38
package/dist/coverage.d.ts +2 -2
package/dist/coverage.js +5 -5
package/dist/extractors/disposition.js +10 -1
package/dist/extractors/flows.js +7 -1
package/dist/extractors/pathPatterns.d.ts +3 -0
package/dist/extractors/pathPatterns.js +15 -0
package/dist/extractors/risk.js +7 -1
package/dist/io/artifacts.d.ts +6 -6
package/dist/io/artifacts.js +14 -17
package/dist/io/json.d.ts +2 -0
package/dist/io/json.js +15 -0
package/dist/io/runArtifacts.d.ts +3 -1
package/dist/io/runArtifacts.js +20 -5
package/dist/mcp/server.d.ts +1 -0
package/dist/mcp/server.js +579 -0
package/dist/orchestrator/advance.js +9 -2
package/dist/orchestrator/dependencyMap.js +9 -13
package/dist/orchestrator/executors.js +7 -2
package/dist/orchestrator/flowRequeue.d.ts +2 -2
package/dist/orchestrator/flowRequeue.js +16 -3
package/dist/orchestrator/internalExecutors.d.ts +2 -1
package/dist/orchestrator/internalExecutors.js +129 -48
package/dist/orchestrator/requeue.js +10 -4
package/dist/orchestrator/requeueCommand.js +15 -2
package/dist/orchestrator/resultIngestion.d.ts +2 -1
package/dist/orchestrator/resultIngestion.js +26 -6
package/dist/orchestrator/runtimeValidation.d.ts +7 -2
package/dist/orchestrator/runtimeValidation.js +61 -49
package/dist/orchestrator/runtimeValidationUpdate.js +2 -4
package/dist/orchestrator/state.js +28 -14
package/dist/orchestrator/taskBuilder.js +4 -2
package/dist/orchestrator/trivialAudit.d.ts +4 -0
package/dist/orchestrator/trivialAudit.js +49 -0
package/dist/prompts/renderWorkerPrompt.js +6 -2
package/dist/providers/spawnLoggedCommand.js +17 -0
package/dist/reporting/mergeFindings.js +3 -11
package/dist/reporting/rootCause.js +92 -9
package/dist/reporting/synthesis.d.ts +25 -22
package/dist/reporting/synthesis.js +92 -59
package/dist/reporting/workBlocks.d.ts +12 -3
package/dist/reporting/workBlocks.js +124 -70
package/dist/supervisor/sessionConfig.js +4 -2
package/dist/types/flows.d.ts +2 -0
package/dist/types/runtimeValidation.d.ts +2 -1
package/dist/types.d.ts +8 -6
package/dist/validation/auditResults.d.ts +5 -2
package/dist/validation/auditResults.js +335 -43
package/docs/agent-integrations.md +38 -29
package/docs/artifacts.md +18 -51
package/docs/bootstrap-install.md +60 -30
package/docs/contract.md +25 -117
package/docs/field-trial-bug-report.md +237 -0
package/docs/next-steps.md +59 -44
package/docs/packaging.md +13 -3
package/docs/production-launch-bar.md +2 -2
package/docs/production-readiness.md +9 -5
package/docs/releasing.md +81 -0
package/docs/session-config.md +20 -1
package/docs/usage.md +22 -0
package/package.json +4 -1
package/schemas/audit_result.schema.json +4 -5
package/schemas/audit_task.schema.json +10 -0
package/schemas/runtime_validation_report.schema.json +1 -1
package/skills/audit-code/SKILL.md +11 -2
package/skills/audit-code/audit-code.prompt.md +11 -10
package/schemas/merged_findings.schema.json +0 -19
package/schemas/root_cause_clusters.schema.json +0 -28
package/schemas/synthesis_report.schema.json +0 -61

package/docs/bootstrap-install.md CHANGED

@@ -8,65 +8,86 @@ audit-code install
 That command installs the repo-local `/audit-code` surfaces we can automate today.
-## What it writes
-Installed command surfaces:
+After bootstrap, run:
-- `.github/prompts/audit-code.prompt.md` for VS Code chat prompt files, with `agent: agent`
-- `.opencode/commands/audit-code.md` for OpenCode custom commands, with `agent: build`
-- `.claude/commands/audit-code.md` for Claude Code custom slash commands
+```bash
+audit-code verify-install
+```
-Installed always-on compatibility surfaces:
+That smoke-tests the generated host assets plus the shared repo-local MCP launcher without waiting for a full editor walkthrough.
-- `.github/copilot-instructions.md`
-- `AGENTS.md`
-- `CLAUDE.md`
+## What it writes
-Installed repo-local canonical assets:
+Installed shared surfaces:
 - `.audit-code/install/audit-code.import.md`
 - `.audit-code/install/SKILL.md`
 - `.audit-code/install/GETTING-STARTED.md`
+- `.audit-code/install/manifest.json`
+- `.audit-code/install/run-mcp-server.mjs`
+Installed host-specific surfaces:
+- Codex:
+  - `.codex/skills/audit-code/*`
+  - `AGENTS.md` managed block when needed
+  - `.audit-code/install/codex/MCP-SETUP.md`
+  - `.audit-code/install/codex/RE-AUDIT-AUTOMATION.md`
+- Claude Desktop:
+  - `.audit-code/install/claude-desktop/PROJECT-TEMPLATE.md`
+  - `.audit-code/install/claude-desktop/remote-mcp-connector.json`
+  - `.audit-code/install/claude-desktop/auditor-lambda.dxt`
+  - `.audit-code/install/claude-desktop/auditor-lambda.mcpb`
+- OpenCode:
+  - `.opencode/commands/audit-code.md`
+  - `.opencode/skills/audit-code/*`
+  - `opencode.json`
+  - `AGENTS.md` managed block when needed
+- VS Code:
+  - `.github/prompts/audit-code.prompt.md`
+  - `.github/copilot-instructions.md`
+  - `.github/agents/auditor.agent.md`
+  - `.vscode/mcp.json`
+- Antigravity:
+  - `.audit-code/install/antigravity/PLANNING-MODE.md`
+  - `AGENTS.md` managed block when needed
 The generated `GETTING-STARTED.md` now includes dedicated quick-start sections for:
-- VS Code
-- OpenCode
-- Claude Code
+- Codex
 - Claude Desktop
+- OpenCode
+- VS Code
 - Antigravity
-Installed compatibility skill bundles:
-- `.opencode/skills/audit-code/*`
-- `.claude/skills/audit-code/*`
-- `.agents/skills/audit-code/*`
 ## Goal
-After bootstrap, the user should be able to open a supported conversation surface in the repository and invoke:
+After bootstrap, the user should be able to open a supported host surface in the repository and invoke:
 ```text
 /audit-code
 ```
-without supplying extra root paths, provider flags, or model-selection arguments.
+without supplying extra root paths, provider flags, or model-selection arguments, or connect through the shared MCP server when the host prefers tool-driven integration.
 ## What is fully automated today
-- VS Code and GitHub Copilot repo-local prompt surfaces
-- OpenCode project command surfaces
-- Claude Code project command surfaces
-- tool-agnostic compatibility instruction files for hosts that honor `AGENTS.md` or `CLAUDE.md`
+- shared installer output, manifest generation, and repo-local MCP launcher generation
+- Codex skill-bundle and AGENTS-oriented install output
+- OpenCode command, skill, prompt, and config generation
+- VS Code prompt, custom-agent, instruction, and MCP config generation
+- Claude Desktop project-template, remote-connector, and local bundle generation
+- Antigravity planning-mode guidance generation
 ## What is not fully automated today
-- Claude Desktop does not currently have a verified project-local slash-command install surface in this repository
-- Antigravity does not currently have a verified repo-local slash-command install surface in this repository
+- product-level smoke validation for the generated Codex, Claude Desktop, OpenCode, and VS Code assets
+- one-click proof that the generated Claude Desktop bundle installs cleanly in a real Desktop environment
+- documented Antigravity artifact round-tripping back through `import_results` and `import_runtime_updates`
-For those hosts, the bootstrap command still installs compatibility assets, but the final `/audit-code` discovery behavior remains host-dependent.
+For those gaps, the bootstrap command now writes the repo-local assets and guidance, but the final operator experience still needs end-to-end host verification.
-Use `.audit-code/install/GETTING-STARTED.md` as the low-guess repo-local handoff for those manual prompt-import paths and for the exact VS Code, OpenCode, and Claude Code bootstrap surfaces that were generated.
+Use `.audit-code/install/GETTING-STARTED.md` as the low-guess repo-local handoff, and treat `.audit-code/install/manifest.json` as the machine-readable source of truth for what was generated.
 ## Narrow compatibility alias
@@ -77,3 +98,12 @@ audit-code install-host --host copilot
 ```
 Use it only when you intentionally want the smaller Copilot-only install path instead of the default bootstrap.
+## Remaining steps
+The installer foundation is now in place. The remaining work is:
+1. smoke-test each claimed host in the real product, not only via file-generation tests
+2. tighten `GETTING-STARTED.md` and host-specific setup docs where those smoke tests show friction
+3. prove the Claude Desktop local bundle install path operationally
+4. document Antigravity artifact-import workflows more concretely
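
The shared surfaces listed in this file can be checked mechanically. The following is a minimal sketch, not taken from the package and not a description of what `audit-code verify-install` does: `missingSurfaces` and its `exists` predicate are hypothetical names, and only the paths come from the diff above.

```javascript
// Shared surfaces listed under "Installed shared surfaces" in the diff above.
const SHARED_SURFACES = [
  ".audit-code/install/audit-code.import.md",
  ".audit-code/install/SKILL.md",
  ".audit-code/install/GETTING-STARTED.md",
  ".audit-code/install/manifest.json",
  ".audit-code/install/run-mcp-server.mjs",
];

// Hypothetical helper: `exists` is any predicate that reports whether a
// repo-relative path is present (for example, a wrapper around fs.existsSync).
function missingSurfaces(expected, exists) {
  return expected.filter((path) => !exists(path));
}

// In-memory stand-in for the file system: the MCP launcher is absent.
const presentSurfaces = new Set(SHARED_SURFACES.slice(0, 4));
const missing = missingSurfaces(SHARED_SURFACES, (p) => presentSurfaces.has(p));
console.log(missing); // [ '.audit-code/install/run-mcp-server.mjs' ]
```

Passing the existence check in as a function keeps the sketch testable without touching disk; a real check would more likely read the generated `manifest.json`, whose layout this diff does not show.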

package/docs/contract.md CHANGED

@@ -1,71 +1,12 @@
-# audit-code response contract
+# audit-code Response Contract
-This document describes the backend fallback JSON response contract for the `audit-code` wrapper.
+This document follows [audit-goals.md](C:/Code/auditor-lambda/spec/audit-goals.md).
-The canonical product remains `/audit-code` in conversation.
+## Canonical output
-## Backend fallback commands
+The authoritative completed-audit output is repo-root `audit-report.md`.
-Repo-local fallback command from the target repository root:
-```bash
-audit-code
-```
-Installed helper for locating the packaged conversation prompt asset:
-```bash
-audit-code prompt-path
-```
-Installed helper for validating the current backend artifact bundle:
-```bash
-audit-code validate
-```
-Repository-local wrapper equivalent:
-```bash
-node audit-code.mjs
-```
-## Contract version
-Every canonical wrapper response includes:
-```text
-contract_version: audit-code/v1alpha1
-```
-Consumers should verify this value before assuming the response shape.
-## Source of truth
-The versioned JSON schema is:
-```text
-schemas/audit-code-v1alpha1.schema.json
-```
-Product tests validate live wrapper output against that schema.
-## Reproducible installed-command smoke check
-From the repository root:
-```bash
-npm install
-npm run build
-npm link
-npm run smoke:linked-audit-code
-```
-This exercises the installed backend fallback command end-to-end and validates the emitted JSON against the versioned schema.
-## Top-level fields
-The current v1alpha1 contract includes these top-level fields:
+Until completion, the wrapper response remains a JSON envelope with:
 - `contract_version`
 - `audit_state`
@@ -77,64 +18,31 @@ The current v1alpha1 contract includes these top-level fields:
 - `next_likely_step`
 - `handoff`
-`handoff` is a companion operator-context object. It includes:
-- current top-level status
-- repo and artifacts paths
-- pending obligations
-- suggested evidence-import paths and commands when manual continuation is required
-- stable paths to companion handoff files under `.audit-artifacts`
-## Terminal states
-Consumers should continue invoking the same wrapper until:
-1. `next_likely_step == null`
-Terminal interpretation:
-- `audit_state.status == "complete"` means the audit finished end to end.
-- `audit_state.status == "blocked"` means no further automatic progress is available and the remaining work needs imported results or an interactive provider.
-- `progress_made` tells you whether the current invocation wrote additional artifacts before it reached that terminal state.
-When the wrapper emits a response, it also refreshes:
-- `.audit-artifacts/operator-handoff.json`
-- `.audit-artifacts/operator-handoff.md`
-Those files mirror the structured `handoff` guidance in machine-readable and human-readable forms.
-## Audit state shape
-`audit_state` includes:
-- `status`
-- `obligations`
-- optional `last_executor`
-- optional `last_obligation`
-- optional `blockers`
+## AuditResult contract
-`status` is one of:
+Workers submit `AuditResult[]` shaped by `schemas/audit_result.schema.json`.
-- `not_started`
-- `active`
-- `blocked`
-- `complete`
+Important rules:
-Each obligation includes:
+- `file_coverage` is required and must include every assigned file.
+- `file_coverage[].total_lines` must match the current file line count.
+- `findings[].affected_files` must be objects, not strings.
+- `findings[].evidence` must be an array of plain strings.
-- `id`
-- `state`
-- optional `reason`
+Use `audit-code validate-results --results <file>` before ingestion to validate
+results against the active task manifest.
-Obligation `state` is one of:
+## Internal artifacts during incomplete runs
-- `missing`
-- `present`
-- `stale`
-- `blocked`
-- `satisfied`
+The engine may keep resumable artifacts under `.audit-artifacts/`, including:
-## Compatibility note
+- intake/structure/planning artifacts
+- `audit_tasks.json`
+- `audit_results.jsonl`
+- `requeue_tasks.json`
+- `runtime_validation_tasks.json`
+- `runtime_validation_report.json` when runtime validation is planned
+- dispatch files for the active worker task
-This contract is versioned as `v1alpha1` deliberately. It is stable enough for current product use, but it should still be treated as pre-v1.
+These artifacts are internal and transient. On successful completion, they are
+cleared out and only `audit-report.md` remains.
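
The two `file_coverage` rules in the AuditResult contract above lend themselves to a pre-ingestion check. This is a rough sketch under stated assumptions, not the logic behind `audit-code validate-results`: the function name is hypothetical, and the `path` field on coverage entries and the `{ path, total_lines }` shape for assigned files are guesses, since the diff only names `file_coverage` and `total_lines`.

```javascript
// Assumed shapes (not confirmed by the diff):
//   assignedFiles: [{ path, total_lines }] taken from the active task
//   result.file_coverage: [{ path, total_lines, ... }]
function fileCoverageErrors(result, assignedFiles) {
  if (!Array.isArray(result.file_coverage)) {
    return ["file_coverage is required"];
  }
  // Index the submitted coverage entries by path, skipping malformed ones.
  const byPath = new Map(
    result.file_coverage
      .filter((entry) => entry !== null && typeof entry === "object")
      .map((entry) => [entry.path, entry]),
  );
  const errors = [];
  for (const file of assignedFiles) {
    const entry = byPath.get(file.path);
    if (entry === undefined) {
      // Rule 1: every assigned file must appear in file_coverage.
      errors.push(`file_coverage is missing assigned file ${file.path}`);
    } else if (entry.total_lines !== file.total_lines) {
      // Rule 2: total_lines must match the current file line count.
      errors.push(
        `file_coverage total_lines for ${file.path} is ${entry.total_lines}, expected ${file.total_lines}`,
      );
    }
  }
  return errors;
}

const coverageErrors = fileCoverageErrors(
  { file_coverage: [{ path: "src/a.js", total_lines: 12 }] },
  [
    { path: "src/a.js", total_lines: 10 },
    { path: "src/b.js", total_lines: 20 },
  ],
);
console.log(coverageErrors);
// [ 'file_coverage total_lines for src/a.js is 12, expected 10',
//   'file_coverage is missing assigned file src/b.js' ]
```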

package/docs/field-trial-bug-report.md ADDED

@@ -0,0 +1,237 @@
+# audit-code Field Trial Bug Report
+**Observed by:** LLM workers (Claude, April 2026)
+**Environments tested:** Claude Desktop (claude-code provider), OpenCode (opencode provider)
+**Repositories audited:** `Polar-CV-KAN` (~30 source files, ~126 tasks, Claude Desktop); same codebase, OpenCode run
+**Report date:** 2026-04-21
+Issues marked **[Both]** were independently observed in both environments. Issues marked **[CD]** or **[OC]** were specific to one environment.
+---
+## Critical
+### F-01 — Orchestrator never transitions to `status: "complete"` [Both]
+**CD observation:** `audit_tasks.json` showed `status: undefined` for all 126 tasks after successful ingestion. The completion gate checks this field, finds it undefined on every task, and permanently blocks. `synthesis_report.json` populated correctly (95 findings, 46 clusters) because synthesis runs on ingestion independently, but `status: "complete"` never fires because the gate only looks at the broken task-status field.
+**OC observation:** Even after all obligations reached `satisfied`, `audit_state.json` remained `status: "active"` and re-triggered planning artifacts. Required direct JSON edit to force `status: "complete"`.
+**Impact:** The documented stop condition ("stop the loop when terminal output shows `status: complete`") never fires in either environment. Workers must either loop indefinitely or make a judgment call to stop. This is the most fundamental failure in the framework.
+**Fix needed:** Audit task status must be written at ingestion time. The completion gate should also fall back on obligation-state truth: if all obligations are `satisfied`, the run is complete regardless of the task-status field.
+---
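
The fallback proposed in F-01 can be sketched as follows. This is an illustration only, not the package's completion gate: `isRunComplete` is a hypothetical name, the task shape `{ id, status }` and the task status value `"complete"` are assumptions, and only the obligation state `satisfied` comes from the report.

```javascript
// Hypothetical shapes: tasks as [{ id, status }], obligations as [{ id, state }].
function isRunComplete(tasks, obligations) {
  // Primary gate: every task carries an explicit completed status (assumed value).
  const allTasksComplete =
    tasks.length > 0 && tasks.every((task) => task.status === "complete");
  if (allTasksComplete) return true;
  // Fallback from F-01: if every obligation is satisfied, the run is complete
  // regardless of the task-status field.
  return (
    obligations.length > 0 &&
    obligations.every((obligation) => obligation.state === "satisfied")
  );
}

// Task statuses are undefined, as in the Claude Desktop observation.
const brokenTasks = [{ id: "t1" }, { id: "t2" }];
const doneViaObligations = isRunComplete(brokenTasks, [
  { id: "audit_tasks_completed", state: "satisfied" },
  { id: "synthesis_current", state: "satisfied" },
]);
const stillBlocked = isRunComplete(brokenTasks, [
  { id: "audit_tasks_completed", state: "satisfied" },
  { id: "synthesis_current", state: "missing" },
]);
console.log(doneViaObligations, stillBlocked); // true false
```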
+### F-02 — Worker launch failures / silent executor failures [Both]
+**CD observation:** `structure_executor` failed to launch during initial structuring. The failure was reported in JSON output but the orchestrator continued as if structuring had succeeded. Task quality degraded silently from the start — unclear whether the resulting task plan was the best possible decomposition or a fallback.
+**OC observation:** Every executor failed (`agent`, `result_ingestion_executor`, `planning_executor`). The entire audit had to be performed manually by reading source files, writing findings in the correct format, and directly manipulating artifact files. The provider (`opencode`) was supposed to enable interactive dispatch but never actually dispatched work.
+**Impact (OC):** The framework served as a state tracker only — no automation at all. **Impact (CD):** Silent quality degradation at the task-planning phase with no way to detect it.
+**Fix needed:** Worker launch failures must be surfaced as blocking handoffs, not silently swallowed. The orchestrator must not advance past a failed executor as if it succeeded. Executor failure messages must include enough detail to diagnose the root cause.
+---
+## High
+### F-03 — `--results` ingestion is unreliable [Both]
+**CD observation:** `audit-code --results <file>` threw `TypeError: e.trim is not a function` when evidence fields contained objects rather than plain strings. The CLI exited 0 in some cases, making it ambiguous whether results were partially or fully ingested.
+**OC observation:** Two separate failure modes: (1) the generated task file contained `audit_results_path: "--root"` (the CLI flag was written as the path value) causing an immediate crash; (2) after manually fixing the path, ingestion crashed with `Cannot read properties of undefined (reading 'map')` — the ingestion executor cannot parse the incoming results format.
+**Impact:** The primary submission mechanism is unreliable. Workers resort to custom scripts that bypass the framework and will break if the artifact schema changes.
+**Fix needed:**
+- Fix the `audit_results_path: "--root"` generation bug in the task file writer.
+- Add schema validation on ingestion that emits field-level errors: `"evidence[2] must be a string, got object"` rather than a bare JS runtime crash.
+- The ingestion executor must not exit 0 on partial failure.
+---
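
The field-level error format requested in F-03 might look like this in practice. This is a minimal sketch with hypothetical helper names, not the package's validator; it assumes `evidence` is meant to be an array of plain strings, which is the reading the updated `docs/contract.md` in this same diff adopts.

```javascript
// Name the JavaScript type in a way that distinguishes null and arrays.
function describeType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// Return one message per offending evidence entry instead of throwing.
function evidenceErrors(finding) {
  const { evidence } = finding;
  if (!Array.isArray(evidence)) {
    return [`evidence must be an array, got ${describeType(evidence)}`];
  }
  return evidence
    .map((item, index) =>
      typeof item === "string"
        ? null
        : `evidence[${index}] must be a string, got ${describeType(item)}`,
    )
    .filter((message) => message !== null);
}

const evidenceReport = evidenceErrors({
  evidence: [
    "src/foo.py:42: variable overwritten before use",
    "src/foo.py:57: result discarded",
    { excerpt: "x = 1", line_reference: 42 },
  ],
});
console.log(evidenceReport); // [ 'evidence[2] must be a string, got object' ]
```

A caller could treat any non-empty result as a failed ingestion and set a non-zero exit code, which would also address the "must not exit 0 on partial failure" point.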
+### F-04 — CLI hangs without output [CD]
+On multiple occasions, `audit-code` or `audit-code --results <file>` produced no output and hung indefinitely. No timeout, no error message, no way to distinguish a genuine hang from a slow operation.
+**Observed pattern:** Hangs were most frequent immediately after large ingestions and at session start during manifest structuring. Suspected cause: Node.js blocking on large JSON serialization or file locking between the orchestrator writing state and the CLI reading it.
+**Fix needed:** Add a timeout (`--timeout <ms>`) and ensure the CLI emits a progress indicator or heartbeat on long operations.
+---
+### F-05 — Requeue tasks explosion — 141 tasks from 10 findings [OC]
+After ingesting the first batch of 10 `data_integrity` findings, the orchestrator generated a `requeue_tasks.json` with 141 tasks — more than the original 64 audit tasks. The requeue logic appears to fan out across all `(lens, file_group)` combinations, producing a combinatorial explosion. No guidance is provided on which requeue tasks are actually needed vs. redundant.
+**Fix needed:** Requeue logic must de-duplicate against already-completed coverage. Requeue tasks should only be generated for `(file_group, lens)` pairs not already marked complete in `coverage_matrix.json`.
+---
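
The de-duplication proposed in F-05 can be sketched as a filter over `(file_group, lens)` pairs. The names and shapes here are assumptions: the report does not show how completed pairs would be derived from `coverage_matrix.json`, so the sketch takes them as a ready-made list.

```javascript
// Build a collision-free key for a (file_group, lens) pair.
function pairKey(fileGroup, lens) {
  return JSON.stringify([fileGroup, lens]);
}

// Hypothetical shapes: both arguments are arrays of { file_group, lens }.
function dedupeRequeue(requeueTasks, completedPairs) {
  const done = new Set(completedPairs.map((p) => pairKey(p.file_group, p.lens)));
  const seen = new Set();
  return requeueTasks.filter((task) => {
    const key = pairKey(task.file_group, task.lens);
    // Drop pairs that are already covered or already queued in this batch.
    if (done.has(key) || seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const requeued = dedupeRequeue(
  [
    { file_group: "src-lib", lens: "correctness" },
    { file_group: "src-lib", lens: "data_integrity" },
    { file_group: "src-lib", lens: "data_integrity" },
    { file_group: "src-modules", lens: "correctness" },
  ],
  [{ file_group: "src-lib", lens: "correctness" }],
);
console.log(requeued.map((t) => `${t.file_group}:${t.lens}`));
// [ 'src-lib:data_integrity', 'src-modules:correctness' ]
```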
+## Medium
+### F-06 — Evidence schema undocumented; validation only at ingestion [Both]
+**CD observation:** `evidence[]` must be an array of plain strings. This constraint is documented in `current-prompt.md` but not validated until `--results` ingestion, where failure produces a cryptic JS runtime error with no field-level attribution.
+**OC observation:** `evidence` must be an array of objects (`[{excerpt, line_reference}]`), not a single object. Discovered only through the error `"(finding.evidence ?? []) is not iterable"`.
+**Note — schema discrepancy:** The two environments reported conflicting evidence types (strings vs. objects). This is itself a documentation or versioning problem — the expected format is not the same across runs, or the prompt changed between environments.
+**Fix needed:**
+- Publish a JSON Schema file (or inline schema comment in the prompt) that workers can validate against before submission.
+- The string format should be explicit: `"src/foo.py:42 — variable overwritten before use"`.
+- Reconcile the expected type (string vs object) and pick one; document it prominently.
+---
+### F-07 — `synthesis_current` obligation permanently shows "missing" [Both]
+**CD observation:** Even after `synthesis_report.json` was fully populated (95 findings, 46 clusters, 22 work blocks), the obligation tracker showed `synthesis_current: missing` because it was blocked by `audit_tasks_completed` (itself blocked by the undefined-status bug, F-01). The worker cannot distinguish "synthesis truly hasn't run" from "synthesis ran but the gate is broken."
+**OC observation:** The obligation was never going to be satisfied by the framework — the synthesis agent never ran. Required manually creating `synthesis_report.json` and forcing the obligation to `satisfied`.
+**Fix needed:** Decouple `synthesis_current` satisfaction from `audit_tasks_completed`. If `synthesis_report.json` exists and is non-empty, `synthesis_current` should be satisfied regardless of upstream gate state.
+---
+### F-08 — "All remaining N tasks low priority" — no guidance on what to do [CD]
+At a certain point the orchestrator indicated all remaining 110 tasks were low priority. The directive does not define what the worker should do: submit empty findings (what was done), review at reduced depth, or skip entirely. Submitting `findings: []` for 60+ tasks in rapid succession was the only way to unblock the orchestrator, but legitimate low-severity findings in those files were never written.
+**Fix needed:** When the orchestrator decides remaining tasks are low priority, emit an explicit directive — e.g., `"You may submit empty findings for these tasks"` or `"Review at reduced depth"` or `"Skip — the audit has sufficient coverage"`.
+---
+### F-09 — Trivially empty files dispatched as full audit tasks [CD]
+The task manifest dispatched audit tasks for empty `__init__.py` files (some containing only a docstring), dotfiles (`.gitignore`, `.gitattributes`), and one-line stub files. Each required a full read→write→ingest round-trip to produce an empty `findings: []` result. For 30 files this added ~30–40 pointless round-trips; at scale this is severe.
+**Fix needed:** Filter files below a minimum token threshold (or with no parseable code constructs) before dispatching tasks. Batch all trivially-empty files into a single no-op task, or skip them entirely.
+---
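
One way to approximate the filter proposed in F-09 is shown below. This sketch swaps the suggested token threshold for a simpler line-count heuristic, and it is not the logic in the `trivialAudit.js` module added by this release; the function names and the `{ path, content }` input shape are hypothetical.

```javascript
// Heuristic (an assumption, not the package's rule): a file is trivial when,
// after dropping blank lines and `#` comment lines, at most `maxLines` remain.
function isTrivialFile(content, maxLines = 1) {
  const meaningful = content
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#"));
  return meaningful.length <= maxLines;
}

// Split files into those worth a full task and those that can share a no-op task.
function partitionFiles(files) {
  const trivial = [];
  const substantive = [];
  for (const file of files) {
    (isTrivialFile(file.content) ? trivial : substantive).push(file.path);
  }
  return { trivial, substantive };
}

const partition = partitionFiles([
  { path: "pkg/__init__.py", content: "" },
  { path: "pkg/sub/__init__.py", content: '"""Subpackage docstring."""\n' },
  { path: "pkg/model.py", content: "import math\n\ndef f(x):\n    return math.sqrt(x)\n" },
]);
console.log(partition);
// { trivial: [ 'pkg/__init__.py', 'pkg/sub/__init__.py' ],
//   substantive: [ 'pkg/model.py' ] }
```

Dotfiles such as `.gitignore` can exceed the line threshold, so excluding them would need a separate path-based rule.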
+### F-10 — No batch task dispatch; one task per CLI invocation [Both]
+**CD observation:** 126 tasks required 252+ CLI invocations plus 126 file reads and writes. The design assumes one task = one LLM call = one CLI round-trip.
+**OC observation:** No bulk ingestion mechanism; wrote a custom `scripts/ingest-results.mjs` that directly mutated `audit_results.jsonl`, `coverage_matrix.json`, and `audit_state.json`.
+**Fix needed:** Support batched dispatch (a `current-tasks.json` with N tasks per run) and a native `audit-code --batch-results <dir>` that processes all result files in a directory. Alternatively, make `agentBatchSize` settable to a meaningful value that workers actually see in their prompt.
+---
+### F-11 — Coverage matrix ↔ task_id mapping is opaque [OC]
+The relationship between `audit_tasks.json` task IDs (e.g., `src-lib:correctness`) and `coverage_matrix.json` file entries (e.g., `src/lib/file-utils.ts` with `required_lenses: [correctness, ...]`) is implicit. A task's `file_group` maps to multiple files in the matrix, but there is no explicit mapping table. The mapping must be reverse-engineered by reading both files.
+**Fix needed:** Either include the resolved file list in each task, or provide an `audit-code explain-task <task_id>` subcommand that shows which files and lenses a task covers.
+---
+### F-12 — Bash variable substitution breaks `node -e` shell loops [CD]
+When batching remaining tasks with a shell loop that invoked `node -e '...'`, bash expanded `${}` syntax inside the JS string before Node.js received it, producing `bad substitution` errors. The workaround (write a standalone `.mjs` file) is not documented anywhere.
+**Fix needed:** Document the correct pattern for batch submission loops, or provide a native `audit-code --batch-results <dir>` to eliminate the need for shell-level scripting entirely.
+---
+### F-13 — Session config discovery and provider switching are error-prone [Both]
+**CD observation:** Every invocation printed `[session-config] no session-config.json found` — 252+ times across the run. The warning appeared even after completing ingestion steps that should have established the session.
+**OC observation:** Required manually creating `session-config.json` with `{"provider": "opencode"}`. Even after doing so, the provider change only altered the error message; actual dispatch still did not work.
+**Fix needed:** Create a default `session-config.json` on first `audit-code` run. Suppress the warning when a session config is genuinely optional. Document the `provider` field and its valid values prominently in the session-config guide.
+---
+### F-14 — No documentation on artifact schema or contract [OC]
+The `contract_version: "audit-code/v1alpha1"` header implies a versioned protocol, but there is no schema documentation. The expected formats for JSONL structure, finding shape, task_id conventions, and coverage matrix layout had to be learned entirely by reading existing artifacts and reverse-engineering error messages.
+**Fix needed:** Publish a contract reference document alongside `docs/contract.md` (which exists but may be incomplete) covering: all artifact file schemas, field-level types and constraints, task_id naming convention, and the expected `AuditResult` JSON shape with a worked example.
+---
+## Low
+### F-15 — `worker_results_pending.json` not cleared after ingestion [CD]
+After `audit-code --results <file>`, the staging file is not cleared. Stale results from the previous task are resubmitted if the worker forgets to overwrite the file.
+**Fix needed:** After successful ingestion, delete or rename `worker_results_pending.json` (e.g., to `worker_results_submitted_<timestamp>.json`).
+---
+### F-16 — `related_findings` contains only circular self-references [CD]
+In the synthesized `synthesis_report.json`, nearly every finding's `related_findings` array contains only the finding's own ID. This is useless and misleads reviewers into thinking cross-finding relationships were analyzed.
+**Fix needed:** Omit `related_findings` when the synthesis engine cannot identify cross-finding relationships, rather than populating it with a self-reference.
+---
+### F-17 — Runtime validation evidence shows "pending" for every finding [CD]
+All 95 findings contain entries like `"runtime:unit:src-modules: pending — No runtime evidence recorded yet"`. These appear verbatim across every finding, bloating each one with 2–3 lines of noise that convey nothing.
+**Fix needed:** Omit pending evidence entries from the output entirely. A finding with no runtime evidence should have no runtime evidence in its array — not a placeholder repeated 95 times.
+---
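
The fixes proposed in F-16 and F-17 amount to dropping noise from each finding before it is written. The sketch below is illustrative only: `cleanFinding` is a hypothetical name, the finding shape `{ id, related_findings, evidence }` is assumed, and the pending-entry pattern is modeled on the single example quoted in F-17.

```javascript
// Matches entries shaped like "runtime:unit:src-modules: pending ...".
const PENDING_RUNTIME_ENTRY = /^runtime:\S+: pending\b/;

function cleanFinding(finding) {
  // F-16: a finding that only references itself has no real relationships.
  const related = (finding.related_findings ?? []).filter(
    (id) => id !== finding.id,
  );
  // F-17: placeholder runtime entries carry no information.
  const evidence = (finding.evidence ?? []).filter(
    (line) => !PENDING_RUNTIME_ENTRY.test(line),
  );
  const { related_findings: _ignored, ...rest } = finding;
  const cleaned = { ...rest, evidence };
  // Omit related_findings entirely when nothing remains.
  if (related.length > 0) cleaned.related_findings = related;
  return cleaned;
}

const cleanedFinding = cleanFinding({
  id: "finding-1",
  related_findings: ["finding-1"],
  evidence: [
    "src/modules/norm.py:88: division without eps guard",
    "runtime:unit:src-modules: pending - No runtime evidence recorded yet",
  ],
});
console.log(cleanedFinding);
// { id: 'finding-1',
//   evidence: [ 'src/modules/norm.py:88: division without eps guard' ] }
```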
+### F-18 — `root_cause_clusters` are file co-location groups, not semantic clusters [CD]
+The 46 clusters are named `"correctness/correctness in src/modules"` — file-path groups with a lens label, not semantic root causes. One cluster for "correctness in src/modules" contains 5 findings with 3 entirely different root causes (division-by-zero, cudagraph violation, dead code).
+**Fix needed:** Root-cause clustering should be semantic (e.g., "All NaN paths from missing eps guards"). Either improve the clustering algorithm or rename the section to `file_clusters` to accurately describe what it contains.
+---
+### F-19 — `work_blocks` section omitted from final summary presentation [CD]
+The audit directive says findings should be organized into "non-overlapping blocks of tasks." `synthesis_report.json` correctly generates 22 `work_blocks`. The final summary presentation omitted this section entirely, showing only a flat findings table. Work blocks are arguably the most actionable output and should lead the summary.
+**Fix needed:** Make the `work_blocks` presentation requirement explicit and prominent in the final-output section of the prompt.
+---
+### F-20 — `reviewed_ranges` field is unenforceable and creates false confidence [CD]
+Workers can declare `reviewed_ranges: [{start: 1, end: 10}]` while writing findings about line 800, or declare the full file range without actually reading it. For a 966-line file this makes the field meaningless.
+**Fix needed:** Either remove `reviewed_ranges` (it creates false confidence) or enforce it mechanically by requiring a content hash or line-count to be included alongside the range declaration.
+---
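
Part of the mechanical enforcement suggested in F-20 can be sketched with a line count alone. This is a rough illustration, not the package's behavior: `rangeErrors` is a hypothetical name, and the idea that each finding exposes a line number is an assumption. It catches findings written outside the declared ranges, but not ranges declared without reading the file, which would need the content hash the report mentions.

```javascript
// reviewedRanges: [{ start, end }] (1-based, inclusive), as in the report's example.
// totalLines: current line count of the file. findingLines: lines cited by findings.
function rangeErrors(reviewedRanges, totalLines, findingLines) {
  const errors = [];
  reviewedRanges.forEach((range, index) => {
    const valid =
      Number.isInteger(range.start) &&
      Number.isInteger(range.end) &&
      range.start >= 1 &&
      range.end >= range.start &&
      range.end <= totalLines;
    if (!valid) errors.push(`reviewed_ranges[${index}] is outside 1..${totalLines}`);
  });
  for (const line of findingLines) {
    const covered = reviewedRanges.some(
      (range) => line >= range.start && line <= range.end,
    );
    if (!covered) errors.push(`finding at line ${line} is outside every reviewed range`);
  }
  return errors;
}

// The scenario from F-20: ranges cover lines 1-10, the finding cites line 800.
const rangeReport = rangeErrors([{ start: 1, end: 10 }], 966, [800]);
console.log(rangeReport); // [ 'finding at line 800 is outside every reviewed range' ]
```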
+## Summary Table
+| ID | Issue | Sev | Env | Type |
+|----|-------|-----|-----|------|
+| F-01 | Orchestrator never reaches `status: "complete"` — task statuses undefined | Critical | Both | Bug |
+| F-02 | Worker launch failures / silent executor failures | Critical | Both | Bug |
+| F-03 | `--results` ingestion unreliable (type errors, wrong path, `.map()` crash) | High | Both | Bug |
+| F-04 | CLI hangs without output on some invocations | High | CD | Bug |
+| F-05 | Requeue tasks explosion — 141 tasks from 10 findings | High | OC | Bug |
+| F-06 | Evidence schema undocumented; validated only at ingestion (conflicting types across envs) | Medium | Both | DX |
+| F-07 | `synthesis_current` permanently "missing" even when report is populated | Medium | Both | Bug |
+| F-08 | "All remaining N tasks low priority" — no guidance on worker action | Medium | CD | UX |
+| F-09 | Trivially empty files dispatched as full audit tasks | Medium | CD | Design |
+| F-10 | No batch task dispatch; one task = one CLI invocation = one round-trip | Medium | Both | Design |
+| F-11 | Coverage matrix ↔ task_id mapping is opaque; no lookup table | Medium | OC | DX |
+| F-12 | Bash variable substitution breaks `node -e` shell loops | Medium | CD | DX |
+| F-13 | Session config discovery / provider switching error-prone; noisy warnings | Medium | Both | UX |
+| F-14 | No documentation on artifact schema or contract | Medium | OC | DX |
+| F-15 | `worker_results_pending.json` not cleared after ingestion | Low | CD | Bug |
+| F-16 | `related_findings` circular self-references | Low | CD | Data |
+| F-17 | Runtime validation "pending" entries in all 95 findings | Low | CD | Data |
+| F-18 | `root_cause_clusters` are file co-location groups, not semantic | Low | CD | Design |
+| F-19 | `work_blocks` omitted from final summary presentation | Low | CD | UX |
+| F-20 | `reviewed_ranges` unenforceable; creates false confidence | Low | CD | Design |
+**Env key:** CD = Claude Desktop (claude-code provider), OC = OpenCode (opencode provider), Both = independently observed in both.
+**Type key:** Bug = incorrect behavior, DX = developer/worker experience, UX = output/presentation, Design = intentional design that needs reconsideration, Data = output data quality.