auditor-lambda 0.3.3 → 0.3.5

Files changed (47)
  1. package/README.md +6 -1
  2. package/audit-code-wrapper-lib.mjs +87 -7
  3. package/dist/cli.js +517 -91
  4. package/dist/extractors/graph.d.ts +5 -1
  5. package/dist/extractors/graph.js +223 -3
  6. package/dist/extractors/pathPatterns.d.ts +3 -2
  7. package/dist/extractors/pathPatterns.js +97 -24
  8. package/dist/io/artifacts.d.ts +5 -0
  9. package/dist/io/artifacts.js +2 -0
  10. package/dist/orchestrator/advance.js +1 -1
  11. package/dist/orchestrator/dependencyMap.js +18 -0
  12. package/dist/orchestrator/fileAnchors.d.ts +32 -0
  13. package/dist/orchestrator/fileAnchors.js +217 -0
  14. package/dist/orchestrator/internalExecutors.d.ts +1 -1
  15. package/dist/orchestrator/internalExecutors.js +120 -33
  16. package/dist/orchestrator/reviewPackets.d.ts +14 -0
  17. package/dist/orchestrator/reviewPackets.js +310 -0
  18. package/dist/orchestrator/selectiveDeepening.d.ts +14 -0
  19. package/dist/orchestrator/selectiveDeepening.js +392 -0
  20. package/dist/orchestrator/state.js +6 -1
  21. package/dist/orchestrator/taskBuilder.d.ts +16 -0
  22. package/dist/orchestrator/taskBuilder.js +68 -11
  23. package/dist/prompts/renderWorkerPrompt.js +2 -1
  24. package/dist/providers/claudeCodeProvider.js +3 -1
  25. package/dist/providers/index.js +2 -1
  26. package/dist/supervisor/operatorHandoff.js +22 -11
  27. package/dist/types/graph.d.ts +1 -0
  28. package/dist/types/reviewPlanning.d.ts +41 -0
  29. package/dist/types/reviewPlanning.js +1 -0
  30. package/dist/types/sessionConfig.d.ts +1 -0
  31. package/dist/validation/artifacts.js +13 -0
  32. package/dist/validation/auditResults.js +50 -2
  33. package/dist/validation/sessionConfig.js +5 -0
  34. package/docs/agent-integrations.md +4 -1
  35. package/docs/bootstrap-install.md +3 -0
  36. package/docs/contract.md +3 -0
  37. package/docs/dispatch-implementation-plan.md +220 -489
  38. package/docs/next-steps.md +13 -8
  39. package/docs/product-direction.md +5 -3
  40. package/docs/run-flow.md +25 -30
  41. package/docs/session-config.md +15 -4
  42. package/docs/supervisor.md +5 -3
  43. package/docs/workflow-refactor-brief.md +114 -176
  44. package/package.json +1 -1
  45. package/schemas/finding.schema.json +1 -15
  46. package/schemas/graph_bundle.schema.json +16 -0
  47. package/skills/audit-code/audit-code.prompt.md +11 -6
@@ -1,553 +1,284 @@
1
- # Dispatch Automation Implementation Plan
2
-
3
- ## Background
4
-
5
- The current audit-code workflow requires the LLM orchestrator to manually assemble
6
- subagent prompts, handle schema normalization, and merge results, costing hundreds of
7
- tokens per task and producing frequent schema violations. This plan replaces that with a
8
- deterministic scripted dispatch layer so the orchestrator's only job is to fire Agent
9
- tool calls with pre-built prompts, then run a merge script.
10
-
11
- **Environment constraint:** Claude Desktop with no separate Anthropic API key. Subagent
12
- dispatch must go through the `Agent` tool in the conversation runtime — no direct SDK
13
- calls. All other steps must be zero-token scripts.
14
-
15
- ---
16
-
17
- ## Target workflow (per audit cycle)
18
-
1
+ # Dispatch Automation Reference
2
+
3
+ This document describes the implemented review-dispatch path for `/audit-code`.
4
+ The original dispatch plan was one agent per audit task. The current path keeps
5
+ the existing `AuditTask` and `AuditResult` contracts, but groups related tasks
6
+ into review packets so a worker can read a coherent file set once and submit
7
+ one validated result for each assigned task through a backend-owned write path.
8
+
9
+ ## Current Workflow
10
+
11
+ ```text
12
+ 1. audit-code
13
+ -> advances deterministic state until semantic review is needed
14
+ -> emits a blocked handoff with active_review_run.run_id
15
+
16
+ 2. audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
17
+ -> reads pending-audit-tasks.json and review planning artifacts
18
+ -> writes a slim dispatch-plan.json
19
+ -> writes backend-owned dispatch-result-map.json
20
+ -> writes one packet prompt per dispatch-plan entry
21
+ -> prints one compact JSON envelope
22
+
23
+ 3. Conversation orchestrator reads only dispatch-plan.json
24
+ -> launches one subagent per packet
25
+ -> each subagent reads its packet prompt and assigned files
26
+ -> each subagent pipes AuditResult[] to the submit-packet command in the prompt
27
+ -> submit-packet validates and writes only backend-assigned result files
28
+ -> each subagent replies: valid: <packet_id>, findings=<n>
29
+
30
+ 4. audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
31
+ -> validates that every assigned task has exactly one valid result
32
+ -> rejects missing, unknown, duplicate, malformed, or out-of-scope results
33
+ -> writes audit-results.json as the existing AuditResult[] shape
34
+ -> ingests accepted results through the normal result_ingestion_executor
35
+ -> prints one compact JSON envelope
36
+
37
+ 5. Repeat `audit-code` until complete.
19
38
  ```
20
- 1. node dist/index.js audit-code
21
- → emits run_id + pending-audit-tasks.json
22
-
23
- 2. node dispatch/prepare-dispatch.mjs --run-id <run_id>
24
- → reads tasks + schemas → writes dispatch-plan.json
25
- (deterministic, 0 LLM tokens)
26
-
27
- 3. [Orchestrator reads dispatch-plan.json — small JSON array]
28
- [Orchestrator fires N Agent calls in ONE message, verbatim prompts from plan]
29
-
30
- Each subagent (×N, parallel):
31
- - reads source files with Read tool
32
- - performs lens audit
33
- - writes result to task-results/<sanitized_task_id>.json using Write tool
34
- - runs: node dispatch/validate-result.mjs <run_id> <task_id>
35
- - if non-zero: fixes errors, rewrites, re-validates (max 3 attempts)
36
- - if still failing after 3: writes empty-but-valid fallback result
37
-
38
- 4. node dispatch/merge-results.mjs --run-id <run_id>
39
- → validates all task-results/*.json
40
- → writes audit-results.json (passing results only)
41
- → writes failed-tasks.json (task_ids that failed validation)
42
- (deterministic, 0 LLM tokens)
43
-
44
- 5. node dist/index.js worker-run --run-id <run_id>
45
- → ingests audit-results.json → coverage matrix → marks tasks complete
46
- (deterministic, 0 LLM tokens)
47
-
48
- 6. Repeat from step 1 until no pending tasks remain.
49
- ```
50
-
51
- Orchestrator token cost per cycle: **~50 tokens × N tasks** (read dispatch-plan + invoke Agent calls). Independent of source file sizes.
52
39
 
53
- ---
40
+ The parent orchestrator should not read prompt files, pending tasks, completed
41
+ task result payloads, or source files during the packet dispatch path unless a
42
+ backend command fails and the error requires diagnosis.
54
43
 
55
- ## Files to create
44
+ ## Planning Artifacts
56
45
 
57
- ```
58
- dispatch/
59
- lens-definitions.json — lens descriptions embedded in every subagent prompt
60
- validate.mjs — shared validation logic (imported by other scripts)
61
- validate-result.mjs — CLI: validate one task-results file
62
- prepare-dispatch.mjs — reads pending tasks → writes dispatch-plan.json
63
- merge-results.mjs — merges validated task results → audit-results.json
64
- ```
46
+ Planning writes two packet-specific artifacts alongside the existing task and
47
+ coverage artifacts:
65
48
 
66
- ## Files to modify
49
+ - `review_packets.json`: deterministic packets derived from current
50
+ `AuditTask` records.
51
+ - `audit_plan_metrics.json`: task count, packet count, repeated file/line
52
+ estimates, largest packet, and estimated agent-count reduction.
67
53
 
68
- ```
69
- package.json — add ajv devDependency; add dispatch:* npm scripts
70
- ```
54
+ Packets preserve task identity. They change the worker-facing unit of work, not
55
+ the backend-owned validation or ingestion contract.
71
56
 
72
- > **Do NOT add `dispatch/` to the `files` array in package.json.** These scripts are
73
- > local dev tooling and must not be published to npm.
57
+ ## Packet Construction
74
58
 
75
- ---
59
+ Packet planning is deterministic and compatibility-preserving:
76
60
 
77
- ## Step 1 Add `ajv` dependency
61
+ - tasks sharing the same file set and scope are grouped across lenses
62
+ - tiny homogeneous test files are batched before dispatch
63
+ - graph edges from imports, calls, and references can merge related task groups
64
+ - heuristic container edges do not force packet expansion
65
+ - packet chunking respects task-count and line-budget limits
66
+ - a single file that exceeds the packet target is isolated rather than split
67
+ - high-priority packets sort ahead of lower-priority packets
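
The first grouping rule above can be illustrated with a minimal sketch. It assumes each task carries `task_id`, `lens`, and `file_paths` as in the packet example; the function name `groupTasksByFileSet` and the grouping key are hypothetical, and the sketch ignores scope, graph edges, batching, and line budgets, so it is not the package's planner.

```javascript
// Hypothetical sketch: group tasks that share the same file set across lenses.
// The key format and function name are assumptions made for illustration.
function groupTasksByFileSet(tasks) {
  const groups = new Map();
  for (const task of tasks) {
    // Sort a copy so the key does not depend on file order within a task.
    const sortedPaths = [...task.file_paths].sort();
    const key = sortedPaths.join("\n");
    if (!groups.has(key)) {
      groups.set(key, { file_paths: sortedPaths, task_ids: [], lenses: [] });
    }
    const group = groups.get(key);
    group.task_ids.push(task.task_id);
    if (!group.lenses.includes(task.lens)) {
      group.lenses.push(task.lens);
    }
  }
  // Map iteration follows insertion order, so the output is deterministic.
  return [...groups.values()];
}

const groupedExample = groupTasksByFileSet([
  { task_id: "src-auth:security", lens: "security", file_paths: ["src/api/auth.ts", "src/lib/session.ts"] },
  { task_id: "src-auth:correctness", lens: "correctness", file_paths: ["src/lib/session.ts", "src/api/auth.ts"] },
  { task_id: "src-db:security", lens: "security", file_paths: ["src/db/pool.ts"] },
]);
```

Here the two `src-auth` tasks land in one group with both lenses, while the `src-db` task stays on its own.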
78
68
 
79
- In `package.json`, add to `devDependencies`:
69
+ Generated packets include:
80
70
 
81
71
  ```json
82
- "ajv": "^8.17.1"
72
+ {
73
+ "packet_id": "src-auth:security-correctness:packet-1-...",
74
+ "task_ids": ["src-auth:security", "src-auth:correctness"],
75
+ "lenses": ["security", "correctness"],
76
+ "file_paths": ["src/api/auth.ts", "src/lib/session.ts"],
77
+ "total_lines": 70,
78
+ "estimated_tokens": 1180
79
+ }
83
80
  ```
84
81
 
85
- Then run `npm install`.
86
-
87
- AJV v8 is required for JSON Schema draft 2020-12 support (which the existing schemas use).
88
- No other new dependencies are needed.
82
+ ## `prepare-dispatch` Output
89
83
 
90
- Also add npm scripts (optional convenience aliases):
84
+ Command:
91
85
 
92
- ```json
93
- "dispatch:prepare": "node dispatch/prepare-dispatch.mjs",
94
- "dispatch:merge": "node dispatch/merge-results.mjs",
95
- "dispatch:validate": "node dispatch/validate-result.mjs"
86
+ ```bash
87
+ audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
96
88
  ```
97
89
 
98
- ---
90
+ Artifacts:
99
91
 
100
- ## Step 2 — Create `dispatch/lens-definitions.json`
92
+ - `<artifacts_dir>/runs/<run_id>/dispatch-plan.json`
93
+ - `<artifacts_dir>/runs/<run_id>/dispatch-result-map.json`
94
+ - `<artifacts_dir>/runs/<run_id>/task-results/<packet_id>.prompt.md`
95
+ - `<artifacts_dir>/runs/<run_id>/task-results/<packet_id>.anchors.json`,
96
+ only for isolated large-file packets
97
+ - `<artifacts_dir>/runs/<run_id>/dispatch-warnings.json`, only when warnings
98
+ exist
101
99
 
102
- This file is embedded verbatim in every subagent prompt. It must be accurate enough that
103
- a subagent can scope its review correctly without reading any other files.
100
+ The command prints a compact JSON envelope:
104
101
 
105
102
  ```json
106
103
  {
107
- "correctness": {
108
- "description": "Logic errors, incorrect algorithm implementations, off-by-one bugs, type mismatches, wrong return values, incorrect state transitions, missing null/undefined guards, misuse of APIs. Focus on code that does the wrong thing.",
109
- "do_not_report": "Style issues, naming problems, missing tests, or findings that belong to other lenses."
110
- },
111
- "maintainability": {
112
- "description": "Code that is hard to change safely: excessive function length, deep nesting, tight coupling between unrelated modules, poor naming, magic constants, duplicated logic, inconsistent abstractions, unclear public APIs.",
113
- "do_not_report": "Correctness bugs, test gaps, or operational concerns."
114
- },
115
- "tests": {
116
- "description": "Test coverage gaps for important paths, tests that assert incorrect behavior (pinning bugs as expected), fragile or non-deterministic tests, missing negative/edge-case tests, tests that silently pass on stale builds (e.g. importing compiled dist/ rather than source).",
117
- "do_not_report": "Source code bugs — report only issues with the tests themselves."
104
+ "run_id": "run-1",
105
+ "dispatch_plan_path": ".audit-artifacts/runs/run-1/dispatch-plan.json",
106
+ "packet_count": 4,
107
+ "task_count": 18,
108
+ "largest_packet": {
109
+ "packet_id": "src-auth:security-correctness:packet-1-...",
110
+ "total_lines": 1320,
111
+ "estimated_tokens": 6180
118
112
  },
119
- "security": {
120
- "description": "Injection vulnerabilities (SQL, shell, path traversal), authentication/authorization flaws, secret exposure, insecure deserialization, privilege escalation, unsafe use of eval or child processes with user input.",
121
- "do_not_report": "Performance or correctness issues that are not security-relevant."
122
- },
123
- "reliability": {
124
- "description": "Failure modes without recovery, missing timeouts, unhandled promise rejections, race conditions, resource leaks (file handles, sockets, timers), incorrect retry logic, cascading failure risks.",
125
- "do_not_report": "Correctness bugs that do not affect reliability under failure conditions."
126
- },
127
- "performance": {
128
- "description": "Algorithmic inefficiencies (O(n²) where O(n) is possible), unnecessary re-computation, missing caching, synchronous blocking in hot paths, excessive memory allocation.",
129
- "do_not_report": "Correctness bugs unrelated to performance."
130
- },
131
- "data_integrity": {
132
- "description": "Missing input validation at trust boundaries, schema violations, inconsistent field naming across related schemas, data loss scenarios, missing required fields, enum values that are present in some schemas but not others.",
133
- "do_not_report": "UI or presentation issues; operational or deployment concerns."
134
- },
135
- "operability": {
136
- "description": "Missing or low-quality log output, error messages that don't help operators diagnose problems, missing progress indicators for long operations, no elapsed-time reporting, lack of dry-run or preview modes for destructive operations.",
137
- "do_not_report": "Correctness bugs or deployment configuration."
138
- },
139
- "config_deployment": {
140
- "description": "CI/CD pipeline correctness (wrong triggers, missing branch filters, floating version pins), deployment safety (no gate before publish, missing rollback), insecure secret handling in configs, mutable action tags that should be pinned to commit SHAs.",
141
- "do_not_report": "Runtime code issues; findings that belong to other lenses."
142
- }
113
+ "warning_count": 0,
114
+ "dispatch_warnings_path": null
143
115
  }
144
116
  ```
145
117
 
146
- ---
147
-
148
- ## Step 3 — Create `dispatch/validate.mjs`
118
+ `dispatch-plan.json` entries are intentionally small:
149
119
 
150
- Shared validation module. Exports a single function `validateResult(resultObj, fileLineCounts)`.
151
-
152
- ### Interface
153
-
154
- ```js
155
- /**
156
- * @param {object} resultObj — parsed JSON from a task-results file
157
- * @param {Record<string, number>} fileLineCounts — from the task's file_line_counts
158
- * @returns {{ valid: boolean, errors: string[] }}
159
- */
160
- export function validateResult(resultObj, fileLineCounts) { ... }
161
- ```
162
-
163
- ### Logic
164
-
165
- ```
166
- 1. AJV validate resultObj against schemas/audit_result.schema.json
167
- - Load finding.schema.json first (addSchema) so $ref resolves
168
- - Use Ajv({ strict: false }) to avoid complaints about unknown keywords like $schema
169
- - On failure: return { valid: false, errors: ajv.errors.map(e => formatAjvError(e)) }
170
-
171
- 2. Extra check — line range constraint:
172
- For each finding in resultObj.findings:
173
- For each entry in finding.affected_files:
174
- if entry.line_end is defined:
175
- look up total_lines from resultObj.file_coverage where path === entry.path
176
- if total_lines is undefined: push error "affected_files path '${entry.path}' not in file_coverage"
177
- else if entry.line_end > total_lines: push error
178
- "finding '${finding.id}': line_end ${entry.line_end} exceeds total_lines ${total_lines} for ${entry.path}"
179
-
180
- 3. Extra check — lens consistency:
181
- For each finding in resultObj.findings:
182
- if finding.lens !== resultObj.lens:
183
- push error "finding '${finding.id}': lens '${finding.lens}' does not match task lens '${resultObj.lens}'"
184
-
185
- 4. Extra check — affected_files paths in scope:
186
- Collect allowed paths from resultObj.file_coverage[].path
187
- For each finding's affected_files entry:
188
- if entry.path not in allowed paths:
189
- push error "finding '${finding.id}': affected path '${entry.path}' not in task file_coverage"
190
-
191
- 5. If any extra-check errors: return { valid: false, errors }
192
-
193
- 6. Return { valid: true, errors: [] }
194
- ```
195
-
196
- ### Schema loading
197
-
198
- Schemas are resolved relative to the project root. Use this logic to find the project root:
199
-
200
- ```js
201
- // dispatch/validate.mjs
202
- import { createRequire } from "node:module";
203
- import { dirname, resolve, join } from "node:path";
204
- import { fileURLToPath } from "node:url";
205
- import { readFileSync } from "node:fs";
206
- import Ajv from "ajv";
207
-
208
- const __filename = fileURLToPath(import.meta.url);
209
- const __dirname = dirname(__filename);
210
- // dispatch/ is one level below project root
211
- const PROJECT_ROOT = resolve(__dirname, "..");
212
- const SCHEMAS_DIR = join(PROJECT_ROOT, "schemas");
213
-
214
- function loadSchema(name) {
215
- return JSON.parse(readFileSync(join(SCHEMAS_DIR, name), "utf8"));
216
- }
217
-
218
- let _ajv = null;
219
- function getAjv() {
220
- if (_ajv) return _ajv;
221
- _ajv = new Ajv({ strict: false, allErrors: true });
222
- _ajv.addSchema(loadSchema("finding.schema.json"));
223
- return _ajv;
120
+ ```json
121
+ {
122
+ "packet_id": "src-auth:security-correctness:packet-1-...",
123
+ "description": "Audit 2 file(s), 2 task(s), 2 lens(es) (~70 lines)",
124
+ "prompt_path": ".audit-artifacts/runs/run-1/task-results/src-auth_security-correctness_packet-1_ab12cd34ef56.prompt.md"
224
125
  }
225
126
  ```
226
127
 
227
- ---
228
-
229
- ## Step 4 — Create `dispatch/validate-result.mjs`
230
-
231
- CLI wrapper for use by subagents after writing their result file.
232
-
233
- ### Usage
234
-
235
- ```
236
- node dispatch/validate-result.mjs <run_id> <task_id>
128
+ The orchestrator should launch one subagent per entry with the entry
129
+ description and a prompt that tells the subagent to read and follow
130
+ `entry.prompt_path`.
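
As a rough illustration of that launch step, the sketch below maps dispatch-plan entries to subagent requests. The entry fields match the example above; the function name `buildSubagentRequests`, the request shape, the prompt wording, and the sample paths are assumptions, since the actual launch mechanism depends on the conversation runtime.

```javascript
// Hypothetical sketch: turn dispatch-plan entries into subagent requests.
// The orchestrator passes only the description and a pointer to the prompt file;
// it does not read the prompt file itself.
function buildSubagentRequests(dispatchPlan) {
  return dispatchPlan.map((entry) => ({
    description: entry.description,
    prompt: `Read the packet prompt at ${entry.prompt_path} and follow it exactly.`,
  }));
}

const exampleRequests = buildSubagentRequests([
  {
    packet_id: "example-packet-1", // illustrative id, not a real packet
    description: "Audit 2 file(s), 2 task(s), 2 lens(es) (~70 lines)",
    prompt_path: ".audit-artifacts/runs/run-1/task-results/example-packet-1.prompt.md",
  },
]);
```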
131
+
132
+ ## Large File Mode
133
+
134
+ The workflow does not impose a hard single-file size limit. When a packet is
135
+ large because it contains one large file, `prepare-dispatch` keeps that file in
136
+ an isolated packet and writes a mechanical anchor summary next to the packet
137
+ prompt. The anchor summary may include:
138
+
139
+ - file boundaries
140
+ - imports and exports
141
+ - top-level symbols
142
+ - route-like declarations
143
+ - risk keywords
144
+ - graph edges
145
+ - external analyzer signals
146
+
147
+ The packet prompt points the worker at the anchor file and asks for targeted
148
+ reads/searches within the assigned file. The backend still validates and writes
149
+ results through `submit-packet`. This keeps large-file review bounded by
150
+ mechanically generated structure without slicing files into arbitrary line
151
+ ranges.
152
+
153
+ ## Packet Prompt Contract
154
+
155
+ Each packet prompt tells the worker to:
156
+
157
+ - review the packet once
158
+ - read only the listed repo-relative files
159
+ - produce one JSON object per listed task
160
+ - pipe one JSON array to the prompt's `submit-packet` command
161
+ - preserve the existing `AuditResult` fields:
162
+ `task_id`, `unit_id`, `pass_id`, `lens`, `file_coverage`, `findings`
163
+ - keep `file_coverage[]` as `{ path, total_lines }`
164
+ - keep every finding lens equal to the task lens
165
+ - avoid direct file writes, source edits, remediation, extra task results, and
166
+ unrelated audits
167
+ - reply exactly `valid: <packet_id>, findings=<total finding count>` after the
168
+ submit command accepts the packet
169
+
170
+ This keeps packet review efficient while leaving merge and ingestion
171
+ mechanically deterministic.
172
+
173
+ ## Submission and Validation
174
+
175
+ Packet submission is exposed through:
176
+
177
+ ```bash
178
+ audit-code submit-packet --run-id <run_id> --packet-id <packet_id> --artifacts-dir <artifacts_dir>
237
179
  ```
238
180
 
239
- - `run_id`: e.g. `20260424T152454170Z_audit_tasks_completed_001`
240
- - `task_id`: e.g. `src-adapters:correctness` (unsanitized; the script sanitizes internally)
181
+ The command reads `AuditResult[]` from stdin, validates the complete assigned
182
+ packet, and writes only the backend-assigned per-task result paths from
183
+ `dispatch-result-map.json`. This keeps result writes out of the LLM prompt and
184
+ prevents swapped or unknown task result files from being ingested.
241
185
 
242
- ### Logic
186
+ Per-task validation is exposed through:
243
187
 
188
+ ```bash
189
+ audit-code validate-result --run-id <run_id> --task-id <task_id> --artifacts-dir <artifacts_dir>
244
190
  ```
245
- 1. Parse argv: run_id = process.argv[2], task_id = process.argv[3]
246
- If either missing: print usage and exit 1
247
191
 
248
- 2. Locate artifacts_dir:
249
- Read .audit-artifacts/session-config.json to find artifacts_dir.
250
- If not present, default: join(PROJECT_ROOT, ".audit-artifacts")
192
+ Generated packet prompts may pass run ids, packet ids, task ids, and artifact paths through
193
+ base64url flags such as `--run-id-b64`, `--packet-id-b64`, `--task-id-b64`, and
194
+ `--artifacts-dir-b64` when raw values could contain shell-sensitive characters.
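
A minimal sketch of the base64url round trip, assuming the flag values are plain UTF-8 strings. The helper names are hypothetical and the CLI's actual decoding is not shown in this document; the sketch relies only on Node's built-in `Buffer` support for the `base64url` encoding.

```javascript
// Hypothetical helpers: encode a raw value for a *-b64 flag and decode it again.
// base64url output uses only A-Z, a-z, 0-9, "-" and "_", so it needs no shell quoting.
function encodeFlagValue(raw) {
  return Buffer.from(raw, "utf8").toString("base64url");
}

function decodeFlagValue(encoded) {
  return Buffer.from(encoded, "base64url").toString("utf8");
}

const rawTaskId = "src-auth:security $(not a command)"; // contains shell-sensitive characters
const encodedTaskId = encodeFlagValue(rawTaskId);
const decodedTaskId = decodeFlagValue(encodedTaskId);
```

The encoded value can then be passed as a single shell word, for example as the argument to `--task-id-b64`.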
251
195
 
252
- 3. Derive file path:
253
- sanitized = task_id.replace(/[^a-zA-Z0-9_-]/g, "_")
254
- resultPath = join(artifactsDir, "runs", run_id, "task-results", sanitized + ".json")
196
+ The validator checks the result against the assigned task set and enforces the
197
+ mechanical constraints that matter for ingestion:
255
198
 
256
- 4. Read and parse resultPath. If file not found or invalid JSON:
257
- print error, exit 1
199
+ - required `AuditResult` and finding fields
200
+ - finding lens matches the task lens
201
+ - cited and affected paths are in assigned coverage
202
+ - line spans do not exceed known `total_lines`
203
+ - result fields conform to the shipped schemas
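
The lens, path, and line-span checks can be sketched as follows. This assumes the task exposes `lens` and a `file_line_counts` map of path to line count, as in the earlier plan; the function name `checkResultAgainstTask` and the error wording are illustrative, and schema validation is left out.

```javascript
// Hypothetical sketch of the mechanical checks; not the shipped validator.
function checkResultAgainstTask(result, task) {
  const errors = [];
  // Assigned coverage: path -> total_lines, taken from the task rather than the result.
  const assigned = new Map(Object.entries(task.file_line_counts));
  for (const finding of result.findings) {
    if (finding.lens !== task.lens) {
      errors.push(`finding '${finding.id}': lens '${finding.lens}' does not match task lens '${task.lens}'`);
    }
    for (const entry of finding.affected_files ?? []) {
      if (!assigned.has(entry.path)) {
        errors.push(`finding '${finding.id}': path '${entry.path}' is not in assigned coverage`);
      } else if (entry.line_end !== undefined && entry.line_end > assigned.get(entry.path)) {
        errors.push(`finding '${finding.id}': line_end ${entry.line_end} exceeds total_lines ${assigned.get(entry.path)} for ${entry.path}`);
      }
    }
  }
  return { valid: errors.length === 0, errors };
}

const exampleTask = { task_id: "src-auth:security", lens: "security", file_line_counts: { "src/api/auth.ts": 100 } };
const acceptedCheck = checkResultAgainstTask(
  { findings: [{ id: "F1", lens: "security", affected_files: [{ path: "src/api/auth.ts", line_start: 10, line_end: 20 }] }] },
  exampleTask,
);
const rejectedCheck = checkResultAgainstTask(
  {
    findings: [
      { id: "F2", lens: "correctness", affected_files: [{ path: "src/api/auth.ts", line_end: 200 }] },
      { id: "F3", lens: "security", affected_files: [{ path: "src/other.ts" }] },
    ],
  },
  exampleTask,
);
```

In this example `F2` fails both the lens check and the line-span check, and `F3` cites a path outside the assigned coverage.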
258
204
 
259
- 5. Load the task from pending-audit-tasks.json to get file_line_counts:
260
- tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
261
- tasks = JSON.parse(readFileSync(tasksPath))
262
- task = tasks.find(t => t.task_id === task_id)
263
- fileLineCounts = task?.file_line_counts ?? {}
205
+ Workers should retry rejected submissions up to the bounded retry count in the prompt.
264
206
 
265
- 6. Call validateResult(resultObj, fileLineCounts) from validate.mjs
266
-
267
- 7. If valid: console.log("✓ valid:", task_id); exit 0
268
- If invalid:
269
- console.error("✗ invalid:", task_id);
270
- console.error(JSON.stringify(errors, null, 2));
271
- exit 1
272
- ```
207
+ ## `merge-and-ingest` Output
273
208
 
274
- ---
275
-
276
- ## Step 5 — Create `dispatch/prepare-dispatch.mjs`
277
-
278
- Core script. Reads pending tasks and produces a ready-to-use dispatch plan.
279
-
280
- ### Usage
281
-
282
- ```
283
- node dispatch/prepare-dispatch.mjs --run-id <run_id>
284
- ```
285
-
286
- ### Logic
287
-
288
- ```
289
- 1. Parse --run-id <run_id> from argv. Error if missing.
290
-
291
- 2. Resolve paths:
292
- artifactsDir = join(PROJECT_ROOT, ".audit-artifacts")
293
- runDir = join(artifactsDir, "runs", run_id)
294
- tasksPath = join(runDir, "pending-audit-tasks.json")
295
- dispatchPlanPath = join(runDir, "dispatch-plan.json")
296
-
297
- 3. Read pending-audit-tasks.json — array of AuditTask objects.
298
- If file not found: error and exit 1.
299
-
300
- 4. Load shared content (read once, reuse for all tasks):
301
- lensDefinitions = read dispatch/lens-definitions.json
302
- auditResultSchema = read schemas/audit_result.schema.json
303
- findingSchema = read schemas/finding.schema.json
304
-
305
- 5. For each task in tasks:
306
- a. sanitizedId = task.task_id.replace(/[^a-zA-Z0-9_-]/g, "_")
307
- b. outputPath = join(runDir, "task-results", sanitizedId + ".json")
308
- c. lensDef = lensDefinitions[task.lens]
309
- d. totalFileLines = Object.values(task.file_line_counts).reduce((a, b) => a + b, 0)
310
- e. description = `Audit ${task.unit_id} (${task.file_paths.length} file(s), ~${totalFileLines} lines) — ${task.lens} lens`
311
- f. prompt = buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, run_id, artifactsDir)
312
- g. Append { task_id, description, output_path: outputPath, prompt } to plan array
313
-
314
- 6. Ensure task-results/ directory exists:
315
- mkdirSync(join(runDir, "task-results"), { recursive: true })
316
-
317
- 7. Write plan array to dispatchPlanPath as formatted JSON.
318
-
319
- 8. Print: "Wrote dispatch-plan.json — N tasks ready for dispatch"
320
- Print: "Largest task: <task_id> (~N lines)"
321
- Print: ""
322
- Print: "--- ORCHESTRATOR INSTRUCTIONS ---"
323
- Print: "Read dispatch-plan.json. For each entry, fire one Agent call with:"
324
- Print: " description: <entry.description>"
325
- Print: " prompt: <entry.prompt>"
326
- Print: "Fire all N calls in a single message for parallel execution."
327
- Print: "When all complete, run: node dispatch/merge-results.mjs --run-id <run_id>"
328
- ```
329
-
330
- ### `buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, runId, artifactsDir)`
331
-
332
- Returns a string. Template (use template literals):
333
-
334
- ```
335
- You are a code auditor. Perform a bounded audit of the files listed below under the specified lens.
336
-
337
- ## Task metadata
338
- ${JSON.stringify(task, null, 2)}
339
-
340
- ## Files to read
341
- Read each path in task.file_paths using your Read tool. The repo root is the current working directory — paths are repo-relative (e.g. "src/foo.ts").
342
-
343
- file_line_counts gives the expected total line count for each file. Use those exact values for file_coverage[].total_lines in your result.
344
-
345
- ## Lens: ${task.lens}
346
- ${lensDef.description}
347
-
348
- Do NOT report: ${lensDef.do_not_report}
349
-
350
- ## Output format
351
- Write your result as a single JSON **object** (not an array) to this exact path:
352
- ${outputPath}
353
-
354
- The result must conform to the following schema:
355
-
356
- ### audit_result.schema.json
357
- ${JSON.stringify(auditResultSchema, null, 2)}
358
-
359
- ### finding.schema.json
360
- ${JSON.stringify(findingSchema, null, 2)}
361
-
362
- ## Hard constraints (violations will fail validation)
363
- 1. NEVER set line_end higher than the file's actual line count.
364
- Use file_line_counts as your reference. If in doubt, leave line_end omitted.
365
- 2. Every finding MUST have ALL required fields:
366
- id, title, category, severity, confidence, lens, summary, affected_files, evidence
367
- 3. lens on every finding must be exactly "${task.lens}"
368
- 4. No fields outside the schema. Forbidden: "recommendation", "tags", "description" (use "summary").
369
- 5. evidence[] must contain at least one specific file:line reference.
370
- Format: "path/to/file.ts:42 - brief description of what you see there"
371
- 6. affected_files[] entries are OBJECTS with a "path" key — NOT plain strings.
372
- Example: {"path": "src/foo.ts", "line_start": 10, "line_end": 20, "symbol": "myFunc"}
373
- 7. Only reference file paths that appear in this task's file_paths.
374
- 8. findings: [] is correct when you genuinely find nothing. Do not invent findings.
375
-
376
- ## Validation step (required)
377
- After writing your result, run:
378
- node dispatch/validate-result.mjs ${runId} ${task.task_id}
379
-
380
- - If it exits 0: you are done. Stop.
381
- - If it exits non-zero: read the error output, fix the JSON, rewrite the file, run again.
382
- - Repeat up to 3 times.
383
-
384
- If you cannot produce a valid result after 3 attempts, write this fallback (substituting real values):
385
- ${JSON.stringify({
386
- task_id: task.task_id,
387
- unit_id: task.unit_id,
388
- pass_id: task.pass_id,
389
- lens: task.lens,
390
- file_coverage: task.file_paths.map(p => ({ path: p, total_lines: task.file_line_counts[p] })),
391
- findings: [],
392
- notes: ["Validation failed after 3 attempts — empty result written as fallback."]
393
- }, null, 2)}
394
-
395
- Then validate the fallback passes before finishing.
396
- ```
397
-
398
- Note: the fallback JSON in the prompt is pre-computed in `buildPrompt` using the task
399
- data, not generated by the subagent.
400
-
401
- ---
402
-
403
- ## Step 6 — Create `dispatch/merge-results.mjs`
404
-
405
- ### Usage
406
-
407
- ```
408
- node dispatch/merge-results.mjs --run-id <run_id>
409
- ```
410
-
411
- ### Logic
209
+ Command:
412
210
 
211
+ ```bash
212
+ audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
413
213
  ```
414
- 1. Parse --run-id <run_id> from argv.
415
214
 
416
- 2. Resolve paths:
417
- taskResultsDir = join(artifactsDir, "runs", run_id, "task-results")
418
- auditResultsPath = join(artifactsDir, "runs", run_id, "audit-results.json")
419
- failedTasksPath = join(artifactsDir, "runs", run_id, "failed-tasks.json")
420
- tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
215
+ Merge behavior:
421
216
 
422
- 3. Read pending-audit-tasks.json to build fileLineCounts map:
423
- lineCounts = {}
424
- for each task: lineCounts[task.task_id] = task.file_line_counts
217
+ - validates every backend-assigned result-map path
218
+ - rejects unexpected JSON files under `task-results/`
219
+ - rejects task IDs that appear in the wrong assigned result path
220
+ - rejects duplicate task results
221
+ - rejects unknown task IDs
222
+ - rejects missing assigned task results
223
+ - writes `failed-tasks.json` and exits non-zero when any assigned result is
224
+ missing or invalid
225
+ - writes `audit-results.json` only from passing results
226
+ - invokes the normal result ingestion path only after the assigned set is clean
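
The missing, duplicate, and unknown checks amount to comparing the assigned task set with the submitted results. The sketch below assumes a flat list of assigned task IDs and an array of results with a `task_id` field; the function name and the `reason` labels are hypothetical, and path-level checks against `dispatch-result-map.json` are omitted.

```javascript
// Hypothetical sketch: every assigned task must have exactly one result.
function checkAssignedResults(assignedTaskIds, submittedResults) {
  const assigned = new Set(assignedTaskIds);
  const seen = new Set();
  const failures = [];
  for (const result of submittedResults) {
    if (!assigned.has(result.task_id)) {
      failures.push({ task_id: result.task_id, reason: "unknown" });
    } else if (seen.has(result.task_id)) {
      failures.push({ task_id: result.task_id, reason: "duplicate" });
    } else {
      seen.add(result.task_id);
    }
  }
  for (const taskId of assigned) {
    if (!seen.has(taskId)) {
      failures.push({ task_id: taskId, reason: "missing" });
    }
  }
  // Ingestion should proceed only when the assigned set is clean.
  return { clean: failures.length === 0, failures };
}

const mergeCheck = checkAssignedResults(
  ["a:security", "a:correctness", "b:security"],
  [{ task_id: "a:security" }, { task_id: "a:security" }, { task_id: "c:tests" }, { task_id: "b:security" }],
);
```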
425
227
 
426
- 4. Read all *.json files from task-results/:
427
- files = readdirSync(taskResultsDir).filter(f => f.endsWith(".json"))
428
-
429
- 5. For each file:
430
- a. Parse JSON
431
- b. Call validateResult(resultObj, lineCounts[resultObj.task_id] ?? {})
432
- c. If valid: push to passing[]
433
- d. If invalid: push { task_id: resultObj?.task_id ?? filename, errors } to failing[]
434
-
435
- 6. Write passing array to audit-results.json (as AuditResult[] — array of passing objects)
436
-
437
- 7. If failing.length > 0:
438
- Write failing array to failed-tasks.json
439
- Print warning: "${failing.length} task(s) failed validation and were excluded:"
440
- For each: print " ✗ ${f.task_id}: ${f.errors[0]}" (first error only for brevity)
441
-
442
- 8. Print: "✓ ${passing.length}/${total} tasks valid → ${auditResultsPath}"
443
- If failing.length > 0: print " Re-run those tasks in the next cycle."
444
-
445
- 9. Exit 0 regardless (partial ingestion is safe — failed tasks remain pending for requeue).
446
- ```
447
-
448
- ---
449
-
450
- ## Step 7 — Update `session-config.json` (optional but recommended)
451
-
452
- Add `dispatch_provider` field to `.audit-artifacts/session-config.json`:
228
+ On success the command prints one compact JSON envelope:
453
229
 
454
230
  ```json
455
231
  {
456
- "provider": "local-subprocess",
457
- "dispatch_provider": "claude-desktop",
458
- "agent_task_batch_size": 10
232
+ "run_id": "run-1",
233
+ "status": "completed",
234
+ "accepted_count": 18,
235
+ "rejected_count": 0,
236
+ "finding_count": 3,
237
+ "audit_results_path": ".audit-artifacts/runs/run-1/audit-results.json",
238
+ "selected_executor": "result_ingestion_executor",
239
+ "progress_made": true,
240
+ "progress_summary": "Ingested 18 audit result entries and refreshed dependent artifacts.",
241
+ "next_likely_step": "runtime_validation"
459
242
  }
460
243
  ```
461
244
 
462
- This is metadata only for now; no code reads `dispatch_provider` yet. It documents intent and provides the hook for future multi-provider support.
463
-
464
- ---
465
-
466
- ## Testing procedure
467
-
468
- ### Unit test: `validate-result.mjs`
245
+ If the command exits non-zero, the orchestrator should stop and report the exact
246
+ error instead of manually editing task results or audit state.
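
A minimal sketch of how an orchestrator might act on this contract, assuming it already has the command's exit code and captured output. `interpretMergeOutcome` is a hypothetical helper; the envelope field names are taken from the example above.

```javascript
// Hypothetical sketch: turn the command's exit code and output into a decision.
function interpretMergeOutcome(exitCode, stdout, stderr) {
  if (exitCode !== 0) {
    // Stop and surface the exact error text; do not touch task results or audit state.
    return { ok: false, report: (stderr || stdout).trim() };
  }
  // On success, stdout is expected to hold a single compact JSON envelope.
  const envelope = JSON.parse(stdout);
  return {
    ok: true,
    accepted: envelope.accepted_count,
    rejected: envelope.rejected_count,
    nextStep: envelope.next_likely_step,
  };
}

// Usage with a trimmed-down success envelope.
const outcome = interpretMergeOutcome(
  0,
  '{"status":"completed","accepted_count":18,"rejected_count":0,"next_likely_step":"runtime_validation"}',
  "",
);
console.log(outcome.nextStep); // runtime_validation
```

Returning a plain object rather than throwing keeps the failure text intact for reporting.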
469
247
 
470
- 1. Write a minimal valid result to a temp file:
471
- ```json
472
- {
473
- "task_id": "test:correctness",
474
- "unit_id": "test",
475
- "pass_id": "pass:correctness",
476
- "lens": "correctness",
477
- "file_coverage": [{"path": "src/foo.ts", "total_lines": 100}],
478
- "findings": []
479
- }
480
- ```
481
- 2. Run: `node dispatch/validate-result.mjs <some_run_id> test:correctness` — expect exit 0
482
- 3. Mutate the file: remove `lens` field — expect exit 1 with error mentioning `lens`
483
- 4. Mutate: add `line_end: 200` on an affected_file with total_lines 100 — expect exit 1
248
+ ## Selective Deepening
484
249
 
485
- ### Integration test: `prepare-dispatch.mjs`
250
+ Result ingestion may add follow-up `AuditTask` records for bounded selective
251
+ deepening. Triggers include:
486
252
 
487
- 1. Run against the current pending tasks:
488
- ```
489
- node dispatch/prepare-dispatch.mjs --run-id 20260424T152454170Z_audit_tasks_completed_001
490
- ```
491
- 2. Inspect `dispatch-plan.json`: each entry should have `task_id`, `description`, `output_path`, `prompt`
492
- 3. Verify `prompt` contains the task JSON, lens definition, both schemas, and the output path
253
+ - high-severity findings
254
+ - low-confidence or ambiguous findings
255
+ - conflicting conclusions across related results
256
+ - high-risk no-finding sampling unless explicitly marked unnecessary
257
+ - runtime-validation disagreement
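
To make a few of these triggers concrete, here is a rough sketch under loudly stated assumptions: the field names `severity`, `confidence`, and `sampling_unnecessary`, the `unitRisk` argument, and the trigger labels are all hypothetical and are not confirmed by the package. Cross-result conflicts and runtime-validation disagreement need more context than a single result, so they are left out.

```javascript
// Hypothetical sketch: derive selective-deepening triggers from one audit result.
// Every field name and label below is an assumption made for illustration only.
function deepeningTriggers(result, unitRisk) {
  const triggers = new Set();
  for (const finding of result.findings) {
    if (finding.severity === "high") triggers.add("high_severity");
    if (finding.confidence === "low") triggers.add("low_confidence");
  }
  // High-risk units with no findings are sampled unless explicitly marked unnecessary.
  if (result.findings.length === 0 && unitRisk === "high" && !result.sampling_unnecessary) {
    triggers.add("high_risk_no_finding_sampling");
  }
  return [...triggers];
}

// Usage: a high-risk unit that produced no findings.
console.log(deepeningTriggers({ findings: [] }, "high"));
// [ 'high_risk_no_finding_sampling' ]
```

Using a `Set` keeps each trigger listed once per result, which fits the bounded follow-up described above.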
493
258
 
494
- ### Integration test: `merge-results.mjs`
259
+ When follow-up tasks are added, the backend refreshes `review_packets.json` and
260
+ `audit_plan_metrics.json`. The next dispatch cycle handles those tasks through
261
+ the same packet contract.
495
262
 
496
- 1. Write 2 valid and 1 invalid result to `task-results/`
497
- 2. Run: `node dispatch/merge-results.mjs --run-id <id>`
498
- 3. Verify `audit-results.json` contains exactly the 2 valid results
499
- 4. Verify `failed-tasks.json` contains the 1 invalid task
500
- 5. Verify exit code is 0
263
+ ## Compatibility Notes
501
264
 
502
- ---
265
+ - `AuditTask` remains the planning and coverage identity.
266
+ - `AuditResult[]` remains the ingestion shape.
267
+ - The older `.audit-artifacts/dispatch/current-*` files still exist for
268
+ repo-local backend fallback and single-task handoff flows.
269
+ - Backend provider adapters remain compatibility bridges. The canonical
270
+ `/audit-code` flow expects the active conversation orchestrator to dispatch
271
+ packet subagents when the host supports them.
272
+ - The `dispatch/` directory is packaged because `lens-definitions.json` and
273
+ validation support are part of the installed packet workflow.
503
274
 
504
- ## Orchestrator usage reference
275
+ ## Verification
505
276
 
506
- When `prepare-dispatch.mjs` finishes, it prints the instructions inline. For reference:
277
+ Run the normal project gate:
507
278
 
279
+ ```bash
280
+ npm test
508
281
  ```
509
- 1. Run: node dispatch/prepare-dispatch.mjs --run-id <run_id>
510
- 2. Read: .audit-artifacts/runs/<run_id>/dispatch-plan.json
511
- 3. In ONE message, fire one Agent call per entry:
512
- Agent({ description: entry.description, prompt: entry.prompt })
513
- Fire all calls simultaneously — they run in parallel.
514
- 4. Wait for all subagents to complete.
515
- 5. Run: node dispatch/merge-results.mjs --run-id <run_id>
516
- 6. Run: node dist/index.js worker-run --run-id <run_id>
517
- 7. Run: node dist/index.js audit-code (to get next batch)
518
- 8. Repeat.
519
- ```
520
-
521
- **Important:** The orchestrator should NOT read `pending-audit-tasks.json`, NOT read
522
- any source files, NOT compose any prompts. Everything is pre-built. Just read
523
- `dispatch-plan.json` and fire the calls verbatim.
524
-
525
- ---
526
-
527
- ## Notes and caveats
528
-
529
- ### Large files (2000+ lines)
530
-
531
- Tasks with very large files (e.g. `audit-code-wrapper-lib.mjs` at 2215 lines) will still
532
- hit quota limits for subagents. The `prepare-dispatch.mjs` script should print a warning
533
- for tasks exceeding a threshold (e.g. 1500 total lines). These tasks may need to be split
534
- at the task-builder level — that is a separate concern and not addressed here.
535
-
536
- ### `audit_results_path` vs per-task files
537
-
538
- The existing `renderWorkerPrompt.ts` tells subagents to write to a shared
539
- `audit-results.json`. The new `prepare-dispatch.mjs`-generated prompts tell subagents to
540
- write to per-task `task-results/<task_id>.json` files. These are two separate dispatch
541
- paths — the old path (via `renderWorkerPrompt`) is still used for non-`claude-desktop`
542
- providers and is not modified by this plan.
543
-
544
- ### Future: provider abstraction
545
-
546
- `prepare-dispatch.mjs` output (`dispatch-plan.json`) is provider-agnostic. A future
547
- `anthropic-direct` provider could read the same `dispatch-plan.json` and call
548
- `messages.create()` for each entry via SDK, with no changes to `prepare-dispatch.mjs`.
549
-
550
- ### ajv and published package
551
282
 
552
- `ajv` is added as a devDependency. The `dispatch/` scripts are NOT in the `files` array
553
- and are not published. End users of the npm package are unaffected.
283
+ Focused packet coverage lives in `tests/review-packets.test.mjs` and
284
+ `tests/audit-code-wrapper.test.mjs`.