auditor-lambda 0.3.3 → 0.3.4

Files changed (35)
  1. package/README.md +6 -1
  2. package/audit-code-wrapper-lib.mjs +78 -5
  3. package/dist/cli.js +187 -67
  4. package/dist/extractors/graph.d.ts +5 -1
  5. package/dist/extractors/graph.js +223 -3
  6. package/dist/extractors/pathPatterns.d.ts +3 -2
  7. package/dist/extractors/pathPatterns.js +97 -24
  8. package/dist/io/artifacts.d.ts +5 -0
  9. package/dist/io/artifacts.js +2 -0
  10. package/dist/orchestrator/advance.js +1 -1
  11. package/dist/orchestrator/dependencyMap.js +18 -0
  12. package/dist/orchestrator/internalExecutors.d.ts +1 -1
  13. package/dist/orchestrator/internalExecutors.js +120 -33
  14. package/dist/orchestrator/reviewPackets.d.ts +14 -0
  15. package/dist/orchestrator/reviewPackets.js +300 -0
  16. package/dist/orchestrator/selectiveDeepening.d.ts +14 -0
  17. package/dist/orchestrator/selectiveDeepening.js +392 -0
  18. package/dist/orchestrator/state.js +6 -1
  19. package/dist/orchestrator/taskBuilder.d.ts +16 -0
  20. package/dist/orchestrator/taskBuilder.js +68 -11
  21. package/dist/prompts/renderWorkerPrompt.js +2 -1
  22. package/dist/types/graph.d.ts +1 -0
  23. package/dist/types/reviewPlanning.d.ts +41 -0
  24. package/dist/types/reviewPlanning.js +1 -0
  25. package/dist/validation/artifacts.js +13 -0
  26. package/docs/bootstrap-install.md +3 -0
  27. package/docs/dispatch-implementation-plan.md +179 -481
  28. package/docs/next-steps.md +13 -8
  29. package/docs/product-direction.md +5 -3
  30. package/docs/run-flow.md +23 -30
  31. package/docs/session-config.md +4 -1
  32. package/docs/workflow-refactor-brief.md +83 -154
  33. package/package.json +1 -1
  34. package/schemas/finding.schema.json +1 -15
  35. package/schemas/graph_bundle.schema.json +16 -0
package/docs/dispatch-implementation-plan.md
@@ -1,553 +1,251 @@
1
- # Dispatch Automation Implementation Plan
1
+ # Dispatch Automation Reference
2
2
 
3
- ## Background
3
+ This document describes the implemented review-dispatch path for `/audit-code`.
4
+ The original dispatch plan was one agent per audit task. The current path keeps
5
+ the existing `AuditTask` and `AuditResult` contracts, but groups related tasks
6
+ into review packets so a worker can read a coherent file set once and produce
7
+ one validated result file for each assigned task.
4
8
 
5
- The current audit-code workflow requires the LLM orchestrator to manually assemble
6
- subagent prompts, handle schema normalization, and merge results — costing hundreds of
7
- tokens per task and producing frequent schema violations. This plan replaces that with a
8
- deterministic scripted dispatch layer so the orchestrator's only job is to fire Agent
9
- tool calls with pre-built prompts, then run a merge script.
9
+ ## Current Workflow
10
10
 
11
- **Environment constraint:** Claude Desktop with no separate Anthropic API key. Subagent
12
- dispatch must go through the `Agent` tool in the conversation runtime — no direct SDK
13
- calls. All other steps must be zero-token scripts.
11
+ ```text
12
+ 1. audit-code
13
+ -> advances deterministic state until semantic review is needed
14
+ -> emits a blocked handoff with active_review_run.run_id
14
15
 
15
- ---
16
+ 2. audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
17
+ -> reads pending-audit-tasks.json and review planning artifacts
18
+ -> writes dispatch-plan.json
19
+ -> writes one packet prompt per dispatch-plan entry
20
+ -> prints one compact JSON envelope
16
21
 
17
- ## Target workflow (per audit cycle)
22
+ 3. Conversation orchestrator reads only dispatch-plan.json
23
+ -> launches one subagent per packet
24
+ -> each subagent reads its packet prompt and assigned files
25
+ -> each subagent writes one task-results/<task_id>.json per underlying task
26
+ -> each subagent runs the validation commands in the prompt
27
+ -> each subagent replies: valid: <packet_id>, findings=<n>
18
28
 
19
- ```
20
- 1. node dist/index.js audit-code
21
- emits run_id + pending-audit-tasks.json
22
-
23
- 2. node dispatch/prepare-dispatch.mjs --run-id <run_id>
24
- reads tasks + schemas → writes dispatch-plan.json
25
- (deterministic, 0 LLM tokens)
26
-
27
- 3. [Orchestrator reads dispatch-plan.json — small JSON array]
28
- [Orchestrator fires N Agent calls in ONE message, verbatim prompts from plan]
29
-
30
- Each subagent (×N, parallel):
31
- - reads source files with Read tool
32
- - performs lens audit
33
- - writes result to task-results/<sanitized_task_id>.json using Write tool
34
- - runs: node dispatch/validate-result.mjs <run_id> <task_id>
35
- - if non-zero: fixes errors, rewrites, re-validates (max 3 attempts)
36
- - if still failing after 3: writes empty-but-valid fallback result
37
-
38
- 4. node dispatch/merge-results.mjs --run-id <run_id>
39
- → validates all task-results/*.json
40
- → writes audit-results.json (passing results only)
41
- → writes failed-tasks.json (task_ids that failed validation)
42
- (deterministic, 0 LLM tokens)
43
-
44
- 5. node dist/index.js worker-run --run-id <run_id>
45
- → ingests audit-results.json → coverage matrix → marks tasks complete
46
- (deterministic, 0 LLM tokens)
47
-
48
- 6. Repeat from step 1 until no pending tasks remain.
29
+ 4. audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
30
+ -> validates that every assigned task has exactly one valid result
31
+ -> rejects missing, unknown, duplicate, malformed, or out-of-scope results
32
+ -> writes audit-results.json as the existing AuditResult[] shape
33
+ -> ingests accepted results through the normal result_ingestion_executor
34
+ -> prints one compact JSON envelope
35
+
36
+ 5. Repeat `audit-code` until complete.
49
37
  ```
50
38
 
51
- Orchestrator token cost per cycle: **~50 tokens × N tasks** (read dispatch-plan + invoke Agent calls). Independent of source file sizes.
39
+ The parent orchestrator should not read prompt files, pending tasks, completed
40
+ task result payloads, or source files during the packet dispatch path unless a
41
+ backend command fails and the error requires diagnosis.
52
42
 
53
- ---
43
+ ## Planning Artifacts
54
44
 
55
- ## Files to create
45
+ Planning writes two packet-specific artifacts alongside the existing task and
46
+ coverage artifacts:
56
47
 
57
- ```
58
- dispatch/
59
- lens-definitions.json lens descriptions embedded in every subagent prompt
60
- validate.mjs — shared validation logic (imported by other scripts)
61
- validate-result.mjs — CLI: validate one task-results file
62
- prepare-dispatch.mjs — reads pending tasks → writes dispatch-plan.json
63
- merge-results.mjs — merges validated task results → audit-results.json
64
- ```
48
+ - `review_packets.json`: deterministic packets derived from current
49
+ `AuditTask` records.
50
+ - `audit_plan_metrics.json`: task count, packet count, repeated file/line
51
+ estimates, largest packet, and estimated agent-count reduction.
65
52
 
66
- ## Files to modify
53
+ Packets preserve task identity. They change the worker-facing unit of work, not
54
+ the backend-owned validation or ingestion contract.
67
55
 
68
- ```
69
- package.json — add ajv devDependency; add dispatch:* npm scripts
70
- ```
56
+ ## Packet Construction
71
57
 
72
- > **Do NOT add `dispatch/` to the `files` array in package.json.** These scripts are
73
- > local dev tooling and must not be published to npm.
58
+ Packet planning is deterministic and compatibility-preserving:
74
59
 
75
- ---
60
+ - tasks sharing the same file set and scope are grouped across lenses
61
+ - tiny homogeneous test files are batched before dispatch
62
+ - graph edges from imports, calls, and references can merge related task groups
63
+ - heuristic container edges do not force packet expansion
64
+ - packet chunking respects task-count and line-budget limits
65
+ - high-priority packets sort ahead of lower-priority packets
76
66
 
77
- ## Step 1 — Add `ajv` dependency
78
-
79
- In `package.json`, add to `devDependencies`:
67
+ Generated packets include:
80
68
 
81
69
  ```json
82
- "ajv": "^8.17.1"
70
+ {
71
+ "packet_id": "src-auth:security-correctness:packet-1-...",
72
+ "task_ids": ["src-auth:security", "src-auth:correctness"],
73
+ "lenses": ["security", "correctness"],
74
+ "file_paths": ["src/api/auth.ts", "src/lib/session.ts"],
75
+ "total_lines": 70,
76
+ "estimated_tokens": 1180
77
+ }
83
78
  ```
84
79
 
85
- Then run `npm install`.
80
+ ## `prepare-dispatch` Output
86
81
 
87
- AJV v8 is required for JSON Schema draft 2020-12 support (which the existing schemas use).
88
- No other new dependencies are needed.
82
+ Command:
89
83
 
90
- Also add npm scripts (optional convenience aliases):
91
-
92
- ```json
93
- "dispatch:prepare": "node dispatch/prepare-dispatch.mjs",
94
- "dispatch:merge": "node dispatch/merge-results.mjs",
95
- "dispatch:validate": "node dispatch/validate-result.mjs"
84
+ ```bash
85
+ audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
96
86
  ```
97
87
 
98
- ---
88
+ Artifacts:
99
89
 
100
- ## Step 2 — Create `dispatch/lens-definitions.json`
90
+ - `<artifacts_dir>/runs/<run_id>/dispatch-plan.json`
91
+ - `<artifacts_dir>/runs/<run_id>/task-results/<packet_id>.prompt.md`
92
+ - `<artifacts_dir>/runs/<run_id>/dispatch-warnings.json`, only when warnings
93
+ exist
101
94
 
102
- This file is embedded verbatim in every subagent prompt. It must be accurate enough that
103
- a subagent can scope its review correctly without reading any other files.
95
+ The command prints a compact JSON envelope:
104
96
 
105
97
  ```json
106
98
  {
107
- "correctness": {
108
- "description": "Logic errors, incorrect algorithm implementations, off-by-one bugs, type mismatches, wrong return values, incorrect state transitions, missing null/undefined guards, misuse of APIs. Focus on code that does the wrong thing.",
109
- "do_not_report": "Style issues, naming problems, missing tests, or findings that belong to other lenses."
99
+ "run_id": "run-1",
100
+ "dispatch_plan_path": ".audit-artifacts/runs/run-1/dispatch-plan.json",
101
+ "packet_count": 4,
102
+ "task_count": 18,
103
+ "largest_packet": {
104
+ "packet_id": "src-auth:security-correctness:packet-1-...",
105
+ "total_lines": 1320,
106
+ "estimated_tokens": 6180
110
107
  },
111
- "maintainability": {
112
- "description": "Code that is hard to change safely: excessive function length, deep nesting, tight coupling between unrelated modules, poor naming, magic constants, duplicated logic, inconsistent abstractions, unclear public APIs.",
113
- "do_not_report": "Correctness bugs, test gaps, or operational concerns."
114
- },
115
- "tests": {
116
- "description": "Test coverage gaps for important paths, tests that assert incorrect behavior (pinning bugs as expected), fragile or non-deterministic tests, missing negative/edge-case tests, tests that silently pass on stale builds (e.g. importing compiled dist/ rather than source).",
117
- "do_not_report": "Source code bugs — report only issues with the tests themselves."
118
- },
119
- "security": {
120
- "description": "Injection vulnerabilities (SQL, shell, path traversal), authentication/authorization flaws, secret exposure, insecure deserialization, privilege escalation, unsafe use of eval or child processes with user input.",
121
- "do_not_report": "Performance or correctness issues that are not security-relevant."
122
- },
123
- "reliability": {
124
- "description": "Failure modes without recovery, missing timeouts, unhandled promise rejections, race conditions, resource leaks (file handles, sockets, timers), incorrect retry logic, cascading failure risks.",
125
- "do_not_report": "Correctness bugs that do not affect reliability under failure conditions."
126
- },
127
- "performance": {
128
- "description": "Algorithmic inefficiencies (O(n²) where O(n) is possible), unnecessary re-computation, missing caching, synchronous blocking in hot paths, excessive memory allocation.",
129
- "do_not_report": "Correctness bugs unrelated to performance."
130
- },
131
- "data_integrity": {
132
- "description": "Missing input validation at trust boundaries, schema violations, inconsistent field naming across related schemas, data loss scenarios, missing required fields, enum values that are present in some schemas but not others.",
133
- "do_not_report": "UI or presentation issues; operational or deployment concerns."
134
- },
135
- "operability": {
136
- "description": "Missing or low-quality log output, error messages that don't help operators diagnose problems, missing progress indicators for long operations, no elapsed-time reporting, lack of dry-run or preview modes for destructive operations.",
137
- "do_not_report": "Correctness bugs or deployment configuration."
138
- },
139
- "config_deployment": {
140
- "description": "CI/CD pipeline correctness (wrong triggers, missing branch filters, floating version pins), deployment safety (no gate before publish, missing rollback), insecure secret handling in configs, mutable action tags that should be pinned to commit SHAs.",
141
- "do_not_report": "Runtime code issues; findings that belong to other lenses."
142
- }
108
+ "warning_count": 0,
109
+ "dispatch_warnings_path": null
143
110
  }
144
111
  ```
145
112
 
146
- ---
147
-
148
- ## Step 3 — Create `dispatch/validate.mjs`
149
-
150
- Shared validation module. Exports a single function `validateResult(resultObj, fileLineCounts)`.
151
-
152
- ### Interface
153
-
154
- ```js
155
- /**
156
- * @param {object} resultObj — parsed JSON from a task-results file
157
- * @param {Record<string, number>} fileLineCounts — from the task's file_line_counts
158
- * @returns {{ valid: boolean, errors: string[] }}
159
- */
160
- export function validateResult(resultObj, fileLineCounts) { ... }
161
- ```
162
-
163
- ### Logic
164
-
165
- ```
166
- 1. AJV validate resultObj against schemas/audit_result.schema.json
167
- - Load finding.schema.json first (addSchema) so $ref resolves
168
- - Use Ajv({ strict: false }) to avoid complaints about unknown keywords like $schema
169
- - On failure: return { valid: false, errors: ajv.errors.map(e => formatAjvError(e)) }
170
-
171
- 2. Extra check — line range constraint:
172
- For each finding in resultObj.findings:
173
- For each entry in finding.affected_files:
174
- if entry.line_end is defined:
175
- look up total_lines from resultObj.file_coverage where path === entry.path
176
- if total_lines is undefined: push error "affected_files path '${entry.path}' not in file_coverage"
177
- else if entry.line_end > total_lines: push error
178
- "finding '${finding.id}': line_end ${entry.line_end} exceeds total_lines ${total_lines} for ${entry.path}"
179
-
180
- 3. Extra check — lens consistency:
181
- For each finding in resultObj.findings:
182
- if finding.lens !== resultObj.lens:
183
- push error "finding '${finding.id}': lens '${finding.lens}' does not match task lens '${resultObj.lens}'"
184
-
185
- 4. Extra check — affected_files paths in scope:
186
- Collect allowed paths from resultObj.file_coverage[].path
187
- For each finding's affected_files entry:
188
- if entry.path not in allowed paths:
189
- push error "finding '${finding.id}': affected path '${entry.path}' not in task file_coverage"
190
-
191
- 5. If any extra-check errors: return { valid: false, errors }
192
-
193
- 6. Return { valid: true, errors: [] }
194
- ```
195
-
196
- ### Schema loading
197
-
198
- Schemas are resolved relative to the project root. Use this logic to find the project root:
199
-
200
- ```js
201
- // dispatch/validate.mjs
202
- import { createRequire } from "node:module";
203
- import { dirname, resolve, join } from "node:path";
204
- import { fileURLToPath } from "node:url";
205
- import { readFileSync } from "node:fs";
206
- import Ajv from "ajv";
207
-
208
- const __filename = fileURLToPath(import.meta.url);
209
- const __dirname = dirname(__filename);
210
- // dispatch/ is one level below project root
211
- const PROJECT_ROOT = resolve(__dirname, "..");
212
- const SCHEMAS_DIR = join(PROJECT_ROOT, "schemas");
213
-
214
- function loadSchema(name) {
215
- return JSON.parse(readFileSync(join(SCHEMAS_DIR, name), "utf8"));
216
- }
113
+ `dispatch-plan.json` entries are intentionally small:
217
114
 
218
- let _ajv = null;
219
- function getAjv() {
220
- if (_ajv) return _ajv;
221
- _ajv = new Ajv({ strict: false, allErrors: true });
222
- _ajv.addSchema(loadSchema("finding.schema.json"));
223
- return _ajv;
115
+ ```json
116
+ {
117
+ "packet_id": "src-auth:security-correctness:packet-1-...",
118
+ "task_id": "src-auth:security-correctness:packet-1-...",
119
+ "task_ids": ["src-auth:security", "src-auth:correctness"],
120
+ "description": "Audit 2 file(s), 2 task(s), 2 lens(es) (~70 lines)",
121
+ "output_paths": {
122
+ "src-auth:security": ".audit-artifacts/runs/run-1/task-results/src-auth_security.json",
123
+ "src-auth:correctness": ".audit-artifacts/runs/run-1/task-results/src-auth_correctness.json"
124
+ },
125
+ "prompt_path": ".audit-artifacts/runs/run-1/task-results/src-auth_security-correctness_packet-1.prompt.md",
126
+ "lenses": ["security", "correctness"],
127
+ "file_paths": ["src/api/auth.ts", "src/lib/session.ts"],
128
+ "total_lines": 70,
129
+ "estimated_tokens": 1180
224
130
  }
225
131
  ```
226
132
 
227
- ---
133
+ The orchestrator should launch one subagent per entry with the entry
134
+ description and a prompt that tells the subagent to read and follow
135
+ `entry.prompt_path`.
228
136
 
229
- ## Step 4 — Create `dispatch/validate-result.mjs`
137
+ ## Packet Prompt Contract
230
138
 
231
- CLI wrapper for use by subagents after writing their result file.
139
+ Each packet prompt tells the worker to:
232
140
 
233
- ### Usage
141
+ - review the packet once
142
+ - read only the listed repo-relative files
143
+ - produce one JSON object per listed task
144
+ - write each object to that task's exact `output_path`
145
+ - preserve the existing `AuditResult` fields:
146
+ `task_id`, `unit_id`, `pass_id`, `lens`, `file_coverage`, `findings`
147
+ - keep `file_coverage[]` as `{ path, total_lines }`
148
+ - keep every finding lens equal to the task lens
149
+ - avoid source edits, remediation, extra task results, and unrelated audits
150
+ - run the generated validation command for every task result
151
+ - reply exactly `valid: <packet_id>, findings=<total finding count>` after all
152
+ validation commands pass
234
153
 
235
- ```
236
- node dispatch/validate-result.mjs <run_id> <task_id>
237
- ```
154
+ This keeps packet review efficient while leaving merge and ingestion
155
+ mechanically deterministic.
238
156
 
239
- - `run_id`: e.g. `20260424T152454170Z_audit_tasks_completed_001`
240
- - `task_id`: e.g. `src-adapters:correctness` (unsanitized — the script sanitizes internally)
157
+ ## Validation
241
158
 
242
- ### Logic
159
+ Per-task validation is exposed through:
243
160
 
161
+ ```bash
162
+ audit-code validate-result --run-id <run_id> --task-id <task_id> --artifacts-dir <artifacts_dir>
244
163
  ```
245
- 1. Parse argv: run_id = process.argv[2], task_id = process.argv[3]
246
- If either missing: print usage and exit 1
247
164
 
248
- 2. Locate artifacts_dir:
249
- Read .audit-artifacts/session-config.json to find artifacts_dir.
250
- If not present, default: join(PROJECT_ROOT, ".audit-artifacts")
165
+ The validator checks the result against the assigned task set and enforces the
166
+ mechanical constraints that matter for ingestion:
251
167
 
252
- 3. Derive file path:
253
- sanitized = task_id.replace(/[^a-zA-Z0-9_-]/g, "_")
254
- resultPath = join(artifactsDir, "runs", run_id, "task-results", sanitized + ".json")
168
+ - required `AuditResult` and finding fields
169
+ - finding lens matches the task lens
170
+ - cited and affected paths are in assigned coverage
171
+ - line spans do not exceed known `total_lines`
172
+ - result fields conform to the shipped schemas
255
173
 
256
- 4. Read and parse resultPath. If file not found or invalid JSON:
257
- print error, exit 1
174
+ Workers should retry invalid JSON up to the bounded retry count in the prompt.
258
175
 
259
- 5. Load the task from pending-audit-tasks.json to get file_line_counts:
260
- tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
261
- tasks = JSON.parse(readFileSync(tasksPath))
262
- task = tasks.find(t => t.task_id === task_id)
263
- fileLineCounts = task?.file_line_counts ?? {}
176
+ ## `merge-and-ingest` Output
264
177
 
265
- 6. Call validateResult(resultObj, fileLineCounts) from validate.mjs
178
+ Command:
266
179
 
267
- 7. If valid: console.log("✓ valid:", task_id); exit 0
268
- If invalid:
269
- console.error("✗ invalid:", task_id);
270
- console.error(JSON.stringify(errors, null, 2));
271
- exit 1
180
+ ```bash
181
+ audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
272
182
  ```
273
183
 
274
- ---
275
-
276
- ## Step 5 — Create `dispatch/prepare-dispatch.mjs`
184
+ Merge behavior:
277
185
 
278
- Core script. Reads pending tasks and produces a ready-to-use dispatch plan.
186
+ - validates every JSON file under `task-results/`
187
+ - rejects duplicate task results
188
+ - rejects unknown task IDs
189
+ - rejects missing assigned task results
190
+ - writes `failed-tasks.json` and exits non-zero when any assigned result is
191
+ missing or invalid
192
+ - writes `audit-results.json` only from passing results
193
+ - invokes the normal result ingestion path only after the assigned set is clean
279
194
 
280
- ### Usage
281
-
282
- ```
283
- node dispatch/prepare-dispatch.mjs --run-id <run_id>
284
- ```
285
-
286
- ### Logic
287
-
288
- ```
289
- 1. Parse --run-id <run_id> from argv. Error if missing.
290
-
291
- 2. Resolve paths:
292
- artifactsDir = join(PROJECT_ROOT, ".audit-artifacts")
293
- runDir = join(artifactsDir, "runs", run_id)
294
- tasksPath = join(runDir, "pending-audit-tasks.json")
295
- dispatchPlanPath = join(runDir, "dispatch-plan.json")
296
-
297
- 3. Read pending-audit-tasks.json — array of AuditTask objects.
298
- If file not found: error and exit 1.
299
-
300
- 4. Load shared content (read once, reuse for all tasks):
301
- lensDefinitions = read dispatch/lens-definitions.json
302
- auditResultSchema = read schemas/audit_result.schema.json
303
- findingSchema = read schemas/finding.schema.json
304
-
305
- 5. For each task in tasks:
306
- a. sanitizedId = task.task_id.replace(/[^a-zA-Z0-9_-]/g, "_")
307
- b. outputPath = join(runDir, "task-results", sanitizedId + ".json")
308
- c. lensDef = lensDefinitions[task.lens]
309
- d. totalFileLines = Object.values(task.file_line_counts).reduce((a, b) => a + b, 0)
310
- e. description = `Audit ${task.unit_id} (${task.file_paths.length} file(s), ~${totalFileLines} lines) — ${task.lens} lens`
311
- f. prompt = buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, run_id, artifactsDir)
312
- g. Append { task_id, description, output_path: outputPath, prompt } to plan array
313
-
314
- 6. Ensure task-results/ directory exists:
315
- mkdirSync(join(runDir, "task-results"), { recursive: true })
316
-
317
- 7. Write plan array to dispatchPlanPath as formatted JSON.
318
-
319
- 8. Print: "Wrote dispatch-plan.json — N tasks ready for dispatch"
320
- Print: "Largest task: <task_id> (~N lines)"
321
- Print: ""
322
- Print: "--- ORCHESTRATOR INSTRUCTIONS ---"
323
- Print: "Read dispatch-plan.json. For each entry, fire one Agent call with:"
324
- Print: " description: <entry.description>"
325
- Print: " prompt: <entry.prompt>"
326
- Print: "Fire all N calls in a single message for parallel execution."
327
- Print: "When all complete, run: node dispatch/merge-results.mjs --run-id <run_id>"
328
- ```
329
-
330
- ### `buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, runId, artifactsDir)`
331
-
332
- Returns a string. Template (use template literals):
333
-
334
- ```
335
- You are a code auditor. Perform a bounded audit of the files listed below under the specified lens.
336
-
337
- ## Task metadata
338
- ${JSON.stringify(task, null, 2)}
339
-
340
- ## Files to read
341
- Read each path in task.file_paths using your Read tool. The repo root is the current working directory — paths are repo-relative (e.g. "src/foo.ts").
342
-
343
- file_line_counts gives the expected total line count for each file. Use those exact values for file_coverage[].total_lines in your result.
344
-
345
- ## Lens: ${task.lens}
346
- ${lensDef.description}
347
-
348
- Do NOT report: ${lensDef.do_not_report}
349
-
350
- ## Output format
351
- Write your result as a single JSON **object** (not an array) to this exact path:
352
- ${outputPath}
353
-
354
- The result must conform to the following schema:
355
-
356
- ### audit_result.schema.json
357
- ${JSON.stringify(auditResultSchema, null, 2)}
358
-
359
- ### finding.schema.json
360
- ${JSON.stringify(findingSchema, null, 2)}
361
-
362
- ## Hard constraints (violations will fail validation)
363
- 1. NEVER set line_end higher than the file's actual line count.
364
- Use file_line_counts as your reference. If in doubt, leave line_end omitted.
365
- 2. Every finding MUST have ALL required fields:
366
- id, title, category, severity, confidence, lens, summary, affected_files, evidence
367
- 3. lens on every finding must be exactly "${task.lens}"
368
- 4. No fields outside the schema. Forbidden: "recommendation", "tags", "description" (use "summary").
369
- 5. evidence[] must contain at least one specific file:line reference.
370
- Format: "path/to/file.ts:42 - brief description of what you see there"
371
- 6. affected_files[] entries are OBJECTS with a "path" key — NOT plain strings.
372
- Example: {"path": "src/foo.ts", "line_start": 10, "line_end": 20, "symbol": "myFunc"}
373
- 7. Only reference file paths that appear in this task's file_paths.
374
- 8. findings: [] is correct when you genuinely find nothing. Do not invent findings.
375
-
376
- ## Validation step (required)
377
- After writing your result, run:
378
- node dispatch/validate-result.mjs ${runId} ${task.task_id}
379
-
380
- - If it exits 0: you are done. Stop.
381
- - If it exits non-zero: read the error output, fix the JSON, rewrite the file, run again.
382
- - Repeat up to 3 times.
383
-
384
- If you cannot produce a valid result after 3 attempts, write this fallback (substituting real values):
385
- ${JSON.stringify({
386
- task_id: task.task_id,
387
- unit_id: task.unit_id,
388
- pass_id: task.pass_id,
389
- lens: task.lens,
390
- file_coverage: task.file_paths.map(p => ({ path: p, total_lines: task.file_line_counts[p] })),
391
- findings: [],
392
- notes: ["Validation failed after 3 attempts — empty result written as fallback."]
393
- }, null, 2)}
394
-
395
- Then validate the fallback passes before finishing.
396
- ```
397
-
398
- Note: the fallback JSON in the prompt is pre-computed in `buildPrompt` using the task
399
- data, not generated by the subagent.
400
-
401
- ---
402
-
403
- ## Step 6 — Create `dispatch/merge-results.mjs`
404
-
405
- ### Usage
406
-
407
- ```
408
- node dispatch/merge-results.mjs --run-id <run_id>
409
- ```
410
-
411
- ### Logic
412
-
413
- ```
414
- 1. Parse --run-id <run_id> from argv.
415
-
416
- 2. Resolve paths:
417
- taskResultsDir = join(artifactsDir, "runs", run_id, "task-results")
418
- auditResultsPath = join(artifactsDir, "runs", run_id, "audit-results.json")
419
- failedTasksPath = join(artifactsDir, "runs", run_id, "failed-tasks.json")
420
- tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
421
-
422
- 3. Read pending-audit-tasks.json to build fileLineCounts map:
423
- lineCounts = {}
424
- for each task: lineCounts[task.task_id] = task.file_line_counts
425
-
426
- 4. Read all *.json files from task-results/:
427
- files = readdirSync(taskResultsDir).filter(f => f.endsWith(".json"))
428
-
429
- 5. For each file:
430
- a. Parse JSON
431
- b. Call validateResult(resultObj, lineCounts[resultObj.task_id] ?? {})
432
- c. If valid: push to passing[]
433
- d. If invalid: push { task_id: resultObj?.task_id ?? filename, errors } to failing[]
434
-
435
- 6. Write passing array to audit-results.json (as AuditResult[] — array of passing objects)
436
-
437
- 7. If failing.length > 0:
438
- Write failing array to failed-tasks.json
439
- Print warning: "${failing.length} task(s) failed validation and were excluded:"
440
- For each: print " ✗ ${f.task_id}: ${f.errors[0]}" (first error only for brevity)
441
-
442
- 8. Print: "✓ ${passing.length}/${total} tasks valid → ${auditResultsPath}"
443
- If failing.length > 0: print " Re-run those tasks in the next cycle."
444
-
445
- 9. Exit 0 regardless (partial ingestion is safe — failed tasks remain pending for requeue).
446
- ```
447
-
448
- ---
449
-
450
- ## Step 7 — Update `session-config.json` (optional but recommended)
451
-
452
- Add `dispatch_provider` field to `.audit-artifacts/session-config.json`:
195
+ On success the command prints one compact JSON envelope:
453
196
 
454
197
  ```json
455
198
  {
456
- "provider": "local-subprocess",
457
- "dispatch_provider": "claude-desktop",
458
- "agent_task_batch_size": 10
199
+ "run_id": "run-1",
200
+ "status": "completed",
201
+ "accepted_count": 18,
202
+ "rejected_count": 0,
203
+ "finding_count": 3,
204
+ "audit_results_path": ".audit-artifacts/runs/run-1/audit-results.json",
205
+ "selected_executor": "result_ingestion_executor",
206
+ "progress_made": true,
207
+ "progress_summary": "Ingested 18 audit result entries and refreshed dependent artifacts.",
208
+ "next_likely_step": "runtime_validation"
459
209
  }
460
210
  ```
461
211
 
462
- This is metadata only for now; no code reads `dispatch_provider` yet. It documents intent and provides the hook for future multi-provider support.
463
-
464
- ---
465
-
466
- ## Testing procedure
212
+ If the command exits non-zero, the orchestrator should stop and report the exact
213
+ error instead of manually editing task results or audit state.
467
214
 
468
- ### Unit test: `validate-result.mjs`
215
+ ## Selective Deepening
469
216
 
470
- 1. Write a minimal valid result to a temp file:
471
- ```json
472
- {
473
- "task_id": "test:correctness",
474
- "unit_id": "test",
475
- "pass_id": "pass:correctness",
476
- "lens": "correctness",
477
- "file_coverage": [{"path": "src/foo.ts", "total_lines": 100}],
478
- "findings": []
479
- }
480
- ```
481
- 2. Run: `node dispatch/validate-result.mjs <some_run_id> test:correctness` — expect exit 0
482
- 3. Mutate the file: remove `lens` field — expect exit 1 with error mentioning `lens`
483
- 4. Mutate: add `line_end: 200` on an affected_file with total_lines 100 — expect exit 1
217
+ Result ingestion may add follow-up `AuditTask` records for bounded selective
218
+ deepening. Triggers include:
484
219
 
485
- ### Integration test: `prepare-dispatch.mjs`
220
+ - high-severity findings
221
+ - low-confidence or ambiguous findings
222
+ - conflicting conclusions across related results
223
+ - high-risk no-finding sampling unless explicitly marked unnecessary
224
+ - runtime-validation disagreement
486
225
 
487
- 1. Run against the current pending tasks:
488
- ```
489
- node dispatch/prepare-dispatch.mjs --run-id 20260424T152454170Z_audit_tasks_completed_001
490
- ```
491
- 2. Inspect `dispatch-plan.json`: each entry should have `task_id`, `description`, `output_path`, `prompt`
492
- 3. Verify `prompt` contains the task JSON, lens definition, both schemas, and the output path
226
+ When follow-up tasks are added, the backend refreshes `review_packets.json` and
227
+ `audit_plan_metrics.json`. The next dispatch cycle handles those tasks through
228
+ the same packet contract.
493
229
 
494
- ### Integration test: `merge-results.mjs`
230
+ ## Compatibility Notes
495
231
 
496
- 1. Write 2 valid and 1 invalid result to `task-results/`
497
- 2. Run: `node dispatch/merge-results.mjs --run-id <id>`
498
- 3. Verify `audit-results.json` contains exactly the 2 valid results
499
- 4. Verify `failed-tasks.json` contains the 1 invalid task
500
- 5. Verify exit code is 0
232
+ - `AuditTask` remains the planning and coverage identity.
233
+ - `AuditResult[]` remains the ingestion shape.
234
+ - The older `.audit-artifacts/dispatch/current-*` files still exist for
235
+ repo-local backend fallback and single-task handoff flows.
236
+ - Backend provider adapters remain compatibility bridges. The canonical
237
+ `/audit-code` flow expects the active conversation orchestrator to dispatch
238
+ packet subagents when the host supports them.
239
+ - The `dispatch/` directory is packaged because `lens-definitions.json` and
240
+ validation support are part of the installed packet workflow.
501
241
 
502
- ---
242
+ ## Verification
503
243
 
504
- ## Orchestrator usage reference
244
+ Run the normal project gate:
505
245
 
506
- When `prepare-dispatch.mjs` finishes, it prints the instructions inline. For reference:
507
-
508
- ```
509
- 1. Run: node dispatch/prepare-dispatch.mjs --run-id <run_id>
510
- 2. Read: .audit-artifacts/runs/<run_id>/dispatch-plan.json
511
- 3. In ONE message, fire one Agent call per entry:
512
- Agent({ description: entry.description, prompt: entry.prompt })
513
- Fire all calls simultaneously — they run in parallel.
514
- 4. Wait for all subagents to complete.
515
- 5. Run: node dispatch/merge-results.mjs --run-id <run_id>
516
- 6. Run: node dist/index.js worker-run --run-id <run_id>
517
- 7. Run: node dist/index.js audit-code (to get next batch)
518
- 8. Repeat.
246
+ ```bash
247
+ npm test
519
248
  ```
520
249
 
521
- **Important:** The orchestrator should NOT read the pending-audit-tasks.json, NOT read
522
- any source files, NOT compose any prompts. Everything is pre-built. Just read
523
- `dispatch-plan.json` and fire the calls verbatim.
524
-
525
- ---
526
-
527
- ## Notes and caveats
528
-
529
- ### Large files (2000+ lines)
530
-
531
- Tasks with very large files (e.g. `audit-code-wrapper-lib.mjs` at 2215 lines) will still
532
- hit quota limits for subagents. The `prepare-dispatch.mjs` script should print a warning
533
- for tasks exceeding a threshold (e.g. 1500 total lines). These tasks may need to be split
534
- at the task-builder level — that is a separate concern and not addressed here.
535
-
536
- ### `audit_results_path` vs per-task files
537
-
538
- The existing `renderWorkerPrompt.ts` tells subagents to write to a shared
539
- `audit-results.json`. The new `prepare-dispatch.mjs`-generated prompts tell subagents to
540
- write to per-task `task-results/<task_id>.json` files. These are two separate dispatch
541
- paths — the old path (via `renderWorkerPrompt`) is still used for non-`claude-desktop`
542
- providers and is not modified by this plan.
543
-
544
- ### Future: provider abstraction
545
-
546
- `prepare-dispatch.mjs` output (`dispatch-plan.json`) is provider-agnostic. A future
547
- `anthropic-direct` provider could read the same `dispatch-plan.json` and call
548
- `messages.create()` for each entry via SDK, with no changes to `prepare-dispatch.mjs`.
549
-
550
- ### ajv and published package
551
-
552
- `ajv` is added as a devDependency. The `dispatch/` scripts are NOT in the `files` array
553
- and are not published. End users of the npm package are unaffected.
250
+ Focused packet coverage lives in `tests/review-packets.test.mjs` and
251
+ `tests/audit-code-wrapper.test.mjs`.
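
Reviewer note: the new "Validation" and "`merge-and-ingest` Output" sections describe the acceptance rules only in prose. The sketch below is one way to read those rules, not the package's implementation, which this hunk does not show. The function name `checkAssignedResults` and the `{ clean, passing, failed }` return shape are hypothetical. The field names (`task_id`, `lens`, `file_line_counts`, `findings`, `affected_files`, `line_end`) are taken from the old and new document text.

```javascript
// Hypothetical sketch of the merge rules described in the new document:
// every assigned task needs exactly one valid result; unknown, duplicate,
// missing, lens-mismatched, or out-of-scope results are rejected.
// This is an illustration only, not code from auditor-lambda.
function checkAssignedResults(tasks, results) {
  const byId = new Map(tasks.map((task) => [task.task_id, task]));
  const seen = new Map(); // task_id -> number of result files claiming it
  const passing = [];
  const failed = [];

  for (const result of results) {
    const task = byId.get(result.task_id);
    if (!task) {
      failed.push({ task_id: String(result.task_id), errors: ["unknown task_id"] });
      continue;
    }
    seen.set(task.task_id, (seen.get(task.task_id) ?? 0) + 1);

    const errors = [];
    for (const finding of result.findings ?? []) {
      if (finding.lens !== task.lens) {
        errors.push(`finding lens '${finding.lens}' does not match task lens '${task.lens}'`);
      }
      for (const file of finding.affected_files ?? []) {
        const totalLines = task.file_line_counts[file.path];
        if (typeof totalLines !== "number") {
          errors.push(`path '${file.path}' is not in assigned coverage`);
        } else if (file.line_end !== undefined && file.line_end > totalLines) {
          errors.push(`line_end ${file.line_end} exceeds total_lines ${totalLines} for ${file.path}`);
        }
      }
    }
    if (errors.length > 0) failed.push({ task_id: task.task_id, errors });
    else passing.push(result);
  }

  for (const [taskId, count] of seen) {
    if (count > 1) failed.push({ task_id: taskId, errors: [`duplicate results (${count})`] });
  }
  for (const task of tasks) {
    if (!seen.has(task.task_id)) failed.push({ task_id: task.task_id, errors: ["missing result"] });
  }

  // Only results whose task_id was claimed exactly once count as passing.
  const uniquePassing = passing.filter((result) => seen.get(result.task_id) === 1);
  return { clean: failed.length === 0, passing: uniquePassing, failed };
}

// Example call with made-up task data.
const exampleTasks = [
  { task_id: "src-auth:security", lens: "security", file_line_counts: { "src/api/auth.ts": 40 } },
];
const exampleOutcome = checkAssignedResults(exampleTasks, [
  { task_id: "src-auth:security", findings: [] },
]);
console.log(exampleOutcome.clean, exampleOutcome.failed.length);
```

Under this reading, a caller would hand results to ingestion only when `clean` is true and would otherwise write the `failed` list and exit non-zero, matching the "only after the assigned set is clean" bullet. Schema-level checks against the shipped schemas are left out of the sketch.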