auditor-lambda 0.3.2 → 0.3.4
- package/README.md +6 -1
- package/audit-code-wrapper-lib.mjs +78 -5
- package/dist/cli.js +205 -67
- package/dist/extractors/graph.d.ts +5 -1
- package/dist/extractors/graph.js +223 -3
- package/dist/extractors/pathPatterns.d.ts +3 -2
- package/dist/extractors/pathPatterns.js +97 -24
- package/dist/io/artifacts.d.ts +5 -0
- package/dist/io/artifacts.js +2 -0
- package/dist/io/json.js +3 -3
- package/dist/io/runArtifacts.js +4 -0
- package/dist/mcp/server.js +24 -11
- package/dist/orchestrator/advance.js +1 -1
- package/dist/orchestrator/dependencyMap.js +18 -0
- package/dist/orchestrator/internalExecutors.d.ts +1 -1
- package/dist/orchestrator/internalExecutors.js +120 -33
- package/dist/orchestrator/reviewPackets.d.ts +14 -0
- package/dist/orchestrator/reviewPackets.js +300 -0
- package/dist/orchestrator/selectiveDeepening.d.ts +14 -0
- package/dist/orchestrator/selectiveDeepening.js +392 -0
- package/dist/orchestrator/state.js +6 -1
- package/dist/orchestrator/taskBuilder.d.ts +16 -0
- package/dist/orchestrator/taskBuilder.js +68 -11
- package/dist/orchestrator.js +53 -2
- package/dist/prompts/renderWorkerPrompt.js +11 -4
- package/dist/providers/index.js +1 -1
- package/dist/supervisor/sessionConfig.js +1 -1
- package/dist/types/graph.d.ts +1 -0
- package/dist/types/reviewPlanning.d.ts +41 -0
- package/dist/types/reviewPlanning.js +1 -0
- package/dist/validation/artifacts.js +13 -0
- package/dist/validation/sessionConfig.js +1 -1
- package/docs/agent-integrations.md +17 -8
- package/docs/bootstrap-install.md +3 -0
- package/docs/dispatch-implementation-plan.md +179 -481
- package/docs/next-steps.md +13 -8
- package/docs/product-direction.md +5 -3
- package/docs/run-flow.md +23 -30
- package/docs/session-config.md +10 -3
- package/docs/supervisor.md +12 -4
- package/docs/workflow-refactor-brief.md +85 -147
- package/package.json +1 -1
- package/schemas/audit_results.schema.json +10 -0
- package/schemas/finding.schema.json +1 -15
- package/schemas/graph_bundle.schema.json +16 -0
- package/skills/audit-code/SKILL.md +12 -3
- package/skills/audit-code/audit-code.prompt.md +87 -57
@@ -1,553 +1,251 @@
-# Dispatch Automation
+# Dispatch Automation Reference

-
+This document describes the implemented review-dispatch path for `/audit-code`.
+The original dispatch plan was one agent per audit task. The current path keeps
+the existing `AuditTask` and `AuditResult` contracts, but groups related tasks
+into review packets so a worker can read a coherent file set once and produce
+one validated result file for each assigned task.

-
-subagent prompts, handle schema normalization, and merge results — costing hundreds of
-tokens per task and producing frequent schema violations. This plan replaces that with a
-deterministic scripted dispatch layer so the orchestrator's only job is to fire Agent
-tool calls with pre-built prompts, then run a merge script.
+## Current Workflow

-
-
-
+```text
+1. audit-code
+-> advances deterministic state until semantic review is needed
+-> emits a blocked handoff with active_review_run.run_id

-
+2. audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
+-> reads pending-audit-tasks.json and review planning artifacts
+-> writes dispatch-plan.json
+-> writes one packet prompt per dispatch-plan entry
+-> prints one compact JSON envelope

-
+3. Conversation orchestrator reads only dispatch-plan.json
+-> launches one subagent per packet
+-> each subagent reads its packet prompt and assigned files
+-> each subagent writes one task-results/<task_id>.json per underlying task
+-> each subagent runs the validation commands in the prompt
+-> each subagent replies: valid: <packet_id>, findings=<n>

-
-
-
-
-
-
-
-
-3. [Orchestrator reads dispatch-plan.json — small JSON array]
-[Orchestrator fires N Agent calls in ONE message, verbatim prompts from plan]
-
-Each subagent (×N, parallel):
-- reads source files with Read tool
-- performs lens audit
-- writes result to task-results/<sanitized_task_id>.json using Write tool
-- runs: node dispatch/validate-result.mjs <run_id> <task_id>
-- if non-zero: fixes errors, rewrites, re-validates (max 3 attempts)
-- if still failing after 3: writes empty-but-valid fallback result
-
-4. node dispatch/merge-results.mjs --run-id <run_id>
-→ validates all task-results/*.json
-→ writes audit-results.json (passing results only)
-→ writes failed-tasks.json (task_ids that failed validation)
-(deterministic, 0 LLM tokens)
-
-5. node dist/index.js worker-run --run-id <run_id>
-→ ingests audit-results.json → coverage matrix → marks tasks complete
-(deterministic, 0 LLM tokens)
-
-6. Repeat from step 1 until no pending tasks remain.
+4. audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
+-> validates that every assigned task has exactly one valid result
+-> rejects missing, unknown, duplicate, malformed, or out-of-scope results
+-> writes audit-results.json as the existing AuditResult[] shape
+-> ingests accepted results through the normal result_ingestion_executor
+-> prints one compact JSON envelope
+
+5. Repeat `audit-code` until complete.
 ```

-
+The parent orchestrator should not read prompt files, pending tasks, completed
+task result payloads, or source files during the packet dispatch path unless a
+backend command fails and the error requires diagnosis.

-
+## Planning Artifacts

-
+Planning writes two packet-specific artifacts alongside the existing task and
+coverage artifacts:

-
-
-
-
-validate-result.mjs — CLI: validate one task-results file
-prepare-dispatch.mjs — reads pending tasks → writes dispatch-plan.json
-merge-results.mjs — merges validated task results → audit-results.json
-```
+- `review_packets.json`: deterministic packets derived from current
+`AuditTask` records.
+- `audit_plan_metrics.json`: task count, packet count, repeated file/line
+estimates, largest packet, and estimated agent-count reduction.

-
+Packets preserve task identity. They change the worker-facing unit of work, not
+the backend-owned validation or ingestion contract.

-
-package.json — add ajv devDependency; add dispatch:* npm scripts
-```
+## Packet Construction

-
-> local dev tooling and must not be published to npm.
+Packet planning is deterministic and compatibility-preserving:

-
+- tasks sharing the same file set and scope are grouped across lenses
+- tiny homogeneous test files are batched before dispatch
+- graph edges from imports, calls, and references can merge related task groups
+- heuristic container edges do not force packet expansion
+- packet chunking respects task-count and line-budget limits
+- high-priority packets sort ahead of lower-priority packets

-
-
-In `package.json`, add to `devDependencies`:
+Generated packets include:

 ```json
-
+{
+"packet_id": "src-auth:security-correctness:packet-1-...",
+"task_ids": ["src-auth:security", "src-auth:correctness"],
+"lenses": ["security", "correctness"],
+"file_paths": ["src/api/auth.ts", "src/lib/session.ts"],
+"total_lines": 70,
+"estimated_tokens": 1180
+}
 ```

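Reader's note: the first grouping rule in the Packet Construction hunk above (tasks sharing the same file set are grouped across lenses) can be illustrated with a minimal sketch. The function name `groupTasksByFileSet`, the `src/db/pool.ts` path, and the use of sorted paths as a grouping key are assumptions for illustration. The packaged planner also handles scope, graph edges, batching, and line budgets, none of which are modelled here.

```javascript
// Hypothetical sketch, not the packaged implementation.
// Groups AuditTask-like records whose file sets are identical.
function groupTasksByFileSet(tasks) {
  const groups = new Map();
  for (const task of tasks) {
    // Sort a copy so that path order does not affect the grouping key.
    const sortedPaths = [...task.file_paths].sort();
    const key = sortedPaths.join("\n");
    if (!groups.has(key)) {
      groups.set(key, { task_ids: [], lenses: [], file_paths: sortedPaths });
    }
    const group = groups.get(key);
    group.task_ids.push(task.task_id);
    if (!group.lenses.includes(task.lens)) group.lenses.push(task.lens);
  }
  return [...groups.values()];
}

const packets = groupTasksByFileSet([
  { task_id: "src-auth:security", lens: "security", file_paths: ["src/api/auth.ts", "src/lib/session.ts"] },
  { task_id: "src-auth:correctness", lens: "correctness", file_paths: ["src/lib/session.ts", "src/api/auth.ts"] },
  { task_id: "src-db:security", lens: "security", file_paths: ["src/db/pool.ts"] }, // made-up third task
]);
console.log(packets.length); // prints 2
```

The two `src-auth` tasks land in one group because their file sets match after sorting, which mirrors the `task_ids` and `lenses` arrays in the example packet.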
-
+## `prepare-dispatch` Output

-
-No other new dependencies are needed.
+Command:

-
-
-```json
-"dispatch:prepare": "node dispatch/prepare-dispatch.mjs",
-"dispatch:merge": "node dispatch/merge-results.mjs",
-"dispatch:validate": "node dispatch/validate-result.mjs"
+```bash
+audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
 ```

-
+Artifacts:

-
+- `<artifacts_dir>/runs/<run_id>/dispatch-plan.json`
+- `<artifacts_dir>/runs/<run_id>/task-results/<packet_id>.prompt.md`
+- `<artifacts_dir>/runs/<run_id>/dispatch-warnings.json`, only when warnings
+exist

-
-a subagent can scope its review correctly without reading any other files.
+The command prints a compact JSON envelope:

 ```json
 {
-"
-
-
+"run_id": "run-1",
+"dispatch_plan_path": ".audit-artifacts/runs/run-1/dispatch-plan.json",
+"packet_count": 4,
+"task_count": 18,
+"largest_packet": {
+"packet_id": "src-auth:security-correctness:packet-1-...",
+"total_lines": 1320,
+"estimated_tokens": 6180
 },
-"
-
-"do_not_report": "Correctness bugs, test gaps, or operational concerns."
-},
-"tests": {
-"description": "Test coverage gaps for important paths, tests that assert incorrect behavior (pinning bugs as expected), fragile or non-deterministic tests, missing negative/edge-case tests, tests that silently pass on stale builds (e.g. importing compiled dist/ rather than source).",
-"do_not_report": "Source code bugs — report only issues with the tests themselves."
-},
-"security": {
-"description": "Injection vulnerabilities (SQL, shell, path traversal), authentication/authorization flaws, secret exposure, insecure deserialization, privilege escalation, unsafe use of eval or child processes with user input.",
-"do_not_report": "Performance or correctness issues that are not security-relevant."
-},
-"reliability": {
-"description": "Failure modes without recovery, missing timeouts, unhandled promise rejections, race conditions, resource leaks (file handles, sockets, timers), incorrect retry logic, cascading failure risks.",
-"do_not_report": "Correctness bugs that do not affect reliability under failure conditions."
-},
-"performance": {
-"description": "Algorithmic inefficiencies (O(n²) where O(n) is possible), unnecessary re-computation, missing caching, synchronous blocking in hot paths, excessive memory allocation.",
-"do_not_report": "Correctness bugs unrelated to performance."
-},
-"data_integrity": {
-"description": "Missing input validation at trust boundaries, schema violations, inconsistent field naming across related schemas, data loss scenarios, missing required fields, enum values that are present in some schemas but not others.",
-"do_not_report": "UI or presentation issues; operational or deployment concerns."
-},
-"operability": {
-"description": "Missing or low-quality log output, error messages that don't help operators diagnose problems, missing progress indicators for long operations, no elapsed-time reporting, lack of dry-run or preview modes for destructive operations.",
-"do_not_report": "Correctness bugs or deployment configuration."
-},
-"config_deployment": {
-"description": "CI/CD pipeline correctness (wrong triggers, missing branch filters, floating version pins), deployment safety (no gate before publish, missing rollback), insecure secret handling in configs, mutable action tags that should be pinned to commit SHAs.",
-"do_not_report": "Runtime code issues; findings that belong to other lenses."
-}
+"warning_count": 0,
+"dispatch_warnings_path": null
 }
 ```

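Reader's note: the counts in the envelope above could plausibly be derived from the dispatch-plan entries themselves. The sketch below shows one way to do that. `summarizePlan` is a hypothetical name, and treating "largest" as "most `total_lines`" is an assumption; the diff does not say how the packaged command picks the largest packet.

```javascript
// Hypothetical sketch: derive envelope-style counts from dispatch-plan entries.
// "Largest" is taken to mean most total_lines, which is an assumption.
function summarizePlan(runId, entries) {
  let largest = null;
  let taskCount = 0;
  for (const entry of entries) {
    taskCount += entry.task_ids.length;
    if (largest === null || entry.total_lines > largest.total_lines) largest = entry;
  }
  return {
    run_id: runId,
    packet_count: entries.length,
    task_count: taskCount,
    // null when the plan has no entries
    largest_packet: largest && {
      packet_id: largest.packet_id,
      total_lines: largest.total_lines,
      estimated_tokens: largest.estimated_tokens,
    },
  };
}

const summary = summarizePlan("run-1", [
  { packet_id: "packet-a", task_ids: ["t1", "t2"], total_lines: 70, estimated_tokens: 1180 },
  { packet_id: "packet-b", task_ids: ["t3"], total_lines: 1320, estimated_tokens: 6180 },
]);
console.log(JSON.stringify(summary));
```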
-
-
-## Step 3 — Create `dispatch/validate.mjs`
-
-Shared validation module. Exports a single function `validateResult(resultObj, fileLineCounts)`.
-
-### Interface
-
-```js
-/**
-* @param {object} resultObj — parsed JSON from a task-results file
-* @param {Record<string, number>} fileLineCounts — from the task's file_line_counts
-* @returns {{ valid: boolean, errors: string[] }}
-*/
-export function validateResult(resultObj, fileLineCounts) { ... }
-```
-
-### Logic
-
-```
-1. AJV validate resultObj against schemas/audit_result.schema.json
-- Load finding.schema.json first (addSchema) so $ref resolves
-- Use Ajv({ strict: false }) to avoid complaints about unknown keywords like $schema
-- On failure: return { valid: false, errors: ajv.errors.map(e => formatAjvError(e)) }
-
-2. Extra check — line range constraint:
-For each finding in resultObj.findings:
-For each entry in finding.affected_files:
-if entry.line_end is defined:
-look up total_lines from resultObj.file_coverage where path === entry.path
-if total_lines is undefined: push error "affected_files path '${entry.path}' not in file_coverage"
-else if entry.line_end > total_lines: push error
-"finding '${finding.id}': line_end ${entry.line_end} exceeds total_lines ${total_lines} for ${entry.path}"
-
-3. Extra check — lens consistency:
-For each finding in resultObj.findings:
-if finding.lens !== resultObj.lens:
-push error "finding '${finding.id}': lens '${finding.lens}' does not match task lens '${resultObj.lens}'"
-
-4. Extra check — affected_files paths in scope:
-Collect allowed paths from resultObj.file_coverage[].path
-For each finding's affected_files entry:
-if entry.path not in allowed paths:
-push error "finding '${finding.id}': affected path '${entry.path}' not in task file_coverage"
-
-5. If any extra-check errors: return { valid: false, errors }
-
-6. Return { valid: true, errors: [] }
-```
-
-### Schema loading
-
-Schemas are resolved relative to the project root. Use this logic to find the project root:
-
-```js
-// dispatch/validate.mjs
-import { createRequire } from "node:module";
-import { dirname, resolve, join } from "node:path";
-import { fileURLToPath } from "node:url";
-import { readFileSync } from "node:fs";
-import Ajv from "ajv";
-
-const __filename = fileURLToPath(import.meta.url);
-const __dirname = dirname(__filename);
-// dispatch/ is one level below project root
-const PROJECT_ROOT = resolve(__dirname, "..");
-const SCHEMAS_DIR = join(PROJECT_ROOT, "schemas");
-
-function loadSchema(name) {
-return JSON.parse(readFileSync(join(SCHEMAS_DIR, name), "utf8"));
-}
+`dispatch-plan.json` entries are intentionally small:

-
-
-
-
-
-
+```json
+{
+"packet_id": "src-auth:security-correctness:packet-1-...",
+"task_id": "src-auth:security-correctness:packet-1-...",
+"task_ids": ["src-auth:security", "src-auth:correctness"],
+"description": "Audit 2 file(s), 2 task(s), 2 lens(es) (~70 lines)",
+"output_paths": {
+"src-auth:security": ".audit-artifacts/runs/run-1/task-results/src-auth_security.json",
+"src-auth:correctness": ".audit-artifacts/runs/run-1/task-results/src-auth_correctness.json"
+},
+"prompt_path": ".audit-artifacts/runs/run-1/task-results/src-auth_security-correctness_packet-1.prompt.md",
+"lenses": ["security", "correctness"],
+"file_paths": ["src/api/auth.ts", "src/lib/session.ts"],
+"total_lines": 70,
+"estimated_tokens": 1180
 }
 ```

-
+The orchestrator should launch one subagent per entry with the entry
+description and a prompt that tells the subagent to read and follow
+`entry.prompt_path`.

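Reader's note: the launch rule in the added paragraph above (one subagent per entry, given the description plus a pointer to `prompt_path`) can be sketched as a pure mapping over plan entries. `buildLaunchRequests`, the request shape, the prompt wording, and the `packet-a` values are all assumptions; the host's actual subagent tool may expect something different.

```javascript
// Hypothetical sketch: turn dispatch-plan.json entries into launch requests.
// The { packet_id, description, prompt } shape is an assumption.
function buildLaunchRequests(planEntries) {
  return planEntries.map((entry) => ({
    packet_id: entry.packet_id,
    description: entry.description,
    // The parent passes only a pointer to the prompt file, not its contents.
    prompt: `Read and follow the instructions in ${entry.prompt_path}.`,
  }));
}

const requests = buildLaunchRequests([
  {
    packet_id: "packet-a", // made-up id for illustration
    description: "Audit 2 file(s), 2 task(s), 2 lens(es) (~70 lines)",
    prompt_path: ".audit-artifacts/runs/run-1/task-results/packet-a.prompt.md",
  },
]);
console.log(requests[0].prompt);
```

Passing only the path keeps the parent's context small, which matches the stated rule that the parent orchestrator should not read prompt files.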
-##
+## Packet Prompt Contract

-
+Each packet prompt tells the worker to:

-
+- review the packet once
+- read only the listed repo-relative files
+- produce one JSON object per listed task
+- write each object to that task's exact `output_path`
+- preserve the existing `AuditResult` fields:
+`task_id`, `unit_id`, `pass_id`, `lens`, `file_coverage`, `findings`
+- keep `file_coverage[]` as `{ path, total_lines }`
+- keep every finding lens equal to the task lens
+- avoid source edits, remediation, extra task results, and unrelated audits
+- run the generated validation command for every task result
+- reply exactly `valid: <packet_id>, findings=<total finding count>` after all
+validation commands pass

-
-
-```
+This keeps packet review efficient while leaving merge and ingestion
+mechanically deterministic.

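Reader's note: because the contract above fixes the worker's final reply to `valid: <packet_id>, findings=<n>`, an orchestrator can check it mechanically. The helper below is a hypothetical sketch, not part of the package. The packet id in the usage line is made up.

```javascript
// Hypothetical helper: parse a worker's final reply line.
// Expected form, per the contract above: "valid: <packet_id>, findings=<n>"
function parseWorkerReply(reply) {
  const match = /^valid: (.+), findings=(\d+)$/.exec(reply.trim());
  if (!match) return null; // anything else is treated as a non-conforming reply
  return { packet_id: match[1], findings: Number(match[2]) };
}

const parsed = parseWorkerReply("valid: example-packet-1, findings=3");
console.log(parsed.packet_id, parsed.findings);
```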
-
-- `task_id`: e.g. `src-adapters:correctness` (unsanitized — the script sanitizes internally)
+## Validation

-
+Per-task validation is exposed through:

+```bash
+audit-code validate-result --run-id <run_id> --task-id <task_id> --artifacts-dir <artifacts_dir>
 ```
-1. Parse argv: run_id = process.argv[2], task_id = process.argv[3]
-If either missing: print usage and exit 1

-
-
-If not present, default: join(PROJECT_ROOT, ".audit-artifacts")
+The validator checks the result against the assigned task set and enforces the
+mechanical constraints that matter for ingestion:

-
-
-
+- required `AuditResult` and finding fields
+- finding lens matches the task lens
+- cited and affected paths are in assigned coverage
+- line spans do not exceed known `total_lines`
+- result fields conform to the shipped schemas

-
-print error, exit 1
+Workers should retry invalid JSON up to the bounded retry count in the prompt.

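Reader's note: three of the mechanical constraints listed in the Validation hunk above (lens match, paths within coverage, line spans within `total_lines`) can be sketched as plain checks. The field names `affected_files`, `line_end`, and `id` come from the removed plan text elsewhere in this diff. `checkResult` is a hypothetical name, and the shipped validator also applies the JSON schemas, which this sketch does not model.

```javascript
// Hypothetical sketch of three mechanical checks; schema validation is omitted.
function checkResult(result) {
  const errors = [];
  // Known line counts come from the result's own file_coverage entries.
  const totalLines = new Map(result.file_coverage.map((c) => [c.path, c.total_lines]));
  for (const finding of result.findings) {
    if (finding.lens !== result.lens) {
      errors.push(`finding '${finding.id}': lens '${finding.lens}' does not match task lens '${result.lens}'`);
    }
    for (const file of finding.affected_files) {
      if (!totalLines.has(file.path)) {
        errors.push(`finding '${finding.id}': affected path '${file.path}' not in file_coverage`);
      } else if (file.line_end !== undefined && file.line_end > totalLines.get(file.path)) {
        errors.push(`finding '${finding.id}': line_end ${file.line_end} exceeds total_lines for ${file.path}`);
      }
    }
  }
  return { valid: errors.length === 0, errors };
}

const sample = {
  lens: "security",
  file_coverage: [{ path: "src/foo.ts", total_lines: 100 }],
  findings: [{ id: "F1", lens: "security", affected_files: [{ path: "src/foo.ts", line_start: 10, line_end: 200 }] }],
};
console.log(checkResult(sample).errors);
```

The sample fails only the line-span check, since `line_end` 200 exceeds the file's 100 lines.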
-
-tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
-tasks = JSON.parse(readFileSync(tasksPath))
-task = tasks.find(t => t.task_id === task_id)
-fileLineCounts = task?.file_line_counts ?? {}
+## `merge-and-ingest` Output

-
+Command:

-
-
-console.error("✗ invalid:", task_id);
-console.error(JSON.stringify(errors, null, 2));
-exit 1
+```bash
+audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
 ```

-
-
-## Step 5 — Create `dispatch/prepare-dispatch.mjs`
+Merge behavior:

-
+- validates every JSON file under `task-results/`
+- rejects duplicate task results
+- rejects unknown task IDs
+- rejects missing assigned task results
+- writes `failed-tasks.json` and exits non-zero when any assigned result is
+missing or invalid
+- writes `audit-results.json` only from passing results
+- invokes the normal result ingestion path only after the assigned set is clean

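Reader's note: the duplicate, unknown, and missing rules in the Merge behavior hunk above amount to comparing collected results against the assigned task ids. The sketch below is a hypothetical illustration. `classifyResults` and its return shape are assumptions, and it keeps the first copy of a duplicated result, whereas the packaged command may reject both copies.

```javascript
// Hypothetical sketch: classify collected results against assigned task ids.
function classifyResults(assignedTaskIds, results) {
  const assigned = new Set(assignedTaskIds);
  const seen = new Set();
  const accepted = [];
  const rejected = [];
  for (const result of results) {
    if (!assigned.has(result.task_id)) {
      rejected.push({ task_id: result.task_id, reason: "unknown task id" });
    } else if (seen.has(result.task_id)) {
      rejected.push({ task_id: result.task_id, reason: "duplicate result" });
    } else {
      seen.add(result.task_id);
      accepted.push(result);
    }
  }
  for (const taskId of assignedTaskIds) {
    if (!seen.has(taskId)) rejected.push({ task_id: taskId, reason: "missing result" });
  }
  // Ingestion should proceed only when nothing was rejected.
  return { accepted, rejected, clean: rejected.length === 0 };
}

const outcome = classifyResults(["a", "b", "c"], [{ task_id: "a" }, { task_id: "a" }, { task_id: "x" }]);
console.log(outcome.clean, outcome.rejected.map((r) => r.reason));
```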
-
-
-```
-node dispatch/prepare-dispatch.mjs --run-id <run_id>
-```
-
-### Logic
-
-```
-1. Parse --run-id <run_id> from argv. Error if missing.
-
-2. Resolve paths:
-artifactsDir = join(PROJECT_ROOT, ".audit-artifacts")
-runDir = join(artifactsDir, "runs", run_id)
-tasksPath = join(runDir, "pending-audit-tasks.json")
-dispatchPlanPath = join(runDir, "dispatch-plan.json")
-
-3. Read pending-audit-tasks.json — array of AuditTask objects.
-If file not found: error and exit 1.
-
-4. Load shared content (read once, reuse for all tasks):
-lensDefinitions = read dispatch/lens-definitions.json
-auditResultSchema = read schemas/audit_result.schema.json
-findingSchema = read schemas/finding.schema.json
-
-5. For each task in tasks:
-a. sanitizedId = task.task_id.replace(/[^a-zA-Z0-9_-]/g, "_")
-b. outputPath = join(runDir, "task-results", sanitizedId + ".json")
-c. lensDef = lensDefinitions[task.lens]
-d. totalFileLines = Object.values(task.file_line_counts).reduce((a, b) => a + b, 0)
-e. description = `Audit ${task.unit_id} (${task.file_paths.length} file(s), ~${totalFileLines} lines) — ${task.lens} lens`
-f. prompt = buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, run_id, artifactsDir)
-g. Append { task_id, description, output_path: outputPath, prompt } to plan array
-
-6. Ensure task-results/ directory exists:
-mkdirSync(join(runDir, "task-results"), { recursive: true })
-
-7. Write plan array to dispatchPlanPath as formatted JSON.
-
-8. Print: "Wrote dispatch-plan.json — N tasks ready for dispatch"
-Print: "Largest task: <task_id> (~N lines)"
-Print: ""
-Print: "--- ORCHESTRATOR INSTRUCTIONS ---"
-Print: "Read dispatch-plan.json. For each entry, fire one Agent call with:"
-Print: " description: <entry.description>"
-Print: " prompt: <entry.prompt>"
-Print: "Fire all N calls in a single message for parallel execution."
-Print: "When all complete, run: node dispatch/merge-results.mjs --run-id <run_id>"
-```
-
-### `buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, runId, artifactsDir)`
-
-Returns a string. Template (use template literals):
-
-```
-You are a code auditor. Perform a bounded audit of the files listed below under the specified lens.
-
-## Task metadata
-${JSON.stringify(task, null, 2)}
-
-## Files to read
-Read each path in task.file_paths using your Read tool. The repo root is the current working directory — paths are repo-relative (e.g. "src/foo.ts").
-
-file_line_counts gives the expected total line count for each file. Use those exact values for file_coverage[].total_lines in your result.
-
-## Lens: ${task.lens}
-${lensDef.description}
-
-Do NOT report: ${lensDef.do_not_report}
-
-## Output format
-Write your result as a single JSON **object** (not an array) to this exact path:
-${outputPath}
-
-The result must conform to the following schema:
-
-### audit_result.schema.json
-${JSON.stringify(auditResultSchema, null, 2)}
-
-### finding.schema.json
-${JSON.stringify(findingSchema, null, 2)}
-
-## Hard constraints (violations will fail validation)
-1. NEVER set line_end higher than the file's actual line count.
-Use file_line_counts as your reference. If in doubt, leave line_end omitted.
-2. Every finding MUST have ALL required fields:
-id, title, category, severity, confidence, lens, summary, affected_files, evidence
-3. lens on every finding must be exactly "${task.lens}"
-4. No fields outside the schema. Forbidden: "recommendation", "tags", "description" (use "summary").
-5. evidence[] must contain at least one specific file:line reference.
-Format: "path/to/file.ts:42 - brief description of what you see there"
-6. affected_files[] entries are OBJECTS with a "path" key — NOT plain strings.
-Example: {"path": "src/foo.ts", "line_start": 10, "line_end": 20, "symbol": "myFunc"}
-7. Only reference file paths that appear in this task's file_paths.
-8. findings: [] is correct when you genuinely find nothing. Do not invent findings.
-
-## Validation step (required)
-After writing your result, run:
-node dispatch/validate-result.mjs ${runId} ${task.task_id}
-
-- If it exits 0: you are done. Stop.
-- If it exits non-zero: read the error output, fix the JSON, rewrite the file, run again.
-- Repeat up to 3 times.
-
-If you cannot produce a valid result after 3 attempts, write this fallback (substituting real values):
-${JSON.stringify({
-task_id: task.task_id,
-unit_id: task.unit_id,
-pass_id: task.pass_id,
-lens: task.lens,
-file_coverage: task.file_paths.map(p => ({ path: p, total_lines: task.file_line_counts[p] })),
-findings: [],
-notes: ["Validation failed after 3 attempts — empty result written as fallback."]
-}, null, 2)}
-
-Then validate the fallback passes before finishing.
-```
-
-Note: the fallback JSON in the prompt is pre-computed in `buildPrompt` using the task
-data, not generated by the subagent.
-
----
-
-## Step 6 — Create `dispatch/merge-results.mjs`
-
-### Usage
-
-```
-node dispatch/merge-results.mjs --run-id <run_id>
-```
-
-### Logic
-
-```
-1. Parse --run-id <run_id> from argv.
-
-2. Resolve paths:
-taskResultsDir = join(artifactsDir, "runs", run_id, "task-results")
-auditResultsPath = join(artifactsDir, "runs", run_id, "audit-results.json")
-failedTasksPath = join(artifactsDir, "runs", run_id, "failed-tasks.json")
-tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
-
-3. Read pending-audit-tasks.json to build fileLineCounts map:
-lineCounts = {}
-for each task: lineCounts[task.task_id] = task.file_line_counts
-
-4. Read all *.json files from task-results/:
-files = readdirSync(taskResultsDir).filter(f => f.endsWith(".json"))
-
-5. For each file:
-a. Parse JSON
-b. Call validateResult(resultObj, lineCounts[resultObj.task_id] ?? {})
-c. If valid: push to passing[]
-d. If invalid: push { task_id: resultObj?.task_id ?? filename, errors } to failing[]
-
-6. Write passing array to audit-results.json (as AuditResult[] — array of passing objects)
-
-7. If failing.length > 0:
-Write failing array to failed-tasks.json
-Print warning: "${failing.length} task(s) failed validation and were excluded:"
-For each: print " ✗ ${f.task_id}: ${f.errors[0]}" (first error only for brevity)
-
-8. Print: "✓ ${passing.length}/${total} tasks valid → ${auditResultsPath}"
-If failing.length > 0: print " Re-run those tasks in the next cycle."
-
-9. Exit 0 regardless (partial ingestion is safe — failed tasks remain pending for requeue).
-```
-
----
-
-## Step 7 — Update `session-config.json` (optional but recommended)
-
-Add `dispatch_provider` field to `.audit-artifacts/session-config.json`:
+On success the command prints one compact JSON envelope:

 ```json
 {
-"
-"
-"
+"run_id": "run-1",
+"status": "completed",
+"accepted_count": 18,
+"rejected_count": 0,
+"finding_count": 3,
+"audit_results_path": ".audit-artifacts/runs/run-1/audit-results.json",
+"selected_executor": "result_ingestion_executor",
+"progress_made": true,
+"progress_summary": "Ingested 18 audit result entries and refreshed dependent artifacts.",
+"next_likely_step": "runtime_validation"
 }
 ```

|
|
462
|
-
|
|
463
|
-
|
|
464
|
-
---
|
|
465
|
-
|
|
466
|
-
## Testing procedure
|
|
212
|
+
If the command exits non-zero, the orchestrator should stop and report the exact
|
|
213
|
+
error instead of manually editing task results or audit state.

-
+ ## Selective Deepening

-
-
- {
-   "task_id": "test:correctness",
-   "unit_id": "test",
-   "pass_id": "pass:correctness",
-   "lens": "correctness",
-   "file_coverage": [{"path": "src/foo.ts", "total_lines": 100}],
-   "findings": []
- }
- ```
- 2. Run: `node dispatch/validate-result.mjs <some_run_id> test:correctness` (expect exit 0)
- 3. Mutate the file: remove `lens` field (expect exit 1 with error mentioning `lens`)
- 4. Mutate: add `line_end: 200` on an affected_file with total_lines 100 (expect exit 1)
+ Result ingestion may add follow-up `AuditTask` records for bounded selective
+ deepening. Triggers include:

-
+ - high-severity findings
+ - low-confidence or ambiguous findings
+ - conflicting conclusions across related results
+ - high-risk no-finding sampling unless explicitly marked unnecessary
+ - runtime-validation disagreement
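
The triggers listed above could be checked roughly as follows. This is a sketch only: `ResultSummary`, its fields, the `0.5` confidence floor, and the `maxFollowUps` cap (standing in for "bounded") are assumptions made for illustration, not the package's actual `AuditResult` shape or selection logic.

```typescript
// Hypothetical, simplified view of one ingested result.
interface ResultSummary {
  task_id: string;
  high_risk: boolean; // assumed risk flag for the audited unit
  findings: { severity: "low" | "medium" | "high"; confidence: number }[];
  conflicts_with_related: boolean;
  runtime_validation_disagrees: boolean;
  sampling_unnecessary?: boolean; // explicit opt-out of no-finding sampling
}

type DeepeningTrigger =
  | "high_severity"
  | "low_confidence"
  | "conflict"
  | "no_finding_sample"
  | "runtime_disagreement";

interface FollowUp {
  source_task_id: string;
  triggers: DeepeningTrigger[];
}

// Return every trigger that applies to a single result.
function deepeningTriggers(
  result: ResultSummary,
  confidenceFloor = 0.5
): DeepeningTrigger[] {
  const triggers: DeepeningTrigger[] = [];
  if (result.findings.some((f) => f.severity === "high")) {
    triggers.push("high_severity");
  }
  if (result.findings.some((f) => f.confidence < confidenceFloor)) {
    triggers.push("low_confidence");
  }
  if (result.conflicts_with_related) {
    triggers.push("conflict");
  }
  if (result.high_risk && result.findings.length === 0 && !result.sampling_unnecessary) {
    triggers.push("no_finding_sample");
  }
  if (result.runtime_validation_disagrees) {
    triggers.push("runtime_disagreement");
  }
  return triggers;
}

// Keep deepening bounded: stop once maxFollowUps follow-ups are planned.
function planFollowUps(results: ResultSummary[], maxFollowUps: number): FollowUp[] {
  const followUps: FollowUp[] = [];
  for (const result of results) {
    if (followUps.length >= maxFollowUps) break;
    const triggers = deepeningTriggers(result);
    if (triggers.length > 0) {
      followUps.push({ source_task_id: result.task_id, triggers });
    }
  }
  return followUps;
}
```

A first-come cap is the simplest way to keep the follow-up set bounded; a real planner might instead rank candidates by severity or risk before truncating.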

-
-
-
- ```
- 2. Inspect `dispatch-plan.json`: each entry should have `task_id`, `description`, `output_path`, `prompt`
- 3. Verify `prompt` contains the task JSON, lens definition, both schemas, and the output path
+ When follow-up tasks are added, the backend refreshes `review_packets.json` and
+ `audit_plan_metrics.json`. The next dispatch cycle handles those tasks through
+ the same packet contract.

-
+ ## Compatibility Notes

-
-
-
-
-
+ - `AuditTask` remains the planning and coverage identity.
+ - `AuditResult[]` remains the ingestion shape.
+ - The older `.audit-artifacts/dispatch/current-*` files still exist for
+   repo-local backend fallback and single-task handoff flows.
+ - Backend provider adapters remain compatibility bridges. The canonical
+   `/audit-code` flow expects the active conversation orchestrator to dispatch
+   packet subagents when the host supports them.
+ - The `dispatch/` directory is packaged because `lens-definitions.json` and
+   validation support are part of the installed packet workflow.

-
+ ## Verification

-
+ Run the normal project gate:

-
-
- ```
- 1. Run: node dispatch/prepare-dispatch.mjs --run-id <run_id>
- 2. Read: .audit-artifacts/runs/<run_id>/dispatch-plan.json
- 3. In ONE message, fire one Agent call per entry:
-      Agent({ description: entry.description, prompt: entry.prompt })
-    Fire all calls simultaneously; they run in parallel.
- 4. Wait for all subagents to complete.
- 5. Run: node dispatch/merge-results.mjs --run-id <run_id>
- 6. Run: node dist/index.js worker-run --run-id <run_id>
- 7. Run: node dist/index.js audit-code (to get next batch)
- 8. Repeat.
+ ```bash
+ npm test
```

-
-
- `dispatch-plan.json` and fire the calls verbatim.
-
- ---
-
- ## Notes and caveats
-
- ### Large files (2000+ lines)
-
- Tasks with very large files (e.g. `audit-code-wrapper-lib.mjs` at 2215 lines) will still
- hit quota limits for subagents. The `prepare-dispatch.mjs` script should print a warning
- for tasks exceeding a threshold (e.g. 1500 total lines). These tasks may need to be split
- at the task-builder level; that is a separate concern and not addressed here.
-
- ### `audit_results_path` vs per-task files
-
- The existing `renderWorkerPrompt.ts` tells subagents to write to a shared
- `audit-results.json`. The new `prepare-dispatch.mjs`-generated prompts tell subagents to
- write to per-task `task-results/<task_id>.json` files. These are two separate dispatch
- paths: the old path (via `renderWorkerPrompt`) is still used for non-`claude-desktop`
- providers and is not modified by this plan.
-
- ### Future: provider abstraction
-
- `prepare-dispatch.mjs` output (`dispatch-plan.json`) is provider-agnostic. A future
- `anthropic-direct` provider could read the same `dispatch-plan.json` and call
- `messages.create()` for each entry via SDK, with no changes to `prepare-dispatch.mjs`.
-
- ### ajv and published package
-
- `ajv` is added as a devDependency. The `dispatch/` scripts are NOT in the `files` array
- and are not published. End users of the npm package are unaffected.
+ Focused packet coverage lives in `tests/review-packets.test.mjs` and
+ `tests/audit-code-wrapper.test.mjs`.
|