auditor-lambda 0.3.3 → 0.3.5
- package/README.md +6 -1
- package/audit-code-wrapper-lib.mjs +87 -7
- package/dist/cli.js +517 -91
- package/dist/extractors/graph.d.ts +5 -1
- package/dist/extractors/graph.js +223 -3
- package/dist/extractors/pathPatterns.d.ts +3 -2
- package/dist/extractors/pathPatterns.js +97 -24
- package/dist/io/artifacts.d.ts +5 -0
- package/dist/io/artifacts.js +2 -0
- package/dist/orchestrator/advance.js +1 -1
- package/dist/orchestrator/dependencyMap.js +18 -0
- package/dist/orchestrator/fileAnchors.d.ts +32 -0
- package/dist/orchestrator/fileAnchors.js +217 -0
- package/dist/orchestrator/internalExecutors.d.ts +1 -1
- package/dist/orchestrator/internalExecutors.js +120 -33
- package/dist/orchestrator/reviewPackets.d.ts +14 -0
- package/dist/orchestrator/reviewPackets.js +310 -0
- package/dist/orchestrator/selectiveDeepening.d.ts +14 -0
- package/dist/orchestrator/selectiveDeepening.js +392 -0
- package/dist/orchestrator/state.js +6 -1
- package/dist/orchestrator/taskBuilder.d.ts +16 -0
- package/dist/orchestrator/taskBuilder.js +68 -11
- package/dist/prompts/renderWorkerPrompt.js +2 -1
- package/dist/providers/claudeCodeProvider.js +3 -1
- package/dist/providers/index.js +2 -1
- package/dist/supervisor/operatorHandoff.js +22 -11
- package/dist/types/graph.d.ts +1 -0
- package/dist/types/reviewPlanning.d.ts +41 -0
- package/dist/types/reviewPlanning.js +1 -0
- package/dist/types/sessionConfig.d.ts +1 -0
- package/dist/validation/artifacts.js +13 -0
- package/dist/validation/auditResults.js +50 -2
- package/dist/validation/sessionConfig.js +5 -0
- package/docs/agent-integrations.md +4 -1
- package/docs/bootstrap-install.md +3 -0
- package/docs/contract.md +3 -0
- package/docs/dispatch-implementation-plan.md +220 -489
- package/docs/next-steps.md +13 -8
- package/docs/product-direction.md +5 -3
- package/docs/run-flow.md +25 -30
- package/docs/session-config.md +15 -4
- package/docs/supervisor.md +5 -3
- package/docs/workflow-refactor-brief.md +114 -176
- package/package.json +1 -1
- package/schemas/finding.schema.json +1 -15
- package/schemas/graph_bundle.schema.json +16 -0
- package/skills/audit-code/audit-code.prompt.md +11 -6
|
@@ -1,553 +1,284 @@
|
|
|
1
|
-
# Dispatch Automation
|
|
2
|
-
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
1
|
+
# Dispatch Automation Reference
|
|
2
|
+
|
|
3
|
+
This document describes the implemented review-dispatch path for `/audit-code`.
|
|
4
|
+
The original dispatch plan was one agent per audit task. The current path keeps
|
|
5
|
+
the existing `AuditTask` and `AuditResult` contracts, but groups related tasks
|
|
6
|
+
into review packets so a worker can read a coherent file set once and submit
|
|
7
|
+
one validated result for each assigned task through a backend-owned write path.
|
|
8
|
+
|
|
9
|
+
## Current Workflow
|
|
10
|
+
|
|
11
|
+
```text
|
|
12
|
+
1. audit-code
|
|
13
|
+
-> advances deterministic state until semantic review is needed
|
|
14
|
+
-> emits a blocked handoff with active_review_run.run_id
|
|
15
|
+
|
|
16
|
+
2. audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
|
|
17
|
+
-> reads pending-audit-tasks.json and review planning artifacts
|
|
18
|
+
-> writes a slim dispatch-plan.json
|
|
19
|
+
-> writes backend-owned dispatch-result-map.json
|
|
20
|
+
-> writes one packet prompt per dispatch-plan entry
|
|
21
|
+
-> prints one compact JSON envelope
|
|
22
|
+
|
|
23
|
+
3. Conversation orchestrator reads only dispatch-plan.json
|
|
24
|
+
-> launches one subagent per packet
|
|
25
|
+
-> each subagent reads its packet prompt and assigned files
|
|
26
|
+
-> each subagent pipes AuditResult[] to the submit-packet command in the prompt
|
|
27
|
+
-> submit-packet validates and writes only backend-assigned result files
|
|
28
|
+
-> each subagent replies: valid: <packet_id>, findings=<n>
|
|
29
|
+
|
|
30
|
+
4. audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
|
|
31
|
+
-> validates that every assigned task has exactly one valid result
|
|
32
|
+
-> rejects missing, unknown, duplicate, malformed, or out-of-scope results
|
|
33
|
+
-> writes audit-results.json as the existing AuditResult[] shape
|
|
34
|
+
-> ingests accepted results through the normal result_ingestion_executor
|
|
35
|
+
-> prints one compact JSON envelope
|
|
36
|
+
|
|
37
|
+
5. Repeat `audit-code` until complete.
|
|
19
38
|
```
|
|
20
|
-
1. node dist/index.js audit-code
|
|
21
|
-
→ emits run_id + pending-audit-tasks.json
|
|
22
|
-
|
|
23
|
-
2. node dispatch/prepare-dispatch.mjs --run-id <run_id>
|
|
24
|
-
→ reads tasks + schemas → writes dispatch-plan.json
|
|
25
|
-
(deterministic, 0 LLM tokens)
|
|
26
|
-
|
|
27
|
-
3. [Orchestrator reads dispatch-plan.json — small JSON array]
|
|
28
|
-
[Orchestrator fires N Agent calls in ONE message, verbatim prompts from plan]
|
|
29
|
-
|
|
30
|
-
Each subagent (×N, parallel):
|
|
31
|
-
- reads source files with Read tool
|
|
32
|
-
- performs lens audit
|
|
33
|
-
- writes result to task-results/<sanitized_task_id>.json using Write tool
|
|
34
|
-
- runs: node dispatch/validate-result.mjs <run_id> <task_id>
|
|
35
|
-
- if non-zero: fixes errors, rewrites, re-validates (max 3 attempts)
|
|
36
|
-
- if still failing after 3: writes empty-but-valid fallback result
|
|
37
|
-
|
|
38
|
-
4. node dispatch/merge-results.mjs --run-id <run_id>
|
|
39
|
-
→ validates all task-results/*.json
|
|
40
|
-
→ writes audit-results.json (passing results only)
|
|
41
|
-
→ writes failed-tasks.json (task_ids that failed validation)
|
|
42
|
-
(deterministic, 0 LLM tokens)
|
|
43
|
-
|
|
44
|
-
5. node dist/index.js worker-run --run-id <run_id>
|
|
45
|
-
→ ingests audit-results.json → coverage matrix → marks tasks complete
|
|
46
|
-
(deterministic, 0 LLM tokens)
|
|
47
|
-
|
|
48
|
-
6. Repeat from step 1 until no pending tasks remain.
|
|
49
|
-
```
|
|
50
|
-
|
|
51
|
-
Orchestrator token cost per cycle: **~50 tokens × N tasks** (read dispatch-plan + invoke Agent calls). Independent of source file sizes.
|
|
52
39
|
|
|
53
|
-
|
|
40
|
+
The parent orchestrator should not read prompt files, pending tasks, completed
|
|
41
|
+
task result payloads, or source files during the packet dispatch path unless a
|
|
42
|
+
backend command fails and the error requires diagnosis.
|
|
54
43
|
|
|
55
|
-
##
|
|
44
|
+
## Planning Artifacts
|
|
56
45
|
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
lens-definitions.json — lens descriptions embedded in every subagent prompt
|
|
60
|
-
validate.mjs — shared validation logic (imported by other scripts)
|
|
61
|
-
validate-result.mjs — CLI: validate one task-results file
|
|
62
|
-
prepare-dispatch.mjs — reads pending tasks → writes dispatch-plan.json
|
|
63
|
-
merge-results.mjs — merges validated task results → audit-results.json
|
|
64
|
-
```
|
|
46
|
+
Planning writes two packet-specific artifacts alongside the existing task and
|
|
47
|
+
coverage artifacts:
|
|
65
48
|
|
|
66
|
-
|
|
49
|
+
- `review_packets.json`: deterministic packets derived from current
|
|
50
|
+
`AuditTask` records.
|
|
51
|
+
- `audit_plan_metrics.json`: task count, packet count, repeated file/line
|
|
52
|
+
estimates, largest packet, and estimated agent-count reduction.
|
|
67
53
|
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
```
|
|
54
|
+
Packets preserve task identity. They change the worker-facing unit of work, not
|
|
55
|
+
the backend-owned validation or ingestion contract.
|
|
71
56
|
|
|
72
|
-
|
|
73
|
-
> local dev tooling and must not be published to npm.
|
|
57
|
+
## Packet Construction
|
|
74
58
|
|
|
75
|
-
|
|
59
|
+
Packet planning is deterministic and compatibility-preserving:
|
|
76
60
|
|
|
77
|
-
|
|
61
|
+
- tasks sharing the same file set and scope are grouped across lenses
|
|
62
|
+
- tiny homogeneous test files are batched before dispatch
|
|
63
|
+
- graph edges from imports, calls, and references can merge related task groups
|
|
64
|
+
- heuristic container edges do not force packet expansion
|
|
65
|
+
- packet chunking respects task-count and line-budget limits
|
|
66
|
+
- a single file that exceeds the packet target is isolated rather than split
|
|
67
|
+
- high-priority packets sort ahead of lower-priority packets
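The first rule above (grouping tasks that share a file set across lenses) can be sketched roughly as follows. This is an illustration only, not the shipped planner: the function name `groupTasksByFileSet` and the grouping key are assumptions, the task shape is borrowed from the packet example in this document, and scope matching, graph edges, and line budgets are left out.

```javascript
// Hypothetical sketch: group tasks that list the same files, regardless of lens.
// Scope matching, graph-edge merging, and line budgets are NOT modelled here.
function groupTasksByFileSet(tasks) {
  const groups = new Map();
  for (const task of tasks) {
    // Sort a copy so the key does not depend on the order paths were listed.
    const sortedPaths = [...task.file_paths].sort();
    const key = sortedPaths.join("\n");
    if (!groups.has(key)) {
      groups.set(key, { file_paths: sortedPaths, task_ids: [], lenses: [] });
    }
    const group = groups.get(key);
    group.task_ids.push(task.task_id);
    if (!group.lenses.includes(task.lens)) group.lenses.push(task.lens);
  }
  // Map keeps insertion order, so the output is stable for a given input order.
  return [...groups.values()];
}

const packets = groupTasksByFileSet([
  { task_id: "src-auth:security", lens: "security", file_paths: ["src/api/auth.ts", "src/lib/session.ts"] },
  { task_id: "src-auth:correctness", lens: "correctness", file_paths: ["src/lib/session.ts", "src/api/auth.ts"] },
  { task_id: "src-db:security", lens: "security", file_paths: ["src/db/pool.ts"] },
]);
console.log(packets.map((p) => p.task_ids));
```

Keying on the sorted path list means two tasks that name the same files in a different order still land in one group, which is one way to keep the grouping deterministic.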
|
|
78
68
|
|
|
79
|
-
|
|
69
|
+
Generated packets include:
|
|
80
70
|
|
|
81
71
|
```json
|
|
82
|
-
|
|
72
|
+
{
|
|
73
|
+
"packet_id": "src-auth:security-correctness:packet-1-...",
|
|
74
|
+
"task_ids": ["src-auth:security", "src-auth:correctness"],
|
|
75
|
+
"lenses": ["security", "correctness"],
|
|
76
|
+
"file_paths": ["src/api/auth.ts", "src/lib/session.ts"],
|
|
77
|
+
"total_lines": 70,
|
|
78
|
+
"estimated_tokens": 1180
|
|
79
|
+
}
|
|
83
80
|
```
|
|
84
81
|
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
AJV v8 is required for JSON Schema draft 2020-12 support (which the existing schemas use).
|
|
88
|
-
No other new dependencies are needed.
|
|
82
|
+
## `prepare-dispatch` Output
|
|
89
83
|
|
|
90
|
-
|
|
84
|
+
Command:
|
|
91
85
|
|
|
92
|
-
```
|
|
93
|
-
|
|
94
|
-
"dispatch:merge": "node dispatch/merge-results.mjs",
|
|
95
|
-
"dispatch:validate": "node dispatch/validate-result.mjs"
|
|
86
|
+
```bash
|
|
87
|
+
audit-code prepare-dispatch --run-id <run_id> --artifacts-dir <artifacts_dir>
|
|
96
88
|
```
|
|
97
89
|
|
|
98
|
-
|
|
90
|
+
Artifacts:
|
|
99
91
|
|
|
100
|
-
|
|
92
|
+
- `<artifacts_dir>/runs/<run_id>/dispatch-plan.json`
|
|
93
|
+
- `<artifacts_dir>/runs/<run_id>/dispatch-result-map.json`
|
|
94
|
+
- `<artifacts_dir>/runs/<run_id>/task-results/<packet_id>.prompt.md`
|
|
95
|
+
- `<artifacts_dir>/runs/<run_id>/task-results/<packet_id>.anchors.json`,
|
|
96
|
+
only for isolated large-file packets
|
|
97
|
+
- `<artifacts_dir>/runs/<run_id>/dispatch-warnings.json`, only when warnings
|
|
98
|
+
exist
|
|
101
99
|
|
|
102
|
-
|
|
103
|
-
a subagent can scope its review correctly without reading any other files.
|
|
100
|
+
The command prints a compact JSON envelope:
|
|
104
101
|
|
|
105
102
|
```json
|
|
106
103
|
{
|
|
107
|
-
"
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
"
|
|
112
|
-
"
|
|
113
|
-
"
|
|
114
|
-
|
|
115
|
-
"tests": {
|
|
116
|
-
"description": "Test coverage gaps for important paths, tests that assert incorrect behavior (pinning bugs as expected), fragile or non-deterministic tests, missing negative/edge-case tests, tests that silently pass on stale builds (e.g. importing compiled dist/ rather than source).",
|
|
117
|
-
"do_not_report": "Source code bugs — report only issues with the tests themselves."
|
|
104
|
+
"run_id": "run-1",
|
|
105
|
+
"dispatch_plan_path": ".audit-artifacts/runs/run-1/dispatch-plan.json",
|
|
106
|
+
"packet_count": 4,
|
|
107
|
+
"task_count": 18,
|
|
108
|
+
"largest_packet": {
|
|
109
|
+
"packet_id": "src-auth:security-correctness:packet-1-...",
|
|
110
|
+
"total_lines": 1320,
|
|
111
|
+
"estimated_tokens": 6180
|
|
118
112
|
},
|
|
119
|
-
"
|
|
120
|
-
|
|
121
|
-
"do_not_report": "Performance or correctness issues that are not security-relevant."
|
|
122
|
-
},
|
|
123
|
-
"reliability": {
|
|
124
|
-
"description": "Failure modes without recovery, missing timeouts, unhandled promise rejections, race conditions, resource leaks (file handles, sockets, timers), incorrect retry logic, cascading failure risks.",
|
|
125
|
-
"do_not_report": "Correctness bugs that do not affect reliability under failure conditions."
|
|
126
|
-
},
|
|
127
|
-
"performance": {
|
|
128
|
-
"description": "Algorithmic inefficiencies (O(n²) where O(n) is possible), unnecessary re-computation, missing caching, synchronous blocking in hot paths, excessive memory allocation.",
|
|
129
|
-
"do_not_report": "Correctness bugs unrelated to performance."
|
|
130
|
-
},
|
|
131
|
-
"data_integrity": {
|
|
132
|
-
"description": "Missing input validation at trust boundaries, schema violations, inconsistent field naming across related schemas, data loss scenarios, missing required fields, enum values that are present in some schemas but not others.",
|
|
133
|
-
"do_not_report": "UI or presentation issues; operational or deployment concerns."
|
|
134
|
-
},
|
|
135
|
-
"operability": {
|
|
136
|
-
"description": "Missing or low-quality log output, error messages that don't help operators diagnose problems, missing progress indicators for long operations, no elapsed-time reporting, lack of dry-run or preview modes for destructive operations.",
|
|
137
|
-
"do_not_report": "Correctness bugs or deployment configuration."
|
|
138
|
-
},
|
|
139
|
-
"config_deployment": {
|
|
140
|
-
"description": "CI/CD pipeline correctness (wrong triggers, missing branch filters, floating version pins), deployment safety (no gate before publish, missing rollback), insecure secret handling in configs, mutable action tags that should be pinned to commit SHAs.",
|
|
141
|
-
"do_not_report": "Runtime code issues; findings that belong to other lenses."
|
|
142
|
-
}
|
|
113
|
+
"warning_count": 0,
|
|
114
|
+
"dispatch_warnings_path": null
|
|
143
115
|
}
|
|
144
116
|
```
|
|
145
117
|
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
## Step 3 — Create `dispatch/validate.mjs`
|
|
118
|
+
`dispatch-plan.json` entries are intentionally small:
|
|
149
119
|
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
/**
|
|
156
|
-
* @param {object} resultObj — parsed JSON from a task-results file
|
|
157
|
-
* @param {Record<string, number>} fileLineCounts — from the task's file_line_counts
|
|
158
|
-
* @returns {{ valid: boolean, errors: string[] }}
|
|
159
|
-
*/
|
|
160
|
-
export function validateResult(resultObj, fileLineCounts) { ... }
|
|
161
|
-
```
|
|
162
|
-
|
|
163
|
-
### Logic
|
|
164
|
-
|
|
165
|
-
```
|
|
166
|
-
1. AJV validate resultObj against schemas/audit_result.schema.json
|
|
167
|
-
- Load finding.schema.json first (addSchema) so $ref resolves
|
|
168
|
-
- Use Ajv({ strict: false }) to avoid complaints about unknown keywords like $schema
|
|
169
|
-
- On failure: return { valid: false, errors: ajv.errors.map(e => formatAjvError(e)) }
|
|
170
|
-
|
|
171
|
-
2. Extra check — line range constraint:
|
|
172
|
-
For each finding in resultObj.findings:
|
|
173
|
-
For each entry in finding.affected_files:
|
|
174
|
-
if entry.line_end is defined:
|
|
175
|
-
look up total_lines from resultObj.file_coverage where path === entry.path
|
|
176
|
-
if total_lines is undefined: push error "affected_files path '${entry.path}' not in file_coverage"
|
|
177
|
-
else if entry.line_end > total_lines: push error
|
|
178
|
-
"finding '${finding.id}': line_end ${entry.line_end} exceeds total_lines ${total_lines} for ${entry.path}"
|
|
179
|
-
|
|
180
|
-
3. Extra check — lens consistency:
|
|
181
|
-
For each finding in resultObj.findings:
|
|
182
|
-
if finding.lens !== resultObj.lens:
|
|
183
|
-
push error "finding '${finding.id}': lens '${finding.lens}' does not match task lens '${resultObj.lens}'"
|
|
184
|
-
|
|
185
|
-
4. Extra check — affected_files paths in scope:
|
|
186
|
-
Collect allowed paths from resultObj.file_coverage[].path
|
|
187
|
-
For each finding's affected_files entry:
|
|
188
|
-
if entry.path not in allowed paths:
|
|
189
|
-
push error "finding '${finding.id}': affected path '${entry.path}' not in task file_coverage"
|
|
190
|
-
|
|
191
|
-
5. If any extra-check errors: return { valid: false, errors }
|
|
192
|
-
|
|
193
|
-
6. Return { valid: true, errors: [] }
|
|
194
|
-
```
|
|
195
|
-
|
|
196
|
-
### Schema loading
|
|
197
|
-
|
|
198
|
-
Schemas are resolved relative to the project root. Use this logic to find the project root:
|
|
199
|
-
|
|
200
|
-
```js
|
|
201
|
-
// dispatch/validate.mjs
|
|
202
|
-
import { createRequire } from "node:module";
|
|
203
|
-
import { dirname, resolve, join } from "node:path";
|
|
204
|
-
import { fileURLToPath } from "node:url";
|
|
205
|
-
import { readFileSync } from "node:fs";
|
|
206
|
-
import Ajv from "ajv";
|
|
207
|
-
|
|
208
|
-
const __filename = fileURLToPath(import.meta.url);
|
|
209
|
-
const __dirname = dirname(__filename);
|
|
210
|
-
// dispatch/ is one level below project root
|
|
211
|
-
const PROJECT_ROOT = resolve(__dirname, "..");
|
|
212
|
-
const SCHEMAS_DIR = join(PROJECT_ROOT, "schemas");
|
|
213
|
-
|
|
214
|
-
function loadSchema(name) {
|
|
215
|
-
return JSON.parse(readFileSync(join(SCHEMAS_DIR, name), "utf8"));
|
|
216
|
-
}
|
|
217
|
-
|
|
218
|
-
let _ajv = null;
|
|
219
|
-
function getAjv() {
|
|
220
|
-
if (_ajv) return _ajv;
|
|
221
|
-
_ajv = new Ajv({ strict: false, allErrors: true });
|
|
222
|
-
_ajv.addSchema(loadSchema("finding.schema.json"));
|
|
223
|
-
return _ajv;
|
|
120
|
+
```json
|
|
121
|
+
{
|
|
122
|
+
"packet_id": "src-auth:security-correctness:packet-1-...",
|
|
123
|
+
"description": "Audit 2 file(s), 2 task(s), 2 lens(es) (~70 lines)",
|
|
124
|
+
"prompt_path": ".audit-artifacts/runs/run-1/task-results/src-auth_security-correctness_packet-1_ab12cd34ef56.prompt.md"
|
|
224
125
|
}
|
|
225
126
|
```
|
|
226
127
|
|
|
227
|
-
|
|
228
|
-
|
|
229
|
-
|
|
230
|
-
|
|
231
|
-
|
|
232
|
-
|
|
233
|
-
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
128
|
+
The orchestrator should launch one subagent per entry with the entry
|
|
129
|
+
description and a prompt that tells the subagent to read and follow
|
|
130
|
+
`entry.prompt_path`.
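As a rough sketch of that step, an orchestrator could map plan entries to launch requests like this. The function `buildLaunchRequests`, the request shape, and the wording of the prompt are assumptions for illustration; only `description` and `prompt_path` come from the entry format shown above, and the sample values are hypothetical.

```javascript
// Hypothetical sketch: one launch request per dispatch-plan entry.
// The orchestrator never opens the prompt file; it only passes the path along.
function buildLaunchRequests(planEntries) {
  return planEntries.map((entry) => ({
    description: entry.description,
    prompt: `Read and follow the instructions in ${entry.prompt_path}.`,
  }));
}

const requests = buildLaunchRequests([
  {
    packet_id: "packet-a", // illustrative value, not a real packet ID
    description: "Audit 2 file(s), 2 task(s), 2 lens(es) (~70 lines)",
    prompt_path: "runs/run-1/task-results/packet-a.prompt.md", // illustrative path
  },
]);
console.log(requests[0].prompt);
```

Keeping the request this small is what lets the parent conversation stay independent of packet and source file sizes.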
|
|
131
|
+
|
|
132
|
+
## Large File Mode
|
|
133
|
+
|
|
134
|
+
The workflow does not impose a hard single-file size limit. When a packet is
|
|
135
|
+
large because it contains one large file, `prepare-dispatch` keeps that file in
|
|
136
|
+
an isolated packet and writes a mechanical anchor summary next to the packet
|
|
137
|
+
prompt. The anchor summary may include:
|
|
138
|
+
|
|
139
|
+
- file boundaries
|
|
140
|
+
- imports and exports
|
|
141
|
+
- top-level symbols
|
|
142
|
+
- route-like declarations
|
|
143
|
+
- risk keywords
|
|
144
|
+
- graph edges
|
|
145
|
+
- external analyzer signals
|
|
146
|
+
|
|
147
|
+
The packet prompt points the worker at the anchor file and asks for targeted
|
|
148
|
+
reads/searches within the assigned file. The backend still validates and writes
|
|
149
|
+
results through `submit-packet`. This keeps large-file review bounded by
|
|
150
|
+
mechanically generated structure without slicing files into arbitrary line
|
|
151
|
+
ranges.
|
|
152
|
+
|
|
153
|
+
## Packet Prompt Contract
|
|
154
|
+
|
|
155
|
+
Each packet prompt tells the worker to:
|
|
156
|
+
|
|
157
|
+
- review the packet once
|
|
158
|
+
- read only the listed repo-relative files
|
|
159
|
+
- produce one JSON object per listed task
|
|
160
|
+
- pipe one JSON array to the prompt's `submit-packet` command
|
|
161
|
+
- preserve the existing `AuditResult` fields:
|
|
162
|
+
`task_id`, `unit_id`, `pass_id`, `lens`, `file_coverage`, `findings`
|
|
163
|
+
- keep `file_coverage[]` as `{ path, total_lines }`
|
|
164
|
+
- keep every finding lens equal to the task lens
|
|
165
|
+
- avoid direct file writes, source edits, remediation, extra task results, and
|
|
166
|
+
unrelated audits
|
|
167
|
+
- reply exactly `valid: <packet_id>, findings=<total finding count>` after the
|
|
168
|
+
submit command accepts the packet
|
|
169
|
+
|
|
170
|
+
This keeps packet review efficient while leaving merge and ingestion
|
|
171
|
+
mechanically deterministic.
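The final reply line in the contract sums findings across every task result in the packet. A minimal sketch, assuming a hypothetical helper `formatPacketReply` that receives the same `AuditResult[]` array the worker submitted:

```javascript
// Hypothetical helper: build the one-line reply a worker sends after
// submit-packet has accepted the packet.
function formatPacketReply(packetId, results) {
  // Total findings across all task results in the packet.
  const total = results.reduce((sum, result) => sum + result.findings.length, 0);
  return `valid: ${packetId}, findings=${total}`;
}

const reply = formatPacketReply("packet-a", [
  { task_id: "src-auth:security", findings: [{ id: "f1" }, { id: "f2" }] },
  { task_id: "src-auth:correctness", findings: [] },
]);
console.log(reply);
```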
|
|
172
|
+
|
|
173
|
+
## Submission and Validation
|
|
174
|
+
|
|
175
|
+
Packet submission is exposed through:
|
|
176
|
+
|
|
177
|
+
```bash
|
|
178
|
+
audit-code submit-packet --run-id <run_id> --packet-id <packet_id> --artifacts-dir <artifacts_dir>
|
|
237
179
|
```
|
|
238
180
|
|
|
239
|
-
|
|
240
|
-
|
|
181
|
+
The command reads `AuditResult[]` from stdin, validates the complete assigned
|
|
182
|
+
packet, and writes only the backend-assigned per-task result paths from
|
|
183
|
+
`dispatch-result-map.json`. This keeps result writes out of the LLM prompt and
|
|
184
|
+
prevents swapped or unknown task result files from being ingested.
|
|
241
185
|
|
|
242
|
-
|
|
186
|
+
Per-task validation is exposed through:
|
|
243
187
|
|
|
188
|
+
```bash
|
|
189
|
+
audit-code validate-result --run-id <run_id> --task-id <task_id> --artifacts-dir <artifacts_dir>
|
|
244
190
|
```
|
|
245
|
-
1. Parse argv: run_id = process.argv[2], task_id = process.argv[3]
|
|
246
|
-
If either missing: print usage and exit 1
|
|
247
191
|
|
|
248
|
-
|
|
249
|
-
|
|
250
|
-
|
|
192
|
+
Generated packet prompts may pass run ids, packet ids, task ids, and artifact paths through
|
|
193
|
+
base64url flags such as `--run-id-b64`, `--packet-id-b64`, `--task-id-b64`, and
|
|
194
|
+
`--artifacts-dir-b64` when raw values could contain shell-sensitive characters.
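To illustrate why base64url helps here, the sketch below round-trips a value containing shell-sensitive characters using Node's built-in `Buffer` support for the `"base64url"` encoding. The helper names are hypothetical, and how the CLI itself decodes these flags is not shown in this document.

```javascript
// Hypothetical helpers: encode a raw value for a *-b64 flag and decode it again.
// Uses Node's Buffer "base64url" encoding (URL-safe alphabet, no padding).
function toBase64Url(value) {
  return Buffer.from(value, "utf8").toString("base64url");
}

function fromBase64Url(encoded) {
  return Buffer.from(encoded, "base64url").toString("utf8");
}

const rawTaskId = "src/auth:security $(whoami) 'quoted'";
const encodedTaskId = toBase64Url(rawTaskId);
console.log(encodedTaskId);
console.log(fromBase64Url(encodedTaskId) === rawTaskId);
```

The encoded form contains only letters, digits, `-`, and `_`, so it can be placed on a command line without quoting.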
|
|
251
195
|
|
|
252
|
-
|
|
253
|
-
|
|
254
|
-
resultPath = join(artifactsDir, "runs", run_id, "task-results", sanitized + ".json")
|
|
196
|
+
The validator checks the result against the assigned task set and enforces the
|
|
197
|
+
mechanical constraints that matter for ingestion:
|
|
255
198
|
|
|
256
|
-
|
|
257
|
-
|
|
199
|
+
- required `AuditResult` and finding fields
|
|
200
|
+
- finding lens matches the task lens
|
|
201
|
+
- cited and affected paths are in assigned coverage
|
|
202
|
+
- line spans do not exceed known `total_lines`
|
|
203
|
+
- result fields conform to the shipped schemas
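The lens, path, and line-span checks can be sketched as below. This is a simplified illustration, not the shipped validator: `checkResult` is a hypothetical name, it compares findings against the result's own `file_coverage` rather than the backend's assigned task set, and it skips schema validation entirely.

```javascript
// Hypothetical sketch of three mechanical checks on a single AuditResult.
// Schema validation and comparison against the assigned task are omitted.
function checkResult(result) {
  const errors = [];
  const totalLines = new Map(result.file_coverage.map((c) => [c.path, c.total_lines]));
  for (const finding of result.findings) {
    if (finding.lens !== result.lens) {
      errors.push(`finding '${finding.id}': lens '${finding.lens}' does not match task lens '${result.lens}'`);
    }
    for (const entry of finding.affected_files ?? []) {
      const limit = totalLines.get(entry.path);
      if (limit === undefined) {
        errors.push(`finding '${finding.id}': affected path '${entry.path}' not in file_coverage`);
      } else if (entry.line_end !== undefined && entry.line_end > limit) {
        errors.push(`finding '${finding.id}': line_end ${entry.line_end} exceeds total_lines ${limit} for ${entry.path}`);
      }
    }
  }
  return { valid: errors.length === 0, errors };
}

const checked = checkResult({
  task_id: "test:correctness",
  lens: "correctness",
  file_coverage: [{ path: "src/foo.ts", total_lines: 100 }],
  findings: [
    { id: "f1", lens: "security", affected_files: [{ path: "src/foo.ts", line_end: 200 }] },
  ],
});
console.log(checked.errors);
```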
|
|
258
204
|
|
|
259
|
-
|
|
260
|
-
tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
|
|
261
|
-
tasks = JSON.parse(readFileSync(tasksPath))
|
|
262
|
-
task = tasks.find(t => t.task_id === task_id)
|
|
263
|
-
fileLineCounts = task?.file_line_counts ?? {}
|
|
205
|
+
Workers should retry rejected submissions up to the bounded retry count in the prompt.
|
|
264
206
|
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
7. If valid: console.log("✓ valid:", task_id); exit 0
|
|
268
|
-
If invalid:
|
|
269
|
-
console.error("✗ invalid:", task_id);
|
|
270
|
-
console.error(JSON.stringify(errors, null, 2));
|
|
271
|
-
exit 1
|
|
272
|
-
```
|
|
207
|
+
## `merge-and-ingest` Output
|
|
273
208
|
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
## Step 5 — Create `dispatch/prepare-dispatch.mjs`
|
|
277
|
-
|
|
278
|
-
Core script. Reads pending tasks and produces a ready-to-use dispatch plan.
|
|
279
|
-
|
|
280
|
-
### Usage
|
|
281
|
-
|
|
282
|
-
```
|
|
283
|
-
node dispatch/prepare-dispatch.mjs --run-id <run_id>
|
|
284
|
-
```
|
|
285
|
-
|
|
286
|
-
### Logic
|
|
287
|
-
|
|
288
|
-
```
|
|
289
|
-
1. Parse --run-id <run_id> from argv. Error if missing.
|
|
290
|
-
|
|
291
|
-
2. Resolve paths:
|
|
292
|
-
artifactsDir = join(PROJECT_ROOT, ".audit-artifacts")
|
|
293
|
-
runDir = join(artifactsDir, "runs", run_id)
|
|
294
|
-
tasksPath = join(runDir, "pending-audit-tasks.json")
|
|
295
|
-
dispatchPlanPath = join(runDir, "dispatch-plan.json")
|
|
296
|
-
|
|
297
|
-
3. Read pending-audit-tasks.json — array of AuditTask objects.
|
|
298
|
-
If file not found: error and exit 1.
|
|
299
|
-
|
|
300
|
-
4. Load shared content (read once, reuse for all tasks):
|
|
301
|
-
lensDefinitions = read dispatch/lens-definitions.json
|
|
302
|
-
auditResultSchema = read schemas/audit_result.schema.json
|
|
303
|
-
findingSchema = read schemas/finding.schema.json
|
|
304
|
-
|
|
305
|
-
5. For each task in tasks:
|
|
306
|
-
a. sanitizedId = task.task_id.replace(/[^a-zA-Z0-9_-]/g, "_")
|
|
307
|
-
b. outputPath = join(runDir, "task-results", sanitizedId + ".json")
|
|
308
|
-
c. lensDef = lensDefinitions[task.lens]
|
|
309
|
-
d. totalFileLines = Object.values(task.file_line_counts).reduce((a, b) => a + b, 0)
|
|
310
|
-
e. description = `Audit ${task.unit_id} (${task.file_paths.length} file(s), ~${totalFileLines} lines) — ${task.lens} lens`
|
|
311
|
-
f. prompt = buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, run_id, artifactsDir)
|
|
312
|
-
g. Append { task_id, description, output_path: outputPath, prompt } to plan array
|
|
313
|
-
|
|
314
|
-
6. Ensure task-results/ directory exists:
|
|
315
|
-
mkdirSync(join(runDir, "task-results"), { recursive: true })
|
|
316
|
-
|
|
317
|
-
7. Write plan array to dispatchPlanPath as formatted JSON.
|
|
318
|
-
|
|
319
|
-
8. Print: "Wrote dispatch-plan.json — N tasks ready for dispatch"
|
|
320
|
-
Print: "Largest task: <task_id> (~N lines)"
|
|
321
|
-
Print: ""
|
|
322
|
-
Print: "--- ORCHESTRATOR INSTRUCTIONS ---"
|
|
323
|
-
Print: "Read dispatch-plan.json. For each entry, fire one Agent call with:"
|
|
324
|
-
Print: " description: <entry.description>"
|
|
325
|
-
Print: " prompt: <entry.prompt>"
|
|
326
|
-
Print: "Fire all N calls in a single message for parallel execution."
|
|
327
|
-
Print: "When all complete, run: node dispatch/merge-results.mjs --run-id <run_id>"
|
|
328
|
-
```
|
|
329
|
-
|
|
330
|
-
### `buildPrompt(task, lensDef, auditResultSchema, findingSchema, outputPath, runId, artifactsDir)`
|
|
331
|
-
|
|
332
|
-
Returns a string. Template (use template literals):
|
|
333
|
-
|
|
334
|
-
```
|
|
335
|
-
You are a code auditor. Perform a bounded audit of the files listed below under the specified lens.
|
|
336
|
-
|
|
337
|
-
## Task metadata
|
|
338
|
-
${JSON.stringify(task, null, 2)}
|
|
339
|
-
|
|
340
|
-
## Files to read
|
|
341
|
-
Read each path in task.file_paths using your Read tool. The repo root is the current working directory — paths are repo-relative (e.g. "src/foo.ts").
|
|
342
|
-
|
|
343
|
-
file_line_counts gives the expected total line count for each file. Use those exact values for file_coverage[].total_lines in your result.
|
|
344
|
-
|
|
345
|
-
## Lens: ${task.lens}
|
|
346
|
-
${lensDef.description}
|
|
347
|
-
|
|
348
|
-
Do NOT report: ${lensDef.do_not_report}
|
|
349
|
-
|
|
350
|
-
## Output format
|
|
351
|
-
Write your result as a single JSON **object** (not an array) to this exact path:
|
|
352
|
-
${outputPath}
|
|
353
|
-
|
|
354
|
-
The result must conform to the following schema:
|
|
355
|
-
|
|
356
|
-
### audit_result.schema.json
|
|
357
|
-
${JSON.stringify(auditResultSchema, null, 2)}
|
|
358
|
-
|
|
359
|
-
### finding.schema.json
|
|
360
|
-
${JSON.stringify(findingSchema, null, 2)}
|
|
361
|
-
|
|
362
|
-
## Hard constraints (violations will fail validation)
|
|
363
|
-
1. NEVER set line_end higher than the file's actual line count.
|
|
364
|
-
Use file_line_counts as your reference. If in doubt, leave line_end omitted.
|
|
365
|
-
2. Every finding MUST have ALL required fields:
|
|
366
|
-
id, title, category, severity, confidence, lens, summary, affected_files, evidence
|
|
367
|
-
3. lens on every finding must be exactly "${task.lens}"
|
|
368
|
-
4. No fields outside the schema. Forbidden: "recommendation", "tags", "description" (use "summary").
|
|
369
|
-
5. evidence[] must contain at least one specific file:line reference.
|
|
370
|
-
Format: "path/to/file.ts:42 - brief description of what you see there"
|
|
371
|
-
6. affected_files[] entries are OBJECTS with a "path" key — NOT plain strings.
|
|
372
|
-
Example: {"path": "src/foo.ts", "line_start": 10, "line_end": 20, "symbol": "myFunc"}
|
|
373
|
-
7. Only reference file paths that appear in this task's file_paths.
|
|
374
|
-
8. findings: [] is correct when you genuinely find nothing. Do not invent findings.
|
|
375
|
-
|
|
376
|
-
## Validation step (required)
|
|
377
|
-
After writing your result, run:
|
|
378
|
-
node dispatch/validate-result.mjs ${runId} ${task.task_id}
|
|
379
|
-
|
|
380
|
-
- If it exits 0: you are done. Stop.
|
|
381
|
-
- If it exits non-zero: read the error output, fix the JSON, rewrite the file, run again.
|
|
382
|
-
- Repeat up to 3 times.
|
|
383
|
-
|
|
384
|
-
If you cannot produce a valid result after 3 attempts, write this fallback (substituting real values):
|
|
385
|
-
${JSON.stringify({
|
|
386
|
-
task_id: task.task_id,
|
|
387
|
-
unit_id: task.unit_id,
|
|
388
|
-
pass_id: task.pass_id,
|
|
389
|
-
lens: task.lens,
|
|
390
|
-
file_coverage: task.file_paths.map(p => ({ path: p, total_lines: task.file_line_counts[p] })),
|
|
391
|
-
findings: [],
|
|
392
|
-
notes: ["Validation failed after 3 attempts — empty result written as fallback."]
|
|
393
|
-
}, null, 2)}
|
|
394
|
-
|
|
395
|
-
Then validate the fallback passes before finishing.
|
|
396
|
-
```
|
|
397
|
-
|
|
398
|
-
Note: the fallback JSON in the prompt is pre-computed in `buildPrompt` using the task
|
|
399
|
-
data, not generated by the subagent.
|
|
400
|
-
|
|
401
|
-
---
|
|
402
|
-
|
|
403
|
-
## Step 6 — Create `dispatch/merge-results.mjs`
|
|
404
|
-
|
|
405
|
-
### Usage
|
|
406
|
-
|
|
407
|
-
```
|
|
408
|
-
node dispatch/merge-results.mjs --run-id <run_id>
|
|
409
|
-
```
|
|
410
|
-
|
|
411
|
-
### Logic
|
|
209
|
+
Command:
|
|
412
210
|
|
|
211
|
+
```bash
|
|
212
|
+
audit-code merge-and-ingest --run-id <run_id> --artifacts-dir <artifacts_dir>
|
|
413
213
|
```
|
|
414
|
-
1. Parse --run-id <run_id> from argv.
|
|
415
214
|
|
|
416
|
-
|
|
417
|
-
taskResultsDir = join(artifactsDir, "runs", run_id, "task-results")
|
|
418
|
-
auditResultsPath = join(artifactsDir, "runs", run_id, "audit-results.json")
|
|
419
|
-
failedTasksPath = join(artifactsDir, "runs", run_id, "failed-tasks.json")
|
|
420
|
-
tasksPath = join(artifactsDir, "runs", run_id, "pending-audit-tasks.json")
|
|
215
|
+
Merge behavior:
|
|
421
216
|
|
|
422
|
-
|
|
423
|
-
|
|
424
|
-
|
|
217
|
+
- validates every backend-assigned result-map path
|
|
218
|
+
- rejects unexpected JSON files under `task-results/`
|
|
219
|
+
- rejects task IDs that appear in the wrong assigned result path
|
|
220
|
+
- rejects duplicate task results
|
|
221
|
+
- rejects unknown task IDs
|
|
222
|
+
- rejects missing assigned task results
|
|
223
|
+
- writes `failed-tasks.json` and exits non-zero when any assigned result is
|
|
224
|
+
missing or invalid
|
|
225
|
+
- writes `audit-results.json` only from passing results
|
|
226
|
+
- invokes the normal result ingestion path only after the assigned set is clean
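The missing, unknown, and duplicate checks amount to reconciling the assigned task IDs against what was submitted. A minimal sketch, assuming a hypothetical `reconcileResults` function and in-memory inputs instead of the result-map files the real command reads:

```javascript
// Hypothetical sketch: classify submitted results against the assigned task IDs.
// Wrong-path detection and per-result validation are not modelled here.
function reconcileResults(assignedTaskIds, submitted) {
  const assigned = new Set(assignedTaskIds);
  const seen = new Set();
  const accepted = [];
  const rejected = [];
  for (const result of submitted) {
    if (!assigned.has(result.task_id)) {
      rejected.push({ task_id: result.task_id, reason: "unknown" });
    } else if (seen.has(result.task_id)) {
      rejected.push({ task_id: result.task_id, reason: "duplicate" });
    } else {
      seen.add(result.task_id);
      accepted.push(result);
    }
  }
  for (const taskId of assignedTaskIds) {
    if (!seen.has(taskId)) rejected.push({ task_id: taskId, reason: "missing" });
  }
  // Ingestion should proceed only when nothing was rejected.
  return { accepted, rejected, clean: rejected.length === 0 };
}

const outcome = reconcileResults(
  ["a:security", "a:correctness", "b:security"],
  [{ task_id: "a:security" }, { task_id: "a:security" }, { task_id: "z:tests" }, { task_id: "b:security" }],
);
console.log(outcome.rejected);
```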
|
|
425
227
|
|
|
426
|
-
|
|
427
|
-
files = readdirSync(taskResultsDir).filter(f => f.endsWith(".json"))
|
|
428
|
-
|
|
429
|
-
5. For each file:
|
|
430
|
-
a. Parse JSON
|
|
431
|
-
b. Call validateResult(resultObj, lineCounts[resultObj.task_id] ?? {})
|
|
432
|
-
c. If valid: push to passing[]
|
|
433
|
-
d. If invalid: push { task_id: resultObj?.task_id ?? filename, errors } to failing[]
|
|
434
|
-
|
|
435
|
-
6. Write passing array to audit-results.json (as AuditResult[] — array of passing objects)
|
|
436
|
-
|
|
437
|
-
7. If failing.length > 0:
|
|
438
|
-
Write failing array to failed-tasks.json
|
|
439
|
-
Print warning: "${failing.length} task(s) failed validation and were excluded:"
|
|
440
|
-
For each: print " ✗ ${f.task_id}: ${f.errors[0]}" (first error only for brevity)
|
|
441
|
-
|
|
442
|
-
8. Print: "✓ ${passing.length}/${total} tasks valid → ${auditResultsPath}"
|
|
443
|
-
If failing.length > 0: print " Re-run those tasks in the next cycle."
|
|
444
|
-
|
|
445
|
-
9. Exit 0 regardless (partial ingestion is safe — failed tasks remain pending for requeue).
|
|
446
|
-
```
|
|
447
|
-
|
|
448
|
-
---
|
|
449
|
-
|
|
450
|
-
## Step 7 — Update `session-config.json` (optional but recommended)
|
|
451
|
-
|
|
452
|
-
Add `dispatch_provider` field to `.audit-artifacts/session-config.json`:
|
|
228
|
+
On success the command prints one compact JSON envelope:
|
|
453
229
|
|
|
454
230
|
```json
|
|
455
231
|
{
|
|
456
|
-
"
|
|
457
|
-
"
|
|
458
|
-
"
|
|
232
|
+
"run_id": "run-1",
|
|
233
|
+
"status": "completed",
|
|
234
|
+
"accepted_count": 18,
|
|
235
|
+
"rejected_count": 0,
|
|
236
|
+
"finding_count": 3,
|
|
237
|
+
"audit_results_path": ".audit-artifacts/runs/run-1/audit-results.json",
|
|
238
|
+
"selected_executor": "result_ingestion_executor",
|
|
239
|
+
"progress_made": true,
|
|
240
|
+
"progress_summary": "Ingested 18 audit result entries and refreshed dependent artifacts.",
|
|
241
|
+
"next_likely_step": "runtime_validation"
|
|
459
242
|
}
|
|
460
243
|
```
|
|
461
244
|
|
|
462
|
-
|
|
463
|
-
|
|
464
|
-
---
|
|
465
|
-
|
|
466
|
-
## Testing procedure
|
|
467
|
-
|
|
468
|
-
### Unit test: `validate-result.mjs`
|
|
245
|
+
If the command exits non-zero, the orchestrator should stop and report the exact
|
|
246
|
+
error instead of manually editing task results or audit state.
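
As a rough sketch of that stop-and-report rule, an orchestrator could reduce the command outcome to a single decision. The helper name `interpretMergeOutcome` and its return shape are assumptions made for illustration. Only the envelope field names (`status`, `accepted_count`, `rejected_count`, `next_likely_step`) come from the example envelope.

```javascript
// Hypothetical helper: not part of the package. It only reads fields that
// appear in the documented success envelope.
function interpretMergeOutcome(exitCode, stdout, stderr) {
  if (exitCode !== 0) {
    // Non-zero exit: surface the exact error text and stop; do not patch
    // task results or audit state by hand.
    const message = (stderr || stdout || "").trim();
    return { ok: false, action: "stop_and_report", message };
  }
  let envelope;
  try {
    envelope = JSON.parse(stdout);
  } catch (err) {
    // Assumption: unparseable output on exit 0 is also treated as a failure.
    return { ok: false, action: "stop_and_report", message: `unparseable envelope: ${err.message}` };
  }
  return {
    ok: true,
    action: "continue",
    status: envelope.status,
    accepted: envelope.accepted_count,
    rejected: envelope.rejected_count,
    nextStep: envelope.next_likely_step,
  };
}
```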
|
|
469
247
|
|
|
470
|
-
|
|
471
|
-
```json
|
|
472
|
-
{
|
|
473
|
-
"task_id": "test:correctness",
|
|
474
|
-
"unit_id": "test",
|
|
475
|
-
"pass_id": "pass:correctness",
|
|
476
|
-
"lens": "correctness",
|
|
477
|
-
"file_coverage": [{"path": "src/foo.ts", "total_lines": 100}],
|
|
478
|
-
"findings": []
|
|
479
|
-
}
|
|
480
|
-
```
|
|
481
|
-
2. Run: `node dispatch/validate-result.mjs <some_run_id> test:correctness` — expect exit 0
|
|
482
|
-
3. Mutate the file: remove `lens` field — expect exit 1 with error mentioning `lens`
|
|
483
|
-
4. Mutate: add `line_end: 200` on an affected_file with total_lines 100 — expect exit 1
|
|
248
|
+
## Selective Deepening
|
|
484
249
|
|
|
485
|
-
|
|
250
|
+
Result ingestion may add follow-up `AuditTask` records for bounded selective
|
|
251
|
+
deepening. Triggers include:
|
|
486
252
|
|
|
487
|
-
|
|
488
|
-
|
|
489
|
-
|
|
490
|
-
|
|
491
|
-
|
|
492
|
-
3. Verify `prompt` contains the task JSON, lens definition, both schemas, and the output path
|
|
253
|
+
- high-severity findings
|
|
254
|
+
- low-confidence or ambiguous findings
|
|
255
|
+
- conflicting conclusions across related results
|
|
256
|
+
- high-risk no-finding sampling unless explicitly marked unnecessary
|
|
257
|
+
- runtime-validation disagreement
|
|
493
258
|
|
|
494
|
-
|
|
259
|
+
When follow-up tasks are added, the backend refreshes `review_packets.json` and
|
|
260
|
+
`audit_plan_metrics.json`. The next dispatch cycle handles those tasks through
|
|
261
|
+
the same packet contract.
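
The trigger list above could be evaluated with a predicate along the following lines. This is an illustrative sketch under stated assumptions, not the logic in `selectiveDeepening.js`. The finding fields `severity` and `confidence`, the `context` flags, and the reason strings are all hypothetical.

```javascript
// Hypothetical sketch: field names and reason labels are assumptions, not the
// package's schema. `context` carries cross-result facts a single result cannot know.
function deepeningTriggers(result, context = {}) {
  const reasons = [];
  const findings = result.findings ?? [];
  if (findings.some((f) => f.severity === "high")) {
    reasons.push("high_severity");
  }
  if (findings.some((f) => f.confidence === "low")) {
    reasons.push("low_confidence");
  }
  if (context.conflictsWithRelated) {
    reasons.push("conflicting_conclusions");
  }
  // High-risk units with no findings are sampled unless explicitly opted out.
  if (findings.length === 0 && context.highRisk && !context.samplingUnnecessary) {
    reasons.push("high_risk_no_finding");
  }
  if (context.runtimeDisagrees) {
    reasons.push("runtime_disagreement");
  }
  return reasons;
}
```

A non-empty return value would mark the result as a candidate for a follow-up `AuditTask`. Keeping the deepening bounded, for example by capping follow-ups per run, would be a separate step.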
|
|
495
262
|
|
|
496
|
-
|
|
497
|
-
2. Run: `node dispatch/merge-results.mjs --run-id <id>`
|
|
498
|
-
3. Verify `audit-results.json` contains exactly the 2 valid results
|
|
499
|
-
4. Verify `failed-tasks.json` contains the 1 invalid task
|
|
500
|
-
5. Verify exit code is 0
|
|
263
|
+
## Compatibility Notes
|
|
501
264
|
|
|
502
|
-
|
|
265
|
+
- `AuditTask` remains the planning and coverage identity.
|
|
266
|
+
- `AuditResult[]` remains the ingestion shape.
|
|
267
|
+
- The older `.audit-artifacts/dispatch/current-*` files still exist for
|
|
268
|
+
repo-local backend fallback and single-task handoff flows.
|
|
269
|
+
- Backend provider adapters remain compatibility bridges. The canonical
|
|
270
|
+
`/audit-code` flow expects the active conversation orchestrator to dispatch
|
|
271
|
+
packet subagents when the host supports them.
|
|
272
|
+
- The `dispatch/` directory is packaged because `lens-definitions.json` and
|
|
273
|
+
validation support are part of the installed packet workflow.
|
|
503
274
|
|
|
504
|
-
##
|
|
275
|
+
## Verification
|
|
505
276
|
|
|
506
|
-
|
|
277
|
+
Run the normal project gate:
|
|
507
278
|
|
|
279
|
+
```bash
|
|
280
|
+
npm test
|
|
508
281
|
```
|
|
509
|
-
1. Run: node dispatch/prepare-dispatch.mjs --run-id <run_id>
|
|
510
|
-
2. Read: .audit-artifacts/runs/<run_id>/dispatch-plan.json
|
|
511
|
-
3. In ONE message, fire one Agent call per entry:
|
|
512
|
-
Agent({ description: entry.description, prompt: entry.prompt })
|
|
513
|
-
Fire all calls simultaneously — they run in parallel.
|
|
514
|
-
4. Wait for all subagents to complete.
|
|
515
|
-
5. Run: node dispatch/merge-results.mjs --run-id <run_id>
|
|
516
|
-
6. Run: node dist/index.js worker-run --run-id <run_id>
|
|
517
|
-
7. Run: node dist/index.js audit-code (to get next batch)
|
|
518
|
-
8. Repeat.
|
|
519
|
-
```
|
|
520
|
-
|
|
521
|
-
**Important:** The orchestrator should NOT read the pending-audit-tasks.json, NOT read
|
|
522
|
-
any source files, NOT compose any prompts. Everything is pre-built. Just read
|
|
523
|
-
`dispatch-plan.json` and fire the calls verbatim.
|
|
524
|
-
|
|
525
|
-
---
|
|
526
|
-
|
|
527
|
-
## Notes and caveats
|
|
528
|
-
|
|
529
|
-
### Large files (2000+ lines)
|
|
530
|
-
|
|
531
|
-
Tasks with very large files (e.g. `audit-code-wrapper-lib.mjs` at 2215 lines) will still
|
|
532
|
-
hit quota limits for subagents. The `prepare-dispatch.mjs` script should print a warning
|
|
533
|
-
for tasks exceeding a threshold (e.g. 1500 total lines). These tasks may need to be split
|
|
534
|
-
at the task-builder level — that is a separate concern and not addressed here.
|
|
535
|
-
|
|
536
|
-
### `audit_results_path` vs per-task files
|
|
537
|
-
|
|
538
|
-
The existing `renderWorkerPrompt.ts` tells subagents to write to a shared
|
|
539
|
-
`audit-results.json`. The new `prepare-dispatch.mjs`-generated prompts tell subagents to
|
|
540
|
-
write to per-task `task-results/<task_id>.json` files. These are two separate dispatch
|
|
541
|
-
paths — the old path (via `renderWorkerPrompt`) is still used for non-`claude-desktop`
|
|
542
|
-
providers and is not modified by this plan.
|
|
543
|
-
|
|
544
|
-
### Future: provider abstraction
|
|
545
|
-
|
|
546
|
-
`prepare-dispatch.mjs` output (`dispatch-plan.json`) is provider-agnostic. A future
|
|
547
|
-
`anthropic-direct` provider could read the same `dispatch-plan.json` and call
|
|
548
|
-
`messages.create()` for each entry via SDK, with no changes to `prepare-dispatch.mjs`.
|
|
549
|
-
|
|
550
|
-
### ajv and published package
|
|
551
282
|
|
|
552
|
-
|
|
553
|
-
|
|
283
|
+
Focused packet coverage lives in `tests/review-packets.test.mjs` and
|
|
284
|
+
`tests/audit-code-wrapper.test.mjs`.
|