auditor-lambda 0.3.3 → 0.3.5

Files changed (47)
  1. package/README.md +6 -1
  2. package/audit-code-wrapper-lib.mjs +87 -7
  3. package/dist/cli.js +517 -91
  4. package/dist/extractors/graph.d.ts +5 -1
  5. package/dist/extractors/graph.js +223 -3
  6. package/dist/extractors/pathPatterns.d.ts +3 -2
  7. package/dist/extractors/pathPatterns.js +97 -24
  8. package/dist/io/artifacts.d.ts +5 -0
  9. package/dist/io/artifacts.js +2 -0
  10. package/dist/orchestrator/advance.js +1 -1
  11. package/dist/orchestrator/dependencyMap.js +18 -0
  12. package/dist/orchestrator/fileAnchors.d.ts +32 -0
  13. package/dist/orchestrator/fileAnchors.js +217 -0
  14. package/dist/orchestrator/internalExecutors.d.ts +1 -1
  15. package/dist/orchestrator/internalExecutors.js +120 -33
  16. package/dist/orchestrator/reviewPackets.d.ts +14 -0
  17. package/dist/orchestrator/reviewPackets.js +310 -0
  18. package/dist/orchestrator/selectiveDeepening.d.ts +14 -0
  19. package/dist/orchestrator/selectiveDeepening.js +392 -0
  20. package/dist/orchestrator/state.js +6 -1
  21. package/dist/orchestrator/taskBuilder.d.ts +16 -0
  22. package/dist/orchestrator/taskBuilder.js +68 -11
  23. package/dist/prompts/renderWorkerPrompt.js +2 -1
  24. package/dist/providers/claudeCodeProvider.js +3 -1
  25. package/dist/providers/index.js +2 -1
  26. package/dist/supervisor/operatorHandoff.js +22 -11
  27. package/dist/types/graph.d.ts +1 -0
  28. package/dist/types/reviewPlanning.d.ts +41 -0
  29. package/dist/types/reviewPlanning.js +1 -0
  30. package/dist/types/sessionConfig.d.ts +1 -0
  31. package/dist/validation/artifacts.js +13 -0
  32. package/dist/validation/auditResults.js +50 -2
  33. package/dist/validation/sessionConfig.js +5 -0
  34. package/docs/agent-integrations.md +4 -1
  35. package/docs/bootstrap-install.md +3 -0
  36. package/docs/contract.md +3 -0
  37. package/docs/dispatch-implementation-plan.md +220 -489
  38. package/docs/next-steps.md +13 -8
  39. package/docs/product-direction.md +5 -3
  40. package/docs/run-flow.md +25 -30
  41. package/docs/session-config.md +15 -4
  42. package/docs/supervisor.md +5 -3
  43. package/docs/workflow-refactor-brief.md +114 -176
  44. package/package.json +1 -1
  45. package/schemas/finding.schema.json +1 -15
  46. package/schemas/graph_bundle.schema.json +16 -0
  47. package/skills/audit-code/audit-code.prompt.md +11 -6
package/docs/next-steps.md CHANGED
@@ -1,6 +1,7 @@
1
1
  # Next Implementation Steps
2
2
 
3
- This document tracks the next meaningful implementation work after the current skill-first productionization pass.
3
+ This document tracks the next meaningful implementation work after the packet
4
+ review-dispatch refactor and the current skill-first productionization pass.
4
5
 
5
6
  As of April 22, 2026, the shared MCP substrate and the first host-native installer pass have landed, but this repository is not yet ready for a public production launch.
6
7
 
@@ -31,22 +32,26 @@ The repository now supports:
31
32
  - an explicit in-repo release gate via `npm run verify:release`
32
33
  - structured operator handoff output plus `.audit-artifacts/operator-handoff.{json,md}` for blocked fallback runs
33
34
  - configured provider bridges that can continue audit-task review by writing structured results and handing control back to the bounded worker command
35
+ - graph-informed review packets, `review_packets.json`, and `audit_plan_metrics.json`
36
+ - compact packet `prepare-dispatch` and `merge-and-ingest` envelopes
34
37
 
35
38
  That means the current release is suitable for a controlled alpha or beta skill-first workflow with MCP-aware host bootstrapping, but it is not yet the final public production end-state.
36
39
 
37
40
  ## Near-term priorities
38
41
 
39
- ### 1. Realign review dispatch with the conversation-owned workflow
42
+ ### 1. Prove packet review dispatch on real repositories
40
43
 
41
- The highest-priority product refactor is to move semantic-review ownership back to the active conversation agent and to replace the current unit-first review fan-out with non-overlapping lens-aware review blocks.
44
+ The highest-priority product follow-through is to validate the packet workflow
45
+ outside this repository and compare it to the legacy fan-out baseline.
42
46
 
43
47
  Near-term work should focus on:
44
48
 
45
- - making the active conversation agent the default owner of semantic review
46
- - keeping `agent_task_batch_size` at one review block per task
47
- - treating backend provider adapters as compatibility bridges rather than the default review owner
48
- - replacing the current unit-first task planner with a non-overlapping lens-block planner
49
- - deleting the stale audit state and rerunning the audit only after that refactor lands
49
+ - running `/audit-code` against at least one nontrivial external repository
50
+ - recording packet count, task count, warning count, and largest-packet estimate
51
+ - comparing observed worker count and token/quota behavior against the old
52
+ one-task-per-worker model
53
+ - tightening packet budgets or warning thresholds if real repositories expose
54
+ rough edges
50
55
 
51
56
  The current handoff for this work is:
52
57
 
package/docs/product-direction.md CHANGED
@@ -94,9 +94,11 @@ That means:
94
94
  The intended review planner should:
95
95
 
96
96
  - determine which files require which lenses
97
- - partition unresolved review into non-overlapping review blocks
98
- - prefer lens-homogeneous blocks when practical
99
- - keep the default dispatch granularity to one review block per task
97
+ - preserve `AuditTask` as the deterministic coverage identity
98
+ - group related tasks into graph-informed review packets for worker dispatch
99
+ - review multiple relevant lenses for the same packet in one worker pass
100
+ - keep one validated `AuditResult` object per underlying task
101
+ - batch tiny homogeneous files rather than spawning one worker per small task
100
102
 
101
103
  ## Default context & model rules
102
104
 
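Reviewer note: the planner rules in the hunk above (keep `AuditTask` as the coverage identity, group related tasks into packets, review several lenses in one pass) can be illustrated with a minimal sketch. The `AuditTask` and `ReviewPacket` shapes and the `buildPackets` helper below are hypothetical and are not taken from the package. The sketch groups only by identical file set, whereas the documented planner is graph-informed and also batches tiny files.

```typescript
// Hypothetical shapes for illustration; the real types are not shown in this diff.
interface AuditTask {
  task_id: string;
  file_paths: string[];
  lens: string;
}

interface ReviewPacket {
  packet_id: string;
  file_paths: string[];
  lenses: string[];
  task_ids: string[]; // every underlying task keeps its own identity
}

// Group tasks that cover the same file set so one worker can read those files
// once and review every required lens in a single pass.
function buildPackets(tasks: AuditTask[]): ReviewPacket[] {
  const byFiles = new Map<string, ReviewPacket>();
  for (const task of tasks) {
    const files = task.file_paths.slice().sort();
    const key = files.join("\n");
    let packet = byFiles.get(key);
    if (!packet) {
      packet = {
        packet_id: "packet-" + (byFiles.size + 1),
        file_paths: files,
        lenses: [],
        task_ids: [],
      };
      byFiles.set(key, packet);
    }
    if (packet.lenses.indexOf(task.lens) === -1) {
      packet.lenses.push(task.lens);
    }
    packet.task_ids.push(task.task_id);
  }
  return Array.from(byFiles.values());
}
```

Because each packet still lists its `task_ids`, a worker can return one result per underlying task, which is the property the documentation says must be preserved.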
package/docs/run-flow.md CHANGED
@@ -4,28 +4,29 @@ The canonical product route is `/audit-code` in conversation.
4
4
 
5
5
  This document describes the backend execution flow that supports that conversational route and the repo-local fallback wrapper.
6
6
 
7
- ## Intended review-dispatch path
7
+ ## Packet review-dispatch path
8
8
 
9
9
  1. Build or import a repository manifest.
10
- 2. Build units, flows, and other deterministic structure artifacts.
11
- 3. Determine which files require which lenses.
12
- 4. Partition unresolved file/lens obligations into non-overlapping review blocks.
13
- 5. Hand one review block at a time to the active conversation agent.
14
- 6. Let the active agent decide whether it wants to use subagents in parallel.
15
- 7. Ingest structured audit results.
16
- 8. Mark completed file/lens coverage in the coverage matrix.
17
- 9. Build requeue only for still-missing coverage.
18
- 10. Repeat until coverage rules are satisfied.
19
- 11. Synthesize findings into merged outputs.
20
-
21
- ## Current implementation note
22
-
23
- The current TypeScript backend still has workflow drift:
24
-
25
- - planning is still mostly unit-first rather than lens-block-first
26
- - explicit backend providers can still end up owning semantic review in fallback mode
27
-
28
- That drift is being tracked explicitly in [docs/workflow-refactor-brief.md](/C:/Code/auditor-lambda/docs/workflow-refactor-brief.md).
10
+ 2. Build units, graph edges, flows, risk, and other deterministic structure
11
+ artifacts.
12
+ 3. Determine which files require which lenses and create compatible
13
+ `AuditTask` records.
14
+ 4. Build `review_packets.json` and `audit_plan_metrics.json` from those tasks.
15
+ 5. Stop at semantic review with an active run handoff.
16
+ 6. `prepare-dispatch` writes a small run-scoped `dispatch-plan.json` and one
17
+ prompt per review packet, plus a backend-owned result map.
18
+ Isolated large-file packets also get mechanical anchor summaries for
19
+ targeted review.
20
+ 7. The active conversation orchestrator launches one bounded subagent per
21
+ packet when the host supports subagents.
22
+ 8. Each subagent pipes `AuditResult[]` to the packet's `submit-packet` command;
23
+ the backend validates and writes assigned result files.
24
+ 9. `merge-and-ingest` validates the full assigned task set and ingests the
25
+ existing `AuditResult[]` shape.
26
+ 10. Result ingestion updates coverage, requeue, runtime-validation state, and
27
+ any selective-deepening follow-up tasks.
28
+ 11. Repeat until coverage and runtime rules are satisfied.
29
+ 12. Synthesize findings into merged outputs.
29
30
 
30
31
  ## Current backend capability
31
32
 
@@ -33,7 +34,10 @@ The current TypeScript implementation already covers:
33
34
 
34
35
  - repo intake and ignore handling
35
36
  - structure and planning artifact generation
37
+ - graph-first packet review planning
38
+ - compact packet dispatch and merge envelopes
36
39
  - reviewed-range ingestion from audit results
40
+ - bounded selective deepening
37
41
  - runtime validation update ingestion
38
42
  - synthesis and completion tracking
39
43
  - backend provider handoff for fallback or compatibility review flows
@@ -43,18 +47,9 @@ The current TypeScript implementation already covers:
43
47
  - the conversation route should hide this state machine behind `/audit-code`
44
48
  - the repo-local `audit-code` wrapper is fallback infrastructure for operators and local harnesses
45
49
  - provider adapters and artifact plumbing are backend details, not the primary product story
46
- - the active conversation agent should own semantic review by default
50
+ - the active conversation agent should own semantic packet dispatch by default
47
51
  - when fallback execution blocks, the wrapper should still leave behind explicit operator handoff files and suggested evidence-import paths
48
52
 
49
- ## Next backend implementation steps
50
-
51
- The next backend-focused work should support the conversation route more directly by:
52
-
53
- - realigning review planning around non-overlapping lens blocks
54
- - moving semantic-review ownership back to the active conversation agent
55
- - keeping backend provider bridges explicitly secondary
56
- - keeping evidence import and runtime-update handoff paths explicit and easier to follow
57
-
58
53
  Broader product priorities are tracked in:
59
54
 
60
55
  - `docs/workflow-refactor-brief.md`
package/docs/session-config.md CHANGED
@@ -59,7 +59,9 @@ Current implementation note:
59
59
 
60
60
  - `claude-code`, `opencode`, `subprocess-template`, and `vscode-task` are backend compatibility bridges
61
61
  - they are not the intended default owner of semantic review when the active conversation agent can handle the work directly
62
- - to activate one of those bridges for semantic review, re-run the wrapper with an explicit `--provider <name>` flag
62
+ - to activate one of those bridges for semantic review, either set `provider`
63
+ in this file intentionally or re-run the wrapper with an explicit
64
+ `--provider <name>` flag
63
65
 
64
66
  ### `timeout_ms`
65
67
 
@@ -80,7 +82,10 @@ How many audit tasks to include in one provider-assisted review batch.
80
82
 
81
83
  When this is greater than `1`, the generated worker prompt points at `current-tasks.json` / `pending-audit-tasks.json` and expects one `AuditResult` per assigned task.
82
84
 
83
- The intended default review granularity remains one review block per task.
85
+ This setting only affects explicit backend provider-assisted fallback batches.
86
+ The canonical conversation route uses run-scoped review packets from
87
+ `prepare-dispatch` while still preserving one validated `AuditResult` per
88
+ underlying task.
84
89
 
85
90
  ### `parallel_workers`
86
91
 
@@ -138,11 +143,17 @@ This remains the safest fallback default while the semantic-review workflow is b
138
143
  Fields:
139
144
 
140
145
  - `command`: optional override for the Claude Code executable
141
- - `extra_args`: optional extra arguments appended before the built-in permission-skipping flag
146
+ - `extra_args`: optional extra arguments for Claude Code
147
+ - `dangerously_skip_permissions`: optional trusted-automation opt-in. When
148
+ `true`, the bridge appends `--dangerously-skip-permissions`. Leave this
149
+ unset for the safer default.
142
150
 
143
151
  Current implementation support only.
144
152
 
145
- Use this only when you intentionally want the backend fallback CLI to bridge review into an external Claude Code process, together with `audit-code --provider claude-code`.
153
+ Use this only when you intentionally want the backend fallback CLI to bridge
154
+ review into an external Claude Code process, either by setting
155
+ `provider: "claude-code"` in this file or by running
156
+ `audit-code --provider claude-code`.
146
157
 
147
158
  ### `opencode`
148
159
 
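Reviewer note: the `claude-code` fields documented in the hunk above suggest the following argument-building behavior. This is a sketch under assumptions, not the code in `claudeCodeProvider.js`. The field names come from the documentation, but the `buildClaudeCodeInvocation` helper and the default executable name `claude` are assumptions.

```typescript
// Field names follow the documented session-config keys.
interface ClaudeCodeConfig {
  command?: string;
  extra_args?: string[];
  dangerously_skip_permissions?: boolean;
}

// Hypothetical helper: builds the command line for the bridge.
function buildClaudeCodeInvocation(
  config: ClaudeCodeConfig = {},
): { command: string; args: string[] } {
  const args = (config.extra_args || []).slice();
  // The permission-skipping flag is appended only on explicit opt-in.
  if (config.dangerously_skip_permissions === true) {
    args.push("--dangerously-skip-permissions");
  }
  // "claude" as the default executable is an assumption.
  return { command: config.command || "claude", args };
}
```

Leaving `dangerously_skip_permissions` unset produces no permission-skipping flag, which matches the "safer default" wording in the documentation.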
package/docs/supervisor.md CHANGED
@@ -63,9 +63,11 @@ audit-code --provider subprocess-template
63
63
  audit-code --provider vscode-task
64
64
  ```
65
65
 
66
- Those `--provider` invocations are the explicit bridge handoff point.
67
- Without an explicit `--provider` flag, the backend stops at the semantic-review
68
- boundary and exposes scoped task artifacts for the slash-command orchestrator.
66
+ Those `--provider` invocations are an explicit bridge handoff point.
67
+ Without an explicit `--provider` flag or a non-local provider in
68
+ `.audit-artifacts/session-config.json`, the backend stops at the
69
+ semantic-review boundary and exposes scoped task artifacts for the
70
+ slash-command orchestrator.
69
71
 
70
72
  ## Auto resolution rule
71
73
 
package/docs/workflow-refactor-brief.md CHANGED
@@ -1,186 +1,124 @@
1
- # Workflow Refactor Brief
1
+ # Workflow Refactor Status
2
2
 
3
- This document is the handoff for the next context window.
3
+ This document records the packet-dispatch refactor that replaced the older
4
+ one-agent-per-small-task review plan.
4
5
 
5
- Use it as the source of truth for the workflow refactor before running a fresh audit again.
6
+ ## Goal
6
7
 
7
- ## Why this refactor is needed
8
+ Reduce token and quota usage for `/audit-code` while preserving deterministic
9
+ validation, ingestion, coverage tracking, and report synthesis.
8
10
 
9
- The current implementation still advances deterministic audit state correctly, but the semantic-review phase has drifted away from the intended product behavior.
11
+ The implemented design is a compatibility-preserving packet layer:
10
12
 
11
- The key symptom is that the backend can currently treat `provider` selection as the owner of review work, which is how the recent rerun ended up trying to use `claude-code` from `.audit-artifacts/session-config.json`.
13
+ - keep `AuditTask` as the backend planning and coverage identity
14
+ - keep `AuditResult[]` as the ingestion contract
15
+ - group related task records into worker-facing review packets
16
+ - make each worker read a coherent file set once and review multiple lenses in
17
+ one pass
18
+ - submit packet results through the backend so only assigned result files are
19
+ written
12
20
 
13
- That is not the intended workflow.
21
+ ## Current Product Model
14
22
 
15
- ## Intended workflow
16
-
17
- The intended `/audit-code` workflow is:
23
+ The canonical workflow is still conversation-first:
18
24
 
19
25
  1. The active conversation agent owns orchestration and ingestion control.
20
- 2. Bounded subagents own semantic review work whenever the host supports them.
26
+ 2. Bounded subagents own semantic packet review when the host supports them.
21
27
  3. If subagents are unavailable, the conversation agent completes one assigned
22
- review task and stops so `/audit-code` can be rerun from fresh context.
23
- 4. Deterministic planning computes which files need which lenses.
24
- 5. Pending review is partitioned into non-overlapping review blocks, preferably grouped by lens.
25
- 6. One dispatched review task should correspond to one review block.
26
- 7. `agent_task_batch_size` should stay `1` by default.
27
- 8. Subagent fan-out belongs to the host agent runtime, not to the backend session config.
28
- 9. Backend provider adapters are fallback compatibility bridges only. They should not be the default review owner.
29
-
30
- ## Current implementation drift
31
-
32
- The current code differs from that model in several important ways.
33
-
34
- ### 1. Review ownership is provider-mediated
35
-
36
- Today, the `agent` executor in the backend fallback path is still routed through `createFreshSessionProvider()` and may spawn an external CLI such as `claude` or `opencode`.
37
-
38
- Relevant files:
39
-
40
- - [src/cli.ts](/C:/Code/auditor-lambda/src/cli.ts:771)
41
- - [src/providers/index.ts](/C:/Code/auditor-lambda/src/providers/index.ts:37)
42
- - [src/providers/claudeCodeProvider.ts](/C:/Code/auditor-lambda/src/providers/claudeCodeProvider.ts:12)
43
- - [src/providers/opencodeProvider.ts](/C:/Code/auditor-lambda/src/providers/opencodeProvider.ts)
44
- - [src/providers/spawnLoggedCommand.ts](/C:/Code/auditor-lambda/src/providers/spawnLoggedCommand.ts:24)
45
-
46
- ### 2. Task planning is unit-first, not lens-first
47
-
48
- `buildChunkedAuditTasks()` currently creates tasks as `unit x lens`, then optionally splits oversized files into separate per-lens tasks.
49
-
50
- Relevant files:
51
-
52
- - [src/orchestrator/taskBuilder.ts](/C:/Code/auditor-lambda/src/orchestrator/taskBuilder.ts:101)
53
- - [src/orchestrator/unitBuilder.ts](/C:/Code/auditor-lambda/src/orchestrator/unitBuilder.ts:130)
54
-
55
- ### 3. Required lenses are unioned at the unit level
56
-
57
- The planner derives `required_lenses` for a unit, then applies that whole union to every file in the unit.
58
-
59
- That means the task count grows with `units x required_lenses`, not with a deliberately partitioned set of file/lens review blocks.
60
-
61
- Relevant files:
62
-
63
- - [src/orchestrator/unitBuilder.ts](/C:/Code/auditor-lambda/src/orchestrator/unitBuilder.ts:153)
64
- - [src/orchestrator/planning.ts](/C:/Code/auditor-lambda/src/orchestrator/planning.ts:63)
65
- - [src/coverage.ts](/C:/Code/auditor-lambda/src/coverage.ts:29)
66
-
67
- ### 4. Flow augmentation adds overlapping review tasks
68
-
69
- After the base unit tasks are built, the planner adds extra flow-aware tasks rather than repartitioning the pending review set into one global non-overlapping dispatch plan.
70
-
71
- Relevant file:
72
-
73
- - [src/orchestrator/flowPlanning.ts](/C:/Code/auditor-lambda/src/orchestrator/flowPlanning.ts:9)
74
-
75
- ### 5. `parallel_workers` means subprocess fan-out, not agent-owned parallelism
76
-
77
- The current `parallel_workers` setting only controls how many external provider worker runs the backend fallback CLI launches.
78
-
79
- It does not represent, and should not limit, the active conversation agent's own ability to use subagents.
80
-
81
- Relevant files:
82
-
83
- - [src/cli.ts](/C:/Code/auditor-lambda/src/cli.ts:83)
84
- - [src/cli.ts](/C:/Code/auditor-lambda/src/cli.ts:960)
85
-
86
- ## Evidence from the current stale audit
87
-
88
- The current stale audit run produced:
89
-
90
- - `91` units
91
- - average `3.26` required lenses per unit
92
- - `333` audit tasks total
93
- - `294` regular unit-lens tasks
94
- - `10` large-file split tasks
95
- - `29` flow tasks
96
-
97
- That fan-out is consistent with the current unit-first planner, not with the intended lens-block dispatch model.
98
-
99
- ## Refactor goals
100
-
101
- The next implementation pass should do the following.
102
-
103
- ### A. Make the slash-command orchestrator the review dispatcher
104
-
105
- The `agent` executor should represent review work owned by the current
106
- conversation or host agent session, with semantic review delegated to bounded
107
- subagents whenever possible.
108
-
109
- Target behavior:
110
-
111
- - normal `/audit-code` usage does not require `provider: "claude-code"` or `provider: "opencode"`
112
- - session-config should not be the normal way to choose a second LLM for review
113
- - backend provider bridges remain available only for explicit fallback workflows
114
- - when subagents are unavailable, one invocation performs at most one semantic
115
- review task before stopping
116
-
117
- ### B. Plan review work at the file/lens level
118
-
119
- Coverage should still know which files require which lenses, but dispatch planning should work from unresolved `(file, lens)` obligations rather than from unit-wide lens unions.
120
-
121
- Target behavior:
122
-
123
- - each review block should have explicit `file_paths`
124
- - each review block should represent one lens
125
- - review blocks in the same dispatch wave should be file-disjoint unless overlap is intentionally justified
126
-
127
- ### C. Partition pending review into non-overlapping blocks
128
-
129
- Replace the current unit-first task planner with a lens-aware block planner.
130
-
131
- Target behavior:
132
-
133
- - no combinatorial `unit x lens` explosion unless that is genuinely the smallest valid partition
134
- - large-file splitting may remain, but it should happen inside the lens-block planner
135
- - critical-flow context should influence block construction without blindly adding overlapping tasks on top
136
-
137
- ### D. Keep result ingestion deterministic
138
-
139
- The current ingestion model is mostly sound and should be preserved.
140
-
141
- Relevant files:
142
-
143
- - [src/orchestrator/resultIngestion.ts](/C:/Code/auditor-lambda/src/orchestrator/resultIngestion.ts)
144
- - [src/coverage.ts](/C:/Code/auditor-lambda/src/coverage.ts:42)
145
-
146
- ### E. Reframe session-config as backend fallback only
147
-
148
- `session-config.json` should continue to configure backend fallback bridges, but it should not be treated as the owner of semantic-review orchestration in the canonical workflow.
149
-
150
- `parallel_workers` should either:
151
-
152
- - become a legacy fallback-only knob, or
153
- - be removed from the semantic-review mental model entirely
154
-
155
- ## Acceptance criteria
156
-
157
- The refactor should be treated as done only when all of the following are true.
158
-
159
- - Starting `/audit-code` in a conversation does not rely on an external `claude-code` or `opencode` subprocess to own semantic review.
160
- - The slash-command orchestrator dispatches bounded subagents when available and
161
- falls back to one semantic review task per invocation otherwise.
162
- - The backend fallback still supports deterministic stages and explicit compatibility bridges.
163
- - The default dispatch granularity for semantic review remains one review block per task.
164
- - Pending review tasks are planned as lens-aware, non-overlapping file blocks.
165
- - `parallel_workers` no longer defines the default semantic-review parallelism model.
166
- - The next fresh audit can be run from a clean slate without inheriting the current stale provider-mediated task queue.
167
-
168
- ## Suggested implementation order
169
-
170
- 1. Refactor the review-ownership model in [src/cli.ts](/C:/Code/auditor-lambda/src/cli.ts), [src/providers/index.ts](/C:/Code/auditor-lambda/src/providers/index.ts), and related supervisor docs.
171
- 2. Replace the current task planner in [src/orchestrator/taskBuilder.ts](/C:/Code/auditor-lambda/src/orchestrator/taskBuilder.ts) with a lens-block planner.
172
- 3. Rework flow-aware planning in [src/orchestrator/flowPlanning.ts](/C:/Code/auditor-lambda/src/orchestrator/flowPlanning.ts) so it participates in block construction instead of layering overlapping tasks afterward.
173
- 4. Update docs and tests.
174
- 5. Delete the stale audit state and rerun the audit from scratch.
175
-
176
- ## Clean rerun after refactor
177
-
178
- Once the refactor is in place, the next context should:
179
-
180
- 1. keep the source changes and documentation already in the worktree
181
- 2. delete `.audit-artifacts/`
182
- 3. delete `audit-report.md`
183
- 4. run the workflow again from a clean state
184
- 5. treat the new audit output as authoritative
185
-
186
- For the remediation baseline that should survive the stale audit reset, see [docs/remediation-baseline.md](/C:/Code/auditor-lambda/docs/remediation-baseline.md).
28
+ fallback review task and stops so `/audit-code` can be rerun from fresh
29
+ context.
30
+ 4. Backend provider adapters remain explicit compatibility bridges, not the
31
+ default semantic-review owner.
32
+
33
+ Session config remains backend fallback configuration. It should not be treated
34
+ as the normal way to redirect semantic review into a second external LLM.
35
+
36
+ ## Implemented Changes
37
+
38
+ The refactor now includes:
39
+
40
+ - deterministic `review_packets.json` derived from current `AuditTask` records
41
+ - `audit_plan_metrics.json` with packet counts, repeated reference estimates,
42
+ largest packet details, and estimated agent reduction
43
+ - packet-first pending-task ordering for provider-assisted batches
44
+ - tiny homogeneous test-file batching before dispatch
45
+ - graph-edge expansion from import, call, and reference edges
46
+ - packet prompts that assign multiple task results to one worker
47
+ - backend-owned packet submission that validates before writing result files
48
+ - isolated large-file packet mode with mechanical anchors for targeted review
49
+ - validation and merge checks for missing, duplicate, unknown, malformed, or
50
+ out-of-scope task results, including swapped result files
51
+ - compact `prepare-dispatch` and `merge-and-ingest` JSON envelopes
52
+ - terse worker completion convention:
53
+ `valid: <packet_id>, findings=<n>`
54
+ - selective deepening for high-severity, low-confidence, conflicting,
55
+ high-risk clean, and runtime-disagreement cases
56
+ - refreshed packet metrics whenever selective deepening adds follow-up tasks
57
+
58
+ ## Dispatch Contract
59
+
60
+ `prepare-dispatch` writes a small `dispatch-plan.json`. Each entry points to a
61
+ packet prompt under the run-scoped `task-results/` directory.
62
+
63
+ The conversation orchestrator should:
64
+
65
+ - read only `dispatch-plan.json`
66
+ - launch one subagent per packet entry
67
+ - tell the subagent to read and follow `entry.prompt_path`
68
+ - wait for terse success replies
69
+ - run `merge-and-ingest`
70
+
71
+ The parent should not read source files, prompt bodies, result payloads, or
72
+ large task manifests during the normal packet route.
73
+
74
+ ## Artifacts
75
+
76
+ Packet mode adds or updates these artifacts:
77
+
78
+ - `review_packets.json`
79
+ - `audit_plan_metrics.json`
80
+ - `<artifacts_dir>/runs/<run_id>/dispatch-plan.json`
81
+ - `<artifacts_dir>/runs/<run_id>/dispatch-result-map.json`
82
+ - `<artifacts_dir>/runs/<run_id>/task-results/*.prompt.md`
83
+ - `<artifacts_dir>/runs/<run_id>/task-results/*.anchors.json`, only for
84
+ isolated large-file packets
85
+ - `<artifacts_dir>/runs/<run_id>/task-results/*.json`
86
+ - `<artifacts_dir>/runs/<run_id>/dispatch-warnings.json`, only when needed
87
+
88
+ The existing coverage, runtime validation, requeue, and synthesis artifacts
89
+ remain backend-owned.
90
+
91
+ ## Verification
92
+
93
+ Current in-repo verification:
94
+
95
+ - `npm test` passes with 148 tests.
96
+
97
+ Relevant test coverage:
98
+
99
+ - packet construction and metrics
100
+ - packet ordering
101
+ - graph-connected packet merging
102
+ - tiny test-file batching
103
+ - packet prompt generation
104
+ - packet submission and merge compatibility with the legacy result array
105
+ - missing-result blocking
106
+ - swapped-result blocking
107
+ - collision-proof assigned result paths
108
+ - isolated large-file anchor generation
109
+ - path-heuristic regressions
110
+ - graph extraction from source contents
111
+ - selective deepening triggers and packet refresh
112
+
113
+ ## Remaining Follow-Up
114
+
115
+ The main remaining work is operational, not structural:
116
+
117
+ - run `/audit-code` against at least one nontrivial external repository and
118
+ compare packet counts, warning counts, worker completion summaries, and
119
+ observed token/quota behavior against the legacy baseline
120
+ - keep host-specific smoke testing current for Codex, Claude Desktop, OpenCode,
121
+ VS Code, and Antigravity guidance
122
+
123
+ For the detailed packet dispatch reference, see
124
+ `docs/dispatch-implementation-plan.md`.
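Reviewer note: the status document above says the merge step blocks on missing, duplicate, and unknown task results. A minimal sketch of that kind of check follows. The `AuditResult` shape, the `MergeCheck` type, and the `checkAssignedResults` helper are hypothetical, and the real `merge-and-ingest` command may validate more than this (for example, malformed or out-of-scope results).

```typescript
// Hypothetical minimal result shape; the real AuditResult has more fields.
interface AuditResult {
  task_id: string;
  findings: unknown[];
}

interface MergeCheck {
  ok: boolean;
  missing: string[];   // assigned tasks with no result
  duplicate: string[]; // assigned tasks reported more than once
  unknown: string[];   // results for tasks that were never assigned
}

function checkAssignedResults(
  assignedTaskIds: string[],
  results: AuditResult[],
): MergeCheck {
  const assigned = new Set(assignedTaskIds);
  const seen = new Set<string>();
  const duplicate = new Set<string>();
  const unknown = new Set<string>();

  for (const result of results) {
    if (!assigned.has(result.task_id)) {
      unknown.add(result.task_id);
      continue;
    }
    if (seen.has(result.task_id)) {
      duplicate.add(result.task_id);
    }
    seen.add(result.task_id);
  }

  const missing = assignedTaskIds.filter((id) => !seen.has(id));
  return {
    ok: missing.length === 0 && duplicate.size === 0 && unknown.size === 0,
    missing,
    duplicate: Array.from(duplicate),
    unknown: Array.from(unknown),
  };
}
```

Ingestion would proceed only when `ok` is true, so a packet that skips or repeats a task cannot silently change coverage.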
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "auditor-lambda",
3
- "version": "0.3.3",
3
+ "version": "0.3.5",
4
4
  "private": false,
5
5
  "description": "Portable hybrid code-auditing framework for arbitrary repositories.",
6
6
  "type": "module",
package/schemas/finding.schema.json CHANGED
@@ -17,21 +17,7 @@
17
17
  "properties": {
18
18
  "id": { "type": "string" },
19
19
  "title": { "type": "string" },
20
- "category": {
21
- "type": "string",
22
- "enum": [
23
- "correctness",
24
- "architecture",
25
- "maintainability",
26
- "security",
27
- "reliability",
28
- "performance",
29
- "data_integrity",
30
- "tests",
31
- "operability",
32
- "config_deployment"
33
- ]
34
- },
20
+ "category": { "type": "string", "minLength": 1 },
35
21
  "severity": {
36
22
  "type": "string",
37
23
  "enum": ["critical", "high", "medium", "low", "info"]
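Reviewer note: the hunk above relaxes `category` from a fixed enum to any non-empty string. The sketch below contrasts the two rules. The helper names are hypothetical, and the package itself presumably enforces this through its JSON Schema validation rather than hand-written checks.

```typescript
// The ten categories accepted by the removed enum.
const LEGACY_CATEGORIES: string[] = [
  "correctness",
  "architecture",
  "maintainability",
  "security",
  "reliability",
  "performance",
  "data_integrity",
  "tests",
  "operability",
  "config_deployment",
];

// Old rule: the value must be one of the enumerated strings.
function isValidCategoryLegacy(value: unknown): boolean {
  return typeof value === "string" && LEGACY_CATEGORIES.indexOf(value) !== -1;
}

// New rule: any string with at least one character ("minLength": 1).
function isValidCategory(value: unknown): boolean {
  return typeof value === "string" && value.length >= 1;
}
```

One consequence is that a finding with a category outside the old list now passes schema validation, so any downstream grouping by category can no longer assume a closed set.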
package/schemas/graph_bundle.schema.json CHANGED
@@ -40,6 +40,22 @@
40
40
  "additionalProperties": false
41
41
  }
42
42
  },
43
+ "references": {
44
+ "type": "array",
45
+ "items": {
46
+ "type": "object",
47
+ "required": ["from", "to"],
48
+ "properties": {
49
+ "from": { "type": "string" },
50
+ "to": { "type": "string" },
51
+ "kind": {
52
+ "type": "string",
53
+ "description": "Reference edge kind from literal or path-oriented extraction (e.g. 'relative-string-reference', 'repo-path-reference')."
54
+ }
55
+ },
56
+ "additionalProperties": false
57
+ }
58
+ },
43
59
  "routes": {
44
60
  "type": "array",
45
61
  "items": {
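Reviewer note: the `references` array added in the hunk above requires `from` and `to`, allows an optional string `kind`, and rejects extra properties. The type and guard below mirror those constraints as a sketch. `ReferenceEdge` and `isReferenceEdge` are hypothetical names, not exports of the package.

```typescript
// Mirrors the schema item: required "from" and "to", optional "kind".
interface ReferenceEdge {
  from: string;
  to: string;
  kind?: string;
}

const REFERENCE_EDGE_KEYS = ["from", "to", "kind"];

function isReferenceEdge(value: unknown): value is ReferenceEdge {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;
  // "additionalProperties": false means no keys beyond the declared ones.
  for (const key of Object.keys(record)) {
    if (REFERENCE_EDGE_KEYS.indexOf(key) === -1) {
      return false;
    }
  }
  if (typeof record.from !== "string" || typeof record.to !== "string") {
    return false;
  }
  return record.kind === undefined || typeof record.kind === "string";
}
```

The `kind` values quoted in the schema description, such as `relative-string-reference`, are examples rather than an enum, so the guard accepts any string there.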
package/skills/audit-code/audit-code.prompt.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  description: Autonomous local loop code auditing - advances deterministic audit state, delegates bounded review tasks, and ingests validated results
3
3
  argument-hint: [target-dir]
4
- allowed-tools: [Read, Write, Bash, Glob, Grep, Agent]
4
+ allowed-tools: [Read, Bash, Glob, Grep, Agent]
5
5
  ---
6
6
 
7
7
  # `/audit-code` Execution Directive
@@ -81,14 +81,19 @@ In a single message, launch one Agent/subagent call per dispatch-plan entry:
81
81
  Agent({ description: entry.description, prompt: "Read and follow the audit instructions in: " + entry.prompt_path })
82
82
  ```
83
83
 
84
+ If the host supports per-subagent tool restrictions, give review subagents no
85
+ Write tool and allow shell access only for the `audit-code submit-packet`
86
+ command printed in their prompt.
87
+
84
88
  All subagent calls should be launched together. Wait for them to finish.
85
89
 
86
90
  Subagents own bounded semantic review. They must read only their prompt and
87
- assigned files, write exactly the requested audit result JSON to `output_path`,
88
- run the validation command in their prompt, retry up to 3 times if validation
89
- fails, and stop. They must not edit source files, remediate findings, create
90
- extra task results, run unrelated audits, or write the worker `result.json`
91
- control envelope.
91
+ assigned files, produce the requested `AuditResult[]`, pipe it to the
92
+ `submit-packet` command in their prompt, retry up to 3 times if submission
93
+ fails, and stop. The backend command validates and writes the packet-owned
94
+ result artifacts. They must not use direct file writes, edit source files,
95
+ remediate findings, create extra task results, run unrelated audits, or write
96
+ the worker `result.json` control envelope.
92
97
 
93
98
  Then run:
94
99