auditor-lambda 0.3.12 → 0.3.14
- package/README.md +20 -24
- package/audit-code-wrapper-lib.mjs +52 -53
- package/dist/cli.js +43 -6
- package/dist/coverage.js +3 -1
- package/dist/extractors/disposition.js +8 -1
- package/dist/extractors/graph.d.ts +3 -1
- package/dist/extractors/graph.js +1147 -67
- package/dist/extractors/graphManifestEdges.d.ts +14 -0
- package/dist/extractors/graphManifestEdges.js +1158 -0
- package/dist/extractors/graphPathUtils.d.ts +5 -0
- package/dist/extractors/graphPathUtils.js +75 -0
- package/dist/extractors/pathPatterns.d.ts +1 -0
- package/dist/extractors/pathPatterns.js +3 -0
- package/dist/io/artifacts.d.ts +10 -1
- package/dist/io/artifacts.js +23 -3
- package/dist/orchestrator/internalExecutors.d.ts +4 -0
- package/dist/orchestrator/internalExecutors.js +35 -6
- package/dist/orchestrator/reviewPackets.js +1003 -31
- package/dist/orchestrator/syntaxResolutionExecutor.js +34 -0
- package/dist/types/externalAnalyzer.d.ts +9 -0
- package/dist/types/graph.d.ts +3 -0
- package/dist/types/reviewPlanning.d.ts +39 -0
- package/docs/contracts.md +215 -0
- package/docs/development.md +210 -0
- package/docs/handoff.md +204 -0
- package/docs/history.md +40 -0
- package/docs/operator-guide.md +189 -0
- package/docs/product.md +185 -0
- package/docs/release.md +131 -0
- package/package.json +1 -1
- package/schemas/audit_plan_metrics.schema.json +347 -0
- package/schemas/external_analyzer_results.schema.json +35 -0
- package/schemas/graph_bundle.schema.json +47 -2
- package/schemas/review_packets.schema.json +160 -0
- package/skills/audit-code/SKILL.md +7 -3
- package/skills/audit-code/audit-code.prompt.md +4 -1
- package/docs/agent-integrations.md +0 -317
- package/docs/agent-roles.md +0 -69
- package/docs/architecture.md +0 -90
- package/docs/artifacts.md +0 -36
- package/docs/bootstrap-install.md +0 -139
- package/docs/contract.md +0 -54
- package/docs/dispatch-implementation-plan.md +0 -302
- package/docs/field-trial-bug-report.md +0 -237
- package/docs/github-copilot.md +0 -66
- package/docs/model-selection.md +0 -97
- package/docs/next-steps.md +0 -202
- package/docs/packaging.md +0 -120
- package/docs/pipeline.md +0 -152
- package/docs/product-direction.md +0 -154
- package/docs/production-launch-bar.md +0 -92
- package/docs/production-readiness.md +0 -58
- package/docs/releasing.md +0 -145
- package/docs/remediation-baseline.md +0 -75
- package/docs/repo-layout.md +0 -30
- package/docs/run-flow.md +0 -56
- package/docs/session-config.md +0 -319
- package/docs/supervisor.md +0 -100
- package/docs/usage.md +0 -215
- package/docs/windows-setup.md +0 -146
- package/docs/workflow-refactor-brief.md +0 -124
package/docs/handoff.md
ADDED
|
@@ -0,0 +1,204 @@
|
|
|
1
|
+
# Handoff
|
|
2
|
+
|
|
3
|
+
Current pickup note for the next implementation agent. Keep durable product
|
|
4
|
+
direction in `docs/product.md`, engineering workflow in `docs/development.md`,
|
|
5
|
+
contracts in `docs/contracts.md`, and operator steps in
|
|
6
|
+
`docs/operator-guide.md`.
|
|
7
|
+
|
|
8
|
+
## Current State
|
|
9
|
+
|
|
10
|
+
The docs refresh remains consolidated under `docs/`; do not restore old
|
|
11
|
+
phase-specific docs unless asked. Checked-in `dist/` is expected to be rebuilt
|
|
12
|
+
after TypeScript changes.
|
|
13
|
+
|
|
14
|
+
Graph-informed packetization is in place and observable through
|
|
15
|
+
`review_packets.json` and `audit_plan_metrics.json`: packet entrypoints,
|
|
16
|
+
key edges, boundary files, quality, merge/boundary edge kinds, weak packet
|
|
17
|
+
counts, gap counts, extension counts, and bounded samples are all emitted.
|
|
18
|
+
|
|
19
|
+
Latest completed slice:
|
|
20
|
+
|
|
21
|
+
- completed the remediator-lambda audit end-to-end after refreshing stale
|
|
22
|
+
artifacts:
|
|
23
|
+
- `run-to-completion` refreshed file disposition, auto-fix, structure,
|
|
24
|
+
planning, runtime validation, and synthesis
|
|
25
|
+
- resolved all additional runtime/selective-deepening handoffs; final
|
|
26
|
+
`audit_tasks.json` had 81 tasks, all complete, 0 pending
|
|
27
|
+
- fixed target-side Windows/runtime validation noise in remediator-lambda:
|
|
28
|
+
`src/phases/plan.ts` no longer invokes `npx vitest`/`npx jest` in temp
|
|
29
|
+
roots without `package.json`; `tests/phase-plan.test.ts` has a longer
|
|
30
|
+
cleanup retry/hook timeout; `vitest.config.ts` excludes generated
|
|
31
|
+
audit/provider directories from test discovery
|
|
32
|
+
- `npm test` in `C:\Code\remediator-lambda` now passes: 5 test files,
|
|
33
|
+
51 tests
|
|
34
|
+
- final synthesis promoted `C:\Code\remediator-lambda\audit-report.md`
|
|
35
|
+
(47 findings, 16 work blocks), and `.audit-artifacts` was cleaned by
|
|
36
|
+
completion
|
|
37
|
+
- `node C:\Code\auditor-lambda\dist\index.js validate --root
|
|
38
|
+
C:\Code\remediator-lambda --artifacts-dir
|
|
39
|
+
C:\Code\remediator-lambda\.audit-artifacts` reports `issue_count: 0`
|
|
40
|
+
|
|
41
|
+
Prior completed remediator slice:
|
|
42
|
+
|
|
43
|
+
- completed the remediator-lambda final selective-deepening round:
|
|
44
|
+
- created dispatch run `20260509T180000000Z_audit_tasks_completed_002`
|
|
45
|
+
for the two remaining pending tasks
|
|
46
|
+
- submitted packet
|
|
47
|
+
`lens-steward-security:security-reliability:packet-1-cfa943527d`;
|
|
48
|
+
accepted 2 result entries, `finding_count: 0`
|
|
49
|
+
- `merge-and-ingest` accepted 2 result entries, rejected 0,
|
|
50
|
+
`spurious_file_count: 0`, `finding_count: 0`
|
|
51
|
+
- `audit_tasks.json` now has 73 tasks, all `complete`, 0 pending
|
|
52
|
+
- `audit-code validate` on the remediator artifact bundle reports
|
|
53
|
+
`issue_count: 0`
|
|
54
|
+
|
|
55
|
+
Prior completed implementation slice:
|
|
56
|
+
|
|
57
|
+
- fixed `merge-and-ingest` to treat unexpected files in `task-results/` as
|
|
58
|
+
warnings rather than hard failures; subagents sometimes write a spurious
|
|
59
|
+
packet-level result file alongside per-task `submit-packet` submissions —
|
|
60
|
+
the unexpected file check now emits a stderr warning and increments
|
|
61
|
+
`spurious_file_count` in the output JSON, but does not block ingestion when
|
|
62
|
+
all backend-assigned result files are present and valid
|
|
63
|
+
- added regression test: `merge-and-ingest proceeds despite unexpected files
|
|
64
|
+
in task-results/`; test count: 199 passing
|
|
65
|
+
- fixed Windows `EBUSY` test cleanup in `remediator-lambda/tests/phase-plan.test.ts`:
|
|
66
|
+
`enumerateTestFiles` calls `spawnSync("npx vitest ...")` in the temp dir;
|
|
67
|
+
on Windows the child process handle lingers briefly after return, causing
|
|
68
|
+
`rm()` in `afterEach` to fail with `EBUSY`; added `rmWithRetry` (5 attempts, 100ms×n
|
|
69
|
+
backoff) used in both `beforeEach` and `afterEach`; remediator-lambda now
|
|
70
|
+
passes all 153 tests cleanly
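
The retry pattern described for `rmWithRetry` (5 attempts, 100ms×n backoff) might look roughly like the sketch below. The real helper is not shown in this note, so `backoffDelays`, `withRetry`, and the injected `sleep` parameter are hypothetical names, and retrying on every error (rather than only on `EBUSY`) is an assumption.

```typescript
// Hypothetical sketch; not the helper from remediator-lambda.
type Sleep = (ms: number) => Promise<void>;

// Waits between attempts grow linearly: baseMs × n after failed attempt n.
// With 5 attempts there are at most 4 waits.
function backoffDelays(attempts: number, baseMs: number): number[] {
  const delays: number[] = [];
  for (let n = 1; n < attempts; n++) {
    delays.push(baseMs * n);
  }
  return delays;
}

// Runs `op` up to `attempts` times, sleeping baseMs × n after failed attempt n.
// Assumption: every error is retried; a stricter version could check for EBUSY.
async function withRetry<T>(
  op: () => Promise<T>,
  attempts: number,
  baseMs: number,
  sleep: Sleep,
): Promise<T> {
  let lastError: unknown;
  for (let n = 1; n <= attempts; n++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (n < attempts) {
        await sleep(baseMs * n);
      }
    }
  }
  throw lastError;
}

// Possible use in a test hook, with `rm` from "node:fs/promises":
//   const sleep: Sleep = (ms) => new Promise((r) => setTimeout(r, ms));
//   await withRetry(() => rm(tempDir, { recursive: true, force: true }), 5, 100, sleep);
```

Injecting `sleep` keeps the loop testable without real timers; that is a design choice of this sketch, not something the handoff states.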
|
|
71
|
+
|
|
72
|
+
Prior completed slice:
|
|
73
|
+
|
|
74
|
+
- added `python-test-util-suite-link` edges: `.py` files co-located in a
|
|
75
|
+
`utils/`, `helpers/`, or `support/` subdirectory within an `isTestPath`
|
|
76
|
+
directory are chained as a suite (same bounded-suite pattern as existing
|
|
77
|
+
TypeScript type / JSON schema / package-script suites); `conftest.py` is
|
|
78
|
+
excluded from the predicate
|
|
79
|
+
- confidence: `0.72`; direction: `undirected`
|
|
80
|
+
- added 3 focused unit tests; rebuilt checked-in `dist/`
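
A minimal sketch of how the `python-test-util-suite-link` predicate and suite chaining could work, based only on the description above. `looksLikeTestPath` is a stand-in for the package's `isTestPath` (whose real rule is not shown here), and `SuiteEdge`, `isUtilSuiteFile`, and `chainSuite` are hypothetical names. Chaining in sorted path order is an assumption.

```typescript
// Hypothetical sketch; the extractor's actual code is not reproduced here.
interface SuiteEdge {
  from: string;
  to: string;
  kind: "python-test-util-suite-link";
  confidence: number;
  direction: "undirected";
}

const SUITE_DIR_NAMES = new Set<string>(["utils", "helpers", "support"]);

// Stand-in for `isTestPath`: treats any path with a `test` or `tests` segment as a test path.
function looksLikeTestPath(dir: string): boolean {
  return dir.split("/").some((seg) => seg === "tests" || seg === "test");
}

// All four conditions must hold: `.py`, not conftest, suite dir name, test path.
function isUtilSuiteFile(path: string): boolean {
  const parts = path.split("/");
  if (parts.length < 2) return false;
  const file = parts[parts.length - 1];
  const parentName = parts[parts.length - 2];
  const parentDir = parts.slice(0, -1).join("/");
  return (
    file.endsWith(".py") &&
    file !== "conftest.py" &&
    SUITE_DIR_NAMES.has(parentName) &&
    looksLikeTestPath(parentDir)
  );
}

// Links each qualifying file to the next one in the same directory, so a suite
// of n files yields n - 1 edges instead of a full clique.
function chainSuite(paths: string[]): SuiteEdge[] {
  const byDir = new Map<string, string[]>();
  for (const p of paths.filter(isUtilSuiteFile)) {
    const dir = p.slice(0, p.lastIndexOf("/"));
    const files = byDir.get(dir) ?? [];
    files.push(p);
    byDir.set(dir, files);
  }
  const edges: SuiteEdge[] = [];
  byDir.forEach((files) => {
    const sorted = files.slice().sort();
    for (let i = 0; i + 1 < sorted.length; i++) {
      edges.push({
        from: sorted[i],
        to: sorted[i + 1],
        kind: "python-test-util-suite-link",
        confidence: 0.72,
        direction: "undirected",
      });
    }
  });
  return edges;
}
```

Chaining rather than fully connecting the files keeps the edge count bounded by the suite size.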
|
|
81
|
+
|
|
82
|
+
Field evidence (Polar-CV-KAN):
|
|
83
|
+
|
|
84
|
+
- canonical run: `.audit-artifacts/polar-python-util-suite-20260509`
|
|
85
|
+
(7 packets, 1.000 cohesion, 2 weak packets)
|
|
86
|
+
- `python-test-util-suite-link` produces 2 intra-unit edges within the
|
|
87
|
+
`tests-utils` packet (`assertions.py → mocks.py`, `mocks.py → test_data.py`)
|
|
88
|
+
- `tests-utils` packet: `internal_edge_count` 0 → 2; `cohesion_score` 0 → 1;
|
|
89
|
+
`unexplained_file_count` 3 → 0; no longer a weak packet
|
|
90
|
+
- Polar metrics: 7 packets, **1.000 cohesion** (up from 0.857), **2 weak
|
|
91
|
+
packets** (down from 3)
|
|
92
|
+
- 2 remaining weak packets are `unexplained_files` type; genuinely isolated
|
|
93
|
+
files (`.auditorignore`, `experiments/domains/__init__.py`,
|
|
94
|
+
`experiments/summarize_results.py`) cannot be linked without false positives
|
|
95
|
+
|
|
96
|
+
Field evidence (remediator-lambda):
|
|
97
|
+
|
|
98
|
+
- baseline: `.audit-artifacts/remediator-yaml-refs-20260508`
|
|
99
|
+
- remediator metrics stable: 62 tasks, 3 packets, 1.000 cohesion, 0 weak
|
|
100
|
+
packets; `python-test-util-suite-link` adds 0 edges (TypeScript repo, no
|
|
101
|
+
`.py` files)
|
|
102
|
+
- remediator full audit loop completed: `.audit-artifacts/` (in-progress run
|
|
103
|
+
`20260509T153435008Z_audit_tasks_completed_006` + deepening run
|
|
104
|
+
`20260509T155225210Z_audit_tasks_completed_001`); first round produced 42
|
|
105
|
+
findings across 65 tasks; deepening round added 4 findings across 6 tasks
|
|
106
|
+
- remediator deepening `merge-and-ingest` retry succeeded after the spurious
|
|
107
|
+
file fix:
|
|
108
|
+
- command used:
|
|
109
|
+
`node C:\Code\auditor-lambda\dist\index.js merge-and-ingest --run-id 20260509T155225210Z_audit_tasks_completed_001 --root C:\Code\remediator-lambda --artifacts-dir C:\Code\remediator-lambda\.audit-artifacts`
|
|
110
|
+
- accepted 6 result entries, rejected 0, `spurious_file_count: 1`,
|
|
111
|
+
`finding_count: 4`
|
|
112
|
+
- result ingestion progressed and added 2 selective deepening tasks
|
|
113
|
+
- current remediator artifact state after retry: `audit_results_ingested`
|
|
114
|
+
present, `audit_tasks_completed` satisfied, `requeue_tasks.json` empty,
|
|
115
|
+
`audit_tasks.json` has 73 tasks with 0 pending after final selective
|
|
116
|
+
deepening run `20260509T180000000Z_audit_tasks_completed_002`
|
|
117
|
+
- final selective deepening verified the existing `src/types/workerSession.ts`
|
|
118
|
+
security findings and upheld the reliability no-finding result for
|
|
119
|
+
`src/types/sessionConfig.ts`, `src/types/workerResult.ts`, and
|
|
120
|
+
`src/types/workerSession.ts`; it added 0 findings
|
|
121
|
+
- current remediator packet metrics: 73 tasks, 3 packets, 1.000 cohesion,
|
|
122
|
+
1 weak packet with 1 unexplained file
|
|
123
|
+
- final refreshed remediator audit completed with 81 tasks, all complete, and
|
|
124
|
+
final `audit-report.md` at repo root. Runtime validation was confirmed after
|
|
125
|
+
excluding generated audit/provider directories from Vitest discovery; the
|
|
126
|
+
earlier `EBUSY` output was environmental noise from generated worktrees.
|
|
127
|
+
|
|
128
|
+
## Verification
|
|
129
|
+
|
|
130
|
+
Completed:
|
|
131
|
+
|
|
132
|
+
```bash
|
|
133
|
+
npm run build
|
|
134
|
+
npm test # 199 passing
|
|
135
|
+
node C:\Code\auditor-lambda\dist\index.js merge-and-ingest --run-id 20260509T180000000Z_audit_tasks_completed_002 --root C:\Code\remediator-lambda --artifacts-dir C:\Code\remediator-lambda\.audit-artifacts
|
|
136
|
+
node C:\Code\auditor-lambda\dist\index.js validate --root C:\Code\remediator-lambda --artifacts-dir C:\Code\remediator-lambda\.audit-artifacts
|
|
137
|
+
npm test # in C:\Code\remediator-lambda, 51 passing
|
|
138
|
+
node C:\Code\auditor-lambda\dist\index.js run-to-completion --root C:\Code\remediator-lambda --artifacts-dir C:\Code\remediator-lambda\.audit-artifacts --max-runs 10
|
|
139
|
+
node C:\Code\auditor-lambda\dist\index.js validate --root C:\Code\remediator-lambda --artifacts-dir C:\Code\remediator-lambda\.audit-artifacts
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
## Files Touched Recently
|
|
143
|
+
|
|
144
|
+
- `src/cli.ts` — `cmdMergeAndIngest`: unexpected files → warning, not failure
|
|
145
|
+
- `tests/audit-code-wrapper.test.mjs` — new regression test
|
|
146
|
+
- `dist/` — rebuilt
|
|
147
|
+
- `C:\Code\remediator-lambda\src\phases\plan.ts` — avoid test-runner
|
|
148
|
+
enumeration in roots without `package.json`
|
|
149
|
+
- `C:\Code\remediator-lambda\tests\phase-plan.test.ts` — sturdier
|
|
150
|
+
`rmWithRetry` helper and longer hook timeout
|
|
151
|
+
- `C:\Code\remediator-lambda\vitest.config.ts` — exclude generated
|
|
152
|
+
audit/provider directories from test discovery
|
|
153
|
+
- `C:\Code\remediator-lambda\audit-report.md` — final promoted report
|
|
154
|
+
- `docs/handoff.md`
|
|
155
|
+
|
|
156
|
+
## Next Steps
|
|
157
|
+
|
|
158
|
+
1. The 2 remaining weak packets in Polar (`experiments-domains` with 5
|
|
159
|
+
unexplained files, `tests-tiny-files` with 3 unexplained files) share the
|
|
160
|
+
same genuinely isolated files (`.auditorignore`,
|
|
161
|
+
`experiments/domains/__init__.py`, `experiments/summarize_results.py`).
|
|
162
|
+
No extractor can address these without false positives; treat as floor.
|
|
163
|
+
Only revisit if a future field trial on a different repo surfaces the same
|
|
164
|
+
pattern in fixable form.
|
|
165
|
+
2. Remediator-lambda field trial is closed. Review the final
|
|
166
|
+
`C:\Code\remediator-lambda\audit-report.md` only if you need product
|
|
167
|
+
remediation planning; no audit-code backend work remains for that run.
|
|
168
|
+
3. Run the release/publish flow only when intentionally cutting a version.
|
|
169
|
+
|
|
170
|
+
## Cautions
|
|
171
|
+
|
|
172
|
+
- `AuditTask` remains the deterministic coverage identity; `ReviewPacket`
|
|
173
|
+
should not replace result ingestion contracts.
|
|
174
|
+
- Weak graph edges, semantic affinity, and shared token frequency should remain
|
|
175
|
+
context unless deterministic graph evidence corroborates them.
|
|
176
|
+
- Boundary files are evidence hints. Worker prompts should continue to
|
|
177
|
+
discourage broad reads outside the packet.
|
|
178
|
+
- Keep suite links bounded and evidence-led; do not turn same-directory
|
|
179
|
+
proximity into a broad packet merge rule.
|
|
180
|
+
- `conftest-link` fires only when `conftest.py` is inside an `isTestPath`
|
|
181
|
+
directory; root-level conftest.py is deliberately excluded to avoid O(n)
|
|
182
|
+
fan-out to all Python files.
|
|
183
|
+
- `yaml-path-reference-link` only matches string values ending in config
|
|
184
|
+
extensions (`.yaml`, `.yml`, `.json`, `.toml`) that resolve to an existing
|
|
185
|
+
file in the repo; absolute URLs and values without `/` are excluded.
|
|
186
|
+
- `python-test-util-suite-link` predicate requires all four conditions: `.py`
|
|
187
|
+
extension, NOT a conftest, parent dir name in `{utils, helpers, support}`,
|
|
188
|
+
and the parent dir's normalized path passes `isTestPath`. Do not broaden the
|
|
189
|
+
dir-name set without field evidence from a real repository.
|
|
190
|
+
- `python-test-util-suite-link` edges appear as intra-unit edges (not counted
|
|
191
|
+
in `merge_edge_kind_counts`) when all suite files belong to the same unit.
|
|
192
|
+
This is correct — the edges still increment `internal_edge_count` and clear
|
|
193
|
+
the weak-packet flag. Absence from merge counts does not mean the edges are
|
|
194
|
+
inactive.
|
|
195
|
+
- `merge-and-ingest` unexpected files now warn to stderr and increment
|
|
196
|
+
`spurious_file_count` in the output JSON. They do not cause ingestion to
|
|
197
|
+
fail. The check is still present to make spurious writes visible.
|
|
198
|
+
- In this sandbox, running the wrapper from `C:\Code\auditor-lambda` with an
|
|
199
|
+
absolute remediator root hit an `EPERM` while overwriting the existing
|
|
200
|
+
remediator run `audit-results.json`; invoking the built CLI directly from
|
|
201
|
+
`C:\Code\remediator-lambda` succeeded. Treat this as an execution-environment
|
|
202
|
+
wrinkle unless it reproduces outside the sandbox.
|
|
203
|
+
- Final remediator completion cleaned `.audit-artifacts`; use the promoted
|
|
204
|
+
repo-root `audit-report.md` and `validate` output as the source of truth.
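
The `merge-and-ingest` caution above (unexpected files warn and increment `spurious_file_count` but do not fail ingestion) can be illustrated with a small sketch. `checkTaskResults`, `IngestCheck`, and the file names in the usage comment are hypothetical. Validation of file contents is not modeled, so `can_ingest` here only reflects whether every expected file is present.

```typescript
// Hypothetical sketch; not the code in `cmdMergeAndIngest`.
interface IngestCheck {
  missing: string[];
  spurious: string[];
  spurious_file_count: number;
  can_ingest: boolean;
}

// `expected` is the set of backend-assigned result files for the run;
// `present` is what was actually found in task-results/.
function checkTaskResults(expected: string[], present: string[]): IngestCheck {
  const expectedSet = new Set<string>(expected);
  const presentSet = new Set<string>(present);
  const missing = expected.filter((f) => !presentSet.has(f));
  const spurious = present.filter((f) => !expectedSet.has(f));
  return {
    missing,
    spurious,
    spurious_file_count: spurious.length,
    // Spurious files are reported but never block ingestion on their own.
    can_ingest: missing.length === 0,
  };
}

// Example: one extra packet-level file next to two expected task files.
//   checkTaskResults(["task-1.json", "task-2.json"],
//                    ["task-1.json", "task-2.json", "packet-1.json"])
//   reports spurious_file_count 1 and can_ingest true.
```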
|
package/docs/history.md
ADDED
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
# History
|
|
2
|
+
|
|
3
|
+
This page keeps short archival context that used to live in several
|
|
4
|
+
phase-specific documents. It is not the current roadmap or release gate.
|
|
5
|
+
|
|
6
|
+
## Field-trial lessons
|
|
7
|
+
|
|
8
|
+
Earlier real-repository runs surfaced issues around:
|
|
9
|
+
|
|
10
|
+
- completion detection
|
|
11
|
+
- worker launch failures
|
|
12
|
+
- result ingestion validation
|
|
13
|
+
- command hangs without progress
|
|
14
|
+
- requeue task explosion
|
|
15
|
+
- evidence schema ambiguity
|
|
16
|
+
- noisy runtime placeholders
|
|
17
|
+
- weak root-cause clustering
|
|
18
|
+
- missing work-block presentation
|
|
19
|
+
- unenforceable reviewed ranges
|
|
20
|
+
|
|
21
|
+
Most of those findings have dedicated regression coverage now. The durable
|
|
22
|
+
lesson is that failure states should be explicit, schema validation should be
|
|
23
|
+
field-level, and packetization should optimize for coherent review context
|
|
24
|
+
rather than raw worker-count reduction alone.
|
|
25
|
+
|
|
26
|
+
## Remediation baseline
|
|
27
|
+
|
|
28
|
+
The old remediation baseline recorded fixes across:
|
|
29
|
+
|
|
30
|
+
- CI and release smoke coverage
|
|
31
|
+
- extractor path handling
|
|
32
|
+
- schema-contract validation
|
|
33
|
+
- orchestration state handling
|
|
34
|
+
- provider and supervisor behavior
|
|
35
|
+
- CLI and IO robustness
|
|
36
|
+
- reporting and synthesis behavior
|
|
37
|
+
- generated install payload parity
|
|
38
|
+
|
|
39
|
+
Current readiness is tracked in `docs/product.md`, `docs/operator-guide.md`,
|
|
40
|
+
`docs/contracts.md`, `docs/release.md`, and `docs/development.md`.
|
package/docs/operator-guide.md
ADDED
|
@@ -0,0 +1,189 @@
|
|
|
1
|
+
# Operator Guide
|
|
2
|
+
|
|
3
|
+
## Install and bootstrap
|
|
4
|
+
|
|
5
|
+
Install once:
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
npm install -g auditor-lambda
|
|
9
|
+
```
|
|
10
|
+
|
|
11
|
+
Then invoke `/audit-code` in a supported host. The prompt self-bootstraps the
|
|
12
|
+
current repository with:
|
|
13
|
+
|
|
14
|
+
```bash
|
|
15
|
+
audit-code ensure --quiet
|
|
16
|
+
```
|
|
17
|
+
|
|
18
|
+
Use these commands when you want to manage setup manually:
|
|
19
|
+
|
|
20
|
+
```bash
|
|
21
|
+
audit-code ensure
|
|
22
|
+
audit-code ensure --force
|
|
23
|
+
audit-code install
|
|
24
|
+
audit-code verify-install
|
|
25
|
+
audit-code prompt-path
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
`ensure` is idempotent. `install` rewrites the supported repo-local host
|
|
29
|
+
surfaces.
|
|
30
|
+
|
|
31
|
+
## Generated files
|
|
32
|
+
|
|
33
|
+
Shared install files:
|
|
34
|
+
|
|
35
|
+
- `.audit-code/install/audit-code.import.md`
|
|
36
|
+
- `.audit-code/install/SKILL.md`
|
|
37
|
+
- `.audit-code/install/GETTING-STARTED.md`
|
|
38
|
+
- `.audit-code/install/manifest.json`
|
|
39
|
+
- `.audit-code/install/run-mcp-server.mjs`
|
|
40
|
+
- `.audit-artifacts/session-config.json` when no backend fallback config exists
|
|
41
|
+
|
|
42
|
+
Host-specific files may include:
|
|
43
|
+
|
|
44
|
+
- Codex: managed `AGENTS.md` fallback guidance
|
|
45
|
+
- Claude Desktop: project template, remote MCP connector, local MCP bundle
|
|
46
|
+
- OpenCode: command file, skill bundle, and `opencode.json`
|
|
47
|
+
- VS Code/Copilot: prompt, custom agent, instructions, and `.vscode/mcp.json`
|
|
48
|
+
- Antigravity: planning-mode and MCP-oriented guidance
|
|
49
|
+
|
|
50
|
+
Use `.audit-code/install/GETTING-STARTED.md` as the repo-local handoff after
|
|
51
|
+
bootstrap.
|
|
52
|
+
|
|
53
|
+
## Host guidance
|
|
54
|
+
|
|
55
|
+
ChatGPT-style project conversations are the intended product surface. Use
|
|
56
|
+
`/audit-code` in conversation and let the active model and project files be the
|
|
57
|
+
default context.
|
|
58
|
+
|
|
59
|
+
Codex should normally use the global skill seeded by the npm install plus
|
|
60
|
+
repo-local `AGENTS.md` fallback guidance.
|
|
61
|
+
|
|
62
|
+
Claude Desktop is treated as an MCP-first host. Use the generated project
|
|
63
|
+
template and local bundle artifacts when installing the integration.
|
|
64
|
+
|
|
65
|
+
OpenCode and VS Code use repo-local prompt, command, and MCP configuration
|
|
66
|
+
files generated by `audit-code ensure` or `audit-code install`.
|
|
67
|
+
|
|
68
|
+
Antigravity should be treated as a workflow-and-artifacts host until it has a
|
|
69
|
+
stable project-local config surface. Use generated planning-mode guidance,
|
|
70
|
+
MCP tools/resources, or the backend fallback from an Antigravity-managed
|
|
71
|
+
terminal when needed.
|
|
72
|
+
|
|
73
|
+
Manual prompt-import hosts can use:
|
|
74
|
+
|
|
75
|
+
```bash
|
|
76
|
+
audit-code prompt-path
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
## Backend fallback
|
|
80
|
+
|
|
81
|
+
From the target repository root:
|
|
82
|
+
|
|
83
|
+
```bash
|
|
84
|
+
audit-code
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
The wrapper:
|
|
88
|
+
|
|
89
|
+
- defaults artifacts to `<repo-root>/.audit-artifacts`
|
|
90
|
+
- advances deterministic work automatically
|
|
91
|
+
- stops cleanly when semantic review is required and no configured bridge can
|
|
92
|
+
continue
|
|
93
|
+
- emits `contract_version: "audit-code/v1alpha1"`
|
|
94
|
+
- refreshes `.audit-artifacts/operator-handoff.json` and
|
|
95
|
+
`.audit-artifacts/operator-handoff.md`
|
|
96
|
+
|
|
97
|
+
Useful fallback commands:
|
|
98
|
+
|
|
99
|
+
```bash
|
|
100
|
+
audit-code --single-step
|
|
101
|
+
audit-code --results /path/to/audit_results.json
|
|
102
|
+
audit-code --batch-results /path/to/results-dir
|
|
103
|
+
audit-code --updates /path/to/runtime_validation_update.json
|
|
104
|
+
audit-code --external-analyzer-results /path/to/external_analyzer_results.json
|
|
105
|
+
audit-code explain-task <task_id>
|
|
106
|
+
audit-code validate
|
|
107
|
+
audit-code mcp
|
|
108
|
+
```
|
|
109
|
+
|
|
110
|
+
`audit-code validate` checks artifact shape, cross-artifact consistency,
|
|
111
|
+
session config, and explicit provider readiness.
|
|
112
|
+
|
|
113
|
+
## Session config
|
|
114
|
+
|
|
115
|
+
Backend fallback configuration lives at:
|
|
116
|
+
|
|
117
|
+
```text
|
|
118
|
+
.audit-artifacts/session-config.json
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
The canonical `/audit-code` conversation route should not require users to
|
|
122
|
+
touch this file.
|
|
123
|
+
|
|
124
|
+
Default:
|
|
125
|
+
|
|
126
|
+
```json
|
|
127
|
+
{
|
|
128
|
+
"provider": "local-subprocess"
|
|
129
|
+
}
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
Supported providers:
|
|
133
|
+
|
|
134
|
+
- `local-subprocess`
|
|
135
|
+
- `auto`
|
|
136
|
+
- `subprocess-template`
|
|
137
|
+
- `claude-code`
|
|
138
|
+
- `opencode`
|
|
139
|
+
- `vscode-task`
|
|
140
|
+
|
|
141
|
+
`local-subprocess` is the safest fallback default. `auto` is explicit opt-in.
|
|
142
|
+
External providers are compatibility bridges, not the intended default review
|
|
143
|
+
owner.
|
|
144
|
+
|
|
145
|
+
Common fields:
|
|
146
|
+
|
|
147
|
+
```json
|
|
148
|
+
{
|
|
149
|
+
"provider": "local-subprocess",
|
|
150
|
+
"timeout_ms": 1800000,
|
|
151
|
+
"ui_mode": "headless",
|
|
152
|
+
"agent_task_batch_size": 1,
|
|
153
|
+
"parallel_workers": 1
|
|
154
|
+
}
|
|
155
|
+
```
|
|
156
|
+
|
|
157
|
+
Use `ui_mode: "visible"` when debugging provider stdout/stderr. Use
|
|
158
|
+
`subprocess-template` or `vscode-task` only when you have a reliable launcher
|
|
159
|
+
bridge.
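
As a rough illustration of the fields above, a typed shape and a pre-flight check could look like the sketch below. This is not the logic behind `audit-code validate`. `SessionConfig` and `checkSessionConfig` are hypothetical names, and the rules that numeric fields are positive integers and that `ui_mode` accepts only `headless` or `visible` are assumptions drawn from the examples in this guide.

```typescript
// Hypothetical sketch; the authoritative rules live in the package's schemas.
const PROVIDERS = [
  "local-subprocess",
  "auto",
  "subprocess-template",
  "claude-code",
  "opencode",
  "vscode-task",
] as const;
type Provider = (typeof PROVIDERS)[number];

interface SessionConfig {
  provider?: Provider;
  timeout_ms?: number;
  ui_mode?: "headless" | "visible";
  agent_task_batch_size?: number;
  parallel_workers?: number;
}

function isPositiveInt(v: unknown): boolean {
  return typeof v === "number" && Number.isInteger(v) && v > 0;
}

// Returns a list of human-readable problems; an empty list means no issue found.
// Fields that are absent are not reported, since required-ness is not stated here.
function checkSessionConfig(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return ["config must be a JSON object"];
  }
  const cfg = raw as Record<string, unknown>;
  const problems: string[] = [];
  const provider = cfg["provider"];
  if (provider !== undefined) {
    const known = typeof provider === "string" &&
      (PROVIDERS as readonly string[]).indexOf(provider) !== -1;
    if (!known) problems.push("provider must be one of: " + PROVIDERS.join(", "));
  }
  const uiMode = cfg["ui_mode"];
  if (uiMode !== undefined && uiMode !== "headless" && uiMode !== "visible") {
    problems.push("ui_mode must be \"headless\" or \"visible\"");
  }
  for (const key of ["timeout_ms", "agent_task_batch_size", "parallel_workers"]) {
    const value = cfg[key];
    if (value !== undefined && !isPositiveInt(value)) {
      problems.push(key + " must be a positive integer");
    }
  }
  return problems;
}
```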
|
|
160
|
+
|
|
161
|
+
## Model selection
|
|
162
|
+
|
|
163
|
+
Conversation-level model choice belongs to the host conversation. The backend
|
|
164
|
+
should not force a model in normal usage.
|
|
165
|
+
|
|
166
|
+
For backend provider bridges, let the chosen provider own its own model
|
|
167
|
+
selection unless the operator has a concrete reason to configure it.
|
|
168
|
+
|
|
169
|
+
Packet dispatch may emit provider-neutral model hints such as `small`,
|
|
170
|
+
`standard`, or `deep`. Hosts can map those hints to their own models.
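
One way a host might map those hints is sketched below. `resolveModel` and the model names in the usage comment are hypothetical, and falling back to the active conversation model for a missing or unrecognized hint is an assumption consistent with the guidance above.

```typescript
// Hypothetical sketch of host-side hint mapping.
type ModelHint = "small" | "standard" | "deep";

// `hostModels` lists whichever hints the host chooses to map;
// anything else resolves to the active conversation model.
function resolveModel(
  hint: string | undefined,
  hostModels: Partial<Record<ModelHint, string>>,
  activeModel: string,
): string {
  if (hint === "small" || hint === "standard" || hint === "deep") {
    return hostModels[hint] ?? activeModel;
  }
  return activeModel;
}

// Example with made-up model names:
//   resolveModel("deep", { deep: "host-large" }, "host-default")  returns "host-large"
//   resolveModel("small", { deep: "host-large" }, "host-default") returns "host-default"
```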
|
|
171
|
+
|
|
172
|
+
## Windows notes
|
|
173
|
+
|
|
174
|
+
Prefer command arrays over shell strings in `session-config.json`. Avoid nested
|
|
175
|
+
shell quoting when possible. For PowerShell templates, keep the executable and
|
|
176
|
+
arguments separate and prefer `{workerCommandJson}` when a launcher can consume
|
|
177
|
+
structured command data.
|
|
178
|
+
|
|
179
|
+
Runtime validation wraps package-manager shims such as `npm`, `npx`, `pnpm`,
|
|
180
|
+
and `yarn` through the Windows command shell automatically. A runtime
|
|
181
|
+
`not_confirmed` result can still be environmental when the target repo command
|
|
182
|
+
starts but cannot write its own build output.
|
|
183
|
+
|
|
184
|
+
If final report promotion to `<repo-root>/audit-report.md` is blocked by local
|
|
185
|
+
permissions, the audit can still complete. Use the artifact-bundle copy of
|
|
186
|
+
`audit-report.md` and run `audit-code validate`.
|
|
187
|
+
|
|
188
|
+
Run `audit-code validate` after editing session config so command-template
|
|
189
|
+
issues fail before a long audit run.
|
package/docs/product.md
ADDED
|
@@ -0,0 +1,185 @@
|
|
|
1
|
+
# Product
|
|
2
|
+
|
|
3
|
+
## Canonical surface
|
|
4
|
+
|
|
5
|
+
The primary product is `/audit-code` in conversation.
|
|
6
|
+
|
|
7
|
+
Normal product usage should:
|
|
8
|
+
|
|
9
|
+
- use the active conversation model by default
|
|
10
|
+
- use project files and attached repository context by default
|
|
11
|
+
- avoid manual paths, provider flags, and model-selection arguments
|
|
12
|
+
- keep semantic review with the active conversation agent by default
|
|
13
|
+
- advance the audit automatically until it completes or no further automatic progress is possible
|
|
14
|
+
|
|
15
|
+
The CLI is backend infrastructure, a local development harness, and a
|
|
16
|
+
repo-local fallback. It is not the preferred end-user mental model.
|
|
17
|
+
|
|
18
|
+
## Supported surfaces
|
|
19
|
+
|
|
20
|
+
The supported user-facing surfaces are:
|
|
21
|
+
|
|
22
|
+
1. `/audit-code` in conversation
|
|
23
|
+
2. `npm install -g auditor-lambda` as the one-time package install
|
|
24
|
+
3. `audit-code prompt-path` to locate the packaged prompt asset
|
|
25
|
+
4. `audit-code ensure` for idempotent repo-local bootstrap
|
|
26
|
+
5. `audit-code install` for explicit repair or force refresh
|
|
27
|
+
6. `audit-code` as the repo-local backend fallback
|
|
28
|
+
|
|
29
|
+
Anything below `dist/index.js` is backend or development interface.
|
|
30
|
+
|
|
31
|
+
## Product model
|
|
32
|
+
|
|
33
|
+
The intended workflow is:
|
|
34
|
+
|
|
35
|
+
1. The user invokes `/audit-code`.
|
|
36
|
+
2. The prompt runs `audit-code ensure --quiet`.
|
|
37
|
+
3. Deterministic backend steps build or refresh artifacts.
|
|
38
|
+
4. The active conversation dispatches bounded review packets when semantic
|
|
39
|
+
judgment is required.
|
|
40
|
+
5. Packet workers submit validated `AuditResult` objects through backend-owned
|
|
41
|
+
commands.
|
|
42
|
+
6. The backend ingests results, performs selective deepening and runtime
|
|
43
|
+
validation when needed, and writes the final `audit-report.md`.
|
|
44
|
+
|
|
45
|
+
Semantic review belongs to the active host conversation by default. Backend
|
|
46
|
+
provider adapters such as `claude-code`, `opencode`, `subprocess-template`, and
|
|
47
|
+
`vscode-task` are compatibility bridges for repo-local fallback workflows.
|
|
48
|
+
|
|
49
|
+
## Language strategy
|
|
50
|
+
|
|
51
|
+
Packet quality should not depend on one language ecosystem. JavaScript,
|
|
52
|
+
TypeScript, and Python can receive the richest early support because they are
|
|
53
|
+
common in current usage, but every language analyzer must write into the same
|
|
54
|
+
language-neutral graph and artifact contracts.
|
|
55
|
+
|
|
56
|
+
Do not keep expanding support by adding one bespoke parser per ecosystem unless
|
|
57
|
+
there is concrete repository demand or a high-value deterministic signal. The
|
|
58
|
+
current breadth of package and workspace manifest hints is enough to validate
|
|
59
|
+
the packetization approach. The next product goal is to make graph planning
|
|
60
|
+
observable, maintainable, and extensible through generic ownership hints rather
|
|
61
|
+
than through an open-ended list of file-format handlers.
|
|
62
|
+
|
|
63
|
+
The shared graph should model:
|
|
64
|
+
|
|
65
|
+
- file dependencies
|
|
66
|
+
- module/package ownership
|
|
67
|
+
- test-to-source relationships
|
|
68
|
+
- entrypoint-to-handler relationships
|
|
69
|
+
- config, schema, migration, workflow, and deployment relationships
|
|
70
|
+
- external boundary crossings such as HTTP, queues, databases, filesystems, and
|
|
71
|
+
subprocesses
|
|
72
|
+
- edge confidence, direction, and reason
|
|
73
|
+
|
|
74
|
+
Graph evidence should be treated in tiers:
|
|
75
|
+
|
|
76
|
+
- deterministic directed edges, such as imports, entrypoints, route handlers,
|
|
77
|
+
test/source links, and resolved analyzer references
|
|
78
|
+
- deterministic ownership edges, such as package, module, project, or subsystem
|
|
79
|
+
roots
|
|
80
|
+
- analyzer-supplied ownership roots, normalized into graph reference edges
|
|
81
|
+
- language-agnostic semantic affinity, such as shared unusual domain terms,
|
|
82
|
+
nearby paths, identifier overlap, or embeddings
|
|
83
|
+
|
|
84
|
+
Semantic affinity can help rank `boundary_files`, explain possible context, and
|
|
85
|
+
highlight missing deterministic extraction. It should not merge packets on
|
|
86
|
+
frequency alone because common tokens like `user`, `request`, `client`,
|
|
87
|
+
`config`, and `error` often connect unrelated code.
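
The tiering rule can be sketched as a small decision function. The sketch assumes each edge carries a tier label alongside confidence, direction, and reason. `EvidenceTier`, `GraphEdge`, and `pairRole` are hypothetical names, and a real planner would likely also weigh confidence and fan-in, which this sketch ignores.

```typescript
// Hypothetical sketch; not the planner's actual data model.
type EvidenceTier =
  | "deterministic-directed"
  | "deterministic-ownership"
  | "analyzer-ownership"
  | "semantic-affinity";

interface GraphEdge {
  from: string;
  to: string;
  tier: EvidenceTier;
  confidence: number;
  direction: "directed" | "undirected";
  reason: string;
}

type PairRole = "merge-candidate" | "context-only" | "unrelated";

// Given all edges between one pair of files, semantic affinity alone never
// makes the pair a merge candidate; it only keeps the pair visible as context.
function pairRole(edges: GraphEdge[]): PairRole {
  if (edges.length === 0) return "unrelated";
  const corroborated = edges.some((e) => e.tier !== "semantic-affinity");
  return corroborated ? "merge-candidate" : "context-only";
}
```

Keeping the tier on the edge lets the planner record why a file stayed a boundary file rather than being merged.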
|
|
88
|
+
|
|
89
|
+
Language-specific adapters should enrich the graph without changing packet or
|
|
90
|
+
result contracts:
|
|
91
|
+
|
|
92
|
+
- JS/TS: TypeScript compiler API, package manifests, import/export edges, route
|
|
93
|
+
conventions, test adjacency
|
|
94
|
+
- Python: local import statement parsing, package/module resolution,
|
|
95
|
+
pytest/unittest adjacency, and future framework route conventions
|
|
96
|
+
- Other ecosystems: prefer analyzer-supplied ownership roots, ctags/tree-sitter,
|
|
97
|
+
LSP output, or existing external analyzer data before adding new bespoke
|
|
98
|
+
manifest parsers
|
|
99
|
+
|
|
100
|
+
The fallback should remain useful even when a language has no deep analyzer:
|
|
101
|
+
manifest files, path structure, tests, config, and external analyzer output can
|
|
102
|
+
still seed a graph with lower-confidence edges.
|
|
103
|
+
|
|
104
|
+
Deterministic tool runners should be project-config aware. For example, ESLint
|
|
105
|
+
syntax-resolution should run only when the repository has repo-local ESLint
|
|
106
|
+
configuration, not merely because an ESLint binary is installed.
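
A minimal sketch of such a project-config check, assuming the caller already has the file names at the repository root and the parsed `package.json`. `hasRepoLocalEslintConfig` is a hypothetical name, the file list covers common ESLint config names but is not exhaustive, and the package's actual detection logic may differ.

```typescript
// Hypothetical sketch; common ESLint config file names, not an exhaustive list.
const ESLINT_CONFIG_FILES = new Set<string>([
  "eslint.config.js",
  "eslint.config.mjs",
  "eslint.config.cjs",
  ".eslintrc",
  ".eslintrc.js",
  ".eslintrc.cjs",
  ".eslintrc.json",
  ".eslintrc.yml",
  ".eslintrc.yaml",
]);

// True when the repository itself carries ESLint configuration, either as a
// config file at the root or as an `eslintConfig` key in package.json.
function hasRepoLocalEslintConfig(
  rootFileNames: string[],
  packageJson?: { eslintConfig?: unknown },
): boolean {
  if (rootFileNames.some((name) => ESLINT_CONFIG_FILES.has(name))) {
    return true;
  }
  return packageJson !== undefined && packageJson.eslintConfig !== undefined;
}
```

The check deliberately ignores whether an ESLint binary is installed, which matches the rule that a binary alone should not trigger the runner.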
|
|
107
|
+
|
|
108
|
+
## Packet planning
|
|
109
|
+
|
|
110
|
+
`AuditTask` remains the deterministic coverage identity. `ReviewPacket` is the
|
|
111
|
+
worker-facing unit of understanding.
|
|
112
|
+
|
|
113
|
+
The next packetization phase should:
|
|
114
|
+
|
|
115
|
+
- use planner observability to tune which edge kinds change grouping, which
|
|
116
|
+
files stay boundary-only, and which extractor gaps leave weakly explained
|
|
117
|
+
packets
|
|
118
|
+
- extend and exercise the generic ownership-root input so external analyzers
|
|
119
|
+
can say "these files belong to module root X" without a new parser for every
|
|
120
|
+
ecosystem
|
|
121
|
+
- keep graph and manifest parser code modular before broadening it further
|
|
122
|
+
- exercise deterministic Python import, package, and test/source graph support
|
|
123
|
+
on fixture and real repositories to find the next highest-value gaps
|
|
124
|
+
- use language-agnostic semantic affinity only as low-authority context unless
|
|
125
|
+
corroborated by deterministic graph evidence
|
|
126
|
+
- build packets around coherent subsystems and execution flows
|
|
127
|
+
- keep shared fan-in files visible as context instead of letting them merge too
|
|
128
|
+
much of the repository into one packet
|
|
129
|
+
- distinguish strong edges from weak or heuristic edges
|
|
130
|
+
- group tests with the code they verify when that helps review quality
|
|
131
|
+
- include packet rationale, key edges, entrypoints, and boundary files
|
|
132
|
+
- track packet-quality metrics such as cohesion, fan-in/fan-out, boundary
|
|
133
|
+
crossings, orphan tasks, weak-packet gap and extension counts, risk
|
|
134
|
+
concentration, and largest unexplained packet
|
|
135
|
+
|
|
136
|
+
The practical success bar is that packets feel like reviewable code ownership
|
|
137
|
+
or execution-flow units, not merely budget-sized bundles.
|
|
138
|
+
|
|
139
|
+
## Production readiness
|
|
140
|
+
|
|
141
|
+
The package publication path is operational. The release gate, packaged install
|
|
142
|
+
smoke tests, and GitHub Actions Trusted Publishing path are routine
|
|
143
|
+
maintenance. The remaining production work is product confidence rather than a
|
|
144
|
+
new contract shape.
|
|
145
|
+
|
|
146
|
+
Readiness should be judged through three checks:
|
|
147
|
+
|
|
148
|
+
- field-trial quality: run real repositories through planning, validate
|
|
149
|
+
artifacts, and use `audit_plan_metrics.json` to track packet count, weak
|
|
150
|
+
packet count, average cohesion, merge edge kinds, and weak-packet samples
|
|
151
|
+
- full-loop behavior: prove `prepare-dispatch`, worker review,
|
|
152
|
+
`submit-packet`, `merge-and-ingest`, selective deepening, runtime validation,
|
|
153
|
+
and final `audit-report.md` promotion in at least one real host flow
|
|
154
|
+
- release hygiene: keep `npm run verify:release`, linked smoke, packaged
|
|
155
|
+
smoke, tarball preview, and Trusted Publishing green from a clean checkout
|
|
156
|
+
|
|
157
|
+
Extractor work should follow field-trial evidence. Fix deterministic graph gaps
|
|
158
|
+
when metrics show them, prefer analyzer-supplied ownership roots before new
|
|
159
|
+
manifest parsers, and keep semantic affinity as context unless deterministic
|
|
160
|
+
evidence corroborates it.
|
|
161
|
+
|
|
162
|
+
The current production-readiness focus is:
|
|
163
|
+
|
|
164
|
+
- use the remediator packet-dispatch loop and Polar runtime-confirmed loop as
|
|
165
|
+
regression evidence for Windows runtime execution, runtime follow-up, final
|
|
166
|
+
synthesis, and report-promotion behavior
|
|
167
|
+
- use the remediator contract-link field trial as regression evidence that
|
|
168
|
+
small schema, workflow, package script, and type contract suites can become
|
|
169
|
+
graph evidence without broad directory merges
|
|
170
|
+
- rerun `remediator-lambda` after its Windows `EBUSY` test cleanup issue is
|
|
171
|
+
fixed
|
|
172
|
+
- keep exercising analyzer ownership roots on real repositories before adding
|
|
173
|
+
ecosystem-specific manifest parsers
|
|
174
|
+
- keep host setup claims aligned with verified Codex, Claude Desktop, OpenCode,
|
|
175
|
+
VS Code, and Antigravity behavior
|
|
176
|
+
- split high-concentration implementation files only after the packetization
|
|
177
|
+
and schema contracts stay easy to review
|
|
178
|
+
|
|
179
|
+
## Non-goals
|
|
180
|
+
|
|
181
|
+
- repositioning the CLI as a peer product surface
|
|
182
|
+
- making session config the normal way to redirect semantic review into a
|
|
183
|
+
second external LLM
|
|
184
|
+
- making backend implementation details outrank the conversation contract
|
|
185
|
+
- tying packetization quality to one programming language
|