@nerviq/cli 1.29.0 → 1.30.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (93)
  1. package/CHANGELOG.md +1764 -1493
  2. package/README.md +568 -538
  3. package/SECURITY.md +78 -82
  4. package/bin/cli.js +2838 -2558
  5. package/docs/api-reference.md +356 -356
  6. package/docs/audit-fix.md +109 -0
  7. package/docs/autofix.md +3 -62
  8. package/docs/getting-started.md +1 -1
  9. package/docs/index.html +592 -592
  10. package/docs/integration-contracts.md +287 -287
  11. package/docs/maintenance.md +128 -128
  12. package/docs/new-platform-guide.md +202 -202
  13. package/docs/release-process.md +63 -0
  14. package/docs/shallow-risk.md +244 -244
  15. package/docs/why-nerviq.md +82 -82
  16. package/package.json +75 -67
  17. package/sdk/README.md +12 -3
  18. package/sdk/examples/langchain-integration.md +128 -0
  19. package/sdk/examples/self-governing-agent.js +135 -0
  20. package/sdk/index.d.ts +115 -0
  21. package/sdk/index.js +94 -0
  22. package/sdk/package.json +11 -0
  23. package/src/activity.js +13 -0
  24. package/src/aider/activity.js +226 -226
  25. package/src/aider/context.js +162 -162
  26. package/src/aider/freshness.js +123 -123
  27. package/src/aider/techniques.js +3465 -3465
  28. package/src/audit/layers.js +180 -180
  29. package/src/audit.js +1133 -1032
  30. package/src/auto-suggest.js +9 -2
  31. package/src/behavioral-drift.js +37 -2
  32. package/src/benchmark.js +299 -299
  33. package/src/codex/activity.js +324 -324
  34. package/src/codex/freshness.js +149 -142
  35. package/src/codex/techniques.js +4895 -4895
  36. package/src/context.js +326 -326
  37. package/src/continuous-ops.js +11 -1
  38. package/src/convert.js +340 -340
  39. package/src/copilot/config-parser.js +280 -280
  40. package/src/copilot/context.js +218 -218
  41. package/src/copilot/freshness.js +184 -177
  42. package/src/copilot/patch.js +238 -238
  43. package/src/copilot/techniques.js +3578 -3578
  44. package/src/cursor/freshness.js +194 -194
  45. package/src/cursor/patch.js +243 -243
  46. package/src/cursor/techniques.js +3735 -3735
  47. package/src/doctor.js +201 -201
  48. package/src/fix-engine.js +511 -8
  49. package/src/formatters/csv.js +86 -86
  50. package/src/formatters/junit.js +123 -123
  51. package/src/formatters/markdown.js +164 -164
  52. package/src/formatters/otel.js +151 -151
  53. package/src/freshness.js +163 -156
  54. package/src/gemini/activity.js +402 -402
  55. package/src/gemini/context.js +290 -290
  56. package/src/gemini/freshness.js +188 -188
  57. package/src/gemini/patch.js +229 -229
  58. package/src/gemini/techniques.js +3811 -3811
  59. package/src/governance.js +533 -533
  60. package/src/harmony/audit.js +306 -306
  61. package/src/i18n.js +63 -63
  62. package/src/insights.js +119 -119
  63. package/src/integrations.js +134 -134
  64. package/src/locales/en.json +33 -33
  65. package/src/locales/es.json +33 -33
  66. package/src/migrate.js +354 -354
  67. package/src/opencode/activity.js +286 -286
  68. package/src/opencode/freshness.js +137 -137
  69. package/src/opencode/techniques.js +3450 -3450
  70. package/src/safe-glyph.js +97 -0
  71. package/src/setup/analysis.js +12 -12
  72. package/src/setup.js +13 -6
  73. package/src/shallow-risk/index.js +113 -56
  74. package/src/shallow-risk/patterns/agent-config-cross-platform-drift.js +51 -50
  75. package/src/shallow-risk/patterns/agent-config-dangerous-autoapprove.js +47 -46
  76. package/src/shallow-risk/patterns/agent-config-deprecated-keys.js +47 -46
  77. package/src/shallow-risk/patterns/agent-config-framework-version-mismatch.js +138 -0
  78. package/src/shallow-risk/patterns/agent-config-missing-file.js +318 -317
  79. package/src/shallow-risk/patterns/agent-config-script-not-in-package-json.js +108 -0
  80. package/src/shallow-risk/patterns/agent-config-secret-literal.js +52 -49
  81. package/src/shallow-risk/patterns/agent-config-stack-contradiction.js +35 -34
  82. package/src/shallow-risk/patterns/hook-script-missing.js +71 -70
  83. package/src/shallow-risk/patterns/mcp-server-no-allowlist.js +53 -52
  84. package/src/shallow-risk/shared.js +653 -648
  85. package/src/source-urls.js +295 -295
  86. package/src/state-paths.js +85 -85
  87. package/src/supplemental-checks.js +805 -805
  88. package/src/telemetry.js +160 -160
  89. package/src/watch.js +46 -0
  90. package/src/windsurf/context.js +359 -359
  91. package/src/windsurf/freshness.js +194 -194
  92. package/src/windsurf/patch.js +231 -231
  93. package/src/windsurf/techniques.js +3779 -3779
package/CHANGELOG.md CHANGED
@@ -1,1493 +1,1764 @@
1
- # Changelog
2
-
3
- All notable changes to the **Nerviq** CLI are documented in this file.
4
-
5
- The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
- and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
-
8
- ## [Unreleased]
9
-
10
- ## [1.29.0] - 2026-04-14
11
-
12
- ### Fixed: Shallow-risk FP rate reduction (CTO-06b)
13
-
14
- Tightens the shallow-risk pattern regexes based on the 60-repo FP
15
- measurement from `research/exp-cto-06-fp-measurement-2026-04-14.md`.
16
-
17
- - **`agent-config-missing-file`**: the single pattern that produced
18
- essentially all the FPs. Overnight corpus measurement found 520
19
- hits / 63.5% lower-bound FP rate across the PP-08 corpus (6.35×
20
- above the 0.10 gate).
21
-
22
- ### Impact
23
-
24
- - Corpus hits: **520 → 69 (-86.7%)**.
25
- - Lower-bound FP rate: **63.5% → 8.7%** (under the 0.10 gate).
26
- - All other 7 patterns remained at 0 hits across the corpus (nothing
27
- to tighten this pass; they were already quiet).
28
-
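The impact figures above can be re-derived from the raw numbers. A minimal sketch; `percentChange` and `isUnderGate` are hypothetical helper names used only for illustration and are not part of the CLI.

```javascript
// Hypothetical helpers (not part of @nerviq/cli) that re-derive the reported figures.

// Relative change from `before` to `after`, in percent.
function percentChange(before, after) {
  return ((after - before) / before) * 100;
}

// True when a false-positive rate sits below the gate threshold.
function isUnderGate(fpRate, gate) {
  return fpRate < gate;
}

const hitsChange = percentChange(520, 69); // corpus hits before and after tightening
console.log(hitsChange.toFixed(1));        // -86.7
console.log(isUnderGate(0.635, 0.10));     // false: the pre-tightening rate fails the gate
console.log(isUnderGate(0.087, 0.10));     // true: the post-tightening rate passes
```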
29
- ### What got tightened
30
-
31
- - Pointer regex no longer fires on:
32
- - Fenced code-example bodies.
33
- - URL-shape references.
34
- - Well-known external conventions (e.g. `.github/CODEOWNERS`,
35
- `node_modules/*`, `.git/*`, `vendor/*`).
36
- - Host-document path resolution is strict to the repo root; relative
37
- references that resolve outside the repo are now ignored
38
- instead of reported as missing.
39
- - Quote-wrapped example paths in prose (e.g. `"docs/SECURITY.md"` as
40
- an illustration in a paragraph) distinguished from bare reference
41
- paths.
42
-
43
- ### Verified
44
-
45
- - jest: **475/475** passing; this is the `475`-test verification baseline. (was 452 + 23 new negative-fixture
46
- tests in `test/shallow-risk.test.js`, each reproducing a FP
47
- eliminated this pass).
48
- - canonical CLI tests: **162/162** passing.
49
- - `npm pack --dry-run`: clean.
50
- - `node tools/validate-release-metadata.js`: validation passed for v1.29.0.
51
- - Shallow-risk now runnable on real repos without drowning the
52
- signal. Feature stays `Experimental` until the corpus measurement
53
- sits below the 0.10 gate twice in a row.
54
-
55
- Evidence: `research/exp-cto-06-fp-measurement-2026-04-14.md`
56
- updated with a "2026-04-14 tightening pass" section including
57
- per-pattern before/after.
58
-
59
- ## [1.28.0] - 2026-04-14
60
-
61
- ### Calibrated (not certified): OpenCode Platform Parity (PP-05)
62
-
63
- The last of the 8 supported platforms finally gets its calibration
64
- pass. OpenCode moves from "untouched" to "calibrated" against 10
65
- real OpenCode-using public repos. Same judgment bar as Windsurf
66
- (PP-03) and Aider (PP-04): strict-FP <5% met, all-10-≥70 not fully
67
- met. Source landed in commit `5114834`.
68
-
69
- 10-repo corpus: 8/10 scored ≥70 post-calibration. PPI stays at
70
- **0.75**; OpenCode public adoption at the mature-star tier is
71
- sparse, same judgment pattern as Windsurf/Aider. Added to
72
- `research/platform-parity-corpus.json`, evidence docs
73
- `exp-pp-09-opencode-fp-2026-04-14.md` +
74
- `exp-pp-10-opencode-external-2026-04-14.md`.
75
-
76
- ### Verified
77
-
78
- - jest: **452/452** passing — this is the `452`-test verification baseline. (was 440 + 12 new opencode-pp05
79
- regression tests).
80
- - canonical CLI tests: **162/162** passing.
81
- - `npm pack --dry-run`: clean.
82
- - `node tools/validate-release-metadata.js`: validation passed for v1.28.0.
83
- - All guard suites still green (claude-na-gates, layer-coverage,
84
- framework-native, audit-evidence, score-preview, 3 format tests,
85
- shallow-risk).
86
-
87
- **All 8 platforms now calibrated or certified:** Claude, Cursor,
88
- Codex, Copilot, Gemini (certified, PPI contribution 1.0 each) +
89
- Windsurf, Aider, OpenCode (calibrated, 0.75 base). PPI 0.75 will
90
- graduate to 0.875+ only when corpus expansion on one of
91
- Windsurf/Aider/OpenCode produces a mature-repo set passing the
92
- score floor.
93
-
94
- ## [1.27.1] - 2026-04-14
95
-
96
- ### Fixed: npm tarball completeness + Windows output encoding (MEMO wave)
97
-
98
- Addresses two real npm-user issues surfaced by the Codex CTO/CEO +
99
- Market Memo (2026-04-13 v2):
100
-
101
- - **`package.json` `files` broadened** (MEMO-17): the published
102
- tarball now includes `docs/`, `contracts/`, `sdk/README.md`,
103
- `CHANGELOG.md`, and `SECURITY.md` alongside `bin/`, `src/`, and
104
- `README.md`. Previously these docs surfaces were referenced in
105
- the README but not shipped in the npm tarball, meaning external
106
- users hit broken doc links post-install. Verified via
107
- `npm pack --dry-run`: the tarball now matches what the README
108
- promises.
109
-
110
- - **Windows output encoding** (MEMO-16): the CLI console output
111
- previously rendered mojibake on Windows cmd.exe where the runtime
112
- default code page did not support emoji (✅ ❌ ✔ ✗: U+2705 / U+274C /
113
- U+2713 / U+2717). Introduced `src/output-icons.js` as a single
114
- helper that emits clean ASCII fallbacks (`[OK]`, `[FAIL]`,
115
- `[SKIP]`, `[WARN]`) when `NERVIQ_ASCII_OUTPUT=1` or auto-detected
116
- from `process.platform === 'win32'` + non-TTY. Wired through
117
- `src/setup/runtime.js`, `src/setup.js`, `src/init.js`,
118
- `src/codex/setup.js`, `src/gemini/setup.js`, `test/run.js`.
119
- 2 new regression tests in `test/output-encoding.test.js`.
120
-
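The fallback rule described above (explicit `NERVIQ_ASCII_OUTPUT=1`, or Windows combined with a non-TTY stream) could be sketched roughly as follows. This is not the actual `src/output-icons.js`; the function names and the shape of the `runtime` argument are assumptions made for illustration.

```javascript
// Hypothetical sketch of the ASCII-fallback decision; not the real src/output-icons.js.
// `runtime` is an assumed plain object so the logic can be exercised without touching `process`.
function shouldUseAscii(runtime) {
  // An explicit opt-in always wins.
  if (runtime.env.NERVIQ_ASCII_OUTPUT === '1') return true;
  // Auto-detect: Windows plus a non-TTY stream (piped or redirected output).
  return runtime.platform === 'win32' && !runtime.isTTY;
}

// Pick between a Unicode glyph and its ASCII label for the current runtime.
function statusIcon(glyph, asciiLabel, runtime) {
  return shouldUseAscii(runtime) ? asciiLabel : glyph;
}

const piped = { env: {}, platform: 'win32', isTTY: false };
console.log(statusIcon('\u2705', '[OK]', piped)); // [OK]
```

In a real process the argument would presumably be built from `process.env`, `process.platform`, and `process.stdout.isTTY`.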
121
- ### Also this release
122
-
123
- - **7 back-dated GitHub Releases** created for v1.21.0 through
124
- v1.27.0 (MEMO-01). Previously the public GitHub release surface
125
- lagged npm by 7 versions; it now reflects the full release
126
- history.
127
- - **3 stale GitHub issues closed** (MEMO-02: #24, #25, #26):
128
- feature requests for Markdown / JUnit / CSV output that were
129
- actually shipped in v1.22.0. Each closed with a shipped-in
130
- attribution comment.
131
-
132
- ### Verified
133
-
134
- - jest: **440/440** passing — this is the `440`-test verification baseline. (was 438 + 2 new output-encoding
135
- regression tests).
136
- - canonical CLI tests: **162/162** passing.
137
- - `npm pack --dry-run`: clean, includes the broadened files set.
138
- - `node tools/validate-release-metadata.js --research <path>`:
139
- validation passed for v1.27.1.
140
-
141
- Evidence: `research/exp-memo-autonomous-wave-2026-04-14.md` in the
142
- research repo.
143
-
144
- ## [1.27.0] - 2026-04-14
145
-
146
- ### Added: Shallow Risk Mode (experimental, CTO-06)
147
-
148
- Opt-in `--shallow-risk` lane that surfaces obvious problems at the
149
- intersection of agent configuration (CLAUDE.md, `.claude/`, `.cursor/`,
150
- `.codex/`, `.aider.conf.yml`, `.windsurf/`, etc.) and the rest of
151
- the codebase. Closes the 2026-04-08 UAT trust-break where evaluators
152
- said "missed something obvious" by catching a narrow, curated set
153
- of issues **no generic scanner can find** because they require
154
- understanding agent-config semantics.
155
-
156
- Implementation follows the approved design doc v2 (commit `f425209`
157
- in the research repo, `research/exp-cto-06-shallow-risk-design-2026-04-14.md`).
158
-
159
- ### The 8 initial patterns (all NERVIQ-native)
160
-
161
- 1. **`agent-config-missing-file`** — CLAUDE.md / AGENTS.md references
162
- a repo file that doesn't exist; agent works with broken context.
163
- 2. **`agent-config-stack-contradiction`**: CLAUDE.md says "Go project"
164
- but repo is Python; agent recommends wrong tooling every session.
165
- 3. **`agent-config-cross-platform-drift`**: Two platform configs
166
- give contradictory instructions (Cursor and Claude disagree on
167
- primary language).
168
- 4. **`mcp-server-no-allowlist`** — MCP server declared with empty
169
- permissions / wildcard allow = full shell access, no guardrail.
170
- 5. **`hook-script-missing`**: Hook declared in `.claude/settings.json`
171
- but the script file doesn't exist; hook silently skipped.
172
- 6. **`agent-config-secret-literal`**: Secret token literal pasted
173
- into CLAUDE.md / agent config as "example". Narrow secret scanning
174
- scoped to our lane only (NOT broad repo secret scanning; use
175
- gitleaks / truffleHog for that).
176
- 7. **`agent-config-deprecated-keys`**: Config uses keys the platform
177
- removed in a later release (powered by our freshness manifest).
178
- 8. **`agent-config-dangerous-autoapprove`**: Auto-approve list
179
- contains destructive patterns (`rm -rf *`, `git push --force`,
180
- `drop table`). Never suppressed.
181
-
182
- ### Shallow-risk is a parallel lane: it does NOT affect the score
183
-
184
- Findings emit through `auditResult.shallowRiskHints[]` and are
185
- intentionally excluded from:
186
- - `auditResult.score`
187
- - `auditResult.organicScore`
188
- - `auditResult.passed` / `failed` / `skipped`
189
- - `auditResult.topNextActions`
190
- - `auditResult.layerSummary.*.failed`
191
-
192
- This keeps the governance pipeline stable while still surfacing
193
- agent-config ↔ codebase red flags. Score-unchanged proof on
194
- self-audit of the NERVIQ repo: governance score is **87** with and
195
- without `--shallow-risk`; only `shallowRiskHints` differs (empty
196
- vs. 17 hits).
197
-
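A consumer of the JSON output might keep the two lanes apart along these lines. Only `score` and `shallowRiskHints` are field names taken from the changelog; the per-hint fields (`severity`, `key`, `file`, `line`) are assumed from the markdown rendering description, and `summarizeAudit` is a hypothetical helper.

```javascript
// Hypothetical consumer of the audit JSON. The per-hint fields are assumptions.
function summarizeAudit(auditResult) {
  // Hints live in a parallel array; an absent array means the flag was not used.
  const hints = auditResult.shallowRiskHints || [];
  return {
    score: auditResult.score, // read as-is; hints never feed into it
    hintCount: hints.length,
    hintLines: hints.map((h) => `${h.severity} ${h.key} ${h.file}:${h.line}`),
  };
}

const sample = {
  score: 87,
  shallowRiskHints: [
    { severity: 'high', key: 'hook-script-missing', file: '.claude/settings.json', line: 12 },
  ],
};
console.log(summarizeAudit(sample).hintLines[0]);
// high hook-script-missing .claude/settings.json:12
```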
198
- ### CLI UX
199
-
200
- ```bash
201
- npx @nerviq/cli audit --shallow-risk # full audit + shallow risk
202
- npx @nerviq/cli audit --shallow-risk-only # fast precommit mode
203
- NERVIQ_SHALLOW_RISK=off npx @nerviq/cli audit --shallow-risk # kill switch
204
- ```
205
-
206
- Friendly banner rendered in text output and as a blockquote in
207
- markdown:
208
-
209
- > Shallow Risk mode (experimental, opt-in). NERVIQ checks 8 patterns
210
- > that sit at the intersection of your AI agent configuration and
211
- > your codebase: the kind of issues no generic scanner can find
212
- > because they require understanding CLAUDE.md, .claude/settings.json,
213
- > and similar files. For broader code-level security coverage, pair
214
- > this with Semgrep, CodeQL, or a dedicated secret scanner.
215
-
216
- ### Competitive positioning (explicit)
217
-
218
- NERVIQ `--shallow-risk` is **not** a replacement for Semgrep / ESLint
219
- / CodeQL / gitleaks / truffleHog / Dependabot; those tools work on
220
- source code or dependency manifests. NERVIQ works on the bridge
221
- between agent-declared intent and codebase reality. The 8 patterns
222
- reflect that lane exclusively.
223
-
224
- ### Rendering in all output formats
225
-
226
- - **JSON**: `auditResult.shallowRiskHints[]` parallel to `results[]`.
227
- - **Text**: separate `## Shallow Risk Hints (experimental, opt-in)`
228
- block after `## Top next actions`, banner inline.
229
- - **Markdown (`--format=markdown`)**: `### Shallow Risk (experimental,
230
- opt-in)` section after `### Top next actions`, banner as blockquote,
231
- each hint listed with severity / key / file:line.
232
- - **JUnit (`--format=junit`)**: separate `<testsuite name="shallow-risk">`
233
- so CI consumers can isolate or ignore it independently of the
234
- governance suite.
235
- - **CSV (`--format=csv`)**: hints appended as rows tagged
236
- `layer=shallow-risk`. Contract documented in
237
- `docs/integration-contracts.md` §7 and §8.1.
238
-
239
- ### Status: Experimental
240
-
241
- Release: `Experimental`. Graduates to `Beta` after 30 days of real
242
- telemetry with zero critical corpus-level false positives reported
243
- and at least one external user reporting a pattern caught a real
244
- issue. Graduates to `GA` after 50+ WAA using it on ≥5 distinct repos
245
- each.
246
-
247
- Reserved slots 9 and 10 are deliberately empty — they wait for 30
248
- days of user telemetry to tell us which patterns users most want
249
- that we didn't anticipate.
250
-
251
- ### Verified
252
-
253
- - jest: **438/438** passing — this is the `438`-test verification baseline. (was 419 + 19 new: 16 shallow-risk
254
- tests (positive + negative per pattern) + 3 format surface tests).
255
- - canonical CLI tests: **162/162** passing.
256
- - Guard coverage kept green: `claude-na-gates.test.js`,
257
- `layer-coverage.test.js`, `framework-native.test.js`,
258
- `audit-evidence.test.js`, `score-preview.test.js`, and the three
259
- format tests.
260
- - `npm pack --dry-run`: clean.
261
- - `node tools/validate-release-metadata.js --research <path>`:
262
- validation passed for v1.27.0.
263
- - Self-audit smoke: score unchanged (87 with and without the flag),
264
- 17 shallow-risk hints found on the NERVIQ repo itself (mostly
265
- `agent-config-missing-file` on `.claude/` docs).
266
-
267
- ### PP-08 gate
268
-
269
- Added `fp_rate_threshold_shallow_risk: 0.10` lane in
270
- `research/platform-parity-corpus.json`. Corpus FP measurement on
271
- shallow-risk patterns is a separate follow-up task (not in this
272
- release).
273
-
274
- Evidence: `research/exp-cto-06-implementation-2026-04-14.md`.
275
-
276
- ## [1.26.0] - 2026-04-14
277
-
278
- ### Fixed — Framework-native verification depth (CTO-07)
279
-
280
- Closes the trust-break documented in the 2026-04-08 UAT where Flutter
281
- + Swift projects got zero uplift from NERVIQ because valid verification
282
- commands (`xcodebuild test`, `flutter test`, `gradle test`) were
283
- treated as missing guidance, and mature Python ML + FastAPI repos
284
- flattened because NERVIQ didn't recognise existing scaffolding
285
- (pytest + `pyproject.toml` + poetry/uv + ruff/mypy).
286
-
287
- Moves KPI memo §6.5 ("Are mobile, infra, and mature repos improving
288
- with the same credibility as Node-oriented repos?") from NO → YES.
289
-
290
- - `src/instruction-surfaces.js`: broadened surface bundle so repo
291
- files like `pyproject.toml`, `Makefile`, `justfile`, `Podfile`,
292
- `Cartfile`, `pubspec.yaml`, `Rakefile`, `build.gradle*`, and
293
- `.github/workflows/*` count as verification evidence. Expanded
294
- TEST/LINT/BUILD command patterns for Flutter (`flutter test`,
295
- `flutter analyze`, `dart analyze`, `dart format`, `fvm flutter`),
296
- iOS / Swift (`xcodebuild test`, `swift test`, `fastlane test`,
297
- `swiftlint`, `swift-format lint`), Android (`./gradlew test`,
298
- `./gradlew ktlintCheck`, `./gradlew detekt`), and Python (all of
299
- `pytest`, `poetry run pytest`, `uv run pytest`, `pdm run pytest`,
300
- `hatch run test`, `tox`, `nox`, `python -m pytest`, `python -m
301
- unittest`, `ruff check`, `ruff`, `flake8`, `pylint`, `black
302
- --check`, `mypy`, `pyright`, `pre-commit run`).
303
-
304
- - `src/techniques/shared.js`: 10 new memoized stack helpers
305
- (`hasIosXcodeProject`, `hasAndroidGradle`, `hasFlutterProject`,
306
- `hasPythonPoetry`, `hasPythonUv`, `hasPythonPdm`, `hasPythonHatch`,
307
- `hasFastApiProject`, `hasMlScaffolding`, `hasConfiguredTooling`).
308
- These let stack-specific checks detect "this project HAS
309
- verification wired up" directly from repo files rather than only
310
- from CLAUDE.md / AGENTS.md mentions — legitimate evidence because
311
- an agent working in the repo can observe these files itself.
312
-
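Detecting verification commands in repo files, as described above, might look roughly like this. It is a sketch only: the command list is a small subset of the patterns named in the entry, and `findTestCommands` is a hypothetical function, not the implementation in `src/instruction-surfaces.js`.

```javascript
// Hypothetical sketch; the real pattern set in src/instruction-surfaces.js is larger.
const TEST_COMMANDS = ['flutter test', 'xcodebuild test', 'swift test', './gradlew test', 'pytest'];

// Escape regex metacharacters so a command string can be matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the known test commands that appear as whole tokens in the given file text.
function findTestCommands(text) {
  return TEST_COMMANDS.filter((cmd) =>
    new RegExp(`(^|\\s)${escapeRegExp(cmd)}(\\s|$)`, 'm').test(text)
  );
}

console.log(findTestCommands('test:\n\tpoetry run pytest -q')); // [ 'pytest' ]
```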
313
- ### Re-audit — per-archetype uplift
314
-
315
- | Archetype | Before | After | Δ | Framework FNs resolved |
316
- |---|---:|---:|---:|---|
317
- | Flutter mobile | 14 | 25 | **+11** | 4 → 1 (build cmd advisory only) |
318
- | iOS Swift | 11 | 26 | **+15** | 4 → 0 |
319
- | Python ML | 14 | 23 | **+9** | 4 → 1 |
320
- | Python FastAPI | 11 | 21 | **+10** | 4 → 1 |
321
-
322
- Average uplift: **+11.25 points**. 14/15 framework-native false
323
- negatives flipped to pass/N/A; the residual 4 × `buildCommand` are
324
- legitimately advisory (category (c)).
325
-
326
- ### What is NOT changed
327
-
328
- - No new top-level checks. Catalog count stays at 2,441.
329
- - No check semantics inverted.
330
- - No scoring weights, severity values, or rating values touched.
331
- - CTO-08 `layer` tags preserved on every check.
332
- - Claude PP-06 calibration unaffected: `strict_false_positive_keys.
333
- claude` stays empty; `claude-na-gates.test.js` passes unchanged.
334
-
335
- ### Verified
336
-
337
- - jest: **419/419** passing — this is the `419`-test verification baseline. (was 403 + 16 new framework-native
338
- regression tests organised by stack in
339
- `test/framework-native.test.js`).
340
- - canonical CLI tests: **162/162** passing.
341
- - `npm pack --dry-run`: clean.
342
- - `node tools/validate-release-metadata.js --research <path>`:
343
- validation passed for v1.26.0.
344
-
345
- Evidence: `research/exp-cto-07-framework-native-2026-04-14.md`
346
- includes the full archetype survey, before/after re-audit, and
347
- methodology note on the deterministic fixtures used in Phase 3.
348
-
349
- ## [1.25.0] - 2026-04-14
350
-
351
- ### Added: 4-layer scope clarity (CTO-08)
352
-
353
- Every check in the NERVIQ audit is now tagged with exactly one of
354
- four layers. Closes the boundary-blur gap documented in the
355
- 2026-04-14 CTO memo §6 ("Do evaluators understand the product
356
- boundary before trust breaks?") and moves KPI question §6.2 from
357
- PARTIAL → YES with measurable evidence. Source landed in commit
358
- `a8676b1`; this commit packages the release.
359
-
360
- The four layers:
361
-
362
- - **`governance`** — agent configuration posture: presence, content,
363
- and quality of agent-instruction files and platform settings.
364
- Example: `claudeMdExists`, `geminiSettingsExists`, MCP server
365
- declarations, hook presence.
366
- - **`drift`** — cross-platform consistency and declared-vs-actual
367
- alignment. Example: Harmony drift, Gemini propagation completeness,
368
- rules consistency across surfaces.
369
- - **`hygiene`** — repo-level cleanliness adjacent to agents (the
370
- engineering baseline that makes an agent's job easier). Example:
371
- `.gitignore`, CHANGELOG, SECURITY.md, LICENSE, Node version
372
- pinning, editorconfig.
373
- - **`shallow-risk`**: reserved for CTO-06 (agent-config ↔ codebase
374
- boundary hints). No checks currently populate this layer; the
375
- constant exists so formatters and downstream consumers know about
376
- it for the future.
377
-
378
- There is **no `deep-review` or `security` layer**, by design. NERVIQ
379
- audits agent configuration and the cleanliness of the repo boundary
380
- an agent operates inside. It does not perform dataflow analysis,
381
- SAST, or general code review; those are out of scope and left to
382
- dedicated tools. This is the contract that lets evaluators know
383
- where our claim to ground-truth starts and stops.
384
-
385
- ### Final layer distribution (2,441 checks)
386
-
387
- | Layer | Count | % |
388
- |---|---:|---:|
389
- | governance | 1,102 | 45.1% |
390
- | drift | 39 | 1.6% |
391
- | hygiene | 1,300 | 53.3% |
392
- | shallow-risk | 0 (reserved) | 0% |
393
-
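The percentage column follows directly from the counts. A small sketch to re-derive it; `layerShares` is a hypothetical helper and not part of `src/audit/layers.js`.

```javascript
// Hypothetical helper that re-derives the percentage column from the per-layer counts.
function layerShares(counts) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  const shares = {};
  for (const [layer, n] of Object.entries(counts)) {
    shares[layer] = ((n / total) * 100).toFixed(1) + '%';
  }
  return { total, shares };
}

const distribution = layerShares({ governance: 1102, drift: 39, hygiene: 1300, 'shallow-risk': 0 });
console.log(distribution.total);             // 2441
console.log(distribution.shares.governance); // 45.1%
```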
394
- Disambiguation rules (codified in `src/audit/layers.js` and
395
- `docs/integration-contracts.md` §8):
396
- - "Does my agent know X?" → `governance`.
397
- - "Do two places agree on X?" → `drift`.
398
- - "Does the repo have standard engineering hygiene?" → `hygiene`.
399
- - When in doubt, default to `hygiene` (a mild misclassification is
400
- recoverable; a missing tag breaks the coverage contract).
401
-
402
- ### Surfaced in every output format
403
-
404
- - **JSON**: `auditResult.results[].layer`,
405
- `auditResult.topNextActions[].layer`, and a new
406
- `auditResult.layerSummary` giving per-layer
407
- `{ total, passed, failed, skipped }`.
408
- - **Text**: "Coverage by layer:" summary block plus a small
409
- `[layer]` prefix on failed-check names.
410
- - **Markdown (`--format=markdown`)**: `layer` column in the failed-
411
- checks table; `_layer: X_` suffix on each top-action checklist item.
412
- - **JUnit (`--format=junit`)**: `layer="..."` attribute on every
413
- `<testcase>`.
414
- - **CSV (`--format=csv`)**: new `layer` column between `category`
415
- and `rating`. Updated contract in `docs/integration-contracts.md` §7.
416
-
417
- ### Verified
418
-
419
- - jest: **403/403** passing; this is the `403`-test verification baseline. (was 391 + 7 coverage tests + 5
420
- format surface tests).
421
- - canonical CLI tests: **162/162** passing.
422
- - `npm pack --dry-run`: clean.
423
- - `node tools/validate-release-metadata.js --research <path>`:
424
- validation passed for v1.25.0.
425
-
426
- Evidence: `research/exp-cto-08-layer-clarity-2026-04-14.md` includes
427
- the full distribution, ambiguous-call log, and KPI mapping.
428
-
429
- ## [1.24.0] - 2026-04-14
430
-
431
- ### Fixed: Claude calibration debt resolved (CTO-09 / PP-06)
432
-
433
- Eleven Claude audit checks that were systematically firing as
434
- false-positives on repos that did not opt in to their respective
435
- agent-config surfaces now return `N/A` (null) instead of `false`.
436
- Previously these were captured in a post-hoc allowlist
437
- (`platform-parity-fp-rules.json.strict_false_positive_keys.claude`);
438
- now the checks are honest at source.
439
-
440
- The affected keys:
441
-
442
- - `claudeLocalMd`, `autoMemoryAwareness`, `importSyntax`
443
- (in `src/techniques/instructions.js`): N/A when the repo does
444
- not opt in to the overrides/memory/import-syntax conventions.
445
- `importSyntax` becomes a positive-signal check: it passes when
446
- `@`-imports are present in CLAUDE.md, and is advisory only on
447
- long (≥80 lines) CLAUDE.md files that would clearly benefit.
448
- - `mcpServers`, `multipleMcpServers`, `context7Mcp`
449
- (in `src/techniques/tools.js`): N/A on repos that have no MCP
450
- references anywhere. A new `_repoOptsInToMcp()` helper centralises
451
- the detection.
452
- - `dockerfile`, `dockerCompose`, `terraformFiles`, `hooksNotificationEvent`,
453
- `subagentStopHook`
454
- (in `src/techniques/automation.js`) — N/A when no infra signal
455
- exists (Dockerfile/`.tf`/`docker-compose*`) or when
456
- `.claude/settings.json` has no `hooks` block. New
457
- `_repoHasInfraSignal()` and `_repoHasHooksBlock()` helpers.
458
-
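The three-state result behind these N/A gates could be sketched as below. The helper names are hypothetical, and counting `null` under `skipped` is an assumption for illustration; the changelog only states that gated checks return `null` instead of `false`.

```javascript
// Hypothetical three-state check: true = pass, false = fail, null = N/A.
function gatedCheck(repoOptsIn, passes) {
  if (!repoOptsIn) return null; // the repo never opted in, so the check does not apply
  return passes;
}

// Assumed tally: N/A results are counted as skipped rather than failed.
function tally(results) {
  const summary = { passed: 0, failed: 0, skipped: 0 };
  for (const result of results) {
    if (result === null) summary.skipped += 1;
    else if (result) summary.passed += 1;
    else summary.failed += 1;
  }
  return summary;
}

console.log(tally([gatedCheck(false, false), gatedCheck(true, false), gatedCheck(true, true)]));
// { passed: 1, failed: 1, skipped: 1 }
```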
459
- ### Impact
460
-
461
- - **PP-08 CI gate threshold restored to 0.05** (from the 0.15
462
- holding pattern). The `fp_rate_threshold_notes` in
463
- `research/platform-parity-corpus.json` documents the resolution:
464
- any drift above 0.05 is now a real regression, not a calibration
465
- debt issue.
466
- - **Claude strict-FP rate dropped from ~11.99% to 0.00%** on the
467
- cleanly-cloned repos in the PP-08 corpus (8/9 — one long-path
468
- checkout failure on Windows unrelated to CLI).
469
- - **Per-repo total failures dropped by 6–10 checks each** on Claude
470
- audits, matching the expected ~7.6 opt-in hits per repo that moved
471
- from `false` → `null`.
472
- - **`strict_false_positive_keys.claude` is now empty.** The post-hoc
473
- allowlist is no longer needed.
474
-
475
- ### Verified
476
-
477
- - jest: **391/391** passing — this is the `391`-test verification baseline. (was 369 + 22 new N/A-gate
478
- regression tests in `test/claude-na-gates.test.js`, two per key).
479
- - canonical CLI tests: **162/162** passing.
480
- - `npm pack --dry-run`: clean.
481
- - `node tools/validate-release-metadata.js --research <path>`:
482
- validation passed for v1.24.0.
483
- - PP-08 CI gate: all 6 platforms (claude, codex, cursor, gemini,
484
- windsurf, aider) PASS at the restored 0.05 threshold.
485
-
486
- Evidence: `research/exp-pp-06-claude-recalibration-debt-2026-04-14.md`
487
- updated with a Resolution section at the top (per-key table,
488
- before/after gate output, verification).
489
-
490
- ## [1.23.0] - 2026-04-14
491
-
492
- ### Added — Trust-recovery depth (CTO-04, CTO-05)
493
-
494
- Ships the two deepest items from the 2026-04-14 CTO memo — the
495
- evaluator-stated reasons trust breaks in real audits. Closing them
496
- moves KPI questions §6.3 (file-level evidence) and §6.4 (score
497
- impact before write) from NO/UNKNOWN → YES with verifiable evidence.
498
- Formatter source landed in commit `e06ae64`; this commit packages
499
- the release.
500
-
501
- - **CTO-04 — File-level evidence (`file:line:snippet`).** Every
502
- failed check that has a sensible file-level source now emits
503
- `file`, `line`, and a `snippet` (2–5 lines of context, 300-char
504
- cap) so markdown/junit/text outputs can point at real evidence
505
- rather than abstract advice.
506
- - New resolver registry in `src/audit/evidence.js` for the 20
507
- highest-hitting check keys identified in a fresh self-audit.
508
- - Survey result on self-audit of the nerviq repo: 0 of 23 failed
509
- checks previously carried evidence; **9 of 23 now do**. The
510
- remaining 14 are either category (c), "absence-of-file"
511
- checks like `claudeLocalMd` where a null pointer is the correct
512
- semantic, or roll-ups where evidence would be misleading.
513
- - Backlog of unresolved category (b) keys documented in the
514
- evidence doc. 1 deferred (`skillUsesPaths`, blocked on CTO-06).
515
- - Markdown formatter renders snippet as a fenced code block under
516
- each checklist item; JUnit formatter appends it to the
517
- `<failure>` body after `---`; CSV intentionally unchanged
518
- (snippet newlines/commas would hurt downstream parsing).
519
-
520
- - **CTO-05 — Score-impact preview before `--apply`.** Each
521
- `topNextActions` item now carries `projectedScoreDelta`,
522
- `projectedOrganicScoreDelta`, and `projectedScoreAfter` so the
523
- user sees "this fix moves score 67 → 74 (+7 pts)" before any
524
- write. Projection is computed by one O(1) recompute per top
525
- action using the existing scoring function (no extra full
526
- audits, no scoring-algorithm changes).
527
- - Text output appends ` (+N pts → X/100)` per top action.
528
- - Markdown formatter shows the same suffix inline in the
529
- checklist.
530
- - CSV adds two trailing columns
531
- `projectedScoreDelta,projectedScoreAfter` populated only
532
- for rows whose key appears in `topNextActions` (projection is
533
- per-top-action, not per-every-check); other rows leave both
534
- columns empty. Contract documented in
535
- `docs/integration-contracts.md` §7.
536
- - JUnit intentionally unchanged (testcases don't naturally carry
537
- scores).
538
-
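A constant-time projection of this kind might be sketched as follows. The scoring formula here (rounded ratio of earned to possible weight) is purely an assumption for illustration, since the changelog does not describe the real scoring function; only the field names `projectedScoreDelta` and `projectedScoreAfter` and the text suffix format come from the entry above.

```javascript
// ASSUMPTION: score = round(100 * earned / possible). The real scoring function is not shown here.
function scoreFrom(earned, possible) {
  return Math.round((earned / possible) * 100);
}

// O(1) per action: reuse the running totals instead of re-running a full audit.
function projectAction(totals, action) {
  const before = scoreFrom(totals.earned, totals.possible);
  const after = scoreFrom(totals.earned + action.weight, totals.possible);
  return { key: action.key, projectedScoreDelta: after - before, projectedScoreAfter: after };
}

// Text-output suffix in the format quoted above.
function formatSuffix(projection) {
  return ` (+${projection.projectedScoreDelta} pts → ${projection.projectedScoreAfter}/100)`;
}

const projection = projectAction({ earned: 67, possible: 100 }, { key: 'exampleCheck', weight: 7 });
console.log(formatSuffix(projection)); // " (+7 pts → 74/100)"
```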
539
- ### Verified
540
-
541
- - jest: **369/369** passing — this is the `369`-test verification baseline. (was 354 + 9 new
542
- evidence tests + 3 new score-preview tests + 3 markdown extensions
543
- + 1 junit extension + 2 csv extensions).
544
- - canonical CLI tests: **162/162** passing.
545
- - `npm pack --dry-run`: clean (213 files, 757 kB).
546
- - `node tools/validate-release-metadata.js --research <path>`:
547
- validation passed for v1.23.0.
548
-
549
- Evidence: `research/exp-cto-04-05-trust-recovery-2026-04-14.md`
550
- in the research repo (~263 lines) includes the full per-check
551
- survey, worked projection example, markdown + CSV samples with
552
- the new fields, and explicit mapping back to the 8 memo KPI
553
- questions.
554
-
555
- ## [1.22.0] - 2026-04-14
556
-
557
- ### Added: CI output format pack (CTO-01, CTO-02, CTO-03)
558
-
559
- Three new output formats for `nerviq audit`, designed to plug the CLI
560
- straight into standard CI surfaces. Closes the "Markdown PR comment /
561
- JUnit XML / CSV" gap called out in the 2026-04-14 CTO memo §8 — the
562
- plumbing required before "no serious multi-agent repo merges without
563
- a Nerviq check" is even claimable as positioning.
564
-
565
- - **`--format=markdown` (CTO-01)**: GitHub-flavoured markdown
566
- suitable for a PR comment. Includes a `## Score: N/100` header with
567
- shields.io badge, a `### Top next actions` task-list checklist (up
568
- to 5 items, each with severity + key + optional `file:line`), a
569
- collapsible `<details>` block listing all failed checks in a pipe
570
- table, and a `Generated by [Nerviq](https://nerviq.net)` footer.
571
- Pipe characters inside cells are backslash-escaped. No raw HTML
572
- beyond `<details>` / `<summary>`.
573
-
574
- - **`--format=junit` (CTO-02)** — Jenkins-compatible JUnit XML.
575
- `<testsuites name="nerviq" tests="N" failures="F" skipped="S">`
576
- root, one `<testsuite>` per check category, one `<testcase>` per
577
- check (`classname=category`, `name=key`). Failed checks emit
578
- `<failure message="..." type="SEVERITY">` with body containing
579
- `name [at file:line] [(sourceUrl)]`. Skipped checks emit `<skipped/>`.
580
- All attribute values + text nodes XML-escape `& < > " '`. Parses
581
- cleanly with GitHub Actions test reporter, GitLab JUnit reporter,
582
- and Jenkins JUnit plugin.
583
-
584
- - **`--format=csv` (CTO-03)**: RFC 4180 CSV. Header row
585
- `key,id,name,category,rating,severity,passed,file,line,sourceUrl,fix`
586
- followed by one row per check. Fields containing comma, double-quote,
587
- CR, or LF are wrapped in double-quotes; internal double-quotes are
588
- escaped by doubling. No UTF-8 BOM (avoids pandas / Excel friction).
589
- LF line separator.
590
-
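The quoting rules listed for `--format=csv` can be sketched as follows. The function names `escapeCsvField` and `toCsv` are hypothetical and the sample rows are invented. Only the rules themselves come from the entry above: quote on comma, double-quote, CR, or LF; double any internal quotes; LF separator; no BOM.

```javascript
// Quote a single CSV field per the rules listed for --format=csv.
function escapeCsvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  // Only fields containing comma, double-quote, CR, or LF need quoting.
  if (!/[",\r\n]/.test(text)) return text;
  // Internal double-quotes are escaped by doubling them.
  return `"${text.replace(/"/g, '""')}"`;
}

// Join fields into rows; rows are separated by LF and no BOM is emitted.
function toCsv(rows) {
  return rows.map((row) => row.map(escapeCsvField).join(",")).join("\n");
}

// Invented sample data, for illustration only.
const csv = toCsv([
  ["key", "name", "fix"],
  ["claudeMd", 'Add "CLAUDE.md"', "Run setup, then re-audit"],
]);
console.log(csv);
```

Fields that need no quoting are emitted bare, which keeps the output readable in diffs while staying parseable by RFC 4180 readers.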
591
- Wired into `bin/cli.js` `--format` switch alongside existing
592
- `json|sarif|otel`. Format contracts documented in
593
- `docs/integration-contracts.md` §7 as the stable consumer API for
594
- downstream wrappers (GitHub Actions, Jenkins plugins, GitLab reporters,
595
- dashboards); bind to these shapes rather than scraping text output.
596
-
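For the `--format=junit` contract, the escaping rule (`& < > " '` in attribute values and text nodes) might look like the sketch below. The element and attribute layout follows the CTO-02 entry. The function names and the shape of the `check` object are assumptions, and the failure body is simplified to the check name only.

```javascript
// Escape the five characters the changelog lists for JUnit output: & < > " '
function escapeXml(value) {
  const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;" };
  return String(value).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical renderer for one failed check. The input object shape is an
// assumption; the optional file:line and sourceUrl parts are left out.
function renderFailedTestcase(check) {
  return (
    `<testcase classname="${escapeXml(check.category)}" name="${escapeXml(check.key)}">` +
    `<failure message="${escapeXml(check.message)}" type="${escapeXml(check.severity)}">` +
    `${escapeXml(check.name)}</failure></testcase>`
  );
}
```

Escaping every interpolated value, rather than only the ones expected to contain markup, keeps the output well-formed even when a check name or message includes quotes or angle brackets.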
597
- ### Verified
598
-
599
- - jest: **354/354** passing; this is the `354`-test verification baseline. (was 335 + 19 new format tests:
600
- `test/format-markdown.test.js`, `test/format-junit.test.js`,
601
- `test/format-csv.test.js` covering field shape, escaping rules,
602
- edge cases like missing `file:line`, and full round-trip parse
603
- on synthetic audit results).
604
- - canonical CLI tests: **162/162** passing.
605
- - `npm pack --dry-run`: clean (212 files, 754 kB).
606
- - `node tools/validate-release-metadata.js --research <path>`:
607
- validation passed for v1.22.0.
608
-
609
- Evidence: `research/exp-cto-01-03-formats-2026-04-14.md` in the
610
- research repo includes sample outputs and a GitHub Actions integration
611
- recipe.
612
-
613
- ## [1.21.0] - 2026-04-14
614
-
615
- ### Calibrated (not certified) Aider platform audit (PP-04)
616
-
617
- Aider platform audit recalibrated against 10 real Aider-using repos
618
- (`Aider-AI/aider`, `sysown/proxysql`, `Provenance-Emu/Provenance`,
619
- `disler/always-on-ai-assistant`, `SquirrelJME/SquirrelJME`, `ad-si/tu`,
620
- `Aider-AI/conventions`, `commit-0/commit0`, `roychri/mcp-server-asana`,
621
- `attestate/kiwistand`).
622
-
623
- Seven systematic 10/10 false-positives eliminated:
624
-
625
- - `aiderUndoSafetyAware` (10/10 → 5/10)
626
- - `aiderEditorModelConfigured` (10/10 → 0/10)
627
- - `aiderWeakModelConfigured` (10/10 → 5/10)
628
- - `aiderModelSettingsFileExists` (10/10 → 5/10)
629
- - `aiderAiderignoreExists` (10/10 → 5/10)
630
- - `aiderEnvFileExists` (10/10 → 5/10) — true FP: `.env` is gitignored;
631
- now accepts `.env.example` / `.sample` / `.template`.
632
- - `aiderAllConfigSurfacesPresent` (10/10 → 5/10): true FP, same root cause.
633
-
634
- Four additional ≥9/10 FPs sharply reduced: `aiderGitHooksForPreCommit` 9→3,
635
- `aiderBrowserModeForDocs` 9→5, `aiderPlaywrightUrlScraping` 9→4,
636
- `aiderVersionPinned` 9→0 (N/A on non-Python projects).
637
-
638
- Six opt-in tuning knobs converted to pass-or-N/A semantics:
639
- `aiderMapTokensConfigured`, `aiderEditFormatConfigured`,
640
- `aiderArchitectModeAvailable`, `aiderCachePromptsEnabled`,
641
- `aiderCommitPrefixConfigured`, `aiderVoiceModeAware` — they no longer
642
- fire as advisories on repos that do not opt in.
643
-
644
- Newly recognised conventions: `.aider.conf.yaml` (alt extension),
645
- `AGENTS.md` / `CLAUDE.md` / `.ai/instructions.md` / `AIDER.md` as
646
- alternative convention surfaces, `.env.example` / `.sample` / `.template`
647
- as env-contract surfaces.
648
-
649
- 10-repo corpus moved from baseline 38–64 → final 44–82. 2/10 reach ≥70
650
- (kiwistand 82, proxysql 72). The other 8 are below 70 due to documented
651
- genuine content gaps in the audited repos themselves, not audit bugs.
652
-
653
- **Why "calibrated, not certified":** same judgment as Windsurf (PP-03).
654
- Strict-FP <5% bar is met; all-10-≥70 + mature-repos-≥73 bar is not,
655
- because public Aider adoption above 500 stars is sparse. PPI stays at
656
- **0.75** until corpus expansion.
657
-
658
- ### Fixed release drift guard prefers `-main` worktrees
659
-
660
- `tools/validate-release-metadata.js` now prefers `../nerviq-research-main`
661
- and `../nerviq-site-main` when those worktrees exist, falling back to
662
- `../nerviq-research` / `../nerviq-site` otherwise. When a parallel-agent
663
- worktree on a feature branch occupies the canonical `nerviq-research`
664
- directory, the drift guard was reading the feature-branch state and
665
- refusing publish even though the actual main branch was synced.
666
- Single-worktree setups are unaffected.
667
-
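The lookup order described for the drift guard amounts to "prefer the `-main` sibling when it exists, otherwise fall back". A minimal sketch follows, with a hypothetical function name and an injected `exists` predicate so the example runs without touching the filesystem. It is not the code in `tools/validate-release-metadata.js`.

```javascript
// Hypothetical sketch of the lookup order; `exists` is injected so the
// example needs no real directories.
function pickSiblingRepo(baseName, exists) {
  const preferred = `../${baseName}-main`;
  const fallback = `../${baseName}`;
  // Single-worktree setups have no "-main" sibling and use the fallback.
  return exists(preferred) ? preferred : fallback;
}

// With a "-main" worktree present, it wins over the canonical directory.
const present = new Set(["../nerviq-research-main", "../nerviq-research"]);
console.log(pickSiblingRepo("nerviq-research", (p) => present.has(p)));
```

In a real tool, `exists` would typically be something like `fs.existsSync` resolved against the repository root.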
668
- ### Verified
669
-
670
- - jest: **335/335** passing — this is the `335`-test verification baseline.
671
- - canonical CLI tests: **162/162** passing.
672
- - aider matrix: **315/315** passing (was 308, +6 PP-04 regression tests).
673
- - `npm pack --dry-run`: clean.
674
- - `node tools/validate-release-metadata.js --research <path>`: validation
675
- passed for v1.21.0.
676
- - PP-08 CI gate: all 6 platforms (claude, codex, cursor, gemini, windsurf,
677
- aider) PASS at the current threshold.
678
-
679
- ## [1.20.1] - 2026-04-14
680
-
681
- ### Fixed Critical: bin/cli.js shebang regression
682
-
683
- `bin/cli.js` was missing the `#!/usr/bin/env node` shebang since v1.16.x (commit `40c27b8` on 2026-04-12, which fixed a macOS pipe-flush issue and accidentally dropped the shebang while restructuring the file). Without a shebang, `npx @nerviq/cli` failed on Linux and Mac because the OS fell back to `/bin/sh` and tried to execute JavaScript as a shell script (`//: Permission denied / Syntax error`). Windows installs were unaffected because npm generates `.cmd` wrappers that invoke `node` explicitly.
684
-
685
- This was discovered when wiring up the PP-08 CI gate against `npx @nerviq/cli@1.20.0`. Likely affected production users on Linux/macOS doing fresh `npx` installs since 2026-04-12.
686
-
687
- - Restored `#!/usr/bin/env node` as the first line of `bin/cli.js`.
688
- - Added `test/bin-shebang.test.js` regression test that scans every `bin` entry in `package.json` and asserts the shebang exists. Will catch any future drop of the shebang line on any bin script.
689
-
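A regression check of the kind described (scan every `bin` entry in `package.json` and assert the shebang) could be sketched like this. The helper names are hypothetical, the file reader is injected so the example is self-contained, and requiring an exact match on `#!/usr/bin/env node` is an assumption; the real `test/bin-shebang.test.js` may differ.

```javascript
// Collect bin script paths from a parsed package.json.
// npm allows "bin" to be a single string or a name-to-path map.
function binPaths(pkg) {
  if (!pkg.bin) return [];
  return typeof pkg.bin === "string" ? [pkg.bin] : Object.values(pkg.bin);
}

// Return the bin scripts whose first line is not the expected shebang.
// `readFile` is injected (for example a wrapper around fs.readFileSync).
function missingShebangs(pkg, readFile) {
  return binPaths(pkg).filter((file) => {
    const firstLine = readFile(file).split(/\r?\n/, 1)[0];
    return firstLine !== "#!/usr/bin/env node";
  });
}

// Invented fixture: one script has the shebang, the other lost it.
const fixtureFiles = {
  "bin/cli.js": "#!/usr/bin/env node\nrequire('../src/main');\n",
  "bin/other.js": "// no shebang here\n",
};
const fixturePkg = { bin: { nerviq: "bin/cli.js", other: "bin/other.js" } };
console.log(missingShebangs(fixturePkg, (file) => fixtureFiles[file]));
```

A test would then assert that the returned list is empty, so dropping the first line of any bin script fails the suite.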
690
- ### Fixed claudeMdContent pointer expansion accepts `@` imports
691
-
692
- `ProjectContext.claudeMdContent()` in `src/context.js` recognizes when CLAUDE.md is a thin pointer to another file (e.g., `AGENTS.md`) and expands it. The expansion regex `/^[a-zA-Z0-9_./-]+\.(md|txt|rst)$/` did not accept Claude Code's standard `@`-prefixed import syntax (`@AGENTS.md`, `@./docs/CODING.md`). Repos using the standard syntax saw all memory/prompting/quality checks fail because the auditor only saw the 1-line pointer.
693
-
694
- Discovered while investigating the NERVIQ site's self-dogfood score (25 → 85 after this fix plus content enrichment).
695
-
696
- - Updated regex to `/^@?\.?\/?[a-zA-Z0-9_./-]+\.(md|txt|rst)$/`; resolver strips `@` and `./` prefixes before `fileContent()`.
697
- - Added `test/context.test.js` (+6 tests) covering raw content, bare-filename pointer, `@`-prefix, `@./`-prefix, nested-subdir, and null-fixture cases.
698
-
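To illustrate the updated pointer detection, the sketch below applies the regex quoted above and then strips the `@` and `./` prefixes. The regex is copied from the entry; the function name `resolvePointer`, the `trim()` step, and the `null` return for ordinary content are assumptions rather than the actual `ProjectContext` code.

```javascript
// Regex quoted from the changelog entry above.
const POINTER_RE = /^@?\.?\/?[a-zA-Z0-9_./-]+\.(md|txt|rst)$/;

// Hypothetical resolver: returns the target path when the file body is a
// one-line pointer, or null when it is ordinary content.
function resolvePointer(content) {
  const line = content.trim();
  if (!POINTER_RE.test(line)) return null;
  // Strip the "@" import marker first, then a leading "./".
  return line.replace(/^@/, "").replace(/^\.\//, "");
}

console.log(resolvePointer("@./docs/CODING.md")); // docs/CODING.md
console.log(resolvePointer("# Project\nRun the tests before committing.")); // null
```

Multi-line content never matches because the pattern is anchored at both ends and the character class excludes whitespace.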
699
- ### Added — `prepublishOnly` lifecycle script
700
-
701
- `package.json` now wires the existing pre-publish drift guard (`tools/pre-publish.js`) to npm's `prepublishOnly` lifecycle, in addition to the manual `prepublish:check` alias. `npm publish` now blocks automatically on dirty tree, branch drift, missing CHANGELOG entry, jest failure, or release-metadata drift. `npm pack --dry-run` does not trigger it (verified) so local development is unaffected.
702
-
703
- ### Calibrated (not certified) Windsurf platform audit (PP-03)
704
-
705
- Windsurf platform audit recalibrated against 10 real Windsurf-using repos (`grapeot/devin.cursorrules`, `hyper-mcp-rs/hyper-mcp`, `dxos/dxos`, `snowflakedb/gosnowflake`, `ShareX/XerahS`, `Brawl345/Image-Reverse-Search-WebExtension`, `rudrankriyam/Ichi`, `snyk/snyk-intellij-plugin`, `wepublish/wepublish`, `AmadeusITGroup/otter`).
706
-
707
- Three systematic 10/10 false-positives eliminated:
708
- - `windsurfMemoriesConfigured` — opt-in memories surface; now N/A when absent.
709
- - `windsurfPackMcpRecommended` — opt-in MCP recommendation; now N/A when absent.
710
- - `windsurfAdvisoryMcpHealth` — **real bug fix**: was reading the host's `os.platform()` and asserting it inside the audited repo's advisory. Now host-agnostic; uses repo-local evidence only (Windows/WSL gate generalised).
711
-
712
- Other improvements: pointer/`@import` expansion for Windsurf instruction surfaces (`.windsurf/rules/*`, `WINDSURF.md`, pointer files like `.ai/instructions.md`), `.windsurfrules/` directory form support, fallback to `AGENTS.md`/`CLAUDE.md` for stack-marker generalisation, frontmatter realism for `.mdc` files.
713
-
714
- 10-repo corpus moved from baseline 9–70 → final 32–83. 7/10 ≥70. The 3 below 70 (hyper-mcp 69, Ichi 64, wepublish 60) are documented genuine content-depth gaps in the audited repos themselves, not audit bugs. The 32 outlier (`grapeot/devin.cursorrules`) uses the deprecated single-file `.windsurfrules` legacy format.
715
-
716
- **Why "calibrated, not certified":** Gemini PP-02 cleared "all 10 ≥70" and "all mature (>10K stars) ≥73". Windsurf cleared the strict-FP <5% bar (the primary criterion), but Windsurf public adoption is thinner than Gemini at equivalent star thresholds: the largest mature repo found was 5.9K stars. PPI stays at **0.75** until corpus expansion produces a mature-repo set passing the score floor. No inflated PPI claim shipped.
717
-
718
- ### Verified
719
-
720
- - jest: **335/335** passing (was 326 + 6 new context tests + 3 new shebang tests) — this is the `335`-test verification baseline.
721
- - canonical CLI tests: **162/162** passing.
722
- - matrix: **311/0** passing.
723
- - `npm pack --dry-run`: clean.
724
- - `node tools/validate-release-metadata.js --research ../nerviq-research-main`: validation passed.
725
-
726
- ## [1.20.0] - 2026-04-13
727
-
728
- ### Fixed — Gemini Platform Parity (PP-02, 10-repo calibration)
729
-
730
- Gemini becomes the **5th certified platform** (PPI 0.625 → **0.75**). Calibrated against 10 real Gemini-using repos (google-gemini/gemini-cli, google-gemini/cookbook, GoogleCloudPlatform/generative-ai, obra/superpowers, JuliusBrussee/caveman, google/site-kit-wp, google/dotprompt, vdesabou/kafka-docker-playground, OthmanAdi/planning-with-files, mscraftsman/generative-ai).
731
-
732
- Key calibrations:
733
- - `_expandGeminiMdImports` resolves `@path.md` imports and single-line-pointer `GEMINI.md` files (observed in google/dotprompt).
734
- - Fallback chain for Gemini instruction surface: AGENTS.md → CLAUDE.md → `.gemini/styleguide.md` (Gemini Code Assist convention).
735
- - `isMcpOnlySettings` helper: 5 CLI-behaviour checks go N/A on MCP-only `.gemini/settings.json`.
736
- - `geminiSettingsExists` / `geminiCommandsExist` now N/A when the directory is absent rather than flagging a failure, since these surfaces are opt-in.
737
- - Broadened `docsBundle` to accept AGENTS/CLAUDE/CONTRIBUTING/ARCHITECTURE/DEVELOPMENT as documentation evidence.
738
- - `geminiEnvApiKey` credits ADC, Vertex AI, `gemini auth`, and service-account flows (not just `GEMINI_API_KEY`).
739
- - Tightened `geminiPropagationCompleteness`: the bare word "skills" was firing FPs.
740
- - **Bug fix:** `context.fileName` can legally be an array per the Gemini CLI schema. `path.join` crashed with `TypeError` on `google/site-kit-wp`. Now handled.
741
-
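The `context.fileName` crash can be avoided by normalising the value to a list before building paths, since `path.join` throws a `TypeError` when handed an array. The sketch below shows one way to do that. The function name `contextFilePaths` is hypothetical, and `path.posix.join` is used only to keep the output identical across operating systems.

```javascript
const path = require("node:path");

// Normalise `context.fileName` (a string or an array of strings) into a
// list of candidate paths under the repo root. Hypothetical helper.
function contextFilePaths(repoRoot, fileName) {
  const names = Array.isArray(fileName) ? fileName : [fileName];
  return names
    // Ignore missing or non-string entries instead of passing them to join.
    .filter((name) => typeof name === "string" && name.length > 0)
    .map((name) => path.posix.join(repoRoot, name));
}

console.log(contextFilePaths("/repo", ["GEMINI.md", "AGENTS.md"]));
console.log(contextFilePaths("/repo", "GEMINI.md"));
```

Both the single-string and the array form end up on the same code path, so downstream checks only ever deal with a list.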
742
- ### Measured (strict FP <5% across 10-repo corpus)
743
-
744
- | Repo | Stars | Before | After |
745
- |---|---|---|---|
746
- | obra/superpowers | 148K | 73 | **88** |
747
- | google-gemini/gemini-cli | 101K | 74 | **89** |
748
- | JuliusBrussee/caveman | 21K | 75 | **94** |
749
- | OthmanAdi/planning-with-files | 18K | 72 | **73** |
750
- | google-gemini/cookbook | 17K | 73 | **94** |
751
- | GoogleCloudPlatform/generative-ai | 17K | 73 | **88** |
752
- | google/site-kit-wp | 1.4K | crash | **78** |
753
- | vdesabou/kafka-docker-playground | 778 | 68 | **83** |
754
- | google/dotprompt | 507 | 64 | **75** |
755
- | mscraftsman/generative-ai | 206 | 64 | **70** |
756
-
757
- All 10 repos ≥ 70; all 6 mature repos (>10K stars) ≥ 73.
758
-
759
- - **Gemini Platform Parity: certified**. PPI: 0.625 → **0.75** (Claude + Cursor + Codex + Copilot + Gemini).
760
-
761
- 326/326 tests pass (+2 PP-02 regressions on top of v1.19.0's 324) — this is the `326`-test verification baseline.
762
-
763
- ## [1.19.0] - 2026-04-13
764
-
765
- ### Added
766
- - **EXP-04: `nerviq audit --fix` autofix flow**. `audit --fix` now runs the audit, applies fixable critical fixes, writes rollback manifests for successful writes, and re-audits before returning an exit code.
767
- - **Autofix docs**. Added `docs/autofix.md` with command examples, safety behavior, and exit-code semantics for the new one-shot flow.
768
- - **GOV-03: Time-to-First-Value benchmark** (`tools/ttfv-benchmark.py`). Measured harness across 4×4 install/repo combos; verdict on "<2 min" claim: TRUE (slowest median 16.1s on npx cold × nerviq-research).
769
-
770
- ### Changed
771
- - **Shared fix engine now covers instruction-surface autofix**. Missing `CLAUDE.md`, verification guidance, and safe hygiene templates can now be applied through the same fix pipeline used by the CLI write paths.
772
-
773
- ### Tests
774
- - Added `test/audit-fix.test.js` coverage for dry-run, auto-apply, rollback artifacts, `DO NOT AUTOEDIT` safety skips, exit-code handling, and hygiene rollback verification.
775
-
776
- 324/324 tests pass.
777
-
778
- ## [1.18.0] - 2026-04-13
779
-
780
- ### Fixed Copilot Platform Parity (PP-01, 10-repo calibration)
781
-
782
- - **Copilot audit now recognizes real-world repo conventions.** Calibrated against 10 active Copilot-using repos (home-assistant/core, block/goose, microsoft/vscode, astral-sh/uv, microsoft/playwright, langchain-ai/langchain, microsoft/typescript-go, microsoft/semantic-kernel, dotnet/aspire, github/awesome-copilot).
783
- - **JSONC tolerance in `.vscode/settings.json`**: parser now strips comments/trailing commas before evaluation (Copilot/VSCode honor JSONC; strict-JSON parsing produced false CP-B06 failures).
784
- - **Context fallback for AGENTS.md / CLAUDE.md**: repos that centralize agent guidance in AGENTS.md or CLAUDE.md at repo root are no longer penalized for `.github/copilot-instructions.md` substance checks.
785
- - **Stack-docs bundle helper**: 45 stack/domain checks now accept a documented bundle of per-stack signals (pyproject.toml + ruff.toml, Cargo.toml + rustfmt.toml, go.mod + golangci.yml, etc.) rather than requiring a single canonical file.
786
-
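The JSONC tolerance described for `.vscode/settings.json` (strip comments and trailing commas, then parse as JSON) can be sketched with a small string-aware scanner. This is an illustrative implementation under stated assumptions, not the package's parser: `stripJsonc` is a hypothetical name and the sample settings are invented.

```javascript
// Minimal JSONC-to-JSON pass: drops // and /* */ comments and trailing
// commas while leaving string contents untouched.
function stripJsonc(text) {
  let out = "";
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      out += ch;
      if (ch === "\\") out += text[++i] ?? ""; // copy the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
      out += ch;
    } else if (ch === "/" && text[i + 1] === "/") {
      while (i < text.length && text[i] !== "\n") i++; // skip to end of line
      i--; // let the loop re-read the newline so it is kept
    } else if (ch === "/" && text[i + 1] === "*") {
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 1; // jump past the closing */
    } else {
      out += ch;
    }
  }

  // Second pass: remove commas that are followed only by whitespace and } or ].
  let result = "";
  inString = false;
  for (let i = 0; i < out.length; i++) {
    const ch = out[i];
    if (inString) {
      result += ch;
      if (ch === "\\") result += out[++i] ?? "";
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    if (ch === "," && /^\s*[}\]]/.test(out.slice(i + 1))) continue;
    result += ch;
  }
  return result;
}

// Invented settings sample with comments, a URL containing "//", and trailing commas.
const sample = '{\n  // comment\n  "editor.formatOnSave": true, /* inline */\n  "url": "http://example.com//x",\n  "list": [1, 2,],\n}';
console.log(JSON.parse(stripJsonc(sample)));
```

Tracking string state in both passes is what keeps values such as URLs containing `//` intact; a plain regex replace would truncate them.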
787
- ### Measured (strict FP rate < 5% across 10-repo corpus)
788
-
789
- | Repo | Stars | Before | After |
790
- |---|---|---|---|
791
- | home-assistant/core | 86K | 42 | **76** |
792
- | block/goose | 41K | 41 | **76** |
793
- | microsoft/vscode | 183K | 46 | **61** |
794
- | astral-sh/uv | 83K | 28 | **75** |
795
- | microsoft/playwright | 86K | 46 | **66** |
796
- | langchain-ai/langchain | 133K | 23 | **65** |
797
- | microsoft/typescript-go | 25K | | **66** |
798
- | microsoft/semantic-kernel | 27K | 33 | **53** |
799
- | dotnet/aspire | 6K | 35 | **59** |
800
- | github/awesome-copilot | | 45 | **59** |
801
-
802
- All 10 repos ≥ 40; all 9 mature repos (>10K stars) ≥ 53.
803
-
804
- - **Copilot Platform Parity: certified**. PPI: 0.5 → **0.625** (Claude + Cursor + Codex + Copilot).
805
-
806
- ### Added
807
- - EXPERIMENTAL qualifiers surfaced consistently on all user-facing Synergy mentions in README, docs/why-nerviq.md, docs/api-reference.md (SYN-04 audit).
808
-
809
- 317/317 tests pass.
810
-
811
- ## [1.17.3] - 2026-04-12
812
-
813
- ### Fixed Codex Platform Parity (Issue #35, 10-repo scale-up)
814
-
815
- - **Hook checks now require Codex-specific evidence**. `hooksClaimed()` previously matched any generic 'hook' mention in AGENTS.md, triggering FPs on git hooks, React hooks, or dependency names like 'hookable'. It now requires .codex/hooks/, .codex/hooks.json, [hooks]/codex_hooks in config.toml, specific Codex event names (SessionStart, PreToolUse, PostToolUse, UserPromptSubmit), or an explicit 'codex hooks' phrase. Fixes jessfraz/dotfiles, ModelEngine-Group/fit-framework, finbarr/yolobox.
816
- - **codexPackRecommendationQuality accepts .NET / Gradle manifests**. Added .sln, .slnx, .csproj, .fsproj, .vbproj, Directory.Packages.props, Directory.Build.props, global.json, gradlew. Fixes Megabit/Blazorise.
817
- - **codexNoInstructionContradictions ignores line-ending guidance**. CRLF/LF/trailing-newline/EOF rules are style preferences, not logical contradictions.
818
- - **codexAgentsMd accepts .codex/AGENTS.md**. Some repos store AGENTS.md inside .codex/.
819
-
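The tightened hook evidence rule from the first bullet of this list can be sketched as a predicate over repo-local inputs. The evidence kinds (hook files under `.codex/`, a `[hooks]` table or `codex_hooks` key in `config.toml`, the four event names, the phrase 'codex hooks') come from the entry. The function name, the argument shape, and the exact regexes are assumptions.

```javascript
// Event names and the phrase are taken from the changelog entry.
const CODEX_EVENT_RE = /\b(SessionStart|PreToolUse|PostToolUse|UserPromptSubmit)\b/;
const CODEX_PHRASE_RE = /codex hooks/i;

// Hypothetical predicate: `files` is a list of repo-relative paths,
// `agentsText` the AGENTS.md body, `configToml` the config.toml body.
function hasCodexHookEvidence({ files = [], agentsText = "", configToml = "" }) {
  const hasHookFiles = files.some(
    (f) => f === ".codex/hooks.json" || f.startsWith(".codex/hooks/")
  );
  // Assumption: "[hooks]/codex_hooks" means a [hooks] table or a codex_hooks key.
  const hasTomlHooks = /^\s*\[hooks\]|codex_hooks/m.test(configToml);
  return (
    hasHookFiles ||
    hasTomlHooks ||
    CODEX_EVENT_RE.test(agentsText) ||
    CODEX_PHRASE_RE.test(agentsText)
  );
}

// Generic "hook" mentions no longer count as evidence.
console.log(hasCodexHookEvidence({ agentsText: "Install git hooks; we use React hooks and hookable." }));
```

Because none of the patterns match the bare word "hook", mentions of git hooks, React hooks, or packages such as hookable fall through to `false`.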
820
- ### Measured
821
- - jessfraz/dotfiles: 50 → 67 (hook FPs removed, +17 points)
822
- - Codex strict FP rate: 5.98% → <5% on 10-repo scale-up
823
- - **Codex Platform Parity: certified**. PPI: 0.375 → **0.5** (Claude + Cursor + Codex)
824
-
825
- 315/315 tests pass.
826
-
827
- Closes #35
828
-
829
- ## [1.17.2] - 2026-04-12
830
-
831
- ### Fixed
832
- - **`.codex/AGENTS.md` now recognized as a valid Codex instruction surface**. `agentsMdPath()` previously only checked root `AGENTS.md`, missing the emerging pattern of keeping Codex instructions inside `.codex/` (e.g., jessfraz/dotfiles stores a 12KB AGENTS.md there). This fix cascades to every check that reads `agentsContent()`, including `codexPackRecommendationQuality` — the last remaining FP in Codex re-validation.
833
-
834
- ### Measured
835
- - jessfraz/dotfiles: 47 → 50, `codexPackRecommendationQuality` FAIL → PASS
836
- - Codex strict FP rate: <5% across both re-validation repos → ready to scale to 10
837
-
838
- ## [1.17.1] - 2026-04-12
839
-
840
- ### Fixed Platform Parity re-validation (after v1.17.0)
841
-
842
- - **codexPythonPackageStructure (CX-PY19)**: Now probes common package layouts directly via filesystem scan instead of relying on `ctx.files` (which only lists root entries). Correctly detects `src/<package>/__init__.py` and flat `<package>/__init__.py` layouts. Fixes false negative on openai/openai-agents-python.
843
- - **codexPackRecommendationQuality (CX-N03)**: Returns N/A for dotfiles/config-only repos (detected via 2+ signals from `.zshrc`, `.bashrc`, `.vimrc`, `.tmux.conf`, `.gitconfig`, `install.sh`, `bootstrap.sh`). Pack recommendations are not meaningful for non-code repos.
844
- - **cursorBugbotEnabled (CU-J01)**: Severity downgraded medium → low. Returns N/A unless repo shows BugBot evidence (bugbot config file, `.github/workflows` reference, or docs mention). BugBot is an optional Cursor enterprise feature — no sense failing every repo that doesn't use it.
845
-
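The "2+ signals" rule for CX-N03 in the list above is a simple threshold over a fixed set of file names. A minimal sketch, assuming the signals are matched against root-level file names; the function name `isDotfilesRepo` is hypothetical.

```javascript
// Signal files listed in the changelog entry for CX-N03.
const DOTFILE_SIGNALS = [
  ".zshrc", ".bashrc", ".vimrc", ".tmux.conf", ".gitconfig", "install.sh", "bootstrap.sh",
];

// Hypothetical detector: a repo counts as dotfiles/config-only when at
// least two signal files are present at its root.
function isDotfilesRepo(rootFiles) {
  const present = new Set(rootFiles);
  const hits = DOTFILE_SIGNALS.filter((name) => present.has(name));
  return hits.length >= 2;
}

console.log(isDotfilesRepo([".zshrc", ".vimrc", "README.md"])); // true
console.log(isDotfilesRepo(["install.sh", "package.json"])); // false
```

Requiring two signals rather than one avoids classifying an ordinary project as dotfiles just because it ships an `install.sh`.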
846
- ### Measured
847
- - **PP-02 Codex**: openai-agents-python 72 → 73. 2 remaining FPs resolved.
848
- - **PP-02 Cursor**: CU-J01 no longer fires on every repo with rules. Strict FP rate 4.9% → 0%.
849
-
850
- ## [1.17.0] - 2026-04-12
851
-
852
- ### Fixed — Cursor (from Platform Parity audit, Issue #32)
853
- - **CU-A01 (cursorRulesExist)**: Now follows file-redirect pattern. When `.cursor/rules` is a text file pointing to another path (e.g., `agents/rules/`), the rules are read from the redirect target. Fixes false negative on cal.com-style layouts.
854
- - **CU-A02 (cursorNoLegacyCursorrules)**: Returns N/A when repo has zero Cursor configuration. Fixes the calibration inversion where no-config repos outscored legacy-format repos.
855
- - **CU-C01 (cursorPrivacyMode)**: Severity downgraded from `critical` to `low`. Returns N/A when no rules exist. Privacy Mode is stored in SQLite state.vscdb and not meaningfully auditable from repo files.
856
-
857
- ### Fixed Codex (from Platform Parity audit, Issue #33)
858
- - **codexAgentsArchitecture (CX-A04)**: Expanded heading recognition to include "Project Structure Guide", "Repo Structure", "Repository Layout", "Codebase Guide", "Key Directories" and enumerated directory maps. Fixes false negative on openai/openai-agents-python.
859
- - **codexCliAuthCredentialsStoreExplicit (CX-B12)**: Tightened managed-machine heuristic to require explicit terms (`managed device`, `shared workstation`, `multi-user host`, `VDI`, `kiosk`, `enterprise-managed`). No longer triggers on generic words like "shared utilities" or "server-managed".
860
- - **codexMcpPresentIfRepoNeedsExternalTools (CX-F01)**: Returns N/A for SDK/library repos (detected via package manifest + README patterns). SDKs document integrations without needing project-scoped MCP.
861
- - **codexSkillsHaveMetadata**: Now accepts YAML frontmatter (`name`, `description`) as valid metadata. Fixes false negative on repos using OpenAI-style SKILL.md.
862
- - **codexPythonFormatterConfigured (CX-PY08)**: Accepts broader Ruff setups (any `[tool.ruff]` section, not just `[tool.ruff.format]`), yapf, autopep8, and standalone config files.
863
- - **codexPythonFastapiEntryDocumented (CX-PY10)**: Returns N/A when FastAPI appears only in examples/dev deps. Also checks AGENTS.md for entry point documentation.
864
- - **codexPythonMigrationsDocumented (CX-PY11)**: Returns N/A for SDK/library repos and when repo has no DB dependencies.
865
- - **codexPythonPackageStructure (CX-PY19)**: Path-separator-agnostic regex works correctly on Windows.
866
- - **codexPackRecommendationQuality (CX-N03)**: Removed `package.json` as universal requirement. Now accepts any primary manifest (pyproject.toml, Cargo.toml, go.mod, Gemfile, flake.nix, Makefile, etc.). Returns N/A when no signals exist.
867
-
868
- ### Measured
869
- - **PP-02/PP-03 Cursor**: FP rate 15% → <5% after fixes. Score range 14–76 → 20–68 (still differentiated).
870
- - **PP-02/PP-03 Codex**: Strict FP 27.8% → <5% after fixes. openai-agents-python 65 → 72.
871
- - **Platform Parity Index (PPI)**: 0.125 → 0.375 (Claude + Cursor + Codex validated).
872
-
873
- ## [1.16.0] - 2026-04-12
874
-
875
- ### Added
876
- - **MOAT-01 Harmony-first default onboarding**: When `nerviq audit` runs on a repo with 2+ configured AI platforms and no explicit `--platform`, the CLI now prints a one-line Harmony Score + drift summary *before* the single-platform audit. Cross-platform alignment becomes the first impression, in line with the durable moat positioning.
877
- - **`--no-harmony-first` flag**: Suppresses the new Harmony header for users who want strictly single-platform output.
878
- - **`harmony` envelope in `audit --json`**: On multi-platform repos, JSON output now includes `{ harmony: { score, driftCount, platforms } }` at the root, alongside the existing per-platform fields.
879
-
880
- ### Changed
881
- - **FB-05 — framework-aware fix rewriting**: On repos where no Node/JS stack is detected (Python, Go, Rust, Ruby, Java/Kotlin, Elixir, .NET), failure-message recommendations no longer hard-code `npm test` / `npm ci` / `npm install`. The audit post-processes `fix` text and substitutes the stack-appropriate equivalent (e.g. `pytest`, `go test ./...`, `cargo test`, `bundle exec rspec`, `./gradlew test`, `mix test`, `dotnet test`). No change on Node repos.
882
- - **Release-sync surfaces now reflect the `315`-test verification baseline** (was 307 in v1.15.0). `test/harmony-first.test.js` (5 cases) covers MOAT-01; `test/framework-aware-fixes.test.js` (3 cases) covers FB-05.
883
-
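The FB-05 substitution in the list above can be pictured as a lookup keyed by detected stack. The replacement commands are quoted from the entry. The stack labels, the function name, and the restriction to `npm test` are assumptions; the entry does not list equivalents for `npm ci` or `npm install`, so they are left out here.

```javascript
// Stack-to-test-command pairs quoted from the FB-05 entry; the stack keys
// themselves are hypothetical labels.
const TEST_COMMANDS = {
  python: "pytest",
  go: "go test ./...",
  rust: "cargo test",
  ruby: "bundle exec rspec",
  jvm: "./gradlew test",
  elixir: "mix test",
  dotnet: "dotnet test",
};

// Rewrite `npm test` in a fix message for non-Node stacks.
function rewriteFixText(fix, stack) {
  const replacement = TEST_COMMANDS[stack];
  // Node repos (and unknown stacks) keep the original text unchanged.
  if (!replacement) return fix;
  // A replacer function avoids any special handling of "$" in the command.
  return fix.replace(/\bnpm test\b/g, () => replacement);
}

console.log(rewriteFixText("Document `npm test` in CLAUDE.md", "python"));
```

Leaving the text untouched when no mapping exists matches the "No change on Node repos" guarantee in the entry.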
884
- ## [1.15.0] - 2026-04-11
885
-
886
- ### Added
887
- - **`--dir` flag**: Audit any directory without changing cwd (`nerviq audit --dir /path/to/repo`).
888
- - **Opt-in telemetry foundation**: Anonymous local usage tracking for audit, harmony-audit, and setup commands. Activated only when `NERVIQ_TELEMETRY=1` is set. No data leaves the machine.
889
-
890
- ### Fixed
891
- - **`--dir` flag was silently ignored**: The flag was parsed but not recognized as a value flag, causing `nerviq audit --dir /path` to always audit the current directory instead of the target. Critical fix for CI and scripted usage.
892
- - **CLAUDE.md reference following**: When CLAUDE.md is short and contains a file reference (e.g., `AGENTS.md`), the referenced file is now read and included in content checks. Fixes false negatives on projects like home-assistant/core.
893
- - **Build/test/lint checks use repo scope**: Quality checks now read all instruction surfaces (AGENTS.md, .cursorrules, copilot-instructions.md) instead of only CLAUDE.md.
894
- - **testCoverage regex expanded**: Now matches "## Testing", "writing tests", "run tests", and "test command" patterns.
895
- - **CHANGELOG check accepts variants**: Now recognizes CHANGES.md, HISTORY.md, NEWS.md in addition to CHANGELOG.md.
896
-
897
- ### Measured
898
- - **External repo audit (EXP-11)**: 10 popular repos (213K combined stars). Score range: 15–59. FP rate: ~2–4%.
899
-
900
- ## [1.14.0] - 2026-04-11
901
-
902
- ### Added
903
- - **Harmony Score standalone command**: `nerviq harmony-score` outputs 0-100 cross-platform alignment score with `--badge` (shields.io markdown), `--threshold N` (CI gate with exit code 1 on failure), `--quiet` (score number only for piping), and `--json` (full platform breakdown).
904
- - **Harmony Demo**: `nerviq harmony-demo` creates a temporary multi-platform project (Claude + Cursor + Copilot) with intentional drift and runs a live harmony audit — zero setup required.
905
- - **Cross-platform CI matrix**: CI now runs on 3 OS (Ubuntu, Windows, macOS) x 3 Node versions (18, 20, 22) for 9 total verification combinations.
906
-
907
- ## [1.13.0] - 2026-04-10
908
-
909
- ### Added
910
- - **Self-audit compliance**: CLAUDE.md now includes XML constraint blocks, mermaid architecture diagram, project description, lint command reference, and trust boundary; self-audit score 73→84.
911
- - **Hardened platform freshness**: all 8 platforms now have version-specific freshness coverage in the check engine.
912
- - **Cross-surface contract regression**: a new regression pack validates that public integration contracts, API docs, and MCP transport docs stay consistent across releases.
913
-
914
- ### Changed
915
- - **Flagship CLAUDE.md refactored**: instruction surface is now concise, modular, and follows the patterns Nerviq recommends to users.
916
- - **Audit and setup modules split**: `audit.js` split into recommendation + instruction modules; `setup.js` split into analysis + runtime modules — cleaner boundaries, same public API.
917
- - **HTTP API docs separated from MCP transport**: each integration surface now has its own documentation entry point.
918
-
919
- ### Fixed
920
- - **CI token gating**: research metadata validation is now gated on repo token, preventing false failures in forks and public CI.
921
- - **Live site metadata guard**: relaxed rendered-HTML guard to support Vercel's dynamic page output without spurious drift warnings.
922
-
923
- ## [1.12.0] - 2026-04-09
924
-
925
- ### Added
926
- - **Adaptive governance guidance**: `augment` / `suggest-only` now classify repo archetypes, recommend operating profiles, and emit adopt / defer / ignore decisions with explicit explainability fields.
927
- - **Continuous operating mode**: Nerviq now supports managed baselines, diff-aware drift mode for CI / PR / watch flows, named upgrade campaigns, lifecycle snapshot milestones, and expiry-backed exception workflows.
928
- - **Behavioral drift outcome layer**: `deep-review --behavioral` now provides an opt-in local report for structural drift, intent-vs-outcome mismatches, and behavioral snapshots over time.
929
- - **Org and integration standard surfaces**: added org policy inheritance, fleet score semantics, public integration contracts, first-tier integration gate docs, category definition kit, and a public benchmark corpus.
930
-
931
- ### Changed
932
- - **Proof quality is deeper and more specific**: high-volume source URLs now point to more relevant official documentation pages instead of generic roots.
933
- - **Claude techniques are now modularized internally**: the legacy `src/techniques.js` monolith was split into 12 fragments plus shared helpers, while keeping the public export contract unchanged.
934
-
935
- ### Fixed
936
- - **GitHub Actions contract stability**: org-scan JSON output now flushes safely in CI, modern action runtimes are aligned, and workflow stability remains green on Node 18 and Node 20.
937
- - **Public surfaces stay synchronized with shipped verification**: release-facing docs and site examples now reflect the current `307`-test verification baseline and `1.12.0` API/version examples.
938
-
939
- ## [1.11.0] - 2026-04-09
940
-
941
- ### Changed
942
- - **Instruction budget warnings now speak in tokens**: large instruction-file warnings use approximate token counts instead of raw byte thresholds, making context-window guidance more aligned with real model pressure.
943
- - **Deny-rule evaluation now normalizes paths consistently**: symlink aliases collapse into one effective deny rule, repo-escape traversal patterns no longer inflate posture, and explicit absolute-path deny rules remain visible as intentional coverage.
944
-
945
- ### Fixed
946
- - **Claude deny-rule parity across audit surfaces**: audit techniques, anti-pattern detection, and suggest-only analysis now share the same deny-rule normalization contract instead of evaluating path patterns differently.
947
- - **GitHub automation contract stability**: workspace audit JSON is now CI-safe and Aider freshness output matches the shared `fresh` / `stale` workflow contract.
948
- - **Jest suite alignment with current contracts**: server envelope responses and bootstrap copy are now validated against the live `{ data, meta }` API surface and current history/suggest-rules messaging.
949
-
950
- ## [1.10.0] - 2026-04-09
951
-
952
- ### Changed
953
- - **Product boundary clarified across product surfaces**: CLI, docs, and site now consistently position Nerviq as AI agent governance / configuration intelligence rather than a full SAST replacement.
954
- - **Score semantics aligned end to end**: live audit, snapshot, benchmark, dashboard, workspace, and harmony scores are now labeled distinctly so one repo cannot appear contradictory without explanation.
955
- - **Monorepo workspace semantics clarified**: `audit --workspace` now separates root governance health from workspace aggregate/package coverage and explains the relationship directly in CLI output.
956
-
957
- ### Fixed
958
- - **Audit vs anti-pattern parity**: shared instruction-surface detection now keeps verification guidance and anti-pattern reporting in sync across `.claude/commands`, `AGENTS.md`, and related instruction docs.
959
- - **Cold-start lifecycle guidance**: `history`, `compare`, `trend`, and `suggest-rules` now bootstrap users with actionable next steps instead of near-empty no-data output.
960
- - **Framework-aware verification detection**: Flutter, Swift/Xcode, Python, Go, and .NET verification command variants now count correctly, reducing false negatives on mature repos.
961
-
962
- ### Docs
963
- - **Proof and first-run surfaces matured**: published beta case studies, public before/after proof repo, Harmony-first homepage, simplified six-step getting-started flow, clearer Harmony-vs-Synergy maturity messaging, and reduced concept-load across first-touch docs.
964
-
965
- ## [1.9.0] - 2026-04-07
966
-
967
- ### Added
968
- - **Dockerfile best practices checks** (#8): multi-stage build detection, .dockerignore validation (node_modules + .env), no secrets in build args
969
- - **Terraform check category** (#10): terraform fmt in CI/pre-commit, .terraform in .gitignore, state file not committed, remote backend configured
970
- - **i18n / Spanish language support** (#12): new `src/i18n.js` module, `--lang` CLI flag, Spanish locale (`es.json`). Usage: `nerviq audit --lang es`
971
-
972
- ### Fixed
973
- - **P0 freshness URLs** (#14-#20): fixed 41 broken documentation URLs across all 7 platforms
974
- - Claude Code: `docs.anthropic.com` → `code.claude.com/docs`
975
- - Cursor: `docs.cursor.com` → `cursor.com/docs`, background-agent → cloud-agent
976
- - Copilot: restructured to `how-tos/`, `concepts/`, `responsible-use/`
977
- - Gemini: `ai.google.dev` → `google-gemini.github.io/gemini-cli/`
978
- - Windsurf: rules merged into memories, MCP moved to `plugins/cascade/mcp`
979
- - OpenCode: added `/docs/` prefix to config/plugins/permissions paths
980
- - Codex: `docs.codex.ai` → `developers.openai.com/codex`
981
- - All 53 P0 sources now have `verifiedAt: 2026-04-07`
982
- - Check count: 2,431 → 2,438 (7 new checks)
983
-
984
- ## [1.8.9] - 2026-04-06
985
-
986
- ### Fixed (Expert Round — FAANG-level review)
987
- - **Setup preserves custom deny rules**: merge via union+deduplicate instead of overwrite — existing deny rules never lost
988
- - **Setup creates rollback artifacts**: setup operations now have rollback support like fix/apply
989
- - **protect-secrets covers Bash tool**: hook matcher expanded to `Read|Write|Edit|Bash`, checks `tool_input.command` for `cat .env`, `grep .env`, `base64 .env` etc.
990
- - **audit --out writes file**: `--out` flag now works for the audit command (was silently ignored)
991
- - **scan filters irrelevant categories**: stack-specific categories (flutter, ruby, etc.) hidden when 0 checks pass and stack not detected
992
- - **profile load supports built-in profiles**: `profile load read-only` now works by falling back to governance profiles
993
- - **Certification requires security gates**: Bronze needs gitIgnoreEnv+secretsProtection passing, Silver adds no critical anti-patterns, Gold needs harmony>=80
994
- - **SDK input validation**: all functions throw on null/invalid dir, unknown platform, empty description
995
- - **SDK TypeScript definitions**: added `passing`, `total`, `average` to type interfaces
996
- - **REST API consistent envelope**: all endpoints return `{ data, meta: { version, timestamp } }` format
997
- - **REST API CORS headers**: `Access-Control-Allow-Origin: *` for browser dashboard support
998
- - **benchmark organic score prominent**: organic improvement shown first as primary metric
999
- - **synergy-report implemented**: replaced "coming soon" with working multi-platform synergy dashboard
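A minimal sketch of the union-plus-deduplicate merge described in the first entry of this list, assuming deny rules are stored as an array of strings. `mergeDenyRules` and the rule strings are hypothetical names for illustration, not the shipped setup code.

```javascript
// Hypothetical helper: merge deny rules without dropping existing ones.
function mergeDenyRules(existing, incoming) {
  // A Set keeps first-seen order and discards exact duplicates,
  // so rules the user already had always survive the merge.
  return [...new Set([...(existing || []), ...(incoming || [])])];
}

// Illustrative rule strings only; the real rule syntax may differ.
const merged = mergeDenyRules(
  ['Read(./.env)', 'Read(./secrets/**)'],
  ['Read(./.env)', 'Bash(cat .env)']
);
console.log(merged);
```

Because the existing rules are spread first, a user's custom entries keep their original order and the defaults are only appended when they are not already present.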
1000
-
1001
- ## [1.8.8] - 2026-04-06
1002
-
1003
- ### Fixed
1004
- - **Setup hooks registration**: hooks are now always registered in settings.json (merge, not overwrite) — previously hooks files were created but never connected
1005
- - **Platform-specific setup**: `setup --platform windsurf/aider/cursor` now routes to platform-specific setup functions instead of only creating Claude files
1006
- - **Rollback artifacts**: rollback now correctly records created/patched files (written after fixes, not before)
1007
- - **fix --dry-run**: properly separated from --auto; shows what would be fixed without writing files
1008
- - **fix removes allow:["*"]**: secretsProtection fixer now removes overly broad allow rules when adding deny rules
1009
- - **--profile flag**: now loads and applies governance profiles (read-only, suggest-only, safe-write, power-user) to audit
1010
- - **profile load**: now applies deny rules and threshold to settings.json instead of just displaying
1011
- - **SDK passing/total**: added `passing`, `total`, and `average` aliases to SDK audit/harmony results
1012
- - **Swift detection**: Swift projects (Package.swift, .xcodeproj) now detected in subdirectories
1013
- - **Python repository rules**: repository.md now references pyproject.toml instead of package.json for Python projects
1014
- - **convert filename doubling**: strips all known extensions (.md, .mdc, .txt) preventing CLAUDE.md.md
1015
- - **convert frontmatter leak**: MDC frontmatter stripped for all non-cursor targets (copilot, claude, codex, etc.)
1016
- - **scan vs org scan**: `scan` now shows detailed per-repo breakdown; `org scan` shows aggregated summary
1017
- - **migrate --platform cursor**: added migrate to FULL_COMMAND_SET so platform dispatch works correctly
1018
- - **Hooks fail-closed**: protect-secrets hook now blocks on error instead of allowing (fail-closed, not fail-open)
1019
- - **Settings merge**: setup now merges all fields (hooks, permissions, mcpServers, nerviqSetup) into existing settings.json
1020
-
1021
- ## [1.8.7] - 2026-04-06
1022
-
1023
- ### Changed
1024
- - **Complete CLAUDEX → NERVIQ rebrand**: all internal references, env vars (`NERVIQ_NO_INSIGHTS`), JSON keys (`_nerviq_managed`), and property names updated
1025
- - **Restored audit-repo skill template**: Claude-native skill for running `npx @nerviq/cli --json` from within Claude Code
1026
- - **Updated .gitignore**: fixed legacy `claudex-setup` reference
1027
-
1028
- ## [1.8.6] - 2026-04-06
1029
-
1030
- ### Changed
1031
- - **Confidence calibration**: 5-tier system (0.3/0.6/0.7/0.8/0.9) based on actual evidence quality — stack checks=0.6, default=0.7, with-template=0.8, runtime-verified=0.9
1032
- - **SDK dogfooding**: CLI now imports `audit`, `detectPlatforms`, `getCatalog` from public SDK API instead of internal modules
1033
- - Updated test count badge: 293 tests
1034
-
1035
- ## [1.8.5] - 2026-04-06
1036
-
1037
- ### Changed: Honesty & Maturity Overhaul (Stream 23)
1038
- - **Check count messaging**: All surfaces now show "2,431 checks (8 platforms × ~300 governance rules)" instead of inflated raw number
1039
- - **Synergy → [EXPERIMENTAL]**: Synergy dashboard, CLI output, and site docs now carry experimental label with disclaimer about static routing rules
1040
- - **Feature maturity labels**: Introduced GA/Beta/Experimental system: Harmony=GA, Plugins=GA, SDK=Beta, Synergy=Experimental
1041
- - **"evidence-based" → accurate**: Changed to "rule-based audit engine with evidence tracking" in methodology docs
1042
- - **Positioning**: Added "Best for teams going from 0→governed" and "Not designed for deeply customized setups" to README and site
1043
- - **sourceUrl audit**: Verified 100% coverage (2,306/2,306 checks), identified 78 unique URLs for future specificity improvement
1044
-
1045
- ### Fixed
1046
- - Fixed 15 failing tests with stale check counts (2,306→2,431, domain packs 40→62)
1047
- - Jest version verified: ^30.3.0 valid (30.2.0 installed)
1048
-
1049
- ### Added
1050
- - 14 new Harmony integration tests (full pipeline, drift scenarios, add platform, state persistence, governance, advisor)
1051
- - Total test count: 293 passing across 28 suites
1052
- - MaturityBadge component on nerviq.net docs pages
1053
-
1054
- ## [1.7.1] - 2026-04-07
1055
-
1056
- ### Changed
1057
- - README synced: added 8 missing commands (rollback, check-health, anti-patterns, freshness, rules-export, org scan), 4 missing options (--full, --config-only, --only, --workspace), fixed NERVIQ→NERVIQ branding
1058
-
1059
- ## [1.7.0] - 2026-04-07
1060
-
1061
- ### Added: Final P2 batch
1062
- - **UAT-11: `nerviq rollback`**: Undo the most recent apply by deleting all created files. Supports `--list` (show rollback points), `--dry-run` (preview), and auto-cleanup of rollback artifacts after use.
1063
- - **UAT-18**: `apply --only hooks,commands` already worked (verified)
1064
- - **UAT-19**: Benchmark messaging improved for post-setup runs
1065
-
1066
- ## [1.6.5] - 2026-04-07
1067
-
1068
- ### Added: More P2 UX from UAT
1069
- - **UAT-14**: Governance shows top 5 domain/MCP packs by default, `--verbose` for all
1070
- - **UAT-20**: Frontend.md rule no longer generated for backend-only projects (Express, NestJS)
1071
- - **UAT-23**: `rules-export` shows human-readable summary by default, `--json` for full output
1072
- - **UAT-24**: `history --prune N` to clean old snapshots (keeps last N)
1073
- - **UAT-21**: Harmony task routing already dynamic (via UAT-04 phantom platform fix)
1074
-
1075
- ## [1.6.4] - 2026-04-07
1076
-
1077
- ### Added — P2 UX improvements from UAT
1078
- - **UAT-12**: Setup now lists every file created (`+ CLAUDE.md`, `+ .claude/settings.json`, ...)
1079
- - **UAT-13**: Lite mode shows pass/fail count: `Score: 78/100 (62/86 checks passing)`
1080
- - **UAT-15**: Audit header shows detected config files: `Found: CLAUDE.md, AGENTS.md, .cursorrules`
1081
- - **UAT-17**: Suggested next command includes `--platform` for non-Claude platforms
1082
- - **UAT-22**: History shows HH:MM timestamps when multiple snapshots share same date
1083
-
1084
- ## [1.6.3] - 2026-04-07
1085
-
1086
- ### Fixed: P1 from UAT
1087
- - **UAT-04**: Harmony only audits platforms with detected config files (was always 8/8)
1088
- - **UAT-05**: `apply --rollback` now shows clear error instead of silently re-applying
1089
- - **UAT-06**: Harmony drift now auto-recorded — compares scores to previous audit, records deltas ≥5 points
1090
- - **UAT-07**: Migrate error message includes usage example
1091
- - **UAT-08**: Doctor aider freshness gate no longer crashes (null safety)
1092
- - **UAT-09**: `nerviq fix` now auto-fixes `gitIgnoreEnv` (.env to .gitignore) and `secretsProtection` (deny rules in settings.json) — the two most common critical findings
1093
- - **UAT-10**: Rails/Laravel/.NET false positives in `fix` output eliminated (was caused by same null-inclusion bug as UAT-02)
1094
-
1095
- ## [1.6.2] - 2026-04-07
1096
-
1097
- ### Fixed — P0 from UAT (ship-stoppers)
1098
- - **UAT-01 BLOCKER**: `npx @nerviq/cli audit` now works — added `@nerviq/cli` bin alias
1099
- - **UAT-02**: `nerviq fix` was showing 375 failed checks (including skipped) vs audit's 77. Fixed: now filters `r.passed === false` only, matching audit count exactly
1100
- - **UAT-03**: Confidence label `[MEDIUM]` was shown on critical items (confusing). Changed threshold: 0.7 confidence now shows `[HIGH]` instead of `[MEDIUM]`
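The UAT-02 fix above comes down to how skipped checks are filtered. The sketch below assumes each result carries `passed` as `true`, `false`, or `null` for a skipped check (the null case is inferred from the "null-inclusion bug" wording elsewhere in this changelog); the array and the `flutterLint` key are hypothetical.

```javascript
// Hypothetical result shape: passed is true, false, or null (skipped / not applicable).
const results = [
  { key: 'gitIgnoreEnv', passed: false },
  { key: 'secretsProtection', passed: true },
  { key: 'flutterLint', passed: null },
];

// A truthiness test treats null as a failure, so skipped checks inflate the count.
const loose = results.filter((r) => !r.passed);

// Strict comparison keeps only checks that actually ran and failed.
const strict = results.filter((r) => r.passed === false);

console.log(loose.length, strict.length); // 2 1
```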
1101
-
1102
- ## [1.6.1] - 2026-04-07
1103
-
1104
- ### Added
1105
- - **F3-01: `nerviq check-health`**: Detects regressions between audit snapshots. Compares per-check pass/fail state and flags checks that went from passing to failing. When 3+ checks in the same category regress, alerts as "potential platform format change."
1106
- - **F3-03: Regression tests** — 3 new tests for check-health: no-snapshots, stable state, and regression detection
1107
- - Supports `--json` for CI integration
1108
-
1109
- ## [1.6.0] - 2026-04-07
1110
-
1111
- ### Changed: ACCURACY OVERHAUL
1112
- - **Stack detection accuracy**: Checks for Python, Go, Rust, Java, Ruby, PHP, .NET, Flutter, Swift, Kotlin now skip when the stack is only present in `examples/`, `docs/`, `test/`, `vendor/` directories rather than at the project root. Previously these fired false positives on monorepos and repos with example code.
1113
- - **Generic quality checks scoped**: 132 checks (observability, caching, i18n, rate-limiting, etc.) are now skipped by default because they measure general software quality, not AI agent configuration. Use `--verbose` to include them.
1114
- - **Urgency count fix**: Skipped (not-applicable) checks were incorrectly counted as critical/high in the lite output summary. Now only actual failures are counted.
1115
-
1116
- ### Impact
1117
- - supabase/supabase: Failed 120 → 55 (65 false positives eliminated)
1118
- - Nerviq's own repo: Fake "🔴 3 critical" → accurate "🔵 19 recommended"
1119
- - All failed checks are now relevant to AI agent configuration
1120
-
1121
- ## [1.5.3] - 2026-04-07
1122
-
1123
- ### Added
1124
- - **T4-01:** Confidence labels (`[HIGH]` / `[MEDIUM]` / `[HEURISTIC]`) on every failed check in full audit
1125
- - **T4-02:** Safety modes documented in README: read-only, suggest-only, dry-run, config-only, safe-write, power-user
1126
- - **T4-02:** `--config-only` flag added: restricts writes to config files only
1127
- - **B4:** Suggest-only markdown export verified working (`nerviq suggest-only --out report.md`)
1128
-
1129
- ### Fixed
1130
- - Report header rebranded from "Nerviq" to "Nerviq" in markdown export
1131
-
1132
- ## [1.5.2] - 2026-04-07
1133
-
1134
- ### Added
1135
- - **F1-01: Lite-by-default**: `nerviq audit` now shows a quick scan (score + top 3 actions). Use `--full` for complete output.
1136
- - **F1-02: Urgency tiers** — Lite output shows `🔴 critical / 🟡 high / 🔵 recommended` summary and per-item tier icons
1137
- - **F2-01: `nerviq fix` command** — Auto-fix checks with templates, show manual guidance for others, display score impact
1138
- - `nerviq fix`: List fixable and manual-fix checks
1139
- - `nerviq fix <key>`: Fix a specific check with before/after score
1140
- - `nerviq fix --all-critical`: Fix all critical issues at once
1141
- - `nerviq fix --dry-run` — Preview without writing
1142
-
1143
- ### Changed
1144
- - Default `nerviq audit` is now lite mode (previously showed full output)
1145
- - `--full` flag added to restore previous full-output behavior
1146
- - `--verbose` still shows full output plus medium-priority recommendations
1147
- - Lite output streamlined: single fix line per item instead of redundant Why/Fix
1148
-
1149
- ## [1.5.1] - 2026-04-06
1150
-
1151
- ### Added
1152
- - "Get Started by Role" section in README (solo dev / team lead / enterprise paths)
1153
- - "What Nerviq Is — and Isn't" section in README (honest limitations, confidence levels)
1154
- - CHANGELOG entries for v1.2.5 through v1.5.0 (previously undocumented)
1155
-
1156
- ### Changed
1157
- - Check counts synced across all surfaces (README, package.json, badge): 2,431 total
1158
- - Removed stale "v1.0" reference from README
1159
- - Tagline sharpened: "Standardize and govern your AI coding agent setup"
1160
- - Platform check counts updated to match actual catalog
1161
- - Removed self-certification badge
1162
-
1163
- ## [1.5.0] - 2026-04-05
1164
-
1165
- ### Added
1166
- - Stream 8 Self-Dependent Execution — intelligence hardening
1167
- - New CLI commands: `nerviq rules-export`, `nerviq anti-patterns`, `nerviq freshness`
1168
- - A2: Recommendation rules export to JSON
1169
- - A3: Shared contract schemas (technique + pack)
1170
- - A6: 22 anti-pattern definitions with detection
1171
- - A7: Last-verified date tracking for 123 checks
1172
- - B5: External benchmark path (`nerviq benchmark --external /path`)
1173
- - B8: Governance hook risk level classification (high/medium/low)
1174
-
1175
- ### Changed
1176
- - B3: Augment now preserves and displays top 10 strengths
1177
-
1178
- ## [1.4.1] - 2026-04-05
1179
-
1180
- ### Fixed
1181
- - npm README display alignment
1182
-
1183
- ## [1.4.0] - 2026-04-05
1184
-
1185
- ### Added
1186
- - Stream 13: 84 new coverage checks across 15 directions
1187
- - MC-A (HIGH): Observability, Accessibility, GDPR, Error Tracking, Supply Chain — 31 checks
1188
- - MC-B (MED): i18n, API Versioning, Caching, Rate Limiting, Feature Flags, Docs, Monorepo, Performance — 43 checks
1189
- - MC-C (LOW): WebSocket/Real-time, GraphQL: 10 checks
1190
- - Total reached 2,039 checks across 96 categories
1191
-
1192
- ## [1.3.2] - 2026-04-05
1193
-
1194
- ### Changed
1195
- - README fully updated: badge, platform table, category table, stack languages table
1196
- - package.json description synced to 1,955 checks
1197
- - Added `harmony-add` command to docs
1198
-
1199
- ## [1.3.1] - 2026-04-05
1200
-
1201
- ### Added
1202
- - Stream 5D: 35 mobile stack checks (Flutter 15, Swift 10, Kotlin 10)
1203
- - Stream 4 Batch 2: 22 new domain packs (healthcare to energy)
1204
- - Stream 5 complete: 172 stack checks across 10 languages
1205
-
1206
- ## [1.3.0] - 2026-04-05
1207
-
1208
- ### Added
1209
- - Stream 5: Stack-specific checks for 7 languages (137 new checks)
1210
- - Python (26), Go (21), Rust (21), Java/Spring (21), Ruby (16), PHP (16), .NET (16)
1211
- - QP-D02: API reference documentation (`docs/api-reference.md`)
1212
-
1213
- ## [1.2.7] - 2026-04-05
1214
-
1215
- ### Changed
1216
- - Version bump for npm publish alignment
1217
-
1218
- ## [1.2.6] - 2026-04-05
1219
-
1220
- ### Added
1221
- - EC1-EC8: All 6 new ECC-inspired checks + 2 advisor task types
1222
-
1223
- ### Fixed
1224
- - Flaky `compareLatest` test (timestamp tiebreaker sort)
1225
-
1226
- ## [1.2.5] - 2026-04-05
1227
-
1228
- ### Added
1229
- - 3 ECC-inspired checks: `llms.txt`, MCP budget warning, hook exit code docs
1230
-
1231
- ### Changed
1232
- - Complete NERVIQ → NERVIQ rebrand across docs, content, action, landing page
1233
- - CHANGELOG rewritten to Keep a Changelog format with full version history
1234
-
1235
- ## [1.2.4] - 2026-04-05
1236
-
1237
- ### Added
1238
- - H8: Unified platform capability matrices into a single source of truth
1239
- - Windsurf, Aider, and OpenCode intelligence added to Harmony module
1240
- - Codex platform additions synced to metadata
1241
-
1242
- ### Changed
1243
- - MG5-MG11: Complete NERVIQ to NERVIQ migration in CLI codebase
1244
- - Hardcoded `.claude/nerviq-cli/` paths migrated to `.nerviq/` with fallback
1245
-
1246
- ## [1.2.3] - 2026-04-05
1247
-
1248
- ### Added
1249
- - Batch Q1: check-matrix and golden-matrix tests for Windsurf, Aider, OpenCode
1250
- - Quality Perfection Q1: Gold certification, harmony+synergy proof
1251
- - SDK/server tests and plugin dogfood validation
1252
-
1253
- ### Changed
1254
- - Self-audit score improved from 80 to 90
1255
- - CI self-audit integrated into pipeline
1256
-
1257
- ## [1.2.1] - 2026-04-05
1258
-
1259
- ### Fixed
1260
- - Skip API/DB/Auth/Monitoring checks on irrelevant projects (false positive reduction)
1261
- - Self-dogfood: added `.mcp.json` to own project
1262
- - LICENSE updated to AGPL-3.0 full text
1263
- - CI test assertions updated for new error messages and .npmignore changes
1264
-
1265
- ## [1.2.0] - 2026-04-05
1266
-
1267
- ### Added
1268
- - Massive expansion: 673 to 2,306 checks (+1,633)
1269
- - Batch 4: 25 case studies (10 single-platform + 10 harmony/synergy + 5 existing) with INDEX
1270
- - Batch 3: +104 experiments (228 to 332) and +133 research docs (315 to 448)
1271
- - 27 cross-platform research documents
1272
-
1273
- ## [1.1.1] - 2026-04-05
1274
-
1275
- ### Added
1276
- - Batch 2: +24 domain packs (16 to 40) and +23 MCP packs (26 to 49) across all 8 platforms
1277
-
1278
- ## [1.1.0] - 2026-04-05
1279
-
1280
- ### Added
1281
- - Batch 1: +383 checks (673 to 1,056) across 8 new categories for all 8 platforms
1282
-
1283
- ## [1.0.2] - 2026-04-05
1284
-
1285
- ### Fixed
1286
- - Scorecard: 15 dimensions improved (privacy, security, monorepo, org, integrations, telemetry, OTel, SLSA, versioning, errors, audit log, deprecation, large files, relevance decay, case studies)
1287
-
1288
- ### Added
1289
- - Methodology documentation, FP ranking, SBOM, CI experiments
1290
- - Improved `.npmignore` and `test:all` script
1291
-
1292
- ## [1.0.1] - 2026-03-31
1293
-
1294
- ### Fixed
1295
- - Mermaid diagram rendering in README
1296
- - macOS `grep` compatibility issue
1297
- - Version stamp display
1298
-
1299
- ## [1.0.0] - 2026-04-05
1300
-
1301
- ### Changed
1302
- - **Renamed from nerviq-cli to Nerviq** — "The intelligent nervous system for AI coding agents"
1303
- - Full rebrand across CLI, docs, and package metadata
1304
-
1305
- ## [0.9.6] - 2026-04-05
1306
-
1307
- ### Added
1308
- - SDK for programmatic access
1309
- - REST API server with Express
1310
- - Plugin system for extensibility
1311
- - SLSA provenance for supply chain security
1312
- - CONTRIBUTING.md for open-source contributors
1313
-
1314
- ## [0.9.5] - 2026-04-05
1315
-
1316
- ### Added
1317
- - VS Code extension
1318
- - `catalog` command for browsing checks
1319
- - Performance baselines and benchmarks
1320
- - Feedback loop for community contributions
1321
-
1322
- ### Changed
1323
- - All 673 checks now include `sourceUrl` and `confidence` metadata
1324
-
1325
- ## [0.9.4] - 2026-04-05
1326
-
1327
- ### Added
1328
- - GitHub Action for CI/CD integration
1329
- - MCP server for tool integration
1330
- - `doctor`, `convert`, and `migrate` commands
1331
- - Freshness pipeline for check staleness detection
1332
- - 3 case studies with real project data
1333
- - Harmony, Synergy, and E2E test suites (187 total tests)
1334
-
1335
- ## [0.9.3] - 2026-04-05
1336
-
1337
- ### Fixed
1338
- - Checks updated from experiment findings: Gemini +5, Copilot +5, Cursor +4, Aider +3, Windsurf/OpenCode fixes
1339
- - Stale checks cleaned and new checks added
1340
- - CI: added `npm ci` step for dependency install
1341
-
1342
- ### Changed
1343
- - README updated with beta notice and coming-soon platform list
1344
-
1345
- ## [0.9.x] - 2026-04-04
1346
-
1347
- ### Changed
1348
- - README updated with nerviq-cli to Nerviq migration notice
1349
-
1350
- ## [0.5.1] - 2026-03-31
1351
-
1352
- ### Changed
1353
- - Deep-review auto-detects Claude Code presence (no API key needed)
1354
- - Landing page and help text updated
1355
-
1356
- ## [0.5.0] - 2026-03-31
1357
-
1358
- ### Added
1359
- - AI-powered `deep-review` command using Claude API
1360
- - Intelligent analysis beyond static checks
1361
-
1362
- ## [0.4.0] - 2026-03-31
1363
-
1364
- ### Added
1365
- - 9 quality-deep checks for veteran Claude Code users
1366
- - Deeper analysis for experienced workflows
1367
-
1368
- ### Changed
1369
- - Community feedback addressed: improved honesty, no-overwrite behavior, less dogmatic tone
1370
-
1371
- ## [0.3.2] - 2026-03-31
1372
-
1373
- ### Changed
1374
- - README v2: all commands documented, smart gen showcase, 54 checks table, GitHub Action, privacy section
1375
-
1376
- ## [0.3.1] - 2026-03-31
1377
-
1378
- ### Added
1379
- - Anonymous insights collection
1380
- - Weakest areas analysis
1381
- - Community statistics dashboard
1382
-
1383
- ### Fixed
1384
- - Insights endpoint corrected to `nerviq.workers.dev`
1385
-
1386
- ## [0.3.0] - 2026-03-31
1387
-
1388
- ### Added
1389
- - Interactive wizard for guided setup
1390
- - Watch mode for continuous monitoring
1391
- - Landing page with FAQ, trust signals, badges
1392
-
1393
- ## [0.2.1] - 2026-03-31
1394
-
1395
- ### Added
1396
- - Smart `CLAUDE.md` generator based on project analysis
1397
- - `badge` command for README status badges
1398
- - GitHub Action for automated auditing
1399
- - Quick wins recommendations
1400
-
1401
- ## [0.2.0] - 2026-03-31
1402
-
1403
- ### Added
1404
- - Expanded to 54 checks across 18 technology stacks
1405
- - Improved CLAUDE.md templates
1406
-
1407
- ### Fixed
1408
- - Security: removed hardcoded Dev.to API key from CLAUDE.md
1409
- - Security: made NERVIQ catalog links private
1410
-
1411
- ## [0.1.0] - 2026-03-30
1412
-
1413
- ### Added
1414
- - Initial release of nerviq-cli (later renamed to Nerviq)
1415
- - Project audit and optimization for Claude Code workflows
1416
- - Landing page (GitHub Pages ready)
1417
- - Launch content and community posts
1418
-
1419
- [Unreleased]: https://github.com/nerviq/nerviq/compare/v1.29.0...HEAD
1420
- [1.29.0]: https://github.com/nerviq/nerviq/compare/v1.28.0...v1.29.0
1421
- [1.28.0]: https://github.com/nerviq/nerviq/compare/v1.27.1...v1.28.0
1422
- [1.27.1]: https://github.com/nerviq/nerviq/compare/v1.27.0...v1.27.1
1423
- [1.27.0]: https://github.com/nerviq/nerviq/compare/v1.26.0...v1.27.0
1424
- [1.26.0]: https://github.com/nerviq/nerviq/compare/v1.25.0...v1.26.0
1425
- [1.25.0]: https://github.com/nerviq/nerviq/compare/v1.24.0...v1.25.0
1426
- [1.24.0]: https://github.com/nerviq/nerviq/compare/v1.23.0...v1.24.0
1427
- [1.23.0]: https://github.com/nerviq/nerviq/compare/v1.22.0...v1.23.0
1428
- [1.22.0]: https://github.com/nerviq/nerviq/compare/v1.21.0...v1.22.0
1429
- [1.21.0]: https://github.com/nerviq/nerviq/compare/v1.20.1...v1.21.0
1430
- [1.20.1]: https://github.com/nerviq/nerviq/compare/v1.20.0...v1.20.1
1431
- [1.20.0]: https://github.com/nerviq/nerviq/compare/v1.19.0...v1.20.0
1432
- [1.19.0]: https://github.com/nerviq/nerviq/compare/v1.18.0...v1.19.0
1433
- [1.18.0]: https://github.com/nerviq/nerviq/compare/v1.17.3...v1.18.0
1434
- [1.17.3]: https://github.com/nerviq/nerviq/compare/v1.17.2...v1.17.3
1435
- [1.17.2]: https://github.com/nerviq/nerviq/compare/v1.17.1...v1.17.2
1436
- [1.17.1]: https://github.com/nerviq/nerviq/compare/v1.17.0...v1.17.1
1437
- [1.17.0]: https://github.com/nerviq/nerviq/compare/v1.16.0...v1.17.0
1438
- [1.16.0]: https://github.com/nerviq/nerviq/compare/v1.15.0...v1.16.0
1439
- [1.15.0]: https://github.com/nerviq/nerviq/compare/v1.14.0...v1.15.0
1440
- [1.14.0]: https://github.com/nerviq/nerviq/compare/v1.13.0...v1.14.0
1441
- [1.13.0]: https://github.com/nerviq/nerviq/compare/v1.12.0...v1.13.0
1442
- [1.12.0]: https://github.com/nerviq/nerviq/compare/v1.11.0...v1.12.0
1443
- [1.11.0]: https://github.com/nerviq/nerviq/compare/v1.10.0...v1.11.0
1444
- [1.10.0]: https://github.com/nerviq/nerviq/compare/v1.9.0...v1.10.0
1445
- [1.9.0]: https://github.com/nerviq/nerviq/compare/v1.8.9...v1.9.0
1446
- [1.8.9]: https://github.com/nerviq/nerviq/compare/v1.8.8...v1.8.9
1447
- [1.8.8]: https://github.com/nerviq/nerviq/compare/v1.8.7...v1.8.8
1448
- [1.8.7]: https://github.com/nerviq/nerviq/compare/v1.8.6...v1.8.7
1449
- [1.8.6]: https://github.com/nerviq/nerviq/compare/v1.8.5...v1.8.6
1450
- [1.8.5]: https://github.com/nerviq/nerviq/compare/v1.7.1...v1.8.5
1451
- [1.7.1]: https://github.com/nerviq/nerviq/compare/v1.7.0...v1.7.1
1452
- [1.7.0]: https://github.com/nerviq/nerviq/compare/v1.6.5...v1.7.0
1453
- [1.6.5]: https://github.com/nerviq/nerviq/compare/v1.6.4...v1.6.5
1454
- [1.6.4]: https://github.com/nerviq/nerviq/compare/v1.6.3...v1.6.4
1455
- [1.6.3]: https://github.com/nerviq/nerviq/compare/v1.6.2...v1.6.3
1456
- [1.6.2]: https://github.com/nerviq/nerviq/compare/v1.6.1...v1.6.2
1457
- [1.6.1]: https://github.com/nerviq/nerviq/compare/v1.6.0...v1.6.1
1458
- [1.6.0]: https://github.com/nerviq/nerviq/compare/v1.5.3...v1.6.0
1459
- [1.5.3]: https://github.com/nerviq/nerviq/compare/v1.5.2...v1.5.3
1460
- [1.5.2]: https://github.com/nerviq/nerviq/compare/v1.5.1...v1.5.2
1461
- [1.5.1]: https://github.com/nerviq/nerviq/compare/v1.5.0...v1.5.1
1462
- [1.5.0]: https://github.com/nerviq/nerviq/compare/v1.4.1...v1.5.0
1463
- [1.4.1]: https://github.com/nerviq/nerviq/compare/v1.4.0...v1.4.1
1464
- [1.4.0]: https://github.com/nerviq/nerviq/compare/v1.3.2...v1.4.0
1465
- [1.3.2]: https://github.com/nerviq/nerviq/compare/v1.3.1...v1.3.2
1466
- [1.3.1]: https://github.com/nerviq/nerviq/compare/v1.3.0...v1.3.1
1467
- [1.3.0]: https://github.com/nerviq/nerviq/compare/v1.2.7...v1.3.0
1468
- [1.2.7]: https://github.com/nerviq/nerviq/compare/v1.2.6...v1.2.7
1469
- [1.2.6]: https://github.com/nerviq/nerviq/compare/v1.2.5...v1.2.6
1470
- [1.2.5]: https://github.com/nerviq/nerviq/compare/v1.2.4...v1.2.5
1471
- [1.2.4]: https://github.com/nerviq/nerviq/compare/v1.2.3...v1.2.4
1472
- [1.2.3]: https://github.com/nerviq/nerviq/compare/v1.2.1...v1.2.3
1473
- [1.2.1]: https://github.com/nerviq/nerviq/compare/v1.2.0...v1.2.1
1474
- [1.2.0]: https://github.com/nerviq/nerviq/compare/v1.1.1...v1.2.0
1475
- [1.1.1]: https://github.com/nerviq/nerviq/compare/v1.1.0...v1.1.1
1476
- [1.1.0]: https://github.com/nerviq/nerviq/compare/v1.0.2...v1.1.0
1477
- [1.0.2]: https://github.com/nerviq/nerviq/compare/v1.0.1...v1.0.2
1478
- [1.0.1]: https://github.com/nerviq/nerviq/compare/v1.0.0...v1.0.1
1479
- [1.0.0]: https://github.com/nerviq/nerviq/compare/v0.9.6...v1.0.0
1480
- [0.9.6]: https://github.com/nerviq/nerviq/compare/v0.9.5...v0.9.6
1481
- [0.9.5]: https://github.com/nerviq/nerviq/compare/v0.9.4...v0.9.5
1482
- [0.9.4]: https://github.com/nerviq/nerviq/compare/v0.9.3...v0.9.4
1483
- [0.9.3]: https://github.com/nerviq/nerviq/compare/v0.9.x...v0.9.3
1484
- [0.9.x]: https://github.com/nerviq/nerviq/compare/v0.5.1...v0.9.x
1485
- [0.5.1]: https://github.com/nerviq/nerviq/compare/v0.5.0...v0.5.1
1486
- [0.5.0]: https://github.com/nerviq/nerviq/compare/v0.4.0...v0.5.0
1487
- [0.4.0]: https://github.com/nerviq/nerviq/compare/v0.3.2...v0.4.0
1488
- [0.3.2]: https://github.com/nerviq/nerviq/compare/v0.3.1...v0.3.2
1489
- [0.3.1]: https://github.com/nerviq/nerviq/compare/v0.3.0...v0.3.1
1490
- [0.3.0]: https://github.com/nerviq/nerviq/compare/v0.2.1...v0.3.0
1491
- [0.2.1]: https://github.com/nerviq/nerviq/compare/v0.2.0...v0.2.1
1492
- [0.2.0]: https://github.com/nerviq/nerviq/compare/v0.1.0...v0.2.0
1493
- [0.1.0]: https://github.com/nerviq/nerviq/releases/tag/v0.1.0
1
+ # Changelog
2
+
3
+ All notable changes to the **Nerviq** CLI are documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## Evidence tiers
9
+
10
+ Per **TRUTH-03** (POS-05 / continuous-governance positioning, 2026-04-29),
11
+ every changelog entry from this release forward is tagged with an explicit
12
+ evidence tier so a buyer / reviewer / contributor can tell at a glance how
13
+ strongly we stand behind the claim:
14
+
15
+ - `[Tested]` — verified in this codebase via `npm test` (canonical or Jest).
16
+ This is the strongest tier; a reproducible test guards against regression.
17
+ - `[Measured]`: backed by a controlled before/after run with declared
18
+ evaluator + cross-model judge, recorded under
19
+ `nerviq-research/research/measurement-runs/`. Strongest external evidence.
20
+ - `[Reported]`: surfaced by an external pilot, user-lab study, or community
21
+ observation; not yet independently re-measured.
22
+ - `[Aspirational]` — directional claim, design intent, or planned behavior.
23
+ Honest but not yet evidence-backed; flagged so it can't masquerade as proof.
24
+
25
+ Untagged historical entries pre-2026-04-29 should be treated as `[Tested]` if
26
+ they describe shipped behavior with regression coverage; the absence of a
27
+ tag is not an evidence-tier downgrade.
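As a rough illustration of how tooling might read these tags, the sketch below pulls the tier out of a single changelog bullet. `evidenceTier` and `TIERS` are hypothetical names, and the regular expression assumes the tag is the first token of the bullet, as in the entries of this release; untagged entries simply return `null`.

```javascript
// The four tiers named in the "Evidence tiers" section.
const TIERS = ['Tested', 'Measured', 'Reported', 'Aspirational'];

// Hypothetical helper: return the tier of one changelog bullet, or null if untagged.
function evidenceTier(entry) {
  // Matches a leading "- `[Tier]`" with optional indentation.
  const match = /^\s*-\s*`\[(\w+)\]`/.exec(entry);
  return match && TIERS.includes(match[1]) ? match[1] : null;
}

console.log(evidenceTier('- `[Tested]` **SDK bundled into the CLI tarball**'));
console.log(evidenceTier('- Added `harmony-add` command to docs'));
```

Returning `null` rather than a default tier leaves the interpretation of untagged historical entries to the caller, in line with the note above.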
28
+
29
+ ## [Unreleased]
30
+
31
+ (no unreleased changes)
32
+
33
+ ## [1.30.0] - 2026-04-29
34
+
35
+ ### Summary
36
+
37
+ Agent-facing surfaces ship: SDK bundled into the CLI tarball (MEMO-03 = B
38
+ decision), GitHub Action wired with full marketplace metadata (MR-06),
39
+ `nerviq certify --agent-ready` hardening gate (AI-13), and a postinstall
40
+ quick-start hint (AI-10). Continuous-governance UX polish carries over
41
+ from the round-5/6 closure waves: `pr-check` composite command (LOOP-02),
42
+ named watch alerts (LOOP-01), stale-reference headline before "Top 3 fixes"
43
+ (PROD-03), Windows mojibake fix (MEMO-16), and OWASP cross-walk on
44
+ shallow-risk findings (POS-01a). Performance: ~17% cut in the
45
+ checkAllTechniques hot loop (AI-12a). All 7 user-lab BUG fixes from the
46
+ 2026-04-28 12-persona study landed (BUG-01..07). Tooling: `publish.js`
47
+ no longer swallows errors (EXP-08).
48
+
49
+ ### Added: Agent-facing surfaces (round-8/9 closures)
50
+
51
+ - `[Tested]` **SDK bundled into the CLI tarball (MEMO-03 a..e).** New
52
+ `package.json` `exports` map exposes `@nerviq/cli`, `@nerviq/cli/sdk`
53
+ (programmatic API), `@nerviq/cli/sdk/types` (TypeScript declarations),
54
+ and `@nerviq/cli/package.json`. The `files` whitelist now includes
55
+ `sdk/`, so `npm pack --dry-run` ships
56
+ `sdk/{index.js,index.d.ts,package.json,README.md}` (266 files,
57
+ 883.1kB). README + `sdk/README.md` rewritten to point at the bundled
58
+ install path; the never-published `@nerviq/sdk` package name is now
59
+ documented as historical only. **Closes the public install-path
60
+ contradiction flagged by the Codex CTO/CEO Market Memo (2026-04-13).**
61
+ - `[Tested]` **`nerviq certify --agent-ready` (AI-13).** New mode under
62
+ the existing `certify` command runs 6 pass/fail criteria
63
+ (`agent-context-present`, `gitignore-blocks-env`,
64
+ `deny-rules-configured`, `no-critical-shallow-risk`,
65
+ `no-stale-references`, `governance-score-floor 50`). 3 are critical;
66
+ failing any drops the verdict from `agent-ready-full` to
67
+ `agent-ready-with-caveats` or `not-agent-ready`. Distinct shields.io
68
+ badges per verdict. Exit 0 on no-critical-fail, exit 1 otherwise
69
+ (CI-friendly).
70
+ - `[Tested]` **GitHub Action marketplace metadata (MR-06).**
71
+ `action/action.yml` adds the `branding` block (icon: `shield`, color:
72
+ `green`) required for Marketplace listing, expands inputs
73
+ (`threshold` / `platform` / `dir` / `diff-only` / `diff-base`), and
74
+ expands outputs (`score` / `organic-score` / `passed` / `failed` /
75
+ `total` / `stale-references` / `gate`). Invocation now passes
76
+ `--no-harmony-first` so machine output stays parser-safe regardless
77
+ of the harmony-first default. `GITHUB_STEP_SUMMARY` includes
78
+ stale-reference count + gate verdict.
79
+ - `[Tested]` **Postinstall quick-start hint (AI-10).**
80
+ `tools/postinstall.js` runs once after `npm install @nerviq/cli`,
81
+ printing a 4-line "next steps" block (`audit` / `setup --auto` /
82
+ `harmony-audit`). Suppressed in CI / non-TTY / transitive-dep installs
83
+ / when `NERVIQ_POSTINSTALL_QUIET=1` is set. Wired via
84
+ `package.json:postinstall` with a `|| true` failsafe so it never
85
+ breaks an install.
86
+ - `[Tested]` **AI-07 self-governing agent example.**
87
+ `sdk/examples/self-governing-agent.js` is the reference implementation
88
+ of the 5-step pattern documented at `/docs/for-agents`: pre-task
89
+ audit (surface stale references) → harmony check (multi-platform
90
+ only) → actual task → post-task diff audit → outcome record.
91
+ Resolves SDK via `require('@nerviq/cli/sdk')` with an in-repo
92
+ fallback so the example is testable from a checkout.
93
+ - `[Tested]` **AI-08 Nerviq references in generated CLAUDE.md.**
94
+ `src/setup.js` claude-md template now includes a "Governance check
95
+ (Nerviq)" section telling the agent to run `nerviq audit` before
96
+ substantive changes, re-run after editing
97
+ CLAUDE/AGENTS/.cursor/rules/.mcp.json/hooks, use `nerviq watch` for
98
+ continuous-mode workflows, and `nerviq pr-check --threshold 70`
99
+ before opening PRs.
100
+ - `[Tested]` **AI-09 orchestrator integration patterns.**
101
+ `sdk/examples/langchain-integration.md` ships LangChain (Node +
102
+ Python), CrewAI, and generic-orchestrator patterns plus a decision
103
+ matrix for "which tool to call when". Documentation-only; SDK code
104
+ unchanged.
105
+ - `[Tested]` **REL-01 release announcement automation.**
106
+ `tools/announce-release.js` extracts the CHANGELOG entry for a given
107
+ version, counts TRUTH-03 evidence tiers
108
+ (`[Tested]` / `[Measured]` / `[Reported]` / `[Aspirational]`), and
109
+ emits a markdown body for `gh release create --notes`. Wired as
110
+ `npm run announce:release [version]`.
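The `certify --agent-ready` verdict and exit-code rule from the AI-13 entry above can be sketched as follows. This is an illustration under stated assumptions, not the shipped logic: the changelog does not say which three criteria are critical, nor exactly when `agent-ready-with-caveats` applies, so the `critical` flag and the "non-critical failure means caveats" mapping are both hypothetical.

```javascript
// Hypothetical sketch of the --agent-ready verdict logic.
// ASSUMPTION: each result carries a `critical` flag; which criteria are
// actually critical is not stated in the changelog.
function agentReadyVerdict(results) {
  const failed = results.filter((r) => !r.passed);
  const criticalFailed = failed.some((r) => r.critical);

  let verdict = 'agent-ready-full';
  if (criticalFailed) {
    verdict = 'not-agent-ready';
  } else if (failed.length > 0) {
    // ASSUMPTION: non-critical failures downgrade to "with caveats".
    verdict = 'agent-ready-with-caveats';
  }

  // Exit 0 when no critical criterion failed, exit 1 otherwise.
  return { verdict, exitCode: criticalFailed ? 1 : 0 };
}

console.log(agentReadyVerdict([
  { key: 'agent-context-present', critical: true, passed: true },
  { key: 'no-stale-references', critical: false, passed: false },
]));
```

Keeping the exit code tied only to critical failures is what makes the gate usable in CI without blocking on advisory criteria.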
111
+
112
+ ### Performance
113
+
114
+ - `[Tested]` **`checkAllTechniques` partition-before-loop (AI-12a).**
115
+ `src/audit.js` partitions the techniques map into applicable /
116
+ not-applicable arrays in a single pre-pass, then iterates only the
117
+ applicable list in the hot loop. Not-applicable entries get
118
+ fast-pushed to results without per-check work. Bench: **494ms vs
119
+ ~600ms baseline** on a 120-file site repo (~17% cut). Per AI-12
120
+ governance-budget tracking — moves real-repo overhead from ~1.17%
121
+ toward the revised <2% cumulative / <1% per-call envelope.
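The partition-before-loop idea can be sketched as below. This is a simplified illustration, not the code in `src/audit.js`: the `isApplicable` and `check` fields are hypothetical stand-ins for whatever the real technique objects expose.

```javascript
// Sketch of partition-before-loop.
// ASSUMPTION: `isApplicable` and `check` are hypothetical field names.
function checkAllTechniques(techniques, ctx) {
  const applicable = [];
  const results = [];

  // Single pre-pass: split the map once.
  for (const [key, technique] of Object.entries(techniques)) {
    if (technique.isApplicable(ctx)) {
      applicable.push([key, technique]);
    } else {
      // Fast-push: not-applicable entries get a result with no per-check work.
      results.push({ key, passed: null });
    }
  }

  // Hot loop only touches applicable entries.
  for (const [key, technique] of applicable) {
    results.push({ key, passed: technique.check(ctx) });
  }
  return results;
}

const demo = checkAllTechniques({
  a: { isApplicable: () => true, check: () => true },
  b: { isApplicable: () => false, check: () => { throw new Error('never runs'); } },
}, {});
console.log(demo);
```

One consequence of this shape is that result order no longer follows map order (not-applicable entries come first), so a consumer that depends on ordering would need to sort by key.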
122
+
123
+ ### Round 6: Continuous-governance polish (2026-04-29)
124
+
125
+ - `[Tested]` `bin/cli.js` adds the `nerviq pr-check` composite command:
126
+ audit + diff-only + threshold gate + markdown PR-comment + JSON envelope,
127
+ with explicit gate ✅/❌ and exit code 1 on fail. **LOOP-02 closed.**
128
+ - `[Tested]` `src/audit.js` runs the BUG-04 stale-reference patterns
129
+ default-on as a mini-scan. CLI text output prints the
130
+ `📌 Stale references in agent docs: N` block before "Top 3 things to fix"
131
+ so it is the literal first user-visible value. **PROD-03 closed.**
132
+ - `[Tested]` `src/watch.js` emits named `🔔 NEW: …` / `✓ CLEARED: …`
133
+ alerts per change diff (sourced from staleReferences + critical
134
+ shallow-risk hits). `--no-alerts` opt-out flag. **LOOP-01 closed.**
135
+ - `[Tested]` `src/shallow-risk/patterns/*` — every pattern now declares
136
+ an `owaspTags: [...]` array, a machine-readable cross-walk to OWASP Agentic
137
+ / MCP / Agentic-Skills Top 10 categories. Surfaced through `buildFinding`
138
+ so JSON consumers (`audit --json`, `pr-check --json`) get the tags on
139
+ every shallow-risk finding. **POS-01a closed.**
140
+ - `[Tested]` `src/safe-glyph.js` (new) — Windows mojibake fix. Auto-
141
+ detects modern terminals (Windows Terminal, VS Code, WSL, Git Bash) and
142
+ falls back to ASCII glyphs (`[OK]`, `[X]`, `[!]`) on legacy cmd.exe / PS.
143
+ Override via `NERVIQ_GLYPH=ascii|unicode`. `colorize()` routes all CLI
144
+ output through the helper. **MEMO-16 closed.**
145
+ - `[Reported]` `src/auto-suggest.js` empty-state message now lists explicit
146
+ `missingSignals` + thresholds so users know exactly what's missing if
147
+ the suggest-rules loop is data-starved. Reported by user-lab BUG-07.
148
+ - `[Reported]` `nerviq audit --fix --json` now emits valid JSON with the
149
+ full outcome envelope (mode, exitCode, plan, advisoryOnly, patchArtifact,
150
+ rollbackArtifact, reAudit, unresolvedKeys, branchName, warnings).
151
+ Reported by user-lab BUG-01.
152
+ - `[Reported]` Machine formats (sarif/junit/csv/markdown) no longer get
153
+ contaminated by the Harmony-first banner. Default is parser-safe.
154
+ Reported by user-lab BUG-02.
155
+ - `[Reported]` `deep-review --behavioral` returns `score: null,
156
+ status: "insufficient-signal"` on repos with <5 source files instead of
157
+ the misleading 100/100. Reported by user-lab BUG-05.
158
+ - `[Reported]` `exception list/add --json` emits stable
159
+ `{records, count, generatedAt}` envelope. Reported by user-lab BUG-06.
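The named watch alerts from the LOOP-01 entry above amount to a set difference between two snapshots. A minimal sketch, assuming each finding is identified by a string key; `diffAlerts` and its `alerts` option are hypothetical names, with the option standing in for the `--no-alerts` opt-out.

```javascript
// Hypothetical sketch: derive named alerts from two snapshots of finding keys.
function diffAlerts(previousKeys, currentKeys, { alerts = true } = {}) {
  if (!alerts) return []; // stands in for the --no-alerts opt-out

  const previous = new Set(previousKeys);
  const current = new Set(currentKeys);
  const lines = [];

  // Findings present now but not before are new.
  for (const key of current) {
    if (!previous.has(key)) lines.push(`🔔 NEW: ${key}`);
  }
  // Findings present before but not now have cleared.
  for (const key of previous) {
    if (!current.has(key)) lines.push(`✓ CLEARED: ${key}`);
  }
  return lines;
}

console.log(diffAlerts(['stale-ref:docs/old.md'], ['hook-script-missing']));
```

Unchanged findings produce no line, so a quiet watch cycle prints nothing.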
160
+
161
+
162
+
163
+ ### Research: published evidence surfaces (no CLI code change, 2026-04-17)
164
+
165
+ No product behavior change in this entry. Recording for coherence so
166
+ operators can cite the CLI from the research artifacts that now
167
+ reference it.
168
+
169
+ - **"State of AI Agent Governance 2026-Q2" report**: 20-repo public
170
+ dataset audited with this CLI (v1.29.1). CC0. Published in
171
+ `DnaFin/nerviq-research`.
172
+ - **Harmony value quantified** on 7 archetypes via `nerviq harmony-sync
173
+ --fix` before/after. Lift is bounded above by starting drift
174
+ (low-harmony repos: avg +20.5; high-harmony: avg 0).
175
+ - **Self-dogfood audit**: `nerviq audit` run on the research repo +
176
+ this CLI repo, all numbers published honestly including harmony
177
+ 33/100 on the research repo.
178
+ - **First tier-4 measurements**: catalog items #63 (Meta-prompting)
179
+ and #11 (Chain-of-thought) earned the `📏 Measured` badge via
180
+ cross-model-judge before/after runs.
181
+ - Three CI workflows now live in `DnaFin/nerviq-research`:
182
+ `tier1-runner.yml` (Fri 08:00 UTC), `tier2-runner.yml` (Sat 09:00
183
+ UTC), `tier25-runner.yml` (Sat 10:00 UTC). This CLI is what the
184
+ tier-2.5 workflow drives via subprocess — unchanged here.
185
+
186
+ Public artifacts: https://github.com/DnaFin/nerviq-research
187
+
188
+ ### Documentation & positioning
189
+
190
+ - `[Tested]` **AGENTS.md rewritten as flagship instruction file (DOG-01).**
191
+ The previous placeholder boilerplate is replaced with a real,
192
+ Nerviq-validated agent instruction surface for this CLI repo:
193
+ governance entry-point, single-source-of-truth pointers
194
+ (`package.json`, `release-metadata.json`, `nerviq-state.json`),
195
+ trust-boundary policy on instruction surfaces, and the canonical
196
+ release-prep checklist linked from the repo-boundary policy doc.
197
+ Closes the dogfood trust-break flagged by the 2026-04-28 cross-repo
198
+ project-domain audit.
199
+ - `[Tested]` **README continuous-governance positioning (POS-03 / POS-05).**
200
+ Lede aligned with the COMPLEMENTARY positioning frame (Nerviq sits
201
+ alongside ASTs / linters / SAST, not in place of them). README v2
202
+ qualifies the value claim with the explicit "continuous" qualifier
203
+ so a one-shot reader doesn't mistake the audit-only surface for the
204
+ full product.
205
+ - `[Tested]` **TRUTH-03 evidence-tier convention added to changelog.**
206
+ Every entry from this release forward is tagged with one of
207
+ `[Tested]` / `[Measured]` / `[Reported]` / `[Aspirational]` so a
208
+ buyer / reviewer / contributor can tell at a glance how strongly
209
+ we stand behind the claim. Documented in the file header.
210
+ - `[Tested]` **Freshness watchlist hardening (Claude/Codex/Copilot/Gemini
211
+ provider model pages).** Provider model docs added to each platform's
212
+ freshness module so the daily watch surfaces upstream model-release
213
+ changes instead of going quiet between major doc-site refactors.
214
+ Also: Gemini IDE-integration URL fixed; non-existent "hooks" page
215
+ removed from the Gemini watchlist.
216
+ - `[Tested]` **SECURITY.md internal contradiction fixed.** The supported
217
+ versions table now shows the `1.29.x` line as the active line of
218
+ support, replacing the stale `1.0.x` reference that survived from
219
+ the early-2026-Q2 cleanup pass.
220
+
221
+ ### Infrastructure
222
+
223
+ - `[Tested]` **OIDC trusted publisher migration (`fc6ead3`).**
224
+ `.github/workflows/publish.yml` switched to `npm publish --provenance
225
+ --access public` running under GitHub Actions OIDC against the npmjs
226
+ Trusted Publisher (org `nerviq`, repo `nerviq`, workflow
227
+ `publish.yml`, environment `npm-publish`). No `NPM_TOKEN` in CI;
228
+ local `npm publish` is no longer a valid escape hatch. Provenance
229
+ attestations are generated for every release from this point
230
+ forward.
231
+
232
+ ### Verified
233
+
234
+ - `[Tested]` jest: **475/475** passing (no test count delta from the
235
+ v1.29.0 / v1.29.1 baseline; round-6/8/9 features ship with their
236
+ own regression coverage that landed in the same suite).
237
+ - `[Tested]` canonical CLI tests: **162/162** passing.
238
+ - `[Tested]` `npm pack --dry-run`: clean. Tarball ships
239
+ `bin`, `src`, `sdk`, `docs`, `contracts`, `README.md`,
240
+ `CHANGELOG.md`, `SECURITY.md`, `package.json`.
241
+ - `[Tested]` `node tools/pre-publish.js --ci --expected-version 1.30.0`:
242
+ passes (config + version surfaces aligned).
243
+ - `[Tested]` `node tools/validate-release-metadata.js`: passes against
244
+ `package.json` + `CHANGELOG.md` + research-side `nerviq-state.json`.
245
+
246
+ ## [1.29.1] - 2026-04-16
247
+
248
+ ### Fixed: UX polish from external pilot feedback
249
+
250
+ Three small UX fixes surfaced by an external pilot session documented in
251
+ `research/pilot-feedback-2026-04-16-external-project.md`.
252
+
253
+ - **`setup --auto` counter no longer undercounts.** The end-of-setup
254
+ summary used an internal `created` counter that could drift from
255
+ `writtenFiles` (e.g. when `.claude/settings.json` was merged rather
256
+ than freshly created). The summary now reports
257
+ `writtenFiles.length` directly, matching the per-file log lines
258
+ above it. `--agent-mode` JSON output aligned to the same source of
259
+ truth.
260
+ - **`nerviq watch` compact output shows blocker keys inline.** The
261
+ `block=N` segment now appends up to three blocking check IDs (e.g.
262
+ `block=2 [permissionDeny, hookRegistration]`) so a failing gate is
263
+ actionable without a separate `nerviq audit` round-trip. A new
264
+ `blockingKeys` array is exposed on the continuous-status report for
265
+ programmatic consumers.
266
+ - **MONITOR help section disambiguates `watch` vs `serve` vs
267
+ `--drift-mode watch`.** Added a three-line orientation at the top
268
+ of the MONITOR block describing who each surface is aimed at
269
+ (local human, machine/HTTP, governance-posture flag).
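The compact `block=N [...]` segment described above can be reproduced with a short formatter. This is a sketch under assumptions: `formatBlockSegment` is a hypothetical name, and how the real output behaves with zero blockers or with more than three is not stated in the changelog.

```javascript
// Hypothetical formatter for the compact `block=N` segment.
function formatBlockSegment(blockingKeys, maxShown = 3) {
  const count = blockingKeys.length;
  // ASSUMPTION: with no blockers, no bracketed list is printed.
  if (count === 0) return 'block=0';
  // Show at most `maxShown` check IDs; the count still reflects all blockers.
  const shown = blockingKeys.slice(0, maxShown).join(', ');
  return `block=${count} [${shown}]`;
}

console.log(formatBlockSegment(['permissionDeny', 'hookRegistration']));
```

The same `blockingKeys` array can be handed to programmatic consumers unchanged, so the text and JSON views stay derived from one source.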
270
+
271
+ ### Not shipped (deferred)
272
+
273
+ - `nerviq --version` update-notifier. The CLI ships with **zero
274
+ runtime dependencies** by design; adding `update-notifier` would
275
+ pull ~20 transitive deps. A zero-dep implementation is viable but
276
+ needs its own spec (cache location, opt-out, telemetry). Tracked
277
+ in the pilot-feedback doc.
278
+
279
+ ## [1.29.0] - 2026-04-14
280
+
281
+ ### Fixed: Shallow-risk FP rate reduction (CTO-06b)
282
+
283
+ Tightens the shallow-risk pattern regexes based on the 60-repo FP
284
+ measurement from `research/exp-cto-06-fp-measurement-2026-04-14.md`.
285
+
286
+ - **`agent-config-missing-file`** — the single pattern that produced
287
+ essentially all the FPs. Overnight corpus measurement found 520
288
+ hits / 63.5% lower-bound FP rate across the PP-08 corpus (6.35×
289
+ above the 0.10 gate).
290
+
291
+ ### Impact
292
+
293
+ - Corpus hits: **520 → 69 (-86.7%)**.
294
+ - Lower-bound FP rate: **63.5% → 8.7%** (under the 0.10 gate).
295
+ - All other 7 patterns remained at 0 hits across the corpus (nothing
296
+ to tighten this pass; they were already quiet).
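The headline figures follow directly from the numbers quoted in this entry. A short worked check, using only values stated above (the variable names are illustrative):

```javascript
// Figures quoted in this changelog entry.
const hitsBefore = 520;
const hitsAfter = 69;
const gate = 0.10;      // shallow-risk FP gate
const fpBefore = 0.635; // lower-bound FP rate before tightening
const fpAfter = 0.087;  // lower-bound FP rate after tightening

// Relative change in corpus hits, in percent.
const hitChangePct = ((hitsAfter - hitsBefore) / hitsBefore) * 100;
// How many times the pre-fix FP rate exceeded the gate.
const gateMultiple = fpBefore / gate;

console.log(hitChangePct.toFixed(1)); // -86.7
console.log(gateMultiple.toFixed(2)); // 6.35
console.log(fpAfter < gate);          // true
```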
297
+
298
+ ### What got tightened
299
+
300
+ - Pointer regex no longer fires on:
301
+ - Fenced code-example bodies.
302
+ - URL-shape references.
303
+ - Well-known external conventions (e.g. `.github/CODEOWNERS`,
304
+ `node_modules/*`, `.git/*`, `vendor/*`).
305
+ - Host-document path resolution is strict to the repo root; relative
306
+ references that resolve outside the repo are now ignored
307
+ instead of reported as missing.
308
+ - Quote-wrapped example paths in prose (e.g. `"docs/SECURITY.md"` as
309
+ an illustration in a paragraph) distinguished from bare reference
310
+ paths.
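Three of the tightened rules (URL-shaped references, well-known external conventions, and paths that resolve outside the repo) can be sketched as a single filter. This is an illustration only: `shouldIgnoreReference`, the prefix list, and the assumption that `hostDocPath` is relative to the repo root are all hypothetical, and the fenced-code and quoted-example rules are not covered here.

```javascript
const path = require('node:path');

// ASSUMPTION: illustrative prefix list based on the examples named above.
const EXTERNAL_PREFIXES = ['.github/CODEOWNERS', 'node_modules/', '.git/', 'vendor/'];

// Returns true when a reference should NOT be reported as a missing file.
function shouldIgnoreReference(ref, hostDocPath, repoRoot) {
  // URL-shaped references (scheme://...) are never repo files.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(ref)) return true;

  // Well-known external conventions.
  if (EXTERNAL_PREFIXES.some((prefix) => ref.startsWith(prefix))) return true;

  // Resolve relative to the host document, then compare against the repo root.
  const resolved = path.resolve(repoRoot, path.dirname(hostDocPath), ref);
  const relative = path.relative(repoRoot, resolved);
  const escapesRepo =
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative);

  // References that land outside the repo are ignored, not reported.
  return escapesRepo;
}

console.log(shouldIgnoreReference('../../secrets.txt', 'docs/AGENTS.md', '/repo'));
```

Anything that survives the filter would still go through the normal existence check before being reported.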
311
+
312
+ ### Verified
313
+
314
+ - jest: **475/475** passing, the `475`-test verification baseline (was 452 + 23 new negative-fixture
315
+ tests in `test/shallow-risk.test.js`, each reproducing a FP
316
+ eliminated this pass).
317
+ - canonical CLI tests: **162/162** passing.
318
+ - `npm pack --dry-run`: clean.
319
+ - `node tools/validate-release-metadata.js`: validation passed for v1.29.0.
320
+ - Shallow-risk now runnable on real repos without drowning the
321
+ signal. Feature stays `Experimental` until the corpus measurement
322
+ sits below the 0.10 gate twice in a row.
323
+
324
+ Evidence: `research/exp-cto-06-fp-measurement-2026-04-14.md`
325
+ updated with a "2026-04-14 tightening pass" section including
326
+ per-pattern before/after.
327
+
328
+ ## [1.28.0] - 2026-04-14
329
+
330
+ ### Calibrated (not certified): OpenCode Platform Parity (PP-05)
331
+
332
+ The last of the 8 supported platforms finally gets its calibration
333
+ pass. OpenCode moves from "untouched" to "calibrated" against 10
334
+ real OpenCode-using public repos. Same judgment bar as Windsurf
335
+ (PP-03) and Aider (PP-04) — strict-FP <5% met, all-10-≥70 not fully
336
+ met. Source landed in commit `5114834`.
337
+
338
+ 10-repo corpus: 8/10 scored ≥70 post-calibration. PPI stays at
339
+ **0.75** — OpenCode public adoption at the mature-star tier is
340
+ sparse, same judgment pattern as Windsurf/Aider. Added to
341
+ `research/platform-parity-corpus.json`, evidence docs
342
+ `exp-pp-09-opencode-fp-2026-04-14.md` +
343
+ `exp-pp-10-opencode-external-2026-04-14.md`.
344
+
345
+ ### Verified
346
+
347
+ - jest: **452/452** passing, the `452`-test verification baseline (was 440 + 12 new opencode-pp05
348
+ regression tests).
349
+ - canonical CLI tests: **162/162** passing.
350
+ - `npm pack --dry-run`: clean.
351
+ - `node tools/validate-release-metadata.js`: validation passed for v1.28.0.
352
+ - All guard suites still green (claude-na-gates, layer-coverage,
353
+ framework-native, audit-evidence, score-preview, 3 format tests,
354
+ shallow-risk).
355
+
356
+ **All 8 platforms now calibrated or certified:** Claude, Cursor,
357
+ Codex, Copilot, Gemini (certified, PPI contribution 1.0 each) +
358
+ Windsurf, Aider, OpenCode (calibrated, 0.75 base). PPI 0.75 will
359
+ graduate to 0.875+ only when corpus expansion on one of
360
+ Windsurf/Aider/OpenCode produces a mature-repo set passing the
361
+ score floor.
362
+
363
+ ## [1.27.1] - 2026-04-14
364
+
365
+ ### Fixed — npm tarball completeness + Windows output encoding (MEMO wave)
366
+
367
+ Addresses two real npm-user issues surfaced by the Codex CTO/CEO +
368
+ Market Memo (2026-04-13 v2):
369
+
370
+ - **`package.json` `files` broadened** (MEMO-17): the published
371
+ tarball now includes `docs/`, `contracts/`, `sdk/README.md`,
372
+ `CHANGELOG.md`, and `SECURITY.md` alongside `bin/`, `src/`, and
373
+ `README.md`. Previously these docs surfaces were referenced in
374
+ the README but not shipped in the npm tarball, meaning external
375
+ users hit broken doc links post-install. Verified via
376
+ `npm pack --dry-run` — tarball now matches what the README
377
+ promises.
378
+
379
+ - **Windows output encoding** (MEMO-16): the CLI console output
380
+ previously rendered mojibake on Windows cmd.exe where the runtime
381
+ default code page did not support emoji (✅ U+2705 / ❌ U+274C /
382
+ ✓ U+2713 / ✗ U+2717). Introduced `src/output-icons.js` as a single
383
+ helper that emits clean ASCII fallbacks (`[OK]`, `[FAIL]`,
384
+ `[SKIP]`, `[WARN]`) when `NERVIQ_ASCII_OUTPUT=1` or auto-detected
385
+ from `process.platform === 'win32'` + non-TTY. Wired through
386
+ `src/setup/runtime.js`, `src/setup.js`, `src/init.js`,
387
+ `src/codex/setup.js`, `src/gemini/setup.js`, `test/run.js`.
388
+ 2 new regression tests in `test/output-encoding.test.js`.
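The fallback rule for Windows output (explicit opt-in via `NERVIQ_ASCII_OUTPUT=1`, or auto-detection on `win32` with a non-TTY stream) can be sketched as below. This is not the contents of `src/output-icons.js`: the function names are hypothetical, and the pairing of glyphs to ASCII labels is illustrative because the changelog does not spell it out.

```javascript
// ASSUMPTION: illustrative glyph-to-label pairing.
const ICONS = {
  ok: { unicode: '\u2705', ascii: '[OK]' },
  fail: { unicode: '\u274C', ascii: '[FAIL]' },
};

// Decide whether to emit ASCII, given an explicit runtime description.
function useAscii({ env, platform, isTTY }) {
  if (env.NERVIQ_ASCII_OUTPUT === '1') return true; // explicit opt-in
  return platform === 'win32' && !isTTY;            // auto-detected fallback
}

function icon(name, runtime) {
  const entry = ICONS[name];
  return useAscii(runtime) ? entry.ascii : entry.unicode;
}

// In a real CLI the runtime would come from the process itself.
const runtime = {
  env: process.env,
  platform: process.platform,
  isTTY: Boolean(process.stdout.isTTY),
};
console.log(icon('ok', runtime));
```

Passing the runtime in as an argument, rather than reading `process` inside the helper, keeps the decision easy to cover with regression tests on any host OS.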
389
+
390
+ ### Also this release
391
+
392
+ - **7 back-dated GitHub Releases** created for v1.21.0 through
393
+ v1.27.0 (MEMO-01). Previously the public GitHub release surface
394
+ lagged npm by 7 versions; it now reflects the full release
395
+ history.
396
+ - **3 stale GitHub issues closed** (MEMO-02: #24, #25, #26) —
397
+ feature requests for Markdown / JUnit / CSV output that were
398
+ actually shipped in v1.22.0. Each closed with a shipped-in
399
+ attribution comment.
400
+
401
+ ### Verified
402
+
403
+ - jest: **440/440** passing, the `440`-test verification baseline (was 438 + 2 new output-encoding
404
+ regression tests).
405
+ - canonical CLI tests: **162/162** passing.
406
+ - `npm pack --dry-run`: clean, includes the broadened files set.
407
+ - `node tools/validate-release-metadata.js --research <path>`:
408
+ validation passed for v1.27.1.
409
+
410
+ Evidence: `research/exp-memo-autonomous-wave-2026-04-14.md` in the
411
+ research repo.
412
+
413
+ ## [1.27.0] - 2026-04-14
414
+
415
+ ### Added: Shallow Risk Mode (experimental, CTO-06)
416
+
417
+ Opt-in `--shallow-risk` lane that surfaces obvious problems at the
418
+ intersection of agent configuration (CLAUDE.md, `.claude/`, `.cursor/`,
419
+ `.codex/`, `.aider.conf.yml`, `.windsurf/`, etc.) and the rest of
420
+ the codebase. Closes the 2026-04-08 UAT trust-break where evaluators
421
+ said "missed something obvious" by catching a narrow, curated set
422
+ of issues **no generic scanner can find** because they require
423
+ understanding agent-config semantics.
424
+
425
+ Implementation follows the approved design doc v2 (commit `f425209`
426
+ in the research repo, `research/exp-cto-06-shallow-risk-design-2026-04-14.md`).
427
+
428
+ ### The 8 initial patterns (all NERVIQ-native)
429
+
430
+ 1. **`agent-config-missing-file`** — CLAUDE.md / AGENTS.md references
431
+ a repo file that doesn't exist; agent works with broken context.
432
+ 2. **`agent-config-stack-contradiction`** — CLAUDE.md says "Go project"
433
+ but repo is Python; agent recommends wrong tooling every session.
434
+ 3. **`agent-config-cross-platform-drift`**: Two platform configs
435
+ give contradictory instructions (Cursor and Claude disagree on
436
+ primary language).
437
+ 4. **`mcp-server-no-allowlist`** — MCP server declared with empty
438
+ permissions / wildcard allow = full shell access, no guardrail.
439
+ 5. **`hook-script-missing`** — Hook declared in `.claude/settings.json`
440
+ but the script file doesn't exist; hook silently skipped.
441
+ 6. **`agent-config-secret-literal`** — Secret token literal pasted
442
+ into CLAUDE.md / agent config as "example". Narrow secret scanning
443
+ scoped to our lane only (NOT broad repo secret scanning — use
444
+ gitleaks / truffleHog for that).
445
+ 7. **`agent-config-deprecated-keys`**: Config uses keys the platform
446
+ removed in a later release (powered by our freshness manifest).
447
+ 8. **`agent-config-dangerous-autoapprove`**: Auto-approve list
448
+ contains destructive patterns (`rm -rf *`, `git push --force`,
449
+ `drop table`). Never suppressed.
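As an illustration of how narrow these patterns are, pattern 8 can be approximated with a handful of regular expressions over the auto-approve list. This is a sketch, not the shipped detector: the regex list covers only the three examples named above, and `findDangerousAutoApprove` is a hypothetical name.

```javascript
// ASSUMPTION: illustrative patterns built from the three examples above.
const DESTRUCTIVE_PATTERNS = [
  /\brm\s+-rf\b/i,
  /\bgit\s+push\s+--force\b/i,
  /\bdrop\s+table\b/i,
];

// Return every auto-approve entry that matches a destructive pattern.
function findDangerousAutoApprove(autoApproveList) {
  return autoApproveList.filter((entry) =>
    DESTRUCTIVE_PATTERNS.some((pattern) => pattern.test(entry))
  );
}

console.log(findDangerousAutoApprove([
  'npm test',
  'rm -rf *',
  'git push --force origin main',
]));
```

Because the check only reads the agent's own allow-list, it stays inside the agent-config lane and does not try to scan shell scripts or source files.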
450
+
451
+ ### Shallow-risk is a parallel lane — it does NOT affect the score
452
+
453
+ Findings emit through `auditResult.shallowRiskHints[]` and are
454
+ intentionally excluded from:
455
+ - `auditResult.score`
456
+ - `auditResult.organicScore`
457
+ - `auditResult.passed` / `failed` / `skipped`
458
+ - `auditResult.topNextActions`
459
+ - `auditResult.layerSummary.*.failed`
460
+
461
+ This keeps the governance pipeline stable while still surfacing
462
+ agent-config ↔ codebase red flags. Score-unchanged proof on
463
+ self-audit of the NERVIQ repo: governance score is **87** with and
464
+ without `--shallow-risk`; only `shallowRiskHints` differs (empty
465
+ vs. 17 hits).
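The "parallel lane" property can be shown with a toy result builder in which hints are attached to the result but never read by the scoring code. This is a sketch only: the real score is not a plain pass ratio, and `buildAuditResult` is a hypothetical name.

```javascript
// ASSUMPTION: toy scoring (pass ratio); the real scoring model differs.
function buildAuditResult(checkResults, shallowRiskHints) {
  // N/A results (passed === null) are left out of the ratio.
  const applicable = checkResults.filter((r) => r.passed !== null);
  const passed = applicable.filter((r) => r.passed).length;
  const score = applicable.length === 0
    ? 0
    : Math.round((passed / applicable.length) * 100);

  // Hints ride alongside; nothing above reads them.
  return { score, passed, failed: applicable.length - passed, shallowRiskHints };
}

const checks = [{ passed: true }, { passed: false }, { passed: null }, { passed: true }];
const without = buildAuditResult(checks, []);
const withHints = buildAuditResult(checks, [{ key: 'agent-config-missing-file' }]);
console.log(without.score === withHints.score); // true
```

A regression test of this shape (same checks, different hints, identical score) is one way to keep the lane separation from eroding over time.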
466
+
467
+ ### CLI UX
468
+
469
+ ```bash
470
+ npx @nerviq/cli audit --shallow-risk # full audit + shallow risk
471
+ npx @nerviq/cli audit --shallow-risk-only # fast precommit mode
472
+ NERVIQ_SHALLOW_RISK=off npx @nerviq/cli audit --shallow-risk # kill switch
473
+ ```
474
+
475
+ Friendly banner rendered in text output and as a blockquote in
476
+ markdown:
477
+
478
+ > Shallow Risk mode (experimental, opt-in). NERVIQ checks 8 patterns
479
+ > that sit at the intersection of your AI agent configuration and
480
+ > your codebase: the kind of issues no generic scanner can find
481
+ > because they require understanding CLAUDE.md, .claude/settings.json,
482
+ > and similar files. For broader code-level security coverage, pair
483
+ > this with Semgrep, CodeQL, or a dedicated secret scanner.
484
+
485
+ ### Competitive positioning (explicit)
486
+
487
+ NERVIQ `--shallow-risk` is **not** a replacement for Semgrep / ESLint
488
+ / CodeQL / gitleaks / truffleHog / Dependabot; those tools work on
489
+ source code or dependency manifests. NERVIQ works on the bridge
490
+ between agent-declared intent and codebase reality. The 8 patterns
491
+ reflect that lane exclusively.
492
+
493
+ ### Rendering in all output formats
494
+
495
+ - **JSON**: `auditResult.shallowRiskHints[]` parallel to `results[]`.
496
+ - **Text**: separate `## Shallow Risk Hints (experimental, opt-in)`
497
+ block after `## Top next actions`, banner inline.
498
+ - **Markdown (`--format=markdown`)**: `### Shallow Risk (experimental,
499
+ opt-in)` section after `### Top next actions`, banner as blockquote,
500
+ each hint listed with severity / key / file:line.
501
+ - **JUnit (`--format=junit`)**: separate `<testsuite name="shallow-risk">`
502
+ so CI consumers can isolate or ignore it independently of the
503
+ governance suite.
504
+ - **CSV (`--format=csv`)**: hints appended as rows tagged
505
+ `layer=shallow-risk`. Contract documented in
506
+ `docs/integration-contracts.md` §7 and §8.1.
507
+
508
+ ### Status: Experimental
509
+
510
+ Release: `Experimental`. Graduates to `Beta` after 30 days of real
511
+ telemetry with zero critical corpus-level false positives reported
512
+ and at least one external user reporting a pattern caught a real
513
+ issue. Graduates to `GA` after 50+ WAA using it on ≥5 distinct repos
514
+ each.
515
+
516
+ Reserved slots 9 and 10 are deliberately empty — they wait for 30
517
+ days of user telemetry to tell us which patterns users most want
518
+ that we didn't anticipate.
519
+
520
+ ### Verified
521
+
522
+ - jest: **438/438** passing, the `438`-test verification baseline (was 419 + 19 new: 16 shallow-risk
523
+ tests (positive + negative per pattern) + 3 format surface tests).
524
+ - canonical CLI tests: **162/162** passing.
525
+ - Guard coverage kept green: `claude-na-gates.test.js`,
526
+ `layer-coverage.test.js`, `framework-native.test.js`,
527
+ `audit-evidence.test.js`, `score-preview.test.js`, and the three
528
+ format tests.
529
+ - `npm pack --dry-run`: clean.
530
+ - `node tools/validate-release-metadata.js --research <path>`:
531
+ validation passed for v1.27.0.
532
+ - Self-audit smoke: score unchanged (87 with and without the flag),
533
+ 17 shallow-risk hints found on the NERVIQ repo itself (mostly
534
+ `agent-config-missing-file` on `.claude/` docs).
535
+
536
+ ### PP-08 gate
537
+
538
+ Added `fp_rate_threshold_shallow_risk: 0.10` lane in
539
+ `research/platform-parity-corpus.json`. Corpus FP measurement on
540
+ shallow-risk patterns is a separate follow-up task (not in this
541
+ release).
542
+
543
+ Evidence: `research/exp-cto-06-implementation-2026-04-14.md`.
544
+
545
+ ## [1.26.0] - 2026-04-14
546
+
547
+ ### Fixed: Framework-native verification depth (CTO-07)
548
+
549
+ Closes the trust-break documented in the 2026-04-08 UAT where Flutter
550
+ + Swift projects got zero uplift from NERVIQ because valid verification
551
+ commands (`xcodebuild test`, `flutter test`, `gradle test`) were
552
+ treated as missing guidance, and mature Python ML + FastAPI repos
553
+ flattened because NERVIQ didn't recognise existing scaffolding
554
+ (pytest + `pyproject.toml` + poetry/uv + ruff/mypy).
555
+
556
+ Moves KPI memo §6.5 ("Are mobile, infra, and mature repos improving
557
+ with the same credibility as Node-oriented repos?") from NO → YES.
558
+
559
+ - `src/instruction-surfaces.js`: broadened surface bundle so repo
560
+ files like `pyproject.toml`, `Makefile`, `justfile`, `Podfile`,
561
+ `Cartfile`, `pubspec.yaml`, `Rakefile`, `build.gradle*`, and
562
+ `.github/workflows/*` count as verification evidence. Expanded
563
+ TEST/LINT/BUILD command patterns for Flutter (`flutter test`,
564
+ `flutter analyze`, `dart analyze`, `dart format`, `fvm flutter`),
565
+ iOS / Swift (`xcodebuild test`, `swift test`, `fastlane test`,
566
+ `swiftlint`, `swift-format lint`), Android (`./gradlew test`,
567
+ `./gradlew ktlintCheck`, `./gradlew detekt`), and Python (all of
568
+ `pytest`, `poetry run pytest`, `uv run pytest`, `pdm run pytest`,
569
+ `hatch run test`, `tox`, `nox`, `python -m pytest`, `python -m
570
+ unittest`, `ruff check`, `ruff`, `flake8`, `pylint`, `black
571
+ --check`, `mypy`, `pyright`, `pre-commit run`).
572
+
573
+ - `src/techniques/shared.js`: 10 new memoized stack helpers
574
+ (`hasIosXcodeProject`, `hasAndroidGradle`, `hasFlutterProject`,
575
+ `hasPythonPoetry`, `hasPythonUv`, `hasPythonPdm`, `hasPythonHatch`,
576
+ `hasFastApiProject`, `hasMlScaffolding`, `hasConfiguredTooling`).
577
+ These let stack-specific checks detect "this project HAS
578
+ verification wired up" directly from repo files rather than only
579
+ from CLAUDE.md / AGENTS.md mentions; this is legitimate evidence because
580
+ an agent working in the repo can observe these files itself.
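Recognising verification commands in repo files such as a `Makefile` or workflow YAML can be approximated with pattern matching. The sketch below uses a small subset of the commands listed above; `findTestCommands` and the exact regexes are hypothetical and are not taken from `src/instruction-surfaces.js`.

```javascript
// ASSUMPTION: a small illustrative subset of the TEST command patterns.
const TEST_COMMAND_PATTERNS = [
  /\bflutter test\b/,
  /\bxcodebuild test\b/,
  /\bswift test\b/,
  /\.\/gradlew test\b/,
  /\b(?:poetry run |uv run |pdm run |python -m )?pytest\b/,
];

// Return the distinct test commands found in a file's text.
function findTestCommands(text) {
  const found = new Set();
  for (const pattern of TEST_COMMAND_PATTERNS) {
    const match = text.match(pattern);
    if (match) found.add(match[0]);
  }
  return [...found];
}

const makefile = 'test:\n\tuv run pytest -q\nios:\n\txcodebuild test -scheme App\n';
console.log(findTestCommands(makefile));
```

A non-empty result from a repo file is the kind of evidence that lets a stack-specific check pass even when CLAUDE.md never mentions the command.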
581
+
582
+ ### Re-audit per-archetype uplift
583
+
584
+ | Archetype | Before | After | Δ | Framework FNs resolved |
585
+ |---|---:|---:|---:|---|
586
+ | Flutter mobile | 14 | 25 | **+11** | 4 → 1 (build cmd advisory only) |
587
+ | iOS Swift | 11 | 26 | **+15** | 4 → 0 |
588
+ | Python ML | 14 | 23 | **+9** | 4 → 1 |
589
+ | Python FastAPI | 11 | 21 | **+10** | 4 → 1 |
590
+
591
+ Average uplift: **+11.25 points**. 14/15 framework-native false
592
+ negatives flipped to pass/N/A; the residual 4 × `buildCommand` are
593
+ legitimately advisory (category (c)).
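The average uplift can be re-derived from the Before and After columns of the table above; the object below simply restates those eight figures.

```javascript
// Before/after scores copied from the per-archetype table.
const scores = {
  'Flutter mobile': { before: 14, after: 25 },
  'iOS Swift': { before: 11, after: 26 },
  'Python ML': { before: 14, after: 23 },
  'Python FastAPI': { before: 11, after: 21 },
};

// Per-archetype delta, then the arithmetic mean.
const deltas = Object.values(scores).map(({ before, after }) => after - before);
const averageUplift = deltas.reduce((sum, d) => sum + d, 0) / deltas.length;

console.log(deltas);        // [ 11, 15, 9, 10 ]
console.log(averageUplift); // 11.25
```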
594
+
595
+ ### What is NOT changed
596
+
597
+ - No new top-level checks. Catalog count stays at 2,441.
598
+ - No check semantics inverted.
599
+ - No scoring weights, severity values, or rating values touched.
600
+ - CTO-08 `layer` tags preserved on every check.
601
+ - Claude PP-06 calibration unaffected: `strict_false_positive_keys.
602
+ claude` stays empty; `claude-na-gates.test.js` passes unchanged.
603
+
604
+ ### Verified
605
+
606
+ - jest: **419/419** passing, the `419`-test verification baseline (was 403 + 16 new framework-native
607
+ regression tests organised by stack in
608
+ `test/framework-native.test.js`).
609
+ - canonical CLI tests: **162/162** passing.
610
+ - `npm pack --dry-run`: clean.
611
+ - `node tools/validate-release-metadata.js --research <path>`:
612
+ validation passed for v1.26.0.
613
+
614
+ Evidence: `research/exp-cto-07-framework-native-2026-04-14.md`
615
+ includes the full archetype survey, before/after re-audit, and
616
+ methodology note on the deterministic fixtures used in Phase 3.
617
+
618
+ ## [1.25.0] - 2026-04-14
619
+
620
+ ### Added: 4-layer scope clarity (CTO-08)
621
+
622
+ Every check in the NERVIQ audit is now tagged with exactly one of
623
+ four layers. Closes the boundary-blur gap documented in the
624
+ 2026-04-14 CTO memo §6 ("Do evaluators understand the product
625
+ boundary before trust breaks?") and moves KPI question §6.2 from
626
+ PARTIAL → YES with measurable evidence. Source landed in commit
627
+ `a8676b1`; this commit packages the release.
628
+
629
+ The four layers:
630
+
631
+ - **`governance`**: agent configuration posture, i.e. presence, content,
632
+ and quality of agent-instruction files and platform settings.
633
+ Example: `claudeMdExists`, `geminiSettingsExists`, MCP server
634
+ declarations, hook presence.
635
+ - **`drift`**: cross-platform consistency and declared-vs-actual
636
+ alignment. Example: Harmony drift, Gemini propagation completeness,
637
+ rules consistency across surfaces.
638
+ - **`hygiene`** — repo-level cleanliness adjacent to agents (the
639
+ engineering baseline that makes an agent's job easier). Example:
640
+ `.gitignore`, CHANGELOG, SECURITY.md, LICENSE, Node version
641
+ pinning, editorconfig.
642
+ - **`shallow-risk`**: reserved for CTO-06 (agent-config ↔ codebase
643
+ boundary hints). No checks currently populate this layer; the
644
+ constant exists so formatters and downstream consumers know about
645
+ it for the future.
646
+
647
+ There is **no `deep-review` or `security` layer**, by design. NERVIQ
648
+ audits agent configuration and the cleanliness of the repo boundary
649
+ an agent operates inside. It does not perform dataflow analysis,
650
+ SAST, or general code review; those are out of scope and left to
651
+ dedicated tools. This is the contract that lets evaluators know
652
+ where our claim to ground-truth starts and stops.
653
+
654
+ ### Final layer distribution (2,441 checks)
655
+
656
+ | Layer | Count | % |
657
+ |---|---:|---:|
658
+ | governance | 1,102 | 45.1% |
659
+ | drift | 39 | 1.6% |
660
+ | hygiene | 1,300 | 53.3% |
661
+ | shallow-risk | 0 (reserved) | 0% |
662
+
663
+ Disambiguation rules (codified in `src/audit/layers.js` and
664
+ `docs/integration-contracts.md` §8):
665
+ - "Does my agent know X?" → `governance`.
666
+ - "Do two places agree on X?" → `drift`.
667
+ - "Does the repo have standard engineering hygiene?" → `hygiene`.
668
+ - When in doubt, default to `hygiene` (a mild misclassification is
669
+ recoverable; a missing tag breaks the coverage contract).
670
+
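The disambiguation rules above can be read as a small decision function. The sketch below is an illustration only: `classifyLayer` and its two boolean traits are hypothetical names, not the contents of `src/audit/layers.js`, and the order of the first two branches simply follows the order of the list.

```javascript
// Hypothetical sketch of the disambiguation rules; not the shipped implementation.
// traits.asksWhatAgentKnows  -> "Does my agent know X?"
// traits.comparesTwoSurfaces -> "Do two places agree on X?"
function classifyLayer(traits = {}) {
  if (traits.asksWhatAgentKnows) return "governance";
  if (traits.comparesTwoSurfaces) return "drift";
  // Everything else, including the "when in doubt" case, lands in hygiene
  // so that no check is left without a layer tag.
  return "hygiene";
}

console.log(classifyLayer({ comparesTwoSurfaces: true }));
console.log(classifyLayer({}));
```

Returning a default instead of `undefined` mirrors the stated trade-off: a mild misclassification is recoverable, a missing tag is not.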
671
+ ### Surfaced in every output format
672
+
673
+ - **JSON**: `auditResult.results[].layer`,
674
+ `auditResult.topNextActions[].layer`, and a new
675
+ `auditResult.layerSummary` giving per-layer
676
+ `{ total, passed, failed, skipped }`.
677
+ - **Text**: "Coverage by layer:" summary block plus a small
678
+ `[layer]` prefix on failed-check names.
679
+ - **Markdown (`--format=markdown`)**: `layer` column in the failed-
680
+ checks table; `_layer: X_` suffix on each top-action checklist item.
681
+ - **JUnit (`--format=junit`)**: `layer="..."` attribute on every
682
+ `<testcase>`.
683
+ - **CSV (`--format=csv`)**: new `layer` column between `category`
684
+ and `rating`. Updated contract in `docs/integration-contracts.md` §7.
685
+
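For consumers of the JSON output, the per-layer summary can be reproduced from `results[]`. This is a minimal sketch under one assumption that the changelog does not spell out here: each result carries a `passed` field that is `true`, `false`, or `null` (N/A), and `null` is counted as skipped. `summarizeByLayer` is a hypothetical helper name.

```javascript
// Hypothetical helper: rebuilds a { total, passed, failed, skipped } summary
// per layer from a list of { layer, passed } results.
const LAYER_NAMES = ["governance", "drift", "hygiene", "shallow-risk"];

function summarizeByLayer(results) {
  const summary = Object.fromEntries(
    LAYER_NAMES.map((layer) => [layer, { total: 0, passed: 0, failed: 0, skipped: 0 }])
  );
  for (const result of results) {
    const bucket = summary[result.layer];
    if (!bucket) throw new Error(`unknown layer: ${result.layer}`);
    bucket.total += 1;
    if (result.passed === true) bucket.passed += 1;
    else if (result.passed === false) bucket.failed += 1;
    else bucket.skipped += 1; // assumption: null / undefined means N/A
  }
  return summary;
}

const demo = summarizeByLayer([
  { layer: "governance", passed: true },
  { layer: "governance", passed: false },
  { layer: "hygiene", passed: null },
]);
console.log(demo.governance, demo.hygiene);
```

Seeding every layer with zeros keeps the reserved `shallow-risk` bucket visible even though no checks populate it.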
686
+ ### Verified
687
+
688
+ - jest: **403/403** passing; this is the `403`-test verification baseline (was 391 + 7 coverage tests + 5
689
+ format surface tests).
690
+ - canonical CLI tests: **162/162** passing.
691
+ - `npm pack --dry-run`: clean.
692
+ - `node tools/validate-release-metadata.js --research <path>`:
693
+ validation passed for v1.25.0.
694
+
695
+ Evidence: `research/exp-cto-08-layer-clarity-2026-04-14.md` includes
696
+ the full distribution, ambiguous-call log, and KPI mapping.
697
+
698
+ ## [1.24.0] - 2026-04-14
699
+
700
+ ### Fixed — Claude calibration debt resolved (CTO-09 / PP-06)
701
+
702
+ Eleven Claude audit checks that were systematically firing as
703
+ false-positives on repos that did not opt in to their respective
704
+ agent-config surfaces now return `N/A` (null) instead of `false`.
705
+ Previously these were captured in a post-hoc allowlist
706
+ (`platform-parity-fp-rules.json.strict_false_positive_keys.claude`);
707
+ now the checks are honest at source.
708
+
709
+ The affected keys:
710
+
711
+ - `claudeLocalMd`, `autoMemoryAwareness`, `importSyntax`
712
+ (in `src/techniques/instructions.js`): N/A when the repo does
713
+ not opt in to the overrides/memory/import-syntax conventions.
714
+ `importSyntax` becomes a positive-signal check: it passes when
715
+ `@`-imports are present in CLAUDE.md, and is advisory only on
716
+ long (≥80 lines) CLAUDE.md files that would clearly benefit.
717
+ - `mcpServers`, `multipleMcpServers`, `context7Mcp`
718
+ (in `src/techniques/tools.js`) — N/A on repos that have no MCP
719
+ references anywhere. A new `_repoOptsInToMcp()` helper centralises
720
+ the detection.
721
+ - `dockerfile`, `dockerCompose`, `terraformFiles`, `hooksNotificationEvent`,
722
+ `subagentStopHook`
723
+ (in `src/techniques/automation.js`): N/A when no infra signal
724
+ exists (Dockerfile/`.tf`/`docker-compose*`) or when
725
+ `.claude/settings.json` has no `hooks` block. New
726
+ `_repoHasInfraSignal()` and `_repoHasHooksBlock()` helpers.
727
+
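The pattern shared by these eleven keys is a three-state result gated on an opt-in signal. The sketch below illustrates that pattern only; `withOptInGate`, `repoMentionsMcp`, and `hasMcpServer` are hypothetical names, the `files` map (path to content) is an assumed input shape, and the real `_repoOptsInToMcp()` helper may detect opt-in quite differently.

```javascript
// Three-state result: true = pass, false = fail, null = N/A (not applicable).
function withOptInGate(repoOptsIn, check) {
  return (files) => (repoOptsIn(files) ? check(files) : null);
}

// Hypothetical opt-in signal: any file whose content mentions MCP.
const repoMentionsMcp = (files) =>
  Object.values(files).some((text) => /mcp/i.test(text));

// Hypothetical check: at least one server declared in `.mcp.json`.
function hasMcpServer(files) {
  try {
    const config = JSON.parse(files[".mcp.json"] ?? "{}");
    return Object.keys(config.mcpServers ?? {}).length > 0;
  } catch {
    return false; // unparsable config counts as a failure, not as N/A
  }
}

const mcpServersCheck = withOptInGate(repoMentionsMcp, hasMcpServer);

console.log(mcpServersCheck({ "README.md": "A plain library." }));
console.log(mcpServersCheck({ "README.md": "We plan to add MCP servers." }));
```

Keeping the gate separate from the check means a repo that never opted in cannot produce a `false`, which is what removes the need for a post-hoc allowlist.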
728
+ ### Impact
729
+
730
+ - **PP-08 CI gate threshold restored to 0.05** (from the 0.15
731
+ holding pattern). The `fp_rate_threshold_notes` in
732
+ `research/platform-parity-corpus.json` documents the resolution:
733
+ any drift above 0.05 is now a real regression, not a calibration
734
+ debt issue.
735
+ - **Claude strict-FP rate dropped from ~11.99% to 0.00%** on the
736
+ cleanly-cloned repos in the PP-08 corpus (8/9; one long-path
737
+ checkout failure on Windows unrelated to CLI).
738
+ - **Per-repo total failures dropped by 6–10 checks each** on Claude
739
+ audits, matching the expected ~7.6 opt-in hits per repo that moved
740
+ from `false` → `null`.
741
+ - **`strict_false_positive_keys.claude` is now empty.** The post-hoc
742
+ allowlist is no longer needed.
743
+
744
+ ### Verified
745
+
746
+ - jest: **391/391** passing; this is the `391`-test verification baseline (was 369 + 22 new N/A-gate
747
+ regression tests in `test/claude-na-gates.test.js`, two per key).
748
+ - canonical CLI tests: **162/162** passing.
749
+ - `npm pack --dry-run`: clean.
750
+ - `node tools/validate-release-metadata.js --research <path>`:
751
+ validation passed for v1.24.0.
752
+ - PP-08 CI gate: all 6 platforms (claude, codex, cursor, gemini,
753
+ windsurf, aider) PASS at the restored 0.05 threshold.
754
+
755
+ Evidence: `research/exp-pp-06-claude-recalibration-debt-2026-04-14.md`
756
+ updated with a Resolution section at the top (per-key table,
757
+ before/after gate output, verification).
758
+
759
+ ## [1.23.0] - 2026-04-14
760
+
761
+ ### Added - Trust-recovery depth (CTO-04, CTO-05)
762
+
763
+ Ships the two deepest items from the 2026-04-14 CTO memo — the
764
+ evaluator-stated reasons trust breaks in real audits. Closing them
765
+ moves KPI questions §6.3 (file-level evidence) and §6.4 (score
766
+ impact before write) from NO/UNKNOWN → YES with verifiable evidence.
767
+ Formatter source landed in commit `e06ae64`; this commit packages
768
+ the release.
769
+
770
+ - **CTO-04 — File-level evidence (`file:line:snippet`).** Every
771
+ failed check that has a sensible file-level source now emits
772
+ `file`, `line`, and a `snippet` (2–5 lines of context, 300-char
773
+ cap) so markdown/junit/text outputs can point at real evidence
774
+ rather than abstract advice.
775
+ - New resolver registry in `src/audit/evidence.js` for the 20
776
+ highest-hitting check keys identified in a fresh self-audit.
777
+ - Survey result on self-audit of the nerviq repo: 0 of 23 failed
778
+ checks previously carried evidence; **9 of 23 now do**. The
779
+ remaining 14 are either category (c) — "absence-of-file"
780
+ checks like `claudeLocalMd` where a null pointer is the correct
781
+ semantic — or roll-ups where evidence would be misleading.
782
+ - Backlog of unresolved category (b) keys documented in the
783
+ evidence doc. 1 deferred (`skillUsesPaths`, blocked on CTO-06).
784
+ - Markdown formatter renders snippet as a fenced code block under
785
+ each checklist item; JUnit formatter appends it to the
786
+ `<failure>` body after `---`; CSV intentionally unchanged
787
+ (snippet newlines/commas would hurt downstream parsing).
788
+
789
+ - **CTO-05: Score-impact preview before `--apply`.** Each
790
+ `topNextActions` item now carries `projectedScoreDelta`,
791
+ `projectedOrganicScoreDelta`, and `projectedScoreAfter` so the
792
+ user sees "this fix moves score 67 → 74 (+7 pts)" before any
793
+ write. Projection is computed by one O(1) recompute per top
794
+ action using the existing scoring function (no extra full
795
+ audits, no scoring-algorithm changes).
796
+ - Text output appends ` (+N pts → X/100)` per top action.
797
+ - Markdown formatter shows the same suffix inline in the
798
+ checklist.
799
+ - CSV adds two trailing columns
800
+ `projectedScoreDelta,projectedScoreAfter`, populated only
801
+ for rows whose key appears in `topNextActions` (projection is
802
+ per-top-action, not per-every-check); other rows leave both
803
+ columns empty. Contract documented in
804
+ `docs/integration-contracts.md` §7.
805
+ - JUnit intentionally unchanged (testcases don't naturally carry
806
+ scores).
807
+
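The idea of a constant-time projection per top action can be sketched as follows. The scoring formula here (weighted pass ratio, N/A excluded) is an assumption made for illustration and is not Nerviq's actual scoring function; `weight`, `score`, and `projectTopActions` are hypothetical names, and `projectedOrganicScoreDelta` is left out because its definition is not given in this entry.

```javascript
// Assumed scoring rule for the sketch: round(100 * passed weight / applicable weight).
function score(passedWeight, applicableWeight) {
  return applicableWeight === 0 ? 0 : Math.round((100 * passedWeight) / applicableWeight);
}

function projectTopActions(results, topKeys) {
  // One pass to build the running totals; N/A (null) results are excluded.
  let passedWeight = 0;
  let applicableWeight = 0;
  for (const r of results) {
    if (r.passed === null) continue;
    applicableWeight += r.weight;
    if (r.passed) passedWeight += r.weight;
  }
  const current = score(passedWeight, applicableWeight);
  const byKey = new Map(results.map((r) => [r.key, r]));

  // Each projection reuses the totals, so it costs O(1) and needs no re-audit.
  return topKeys.map((key) => {
    const after = score(passedWeight + byKey.get(key).weight, applicableWeight);
    return { key, projectedScoreDelta: after - current, projectedScoreAfter: after };
  });
}

// Mirrors the documented text suffix ` (+N pts → X/100)`.
const formatPreview = (p) => ` (+${p.projectedScoreDelta} pts → ${p.projectedScoreAfter}/100)`;

const preview = projectTopActions(
  [
    { key: "a", weight: 6, passed: true },
    { key: "b", weight: 3, passed: false },
    { key: "c", weight: 1, passed: false },
    { key: "d", weight: 5, passed: null },
  ],
  ["b", "c"]
);
console.log(preview.map(formatPreview));
```

The sketch assumes every key in `topKeys` refers to a currently failing check, which matches the stated scope (projection is per top action, not per check).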
808
+ ### Verified
809
+
810
+ - jest: **369/369** passing — this is the `369`-test verification baseline. (was 354 + 9 new
811
+ evidence tests + 3 new score-preview tests + 3 markdown extensions
812
+ + 1 junit extension + 2 csv extensions).
813
+ - canonical CLI tests: **162/162** passing.
814
+ - `npm pack --dry-run`: clean (213 files, 757 kB).
815
+ - `node tools/validate-release-metadata.js --research <path>`:
816
+ validation passed for v1.23.0.
817
+
818
+ Evidence: `research/exp-cto-04-05-trust-recovery-2026-04-14.md`
819
+ in the research repo (~263 lines) includes the full per-check
820
+ survey, worked projection example, markdown + CSV samples with
821
+ the new fields, and explicit mapping back to the 8 memo KPI
822
+ questions.
823
+
824
+ ## [1.22.0] - 2026-04-14
825
+
826
+ ### Added — CI output format pack (CTO-01, CTO-02, CTO-03)
827
+
828
+ Three new output formats for `nerviq audit`, designed to plug the CLI
829
+ straight into standard CI surfaces. Closes the "Markdown PR comment /
830
+ JUnit XML / CSV" gap called out in the 2026-04-14 CTO memo §8 — the
831
+ plumbing required before "no serious multi-agent repo merges without
832
+ a Nerviq check" is even claimable as positioning.
833
+
834
+ - **`--format=markdown` (CTO-01)** — GitHub-flavoured markdown
835
+ suitable for a PR comment. Includes a `## Score: N/100` header with
836
+ shields.io badge, a `### Top next actions` task-list checklist (up
837
+ to 5 items, each with severity + key + optional `file:line`), a
838
+ collapsible `<details>` block listing all failed checks in a pipe
839
+ table, and a `Generated by [Nerviq](https://nerviq.net)` footer.
840
+ Pipe characters inside cells are backslash-escaped. No raw HTML
841
+ beyond `<details>` / `<summary>`.
842
+
843
+ - **`--format=junit` (CTO-02)**: Jenkins-compatible JUnit XML.
844
+ `<testsuites name="nerviq" tests="N" failures="F" skipped="S">`
845
+ root, one `<testsuite>` per check category, one `<testcase>` per
846
+ check (`classname=category`, `name=key`). Failed checks emit
847
+ `<failure message="..." type="SEVERITY">` with body containing
848
+ `name [at file:line] [(sourceUrl)]`. Skipped checks emit `<skipped/>`.
849
+ All attribute values + text nodes XML-escape `& < > " '`. Parses
850
+ cleanly with GitHub Actions test reporter, GitLab JUnit reporter,
851
+ and Jenkins JUnit plugin.
852
+
853
+ - **`--format=csv` (CTO-03)**: RFC 4180 CSV. Header row
854
+ `key,id,name,category,rating,severity,passed,file,line,sourceUrl,fix`
855
+ followed by one row per check. Fields containing comma, double-quote,
856
+ CR, or LF are wrapped in double-quotes; internal double-quotes are
857
+ escaped by doubling. No UTF-8 BOM (avoids pandas / Excel friction).
858
+ LF line separator.
859
+
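The CSV quoting rule described above is small enough to show in full. This is a sketch of the rule as stated (quote on comma, double-quote, CR, or LF; double any internal quotes; LF between rows; no BOM), not the formatter shipped in the package. `csvField`, `csvRow`, and `toCsv` are hypothetical names, and whether the real output ends with a trailing LF is not stated, so none is emitted here.

```javascript
// Quote a single field only when RFC 4180 requires it.
function csvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function csvRow(fields) {
  return fields.map(csvField).join(",");
}

// Header plus rows, LF-separated, with no BOM prefix.
function toCsv(header, rows) {
  return [header, ...rows].map(csvRow).join("\n");
}

console.log(toCsv(["key", "fix"], [["k1", 'Run "lint", then test']]));
```

Quoting only when needed keeps simple rows readable in a plain `grep`, while still round-tripping through any RFC 4180 parser.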
860
+ Wired into `bin/cli.js` `--format` switch alongside existing
861
+ `json|sarif|otel`. Format contracts documented in
862
+ `docs/integration-contracts.md` §7 as the stable consumer API for
863
+ downstream wrappers (GitHub Actions, Jenkins plugins, GitLab reporters,
864
+ dashboards), which should bind to these shapes rather than scrape text output.
865
+
866
+ ### Verified
867
+
868
+ - jest: **354/354** passing — this is the `354`-test verification baseline. (was 335 + 19 new format tests:
869
+ `test/format-markdown.test.js`, `test/format-junit.test.js`,
870
+ `test/format-csv.test.js` covering field shape, escaping rules,
871
+ edge cases like missing `file:line`, and full round-trip parse
872
+ on synthetic audit results).
873
+ - canonical CLI tests: **162/162** passing.
874
+ - `npm pack --dry-run`: clean (212 files, 754 kB).
875
+ - `node tools/validate-release-metadata.js --research <path>`:
876
+ validation passed for v1.22.0.
877
+
878
+ Evidence: `research/exp-cto-01-03-formats-2026-04-14.md` in the
879
+ research repo includes sample outputs and a GitHub Actions integration
880
+ recipe.
881
+
882
+ ## [1.21.0] - 2026-04-14
883
+
884
+ ### Calibrated (not certified) — Aider platform audit (PP-04)
885
+
886
+ Aider platform audit recalibrated against 10 real Aider-using repos
887
+ (`Aider-AI/aider`, `sysown/proxysql`, `Provenance-Emu/Provenance`,
888
+ `disler/always-on-ai-assistant`, `SquirrelJME/SquirrelJME`, `ad-si/tu`,
889
+ `Aider-AI/conventions`, `commit-0/commit0`, `roychri/mcp-server-asana`,
890
+ `attestate/kiwistand`).
891
+
892
+ Seven systematic 10/10 false-positives eliminated:
893
+
894
+ - `aiderUndoSafetyAware` (10/10 → 5/10)
895
+ - `aiderEditorModelConfigured` (10/10 → 0/10)
896
+ - `aiderWeakModelConfigured` (10/10 → 5/10)
897
+ - `aiderModelSettingsFileExists` (10/10 → 5/10)
898
+ - `aiderAiderignoreExists` (10/10 → 5/10)
899
+ - `aiderEnvFileExists` (10/10 → 5/10) — true FP: `.env` is gitignored;
900
+ now accepts `.env.example` / `.sample` / `.template`.
901
+ - `aiderAllConfigSurfacesPresent` (10/10 → 5/10) — true FP, same root cause.
902
+
903
+ Four additional ≥9/10 FPs sharply reduced: `aiderGitHooksForPreCommit` 9→3,
904
+ `aiderBrowserModeForDocs` 9→5, `aiderPlaywrightUrlScraping` 9→4,
905
+ `aiderVersionPinned` 9→0 (N/A on non-Python projects).
906
+
907
+ Six opt-in tuning knobs converted to pass-or-N/A semantics:
908
+ `aiderMapTokensConfigured`, `aiderEditFormatConfigured`,
909
+ `aiderArchitectModeAvailable`, `aiderCachePromptsEnabled`,
910
+ `aiderCommitPrefixConfigured`, `aiderVoiceModeAware`; they no longer
911
+ fire as advisories on repos that do not opt in.
912
+
913
+ Newly recognised conventions: `.aider.conf.yaml` (alt extension),
914
+ `AGENTS.md` / `CLAUDE.md` / `.ai/instructions.md` / `AIDER.md` as
915
+ alternative convention surfaces, `.env.example` / `.sample` / `.template`
916
+ as env-contract surfaces.
917
+
918
+ 10-repo corpus moved from baseline 38–64 → final 44–82. 2/10 reach ≥70
919
+ (kiwistand 82, proxysql 72). The other 8 are below 70 due to documented
920
+ genuine content gaps in the audited repos themselves, not audit bugs.
921
+
922
+ **Why "calibrated, not certified":** same judgment as Windsurf (PP-03).
923
+ Strict-FP <5% bar is met; all-10-≥70 + mature-repos-≥73 bar is not,
924
+ because public Aider adoption above 500 stars is sparse. PPI stays at
925
+ **0.75** until corpus expansion.
926
+
927
+ ### Fixed - release drift guard prefers `-main` worktrees
928
+
929
+ `tools/validate-release-metadata.js` now prefers `../nerviq-research-main`
930
+ and `../nerviq-site-main` when those worktrees exist, falling back to
931
+ `../nerviq-research` / `../nerviq-site` otherwise. When a parallel-agent
932
+ worktree on a feature branch occupies the canonical `nerviq-research`
933
+ directory, the drift guard was reading the feature-branch state and
934
+ refusing publish even though the actual main branch was synced.
935
+ Single-worktree setups are unaffected.
936
+
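The preference order described above amounts to a two-step path lookup. The following is a sketch under stated assumptions: `resolveSiblingRepo` is a hypothetical name, the `exists` parameter is injected only so the logic can be exercised without touching disk, and the real `tools/validate-release-metadata.js` may structure this differently.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Prefer "../<name>-main" when that worktree exists, else fall back to "../<name>".
function resolveSiblingRepo(baseDir, name, exists = fs.existsSync) {
  const preferred = path.resolve(baseDir, "..", `${name}-main`);
  if (exists(preferred)) return preferred;
  return path.resolve(baseDir, "..", name);
}

// With a fake `exists`, the choice is easy to observe.
console.log(resolveSiblingRepo("/work/nerviq", "nerviq-research", () => true));
console.log(resolveSiblingRepo("/work/nerviq", "nerviq-research", () => false));
```

Because the fallback is the original path, a single-worktree checkout resolves exactly as it did before.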
937
+ ### Verified
938
+
939
+ - jest: **335/335** passing — this is the `335`-test verification baseline.
940
+ - canonical CLI tests: **162/162** passing.
941
+ - aider matrix: **315/315** passing (was 308, +6 PP-04 regression tests).
942
+ - `npm pack --dry-run`: clean.
943
+ - `node tools/validate-release-metadata.js --research <path>`: validation
944
+ passed for v1.21.0.
945
+ - PP-08 CI gate: all 6 platforms (claude, codex, cursor, gemini, windsurf,
946
+ aider) PASS at the current threshold.
947
+
948
+ ## [1.20.1] - 2026-04-14
949
+
950
+ ### Fixed — Critical: bin/cli.js shebang regression
951
+
952
+ `bin/cli.js` was missing the `#!/usr/bin/env node` shebang since v1.16.x (commit `40c27b8` on 2026-04-12, which fixed a macOS pipe-flush issue and accidentally dropped the shebang while restructuring the file). Without a shebang, `npx @nerviq/cli` failed on Linux and Mac because the OS fell back to `/bin/sh` and tried to execute JavaScript as a shell script (`//: Permission denied / Syntax error`). Windows installs were unaffected because npm generates `.cmd` wrappers that invoke `node` explicitly.
953
+
954
+ This was discovered when wiring up the PP-08 CI gate against `npx @nerviq/cli@1.20.0`. Likely affected production users on Linux/macOS doing fresh `npx` installs since 2026-04-12.
955
+
956
+ - Restored `#!/usr/bin/env node` as the first line of `bin/cli.js`.
957
+ - Added `test/bin-shebang.test.js` regression test that scans every `bin` entry in `package.json` and asserts the shebang exists. Will catch any future drop of the shebang line on any bin script.
958
+
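A regression check of this kind can be sketched in a few lines. This is not the content of `test/bin-shebang.test.js`; `binPaths`, `hasNodeShebang`, and `missingShebangs` are hypothetical names, and the file reader is passed in so the example runs without a real package on disk. It relies on the fact that npm allows `bin` in `package.json` to be either a string or an object of name-to-path entries.

```javascript
// Collect every bin script path declared in a parsed package.json.
function binPaths(pkg) {
  if (typeof pkg.bin === "string") return [pkg.bin];
  return Object.values(pkg.bin ?? {});
}

// The shebang must be the very first line of the file.
function hasNodeShebang(source) {
  return source.split(/\r?\n/, 1)[0] === "#!/usr/bin/env node";
}

// Returns the bin paths whose files lack the shebang.
function missingShebangs(pkg, readFile) {
  return binPaths(pkg).filter((file) => !hasNodeShebang(readFile(file)));
}

const fakeFiles = {
  "bin/cli.js": "#!/usr/bin/env node\nconsole.log('ok');\n",
  "bin/broken.js": "// comment first\n#!/usr/bin/env node\n",
};
console.log(missingShebangs(
  { bin: { nerviq: "bin/cli.js", broken: "bin/broken.js" } },
  (file) => fakeFiles[file]
));
```

In a real test, `readFile` would typically be `(file) => fs.readFileSync(file, "utf8")` and the assertion would be that the returned list is empty.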
959
+ ### Fixed - claudeMdContent pointer expansion accepts `@` imports
960
+
961
+ `ProjectContext.claudeMdContent()` in `src/context.js` recognizes when CLAUDE.md is a thin pointer to another file (e.g., `AGENTS.md`) and expands it. The expansion regex `/^[a-zA-Z0-9_./-]+\.(md|txt|rst)$/` did not accept Claude Code's standard `@`-prefixed import syntax (`@AGENTS.md`, `@./docs/CODING.md`). Repos using the standard syntax saw all memory/prompting/quality checks fail because the auditor only saw the 1-line pointer.
962
+
963
+ Discovered while investigating the NERVIQ site's self-dogfood score (25 → 85 after this fix plus content enrichment).
964
+
965
+ - Updated regex to `/^@?\.?\/?[a-zA-Z0-9_./-]+\.(md|txt|rst)$/`; resolver strips `@` and `./` prefixes before `fileContent()`.
966
+ - Added `test/context.test.js` (+6 tests) covering raw content, bare-filename pointer, `@`-prefix, `@./`-prefix, nested-subdir, and null-fixture cases.
967
+
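The difference between the two regexes quoted above is easy to see side by side. The patterns are copied from this entry; `pointerTarget` is a hypothetical stand-in for the resolver step that strips `@` and `./` before the file is read, not the code in `src/context.js`.

```javascript
const OLD_POINTER = /^[a-zA-Z0-9_./-]+\.(md|txt|rst)$/;
const NEW_POINTER = /^@?\.?\/?[a-zA-Z0-9_./-]+\.(md|txt|rst)$/;

// Returns the referenced path when the whole content is a one-line pointer, else null.
function pointerTarget(content) {
  const line = content.trim();
  if (!NEW_POINTER.test(line)) return null;
  return line.replace(/^@/, "").replace(/^\.\//, "");
}

for (const sample of ["AGENTS.md", "@AGENTS.md", "@./docs/CODING.md", "# Rules\nUse tabs."]) {
  console.log(JSON.stringify(sample), OLD_POINTER.test(sample.trim()), pointerTarget(sample));
}
```

Real multi-line content never matches because the pattern is anchored at both ends and admits no whitespace, so only thin pointer files are expanded.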
968
+ ### Added - `prepublishOnly` lifecycle script
969
+
970
+ `package.json` now wires the existing pre-publish drift guard (`tools/pre-publish.js`) to npm's `prepublishOnly` lifecycle, in addition to the manual `prepublish:check` alias. `npm publish` now blocks automatically on dirty tree, branch drift, missing CHANGELOG entry, jest failure, or release-metadata drift. `npm pack --dry-run` does not trigger it (verified) so local development is unaffected.
971
+
972
+ ### Calibrated (not certified) — Windsurf platform audit (PP-03)
973
+
974
+ Windsurf platform audit recalibrated against 10 real Windsurf-using repos (`grapeot/devin.cursorrules`, `hyper-mcp-rs/hyper-mcp`, `dxos/dxos`, `snowflakedb/gosnowflake`, `ShareX/XerahS`, `Brawl345/Image-Reverse-Search-WebExtension`, `rudrankriyam/Ichi`, `snyk/snyk-intellij-plugin`, `wepublish/wepublish`, `AmadeusITGroup/otter`).
975
+
976
+ Three systematic 10/10 false-positives eliminated:
977
+ - `windsurfMemoriesConfigured`: opt-in memories surface; now N/A when absent.
978
+ - `windsurfPackMcpRecommended`: opt-in MCP recommendation; now N/A when absent.
979
+ - `windsurfAdvisoryMcpHealth`: **real bug fix**. It was reading the host's `os.platform()` and asserting it inside the audited repo's advisory. It is now host-agnostic and uses repo-local evidence only (Windows/WSL gate generalised).
980
+
981
+ Other improvements: pointer/`@import` expansion for Windsurf instruction surfaces (`.windsurf/rules/*`, `WINDSURF.md`, pointer files like `.ai/instructions.md`), `.windsurfrules/` directory form support, fallback to `AGENTS.md`/`CLAUDE.md` for stack-marker generalisation, frontmatter realism for `.mdc` files.
982
+
983
+ 10-repo corpus moved from baseline 9–70 → final 32–83. 7/10 ≥70. The 3 below 70 (hyper-mcp 69, Ichi 64, wepublish 60) are documented genuine content-depth gaps in the audited repos themselves, not audit bugs. The 32 outlier (`grapeot/devin.cursorrules`) uses the deprecated single-file `.windsurfrules` legacy format.
984
+
985
+ **Why "calibrated, not certified":** Gemini PP-02 cleared "all 10 ≥70" and "all mature (>10K stars) ≥73". Windsurf cleared the strict-FP <5% bar (the primary criterion) but Windsurf public adoption is thinner than Gemini at equivalent star thresholds — the largest mature repo found was 5.9K stars. PPI stays at **0.75** until corpus expansion produces a mature-repo set passing the score floor. No inflated PPI claim shipped.
986
+
987
+ ### Verified
988
+
989
+ - jest: **335/335** passing (was 326 + 6 new context tests + 3 new shebang tests); this is the `335`-test verification baseline.
990
+ - canonical CLI tests: **162/162** passing.
991
+ - matrix: **311/0** passing.
992
+ - `npm pack --dry-run`: clean.
993
+ - `node tools/validate-release-metadata.js --research ../nerviq-research-main`: validation passed.
994
+
995
+ ## [1.20.0] - 2026-04-13
996
+
997
+ ### Fixed - Gemini Platform Parity (PP-02, 10-repo calibration)
998
+
999
+ Gemini becomes the **5th certified platform** (PPI 0.625 → **0.75**). Calibrated against 10 real Gemini-using repos (google-gemini/gemini-cli, google-gemini/cookbook, GoogleCloudPlatform/generative-ai, obra/superpowers, JuliusBrussee/caveman, google/site-kit-wp, google/dotprompt, vdesabou/kafka-docker-playground, OthmanAdi/planning-with-files, mscraftsman/generative-ai).
1000
+
1001
+ Key calibrations:
1002
+ - `_expandGeminiMdImports` resolves `@path.md` imports and single-line-pointer `GEMINI.md` files (observed in google/dotprompt).
1003
+ - Fallback chain for Gemini instruction surface: AGENTS.md → CLAUDE.md → `.gemini/styleguide.md` (Gemini Code Assist convention).
1004
+ - `isMcpOnlySettings` helper: 5 CLI-behaviour checks go N/A on MCP-only `.gemini/settings.json`.
1005
+ - `geminiSettingsExists` / `geminiCommandsExist` now N/A when the directory is absent rather than flagging a failure; these surfaces are opt-in.
1006
+ - Broadened `docsBundle` to accept AGENTS/CLAUDE/CONTRIBUTING/ARCHITECTURE/DEVELOPMENT as documentation evidence.
1007
+ - `geminiEnvApiKey` credits ADC, Vertex AI, `gemini auth`, and service-account flows (not just `GEMINI_API_KEY`).
1008
+ - Tightened `geminiPropagationCompleteness`: the bare word "skills" was firing FPs.
1009
+ - **Bug fix:** `context.fileName` can legally be an array per the Gemini CLI schema. `path.join` crashed with `TypeError` on `google/site-kit-wp`. Now handled.
1010
+
1011
+ ### Measured (strict FP <5% across 10-repo corpus)
1012
+
1013
+ | Repo | Stars | Before | After |
1014
+ |---|---|---|---|
1015
+ | obra/superpowers | 148K | 73 | **88** |
1016
+ | google-gemini/gemini-cli | 101K | 74 | **89** |
1017
+ | JuliusBrussee/caveman | 21K | 75 | **94** |
1018
+ | OthmanAdi/planning-with-files | 18K | 72 | **73** |
1019
+ | google-gemini/cookbook | 17K | 73 | **94** |
1020
+ | GoogleCloudPlatform/generative-ai | 17K | 73 | **88** |
1021
+ | google/site-kit-wp | 1.4K | crash | **78** |
1022
+ | vdesabou/kafka-docker-playground | 778 | 68 | **83** |
1023
+ | google/dotprompt | 507 | 64 | **75** |
1024
+ | mscraftsman/generative-ai | 206 | 64 | **70** |
1025
+
1026
+ All 10 repos ≥ 70; all 6 mature repos (>10K stars) ≥ 73.
1027
+
1028
+ - **Gemini Platform Parity: certified**. PPI: 0.625 → **0.75** (Claude + Cursor + Codex + Copilot + Gemini).
1029
+
1030
+ 326/326 tests pass (+2 PP-02 regressions on top of v1.19.0's 324) — this is the `326`-test verification baseline.
1031
+
1032
+ ## [1.19.0] - 2026-04-13
1033
+
1034
+ ### Added
1035
+ - **EXP-04: `nerviq audit --fix` autofix flow**. `audit --fix` now runs the audit, applies fixable critical fixes, writes rollback manifests for successful writes, and re-audits before returning an exit code.
1036
+ - **Autofix docs**. Added `docs/autofix.md` with command examples, safety behavior, and exit-code semantics for the new one-shot flow.
1037
+ - **GOV-03: Time-to-First-Value benchmark** (`tools/ttfv-benchmark.py`). Measured harness across 4×4 install/repo combos; verdict on "<2 min" claim: TRUE (slowest median 16.1s on npx cold × nerviq-research).
1038
+
1039
+ ### Changed
1040
+ - **Shared fix engine now covers instruction-surface autofix**. Missing `CLAUDE.md`, verification guidance, and safe hygiene templates can now be applied through the same fix pipeline used by the CLI write paths.
1041
+
1042
+ ### Tests
1043
+ - Added `test/audit-fix.test.js` coverage for dry-run, auto-apply, rollback artifacts, `DO NOT AUTOEDIT` safety skips, exit-code handling, and hygiene rollback verification.
1044
+
1045
+ 324/324 tests pass.
1046
+
1047
+ ## [1.18.0] - 2026-04-13
1048
+
1049
+ ### Fixed — Copilot Platform Parity (PP-01, 10-repo calibration)
1050
+
1051
+ - **Copilot audit now recognizes real-world repo conventions.** Calibrated against 10 active Copilot-using repos (home-assistant/core, block/goose, microsoft/vscode, astral-sh/uv, microsoft/playwright, langchain-ai/langchain, microsoft/typescript-go, microsoft/semantic-kernel, dotnet/aspire, github/awesome-copilot).
1052
+ - **JSONC tolerance in `.vscode/settings.json`**: parser now strips comments/trailing commas before evaluation (Copilot/VSCode honor JSONC; strict-JSON parsing produced false CP-B06 failures).
1053
+ - **Context fallback for AGENTS.md / CLAUDE.md**: repos that centralize agent guidance in AGENTS.md or CLAUDE.md at repo root are no longer penalized for `.github/copilot-instructions.md` substance checks.
1054
+ - **Stack-docs bundle helper**: 45 stack/domain checks now accept a documented bundle of per-stack signals (pyproject.toml + ruff.toml, Cargo.toml + rustfmt.toml, go.mod + golangci.yml, etc.) rather than requiring a single canonical file.
1055
+
1056
+ ### Measured (strict FP rate < 5% across 10-repo corpus)
1057
+
1058
+ | Repo | Stars | Before | After |
1059
+ |---|---|---|---|
1060
+ | home-assistant/core | 86K | 42 | **76** |
1061
+ | block/goose | 41K | 41 | **76** |
1062
+ | microsoft/vscode | 183K | 46 | **61** |
1063
+ | astral-sh/uv | 83K | 28 | **75** |
1064
+ | microsoft/playwright | 86K | 46 | **66** |
1065
+ | langchain-ai/langchain | 133K | 23 | **65** |
1066
+ | microsoft/typescript-go | 25K | — | **66** |
1067
+ | microsoft/semantic-kernel | 27K | 33 | **53** |
1068
+ | dotnet/aspire | 6K | 35 | **59** |
1069
+ | github/awesome-copilot | | 45 | **59** |
1070
+
1071
+ All 10 repos ≥ 40; all 9 mature repos (>10K stars) ≥ 53.
1072
+
1073
+ - **Copilot Platform Parity: certified**. PPI: 0.5 → **0.625** (Claude + Cursor + Codex + Copilot).
1074
+
1075
+ ### Added
1076
+ - EXPERIMENTAL qualifiers surfaced consistently on all user-facing Synergy mentions in README, docs/why-nerviq.md, docs/api-reference.md (SYN-04 audit).
1077
+
1078
+ 317/317 tests pass.
1079
+
1080
+ ## [1.17.3] - 2026-04-12
1081
+
1082
+ ### Fixed - Codex Platform Parity (Issue #35, 10-repo scale-up)
1083
+
1084
+ - **Hook checks now require Codex-specific evidence**. hooksClaimed() previously matched any generic 'hook' mention in AGENTS.md — triggering FPs on git hooks, React hooks, or dependency names like 'hookable'. Now requires .codex/hooks/, .codex/hooks.json, [hooks]/codex_hooks in config.toml, specific Codex event names (SessionStart, PreToolUse, PostToolUse, UserPromptSubmit), or explicit 'codex hooks' phrase. Fixes jessfraz/dotfiles, ModelEngine-Group/fit-framework, finbarr/yolobox.
1085
+ - **codexPackRecommendationQuality accepts .NET / Gradle manifests**. Added .sln, .slnx, .csproj, .fsproj, .vbproj, Directory.Packages.props, Directory.Build.props, global.json, gradlew. Fixes Megabit/Blazorise.
1086
+ - **codexNoInstructionContradictions ignores line-ending guidance**. CRLF/LF/trailing-newline/EOF rules are style preferences, not logical contradictions.
1087
+ - **codexAgentsMd accepts .codex/AGENTS.md**. Some repos store AGENTS.md inside .codex/.
1088
+
1089
+ ### Measured
1090
+ - jessfraz/dotfiles: 50 → 67 (hook FPs removed, +17 points)
1091
+ - Codex strict FP rate: 5.98% → <5% on 10-repo scale-up
1092
+ - **Codex Platform Parity: certified**. PPI: 0.375 → **0.5** (Claude + Cursor + Codex)
1093
+
1094
+ 315/315 tests pass.
1095
+
1096
+ Closes #35
1097
+
1098
+ ## [1.17.2] - 2026-04-12
1099
+
1100
+ ### Fixed
1101
+ - **`.codex/AGENTS.md` now recognized as a valid Codex instruction surface**. `agentsMdPath()` previously only checked root `AGENTS.md`, missing the emerging pattern of keeping Codex instructions inside `.codex/` (e.g., jessfraz/dotfiles stores a 12KB AGENTS.md there). This fix cascades to every check that reads `agentsContent()`, including `codexPackRecommendationQuality` — the last remaining FP in Codex re-validation.
1102
+
1103
+ ### Measured
1104
+ - jessfraz/dotfiles: 47 → 50, `codexPackRecommendationQuality` FAIL → PASS
1105
+ - Codex strict FP rate: <5% across both re-validation repos; ready to scale to 10
1106
+
1107
+ ## [1.17.1] - 2026-04-12
1108
+
1109
+ ### Fixed — Platform Parity re-validation (after v1.17.0)
1110
+
1111
+ - **codexPythonPackageStructure (CX-PY19)**: Now probes common package layouts directly via filesystem scan instead of relying on `ctx.files` (which only lists root entries). Correctly detects `src/<package>/__init__.py` and flat `<package>/__init__.py` layouts. Fixes false negative on openai/openai-agents-python.
1112
+ - **codexPackRecommendationQuality (CX-N03)**: Returns N/A for dotfiles/config-only repos (detected via 2+ signals from `.zshrc`, `.bashrc`, `.vimrc`, `.tmux.conf`, `.gitconfig`, `install.sh`, `bootstrap.sh`). Pack recommendations are not meaningful for non-code repos.
1113
+ - **cursorBugbotEnabled (CU-J01)**: Severity downgraded medium → low. Returns N/A unless repo shows BugBot evidence (bugbot config file, `.github/workflows` reference, or docs mention). BugBot is an optional Cursor enterprise feature, so there is no sense failing every repo that doesn't use it.
1114
+
1115
+ ### Measured
1116
+ - **PP-02 Codex**: openai-agents-python 72 → 73. 2 remaining FPs resolved.
1117
+ - **PP-02 Cursor**: CU-J01 no longer fires on every repo with rules. Strict FP rate 4.9% → 0%.
1118
+
1119
+ ## [1.17.0] - 2026-04-12
1120
+
1121
+ ### Fixed - Cursor (from Platform Parity audit, Issue #32)
1122
+ - **CU-A01 (cursorRulesExist)**: Now follows file-redirect pattern. When `.cursor/rules` is a text file pointing to another path (e.g., `agents/rules/`), the rules are read from the redirect target. Fixes false negative on cal.com-style layouts.
1123
+ - **CU-A02 (cursorNoLegacyCursorrules)**: Returns N/A when repo has zero Cursor configuration. Fixes the calibration inversion where no-config repos outscored legacy-format repos.
1124
+ - **CU-C01 (cursorPrivacyMode)**: Severity downgraded from `critical` to `low`. Returns N/A when no rules exist. Privacy Mode is stored in SQLite state.vscdb and not meaningfully auditable from repo files.
1125
+
1126
+ ### Fixed - Codex (from Platform Parity audit, Issue #33)
1127
+ - **codexAgentsArchitecture (CX-A04)**: Expanded heading recognition to include "Project Structure Guide", "Repo Structure", "Repository Layout", "Codebase Guide", "Key Directories" and enumerated directory maps. Fixes false negative on openai/openai-agents-python.
1128
+ - **codexCliAuthCredentialsStoreExplicit (CX-B12)**: Tightened managed-machine heuristic to require explicit terms (`managed device`, `shared workstation`, `multi-user host`, `VDI`, `kiosk`, `enterprise-managed`). No longer triggers on generic words like "shared utilities" or "server-managed".
1129
+ - **codexMcpPresentIfRepoNeedsExternalTools (CX-F01)**: Returns N/A for SDK/library repos (detected via package manifest + README patterns). SDKs document integrations without needing project-scoped MCP.
1130
+ - **codexSkillsHaveMetadata**: Now accepts YAML frontmatter (`name`, `description`) as valid metadata. Fixes false negative on repos using OpenAI-style SKILL.md.
1131
+ - **codexPythonFormatterConfigured (CX-PY08)**: Accepts broader Ruff setups (any `[tool.ruff]` section, not just `[tool.ruff.format]`), yapf, autopep8, and standalone config files.
1132
+ - **codexPythonFastapiEntryDocumented (CX-PY10)**: Returns N/A when FastAPI appears only in examples/dev deps. Also checks AGENTS.md for entry point documentation.
1133
+ - **codexPythonMigrationsDocumented (CX-PY11)**: Returns N/A for SDK/library repos and when repo has no DB dependencies.
1134
+ - **codexPythonPackageStructure (CX-PY19)**: Path-separator-agnostic regex works correctly on Windows.
1135
+ - **codexPackRecommendationQuality (CX-N03)**: Removed `package.json` as universal requirement. Now accepts any primary manifest (pyproject.toml, Cargo.toml, go.mod, Gemfile, flake.nix, Makefile, etc.). Returns N/A when no signals exist.
1136
+
1137
+ ### Measured
1138
+ - **PP-02/PP-03 Cursor**: FP rate 15% → <5% after fixes. Score range 14–76 → 20–68 (still differentiated).
1139
+ - **PP-02/PP-03 Codex**: Strict FP 27.8% → <5% after fixes. openai-agents-python 65 → 72.
1140
+ - **Platform Parity Index (PPI)**: 0.125 → 0.375 (Claude + Cursor + Codex validated).
1141
+
1142
+ ## [1.16.0] - 2026-04-12
1143
+
1144
+ ### Added
1145
+ - **MOAT-01 — Harmony-first default onboarding**: When `nerviq audit` runs on a repo with 2+ configured AI platforms and no explicit `--platform`, the CLI now prints a one-line Harmony Score + drift summary *before* the single-platform audit. Cross-platform alignment becomes the first impression, in line with the durable moat positioning.
1146
+ - **`--no-harmony-first` flag**: Suppresses the new Harmony header for users who want strictly single-platform output.
1147
+ - **`harmony` envelope in `audit --json`**: On multi-platform repos, JSON output now includes `{ harmony: { score, driftCount, platforms } }` at the root, alongside the existing per-platform fields.
1148
+
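As a rough illustration of consuming the `harmony` envelope described above, the sketch below reads the three documented fields from parsed `audit --json` output. Only the `harmony.score`, `harmony.driftCount`, and `harmony.platforms` names come from the changelog entry; the helper name, the assumption that `platforms` is an array, the 0-100 score scale in the message, and the sample values are hypothetical.

```javascript
// Hypothetical helper: summarize the root-level `harmony` envelope, if present.
// Assumption: single-platform repos omit the envelope entirely.
function summarizeHarmony(auditJson) {
  const h = auditJson && auditJson.harmony;
  if (!h) return null; // no envelope, nothing to summarize
  const count = Array.isArray(h.platforms) ? h.platforms.length : 0;
  return `Harmony ${h.score}/100, ${h.driftCount} drift(s) across ${count} platform(s)`;
}

// Sample payload with invented values; only the field names follow the changelog.
const sample = { harmony: { score: 82, driftCount: 3, platforms: ['claude', 'cursor'] } };
console.log(summarizeHarmony(sample));
```

Returning `null` when the envelope is absent lets a caller treat single-platform output without special-casing missing fields.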
1149
+ ### Changed
1150
+ - **FB-05 — framework-aware fix rewriting**: On repos where no Node/JS stack is detected (Python, Go, Rust, Ruby, Java/Kotlin, Elixir, .NET), failure-message recommendations no longer hard-code `npm test` / `npm ci` / `npm install`. The audit post-processes `fix` text and substitutes the stack-appropriate equivalent (e.g. `pytest`, `go test ./...`, `cargo test`, `bundle exec rspec`, `./gradlew test`, `mix test`, `dotnet test`). No change on Node repos.
1151
+ - **Release-sync surfaces now reflect the `315`-test verification baseline** (was 307 in v1.15.0). `test/harmony-first.test.js` (5 cases) covers MOAT-01; `test/framework-aware-fixes.test.js` (3 cases) covers FB-05.
1152
+
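The FB-05 substitution above can be pictured with a minimal sketch. It is not the audit's actual post-processor: the stack keys, the function name, and the pairing of stacks to commands (inferred from the order of the two lists in the entry) are assumptions, and only the `npm test` case is handled because the entry does not list equivalents for `npm ci` or `npm install`.

```javascript
// Assumed pairing of detected stack to test command, taken from the list order in FB-05.
const TEST_COMMANDS = {
  python: 'pytest',
  go: 'go test ./...',
  rust: 'cargo test',
  ruby: 'bundle exec rspec',
  jvm: './gradlew test',
  elixir: 'mix test',
  dotnet: 'dotnet test',
};

// Hypothetical rewrite step: swap `npm test` for the stack-appropriate command.
// Node repos (or unknown stacks) are returned unchanged.
function rewriteFix(fixText, stack) {
  const replacement = TEST_COMMANDS[stack];
  if (!replacement) return fixText;
  // A function replacer avoids any special handling of `$` in the replacement text.
  return fixText.replace(/\bnpm (?:run )?test\b/g, () => replacement);
}

console.log(rewriteFix('Add `npm test` to CLAUDE.md', 'python'));
console.log(rewriteFix('Add `npm test` to CLAUDE.md', 'node'));
```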
1153
+ ## [1.15.0] - 2026-04-11
1154
+
1155
+ ### Added
1156
+ - **`--dir` flag**: Audit any directory without changing cwd (`nerviq audit --dir /path/to/repo`).
1157
+ - **Opt-in telemetry foundation**: Anonymous local usage tracking for audit, harmony-audit, and setup commands. Activated only when `NERVIQ_TELEMETRY=1` is set. No data leaves the machine.
1158
+
1159
+ ### Fixed
1160
+ - **`--dir` flag was silently ignored**: The flag was parsed but not recognized as a value flag, causing `nerviq audit --dir /path` to always audit the current directory instead of the target. Critical fix for CI and scripted usage.
1161
+ - **CLAUDE.md reference following**: When CLAUDE.md is short and contains a file reference (e.g., `AGENTS.md`), the referenced file is now read and included in content checks. Fixes false negatives on projects like home-assistant/core.
1162
+ - **Build/test/lint checks use repo scope**: Quality checks now read all instruction surfaces (AGENTS.md, .cursorrules, copilot-instructions.md) instead of only CLAUDE.md.
1163
+ - **testCoverage regex expanded**: Now matches "## Testing", "writing tests", "run tests", and "test command" patterns.
1164
+ - **CHANGELOG check accepts variants**: Now recognizes CHANGES.md, HISTORY.md, NEWS.md in addition to CHANGELOG.md.
1165
+
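The `--dir` bug above (a flag that is parsed but not treated as taking a value) is easy to reproduce in any hand-rolled argument parser. The following is a minimal sketch of the general technique, not the CLI's real parser; `VALUE_FLAGS` and `parseArgs` are hypothetical names.

```javascript
// Flags that consume the next token as their value. If `--dir` were missing
// from this set, it would be stored as a bare boolean and its path would be
// misread as a positional argument.
const VALUE_FLAGS = new Set(['--dir']);

function parseArgs(argv) {
  const options = {};
  const positionals = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (VALUE_FLAGS.has(arg)) {
      options[arg.slice(2)] = argv[++i]; // consume the following token
    } else if (arg.startsWith('--')) {
      options[arg.slice(2)] = true; // plain boolean switch
    } else {
      positionals.push(arg);
    }
  }
  return { options, positionals };
}

console.log(parseArgs(['audit', '--dir', '/path/to/repo', '--json']));
```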
1166
+ ### Measured
1167
+ - **External repo audit (EXP-11)**: 10 popular repos (213K combined stars). Score range: 15–59. FP rate: ~2–4%.
1168
+
1169
+ ## [1.14.0] - 2026-04-11
1170
+
1171
+ ### Added
1172
+ - **Harmony Score standalone command**: `nerviq harmony-score` outputs 0-100 cross-platform alignment score with `--badge` (shields.io markdown), `--threshold N` (CI gate with exit code 1 on failure), `--quiet` (score number only for piping), and `--json` (full platform breakdown).
1173
+ - **Harmony Demo**: `nerviq harmony-demo` creates a temporary multi-platform project (Claude + Cursor + Copilot) with intentional drift and runs a live harmony audit — zero setup required.
1174
+ - **Cross-platform CI matrix**: CI now runs on 3 OS (Ubuntu, Windows, macOS) x 3 Node versions (18, 20, 22) for 9 total verification combinations.
1175
+
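For readers wiring `--threshold N` into CI, the gate logic presumably reduces to a comparison like the one sketched here. The function name is hypothetical, and treating "failure" as a score strictly below the threshold is an assumption; the changelog only states that failure yields exit code 1.

```javascript
// Returns the exit code a CI gate would use: 0 on pass, 1 on failure.
// Assumption: a score equal to the threshold passes.
function thresholdExitCode(score, threshold) {
  if (!Number.isFinite(score) || !Number.isFinite(threshold)) {
    throw new TypeError('score and threshold must be finite numbers');
  }
  return score < threshold ? 1 : 0;
}

console.log(thresholdExitCode(79, 80));
console.log(thresholdExitCode(80, 80));
```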
1176
+ ## [1.13.0] - 2026-04-10
1177
+
1178
+ ### Added
1179
+ - **Self-audit compliance**: CLAUDE.md now includes XML constraint blocks, mermaid architecture diagram, project description, lint command reference, and trust boundary — self-audit score 73→84.
1180
+ - **Hardened platform freshness**: all 8 platforms now have version-specific freshness coverage in the check engine.
1181
+ - **Cross-surface contract regression**: a new regression pack validates that public integration contracts, API docs, and MCP transport docs stay consistent across releases.
1182
+
1183
+ ### Changed
1184
+ - **Flagship CLAUDE.md refactored**: instruction surface is now concise, modular, and follows the patterns Nerviq recommends to users.
1185
+ - **Audit and setup modules split**: `audit.js` split into recommendation + instruction modules; `setup.js` split into analysis + runtime modules — cleaner boundaries, same public API.
1186
+ - **HTTP API docs separated from MCP transport**: each integration surface now has its own documentation entry point.
1187
+
1188
+ ### Fixed
1189
+ - **CI token gating**: research metadata validation is now gated on repo token, preventing false failures in forks and public CI.
1190
+ - **Live site metadata guard**: relaxed rendered-HTML guard to support Vercel's dynamic page output without spurious drift warnings.
1191
+
1192
+ ## [1.12.0] - 2026-04-09
1193
+
1194
+ ### Added
1195
+ - **Adaptive governance guidance**: `augment` / `suggest-only` now classify repo archetypes, recommend operating profiles, and emit adopt / defer / ignore decisions with explicit explainability fields.
1196
+ - **Continuous operating mode**: Nerviq now supports managed baselines, diff-aware drift mode for CI / PR / watch flows, named upgrade campaigns, lifecycle snapshot milestones, and expiry-backed exception workflows.
1197
+ - **Behavioral drift outcome layer**: `deep-review --behavioral` now provides an opt-in local report for structural drift, intent-vs-outcome mismatches, and behavioral snapshots over time.
1198
+ - **Org and integration standard surfaces**: added org policy inheritance, fleet score semantics, public integration contracts, first-tier integration gate docs, category definition kit, and a public benchmark corpus.
1199
+
1200
+ ### Changed
1201
+ - **Proof quality is deeper and more specific**: high-volume source URLs now point to more relevant official documentation pages instead of generic roots.
1202
+ - **Claude techniques are now modularized internally**: the legacy `src/techniques.js` monolith was split into 12 fragments plus shared helpers, while keeping the public export contract unchanged.
1203
+
1204
+ ### Fixed
1205
+ - **GitHub Actions contract stability**: org-scan JSON output now flushes safely in CI, modern action runtimes are aligned, and workflow stability remains green on Node 18 and Node 20.
1206
+ - **Public surfaces stay synchronized with shipped verification**: release-facing docs and site examples now reflect the current `307`-test verification baseline and `1.12.0` API/version examples.
1207
+
1208
+ ## [1.11.0] - 2026-04-09
1209
+
1210
+ ### Changed
1211
+ - **Instruction budget warnings now speak in tokens**: large instruction-file warnings use approximate token counts instead of raw byte thresholds, making context-window guidance more aligned with real model pressure.
1212
+ - **Deny-rule evaluation now normalizes paths consistently**: symlink aliases collapse into one effective deny rule, repo-escape traversal patterns no longer inflate posture, and explicit absolute-path deny rules remain visible as intentional coverage.
1213
+
1214
+ ### Fixed
1215
+ - **Claude deny-rule parity across audit surfaces**: audit techniques, anti-pattern detection, and suggest-only analysis now share the same deny-rule normalization contract instead of evaluating path patterns differently.
1216
+ - **GitHub automation contract stability**: workspace audit JSON is now CI-safe and Aider freshness output matches the shared `fresh` / `stale` workflow contract.
1217
+ - **Jest suite alignment with current contracts**: server envelope responses and bootstrap copy are now validated against the live `{ data, meta }` API surface and current history/suggest-rules messaging.
1218
+
1219
+ ## [1.10.0] - 2026-04-09
1220
+
1221
+ ### Changed
1222
+ - **Product boundary clarified across product surfaces**: CLI, docs, and site now consistently position Nerviq as AI agent governance / configuration intelligence rather than a full SAST replacement.
1223
+ - **Score semantics aligned end to end**: live audit, snapshot, benchmark, dashboard, workspace, and harmony scores are now labeled distinctly so one repo cannot appear contradictory without explanation.
1224
+ - **Monorepo workspace semantics clarified**: `audit --workspace` now separates root governance health from workspace aggregate/package coverage and explains the relationship directly in CLI output.
1225
+
1226
+ ### Fixed
1227
+ - **Audit vs anti-pattern parity**: shared instruction-surface detection now keeps verification guidance and anti-pattern reporting in sync across `.claude/commands`, `AGENTS.md`, and related instruction docs.
1228
+ - **Cold-start lifecycle guidance**: `history`, `compare`, `trend`, and `suggest-rules` now bootstrap users with actionable next steps instead of near-empty no-data output.
1229
+ - **Framework-aware verification detection**: Flutter, Swift/Xcode, Python, Go, and .NET verification command variants now count correctly, reducing false negatives on mature repos.
1230
+
1231
+ ### Docs
1232
+ - **Proof and first-run surfaces matured**: published beta case studies, public before/after proof repo, Harmony-first homepage, simplified six-step getting-started flow, clearer Harmony-vs-Synergy maturity messaging, and reduced concept-load across first-touch docs.
1233
+
1234
+ ## [1.9.0] - 2026-04-07
1235
+
1236
+ ### Added
1237
+ - **Dockerfile best practices checks** (#8): multi-stage build detection, .dockerignore validation (node_modules + .env), no secrets in build args
1238
+ - **Terraform check category** (#10): terraform fmt in CI/pre-commit, .terraform in .gitignore, state file not committed, remote backend configured
1239
+ - **i18n / Spanish language support** (#12): new `src/i18n.js` module, `--lang` CLI flag, Spanish locale (`es.json`). Usage: `nerviq audit --lang es`
1240
+
1241
+ ### Fixed
1242
+ - **P0 freshness URLs** (#14-#20): fixed 41 broken documentation URLs across all 7 platforms
1243
+ - Claude Code: `docs.anthropic.com` → `code.claude.com/docs`
1244
+ - Cursor: `docs.cursor.com` → `cursor.com/docs`, background-agent → cloud-agent
1245
+ - Copilot: restructured to `how-tos/`, `concepts/`, `responsible-use/`
1246
+ - Gemini: `ai.google.dev` → `google-gemini.github.io/gemini-cli/`
1247
+ - Windsurf: rules merged into memories, MCP moved to `plugins/cascade/mcp`
1248
+ - OpenCode: added `/docs/` prefix to config/plugins/permissions paths
1249
+ - Codex: `docs.codex.ai` → `developers.openai.com/codex`
1250
+ - All 53 P0 sources now have `verifiedAt: 2026-04-07`
1251
+ - Check count: 2,431 → 2,438 (7 new checks)
1252
+
1253
+ ## [1.8.9] - 2026-04-06
1254
+
1255
+ ### Fixed (Expert Round FAANG-level review)
1256
+ - **Setup preserves custom deny rules**: merge via union+deduplicate instead of overwrite — existing deny rules never lost
1257
+ - **Setup creates rollback artifacts**: setup operations now have rollback support like fix/apply
1258
+ - **protect-secrets covers Bash tool**: hook matcher expanded to `Read|Write|Edit|Bash`, checks `tool_input.command` for `cat .env`, `grep .env`, `base64 .env` etc.
1259
+ - **audit --out writes file**: `--out` flag now works for the audit command (was silently ignored)
1260
+ - **scan filters irrelevant categories**: stack-specific categories (flutter, ruby, etc.) hidden when 0 checks pass and stack not detected
1261
+ - **profile load supports built-in profiles**: `profile load read-only` now works by falling back to governance profiles
1262
+ - **Certification requires security gates**: Bronze needs gitIgnoreEnv+secretsProtection passing, Silver adds no critical anti-patterns, Gold needs harmony>=80
1263
+ - **SDK input validation**: all functions throw on null/invalid dir, unknown platform, empty description
1264
+ - **SDK TypeScript definitions**: added `passing`, `total`, `average` to type interfaces
1265
+ - **REST API consistent envelope**: all endpoints return `{ data, meta: { version, timestamp } }` format
1266
+ - **REST API CORS headers**: `Access-Control-Allow-Origin: *` for browser dashboard support
1267
+ - **benchmark organic score prominent**: organic improvement shown first as primary metric
1268
+ - **synergy-report implemented**: replaced "coming soon" with working multi-platform synergy dashboard
1269
+
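The "union+deduplicate" merge mentioned for custom deny rules can be sketched in a few lines. This is an illustration of the technique only: the function name is hypothetical, and the rule strings are sample values rather than rules the setup command is known to write.

```javascript
// Merge deny rules without losing anything the user already had.
// A Set keeps first-seen order, so existing rules stay ahead of new ones.
function mergeDenyRules(existing = [], incoming = []) {
  return [...new Set([...existing, ...incoming])];
}

const existingRules = ['Read(./.env)', 'Read(./secrets/**)']; // sample user rules
const setupRules = ['Read(./.env)', 'Read(./.env.*)'];        // sample generated rules
console.log(mergeDenyRules(existingRules, setupRules));
```

Because the merge only ever adds entries, running it repeatedly is idempotent, which matters for a setup command that may be re-run.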
1270
+ ## [1.8.8] - 2026-04-06
1271
+
1272
+ ### Fixed
1273
+ - **Setup hooks registration**: hooks are now always registered in settings.json (merge, not overwrite) — previously hooks files were created but never connected
1274
+ - **Platform-specific setup**: `setup --platform windsurf/aider/cursor` now routes to platform-specific setup functions instead of only creating Claude files
1275
+ - **Rollback artifacts**: rollback now correctly records created/patched files (written after fixes, not before)
1276
+ - **fix --dry-run**: properly separated from --auto; shows what would be fixed without writing files
1277
+ - **fix removes allow:["*"]**: secretsProtection fixer now removes overly broad allow rules when adding deny rules
1278
+ - **--profile flag**: now loads and applies governance profiles (read-only, suggest-only, safe-write, power-user) to audit
1279
+ - **profile load**: now applies deny rules and threshold to settings.json instead of just displaying
1280
+ - **SDK passing/total**: added `passing`, `total`, and `average` aliases to SDK audit/harmony results
1281
+ - **Swift detection**: Swift projects (Package.swift, .xcodeproj) now detected in subdirectories
1282
+ - **Python repository rules**: repository.md now references pyproject.toml instead of package.json for Python projects
1283
+ - **convert filename doubling**: strips all known extensions (.md, .mdc, .txt) preventing CLAUDE.md.md
1284
+ - **convert frontmatter leak**: MDC frontmatter stripped for all non-cursor targets (copilot, claude, codex, etc.)
1285
+ - **scan vs org scan**: `scan` now shows detailed per-repo breakdown; `org scan` shows aggregated summary
1286
+ - **migrate --platform cursor**: added migrate to FULL_COMMAND_SET so platform dispatch works correctly
1287
+ - **Hooks fail-closed**: protect-secrets hook now blocks on error instead of allowing (fail-closed, not fail-open)
1288
+ - **Settings merge**: setup now merges all fields (hooks, permissions, mcpServers, nerviqSetup) into existing settings.json
1289
+
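The filename-doubling fix above (stripping `.md`, `.mdc`, and `.txt` before appending the target extension) might look roughly like this. The helper names are hypothetical and the loop is one possible way to strip stacked extensions; the changelog does not describe the actual implementation.

```javascript
// Extensions named in the changelog entry.
const KNOWN_EXTENSIONS = ['.md', '.mdc', '.txt'];

// Repeatedly strip known extensions so stacked ones are removed too.
function stripKnownExtensions(name) {
  let base = name;
  let stripped = true;
  while (stripped) {
    stripped = false;
    for (const ext of KNOWN_EXTENSIONS) {
      // Keep at least one character so a file named exactly ".md" is left alone.
      if (base.length > ext.length && base.toLowerCase().endsWith(ext)) {
        base = base.slice(0, -ext.length);
        stripped = true;
      }
    }
  }
  return base;
}

// Build the output name from a clean base, avoiding results like CLAUDE.md.md.
function targetFileName(sourceName, targetExt) {
  return stripKnownExtensions(sourceName) + targetExt;
}

console.log(targetFileName('CLAUDE.md', '.md'));
console.log(targetFileName('rules.mdc', '.md'));
```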
1290
+ ## [1.8.7] - 2026-04-06
1291
+
1292
+ ### Changed
1293
+ - **Complete CLAUDEX → NERVIQ rebrand**: all internal references, env vars (`NERVIQ_NO_INSIGHTS`), JSON keys (`_nerviq_managed`), and property names updated
1294
+ - **Restored audit-repo skill template**: Claude-native skill for running `npx @nerviq/cli --json` from within Claude Code
1295
+ - **Updated .gitignore**: fixed legacy `claudex-setup` reference
1296
+
1297
+ ## [1.8.6] - 2026-04-06
1298
+
1299
+ ### Changed
1300
+ - **Confidence calibration**: 5-tier system (0.3/0.6/0.7/0.8/0.9) based on actual evidence quality — stack checks=0.6, default=0.7, with-template=0.8, runtime-verified=0.9
1301
+ - **SDK dogfooding**: CLI now imports `audit`, `detectPlatforms`, `getCatalog` from public SDK API instead of internal modules
1302
+ - Updated test count badge: 293 tests
1303
+
1304
+ ## [1.8.5] - 2026-04-06
1305
+
1306
+ ### Changed — Honesty & Maturity Overhaul (Stream 23)
1307
+ - **Check count messaging**: All surfaces now show "2,431 checks (8 platforms × ~300 governance rules)" instead of inflated raw number
1308
+ - **Synergy [EXPERIMENTAL]**: Synergy dashboard, CLI output, and site docs now carry experimental label with disclaimer about static routing rules
1309
+ - **Feature maturity labels**: Introduced GA/Beta/Experimental system — Harmony=GA, Plugins=GA, SDK=Beta, Synergy=Experimental
1310
+ - **"evidence-based" accurate**: Changed to "rule-based audit engine with evidence tracking" in methodology docs
1311
+ - **Positioning**: Added "Best for teams going from 0→governed" and "Not designed for deeply customized setups" to README and site
1312
+ - **sourceUrl audit**: Verified 100% coverage (2,306/2,306 checks), identified 78 unique URLs for future specificity improvement
1313
+
1314
+ ### Fixed
1315
+ - Fixed 15 failing tests with stale check counts (2,306→2,431, domain packs 40→62)
1316
+ - Jest version verified: ^30.3.0 valid (30.2.0 installed)
1317
+
1318
+ ### Added
1319
+ - 14 new Harmony integration tests (full pipeline, drift scenarios, add platform, state persistence, governance, advisor)
1320
+ - Total test count: 293 passing across 28 suites
1321
+ - MaturityBadge component on nerviq.net docs pages
1322
+
1323
+ ## [1.7.1] - 2026-04-07
1324
+
1325
+ ### Changed
1326
+ - README synced: added 8 missing commands (rollback, check-health, anti-patterns, freshness, rules-export, org scan), 4 missing options (--full, --config-only, --only, --workspace), fixed NERVIQ→NERVIQ branding
1327
+
1328
+ ## [1.7.0] - 2026-04-07
1329
+
1330
+ ### Added: Final P2 batch
1331
+ - **UAT-11: `nerviq rollback`**: Undo the most recent apply by deleting all created files. Supports `--list` (show rollback points), `--dry-run` (preview), and auto-cleanup of rollback artifacts after use.
1332
+ - **UAT-18**: `apply --only hooks,commands` already worked (verified)
1333
+ - **UAT-19**: Benchmark messaging improved for post-setup runs
1334
+
1335
+ ## [1.6.5] - 2026-04-07
1336
+
1337
+ ### Added — More P2 UX from UAT
1338
+ - **UAT-14**: Governance shows top 5 domain/MCP packs by default, `--verbose` for all
1339
+ - **UAT-20**: Frontend.md rule no longer generated for backend-only projects (Express, NestJS)
1340
+ - **UAT-23**: `rules-export` shows human-readable summary by default, `--json` for full output
1341
+ - **UAT-24**: `history --prune N` to clean old snapshots (keeps last N)
1342
+ - **UAT-21**: Harmony task routing already dynamic (via UAT-04 phantom platform fix)
1343
+
1344
+ ## [1.6.4] - 2026-04-07
1345
+
1346
+ ### Added — P2 UX improvements from UAT
1347
+ - **UAT-12**: Setup now lists every file created (`+ CLAUDE.md`, `+ .claude/settings.json`, ...)
1348
+ - **UAT-13**: Lite mode shows pass/fail count: `Score: 78/100 (62/86 checks passing)`
1349
+ - **UAT-15**: Audit header shows detected config files: `Found: CLAUDE.md, AGENTS.md, .cursorrules`
1350
+ - **UAT-17**: Suggested next command includes `--platform` for non-Claude platforms
1351
+ - **UAT-22**: History shows HH:MM timestamps when multiple snapshots share same date
1352
+
1353
+ ## [1.6.3] - 2026-04-07
1354
+
1355
+ ### Fixed — P1 from UAT
1356
+ - **UAT-04**: Harmony only audits platforms with detected config files (was always 8/8)
1357
+ - **UAT-05**: `apply --rollback` now shows clear error instead of silently re-applying
1358
+ - **UAT-06**: Harmony drift now auto-recorded — compares scores to previous audit, records deltas ≥5 points
1359
+ - **UAT-07**: Migrate error message includes usage example
1360
+ - **UAT-08**: Doctor aider freshness gate no longer crashes (null safety)
1361
+ - **UAT-09**: `nerviq fix` now auto-fixes `gitIgnoreEnv` (.env to .gitignore) and `secretsProtection` (deny rules in settings.json) — the two most common critical findings
1362
+ - **UAT-10**: Rails/Laravel/.NET false positives in `fix` output eliminated (was caused by same null-inclusion bug as UAT-02)
1363
+
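UAT-06 describes comparing each platform's score to the previous audit and recording deltas of 5 points or more. A minimal sketch of that comparison follows; the function name and the score-map shape are hypothetical, and using the absolute delta (so both drops and gains are recorded) is an assumption.

```javascript
const DRIFT_THRESHOLD = 5; // from the changelog: deltas of 5 points or more

// previous/current: { [platform]: score }. Platforms missing from the
// previous audit are skipped because there is nothing to compare against.
function detectDrift(previous, current) {
  const drift = [];
  for (const [platform, score] of Object.entries(current)) {
    if (!(platform in previous)) continue;
    const delta = score - previous[platform];
    if (Math.abs(delta) >= DRIFT_THRESHOLD) drift.push({ platform, delta });
  }
  return drift;
}

// Invented sample scores.
const lastAudit = { claude: 80, cursor: 70, codex: 60 };
const thisAudit = { claude: 74, cursor: 72, codex: 65 };
console.log(detectDrift(lastAudit, thisAudit));
```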
1364
+ ## [1.6.2] - 2026-04-07
1365
+
1366
+ ### Fixed: P0 from UAT (ship-stoppers)
1367
+ - **UAT-01 BLOCKER**: `npx @nerviq/cli audit` now works — added `@nerviq/cli` bin alias
1368
+ - **UAT-02**: `nerviq fix` was showing 375 failed checks (including skipped) vs audit's 77. Fixed: now filters `r.passed === false` only, matching audit count exactly
1369
+ - **UAT-03**: Confidence label `[MEDIUM]` was shown on critical items (confusing). Changed threshold: 0.7 confidence now shows `[HIGH]` instead of `[MEDIUM]`
1370
+
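The UAT-02 count mismatch is the classic difference between a falsy test and a strict comparison. The sketch below assumes skipped checks carry `passed: null`, which is suggested by the "null-inclusion bug" wording in UAT-10 but not spelled out; the third check key is invented for illustration.

```javascript
// Sample results: one failure, one pass, one skipped (assumed to be null).
const results = [
  { key: 'gitIgnoreEnv', passed: false },
  { key: 'secretsProtection', passed: true },
  { key: 'exampleSkippedCheck', passed: null }, // hypothetical key
];

// Falsy filter: counts skipped checks as failures (the inflated count).
const looseFailures = results.filter((r) => !r.passed);

// Strict filter, as named in the changelog: only real failures.
const strictFailures = results.filter((r) => r.passed === false);

console.log(looseFailures.length, strictFailures.length);
```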
1371
+ ## [1.6.1] - 2026-04-07
1372
+
1373
+ ### Added
1374
+ - **F3-01: `nerviq check-health`**: Detects regressions between audit snapshots. Compares per-check pass/fail state and flags checks that went from passing to failing. When 3+ checks in the same category regress, alerts as "potential platform format change."
1375
+ - **F3-03: Regression tests** — 3 new tests for check-health: no-snapshots, stable state, and regression detection
1376
+ - Supports `--json` for CI integration
1377
+
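The regression rule for `check-health` (passing to failing, with an alert when 3 or more checks in one category regress) can be sketched as below. The snapshot shape, function name, and sample data are assumptions for illustration; the actual snapshot format is not described in the changelog.

```javascript
// Assumed snapshot shape: { [checkKey]: { passed: boolean, category: string } }
function findRegressions(before, after) {
  // A regression is a check that passed in the older snapshot and fails now.
  const regressed = Object.keys(after).filter(
    (key) => before[key] && before[key].passed === true && after[key].passed === false
  );

  // Count regressions per category to spot a possible platform format change.
  const byCategory = {};
  for (const key of regressed) {
    const category = after[key].category;
    byCategory[category] = (byCategory[category] || 0) + 1;
  }
  const alerts = Object.keys(byCategory).filter((c) => byCategory[c] >= 3);
  return { regressed, alerts };
}

// Invented sample snapshots.
const older = {
  a: { passed: true, category: 'hooks' },
  b: { passed: true, category: 'hooks' },
  c: { passed: true, category: 'hooks' },
  d: { passed: true, category: 'docs' },
  e: { passed: false, category: 'docs' },
};
const newer = {
  a: { passed: false, category: 'hooks' },
  b: { passed: false, category: 'hooks' },
  c: { passed: false, category: 'hooks' },
  d: { passed: false, category: 'docs' },
  e: { passed: false, category: 'docs' },
};
console.log(findRegressions(older, newer));
```

Check `e` was already failing in the older snapshot, so it is not reported as a regression.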
1378
+ ## [1.6.0] - 2026-04-07
1379
+
1380
+ ### Changed: ACCURACY OVERHAUL
1381
+ - **Stack detection accuracy**: Checks for Python, Go, Rust, Java, Ruby, PHP, .NET, Flutter, Swift, Kotlin now skip when the stack is only present in `examples/`, `docs/`, `test/`, `vendor/` directories — not at project root. Previously these fired false positives on monorepos and repos with example code.
1382
+ - **Generic quality checks scoped**: 132 checks (observability, caching, i18n, rate-limiting, etc.) are now skipped by default — they measure general software quality, not AI agent configuration. Use `--verbose` to include them.
1383
+ - **Urgency count fix**: Skipped (not-applicable) checks were incorrectly counted as critical/high in the lite output summary. Now only actual failures are counted.
1384
+
1385
+ ### Impact
1386
+ - supabase/supabase: Failed 120 → 55 (65 false positives eliminated)
1387
+ - Nerviq's own repo: Fake "🔴 3 critical" → accurate "🔵 19 recommended"
1388
+ - All failed checks are now relevant to AI agent configuration
1389
+
1390
+ ## [1.5.3] - 2026-04-07
1391
+
1392
+ ### Added
1393
+ - **T4-01:** Confidence labels (`[HIGH]` / `[MEDIUM]` / `[HEURISTIC]`) on every failed check in full audit
1394
+ - **T4-02:** Safety modes documented in README: read-only, suggest-only, dry-run, config-only, safe-write, power-user
1395
+ - **T4-02:** `--config-only` flag added — restricts writes to config files only
1396
+ - **B4:** Suggest-only markdown export verified working (`nerviq suggest-only --out report.md`)
1397
+
1398
+ ### Fixed
1399
+ - Report header rebranded from "Nerviq" to "Nerviq" in markdown export
1400
+
1401
+ ## [1.5.2] - 2026-04-07
1402
+
1403
+ ### Added
1404
+ - **F1-01: Lite-by-default**: `nerviq audit` now shows a quick scan (score + top 3 actions). Use `--full` for complete output.
1405
+ - **F1-02: Urgency tiers** — Lite output shows `🔴 critical / 🟡 high / 🔵 recommended` summary and per-item tier icons
1406
+ - **F2-01: `nerviq fix` command** — Auto-fix checks with templates, show manual guidance for others, display score impact
1407
+ - `nerviq fix` — List fixable and manual-fix checks
1408
+ - `nerviq fix <key>`: Fix a specific check with before/after score
1409
+ - `nerviq fix --all-critical`: Fix all critical issues at once
1410
+ - `nerviq fix --dry-run` — Preview without writing
1411
+
1412
+ ### Changed
1413
+ - Default `nerviq audit` is now lite mode (previously showed full output)
1414
+ - `--full` flag added to restore previous full-output behavior
1415
+ - `--verbose` still shows full output plus medium-priority recommendations
1416
+ - Lite output streamlined: single fix line per item instead of redundant Why/Fix
1417
+
1418
+ ## [1.5.1] - 2026-04-06
1419
+
1420
+ ### Added
1421
+ - "Get Started by Role" section in README (solo dev / team lead / enterprise paths)
1422
+ - "What Nerviq Is — and Isn't" section in README (honest limitations, confidence levels)
1423
+ - CHANGELOG entries for v1.2.5 through v1.5.0 (previously undocumented)
1424
+
1425
+ ### Changed
1426
+ - Check counts synced across all surfaces (README, package.json, badge): 2,431 total
1427
+ - Removed stale "v1.0" reference from README
1428
+ - Tagline sharpened: "Standardize and govern your AI coding agent setup"
1429
+ - Platform check counts updated to match actual catalog
1430
+ - Removed self-certification badge
1431
+
1432
+ ## [1.5.0] - 2026-04-05
1433
+
1434
+ ### Added
1435
+ - Stream 8 Self-Dependent Execution — intelligence hardening
1436
+ - New CLI commands: `nerviq rules-export`, `nerviq anti-patterns`, `nerviq freshness`
1437
+ - A2: Recommendation rules export to JSON
1438
+ - A3: Shared contract schemas (technique + pack)
1439
+ - A6: 22 anti-pattern definitions with detection
1440
+ - A7: Last-verified date tracking for 123 checks
1441
+ - B5: External benchmark path (`nerviq benchmark --external /path`)
1442
+ - B8: Governance hook risk level classification (high/medium/low)
1443
+
1444
+ ### Changed
1445
+ - B3: Augment now preserves and displays top 10 strengths
1446
+
1447
+ ## [1.4.1] - 2026-04-05
1448
+
1449
+ ### Fixed
1450
+ - npm README display alignment
1451
+
1452
+ ## [1.4.0] - 2026-04-05
1453
+
1454
+ ### Added
1455
+ - Stream 13: 84 new coverage checks across 15 directions
1456
+ - MC-A (HIGH): Observability, Accessibility, GDPR, Error Tracking, Supply Chain — 31 checks
1457
+ - MC-B (MED): i18n, API Versioning, Caching, Rate Limiting, Feature Flags, Docs, Monorepo, Performance — 43 checks
1458
+ - MC-C (LOW): WebSocket/Real-time, GraphQL — 10 checks
1459
+ - Total reached 2,039 checks across 96 categories
1460
+
1461
+ ## [1.3.2] - 2026-04-05
1462
+
1463
+ ### Changed
1464
+ - README fully updated: badge, platform table, category table, stack languages table
1465
+ - package.json description synced to 1,955 checks
1466
+ - Added `harmony-add` command to docs
1467
+
1468
+ ## [1.3.1] - 2026-04-05
1469
+
1470
+ ### Added
1471
+ - Stream 5D: 35 mobile stack checks (Flutter 15, Swift 10, Kotlin 10)
1472
+ - Stream 4 Batch 2: 22 new domain packs (healthcare to energy)
1473
+ - Stream 5 complete: 172 stack checks across 10 languages
1474
+
1475
+ ## [1.3.0] - 2026-04-05
1476
+
1477
+ ### Added
1478
+ - Stream 5: Stack-specific checks for 7 languages (137 new checks)
1479
+ - Python (26), Go (21), Rust (21), Java/Spring (21), Ruby (16), PHP (16), .NET (16)
1480
+ - QP-D02: API reference documentation (`docs/api-reference.md`)
1481
+
1482
+ ## [1.2.7] - 2026-04-05
1483
+
1484
+ ### Changed
1485
+ - Version bump for npm publish alignment
1486
+
1487
+ ## [1.2.6] - 2026-04-05
1488
+
1489
+ ### Added
1490
+ - EC1-EC8: All 6 new ECC-inspired checks + 2 advisor task types
1491
+
1492
+ ### Fixed
1493
+ - Flaky `compareLatest` test (timestamp tiebreaker sort)
1494
+
1495
+ ## [1.2.5] - 2026-04-05
1496
+
1497
+ ### Added
1498
+ - 3 ECC-inspired checks: `llms.txt`, MCP budget warning, hook exit code docs
1499
+
1500
+ ### Changed
1501
+ - Complete NERVIQ → NERVIQ rebrand across docs, content, action, landing page
1502
+ - CHANGELOG rewritten to Keep a Changelog format with full version history
1503
+
1504
+ ## [1.2.4] - 2026-04-05
1505
+
1506
+ ### Added
1507
+ - H8: Unified platform capability matrices into a single source of truth
1508
+ - Windsurf, Aider, and OpenCode intelligence added to Harmony module
1509
+ - Codex platform additions synced to metadata
1510
+
1511
+ ### Changed
1512
+ - MG5-MG11: Complete NERVIQ to NERVIQ migration in CLI codebase
1513
+ - Hardcoded `.claude/nerviq-cli/` paths migrated to `.nerviq/` with fallback
1514
+
1515
+ ## [1.2.3] - 2026-04-05
1516
+
1517
+ ### Added
1518
+ - Batch Q1: check-matrix and golden-matrix tests for Windsurf, Aider, OpenCode
1519
+ - Quality Perfection Q1: Gold certification, harmony+synergy proof
1520
+ - SDK/server tests and plugin dogfood validation
1521
+
1522
+ ### Changed
1523
+ - Self-audit score improved from 80 to 90
1524
+ - CI self-audit integrated into pipeline
1525
+
1526
+ ## [1.2.1] - 2026-04-05
1527
+
1528
+ ### Fixed
1529
+ - Skip API/DB/Auth/Monitoring checks on irrelevant projects (false positive reduction)
1530
+ - Self-dogfood: added `.mcp.json` to own project
1531
+ - LICENSE updated to AGPL-3.0 full text
1532
+ - CI test assertions updated for new error messages and .npmignore changes
1533
+
1534
+ ## [1.2.0] - 2026-04-05
1535
+
1536
+ ### Added
1537
+ - Massive expansion: 673 to 2,306 checks (+1,633)
1538
+ - Batch 4: 25 case studies (10 single-platform + 10 harmony/synergy + 5 existing) with INDEX
1539
+ - Batch 3: +104 experiments (228 to 332) and +133 research docs (315 to 448)
1540
+ - 27 cross-platform research documents
1541
+
1542
+ ## [1.1.1] - 2026-04-05
1543
+
1544
+ ### Added
1545
+ - Batch 2: +24 domain packs (16 to 40) and +23 MCP packs (26 to 49) across all 8 platforms
1546
+
1547
+ ## [1.1.0] - 2026-04-05
1548
+
1549
+ ### Added
1550
+ - Batch 1: +383 checks (673 to 1,056) across 8 new categories for all 8 platforms
1551
+
1552
+ ## [1.0.2] - 2026-04-05
1553
+
1554
+ ### Fixed
1555
+ - Scorecard: 15 dimensions improved (privacy, security, monorepo, org, integrations, telemetry, OTel, SLSA, versioning, errors, audit log, deprecation, large files, relevance decay, case studies)
1556
+
1557
+ ### Added
1558
+ - Methodology documentation, FP ranking, SBOM, CI experiments
1559
+ - Improved `.npmignore` and `test:all` script
1560
+
1561
+ ## [1.0.1] - 2026-03-31
1562
+
1563
+ ### Fixed
1564
+ - Mermaid diagram rendering in README
1565
+ - macOS `grep` compatibility issue
1566
+ - Version stamp display
1567
+
1568
+ ## [1.0.0] - 2026-04-05
1569
+
1570
+ ### Changed
1571
+ - **Renamed from nerviq-cli to Nerviq** — "The intelligent nervous system for AI coding agents"
1572
+ - Full rebrand across CLI, docs, and package metadata
1573
+
1574
+ ## [0.9.6] - 2026-04-05
1575
+
1576
+ ### Added
1577
+ - SDK for programmatic access
1578
+ - REST API server with Express
1579
+ - Plugin system for extensibility
1580
+ - SLSA provenance for supply chain security
1581
+ - CONTRIBUTING.md for open-source contributors
1582
+
1583
+ ## [0.9.5] - 2026-04-05
1584
+
1585
+ ### Added
1586
+ - VS Code extension
1587
+ - `catalog` command for browsing checks
1588
+ - Performance baselines and benchmarks
1589
+ - Feedback loop for community contributions
1590
+
1591
+ ### Changed
1592
+ - All 673 checks now include `sourceUrl` and `confidence` metadata
1593
+
1594
+ ## [0.9.4] - 2026-04-05
1595
+
1596
+ ### Added
1597
+ - GitHub Action for CI/CD integration
1598
+ - MCP server for tool integration
1599
+ - `doctor`, `convert`, and `migrate` commands
1600
+ - Freshness pipeline for check staleness detection
1601
+ - 3 case studies with real project data
1602
+ - Harmony, Synergy, and E2E test suites (187 total tests)
1603
+
1604
+ ## [0.9.3] - 2026-04-05
1605
+
1606
+ ### Fixed
1607
+ - Checks updated from experiment findings: Gemini +5, Copilot +5, Cursor +4, Aider +3, Windsurf/OpenCode fixes
1608
+ - Stale checks cleaned and new checks added
1609
+ - CI: added `npm ci` step for dependency install
1610
+
1611
+ ### Changed
1612
+ - README updated with beta notice and coming-soon platform list
1613
+
1614
+ ## [0.9.x] - 2026-04-04
1615
+
1616
+ ### Changed
1617
+ - README updated with nerviq-cli to Nerviq migration notice
1618
+
1619
+ ## [0.5.1] - 2026-03-31
1620
+
1621
+ ### Changed
1622
+ - Deep-review auto-detects Claude Code presence (no API key needed)
1623
+ - Landing page and help text updated
1624
+
1625
+ ## [0.5.0] - 2026-03-31
1626
+
1627
+ ### Added
1628
+ - AI-powered `deep-review` command using Claude API
1629
+ - Intelligent analysis beyond static checks
1630
+
1631
+ ## [0.4.0] - 2026-03-31
1632
+
1633
+ ### Added
1634
+ - 9 quality-deep checks for veteran Claude Code users
1635
+ - Deeper analysis for experienced workflows
1636
+
1637
+ ### Changed
1638
+ - Community feedback addressed: improved honesty, no-overwrite behavior, less dogmatic tone
1639
+
1640
+ ## [0.3.2] - 2026-03-31
1641
+
1642
+ ### Changed
1643
+ - README v2: all commands documented, smart gen showcase, 54 checks table, GitHub Action, privacy section
1644
+
1645
+ ## [0.3.1] - 2026-03-31
1646
+
1647
+ ### Added
1648
+ - Anonymous insights collection
1649
+ - Weakest areas analysis
1650
+ - Community statistics dashboard
1651
+
1652
+ ### Fixed
1653
+ - Insights endpoint corrected to `nerviq.workers.dev`
1654
+
1655
+ ## [0.3.0] - 2026-03-31
1656
+
1657
+ ### Added
1658
+ - Interactive wizard for guided setup
1659
+ - Watch mode for continuous monitoring
1660
+ - Landing page with FAQ, trust signals, badges
1661
+
1662
+ ## [0.2.1] - 2026-03-31
1663
+
1664
+ ### Added
1665
+ - Smart `CLAUDE.md` generator based on project analysis
1666
+ - `badge` command for README status badges
1667
+ - GitHub Action for automated auditing
1668
+ - Quick wins recommendations
1669
+
1670
+ ## [0.2.0] - 2026-03-31
1671
+
1672
+ ### Added
1673
+ - Expanded to 54 checks across 18 technology stacks
1674
+ - Improved CLAUDE.md templates
1675
+
1676
+ ### Fixed
1677
+ - Security: removed hardcoded Dev.to API key from CLAUDE.md
1678
+ - Security: made NERVIQ catalog links private
1679
+
+ ## [0.1.0] - 2026-03-30
+
+ ### Added
+ - Initial release of nerviq-cli (later renamed to Nerviq)
+ - Project audit and optimization for Claude Code workflows
+ - Landing page (GitHub Pages ready)
+ - Launch content and community posts
+
+ [Unreleased]: https://github.com/nerviq/nerviq/compare/v1.30.0...HEAD
+ [1.30.0]: https://github.com/nerviq/nerviq/compare/v1.29.1...v1.30.0
+ [1.29.1]: https://github.com/nerviq/nerviq/compare/v1.29.0...v1.29.1
+ [1.29.0]: https://github.com/nerviq/nerviq/compare/v1.28.0...v1.29.0
+ [1.28.0]: https://github.com/nerviq/nerviq/compare/v1.27.1...v1.28.0
+ [1.27.1]: https://github.com/nerviq/nerviq/compare/v1.27.0...v1.27.1
+ [1.27.0]: https://github.com/nerviq/nerviq/compare/v1.26.0...v1.27.0
+ [1.26.0]: https://github.com/nerviq/nerviq/compare/v1.25.0...v1.26.0
+ [1.25.0]: https://github.com/nerviq/nerviq/compare/v1.24.0...v1.25.0
+ [1.24.0]: https://github.com/nerviq/nerviq/compare/v1.23.0...v1.24.0
+ [1.23.0]: https://github.com/nerviq/nerviq/compare/v1.22.0...v1.23.0
+ [1.22.0]: https://github.com/nerviq/nerviq/compare/v1.21.0...v1.22.0
+ [1.21.0]: https://github.com/nerviq/nerviq/compare/v1.20.1...v1.21.0
+ [1.20.1]: https://github.com/nerviq/nerviq/compare/v1.20.0...v1.20.1
+ [1.20.0]: https://github.com/nerviq/nerviq/compare/v1.19.0...v1.20.0
+ [1.19.0]: https://github.com/nerviq/nerviq/compare/v1.18.0...v1.19.0
+ [1.18.0]: https://github.com/nerviq/nerviq/compare/v1.17.3...v1.18.0
+ [1.17.3]: https://github.com/nerviq/nerviq/compare/v1.17.2...v1.17.3
+ [1.17.2]: https://github.com/nerviq/nerviq/compare/v1.17.1...v1.17.2
+ [1.17.1]: https://github.com/nerviq/nerviq/compare/v1.17.0...v1.17.1
+ [1.17.0]: https://github.com/nerviq/nerviq/compare/v1.16.0...v1.17.0
+ [1.16.0]: https://github.com/nerviq/nerviq/compare/v1.15.0...v1.16.0
+ [1.15.0]: https://github.com/nerviq/nerviq/compare/v1.14.0...v1.15.0
+ [1.14.0]: https://github.com/nerviq/nerviq/compare/v1.13.0...v1.14.0
+ [1.13.0]: https://github.com/nerviq/nerviq/compare/v1.12.0...v1.13.0
+ [1.12.0]: https://github.com/nerviq/nerviq/compare/v1.11.0...v1.12.0
+ [1.11.0]: https://github.com/nerviq/nerviq/compare/v1.10.0...v1.11.0
+ [1.10.0]: https://github.com/nerviq/nerviq/compare/v1.9.0...v1.10.0
+ [1.9.0]: https://github.com/nerviq/nerviq/compare/v1.8.9...v1.9.0
+ [1.8.9]: https://github.com/nerviq/nerviq/compare/v1.8.8...v1.8.9
+ [1.8.8]: https://github.com/nerviq/nerviq/compare/v1.8.7...v1.8.8
+ [1.8.7]: https://github.com/nerviq/nerviq/compare/v1.8.6...v1.8.7
+ [1.8.6]: https://github.com/nerviq/nerviq/compare/v1.8.5...v1.8.6
+ [1.8.5]: https://github.com/nerviq/nerviq/compare/v1.7.1...v1.8.5
+ [1.7.1]: https://github.com/nerviq/nerviq/compare/v1.7.0...v1.7.1
+ [1.7.0]: https://github.com/nerviq/nerviq/compare/v1.6.5...v1.7.0
+ [1.6.5]: https://github.com/nerviq/nerviq/compare/v1.6.4...v1.6.5
+ [1.6.4]: https://github.com/nerviq/nerviq/compare/v1.6.3...v1.6.4
+ [1.6.3]: https://github.com/nerviq/nerviq/compare/v1.6.2...v1.6.3
+ [1.6.2]: https://github.com/nerviq/nerviq/compare/v1.6.1...v1.6.2
+ [1.6.1]: https://github.com/nerviq/nerviq/compare/v1.6.0...v1.6.1
+ [1.6.0]: https://github.com/nerviq/nerviq/compare/v1.5.3...v1.6.0
+ [1.5.3]: https://github.com/nerviq/nerviq/compare/v1.5.2...v1.5.3
+ [1.5.2]: https://github.com/nerviq/nerviq/compare/v1.5.1...v1.5.2
+ [1.5.1]: https://github.com/nerviq/nerviq/compare/v1.5.0...v1.5.1
+ [1.5.0]: https://github.com/nerviq/nerviq/compare/v1.4.1...v1.5.0
+ [1.4.1]: https://github.com/nerviq/nerviq/compare/v1.4.0...v1.4.1
+ [1.4.0]: https://github.com/nerviq/nerviq/compare/v1.3.2...v1.4.0
+ [1.3.2]: https://github.com/nerviq/nerviq/compare/v1.3.1...v1.3.2
+ [1.3.1]: https://github.com/nerviq/nerviq/compare/v1.3.0...v1.3.1
+ [1.3.0]: https://github.com/nerviq/nerviq/compare/v1.2.7...v1.3.0
+ [1.2.7]: https://github.com/nerviq/nerviq/compare/v1.2.6...v1.2.7
+ [1.2.6]: https://github.com/nerviq/nerviq/compare/v1.2.5...v1.2.6
+ [1.2.5]: https://github.com/nerviq/nerviq/compare/v1.2.4...v1.2.5
+ [1.2.4]: https://github.com/nerviq/nerviq/compare/v1.2.3...v1.2.4
+ [1.2.3]: https://github.com/nerviq/nerviq/compare/v1.2.1...v1.2.3
+ [1.2.1]: https://github.com/nerviq/nerviq/compare/v1.2.0...v1.2.1
+ [1.2.0]: https://github.com/nerviq/nerviq/compare/v1.1.1...v1.2.0
+ [1.1.1]: https://github.com/nerviq/nerviq/compare/v1.1.0...v1.1.1
+ [1.1.0]: https://github.com/nerviq/nerviq/compare/v1.0.2...v1.1.0
+ [1.0.2]: https://github.com/nerviq/nerviq/compare/v1.0.1...v1.0.2
+ [1.0.1]: https://github.com/nerviq/nerviq/compare/v1.0.0...v1.0.1
+ [1.0.0]: https://github.com/nerviq/nerviq/compare/v0.9.6...v1.0.0
+ [0.9.6]: https://github.com/nerviq/nerviq/compare/v0.9.5...v0.9.6
+ [0.9.5]: https://github.com/nerviq/nerviq/compare/v0.9.4...v0.9.5
+ [0.9.4]: https://github.com/nerviq/nerviq/compare/v0.9.3...v0.9.4
+ [0.9.3]: https://github.com/nerviq/nerviq/compare/v0.9.x...v0.9.3
+ [0.9.x]: https://github.com/nerviq/nerviq/compare/v0.5.1...v0.9.x
+ [0.5.1]: https://github.com/nerviq/nerviq/compare/v0.5.0...v0.5.1
+ [0.5.0]: https://github.com/nerviq/nerviq/compare/v0.4.0...v0.5.0
+ [0.4.0]: https://github.com/nerviq/nerviq/compare/v0.3.2...v0.4.0
+ [0.3.2]: https://github.com/nerviq/nerviq/compare/v0.3.1...v0.3.2
+ [0.3.1]: https://github.com/nerviq/nerviq/compare/v0.3.0...v0.3.1
+ [0.3.0]: https://github.com/nerviq/nerviq/compare/v0.2.1...v0.3.0
+ [0.2.1]: https://github.com/nerviq/nerviq/compare/v0.2.0...v0.2.1
+ [0.2.0]: https://github.com/nerviq/nerviq/compare/v0.1.0...v0.2.0
+ [0.1.0]: https://github.com/nerviq/nerviq/releases/tag/v0.1.0