falsegreen-js 0.3.0 → 0.4.0

package/CHANGELOG.md CHANGED
@@ -6,6 +6,98 @@ All notable changes to this project are documented here. The format is based on
 
  ## [Unreleased]
 
+ ## [0.4.0] - 2026-06-28
+
+ ### Fixed
+ - C21 no longer false-positives on a `do { expect } while(c)`: a do/while body always runs at least
+ once, so its assertion is unconditional (#60).
+ - C16 crypto match is anchored to a crypto root (`crypto.randomUUID`, `globalThis/window/self.crypto`,
+ or the bare node:crypto import), so a user method named `randomUUID()`/`getRandomValues()` is no
+ longer flagged (#61).
+ - C20 and C21 no longer double-report on a dead-code-only assertion: an assertion already flagged
+ C20 (unreachable) is excluded from the C21 set, so C20 owns the line. C21 still fires when a live
+ conditional assertion remains (#62).
+
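The do/while reasoning behind the #60 fix can be checked in plain JavaScript. This is only an illustrative sketch; `countBodyRuns` is a hypothetical helper and not part of falsegreen-js.

```javascript
// Hypothetical helper: counts how often a do/while body and a while body run
// when the loop condition is false from the start.
function countBodyRuns() {
  const cond = false;
  let doWhileRuns = 0;
  let whileRuns = 0;

  do {
    doWhileRuns += 1; // an expect() placed here always executes once
  } while (cond);

  while (cond) {
    whileRuns += 1; // an expect() placed here may never execute
  }

  return { doWhileRuns, whileRuns };
}

const runs = countBodyRuns();
console.log(runs.doWhileRuns, runs.whileRuns); // prints: 1 0
```

An assertion in the do/while body is therefore on the guaranteed path, while one in the plain `while` body is conditional.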
+ ### Added
+ - `C16` nondeterminism now also flags `new Date()` (zero-arg, reads the system clock),
+ `crypto.randomUUID()`, and `crypto.getRandomValues()`. `new Date(<literal/expr>)` is a fixed
+ instant and stays clean, and the file-wide fake-timer suppression applies. Aliased/destructured
+ clock reads (`const now = Date.now; now()`) stay out: tracking them would trade a rare miss for
+ a frequent false positive on user `now()` helpers (#46).
+ - `C48` (dark patch): a test that flips a known test-mode flag into test mode and then
+ asserts is exercising the product's test-only branch, not real behaviour. Detection-only;
+ v1 covers raw writes (`process.env.NODE_ENV = "test"`, `process.env.TESTING = "1"`,
+ `settings.TESTING = true`). `NODE_ENV` only counts as `"test"`; config values and product
+ feature flags are not flagged; a flag write with no assertion after it is setup, not a dark
+ patch. Parity with falsegreen (Python) `C48`, same id and J1/low (#39).
+
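To make the dark-patch shape concrete, here is a minimal sketch of the pattern the `C48` entry describes. `sendReceipt` and `darkPatchedCheck` are hypothetical names invented for illustration; only the raw `process.env.NODE_ENV = "test"` write comes from the entry above.

```javascript
// Hypothetical product code with a test-only branch.
function sendReceipt(order) {
  if (process.env.NODE_ENV === "test") {
    return { sent: false, reason: "skipped in test mode" }; // test-only branch
  }
  return { sent: true, to: order.email }; // real behaviour
}

// The shape C48 describes: a raw flag write followed by an assertion.
function darkPatchedCheck() {
  const previous = process.env.NODE_ENV;
  process.env.NODE_ENV = "test"; // flips the product into test mode
  try {
    // This "assertion" only ever observes the test-only branch.
    return sendReceipt({ email: "a@example.com" }).sent === false;
  } finally {
    // Restore the flag so the sketch leaves no side effects behind.
    if (previous === undefined) delete process.env.NODE_ENV;
    else process.env.NODE_ENV = previous;
  }
}
```

The check passes, yet the branch that actually sends a receipt is never exercised, which is why a green result here says little about real behaviour.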
+ ### Tests
+ - Lock the floating `expect(p).resolves`/`.rejects.<matcher>()` case for `JS5`: a non-awaited
+ promise matcher is already flagged through the oracle registry (the matcher builds a promise
+ that never settles before the test ends), and tests now pin that, including an exact-count
+ guard so a future change cannot double-report it. Awaited/returned forms stay clean (#43).
+ - Characterization tests for the cfg reachability edge cases: for-in (C20 after / C21 inside),
+ labeled `break outer`, a switch case that falls through without escaping, an IIFE holding the
+ only assertion (no phantom C21), and `performance.now()` C16 (#63).
+
+ ### Changed
+ - `C20` and `C21` now use a structured intra-test reachability walk (`src/cfg.ts`) instead of
+ a top-level-only scan. `C20` (dead code) catches an assertion after any non-falling-through
+ statement: a `return`/`throw`, `process.exit()`, a `break`/`continue`, an `if` whose both
+ arms terminate, a terminating block, or an exhaustive `switch` (every case plus a `default`
+ escapes). `C21` (no unconditional assertion) fires only when no assertion is on the guaranteed
+ spine; an assertion in an `if(true)` branch, a `finally`, or a `try` block now counts as
+ guaranteed, and an assertion only in a `catch` or a loop body is correctly flagged. The walk
+ stops at nested functions, so a `return` inside a `forEach`/IIFE callback no longer reads as
+ dead code. False-positive-averse: anything unmodeled is treated as reachable/guaranteed (#35).
+ - README Status no longer pins a stale `0.1.0` literal; it points to STATUS.md for the current
+ version and coverage. Removed two boolean sub-clauses fully subsumed by their first disjunct
+ in `isTestBlock` and the JS6 suite guard (behavior-preserving) (#64, #65).
+
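The "non-falling-through statement" idea in the `C20` entry can be illustrated on a toy statement list. This is a rough sketch under loud assumptions: the object shapes and the functions `terminates`, `countAssertions`, and `countDeadAssertions` are invented here and are not the contents of `src/cfg.ts`; loops, `switch`, `try`, and nested functions are deliberately left unmodeled.

```javascript
// Toy statement shapes (hypothetical, not the real AST):
//   { kind: "expect" }                                  an assertion
//   { kind: "return" | "throw" | "break" | "continue" } an escape
//   { kind: "if", then: [...], else: [...] | null }
//   { kind: "block", body: [...] }
// Any other kind falls through, mirroring "unmodeled means reachable".
const ESCAPES = new Set(["return", "throw", "break", "continue"]);

// True when control can never continue past this statement.
function terminates(stmt) {
  if (ESCAPES.has(stmt.kind)) return true;
  if (stmt.kind === "block") return stmt.body.some(terminates);
  if (stmt.kind === "if") {
    // Only an if/else whose two arms both terminate stops the flow.
    return Array.isArray(stmt.else) &&
      stmt.then.some(terminates) && stmt.else.some(terminates);
  }
  return false;
}

// Counts every assertion in a statement list, nested ones included.
function countAssertions(body) {
  let n = 0;
  for (const stmt of body) {
    if (stmt.kind === "expect") n += 1;
    if (stmt.kind === "block") n += countAssertions(stmt.body);
    if (stmt.kind === "if") n += countAssertions(stmt.then) + countAssertions(stmt.else ?? []);
  }
  return n;
}

// Counts assertions that sit after a non-falling-through statement.
function countDeadAssertions(body) {
  let dead = 0;
  let reachable = true;
  for (const stmt of body) {
    if (!reachable) {
      dead += countAssertions([stmt]);
      continue;
    }
    if (stmt.kind === "block") dead += countDeadAssertions(stmt.body);
    if (stmt.kind === "if") {
      dead += countDeadAssertions(stmt.then);
      dead += countDeadAssertions(stmt.else ?? []);
    }
    if (terminates(stmt)) reachable = false;
  }
  return dead;
}
```

With this model, an assertion after an `if` whose arms both escape counts as dead, while an assertion after an `if` with no `else` stays reachable.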
+ ### Fixed (earlier in the 0.4.0 cycle)
+ - JS5 surfaces a floating promise observed only by a non-assignment binary op (||, &&, ===); only a real assignment with the call as RHS counts as observed.
+ - JS7 timer control now consults lifecycle hooks that install/flush fake timers at both the enclosing-describe and source-file top level (#41).
+
+
+ ### Added
+ - C44 (numeric tautology, high, J2): `expect(x.length).toBeGreaterThanOrEqual(0)`. A
+ `.length` is never negative and never NaN, so `>= 0` holds for every input and checks
+ nothing — the JS/TS mirror of the Python `len(x) >= 0`. The subject must be a direct
+ `.length` property access: a derived expression that only mentions `.length`
+ (`a.length - b.length`) can be negative and is not flagged, and a bound that can still
+ fail (`>= 1`, `> 0`) is a real check. Finiteness/NaN guards (`toBeLessThan(Infinity)`,
+ `toBeGreaterThan(-Infinity)`) are deliberately not flagged: they are false for `NaN`, so
+ they catch divide-by-zero and invalid-number bugs.
+ - Output-format parity with the Python scanner: `--format text|json|sarif|junit`
+ (default `text`; `--json` stays as an alias for `--format json`). SARIF 2.1.0
+ emits one rule per code and one result per finding, maps high to `error`, low to
+ `warning`, off to `note`, and tags each result with its judgment, `risk:<group>`,
+ and `level:<conf>`. JUnit XML turns high findings into `<failure>` and the rest
+ into `<skipped>`. `--output` writes any of the four formats (sarif -> `.sarif`,
+ junit -> `.xml`).
+ - Baseline ratchet: `--baseline [PATH]` suppresses findings already recorded (so CI
+ fails only on net-new ones), and `--write-baseline [PATH]` records the current
+ findings and exits 0. Both default to `.falsegreen-baseline.json`. A finding's
+ identity is a content fingerprint (`sha1` of relative path + code + detail, no
+ line number), stable across unrelated line shifts. The fingerprint omits the
+ source snippet the Python scanner folds in, since the js `Finding` carries none.
+ - Risk-group taxonomy: every code now carries an explicit conceptual failure mode
+ (`effectiveness`, `execution`, `nondeterminism`, `dependency`, `structure`,
+ `diagnostic`), read from a closed per-code table (`riskGroupOf`) rather than the
+ code prefix. An unknown code is rejected instead of defaulted. The JSON report
+ gains a `riskGroup` field; the legacy `group` field stays for transition compatibility.
+ - A code's metadata is split into independent axes: `group` (taxonomy), `severity`
+ (`high`/`low`), `defaultOn` (whether the default scan emits it), and `judgment`
+ (J1-J6). The taxonomy no longer depends on whether a finding blocks.
+ - Oracle registry (`oracles.ts`): the assertion-API vocabulary is one versioned
+ table, each family classified by how its failure reaches the runner (`sync-fail`,
+ `promise`, `runner-registered`, `value-only`). The JSON report records the
+ `oracleRegistryVersion` that classified it.
+
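The arithmetic behind the `C44` entry in this section can be confirmed with a few plain comparisons. The variable names below are illustrative only and do not come from the scanner.

```javascript
// A direct `.length` read on an array or string is a non-negative integer,
// so `>= 0` cannot fail for any of these inputs.
const samples = [[], [1, 2, 3], "", "abc"];
const allNonNegative = samples.every((x) => x.length >= 0);

// A derived expression that merely mentions `.length` can go negative,
// so a `>= 0` bound on it is a real check.
const derived = [1].length - [1, 2, 3].length;

// A finiteness guard is false for NaN, so it can still fail on a bad number.
const nanPassesGuard = (0 / 0) < Infinity;

console.log(allNonNegative, derived, nanPassesGuard); // prints: true -2 false
```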
+ ### Fixed
+ - `--version` and the JSON report's `version` field read from `package.json` at
+ runtime; they were pinned to a stale `0.2.0` literal while the package was `0.3.0`.
+
  ## [0.3.0] - 2026-06-23
 
  ### Added
@@ -84,7 +176,8 @@ All notable changes to this project are documented here. The format is based on
  - pre-commit hook (`.pre-commit-hooks.yaml`), CI matrix (Node 18/20/22), and an npm
  trusted-publishing release workflow.
 
- [Unreleased]: https://github.com/vinicq/falsegreen-js/compare/v0.3.0...HEAD
+ [Unreleased]: https://github.com/vinicq/falsegreen-js/compare/v0.4.0...HEAD
+ [0.4.0]: https://github.com/vinicq/falsegreen-js/compare/v0.3.0...v0.4.0
  [0.3.0]: https://github.com/vinicq/falsegreen-js/compare/v0.2.0...v0.3.0
  [0.2.0]: https://github.com/vinicq/falsegreen-js/compare/v0.1.0...v0.2.0
  [0.1.0]: https://github.com/vinicq/falsegreen-js/releases/tag/v0.1.0
package/README.md CHANGED
@@ -1,5 +1,13 @@
  # falsegreen-js
 
+ [![CI](https://github.com/vinicq/falsegreen-js/actions/workflows/ci.yml/badge.svg)](https://github.com/vinicq/falsegreen-js/actions/workflows/ci.yml)
+ [![npm](https://img.shields.io/npm/v/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
+ [![Node](https://img.shields.io/node/v/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
+ [![Downloads](https://img.shields.io/npm/dm/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
+ [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)
+ [![Docs](https://img.shields.io/badge/docs-online-blue.svg)](https://vinicq.github.io/falsegreen-docs/)
+
  Find JavaScript/TypeScript unit tests that give false positives: green tests that
  protect nothing, and tests that pass while asserting the wrong thing. Deterministic
  AST scan, no code execution. Sibling of [`falsegreen`](https://github.com/vinicq/falsegreen)
@@ -7,6 +15,8 @@ AST scan, no code execution. Sibling of [`falsegreen`](https://github.com/vinicq
 
  Covers `.js`, `.jsx`, `.ts`, `.tsx`, `.mjs`, `.cjs`, `.mts`, `.cts`.
 
+ **The falsegreen family:** [falsegreen](https://github.com/vinicq/falsegreen) (Python/pytest) · **falsegreen-js** (JS/TS) · [robotframework-falsegreen](https://github.com/vinicq/robotframework-falsegreen) (Robot Framework) · [falsegreen-skill](https://github.com/vinicq/falsegreen-skill) (semantic LLM pass).
+
  ## Why
 
  A test can be green and still protect nothing: an empty body, an assertion that is
@@ -27,7 +37,9 @@ npm install -D falsegreen-js
  npx falsegreen-js # scan cwd
  npx falsegreen-js src test # scan paths
  npx falsegreen-js --staged # only test files staged in git (pre-commit)
- npx falsegreen-js --json # machine-readable output
+ npx falsegreen-js --json # machine-readable output (alias for --format json)
+ npx falsegreen-js --format sarif # SARIF 2.1.0 for GitHub code scanning
+ npx falsegreen-js --format junit # JUnit XML for CI test reporters
  npx falsegreen-js --output report.json # write to a file
  npx falsegreen-js --output .falsegreen/ # write report.<ext> into a directory
  npx falsegreen-js --config-audit # audit Jest/Vitest config (project-layer PL codes)
@@ -36,6 +48,24 @@ npx falsegreen-js --disable C7,JS3
 
  Each finding is reported with its pyramid level (unit / integration / e2e, read from the file's imports) and a one-line fix hint, and the summary breaks the findings down by level and lists the most common fixes. `--output` takes a file or a directory: an extension-less or trailing-slash path (e.g. `.falsegreen/`) receives `report.<ext>` for the chosen format. Reports are run artifacts; keep the output directory gitignored.
 
+ ### Output formats
+
+ `--format text|json|sarif|junit` (default `text`; `--json` stays as an alias for `--format json`). These match the [Python sibling](https://github.com/vinicq/falsegreen) byte-for-concept, so a pipeline can swap one scanner for the other.
+
+ - **`sarif`**: SARIF 2.1.0. One rule per code present, one result per finding, with `error` for high-severity findings, `warning` for low, and `note` for off. Result tags carry the judgment (J1-J6), the risk group (`risk:effectiveness`...), and the level (`level:high`). Upload it to GitHub code scanning to see findings inline on the PR.
+ - **`junit`**: JUnit XML. High-severity findings become `<failure>`, everything else `<skipped>`, so a CI test reporter surfaces them as a failing suite.
+
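The SARIF mapping described in the list above could look roughly like the sketch below. The input field names (`code`, `title`, `file`, `line`, `confidence`, `judgment`, `riskGroup`), the tag order, and the function `toSarifResult` are assumptions made for illustration; only the level mapping and the tag prefixes come from the text. The output keys (`ruleId`, `level`, `message.text`, `locations`, `properties.tags`) are standard SARIF 2.1.0 result properties.

```javascript
// Severity-to-level mapping as described in the README text.
const SARIF_LEVEL = { high: "error", low: "warning", off: "note" };

// Builds one SARIF result object from a finding-like record.
// The shape of `finding` is an assumption made for this sketch.
function toSarifResult(finding) {
  const level = SARIF_LEVEL[finding.confidence];
  if (level === undefined) {
    throw new Error(`unknown confidence: ${finding.confidence}`);
  }
  return {
    ruleId: finding.code,
    level,
    message: { text: finding.title },
    locations: [
      {
        physicalLocation: {
          artifactLocation: { uri: finding.file },
          region: { startLine: finding.line },
        },
      },
    ],
    properties: {
      tags: [
        finding.judgment,
        `risk:${finding.riskGroup}`,
        `level:${finding.confidence}`,
      ],
    },
  };
}
```

A consumer such as a code-scanning dashboard reads `level` to decide how prominently to show each result, which is why the mapping matters more than the exact tag layout.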
+ ### Baseline (ratchet)
+
+ Adopting the scanner on a large codebase without fixing every legacy finding at once:
+
+ ```bash
+ npx falsegreen-js --write-baseline # record current findings to .falsegreen-baseline.json, exit 0
+ npx falsegreen-js --baseline # report and fail only on findings not in the baseline
+ ```
+
+ `--baseline [PATH]` and `--write-baseline [PATH]` default to `.falsegreen-baseline.json`. A finding's identity is a content fingerprint (`sha1` of relative path + code + detail, no line number), so it survives unrelated line shifts in the file. Commit the baseline, then let CI block only on net-new findings. (The fingerprint omits the source snippet the Python scanner folds in, since the js scanner does not carry one; two findings with the same code and detail in one file share an id.)
+
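A line-independent fingerprint of the kind described above might be sketched as follows. The NUL separator, the field order, the finding shape, and the helper names `fingerprint` and `netNew` are assumptions for illustration; the real scanner may join the parts differently. `createHash` is the standard Node.js `node:crypto` API.

```javascript
const { createHash } = require("node:crypto");

// Sketch of a content fingerprint: sha1 over path + code + detail.
// The line number is deliberately left out of the hashed input.
function fingerprint(finding) {
  return createHash("sha1")
    .update([finding.path, finding.code, finding.detail].join("\0"))
    .digest("hex");
}

// Keeps only findings whose fingerprint is not already in the baseline.
function netNew(findings, baselineIds) {
  const known = new Set(baselineIds);
  return findings.filter((f) => !known.has(fingerprint(f)));
}
```

Because the line number is not hashed, inserting unrelated lines above a finding leaves its id unchanged; the trade-off, as the paragraph above notes, is that two findings with the same code and detail in one file collapse to a single id.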
  `--config-audit` is a separate mode: instead of scanning test files, it reads the Jest/Vitest config (`package.json` `jest` field, `jest.config.*`, `vitest.config.*`) and reports the project-layer ways a suite stays green by configuration: `PL10` (`passWithNoTests` passes an empty or filtered-to-nothing run), `PL7` (no `coverageThreshold` / `coverage.thresholds`), `PL8` (`bail` stops the run early). The per-file scan cannot see config.
 
  For the layer no static scan reaches (does a green test fail when the code is wrong?), run a **mutation tester** like [Stryker](https://stryker-mutator.io/). falsegreen-js is the cheap pre-filter on every commit; mutation testing is the deeper audit.
@@ -95,7 +125,8 @@ line up in the research. `JS*` codes are ecosystem-specific.
  | C5 | high | always-true check (`expect(true).toBe(true)`, `assert(1)`) |
  | C6 | low | weak check — only verifies something came back (`toBeTruthy`/`toBeDefined`, `length > 0`) |
  | C7 | high | compares a thing to itself (`expect(x).toBe(x)`) |
- | C20 | high | assertion in dead code after a `return`/`throw` - it never runs |
+ | C44 | high | numeric tautology - a length compared so the result is always true (`x.length >= 0`) |
+ | C20 | high | assertion in unreachable code (after a `return`/`throw`/`process.exit`, a `break`, a both-arms-terminating `if`, or an exhaustive `switch`) — it never runs |
  | C23 | low | reads a real file at a literal path, or a hard-coded URL (mystery guest) |
  | C8 | low | exact equality on a float (use `toBeCloseTo`) |
  | C9 | low | `toThrow()` with no error type or message — accepts any error |
@@ -103,6 +134,7 @@ line up in the research. `JS*` codes are ecosystem-specific.
  | C18 | low | compares `String(x)` / `JSON.stringify(x)` / `` `${x}` `` to a literal (formatting, not value) |
  | C21 | low | every assertion is conditional — none runs unconditionally |
  | C37 | low | duplicate case in `it.each`/`test.each` — the same scenario runs twice |
+ | C48 | low | dark patch — the test flips a test-mode flag (`process.env.NODE_ENV = "test"`, `process.env.TESTING`, a `TESTING` flag) then asserts, exercising the product's test-only branch |
  | CC | low | commented-out assertion |
  | JS1 | high | focused test (`it.only` / `fit`) silently skips the rest of the suite |
  | JS2 | high | `expect(x)` with no matcher — the assertion never runs |
@@ -126,7 +158,7 @@ Each code carries a judgment tag (J1-J6) shared with the
 
  ### Opt-in: maintainability group (default off)
 
- These are **not** false-green the test still protects something so they are off by
+ These are **not** false-green - the test still protects something - so they are off by
  default. Enable them with `--diagnostics`, or per code via config `severity`. They are a
  "plus" for test-code health, mirroring falsegreen's diagnostic/coupling groups.
 
@@ -149,18 +181,29 @@ npx falsegreen-js --diagnostics # include D*/M* as warnings
  Some catalog codes were reviewed and left out, on purpose:
 
  - **JS19** (`toBe` on an object/array literal): `expect(x).toBe({...})` compares by reference,
- so it always fails. That is a loud red test, the opposite of false-green, and out of scope.
+ so it always fails. That is the false-red axis (a test that always fails), the opposite of
+ what this scanner looks for, and out of scope on principle.
  - **JS20** (a Promise compared without `resolves`/`rejects`): telling that a value is a
- Promise needs type information the parser does not have, so it would be too noisy.
+ Promise needs type information the AST does not carry, so it would be too noisy.
  - **JS12** (a floating promise whose `expect` is never returned): already covered by JS7.
  - **JS16** (`async` test with no `expect.assertions(n)`): the absence of a guard is not a
  smell on its own; flagging it would fire on most async tests.
+ - **JS14** (a giant inline snapshot): a readability and review-noise concern, not a
+ false-green one. The snapshot still protects, so it belongs to the diagnostic group and is
+ better served by `eslint-plugin-jest` (`no-large-snapshots`) as an opt-in lint rule.
  - **JS10** (any conditional in a test body): handled by `eslint-plugin-jest`
  (`no-conditional-in-test`); JS9 and C21 already cover the false-green subset.
+ - **C1** (an assertion under an `if`/`for` that may not run): redundant once C21 and JS9
+ exist, and high-FP on its own. C21 already fires the actual false-green case, where
+ *every* assertion is conditional and the test can pass with nothing checked. A test that
+ mixes a conditional assertion with an unconditional one is not false-green: the
+ unconditional assertion still protects. JS9 covers the dead-branch form (`if(false)`).
+ Flagging every conditional assertion (C1's full scope) is the linter concern JS10 already
+ names (`no-conditional-in-test`), so C1 would add noise without a new false-green signal.
 
  ### What carries over from falsegreen, what does not
 
- Ported (same concept): C2, C2b, C5, C7, C8, C16, CC.
+ Ported (same concept): C2, C2b, C5, C7, C8, C16, C44, C48, CC.
 
  Python-only, not applicable to JS/TS: pytest collection rules (C4 family), `pytest.raises`
  breadth (C9/C19/C27/C28), fixtures and `os.environ`/global-state codes (C23/C24/C29),
@@ -193,20 +236,23 @@ test re-implements the production logic. Those are semantic and belong to the
  `falsegreen-skill` LLM pass. Precision over recall: a softened heuristic that misses a
  case is preferred to one that flags correct code.
 
+ Measured against the [Open Catalog of Test Smells](https://test-smell-catalog.readthedocs.io/) (517 documented smells), only the false-green slice is in scope. What stays out, on purpose: **brittleness / false-red** (sensitive equality, brittle assertions - the opposite axis), **hygiene / maintainability** (assertion roulette, magic numbers, long tests - linter territory, a few surfaced as opt-in diagnostics), and **slow, design, naming, duplication, runtime/culture** (none about whether the test protects). The boundary is deliberate: where a smell has a statically provable false-green form, that form is a code here - uncontrolled `Date.now`/`Math.random` is `C16`, a hard-coded path or URL is `C23`, an assertion that may never run is `C21`/`C20`, and JS-specific forms (focused tests, missing matchers) are the `JS*` codes. See [CREDITS.md](CREDITS.md) for the full cross-walk.
+
  ## References
 
  The catalog is grounded in the test-smell literature. Direct influences: the
  rotten-green-test work that names this whole family (Delplanque et al., ICSE 2019),
  the founding test-smell refactoring catalog (van Deursen et al., XP 2001), the
- JS/TS empirical studies (Jorge, UFCG 2023; Silva, PUC Minas 2022 the academic
+ JS/TS empirical studies (Jorge, UFCG 2023; Silva, PUC Minas 2022 - the academic
  precedent for the focused-test and snapshot codes; Oliveira et al., SBES 2024/2025),
  and the detection-tool baselines (tsDetect, Peruma et al., 2020). Full list and the
  code-to-source mapping in [CREDITS.md](CREDITS.md).
 
  ## Status
 
- `0.1.0`, early. The rule set is a deterministic core; the full JS/TS smell catalog is
- tracked as research in the private audit hub. Issues and PRs welcome.
+ The rule set is a deterministic core; the full JS/TS smell catalog is tracked as
+ research in the private audit hub. See [STATUS.md](STATUS.md) for the current version
+ and rule coverage. Issues and PRs welcome.
 
  ## License
 
@@ -227,7 +273,7 @@ Thanks to the people who keep false-green tests out of real suites ([emoji key](
  <tbody>
  <tr>
  <td align="center" valign="top" width="14.28%"><a href="https://vinicq.github.io/md-bridge/"><img src="https://avatars.githubusercontent.com/u/78210890?v=4?s=100" width="100px;" alt="Vinicius Queiroz"/><br /><sub><b>Vinicius Queiroz</b></sub></a><br /><a href="https://github.com/vinicq/falsegreen-js/commits?author=vinicq" title="Code">💻</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=vinicq" title="Documentation">📖</a> <a href="#ideas-vinicq" title="Ideas, Planning, & Feedback">🤔</a> <a href="#maintenance-vinicq" title="Maintenance">🚧</a> <a href="#infra-vinicq" title="Infrastructure (Hosting, Build-Tools, etc)">🚇</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=vinicq" title="Tests">⚠️</a> <a href="#research-vinicq" title="Research">🔬</a></td>
- <td align="center" valign="top" width="14.28%"><a href="https://github.com/homesellerq-coder"><img src="https://avatars.githubusercontent.com/u/294912019?v=4?s=100" width="100px;" alt="Home Seller"/><br /><sub><b>Home Seller</b></sub></a><br /><a href="https://github.com/vinicq/falsegreen-js/commits?author=homesellerq-coder" title="Code">💻</a></td>
+ <td align="center" valign="top" width="14.28%"><a href="https://github.com/homesellerq-coder"><img src="https://avatars.githubusercontent.com/u/294912019?v=4?s=100" width="100px;" alt="Home Seller"/><br /><sub><b>Home Seller</b></sub></a><br /><a href="https://github.com/vinicq/falsegreen-js/commits?author=homesellerq-coder" title="Code">💻</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=homesellerq-coder" title="Documentation">📖</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=homesellerq-coder" title="Tests">⚠️</a> <a href="#infra-homesellerq-coder" title="Infrastructure (Hosting, Build-Tools, etc)">🚇</a></td>
  </tr>
  </tbody>
  </table>
package/dist/cases.d.ts CHANGED
@@ -3,14 +3,42 @@
  * the same concept (shared C-codes, so cross-language paper comparison lines up),
  * plus JS/TS-specific codes (JS-prefix) for ecosystem-only patterns.
  *
- * confidence: "high" => blocks (exit 20); "low" => warns (exit 10); "off" => silent.
- * judgment: which semantic question (J1-J6, see falsegreen-skill) the code belongs to.
+ * Each code carries four independent axes (none derived from another or from the
+ * code prefix):
+ * group conceptual failure mode (RiskGroup, closed taxonomy).
+ * severity how serious the finding is when it fires ("high" | "low").
+ * defaultOn whether the default scan emits it (false for the opt-in
+ * diagnostic group, surfaced only via --diagnostics).
+ * judgment which semantic question (J1-J6, see falsegreen-skill) it answers.
+ *
+ * The effective "confidence" used downstream (high/low/off) is derived from
+ * severity + defaultOn by baseConfidence(); the exit code is derived from the
+ * severity of the findings that are actually emitted. Keeping the axes apart is
+ * the point: a finding's taxonomy must not depend on whether it blocks.
  */
  export type Confidence = "high" | "low" | "off";
+ export type Severity = "high" | "low";
+ /**
+ * Conceptual failure mode — a closed taxonomy condensing the F1-F8 families to
+ * six axes. Driven by the per-code table below (riskGroupOf), never by the code
+ * prefix, and never defaulted: an unknown code is an error, not a guess.
+ *
+ * effectiveness no oracle, a trivial oracle, or the wrong oracle (F1/F3/F4).
+ * execution the check exists but does not run, or the test vanishes from
+ * the count (F2/F5).
+ * nondeterminism passes or fails by luck — time, randomness, timers (F6).
+ * dependency real I/O or a stand-in for the unit under test: mystery
+ * guest, self-mock (isolation, J3/J6).
+ * structure size/readability; the test still protects (F8 maintainability).
+ * diagnostic opt-in health signal, off by default.
+ */
+ export type RiskGroup = "effectiveness" | "execution" | "nondeterminism" | "dependency" | "structure" | "diagnostic";
  export declare const JUDGMENTS: Record<string, string>;
  export interface CaseDef {
  title: string;
- confidence: Confidence;
+ group: RiskGroup;
+ severity: Severity;
+ defaultOn: boolean;
  judgment: keyof typeof JUDGMENTS;
  }
  export declare const CASES: Record<string, CaseDef>;
@@ -19,6 +47,26 @@ export declare const DIAGNOSTIC_THRESHOLDS: {
  assertionRoulette: number;
  longTest: number;
  };
+ /**
+ * Effective default state of a code as a single value: its severity when the
+ * default scan emits it, or "off" when it is opt-in. Derives the legacy
+ * three-valued "confidence" from the independent severity + defaultOn axes, so
+ * the rest of the pipeline (makeFinding, effectiveConf, exit code) keeps working
+ * unchanged while the taxonomy stays separate from the blocking decision.
+ */
+ export declare function baseConfidence(code: string): Confidence;
+ /**
+ * Primary taxonomy: the conceptual failure mode, read from the closed per-code
+ * table. Rejects an unknown code instead of defaulting, so a typo or a code that
+ * was added to the rules but never classified fails loudly.
+ */
+ export declare function riskGroupOf(code: string): RiskGroup;
+ /**
+ * Legacy product grouping (false-positive / diagnostic / coupling / project),
+ * kept only as a transition-compat field in the JSON report. New consumers
+ * should read `riskGroup` (riskGroupOf). Prefix-based by design: it mirrors the
+ * pre-0.3 output exactly so downstream filters do not break across the upgrade.
+ */
  export declare function groupOf(code: string): "false-positive" | "diagnostic" | "coupling" | "project";
  /** Test-pyramid level, detected from a file's import roots (see level.ts).
  * `project` is the config-audit layer (--config-audit), not a file level. */
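The doc comments on `baseConfidence` and `riskGroupOf` describe two small derivations that can be sketched without the real table. Everything below is a guess at the shape, not the shipped implementation: the two-entry `TABLE`, the opt-in code `D1`, the `lookup` helper, and the choice to make `baseConfidence` throw on an unknown code are all assumptions (the declarations only state that `riskGroupOf` rejects unknown codes).

```javascript
// Two illustrative entries shaped like CaseDef. `D1` is a hypothetical
// opt-in code used only to show the "off" case.
const TABLE = {
  C5: { group: "effectiveness", severity: "high", defaultOn: true },
  D1: { group: "diagnostic", severity: "low", defaultOn: false },
};

// Closed-table lookup: an unknown code is an error, never a default.
function lookup(code) {
  if (!Object.prototype.hasOwnProperty.call(TABLE, code)) {
    throw new Error(`unclassified code: ${code}`);
  }
  return TABLE[code];
}

// Legacy three-valued confidence derived from severity + defaultOn.
function baseConfidence(code) {
  const def = lookup(code);
  return def.defaultOn ? def.severity : "off";
}

// Primary taxonomy read straight from the table, not from the code prefix.
function riskGroupOf(code) {
  return lookup(code).group;
}
```

Keeping the lookup strict means a code that was added to the rules but never classified fails at the first call instead of silently landing in a default group.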
package/dist/cases.js CHANGED
@@ -3,8 +3,18 @@
  * the same concept (shared C-codes, so cross-language paper comparison lines up),
  * plus JS/TS-specific codes (JS-prefix) for ecosystem-only patterns.
  *
- * confidence: "high" => blocks (exit 20); "low" => warns (exit 10); "off" => silent.
- * judgment: which semantic question (J1-J6, see falsegreen-skill) the code belongs to.
+ * Each code carries four independent axes (none derived from another or from the
+ * code prefix):
+ * group conceptual failure mode (RiskGroup, closed taxonomy).
+ * severity how serious the finding is when it fires ("high" | "low").
+ * defaultOn whether the default scan emits it (false for the opt-in
+ * diagnostic group, surfaced only via --diagnostics).
+ * judgment which semantic question (J1-J6, see falsegreen-skill) it answers.
+ *
+ * The effective "confidence" used downstream (high/low/off) is derived from
+ * severity + defaultOn by baseConfidence(); the exit code is derived from the
+ * severity of the findings that are actually emitted. Keeping the axes apart is
+ * the point: a finding's taxonomy must not depend on whether it blocks.
  */
  export const JUDGMENTS = {
  J1: "does the assertion actually run?",
@@ -16,56 +26,88 @@ export const JUDGMENTS = {
  };
  export const CASES = {
  // --- shared concept with falsegreen (same code id) -----------------------
- C2: { title: "test with no check at all (empty body)", confidence: "high", judgment: "J1" },
- C2b: { title: "test calls things but checks nothing", confidence: "low", judgment: "J1" },
- C5: { title: "always-true check (expect(true).toBe(true), assert(1))", confidence: "high", judgment: "J2" },
- C6: { title: "weak check — only verifies something came back (toBeTruthy/toBeDefined, length > 0)", confidence: "low", judgment: "J4" },
- C7: { title: "compares a thing to itself (expect(x).toBe(x))", confidence: "high", judgment: "J2" },
- C20: { title: "assertion in dead code after a return/throw - it never runs", confidence: "high", judgment: "J1" },
- C23: { title: "reads a real file at a literal path or hits a hard-coded URL (mystery guest)", confidence: "low", judgment: "J6" },
- C8: { title: "exact equality on a float (fails on rounding, not bugs)", confidence: "low", judgment: "J4" },
- C16: { title: "result depends on time, randomness or a fixed timer", confidence: "low", judgment: "J1" },
- C18: { title: "compares String()/JSON.stringify()/`${x}` of a value to a literal (checks formatting, not the value)", confidence: "low", judgment: "J2" },
- C21: { title: "every assertion is conditional - none runs unconditionally", confidence: "low", judgment: "J1" },
- C9: { title: "expect(...).toThrow() with no error type or message - accepts any error", confidence: "low", judgment: "J4" },
- C37: { title: "duplicate case in it.each/test.each - the same scenario runs twice", confidence: "low", judgment: "J4" },
- CC: { title: "commented-out assertion (check switched off)", confidence: "low", judgment: "J1" },
+ C2: { title: "test with no check at all (empty body)", group: "effectiveness", severity: "high", defaultOn: true, judgment: "J1" },
+ C2b: { title: "test calls things but checks nothing", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J1" },
+ C5: { title: "always-true check (expect(true).toBe(true), assert(1))", group: "effectiveness", severity: "high", defaultOn: true, judgment: "J2" },
+ C6: { title: "weak check — only verifies something came back (toBeTruthy/toBeDefined, length > 0)", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J4" },
+ C7: { title: "compares a thing to itself (expect(x).toBe(x))", group: "effectiveness", severity: "high", defaultOn: true, judgment: "J2" },
+ C44: { title: "numeric tautology - a length compared so the result is always true (length >= 0)", group: "effectiveness", severity: "high", defaultOn: true, judgment: "J2" },
+ C20: { title: "assertion in dead code after a return/throw - it never runs", group: "execution", severity: "high", defaultOn: true, judgment: "J1" },
+ C23: { title: "reads a real file at a literal path or hits a hard-coded URL (mystery guest)", group: "dependency", severity: "low", defaultOn: true, judgment: "J6" },
+ C8: { title: "exact equality on a float (fails on rounding, not bugs)", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J4" },
+ C16: { title: "result depends on time, randomness or a fixed timer", group: "nondeterminism", severity: "low", defaultOn: true, judgment: "J1" },
+ C18: { title: "compares String()/JSON.stringify()/`${x}` of a value to a literal (checks formatting, not the value)", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J2" },
+ C21: { title: "every assertion is conditional - none runs unconditionally", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
+ C9: { title: "expect(...).toThrow() with no error type or message - accepts any error", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J4" },
+ C37: { title: "duplicate case in it.each/test.each — the same scenario runs twice", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J4" },
+ CC: { title: "commented-out assertion (check switched off)", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
+ C48: { title: "dark patch — the test flips a test-mode flag (process.env.NODE_ENV=\"test\", process.env.TESTING, a TESTING flag) then asserts, exercising the product's test-only branch", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
  // --- JS/TS ecosystem-specific --------------------------------------------
- JS1: { title: "focused test (it.only / fit / describe.only) silently skips the rest of the suite", confidence: "high", judgment: "J1" },
- JS2: { title: "expect(x) with no matcher — the assertion is never executed", confidence: "high", judgment: "J1" },
- JS3: { title: "snapshot is the only assertion (toMatchSnapshot generated from the output itself)", confidence: "low", judgment: "J2" },
- JS4: { title: "skipped test (it.skip / xit / xdescribe / it.todo) never runs", confidence: "low", judgment: "J1" },
- JS5: { title: "async query/event not awaited (findBy* / waitFor / user-event) — the assertion may never settle", confidence: "low", judgment: "J1" },
- JS6: { title: "empty describe/suite block — the suite reports green but runs nothing", confidence: "high", judgment: "J1" },
- JS7: { title: "assertion inside a non-awaited setTimeout/setInterval/then callback — it may run after the test ends", confidence: "low", judgment: "J1" },
- JS8: { title: "mocks the unit under test (jest.mock/vi.mock of an imported module asserted directly) — tests the mock, not the code", confidence: "low", judgment: "J3" },
- JS9: { title: "assertion in a dead branch (if(false) / if(true){}else) — it never runs", confidence: "high", judgment: "J1" },
- JS11: { title: "try/catch swallows the assertion — a failing expect is caught and the test stays green", confidence: "low", judgment: "J1" },
44
- JS13: { title: "query (getBy*/queryBy*/wrapper.find) as a loose statement — its result is never asserted", confidence: "low", judgment: "J4" },
45
- JS15: { title: "inappropriate assertion — the comparison is wrapped in a boolean (expect(a===b).toBe(true)), so the failure message is blind", confidence: "low", judgment: "J4" },
46
- JS17: { title: "commented-out test block (// it(...) / // test(...)) — a disabled test that no longer runs", confidence: "low", judgment: "J1" },
47
- JS18: { title: "test takes a done callback instead of async/await — a done called too early (or in a floating promise) passes before the assertions run", confidence: "low", judgment: "J1" },
48
- JS21: { title: "matcher referenced but never called (expect(x).toBe with no ()) — the assertion never executes", confidence: "high", judgment: "J1" },
49
- JS22: { title: "empty it.each/test.each table — the test is generated with zero cases and never runs", confidence: "high", judgment: "J1" },
46
+ JS1: { title: "focused test (it.only / fit / describe.only) silently skips the rest of the suite", group: "execution", severity: "high", defaultOn: true, judgment: "J1" },
47
+ JS2: { title: "expect(x) with no matcher — the assertion is never executed", group: "execution", severity: "high", defaultOn: true, judgment: "J1" },
48
+ JS3: { title: "snapshot is the only assertion (toMatchSnapshot generated from the output itself)", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J2" },
49
+ JS4: { title: "skipped test (it.skip / xit / xdescribe / it.todo) never runs", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
50
+ JS5: { title: "async query/event not awaited (findBy* / waitFor / user-event) — the assertion may never settle", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
51
+ JS6: { title: "empty describe/suite block — the suite reports green but runs nothing", group: "execution", severity: "high", defaultOn: true, judgment: "J1" },
52
+ JS7: { title: "assertion inside a non-awaited setTimeout/setInterval/then callback — it may run after the test ends", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
53
+ JS8: { title: "mocks the unit under test (jest.mock/vi.mock of an imported module asserted directly) — tests the mock, not the code", group: "dependency", severity: "low", defaultOn: true, judgment: "J3" },
54
+ JS9: { title: "assertion in a dead branch (if(false) / if(true){}else) — it never runs", group: "execution", severity: "high", defaultOn: true, judgment: "J1" },
55
+ JS11: { title: "try/catch swallows the assertion — a failing expect is caught and the test stays green", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
56
+ JS13: { title: "query (getBy*/queryBy*/wrapper.find) as a loose statement — its result is never asserted", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J4" },
57
+ JS15: { title: "inappropriate assertion — the comparison is wrapped in a boolean (expect(a===b).toBe(true)), so the failure message is blind", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J4" },
58
+ JS17: { title: "commented-out test block (// it(...) / // test(...)) — a disabled test that no longer runs", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
59
+ JS18: { title: "test takes a done callback instead of async/await — a done called too early (or in a floating promise) passes before the assertions run", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
60
+ JS21: { title: "matcher referenced but never called (expect(x).toBe with no ()) — the assertion never executes", group: "execution", severity: "high", defaultOn: true, judgment: "J1" },
61
+ JS22: { title: "empty it.each/test.each table — the test is generated with zero cases and never runs", group: "execution", severity: "high", defaultOn: true, judgment: "J1" },
50
62
  // --- diagnostic group (maintainability; default off, opt-in via --diagnostics
51
63
  // or config severity). These are NOT false-green: the test still protects. They
52
64
  // are a "plus" for test-code health, mirroring falsegreen's D/M group. -------
53
- D1: { title: "assertion roulette — many assertions in one test; a failure does not say which", confidence: "off", judgment: "J4" },
54
- D3: { title: "duplicate assert — the same assertion appears more than once in a test", confidence: "off", judgment: "J4" },
55
- D4: { title: "it.each/test.each without titled cases — a failing case is identified only by its index", confidence: "off", judgment: "J4" },
56
- D6: { title: "console.* in a test body — a debug artifact that bypasses the oracle", confidence: "off", judgment: "J4" },
57
- D7: { title: "anonymous test — empty or missing description", confidence: "off", judgment: "J4" },
58
- D8: { title: "magic number in an assertion — a bare numeric literal instead of a named constant", confidence: "off", judgment: "J4" },
59
- M2: { title: "test body exceeds the line-count threshold — hard to read and maintain", confidence: "off", judgment: "J5" },
65
+ D1: { title: "assertion roulette — many assertions in one test; a failure does not say which", group: "diagnostic", severity: "low", defaultOn: false, judgment: "J4" },
66
+ D3: { title: "duplicate assert — the same assertion appears more than once in a test", group: "diagnostic", severity: "low", defaultOn: false, judgment: "J4" },
67
+ D4: { title: "it.each/test.each without titled cases — a failing case is identified only by its index", group: "diagnostic", severity: "low", defaultOn: false, judgment: "J4" },
68
+ D6: { title: "console.* in a test body — a debug artifact that bypasses the oracle", group: "diagnostic", severity: "low", defaultOn: false, judgment: "J4" },
69
+ D7: { title: "anonymous test — empty or missing description", group: "diagnostic", severity: "low", defaultOn: false, judgment: "J4" },
70
+ D8: { title: "magic number in an assertion — a bare numeric literal instead of a named constant", group: "diagnostic", severity: "low", defaultOn: false, judgment: "J4" },
71
+ M2: { title: "test body exceeds the line-count threshold — hard to read and maintain", group: "structure", severity: "low", defaultOn: false, judgment: "J5" },
60
72
  // --- project layer (config-audit only; emitted by --config-audit, never by
61
73
  // the per-file scan). The suite goes green by configuration, not by a smell
62
74
  // inside any one test file. ------------------------------------------------
63
- PL7: { title: "no coverage gate (coverageThreshold / coverage.thresholds) - coverage can fall to zero and the suite still passes", confidence: "low", judgment: "J5" },
64
- PL8: { title: "bail stops the run early (bail) - the reported test count is incomplete", confidence: "low", judgment: "J5" },
65
- PL10: { title: "passWithNoTests lets an empty or fully-filtered suite report green", confidence: "low", judgment: "J1" },
75
+ PL7: { title: "no coverage gate (coverageThreshold / coverage.thresholds) - coverage can fall to zero and the suite still passes", group: "effectiveness", severity: "low", defaultOn: true, judgment: "J5" },
76
+ PL8: { title: "bail stops the run early (bail) - the reported test count is incomplete", group: "execution", severity: "low", defaultOn: true, judgment: "J5" },
77
+ PL10: { title: "passWithNoTests lets an empty or fully-filtered suite report green", group: "execution", severity: "low", defaultOn: true, judgment: "J1" },
66
78
  };
67
79
  /** Default thresholds for the diagnostic group (overridable later via config). */
68
80
  export const DIAGNOSTIC_THRESHOLDS = { assertionRoulette: 5, longTest: 50 };
81
+ /**
82
+ * Effective default state of a code as a single value: its severity when the
83
+ * default scan emits it, or "off" when it is opt-in. Derives the legacy
84
+ * three-valued "confidence" from the independent severity + defaultOn axes, so
85
+ * the rest of the pipeline (makeFinding, effectiveConf, exit code) keeps working
86
+ * unchanged while the taxonomy stays separate from the blocking decision.
87
+ */
88
+ export function baseConfidence(code) {
89
+ const c = CASES[code];
90
+ if (!c)
91
+ throw new Error(`falsegreen-js: unknown code "${code}" — not in the case catalog`);
92
+ return c.defaultOn ? c.severity : "off";
93
+ }
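Reviewer note: the catalog entries in this hunk replace the single `confidence` field with `severity` plus `defaultOn`, and `baseConfidence` folds the two back into the old value. The sketch below is a minimal illustration of that round trip, not the package's code. `LEGACY_TO_AXES` and `toLegacyConfidence` are hypothetical names introduced here, and the mapping is read off the removed and added lines above.

```javascript
// Hypothetical table: each legacy confidence value and the two axes that
// replace it, as observed in the removed (-) and added (+) catalog lines.
const LEGACY_TO_AXES = {
  high: { severity: "high", defaultOn: true },
  low: { severity: "low", defaultOn: true },
  off: { severity: "low", defaultOn: false },
};

// Same derivation as the return statement of baseConfidence in the diff:
// the severity when the code is on by default, otherwise "off".
function toLegacyConfidence(entry) {
  return entry.defaultOn ? entry.severity : "off";
}

// Each legacy value should survive the split and the re-derivation.
for (const [legacy, axes] of Object.entries(LEGACY_TO_AXES)) {
  console.log(legacy, "->", toLegacyConfidence(axes));
}
```

Under these assumptions an opt-in code always collapses to "off", whatever its severity. That is presumably why the downstream pipeline can keep consuming a single value.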
94
+ /**
95
+ * Primary taxonomy: the conceptual failure mode, read from the closed per-code
96
+ * table. Rejects an unknown code instead of defaulting, so a typo or a code that
97
+ * was added to the rules but never classified fails loudly.
98
+ */
99
+ export function riskGroupOf(code) {
100
+ const c = CASES[code];
101
+ if (!c)
102
+ throw new Error(`falsegreen-js: unknown code "${code}" — not in the case catalog`);
103
+ return c.group;
104
+ }
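Reviewer note: a rough sketch of how a consumer might lean on the fail-loud lookup described in the comment above. `SAMPLE_GROUPS`, `sampleRiskGroupOf` and `tallyByRiskGroup` are hypothetical names. The three group values are copied from the catalog lines in this diff, and the tally itself is an assumption about usage, not something the package is shown to do.

```javascript
// Hypothetical excerpt of the per-code table (groups copied from the diff).
const SAMPLE_GROUPS = { C20: "execution", C16: "nondeterminism", JS8: "dependency" };

// Closed-table lookup: an unknown code throws instead of getting a default.
function sampleRiskGroupOf(code) {
  const group = SAMPLE_GROUPS[code];
  if (!group) throw new Error(`unknown code "${code}"`);
  return group;
}

// Count findings per risk group; a typo in any code aborts the whole tally.
function tallyByRiskGroup(codes) {
  const tally = {};
  for (const code of codes) {
    const group = sampleRiskGroupOf(code);
    tally[group] = (tally[group] ?? 0) + 1;
  }
  return tally;
}

console.log(tallyByRiskGroup(["C20", "C16", "C20", "JS8"]));
```

Throwing rather than bucketing unknown codes under a catch-all keeps an unclassified rule from silently vanishing from a report. This matches the intent stated in the doc comment.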
105
+ /**
106
+ * Legacy product grouping (false-positive / diagnostic / coupling / project),
107
+ * kept only as a transition-compat field in the JSON report. New consumers
108
+ * should read `riskGroup` (riskGroupOf). Prefix-based by design: it mirrors the
109
+ * pre-0.3 output exactly so downstream filters do not break across the upgrade.
110
+ */
69
111
  export function groupOf(code) {
70
112
  if (code.startsWith("PL"))
71
113
  return "project";
@@ -86,6 +128,7 @@ export const FIX_HINTS = {
86
128
  C5: "assert the real behaviour, not a constant or tautology",
87
129
  C6: "assert the actual value, not just that something came back",
88
130
  C7: "compare against an independent expected value, not the subject itself",
131
+ C44: "assert the actual length, not that it is at least zero (always true)",
89
132
  C20: "move the assertion before the return/throw so it runs",
90
133
  C23: "use a fixture or temp file instead of a real path or hard-coded URL",
91
134
  C8: "use toBeCloseTo() or a tolerance instead of exact float equality",
@@ -95,13 +138,14 @@ export const FIX_HINTS = {
95
138
  C9: "pass an error type or message to toThrow()",
96
139
  C37: "remove the duplicate it.each/test.each case",
97
140
  CC: "restore the commented-out assertion, or delete it",
141
+ C48: "assert the behaviour a real user hits; don't force the product's test-mode branch from the test",
98
142
  JS1: "remove .only (it.only/fit/describe.only) so the whole suite runs",
99
143
  JS2: "add a matcher (expect(x).toBe(...)) so the assertion runs",
100
144
  JS3: "add a real assertion; don't rely only on a self-generated snapshot",
101
145
  JS4: "remove .skip/xit/todo, or implement the test",
102
146
  JS5: "await the async query/event before asserting",
103
147
  JS6: "add tests to the describe block, or remove it",
104
- JS7: "await the promise/timer, or assert synchronously",
148
+ JS7: "await the promise, or use/flush fake timers, or assert synchronously",
105
149
  JS8: "unmock the unit under test; mock only its collaborators",
106
150
  JS9: "remove the dead branch so the assertion runs",
107
151
  JS11: "let the assertion error propagate; don't catch it",
package/dist/cfg.d.ts ADDED
@@ -0,0 +1,32 @@
1
+ /**
2
+ * Intra-test structured reachability, backing C20 (dead code) and C21 (no
3
+ * unconditional assertion). JS test bodies are structured (no goto), so this is a
4
+ * recursive walk over the statement tree, not a full control-flow graph.
5
+ *
6
+ * Two questions, both FP-averse (a false positive is worse than a miss):
7
+ * assertionsInDeadCode - assertions at a position control can never reach.
8
+ * hasUnconditionalAssertion - is at least one assertion guaranteed to run?
9
+ *
10
+ * The assertion predicate and literal-truthiness helper are passed in (they live in
11
+ * rules.ts) so this module imports nothing from there and there is no cycle.
12
+ */
13
+ import ts from "typescript";
14
+ type IsAssertion = (n: ts.Node) => boolean;
15
+ type LitTruth = (e: ts.Expression | undefined) => boolean | null;
16
+ /**
17
+ * Assertions sitting at a position control can never reach (C20). Walk each
18
+ * statement list keeping `reachable` (true at the head); once a statement does not
19
+ * fall through, the rest of the list is dead. Recurse into nested lists of a
20
+ * reachable statement with a fresh reachable=true. Stop at nested functions.
21
+ */
22
+ export declare function assertionsInDeadCode(body: ts.Block, isAssertion: IsAssertion): ts.Node[];
23
+ /**
24
+ * Is at least one assertion guaranteed to run (so C21 must NOT fire)? Walks the
25
+ * "spine" of always-executed positions: top-level statements, blocks on the spine,
26
+ * the taken branch of an if/?: with a literal-constant condition, finally blocks,
27
+ * and - conservatively - a try block. A non-const if, any loop, switch, catch, or
28
+ * short-circuit is not guaranteed. Anything unmodeled is treated as guaranteed
29
+ * (suppress C21) to stay false-positive-averse.
30
+ */
31
+ export declare function hasUnconditionalAssertion(body: ts.Block, isAssertion: IsAssertion, litTruth: LitTruth, deadAsserts?: ReadonlySet<ts.Node>): boolean;
32
+ export {};
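Reviewer note: the declarations above describe two walks over a structured test body. The sketch below replays the idea on a toy statement tree to show how C20 and C21 can split one test between them. It is an illustration under stated assumptions, not the package's implementation: the node shapes (`kind`, `test`, `then`, `else`) and every function name are hypothetical, and the real module operates on TypeScript AST nodes.

```javascript
// Toy nodes (hypothetical): { kind: "assert" | "expr" | "return" | "throw" }
// or { kind: "if", test: true | false | null, then: [...], else: [...] },
// where test === null means the condition is not a literal constant.

// Deliberately coarse: only a bare return/throw ends the list, so an `if`
// whose branches all return is treated as falling through (a miss, not a
// false positive).
function fallsThrough(stmt) {
  return stmt.kind !== "return" && stmt.kind !== "throw";
}

// Collect every assertion nested anywhere under the given statements.
function collectAssertions(stmts, out) {
  for (const s of stmts) {
    if (s.kind === "assert") out.push(s);
    for (const child of [s.then, s.else]) if (child) collectAssertions(child, out);
  }
}

// C20-style walk: after a statement that does not fall through, the rest of
// the list is dead; nested lists of a reachable statement start reachable.
function toyDeadAssertions(stmts, out = []) {
  let reachable = true;
  for (const s of stmts) {
    if (!reachable) { collectAssertions([s], out); continue; }
    if (s.kind === "if") {
      toyDeadAssertions(s.then, out);
      toyDeadAssertions(s.else ?? [], out);
    }
    if (!fallsThrough(s)) reachable = false;
  }
  return out;
}

// C21-style walk along the spine: top-level statements plus the taken branch
// of an `if` with a literal condition. Dead assertions never count.
function toyHasUnconditionalAssertion(stmts, dead = new Set()) {
  for (const s of stmts) {
    if (s.kind === "assert" && !dead.has(s)) return true;
    if (s.kind === "if" && s.test !== null) {
      const taken = s.test ? s.then : (s.else ?? []);
      if (toyHasUnconditionalAssertion(taken, dead)) return true;
    }
    if (!fallsThrough(s)) return false;
  }
  return false;
}

// One conditional assertion, then a return, then an assertion after it.
const guarded = { kind: "assert" };
const late = { kind: "assert" };
const body = [
  { kind: "if", test: null, then: [guarded], else: [] },
  { kind: "return" },
  late,
];
const dead = toyDeadAssertions(body);
console.log(dead.length, toyHasUnconditionalAssertion(body, new Set(dead)));
```

In this toy run the late assertion is the only dead one, and the remaining live assertion is conditional. The first walk therefore owns the dead line while the second still reports that nothing runs unconditionally, which is the split the 0.4.0 changelog entry for #62 describes.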