falsegreen-js 0.2.0 → 0.4.0
- package/CHANGELOG.md +134 -1
- package/README.md +131 -14
- package/dist/audit.d.ts +5 -0
- package/dist/audit.js +120 -0
- package/dist/cases.d.ts +61 -4
- package/dist/cases.js +138 -31
- package/dist/cfg.d.ts +32 -0
- package/dist/cfg.js +237 -0
- package/dist/cli.d.ts +38 -1
- package/dist/cli.js +210 -23
- package/dist/level.d.ts +10 -0
- package/dist/level.js +75 -0
- package/dist/oracles.d.ts +52 -0
- package/dist/oracles.js +87 -0
- package/dist/report.d.ts +31 -0
- package/dist/report.js +177 -0
- package/dist/rules.js +472 -68
- package/dist/scan.js +0 -0
- package/dist/types.d.ts +2 -1
- package/dist/types.js +3 -2
- package/package.json +34 -9
package/CHANGELOG.md
CHANGED
|
@@ -6,6 +6,137 @@ All notable changes to this project are documented here. The format is based on
|
|
|
6
6
|
|
|
7
7
|
## [Unreleased]
|
|
8
8
|
|
|
9
|
+
## [0.4.0] - 2026-06-28
|
|
10
|
+
|
|
11
|
+
### Fixed
|
|
12
|
+
- C21 no longer false-positives on a `do { expect } while(c)`: a do/while body always runs at least
|
|
13
|
+
once, so its assertion is unconditional (#60).
|
|
14
|
+
- C16 crypto match is anchored to a crypto root (`crypto.randomUUID`, `globalThis/window/self.crypto`,
|
|
15
|
+
or the bare node:crypto import), so a user method named `randomUUID()`/`getRandomValues()` is no
|
|
16
|
+
longer flagged (#61).
|
|
17
|
+
- C20 and C21 no longer double-report on a dead-code-only assertion: an assertion already flagged
|
|
18
|
+
C20 (unreachable) is excluded from the C21 set, so C20 owns the line. C21 still fires when a live
|
|
19
|
+
conditional assertion remains (#62).
|
|
20
|
+
|
|
21
|
+
### Added
|
|
22
|
+
- `C16` nondeterminism now also flags `new Date()` (zero-arg, reads the system clock),
|
|
23
|
+
`crypto.randomUUID()`, and `crypto.getRandomValues()`. `new Date(<literal/expr>)` is a fixed
|
|
24
|
+
instant and stays clean, and the file-wide fake-timer suppression applies. Aliased/destructured
|
|
25
|
+
clock reads (`const now = Date.now; now()`) stay out: tracking them would trade a rare miss for
|
|
26
|
+
a frequent false positive on user `now()` helpers (#46).
|
|
27
|
+
- `C48` (dark patch): a test that flips a known test-mode flag into test mode and then
|
|
28
|
+
asserts is exercising the product's test-only branch, not real behaviour. Detection-only;
|
|
29
|
+
v1 covers raw writes (`process.env.NODE_ENV = "test"`, `process.env.TESTING = "1"`,
|
|
30
|
+
`settings.TESTING = true`). `NODE_ENV` only counts as `"test"`; config values and product
|
|
31
|
+
feature flags are not flagged; a flag write with no assertion after it is setup, not a dark
|
|
32
|
+
patch. Parity with falsegreen (Python) `C48`, same id and J1/low (#39).
|
|
33
|
+
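The C16 entry above separates a zero-argument `new Date()` (a system-clock read) from `new Date(<literal/expr>)` (a fixed instant). A minimal sketch of that difference follows; the helper `isExpired` and the sample dates are hypothetical and exist only for illustration, they are not part of falsegreen-js.

```typescript
// Illustration only: why a zero-argument `new Date()` is nondeterministic
// while `new Date(<fixed value>)` is not.

// A fixed instant: the same value on every run.
const fixed = new Date(0).toISOString();
const fixedIsStable = fixed === "1970-01-01T00:00:00.000Z";

// A system-clock read: the value differs from run to run, so a test that
// asserts on it directly depends on when it happens to execute.
const clockRead = new Date().getTime();

// A deterministic alternative (hypothetical helper): pass the instant in,
// so the test controls "now" instead of reading the clock.
function isExpired(expiresAt: Date, now: Date): boolean {
  return now.getTime() >= expiresAt.getTime();
}

const expired = isExpired(
  new Date("2026-01-01T00:00:00Z"),
  new Date("2026-06-28T00:00:00Z"),
);
```

Passing the clock value as a parameter is one common way to keep such a test deterministic; installing fake timers, which the entry says suppresses C16 file-wide, is another.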
|
|
34
|
+
### Tests
|
|
35
|
+
- Lock the floating `expect(p).resolves`/`.rejects.<matcher>()` case for `JS5`: a non-awaited
|
|
36
|
+
promise matcher is already flagged through the oracle registry (the matcher builds a promise
|
|
37
|
+
that never settles before the test ends), and tests now pin that, including an exact-count
|
|
38
|
+
guard so a future change cannot double-report it. Awaited/returned forms stay clean (#43).
|
|
39
|
+
- Characterization tests for the cfg reachability edge cases: for-in (C20 after / C21 inside),
|
|
40
|
+
labeled `break outer`, a switch case that falls through without escaping, an IIFE holding the
|
|
41
|
+
only assertion (no phantom C21), and `performance.now()` C16 (#63).
|
|
42
|
+
|
|
43
|
+
### Changed
|
|
44
|
+
- `C20` and `C21` now use a structured intra-test reachability walk (`src/cfg.ts`) instead of
|
|
45
|
+
a top-level-only scan. `C20` (dead code) catches an assertion after any non-falling-through
|
|
46
|
+
statement: a `return`/`throw`, `process.exit()`, a `break`/`continue`, an `if` whose both
|
|
47
|
+
arms terminate, a terminating block, or an exhaustive `switch` (every case plus a `default`
|
|
48
|
+
escapes). `C21` (no unconditional assertion) fires only when no assertion is on the guaranteed
|
|
49
|
+
spine; an assertion in an `if(true)` branch, a `finally`, or a `try` block now counts as
|
|
50
|
+
guaranteed, and an assertion only in a `catch` or a loop body is correctly flagged. The walk
|
|
51
|
+
stops at nested functions, so a `return` inside a `forEach`/IIFE callback no longer reads as
|
|
52
|
+
dead code. False-positive-averse: anything unmodeled is treated as reachable/guaranteed (#35).
|
|
53
|
+
- README Status no longer pins a stale `0.1.0` literal; it points to STATUS.md for the current
|
|
54
|
+
version and coverage. Removed two boolean sub-clauses fully subsumed by their first disjunct
|
|
55
|
+
in `isTestBlock` and the JS6 suite guard (behavior-preserving) (#64, #65).
|
|
56
|
+
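The C20 rule described in the Changed entry above rests on one question: can control fall through a statement? The toy model below sketches that idea for a small subset (`return`/`throw`/`break`/`continue`, blocks, and `if`). It is not the code in `src/cfg.ts`; the `Stmt` type and both function names are assumptions made for this example, and loops, `switch`, `try`, and nested functions are left out.

```typescript
// Toy statement model (hypothetical; not the package's AST).
type Stmt =
  | { kind: "expr"; isAssertion?: boolean }
  | { kind: "return" }
  | { kind: "throw" }
  | { kind: "break" }
  | { kind: "continue" }
  | { kind: "block"; body: Stmt[] }
  | { kind: "if"; thenArm: Stmt; elseArm?: Stmt };

// True when execution can continue past the statement.
function fallsThrough(s: Stmt): boolean {
  switch (s.kind) {
    case "return":
    case "throw":
    case "break":
    case "continue":
      return false;
    case "block":
      // A block terminates as soon as one of its statements does.
      return s.body.every((inner) => fallsThrough(inner));
    case "if":
      // Only an `if` with both arms present and both terminating stops flow.
      return !(
        s.elseArm !== undefined &&
        !fallsThrough(s.thenArm) &&
        !fallsThrough(s.elseArm)
      );
    case "expr":
      return true;
  }
}

// Counts assertions that sit after a non-falling-through statement.
function countDeadAssertions(body: Stmt[], reachable = true): number {
  let dead = 0;
  let live = reachable;
  for (const s of body) {
    if (s.kind === "expr") {
      if (s.isAssertion && !live) dead += 1;
    } else if (s.kind === "block") {
      dead += countDeadAssertions(s.body, live);
    } else if (s.kind === "if") {
      dead += countDeadAssertions([s.thenArm], live);
      if (s.elseArm) dead += countDeadAssertions([s.elseArm], live);
    }
    if (!fallsThrough(s)) live = false;
  }
  return dead;
}

const assertion: Stmt = { kind: "expr", isAssertion: true };

// if (c) return; else throw; expect(...)  -> the assertion is unreachable
const bothArmsTerminate: Stmt[] = [
  { kind: "if", thenArm: { kind: "return" }, elseArm: { kind: "throw" } },
  assertion,
];

// if (c) return; expect(...)  -> the assertion can still run
const oneArmTerminates: Stmt[] = [
  { kind: "if", thenArm: { kind: "return" } },
  assertion,
];
```

In this sketch `countDeadAssertions(bothArmsTerminate)` reports one dead assertion and `countDeadAssertions(oneArmTerminates)` reports none. The model marks code dead only when it can prove flow stops, which mirrors the false-positive-averse stance the entry describes.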
|
|
57
|
+
### Fixed (earlier in the 0.4.0 cycle)
|
|
58
|
+
- JS5 surfaces a floating promise observed only by a non-assignment binary op (||, &&, ===); only a real assignment with the call as RHS counts as observed.
|
|
59
|
+
- JS7 timer control now consults lifecycle hooks that install/flush fake timers at both the enclosing-describe and source-file top level (#41).
|
|
60
|
+
|
|
61
|
+
|
|
62
|
+
### Added
|
|
63
|
+
- C44 (numeric tautology, high, J2): `expect(x.length).toBeGreaterThanOrEqual(0)`. A
|
|
64
|
+
`.length` is never negative and never NaN, so `>= 0` holds for every input and checks
|
|
65
|
+
nothing — the JS/TS mirror of the Python `len(x) >= 0`. The subject must be a direct
|
|
66
|
+
`.length` property access: a derived expression that only mentions `.length`
|
|
67
|
+
(`a.length - b.length`) can be negative and is not flagged, and a bound that can still
|
|
68
|
+
fail (`>= 1`, `> 0`) is a real check. Finiteness/NaN guards (`toBeLessThan(Infinity)`,
|
|
69
|
+
`toBeGreaterThan(-Infinity)`) are deliberately not flagged: they are false for `NaN`, so
|
|
70
|
+
they catch divide-by-zero and invalid-number bugs.
|
|
71
|
+
- Output-format parity with the Python scanner: `--format text|json|sarif|junit`
|
|
72
|
+
(default `text`; `--json` stays as an alias for `--format json`). SARIF 2.1.0
|
|
73
|
+
emits one rule per code and one result per finding, maps high to `error`, low to
|
|
74
|
+
`warning`, off to `note`, and tags each result with its judgment, `risk:<group>`,
|
|
75
|
+
and `level:<conf>`. JUnit XML turns high findings into `<failure>` and the rest
|
|
76
|
+
into `<skipped>`. `--output` writes any of the four formats (sarif -> `.sarif`,
|
|
77
|
+
junit -> `.xml`).
|
|
78
|
+
- Baseline ratchet: `--baseline [PATH]` suppresses findings already recorded (so CI
|
|
79
|
+
fails only on net-new ones), and `--write-baseline [PATH]` records the current
|
|
80
|
+
findings and exits 0. Both default to `.falsegreen-baseline.json`. A finding's
|
|
81
|
+
identity is a content fingerprint (`sha1` of relative path + code + detail, no
|
|
82
|
+
line number), stable across unrelated line shifts. The fingerprint omits the
|
|
83
|
+
source snippet the Python scanner folds in, since the js `Finding` carries none.
|
|
84
|
+
- Risk-group taxonomy: every code now carries an explicit conceptual failure mode
|
|
85
|
+
(`effectiveness`, `execution`, `nondeterminism`, `dependency`, `structure`,
|
|
86
|
+
`diagnostic`), read from a closed per-code table (`riskGroupOf`) rather than the
|
|
87
|
+
code prefix. An unknown code is rejected instead of defaulted. The JSON report
|
|
88
|
+
gains a `riskGroup` field; the legacy `group` field stays for transition compatibility.
|
|
89
|
+
- A code's metadata is split into independent axes: `group` (taxonomy), `severity`
|
|
90
|
+
(`high`/`low`), `defaultOn` (whether the default scan emits it), and `judgment`
|
|
91
|
+
(J1-J6). The taxonomy no longer depends on whether a finding blocks.
|
|
92
|
+
- Oracle registry (`oracles.ts`): the assertion-API vocabulary is one versioned
|
|
93
|
+
table, each family classified by how its failure reaches the runner (`sync-fail`,
|
|
94
|
+
`promise`, `runner-registered`, `value-only`). The JSON report records the
|
|
95
|
+
`oracleRegistryVersion` that classified it.
|
|
96
|
+
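The reasoning behind C44 in the list above can be checked directly in plain TypeScript. The snippet below is only an illustration of the arithmetic facts the entry relies on; the variable names are made up for the example.

```typescript
// `.length` of an array is a non-negative integer, so `>= 0` cannot fail.
const items: string[] = [];
const lengthNeverNegative = items.length >= 0;

// A derived expression that merely mentions `.length` can be negative,
// so a `>= 0` bound on it is a real check.
const a = [1];
const b = [1, 2, 3];
const derivedDifference = a.length - b.length;

// Finiteness guards are false for NaN, so they can still fail.
const invalid = 0 / 0; // NaN
const nanBelowInfinity = invalid < Infinity;
const nanAboveNegInfinity = invalid > -Infinity;
```

Here `lengthNeverNegative` is `true` even for an empty array, `derivedDifference` is `-2`, and both NaN comparisons are `false`, which is why the entry treats the first form as a tautology and leaves the others alone.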
|
|
97
|
+
### Fixed
|
|
98
|
+
- `--version` and the JSON report's `version` field read from `package.json` at
|
|
99
|
+
runtime; they were pinned to a stale `0.2.0` literal while the package was `0.3.0`.
|
|
100
|
+
|
|
101
|
+
## [0.3.0] - 2026-06-23
|
|
102
|
+
|
|
103
|
+
### Added
|
|
104
|
+
- New codes: JS21 (matcher referenced but never called, `expect(x).toBe` with no `()`),
|
|
105
|
+
JS22 (empty `it.each`/`test.each` table), JS17 (commented-out test block), JS18 (`done`
|
|
106
|
+
callback instead of async/await).
|
|
107
|
+
- supertest / chai-http `.expect()` is recognized as an assertion, so API integration tests
|
|
108
|
+
built with `request(app).get(...).expect(200)` are no longer flagged C2b.
|
|
109
|
+
- Documented test-pyramid coverage: unit, integration (API and database), and E2E.
|
|
110
|
+
- `--config-audit` mode (project layer): reads the Jest/Vitest config (`package.json` `jest`
|
|
111
|
+
field, `jest.config.*`, `vitest.config.*`; JSON directly, JS/TS via the TypeScript parser)
|
|
112
|
+
and reports PL10 (`passWithNoTests`), PL7 (no `coverageThreshold` / `coverage.thresholds`),
|
|
113
|
+
PL8 (`bail`). Findings carry level `project` and a fix hint. README now recommends Stryker
|
|
114
|
+
for the mutation-testing layer the static scan cannot reach.
|
|
115
|
+
- Status report output: every finding now carries its pyramid level (unit / integration /
|
|
116
|
+
e2e, detected from the file's import roots) and a one-line fix hint. The text summary adds
|
|
117
|
+
a per-level breakdown and the top fixes by frequency; JSON gains `level` and `fix` fields.
|
|
118
|
+
- `--output` flag: write to a file, or pass a directory (e.g. `.falsegreen/`) to get
|
|
119
|
+
`report.<ext>` for the chosen format. Parent directories are created as needed.
|
|
120
|
+
|
|
121
|
+
### Added
|
|
122
|
+
- Cross-language parity with the Python scanner: C6 (weak check — toBeTruthy/toBeDefined/length>0),
|
|
123
|
+
C20 (assertion in dead code after return/throw), C23 (mystery guest — real file at a literal
|
|
124
|
+
path / hard-coded URL), and JS8 (mocks the unit under test and asserts it directly).
|
|
125
|
+
|
|
126
|
+
### Added
|
|
127
|
+
- JS3 now covers visual snapshots (Playwright `toHaveScreenshot`/`toMatchScreenshot`): a test whose only check is a visual snapshot is snapshot-only (the baseline comes from the output). Percy `percySnapshot()`/`cy.percySnapshot()` is not a runtime assertion, so a percy-only test surfaces as no-assertion (C2b).
|
|
128
|
+
|
|
129
|
+
### Fixed
|
|
130
|
+
- Test-file discovery now matches more JS/TS naming conventions: Cypress `.cy.*`,
|
|
131
|
+
Deno/Go `_test.*`, Jasmine `*Spec.*`, Angular/Protractor `.e2e-spec.*`, and `.e2e.*`
|
|
132
|
+
(plus the `cypress`/`e2e` directories). Previously only `.test`/`.spec` were
|
|
133
|
+
discovered, so Cypress/Deno/Jasmine/Angular specs were silently skipped.
|
|
134
|
+
|
|
135
|
+
### Added
|
|
136
|
+
- Vue/Svelte test-utils coverage: JS5 now flags non-awaited `flushPromises`/`nextTick`/
|
|
137
|
+
`tick`; JS13 now flags Vue Test Utils `findComponent`/`findAllComponents` and
|
|
138
|
+
`find`/`findAll` with a string selector used as a loose statement.
|
|
139
|
+
|
|
9
140
|
## [0.2.0] - 2026-06-22
|
|
10
141
|
|
|
11
142
|
### Added
|
|
@@ -45,6 +176,8 @@ All notable changes to this project are documented here. The format is based on
|
|
|
45
176
|
- pre-commit hook (`.pre-commit-hooks.yaml`), CI matrix (Node 18/20/22), and an npm
|
|
46
177
|
trusted-publishing release workflow.
|
|
47
178
|
|
|
48
|
-
[Unreleased]: https://github.com/vinicq/falsegreen-js/compare/v0.
|
|
179
|
+
[Unreleased]: https://github.com/vinicq/falsegreen-js/compare/v0.4.0...HEAD
|
|
180
|
+
[0.4.0]: https://github.com/vinicq/falsegreen-js/compare/v0.3.0...v0.4.0
|
|
181
|
+
[0.3.0]: https://github.com/vinicq/falsegreen-js/compare/v0.2.0...v0.3.0
|
|
49
182
|
[0.2.0]: https://github.com/vinicq/falsegreen-js/compare/v0.1.0...v0.2.0
|
|
50
183
|
[0.1.0]: https://github.com/vinicq/falsegreen-js/releases/tag/v0.1.0
|
package/README.md
CHANGED
|
@@ -1,5 +1,13 @@
|
|
|
1
1
|
# falsegreen-js
|
|
2
2
|
|
|
3
|
+
[](https://github.com/vinicq/falsegreen-js/actions/workflows/ci.yml)
|
|
4
|
+
[](https://www.npmjs.com/package/falsegreen-js)
|
|
5
|
+
[](https://www.npmjs.com/package/falsegreen-js)
|
|
6
|
+
[](https://www.npmjs.com/package/falsegreen-js)
|
|
7
|
+
[](LICENSE)
|
|
8
|
+
[](CONTRIBUTING.md)
|
|
9
|
+
[](https://vinicq.github.io/falsegreen-docs/)
|
|
10
|
+
|
|
3
11
|
Find JavaScript/TypeScript unit tests that give false positives: green tests that
|
|
4
12
|
protect nothing, and tests that pass while asserting the wrong thing. Deterministic
|
|
5
13
|
AST scan, no code execution. Sibling of [`falsegreen`](https://github.com/vinicq/falsegreen)
|
|
@@ -7,6 +15,8 @@ AST scan, no code execution. Sibling of [`falsegreen`](https://github.com/vinicq
|
|
|
7
15
|
|
|
8
16
|
Covers `.js`, `.jsx`, `.ts`, `.tsx`, `.mjs`, `.cjs`, `.mts`, `.cts`.
|
|
9
17
|
|
|
18
|
+
**The falsegreen family:** [falsegreen](https://github.com/vinicq/falsegreen) (Python/pytest) · **falsegreen-js** (JS/TS) · [robotframework-falsegreen](https://github.com/vinicq/robotframework-falsegreen) (Robot Framework) · [falsegreen-skill](https://github.com/vinicq/falsegreen-skill) (semantic LLM pass).
|
|
19
|
+
|
|
10
20
|
## Why
|
|
11
21
|
|
|
12
22
|
A test can be green and still protect nothing: an empty body, an assertion that is
|
|
@@ -27,10 +37,39 @@ npm install -D falsegreen-js
|
|
|
27
37
|
npx falsegreen-js # scan cwd
|
|
28
38
|
npx falsegreen-js src test # scan paths
|
|
29
39
|
npx falsegreen-js --staged # only test files staged in git (pre-commit)
|
|
30
|
-
npx falsegreen-js --json # machine-readable output
|
|
40
|
+
npx falsegreen-js --json # machine-readable output (alias for --format json)
|
|
41
|
+
npx falsegreen-js --format sarif # SARIF 2.1.0 for GitHub code scanning
|
|
42
|
+
npx falsegreen-js --format junit # JUnit XML for CI test reporters
|
|
43
|
+
npx falsegreen-js --output report.json # write to a file
|
|
44
|
+
npx falsegreen-js --output .falsegreen/ # write report.<ext> into a directory
|
|
45
|
+
npx falsegreen-js --config-audit # audit Jest/Vitest config (project-layer PL codes)
|
|
31
46
|
npx falsegreen-js --disable C7,JS3
|
|
32
47
|
```
|
|
33
48
|
|
|
49
|
+
Each finding is reported with its pyramid level (unit / integration / e2e, read from the file's imports) and a one-line fix hint, and the summary breaks the findings down by level and lists the most common fixes. `--output` takes a file or a directory: an extension-less or trailing-slash path (e.g. `.falsegreen/`) receives `report.<ext>` for the chosen format. Reports are run artifacts; keep the output directory gitignored.
|
|
50
|
+
|
|
51
|
+
### Output formats
|
|
52
|
+
|
|
53
|
+
`--format text|json|sarif|junit` (default `text`; `--json` stays as an alias for `--format json`). These match the [Python sibling](https://github.com/vinicq/falsegreen) byte-for-concept, so a pipeline can swap one scanner for the other.
|
|
54
|
+
|
|
55
|
+
- **`sarif`**: SARIF 2.1.0. One rule per code present, one result per finding, with `error` for high-severity findings, `warning` for low, and `note` for off. Result tags carry the judgment (J1-J6), the risk group (`risk:effectiveness`...), and the level (`level:high`). Upload it to GitHub code scanning to see findings inline on the PR.
|
|
56
|
+
- **`junit`**: JUnit XML. High-severity findings become `<failure>`, everything else `<skipped>`, so a CI test reporter surfaces them as a failing suite.
|
|
57
|
+
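The severity mapping described for the `sarif` and `junit` formats above can be summarized as a small lookup. This is a sketch under stated assumptions: the type and function names are hypothetical and the package's `report.js` may structure this differently; only the mapping itself and the tag shapes (`risk:<group>`, `level:<conf>`) come from the text.

```typescript
type Severity = "high" | "low" | "off";
type SarifLevel = "error" | "warning" | "note";

// SARIF result level per severity, as described in the README text.
function sarifLevel(severity: Severity): SarifLevel {
  if (severity === "high") return "error";
  if (severity === "low") return "warning";
  return "note";
}

// JUnit element per severity: high findings fail, the rest are skipped.
function junitElement(severity: Severity): "failure" | "skipped" {
  return severity === "high" ? "failure" : "skipped";
}

// Result tags: the judgment, the risk group, and the level.
function resultTags(judgment: string, riskGroup: string, level: string): string[] {
  return [judgment, "risk:" + riskGroup, "level:" + level];
}

const exampleTags = resultTags("J2", "effectiveness", "high");
```

With these definitions `exampleTags` is `["J2", "risk:effectiveness", "level:high"]`; the judgment and risk group in that call are sample values, not a statement about any particular code.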
|
|
58
|
+
### Baseline (ratchet)
|
|
59
|
+
|
|
60
|
+
Adopting the scanner on a large codebase without fixing every legacy finding at once:
|
|
61
|
+
|
|
62
|
+
```bash
|
|
63
|
+
npx falsegreen-js --write-baseline # record current findings to .falsegreen-baseline.json, exit 0
|
|
64
|
+
npx falsegreen-js --baseline # report and fail only on findings not in the baseline
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
`--baseline [PATH]` and `--write-baseline [PATH]` default to `.falsegreen-baseline.json`. A finding's identity is a content fingerprint (`sha1` of relative path + code + detail, no line number), so it survives unrelated line shifts in the file. Commit the baseline, then let CI block only on net-new findings. (The fingerprint omits the source snippet the Python scanner folds in, since the js scanner does not carry one; two findings with the same code and detail in one file share an id.)
|
|
68
|
+
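The baseline paragraph above describes a finding's identity as a `sha1` over relative path, code, and detail, with the line number left out. A minimal sketch of that idea follows. The `FindingLike` shape, the function names, and the NUL separator between fields are assumptions for illustration; the package's actual field layout and encoding are not shown in this diff.

```typescript
import { createHash } from "node:crypto";

// Hypothetical finding shape for this example only.
interface FindingLike {
  file: string; // path relative to the project root
  code: string; // e.g. "C21"
  detail: string;
  line: number; // deliberately NOT part of the fingerprint
}

// Content fingerprint: stable when unrelated edits shift line numbers.
function fingerprint(f: FindingLike): string {
  return createHash("sha1")
    .update([f.file, f.code, f.detail].join("\0")) // separator is an assumption
    .digest("hex");
}

// Ratchet: keep only findings whose fingerprint is not in the baseline.
function netNew(current: FindingLike[], baseline: string[]): FindingLike[] {
  return current.filter((f) => baseline.indexOf(fingerprint(f)) === -1);
}

const before: FindingLike = { file: "test/a.test.ts", code: "C21", detail: "no unconditional assertion", line: 10 };
const shifted: FindingLike = { file: "test/a.test.ts", code: "C21", detail: "no unconditional assertion", line: 42 };
const fresh: FindingLike = { file: "test/a.test.ts", code: "C5", detail: "always-true check", line: 12 };

const baselineIds = [fingerprint(before)];
const remaining = netNew([shifted, fresh], baselineIds);
```

In this sketch `shifted` is suppressed because only its line number changed, and `remaining` holds just the `C5` finding. It also shows the trade-off the paragraph mentions: two findings with the same file, code, and detail collapse to one id.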
|
|
69
|
+
`--config-audit` is a separate mode: instead of scanning test files, it reads the Jest/Vitest config (`package.json` `jest` field, `jest.config.*`, `vitest.config.*`) and reports the project-layer ways a suite stays green by configuration: `PL10` (`passWithNoTests` passes an empty or filtered-to-nothing run), `PL7` (no `coverageThreshold` / `coverage.thresholds`), `PL8` (`bail` stops the run early). The per-file scan cannot see config.
|
|
70
|
+
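The three project-layer checks named in the `--config-audit` paragraph above can be pictured as a pass over an already-loaded options object. The sketch below is an assumption-heavy illustration: `RunnerConfig`, `sketchAudit`, and the exact trigger conditions (for example, treating `bail: true` or a positive number as set) are guesses for the example, not the behavior of the package's `auditConfig`.

```typescript
// Hypothetical subset of Jest/Vitest options relevant to the PL codes.
interface RunnerConfig {
  passWithNoTests?: boolean;
  bail?: boolean | number;
  coverageThreshold?: object; // Jest-style
  coverage?: { thresholds?: object }; // Vitest-style
}

// Returns the PL codes that would apply to this options object.
function sketchAudit(cfg: RunnerConfig): string[] {
  const codes: string[] = [];

  // PL10: an empty or filtered-to-nothing run still passes.
  if (cfg.passWithNoTests === true) codes.push("PL10");

  // PL7: no coverage threshold in either spelling.
  const hasThreshold =
    cfg.coverageThreshold !== undefined ||
    (cfg.coverage !== undefined && cfg.coverage.thresholds !== undefined);
  if (!hasThreshold) codes.push("PL7");

  // PL8: the run stops early after failures.
  const bailSet = cfg.bail === true || (typeof cfg.bail === "number" && cfg.bail > 0);
  if (bailSet) codes.push("PL8");

  return codes;
}

const risky = sketchAudit({ passWithNoTests: true, bail: 1 });
const guarded = sketchAudit({ coverage: { thresholds: { lines: 80 } } });
```

Here `risky` comes out as `["PL10", "PL7", "PL8"]` and `guarded` as an empty list. The real mode also has to locate and parse the config file, which this sketch skips.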
|
|
71
|
+
For the layer no static scan reaches (does a green test fail when the code is wrong?), run a **mutation tester** like [Stryker](https://stryker-mutator.io/). falsegreen-js is the cheap pre-filter on every commit; mutation testing is the deeper audit.
|
|
72
|
+
|
|
34
73
|
Exit code: `0` clean, `10` low-confidence only, `20` high-confidence present. Wire it
|
|
35
74
|
into CI or a pre-commit hook and let exit `20` block the commit.
|
|
36
75
|
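The exit-code rule above (`0` clean, `10` low-confidence only, `20` high-confidence present) reduces to a short function. This is a sketch with hypothetical names, not the package's CLI code.

```typescript
type Confidence = "high" | "low";

// Derive the process exit code from the confidence of each finding.
function exitCodeFor(findings: Confidence[]): 0 | 10 | 20 {
  if (findings.some((c) => c === "high")) return 20; // any high finding blocks
  if (findings.length > 0) return 10; // low-confidence findings only
  return 0; // clean
}

const cleanRun = exitCodeFor([]);
const lowOnly = exitCodeFor(["low", "low"]);
const mixed = exitCodeFor(["low", "high"]);
```

A CI step that should block only on high-confidence findings can then compare the exit status against `20` rather than treating every non-zero status as a failure.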
|
|
@@ -44,8 +83,9 @@ expect(x); // falsegreen: ignore
|
|
|
44
83
|
## Runner coverage
|
|
45
84
|
|
|
46
85
|
Runner-agnostic. The assertion and test vocabulary spans Jest, Vitest, Mocha + Chai,
|
|
47
|
-
Jasmine, AVA, `node:test`, tap, Cypress, Playwright,
|
|
48
|
-
(`@testing-library/*` with `jest-dom` / `jasmine-dom` matchers and `user-event`)
|
|
86
|
+
Jasmine, AVA, `node:test`, tap, Cypress, Playwright, Testing Library
|
|
87
|
+
(`@testing-library/*` with `jest-dom` / `jasmine-dom` matchers and `user-event`),
|
|
88
|
+
and Vue Test Utils (`mount`/`wrapper.find`/`flushPromises`/`nextTick`).
|
|
49
89
|
`expect().matcher()`, chai `expect().to`, `assert`, `x.should`, and AVA `t.is` all
|
|
50
90
|
count as real assertions, so a Mocha or AVA test is not mistaken for one that never
|
|
51
91
|
checks anything.
|
|
@@ -54,6 +94,25 @@ Note: component files (`.vue`, `.svelte`, `.astro`, `.marko`) and templates (`.h
|
|
|
54
94
|
are not test files. Tests for those frameworks are written in `.spec`/`.test` files in
|
|
55
95
|
the eight extensions above, which is what the scanner reads.
|
|
56
96
|
|
|
97
|
+
## Test levels (the pyramid)
|
|
98
|
+
|
|
99
|
+
falsegreen-js scans tests at every level of the pyramid. Discovery is level-agnostic - it
|
|
100
|
+
reads any test file - but a few codes are read in light of the level, so a valid pattern at
|
|
101
|
+
one level is not flagged at another.
|
|
102
|
+
|
|
103
|
+
- **Unit:** a function or component with its boundaries doubled. The oracle is `expect`.
|
|
104
|
+
- **Integration (API and database):** API tests through supertest / chai-http
|
|
105
|
+
(`request(app).get("/").expect(200)`, recognized as an assertion) or `fetch`, and database
|
|
106
|
+
tests through Prisma / TypeORM / Knex against a real datastore. These cross the I/O
|
|
107
|
+
boundary on purpose, so the response or row IS the verification at that level.
|
|
108
|
+
- **E2E:** Cypress (`.cy.*`) and Playwright (`.e2e.*`). `cy.get().should(...)` and
|
|
109
|
+
`expect(page).toHaveURL(...)` are the oracle; a visible element is a real check here, not a
|
|
110
|
+
weak one.
|
|
111
|
+
|
|
112
|
+
A real API or database call inside a test that claims to be a unit test is itself the smell
|
|
113
|
+
(mystery guest, environment coupling), not the level of the test. C23 flags the hard-coded
|
|
114
|
+
file path or URL form.
|
|
115
|
+
|
|
57
116
|
## Case catalog
|
|
58
117
|
|
|
59
118
|
Codes shared with `falsegreen` (Python) keep the same id, so cross-language results
|
|
@@ -64,13 +123,18 @@ line up in the research. `JS*` codes are ecosystem-specific.
|
|
|
64
123
|
| C2 | high | test with no check at all (empty body) |
|
|
65
124
|
| C2b | low | test calls code but asserts nothing |
|
|
66
125
|
| C5 | high | always-true check (`expect(true).toBe(true)`, `assert(1)`) |
|
|
126
|
+
| C6 | low | weak check — only verifies something came back (`toBeTruthy`/`toBeDefined`, `length > 0`) |
|
|
67
127
|
| C7 | high | compares a thing to itself (`expect(x).toBe(x)`) |
|
|
128
|
+
| C44 | high | numeric tautology — a length compared so the result is always true (`x.length >= 0`) |
|
|
129
|
+
| C20 | high | assertion in unreachable code (after a `return`/`throw`/`process.exit`, a `break`, a both-arms-terminating `if`, or an exhaustive `switch`) — it never runs |
|
|
130
|
+
| C23 | low | reads a real file at a literal path, or a hard-coded URL (mystery guest) |
|
|
68
131
|
| C8 | low | exact equality on a float (use `toBeCloseTo`) |
|
|
69
132
|
| C9 | low | `toThrow()` with no error type or message — accepts any error |
|
|
70
133
|
| C16 | low | result depends on `Date.now`, `Math.random`, or a fixed timer |
|
|
71
134
|
| C18 | low | compares `String(x)` / `JSON.stringify(x)` / `` `${x}` `` to a literal (formatting, not value) |
|
|
72
135
|
| C21 | low | every assertion is conditional — none runs unconditionally |
|
|
73
136
|
| C37 | low | duplicate case in `it.each`/`test.each` — the same scenario runs twice |
|
|
137
|
+
| C48 | low | dark patch — the test flips a test-mode flag (`process.env.NODE_ENV = "test"`, `process.env.TESTING`, a `TESTING` flag) then asserts, exercising the product's test-only branch |
|
|
74
138
|
| CC | low | commented-out assertion |
|
|
75
139
|
| JS1 | high | focused test (`it.only` / `fit`) silently skips the rest of the suite |
|
|
76
140
|
| JS2 | high | `expect(x)` with no matcher — the assertion never runs |
|
|
@@ -79,17 +143,22 @@ line up in the research. `JS*` codes are ecosystem-specific.
|
|
|
79
143
|
| JS5 | low | async query/event not awaited (`findBy*` / `waitFor` / `user-event`) |
|
|
80
144
|
| JS6 | high | empty `describe`/`suite` — the suite is green but runs nothing |
|
|
81
145
|
| JS7 | low | assertion inside a non-awaited `setTimeout`/`then` callback — may run after the test ends |
|
|
146
|
+
| JS8 | low | mocks the unit under test (`jest.mock`/`vi.mock` of an imported module asserted directly) |
|
|
82
147
|
| JS9 | high | assertion in a dead branch (`if(false)` / `if(true){}else`) — never runs |
|
|
83
148
|
| JS11 | low | `try/catch` swallows the assertion — a failing `expect` is caught, test stays green |
|
|
84
149
|
| JS13 | low | query (`getBy*`/`queryBy*`) as a loose statement — its result is never asserted |
|
|
85
150
|
| JS15 | low | inappropriate assertion — comparison wrapped in a boolean (`expect(a===b).toBe(true)`), blind failure message |
|
|
151
|
+
| JS17 | low | commented-out test block (`// it(...)` / `// test(...)`) — disabled, no longer runs |
|
|
152
|
+
| JS18 | low | test takes a `done` callback instead of async/await — a mistimed `done` passes early |
|
|
153
|
+
| JS21 | high | matcher referenced but never called (`expect(x).toBe` with no `()`) — the assertion never runs |
|
|
154
|
+
| JS22 | high | empty `it.each`/`test.each` table — generated with zero cases, never runs |
|
|
86
155
|
|
|
87
156
|
Each code carries a judgment tag (J1-J6) shared with the
|
|
88
157
|
[falsegreen-skill](https://github.com/vinicq/falsegreen-skill) semantic framework.
|
|
89
158
|
|
|
90
159
|
### Opt-in: maintainability group (default off)
|
|
91
160
|
|
|
92
|
-
These are **not** false-green
|
|
161
|
+
These are **not** false-green - the test still protects something - so they are off by
|
|
93
162
|
default. Enable them with `--diagnostics`, or per code via config `severity`. They are a
|
|
94
163
|
"plus" for test-code health, mirroring falsegreen's diagnostic/coupling groups.
|
|
95
164
|
|
|
@@ -107,16 +176,34 @@ default. Enable them with `--diagnostics`, or per code via config `severity`. Th
|
|
|
107
176
|
npx falsegreen-js --diagnostics # include D*/M* as warnings
|
|
108
177
|
```
|
|
109
178
|
|
|
110
|
-
###
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
(`
|
|
115
|
-
|
|
179
|
+
### Deliberately not implemented
|
|
180
|
+
|
|
181
|
+
Some catalog codes were reviewed and left out, on purpose:
|
|
182
|
+
|
|
183
|
+
- **JS19** (`toBe` on an object/array literal): `expect(x).toBe({...})` compares by reference,
|
|
184
|
+
so it always fails. That is the false-red axis (a test that always fails), the opposite of
|
|
185
|
+
what this scanner looks for, and out of scope on principle.
|
|
186
|
+
- **JS20** (a Promise compared without `resolves`/`rejects`): telling that a value is a
|
|
187
|
+
Promise needs type information the AST does not carry, so it would be too noisy.
|
|
188
|
+
- **JS12** (a floating promise whose `expect` is never returned): already covered by JS7.
|
|
189
|
+
- **JS16** (`async` test with no `expect.assertions(n)`): the absence of a guard is not a
|
|
190
|
+
smell on its own; flagging it would fire on most async tests.
|
|
191
|
+
- **JS14** (a giant inline snapshot): a readability and review-noise concern, not a
|
|
192
|
+
false-green one. The snapshot still protects, so it belongs to the diagnostic group and is
|
|
193
|
+
better served by `eslint-plugin-jest` (`no-large-snapshots`) as an opt-in lint rule.
|
|
194
|
+
- **JS10** (any conditional in a test body): handled by `eslint-plugin-jest`
|
|
195
|
+
(`no-conditional-in-test`); JS9 and C21 already cover the false-green subset.
|
|
196
|
+
- **C1** (an assertion under an `if`/`for` that may not run): redundant once C21 and JS9
|
|
197
|
+
exist, and high-FP on its own. C21 already fires the actual false-green case, where
|
|
198
|
+
*every* assertion is conditional and the test can pass with nothing checked. A test that
|
|
199
|
+
mixes a conditional assertion with an unconditional one is not false-green: the
|
|
200
|
+
unconditional assertion still protects. JS9 covers the dead-branch form (`if(false)`).
|
|
201
|
+
Flagging every conditional assertion (C1's full scope) is the linter concern JS10 already
|
|
202
|
+
names (`no-conditional-in-test`), so C1 would add noise without a new false-green signal.
|
|
116
203
|
|
|
117
204
|
### What carries over from falsegreen, what does not
|
|
118
205
|
|
|
119
|
-
Ported (same concept): C2, C2b, C5, C7, C8, C16, CC.
|
|
206
|
+
Ported (same concept): C2, C2b, C5, C7, C8, C16, C44, C48, CC.
|
|
120
207
|
|
|
121
208
|
Python-only, not applicable to JS/TS: pytest collection rules (C4 family), `pytest.raises`
|
|
122
209
|
breadth (C9/C19/C27/C28), fixtures and `os.environ`/global-state codes (C23/C24/C29),
|
|
@@ -149,21 +236,51 @@ test re-implements the production logic. Those are semantic and belong to the
|
|
|
149
236
|
`falsegreen-skill` LLM pass. Precision over recall: a softened heuristic that misses a
|
|
150
237
|
case is preferred to one that flags correct code.
|
|
151
238
|
|
|
239
|
+
Measured against the [Open Catalog of Test Smells](https://test-smell-catalog.readthedocs.io/) (517 documented smells), only the false-green slice is in scope. What stays out, on purpose: **brittleness / false-red** (sensitive equality, brittle assertions - the opposite axis), **hygiene / maintainability** (assertion roulette, magic numbers, long tests - linter territory, a few surfaced as opt-in diagnostics), and **slow, design, naming, duplication, runtime/culture** (none about whether the test protects). The boundary is deliberate: where a smell has a statically provable false-green form, that form is a code here - uncontrolled `Date.now`/`Math.random` is `C16`, a hard-coded path or URL is `C23`, an assertion that may never run is `C21`/`C20`, and JS-specific forms (focused tests, missing matchers) are the `JS*` codes. See [CREDITS.md](CREDITS.md) for the full cross-walk.
|
|
240
|
+
|
|
152
241
|
## References
|
|
153
242
|
|
|
154
243
|
The catalog is grounded in the test-smell literature. Direct influences: the
|
|
155
244
|
rotten-green-test work that names this whole family (Delplanque et al., ICSE 2019),
|
|
156
245
|
the founding test-smell refactoring catalog (van Deursen et al., XP 2001), the
|
|
157
|
-
JS/TS empirical studies (Jorge, UFCG 2023; Silva, PUC Minas 2022
|
|
246
|
+
JS/TS empirical studies (Jorge, UFCG 2023; Silva, PUC Minas 2022 - the academic
|
|
158
247
|
precedent for the focused-test and snapshot codes; Oliveira et al., SBES 2024/2025),
|
|
159
248
|
and the detection-tool baselines (tsDetect, Peruma et al., 2020). Full list and the
|
|
160
249
|
code-to-source mapping in [CREDITS.md](CREDITS.md).
|
|
161
250
|
|
|
162
251
|
## Status
|
|
163
252
|
|
|
164
|
-
|
|
165
|
-
|
|
253
|
+
The rule set is a deterministic core; the full JS/TS smell catalog is tracked as
|
|
254
|
+
research in the private audit hub. See [STATUS.md](STATUS.md) for the current version
|
|
255
|
+
and rule coverage. Issues and PRs welcome.
|
|
166
256
|
|
|
167
257
|
## License
|
|
168
258
|
|
|
169
259
|
MIT, Vinicius Queiroz.
|
|
260
|
+
|
|
261
|
+
## Contributors ✨
|
|
262
|
+
|
|
263
|
+
Thanks to the people who keep false-green tests out of real suites ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
|
|
264
|
+
|
|
265
|
+
<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
|
|
266
|
+
[](#contributors-)
|
|
267
|
+
<!-- ALL-CONTRIBUTORS-BADGE:END -->
|
|
268
|
+
|
|
269
|
+
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
|
|
270
|
+
<!-- prettier-ignore-start -->
|
|
271
|
+
<!-- markdownlint-disable -->
|
|
272
|
+
<table>
|
|
273
|
+
<tbody>
|
|
274
|
+
<tr>
|
|
275
|
+
<td align="center" valign="top" width="14.28%"><a href="https://vinicq.github.io/md-bridge/"><img src="https://avatars.githubusercontent.com/u/78210890?v=4?s=100" width="100px;" alt="Vinicius Queiroz"/><br /><sub><b>Vinicius Queiroz</b></sub></a><br /><a href="https://github.com/vinicq/falsegreen-js/commits?author=vinicq" title="Code">💻</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=vinicq" title="Documentation">📖</a> <a href="#ideas-vinicq" title="Ideas, Planning, & Feedback">🤔</a> <a href="#maintenance-vinicq" title="Maintenance">🚧</a> <a href="#infra-vinicq" title="Infrastructure (Hosting, Build-Tools, etc)">🚇</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=vinicq" title="Tests">⚠️</a> <a href="#research-vinicq" title="Research">🔬</a></td>
|
|
276
|
+
<td align="center" valign="top" width="14.28%"><a href="https://github.com/homesellerq-coder"><img src="https://avatars.githubusercontent.com/u/294912019?v=4?s=100" width="100px;" alt="Home Seller"/><br /><sub><b>Home Seller</b></sub></a><br /><a href="https://github.com/vinicq/falsegreen-js/commits?author=homesellerq-coder" title="Code">💻</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=homesellerq-coder" title="Documentation">📖</a> <a href="https://github.com/vinicq/falsegreen-js/commits?author=homesellerq-coder" title="Tests">⚠️</a> <a href="#infra-homesellerq-coder" title="Infrastructure (Hosting, Build-Tools, etc)">🚇</a></td>
|
|
277
|
+
</tr>
|
|
278
|
+
</tbody>
|
|
279
|
+
</table>
|
|
280
|
+
|
|
281
|
+
<!-- markdownlint-restore -->
|
|
282
|
+
<!-- prettier-ignore-end -->
|
|
283
|
+
|
|
284
|
+
<!-- ALL-CONTRIBUTORS-LIST:END -->
|
|
285
|
+
|
|
286
|
+
New contributors are added automatically; the table also recognizes non-code work (docs, ideas, infrastructure, tests, research) via the [all-contributors](https://allcontributors.org) spec.
|
package/dist/audit.d.ts
ADDED
|
@@ -0,0 +1,5 @@
|
|
|
1
|
+
import { Finding } from "./types.js";
|
|
2
|
+
/** Read the Jest/Vitest config and report the project-layer PL codes: ways the
|
|
3
|
+
* suite can report green by configuration. Findings carry level `project`.
|
|
4
|
+
* Returns [] when no Jest/Vitest config is found. */
|
|
5
|
+
export declare function auditConfig(start?: string): Finding[];
|
package/dist/audit.js
ADDED
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
import * as fs from "node:fs";
|
|
2
|
+
import * as path from "node:path";
|
|
3
|
+
import ts from "typescript";
|
|
4
|
+
import { makeFinding } from "./types.js";
|
|
5
|
+
import { parse } from "./parse.js";
|
|
6
|
+
/** JSON config: package.json `jest` field, or jest.config.json. */
|
|
7
|
+
function readJsonConfig(start) {
|
|
8
|
+
const pkg = path.join(start, "package.json");
|
|
9
|
+
if (fs.existsSync(pkg)) {
|
|
10
|
+
try {
|
|
11
|
+
const j = JSON.parse(fs.readFileSync(pkg, "utf-8"));
|
|
12
|
+
if (j && typeof j.jest === "object")
|
|
13
|
+
return { path: pkg, cfg: j.jest };
|
|
14
|
+
}
|
|
15
|
+
catch { /* unreadable */ }
|
|
16
|
+
}
|
|
17
|
+
const jcj = path.join(start, "jest.config.json");
|
|
18
|
+
if (fs.existsSync(jcj)) {
|
|
19
|
+
try {
|
|
20
|
+
return { path: jcj, cfg: JSON.parse(fs.readFileSync(jcj, "utf-8")) };
|
|
21
|
+
}
|
|
22
|
+
catch { /* unreadable */ }
|
|
23
|
+
}
|
|
24
|
+
return null;
|
|
25
|
+
}
|
|
26
|
+
/** JS/TS config (jest.config.*, vitest.config.*, vite.config.*): AST-walk for
|
|
27
|
+
* the property assignments of interest. A heuristic, not full evaluation. */
|
|
28
|
+
function readAstConfig(start) {
|
|
29
|
+
const candidates = [
|
|
30
|
+
"jest.config.ts", "jest.config.js", "jest.config.mjs", "jest.config.cjs",
|
|
31
|
+
"vitest.config.ts", "vitest.config.js", "vitest.config.mts",
|
|
32
|
+
"vite.config.ts", "vite.config.js",
|
|
33
|
+
];
|
|
34
|
+
for (const name of candidates) {
|
|
35
|
+
const p = path.join(start, name);
|
|
36
|
+
if (!fs.existsSync(p))
|
|
37
|
+
continue;
|
|
38
|
+
let text;
|
|
39
|
+
try {
|
|
40
|
+
text = fs.readFileSync(p, "utf-8");
|
|
41
|
+
}
|
|
42
|
+
catch {
|
|
43
|
+
continue;
|
|
44
|
+
}
|
|
45
|
+
const sf = parse(p, text);
|
|
46
|
+
const props = new Set();
|
|
47
|
+
let passWithNoTests = false;
|
|
48
|
+
let bail = false;
|
|
49
|
+
const visit = (node) => {
|
|
50
|
+
if (ts.isPropertyAssignment(node) &&
|
|
51
|
+
(ts.isIdentifier(node.name) || ts.isStringLiteral(node.name))) {
|
|
52
|
+
const key = node.name.text;
|
|
53
|
+
props.add(key);
|
|
54
|
+
const init = node.initializer;
|
|
55
|
+
if (key === "passWithNoTests" && init.kind === ts.SyntaxKind.TrueKeyword) {
|
|
56
|
+
passWithNoTests = true;
|
|
57
|
+
}
|
|
58
|
+
if (key === "bail") {
|
|
59
|
+
if (init.kind === ts.SyntaxKind.TrueKeyword)
|
|
60
|
+
bail = true;
|
|
61
|
+
else if (ts.isNumericLiteral(init) && Number(init.text) > 0)
|
|
62
|
+
bail = true;
|
|
63
|
+
}
|
|
64
|
+
}
|
|
65
|
+
ts.forEachChild(node, visit);
|
|
66
|
+
};
|
|
67
|
+
visit(sf);
|
|
68
|
+
return { path: p, props, passWithNoTests, bail };
|
|
69
|
+
}
|
|
70
|
+
return null;
|
|
71
|
+
}
|
|
72
|
+
function collect(start) {
|
|
73
|
+
const json = readJsonConfig(start);
|
|
74
|
+
const ast = readAstConfig(start);
|
|
75
|
+
if (!json && !ast)
|
|
76
|
+
return null;
|
|
77
|
+
let passWithNoTests = false;
|
|
78
|
+
let bail = false;
|
|
79
|
+
let hasCovGate = false;
|
|
80
|
+
if (json) {
|
|
81
|
+
const c = json.cfg;
|
|
82
|
+
if (c.passWithNoTests === true)
|
|
83
|
+
passWithNoTests = true;
|
|
84
|
+
if (c.bail === true || (typeof c.bail === "number" && c.bail > 0))
|
|
85
|
+
bail = true;
|
|
86
|
+
const cov = c.coverage;
|
|
87
|
+
if (c.coverageThreshold || c.thresholds || (cov && cov.thresholds))
|
|
88
|
+
hasCovGate = true;
|
|
89
|
+
}
|
|
90
|
+
if (ast) {
|
|
91
|
+
if (ast.passWithNoTests)
|
|
92
|
+
passWithNoTests = true;
|
|
93
|
+
if (ast.bail)
|
|
94
|
+
bail = true;
|
|
95
|
+
if (ast.props.has("coverageThreshold") || ast.props.has("thresholds"))
|
|
96
|
+
hasCovGate = true;
|
|
97
|
+
}
|
|
98
|
+
return { where: json?.path ?? ast.path, passWithNoTests, bail, hasCovGate };
|
|
99
|
+
}
|
|
100
|
+
/** Read the Jest/Vitest config and report the project-layer PL codes: ways the
|
|
101
|
+
* suite can report green by configuration. Findings carry level `project`.
|
|
102
|
+
* Returns [] when no Jest/Vitest config is found. */
|
|
103
|
+
export function auditConfig(start = process.cwd()) {
|
|
104
|
+
const sig = collect(start);
|
|
105
|
+
if (!sig)
|
|
106
|
+
return [];
|
|
107
|
+
const findings = [];
|
|
108
|
+
const mk = (code) => {
|
|
109
|
+
const f = makeFinding(sig.where, 1, code);
|
|
110
|
+
f.level = "project";
|
|
111
|
+
return f;
|
|
112
|
+
};
|
|
113
|
+
if (!sig.hasCovGate)
|
|
114
|
+
findings.push(mk("PL7"));
|
|
115
|
+
if (sig.bail)
|
|
116
|
+
findings.push(mk("PL8"));
|
|
117
|
+
if (sig.passWithNoTests)
|
|
118
|
+
findings.push(mk("PL10"));
|
|
119
|
+
return findings;
|
|
120
|
+
}
|
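The tail of `auditConfig` in the file above reduces to a small mapping from three config signals to project-layer codes. The following is a minimal, self-contained sketch of that mapping, not the package's code: the `ConfigSignals` interface and the helpers `normalizeBail` and `codesFor` are hypothetical names introduced here for illustration, and the real function builds `Finding` objects through `makeFinding` rather than returning bare strings.

```typescript
// Hypothetical shape mirroring the object returned by collect() in audit.js.
interface ConfigSignals {
  passWithNoTests: boolean;
  bail: boolean;
  hasCovGate: boolean;
}

// Mirrors the JSON-branch check in collect(): `bail: true` or any positive
// number counts as bailing; 0, false, and other values do not.
function normalizeBail(bail: unknown): boolean {
  return bail === true || (typeof bail === "number" && bail > 0);
}

// Same order as auditConfig: PL7 (no coverage gate), PL8 (bail enabled),
// PL10 (passWithNoTests enabled).
function codesFor(sig: ConfigSignals): string[] {
  const codes: string[] = [];
  if (!sig.hasCovGate) codes.push("PL7");
  if (sig.bail) codes.push("PL8");
  if (sig.passWithNoTests) codes.push("PL10");
  return codes;
}

const example = codesFor({
  passWithNoTests: true,
  bail: normalizeBail(1),
  hasCovGate: false,
});
console.log(example.join(",")); // prints "PL7,PL8,PL10"
```

Note that PL7 is the only code triggered by an absence: under this logic a config that sets none of the three options still yields one finding, because no coverage threshold was found.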
package/dist/cases.d.ts
CHANGED
|
@@ -3,14 +3,42 @@
|
|
|
3
3
|
* the same concept (shared C-codes, so cross-language paper comparison lines up),
|
|
4
4
|
* plus JS/TS-specific codes (JS-prefix) for ecosystem-only patterns.
|
|
5
5
|
*
|
|
6
|
-
*
|
|
7
|
-
*
|
|
6
|
+
* Each code carries four independent axes (none derived from another or from the
|
|
7
|
+
* code prefix):
|
|
8
|
+
* group conceptual failure mode (RiskGroup, closed taxonomy).
|
|
9
|
+
* severity how serious the finding is when it fires ("high" | "low").
|
|
10
|
+
* defaultOn whether the default scan emits it (false for the opt-in
|
|
11
|
+
* diagnostic group, surfaced only via --diagnostics).
|
|
12
|
+
* judgment which semantic question (J1-J6, see falsegreen-skill) it answers.
|
|
13
|
+
*
|
|
14
|
+
* The effective "confidence" used downstream (high/low/off) is derived from
|
|
15
|
+
* severity + defaultOn by baseConfidence(); the exit code is derived from the
|
|
16
|
+
* severity of the findings that are actually emitted. Keeping the axes apart is
|
|
17
|
+
* the point: a finding's taxonomy must not depend on whether it blocks.
|
|
8
18
|
*/
|
|
9
19
|
export type Confidence = "high" | "low" | "off";
|
|
20
|
+
export type Severity = "high" | "low";
|
|
21
|
+
/**
|
|
22
|
+
* Conceptual failure mode — a closed taxonomy condensing the F1-F8 families to
|
|
23
|
+
* six axes. Driven by the per-code table below (riskGroupOf), never by the code
|
|
24
|
+
* prefix, and never defaulted: an unknown code is an error, not a guess.
|
|
25
|
+
*
|
|
26
|
+
* effectiveness no oracle, a trivial oracle, or the wrong oracle (F1/F3/F4).
|
|
27
|
+
* execution the check exists but does not run, or the test vanishes from
|
|
28
|
+
* the count (F2/F5).
|
|
29
|
+
* nondeterminism passes or fails by luck — time, randomness, timers (F6).
|
|
30
|
+
* dependency real I/O or a stand-in for the unit under test: mystery
|
|
31
|
+
* guest, self-mock (isolation, J3/J6).
|
|
32
|
+
* structure size/readability; the test still protects (F8 maintainability).
|
|
33
|
+
* diagnostic opt-in health signal, off by default.
|
|
34
|
+
*/
|
|
35
|
+
export type RiskGroup = "effectiveness" | "execution" | "nondeterminism" | "dependency" | "structure" | "diagnostic";
|
|
10
36
|
export declare const JUDGMENTS: Record<string, string>;
|
|
11
37
|
export interface CaseDef {
|
|
12
38
|
title: string;
|
|
13
|
-
|
|
39
|
+
group: RiskGroup;
|
|
40
|
+
severity: Severity;
|
|
41
|
+
defaultOn: boolean;
|
|
14
42
|
judgment: keyof typeof JUDGMENTS;
|
|
15
43
|
}
|
|
16
44
|
export declare const CASES: Record<string, CaseDef>;
|
|
@@ -19,4 +47,33 @@ export declare const DIAGNOSTIC_THRESHOLDS: {
|
|
|
19
47
|
assertionRoulette: number;
|
|
20
48
|
longTest: number;
|
|
21
49
|
};
|
|
22
|
-
|
|
50
|
+
/**
|
|
51
|
+
* Effective default state of a code as a single value: its severity when the
|
|
52
|
+
* default scan emits it, or "off" when it is opt-in. Derives the legacy
|
|
53
|
+
* three-valued "confidence" from the independent severity + defaultOn axes, so
|
|
54
|
+
* the rest of the pipeline (makeFinding, effectiveConf, exit code) keeps working
|
|
55
|
+
* unchanged while the taxonomy stays separate from the blocking decision.
|
|
56
|
+
*/
|
|
57
|
+
export declare function baseConfidence(code: string): Confidence;
|
|
58
|
+
/**
|
|
59
|
+
* Primary taxonomy: the conceptual failure mode, read from the closed per-code
|
|
60
|
+
* table. Rejects an unknown code instead of defaulting, so a typo or a code that
|
|
61
|
+
* was added to the rules but never classified fails loudly.
|
|
62
|
+
*/
|
|
63
|
+
export declare function riskGroupOf(code: string): RiskGroup;
|
|
64
|
+
/**
|
|
65
|
+
* Legacy product grouping (false-positive / diagnostic / coupling / project),
|
|
66
|
+
* kept only as a transition-compat field in the JSON report. New consumers
|
|
67
|
+
* should read `riskGroup` (riskGroupOf). Prefix-based by design: it mirrors the
|
|
68
|
+
* pre-0.3 output exactly so downstream filters do not break across the upgrade.
|
|
69
|
+
*/
|
|
70
|
+
export declare function groupOf(code: string): "false-positive" | "diagnostic" | "coupling" | "project";
|
|
71
|
+
/** Test-pyramid level, detected from a file's import roots (see level.ts).
|
|
72
|
+
* `project` is the config-audit layer (--config-audit), not a file level. */
|
|
73
|
+
export type PyramidLevel = "unit" | "integration" | "e2e" | "project";
|
|
74
|
+
/**
|
|
75
|
+
* One-line remediation per case: what to change so the test protects something.
|
|
76
|
+
* Short, imperative, no trailing period. Surfaced in the status report (text +
|
|
77
|
+
* JSON `fix` field). A code missing here renders no fix line, never throws.
|
|
78
|
+
*/
|
|
79
|
+
export declare const FIX_HINTS: Record<string, string>;
|
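The doc comments in `cases.d.ts` describe two lookups over the per-code table: `baseConfidence` collapses `severity` and `defaultOn` into the legacy three-valued confidence, and `riskGroupOf` reads the group while rejecting unknown codes. The following sketch illustrates that derivation under stated assumptions: the table entries `X1` and `X2` are invented, the helper names `lookup`, `confidenceOf`, and `riskGroup` are hypothetical, and throwing from the confidence lookup on an unknown code is an assumption (the source only states this for `riskGroupOf`).

```typescript
type Severity = "high" | "low";
type Confidence = "high" | "low" | "off";
type RiskGroup =
  | "effectiveness" | "execution" | "nondeterminism"
  | "dependency" | "structure" | "diagnostic";

interface Axes {
  group: RiskGroup;
  severity: Severity;
  defaultOn: boolean;
}

// Hypothetical two-entry table; the codes and values are invented for illustration.
const TABLE: Record<string, Axes> = {
  X1: { group: "nondeterminism", severity: "high", defaultOn: true },
  X2: { group: "diagnostic", severity: "low", defaultOn: false },
};

// Closed-table lookup: an unknown code is an error, never a default.
// The own-property check keeps inherited keys such as "constructor" out.
function lookup(code: string): Axes {
  if (!Object.prototype.hasOwnProperty.call(TABLE, code)) {
    throw new Error(`unclassified code: ${code}`);
  }
  return TABLE[code];
}

// Severity when the default scan emits the code, otherwise "off".
function confidenceOf(code: string): Confidence {
  const def = lookup(code);
  return def.defaultOn ? def.severity : "off";
}

// Taxonomy is read from the table, independent of severity and defaultOn.
function riskGroup(code: string): RiskGroup {
  return lookup(code).group;
}

console.log(confidenceOf("X1"), confidenceOf("X2")); // prints "high off"
```

Because the group is stored separately from `severity` and `defaultOn`, flipping a code to opt-in changes its confidence to `"off"` without touching its taxonomy, which is the separation the header comment argues for.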