@wbern/obscene 0.3.0 → 0.3.1
- package/README.md +59 -16
- package/dist/cli.js +44 -1
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -77,9 +77,9 @@ Scores each file by `complexity × commits` over a time window, then assigns tie
|
|
|
77
77
|
|
|
78
78
|
### `obscene coupling`
|
|
79
79
|
|
|
80
|
-
Detects files that frequently change together in the same commit but live in different directories — Tornhill's "temporal coupling" analysis. Surfaces hidden structural dependencies that aren't visible in the
|
|
80
|
+
Detects files that frequently change together in the same commit but live in different directories — Tornhill's "temporal coupling" analysis from *Your Code as a Crime Scene* (2015). Surfaces hidden structural dependencies that aren't visible in imports or the module graph.
|
|
81
81
|
|
|
82
|
-
Same-directory pairs are excluded (co-location is expected coupling). Mass commits touching >20 files are skipped (formatting changes, large refactors).
|
|
82
|
+
Same-directory pairs are excluded (co-location is expected coupling). Mass commits touching >20 files are skipped (formatting changes, large refactors). See [Why temporal coupling?](#why-temporal-coupling) for the research backing this approach.
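The filtering rules described above can be sketched roughly as follows. This is an illustrative reconstruction, not the package's actual implementation: the function names `dirOf` and `countCochanges`, the input shape (one array of file paths per commit), and the example file names are all assumptions. Only the >20 file cutoff, the same-directory exclusion, and the default minimum of 2 shared commits come from the README.

```javascript
// Sketch only: in practice the per-commit file lists would come from git history.
const MAX_FILES_PER_COMMIT = 20; // README: commits touching >20 files are skipped

// Directory part of a forward-slash path ("." for top-level files).
function dirOf(file) {
  const i = file.lastIndexOf("/");
  return i === -1 ? "." : file.slice(0, i);
}

// commits: array of arrays of file paths, one inner array per commit (assumed shape).
function countCochanges(commits, minShared = 2) {
  const counts = new Map();
  for (const commit of commits) {
    const files = [...new Set(commit)].sort();
    if (files.length > MAX_FILES_PER_COMMIT) continue; // mass commit, skipped
    for (let i = 0; i < files.length; i++) {
      for (let j = i + 1; j < files.length; j++) {
        if (dirOf(files[i]) === dirOf(files[j])) continue; // same directory, excluded
        const key = JSON.stringify([files[i], files[j]]);
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return [...counts]
    .map(([key, shared]) => {
      const [file1, file2] = JSON.parse(key);
      return { file1, file2, shared };
    })
    .filter((pair) => pair.shared >= minShared)
    .sort((a, b) => b.shared - a.shared);
}

// Hypothetical history: three commits across two directories.
const exampleCommits = [
  ["src/ui/Board.ts", "src/utils/pgn.ts"],
  ["src/ui/Board.ts", "src/utils/pgn.ts", "src/ui/Piece.ts"],
  ["src/ui/Board.ts", "src/ui/Piece.ts"],
];
console.log(countCochanges(exampleCommits));
// One pair survives: src/ui/Board.ts + src/utils/pgn.ts with shared = 2
```

The `Board.ts`/`Piece.ts` pair never appears because both files live in `src/ui`, and the `Piece.ts`/`pgn.ts` pair is dropped because it has only one shared commit.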
|
|
83
83
|
|
|
84
84
|
```bash
|
|
85
85
|
obscene coupling # default: min 2 shared commits
|
|
@@ -103,49 +103,55 @@ Per-file complexity without churn. Useful for raw complexity distribution.
|
|
|
103
103
|
|
|
104
104
|
## Metrics
|
|
105
105
|
|
|
106
|
-
|
|
106
|
+
### Hotspot metrics
|
|
107
107
|
|
|
108
|
-
|
|
108
|
+
#### Hotspot score (`Score`)
|
|
109
109
|
|
|
110
110
|
`complexity × churn`. The core ranking metric — files that are both complex and frequently modified bubble to the top. See [Why churn × complexity?](#why-churn-x-complexity) for the research backing this approach.
|
|
111
111
|
|
|
112
|
-
|
|
112
|
+
#### Churn (`Churn`)
|
|
113
113
|
|
|
114
114
|
Number of commits touching the file within the configured time window (default: 3 months). Measures how actively the file is being modified.
|
|
115
115
|
|
|
116
|
-
|
|
116
|
+
#### Cyclomatic complexity (`Cmplx`)
|
|
117
117
|
|
|
118
118
|
Total cyclomatic complexity as reported by [scc](https://github.com/boyter/scc). Counts independent execution paths (branches, loops, conditions). Higher values mean more paths to test and more places for bugs to hide.
|
|
119
119
|
|
|
120
|
-
|
|
120
|
+
#### Complexity density (`Dens`)
|
|
121
121
|
|
|
122
122
|
`complexity / lines of code`. Normalizes complexity by file size so a 50-line file with complexity 25 (density 0.50) stands out against a 500-line file with complexity 25 (density 0.05). Based on Harrison & Magel (1981), who found that complexity relative to code size is a stronger fault predictor than raw complexity alone.
|
|
123
123
|
|
|
124
|
-
|
|
124
|
+
#### Defects (`Dfcts`)
|
|
125
125
|
|
|
126
126
|
Count of `fix:` conventional commits touching the file within the churn window. A proxy for historical defect rate — files that attract repeated fixes are more likely to contain latent bugs. Inspired by Moser, Pedrycz & Succi (2008), who showed that change-history metrics outperform static code metrics for defect prediction.
|
|
127
127
|
|
|
128
|
-
|
|
128
|
+
#### Defect density (`defectDensity`, JSON only)
|
|
129
129
|
|
|
130
130
|
`defects / lines of code`. Not shown in table output due to column width, but available in JSON. Normalizes defect count by file size.
|
|
131
131
|
|
|
132
|
-
|
|
132
|
+
#### Nesting depth (`Nest`)
|
|
133
133
|
|
|
134
134
|
Maximum indentation level (tab stops) in the file. Deep nesting correlates with high cognitive load and defect likelihood. Harrison & Magel (1981) identified nesting depth as a significant complexity contributor.
|
|
135
135
|
|
|
136
|
-
|
|
136
|
+
#### Unique authors (`Auth`)
|
|
137
137
|
|
|
138
138
|
Number of distinct git authors who committed to the file within the churn window. Files touched by many authors may lack clear ownership and accumulate inconsistent patterns. Kamei et al. (2013) found developer count to be a significant predictor of defect-introducing changes.
|
|
139
139
|
|
|
140
|
-
###
|
|
140
|
+
### Coupling metrics
|
|
141
141
|
|
|
142
|
-
|
|
142
|
+
#### Shared commits (`Shared`)
|
|
143
143
|
|
|
144
|
-
|
|
144
|
+
Number of commits where both files in a pair were modified together. The core ranking metric for temporal coupling — higher values indicate stronger hidden dependencies between files in different directories. Ball, Kim, Porter & Siy (1997) demonstrated that co-change relationships reveal design dependencies that static analysis misses.
|
|
145
145
|
|
|
146
|
-
|
|
146
|
+
#### Coupling degree (`Degree`)
|
|
147
147
|
|
|
148
|
-
|
|
148
|
+
`shared commits / min(churn of file1, churn of file2) × 100`. What percentage of the less-active file's changes also involved the other file. A degree of 100% means every change to the less-active file also touched the other file. This normalization follows D'Ambros, Lanza & Lungu (2009), who showed that relative coupling measures provide more stable results than raw co-change counts across projects of different sizes.
|
|
149
|
+
|
|
150
|
+
#### Combined complexity (`Cmplx`)
|
|
151
|
+
|
|
152
|
+
Sum of cyclomatic complexity of both files in the pair. Highlights coupled pairs where the involved code is also complex — the combination of hidden dependency and high complexity compounds maintenance risk.
|
|
153
|
+
|
|
154
|
+
#### Tier
|
|
149
155
|
|
|
150
156
|
Cumulative score distribution bucket:
|
|
151
157
|
|
|
@@ -174,6 +180,25 @@ Score=complexity×churn | Dens=complexity/code | Dfcts=fix commits | Nest=max in
|
|
|
174
180
|
Docs: https://github.com/wbern/obscene#metrics
|
|
175
181
|
```
|
|
176
182
|
|
|
183
|
+
### Coupling example
|
|
184
|
+
|
|
185
|
+
```
|
|
186
|
+
Coupling — 6 months churn window | Min shared: 3 | Total score: 91
|
|
187
|
+
Tiers: 10 danger, 7 watch, 7 stable
|
|
188
|
+
Showing: 5 of 24
|
|
189
|
+
|
|
190
|
+
File 1 File 2 Shared Degree Cmplx Tier
|
|
191
|
+
────────────────────────────────────────────────────────────────────────────────────────────────────
|
|
192
|
+
…ePlayer/hooks/useChessEffects.ts src/utils/effect-generator.ts 6 46.2% 261 DANGER
|
|
193
|
+
…ePlayer/hooks/useChessEffects.ts src/utils/pgn-types.ts 6 50.0% 121 DANGER
|
|
194
|
+
src/test/pgn-fixtures.ts src/utils/pgn-parser.server.ts 5 71.4% 3 DANGER
|
|
195
|
+
src/test/pgn-fixtures.ts src/utils/effect-generator.ts 4 57.1% 145 DANGER
|
|
196
|
+
src/test/pgn-fixtures.ts src/utils/pgn-types.ts 4 57.1% 5 DANGER
|
|
197
|
+
|
|
198
|
+
Shared=co-changed commits | Degree=shared/min(churn)×100 | Cmplx=sum of both files
|
|
199
|
+
Docs: https://github.com/wbern/obscene#metrics
|
|
200
|
+
```
|
|
201
|
+
|
|
177
202
|
## Supported languages
|
|
178
203
|
|
|
179
204
|
Any language [scc supports](https://github.com/boyter/scc#features) — 200+ languages including C, C++, Go, Java, JavaScript, TypeScript, Python, Rust, Ruby, PHP, Swift, Kotlin, and many more. No configuration needed; scc auto-detects languages from file extensions.
|
|
@@ -193,14 +218,32 @@ Files that are both complex and frequently modified are disproportionately likel
|
|
|
193
218
|
|
|
194
219
|
The general approach was popularized by Adam Tornhill's *Your Code as a Crime Scene* (2015), which applies forensic analysis techniques to version control history.
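As a rough illustration of the churn × complexity ranking discussed in this section, the sketch below scores two hypothetical files. The field names `complexity`, `code`, `hotspotScore`, and `complexityDensity` appear in the `dist/cli.js` diff; the `scoreFile` helper, the `churn` input field, the file names, and the churn values are assumptions made for the example. The complexity and line counts reuse the 50-line versus 500-line comparison from the Metrics section.

```javascript
// Sketch only: derive the two headline metrics from raw per-file numbers.
function scoreFile({ file, complexity, code, churn }) {
  return {
    file,
    hotspotScore: complexity * churn, // complexity × churn
    complexityDensity: code > 0 ? complexity / code : 0, // complexity / lines of code
  };
}

// Hypothetical inputs: same complexity, different size and churn.
const ranked = [
  { file: "small.ts", complexity: 25, code: 50, churn: 2 },
  { file: "large.ts", complexity: 25, code: 500, churn: 10 },
]
  .map(scoreFile)
  .sort((a, b) => b.hotspotScore - a.hotspotScore);

for (const f of ranked) {
  console.log(f.file, f.hotspotScore, f.complexityDensity.toFixed(2));
}
// large.ts 250 0.05
// small.ts 50 0.50
```

The larger file ranks first on score because it changes more often, while density points at the smaller file as the more tightly packed one. The two metrics answer different questions, so it helps to read them side by side.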
|
|
195
220
|
|
|
221
|
+
## Why temporal coupling?
|
|
222
|
+
|
|
223
|
+
Files that change together but live in different directories reveal implicit dependencies that the module graph doesn't capture. These hidden couplings are a maintenance hazard: a developer modifying one file doesn't know they also need to update the other, leading to bugs that only surface later.
|
|
224
|
+
|
|
225
|
+
- **Ball, Kim, Porter & Siy (1997)** pioneered co-change analysis and showed that version control history surfaces design relationships invisible to static analysis. — [ICSE 1997 Workshop](https://www.researchgate.net/publication/2791666_If_Your_Version_Control_System_Could_Talk)
|
|
226
|
+
- **D'Ambros, Lanza & Lungu (2009)** developed the Evolution Radar for visualizing logical coupling at both file and module level, showing how evolutionary coupling reveals architectural decay. The normalized approach (coupling relative to total changes) provides more stable measures across projects of different sizes. — [IEEE TSE](https://doi.org/10.1109/TSE.2009.17)
|
|
227
|
+
- **Tornhill (2015)** popularized temporal coupling analysis in *Your Code as a Crime Scene*, demonstrating how co-change patterns reveal "surprise dependencies" — files that should logically be independent but can't be changed separately in practice. His tooling (Code Maat) uses the same commit co-occurrence approach.
|
|
228
|
+
- **Cataldo, Mockus, Roberts & Herbsleb (2009)** analyzed both syntactic and logical dependencies across two large systems and found that logical (co-change) dependencies have a significant independent effect on failure proneness. When developers are unaware of these hidden couplings, defects increase. — [IEEE TSE](https://doi.org/10.1109/TSE.2009.42)
|
|
229
|
+
|
|
196
230
|
## Limitations
|
|
197
231
|
|
|
232
|
+
### General
|
|
233
|
+
|
|
198
234
|
- **Churn = commit count**, not lines changed. A one-line typo fix counts the same as a 500-line rewrite.
|
|
199
235
|
- **Per-file granularity only.** A 1000-line file with many small functions scores higher than it probably should. No function-level breakdown.
|
|
200
236
|
- **Must be run inside a git repo.** Churn data comes from `git log`.
|
|
201
237
|
- **Only analyzes files that currently exist.** Deleted files don't appear, even if they churned heavily before removal.
|
|
202
238
|
- **Tier thresholds are fixed** (50/80 cumulative %). Not configurable yet.
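One way to read the fixed 50/80 cumulative thresholds is sketched below. The README does not say how an item sitting exactly on a boundary is classified, so the rule used here (classify by the cumulative share accumulated before the item is added) is an assumption, as are the `assignTiers` name and the `score` field.

```javascript
// Sketch only: bucket items by cumulative share of the total score.
function assignTiers(items) {
  const total = items.reduce((sum, item) => sum + item.score, 0);
  const sorted = [...items].sort((a, b) => b.score - a.score);
  let running = 0;
  return sorted.map((item) => {
    // Assumed boundary rule: look at the share reached before this item.
    const shareBefore = total > 0 ? running / total : 0;
    running += item.score;
    const tier =
      shareBefore < 0.5 ? "danger" : shareBefore < 0.8 ? "watch" : "stable";
    return { ...item, tier };
  });
}

// Hypothetical scores summing to 100 so the shares are easy to follow.
const tiers = assignTiers([
  { file: "a.ts", score: 50 },
  { file: "b.ts", score: 30 },
  { file: "c.ts", score: 15 },
  { file: "d.ts", score: 5 },
]);
console.log(tiers.map((t) => `${t.file}=${t.tier}`).join(" "));
// a.ts=danger b.ts=watch c.ts=stable d.ts=stable
```

Because the buckets depend on each item's share of the total, the same file can land in a different tier when the rest of the codebase changes, which is why tiers are comparable across runs of one repository but not across repositories.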
|
|
203
239
|
|
|
240
|
+
### Coupling-specific
|
|
241
|
+
|
|
242
|
+
- **Same-directory exclusion is a heuristic.** Files in the same directory that are unexpectedly coupled won't be surfaced. The assumption is that co-located files are *expected* to change together.
|
|
243
|
+
- **Mass commit threshold (>20 files) is hardcoded.** Commits touching many files are skipped to avoid noise from formatting changes and large refactors, but legitimate large features that touch many files across directories are also excluded.
|
|
244
|
+
- **Degree uses unfiltered churn.** The denominator (`min(churn)`) counts all commits to a file, including single-file commits. This means degree can understate coupling when a file has high solo churn.
|
|
245
|
+
- **Squash merges collapse coupling signal.** If a branch with 10 separate commits is squash-merged into one, all co-changes within that branch become a single co-occurrence.
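The effect of unfiltered churn on degree can be checked with a small calculation. The formula is the one given under Coupling degree; the `couplingDegree` helper name and the churn values are assumptions, with the churn of 13 chosen so that 6 shared commits reproduce the 46.2% shown in the coupling example.

```javascript
// Sketch only: degree = shared commits / min(churn of file1, churn of file2) × 100.
function couplingDegree(shared, churn1, churn2) {
  const base = Math.min(churn1, churn2);
  return base > 0 ? (shared / base) * 100 : 0;
}

// Hypothetical churn values: 6 shared commits, less-active file has 13 commits.
console.log(couplingDegree(6, 13, 40).toFixed(1)); // 46.2

// Same pair, but the less-active file picks up 13 more single-file commits.
// The shared count is unchanged, yet the degree halves.
console.log(couplingDegree(6, 26, 40).toFixed(1)); // 23.1
```

The second call shows the understatement described in the list above: the files are coupled exactly as often as before, but solo churn in the denominator pulls the percentage down.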
|
|
246
|
+
|
|
204
247
|
## License
|
|
205
248
|
|
|
206
249
|
MIT
|
package/dist/cli.js
CHANGED
|
@@ -299,6 +299,14 @@ function formatReportTable(output) {
|
|
|
299
299
|
padRight(truncate(f.file, 58), 60) + padLeft(String(f.code), 8) + padLeft(String(f.complexity), 12) + padLeft(f.complexityDensity.toFixed(2), 9) + padLeft(String(f.comments), 10)
|
|
300
300
|
);
|
|
301
301
|
}
|
|
302
|
+
lines.push("");
|
|
303
|
+
lines.push(
|
|
304
|
+
"Complexity=cyclomatic branch/loop count | Density=complexity/code | Comments=comment lines"
|
|
305
|
+
);
|
|
306
|
+
lines.push(
|
|
307
|
+
"High complexity is expected for parsers, state machines, and business logic. Compare density across files, not raw values."
|
|
308
|
+
);
|
|
309
|
+
lines.push("Docs: https://github.com/wbern/obscene#metrics");
|
|
302
310
|
return lines.join("\n");
|
|
303
311
|
}
|
|
304
312
|
function formatHotspotsTable(output) {
|
|
@@ -321,6 +329,12 @@ function formatHotspotsTable(output) {
|
|
|
321
329
|
lines.push(
|
|
322
330
|
"Score=complexity\xD7churn | Dens=complexity/code | Dfcts=fix commits | Nest=max indent depth | Auth=unique authors"
|
|
323
331
|
);
|
|
332
|
+
lines.push(
|
|
333
|
+
"Tiers are relative to THIS codebase, not absolute quality grades. A 'danger' file in a clean codebase may be fine."
|
|
334
|
+
);
|
|
335
|
+
lines.push(
|
|
336
|
+
"High scores flag review candidates, not bad code \u2014 stable complex files (parsers, engines) score high naturally."
|
|
337
|
+
);
|
|
324
338
|
lines.push("Docs: https://github.com/wbern/obscene#metrics");
|
|
325
339
|
return lines.join("\n");
|
|
326
340
|
}
|
|
@@ -344,6 +358,12 @@ function formatCouplingTable(output) {
|
|
|
344
358
|
lines.push(
|
|
345
359
|
"Shared=co-changed commits | Degree=shared/min(churn)\xD7100 | Cmplx=sum of both files"
|
|
346
360
|
);
|
|
361
|
+
lines.push(
|
|
362
|
+
"Tiers are relative to THIS codebase, not absolute quality grades. High coupling may be intentional and fine."
|
|
363
|
+
);
|
|
364
|
+
lines.push(
|
|
365
|
+
"Same-directory pairs excluded. Commits touching >20 files skipped. Only cross-directory dependencies shown."
|
|
366
|
+
);
|
|
347
367
|
lines.push("Docs: https://github.com/wbern/obscene#metrics");
|
|
348
368
|
return lines.join("\n");
|
|
349
369
|
}
|
|
@@ -371,7 +391,27 @@ function truncate(s, max) {
|
|
|
371
391
|
|
|
372
392
|
// src/cli.ts
|
|
373
393
|
var program = new Command();
|
|
374
|
-
program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.3.
|
|
394
|
+
program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.3.1");
|
|
395
|
+
var REPORT_GUIDE = {
|
|
396
|
+
complexity: "Cyclomatic complexity (branch/loop count). NOT a quality judgment \u2014 a 500-line parser will naturally score high. Compare density, not raw values.",
|
|
397
|
+
complexityDensity: "Complexity per line of code. Normalizes for file size. >0.25 suggests dense logic worth reviewing; <0.10 is typical for straightforward code.",
|
|
398
|
+
comments: "Comment line count. Low comments in high-density files may indicate under-documented logic. High comments alone is not a problem."
|
|
399
|
+
};
|
|
400
|
+
var HOTSPOTS_GUIDE = {
|
|
401
|
+
hotspotScore: "complexity \xD7 churn. Ranks files by combined risk: complex code that changes often. High score does NOT mean bad code \u2014 stable high-complexity files (parsers, engines) are fine. Focus on files where score is rising over time.",
|
|
402
|
+
churn: "Commit count in the time window. High churn alone is neutral \u2014 active development is normal. It becomes a signal when combined with high complexity.",
|
|
403
|
+
tier: "Relative ranking within THIS codebase (top 50% = danger, next 30% = watch, bottom 20% = stable). NOT an absolute quality grade. A 'danger' file in a clean codebase may be perfectly fine. Compare across runs to spot trends.",
|
|
404
|
+
defects: "Count of fix: conventional commits. A proxy for bug frequency \u2014 0 does not mean bug-free, and >0 does not mean bad code. Useful for spotting files that attract repeated fixes.",
|
|
405
|
+
defectDensity: "Fix commits per line of code. Normalizes defect count by file size. Only meaningful with conventional commits (fix: prefix).",
|
|
406
|
+
maxNesting: "Deepest indentation level. >6 suggests complex control flow worth simplifying. Language-dependent \u2014 Python files naturally nest less than C++.",
|
|
407
|
+
authors: "Unique committers in the time window. High author count may indicate unclear ownership. Low count is normal for specialized code. Neither value is inherently good or bad."
|
|
408
|
+
};
|
|
409
|
+
var COUPLING_GUIDE = {
|
|
410
|
+
cochanges: "Times both files appeared in the same commit. Higher values suggest a dependency between the files. Same-directory pairs are excluded \u2014 only cross-directory pairs are shown.",
|
|
411
|
+
degree: "Percentage: shared commits / min(churn of file1, file2) \xD7 100. Shows how tightly coupled the pair is relative to their individual change rates. 100% means every change to the less-active file also touched the other.",
|
|
412
|
+
totalComplexity: "Sum of both files' cyclomatic complexity. Highlights coupled pairs where the involved code is also complex \u2014 hidden dependency + high complexity compounds maintenance risk.",
|
|
413
|
+
tier: "Relative ranking within THIS codebase's coupling pairs (top 50% = danger, next 30% = watch, bottom 20% = stable). NOT an absolute quality grade. 'danger' means this pair co-changes more than most \u2014 it may be intentional and fine."
|
|
414
|
+
};
|
|
375
415
|
function addSharedOptions(cmd) {
|
|
376
416
|
return cmd.option("--top <n>", "limit to top N entries (0 = all)", "20").option("--format <type>", "output format: json | table", "json").option(
|
|
377
417
|
"--exclude <patterns...>",
|
|
@@ -421,6 +461,7 @@ function runReport(opts) {
|
|
|
421
461
|
const limited = top > 0 ? files.slice(0, top) : files;
|
|
422
462
|
const output = {
|
|
423
463
|
generated: (/* @__PURE__ */ new Date()).toISOString(),
|
|
464
|
+
guide: REPORT_GUIDE,
|
|
424
465
|
summary: {
|
|
425
466
|
...totals,
|
|
426
467
|
fileCount: files.length,
|
|
@@ -460,6 +501,7 @@ function runHotspots(opts) {
|
|
|
460
501
|
const totalScore = hotspots.reduce((sum, h) => sum + h.hotspotScore, 0);
|
|
461
502
|
const output = {
|
|
462
503
|
generated: (/* @__PURE__ */ new Date()).toISOString(),
|
|
504
|
+
guide: HOTSPOTS_GUIDE,
|
|
463
505
|
churnWindow: `${months} months`,
|
|
464
506
|
totalScore,
|
|
465
507
|
tierCounts,
|
|
@@ -500,6 +542,7 @@ function runCoupling(opts) {
|
|
|
500
542
|
const totalScore = couplings.reduce((sum, c) => sum + c.couplingScore, 0);
|
|
501
543
|
const output = {
|
|
502
544
|
generated: (/* @__PURE__ */ new Date()).toISOString(),
|
|
545
|
+
guide: COUPLING_GUIDE,
|
|
503
546
|
churnWindow: `${months} months`,
|
|
504
547
|
minCochanges,
|
|
505
548
|
totalScore,
|
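For consumers of the JSON output, the new `guide` field might be read along these lines. This is a sketch under assumptions: the sample object is hand-written and abbreviated (only the top-level keys `generated`, `guide`, `churnWindow`, `minCochanges`, and `totalScore` are taken from the diff, and the guide strings are shortened to their opening sentence), and `describeField` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical, abbreviated stand-in for parsed `obscene coupling` JSON output.
const sampleOutput = {
  generated: "2025-01-01T00:00:00.000Z",
  guide: {
    cochanges: "Times both files appeared in the same commit.",
    degree: "Percentage: shared commits / min(churn of file1, file2) × 100.",
  },
  churnWindow: "6 months",
  minCochanges: 3,
  totalScore: 91,
};

// Look up the explanation for a field, tolerating output that has no guide
// (for example, JSON produced before the guide field was added).
function describeField(output, field) {
  return output.guide?.[field] ?? `(no guide entry for ${field})`;
}

console.log(describeField(sampleOutput, "degree"));
console.log(describeField({ totalScore: 91 }, "degree"));
// (no guide entry for degree)
```

Guarding with optional chaining keeps a script working against both versions, since 0.3.0 output would lack the `guide` key altogether.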