@wbern/obscene 0.3.0 → 0.3.1

Files changed (3)
  1. package/README.md +59 -16
  2. package/dist/cli.js +44 -1
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -77,9 +77,9 @@ Scores each file by `complexity × commits` over a time window, then assigns tie
77
77
 
78
78
  ### `obscene coupling`
79
79
 
80
- Detects files that frequently change together in the same commit but live in different directories — Tornhill's "temporal coupling" analysis. Surfaces hidden structural dependencies that aren't visible in the code itself.
80
+ Detects files that frequently change together in the same commit but live in different directories — Tornhill's "temporal coupling" analysis from *Your Code as a Crime Scene* (2015). Surfaces hidden structural dependencies that aren't visible in imports or the module graph.
81
81
 
82
- Same-directory pairs are excluded (co-location is expected coupling). Mass commits touching >20 files are skipped (formatting changes, large refactors).
82
+ Same-directory pairs are excluded (co-location is expected coupling). Mass commits touching >20 files are skipped (formatting changes, large refactors). See [Why temporal coupling?](#why-temporal-coupling) for the research backing this approach.
83
83
 
84
84
  ```bash
85
85
  obscene coupling # default: min 2 shared commits
@@ -103,49 +103,55 @@ Per-file complexity without churn. Useful for raw complexity distribution.
103
103
 
104
104
  ## Metrics
105
105
 
106
- Each hotspot row includes the following metrics:
106
+ ### Hotspot metrics
107
107
 
108
- ### Hotspot score (`Score`)
108
+ #### Hotspot score (`Score`)
109
109
 
110
110
  `complexity × churn`. The core ranking metric — files that are both complex and frequently modified bubble to the top. See [Why churn × complexity?](#why-churn-x-complexity) for the research backing this approach.
111
111
 
112
- ### Churn (`Churn`)
112
+ #### Churn (`Churn`)
113
113
 
114
114
  Number of commits touching the file within the configured time window (default: 3 months). Measures how actively the file is being modified.
115
115
 
116
- ### Cyclomatic complexity (`Cmplx`)
116
+ #### Cyclomatic complexity (`Cmplx`)
117
117
 
118
118
  Total cyclomatic complexity as reported by [scc](https://github.com/boyter/scc). Counts independent execution paths (branches, loops, conditions). Higher values mean more paths to test and more places for bugs to hide.
119
119
 
120
- ### Complexity density (`Dens`)
120
+ #### Complexity density (`Dens`)
121
121
 
122
122
  `complexity / lines of code`. Normalizes complexity by file size so a 50-line file with complexity 25 (density 0.50) stands out against a 500-line file with complexity 25 (density 0.05). Based on Harrison & Magel (1981), who found that complexity relative to code size is a stronger fault predictor than raw complexity alone.
123
123
 
124
- ### Defects (`Dfcts`)
124
+ #### Defects (`Dfcts`)
125
125
 
126
126
  Count of `fix:` conventional commits touching the file within the churn window. A proxy for historical defect rate — files that attract repeated fixes are more likely to contain latent bugs. Inspired by Moser, Pedrycz & Succi (2008), who showed that change-history metrics outperform static code metrics for defect prediction.
127
127
 
128
- ### Defect density (`defectDensity`, JSON only)
128
+ #### Defect density (`defectDensity`, JSON only)
129
129
 
130
130
  `defects / lines of code`. Not shown in table output due to column width, but available in JSON. Normalizes defect count by file size.
131
131
 
132
- ### Nesting depth (`Nest`)
132
+ #### Nesting depth (`Nest`)
133
133
 
134
134
  Maximum indentation level (tab stops) in the file. Deep nesting correlates with high cognitive load and defect likelihood. Harrison & Magel (1981) identified nesting depth as a significant complexity contributor.
135
135
 
136
- ### Unique authors (`Auth`)
136
+ #### Unique authors (`Auth`)
137
137
 
138
138
  Number of distinct git authors who committed to the file within the churn window. Files touched by many authors may lack clear ownership and accumulate inconsistent patterns. Kamei et al. (2013) found developer count to be a significant predictor of defect-introducing changes.
139
139
 
140
- ### Shared commits (`Shared`, coupling only)
140
+ ### Coupling metrics
141
141
 
142
- Number of commits where both files in a pair were modified together. The core ranking metric for temporal coupling — higher values indicate stronger hidden dependencies between files in different directories.
142
+ #### Shared commits (`Shared`)
143
143
 
144
- ### Coupling degree (`Degree`, coupling only)
144
+ Number of commits where both files in a pair were modified together. The core ranking metric for temporal coupling — higher values indicate stronger hidden dependencies between files in different directories. Ball, Kim, Porter & Siy (1997) demonstrated that co-change relationships reveal design dependencies that static analysis misses.
145
145
 
146
- `shared commits / min(churn of file1, churn of file2) × 100`. What percentage of the less-active file's changes also involved the other file. A degree of 100% means every change to the less-active file also touched the other file.
146
+ #### Coupling degree (`Degree`)
147
147
 
148
- ### Tier
148
+ `shared commits / min(churn of file1, churn of file2) × 100`. What percentage of the less-active file's changes also involved the other file. A degree of 100% means every change to the less-active file also touched the other file. This normalization follows D'Ambros, Lanza & Lungu (2009), who showed that relative coupling measures provide more stable results than raw co-change counts across projects of different sizes.
149
+
150
+ #### Combined complexity (`Cmplx`)
151
+
152
+ Sum of cyclomatic complexity of both files in the pair. Highlights coupled pairs where the involved code is also complex — the combination of hidden dependency and high complexity compounds maintenance risk.
153
+
154
+ #### Tier
149
155
 
150
156
  Cumulative score distribution bucket:
151
157
 
@@ -174,6 +180,25 @@ Score=complexity×churn | Dens=complexity/code | Dfcts=fix commits | Nest=max in
174
180
  Docs: https://github.com/wbern/obscene#metrics
175
181
  ```
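The legend above (`Score=complexity×churn | Dens=complexity/code`) can be illustrated with a minimal sketch. The function name `hotspotMetrics`, its input shape, and the zero-line guard are assumptions for illustration, not the package's actual code; the output keys reuse the names that appear in `HOTSPOTS_GUIDE`. The churn and defect values are made up, while the 50-line and 500-line files with complexity 25 come from the README's density example.

```javascript
// Hypothetical helper: derives the ratio metrics from raw per-file counts.
// The input shape ({ complexity, churn, code, defects }) is an assumption.
function hotspotMetrics({ complexity, churn, code, defects }) {
  return {
    hotspotScore: complexity * churn,                    // complexity × churn
    complexityDensity: code > 0 ? complexity / code : 0, // complexity / lines of code
    defectDensity: code > 0 ? defects / code : 0,        // fix commits / lines of code
  };
}

// Same complexity, different file sizes (churn and defects are illustrative).
const small = hotspotMetrics({ complexity: 25, churn: 4, code: 50, defects: 1 });
const large = hotspotMetrics({ complexity: 25, churn: 4, code: 500, defects: 1 });

console.log(small.complexityDensity.toFixed(2)); // 0.50
console.log(large.complexityDensity.toFixed(2)); // 0.05
```

Both files get the same hotspot score here, which is why the density column is reported alongside it.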
176
182
 
183
+ ### Coupling example
184
+
185
+ ```
186
+ Coupling — 6 months churn window | Min shared: 3 | Total score: 91
187
+ Tiers: 10 danger, 7 watch, 7 stable
188
+ Showing: 5 of 24
189
+
190
+ File 1 File 2 Shared Degree Cmplx Tier
191
+ ────────────────────────────────────────────────────────────────────────────────────────────────────
192
+ …ePlayer/hooks/useChessEffects.ts src/utils/effect-generator.ts 6 46.2% 261 DANGER
193
+ …ePlayer/hooks/useChessEffects.ts src/utils/pgn-types.ts 6 50.0% 121 DANGER
194
+ src/test/pgn-fixtures.ts src/utils/pgn-parser.server.ts 5 71.4% 3 DANGER
195
+ src/test/pgn-fixtures.ts src/utils/effect-generator.ts 4 57.1% 145 DANGER
196
+ src/test/pgn-fixtures.ts src/utils/pgn-types.ts 4 57.1% 5 DANGER
197
+
198
+ Shared=co-changed commits | Degree=shared/min(churn)×100 | Cmplx=sum of both files
199
+ Docs: https://github.com/wbern/obscene#metrics
200
+ ```
201
+
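As a rough check of the `Degree` column in the example above, the formula `shared / min(churn) × 100` can be sketched as follows. The function name `couplingDegree` is hypothetical, and the churn values are assumptions back-calculated from the table (the output does not show per-file churn); only the smaller of the two churn values affects the result.

```javascript
// Degree = shared commits / min(churn of file1, churn of file2) × 100.
// Hypothetical helper, not the package's own function.
function couplingDegree(shared, churn1, churn2) {
  const minChurn = Math.min(churn1, churn2);
  return minChurn > 0 ? (shared / minChurn) * 100 : 0;
}

// Shared counts come from the example table; churn values are assumed.
const d1 = couplingDegree(6, 13, 40); // less-active file changed 13 times
const d2 = couplingDegree(5, 7, 9);   // less-active file changed 7 times
const d3 = couplingDegree(4, 7, 20);

console.log(d1.toFixed(1) + "%"); // 46.2%
console.log(d2.toFixed(1) + "%"); // 71.4%
console.log(d3.toFixed(1) + "%"); // 57.1%
```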
177
202
  ## Supported languages
178
203
 
179
204
  Any language [scc supports](https://github.com/boyter/scc#features) — 200+ languages including C, C++, Go, Java, JavaScript, TypeScript, Python, Rust, Ruby, PHP, Swift, Kotlin, and many more. No configuration needed; scc auto-detects languages from file extensions.
@@ -193,14 +218,32 @@ Files that are both complex and frequently modified are disproportionately likel
193
218
 
194
219
  The general approach was popularized by Adam Tornhill's *Your Code as a Crime Scene* (2015), which applies forensic analysis techniques to version control history.
195
220
 
221
+ ## Why temporal coupling?
222
+
223
+ Files that change together but live in different directories reveal implicit dependencies that the module graph doesn't capture. These hidden couplings are a maintenance hazard: a developer modifying one file doesn't know they also need to update the other, leading to bugs that only surface later.
224
+
225
+ - **Ball, Kim, Porter & Siy (1997)** pioneered co-change analysis and showed that version control history surfaces design relationships invisible to static analysis. — [ICSE 1997 Workshop](https://www.researchgate.net/publication/2791666_If_Your_Version_Control_System_Could_Talk)
226
+ - **D'Ambros, Lanza & Lungu (2009)** developed the Evolution Radar for visualizing logical coupling at both file and module level, showing how evolutionary coupling reveals architectural decay. The normalized approach (coupling relative to total changes) provides more stable measures across projects of different sizes. — [IEEE TSE](https://doi.org/10.1109/TSE.2009.17)
227
+ - **Tornhill (2015)** popularized temporal coupling analysis in *Your Code as a Crime Scene*, demonstrating how co-change patterns reveal "surprise dependencies" — files that should logically be independent but can't be changed separately in practice. His tooling (Code Maat) uses the same commit co-occurrence approach.
228
+ - **Cataldo, Mockus, Roberts & Herbsleb (2009)** analyzed both syntactic and logical dependencies across two large systems and found that logical (co-change) dependencies have a significant independent effect on failure proneness. When developers are unaware of these hidden couplings, defects increase. — [IEEE TSE](https://doi.org/10.1109/TSE.2009.42)
229
+
196
230
  ## Limitations
197
231
 
232
+ ### General
233
+
198
234
  - **Churn = commit count**, not lines changed. A one-line typo fix counts the same as a 500-line rewrite.
199
235
  - **Per-file granularity only.** A 1000-line file with many small functions scores higher than it probably should. No function-level breakdown.
200
236
  - **Must be run inside a git repo.** Churn data comes from `git log`.
201
237
  - **Only analyzes files that currently exist.** Deleted files don't appear, even if they churned heavily before removal.
202
238
  - **Tier thresholds are fixed** (50/80 cumulative %). Not configurable yet.
203
239
 
240
+ ### Coupling-specific
241
+
242
+ - **Same-directory exclusion is a heuristic.** Files in the same directory that are unexpectedly coupled won't be surfaced. The assumption is that co-located files are *expected* to change together.
243
+ - **Mass commit threshold (>20 files) is hardcoded.** Commits touching many files are skipped to avoid noise from formatting changes and large refactors, but legitimate large features that touch many files across directories are also excluded.
244
+ - **Degree uses unfiltered churn.** The denominator (`min(churn)`) counts all commits to a file, including single-file commits. This means degree can understate coupling when a file has high solo churn.
245
+ - **Squash merges collapse coupling signal.** If a branch with 10 separate commits is squash-merged into one, all co-changes within that branch become a single co-occurrence.
246
+
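The rules described above (same-directory exclusion, the >20-file cutoff, a minimum shared-commit count, and degree computed from unfiltered churn) could be combined roughly as in the sketch below. This is not the package's implementation: the input shape (one array of changed paths per commit), the names `countCochanges` and `dirOf`, and the choice to let skipped mass commits still count toward churn are all assumptions.

```javascript
// Assumed input: each commit is an array of changed file paths.
const MAX_FILES_PER_COMMIT = 20; // mass-commit threshold stated in the README

// Directory part of a path ("" for top-level files).
function dirOf(path) {
  const i = path.lastIndexOf("/");
  return i === -1 ? "" : path.slice(0, i);
}

function countCochanges(commits, minShared = 2) {
  const shared = new Map(); // "file1\0file2" -> number of shared commits
  const churn = new Map();  // file -> number of commits touching it (unfiltered)
  for (const files of commits) {
    const unique = [...new Set(files)].sort();
    for (const f of unique) churn.set(f, (churn.get(f) ?? 0) + 1);
    if (unique.length > MAX_FILES_PER_COMMIT) continue; // skip mass commits
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        if (dirOf(unique[i]) === dirOf(unique[j])) continue; // same directory: expected coupling
        const key = unique[i] + "\0" + unique[j];
        shared.set(key, (shared.get(key) ?? 0) + 1);
      }
    }
  }
  return [...shared]
    .filter(([, n]) => n >= minShared)
    .map(([key, n]) => {
      const [file1, file2] = key.split("\0");
      const degree = (n / Math.min(churn.get(file1), churn.get(file2))) * 100;
      return { file1, file2, shared: n, degree };
    })
    .sort((a, b) => b.shared - a.shared);
}

// Illustrative history (paths are made up).
const commits = [
  ["src/a/x.ts", "src/b/y.ts"],
  ["src/a/x.ts", "src/b/y.ts", "src/a/z.ts"],
  ["src/a/x.ts"],
  ["src/b/y.ts", "docs/readme.md"],
];
const result = countCochanges(commits);
console.log(result);
```

In this history only the `x.ts`/`y.ts` pair reaches two shared commits. Each file has a churn of 3, so the solo commits pull the degree down to about 67%, which is the understating effect noted in the list above.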
204
247
  ## License
205
248
 
206
249
  MIT
package/dist/cli.js CHANGED
@@ -299,6 +299,14 @@ function formatReportTable(output) {
299
299
  padRight(truncate(f.file, 58), 60) + padLeft(String(f.code), 8) + padLeft(String(f.complexity), 12) + padLeft(f.complexityDensity.toFixed(2), 9) + padLeft(String(f.comments), 10)
300
300
  );
301
301
  }
302
+ lines.push("");
303
+ lines.push(
304
+ "Complexity=cyclomatic branch/loop count | Density=complexity/code | Comments=comment lines"
305
+ );
306
+ lines.push(
307
+ "High complexity is expected for parsers, state machines, and business logic. Compare density across files, not raw values."
308
+ );
309
+ lines.push("Docs: https://github.com/wbern/obscene#metrics");
302
310
  return lines.join("\n");
303
311
  }
304
312
  function formatHotspotsTable(output) {
@@ -321,6 +329,12 @@ function formatHotspotsTable(output) {
321
329
  lines.push(
322
330
  "Score=complexity\xD7churn | Dens=complexity/code | Dfcts=fix commits | Nest=max indent depth | Auth=unique authors"
323
331
  );
332
+ lines.push(
333
+ "Tiers are relative to THIS codebase, not absolute quality grades. A 'danger' file in a clean codebase may be fine."
334
+ );
335
+ lines.push(
336
+ "High scores flag review candidates, not bad code \u2014 stable complex files (parsers, engines) score high naturally."
337
+ );
324
338
  lines.push("Docs: https://github.com/wbern/obscene#metrics");
325
339
  return lines.join("\n");
326
340
  }
@@ -344,6 +358,12 @@ function formatCouplingTable(output) {
344
358
  lines.push(
345
359
  "Shared=co-changed commits | Degree=shared/min(churn)\xD7100 | Cmplx=sum of both files"
346
360
  );
361
+ lines.push(
362
+ "Tiers are relative to THIS codebase, not absolute quality grades. High coupling may be intentional and fine."
363
+ );
364
+ lines.push(
365
+ "Same-directory pairs excluded. Commits touching >20 files skipped. Only cross-directory dependencies shown."
366
+ );
347
367
  lines.push("Docs: https://github.com/wbern/obscene#metrics");
348
368
  return lines.join("\n");
349
369
  }
@@ -371,7 +391,27 @@ function truncate(s, max) {
371
391
 
372
392
  // src/cli.ts
373
393
  var program = new Command();
374
- program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.3.0");
394
+ program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.3.1");
395
+ var REPORT_GUIDE = {
396
+ complexity: "Cyclomatic complexity (branch/loop count). NOT a quality judgment \u2014 a 500-line parser will naturally score high. Compare density, not raw values.",
397
+ complexityDensity: "Complexity per line of code. Normalizes for file size. >0.25 suggests dense logic worth reviewing; <0.10 is typical for straightforward code.",
398
+ comments: "Comment line count. Low comments in high-density files may indicate under-documented logic. High comments alone is not a problem."
399
+ };
400
+ var HOTSPOTS_GUIDE = {
401
+ hotspotScore: "complexity \xD7 churn. Ranks files by combined risk: complex code that changes often. High score does NOT mean bad code \u2014 stable high-complexity files (parsers, engines) are fine. Focus on files where score is rising over time.",
402
+ churn: "Commit count in the time window. High churn alone is neutral \u2014 active development is normal. It becomes a signal when combined with high complexity.",
403
+ tier: "Relative ranking within THIS codebase (top 50% = danger, next 30% = watch, bottom 20% = stable). NOT an absolute quality grade. A 'danger' file in a clean codebase may be perfectly fine. Compare across runs to spot trends.",
404
+ defects: "Count of fix: conventional commits. A proxy for bug frequency \u2014 0 does not mean bug-free, and >0 does not mean bad code. Useful for spotting files that attract repeated fixes.",
405
+ defectDensity: "Fix commits per line of code. Normalizes defect count by file size. Only meaningful with conventional commits (fix: prefix).",
406
+ maxNesting: "Deepest indentation level. >6 suggests complex control flow worth simplifying. Language-dependent \u2014 Python files naturally nest less than C++.",
407
+ authors: "Unique committers in the time window. High author count may indicate unclear ownership. Low count is normal for specialized code. Neither value is inherently good or bad."
408
+ };
409
+ var COUPLING_GUIDE = {
410
+ cochanges: "Times both files appeared in the same commit. Higher values suggest a dependency between the files. Same-directory pairs are excluded \u2014 only cross-directory pairs are shown.",
411
+ degree: "Percentage: shared commits / min(churn of file1, file2) \xD7 100. Shows how tightly coupled the pair is relative to their individual change rates. 100% means every change to the less-active file also touched the other.",
412
+ totalComplexity: "Sum of both files' cyclomatic complexity. Highlights coupled pairs where the involved code is also complex \u2014 hidden dependency + high complexity compounds maintenance risk.",
413
+ tier: "Relative ranking within THIS codebase's coupling pairs (top 50% = danger, next 30% = watch, bottom 20% = stable). NOT an absolute quality grade. 'danger' means this pair co-changes more than most \u2014 it may be intentional and fine."
414
+ };
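The tier wording in the guide strings (top 50% = danger, next 30% = watch, bottom 20% = stable) could be read as a cumulative-share bucketing like the sketch below. The function name `assignTiers`, the `score` field, and the boundary handling (an item landing exactly on a cutoff stays in the higher-risk tier) are assumptions; the diff does not show how the package assigns tiers.

```javascript
// Assumed rule: sort by score descending, accumulate each item's share of the
// total score, and bucket by where the running share lands (50% / 80% cutoffs).
function assignTiers(items, dangerCut = 0.5, watchCut = 0.8) {
  const sorted = [...items].sort((a, b) => b.score - a.score);
  const total = sorted.reduce((sum, it) => sum + it.score, 0);
  let running = 0;
  return sorted.map((it) => {
    running += it.score;
    const share = total > 0 ? running / total : 1;
    const tier = share <= dangerCut ? "danger" : share <= watchCut ? "watch" : "stable";
    return { ...it, tier };
  });
}

// Illustrative scores (file names are made up).
const tiers = assignTiers([
  { file: "a.ts", score: 10 },
  { file: "b.ts", score: 50 },
  { file: "c.ts", score: 30 },
  { file: "d.ts", score: 10 },
]);
console.log(tiers.map((t) => t.file + ":" + t.tier).join(" "));
// b.ts:danger c.ts:watch a.ts:stable d.ts:stable
```

Because the buckets depend on each item's share of the total, the same file can move between tiers when other files change, which matches the guide's point that tiers are relative to the codebase.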
375
415
  function addSharedOptions(cmd) {
376
416
  return cmd.option("--top <n>", "limit to top N entries (0 = all)", "20").option("--format <type>", "output format: json | table", "json").option(
377
417
  "--exclude <patterns...>",
@@ -421,6 +461,7 @@ function runReport(opts) {
421
461
  const limited = top > 0 ? files.slice(0, top) : files;
422
462
  const output = {
423
463
  generated: (/* @__PURE__ */ new Date()).toISOString(),
464
+ guide: REPORT_GUIDE,
424
465
  summary: {
425
466
  ...totals,
426
467
  fileCount: files.length,
@@ -460,6 +501,7 @@ function runHotspots(opts) {
460
501
  const totalScore = hotspots.reduce((sum, h) => sum + h.hotspotScore, 0);
461
502
  const output = {
462
503
  generated: (/* @__PURE__ */ new Date()).toISOString(),
504
+ guide: HOTSPOTS_GUIDE,
463
505
  churnWindow: `${months} months`,
464
506
  totalScore,
465
507
  tierCounts,
@@ -500,6 +542,7 @@ function runCoupling(opts) {
500
542
  const totalScore = couplings.reduce((sum, c) => sum + c.couplingScore, 0);
501
543
  const output = {
502
544
  generated: (/* @__PURE__ */ new Date()).toISOString(),
545
+ guide: COUPLING_GUIDE,
503
546
  churnWindow: `${months} months`,
504
547
  minCochanges,
505
548
  totalScore,
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@wbern/obscene",
3
- "version": "0.3.0",
3
+ "version": "0.3.1",
4
4
  "description": "Identify hotspot files — complex code that changes frequently. Churn × complexity analysis for any git repo.",
5
5
  "type": "module",
6
6
  "bin": {