@wbern/obscene 0.2.0 → 0.2.1

Files changed (3)
  1. package/README.md +68 -8
  2. package/dist/cli.js +6 -1
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -86,20 +86,69 @@ Per-file complexity without churn. Useful for raw complexity distribution.
  | `--format <type>` | `json` | `json` or `table` |
  | `--exclude <patterns...>` | — | Additional exclusion patterns |
 
+ ## Metrics
+
+ Each hotspot row includes the following metrics:
+
+ ### Hotspot score (`Score`)
+
+ `complexity × churn`. The core ranking metric — files that are both complex and frequently modified bubble to the top. See [Why churn × complexity?](#why-churn-x-complexity) for the research backing this approach.
+
+ ### Churn (`Churn`)
+
+ Number of commits touching the file within the configured time window (default: 3 months). Measures how actively the file is being modified.
+
+ ### Cyclomatic complexity (`Cmplx`)
+
+ Total cyclomatic complexity as reported by [scc](https://github.com/boyter/scc). Counts independent execution paths (branches, loops, conditions). Higher values mean more paths to test and more places for bugs to hide.
+
+ ### Complexity density (`Dens`)
+
+ `complexity / lines of code`. Normalizes complexity by file size so a 50-line file with complexity 25 (density 0.50) stands out against a 500-line file with complexity 25 (density 0.05). Based on Harrison & Magel (1981), who found that complexity relative to code size is a stronger fault predictor than raw complexity alone.
+
+ ### Defects (`Dfcts`)
+
+ Count of `fix:` conventional commits touching the file within the churn window. A proxy for historical defect rate — files that attract repeated fixes are more likely to contain latent bugs. Inspired by Moser, Pedrycz & Succi (2008), who showed that change-history metrics outperform static code metrics for defect prediction.
+
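Reviewer note: the `Dfcts` definition above can be illustrated with a small sketch. This is not the package's implementation. The function name, the sample subjects, and the choice to also accept scoped (`fix(scope):`) and breaking (`fix!:`) forms are assumptions; the README only says "`fix:` conventional commits".

```javascript
// Assumption: a "fix" commit is one whose subject starts with the
// Conventional Commits type `fix`, optionally followed by a scope and/or `!`.
const FIX_SUBJECT = /^fix(\([^)]*\))?!?:/;

// Count the subjects that look like fix commits.
// `subjects` is a hypothetical list of commit subject lines for one file,
// already restricted to the churn window.
function countFixCommits(subjects) {
  return subjects.filter((s) => FIX_SUBJECT.test(s)).length;
}

const subjects = [
  "fix: handle empty repo",
  "feat: add table output",
  "fix(cli): pad columns correctly",
  "chore: fix typo in docs", // type is `chore`, so it is not counted
  "fixup! wip",              // not a Conventional Commits `fix`
];

const defects = countFixCommits(subjects);
console.log(defects);
```

Matching on the commit type rather than on the word "fix" anywhere in the subject keeps unrelated commits such as `chore: fix typo` out of the count.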
+ ### Defect density (`defectDensity`, JSON only)
+
+ `defects / lines of code`. Not shown in table output due to column width, but available in JSON. Normalizes defect count by file size.
+
+ ### Nesting depth (`Nest`)
+
+ Maximum indentation level (tab stops) in the file. Deep nesting correlates with high cognitive load and defect likelihood. Harrison & Magel (1981) identified nesting depth as a significant complexity contributor.
+
+ ### Unique authors (`Auth`)
+
+ Number of distinct git authors who committed to the file within the churn window. Files touched by many authors may lack clear ownership and accumulate inconsistent patterns. Kamei et al. (2013) found developer count to be a significant predictor of defect-introducing changes.
+
+ ### Tier
+
+ Cumulative score distribution bucket:
+
+ | Tier | Range | Meaning |
+ |------|-------|---------|
+ | **danger** | top 50% of total score | Refactor candidates |
+ | **watch** | next 30% (50–80%) | Keep an eye on these |
+ | **stable** | bottom 20% | Low risk |
+
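Reviewer note: a minimal sketch of how the score, density and tier definitions above could fit together. The README states only the tier ranges, so the boundary rule used here (rank by score, then bucket each file by the cumulative share of total score reached once that file is included) is one plausible reading and may differ from what the tool actually does. All names and sample rows are hypothetical.

```javascript
// Hypothetical per-file inputs; field names are illustrative only.
const files = [
  { file: "a.ts", complexity: 40, churn: 10, code: 400 },
  { file: "b.ts", complexity: 30, churn: 10, code: 100 },
  { file: "c.ts", complexity: 20, churn: 10, code: 400 },
  { file: "d.ts", complexity: 10, churn: 10, code: 50 },
];

function rankHotspots(rows) {
  // Score = complexity × churn, density = complexity / lines of code.
  const scored = rows
    .map((r) => ({
      ...r,
      score: r.complexity * r.churn,
      density: r.code > 0 ? r.complexity / r.code : 0,
    }))
    .sort((a, b) => b.score - a.score); // highest score first

  const total = scored.reduce((sum, r) => sum + r.score, 0);
  let cumulative = 0;

  return scored.map((r) => {
    cumulative += r.score;
    // Assumed rule: bucket by cumulative share including this file.
    const share = total > 0 ? cumulative / total : 1;
    const tier = share <= 0.5 ? "danger" : share <= 0.8 ? "watch" : "stable";
    const percentOfTotal = total > 0 ? (r.score / total) * 100 : 0;
    return { ...r, percentOfTotal, tier };
  });
}

const ranked = rankHotspots(files);
for (const r of ranked) {
  console.log(r.file, r.score, r.density.toFixed(2), r.tier);
}
```

With these sample rows the cumulative shares are 40%, 70%, 90% and 100%, so the first file lands in `danger`, the second in `watch`, and the last two in `stable`.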
  ## Example output
 
  ```
- Hotspots — 3 months churn window | Total score: 35452
+ Hotspots — 3 months churn window | Total score: 35,452
  Tiers: 3 danger, 13 watch, 194 stable
  Showing: 5 of 210
 
- File Score % Churn Cmplx Density Tier
- ──────────────────────────────────────────────────────────────────────────────────────
- src/utils/effect-generator.ts 8296 23.4 68 122 0.12 DANGER
- src/services/game-engine.ts 4284 12.1 51 84 0.09 DANGER
- src/components/board-renderer.tsx 2940 8.3 42 70 0.11 DANGER
- src/hooks/use-game-state.ts 1320 3.7 33 40 0.08 WATCH
- src/utils/move-validator.ts 945 2.7 27 35 0.06 WATCH
+ File Score % Churn Cmplx Dens Dfcts Nest Auth Tier
+ ────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+ src/utils/effect-generator.ts 8,296 23.4 68 122 0.12 5 6 4 DANGER
+ src/services/game-engine.ts 4,284 12.1 51 84 0.09 3 4 3 DANGER
+ src/components/board-renderer.tsx 2,940 8.3 42 70 0.11 2 5 3 DANGER
+ src/hooks/use-game-state.ts 1,320 3.7 33 40 0.08 1 3 2 WATCH
+ src/utils/move-validator.ts 945 2.7 27 35 0.06 0 2 1 WATCH
+
+ Score=complexity×churn | Dens=complexity/code | Dfcts=fix commits | Nest=max indent depth | Auth=unique authors
+ Docs: https://github.com/wbern/obscene#metrics
  ```
 
  ## Supported languages
@@ -110,6 +159,17 @@ Any language [scc supports](https://github.com/boyter/scc#features) — 200+ lan
 
  Test and generated files are excluded automatically: `*.test.*`, `*.spec.*`, `__tests__/`, `__mocks__/`, `*.stories.*`, `*.d.ts`, and similar patterns. scc also skips generated files by default (`--no-gen`).
 
+ ## Why churn x complexity?
+
+ Files that are both complex and frequently modified are disproportionately likely to contain defects. This is backed by decades of empirical software engineering research:
+
+ - **Nagappan & Ball (2005)** studied Windows Server 2003 and found that relative code churn measures predict system defect density with 89% accuracy. — [ICSE 2005](https://doi.org/10.1109/ICSE.2005.1553571)
+ - **Moser, Pedrycz & Succi (2008)** compared change metrics against static code attributes on Eclipse and found that process metrics (churn, change frequency) outperform static code metrics for defect prediction. — [ICSE 2008](https://doi.org/10.1145/1368088.1368114)
+ - **Shin, Meneely, Williams & Osborne (2011)** combined complexity, churn, and developer activity metrics to predict vulnerabilities in Mozilla Firefox and the Linux kernel. By flagging only 10.9% of files, the model identified 70.8% of known vulnerabilities. — [IEEE TSE](https://doi.org/10.1109/TSE.2010.55)
+ - **Tornhill & Borg (2022)** analyzed 39 proprietary codebases and found that low-quality code (by their Code Health metric) contains 15x more defects and takes 124% longer to resolve. In their case studies, 4% of the codebase was responsible for 72% of all defects. — [ACM/IEEE TechDebt 2022](https://arxiv.org/abs/2203.04374)
+
+ The general approach was popularized by Adam Tornhill's *Your Code as a Crime Scene* (2015), which applies forensic analysis techniques to version control history.
+
  ## Limitations
 
  - **Churn = commit count**, not lines changed. A one-line typo fix counts the same as a 500-line rewrite.
package/dist/cli.js CHANGED
@@ -246,6 +246,11 @@ function formatHotspotsTable(output) {
  padRight(truncate(h.file, 48), 50) + padLeft(h.hotspotScore.toLocaleString(), 8) + padLeft(h.percentOfTotal.toFixed(1), 7) + padLeft(String(h.churn), 7) + padLeft(String(h.complexity), 7) + padLeft(h.complexityDensity.toFixed(2), 7) + padLeft(String(h.defects), 6) + padLeft(String(h.maxNesting), 6) + padLeft(String(h.authors), 6) + padLeft(tierLabel, 8)
  );
  }
+ lines.push("");
+ lines.push(
+ "Score=complexity\xD7churn | Dens=complexity/code | Dfcts=fix commits | Nest=max indent depth | Auth=unique authors"
+ );
+ lines.push("Docs: https://github.com/wbern/obscene#metrics");
  return lines.join("\n");
  }
  function padRight(s, n) {
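Reviewer note: the row expression in this hunk uses fixed column widths (50, 8, 7, 7, 7, 7, 6, 6, 6, 8), which add up to 112 characters. The sketch below rebuilds one row to show that layout. The bodies of `padRight`, `padLeft` and `truncate` are not visible in the diff, so the versions here are hypothetical stand-ins built on `String.prototype.padEnd` and `padStart`. The score is formatted with `String()` instead of `toLocaleString()` so the result does not depend on the runtime locale.

```javascript
// Hypothetical stand-ins for helpers whose bodies are not shown in the diff.
const padRight = (s, n) => s.padEnd(n);
const padLeft = (s, n) => s.padStart(n);
const truncate = (s, max) => (s.length > max ? s.slice(0, max) : s);

// Sample row taken from the README's example output.
const h = {
  file: "src/utils/move-validator.ts",
  hotspotScore: 945,
  percentOfTotal: 2.7,
  churn: 27,
  complexity: 35,
  complexityDensity: 0.06,
  defects: 0,
  maxNesting: 2,
  authors: 1,
};
const tierLabel = "WATCH";

// Same column order and widths as the expression in the diff.
const row =
  padRight(truncate(h.file, 48), 50) +
  padLeft(String(h.hotspotScore), 8) +
  padLeft(h.percentOfTotal.toFixed(1), 7) +
  padLeft(String(h.churn), 7) +
  padLeft(String(h.complexity), 7) +
  padLeft(h.complexityDensity.toFixed(2), 7) +
  padLeft(String(h.defects), 6) +
  padLeft(String(h.maxNesting), 6) +
  padLeft(String(h.authors), 6) +
  padLeft(tierLabel, 8);

console.log(row);
console.log(row.length);
```

The legend and docs link added in 0.2.1 are pushed after the loop, so they appear once below the table rather than once per row.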
@@ -260,7 +265,7 @@ function truncate(s, max) {
 
  // src/cli.ts
  var program = new Command();
- program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.2.0");
+ program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.2.1");
  function addSharedOptions(cmd) {
  return cmd.option("--top <n>", "limit to top N entries (0 = all)", "20").option("--format <type>", "output format: json | table", "json").option(
  "--exclude <patterns...>",
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@wbern/obscene",
- "version": "0.2.0",
+ "version": "0.2.1",
  "description": "Identify hotspot files — complex code that changes frequently. Churn × complexity analysis for any git repo.",
  "type": "module",
  "bin": {