@wbern/obscene (npm) 0.2.0 → 0.3.0

Files changed (3)

package/README.md CHANGED

@@ -57,6 +57,8 @@ obscene --format table           # human-readable table
 obscene --top 50 --months 6     # more results, longer window
 obscene --top 0                  # all files
 obscene report                   # raw complexity (no churn)
+obscene coupling                 # temporal coupling analysis
+obscene coupling --min-cochanges 1 --format table
 obscene --exclude "*.generated.*"
 obscene | jq '.hotspots[0]'     # pipe-friendly
 ```
@@ -73,6 +75,18 @@ Scores each file by `complexity × commits` over a time window, then assigns tie
 | **watch** | next 30% (50–80%) | Keep an eye on these |
 | **stable** | bottom 20% | Low risk |
+### `obscene coupling`
+Detects files that frequently change together in the same commit but live in different directories — Tornhill's "temporal coupling" analysis. Surfaces hidden structural dependencies that aren't visible in the code itself.
+Same-directory pairs are excluded (co-location is expected coupling). Mass commits touching >20 files are skipped (formatting changes, large refactors).
+```bash
+obscene coupling                          # default: min 2 shared commits
+obscene coupling --min-cochanges 1        # include single co-occurrences
+obscene coupling --format table --top 10  # human-readable, top 10
+```
 ### `obscene report`
 Per-file complexity without churn. Useful for raw complexity distribution.
@@ -84,22 +98,80 @@ Per-file complexity without churn. Useful for raw complexity distribution.
 | `--top <n>` | `20` | Limit results (0 = all) |
 | `--months <n>` | `3` | Churn window in months |
 | `--format <type>` | `json` | `json` or `table` |
+| `--min-cochanges <n>` | `2` | Minimum shared commits (coupling only) |
 | `--exclude <patterns...>` | — | Additional exclusion patterns |
+## Metrics
+Each hotspot row includes the following metrics:
+### Hotspot score (`Score`)
+`complexity × churn`. The core ranking metric — files that are both complex and frequently modified bubble to the top. See [Why churn × complexity?](#why-churn-x-complexity) for the research backing this approach.
+### Churn (`Churn`)
+Number of commits touching the file within the configured time window (default: 3 months). Measures how actively the file is being modified.
+### Cyclomatic complexity (`Cmplx`)
+Total cyclomatic complexity as reported by [scc](https://github.com/boyter/scc). Counts independent execution paths (branches, loops, conditions). Higher values mean more paths to test and more places for bugs to hide.
+### Complexity density (`Dens`)
+`complexity / lines of code`. Normalizes complexity by file size so a 50-line file with complexity 25 (density 0.50) stands out against a 500-line file with complexity 25 (density 0.05). Based on Harrison & Magel (1981), who found that complexity relative to code size is a stronger fault predictor than raw complexity alone.
+### Defects (`Dfcts`)
+Count of `fix:` conventional commits touching the file within the churn window. A proxy for historical defect rate — files that attract repeated fixes are more likely to contain latent bugs. Inspired by Moser, Pedrycz & Succi (2008), who showed that change-history metrics outperform static code metrics for defect prediction.
+### Defect density (`defectDensity`, JSON only)
+`defects / lines of code`. Not shown in table output due to column width, but available in JSON. Normalizes defect count by file size.
+### Nesting depth (`Nest`)
+Maximum indentation level (tab stops) in the file. Deep nesting correlates with high cognitive load and defect likelihood. Harrison & Magel (1981) identified nesting depth as a significant complexity contributor.
+### Unique authors (`Auth`)
+Number of distinct git authors who committed to the file within the churn window. Files touched by many authors may lack clear ownership and accumulate inconsistent patterns. Kamei et al. (2013) found developer count to be a significant predictor of defect-introducing changes.
+### Shared commits (`Shared`, coupling only)
+Number of commits where both files in a pair were modified together. The core ranking metric for temporal coupling — higher values indicate stronger hidden dependencies between files in different directories.
+### Coupling degree (`Degree`, coupling only)
+`shared commits / min(churn of file1, churn of file2) × 100`. What percentage of the less-active file's changes also involved the other file. A degree of 100% means every change to the less-active file also touched the other file.
+### Tier
+Cumulative score distribution bucket:
+| Tier | Range | Meaning |
+|------|-------|---------|
+| **danger** | top 50% of total score | Refactor candidates |
+| **watch** | next 30% (50–80%) | Keep an eye on these |
+| **stable** | bottom 20% | Low risk |
 ## Example output
 ```
-Hotspots — 3 months churn window | Total score: 35452
+Hotspots — 3 months churn window | Total score: 35,452
 Tiers: 3 danger, 13 watch, 194 stable
 Showing: 5 of 210
-File                                       Score      %  Churn  Cmplx  Density    Tier
-──────────────────────────────────────────────────────────────────────────────────────
-src/utils/effect-generator.ts               8296   23.4     68    122     0.12  DANGER
-src/services/game-engine.ts                 4284   12.1     51     84     0.09  DANGER
-src/components/board-renderer.tsx           2940    8.3     42     70     0.11  DANGER
-src/hooks/use-game-state.ts                 1320    3.7     33     40     0.08   WATCH
-src/utils/move-validator.ts                  945    2.7     27     35     0.06   WATCH
+File                                                 Score      %  Churn  Cmplx   Dens Dfcts  Nest  Auth    Tier
+────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+src/utils/effect-generator.ts                        8,296   23.4     68    122   0.12     5     6     4  DANGER
+src/services/game-engine.ts                          4,284   12.1     51     84   0.09     3     4     3  DANGER
+src/components/board-renderer.tsx                    2,940    8.3     42     70   0.11     2     5     3  DANGER
+src/hooks/use-game-state.ts                          1,320    3.7     33     40   0.08     1     3     2   WATCH
+src/utils/move-validator.ts                            945    2.7     27     35   0.06     0     2     1   WATCH
+Score=complexity×churn | Dens=complexity/code | Dfcts=fix commits | Nest=max indent depth | Auth=unique authors
+Docs: https://github.com/wbern/obscene#metrics
 ```
 ## Supported languages
@@ -110,6 +182,17 @@ Any language [scc supports](https://github.com/boyter/scc#features) — 200+ lan
 Test and generated files are excluded automatically: `*.test.*`, `*.spec.*`, `__tests__/`, `__mocks__/`, `*.stories.*`, `*.d.ts`, and similar patterns. scc also skips generated files by default (`--no-gen`).
+## Why churn x complexity?
+Files that are both complex and frequently modified are disproportionately likely to contain defects. This is backed by decades of empirical software engineering research:
+- **Nagappan & Ball (2005)** studied Windows Server 2003 and found that relative code churn measures predict system defect density with 89% accuracy. — [ICSE 2005](https://doi.org/10.1109/ICSE.2005.1553571)
+- **Moser, Pedrycz & Succi (2008)** compared change metrics against static code attributes on Eclipse and found that process metrics (churn, change frequency) outperform static code metrics for defect prediction. — [ICSE 2008](https://doi.org/10.1145/1368088.1368114)
+- **Shin, Meneely, Williams & Osborne (2011)** combined complexity, churn, and developer activity metrics to predict vulnerabilities in Mozilla Firefox and the Linux kernel. By flagging only 10.9% of files, the model identified 70.8% of known vulnerabilities. — [IEEE TSE](https://doi.org/10.1109/TSE.2010.55)
+- **Tornhill & Borg (2022)** analyzed 39 proprietary codebases and found that low-quality code (by their Code Health metric) contains 15x more defects and takes 124% longer to resolve. In their case studies, 4% of the codebase was responsible for 72% of all defects. — [ACM/IEEE TechDebt 2022](https://arxiv.org/abs/2203.04374)
+The general approach was popularized by Adam Tornhill's *Your Code as a Crime Scene* (2015), which applies forensic analysis techniques to version control history.
 ## Limitations
 - **Churn = commit count**, not lines changed. A one-line typo fix counts the same as a 500-line rewrite.
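
Reviewer note: the `Degree` formula added to the README in this release (`shared commits / min(churn of file1, churn of file2) × 100`) can be checked by hand with a small calculation. The following is a minimal sketch, not code from the package. The function name `couplingDegree` and the sample numbers are hypothetical; the rounding to one decimal place is assumed to mirror the `Math.round(count / minChurn * 1e3) / 10` expression in `package/dist/cli.js`.

```javascript
// Hypothetical helper illustrating the README formula:
// degree = shared commits / min(churn of file1, churn of file2) × 100
function couplingDegree(shared, churn1, churn2) {
  const minChurn = Math.min(churn1, churn2);
  // Guard against division by zero when a file has no recorded churn.
  if (minChurn <= 0) return 0;
  // Express as a percentage rounded to one decimal place.
  return Math.round((shared / minChurn) * 1000) / 10;
}

// Made-up sample: two files share 3 commits; one file changed 4 times,
// the other 12 times in the window.
const degree = couplingDegree(3, 4, 12);
console.log(degree); // 75
```

With these sample values, 3 of the less-active file's 4 changes also touched the other file, so the degree is 75%.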

package/dist/cli.js CHANGED

@@ -129,6 +129,82 @@ function getAuthors(months) {
   }
   return counts;
 }
+var MAX_FILES_PER_COMMIT = 20;
+function getCoChanges(months, excludes = []) {
+  const patterns = [...DEFAULT_EXCLUDES, ...excludes.map(globToRegex)];
+  let raw;
+  try {
+    raw = execSync(
+      `git log --since="${months} months ago" --format="COMMIT_SEP%n" --name-only`,
+      { maxBuffer: 50 * 1024 * 1024, stdio: ["pipe", "pipe", "pipe"] }
+    );
+  } catch {
+    throw new Error("Not a git repository or git is not installed.");
+  }
+  const cochanges = /* @__PURE__ */ new Map();
+  const commits = raw.toString().split("COMMIT_SEP\n");
+  for (const commit of commits) {
+    if (!commit.trim()) continue;
+    const seen = /* @__PURE__ */ new Set();
+    for (const line of commit.split("\n")) {
+      const trimmed = normalizePath(line.trim());
+      if (!trimmed) continue;
+      if (!isExcluded(trimmed, patterns)) {
+        seen.add(trimmed);
+      }
+    }
+    const files = [...seen];
+    if (files.length < 2 || files.length > MAX_FILES_PER_COMMIT) continue;
+    for (let i = 0; i < files.length; i++) {
+      for (let j = i + 1; j < files.length; j++) {
+        const [a, b] = files[i] < files[j] ? [files[i], files[j]] : [files[j], files[i]];
+        const dirA = a.includes("/") ? a.slice(0, a.lastIndexOf("/")) : "";
+        const dirB = b.includes("/") ? b.slice(0, b.lastIndexOf("/")) : "";
+        if (dirA === dirB) continue;
+        const key = `${a}\0${b}`;
+        cochanges.set(key, (cochanges.get(key) ?? 0) + 1);
+      }
+    }
+  }
+  return cochanges;
+}
+function computeCoupling(cochanges, churn, complexityMap, minCochanges) {
+  const entries = [];
+  for (const [key, count] of cochanges) {
+    if (count < minCochanges) continue;
+    const [file1, file2] = key.split("\0");
+    const minChurn = Math.min(churn.get(file1) ?? 0, churn.get(file2) ?? 0);
+    const degree = minChurn > 0 ? Math.round(count / minChurn * 1e3) / 10 : 0;
+    const totalComplexity = (complexityMap.get(file1) ?? 0) + (complexityMap.get(file2) ?? 0);
+    entries.push({
+      file1,
+      file2,
+      cochanges: count,
+      degree,
+      totalComplexity,
+      couplingScore: count,
+      percentOfTotal: 0,
+      tier: "stable"
+    });
+  }
+  entries.sort((a, b) => b.couplingScore - a.couplingScore);
+  const totalScore = entries.reduce((sum, e) => sum + e.couplingScore, 0);
+  if (totalScore === 0) return [];
+  let cumulative = 0;
+  for (const entry of entries) {
+    entry.percentOfTotal = Math.round(entry.couplingScore / totalScore * 1e3) / 10;
+    cumulative += entry.couplingScore;
+    const cumulativeShare = cumulative / totalScore;
+    if (cumulativeShare <= DANGER_CUMULATIVE) {
+      entry.tier = "danger";
+    } else if (cumulativeShare <= WATCH_CUMULATIVE) {
+      entry.tier = "watch";
+    } else {
+      entry.tier = "stable";
+    }
+  }
+  return entries;
+}
 function getNestingDepths(filePaths) {
   const depths = /* @__PURE__ */ new Map();
   for (const filePath of filePaths) {
@@ -231,23 +307,58 @@ function formatHotspotsTable(output) {
   lines.push(
     `Hotspots \u2014 ${churnWindow} churn window | Total score: ${totalScore.toLocaleString()}`
   );
-  lines.push(
-    `Tiers: ${tierCounts.danger} danger, ${tierCounts.watch} watch, ${tierCounts.stable} stable`
-  );
-  lines.push(`Showing: ${output.showing} of ${output.totalHotspots}`);
-  lines.push("");
+  pushTierSummary(lines, tierCounts, output.showing, output.totalHotspots);
   lines.push(
     padRight("File", 50) + padLeft("Score", 8) + padLeft("%", 7) + padLeft("Churn", 7) + padLeft("Cmplx", 7) + padLeft("Dens", 7) + padLeft("Dfcts", 6) + padLeft("Nest", 6) + padLeft("Auth", 6) + padLeft("Tier", 8)
   );
   lines.push("\u2500".repeat(112));
   for (const h of hotspots) {
-    const tierLabel = h.tier === "danger" ? "DANGER" : h.tier === "watch" ? "WATCH" : "stable";
     lines.push(
-      padRight(truncate(h.file, 48), 50) + padLeft(h.hotspotScore.toLocaleString(), 8) + padLeft(h.percentOfTotal.toFixed(1), 7) + padLeft(String(h.churn), 7) + padLeft(String(h.complexity), 7) + padLeft(h.complexityDensity.toFixed(2), 7) + padLeft(String(h.defects), 6) + padLeft(String(h.maxNesting), 6) + padLeft(String(h.authors), 6) + padLeft(tierLabel, 8)
+      padRight(truncate(h.file, 48), 50) + padLeft(h.hotspotScore.toLocaleString(), 8) + padLeft(h.percentOfTotal.toFixed(1), 7) + padLeft(String(h.churn), 7) + padLeft(String(h.complexity), 7) + padLeft(h.complexityDensity.toFixed(2), 7) + padLeft(String(h.defects), 6) + padLeft(String(h.maxNesting), 6) + padLeft(String(h.authors), 6) + padLeft(tierLabel(h.tier), 8)
+    );
+  }
+  lines.push("");
+  lines.push(
+    "Score=complexity\xD7churn | Dens=complexity/code | Dfcts=fix commits | Nest=max indent depth | Auth=unique authors"
+  );
+  lines.push("Docs: https://github.com/wbern/obscene#metrics");
+  return lines.join("\n");
+}
+function formatCouplingTable(output) {
+  const lines = [];
+  const { tierCounts, totalScore, churnWindow, couplings } = output;
+  lines.push(
+    `Coupling \u2014 ${churnWindow} churn window | Min shared: ${output.minCochanges} | Total score: ${totalScore.toLocaleString()}`
+  );
+  pushTierSummary(lines, tierCounts, output.showing, output.totalCouplings);
+  lines.push(
+    padRight("File 1", 35) + padRight("File 2", 35) + padLeft("Shared", 7) + padLeft("Degree", 8) + padLeft("Cmplx", 7) + padLeft("Tier", 8)
+  );
+  lines.push("\u2500".repeat(100));
+  for (const c of couplings) {
+    lines.push(
+      padRight(truncate(c.file1, 33), 35) + padRight(truncate(c.file2, 33), 35) + padLeft(String(c.cochanges), 7) + padLeft(`${c.degree.toFixed(1)}%`, 8) + padLeft(String(c.totalComplexity), 7) + padLeft(tierLabel(c.tier), 8)
     );
   }
+  lines.push("");
+  lines.push(
+    "Shared=co-changed commits | Degree=shared/min(churn)\xD7100 | Cmplx=sum of both files"
+  );
+  lines.push("Docs: https://github.com/wbern/obscene#metrics");
   return lines.join("\n");
 }
+function pushTierSummary(lines, tierCounts, showing, total) {
+  lines.push(
+    `Tiers: ${tierCounts.danger} danger, ${tierCounts.watch} watch, ${tierCounts.stable} stable`
+  );
+  lines.push(`Showing: ${showing} of ${total}`);
+  lines.push("");
+}
+function tierLabel(tier) {
+  if (tier === "danger") return "DANGER";
+  if (tier === "watch") return "WATCH";
+  return "stable";
+}
 function padRight(s, n) {
   return s.length >= n ? s : s + " ".repeat(n - s.length);
 }
@@ -260,7 +371,7 @@ function truncate(s, max) {
 // src/cli.ts
 var program = new Command();
-program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.2.0");
+program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.3.0");
 function addSharedOptions(cmd) {
   return cmd.option("--top <n>", "limit to top N entries (0 = all)", "20").option("--format <type>", "output format: json | table", "json").option(
     "--exclude <patterns...>",
@@ -285,6 +396,17 @@ addSharedOptions(
     exitWithError(err);
   }
 });
+addSharedOptions(
+  program.command("coupling").description(
+    "temporal coupling \u2014 files that change together across directories"
+  )
+).option("--months <n>", "churn window in months", "3").option("--min-cochanges <n>", "minimum shared commits to include", "2").action((opts) => {
+  try {
+    runCoupling(opts);
+  } catch (err) {
+    exitWithError(err);
+  }
+});
 function runReport(opts) {
   const top = parseInt(opts.top, 10);
   const files = runScc(opts.exclude);
@@ -353,6 +475,47 @@ function runHotspots(opts) {
 `);
   }
 }
+function runCoupling(opts) {
+  const top = parseInt(opts.top, 10);
+  const months = parseInt(opts.months, 10);
+  const minCochanges = parseInt(opts.minCochanges, 10);
+  const files = runScc(opts.exclude);
+  const churn = getChurn(months);
+  const cochanges = getCoChanges(months, opts.exclude);
+  const complexityMap = /* @__PURE__ */ new Map();
+  for (const f of files) {
+    complexityMap.set(f.file, f.complexity);
+  }
+  const couplings = computeCoupling(
+    cochanges,
+    churn,
+    complexityMap,
+    minCochanges
+  );
+  const limited = top > 0 ? couplings.slice(0, top) : couplings;
+  const tierCounts = { danger: 0, watch: 0, stable: 0 };
+  for (const c of couplings) {
+    tierCounts[c.tier]++;
+  }
+  const totalScore = couplings.reduce((sum, c) => sum + c.couplingScore, 0);
+  const output = {
+    generated: (/* @__PURE__ */ new Date()).toISOString(),
+    churnWindow: `${months} months`,
+    minCochanges,
+    totalScore,
+    tierCounts,
+    totalCouplings: couplings.length,
+    showing: limited.length,
+    couplings: limited
+  };
+  if (opts.format === "table") {
+    process.stdout.write(`${formatCouplingTable(output)}
+`);
+  } else {
+    process.stdout.write(`${JSON.stringify(output, null, 2)}
+`);
+  }
+}
 function exitWithError(err) {
   const message = err instanceof Error ? err.message : String(err);
   process.stderr.write(`Error: ${message}
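
Reviewer note: the pair-counting rules in `getCoChanges` (skip commits with fewer than 2 or more than 20 files, skip same-directory pairs, count each ordered pair once per commit) can be tried out without a git repository. The following is a simplified sketch under stated assumptions, not the package's implementation: `countCrossDirPairs`, `dirOf`, and the sample commits are hypothetical, commits are passed in as arrays of paths instead of parsed `git log` output, and exclusion patterns are left out.

```javascript
// Hypothetical in-memory version of the pair counting. Each commit is an
// array of changed file paths rather than raw `git log --name-only` text.
const MAX_FILES_PER_COMMIT = 20; // same cutoff as in the hunk above

// Directory part of a path, or "" for files at the repository root.
function dirOf(path) {
  return path.includes("/") ? path.slice(0, path.lastIndexOf("/")) : "";
}

function countCrossDirPairs(commits) {
  const counts = new Map();
  for (const commit of commits) {
    // De-duplicate and sort so each pair gets one stable key (a < b).
    const files = [...new Set(commit)].sort();
    // Skip single-file commits and mass commits.
    if (files.length < 2 || files.length > MAX_FILES_PER_COMMIT) continue;
    for (let i = 0; i < files.length; i++) {
      for (let j = i + 1; j < files.length; j++) {
        // Same-directory pairs are treated as expected coupling.
        if (dirOf(files[i]) === dirOf(files[j])) continue;
        const key = `${files[i]}\0${files[j]}`;
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}

// Made-up sample history with three commits.
const sample = [
  ["src/api/user.js", "src/db/user-model.js"],
  ["src/api/user.js", "src/db/user-model.js", "src/api/auth.js"],
  ["src/api/user.js", "src/api/auth.js"],
];
const pairs = countCrossDirPairs(sample);
console.log(pairs.get("src/api/user.js\0src/db/user-model.js")); // 2
```

In the sample, the third commit contributes nothing because both files live in `src/api`, which is the behavior the README describes for same-directory pairs.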

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@wbern/obscene",
-  "version": "0.2.0",
+  "version": "0.3.0",
   "description": "Identify hotspot files — complex code that changes frequently. Churn × complexity analysis for any git repo.",
   "type": "module",
   "bin": {