npm - @wbern/obscene - Versions diffs - 0.3.0 → 0.3.1 - Mend

@wbern/obscene 0.3.0 → 0.3.1


Files changed (3)

package/README.md CHANGED

@@ -77,9 +77,9 @@ Scores each file by `complexity × commits` over a time window, then assigns tie
 ### `obscene coupling`
-Detects files that frequently change together in the same commit but live in different directories — Tornhill's "temporal coupling" analysis. Surfaces hidden structural dependencies that aren't visible in the code itself.
+Detects files that frequently change together in the same commit but live in different directories — Tornhill's "temporal coupling" analysis from *Your Code as a Crime Scene* (2015). Surfaces hidden structural dependencies that aren't visible in imports or the module graph.
-Same-directory pairs are excluded (co-location is expected coupling). Mass commits touching >20 files are skipped (formatting changes, large refactors).
+Same-directory pairs are excluded (co-location is expected coupling). Mass commits touching >20 files are skipped (formatting changes, large refactors). See [Why temporal coupling?](#why-temporal-coupling) for the research backing this approach.
 ```bash
 obscene coupling                          # default: min 2 shared commits
@@ -103,49 +103,55 @@ Per-file complexity without churn. Useful for raw complexity distribution.
 ## Metrics
-Each hotspot row includes the following metrics:
+### Hotspot metrics
-### Hotspot score (`Score`)
+#### Hotspot score (`Score`)
 `complexity × churn`. The core ranking metric — files that are both complex and frequently modified bubble to the top. See [Why churn × complexity?](#why-churn-x-complexity) for the research backing this approach.
-### Churn (`Churn`)
+#### Churn (`Churn`)
 Number of commits touching the file within the configured time window (default: 3 months). Measures how actively the file is being modified.
-### Cyclomatic complexity (`Cmplx`)
+#### Cyclomatic complexity (`Cmplx`)
 Total cyclomatic complexity as reported by [scc](https://github.com/boyter/scc). Counts independent execution paths (branches, loops, conditions). Higher values mean more paths to test and more places for bugs to hide.
-### Complexity density (`Dens`)
+#### Complexity density (`Dens`)
 `complexity / lines of code`. Normalizes complexity by file size so a 50-line file with complexity 25 (density 0.50) stands out against a 500-line file with complexity 25 (density 0.05). Based on Harrison & Magel (1981), who found that complexity relative to code size is a stronger fault predictor than raw complexity alone.
-### Defects (`Dfcts`)
+#### Defects (`Dfcts`)
 Count of `fix:` conventional commits touching the file within the churn window. A proxy for historical defect rate — files that attract repeated fixes are more likely to contain latent bugs. Inspired by Moser, Pedrycz & Succi (2008), who showed that change-history metrics outperform static code metrics for defect prediction.
-### Defect density (`defectDensity`, JSON only)
+#### Defect density (`defectDensity`, JSON only)
 `defects / lines of code`. Not shown in table output due to column width, but available in JSON. Normalizes defect count by file size.
-### Nesting depth (`Nest`)
+#### Nesting depth (`Nest`)
 Maximum indentation level (tab stops) in the file. Deep nesting correlates with high cognitive load and defect likelihood. Harrison & Magel (1981) identified nesting depth as a significant complexity contributor.
-### Unique authors (`Auth`)
+#### Unique authors (`Auth`)
 Number of distinct git authors who committed to the file within the churn window. Files touched by many authors may lack clear ownership and accumulate inconsistent patterns. Kamei et al. (2013) found developer count to be a significant predictor of defect-introducing changes.
-### Shared commits (`Shared`, coupling only)
+### Coupling metrics
-Number of commits where both files in a pair were modified together. The core ranking metric for temporal coupling — higher values indicate stronger hidden dependencies between files in different directories.
+#### Shared commits (`Shared`)
-### Coupling degree (`Degree`, coupling only)
+Number of commits where both files in a pair were modified together. The core ranking metric for temporal coupling — higher values indicate stronger hidden dependencies between files in different directories. Ball, Kim, Porter & Siy (1997) demonstrated that co-change relationships reveal design dependencies that static analysis misses.
-`shared commits / min(churn of file1, churn of file2) × 100`. What percentage of the less-active file's changes also involved the other file. A degree of 100% means every change to the less-active file also touched the other file.
+#### Coupling degree (`Degree`)
-### Tier
+`shared commits / min(churn of file1, churn of file2) × 100`. What percentage of the less-active file's changes also involved the other file. A degree of 100% means every change to the less-active file also touched the other file. This normalization follows D'Ambros, Lanza & Lungu (2009), who showed that relative coupling measures provide more stable results than raw co-change counts across projects of different sizes.
+#### Combined complexity (`Cmplx`)
+Sum of cyclomatic complexity of both files in the pair. Highlights coupled pairs where the involved code is also complex — the combination of hidden dependency and high complexity compounds maintenance risk.
+#### Tier
 Cumulative score distribution bucket:
@@ -174,6 +180,25 @@ Score=complexity×churn | Dens=complexity/code | Dfcts=fix commits | Nest=max in
 Docs: https://github.com/wbern/obscene#metrics
 ```
+### Coupling example
+```
+Coupling — 6 months churn window | Min shared: 3 | Total score: 91
+Tiers: 10 danger, 7 watch, 7 stable
+Showing: 5 of 24
+File 1                             File 2                              Shared  Degree  Cmplx    Tier
+────────────────────────────────────────────────────────────────────────────────────────────────────
+…ePlayer/hooks/useChessEffects.ts  src/utils/effect-generator.ts            6   46.2%    261  DANGER
+…ePlayer/hooks/useChessEffects.ts  src/utils/pgn-types.ts                   6   50.0%    121  DANGER
+src/test/pgn-fixtures.ts           src/utils/pgn-parser.server.ts           5   71.4%      3  DANGER
+src/test/pgn-fixtures.ts           src/utils/effect-generator.ts            4   57.1%    145  DANGER
+src/test/pgn-fixtures.ts           src/utils/pgn-types.ts                   4   57.1%      5  DANGER
+Shared=co-changed commits | Degree=shared/min(churn)×100 | Cmplx=sum of both files
+Docs: https://github.com/wbern/obscene#metrics
+```
 ## Supported languages
 Any language [scc supports](https://github.com/boyter/scc#features) — 200+ languages including C, C++, Go, Java, JavaScript, TypeScript, Python, Rust, Ruby, PHP, Swift, Kotlin, and many more. No configuration needed; scc auto-detects languages from file extensions.
@@ -193,14 +218,32 @@ Files that are both complex and frequently modified are disproportionately likel
 The general approach was popularized by Adam Tornhill's *Your Code as a Crime Scene* (2015), which applies forensic analysis techniques to version control history.
+## Why temporal coupling?
+Files that change together but live in different directories reveal implicit dependencies that the module graph doesn't capture. These hidden couplings are a maintenance hazard: a developer modifying one file doesn't know they also need to update the other, leading to bugs that only surface later.
+- **Ball, Kim, Porter & Siy (1997)** pioneered co-change analysis and showed that version control history surfaces design relationships invisible to static analysis. — [ICSE 1997 Workshop](https://www.researchgate.net/publication/2791666_If_Your_Version_Control_System_Could_Talk)
+- **D'Ambros, Lanza & Lungu (2009)** developed the Evolution Radar for visualizing logical coupling at both file and module level, showing how evolutionary coupling reveals architectural decay. The normalized approach (coupling relative to total changes) provides more stable measures across projects of different sizes. — [IEEE TSE](https://doi.org/10.1109/TSE.2009.17)
+- **Tornhill (2015)** popularized temporal coupling analysis in *Your Code as a Crime Scene*, demonstrating how co-change patterns reveal "surprise dependencies" — files that should logically be independent but can't be changed separately in practice. His tooling (Code Maat) uses the same commit co-occurrence approach.
+- **Cataldo, Mockus, Roberts & Herbsleb (2009)** analyzed both syntactic and logical dependencies across two large systems and found that logical (co-change) dependencies have a significant independent effect on failure proneness. When developers are unaware of these hidden couplings, defects increase. — [IEEE TSE](https://doi.org/10.1109/TSE.2009.42)
 ## Limitations
+### General
 - **Churn = commit count**, not lines changed. A one-line typo fix counts the same as a 500-line rewrite.
 - **Per-file granularity only.** A 1000-line file with many small functions scores higher than it probably should. No function-level breakdown.
 - **Must be run inside a git repo.** Churn data comes from `git log`.
 - **Only analyzes files that currently exist.** Deleted files don't appear, even if they churned heavily before removal.
 - **Tier thresholds are fixed** (50/80 cumulative %). Not configurable yet.
+### Coupling-specific
+- **Same-directory exclusion is a heuristic.** Files in the same directory that are unexpectedly coupled won't be surfaced. The assumption is that co-located files are *expected* to change together.
+- **Mass commit threshold (>20 files) is hardcoded.** Commits touching many files are skipped to avoid noise from formatting changes and large refactors, but legitimate large features that touch many files across directories are also excluded.
+- **Degree uses unfiltered churn.** The denominator (`min(churn)`) counts all commits to a file, including single-file commits. This means degree can understate coupling when a file has high solo churn.
+- **Squash merges collapse coupling signal.** If a branch with 10 separate commits is squash-merged into one, all co-changes within that branch become a single co-occurrence.
 ## License
 MIT
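
The coupling rules described in the README changes above (count co-changes per commit, skip commits touching more than 20 files, exclude same-directory pairs, then compute `shared / min(churn) × 100`) can be illustrated with a minimal sketch. This is not the package's actual implementation: the function name `findCouplings`, the input shape (one array of file paths per commit), and the choice to count skipped mass commits toward churn are all assumptions made for illustration.

```javascript
// Hypothetical sketch of temporal coupling detection; not taken from the package source.
const MAX_FILES_PER_COMMIT = 20; // README: commits touching >20 files are skipped

// Directory part of a git path ("" for files at the repo root).
const dirOf = (file) => file.slice(0, Math.max(0, file.lastIndexOf("/")));

function findCouplings(commits, minShared = 2) {
  const churn = new Map();  // file -> number of commits touching it
  const shared = new Map(); // "file1\0file2" -> number of commits touching both

  for (const files of commits) {
    const unique = [...new Set(files)].sort();

    // Assumption: churn is unfiltered, so every commit counts, even a mass commit.
    for (const f of unique) churn.set(f, (churn.get(f) ?? 0) + 1);

    if (unique.length > MAX_FILES_PER_COMMIT) continue; // skip mass commits

    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        // Same-directory pairs are treated as expected coupling and excluded.
        if (dirOf(unique[i]) === dirOf(unique[j])) continue;
        const key = unique[i] + "\0" + unique[j];
        shared.set(key, (shared.get(key) ?? 0) + 1);
      }
    }
  }

  const pairs = [];
  for (const [key, count] of shared) {
    if (count < minShared) continue;
    const [file1, file2] = key.split("\0");
    // Degree: share of the less-active file's commits that also touched the other file.
    const degree = (count / Math.min(churn.get(file1), churn.get(file2))) * 100;
    pairs.push({ file1, file2, shared: count, degree });
  }
  return pairs.sort((a, b) => b.shared - a.shared);
}

// Usage with made-up paths: board.ts has churn 3, rules.ts has churn 4, 2 shared commits.
const demoCommits = [
  ["src/ui/board.ts", "src/utils/rules.ts"],
  ["src/ui/board.ts", "src/utils/rules.ts"],
  ["src/ui/board.ts"],
  ["src/utils/rules.ts", "src/utils/helpers.ts"], // same directory: pair ignored
  ["src/utils/rules.ts"],
];
const demoPairs = findCouplings(demoCommits);
console.log(demoPairs[0].shared, demoPairs[0].degree.toFixed(1) + "%");
```

With this input the only reported pair is `src/ui/board.ts` and `src/utils/rules.ts`, with 2 shared commits and a degree of 2 / min(3, 4) × 100 ≈ 66.7%. The same arithmetic is consistent with the README's coupling example: 5 shared commits at 71.4% would correspond to a minimum churn of 7.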

package/dist/cli.js CHANGED

@@ -299,6 +299,14 @@ function formatReportTable(output) {
       padRight(truncate(f.file, 58), 60) + padLeft(String(f.code), 8) + padLeft(String(f.complexity), 12) + padLeft(f.complexityDensity.toFixed(2), 9) + padLeft(String(f.comments), 10)
     );
   }
+  lines.push("");
+  lines.push(
+    "Complexity=cyclomatic branch/loop count | Density=complexity/code | Comments=comment lines"
+  );
+  lines.push(
+    "High complexity is expected for parsers, state machines, and business logic. Compare density across files, not raw values."
+  );
+  lines.push("Docs: https://github.com/wbern/obscene#metrics");
   return lines.join("\n");
 }
 function formatHotspotsTable(output) {
@@ -321,6 +329,12 @@ function formatHotspotsTable(output) {
   lines.push(
     "Score=complexity\xD7churn | Dens=complexity/code | Dfcts=fix commits | Nest=max indent depth | Auth=unique authors"
   );
+  lines.push(
+    "Tiers are relative to THIS codebase, not absolute quality grades. A 'danger' file in a clean codebase may be fine."
+  );
+  lines.push(
+    "High scores flag review candidates, not bad code \u2014 stable complex files (parsers, engines) score high naturally."
+  );
   lines.push("Docs: https://github.com/wbern/obscene#metrics");
   return lines.join("\n");
 }
@@ -344,6 +358,12 @@ function formatCouplingTable(output) {
   lines.push(
     "Shared=co-changed commits | Degree=shared/min(churn)\xD7100 | Cmplx=sum of both files"
   );
+  lines.push(
+    "Tiers are relative to THIS codebase, not absolute quality grades. High coupling may be intentional and fine."
+  );
+  lines.push(
+    "Same-directory pairs excluded. Commits touching >20 files skipped. Only cross-directory dependencies shown."
+  );
   lines.push("Docs: https://github.com/wbern/obscene#metrics");
   return lines.join("\n");
 }
@@ -371,7 +391,27 @@ function truncate(s, max) {
 // src/cli.ts
 var program = new Command();
-program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.3.0");
+program.name("obscene").description("Identify hotspot files \u2014 complex code that changes frequently").version("0.3.1");
+var REPORT_GUIDE = {
+  complexity: "Cyclomatic complexity (branch/loop count). NOT a quality judgment \u2014 a 500-line parser will naturally score high. Compare density, not raw values.",
+  complexityDensity: "Complexity per line of code. Normalizes for file size. >0.25 suggests dense logic worth reviewing; <0.10 is typical for straightforward code.",
+  comments: "Comment line count. Low comments in high-density files may indicate under-documented logic. High comments alone is not a problem."
+};
+var HOTSPOTS_GUIDE = {
+  hotspotScore: "complexity \xD7 churn. Ranks files by combined risk: complex code that changes often. High score does NOT mean bad code \u2014 stable high-complexity files (parsers, engines) are fine. Focus on files where score is rising over time.",
+  churn: "Commit count in the time window. High churn alone is neutral \u2014 active development is normal. It becomes a signal when combined with high complexity.",
+  tier: "Relative ranking within THIS codebase (top 50% = danger, next 30% = watch, bottom 20% = stable). NOT an absolute quality grade. A 'danger' file in a clean codebase may be perfectly fine. Compare across runs to spot trends.",
+  defects: "Count of fix: conventional commits. A proxy for bug frequency \u2014 0 does not mean bug-free, and >0 does not mean bad code. Useful for spotting files that attract repeated fixes.",
+  defectDensity: "Fix commits per line of code. Normalizes defect count by file size. Only meaningful with conventional commits (fix: prefix).",
+  maxNesting: "Deepest indentation level. >6 suggests complex control flow worth simplifying. Language-dependent \u2014 Python files naturally nest less than C++.",
+  authors: "Unique committers in the time window. High author count may indicate unclear ownership. Low count is normal for specialized code. Neither value is inherently good or bad."
+};
+var COUPLING_GUIDE = {
+  cochanges: "Times both files appeared in the same commit. Higher values suggest a dependency between the files. Same-directory pairs are excluded \u2014 only cross-directory pairs are shown.",
+  degree: "Percentage: shared commits / min(churn of file1, file2) \xD7 100. Shows how tightly coupled the pair is relative to their individual change rates. 100% means every change to the less-active file also touched the other.",
+  totalComplexity: "Sum of both files' cyclomatic complexity. Highlights coupled pairs where the involved code is also complex \u2014 hidden dependency + high complexity compounds maintenance risk.",
+  tier: "Relative ranking within THIS codebase's coupling pairs (top 50% = danger, next 30% = watch, bottom 20% = stable). NOT an absolute quality grade. 'danger' means this pair co-changes more than most \u2014 it may be intentional and fine."
+};
 function addSharedOptions(cmd) {
   return cmd.option("--top <n>", "limit to top N entries (0 = all)", "20").option("--format <type>", "output format: json | table", "json").option(
     "--exclude <patterns...>",
@@ -421,6 +461,7 @@ function runReport(opts) {
   const limited = top > 0 ? files.slice(0, top) : files;
   const output = {
     generated: (/* @__PURE__ */ new Date()).toISOString(),
+    guide: REPORT_GUIDE,
     summary: {
       ...totals,
       fileCount: files.length,
@@ -460,6 +501,7 @@ function runHotspots(opts) {
   const totalScore = hotspots.reduce((sum, h) => sum + h.hotspotScore, 0);
   const output = {
     generated: (/* @__PURE__ */ new Date()).toISOString(),
+    guide: HOTSPOTS_GUIDE,
     churnWindow: `${months} months`,
     totalScore,
     tierCounts,
@@ -500,6 +542,7 @@ function runCoupling(opts) {
   const totalScore = couplings.reduce((sum, c) => sum + c.couplingScore, 0);
   const output = {
     generated: (/* @__PURE__ */ new Date()).toISOString(),
+    guide: COUPLING_GUIDE,
     churnWindow: `${months} months`,
     minCochanges,
     totalScore,
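
The `tier` guide strings added in this diff describe a relative ranking (top 50% = danger, next 30% = watch, bottom 20% = stable), and the README describes the tier as a cumulative score distribution bucket with fixed 50/80 thresholds. The diff does not show the tiering code itself, so the following is only a sketch of one way such buckets could be assigned. The function name `assignTiers`, the generic `score` field, and the boundary handling (an item's tier is decided by the cumulative share accumulated before it) are assumptions.

```javascript
// Hypothetical sketch of cumulative-score tiering; the package's real logic is not shown in this diff.
function assignTiers(items, dangerCut = 0.5, watchCut = 0.8) {
  // Rank by score, highest first, without mutating the caller's array.
  const sorted = [...items].sort((a, b) => b.score - a.score);
  const total = sorted.reduce((sum, item) => sum + item.score, 0);

  let before = 0; // cumulative score of all higher-ranked items
  return sorted.map((item) => {
    // Assumption: the share accumulated BEFORE the item decides its tier,
    // so the top-ranked item always lands in "danger" when total > 0.
    const share = total > 0 ? before / total : 1;
    before += item.score;
    const tier = share < dangerCut ? "danger" : share < watchCut ? "watch" : "stable";
    return { ...item, tier };
  });
}

// Usage with made-up scores that sum to 100.
const tiered = assignTiers([
  { file: "a.ts", score: 10 },
  { file: "b.ts", score: 50 },
  { file: "c.ts", score: 30 },
  { file: "d.ts", score: 10 },
]);
console.log(tiered.map((t) => t.file + ":" + t.tier).join(" "));
```

Because the buckets are computed from each run's own score total, a sketch like this makes the point of the new guide text concrete: a file's tier depends on the rest of the codebase, not on an absolute threshold.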

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@wbern/obscene",
-  "version": "0.3.0",
+  "version": "0.3.1",
   "description": "Identify hotspot files — complex code that changes frequently. Churn × complexity analysis for any git repo.",
   "type": "module",
   "bin": {