npm - claude-attribution - Versions diffs - 1.1.3 → 1.2.1 - Mend

claude-attribution 1.1.3 → 1.2.1

Files changed (12)

package/README.md +92 -19
package/package.json +1 -1
package/src/__tests__/minimap.test.ts +225 -0
package/src/attribution/commit.ts +220 -43
package/src/attribution/minimap.ts +178 -0
package/src/cli.ts +4 -0
package/src/commands/init.ts +165 -0
package/src/export/pr-summary.ts +18 -0
package/src/metrics/collect.ts +59 -2
package/src/setup/install.ts +21 -0
package/src/setup/templates/pr-metrics-workflow.yml +3 -1
package/src/setup/uninstall.ts +9 -0

package/README.md CHANGED

@@ -6,13 +6,14 @@
 > ```bash
 > npm install -g claude-attribution
 > claude-attribution install ~/Code/your-repo
-> git add .claude/settings.json .github/workflows/claude-attribution-pr.yml .gitignore && git commit -m "chore: install claude-attribution hooks"
+> claude-attribution init --ai    # repo built with Claude Code — or --human if human/mixed
+> git add .claude/settings.json .gitignore && git commit -m "chore: install claude-attribution hooks"
 > ```
-> From then on, just work normally. After each `git commit` you'll see a one-line attribution summary in your terminal. When you open a PR — whether Claude creates it or you run `gh pr create` yourself — metrics are injected into the PR body automatically, no command needed.
+> From then on, just work normally. After each `git commit` you'll see a one-line attribution summary in your terminal. When you're ready to open a PR, run `/pr` in Claude Code (or `claude-attribution pr "feat: your title"`) — it fills in the metrics automatically, no copy-paste needed.
 >
-> **Using Copilot?** The tool still works for tracking Claude usage alongside Copilot. Copilot line-level attribution isn't supported yet — for Copilot-specific stats, use the GitHub Copilot usage dashboard under your organization's Settings → Copilot. Both tools' org-level data flows into the VP Datadog dashboard automatically on every PR merge.
+> **Using Copilot?** The tool still works for tracking Claude usage alongside Copilot. Copilot line-level attribution isn't supported yet — for Copilot-specific stats, use the GitHub Copilot usage dashboard. Both tools' org-level data flows into the VP Datadog dashboard automatically on every PR merge.
 >
-> **Requirements:** [Bun](https://bun.sh) (preferred) or Node 18+, and `gh` (GitHub CLI) authenticated.
+> **Requirements:** [Bun](https://bun.sh) (preferred) or Node 18+, and `gh` (GitHub CLI) authenticated for the `/pr` command.
 ---
@@ -46,17 +47,45 @@ bun install
 ### Install into a repo (per repo, per developer)
-**npm install:**
+**Step 1 — Run the installer:**
 ```bash
+# npm install:
 claude-attribution install ~/Code/your-repo
+# clone install:
+bun ~/Code/claude-attribution/src/setup/install.ts ~/Code/your-repo
 ```
-**Clone install:**
+**Step 2 — Declare your attribution baseline (`init`):**
+This step tells the tool whether the codebase was written by Claude or by humans before this install. It only needs to be run once.
 ```bash
-bun ~/Code/claude-attribution/src/setup/install.ts ~/Code/your-repo
+# Repo was built entirely with Claude Code — mark all files as AI-written:
+claude-attribution init --ai
+# Repo is human-written, or a mix — confirm the default (no note written):
+claude-attribution init --human
+# Not sure? Run with no flag — same as --human, prints a confirmation:
+claude-attribution init
 ```
-The installer makes six changes to the target repo:
+> **Why this matters:** Without `init`, the codebase-wide AI% starts at 0% and grows only from new commits. If your repo is all Claude Code, run `init --ai` now or the metrics will be misleading until the entire codebase has been re-committed line by line.
+**Step 3 — Commit and push:**
+```bash
+git add .claude/settings.json .github/workflows/claude-attribution-pr.yml .gitignore
+git commit -m "chore: install claude-attribution hooks"
+git push
+# If you ran init --ai, also push the minimap notes:
+git push origin refs/notes/claude-attribution-map
+```
+The installer makes the following changes to the target repo:
 **`.claude/settings.json`** — merges six Claude Code hooks:
@@ -70,7 +99,7 @@ The installer makes six changes to the target repo:
 **`.git/hooks/post-commit`** — runs attribution after every commit. If the repo already has a `post-commit` hook from Husky or another tool, the call is appended rather than replacing it. For Lefthook repos, the installer prints the config snippet to add manually.
-**`remote.origin.push` refspec** — the installer runs `git config --add remote.origin.push refs/notes/claude-attribution:refs/notes/claude-attribution` so that `git push` (without an explicit refspec) automatically includes attribution notes. No pre-push hook is installed — a hook that pushes notes concurrently with the main push causes SSH connection conflicts on GitHub.
+**`remote.origin.push` refspecs** — the installer adds two refspecs so that `git push` (without an explicit refspec) automatically includes both notes refs: `refs/notes/claude-attribution` (per-commit attribution) and `refs/notes/claude-attribution-map` (cumulative minimap). No pre-push hook is installed — a hook that pushes notes concurrently with the main push causes SSH connection conflicts on GitHub.
 **`.github/workflows/claude-attribution-pr.yml`** — GitHub Actions workflow that fires on every PR open and push. Injects metrics into the PR body automatically for PRs created outside Claude (Copilot, manual `gh pr create`, GitHub UI). Skips injection if the local `post-bash` hook already injected metrics on `opened`; always updates on `synchronize` (new commits).
@@ -80,14 +109,34 @@ The installer makes six changes to the target repo:
 **`.gitignore`** — adds `.claude/logs/` so tool usage logs don't end up in version control.
-### Committing the settings change
+### Attribution minimap — detailed options
+The attribution minimap tracks cumulative AI% across the entire codebase, carrying attribution forward across sessions and developers. For new commits it is updated automatically. For the history that predates the install, you declare the baseline once using `claude-attribution init`.
-The `.claude/settings.json` and workflow changes should be committed so all developers get the hooks and all PRs get metrics automatically.
+There are three options depending on the history of your repo:
+**Option 1 — Repo was built entirely with Claude Code (`--ai`):**
 ```bash
-# After running the installer:
-git add .claude/settings.json .github/workflows/claude-attribution-pr.yml .gitignore
-git commit -m "chore: install claude-attribution hooks"
+claude-attribution init --ai
+git push origin refs/notes/claude-attribution-map
+```
+Marks every currently tracked file as AI-written at HEAD. After this, PR metrics will show:
+```
+Codebase: ~100% AI (4150 / 4150 lines)
+This PR: 184 lines changed (4% of codebase) · 77% Claude edits · 142 AI lines
+```
+**Option 2 — Repo is human-written, or a mix (`--human` / no flag):**
+```bash
+claude-attribution init          # or: claude-attribution init --human
+```
+Prints a confirmation that the default human baseline is already in effect — no note is written. Attribution accumulates naturally as Claude writes new code going forward.
+**Option 3 — Already had claude-attribution installed before v1.2.0:**
+The minimap feature didn't exist before v1.2.0 — per-session notes are intact but the codebase-wide signal was missing. Run `init --ai` now if the repo is all Claude Code, or do nothing (human default) if it's a mix:
+```bash
+claude-attribution init --ai     # only if repo was built 100% with Claude Code
+git push origin refs/notes/claude-attribution-map
 ```
 ### Re-installing after moving this directory
@@ -150,12 +199,13 @@ Metrics are injected automatically — no command needed.
 **On every new push to an open PR**: the workflow fires on `synchronize` and updates the attribution percentages to reflect new commits.
-The metrics block looks like:
+The metrics block looks like (when the cumulative minimap exists):
 ```markdown
 ## Claude Code Metrics
-**AI contribution: ~77%** (142 of 184 committed lines) · Active: 8m
+**Codebase: ~77% AI** (3200 / 4150 lines)
+**This PR:** 184 lines changed (4% of codebase) · 77% Claude edits · 142 AI lines · Active: 8m
 | Model | Calls | Input | Output | Cache |
 |-------|-------|-------|--------|-------|
@@ -172,6 +222,12 @@ The metrics block looks like:
 </details>
 ```
+Before running `init --ai` (or on a fresh install with no minimap), the headline falls back to the session-only view:
+```markdown
+**AI contribution: ~77%** (142 of 184 committed lines) · Active: 8m
+```
 #### Manual option
 If you need to create a PR with metrics outside of Claude, use the `/pr` slash command or CLI directly:
@@ -238,11 +294,17 @@ Session IDs are shown in `.claude/logs/tool-usage.jsonl`.
 Attribution results are stored as git notes and queryable directly:
 ```bash
-# View attribution for the last commit
+# View per-commit attribution for the last commit
 git notes --ref=claude-attribution show HEAD
 # List all attributed commits in the repo
 git notes --ref=claude-attribution list
+# View the cumulative codebase minimap (all files, AI% totals)
+git notes --ref=refs/notes/claude-attribution-map show HEAD | jq .totals
+# Check codebase AI% quickly
+git notes --ref=refs/notes/claude-attribution-map show HEAD | jq .totals.pctAi
 ```
 Example output:
@@ -336,7 +398,7 @@ Running `/start` scopes both tool/token metrics AND attribution data to commits
 - **MIXED detection is positional (best-effort)** — MIXED is detected by checking whether Claude's i-th line was changed in the committed file. If a human inserts or deletes lines above position `i`, the commit's line positions shift while the after-snapshot's positions don't, causing false MIXED classifications. MIXED is most accurate when human edits are small in-place tweaks (e.g., changing a value on a line Claude wrote) rather than bulk insertions or deletions.
-- **Sessions without checkpoints** — commits made outside an active Claude session (no `current-session` file, or checkpoints already cleaned up) are attributed 100% HUMAN. This is correct.
+- **Sessions without checkpoints** — commits made outside an active Claude session (no `current-session` file, or checkpoints already cleaned up) are attributed 100% HUMAN for that commit's per-session stats. However, the cumulative minimap carries AI attribution forward for untouched lines from previous sessions — so the codebase-wide AI% is not lost when another developer commits without hooks installed.
 - **`git commit --amend`** — when a commit is amended, the original SHA is replaced but the old git note (pointing to the now-orphaned SHA) remains in the notes object store. `/metrics` reads notes across the entire branch, so an amended commit's lines may appear twice. Avoid amending published commits; if you do, run `/metrics` knowing totals may be slightly inflated for that commit's files.
@@ -377,7 +439,9 @@ Attribution data is pushed to Datadog automatically on every PR merge via GitHub
 | `claude_attribution.ai_lines` | Lines written by Claude and committed unchanged |
 | `claude_attribution.human_lines` | Lines written or left unchanged by the developer |
 | `claude_attribution.total_lines` | Total committed lines in the PR |
-| `claude_attribution.pct_ai` | Percentage of lines attributed to Claude |
+| `claude_attribution.pct_ai` | Percentage of lines attributed to Claude (this PR) |
+| `claude_attribution.codebase_pct_ai` | Cumulative codebase-wide AI% at PR merge time (requires minimap) |
+| `claude_attribution.codebase_total_lines` | Total codebase lines tracked in the minimap |
 | `github_copilot.acceptance_rate` | Org-level Copilot suggestion acceptance rate |
 | `github_copilot.lines_accepted` | Copilot lines accepted org-wide |
 | `github_copilot.lines_suggested` | Copilot lines suggested org-wide |
@@ -447,6 +511,15 @@ The post-commit hook may not have run. Check:
 Most likely the commit happened in a different terminal session from where Claude is running. The `current-session` file in `.claude/attribution-state/` needs to match the session that created the checkpoints. Start Claude, make your changes, and commit without switching sessions.
+**Codebase AI% shows 0% (or very low) even though the repo is all Claude Code**
+The cumulative minimap hasn't been initialized yet. Run once to backfill:
+```bash
+claude-attribution init --ai
+git push origin refs/notes/claude-attribution-map
+```
+See [Backfilling the attribution minimap](#backfilling-the-attribution-minimap) for details.
 **The hook is slowing down commits**
 The post-commit hook runs attribution after the commit is already recorded — it can't block the commit. If you see a pause, it's likely runtime startup time on a cold start (~100ms for Bun, ~300ms for npx tsx). Subsequent runs use the cached runtime from `/tmp/claude-attribution-runtime` and are faster.
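
To make the headline arithmetic in the README's metrics block concrete, the sketch below reproduces the example figures (3200 / 4150 codebase lines, 142 of 184 PR lines). It is a minimal illustration, not the code in `pr-summary.ts`, which this diff view does not show in full: the names `HeadlineInput`, `pct`, and `formatHeadline` are hypothetical, and rounding with `Math.round` is an assumption chosen to match how `pctAi` is rounded in `minimap.ts`. The `Active: 8m` suffix is omitted.

```typescript
// Hypothetical input shape for illustration; the real pr-summary.ts may differ.
interface HeadlineInput {
  prAiLines: number;
  prTotalLines: number;
  // Present only when a cumulative minimap note exists.
  codebase?: { ai: number; total: number };
}

// Rounded percentage, guarding against division by zero.
function pct(part: number, whole: number): number {
  return whole > 0 ? Math.round((part / whole) * 100) : 0;
}

// Builds the headline lines: two lines with a minimap, one line without.
function formatHeadline(input: HeadlineInput): string[] {
  const prPct = pct(input.prAiLines, input.prTotalLines);
  if (!input.codebase) {
    // Session-only fallback view.
    return [
      `**AI contribution: ~${prPct}%** (${input.prAiLines} of ${input.prTotalLines} committed lines)`,
    ];
  }
  const { ai, total } = input.codebase;
  const prShare = pct(input.prTotalLines, total);
  return [
    `**Codebase: ~${pct(ai, total)}% AI** (${ai} / ${total} lines)`,
    `**This PR:** ${input.prTotalLines} lines changed (${prShare}% of codebase) · ${prPct}% Claude edits · ${input.prAiLines} AI lines`,
  ];
}

const withMinimap = formatHeadline({
  prAiLines: 142,
  prTotalLines: 184,
  codebase: { ai: 3200, total: 4150 },
});
const sessionOnly = formatHeadline({ prAiLines: 142, prTotalLines: 184 });

console.log(withMinimap.join("\n"));
console.log(sessionOnly.join("\n"));
```

The two percentages in the "This PR" line have different denominators: the share of the codebase is PR lines over total codebase lines (184 / 4150), while "Claude edits" is AI lines over PR lines (142 / 184).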

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "claude-attribution",
-	"version": "1.1.3",
+	"version": "1.2.1",
 	"description": "AI code attribution tracking for Claude Code sessions — checkpoint-based line diff approach",
 	"type": "module",
 	"bin": {

package/src/__tests__/minimap.test.ts ADDED

@@ -0,0 +1,225 @@
+import { test, expect, describe } from "bun:test";
+import {
+	hashSetFromString,
+	hashSetToString,
+	computeMinimapFile,
+} from "../attribution/minimap.ts";
+import { hashLine } from "../attribution/differ.ts";
+// ---------------------------------------------------------------------------
+// hashSetFromString / hashSetToString round-trip
+// ---------------------------------------------------------------------------
+describe("hashSetFromString", () => {
+	test("empty string returns empty set", () => {
+		expect(hashSetFromString("")).toEqual(new Set());
+	});
+	test("parses single 16-char hash", () => {
+		const hash = "abcd1234abcd1234";
+		expect(hashSetFromString(hash)).toEqual(new Set([hash]));
+	});
+	test("parses two concatenated hashes", () => {
+		const h1 = "aaaa1111aaaa1111";
+		const h2 = "bbbb2222bbbb2222";
+		expect(hashSetFromString(h1 + h2)).toEqual(new Set([h1, h2]));
+	});
+	test("ignores trailing partial hash (< 16 chars)", () => {
+		const full = "aaaa1111aaaa1111";
+		expect(hashSetFromString(full + "tooshort")).toEqual(new Set([full]));
+	});
+});
+describe("hashSetToString", () => {
+	test("empty set returns empty string", () => {
+		expect(hashSetToString(new Set())).toBe("");
+	});
+	test("single hash round-trips", () => {
+		const hash = "abcd1234abcd1234";
+		expect(hashSetToString(new Set([hash]))).toBe(hash);
+	});
+});
+describe("hashSetFromString / hashSetToString round-trip", () => {
+	test("multiple hashes survive encode-decode", () => {
+		const hashes = new Set([
+			"aaaa1111aaaa1111",
+			"bbbb2222bbbb2222",
+			"cccc3333cccc3333",
+		]);
+		const encoded = hashSetToString(hashes);
+		expect(hashSetFromString(encoded)).toEqual(hashes);
+	});
+});
+// ---------------------------------------------------------------------------
+// computeMinimapFile
+// ---------------------------------------------------------------------------
+describe("computeMinimapFile", () => {
+	test("new AI line (in currentAiHashes) → ai=1, in ai_hashes", () => {
+		const line = "const x = 1;";
+		const hash = hashLine(line);
+		const result = computeMinimapFile(
+			"foo.ts",
+			[line],
+			new Set([hash]),
+			new Set(),
+		);
+		expect(result.ai).toBe(1);
+		expect(result.human).toBe(0);
+		expect(result.total).toBe(1);
+		expect(result.pctAi).toBe(100);
+		expect(hashSetFromString(result.ai_hashes).has(hash)).toBe(true);
+	});
+	test("carry-forward AI line (in prevAiHashSet only) → ai=1", () => {
+		const line = "const y = 2;";
+		const hash = hashLine(line);
+		const result = computeMinimapFile(
+			"foo.ts",
+			[line],
+			new Set(), // not in current session
+			new Set([hash]), // but in prev minimap
+		);
+		expect(result.ai).toBe(1);
+		expect(result.human).toBe(0);
+		expect(hashSetFromString(result.ai_hashes).has(hash)).toBe(true);
+	});
+	test("human line (in neither set) → human=1, not in ai_hashes", () => {
+		const line = "const z = 3;";
+		const hash = hashLine(line);
+		const result = computeMinimapFile("foo.ts", [line], new Set(), new Set());
+		expect(result.ai).toBe(0);
+		expect(result.human).toBe(1);
+		expect(hashSetFromString(result.ai_hashes).has(hash)).toBe(false);
+	});
+	test("blank line always human (never in ai_hashes)", () => {
+		const aiHash = hashLine("");
+		const result = computeMinimapFile(
+			"foo.ts",
+			[""],
+			new Set([aiHash]),
+			new Set([aiHash]),
+		);
+		expect(result.ai).toBe(0);
+		expect(result.human).toBe(1);
+		expect(result.ai_hashes).toBe("");
+	});
+	test("whitespace-only line treated as blank (always human)", () => {
+		const line = "   ";
+		const aiHash = hashLine(line);
+		const result = computeMinimapFile(
+			"foo.ts",
+			[line],
+			new Set([aiHash]),
+			new Set([aiHash]),
+		);
+		expect(result.ai).toBe(0);
+		expect(result.human).toBe(1);
+	});
+	test("MIXED line — not in currentAiHashes → human even if in prevAiHashSet", () => {
+		// A MIXED line means Claude wrote it then a human modified it.
+		// The commit.ts attributeLines() result for MIXED lines only marks "AI"
+		// labeled lines in sessionAiByPath — MIXED lines are excluded.
+		// So if a hash is in prevAiHashSet but NOT in currentAiHashes,
+		// AND it's the committed version (i.e., human changed it), it stays human.
+		// We simulate this: prevAiHashSet has a hash, currentAiHashes is empty.
+		// But the committed line is DIFFERENT (human modified), so its hash is
+		// NOT in either set.
+		const prevLine = "const x = aiVersion;";
+		const prevHash = hashLine(prevLine);
+		const committedLine = "const x = humanVersion;";
+		const result = computeMinimapFile(
+			"foo.ts",
+			[committedLine],
+			new Set(), // current session: no AI label for this line
+			new Set([prevHash]), // prev minimap had the old AI hash
+		);
+		// committedLine hash is not in either set → human
+		expect(result.ai).toBe(0);
+		expect(result.human).toBe(1);
+	});
+	test("all-AI file", () => {
+		const lines = ["line one", "line two", "line three"];
+		const aiHashes = new Set(lines.map(hashLine));
+		const result = computeMinimapFile("foo.ts", lines, aiHashes, new Set());
+		expect(result.ai).toBe(3);
+		expect(result.human).toBe(0);
+		expect(result.pctAi).toBe(100);
+	});
+	test("all-human file", () => {
+		const lines = ["line one", "line two", "line three"];
+		const result = computeMinimapFile("foo.ts", lines, new Set(), new Set());
+		expect(result.ai).toBe(0);
+		expect(result.human).toBe(3);
+		expect(result.pctAi).toBe(0);
+	});
+	test("removed AI line (hash not in committedLines) → absent from result", () => {
+		// The old AI line was deleted; committedLines no longer contain it.
+		const oldAiLine = "const removed = true;";
+		const oldAiHash = hashLine(oldAiLine);
+		const committedLines = ["const kept = 1;"];
+		const result = computeMinimapFile(
+			"foo.ts",
+			committedLines,
+			new Set(), // session had no AI lines this commit
+			new Set([oldAiHash]), // prev minimap had the old line as AI
+		);
+		// Old hash not in committedLines → not counted as AI
+		expect(result.ai).toBe(0);
+		expect(result.human).toBe(1);
+		expect(hashSetFromString(result.ai_hashes).has(oldAiHash)).toBe(false);
+	});
+	test("duplicate content lines — set deduplication + correct counts", () => {
+		// Two lines with identical content → one hash in the set, but both counted
+		const line = "return null;";
+		const hash = hashLine(line);
+		const result = computeMinimapFile(
+			"foo.ts",
+			[line, line],
+			new Set([hash]),
+			new Set(),
+		);
+		// Both lines match the AI hash → ai=2
+		expect(result.ai).toBe(2);
+		expect(result.human).toBe(0);
+		expect(result.total).toBe(2);
+		// ai_hashes only stores the hash once (it's a set)
+		expect(hashSetFromString(result.ai_hashes).size).toBe(1);
+	});
+	test("pctAi rounds correctly", () => {
+		// 1 AI out of 3 total → 33%
+		const aiLine = "ai line";
+		const humanLine1 = "human line one";
+		const humanLine2 = "human line two";
+		const result = computeMinimapFile(
+			"foo.ts",
+			[aiLine, humanLine1, humanLine2],
+			new Set([hashLine(aiLine)]),
+			new Set(),
+		);
+		expect(result.pctAi).toBe(33);
+	});
+	test("empty file → all zeros, no crash", () => {
+		const result = computeMinimapFile("empty.ts", [], new Set(), new Set());
+		expect(result.ai).toBe(0);
+		expect(result.human).toBe(0);
+		expect(result.total).toBe(0);
+		expect(result.pctAi).toBe(0);
+		expect(result.ai_hashes).toBe("");
+	});
+});

package/src/attribution/commit.ts CHANGED

@@ -21,12 +21,18 @@
  */
 import { resolve, join } from "path";
 import { mkdir, appendFile } from "fs/promises";
+import { execFile } from "child_process";
+import { promisify } from "util";
+const execFileAsync = promisify(execFile);
 import { loadCheckpoint, readCurrentSession } from "./checkpoint.ts";
 import {
 	attributeLines,
 	aggregateTotals,
 	type AttributionResult,
 	type FileAttribution,
+	type LineAttribution,
+	hashLine,
 } from "./differ.ts";
 import {
 	writeNote,
@@ -35,6 +41,14 @@ import {
 	committedContent,
 	currentBranch,
 } from "./git-notes.ts";
+import {
+	computeMinimapFile,
+	hashSetFromString,
+	readMinimap,
+	writeMinimap,
+	type MinimapFileState,
+	type MinimapResult,
+} from "./minimap.ts";
 import {
 	otelEndpoint,
 	otelHeaders,
@@ -54,58 +68,74 @@ async function main() {
 		filesInCommit(repoRoot),
 	]);
-	// Process files in parallel — each file attribution is independent
-	const fileResults = (
+	// Process files in parallel — each file attribution is independent.
+	// Return type includes attribution[] so the minimap block can build currentAiHashes.
+	type FileAttributionWithLines = FileAttribution & {
+		attribution: LineAttribution[];
+		committedLines: string[];
+	};
+	const filesWithAttribution = (
 		await Promise.all(
-			changedFiles.map(async (relPath): Promise<FileAttribution | null> => {
-				const absPath = join(repoRoot, relPath);
-				const committed = await committedContent(repoRoot, relPath);
+			changedFiles.map(
+				async (relPath): Promise<FileAttributionWithLines | null> => {
+					const absPath = join(repoRoot, relPath);
+					const committed = await committedContent(repoRoot, relPath);
-				// Deleted file — skip attribution
-				if (committed === null) return null;
+					// Deleted file — skip attribution
+					if (committed === null) return null;
-				// Binary file — null bytes indicate binary content; line-splitting produces garbage
-				if (committed.includes("\0")) return null;
+					// Binary file — null bytes indicate binary content; line-splitting produces garbage
+					if (committed.includes("\0")) return null;
-				const committedLines = committed.split("\n");
+					const committedLines = committed.split("\n");
+					const empty: LineAttribution[] = committedLines.map(() => "HUMAN");
-				if (!sessionId) {
-					// No active Claude session — everything is HUMAN
-					return {
-						path: relPath,
-						ai: 0,
-						human: committedLines.length,
-						mixed: 0,
-						total: committedLines.length,
-						pctAi: 0,
-					};
-				}
+					if (!sessionId) {
+						return {
+							path: relPath,
+							ai: 0,
+							human: committedLines.length,
+							mixed: 0,
+							total: committedLines.length,
+							pctAi: 0,
+							attribution: empty,
+							committedLines,
+						};
+					}
-				const before = await loadCheckpoint(sessionId, absPath, "before");
-				const after = await loadCheckpoint(sessionId, absPath, "after");
+					const before = await loadCheckpoint(sessionId, absPath, "before");
+					const after = await loadCheckpoint(sessionId, absPath, "after");
-				if (!after) {
-					// No Claude checkpoint for this file — all HUMAN
-					return {
-						path: relPath,
-						ai: 0,
-						human: committedLines.length,
-						mixed: 0,
-						total: committedLines.length,
-						pctAi: 0,
-					};
-				}
+					if (!after) {
+						return {
+							path: relPath,
+							ai: 0,
+							human: committedLines.length,
+							mixed: 0,
+							total: committedLines.length,
+							pctAi: 0,
+							attribution: empty,
+							committedLines,
+						};
+					}
-				const beforeLines = before?.lines ?? [];
-				const { stats } = attributeLines(
-					beforeLines,
-					after.lines,
-					committedLines,
-				);
-				return { ...stats, path: relPath };
-			}),
+					const beforeLines = before?.lines ?? [];
+					const { stats, attribution } = attributeLines(
+						beforeLines,
+						after.lines,
+						committedLines,
+					);
+					return { ...stats, path: relPath, attribution, committedLines };
+				},
+			),
 		)
-	).filter((r): r is FileAttribution => r !== null);
+	).filter((r): r is FileAttributionWithLines => r !== null);
+	// Strip attribution/committedLines before building the AttributionResult
+	const fileResults: FileAttribution[] = filesWithAttribution.map(
+		({ attribution: _a, committedLines: _c, ...f }) => f,
+	);
 	const result: AttributionResult = {
 		commit: sha,
@@ -119,6 +149,153 @@ async function main() {
 	// Write git note
 	await writeNote(result, repoRoot);
+	// Update cumulative minimap — non-fatal, never blocks commits
+	try {
+		// Load parent commit's minimap for carry-forward
+		const parentSha = await execFileAsync("git", ["rev-parse", "HEAD^1"], {
+			cwd: repoRoot,
+		})
+			.then((r: { stdout: string }) => r.stdout.trim())
+			.catch(() => null);
+		const prevMinimap = parentSha
+			? await readMinimap(repoRoot, parentSha)
+			: null;
+		// Build lookup: file path → Set of AI hashes from previous minimap
+		const prevAiByFile = new Map<string, Set<string>>();
+		if (prevMinimap) {
+			for (const f of prevMinimap.files) {
+				prevAiByFile.set(f.path, hashSetFromString(f.ai_hashes));
+			}
+		}
+		// Build lookup: file path → Set of AI hashes from this session
+		const sessionAiByPath = new Map<string, Set<string>>();
+		for (const f of filesWithAttribution) {
+			const aiHashes = new Set<string>();
+			for (let i = 0; i < f.committedLines.length; i++) {
+				if (f.attribution[i] === "AI") {
+					aiHashes.add(hashLine(f.committedLines[i] ?? ""));
+				}
+			}
+			sessionAiByPath.set(f.path, aiHashes);
+		}
+		// Get all tracked files in HEAD
+		const lsResult = (await execFileAsync("git", ["ls-files"], {
+			cwd: repoRoot,
+		})) as unknown as { stdout: string };
+		const allFiles = lsResult.stdout.trim().split("\n").filter(Boolean);
+		const changedFileSet = new Set(changedFiles);
+		// Build minimap state for every tracked file
+		const CONCURRENCY = 8;
+		const minimapFiles: MinimapFileState[] = [];
+		for (let i = 0; i < allFiles.length; i += CONCURRENCY) {
+			const batch = allFiles.slice(i, i + CONCURRENCY);
+			const batchResults = await Promise.all(
+				batch.map(async (relPath): Promise<MinimapFileState> => {
+					const prevAiSet = prevAiByFile.get(relPath) ?? new Set<string>();
+					if (changedFileSet.has(relPath)) {
+						// File changed this commit — recompute using session data
+						const fileWithAttr = filesWithAttribution.find(
+							(f) => f.path === relPath,
+						);
+						if (!fileWithAttr) {
+							// Changed but no attribution (deleted/binary) — carry forward or all-human
+							const prev = prevMinimap?.files.find((f) => f.path === relPath);
+							return prev
+								? { ...prev }
+								: {
+										path: relPath,
+										ai_hashes: "",
+										ai: 0,
+										human: 0,
+										total: 0,
+										pctAi: 0,
+									};
+						}
+						const currentAiSet =
+							sessionAiByPath.get(relPath) ?? new Set<string>();
+						return computeMinimapFile(
+							relPath,
+							fileWithAttr.committedLines,
+							currentAiSet,
+							prevAiSet,
+						);
+					}
+					// File unchanged this commit
+					const prevEntry = prevMinimap?.files.find((f) => f.path === relPath);
+					if (prevEntry) {
+						// Carry forward existing minimap entry unchanged
+						return { ...prevEntry };
+					}
+					// File exists in repo but has no prior minimap entry (pre-install file)
+					// Baseline as all-Human
+					const committed = await committedContent(repoRoot, relPath).catch(
+						() => null,
+					);
+					if (!committed || committed.includes("\0")) {
+						return {
+							path: relPath,
+							ai_hashes: "",
+							ai: 0,
+							human: 0,
+							total: 0,
+							pctAi: 0,
+						};
+					}
+					const lines = committed.split("\n");
+					return {
+						path: relPath,
+						ai_hashes: "",
+						ai: 0,
+						human: lines.length,
+						total: lines.length,
+						pctAi: 0,
+					};
+				}),
+			);
+			minimapFiles.push(...batchResults);
+		}
+		// Compute totals
+		let mAi = 0,
+			mHuman = 0,
+			mTotal = 0;
+		for (const f of minimapFiles) {
+			mAi += f.ai;
+			mHuman += f.human;
+			mTotal += f.total;
+		}
+		const minimapResult: MinimapResult = {
+			commit: sha,
+			timestamp: new Date().toISOString(),
+			files: minimapFiles,
+			totals: {
+				ai: mAi,
+				human: mHuman,
+				total: mTotal,
+				pctAi: mTotal > 0 ? Math.round((mAi / mTotal) * 100) : 0,
+			},
+		};
+		await writeMinimap(minimapResult, repoRoot);
+	} catch (err) {
+		// Never block the commit
+		console.error(
+			"[claude-attribution] minimap update failed (non-fatal):",
+			err,
+		);
+	}
 	// Append to local log
 	const logDir = join(repoRoot, ".claude", "logs");
 	await mkdir(logDir, { recursive: true });
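
The carry-forward behaviour that this hunk wires into the post-commit step can be walked through on a tiny file history. The sketch below is a simplified illustration under stated assumptions, not the package's code: `hashLineStandIn` is a hypothetical replacement for `hashLine` from `differ.ts` (not shown in this diff), and `classify` is a condensed version of the rule that a non-blank line counts as AI if its hash is in the current session's AI set or in the parent commit's minimap.

```typescript
// Stand-in for hashLine() from differ.ts (an assumption, not the real hash).
// Two FNV-1a passes with different seeds give a deterministic 16-hex-char string.
function hashLineStandIn(line: string): string {
  const fnv = (seed: number): string => {
    let h = seed >>> 0;
    for (let i = 0; i < line.length; i++) {
      h ^= line.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h.toString(16).padStart(8, "0");
  };
  return fnv(0x811c9dc5) + fnv(0x01234567);
}

interface FileState {
  aiHashes: Set<string>;
  ai: number;
  human: number;
}

// Per-line rule: blank lines are always human; otherwise a line is AI when its
// hash appears in this session's AI set or in the parent commit's AI set.
function classify(
  lines: string[],
  currentAi: Set<string>,
  prevAi: Set<string>,
): FileState {
  const aiHashes = new Set<string>();
  let ai = 0;
  let human = 0;
  for (const line of lines) {
    const h = hashLineStandIn(line);
    if (line.trim() !== "" && (currentAi.has(h) || prevAi.has(h))) {
      aiHashes.add(h);
      ai++;
    } else {
      human++;
    }
  }
  return { aiHashes, ai, human };
}

// Commit 1: a Claude session wrote the only line.
const c1Lines = ["return 1;"];
const c1 = classify(c1Lines, new Set(c1Lines.map(hashLineStandIn)), new Set());

// Commit 2: no session. A human appends lines; the untouched AI line carries forward.
const c2Lines = ["return 1;", "", "return 2;"];
const c2 = classify(c2Lines, new Set(), c1.aiHashes);

// Commit 3: no session. A human edits the AI line, so its old hash drops out.
const c3Lines = ["return 3;", "", "return 2;"];
const c3 = classify(c3Lines, new Set(), c2.aiHashes);

console.log({ c1: c1.ai, c2: c2.ai, c3: c3.ai });
```

Because each step only consults the parent commit's AI set, a hash that is missing from the committed lines is not re-added, which is why an edited or deleted AI line stops counting from that commit onward.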

package/src/attribution/minimap.ts ADDED

@@ -0,0 +1,178 @@
+/**
+ * Cumulative AI attribution minimap.
+ *
+ * Stores a persistent, per-file line-hash attribution map in a separate git
+ * notes ref (`refs/notes/claude-attribution-map`). Updated on every commit by
+ * commit.ts. Carries AI attribution forward across sessions and developers.
+ *
+ * Design:
+ *   - Only `ai_hashes` is stored; Human = any committed line NOT in ai_hashes.
+ *   - ai_hashes is a concatenated string of 16-char hex hashes (no separator).
+ *   - Blank lines are always Human and never appear in ai_hashes.
+ *   - Full state (all tracked files) is stored on every commit for simple reads.
+ *
+ * Carry-forward algorithm (per committed line hash):
+ *   1. In currentAiHashes (this session's attributeLines() AI lines) → AI
+ *   2. In prevAiHashSet (parent commit's minimap for this file) → AI (carry forward)
+ *   3. Otherwise → Human
+ */
+import { execFile } from "child_process";
+import { promisify } from "util";
+import { writeFile, unlink, mkdtemp, rmdir } from "fs/promises";
+import { tmpdir } from "os";
+import { join } from "path";
+import { hashLine } from "./differ.ts";
+const execFileAsync = promisify(execFile);
+export const MINIMAP_NOTES_REF = "refs/notes/claude-attribution-map";
+export interface MinimapFileState {
+	path: string;
+	/** Concatenated 16-char hex hashes of AI-attributed lines (no separator). */
+	ai_hashes: string;
+	ai: number;
+	human: number;
+	total: number;
+	pctAi: number;
+}
+export interface MinimapResult {
+	commit: string;
+	timestamp: string;
+	files: MinimapFileState[];
+	totals: { ai: number; human: number; total: number; pctAi: number };
+}
+async function run(cmd: string, args: string[], cwd?: string): Promise<string> {
+	const { stdout } = await execFileAsync(cmd, args, { cwd });
+	return stdout.trim();
+}
+/** Parse a concatenated 16-char hex hash string into a Set. */
+export function hashSetFromString(s: string): Set<string> {
+	const result = new Set<string>();
+	for (let i = 0; i + 16 <= s.length; i += 16) {
+		result.add(s.slice(i, i + 16));
+	}
+	return result;
+}
+/** Serialize a Set of 16-char hashes into a concatenated string. */
+export function hashSetToString(hashes: Set<string>): string {
+	return [...hashes].join("");
+}
+/**
+ * Compute the minimap state for a single file given:
+ * - committedLines: the lines of the file as committed
+ * - currentAiHashes: hashes of lines Claude wrote in the current session
+ * - prevAiHashSet: hashes of AI lines from the parent commit's minimap
+ */
+export function computeMinimapFile(
+	path: string,
+	committedLines: string[],
+	currentAiHashes: Set<string>,
+	prevAiHashSet: Set<string>,
+): MinimapFileState {
+	const newAiHashes = new Set<string>();
+	let ai = 0;
+	let human = 0;
+	for (const line of committedLines) {
+		if (line.trim() === "") {
+			// Blank lines carry no attribution signal — always Human
+			human++;
+			continue;
+		}
+		const hash = hashLine(line);
+		if (currentAiHashes.has(hash) || prevAiHashSet.has(hash)) {
+			newAiHashes.add(hash);
+			ai++;
+		} else {
+			human++;
+		}
+	}
+	const total = committedLines.length;
+	return {
+		path,
+		ai_hashes: hashSetToString(newAiHashes),
+		ai,
+		human,
+		total,
+		pctAi: total > 0 ? Math.round((ai / total) * 100) : 0,
+	};
+}
+export async function writeMinimap(
+	result: MinimapResult,
+	repoRoot: string,
+	commitSha = "HEAD",
+): Promise<void> {
+	// Write JSON to a temp file and use -F to avoid E2BIG on large repos.
+	// Passing the full minimap as a -m argument fails when the JSON exceeds
+	// the OS argument size limit (~500KB on macOS).
+	//
+	// Use mkdtemp to create a unique, isolated temp directory rather than a
+	// predictable filename in the shared OS temp dir — prevents collisions
+	// under concurrent runs and symlink/race attacks.
+	const tmpDir = await mkdtemp(join(tmpdir(), "claude-attribution-minimap-"));
+	const tmpFile = join(tmpDir, "minimap.json");
+	try {
+		await writeFile(tmpFile, JSON.stringify(result, null, 2), {
+			encoding: "utf8",
+			flag: "wx",
+		});
+		await run(
+			"git",
+			[
+				"notes",
+				"--ref",
+				MINIMAP_NOTES_REF,
+				"add",
+				"--force",
+				"-F",
+				tmpFile,
+				commitSha,
+			],
+			repoRoot,
+		);
+	} finally {
+		await unlink(tmpFile).catch(() => {});
+		await rmdir(tmpDir).catch(() => {});
+	}
+}
+export async function readMinimap(
+	repoRoot: string,
+	commitSha = "HEAD",
+): Promise<MinimapResult | null> {
+	try {
+		const output = await run(
+			"git",
+			["notes", "--ref", MINIMAP_NOTES_REF, "show", commitSha],
+			repoRoot,
+		);
+		return JSON.parse(output) as MinimapResult;
+	} catch {
+		return null;
+	}
+}
+export async function listMinimapNotes(repoRoot: string): Promise<string[]> {
+	try {
+		const output = await run(
+			"git",
+			["notes", "--ref", MINIMAP_NOTES_REF, "list"],
+			repoRoot,
+		);
+		if (!output) return [];
+		return output
+			.split("\n")
+			.map((line) => line.split(" ")[1])
+			.filter((sha): sha is string => !!sha && sha.length > 0);
+	} catch {
+		return [];
+	}
+}
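
The carry-forward rules and the separator-free hash encoding described in the header comment of `minimap.ts` can be sketched in isolation. This is a minimal illustration, not the package's code: `hashLineSketch` is a hypothetical stand-in for `hashLine()` from `differ.ts` (not shown in this diff), and the assumption that it trims the line and yields 16 hex characters is mine; `classifyLines`, `toHashString`, and `fromHashString` are likewise illustrative names.

```typescript
type Label = "ai" | "human";

// Hypothetical 32-bit FNV-1a helper; only used to build a 16-hex-char digest.
function fnv1a32(text: string, seed: number): number {
	let h = seed >>> 0;
	for (let i = 0; i < text.length; i++) {
		h ^= text.charCodeAt(i);
		h = Math.imul(h, 0x01000193) >>> 0;
	}
	return h;
}

// Stand-in for hashLine(): ASSUMED to hash the trimmed line to 16 hex chars.
function hashLineSketch(line: string): string {
	const t = line.trim();
	const hi = fnv1a32(t, 0x811c9dc5).toString(16).padStart(8, "0");
	const lo = fnv1a32(t, 0x01234567).toString(16).padStart(8, "0");
	return hi + lo;
}

// Fixed-width encoding: hashes are concatenated with no separator.
function toHashString(hashes: Set<string>): string {
	return [...hashes].join("");
}

// Decoding walks the string in 16-character steps.
function fromHashString(s: string): Set<string> {
	const out = new Set<string>();
	for (let i = 0; i + 16 <= s.length; i += 16) out.add(s.slice(i, i + 16));
	return out;
}

// Apply the three carry-forward rules to each committed line.
function classifyLines(
	committedLines: string[],
	currentAiHashes: Set<string>,
	prevAiHashSet: Set<string>,
): Label[] {
	return committedLines.map((line): Label => {
		if (line.trim() === "") return "human"; // blank lines are always Human
		const hash = hashLineSketch(line);
		return currentAiHashes.has(hash) || prevAiHashSet.has(hash) ? "ai" : "human";
	});
}

// "const a = 1;" was AI in the parent commit; "const b = 2;" is AI this session.
const prev = fromHashString(toHashString(new Set([hashLineSketch("const a = 1;")])));
const current = new Set([hashLineSketch("const b = 2;")]);
const labels = classifyLines(
	["const a = 1;", "", "const b = 2;", "const c = 3;"],
	current,
	prev,
);
console.log(labels.join(",")); // → ai,human,ai,human
```

Because every hash has the same width, the encoding needs no delimiter, and a trailing partial chunk (fewer than 16 characters) is simply ignored on read.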

package/src/cli.ts CHANGED

@@ -30,6 +30,9 @@ switch (cmd) {
 	case "pr":
 		await import("./commands/pr.ts");
 		break;
+	case "init":
+		await import("./commands/init.ts");
+		break;
 	case "hook": {
 		switch (rest[0]) {
 			case "pre-tool-use":
@@ -89,6 +92,7 @@ Commands:
   uninstall [repo] Remove hooks from a repo (default: current directory)
   metrics [id]     Generate PR metrics report
   pr [title]       Create PR with metrics embedded (--draft, --base <branch>)
+  init [--ai]      Declare current codebase as AI-written in the cumulative minimap
   start            Mark session start for per-ticket scoping
   hook <name>      Run an internal hook (used by installed git hooks)
   version          Print version

package/src/commands/init.ts ADDED

@@ -0,0 +1,165 @@
+/**
+ * claude-attribution init [--ai | --human]
+ *
+ * Initializes the cumulative attribution minimap for an existing repo.
+ *
+ * --ai     Mark all currently tracked files as AI-written. Use for repos that
+ *          were built entirely with Claude Code from the start.
+ * --human  (default) Keep the human baseline: no minimap note is written, and
+ *          all lines are assumed human until Claude writes them.
+ *
+ * After running init --ai, the next `claude-attribution metrics` or PR will show
+ * the true codebase AI% instead of only the current session's delta.
+ */
+import { resolve } from "path";
+import { execFile } from "child_process";
+import { promisify } from "util";
+import { hashLine } from "../attribution/differ.ts";
+import {
+	computeMinimapFile,
+	writeMinimap,
+	type MinimapFileState,
+	type MinimapResult,
+} from "../attribution/minimap.ts";
+const execFileAsync = promisify(execFile);
+const CONCURRENCY = 8;
+async function runGit(args: string[], cwd: string): Promise<string> {
+	const result = (await execFileAsync("git", args, { cwd })) as unknown as {
+		stdout: string;
+	};
+	return result.stdout.trim();
+}
+async function main() {
+	const repoRoot = resolve(process.cwd());
+	const flag = process.argv[2];
+	if (!flag || flag === "--human") {
+		console.log(
+			'Baseline is already "human" by default. No minimap note written.',
+		);
+		console.log(
+			"Attribution accumulates automatically as Claude Code writes and commits code.",
+		);
+		console.log(
+			"\nIf this repo was built entirely with Claude Code, run: claude-attribution init --ai",
+		);
+		return;
+	}
+	if (flag !== "--ai") {
+		console.error(`Unknown flag: ${flag}`);
+		console.error("Usage: claude-attribution init [--ai | --human]");
+		process.exit(1);
+	}
+	// --ai: mark entire current codebase as AI-written
+	let sha = "";
+	try {
+		sha = await runGit(["rev-parse", "HEAD"], repoRoot);
+	} catch {
+		console.error(
+			"Error: no commits found. Commit your files first, then run init --ai.",
+		);
+		process.exit(1);
+	}
+	const lsOutput = await runGit(["ls-files"], repoRoot);
+	const allFiles = lsOutput ? lsOutput.split("\n").filter(Boolean) : [];
+	if (allFiles.length === 0) {
+		console.error("Error: no tracked files found.");
+		process.exit(1);
+	}
+	console.log(
+		`Marking ${allFiles.length} files as AI-written on ${sha.slice(0, 7)}...`,
+	);
+	const minimapFiles: MinimapFileState[] = [];
+	let processed = 0;
+	for (let i = 0; i < allFiles.length; i += CONCURRENCY) {
+		const batch = allFiles.slice(i, i + CONCURRENCY);
+		const batchResults = await Promise.all(
+			batch.map(async (relPath): Promise<MinimapFileState | null> => {
+				try {
+					const result = (await execFileAsync(
+						"git",
+						["show", `HEAD:${relPath}`],
+						{ cwd: repoRoot },
+					)) as unknown as { stdout: string };
+					const content = result.stdout;
+					// Skip binary files
+					if (content.includes("\0")) return null;
+					const lines = content.split("\n");
+					// Build currentAiHashes from every non-blank line in the file
+					const currentAiHashes = new Set<string>();
+					for (const line of lines) {
+						if (line.trim() !== "") {
+							currentAiHashes.add(hashLine(line));
+						}
+					}
+					return computeMinimapFile(
+						relPath,
+						lines,
+						currentAiHashes,
+						new Set<string>(),
+					);
+				} catch {
+					return null;
+				}
+			}),
+		);
+		const valid = batchResults.filter((r): r is MinimapFileState => r !== null);
+		minimapFiles.push(...valid);
+		processed += batch.length;
+		if (processed % 100 === 0 || processed === allFiles.length) {
+			process.stdout.write(`\r  ${processed} / ${allFiles.length} files...`);
+		}
+	}
+	process.stdout.write("\n");
+	let totalAi = 0,
+		totalHuman = 0,
+		totalLines = 0;
+	for (const f of minimapFiles) {
+		totalAi += f.ai;
+		totalHuman += f.human;
+		totalLines += f.total;
+	}
+	const minimapResult: MinimapResult = {
+		commit: sha,
+		timestamp: new Date().toISOString(),
+		files: minimapFiles,
+		totals: {
+			ai: totalAi,
+			human: totalHuman,
+			total: totalLines,
+			pctAi: totalLines > 0 ? Math.round((totalAi / totalLines) * 100) : 0,
+		},
+	};
+	await writeMinimap(minimapResult, repoRoot);
+	const pct = minimapResult.totals.pctAi;
+	console.log(
+		`✓ Marked ${minimapFiles.length} files (${totalLines} lines, ${pct}% AI) as AI-written.`,
+	);
+	console.log(
+		"  Push to share with team: git push origin refs/notes/claude-attribution-map",
+	);
+}
+main().catch((err) => {
+	console.error("Error:", err);
+	process.exit(1);
+});
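
One consequence of the counting rules above is that `init --ai` will usually report slightly less than 100% AI: blank lines always count as Human, and splitting on `"\n"` yields a trailing empty string for files that end with a newline. The sketch below reproduces only that counting logic under those assumptions; `countInitAi` is a hypothetical helper, not part of the package.

```typescript
interface Counts {
	ai: number;
	human: number;
	total: number;
	pctAi: number;
}

// Count lines the way an all-AI baseline would: non-blank lines are AI,
// blank lines (including the trailing "" after a final newline) are Human.
function countInitAi(content: string): Counts {
	const lines = content.split("\n");
	let ai = 0;
	let human = 0;
	for (const line of lines) {
		if (line.trim() === "") human++;
		else ai++;
	}
	const total = lines.length;
	return { ai, human, total, pctAi: total > 0 ? Math.round((ai / total) * 100) : 0 };
}

// Two code lines, one blank separator, and a final newline.
const stats = countInitAi("import x from 'y';\n\nexport default x;\n");
console.log(stats); // ai: 2, human: 2, total: 4, pctAi: 50
```

On real files the effect is much smaller than in this four-line example, but it explains why the summary line printed by `init --ai` can show a percentage below 100.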

package/src/export/pr-summary.ts CHANGED

@@ -30,6 +30,7 @@ import {
 	type AttributionResult,
 	type FileAttribution,
 } from "../attribution/differ.ts";
+import { readMinimap } from "../attribution/minimap.ts";
 interface MetricPoint {
 	timestamp: number;
@@ -175,6 +176,23 @@ async function main() {
 		},
 	];
+	// Codebase-wide minimap gauges (optional — only pushed if minimap exists)
+	const headMinimap = await readMinimap(repoRoot, "HEAD").catch(() => null);
+	if (headMinimap) {
+		series.push({
+			metric: "claude_attribution.codebase_pct_ai",
+			type: 3,
+			points: [{ timestamp, value: headMinimap.totals.pctAi }],
+			tags,
+		});
+		series.push({
+			metric: "claude_attribution.codebase_total_lines",
+			type: 3,
+			points: [{ timestamp, value: headMinimap.totals.total }],
+			tags,
+		});
+	}
 	const event: DatadogEventPayload = {
 		title: `PR #${prNumber} merged — ${pctAi}% AI (claude-code)`,
 		text: `repo: ${repo}\nbranch: ${branch}\nai: ${ai} / human: ${human} / total: ${total}`,
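
The optional gauge push in this hunk can be read as a small pure function: no minimap means no extra series. The sketch below is an illustration under assumptions, not the file's code: `MetricSeries` is a simplified shape inferred from the fields visible in the diff, `minimapGauges` is a hypothetical helper, and the example tag is made up. The value `3` is copied from the diff, where it appears to denote a gauge.

```typescript
interface MetricPoint {
	timestamp: number;
	value: number;
}

// Simplified series shape, inferred from the fields used in the diff.
interface MetricSeries {
	metric: string;
	type: number;
	points: MetricPoint[];
	tags: string[];
}

const GAUGE = 3; // same numeric type the diff uses for both minimap metrics

// Return the two codebase-wide gauges, or nothing when no minimap was found.
function minimapGauges(
	totals: { pctAi: number; total: number } | null,
	timestamp: number,
	tags: string[],
): MetricSeries[] {
	if (!totals) return [];
	return [
		{
			metric: "claude_attribution.codebase_pct_ai",
			type: GAUGE,
			points: [{ timestamp, value: totals.pctAi }],
			tags,
		},
		{
			metric: "claude_attribution.codebase_total_lines",
			type: GAUGE,
			points: [{ timestamp, value: totals.total }],
			tags,
		},
	];
}

// Hypothetical inputs for illustration only.
const gauges = minimapGauges({ pctAi: 62, total: 4800 }, 1700000000, ["repo:example"]);
console.log(gauges.map((g) => `${g.metric}=${g.points[0].value}`).join("\n"));
```

Keeping the null case as an empty array lets the caller append with `series.push(...minimapGauges(...))` without a separate branch.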

package/src/metrics/collect.ts CHANGED

@@ -22,6 +22,7 @@ import {
 	type FileAttribution,
 } from "../attribution/differ.ts";
 import { SESSION_ID_RE } from "../attribution/checkpoint.ts";
+import { readMinimap, listMinimapNotes } from "../attribution/minimap.ts";
 const execFileAsync = promisify(execFile);
@@ -34,6 +35,12 @@ export interface MetricsData {
 	attributions: AttributionResult[];
 	lastSeenByFile: Map<string, FileAttribution>;
 	allTranscripts: TranscriptResult[];
+	minimapTotals: {
+		ai: number;
+		human: number;
+		total: number;
+		pctAi: number;
+	} | null;
 }
 async function readSessionStart(repoRoot: string): Promise<Date | null> {
@@ -151,6 +158,29 @@ export function kFormat(n: number): string {
 	return n >= 1000 ? `${Math.floor(n / 1000)}K` : String(n);
 }
+async function getMinimapTotals(
+	repoRoot: string,
+): Promise<{ ai: number; human: number; total: number; pctAi: number } | null> {
+	// Try HEAD first (written by post-commit hook)
+	const head = await readMinimap(repoRoot, "HEAD");
+	if (head) return head.totals;
+	// Fall back: first readable minimap note, preferring commits on the current branch (git notes list order, not necessarily the newest)
+	const [allNotes, branchShas] = await Promise.all([
+		listMinimapNotes(repoRoot),
+		getBranchCommitShas(repoRoot),
+	]);
+	const branchSet = new Set(branchShas);
+	const candidates = allNotes.filter((sha) => branchSet.has(sha));
+	const scanList = candidates.length > 0 ? candidates : allNotes;
+	for (const sha of scanList) {
+		const note = await readMinimap(repoRoot, sha);
+		if (note) return note.totals;
+	}
+	return null;
+}
 export async function collectMetrics(
 	sessionIdArg?: string,
 	repoRoot?: string,
@@ -165,7 +195,7 @@ export async function collectMetrics(
 	const sessionStart =
 		(await readSessionStart(root)) ?? (await getBranchStartTime(root));
-	const [toolEntries, agentEntries, transcript, attributions] =
+	const [toolEntries, agentEntries, transcript, attributions, minimapTotals] =
 		await Promise.all([
 			readJsonlForSession(
 				join(logDir, "tool-usage.jsonl"),
@@ -179,6 +209,7 @@ export async function collectMetrics(
 			) as Promise<{ subagentType?: string; event?: string }[]>,
 			parseTranscript(sessionId, root),
 			getBranchAttribution(root, sessionStart),
+			getMinimapTotals(root),
 		]);
 	// Tool counts
@@ -234,6 +265,7 @@ export async function collectMetrics(
 		attributions,
 		lastSeenByFile,
 		allTranscripts,
+		minimapTotals,
 	};
 }
@@ -245,6 +277,7 @@ export function renderMetrics(data: MetricsData): string {
 		transcript,
 		lastSeenByFile,
 		allTranscripts,
+		minimapTotals,
 	} = data;
 	const lines: string[] = [];
@@ -256,7 +289,31 @@ export function renderMetrics(data: MetricsData): string {
 	// Headline: AI% + active time (most important stat, shown first)
 	const allFileStats = [...lastSeenByFile.values()];
 	const hasAttribution = allFileStats.length > 0;
-	if (hasAttribution) {
+	if (minimapTotals && minimapTotals.total > 0) {
+		// Two-signal headline: codebase-wide + this PR
+		out(
+			`**Codebase: ~${minimapTotals.pctAi}% AI** (${minimapTotals.ai} / ${minimapTotals.total} lines)`,
+		);
+		if (hasAttribution) {
+			const {
+				ai: prAi,
+				total: prTotal,
+				pctAi: prPctAi,
+			} = aggregateTotals(allFileStats);
+			const codebasePct =
+				minimapTotals.total > 0
+					? Math.round((prTotal / minimapTotals.total) * 100)
+					: 0;
+			const activePart =
+				transcript && transcript.activeMinutes > 0
+					? ` · Active: ${transcript.activeMinutes}m`
+					: "";
+			out(
+				`**This PR:** ${prTotal} lines changed (${codebasePct}% of codebase) · ${prPctAi}% Claude edits · ${prAi} AI lines${activePart}`,
+			);
+		}
+		out();
+	} else if (hasAttribution) {
 		const { ai, total, pctAi } = aggregateTotals(allFileStats);
 		const activePart =
 			transcript && transcript.activeMinutes > 0
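
The two-signal headline added in this hunk combines a codebase-wide figure with the PR's own share. A minimal standalone sketch of that formatting follows; `renderHeadline` and the `Totals` interface are hypothetical names for illustration, and the numbers are invented.

```typescript
interface Totals {
	ai: number;
	total: number;
	pctAi: number;
}

// Build the headline lines: codebase-wide signal first, then the PR's share of it.
function renderHeadline(
	codebase: Totals | null,
	pr: Totals | null,
	activeMinutes = 0,
): string[] {
	const lines: string[] = [];
	if (codebase && codebase.total > 0) {
		lines.push(
			`**Codebase: ~${codebase.pctAi}% AI** (${codebase.ai} / ${codebase.total} lines)`,
		);
		if (pr) {
			// Share of the codebase touched by this PR, rounded to a whole percent.
			const share = Math.round((pr.total / codebase.total) * 100);
			const active = activeMinutes > 0 ? ` · Active: ${activeMinutes}m` : "";
			lines.push(
				`**This PR:** ${pr.total} lines changed (${share}% of codebase) · ${pr.pctAi}% Claude edits · ${pr.ai} AI lines${active}`,
			);
		}
	}
	return lines;
}

const headline = renderHeadline(
	{ ai: 600, total: 1000, pctAi: 60 },
	{ ai: 40, total: 50, pctAi: 80 },
	12,
);
console.log(headline.join("\n"));
// **Codebase: ~60% AI** (600 / 1000 lines)
// **This PR:** 50 lines changed (5% of codebase) · 80% Claude edits · 40 AI lines · Active: 12m
```

Small PRs in large repos will round to 0% of the codebase here, since the share is rounded to the nearest whole percent.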

package/src/setup/install.ts CHANGED

@@ -119,6 +119,27 @@ async function main() {
 		);
 	}
+	// Configure git push to also push minimap notes (same idempotent pattern)
+	const minimapRefspec =
+		"refs/notes/claude-attribution-map:refs/notes/claude-attribution-map";
+	try {
+		await execFileAsync(
+			"git",
+			["config", "--unset-all", "remote.origin.push", minimapRefspec],
+			{ cwd: targetRepo },
+		).catch(() => {});
+		await execFileAsync(
+			"git",
+			["config", "--add", "remote.origin.push", minimapRefspec],
+			{ cwd: targetRepo },
+		);
+		console.log("✓ Configured remote.origin.push to include minimap notes");
+	} catch {
+		console.log(
+			"  (skipped git config remote.origin.push — no origin remote or git unavailable)",
+		);
+	}
 	// 3. Install slash commands
 	await installSlashCommands(targetRepo);
 	console.log("✓ Installed .claude/commands/metrics.md (/metrics command)");

package/src/setup/templates/pr-metrics-workflow.yml CHANGED

@@ -18,7 +18,9 @@ jobs:
           fetch-depth: 0
       - name: Fetch attribution notes
-        run: git fetch origin refs/notes/claude-attribution:refs/notes/claude-attribution || true
+        run: |
+          git fetch origin refs/notes/claude-attribution:refs/notes/claude-attribution || true
+          git fetch origin refs/notes/claude-attribution-map:refs/notes/claude-attribution-map || true
       - name: Install claude-attribution
         run: npm install -g claude-attribution

package/src/setup/uninstall.ts CHANGED

@@ -149,6 +149,15 @@ async function main() {
 		// Ignore — refspec may not be present or git may be unavailable
 	}
+	// 4b. Remove remote.origin.push refspec for minimap notes (best-effort)
+	const minimapRefspec =
+		"refs/notes/claude-attribution-map:refs/notes/claude-attribution-map";
+	await execFileAsync(
+		"git",
+		["config", "--unset", "remote.origin.push", minimapRefspec],
+		{ cwd: targetRepo },
+	).catch(() => {});
 	// 5. Remove slash commands
 	const commandsDir = join(targetRepo, ".claude", "commands");
 	const metricsRemoved = await removeFile(join(commandsDir, "metrics.md"));