@uxcontinuum/ccaudit 1.0.3 → 1.1.0
- package/README.md +52 -42
- package/index.js +83 -7
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,12 +1,20 @@
 # ccaudit
 
-A diagnostic for your Claude Code setup.
+A diagnostic for your Claude Code setup. Mostly for fun, partly genuinely useful. It reads `~/.claude/` locally and grades you across hook coverage, project hygiene, tool balance, prompt tells, and pipeline ops.
 
 ```bash
 npx @uxcontinuum/ccaudit
 ```
 
-
+Zero install, zero dependencies, no network calls.
+
+## What the grade is and isn't
+
+This is a hygiene audit, not an outcomes audit. It measures whether your Claude Code setup is **set up well**, not whether your outputs are good.
+
+Think of it as a linter for your AI workflow. Passing lint doesn't guarantee your code is good. Failing lint usually means something is missing. Same here: a high grade doesn't mean Claude is shipping perfect work for you; a low grade usually means the scaffolding around your AI is sparse.
+
+The grade can be gamed (install five no-op hooks, auto-title every session, scrub "just" from your prompts). Don't bother. The findings under the grade are the value, not the letter.
 
 ## What you get
 
@@ -15,38 +23,50 @@ Reads `~/.claude/` locally. Zero dependencies. Nothing leaves your machine.
 CCAUDIT your Claude Code report card
 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 
-OVERALL GRADE
+OVERALL GRADE C+ (79/100)
 
-Hook coverage
-1
+Hook coverage A+ ████████████████████
+1 PostToolUse, 2 Stop, 1 PreToolUse, autoMemory plugin.
 
 Project hygiene (human) F ████████░░░░░░░░░░░░
-0%
-→ Title your sessions. Untitled sessions are unsearchable history.
+0% titled, launched from 10 distinct working dirs.
 
-Tool balance (human)
-Bash 73%, Edit+Write 10
-→ You are running things, not editing things. Use Edit/Write more.
+Tool balance (human) D+ ██████████████░░░░░░
+Bash 73%, Edit+Write 10% (3,536 calls), Read 10%.
 
 Prompt tells C ███████████████░░░░░
-
-
+Said "just" 10,236 times across 19,192 prompts (53%).
+
+Output signals B ████████████████░░░░
+Tool error rate 4.2%, median session length 8 messages.
 
 Pipeline ops (agent sessions) B █████████████████░░░
-
+3,253 agent-spawned sessions, 26.93M output tokens.
 ```
 
 ## What it checks
 
-| Dimension |
-
-| Hook coverage | `~/.claude/settings.json`
-| Project hygiene |
-| Tool balance |
-| Prompt tells | "just"
-|
+| Dimension | What it measures | What it cannot see |
+|-----------|------------------|-------------------|
+| Hook coverage | Hooks configured in `~/.claude/settings.json` across all event types, plus `autoMemoryEnabled` plugin flag | Whether the hooks actually do anything useful |
+| Project hygiene | Custom titles, auto-slugs, CWD diversity, prompt length | Whether your titles describe the work accurately |
+| Tool balance | Distribution across Bash, Edit, Read, Grep, Agent. Adaptive: high Bash% is okay if absolute Edit volume is also high | Whether each tool call accomplished the goal |
+| Prompt tells | Frequency of hedge words ("just", "please"), prompt clarity heuristics | Whether your prompts produce good outputs |
+| Output signals | Tool-call error rate, median session length, within-session retry patterns | Whether your shipped code works in production |
+| Pipeline ops | Agent-spawned session count, token spend, hook coverage relative to volume | Whether your pipeline ships features that don't break |
+
+It separates human-driven sessions from agent-spawned worktrees via three signals (`isSidechain`, `userType`, UUID/hex dir-name pattern). Operator grade and pipeline grade get scored independently against different rubrics.
+
+## Cross-platform support
+
+Works on:
+- macOS (`$HOME/.claude/`)
+- Linux (`$HOME/.claude/`)
+- Windows / WSL (`%USERPROFILE%\.claude\` or `$HOME/.claude/`)
+- VPS / non-default home (uses Node's `os.homedir()`)
+- Running as root with users in `/home/*` (scans all)
 
-
+Tested against setups ranging from "brand new install with zero sessions" to "20,000 sessions and 4,000 agent worktrees."
 
 ## Install
 
@@ -54,7 +74,7 @@ It separates human-driven sessions from agent-spawned worktrees (UUID and ULID-s
 # Run once without installing
 npx @uxcontinuum/ccaudit
 
-# Or globally
+# Or install globally
 npm i -g @uxcontinuum/ccaudit
 ccaudit
 ```
@@ -64,34 +84,24 @@ Requires Node 14+. No other dependencies.
 ## Options
 
 ```bash
-ccaudit # full report, last 30 days
+ccaudit # full report, last 30 days
 ccaudit --days 7 # just last week
 ccaudit --days 365 # full year
+ccaudit --json # programmatic output, anonymized
 ccaudit --no-color # plain text for copying
 ```
 
-##
-
-Each dimension produces a 0-100 score and a letter grade (A+ through F). The overall grade is the mean of the dimension scores. The rubric weights:
-
-- Hook coverage is hard-floored at 35 if you have zero hooks. Anything could happen overnight.
-- Project hygiene scales linearly with titled-session percentage and penalizes both ultra-terse (<80 chars) and wall-of-text (>1500 chars) average prompts.
-- Tool balance penalizes Bash dominance above 65% and rewards healthy editing (10-55% Edit+Write).
-- Prompt tells subtract for high "just" frequency. "Just" telegraphs that you think the task is simple. It usually is not.
-- Pipeline ops rewards low tokens-per-session and penalizes running an agent pipeline without runtime hooks.
-
-The grade is opinionated, not objective. Read it as a diagnostic, not a judgment.
-
-## Why this exists
-
-There is no public benchmark for "is my Claude Code setup any good." People burn weeks reading other people's CLAUDE.md files trying to figure out what they're doing wrong. This tool answers that question in 30 seconds.
+## Privacy
 
-
+Reads `~/.claude/` on your machine. Outputs to stdout. No network calls, no telemetry, no opt-in submission. The `--json` output is anonymized (no prompts, no slugs, no CWD strings, just aggregate counts and percentages).
 
-##
+## Honest disclaimers
 
-
+- The grade is opinionated, not objective.
+- The rubric will change as the tool matures.
+- High grade ≠ good outputs. Low grade ≠ bad outputs. The grade is about scaffolding, not results.
+- The tool ships with a built-in nudge toward [Continuum Sprint](https://uxcontinuum.com/sprint) when it surfaces 2+ failing dimensions. That's intentional. If your setup is genuinely broken in two places, a 2-week sprint is often what fixes it. Ignore the nudge if you don't want it.
 
 ---
 
-Built by [Matt Turley](https://uxcontinuum.com)
+Built by [Matt Turley](https://uxcontinuum.com).
package/index.js
CHANGED
@@ -75,6 +75,14 @@ function projDirName(filePath) {
   return idx >= 0 && idx + 1 < parts.length ? parts[idx + 1] : '';
 }
 
+// Cheap fingerprint of a tool_use input. Used to detect within-session retries.
+function fpToolUse(name, input) {
+  const key = typeof input === 'object' && input
+    ? (input.command || input.file_path || input.path || input.pattern || JSON.stringify(input))
+    : String(input ?? '');
+  return name + '::' + String(key).slice(0, 200);
+}
+
 function parseSession(filePath, cutoffMs) {
   let lines;
   try { lines = fs.readFileSync(filePath, 'utf8').split('\n'); } catch (_) { return null; }
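Reviewer note: a minimal sketch of what the new `fpToolUse` helper returns for a few inputs. The function body is copied from the hunk above; the sample tool names and input objects are invented for illustration and are not taken from real session logs.

```javascript
// Copy of fpToolUse as it appears in the hunk above, for experimentation.
function fpToolUse(name, input) {
  const key = typeof input === 'object' && input
    ? (input.command || input.file_path || input.path || input.pattern || JSON.stringify(input))
    : String(input ?? '');
  return name + '::' + String(key).slice(0, 200);
}

// Hypothetical tool_use inputs (field values are made up).
const samples = [
  ['Bash', { command: 'npm test' }],
  ['Edit', { file_path: 'src/a.js', old_string: 'x', new_string: 'y' }],
  ['Edit', { file_path: 'src/a.js', old_string: 'y', new_string: 'z' }],
  ['Task', { description: 'explore' }], // none of the known keys, so the whole object is stringified
  ['Bash', undefined],                  // missing input falls back to an empty key
];

const fingerprints = samples.map(([name, input]) => fpToolUse(name, input));
console.log(fingerprints);
```

Because `file_path` wins over the rest of the input, the two different edits to `src/a.js` share one fingerprint. As far as this sketch shows, repeated edits to the same file would therefore be counted as a retry even when their contents differ.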
@@ -92,6 +100,8 @@ function parseSession(filePath, cutoffMs) {
   let entrypoint = null;
   let claudeVersion = null;
   let messageCount = 0;
+  let toolErrors = 0;
+  const fpCounts = new Map(); // tool_use fingerprint → count, for retry detection
 
   for (const raw of lines) {
     if (!raw) continue;
@@ -119,7 +129,11 @@ function parseSession(filePath, cutoffMs) {
     if (msg.type === 'user') {
       const c = msg.message?.content;
       if (Array.isArray(c)) {
-        for (const b of c)
+        for (const b of c) {
+          if (b?.type === 'text' && b.text?.trim()) userPrompts.push(b.text.trim());
+          // tool_result blocks appear in user messages. is_error true = the tool call failed.
+          if (b?.type === 'tool_result' && b.is_error === true) toolErrors++;
+        }
       } else if (typeof c === 'string' && c.trim()) {
         userPrompts.push(c.trim());
       }
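Reviewer note: the loop above can be exercised in isolation. The sketch below wraps the same two checks in a hypothetical helper, `scanUserContent` (not a function in the package), and feeds it a made-up content array to show which blocks count as prompts and which count as tool errors.

```javascript
// Hypothetical helper mirroring the user-message branch in the hunk above.
function scanUserContent(content) {
  const prompts = [];
  let toolErrors = 0;
  if (Array.isArray(content)) {
    for (const b of content) {
      if (b?.type === 'text' && b.text?.trim()) prompts.push(b.text.trim());
      // Only a strict boolean true is counted as a failed tool call.
      if (b?.type === 'tool_result' && b.is_error === true) toolErrors++;
    }
  } else if (typeof content === 'string' && content.trim()) {
    prompts.push(content.trim());
  }
  return { prompts, toolErrors };
}

// Invented sample blocks, shaped like the fields the diff reads.
const result = scanUserContent([
  { type: 'text', text: '  run the tests  ' },
  { type: 'tool_result', is_error: true, content: 'exit code 1' },
  { type: 'tool_result', content: 'ok' },          // no is_error field: not counted
  { type: 'tool_result', is_error: 'true' },       // string, not boolean: not counted
  null,                                            // skipped safely by optional chaining
]);
console.log(result);
```

The strict `=== true` comparison means a block with a missing or non-boolean `is_error` is treated as a success, so the error count in this sketch is a lower bound.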
@@ -128,7 +142,13 @@ function parseSession(filePath, cutoffMs) {
     if (msg.type === 'assistant') {
       const c = msg.message?.content;
       if (Array.isArray(c)) {
-        for (const b of c)
+        for (const b of c) {
+          if (b?.type === 'tool_use') {
+            toolCalls.push(b.name || 'unknown');
+            const fp = fpToolUse(b.name || 'unknown', b.input);
+            fpCounts.set(fp, (fpCounts.get(fp) || 0) + 1);
+          }
+        }
       }
       const u = msg.message?.usage;
       if (u) {
@@ -141,14 +161,15 @@ function parseSession(filePath, cutoffMs) {
   if (!timestamps.length) return null;
 
   const projDir = projDirName(filePath);
-  // Multi-signal agent detector. Any of these is sufficient:
-  //   - isSidechain: subagent inside another Claude session
-  //   - userType non-external: internal automation invocation
-  //   - dir-name matches UUID/hex pattern: orchestrator-spawned worktree
   const isAgent = isSidechain ||
     (userType && userType !== 'external') ||
     fallbackAgentDirGuess(projDir);
 
+  // Within-session retries: any fingerprint that fired >1 time. Count the
+  // excess fires beyond the first as retries.
+  let retries = 0;
+  for (const c of fpCounts.values()) if (c > 1) retries += (c - 1);
+
   return {
     projDir,
     isAgent,
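Reviewer note: the retry tally above amounts to "total calls minus distinct fingerprints". A small sketch, using a hypothetical `countRetries` helper and invented fingerprint strings, makes that visible.

```javascript
// Hypothetical helper: same Map-based tally and excess count as the hunk above.
function countRetries(fingerprints) {
  const fpCounts = new Map();
  for (const fp of fingerprints) fpCounts.set(fp, (fpCounts.get(fp) || 0) + 1);
  let retries = 0;
  for (const c of fpCounts.values()) if (c > 1) retries += (c - 1);
  return retries;
}

// Invented session: 'Bash::npm test' fires three times, the others once.
const calls = [
  'Bash::npm test',
  'Edit::src/a.js',
  'Bash::npm test',
  'Bash::npm test',
  'Read::README.md',
];
const retries = countRetries(calls);
console.log(retries, calls.length - new Set(calls).size);
```

Note that order and adjacency are ignored: a command re-run at the end of a long session counts the same as one repeated immediately after a failure.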
@@ -157,6 +178,8 @@ function parseSession(filePath, cutoffMs) {
     cwd: cwd || '',
     userPrompts,
     toolCalls,
+    toolErrors,
+    retries,
     timestamps,
     outputTokens,
     inputTokens,
@@ -283,6 +306,19 @@ function aggregate(sessions) {
     const outputTokens = subset.reduce((n, s) => n + s.outputTokens, 0);
     const inputTokens = subset.reduce((n, s) => n + s.inputTokens, 0);
 
+    const toolErrorsTotal = subset.reduce((n, s) => n + (s.toolErrors || 0), 0);
+    const retriesTotal = subset.reduce((n, s) => n + (s.retries || 0), 0);
+    const toolErrorRate = tools.length ? (100 * toolErrorsTotal / tools.length) : 0;
+    const retriesPerSession = subset.length ? (retriesTotal / subset.length) : 0;
+
+    // Median session length (message count). Cheaper proxy for first-shot success.
+    const lengths = subset.map(s => s.messageCount).sort((a, b) => a - b);
+    const medianLen = lengths.length
+      ? (lengths.length % 2 === 1
+          ? lengths[(lengths.length - 1) / 2]
+          : Math.round((lengths[lengths.length / 2 - 1] + lengths[lengths.length / 2]) / 2))
+      : 0;
+
     return {
       sessions: subset.length,
       prompts: prompts.length,
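Reviewer note: the aggregate math above is easy to check by hand. The sketch below restates it as two hypothetical pure functions, `medianRounded` and `errorRatePct` (names are mine, not the package's), with the one-decimal rounding that the diff applies later folded in, and runs them on invented numbers.

```javascript
// Hypothetical restatement of the median logic: integer result, 0 for no sessions.
function medianRounded(values) {
  const sorted = [...values].sort((a, b) => a - b);
  if (!sorted.length) return 0;
  if (sorted.length % 2 === 1) return sorted[(sorted.length - 1) / 2];
  // Even count: average the two middle values and round to a whole number.
  return Math.round((sorted[sorted.length / 2 - 1] + sorted[sorted.length / 2]) / 2);
}

// Hypothetical restatement of the error rate: percent of tool calls, one decimal place.
function errorRatePct(toolErrors, totalTools) {
  const rate = totalTools ? (100 * toolErrors / totalTools) : 0;
  return Math.round(rate * 10) / 10;
}

console.log(medianRounded([3, 8, 5]));   // odd count: middle value
console.log(medianRounded([4, 7]));      // even count: rounded mean of the middle pair
console.log(errorRatePct(42, 1000));
console.log(errorRatePct(0, 0));         // no tool calls: guarded against division by zero
```

One thing to keep in mind when reading the result: the denominator is the number of `tool_use` blocks while the numerator counts failing `tool_result` blocks, so the rate is only meaningful when the two are logged in the same sessions.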
@@ -300,6 +336,11 @@ function aggregate(sessions) {
       outputTokens,
       inputTokens,
       totalTools: tools.length,
+      toolErrorRate: Math.round(toolErrorRate * 10) / 10,
+      toolErrorsTotal,
+      retriesTotal,
+      retriesPerSession: Math.round(retriesPerSession * 10) / 10,
+      medianSessionLength: medianLen,
     };
   };
 
@@ -443,7 +484,37 @@ function grade(stats, setup) {
     });
   }
 
-  // 5.
+  // 5. Output signals (human sessions only). Best available local proxy for
+  // whether your sessions actually produce results vs grinding. Three inputs:
+  //   - tool error rate (lower = cleaner runs)
+  //   - retries per session (lower = first-shot success)
+  //   - median session length (very long = stuck, very short = trivial)
+  if (human.sessions && human.totalTools > 0) {
+    let oScore = 80;
+    if (human.toolErrorRate > 15) oScore -= 18;
+    else if (human.toolErrorRate > 8) oScore -= 10;
+    else if (human.toolErrorRate > 4) oScore -= 4;
+
+    if (human.retriesPerSession > 6) oScore -= 12;
+    else if (human.retriesPerSession > 3) oScore -= 6;
+
+    if (human.medianSessionLength > 100) oScore -= 15; // genuinely stuck
+    else if (human.medianSessionLength > 50) oScore -= 6; // long grinds
+    else if (human.medianSessionLength >= 2 && human.medianSessionLength <= 20) oScore += 4; // healthy
+
+    oScore = Math.max(0, Math.min(100, oScore));
+
+    dims.push({
+      name: 'Output signals',
+      score: oScore,
+      detail: `Tool error rate ${human.toolErrorRate}%, ${human.retriesPerSession} retries per session, median session ${human.medianSessionLength} messages.`,
+      fix: human.toolErrorRate > 15
+        ? 'Your tool error rate is high. Sessions are fighting the environment more than producing output.'
+        : null,
+    });
+  }
+
+  // 6. Agent pipeline grade (only if agent sessions exist).
   if (agent.sessions) {
     let aScore = 75;
     if (agent.sessions > 50) aScore += 8;
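Reviewer note: the Output signals rubric above can be pulled out as a pure function to see its range. `outputSignalsScore` below is a hypothetical wrapper around the same thresholds; the input objects are invented.

```javascript
// Hypothetical pure-function restatement of the "Output signals" thresholds in the hunk above.
function outputSignalsScore({ toolErrorRate, retriesPerSession, medianSessionLength }) {
  let score = 80;
  if (toolErrorRate > 15) score -= 18;
  else if (toolErrorRate > 8) score -= 10;
  else if (toolErrorRate > 4) score -= 4;

  if (retriesPerSession > 6) score -= 12;
  else if (retriesPerSession > 3) score -= 6;

  if (medianSessionLength > 100) score -= 15;
  else if (medianSessionLength > 50) score -= 6;
  else if (medianSessionLength >= 2 && medianSessionLength <= 20) score += 4;

  return Math.max(0, Math.min(100, score));
}

const best = outputSignalsScore({ toolErrorRate: 0, retriesPerSession: 0, medianSessionLength: 10 });
const worst = outputSignalsScore({ toolErrorRate: 20, retriesPerSession: 7, medianSessionLength: 150 });
const typical = outputSignalsScore({ toolErrorRate: 4.2, retriesPerSession: 0, medianSessionLength: 8 });
console.log({ best, worst, typical });
```

With these thresholds the score appears to stay between 35 and 84, so the clamp to 0..100 never triggers. Median lengths below 2 or between 21 and 50 receive no adjustment at all, even though the comment describes very short sessions as trivial.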
@@ -604,6 +675,11 @@ if (hasFlag('--json')) {
       },
       output_tokens: stats.human.outputTokens,
       input_tokens: stats.human.inputTokens,
+      tool_error_rate_pct: stats.human.toolErrorRate,
+      tool_errors_total: stats.human.toolErrorsTotal,
+      retries_total: stats.human.retriesTotal,
+      retries_per_session: stats.human.retriesPerSession,
+      median_session_length: stats.human.medianSessionLength,
     } : null,
     agent: stats.agent ? {
       sessions: stats.agent.sessions,
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@uxcontinuum/ccaudit",
-  "version": "1.0
+  "version": "1.1.0",
   "description": "A diagnostic for your Claude Code setup. Reads ~/.claude/ locally, grades you across hook coverage, project hygiene, tool balance, prompt tells, and pipeline ops. Zero install: npx @uxcontinuum/ccaudit",
   "main": "index.js",
   "bin": {