@uxcontinuum/ccaudit 1.0.3 → 1.1.1
- package/README.md +77 -37
- package/index.js +83 -7
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,12 +1,32 @@
 # ccaudit
 
-A diagnostic for your Claude Code setup.
+A diagnostic for your Claude Code setup. Three things at once:
+
+1. **A fun report card** you can screenshot and share.
+2. **A hygiene linter** that surfaces what's missing.
+3. **A discovery tool** that shows you which parts of Claude Code you are not using yet.
 
 ```bash
 npx @uxcontinuum/ccaudit
 ```
 
-
+Zero install. Zero dependencies. No network calls. Reads `~/.claude/` on your machine and outputs a grade card.
+
+## Why this exists
+
+Most Claude Code users are running on a fraction of the surface area. No hooks installed. No skills configured. No MCP servers. No idea what their token cost per shipped feature is. No concept of how often their agent fails on first try.
+
+The hype is on the model. The actual constraint is everything around the model. The scaffolding.
+
+ccaudit grades the scaffolding.
+
+## What the grade is and isn't
+
+This is a **hygiene and discovery audit**, not an outcomes audit. It measures whether your Claude Code setup is **set up well** and **uses what's available**, not whether your specific outputs are good.
+
+Think of it as a linter for your AI workflow. Passing lint doesn't guarantee your code is good. Failing lint usually means something is missing. Same here: a high grade doesn't mean Claude is shipping perfect work for you. A low grade usually means there's surface area of Claude Code you haven't unlocked yet.
+
+The grade can be gamed (install five no-op hooks, auto-title every session, scrub "just" from your prompts). Don't bother. The findings under the grade are the value, not the letter.
 
 ## What you get
 
@@ -15,38 +35,50 @@ Reads `~/.claude/` locally. Zero dependencies. Nothing leaves your machine.
 CCAUDIT your Claude Code report card
 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 
-OVERALL GRADE
+OVERALL GRADE C+ (79/100)
 
-Hook coverage
-1
+Hook coverage A+ ████████████████████
+1 PostToolUse, 2 Stop, 1 PreToolUse, autoMemory plugin.
 
 Project hygiene (human) F ████████░░░░░░░░░░░░
-0%
-→ Title your sessions. Untitled sessions are unsearchable history.
+0% titled, launched from 10 distinct working dirs.
 
-Tool balance (human)
-Bash 73%, Edit+Write 10
-→ You are running things, not editing things. Use Edit/Write more.
+Tool balance (human) D+ ██████████████░░░░░░
+Bash 73%, Edit+Write 10% (3,536 calls), Read 10%.
 
 Prompt tells C ███████████████░░░░░
-
-
+Said "just" 10,236 times across 19,192 prompts (53%).
+
+Output signals B ████████████████░░░░
+Tool error rate 4.2%, median session length 8 messages.
 
 Pipeline ops (agent sessions) B █████████████████░░░
-
+3,253 agent-spawned sessions, 26.93M output tokens.
 ```
 
 ## What it checks
 
-| Dimension |
-
-| Hook coverage | `~/.claude/settings.json`
-| Project hygiene |
-| Tool balance |
-| Prompt tells | "just"
-|
+| Dimension | What it measures | What it cannot see |
+|-----------|------------------|-------------------|
+| Hook coverage | Hooks configured in `~/.claude/settings.json` across all event types, plus `autoMemoryEnabled` plugin flag | Whether the hooks actually do anything useful |
+| Project hygiene | Custom titles, auto-slugs, CWD diversity, prompt length | Whether your titles describe the work accurately |
+| Tool balance | Distribution across Bash, Edit, Read, Grep, Agent. Adaptive: high Bash% is okay if absolute Edit volume is also high | Whether each tool call accomplished the goal |
+| Prompt tells | Frequency of hedge words ("just", "please"), prompt clarity heuristics | Whether your prompts produce good outputs |
+| Output signals | Tool-call error rate, median session length, within-session retry patterns | Whether your shipped code works in production |
+| Pipeline ops | Agent-spawned session count, token spend, hook coverage relative to volume | Whether your pipeline ships features that don't break |
+
+It separates human-driven sessions from agent-spawned worktrees via three signals (`isSidechain`, `userType`, UUID/hex dir-name pattern). Operator grade and pipeline grade get scored independently against different rubrics.
 
-
+## Cross-platform support
+
+Works on:
+- macOS (`$HOME/.claude/`)
+- Linux (`$HOME/.claude/`)
+- Windows / WSL (`%USERPROFILE%\.claude\` or `$HOME/.claude/`)
+- VPS / non-default home (uses Node's `os.homedir()`)
+- Running as root with users in `/home/*` (scans all)
+
+Tested against setups ranging from "brand new install with zero sessions" to "20,000 sessions and 4,000 agent worktrees."
 
 ## Install
 
@@ -54,7 +86,7 @@ It separates human-driven sessions from agent-spawned worktrees (UUID and ULID-s
 # Run once without installing
 npx @uxcontinuum/ccaudit
 
-# Or globally
+# Or install globally
 npm i -g @uxcontinuum/ccaudit
 ccaudit
 ```
@@ -64,34 +96,42 @@ Requires Node 14+. No other dependencies.
 ## Options
 
 ```bash
-ccaudit # full report, last 30 days
+ccaudit # full report, last 30 days
 ccaudit --days 7 # just last week
 ccaudit --days 365 # full year
+ccaudit --json # programmatic output, anonymized
 ccaudit --no-color # plain text for copying
 ```
 
-##
+## Privacy
+
+Reads `~/.claude/` on your machine. Outputs to stdout. No network calls, no telemetry, no opt-in submission. The `--json` output is anonymized (no prompts, no slugs, no CWD strings, just aggregate counts and percentages).
 
-
+## Honest disclaimers
 
--
--
--
--
-- Pipeline ops rewards low tokens-per-session and penalizes running an agent pipeline without runtime hooks.
+- The grade is opinionated, not objective.
+- The rubric will change as the tool matures.
+- High grade ≠ good outputs. Low grade ≠ bad outputs. The grade is about **scaffolding and feature coverage**, not results.
+- The tool ships with a built-in nudge toward [Continuum Sprint](https://uxcontinuum.com/sprint) when it surfaces 2+ failing dimensions. That is intentional. If your setup is genuinely broken in two places, a 2-week sprint is often what fixes it. Ignore the nudge if you don't want it.
 
-The
+## The story behind this
 
-
+Karpathy keeps saying we're entering vibe coding. Software you write in English while AI generates the code. He is not wrong about where this is going.
 
-
+I bought in six months ago. Built a multi-agent pipeline. Started shipping production code through it. Six weeks of recent data: 333 PRs, $1,132 in tokens, $3.40 per shipped PR.
 
-
+Then I ran ccaudit on myself, expecting an A.
 
-
+I got a B-.
 
-
+The findings were valid. The reason I assumed A was that I had been optimizing the agents and ignoring the room they live in. Almost everyone running Claude Code is doing the same thing. The hype is on the model. The constraint is the scaffolding.
+
+If you want to know what your scaffolding looks like graded:
+
+```bash
+npx @uxcontinuum/ccaudit
+```
 
 ---
 
-Built by [Matt Turley](https://uxcontinuum.com)
+Built by [Matt Turley](https://uxcontinuum.com).
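The new README says human and agent sessions are separated by three signals (`isSidechain`, `userType`, and a UUID/hex directory-name pattern), and the `index.js` part of this diff combines them with a plain OR. A minimal sketch of that rule follows. The diff does not show the body of `fallbackAgentDirGuess`, so `looksLikeAgentDir` and its two regular expressions are hypothetical stand-ins, and `isAgentSession` is an illustrative name rather than a function from the package.

```javascript
// Hypothetical stand-in for the package's directory-name check.
// The real fallbackAgentDirGuess is not visible in this diff.
function looksLikeAgentDir(dirName) {
  // A canonical 8-4-4-4-12 UUID anywhere in the name.
  const uuid = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/i;
  // A trailing run of 16 or more hex characters.
  const longHex = /(^|[-_])[0-9a-f]{16,}$/i;
  return uuid.test(dirName) || longHex.test(dirName);
}

// Any one signal is enough to classify the session as agent-spawned.
function isAgentSession({ isSidechain, userType, projDir }) {
  return Boolean(
    isSidechain ||
    (userType && userType !== 'external') ||
    looksLikeAgentDir(projDir || '')
  );
}

const samples = [
  { isSidechain: false, userType: 'external', projDir: '-Users-me-code-app' },
  { isSidechain: true, userType: 'external', projDir: '-Users-me-code-app' },
  { isSidechain: false, userType: 'internal', projDir: '-Users-me-code-app' },
  { isSidechain: false, userType: 'external', projDir: 'wt-3f2b8c1e-9a4d-4e7b-8c2f-1a2b3c4d5e6f' },
];

for (const s of samples) {
  console.log(s.projDir, '=>', isAgentSession(s) ? 'agent' : 'human');
}
```

Because the signals are OR-ed, a false positive from any single one moves a session into the pipeline rubric, so the directory-name pattern is the part most worth checking against real data.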
package/index.js
CHANGED
@@ -75,6 +75,14 @@ function projDirName(filePath) {
   return idx >= 0 && idx + 1 < parts.length ? parts[idx + 1] : '';
 }
 
+// Cheap fingerprint of a tool_use input. Used to detect within-session retries.
+function fpToolUse(name, input) {
+  const key = typeof input === 'object' && input
+    ? (input.command || input.file_path || input.path || input.pattern || JSON.stringify(input))
+    : String(input ?? '');
+  return name + '::' + String(key).slice(0, 200);
+}
+
 function parseSession(filePath, cutoffMs) {
   let lines;
   try { lines = fs.readFileSync(filePath, 'utf8').split('\n'); } catch (_) { return null; }
@@ -92,6 +100,8 @@ function parseSession(filePath, cutoffMs) {
   let entrypoint = null;
   let claudeVersion = null;
   let messageCount = 0;
+  let toolErrors = 0;
+  const fpCounts = new Map(); // tool_use fingerprint → count, for retry detection
 
   for (const raw of lines) {
     if (!raw) continue;
@@ -119,7 +129,11 @@ function parseSession(filePath, cutoffMs) {
     if (msg.type === 'user') {
       const c = msg.message?.content;
       if (Array.isArray(c)) {
-        for (const b of c)
+        for (const b of c) {
+          if (b?.type === 'text' && b.text?.trim()) userPrompts.push(b.text.trim());
+          // tool_result blocks appear in user messages. is_error true = the tool call failed.
+          if (b?.type === 'tool_result' && b.is_error === true) toolErrors++;
+        }
       } else if (typeof c === 'string' && c.trim()) {
         userPrompts.push(c.trim());
       }
@@ -128,7 +142,13 @@ function parseSession(filePath, cutoffMs) {
     if (msg.type === 'assistant') {
       const c = msg.message?.content;
       if (Array.isArray(c)) {
-        for (const b of c)
+        for (const b of c) {
+          if (b?.type === 'tool_use') {
+            toolCalls.push(b.name || 'unknown');
+            const fp = fpToolUse(b.name || 'unknown', b.input);
+            fpCounts.set(fp, (fpCounts.get(fp) || 0) + 1);
+          }
+        }
       }
       const u = msg.message?.usage;
       if (u) {
@@ -141,14 +161,15 @@ function parseSession(filePath, cutoffMs) {
   if (!timestamps.length) return null;
 
   const projDir = projDirName(filePath);
-  // Multi-signal agent detector. Any of these is sufficient:
-  // - isSidechain: subagent inside another Claude session
-  // - userType non-external: internal automation invocation
-  // - dir-name matches UUID/hex pattern: orchestrator-spawned worktree
   const isAgent = isSidechain ||
     (userType && userType !== 'external') ||
     fallbackAgentDirGuess(projDir);
 
+  // Within-session retries: any fingerprint that fired >1 time. Count the
+  // excess fires beyond the first as retries.
+  let retries = 0;
+  for (const c of fpCounts.values()) if (c > 1) retries += (c - 1);
+
   return {
     projDir,
     isAgent,
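The retry detection added across the hunks above (fingerprint each `tool_use`, count fingerprints per session, treat every repeat beyond the first as a retry) can be sketched end to end as below. `fpToolUse` is copied from the diff. `countRetries` and `sampleCalls` are illustrative names that do not appear in the package, and the sample inputs are made up.

```javascript
// Copied from the diff: a cheap fingerprint of a tool_use block.
function fpToolUse(name, input) {
  const key = typeof input === 'object' && input
    ? (input.command || input.file_path || input.path || input.pattern || JSON.stringify(input))
    : String(input ?? '');
  return name + '::' + String(key).slice(0, 200);
}

// Illustrative wrapper: the same counting logic the diff adds to
// parseSession, applied to a plain array of { name, input } objects.
function countRetries(toolUses) {
  const fpCounts = new Map();
  for (const { name, input } of toolUses) {
    const fp = fpToolUse(name || 'unknown', input);
    fpCounts.set(fp, (fpCounts.get(fp) || 0) + 1);
  }
  let retries = 0;
  // Each fire beyond the first of a given fingerprint counts as one retry.
  for (const c of fpCounts.values()) if (c > 1) retries += c - 1;
  return retries;
}

const sampleCalls = [
  { name: 'Bash', input: { command: 'npm test' } },
  { name: 'Edit', input: { file_path: 'src/app.js' } },
  { name: 'Bash', input: { command: 'npm test' } },
  { name: 'Bash', input: { command: 'npm test' } },
  { name: 'Read', input: { file_path: 'src/app.js' } },
];

console.log(countRetries(sampleCalls)); // 2
```

One consequence of the fingerprint as written: it keys on the first truthy field among `command`, `file_path`, `path`, and `pattern`, so two `Edit` calls on the same file count as a retry even when the edits themselves differ. The same tool name with a different key, or a different tool on the same file, does not.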
@@ -157,6 +178,8 @@ function parseSession(filePath, cutoffMs) {
     cwd: cwd || '',
     userPrompts,
     toolCalls,
+    toolErrors,
+    retries,
     timestamps,
     outputTokens,
     inputTokens,
@@ -283,6 +306,19 @@ function aggregate(sessions) {
     const outputTokens = subset.reduce((n, s) => n + s.outputTokens, 0);
     const inputTokens = subset.reduce((n, s) => n + s.inputTokens, 0);
 
+    const toolErrorsTotal = subset.reduce((n, s) => n + (s.toolErrors || 0), 0);
+    const retriesTotal = subset.reduce((n, s) => n + (s.retries || 0), 0);
+    const toolErrorRate = tools.length ? (100 * toolErrorsTotal / tools.length) : 0;
+    const retriesPerSession = subset.length ? (retriesTotal / subset.length) : 0;
+
+    // Median session length (message count). Cheaper proxy for first-shot success.
+    const lengths = subset.map(s => s.messageCount).sort((a, b) => a - b);
+    const medianLen = lengths.length
+      ? (lengths.length % 2 === 1
+        ? lengths[(lengths.length - 1) / 2]
+        : Math.round((lengths[lengths.length / 2 - 1] + lengths[lengths.length / 2]) / 2))
+      : 0;
+
     return {
       sessions: subset.length,
       prompts: prompts.length,
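The aggregate statistics added in the hunk above come down to two small formulas: a median that rounds the average of the two middle values for even-length input, and an error rate expressed per 100 tool calls. A standalone sketch, assuming plain numeric inputs (`medianLength` and `toolErrorRatePct` are illustrative helper names, not functions from the package; the one-decimal rounding mirrors what the next hunk applies when returning `toolErrorRate`):

```javascript
// Median of message counts; even-length input averages the two middle
// values and rounds to an integer, matching the expression in the diff.
function medianLength(counts) {
  const lengths = [...counts].sort((a, b) => a - b);
  if (!lengths.length) return 0;
  const mid = Math.floor(lengths.length / 2);
  return lengths.length % 2 === 1
    ? lengths[mid]
    : Math.round((lengths[mid - 1] + lengths[mid]) / 2);
}

// Failed tool results per 100 tool calls, rounded to one decimal place.
function toolErrorRatePct(toolErrors, totalToolCalls) {
  const rate = totalToolCalls ? (100 * toolErrors / totalToolCalls) : 0;
  return Math.round(rate * 10) / 10;
}

console.log(medianLength([12, 4, 8]));     // 8
console.log(medianLength([4, 8, 9, 30]));  // 9, since (8 + 9) / 2 = 8.5 rounds up
console.log(toolErrorRatePct(21, 500));    // 4.2
console.log(toolErrorRatePct(3, 0));       // 0, guarded against division by zero
```

Note that the denominator is the number of tool calls, not the number of sessions, so one long session with many failing calls moves the rate as much as many short ones.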
@@ -300,6 +336,11 @@ function aggregate(sessions) {
       outputTokens,
       inputTokens,
       totalTools: tools.length,
+      toolErrorRate: Math.round(toolErrorRate * 10) / 10,
+      toolErrorsTotal,
+      retriesTotal,
+      retriesPerSession: Math.round(retriesPerSession * 10) / 10,
+      medianSessionLength: medianLen,
     };
   };
 
@@ -443,7 +484,37 @@ function grade(stats, setup) {
     });
   }
 
-  // 5.
+  // 5. Output signals (human sessions only). Best available local proxy for
+  // whether your sessions actually produce results vs grinding. Three inputs:
+  // - tool error rate (lower = cleaner runs)
+  // - retries per session (lower = first-shot success)
+  // - median session length (very long = stuck, very short = trivial)
+  if (human.sessions && human.totalTools > 0) {
+    let oScore = 80;
+    if (human.toolErrorRate > 15) oScore -= 18;
+    else if (human.toolErrorRate > 8) oScore -= 10;
+    else if (human.toolErrorRate > 4) oScore -= 4;
+
+    if (human.retriesPerSession > 6) oScore -= 12;
+    else if (human.retriesPerSession > 3) oScore -= 6;
+
+    if (human.medianSessionLength > 100) oScore -= 15; // genuinely stuck
+    else if (human.medianSessionLength > 50) oScore -= 6; // long grinds
+    else if (human.medianSessionLength >= 2 && human.medianSessionLength <= 20) oScore += 4; // healthy
+
+    oScore = Math.max(0, Math.min(100, oScore));
+
+    dims.push({
+      name: 'Output signals',
+      score: oScore,
+      detail: `Tool error rate ${human.toolErrorRate}%, ${human.retriesPerSession} retries per session, median session ${human.medianSessionLength} messages.`,
+      fix: human.toolErrorRate > 15
+        ? 'Your tool error rate is high. Sessions are fighting the environment more than producing output.'
+        : null,
+    });
+  }
+
+  // 6. Agent pipeline grade (only if agent sessions exist).
   if (agent.sessions) {
     let aScore = 75;
     if (agent.sessions > 50) aScore += 8;
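The "Output signals" rubric in the hunk above is easier to follow as a pure function with a few worked inputs. The thresholds below are taken from the diff. The function name `outputSignalsScore` and the sample inputs are illustrative, and the retry figure in the first example is an assumption because the README sample output does not show one.

```javascript
// Thresholds copied from the diff; the wrapper itself is illustrative.
function outputSignalsScore({ toolErrorRate, retriesPerSession, medianSessionLength }) {
  let score = 80;

  // Tool error rate: only the first matching band applies.
  if (toolErrorRate > 15) score -= 18;
  else if (toolErrorRate > 8) score -= 10;
  else if (toolErrorRate > 4) score -= 4;

  // Retries per session.
  if (retriesPerSession > 6) score -= 12;
  else if (retriesPerSession > 3) score -= 6;

  // Median session length: long sessions are penalized, 2 to 20 messages
  // earns a small bonus, everything else is neutral.
  if (medianSessionLength > 100) score -= 15;
  else if (medianSessionLength > 50) score -= 6;
  else if (medianSessionLength >= 2 && medianSessionLength <= 20) score += 4;

  return Math.max(0, Math.min(100, score));
}

// README sample values (4.2% errors, median 8); retries assumed to be low.
console.log(outputSignalsScore({ toolErrorRate: 4.2, retriesPerSession: 1.5, medianSessionLength: 8 })); // 80
// Worst band on every input.
console.log(outputSignalsScore({ toolErrorRate: 20, retriesPerSession: 7, medianSessionLength: 120 }));  // 35
// Best band on every input.
console.log(outputSignalsScore({ toolErrorRate: 2, retriesPerSession: 0, medianSessionLength: 10 }));    // 84
```

With these thresholds the score can only land between 35 and 84, so the clamp to 0 through 100 never takes effect for this dimension as written. How a numeric score maps to a letter grade is not shown in this diff.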
@@ -604,6 +675,11 @@ if (hasFlag('--json')) {
       },
       output_tokens: stats.human.outputTokens,
       input_tokens: stats.human.inputTokens,
+      tool_error_rate_pct: stats.human.toolErrorRate,
+      tool_errors_total: stats.human.toolErrorsTotal,
+      retries_total: stats.human.retriesTotal,
+      retries_per_session: stats.human.retriesPerSession,
+      median_session_length: stats.human.medianSessionLength,
     } : null,
     agent: stats.agent ? {
       sessions: stats.agent.sessions,
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@uxcontinuum/ccaudit",
-  "version": "1.
+  "version": "1.1.1",
   "description": "A diagnostic for your Claude Code setup. Reads ~/.claude/ locally, grades you across hook coverage, project hygiene, tool balance, prompt tells, and pipeline ops. Zero install: npx @uxcontinuum/ccaudit",
   "main": "index.js",
   "bin": {