specrails-core 1.4.0 → 1.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,552 @@
---
name: "Agent Telemetry & Cost Tracker"
description: "Inspect per-agent execution metrics: token usage (input/output/cache), estimated API cost, run count, average duration, and success/failure rate. Reads Claude CLI JSONL session logs and agent-memory files. Outputs a cost dashboard with trend indicators and optimization recommendations."
category: Workflow
tags: [workflow, telemetry, cost, metrics, analytics, agents]
---

Analyze **{{PROJECT_NAME}}** agent execution telemetry — token usage, API cost estimates, run throughput, and performance trends across all `sr-*` agents.

**Input:** `$ARGUMENTS` — optional flags:

- `--period <filter>` — time window for analysis. Values: `today`, `week`, `all`. Default: `week`.
- `--agent <name>` — focus on a single agent (e.g. `sr-developer`). Default: all agents.
- `--format <fmt>` — output format: `markdown` or `json`. Default: `markdown`.
- `--save` — write a snapshot to `.claude/telemetry/` after display.

---
## Phase 0: Argument Parsing

Parse `$ARGUMENTS` to set runtime variables.

**Variables to set:**

- `PERIOD` — `"today"`, `"week"`, or `"all"`. Default: `"week"`.
- `AGENT_FILTER` — string or empty string. Default: `""` (all agents).
- `FORMAT` — `"markdown"` or `"json"`. Default: `"markdown"`.
- `SAVE_SNAPSHOT` — boolean. Default: `false`.
- `PERIOD_START` — ISO datetime lower bound derived from `PERIOD` and the current date/time. Set to `null` for `"all"`.

**Parsing rules:**

1. Scan `$ARGUMENTS` for `--period <value>`. Valid values: `today`, `week`, `all`. If an unknown value is found: print `Error: unknown period "<value>". Valid: today, week, all` and stop. Strip from arguments.
2. Scan for `--agent <name>`. If found, set `AGENT_FILTER=<name>` (lowercase; if the user omitted the leading `sr-`, add it — always normalize to `sr-<name>` form internally). Strip from arguments.
3. Scan for `--format <value>`. Valid: `markdown`, `json`. Unknown value: print `Error: unknown format "<value>". Valid: markdown, json` and stop. Strip from arguments.
4. Scan for `--save`. If found, set `SAVE_SNAPSHOT=true`. Strip from arguments.

**Derive `PERIOD_START`:**

- `today` → start of today (00:00:00 UTC of the current date)
- `week` → 7 days ago at 00:00:00 UTC
- `all` → `null` (no lower bound)
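The parsing rules and `PERIOD_START` derivation above can be sketched in Python. This is a minimal illustration, not part of the command itself; the helper names (`parse_args`, `derive_period_start`) are hypothetical:

```python
from datetime import datetime, timedelta, timezone

VALID_PERIODS = {"today", "week", "all"}
VALID_FORMATS = {"markdown", "json"}

def parse_args(arguments):
    """Parse the $ARGUMENTS string into the Phase 0 runtime variables."""
    tokens = arguments.split()
    config = {"PERIOD": "week", "AGENT_FILTER": "", "FORMAT": "markdown",
              "SAVE_SNAPSHOT": False}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--period":
            value = tokens[i + 1]
            if value not in VALID_PERIODS:
                raise ValueError(f'unknown period "{value}". Valid: today, week, all')
            config["PERIOD"] = value
            i += 2
        elif tok == "--agent":
            name = tokens[i + 1].lower()
            # Normalize to sr-<name>: add the prefix when the user omitted it.
            config["AGENT_FILTER"] = name if name.startswith("sr-") else f"sr-{name}"
            i += 2
        elif tok == "--format":
            value = tokens[i + 1]
            if value not in VALID_FORMATS:
                raise ValueError(f'unknown format "{value}". Valid: markdown, json')
            config["FORMAT"] = value
            i += 2
        elif tok == "--save":
            config["SAVE_SNAPSHOT"] = True
            i += 1
        else:
            i += 1  # ignore unrelated tokens
    return config

def derive_period_start(period, now=None):
    """Datetime lower bound for the period; None means no bound ("all")."""
    now = now or datetime.now(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "today":
        return midnight
    if period == "week":
        return midnight - timedelta(days=7)
    return None
```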
**Print active configuration:**

```
Period: <today | last 7 days | all time>  Agent: <AGENT_FILTER or "all">  Format: <markdown|json>  Save: <yes|no>
```

---

## Phase 1: Log Discovery

Collect JSONL session log files that may contain Claude CLI usage data. Search the following locations in order; collect all matching files (do not stop at the first hit):

### 1a. Project-scoped logs

Check for `.jsonl` files under the project `.claude/` directory (excluding `.claude/agent-memory/` and `.claude/health-history/`):

```bash
find .claude/ -name "*.jsonl" \
  -not -path ".claude/agent-memory/*" \
  -not -path ".claude/health-history/*" \
  -not -path ".claude/telemetry/*" 2>/dev/null
```
### 1b. Global Claude CLI logs

Claude CLI stores project sessions under `~/.claude/projects/`, in one directory per project whose name is derived from the project's absolute path (the exact encoding scheme varies by CLI version). Try:

```bash
# Attempt 1: match by cwd
PROJECT_HASH=$(ls ~/.claude/projects/ 2>/dev/null | while read d; do
  cwd_file="$HOME/.claude/projects/$d/.cwd"
  [ -f "$cwd_file" ] && grep -qF "$(pwd)" "$cwd_file" && echo "$d" && break
done)

# Attempt 2: glob all projects and filter by recent mtime if Attempt 1 fails
```

If a matching project directory is found, glob for JSONL files:

```bash
find "$HOME/.claude/projects/$PROJECT_HASH/" -name "*.jsonl" 2>/dev/null
```
### 1c. Explicit log paths

Also check these well-known paths:

- `~/.claude/logs/*.jsonl`
- `.claude/runs/*.jsonl`

### 1d. Collect results

Set `LOG_FILES` = deduplicated list of all discovered `.jsonl` paths, sorted by modification time descending.

Apply time filter: if `PERIOD_START` is not null, exclude files whose last modification time is before `PERIOD_START`.
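A minimal sketch of the collect step, assuming the glob patterns listed above (the `collect_log_files` helper is hypothetical, and the real command resolves the `~/.claude/projects/<hash>` path separately):

```python
import glob
import os

def collect_log_files(period_start=None):
    """Gather candidate .jsonl logs (Phase 1), dedupe, sort by mtime
    descending, and drop files last modified before PERIOD_START."""
    patterns = [
        ".claude/**/*.jsonl",  # project-scoped; exclusions applied below
        os.path.expanduser("~/.claude/logs/*.jsonl"),
        ".claude/runs/*.jsonl",
    ]
    excluded = (".claude/agent-memory/", ".claude/health-history/",
                ".claude/telemetry/")
    found = set()
    for pattern in patterns:
        for path in glob.glob(pattern, recursive=True):
            if path.startswith(excluded):
                continue
            found.add(os.path.abspath(path))  # abspath dedupes overlaps
    files = sorted(found, key=os.path.getmtime, reverse=True)
    if period_start is not None:
        cutoff = period_start.timestamp()
        files = [f for f in files if os.path.getmtime(f) >= cutoff]
    return files
```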
**Print discovery summary:**

```
Log files discovered: N (project-scoped: N | global: N | other: N)
```

If `LOG_FILES` is non-empty, set `LOGS_AVAILABLE=true`.

If `LOG_FILES` is empty:

```
No JSONL session logs found. Telemetry will be derived from agent-memory metadata only.
Log locations searched:
- .claude/**/*.jsonl
- ~/.claude/projects/<hash>/**/*.jsonl
- ~/.claude/logs/*.jsonl
- .claude/runs/*.jsonl

To enable richer telemetry, configure Claude CLI to persist session logs.
```

Then set `LOGS_AVAILABLE=false` and continue (the command falls back to agent-memory in Phase 3).

---
## Phase 2: Parse Session Logs

Skip this phase if `LOGS_AVAILABLE=false`.

For each file in `LOG_FILES`, read line-by-line and parse each valid JSON object. Extract records that match the Claude CLI session result schema.

### 2a. Record schema

A valid telemetry record has the following structure (fields may be absent — treat missing as null/zero):

```json
{
  "type": "result" | "assistant" | "tool_result" | ...,
  "session_id": "<uuid>",
  "agent": "<agent-name>",
  "model": "<model-id>",
  "duration_ms": 12345,
  "is_error": false,
  "usage": {
    "input_tokens": 1000,
    "output_tokens": 500,
    "cache_read_input_tokens": 200,
    "cache_creation_input_tokens": 50
  },
  "timestamp": "<ISO 8601>"
}
```
**Extraction rules:**

- Include records where `type` is `"result"` or where `usage` is present with at least one non-zero token count.
- Infer `agent` name from:
  1. The `agent` field directly (if present).
  2. The source log file path (if the path contains `sr-<name>`, extract it).
  3. A `system_prompt` field snippet (if present): match against known agent persona names (`sr-architect`, `sr-developer`, `sr-test-writer`, `sr-reviewer`, `sr-security-reviewer`, `sr-doc-sync`, `sr-product-analyst`, `sr-product-manager`).
  4. If agent cannot be determined: assign to the bucket `"unknown"`.
- Apply time filter: if `PERIOD_START` is not null, skip records where `timestamp < PERIOD_START`.
- If `AGENT_FILTER` is set, skip records where the resolved agent name does not match.

### 2b. Build RAW_RECORDS

`RAW_RECORDS` = list of parsed records, each normalized to:

```
{ session_id, agent, model, timestamp, duration_ms, is_error, input_tokens, output_tokens, cache_read_tokens, cache_write_tokens }
```

Where `is_error` defaults to `false` and missing token counts default to `0`.

**Print:**

```
Parsed N records from N log files.
```
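The extraction and normalization rules can be sketched as follows, assuming the record fields from the schema in 2a. `normalize_records` is a hypothetical helper, and the timestamp filter assumes uniformly formatted ISO 8601 strings, which compare chronologically as plain strings:

```python
import json

# Map source usage keys (2a schema) to normalized RAW_RECORDS keys (2b).
TOKEN_KEYS = {
    "input_tokens": "input_tokens",
    "output_tokens": "output_tokens",
    "cache_read_input_tokens": "cache_read_tokens",
    "cache_creation_input_tokens": "cache_write_tokens",
}

def normalize_records(lines, agent_filter="", period_start=None):
    """Parse JSONL lines into RAW_RECORDS, applying the extraction rules:
    keep 'result' records or any record with a non-zero token count."""
    records = []
    for line in lines:
        try:
            obj = json.loads(line)
        except (json.JSONDecodeError, TypeError):
            continue  # skip malformed lines
        usage = obj.get("usage") or {}
        has_tokens = any(usage.get(k, 0) for k in TOKEN_KEYS)
        if obj.get("type") != "result" and not has_tokens:
            continue
        agent = obj.get("agent") or "unknown"
        if agent_filter and agent != agent_filter:
            continue
        timestamp = obj.get("timestamp")
        if period_start and timestamp and timestamp < period_start:
            continue  # ISO 8601 strings in one format compare chronologically
        record = {
            "session_id": obj.get("session_id"),
            "agent": agent,
            "model": obj.get("model"),
            "timestamp": timestamp,
            "duration_ms": obj.get("duration_ms"),
            "is_error": bool(obj.get("is_error", False)),
        }
        for src, dst in TOKEN_KEYS.items():
            record[dst] = usage.get(src, 0) or 0  # missing counts default to 0
        records.append(record)
    return records
```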
---

## Phase 3: Agent-Memory Inventory

Regardless of `LOGS_AVAILABLE`, collect metadata from `.claude/agent-memory/sr-*/`.

For each directory:

- `AGENT_NAME` — directory name (e.g. `sr-developer`)
- `FILE_COUNT` — total files in the directory
- `LAST_MODIFIED` — ISO date of the most recently modified file

If `AGENT_FILTER` is set, collect only the matching directory.

Set `MEMORY_AGENTS` = list of `{ agent_name, file_count, last_modified }`.

**If `LOGS_AVAILABLE=false`**, synthesize a minimal telemetry record per agent from agent-memory presence only:

```
{ agent: <agent_name>, run_count: "unknown", last_active: <last_modified>, tokens: null, cost: null }
```

This allows the dashboard to at least list known agents and their last activity.
---

## Phase 4: Aggregate Per-Agent Metrics

Group `RAW_RECORDS` by `agent`. For each agent, compute:

| Metric | Derivation |
|--------|-----------|
| `run_count` | Count of records for this agent |
| `success_count` | Count where `is_error=false` |
| `failure_count` | Count where `is_error=true` |
| `success_rate` | `success_count / run_count * 100` (%) |
| `total_input_tokens` | Sum of `input_tokens` |
| `total_output_tokens` | Sum of `output_tokens` |
| `total_cache_read_tokens` | Sum of `cache_read_tokens` |
| `total_cache_write_tokens` | Sum of `cache_write_tokens` |
| `total_tokens` | Sum of all token types |
| `avg_input_tokens` | `total_input_tokens / run_count` |
| `avg_output_tokens` | `total_output_tokens / run_count` |
| `avg_duration_ms` | Mean of non-null `duration_ms` values |
| `p50_duration_ms` | Median of non-null `duration_ms` values |
| `p95_duration_ms` | 95th percentile of non-null `duration_ms` values |
| `last_active` | Max `timestamp` across all records |
| `models_used` | Deduplicated list of `model` values observed |

Set `AGENT_METRICS` = map of agent name → aggregated metrics object.

Also compute `TOTALS`:

| Field | Derivation |
|-------|-----------|
| `total_runs` | Sum of `run_count` across all agents |
| `total_successes` | Sum of `success_count` |
| `total_failures` | Sum of `failure_count` |
| `overall_success_rate` | `total_successes / total_runs * 100` (%) |
| `grand_total_input_tokens` | Sum across all agents |
| `grand_total_output_tokens` | Sum across all agents |
| `grand_total_cache_read_tokens` | Sum across all agents |
| `grand_total_cache_write_tokens` | Sum across all agents |
| `grand_total_tokens` | Sum across all agents |
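A sketch of the aggregation, assuming the normalized `RAW_RECORDS` shape from Phase 2b. `percentile` uses the nearest-rank method, one reasonable reading of the p95 row above:

```python
import math
from collections import defaultdict
from statistics import mean, median

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (an assumption; the
    spec does not pin down an interpolation method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def aggregate(records):
    """Phase 4: group RAW_RECORDS by agent and compute the metrics table."""
    by_agent = defaultdict(list)
    for rec in records:
        by_agent[rec["agent"]].append(rec)
    metrics = {}
    for agent, recs in by_agent.items():
        durations = [r["duration_ms"] for r in recs if r["duration_ms"] is not None]
        timestamps = [r["timestamp"] for r in recs if r["timestamp"]]
        m = {
            "run_count": len(recs),
            "success_count": sum(1 for r in recs if not r["is_error"]),
            "failure_count": sum(1 for r in recs if r["is_error"]),
            "total_input_tokens": sum(r["input_tokens"] for r in recs),
            "total_output_tokens": sum(r["output_tokens"] for r in recs),
            "total_cache_read_tokens": sum(r["cache_read_tokens"] for r in recs),
            "total_cache_write_tokens": sum(r["cache_write_tokens"] for r in recs),
            "avg_duration_ms": mean(durations) if durations else None,
            "p50_duration_ms": median(durations) if durations else None,
            "p95_duration_ms": percentile(durations, 95) if durations else None,
            "last_active": max(timestamps) if timestamps else None,
            "models_used": sorted({r["model"] for r in recs if r["model"]}),
        }
        m["success_rate"] = m["success_count"] / m["run_count"] * 100
        m["total_tokens"] = (m["total_input_tokens"] + m["total_output_tokens"]
                             + m["total_cache_read_tokens"]
                             + m["total_cache_write_tokens"])
        m["avg_input_tokens"] = m["total_input_tokens"] / m["run_count"]
        m["avg_output_tokens"] = m["total_output_tokens"] / m["run_count"]
        metrics[agent] = m
    return metrics
```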
---

## Phase 5: Cost Estimation

Compute estimated API cost per agent using published Claude pricing (per million tokens). Use the `models_used` field to select the correct rate card. If multiple models were used, compute cost per-model and sum.

**Default rate cards (USD per million tokens):**

| Model family | Input | Output | Cache read | Cache write |
|---|---|---|---|---|
| `claude-opus-4*` | $15.00 | $75.00 | $1.50 | $3.75 |
| `claude-sonnet-4*` | $3.00 | $15.00 | $0.30 | $0.75 |
| `claude-haiku-4*` | $0.25 | $1.25 | $0.025 | $0.0625 |
| `unknown` | $3.00 | $15.00 | $0.30 | $0.75 |

Match model IDs by prefix (e.g. `claude-sonnet-4-6` → `claude-sonnet-4*`). If a model cannot be matched, use the `unknown` rate card and flag it in output.

**Per-agent cost formula:**

```
cost = (input_tokens / 1_000_000 * input_rate)
     + (output_tokens / 1_000_000 * output_rate)
     + (cache_read_tokens / 1_000_000 * cache_read_rate)
     + (cache_write_tokens / 1_000_000 * cache_write_rate)
```

Compute:

- `agent_cost` — total cost for the agent across all records in the current period
- `avg_cost_per_run` — `agent_cost / run_count`
- `cache_savings` — tokens saved by cache reads, expressed as cost equivalent:
  `cache_read_tokens / 1_000_000 * (input_rate - cache_read_rate)`

Add `agent_cost`, `avg_cost_per_run`, and `cache_savings` to each agent's metrics object.

Also compute:

- `TOTAL_COST` — sum of `agent_cost` across all agents
- `TOTAL_CACHE_SAVINGS` — sum of `cache_savings` across all agents
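The rate-card lookup and cost formula can be sketched as follows, with rates copied from the table above (`rate_card` and `estimate_cost` are hypothetical helper names):

```python
# Rate cards from the table above:
# (input, output, cache_read, cache_write) in USD per million tokens.
RATE_CARDS = {
    "claude-opus-4": (15.00, 75.00, 1.50, 3.75),
    "claude-sonnet-4": (3.00, 15.00, 0.30, 0.75),
    "claude-haiku-4": (0.25, 1.25, 0.025, 0.0625),
    "unknown": (3.00, 15.00, 0.30, 0.75),
}

def rate_card(model_id):
    """Match a model ID to a rate card by prefix; fall back to 'unknown'."""
    for prefix, rates in RATE_CARDS.items():
        if model_id and model_id.startswith(prefix):
            return rates
    return RATE_CARDS["unknown"]

def estimate_cost(input_tokens, output_tokens, cache_read, cache_write, model_id):
    """Per-agent cost formula from Phase 5, plus the cache-savings estimate."""
    in_rate, out_rate, cr_rate, cw_rate = rate_card(model_id)
    cost = (input_tokens / 1_000_000 * in_rate
            + output_tokens / 1_000_000 * out_rate
            + cache_read / 1_000_000 * cr_rate
            + cache_write / 1_000_000 * cw_rate)
    # Savings: what the cache-read tokens would have cost as fresh input.
    savings = cache_read / 1_000_000 * (in_rate - cr_rate)
    return cost, savings
```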
---

## Phase 6: Trend Analysis

Compare the current period's per-agent metrics against the previous equivalent period (loaded from `.claude/telemetry/` snapshots, if available).

### 6a. Load Previous Snapshot

Check `.claude/telemetry/` for JSON files matching `<YYYY-MM-DD>-<period>.json`. Select the snapshot from the previous equivalent period:

- `today` → yesterday's `today` snapshot
- `week` → the `week` snapshot from 7 days ago
- `all` → skip trend analysis (no meaningful comparison baseline for "all time")

Set `PREV_SNAPSHOT` = parsed JSON or `null` if not found / period is `"all"`.

### 6b. Compute Trend Indicators

If `PREV_SNAPSHOT` is not null, for each agent compute deltas for these metrics vs. the previous snapshot:

| Metric | Better direction |
|--------|------------------|
| `run_count` | higher ↑ |
| `success_rate` | higher ↑ |
| `avg_duration_ms` | lower ↓ |
| `total_tokens` | lower ↓ (efficiency) |
| `agent_cost` | lower ↓ |
| `cache_savings` | higher ↑ |

Assign a trend label per metric:

- `improving` — delta moves in the "better" direction by ≥ 5%
- `degrading` — delta moves in the "worse" direction by ≥ 5%
- `stable` — delta within ±5%
- `new` — agent did not exist in the previous snapshot

Set `TRENDS` = map of agent name → map of metric → `{ delta, direction, label }`.
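The labeling rule can be sketched as a single function; the handling of a zero previous value is an assumption, since the spec only defines percentage deltas:

```python
def trend_label(current, previous, higher_is_better, threshold_pct=5.0):
    """Phase 6b label: improving / degrading / stable, or 'new' when
    there is no previous value to compare against."""
    if previous is None:
        return "new"
    if previous == 0:
        # Assumption: any movement off zero counts as a full-size delta.
        if current == 0:
            return "stable"
        return "improving" if higher_is_better else "degrading"
    delta_pct = (current - previous) / abs(previous) * 100
    # Flip sign so that "better" is always positive.
    better = delta_pct if higher_is_better else -delta_pct
    if better >= threshold_pct:
        return "improving"
    if better <= -threshold_pct:
        return "degrading"
    return "stable"
```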
---

## Phase 7: Display Dashboard

Render output according to `FORMAT`.

### FORMAT = "markdown"

````
## Agent Telemetry Dashboard — {{PROJECT_NAME}}
Period: <today | last 7 days | all time> | As of: <ISO date> | Data source: <logs + memory | memory only>

---

### Summary

| Metric | Value |
|--------|-------|
| Total runs | N |
| Success rate | N% |
| Total tokens consumed | N (input: N, output: N, cached: N) |
| Estimated total cost | $N.NN |
| Cache savings | $N.NN |
| Active agents | N |

---

### Per-Agent Breakdown

<For each agent, sorted by agent_cost descending:>

#### <agent-name> <trend badge: 🟢 improving | 🔴 degrading | 🟡 stable | 🆕 new>

| Metric | Value | vs. Previous |
|--------|-------|-------------|
| Runs | N | <+N / -N / N/A> |
| Success rate | N% | <trend indicator> |
| Avg duration | Ns | <trend indicator> |
| Total tokens | N | <trend indicator> |
| — Input | N | |
| — Output | N | |
| — Cache reads | N | |
| — Cache writes | N | |
| Estimated cost | $N.NN | <trend indicator> |
| Avg cost / run | $N.NN | |
| Cache savings | $N.NN | <trend indicator> |
| Models used | <model-id, ...> | |
| Last active | <ISO date> | |

<if success_rate < 80%:>
⚠️ Low success rate detected. Check agent logs for recurring failures.

<if avg_duration_ms > 120_000:>
⚠️ Average run exceeds 2 minutes. Consider prompt optimization or task decomposition.

---

### Cost Distribution

| Agent | Cost | % of Total | Trend |
|-------|------|------------|-------|
| sr-developer | $N.NN | N% | 🟢 improving |
| ... | ... | ... | ... |
| **TOTAL** | **$N.NN** | 100% | |

---

### Cache Efficiency

Cache reads reduce cost compared to re-sending the same tokens as new input.

| Agent | Cache Read Tokens | Savings | Hit Rate |
|-------|------------------|---------|---------|
| sr-developer | N | $N.NN | N% |
| ... | ... | ... | ... |
| **TOTAL** | N | **$N.NN** | |

Cache hit rate = `cache_read_tokens / (input_tokens + cache_read_tokens) * 100`.

---

### Trend Summary

<if PREV_SNAPSHOT is null:>
No previous snapshot available for comparison (period: <PERIOD>).
Run with `--save` to persist a baseline for future trend analysis.

<if PREV_SNAPSHOT is not null:>

| Agent | Runs Δ | Success Δ | Duration Δ | Cost Δ | Overall |
|-------|--------|----------|-----------|--------|---------|
| sr-developer | +N | +N% | -Ns | -$N.NN | 🟢 improving |
| ... | ... | ... | ... | ... | ... |

---

### Recommendations

<render only applicable items, in priority order:>

1. **High failure rate** — if any agent has `success_rate < 80%`:
```
⚠️ <agent-name> has a N% success rate (N failures out of N runs).
Action: Review `.claude/agent-memory/<agent-name>/failure-patterns.md` for recurring errors.
```

2. **Cost outlier** — if any agent accounts for > 50% of total cost:
```
💰 <agent-name> accounts for N% of total cost ($N.NN / $N.NN).
Action: Review average token consumption per run (avg input: N, avg output: N).
Consider shorter prompts, tighter context windows, or caching more aggressively.
```

3. **Cache underutilization** — if any agent's cache hit rate < 20% and run count ≥ 5:
```
💡 <agent-name> cache hit rate is N% (N cache reads / N total input tokens).
Action: Ensure system prompts and static context are positioned where Claude can cache them.
Use the `--cache` flag if available in your Claude CLI version.
```

4. **Slow average duration** — if any agent's `avg_duration_ms > 120_000`:
```
🐢 <agent-name> average run duration is Ns (target: <120s).
Action: Profile the longest phases. Consider breaking large tasks into smaller subtasks.
```

5. **Degrading trend** — if any agent's overall trend is "degrading":
```
📉 <agent-name> metrics are degrading vs. the previous period.
Check: success rate, token usage, and cost deltas in the Per-Agent Breakdown above.
```

6. **Memory not cleared** — if any agent has > 50 memory files:
```
🗂️ <agent-name> has N memory files. Large memory stores may slow context loading.
Action: Run `/sr:memory-inspect --prune` to clean stale entries.
```

7. **Data gap warning** — if `LOGS_AVAILABLE=false`:
```
ℹ️ Telemetry is based on agent-memory metadata only (no JSONL logs found).
Token and cost data are unavailable. To enable full telemetry:
- Run /sr:implement with Claude CLI configured to persist session logs.
- Check your Claude CLI version for `--output-format` or `--log` options.
```

8. **No issues found** — if none of the above apply:
```
✅ All agents are operating within healthy parameters.
```
````
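The thresholds used by Recommendations items 1 to 4 can be checked programmatically; a sketch, assuming the per-agent metrics object built in Phases 4 and 5 (`recommendations` is a hypothetical helper):

```python
def recommendations(agent_name, m, total_cost, logs_available=True):
    """Evaluate the Recommendations triggers (items 1-4) for one agent.
    `m` is the agent's metrics object; thresholds mirror the list above."""
    notes = []
    if m["success_rate"] < 80:
        notes.append(f"high-failure-rate: {agent_name} at {m['success_rate']:.0f}%")
    if total_cost > 0 and m["agent_cost"] / total_cost > 0.50:
        notes.append(f"cost-outlier: {agent_name}")
    # Cache hit rate formula from the Cache Efficiency section.
    denom = m["total_input_tokens"] + m["total_cache_read_tokens"]
    hit_rate = m["total_cache_read_tokens"] / denom * 100 if denom else 0.0
    if hit_rate < 20 and m["run_count"] >= 5:
        notes.append(f"cache-underutilization: {agent_name} at {hit_rate:.0f}%")
    if (m.get("avg_duration_ms") or 0) > 120_000:
        notes.append(f"slow-average-duration: {agent_name}")
    return notes
```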
### FORMAT = "json"

Emit a single JSON object to stdout:

```json
{
  "schema_version": "1",
  "project": "{{PROJECT_NAME}}",
  "period": "<today|week|all>",
  "period_start": "<ISO datetime or null>",
  "generated_at": "<ISO datetime>",
  "data_source": "<logs+memory|memory-only>",
  "totals": {
    "run_count": 0,
    "success_rate": 0.0,
    "total_input_tokens": 0,
    "total_output_tokens": 0,
    "total_cache_read_tokens": 0,
    "total_cache_write_tokens": 0,
    "grand_total_tokens": 0,
    "total_cost_usd": 0.0,
    "total_cache_savings_usd": 0.0,
    "active_agents": 0
  },
  "agents": {
    "<agent-name>": {
      "run_count": 0,
      "success_count": 0,
      "failure_count": 0,
      "success_rate": 0.0,
      "total_input_tokens": 0,
      "total_output_tokens": 0,
      "total_cache_read_tokens": 0,
      "total_cache_write_tokens": 0,
      "total_tokens": 0,
      "avg_input_tokens": 0,
      "avg_output_tokens": 0,
      "avg_duration_ms": 0,
      "p50_duration_ms": 0,
      "p95_duration_ms": 0,
      "agent_cost_usd": 0.0,
      "avg_cost_per_run_usd": 0.0,
      "cache_savings_usd": 0.0,
      "cache_hit_rate": 0.0,
      "models_used": [],
      "last_active": "<ISO datetime>",
      "trend": {
        "run_count": "stable",
        "success_rate": "stable",
        "avg_duration_ms": "stable",
        "agent_cost_usd": "stable"
      }
    }
  }
}
```

---
## Phase 8: Store Snapshot (if --save)

Skip if `SAVE_SNAPSHOT=false`.

1. Determine filename: `<YYYY-MM-DD>-<period>.json` using today's ISO date and the `PERIOD` value.
2. Create `.claude/telemetry/` if it does not exist.
3. Write the JSON telemetry object (same schema as `FORMAT=json` output) to `.claude/telemetry/<filename>`.
4. Print: `Stored: .claude/telemetry/<filename>`

**Housekeeping:** If `.claude/telemetry/` contains more than 60 files, print:

```
Note: .claude/telemetry/ has N snapshots. Consider pruning old ones:
  ls -t .claude/telemetry/ | tail -n +61 | xargs -I{} rm .claude/telemetry/{}
```

**Gitignore advisory:** Check whether `.claude/telemetry` appears in `.gitignore`. If not:

```
Tip: Telemetry snapshots are local artifacts. Add to .gitignore:
  echo '.claude/telemetry/' >> .gitignore
```
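The snapshot write and housekeeping check can be sketched as follows (`store_snapshot` is a hypothetical helper; the real command also emits the pruning one-liner shown above):

```python
import json
import os
from datetime import date

def store_snapshot(telemetry, period, base_dir=".claude/telemetry"):
    """Phase 8: write <YYYY-MM-DD>-<period>.json and report snapshot count."""
    os.makedirs(base_dir, exist_ok=True)
    filename = f"{date.today().isoformat()}-{period}.json"
    path = os.path.join(base_dir, filename)
    with open(path, "w") as fh:
        json.dump(telemetry, fh, indent=2)
    print(f"Stored: {path}")
    snapshots = os.listdir(base_dir)
    if len(snapshots) > 60:
        print(f"Note: {base_dir}/ has {len(snapshots)} snapshots. "
              "Consider pruning old ones.")
    return path
```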
@@ -35,6 +35,8 @@
  | Functional | Maintain coding standards across a growing contributor base |
  | Functional | Triage issues and PRs — separate signal from noise |
  | Functional | Keep CI/CD green and catch regressions early |
+ | Functional | Keep dependencies up to date without introducing breaking changes |
+ | Functional | Coordinate releases — changelog curation, versioning, and publishing |
  | Social | Build a healthy community where contributors feel welcomed and guided |
  | Emotional | Avoid burnout from the growing volume of contributions and issues |
  | Emotional | Feel that their project is sustainable, not just surviving |
@@ -49,6 +51,8 @@
  | High | No way to enforce project-specific coding standards automatically beyond basic linting |
  | Medium | Automated scanning tools (security, code quality) generate noise — hard to distinguish real issues |
  | Medium | Onboarding contributors to the project's specific patterns and conventions is time-consuming |
+ | Medium | Dependency upgrades require manual changelog review and breakage risk assessment — Dependabot creates noise without project-specific context |
+ | Medium | Release coordination is manual — changelog curation, version bumping, and publishing require synchronous maintainer attention |
  | Medium | Feature requests pile up with no framework to evaluate which ones matter most to users |
  | Low | Sponsorship/funding doesn't scale with project popularity or maintenance burden |

@@ -69,6 +73,22 @@

  > Open-source maintainers are the most **time-constrained** users in the software ecosystem. They don't need more AI to *write* code — they need AI that *understands their project deeply enough* to review contributions, enforce conventions, and handle routine tasks so they can focus on architecture and community. The key unlock is project-specific intelligence, not generic coding ability.

+ ## Feature Evaluation Criteria
+
+ When evaluating whether a feature is worth Kai's time and adoption risk:
+
+ | Criterion | Question |
+ |-----------|----------|
+ | **Review burden** | Does this reduce time spent reviewing contributions without adding maintainer overhead? |
+ | **Convention enforcement** | Does this enforce project-specific rules, not just generic coding standards? |
+ | **GitHub-native** | Does this work with Issues, PRs, and Actions — the tools Kai already lives in? |
+ | **Cost ceiling** | Is this free or under $20/month? (OSS projects cannot justify SaaS pricing) |
+ | **Contributor UX** | Does this improve contributor experience without adding new maintainer responsibilities? |
+ | **Backwards compatibility** | Does this respect the project's stability contract with existing users? |
+ | **Scale range** | Does this work for a 200-star hobby project and a 50k-star ecosystem library alike? |
+
+ A feature scores high for Kai (4-5/5) when it reduces async review work, enforces conventions automatically, or handles routine coordination (dependency updates, release notes) without requiring Kai to be online. A feature scores low (0-1/5) when it adds configuration burden, requires paid tiers, or is primarily useful for teams rather than solo/small maintainer groups.
+
  ## Sources

  - [GitHub Blog — Welcome to the Eternal September of Open Source](https://github.blog/open-source/maintainers/welcome-to-the-eternal-september-of-open-source-heres-what-we-plan-to-do-for-maintainers/)