cctally 1.27.1 → 1.29.0

package/CHANGELOG.md CHANGED
@@ -5,6 +5,43 @@ based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 
  ## [Unreleased]
 
+ ## [1.29.0] - 2026-06-08
+
+ ### Added
+ - **Dashboard conversation viewer.** New full-screen Conversations workspace in `cctally dashboard`: a cost-aware transcript reader (rendered markdown, per-turn cost, collapsible thinking/tool/sidechain detail) plus cross-session full-text search that jumps to the highlighted message. Loopback-only by default; LAN needs `dashboard.expose_transcripts`. Backend shipped earlier; this adds the front end.
+
+ ### Fixed
+ - **The local web dashboard server now tears down on a single SIGINT/SIGTERM unconditionally, closing a rare lost-wakeup race in its shutdown path.** `cctally dashboard` blocked its main thread on a `threading.Event.wait()` woken only by the SIGINT/SIGTERM handler calling `stop.set()` — and CPython can lose a single signal that races the entry into `Event.wait()` (the Python-level handler never runs, or `set()`'s `notify_all()` fires before the waiter registers), so ~0.04–0.07% of single signals failed to wake the loop and recovery needed a second signal (a timed-poll does *not* fix it: on the miss the flag is never set, so it polls forever). The wait now uses a self-pipe wakeup fd (`signal.set_wakeup_fd` + a `select` on the read end): CPython's C-level signal trampoline writes the signum to the pipe on every delivery — before and independent of the Python-level handler running — so the first signal always unblocks the loop. This was already mitigated in practice (interactive Ctrl-C sends more than one signal, a process manager escalates to SIGKILL, and the #153 harness fix already bounded test-server teardown), so there is no behavior change to the banner, browser-open, or clean-shutdown print paths and nothing to do on upgrade (#154).
+ - **`cctally db recover --db stats` no longer resets a recovered DB's schema `user_version` to 0 when a known migration is recorded only under its legacy unprefixed marker name.** When healing a version-ahead `stats.db`, the all-known-applied check that decides whether to fast-path straight to the known schema head compared each migration's canonical `NNN_`-prefixed name against the recorded markers without normalizing the three pre-framework legacy aliases (`five_hour_block_models_backfill_v1`, `five_hour_block_projects_backfill_v1`, `merge_5h_block_duplicates_v1`) to their canonical names. As a result, a DB whose markers predate the framework rename was misread as having a missing migration and reverted to `user_version = 0`, forcing an unnecessary full migration re-walk on the next open instead of reconciling directly to head. The recover path now normalizes legacy aliases before the membership test (matching the alias-aware read in `db status`), so such a DB reconciles straight to the known head; `cache.db` (which has no legacy markers) is unaffected (#148).
+ - **Internal (test infra, no user-facing change): the golden-file test suite no longer hangs indefinitely when a backgrounded dashboard test server drops a single SIGTERM.** The server-spawning harnesses (`dashboard`, `conversation`, `settings-api`) tore their `cctally dashboard` test servers down with an unbounded `kill "$pid"; wait "$pid"` — and CPython can lose a single SIGTERM that races the server's main-thread `threading.Event.wait()` (woken only by its signal handler's `stop.set()`; ~0.04–0.07% of single signals are dropped and recovery needs a second signal), so on the rare miss the `wait` blocked forever and wedged the whole suite (observed once under #153 as a 30+ minute hang on a non-TTY/piped run; the foreground suite always completed `1395/0`). Teardown now routes through a shared `bin/_lib-kill-server.sh` helper that escalates SIGTERM → bounded grace poll → uncatchable SIGKILL → reap (a wedge emits a non-fatal WARN rather than hanging or failing), guaranteeing teardown regardless of the server's signal-handling state; a new `bin/cctally-kill-server-test` harness pins the behavior (#153).
+
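The self-pipe wakeup described in the #154 entry can be sketched roughly as follows. This is a minimal illustration, not the project's code: `wait_for_signal` and its parameters are hypothetical names, and it assumes a POSIX system with the call made from the main thread (a requirement of `signal.set_wakeup_fd`).

```python
import os
import select
import signal


def wait_for_signal(signums=(signal.SIGINT, signal.SIGTERM), timeout=None):
    """Block until one of `signums` arrives; return its number, or None on timeout.

    Hypothetical helper: the wait is a select() on a pipe that CPython's
    C-level signal handler writes to, so waking up does not depend on the
    Python-level handler having run.
    """
    read_fd, write_fd = os.pipe()
    # set_wakeup_fd() requires a non-blocking descriptor.
    os.set_blocking(write_fd, False)
    os.set_blocking(read_fd, False)
    old_fd = signal.set_wakeup_fd(write_fd)
    # A Python-level handler must be installed for the signal to be routed
    # through CPython's C handler at all; it can be a no-op.
    old_handlers = {s: signal.signal(s, lambda signum, frame: None) for s in signums}
    try:
        ready, _, _ = select.select([read_fd], [], [], timeout)
        if not ready:
            return None
        # Each delivered signal is written to the pipe as one byte (its number).
        return os.read(read_fd, 64)[0]
    finally:
        signal.set_wakeup_fd(old_fd)
        for s, handler in old_handlers.items():
            if handler is not None:
                signal.signal(s, handler)
        os.close(read_fd)
        os.close(write_fd)
```

A server main loop built this way would call something like `wait_for_signal()` where it previously blocked on `Event.wait()`, then run its normal shutdown path once the call returns.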
+ ### Changed
+ - Conversation viewer: subagent (sidechain) threads are now grouped by their originating agent file so parallel subagents render as separate collapsible threads (with task-prompt label, message count, and thread cost) instead of being fused by adjacency; threads nest under a parent message where a real cross-file link exists. Reader items expose a privacy-safe subagent key + parent link (never a raw filesystem path).
+ - **Internal performance (no user-facing change): the conversation-viewer search endpoint now dedups, pages, and counts entirely in SQL instead of materializing every match in Python.** `/api/conversation/search` (and the `_lib_conversation_query` FTS/LIKE kernels behind it) previously ran an unbounded `SELECT` that built a hit object — and, for FTS, a `snippet()` string; for LIKE, the full row `text` — for *every* corpus match, then deduped by `(session_id, uuid)` and sliced one page in Python, so latency and memory scaled with the number of matches rather than the page size. The match set is now deduped via a SQL window function (`ROW_NUMBER() OVER (PARTITION BY session_id, uuid …)`, keeping the same first-occurrence row as before), paged with `LIMIT/OFFSET`, and the exact post-dedup `total` is a separate `COUNT(*)` over `SELECT DISTINCT session_id, uuid`; snippet/text generation is deferred to a second query covering only the page's rowids — so Python never holds more than one page of hits/snippets regardless of corpus match count. The JSON response (`{query, mode, hits, total}`, deduped by `(session_id, uuid)`, cost-once) is byte-identical (the conversation-query unit suite and the `bin/cctally-conversation-test` search goldens are unchanged), so there is nothing to do on upgrade (#149).
+ - **Internal refactor (no user-facing change): the two per-vendor budget-milestone tables are now one vendor-tagged `budget_milestones` table.** The structurally-identical Claude and Codex budget-milestone tables (`budget_milestones` keyed on `week_start_at`, `codex_budget_milestones` keyed on `period_start_at`) are merged by a new stats migration `012_unify_budget_milestones_vendor` into a single `budget_milestones` with a `vendor` column (`'claude'`/`'codex'`), the renamed `period_start_at` key, and `UNIQUE(vendor, period_start_at, period, threshold)` — history, `alerted_at`, and `period` are preserved verbatim and the migration is idempotent / partial-state-safe (the Codex table is dropped). The `budget` and `codex_budget` desktop-alert axes stay two distinct axes but now share the one table (filtered `WHERE vendor=?`), and the parallel insert / dashboard envelope / reconcile-on-set / firing code collapses to a single vendor-parameterized path with the two `maybe_record_*` entry points kept as thin vendor adapters. Also folds in a dashboard fix: the Settings `POST /api/settings` budget-reconcile trigger now fires on a changed `period` (parity with the CLI `budget set --period` path). Alert ids, dashboard envelope bytes, and notification text are unchanged (no frontend bundle change), so there is nothing to do on upgrade — the merge runs automatically on the next DB open (#143).
+ - **The `0700` data-dir hardening now also covers a stats-first cold start.** The owner-only data-dir permission shipped in 1.28.0 was applied when `cache.db` was opened, but a cold start that opened `stats.db` first (e.g. `record-usage` on a fresh machine) materialized the directory at the default umask and left it that way until the next `cache.db` open. The `0700` chmod now lives in the shared `ensure_dirs()` primitive (best-effort, swallowing `OSError`) that every `stats.db` open runs through, with the `cache.db` open keeping its own chmod as a backstop — so the data dir is owner-only regardless of which database is touched first. Posture-only; no action needed on upgrade (#150).
+ - **Internal refactor (no user-facing change): the three local-dashboard conversation GET handlers (`/api/conversations`, `/api/conversation/<id>`, `/api/conversation/search`) now share one `_run_conversation_query` scaffold for the open-cache → run-query → close → 500-envelope lifecycle (previously triplicated), and the single-value query-string string parse routes through a new `_qs_str` helper (the string sibling of the existing `_qs_int`).** Status codes, JSON bodies, the `cache unavailable:` / `<type>: <msg>` 500 envelopes, and the reader's 404 are byte-identical — the conversation-endpoint, conversation-query, and dashboard golden suites are unchanged (a new test also pins the cache-open-failure 500 across all three routes). Purely a maintainability / de-duplication change; nothing to do on upgrade (#151).
+ - **Internal performance (no user-facing change): the cache sync now parses each changed session JSONL file once per sync instead of twice, and a `cache-sync --rebuild` / truncation re-ingest clears the conversation full-text search index without the per-row delete-trigger storm.** Cost rows and conversation message rows are now produced from a single fused pass over each changed file (previously the cost walk and the conversation walk each re-read and re-parsed the same byte range), and the rebuild/truncation full-clear drops the FTS sync triggers, truncates, then resets the index with one `'delete-all'` instead of firing an FTS shadow-write per row inside the held cache lock — on a large index (≈850k rows) the full-clear dropped from ~8.5s to ~0.3s of held-lock time. Output is byte-identical (cost totals, conversation rows, and the search index are unchanged; the reconcile and conversation-ingest suites stay green), so there is nothing to do on upgrade (#138).
+
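The SQL-side dedup, paging, and counting described in the #149 entry might look something like the sketch below. The `messages` table, its columns, and `search_page` are illustrative assumptions (the real kernels live in `_lib_conversation_query` and also cover FTS), and window functions need SQLite 3.25 or newer.

```python
import sqlite3


def search_page(conn, needle, limit, offset):
    """Return (hits, total) for a LIKE search, deduped by (session_id, uuid).

    Hypothetical sketch: dedup, paging, and counting happen in SQL, and row
    text is fetched only for the rowids on the requested page.
    """
    like = f"%{needle}%"
    # Pass 1: keep the first-occurrence row per (session_id, uuid), then page.
    rids = [
        row[0]
        for row in conn.execute(
            """
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY session_id, uuid ORDER BY rowid
                       ) AS rn
                FROM messages
                WHERE text LIKE ?
            )
            WHERE rn = 1
            ORDER BY rid
            LIMIT ? OFFSET ?
            """,
            (like, limit, offset),
        )
    ]
    # Exact post-dedup total, independent of the page size.
    total = conn.execute(
        "SELECT COUNT(*) FROM ("
        " SELECT DISTINCT session_id, uuid FROM messages WHERE text LIKE ?)",
        (like,),
    ).fetchone()[0]
    if not rids:
        return [], total
    # Pass 2: materialize text for this page's rowids only.
    marks = ",".join("?" * len(rids))
    by_rid = {
        row[0]: row[1:]
        for row in conn.execute(
            f"SELECT rowid, session_id, uuid, text FROM messages"
            f" WHERE rowid IN ({marks})",
            rids,
        )
    }
    return [by_rid[rid] for rid in rids], total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (session_id TEXT, uuid TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        ("s1", "u1", "alpha budget"),
        ("s1", "u1", "alpha budget"),  # duplicate of the first row
        ("s1", "u2", "budget beta"),
        ("s2", "u3", "no match"),
        ("s2", "u4", "budget gamma"),
    ],
)
page1, total = search_page(conn, "budget", limit=2, offset=0)
page2, _ = search_page(conn, "budget", limit=2, offset=2)
```

With this shape, Python holds at most `limit` rows of text at a time, which is the property the changelog entry describes.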
+ ## [1.28.0] - 2026-06-06
+
+ ### Added
+ - **`cctally budget` now supports per-vendor budgets over configurable calendar periods.** The Claude budget can run over a `calendar-week` or `calendar-month` instead of the default subscription week (`cctally budget set 300 --period calendar-month`, or `cctally config set budget.period calendar-month`), and a separate **Codex (OpenAI) budget** tracks Codex's *actual API dollars* over a calendar week or month (`cctally budget set 200 --vendor codex --period calendar-month`). The two budgets are independent — configure either, both, or neither — and there is no combined cross-vendor cap (Claude is equivalent-$, Codex is actual-$, so they are never summed). The status report renders a labeled block per configured vendor (Claude first, then Codex) with a cost-basis parenthetical (`— equivalent-$` / `— actual API $`); the legacy single-vendor subscription-week output stays byte-identical. `cctally budget set/unset` gain `--vendor {claude,codex}` and `--period {subscription-week,calendar-week,calendar-month}` (short spellings `sub-week`/`week`/`month` normalize); `--json` gains an always-present `period` key and, when a Codex budget is configured, an additive `codex` sibling object (both additive — no schema-version bump). Calendar and Codex budgets no longer depend on weekly usage snapshots, so a fresh machine with a configured Codex budget renders `$0`/`0%` rather than "no usage data yet this week". Codex spend reconciles to the `codex-*` reports within 1e-9 USD.
+ - **A new `codex_budget` desktop-alert axis fires once per threshold as Codex actual spend crosses that percent of the Codex budget** (opt-in via `budget.codex.alerts_enabled`, default off), with the same forward-only / fire-once / reconcile-on-set latching as the Claude budget axis and re-arming each calendar period. Because Codex usage never flows through Claude's `record-usage`, the axis fires both from every Claude hook-tick and opportunistically whenever you run `cctally budget` (so a pure-Codex user still gets a push on their next `cctally` invocation). The Claude `budget` axis is also period-generalized so calendar-period Claude alerts fire correctly. In the local web dashboard, fired Codex alerts appear in the Recent-alerts panel/modal and as a toast with a distinct **CODEX** chip and a period-aware label ("Month of …" / "Calendar week of …" instead of always "Week"); the same period-aware label fix applies to calendar-period Claude budget alerts. Preview the axis end-to-end with `cctally alerts test --axis codex-budget`.
+ - **Projected-pace budget alerts now cover calendar-period Claude budgets and Codex budgets.** Previously the `projected` alert axis was subscription-week + Claude-only; it now fires an on-pace-to-exceed alert for any Claude period (`calendar-week` / `calendar-month`, opt-in via `budget.projected_enabled`) and for Codex budgets (opt-in via `budget.codex.projected_enabled`, which — like the Claude toggle — requires `budget.codex.alerts_enabled` to also be on). Codex projected crossings fire from `record-usage` and opportunistically whenever you run `cctally budget`, and re-arm each period; the fired projection reconciles to `cctally budget --json` `week_avg_projection_usd` within 1e-9 USD. Preview either with `cctally alerts test --axis projected --metric {budget_usd,codex_budget_usd}`.
+ - **The local web dashboard can now toggle the two Codex budget alert switches from Settings.** The Settings overlay (key `s`) gains "Codex budget alerts" (`budget.codex.alerts_enabled`) and "Codex projected-pace alerts" (`budget.codex.projected_enabled`); both write through a nested partial-merge so flipping a toggle never clobbers the Codex amount, period, or thresholds (those stay CLI-only). When no Codex budget is configured the toggles render disabled with a one-line hint pointing at `cctally budget set 200 --vendor codex`. Codex projected-pace crossings render on the dashboard with the **PROJECTED** chip and a vendor-tagged context line ("projected $230 of $200 · Codex").
+ - These two additions resolve the deferred follow-ups noted in the prior calendar-period + Codex budgets work (issues #134 and #135).
+ - **The local web dashboard can optionally serve read-only Claude/Codex conversation transcripts through three new JSON endpoints** (`/api/conversations`, `/api/conversation/<id>`, `/api/conversation/search`), behind a new opt-in `dashboard.expose_transcripts` config key (default off). Transcripts are double-gated — never served unless you have explicitly opted in AND the request Host is loopback-allowed — so a LAN-exposed dashboard (`--host 0.0.0.0`) never leaks conversation text by default. This release ships the endpoints and access gate only; there is no transcript-viewer UI yet.
+ - **`cctally doctor` gains a `db.version_ahead` check, and `cctally db recover` can now self-heal an ahead `cache.db`.** The check warns when a local DB's schema `user_version` has drifted ahead of the running binary (the "unreleased-head poisoning" hazard of running a newer checkout against your data dir and then downgrading); `cctally db recover` rebuilds an ahead `cache.db` losslessly from source JSONL (the cache is fully re-derivable) instead of leaving the binary stuck on `DowngradeDetected` (#145).
+
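The double gate described for the transcript endpoints (explicit opt-in AND a loopback request Host) could be expressed along these lines. This is only a sketch under assumptions: `may_serve_transcripts`, `is_loopback_host`, and the nested config dict shape are hypothetical, and the real allow-list logic may differ.

```python
import ipaddress


def is_loopback_host(host_header):
    """Return True if a Host header value names a loopback address (hypothetical helper)."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal, e.g. "[::1]:8789".
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # "name:port" or "a.b.c.d:port"; a bare IPv6 literal has more colons.
        host = host.rsplit(":", 1)[0]
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def may_serve_transcripts(config, host_header):
    """Both gates must pass: the opt-in key is set and the Host is loopback."""
    opted_in = bool(config.get("dashboard", {}).get("expose_transcripts", False))
    return opted_in and is_loopback_host(host_header)
```

Under this sketch, a request with `Host: 192.168.1.20:8789` is refused even when `dashboard.expose_transcripts` is on, and a loopback request is refused when the key is absent.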
+ ### Changed
+ - **`cache.db` and its lock/WAL sidecars are now created with owner-only permissions (files `0600`, the data dir `0700`).** Conversation transcripts can flow through the cache, so this keeps that data from being world-readable on shared machines.
+
+ ### Fixed
+ - **The weekly trend (`cctally report` / `dollar-per-percent` / `weekly`) no longer splits a past week into a spurious zero-width row from a single transient `0%` reading.** The historical reset-event backfill was applying the lenient "reset-to-zero" discriminator (a sub-25pp drop to ~0%, intended for *live* current-week detection where a debounce filters transient API zeros) to its one-shot scan over all past snapshots — which has no debounce. A single stale-replica `0%` blip mid-week (e.g. usage climbing `6% → 0% → 1%` on the same still-future week boundary) was therefore mis-read as a goodwill credit and segmented that historical week into a degenerate `09:00 → 09:00` zero-width row with duplicated/misattributed percentages and cost. The backfill now fires only on the unambiguous `≥25pp` drop; the reset-to-zero signal remains active for live detection (so a real surprise reset on the current week is still caught). On upgrade, any week already mis-split this way renders correctly again on the next read (the spurious event stops regenerating).
+ - Refuse to forward-migrate the prod data dir (`~/.local/share/cctally`) when running from a git checkout, preventing a dev/worktree binary from bricking the installed release with `DowngradeDetected`; override with `CCTALLY_ALLOW_PROD_MIGRATION=1` (#142).
+ - **`cctally db recover --db stats` now refuses to recover the production stats DB when run from a dev/git checkout** (exit 2, the DB left untouched), matching the prod-migration guard, so a worktree binary can't rewrite the installed instance's stats history (#146).
+
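The weekly-trend fix above comes down to which discriminators are allowed to fire in which scan. A simplified sketch, assuming percent readings in chronological order: `find_reset_indices`, `zero_eps`, and the `live` flag are hypothetical, and the debounce that the real live path applies is omitted.

```python
def find_reset_indices(percents, live=False, min_drop_pp=25.0, zero_eps=1.0):
    """Return indices where a usage reset is inferred between consecutive readings.

    Hypothetical sketch: the unambiguous drop of at least `min_drop_pp`
    percentage points always counts; the lenient "reset-to-zero" signal
    (a smaller drop landing at roughly 0%) counts only in live mode.
    """
    resets = []
    for i in range(1, len(percents)):
        drop = percents[i - 1] - percents[i]
        if drop >= min_drop_pp:
            resets.append(i)
        elif live and drop > 0 and percents[i] <= zero_eps:
            resets.append(i)
    return resets
```

For the `6% → 0% → 1%` blip from the changelog entry, the historical scan (`live=False`) reports no reset, while the live scan still flags the drop to zero.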
  ## [1.27.1] - 2026-06-04
 
  ### Fixed
@@ -71,12 +71,14 @@ _alert_text_weekly = _lib_alerts_payload._alert_text_weekly
  _alert_text_five_hour = _lib_alerts_payload._alert_text_five_hour
  _alert_text_budget = _lib_alerts_payload._alert_text_budget
  _alert_text_project_budget = _lib_alerts_payload._alert_text_project_budget
+ _alert_text_codex_budget = _lib_alerts_payload._alert_text_codex_budget
  _alert_text_projected = _lib_alerts_payload._alert_text_projected
  _escape_applescript_string = _lib_alerts_payload._escape_applescript_string
  _build_alert_payload_weekly = _lib_alerts_payload._build_alert_payload_weekly
  _build_alert_payload_five_hour = _lib_alerts_payload._build_alert_payload_five_hour
  _build_alert_payload_budget = _lib_alerts_payload._build_alert_payload_budget
  _build_alert_payload_project_budget = _lib_alerts_payload._build_alert_payload_project_budget
+ _build_alert_payload_codex_budget = _lib_alerts_payload._build_alert_payload_codex_budget
  _build_alert_payload_projected = _lib_alerts_payload._build_alert_payload_projected
 
  # Phase B: severity policy + the cross-platform dispatch kernel. The kernel is
@@ -175,6 +177,8 @@ def _dispatch_alert_notification(
          title, subtitle, body = _alert_text_budget(payload, tz)
      elif axis == "project_budget":
          title, subtitle, body = _alert_text_project_budget(payload, tz)
+     elif axis == "codex_budget":
+         title, subtitle, body = _alert_text_codex_budget(payload, tz)
      elif axis == "projected":
          title, subtitle, body = _alert_text_projected(payload, tz)
      else:
@@ -249,6 +253,7 @@ def _dispatch_alert_notification(
              ctx.get("week_start_date")
              or ctx.get("five_hour_window_key")
              or ctx.get("week_start_at")
+             or ctx.get("period_start_at")
              or ""
          )
          line = (
@@ -285,6 +290,8 @@ def cmd_alerts_test(args: argparse.Namespace) -> int:
          axis = "budget"
      elif args.axis == "project-budget":
          axis = "project_budget"
+     elif args.axis == "codex-budget":
+         axis = "codex_budget"
      elif args.axis == "projected":
          axis = "projected"
      else:
@@ -335,17 +342,35 @@ def cmd_alerts_test(args: argparse.Namespace) -> int:
              spent_usd=26.0,
              consumption_pct=104.0,
          )
+     elif axis == "codex_budget":
+         # Synthetic Codex budget payload — NO DB writes (test/real divergence
+         # contract), NO real budget.codex entry required. A $200 calendar-month
+         # budget reads plausibly; spent scaled to the threshold so the body line
+         # reads as the at-crossing snapshot the dashboard would render.
+         payload = _build_alert_payload_codex_budget(
+             threshold=threshold,
+             crossed_at_utc=now_utc_iso(),
+             period_start_at=dt.date.today().replace(day=1).isoformat(),
+             period="calendar-month",
+             budget_usd=200.0,
+             spent_usd=200.0 * threshold / 100.0,
+             consumption_pct=float(threshold),
+         )
      elif axis == "projected":
          # Synthetic projected-pace payload — NO DB writes (test/real divergence
          # contract). The metric discriminator picks the wiring; projected_value
          # is the threshold's denominator-relative value (so the body reads
          # plausibly, e.g. weekly 100% → "~100% of cap", budget 100% → "$300 of
          # $300"). denominator is the at-crossing target the row would carry
-         # (Codex P0-4): 100.0 for weekly_pct, $300 for budget_usd.
+         # (Codex P0-4): 100.0 for weekly_pct, $300 for budget_usd, $200 for
+         # codex_budget_usd (matching the codex_budget axis test-alert budget).
          metric = getattr(args, "metric", "weekly_pct")
          if metric == "budget_usd":
              denominator = 300.0
              projected_value = 300.0 * threshold / 100.0
+         elif metric == "codex_budget_usd":
+             denominator = 200.0
+             projected_value = 200.0 * threshold / 100.0
          else:  # weekly_pct
              denominator = 100.0
              projected_value = float(threshold)
@@ -167,10 +167,99 @@ _iter_codex_jsonl_entries_with_offsets = _lib_jsonl._iter_codex_jsonl_entries_wi
  _parse_usage_entries = _lib_jsonl._parse_usage_entries
  _should_replace = _lib_jsonl._should_replace
 
+ # Conversation-message parser kernel (Plan 1). Pure leaf (stdlib-only), so
+ # it loads at module-load time alongside _lib_jsonl. Since #138 the per-file
+ # sync ingest goes through the fused ``_iter_sync_entries`` walker (which calls
+ # ``_lib_conversation.parse_message_row`` directly); ``_iter_message_rows`` is
+ # now used only by ``backfill_conversation_messages``.
+ _lib_conversation = _load_lib("_lib_conversation")
+ _iter_message_rows = _lib_conversation.iter_message_rows
+
+ # Shared by the fused per-file walk AND backfill_conversation_messages so the
+ # column list, placeholders, and tuple order live in ONE place — a column
+ # add/reorder can't silently desync the two ingest paths (which would land
+ # values in the wrong columns on whichever path was missed).
+ _CONV_INSERT_SQL = (
+     "INSERT OR IGNORE INTO conversation_messages"
+     "(session_id,uuid,parent_uuid,source_path,byte_offset,"
+     " timestamp_utc,entry_type,text,blocks_json,model,msg_id,"
+     " req_id,cwd,git_branch,is_sidechain)"
+     " VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
+ )
+
+
+ def _conv_row_tuple(m, path_str):
+     """Flatten a ``MessageRow`` into the ``_CONV_INSERT_SQL`` column order."""
+     return (
+         m.session_id, m.uuid, m.parent_uuid, path_str, m.byte_offset,
+         m.timestamp_utc, m.entry_type, m.text, m.blocks_json, m.model,
+         m.msg_id, m.req_id, m.cwd, m.git_branch, m.is_sidechain,
+     )
+
+
+ def _iter_sync_entries(fh, path_str):
+     """Fused single-pass sync walker (#138). Yields
+     ``(byte_offset, cost_or_None, msgrow_or_None)`` for each JSONL line from
+     ``fh``'s current position that produces a cost entry and/or a conversation
+     message row.
+
+     Each line is read once (readline()+tell()) and ``json.loads``-parsed ONCE,
+     then classified by both pure per-line parsers:
+
+     * ``cost_or_None`` is ``(UsageEntry, msg_id, req_id)`` when the line is a
+       billable assistant entry (``_lib_jsonl.parse_cost_entry``), else None.
+     * ``msgrow_or_None`` is a ``MessageRow`` when the line is a user/assistant
+       turn carrying a uuid (``_lib_conversation.parse_message_row``), else None.
+
+     The two are independent — a normal assistant line yields both. This replaces
+     the former cost walk + re-seek-and-walk over the identical byte span: with a
+     single walk the "identical span" invariant is structural (one stop point),
+     not a prose-enforced ``mrow.byte_offset >= final_offset`` runtime break. A
+     partial mid-write tail line (no trailing newline) rewinds the handle and
+     stops, so ``fh.tell()`` after the loop is the cost cursor's ``final_offset``
+     and the next sync re-reads the line once the newline lands.
+     """
+     while True:
+         offset = fh.tell()
+         line = fh.readline()
+         if not line:
+             return
+         if not line.endswith("\n"):
+             # Partial tail line — writer is mid-flight. Rewind so the next sync
+             # re-reads this line once the newline is in place (and so fh.tell()
+             # reports the cost cursor's stop, never past the partial).
+             fh.seek(offset)
+             return
+         stripped = line.strip()
+         if not stripped:
+             continue
+         try:
+             obj = json.loads(stripped)
+         except json.JSONDecodeError:
+             continue
+         cost = _lib_jsonl.parse_cost_entry(obj, path_str)
+         mrow = _lib_conversation.parse_message_row(obj, offset)
+         if cost is not None or mrow is not None:
+             yield offset, cost, mrow
+
+
+ def _iter_claude_jsonl_files():
+     """Yield every Claude transcript ``*.jsonl`` under each data dir's
+     ``projects/`` tree. Shared by ``sync_cache`` and the conversation backfill
+     so both ingest paths enumerate the IDENTICAL file set."""
+     for claude_dir in _get_claude_data_dirs():
+         for jp in (claude_dir / "projects").glob("**/*.jsonl"):
+             if jp.is_file():
+                 yield jp
+
  _cctally_db_sib = _load_lib("_cctally_db")
  add_column_if_missing = _cctally_db_sib.add_column_if_missing
  _run_pending_migrations = _cctally_db_sib._run_pending_migrations
  _CACHE_MIGRATIONS = _cctally_db_sib._CACHE_MIGRATIONS
+ # Storm-free conversation_messages + FTS full-clear (#138). Owns the trigger
+ # drop/recreate dance so the per-row delete trigger never fires O(rows) under
+ # the held lock on a rebuild / truncation escalation.
+ clear_conversation_messages = _cctally_db_sib.clear_conversation_messages
 
 
  # === BEGIN MOVED REGIONS ===
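The partial-tail rewind in the `_iter_sync_entries` hunk above is easy to get subtly wrong, so here is a stripped-down, standalone illustration of just that cursor behavior. It is a sketch, not the project's walker: `iter_complete_json_lines` is a hypothetical name, it works on bytes rather than a text-mode handle, and it omits the cost and message classification.

```python
import io
import json


def iter_complete_json_lines(fh):
    """Yield (byte_offset, obj) for each complete JSON line in a binary handle.

    A trailing line without a newline is treated as mid-write: the handle is
    rewound to the start of that line so fh.tell() never points past it.
    """
    while True:
        offset = fh.tell()
        line = fh.readline()
        if not line:
            return
        if not line.endswith(b"\n"):
            fh.seek(offset)
            return
        stripped = line.strip()
        if not stripped:
            continue
        try:
            obj = json.loads(stripped)
        except json.JSONDecodeError:
            continue
        yield offset, obj


# Two good lines, a blank line, a junk line, and a half-written tail.
buf = io.BytesIO(b'{"a": 1}\n\nnot json\n{"b": 2}\n{"c": ')
entries = list(iter_complete_json_lines(buf))
cursor = buf.tell()  # position a sync would persist as its resume offset
```

Here `entries` holds the two complete objects with their starting byte offsets, and `cursor` stops at the start of the half-written line, so a later pass would pick that line up once its newline has been written.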
@@ -502,20 +591,63 @@ def sync_cache(
              # empty baseline.
              conn.execute("DELETE FROM session_entries")
              conn.execute("DELETE FROM session_files")
+             # Plan 1: conversation_messages shares the cost path's lifecycle.
+             # A rebuild re-derives the whole cache from on-disk JSONL, so the
+             # message index is wiped here (inside the held lock) and the
+             # per-file fused walk repopulates it. clear_conversation_messages
+             # drops the FTS triggers, truncates, and clears the index via
+             # 'delete-all' so the per-row delete trigger never storms O(rows)
+             # under the lock (#138) — NOT a bare DELETE that fires conv_fts_ad
+             # per row.
+             clear_conversation_messages(conn)
              # Clear the walk-complete sentinel atomically with the wipe
              # (cctally-dev#93, D5/D2): a stale "complete" marker must never
              # survive a destructive rebuild. The end-of-loop write below
              # re-establishes it only after this rebuild's clean walk.
              conn.execute("DELETE FROM cache_meta WHERE key='claude_ingest_walk_complete'")
+             # Issue #139: a rebuild walks every file from offset 0, so the
+             # per-file fused walk below repopulates the whole message
+             # index — that satisfies any deferred existing-install backfill.
+             # Drop the pending flag here so the post-rebuild sync does not also
+             # run a redundant (idempotent but wasteful) offset-0 backfill pass.
+             conn.execute(
+                 "DELETE FROM cache_meta WHERE key='conversation_backfill_pending'")
              conn.commit()
              eprint("[cache-sync] rebuild: cleared Claude cached entries")
 
-     claude_dirs = _get_claude_data_dirs()
-     paths: list[pathlib.Path] = []
-     for claude_dir in claude_dirs:
-         for jp in (claude_dir / "projects").glob("**/*.jsonl"):
-             if jp.is_file():
-                 paths.append(jp)
+     # Issue #139: consume the deferred conversation_messages backfill. On an
+     # existing-install upgrade, cache migration 002 sets
+     # ``conversation_backfill_pending`` instead of walking the whole JSONL
+     # history inline (which stalled the triggering command — even a
+     # stats-only ``cctally report`` that fires the cache dispatcher but never
+     # reads cache.db). sync_cache is the natural owner: it already holds the
+     # flock + owns the walker, so a cache-consuming command or the
+     # background hook-tick absorbs the one-time offset-0 walk. The backfill
+     # touches ONLY conversation_messages (never the session_files cost
+     # cursor), is idempotent on (source_path, byte_offset), and commits
+     # per-file — so a crash leaves the flag set and the next sync re-runs
+     # cleanly. It writes + commits, so it must land here, BEFORE the
+     # zero-write-lock read+parse region below (and never on the rebuild
+     # path, which already cleared the flag and repopulates via the normal
+     # walk). A path-less/:memory: conn has no cache_meta only if the schema
+     # was never applied; the try/except tolerates that.
+     if not rebuild:
+         try:
+             _pending = conn.execute(
+                 "SELECT 1 FROM cache_meta "
+                 "WHERE key='conversation_backfill_pending'"
+             ).fetchone() is not None
+         except sqlite3.OperationalError:
+             _pending = False
+         if _pending:
+             backfill_conversation_messages(conn)
+             conn.execute(
+                 "DELETE FROM cache_meta "
+                 "WHERE key='conversation_backfill_pending'"
+             )
+             conn.commit()
+
+     paths: list[pathlib.Path] = list(_iter_claude_jsonl_files())
      stats.files_total = len(paths)
 
      # This SELECT does NOT open an implicit transaction (Python's
@@ -614,6 +746,13 @@ def sync_cache(
                  f"dedup)"
              )
              conn.execute("DELETE FROM session_entries")
+             # Plan 1: truncation escalates to a full re-ingest of EVERY file,
+             # so conversation_messages is wiped here (parallel to the
+             # session_entries full-reset) and the per-file fused walk
+             # repopulates it from offset 0. Storm-free clear (#138): drop FTS
+             # triggers → truncate → 'delete-all' → recreate, so conv_fts_ad
+             # never fires O(rows) inside the held lock.
+             clear_conversation_messages(conn)
              # Clear the walk-complete sentinel atomically with the truncation
              # full-reset (cctally-dev#93, D5/D2): the cache is being wiped, so
              # any "complete" marker is now stale. The end-of-loop write below
@@ -684,35 +823,54 @@ def sync_cache(
684
823
  # Read + parse is a pure read; do it OUTSIDE the write transaction
685
824
  # so a slow JSONL doesn't hold a SQLite lock.
686
825
  rows: list[tuple[Any, ...]] = []
826
+ conv_rows: list[tuple[Any, ...]] = []
687
827
  final_offset = start_offset
688
828
  try:
689
829
  with open(jp, "r", encoding="utf-8", errors="replace") as fh:
690
830
  fh.seek(start_offset)
691
- for offset, entry, msg_id, req_id in _iter_jsonl_entries_with_offsets(fh, str(jp)):
692
- usage = entry.usage
693
- inp = int(usage.get("input_tokens", 0) or 0)
694
- out = int(usage.get("output_tokens", 0) or 0)
695
- cc = int(usage.get("cache_creation_input_tokens", 0) or 0)
696
- cr = int(usage.get("cache_read_input_tokens", 0) or 0)
697
- extras = {
698
- k: v for k, v in usage.items()
699
- if k not in (
700
- "input_tokens", "output_tokens",
701
- "cache_creation_input_tokens",
702
- "cache_read_input_tokens",
703
- )
704
- }
705
- rows.append((
706
- path_str,
707
- offset,
708
- entry.timestamp.astimezone(dt.timezone.utc).isoformat(),
709
- entry.model,
710
- msg_id,
711
- req_id,
712
- inp, out, cc, cr,
713
- json.dumps(extras, sort_keys=True) if extras else None,
714
- entry.cost_usd,
715
- ))
831
+ # Fused single-pass walk (#138): cost rows AND conversation
832
+ # message rows come from ONE parse of each line. An assistant
833
+ # line yields both; a user line yields only a message row.
834
+ # This replaces the former cost walk + re-seek conversation
835
+ # walk over the identical span — the "identical span"
836
+ # invariant is now structural (a single stop point) rather
837
+ # than a prose-enforced ``>= final_offset`` runtime break.
838
+ for offset, cost, mrow in _iter_sync_entries(fh, path_str):
839
+ if cost is not None:
840
+ entry, msg_id, req_id = cost
841
+ usage = entry.usage
842
+ inp = int(usage.get("input_tokens", 0) or 0)
843
+ out = int(usage.get("output_tokens", 0) or 0)
844
+ cc = int(usage.get("cache_creation_input_tokens", 0) or 0)
845
+ cr = int(usage.get("cache_read_input_tokens", 0) or 0)
846
+ extras = {
847
+ k: v for k, v in usage.items()
848
+ if k not in (
849
+ "input_tokens", "output_tokens",
850
+ "cache_creation_input_tokens",
851
+ "cache_read_input_tokens",
852
+ )
853
+ }
854
+ rows.append((
855
+ path_str,
856
+ offset,
857
+ entry.timestamp.astimezone(dt.timezone.utc).isoformat(),
858
+ entry.model,
859
+ msg_id,
860
+ req_id,
861
+ inp, out, cc, cr,
862
+ json.dumps(extras, sort_keys=True) if extras else None,
863
+ entry.cost_usd,
864
+ ))
865
+ if mrow is not None:
866
+ conv_rows.append(_conv_row_tuple(mrow, path_str))
867
+ # ``final_offset`` is the single walk's stop — captured AFTER
868
+ # the loop drains (or rewinds a partial mid-write tail line).
869
+ # It is what session_files.last_byte_offset is written from,
870
+ # so it must reflect the cost cursor's position; with the
871
+ # fused walk there is exactly one stop point shared by the
872
+ # cost and conversation rows (#138 / #Plan1 Task 4
873
+ # cursor-consistency invariant).
716
874
  final_offset = fh.tell()
717
875
  except OSError as exc:
718
876
  eprint(f"[cache] could not read {jp}: {exc}")
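Reviewer note on the fused walk in this hunk: a minimal sketch of the single-pass idea, in which each JSONL line is parsed once and can feed both a cost row and a message row, with one shared stop offset. The function `fused_walk`, the `type`/`usage` keys, and the row shapes are assumptions made for illustration; they are not the package's `_iter_sync_entries` or its real row layout.

```python
import io
import json


def fused_walk(fh, start_offset=0):
    """Parse each JSONL line once; return (cost_rows, conv_rows, final_offset).

    Hypothetical stand-in for a fused cost + conversation walk.
    """
    fh.seek(start_offset)
    offset = start_offset
    cost_rows, conv_rows = [], []
    for raw in fh:
        if not raw.endswith(b"\n"):
            # Partial tail line still being written: stop before it so the
            # next sync re-reads it from the same offset.
            break
        line_offset, offset = offset, offset + len(raw)
        try:
            obj = json.loads(raw)
        except ValueError:
            continue  # complete but malformed line: skip it, cursor advances
        if not isinstance(obj, dict):
            continue
        kind = obj.get("type")
        if kind in ("user", "assistant"):
            conv_rows.append((line_offset, kind))
        if kind == "assistant" and isinstance(obj.get("usage"), dict):
            cost_rows.append((line_offset, obj["usage"].get("output_tokens", 0)))
    return cost_rows, conv_rows, offset


first = b'{"type": "user", "text": "hi"}\n'
second = b'{"type": "assistant", "usage": {"output_tokens": 7}}\n'
partial = b'{"type": "assist'  # mid-write tail with no newline yet

cost_rows, conv_rows, final_offset = fused_walk(io.BytesIO(first + second + partial))
```

Because both row lists are filled inside the same loop, there is only one place where the walk can stop, so the two cursors cannot drift apart.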
@@ -793,6 +951,18 @@ def sync_cache(
793
951
  rows,
794
952
  )
795
953
  stats.rows_changed += conn.total_changes - before
954
+ # Conversation message ingest (Plan 1). Lands in the SAME
955
+ # per-file write transaction as session_entries so the cost
956
+ # rows and message rows for a file commit atomically.
957
+ # INSERT OR IGNORE on UNIQUE(source_path, byte_offset): a
958
+ # resume-replayed line re-walked from a delta offset that
959
+ # already landed is a silent no-op, and the same physical line
960
+ # in two files (resume across JSONL) keeps BOTH rows. No
961
+ # per-file DELETE here — the only conversation_messages resets
962
+ # are the rebuild + truncation-escalation full-clears above
963
+ # (parallel to the cost path's lifecycle).
964
+ if conv_rows:
965
+ conn.executemany(_CONV_INSERT_SQL, conv_rows)
796
966
  # UPSERT preserves session_id / project_path columns populated
797
967
  # by _ensure_session_files_row at the top of this loop. A plain
798
968
  # INSERT OR REPLACE would wipe them on every changed-file sync.
@@ -839,6 +1009,12 @@ def sync_cache(
839
1009
  (dt.datetime.now(dt.timezone.utc).isoformat(),),
840
1010
  )
841
1011
  conn.commit()
1012
+ # At-rest hardening (Plan 2, spec §5). Runs here — at the end of the
1013
+ # write transaction, while the cache.db.lock flock is still held (so a
1014
+ # concurrent writer can't be mid-checkpoint) AND after at least one
1015
+ # write has materialized the -wal/-shm sidecars. open_cache_db hardens
1016
+ # cache.db + the data dir; this finishes the job for the sidecars.
1017
+ _harden_cache_sidecars()
842
1018
  return stats
843
1019
  finally:
844
1020
  try:
@@ -848,6 +1024,56 @@ def sync_cache(
848
1024
  lock_fh.close()
849
1025
 
850
1026
 
1027
+ def backfill_conversation_messages(conn: sqlite3.Connection) -> int:
1028
+ """One-time backfill of ``conversation_messages`` for existing installs
1029
+ (Plan 1 Task 5). Walks EVERY Claude JSONL from offset 0 and inserts one
1030
+ row per user/assistant line via ``_lib_conversation.iter_message_rows``.
1031
+
1032
+ Properties:
1033
+ * Per-file commits — a short write transaction per JSONL file, never one
1034
+ long transaction over the whole (potentially ~1M-line) history. The
1035
+ backfill of a huge history can't hold the cache.db write lock for
1036
+ minutes.
1037
+ * Idempotent — ``INSERT OR IGNORE`` on ``UNIQUE(source_path,
1038
+ byte_offset)``. A row already present (from a prior partial run or from
1039
+ the live ``sync_cache`` ingest) is silently skipped.
1040
+ * Crash-resumable — because each file commits independently and the
1041
+ INSERT is idempotent, a re-run after a crash re-walks every file but
1042
+ only the not-yet-committed rows actually land.
1043
+ * Cursor-safe — touches ONLY ``conversation_messages``. It never reads or
1044
+ writes ``session_files`` / ``session_entries``, so the cost delta
1045
+ cursor is untouched: a later ``sync_cache`` still resumes the cost walk
1046
+ from exactly where it left off.
1047
+
1048
+ Returns the number of rows inserted. Since issue #139 the caller is
1049
+ ``sync_cache`` itself (consuming the ``conversation_backfill_pending`` flag),
1050
+ which already holds the ``cache.db.lock`` flock for the duration — the same
1051
+ serialization cache migration 001 relies on. The 002 migration handler no
1052
+ longer walks inline; it only flags the work as pending.
1053
+ """
1054
+ inserted = 0
1055
+ for jp in _iter_claude_jsonl_files():
1056
+ path_str = str(jp)
1057
+ rows: list[tuple[Any, ...]] = []
1058
+ try:
1059
+ with open(jp, "r", encoding="utf-8", errors="replace") as fh:
1060
+ for m in _iter_message_rows(fh, path_str):
1061
+ rows.append(_conv_row_tuple(m, path_str))
1062
+ except OSError as exc:
1063
+ eprint(f"[conversation-backfill] could not read {jp}: {exc}")
1064
+ continue
1065
+ if rows:
1066
+ # cursor.rowcount after an executemany INSERT OR IGNORE is the
1067
+ # number of rows actually inserted (conflicts excluded), and —
1068
+ # unlike conn.total_changes — it is NOT inflated by the FTS
1069
+ # AFTER INSERT trigger's shadow-table writes.
1070
+ cur = conn.executemany(_CONV_INSERT_SQL, rows)
1071
+ conn.commit() # per-file commit — no long write txn
1072
+ if cur.rowcount and cur.rowcount > 0:
1073
+ inserted += cur.rowcount
1074
+ return inserted
1075
+
1076
+
851
1077
  def iter_entries(
852
1078
  conn: sqlite3.Connection,
853
1079
  range_start: dt.datetime,
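Reviewer note on the idempotency claim in `backfill_conversation_messages`: a small sketch, using only the standard `sqlite3` module, of how `INSERT OR IGNORE` against a `UNIQUE(source_path, byte_offset)` constraint makes a replayed ingest a no-op while `cursor.rowcount` reports only the rows that actually landed. The `messages` table, its columns, and the `ingest` helper are hypothetical; the real schema and `_CONV_INSERT_SQL` are not shown in this diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " source_path TEXT NOT NULL,"
    " byte_offset INTEGER NOT NULL,"
    " body TEXT,"
    " UNIQUE(source_path, byte_offset))"
)

# Hypothetical insert statement; conflicts on the UNIQUE key are skipped.
INSERT_SQL = (
    "INSERT OR IGNORE INTO messages (source_path, byte_offset, body) "
    "VALUES (?, ?, ?)"
)


def ingest(rows):
    """Insert one file's rows in a short transaction; return rows inserted."""
    cur = conn.executemany(INSERT_SQL, rows)
    conn.commit()
    return cur.rowcount


first_run = ingest([("a.jsonl", 0, "hi"), ("a.jsonl", 31, "hello")])
# Replay the same file after a simulated crash, plus one line not yet stored.
replay = ingest([("a.jsonl", 0, "hi"), ("a.jsonl", 31, "hello"), ("a.jsonl", 90, "new")])
# The same offset in a different file is a different key, so it is kept.
other_file = ingest([("b.jsonl", 0, "hi")])
total_rows = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

Committing per call mirrors the per-file commit described in the docstring: a re-run repeats the reads but only the missing rows are written.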
@@ -1561,17 +1787,27 @@ def _collect_codex_entries_direct(
1561
1787
  def get_codex_entries(
1562
1788
  range_start: dt.datetime,
1563
1789
  range_end: dt.datetime,
1790
+ *,
1791
+ skip_sync: bool = False,
1564
1792
  ) -> list[CodexEntry]:
1565
1793
  """Cache-first Codex entry fetch with transparent fallback.
1566
1794
 
1567
1795
  Every Codex-reading command must use this rather than touching
1568
1796
  open_cache_db directly.
1797
+
1798
+ ``skip_sync=True`` bypasses the ``sync_codex_cache`` ingest pass and serves
1799
+ whatever is already cached — for a second read in the same process whose
1800
+ range is a SUBSET of a range already fetched (the cache is already warm), so
1801
+ a redundant full JSONL walk is wasted work (mirrors ``get_entries``'
1802
+ ``skip_sync``).
1569
1803
  """
1570
1804
  try:
1571
1805
  conn = open_cache_db()
1572
1806
  except (sqlite3.DatabaseError, OSError) as exc:
1573
1807
  eprint(f"[cache] unavailable ({exc}); falling back to direct JSONL parse")
1574
1808
  return _collect_codex_entries_direct(range_start, range_end)
1809
+ if skip_sync:
1810
+ return iter_codex_entries(conn, range_start, range_end)
1575
1811
  stats = sync_codex_cache(conn)
1576
1812
  if stats.lock_contended:
1577
1813
  # Sync commits file-by-file, so contention on the ingest lock
@@ -1590,6 +1826,60 @@ def get_codex_entries(
1590
1826
  return iter_codex_entries(conn, range_start, range_end)
1591
1827
 
1592
1828
 
1829
+ def _sum_codex_cost_for_range(
1830
+ start: dt.datetime,
1831
+ end: dt.datetime,
1832
+ *,
1833
+ speed: str = "auto",
1834
+ skip_sync: bool = False,
1835
+ ) -> float:
1836
+ """Sum USD Codex cost of all `codex_session_entries` in ``[start, end)``.
1837
+
1838
+ The Codex analog of Claude's ``_sum_cost_for_range`` (bin/cctally), used by
1839
+ `cctally budget`'s Codex-vendor path (calendar-period + Codex budgets
1840
+ feature, spec §4). Reads the **cache DB** via ``get_codex_entries`` (which
1841
+ opens ``cache.db``, runs the Codex sync, and carries the contention /
1842
+ direct-parse fallback) — NEVER the budget's stats ``conn``, which has no
1843
+ Codex tables.
1844
+
1845
+ Spend is computed per entry via the SAME ``_calculate_codex_entry_cost``
1846
+ primitive the ``codex-*`` reports use (LiteLLM token semantics; unknown
1847
+ model → ``gpt-5`` fallback), so a Codex budget and ``codex-weekly`` agree to
1848
+ the cent. A lean sum — no per-entry sample collection (budgets don't need
1849
+ ``_compute_codex_cost_stats``' samples list) — but routed through the same
1850
+ cost primitive so there is no second pricing copy.
1851
+
1852
+ ``speed="auto"`` resolves to the SAME effective tier the ``codex-*`` reports
1853
+ use under the current config (``_resolve_codex_speed`` reads the active
1854
+ ``$CODEX_HOME``/``config.toml`` — fast multiplies cost at calc time), so the
1855
+ figure matches what ``codex-weekly`` shows on this machine right now.
1856
+
1857
+ ``get_codex_entries`` filters on ``timestamp_utc <= end``; the budget window
1858
+ is half-open ``[start, end)`` so an entry exactly at ``end`` is excluded
1859
+ here (mirrors the kernel's half-open elapsed math). Empty cache / no entries
1860
+ → ``0.0``.
1861
+
1862
+ ``skip_sync=True`` serves the already-warm cache without a fresh ingest —
1863
+ for a second sum in the same process over a sub-range of one already fetched
1864
+ (e.g. the recent-24h window after the full-period sum).
1865
+ """
1866
+ c = _cctally()
1867
+ eff_speed = c._resolve_codex_speed(speed)
1868
+ total = 0.0
1869
+ for entry in c.get_codex_entries(start, end, skip_sync=skip_sync):
1870
+ if entry.timestamp >= end:
1871
+ continue
1872
+ total += c._calculate_codex_entry_cost(
1873
+ entry.model,
1874
+ entry.input_tokens,
1875
+ entry.cached_input_tokens,
1876
+ entry.output_tokens,
1877
+ entry.reasoning_output_tokens,
1878
+ speed=eff_speed,
1879
+ )
1880
+ return total
1881
+
1882
+
1593
1883
  def get_entries(
1594
1884
  range_start: dt.datetime,
1595
1885
  range_end: dt.datetime,
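Reviewer note on the half-open window in `_sum_codex_cost_for_range`: the docstring says the fetch is inclusive of `end` while the budget window is `[start, end)`, so the loop drops entries at exactly `end`. The sketch below illustrates why that matters for adjacent windows. `Entry`, `fetch_inclusive`, and `sum_half_open` are hypothetical stand-ins, and the precomputed `cost_usd` field is an assumption (the real code prices each entry at call time).

```python
import datetime as dt
from dataclasses import dataclass


@dataclass
class Entry:  # hypothetical, simplified entry
    timestamp: dt.datetime
    cost_usd: float


def fetch_inclusive(entries, start, end):
    """Stand-in for a fetch that filters on start <= timestamp <= end."""
    return [e for e in entries if start <= e.timestamp <= end]


def sum_half_open(entries, start, end):
    """Sum cost over [start, end): an entry exactly at `end` is excluded."""
    total = 0.0
    for e in fetch_inclusive(entries, start, end):
        if e.timestamp >= end:
            continue
        total += e.cost_usd
    return total


utc = dt.timezone.utc
t0 = dt.datetime(2026, 6, 1, tzinfo=utc)
t1 = dt.datetime(2026, 6, 8, tzinfo=utc)
t2 = dt.datetime(2026, 6, 15, tzinfo=utc)
entries = [
    Entry(t0, 1.0),                          # at the start of week 1
    Entry(t1, 2.0),                          # exactly on the boundary
    Entry(t1 + dt.timedelta(hours=1), 4.0),  # inside week 2
]

week1 = sum_half_open(entries, t0, t1)
week2 = sum_half_open(entries, t1, t2)
```

With the half-open rule the boundary entry belongs to the second window only, so the two adjacent sums add up to the total without double counting.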
@@ -1628,6 +1918,24 @@ def get_entries(
1628
1918
  return iter_entries(conn, range_start, range_end, project=project)
1629
1919
 
1630
1920
 
1921
+ def _harden_cache_sidecars() -> None:
1922
+ """Best-effort 0600 on cache.db + its -wal/-shm sidecars (Plan 2, spec §5).
1923
+
1924
+ The -wal/-shm sidecars are created on the first WRITE (not on connect), so
1925
+ this runs at the END of the sync_cache write transaction — under the held
1926
+ cache.db.lock flock, where they exist — NOT in open_cache_db (where the
1927
+ sidecars are absent → a silent no-op that would leave a 0644 WAL). All
1928
+ chmod is best-effort: swallow OSError, log, continue.
1929
+ """
1930
+ base = str(_cctally_core.CACHE_DB_PATH)
1931
+ for path in (base, base + "-wal", base + "-shm"):
1932
+ try:
1933
+ if os.path.exists(path):
1934
+ os.chmod(path, 0o600)
1935
+ except OSError as exc:
1936
+ eprint(f"[cache] could not chmod {path} 0600 ({exc}); continuing")
1937
+
1938
+
1631
1939
  # === Region 6: open_cache_db (was bin/cctally:9040-9155) ===
1632
1940
 
1633
1941
 
@@ -1640,6 +1948,14 @@ def open_cache_db() -> sqlite3.Connection:
1640
1948
  """
1641
1949
  c = _cctally()
1642
1950
  _cctally_core.APP_DIR.mkdir(parents=True, exist_ok=True)
1951
+ # cache.db holds plaintext conversation prose at rest (Plan 2, spec §5).
1952
+ # Harden the data dir to 0700 so the WAL window between connect and the
1953
+ # first write (which materializes the -wal/-shm sidecars, hardened in
1954
+ # sync_cache) is not world-readable. Best-effort: swallow OSError + continue.
1955
+ try:
1956
+ os.chmod(_cctally_core.APP_DIR, 0o700)
1957
+ except OSError as exc:
1958
+ eprint(f"[cache] could not chmod data dir 0700 ({exc}); continuing")
1643
1959
  try:
1644
1960
  conn = sqlite3.connect(_cctally_core.CACHE_DB_PATH)
1645
1961
  conn.execute("SELECT 1").fetchone()
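Reviewer note on the at-rest hardening added in these hunks: a minimal sketch of the best-effort chmod pattern (data dir to 0700, database file and any existing sidecars to 0600, errors reported and swallowed). The `harden` helper and the temporary paths are hypothetical, and the sketch assumes a POSIX filesystem where permission bits are honored.

```python
import os
import stat
import sys
import tempfile


def harden(path, mode):
    """Best-effort chmod; returns True only if the path existed and was changed."""
    try:
        if os.path.exists(path):
            os.chmod(path, mode)
            return True
    except OSError as exc:
        print(f"could not chmod {path} {mode:o} ({exc}); continuing", file=sys.stderr)
    return False


with tempfile.TemporaryDirectory() as data_dir:
    db = os.path.join(data_dir, "cache.db")
    # Simulate a database whose -wal sidecar exists but whose -shm does not yet.
    for suffix in ("", "-wal"):
        with open(db + suffix, "w"):
            pass
        os.chmod(db + suffix, 0o644)

    harden(data_dir, 0o700)
    results = {s: harden(db + s, 0o600) for s in ("", "-wal", "-shm")}
    modes = {s: stat.S_IMODE(os.stat(db + s).st_mode) for s in ("", "-wal")}
    dir_mode = stat.S_IMODE(os.stat(data_dir).st_mode)
```

The missing `-shm` file is skipped silently, which is why the diff runs the sidecar pass only after a write has created the sidecars, and relies on the 0700 directory until then.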
@@ -1651,6 +1967,13 @@ def open_cache_db() -> sqlite3.Connection:
1651
1967
  pass
1652
1968
  conn = sqlite3.connect(_cctally_core.CACHE_DB_PATH)
1653
1969
 
1970
+ # Best-effort 0600 on cache.db itself (the 0700 dir above backstops the
1971
+ # sidecars until the first write hardens them in sync_cache).
1972
+ try:
1973
+ os.chmod(_cctally_core.CACHE_DB_PATH, 0o600)
1974
+ except OSError as exc:
1975
+ eprint(f"[cache] could not chmod cache.db 0600 ({exc}); continuing")
1976
+
1654
1977
  conn.execute("PRAGMA journal_mode=WAL")
1655
1978
  conn.execute("PRAGMA busy_timeout=5000")
1656
1979
 
@@ -1682,6 +2005,7 @@ def open_cache_db() -> sqlite3.Connection:
1682
2005
  # §2.5, §3.3 + the @cache_migration decorator further down in this file.
1683
2006
  _run_pending_migrations(
1684
2007
  conn, registry=_CACHE_MIGRATIONS, db_label="cache.db",
2008
+ recover_version_ahead=True,
1685
2009
  )
1686
2010
  return conn
1687
2011