npm - cctally - Versions diffs - 1.27.1 → 1.29.0 - Mend

cctally 1.27.1 → 1.29.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/CHANGELOG.md +37 -0
package/bin/_cctally_alerts.py +26 -1
package/bin/_cctally_cache.py +355 -31
package/bin/_cctally_config.py +153 -11
package/bin/_cctally_core.py +204 -42
package/bin/_cctally_dashboard.py +510 -61
package/bin/_cctally_db.py +756 -163
package/bin/_cctally_doctor.py +11 -0
package/bin/_cctally_forecast.py +700 -57
package/bin/_cctally_milestones.py +252 -47
package/bin/_cctally_parser.py +44 -4
package/bin/_cctally_record.py +380 -133
package/bin/_cctally_weekrefs.py +30 -6
package/bin/_lib_alert_axes.py +12 -2
package/bin/_lib_alerts_payload.py +95 -3
package/bin/_lib_budget.py +48 -0
package/bin/_lib_conversation.py +177 -0
package/bin/_lib_conversation_query.py +620 -0
package/bin/_lib_doctor.py +60 -1
package/bin/_lib_jsonl.py +69 -50
package/bin/_lib_transcript_access.py +80 -0
package/bin/cctally +29 -2
package/dashboard/static/assets/index-BGaWg6ys.js +47 -0
package/dashboard/static/assets/{index-D34qf0LE.css → index-BqQ5xdX0.css} +1 -1
package/dashboard/static/dashboard.html +2 -2
package/package.json +4 -1
package/dashboard/static/assets/index-C2F1_Mxt.js +0 -18

package/CHANGELOG.md CHANGED

@@ -5,6 +5,43 @@ based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 ## [Unreleased]
+## [1.29.0] - 2026-06-08
+### Added
+- **Dashboard conversation viewer.** New full-screen Conversations workspace in `cctally dashboard`: a cost-aware transcript reader (rendered markdown, per-turn cost, collapsible thinking/tool/sidechain detail) plus cross-session full-text search that jumps to the highlighted message. Loopback-only by default; LAN needs `dashboard.expose_transcripts`. Backend shipped earlier; this adds the front end.
+### Fixed
+- **The local web dashboard server now tears down on a single SIGINT/SIGTERM unconditionally, closing a rare lost-wakeup race in its shutdown path.** `cctally dashboard` blocked its main thread on a `threading.Event.wait()` woken only by the SIGINT/SIGTERM handler calling `stop.set()` — and CPython can lose a single signal that races the entry into `Event.wait()` (the Python-level handler never runs, or `set()`'s `notify_all()` fires before the waiter registers), so ~0.04–0.07% of single signals failed to wake the loop and recovery needed a second signal (a timed-poll does *not* fix it: on the miss the flag is never set, so it polls forever). The wait now uses a self-pipe wakeup fd (`signal.set_wakeup_fd` + a `select` on the read end): CPython's C-level signal trampoline writes the signum to the pipe on every delivery — before and independent of the Python-level handler running — so the first signal always unblocks the loop. This was already mitigated in practice (interactive Ctrl-C sends more than one signal, a process manager escalates to SIGKILL, and the #153 harness fix already bounded test-server teardown), so there is no behavior change to the banner, browser-open, or clean-shutdown print paths and nothing to do on upgrade (#154).
+- **`cctally db recover --db stats` no longer resets a recovered DB's schema `user_version` to 0 when a known migration is recorded only under its legacy unprefixed marker name.** When healing a version-ahead `stats.db`, the all-known-applied check that decides whether to fast-path straight to the known schema head compared each migration's canonical `NNN_`-prefixed name against the recorded markers without normalizing the three pre-framework legacy aliases (`five_hour_block_models_backfill_v1`, `five_hour_block_projects_backfill_v1`, `merge_5h_block_duplicates_v1`) to their canonical names — so a DB whose those markers predate the framework rename was misread as having a missing migration and reverted to `user_version = 0`, forcing an unnecessary full migration re-walk on the next open instead of reconciling directly to head. The recover path now normalizes legacy aliases before the membership test (matching the alias-aware read in `db status`), so such a DB reconciles straight to the known head; cache.db (which has no legacy markers) is unaffected (#148).
+- **Internal (test infra, no user-facing change): the golden-file test suite no longer hangs indefinitely when a backgrounded dashboard test server drops a single SIGTERM.** The server-spawning harnesses (`dashboard`, `conversation`, `settings-api`) tore their `cctally dashboard` test servers down with an unbounded `kill "$pid"; wait "$pid"` — and CPython can lose a single SIGTERM that races the server's main-thread `threading.Event.wait()` (woken only by its signal handler's `stop.set()`; ~0.04–0.07% of single signals are dropped and recovery needs a second signal), so on the rare miss the `wait` blocked forever and wedged the whole suite (observed once under #153 as a 30+ minute hang on a non-TTY/piped run; the foreground suite always completed `1395/0`). Teardown now routes through a shared `bin/_lib-kill-server.sh` helper that escalates SIGTERM → bounded grace poll → uncatchable SIGKILL → reap (a wedge emits a non-fatal WARN rather than hanging or failing), guaranteeing teardown regardless of the server's signal-handling state; a new `bin/cctally-kill-server-test` harness pins the behavior (#153).
+### Changed
+- Conversation viewer: subagent (sidechain) threads are now grouped by their originating agent file so parallel subagents render as separate collapsible threads (with task-prompt label, message count, and thread cost) instead of being fused by adjacency; threads nest under a parent message where a real cross-file link exists. Reader items expose a privacy-safe subagent key + parent link (never a raw filesystem path).
+- **Internal performance (no user-facing change): the conversation-viewer search endpoint now dedups, pages, and counts entirely in SQL instead of materializing every match in Python.** `/api/conversation/search` (and the `_lib_conversation_query` FTS/LIKE kernels behind it) previously ran an unbounded `SELECT` that built a hit object — and, for FTS, a `snippet()` string; for LIKE, the full row `text` — for *every* corpus match, then deduped by `(session_id, uuid)` and sliced one page in Python, so latency and memory scaled with the number of matches rather than the page size. The match set is now deduped via a SQL window function (`ROW_NUMBER() OVER (PARTITION BY session_id, uuid …)`, keeping the same first-occurrence row as before), paged with `LIMIT/OFFSET`, and the exact post-dedup `total` is a separate `COUNT(*)` over `SELECT DISTINCT session_id, uuid`; snippet/text generation is deferred to a second query covering only the page's rowids — so Python never holds more than one page of hits/snippets regardless of corpus match count. The JSON response (`{query, mode, hits, total}`, deduped by `(session_id, uuid)`, cost-once) is byte-identical (the conversation-query unit suite and the `bin/cctally-conversation-test` search goldens are unchanged), so there is nothing to do on upgrade (#149).
+- **Internal refactor (no user-facing change): the two per-vendor budget-milestone tables are now one vendor-tagged `budget_milestones` table.** The structurally-identical Claude and Codex budget-milestone tables (`budget_milestones` keyed on `week_start_at`, `codex_budget_milestones` keyed on `period_start_at`) are merged by a new stats migration `012_unify_budget_milestones_vendor` into a single `budget_milestones` with a `vendor` column (`'claude'`/`'codex'`), the renamed `period_start_at` key, and `UNIQUE(vendor, period_start_at, period, threshold)` — history, `alerted_at`, and `period` are preserved verbatim and the migration is idempotent / partial-state-safe (the Codex table is dropped). The `budget` and `codex_budget` desktop-alert axes stay two distinct axes but now share the one table (filtered `WHERE vendor=?`), and the parallel insert / dashboard envelope / reconcile-on-set / firing code collapses to a single vendor-parameterized path with the two `maybe_record_*` entry points kept as thin vendor adapters. Also folds in a dashboard fix: the Settings `POST /api/settings` budget-reconcile trigger now fires on a changed `period` (parity with the CLI `budget set --period` path). Alert ids, dashboard envelope bytes, and notification text are unchanged (no frontend bundle change), so there is nothing to do on upgrade — the merge runs automatically on the next DB open (#143).
+- **The `0700` data-dir hardening now also covers a stats-first cold start.** The owner-only data-dir permission shipped in 1.28.0 was applied when `cache.db` was opened, but a cold start that opened `stats.db` first (e.g. `record-usage` on a fresh machine) materialized the directory at the default umask and left it that way until the next `cache.db` open. The `0700` chmod now lives in the shared `ensure_dirs()` primitive (best-effort, swallowing `OSError`) that every `stats.db` open runs through, with the `cache.db` open keeping its own chmod as a backstop — so the data dir is owner-only regardless of which database is touched first. Posture-only; no action needed on upgrade (#150).
+- **Internal refactor (no user-facing change): the three local-dashboard conversation GET handlers (`/api/conversations`, `/api/conversation/<id>`, `/api/conversation/search`) now share one `_run_conversation_query` scaffold for the open-cache → run-query → close → 500-envelope lifecycle (previously triplicated), and the single-value query-string string parse routes through a new `_qs_str` helper (the string sibling of the existing `_qs_int`).** Status codes, JSON bodies, the `cache unavailable:` / `<type>: <msg>` 500 envelopes, and the reader's 404 are byte-identical — the conversation-endpoint, conversation-query, and dashboard golden suites are unchanged (a new test also pins the cache-open-failure 500 across all three routes). Purely a maintainability / de-duplication change; nothing to do on upgrade (#151).
+- **Internal performance (no user-facing change): the cache sync now parses each changed session JSONL file once per sync instead of twice, and a `cache-sync --rebuild` / truncation re-ingest clears the conversation full-text search index without the per-row delete-trigger storm.** Cost rows and conversation message rows are now produced from a single fused pass over each changed file (previously the cost walk and the conversation walk each re-read and re-parsed the same byte range), and the rebuild/truncation full-clear drops the FTS sync triggers, truncates, then resets the index with one `'delete-all'` instead of firing an FTS shadow-write per row inside the held cache lock — on a large index (≈850k rows) the full-clear dropped from ~8.5s to ~0.3s of held-lock time. Output is byte-identical (cost totals, conversation rows, and the search index are unchanged; the reconcile and conversation-ingest suites stay green), so there is nothing to do on upgrade (#138).
+## [1.28.0] - 2026-06-06
+### Added
+- **`cctally budget` now supports per-vendor budgets over configurable calendar periods.** The Claude budget can run over a `calendar-week` or `calendar-month` instead of the default subscription week (`cctally budget set 300 --period calendar-month`, or `cctally config set budget.period calendar-month`), and a separate **Codex (OpenAI) budget** tracks Codex's *actual API dollars* over a calendar week or month (`cctally budget set 200 --vendor codex --period calendar-month`). The two budgets are independent — configure either, both, or neither — and there is no combined cross-vendor cap (Claude is equivalent-$, Codex is actual-$, so they are never summed). The status report renders a labeled block per configured vendor (Claude first, then Codex) with a cost-basis parenthetical (`— equivalent-$` / `— actual API $`); the legacy single-vendor subscription-week output stays byte-identical. `cctally budget set/unset` gain `--vendor {claude,codex}` and `--period {subscription-week,calendar-week,calendar-month}` (short spellings `sub-week`/`week`/`month` normalize); `--json` gains an always-present `period` key and, when a Codex budget is configured, an additive `codex` sibling object (both additive — no schema-version bump). Calendar and Codex budgets no longer depend on weekly usage snapshots, so a fresh machine with a configured Codex budget renders `$0`/`0%` rather than "no usage data yet this week". Codex spend reconciles to the `codex-*` reports within 1e-9 USD.
+- **A new `codex_budget` desktop-alert axis fires once per threshold as Codex actual spend crosses that percent of the Codex budget** (opt-in via `budget.codex.alerts_enabled`, default off), with the same forward-only / fire-once / reconcile-on-set latching as the Claude budget axis and re-arming each calendar period. Because Codex usage never flows through Claude's `record-usage`, the axis fires both from every Claude hook-tick and opportunistically whenever you run `cctally budget` (so a pure-Codex user still gets a push on their next `cctally` invocation). The Claude `budget` axis is also period-generalized so calendar-period Claude alerts fire correctly. In the local web dashboard, fired Codex alerts appear in the Recent-alerts panel/modal and as a toast with a distinct **CODEX** chip and a period-aware label ("Month of …" / "Calendar week of …" instead of always "Week"); the same period-aware label fix applies to calendar-period Claude budget alerts. Preview the axis end-to-end with `cctally alerts test --axis codex-budget`.
+- **Projected-pace budget alerts now cover calendar-period Claude budgets and Codex budgets.** Previously the `projected` alert axis was subscription-week + Claude-only; it now fires an on-pace-to-exceed alert for any Claude period (`calendar-week` / `calendar-month`, opt-in via `budget.projected_enabled`) and for Codex budgets (opt-in via `budget.codex.projected_enabled`, which — like the Claude toggle — requires `budget.codex.alerts_enabled` to also be on). Codex projected crossings fire from `record-usage` and opportunistically whenever you run `cctally budget`, and re-arm each period; the fired projection reconciles to `cctally budget --json` `week_avg_projection_usd` within 1e-9 USD. Preview either with `cctally alerts test --axis projected --metric {budget_usd,codex_budget_usd}`.
+- **The local web dashboard can now toggle the two Codex budget alert switches from Settings.** The Settings overlay (key `s`) gains "Codex budget alerts" (`budget.codex.alerts_enabled`) and "Codex projected-pace alerts" (`budget.codex.projected_enabled`); both write through a nested partial-merge so flipping a toggle never clobbers the Codex amount, period, or thresholds (those stay CLI-only). When no Codex budget is configured the toggles render disabled with a one-line hint pointing at `cctally budget set 200 --vendor codex`. Codex projected-pace crossings render on the dashboard with the **PROJECTED** chip and a vendor-tagged context line ("projected $230 of $200 · Codex").
+- These two additions resolve the deferred follow-ups noted in the prior calendar-period + Codex budgets work (issues #134 and #135).
+- **The local web dashboard can optionally serve read-only Claude/Codex conversation transcripts through three new JSON endpoints** (`/api/conversations`, `/api/conversation/<id>`, `/api/conversation/search`), behind a new opt-in `dashboard.expose_transcripts` config key (default off). Transcripts are double-gated — never served unless you have explicitly opted in AND the request Host is loopback-allowed — so a LAN-exposed dashboard (`--host 0.0.0.0`) never leaks conversation text by default. This release ships the endpoints and access gate only; there is no transcript-viewer UI yet.
+- **`cctally doctor` gains a `db.version_ahead` check, and `cctally db recover` can now self-heal an ahead `cache.db`.** The check warns when a local DB's schema `user_version` has drifted ahead of the running binary (the "unreleased-head poisoning" hazard of running a newer checkout against your data dir and then downgrading); `cctally db recover` rebuilds an ahead `cache.db` losslessly from source JSONL (the cache is fully re-derivable) instead of leaving the binary stuck on `DowngradeDetected` (#145).
+### Changed
+- **`cache.db` and its lock/WAL sidecars are now created with owner-only permissions (files `0600`, the data dir `0700`).** Conversation transcripts can flow through the cache, so this keeps that data from being world-readable on shared machines.
+### Fixed
+- **The weekly trend (`cctally report` / `dollar-per-percent` / `weekly`) no longer splits a past week into a spurious zero-width row from a single transient `0%` reading.** The historical reset-event backfill was applying the lenient "reset-to-zero" discriminator (a sub-25pp drop to ~0%, intended for *live* current-week detection where a debounce filters transient API zeros) to its one-shot scan over all past snapshots — which has no debounce. A single stale-replica `0%` blip mid-week (e.g. usage climbing `6% → 0% → 1%` on the same still-future week boundary) was therefore mis-read as a goodwill credit and segmented that historical week into a degenerate `09:00 → 09:00` zero-width row with duplicated/misattributed percentages and cost. The backfill now fires only on the unambiguous `≥25pp` drop; the reset-to-zero signal remains active for live detection (so a real surprise reset on the current week is still caught). On upgrade, any week already mis-split this way renders correctly again on the next read (the spurious event stops regenerating).
+- Refuse to forward-migrate the prod data dir (`~/.local/share/cctally`) when running from a git checkout, preventing a dev/worktree binary from bricking the installed release with `DowngradeDetected`; override with `CCTALLY_ALLOW_PROD_MIGRATION=1` (#142).
+- **`cctally db recover --db stats` now refuses to recover the production stats DB when run from a dev/git checkout** (exit 2, the DB left untouched), matching the prod-migration guard, so a worktree binary can't rewrite the installed instance's stats history (#146).
 ## [1.27.1] - 2026-06-04
 ### Fixed
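
The 1.29.0 shutdown fix (#154) above describes replacing a bare `threading.Event.wait()` with a self-pipe wakeup fd (`signal.set_wakeup_fd` plus a `select` on the read end). The following is a minimal sketch of that general technique, not the package's actual code: the function name `wait_for_signal` and its parameters are hypothetical, and it assumes a POSIX platform and that it is called from the main thread.

```python
import os
import select
import signal


def wait_for_signal(signums=(signal.SIGINT, signal.SIGTERM), timeout=None):
    """Block until one of `signums` arrives; return its number, or None on timeout.

    Hypothetical helper illustrating the self-pipe pattern. Must run in the
    main thread, because signal.signal() and signal.set_wakeup_fd() require it.
    """
    wanted = {int(s) for s in signums}
    read_fd, write_fd = os.pipe()
    # set_wakeup_fd() requires a non-blocking descriptor.
    os.set_blocking(write_fd, False)
    os.set_blocking(read_fd, False)

    # A Python-level handler must be installed so the default action
    # (terminating the process) is replaced; the handler body is a no-op.
    previous = {s: signal.signal(s, lambda signum, frame: None) for s in signums}
    # The C-level signal handler writes the signal number as one byte to
    # write_fd on every delivery, regardless of when the Python handler runs.
    old_wakeup = signal.set_wakeup_fd(write_fd)
    try:
        while True:
            ready, _, _ = select.select([read_fd], [], [], timeout)
            if not ready:
                return None  # timed out with no signal delivered
            for signum in os.read(read_fd, 64):
                if signum in wanted:
                    return signum
    finally:
        # Restore the previous wakeup fd and handlers, then release the pipe.
        signal.set_wakeup_fd(old_wakeup)
        for s, handler in previous.items():
            signal.signal(s, handler)
        os.close(read_fd)
        os.close(write_fd)
```

A caller would invoke `wait_for_signal()` where it previously blocked on the event, then run its teardown once the call returns. Because the wakeup byte is written at delivery time, the wait does not depend on a flag being set by the Python-level handler.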

package/bin/_cctally_alerts.py CHANGED

@@ -71,12 +71,14 @@ _alert_text_weekly = _lib_alerts_payload._alert_text_weekly
 _alert_text_five_hour = _lib_alerts_payload._alert_text_five_hour
 _alert_text_budget = _lib_alerts_payload._alert_text_budget
 _alert_text_project_budget = _lib_alerts_payload._alert_text_project_budget
+_alert_text_codex_budget = _lib_alerts_payload._alert_text_codex_budget
 _alert_text_projected = _lib_alerts_payload._alert_text_projected
 _escape_applescript_string = _lib_alerts_payload._escape_applescript_string
 _build_alert_payload_weekly = _lib_alerts_payload._build_alert_payload_weekly
 _build_alert_payload_five_hour = _lib_alerts_payload._build_alert_payload_five_hour
 _build_alert_payload_budget = _lib_alerts_payload._build_alert_payload_budget
 _build_alert_payload_project_budget = _lib_alerts_payload._build_alert_payload_project_budget
+_build_alert_payload_codex_budget = _lib_alerts_payload._build_alert_payload_codex_budget
 _build_alert_payload_projected = _lib_alerts_payload._build_alert_payload_projected
 # Phase B: severity policy + the cross-platform dispatch kernel. The kernel is
@@ -175,6 +177,8 @@ def _dispatch_alert_notification(
         title, subtitle, body = _alert_text_budget(payload, tz)
     elif axis == "project_budget":
         title, subtitle, body = _alert_text_project_budget(payload, tz)
+    elif axis == "codex_budget":
+        title, subtitle, body = _alert_text_codex_budget(payload, tz)
     elif axis == "projected":
         title, subtitle, body = _alert_text_projected(payload, tz)
     else:
@@ -249,6 +253,7 @@ def _dispatch_alert_notification(
             ctx.get("week_start_date")
             or ctx.get("five_hour_window_key")
             or ctx.get("week_start_at")
+            or ctx.get("period_start_at")
             or ""
         )
         line = (
@@ -285,6 +290,8 @@ def cmd_alerts_test(args: argparse.Namespace) -> int:
         axis = "budget"
     elif args.axis == "project-budget":
         axis = "project_budget"
+    elif args.axis == "codex-budget":
+        axis = "codex_budget"
     elif args.axis == "projected":
         axis = "projected"
     else:
@@ -335,17 +342,35 @@ def cmd_alerts_test(args: argparse.Namespace) -> int:
             spent_usd=26.0,
             consumption_pct=104.0,
         )
+    elif axis == "codex_budget":
+        # Synthetic Codex budget payload — NO DB writes (test/real divergence
+        # contract), NO real budget.codex entry required. A $200 calendar-month
+        # budget reads plausibly; spent scaled to the threshold so the body line
+        # reads as the at-crossing snapshot the dashboard would render.
+        payload = _build_alert_payload_codex_budget(
+            threshold=threshold,
+            crossed_at_utc=now_utc_iso(),
+            period_start_at=dt.date.today().replace(day=1).isoformat(),
+            period="calendar-month",
+            budget_usd=200.0,
+            spent_usd=200.0 * threshold / 100.0,
+            consumption_pct=float(threshold),
+        )
     elif axis == "projected":
         # Synthetic projected-pace payload — NO DB writes (test/real divergence
         # contract). The metric discriminator picks the wiring; projected_value
         # is the threshold's denominator-relative value (so the body reads
         # plausibly, e.g. weekly 100% → "~100% of cap", budget 100% → "$300 of
         # $300"). denominator is the at-crossing target the row would carry
-        # (Codex P0-4): 100.0 for weekly_pct, $300 for budget_usd.
+        # (Codex P0-4): 100.0 for weekly_pct, $300 for budget_usd, $200 for
+        # codex_budget_usd (matching the codex_budget axis test-alert budget).
         metric = getattr(args, "metric", "weekly_pct")
         if metric == "budget_usd":
             denominator = 300.0
             projected_value = 300.0 * threshold / 100.0
+        elif metric == "codex_budget_usd":
+            denominator = 200.0
+            projected_value = 200.0 * threshold / 100.0
         else:  # weekly_pct
             denominator = 100.0
             projected_value = float(threshold)
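
The hunk above picks a synthetic denominator per metric and scales the projected value to the requested threshold. As a rough restatement of that arithmetic, the sketch below isolates it in a standalone function; `projected_test_values` is a hypothetical name used only for illustration, and the dollar figures are the ones shown in the diff.

```python
def projected_test_values(metric: str, threshold: int) -> tuple[float, float]:
    """Return (denominator, projected_value) for a synthetic test alert.

    Hypothetical helper mirroring the branch in the diff: the denominator is
    the at-crossing target, and the projected value is threshold percent of it.
    """
    if metric == "budget_usd":
        denominator = 300.0  # synthetic Claude budget used by the test alert
    elif metric == "codex_budget_usd":
        denominator = 200.0  # matches the codex_budget axis test-alert budget
    else:  # weekly_pct
        denominator = 100.0
    return denominator, denominator * threshold / 100.0


if __name__ == "__main__":
    for metric in ("weekly_pct", "budget_usd", "codex_budget_usd"):
        print(metric, projected_test_values(metric, 90))
```

For `weekly_pct` the denominator is 100, so the projected value equals the threshold itself, which is why the original code can use `float(threshold)` directly in that branch.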

package/bin/_cctally_cache.py CHANGED

@@ -167,10 +167,99 @@ _iter_codex_jsonl_entries_with_offsets = _lib_jsonl._iter_codex_jsonl_entries_wi
 _parse_usage_entries = _lib_jsonl._parse_usage_entries
 _should_replace = _lib_jsonl._should_replace
+# Conversation-message parser kernel (Plan 1). Pure leaf (stdlib-only), so
+# it loads at module-load time alongside _lib_jsonl. Since #138 the per-file
+# sync ingest goes through the fused ``_iter_sync_entries`` walker (which calls
+# ``_lib_conversation.parse_message_row`` directly); ``_iter_message_rows`` is
+# now used only by ``backfill_conversation_messages``.
+_lib_conversation = _load_lib("_lib_conversation")
+_iter_message_rows = _lib_conversation.iter_message_rows
+# Shared by the fused per-file walk AND backfill_conversation_messages so the
+# column list, placeholders, and tuple order live in ONE place — a column
+# add/reorder can't silently desync the two ingest paths (which would land
+# values in the wrong columns on whichever path was missed).
+_CONV_INSERT_SQL = (
+    "INSERT OR IGNORE INTO conversation_messages"
+    "(session_id,uuid,parent_uuid,source_path,byte_offset,"
+    " timestamp_utc,entry_type,text,blocks_json,model,msg_id,"
+    " req_id,cwd,git_branch,is_sidechain)"
+    " VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
+)
+def _conv_row_tuple(m, path_str):
+    """Flatten a ``MessageRow`` into the ``_CONV_INSERT_SQL`` column order."""
+    return (
+        m.session_id, m.uuid, m.parent_uuid, path_str, m.byte_offset,
+        m.timestamp_utc, m.entry_type, m.text, m.blocks_json, m.model,
+        m.msg_id, m.req_id, m.cwd, m.git_branch, m.is_sidechain,
+    )
+def _iter_sync_entries(fh, path_str):
+    """Fused single-pass sync walker (#138). Yields
+    ``(byte_offset, cost_or_None, msgrow_or_None)`` for each JSONL line from
+    ``fh``'s current position that produces a cost entry and/or a conversation
+    message row.
+    Each line is read once (readline()+tell()) and ``json.loads``-parsed ONCE,
+    then classified by both pure per-line parsers:
+      * ``cost_or_None`` is ``(UsageEntry, msg_id, req_id)`` when the line is a
+        billable assistant entry (``_lib_jsonl.parse_cost_entry``), else None.
+      * ``msgrow_or_None`` is a ``MessageRow`` when the line is a user/assistant
+        turn carrying a uuid (``_lib_conversation.parse_message_row``), else None.
+    The two are independent — a normal assistant line yields both. This replaces
+    the former cost walk + re-seek-and-walk over the identical byte span: with a
+    single walk the "identical span" invariant is structural (one stop point),
+    not a prose-enforced ``mrow.byte_offset >= final_offset`` runtime break. A
+    partial mid-write tail line (no trailing newline) rewinds the handle and
+    stops, so ``fh.tell()`` after the loop is the cost cursor's ``final_offset``
+    and the next sync re-reads the line once the newline lands.
+    """
+    while True:
+        offset = fh.tell()
+        line = fh.readline()
+        if not line:
+            return
+        if not line.endswith("\n"):
+            # Partial tail line — writer is mid-flight. Rewind so the next sync
+            # re-reads this line once the newline is in place (and so fh.tell()
+            # reports the cost cursor's stop, never past the partial).
+            fh.seek(offset)
+            return
+        stripped = line.strip()
+        if not stripped:
+            continue
+        try:
+            obj = json.loads(stripped)
+        except json.JSONDecodeError:
+            continue
+        cost = _lib_jsonl.parse_cost_entry(obj, path_str)
+        mrow = _lib_conversation.parse_message_row(obj, offset)
+        if cost is not None or mrow is not None:
+            yield offset, cost, mrow
+def _iter_claude_jsonl_files():
+    """Yield every Claude transcript ``*.jsonl`` under each data dir's
+    ``projects/`` tree. Shared by ``sync_cache`` and the conversation backfill
+    so both ingest paths enumerate the IDENTICAL file set."""
+    for claude_dir in _get_claude_data_dirs():
+        for jp in (claude_dir / "projects").glob("**/*.jsonl"):
+            if jp.is_file():
+                yield jp
 _cctally_db_sib = _load_lib("_cctally_db")
 add_column_if_missing = _cctally_db_sib.add_column_if_missing
 _run_pending_migrations = _cctally_db_sib._run_pending_migrations
 _CACHE_MIGRATIONS = _cctally_db_sib._CACHE_MIGRATIONS
+# Storm-free conversation_messages + FTS full-clear (#138). Owns the trigger
+# drop/recreate dance so the per-row delete trigger never fires O(rows) under
+# the held lock on a rebuild / truncation escalation.
+clear_conversation_messages = _cctally_db_sib.clear_conversation_messages
 # === BEGIN MOVED REGIONS ===
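
The `_iter_sync_entries` docstring in the hunk above describes a single pass that parses each line once and rewinds on a partial tail line so the next sync can resume there. The sketch below reproduces only that offset and rewind behavior on an in-memory buffer. It is an illustration under simplifying assumptions: `walk_complete_lines` is a hypothetical name, it yields the parsed object rather than cost and message rows, and `io.StringIO` offsets are character positions rather than file byte offsets.

```python
import io
import json


def walk_complete_lines(fh):
    """Yield (offset, obj) for each complete JSON line from fh's current position.

    Blank and unparseable lines are skipped. A final line without a trailing
    newline is treated as mid-write: the handle is rewound to its start so
    fh.tell() afterwards is the position to resume from.
    """
    while True:
        offset = fh.tell()
        line = fh.readline()
        if not line:
            return
        if not line.endswith("\n"):
            fh.seek(offset)  # partial tail: leave it for the next pass
            return
        stripped = line.strip()
        if not stripped:
            continue
        try:
            obj = json.loads(stripped)
        except json.JSONDecodeError:
            continue
        yield offset, obj


# First pass: the last line has no newline yet, so the walk stops before it.
buf = io.StringIO('{"a": 1}\n\nnot json\n{"b": 2}\n{"c": 3')
first_pass = list(walk_complete_lines(buf))
resume_at = buf.tell()

# The writer finishes the line; a second pass resumes at the saved offset.
buf.seek(0, io.SEEK_END)
buf.write("}\n")
buf.seek(resume_at)
second_pass = list(walk_complete_lines(buf))
```

Here `first_pass` holds the two complete JSON lines, `resume_at` points at the start of the unfinished line, and `second_pass` picks that line up once its newline has landed, so nothing is parsed twice or skipped.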
@@ -502,20 +591,63 @@ def sync_cache(
             # empty baseline.
             conn.execute("DELETE FROM session_entries")
             conn.execute("DELETE FROM session_files")
+            # Plan 1: conversation_messages shares the cost path's lifecycle.
+            # A rebuild re-derives the whole cache from on-disk JSONL, so the
+            # message index is wiped here (inside the held lock) and the
+            # per-file fused walk repopulates it. clear_conversation_messages
+            # drops the FTS triggers, truncates, and clears the index via
+            # 'delete-all' so the per-row delete trigger never storms O(rows)
+            # under the lock (#138) — NOT a bare DELETE that fires conv_fts_ad
+            # per row.
+            clear_conversation_messages(conn)
             # Clear the walk-complete sentinel atomically with the wipe
             # (cctally-dev#93, D5/D2): a stale "complete" marker must never
             # survive a destructive rebuild. The end-of-loop write below
             # re-establishes it only after this rebuild's clean walk.
             conn.execute("DELETE FROM cache_meta WHERE key='claude_ingest_walk_complete'")
+            # Issue #139: a rebuild walks every file from offset 0, so the
+            # per-file fused walk below repopulates the whole message
+            # index — that satisfies any deferred existing-install backfill.
+            # Drop the pending flag here so the post-rebuild sync does not also
+            # run a redundant (idempotent but wasteful) offset-0 backfill pass.
+            conn.execute(
+                "DELETE FROM cache_meta WHERE key='conversation_backfill_pending'")
             conn.commit()
             eprint("[cache-sync] rebuild: cleared Claude cached entries")
-        claude_dirs = _get_claude_data_dirs()
-        paths: list[pathlib.Path] = []
-        for claude_dir in claude_dirs:
-            for jp in (claude_dir / "projects").glob("**/*.jsonl"):
-                if jp.is_file():
-                    paths.append(jp)
+        # Issue #139: consume the deferred conversation_messages backfill. On an
+        # existing-install upgrade, cache migration 002 sets
+        # ``conversation_backfill_pending`` instead of walking the whole JSONL
+        # history inline (which stalled the triggering command — even a
+        # stats-only ``cctally report`` that fires the cache dispatcher but never
+        # reads cache.db). sync_cache is the natural owner: it already holds the
+        # flock + owns the walker, so a cache-consuming command or the
+        # background hook-tick absorbs the one-time offset-0 walk. The backfill
+        # touches ONLY conversation_messages (never the session_files cost
+        # cursor), is idempotent on (source_path, byte_offset), and commits
+        # per-file — so a crash leaves the flag set and the next sync re-runs
+        # cleanly. It writes + commits, so it must land here, BEFORE the
+        # zero-write-lock read+parse region below (and never on the rebuild
+        # path, which already cleared the flag and repopulates via the normal
+        # walk). A path-less/:memory: conn has no cache_meta only if the schema
+        # was never applied; the try/except tolerates that.
+        if not rebuild:
+            try:
+                _pending = conn.execute(
+                    "SELECT 1 FROM cache_meta "
+                    "WHERE key='conversation_backfill_pending'"
+                ).fetchone() is not None
+            except sqlite3.OperationalError:
+                _pending = False
+            if _pending:
+                backfill_conversation_messages(conn)
+                conn.execute(
+                    "DELETE FROM cache_meta "
+                    "WHERE key='conversation_backfill_pending'"
+                )
+                conn.commit()
+        paths: list[pathlib.Path] = list(_iter_claude_jsonl_files())
         stats.files_total = len(paths)
         # This SELECT does NOT open an implicit transaction (Python's
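
The `sync_cache` hunk above consumes a `conversation_backfill_pending` flag from `cache_meta`: it checks for the flag, tolerates a missing table, runs the backfill, and only then deletes the flag and commits. The sketch below shows that flag-consumption pattern in isolation. The function name `consume_pending_backfill`, the `run_backfill` callback, and the two-column `cache_meta` layout are assumptions made for the example and are not taken from the package.

```python
import sqlite3

FLAG = "conversation_backfill_pending"


def consume_pending_backfill(conn, run_backfill):
    """Run run_backfill(conn) once if the pending flag is set; return True if it ran.

    The flag is deleted only after the backfill returns, so a crash during the
    backfill leaves the flag in place and the next call retries.
    """
    try:
        pending = conn.execute(
            "SELECT 1 FROM cache_meta WHERE key=?", (FLAG,)
        ).fetchone() is not None
    except sqlite3.OperationalError:
        # No cache_meta table (schema never applied): nothing is pending.
        pending = False
    if not pending:
        return False
    run_backfill(conn)
    conn.execute("DELETE FROM cache_meta WHERE key=?", (FLAG,))
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
calls = []

# Before the schema exists, the missing table is tolerated.
ran_without_schema = consume_pending_backfill(conn, calls.append)

# Assumed minimal schema for the example, with the flag set.
conn.execute("CREATE TABLE cache_meta (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO cache_meta (key, value) VALUES (?, '1')", (FLAG,))
conn.commit()

ran_first = consume_pending_backfill(conn, calls.append)
ran_second = consume_pending_backfill(conn, calls.append)
```

The second call is a no-op because the flag was cleared after the first successful run, which is the property that keeps the one-time walk from repeating on every sync.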
@@ -614,6 +746,13 @@ def sync_cache(
                 f"dedup)"
             )
             conn.execute("DELETE FROM session_entries")
+            # Plan 1: truncation escalates to a full re-ingest of EVERY file,
+            # so conversation_messages is wiped here (parallel to the
+            # session_entries full-reset) and the per-file fused walk
+            # repopulates it from offset 0. Storm-free clear (#138): drop FTS
+            # triggers → truncate → 'delete-all' → recreate, so conv_fts_ad
+            # never fires O(rows) inside the held lock.
+            clear_conversation_messages(conn)
             # Clear the walk-complete sentinel atomically with the truncation
             # full-reset (cctally-dev#93, D5/D2): the cache is being wiped, so
             # any "complete" marker is now stale. The end-of-loop write below
@@ -684,35 +823,54 @@ def sync_cache(
             # Read + parse is a pure read; do it OUTSIDE the write transaction
             # so a slow JSONL doesn't hold a SQLite lock.
             rows: list[tuple[Any, ...]] = []
+            conv_rows: list[tuple[Any, ...]] = []
             final_offset = start_offset
             try:
                 with open(jp, "r", encoding="utf-8", errors="replace") as fh:
                     fh.seek(start_offset)
-                    for offset, entry, msg_id, req_id in _iter_jsonl_entries_with_offsets(fh, str(jp)):
-                        usage = entry.usage
-                        inp = int(usage.get("input_tokens", 0) or 0)
-                        out = int(usage.get("output_tokens", 0) or 0)
-                        cc = int(usage.get("cache_creation_input_tokens", 0) or 0)
-                        cr = int(usage.get("cache_read_input_tokens", 0) or 0)
-                        extras = {
-                            k: v for k, v in usage.items()
-                            if k not in (
-                                "input_tokens", "output_tokens",
-                                "cache_creation_input_tokens",
-                                "cache_read_input_tokens",
-                            )
-                        }
-                        rows.append((
-                            path_str,
-                            offset,
-                            entry.timestamp.astimezone(dt.timezone.utc).isoformat(),
-                            entry.model,
-                            msg_id,
-                            req_id,
-                            inp, out, cc, cr,
-                            json.dumps(extras, sort_keys=True) if extras else None,
-                            entry.cost_usd,
-                        ))
+                    # Fused single-pass walk (#138): cost rows AND conversation
+                    # message rows come from ONE parse of each line. An assistant
+                    # line yields both; a user line yields only a message row.
+                    # This replaces the former cost walk + re-seek conversation
+                    # walk over the identical span — the "identical span"
+                    # invariant is now structural (a single stop point) rather
+                    # than a prose-enforced ``>= final_offset`` runtime break.
+                    for offset, cost, mrow in _iter_sync_entries(fh, path_str):
+                        if cost is not None:
+                            entry, msg_id, req_id = cost
+                            usage = entry.usage
+                            inp = int(usage.get("input_tokens", 0) or 0)
+                            out = int(usage.get("output_tokens", 0) or 0)
+                            cc = int(usage.get("cache_creation_input_tokens", 0) or 0)
+                            cr = int(usage.get("cache_read_input_tokens", 0) or 0)
+                            extras = {
+                                k: v for k, v in usage.items()
+                                if k not in (
+                                    "input_tokens", "output_tokens",
+                                    "cache_creation_input_tokens",
+                                    "cache_read_input_tokens",
+                                )
+                            }
+                            rows.append((
+                                path_str,
+                                offset,
+                                entry.timestamp.astimezone(dt.timezone.utc).isoformat(),
+                                entry.model,
+                                msg_id,
+                                req_id,
+                                inp, out, cc, cr,
+                                json.dumps(extras, sort_keys=True) if extras else None,
+                                entry.cost_usd,
+                            ))
+                        if mrow is not None:
+                            conv_rows.append(_conv_row_tuple(mrow, path_str))
+                    # ``final_offset`` is the single walk's stop — captured AFTER
+                    # the loop drains (or rewinds a partial mid-write tail line).
+                    # It is what session_files.last_byte_offset is written from,
+                    # so it must reflect the cost cursor's position; with the
+                    # fused walk there is exactly one stop point shared by the
+                    # cost and conversation rows (#138 / #Plan1 Task 4
+                    # cursor-consistency invariant).
                     final_offset = fh.tell()
             except OSError as exc:
                 eprint(f"[cache] could not read {jp}: {exc}")
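The fused single-pass walk described in the hunk above can be sketched in isolation. This is a minimal illustration, not the package's `_iter_sync_entries`: the `iter_sync_entries` generator, the flat `type`/`text`/`usage` line shape, and the binary `readline` loop are all assumptions made for the example. It shows one parse per line yielding both row kinds, and a single stop offset that rewinds past a partially written tail line.

```python
import io
import json


def iter_sync_entries(fh):
    """Yield (offset, cost, message) per complete JSONL line in one pass.

    Hypothetical sketch: an assistant line yields both a cost dict and a
    message; a user line yields only a message. A trailing line without a
    newline is treated as a partial write and the handle is rewound to it.
    """
    while True:
        offset = fh.tell()
        line = fh.readline()
        if not line:
            return
        if not line.endswith(b"\n"):
            fh.seek(offset)  # partial tail: leave it for the next sync
            return
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        role = obj.get("type")
        cost = obj.get("usage") if role == "assistant" else None
        msg = (role, obj.get("text", "")) if role in ("user", "assistant") else None
        if cost is None and msg is None:
            continue
        yield offset, cost, msg


data = (
    b'{"type": "user", "text": "hi"}\n'
    b'{"type": "assistant", "text": "hello", "usage": {"output_tokens": 3}}\n'
    b'{"type": "assistant", "te'  # the writer is still mid-line
)
fh = io.BytesIO(data)
cost_rows, conv_rows = [], []
for offset, cost, msg in iter_sync_entries(fh):
    if cost is not None:
        cost_rows.append((offset, cost["output_tokens"]))
    if msg is not None:
        conv_rows.append((offset, msg[0]))
final_offset = fh.tell()  # one stop point shared by both row kinds
```

Because both lists are filled from the same iteration, there is no second walk whose stop point could drift from the first; the unfinished tail line is picked up on the next run from `final_offset`.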
@@ -793,6 +951,18 @@ def sync_cache(
                         rows,
                     )
                     stats.rows_changed += conn.total_changes - before
+                # Conversation message ingest (Plan 1). Lands in the SAME
+                # per-file write transaction as session_entries so the cost
+                # rows and message rows for a file commit atomically.
+                # INSERT OR IGNORE on UNIQUE(source_path, byte_offset): a
+                # resume-replayed line re-walked from a delta offset that
+                # already landed is a silent no-op, and the same physical line
+                # in two files (resume across JSONL) keeps BOTH rows. No
+                # per-file DELETE here — the only conversation_messages resets
+                # are the rebuild + truncation-escalation full-clears above
+                # (parallel to the cost path's lifecycle).
+                if conv_rows:
+                    conn.executemany(_CONV_INSERT_SQL, conv_rows)
                 # UPSERT preserves session_id / project_path columns populated
                 # by _ensure_session_files_row at the top of this loop. A plain
                 # INSERT OR REPLACE would wipe them on every changed-file sync.
@@ -839,6 +1009,12 @@ def sync_cache(
                 (dt.datetime.now(dt.timezone.utc).isoformat(),),
             )
             conn.commit()
+        # At-rest hardening (Plan 2, spec §5). Runs here — at the end of the
+        # write transaction, while the cache.db.lock flock is still held (so a
+        # concurrent writer can't be mid-checkpoint) AND after at least one
+        # write has materialized the -wal/-shm sidecars. open_cache_db hardens
+        # cache.db + the data dir; this finishes the job for the sidecars.
+        _harden_cache_sidecars()
         return stats
     finally:
         try:
@@ -848,6 +1024,56 @@ def sync_cache(
         lock_fh.close()
+def backfill_conversation_messages(conn: sqlite3.Connection) -> int:
+    """One-time backfill of ``conversation_messages`` for existing installs
+    (Plan 1 Task 5). Walks EVERY Claude JSONL from offset 0 and inserts one
+    row per user/assistant line via ``_lib_conversation.iter_message_rows``.
+    Properties:
+      * Per-file commits — a short write transaction per JSONL file, never one
+        long transaction over the whole (potentially ~1M-line) history. The
+        backfill of a huge history can't hold the cache.db write lock for
+        minutes.
+      * Idempotent — ``INSERT OR IGNORE`` on ``UNIQUE(source_path,
+        byte_offset)``. A row already present (from a prior partial run or from
+        the live ``sync_cache`` ingest) is silently skipped.
+      * Crash-resumable — because each file commits independently and the
+        INSERT is idempotent, a re-run after a crash re-walks every file but
+        only the not-yet-committed rows actually land.
+      * Cursor-safe — touches ONLY ``conversation_messages``. It never reads or
+        writes ``session_files`` / ``session_entries``, so the cost delta
+        cursor is untouched: a later ``sync_cache`` still resumes the cost walk
+        from exactly where it left off.
+    Returns the number of rows inserted. Since issue #139 the caller is
+    ``sync_cache`` itself (consuming the ``conversation_backfill_pending`` flag),
+    which already holds the ``cache.db.lock`` flock for the duration — the same
+    serialization cache migration 001 relies on. The 002 migration handler no
+    longer walks inline; it only flags the work as pending.
+    """
+    inserted = 0
+    for jp in _iter_claude_jsonl_files():
+        path_str = str(jp)
+        rows: list[tuple[Any, ...]] = []
+        try:
+            with open(jp, "r", encoding="utf-8", errors="replace") as fh:
+                for m in _iter_message_rows(fh, path_str):
+                    rows.append(_conv_row_tuple(m, path_str))
+        except OSError as exc:
+            eprint(f"[conversation-backfill] could not read {jp}: {exc}")
+            continue
+        if rows:
+            # cursor.rowcount after an executemany INSERT OR IGNORE is the
+            # number of rows actually inserted (conflicts excluded), and —
+            # unlike conn.total_changes — it is NOT inflated by the FTS
+            # AFTER INSERT trigger's shadow-table writes.
+            cur = conn.executemany(_CONV_INSERT_SQL, rows)
+            conn.commit()  # per-file commit — no long write txn
+            if cur.rowcount and cur.rowcount > 0:
+                inserted += cur.rowcount
+    return inserted
 def iter_entries(
     conn: sqlite3.Connection,
     range_start: dt.datetime,
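The idempotence and per-file commit properties listed in the `backfill_conversation_messages` docstring above can be demonstrated with a small sketch. The three-column schema, `INSERT_SQL`, and the `backfill` helper below are hypothetical stand-ins; the package's real `_CONV_INSERT_SQL` and table layout are not shown in this diff.

```python
import sqlite3

# Hypothetical minimal schema; the real conversation_messages table has more columns.
SCHEMA = """
CREATE TABLE conversation_messages (
    source_path TEXT NOT NULL,
    byte_offset INTEGER NOT NULL,
    body        TEXT,
    UNIQUE (source_path, byte_offset)
)
"""
INSERT_SQL = (
    "INSERT OR IGNORE INTO conversation_messages "
    "(source_path, byte_offset, body) VALUES (?, ?, ?)"
)


def backfill(conn, files):
    """Insert rows file by file, committing after each file.

    `files` maps a path string to a list of (byte_offset, body) pairs.
    Returns the number of rows that were actually inserted.
    """
    inserted = 0
    for path, messages in files.items():
        rows = [(path, off, body) for off, body in messages]
        if not rows:
            continue
        cur = conn.executemany(INSERT_SQL, rows)
        conn.commit()  # short write transaction per file
        if cur.rowcount and cur.rowcount > 0:
            inserted += cur.rowcount
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
files = {
    "a.jsonl": [(0, "hello"), (42, "world")],
    "b.jsonl": [(0, "hello")],  # same offset, different file: kept as its own row
}
first = backfill(conn, files)
second = backfill(conn, files)  # simulated re-run: every row now conflicts
total = conn.execute("SELECT COUNT(*) FROM conversation_messages").fetchone()[0]
```

The second call walks every file again but lands nothing, which is the behaviour the docstring relies on for crash recovery.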
@@ -1561,17 +1787,27 @@ def _collect_codex_entries_direct(
 def get_codex_entries(
     range_start: dt.datetime,
     range_end: dt.datetime,
+    *,
+    skip_sync: bool = False,
 ) -> list[CodexEntry]:
     """Cache-first Codex entry fetch with transparent fallback.
     Every Codex-reading command must use this rather than touching
     open_cache_db directly.
+    ``skip_sync=True`` bypasses the ``sync_codex_cache`` ingest pass and serves
+    whatever is already cached — for a second read in the same process whose
+    range is a SUBSET of a range already fetched (the cache is already warm), so
+    a redundant full JSONL walk is wasted work (mirrors ``get_entries``'
+    ``skip_sync``).
     """
     try:
         conn = open_cache_db()
     except (sqlite3.DatabaseError, OSError) as exc:
         eprint(f"[cache] unavailable ({exc}); falling back to direct JSONL parse")
         return _collect_codex_entries_direct(range_start, range_end)
+    if skip_sync:
+        return iter_codex_entries(conn, range_start, range_end)
     stats = sync_codex_cache(conn)
     if stats.lock_contended:
         # Sync commits file-by-file, so contention on the ingest lock
@@ -1590,6 +1826,60 @@ def get_codex_entries(
     return iter_codex_entries(conn, range_start, range_end)
+def _sum_codex_cost_for_range(
+    start: dt.datetime,
+    end: dt.datetime,
+    *,
+    speed: str = "auto",
+    skip_sync: bool = False,
+) -> float:
+    """Sum USD Codex cost of all `codex_session_entries` in ``[start, end)``.
+    The Codex analog of Claude's ``_sum_cost_for_range`` (bin/cctally), used by
+    `cctally budget`'s Codex-vendor path (calendar-period + Codex budgets
+    feature, spec §4). Reads the **cache DB** via ``get_codex_entries`` (which
+    opens ``cache.db``, runs the Codex sync, and carries the contention /
+    direct-parse fallback) — NEVER the budget's stats ``conn``, which has no
+    Codex tables.
+    Spend is computed per entry via the SAME ``_calculate_codex_entry_cost``
+    primitive the ``codex-*`` reports use (LiteLLM token semantics; unknown
+    model → ``gpt-5`` fallback), so a Codex budget and ``codex-weekly`` agree to
+    the cent. A lean sum — no per-entry sample collection (budgets don't need
+    ``_compute_codex_cost_stats``' samples list) — but routed through the same
+    cost primitive so there is no second pricing copy.
+    ``speed="auto"`` resolves to the SAME effective tier the ``codex-*`` reports
+    use under the current config (``_resolve_codex_speed`` reads the active
+    ``$CODEX_HOME``/``config.toml`` — fast multiplies cost at calc time), so the
+    figure matches what ``codex-weekly`` shows on this machine right now.
+    ``get_codex_entries`` filters on ``timestamp_utc <= end``; the budget window
+    is half-open ``[start, end)`` so an entry exactly at ``end`` is excluded
+    here (mirrors the kernel's half-open elapsed math). Empty cache / no entries
+    → ``0.0``.
+    ``skip_sync=True`` serves the already-warm cache without a fresh ingest —
+    for a second sum in the same process over a sub-range of one already fetched
+    (e.g. the recent-24h window after the full-period sum).
+    """
+    c = _cctally()
+    eff_speed = c._resolve_codex_speed(speed)
+    total = 0.0
+    for entry in c.get_codex_entries(start, end, skip_sync=skip_sync):
+        if entry.timestamp >= end:
+            continue
+        total += c._calculate_codex_entry_cost(
+            entry.model,
+            entry.input_tokens,
+            entry.cached_input_tokens,
+            entry.output_tokens,
+            entry.reasoning_output_tokens,
+            speed=eff_speed,
+        )
+    return total
 def get_entries(
     range_start: dt.datetime,
     range_end: dt.datetime,
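The half-open window handling in `_sum_codex_cost_for_range` above can be illustrated as follows. This is a sketch under stated assumptions: `Entry`, `fetch_inclusive`, and `sum_half_open` are hypothetical names, the precomputed `cost_usd` field replaces the real per-entry cost calculation, and the inclusive lower bound in `fetch_inclusive` is assumed (the docstring only states the `<= end` side).

```python
import datetime as dt
from dataclasses import dataclass


@dataclass
class Entry:  # hypothetical stand-in for CodexEntry
    timestamp: dt.datetime
    cost_usd: float


def fetch_inclusive(entries, start, end):
    """Mimic a fetch that filters on start <= timestamp <= end."""
    return [e for e in entries if start <= e.timestamp <= end]


def sum_half_open(entries, start, end):
    """Sum cost over [start, end): drop any entry stamped exactly at end."""
    total = 0.0
    for e in fetch_inclusive(entries, start, end):
        if e.timestamp >= end:
            continue
        total += e.cost_usd
    return total


utc = dt.timezone.utc
day1 = dt.datetime(2026, 6, 1, tzinfo=utc)
day2 = dt.datetime(2026, 6, 2, tzinfo=utc)
day3 = dt.datetime(2026, 6, 3, tzinfo=utc)
entries = [
    Entry(day1, 1.0),
    Entry(day2, 2.0),  # exactly on the boundary between the two windows
    Entry(day2 + dt.timedelta(hours=1), 4.0),
]
first_window = sum_half_open(entries, day1, day2)
second_window = sum_half_open(entries, day2, day3)
```

The boundary entry is counted once, in the window that starts at its timestamp, so adjacent periods never double-count spend.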
@@ -1628,6 +1918,24 @@ def get_entries(
     return iter_entries(conn, range_start, range_end, project=project)
+def _harden_cache_sidecars() -> None:
+    """Best-effort 0600 on cache.db + its -wal/-shm sidecars (Plan 2, spec §5).
+    The -wal/-shm sidecars are created on the first WRITE (not on connect), so
+    this runs at the END of the sync_cache write transaction — under the held
+    cache.db.lock flock, where they exist — NOT in open_cache_db (where the
+    sidecars are absent → a silent no-op that would leave a 0644 WAL). All
+    chmod is best-effort: swallow OSError, log, continue.
+    """
+    base = str(_cctally_core.CACHE_DB_PATH)
+    for path in (base, base + "-wal", base + "-shm"):
+        try:
+            if os.path.exists(path):
+                os.chmod(path, 0o600)
+        except OSError as exc:
+            eprint(f"[cache] could not chmod {path} 0600 ({exc}); continuing")
 # === Region 6: open_cache_db (was bin/cctally:9040-9155) ===
@@ -1640,6 +1948,14 @@ def open_cache_db() -> sqlite3.Connection:
     """
     c = _cctally()
     _cctally_core.APP_DIR.mkdir(parents=True, exist_ok=True)
+    # cache.db holds plaintext conversation prose at rest (Plan 2, spec §5).
+    # Harden the data dir to 0700 so the WAL window between connect and the
+    # first write (which materializes the -wal/-shm sidecars, hardened in
+    # sync_cache) is not world-readable. Best-effort: swallow OSError + continue.
+    try:
+        os.chmod(_cctally_core.APP_DIR, 0o700)
+    except OSError as exc:
+        eprint(f"[cache] could not chmod data dir 0700 ({exc}); continuing")
     try:
         conn = sqlite3.connect(_cctally_core.CACHE_DB_PATH)
         conn.execute("SELECT 1").fetchone()
@@ -1651,6 +1967,13 @@ def open_cache_db() -> sqlite3.Connection:
             pass
         conn = sqlite3.connect(_cctally_core.CACHE_DB_PATH)
+    # Best-effort 0600 on cache.db itself (the 0700 dir above backstops the
+    # sidecars until the first write hardens them in sync_cache).
+    try:
+        os.chmod(_cctally_core.CACHE_DB_PATH, 0o600)
+    except OSError as exc:
+        eprint(f"[cache] could not chmod cache.db 0600 ({exc}); continuing")
     conn.execute("PRAGMA journal_mode=WAL")
     conn.execute("PRAGMA busy_timeout=5000")
@@ -1682,6 +2005,7 @@ def open_cache_db() -> sqlite3.Connection:
     # §2.5, §3.3 + the @cache_migration decorator further down in this file.
     _run_pending_migrations(
         conn, registry=_CACHE_MIGRATIONS, db_label="cache.db",
+        recover_version_ahead=True,
     )
     return conn
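The best-effort permission hardening added in `_harden_cache_sidecars` and `open_cache_db` above follows a pattern that can be sketched on its own. The `harden` helper and the temporary-directory setup are illustrative assumptions, not package code, and the mode checks assume a POSIX filesystem.

```python
import os
import stat
import sys
import tempfile


def harden(paths, mode=0o600):
    """Best-effort chmod: skip missing paths, log and continue on OSError."""
    hardened = []
    for path in paths:
        try:
            if os.path.exists(path):
                os.chmod(path, mode)
                hardened.append(path)
        except OSError as exc:
            print(f"could not chmod {path} ({exc}); continuing", file=sys.stderr)
    return hardened


with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "cache.db")
    wal = base + "-wal"
    for p in (base, wal):  # "-shm" is deliberately left absent
        with open(p, "w"):
            pass
        os.chmod(p, 0o644)
    done = harden([base, wal, base + "-shm"])
    names = [os.path.basename(p) for p in done]
    modes = [stat.S_IMODE(os.stat(p).st_mode) for p in done]
```

A missing sidecar is skipped silently, which is why the diff runs the sidecar pass after a write has created the `-wal`/`-shm` files instead of at connect time.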