cctally 1.7.0 → 1.7.2

@@ -0,0 +1,2305 @@
+ """Record-usage / hook-tick hot-path subsystem for cctally.
+
+ Eager I/O sibling: bin/cctally loads this at startup. Holds the
+ runtime path that every Claude Code statusline tick and every CC
+ hook fires:
+
+ - ``cmd_record_usage`` — the statusline-driven entry point. Parses
+   ``--percent`` / ``--resets-at`` / ``--five-hour-*``, applies
+   ULP-noise sanitization at the ingress (``_normalize_percent``),
+   resolves the canonical 5h window key (Tier 1 blocks-table + Tier
+   2 snapshots fallback), runs the mid-week reset-event detector,
+   applies the per-window 7d/5h monotonicity clamps, dedup-skips
+   no-op ticks (with a self-heal probe that re-fires the milestone
+   + 5h-block helpers when a prior process was killed between
+   ``insert_usage_snapshot`` and the helpers), inserts the snapshot
+   row, queues the milestone + 5h-block updates, and writes the
+   ``hwm-7d`` / ``hwm-5h`` files.
+ - ``cmd_hook_tick`` — the CC hook entry point. Reads CC's JSON
+   payload from stdin BEFORE fork (POSIX §2.9.3 makes ``cmd &``
+   blank stdin), forks to background so CC unblocks immediately,
+   detaches stdio to ``hook-tick.log``, runs ``sync_cache`` + a
+   throttled OAuth refresh under ``hook-tick.last-fetch.lock``, and
+   writes one log line. Normal mode returns 0 unconditionally
+   (hook discipline); ``--explain`` returns a decision-tree exit code.
+ - ``maybe_record_milestone`` — percent-crossing detector. Runs
+   ``cmd_sync_week`` to refresh cost-on-disk, computes cumulative +
+   marginal cost via ``_compute_cost_for_weekref`` for reset-affected
+   weeks or ``get_latest_cost_for_week`` otherwise, inserts a
+   ``percent_milestones`` row per crossed threshold inside a single
+   transaction, and queues ``_dispatch_alert_notification`` jobs
+   for thresholds configured in ``alerts.weekly_thresholds`` (set-
+   then-dispatch invariant, spec §3.2).
+ - ``maybe_update_five_hour_block`` — 5h block upsert + rollup-children
+   replace-all + 5h-% milestone detection. Resolves block_start_at
+   from prior row (or computes from ``five_hour_resets_at - 5h`` on
+   first observation), recomputes totals via ``_compute_block_totals``,
+   upserts the parent row with ON CONFLICT DO UPDATE, replaces
+   per-(block, model) and per-(block, project) children, fires the
+   5h-% alert dispatch, and runs the cross-reset cross-flag JOIN
+   sweep — all inside one BEGIN.
+ - ``_compute_block_totals`` — sums tokens + cost over
+   [block_start_at, range_end] from ``session_entries``, with
+   per-model and per-project breakdowns. Routes through
+   ``get_claude_session_entries`` (cache-first / lock-contention
+   fallback / direct-JSONL fallback) so the rollup children inherit
+   the cache subsystem's correctness envelope.
+ - ``insert_usage_snapshot`` / ``_saved_dict_from_usage_row`` —
+   ``weekly_usage_snapshots`` INSERT and its inverse (rebuild the
+   ``saved`` dict from an existing row for the dedup self-heal
+   path).
+ - ``DerivedWeekWindow`` + ``_derive_week_from_payload`` +
+   ``_coerce_payload_captured_at`` — payload-to-week-bucket
+   resolution shared by ``insert_usage_snapshot``. Anchors the
+   bucket-key date on the canonical UTC ISO (regression: Israel host
+   briefly running with TZ=America/Los_Angeles spawned ghost
+   ``week_start_date`` rows; see ``tests/test_derive_week_utc_anchor.py``).
+ - ``_normalize_percent`` — single chokepoint that flushes IEEE 754
+   ULP noise out of ingress percent floats. Applied at every
+   cmd_record_usage ingress site (CLI args, hook-tick OAuth refresh,
+   refresh-usage OAuth fetch). 10dp round is well below any
+   meaningful consumer precision but above IEEE 754 ULP scale near
+   100.
+ - ``_hook_tick_*`` helpers — log/throttle file primitives,
+   stdin-read, session-id short, log-line formatter.
+ - ``_safe_float`` / ``_validate_date_optional`` — payload-validation
+   helpers consumed only by ``insert_usage_snapshot``.
+ - ``_logged_window_key_coerce_failure`` — one-shot module-level
+   guard so a misbehaving caller passing a non-int ``fiveHourWindowKey``
+   doesn't spam stderr on every insert.
+
+ What stays in bin/cctally:
+ - Path constants ``APP_DIR``, ``HOOK_TICK_LOG_DIR``,
+   ``HOOK_TICK_LOG_PATH``, ``HOOK_TICK_LOG_ROTATED_PATH``,
+   ``HOOK_TICK_LOG_ROTATE_BYTES``, ``HOOK_TICK_THROTTLE_PATH``,
+   ``HOOK_TICK_THROTTLE_LOCK_PATH``,
+   ``HOOK_TICK_DEFAULT_THROTTLE_SECONDS`` — referenced from the
+   moved bodies via the ``c = _cctally()`` call-time accessor pattern
+   (spec §5.5, same as ``bin/_cctally_cache.py``). The accessor
+   resolves ``sys.modules['cctally'].X`` on every call, so the
+   conftest ``redirect_paths`` ``setitem(ns, "APP_DIR", tmp)``
+   propagates transparently — no sibling-side patches needed.
+ - Alerts-config surface (``_AlertsConfigError``, ``_get_alerts_config``,
+   ``_warn_alerts_bad_config_once``, ``_ALERTS_BAD_CONFIG_WARNED``)
+   — stays in bin/cctally per task brief; consumed by dashboard
+   and other surfaces beyond record/hook-tick. Routed through
+   module-level shims here so the moved bodies keep bare-name
+   call shape.
+ - ``_dispatch_alert_notification`` — already lives in
+   ``bin/_cctally_alerts.py`` (Phase B). Accessed via shim that
+   resolves through ``sys.modules['cctally']._dispatch_alert_notification``
+   so the eager re-export in bin/cctally propagates the same
+   function object both sides see.
+ - ``cmd_sync_week`` (Phase B sibling), ``cmd_refresh_usage`` /
+   ``_hook_tick_oauth_refresh`` / ``_hook_tick_make_mock_refresh``
+   / ``_get_oauth_usage_config`` / ``OauthUsageConfigError``
+   (Phase C ``_cctally_refresh.py``) — consumed from this sibling
+   via the same bare-name shim or ``c.X`` pattern.
+ - ``open_db``, ``open_cache_db``, ``sync_cache``, ``parse_iso_datetime``,
+   ``now_utc_iso``, ``load_config``, ``get_week_start_name``,
+   ``compute_week_bounds``, ``parse_date_str``,
+   ``_canonicalize_optional_iso``, ``_canonical_5h_window_key``,
+   ``_floor_to_hour``, ``_get_canonical_boundary_for_date``,
+   ``_apply_reset_events_to_weekrefs``, ``_week_ref_has_reset_event``,
+   ``_compute_cost_for_weekref``, ``get_latest_cost_for_week``,
+   ``get_max_milestone_for_week``, ``get_milestone_cost_for_week``,
+   ``insert_percent_milestone``, ``make_week_ref``,
+   ``_calculate_entry_cost``, ``_resolve_primary_model_for_block``,
+   ``_resolve_display_tz_obj``, ``_build_alert_payload_weekly``,
+   ``_build_alert_payload_five_hour``, ``eprint``,
+   ``get_claude_session_entries``, ``_FIVE_HOUR_JITTER_FLOOR_SECONDS``,
+   ``_RESET_PCT_DROP_THRESHOLD`` — boundary helpers, already-extracted
+   subsystems, or constants that belong in bin/cctally. All accessed
+   via the same shim/``c.X`` pattern.
+
+ §5.6 audit on this extraction's monkeypatch surface:
+ - ``cmd_record_usage`` — patched via ``monkeypatch.setitem(ns, …)``
+   by 5 test files (``test_hook_tick_rate_limit.py``,
+   ``test_refresh_usage_inproc.py``, ``test_refresh_usage_cmd.py``,
+   callers via ``ns["cmd_record_usage"](...)``). Re-export in
+   bin/cctally propagates patches; the moved body never reaches
+   for itself.
+ - ``_hook_tick_oauth_refresh`` — patched via
+   ``monkeypatch.setitem(ns, …)`` by ``test_hook_tick_rate_limit.py``.
+   Moved ``cmd_hook_tick`` uses a module-level shim that resolves
+   via ``sys.modules['cctally']`` at call time; the
+   ``globals()["_hook_tick_oauth_refresh"] = …`` mock-injection
+   branch is rewritten to mutate ``sys.modules['cctally']`` so
+   ``--mock-oauth-response`` still propagates.
+ - ``_normalize_percent`` — test reads via
+   ``ns["_normalize_percent"]`` (``test_record_usage_precision.py``).
+   Re-export in bin/cctally propagates the same function object.
+ - ``_derive_week_from_payload`` — test reads via
+   ``ns["_derive_week_from_payload"]``
+   (``test_derive_week_utc_anchor.py``). Re-export in bin/cctally
+   propagates.
+
+ Spec: docs/superpowers/specs/2026-05-13-bin-cctally-split-design.md
+ """
+ from __future__ import annotations
+
+ import argparse
+ import datetime as dt
+ import fcntl
+ import json
+ import math
+ import os
+ import sqlite3
+ import sys
+ import time
+ from dataclasses import dataclass
+ from typing import Any
+
+
+ def _cctally():
+     """Resolve the current ``cctally`` module at call-time (spec §5.5)."""
+     return sys.modules["cctally"]
+
+
+ # Module-level back-ref shims. Each shim resolves
+ # ``sys.modules['cctally'].X`` at CALL TIME (not bind time), so
+ # monkeypatches on cctally's namespace propagate into the moved code
+ # unchanged. Mirrors the precedent established in
+ # ``bin/_cctally_cache.py`` and ``bin/_cctally_db.py``.
+ def eprint(*args, **kwargs):
+     return sys.modules["cctally"].eprint(*args, **kwargs)
+
+
+ def now_utc_iso(*args, **kwargs):
+     return sys.modules["cctally"].now_utc_iso(*args, **kwargs)
+
+
+ def parse_iso_datetime(*args, **kwargs):
+     return sys.modules["cctally"].parse_iso_datetime(*args, **kwargs)
+
+
+ def open_db(*args, **kwargs):
+     return sys.modules["cctally"].open_db(*args, **kwargs)
+
+
+ def open_cache_db(*args, **kwargs):
+     return sys.modules["cctally"].open_cache_db(*args, **kwargs)
+
+
+ def sync_cache(*args, **kwargs):
+     return sys.modules["cctally"].sync_cache(*args, **kwargs)
+
+
+ def load_config(*args, **kwargs):
+     return sys.modules["cctally"].load_config(*args, **kwargs)
+
+
+ def get_week_start_name(*args, **kwargs):
+     return sys.modules["cctally"].get_week_start_name(*args, **kwargs)
+
+
+ def compute_week_bounds(*args, **kwargs):
+     return sys.modules["cctally"].compute_week_bounds(*args, **kwargs)
+
+
+ def parse_date_str(*args, **kwargs):
+     return sys.modules["cctally"].parse_date_str(*args, **kwargs)
+
+
+ def _canonicalize_optional_iso(*args, **kwargs):
+     return sys.modules["cctally"]._canonicalize_optional_iso(*args, **kwargs)
+
+
+ def _canonical_5h_window_key(*args, **kwargs):
+     return sys.modules["cctally"]._canonical_5h_window_key(*args, **kwargs)
+
+
+ def _floor_to_hour(*args, **kwargs):
+     return sys.modules["cctally"]._floor_to_hour(*args, **kwargs)
+
+
+ def _get_canonical_boundary_for_date(*args, **kwargs):
+     return sys.modules["cctally"]._get_canonical_boundary_for_date(*args, **kwargs)
+
+
+ def _apply_reset_events_to_weekrefs(*args, **kwargs):
+     return sys.modules["cctally"]._apply_reset_events_to_weekrefs(*args, **kwargs)
+
+
+ def _week_ref_has_reset_event(*args, **kwargs):
+     return sys.modules["cctally"]._week_ref_has_reset_event(*args, **kwargs)
+
+
+ def _compute_cost_for_weekref(*args, **kwargs):
+     return sys.modules["cctally"]._compute_cost_for_weekref(*args, **kwargs)
+
+
+ def get_latest_cost_for_week(*args, **kwargs):
+     return sys.modules["cctally"].get_latest_cost_for_week(*args, **kwargs)
+
+
+ def get_max_milestone_for_week(*args, **kwargs):
+     return sys.modules["cctally"].get_max_milestone_for_week(*args, **kwargs)
+
+
+ def get_milestone_cost_for_week(*args, **kwargs):
+     return sys.modules["cctally"].get_milestone_cost_for_week(*args, **kwargs)
+
+
+ def insert_percent_milestone(*args, **kwargs):
+     return sys.modules["cctally"].insert_percent_milestone(*args, **kwargs)
+
+
+ def make_week_ref(*args, **kwargs):
+     return sys.modules["cctally"].make_week_ref(*args, **kwargs)
+
+
+ def cmd_sync_week(*args, **kwargs):
+     return sys.modules["cctally"].cmd_sync_week(*args, **kwargs)
+
+
+ def _calculate_entry_cost(*args, **kwargs):
+     return sys.modules["cctally"]._calculate_entry_cost(*args, **kwargs)
+
+
+ def get_claude_session_entries(*args, **kwargs):
+     return sys.modules["cctally"].get_claude_session_entries(*args, **kwargs)
+
+
+ def _resolve_primary_model_for_block(*args, **kwargs):
+     return sys.modules["cctally"]._resolve_primary_model_for_block(*args, **kwargs)
+
+
+ def _resolve_display_tz_obj(*args, **kwargs):
+     return sys.modules["cctally"]._resolve_display_tz_obj(*args, **kwargs)
+
+
+ def _build_alert_payload_weekly(*args, **kwargs):
+     return sys.modules["cctally"]._build_alert_payload_weekly(*args, **kwargs)
+
+
+ def _build_alert_payload_five_hour(*args, **kwargs):
+     return sys.modules["cctally"]._build_alert_payload_five_hour(*args, **kwargs)
+
+
+ def _dispatch_alert_notification(*args, **kwargs):
+     return sys.modules["cctally"]._dispatch_alert_notification(*args, **kwargs)
+
+
+ def _get_alerts_config(*args, **kwargs):
+     return sys.modules["cctally"]._get_alerts_config(*args, **kwargs)
+
+
+ def _warn_alerts_bad_config_once(*args, **kwargs):
+     return sys.modules["cctally"]._warn_alerts_bad_config_once(*args, **kwargs)
+
+
+ def _get_oauth_usage_config(*args, **kwargs):
+     return sys.modules["cctally"]._get_oauth_usage_config(*args, **kwargs)
+
+
+ def _hook_tick_oauth_refresh(*args, **kwargs):
+     """Shim for ``_hook_tick_oauth_refresh``.
+
+     Resolves via ``sys.modules['cctally']`` at call time so
+     ``monkeypatch.setitem(ns, "_hook_tick_oauth_refresh", boom)``
+     propagates. The ``--mock-oauth-response`` flag below rewrites
+     ``sys.modules['cctally']._hook_tick_oauth_refresh`` so this
+     shim picks up the mock on the very next call.
+     """
+     return sys.modules["cctally"]._hook_tick_oauth_refresh(*args, **kwargs)
+
+
+ def _hook_tick_make_mock_refresh(*args, **kwargs):
+     return sys.modules["cctally"]._hook_tick_make_mock_refresh(*args, **kwargs)
+
+
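Note (illustrative sketch, not part of the package): the call-time shim pattern used by the shims above can be shown in isolation. The module name `fake_host` and the function `greet` are hypothetical stand-ins for `cctally` and its helpers. The sketch only demonstrates why a lookup through `sys.modules[...]` inside the function body picks up a later patch, while a bind-time alias keeps pointing at the original object.

```python
import sys
import types

# Hypothetical host module standing in for ``cctally``.
host = types.ModuleType("fake_host")
host.greet = lambda name: f"hello {name}"
sys.modules["fake_host"] = host

# Bind-time alias: captures the function object that exists right now.
bound_greet = sys.modules["fake_host"].greet


def greet(*args, **kwargs):
    """Call-time shim: looks the attribute up on every call."""
    return sys.modules["fake_host"].greet(*args, **kwargs)


# Simulate a test patching the host namespace after import.
host.greet = lambda name: f"patched {name}"

shim_result = greet("x")         # resolves the patched function
alias_result = bound_greet("x")  # still calls the original function
```

The cost of the pattern is one dictionary lookup and one attribute lookup per call, which is presumably acceptable here because it removes the need to patch the sibling module separately in tests.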
+ # Exception classes raised by callees that stay in bin/cctally
+ # (``_AlertsConfigError``) or in another sibling (``OauthUsageConfigError``
+ # in ``bin/_cctally_refresh.py``) are caught here via
+ # ``except sys.modules['cctally'].SomeError`` — Python evaluates the
+ # ``except`` expression at except-time, so each catch resolves to the
+ # live class object that the raiser also reaches. See call sites in
+ # ``maybe_record_milestone``, ``maybe_update_five_hour_block``, and
+ # ``cmd_hook_tick`` for the three rewrites.
+
+
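Note (illustrative sketch, not part of the package): the comment above relies on Python evaluating an `except` expression only when an exception actually reaches that clause. A small self-contained demonstration follows; the module name `fake_host_errors` and both exception classes are hypothetical.

```python
import sys
import types

# Hypothetical host module standing in for ``cctally``.
host = types.ModuleType("fake_host_errors")


class ConfigError(Exception):
    pass


host.ConfigError = ConfigError
sys.modules["fake_host_errors"] = host


def load(raiser):
    try:
        raiser()
    except sys.modules["fake_host_errors"].ConfigError:
        # The expression above is evaluated only when an exception
        # reaches this clause, so it sees the class installed by then.
        return "caught"
    return "ok"


# Replace the class on the host AFTER ``load`` was defined.
class ReplacementError(Exception):
    pass


host.ConfigError = ReplacementError


def raise_replacement():
    raise ReplacementError("bad config")


result = load(raise_replacement)  # matched against the replacement class
clean = load(lambda: None)        # no exception, except clause never evaluated
```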
+ # Constants referenced by the moved bodies. Defined here (rather than
+ # `c.X`-routed) because they're pure literals — no monkeypatch surface
+ # and no dependency on cctally's module instance.
+ _PERCENT_NORMALIZE_DECIMALS = 10
+
+
+ # One-shot guard so a misbehaving caller passing a non-int
+ # fiveHourWindowKey doesn't spam the log on every insert. Set on first
+ # loud-skip in insert_usage_snapshot. Moved into this sibling alongside
+ # insert_usage_snapshot — the `global` statement inside that function
+ # now binds to THIS module's namespace, which is correct (cctally re-
+ # exports the function via the eager-load block, but the `global` write
+ # stays in the sibling's __dict__ and the per-process one-shot semantics
+ # are preserved across both call routes).
+ _logged_window_key_coerce_failure = False
+
+
+ # === BEGIN MOVED REGIONS ===
+ # Path constants (APP_DIR, HOOK_TICK_*) are accessed via the
+ # `c = _cctally()` call-time accessor inside each function that needs
+ # them — so ``monkeypatch.setitem(ns, "APP_DIR", tmp)`` in tests
+ # resolves on every read (no stale module-level binding).
+ #
+ # Constants pulled from cctally at call time:
+ #   c._FIVE_HOUR_JITTER_FLOOR_SECONDS — _lib_five_hour.* re-export
+ #   c._RESET_PCT_DROP_THRESHOLD — bin/cctally module-level constant
+ #   c.HOOK_TICK_LOG_DIR / _PATH / _ROTATED_PATH / _ROTATE_BYTES
+ #   c.HOOK_TICK_THROTTLE_PATH / _LOCK_PATH
+ #   c.HOOK_TICK_DEFAULT_THROTTLE_SECONDS
+ #   c.APP_DIR
+
+
+ def _normalize_percent(value: "float | int | None") -> "float | None":
+     """Flush IEEE 754 ULP noise out of an ingress percent value.
+
+     Single chokepoint applied at every site where a raw percent enters
+     cctally's runtime path (OAuth fetch, hook-tick OAuth refresh, and
+     the cmd_record_usage CLI ingress). Downstream consumers — HWM
+     files, ``weekly_usage_snapshots.{weekly,five_hour}_percent`` REAL
+     columns, ``five_hour_blocks.final_five_hour_percent``, milestone
+     crossing values, and the SSE envelope's ``used_percent`` field —
+     all read the cleaned value, so a single round here stops
+     ``5h=7.000000000000001`` style strings from reaching any log or
+     serialized surface.
+
+     ``None`` is the canonical absent-percent sentinel; preserve it
+     unchanged so the optional-5h branches stay simple.
+     """
+     if value is None:
+         return None
+     return round(float(value), _PERCENT_NORMALIZE_DECIMALS)
+
+
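Note (illustrative sketch, not part of the package): the effect of rounding to 10 decimal places can be checked on the two noisy values quoted in this file. The names `normalize_percent` and `DECIMALS` are local to the sketch and only mirror `_normalize_percent` and `_PERCENT_NORMALIZE_DECIMALS`.

```python
DECIMALS = 10  # mirrors _PERCENT_NORMALIZE_DECIMALS in the diff


def normalize_percent(value):
    """Round an optional percent to DECIMALS places; keep None as-is."""
    if value is None:
        return None
    return round(float(value), DECIMALS)


noisy_high = 7.000000000000001  # slightly above 7.0
noisy_low = 57.99999999999999   # slightly below 58.0

cleaned_high = normalize_percent(noisy_high)
cleaned_low = normalize_percent(noisy_low)
untouched = normalize_percent(12.5)  # ordinary values pass through
absent = normalize_percent(None)     # sentinel is preserved
```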
+ def maybe_record_milestone(
+     saved: dict[str, Any],
+ ) -> None:
+     """Check if a new integer percent threshold was crossed, and if so,
+     fetch cost and record the milestone. Errors are logged, not raised."""
+     weekly_percent = saved.get("weeklyPercent")
+     if weekly_percent is None or weekly_percent < 1:
+         return
+
+     # Snap near-integer values up before flooring: the status-line API returns
+     # N% as 0.N * 100, which in IEEE 754 can land one ULP below N.0 (e.g.
+     # 0.58 * 100 == 57.99999999999999). A bare math.floor() then returns N-1
+     # and the N-threshold milestone is never recorded.
+     current_floor = math.floor(weekly_percent + 1e-9)
+     if current_floor < 1:
+         return
+
+     week_start_date = saved["weekStartDate"]
+     week_end_date = saved["weekEndDate"]
+     week_start_at = saved.get("weekStartAt")
+     week_end_at = saved.get("weekEndAt")
+     usage_snapshot_id = saved["id"]
+     five_hour_percent = saved.get("fiveHourPercent")
+
+     conn = open_db()
+     try:
+         # Resolve the active segment for THIS captured moment. The segment
+         # is the latest week_reset_events row keyed on week_end_at whose
+         # effective_reset_at_utc <= captured_at; 0 = pre-credit / no-event
+         # sentinel. ``unixepoch()`` normalizes the comparison across mixed
+         # +00:00 / Z offsets (see precedent at bin/cctally:_compute_block_totals
+         # cross-reset detection; also project gotcha
+         # ``unixepoch_for_cross_offset_compare``).
+         captured_at_iso = saved.get("capturedAt") or now_utc_iso()
+         reset_event_id = 0
+         if week_end_at:
+             seg_row = conn.execute(
+                 """
+                 SELECT id FROM week_reset_events
+                 WHERE new_week_end_at = ?
+                   AND unixepoch(effective_reset_at_utc) <= unixepoch(?)
+                 ORDER BY id DESC LIMIT 1
+                 """,
+                 (week_end_at, captured_at_iso),
+             ).fetchone()
+             if seg_row is not None:
+                 reset_event_id = int(seg_row["id"])
+
+         max_existing = get_max_milestone_for_week(
+             conn, week_start_date, reset_event_id=reset_event_id,
+         )
+         if max_existing is not None and current_floor <= max_existing:
+             return
+
+         # Threshold crossed — sync cost before recording so the milestone
+         # captures up-to-date cumulative cost, not a stale snapshot.
+         try:
+             sync_ns = argparse.Namespace(
+                 week_start=None,
+                 week_end=None,
+                 week_start_name=None,
+                 mode="auto",
+                 offline=False,
+                 project=None,
+                 json=False,
+                 quiet=True,
+             )
+             cmd_sync_week(sync_ns)
+         except Exception as exc:
+             eprint(f"[milestone] cost sync failed, using latest available: {exc}")
+
+         week_start = dt.date.fromisoformat(week_start_date)
+         week_end = dt.date.fromisoformat(week_end_date)
+         week_ref = make_week_ref(
+             week_start_date=week_start_date,
+             week_end_date=week_end_date,
+             week_start_at=week_start_at,
+             week_end_at=week_end_at,
+         )
+
+         # For reset-affected weeks, the cached weekly_cost_snapshots row
+         # covers the API-derived range (which for a post-reset week
+         # backdates into the old window). Live-compute over the effective
+         # range so the milestone captures cost from the reset moment
+         # forward, not from the phantom backdated start.
+         effective_ref = week_ref
+         adjusted = _apply_reset_events_to_weekrefs(conn, [week_ref])
+         if adjusted:
+             effective_ref = adjusted[0]
+
+         if _week_ref_has_reset_event(conn, effective_ref):
+             live_cost = _compute_cost_for_weekref(effective_ref)
+             if live_cost is None:
+                 eprint("[milestone] could not compute effective-range cost, skipping")
+                 return
+             cumulative_cost = live_cost
+             cost_snapshot_id = 0  # no snapshot row to anchor against
+         else:
+             latest_cost = get_latest_cost_for_week(conn, week_ref)
+             if latest_cost is None:
+                 eprint("[milestone] no cost snapshot yet for this week, skipping")
+                 return
+             cumulative_cost = float(latest_cost["cost_usd"])
+             cost_snapshot_id = int(latest_cost["id"])
+
+         # Determine which thresholds to record
+         start_threshold = (max_existing + 1) if max_existing is not None else current_floor
+
+         # Hoist `_get_alerts_config(load_config())` above the per-pct loop:
+         # in the catch-up case (multi-percent jump on first observation) the
+         # loop iterates N times and the config never changes mid-loop. One
+         # read serves all iterations.
+         # `load_config()` is safe outside the writer lock — atomic-rename
+         # guarantees readers see whole bytes (CLAUDE.md gotcha).
+         # `_ALERTS_BAD_CONFIG_WARNED` (module-level, M3) rate-limits the
+         # warning to once per process; both axis paths share the flag since
+         # the underlying problem is config-wide, not axis-specific.
+         try:
+             alerts_cfg: "dict | None" = _get_alerts_config(load_config())
+         except sys.modules["cctally"]._AlertsConfigError as exc:
+             _warn_alerts_bad_config_once(exc)
+             alerts_cfg = None
+
+         # Collect dispatch jobs across the per-pct loop and fire AFTER the
+         # single commit below. Mirrors the 5h path's pending_alerts pattern
+         # (set-then-dispatch + atomic INSERT/UPDATE, spec §3.2). Without
+         # this, `insert_percent_milestone`'s prior internal commit would
+         # split INSERT and the alerted_at UPDATE across two transactions —
+         # a crash in the gap left `alerted_at` NULL forever, since the
+         # next call's INSERT OR IGNORE returns rowcount==0 and the
+         # `if inserted == 1` dispatch guard skips re-firing.
+         pending_alerts: list[dict[str, Any]] = []
+         for pct in range(start_threshold, current_floor + 1):
+             if pct == start_threshold and max_existing is not None:
+                 prev_cost = get_milestone_cost_for_week(
+                     conn, week_start_date, max_existing,
+                     reset_event_id=reset_event_id,
+                 )
+                 marginal = (cumulative_cost - prev_cost) if prev_cost is not None else None
+             else:
+                 marginal = None
+             inserted = insert_percent_milestone(
+                 conn,
+                 week_start_date=week_start_date,
+                 week_end_date=week_end_date,
+                 week_start_at=week_start_at,
+                 week_end_at=week_end_at,
+                 percent_threshold=pct,
+                 cumulative_cost_usd=cumulative_cost,
+                 marginal_cost_usd=marginal,
+                 usage_snapshot_id=usage_snapshot_id,
+                 cost_snapshot_id=cost_snapshot_id,
+                 five_hour_percent_at_crossing=five_hour_percent,
+                 commit=False,
+                 reset_event_id=reset_event_id,
+             )
+             # ── Threshold-actions dispatch (set-then-dispatch, spec §3.2) ──
+             # Only the genuine-new-crossing winner (rowcount==1) reaches this
+             # path; concurrent record-usage instances that race on the same
+             # (week_start_date, percent_threshold) get rowcount==0 from the
+             # INSERT OR IGNORE and skip dispatch entirely. The
+             # `alerted_at IS NULL` guard on the UPDATE is defense-in-depth:
+             # write-once even if two writers somehow both think they won.
+             if inserted == 1:
+                 if (
+                     alerts_cfg is not None
+                     and alerts_cfg["enabled"]
+                     and pct in alerts_cfg["weekly_thresholds"]
+                 ):
+                     crossed_at = now_utc_iso()
+                     # set-then-dispatch: alerted_at lands on the row BEFORE
+                     # the osascript Popen, so a dismissed-after-spawn
+                     # notification still surfaces in the dashboard alerts
+                     # envelope (T5). UPDATE shares the transaction with
+                     # the preceding INSERT (commit=False above) so a
+                     # crash between them is impossible.
+                     conn.execute(
+                         "UPDATE percent_milestones SET alerted_at = ? "
+                         "WHERE week_start_date = ? AND percent_threshold = ? "
+                         " AND reset_event_id = ? "
+                         " AND alerted_at IS NULL",
+                         (crossed_at, week_start_date, pct, reset_event_id),
+                     )
+                     # Cheap re-read for payload context (cumulative_cost_usd
+                     # reflects the value persisted on insert, immune to any
+                     # subsequent recompute drift). SELECT inside the open
+                     # transaction is fine; values reflect post-INSERT state.
+                     # Filter by reset_event_id so a credited week's
+                     # alert payload reads the post-credit row, not a
+                     # stale pre-credit row at the same (week, threshold).
+                     row = conn.execute(
+                         "SELECT cumulative_cost_usd FROM percent_milestones "
+                         "WHERE week_start_date = ? AND percent_threshold = ? "
+                         " AND reset_event_id = ?",
+                         (week_start_date, pct, reset_event_id),
+                     ).fetchone()
+                     if row is not None:
+                         cum = float(row["cumulative_cost_usd"])
+                         # $/1% rough trend metric: cumulative / threshold.
+                         dpp = (cum / pct) if pct else None
+                         payload = _build_alert_payload_weekly(
+                             threshold=pct,
+                             crossed_at_utc=crossed_at,
+                             week_start_date=week_start_date,
+                             cumulative_cost_usd=cum,
+                             dollars_per_percent=dpp,
+                         )
+                         pending_alerts.append(payload)
+         # Single commit after the loop durably writes every milestone row
+         # AND its alerted_at marker together.
+         conn.commit()
+         # Dispatch deferred to AFTER commit (matches 5h path). Per-payload
+         # exception logged so a bad-payload alert can't suppress healthy ones.
+         # Production caller ignores _dispatch_alert_notification's return
+         # value (spec §6.4).
+         for payload in pending_alerts:
+             try:
+                 _dispatch_alert_notification(payload, mode="real")
+             except Exception as dispatch_exc:
+                 eprint(f"[alerts] dispatch failed: {dispatch_exc}")
+     except Exception as exc:
+         eprint(f"[milestone] error recording milestone: {exc}")
+     finally:
+         conn.close()
+
+
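Note (illustrative sketch, not part of the package): the two core ideas of `maybe_record_milestone`, snapping near-integer percents before flooring and the set-then-dispatch transaction shape, can be reduced to a small in-memory SQLite example. The table `milestones`, its columns, and the helper names `thresholds_to_record` and `record` are assumptions made for this sketch; the real schema and helpers carry more fields, including `reset_event_id`.

```python
import math
import sqlite3


def thresholds_to_record(weekly_percent, max_existing):
    """Integer thresholds newly crossed, snapping near-integers up first."""
    current_floor = math.floor(weekly_percent + 1e-9)
    if current_floor < 1:
        return []
    if max_existing is not None and current_floor <= max_existing:
        return []
    start = (max_existing + 1) if max_existing is not None else current_floor
    return list(range(start, current_floor + 1))


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified table; the real one has more columns.
conn.execute(
    "CREATE TABLE milestones ("
    " week TEXT, pct INTEGER, alerted_at TEXT,"
    " UNIQUE (week, pct))"
)


def record(conn, week, pcts, alert_on, now):
    """Insert rows and stamp alerted_at in one transaction.

    Returns the thresholds whose alerts should be sent by the caller,
    which happens only after the commit has succeeded.
    """
    pending = []
    for pct in pcts:
        cur = conn.execute(
            "INSERT OR IGNORE INTO milestones (week, pct) VALUES (?, ?)",
            (week, pct),
        )
        # rowcount is 1 only for the writer that actually inserted the row.
        if cur.rowcount == 1 and pct in alert_on:
            conn.execute(
                "UPDATE milestones SET alerted_at = ? "
                "WHERE week = ? AND pct = ? AND alerted_at IS NULL",
                (now, week, pct),
            )
            pending.append(pct)
    conn.commit()  # single commit covers every INSERT and UPDATE
    return pending


crossed = thresholds_to_record(57.99999999999999, 55)
first = record(conn, "2026-05-11", crossed, {58}, "t0")
second = record(conn, "2026-05-11", [58], {58}, "t1")  # replay of same tick
```

Because the insert and the `alerted_at` update share one transaction, a replayed tick finds the row already present and queues nothing, which is the behaviour the comments in the diff describe.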
+ def _compute_block_totals(
+     block_start_at: dt.datetime,
+     range_end: dt.datetime,
+     *,
+     skip_sync: bool = False,
+ ) -> dict[str, Any]:
+     """Sum tokens + cost over [block_start_at, range_end] from session_entries,
+     plus per-model and per-project breakdowns in the same walk.
+
+     Used by the live write path (maybe_update_five_hour_block) and the
+     historical backfill (_backfill_five_hour_blocks /
+     _backfill_five_hour_block_models / _backfill_five_hour_block_projects).
+
+     Routes through get_claude_session_entries (rather than the parent
+     get_entries which returns UsageEntry without project_path) — same
+     cache-first / lock-contention / direct-JSONL fallback chain.
+
+     Returns a dict with:
+       input_tokens, output_tokens, cache_create_tokens, cache_read_tokens (int)
+       cost_usd (float)
+       by_model: dict[model_name -> {input_tokens, output_tokens,
+                                     cache_create_tokens, cache_read_tokens,
+                                     cost_usd, entry_count}]
+       by_project: dict[project_path_or_'(unknown)' -> same shape]
+     """
+     totals: dict[str, Any] = {
+         "input_tokens": 0,
+         "output_tokens": 0,
+         "cache_create_tokens": 0,
+         "cache_read_tokens": 0,
+         "cost_usd": 0.0,
+         "by_model": {},
+         "by_project": {},
+     }
+     for entry in get_claude_session_entries(
+         block_start_at, range_end, skip_sync=skip_sync,
+     ):
+         usage = {
+             "input_tokens": entry.input_tokens,
+             "output_tokens": entry.output_tokens,
+             "cache_creation_input_tokens": entry.cache_creation_tokens,
+             "cache_read_input_tokens": entry.cache_read_tokens,
+         }
+         cost = _calculate_entry_cost(
+             entry.model, usage, mode="auto", cost_usd=entry.cost_usd,
+         )
+
+         totals["input_tokens"] += entry.input_tokens
+         totals["output_tokens"] += entry.output_tokens
+         totals["cache_create_tokens"] += entry.cache_creation_tokens
+         totals["cache_read_tokens"] += entry.cache_read_tokens
+         totals["cost_usd"] += cost
+
+         # Bucket by model and by project_path. NULL project_path → sentinel
+         # so reconcile invariant SUM(child.cost) == parent.total holds.
+         # Note: the JSONL-fallback path (_direct_parse_claude_session_entries)
+         # always populates project_path = cwd (never NULL); '(unknown)' only
+         # appears on the cache-backed path during the brief session_files
+         # lazy-backfill window.
+         for key, bucket_dict in (
+             (entry.model, totals["by_model"]),
+             (entry.project_path or "(unknown)", totals["by_project"]),
+         ):
+             b = bucket_dict.setdefault(
+                 key,
+                 {
+                     "input_tokens": 0,
+                     "output_tokens": 0,
+                     "cache_create_tokens": 0,
+                     "cache_read_tokens": 0,
+                     "cost_usd": 0.0,
+                     "entry_count": 0,
+                 },
+             )
+             b["input_tokens"] += entry.input_tokens
+             b["output_tokens"] += entry.output_tokens
+             b["cache_create_tokens"] += entry.cache_creation_tokens
+             b["cache_read_tokens"] += entry.cache_read_tokens
+             b["cost_usd"] += cost
+             b["entry_count"] += 1
+     return totals
+
+
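Note (illustrative sketch, not part of the package): the single-walk bucketing in `_compute_block_totals`, including the `'(unknown)'` sentinel that keeps the child sums equal to the parent total, can be reproduced with a reduced entry type. The `Entry` dataclass, the `summarize` helper, and the sample values are hypothetical; the real entries carry four token counters and a computed cost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:  # hypothetical stand-in for a session entry
    model: str
    project_path: Optional[str]
    tokens: int
    cost: float


def summarize(entries):
    """Accumulate parent totals and both child breakdowns in one pass."""
    totals = {"tokens": 0, "cost_usd": 0.0, "by_model": {}, "by_project": {}}
    for e in entries:
        totals["tokens"] += e.tokens
        totals["cost_usd"] += e.cost
        for key, buckets in (
            (e.model, totals["by_model"]),
            # A missing project path maps to a sentinel bucket so that no
            # entry is dropped from the per-project breakdown.
            (e.project_path or "(unknown)", totals["by_project"]),
        ):
            b = buckets.setdefault(
                key, {"tokens": 0, "cost_usd": 0.0, "entry_count": 0}
            )
            b["tokens"] += e.tokens
            b["cost_usd"] += e.cost
            b["entry_count"] += 1
    return totals


totals = summarize([
    Entry("model-a", "/repo/x", 100, 0.25),
    Entry("model-b", None, 50, 0.5),
    Entry("model-a", "/repo/x", 10, 0.25),
])
model_cost = sum(b["cost_usd"] for b in totals["by_model"].values())
project_cost = sum(b["cost_usd"] for b in totals["by_project"].values())
```

The sample costs are chosen to be exactly representable in binary floating point, so the equality between parent and child sums holds exactly here; with arbitrary costs a tolerance-based comparison would be the safer check.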
+ def maybe_update_five_hour_block(saved: dict[str, Any]) -> None:
+     """Upsert the current 5h block in five_hour_blocks; close strictly
+     older open blocks; sweep naturally-expired blocks; flag blocks
+     spanning a recorded mid-week 7d-reset.
+
+     Errors are logged and swallowed — record-usage must not regress
+     because of this helper, same posture as maybe_record_milestone.
+     """
+     five_hour_percent = saved.get("fiveHourPercent")
+     five_hour_resets_at = saved.get("fiveHourResetsAt")
+     five_hour_window_key = saved.get("fiveHourWindowKey")
+     if (
+         five_hour_percent is None
+         or five_hour_resets_at is None
+         or five_hour_window_key is None
+     ):
+         return  # no canonical 5h anchor — nothing to record
+
+     captured_at = saved["capturedAt"]
+     weekly_percent = saved.get("weeklyPercent")
+     snapshot_id = saved["id"]
+
+     # Note: this is the 4th open_db() invocation per record-usage call
+     # (after cmd_record_usage's prior-state read, insert_usage_snapshot,
+     # and maybe_record_milestone). Each open re-runs the inline schema
+     # migrations and the empty-table check that gates _backfill_five_hour_blocks.
+     # The backfill itself only runs once per process (the gate fires only when
+     # five_hour_blocks is empty), so the cost is benign — but the count is
+     # surprising. If any future helper grows expensive open_db() side effects,
+     # consolidate by passing the connection through rather than reopening.
+     conn = open_db()
+     try:
+         # Step 3 (per spec §3.2): read prior state including immutable
+         # fields we'll re-use. Re-deriving block_start_at from saved.
+         # fiveHourResetsAt would reintroduce the seconds-level Anthropic
+         # ISO jitter that five_hour_window_key was designed to collapse.
+         prior = conn.execute(
+             """
+             SELECT id AS prior_block_id,
+                    block_start_at AS block_start_at
+             FROM five_hour_blocks
+             WHERE five_hour_window_key = ?
+             """,
+             (int(five_hour_window_key),),
+         ).fetchone()
+
+         if prior is None:
+             # First observation of this window. Compute block_start_at
+             # from the canonical resets timestamp.
+             try:
+                 resets_dt = parse_iso_datetime(
+                     five_hour_resets_at, "five_hour_resets_at",
+                 )
+             except ValueError as exc:
+                 eprint(f"[5h-block] bad resets_at, skipping: {exc}")
+                 return
+             block_start_dt = resets_dt - dt.timedelta(hours=5)
+             block_start_at = block_start_dt.isoformat(timespec="seconds")
+         else:
+             block_start_at = prior["block_start_at"]
+             block_start_dt = parse_iso_datetime(
+                 block_start_at, "five_hour_blocks.block_start_at",
+             )
+
+         # Step 6 (totals) — done outside the transaction so the
+         # cache.db read doesn't hold the stats.db write lock open.
+         captured_at_dt = parse_iso_datetime(captured_at, "capturedAt")
+         totals = _compute_block_totals(block_start_dt, captured_at_dt)
+
+         # Hoist alerts config above BEGIN (M1 + M2): single read serves
+         # all per-pct iterations in the catch-up case, AND keeps the
+         # filesystem read out of the transaction window so the stats.db
+         # write lock isn't held across config.json I/O.
+         # `load_config()` is safe outside the writer lock — atomic-rename
+         # guarantees readers see whole bytes (CLAUDE.md gotcha).
+         # `_ALERTS_BAD_CONFIG_WARNED` (module-level, M3) rate-limits the
+         # warning to once per process; both axis paths share the flag since
+         # the underlying problem is config-wide, not axis-specific.
762
+ cfg_for_alerts = load_config()
763
+ try:
764
+ alerts_cfg: "dict | None" = _get_alerts_config(cfg_for_alerts)
765
+ except sys.modules["cctally"]._AlertsConfigError as exc:
766
+ _warn_alerts_bad_config_once(exc)
767
+ alerts_cfg = None
768
+ # Resolve display.tz once (shares the cfg load above). Threaded
769
+ # into _dispatch_alert_notification so the macOS notification
770
+ # subtitle (block-start time) matches the dashboard / TUI render
771
+ # rather than falling back to host-local via tz=None.
772
+ display_tz_for_alerts = _resolve_display_tz_obj(cfg_for_alerts)
773
+
774
+ # Collect dispatch jobs while inside BEGIN (set-then-dispatch:
775
+ # alerted_at UPDATE stays inside the transaction per spec §3.2)
776
+ # but DEFER `_dispatch_alert_notification` until AFTER the outer
777
+ # commit (I1: prevents the inner Popen-time conn.commit() from
778
+ # ending the surrounding BEGIN mid-sequence and breaking the
779
+ # close-older + upsert + cross-flag atomicity envelope).
780
+ pending_alerts: list[dict[str, Any]] = []
781
+
782
+ # Steps 4-5 + 7: transaction wraps close-older + upsert so a
783
+ # mid-sequence failure doesn't leave the prior block closed
784
+ # without the current block opened/updated.
785
+ now_iso = now_utc_iso()
786
+ conn.execute("BEGIN")
787
+ try:
788
+ # Step 5: close any STRICTLY OLDER open block. `<` not `!=`
789
+ # — record-usage runs in parallel via background hook-tick &
790
+ # detach + status-line ticks; an older invocation completing
791
+ # after a newer one would close the now-current block under
792
+ # `!=`. With `<`, an older invocation only closes still-older
793
+ # blocks. window_key is a 10-min-floored monotonic epoch.
794
+ conn.execute(
795
+ """
796
+ UPDATE five_hour_blocks
797
+ SET is_closed = 1, last_updated_at_utc = ?
798
+ WHERE is_closed = 0
799
+ AND five_hour_window_key < ?
800
+ """,
801
+ (now_iso, int(five_hour_window_key)),
802
+ )
803
+
804
+ # Step 5b: natural-expiration sweep. The close-older predicate
805
+ # above only fires when a strictly-newer window arrives. A user
806
+ # who lets a block expire without a successor (idle / shut down
807
+ # past the 5h reset) would otherwise leave the row at
808
+ # is_closed = 0 forever. Idempotent (only flips 0 → 1); safe to
809
+ # re-run every tick. ISO-string compare is monotonic so it
810
+ # works directly on five_hour_resets_at.
811
+ conn.execute(
812
+ """
813
+ UPDATE five_hour_blocks
814
+ SET is_closed = 1, last_updated_at_utc = ?
815
+ WHERE is_closed = 0
816
+ AND five_hour_resets_at < ?
817
+ """,
818
+ (now_iso, now_iso),
819
+ )
820
+
821
+ # Step 7: atomic upsert. Single statement collapses the
822
+ # insert-vs-update branches and is race-safe: when two
823
+ # record-usage invocations both observe `prior is None`
824
+ # for a brand-new window (the prior-state SELECT above happens
825
+ # before BEGIN), the loser's INSERT lands as DO UPDATE
826
+ # rather than raising IntegrityError on the
827
+ # UNIQUE(five_hour_window_key) constraint and dropping the
828
+ # tick. Immutable columns (block_start_at,
829
+ # first_observed_at_utc, five_hour_resets_at,
830
+ # seven_day_pct_at_block_start, created_at_utc) are
831
+ # deliberately omitted from DO UPDATE — first writer
832
+ # owns them.
833
+ conn.execute(
834
+ """
835
+ INSERT INTO five_hour_blocks (
836
+ five_hour_window_key,
837
+ five_hour_resets_at,
838
+ block_start_at,
839
+ first_observed_at_utc,
840
+ last_observed_at_utc,
841
+ final_five_hour_percent,
842
+ seven_day_pct_at_block_start,
843
+ seven_day_pct_at_block_end,
844
+ crossed_seven_day_reset,
845
+ total_input_tokens,
846
+ total_output_tokens,
847
+ total_cache_create_tokens,
848
+ total_cache_read_tokens,
849
+ total_cost_usd,
850
+ is_closed,
851
+ created_at_utc,
852
+ last_updated_at_utc
853
+ )
854
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, 0, ?, ?, ?, ?, ?, 0, ?, ?)
855
+ ON CONFLICT(five_hour_window_key) DO UPDATE SET
856
+ last_observed_at_utc = excluded.last_observed_at_utc,
857
+ final_five_hour_percent = excluded.final_five_hour_percent,
858
+ seven_day_pct_at_block_end = excluded.seven_day_pct_at_block_end,
859
+ total_input_tokens = excluded.total_input_tokens,
860
+ total_output_tokens = excluded.total_output_tokens,
861
+ total_cache_create_tokens = excluded.total_cache_create_tokens,
862
+ total_cache_read_tokens = excluded.total_cache_read_tokens,
863
+ total_cost_usd = excluded.total_cost_usd,
864
+ last_updated_at_utc = excluded.last_updated_at_utc
865
+ """,
866
+ (
867
+ int(five_hour_window_key),
868
+ str(five_hour_resets_at),
869
+ block_start_at,
870
+ captured_at,
871
+ captured_at,
872
+ float(five_hour_percent),
873
+ weekly_percent,
874
+ weekly_percent,
875
+ totals["input_tokens"],
876
+ totals["output_tokens"],
877
+ totals["cache_create_tokens"],
878
+ totals["cache_read_tokens"],
879
+ totals["cost_usd"],
880
+ now_iso,
881
+ now_iso,
882
+ ),
883
+ )
884
+
885
+ # ── Resolve current block_id once for reuse by the per-(block, model)
886
+ # / per-(block, project) child writes below AND the existing milestone
887
+ # detection (which previously ran its own SELECT and now reuses
888
+ # this variable).
889
+ block_id_row = conn.execute(
890
+ "SELECT id FROM five_hour_blocks WHERE five_hour_window_key = ?",
891
+ (int(five_hour_window_key),),
892
+ ).fetchone()
893
+ block_id = int(block_id_row["id"])
894
+
895
+ # ── Replace-all per-tick: per-(block, model) and per-(block, project_path)
896
+ # rollup-children. DELETE keyed on five_hour_window_key (NOT block_id) so
897
+ # orphan child rows from a prior parent rebuild are cleaned up automatically.
898
+ # Same transaction as the parent upsert; if these raise, the whole tick
899
+ # rolls back and the next tick recomputes from scratch.
900
+ conn.execute(
901
+ "DELETE FROM five_hour_block_models WHERE five_hour_window_key = ?",
902
+ (int(five_hour_window_key),),
903
+ )
904
+ if totals.get("by_model"):
905
+ conn.executemany(
906
+ """
907
+ INSERT INTO five_hour_block_models (
908
+ block_id, five_hour_window_key, model,
909
+ input_tokens, output_tokens,
910
+ cache_create_tokens, cache_read_tokens,
911
+ cost_usd, entry_count
912
+ )
913
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
914
+ """,
915
+ [
916
+ (
917
+ block_id,
918
+ int(five_hour_window_key),
919
+ model,
920
+ b["input_tokens"],
921
+ b["output_tokens"],
922
+ b["cache_create_tokens"],
923
+ b["cache_read_tokens"],
924
+ b["cost_usd"],
925
+ b["entry_count"],
926
+ )
927
+ for model, b in totals["by_model"].items()
928
+ ],
929
+ )
930
+
931
+ conn.execute(
932
+ "DELETE FROM five_hour_block_projects WHERE five_hour_window_key = ?",
933
+ (int(five_hour_window_key),),
934
+ )
935
+ if totals.get("by_project"):
936
+ conn.executemany(
937
+ """
938
+ INSERT INTO five_hour_block_projects (
939
+ block_id, five_hour_window_key, project_path,
940
+ input_tokens, output_tokens,
941
+ cache_create_tokens, cache_read_tokens,
942
+ cost_usd, entry_count
943
+ )
944
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
945
+ """,
946
+ [
947
+ (
948
+ block_id,
949
+ int(five_hour_window_key),
950
+ project_path,
951
+ b["input_tokens"],
952
+ b["output_tokens"],
953
+ b["cache_create_tokens"],
954
+ b["cache_read_tokens"],
955
+ b["cost_usd"],
956
+ b["entry_count"],
957
+ )
958
+ for project_path, b in totals["by_project"].items()
959
+ ],
960
+ )
961
+
962
+ # ── 5h-% milestone detection (mirrors maybe_record_milestone) ──
963
+ # Snap-up-by-1e-9 per the gotcha: 0.29 * 100 == 28.999999999999996
964
+ # in IEEE-754, so bare math.floor would miss the 29 threshold.
965
+ current_floor = math.floor(float(five_hour_percent) + 1e-9)
966
+ if current_floor >= 1:
967
+ # Use max(percent_threshold) directly (not prior block's
968
+ # final_pct) so first-observation already-mid-stream doesn't
969
+ # synthesize crossings 1..(current_floor - 1) we never had
970
+ # authentic moment-of-detection data for. Same shape as
971
+ # maybe_record_milestone's max_existing path.
972
+ row = conn.execute(
973
+ "SELECT MAX(percent_threshold) AS m FROM five_hour_milestones "
974
+ "WHERE five_hour_window_key = ?",
975
+ (int(five_hour_window_key),),
976
+ ).fetchone()
977
+ max_existing = row["m"] if row and row["m"] is not None else None
978
+
979
+ if max_existing is None:
980
+ start_threshold = current_floor # first observation: only current floor
981
+ else:
982
+ start_threshold = int(max_existing) + 1
983
+
984
+ if start_threshold <= current_floor:
985
+ # block_id was resolved above (before the children writes) and
986
+ # is still in scope here.
987
+
988
+ # Marginal-cost lookup for the start_threshold milestone
989
+ # (only when there's a prior milestone in this block).
990
+ prior_cost: float | None = None
991
+ if max_existing is not None:
992
+ prev_row = conn.execute(
993
+ "SELECT block_cost_usd FROM five_hour_milestones "
994
+ "WHERE five_hour_window_key = ? AND percent_threshold = ?",
995
+ (int(five_hour_window_key), int(max_existing)),
996
+ ).fetchone()
997
+ if prev_row is not None:
998
+ prior_cost = float(prev_row["block_cost_usd"])
999
+
1000
+ for pct in range(start_threshold, current_floor + 1):
1001
+ if pct == start_threshold and prior_cost is not None:
1002
+ marginal: float | None = totals["cost_usd"] - prior_cost
1003
+ else:
1004
+ marginal = None
1005
+ cur = conn.execute(
1006
+ """
1007
+ INSERT OR IGNORE INTO five_hour_milestones (
1008
+ block_id,
1009
+ five_hour_window_key,
1010
+ percent_threshold,
1011
+ captured_at_utc,
1012
+ usage_snapshot_id,
1013
+ block_input_tokens,
1014
+ block_output_tokens,
1015
+ block_cache_create_tokens,
1016
+ block_cache_read_tokens,
1017
+ block_cost_usd,
1018
+ marginal_cost_usd,
1019
+ seven_day_pct_at_crossing
1020
+ )
1021
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
1022
+ """,
1023
+ (
1024
+ block_id,
1025
+ int(five_hour_window_key),
1026
+ int(pct),
1027
+ captured_at,
1028
+ int(snapshot_id),
1029
+ totals["input_tokens"],
1030
+ totals["output_tokens"],
1031
+ totals["cache_create_tokens"],
1032
+ totals["cache_read_tokens"],
1033
+ totals["cost_usd"],
1034
+ marginal,
1035
+ weekly_percent,
1036
+ ),
1037
+ )
1038
+ # ── Threshold-actions dispatch (set-then-dispatch, spec §3.2) ──
1039
+ # Only the genuine-new-crossing winner (rowcount==1)
1040
+ # reaches dispatch. Concurrent record-usage instances
1041
+ # that race on the same (five_hour_window_key,
1042
+ # percent_threshold) get rowcount==0 from the
1043
+ # INSERT OR IGNORE and skip dispatch entirely.
1044
+ # `alerted_at IS NULL` on the UPDATE preserves
1045
+ # write-once even if two writers somehow both think
1046
+ # they won.
1047
+ #
1048
+ # I1: alerted_at UPDATE stays inside BEGIN (set-then-
1049
+ # dispatch invariant per spec §3.2 — the row carries
1050
+ # alerted_at BEFORE any externally-observable side
1051
+ # effect). The single outer commit at the bottom of
1052
+ # this BEGIN durably writes the milestone row AND the
1053
+ # alerted_at update together. Dispatch itself is
1054
+ # collected into pending_alerts and fired AFTER the
1055
+ # outer commit so the inner Popen-time bookkeeping
1056
+ # never ends the surrounding BEGIN mid-sequence.
1057
+ if (
1058
+ cur.rowcount == 1
1059
+ and alerts_cfg is not None
1060
+ and alerts_cfg["enabled"]
1061
+ and pct in alerts_cfg["five_hour_thresholds"]
1062
+ ):
1063
+ crossed_at = now_utc_iso()
1064
+ conn.execute(
1065
+ "UPDATE five_hour_milestones SET alerted_at = ? "
1066
+ "WHERE five_hour_window_key = ? AND percent_threshold = ? "
1067
+ "AND alerted_at IS NULL",
1068
+ (crossed_at, int(five_hour_window_key), int(pct)),
1069
+ )
1070
+ # Cheap re-reads inside BEGIN are SELECT-only and
1071
+ # safe; values reflect post-INSERT state. We
1072
+ # build the payload now (while block_id / totals
1073
+ # are in scope) and defer ONLY the Popen-side
1074
+ # _dispatch_alert_notification to after the outer
1075
+ # commit.
1076
+ cost_row = conn.execute(
1077
+ "SELECT block_cost_usd FROM five_hour_milestones "
1078
+ "WHERE five_hour_window_key = ? AND percent_threshold = ?",
1079
+ (int(five_hour_window_key), int(pct)),
1080
+ ).fetchone()
1081
+ block_row = conn.execute(
1082
+ "SELECT block_start_at FROM five_hour_blocks "
1083
+ "WHERE five_hour_window_key = ?",
1084
+ (int(five_hour_window_key),),
1085
+ ).fetchone()
1086
+ primary_model = _resolve_primary_model_for_block(
1087
+ conn, int(five_hour_window_key)
1088
+ )
1089
+ payload = _build_alert_payload_five_hour(
1090
+ threshold=int(pct),
1091
+ crossed_at_utc=crossed_at,
1092
+ five_hour_window_key=int(five_hour_window_key),
1093
+ block_start_at=(
1094
+ block_row["block_start_at"] if block_row else ""
1095
+ ),
1096
+ block_cost_usd=(
1097
+ float(cost_row["block_cost_usd"])
1098
+ if cost_row
1099
+ else 0.0
1100
+ ),
1101
+ primary_model=primary_model,
1102
+ )
1103
+ pending_alerts.append(payload)
1104
+
1105
+ # ── Reset-crossing cross-flag (opportunistic, JOIN-based) ──
1106
+ # Self-healing sweep: every tick, flag any open block whose
1107
+ # [block_start_at, last_observed_at_utc] interval crosses a
1108
+ # weekly reset, from either of two sources:
1109
+ # (a) week_reset_events — Anthropic-shifted MID-week resets
1110
+ # (prior week_end_at was still in the future at detect
1111
+ # time; see cmd_record_usage's reset-event detection).
1112
+ # (b) weekly_usage_snapshots.week_start_at — NATURAL weekly
1113
+ # boundaries. These never get a week_reset_events row
1114
+ # (mid-week detection requires the prior end to be in
1115
+ # the future), so source (a) silently misses blocks
1116
+ # that span a routine week reset. Without this clause
1117
+ # the dashboard's "Δ pp this block" delta is computed
1118
+ # against the pre-reset 7d% (~94%) versus post-reset
1119
+ # (~0%) and renders as a misleading −94pp drop.
1120
+ # Predicate (b) uses strict ``>`` on the lower bound so a
1121
+ # block that starts EXACTLY at the boundary (post-reset) is
1122
+ # not flagged. Symmetric with the historical-backfill
1123
+ # predicate (§4.2 step 5). Idempotent (only flips 0 → 1).
1124
+ #
1125
+ # Comparisons go through ``unixepoch()`` rather than a raw
1126
+ # lex BETWEEN: ``parse_iso_datetime`` returns host-local
1127
+ # tz-aware datetimes (it ends with ``return parsed.astimezone()``),
1128
+ # so ``block_start_at`` is stored with the host's display
1129
+ # offset (e.g. ``+03:00``) while ``week_start_at`` is
1130
+ # ``+00:00`` and ``last_observed_at_utc`` is ``Z``. A lex
1131
+ # compare across mixed offsets silently mis-orders moments
1132
+ # for non-UTC hosts; ``unixepoch()`` normalizes all three
1133
+ # to seconds-since-epoch and is correct regardless of
1134
+ # offset suffix.
1135
+ #
1136
+ # Why the JOIN rather than a per-tick param: an earlier
1137
+ # design passed mid_week_reset_at only on the tick that
1138
+ # cmd_record_usage's INSERT OR IGNORE actually inserted
1139
+ # the event row. If the helper raised after the event
1140
+ # commit but before the flag UPDATE, the next tick's
1141
+ # INSERT OR IGNORE was a duplicate and the flag stayed 0
1142
+ # forever. The JOIN re-derives from durable state on
1143
+ # every tick and self-heals.
1144
+ conn.execute(
1145
+ """
1146
+ UPDATE five_hour_blocks
1147
+ SET crossed_seven_day_reset = 1
1148
+ WHERE crossed_seven_day_reset = 0
1149
+ AND (
1150
+ EXISTS (
1151
+ SELECT 1 FROM week_reset_events e
1152
+ WHERE unixepoch(e.effective_reset_at_utc)
1153
+ BETWEEN unixepoch(five_hour_blocks.block_start_at)
1154
+ AND unixepoch(five_hour_blocks.last_observed_at_utc)
1155
+ )
1156
+ OR EXISTS (
1157
+ SELECT 1 FROM weekly_usage_snapshots ws
1158
+ WHERE ws.week_start_at IS NOT NULL
1159
+ AND unixepoch(ws.week_start_at)
1160
+ > unixepoch(five_hour_blocks.block_start_at)
1161
+ AND unixepoch(ws.week_start_at)
1162
+ <= unixepoch(five_hour_blocks.last_observed_at_utc)
1163
+ )
1164
+ )
1165
+ """,
1166
+ )
1167
+
1168
+ conn.commit()
1169
+ except Exception:
1170
+ conn.rollback()
1171
+ raise
1172
+
1173
+ # I1: dispatch deferred to AFTER the outer commit. The milestone
1174
+ # row + alerted_at update + close-older + parent upsert + child
1175
+ # rebuilds + cross-flag sweep are all durably written together
1176
+ # before any externally-observable osascript Popen fires. If the
1177
+ # inner BEGIN rolled back above, `pending_alerts` is unreachable
1178
+ # (the `raise` above bubbles out via the outer try). Production
1179
+ # caller ignores _dispatch_alert_notification's return value
1180
+ # (spec §6.4); a per-payload exception is logged and the loop
1181
+ # continues so a bad-payload alert can't suppress healthy ones.
1182
+ for payload in pending_alerts:
1183
+ try:
1184
+ _dispatch_alert_notification(
1185
+ payload, mode="real", tz=display_tz_for_alerts
1186
+ )
1187
+ except Exception as dispatch_exc:
1188
+ eprint(f"[alerts] dispatch failed: {dispatch_exc}")
1189
+ except Exception as exc:
1190
+ eprint(f"[5h-block] error updating block: {exc}")
1191
+ finally:
1192
+ conn.close()
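The Step 7 comment in maybe_update_five_hour_block describes a single-statement upsert in which the first writer owns the immutable columns. The following is a minimal sketch of that pattern, not the module's actual schema: the `blocks` table, its column names, and the `record_tick` helper are hypothetical stand-ins chosen for illustration.

```python
import sqlite3

# Hypothetical, trimmed-down table: just enough columns to show the pattern.
SCHEMA = """
CREATE TABLE blocks (
    window_key     INTEGER NOT NULL UNIQUE,
    block_start_at TEXT NOT NULL,   -- immutable: first writer owns it
    final_percent  REAL NOT NULL,   -- mutable: refreshed on every tick
    last_observed  TEXT NOT NULL    -- mutable: refreshed on every tick
)
"""

# Immutable columns are deliberately left out of DO UPDATE, so a second
# writer for the same window_key cannot overwrite them.
UPSERT = """
INSERT INTO blocks (window_key, block_start_at, final_percent, last_observed)
VALUES (?, ?, ?, ?)
ON CONFLICT(window_key) DO UPDATE SET
    final_percent = excluded.final_percent,
    last_observed = excluded.last_observed
"""


def record_tick(conn, window_key, block_start_at, percent, observed_at):
    """Insert the block on first sight, otherwise refresh mutable columns."""
    conn.execute(UPSERT, (window_key, block_start_at, percent, observed_at))


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Two ticks for the same window; the second carries a jittered start time.
record_tick(conn, 1, "2025-04-30T07:10:00+00:00", 12.0, "2025-04-30T07:11:00+00:00")
record_tick(conn, 1, "2025-04-30T07:10:07+00:00", 13.5, "2025-04-30T07:12:00+00:00")
conn.commit()

# The row keeps the first block_start_at but carries the second percent.
row = conn.execute("SELECT block_start_at, final_percent FROM blocks").fetchone()
```

Because the conflict is resolved inside one statement, the second writer never sees an IntegrityError, which is the property the Step 7 comment relies on when two invocations both observe `prior is None`.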
1193
+
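The 5h milestone detection above can be condensed into a small pure function. This is a sketch under stated assumptions: `new_thresholds` is a hypothetical name that does not appear in the module, and it only reproduces the threshold arithmetic (epsilon snap-up, first-observation rule, catch-up range), not the database writes.

```python
import math


def new_thresholds(percent, max_existing):
    """Return the percent thresholds that still need a milestone row.

    percent:      current 5h percent as a float
    max_existing: highest threshold already recorded for this window,
                  or None when the window has no milestones yet
    """
    # Snap up by 1e-9 so values such as 28.999999999999996 count as 29.
    current_floor = math.floor(percent + 1e-9)
    if current_floor < 1:
        return []
    if max_existing is None:
        # First observation: record only the current floor, never the
        # crossings below it that were not actually observed.
        start = current_floor
    else:
        start = max_existing + 1
    return list(range(start, current_floor + 1))
```

For example, `new_thresholds(7.3, None)` yields `[7]`, while `new_thresholds(7.3, 4)` yields the catch-up range `[5, 6, 7]`.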
1194
+
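The cross-flag comment above explains why comparisons go through ``unixepoch()`` instead of a lexicographic string compare. The same effect can be reproduced in plain Python; the two timestamps below are made-up illustrative values, and `to_epoch` is a hypothetical helper standing in for what ``unixepoch()`` does on the SQL side.

```python
from datetime import datetime

# A block start stored with a +03:00 host offset, and a week boundary
# stored in UTC. In absolute time the block starts at 23:30 UTC on
# April 30, i.e. 30 minutes BEFORE the week boundary.
block_start = "2025-05-01T02:30:00+03:00"
week_start = "2025-05-01T00:00:00+00:00"


def to_epoch(iso):
    """Normalize an offset-bearing ISO-8601 string to epoch seconds."""
    return datetime.fromisoformat(iso).timestamp()


# The string compare looks only at characters, so "02:30" sorts after
# "00:00" and the block appears to start after the boundary.
lex_after = block_start > week_start

# The epoch compare accounts for the offsets and gets the order right.
real_after = to_epoch(block_start) > to_epoch(week_start)
```

Here `lex_after` is `True` while `real_after` is `False`, so a lexicographic predicate would fail to flag a block that genuinely spans the weekly reset.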
1195
+ def cmd_record_usage(args: argparse.Namespace) -> int:
1196
+ """Record usage data from Claude Code status line rate_limits."""
1197
+ c = _cctally()
1198
+ config = load_config()
1199
+ week_start_name = get_week_start_name(config, getattr(args, "week_start_name", None))
1200
+
1201
+ # ULP-noise sanitization is applied at the cmd_record_usage ingress
1202
+ # boundary so every downstream consumer (HWM files, DB rows,
1203
+ # five_hour_blocks rollup, milestones) reads a stable value. See
1204
+ # `_normalize_percent` for the rationale.
1205
+ weekly_percent = _normalize_percent(args.percent)
1206
+ resets_at = int(args.resets_at)
1207
+
1208
+ five_hour_percent: float | None = None
1209
+ five_hour_resets_at_str: str | None = None
1210
+ five_hour_window_key: int | None = None
1211
+ five_hour_resets_at_epoch: int | None = None
1212
+ if args.five_hour_percent is not None:
1213
+ five_hour_percent = _normalize_percent(args.five_hour_percent)
1214
+ if args.five_hour_resets_at is not None:
1215
+ five_hour_resets_at_epoch = int(args.five_hour_resets_at)
1216
+ five_hour_resets_at_str = dt.datetime.fromtimestamp(
1217
+ five_hour_resets_at_epoch, tz=dt.timezone.utc
1218
+ ).isoformat(timespec="seconds")
1219
+ # five_hour_window_key derivation is deferred until after open_db()
1220
+ # so we can pass the most-recent stored sample as the prior anchor.
1221
+ # See _canonical_5h_window_key docstring (spec invariant #3:
1222
+ # boundary-straddling jitter must collapse to the first-seen key).
1223
+
1224
+ # Derive week boundaries from resets_at (exact UTC epoch)
1225
+ week_end_at_dt = dt.datetime.fromtimestamp(resets_at, tz=dt.timezone.utc)
1226
+ week_start_at_dt = week_end_at_dt - dt.timedelta(days=7)
1227
+ week_start_date = week_start_at_dt.date().isoformat()
1228
+ week_end_date = week_end_at_dt.date().isoformat()
1229
+ week_start_at = week_start_at_dt.isoformat(timespec="seconds")
1230
+ week_end_at = week_end_at_dt.isoformat(timespec="seconds")
1231
+
1232
+ # Deduplication: skip if nothing changed since last snapshot
1233
+ should_insert = True
1234
+ conn = open_db()
1235
+ try:
1236
+ # Resolve the canonical 5h window key. Pass the most-recent stored
1237
+ # sample as the prior anchor so seconds-level jitter that straddles
1238
+ # a 600-second floor-bucket boundary (e.g. resets_at=1746014999 vs.
1239
+ # 1746015000) collapses to the first-seen key instead of forking
1240
+ # a new one. Without this, both the DB clamp below and the hwm-5h
1241
+ # file write further down would treat the same physical window as
1242
+ # distinct, regressing the monotonic 5h percent (spec invariant #3).
1243
+ if five_hour_resets_at_epoch is not None:
1244
+ prior_5h_epoch: int | None = None
1245
+ prior_5h_key: int | None = None
1246
+ # Tier 1: blocks-table lookup (steady state). Find the closest
1247
+ # canonical block whose five_hour_resets_at is within ±1800s of
1248
+ # the new resets_at. The blocks table has one canonical row per
1249
+ # physical window after the merge_5h_block_duplicates_v1
1250
+ # migration, so this is more reliable than scanning
1251
+ # weekly_usage_snapshots for an "anchor" row — snapshots can be
1252
+ # noisy when the status line returns out-of-order rate-limit
1253
+ # data from older windows (the F4 incident: snap N+1 carrying a
1254
+ # window-A boundary-jitter resets_at, but snap N reported an
1255
+ # OLDER window B; pre-fix the prior-anchor lookup picked B as
1256
+ # the anchor and the |epoch-prior| > 600 check then forked a
1257
+ # new key for what was actually still window A). 1800s is wide
1258
+ # enough to absorb known jitter, narrow enough that consecutive
1259
+ # 5h blocks (>4h apart in resets_at) cannot collide.
1260
+ try:
1261
+ prior_block_row = conn.execute(
1262
+ """
1263
+ SELECT five_hour_window_key, five_hour_resets_at
1264
+ FROM five_hour_blocks
1265
+ WHERE abs(? - CAST(strftime('%s', five_hour_resets_at) AS INTEGER)) <= ?
1266
+ ORDER BY abs(? - CAST(strftime('%s', five_hour_resets_at) AS INTEGER)) ASC
1267
+ LIMIT 1
1268
+ """,
1269
+ (
1270
+ five_hour_resets_at_epoch,
1271
+ c._FIVE_HOUR_JITTER_FLOOR_SECONDS * 3,
1272
+ five_hour_resets_at_epoch,
1273
+ ),
1274
+ ).fetchone()
1275
+ if prior_block_row is not None:
1276
+ prior_iso = prior_block_row["five_hour_resets_at"]
1277
+ prior_5h_epoch = int(parse_iso_datetime(
1278
+ prior_iso, "prior 5h block anchor"
1279
+ ).timestamp())
1280
+ prior_5h_key = int(prior_block_row["five_hour_window_key"])
1281
+ except (sqlite3.DatabaseError, ValueError, TypeError) as exc:
1282
+ eprint(f"[record-usage] prior 5h block-anchor lookup failed: {exc}")
1283
+
1284
+ # Tier 2: snapshot lookup (legacy fallback). Only run when Tier
1285
+ # 1 missed (no canonical block row exists yet — the brand-new-
1286
+ # window case before any record-usage tick has materialized a
1287
+ # five_hour_blocks row). Tier 1's empty-result guard is the
1288
+ # `prior_block_row is not None` test above; replicating it
1289
+ # here keeps Tier 2 strictly secondary.
1290
+ if prior_5h_key is None:
1291
+ try:
1292
+ prior_5h_row = conn.execute(
1293
+ "SELECT five_hour_resets_at, five_hour_window_key "
1294
+ "FROM weekly_usage_snapshots "
1295
+ "WHERE five_hour_resets_at IS NOT NULL "
1296
+ " AND five_hour_window_key IS NOT NULL "
1297
+ "ORDER BY captured_at_utc DESC, id DESC LIMIT 1"
1298
+ ).fetchone()
1299
+ if prior_5h_row is not None:
1300
+ prior_iso = prior_5h_row["five_hour_resets_at"]
1301
+ prior_5h_epoch = int(parse_iso_datetime(
1302
+ prior_iso, "prior 5h anchor"
1303
+ ).timestamp())
1304
+ prior_5h_key = int(prior_5h_row["five_hour_window_key"])
1305
+ except (sqlite3.DatabaseError, ValueError, TypeError) as exc:
1306
+ eprint(f"[record-usage] prior 5h anchor lookup failed: {exc}")
1307
+
1308
+ # Tier 3 is implicit: with no anchor, _canonical_5h_window_key
1309
+ # falls back to the pure 600-second floor.
1310
+ five_hour_window_key = _canonical_5h_window_key(
1311
+ five_hour_resets_at_epoch,
1312
+ prior_epoch=prior_5h_epoch,
1313
+ prior_key=prior_5h_key,
1314
+ )
1315
+
1316
+ # Mid-week reset detection. When `resets_at` advances before the
1317
+ # previously-declared reset actually fires (Anthropic-initiated
1318
+ # goodwill reset, or any API-side shift), record one week_reset_events
1319
+ # row so display + cost layers can treat the observed moment as the
1320
+ # old week's effective end AND the new week's effective start. The
1321
+ # monotonic check below stays keyed on week_start_date so it still
1322
+ # guards the new week against stale rate-limit data independently.
1323
+ # Both boundaries canonicalize to hour (same rule make_week_ref uses)
1324
+ # so minute/second-level Anthropic jitter doesn't masquerade as a
1325
+ # reset and the stored values match what WeekRef.week_end_at carries.
1326
+ # The 5h-block cross-flag is no longer threaded from here —
1327
+ # maybe_update_five_hour_block re-derives it every tick by JOINing
1328
+ # against week_reset_events (self-healing, see helper for rationale).
1329
+ try:
1330
+ cur_end_canon = _canonicalize_optional_iso(week_end_at, "record.cur")
1331
+ prior = conn.execute(
1332
+ "SELECT week_end_at, weekly_percent FROM weekly_usage_snapshots "
1333
+ "WHERE week_end_at IS NOT NULL "
1334
+ "ORDER BY captured_at_utc DESC, id DESC LIMIT 1"
1335
+ ).fetchone()
1336
+ if prior and prior["week_end_at"] and cur_end_canon:
1337
+ prior_end_canon = _canonicalize_optional_iso(
1338
+ prior["week_end_at"], "record.prior"
1339
+ )
1340
+ prior_pct = prior["weekly_percent"]
1341
+ now_utc = dt.datetime.now(dt.timezone.utc)
1342
+ if prior_end_canon and prior_end_canon != cur_end_canon:
1343
+ prior_end_dt = parse_iso_datetime(prior_end_canon, "prior.week_end_at")
1344
+ # Fire only when (a) prior window was still in the FUTURE
1345
+ # (Anthropic shifted the boundary before natural expiration),
1346
+ # AND (b) weekly_percent dropped by RESET_PCT_DROP_THRESHOLD
1347
+ # or more (filters out API flaps / transient boundary
1348
+ # jitter where usage stays roughly the same).
1349
+ if (
1350
+ prior_end_dt > now_utc
1351
+ and prior_pct is not None
1352
+ and (float(prior_pct) - float(weekly_percent)) >= c._RESET_PCT_DROP_THRESHOLD
1353
+ ):
1354
+ # See _backfill_week_reset_events for why we floor
1355
+ # the reset moment to the hour (natural display
1356
+ # boundary, aligned with Anthropic's hour-only
1357
+ # resets_at values).
1358
+ effective_iso = _floor_to_hour(now_utc).isoformat(timespec="seconds")
1359
+ conn.execute(
1360
+ "INSERT OR IGNORE INTO week_reset_events "
1361
+ "(detected_at_utc, old_week_end_at, new_week_end_at, "
1362
+ " effective_reset_at_utc) VALUES (?, ?, ?, ?)",
1363
+ (now_utc_iso(), prior_end_canon, cur_end_canon,
1364
+ effective_iso),
1365
+ )
1366
+ conn.commit()
1367
+ elif prior_end_canon and prior_end_canon == cur_end_canon:
1368
+ # In-place credit branch (v1.7.2). When `resets_at` stays
1369
+ # unchanged but `weekly_percent` drops by RESET_PCT_DROP_THRESHOLD
1370
+ # or more, Anthropic has issued a goodwill in-place weekly
1371
+ # credit. Emit one week_reset_events row with new_week_end_at set
1371
+ # to the current end_at (row shape below) so the reset-aware
1372
+ # clamp further down and the milestone segment writer can pivot to
1374
+ # the post-credit segment. The seed snapshot lands via
1375
+ # the now-reset-aware clamp on this same call.
1376
+ prior_end_dt = parse_iso_datetime(prior_end_canon, "prior.week_end_at")
1377
+ if (
1378
+ prior_end_dt > now_utc
1379
+ and prior_pct is not None
1380
+ and (float(prior_pct) - float(weekly_percent)) >= c._RESET_PCT_DROP_THRESHOLD
1381
+ ):
1382
+ # Pre-check (Q5 belt-and-suspenders): suppress duplicate
1383
+ # event rows for the same new_week_end_at across
1384
+ # consecutive ticks. UNIQUE(old, new) at the DDL
1385
+ # also catches a duplicate detected within the same floored hour,
1386
+ # but the pre-check avoids a useless write attempt
1387
+ # and keeps the log clean. After the seed lands at
1388
+ # post-credit %, the next tick's `prior_pct` will be
1389
+ # the post-credit value so the drop predicate alone
1390
+ # also suffices — pre-check is belt-and-suspenders.
1391
+ already = conn.execute(
1392
+ "SELECT 1 FROM week_reset_events "
1393
+ "WHERE new_week_end_at = ? LIMIT 1",
1394
+ (cur_end_canon,),
1395
+ ).fetchone()
1396
+ if already is None:
1397
+ effective_dt = _floor_to_hour(now_utc)
1398
+ effective_iso = effective_dt.isoformat(timespec="seconds")
1399
+ # Row shape: old=effective_iso, new=cur_end_canon
1400
+ # (distinct values). The previous shape stored
1401
+ # old==new==cur_end_canon, which let BOTH
1402
+ # _apply_reset_events_to_weekrefs maps
1403
+ # (pre_map[old] and post_map[new]) fire on the
1404
+ # SAME WeekRef — pre_map rewrote week_end_at to
1405
+ # effective, post_map rewrote week_start_at to
1406
+ # effective, collapsing the credited week to a
1407
+ # zero-width window in downstream renders. With
1408
+ # old==effective and new==cur_end_canon, only
1409
+ # post_map fires on the credited week (setting
1410
+ # week_start_at = effective, the intended
1411
+ # behavior); pre_map keys on effective_iso and
1412
+ # finds no matching WeekRef in practice. The
1413
+ # UNIQUE(old, new) constraint permits this
1414
+ # row, and the pre-check above keys on
1415
+ # new_week_end_at so dedup still works.
1416
+ conn.execute(
1417
+ "INSERT OR IGNORE INTO week_reset_events "
1418
+ "(detected_at_utc, old_week_end_at, new_week_end_at, "
1419
+ " effective_reset_at_utc) VALUES (?, ?, ?, ?)",
1420
+ (now_utc_iso(), effective_iso, cur_end_canon,
1421
+ effective_iso),
1422
+ )
1423
+ conn.commit()
1424
+ # Force-write hwm-7d so the next status-line
1425
+ # render reflects the post-credit value. The
1426
+ # monotonic guard at the normal write site (below)
1427
+ # would refuse to decrease the file; this write
1428
+ # is the credit-only escape hatch. Lands AFTER
1429
+ # the conn.commit() so a concurrent record-usage
1430
+ # reader doesn't see the new HWM before the
1431
+ # event row is durable.
1432
+ try:
1433
+ (c.APP_DIR / "hwm-7d").write_text(
1434
+ f"{week_start_date} {weekly_percent}\n"
1435
+ )
1436
+ except OSError:
1437
+ pass
1438
+
1439
+ # Race-defensive cleanup. Between the moment
1440
+ # Anthropic credited the user (effective_iso)
1441
+ # and this code firing, the EXTERNAL
1442
+ # claude-statusline tool can replay stale
1443
+ # pre-credit `--percent` values (it has its
1444
+ # own in-memory HWM cache and re-runs us once
1445
+ # per status-line tick). Those replays land
1446
+ # captured_at_utc >= effective_iso with
1447
+ # weekly_percent == prior_pct (the pre-credit
1448
+ # value), and they dominate the reset-aware
1449
+ # clamp's MAX over the post-credit segment so
1450
+ # legitimate fresh OAuth values are rejected.
1451
+ # Strict equality (round(.,1)) keeps this
1452
+ # narrow: we only delete rows whose percent
1453
+ # exactly matches the pre-credit value we just
1454
+ # observed — legitimate post-credit climbs
1455
+ # past `prior_pct` (rare, but possible if the
1456
+ # credit is small + activity is heavy) stay.
1457
+ try:
1458
+ conn.execute(
1459
+ "DELETE FROM weekly_usage_snapshots "
1460
+ "WHERE week_start_date = ? "
1461
+ " AND unixepoch(captured_at_utc) >= "
1462
+ " unixepoch(?) "
1463
+ " AND round(weekly_percent, 1) = "
1464
+ " round(?, 1)",
1465
+ (week_start_date, effective_iso,
1466
+ float(prior_pct)),
1467
+ )
1468
+ conn.commit()
1469
+ except sqlite3.DatabaseError as exc:
1470
+ eprint(
1471
+ "[record-usage] post-credit cleanup "
1472
+ f"failed: {exc}"
1473
+ )
1474
+ except (sqlite3.DatabaseError, ValueError) as exc:
1475
+ eprint(f"[record-usage] reset-event detection failed: {exc}")
1476
+
1477
+ # 7-day usage is monotonically non-decreasing within a billing week
1478
+ # — UNTIL Anthropic issues an in-place weekly credit. When a
1479
+ # week_reset_events row exists for THIS week_end_at, the MAX query
1480
+ # filters to samples captured at-or-after the segment's
1481
+ # effective_reset_at_utc so a fresh post-credit OAuth value (e.g.
1482
+ # 2%) lands instead of being held back by stale pre-credit history
1483
+ # (e.g. 67%). When no event row exists, COALESCE defaults to
1484
+ # epoch-zero so the filter is a no-op and legacy clamp behavior
1485
+ # is preserved byte-identically.
1486
+ # NB: comparison wrapped with ``unixepoch()`` on BOTH sides.
1487
+ # ``captured_at_utc`` is stored with `Z` suffix, but
1488
+ # ``effective_reset_at_utc`` may have a non-UTC offset on
1489
+ # historical backfill rows written before Bug 3 was fixed
1490
+ # (parse_iso_datetime returned host-local). Lex string compare
1491
+ # on mixed offsets silently mis-orders moments for non-UTC
1492
+ # hosts (CLAUDE.md gotcha: 5h-block cross-reset flag — "all
1493
+ # comparisons go through unixepoch(), NOT lex
1494
+ # BETWEEN/`<`/`>`"). Same rule applies here.
1495
+ max_row = conn.execute(
1496
+ """
1497
+ SELECT MAX(weekly_percent) AS v
1498
+ FROM weekly_usage_snapshots
1499
+ WHERE week_start_date = ?
1500
+ AND unixepoch(captured_at_utc) >= unixepoch(COALESCE(
1501
+ (SELECT effective_reset_at_utc
1502
+ FROM week_reset_events
1503
+ WHERE new_week_end_at = ?
1504
+ ORDER BY id DESC
1505
+ LIMIT 1),
1506
+ '1970-01-01T00:00:00Z'
1507
+ ))
1508
+ """,
1509
+ (week_start_date, week_end_at),
1510
+ ).fetchone()
1511
+ if max_row and max_row["v"] is not None and round(weekly_percent, 1) < round(float(max_row["v"]), 1):
1512
+ should_insert = False
1513
+ else:
1514
+ # 5-hour usage is monotonically non-decreasing within a window.
1515
+ # A lower value means stale API data; clamp to existing max.
1516
+ # Joining on five_hour_window_key (canonical 10-min-floored
1517
+ # epoch) absorbs Anthropic's seconds-level jitter on
1518
+ # resets_at; an ISO-string equality at this site silently
1519
+ # skipped the clamp every time a jittered fetch landed in
1520
+ # the same physical 5h window (spec Bug B).
1521
+ if five_hour_percent is not None and five_hour_window_key is not None:
1522
+ max_5h_row = conn.execute(
1523
+ "SELECT MAX(five_hour_percent) AS v FROM weekly_usage_snapshots WHERE five_hour_window_key = ?",
1524
+ (five_hour_window_key,),
1525
+ ).fetchone()
1526
+ if max_5h_row and max_5h_row["v"] is not None and round(five_hour_percent, 1) < round(float(max_5h_row["v"]), 1):
1527
+ five_hour_percent = float(max_5h_row["v"])
1528
+
1529
+ # Dedup vs last snapshot: if BOTH weekly_percent and
1530
+ # five_hour_percent are unchanged from the most recent row in
1531
+ # this week, swallow the insert. Tests of the 5h clamp must
1532
+ # vary --percent (or --five-hour-percent) between calls, or
1533
+ # the second call is dropped here before the clamp even runs
1534
+ # — see bin/cctally-5h-canonical-test scenario B.
1535
+ last = conn.execute(
1536
+ """
1537
+ SELECT weekly_percent, five_hour_percent
1538
+ FROM weekly_usage_snapshots
1539
+ WHERE week_start_date = ?
1540
+ ORDER BY captured_at_utc DESC, id DESC
1541
+ LIMIT 1
1542
+ """,
1543
+ (week_start_date,),
1544
+ ).fetchone()
1545
+ if last is not None:
1546
+ if float(last["weekly_percent"]) == weekly_percent:
1547
+ last_5h = last["five_hour_percent"]
1548
+ if five_hour_percent is None or (
1549
+ last_5h is not None and float(last_5h) == five_hour_percent
1550
+ ):
1551
+ should_insert = False
1552
+
1553
+ # No backfill of 5h data on existing milestones — we don't have
1554
+ # authentic crossing-time values for them. New milestones created
1555
+ # by the status line path will have 5h data set at creation time
1556
+ # via maybe_record_milestone().
1557
+ finally:
1558
+ conn.close()
1559
+
1560
+ if not should_insert:
1561
+ # Self-heal: a prior record-usage invocation may have inserted
1562
+ # the snapshot but been killed (CC self-update, machine sleep,
1563
+ # OOM) before maybe_record_milestone / maybe_update_five_hour_block
1564
+ # could run. Pre-probe both surfaces with cheap indexed SELECTs
1565
+ # and only invoke the helpers when a row is actually missing or
1566
+ # stale. Steady-state cost: 1-3 SELECTs (latest snapshot always;
1567
+ # +max_milestone if floor>=1; +block last_observed if window_key
1568
+ # is set); ZERO JSONL re-ingest on healthy ticks. The helpers themselves are idempotent under
1569
+ # concurrent record-usage instances (INSERT OR IGNORE for
1570
+ # percent_milestones; SQLite write-lock serialization for the
1571
+ # 5h upsert). Without the pre-probe, every dedup tick would
1572
+ # trigger sync_cache + a window walk + replace-all rollups via
1573
+ # maybe_update_five_hour_block's unconditional _compute_block_totals
1574
+ # call. Regression: bin/cctally-record-usage-selfheal-test.
1575
+ try:
1576
+ heal_conn = open_db()
1577
+ try:
1578
+ latest_row = heal_conn.execute(
1579
+ "SELECT * FROM weekly_usage_snapshots "
1580
+ "WHERE week_start_date = ? "
1581
+ "ORDER BY captured_at_utc DESC, id DESC LIMIT 1",
1582
+ (week_start_date,),
1583
+ ).fetchone()
1584
+ if latest_row is None:
1585
+ return 0
1586
+
1587
+ # Probe 1: do we owe a percent milestone? Snap up before
1588
+ # floor (status-line API returns 0.N*100 which can fall
1589
+ # one ULP short of N — same convention as
1590
+ # maybe_record_milestone).
1591
+ latest_floor = math.floor(
1592
+ float(latest_row["weekly_percent"]) + 1e-9
1593
+ )
1594
+ need_milestone_heal = False
1595
+ if latest_floor >= 1:
1596
+ # v1.7.2: scope the heal probe to the ACTIVE segment.
1597
+ # Without this, a credited week's MAX over the whole
1598
+ # ledger would still read the pre-credit ceiling
1599
+ # (e.g. 67%) and silently suppress the post-credit
1600
+ # ledger's heal even though it has zero rows.
1601
+ captured_at_for_probe = latest_row["captured_at_utc"]
1602
+ week_end_at_for_probe = latest_row["week_end_at"]
1603
+ heal_segment = 0
1604
+ if week_end_at_for_probe and captured_at_for_probe:
1605
+ seg = heal_conn.execute(
1606
+ "SELECT id FROM week_reset_events "
1607
+ "WHERE new_week_end_at = ? "
1608
+ " AND unixepoch(effective_reset_at_utc) <= unixepoch(?) "
1609
+ "ORDER BY id DESC LIMIT 1",
1610
+ (week_end_at_for_probe, captured_at_for_probe),
1611
+ ).fetchone()
1612
+ if seg is not None:
1613
+ heal_segment = int(seg["id"])
1614
+ max_existing = heal_conn.execute(
1615
+ "SELECT MAX(percent_threshold) AS m "
1616
+ "FROM percent_milestones "
1617
+ "WHERE week_start_date = ? AND reset_event_id = ?",
1618
+ (week_start_date, heal_segment),
1619
+ ).fetchone()
1620
+ if max_existing is None or max_existing["m"] is None:
1621
+ need_milestone_heal = True
1622
+ elif int(max_existing["m"]) < latest_floor:
1623
+ need_milestone_heal = True
1624
+
1625
+ # Probe 2: do we owe a 5h-block update? Either no row
1626
+ # for this canonical window, or the existing row's
1627
+ # last_observed_at_utc is stale relative to the latest
1628
+ # snapshot's captured_at_utc (the kill landed between
1629
+ # insert_usage_snapshot and maybe_update_five_hour_block).
1630
+ need_5h_heal = False
1631
+ window_key = latest_row["five_hour_window_key"]
1632
+ if window_key is not None:
1633
+ block_row = heal_conn.execute(
1634
+ "SELECT last_observed_at_utc "
1635
+ "FROM five_hour_blocks "
1636
+ "WHERE five_hour_window_key = ?",
1637
+ (int(window_key),),
1638
+ ).fetchone()
1639
+ if block_row is None:
1640
+ need_5h_heal = True
1641
+ elif (
1642
+ block_row["last_observed_at_utc"]
1643
+ < latest_row["captured_at_utc"]
1644
+ ):
1645
+ need_5h_heal = True
1646
+ finally:
1647
+ heal_conn.close()
1648
+
1649
+ if need_milestone_heal or need_5h_heal:
1650
+ latest_saved = _saved_dict_from_usage_row(latest_row)
1651
+ if need_milestone_heal:
1652
+ try:
1653
+ maybe_record_milestone(latest_saved)
1654
+ except Exception as exc:
1655
+ eprint(f"[milestone] self-heal error: {exc}")
1656
+ if need_5h_heal:
1657
+ try:
1658
+ maybe_update_five_hour_block(latest_saved)
1659
+ except Exception as exc:
1660
+ eprint(f"[5h-block] self-heal error: {exc}")
1661
+ except Exception as exc:
1662
+ eprint(f"[record-usage] self-heal lookup failed: {exc}")
1663
+ return 0
1664
+
1665
+ payload = {
1666
+ "source": "statusline",
1667
+ "capturedAt": now_utc_iso(),
1668
+ "weeklyPercent": weekly_percent,
1669
+ "weekStartDate": week_start_date,
1670
+ "weekEndDate": week_end_date,
1671
+ "weekStartAt": week_start_at,
1672
+ "weekEndAt": week_end_at,
1673
+ }
1674
+ if five_hour_percent is not None:
1675
+ payload["fiveHourPercent"] = five_hour_percent
1676
+ if five_hour_resets_at_str is not None:
1677
+ payload["fiveHourResetsAt"] = five_hour_resets_at_str
1678
+ if five_hour_window_key is not None:
1679
+ payload["fiveHourWindowKey"] = five_hour_window_key
1680
+
1681
+ saved = insert_usage_snapshot(payload, week_start_name)
1682
+ try:
1683
+ maybe_record_milestone(saved)
1684
+ except Exception as exc:
1685
+ eprint(f"[milestone] unexpected error: {exc}")
1686
+
1687
+ # NEW: 5h-block rollup (paired with maybe_record_milestone for 7d).
1688
+ # The helper performs an opportunistic JOIN against week_reset_events
1689
+ # every tick to flag any open block whose interval contains a recorded
1690
+ # reset; no per-call plumbing needed (self-healing).
1691
+ try:
1692
+ maybe_update_five_hour_block(saved)
1693
+ except Exception as exc:
1694
+ eprint(f"[5h-block] unexpected error: {exc}")
1695
+
1696
+ # Write high-water mark so the status line never displays a regression.
1697
+ # The file contains "week_start_date weekly_percent" on one line.
1698
+ try:
1699
+ hwm_path = c.APP_DIR / "hwm-7d"
1700
+ existing_hwm = 0.0
1701
+ try:
1702
+ parts = hwm_path.read_text().strip().split()
1703
+ if len(parts) == 2 and parts[0] == week_start_date:
1704
+ existing_hwm = float(parts[1])
1705
+ except (FileNotFoundError, ValueError, OSError):
1706
+ pass
1707
+ if weekly_percent >= existing_hwm:
1708
+ hwm_path.write_text(f"{week_start_date} {weekly_percent}\n")
1709
+ except OSError:
1710
+ pass
1711
+
1712
+ # Symmetric 5h HWM. Keyed by the canonical five_hour_window_key derived
1713
+ # under the prior-anchor logic above (NOT a fresh pure-floor recompute) so
1714
+ # boundary-straddling jitter writes to the same file key as the matching
1715
+ # DB row. File format: "<canonical_5h_window_key> <percent>".
1716
+ if (
1717
+ five_hour_percent is not None
1718
+ and five_hour_window_key is not None
1719
+ ):
1720
+ try:
1721
+ five_resets_key = five_hour_window_key
1722
+ hwm5_path = c.APP_DIR / "hwm-5h"
1723
+ existing_hwm5 = 0.0
1724
+ try:
1725
+ parts5 = hwm5_path.read_text().strip().split()
1726
+ if len(parts5) == 2 and parts5[0] == str(five_resets_key):
1727
+ existing_hwm5 = float(parts5[1])
1728
+ except (FileNotFoundError, ValueError, OSError):
1729
+ pass
1730
+ if five_hour_percent >= existing_hwm5:
1731
+ hwm5_path.write_text(f"{five_resets_key} {five_hour_percent}\n")
1732
+ except OSError:
1733
+ pass
1734
+
1735
+ return 0
1736
+
1737
+
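Illustrative sketch, not part of the cctally diff: the high-water-mark write at the end of `cmd_record_usage` amounts to "read the stored mark for this key, and write only if the new value is not lower". The helper name `write_hwm` and the temporary path are assumptions made for this example; only the one-line `"<key> <percent>"` file format and the `>=` comparison come from the code above.

```python
import tempfile
from pathlib import Path


def write_hwm(path: Path, week_key: str, percent: float) -> bool:
    """Write "<week_key> <percent>" unless it would lower the stored mark.

    Returns True when the file was written. (Hypothetical helper.)
    """
    existing = 0.0
    try:
        parts = path.read_text().strip().split()
        # A mark recorded for a different week does not constrain this one.
        if len(parts) == 2 and parts[0] == week_key:
            existing = float(parts[1])
    except (OSError, ValueError):
        pass  # missing or malformed file: treat as "no prior mark"
    if percent < existing:
        return False
    path.write_text(f"{week_key} {percent}\n")
    return True


demo = Path(tempfile.mkdtemp()) / "hwm-7d"
results = [
    write_hwm(demo, "2026-05-09", 40.0),  # first write
    write_hwm(demo, "2026-05-09", 35.0),  # lower value, same week: refused
    write_hwm(demo, "2026-05-16", 2.0),   # new week: a lower value is accepted
]
```

Because the guard is keyed on the week, a new week resets the mark automatically. A credit inside the same week is the case this guard alone would block, which is why the credit path above force-writes the file.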
1738
+ def _hook_tick_log_line(line: str) -> None:
1739
+ """Append one line to hook-tick.log; create dir if missing.
1740
+
1741
+ Uses O_APPEND so concurrent writers' sub-PIPE_BUF lines don't interleave.
1742
+ Best-effort: any IO error is silently swallowed (hook discipline).
1743
+ """
1744
+ c = _cctally()
1745
+ try:
1746
+ c.HOOK_TICK_LOG_DIR.mkdir(parents=True, exist_ok=True)
1747
+ fd = os.open(c.HOOK_TICK_LOG_PATH, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
1748
+ try:
1749
+ os.write(fd, (line.rstrip("\n") + "\n").encode("utf-8", errors="replace"))
1750
+ finally:
1751
+ os.close(fd)
1752
+ except OSError:
1753
+ pass
1754
+
1755
+
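Illustrative sketch, not part of the cctally diff: the append pattern used by `_hook_tick_log_line`, reduced to a standalone function. `append_line` and the temporary log path are assumptions for this example. Note that POSIX states the `PIPE_BUF` atomicity rule for pipes and FIFOs; for regular files, the non-interleaving of small single `write()` calls under `O_APPEND` is best treated as a practical property of common local filesystems rather than a formal guarantee.

```python
import os
import tempfile


def append_line(path: str, line: str) -> None:
    """Append exactly one newline-terminated line using a single write."""
    data = (line.rstrip("\n") + "\n").encode("utf-8", errors="replace")
    # O_APPEND makes the kernel position each write at the current end of file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


log_path = os.path.join(tempfile.mkdtemp(), "hook-tick.log")
append_line(log_path, "first")
append_line(log_path, "second\n")  # a trailing newline is not doubled
with open(log_path, encoding="utf-8") as fh:
    lines = fh.read().splitlines()
```

Building the full line in memory and issuing one `os.write` is the important part. A buffered `print` could split a line across several writes.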
1756
+ def _hook_tick_log_rotate_if_needed() -> None:
1757
+ """If hook-tick.log exceeds the size cap, atomic-rename to .1 (overwriting)."""
1758
+ c = _cctally()
1759
+ try:
1760
+ size = c.HOOK_TICK_LOG_PATH.stat().st_size
1761
+ except FileNotFoundError:
1762
+ return
1763
+ except OSError:
1764
+ return
1765
+ if size <= c.HOOK_TICK_LOG_ROTATE_BYTES:
1766
+ return
1767
+ try:
1768
+ os.replace(c.HOOK_TICK_LOG_PATH, c.HOOK_TICK_LOG_ROTATED_PATH)
1769
+ except OSError:
1770
+ pass
1771
+
1772
+
1773
+ def _hook_tick_throttle_age_seconds() -> float:
1774
+ """Return seconds since last successful OAuth fetch; +inf if never."""
1775
+ c = _cctally()
1776
+ try:
1777
+ mtime = c.HOOK_TICK_THROTTLE_PATH.stat().st_mtime
1778
+ except FileNotFoundError:
1779
+ return float("inf")
1780
+ except OSError:
1781
+ return float("inf")
1782
+ return max(0.0, time.time() - mtime)
1783
+
1784
+
1785
+ def _hook_tick_throttle_touch() -> None:
1786
+ """Update mtime to now (creating the file if missing)."""
1787
+ c = _cctally()
1788
+ try:
1789
+ c.APP_DIR.mkdir(parents=True, exist_ok=True)
1790
+ c.HOOK_TICK_THROTTLE_PATH.touch(exist_ok=True)
1791
+ os.utime(c.HOOK_TICK_THROTTLE_PATH, None)
1792
+ except OSError:
1793
+ pass
1794
+
1795
+
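Illustrative sketch, not part of the cctally diff: the two throttle helpers above use a marker file's mtime as the "last successful fetch" timestamp. The names `marker_age_seconds` and `should_fetch` and the 60-second window are assumptions for this example.

```python
import os
import tempfile
import time
from pathlib import Path


def marker_age_seconds(marker: Path) -> float:
    """Seconds since the marker's mtime; +inf when it cannot be read."""
    try:
        # max() guards against a marker whose mtime is slightly in the future.
        return max(0.0, time.time() - marker.stat().st_mtime)
    except OSError:
        return float("inf")


def should_fetch(marker: Path, throttle_seconds: float) -> bool:
    return marker_age_seconds(marker) >= throttle_seconds


marker = Path(tempfile.mkdtemp()) / "last-fetch"
never_fetched = should_fetch(marker, 60.0)  # no marker yet
marker.touch()
just_fetched = should_fetch(marker, 60.0)   # marker is fresh
old = time.time() - 120
os.utime(marker, (old, old))                # backdate the marker by two minutes
stale = should_fetch(marker, 60.0)
```

Returning `+inf` for a missing marker means the first run always fetches, with no special case in the caller.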
1796
+ def _hook_tick_read_stdin_event(stdin_max_bytes: int = 32 * 1024) -> dict:
1797
+ """Read CC's hook payload (JSON on stdin). Best-effort.
1798
+
1799
+ Returns dict with keys event, session_id, transcript_path, cwd —
1800
+ every value is a string (or "unknown"). Never raises.
1801
+ """
1802
+ out = {"event": "unknown", "session_id": "unknown", "transcript_path": "", "cwd": ""}
1803
+ try:
1804
+ data = sys.stdin.buffer.read(stdin_max_bytes)
1805
+ except (OSError, ValueError):
1806
+ return out
1807
+ if not data:
1808
+ return out
1809
+ try:
1810
+ payload = json.loads(data.decode("utf-8", errors="replace"))
1811
+ except (ValueError, UnicodeDecodeError):
1812
+ return out
1813
+ if not isinstance(payload, dict):
1814
+ return out
1815
+ out["event"] = str(payload.get("hook_event_name") or "unknown")
1816
+ sid = payload.get("session_id")
1817
+ out["session_id"] = str(sid) if isinstance(sid, str) else "unknown"
1818
+ tp = payload.get("transcript_path")
1819
+ out["transcript_path"] = str(tp) if isinstance(tp, str) else ""
1820
+ cwd = payload.get("cwd")
1821
+ out["cwd"] = str(cwd) if isinstance(cwd, str) else ""
1822
+ return out
1823
+
1824
+
1825
+ def _hook_tick_session_short(sid: str) -> str:
1826
+ """First 8 chars of a session id, sanitized for log lines."""
1827
+ if not sid or sid == "unknown":
1828
+ return "unknown"
1829
+ return "".join(c for c in sid[:8] if c.isalnum() or c in "-_")
1830
+
1831
+
1832
+ def _hook_tick_format_log_line(
1833
+ event: str, session: str, ingested: int, oauth_status: str, dur_ms: int
1834
+ ) -> str:
1835
+ ts = now_utc_iso()
1836
+ return (
1837
+ f"{ts} event={event:14s} session={session} "
1838
+ f"ingested={ingested} oauth={oauth_status} dur_ms={dur_ms}"
1839
+ )
1840
+
1841
+
1842
+ def cmd_hook_tick(args: argparse.Namespace) -> int:
1843
+ """Per-fire hook runtime (Section 3 of onboarding spec).
1844
+
1845
+ Normal mode: reads stdin, detaches stdout/stderr to log file, runs
1846
+ sync_cache + (throttled) OAuth refresh, writes one log line, returns 0
1847
+ UNCONDITIONALLY (even on internal failure — hook discipline).
1848
+
1849
+ --explain mode: synchronous, prints to stdout, returns informative
1850
+ exit code.
1851
+ """
1852
+ c = _cctally()
1853
+ explain = bool(getattr(args, "explain", False))
1854
+ no_oauth = bool(getattr(args, "no_oauth", False))
1855
+ # Use an explicit `is None` check so `--throttle-seconds 0` survives the
1856
+ # default-fallback (a `0 or DEFAULT` short-circuit would silently drop
1857
+ # the override and reapply the configured window — defeats the purpose
1858
+ # of the zero-second escape hatch).
1859
+ override = getattr(args, "throttle_seconds", None)
1860
+ if override is not None:
1861
+ throttle_seconds = float(override)
1862
+ else:
1863
+ try:
1864
+ _cfg = _get_oauth_usage_config(load_config())
1865
+ throttle_seconds = float(_cfg["throttle_seconds"])
1866
+ except sys.modules["cctally"].OauthUsageConfigError:
1867
+ throttle_seconds = float(c.HOOK_TICK_DEFAULT_THROTTLE_SECONDS)
1868
+
1869
+ # --- Step 1: read stdin (before detach OR fork) ---
1870
+ # CRITICAL: stdin must be read BEFORE we fork. POSIX (XCU §2.9.3) says
1871
+ # async commands (`cmd &`) in non-interactive shells get stdin redirected
1872
+ # to /dev/null; we previously relied on shell `&` which blanked the
1873
+ # hook payload. Now the settings.json command is bare and we fork here
1874
+ # ourselves — but stdin still has to be drained first.
1875
+ forced_event = getattr(args, "event", None)
1876
+ if explain:
1877
+ meta = {"event": forced_event or "explain", "session_id": "explain",
1878
+ "transcript_path": "", "cwd": ""}
1879
+ else:
1880
+ meta = _hook_tick_read_stdin_event()
1881
+ if forced_event:
1882
+ meta["event"] = forced_event
1883
+
1884
+ # --- Step 1b: fork to background so CC's hook returns immediately ---
1885
+ # Parent returns 0 right away; child carries on with sync_cache + OAuth.
1886
+ # If fork fails (rare: out of pids/memory), fall back to running the
1887
+ # body inline — the parent process must NOT be misclassified as a
1888
+ # forked child, otherwise os.setsid() would detach the parent's
1889
+ # controlling terminal and os._exit(0) at function end would kill it
1890
+ # mid-stack.
1891
+ forked = False
1892
+ pid = 0
1893
+ if not explain:
1894
+ try:
1895
+ pid = os.fork()
1896
+ forked = True
1897
+ except OSError:
1898
+ pass
1899
+ if forked and pid > 0:
1900
+ # Parent of a successful fork: CC unblocks immediately.
1901
+ return 0
1902
+ # Either: child of successful fork, OR inline fallback after fork failure.
1903
+ if forked:
1904
+ # Detach from parent's session so SIGHUP from CC doesn't kill us.
1905
+ try:
1906
+ os.setsid()
1907
+ except OSError:
1908
+ pass
1909
+
1910
+ # --- Step 2: detach stdio (forked child OR inline fallback after fork failure) ---
1911
+ # In the inline-fallback path the parent process re-routes its own stdout/
1912
+ # stderr to the log file for the rest of its short life. Function returns
1913
+ # immediately after Step 7, so the leak is bounded.
1914
+ if not explain:
1915
+ try:
1916
+ c.HOOK_TICK_LOG_DIR.mkdir(parents=True, exist_ok=True)
1917
+ log_fd = os.open(
1918
+ c.HOOK_TICK_LOG_PATH,
1919
+ os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644,
1920
+ )
1921
+ os.dup2(log_fd, 1) # stdout
1922
+ os.dup2(log_fd, 2) # stderr
1923
+ os.close(log_fd)
1924
+ try:
1925
+ devnull = os.open(os.devnull, os.O_RDONLY)
1926
+ os.dup2(devnull, 0)
1927
+ os.close(devnull)
1928
+ except OSError:
1929
+ pass
1930
+ except OSError:
1931
+ pass # log redirect failed; carry on silently
1932
+
1933
+ # --- Steps 3-7: wrap remainder in try/except (always exit 0 in normal mode) ---
1934
+ start = time.monotonic()
1935
+ ingested = 0
1936
+ oauth_status = "skipped-no-oauth" if no_oauth else "throttled(age=?s)"
1937
+ # Pre-fetch throttle state captured for --explain output. The OAuth
1938
+ # block re-touches the throttle marker after a successful fetch, so
1939
+ # re-reading age there would print `mtime: 0s ago → skip` even when
1940
+ # the call we just made was a fetch. Freeze the values at decision
1941
+ # time. `pre_age` is read once now (covers --no-oauth / lock-failure
1942
+ # paths); the throttle block below re-assigns it under flock for the
1943
+ # OAuth-active path so the explain output matches the actual decision.
1944
+ pre_age: float = _hook_tick_throttle_age_seconds()
1945
+ decision: str = "skip"
1946
+
1947
+ try:
1948
+ # Local sync (always)
1949
+ try:
1950
+ cache_conn = open_cache_db()
1951
+ try:
1952
+ stats = sync_cache(cache_conn)
1953
+ ingested = int(stats.rows_inserted)
1954
+ finally:
1955
+ try:
1956
+ cache_conn.close()
1957
+ except Exception:
1958
+ pass
1959
+ except Exception as exc:
1960
+ ingested = -1
1961
+ if explain:
1962
+ eprint(f"[hook-tick] sync_cache failed: {exc}")
1963
+
1964
+ mock = getattr(args, "mock_oauth_response", None)
1965
+ if mock is not None:
1966
+ # Replace the throttle path's fetch fn for this process.
1967
+ sys.modules["cctally"]._hook_tick_oauth_refresh = _hook_tick_make_mock_refresh(mock)
1968
+
1969
+ # Throttle check + OAuth (under flock)
1970
+ if not no_oauth:
1971
+ c.APP_DIR.mkdir(parents=True, exist_ok=True)
1972
+ try:
1973
+ lock_fd = os.open(
1974
+ c.HOOK_TICK_THROTTLE_LOCK_PATH,
1975
+ os.O_WRONLY | os.O_CREAT, 0o644,
1976
+ )
1977
+ except OSError:
1978
+ lock_fd = -1
1979
+ try:
1980
+ if lock_fd >= 0:
1981
+ fcntl.flock(lock_fd, fcntl.LOCK_EX)
1982
+ pre_age = _hook_tick_throttle_age_seconds()
1983
+ if pre_age >= throttle_seconds:
1984
+ decision = "fetch"
1985
+ oauth_status, _ = _hook_tick_oauth_refresh(throttle_seconds=throttle_seconds)
1986
+ if oauth_status.startswith("ok"):
1987
+ _hook_tick_throttle_touch()
1988
+ else:
1989
+ oauth_status = f"throttled(age={int(pre_age)}s)"
1990
+ finally:
1991
+ if lock_fd >= 0:
1992
+ try:
1993
+ fcntl.flock(lock_fd, fcntl.LOCK_UN)
1994
+ except OSError:
1995
+ pass
1996
+ try:
1997
+ os.close(lock_fd)
1998
+ except OSError:
1999
+ pass
2000
+ except Exception as exc:
2001
+ oauth_status = f"err(internal:{type(exc).__name__})"
2002
+ if explain:
2003
+ eprint(f"[hook-tick] internal error: {exc}")
2004
+
2005
+ dur_ms = int((time.monotonic() - start) * 1000)
2006
+
2007
+ # --- Step 7: log line ---
2008
+ line = _hook_tick_format_log_line(
2009
+ event=meta["event"],
2010
+ session=_hook_tick_session_short(meta["session_id"]),
2011
+ ingested=ingested,
2012
+ oauth_status=oauth_status,
2013
+ dur_ms=dur_ms,
2014
+ )
2015
+ _hook_tick_log_line(line)
2016
+ _hook_tick_log_rotate_if_needed()
2017
+
2018
+ # --- Step 9: exit code ---
2019
+ if not explain:
2020
+ # Forked child: skip Python's atexit / argparse / cleanup paths
2021
+ # (they may try to flush already-redirected stdio handles).
2022
+ if forked:
2023
+ os._exit(0)
2024
+ return 0
2025
+ # --explain mapping (Section 3 of spec)
2026
+ if oauth_status == "skipped-no-token":
2027
+ rc = 2
2028
+ elif oauth_status.startswith("err(network") or oauth_status.startswith("err(parse"):
2029
+ rc = 3
2030
+ elif oauth_status.startswith("err(record-usage"):
2031
+ rc = 4
2032
+ elif ingested < 0:
2033
+ rc = 5
2034
+ else:
2035
+ rc = 0
2036
+ # Print --explain decision tree
2037
+ print("[1/4] Local sync (sync_cache)")
2038
+ print(f" → ingested {max(0, ingested)} new entries")
2039
+ print("[2/4] Throttle check")
2040
+ print(f" → throttle file: {c.HOOK_TICK_THROTTLE_PATH}")
2041
+ if pre_age == float("inf"):
2042
+ print(" → mtime: (file absent)")
2043
+ else:
2044
+ print(f" → mtime: {int(pre_age)}s ago")
2045
+ print(f" → threshold: {int(throttle_seconds)}s → {decision}")
2046
+ print("[3/4] OAuth refresh")
2047
+ print(f" → status: {oauth_status}")
2048
+ print(f"[4/4] Log written → {c.HOOK_TICK_LOG_PATH}")
2049
+ print(f"\nDone in {dur_ms} ms.")
2050
+ return rc
2051
+
2052
+
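Illustrative sketch, not part of the cctally diff: why `cmd_hook_tick` checks `override is not None` rather than writing `override or DEFAULT`. The constant `DEFAULT_THROTTLE_SECONDS` and both function names are hypothetical.

```python
from typing import Optional

DEFAULT_THROTTLE_SECONDS = 60.0  # hypothetical default, for illustration only


def resolve_with_or(override: Optional[float]) -> float:
    # Buggy: 0 is falsy, so an explicit zero falls through to the default.
    return float(override or DEFAULT_THROTTLE_SECONDS)


def resolve_with_is_none(override: Optional[float]) -> float:
    # Only a missing value falls back to the default.
    return float(override) if override is not None else DEFAULT_THROTTLE_SECONDS
```

With an override of `0`, the first form returns the default and the second returns `0.0`, so only the second lets a zero-second throttle force a fetch on every tick.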
2053
+ def _safe_float(value: Any) -> float:
2054
+ try:
2055
+ num = float(value)
2056
+ except (TypeError, ValueError) as exc:
2057
+ raise ValueError("weeklyPercent must be numeric") from exc
2058
+ if num < 0:
2059
+ raise ValueError("weeklyPercent must be >= 0")
2060
+ if num > 1000:
2061
+ raise ValueError("weeklyPercent is unreasonably large")
2062
+ return num
2063
+
2064
+
2065
+ def _validate_date_optional(value: Any, label: str) -> dt.date | None:
2066
+ if value in (None, ""):
2067
+ return None
2068
+ if not isinstance(value, str):
2069
+ raise ValueError(f"{label} must be a string in YYYY-MM-DD")
2070
+ return parse_date_str(value, label)
2071
+
2072
+
2073
+ @dataclass(frozen=True)
2074
+ class DerivedWeekWindow:
2075
+ week_start: dt.date
2076
+ week_end: dt.date
2077
+ week_start_at: str | None = None
2078
+ week_end_at: str | None = None
2079
+
2080
+
2081
+
2082
+ def _coerce_payload_captured_at(payload: dict[str, Any]) -> tuple[str, dt.datetime]:
2083
+ captured_at_raw = payload.get("capturedAt")
2084
+ if isinstance(captured_at_raw, str) and captured_at_raw.strip():
2085
+ try:
2086
+ return captured_at_raw, parse_iso_datetime(captured_at_raw, "capturedAt")
2087
+ except ValueError:
2088
+ pass
2089
+
2090
+ captured_at = now_utc_iso()
2091
+ return captured_at, parse_iso_datetime(captured_at, "capturedAt")
2092
+
2093
+
2094
+
2095
+ def _derive_week_from_payload(payload: dict[str, Any], week_start_name: str) -> DerivedWeekWindow:
2096
+ ws_at = payload.get("weekStartAt")
2097
+ we_at = payload.get("weekEndAt")
2098
+ if isinstance(ws_at, str) and ws_at.strip() and isinstance(we_at, str) and we_at.strip():
2099
+ start_iso = _canonicalize_optional_iso(ws_at, "weekStartAt")
2100
+ end_iso = _canonicalize_optional_iso(we_at, "weekEndAt")
2101
+ if not start_iso or not end_iso:
2102
+ raise ValueError("weekStartAt/weekEndAt must be non-empty ISO datetime strings")
2103
+ start_at = parse_iso_datetime(start_iso, "weekStartAt")
2104
+ end_at = parse_iso_datetime(end_iso, "weekEndAt")
2105
+ if end_at <= start_at:
2106
+ raise ValueError("weekEndAt must be after weekStartAt")
2107
+ # Anchor the bucket-key date on the canonical UTC ISO, not on
2108
+ # `.date()` of the parsed datetime — `parse_iso_datetime` ends
2109
+ # with `.astimezone()` which converts to host-local TZ. If the
2110
+ # cctally process inherits a TZ whose offset puts the UTC moment
2111
+ # on a different calendar date, `start_at.date()` silently
2112
+ # forks the `week_start_date` column for the SAME physical
2113
+ # subscription week, producing a ghost row that never gets
2114
+ # updated (regression: Israel host briefly running with
2115
+ # TZ=America/Los_Angeles for 7 minutes during refactor work
2116
+ # spawned 18 ghost usage rows + 2 ghost cost rows under
2117
+ # week_start_date='2026-05-08' while every other row sat at
2118
+ # '2026-05-09'). Re-canonicalize to UTC before `.date()` so the
2119
+ # bucket key matches what `cmd_record_usage` writes (it derives
2120
+ # `week_start_date` directly from `resets_at` in UTC).
2121
+ return DerivedWeekWindow(
2122
+ week_start=start_at.astimezone(dt.timezone.utc).date(),
2123
+ week_end=end_at.astimezone(dt.timezone.utc).date(),
2124
+ week_start_at=start_iso,
2125
+ week_end_at=end_iso,
2126
+ )
2127
+
2128
+ ws = _validate_date_optional(payload.get("weekStartDate"), "weekStartDate")
2129
+ we = _validate_date_optional(payload.get("weekEndDate"), "weekEndDate")
2130
+ if ws and we:
2131
+ if we < ws:
2132
+ raise ValueError("weekEndDate must be on or after weekStartDate")
2133
+ return DerivedWeekWindow(week_start=ws, week_end=we)
2134
+ if ws and not we:
2135
+ return DerivedWeekWindow(week_start=ws, week_end=ws + dt.timedelta(days=6))
2136
+
2137
+ captured_raw = payload.get("capturedAt")
2138
+ if isinstance(captured_raw, str) and captured_raw.strip():
2139
+ try:
2140
+ captured_dt = dt.datetime.fromisoformat(captured_raw.replace("Z", "+00:00"))
2141
+ if captured_dt.tzinfo is None:
2142
+ captured_dt = captured_dt.replace(tzinfo=dt.timezone.utc)
2143
+ except ValueError:
2144
+ # internal fallback: host-local intentional
2145
+ captured_dt = dt.datetime.now().astimezone()
2146
+ else:
2147
+ # internal fallback: host-local intentional
2148
+ captured_dt = dt.datetime.now().astimezone()
2149
+
2150
+ start, end = compute_week_bounds(captured_dt, week_start_name)
2151
+ return DerivedWeekWindow(week_start=start, week_end=end)
2152
+
2153
+
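Illustrative sketch, not part of the cctally diff: the calendar-date fork described in `_derive_week_from_payload`. The timestamp and the fixed UTC-7 offset are assumptions chosen to make the effect visible; the fixed offset stands in for a host-local zone.

```python
import datetime as dt

# Hypothetical week start: 02:00 UTC on 9 May 2026.
moment = dt.datetime(2026, 5, 9, 2, 0, tzinfo=dt.timezone.utc)

# Stand-in for a host whose local zone is seven hours behind UTC.
host_local = dt.timezone(dt.timedelta(hours=-7))

as_parsed = moment.astimezone(host_local)  # what a host-local parse would yield
local_date = as_parsed.date()              # date in the host's zone
utc_date = as_parsed.astimezone(dt.timezone.utc).date()  # re-anchored to UTC
```

Both values describe the same instant, but only `utc_date` gives a bucket key that is independent of the process's `TZ`.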
2154
+ def insert_usage_snapshot(payload: dict[str, Any], week_start_name: str) -> dict[str, Any]:
2155
+ weekly_percent = _safe_float(payload.get("weeklyPercent"))
2156
+ captured_at, captured_at_dt = _coerce_payload_captured_at(payload)
2157
+
2158
+ page_url = payload.get("pageUrl") if isinstance(payload.get("pageUrl"), str) else None
2159
+ source = payload.get("source") if isinstance(payload.get("source"), str) else "userscript"
2160
+
2161
+ five_hour_percent = payload.get("fiveHourPercent")
2162
+ if five_hour_percent is not None:
2163
+ five_hour_percent = float(five_hour_percent)
2164
+ five_hour_resets_at = payload.get("fiveHourResetsAt")
2165
+ if five_hour_resets_at is not None:
2166
+ five_hour_resets_at = str(five_hour_resets_at)
2167
+ five_hour_window_key = payload.get("fiveHourWindowKey")
2168
+ if five_hour_window_key is not None:
2169
+ try:
2170
+ five_hour_window_key = int(five_hour_window_key)
2171
+ except (TypeError, ValueError) as exc:
2172
+ # Loud-skip on first failure only (module-level guard) so a
2173
+ # misbehaving caller doesn't spam the log on every insert.
2174
+ global _logged_window_key_coerce_failure
2175
+ if not _logged_window_key_coerce_failure:
2176
+ # Use the local (already extracted from the payload a few lines
2177
+ # above) instead of re-reading; payload is mutable and
2178
+ # could in principle change between extraction and the
2179
+ # except branch.
2180
+ eprint(
2181
+ f"[record-usage] fiveHourWindowKey coerce failed "
2182
+ f"(got {type(five_hour_window_key).__name__}: "
2183
+ f"{five_hour_window_key!r}); "
2184
+ f"5h DB clamp will be skipped for this row: {exc}"
2185
+ )
2186
+ _logged_window_key_coerce_failure = True
2187
+ five_hour_window_key = None
2188
+
2189
+ conn = open_db()
2190
+ try:
2191
+ week_window = _derive_week_from_payload(payload, week_start_name)
2192
+
2193
+ # Use the canonical boundary already established for this week_start_date.
2194
+ # This prevents relative-reset drift from creating duplicate weeks.
2195
+ date_str = week_window.week_start.isoformat()
2196
+ canon_start, canon_end = _get_canonical_boundary_for_date(conn, date_str)
2197
+ if canon_start and canon_end:
2198
+ week_window = DerivedWeekWindow(
2199
+ week_start=week_window.week_start,
2200
+ week_end=week_window.week_end,
2201
+ week_start_at=canon_start,
2202
+ week_end_at=canon_end,
2203
+ )
2204
+
2205
+ week_start = week_window.week_start
2206
+ week_end = week_window.week_end
2207
+
2208
+ cur = conn.execute(
2209
+ """
2210
+ INSERT INTO weekly_usage_snapshots
2211
+ (
2212
+ captured_at_utc,
2213
+ week_start_date,
2214
+ week_end_date,
2215
+ week_start_at,
2216
+ week_end_at,
2217
+ weekly_percent,
2218
+ page_url,
2219
+ source,
2220
+ payload_json,
2221
+ five_hour_percent,
2222
+ five_hour_resets_at,
2223
+ five_hour_window_key
2224
+ )
2225
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
2226
+ """,
2227
+ (
2228
+ captured_at,
2229
+ week_start.isoformat(),
2230
+ week_end.isoformat(),
2231
+ week_window.week_start_at,
2232
+ week_window.week_end_at,
2233
+ weekly_percent,
2234
+ page_url,
2235
+ source,
2236
+ json.dumps(payload, separators=(",", ":")),
2237
+ five_hour_percent,
2238
+ five_hour_resets_at,
2239
+ five_hour_window_key,
2240
+ ),
2241
+ )
2242
+ conn.commit()
2243
+ snapshot_id = int(cur.lastrowid)
2244
+ finally:
2245
+ conn.close()
2246
+
2247
+ out = {
2248
+ "id": snapshot_id,
2249
+ "capturedAt": captured_at,
2250
+ "weekStartDate": week_start.isoformat(),
2251
+ "weekEndDate": week_end.isoformat(),
2252
+ "weeklyPercent": weekly_percent,
2253
+ }
2254
+ if week_window.week_start_at:
2255
+ out["weekStartAt"] = week_window.week_start_at
2256
+ if week_window.week_end_at:
2257
+ out["weekEndAt"] = week_window.week_end_at
2258
+ if isinstance(payload.get("resetText"), str):
2259
+ out["resetText"] = payload["resetText"]
2260
+ if five_hour_percent is not None:
2261
+ out["fiveHourPercent"] = five_hour_percent
2262
+ if five_hour_resets_at is not None:
2263
+ out["fiveHourResetsAt"] = five_hour_resets_at
2264
+ if five_hour_window_key is not None:
2265
+ out["fiveHourWindowKey"] = five_hour_window_key
2266
+ return out
2267
+
2268
+
2269
+ def _saved_dict_from_usage_row(row: sqlite3.Row) -> dict[str, Any]:
2270
+ """Mirror ``insert_usage_snapshot``'s output dict from an existing
2271
+ weekly_usage_snapshots row. Used by ``cmd_record_usage``'s dedup
2272
+ self-heal path so ``maybe_record_milestone`` and
2273
+ ``maybe_update_five_hour_block`` can re-run on the latest snapshot
2274
+ when an earlier invocation was killed between snapshot insert and
2275
+ milestone insert (e.g. CC self-update kill window, 2026-05-08).
2276
+
2277
+ Field omissions match ``insert_usage_snapshot``: keys whose values
2278
+ would be ``None`` are not emitted, so downstream ``saved.get(...)``
2279
+ callers see the same shape they'd see on a fresh insert.
2280
+
2281
+ Note: ``resetText`` (the only userscript-payload-only key
2282
+ ``insert_usage_snapshot`` re-emits in its output dict) is
2283
+ intentionally omitted — no downstream ``saved``-dict consumer in
2284
+ this codebase reads it. ``pageUrl`` is a column on
2285
+ ``weekly_usage_snapshots`` but is never propagated into the output
2286
+ dict on either path.
2287
+ """
2288
+ out: dict[str, Any] = {
2289
+ "id": int(row["id"]),
2290
+ "capturedAt": row["captured_at_utc"],
2291
+ "weekStartDate": row["week_start_date"],
2292
+ "weekEndDate": row["week_end_date"],
2293
+ "weeklyPercent": float(row["weekly_percent"]),
2294
+ }
2295
+ if row["week_start_at"] is not None:
2296
+ out["weekStartAt"] = row["week_start_at"]
2297
+ if row["week_end_at"] is not None:
2298
+ out["weekEndAt"] = row["week_end_at"]
2299
+ if row["five_hour_percent"] is not None:
2300
+ out["fiveHourPercent"] = float(row["five_hour_percent"])
2301
+ if row["five_hour_resets_at"] is not None:
2302
+ out["fiveHourResetsAt"] = row["five_hour_resets_at"]
2303
+ if row["five_hour_window_key"] is not None:
2304
+ out["fiveHourWindowKey"] = int(row["five_hour_window_key"])
2305
+ return out
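The omit-when-``None`` convention described in the docstring above can be illustrated with a minimal, self-contained sketch. It is not the module's code: the table name ``demo_snapshots``, the helper ``saved_dict_sketch``, and the reduced column set are all hypothetical, chosen only to show why a downstream ``saved.get(...)`` caller sees the same shape whether the dict came from a fresh insert or from an existing row.

```python
import sqlite3
from typing import Any


def saved_dict_sketch(row: sqlite3.Row) -> dict[str, Any]:
    """Hypothetical, trimmed-down mirror: only a few columns are covered."""
    out: dict[str, Any] = {
        "id": int(row["id"]),
        "weeklyPercent": float(row["weekly_percent"]),
    }
    # Optional columns: emit the key only when the stored value is not NULL,
    # so a missing key and a NULL column look identical to saved.get(...).
    optional = {
        "five_hour_percent": ("fiveHourPercent", float),
        "five_hour_window_key": ("fiveHourWindowKey", int),
    }
    for column, (key, cast) in optional.items():
        if row[column] is not None:
            out[key] = cast(row[column])
    return out


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row["column_name"] access
conn.execute(
    "CREATE TABLE demo_snapshots ("
    "id INTEGER PRIMARY KEY, weekly_percent REAL, "
    "five_hour_percent REAL, five_hour_window_key INTEGER)"
)
insert_sql = (
    "INSERT INTO demo_snapshots "
    "(weekly_percent, five_hour_percent, five_hour_window_key) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert_sql, (42.5, None, None))        # no 5h data on this tick
conn.execute(insert_sql, (43.0, 12.0, 1700000000))  # 5h data present
rows = conn.execute("SELECT * FROM demo_snapshots ORDER BY id").fetchall()

sparse = saved_dict_sketch(rows[0])
full = saved_dict_sketch(rows[1])
conn.close()
```

In this sketch ``sparse`` carries only ``id`` and ``weeklyPercent``, while ``full`` also carries the two five-hour keys, which is the shape parity the docstring describes for the dedup self-heal path.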