projecta-rrr 1.21.9 → 1.22.0

Files changed (51)
  1. package/CHANGELOG.md +107 -0
  2. package/agents/rrr-planner.md +1 -1
  3. package/bin/install.js +24 -1
  4. package/commands/rrr/add-phase.md +17 -0
  5. package/commands/rrr/audit-milestone.md +1 -0
  6. package/commands/rrr/complete-milestone.md +10 -0
  7. package/commands/rrr/create-roadmap.md +1 -0
  8. package/commands/rrr/execute-phase.md +15 -0
  9. package/commands/rrr/execute-plan.md +15 -0
  10. package/commands/rrr/insert-phase.md +17 -0
  11. package/commands/rrr/new-milestone.md +3 -0
  12. package/commands/rrr/plan-milestone-gaps.md +1 -0
  13. package/commands/rrr/plan-phase.md +16 -0
  14. package/commands/rrr/remove-phase.md +1 -0
  15. package/commands/rrr/research-project.md +4 -0
  16. package/commands/rrr/savings.md +68 -0
  17. package/commands/rrr/ship.md +12 -0
  18. package/commands/rrr/verify-work.md +21 -1
  19. package/docs/escalations-schema.md +112 -0
  20. package/docs/model-router.md +167 -0
  21. package/docs/testing.md +127 -0
  22. package/docs/v1.22-load-test.md +81 -0
  23. package/hooks/edit-batching-nudge.js +188 -0
  24. package/hooks/model-router.js +204 -0
  25. package/hooks/tool-disallow.js +131 -0
  26. package/package.json +4 -3
  27. package/rrr/lib/auto-push.js +183 -0
  28. package/rrr/lib/auto-push.test.js +146 -0
  29. package/rrr/lib/hud/freshness-segment.js +118 -0
  30. package/rrr/lib/hud/freshness-segment.test.js +125 -0
  31. package/rrr/lib/install-hooks-wiring.js +141 -1
  32. package/rrr/lib/router/detect-stage.js +93 -0
  33. package/rrr/lib/router/escalation.js +112 -0
  34. package/rrr/lib/router/load-settings.js +137 -0
  35. package/rrr/lib/router/tier-map.js +122 -0
  36. package/rrr/lib/savings/hud-segment.js +67 -0
  37. package/rrr/lib/savings/index.js +286 -0
  38. package/rrr/lib/state/debug-session.js +117 -0
  39. package/rrr/lib/state/executor-retry.js +123 -0
  40. package/rrr/lib/test-shim.js +106 -0
  41. package/rrr/references/semantic-search-preference.md +19 -0
  42. package/scripts/backfill-summaries.js +397 -0
  43. package/scripts/bnch-02-redo.js +245 -0
  44. package/scripts/k6-cross-user-isolation.js +149 -0
  45. package/scripts/measure-opus-rate.js +115 -0
  46. package/scripts/measure-token-delta.js +176 -0
  47. package/scripts/prepublish-check.js +48 -13
  48. package/scripts/rrr-savings.js +81 -0
  49. package/scripts/run-k6-isolation.sh +75 -0
  50. package/scripts/snapshot-infra.js +100 -0
  51. package/scripts/test-prepublish-block.js +113 -0
package/CHANGELOG.md CHANGED
@@ -4,6 +4,113 @@ All notable changes to RRR will be documented in this file.
 
  Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 
+ ## [1.22.0] - 2026-04-19
+
+ **Webhook Live Reindex + Dynamic Routing + Token Diet v2.**
+
+ 23 of 24 Active requirements shipped across 6 phases (80, 82–86); 1 deferred to v1.23 (TOK-04).
+
+ ### Added — Phase 80 (Webhook & Push Policy)
+
+ - **App-level GitHub webhook** `POST /webhooks/github-app` — single URL fans out to all repos via `installation.id` lookup. Reuses HMAC + 3-layer dedup from v1.21 (both routes coexist).
+ - **Migration `014-installations.sql`** — `installations(installation_id, team_id)` + `repos.github_repo_id` (PF-WEBH-01 rename-resilient routing).
+ - **PUSH-POLICY in 10 RRR command lifecycle skills** (plan-phase, execute-plan, execute-phase, verify-work-on-pass, complete-milestone, new-milestone, add-phase, insert-phase, remove-phase, ship) — every commit auto-pushes.
+ - **`rrr/lib/auto-push.js`** with `settings.rrr.auto_push` opt-out + non-fatal failure surfacing.
+ - **`rrr/lib/hud/freshness-segment.js`** + `rrr/hosted-mcp/docs/freshness.md` onboarding.
+
+ ### Added — Phase 82 (Framework Hygiene)
+
+ - **SUMMARY.md backfill** across 184 legacy plans (`scripts/backfill-summaries.js`); 175 appended canonical sections, 10 marked `legacy_exempt`.
+ - **LINT-BLOCK-01** — `prepublish-check.js` §12/§13 WARN→BLOCK with 7-day grace.
+ - **TEST-UNIFY-01** — consolidated on `node --test` (D-v1.22-04); jest-compat shim retained for legacy `__tests__/`.
+
+ ### Added — Phase 83 (Dynamic Routing Core)
+
+ - **`hooks/model-router.js`** — PreToolUse hook rewrites `Task.model` per canonical tier map (Pattern A confirmed). Fail-OPEN; logs every dispatch to `~/.rrr/telemetry/escalations.jsonl`.
+ - **`rrr/lib/router/{tier-map,detect-stage,load-settings}.js`** — composer + stage-aware (planning/execution/verification per-stage).
+ - **`settings.rrr.model_router: "dynamic"`** opt-in (default `static` preserves v1.21 bit-for-bit; 13 fixture-replay regressions).
+ - Installer wiring via `wireModelRouter()`.
+
+ ### Added — Phase 84 (Model Escalation)
+
+ - **`rrr-debugger` iter-3+ → opus** + **`rrr-executor` 2nd-retry → opus** layered on `decideTier()`.
+ - **`rrr/lib/state/{debug-session,executor-retry}.js`** atomic per-plan counters.
+ - **`escalations.jsonl` schema v2** (`retry_count`, `prior_tier`); `docs/escalations-schema.md`.
+
+ ### Added — Phase 85 (Token Diet v2)
+
+ - **`hooks/tool-disallow.js`** — PreToolUse BLOCK for stock Read/Grep/Glob on main thread (opt-in `settings.rrr.tok_disallow`; installer/subagent/env exemptions).
+ - **`hooks/edit-batching-nudge.js`** — PostToolUse advisory on ≥2 same-file edits within 60s.
+ - **`/rrr:savings`** + `scripts/rrr-savings.js` + `rrr/lib/savings/{index,hud-segment}.js` — session/lifetime token-reduction + tier distribution + Opus-rate dashboard.
+ - **TOK-04 deferred to v1.23** (measurement gate: 0 observation JSONLs in-repo).
+
+ ### Added — Phase 86 (SHIP GATE)
+
+ - **SHIP-01 PASS** — `boot-e2e-first-time-index.test.js` (closes v1.21.6–8 root cause) + folds dropped Phase 81 SRCH-01 cross-repo contamination assertion.
+ - `scripts/bnch-02-redo.js` + 2 per-repo `queries.json` — Haiku hit-rate@5 measurement (deferred-operator).
+ - `scripts/measure-token-delta.js` — three-number falsifiable comparator (`actual ≤ 0.80 × static_all_sonnet`).
+ - `scripts/measure-opus-rate.js` — strict <15% gate.
+ - `scripts/k6-cross-user-isolation.js` + `scripts/run-k6-isolation.sh` + `docs/v1.22-load-test.md` — multi-tenant RLS load test (deferred-operator).
+ - 48 Phase 86 tests + 239 root suite + 30 hosted-mcp targeted, all green.
+
+ ### Added — Infrastructure
+
+ - **`.planning/INFRA-STATE.md`** — authoritative deployed-infra snapshot. `CLAUDE.md` + `PROJECT.md` mandate reading it before flagging "operator-step".
+ - **`scripts/snapshot-infra.js`** — Fly + Neon verification runner.
+
+ ### Changed
+
+ - Phase 81 (Search Hardening) **dropped** — SRCH-01 already shipped in v1.21 76-03. Verification folded into Phase 86 SHIP-01.
+
+ ### Deferred to v1.23
+
+ TOK-04 (terse Read MCP); BROW-01/02 (gsd-browser eval); AUTH-U-01/02; RERK-01 (Voyage rerank, conditional on SHIP-02); CTXP-01..04; LRNG-01..04; XREP-01..02. Full list in `.planning/REQUIREMENTS.md`.
+
+ ## [1.21.10] - 2026-04-18
+
+ **semantic_search injection waves 2 + 3.**
+
+ Completes the token-saving rollout started in v1.21.9. Waves 2 and 3 target
+ commands that auditor identified as MED priority but still carry significant
+ token cost in practice — milestone audits, gap planning, roadmap creation,
+ project research, and the planner agent itself.
+
+ ### Added
+
+ - **semantic_search in `rrr-planner` agent.** Planner previously grepped
+ CONTEXT.md and ROADMAP sections during task breakdown — now prefers
+ ~500-token semantic_search for "existing implementations", "similar
+ patterns", "existing code shape" queries.
+
+ - **semantic_search reference in 4 more commands:** `/rrr:audit-milestone`,
+ `/rrr:plan-milestone-gaps`, `/rrr:create-roadmap`, `/rrr:research-project`.
+ These don't run every session but they dominate milestone boundaries —
+ typical burn 12-22K each before this change.
+
+ - **`semantic-search-preference.md` domain-query patterns (wave 3).**
+ 10-row table: tech stack, architecture, similar implementation, error
+ root-cause, requirement coverage, auth/security, data model, test
+ coverage, API surface, config. With suggested k per query. Purpose:
+ stop Claude from grepping for "middleware" when a better-phrased
+ semantic_search would nail it in one call.
+
+ ### Status after waves 1-3
+
+ Commands using semantic_search: **13 of 23** (was 5). The 10 remaining
+ untouched commands do minimal code exploration (<5K tokens each) —
+ diminishing returns beyond here.
+
+ Agents using semantic_search: **5 of ~15** — the five highest-burn ones
+ (rrr-codebase-mapper, rrr-phase-researcher, rrr-executor, rrr-debugger,
+ rrr-planner). These carry ~80% of agent-side exploration cost.
+
+ ### Next (v1.22.0)
+
+ Webhook-driven re-index infrastructure (GitHub App single-URL webhook +
+ local post-commit hook → local MCP). Closes the "stale index" gap — today
+ reindex only runs on cron or manual sync_repo. Needs a phase plan; not a
+ patch.
+
  ## [1.21.9] - 2026-04-18
 
  **Token-efficient RRR: semantic_search injection wave 1 + hosted worker hardening.**
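Reviewer note: the two Phase 86 numeric gates quoted in the changelog (`actual ≤ 0.80 × static_all_sonnet` and the strict `<15%` Opus rate) can be read as the checks sketched below. This is only an illustration of the inequalities as written. The function names, argument shapes, and the empty-input behaviour are assumptions, and the actual `scripts/measure-token-delta.js` and `scripts/measure-opus-rate.js` are not shown in this part of the diff.

```javascript
// Hypothetical helpers illustrating the two gates quoted in the changelog.
// Names and inputs are assumptions, not the package's real API.

// Token-delta gate: passes when actual <= 0.80 * static_all_sonnet.
// Integer arithmetic avoids floating-point surprises at the boundary.
function passesTokenDeltaGate(actualTokens, staticAllSonnetTokens) {
  return actualTokens * 100 <= 80 * staticAllSonnetTokens;
}

// Opus-rate gate: strict, so a rate of exactly 15% fails.
function passesOpusRateGate(opusDispatches, totalDispatches) {
  // Assumption: with no dispatches recorded there is nothing to fail on.
  if (totalDispatches === 0) return true;
  return opusDispatches * 100 < 15 * totalDispatches;
}

console.log(passesTokenDeltaGate(800, 1000)); // at the 0.80 boundary
console.log(passesOpusRateGate(15, 100));     // at the 15% boundary
```

The boundary cases are the interesting ones: the token gate is inclusive, while the Opus gate is exclusive.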
package/agents/rrr-planner.md CHANGED
@@ -2,7 +2,7 @@
  name: rrr-planner
  description: Creates executable phase plans with task breakdown, dependency analysis, and goal-backward verification. Spawned by /rrr:plan-phase orchestrator.
  model: sonnet
- tools: Read, Write, Bash, Glob, Grep, WebFetch, mcp__context7__*
+ tools: Read, Write, Bash, Glob, Grep, WebFetch, mcp__context7__*, mcp__rrr-search-hosted__semantic_search, mcp__rrr-search__semantic_search
  color: green
  ---
 
package/bin/install.js CHANGED
@@ -383,7 +383,8 @@ function runHostedInstallOrchestrator() {
  writePendingMerge
  } = require('../rrr/lib/install-merge-agents');
  const {
- wireToolRedirect
+ wireToolRedirect,
+ wireModelRouter
  } = require('../rrr/lib/install-hooks-wiring');
 
  const repoRoot = path.join(__dirname, '..');
@@ -533,6 +534,28 @@ function runHostedInstallOrchestrator() {
  audit.hooks_wired = { error: err.message };
  }
 
+ // 6b) Wire the Phase 83 model-router hook (MDLR-03).
+ // Seeds `rrr.model_router: "static"` if not present — default v1.21 behavior.
+ let routerResult = null;
+ try {
+ routerResult = wireModelRouter({
+ repoRoot,
+ userHooksDir,
+ userClaudeDir,
+ userSettingsPath,
+ dryRun: hasDryRun
+ });
+ audit.model_router_wired = {
+ copied: routerResult.copied,
+ settings_updated: routerResult.settingsUpdated,
+ hook_command: routerResult.hookCommand,
+ libs_copied: routerResult.libsCopied
+ };
+ } catch (err) {
+ console.error(` ${yellow}⚠ model-router wiring failed (non-fatal): ${err.message}${reset}`);
+ audit.model_router_wired = { error: err.message };
+ }
+
  // 7) Write (or preview) the merged registry.
  if (hasDryRun) {
  console.log(` ${cyan}[dry-run]${reset} Planned actions:`);
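Reviewer note: the comment in step 6b says the wiring seeds `rrr.model_router: "static"` only when the key is absent. The body of `wireModelRouter()` lives in `rrr/lib/install-hooks-wiring.js` and is not shown in this hunk, so the following is only a sketch of that seeding rule. `seedModelRouterDefault` is a hypothetical name.

```javascript
// Hypothetical helper: a sketch of "seed the default only if not present".
// The real wireModelRouter() implementation is not visible in this hunk.
function seedModelRouterDefault(settings) {
  // Copy instead of mutating so a dry run can inspect the result safely.
  const rrr = { ...(settings.rrr || {}) };
  const seeded = rrr.model_router === undefined;
  if (seeded) {
    rrr.model_router = 'static'; // v1.21-compatible default per the diff comment
  }
  return { settings: { ...settings, rrr }, seeded };
}

// An existing opt-in is left untouched.
console.log(seedModelRouterDefault({ rrr: { model_router: 'dynamic' } }));
// A fresh settings object gets the static default.
console.log(seedModelRouterDefault({}));
```

Returning a `seeded` flag alongside the new object is one way an installer could populate something like `settings_updated` in the audit record, though that mapping is an assumption here.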
package/commands/rrr/add-phase.md CHANGED
@@ -411,3 +411,20 @@ Phase addition is complete when:
  - [ ] Next phase number calculated correctly (ignoring decimals)
  - [ ] User informed of next steps
  </success_criteria>
+
+ <post_command_auto_push>
+ **When the user commits the new phase entry (PUSH-01 — paired with Phase 80 webhook):**
+
+ This command intentionally does NOT commit (see anti_patterns). After the
+ user commits the new ROADMAP.md / STATE.md / phase-dir changes, prompt
+ them to push so the hosted MCP reindexes within ~60s:
+
+ ```bash
+ # After `git commit -m "docs: add phase {NN}"`:
+ node rrr/lib/auto-push.js --source=add-phase || true
+ ```
+
+ Honors `settings.rrr.auto_push` (default `true`); on failure prints
+ non-fatal warning. To opt out: `{"rrr": {"auto_push": false}}` in
+ `~/.claude/settings.json`.
+ </post_command_auto_push>
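Reviewer note: the same opt-out wording ("Honors `settings.rrr.auto_push` (default `true`)") recurs in every lifecycle command below. One plausible reading of that rule is sketched here. `isAutoPushEnabled` is a hypothetical name, and treating only a literal `false` as an opt-out is an assumption; `rrr/lib/auto-push.js` itself is not shown in this part of the diff.

```javascript
// Hypothetical helper: resolves the documented default-true opt-out.
// Assumption: only an explicit boolean false disables auto-push.
function isAutoPushEnabled(settings) {
  const rrr = settings && typeof settings === 'object' ? settings.rrr : undefined;
  const value = rrr && typeof rrr === 'object' ? rrr.auto_push : undefined;
  return value !== false;
}

console.log(isAutoPushEnabled({}));                             // key absent
console.log(isAutoPushEnabled({ rrr: { auto_push: false } }));  // explicit opt-out
```

Defaulting to enabled when the settings file is missing or malformed matches the "non-fatal" posture the commands describe, but a stricter implementation could equally refuse to push in that case.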
package/commands/rrr/audit-milestone.md CHANGED
@@ -19,6 +19,7 @@ Verify milestone achieved its definition of done. Check requirements coverage, c
 
  <execution_context>
  @~/.claude/rrr/references/principles.md
+ @rrr/references/semantic-search-preference.md
  </execution_context>
 
  <context>
package/commands/rrr/complete-milestone.md CHANGED
@@ -148,6 +148,16 @@ Output: Milestone archived (roadmap + requirements), PROJECT.md evolved, git tag
  - Tag: `git tag -a v{{version}} -m "[milestone summary]"`
  - Ask about pushing tag
 
+ 8.5. **Auto-push** (PUSH-01 — paired with Phase 80 webhook for <60s reindex):
+
+ ```bash
+ node rrr/lib/auto-push.js --source=complete-milestone || true
+ ```
+
+ Honors `settings.rrr.auto_push` (default `true`); on failure prints
+ non-fatal warning and continues. To opt out:
+ `{"rrr": {"auto_push": false}}` in `~/.claude/settings.json`.
+
  9. **Offer next steps:**
  - `/rrr:discuss-milestone` — thinking partner, creates context file
  - Then `/rrr:new-milestone` — update PROJECT.md with new goals
package/commands/rrr/create-roadmap.md CHANGED
@@ -40,6 +40,7 @@ Roadmaps define what work happens in what order. Phases map to requirements.
  @~/.claude/rrr/templates/roadmap.md
  @~/.claude/rrr/templates/state.md
  @~/.claude/rrr/references/goal-backward.md
+ @rrr/references/semantic-search-preference.md
  </execution_context>
 
  <context>
package/commands/rrr/execute-phase.md CHANGED
@@ -598,3 +598,18 @@ After all plans in phase complete (step 7):
  - [ ] REQUIREMENTS.md updated (phase requirements marked Complete)
  - [ ] User informed of next steps
  </success_criteria>
+
+ <post_command_auto_push>
+ **Final step (PUSH-01 — paired with Phase 80 webhook for <60s reindex):**
+
+ After all wave plans complete and the phase-level rollup commit lands,
+ nudge the hosted MCP reindex:
+
+ ```bash
+ node rrr/lib/auto-push.js --source=execute-phase || true
+ ```
+
+ Honors `settings.rrr.auto_push` (default `true`); on failure prints
+ non-fatal warning and continues. To opt out:
+ `{"rrr": {"auto_push": false}}` in `~/.claude/settings.json`.
+ </post_command_auto_push>
package/commands/rrr/execute-plan.md CHANGED
@@ -882,3 +882,18 @@ Continue handling returns until "## PLAN COMPLETE" or user stops.
  - [ ] If phase complete: REQUIREMENTS.md updated (phase requirements marked Complete)
  - [ ] User informed of completion and next steps
  </success_criteria>
+
+ <post_command_auto_push>
+ **Final step (PUSH-01 — paired with Phase 80 webhook for <60s reindex):**
+
+ After plan completes (executor's per-task commits + SUMMARY.md commit have
+ landed locally), nudge the hosted MCP reindex:
+
+ ```bash
+ node rrr/lib/auto-push.js --source=execute-plan || true
+ ```
+
+ Honors `settings.rrr.auto_push` (default `true`); on failure prints
+ non-fatal warning and continues. To opt out:
+ `{"rrr": {"auto_push": false}}` in `~/.claude/settings.json`.
+ </post_command_auto_push>
package/commands/rrr/insert-phase.md CHANGED
@@ -414,3 +414,20 @@ Phase insertion is complete when:
  - [ ] Decimal number calculated correctly (based on existing decimals)
  - [ ] User informed of next steps and dependency implications
  </success_criteria>
+
+ <post_command_auto_push>
+ **When the user commits the inserted phase entry (PUSH-01 — paired with Phase 80 webhook):**
+
+ This command intentionally does NOT commit (see anti_patterns). After the
+ user commits the new ROADMAP.md / STATE.md / phase-dir changes, prompt
+ them to push so the hosted MCP reindexes within ~60s:
+
+ ```bash
+ # After `git commit -m "docs: insert phase {N.M}"`:
+ node rrr/lib/auto-push.js --source=insert-phase || true
+ ```
+
+ Honors `settings.rrr.auto_push` (default `true`); on failure prints
+ non-fatal warning. To opt out: `{"rrr": {"auto_push": false}}` in
+ `~/.claude/settings.json`.
+ </post_command_auto_push>
package/commands/rrr/new-milestone.md CHANGED
@@ -103,6 +103,7 @@ Milestone name: $ARGUMENTS (optional - will prompt if not provided)
  ```bash
  git add "$PATCHES_FILE"
  git commit -m "docs: close v${PREV_MILESTONE} patches (starting v{NEW_VERSION})"
+ node rrr/lib/auto-push.js --source=new-milestone-patches || true # PUSH-01
  ```
 
  **If no patches (PATCH_COUNT = 0 or file doesn't exist):**
@@ -217,6 +218,7 @@ Milestone name: $ARGUMENTS (optional - will prompt if not provided)
  ```bash
  git add .planning/PROJECT.md .planning/STATE.md
  git commit -m "docs: start milestone v[X.Y] [Name]"
+ node rrr/lib/auto-push.js --source=new-milestone || true # PUSH-01 (auto-push for hosted MCP reindex)
  ```
 
  8. **Milestone research decision:**
@@ -321,6 +323,7 @@ Milestone name: $ARGUMENTS (optional - will prompt if not provided)
  ```bash
  git add ".planning/milestones/v${NEW_VERSION}/research/"
  git commit -m "docs: research milestone v${NEW_VERSION} features"
+ node rrr/lib/auto-push.js --source=new-milestone-research || true # PUSH-01
  ```
 
  **If "No, define requirements directly":** Skip research, continue to step 9.
package/commands/rrr/plan-milestone-gaps.md CHANGED
@@ -22,6 +22,7 @@ One command creates all fix phases — no manual `/rrr:add-phase` per gap.
  @~/.claude/rrr/references/principles.md
  @~/.claude/rrr/workflows/plan-phase.md
  @~/.claude/rrr/lib/phase-paths.md
+ @rrr/references/semantic-search-preference.md
  </execution_context>
 
  <context>
package/commands/rrr/plan-phase.md CHANGED
@@ -941,3 +941,19 @@ Verification: {Passed | Passed with override | Skipped}
  - [ ] User sees status between agent spawns
  - [ ] User knows next steps (execute or review)
  </success_criteria>
+
+ <post_command_auto_push>
+ **Final step (PUSH-01 — paired with Phase 80 webhook for <60s reindex):**
+
+ After plans are created and any planning commits have landed, nudge the
+ hosted MCP reindex so subsequent `semantic_search` calls see the new
+ PLAN.md files:
+
+ ```bash
+ node rrr/lib/auto-push.js --source=plan-phase || true
+ ```
+
+ Honors `settings.rrr.auto_push` (default `true`); on failure prints
+ non-fatal warning and continues. To opt out:
+ `{"rrr": {"auto_push": false}}` in `~/.claude/settings.json`.
+ </post_command_auto_push>
package/commands/rrr/remove-phase.md CHANGED
@@ -261,6 +261,7 @@ Stage and commit the removal:
  ```bash
  git add .planning/
  git commit -m "chore: remove phase {target} ({original-phase-name})"
+ node rrr/lib/auto-push.js --source=remove-phase || true # PUSH-01 (auto-push for hosted MCP reindex)
  ```
 
  The commit message preserves the historical record of what was removed.
package/commands/rrr/research-project.md CHANGED
@@ -23,6 +23,10 @@ For new projects, use /rrr:new-project instead.
  Deprecated: 2026-01-16
  -->
 
+ <execution_context>
+ @rrr/references/semantic-search-preference.md
+ </execution_context>
+
  <objective>
  Research domain ecosystem. Spawns 4 parallel rrr-project-researcher agents for comprehensive coverage.
 
package/commands/rrr/savings.md ADDED
@@ -0,0 +1,68 @@
+ ---
+ name: rrr:savings
+ description: Show token-savings report (session + lifetime) with tier distribution and Opus-rate dashboard
+ ---
+
+ <objective>
+ Display the RRR token-savings report computed from observation JSONL files,
+ the optional escalations log, and the optional tier map. Defaults to lifetime
+ aggregation; pass `--session <id>` to scope to one session.
+ </objective>
+
+ <behavior>
+
+ ## Step 1: Run the savings script
+
+ ```bash
+ node scripts/rrr-savings.js
+ ```
+
+ This produces a human-readable report with:
+
+ - **Tool calls** and edits batched
+ - **Tier distribution** (Haiku / Sonnet / Opus call counts + percentages)
+ - **Opus rate** with status badge: OK (≤10%), WARN (10–15%), CRITICAL (>15%)
+ - **Cost (USD)** measured vs all-Opus counterfactual baseline (PF-TOK-03 two-number reporting)
+ - **v1.21 baseline** reduction (99.74%) for context
+ - **Methodology footer** — token counts are heuristic estimates; for billing-grade numbers run Tier 1 telemetry (`countTokens`) on a representative session
+
+ ## Step 2: Optional filters
+
+ | Flag | Meaning |
+ |------|---------|
+ | `--session <id>` | Filter to a specific session id (from observation `session` field) |
+ | `--lifetime` | (default) Aggregate across all observation files under `.planning/` |
+ | `--json` | Machine-readable output for tooling |
+ | `--hud` | Print only the HUD segment string (`H45% S40% O15% \| Opus:15%`) |
+
+ ## Step 3: Watch the Opus rate
+
+ If the **Opus rate** crosses 15% in lifetime mode, that's the SHIP-GATE-relevant threshold for Phase 86. Investigate via:
+
+ - `.planning/escalations.jsonl` — every escalation event (Phase 84)
+ - `~/.claude/rrr/tier-map.json` — per-command/per-agent tier defaults (Phase 83)
+
+ ## Step 4: Recommended cadence
+
+ - **Per session:** `/rrr:savings --session <id>` after a long task to spot-check.
+ - **Monthly:** `/rrr:savings` (lifetime) — operator step, captures milestone-scale trend.
+ - **HUD:** the `--hud` segment is intended to live alongside the Phase 80 freshness segment in the OhMyPosh / JSON-state pipeline.
+
+ </behavior>
+
+ <inputs_consumed>
+
+ | File | Phase shipped | Tolerated if missing? |
+ |------|---------------|-----------------------|
+ | `.planning/observations-*.jsonl` | v1.19 | Yes — returns zeros |
+ | `.planning/escalations.jsonl` | v1.22 Phase 84 | Yes — falls back to tier-map / all-sonnet default |
+ | `.rrr/tier-map.json` or `~/.claude/rrr/tier-map.json` | v1.22 Phase 83 | Yes — falls back to all-sonnet |
+
+ </inputs_consumed>
+
+ <constraints>
+ - Pure read-only: never modifies observation/escalation files.
+ - Never blocks: always exits 0 even if all inputs are missing.
+ - Methodology is explicit: report header labels token counts as "estimated" to
+ avoid PF-TOK-03 (speculative savings calc) trap.
+ </constraints>
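Reviewer note: the report fields described above (tier percentages, the OK / WARN / CRITICAL badge, and the `--hud` segment) can be derived from three call counts. The sketch below follows the thresholds and the HUD string shape quoted in `savings.md`. The function name, the input shape, and the rounding choice are assumptions, since `rrr/lib/savings/index.js` and `hud-segment.js` are not shown in this part of the diff.

```javascript
// Hypothetical helper: derives the badge and HUD segment from tier counts.
// Thresholds follow savings.md: OK (<=10%), WARN (10-15%), CRITICAL (>15%).
function summarizeTiers(counts) {
  const haiku = counts.haiku || 0;
  const sonnet = counts.sonnet || 0;
  const opus = counts.opus || 0;
  const total = haiku + sonnet + opus;

  // Assumption: percentages are rounded to whole numbers; empty input yields zeros.
  const pct = (n) => (total === 0 ? 0 : Math.round((n * 100) / total));

  // Integer comparisons keep the 10% and 15% boundaries exact.
  let badge;
  if (opus * 100 <= 10 * total) badge = 'OK';
  else if (opus * 100 <= 15 * total) badge = 'WARN';
  else badge = 'CRITICAL';

  const hud = `H${pct(haiku)}% S${pct(sonnet)}% O${pct(opus)}% | Opus:${pct(opus)}%`;
  return { total, badge, hud };
}

console.log(summarizeTiers({ haiku: 45, sonnet: 40, opus: 15 }));
```

Note that under these thresholds a rate of exactly 15% is still WARN, whereas the Phase 86 gate is described elsewhere in this release as strict `<15%`, so the badge and the ship gate disagree at that single boundary value.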
package/commands/rrr/ship.md CHANGED
@@ -246,6 +246,13 @@ On success, display:
  The PR description includes planning context from {N} SUMMARY.md files.
  ```
 
+ After successful PR creation, also nudge the hosted MCP reindex on the
+ base branch (PUSH-01 — paired with Phase 80 webhook):
+
+ ```bash
+ node rrr/lib/auto-push.js --source=ship || true
+ ```
+
  On failure (gh pr create returns error), fall through to Step 4.
 
  **If ANY check fails, go to Step 4 (fallback).**
@@ -283,6 +290,11 @@ Generate `.planning/SHIP-READY.md`:
  2. Create PR at: https://github.com/{remote-owner/repo}/compare/{BASE_BRANCH}...{CURRENT_BRANCH}
  3. Copy the PR Body above into the PR description
 
+ > Note: after merging the PR, run `node rrr/lib/auto-push.js --source=ship-postmerge`
+ > on `main` to nudge the hosted MCP reindex (PUSH-01 — paired with Phase 80
+ > webhook). Or set `rrr.auto_push: true` (default) and run `/rrr:next` —
+ > downstream commands auto-push.
+
  ## Why Manual?
 
  {Reason from above}
package/commands/rrr/verify-work.md CHANGED
@@ -436,4 +436,24 @@ All Glob/Grep/Read operations must exclude these paths.
  - [ ] Manual: presents tests one at a time for user verification
  - [ ] Creates/resumes UAT.md with tests
  - [ ] Tracks progress across context resets
- </success_criteria>
+ </success_criteria>
+
+ <post_command_auto_push>
+ **Final step ON PASS only (PUSH-01 — paired with Phase 80 webhook):**
+
+ When verification result is "Clean" (all SUMMARY.md present + commit
+ hashes), nudge the hosted MCP reindex so the verified state is
+ queryable within ~60s:
+
+ ```bash
+ # Only run when verification PASSED (not on gaps_found / human_needed)
+ node rrr/lib/auto-push.js --source=verify-work-pass || true
+ ```
+
+ If verification reported gaps, do NOT auto-push — let the operator fix
+ gaps first, then re-verify, then push on the next clean pass.
+
+ Honors `settings.rrr.auto_push` (default `true`); on failure prints
+ non-fatal warning and continues. To opt out:
+ `{"rrr": {"auto_push": false}}` in `~/.claude/settings.json`.
+ </post_command_auto_push>
package/docs/escalations-schema.md ADDED
@@ -0,0 +1,112 @@
+ # `escalations.jsonl` schema (Phase 84, v2)
+
+ The `~/.rrr/telemetry/escalations.jsonl` file is the single audit log for
+ RRR's dynamic model router (Phase 83) and the difficulty-triggered
+ escalation layer (Phase 84). One JSON object per line; appended atomically.
+
+ This doc is the canonical reference. Phase 85 `/rrr:savings` consumes it.
+ Phase 86 SHIP-04 (`<15%` Opus rate) is computed from it.
+
+ ## Schema v2
+
+ ```jsonc
+ {
+ // Always present
+ "ts": "2026-04-18T22:33:00.000Z", // ISO-8601 UTC timestamp
+ "event": "dispatch", // see Event enum below
+
+ // Dispatch-context fields (present when event=dispatch)
+ "tool_name": "Task", // matcher scope (always Task in v1.22)
+ "agent_name": "rrr-debugger", // subagent_type from tool_input, or null
+ "command": "debug", // detected /rrr:<command>, or null
+ "stage": "execution", // "planning" | "execution" | "verification"
+ "tier": "opus", // "haiku" | "sonnet" | "opus" | "inherit"
+ "reason": "debugger_iter_escalate", // see Reason enum below
+
+ // Phase 84 escalation extension (present when reason ∈ escalation set)
+ "retry_count": 3, // iter (debugger) or retry_count (executor)
+ "prior_tier": "sonnet", // tier the base decideTier() returned
+
+ // Free-form
+ "details": null, // optional object — never required to read
+
+ // Hook-error context (present when event=hook_error)
+ "error": "...", // truncated message
+ "stack": "..." // truncated stack (unhandled_exception only)
+ }
+ ```
+
+ ### Event enum
+
+ | `event` | Meaning |
+ |---------------|-----------------------------------------------------------|
+ | `dispatch` | Hook routed the Task tool. `tier` + `reason` describe the decision. |
+ | `hook_error` | Hook caught an exception or invariant violation. Fail-OPEN — Claude Code still gets an `allow` response. |
+
+ ### Reason enum (event=dispatch)
+
+ | `reason` | Trigger |
+ |--------------------------------|------------------------------------------------------------------------------------------|
+ | `dispatch` | Plain Phase-83 routing — base `decideTier()` decision applied, no escalation. |
+ | `debugger_iter_escalate` | Phase 84 ESCL-01 — `agent=rrr-debugger` AND iteration ≥ 3 forced tier to `opus`. |
+ | `executor_retry_escalate` | Phase 84 ESCL-02 — `agent=rrr-executor` AND `retry_count` ≥ 2 forced tier to `opus`. |
+ | `manual` | Operator-emitted row (e.g. `/rrr:savings --tag` or external script). Reserved for v1.23+. |
+
+ ### Reason enum (event=hook_error)
+
+ | `reason` | Trigger |
+ |------------------------|--------------------------------------------------------------------|
+ | `router_lib_missing` | One of the router libs failed to require — likely install drift. |
+ | `settings_load_failed` | `loadRouterConfig()` threw — corrupt JSON in settings.json. |
+ | `decide_tier_threw` | `tier-map.decideTier()` threw — bug in escalation rules. |
+ | `escalation_threw` | `escalation.applyEscalation()` threw — fail-OPEN, base passes. |
+ | `unhandled_exception` | Top-level catch — anything else. |
+
+ ## Per-plan vs per-task retry (PF-RETRY-02)
+
+ `retry_count` for `executor_retry_escalate` is **per-plan**: it lives in
+ `<project>/.rrr/state/executor-retry.json` keyed by `plan_id`. Querying
+ the helper with a different `plan_id` returns 0 (retries don't carry
+ across plans). Documented to prevent the silent "monotonic-vs-resetting"
+ ambiguity the v1.22 PITFALLS research flagged.
+
+ For `debugger_iter_escalate`, `retry_count` is the debug-session
+ iteration count (also per-session, in `.rrr/state/debug-session.json`).
+
+ ## Backwards compatibility
+
+ Phase 83 wrote rows with these fields only:
+ `ts`, `event`, optional `tool_name`/`agent`/`command`/`stage`/`tier`/`reason`/`error`.
+
+ Phase 84 adds `retry_count` and `prior_tier`. Existing readers (e.g.
+ `rrr/lib/savings/index.js`, which uses `e.to_tier || e.tier`) tolerate the
+ addition: unknown fields are simply ignored.
+
+ ## Append protocol
+
+ The hook calls `logEscalation(entry)` which:
+
+ 1. Ensures `~/.rrr/telemetry/` exists.
+ 2. Prepends `ts: new Date().toISOString()`.
+ 3. Appends `JSON.stringify(entry) + '\n'` synchronously.
+ 4. **Swallows write errors** — telemetry must never crash the hook
+ (Phase 83 fail-OPEN contract).
+
+ ## Reading the file
+
+ ```bash
+ # Last 20 escalations
+ tail -n 20 ~/.rrr/telemetry/escalations.jsonl | jq .
+
+ # Opus-rate over last 100 dispatches
+ tail -n 100 ~/.rrr/telemetry/escalations.jsonl \
+ | jq -s 'map(select(.event=="dispatch")) | (map(select(.tier=="opus")) | length) / length'
+ ```
+
+ `/rrr:savings` (Phase 85 TOK-03) aggregates this file and prints
+ session + lifetime tier distribution. See `rrr/lib/savings/index.js`.
+
+ ## Rotation
+
+ Not implemented in v1.22. The file grows ~100 bytes/event; ~10 MB/year
+ for active dogfood usage. Track for v1.23 via a small cron rotator.
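Reviewer note: the dispatch Reason enum in the schema doc above describes how the Phase 84 layer overrides the base tier and which extra fields it records. A minimal sketch of that rule follows, assuming the thresholds exactly as documented (debugger iteration ≥ 3, executor `retry_count` ≥ 2). `applyEscalationSketch` and its argument shape are hypothetical; the real `rrr/lib/router/escalation.js` is listed in this release but not shown in this part of the diff.

```javascript
// Hypothetical sketch of the documented escalation rules layered on a base tier.
// Field names in the returned object follow schema v2; everything else is assumed.
function applyEscalationSketch({ agentName, baseTier, retryCount }) {
  const count = retryCount || 0;

  if (agentName === 'rrr-debugger' && count >= 3) {
    return {
      tier: 'opus',
      reason: 'debugger_iter_escalate',
      retry_count: count,
      prior_tier: baseTier,
    };
  }

  if (agentName === 'rrr-executor' && count >= 2) {
    return {
      tier: 'opus',
      reason: 'executor_retry_escalate',
      retry_count: count,
      prior_tier: baseTier,
    };
  }

  // No escalation: the base decision passes through unchanged.
  return { tier: baseTier, reason: 'dispatch' };
}

console.log(applyEscalationSketch({ agentName: 'rrr-debugger', baseTier: 'sonnet', retryCount: 3 }));
console.log(applyEscalationSketch({ agentName: 'rrr-executor', baseTier: 'sonnet', retryCount: 1 }));
```

Omitting `retry_count` and `prior_tier` on plain dispatches matches the schema comment that those fields are present only when the reason is in the escalation set, which also keeps Phase 83 readers working unchanged.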