instar 0.28.78 → 0.28.80

Files changed (71)
  1. package/dashboard/index.html +170 -7
  2. package/dist/commands/init.d.ts.map +1 -1
  3. package/dist/commands/init.js +6 -4
  4. package/dist/commands/init.js.map +1 -1
  5. package/dist/commands/playbook.d.ts.map +1 -1
  6. package/dist/commands/playbook.js +2 -1
  7. package/dist/commands/playbook.js.map +1 -1
  8. package/dist/commands/server.d.ts.map +1 -1
  9. package/dist/commands/server.js +91 -8
  10. package/dist/commands/server.js.map +1 -1
  11. package/dist/commands/setup.d.ts.map +1 -1
  12. package/dist/commands/setup.js +5 -3
  13. package/dist/commands/setup.js.map +1 -1
  14. package/dist/core/Config.d.ts.map +1 -1
  15. package/dist/core/Config.js +2 -1
  16. package/dist/core/Config.js.map +1 -1
  17. package/dist/core/PostUpdateMigrator.d.ts.map +1 -1
  18. package/dist/core/PostUpdateMigrator.js +4 -5
  19. package/dist/core/PostUpdateMigrator.js.map +1 -1
  20. package/dist/core/SessionManager.d.ts +38 -0
  21. package/dist/core/SessionManager.d.ts.map +1 -1
  22. package/dist/core/SessionManager.js +157 -23
  23. package/dist/core/SessionManager.js.map +1 -1
  24. package/dist/core/UpdateChecker.d.ts.map +1 -1
  25. package/dist/core/UpdateChecker.js +3 -1
  26. package/dist/core/UpdateChecker.js.map +1 -1
  27. package/dist/core/UpgradeGuideProcessor.d.ts.map +1 -1
  28. package/dist/core/UpgradeGuideProcessor.js +3 -1
  29. package/dist/core/UpgradeGuideProcessor.js.map +1 -1
  30. package/dist/core/types.d.ts +18 -0
  31. package/dist/core/types.d.ts.map +1 -1
  32. package/dist/core/types.js.map +1 -1
  33. package/dist/lifeline/ServerSupervisor.d.ts.map +1 -1
  34. package/dist/lifeline/ServerSupervisor.js +3 -1
  35. package/dist/lifeline/ServerSupervisor.js.map +1 -1
  36. package/dist/memory/SemanticMemory.d.ts +9 -0
  37. package/dist/memory/SemanticMemory.d.ts.map +1 -1
  38. package/dist/memory/SemanticMemory.js +131 -0
  39. package/dist/memory/SemanticMemory.js.map +1 -1
  40. package/dist/monitoring/PresenceProxy.d.ts +53 -0
  41. package/dist/monitoring/PresenceProxy.d.ts.map +1 -1
  42. package/dist/monitoring/PresenceProxy.js +219 -20
  43. package/dist/monitoring/PresenceProxy.js.map +1 -1
  44. package/dist/scheduler/JobRunHistory.d.ts +6 -0
  45. package/dist/scheduler/JobRunHistory.d.ts.map +1 -1
  46. package/dist/scheduler/JobRunHistory.js +11 -0
  47. package/dist/scheduler/JobRunHistory.js.map +1 -1
  48. package/dist/scheduler/JobScheduler.d.ts +23 -0
  49. package/dist/scheduler/JobScheduler.d.ts.map +1 -1
  50. package/dist/scheduler/JobScheduler.js +84 -0
  51. package/dist/scheduler/JobScheduler.js.map +1 -1
  52. package/dist/server/routes.d.ts.map +1 -1
  53. package/dist/server/routes.js +56 -0
  54. package/dist/server/routes.js.map +1 -1
  55. package/dist/threadline/ThreadlineBootstrap.d.ts.map +1 -1
  56. package/dist/threadline/ThreadlineBootstrap.js +3 -2
  57. package/dist/threadline/ThreadlineBootstrap.js.map +1 -1
  58. package/dist/threadline/relay/ConnectionManager.d.ts.map +1 -1
  59. package/dist/threadline/relay/ConnectionManager.js +34 -7
  60. package/dist/threadline/relay/ConnectionManager.js.map +1 -1
  61. package/package.json +1 -1
  62. package/scripts/pre-push-gate.js +26 -0
  63. package/src/data/builtin-manifest.json +64 -64
  64. package/upgrades/0.28.79.md +67 -0
  65. package/upgrades/0.28.80.md +93 -0
  66. package/upgrades/side-effects/0.28.79.md +310 -0
  67. package/upgrades/side-effects/assembler-context-endpoint.md +67 -0
  68. package/upgrades/side-effects/post-update-migrator-path-fix.md +52 -0
  69. package/upgrades/side-effects/presence-proxy-ack-and-baseline.md +260 -0
  70. package/upgrades/side-effects/semantic-memory-corruption-recovery.md +98 -0
  71. package/upgrades/side-effects/url-pathname-path-encoding-fix.md +45 -0
@@ -0,0 +1,93 @@
+ # Upgrade Guide — vNEXT
+
+ <!-- bump: patch -->
+
+ ## What Changed
+
+ PresenceProxy — the standby system that emits 20s / 2m / 5m progressive
+ status updates while an agent is busy — had two regressions that made
+ the feature look broken even though the timer machinery was still in
+ place.
+
+ **Layer A — Brief acks no longer cancel tier timers.** Recent guidance
+ told every Telegram/Slack/iMessage agent to send an immediate
+ acknowledgement ("Got it, looking into this") on every inbound user
+ message. The proxy treated that ack as the agent's response and
+ silently cancelled every pending tier check. Result: progressive
+ 20s/2m/5m updates stopped firing entirely — the user got an immediate
+ "On it" and then radio silence until the real reply arrived.
+
+ PresenceProxy now classifies short, forward-looking acks ("On it",
+ "Got it, looking into this", "I'll dig into that") as non-cancelling.
+ The classifier is length-bounded (≤ 200 chars) and opener-only (the
+ ack phrase has to appear in the first 60 chars), so a substantive
+ multi-sentence reply that happens to mention "I will…" deep in the
+ body is NOT misclassified.
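A minimal sketch of such a classifier. The phrase list and the exact matching strategy here are assumptions; only the name `isBriefAck` (referenced in the test evidence below), the 200-char length bound, and the 60-char opener window come from this guide:

```typescript
// Hypothetical ack phrases; the real list is an assumption.
const ACK_OPENERS = [
  "on it",
  "got it",
  "i'll dig into",
  "i'll look into",
  "looking into this",
];

function isBriefAck(message: string): boolean {
  const text = message.trim();
  // Length-bounded: substantive replies run longer than 200 chars.
  if (text.length === 0 || text.length > 200) return false;
  // Opener-only: the ack phrase must appear in the first 60 chars,
  // so "I'll dig into" buried deep in a long reply does not match.
  const opener = text.slice(0, 60).toLowerCase();
  return ACK_OPENERS.some((phrase) => opener.includes(phrase));
}
```

Both bounds have to hold at once: a short reply whose ack phrase only appears past the 60-char opener window is still treated as substantive and cancels the timers.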
+
+ **Layer B — Tier prompts now scope to post-message activity.** The
+ prompts that built the tier-1/2/3 status messages read whatever was
+ visible in the agent's tmux pane right then, which is the rolling
+ window — so older work from BEFORE the user's latest message often
+ dominated the snapshot. The user got summaries describing pre-message
+ work instead of "what the agent is doing in response to my message."
+
+ PresenceProxy now captures a baseline tmux snapshot at the instant
+ the user message arrives (`userMessageBaselineSnapshot`). The four
+ prompt builders (Tier 1, conversation, Tier 2, Tier 3) anchor on the
+ baseline and feed only the post-baseline delta to the LLM, with an
+ explicit "[scope: only output that appeared AFTER the user's message
+ arrived]" header. If the baseline anchor scrolled off the visible
+ pane (very busy build), we fall back to the full pane with a
+ labelled scope tag.
+
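The anchor-and-slice logic can be sketched like this. The anchoring strategy (matching the last non-empty baseline line) is an assumption; only the name `extractDeltaSinceBaseline` (referenced in the test evidence below) and the full-pane fallback behavior come from this guide:

```typescript
interface ScopedSnapshot {
  text: string;
  scope: "post-baseline" | "full-pane-fallback";
}

function extractDeltaSinceBaseline(
  baseline: string,
  currentPane: string,
): ScopedSnapshot {
  // Use the last non-empty line of the baseline snapshot as the anchor.
  const lines = baseline.split("\n").filter((l) => l.trim().length > 0);
  const anchor = lines[lines.length - 1];
  const idx = anchor ? currentPane.lastIndexOf(anchor) : -1;
  if (idx === -1) {
    // Anchor scrolled off the visible pane (very busy build):
    // fall back to the full pane with a labelled scope tag.
    return { text: currentPane, scope: "full-pane-fallback" };
  }
  // Keep only output that appeared AFTER the user's message arrived.
  return {
    text: currentPane.slice(idx + anchor.length),
    scope: "post-baseline",
  };
}
```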
+ ## What to Tell Your User
+
+ - **The 20-second / 2-minute / 5-minute progressive standby updates
+   are working again**: When your agent is busy and you message it,
+   you'll once more get a status update at the 20-second mark, then
+   another at 2 minutes, then a stall assessment at 5 minutes. The
+   agent's brief ack right after your message no longer turns those
+   off.
+ - **Standby summaries finally describe what the agent is doing in
+   response to your latest message**: Before, the standby update
+   could summarize work the agent was already doing before your
+   question arrived. Now the proxy anchors on the moment your
+   message hit and only describes activity since.
+
+ ## Summary of New Capabilities
+
+ | Capability | How to Use |
+ |-----------|-----------|
+ | Brief-ack tolerance — tier timers survive "On it" / "Got it" replies | Automatic. Substantive replies still cancel timers as before; only short forward-looking acks are now treated as non-cancelling. |
+ | Post-message scope — standby summaries describe only activity AFTER your latest message | Automatic. The baseline is captured the moment your message arrives. It is held in memory only, so after a session restart the proxy falls back to the legacy full-pane scope. |
+
+ ## Evidence
+
+ - Repro source: user report on topic 8882 (2026-05-04T03:52Z) —
+   "this feature no longer seems to give progressive updates such as
+   the 5, 10, 15 min mark like it used to" + "the messages from
+   standby mode often seem to be summarizing what the agent was
+   working on BEFORE the user's last message."
+ - Root cause for #1: `handleAgentMessage` was called from
+   `onMessageLogged` for every non-system, non-proxy outbound
+   message. The Telegram-bridge instruction to ack immediately on
+   inbound meant every user message produced an immediate
+   cancellation before tier 1 had a chance to fire.
+ - Root cause for #2: tier prompts at lines 1192/1211/1244/1271 in
+   `src/monitoring/PresenceProxy.ts` passed the raw rolling tmux
+   pane to the LLM with no boundary marker for "what was visible at
+   user-message arrival."
+ - Side-effects review at
+   `upgrades/side-effects/presence-proxy-ack-and-baseline.md`
+   covers over/under-block for the brief-ack filter,
+   level-of-abstraction fit, signal-vs-authority compliance,
+   interactions with CompactionSentinel / PromiseBeacon /
+   ProxyCoordinator, and rollback cost.
+ - 15 new unit tests in
+   `tests/unit/presence-proxy-ack-and-baseline.test.ts` covering
+   `isBriefAck` (5), `extractDeltaSinceBaseline` (5), brief-ack
+   handling end-to-end (3), baseline capture (2). All 64 prior
+   PresenceProxy unit tests + 64 e2e tests still pass after
+   updating two e2e tests to use clearly-substantive agent replies
+   (the prior fixtures were short messages now correctly classified
+   as acks).
@@ -0,0 +1,310 @@
+ # Side-Effects Review — Topic-binding-aware zombie kill + resume-failure fallback
+
+ **Version / slug:** `zombie-kill-topic-binding`
+ **Date:** `2026-05-04`
+ **Author:** `Echo`
+ **Second-pass reviewer:** `independent-review-subagent (concerns raised + resolved)`
+
+ ## Summary of the change
+
+ Closes a two-stage failure mode that drops the user's first message after a
+ conversational pause on Telegram-bound (and Slack/iMessage-bound) agents.
+
+ **Root cause traced from Inspec/monroe-workspace logs.** When a Telegram agent
+ finishes replying, Claude sits at the prompt waiting for the next user
+ message. SessionManager's zombie-killer interprets "idle at prompt + no active
+ processes for 15 minutes" as zombie and kills the session. When the user
+ finally messages, the bridge tries to respawn with `--resume <UUID>`; the
+ saved UUID was captured at kill time and sometimes crashes Claude during
+ startup (`Session died during startup`). `waitForClaudeReady` times out, the
+ initial message is logged "NOT injected", and the user's message is dropped.
+ Five minutes later, the presence proxy fires its `tier-3 — session appears
+ stopped` warning. The user has to send "unstick" or re-send to recover.
+
+ **Fix in two layers:**
+
+ - **Layer A — Topic-binding exemption (signal-vs-authority structural
+   exemption).** SessionManager gains an optional `topicBindingChecker`
+   callback. When the zombie-killer is about to act, it consults the checker;
+   if the session is bound to a live messaging topic the kill threshold is
+   raised from 15 minutes to a configurable bound threshold (default 240
+   minutes / 4h). The binding is an authoritative structural fact (the
+   TelegramAdapter's reverse map), not a judgment call. The default was chosen
+   to cover normal conversational pauses through a workday without holding
+   per-session resources (Claude TUI ~200-500MB RSS, Anthropic connection)
+   indefinitely. Operators can override via `idlePromptKillMinutesBoundToTopic`.
+
+ - **Layer B — Resume-failure fresh-spawn fallback.** When the readiness
+   probe fails AND tmux died during startup AND the spawn was using
+   `--resume`, SessionManager falls through once to a fresh spawn carrying the
+   same initial message. A `resumeFailed` event is emitted; the bridge clears
+   the bad UUID from `TopicResumeMap` so the next user-driven respawn doesn't
+   retry the same broken UUID. The bridge listener gates the `remove()` on
+   UUID-equality with the failed UUID — so a fresh spawn that quickly saved a
+   *new* UUID won't have it wiped by a late-firing listener.
+
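The Layer A decision reduces to a threshold selection. A minimal model, not the real SessionManager: the class and method names here are hypothetical, while `topicBindingChecker`, `idlePromptKillMinutesBoundToTopic`, and the 15m/240m defaults come from this review:

```typescript
// Yes/no binding signal; the real checker consults the messaging
// adapters' topic->session reverse maps.
type TopicBindingChecker = (sessionId: string) => boolean;

class ZombieKillPolicy {
  constructor(
    private topicBindingChecker: TopicBindingChecker | null = null,
    private idlePromptKillMinutes = 15,
    private idlePromptKillMinutesBoundToTopic = 240, // 4h default
  ) {}

  // Effective threshold: raised for sessions bound to a live topic.
  thresholdMinutes(sessionId: string): number {
    const bound = this.topicBindingChecker?.(sessionId) ?? false;
    return bound
      ? this.idlePromptKillMinutesBoundToTopic
      : this.idlePromptKillMinutes;
  }

  shouldKill(sessionId: string, idleMinutes: number): boolean {
    return idleMinutes > this.thresholdMinutes(sessionId);
  }
}
```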
+ **Files touched:**
+ - `src/core/types.ts` — adds `idlePromptKillMinutesBoundToTopic?: number`.
+ - `src/core/SessionManager.ts` — adds binding checker, bound threshold getter,
+   binding-aware kill decision, and `handleReadyAndInject` with single-retry
+   fresh-spawn fallback. Emits `resumeFailed` event.
+ - `src/commands/server.ts` — wires the binding checker to consult Telegram /
+   Slack / iMessage adapters; subscribes to `resumeFailed` to clear the stale
+   UUID from `TopicResumeMap`.
+ - `tests/unit/zombie-kill-topic-binding.test.ts` — new behavioral tests.
+ - `tests/unit/spawn-resume-fallback.test.ts` — new behavioral tests.
+
+ ## Decision-point inventory
+
+ - `SessionManager` zombie-kill decision (`isActuallyIdle && idleMs > threshold`) — **modified**: threshold is now binding-aware.
+ - `SessionManager.spawnInteractiveSession` post-readiness initial-message inject — **modified**: adds a single fresh-spawn fallback when `--resume` crashes during startup.
+ - `commands/server.ts` `injectionDropped` listener — **pass-through**: existing recovery path is preserved; the new `resumeFailed` listener is purely a UUID-cleanup hook, no block/allow surface.
+
+ ---
+
+ ## 1. Over-block
+
+ **What legitimate inputs does this change reject that it shouldn't?**
+
+ The only block-shaped surface this change touches is "kill vs don't-kill". The
+ change *raises* the threshold for topic-bound sessions; it does not block
+ anything new. The risk is the inverse of over-block: failing to kill a truly
+ zombied topic-bound session for up to 4h.
+
+ Concrete scenario: a topic-bound session whose Claude process hangs internally
+ (e.g., an infinite loop in the TUI) but stays "alive" by `pane_current_command`
+ will not be cleaned up by the zombie-killer for up to 4h. Mitigation: the
+ bridge's `isSessionAlive` check on the next user message authoritatively
+ detects truly-dead Claude processes and triggers a clean respawn — that's the
+ fast path. The 4h threshold only matters for users who never message again.
+
+ ---
+
+ ## 2. Under-block
+
+ **What failure modes does this still miss?**
+
+ Layer A misses: a topic-bound session whose Claude has stopped responding to
+ input but is still showing the prompt and registering as `alive` will not be
+ killed promptly. As above — mitigated by user-driven respawn on the next message.
+
+ Layer B misses: if the fresh-spawn fallback also crashes during startup (e.g.,
+ disk full, claudePath wrong, persistent corruption), we surface a degradation
+ event but do not retry again. This is intentional — single retry only — to
+ avoid spawn loops. The bridge's existing `injectionDropped` recovery path
+ takes over on the next inbound message.
+
+ Layer B also does not cover the case where `--resume` succeeds *enough* for
+ tmux to stay alive but Claude itself is broken (won't render the prompt). In
+ that case we still fall through to "best-effort inject anyway", preserving the
+ prior behavior. That's not a regression.
+
+ ---
+
+ ## 3. Level-of-abstraction fit
+
+ **Is this at the right layer?**
+
+ Yes. SessionManager owns session lifecycle, so the kill threshold belongs
+ there. The binding check is delegated to a callback (the same pattern as
+ `subagentChecker` and `activeRecoveryChecker`) so SessionManager stays
+ unaware of which messaging platform is asking — it only consumes a yes/no
+ binding signal.
+
+ The fresh-spawn fallback also belongs in SessionManager because it owns the
+ spawn primitive. The bridge layer only consumes the `resumeFailed` event for
+ its own state cleanup (`TopicResumeMap.remove`), which is unique to the
+ bridge's responsibility.
+
+ A higher-level alternative would have been to do the retry in the bridge
+ (routes.ts `/internal/telegram-forward`). Rejected: that requires either
+ refactoring `spawnInteractiveSession` to expose readiness to the caller (big
+ churn across 15 callers) or duplicating the spawn-and-await logic in two
+ places (drift risk). Keeping it in SessionManager is cheaper and isolates
+ the fix to one method body.
+
+ ---
+
+ ## 4. Signal vs authority compliance
+
+ **Required reference:** [docs/signal-vs-authority.md](../../docs/signal-vs-authority.md)
+
+ **Does this change hold blocking authority with brittle logic?**
+
+ - [x] No — this change has no block/allow surface in the judgment sense.
+
+ The zombie-killer is not a judgment authority — it's a structural cleanup
+ mechanism whose behavior is now parameterized by an authoritative structural
+ fact (is this session in the topic→session reverse map). Per the principle
+ doc:
+
+ > When this principle does NOT apply: Hard-invariant validation … structural
+ > validators at the boundary of the system are not decision points in the
+ > sense this principle applies to.
+
+ The binding lookup is a hard structural fact ("is this session ID in the
+ TelegramAdapter's reverse map?"), not a judgment about what a message
+ *means*. There is no LLM, no regex, no similarity score, no token list.
+
+ The fresh-spawn fallback is a recovery flow control, not a decision point on
+ content or intent. No principle violation.
+
+ ---
+
+ ## 5. Interactions
+
+ **Does this interact with existing checks, recovery paths, or infrastructure?**
+
+ - **Shadowing:**
+   - The zombie-killer's existing vetoes (`activeRecoveryChecker`,
+     `subagentChecker`, `pendingInjections`) all run BEFORE the new threshold
+     check. They are unaffected — bound sessions still respect compaction
+     recovery, subagent activity, and pending-injection events.
+   - The new `topicBindingChecker` runs AFTER `idlePromptSince` is established,
+     not before, so the existing first-idle hooks (paste-retry, error-nudge)
+     still fire normally on bound sessions.
+
+ - **Double-fire:**
+   - `resumeFailed` and `injectionDropped` could both fire for the same session
+     if the resume crashes AND the recovered fresh spawn also fails to
+     inject. In that case, the bridge's `injectionDropped` listener will
+     re-forward the user's text via `/internal/telegram-forward`, which
+     triggers a new spawn. This is the same path that already runs today on
+     crashed sessions; the change does not introduce a new loop.
+
+ - **Races:**
+   - The fresh-spawn fallback inside `handleReadyAndInject` runs after a
+     `kill-session` to clean up any zombie pane. If a concurrent monitor tick
+     is in flight, it could observe the dead pane mid-cleanup and emit
+     `sessionComplete` for the failed session. We mark the failed session
+     `status: 'failed'` BEFORE emitting `resumeFailed` (and before the
+     recursive spawn), so reapers see consistent state through the full
+     handoff.
+   - The `resumeFailed` listener fires AFTER the fresh spawn may have already
+     saved a new UUID via the proactive 8-second save. To avoid wiping the new
+     UUID, the listener gates `TopicResumeMap.remove(topicId)` on a
+     UUID-equality check (only remove when the stored UUID still matches the
+     failed one). Direct test: `tests/unit/resume-failed-uuid-gate.test.ts`.
+   - `topicBindingChecker` is read-only. No shared mutable state.
+
+ - **Feedback loops:** None. The fresh-spawn fallback is one-shot.
+
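The UUID-equality gate reduces to a compare-before-delete. A sketch with `TopicResumeMap` modeled as a plain `Map`; the real storage shape is an assumption:

```typescript
// topicId -> Claude session UUID used for --resume on respawn.
const topicResumeMap = new Map<string, string>();

function onResumeFailed(topicId: string, failedUuid: string): void {
  // Only clear the stored UUID when it still matches the failed one.
  // A fresh spawn may have already saved a new, valid UUID that a
  // late-firing listener must not wipe.
  if (topicResumeMap.get(topicId) === failedUuid) {
    topicResumeMap.delete(topicId);
  }
}
```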
+ ---
+
+ ## 6. External surfaces
+
+ **Does this change anything visible outside the immediate code path?**
+
+ - **Other agents on the same machine:** No. Each agent's SessionManager owns
+   its own kill threshold and its own binding map.
+ - **Other users of the install base:** Yes — every Telegram/Slack/iMessage
+   agent will now hold bound sessions for up to 4h instead of cleaning them up
+   at 15 minutes. Memory footprint and Claude API connection count per agent
+   may rise. Users with many concurrent topics on a memory-constrained host
+   can override `idlePromptKillMinutesBoundToTopic` in config.json. The
+   default is conservative for the common case (1-3 topics per agent).
+ - **External systems:** No changes to the Telegram/Slack/iMessage API surface.
+   Tunnel, GitHub, Cloudflare unaffected.
+ - **Persistent state:** `TopicResumeMap` entries are cleared on resume
+   failure (one extra `remove` call per failure). State file format unchanged.
+ - **Timing/runtime:** The bound threshold default (4h) is bounded; sessions
+   cannot accumulate forever. The fresh-spawn fallback adds at most one
+   additional 90-second readiness window per spawn attempt; bounded.
+ - **Logs:** New log lines on bound-zombie kill (`(topic-bound, threshold Nm)`),
+   resume failure (`Resume failed for "X" — tmux died during startup. Falling
+   back to fresh spawn.`), and fresh-spawn success/failure. Format is
+   consistent with existing `[SessionManager]` lines.
+
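For instance, an operator on a memory-constrained host could lower the bound threshold in config.json. A minimal fragment; the 60-minute value is illustrative and surrounding config fields are omitted:

```json
{
  "idlePromptKillMinutesBoundToTopic": 60
}
```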
+ ---
+
+ ## 7. Rollback cost
+
+ **If this turns out wrong in production, what's the back-out?**
+
+ Pure code change. No schema migration, no persistent-state shape change, no
+ data migration. Rollback path: revert the commit, ship as the next patch.
+ Agents revert to the prior 15-minute kill threshold on their next server
+ restart. No user-visible regression during the rollback window — at worst,
+ the user sees the old "session appears stopped" pattern they reported.
+
+ The new config option `idlePromptKillMinutesBoundToTopic` falls back to a
+ hardcoded default (240), so a rollback that drops the field from disk
+ config is a no-op.
+
+ ---
+
+ ## Conclusion
+
+ This first-pass review produced no design changes — both layers passed
+ signal-vs-authority compliance and the side-effects review on first read.
+ The change is contained to SessionManager and one wiring call in
+ `commands/server.ts`, with two new dedicated test files (8 new tests) plus
+ 21 existing session-reap-detect tests still passing.
+
+ The change is clear to ship pending second-pass review (required because it
+ touches session lifecycle: spawn, kill, recovery).
+
+ ---
+
+ ## Second-pass review (if required)
+
+ **Reviewer:** independent-review-subagent
+ **Independent read of the artifact:** concern
+
+ I concur on layer A's signal-vs-authority compliance and on the overall shape of layer B, but I have specific concerns that should be resolved before ship:
+
+ - **Threshold default 1440m (24h) is too aggressive a swing from 15m.** The healthy-waiting-state argument is sound, but 24h means each bound session holds a Claude TUI process (~200–500MB RSS) and an Anthropic connection for a full day even if the user never returns. For an agent with 8–10 concurrent Telegram topics on a 16GB host, that's 2–5GB of resident memory locked indefinitely, vs. the prior steady state where idle topics released within 15m and only re-spawned on the next message. The artifact's mitigation ("config override available") puts the burden on every multi-topic operator to discover the new default and tune it down; the conservative default should solve the reported symptom without the resource cost. **Recommended resolution:** drop the default to 240m (4h) — long enough that conversational pauses through normal work hours don't trip the kill, short enough that overnight idle sessions release. Keep the config knob for users who genuinely want 24h.
+
+ - **Cleanup race between the proactive UUID save and the `resumeFailed` listener is plausible (low-likelihood but real).** Order of operations I traced:
+   1. `spawnSessionForTopic` calls `spawnInteractiveSession` (returns at line 1390 once tmux is created, before the readiness probe).
+   2. The caller at server.ts:528 immediately removes the bad UUID from `TopicResumeMap`.
+   3. The caller at server.ts:537–549 schedules a `setTimeout(8s)` proactive UUID save against the same tmux name.
+   4. ~90s later, `handleReadyAndInject` decides the resume failed, emits `resumeFailed`, and the listener tries to remove the UUID (no-op — already gone).
+   5. The fallback recursively calls `spawnInteractiveSession`, which creates a fresh Claude under the same tmux name.
+
+   The 8s proactive save fires while the failed Claude is still in startup-crash territory (no hook event yet, `claudeSessionId` empty → the save is skipped). That's safe by accident, not by design. If the fresh-spawn fallback finishes quickly enough that a hook event lands before the `resumeFailed` listener fires, the listener could clear a fresh, valid UUID. The current emit-before-spawn ordering makes this unlikely, but it's not asserted by a test. **Recommended resolution:** add a test that runs the full sequence (proactive save scheduled → resume crash → fallback spawn → fresh hook event lands) and verifies `TopicResumeMap` ends with the *new* UUID, not empty. Or, more defensively, gate the listener's remove on a UUID-equality check (only clear if the stored UUID still matches `info.resumeSessionId`).
+
+ - **Failed-session status update happens AFTER `emit('resumeFailed')`.** Lines 1431–1446 emit the event first, then mark `failed.status = 'failed'`. The artifact §5 Races claims the failed session is marked `status: 'failed'` before the fresh spawn so reapers don't re-process it — but the emit-then-mark order means a concurrent monitor tick that fires between the emit and `state.saveSession(failed)` will see `status: 'running'` on a dead pane. The recursive spawn happens after the marking, so the practical impact is small (the window is microseconds), but the artifact's claim doesn't match the code. **Recommended resolution:** either move the status update before the emit, or soften the artifact's claim.
+
+ - **Test coverage gaps the artifact undersells.** All 8 new tests use mocked tmux with a single session; none cover (a) the fresh-spawn fallback itself failing — only `DegradationReporter.report` is exercised by the code path, never by a test, (b) concurrent monitor ticks during fallback (the race the artifact §5 itself flags), (c) multiple bound + unbound sessions on the same manager where the binding checker returns a mix, or (d) the listener's UUID cleanup interacting with a happy-path remove on the same topic. The "21 existing session-reap-detect tests" don't cover any of this — they predate the change. **Recommended resolution:** add at least the "fallback also fails" test and a "mixed bound/unbound sessions" test before merge.
+
+ - **Minor: `tmuxSession.replace(\`${path.basename(this.config.projectDir)}-\`, '')` at line 1457 is a first-occurrence string replace.** If the agent's project directory basename happens to appear later in the session name (rare but possible — e.g. project `monroe`, session named `monroe-ai-monroe-debug`), only the first occurrence is stripped, which is correct. But the implicit assumption that `tmuxSession` always begins with `${projectBase}-` isn't enforced — if `name` was originally `null` and `tmuxSession` was `${projectBase}-interactive-${Date.now()}`, the recursive call passes `interactive-${Date.now()}` as `name`, creating a *different* tmux session name on retry (`projectBase-interactive-<sanitized>`). The recursive call would not reuse the same tmux name, breaking the implicit contract that the bridge's session→topic mapping still resolves. **Recommended resolution:** add a guard for the un-named case, or pass the tmuxSession name through explicitly to ensure name preservation.
+
+ None of these are blockers in the "stop the world" sense — layer A is sound and layer B is a clear improvement on dropping messages. But the threshold default and the cleanup-race test gap warrant a follow-up before this lands on a production-traffic agent.
+
+ ---
+
+ ### Author's resolution of second-pass concerns
+
+ All five concerns were addressed in this same PR before commit:
+
+ 1. **Threshold default lowered from 1440 → 240 minutes (4h).** Source: `src/core/SessionManager.ts:65`. Long enough to cover normal conversational pauses through a workday; short enough to release resources from genuinely abandoned topics. The config knob `idlePromptKillMinutesBoundToTopic` is preserved for operators who want a different value. Test updated to assert the new default.
+
+ 2. **UUID-equality gate on the `resumeFailed` listener.** Source: `src/commands/server.ts` (search `UUID-equality gate`). The listener now reads the stored UUID and only calls `remove()` when it matches `info.resumeSessionId`. New test file `tests/unit/resume-failed-uuid-gate.test.ts` covers all four cases: matching, replaced (the race), absent, and missing topicId.
+
+ 3. **Order swapped: the failed-status update now happens BEFORE `emit('resumeFailed')`.** Source: `src/core/SessionManager.ts` `handleReadyAndInject`. The artifact §5 claim now matches the code.
+
+ 4. **Test gaps closed.** Added: "fresh-spawn fallback also fails → degradation reported", "mixed bound + unbound sessions on the same manager", and the entire UUID-equality gate test file. Total new behavioral tests: 14 (was 8).
+
+ 5. **tmuxSession-name reconstruction fixed.** Source: `handleReadyAndInject` now threads the original `name` parameter through and passes it directly to the recursive `spawnInteractiveSession`. The fragile `tmuxSession.replace(prefix, '')` reconstruction is gone — auto-generated `interactive-${ts}` names round-trip correctly.
+
+ Verified by re-running the focused test suite: 81 tests across 7 files passing.
+
+ ---
+
+ ## Evidence pointers
+
+ **Repro evidence:**
+ - `/Users/justin/Documents/Projects/monroe-workspace/logs/server.log`
+   - `2026-05-04T23:39:14Z` — zombie kill of healthy idle session
+   - `2026-05-05T00:48:16Z` — user message arrives, no live session
+   - `2026-05-05T00:48:16Z` — respawn-with-resume attempted (UUID `716881a4-...`)
+   - `2026-05-05T00:48:20Z` — Session died during startup
+   - `2026-05-05T00:48:20Z` — Claude not ready, message NOT injected
+ - 19 prior occurrences of the same `Claude not ready` log line going back to 2026-04-28.
+
+ **Test evidence:**
+ - `tests/unit/zombie-kill-topic-binding.test.ts` — 6 tests: unbound kill, bound exemption, bound + over-threshold kill, null checker, mixed bound + unbound (added per reviewer), default 4h.
+ - `tests/unit/spawn-resume-fallback.test.ts` — 4 tests: resume crash → fresh-spawn fallback, no-resume fresh spawn, no fallback on prompt-detection false negative, both spawns fail → degradation reported (added per reviewer).
+ - `tests/unit/resume-failed-uuid-gate.test.ts` — 4 tests (added per reviewer): clear when the stored UUID matches, preserve when the stored UUID has been replaced (race), no-op when no stored UUID, no-op when no telegramTopicId.
+ - 7 related test files (81 tests) all green: `session-manager-behavioral`, `session-reap-detect`, `CompactionSentinel`, `bootstrap-file-threshold`, plus the three new files.
@@ -0,0 +1,67 @@
+ # Side-Effects Review — Wire WorkingMemoryAssembler into session context API
+
+ **Version / slug:** `assembler-context-endpoint`
+ **Date:** 2026-04-28
+ **Author:** gfrankgva (contributor)
+ **Second-pass reviewer:** Echo (EchoOfDawn), 3 review rounds
+
+ ## Summary of the change
+
+ Two files touched:
+
+ 1. `src/commands/server.ts` — WorkingMemoryAssembler construction is moved from line 3258 (before activitySentinel) to after activitySentinel initialization (~line 3475). This enables wiring `episodicMemory` via `activitySentinel.getEpisodicMemory()`, which was previously left as a TODO comment. The assembler now receives both `semanticMemory` and `episodicMemory`, making the 400-token episode budget functional in production. Construction is guarded by `if (semanticMemory || activitySentinel)` — skipped entirely in minimal-config setups where neither memory system is available.
+
+ 2. `src/server/routes.ts` — The two assembled-context endpoints (`/topic/context/:topicId?assembled=true` and `/session/context/:topicId`) are refactored to call a shared `assembleAndRespond()` helper instead of duplicating the assembly + response logic. The helper takes the assembler instance, topicId, options, and the Express response object. Auth confirmation is added to the JSDoc for the session context route.
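+ The reordered, guarded construction in point 1 can be sketched roughly as follows. Only the class and method names come from the review; the interface shapes and constructor signature are illustrative assumptions, not the project's real API:
+
+ ```typescript
+ // Minimal stand-ins for the real classes — shapes are assumptions.
+ interface SemanticMemory { search(query: string): string[] }
+ interface EpisodicMemory { recentEpisodes(budgetTokens: number): string[] }
+ interface ActivitySentinel { getEpisodicMemory(): EpisodicMemory | undefined }
+
+ class WorkingMemoryAssembler {
+   constructor(
+     readonly semanticMemory: SemanticMemory | undefined,
+     readonly episodicMemory: EpisodicMemory | undefined,
+   ) {}
+ }
+
+ // Construction now happens AFTER activitySentinel init, so episodic wiring works.
+ function buildAssembler(
+   semanticMemory: SemanticMemory | undefined,
+   activitySentinel: ActivitySentinel | undefined,
+ ): WorkingMemoryAssembler | undefined {
+   // Guard: skip entirely in minimal-config setups with neither memory system.
+   if (!semanticMemory && !activitySentinel) return undefined;
+   // Optional chaining degrades gracefully if the sentinel failed to initialize.
+   return new WorkingMemoryAssembler(
+     semanticMemory,
+     activitySentinel?.getEpisodicMemory(),
+   );
+ }
+ ```
+
+ The key property is that every failure path (no sentinel, sentinel without episodic memory) yields `undefined` fields rather than a thrown error.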
+
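+ A rough shape for the shared helper described in point 2. The parameter list matches the review; the option fields, response body, and the unavailable-assembler branch are assumptions (the real code uses Express's `Response`):
+
+ ```typescript
+ // Simplified stand-ins — shapes are assumptions for illustration.
+ interface AssembleOptions { maxTokens?: number }
+ interface AssembledContext { context: string; budget: { used: number; max: number } }
+ interface ContextAssembler {
+   assemble(topicId: string, opts: AssembleOptions): AssembledContext;
+ }
+ interface ResLike {
+   status(code: number): ResLike;
+   json(body: unknown): void;
+ }
+
+ // One helper shared by both endpoints, replacing duplicated
+ // assembly + response logic in each route handler.
+ function assembleAndRespond(
+   assembler: ContextAssembler | undefined,
+   topicId: string,
+   opts: AssembleOptions,
+   res: ResLike,
+ ): void {
+   if (!assembler) {
+     res.status(503).json({ error: "assembler unavailable" });
+     return;
+   }
+   res.status(200).json(assembler.assemble(topicId, opts));
+ }
+ ```
+
+ Because both routes delegate to one function, any future change to the assembly contract only has to be made in one place.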
+ ## Decision-point inventory
+
+ - `WorkingMemoryAssembler` construction order — **modify** (move later in init sequence for dependency availability).
+ - `WorkingMemoryAssembler` construction guard — **add** (skip when both `semanticMemory` and `activitySentinel` are undefined).
+ - `episodicMemory` wiring — **add** (was commented out, now passed via `activitySentinel?.getEpisodicMemory()`).
+ - `assembleAndRespond()` helper — **add** (extracts duplicated assembly logic).
+ - Route handlers — **modify** (delegate to shared helper instead of inline assembly).
+
+ ---
+
+ ## 1. Over-block
+
+ **What legitimate inputs does this change reject that it shouldn't?**
+
+ None. The assembler degrades gracefully when `episodicMemory` is undefined (the sentinel only exists when `sharedIntelligence` / an LLM key is configured). The helper produces identical output to the previous inline logic. Backwards compatibility is preserved: `?assembled=true` is opt-in, and the raw topic context path is unchanged.
+
+ ## 2. Under-block
+
+ **What failure modes does this still miss?**
+
+ If `activitySentinel.getEpisodicMemory()` returns an EpisodicMemory instance that later becomes invalid (e.g., sentinel is stopped mid-session), the assembler would hold a stale reference. However, EpisodicMemory is file-based (JSON under `state/episodes/`), so the instance remains usable even if the sentinel stops producing new digests — it just won't have fresh data.
+
+ ## 3. Level-of-abstraction fit
+
+ **Is this at the right layer?**
+
+ Yes. The assembler is a dependency-injected component — it receives its memory sources at construction time. Moving its initialization to the correct point in the dependency graph (after sentinel) is the natural fix. The shared helper is a local function within the route setup closure, keeping the DRY refactor scoped to the routes file.
+
+ ## 4. Blocking authority
+
+ - [x] No — these are read-only API endpoints. They do not gate any operation.
+
+ ## 5. Interactions
+
+ - **Init ordering**: Assembler now depends on `activitySentinel` being initialized first. If sentinel init fails (sharedIntelligence unavailable), `activitySentinel` is undefined and `getEpisodicMemory()` is not called — assembler gets `episodicMemory: undefined` and degrades gracefully.
+ - **Route behavior**: Identical to prior implementation — the helper is a pure extraction refactor.
+
+ ## 6. External surfaces
+
+ - **Agents**: Session-start hooks calling `/session/context/:topicId` now receive episode context (recent activity digests, themed episodes) in the assembled output. This is strictly additive — agents get richer context.
+ - **Persistent state**: No modifications. Both endpoints are read-only.
+
+ ## 7. Rollback cost
+
+ Pure code change. Revert restores the previous inline handlers and removes episodic wiring. No migration or data repair needed.
+
+ ---
+
+ ## Evidence pointers
+
+ - Typecheck: `tsc --noEmit` — 0 errors.
+ - Existing tests (14 integration + 3 E2E) cover both endpoints' happy paths, fallback behavior, and budget surfacing. The shared helper produces identical output, so existing test assertions remain valid.
@@ -0,0 +1,52 @@
+ # Side-Effects Review — PostUpdateMigrator path decoding fix
+
+ **Version / slug:** `post-update-migrator-path-fix`
+ **Date:** 2026-04-28
+ **Author:** gfrankgva (contributor)
+
+ ## Summary of the change
+
+ One file, one line:
+
+ `src/core/PostUpdateMigrator.ts` — In `getFreeTextGuardHook()`, `path.dirname(new URL(import.meta.url).pathname)` is replaced with `__dirname`. The former preserves `%20`-encoded spaces in the filesystem path, causing `fs.readFileSync` to fail when the project directory contains spaces. `__dirname` is already defined at module scope via `fileURLToPath(import.meta.url)`, which properly decodes percent-encoded characters.
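+ The failure mode is easy to reproduce with a hard-coded file URL (the sample path below is illustrative; output comments assume a POSIX filesystem):
+
+ ```typescript
+ import path from "node:path";
+ import { fileURLToPath } from "node:url";
+
+ // Illustrative module URL for a project directory containing a space.
+ const moduleUrl = "file:///home/user/My%20Project/dist/core/PostUpdateMigrator.js";
+
+ // Buggy variant: URL.pathname keeps the %20 percent-encoding, so the
+ // resulting path does not exist on disk and fs.readFileSync throws ENOENT.
+ const buggyDir = path.dirname(new URL(moduleUrl).pathname);
+ // → "/home/user/My%20Project/dist/core"
+
+ // Fixed variant: fileURLToPath decodes %20 to a real space — this is how
+ // the module-scope __dirname is derived in the patched file.
+ const fixedDir = path.dirname(fileURLToPath(moduleUrl));
+ // → "/home/user/My Project/dist/core"
+ ```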
+
+ ## Decision-point inventory
+
+ - `getFreeTextGuardHook()` path construction — **fix** (replace `URL.pathname` with `__dirname`).
+
+ ---
+
+ ## 1. Over-block
+
+ None. Pure bug fix — strictly widens the set of environments where the function works.
+
+ ## 2. Under-block
+
+ None. `__dirname` handles all valid filesystem paths.
+
+ ## 3. Level-of-abstraction fit
+
+ Correct. Uses the same `__dirname` already defined at module scope by the file itself.
+
+ ## 4. Blocking authority
+
+ - [x] No — this is a path construction fix, not a gate.
+
+ ## 5. Interactions
+
+ None. The function is called during hook installation — no racing, no shadowing.
+
+ ## 6. External surfaces
+
+ The function returns the content of `free-text-guard.sh`, which is written to `.claude/hooks/`. No behavioral change to the hook content itself.
+
+ ## 7. Rollback cost
+
+ Revert restores the bug — `readFileSync` would again fail on paths with spaces.
+
+ ---
+
+ ## Evidence pointers
+
+ - Typecheck: `tsc --noEmit` — 0 errors.
+ - Tests: All 7 `PostUpdateMigrator-buildStopHook` tests pass (were 2/7 failing before this fix).