instar 0.28.78 → 0.28.80
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dashboard/index.html +170 -7
- package/dist/commands/init.d.ts.map +1 -1
- package/dist/commands/init.js +6 -4
- package/dist/commands/init.js.map +1 -1
- package/dist/commands/playbook.d.ts.map +1 -1
- package/dist/commands/playbook.js +2 -1
- package/dist/commands/playbook.js.map +1 -1
- package/dist/commands/server.d.ts.map +1 -1
- package/dist/commands/server.js +91 -8
- package/dist/commands/server.js.map +1 -1
- package/dist/commands/setup.d.ts.map +1 -1
- package/dist/commands/setup.js +5 -3
- package/dist/commands/setup.js.map +1 -1
- package/dist/core/Config.d.ts.map +1 -1
- package/dist/core/Config.js +2 -1
- package/dist/core/Config.js.map +1 -1
- package/dist/core/PostUpdateMigrator.d.ts.map +1 -1
- package/dist/core/PostUpdateMigrator.js +4 -5
- package/dist/core/PostUpdateMigrator.js.map +1 -1
- package/dist/core/SessionManager.d.ts +38 -0
- package/dist/core/SessionManager.d.ts.map +1 -1
- package/dist/core/SessionManager.js +157 -23
- package/dist/core/SessionManager.js.map +1 -1
- package/dist/core/UpdateChecker.d.ts.map +1 -1
- package/dist/core/UpdateChecker.js +3 -1
- package/dist/core/UpdateChecker.js.map +1 -1
- package/dist/core/UpgradeGuideProcessor.d.ts.map +1 -1
- package/dist/core/UpgradeGuideProcessor.js +3 -1
- package/dist/core/UpgradeGuideProcessor.js.map +1 -1
- package/dist/core/types.d.ts +18 -0
- package/dist/core/types.d.ts.map +1 -1
- package/dist/core/types.js.map +1 -1
- package/dist/lifeline/ServerSupervisor.d.ts.map +1 -1
- package/dist/lifeline/ServerSupervisor.js +3 -1
- package/dist/lifeline/ServerSupervisor.js.map +1 -1
- package/dist/memory/SemanticMemory.d.ts +9 -0
- package/dist/memory/SemanticMemory.d.ts.map +1 -1
- package/dist/memory/SemanticMemory.js +131 -0
- package/dist/memory/SemanticMemory.js.map +1 -1
- package/dist/monitoring/PresenceProxy.d.ts +53 -0
- package/dist/monitoring/PresenceProxy.d.ts.map +1 -1
- package/dist/monitoring/PresenceProxy.js +219 -20
- package/dist/monitoring/PresenceProxy.js.map +1 -1
- package/dist/scheduler/JobRunHistory.d.ts +6 -0
- package/dist/scheduler/JobRunHistory.d.ts.map +1 -1
- package/dist/scheduler/JobRunHistory.js +11 -0
- package/dist/scheduler/JobRunHistory.js.map +1 -1
- package/dist/scheduler/JobScheduler.d.ts +23 -0
- package/dist/scheduler/JobScheduler.d.ts.map +1 -1
- package/dist/scheduler/JobScheduler.js +84 -0
- package/dist/scheduler/JobScheduler.js.map +1 -1
- package/dist/server/routes.d.ts.map +1 -1
- package/dist/server/routes.js +56 -0
- package/dist/server/routes.js.map +1 -1
- package/dist/threadline/ThreadlineBootstrap.d.ts.map +1 -1
- package/dist/threadline/ThreadlineBootstrap.js +3 -2
- package/dist/threadline/ThreadlineBootstrap.js.map +1 -1
- package/dist/threadline/relay/ConnectionManager.d.ts.map +1 -1
- package/dist/threadline/relay/ConnectionManager.js +34 -7
- package/dist/threadline/relay/ConnectionManager.js.map +1 -1
- package/package.json +1 -1
- package/scripts/pre-push-gate.js +26 -0
- package/src/data/builtin-manifest.json +64 -64
- package/upgrades/0.28.79.md +67 -0
- package/upgrades/0.28.80.md +93 -0
- package/upgrades/side-effects/0.28.79.md +310 -0
- package/upgrades/side-effects/assembler-context-endpoint.md +67 -0
- package/upgrades/side-effects/post-update-migrator-path-fix.md +52 -0
- package/upgrades/side-effects/presence-proxy-ack-and-baseline.md +260 -0
- package/upgrades/side-effects/semantic-memory-corruption-recovery.md +98 -0
- package/upgrades/side-effects/url-pathname-path-encoding-fix.md +45 -0
+++ package/upgrades/0.28.80.md
@@ -0,0 +1,93 @@

# Upgrade Guide — vNEXT

<!-- bump: patch -->

## What Changed

PresenceProxy — the standby system that emits 20s / 2m / 5m progressive
status updates while an agent is busy — had two regressions that made
the feature look broken even though the timer machinery was still in
place.

**Layer A — Brief acks no longer cancel tier timers.** Recent guidance
told every Telegram/Slack/iMessage agent to send an immediate
acknowledgement ("Got it, looking into this") on every inbound user
message. The proxy treated that ack as the agent's response and
silently cancelled every pending tier check. Result: progressive
20s/2m/5m updates stopped firing entirely — the user got an immediate
"On it" and then radio silence until the real reply arrived.

PresenceProxy now classifies short, forward-looking acks ("On it",
"Got it, looking into this", "I'll dig into that") as non-cancelling.
The classifier is length-bounded (≤ 200 chars) and opener-only (the
ack phrase has to appear in the first 60 chars), so a substantive
multi-sentence reply that happens to mention "I will…" deep in the
body is NOT misclassified.

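The classifier shape described above reduces to a few lines. This is an illustrative sketch only — the function name, phrase list, and helpers are assumptions, not the shipped `src/monitoring/PresenceProxy.ts` code; the length and opener bounds mirror the prose:

```typescript
// Illustrative sketch of the brief-ack classifier described above.
// The phrase list and helper name are assumptions; only the bounds
// (≤ 200 chars total, ack phrase within the first 60 chars) come
// from the prose. The shipped PresenceProxy code may differ.
const ACK_OPENERS = [
  "on it",
  "got it",
  "looking into this",
  "i'll dig into that",
];

function isBriefAck(message: string): boolean {
  const text = message.trim().toLowerCase();
  // Length bound: substantive replies are never treated as acks.
  if (text.length === 0 || text.length > 200) return false;
  // Opener-only: the ack phrase must appear in the first 60 chars,
  // so "…I'll dig into that" deep in a longer reply never matches.
  const opener = text.slice(0, 60);
  return ACK_OPENERS.some((phrase) => opener.includes(phrase));
}
```

With this shape, a short "On it" is classified as a non-cancelling ack, while a multi-sentence reply that only mentions a forward-looking phrase past the opener window still cancels the timers as a real response.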
**Layer B — Tier prompts now scope to post-message activity.** The
prompts that built the tier-1/2/3 status messages read whatever was
visible in the agent's tmux pane right then, which is the rolling
window — so older work from BEFORE the user's latest message often
dominated the snapshot. The user got summaries describing pre-message
work instead of "what the agent is doing in response to my message."

PresenceProxy now captures a baseline tmux snapshot at the instant
the user message arrives (`userMessageBaselineSnapshot`). The four
prompt builders (Tier 1, conversation, Tier 2, Tier 3) anchor on the
baseline and feed only the post-baseline delta to the LLM, with an
explicit "[scope: only output that appeared AFTER the user's message
arrived]" header. If the baseline anchor scrolled off the visible
pane (very busy build), we fall back to the full pane with a
labelled scope tag.

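The baseline-delta idea can be sketched as follows. The function name, the tail-line anchoring strategy, and the return shape are all illustrative assumptions — the shipped logic also builds the scope header and handles richer anchoring:

```typescript
// Illustrative sketch of the post-baseline delta scoping described
// above. Helper name and anchoring strategy are assumptions; the
// shipped PresenceProxy logic may anchor differently.
interface DeltaResult {
  scoped: boolean;  // true when the baseline anchor was still visible
  content: string;  // text to feed the tier-prompt builder
}

function extractDeltaSinceBaseline(
  currentPane: string,
  baselineSnapshot: string,
): DeltaResult {
  // Use the tail of the baseline as the anchor: the last non-empty
  // line that was visible when the user's message arrived.
  const baselineLines = baselineSnapshot
    .split("\n")
    .filter((l) => l.trim() !== "");
  const anchor = baselineLines[baselineLines.length - 1];

  if (anchor !== undefined) {
    const idx = currentPane.lastIndexOf(anchor);
    if (idx !== -1) {
      // Everything after the anchor appeared after the user's message.
      return { scoped: true, content: currentPane.slice(idx + anchor.length) };
    }
  }
  // Anchor scrolled off the visible pane (very busy build): fall back
  // to the full pane and let the caller attach the labelled scope tag.
  return { scoped: false, content: currentPane };
}
```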
## What to Tell Your User

- **The 20-second / 2-minute / 5-minute progressive standby updates
  are working again**: When your agent is busy and you message it,
  you'll once more get a status update at the 20-second mark, then
  another at 2 minutes, then a stall assessment at 5 minutes. The
  agent's brief ack right after your message no longer turns those
  off.
- **Standby summaries finally describe what the agent is doing in
  response to your latest message**: Before, the standby update
  could summarize work the agent was already doing before your
  question arrived. Now the proxy anchors on the moment your
  message hit and only describes activity since.

## Summary of New Capabilities

| Capability | How to Use |
|-----------|-----------|
| Brief-ack tolerance — tier timers survive "On it" / "Got it" replies | Automatic. Substantive replies still cancel timers as before; only short forward-looking acks are now treated as non-cancelling. |
| Post-message scope — standby summaries describe only activity AFTER your latest message | Automatic. Captured at the moment your message arrives; the baseline is in-memory only, and after a session restart the proxy falls back to the legacy full-pane scope. |

## Evidence

- Repro source: user report on topic 8882 (2026-05-04T03:52Z) —
  "this feature no longer seems to give progressive updates such as
  the 5, 10, 15 min mark like it used to" + "the messages from
  standby mode often seem to be summarizing what the agent was
  working on BEFORE the user's last message."
- Root cause for #1: `handleAgentMessage` was called from
  `onMessageLogged` for every non-system, non-proxy outbound
  message. The Telegram-bridge instruction to ack immediately on
  inbound meant every user message produced an immediate
  cancellation before tier 1 had a chance to fire.
- Root cause for #2: tier prompts at lines 1192/1211/1244/1271 in
  `src/monitoring/PresenceProxy.ts` passed the raw rolling tmux
  pane to the LLM with no boundary marker for "what was visible at
  user-message arrival."
- Side-effects review at
  `upgrades/side-effects/presence-proxy-ack-and-baseline.md`
  covers over/under-block for the brief-ack filter,
  level-of-abstraction fit, signal-vs-authority compliance,
  interactions with CompactionSentinel / PromiseBeacon /
  ProxyCoordinator, and rollback cost.
- 15 new unit tests in
  `tests/unit/presence-proxy-ack-and-baseline.test.ts` covering
  `isBriefAck` (5), `extractDeltaSinceBaseline` (5), brief-ack
  handling end-to-end (3), baseline capture (2). All 64 prior
  PresenceProxy unit tests + 64 e2e tests still pass after
  updating two e2e tests to use clearly-substantive agent replies
  (the prior fixtures were short messages now correctly classified
  as acks).
+++ package/upgrades/side-effects/0.28.79.md
@@ -0,0 +1,310 @@

# Side-Effects Review — Topic-binding-aware zombie kill + resume-failure fallback

**Version / slug:** `zombie-kill-topic-binding`
**Date:** `2026-05-04`
**Author:** `Echo`
**Second-pass reviewer:** `independent-review-subagent (concerns raised + resolved)`

## Summary of the change

Closes a two-stage failure mode that drops the user's first message after a
conversational pause on Telegram-bound (and Slack/iMessage-bound) agents.

**Root cause traced from Inspec/monroe-workspace logs.** When a Telegram agent
finishes replying, Claude sits at the prompt waiting for the next user
message. SessionManager's zombie-killer interprets "idle at prompt + no active
processes for 15 minutes" as zombie and kills the session. When the user
finally messages, the bridge tries to respawn with `--resume <UUID>`; the
saved UUID was captured at kill time and sometimes crashes Claude during
startup (`Session died during startup`). `waitForClaudeReady` times out, the
initial message is logged "NOT injected", and the user's message is dropped.
Five minutes later, the presence proxy fires its `tier-3 — session appears
stopped` warning. The user has to send "unstick" or re-send to recover.

**Fix in two layers:**

- **Layer A — Topic-binding exemption (signal-vs-authority structural
  exemption).** SessionManager gains an optional `topicBindingChecker`
  callback. When the zombie-killer is about to act, it consults the checker;
  if the session is bound to a live messaging topic the kill threshold is
  raised from 15 minutes to a configurable bound threshold (default 240
  minutes / 4h). The binding is an authoritative structural fact (the
  TelegramAdapter's reverse map), not a judgment call. Default chosen to
  cover normal conversational pauses through a workday without holding
  per-session resources (Claude TUI ~200-500MB RSS, Anthropic connection)
  indefinitely. Operators can override via `idlePromptKillMinutesBoundToTopic`.

- **Layer B — Resume-failure fresh-spawn fallback.** When the readiness
  probe fails AND tmux died during startup AND the spawn was using
  `--resume`, SessionManager falls through once to a fresh spawn carrying the
  same initial message. A `resumeFailed` event is emitted; the bridge clears
  the bad UUID from `TopicResumeMap` so the next user-driven respawn doesn't
  retry the same broken UUID. The bridge listener gates the `remove()` on
  UUID-equality with the failed UUID — so a fresh spawn that quickly saved a
  *new* UUID won't have it wiped by a late-firing listener.

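Layer A's binding-aware threshold decision is small enough to sketch. The names below (`topicBindingChecker`, the config fields, and both functions) follow the prose but are illustrative, not the exact SessionManager internals:

```typescript
// Illustrative sketch of Layer A's binding-aware kill threshold.
// Field and callback names follow the prose; the real SessionManager
// internals may differ.
type TopicBindingChecker = (sessionId: string) => boolean;

interface KillConfig {
  idlePromptKillMinutes?: number;              // default 15
  idlePromptKillMinutesBoundToTopic?: number;  // default 240 (4h)
}

function killThresholdMinutes(
  sessionId: string,
  config: KillConfig,
  topicBindingChecker?: TopicBindingChecker,
): number {
  // Binding is an authoritative structural fact (the adapter's
  // reverse map), so it can safely raise the threshold.
  const bound = topicBindingChecker?.(sessionId) === true;
  return bound
    ? config.idlePromptKillMinutesBoundToTopic ?? 240
    : config.idlePromptKillMinutes ?? 15;
}

function shouldKillZombie(
  idleMinutes: number,
  sessionId: string,
  config: KillConfig,
  topicBindingChecker?: TopicBindingChecker,
): boolean {
  return idleMinutes > killThresholdMinutes(sessionId, config, topicBindingChecker);
}
```

Note that the checker only widens the threshold; it never vetoes a kill outright, so a genuinely abandoned bound session is still reaped once the bound threshold passes.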
**Files touched:**
- `src/core/types.ts` — adds `idlePromptKillMinutesBoundToTopic?: number`.
- `src/core/SessionManager.ts` — adds binding checker, bound threshold getter,
  binding-aware kill decision, and `handleReadyAndInject` with single-retry
  fresh-spawn fallback. Emits `resumeFailed` event.
- `src/commands/server.ts` — wires the binding checker to consult Telegram /
  Slack / iMessage adapters; subscribes to `resumeFailed` to clear the stale
  UUID from `TopicResumeMap`.
- `tests/unit/zombie-kill-topic-binding.test.ts` — new behavioral tests.
- `tests/unit/spawn-resume-fallback.test.ts` — new behavioral tests.

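Layer B's single-retry control flow, stripped of the pane cleanup and status bookkeeping, reduces to the shape below. The `Spawner` interface and both names are hypothetical stand-ins; the real `handleReadyAndInject` also marks the failed session and kills the dead tmux pane first:

```typescript
// Illustrative sketch of Layer B's one-shot fresh-spawn fallback.
// Names are hypothetical; the real handleReadyAndInject also marks
// the failed session status and cleans up the dead pane.
interface SpawnAttempt {
  ready: boolean;      // readiness probe outcome
  tmuxAlive: boolean;  // did tmux survive startup?
}

interface Spawner {
  spawn(resumeUuid: string | null, initialMessage: string): SpawnAttempt;
  emit(event: "resumeFailed", payload: { resumeSessionId: string }): void;
}

function spawnWithResumeFallback(
  spawner: Spawner,
  resumeUuid: string | null,
  initialMessage: string,
): SpawnAttempt {
  const first = spawner.spawn(resumeUuid, initialMessage);
  // Fall through exactly once: readiness failed AND tmux died during
  // startup AND we were resuming. No loop — a failing fresh spawn is
  // surfaced as degradation, not retried.
  if (!first.ready && !first.tmuxAlive && resumeUuid !== null) {
    spawner.emit("resumeFailed", { resumeSessionId: resumeUuid });
    return spawner.spawn(null, initialMessage); // fresh spawn, same message
  }
  return first;
}
```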
## Decision-point inventory

- `SessionManager` zombie-kill decision (`isActuallyIdle && idleMs > threshold`) — **modified**: threshold is now binding-aware.
- `SessionManager.spawnInteractiveSession` post-readiness initial-message inject — **modified**: adds a single fresh-spawn fallback when `--resume` crashes during startup.
- `commands/server.ts` `injectionDropped` listener — **pass-through**: existing recovery path is preserved; the new `resumeFailed` listener is purely a UUID-cleanup hook, no block/allow surface.

---

## 1. Over-block

**What legitimate inputs does this change reject that it shouldn't?**

The only block-shaped surface this change touches is "kill vs don't-kill". The
change *raises* the threshold for topic-bound sessions; it does not block
anything new. The risk is the inverse of over-block: failing to kill a truly
zombied topic-bound session for up to 4h.

Concrete scenario: A topic-bound session whose Claude process hangs internally
(e.g., infinite loop in the TUI) but stays "alive" by `pane_current_command`
will not be cleaned up by the zombie-killer for up to 4h. Mitigation: the
bridge's `isSessionAlive` check on the next user message authoritatively
detects truly-dead Claude processes and triggers a clean respawn — that's the
fast path. The 4h threshold only matters for users who never message again.

---

## 2. Under-block

**What failure modes does this still miss?**

Layer A misses: a topic-bound session whose Claude has stopped responding to
input but is still showing the prompt and registering as `alive` will not be
killed promptly. As above — mitigated by user-driven respawn on next message.

Layer B misses: if the fresh-spawn fallback also crashes during startup (e.g.,
disk full, `claudePath` wrong, persistent corruption), we surface a degradation
event but do not retry again. This is intentional — single retry only — to
avoid spawn-loops. The bridge's existing `injectionDropped` recovery path
will pick up the ball on the next inbound message.

Layer B also does not cover: the case where `--resume` succeeds *enough* for
tmux to stay alive but Claude itself is broken (won't render the prompt). In
that case we still fall through to "best-effort inject anyway", preserving the
prior behavior. That's not a regression.

---

## 3. Level-of-abstraction fit

**Is this at the right layer?**

Yes. SessionManager owns session lifecycle, so the kill threshold belongs
there. The binding check is delegated to a callback (the same pattern as
`subagentChecker` and `activeRecoveryChecker`) so SessionManager stays
unaware of which messaging platform is asking — it only consumes a yes/no
binding signal.

The fresh-spawn fallback also belongs in SessionManager because it owns the
spawn primitive. The bridge layer only consumes the `resumeFailed` event for
its own state cleanup (`TopicResumeMap.remove`), which is unique to the
bridge's responsibility.

A higher-level alternative would have been to do the retry in the bridge
(`routes.ts` `/internal/telegram-forward`). Rejected: that requires either
refactoring `spawnInteractiveSession` to expose readiness to the caller (big
churn across 15 callers) or duplicating the spawn-and-await logic in two
places (drift risk). Keeping it in SessionManager is cheaper and isolates
the fix to one method body.

---

## 4. Signal vs authority compliance

**Required reference:** [docs/signal-vs-authority.md](../../docs/signal-vs-authority.md)

**Does this change hold blocking authority with brittle logic?**

- [x] No — this change has no block/allow surface in the judgment sense.

The zombie-killer is not a judgment authority — it's a structural cleanup
mechanism whose behavior is now parameterized by an authoritative structural
fact (is this session in the topic→session reverse map). Per the principle
doc:

> When this principle does NOT apply: Hard-invariant validation … structural
> validators at the boundary of the system are not decision points in the
> sense this principle applies to.

The binding lookup is a hard structural fact ("is this session ID in the
TelegramAdapter's reverse map?"), not a judgment about what a message
*means*. There is no LLM, no regex, no similarity score, no token list.

The fresh-spawn fallback is a recovery flow control, not a decision point on
content or intent. No principle violation.

---

## 5. Interactions

**Does this interact with existing checks, recovery paths, or infrastructure?**

- **Shadowing:**
  - The zombie-killer's existing vetoes (`activeRecoveryChecker`,
    `subagentChecker`, `pendingInjections`) all run BEFORE the new threshold
    check. They are unaffected — bound sessions still respect compaction
    recovery, subagent activity, and pending-injection events.
  - The new `topicBindingChecker` runs AFTER `idlePromptSince` is established,
    not before, so the existing first-idle hooks (paste-retry, error-nudge)
    still fire normally on bound sessions.

- **Double-fire:**
  - `resumeFailed` and `injectionDropped` could both fire for the same session
    if the resume crashes AND the recovered fresh-spawn also fails to
    inject. In that case, the bridge's `injectionDropped` listener will
    re-forward the user's text via `/internal/telegram-forward`, which
    triggers a new spawn. This is the same path that already runs today on
    crashed sessions; the change does not introduce a new loop.

- **Races:**
  - The fresh-spawn fallback inside `handleReadyAndInject` runs after a
    `kill-session` to clean up any zombie pane. If a concurrent monitor tick
    is in flight, it could observe the dead pane mid-cleanup and emit
    `sessionComplete` for the failed session. We mark the failed session
    `status: 'failed'` BEFORE emitting `resumeFailed` (and before the
    recursive spawn), so reapers see consistent state through the full
    handoff.
  - The `resumeFailed` listener fires AFTER the fresh spawn may have already
    saved a new UUID via the proactive 8-second save. To avoid wiping the new
    UUID, the listener gates `TopicResumeMap.remove(topicId)` on a
    UUID-equality check (only remove when the stored UUID still matches the
    failed one). Direct test: `tests/unit/resume-failed-uuid-gate.test.ts`.
  - `topicBindingChecker` is read-only. No shared mutable state.

- **Feedback loops:** None. The fresh-spawn fallback is one-shot.

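The UUID-equality gate that closes the second race above is essentially a compare-then-delete. A minimal sketch, assuming a plain `Map` stand-in for the bridge's persisted `TopicResumeMap` (the function name and payload shape are illustrative):

```typescript
// Illustrative sketch of the UUID-equality gate on the resumeFailed
// listener. A plain Map stands in for the bridge's persisted
// topic → resume-UUID store; names are assumptions.
const topicResumeMap = new Map<string, string>();

function onResumeFailed(topicId: string, failedUuid: string): void {
  // Only clear when the stored UUID is still the one that failed.
  // If a fresh spawn already saved a *new* UUID (the race), a
  // late-firing listener must not wipe it.
  if (topicResumeMap.get(topicId) === failedUuid) {
    topicResumeMap.delete(topicId);
  }
}
```

The gate makes the listener idempotent and order-tolerant: a stale event against an already-replaced or already-removed UUID is a no-op.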
---

## 6. External surfaces

**Does this change anything visible outside the immediate code path?**

- **Other agents on the same machine:** No. Each agent's SessionManager owns
  its own kill threshold and its own binding map.
- **Other users of the install base:** Yes — every Telegram/Slack/iMessage
  agent will now hold sessions for up to 4h instead of cleaning them up at
  15 minutes. Memory footprint and Claude API connection count per agent
  may rise. Users with many concurrent topics on a memory-constrained host
  can override `idlePromptKillMinutesBoundToTopic` in config.json. The
  default is conservative for the common case (1-3 topics per agent).
- **External systems:** No changes to Telegram/Slack/iMessage API surface.
  Tunnel, GitHub, Cloudflare unaffected.
- **Persistent state:** `TopicResumeMap` entries are cleared on resume
  failure (one extra `remove` call per failure). State file format unchanged.
- **Timing/runtime:** The bound threshold default (4h) is bounded; sessions
  cannot accumulate forever. The fresh-spawn fallback adds at most one
  additional 90-second readiness window per spawn attempt; bounded.
- **Logs:** New log lines on bound-zombie-kill (`(topic-bound, threshold Nm)`),
  resume failure (`Resume failed for "X" — tmux died during startup. Falling
  back to fresh spawn.`), and fresh-spawn success/failure. Format is
  consistent with existing `[SessionManager]` lines.

---

## 7. Rollback cost

**If this turns out wrong in production, what's the back-out?**

Pure code change. No schema migration, no persistent state shape change, no
data migration. Rollback path: revert the commit, ship as next patch. Agents
will resume the prior 15-minute kill threshold on their next server restart.
No user-visible regression during the rollback window — at worst, the user sees
the old "session appears stopped" pattern they reported.

The new config option `idlePromptKillMinutesBoundToTopic` falls back to a
hardcoded default (240), so a rollback that drops the field from disk
config is a no-op.

---

## Conclusion

This review produced no design changes — both layers passed
signal-vs-authority compliance and the side-effects review on first read. The
change is contained to SessionManager and one wiring call in
`commands/server.ts`, with two new dedicated test files (8 new tests) plus 21
existing session-reap-detect tests still passing.

The change is clear to ship pending second-pass review (required because it
touches session lifecycle: spawn, kill, recovery).

---

## Second-pass review (if required)

**Reviewer:** independent-review-subagent
**Independent read of the artifact: concern**

I concur on layer A's signal-vs-authority compliance and on the overall shape of layer B, but I have specific concerns that should be resolved before ship:

- **Threshold default 1440m (24h) is too aggressive a swing from 15m.** The healthy-waiting-state argument is sound, but 24h means each bound session holds a Claude TUI process (~200–500MB RSS) and an Anthropic connection for a full day even if the user never returns. For an agent with 8–10 concurrent Telegram topics on a 16GB host, that's 2–5GB of resident memory locked indefinitely, vs. the prior steady state where idle topics released within 15m and only re-spawned on the next message. The artifact's mitigation ("config override available") puts the burden on every multi-topic operator to discover the new default and tune it down; the conservative default should solve the reported symptom without the resource cost. **Recommended resolution:** drop the default to 240m (4h) — long enough that conversational pauses through normal work hours don't trip the kill, short enough that overnight idle sessions release. Keep the config knob for users who genuinely want 24h.

- **Cleanup race between the proactive UUID save and the `resumeFailed` listener is plausible (low-likelihood but real).** Order of operations I traced:
  1. `spawnSessionForTopic` calls `spawnInteractiveSession` (returns at line 1390 once tmux is created, before the readiness probe).
  2. Caller at server.ts:528 immediately removes the bad UUID from `TopicResumeMap`.
  3. Caller at server.ts:537–549 schedules a `setTimeout(8s)` proactive UUID save against the same tmux name.
  4. ~90s later, `handleReadyAndInject` decides the resume failed, emits `resumeFailed`, and the listener tries to remove the UUID (no-op — already gone).
  5. The fallback recursively calls `spawnInteractiveSession`, which creates a fresh Claude under the same tmux name.

  The 8s proactive save fires while the failed Claude is still in startup-crash territory (no hook event yet, `claudeSessionId` empty → save is skipped). That's safe by accident, not by design. If the fresh-spawn fallback finishes quickly enough that a hook event lands before the `resumeFailed` listener fires, the listener could clear a fresh, valid UUID. The current emit-before-spawn ordering makes this unlikely, but it's not asserted by a test. **Recommended resolution:** add a test that runs the full sequence (proactive save scheduled → resume crash → fallback spawn → fresh hook event lands) and verifies `TopicResumeMap` ends with the *new* UUID, not empty. Or, more defensively, gate the listener's remove on a UUID-equality check (only clear if the stored UUID still matches `info.resumeSessionId`).

- **Failed-session status update happens AFTER `emit('resumeFailed')`.** Lines 1431–1446 emit the event first, then mark `failed.status = 'failed'`. The artifact §5 Races claims "We mark the failed session `status: 'failed'` before kicking off the fresh spawn so reapers don't re-process it" — but the emit-then-mark order means a concurrent monitor tick that fires between the emit and `state.saveSession(failed)` will see `status: 'running'` on a dead pane. The recursive spawn happens after the marking, so the practical impact is small (the window is microseconds), but the artifact's claim doesn't match the code. **Recommended resolution:** either move the status update before the emit, or soften the artifact's claim.

- **Test coverage gaps the artifact undersells.** All 8 new tests use mocked tmux with a single session; none cover (a) the fresh-spawn fallback itself failing — only `DegradationReporter.report` is exercised by the code path, never by a test, (b) concurrent monitor ticks during fallback (the race the artifact §5 itself flags), (c) multiple bound + unbound sessions on the same manager where the binding checker returns a mix, or (d) the listener's UUID cleanup interacting with a happy-path remove on the same topic. The "21 existing session-reap-detect tests" don't cover any of this — they predate the change. **Recommended resolution:** add at least the "fallback also fails" test and a "mixed bound/unbound sessions" test before merge.

- **Minor: ``tmuxSession.replace(`${path.basename(this.config.projectDir)}-`, '')`` at line 1457 is a string first-occurrence replace.** If the agent's project directory basename happens to appear later in the session name (rare but possible — e.g. project `monroe`, session named `monroe-ai-monroe-debug`), only the first occurrence is stripped, which is correct. But the implicit assumption that `tmuxSession` always begins with `${projectBase}-` isn't enforced — if `name` was originally `null` and `tmuxSession` was `${projectBase}-interactive-${Date.now()}`, the recursive call passes `interactive-${Date.now()}` as `name`, creating a *different* tmux session name on retry (`projectBase-interactive-<sanitized>`). The recursive call would not reuse the same tmux name, breaking the implicit contract that the bridge's session→topic mapping still resolves. **Recommended resolution:** add a guard for the un-named case, or pass the tmuxSession name through more explicitly to ensure name preservation.

None of these are blockers in the "stop the world" sense — layer A is sound and layer B is a clear improvement on dropping messages. But the threshold default and the cleanup-race test gap warrant a follow-up before this lands on a production-traffic agent.

---

### Author's resolution of second-pass concerns

All five concerns were addressed in this same PR before commit:

1. **Threshold default lowered from 1440 → 240 minutes (4h).** Source: `src/core/SessionManager.ts:65`. Long enough to cover normal conversational pauses through a workday; short enough to release resources from genuinely abandoned topics. Config knob `idlePromptKillMinutesBoundToTopic` preserved for operators who want a different value. Test updated to assert the new default.

2. **UUID-equality gate on the `resumeFailed` listener.** Source: `src/commands/server.ts` (search `UUID-equality gate`). The listener now reads the stored UUID and only calls `remove()` when it matches `info.resumeSessionId`. New test file `tests/unit/resume-failed-uuid-gate.test.ts` covers all four cases: matching, replaced (the race), absent, and missing-topicId.

3. **Order swapped: the failed-status update now happens BEFORE `emit('resumeFailed')`.** Source: `src/core/SessionManager.ts` `handleReadyAndInject`. The artifact §5 claim now matches the code.

4. **Test gaps closed.** Added: "fresh-spawn fallback also fails → degradation reported", "mixed bound + unbound sessions on the same manager", and the entire UUID-equality gate test file. Total new behavioral tests: 14 (was 8).

5. **tmuxSession-name reconstruction fixed.** Source: `handleReadyAndInject` now threads the original `name` parameter through and passes it directly to the recursive `spawnInteractiveSession`. The fragile `tmuxSession.replace(prefix, '')` reconstruction is gone — auto-generated `interactive-${ts}` names round-trip correctly.

Verified by re-running the focused test suite: 81 tests across 7 files passing.

---
|
|
294
|
+
|
|
295
|
+
## Evidence pointers
|
|
296
|
+
|
|
297
|
+
**Repro evidence:**
|
|
298
|
+
- `/Users/justin/Documents/Projects/monroe-workspace/logs/server.log`
|
|
299
|
+
- `2026-05-04T23:39:14Z` — zombie kill of healthy idle session
|
|
300
|
+
- `2026-05-05T00:48:16Z` — user message arrives, no live session
|
|
301
|
+
- `2026-05-05T00:48:16Z` — respawn-with-resume attempted (UUID `716881a4-...`)
|
|
302
|
+
- `2026-05-05T00:48:20Z` — Session died during startup
|
|
303
|
+
- `2026-05-05T00:48:20Z` — Claude not ready, message NOT injected
|
|
304
|
+
- 19 prior occurrences of the same `Claude not ready` log line going back to 2026-04-28.
|
|
305
|
+
|
|
306
|
+
**Test evidence:**
|
|
307
|
+
- `tests/unit/zombie-kill-topic-binding.test.ts` — 6 tests: unbound kill, bound exemption, bound + over-threshold kill, null-checker, mixed bound+unbound (added per reviewer), default 4h.
|
|
308
|
+
- `tests/unit/spawn-resume-fallback.test.ts` — 4 tests: resume crash → fresh-spawn fallback, no-resume fresh spawn, no-fallback on prompt-detection false negative, both spawns fail → degradation reported (added per reviewer).
|
|
309
|
+
- `tests/unit/resume-failed-uuid-gate.test.ts` — 4 tests (added per reviewer): clear when stored UUID matches, preserve when stored UUID has been replaced (race), no-op when no stored UUID, no-op when no telegramTopicId.
|
|
310
|
+
- 7 related test files (81 tests) all green: `session-manager-behavioral`, `session-reap-detect`, `CompactionSentinel`, `bootstrap-file-threshold`, plus the three new files.

---

# Side-Effects Review — Wire WorkingMemoryAssembler into session context API

**Version / slug:** `assembler-context-endpoint`
**Date:** 2026-04-28
**Author:** gfrankgva (contributor)
**Second-pass reviewer:** Echo (EchoOfDawn), 3 review rounds

## Summary of the change

Two files touched:

1. `src/commands/server.ts` — WorkingMemoryAssembler construction is moved from line 3258 (before activitySentinel) to after activitySentinel initialization (~line 3475). This enables wiring `episodicMemory` via `activitySentinel.getEpisodicMemory()`, which was previously left as a TODO comment. The assembler now receives both `semanticMemory` and `episodicMemory`, making the 400-token episode budget functional in production. Construction is guarded by `if (semanticMemory || activitySentinel)` — skipped entirely in minimal-config setups where neither memory system is available.

2. `src/server/routes.ts` — The two assembled-context endpoints (`/topic/context/:topicId?assembled=true` and `/session/context/:topicId`) are refactored to call a shared `assembleAndRespond()` helper instead of duplicating the assembly + response logic. The helper takes the assembler instance, topicId, options, and the Express response object. Auth confirmation is added to the JSDoc for the session context route.
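
The guarded, sentinel-aware construction in item 1 can be sketched as follows. This is an illustrative reduction — `buildAssembler`, the interfaces, and the class shapes are hypothetical stand-ins for the real `server.ts` wiring, not the project's actual code:

```typescript
interface EpisodicMemory { recent(): string[] }
interface SemanticMemory { query(q: string): string[] }

// Stand-in for the real sentinel: it owns the episodic memory instance.
class ActivitySentinel {
  constructor(private episodic: EpisodicMemory) {}
  getEpisodicMemory(): EpisodicMemory { return this.episodic; }
}

class WorkingMemoryAssembler {
  constructor(
    readonly semanticMemory?: SemanticMemory,
    readonly episodicMemory?: EpisodicMemory, // undefined → episode budget inert
  ) {}
}

function buildAssembler(
  semanticMemory?: SemanticMemory,
  activitySentinel?: ActivitySentinel,
): WorkingMemoryAssembler | undefined {
  // Guard: skip construction entirely in minimal-config setups where
  // neither memory system is available.
  if (!semanticMemory && !activitySentinel) return undefined;
  // Optional chaining: the sentinel may be absent even when semantic
  // memory exists; the assembler then degrades gracefully.
  return new WorkingMemoryAssembler(semanticMemory, activitySentinel?.getEpisodicMemory());
}

console.log(buildAssembler() === undefined); // true: minimal config, no assembler
```

The key property is that the assembler is constructed *after* the sentinel exists (or is known to be undefined), so the episodic wiring no longer has to be deferred behind a TODO.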

## Decision-point inventory

- `WorkingMemoryAssembler` construction order — **modify** (move later in init sequence for dependency availability).
- `WorkingMemoryAssembler` construction guard — **add** (skip when both `semanticMemory` and `activitySentinel` are undefined).
- `episodicMemory` wiring — **add** (was commented out, now passed via `activitySentinel?.getEpisodicMemory()`).
- `assembleAndRespond()` helper — **add** (extracts duplicated assembly logic).
- Route handlers — **modify** (delegate to shared helper instead of inline assembly).

---

## 1. Over-block

**What legitimate inputs does this change reject that it shouldn't?**

None. The assembler degrades gracefully when episodicMemory is undefined (sentinel requires sharedIntelligence / LLM key). The helper produces identical output to the previous inline logic. Backwards compatibility is preserved: `?assembled=true` is opt-in, and the raw topic context path is unchanged.

## 2. Under-block

**What failure modes does this still miss?**

If `activitySentinel.getEpisodicMemory()` returns an EpisodicMemory instance that later becomes invalid (e.g., sentinel is stopped mid-session), the assembler would hold a stale reference. However, EpisodicMemory is file-based (JSON under `state/episodes/`), so the instance remains usable even if the sentinel stops producing new digests — it just won't have fresh data.
## 3. Level-of-abstraction fit

**Is this at the right layer?**

Yes. The assembler is a dependency-injected component — it receives its memory sources at construction time. Moving its initialization to the correct point in the dependency graph (after sentinel) is the natural fix. The shared helper is a local function within the route setup closure, keeping the DRY refactor scoped to the routes file.

## 4. Blocking authority

- [x] No — these are read-only API endpoints. They do not gate any operation.

## 5. Interactions

- **Init ordering**: Assembler now depends on `activitySentinel` being initialized first. If sentinel init fails (sharedIntelligence unavailable), `activitySentinel` is undefined and `getEpisodicMemory()` is not called — assembler gets `episodicMemory: undefined` and degrades gracefully.
- **Route behavior**: Identical to prior implementation — the helper is a pure extraction refactor.
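
The "pure extraction" claim is easiest to see in a sketch. This is a hypothetical shape for the `assembleAndRespond()` helper — the `Res` and `Assembler` interfaces stand in for Express's `Response` and the real assembler, and all names here are illustrative, not the project's actual route code:

```typescript
interface Res { statusCode: number; json(body: unknown): void }
interface Assembler { assemble(topicId: string, opts: { assembled?: boolean }): { context: string } }

// Shared helper: both context routes delegate here instead of duplicating
// the assemble-then-respond logic inline.
function assembleAndRespond(
  assembler: Assembler | undefined,
  topicId: string,
  opts: { assembled?: boolean },
  res: Res,
): void {
  if (!assembler) {
    // Minimal-config setups skip assembler construction entirely.
    res.statusCode = 503;
    res.json({ error: 'working-memory assembler unavailable' });
    return;
  }
  res.statusCode = 200;
  res.json(assembler.assemble(topicId, opts));
}

// A stub response object makes the helper testable without Express.
const captured: { status?: number; body?: unknown } = {};
const res: Res = {
  statusCode: 0,
  json(body) { captured.status = this.statusCode; captured.body = body; },
};
assembleAndRespond({ assemble: (id) => ({ context: `ctx:${id}` }) }, 'topic-9', {}, res);
console.log(captured.status); // 200
```

Because the helper takes the assembler, topicId, options, and response object as plain parameters, each route handler shrinks to argument marshalling — the refactor moves code but changes no behavior.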

## 6. External surfaces

- **Agents**: Session-start hooks calling `/session/context/:topicId` now receive episode context (recent activity digests, themed episodes) in the assembled output. This is strictly additive — agents get richer context.
- **Persistent state**: No modifications. Both endpoints are read-only.

## 7. Rollback cost

Pure code change. Revert restores the previous inline handlers and removes episodic wiring. No migration or data repair needed.

---

## Evidence pointers

- Typecheck: `tsc --noEmit` — 0 errors.
- Existing tests (14 integration + 3 E2E) cover both endpoints' happy paths, fallback behavior, and budget surfacing. The shared helper produces identical output, so existing test assertions remain valid.

---

# Side-Effects Review — PostUpdateMigrator path decoding fix

**Version / slug:** `post-update-migrator-path-fix`
**Date:** 2026-04-28
**Author:** gfrankgva (contributor)

## Summary of the change

One file, one line:

`src/core/PostUpdateMigrator.ts` — `getFreeTextGuardHook()` replaced `path.dirname(new URL(import.meta.url).pathname)` with `__dirname`. The former preserves `%20`-encoded spaces in the filesystem path, causing `fs.readFileSync` to fail when the project directory contains spaces. `__dirname` is already defined at module scope via `fileURLToPath(import.meta.url)`, which properly decodes percent-encoded characters.
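
The difference between the two approaches can be demonstrated in isolation (POSIX paths assumed; the directory name here is made up for illustration):

```typescript
import * as path from 'node:path';
import { fileURLToPath, pathToFileURL } from 'node:url';

// A module URL whose directory name contains a space.
const moduleUrl = pathToFileURL('/tmp/My Project/dist/core/PostUpdateMigrator.js').href;

// Buggy: URL.pathname keeps percent-encoding, so the space survives as
// "%20" and fs.readFileSync on the joined path fails with ENOENT.
const buggyDir = path.dirname(new URL(moduleUrl).pathname);

// Fixed: fileURLToPath decodes "%20" back to a literal space — this is
// the conversion the module-scope __dirname is built from.
const fixedDir = path.dirname(fileURLToPath(moduleUrl));

console.log(buggyDir); // /tmp/My%20Project/dist/core
console.log(fixedDir); // /tmp/My Project/dist/core
```

Since `__dirname` is already derived from `fileURLToPath(import.meta.url)` at module scope, reusing it also keeps the file to a single URL-to-path conversion.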

## Decision-point inventory

- `getFreeTextGuardHook()` path construction — **fix** (replace `URL.pathname` with `__dirname`).

---

## 1. Over-block

None. Pure bug fix — strictly widens the set of environments where the function works.

## 2. Under-block

None. `__dirname` handles all valid filesystem paths.

## 3. Level-of-abstraction fit

Correct. Uses the same `__dirname` already defined at module scope by the file itself.

## 4. Blocking authority

- [x] No — this is a path construction fix, not a gate.

## 5. Interactions

None. The function is called during hook installation — no racing, no shadowing.

## 6. External surfaces

The function returns the content of `free-text-guard.sh`, which is written to `.claude/hooks/`. No behavioral change to the hook content itself.

## 7. Rollback cost

Revert restores the bug — `readFileSync` would again fail on paths with spaces.

---

## Evidence pointers

- Typecheck: `tsc --noEmit` — 0 errors.
- Tests: All 7 `PostUpdateMigrator-buildStopHook` tests pass (were 2/7 failing before this fix).