ai-agent-session-center 2.0.2 → 2.0.3

Files changed (58)
  1. package/README.md +484 -429
  2. package/docs/3D/ADAPTATION_GUIDE.md +592 -0
  3. package/docs/3D/index.html +754 -0
  4. package/docs/AGENT_TEAM_TASKS.md +716 -0
  5. package/docs/CYBERDROME_V2_SPEC.md +531 -0
  6. package/docs/ERROR_185_ANALYSIS.md +263 -0
  7. package/docs/PLATFORM_FEATURES_PROMPT.md +296 -0
  8. package/docs/SESSION_DETAIL_FEATURES.md +98 -0
  9. package/docs/_3d_multimedia_features.md +1080 -0
  10. package/docs/_frontend_features.md +1057 -0
  11. package/docs/_server_features.md +1077 -0
  12. package/docs/session-duplication-fixes.md +271 -0
  13. package/docs/session-terminal-linkage.md +412 -0
  14. package/package.json +63 -5
  15. package/public/apple-touch-icon.svg +21 -0
  16. package/public/css/dashboard.css +0 -161
  17. package/public/css/detail-panel.css +25 -0
  18. package/public/css/layout.css +18 -1
  19. package/public/css/modals.css +0 -26
  20. package/public/css/settings.css +0 -150
  21. package/public/css/terminal.css +34 -0
  22. package/public/favicon.svg +18 -0
  23. package/public/index.html +6 -26
  24. package/public/js/alarmManager.js +0 -21
  25. package/public/js/app.js +21 -7
  26. package/public/js/detailPanel.js +63 -64
  27. package/public/js/historyPanel.js +61 -55
  28. package/public/js/quickActions.js +132 -48
  29. package/public/js/sessionCard.js +5 -20
  30. package/public/js/sessionControls.js +8 -0
  31. package/public/js/settingsManager.js +0 -142
  32. package/server/apiRouter.js +60 -15
  33. package/server/apiRouter.ts +774 -0
  34. package/server/approvalDetector.ts +94 -0
  35. package/server/authManager.ts +144 -0
  36. package/server/autoIdleManager.ts +110 -0
  37. package/server/config.ts +121 -0
  38. package/server/constants.ts +150 -0
  39. package/server/db.ts +475 -0
  40. package/server/hookInstaller.d.ts +3 -0
  41. package/server/hookProcessor.ts +108 -0
  42. package/server/hookRouter.ts +18 -0
  43. package/server/hookStats.ts +116 -0
  44. package/server/index.js +15 -1
  45. package/server/index.ts +230 -0
  46. package/server/logger.ts +75 -0
  47. package/server/mqReader.ts +349 -0
  48. package/server/portManager.ts +55 -0
  49. package/server/processMonitor.ts +239 -0
  50. package/server/serverConfig.ts +29 -0
  51. package/server/sessionMatcher.js +17 -6
  52. package/server/sessionMatcher.ts +403 -0
  53. package/server/sessionStore.js +109 -3
  54. package/server/sessionStore.ts +1145 -0
  55. package/server/sshManager.js +167 -24
  56. package/server/sshManager.ts +671 -0
  57. package/server/teamManager.ts +289 -0
  58. package/server/wsManager.ts +200 -0
@@ -0,0 +1,271 @@
1
+ # Session Card Duplication Fixes
2
+
3
+ > Comprehensive analysis of all bugs that caused duplicate session cards and the fixes applied.
4
+
5
+ ## Problem Statement
6
+
7
+ Clicking **Resume/Reconnect** on a session card — or restarting the server and refreshing the browser — would create duplicate session cards pointing to the same SSH terminal. In the worst case, a single resume click produced **two extra cards**, resulting in three cards for one logical session.
8
+
9
+ ---
10
+
11
+ ## Root Causes Overview
12
+
13
+ | # | Root Cause | Trigger | Impact |
14
+ |---|-----------|---------|--------|
15
+ | 1 | `AGENT_MANAGER_TERMINAL_ID` not propagated over SSH | Any remote SSH session | Priority 1 matching fails; falls through to weaker heuristics |
16
+ | 2 | `projectPath` resolves `~` to local homedir for remote sessions | Remote SSH resume | Priority 0 path matching fails (local vs remote path) |
17
+ | 3 | Stale `pendingResume`/`pendingLinks` when `--resume` reuses same session_id | `claude --resume` with unchanged ID | Dangling entries cause future sessions to mis-match |
18
+ | 4 | `createTerminal()` always registers `pendingLinks` — even for resume | Resume with dead terminal | Another Claude session in same directory steals terminal via Priority 2 |
19
+ | 5 | Close button race with `session_removed` broadcast | Closing a card, then server broadcast | IndexedDB record re-created after deletion |
20
+ | 6 | Server Map key diverges from `session.sessionId` after re-key | Resume with new session_id | Snapshot sends both old key and new key → two cards |
21
+ | 7 | `pendingResume` not persisted across Ctrl+C restart | Stop server, restart, refresh | Resume data lost; hooks create new display-only card |
22
+
23
+ ---
24
+
25
+ ## Fix 1: Export `AGENT_MANAGER_TERMINAL_ID` Over SSH
26
+
27
+ **File:** `server/sshManager.js`
28
+
29
+ **Problem:** The terminal ID environment variable is set in the local PTY's `env` object, but SSH doesn't forward arbitrary env vars to the remote shell. Remote hooks never include `agent_terminal_id`, so Priority 1 matching (`tryMatchByTerminalId`) always fails.
30
+
31
+ **Fix:** Export the variable explicitly in the remote shell command — both for direct launch and tmux launch:
32
+
33
+ ```javascript
34
+ // Direct launch (sshManager.js)
35
+ if (!local) {
36
+   launchCmd += `export AGENT_MANAGER_TERMINAL_ID='${terminalId}' && `;
37
+ }
38
+
39
+ // Tmux launch (sshManager.js)
40
+ if (!local) {
41
+   tmuxSendCmd += `export AGENT_MANAGER_TERMINAL_ID='${terminalId}' && `;
42
+ }
43
+ ```
44
+
45
+ **Also applied to resume prefix** in `server/apiRouter.js`:
46
+
47
+ ```javascript
48
+ if (isRemote) {
49
+   prefix += `export AGENT_MANAGER_TERMINAL_ID='${newTerminalId}' && `;
50
+   if (cfg.workingDir) prefix += `cd '${cfg.workingDir}' && `;
51
+ }
52
+ ```
53
+
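The snippets above interpolate values straight into single quotes. As a minimal sketch of a more defensive variant, the following assumes two hypothetical helpers, `shellQuote` and `buildRemotePrefix`, neither of which is part of the package. It only illustrates how a terminal ID or working directory containing a single quote could be escaped before being sent to a POSIX remote shell.

```javascript
// Hypothetical helper (not from the package): wrap a value in single quotes
// for a POSIX shell, escaping each embedded single quote as '\''.
function shellQuote(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper mirroring the resume prefix shown above.
function buildRemotePrefix(terminalId, workingDir) {
  let prefix = `export AGENT_MANAGER_TERMINAL_ID=${shellQuote(terminalId)} && `;
  if (workingDir) prefix += `cd ${shellQuote(workingDir)} && `;
  return prefix;
}

console.log(buildRemotePrefix('term-123', "/home/user/it's here"));
```

Whether the real code needs this depends on where `terminalId` and `workingDir` come from; generated IDs are safe, user-supplied paths may not be.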
54
+ ---
55
+
56
+ ## Fix 2: Update SSH `projectPath` From Hook's Actual CWD
57
+
58
+ **File:** `server/sessionStore.js` — `handleEvent()` SessionStart handler
59
+
60
+ **Problem:** When `createTerminalSession()` is called for a remote SSH session, it resolves `~` to the **local** homedir (e.g., `/Users/kason`). But the hook reports the **remote** cwd (e.g., `/home/user/project`). This mismatch causes Priority 0 path-based matching to fail on resume.
61
+
62
+ **Fix:** On `SessionStart`, update `projectPath` from the hook's cwd — but **only for SSH sessions** to avoid overwriting source-derived project names on display-only sessions (VS Code, Terminal, iTerm, etc.):
63
+
64
+ ```javascript
65
+ if (cwd && cwd !== session.projectPath && session.source === 'ssh') {
66
+   const oldPath = session.projectPath;
67
+   session.projectPath = cwd;
68
+   session.projectName = cwd.split('/').filter(Boolean).pop() || session.projectName;
69
+   // source is NEVER overwritten
70
+ }
71
+ ```
72
+
73
+ **Key constraint:** The user explicitly required: *"don't lose the session card source, like VS Code, Terminal, etc."* The update is therefore scoped to sessions where `session.source === 'ssh'`.
74
+
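To make the scoping rule concrete, here is a small sketch that pulls the update into a standalone function. The name `applyHookCwd` and the sample session objects are assumptions made for illustration; the real logic lives inline in `handleEvent()`.

```javascript
// Hypothetical standalone version of the SessionStart path update.
function applyHookCwd(session, cwd) {
  if (!cwd || cwd === session.projectPath || session.source !== 'ssh') {
    return false; // nothing to update, or not an SSH session
  }
  session.projectPath = cwd;
  // filter(Boolean) drops empty segments, so a trailing slash is harmless.
  session.projectName = cwd.split('/').filter(Boolean).pop() || session.projectName;
  return true;
}

const sshSession = { source: 'ssh', projectPath: '/Users/kason/project', projectName: 'project' };
const vscodeSession = { source: 'vscode', projectPath: '/Users/kason/app', projectName: 'app' };

applyHookCwd(sshSession, '/home/user/project/'); // path switches to the remote cwd
applyHookCwd(vscodeSession, '/tmp/elsewhere');   // ignored: display-only session
```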
75
+ ---
76
+
77
+ ## Fix 3: Clean Stale `pendingResume`/`pendingLinks` on Direct ID Match
78
+
79
+ **File:** `server/sessionMatcher.js`
80
+
81
+ **Problem:** When `claude --resume <id>` reuses the **same** session_id, the session is found by direct `Map.get(session_id)`. But `reconnectSessionTerminal()` / `createTerminal()` already registered `pendingResume` and `pendingLinks` entries. These are never consumed because the matcher returns early on direct match. Dangling entries then incorrectly match future, unrelated hooks.
82
+
83
+ **Fix:** Clean up stale entries when a session is found by direct ID:
84
+
85
+ ```javascript
86
+ if (session) {
87
+   if (hook_event_name === 'SessionStart' && session.terminalId) {
88
+     if (pendingResume.has(session.terminalId)) {
89
+       pendingResume.delete(session.terminalId);
90
+     }
91
+     consumePendingLink(session.projectPath);
92
+   }
93
+   return session;
94
+ }
95
+ ```
96
+
97
+ Also consume `pendingLinks` after a successful Priority 0 resume match:
98
+
99
+ ```javascript
100
+ // After Priority 0 match succeeds
101
+ if (session && session.projectPath) {
102
+   consumePendingLink(session.projectPath);
103
+ }
104
+ ```
105
+
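The effect of the cleanup can be seen with in-memory stand-ins for the matcher's state. This is a sketch only: the Maps, `cleanOnDirectMatch`, and the boolean return of `consumePendingLink` are assumptions for illustration, not the package's actual signatures.

```javascript
// Hypothetical in-memory stand-ins for the matcher's state.
const pendingResume = new Map(); // terminalId -> { oldSessionId }
const pendingLinks = new Map();  // workDir -> { terminalId }

function consumePendingLink(workDir) {
  return pendingLinks.delete(workDir);
}

// Direct-ID path: clear leftovers registered by the resume flow.
function cleanOnDirectMatch(session, hookEventName) {
  if (hookEventName === 'SessionStart' && session.terminalId) {
    pendingResume.delete(session.terminalId);
    consumePendingLink(session.projectPath);
  }
  return session;
}

// The resume flow registered both entries, then `claude --resume` reused the same ID.
pendingResume.set('term-1', { oldSessionId: 'a1b2' });
pendingLinks.set('/proj', { terminalId: 'term-1' });
const resumed = { sessionId: 'a1b2', terminalId: 'term-1', projectPath: '/proj' };
cleanOnDirectMatch(resumed, 'SessionStart');
console.log(pendingResume.size, pendingLinks.size);
```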
106
+ ---
107
+
108
+ ## Fix 4: Consume `pendingLinks` Immediately After Resume Terminal Creation
109
+
110
+ **File:** `server/apiRouter.js` — resume endpoint
111
+
112
+ **Problem:** This was the **primary root cause** of the "two extra cards on resume" bug. The flow:
113
+
114
+ 1. User clicks Resume → API calls `createTerminal()` → registers `pendingLinks.set(workDir, { terminalId })`
115
+ 2. The **current conversation's Claude session** (running in the same working directory) fires a hook
116
+ 3. Hook arrives → Priority 2 (`tryLinkByWorkDir`) matches the pendingLink → **steals the terminal** → creates a duplicate card
117
+ 4. The actual resume Claude starts → `pendingResume` is consumed → but the terminal is already stolen → falls through all priorities → creates **another** display-only card
118
+
119
+ **Fix:** Immediately consume the pendingLink after `createTerminal()`, since the resume flow uses `pendingResume` (not `pendingLinks`) for matching:
120
+
121
+ ```javascript
122
+ const newTerminalId = await createTerminal(newConfig, null);
123
+
124
+ // Immediately consume the pendingLink that createTerminal registered.
125
+ // The resume flow uses pendingResume (not pendingLinks) for session matching.
126
+ // If we leave the pendingLink alive, ANY other Claude session in the same
127
+ // working directory could match it via Priority 2 (tryLinkByWorkDir),
128
+ // stealing the terminal and creating a duplicate card.
129
+ consumePendingLink(newConfig.workingDir || session.projectPath || '');
130
+
131
+ const result = reconnectSessionTerminal(sessionId, newTerminalId);
132
+ ```
133
+
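The four-step flow above can be reproduced in a few lines. The sketch below assumes a simplified `tryLinkByWorkDir` that hands the terminal to the first hook whose cwd matches; `simulateResume` is a hypothetical driver written only for this illustration.

```javascript
const pendingLinks = new Map(); // workDir -> { terminalId }

// Hypothetical stand-in for Priority 2: the first hook from a matching cwd wins.
function tryLinkByWorkDir(cwd) {
  const link = pendingLinks.get(cwd);
  if (!link) return null;
  pendingLinks.delete(cwd);
  return link.terminalId;
}

function simulateResume(consumeImmediately) {
  pendingLinks.clear();
  pendingLinks.set('/proj', { terminalId: 'term-new' }); // createTerminal()
  if (consumeImmediately) pendingLinks.delete('/proj');   // Fix 4
  // An unrelated Claude session in /proj fires a hook before the resumed one.
  return tryLinkByWorkDir('/proj');
}

const stolenWithoutFix = simulateResume(false); // unrelated hook takes the terminal
const stolenWithFix = simulateResume(true);     // nothing left to take
console.log(stolenWithoutFix, stolenWithFix);
```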
134
+ ---
135
+
136
+ ## Fix 5: Close Button IndexedDB Race Condition
137
+
138
+ **File:** `public/js/sessionCard.js`
139
+
140
+ **Problem:** The close button handler did `get → put (mark as ended)` on IndexedDB. But the server also broadcasts `session_removed`, and the broadcast handler calls `del('sessions', sid)`. Race condition:
141
+
142
+ 1. Close button fires → `get()` starts
143
+ 2. `session_removed` broadcast arrives → `del()` completes
144
+ 3. Close button's `get()` resolves → `put()` **re-creates** the deleted record
145
+
146
+ On next refresh, IndexedDB has the "ghost" record → duplicate card.
147
+
148
+ **Fix:** Use `del()` directly instead of `get → put`:
149
+
150
+ ```javascript
151
+ // Delete from IndexedDB immediately — don't race with the server's
152
+ // session_removed broadcast which also calls del('sessions', sid).
153
+ db.del('sessions', sid).catch(() => {});
154
+ ```
155
+
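The interleaving is easier to follow when replayed step by step. The sketch below swaps IndexedDB for a synchronous `Map` (an assumption made purely so the ordering is deterministic) and replays the three racing operations, then the fixed flow.

```javascript
// Hypothetical synchronous stand-in for the IndexedDB 'sessions' store.
const store = new Map([['s1', { id: 's1', status: 'idle' }]]);

// Old handler, split into its steps so the interleaving is explicit.
const record = store.get('s1');                   // 1. close button: get()
store.delete('s1');                               // 2. session_removed: del()
store.set('s1', { ...record, status: 'ended' });  // 3. close button: put()
const ghostAfterOldFlow = store.has('s1');

// New handler: both sides delete, so their order no longer matters.
store.set('s1', { id: 's1', status: 'idle' });    // reset for the second run
store.delete('s1');                               // close button: del()
store.delete('s1');                               // session_removed: del()
const ghostAfterNewFlow = store.has('s1');

console.log(ghostAfterOldFlow, ghostAfterNewFlow);
```

Deleting is idempotent, which is why the second flow is safe under either ordering.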
156
+ ---
157
+
158
+ ## Fix 6: Snapshot Deduplication (Map Key/SessionId Divergence)
159
+
160
+ **File:** `public/js/app.js` — `onSnapshotCb`
161
+
162
+ **Problem:** After `reKeyResumedSession()`, the server's `sessions` Map may briefly contain entries where the Map key differs from `session.sessionId` (e.g., old key still lingering). The snapshot sends both, and the browser creates two cards for the same logical session.
163
+
164
+ **Fix:** Deduplicate by `sessionId` value, keeping only the most recent entry:
165
+
166
+ ```javascript
167
+ const deduped = new Map();
168
+ for (const [id, session] of Object.entries(sessions)) {
169
+   const sid = session.sessionId || id;
170
+   const existing = deduped.get(sid);
171
+   if (!existing || (session.lastActivityAt || 0) > (existing.lastActivityAt || 0)) {
172
+     deduped.set(sid, session);
173
+   }
174
+ }
175
+ ```
176
+
177
+ Also handle `replacesId` in `onSessionUpdateCb` to clean up old cards and migrate IndexedDB child records:
178
+
179
+ ```javascript
180
+ if (session.replacesId) {
181
+   delete allSessions[session.replacesId];
182
+   removeCard(session.replacesId);
183
+   migrateSessionId(session.replacesId, session.sessionId);
184
+   del('sessions', session.replacesId);
185
+ }
186
+ ```
187
+
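As a worked example of the deduplication, the sketch below wraps the loop in a hypothetical `dedupeSnapshot` function and feeds it a snapshot in which a stale terminal key and the re-keyed entry both carry the same `sessionId`. The sample data is invented for illustration.

```javascript
// Hypothetical wrapper around the dedup loop shown above.
function dedupeSnapshot(sessions) {
  const deduped = new Map();
  for (const [id, session] of Object.entries(sessions)) {
    const sid = session.sessionId || id;
    const existing = deduped.get(sid);
    if (!existing || (session.lastActivityAt || 0) > (existing.lastActivityAt || 0)) {
      deduped.set(sid, session);
    }
  }
  return deduped;
}

const snapshot = {
  'term-abc': { sessionId: 'a1b2', lastActivityAt: 100 }, // stale Map key
  'a1b2': { sessionId: 'a1b2', lastActivityAt: 250 },     // re-keyed entry
  'c3d4': { sessionId: 'c3d4', lastActivityAt: 50 },
};
const cards = dedupeSnapshot(snapshot);
console.log([...cards.keys()]);
```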
188
+ ---
189
+
190
+ ## Fix 7: Persist `pendingResume` Across Server Restart
191
+
192
+ **File:** `server/sessionStore.js` — `saveSnapshot()` / `loadSnapshot()`
193
+
194
+ **Problem:** `pendingResume` is an in-memory Map. When the server is stopped (Ctrl+C) and restarted, all pending resume data is lost. If a session was in `connecting` status when the server stopped, the next hook from Claude has no `pendingResume` entry to match against → creates a new display-only card.
195
+
196
+ **Fix:** Include `pendingResume` in the snapshot that's persisted to SQLite:
197
+
198
+ ```javascript
199
+ // saveSnapshot()
200
+ pendingResume: Object.fromEntries(pendingResume)
201
+
202
+ // loadSnapshot()
203
+ if (snapshot.pendingResume) {
204
+   for (const [k, v] of Object.entries(snapshot.pendingResume)) {
205
+     pendingResume.set(k, v);
206
+   }
207
+ }
208
+ ```
209
+
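The `Object.fromEntries` step matters because `JSON.stringify` serializes a `Map` as an empty object. A minimal round-trip sketch, assuming the snapshot is stored as JSON text (the storage layer itself is not modeled here):

```javascript
const pendingResume = new Map([['term-new', { oldSessionId: 'a1b2', timestamp: 1 }]]);

// A Map has no enumerable own properties, so it serializes to "{}".
const naive = JSON.stringify(pendingResume);

// saveSnapshot(): flatten the Map into a plain object first.
const saved = JSON.stringify({ pendingResume: Object.fromEntries(pendingResume) });

// loadSnapshot(): rebuild the Map from the plain object.
const restored = new Map(Object.entries(JSON.parse(saved).pendingResume || {}));

console.log(naive, restored.get('term-new').oldSessionId);
```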
210
+ ---
211
+
212
+ ## Session Matching Priority System
213
+
214
+ For context, here's the 5-priority fallback system that these fixes protect:
215
+
216
+ | Priority | Strategy | Reliability |
217
+ |----------|----------|-------------|
218
+ | 0 | `pendingResume` + terminal ID / workDir | High — explicit resume action |
219
+ | 1 | `AGENT_MANAGER_TERMINAL_ID` env var | High — direct match (Fix 1 enables this for SSH) |
220
+ | 2 | `tryLinkByWorkDir` (pendingLinks) | Medium — ambiguous if multiple sessions in same dir |
221
+ | 3 | Path scan (connecting sessions) | Medium — ambiguous if multiple connecting |
222
+ | 4 | PID parent check | Low — unreliable across shells |
223
+
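One way to picture the fallback table is as an ordered list of matcher functions, tried until one returns a session. The sketch below is an assumption about shape only: `matchByPriority` and the two sample matchers are hypothetical and much simpler than the real `matchSession()`.

```javascript
// Hypothetical matcher chain: each entry returns a session or null.
function matchByPriority(hook, matchers) {
  for (const { name, match } of matchers) {
    const session = match(hook);
    if (session) return { session, matchedBy: name };
  }
  return { session: null, matchedBy: 'display-only' };
}

const sessions = new Map([['term-abc', { sessionId: 'term-abc', projectPath: '/proj' }]]);
const matchers = [
  { name: 'terminal-id', match: (hook) => sessions.get(hook.agent_terminal_id) || null },
  {
    name: 'path-scan',
    match: (hook) => {
      const hits = [...sessions.values()].filter((s) => s.projectPath === hook.cwd);
      return hits.length === 1 ? hits[0] : null; // skip ambiguous matches
    },
  },
];

const viaEnv = matchByPriority({ agent_terminal_id: 'term-abc', cwd: '/proj' }, matchers);
const viaPath = matchByPriority({ cwd: '/proj' }, matchers);
const unmatched = matchByPriority({ cwd: '/other' }, matchers);
console.log(viaEnv.matchedBy, viaPath.matchedBy, unmatched.matchedBy);
```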
224
+ ---
225
+
226
+ ## How the Fixes Work Together
227
+
228
+ ```
229
+ User clicks Resume
230
+
231
+
232
+ apiRouter.js resume endpoint
233
+ ├── createTerminal() → registers pendingLinks
234
+ ├── consumePendingLink() ← FIX 4: immediately remove
235
+ ├── reconnectSessionTerminal() → registers pendingResume
236
+ │ └── pendingResume persisted to snapshot ← FIX 7
237
+ └── writeWhenReady(AGENT_MANAGER_TERMINAL_ID + resumeCmd) ← FIX 1
238
+
239
+
240
+ Claude starts, sends SessionStart hook
241
+
242
+
243
+ sessionMatcher.js
244
+ ├── Direct ID match? → clean stale entries ← FIX 3
245
+ ├── Priority 0: pendingResume match → consume pendingLinks ← FIX 3
246
+ ├── Priority 1: AGENT_MANAGER_TERMINAL_ID ← FIX 1 (now works over SSH)
247
+ └── Priority 2: tryLinkByWorkDir → no stale link to steal ← FIX 4
248
+
249
+
250
+ sessionStore.js handleEvent (SessionStart)
251
+ ├── Update projectPath from hook cwd (SSH only) ← FIX 2
252
+ └── reKeyResumedSession if new session_id
253
+
254
+
255
+ Browser receives session_update
256
+ ├── replacesId → remove old card, migrate IndexedDB ← FIX 6
257
+ └── Snapshot deduplication on refresh ← FIX 6
258
+ ```
259
+
260
+ ---
261
+
262
+ ## Files Modified
263
+
264
+ | File | Fixes Applied |
265
+ |------|--------------|
266
+ | `server/sshManager.js` | Fix 1 (env var export), Fix 4 (`consumePendingLink` function) |
267
+ | `server/apiRouter.js` | Fix 1 (resume prefix), Fix 4 (consume after createTerminal) |
268
+ | `server/sessionStore.js` | Fix 2 (SSH projectPath), Fix 7 (pendingResume persistence) |
269
+ | `server/sessionMatcher.js` | Fix 3 (stale cleanup on direct match + Priority 0 match) |
270
+ | `public/js/app.js` | Fix 6 (snapshot dedup, replacesId handling) |
271
+ | `public/js/sessionCard.js` | Fix 5 (close button IndexedDB race) |
@@ -0,0 +1,412 @@
1
+ # Session Card ↔ SSH Terminal: Linkage Flow
2
+
3
+ This document describes how session cards in the dashboard are linked to SSH terminal sessions, including the matching logic, restart recovery, and resume flow.
4
+
5
+ ---
6
+
7
+ ## Phase 1: Terminal Creation
8
+
9
+ When the user clicks "New Terminal", two things happen simultaneously:
10
+
11
+ ### 1a. PTY Process Spawned (`sshManager.js`)
12
+
13
+ ```
14
+ createTerminal(config, wsClient)
15
+
16
+ ├── Generate terminalId: "term-{timestamp}-{random}"
17
+ ├── Spawn shell via node-pty (local shell or `ssh -t user@host`)
18
+ │ └── Inject env: AGENT_MANAGER_TERMINAL_ID = terminalId
19
+ ├── Start shell ready detector (detectShellReady)
20
+ │ └── Watches PTY output for prompt pattern ($ % # >)
21
+ │ └── 100ms settle timer to avoid false-matching MOTD
22
+ │ └── Fallback timeout: 5s (local) / 15s (remote)
23
+ ├── Register pending link: pendingLinks[workDir] = { terminalId, host }
24
+ ├── Stream PTY output → WebSocket client + ring buffer
25
+
26
+ └── Once shell ready detected:
27
+ └── Write launch command (e.g., "cd /myproject && claude")
28
+ ```
29
+
30
+ **Key point**: The launch command is NOT sent on a blind timer. `detectShellReady()` watches PTY output and waits until a shell prompt is visible before writing. This prevents commands from being lost if SSH hasn't finished connecting.
31
+
32
+ ### 1b. Session Card Created (`sessionStore.js`)
33
+
34
+ ```
35
+ createTerminalSession(terminalId, config)
36
+
37
+ ├── Session keyed by terminalId (not Claude session ID — doesn't exist yet)
38
+ ├── status = "connecting"
39
+ ├── source = "ssh"
40
+ ├── terminalId = terminalId
41
+ └── Card appears in dashboard immediately
42
+ ```
43
+
44
+ **State at this point:**
45
+ ```
46
+ sessions["term-abc"] = {
47
+   sessionId: "term-abc",
48
+   terminalId: "term-abc",
49
+   status: "connecting",
50
+   source: "ssh",
51
+   projectPath: "/myproject"
52
+ }
53
+ ```
54
+
55
+ ---
56
+
57
+ ## Phase 2: Hook Arrives — Session Matching
58
+
59
+ When Claude starts inside the terminal, it fires a `SessionStart` hook with its own `session_id` (a UUID like `a1b2c3d4-...`). The server must figure out which terminal card this hook belongs to.
60
+
61
+ `matchSession()` in `sessionMatcher.js` implements a **5-priority fallback system** (Priorities 0 to 4), with an intermediate Priority 0.5 step for sessions restored from a snapshot:
62
+
63
+ ### Priority 0: Pending Resume Match
64
+
65
+ **When**: A `pendingResume` entry exists (user clicked Resume before this hook arrived).
66
+
67
+ ```
68
+ SessionStart hook arrives with session_id + agent_terminal_id + cwd
69
+
70
+ ├── Check pendingResume.has(agent_terminal_id)
71
+ │ └── YES → reKeyResumedSession() → done
72
+
73
+ └── Path fallback: scan pendingResume for matching projectPath
74
+ ├── Exactly 1 match → reKeyResumedSession() → done
75
+ ├── 0 matches → fall through
76
+ └── 2+ matches → AMBIGUOUS, skip (log warning)
77
+ ```
78
+
79
+ After matching, calls `consumePendingLink(projectPath)` to prevent duplicate match at Priority 2.
80
+
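The "exactly one match" rule in the path fallback can be sketched as follows. `findUniqueByPath` and the shape of the entries are assumptions for illustration; the point is only that zero or multiple candidates both yield no match.

```javascript
// Hypothetical helper: accept a path match only when it is unambiguous.
function findUniqueByPath(pendingResume, sessions, cwd) {
  const matches = [];
  for (const [terminalId, { oldSessionId }] of pendingResume) {
    const session = sessions.get(oldSessionId);
    if (session && session.projectPath === cwd) matches.push({ terminalId, session });
  }
  if (matches.length > 1) console.warn(`ambiguous resume match for ${cwd}`);
  return matches.length === 1 ? matches[0] : null;
}

const sessions = new Map([
  ['A', { projectPath: '/proj' }],
  ['B', { projectPath: '/proj' }],
]);
const onePending = new Map([['term-1', { oldSessionId: 'A' }]]);
const twoPending = new Map([
  ['term-1', { oldSessionId: 'A' }],
  ['term-2', { oldSessionId: 'B' }],
]);

const unique = findUniqueByPath(onePending, sessions, '/proj');    // matches A
const ambiguous = findUniqueByPath(twoPending, sessions, '/proj'); // skipped
```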
81
+ ### Priority 0.5: Auto-link to Snapshot-Restored Sessions
82
+
83
+ **When**: After server restart, sessions loaded from snapshot are marked `ended` with a `ServerRestart` event.
84
+
85
+ ```
86
+ SessionStart hook arrives with cwd
87
+
88
+ └── Scan all sessions for:
89
+ - status = "ended"
90
+ - has ServerRestart event
91
+ - projectPath matches cwd
92
+ - ended less than 30 minutes ago
93
+
94
+ ├── Exactly 1 candidate → reKeyResumedSession() → done
95
+ ├── 0 candidates → fall through
96
+ └── 2+ candidates → AMBIGUOUS, skip
97
+ ```
98
+
99
+ Also matches zombie SSH sessions (non-ended, source=ssh, no terminalId, stale >60s).
100
+
101
+ ### Priority 1: `AGENT_MANAGER_TERMINAL_ID` (Primary Happy Path)
102
+
103
+ **When**: The hook's enriched data includes `agent_terminal_id` from the env var injected at terminal creation.
104
+
105
+ ```
106
+ hookData.agent_terminal_id = "term-abc"
107
+
108
+ └── sessions.get("term-abc") exists?
109
+ └── YES → Re-key: delete "term-abc", set sessionId = hook's session_id
110
+ ```
111
+
112
+ **This is the most reliable matcher** — direct env var injection, no heuristics.
113
+
114
+ ### Priority 2: Work Directory Link (`tryLinkByWorkDir`)
115
+
116
+ **When**: `pendingLinks` map has an entry for the hook's `cwd`.
117
+
118
+ ```
119
+ pendingLinks["/myproject"] = { terminalId: "term-abc" }
120
+
121
+ └── Hook cwd = "/myproject" → match!
122
+ └── Re-key session from "term-abc" to hook's session_id
123
+ ```
124
+
125
+ **Risk**: Two terminals in the same directory will collide.
126
+
127
+ ### Priority 3: Path Scan (Connecting Sessions)
128
+
129
+ **When**: Scans all `connecting` sessions for a matching `projectPath`.
130
+
131
+ ```
132
+ Scan sessions where:
133
+ - status = "connecting"
134
+ - has terminalId
135
+ - projectPath matches cwd
136
+
137
+ ├── Exactly 1 → re-key
138
+ └── 0 or 2+ → fall through
139
+ ```
140
+
141
+ ### Priority 4: PID Parent Check
142
+
143
+ **When**: Checks if Claude's PID is a child of any known PTY process.
144
+
145
+ ```
146
+ hookData.claude_pid → ps -o ppid= → compare with terminal PIDs
147
+ ```
148
+
149
+ **Least reliable** — breaks across shell boundaries (zsh → bash → claude).
150
+
151
+ ### Fallback: Display-Only Card
152
+
153
+ If nothing matches, a new card is created with the detected source (vscode, iterm, warp, etc.). No terminal is attached.
154
+
155
+ ---
156
+
157
+ ## Phase 3: Re-keying
158
+
159
+ When a match is found, the session Map key changes:
160
+
161
+ ```
162
+ Before: sessions["term-abc"] = { sessionId: "term-abc", terminalId: "term-abc", status: "connecting" }
163
+ After: sessions["a1b2c3d4"] = { sessionId: "a1b2c3d4", terminalId: "term-abc", status: "idle" }
164
+ ```
165
+
166
+ The `terminalId` field stays the same (it's the PTY reference). Only the Map key and `sessionId` change.
167
+
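A minimal sketch of the re-key step, assuming a plain `Map` of sessions. The function name `reKey` is hypothetical; the real code also handles details (such as `replacesId`) that are left out here.

```javascript
// Hypothetical re-key: move the session to its new Map key in one step.
function reKey(sessions, oldKey, newSessionId) {
  const session = sessions.get(oldKey);
  if (!session) return null;
  sessions.delete(oldKey);           // drop the old key so it cannot linger
  session.sessionId = newSessionId;  // keep the key and the field in sync
  session.status = 'idle';
  sessions.set(newSessionId, session);
  return session;
}

const sessions = new Map([
  ['term-abc', { sessionId: 'term-abc', terminalId: 'term-abc', status: 'connecting' }],
]);
const rekeyed = reKey(sessions, 'term-abc', 'a1b2c3d4');
console.log([...sessions.keys()], rekeyed.terminalId);
```

Deleting the old key and updating `sessionId` together keeps the Map key and the field from diverging.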
168
+ ---
169
+
170
+ ## Phase 4: Server Shutdown (Ctrl+C)
171
+
172
+ ```
173
+ Ctrl+C (SIGINT)
174
+
175
+ ├── gracefulShutdown()
176
+ │ ├── stopPeriodicSave()
177
+ │ ├── stopMqReader()
178
+ │ ├── saveSnapshot() ← saves sessions, pidToSession, AND pendingResume
179
+ │ ├── closeDb()
180
+ │ └── server.close() → process.exit(0)
181
+
182
+ └── Snapshot contains:
183
+ ├── sessions (with live status, cachedPid, terminalId)
184
+ ├── projectSessionCounters
185
+ ├── pidToSession
186
+ ├── pendingResume ← survives restart for Priority 0 disambiguation
187
+ └── eventSeq
188
+ ```
189
+
190
+ **Key point**: `saveSnapshot()` saves sessions AS-IS with their live status. It does NOT add `ServerRestart` events — that happens at load time.
191
+
192
+ ---
193
+
194
+ ## Phase 5: Server Restart — Snapshot Restoration
195
+
196
+ `loadSnapshot()` performs triage on each saved session:
197
+
198
+ ### SSH Sessions
199
+
200
+ | State in snapshot | PID alive? | Action |
201
+ |---|---|---|
202
+ | Active + cachedPid | Yes | **Kill** (SIGTERM) — PTY is dead, Claude is orphaned. Mark `ended` + `ServerRestart` |
203
+ | Active + cachedPid | No | Mark `ended` + `ServerRestart` |
204
+ | Active + no cachedPid | — | Mark `ended` + `ServerRestart` (zombie cleanup) |
205
+ | Connecting + no cachedPid | — | Mark `ended` + `ServerRestart` (zombie cleanup) |
206
+ | Already ended + historical | — | Restore for history display |
207
+
208
+ **Post-restoration cleanup** (all SSH sessions):
209
+ - Clear `terminalId` → `null` (all PTYs are dead after restart)
210
+ - Save old value as `lastTerminalId` (needed for Resume)
211
+
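The SSH triage table can be read as a small decision function. The sketch below is an approximation under stated assumptions: `triageSshSession` is a hypothetical name, the liveness check is injected as a callback, and a `connecting` session with a live PID (a case the table does not list) is treated like an active one.

```javascript
// Hypothetical triage for one restored SSH session.
// Returns the list of actions taken, for illustration only.
function triageSshSession(session, isPidAlive) {
  const actions = [];
  if (session.status !== 'ended') {
    if (session.cachedPid && isPidAlive(session.cachedPid)) {
      actions.push('kill'); // orphaned Claude process: its PTY died with the server
    }
    session.status = 'ended';
    actions.push('mark-ended'); // the real flow also records a ServerRestart event
  }
  // Post-restoration cleanup: every PTY is dead after a restart.
  if (session.terminalId) {
    session.lastTerminalId = session.terminalId;
    session.terminalId = null;
  }
  return actions;
}

const live = { status: 'idle', cachedPid: 123, terminalId: 'term-abc' };
const historical = { status: 'ended', terminalId: null };
const liveActions = triageSshSession(live, () => true);
const historicalActions = triageSshSession(historical, () => false);
console.log(liveActions, historicalActions);
```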
212
+ ### Non-SSH Sessions (VS Code, iTerm, etc.)
213
+
214
+ | State in snapshot | PID alive? | Action |
215
+ |---|---|---|
216
+ | Active + cachedPid | Yes | Restore as-is (external terminal survived) |
217
+ | Active + cachedPid | No | Mark `ended` + `ServerRestart` |
218
+ | Active + no cachedPid | — | Restore as-is, processMonitor will check later |
219
+ | Already ended | — | **Not restored** (silently dropped) |
220
+
221
+ ### pendingResume Restoration
222
+
223
+ ```
224
+ For each entry in snapshot.pendingResume:
225
+
226
+ ├── Referenced session still exists in sessions Map?
227
+ │ ├── YES → Restore with refreshed timestamp (reset 2-min cleanup window)
228
+ │ └── NO → Skip (session was cleaned up)
229
+
230
+ └── Terminal ID is stale (PTY dead), but Priority 0's path-based
231
+ fallback only needs oldSessionId + projectPath to match
232
+ ```
233
+
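The restoration rule and the 2-minute cleanup window can be sketched together. Function names, the `RESUME_TTL_MS` constant, and the explicit `now` parameter are assumptions chosen to keep the example deterministic.

```javascript
const RESUME_TTL_MS = 2 * 60 * 1000; // the 2-minute cleanup window

// Hypothetical restore: keep entries whose session still exists, refresh the timestamp.
function restorePendingResume(saved, sessions, now) {
  const restored = new Map();
  for (const [terminalId, entry] of Object.entries(saved || {})) {
    if (!sessions.has(entry.oldSessionId)) continue; // session was cleaned up
    restored.set(terminalId, { ...entry, timestamp: now });
  }
  return restored;
}

// Hypothetical sweep: drop entries older than the cleanup window.
function sweepExpired(pendingResume, now) {
  for (const [terminalId, entry] of pendingResume) {
    if (now - entry.timestamp > RESUME_TTL_MS) pendingResume.delete(terminalId);
  }
}

const sessions = new Map([['A', { projectPath: '/proj' }]]);
const saved = {
  'term-1': { oldSessionId: 'A', timestamp: 0 },
  'term-2': { oldSessionId: 'gone', timestamp: 0 },
};
const restored = restorePendingResume(saved, sessions, 1000000);
const sizeAfterRestore = restored.size;
sweepExpired(restored, 1000000 + RESUME_TTL_MS + 1);
console.log(sizeAfterRestore, restored.size);
```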
234
+ ---
235
+
236
+ ## Phase 6: Resume Flow
237
+
238
+ ### Case A: Terminal Still Alive
239
+
240
+ Rare after restart, but possible if the user resumes before the terminal dies.
241
+
242
+ ```
243
+ User clicks Resume
244
+
245
+ ├── resumeSession(sessionId)
246
+ │ ├── Archive previous session data into previousSessions[]
247
+ │ ├── pendingResume.set(lastTerminalId, { oldSessionId })
248
+ │ ├── session.terminalId = lastTerminalId
249
+ │ └── session.status = "connecting"
250
+
251
+ └── writeToTerminal(terminalId, "claude --resume <id> || claude --continue\r")
252
+ ```
253
+
254
+ ### Case B: Terminal Dead — Create New One (Normal Post-Restart Path)
255
+
256
+ ```
257
+ User clicks Resume
258
+
259
+ ├── POST /api/sessions/:id/resume
260
+
261
+ ├── Terminal exists? NO → create new terminal
262
+ │ ├── createTerminal({ command: '' }) ← skipAutoLaunch
263
+ │ │ ├── Spawn shell/SSH
264
+ │ │ ├── detectShellReady() → shellReady promise
265
+ │ │ └── Register pendingLinks[workDir]
266
+ │ │
267
+ │ ├── reconnectSessionTerminal(sessionId, newTerminalId)
268
+ │ │ ├── Archive previous session data
269
+ │ │ ├── pendingResume.set(newTerminalId, { oldSessionId })
270
+ │ │ ├── session.terminalId = newTerminalId
271
+ │ │ └── session.status = "connecting"
272
+ │ │
273
+ │ └── writeWhenReady(newTerminalId, "claude --resume <id> || ...\r")
274
+ │ └── Awaits shellReady, then writes command
275
+
276
+ └── When Claude starts → SessionStart hook fires
277
+
278
+ └── matchSession() — Priority 0 matches:
279
+ ├── pendingResume.has(agent_terminal_id) → re-key
280
+ └── or path fallback → re-key
281
+ ```
282
+
283
+ ### Resume After Restart (Full Sequence)
284
+
285
+ ```
286
+ ┌──────────────────┐
287
+ │ Server Running │
288
+ │ Session A: idle │
289
+ │ Session B: idle │
290
+ │ (both in /proj) │
291
+ └────────┬─────────┘
292
+
293
+ Ctrl+C (SIGINT)
294
+
295
+
296
+ ┌─────────────────────────┐
297
+ │ saveSnapshot() │
298
+ │ A: status=idle, pid=123 │
299
+ │ B: status=idle, pid=456 │
300
+ │ pendingResume: {} │
301
+ └────────────┬────────────┘
302
+
303
+ Server restarts
304
+
305
+
306
+ ┌─────────────────────────┐
307
+ │ loadSnapshot() │
308
+ │ A: pid 123 dead → ended │
309
+ │ + ServerRestart │
310
+ │ B: pid 456 dead → ended │
311
+ │ + ServerRestart │
312
+ └────────────┬────────────┘
313
+
314
+ User clicks Resume on A
315
+
316
+
317
+ ┌─────────────────────────┐
318
+ │ reconnectSessionTerminal│
319
+ │ pendingResume[newTerm] │
320
+ │ = { oldSessionId: A } │
321
+ │ A: status = connecting │
322
+ └────────────┬────────────┘
323
+
324
+ writeWhenReady → waits for shell prompt
325
+
326
+ Shell ready → "claude --resume A || claude --continue"
327
+
328
+ Claude starts → SessionStart hook
329
+
330
+
331
+ ┌─────────────────────────┐
332
+ │ matchSession() │
333
+ │ Priority 0: pendingResume│
334
+ │ has newTerm → match A! │
335
+ │ Re-key: A → new Claude │
336
+ │ session_id │
337
+ └─────────────────────────┘
338
+ ```
339
+
340
+ **Without pendingResume**: Priority 0.5 would see TWO ended sessions with `ServerRestart` in `/proj` → ambiguous → new card created instead of linking to A.
341
+
342
+ **With pendingResume**: Priority 0 fires first and unambiguously matches A because `pendingResume` explicitly records which session was being resumed.
343
+
344
+ ---
345
+
346
+ ## Shell Ready Detection (`detectShellReady`)
347
+
348
+ Instead of a blind `setTimeout(500ms)`, commands are sent only when the shell prompt is visible.
349
+
350
+ ```
351
+ PTY spawned
352
+
353
+ ├── onData listener: accumulates output in buffer
354
+ │ └── On each chunk: reset 100ms settle timer
355
+ │ └── After 100ms silence: check last line
356
+ │ └── Strip ANSI escapes
357
+ │ └── Match /[#$%>]\s*$/ on last non-empty line
358
+ │ └── If match → resolve(true) → command is sent
359
+
360
+ ├── onExit listener: resolve(false) if PTY dies before prompt
361
+
362
+ └── Fallback timeout: 5s (local) / 15s (remote)
363
+ └── resolve(false) → command sent anyway with warning log
364
+ ```
365
+
366
+ | Scenario | Before (blind delay) | After (prompt detection) |
367
+ |---|---|---|
368
+ | Local shell (fast) | 100ms blind wait | ~200ms (prompt + 100ms settle) |
369
+ | Remote SSH (fast network) | 500ms blind wait | ~600ms (SSH + prompt + settle) |
370
+ | Remote SSH (slow network) | **Command lost** | Waits up to 15s, then fallback |
371
+ | SSH password prompt | Sends command as password | 15s timeout, then fallback + warn |
372
+ | SSH connection failure | Command to dead PTY | onExit fires, command skipped |
373
+ | Resume (new terminal) | 600ms blind wait | Waits for prompt, then sends |
374
+
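The prompt check at the heart of the detector can be sketched as a pure function. The ANSI pattern below is a simplified assumption that covers CSI sequences only, and `looksLikePrompt` is a hypothetical name; only the `/[#$%>]\s*$/` test on the last non-empty line comes from the description above.

```javascript
// Simplified ANSI matcher (CSI sequences only); an assumption, not the package's regex.
const ANSI_CSI = /\x1b\[[0-9;?]*[ -\/]*[@-~]/g;

// Hypothetical check run after the settle timer fires.
function looksLikePrompt(buffer) {
  const lines = buffer.replace(ANSI_CSI, '').split(/\r?\n/);
  const last = lines.filter((line) => line.trim() !== '').pop() || '';
  return /[#$%>]\s*$/.test(last);
}

console.log(looksLikePrompt('Welcome to Ubuntu\r\nuser@host:~$ '));
console.log(looksLikePrompt('\x1b[32muser@host\x1b[0m:~$ '));
console.log(looksLikePrompt('Last login: Mon\r\n'));
console.log(looksLikePrompt("user@host's password: "));
```

A password prompt ends in `:` and so does not match, which is consistent with the table: the detector waits for the fallback timeout instead of sending the command immediately.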
375
+ ---
376
+
377
+ ## Data Structures
378
+
379
+ ### Server-Side Maps
380
+
381
+ | Map | Key | Value | Persisted in snapshot? |
382
+ |---|---|---|---|
383
+ | `sessions` | sessionId (or terminalId before re-key) | Session object | Yes |
384
+ | `pidToSession` | Claude PID (number) | sessionId | Yes |
385
+ | `pendingResume` | terminalId | `{ oldSessionId, timestamp }` | Yes |
386
+ | `pendingLinks` | workDir path | `{ terminalId, host, createdAt }` | No (recreated on terminal creation) |
387
+ | `terminals` | terminalId | `{ pty, sessionId, config, wsClient, shellReady, ... }` | No (PTYs die with server) |
388
+
389
+ ### Session Object Key Fields for Linkage
390
+
391
+ | Field | Description | Set by |
392
+ |---|---|---|
393
+ | `sessionId` | Map key. Initially = terminalId, re-keyed to Claude's session_id | createTerminalSession → matchSession |
394
+ | `terminalId` | Reference to live PTY. Null after disconnect/restart | createTerminalSession → cleared on end/restart |
395
+ | `lastTerminalId` | Previous terminalId, preserved for resume | SessionEnd handler / loadSnapshot |
396
+ | `cachedPid` | Claude's process ID | Hook enrichment |
397
+ | `source` | `"ssh"` / `"vscode"` / `"iterm"` / etc. | createTerminalSession / detectHookSource |
398
+ | `replacesId` | One-time flag: the old sessionId before re-key | reKeyResumedSession (consumed after broadcast) |
399
+ | `sshConfig` | SSH connection params for reconnect | createTerminalSession |
400
+ | `previousSessions` | Array of archived data from prior incarnations | resumeSession / reconnectSessionTerminal |
401
+
402
+ ---
403
+
404
+ ## Known Limitations
405
+
406
+ 1. **Two sessions, same directory, no pendingResume**: Priority 0.5 and Priority 3 skip ambiguous matches. A new card is created.
407
+
408
+ 2. **`pendingLinks` not persisted**: After restart, Priority 2 (`tryLinkByWorkDir`) can never match for pre-existing terminals. This is fine because all terminals are dead anyway — Priority 1 (`agent_terminal_id`) handles fresh terminals.
409
+
410
+ 3. **Shell prompt detection heuristic**: Unusual prompts that don't end with `$ % # >` won't be detected. The 5s/15s fallback timeout ensures commands are eventually sent.
411
+
412
+ 4. **`autoIdleManager` cleanup**: Restored `pendingResume` entries get 2 minutes before cleanup. If the user doesn't trigger a resume within that window, the entries are garbage collected.