switchroom 0.14.2 → 0.14.3
- package/dist/cli/switchroom.js +5 -3
- package/package.json +1 -1
- package/profiles/default/CLAUDE.md +122 -0
- package/telegram-plugin/dist/bridge/bridge.js +8 -2
- package/telegram-plugin/dist/gateway/gateway.js +64 -126
- package/telegram-plugin/dist/server.js +8 -2
- package/telegram-plugin/gateway/gateway.ts +135 -21
- package/telegram-plugin/gateway/inbound-delivery-machine-shadow.ts +33 -0
- package/telegram-plugin/hooks/tool-label-pretool.mjs +13 -4
- package/telegram-plugin/session-tail.ts +18 -0
- package/telegram-plugin/tests/inbound-delivery-cutover-gate.test.ts +93 -0
- package/telegram-plugin/tests/tool-activity-summary.test.ts +19 -0
- package/telegram-plugin/tool-activity-summary.ts +18 -0
- package/telegram-plugin/tool-label-sidecar.ts +13 -5
- package/telegram-plugin/uat/scenarios/fuzz-status-ask-dm.test.ts +39 -13
package/dist/cli/switchroom.js
CHANGED
@@ -49278,8 +49278,8 @@ var {
 } = import__.default;
 
 // src/build-info.ts
-var VERSION = "0.14.
-var COMMIT_SHA = "
+var VERSION = "0.14.3";
+var COMMIT_SHA = "b61cef7e";
 
 // src/cli/agent.ts
 init_source();
@@ -51763,7 +51763,9 @@ function buildSettingsHooksBlock(p) {
 ` + `So:
 ` + " - Trivial / social message \u2192 reply once, briefly, in your voice. " + `The reply IS the response.
 ` + ` - Question with a short answer \u2192 just reply with the answer.
-` + " - Complex tool-driven work \u2192 go straight to the tools (the " + "compose-area preview is the ambient liveness signal), then reply " + 'once with the answer or a genuine mid-work pivot ("halfway ' + 'through \u2014 found an unexpected issue, want me to continue?"). Not ' +
+` + " - Complex tool-driven work \u2192 go straight to the tools (the " + "compose-area preview is the ambient liveness signal), then reply " + 'once with the answer or a genuine mid-work pivot ("halfway ' + 'through \u2014 found an unexpected issue, want me to continue?"). Not ' + `"still working".
+
+` + 'Do NOT send a trailing confirmation after your answer \u2014 no "Done.", ' + '"Sent.", "Hope that helps." as a separate message once you have ' + "already replied. Your answer is the last thing the user should " + `see; a follow-up "Done." is dead-air clutter (and the user's ` + "device already pinged on the answer). Stop after the answer.</turn-pacing>";
   const switchroomUserPromptSubmit = [
     ...useHotReloadStable ? [
       {
package/package.json
CHANGED
package/profiles/default/CLAUDE.md
@@ -0,0 +1,122 @@
+# Agent:
+
+## What you are
+
+You are a **switchroom agent** — an instance of **Claude Code** (Anthropic's official `claude` CLI, unmodified) running in a Linux container, managed by switchroom. Your `$SWITCHROOM_AGENT_NAME` is ``. Be honest about this when asked ("what are you" / "what's running here"): switchroom agent `` running Claude Code under the official `claude` CLI. Not a custom model, not a wrapper, not "an AI assistant" in the abstract.
+
+You are one of several agents here. To see the others, call `peers_list` on the `agent-config` MCP server — returns `[{name, purpose, admin}]` live from `switchroom.yaml`. **Never memorize peers into Hindsight or hard-code them into replies** — drift kills trust. On "who else is here" / "is there an agent that does X" / "who handles Y" / "who can do <admin op>", call `peers_list` first and answer from its result; if no peer matches, say so.
+
+## Who you are
+
+See `SOUL.md` (in this directory) for your identity, vibe, communication style, and expertise. That file is your persona source of truth.
+
+
+## Core Behavior
+- Respond helpfully, concisely, and conversationally.
+- Use your available tools when they add clear value — don't force tool use when a plain answer suffices.
+- Save important facts, preferences, and decisions to memory so you can recall them later.
+- When asked to do something ambiguous, ask one clarifying question rather than guessing.
+- If a task has multiple steps, outline your plan before executing.
+
+## Safety
+- Don't exfiltrate private data. Ever.
+- Don't run destructive commands without asking.
+- Prefer `trash` over `rm` when available (recoverable beats gone forever).
+- Safe to do freely: read files, explore, organize, search the web, check calendars, work within this workspace.
+- Ask first: sending emails, tweets, public posts, anything that leaves the machine, anything you're uncertain about.
+
+## Execution Bias
+
+How you should decide what to do next. These are procedural rules, not vibe.
+
+- **Act in-turn.** If the request is actionable, do it this turn. Don't finish with a plan or promise when tools can move it forward.
+- **Verify mutable facts before claiming them.** Files, git state, clocks, versions, services, processes, package state, the contents of an `Edit` target: read live. Memory and prior context are not verification sources. "I think the function is at line 200" is not an answer; `Grep`/`Read` is.
+- **Final answer needs evidence.** Test/build/lint output, screenshot, inspection, tool output, or a named blocker. "It should work" is not a finalization.
+- **Weak or empty tool result is not a conclusion.** Vary the query, path, command, or source before deciding the thing isn't there.
+- **Non-final turn:** use tools to advance, or ask the one clarifying question that unblocks safe progress. One question, not five.
+
+
+## Memory — Hindsight is your single backend
+
+**Claude Code's built-in file-based auto-memory is disabled for this agent.** Don't try to write `.md` files under `.claude/projects/.../memory/` or maintain a `MEMORY.md` index — that whole system is off. There's exactly one memory backend: **Hindsight**.
+
+Hindsight is a memory bank with semantic search, knowledge graph, entity resolution, mental models, and directives. You talk to it through MCP tools (all pre-approved):
+
+### Day-to-day tools
+- `mcp__hindsight__recall` — semantic-search the bank for relevant past memories. Auto-fires on every inbound user message via the plugin's UserPromptSubmit hook (you'll see "Relevant memories from past conversations" in your context). Call manually when you need a more specific query than the auto-fired one.
+- `mcp__hindsight__retain` — store a new memory. The plugin automatically retains the conversation transcript every ~10 turns via the Stop hook, so you usually don't need this. Call manually for significant decisions, corrections, or facts you want immediately searchable.
+- `mcp__hindsight__reflect` — Hindsight's LLM-powered "answer this query using the bank's content + directives". Use when the user asks a question that requires synthesis across multiple past memories.
+
+### Mental Models (replaces hand-curated user profile)
+A mental model is a pre-computed semantic summary backed by reflection over the bank. It's the proper way to maintain things like "what do we know about this user" — semantically populated, automatically refreshed.
+
+- `mcp__hindsight__create_mental_model(name, source_query)` — create one. When the user shares a fact about themselves (preferences, background, goals), don't write a file — instead, retain the fact and (if no User Profile mental model exists yet) create one with `source_query: "what do we know about this user?"`. Hindsight will populate it from the retained memories.
+
+### Directives (replaces feedback rules)
+Hard rules the agent must follow during reflect — guardrails that are always applied.
+
+- `mcp__hindsight__create_directive(text)` — e.g., `create_directive("Always prefer TypeScript over JavaScript for this user's projects")`. When the user gives you a correction or "always do X" rule, create a directive instead of writing a feedback `.md` file.
+
+(Inspection tools like `list_memories`, `list_mental_models`, `update_mental_model`, `refresh_mental_model`, `list_directives`, `delete_directive` are available under the `mcp__hindsight__*` namespace if you ever need them, but you rarely should — Hindsight's own auto-recall surfaces what matters and the operator handles bank curation out-of-band.)
+
+### What to retain — and what NOT to retain
+
+Retain proactively when:
+- The user shares a preference or fact about themselves
+- The user gives you a correction or rule (these go to directives, not retain)
+- A significant decision was made and the rationale matters for next time
+- You did real work and the result + the path you took would be useful next session
+
+Don't retain:
+- Routine pleasantries, "thanks", "got it"
+- Conversation chatter that doesn't carry forward
+- Sensitive content the user explicitly asked you to not remember
+- Things already in a mental model — they'll be re-derived from underlying memories
+
+The plugin's auto-retain (Stop hook) handles transcript-level storage on a 10-turn cadence, so you don't need to manually retain everything. Use manual `retain` for high-signal observations you want immediately searchable.
+
+## Sub-Agent Delegation
+
+The main session is for conversation. Execution belongs in sub-agents. Before making tool calls, classify the request:
+
+**Stay in main (conversational):**
+- Quick lookups (1-2 tool calls max)
+- Memory/config reads and writes
+- Questions that need user input before acting
+- Simple status checks, coaching, motivation, emotional support
+
+**Delegate to a sub-agent (execution):**
+- Any code change — delegate to `@worker`
+- Research requiring web searches or 3+ file reads — delegate to `@researcher`
+- File creation, code generation, build/deploy, multi-step infra
+- Data analysis or report generation
+- Anything involving 3+ sequential tool calls without needing user input
+- Review of completed work — delegate to `@reviewer`
+
+**Golden rule:** when in doubt, delegate. Unnecessary delegation costs slightly more tokens. A blocked session costs the user's attention. Keep your own turns short — dispatch and acknowledge. The user should never wait more than 10 seconds for a response from you.
+
+**Anti-patterns:** starting a task inline then realizing it's complex mid-way; doing 5+ tool calls "because it's almost done"; polling sub-agent status in a loop.
+
+If no sub-agents are configured, do the work yourself.
+
+## Session Continuity
+
+By default, every restart starts a **fresh `claude` session** — the in-flight transcript is NOT carried over (`session_continuity.resume_mode: handoff`, the default since switchroom #362). Don't assume tool state, scratch variables, or unread tool output from before the restart are still available. What does survive:
+
+- **Handoff briefing** — on a clean shutdown, the Stop hook writes a bounded raw transcript tail of the prior session to `.handoff.md`. On boot, start.sh injects it into your `--append-system-prompt` so you can reorient — read it, and lean on your memory files for anything older. If `.handoff.md` is missing or stale (fresh agent, or pre-Stop-hook crash), `start.sh` runs `handoff-briefing.sh` to assemble `.handoff-briefing.md` from Telegram + Hindsight + today's daily memory, and injects whichever is fresher.
+- **Hindsight memory** — auto-recall fires on every inbound user message and surfaces relevant memories from past sessions. Long-term facts, decisions, and mental models live here, not in the transcript.
+- **Telegram history** — the gateway's SQLite buffer remembers every inbound/outbound message. Use `get_recent_messages` to recover recent chat context if the handoff briefing doesn't cover what you need.
+- **`SWITCHROOM_PENDING_TURN`** — if your previous session was killed mid-turn (watchdog, SIGTERM, timeout), start.sh exports this env var plus the chat/thread/last-user-message context. Acknowledge the interruption and ask for direction rather than silently resuming.
+- **`.wake-audit-pending`** sentinel — every boot drops this file under `TELEGRAM_STATE_DIR`. On your first turn, run the three-signal check (owed reply / orphan sub-agents / open todos) per the wake-audit protocol in your CLAUDE.md, then `rm -f` the sentinel.
+
+A config-summary greeting card is sent automatically by the SessionStart hook — you don't need to announce yourself. If your context feels thin (after compaction or any fresh session), proactively recall from Hindsight before proceeding.
+
+(Operators can override the resume policy per-agent via `session_continuity.resume_mode` in switchroom.yaml — `auto`, `continue`, `handoff`, or `none`. The default is `handoff`.)
+
+## Admin operations
+
+You're NOT `admin: true`. If asked to restart agents / read peer logs / exec into peer containers / run fleet updates, call `peers_list`, find an entry with `admin: true`, and point the user there: _"I can't restart agents from here — ask `<admin-name>`, they're admin on this instance."_ No long apology; just hand off.
+
+## Tools
+Use your available tools when appropriate. If you lack the right tool for a task, say so clearly rather than attempting a workaround.
+
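The Session Continuity section above lists four values for `session_continuity.resume_mode` (`auto`, `continue`, `handoff`, `none`) and names `handoff` as the default. A minimal TypeScript sketch of validating such a setting follows. The helper name `resolveResumeMode` and the choice to fall back to the default on an unrecognized value are assumptions made for illustration; the diff does not show how switchroom itself parses this key.

```typescript
// Hypothetical helper, not taken from switchroom's source.
const RESUME_MODES = ["auto", "continue", "handoff", "none"] as const;
type ResumeMode = (typeof RESUME_MODES)[number];

// Returns the configured mode, or "handoff" (the documented default)
// when the value is missing or is not one of the four documented modes.
function resolveResumeMode(raw: unknown): ResumeMode {
  if (typeof raw === "string") {
    const match = RESUME_MODES.find((m) => m === raw);
    if (match !== undefined) return match;
  }
  return "handoff";
}
```

A stricter loader might reject unknown values instead of silently defaulting; which behaviour switchroom uses is not visible here.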
package/telegram-plugin/dist/bridge/bridge.js
CHANGED
@@ -23063,14 +23063,14 @@ function createToolLabelSidecar(opts) {
     } catch {
       continue;
     }
-    if (!row || typeof row.tool_use_id !== "string" || typeof row.label !== "string")
+    if (!row || typeof row.tool_use_id !== "string" || typeof row.label !== "string" || typeof row.tool_name !== "string")
       continue;
     if (labels.has(row.tool_use_id))
       continue;
     labels.set(row.tool_use_id, row.label);
     for (const cb of subscribers) {
       try {
-        cb(row.tool_use_id, row.label);
+        cb(row.tool_use_id, row.label, row.tool_name);
       } catch {}
     }
   }
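The hunk above tightens the sidecar row check so that `tool_name` must also be a string, and passes it to subscribers as a third argument. The sketch below restates that logic in typed form. `isLabelRow` and `ingestLines` are hypothetical names, and reading rows with `JSON.parse` is an assumption, since only the surrounding `catch` is visible in the diff.

```typescript
// Row shape after this change; field names come from the hunk above.
interface LabelRow {
  tool_use_id: string;
  label: string;
  tool_name: string;
}

type LabelCallback = (toolUseId: string, label: string, toolName: string) => void;

// Type guard mirroring the tightened check: all three fields must be strings.
function isLabelRow(row: unknown): row is LabelRow {
  if (typeof row !== "object" || row === null) return false;
  const r = row as Record<string, unknown>;
  return (
    typeof r["tool_use_id"] === "string" &&
    typeof r["label"] === "string" &&
    typeof r["tool_name"] === "string"
  );
}

// Hypothetical driver: skips malformed or duplicate rows and notifies
// subscribers, swallowing subscriber errors as the bundled code does.
function ingestLines(
  lines: string[],
  labels: Map<string, string>,
  subscribers: LabelCallback[],
): void {
  for (const line of lines) {
    let row: unknown;
    try {
      row = JSON.parse(line); // assumption: one JSON object per line
    } catch {
      continue; // unparseable line
    }
    if (!isLabelRow(row)) continue;
    if (labels.has(row.tool_use_id)) continue; // first label per id wins
    labels.set(row.tool_use_id, row.label);
    for (const cb of subscribers) {
      try {
        cb(row.tool_use_id, row.label, row.tool_name);
      } catch {
        // a failing subscriber must not block the others
      }
    }
  }
}
```

One consequence worth noting: rows written by an older hook that lacks `tool_name` are now skipped entirely rather than delivered with two arguments.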
@@ -23441,6 +23441,9 @@ function startSessionTail(config2) {
     try {
       const s = createToolLabelSidecar({ stateDir: stateDirForSidecar, sessionId });
       sidecars.set(sessionId, s);
+      s.onLabel((toolUseId, label, toolName) => {
+        rawOnEvent({ kind: "tool_label", toolUseId, label, toolName });
+      });
       return s;
     } catch (err) {
       log?.(`session-tail: sidecar create failed: ${err.message}`);
@@ -23554,6 +23557,9 @@ function startSessionTail(config2) {
       }
       log?.(`session-tail: attached to ${file} (cursor=${cursor})`);
     }
+    const attachSid = sessionIdForFile(file);
+    if (attachSid)
+      ensureSidecar(attachSid);
     try {
       watcher = watch(file, () => readNew());
     } catch (err) {
package/telegram-plugin/dist/gateway/gateway.js
CHANGED
@@ -31866,121 +31866,7 @@ function registerAndRender(state, toolName) {
     return null;
   return formatSummary(state);
 }
-function baseName(p) {
-  if (typeof p !== "string" || p.length === 0)
-    return null;
-  const parts = p.split("/").filter(Boolean);
-  return parts.length > 0 ? parts[parts.length - 1] : p;
-}
-function hostName(u) {
-  if (typeof u !== "string" || u.length === 0)
-    return null;
-  try {
-    return new URL(u).hostname.replace(/^www\./, "");
-  } catch {
-    return u.replace(/^https?:\/\//, "").split("/")[0] || null;
-  }
-}
-function clip(s, n) {
-  if (typeof s !== "string")
-    return null;
-  const t = s.trim();
-  if (t.length === 0)
-    return null;
-  return t.length > n ? t.slice(0, n - 1) + "\u2026" : t;
-}
-function describeToolUse(toolName, input) {
-  if (!toolName)
-    return null;
-  const inp = input ?? {};
-  const mcpMatch = /^mcp__(.+?)__(.+)$/.exec(toolName);
-  if (mcpMatch) {
-    const server = mcpMatch[1].toLowerCase();
-    const tool = mcpMatch[2].toLowerCase();
-    if (server === "switchroom-telegram")
-      return null;
-    if (server === "hindsight") {
-      if (tool === "recall" || tool === "reflect")
-        return "Searching memory";
-      if (tool === "retain" || tool === "update_memory" || tool === "sync_retain")
-        return "Saving to memory";
-      return "Working with memory";
-    }
-    if (server === "google-workspace" || server === "claude_ai_google_calendar") {
-      return "Checking your calendar";
-    }
-    if (server === "claude_ai_gmail")
-      return "Checking your email";
-    if (server === "claude_ai_google_drive")
-      return "Looking through your files";
-    if (server === "notion" || server === "claude_ai_notion") {
-      return "Checking your notes";
-    }
-    const desc = clip(inp.description, 60) ?? clip(inp.query, 50) ?? clip(inp.title, 50);
-    if (desc)
-      return desc;
-    return "Using " + tool.replace(/[-_]+/g, " ");
-  }
-  switch (toolName) {
-    case "Bash": {
-      return clip(inp.description, 70) ?? "Running a command";
-    }
-    case "BashOutput":
-    case "KillShell":
-      return "Managing a background command";
-    case "Read": {
-      const f = baseName(inp.file_path);
-      return f ? `Reading ${f}` : "Reading a file";
-    }
-    case "Edit":
-    case "MultiEdit":
-    case "NotebookEdit": {
-      const f = baseName(inp.file_path) ?? baseName(inp.notebook_path);
-      return f ? `Editing ${f}` : "Editing a file";
-    }
-    case "Write": {
-      const f = baseName(inp.file_path);
-      return f ? `Writing ${f}` : "Writing a file";
-    }
-    case "Grep":
-    case "Glob": {
-      const p = clip(inp.pattern, 40);
-      return p ? `Searching for ${p}` : "Searching files";
-    }
-    case "WebFetch": {
-      const h = hostName(inp.url);
-      return h ? `Reading ${h}` : "Reading a web page";
-    }
-    case "WebSearch": {
-      const q = clip(inp.query, 50);
-      return q ? `Searching the web for ${q}` : "Searching the web";
-    }
-    case "Task":
-    case "Agent": {
-      const d = clip(inp.description, 60);
-      return d ? `Delegating: ${d}` : "Delegating to a sub-agent";
-    }
-    case "TodoWrite":
-    case "TaskCreate":
-    case "TaskUpdate":
-    case "TaskList":
-      return "Updating the plan";
-    case "ToolSearch":
-      return "Finding the right tool";
-    default:
-      return "Working\u2026";
-  }
-}
 var MIRROR_MAX_LINES = 6;
-function appendActivityLine(lines, toolName, input) {
-  const line = describeToolUse(toolName, input);
-  if (line == null)
-    return null;
-  if (lines.length === 0 || lines[lines.length - 1] !== line) {
-    lines.push(line);
-  }
-  return renderActivityFeed(lines);
-}
 function renderActivityFeed(lines) {
   if (lines.length === 0)
     return null;
@@ -31991,6 +31877,15 @@ function renderActivityFeed(lines) {
   return hidden > 0 ? `\u00b7 +${hidden} earlier\u2026
${body}` : body;
 }
+function appendActivityLabel(lines, label) {
+  const l = (label ?? "").trim();
+  if (l.length === 0)
+    return null;
+  if (lines.length === 0 || lines[lines.length - 1] !== l) {
+    lines.push(l);
+  }
+  return renderActivityFeed(lines);
+}
 
 // tool-labels.ts
 var MAX_LABEL_CHARS = 60;
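`appendActivityLabel`, added above, trims the label, ignores empty input, and skips a label identical to the last line before re-rendering the feed. The following typed sketch shows that behaviour in isolation. `renderActivityFeed` here is a stand-in: most of its body is not visible in this diff, so joining the most recent lines with newlines is an assumption. Only the empty check, `MIRROR_MAX_LINES = 6`, and the `+N earlier` prefix come from the bundle.

```typescript
const MIRROR_MAX_LINES = 6; // value shown in the bundle

// Stand-in renderer (assumption): the real line formatting is not visible
// in this diff, so this version simply joins the most recent lines.
function renderActivityFeed(lines: string[]): string | null {
  if (lines.length === 0) return null;
  const hidden = Math.max(0, lines.length - MIRROR_MAX_LINES);
  const body = lines.slice(-MIRROR_MAX_LINES).join("\n");
  return hidden > 0 ? `\u00b7 +${hidden} earlier\u2026\n${body}` : body;
}

// Typed transcription of the added function.
function appendActivityLabel(
  lines: string[],
  label: string | null | undefined,
): string | null {
  const l = (label ?? "").trim();
  if (l.length === 0) return null; // nothing to show
  if (lines.length === 0 || lines[lines.length - 1] !== l) {
    lines.push(l); // only consecutive repeats are collapsed
  }
  return renderActivityFeed(lines);
}
```

Because only the last line is compared, a sequence such as A, B, A still produces three lines; the dedupe targets repeated calls to the same tool in a row.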
@@ -46282,6 +46177,13 @@ function transition(state3, event) {
 // gateway/inbound-delivery-machine-shadow.ts
 var state3 = initialState();
 var enabled5 = process.env.SWITCHROOM_DELIVERY_MACHINE_SHADOW !== "0";
+var cutoverEnabled = enabled5 && process.env.SWITCHROOM_DELIVERY_MACHINE_CUTOVER !== "0";
+function isDeliveryCutoverEnabled() {
+  return cutoverEnabled;
+}
+function isMachineInTurn() {
+  return state3.global.kind === "bridge_alive_in_turn";
+}
 function shadowEmit(event) {
   if (!enabled5)
     return [];
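The two environment flags in the hunk above are both on by default: only the literal string `"0"` turns one off, and cutover is additionally forced off whenever shadow mode is off. A small sketch of that truth table, assuming the environment is passed in as a plain record (the function name `resolveDeliveryFlags` is hypothetical; the bundle reads `process.env` once at module load instead):

```typescript
type Env = Record<string, string | undefined>;

interface DeliveryFlags {
  shadow: boolean;
  cutover: boolean;
}

// Mirrors the two `!== "0"` comparisons from the bundle.
function resolveDeliveryFlags(env: Env): DeliveryFlags {
  const shadow = env["SWITCHROOM_DELIVERY_MACHINE_SHADOW"] !== "0";
  // Cutover depends on the shadow machine being fed events, so it
  // cannot be on while shadow mode is off.
  const cutover = shadow && env["SWITCHROOM_DELIVERY_MACHINE_CUTOVER"] !== "0";
  return { shadow, cutover };
}
```

Values such as `"false"` or an empty string leave a flag enabled under this comparison, and because the bundled variables are computed once at load, changing the environment afterwards would have no effect.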
@@ -50163,10 +50065,10 @@ function sweepStaleTurnActiveMarker(stateDir, opts) {
 }
 
 // ../src/build-info.ts
-var VERSION = "0.14.
-var COMMIT_SHA = "
-var COMMIT_DATE = "2026-05-
-var LATEST_PR =
+var VERSION = "0.14.3";
+var COMMIT_SHA = "b61cef7e";
+var COMMIT_DATE = "2026-05-28T09:56:51Z";
+var LATEST_PR = 1964;
 var COMMITS_AHEAD_OF_TAG = 0;
 
 // gateway/boot-version.ts
@@ -51091,6 +50993,9 @@ function markClaudeBusyForInbound(m) {
   }
   claudeBusyKeys.add(chatKey2(m.chatId, tid));
 }
+function turnInFlightForGate() {
+  return isDeliveryCutoverEnabled() ? isMachineInTurn() : claudeBusyKeys.size > 0;
+}
 var pendingRestarts = new Map;
 var lastSessionActiveFile = null;
 var compactState = initialCompactState();
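`turnInFlightForGate` picks one of two sources of truth: the state machine when cutover is enabled, otherwise the legacy `claudeBusyKeys` set. The sketch below isolates that selection with its dependencies injected so it can be exercised without the gateway. The `GateDeps` interface is an illustrative assumption; the bundled function reads module-level state directly.

```typescript
// Hypothetical dependency bundle; the real function uses module globals.
interface GateDeps {
  cutoverEnabled: boolean;
  machineInTurn: () => boolean;
  busyKeys: ReadonlySet<string>;
}

// Same ternary as the added function: with cutover on, the legacy set
// is ignored entirely, even if it still contains keys.
function turnInFlightForGate(deps: GateDeps): boolean {
  return deps.cutoverEnabled ? deps.machineInTurn() : deps.busyKeys.size > 0;
}
```

The later hunks route the buffer flush, restart scheduling, and idle-drain checks through this single function, so switching `SWITCHROOM_DELIVERY_MACHINE_CUTOVER` changes all of them together.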
@@ -51156,7 +51061,7 @@ function purgeReactionTracking(key, endingTurn) {
     if (agentDir != null)
       removeActiveReaction(agentDir, msgInfo.chatId, msgInfo.messageId);
   }
-  if (
+  if (!turnInFlightForGate()) {
     const selfAgentForFlush = process.env.SWITCHROOM_AGENT_NAME ?? "";
     if (pendingInboundBuffer.depth(selfAgentForFlush) > 0) {
       const fr = redeliverBufferedInbound(pendingInboundBuffer, selfAgentForFlush, (m) => {
@@ -51186,7 +51091,7 @@ function releaseTurnBufferGate(key) {
   activeTurnStartedAt.delete(key);
   claudeBusyKeys.delete(key);
   shadowEmit({ kind: "turnEnd", key, at: Date.now(), outboundEmitted: true });
-  if (
+  if (!turnInFlightForGate()) {
     const selfAgentForFlush = process.env.SWITCHROOM_AGENT_NAME ?? "";
     if (pendingInboundBuffer.depth(selfAgentForFlush) > 0) {
       const fr = redeliverBufferedInbound(pendingInboundBuffer, selfAgentForFlush, (m) => {
@@ -52134,6 +52039,11 @@ startTimer({
`);
   }
 });
+var DELIVERY_MACHINE_TICK_MS = 30000;
+var _deliveryMachineTick = setInterval(() => {
+  shadowEmit({ kind: "tick", now: Date.now() });
+}, DELIVERY_MACHINE_TICK_MS);
+_deliveryMachineTick.unref?.();
 startTimer2({
   editMessage: async (ctx) => {
     const editOpts = ctx.parseMode != null ? { parse_mode: ctx.parseMode } : undefined;
@@ -52435,7 +52345,7 @@ ${reminder}
   onHeartbeat(_client, _msg) {},
   onScheduleRestart(client3, msg) {
     const { agentName: agentName3 } = msg;
-    const turnInFlight =
+    const turnInFlight = turnInFlightForGate();
     if (!turnInFlight) {
       try {
         client3.send({
@@ -52690,7 +52600,7 @@ if (!STATIC) {
   setInterval(() => {
     const selfAgent = process.env.SWITCHROOM_AGENT_NAME ?? "";
     const r = idleDrainTick(pendingInboundBuffer, selfAgent, () => {
-      if (
+      if (turnInFlightForGate())
         return false;
       const c = ipcServer.getClient(selfAgent);
       return c != null && c.isAlive();
@@ -52971,6 +52881,7 @@ ${url}`;
     });
     noteOutbound(statusKey(chat_id, threadId), Date.now());
     noteOutbound2(statusKey(chat_id, threadId), Date.now());
+    shadowEmit({ kind: "modelOutbound", key: statusKey(chat_id, threadId), at: Date.now() });
     if (isFinalAnswerReply({ text: rawText, disableNotification })) {
       clearSilentEndState(statusKey(chat_id, threadId));
     }
@@ -53298,6 +53209,7 @@ async function executeStreamReply(args) {
   const sKey = statusKey(streamChatId, streamThreadId);
   noteOutbound(sKey, Date.now());
   noteOutbound2(sKey, Date.now());
+  shadowEmit({ kind: "modelOutbound", key: sKey, at: Date.now() });
   if (isFinalAnswerReply({
     text: args.text ?? "",
     disableNotification: args.disable_notification === true,
@@ -54248,6 +54160,11 @@ function handleSessionEvent(ev) {
         isDm: isDmChatId(ev.chatId)
       };
       currentTurn = next;
+      shadowEmit({
+        kind: "turnStart",
+        key: statusKey(ev.chatId, ev.threadId != null ? Number(ev.threadId) : undefined),
+        at: startedAt
+      });
       preambleSuppressor.reset();
       clearSilentEndState(statusKey(ev.chatId, ev.threadId != null ? Number(ev.threadId) : null));
       if (turnsDb != null) {
@@ -54309,12 +54226,12 @@ function handleSessionEvent(ev) {
           clearTimeout(turn.orphanedReplyTimeoutId);
           turn.orphanedReplyTimeoutId = null;
         }
-        if (wasFirstReply) {
+        if (wasFirstReply && !DRAFT_MIRROR_ENABLED) {
           clearActivitySummary(turn);
         }
       }
-      if (!turn.replyCalled && !isTelegramSurfaceTool(name)) {
-        const rendered =
+      if (!DRAFT_MIRROR_ENABLED && !turn.replyCalled && !isTelegramSurfaceTool(name)) {
+        const rendered = registerAndRender(turn.toolActivity, name);
         if (rendered != null) {
           turn.activityPendingRender = rendered;
           if (turn.activityInFlight == null) {
@@ -54332,6 +54249,23 @@ function handleSessionEvent(ev) {
       }
       return;
     }
+    case "tool_label": {
+      if (!DRAFT_MIRROR_ENABLED)
+        return;
+      const turn = currentTurn;
+      if (turn == null)
+        return;
+      if (isTelegramSurfaceTool(ev.toolName))
+        return;
+      const rendered = appendActivityLabel(turn.mirrorLines, ev.label);
+      if (rendered != null) {
+        turn.activityPendingRender = rendered;
+        if (turn.activityInFlight == null) {
+          turn.activityInFlight = drainActivitySummary(turn);
+        }
+      }
+      return;
+    }
     case "text": {
       const turn = currentTurn;
       if (turn != null) {
@@ -54461,6 +54395,9 @@ function handleSessionEvent(ev) {
         clearTimeout(turn.orphanedReplyTimeoutId);
         turn.orphanedReplyTimeoutId = null;
       }
+      if (DRAFT_MIRROR_ENABLED && turn != null) {
+        clearActivitySummary(turn);
+      }
       preambleSuppressor.flushNow();
       let streamFinalizedAsAnswer = false;
       if (turn?.answerStream != null) {
@@ -55001,6 +54938,7 @@ async function handleInbound(ctx, text, downloadImage, attachment) {
   }
   const inboundReceivedAt = Date.now();
   const _shadowKey = statusKey(ctx.chat?.id != null ? String(ctx.chat.id) : "0", ctx.message?.message_thread_id);
+  const machineInTurnAtReceipt = isDeliveryCutoverEnabled() ? isMachineInTurn() : null;
   shadowEmit({
     kind: "inbound",
     key: _shadowKey,
@@ -55011,7 +54949,7 @@ async function handleInbound(ctx, text, downloadImage, attachment) {
     },
     at: Date.now()
   });
-  const turnInFlightAtReceipt = claudeBusyKeys.size > 0;
+  const turnInFlightAtReceipt = machineInTurnAtReceipt ?? claudeBusyKeys.size > 0;
   const access = result.access;
   const from = ctx.from;
   const chat_id = String(ctx.chat.id);
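The replaced line `machineInTurnAtReceipt ?? claudeBusyKeys.size > 0` relies on operator precedence: `??` binds more loosely than `>`, so the expression reads as `machineInTurnAtReceipt ?? (claudeBusyKeys.size > 0)`. A short sketch of the resulting behaviour, with the function name and parameters chosen here for illustration:

```typescript
// `machineInTurn` is null when cutover is disabled, matching the
// `isDeliveryCutoverEnabled() ? isMachineInTurn() : null` line above.
function turnInFlightAtReceipt(
  machineInTurn: boolean | null,
  busyCount: number,
): boolean {
  // Parsed as: machineInTurn ?? (busyCount > 0)
  return machineInTurn ?? busyCount > 0;
}
```

The snapshot is taken before the `inbound` event is emitted to the machine, and a `false` from the machine is a real answer: `??` only falls through to the legacy count on `null` or `undefined`, not on `false`.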
package/telegram-plugin/dist/server.js
CHANGED
@@ -17091,14 +17091,14 @@ function createToolLabelSidecar(opts) {
     } catch {
       continue;
     }
-    if (!row || typeof row.tool_use_id !== "string" || typeof row.label !== "string")
+    if (!row || typeof row.tool_use_id !== "string" || typeof row.label !== "string" || typeof row.tool_name !== "string")
       continue;
     if (labels.has(row.tool_use_id))
       continue;
     labels.set(row.tool_use_id, row.label);
     for (const cb of subscribers) {
       try {
-        cb(row.tool_use_id, row.label);
+        cb(row.tool_use_id, row.label, row.tool_name);
       } catch {}
     }
   }
@@ -17479,6 +17479,9 @@ function startSessionTail(config2) {
     try {
       const s = createToolLabelSidecar({ stateDir: stateDirForSidecar, sessionId });
       sidecars.set(sessionId, s);
+      s.onLabel((toolUseId, label, toolName) => {
+        rawOnEvent({ kind: "tool_label", toolUseId, label, toolName });
+      });
       return s;
     } catch (err) {
       log?.(`session-tail: sidecar create failed: ${err.message}`);
@@ -17592,6 +17595,9 @@ function startSessionTail(config2) {
       }
       log?.(`session-tail: attached to ${file} (cursor=${cursor})`);
     }
+    const attachSid = sessionIdForFile(file);
+    if (attachSid)
+      ensureSidecar(attachSid);
     try {
       watcher = watch(file, () => readNew());
     } catch (err) {
@@ -59,6 +59,7 @@ import {
|
|
|
59
59
|
registerAndRender,
|
|
60
60
|
describeToolUse,
|
|
61
61
|
appendActivityLine,
|
|
62
|
+
appendActivityLabel,
|
|
62
63
|
type ActivityState,
|
|
63
64
|
} from '../tool-activity-summary.js'
|
|
64
65
|
import { toolLabel } from '../tool-labels.js'
|
|
@@ -286,7 +287,7 @@ import { chatKey, chatKeyWithSuffix, chatIdOfChatKey } from './chat-key.js'
|
|
|
286
287
|
// should do. Behavior unchanged in this PR — the imperative code below
|
|
287
288
|
// still runs everything. PR 3 will cut over to executing the machine's
|
|
288
289
|
// effects.
|
|
289
|
-
import { shadowEmit } from './inbound-delivery-machine-shadow.js'
|
|
290
|
+
import { shadowEmit, isMachineInTurn, isDeliveryCutoverEnabled } from './inbound-delivery-machine-shadow.js'
|
|
290
291
|
import type { ChatKey as _ChatKey } from './inbound-delivery-machine.js'
|
|
291
292
|
import { dispatchEffects, isDispatchEnabled } from './inbound-delivery-machine-dispatch.js'
|
|
292
293
|
import { maybeFireWarmup } from './prefix-warmup.js'
|
|
@@ -1161,6 +1162,24 @@ function markClaudeBusyForInbound(m: {
|
|
|
1161
1162
|
}
|
|
1162
1163
|
claudeBusyKeys.add(chatKey(m.chatId, tid))
|
|
1163
1164
|
}
|
|
1165
|
+
|
|
1166
|
+
/**
|
|
1167
|
+
* Authoritative "is a turn in flight?" for every gate that previously
|
|
1168
|
+
* read `claudeBusyKeys.size`. PR 3b cutover (extends PR 3a's bridgeUp
|
|
1169
|
+
* dispatch): when the delivery state machine is authoritative
|
|
1170
|
+
* (`SWITCHROOM_DELIVERY_MACHINE_CUTOVER` on + shadow on) the answer is
|
|
1171
|
+
* its single-`activeTurn` global state, which — unlike the
|
|
1172
|
+
* per-delivery `claudeBusyKeys` set — cannot accumulate orphan keys and
|
|
1173
|
+
* wedge the gate "in-flight forever" (the gymbro/clerk 5-min dangle,
|
|
1174
|
+
* 2026-05-28). Kill-switch off → exact legacy claudeBusyKeys behaviour.
|
|
1175
|
+
*
|
|
1176
|
+
* NOT for the inbound-receipt gate (line ~8551): that must snapshot the
|
|
1177
|
+
* machine state BEFORE the inbound event advances it, or a fresh-turn
|
|
1178
|
+
* message self-blocks. See the snapshot at the inbound handler.
|
|
1179
|
+
*/
|
|
1180
|
+
function turnInFlightForGate(): boolean {
|
|
1181
|
+
return isDeliveryCutoverEnabled() ? isMachineInTurn() : claudeBusyKeys.size > 0
|
|
1182
|
+
}
|
|
1164
1183
|
const pendingRestarts = new Map<string, number>() // agentName -> timestamp when restart was requested
|
|
1165
1184
|
|
|
1166
1185
|
// ─── Proactive context compaction (session.max_context_tokens) ──────────
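Note on the hunk above: the gate's branching can be illustrated in isolation. The sketch below is a minimal stand-in, not the gateway's code; `GateInputs`, `turnInFlight`, and the sample key are hypothetical names introduced only for illustration. It assumes the behaviour described in the doc comment: with the cutover on, the answer comes from the machine's single global state, otherwise from the size of the legacy key set.

```typescript
// Hypothetical stand-ins for the gateway's module state (illustrative names only).
type GlobalKind = "bridge_dead" | "bridge_alive_idle" | "bridge_alive_in_turn";

interface GateInputs {
  cutoverEnabled: boolean;      // kill-switch on and shadow mode on
  machineKind: GlobalKind;      // the delivery machine's single global state
  legacyBusyKeys: Set<string>;  // per-delivery key set used by the legacy path
}

// Same shape as the ternary in turnInFlightForGate, with inputs passed explicitly.
function turnInFlight(g: GateInputs): boolean {
  return g.cutoverEnabled
    ? g.machineKind === "bridge_alive_in_turn"
    : g.legacyBusyKeys.size > 0;
}

// A key left behind by a missed turn-end: the legacy read stays true,
// while the machine read follows the (idle) global state.
const orphaned = new Set<string>(["111:_"]);
const legacyRead = turnInFlight({
  cutoverEnabled: false,
  machineKind: "bridge_alive_idle",
  legacyBusyKeys: orphaned,
});
const cutoverRead = turnInFlight({
  cutoverEnabled: true,
  machineKind: "bridge_alive_idle",
  legacyBusyKeys: orphaned,
});
```

Under these assumptions `legacyRead` is `true` and `cutoverRead` is `false`, which is the wedge the comment describes.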
|
|
@@ -1490,7 +1509,11 @@ function purgeReactionTracking(key: string, endingTurn?: CurrentTurn): void {
|
|
|
1490
1509
|
// activeTurnStartedAt entry in the fresh-turn branch) doesn't pin this
|
|
1491
1510
|
// gate forever while claude is genuinely idle. See the claudeBusyKeys
|
|
1492
1511
|
// declaration for the supergroup deadlock this fixes.
|
|
1493
|
-
|
|
1512
|
+
// PR3b-cutover: `turnInFlightForGate()` reads the delivery machine
|
|
1513
|
+
// when the cutover kill-switch is on; the turnEnd event was emitted
|
|
1514
|
+
// just above (purgeReactionTracking head), so the machine is already
|
|
1515
|
+
// idle here.
|
|
1516
|
+
if (!turnInFlightForGate()) {
|
|
1494
1517
|
// #1556: the deterministic delivery point. claude has just gone
|
|
1495
1518
|
// idle — flush any inbound held mid-turn so the channel
|
|
1496
1519
|
// notification lands at the idle prompt and submits as a fresh
|
|
@@ -1590,7 +1613,9 @@ function releaseTurnBufferGate(key: string): void {
|
|
|
1590
1613
|
// test-harness's 13:02 UAT now opens after the reply.
|
|
1591
1614
|
//
|
|
1592
1615
|
// PR3b: gated on claudeBusyKeys (see purgeReactionTracking comment).
|
|
1593
|
-
|
|
1616
|
+
// PR3b-cutover: turnEnd was emitted just above (releaseTurnBufferGate
|
|
1617
|
+
// head), so the machine is already idle when the cutover gate reads.
|
|
1618
|
+
if (!turnInFlightForGate()) {
|
|
1594
1619
|
const selfAgentForFlush = process.env.SWITCHROOM_AGENT_NAME ?? ''
|
|
1595
1620
|
if (pendingInboundBuffer.depth(selfAgentForFlush) > 0) {
|
|
1596
1621
|
const fr = redeliverBufferedInbound(
|
|
@@ -3656,6 +3681,23 @@ silencePoke.startTimer({
|
|
|
3656
3681
|
},
|
|
3657
3682
|
})
|
|
3658
3683
|
|
|
3684
|
+
// PR3b-cutover: drive the delivery machine's TTL `tick`. The machine
|
|
3685
|
+
// expires any turn whose `turnStartedAt` is older than TURN_TTL_MS
|
|
3686
|
+
// (5 min) and drops global state back to idle — its structural
|
|
3687
|
+
// equivalent of the imperative silence-poke framework-fallback. This
|
|
3688
|
+
// is the load-bearing safety net for the cutover gate: even if a
|
|
3689
|
+
// `turnEnd` event is somehow missed (the dangle class), the machine
|
|
3690
|
+
// self-heals at TTL instead of pinning the gate "in-flight forever".
|
|
3691
|
+
// shadowEmit only advances state + logs the predicted effects; we
|
|
3692
|
+
// deliberately do NOT execute the machine's firePoke here (the
|
|
3693
|
+
// imperative silence-poke still owns the user-facing ping), so there
|
|
3694
|
+
// is no double-poke. unref so the interval never holds the process.
|
|
3695
|
+
const DELIVERY_MACHINE_TICK_MS = 30_000
|
|
3696
|
+
const _deliveryMachineTick = setInterval(() => {
|
|
3697
|
+
shadowEmit({ kind: 'tick', now: Date.now() })
|
|
3698
|
+
}, DELIVERY_MACHINE_TICK_MS)
|
|
3699
|
+
_deliveryMachineTick.unref?.()
|
|
3700
|
+
|
|
3659
3701
|
// #1445 cross-turn pending-async ambient. When a turn ends after the
|
|
3660
3702
|
// model dispatched background async work (Agent / Task / Bash run-in-
|
|
3661
3703
|
// background) and the model has stopped speaking, keep editing the
|
|
@@ -4195,7 +4237,8 @@ const ipcServer: IpcServer = createIpcServer({
|
|
|
4195
4237
|
// PR3b: gated on claudeBusyKeys (actually-handed-to-claude turns)
|
|
4196
4238
|
// not activeTurnStartedAt (receipt-eager), so a buffered topic-B
|
|
4197
4239
|
// inbound doesn't pin this as turnInFlight=true forever.
|
|
4198
|
-
|
|
4240
|
+
// PR3b-cutover: reads the delivery machine when the kill-switch is on.
|
|
4241
|
+
const turnInFlight = turnInFlightForGate();
|
|
4199
4242
|
|
|
4200
4243
|
if (!turnInFlight) {
|
|
4201
4244
|
// No active turn, restart immediately. Cycle both the agent and
|
|
@@ -4615,7 +4658,8 @@ if (!STATIC) {
|
|
|
4615
4658
|
// #1556: never drain mid-turn — that re-creates the composer
|
|
4616
4659
|
// wedge this buffer exists to prevent.
|
|
4617
4660
|
// PR3b: gated on claudeBusyKeys (see purgeReactionTracking).
|
|
4618
|
-
|
|
4661
|
+
// PR3b-cutover: reads the delivery machine when the kill-switch is on.
|
|
4662
|
+
if (turnInFlightForGate()) return false
|
|
4619
4663
|
const c = ipcServer.getClient(selfAgent)
|
|
4620
4664
|
return c != null && c.isAlive()
|
|
4621
4665
|
},
|
|
@@ -5020,6 +5064,11 @@ async function executeReply(args: Record<string, unknown>): Promise<{ content: A
|
|
|
5020
5064
|
// silence-poke clock so the next poke is measured from this send.
|
|
5021
5065
|
signalTracker.noteOutbound(statusKey(chat_id, threadId), Date.now())
|
|
5022
5066
|
silencePoke.noteOutbound(statusKey(chat_id, threadId), Date.now())
|
|
5067
|
+
// PR3b-cutover: feed lastOutboundAt to the delivery machine so its
|
|
5068
|
+
// TTL `tick` suppresses the fallback for a long-but-active turn
|
|
5069
|
+
// (model streaming past 5 min) — parity with silencePoke's own
|
|
5070
|
+
// suppression, so the cutover gate doesn't clear a live turn.
|
|
5071
|
+
shadowEmit({ kind: 'modelOutbound', key: statusKey(chat_id, threadId) as _ChatKey, at: Date.now() })
|
|
5023
5072
|
// #1741 — only clear silent-end state on a plausibly-final reply.
|
|
5024
5073
|
// An interim ack (disable_notification:true, short text, no done)
|
|
5025
5074
|
// must NOT clear the state file; otherwise a turn that ends with
|
|
@@ -5615,6 +5664,9 @@ async function executeStreamReply(args: Record<string, unknown>): Promise<unknow
|
|
|
5615
5664
|
const sKey = statusKey(streamChatId, streamThreadId)
|
|
5616
5665
|
signalTracker.noteOutbound(sKey, Date.now())
|
|
5617
5666
|
silencePoke.noteOutbound(sKey, Date.now())
|
|
5667
|
+
// PR3b-cutover: feed lastOutboundAt to the delivery machine (see
|
|
5668
|
+
// executeReply) so its TTL tick suppresses an active-turn fallback.
|
|
5669
|
+
shadowEmit({ kind: 'modelOutbound', key: sKey as _ChatKey, at: Date.now() })
|
|
5618
5670
|
// #1741 — see executeReply for the rationale: only a plausibly-
|
|
5619
5671
|
// final stream_reply clears the silent-end state. An interim
|
|
5620
5672
|
// ack via stream_reply must NOT clear; the Stop hook needs
|
|
@@ -7012,6 +7064,20 @@ function handleSessionEvent(ev: SessionEvent): void {
|
|
|
7012
7064
|
isDm: isDmChatId(ev.chatId),
|
|
7013
7065
|
}
|
|
7014
7066
|
currentTurn = next
|
|
7067
|
+
// PR3b-cutover: feed the authoritative turn-start to the delivery
|
|
7068
|
+
// machine. `enqueue` fires for EVERY turn atom regardless of
|
|
7069
|
+
// source — inbound, cron, subagent-handback, vault-resume,
|
|
7070
|
+
// restart-marker — so it is the single chokepoint that captures
|
|
7071
|
+
// the non-inbound turns the machine's own `inbound` event never
|
|
7072
|
+
// sees (those bypass handleInbound). Without it the machine reads
|
|
7073
|
+
// idle during a cron/handback turn and the gate would mis-deliver
|
|
7074
|
+
// a concurrent inbound mid-turn (the #1556 composer wedge).
|
|
7075
|
+
// Idempotent when already in_turn (turnStart only sets perKey).
|
|
7076
|
+
shadowEmit({
|
|
7077
|
+
kind: 'turnStart',
|
|
7078
|
+
key: statusKey(ev.chatId, ev.threadId != null ? Number(ev.threadId) : undefined) as _ChatKey,
|
|
7079
|
+
at: startedAt,
|
|
7080
|
+
})
|
|
7015
7081
|
// #549 fix — fresh turn, reset preamble-suppression state.
|
|
7016
7082
|
preambleSuppressor.reset()
|
|
7017
7083
|
// Reset the silent-end retry budget for this chat. The stored
|
|
@@ -7130,7 +7196,12 @@ function handleSessionEvent(ev: SessionEvent): void {
|
|
|
7130
7196
|
// empty draft to wipe the compose-area preview; for persisted
|
|
7131
7197
|
// messages, delete. The user sees the real reply land in the
|
|
7132
7198
|
// same beat the summary disappears.
|
|
7133
|
-
|
|
7199
|
+
// Legacy (flag-off): the activity summary clears on the first
|
|
7200
|
+
// reply — it was a one-shot "what I did" line. DRAFT_MIRROR keeps
|
|
7201
|
+
// the live feed running through mid-turn replies and clears it at
|
|
7202
|
+
// turn_end instead, so an early reply doesn't wipe the stream
|
|
7203
|
+
// (the fast-turn determinism fix).
|
|
7204
|
+
if (wasFirstReply && !DRAFT_MIRROR_ENABLED) {
|
|
7134
7205
|
clearActivitySummary(turn)
|
|
7135
7206
|
}
|
|
7136
7207
|
}
|
|
@@ -7153,22 +7224,19 @@ function handleSessionEvent(ev: SessionEvent): void {
|
|
|
7153
7224
|
// exactly once at a time and re-running until pending matches
|
|
7154
7225
|
// the last-sent. Captures `turn` so a late drain after turn-swap
|
|
7155
7226
|
// can't corrupt the next turn's atom.
|
|
7156
|
-
// DRAFT_MIRROR (RFC draft-mirror-preview): accumulate each tool_use
|
|
7157
|
-
// into a human-friendly running feed in the live preview, using the
|
|
7158
|
-
// model-authored descriptive field (Bash.description, Read/Edit file
|
|
7159
|
-
// basename, hindsight→"Searching memory", etc. — see describeToolUse
|
|
7160
|
-
// / appendActivityLine). The draft shows the turn's actions as a
|
|
7161
|
-
// capped chronological list (Claude Code-style), clears on reply.
|
|
7162
|
-
// Never surfaces raw shell/query syntax — option A, uniform across
|
|
7163
|
-
// code + non-code agents.
|
|
7164
|
-
//
|
|
7165
7227
|
// Flag OFF (default): the legacy generic verb-count summary
|
|
7166
7228
|
// ("Ran 5 commands") via registerAndRender — byte-identical to
|
|
7167
|
-
// pre-draft-mirror behavior.
|
|
7168
|
-
|
|
7169
|
-
|
|
7170
|
-
|
|
7171
|
-
|
|
7229
|
+
// pre-draft-mirror behavior, cleared on first reply.
|
|
7230
|
+
//
|
|
7231
|
+
// DRAFT_MIRROR: the draft is NOT driven from this (flush-gated)
|
|
7232
|
+
// tool_use event — it's driven by the real-time `tool_label` event
|
|
7233
|
+
// (PreToolUse sidecar, fires at tool-call time regardless of when
|
|
7234
|
+
// claude flushes the transcript). See `case 'tool_label'`. That's
|
|
7235
|
+
// the determinism fix: on a fast/clustered-tool turn the JSONL
|
|
7236
|
+
// tool_use rows aren't on disk until ~turn-end, so sourcing the
|
|
7237
|
+
// draft here lost the feed; the sidecar is flush-independent.
|
|
7238
|
+
if (!DRAFT_MIRROR_ENABLED && !turn.replyCalled && !isTelegramSurfaceTool(name)) {
|
|
7239
|
+
const rendered = registerAndRender(turn.toolActivity, name)
|
|
7172
7240
|
if (rendered != null) {
|
|
7173
7241
|
turn.activityPendingRender = rendered
|
|
7174
7242
|
if (turn.activityInFlight == null) {
|
|
@@ -7184,6 +7252,31 @@ function handleSessionEvent(ev: SessionEvent): void {
|
|
|
7184
7252
|
}
|
|
7185
7253
|
return
|
|
7186
7254
|
}
|
|
7255
|
+
case 'tool_label': {
|
|
7256
|
+
// DRAFT_MIRROR real-time driver. The PreToolUse hook wrote this
|
|
7257
|
+
// label synchronously at tool-call time; the sidecar surfaced it
|
|
7258
|
+
// here (~250ms) independent of the transcript flush. Accumulate it
|
|
7259
|
+
// into the live feed and update the ephemeral draft — this is what
|
|
7260
|
+
// makes the draft deterministic on fast/clustered-tool turns where
|
|
7261
|
+
// the JSONL tool_use rows arrive too late.
|
|
7262
|
+
if (!DRAFT_MIRROR_ENABLED) return
|
|
7263
|
+
const turn = currentTurn
|
|
7264
|
+
if (turn == null) return
|
|
7265
|
+
// Surface tools (reply/stream_reply/react) are the conversation, not
|
|
7266
|
+
// activity — the hook labels them ("Replying"), so filter by name.
|
|
7267
|
+
if (isTelegramSurfaceTool(ev.toolName)) return
|
|
7268
|
+
// Unlike the legacy tool_use path, do NOT gate on replyCalled — the
|
|
7269
|
+
// whole point is to show activity even when a reply raced ahead of
|
|
7270
|
+
// the (lagged) transcript. The feed clears at turn_end.
|
|
7271
|
+
const rendered = appendActivityLabel(turn.mirrorLines, ev.label)
|
|
7272
|
+
if (rendered != null) {
|
|
7273
|
+
turn.activityPendingRender = rendered
|
|
7274
|
+
if (turn.activityInFlight == null) {
|
|
7275
|
+
turn.activityInFlight = drainActivitySummary(turn)
|
|
7276
|
+
}
|
|
7277
|
+
}
|
|
7278
|
+
return
|
|
7279
|
+
}
|
|
7187
7280
|
case 'text': {
|
|
7188
7281
|
// #1067: snapshot at entry. The answer-stream creation closures
|
|
7189
7282
|
// below also read `turn` instead of currentTurn so they pin to
|
|
@@ -7454,6 +7547,14 @@ function handleSessionEvent(ev: SessionEvent): void {
|
|
|
7454
7547
|
clearTimeout(turn.orphanedReplyTimeoutId)
|
|
7455
7548
|
turn.orphanedReplyTimeoutId = null
|
|
7456
7549
|
}
|
|
7550
|
+
// DRAFT_MIRROR: the live activity feed runs through the whole turn
|
|
7551
|
+
// (it is NOT cleared on the first reply, unlike the legacy summary)
|
|
7552
|
+
// so an early/mid-turn reply can't wipe it. Clear it here, at the
|
|
7553
|
+
// real end of the turn — the ephemeral compose-area draft goes away
|
|
7554
|
+
// once the work is actually done.
|
|
7555
|
+
if (DRAFT_MIRROR_ENABLED && turn != null) {
|
|
7556
|
+
clearActivitySummary(turn)
|
|
7557
|
+
}
|
|
7457
7558
|
// #549 fix — flush any pending preamble BEFORE the answer stream is
|
|
7458
7559
|
// nulled below. Text emitted immediately before turn_end (no tool
|
|
7459
7560
|
// followed) is the answer; the suppressor's emitAnswer callback
|
|
@@ -8505,6 +8606,14 @@ async function handleInbound(
|
|
|
8505
8606
|
// vs mid-turn — its decision will be visible in the gw-trace shadow
|
|
8506
8607
|
// line emitted to stderr.
|
|
8507
8608
|
const _shadowKey = statusKey(ctx.chat?.id != null ? String(ctx.chat.id) : '0', ctx.message?.message_thread_id) as _ChatKey
|
|
8609
|
+
// PR3b-cutover: snapshot the machine's in-turn state BEFORE the
|
|
8610
|
+
// inbound event advances it. A fresh-turn inbound transitions the
|
|
8611
|
+
// machine idle→in_turn; reading after the emit would see THIS
|
|
8612
|
+
// message's own just-started turn and self-block it (the same
|
|
8613
|
+
// self-block hazard the claudeBusyKeys snapshot below guards). When
|
|
8614
|
+
// the kill-switch is off this is null and the gate uses the legacy
|
|
8615
|
+
// claudeBusyKeys read.
|
|
8616
|
+
const machineInTurnAtReceipt = isDeliveryCutoverEnabled() ? isMachineInTurn() : null
|
|
8508
8617
|
shadowEmit({
|
|
8509
8618
|
kind: 'inbound',
|
|
8510
8619
|
key: _shadowKey,
|
|
@@ -8556,7 +8665,12 @@ async function handleInbound(
|
|
|
8556
8665
|
// no turn_end ever fires). With claudeBusyKeys, B sees true (A is
|
|
8557
8666
|
// busy) → B is buffered correctly, AND the gate cleanly reopens
|
|
8558
8667
|
// when A's turn_end deletes keyA → flush triggers → B delivered.
|
|
8559
|
-
|
|
8668
|
+
// PR3b-cutover: prefer the machine snapshot taken before the inbound
|
|
8669
|
+
// event advanced it (machineInTurnAtReceipt); null when the
|
|
8670
|
+
// kill-switch is off, in which case the legacy claudeBusyKeys read
|
|
8671
|
+
// stands. Both are "was a turn in flight at receipt", not a live
|
|
8672
|
+
// post-this-inbound read — see machineInTurnAtReceipt's comment.
|
|
8673
|
+
const turnInFlightAtReceipt = machineInTurnAtReceipt ?? (claudeBusyKeys.size > 0)
|
|
8560
8674
|
|
|
8561
8675
|
const access = result.access
|
|
8562
8676
|
const from = ctx.from!
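Note on the hunk above: the ordering hazard the comments describe (read the state before the inbound event advances it) can be shown with a toy model. `TinyMachine` and `receive` are hypothetical and far simpler than the real delivery machine; the sketch only assumes that a fresh inbound moves the machine from idle to in-turn.

```typescript
// Hypothetical miniature of the machine: one flag that `inbound` sets.
class TinyMachine {
  private inTurn = false;
  isInTurn(): boolean {
    return this.inTurn;
  }
  emitInbound(): void {
    this.inTurn = true; // fresh inbound: idle -> in_turn; mid-turn inbound: stays in_turn
  }
}

// Decide whether a message is delivered or held, reading the state either
// before the emit (snapshot) or after it (the self-blocking order).
function receive(m: TinyMachine, snapshotFirst: boolean): "deliver" | "buffer" {
  const before = m.isInTurn();
  m.emitInbound();
  const inFlightAtReceipt = snapshotFirst ? before : m.isInTurn();
  return inFlightAtReceipt ? "buffer" : "deliver";
}

const freshWithSnapshot = receive(new TinyMachine(), true);
const freshWithoutSnapshot = receive(new TinyMachine(), false);

// A second message arriving while the first turn is still running.
const busy = new TinyMachine();
busy.emitInbound();
const midTurnWithSnapshot = receive(busy, true);
```

In this toy model the snapshot order delivers a fresh-turn message and still buffers a genuine mid-turn one, while reading after the emit holds the fresh message behind its own turn.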
|
|
@@ -43,6 +43,39 @@ import {
|
|
|
43
43
|
let state: State = initialState()
|
|
44
44
|
const enabled = process.env.SWITCHROOM_DELIVERY_MACHINE_SHADOW !== '0'
|
|
45
45
|
|
|
46
|
+
// Phase 2b PR 3 — STAGED CUTOVER. When enabled, the gateway's
|
|
47
|
+
// "is a turn in flight?" gate reads this machine's global state
|
|
48
|
+
// instead of the PR3b `claudeBusyKeys` set. The machine tracks ONE
|
|
49
|
+
// `activeTurn` (single bridge) plus TTL `tick` expiry, so — unlike a
|
|
50
|
+
// per-delivery key set — it cannot accumulate orphan keys and wedge
|
|
51
|
+
// the gate "in-flight forever" (the gymbro/clerk 5-min dangle of
|
|
52
|
+
// 2026-05-28). Scope is the turn-in-flight GATE only; the poke ladder
|
|
53
|
+
// and perm-verdict effects stay imperative for a follow-up PR.
|
|
54
|
+
//
|
|
55
|
+
// Kill switch: `SWITCHROOM_DELIVERY_MACHINE_CUTOVER=0` reverts every
|
|
56
|
+
// gate to the legacy claudeBusyKeys read (zero behaviour change).
|
|
57
|
+
// Requires shadow mode ON — with shadow off the machine state is
|
|
58
|
+
// frozen and must NOT be read as authoritative.
|
|
59
|
+
const cutoverEnabled = enabled && process.env.SWITCHROOM_DELIVERY_MACHINE_CUTOVER !== '0'
|
|
60
|
+
|
|
61
|
+
/**
|
|
62
|
+
* True when the kill-switch leaves the delivery machine authoritative
|
|
63
|
+
* for the turn-in-flight gate. Gateway gate sites branch on this.
|
|
64
|
+
*/
|
|
65
|
+
export function isDeliveryCutoverEnabled(): boolean {
|
|
66
|
+
return cutoverEnabled
|
|
67
|
+
}
|
|
68
|
+
|
|
69
|
+
/**
|
|
70
|
+
* Authoritative "is a turn currently in flight?" read for the gate.
|
|
71
|
+
* Maps the machine's global state to the boolean the legacy
|
|
72
|
+
* `claudeBusyKeys.size > 0` gate produced. `bridge_dead` and
|
|
73
|
+
* `bridge_alive_idle` are both "not in flight".
|
|
74
|
+
*/
|
|
75
|
+
export function isMachineInTurn(): boolean {
|
|
76
|
+
return state.global.kind === 'bridge_alive_in_turn'
|
|
77
|
+
}
|
|
78
|
+
|
|
46
79
|
/**
|
|
47
80
|
* Run an event through the state machine in shadow mode. The machine
|
|
48
81
|
* state advances, the predicted effects are LOGGED, but no I/O fires.
|
|
@@ -74,15 +74,24 @@ function urlHostPath(u) {
|
|
|
74
74
|
export function computeLabel(toolName, input) {
|
|
75
75
|
const i = input ?? {}
|
|
76
76
|
|
|
77
|
-
//
|
|
78
|
-
// the
|
|
77
|
+
// Bash / Task / ToolSearch / TodoWrite: previously emitted nothing
|
|
78
|
+
// (deferred to the session-JSONL description path). The draft-mirror
|
|
79
|
+
// now drives off THIS sidecar in real time (flush-independent), so we
|
|
80
|
+
// must label them here too — otherwise the most common tool (Bash)
|
|
81
|
+
// never reaches the live draft. Uses the model-authored `description`
|
|
82
|
+
// for Bash/Task, matching the gateway's describeToolUse rendering.
|
|
79
83
|
switch (toolName) {
|
|
80
84
|
case 'Bash':
|
|
85
|
+
return clip(String(i.description ?? ''), 70).trim() || 'Running a command'
|
|
81
86
|
case 'Task':
|
|
82
|
-
case 'Agent':
|
|
87
|
+
case 'Agent': {
|
|
88
|
+
const d = clip(String(i.description ?? ''), 60).trim()
|
|
89
|
+
return d ? `Delegating: ${d}` : 'Delegating to a sub-agent'
|
|
90
|
+
}
|
|
83
91
|
case 'TodoWrite':
|
|
92
|
+
return 'Updating the plan'
|
|
84
93
|
case 'ToolSearch':
|
|
85
|
-
return
|
|
94
|
+
return 'Finding the right tool'
|
|
86
95
|
}
|
|
87
96
|
|
|
88
97
|
// Built-in rule table.
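Note on the hunk above: a small TypeScript sketch of the four added branches, useful for checking the fallback strings. The hook itself is an `.mjs` file, and its `clip` helper is not shown in this diff, so the `clip` below is an assumed plain truncation; `labelFor` is a hypothetical name, not the hook's export.

```typescript
// Assumed truncation helper; the real `clip` is not part of this diff.
function clip(s: string, max: number): string {
  return s.length <= max ? s : s.slice(0, max);
}

// Mirrors only the branches added in this hunk; everything else returns undefined.
function labelFor(
  toolName: string,
  input: { description?: unknown } | null | undefined,
): string | undefined {
  const i: { description?: unknown } = input ?? {};
  switch (toolName) {
    case "Bash":
      return clip(String(i.description ?? ""), 70).trim() || "Running a command";
    case "Task":
    case "Agent": {
      const d = clip(String(i.description ?? ""), 60).trim();
      return d ? `Delegating: ${d}` : "Delegating to a sub-agent";
    }
    case "TodoWrite":
      return "Updating the plan";
    case "ToolSearch":
      return "Finding the right tool";
    default:
      return undefined; // the built-in rule table would take over here
  }
}

const described = labelFor("Bash", { description: "List workspace" });
const undescribed = labelFor("Bash", {});
const delegated = labelFor("Agent", { description: "audit logs" });
```

With these inputs the labels are "List workspace", "Running a command", and "Delegating: audit logs" respectively.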
|
|
@@ -93,6 +93,11 @@ export type SessionEvent =
|
|
|
93
93
|
| { kind: 'dequeue' }
|
|
94
94
|
| { kind: 'thinking' }
|
|
95
95
|
| { kind: 'tool_use'; toolName: string; toolUseId?: string | null; input?: Record<string, unknown>; precomputedLabel?: string }
|
|
96
|
+
// Real-time tool label from the PreToolUse-hook sidecar — fires when the
|
|
97
|
+
// hook writes the label (synchronous at tool-call time), independent of
|
|
98
|
+
// the lazily-flushed transcript. The draft-mirror drives off THIS, not
|
|
99
|
+
// the flush-gated `tool_use`, so activity streams deterministically.
|
|
100
|
+
| { kind: 'tool_label'; toolUseId: string; label: string; toolName: string }
|
|
96
101
|
| { kind: 'text'; text: string }
|
|
97
102
|
| { kind: 'tool_result'; toolUseId: string; toolName: string | null; isError?: boolean; errorText?: string }
|
|
98
103
|
| { kind: 'turn_end'; durationMs: number }
|
|
@@ -639,6 +644,13 @@ export function startSessionTail(config: SessionTailConfig): SessionTailHandle {
|
|
|
639
644
|
try {
|
|
640
645
|
const s = createToolLabelSidecar({ stateDir: stateDirForSidecar, sessionId })
|
|
641
646
|
sidecars.set(sessionId, s)
|
|
647
|
+
// Real-time draft-mirror source: emit a `tool_label` event the moment
|
|
648
|
+
// the hook writes a label (flush-independent), so the gateway can
|
|
649
|
+
// stream the activity feed without waiting on the transcript flush.
|
|
650
|
+
// Subscribed once per sidecar (this is the only creation site).
|
|
651
|
+
s.onLabel((toolUseId, label, toolName) => {
|
|
652
|
+
rawOnEvent({ kind: 'tool_label', toolUseId, label, toolName })
|
|
653
|
+
})
|
|
642
654
|
return s
|
|
643
655
|
} catch (err) {
|
|
644
656
|
log?.(`session-tail: sidecar create failed: ${(err as Error).message}`)
|
|
@@ -775,6 +787,12 @@ export function startSessionTail(config: SessionTailConfig): SessionTailHandle {
|
|
|
775
787
|
}
|
|
776
788
|
log?.(`session-tail: attached to ${file} (cursor=${cursor})`)
|
|
777
789
|
}
|
|
790
|
+
// Eagerly create + subscribe the PreToolUse sidecar for this session
|
|
791
|
+
// NOW (on attach), not lazily on the first JSONL tool_use — otherwise
|
|
792
|
+
// the real-time `tool_label` source wouldn't exist until a flush-gated
|
|
793
|
+
// tool_use arrived, re-introducing the very lag the sidecar avoids.
|
|
794
|
+
const attachSid = sessionIdForFile(file)
|
|
795
|
+
if (attachSid) ensureSidecar(attachSid)
|
|
778
796
|
try {
|
|
779
797
|
watcher = watch(file, () => readNew())
|
|
780
798
|
} catch (err) {
|
|
@@ -0,0 +1,93 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* PR3b cutover — the turn-in-flight GATE now reads the delivery state
|
|
3
|
+
* machine (`isMachineInTurn`) instead of the PR3b `claudeBusyKeys` set.
|
|
4
|
+
*
|
|
5
|
+
* The bug this closes (gymbro/clerk, 2026-05-28): `claudeBusyKeys` is a
|
|
6
|
+
* per-delivery Set — every delivery `.add`s a key, but turn-end `.delete`s
|
|
7
|
+
* exactly one. When a turn-end is missed (or fires under a non-matching
|
|
8
|
+
* key) the set keeps an orphan, `size > 0` reads true forever, and EVERY
|
|
9
|
+
* subsequent inbound buffers as "held mid-turn" until the 5-min
|
|
10
|
+
* framework-fallback force-drains it.
|
|
11
|
+
*
|
|
12
|
+
* The machine cannot accumulate orphans: global state holds ONE
|
|
13
|
+
* `activeTurn`, so any matching turnEnd returns it to idle, and the TTL
|
|
14
|
+
* `tick` self-heals a missed turnEnd. These tests pin both the normal
|
|
15
|
+
* reopen and the dangle-recovery path on the accessors the gate reads.
|
|
16
|
+
*/
|
|
17
|
+
|
|
18
|
+
import { describe, expect, it, beforeEach } from 'vitest'
|
|
19
|
+
import {
|
|
20
|
+
shadowEmit,
|
|
21
|
+
isMachineInTurn,
|
|
22
|
+
isDeliveryCutoverEnabled,
|
|
23
|
+
__shadowResetForTests,
|
|
24
|
+
} from '../gateway/inbound-delivery-machine-shadow.js'
|
|
25
|
+
import { TURN_TTL_MS, type ChatKey } from '../gateway/inbound-delivery-machine.js'
|
|
26
|
+
|
|
27
|
+
const KEY_A = '111:_' as ChatKey
|
|
28
|
+
const KEY_B = '222:_' as ChatKey
|
|
29
|
+
|
|
30
|
+
function inbound(key: ChatKey, at: number, msgId = 1) {
|
|
31
|
+
shadowEmit({ kind: 'inbound', key, msg: { msgId, isSteering: false, payload: null }, at })
|
|
32
|
+
}
|
|
33
|
+
|
|
34
|
+
describe('PR3b cutover gate accessors', () => {
|
|
35
|
+
beforeEach(() => __shadowResetForTests())
|
|
36
|
+
|
|
37
|
+
it('enabled by default (shadow on, no kill-switch in test env)', () => {
|
|
38
|
+
expect(isDeliveryCutoverEnabled()).toBe(true)
|
|
39
|
+
})
|
|
40
|
+
|
|
41
|
+
it('reads idle before any turn (bridge alive)', () => {
|
|
42
|
+
shadowEmit({ kind: 'bridgeUp', at: 1000 })
|
|
43
|
+
expect(isMachineInTurn()).toBe(false)
|
|
44
|
+
})
|
|
45
|
+
|
|
46
|
+
it('flips in-turn on a fresh inbound and reopens on turnEnd (the gate reopen)', () => {
|
|
47
|
+
shadowEmit({ kind: 'bridgeUp', at: 1000 })
|
|
48
|
+
inbound(KEY_A, 2000)
|
|
49
|
+
expect(isMachineInTurn()).toBe(true)
|
|
50
|
+
shadowEmit({ kind: 'turnEnd', key: KEY_A, at: 3000, outboundEmitted: true })
|
|
51
|
+
// Gate reopens immediately — this is the path claudeBusyKeys dangled on.
|
|
52
|
+
expect(isMachineInTurn()).toBe(false)
|
|
53
|
+
})
|
|
54
|
+
|
|
55
|
+
it('self-heals a MISSED turnEnd via the TTL tick (the dangle the fix kills)', () => {
|
|
56
|
+
shadowEmit({ kind: 'bridgeUp', at: 1000 })
|
|
57
|
+
// Turn A starts via enqueue (turnStart), then turn B starts before A's
|
|
58
|
+
// turnEnd ever lands — the orphan scenario. The machine keeps
|
|
59
|
+
// activeTurn=A (turnStart is a no-op on global when already in_turn),
|
|
60
|
+
// so a later turnEnd(B) does NOT match and would leave A dangling.
|
|
61
|
+
shadowEmit({ kind: 'turnStart', key: KEY_A, at: 2000 })
|
|
62
|
+
shadowEmit({ kind: 'turnStart', key: KEY_B, at: 3000 })
|
|
63
|
+
shadowEmit({ kind: 'turnEnd', key: KEY_B, at: 4000, outboundEmitted: true })
|
|
64
|
+
// Without tick, the gate would still read in-turn (activeTurn=A stuck).
|
|
65
|
+
expect(isMachineInTurn()).toBe(true)
|
|
66
|
+
// TTL tick past A's start clears the orphan and reopens the gate —
|
|
67
|
+
// the structural guarantee claudeBusyKeys lacked.
|
|
68
|
+
shadowEmit({ kind: 'tick', now: 2000 + TURN_TTL_MS + 1 })
|
|
69
|
+
expect(isMachineInTurn()).toBe(false)
|
|
70
|
+
})
|
|
71
|
+
|
|
72
|
+
it('does NOT clear a long-but-ACTIVE turn (modelOutbound suppression)', () => {
|
|
73
|
+
shadowEmit({ kind: 'bridgeUp', at: 1000 })
|
|
74
|
+
shadowEmit({ kind: 'turnStart', key: KEY_A, at: 2000 })
|
|
75
|
+
// Model is still streaming just before the TTL boundary.
|
|
76
|
+
const justBeforeTtl = 2000 + TURN_TTL_MS - 5_000
|
|
77
|
+
shadowEmit({ kind: 'modelOutbound', key: KEY_A, at: justBeforeTtl })
|
|
78
|
+
// Tick past TTL — but recent outbound is within the suppression window,
|
|
79
|
+
// so the turn is NOT cleared (parity with the imperative silence-poke).
|
|
80
|
+
shadowEmit({ kind: 'tick', now: 2000 + TURN_TTL_MS + 1 })
|
|
81
|
+
expect(isMachineInTurn()).toBe(true)
|
|
82
|
+
})
|
|
83
|
+
|
|
84
|
+
it('a buffered sibling inbound does not change the active turn', () => {
|
|
85
|
+
shadowEmit({ kind: 'bridgeUp', at: 1000 })
|
|
86
|
+
inbound(KEY_A, 2000) // fresh turn A
|
|
87
|
+
inbound(KEY_B, 2500) // mid-turn — buffered, must NOT start a new turn
|
|
88
|
+
expect(isMachineInTurn()).toBe(true)
|
|
89
|
+
shadowEmit({ kind: 'turnEnd', key: KEY_A, at: 3000, outboundEmitted: true })
|
|
90
|
+
// A ended; nothing else active → gate reopens so B can drain.
|
|
91
|
+
expect(isMachineInTurn()).toBe(false)
|
|
92
|
+
})
|
|
93
|
+
})
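Note on the test file above: the two TTL cases (a dangling turn is cleared, a still-streaming turn is not) can be reproduced with a small pure function. This is a sketch under stated assumptions, not the machine's reducer: `ActiveTurn` and `tick` are hypothetical, the 5-minute TTL comes from the gateway comment, and `OUTBOUND_QUIET_MS` is a placeholder because the real suppression window is not shown in this diff.

```typescript
const TURN_TTL_MS = 5 * 60_000;   // 5 min, per the comment in gateway.ts
const OUTBOUND_QUIET_MS = 60_000; // placeholder; the real window is not shown in this diff

interface ActiveTurn {
  key: string;
  startedAt: number;
  lastOutboundAt: number | null;
}

// Returns the turn that should remain active after a tick, or null when it expires.
function tick(turn: ActiveTurn | null, now: number): ActiveTurn | null {
  if (turn == null) return null;
  const expired = now - turn.startedAt > TURN_TTL_MS;
  const recentlyActive =
    turn.lastOutboundAt != null && now - turn.lastOutboundAt < OUTBOUND_QUIET_MS;
  return expired && !recentlyActive ? null : turn;
}

const tickAt = 2000 + TURN_TTL_MS + 1;

// Missed turnEnd: no outbound since the turn started, so the tick clears it.
const dangling = tick({ key: "111:_", startedAt: 2000, lastOutboundAt: null }, tickAt);

// Long but active turn: the model sent output 5 s before the TTL boundary.
const streaming = tick(
  { key: "111:_", startedAt: 2000, lastOutboundAt: 2000 + TURN_TTL_MS - 5_000 },
  tickAt,
);
```

Here `dangling` is `null` and `streaming` is still the active turn, matching the expectations of the "self-heals a MISSED turnEnd" and "does NOT clear a long-but-ACTIVE turn" cases.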
|
|
@@ -7,6 +7,7 @@ import {
|
|
|
7
7
|
verbForTool,
|
|
8
8
|
describeToolUse,
|
|
9
9
|
appendActivityLine,
|
|
10
|
+
appendActivityLabel,
|
|
10
11
|
renderActivityFeed,
|
|
11
12
|
MIRROR_MAX_LINES,
|
|
12
13
|
} from "../tool-activity-summary.js";
|
|
@@ -328,3 +329,21 @@ describe("appendActivityLine + renderActivityFeed — accumulating draft feed",
|
|
|
328
329
|
expect(renderActivityFeed([])).toBeNull();
|
|
329
330
|
});
|
|
330
331
|
});
|
|
332
|
+
|
|
333
|
+
describe("appendActivityLabel — precomputed label feed (tool_label path)", () => {
|
|
334
|
+
it("accumulates precomputed labels, dedups consecutive, ignores empty", () => {
|
|
335
|
+
const lines: string[] = [];
|
|
336
|
+
expect(appendActivityLabel(lines, "Searching memory")).toBe("· Searching memory");
|
|
337
|
+
expect(appendActivityLabel(lines, "List workspace")).toBe(
|
|
338
|
+
"· Searching memory\n· List workspace",
|
|
339
|
+
);
|
|
340
|
+
// consecutive dup collapses
|
|
341
|
+
appendActivityLabel(lines, "List workspace");
|
|
342
|
+
expect(lines).toEqual(["Searching memory", "List workspace"]);
|
|
343
|
+
// empty / whitespace → null, no push
|
|
344
|
+
expect(appendActivityLabel(lines, "")).toBeNull();
|
|
345
|
+
expect(appendActivityLabel(lines, " ")).toBeNull();
|
|
346
|
+
expect(appendActivityLabel(lines, undefined)).toBeNull();
|
|
347
|
+
expect(lines.length).toBe(2);
|
|
348
|
+
});
|
|
349
|
+
});
|
|
@@ -382,3 +382,21 @@ export function renderActivityFeed(lines: string[]): string | null {
|
|
|
382
382
|
const body = shown.map((l) => `· ${l}`).join("\n");
|
|
383
383
|
return hidden > 0 ? `· +${hidden} earlier…\n${body}` : body;
|
|
384
384
|
}
|
|
385
|
+
|
|
386
|
+
/**
|
|
387
|
+
* Like appendActivityLine, but for a pre-computed label (from the
|
|
388
|
+
* real-time PreToolUse sidecar / `tool_label` event) — the hook already
|
|
389
|
+
* rendered the friendly text, so we skip describeToolUse. Returns the
|
|
390
|
+
* rendered feed, or null when the label is empty.
|
|
391
|
+
*/
|
|
392
|
+
export function appendActivityLabel(
|
|
393
|
+
lines: string[],
|
|
394
|
+
label: string | undefined,
|
|
395
|
+
): string | null {
|
|
396
|
+
const l = (label ?? "").trim();
|
|
397
|
+
if (l.length === 0) return null;
|
|
398
|
+
if (lines.length === 0 || lines[lines.length - 1] !== l) {
|
|
399
|
+
lines.push(l);
|
|
400
|
+
}
|
|
401
|
+
return renderActivityFeed(lines);
|
|
402
|
+
}
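Note on the hunk above: a self-contained usage sketch of `appendActivityLabel`. The function body is copied from the hunk; `renderActivityFeed` is only partly visible in this diff, so the slicing below and the `MIRROR_MAX_LINES` value of 6 are assumptions made to keep the example runnable.

```typescript
const MIRROR_MAX_LINES = 6; // assumed cap; the real value is not shown in this diff

// Reconstructed around the three visible lines of renderActivityFeed (assumption).
function renderActivityFeed(lines: string[]): string | null {
  if (lines.length === 0) return null;
  const shown = lines.slice(-MIRROR_MAX_LINES);
  const hidden = lines.length - shown.length;
  const body = shown.map((l) => `· ${l}`).join("\n");
  return hidden > 0 ? `· +${hidden} earlier…\n${body}` : body;
}

// As added in tool-activity-summary.ts: trim, skip empties, collapse consecutive duplicates.
function appendActivityLabel(lines: string[], label: string | undefined): string | null {
  const l = (label ?? "").trim();
  if (l.length === 0) return null;
  if (lines.length === 0 || lines[lines.length - 1] !== l) {
    lines.push(l);
  }
  return renderActivityFeed(lines);
}

const feed: string[] = [];
const first = appendActivityLabel(feed, "Searching memory");
const second = appendActivityLabel(feed, "List workspace");
const repeat = appendActivityLabel(feed, "List workspace"); // consecutive duplicate, no new line
const blank = appendActivityLabel(feed, "   ");             // whitespace only, ignored
```

After these calls `feed` holds two entries, `repeat` equals `second`, and `blank` is `null`, consistent with the test added in `tool-activity-summary.test.ts`.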
|
|
@@ -40,8 +40,11 @@ export interface ToolLabelRow {
|
|
|
40
40
|
export interface ToolLabelSidecar {
|
|
41
41
|
/** Synchronous label lookup. */
|
|
42
42
|
getLabel(toolUseId: string): string | undefined
|
|
43
|
-
/** Subscribe to "label arrived" notifications.
|
|
44
|
-
|
|
43
|
+
/** Subscribe to "label arrived" notifications. Fires once per new
|
|
44
|
+
* sidecar line, in real time (~pollMs after the hook's appendFileSync),
|
|
45
|
+
* independent of when the claude transcript flushes. `toolName` lets
|
|
46
|
+
* subscribers filter surface tools (reply/react) from a live feed. */
|
|
47
|
+
onLabel(cb: (toolUseId: string, label: string, toolName: string) => void): () => void
|
|
45
48
|
/** Force a re-poll (tests). */
|
|
46
49
|
poll(): void
|
|
47
50
|
/** Stop polling and release resources. */
|
|
@@ -63,7 +66,7 @@ export interface SidecarOptions {
|
|
|
63
66
|
export function createToolLabelSidecar(opts: SidecarOptions): ToolLabelSidecar {
|
|
64
67
|
const path = join(opts.stateDir, `tool-labels-${opts.sessionId}.jsonl`)
|
|
65
68
|
const labels = new Map<string, string>()
|
|
66
|
-
const subscribers = new Set<(toolUseId: string, label: string) => void>()
|
|
69
|
+
const subscribers = new Set<(toolUseId: string, label: string, toolName: string) => void>()
|
|
67
70
|
let offset = 0
|
|
68
71
|
let stopped = false
|
|
69
72
|
|
|
@@ -84,13 +87,18 @@ export function createToolLabelSidecar(opts: SidecarOptions): ToolLabelSidecar {
|
|
|
84
87
|
} catch {
|
|
85
88
|
continue
|
|
86
89
|
}
|
|
87
|
-
if (
|
|
90
|
+
if (
|
|
91
|
+
!row ||
|
|
92
|
+
typeof row.tool_use_id !== 'string' ||
|
|
93
|
+
typeof row.label !== 'string' ||
|
|
94
|
+
typeof row.tool_name !== 'string'
|
|
95
|
+
) continue
|
|
88
96
|
// First write wins — sidecar lines are append-only and we don't
|
|
89
97
|
// expect duplicates, but if one lands we keep the earliest.
|
|
90
98
|
if (labels.has(row.tool_use_id)) continue
|
|
91
99
|
labels.set(row.tool_use_id, row.label)
|
|
92
100
|
for (const cb of subscribers) {
|
|
93
|
-
try { cb(row.tool_use_id, row.label) } catch { /* ignore */ }
|
|
101
|
+
try { cb(row.tool_use_id, row.label, row.tool_name) } catch { /* ignore */ }
|
|
94
102
|
}
|
|
95
103
|
}
|
|
96
104
|
}
|
|
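A sketch of how a subscriber might use the new `toolName` argument to keep surface tools out of a live feed, as the updated `onLabel` doc comment suggests. `createLabelBus` is a hypothetical in-memory stand-in for the sidecar's subscribe/notify path (no file polling), and the tool-name strings in `SURFACE_TOOLS` are assumptions; the real `tool_name` values may differ.

```typescript
type LabelCallback = (toolUseId: string, label: string, toolName: string) => void;

// Hypothetical stand-in for the sidecar's subscriber handling only.
function createLabelBus() {
  const subscribers = new Set<LabelCallback>();
  return {
    // Returns an unsubscribe function, mirroring the () => void return type.
    onLabel(cb: LabelCallback): () => void {
      subscribers.add(cb);
      return () => {
        subscribers.delete(cb);
      };
    },
    emit(toolUseId: string, label: string, toolName: string): void {
      for (const cb of subscribers) {
        // A throwing subscriber must not break delivery to the others.
        try { cb(toolUseId, label, toolName); } catch { /* ignore */ }
      }
    },
  };
}

// Assumed names for the surface tools; adjust to the actual tool_name values.
const SURFACE_TOOLS = new Set(["reply", "react"]);

const bus = createLabelBus();
const liveFeed: string[] = [];
const unsubscribe = bus.onLabel((_id, label, toolName) => {
  if (SURFACE_TOOLS.has(toolName)) return; // keep reply/react out of the feed
  liveFeed.push(label);
});

bus.emit("toolu_1", "Searching memory", "Grep");
bus.emit("toolu_2", "Replying", "reply"); // filtered out
unsubscribe();
bus.emit("toolu_3", "List workspace", "Bash"); // arrives after unsubscribe
```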
@@ -215,9 +215,13 @@ const CC2_CASES: readonly CC2Case[] = [
|
|
|
215
215
|
},
|
|
216
216
|
{
|
|
217
217
|
name: "long-running with planned check-ins",
|
|
218
|
+
// Use python time.sleep, NOT the `sleep` command — Claude Code's bash
|
|
219
|
+
// sandbox blocks standalone `sleep` ("foreground sleep is sandboxed
|
|
220
|
+
// away"), which made this case un-runnable (agent replied instantly).
|
|
218
221
|
prompt:
|
|
219
|
-
"Run `bash` with `
|
|
220
|
-
"
|
|
222
|
+
"Run `bash` with `python3 -c 'import time; time.sleep(5)'` then echo " +
|
|
223
|
+
"step1, send a brief update, then `python3 -c 'import time; " +
|
|
224
|
+
"time.sleep(5)'` then echo step2, send another brief update, then " +
|
|
221
225
|
"send a final 'done' as your answer.",
|
|
222
226
|
},
|
|
223
227
|
];
|
|
@@ -262,12 +266,27 @@ async function assertMidTurnSilent(
|
|
|
262
266
|
)
|
|
263
267
|
.join("\n");
|
|
264
268
|
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
|
|
268
|
-
)
|
|
269
|
-
|
|
270
|
-
|
|
269
|
+
// The model habitually emits a trailing trivial confirmation ("Done.",
|
|
270
|
+
// "Sent.", "OK") as a separate SILENT message AFTER its real pinged
|
|
271
|
+
// answer. That's pacing noise (the turn-pacing directive discourages
|
|
272
|
+
// it), not the final answer — so don't treat it as the
|
|
273
|
+
// "final-answer-must-ping" target. Find the last SUBSTANTIVE message
|
|
274
|
+
// and assert that one pinged; trailing trivial confirmations are
|
|
275
|
+
// ignored for this invariant (they're correctly silent anyway).
|
|
276
|
+
const TRIVIAL_TAIL = /^(done|sent|ok|okay|ack|got it|hope (that|this) helps)\b[.! ]*$/i;
|
|
277
|
+
const isTrivial = (m: ObservedMessage) => TRIVIAL_TAIL.test(m.text.trim());
|
|
278
|
+
let finalIdx = collected.length - 1;
|
|
279
|
+
while (finalIdx > 0 && isTrivial(collected[finalIdx])) finalIdx--;
|
|
280
|
+
const finalAnswer = collected[finalIdx];
|
|
281
|
+
expect(
|
|
282
|
+
finalAnswer.silent,
|
|
283
|
+
`final substantive answer was silent — won't ping. Trail:\n${trail}`,
|
|
284
|
+
).toBe(false);
|
|
285
|
+
|
|
286
|
+
// Everything BEFORE the final substantive answer must be silent
|
|
287
|
+
// (mid-turn updates ping-free). Trailing trivial confirmations after
|
|
288
|
+
// it are already silent and are not "mid-turn" — exclude them too.
|
|
289
|
+
const midTurn = collected.slice(0, finalIdx);
|
|
271
290
|
const loudMidTurn = midTurn.filter((m) => !m.silent);
|
|
272
291
|
expect(
|
|
273
292
|
loudMidTurn.length,
|
|
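The walk-back over trailing trivial confirmations in the hunk above can be sketched in isolation as follows. `ObservedMessageSketch`, `lastSubstantiveIndex`, and the sample trail are hypothetical names and data for illustration; only the regex and the loop condition are taken from the added lines.

```typescript
// Hypothetical minimal shape; the real ObservedMessage may carry more fields.
interface ObservedMessageSketch {
  text: string;
  silent: boolean;
}

const TRIVIAL_TAIL =
  /^(done|sent|ok|okay|ack|got it|hope (that|this) helps)\b[.! ]*$/i;

// Index of the last message that is not a bare confirmation. The `idx > 0`
// guard means index 0 is returned even if every message is trivial.
function lastSubstantiveIndex(collected: ObservedMessageSketch[]): number {
  let idx = collected.length - 1;
  while (idx > 0 && TRIVIAL_TAIL.test(collected[idx].text.trim())) idx--;
  return idx;
}

const sampleTrail: ObservedMessageSketch[] = [
  { text: "step1 finished, continuing", silent: true },
  { text: "Both steps ran; output was step1 then step2.", silent: false },
  { text: "Done.", silent: true },
];

const finalIdx = lastSubstantiveIndex(sampleTrail);
const midTurn = sampleTrail.slice(0, finalIdx); // messages before the answer
const loudMidTurn = midTurn.filter((m) => !m.silent);
```

One consequence of this approach is worth keeping in mind: a turn whose real, pinged answer is itself a bare "done" also matches `TRIVIAL_TAIL`, so the walk-back would skip it and land on the preceding message.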
@@ -334,12 +353,19 @@ async function assertSilencePokeFires(
|
|
|
334
353
|
// Single bash call so the poke piggybacks the single tool result.
|
|
335
354
|
// Without the explicit "no replies" instruction the model might
|
|
336
355
|
// soft-commit; that resets the silence clock but a single >75s
|
|
337
|
-
//
|
|
356
|
+
// wait still pushes post-commit silence past the threshold.
|
|
357
|
+
//
|
|
358
|
+
// Use python time.sleep, NOT the `sleep` command — Claude Code's bash
|
|
359
|
+
// sandbox blocks standalone `sleep` ("foreground sleep is sandboxed
|
|
360
|
+
// away to prevent burning cache windows"), so a `sleep 80` prompt made
|
|
361
|
+
// the agent reply instantly instead of going silent, breaking this
|
|
362
|
+
// case. python3 time.sleep is a genuine foreground wait the sandbox
|
|
363
|
+
// doesn't special-case.
|
|
338
364
|
const prompt =
|
|
339
|
-
`Run exactly one Bash tool call: \`
|
|
340
|
-
`send any reply before
|
|
341
|
-
`mid-turn updates. When
|
|
342
|
-
`reply.`;
|
|
365
|
+
`Run exactly one Bash tool call: \`python3 -c 'import time; ` +
|
|
366
|
+
`time.sleep(${sleepSeconds})'\`. Do NOT send any reply before it ` +
|
|
367
|
+
`completes — no soft commit, no mid-turn updates. When it returns, ` +
|
|
368
|
+
`send one brief 'done' reply.`;
|
|
343
369
|
|
|
344
370
|
await scenario.sendDM(prompt);
|
|
345
371
|
|