switchroom 0.15.37 → 0.15.39

Files changed (73)
  1. package/dist/agent-scheduler/index.js +89 -89
  2. package/dist/auth-broker/index.js +89 -89
  3. package/dist/cli/autoaccept-poll.js +13 -7
  4. package/dist/cli/drive-write-pretool.mjs +10 -10
  5. package/dist/cli/notion-write-pretool.mjs +91 -91
  6. package/dist/cli/skill-validate-pretool.mjs +72 -72
  7. package/dist/cli/switchroom.js +857 -572
  8. package/dist/cli/ui/index.html +87 -17
  9. package/dist/host-control/main.js +158 -158
  10. package/dist/vault/approvals/kernel-server.js +91 -91
  11. package/dist/vault/broker/server.js +92 -92
  12. package/package.json +1 -1
  13. package/profiles/_base/cron-session.sh.hbs +1 -1
  14. package/profiles/_base/start.sh.hbs +1 -1
  15. package/profiles/default/CLAUDE.md.hbs +2 -0
  16. package/skills/switchroom-manage/SKILL.md +1 -1
  17. package/skills/switchroom-runtime/SKILL.md +1 -1
  18. package/telegram-plugin/answer-stream.ts +1 -1
  19. package/telegram-plugin/bridge/bridge.ts +18 -1
  20. package/telegram-plugin/bridge/ipc-client.ts +4 -1
  21. package/telegram-plugin/bridge/tool-filter.ts +77 -0
  22. package/telegram-plugin/chat-lock.ts +1 -1
  23. package/telegram-plugin/credits-watch.ts +1 -1
  24. package/telegram-plugin/dist/bridge/bridge.js +141 -115
  25. package/telegram-plugin/dist/gateway/gateway.js +318 -207
  26. package/telegram-plugin/dist/server.js +193 -164
  27. package/telegram-plugin/gateway/auto-classify-mid-turn.ts +1 -1
  28. package/telegram-plugin/gateway/boot-card.ts +5 -1
  29. package/telegram-plugin/gateway/boot-probes.ts +62 -0
  30. package/telegram-plugin/gateway/cron-session.ts +1 -1
  31. package/telegram-plugin/gateway/gateway.ts +133 -12
  32. package/telegram-plugin/gateway/grant-restart.ts +1 -1
  33. package/telegram-plugin/gateway/inbound-delivery-machine-dispatch.ts +1 -1
  34. package/telegram-plugin/gateway/inbound-delivery-machine-shadow.ts +1 -1
  35. package/telegram-plugin/gateway/inbound-delivery-machine.ts +1 -1
  36. package/telegram-plugin/gateway/interrupt-defer.ts +1 -1
  37. package/telegram-plugin/gateway/ipc-protocol.ts +12 -0
  38. package/telegram-plugin/gateway/permission-card-origin.ts +62 -0
  39. package/telegram-plugin/gateway/permission-timeout.ts +70 -0
  40. package/telegram-plugin/gateway/prefix-warmup.ts +1 -1
  41. package/telegram-plugin/gateway/webhook-ingest-server.test.ts +1 -1
  42. package/telegram-plugin/gateway/webhook-ingest-server.ts +1 -1
  43. package/telegram-plugin/hooks/subagent-tracker-pretool.mjs +1 -1
  44. package/telegram-plugin/interrupt-marker.ts +1 -1
  45. package/telegram-plugin/over-ping-safety-net.ts +1 -1
  46. package/telegram-plugin/scoped-approval.ts +1 -1
  47. package/telegram-plugin/secret-detect/vault-error.ts +1 -1
  48. package/telegram-plugin/silence-poke.ts +2 -2
  49. package/telegram-plugin/silent-reply-anchor.ts +1 -1
  50. package/telegram-plugin/slot-banner-driver.ts +1 -1
  51. package/telegram-plugin/startup-reset.ts +1 -1
  52. package/telegram-plugin/tests/boot-probes-connections.test.ts +66 -0
  53. package/telegram-plugin/tests/gateway-startup-reset.test.ts +1 -1
  54. package/telegram-plugin/tests/inbound-delivery-machine.test.ts +1 -1
  55. package/telegram-plugin/tests/permission-card-origin.test.ts +97 -0
  56. package/telegram-plugin/tests/permission-card-routing.test.ts +23 -0
  57. package/telegram-plugin/tests/permission-no-repeat-wiring.test.ts +76 -0
  58. package/telegram-plugin/tests/permission-timeout.test.ts +87 -0
  59. package/telegram-plugin/tests/scoped-approval.test.ts +1 -1
  60. package/telegram-plugin/tests/silence-poke.test.ts +1 -1
  61. package/telegram-plugin/tests/tool-filter.test.ts +87 -0
  62. package/telegram-plugin/tests/turn-flush-safety.test.ts +1 -1
  63. package/telegram-plugin/turn-flush-safety.ts +1 -1
  64. package/telegram-plugin/uat/assertions.ts +1 -1
  65. package/telegram-plugin/uat/scenarios/bg-sub-agent-dispatch-dm.test.ts +1 -1
  66. package/telegram-plugin/uat/scenarios/fuzz-extended-dm.test.ts +1 -1
  67. package/telegram-plugin/uat/scenarios/jtbd-fast-ack-dm.test.ts +1 -1
  68. package/telegram-plugin/uat/scenarios/jtbd-fast-trivial-dm.test.ts +2 -2
  69. package/telegram-plugin/uat/scenarios/jtbd-forwarded-burst-dm.test.ts +1 -1
  70. package/telegram-plugin/uat/scenarios/jtbd-memory-survives-restart-dm.test.ts +1 -1
  71. package/telegram-plugin/uat/scenarios/jtbd-rapid-followup-dm.test.ts +1 -1
  72. package/telegram-plugin/uat/scenarios/jtbd-reflective-status-reaction-dm.test.ts +1 -1
  73. package/telegram-plugin/uat/scenarios/jtbd-wake-audit-content-dm.test.ts +1 -1
@@ -2,7 +2,7 @@
  * over-ping-safety-net.ts — pure decision predicate for #1674's
  * "at-most-one device-ping per turn" framework safety net.
  *
- * Background. `reference/conversational-pacing.md` beat 5 is
+ * Background. `reference/rfcs/conversational-pacing.md` beat 5 is
  * explicit: the model should deliver the answer as a fresh `reply`
  * omitting `disable_notification` (i.e. pinging the device once).
  * EXACTLY ONE ping per turn. The model occasionally violates this
@@ -9,7 +9,7 @@
  * "Allow" means for a narrow safe scope, disclosed honestly on the post-tap
  * card ("won't ask again about <breadth> for 30 min" vs "allowed once").
  *
- * Design contract (reference/access-model.md — "you hold the leash"):
+ * Design contract (reference/rfcs/access-model.md — "you hold the leash"):
  *
  * - **Operator-authored only.** Every cache entry is created by an
  * `allowFrom`-authenticated Telegram tap. No tool call can seed an
@@ -158,7 +158,7 @@ export function renderVaultCliError(
  // Route the operator at the Telegram-native equivalent for the
  // verb in flight — only `init` needs a one-time host shell.
  // Closes the "leave Telegram for a verb that exists in Telegram"
- // anti-pattern from reference/talk-to-agents-from-anywhere.md.
+ // anti-pattern from reference/jobs/talk-to-agents-from-anywhere.md.
  return {
  suppressRaw: true,
  html:
@@ -10,7 +10,7 @@
  * 75s, firm at 180s) and the 60s user-visible awareness ping were
  * retired: their success rate was 0-7% by the design's own KPI, and they
  * duplicated a job the draft thinking-lane now does natively. See
- * `reference/conversational-pacing.md` § Safety net.
+ * `reference/rfcs/conversational-pacing.md` § Safety net.
  *
  * What remains: ONE silence clock and ONE terminal action.
  *
@@ -323,7 +323,7 @@ export function silenceMsForKey(key: string, now: number): number | null {
  * Verbatim framework-fallback text — the user-visible "still working / still
  * thinking" message the gateway sends at the 300s threshold when the model
  * hasn't broken its own silence. Wording is load-bearing (see
- * `reference/conversational-pacing.md` § Safety net). Two principles:
+ * `reference/rfcs/conversational-pacing.md` § Safety net). Two principles:
  *
  * 1. The parenthetical `(no update from agent in N min)` is honest —
  * distinguishes from "the agent said something" so users learn to trust
@@ -3,7 +3,7 @@
  * "consecutive silent replies edit one growing message" UX fix.
  *
  * Background. Modern Claude 2.1.x on this fleet implements
- * conversational pacing (`reference/conversational-pacing.md` beats
+ * conversational pacing (`reference/rfcs/conversational-pacing.md` beats
  * 1 + 3 + 5) by calling the `reply` MCP tool multiple times in a
  * turn — a silent ack, silent per-step updates, and one pinged
  * final answer. The over-ping safety net (#1674) caps the
@@ -17,7 +17,7 @@
  * unpinned message.
  *
  * See #421 (banner pin lifecycle) and JTBD
- * `reference/track-plan-quota-live.md` ("at a glance").
+ * `reference/jobs/track-plan-quota-live.md` ("at a glance").
  */
 
  import type { BannerState } from './slot-banner.js';
@@ -17,7 +17,7 @@
  * idempotent and has no user-visible side effects beyond clearing the
  * (probably-empty) pending-updates queue.
  *
- * Reference: reference/restart-and-know-what-im-running.md — "silent
+ * Reference: reference/jobs/restart-and-know-what-im-running.md — "silent
  * respawn. Agent comes back and the user has to guess whether it's
  * the same agent." A gateway stuck in a 409 loop is exactly that
  * failure mode.
@@ -0,0 +1,66 @@
+ /**
+ * Unit tests for probeConnections — the boot-card surface for
+ * configured-but-unauthed MCP connections (P3). The probe only READS the
+ * host-computed snapshot at <agentDir>/.claude/connection-health.json, so
+ * we drive it with an injected readFileImpl (no fs / no broker).
+ */
+
+ import { describe, it, expect } from 'bun:test'
+ import { probeConnections } from '../gateway/boot-probes.js'
+
+ const ENOENT = () => {
+   const e = new Error('ENOENT') as NodeJS.ErrnoException
+   e.code = 'ENOENT'
+   throw e
+ }
+
+ describe('probeConnections', () => {
+   it('OK (silent) when the snapshot file is absent — assume healthy', async () => {
+     const r = await probeConnections('/agent', { readFileImpl: ENOENT })
+     expect(r.status).toBe('ok')
+   })
+
+   it('OK when the snapshot is malformed JSON', async () => {
+     const r = await probeConnections('/agent', { readFileImpl: () => 'not json{' })
+     expect(r.status).toBe('ok')
+   })
+
+   it('OK when there are zero issues', async () => {
+     const r = await probeConnections('/agent', {
+       readFileImpl: () => JSON.stringify({ computedAt: 1, issues: [] }),
+     })
+     expect(r.status).toBe('ok')
+     expect(r.detail).toContain('all authed')
+   })
+
+   it('DEGRADED (never fail) with named servers + a fix when connections are unauthed', async () => {
+     const snapshot = {
+       computedAt: 1,
+       issues: [
+         { server: 'meta', key: 'meta/token', kind: 'missing', detail: 'x', fix: 'switchroom vault set meta/token --allow marko' },
+         { server: 'postiz', key: 'postiz/key', kind: 'missing', detail: 'y', fix: 'switchroom vault set postiz/key --allow marko' },
+       ],
+     }
+     const r = await probeConnections('/agent', { readFileImpl: () => JSON.stringify(snapshot) })
+     expect(r.status).toBe('degraded')
+     expect(r.detail).toContain('2 integration(s)')
+     expect(r.detail).toContain('meta')
+     expect(r.detail).toContain('postiz')
+     // nextStep carries the first fix + a pointer to doctor for the rest.
+     expect(r.nextStep).toContain('switchroom vault set meta/token')
+     expect(r.nextStep).toContain('+1 more')
+   })
+
+   it('dedupes servers in the detail count', async () => {
+     const snapshot = {
+       computedAt: 1,
+       issues: [
+         { server: 'meta', key: 'meta/a', kind: 'missing', detail: 'x', fix: 'fixa' },
+         { server: 'meta', key: 'meta/b', kind: 'acl', detail: 'y', fix: 'fixb' },
+       ],
+     }
+     const r = await probeConnections('/agent', { readFileImpl: () => JSON.stringify(snapshot) })
+     expect(r.status).toBe('degraded')
+     expect(r.detail).toContain('1 integration(s)')
+   })
+ })
@@ -16,7 +16,7 @@ import { clearStaleTelegramPollingState } from "../startup-reset";
  *
  * These tests pin that behaviour so we don't accidentally remove the
  * call during a future refactor and reintroduce the silent-respawn
- * anti-pattern from reference/restart-and-know-what-im-running.md.
+ * anti-pattern from reference/jobs/restart-and-know-what-im-running.md.
  */
 
  describe("clearStaleTelegramPollingState", () => {
@@ -1,7 +1,7 @@
  /**
  * Property tests for `inbound-delivery-machine.ts`.
  *
- * Per RFC `docs/rfcs/inbound-delivery-state-machine.md`: 5 invariants
+ * Per RFC `reference/rfcs/inbound-delivery-state-machine.md`: 5 invariants
  * validated over arbitrary event schedules. A counterexample is the
  * minimal evidence that the machine has a bug. The wedge-cluster
  * bugs (v0.12.22 boot-wedge, overlapping-turn silence, #1564 sibling
@@ -0,0 +1,97 @@
+ /**
+ * Unit tests for the pure permission-card origin-recovery helper.
+ *
+ * Pins the behaviour that fixes the marko Rentals-budget incident
+ * (2026-06-17): when the gateway's `currentTurn` was force-closed by the
+ * orphaned-reply backstop but the claude session kept running into a
+ * permission-gated tool, the card must recover its origin from the most-recent
+ * still-fresh turn — so it lands in the forum topic the operator is working in
+ * rather than fanning out to operator DMs where it auto-denies on the 10-min
+ * TTL.
+ */
+
+ import { describe, it, expect } from 'vitest'
+ import {
+   pickRecoveredPermissionOrigin,
+   type RecoverableTurn,
+ } from '../gateway/permission-card-origin.js'
+
+ const NOW = 1_000_000_000_000
+ const MAX_AGE = 30 * 60_000 // 30 min, mirrors PERMISSION_CARD_ORIGIN_MAX_AGE_MS
+
+ function turn(
+   chatId: string,
+   threadId: number | undefined,
+   ageMs: number,
+ ): RecoverableTurn {
+   return { sessionChatId: chatId, sessionThreadId: threadId, startedAt: NOW - ageMs }
+ }
+
+ describe('pickRecoveredPermissionOrigin', () => {
+   it('returns null for an empty registry (caller keeps the DM fan-out)', () => {
+     expect(pickRecoveredPermissionOrigin([], NOW, MAX_AGE)).toBeNull()
+   })
+
+   it('recovers the supergroup chat + topic of the most-recent fresh turn', () => {
+     // The marko shape: the force-closed turn was in supergroup topic 3.
+     const recovered = pickRecoveredPermissionOrigin(
+       [turn('-1001234567890', 3, 11 * 60_000)],
+       NOW,
+       MAX_AGE,
+     )
+     expect(recovered).toEqual({ chatId: '-1001234567890', threadId: 3 })
+   })
+
+   it('picks the most-recently-started turn when several are fresh', () => {
+     const recovered = pickRecoveredPermissionOrigin(
+       [
+         turn('-100aaa', 1, 20 * 60_000),
+         turn('-100bbb', 3, 2 * 60_000), // most recent
+         turn('-100ccc', 4, 9 * 60_000),
+       ],
+       NOW,
+       MAX_AGE,
+     )
+     expect(recovered).toEqual({ chatId: '-100bbb', threadId: 3 })
+   })
+
+   it('selects by startedAt, not iteration order (robust to out-of-order inserts)', () => {
+     const recovered = pickRecoveredPermissionOrigin(
+       [
+         turn('-100recent', 1, 1 * 60_000), // freshest, but listed first
+         turn('-100older', 2, 15 * 60_000),
+       ],
+       NOW,
+       MAX_AGE,
+     )
+     expect(recovered).toEqual({ chatId: '-100recent', threadId: 1 })
+   })
+
+   it('ignores turns older than the freshness ceiling', () => {
+     expect(
+       pickRecoveredPermissionOrigin([turn('-100stale', 7, 45 * 60_000)], NOW, MAX_AGE),
+     ).toBeNull()
+   })
+
+   it('recovers a DM-origin turn thread-less (threadId undefined)', () => {
+     const recovered = pickRecoveredPermissionOrigin(
+       [turn('12345', undefined, 3 * 60_000)],
+       NOW,
+       MAX_AGE,
+     )
+     expect(recovered).toEqual({ chatId: '12345', threadId: undefined })
+   })
+
+   it('keeps the freshest in-window turn even when stale turns are present', () => {
+     const recovered = pickRecoveredPermissionOrigin(
+       [
+         turn('-100stale', 1, 90 * 60_000),
+         turn('-100fresh', 3, 5 * 60_000),
+         turn('-100ancient', 2, 600 * 60_000),
+       ],
+       NOW,
+       MAX_AGE,
+     )
+     expect(recovered).toEqual({ chatId: '-100fresh', threadId: 3 })
+   })
+ })
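The tests in this hunk pin a selection rule without showing the helper itself. What follows is a minimal sketch of a function that would satisfy them, assuming a single pass over the turn registry. The shipped `gateway/permission-card-origin.ts` is not part of this excerpt, so the `RecoveredOrigin` name, the `Iterable` parameter type, and the inclusive age boundary are assumptions rather than the package's actual code.

```typescript
// Hypothetical sketch only: shapes are inferred from the test file above.
interface RecoverableTurn {
  sessionChatId: string
  sessionThreadId: number | undefined
  startedAt: number // epoch milliseconds
}

// Assumed name for the return shape; the tests only show { chatId, threadId }.
interface RecoveredOrigin {
  chatId: string
  threadId: number | undefined
}

function pickRecoveredPermissionOrigin(
  turns: Iterable<RecoverableTurn>,
  now: number,
  maxAgeMs: number,
): RecoveredOrigin | null {
  let best: RecoverableTurn | null = null
  for (const t of turns) {
    // Skip turns past the freshness ceiling (boundary treated as fresh: an assumption).
    if (now - t.startedAt > maxAgeMs) continue
    // Compare by startedAt so the result does not depend on iteration order.
    if (best === null || t.startedAt > best.startedAt) best = t
  }
  return best === null
    ? null
    : { chatId: best.sessionChatId, threadId: best.sessionThreadId }
}
```

Accepting any `Iterable` lets a caller pass either an array, as the tests do, or the values of a `Map`-backed registry without copying.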
@@ -74,4 +74,27 @@ describe('permission card routing', () => {
      const body = GATEWAY_SRC.slice(start, start + 1400)
      expect(body).toContain('resolvePermissionCardTargets()')
    })
+
+   // marko Rentals-budget incident (2026-06-17): a turn force-closed by the
+   // orphaned-reply backstop nulled currentTurn, so a permission gate that
+   // fired afterwards fell through to the operator-DM fan-out instead of the
+   // forum topic. The helper must first try to recover the origin from the
+   // recently-started turn registry.
+   it('resolvePermissionCardTargets recovers origin from recent turns when currentTurn is null', () => {
+     const start = GATEWAY_SRC.indexOf('function resolvePermissionCardTargets(')
+     expect(start).toBeGreaterThan(-1)
+     const end = GATEWAY_SRC.indexOf('\n}', start)
+     const body = GATEWAY_SRC.slice(start, end)
+     // Recovery is attempted via the pure helper over the turn registry...
+     expect(body).toContain('pickRecoveredPermissionOrigin')
+     expect(body).toContain('recentTurnsById')
+     // ...before the operator-DM fan-out (recovery branch precedes allowFrom).
+     expect(body.indexOf('pickRecoveredPermissionOrigin')).toBeLessThan(
+       body.indexOf('allowFrom'),
+     )
+   })
+
+   it('the origin-recovery path has a kill switch', () => {
+     expect(GATEWAY_SRC).toContain('SWITCHROOM_PERMISSION_CARD_ORIGIN_RECOVERY')
+   })
  })
@@ -0,0 +1,76 @@
+ /**
+ * Source-text pins for the no-repeat-on-timeout wiring (marko Rentals-budget
+ * loop, 2026-06-17). gateway.ts / bridge.ts have top-level side effects and
+ * aren't unit-importable; the decision logic is unit-tested in
+ * permission-timeout.test.ts. These pins lock the wiring so it can't silently
+ * regress.
+ */
+
+ import { describe, it, expect } from 'vitest'
+ import { readFileSync } from 'node:fs'
+ import { fileURLToPath } from 'node:url'
+ import { dirname, resolve } from 'node:path'
+
+ const __dirname = dirname(fileURLToPath(import.meta.url))
+ const read = (p: string) => readFileSync(resolve(__dirname, '..', p), 'utf8')
+ const GATEWAY = read('gateway/gateway.ts')
+ const BRIDGE = read('bridge/bridge.ts')
+ const IPC_PROTOCOL = read('gateway/ipc-protocol.ts')
+ const IPC_CLIENT = read('bridge/ipc-client.ts')
+
+ function slice(src: string, fnHeader: string, span = 1600): string {
+   const start = src.indexOf(fnHeader)
+   expect(start, `expected to find ${fnHeader}`).toBeGreaterThan(-1)
+   return src.slice(start, start + span)
+ }
+
+ describe('no-repeat-on-timeout wiring', () => {
+   it('PermissionEvent carries an optional message field', () => {
+     const evt = slice(IPC_PROTOCOL, 'export interface PermissionEvent', 2400)
+     expect(evt).toMatch(/message\?:\s*string/)
+   })
+
+   it('the bridge IPC validator accepts an optional non-empty message', () => {
+     expect(IPC_CLIENT).toMatch(/m\.message === undefined/)
+   })
+
+   it('the bridge forwards message on the permission channel notification', () => {
+     const fn = slice(BRIDGE, 'function onPermission(', 1600)
+     expect(fn).toContain('notifications/claude/channel/permission')
+     expect(fn).toMatch(/msg\.message/)
+   })
+
+   it('the TTL auto-deny attaches a timeout message and records the signature', () => {
+     // Within the pending-permission sweep block.
+     const sweep = slice(GATEWAY, 'for (const [k, v] of pendingPermissions)', 2200)
+     expect(sweep).toContain('timeoutDenyMessage(')
+     expect(sweep).toContain('permissionTimeoutSignatures.set(')
+   })
+
+   it('onPermissionRequest short-circuits a recent-timeout duplicate before posting a card', () => {
+     const fn = slice(GATEWAY, 'onPermissionRequest(', 4000)
+     const dupIdx = fn.indexOf('isRecentTimeoutDuplicate(')
+     const cardIdx = fn.indexOf('pendingPermissions.set(requestId')
+     expect(dupIdx).toBeGreaterThan(-1)
+     expect(cardIdx).toBeGreaterThan(-1)
+     // The duplicate check must run BEFORE the card is registered/posted.
+     expect(dupIdx).toBeLessThan(cardIdx)
+     expect(fn).toContain('duplicateDenyMessage')
+   })
+
+   it('suppression is reset on operator activity (inbound + card verdict + slash)', () => {
+     // Three distinct reset points so a returning operator always gets a fresh card.
+     const resets = GATEWAY.match(/clearPermissionTimeoutSuppression\(/g) ?? []
+     // 1 definition call inside the helper + at least 3 reset callsites.
+     expect(resets.length).toBeGreaterThanOrEqual(3)
+     expect(GATEWAY).toContain("clearPermissionTimeoutSuppression('operator inbound')")
+   })
+
+   it('has a kill switch', () => {
+     expect(GATEWAY).toContain('SWITCHROOM_PERMISSION_NO_REPEAT')
+   })
+
+   it('sweeps stale suppression entries past the safety-cap window', () => {
+     expect(GATEWAY).toMatch(/permissionTimeoutSignatures\.delete\(sig\)/)
+   })
+ })
@@ -0,0 +1,87 @@
+ /**
+ * Unit tests for the pure permission-timeout helpers (no-repeat-on-timeout).
+ *
+ * Pins the behaviour that closes the marko Rentals-budget retry loop
+ * (2026-06-17): a TTL auto-deny must be distinguishable from a real denial,
+ * and an identical retry shortly after a timeout (operator still absent) must
+ * be recognisable so the gateway can suppress the duplicate card.
+ */
+
+ import { describe, it, expect } from 'vitest'
+ import {
+   permissionSignature,
+   timeoutDenyMessage,
+   duplicateDenyMessage,
+   isRecentTimeoutDuplicate,
+ } from '../gateway/permission-timeout.js'
+
+ describe('permissionSignature', () => {
+   it('is stable for the same tool + input', () => {
+     expect(permissionSignature('mcp__meta_ads__set_budget', '{"id":"1","budget":1400}'))
+       .toBe(permissionSignature('mcp__meta_ads__set_budget', '{"id":"1","budget":1400}'))
+   })
+
+   it('differs when the tool differs', () => {
+     expect(permissionSignature('toolA', 'x')).not.toBe(permissionSignature('toolB', 'x'))
+   })
+
+   it('differs when the input differs', () => {
+     expect(permissionSignature('t', 'Rentals $14')).not.toBe(permissionSignature('t', 'Land $60'))
+   })
+
+   it('does not collide across the tool/input boundary (NUL-separated)', () => {
+     // A space separator would make ("a b","c") and ("a","b c") collide.
+     expect(permissionSignature('a b', 'c')).not.toBe(permissionSignature('a', 'b c'))
+   })
+ })
+
+ describe('timeoutDenyMessage', () => {
+   it('names the timeout, the minutes, and tells the model not to retry', () => {
+     const msg = timeoutDenyMessage(10)
+     expect(msg).toContain('10 minutes')
+     expect(msg).toMatch(/timeout/i)
+     expect(msg).toMatch(/not a denial/i)
+     expect(msg).toMatch(/do not retry/i)
+   })
+
+   it('is a non-empty string (wire-validator requires non-empty)', () => {
+     expect(timeoutDenyMessage(5).length).toBeGreaterThan(0)
+   })
+ })
+
+ describe('duplicateDenyMessage', () => {
+   it('tells the model to stop re-requesting and is non-empty', () => {
+     expect(duplicateDenyMessage).toMatch(/do not keep re-requesting/i)
+     expect(duplicateDenyMessage.length).toBeGreaterThan(0)
+   })
+ })
+
+ describe('isRecentTimeoutDuplicate', () => {
+   const WINDOW = 60 * 60_000
+   const NOW = 1_000_000_000_000
+
+   it('false when the signature was never recorded', () => {
+     expect(isRecentTimeoutDuplicate(new Map(), 'sig', NOW, WINDOW)).toBe(false)
+   })
+
+   it('true when the signature timed out within the window', () => {
+     const m = new Map([['sig', NOW - 5 * 60_000]])
+     expect(isRecentTimeoutDuplicate(m, 'sig', NOW, WINDOW)).toBe(true)
+   })
+
+   it('false when the timeout is older than the window', () => {
+     const m = new Map([['sig', NOW - 2 * WINDOW]])
+     expect(isRecentTimeoutDuplicate(m, 'sig', NOW, WINDOW)).toBe(false)
+   })
+
+   it('true exactly at the window boundary', () => {
+     const m = new Map([['sig', NOW - WINDOW]])
+     expect(isRecentTimeoutDuplicate(m, 'sig', NOW, WINDOW)).toBe(true)
+   })
+
+   it('only matches the exact signature', () => {
+     const m = new Map([[permissionSignature('t', 'Rentals'), NOW]])
+     expect(isRecentTimeoutDuplicate(m, permissionSignature('t', 'Land'), NOW, WINDOW)).toBe(false)
+     expect(isRecentTimeoutDuplicate(m, permissionSignature('t', 'Rentals'), NOW, WINDOW)).toBe(true)
+   })
+ })
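For readers without `gateway/permission-timeout.ts` at hand, the two pure helpers these tests exercise might look roughly like this. It is a sketch under assumptions: the NUL separator comes from the test title and the inclusive window from the boundary test, but the parameter names, the `ReadonlyMap` type, and the choice to concatenate rather than hash are guesses. The two message helpers are left out because their exact wording is not visible in this diff.

```typescript
// Hypothetical sketch only; the shipped implementation may hash or normalise its inputs.

// Join tool name and serialized input with a NUL character so that
// ("a b", "c") and ("a", "b c") produce different signatures.
function permissionSignature(toolName: string, input: string): string {
  return `${toolName}\u0000${input}`
}

// True when `signature` was auto-denied by the TTL no more than `windowMs`
// before `now`. The boundary is inclusive, matching the
// "true exactly at the window boundary" test.
function isRecentTimeoutDuplicate(
  timedOutAt: ReadonlyMap<string, number>,
  signature: string,
  now: number,
  windowMs: number,
): boolean {
  const at = timedOutAt.get(signature)
  return at !== undefined && now - at <= windowMs
}
```

Keeping both functions free of clocks and global state is what allows the tests to pass `NOW` and a hand-built `Map` directly.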
@@ -3,7 +3,7 @@
  * the middle rung between "Allow once" and "🔁 Always".
  *
  * These pin the access-model invariants the adversarial review flagged as
- * load-bearing (reference/access-model.md "you hold the leash"):
+ * load-bearing (reference/rfcs/access-model.md "you hold the leash"):
  * - no tool call can SEED a grant (first contact never auto-allows);
  * - no tool call can EXTEND the window (fixed box — expiresAt is set once
  * at the operator tap and never moves on a match);
@@ -528,7 +528,7 @@ describe('silence-poke — fallback handler errors do not break timer', () => {
  })
 
  // CC-4 from `docs/status-ask-cause-classes.md`: wording is load-bearing
- // (`reference/conversational-pacing.md` § Safety net). Snapshot the exact
+ // (`reference/rfcs/conversational-pacing.md` § Safety net). Snapshot the exact
  // strings here so a refactor that drops a key phrase fails loud at test
  // time. If you genuinely need to change the wording, update the snapshot
  // AND the design doc together.
@@ -0,0 +1,87 @@
+ /**
+ * Unit tests for the switchroom-telegram tool-surface right-sizing (P4):
+ * connection gating of linear_* (A) + per-tool alwaysLoad pins for the hot
+ * path (B). Pure function — no bridge.ts import (which has side effects).
+ */
+
+ import { describe, it, expect } from 'bun:test'
+ import {
+   buildEffectiveToolSchemas,
+   ALWAYS_LOAD_TOOLS,
+   LINEAR_TOOLS,
+   type NamedTool,
+ } from '../bridge/tool-filter.js'
+
+ // A representative slice mirroring the real TOOL_SCHEMAS names.
+ const SAMPLE: NamedTool[] = [
+   { name: 'reply', description: 'r' },
+   { name: 'stream_reply', description: 's' },
+   { name: 'get_recent_messages', description: 'g' },
+   { name: 'react', description: 'k' },
+   { name: 'edit_message', description: 'e' },
+   { name: 'send_typing', description: 't' },
+   { name: 'download_attachment', description: 'd' },
+   { name: 'ask_user', description: 'a' }, // cold
+   { name: 'send_gif', description: 'gif' }, // cold
+   { name: 'vault_request_access', description: 'v' }, // cold
+   { name: 'linear_agent_activity', description: 'la' },
+   { name: 'linear_create_issue', description: 'lc' },
+   { name: 'linear_agent_setup', description: 'ls' },
+ ]
+
+ const names = (tools: NamedTool[]) => tools.map((t) => t.name)
+ const metaOf = (tools: Array<NamedTool & { _meta?: unknown }>, n: string) =>
+   tools.find((t) => t.name === n)?._meta
+
+ describe('buildEffectiveToolSchemas — connection gating (A)', () => {
+   it('drops all linear_* tools when Linear is NOT enabled', () => {
+     const out = buildEffectiveToolSchemas(SAMPLE, { linearEnabled: false })
+     for (const t of LINEAR_TOOLS) expect(names(out)).not.toContain(t)
+     // non-linear tools all survive
+     expect(names(out)).toContain('reply')
+     expect(names(out)).toContain('ask_user')
+     expect(out.length).toBe(SAMPLE.length - LINEAR_TOOLS.size)
+   })
+
+   it('keeps linear_* tools when Linear IS enabled', () => {
+     const out = buildEffectiveToolSchemas(SAMPLE, { linearEnabled: true })
+     for (const t of LINEAR_TOOLS) expect(names(out)).toContain(t)
+     expect(out.length).toBe(SAMPLE.length)
+   })
+ })
+
+ describe('buildEffectiveToolSchemas — per-tool deferral pins (B)', () => {
+   it('pins exactly the hot tools with _meta anthropic/alwaysLoad', () => {
+     const out = buildEffectiveToolSchemas(SAMPLE, { linearEnabled: true })
+     for (const hot of ALWAYS_LOAD_TOOLS) {
+       expect(metaOf(out, hot)).toEqual({ 'anthropic/alwaysLoad': true })
+     }
+   })
+
+   it('the reply path (reply/stream_reply) is ALWAYS pinned — never defers', () => {
+     const out = buildEffectiveToolSchemas(SAMPLE, { linearEnabled: false })
+     expect(metaOf(out, 'reply')).toEqual({ 'anthropic/alwaysLoad': true })
+     expect(metaOf(out, 'stream_reply')).toEqual({ 'anthropic/alwaysLoad': true })
+   })
+
+   it('cold tools carry NO _meta (so they defer under tool-search)', () => {
+     const out = buildEffectiveToolSchemas(SAMPLE, { linearEnabled: true })
+     for (const cold of ['ask_user', 'send_gif', 'vault_request_access', 'linear_create_issue']) {
+       expect(metaOf(out, cold)).toBeUndefined()
+     }
+   })
+ })
+
+ describe('buildEffectiveToolSchemas — purity', () => {
+   it('does not mutate the input array or its objects', () => {
+     const input: NamedTool[] = [{ name: 'reply' }, { name: 'send_gif' }]
+     const snapshot = JSON.stringify(input)
+     buildEffectiveToolSchemas(input, { linearEnabled: true })
+     expect(JSON.stringify(input)).toBe(snapshot)
+   })
+
+   it('preserves order', () => {
+     const out = buildEffectiveToolSchemas(SAMPLE, { linearEnabled: true })
+     expect(names(out)).toEqual(names(SAMPLE))
+   })
+ })
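The 77-line `bridge/tool-filter.ts` that these tests import is listed in the file summary but not shown here. As a reading aid, a filter-then-map function along the following lines would pass the assertions above. Treat it as a sketch: the membership of both sets is inferred from the `SAMPLE` comments (three `linear_*` names, seven tools not marked cold) and is an assumption, as is the `EffectiveTool` type name.

```typescript
// Hypothetical sketch only; set membership is inferred from the test SAMPLE, not from the package.
interface NamedTool {
  name: string
  description?: string
}

type EffectiveTool = NamedTool & { _meta?: { 'anthropic/alwaysLoad': true } }

const LINEAR_TOOLS: ReadonlySet<string> = new Set([
  'linear_agent_activity',
  'linear_create_issue',
  'linear_agent_setup',
])

const ALWAYS_LOAD_TOOLS: ReadonlySet<string> = new Set([
  'reply',
  'stream_reply',
  'get_recent_messages',
  'react',
  'edit_message',
  'send_typing',
  'download_attachment',
])

function buildEffectiveToolSchemas(
  tools: readonly NamedTool[],
  opts: { linearEnabled: boolean },
): EffectiveTool[] {
  return tools
    // (A) connection gating: hide linear_* tools unless Linear is enabled.
    .filter((t) => opts.linearEnabled || !LINEAR_TOOLS.has(t.name))
    // (B) pin hot tools; copy every object so the input is never mutated.
    .map((t): EffectiveTool =>
      ALWAYS_LOAD_TOOLS.has(t.name)
        ? { ...t, _meta: { 'anthropic/alwaysLoad': true } }
        : { ...t },
    )
}
```

Because `filter` and `map` both return new arrays and each element is spread into a fresh object, the purity and order-preservation tests hold without any extra bookkeeping.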
@@ -237,7 +237,7 @@ describe('decideTurnFlush', () => {
  // Regression guard for the redundant-follow-up-message fix: this reverts
  // the #1291 post-reply-tail flush, which posted a duplicate recap on
  // essentially every turn because the model habitually writes a closing
- // summary after its final reply. See reference/conversational-pacing.md
+ // summary after its final reply. See reference/rfcs/conversational-pacing.md
  // — "the framework owns the beat; the model authors the words".
  describe('reply-called turns never flush trailing terminal text', () => {
  it('skips even when a long substantive tail follows the reply', () => {
@@ -172,7 +172,7 @@ export interface FlushDecisionInput {
  * message second-guesses an explicit reply and posts a redundant duplicate
  * on essentially every turn, because the model habitually writes a closing
  * summary. The framework owns the *beat*; the model authors the *words*
- * and emits them via reply (`reference/conversational-pacing.md`).
+ * and emits them via reply (`reference/rfcs/conversational-pacing.md`).
  *
  * (This reverts the #1291 post-reply-tail flush. Its intent — catch a
  * soft-commit reply followed by the real answer in terminal text only —
@@ -395,7 +395,7 @@ export async function waitForCardPhase(
  * The actual card render uses emoji markers in the header: `✅` for
  * done, `❌` for errors, `⚙️` while working (foreground), `🌀` for
  * Background (parent done but fleet still running, see #862 /
- * reference/conversational-pacing.md),
+ * reference/rfcs/conversational-pacing.md),
  * and `⏳` during the boot-card window. These markers are stable
  * enough to key on for UAT — finer parsing (checklist items,
  * sub-agent row content) is out of scope.
@@ -1,6 +1,6 @@
  /**
  * Background sub-agent visibility scenario — closes #709 / #776 / #782 / #788
- * (the four-issue family analysed in `reference/sub-agent-visibility-rfc.md`).
+ * (the four-issue family analysed in `reference/rfcs/sub-agent-visibility.md`).
  *
  * Verifies three acceptance criteria from the RFC in a single run because
  * they share setup:
@@ -149,7 +149,7 @@ const FUZZ_CASES: readonly FuzzCase[] = [
  // The conservative regex set in `telegram-plugin/inbound-classifier.ts`
  // captures 10 standalone "ping" patterns that count toward the
  // primary lagging KPI `inbound_status_query`. Each fire is a JTBD
- // failure (`reference/know-what-my-agent-is-doing.md`), so we
+ // failure (`reference/jobs/know-what-my-agent-is-doing.md`), so we
  // want every variant to (a) reach the agent unchanged, (b)
  // produce a sensible reply (no crash, no loop, no ghosting).
  // Tracks cause class CC-7 from
@@ -1,7 +1,7 @@
  /**
  * JTBD scenario — guaranteed fast acknowledgement (human-feel UX epic).
  *
- * Serves: `reference/conversational-pacing.md` and the JTBD
+ * Serves: `reference/rfcs/conversational-pacing.md` and the JTBD
  * "talking to my agent feels like talking to a capable person".
  *
  * A person you message answers in a beat — "got it", "on it, checking
@@ -1,7 +1,7 @@
  /**
  * JTBD scenario — short happy path: trivial questions reply FAST.
  *
- * Serves: `reference/know-what-my-agent-is-doing.md` — the short-path
+ * Serves: `reference/jobs/know-what-my-agent-is-doing.md` — the short-path
  * contract: a question with no real work should produce a plain reply
  * with no ceremony (no soft-commit, no progress chunks) within a tight
  * budget. Users judge agent speed on THIS path more than any other.
@@ -12,7 +12,7 @@
  *
  * ## Targets
  *
- * From `reference/conversational-pacing.md` and the post-v0.12.22
+ * From `reference/rfcs/conversational-pacing.md` and the post-v0.12.22
  * baseline measurements:
  *
  * - **TTFO p95 (vision target):** < 30s — the published contract.
@@ -1,7 +1,7 @@
  /**
  * JTBD scenario — forwarded burst / split paste coalesces into ONE turn.
  *
- * Serves: `reference/steer-or-queue-mid-flight.md` — the "Forwarded
+ * Serves: `reference/jobs/steer-or-queue-mid-flight.md` — the "Forwarded
  * burst / split paste" UAT prompt. When several messages land in quick
  * succession from the same sender (a forward of 3-4 messages, or a long
  * paste Telegram split into chunks), inbound coalescing must merge them