mstro-app 0.5.1 → 0.5.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (283)
  1. package/PRIVACY.md +9 -9
  2. package/README.md +71 -28
  3. package/bin/commands/config.js +1 -1
  4. package/bin/mstro.js +55 -4
  5. package/dist/server/cli/eta-estimator.d.ts +55 -0
  6. package/dist/server/cli/eta-estimator.d.ts.map +1 -0
  7. package/dist/server/cli/eta-estimator.js +222 -0
  8. package/dist/server/cli/eta-estimator.js.map +1 -0
  9. package/dist/server/cli/headless/claude-invoker-process.d.ts.map +1 -1
  10. package/dist/server/cli/headless/claude-invoker-process.js +9 -1
  11. package/dist/server/cli/headless/claude-invoker-process.js.map +1 -1
  12. package/dist/server/cli/headless/mcp-config.d.ts +22 -5
  13. package/dist/server/cli/headless/mcp-config.d.ts.map +1 -1
  14. package/dist/server/cli/headless/mcp-config.js +7 -5
  15. package/dist/server/cli/headless/mcp-config.js.map +1 -1
  16. package/dist/server/cli/headless/runner.d.ts.map +1 -1
  17. package/dist/server/cli/headless/runner.js +19 -0
  18. package/dist/server/cli/headless/runner.js.map +1 -1
  19. package/dist/server/cli/headless/stall-assessor.d.ts +50 -0
  20. package/dist/server/cli/headless/stall-assessor.d.ts.map +1 -1
  21. package/dist/server/cli/headless/stall-assessor.js +64 -9
  22. package/dist/server/cli/headless/stall-assessor.js.map +1 -1
  23. package/dist/server/cli/headless/tool-watchdog.d.ts +21 -0
  24. package/dist/server/cli/headless/tool-watchdog.d.ts.map +1 -1
  25. package/dist/server/cli/headless/tool-watchdog.js +19 -12
  26. package/dist/server/cli/headless/tool-watchdog.js.map +1 -1
  27. package/dist/server/cli/headless/types.d.ts +16 -1
  28. package/dist/server/cli/headless/types.d.ts.map +1 -1
  29. package/dist/server/cli/improvisation-history-store.d.ts.map +1 -1
  30. package/dist/server/cli/improvisation-history-store.js +5 -1
  31. package/dist/server/cli/improvisation-history-store.js.map +1 -1
  32. package/dist/server/cli/improvisation-output-queue.d.ts +5 -1
  33. package/dist/server/cli/improvisation-output-queue.d.ts.map +1 -1
  34. package/dist/server/cli/improvisation-output-queue.js +30 -7
  35. package/dist/server/cli/improvisation-output-queue.js.map +1 -1
  36. package/dist/server/cli/improvisation-session-manager.d.ts +35 -0
  37. package/dist/server/cli/improvisation-session-manager.d.ts.map +1 -1
  38. package/dist/server/cli/improvisation-session-manager.js +58 -1
  39. package/dist/server/cli/improvisation-session-manager.js.map +1 -1
  40. package/dist/server/cli/improvisation-types.d.ts +9 -0
  41. package/dist/server/cli/improvisation-types.d.ts.map +1 -1
  42. package/dist/server/cli/improvisation-types.js.map +1 -1
  43. package/dist/server/cli/retry/retry-runner-factory.d.ts.map +1 -1
  44. package/dist/server/cli/retry/retry-runner-factory.js +1 -0
  45. package/dist/server/cli/retry/retry-runner-factory.js.map +1 -1
  46. package/dist/server/engines/EngineEvent.d.ts +126 -0
  47. package/dist/server/engines/EngineEvent.d.ts.map +1 -0
  48. package/dist/server/engines/EngineEvent.js +11 -0
  49. package/dist/server/engines/EngineEvent.js.map +1 -0
  50. package/dist/server/engines/claude/ClaudeCodeEngine.d.ts +47 -0
  51. package/dist/server/engines/claude/ClaudeCodeEngine.d.ts.map +1 -0
  52. package/dist/server/engines/claude/ClaudeCodeEngine.js +338 -0
  53. package/dist/server/engines/claude/ClaudeCodeEngine.js.map +1 -0
  54. package/dist/server/engines/factory.d.ts +21 -0
  55. package/dist/server/engines/factory.d.ts.map +1 -0
  56. package/dist/server/engines/factory.js +152 -0
  57. package/dist/server/engines/factory.js.map +1 -0
  58. package/dist/server/engines/opencode/OpenCodeEngine.d.ts +148 -0
  59. package/dist/server/engines/opencode/OpenCodeEngine.d.ts.map +1 -0
  60. package/dist/server/engines/opencode/OpenCodeEngine.js +630 -0
  61. package/dist/server/engines/opencode/OpenCodeEngine.js.map +1 -0
  62. package/dist/server/engines/opencode/OpenCodeServerManager.d.ts +172 -0
  63. package/dist/server/engines/opencode/OpenCodeServerManager.d.ts.map +1 -0
  64. package/dist/server/engines/opencode/OpenCodeServerManager.js +390 -0
  65. package/dist/server/engines/opencode/OpenCodeServerManager.js.map +1 -0
  66. package/dist/server/engines/opencode/model-catalog.d.ts +94 -0
  67. package/dist/server/engines/opencode/model-catalog.d.ts.map +1 -0
  68. package/dist/server/engines/opencode/model-catalog.js +141 -0
  69. package/dist/server/engines/opencode/model-catalog.js.map +1 -0
  70. package/dist/server/engines/types.d.ts +146 -0
  71. package/dist/server/engines/types.d.ts.map +1 -0
  72. package/dist/server/engines/types.js +4 -0
  73. package/dist/server/engines/types.js.map +1 -0
  74. package/dist/server/index.js +9 -2
  75. package/dist/server/index.js.map +1 -1
  76. package/dist/server/mcp/bouncer-haiku.d.ts +17 -4
  77. package/dist/server/mcp/bouncer-haiku.d.ts.map +1 -1
  78. package/dist/server/mcp/bouncer-haiku.js +8 -124
  79. package/dist/server/mcp/bouncer-haiku.js.map +1 -1
  80. package/dist/server/mcp/bouncer-integration.d.ts +45 -0
  81. package/dist/server/mcp/bouncer-integration.d.ts.map +1 -1
  82. package/dist/server/mcp/bouncer-integration.js +69 -5
  83. package/dist/server/mcp/bouncer-integration.js.map +1 -1
  84. package/dist/server/mcp/classifier/BouncerClassifier.d.ts +34 -0
  85. package/dist/server/mcp/classifier/BouncerClassifier.d.ts.map +1 -0
  86. package/dist/server/mcp/classifier/BouncerClassifier.js +4 -0
  87. package/dist/server/mcp/classifier/BouncerClassifier.js.map +1 -0
  88. package/dist/server/mcp/classifier/ClaudeBouncerClassifier.d.ts +17 -0
  89. package/dist/server/mcp/classifier/ClaudeBouncerClassifier.d.ts.map +1 -0
  90. package/dist/server/mcp/classifier/ClaudeBouncerClassifier.js +142 -0
  91. package/dist/server/mcp/classifier/ClaudeBouncerClassifier.js.map +1 -0
  92. package/dist/server/mcp/classifier/OpenCodeBouncerClassifier.d.ts +68 -0
  93. package/dist/server/mcp/classifier/OpenCodeBouncerClassifier.d.ts.map +1 -0
  94. package/dist/server/mcp/classifier/OpenCodeBouncerClassifier.js +182 -0
  95. package/dist/server/mcp/classifier/OpenCodeBouncerClassifier.js.map +1 -0
  96. package/dist/server/mcp/classifier/factory.d.ts +70 -0
  97. package/dist/server/mcp/classifier/factory.d.ts.map +1 -0
  98. package/dist/server/mcp/classifier/factory.js +155 -0
  99. package/dist/server/mcp/classifier/factory.js.map +1 -0
  100. package/dist/server/mcp/server.js +52 -0
  101. package/dist/server/mcp/server.js.map +1 -1
  102. package/dist/server/routes/index.d.ts +1 -0
  103. package/dist/server/routes/index.d.ts.map +1 -1
  104. package/dist/server/routes/index.js +1 -0
  105. package/dist/server/routes/index.js.map +1 -1
  106. package/dist/server/routes/internal.d.ts +16 -0
  107. package/dist/server/routes/internal.d.ts.map +1 -0
  108. package/dist/server/routes/internal.js +94 -0
  109. package/dist/server/routes/internal.js.map +1 -0
  110. package/dist/server/services/plan/agent-resolver.d.ts +26 -0
  111. package/dist/server/services/plan/agent-resolver.d.ts.map +1 -0
  112. package/dist/server/services/plan/agent-resolver.js +102 -0
  113. package/dist/server/services/plan/agent-resolver.js.map +1 -0
  114. package/dist/server/services/plan/composer.d.ts.map +1 -1
  115. package/dist/server/services/plan/composer.js +59 -11
  116. package/dist/server/services/plan/composer.js.map +1 -1
  117. package/dist/server/services/plan/executor.d.ts.map +1 -1
  118. package/dist/server/services/plan/executor.js +3 -1
  119. package/dist/server/services/plan/executor.js.map +1 -1
  120. package/dist/server/services/plan/issue-prompt-builder.d.ts.map +1 -1
  121. package/dist/server/services/plan/issue-prompt-builder.js +33 -1
  122. package/dist/server/services/plan/issue-prompt-builder.js.map +1 -1
  123. package/dist/server/services/plan/parser-core.d.ts.map +1 -1
  124. package/dist/server/services/plan/parser-core.js +1 -0
  125. package/dist/server/services/plan/parser-core.js.map +1 -1
  126. package/dist/server/services/plan/types.d.ts +1 -0
  127. package/dist/server/services/plan/types.d.ts.map +1 -1
  128. package/dist/server/services/runtime-info.d.ts +3 -0
  129. package/dist/server/services/runtime-info.d.ts.map +1 -0
  130. package/dist/server/services/runtime-info.js +21 -0
  131. package/dist/server/services/runtime-info.js.map +1 -0
  132. package/dist/server/services/settings.d.ts +76 -2
  133. package/dist/server/services/settings.d.ts.map +1 -1
  134. package/dist/server/services/settings.js +127 -4
  135. package/dist/server/services/settings.js.map +1 -1
  136. package/dist/server/services/websocket/ask-user-question-bridge.d.ts +32 -0
  137. package/dist/server/services/websocket/ask-user-question-bridge.d.ts.map +1 -0
  138. package/dist/server/services/websocket/ask-user-question-bridge.js +115 -0
  139. package/dist/server/services/websocket/ask-user-question-bridge.js.map +1 -0
  140. package/dist/server/services/websocket/git-branch-handlers.d.ts.map +1 -1
  141. package/dist/server/services/websocket/git-branch-handlers.js +19 -6
  142. package/dist/server/services/websocket/git-branch-handlers.js.map +1 -1
  143. package/dist/server/services/websocket/handler.d.ts +25 -1
  144. package/dist/server/services/websocket/handler.d.ts.map +1 -1
  145. package/dist/server/services/websocket/handler.js +84 -2
  146. package/dist/server/services/websocket/handler.js.map +1 -1
  147. package/dist/server/services/websocket/quality-complexity.d.ts.map +1 -1
  148. package/dist/server/services/websocket/quality-complexity.js +78 -26
  149. package/dist/server/services/websocket/quality-complexity.js.map +1 -1
  150. package/dist/server/services/websocket/quality-eta.d.ts +47 -0
  151. package/dist/server/services/websocket/quality-eta.d.ts.map +1 -0
  152. package/dist/server/services/websocket/quality-eta.js +110 -0
  153. package/dist/server/services/websocket/quality-eta.js.map +1 -0
  154. package/dist/server/services/websocket/quality-grading.d.ts +27 -4
  155. package/dist/server/services/websocket/quality-grading.d.ts.map +1 -1
  156. package/dist/server/services/websocket/quality-grading.js +369 -201
  157. package/dist/server/services/websocket/quality-grading.js.map +1 -1
  158. package/dist/server/services/websocket/quality-handlers.d.ts.map +1 -1
  159. package/dist/server/services/websocket/quality-handlers.js +145 -7
  160. package/dist/server/services/websocket/quality-handlers.js.map +1 -1
  161. package/dist/server/services/websocket/quality-operations.d.ts +34 -0
  162. package/dist/server/services/websocket/quality-operations.d.ts.map +1 -0
  163. package/dist/server/services/websocket/quality-operations.js +47 -0
  164. package/dist/server/services/websocket/quality-operations.js.map +1 -0
  165. package/dist/server/services/websocket/quality-persistence.d.ts +9 -0
  166. package/dist/server/services/websocket/quality-persistence.d.ts.map +1 -1
  167. package/dist/server/services/websocket/quality-persistence.js +10 -0
  168. package/dist/server/services/websocket/quality-persistence.js.map +1 -1
  169. package/dist/server/services/websocket/quality-review-agent.d.ts +1 -1
  170. package/dist/server/services/websocket/quality-review-agent.d.ts.map +1 -1
  171. package/dist/server/services/websocket/quality-review-agent.js +105 -56
  172. package/dist/server/services/websocket/quality-review-agent.js.map +1 -1
  173. package/dist/server/services/websocket/quality-service.d.ts +9 -1
  174. package/dist/server/services/websocket/quality-service.d.ts.map +1 -1
  175. package/dist/server/services/websocket/quality-service.js +334 -14
  176. package/dist/server/services/websocket/quality-service.js.map +1 -1
  177. package/dist/server/services/websocket/quality-tools.d.ts +21 -0
  178. package/dist/server/services/websocket/quality-tools.d.ts.map +1 -1
  179. package/dist/server/services/websocket/quality-tools.js +49 -0
  180. package/dist/server/services/websocket/quality-tools.js.map +1 -1
  181. package/dist/server/services/websocket/quality-types.d.ts +35 -2
  182. package/dist/server/services/websocket/quality-types.d.ts.map +1 -1
  183. package/dist/server/services/websocket/quality-types.js +1 -1
  184. package/dist/server/services/websocket/quality-types.js.map +1 -1
  185. package/dist/server/services/websocket/session-handlers.d.ts +3 -1
  186. package/dist/server/services/websocket/session-handlers.d.ts.map +1 -1
  187. package/dist/server/services/websocket/session-handlers.js +60 -9
  188. package/dist/server/services/websocket/session-handlers.js.map +1 -1
  189. package/dist/server/services/websocket/session-history.js +3 -0
  190. package/dist/server/services/websocket/session-history.js.map +1 -1
  191. package/dist/server/services/websocket/session-initialization.d.ts.map +1 -1
  192. package/dist/server/services/websocket/session-initialization.js +158 -42
  193. package/dist/server/services/websocket/session-initialization.js.map +1 -1
  194. package/dist/server/services/websocket/session-registry.d.ts +25 -0
  195. package/dist/server/services/websocket/session-registry.d.ts.map +1 -1
  196. package/dist/server/services/websocket/session-registry.js +19 -0
  197. package/dist/server/services/websocket/session-registry.js.map +1 -1
  198. package/dist/server/services/websocket/settings-handlers.d.ts +1 -1
  199. package/dist/server/services/websocket/settings-handlers.d.ts.map +1 -1
  200. package/dist/server/services/websocket/settings-handlers.js +35 -4
  201. package/dist/server/services/websocket/settings-handlers.js.map +1 -1
  202. package/dist/server/services/websocket/tab-broadcast.d.ts +7 -2
  203. package/dist/server/services/websocket/tab-broadcast.d.ts.map +1 -1
  204. package/dist/server/services/websocket/tab-broadcast.js +10 -2
  205. package/dist/server/services/websocket/tab-broadcast.js.map +1 -1
  206. package/dist/server/services/websocket/tab-event-buffer.d.ts +97 -8
  207. package/dist/server/services/websocket/tab-event-buffer.d.ts.map +1 -1
  208. package/dist/server/services/websocket/tab-event-buffer.js +138 -12
  209. package/dist/server/services/websocket/tab-event-buffer.js.map +1 -1
  210. package/dist/server/services/websocket/tab-event-replay.d.ts +29 -13
  211. package/dist/server/services/websocket/tab-event-replay.d.ts.map +1 -1
  212. package/dist/server/services/websocket/tab-event-replay.js +55 -2
  213. package/dist/server/services/websocket/tab-event-replay.js.map +1 -1
  214. package/dist/server/services/websocket/tab-handlers.d.ts +9 -1
  215. package/dist/server/services/websocket/tab-handlers.d.ts.map +1 -1
  216. package/dist/server/services/websocket/tab-handlers.js +47 -2
  217. package/dist/server/services/websocket/tab-handlers.js.map +1 -1
  218. package/dist/server/services/websocket/types.d.ts +67 -7
  219. package/dist/server/services/websocket/types.d.ts.map +1 -1
  220. package/dist/server/services/websocket/types.js +12 -6
  221. package/dist/server/services/websocket/types.js.map +1 -1
  222. package/package.json +5 -3
  223. package/server/cli/eta-estimator.ts +249 -0
  224. package/server/cli/headless/claude-invoker-process.ts +9 -1
  225. package/server/cli/headless/mcp-config.ts +30 -5
  226. package/server/cli/headless/runner.ts +21 -0
  227. package/server/cli/headless/stall-assessor.ts +93 -0
  228. package/server/cli/headless/tool-watchdog.ts +21 -0
  229. package/server/cli/headless/types.ts +16 -1
  230. package/server/cli/improvisation-history-store.ts +4 -1
  231. package/server/cli/improvisation-output-queue.ts +29 -7
  232. package/server/cli/improvisation-session-manager.ts +63 -1
  233. package/server/cli/improvisation-types.ts +9 -0
  234. package/server/cli/retry/retry-runner-factory.ts +1 -0
  235. package/server/engines/EngineEvent.ts +156 -0
  236. package/server/engines/claude/ClaudeCodeEngine.ts +404 -0
  237. package/server/engines/factory.ts +176 -0
  238. package/server/engines/opencode/OpenCodeEngine.ts +786 -0
  239. package/server/engines/opencode/OpenCodeServerManager.ts +577 -0
  240. package/server/engines/opencode/model-catalog.ts +217 -0
  241. package/server/engines/types.ts +173 -0
  242. package/server/index.ts +9 -1
  243. package/server/mcp/bouncer-haiku.ts +21 -145
  244. package/server/mcp/bouncer-integration.ts +107 -5
  245. package/server/mcp/classifier/BouncerClassifier.ts +40 -0
  246. package/server/mcp/classifier/ClaudeBouncerClassifier.ts +189 -0
  247. package/server/mcp/classifier/OpenCodeBouncerClassifier.ts +305 -0
  248. package/server/mcp/classifier/factory.ts +195 -0
  249. package/server/mcp/server.ts +57 -0
  250. package/server/routes/index.ts +1 -0
  251. package/server/routes/internal.ts +112 -0
  252. package/server/services/plan/agent-resolver.ts +115 -0
  253. package/server/services/plan/agents/code-review.md +38 -8
  254. package/server/services/plan/composer.ts +63 -11
  255. package/server/services/plan/executor.ts +3 -1
  256. package/server/services/plan/issue-prompt-builder.ts +39 -1
  257. package/server/services/plan/parser-core.ts +1 -0
  258. package/server/services/plan/types.ts +4 -0
  259. package/server/services/runtime-info.ts +24 -0
  260. package/server/services/settings.ts +161 -4
  261. package/server/services/websocket/ask-user-question-bridge.ts +148 -0
  262. package/server/services/websocket/git-branch-handlers.ts +20 -6
  263. package/server/services/websocket/handler.ts +89 -2
  264. package/server/services/websocket/quality-complexity.ts +80 -26
  265. package/server/services/websocket/quality-eta.ts +155 -0
  266. package/server/services/websocket/quality-grading.ts +445 -222
  267. package/server/services/websocket/quality-handlers.ts +153 -7
  268. package/server/services/websocket/quality-operations.ts +72 -0
  269. package/server/services/websocket/quality-persistence.ts +17 -0
  270. package/server/services/websocket/quality-review-agent.ts +154 -64
  271. package/server/services/websocket/quality-service.ts +361 -13
  272. package/server/services/websocket/quality-tools.ts +51 -0
  273. package/server/services/websocket/quality-types.ts +41 -2
  274. package/server/services/websocket/session-handlers.ts +67 -10
  275. package/server/services/websocket/session-history.ts +3 -0
  276. package/server/services/websocket/session-initialization.ts +189 -46
  277. package/server/services/websocket/session-registry.ts +37 -0
  278. package/server/services/websocket/settings-handlers.ts +41 -4
  279. package/server/services/websocket/tab-broadcast.ts +10 -2
  280. package/server/services/websocket/tab-event-buffer.ts +143 -11
  281. package/server/services/websocket/tab-event-replay.ts +70 -3
  282. package/server/services/websocket/tab-handlers.ts +53 -5
  283. package/server/services/websocket/types.ts +85 -7
@@ -0,0 +1,195 @@
+ // Copyright (c) 2025-present Mstro, Inc. All rights reserved.
+ // Licensed under the MIT License. See LICENSE file for details.
+
+ /**
+  * Bouncer classifier factory.
+  *
+  * Two entry points:
+  *
+  * - `getClassifier()` — production path. Reads
+  *   `settings.bouncerClassifier: { engine, model }` and returns the
+  *   matching `BouncerClassifier` instance. If the persisted config is
+  *   missing, malformed, or names a non-eligible model, it logs a clear
+  *   warning and falls back to `ClaudeBouncerClassifier` + Haiku — the
+  *   Bouncer must always have a classifier to call, so "no config" and
+  *   "bad config" both collapse to the known-safe default rather than
+  *   throwing.
+  *
+  * - `createBouncerClassifier(options?)` — direct-construction helper used
+  *   by the engineSwap feature-flag gate (see `engine-swap-flag.test.ts`).
+  *   Accepts an explicit `engineId` and is deliberately feature-flag-aware:
+  *   when `engineSwap` is disabled, the flag short-circuits to Claude.
+  *
+  * New callers should prefer `getClassifier()` so the user-selected model
+  * takes effect without plumbing. The bouncer-integration layer constructs
+  * its default classifier lazily so env var changes and settings edits
+  * propagate on the next classification call.
+  */
+
+ import { OpenCodeServerManager } from '../../engines/opencode/OpenCodeServerManager.js';
+ import type { EngineId } from '../../engines/types.js';
+ import {
+   BOUNCER_ELIGIBLE_MODELS,
+   type BouncerClassifierConfig,
+   DEFAULT_BOUNCER_CLASSIFIER,
+   getBouncerClassifier,
+   isEngineSwapEnabled,
+ } from '../../services/settings.js';
+ import type { BouncerClassifier } from './BouncerClassifier.js';
+ import { ClaudeBouncerClassifier } from './ClaudeBouncerClassifier.js';
+ import { OpenCodeBouncerClassifier } from './OpenCodeBouncerClassifier.js';
+
+ /** Options accepted by every classifier implementation. */
+ export interface ClassifierFactoryOptions {
+   /**
+    * Which engine backs the classifier. With `engineSwap` off this is
+    * ignored and `'claude-code'` is used; with the flag on, non-Claude
+    * engines throw until their implementations land (Epic 4).
+    */
+   engineId?: EngineId;
+ }
+
+ /**
+  * Construct the Layer-2 Bouncer classifier by engine id (no settings
+  * lookup). Exists for the `engineSwap` feature-flag gate, which asserts
+  * that the factory is flag-aware in both on/off states. New production
+  * callers should route through {@link getClassifier} instead.
+  */
+ export function createBouncerClassifier(
+   options: ClassifierFactoryOptions = {},
+ ): BouncerClassifier {
+   if (!isEngineSwapEnabled()) {
+     return new ClaudeBouncerClassifier();
+   }
+   const engineId = options.engineId ?? 'claude-code';
+   switch (engineId) {
+     case 'claude-code':
+       return new ClaudeBouncerClassifier();
+     case 'opencode':
+       // Wired through `getClassifier()` (settings path). Direct engine-id
+       // construction stays intentionally narrow — callers that want the
+       // OpenCode classifier should pick it via the Settings UI so the
+       // shared `OpenCodeServerManager` is available.
+       throw new Error(
+         'OpenCode bouncer classifier is not implemented yet (Epic 4). ' +
+           'Keep engineSwap off until the OpenCode classifier ships.',
+       );
+     default: {
+       const exhaustive: never = engineId;
+       throw new Error(`Unknown classifier engine id: ${String(exhaustive)}`);
+     }
+   }
+ }
+
+ /**
+  * Process-lifetime singleton for the `opencode serve` subprocess used by
+  * the classifier. Deliberately separate from the engines-side manager so
+  * tests can inject a mock client without touching the engine factory.
+  * Lazy: never created until an OpenCode classifier is first requested.
+  */
+ let sharedOpenCodeManager: OpenCodeServerManager | null = null;
+ let openCodeManagerFactory: () => OpenCodeServerManager = () =>
+   new OpenCodeServerManager({ registerProcessHandlers: true });
+
+ function getSharedOpenCodeServerManager(): OpenCodeServerManager {
+   if (!sharedOpenCodeManager) {
+     sharedOpenCodeManager = openCodeManagerFactory();
+   }
+   return sharedOpenCodeManager;
+ }
+
+ /**
+  * Override the OpenCode manager used by the classifier factory. Test-only;
+  * production code never calls this. Pass `null` to reset to the default.
+  */
+ export function __setOpenCodeManagerFactoryForTests(
+   factory: (() => OpenCodeServerManager) | null,
+ ): void {
+   sharedOpenCodeManager = null;
+   openCodeManagerFactory = factory
+     ?? (() => new OpenCodeServerManager({ registerProcessHandlers: true }));
+ }
+
+ /**
+  * Log a fallback reason in a single place so grep + log analysis surface
+  * every path where we silently dropped back to Claude+Haiku. Goes to
+  * stderr (matching the rest of the Bouncer logs) so it shows up in the
+  * CLI's `--trace` output and in audit transcripts.
+  */
+ function logFallback(reason: string): void {
+   console.warn(
+     `[Bouncer] Classifier config invalid, falling back to Claude+Haiku: ${reason}`,
+   );
+ }
+
+ /**
+  * Construct a `BouncerClassifier` for the provided config. Throws on bad
+  * config — callers that need fallback semantics should use
+  * {@link getClassifier} instead.
+  */
+ export function createClassifierForConfig(
+   config: BouncerClassifierConfig,
+ ): BouncerClassifier {
+   const eligible = BOUNCER_ELIGIBLE_MODELS[config.engine];
+   if (!eligible || !eligible.includes(config.model)) {
+     throw new Error(
+       `Model '${config.model}' is not bouncer-eligible for engine '${config.engine}'`,
+     );
+   }
+   switch (config.engine) {
+     case 'claude-code':
+       // The Claude classifier currently hardcodes `--model haiku` in the
+       // subprocess call. Passing `sonnet` still returns Haiku until a
+       // later issue threads the model through — the eligibility check
+       // guards correctness; the subprocess args are a follow-up.
+       return new ClaudeBouncerClassifier();
+     case 'opencode':
+       return new OpenCodeBouncerClassifier({
+         manager: getSharedOpenCodeServerManager(),
+         model: config.model,
+       });
+     default: {
+       const exhaustive: never = config.engine;
+       throw new Error(`Unknown classifier engine id: ${String(exhaustive)}`);
+     }
+   }
+ }
+
+ /**
+  * Production classifier accessor. Reads the user's current Bouncer
+  * classifier choice from persistent settings and returns a fresh
+  * `BouncerClassifier` instance. Invalid or missing config logs a clear
+  * warning and falls back to the default Claude+Haiku classifier — the
+  * Bouncer is a required security layer, so "no classifier available" is
+  * never an acceptable outcome.
+  *
+  * Called on every `reviewOperation()` path (indirectly via the
+  * integration layer's lazy default); cheap because classifier
+  * construction is synchronous and does not spawn subprocesses until the
+  * first `classify()` call.
+  */
+ export function getClassifier(): BouncerClassifier {
+   let config: BouncerClassifierConfig;
+   try {
+     config = getBouncerClassifier();
+   } catch (err) {
+     logFallback(err instanceof Error ? err.message : String(err));
+     return new ClaudeBouncerClassifier();
+   }
+
+   try {
+     return createClassifierForConfig(config);
+   } catch (err) {
+     logFallback(err instanceof Error ? err.message : String(err));
+     // Last-resort fallback — if even the default config can't build the
+     // classifier (e.g. OpenCode catalogue edit broke the model list), we
+     // still return Claude+Haiku so the Bouncer keeps functioning.
+     if (
+       config.engine === DEFAULT_BOUNCER_CLASSIFIER.engine &&
+       config.model === DEFAULT_BOUNCER_CLASSIFIER.model
+     ) {
+       return new ClaudeBouncerClassifier();
+     }
+     return new ClaudeBouncerClassifier();
+   }
+ }
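The hunk above pairs a strict builder (`createClassifierForConfig`, which throws on bad config) with a lenient accessor (`getClassifier`, which logs and falls back). The following is a minimal, self-contained sketch of that split, not the package's code. Every name in it (`Engine`, `ELIGIBLE`, `buildStrict`, `resolveWithFallback`) and the model strings are hypothetical stand-ins, and unlike the real Claude path it encodes the model in the returned object purely so the outcome is visible.

```typescript
// Hypothetical stand-ins for the package's settings and classifier types.
type Engine = 'claude-code' | 'opencode';

interface ClassifierConfig {
  engine: Engine;
  model: string;
}

interface Classifier {
  readonly name: string;
}

// Assumed eligibility table; the real list lives in the package's settings module.
const ELIGIBLE: Record<Engine, readonly string[]> = {
  'claude-code': ['haiku', 'sonnet'],
  opencode: ['example/small-model'],
};

// The known-safe default that every failure path collapses to.
const DEFAULT_CLASSIFIER: Classifier = { name: 'claude-code:haiku' };

// Strict path: throws when the model is not eligible for the engine.
function buildStrict(config: ClassifierConfig): Classifier {
  const eligible = ELIGIBLE[config.engine];
  if (!eligible || !eligible.includes(config.model)) {
    throw new Error(
      `Model '${config.model}' is not eligible for engine '${config.engine}'`,
    );
  }
  return { name: `${config.engine}:${config.model}` };
}

// Lenient path: a failure while loading or building the config is logged
// once and replaced by the default, so the caller always gets a classifier.
function resolveWithFallback(
  loadConfig: () => ClassifierConfig,
  warn: (message: string) => void = console.warn,
): Classifier {
  try {
    return buildStrict(loadConfig());
  } catch (err) {
    warn(
      `falling back to default: ${err instanceof Error ? err.message : String(err)}`,
    );
    return DEFAULT_CLASSIFIER;
  }
}
```

Keeping the throwing builder separate lets tests and validation code observe the precise error, while the security-critical call site only ever sees a usable classifier.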
@@ -62,6 +62,51 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
    };
  });

+ /**
+  * Bridge AskUserQuestion to the running CLI server. Claude pauses on this
+  * tool until we return; the CLI server pushes the questions to the web UI
+  * via WebSocket, awaits the user's answers, and returns them here.
+  *
+  * On any failure (server unreachable, timeout, no tab routing context) we
+  * return `behavior: allow` with the input unchanged. Claude treats it as
+  * "no answers" and proceeds with its own guesses — same fallback as before
+  * we had this integration. Better than blocking the run.
+  */
+ async function bridgeAskUserQuestion(
+   input: Record<string, unknown>,
+ ): Promise<{ behavior: 'allow'; updatedInput: Record<string, unknown> }> {
+   const port = process.env.MSTRO_PORT;
+   const tabId = process.env.MSTRO_TAB_ID;
+   const secret = process.env.MSTRO_BOUNCER_SECRET;
+   const toolUseId = process.env.MSTRO_CURRENT_TOOL_USE_ID || `aq-${Date.now()}-${Math.random().toString(36).slice(2, 10)}`;
+
+   if (!port || !tabId || !secret) {
+     console.error('[MCP Bouncer] AskUserQuestion: missing routing context (port/tabId/secret) — passing through with no answers');
+     return { behavior: 'allow', updatedInput: input };
+   }
+
+   try {
+     const res = await fetch(`http://127.0.0.1:${port}/internal/ask-user-question`, {
+       method: 'POST',
+       headers: { 'content-type': 'application/json', 'x-mstro-bouncer-secret': secret },
+       body: JSON.stringify({ toolUseId, tabId, questions: input.questions }),
+     });
+     if (!res.ok) {
+       console.error(`[MCP Bouncer] AskUserQuestion bridge returned ${res.status} — passing through with no answers`);
+       return { behavior: 'allow', updatedInput: input };
+     }
+     const json = (await res.json()) as { answers?: Record<string, string> };
+     const answers = json.answers && typeof json.answers === 'object' ? json.answers : {};
+     return {
+       behavior: 'allow',
+       updatedInput: { questions: input.questions, answers },
+     };
+   } catch (err) {
+     console.error(`[MCP Bouncer] AskUserQuestion bridge failed: ${err instanceof Error ? err.message : String(err)} — passing through with no answers`);
+     return { behavior: 'allow', updatedInput: input };
+   }
+ }
+
  /**
   * Handle tool calls (approval_prompt)
   */
@@ -75,6 +120,18 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
      input: Record<string, unknown>;
    };

+   // AskUserQuestion is a clarifying-question tool — Claude needs the user's
+   // answers in `updatedInput.answers`, not a yes/no permission decision. Skip
+   // the security review entirely (the prior pattern fast-path also auto-allowed
+   // this) and route to the web UI bridge for real interactive answering.
+   if (tool_name === 'AskUserQuestion') {
+     console.error('[MCP Bouncer] AskUserQuestion received — bridging to web UI');
+     const response = await bridgeAskUserQuestion(input);
+     return {
+       content: [{ type: 'text', text: JSON.stringify(response) }],
+     };
+   }
+
    console.error(`[MCP Bouncer] Analyzing ${tool_name} request...`);

    // Format operation string for bouncer analysis
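The two hunks above make the AskUserQuestion bridge fail open: any failure returns `behavior: 'allow'` with the input untouched, and a successful round trip adds an `answers` map. The sketch below isolates that decision as a pure function, separate from the `fetch` call, and wraps the result as a JSON text item the way the handler does. The names `BridgeOutcome`, `toPermissionResult` and `toToolResponse` are hypothetical and do not appear in the package; only the result shape mirrors the diff.

```typescript
// Result shape taken from the diff; the rest of this sketch is illustrative.
interface PermissionResult {
  behavior: 'allow';
  updatedInput: Record<string, unknown>;
}

// Hypothetical summary of the HTTP round trip, reduced to what the
// decision needs: either a parsed body or a failure reason.
type BridgeOutcome =
  | { kind: 'answered'; body: unknown }
  | { kind: 'failed'; reason: string };

function toPermissionResult(
  input: Record<string, unknown>,
  outcome: BridgeOutcome,
): PermissionResult {
  if (outcome.kind === 'failed') {
    // Fail open: hand the original input back unchanged.
    return { behavior: 'allow', updatedInput: input };
  }
  const body = outcome.body;
  const raw =
    body !== null && typeof body === 'object'
      ? (body as { answers?: unknown }).answers
      : undefined;
  // Anything that is not an object collapses to an empty answers map.
  const answers =
    raw !== null && typeof raw === 'object'
      ? (raw as Record<string, string>)
      : {};
  return {
    behavior: 'allow',
    updatedInput: { questions: input.questions, answers },
  };
}

// Wrap the decision as a single JSON text item, as the tool-call handler does.
function toToolResponse(
  result: PermissionResult,
): { content: { type: 'text'; text: string }[] } {
  return { content: [{ type: 'text', text: JSON.stringify(result) }] };
}
```

Separating the decision from the I/O makes each fallback path checkable without a running server; the trade-off of failing open is that a broken bridge silently degrades to "no answers" rather than stopping the run.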
@@ -9,4 +9,5 @@
  export { createFileRoutes } from './files.js'
  export { createImproviseRoutes } from './improvise.js'
  export { createInstanceRoutes, createShutdownRoute } from './instances.js'
+ export { createInternalRoutes } from './internal.js'
  export { createNotificationRoutes } from './notifications.js'
@@ -0,0 +1,112 @@
+ // Copyright (c) 2025-present Mstro, Inc. All rights reserved.
+
+ /**
+  * Internal Routes
+  *
+  * HTTP endpoints used by sibling subprocesses (like the MCP bouncer) to talk
+  * back to the running CLI server. NOT mounted under `/api/*` — these are gated
+  * by the per-process bouncer secret instead of the user's session token.
+  *
+  * Currently a single endpoint:
+  *   POST /internal/ask-user-question
+  *     Bouncer pauses Claude on AskUserQuestion; this blocks until the web
+  *     user answers, then returns the answers Claude needs to continue.
+  */
+
+ import { Hono } from 'hono'
+ import {
+   isValidBouncerSecret,
+   registerPendingQuestion,
+ } from '../services/websocket/ask-user-question-bridge.js'
+ import type { HandlerContext } from '../services/websocket/handler-context.js'
+ import { broadcastTabEvent } from '../services/websocket/tab-broadcast.js'
+ import type {
+   AskUserQuestionItem,
+   AskUserQuestionPayload,
+ } from '../services/websocket/types.js'
+
+ interface AskUserQuestionRequestBody {
+   toolUseId?: unknown
+   tabId?: unknown
+   questions?: unknown
+   /** Override default 15min timeout (ms). Optional. */
+   timeoutMs?: unknown
+ }
+
+ /** Narrow an unknown into AskUserQuestionItem[] without throwing. */
+ function parseQuestions(value: unknown): AskUserQuestionItem[] | null {
+   if (!Array.isArray(value)) return null
+   const out: AskUserQuestionItem[] = []
+   for (const raw of value) {
+     if (!raw || typeof raw !== 'object') return null
+     const r = raw as Record<string, unknown>
+     if (typeof r.question !== 'string' || typeof r.header !== 'string') return null
+     if (!Array.isArray(r.options)) return null
+     const options = r.options.map((o) => {
+       if (!o || typeof o !== 'object') return null
+       const oo = o as Record<string, unknown>
+       if (typeof oo.label !== 'string') return null
+       return {
+         label: oo.label,
+         description: typeof oo.description === 'string' ? oo.description : '',
+         preview: typeof oo.preview === 'string' ? oo.preview : undefined,
+       }
+     })
+     if (options.some((o) => o === null)) return null
+     out.push({
+       question: r.question,
+       header: r.header,
+       options: options as AskUserQuestionItem['options'],
+       multiSelect: r.multiSelect === true,
+     })
+   }
+   return out
+ }
+
+ export function createInternalRoutes(ctx: HandlerContext): Hono {
+   const app = new Hono()
+
+   app.post('/ask-user-question', async (c) => {
+     const secret = c.req.header('x-mstro-bouncer-secret')
+     if (!isValidBouncerSecret(secret)) {
+       return c.json({ error: 'Forbidden' }, 403)
+     }
+
+     let body: AskUserQuestionRequestBody
+     try {
+       body = (await c.req.json()) as AskUserQuestionRequestBody
+     } catch {
+       return c.json({ error: 'Invalid JSON' }, 400)
+     }
+
82
+ const toolUseId = typeof body.toolUseId === 'string' ? body.toolUseId : ''
83
+ const tabId = typeof body.tabId === 'string' ? body.tabId : ''
84
+ const questions = parseQuestions(body.questions)
85
+ if (!toolUseId || !tabId || !questions || questions.length === 0) {
86
+ return c.json({ error: 'toolUseId, tabId, and non-empty questions[] are required' }, 400)
87
+ }
88
+
89
+ const timeoutMs =
90
+ typeof body.timeoutMs === 'number' && body.timeoutMs > 0 ? body.timeoutMs : undefined
91
+
92
+ const payload: AskUserQuestionPayload = { toolUseId, questions }
93
+ broadcastTabEvent(ctx, tabId, 'askUserQuestion', payload)
94
+
95
+ try {
96
+ const answers = await registerPendingQuestion({ toolUseId, tabId, timeoutMs })
97
+ return c.json({ answers })
98
+ } catch (err) {
99
+ const reason = err instanceof Error ? err.message : 'cancelled'
100
+ // Tell every web client to dismiss the card so users don't keep poking
101
+ // an already-dead question.
102
+ broadcastTabEvent(ctx, tabId, 'askUserQuestionDismissed', {
103
+ toolUseId,
104
+ reason: reason === 'timeout' ? 'timeout' : 'cancelled',
105
+ })
106
+ const status = reason === 'timeout' ? 504 : 410
107
+ return c.json({ error: reason }, status)
108
+ }
109
+ })
110
+
111
+ return app
112
+ }
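The route above awaits `registerPendingQuestion`, whose implementation lives in `ask-user-question-bridge.js` and is not shown in this hunk. As a rough illustration of the "block until the web user answers, or time out" pattern, here is a minimal sketch. Every name in it (`registerPendingQuestionSketch`, `answerPendingQuestionSketch`, `statusForFailure`, the `Answers` shape) is hypothetical; only the 15-minute default and the 504/410 mapping are taken from the diff.

```typescript
// Hypothetical sketch, NOT the package's ask-user-question-bridge implementation.
// The Answers shape is an assumption made for illustration.
type Answers = Record<string, string>;

interface PendingEntry {
  resolve: (answers: Answers) => void;
  timer: ReturnType<typeof setTimeout>;
}

// 15 minutes, matching the default mentioned in the request-body comment.
const DEFAULT_TIMEOUT_MS = 15 * 60 * 1000;

const pendingQuestions = new Map<string, PendingEntry>();

/** Returns a promise that settles when the question is answered or times out. */
function registerPendingQuestionSketch(opts: {
  toolUseId: string;
  timeoutMs?: number;
}): Promise<Answers> {
  return new Promise<Answers>((resolve, reject) => {
    const timer = setTimeout(() => {
      pendingQuestions.delete(opts.toolUseId);
      reject(new Error('timeout'));
    }, opts.timeoutMs ?? DEFAULT_TIMEOUT_MS);
    pendingQuestions.set(opts.toolUseId, { resolve, timer });
  });
}

/** Called when a web client submits answers; returns false if nothing is pending. */
function answerPendingQuestionSketch(toolUseId: string, answers: Answers): boolean {
  const entry = pendingQuestions.get(toolUseId);
  if (!entry) return false;
  clearTimeout(entry.timer);
  pendingQuestions.delete(toolUseId);
  entry.resolve(answers);
  return true;
}

/** Mirrors the route's mapping: timeout -> 504, anything else -> 410. */
function statusForFailure(reason: string): 504 | 410 {
  return reason === 'timeout' ? 504 : 410;
}
```

Keying the map by `toolUseId` means a second answer for the same question is simply ignored, which fits the route's behaviour of broadcasting a dismissal once a question is dead.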
@@ -0,0 +1,115 @@
1
+ // Copyright (c) 2025-present Mstro, Inc. All rights reserved.
2
+
3
+ /**
4
+ * Agent Resolver — Maps issue.agents hints to subagents installed on the user's system.
5
+ *
6
+ * Issue front matter may specify `agents` as either canonical Claude Code subagent
7
+ * names (e.g. `backend-architect`) or general role pointers (e.g. `backend engineer`).
8
+ * This module bridges the two: it consults AgentManager (project / global / bundled
9
+ * `.claude/agents/`) and resolves each hint to a concrete agent name when possible,
10
+ * falling back to the original hint when no match is found so the executor can still
11
+ * surface the user's intent in the prompt.
12
+ */
13
+
14
+ import { type AgentInfo, agentManager } from '../../utils/agent-manager.js';
15
+
16
+ export interface ResolvedAgent {
17
+ /** The original hint as written in the issue front matter. */
18
+ hint: string;
19
+ /** The resolved canonical agent name, or null if no installed agent matched. */
20
+ resolvedName: string | null;
21
+ /** The matching agent info, or null if no installed agent matched. */
22
+ info: AgentInfo | null;
23
+ }
24
+
25
+ const NON_WORD = /[^a-z0-9]+/g;
26
+
27
+ function normalize(input: string): string {
28
+ return input.toLowerCase().replace(NON_WORD, ' ').trim();
29
+ }
30
+
31
+ function tokenize(input: string): string[] {
32
+ return normalize(input).split(' ').filter(Boolean);
33
+ }
34
+
35
+ /**
36
+ * Discover every available agent across project / global / bundled directories.
37
+ * Project entries shadow global, which shadows bundled (deduped by canonical name).
38
+ */
39
+ function listAvailableAgents(workingDir: string): AgentInfo[] {
40
+ const seen = new Map<string, AgentInfo>();
41
+ const layers = [
42
+ agentManager.listProjectAgents(workingDir),
43
+ agentManager.listGlobalAgents(),
44
+ agentManager.listBundledAgents(),
45
+ ];
46
+ for (const layer of layers) {
47
+ for (const agent of layer) {
48
+ if (!seen.has(agent.name)) seen.set(agent.name, agent);
49
+ }
50
+ }
51
+ return Array.from(seen.values());
52
+ }
53
+
54
+ /**
55
+ * Score how well an agent matches a hint. Returns 0 when there is no token overlap.
56
+ * Higher is better. Exact normalized matches return Infinity.
57
+ */
58
+ function matchScore(hint: string, agent: AgentInfo): number {
59
+ const normalizedHint = normalize(hint);
60
+ const normalizedName = normalize(agent.name);
61
+ if (normalizedHint === normalizedName) return Number.POSITIVE_INFINITY;
62
+
63
+ const hintTokens = tokenize(hint);
64
+ if (hintTokens.length === 0) return 0;
65
+
66
+ const haystack = `${normalizedName} ${normalize(agent.description ?? '')}`;
67
+ let matched = 0;
68
+ for (const token of hintTokens) {
69
+ if (token.length < 2) continue;
70
+ if (haystack.includes(token)) matched++;
71
+ }
72
+ if (matched === 0) return 0;
73
+
74
+ // Reward agents whose name (not just description) contains hint tokens.
75
+ const nameMatches = hintTokens.filter(t => t.length >= 2 && normalizedName.includes(t)).length;
76
+ return matched + nameMatches * 0.5;
77
+ }
78
+
79
+ /**
80
+ * Resolve a single hint against the catalog of available agents.
81
+ * Returns the highest-scoring agent, or null when no agent has any token overlap.
82
+ */
83
+ function resolveHint(hint: string, available: AgentInfo[]): AgentInfo | null {
84
+ let bestScore = 0;
85
+ let best: AgentInfo | null = null;
86
+ for (const agent of available) {
87
+ const score = matchScore(hint, agent);
88
+ if (score > bestScore) {
89
+ bestScore = score;
90
+ best = agent;
91
+ }
92
+ }
93
+ return best;
94
+ }
95
+
96
+ /**
97
+ * Resolve every hint in `agents` against the user's installed Claude Code subagents.
98
+ * Hints with no match are preserved (resolvedName: null) so the executor can still
99
+ * mention them in the prompt with a graceful fallback note.
100
+ */
101
+ export function resolveAgentHints(agents: string[], workingDir: string): ResolvedAgent[] {
102
+ if (!agents || agents.length === 0) return [];
103
+ const available = listAvailableAgents(workingDir);
104
+ return agents
105
+ .map(raw => raw.trim())
106
+ .filter(Boolean)
107
+ .map(hint => {
108
+ const info = resolveHint(hint, available);
109
+ return {
110
+ hint,
111
+ resolvedName: info?.name ?? null,
112
+ info,
113
+ };
114
+ });
115
+ }
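To make the scoring in `matchScore` concrete, the sketch below re-implements the same token-overlap logic against a stand-in type and runs it on two made-up agents. `AgentInfoSketch` and the sample agents are assumptions for illustration; the real `AgentInfo` comes from `agent-manager.js` and may carry more fields.

```typescript
// Stand-in for AgentInfo; only `name` and `description` are assumed here.
interface AgentInfoSketch {
  name: string;
  description?: string;
}

const NON_WORD = /[^a-z0-9]+/g;

function normalize(input: string): string {
  return input.toLowerCase().replace(NON_WORD, ' ').trim();
}

function tokenize(input: string): string[] {
  return normalize(input).split(' ').filter(Boolean);
}

// Same scoring idea as the diff: exact normalized name wins outright;
// otherwise count hint tokens found in name + description, with a
// 0.5 bonus per token that also appears in the name.
function matchScore(hint: string, agent: AgentInfoSketch): number {
  const normalizedName = normalize(agent.name);
  if (normalize(hint) === normalizedName) return Number.POSITIVE_INFINITY;

  const tokens = tokenize(hint).filter((t) => t.length >= 2);
  const haystack = `${normalizedName} ${normalize(agent.description ?? '')}`;
  const matched = tokens.filter((t) => haystack.includes(t)).length;
  if (matched === 0) return 0;

  const nameMatches = tokens.filter((t) => normalizedName.includes(t)).length;
  return matched + nameMatches * 0.5;
}

function resolveHint(hint: string, available: AgentInfoSketch[]): AgentInfoSketch | null {
  let bestScore = 0;
  let best: AgentInfoSketch | null = null;
  for (const agent of available) {
    const score = matchScore(hint, agent);
    if (score > bestScore) {
      bestScore = score;
      best = agent;
    }
  }
  return best;
}

// Made-up catalog for the worked example.
const sampleAgents: AgentInfoSketch[] = [
  { name: 'backend-architect', description: 'Designs APIs and services' },
  { name: 'frontend-developer', description: 'Builds React UI for backend dashboards' },
];

console.log(matchScore('backend engineer', sampleAgents[0])); // 1.5 (name hit: 1 + 0.5)
console.log(matchScore('backend engineer', sampleAgents[1])); // 1 (description hit only)
console.log(resolveHint('backend engineer', sampleAgents)?.name); // backend-architect
console.log(resolveHint('qa', sampleAgents)); // null (no token overlap)
```

Because the check uses substring inclusion (`includes`) rather than whole-word comparison, a short token can also match inside a longer word; that appears to be a deliberate trade-off toward recall in the original.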
@@ -74,19 +74,49 @@ For each finding, use this reasoning process:
74
74
 
75
75
  ## Scoring Guidelines
76
76
 
77
- The overall grade is computed deterministically from your findings, not from a number you supply. Severity and category on each finding are what drive the grade — pick them carefully.
77
+ The overall grade is computed deterministically from your findings, not from a number you supply. **Severity and category on each finding are what drive the grade — pick them carefully.** When in doubt, downgrade.
78
78
 
79
- Three independent dimension grades are computed:
79
+ ### Severity Ladder: calibrate by likelihood × user impact, not just by topic
80
80
 
81
- - **Security** (category: `security`) — uses a severity-threshold rule: A = 0 findings, B = only low, C = ≥1 medium, D = ≥1 high, F = ≥1 critical.
82
- - **Reliability** (categories: `bugs`, `logic`, `performance`) uses a severity-threshold rule, slightly more lenient: A = 0 findings or ≤1 low, B = ≥2 low or ≤2 medium, C = ≥3 medium or ≥1 high, D = ≥2 high, F = ≥1 critical.
83
- - **Maintainability** (categories: `architecture`, `oop`, `maintainability`) — density-based (issues per 1000 lines), with a severity escape hatch: any high finding caps at C, any critical caps at D.
81
+ Severity should answer two questions:
82
+ 1. **How likely is this to actually trigger?** (Common path vs. edge case vs. theoretical)
83
+ 2. **What happens when it triggers?** (User-visible breakage / data loss vs. internal-only / cosmetic)
84
84
 
85
- Overall grade = the worst of the three dimensions. A single critical security finding caps the entire codebase at F.
85
+ Use this ladder. Worked examples follow each level.
86
86
 
87
- This means **severity is load-bearing**: marking something `high` when it's really `low` will swing the grade unfairly. When in doubt, downgrade. A finding without clear evidence of harm is `low`.
87
+ - **`critical`**: Reserved for "this is broken in production today on common code paths." Active data corruption, RCE, auth bypass for normal users, unrecoverable crash on the happy path. If the on-call would page at 3 AM for it, it's critical.
88
+ - ✅ SQL injection on a public form. Hard-coded production credentials in a deployed file. A `null`-deref on the homepage render path.
89
+ - ❌ "Could become a problem if traffic 100×". "Edge case where two clients race within 50ms." A theoretical bug in error-handling code that has never run.
88
90
 
89
- You may still emit `score`, `grade`, and `scoreRationale` for reference; they are persisted but ignored when computing the displayed grade. Focus your effort on accurate findings, not on guessing the overall number.
91
+ - **`high`**: A real bug or vulnerability that **definitely affects normal users on common code paths** with **user-visible consequences** (broken UI, wrong data shown, action silently fails). Or an exploitable security issue that requires only realistic conditions.
92
+ - ✅ Wrong state shown after a successful save (UI/UX bug). XSS via reflected URL parameter on a logged-in dashboard. Wrong calculation in a money-handling code path. Memory leak that grows on every page-view.
93
+ - ❌ Race condition on degraded shutdown paths. Edge-case exploit gated behind admin auth on a feature that hasn't shipped. A theoretical SSRF on an internal endpoint with no user reach. Defense-in-depth gaps (rate limit absent, header missing) — those are `low`.
94
+
95
+ - **`medium`** — Real issue but affects an edge case OR has limited user impact OR requires unusual conditions to trigger. Worth fixing eventually; not blocking.
96
+ - ✅ Missing error handling on a rarely-failing dependency. Logic bug in an admin-only page. A bug only reachable when two specific feature flags are both on. Performance issue that adds 50 ms but isn't user-perceptible.
97
+ - ❌ "Best practice" preferences with no user impact. Theoretical bugs in unreachable code.
98
+
99
+ - **`low`** — Improbable, theoretical, or cosmetic. Defense-in-depth missing, style/preference, "could be cleaner." Many of these are fine to leave for years.
100
+ - ✅ Missing rate limit on a low-traffic admin endpoint. SQL injection-shaped pattern that ends up safely parameterized. A `console.log` left in code. A nullable field that's only null in a code path that never executes.
101
+
102
+ ### Likelihood-weighted severity rules
103
+
104
+ Apply these as veto rules **after** you've chosen a severity from topic alone:
105
+
106
+ - If the bug only fires on a path that **realistically never executes in production**, downgrade by at least one step (high→medium, medium→low). A bug that requires "the network connection drops between line 42 and 43 of the shutdown handler" is `low` even if its consequences would be severe.
107
+ - If the issue has **no user-visible effect** (no UI/UX impact, no incorrect data shown, no security boundary crossed), it caps at `medium`. UI/UX wiring bugs and broken interactive flows skew higher; pure-internal architecture / observability gaps skew lower.
108
+ - If the issue is a **defense-in-depth gap** (rate limits, hardening headers, additional validation on already-validated input), cap at `low` unless you can articulate the realistic exploit chain that survives the existing defenses.
109
+ - If exploitability requires **conditions that only matter at high traffic / wide user attack surface**, downgrade for early-stage projects: this is `low` or `medium`, not `high`. (Make this explicit in the description so the reader knows the call.)
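The veto rules above are written for a reviewer's judgment, but their ordering effect can be sketched as a small function. This is one possible reading, not code from the package: the flag names, `applyVetoRules`, and the choice to cap the "only matters at scale" case at `medium` are all assumptions.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

const LADDER: Severity[] = ['low', 'medium', 'high', 'critical'];

// Hypothetical flags a reviewer might set for a finding.
interface VetoFlags {
  neverExecutesInProduction?: boolean;
  noUserVisibleEffect?: boolean;
  defenseInDepthGap?: boolean;
  hasRealisticExploitChain?: boolean;
  onlyMattersAtScale?: boolean;
  earlyStageProject?: boolean;
}

// Lower the severity to `cap` if it is currently above it.
function capAt(severity: Severity, cap: Severity): Severity {
  return LADDER.indexOf(severity) > LADDER.indexOf(cap) ? cap : severity;
}

// Move one step down the ladder; `low` stays `low`.
function downgradeOne(severity: Severity): Severity {
  return LADDER[Math.max(0, LADDER.indexOf(severity) - 1)];
}

function applyVetoRules(initial: Severity, flags: VetoFlags): Severity {
  let severity = initial;
  // Rule 1: path realistically never runs -> at least one step down.
  if (flags.neverExecutesInProduction) severity = downgradeOne(severity);
  // Rule 2: no user-visible effect -> cap at medium.
  if (flags.noUserVisibleEffect) severity = capAt(severity, 'medium');
  // Rule 3: defense-in-depth gap -> cap at low unless an exploit chain is articulated.
  if (flags.defenseInDepthGap && !flags.hasRealisticExploitChain) {
    severity = capAt(severity, 'low');
  }
  // Rule 4: scale-only conditions on an early-stage project -> not high (assumed cap: medium).
  if (flags.onlyMattersAtScale && flags.earlyStageProject) {
    severity = capAt(severity, 'medium');
  }
  return severity;
}

console.log(applyVetoRules('high', { neverExecutesInProduction: true })); // medium
console.log(applyVetoRules('high', { defenseInDepthGap: true })); // low
```

Each rule only ever lowers the severity, so applying them in sequence cannot raise a finding above the level chosen from topic alone.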
110
+
111
+ ### Three dimension grades the engine derives
112
+
113
+ - **Security** (category: `security`) — strictest. A = 0 findings, B = only low, C = ≥1 medium, F = ≥1 high, F- = ≥1 critical.
114
+ - **Reliability** (categories: `bugs`, `logic`, `performance`) — density-based grade per KLOC with severity escape: critical → F, any high → caps at C. Multiple medium findings escalate gradually rather than auto-failing.
115
+ - **Maintainability** (categories: `architecture`, `oop`, `maintainability`) — density-based with severity escape: critical → F, any high → C.
116
+
117
+ Overall grade = the worst of the three. A single critical security finding caps the entire codebase at F-.
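As a rough sketch of the rules just stated, the Security threshold rule and the "worst of the three" combination could look like the following. The grading engine itself is not part of this hunk, so `securityGrade`, `worstGrade`, and the grade ordering (including where `D` sits) are assumptions; the Reliability and Maintainability grades are taken as inputs because their density thresholds are not given here.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';
type Grade = 'A' | 'B' | 'C' | 'D' | 'F' | 'F-';

// Assumed ordering from best to worst.
const GRADE_ORDER: Grade[] = ['A', 'B', 'C', 'D', 'F', 'F-'];

// Security rule as stated in the prompt text:
// A = 0 findings, B = only low, C = >=1 medium, F = >=1 high, F- = >=1 critical.
function securityGrade(severities: Severity[]): Grade {
  if (severities.includes('critical')) return 'F-';
  if (severities.includes('high')) return 'F';
  if (severities.includes('medium')) return 'C';
  if (severities.length > 0) return 'B';
  return 'A';
}

// Overall grade = the worst of the dimension grades.
function worstGrade(...grades: Grade[]): Grade {
  return grades.reduce<Grade>(
    (worst, g) => (GRADE_ORDER.indexOf(g) > GRADE_ORDER.indexOf(worst) ? g : worst),
    'A',
  );
}

const security = securityGrade(['low', 'medium']);
console.log(security); // C
console.log(worstGrade(security, 'B', 'A')); // C
console.log(worstGrade(securityGrade(['critical']), 'A', 'A')); // F-
```

The last call shows the point made in the text: one critical security finding drags the overall grade to F- regardless of the other two dimensions.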
118
+
119
+ You may still emit `score`, `grade`, and `scoreRationale` for reference — they are persisted but ignored when computing the displayed grade. Focus your effort on accurate severity classification, not on guessing the overall number.
90
120
 
91
121
  ## Output
92
122