oh-my-codex 0.18.6 → 0.18.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (444) hide show
  1. package/Cargo.lock +6 -6
  2. package/Cargo.toml +1 -1
  3. package/README.md +59 -10
  4. package/crates/omx-sparkshell/tests/execution.rs +1 -1
  5. package/dist/agents/__tests__/definitions.test.js +11 -0
  6. package/dist/agents/__tests__/definitions.test.js.map +1 -1
  7. package/dist/agents/__tests__/native-config.test.js +56 -6
  8. package/dist/agents/__tests__/native-config.test.js.map +1 -1
  9. package/dist/agents/definitions.d.ts +10 -0
  10. package/dist/agents/definitions.d.ts.map +1 -1
  11. package/dist/agents/definitions.js +5 -1
  12. package/dist/agents/definitions.js.map +1 -1
  13. package/dist/agents/native-config.d.ts +5 -1
  14. package/dist/agents/native-config.d.ts.map +1 -1
  15. package/dist/agents/native-config.js +19 -4
  16. package/dist/agents/native-config.js.map +1 -1
  17. package/dist/autopilot/__tests__/fsm.test.d.ts +2 -0
  18. package/dist/autopilot/__tests__/fsm.test.d.ts.map +1 -0
  19. package/dist/autopilot/__tests__/fsm.test.js +75 -0
  20. package/dist/autopilot/__tests__/fsm.test.js.map +1 -0
  21. package/dist/autopilot/__tests__/ralplan-gate.test.d.ts +2 -0
  22. package/dist/autopilot/__tests__/ralplan-gate.test.d.ts.map +1 -0
  23. package/dist/autopilot/__tests__/ralplan-gate.test.js +79 -0
  24. package/dist/autopilot/__tests__/ralplan-gate.test.js.map +1 -0
  25. package/dist/autopilot/deep-interview-gate.d.ts +18 -0
  26. package/dist/autopilot/deep-interview-gate.d.ts.map +1 -0
  27. package/dist/autopilot/deep-interview-gate.js +256 -0
  28. package/dist/autopilot/deep-interview-gate.js.map +1 -0
  29. package/dist/autopilot/fsm.d.ts +13 -0
  30. package/dist/autopilot/fsm.d.ts.map +1 -0
  31. package/dist/autopilot/fsm.js +70 -0
  32. package/dist/autopilot/fsm.js.map +1 -0
  33. package/dist/autopilot/ralplan-gate.d.ts +17 -0
  34. package/dist/autopilot/ralplan-gate.d.ts.map +1 -0
  35. package/dist/autopilot/ralplan-gate.js +61 -0
  36. package/dist/autopilot/ralplan-gate.js.map +1 -0
  37. package/dist/cli/__tests__/codex-plugin-layout.test.js +512 -1
  38. package/dist/cli/__tests__/codex-plugin-layout.test.js.map +1 -1
  39. package/dist/cli/__tests__/doctor-warning-copy.test.js +39 -0
  40. package/dist/cli/__tests__/doctor-warning-copy.test.js.map +1 -1
  41. package/dist/cli/__tests__/index.test.js +83 -7
  42. package/dist/cli/__tests__/index.test.js.map +1 -1
  43. package/dist/cli/__tests__/launch-fallback.test.js +175 -6
  44. package/dist/cli/__tests__/launch-fallback.test.js.map +1 -1
  45. package/dist/cli/__tests__/package-bin-contract.test.js +8 -4
  46. package/dist/cli/__tests__/package-bin-contract.test.js.map +1 -1
  47. package/dist/cli/__tests__/question.test.js +100 -0
  48. package/dist/cli/__tests__/question.test.js.map +1 -1
  49. package/dist/cli/__tests__/ralph-goal-mode-contract.test.js +13 -0
  50. package/dist/cli/__tests__/ralph-goal-mode-contract.test.js.map +1 -1
  51. package/dist/cli/__tests__/ralph.test.js +14 -0
  52. package/dist/cli/__tests__/ralph.test.js.map +1 -1
  53. package/dist/cli/__tests__/setup-install-mode.test.js +89 -0
  54. package/dist/cli/__tests__/setup-install-mode.test.js.map +1 -1
  55. package/dist/cli/__tests__/setup-refresh.test.js +83 -0
  56. package/dist/cli/__tests__/setup-refresh.test.js.map +1 -1
  57. package/dist/cli/__tests__/state.test.js +21 -0
  58. package/dist/cli/__tests__/state.test.js.map +1 -1
  59. package/dist/cli/__tests__/team.test.js +2 -2
  60. package/dist/cli/__tests__/team.test.js.map +1 -1
  61. package/dist/cli/__tests__/update.test.js +110 -2
  62. package/dist/cli/__tests__/update.test.js.map +1 -1
  63. package/dist/cli/doctor.d.ts.map +1 -1
  64. package/dist/cli/doctor.js +8 -1
  65. package/dist/cli/doctor.js.map +1 -1
  66. package/dist/cli/index.d.ts +14 -3
  67. package/dist/cli/index.d.ts.map +1 -1
  68. package/dist/cli/index.js +298 -50
  69. package/dist/cli/index.js.map +1 -1
  70. package/dist/cli/plugin-marketplace.d.ts +14 -2
  71. package/dist/cli/plugin-marketplace.d.ts.map +1 -1
  72. package/dist/cli/plugin-marketplace.js +62 -15
  73. package/dist/cli/plugin-marketplace.js.map +1 -1
  74. package/dist/cli/question.d.ts.map +1 -1
  75. package/dist/cli/question.js +36 -5
  76. package/dist/cli/question.js.map +1 -1
  77. package/dist/cli/ralph.d.ts.map +1 -1
  78. package/dist/cli/ralph.js +3 -1
  79. package/dist/cli/ralph.js.map +1 -1
  80. package/dist/cli/setup-preferences.d.ts +2 -0
  81. package/dist/cli/setup-preferences.d.ts.map +1 -1
  82. package/dist/cli/setup-preferences.js +4 -0
  83. package/dist/cli/setup-preferences.js.map +1 -1
  84. package/dist/cli/setup.d.ts +3 -0
  85. package/dist/cli/setup.d.ts.map +1 -1
  86. package/dist/cli/setup.js +166 -27
  87. package/dist/cli/setup.js.map +1 -1
  88. package/dist/cli/state.d.ts.map +1 -1
  89. package/dist/cli/state.js +8 -1
  90. package/dist/cli/state.js.map +1 -1
  91. package/dist/cli/tmux-hook.d.ts.map +1 -1
  92. package/dist/cli/tmux-hook.js +16 -0
  93. package/dist/cli/tmux-hook.js.map +1 -1
  94. package/dist/cli/update.d.ts +2 -0
  95. package/dist/cli/update.d.ts.map +1 -1
  96. package/dist/cli/update.js +47 -3
  97. package/dist/cli/update.js.map +1 -1
  98. package/dist/config/__tests__/deep-interview.test.js +7 -6
  99. package/dist/config/__tests__/deep-interview.test.js.map +1 -1
  100. package/dist/config/__tests__/generator-notify.test.js +1 -0
  101. package/dist/config/__tests__/generator-notify.test.js.map +1 -1
  102. package/dist/config/deep-interview.d.ts.map +1 -1
  103. package/dist/config/deep-interview.js +14 -4
  104. package/dist/config/deep-interview.js.map +1 -1
  105. package/dist/config/generator.d.ts +2 -2
  106. package/dist/config/generator.d.ts.map +1 -1
  107. package/dist/config/generator.js +2 -2
  108. package/dist/config/generator.js.map +1 -1
  109. package/dist/config/team-mode.d.ts +12 -0
  110. package/dist/config/team-mode.d.ts.map +1 -0
  111. package/dist/config/team-mode.js +91 -0
  112. package/dist/config/team-mode.js.map +1 -0
  113. package/dist/hooks/__tests__/agents-overlay.test.js +88 -0
  114. package/dist/hooks/__tests__/agents-overlay.test.js.map +1 -1
  115. package/dist/hooks/__tests__/autopilot-skill-contract.test.js +8 -0
  116. package/dist/hooks/__tests__/autopilot-skill-contract.test.js.map +1 -1
  117. package/dist/hooks/__tests__/code-review-skill-contract.test.js +8 -0
  118. package/dist/hooks/__tests__/code-review-skill-contract.test.js.map +1 -1
  119. package/dist/hooks/__tests__/deep-interview-contract.test.js +10 -0
  120. package/dist/hooks/__tests__/deep-interview-contract.test.js.map +1 -1
  121. package/dist/hooks/__tests__/keyword-detector.test.js +1072 -14
  122. package/dist/hooks/__tests__/keyword-detector.test.js.map +1 -1
  123. package/dist/hooks/__tests__/notify-fallback-watcher.test.js +64 -1
  124. package/dist/hooks/__tests__/notify-fallback-watcher.test.js.map +1 -1
  125. package/dist/hooks/__tests__/notify-hook-auto-nudge.test.js +189 -0
  126. package/dist/hooks/__tests__/notify-hook-auto-nudge.test.js.map +1 -1
  127. package/dist/hooks/__tests__/notify-hook-team-leader-nudge.test.js +35 -2
  128. package/dist/hooks/__tests__/notify-hook-team-leader-nudge.test.js.map +1 -1
  129. package/dist/hooks/__tests__/notify-hook-tmux-heal.test.js +3 -3
  130. package/dist/hooks/__tests__/notify-hook-tmux-heal.test.js.map +1 -1
  131. package/dist/hooks/__tests__/session.test.js +25 -0
  132. package/dist/hooks/__tests__/session.test.js.map +1 -1
  133. package/dist/hooks/__tests__/skill-guidance-contract.test.js +21 -0
  134. package/dist/hooks/__tests__/skill-guidance-contract.test.js.map +1 -1
  135. package/dist/hooks/agents-overlay.d.ts.map +1 -1
  136. package/dist/hooks/agents-overlay.js +36 -50
  137. package/dist/hooks/agents-overlay.js.map +1 -1
  138. package/dist/hooks/deep-interview-config-instruction.js +1 -1
  139. package/dist/hooks/deep-interview-config-instruction.js.map +1 -1
  140. package/dist/hooks/extensibility/__tests__/plugin-runner.test.js +31 -0
  141. package/dist/hooks/extensibility/__tests__/plugin-runner.test.js.map +1 -1
  142. package/dist/hooks/extensibility/plugin-runner.js +17 -21
  143. package/dist/hooks/extensibility/plugin-runner.js.map +1 -1
  144. package/dist/hooks/keyword-detector.d.ts +1 -0
  145. package/dist/hooks/keyword-detector.d.ts.map +1 -1
  146. package/dist/hooks/keyword-detector.js +428 -32
  147. package/dist/hooks/keyword-detector.js.map +1 -1
  148. package/dist/hooks/keyword-registry.d.ts.map +1 -1
  149. package/dist/hooks/keyword-registry.js +1 -0
  150. package/dist/hooks/keyword-registry.js.map +1 -1
  151. package/dist/hooks/prompt-guidance-contract.d.ts.map +1 -1
  152. package/dist/hooks/prompt-guidance-contract.js +6 -0
  153. package/dist/hooks/prompt-guidance-contract.js.map +1 -1
  154. package/dist/hooks/session.d.ts +3 -0
  155. package/dist/hooks/session.d.ts.map +1 -1
  156. package/dist/hooks/session.js +13 -5
  157. package/dist/hooks/session.js.map +1 -1
  158. package/dist/hud/__tests__/authority.test.js +469 -31
  159. package/dist/hud/__tests__/authority.test.js.map +1 -1
  160. package/dist/hud/__tests__/hud-tmux-injection.test.js +2 -1
  161. package/dist/hud/__tests__/hud-tmux-injection.test.js.map +1 -1
  162. package/dist/hud/__tests__/index.test.js +210 -2
  163. package/dist/hud/__tests__/index.test.js.map +1 -1
  164. package/dist/hud/__tests__/reconcile.test.js +588 -28
  165. package/dist/hud/__tests__/reconcile.test.js.map +1 -1
  166. package/dist/hud/__tests__/render.test.js +61 -0
  167. package/dist/hud/__tests__/render.test.js.map +1 -1
  168. package/dist/hud/__tests__/state.test.js +208 -0
  169. package/dist/hud/__tests__/state.test.js.map +1 -1
  170. package/dist/hud/__tests__/tmux.test.js +314 -22
  171. package/dist/hud/__tests__/tmux.test.js.map +1 -1
  172. package/dist/hud/authority.d.ts +5 -0
  173. package/dist/hud/authority.d.ts.map +1 -1
  174. package/dist/hud/authority.js +337 -30
  175. package/dist/hud/authority.js.map +1 -1
  176. package/dist/hud/index.d.ts +20 -2
  177. package/dist/hud/index.d.ts.map +1 -1
  178. package/dist/hud/index.js +103 -26
  179. package/dist/hud/index.js.map +1 -1
  180. package/dist/hud/reconcile.d.ts +3 -3
  181. package/dist/hud/reconcile.d.ts.map +1 -1
  182. package/dist/hud/reconcile.js +129 -20
  183. package/dist/hud/reconcile.js.map +1 -1
  184. package/dist/hud/render.d.ts.map +1 -1
  185. package/dist/hud/render.js +35 -0
  186. package/dist/hud/render.js.map +1 -1
  187. package/dist/hud/state.d.ts.map +1 -1
  188. package/dist/hud/state.js +64 -50
  189. package/dist/hud/state.js.map +1 -1
  190. package/dist/hud/tmux.d.ts +26 -6
  191. package/dist/hud/tmux.d.ts.map +1 -1
  192. package/dist/hud/tmux.js +173 -38
  193. package/dist/hud/tmux.js.map +1 -1
  194. package/dist/hud/types.d.ts +11 -0
  195. package/dist/hud/types.d.ts.map +1 -1
  196. package/dist/hud/types.js.map +1 -1
  197. package/dist/mcp/__tests__/hermes-bridge.test.js +203 -7
  198. package/dist/mcp/__tests__/hermes-bridge.test.js.map +1 -1
  199. package/dist/mcp/__tests__/state-paths.test.js +71 -1
  200. package/dist/mcp/__tests__/state-paths.test.js.map +1 -1
  201. package/dist/mcp/__tests__/state-server.test.js +13 -1
  202. package/dist/mcp/__tests__/state-server.test.js.map +1 -1
  203. package/dist/mcp/hermes-bridge.d.ts +12 -2
  204. package/dist/mcp/hermes-bridge.d.ts.map +1 -1
  205. package/dist/mcp/hermes-bridge.js +83 -9
  206. package/dist/mcp/hermes-bridge.js.map +1 -1
  207. package/dist/mcp/state-paths.d.ts +32 -0
  208. package/dist/mcp/state-paths.d.ts.map +1 -1
  209. package/dist/mcp/state-paths.js +113 -17
  210. package/dist/mcp/state-paths.js.map +1 -1
  211. package/dist/mcp/state-server.d.ts +4 -4
  212. package/dist/modes/__tests__/base-autoresearch-contract.test.js +7 -1
  213. package/dist/modes/__tests__/base-autoresearch-contract.test.js.map +1 -1
  214. package/dist/pipeline/__tests__/stages.test.js +130 -0
  215. package/dist/pipeline/__tests__/stages.test.js.map +1 -1
  216. package/dist/pipeline/orchestrator.js +1 -1
  217. package/dist/pipeline/orchestrator.js.map +1 -1
  218. package/dist/pipeline/stages/ralplan.d.ts +1 -0
  219. package/dist/pipeline/stages/ralplan.d.ts.map +1 -1
  220. package/dist/pipeline/stages/ralplan.js +14 -5
  221. package/dist/pipeline/stages/ralplan.js.map +1 -1
  222. package/dist/question/__tests__/deep-interview.test.js +160 -2
  223. package/dist/question/__tests__/deep-interview.test.js.map +1 -1
  224. package/dist/question/__tests__/policy.test.js +63 -3
  225. package/dist/question/__tests__/policy.test.js.map +1 -1
  226. package/dist/question/__tests__/renderer.test.js +191 -2
  227. package/dist/question/__tests__/renderer.test.js.map +1 -1
  228. package/dist/question/__tests__/state.test.js +94 -3
  229. package/dist/question/__tests__/state.test.js.map +1 -1
  230. package/dist/question/__tests__/ui.test.js +4 -0
  231. package/dist/question/__tests__/ui.test.js.map +1 -1
  232. package/dist/question/autopilot-wait.d.ts +12 -2
  233. package/dist/question/autopilot-wait.d.ts.map +1 -1
  234. package/dist/question/autopilot-wait.js +158 -47
  235. package/dist/question/autopilot-wait.js.map +1 -1
  236. package/dist/question/deep-interview.d.ts.map +1 -1
  237. package/dist/question/deep-interview.js +22 -6
  238. package/dist/question/deep-interview.js.map +1 -1
  239. package/dist/question/policy.d.ts.map +1 -1
  240. package/dist/question/policy.js +2 -5
  241. package/dist/question/policy.js.map +1 -1
  242. package/dist/question/renderer.d.ts +12 -0
  243. package/dist/question/renderer.d.ts.map +1 -1
  244. package/dist/question/renderer.js +87 -3
  245. package/dist/question/renderer.js.map +1 -1
  246. package/dist/question/state.d.ts +8 -1
  247. package/dist/question/state.d.ts.map +1 -1
  248. package/dist/question/state.js +54 -14
  249. package/dist/question/state.js.map +1 -1
  250. package/dist/question/types.d.ts +1 -1
  251. package/dist/question/types.d.ts.map +1 -1
  252. package/dist/question/ui.d.ts +1 -0
  253. package/dist/question/ui.d.ts.map +1 -1
  254. package/dist/question/ui.js +1 -0
  255. package/dist/question/ui.js.map +1 -1
  256. package/dist/ralplan/__tests__/runtime.test.js +191 -0
  257. package/dist/ralplan/__tests__/runtime.test.js.map +1 -1
  258. package/dist/ralplan/consensus-gate.d.ts +9 -1
  259. package/dist/ralplan/consensus-gate.d.ts.map +1 -1
  260. package/dist/ralplan/consensus-gate.js +84 -2
  261. package/dist/ralplan/consensus-gate.js.map +1 -1
  262. package/dist/ralplan/runtime.d.ts +9 -0
  263. package/dist/ralplan/runtime.d.ts.map +1 -1
  264. package/dist/ralplan/runtime.js +32 -11
  265. package/dist/ralplan/runtime.js.map +1 -1
  266. package/dist/scripts/__tests__/codex-native-hook.test.js +2315 -280
  267. package/dist/scripts/__tests__/codex-native-hook.test.js.map +1 -1
  268. package/dist/scripts/__tests__/notify-state-io.test.js +72 -1
  269. package/dist/scripts/__tests__/notify-state-io.test.js.map +1 -1
  270. package/dist/scripts/__tests__/notify-tmux-injection.test.d.ts +2 -0
  271. package/dist/scripts/__tests__/notify-tmux-injection.test.d.ts.map +1 -0
  272. package/dist/scripts/__tests__/notify-tmux-injection.test.js +57 -0
  273. package/dist/scripts/__tests__/notify-tmux-injection.test.js.map +1 -0
  274. package/dist/scripts/__tests__/run-test-files.test.js +74 -0
  275. package/dist/scripts/__tests__/run-test-files.test.js.map +1 -1
  276. package/dist/scripts/__tests__/verify-native-agents.test.js +65 -0
  277. package/dist/scripts/__tests__/verify-native-agents.test.js.map +1 -1
  278. package/dist/scripts/codex-native-hook.d.ts.map +1 -1
  279. package/dist/scripts/codex-native-hook.js +431 -56
  280. package/dist/scripts/codex-native-hook.js.map +1 -1
  281. package/dist/scripts/codex-native-pre-post.d.ts.map +1 -1
  282. package/dist/scripts/codex-native-pre-post.js +79 -1
  283. package/dist/scripts/codex-native-pre-post.js.map +1 -1
  284. package/dist/scripts/eval/eval-parity-smoke.js +1 -1
  285. package/dist/scripts/eval/eval-parity-smoke.js.map +1 -1
  286. package/dist/scripts/hook-payload-guard.d.ts +9 -0
  287. package/dist/scripts/hook-payload-guard.d.ts.map +1 -0
  288. package/dist/scripts/hook-payload-guard.js +111 -0
  289. package/dist/scripts/hook-payload-guard.js.map +1 -0
  290. package/dist/scripts/notify-fallback-watcher.js +8 -1
  291. package/dist/scripts/notify-fallback-watcher.js.map +1 -1
  292. package/dist/scripts/notify-hook/__tests__/payload-guard.test.d.ts +2 -0
  293. package/dist/scripts/notify-hook/__tests__/payload-guard.test.d.ts.map +1 -0
  294. package/dist/scripts/notify-hook/__tests__/payload-guard.test.js +39 -0
  295. package/dist/scripts/notify-hook/__tests__/payload-guard.test.js.map +1 -0
  296. package/dist/scripts/notify-hook/auto-nudge.d.ts.map +1 -1
  297. package/dist/scripts/notify-hook/auto-nudge.js +3 -1
  298. package/dist/scripts/notify-hook/auto-nudge.js.map +1 -1
  299. package/dist/scripts/notify-hook/ralph-session-resume.d.ts.map +1 -1
  300. package/dist/scripts/notify-hook/ralph-session-resume.js +3 -10
  301. package/dist/scripts/notify-hook/ralph-session-resume.js.map +1 -1
  302. package/dist/scripts/notify-hook/state-io.d.ts.map +1 -1
  303. package/dist/scripts/notify-hook/state-io.js +62 -38
  304. package/dist/scripts/notify-hook/state-io.js.map +1 -1
  305. package/dist/scripts/notify-hook/team-leader-nudge.d.ts.map +1 -1
  306. package/dist/scripts/notify-hook/team-leader-nudge.js +7 -0
  307. package/dist/scripts/notify-hook/team-leader-nudge.js.map +1 -1
  308. package/dist/scripts/notify-hook/team-worker-stop.d.ts.map +1 -1
  309. package/dist/scripts/notify-hook/team-worker-stop.js +234 -86
  310. package/dist/scripts/notify-hook/team-worker-stop.js.map +1 -1
  311. package/dist/scripts/notify-hook/tmux-injection.d.ts +7 -0
  312. package/dist/scripts/notify-hook/tmux-injection.d.ts.map +1 -1
  313. package/dist/scripts/notify-hook/tmux-injection.js +24 -18
  314. package/dist/scripts/notify-hook/tmux-injection.js.map +1 -1
  315. package/dist/scripts/notify-hook.js +86 -13
  316. package/dist/scripts/notify-hook.js.map +1 -1
  317. package/dist/scripts/run-test-files.js +193 -22
  318. package/dist/scripts/run-test-files.js.map +1 -1
  319. package/dist/scripts/sync-plugin-mirror.d.ts.map +1 -1
  320. package/dist/scripts/sync-plugin-mirror.js +61 -3
  321. package/dist/scripts/sync-plugin-mirror.js.map +1 -1
  322. package/dist/scripts/verify-native-agents.d.ts.map +1 -1
  323. package/dist/scripts/verify-native-agents.js +58 -1
  324. package/dist/scripts/verify-native-agents.js.map +1 -1
  325. package/dist/state/__tests__/operations.test.js +1125 -1
  326. package/dist/state/__tests__/operations.test.js.map +1 -1
  327. package/dist/state/__tests__/skill-active.test.js +46 -1
  328. package/dist/state/__tests__/skill-active.test.js.map +1 -1
  329. package/dist/state/__tests__/workflow-transition.test.js +98 -7
  330. package/dist/state/__tests__/workflow-transition.test.js.map +1 -1
  331. package/dist/state/operations.d.ts.map +1 -1
  332. package/dist/state/operations.js +159 -2
  333. package/dist/state/operations.js.map +1 -1
  334. package/dist/state/skill-active.js +6 -8
  335. package/dist/state/skill-active.js.map +1 -1
  336. package/dist/state/workflow-transition-reconcile.d.ts +6 -0
  337. package/dist/state/workflow-transition-reconcile.d.ts.map +1 -1
  338. package/dist/state/workflow-transition-reconcile.js +38 -15
  339. package/dist/state/workflow-transition-reconcile.js.map +1 -1
  340. package/dist/state/workflow-transition.d.ts.map +1 -1
  341. package/dist/state/workflow-transition.js +10 -3
  342. package/dist/state/workflow-transition.js.map +1 -1
  343. package/dist/subagents/__tests__/tracker.test.js +139 -0
  344. package/dist/subagents/__tests__/tracker.test.js.map +1 -1
  345. package/dist/subagents/tracker.d.ts +3 -0
  346. package/dist/subagents/tracker.d.ts.map +1 -1
  347. package/dist/subagents/tracker.js +41 -4
  348. package/dist/subagents/tracker.js.map +1 -1
  349. package/dist/team/__tests__/coordination-protocol.test.d.ts +2 -0
  350. package/dist/team/__tests__/coordination-protocol.test.d.ts.map +1 -0
  351. package/dist/team/__tests__/coordination-protocol.test.js +173 -0
  352. package/dist/team/__tests__/coordination-protocol.test.js.map +1 -0
  353. package/dist/team/__tests__/runtime.test.js +52 -3
  354. package/dist/team/__tests__/runtime.test.js.map +1 -1
  355. package/dist/team/__tests__/scaling.test.js +9 -4
  356. package/dist/team/__tests__/scaling.test.js.map +1 -1
  357. package/dist/team/__tests__/state.test.js +83 -0
  358. package/dist/team/__tests__/state.test.js.map +1 -1
  359. package/dist/team/__tests__/tmux-session.test.js +240 -2
  360. package/dist/team/__tests__/tmux-session.test.js.map +1 -1
  361. package/dist/team/__tests__/worker-bootstrap.test.js +84 -0
  362. package/dist/team/__tests__/worker-bootstrap.test.js.map +1 -1
  363. package/dist/team/__tests__/worker-runtime-identity.test.js +4 -2
  364. package/dist/team/__tests__/worker-runtime-identity.test.js.map +1 -1
  365. package/dist/team/coordination-protocol.d.ts +14 -0
  366. package/dist/team/coordination-protocol.d.ts.map +1 -0
  367. package/dist/team/coordination-protocol.js +244 -0
  368. package/dist/team/coordination-protocol.js.map +1 -0
  369. package/dist/team/runtime.d.ts +1 -0
  370. package/dist/team/runtime.d.ts.map +1 -1
  371. package/dist/team/runtime.js +19 -3
  372. package/dist/team/runtime.js.map +1 -1
  373. package/dist/team/scaling.d.ts.map +1 -1
  374. package/dist/team/scaling.js +3 -2
  375. package/dist/team/scaling.js.map +1 -1
  376. package/dist/team/state/tasks.d.ts.map +1 -1
  377. package/dist/team/state/tasks.js +24 -0
  378. package/dist/team/state/tasks.js.map +1 -1
  379. package/dist/team/state/types.d.ts +21 -1
  380. package/dist/team/state/types.d.ts.map +1 -1
  381. package/dist/team/state/types.js.map +1 -1
  382. package/dist/team/state.d.ts +17 -1
  383. package/dist/team/state.d.ts.map +1 -1
  384. package/dist/team/state.js +12 -5
  385. package/dist/team/state.js.map +1 -1
  386. package/dist/team/team-ops.d.ts +1 -1
  387. package/dist/team/team-ops.d.ts.map +1 -1
  388. package/dist/team/team-ops.js.map +1 -1
  389. package/dist/team/tmux-session.d.ts +2 -0
  390. package/dist/team/tmux-session.d.ts.map +1 -1
  391. package/dist/team/tmux-session.js +161 -13
  392. package/dist/team/tmux-session.js.map +1 -1
  393. package/dist/team/worker-bootstrap.d.ts.map +1 -1
  394. package/dist/team/worker-bootstrap.js +63 -0
  395. package/dist/team/worker-bootstrap.js.map +1 -1
  396. package/dist/utils/__tests__/agents-model-table.test.js +4 -2
  397. package/dist/utils/__tests__/agents-model-table.test.js.map +1 -1
  398. package/dist/utils/agents-model-table.d.ts.map +1 -1
  399. package/dist/utils/agents-model-table.js +3 -0
  400. package/dist/utils/agents-model-table.js.map +1 -1
  401. package/dist/verification/__tests__/ci-rust-gates.test.js +81 -1
  402. package/dist/verification/__tests__/ci-rust-gates.test.js.map +1 -1
  403. package/package.json +8 -8
  404. package/plugins/oh-my-codex/.codex-plugin/plugin.json +1 -1
  405. package/plugins/oh-my-codex/hooks/codex-native-hook.mjs +334 -21
  406. package/plugins/oh-my-codex/hooks/hooks.json +1 -2
  407. package/plugins/oh-my-codex/skills/autopilot/SKILL.md +13 -6
  408. package/plugins/oh-my-codex/skills/code-review/SKILL.md +7 -7
  409. package/plugins/oh-my-codex/skills/deep-interview/SKILL.md +9 -4
  410. package/plugins/oh-my-codex/skills/ralph/SKILL.md +22 -22
  411. package/plugins/oh-my-codex/skills/ralplan/SKILL.md +12 -0
  412. package/plugins/oh-my-codex/skills/team/SKILL.md +16 -0
  413. package/plugins/oh-my-codex/skills/ultraqa/SKILL.md +9 -0
  414. package/plugins/oh-my-codex/skills/worker/SKILL.md +14 -0
  415. package/skills/autopilot/SKILL.md +13 -6
  416. package/skills/code-review/SKILL.md +7 -7
  417. package/skills/deep-interview/SKILL.md +9 -4
  418. package/skills/ralph/SKILL.md +22 -22
  419. package/skills/ralplan/SKILL.md +12 -0
  420. package/skills/team/SKILL.md +16 -0
  421. package/skills/ultraqa/SKILL.md +9 -0
  422. package/skills/worker/SKILL.md +14 -0
  423. package/src/scripts/__tests__/codex-native-hook.test.ts +4435 -2083
  424. package/src/scripts/__tests__/notify-state-io.test.ts +95 -0
  425. package/src/scripts/__tests__/notify-tmux-injection.test.ts +82 -0
  426. package/src/scripts/__tests__/run-test-files.test.ts +102 -0
  427. package/src/scripts/__tests__/verify-native-agents.test.ts +75 -0
  428. package/src/scripts/codex-native-hook.ts +536 -51
  429. package/src/scripts/codex-native-pre-post.ts +80 -0
  430. package/src/scripts/demo-team-e2e.sh +10 -7
  431. package/src/scripts/eval/eval-parity-smoke.ts +1 -1
  432. package/src/scripts/hook-payload-guard.ts +113 -0
  433. package/src/scripts/notify-fallback-watcher.ts +8 -1
  434. package/src/scripts/notify-hook/__tests__/payload-guard.test.ts +41 -0
  435. package/src/scripts/notify-hook/auto-nudge.ts +3 -1
  436. package/src/scripts/notify-hook/ralph-session-resume.ts +2 -8
  437. package/src/scripts/notify-hook/state-io.ts +75 -37
  438. package/src/scripts/notify-hook/team-leader-nudge.ts +7 -0
  439. package/src/scripts/notify-hook/team-worker-stop.ts +193 -52
  440. package/src/scripts/notify-hook/tmux-injection.ts +35 -19
  441. package/src/scripts/notify-hook.ts +105 -6
  442. package/src/scripts/run-test-files.ts +192 -22
  443. package/src/scripts/sync-plugin-mirror.ts +98 -9
  444. package/src/scripts/verify-native-agents.ts +65 -1
@@ -64,6 +64,18 @@ The consensus workflow:
64
64
 
65
65
  > **Important:** Steps 3 and 4 MUST run sequentially as role-specific subagents. Do NOT issue both agent calls in the same parallel batch. Always await the subsequent `Architect` result before invoking the subsequent `Critic`; only a completed, role-specific `Critic` approval can satisfy the durable gate.
66
66
 
67
+ ## Planning/Execution Boundary
68
+
69
+ `$ralplan` is a planning mode. While ralplan is active and no explicit execution handoff is active, implementation-focused write tools are out of scope. Ralplan may inspect the repository and may write only planning artifacts such as `.omx/context/`, `.omx/plans/`, `.omx/specs/`, and required `.omx/state/` records.
70
+
71
+ The canonical flow is:
72
+
73
+ ```
74
+ $ralplan -> durable consensus artifact -> explicit execution lane -> $ultragoal | $team | $ralph
75
+ ```
76
+
77
+ Before any execution lane begins, ralplan must emit terminal planning state (complete, paused, failed, or waiting for input) and the durable handoff record below. Do not continue from consensus planning into direct code edits in the same ralplan session.
78
+
67
79
  ## Durable Consensus Handoff Contract
68
80
 
69
81
  Ralplan is not complete, skippable, or ready for execution merely because `.omx/plans/prd-*.md` and `.omx/plans/test-spec-*.md` exist. Those files are planning artifacts, not consensus evidence.
@@ -57,6 +57,22 @@ requiring a separate linked Ralph launch up front.
57
57
  - **Escalation:** start a separate `omx ralph ...` / `$ralph ...` only when a later manual follow-up still needs a persistent single-owner fix/verification loop.
58
58
  - **Deprecation:** `omx team ralph ...` has been removed. Use plain `omx team ...` for team execution or run `omx ralph ...` separately when you explicitly want a later Ralph loop.
59
59
 
60
+
61
+ ### Team Big Five / ATEM coordination gate
62
+
63
+ `$team` keeps simple independent fan-out lightweight. For isolated tasks (for example per-file sweeps, typo/copy edits, or explicitly independent lanes with no shared files/dependencies), workers use the normal concise protocol: startup ACK, claim-safe task lifecycle, status, verification, and completion evidence.
64
+
65
+ Activate the lightweight Team Big Five + ATEM-inspired coordination layer when the task or task graph has dependencies, shared files/surfaces/contracts, cross-boundary ownership, handoffs, integration/merge work, blocked lanes, or changed assumptions. The protocol is not a separate ceremony; it is a concise boundary checklist:
66
+
67
+ - **Shared mental model / single source of truth:** task JSON, inbox, mailbox, approved handoff, and leader updates are canonical.
68
+ - **Closed-loop communication / ACK-readback handoffs:** acknowledge handoffs with understood scope, affected artifact/path, owner, and next action.
69
+ - **Mutual performance monitoring at boundaries:** check upstream/downstream contracts, shared files, and verification evidence before completion.
70
+ - **Backup/reassignment behavior:** blocked workers report the smallest needed help/reassignment request and continue safe unblocked slices.
71
+ - **Adaptability checkpoints:** changed assumptions, dependencies, or verification results trigger a brief leader-facing update before widening scope.
72
+ - **Team orientation:** workers optimize for the integrated team outcome, not local-optimum-only task summaries; report integration risks, missing tests, and peer impacts.
73
+
74
+ ATEM fit: treat this as agile teamwork support for transition/action/interpersonal moments around boundaries, not as a heavyweight process model. Do not copy provider-specific plugin implementations; keep the protocol in OMX/Codex prompts, inboxes, state, and tests.
75
+
60
76
  ### Team + Ultragoal bridge
61
77
 
62
78
  Use `$ultragoal` for durable leader-owned goal/ledger tracking and `$team` for parallel execution lanes. When Team is launched with an active `.omx/ultragoal/goals.json`, worker inboxes/status may include leader-owned Ultragoal context: `.omx/ultragoal/goals.json`, `.omx/ultragoal/ledger.jsonl`, the active goal id, Codex goal mode, and the `fresh_leader_get_goal_required` checkpoint policy.
@@ -58,6 +58,15 @@ The matrix must include normal-path coverage plus adversarial dynamic e2e scenar
58
58
  - Validate exit codes and output semantics; do not trust success-looking text alone.
59
59
  - Do not delete, rewrite, or mask unrelated user work. Capture dirty-worktree evidence before and after generated harness work.
60
60
 
61
+ ### Temporary Harness Generation Guardrails
62
+
63
+ Generated harnesses are part of the QA evidence chain; until setup succeeds, they are evidence about the harness apparatus, not product behavior.
64
+
65
+ - **Use absolute repo imports for built artifacts.** When a harness runs from `/tmp` or another scratch directory but imports repository code, resolve the repository root explicitly from the verified repo cwd and import built modules with an absolute path or `pathToFileURL(join(repoRoot, "dist", ...)).href`. Never rely on `./dist/...` from the harness file's temporary directory.
66
+ - **Use a safe file writer for JS/TS harness bodies.** Prefer a small Node/Python writer or another non-interpolating file-write mechanism for harness source that contains backticks, `${...}`, shell metacharacters, or prompt-injection strings. If a shell heredoc is unavoidable, quote the delimiter and verify the written file before execution; do not use interpolating heredocs for JavaScript assertions.
67
+ - **Sanitize OMX runtime env for isolated probes.** When the scenario creates a temporary repo/state tree or intentionally checks local isolation, run the probe with `OMX_ROOT` and `OMX_STATE_ROOT` unset (for example `env -u OMX_ROOT -u OMX_STATE_ROOT ...`) so ambient boxed runtime state cannot redirect reads/writes away from the scenario fixture.
68
+ - **Classify harness setup failures separately.** If a generated harness fails before exercising product behavior because of import paths, shell interpolation, environment leakage, or fixture construction, record it as harness debris, fix the harness, and rerun the scenario before declaring a product defect.
69
+
61
70
  ## Cycle Workflow
62
71
 
63
72
  ### Cycle N (Max 5)
@@ -101,6 +101,20 @@ Worker sessions should treat team state + CLI interop as the source of truth.
101
101
  - Do **not** rely on ad-hoc tmux keystrokes as a primary delivery channel.
102
102
  - If a manual trigger arrives (for example `tmux send-keys` nudge), treat it only as a prompt to re-check state and continue through the normal claim-safe lifecycle.
103
103
 
104
+
105
+ ## Team Big Five / ATEM Coordination Gate
106
+
107
+ Keep independent fan-out lightweight: if your task is isolated with no shared files, dependencies, or handoffs, normal startup ACK, claim-safe lifecycle, status, verification, and completion evidence are sufficient.
108
+
109
+ When your inbox/task activates the Team Big Five / ATEM-inspired protocol (dependencies, shared files/surfaces/contracts, handoffs, integration, blocked lanes, or changed assumptions), use this concise boundary checklist:
110
+
111
+ - Shared mental model / single source of truth: treat task JSON, inbox, mailbox, approved handoff, and leader updates as canonical.
112
+ - Closed-loop communication / ACK-readback: acknowledge handoffs with what you understood, affected artifact/path, owner, and next action.
113
+ - Mutual performance monitoring: check boundary contracts, shared files, and verification evidence before completion.
114
+ - Backup/reassignment behavior: if blocked, write blocked status with the smallest needed help/reassignment request and continue any safe unblocked slice.
115
+ - Adaptability checkpoint: changed assumptions, dependencies, or verification results require a brief leader-facing update before widening scope.
116
+ - Team orientation: optimize for the integrated team result; report integration risks, missing tests, and peer impacts instead of local-only success.
117
+
104
118
  ## Shutdown
105
119
 
106
120
  If the lead sends a shutdown request, follow the shutdown inbox instructions exactly, write your shutdown ack file, then exit the Codex session.
@@ -31,7 +31,9 @@ Autopilot must not run a separate broad expansion/planning/execution/QA/validati
31
31
 
32
32
  1. **Phase `deep-interview`** — Socratic requirements clarification gate
33
33
  - Run or resume `$deep-interview` to clarify intent, scope, non-goals, constraints, and decision boundaries.
34
- - Required handoff artifact: a clarified spec or concise requirements summary suitable for `$ralplan`.
34
+ - Deep-interview is a structured question chain, not a one-question gate; `max_rounds` is a cap, not a target.
35
+ - After a user answers an `omx question`, re-score ambiguity against the active profile threshold. Ask another question only when a readiness gate is still unresolved and the answer would materially change execution; otherwise crystallize the spec and hand off.
36
+ - Required handoff artifact: a clarified spec or concise requirements summary suitable for `$ralplan`, including an explicit interview-complete rationale when leaving deep-interview.
35
37
 
36
38
  2. **Phase `ralplan`** — consensus planning gate
37
39
  - Ground the task with pre-context intake and the deep-interview artifact.
@@ -66,12 +68,14 @@ Before Phase `deep-interview` or `ralplan` starts or resumes:
66
68
  1. Derive a task slug from the request.
67
69
  2. Reuse the latest relevant `.omx/context/{slug}-*.md` snapshot when available.
68
70
  3. If none exists, create `.omx/context/{slug}-{timestamp}.md` (UTC `YYYYMMDDTHHMMSSZ`) with:
69
- - task statement
71
+ - activation prompt / task seed
72
+ - original task status (`activation-prompt`, `legacy-unverified`, or `unavailable`)
70
73
  - desired outcome
71
74
  - known facts/evidence
72
75
  - constraints
73
76
  - unknowns/open questions
74
77
  - likely codebase touchpoints
78
+ - a scope note that the seed is the Autopilot activation prompt, not guaranteed prior conversation context
75
79
  4. If brownfield facts are missing, run `explore` first before or during `$deep-interview` (`$deep-interview --quick <task>` remains acceptable for bounded low-ambiguity intake); do not skip the clarification gate merely because the task sounds actionable.
76
80
  5. Carry the snapshot path in Autopilot state and all handoff artifacts.
77
81
  </Pre-context Intake>
@@ -91,6 +95,8 @@ Before Phase `deep-interview` or `ralplan` starts or resumes:
91
95
  <State_Management>
92
96
  Use the CLI-first state surface (`omx state ... --json`) for Autopilot lifecycle state. State must be session-aware when a session id exists. If the explicit MCP compatibility surface is already available, equivalent `omx_state` tool calls remain acceptable but are not required.
93
97
 
98
+ Inside active Autopilot, named child phases such as `$ralplan` are supervised phases, not peer workflow activations: keep `mode:"autopilot"` active and update `current_phase:"ralplan"` rather than starting standalone `mode:"ralplan"` over Autopilot.
99
+
94
100
  Required fields:
95
101
 
96
102
  ```json
@@ -126,12 +132,12 @@ Required fields:
126
132
  ```
127
133
 
128
134
  - **On start**: `omx state write --input '{"mode":"autopilot","active":true,"current_phase":"deep-interview","iteration":1,"review_cycle":0,"state":{"phase_cycle":["deep-interview","ralplan","ultragoal","code-review","ultraqa"],"handoff_artifacts":{"context_snapshot_path":"<snapshot-path>","deep_interview":null,"ralplan":null,"ralplan_consensus_gate":{"required":true,"sequence":["architect-review","critic-review"],"planning_artifacts_are_not_consensus":true,"required_review_roles":["architect","critic"],"ralplan_architect_review":null,"ralplan_critic_review":null,"complete":false},"ultragoal":null,"code_review":null,"ultraqa":null},"review_verdict":null,"qa_verdict":null,"return_to_ralplan_reason":null}}' --json`
129
- - **On deep-interview -> ralplan**: set `current_phase:"ralplan"`, persist the clarified spec/requirements under `handoff_artifacts.deep_interview`.
130
- - **On ralplan -> ultragoal**: only after `ralplan_consensus_gate.complete:true`, with `ralplan_architect_review.agent_role:"architect"` and `ralplan_architect_review.verdict:"approve"` recorded before `ralplan_critic_review.agent_role:"critic"` and `ralplan_critic_review.verdict:"approve"`; set `current_phase:"ultragoal"` and persist the plan/test-spec paths under `handoff_artifacts.ralplan`.
135
+ - **On deep-interview -> ralplan**: only after a separate gate proves the interview chain is explicitly complete or the user explicitly authorized a skip. For completion, persist `deep_interview_gate:{"status":"complete","rationale":"<why requirements are complete>","handoff_summary":"<summary>"}` (or equivalent non-empty rationale/summary) plus the clarified spec/requirements under `handoff_artifacts.deep_interview`; if a final `omx question` was involved, keep its same-session answered record linked by `question_id`/`satisfied_at`. For skip, persist `deep_interview_gate:{"status":"skipped","skip_authorized_by_user":true,"skip_reason":"<user-authorized reason>","skipped_at":"<timestamp>","source":"user","session_id":"<session>"}`. Do not leave deep-interview merely because the first `omx question` was answered or cleared.
136
+ - **On ralplan -> ultragoal**: only after `ralplan_consensus_gate.complete:true`, with tracker-backed native-subagent `ralplan_architect_review.agent_role:"architect"` and `ralplan_architect_review.verdict:"approve"` recorded before tracker-backed native-subagent `ralplan_critic_review.agent_role:"critic"` and `ralplan_critic_review.verdict:"approve"`; `codex_exec` or artifact-only approvals are trace evidence but not native lane proof. Set `current_phase:"ultragoal"` and persist the plan/test-spec paths under `handoff_artifacts.ralplan`.
131
137
  - **On missing ralplan consensus evidence**: keep `current_phase:"ralplan"`, persist `ralplan_consensus_gate.complete:false` with `blocked_reason`, and report an explicit blocker or max-iteration outcome instead of handing off to execution.
132
138
  - **On ultragoal -> code-review**: set `current_phase:"code-review"`, persist implementation/test/ledger evidence under `handoff_artifacts.ultragoal`.
133
- - **On code-review -> ultraqa**: set `current_phase:"ultraqa"`, persist the clean review under `handoff_artifacts.code_review`.
134
- - **On clean review + passed/skipped QA**: set `active:false`, `current_phase:"complete"`, persist `review_verdict:{recommendation:"APPROVE", architectural_status:"CLEAR", clean:true}`, `qa_verdict:{clean:true, skipped:<boolean>, reason:<string|null>}`, and `completed_at`.
139
+ - **On code-review -> ultraqa**: set `current_phase:"ultraqa"` only after a real `$code-review` stage/subagent has produced durable evidence; persist the clean review under `handoff_artifacts.code_review` with its source thread/tool/stage reference. Do not author `review_verdict:{clean:true}` from the leader's own summary.
140
+ - **On clean review + passed/skipped QA**: set `active:false`, `current_phase:"complete"`, persist `review_verdict:{recommendation:"APPROVE", architectural_status:"CLEAR", clean:true}`, `qa_verdict:{clean:true, skipped:<boolean>, reason:<string|null>}`, and `completed_at` only when both gates have durable source evidence. Required evidence is either (a) actual `$code-review`/`$ultraqa` stage or native-subagent/thread/tool records, or (b) for QA only, an explicit persisted skip reason for a documented docs-only/trivially non-runtime condition. If that evidence is missing, keep the active phase at `code-review` or `ultraqa` and record a blocker instead of self-attesting a clean gate.
135
141
  - **On non-clean review or failed QA**: increment `iteration` and `review_cycle`, set `current_phase:"ralplan"`, persist `review_verdict` or `qa_verdict`, persist the phase handoff, and set `return_to_ralplan_reason` to a concise findings-driven reason.
136
142
  - **Legacy Ralph state**: if a user explicitly selected the legacy Ralph execution lane, phase names and handoff keys may include `ralph`; preserve and resume them rather than rewriting history to Ultragoal.
137
143
  - **On cancellation**: run `$cancel`; preserve progress for resume rather than deleting handoff artifacts.
@@ -175,6 +181,7 @@ Pipeline state should use `current_phase` values that match the same phase names
175
181
  - [ ] `$team` was used only if the active Ultragoal story needed coordinated parallel work, or explicitly recorded as not needed
176
182
  - [ ] Phase `code-review` returned a clean verdict (`APPROVE` + `CLEAR`)
177
183
  - [ ] Phase `ultraqa` passed, or was explicitly skipped because the change was docs-only/trivially non-runtime with evidence
184
+ - [ ] Clean `review_verdict` cites durable source evidence from a real `$code-review` stage/subagent/thread/tool record; `qa_verdict` cites durable `$ultraqa` evidence or an explicit persisted low-risk skip reason; leader-authored summaries alone are not gate evidence
178
185
  - [ ] `review_verdict.clean` is true, `qa_verdict.clean` is true, and `return_to_ralplan_reason` is null
179
186
  - [ ] Tests/build/lint/typecheck evidence from Ultragoal is available in handoff artifacts
180
187
  - [ ] Autopilot state is marked `complete` or cancellation state is preserved coherently
@@ -31,7 +31,7 @@ Delegates to the `code-reviewer` and `architect` agents in parallel for a two-la
31
31
  2. **Launch Parallel Review Lanes**
32
32
  - **`code-reviewer` lane** - owns spec compliance, security, code quality, performance, and maintainability findings
33
33
  - **`architect` lane** - owns the devil's-advocate / design-tradeoff perspective
34
- - Both lanes run in parallel and produce distinct outputs before final synthesis
34
+ - Both lanes run in parallel on a clean context with explicit scope and artifacts, and produce distinct outputs before final synthesis
35
35
  - If either lane cannot be launched or does not return evidence, report `independent review unavailable`; do **not** substitute the current/authoring lane, and do **not** approve or mark the review merge-ready.
36
36
 
37
37
  3. **Review Categories**
@@ -72,9 +72,9 @@ Delegates to the `code-reviewer` and `architect` agents in parallel for a two-la
72
72
  Do not self-review as a fallback. If the `code-reviewer` or `architect` agent path is missing, unavailable, skipped, or fails, emit a clear unavailable-review result and block approval until the independent lane evidence exists.
73
73
 
74
74
  ```
75
- delegate(
76
- role="code-reviewer",
77
- tier="THOROUGH",
75
+ task(
76
+ agent_type="code-reviewer",
77
+ reasoning_effort="xhigh",
78
78
  prompt="CODE REVIEW TASK
79
79
 
80
80
  Review code changes for quality, security, and maintainability.
@@ -98,9 +98,9 @@ Output: Code review report with:
98
98
  - Approval recommendation (APPROVE / REQUEST CHANGES / COMMENT)"
99
99
  )
100
100
 
101
- delegate(
102
- role="architect",
103
- tier="THOROUGH",
101
+ task(
102
+ agent_type="architect",
103
+ reasoning_effort="xhigh",
104
104
  prompt="ARCHITECTURE / DEVIL'S-ADVOCATE REVIEW TASK
105
105
 
106
106
  Review the same code changes from the architecture/tradeoff perspective.
@@ -32,6 +32,8 @@ Execution quality is usually bottlenecked by intent clarity, not just missing im
32
32
  - **Deep (`--deep`)**: high-rigor exploration; target threshold `<= 0.15`; max rounds 20
33
33
  - **Autoresearch (`--autoresearch`)**: same interview rigor as Standard, but specialized for `$autoresearch` mission readiness and `.omx/specs/` artifact handoff
34
34
 
35
+ Profile `max rounds` is a hard cap, not a target. Do not continue only to reach a numbered round count. Extra Socratic rigor does not override the active threshold unless the profile/config changes.
36
+
35
37
  If no flag is provided, use **Standard**.
36
38
 
37
39
  <Mode_Flags>
@@ -70,6 +72,8 @@ If no flag is provided, use **Standard**.
70
72
  - Treat `answers[]` as the primary `omx question` success contract. For a single interview round, read `answers[0].answer`; use legacy top-level `answer` only as a compatibility fallback when needed.
71
73
  - If the current runtime is outside tmux and cannot render `omx question`, use the native structured question tool when available; otherwise ask exactly one concise plain-text question and wait for the answer
72
74
  - Re-score ambiguity after each answer and show progress transparently
75
+ - Once ambiguity is at or below the active profile threshold, stop ordinary questioning. Run the practical closure audit: crystallize/handoff when readiness gates pass; otherwise ask only the final closure question needed to satisfy a named gate.
76
+ - Treat `max_rounds` as a stop cap, not evidence that more rounds are needed.
73
77
  - Do not hand off to execution while ambiguity remains above threshold unless user explicitly opts to proceed with warning
74
78
  - Do not crystallize or hand off while `Non-goals` or `Decision Boundaries` remain unresolved, even if the weighted ambiguity threshold is met
75
79
  - Treat early exit as a safety valve, not the default success path
@@ -130,7 +134,7 @@ If no flag is provided, use **Standard**.
130
134
 
131
135
  ## Phase 2: Socratic Interview Loop
132
136
 
133
- Repeat until ambiguity `<= threshold`, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached.
137
+ Repeat until ambiguity `<= threshold`, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached. This is a stop condition: below threshold, do not open a new ordinary interview branch.
134
138
 
135
139
  ### 2a) Generate next question
136
140
  If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to `$ralplan`, `$autopilot`, `$ralph`, or `$team` until that summary answer is captured.
@@ -280,8 +284,9 @@ Readiness gate:
280
284
  - `Decision Boundaries` must be explicit
281
285
  - A pressure pass must be complete: at least one earlier answer has been revisited with an evidence, assumption, or tradeoff follow-up
282
286
  - A practical closure audit must pass: another question would change execution materially, not merely polish wording or chase a narrow edge case
283
- - If either gate is unresolved, or the pressure pass is incomplete, continue interviewing even when weighted ambiguity is below threshold
284
- - Treat a low ambiguity score as permission to audit closure, not permission to keep drilling indefinitely. If remaining uncertainty would not change implementation, crystallize the spec or ask a final closure question instead of opening a new branch.
287
+ - If either gate is unresolved, or the pressure pass is incomplete, continue below threshold only with a final closure question that names the unresolved gate and would materially change execution.
288
+ - Treat a low ambiguity score as permission to audit closure, not permission to keep drilling indefinitely. If remaining uncertainty would not change implementation, crystallize the spec instead of opening a new branch.
289
+ - If ambiguity is `<= 0.10`, another user-facing question is allowed only as that final closure question; otherwise crystallize immediately.
285
290
 
286
291
  ### 2d) Report progress
287
292
  Show weighted breakdown table, readiness-gate status (`Non-goals`, `Decision Boundaries`), and the next focus dimension.
@@ -294,7 +299,7 @@ Append round result and updated scores via `omx state write --input '<json>' --j
294
299
  - Apply a **Dialectic Rhythm Guard**: track consecutive non-user fact discoveries and confirmation-style answers (`[from-code][auto-confirmed]`, `[from-code]`, or `[from-research]`). After 3 consecutive non-user or confirmation answers, the next material user-facing round must solicit direct human judgment (`[from-user]`) unless the closure audit says the interview is ready to crystallize.
295
300
  - Round 4+: allow explicit early exit with risk warning
296
301
  - Soft warning at profile midpoint (e.g., round 3/6/10 depending on profile)
297
- - Hard cap at profile `max_rounds`
302
+ - Hard cap at profile `max_rounds`; never treat this cap as a desired interview length or quota
298
303
 
299
304
  ## Phase 3: Challenge Modes (assumption stress tests)
300
305
 
@@ -26,14 +26,14 @@ Ralph is a persistence loop that keeps working on a task until it is fully compl
26
26
  </Do_Not_Use_When>
27
27
 
28
28
  <Why_This_Exists>
29
- Complex tasks often fail silently: partial implementations get declared "done", tests get skipped, edge cases get forgotten. Ralph prevents this by looping until work is genuinely complete, requiring fresh verification evidence before allowing completion, and using tiered architect review to confirm quality.
29
+ Complex tasks often fail silently: partial implementations get declared "done", tests get skipped, edge cases get forgotten. Ralph prevents this by looping until work is genuinely complete, requiring fresh verification evidence before allowing completion, and using explicit architect native-subagent verification to confirm quality.
30
30
  </Why_This_Exists>
31
31
 
32
32
  <Execution_Policy>
33
33
  - Fire independent agent calls simultaneously -- never wait sequentially for independent work
34
34
  - Use `run_in_background: true` for long operations (installs, builds, test suites)
35
- - Always pass the `model` parameter explicitly when delegating to agents
36
- - Read `docs/shared/agent-tiers.md` before first delegation to select correct agent tiers
35
+ - Always set `agent_type` when spawning native subagents; use `reasoning_effort` for per-dispatch intensity when needed
36
+ - Preserve legacy Ralph tier intent through native reasoning effort: LOW -> `low`, STANDARD -> `medium`, THOROUGH -> `xhigh`
37
37
  - Deliver the full implementation: no scope reduction, no partial completion, no deleting tests to make them pass
38
38
  - Apply the shared workflow guidance pattern: outcome-first framing, concise visible updates for multi-step execution, local overrides for the active workflow branch, validation proportional to risk, explicit stop rules, and automatic continuation for safe reversible steps. Ask only for material, destructive, credentialed, external-production, or preference-dependent branches.
39
39
  - Integrate with Codex goal mode when goal tools are available: inspect the active thread goal with `get_goal`, preserve it as the top-level stop condition, and only call `update_goal({status: "complete"})` after a Ralph completion audit proves the objective is actually achieved.
@@ -54,10 +54,10 @@ Complex tasks often fail silently: partial implementations get declared "done",
54
54
  - Do not begin Ralph execution work (delegation, implementation, or verification loops) until snapshot grounding exists. If forced to proceed quickly, note explicit risk tradeoffs.
55
55
  1. **Review progress**: Check TODO list and any prior iteration state
56
56
  2. **Continue from where you left off**: Pick up incomplete tasks
57
- 3. **Delegate in parallel**: Route tasks to specialist agents at appropriate tiers
58
- - Simple lookups: LOW tier -- "What does this function return?"
59
- - Standard work: STANDARD tier -- "Add error handling to this module"
60
- - Complex analysis: THOROUGH tier -- "Debug this race condition"
57
+ 3. **Delegate in parallel**: Route tasks to specialist native agents with explicit `agent_type` and appropriate `reasoning_effort`
58
+ - Simple lookups: `reasoning_effort="low"` -- "What does this function return?"
59
+ - Standard work: `reasoning_effort="medium"` -- "Add error handling to this module"
60
+ - Complex analysis: `reasoning_effort="xhigh"` -- "Debug this race condition"
61
61
  - When Ralph is entered as a ralplan follow-up, start from the approved **available-agent-types roster** and make the delegation plan explicit: implementation lane, evidence/regression lane, and final sign-off lane using only known agent types
62
62
  4. **Run long operations in background**: Builds, installs, test suites use `run_in_background: true`
63
63
  5. **Visual task gate (when screenshot/reference images are present)**:
@@ -72,11 +72,11 @@ Complex tasks often fail silently: partial implementations get declared "done",
72
72
  b. Run verification (test, build, lint)
73
73
  c. Read the output -- confirm it actually passed
74
74
  d. Check: zero pending/in_progress TODO items
75
- 7. **Architect verification** (tiered):
76
- - <5 files, <100 lines with full tests: STANDARD tier minimum (architect role)
77
- - Standard changes: STANDARD tier (architect role)
78
- - >20 files or security/architectural changes: THOROUGH tier (architect role)
79
- - Ralph floor: always at least STANDARD, even for small changes
75
+ 7. **Architect verification** (native role):
76
+ - <5 files, <100 lines with full tests: `task(agent_type="architect", reasoning_effort="medium", prompt="...")` minimum
77
+ - Standard changes: `task(agent_type="architect", reasoning_effort="medium", prompt="...")`
78
+ - >20 files or security/architectural changes: `task(agent_type="architect", reasoning_effort="xhigh", prompt="...")`
79
+ - Ralph floor: always run an explicit `architect` native subagent, even for small changes
80
80
  7.5 **Mandatory Deslop Pass**:
81
81
  - After Step 7 passes, run `oh-my-codex:ai-slop-cleaner` on **all files changed during the Ralph session**.
82
82
  - Scope the cleaner to **changed files only**; do not widen the pass beyond Ralph-owned edits.
@@ -87,7 +87,7 @@ Complex tasks often fail silently: partial implementations get declared "done",
87
87
  - If post-deslop regression fails, roll back cleaner changes or fix and retry. Then rerun Step 7.5 and Step 7.6 until the regression is green.
88
88
  - Do not proceed to completion until post-deslop regression is green (unless `--no-deslop` explicitly skipped the deslop pass).
89
89
  8. **On approval**: If Codex goal mode is active, call `update_goal({status: "complete"})` before `/cancel`; report final elapsed time and token-budget usage when the tool returns it. Then run `/cancel` to cleanly exit and clean up all state files.
90
- 9. **On rejection**: Fix the issues raised, then re-verify at the same tier
90
+ 9. **On rejection**: Fix the issues raised, then re-verify with the same `agent_type` and `reasoning_effort` profile
91
91
  </Steps>
92
92
 
93
93
  <Tool_Usage>
@@ -150,11 +150,11 @@ Use the CLI-first state surface for Ralph lifecycle state (`omx state write/read
150
150
  <Good>
151
151
  Correct parallel delegation:
152
152
  ```
153
- delegate(role="executor", tier="LOW", task="Add type export for UserConfig")
154
- delegate(role="executor", tier="STANDARD", task="Implement the caching layer for API responses")
155
- delegate(role="executor", tier="THOROUGH", task="Refactor auth module to support OAuth2 flow")
153
+ task(agent_type="executor", reasoning_effort="low", prompt="Add type export for UserConfig")
154
+ task(agent_type="executor", reasoning_effort="medium", prompt="Implement the caching layer for API responses")
155
+ task(agent_type="executor", reasoning_effort="xhigh", prompt="Refactor auth module to support OAuth2 flow")
156
156
  ```
157
- Why good: Three independent tasks fired simultaneously at appropriate tiers.
157
+ Why good: Three independent tasks fired simultaneously while explicitly selecting the installed `executor` native role, so the UI/tracker does not show default subagents; legacy tier intent is preserved through native reasoning effort (`LOW` -> `low`, `STANDARD` -> `medium`, `THOROUGH` -> `xhigh`).
158
158
  </Good>
159
159
 
160
160
  <Good>
@@ -163,7 +163,7 @@ Correct verification before completion:
163
163
  1. Run: npm test → Output: "42 passed, 0 failed"
164
164
  2. Run: npm run build → Output: "Build succeeded"
165
165
  3. Run: lsp_diagnostics → Output: 0 errors
166
- 4. Delegate to architect at STANDARD tier → Verdict: "APPROVED"
166
+ 4. task(agent_type="architect", reasoning_effort="medium", prompt="verify completion") → Verdict: "APPROVED"
167
167
  5. Run /cancel
168
168
  ```
169
169
  Why good: Fresh evidence at each step, architect verification, then clean exit.
@@ -178,9 +178,9 @@ Why bad: Uses "should" and "look good" -- no fresh test/build output, no archite
178
178
  <Bad>
179
179
  Sequential execution of independent tasks:
180
180
  ```
181
- delegate(executor, LOW, "Add type export") → wait →
182
- delegate(executor, STANDARD, "Implement caching") → wait →
183
- delegate(executor, THOROUGH, "Refactor auth")
181
+ task(agent_type="executor", reasoning_effort="low", prompt="Add type export") → wait →
182
+ task(agent_type="executor", reasoning_effort="medium", prompt="Implement caching") → wait →
183
+ task(agent_type="executor", reasoning_effort="xhigh", prompt="Refactor auth")
184
184
  ```
185
185
  Why bad: These are independent tasks that should run in parallel, not sequentially.
186
186
  </Bad>
@@ -200,7 +200,7 @@ Why bad: These are independent tasks that should run in parallel, not sequential
200
200
  - [ ] Fresh test run output shows all tests pass
201
201
  - [ ] Fresh build output shows success
202
202
  - [ ] lsp_diagnostics shows 0 errors on affected files
203
- - [ ] Architect verification passed (STANDARD tier minimum)
203
+ - [ ] Architect verification passed through explicit `task(agent_type="architect", reasoning_effort="medium"...)` minimum
204
204
  - [ ] Codex goal-mode completion audit passed, and `update_goal({status: "complete"})` was called when an active goal exists
205
205
  - [ ] ai-slop-cleaner pass completed on changed files (or --no-deslop specified)
206
206
  - [ ] Post-deslop regression tests pass
@@ -64,6 +64,18 @@ The consensus workflow:
64
64
 
65
65
  > **Important:** Steps 3 and 4 MUST run sequentially as role-specific subagents. Do NOT issue both agent calls in the same parallel batch. Always await the subsequent `Architect` result before invoking the subsequent `Critic`; only a completed, role-specific `Critic` approval can satisfy the durable gate.
66
66
 
67
+ ## Planning/Execution Boundary
68
+
69
+ `$ralplan` is a planning mode. While ralplan is active and no explicit execution handoff is active, implementation-focused write tools are out of scope. Ralplan may inspect the repository and may write only planning artifacts such as `.omx/context/`, `.omx/plans/`, `.omx/specs/`, and required `.omx/state/` records.
70
+
71
+ The canonical flow is:
72
+
73
+ ```
74
+ $ralplan -> durable consensus artifact -> explicit execution lane -> $ultragoal | $team | $ralph
75
+ ```
76
+
77
+ Before any execution lane begins, ralplan must emit terminal planning state (complete, paused, failed, or waiting for input) and the durable handoff record below. Do not continue from consensus planning into direct code edits in the same ralplan session.
78
+
67
79
  ## Durable Consensus Handoff Contract
68
80
 
69
81
  Ralplan is not complete, skippable, or ready for execution merely because `.omx/plans/prd-*.md` and `.omx/plans/test-spec-*.md` exist. Those files are planning artifacts, not consensus evidence.
@@ -57,6 +57,22 @@ requiring a separate linked Ralph launch up front.
57
57
  - **Escalation:** start a separate `omx ralph ...` / `$ralph ...` only when a later manual follow-up still needs a persistent single-owner fix/verification loop.
58
58
  - **Deprecation:** `omx team ralph ...` has been removed. Use plain `omx team ...` for team execution or run `omx ralph ...` separately when you explicitly want a later Ralph loop.
59
59
 
60
+
61
+ ### Team Big Five / ATEM coordination gate
62
+
63
+ `$team` keeps simple independent fan-out lightweight. For isolated tasks (for example per-file sweeps, typo/copy edits, or explicitly independent lanes with no shared files/dependencies), workers use the normal concise protocol: startup ACK, claim-safe task lifecycle, status, verification, and completion evidence.
64
+
65
+ Activate the lightweight Team Big Five + ATEM-inspired coordination layer when the task or task graph has dependencies, shared files/surfaces/contracts, cross-boundary ownership, handoffs, integration/merge work, blocked lanes, or changed assumptions. The protocol is not a separate ceremony; it is a concise boundary checklist:
66
+
67
+ - **Shared mental model / single source of truth:** task JSON, inbox, mailbox, approved handoff, and leader updates are canonical.
68
+ - **Closed-loop communication / ACK-readback handoffs:** acknowledge handoffs with understood scope, affected artifact/path, owner, and next action.
69
+ - **Mutual performance monitoring at boundaries:** check upstream/downstream contracts, shared files, and verification evidence before completion.
70
+ - **Backup/reassignment behavior:** blocked workers report the smallest needed help/reassignment request and continue safe unblocked slices.
71
+ - **Adaptability checkpoints:** changed assumptions, dependencies, or verification results trigger a brief leader-facing update before widening scope.
72
+ - **Team orientation:** workers optimize for the integrated team outcome, not local-optimum-only task summaries; report integration risks, missing tests, and peer impacts.
73
+
74
+ ATEM fit: treat this as agile teamwork support for transition/action/interpersonal moments around boundaries, not as a heavyweight process model. Do not copy provider-specific plugin implementations; keep the protocol in OMX/Codex prompts, inboxes, state, and tests.
75
+
60
76
  ### Team + Ultragoal bridge
61
77
 
62
78
  Use `$ultragoal` for durable leader-owned goal/ledger tracking and `$team` for parallel execution lanes. When Team is launched with an active `.omx/ultragoal/goals.json`, worker inboxes/status may include leader-owned Ultragoal context: `.omx/ultragoal/goals.json`, `.omx/ultragoal/ledger.jsonl`, the active goal id, Codex goal mode, and the `fresh_leader_get_goal_required` checkpoint policy.
@@ -58,6 +58,15 @@ The matrix must include normal-path coverage plus adversarial dynamic e2e scenar
58
58
  - Validate exit codes and output semantics; do not trust success-looking text alone.
59
59
  - Do not delete, rewrite, or mask unrelated user work. Capture dirty-worktree evidence before and after generated harness work.
60
60
 
61
+ ### Temporary Harness Generation Guardrails
62
+
63
+ Generated harnesses are part of the QA evidence chain; until setup succeeds, they are evidence about the harness apparatus, not product behavior.
64
+
65
+ - **Use absolute repo imports for built artifacts.** When a harness runs from `/tmp` or another scratch directory but imports repository code, resolve the repository root explicitly from the verified repo cwd and import built modules with an absolute path or `pathToFileURL(join(repoRoot, "dist", ...)).href`. Never rely on `./dist/...` from the harness file's temporary directory.
66
+ - **Use a safe file writer for JS/TS harness bodies.** Prefer a small Node/Python writer or another non-interpolating file-write mechanism for harness source that contains backticks, `${...}`, shell metacharacters, or prompt-injection strings. If a shell heredoc is unavoidable, quote the delimiter and verify the written file before execution; do not use interpolating heredocs for JavaScript assertions.
67
+ - **Sanitize OMX runtime env for isolated probes.** When the scenario creates a temporary repo/state tree or intentionally checks local isolation, run the probe with `OMX_ROOT` and `OMX_STATE_ROOT` unset (for example `env -u OMX_ROOT -u OMX_STATE_ROOT ...`) so ambient boxed runtime state cannot redirect reads/writes away from the scenario fixture.
68
+ - **Classify harness setup failures separately.** If a generated harness fails before exercising product behavior because of import paths, shell interpolation, environment leakage, or fixture construction, record it as harness debris, fix the harness, and rerun the scenario before declaring a product defect.
69
+
61
70
  ## Cycle Workflow
62
71
 
63
72
  ### Cycle N (Max 5)
@@ -101,6 +101,20 @@ Worker sessions should treat team state + CLI interop as the source of truth.
101
101
  - Do **not** rely on ad-hoc tmux keystrokes as a primary delivery channel.
102
102
  - If a manual trigger arrives (for example `tmux send-keys` nudge), treat it only as a prompt to re-check state and continue through the normal claim-safe lifecycle.
103
103
 
104
+
105
+ ## Team Big Five / ATEM Coordination Gate
106
+
107
+ Keep independent fan-out lightweight: if your task is isolated with no shared files, dependencies, or handoffs, normal startup ACK, claim-safe lifecycle, status, verification, and completion evidence are sufficient.
108
+
109
+ When your inbox/task activates the Team Big Five / ATEM-inspired protocol (dependencies, shared files/surfaces/contracts, handoffs, integration, blocked lanes, or changed assumptions), use this concise boundary checklist:
110
+
111
+ - Shared mental model / single source of truth: treat task JSON, inbox, mailbox, approved handoff, and leader updates as canonical.
112
+ - Closed-loop communication / ACK-readback: acknowledge handoffs with what you understood, affected artifact/path, owner, and next action.
113
+ - Mutual performance monitoring: check boundary contracts, shared files, and verification evidence before completion.
114
+ - Backup/reassignment behavior: if blocked, write blocked status with the smallest needed help/reassignment request and continue any safe unblocked slice.
115
+ - Adaptability checkpoint: changed assumptions, dependencies, or verification results require a brief leader-facing update before widening scope.
116
+ - Team orientation: optimize for the integrated team result; report integration risks, missing tests, and peer impacts instead of local-only success.
117
+
104
118
  ## Shutdown
105
119
 
106
120
  If the lead sends a shutdown request, follow the shutdown inbox instructions exactly, write your shutdown ack file, then exit the Codex session.