oh-my-codex 0.18.6 → 0.18.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/Cargo.lock +6 -6
- package/Cargo.toml +1 -1
- package/README.md +59 -10
- package/crates/omx-sparkshell/tests/execution.rs +1 -1
- package/dist/agents/__tests__/definitions.test.js +11 -0
- package/dist/agents/__tests__/definitions.test.js.map +1 -1
- package/dist/agents/__tests__/native-config.test.js +56 -6
- package/dist/agents/__tests__/native-config.test.js.map +1 -1
- package/dist/agents/definitions.d.ts +10 -0
- package/dist/agents/definitions.d.ts.map +1 -1
- package/dist/agents/definitions.js +5 -1
- package/dist/agents/definitions.js.map +1 -1
- package/dist/agents/native-config.d.ts +5 -1
- package/dist/agents/native-config.d.ts.map +1 -1
- package/dist/agents/native-config.js +19 -4
- package/dist/agents/native-config.js.map +1 -1
- package/dist/autopilot/__tests__/fsm.test.d.ts +2 -0
- package/dist/autopilot/__tests__/fsm.test.d.ts.map +1 -0
- package/dist/autopilot/__tests__/fsm.test.js +75 -0
- package/dist/autopilot/__tests__/fsm.test.js.map +1 -0
- package/dist/autopilot/__tests__/ralplan-gate.test.d.ts +2 -0
- package/dist/autopilot/__tests__/ralplan-gate.test.d.ts.map +1 -0
- package/dist/autopilot/__tests__/ralplan-gate.test.js +79 -0
- package/dist/autopilot/__tests__/ralplan-gate.test.js.map +1 -0
- package/dist/autopilot/deep-interview-gate.d.ts +18 -0
- package/dist/autopilot/deep-interview-gate.d.ts.map +1 -0
- package/dist/autopilot/deep-interview-gate.js +256 -0
- package/dist/autopilot/deep-interview-gate.js.map +1 -0
- package/dist/autopilot/fsm.d.ts +13 -0
- package/dist/autopilot/fsm.d.ts.map +1 -0
- package/dist/autopilot/fsm.js +70 -0
- package/dist/autopilot/fsm.js.map +1 -0
- package/dist/autopilot/ralplan-gate.d.ts +17 -0
- package/dist/autopilot/ralplan-gate.d.ts.map +1 -0
- package/dist/autopilot/ralplan-gate.js +61 -0
- package/dist/autopilot/ralplan-gate.js.map +1 -0
- package/dist/cli/__tests__/codex-plugin-layout.test.js +512 -1
- package/dist/cli/__tests__/codex-plugin-layout.test.js.map +1 -1
- package/dist/cli/__tests__/doctor-warning-copy.test.js +39 -0
- package/dist/cli/__tests__/doctor-warning-copy.test.js.map +1 -1
- package/dist/cli/__tests__/index.test.js +83 -7
- package/dist/cli/__tests__/index.test.js.map +1 -1
- package/dist/cli/__tests__/launch-fallback.test.js +175 -6
- package/dist/cli/__tests__/launch-fallback.test.js.map +1 -1
- package/dist/cli/__tests__/package-bin-contract.test.js +8 -4
- package/dist/cli/__tests__/package-bin-contract.test.js.map +1 -1
- package/dist/cli/__tests__/question.test.js +100 -0
- package/dist/cli/__tests__/question.test.js.map +1 -1
- package/dist/cli/__tests__/ralph-goal-mode-contract.test.js +13 -0
- package/dist/cli/__tests__/ralph-goal-mode-contract.test.js.map +1 -1
- package/dist/cli/__tests__/ralph.test.js +14 -0
- package/dist/cli/__tests__/ralph.test.js.map +1 -1
- package/dist/cli/__tests__/setup-install-mode.test.js +89 -0
- package/dist/cli/__tests__/setup-install-mode.test.js.map +1 -1
- package/dist/cli/__tests__/setup-refresh.test.js +83 -0
- package/dist/cli/__tests__/setup-refresh.test.js.map +1 -1
- package/dist/cli/__tests__/state.test.js +21 -0
- package/dist/cli/__tests__/state.test.js.map +1 -1
- package/dist/cli/__tests__/team.test.js +2 -2
- package/dist/cli/__tests__/team.test.js.map +1 -1
- package/dist/cli/__tests__/update.test.js +110 -2
- package/dist/cli/__tests__/update.test.js.map +1 -1
- package/dist/cli/doctor.d.ts.map +1 -1
- package/dist/cli/doctor.js +8 -1
- package/dist/cli/doctor.js.map +1 -1
- package/dist/cli/index.d.ts +14 -3
- package/dist/cli/index.d.ts.map +1 -1
- package/dist/cli/index.js +298 -50
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/plugin-marketplace.d.ts +14 -2
- package/dist/cli/plugin-marketplace.d.ts.map +1 -1
- package/dist/cli/plugin-marketplace.js +62 -15
- package/dist/cli/plugin-marketplace.js.map +1 -1
- package/dist/cli/question.d.ts.map +1 -1
- package/dist/cli/question.js +36 -5
- package/dist/cli/question.js.map +1 -1
- package/dist/cli/ralph.d.ts.map +1 -1
- package/dist/cli/ralph.js +3 -1
- package/dist/cli/ralph.js.map +1 -1
- package/dist/cli/setup-preferences.d.ts +2 -0
- package/dist/cli/setup-preferences.d.ts.map +1 -1
- package/dist/cli/setup-preferences.js +4 -0
- package/dist/cli/setup-preferences.js.map +1 -1
- package/dist/cli/setup.d.ts +3 -0
- package/dist/cli/setup.d.ts.map +1 -1
- package/dist/cli/setup.js +166 -27
- package/dist/cli/setup.js.map +1 -1
- package/dist/cli/state.d.ts.map +1 -1
- package/dist/cli/state.js +8 -1
- package/dist/cli/state.js.map +1 -1
- package/dist/cli/tmux-hook.d.ts.map +1 -1
- package/dist/cli/tmux-hook.js +16 -0
- package/dist/cli/tmux-hook.js.map +1 -1
- package/dist/cli/update.d.ts +2 -0
- package/dist/cli/update.d.ts.map +1 -1
- package/dist/cli/update.js +47 -3
- package/dist/cli/update.js.map +1 -1
- package/dist/config/__tests__/deep-interview.test.js +7 -6
- package/dist/config/__tests__/deep-interview.test.js.map +1 -1
- package/dist/config/__tests__/generator-notify.test.js +1 -0
- package/dist/config/__tests__/generator-notify.test.js.map +1 -1
- package/dist/config/deep-interview.d.ts.map +1 -1
- package/dist/config/deep-interview.js +14 -4
- package/dist/config/deep-interview.js.map +1 -1
- package/dist/config/generator.d.ts +2 -2
- package/dist/config/generator.d.ts.map +1 -1
- package/dist/config/generator.js +2 -2
- package/dist/config/generator.js.map +1 -1
- package/dist/config/team-mode.d.ts +12 -0
- package/dist/config/team-mode.d.ts.map +1 -0
- package/dist/config/team-mode.js +91 -0
- package/dist/config/team-mode.js.map +1 -0
- package/dist/hooks/__tests__/agents-overlay.test.js +88 -0
- package/dist/hooks/__tests__/agents-overlay.test.js.map +1 -1
- package/dist/hooks/__tests__/autopilot-skill-contract.test.js +8 -0
- package/dist/hooks/__tests__/autopilot-skill-contract.test.js.map +1 -1
- package/dist/hooks/__tests__/code-review-skill-contract.test.js +8 -0
- package/dist/hooks/__tests__/code-review-skill-contract.test.js.map +1 -1
- package/dist/hooks/__tests__/deep-interview-contract.test.js +10 -0
- package/dist/hooks/__tests__/deep-interview-contract.test.js.map +1 -1
- package/dist/hooks/__tests__/keyword-detector.test.js +1072 -14
- package/dist/hooks/__tests__/keyword-detector.test.js.map +1 -1
- package/dist/hooks/__tests__/notify-fallback-watcher.test.js +64 -1
- package/dist/hooks/__tests__/notify-fallback-watcher.test.js.map +1 -1
- package/dist/hooks/__tests__/notify-hook-auto-nudge.test.js +189 -0
- package/dist/hooks/__tests__/notify-hook-auto-nudge.test.js.map +1 -1
- package/dist/hooks/__tests__/notify-hook-team-leader-nudge.test.js +35 -2
- package/dist/hooks/__tests__/notify-hook-team-leader-nudge.test.js.map +1 -1
- package/dist/hooks/__tests__/notify-hook-tmux-heal.test.js +3 -3
- package/dist/hooks/__tests__/notify-hook-tmux-heal.test.js.map +1 -1
- package/dist/hooks/__tests__/session.test.js +25 -0
- package/dist/hooks/__tests__/session.test.js.map +1 -1
- package/dist/hooks/__tests__/skill-guidance-contract.test.js +21 -0
- package/dist/hooks/__tests__/skill-guidance-contract.test.js.map +1 -1
- package/dist/hooks/agents-overlay.d.ts.map +1 -1
- package/dist/hooks/agents-overlay.js +36 -50
- package/dist/hooks/agents-overlay.js.map +1 -1
- package/dist/hooks/deep-interview-config-instruction.js +1 -1
- package/dist/hooks/deep-interview-config-instruction.js.map +1 -1
- package/dist/hooks/extensibility/__tests__/plugin-runner.test.js +31 -0
- package/dist/hooks/extensibility/__tests__/plugin-runner.test.js.map +1 -1
- package/dist/hooks/extensibility/plugin-runner.js +17 -21
- package/dist/hooks/extensibility/plugin-runner.js.map +1 -1
- package/dist/hooks/keyword-detector.d.ts +1 -0
- package/dist/hooks/keyword-detector.d.ts.map +1 -1
- package/dist/hooks/keyword-detector.js +428 -32
- package/dist/hooks/keyword-detector.js.map +1 -1
- package/dist/hooks/keyword-registry.d.ts.map +1 -1
- package/dist/hooks/keyword-registry.js +1 -0
- package/dist/hooks/keyword-registry.js.map +1 -1
- package/dist/hooks/prompt-guidance-contract.d.ts.map +1 -1
- package/dist/hooks/prompt-guidance-contract.js +6 -0
- package/dist/hooks/prompt-guidance-contract.js.map +1 -1
- package/dist/hooks/session.d.ts +3 -0
- package/dist/hooks/session.d.ts.map +1 -1
- package/dist/hooks/session.js +13 -5
- package/dist/hooks/session.js.map +1 -1
- package/dist/hud/__tests__/authority.test.js +469 -31
- package/dist/hud/__tests__/authority.test.js.map +1 -1
- package/dist/hud/__tests__/hud-tmux-injection.test.js +2 -1
- package/dist/hud/__tests__/hud-tmux-injection.test.js.map +1 -1
- package/dist/hud/__tests__/index.test.js +210 -2
- package/dist/hud/__tests__/index.test.js.map +1 -1
- package/dist/hud/__tests__/reconcile.test.js +588 -28
- package/dist/hud/__tests__/reconcile.test.js.map +1 -1
- package/dist/hud/__tests__/render.test.js +61 -0
- package/dist/hud/__tests__/render.test.js.map +1 -1
- package/dist/hud/__tests__/state.test.js +208 -0
- package/dist/hud/__tests__/state.test.js.map +1 -1
- package/dist/hud/__tests__/tmux.test.js +314 -22
- package/dist/hud/__tests__/tmux.test.js.map +1 -1
- package/dist/hud/authority.d.ts +5 -0
- package/dist/hud/authority.d.ts.map +1 -1
- package/dist/hud/authority.js +337 -30
- package/dist/hud/authority.js.map +1 -1
- package/dist/hud/index.d.ts +20 -2
- package/dist/hud/index.d.ts.map +1 -1
- package/dist/hud/index.js +103 -26
- package/dist/hud/index.js.map +1 -1
- package/dist/hud/reconcile.d.ts +3 -3
- package/dist/hud/reconcile.d.ts.map +1 -1
- package/dist/hud/reconcile.js +129 -20
- package/dist/hud/reconcile.js.map +1 -1
- package/dist/hud/render.d.ts.map +1 -1
- package/dist/hud/render.js +35 -0
- package/dist/hud/render.js.map +1 -1
- package/dist/hud/state.d.ts.map +1 -1
- package/dist/hud/state.js +64 -50
- package/dist/hud/state.js.map +1 -1
- package/dist/hud/tmux.d.ts +26 -6
- package/dist/hud/tmux.d.ts.map +1 -1
- package/dist/hud/tmux.js +173 -38
- package/dist/hud/tmux.js.map +1 -1
- package/dist/hud/types.d.ts +11 -0
- package/dist/hud/types.d.ts.map +1 -1
- package/dist/hud/types.js.map +1 -1
- package/dist/mcp/__tests__/hermes-bridge.test.js +203 -7
- package/dist/mcp/__tests__/hermes-bridge.test.js.map +1 -1
- package/dist/mcp/__tests__/state-paths.test.js +71 -1
- package/dist/mcp/__tests__/state-paths.test.js.map +1 -1
- package/dist/mcp/__tests__/state-server.test.js +13 -1
- package/dist/mcp/__tests__/state-server.test.js.map +1 -1
- package/dist/mcp/hermes-bridge.d.ts +12 -2
- package/dist/mcp/hermes-bridge.d.ts.map +1 -1
- package/dist/mcp/hermes-bridge.js +83 -9
- package/dist/mcp/hermes-bridge.js.map +1 -1
- package/dist/mcp/state-paths.d.ts +32 -0
- package/dist/mcp/state-paths.d.ts.map +1 -1
- package/dist/mcp/state-paths.js +113 -17
- package/dist/mcp/state-paths.js.map +1 -1
- package/dist/mcp/state-server.d.ts +4 -4
- package/dist/modes/__tests__/base-autoresearch-contract.test.js +7 -1
- package/dist/modes/__tests__/base-autoresearch-contract.test.js.map +1 -1
- package/dist/pipeline/__tests__/stages.test.js +130 -0
- package/dist/pipeline/__tests__/stages.test.js.map +1 -1
- package/dist/pipeline/orchestrator.js +1 -1
- package/dist/pipeline/orchestrator.js.map +1 -1
- package/dist/pipeline/stages/ralplan.d.ts +1 -0
- package/dist/pipeline/stages/ralplan.d.ts.map +1 -1
- package/dist/pipeline/stages/ralplan.js +14 -5
- package/dist/pipeline/stages/ralplan.js.map +1 -1
- package/dist/question/__tests__/deep-interview.test.js +160 -2
- package/dist/question/__tests__/deep-interview.test.js.map +1 -1
- package/dist/question/__tests__/policy.test.js +63 -3
- package/dist/question/__tests__/policy.test.js.map +1 -1
- package/dist/question/__tests__/renderer.test.js +191 -2
- package/dist/question/__tests__/renderer.test.js.map +1 -1
- package/dist/question/__tests__/state.test.js +94 -3
- package/dist/question/__tests__/state.test.js.map +1 -1
- package/dist/question/__tests__/ui.test.js +4 -0
- package/dist/question/__tests__/ui.test.js.map +1 -1
- package/dist/question/autopilot-wait.d.ts +12 -2
- package/dist/question/autopilot-wait.d.ts.map +1 -1
- package/dist/question/autopilot-wait.js +158 -47
- package/dist/question/autopilot-wait.js.map +1 -1
- package/dist/question/deep-interview.d.ts.map +1 -1
- package/dist/question/deep-interview.js +22 -6
- package/dist/question/deep-interview.js.map +1 -1
- package/dist/question/policy.d.ts.map +1 -1
- package/dist/question/policy.js +2 -5
- package/dist/question/policy.js.map +1 -1
- package/dist/question/renderer.d.ts +12 -0
- package/dist/question/renderer.d.ts.map +1 -1
- package/dist/question/renderer.js +87 -3
- package/dist/question/renderer.js.map +1 -1
- package/dist/question/state.d.ts +8 -1
- package/dist/question/state.d.ts.map +1 -1
- package/dist/question/state.js +54 -14
- package/dist/question/state.js.map +1 -1
- package/dist/question/types.d.ts +1 -1
- package/dist/question/types.d.ts.map +1 -1
- package/dist/question/ui.d.ts +1 -0
- package/dist/question/ui.d.ts.map +1 -1
- package/dist/question/ui.js +1 -0
- package/dist/question/ui.js.map +1 -1
- package/dist/ralplan/__tests__/runtime.test.js +191 -0
- package/dist/ralplan/__tests__/runtime.test.js.map +1 -1
- package/dist/ralplan/consensus-gate.d.ts +9 -1
- package/dist/ralplan/consensus-gate.d.ts.map +1 -1
- package/dist/ralplan/consensus-gate.js +84 -2
- package/dist/ralplan/consensus-gate.js.map +1 -1
- package/dist/ralplan/runtime.d.ts +9 -0
- package/dist/ralplan/runtime.d.ts.map +1 -1
- package/dist/ralplan/runtime.js +32 -11
- package/dist/ralplan/runtime.js.map +1 -1
- package/dist/scripts/__tests__/codex-native-hook.test.js +2315 -280
- package/dist/scripts/__tests__/codex-native-hook.test.js.map +1 -1
- package/dist/scripts/__tests__/notify-state-io.test.js +72 -1
- package/dist/scripts/__tests__/notify-state-io.test.js.map +1 -1
- package/dist/scripts/__tests__/notify-tmux-injection.test.d.ts +2 -0
- package/dist/scripts/__tests__/notify-tmux-injection.test.d.ts.map +1 -0
- package/dist/scripts/__tests__/notify-tmux-injection.test.js +57 -0
- package/dist/scripts/__tests__/notify-tmux-injection.test.js.map +1 -0
- package/dist/scripts/__tests__/run-test-files.test.js +74 -0
- package/dist/scripts/__tests__/run-test-files.test.js.map +1 -1
- package/dist/scripts/__tests__/verify-native-agents.test.js +65 -0
- package/dist/scripts/__tests__/verify-native-agents.test.js.map +1 -1
- package/dist/scripts/codex-native-hook.d.ts.map +1 -1
- package/dist/scripts/codex-native-hook.js +431 -56
- package/dist/scripts/codex-native-hook.js.map +1 -1
- package/dist/scripts/codex-native-pre-post.d.ts.map +1 -1
- package/dist/scripts/codex-native-pre-post.js +79 -1
- package/dist/scripts/codex-native-pre-post.js.map +1 -1
- package/dist/scripts/eval/eval-parity-smoke.js +1 -1
- package/dist/scripts/eval/eval-parity-smoke.js.map +1 -1
- package/dist/scripts/hook-payload-guard.d.ts +9 -0
- package/dist/scripts/hook-payload-guard.d.ts.map +1 -0
- package/dist/scripts/hook-payload-guard.js +111 -0
- package/dist/scripts/hook-payload-guard.js.map +1 -0
- package/dist/scripts/notify-fallback-watcher.js +8 -1
- package/dist/scripts/notify-fallback-watcher.js.map +1 -1
- package/dist/scripts/notify-hook/__tests__/payload-guard.test.d.ts +2 -0
- package/dist/scripts/notify-hook/__tests__/payload-guard.test.d.ts.map +1 -0
- package/dist/scripts/notify-hook/__tests__/payload-guard.test.js +39 -0
- package/dist/scripts/notify-hook/__tests__/payload-guard.test.js.map +1 -0
- package/dist/scripts/notify-hook/auto-nudge.d.ts.map +1 -1
- package/dist/scripts/notify-hook/auto-nudge.js +3 -1
- package/dist/scripts/notify-hook/auto-nudge.js.map +1 -1
- package/dist/scripts/notify-hook/ralph-session-resume.d.ts.map +1 -1
- package/dist/scripts/notify-hook/ralph-session-resume.js +3 -10
- package/dist/scripts/notify-hook/ralph-session-resume.js.map +1 -1
- package/dist/scripts/notify-hook/state-io.d.ts.map +1 -1
- package/dist/scripts/notify-hook/state-io.js +62 -38
- package/dist/scripts/notify-hook/state-io.js.map +1 -1
- package/dist/scripts/notify-hook/team-leader-nudge.d.ts.map +1 -1
- package/dist/scripts/notify-hook/team-leader-nudge.js +7 -0
- package/dist/scripts/notify-hook/team-leader-nudge.js.map +1 -1
- package/dist/scripts/notify-hook/team-worker-stop.d.ts.map +1 -1
- package/dist/scripts/notify-hook/team-worker-stop.js +234 -86
- package/dist/scripts/notify-hook/team-worker-stop.js.map +1 -1
- package/dist/scripts/notify-hook/tmux-injection.d.ts +7 -0
- package/dist/scripts/notify-hook/tmux-injection.d.ts.map +1 -1
- package/dist/scripts/notify-hook/tmux-injection.js +24 -18
- package/dist/scripts/notify-hook/tmux-injection.js.map +1 -1
- package/dist/scripts/notify-hook.js +86 -13
- package/dist/scripts/notify-hook.js.map +1 -1
- package/dist/scripts/run-test-files.js +193 -22
- package/dist/scripts/run-test-files.js.map +1 -1
- package/dist/scripts/sync-plugin-mirror.d.ts.map +1 -1
- package/dist/scripts/sync-plugin-mirror.js +61 -3
- package/dist/scripts/sync-plugin-mirror.js.map +1 -1
- package/dist/scripts/verify-native-agents.d.ts.map +1 -1
- package/dist/scripts/verify-native-agents.js +58 -1
- package/dist/scripts/verify-native-agents.js.map +1 -1
- package/dist/state/__tests__/operations.test.js +1125 -1
- package/dist/state/__tests__/operations.test.js.map +1 -1
- package/dist/state/__tests__/skill-active.test.js +46 -1
- package/dist/state/__tests__/skill-active.test.js.map +1 -1
- package/dist/state/__tests__/workflow-transition.test.js +98 -7
- package/dist/state/__tests__/workflow-transition.test.js.map +1 -1
- package/dist/state/operations.d.ts.map +1 -1
- package/dist/state/operations.js +159 -2
- package/dist/state/operations.js.map +1 -1
- package/dist/state/skill-active.js +6 -8
- package/dist/state/skill-active.js.map +1 -1
- package/dist/state/workflow-transition-reconcile.d.ts +6 -0
- package/dist/state/workflow-transition-reconcile.d.ts.map +1 -1
- package/dist/state/workflow-transition-reconcile.js +38 -15
- package/dist/state/workflow-transition-reconcile.js.map +1 -1
- package/dist/state/workflow-transition.d.ts.map +1 -1
- package/dist/state/workflow-transition.js +10 -3
- package/dist/state/workflow-transition.js.map +1 -1
- package/dist/subagents/__tests__/tracker.test.js +139 -0
- package/dist/subagents/__tests__/tracker.test.js.map +1 -1
- package/dist/subagents/tracker.d.ts +3 -0
- package/dist/subagents/tracker.d.ts.map +1 -1
- package/dist/subagents/tracker.js +41 -4
- package/dist/subagents/tracker.js.map +1 -1
- package/dist/team/__tests__/coordination-protocol.test.d.ts +2 -0
- package/dist/team/__tests__/coordination-protocol.test.d.ts.map +1 -0
- package/dist/team/__tests__/coordination-protocol.test.js +173 -0
- package/dist/team/__tests__/coordination-protocol.test.js.map +1 -0
- package/dist/team/__tests__/runtime.test.js +52 -3
- package/dist/team/__tests__/runtime.test.js.map +1 -1
- package/dist/team/__tests__/scaling.test.js +9 -4
- package/dist/team/__tests__/scaling.test.js.map +1 -1
- package/dist/team/__tests__/state.test.js +83 -0
- package/dist/team/__tests__/state.test.js.map +1 -1
- package/dist/team/__tests__/tmux-session.test.js +240 -2
- package/dist/team/__tests__/tmux-session.test.js.map +1 -1
- package/dist/team/__tests__/worker-bootstrap.test.js +84 -0
- package/dist/team/__tests__/worker-bootstrap.test.js.map +1 -1
- package/dist/team/__tests__/worker-runtime-identity.test.js +4 -2
- package/dist/team/__tests__/worker-runtime-identity.test.js.map +1 -1
- package/dist/team/coordination-protocol.d.ts +14 -0
- package/dist/team/coordination-protocol.d.ts.map +1 -0
- package/dist/team/coordination-protocol.js +244 -0
- package/dist/team/coordination-protocol.js.map +1 -0
- package/dist/team/runtime.d.ts +1 -0
- package/dist/team/runtime.d.ts.map +1 -1
- package/dist/team/runtime.js +19 -3
- package/dist/team/runtime.js.map +1 -1
- package/dist/team/scaling.d.ts.map +1 -1
- package/dist/team/scaling.js +3 -2
- package/dist/team/scaling.js.map +1 -1
- package/dist/team/state/tasks.d.ts.map +1 -1
- package/dist/team/state/tasks.js +24 -0
- package/dist/team/state/tasks.js.map +1 -1
- package/dist/team/state/types.d.ts +21 -1
- package/dist/team/state/types.d.ts.map +1 -1
- package/dist/team/state/types.js.map +1 -1
- package/dist/team/state.d.ts +17 -1
- package/dist/team/state.d.ts.map +1 -1
- package/dist/team/state.js +12 -5
- package/dist/team/state.js.map +1 -1
- package/dist/team/team-ops.d.ts +1 -1
- package/dist/team/team-ops.d.ts.map +1 -1
- package/dist/team/team-ops.js.map +1 -1
- package/dist/team/tmux-session.d.ts +2 -0
- package/dist/team/tmux-session.d.ts.map +1 -1
- package/dist/team/tmux-session.js +161 -13
- package/dist/team/tmux-session.js.map +1 -1
- package/dist/team/worker-bootstrap.d.ts.map +1 -1
- package/dist/team/worker-bootstrap.js +63 -0
- package/dist/team/worker-bootstrap.js.map +1 -1
- package/dist/utils/__tests__/agents-model-table.test.js +4 -2
- package/dist/utils/__tests__/agents-model-table.test.js.map +1 -1
- package/dist/utils/agents-model-table.d.ts.map +1 -1
- package/dist/utils/agents-model-table.js +3 -0
- package/dist/utils/agents-model-table.js.map +1 -1
- package/dist/verification/__tests__/ci-rust-gates.test.js +81 -1
- package/dist/verification/__tests__/ci-rust-gates.test.js.map +1 -1
- package/package.json +8 -8
- package/plugins/oh-my-codex/.codex-plugin/plugin.json +1 -1
- package/plugins/oh-my-codex/hooks/codex-native-hook.mjs +334 -21
- package/plugins/oh-my-codex/hooks/hooks.json +1 -2
- package/plugins/oh-my-codex/skills/autopilot/SKILL.md +13 -6
- package/plugins/oh-my-codex/skills/code-review/SKILL.md +7 -7
- package/plugins/oh-my-codex/skills/deep-interview/SKILL.md +9 -4
- package/plugins/oh-my-codex/skills/ralph/SKILL.md +22 -22
- package/plugins/oh-my-codex/skills/ralplan/SKILL.md +12 -0
- package/plugins/oh-my-codex/skills/team/SKILL.md +16 -0
- package/plugins/oh-my-codex/skills/ultraqa/SKILL.md +9 -0
- package/plugins/oh-my-codex/skills/worker/SKILL.md +14 -0
- package/skills/autopilot/SKILL.md +13 -6
- package/skills/code-review/SKILL.md +7 -7
- package/skills/deep-interview/SKILL.md +9 -4
- package/skills/ralph/SKILL.md +22 -22
- package/skills/ralplan/SKILL.md +12 -0
- package/skills/team/SKILL.md +16 -0
- package/skills/ultraqa/SKILL.md +9 -0
- package/skills/worker/SKILL.md +14 -0
- package/src/scripts/__tests__/codex-native-hook.test.ts +4435 -2083
- package/src/scripts/__tests__/notify-state-io.test.ts +95 -0
- package/src/scripts/__tests__/notify-tmux-injection.test.ts +82 -0
- package/src/scripts/__tests__/run-test-files.test.ts +102 -0
- package/src/scripts/__tests__/verify-native-agents.test.ts +75 -0
- package/src/scripts/codex-native-hook.ts +536 -51
- package/src/scripts/codex-native-pre-post.ts +80 -0
- package/src/scripts/demo-team-e2e.sh +10 -7
- package/src/scripts/eval/eval-parity-smoke.ts +1 -1
- package/src/scripts/hook-payload-guard.ts +113 -0
- package/src/scripts/notify-fallback-watcher.ts +8 -1
- package/src/scripts/notify-hook/__tests__/payload-guard.test.ts +41 -0
- package/src/scripts/notify-hook/auto-nudge.ts +3 -1
- package/src/scripts/notify-hook/ralph-session-resume.ts +2 -8
- package/src/scripts/notify-hook/state-io.ts +75 -37
- package/src/scripts/notify-hook/team-leader-nudge.ts +7 -0
- package/src/scripts/notify-hook/team-worker-stop.ts +193 -52
- package/src/scripts/notify-hook/tmux-injection.ts +35 -19
- package/src/scripts/notify-hook.ts +105 -6
- package/src/scripts/run-test-files.ts +192 -22
- package/src/scripts/sync-plugin-mirror.ts +98 -9
- package/src/scripts/verify-native-agents.ts +65 -1
|
@@ -64,6 +64,18 @@ The consensus workflow:
|
|
|
64
64
|
|
|
65
65
|
> **Important:** Steps 3 and 4 MUST run sequentially as role-specific subagents. Do NOT issue both agent calls in the same parallel batch. Always await the subsequent `Architect` result before invoking the subsequent `Critic`; only a completed, role-specific `Critic` approval can satisfy the durable gate.
|
|
66
66
|
|
|
67
|
+
## Planning/Execution Boundary
|
|
68
|
+
|
|
69
|
+
`$ralplan` is a planning mode. While ralplan is active and no explicit execution handoff is active, implementation-focused write tools are out of scope. Ralplan may inspect the repository and may write only planning artifacts such as `.omx/context/`, `.omx/plans/`, `.omx/specs/`, and required `.omx/state/` records.
|
|
70
|
+
|
|
71
|
+
The canonical flow is:
|
|
72
|
+
|
|
73
|
+
```
|
|
74
|
+
$ralplan -> durable consensus artifact -> explicit execution lane -> $ultragoal | $team | $ralph
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
Before any execution lane begins, ralplan must emit terminal planning state (complete, paused, failed, or waiting for input) and the durable handoff record below. Do not continue from consensus planning into direct code edits in the same ralplan session.
|
|
78
|
+
|
|
67
79
|
## Durable Consensus Handoff Contract
|
|
68
80
|
|
|
69
81
|
Ralplan is not complete, skippable, or ready for execution merely because `.omx/plans/prd-*.md` and `.omx/plans/test-spec-*.md` exist. Those files are planning artifacts, not consensus evidence.
|
|
@@ -57,6 +57,22 @@ requiring a separate linked Ralph launch up front.
|
|
|
57
57
|
- **Escalation:** start a separate `omx ralph ...` / `$ralph ...` only when a later manual follow-up still needs a persistent single-owner fix/verification loop.
|
|
58
58
|
- **Deprecation:** `omx team ralph ...` has been removed. Use plain `omx team ...` for team execution or run `omx ralph ...` separately when you explicitly want a later Ralph loop.
|
|
59
59
|
|
|
60
|
+
|
|
61
|
+
### Team Big Five / ATEM coordination gate
|
|
62
|
+
|
|
63
|
+
`$team` keeps simple independent fan-out lightweight. For isolated tasks (for example per-file sweeps, typo/copy edits, or explicitly independent lanes with no shared files/dependencies), workers use the normal concise protocol: startup ACK, claim-safe task lifecycle, status, verification, and completion evidence.
|
|
64
|
+
|
|
65
|
+
Activate the lightweight Team Big Five + ATEM-inspired coordination layer when the task or task graph has dependencies, shared files/surfaces/contracts, cross-boundary ownership, handoffs, integration/merge work, blocked lanes, or changed assumptions. The protocol is not a separate ceremony; it is a concise boundary checklist:
|
|
66
|
+
|
|
67
|
+
- **Shared mental model / single source of truth:** task JSON, inbox, mailbox, approved handoff, and leader updates are canonical.
|
|
68
|
+
- **Closed-loop communication / ACK-readback handoffs:** acknowledge handoffs with understood scope, affected artifact/path, owner, and next action.
|
|
69
|
+
- **Mutual performance monitoring at boundaries:** check upstream/downstream contracts, shared files, and verification evidence before completion.
|
|
70
|
+
- **Backup/reassignment behavior:** blocked workers report the smallest needed help/reassignment request and continue safe unblocked slices.
|
|
71
|
+
- **Adaptability checkpoints:** changed assumptions, dependencies, or verification results trigger a brief leader-facing update before widening scope.
|
|
72
|
+
- **Team orientation:** workers optimize for the integrated team outcome, not local-optimum-only task summaries; report integration risks, missing tests, and peer impacts.
|
|
73
|
+
|
|
74
|
+
ATEM fit: treat this as agile teamwork support for transition/action/interpersonal moments around boundaries, not as a heavyweight process model. Do not copy provider-specific plugin implementations; keep the protocol in OMX/Codex prompts, inboxes, state, and tests.
|
|
75
|
+
|
|
60
76
|
### Team + Ultragoal bridge
|
|
61
77
|
|
|
62
78
|
Use `$ultragoal` for durable leader-owned goal/ledger tracking and `$team` for parallel execution lanes. When Team is launched with an active `.omx/ultragoal/goals.json`, worker inboxes/status may include leader-owned Ultragoal context: `.omx/ultragoal/goals.json`, `.omx/ultragoal/ledger.jsonl`, the active goal id, Codex goal mode, and the `fresh_leader_get_goal_required` checkpoint policy.
|
|
@@ -58,6 +58,15 @@ The matrix must include normal-path coverage plus adversarial dynamic e2e scenar
|
|
|
58
58
|
- Validate exit codes and output semantics; do not trust success-looking text alone.
|
|
59
59
|
- Do not delete, rewrite, or mask unrelated user work. Capture dirty-worktree evidence before and after generated harness work.
|
|
60
60
|
|
|
61
|
+
### Temporary Harness Generation Guardrails
|
|
62
|
+
|
|
63
|
+
Generated harnesses are part of the QA evidence chain; until setup succeeds, they are evidence about the harness apparatus, not product behavior.
|
|
64
|
+
|
|
65
|
+
- **Use absolute repo imports for built artifacts.** When a harness runs from `/tmp` or another scratch directory but imports repository code, resolve the repository root explicitly from the verified repo cwd and import built modules with an absolute path or `pathToFileURL(join(repoRoot, "dist", ...)).href`. Never rely on `./dist/...` from the harness file's temporary directory.
|
|
66
|
+
- **Use a safe file writer for JS/TS harness bodies.** Prefer a small Node/Python writer or another non-interpolating file-write mechanism for harness source that contains backticks, `${...}`, shell metacharacters, or prompt-injection strings. If a shell heredoc is unavoidable, quote the delimiter and verify the written file before execution; do not use interpolating heredocs for JavaScript assertions.
|
|
67
|
+
- **Sanitize OMX runtime env for isolated probes.** When the scenario creates a temporary repo/state tree or intentionally checks local isolation, run the probe with `OMX_ROOT` and `OMX_STATE_ROOT` unset (for example `env -u OMX_ROOT -u OMX_STATE_ROOT ...`) so ambient boxed runtime state cannot redirect reads/writes away from the scenario fixture.
|
|
68
|
+
- **Classify harness setup failures separately.** If a generated harness fails before exercising product behavior because of import paths, shell interpolation, environment leakage, or fixture construction, record it as harness debris, fix the harness, and rerun the scenario before declaring a product defect.
|
|
69
|
+
|
|
61
70
|
## Cycle Workflow
|
|
62
71
|
|
|
63
72
|
### Cycle N (Max 5)
|
|
@@ -101,6 +101,20 @@ Worker sessions should treat team state + CLI interop as the source of truth.
|
|
|
101
101
|
- Do **not** rely on ad-hoc tmux keystrokes as a primary delivery channel.
|
|
102
102
|
- If a manual trigger arrives (for example `tmux send-keys` nudge), treat it only as a prompt to re-check state and continue through the normal claim-safe lifecycle.
|
|
103
103
|
|
|
104
|
+
|
|
105
|
+
## Team Big Five / ATEM Coordination Gate
|
|
106
|
+
|
|
107
|
+
Keep independent fan-out lightweight: if your task is isolated with no shared files, dependencies, or handoffs, normal startup ACK, claim-safe lifecycle, status, verification, and completion evidence are sufficient.
|
|
108
|
+
|
|
109
|
+
When your inbox/task activates the Team Big Five / ATEM-inspired protocol (dependencies, shared files/surfaces/contracts, handoffs, integration, blocked lanes, or changed assumptions), use this concise boundary checklist:
|
|
110
|
+
|
|
111
|
+
- Shared mental model / single source of truth: treat task JSON, inbox, mailbox, approved handoff, and leader updates as canonical.
|
|
112
|
+
- Closed-loop communication / ACK-readback: acknowledge handoffs with what you understood, affected artifact/path, owner, and next action.
|
|
113
|
+
- Mutual performance monitoring: check boundary contracts, shared files, and verification evidence before completion.
|
|
114
|
+
- Backup/reassignment behavior: if blocked, write blocked status with the smallest needed help/reassignment request and continue any safe unblocked slice.
|
|
115
|
+
- Adaptability checkpoint: changed assumptions, dependencies, or verification results require a brief leader-facing update before widening scope.
|
|
116
|
+
- Team orientation: optimize for the integrated team result; report integration risks, missing tests, and peer impacts instead of local-only success.
|
|
117
|
+
|
|
104
118
|
## Shutdown
|
|
105
119
|
|
|
106
120
|
If the lead sends a shutdown request, follow the shutdown inbox instructions exactly, write your shutdown ack file, then exit the Codex session.
|
|
@@ -31,7 +31,9 @@ Autopilot must not run a separate broad expansion/planning/execution/QA/validati
|
|
|
31
31
|
|
|
32
32
|
1. **Phase `deep-interview`** — Socratic requirements clarification gate
|
|
33
33
|
- Run or resume `$deep-interview` to clarify intent, scope, non-goals, constraints, and decision boundaries.
|
|
34
|
-
-
|
|
34
|
+
- Deep-interview is a structured question chain, not a one-question gate; `max_rounds` is a cap, not a target.
|
|
35
|
+
- After a user answers an `omx question`, re-score ambiguity against the active profile threshold. Ask another question only when a readiness gate is still unresolved and the answer would materially change execution; otherwise crystallize the spec and hand off.
|
|
36
|
+
- Required handoff artifact: a clarified spec or concise requirements summary suitable for `$ralplan`, including an explicit interview-complete rationale when leaving deep-interview.
|
|
35
37
|
|
|
36
38
|
2. **Phase `ralplan`** — consensus planning gate
|
|
37
39
|
- Ground the task with pre-context intake and the deep-interview artifact.
|
|
@@ -66,12 +68,14 @@ Before Phase `deep-interview` or `ralplan` starts or resumes:
|
|
|
66
68
|
1. Derive a task slug from the request.
|
|
67
69
|
2. Reuse the latest relevant `.omx/context/{slug}-*.md` snapshot when available.
|
|
68
70
|
3. If none exists, create `.omx/context/{slug}-{timestamp}.md` (UTC `YYYYMMDDTHHMMSSZ`) with:
|
|
69
|
-
- task
|
|
71
|
+
- activation prompt / task seed
|
|
72
|
+
- original task status (`activation-prompt`, `legacy-unverified`, or `unavailable`)
|
|
70
73
|
- desired outcome
|
|
71
74
|
- known facts/evidence
|
|
72
75
|
- constraints
|
|
73
76
|
- unknowns/open questions
|
|
74
77
|
- likely codebase touchpoints
|
|
78
|
+
- a scope note that the seed is the Autopilot activation prompt, not guaranteed prior conversation context
|
|
75
79
|
4. If brownfield facts are missing, run `explore` first before or during `$deep-interview` (`$deep-interview --quick <task>` remains acceptable for bounded low-ambiguity intake); do not skip the clarification gate merely because the task sounds actionable.
|
|
76
80
|
5. Carry the snapshot path in Autopilot state and all handoff artifacts.
|
|
77
81
|
</Pre-context Intake>
|
|
@@ -91,6 +95,8 @@ Before Phase `deep-interview` or `ralplan` starts or resumes:
|
|
|
91
95
|
<State_Management>
|
|
92
96
|
Use the CLI-first state surface (`omx state ... --json`) for Autopilot lifecycle state. State must be session-aware when a session id exists. If the explicit MCP compatibility surface is already available, equivalent `omx_state` tool calls remain acceptable but are not required.
|
|
93
97
|
|
|
98
|
+
Inside active Autopilot, named child phases such as `$ralplan` are supervised phases, not peer workflow activations: keep `mode:"autopilot"` active and update `current_phase:"ralplan"` rather than starting standalone `mode:"ralplan"` over Autopilot.
|
|
99
|
+
|
|
94
100
|
Required fields:
|
|
95
101
|
|
|
96
102
|
```json
|
|
@@ -126,12 +132,12 @@ Required fields:
|
|
|
126
132
|
```
|
|
127
133
|
|
|
128
134
|
- **On start**: `omx state write --input '{"mode":"autopilot","active":true,"current_phase":"deep-interview","iteration":1,"review_cycle":0,"state":{"phase_cycle":["deep-interview","ralplan","ultragoal","code-review","ultraqa"],"handoff_artifacts":{"context_snapshot_path":"<snapshot-path>","deep_interview":null,"ralplan":null,"ralplan_consensus_gate":{"required":true,"sequence":["architect-review","critic-review"],"planning_artifacts_are_not_consensus":true,"required_review_roles":["architect","critic"],"ralplan_architect_review":null,"ralplan_critic_review":null,"complete":false},"ultragoal":null,"code_review":null,"ultraqa":null},"review_verdict":null,"qa_verdict":null,"return_to_ralplan_reason":null}}' --json`
|
|
129
|
-
- **On deep-interview -> ralplan**:
|
|
130
|
-
- **On ralplan -> ultragoal**: only after `ralplan_consensus_gate.complete:true`, with `ralplan_architect_review.agent_role:"architect"` and `ralplan_architect_review.verdict:"approve"` recorded before `ralplan_critic_review.agent_role:"critic"` and `ralplan_critic_review.verdict:"approve"`;
|
|
135
|
+
- **On deep-interview -> ralplan**: only after a separate gate proves the interview chain is explicitly complete or the user explicitly authorized a skip. For completion, persist `deep_interview_gate:{"status":"complete","rationale":"<why requirements are complete>","handoff_summary":"<summary>"}` (or equivalent non-empty rationale/summary) plus the clarified spec/requirements under `handoff_artifacts.deep_interview`; if a final `omx question` was involved, keep its same-session answered record linked by `question_id`/`satisfied_at`. For skip, persist `deep_interview_gate:{"status":"skipped","skip_authorized_by_user":true,"skip_reason":"<user-authorized reason>","skipped_at":"<timestamp>","source":"user","session_id":"<session>"}`. Do not leave deep-interview merely because the first `omx question` was answered or cleared.
|
|
136
|
+
- **On ralplan -> ultragoal**: only after `ralplan_consensus_gate.complete:true`, with tracker-backed native-subagent `ralplan_architect_review.agent_role:"architect"` and `ralplan_architect_review.verdict:"approve"` recorded before tracker-backed native-subagent `ralplan_critic_review.agent_role:"critic"` and `ralplan_critic_review.verdict:"approve"`; `codex_exec` or artifact-only approvals are trace evidence but not native lane proof. Set `current_phase:"ultragoal"` and persist the plan/test-spec paths under `handoff_artifacts.ralplan`.
|
|
131
137
|
- **On missing ralplan consensus evidence**: keep `current_phase:"ralplan"`, persist `ralplan_consensus_gate.complete:false` with `blocked_reason`, and report an explicit blocker or max-iteration outcome instead of handing off to execution.
|
|
132
138
|
- **On ultragoal -> code-review**: set `current_phase:"code-review"`, persist implementation/test/ledger evidence under `handoff_artifacts.ultragoal`.
|
|
133
|
-
- **On code-review -> ultraqa**: set `current_phase:"ultraqa"
|
|
134
|
-
- **On clean review + passed/skipped QA**: set `active:false`, `current_phase:"complete"`, persist `review_verdict:{recommendation:"APPROVE", architectural_status:"CLEAR", clean:true}`, `qa_verdict:{clean:true, skipped:<boolean>, reason:<string|null>}`, and `completed_at
|
|
139
|
+
- **On code-review -> ultraqa**: set `current_phase:"ultraqa"` only after a real `$code-review` stage/subagent has produced durable evidence; persist the clean review under `handoff_artifacts.code_review` with its source thread/tool/stage reference. Do not author `review_verdict:{clean:true}` from the leader's own summary.
|
|
140
|
+
- **On clean review + passed/skipped QA**: set `active:false`, `current_phase:"complete"`, persist `review_verdict:{recommendation:"APPROVE", architectural_status:"CLEAR", clean:true}`, `qa_verdict:{clean:true, skipped:<boolean>, reason:<string|null>}`, and `completed_at` only when both gates have durable source evidence. Required evidence is either (a) actual `$code-review`/`$ultraqa` stage or native-subagent/thread/tool records, or (b) for QA only, an explicit persisted skip reason for a documented docs-only/trivially non-runtime condition. If that evidence is missing, keep the active phase at `code-review` or `ultraqa` and record a blocker instead of self-attesting a clean gate.
|
|
135
141
|
- **On non-clean review or failed QA**: increment `iteration` and `review_cycle`, set `current_phase:"ralplan"`, persist `review_verdict` or `qa_verdict`, persist the phase handoff, and set `return_to_ralplan_reason` to a concise findings-driven reason.
|
|
136
142
|
- **Legacy Ralph state**: if a user explicitly selected the legacy Ralph execution lane, phase names and handoff keys may include `ralph`; preserve and resume them rather than rewriting history to Ultragoal.
|
|
137
143
|
- **On cancellation**: run `$cancel`; preserve progress for resume rather than deleting handoff artifacts.
|
|
@@ -175,6 +181,7 @@ Pipeline state should use `current_phase` values that match the same phase names
|
|
|
175
181
|
- [ ] `$team` was used only if the active Ultragoal story needed coordinated parallel work, or explicitly recorded as not needed
|
|
176
182
|
- [ ] Phase `code-review` returned a clean verdict (`APPROVE` + `CLEAR`)
|
|
177
183
|
- [ ] Phase `ultraqa` passed, or was explicitly skipped because the change was docs-only/trivially non-runtime with evidence
|
|
184
|
+
- [ ] Clean `review_verdict` cites durable source evidence from a real `$code-review` stage/subagent/thread/tool record; `qa_verdict` cites durable `$ultraqa` evidence or an explicit persisted low-risk skip reason; leader-authored summaries alone are not gate evidence
|
|
178
185
|
- [ ] `review_verdict.clean` is true, `qa_verdict.clean` is true, and `return_to_ralplan_reason` is null
|
|
179
186
|
- [ ] Tests/build/lint/typecheck evidence from Ultragoal is available in handoff artifacts
|
|
180
187
|
- [ ] Autopilot state is marked `complete` or cancellation state is preserved coherently
|
|
@@ -31,7 +31,7 @@ Delegates to the `code-reviewer` and `architect` agents in parallel for a two-la
|
|
|
31
31
|
2. **Launch Parallel Review Lanes**
|
|
32
32
|
- **`code-reviewer` lane** - owns spec compliance, security, code quality, performance, and maintainability findings
|
|
33
33
|
- **`architect` lane** - owns the devil's-advocate / design-tradeoff perspective
|
|
34
|
-
- Both lanes run in parallel and produce distinct outputs before final synthesis
|
|
34
|
+
- Both lanes run in parallel on a clean context with explicit scope and artifacts, and produce distinct outputs before final synthesis
|
|
35
35
|
- If either lane cannot be launched or does not return evidence, report `independent review unavailable`; do **not** substitute the current/authoring lane, and do **not** approve or mark the review merge-ready.
|
|
36
36
|
|
|
37
37
|
3. **Review Categories**
|
|
@@ -72,9 +72,9 @@ Delegates to the `code-reviewer` and `architect` agents in parallel for a two-la
|
|
|
72
72
|
Do not self-review as a fallback. If the `code-reviewer` or `architect` agent path is missing, unavailable, skipped, or fails, emit a clear unavailable-review result and block approval until the independent lane evidence exists.
|
|
73
73
|
|
|
74
74
|
```
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
75
|
+
task(
|
|
76
|
+
agent_type="code-reviewer",
|
|
77
|
+
reasoning_effort="xhigh",
|
|
78
78
|
prompt="CODE REVIEW TASK
|
|
79
79
|
|
|
80
80
|
Review code changes for quality, security, and maintainability.
|
|
@@ -98,9 +98,9 @@ Output: Code review report with:
|
|
|
98
98
|
- Approval recommendation (APPROVE / REQUEST CHANGES / COMMENT)"
|
|
99
99
|
)
|
|
100
100
|
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
101
|
+
task(
|
|
102
|
+
agent_type="architect",
|
|
103
|
+
reasoning_effort="xhigh",
|
|
104
104
|
prompt="ARCHITECTURE / DEVIL'S-ADVOCATE REVIEW TASK
|
|
105
105
|
|
|
106
106
|
Review the same code changes from the architecture/tradeoff perspective.
|
|
@@ -32,6 +32,8 @@ Execution quality is usually bottlenecked by intent clarity, not just missing im
|
|
|
32
32
|
- **Deep (`--deep`)**: high-rigor exploration; target threshold `<= 0.15`; max rounds 20
|
|
33
33
|
- **Autoresearch (`--autoresearch`)**: same interview rigor as Standard, but specialized for `$autoresearch` mission readiness and `.omx/specs/` artifact handoff
|
|
34
34
|
|
|
35
|
+
Profile `max rounds` is a hard cap, not a target. Do not continue only to reach a numbered round count. Extra Socratic rigor does not override the active threshold unless the profile/config changes.
|
|
36
|
+
|
|
35
37
|
If no flag is provided, use **Standard**.
|
|
36
38
|
|
|
37
39
|
<Mode_Flags>
|
|
@@ -70,6 +72,8 @@ If no flag is provided, use **Standard**.
|
|
|
70
72
|
- Treat `answers[]` as the primary `omx question` success contract. For a single interview round, read `answers[0].answer`; use legacy top-level `answer` only as a compatibility fallback when needed.
|
|
71
73
|
- If the current runtime is outside tmux and cannot render `omx question`, use the native structured question tool when available; otherwise ask exactly one concise plain-text question and wait for the answer
|
|
72
74
|
- Re-score ambiguity after each answer and show progress transparently
|
|
75
|
+
- Once ambiguity is at or below the active profile threshold, stop ordinary questioning. Run the practical closure audit: crystallize/handoff when readiness gates pass; otherwise ask only the final closure question needed to satisfy a named gate.
|
|
76
|
+
- Treat `max_rounds` as a stop cap, not evidence that more rounds are needed.
|
|
73
77
|
- Do not hand off to execution while ambiguity remains above threshold unless user explicitly opts to proceed with warning
|
|
74
78
|
- Do not crystallize or hand off while `Non-goals` or `Decision Boundaries` remain unresolved, even if the weighted ambiguity threshold is met
|
|
75
79
|
- Treat early exit as a safety valve, not the default success path
|
|
@@ -130,7 +134,7 @@ If no flag is provided, use **Standard**.
|
|
|
130
134
|
|
|
131
135
|
## Phase 2: Socratic Interview Loop
|
|
132
136
|
|
|
133
|
-
Repeat until ambiguity `<= threshold`, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached.
|
|
137
|
+
Repeat until ambiguity `<= threshold`, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached. This is a stop condition: below threshold, do not open a new ordinary interview branch.
|
|
134
138
|
|
|
135
139
|
### 2a) Generate next question
|
|
136
140
|
If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to `$ralplan`, `$autopilot`, `$ralph`, or `$team` until that summary answer is captured.
|
|
@@ -280,8 +284,9 @@ Readiness gate:
|
|
|
280
284
|
- `Decision Boundaries` must be explicit
|
|
281
285
|
- A pressure pass must be complete: at least one earlier answer has been revisited with an evidence, assumption, or tradeoff follow-up
|
|
282
286
|
- A practical closure audit must pass: another question would change execution materially, not merely polish wording or chase a narrow edge case
|
|
283
|
-
- If either gate is unresolved, or the pressure pass is incomplete, continue
|
|
284
|
-
- Treat a low ambiguity score as permission to audit closure, not permission to keep drilling indefinitely. If remaining uncertainty would not change implementation, crystallize the spec
|
|
287
|
+
- If either gate is unresolved, or the pressure pass is incomplete, continue below threshold only with a final closure question that names the unresolved gate and would materially change execution.
|
|
288
|
+
- Treat a low ambiguity score as permission to audit closure, not permission to keep drilling indefinitely. If remaining uncertainty would not change implementation, crystallize the spec instead of opening a new branch.
|
|
289
|
+
- If ambiguity is `<= 0.10`, another user-facing question is allowed only as that final closure question; otherwise crystallize immediately.
|
|
285
290
|
|
|
286
291
|
### 2d) Report progress
|
|
287
292
|
Show weighted breakdown table, readiness-gate status (`Non-goals`, `Decision Boundaries`), and the next focus dimension.
|
|
@@ -294,7 +299,7 @@ Append round result and updated scores via `omx state write --input '<json>' --j
|
|
|
294
299
|
- Apply a **Dialectic Rhythm Guard**: track consecutive non-user fact discoveries and confirmation-style answers (`[from-code][auto-confirmed]`, `[from-code]`, or `[from-research]`). After 3 consecutive non-user or confirmation answers, the next material user-facing round must solicit direct human judgment (`[from-user]`) unless the closure audit says the interview is ready to crystallize.
|
|
295
300
|
- Round 4+: allow explicit early exit with risk warning
|
|
296
301
|
- Soft warning at profile midpoint (e.g., round 3/6/10 depending on profile)
|
|
297
|
-
- Hard cap at profile `max_rounds
|
|
302
|
+
- Hard cap at profile `max_rounds`; never treat this cap as a desired interview length or quota
|
|
298
303
|
|
|
299
304
|
## Phase 3: Challenge Modes (assumption stress tests)
|
|
300
305
|
|
package/skills/ralph/SKILL.md
CHANGED
|
@@ -26,14 +26,14 @@ Ralph is a persistence loop that keeps working on a task until it is fully compl
|
|
|
26
26
|
</Do_Not_Use_When>
|
|
27
27
|
|
|
28
28
|
<Why_This_Exists>
|
|
29
|
-
Complex tasks often fail silently: partial implementations get declared "done", tests get skipped, edge cases get forgotten. Ralph prevents this by looping until work is genuinely complete, requiring fresh verification evidence before allowing completion, and using
|
|
29
|
+
Complex tasks often fail silently: partial implementations get declared "done", tests get skipped, edge cases get forgotten. Ralph prevents this by looping until work is genuinely complete, requiring fresh verification evidence before allowing completion, and using explicit architect native-subagent verification to confirm quality.
|
|
30
30
|
</Why_This_Exists>
|
|
31
31
|
|
|
32
32
|
<Execution_Policy>
|
|
33
33
|
- Fire independent agent calls simultaneously -- never wait sequentially for independent work
|
|
34
34
|
- Use `run_in_background: true` for long operations (installs, builds, test suites)
|
|
35
|
-
- Always
|
|
36
|
-
-
|
|
35
|
+
- Always set `agent_type` when spawning native subagents; use `reasoning_effort` for per-dispatch intensity when needed
|
|
36
|
+
- Preserve legacy Ralph tier intent through native reasoning effort: LOW -> `low`, STANDARD -> `medium`, THOROUGH -> `xhigh`
|
|
37
37
|
- Deliver the full implementation: no scope reduction, no partial completion, no deleting tests to make them pass
|
|
38
38
|
- Apply the shared workflow guidance pattern: outcome-first framing, concise visible updates for multi-step execution, local overrides for the active workflow branch, validation proportional to risk, explicit stop rules, and automatic continuation for safe reversible steps. Ask only for material, destructive, credentialed, external-production, or preference-dependent branches.
|
|
39
39
|
- Integrate with Codex goal mode when goal tools are available: inspect the active thread goal with `get_goal`, preserve it as the top-level stop condition, and only call `update_goal({status: "complete"})` after a Ralph completion audit proves the objective is actually achieved.
|
|
@@ -54,10 +54,10 @@ Complex tasks often fail silently: partial implementations get declared "done",
|
|
|
54
54
|
- Do not begin Ralph execution work (delegation, implementation, or verification loops) until snapshot grounding exists. If forced to proceed quickly, note explicit risk tradeoffs.
|
|
55
55
|
1. **Review progress**: Check TODO list and any prior iteration state
|
|
56
56
|
2. **Continue from where you left off**: Pick up incomplete tasks
|
|
57
|
-
3. **Delegate in parallel**: Route tasks to specialist agents
|
|
58
|
-
- Simple lookups:
|
|
59
|
-
- Standard work:
|
|
60
|
-
- Complex analysis:
|
|
57
|
+
3. **Delegate in parallel**: Route tasks to specialist native agents with explicit `agent_type` and appropriate `reasoning_effort`
|
|
58
|
+
- Simple lookups: `reasoning_effort="low"` -- "What does this function return?"
|
|
59
|
+
- Standard work: `reasoning_effort="medium"` -- "Add error handling to this module"
|
|
60
|
+
- Complex analysis: `reasoning_effort="xhigh"` -- "Debug this race condition"
|
|
61
61
|
- When Ralph is entered as a ralplan follow-up, start from the approved **available-agent-types roster** and make the delegation plan explicit: implementation lane, evidence/regression lane, and final sign-off lane using only known agent types
|
|
62
62
|
4. **Run long operations in background**: Builds, installs, test suites use `run_in_background: true`
|
|
63
63
|
5. **Visual task gate (when screenshot/reference images are present)**:
|
|
@@ -72,11 +72,11 @@ Complex tasks often fail silently: partial implementations get declared "done",
|
|
|
72
72
|
b. Run verification (test, build, lint)
|
|
73
73
|
c. Read the output -- confirm it actually passed
|
|
74
74
|
d. Check: zero pending/in_progress TODO items
|
|
75
|
-
7. **Architect verification** (
|
|
76
|
-
- <5 files, <100 lines with full tests:
|
|
77
|
-
- Standard changes:
|
|
78
|
-
- >20 files or security/architectural changes:
|
|
79
|
-
- Ralph floor: always
|
|
75
|
+
7. **Architect verification** (native role):
|
|
76
|
+
- <5 files, <100 lines with full tests: `task(agent_type="architect", reasoning_effort="medium", prompt="...")` minimum
|
|
77
|
+
- Standard changes: `task(agent_type="architect", reasoning_effort="medium", prompt="...")`
|
|
78
|
+
- >20 files or security/architectural changes: `task(agent_type="architect", reasoning_effort="xhigh", prompt="...")`
|
|
79
|
+
- Ralph floor: always run an explicit `architect` native subagent, even for small changes
|
|
80
80
|
7.5 **Mandatory Deslop Pass**:
|
|
81
81
|
- After Step 7 passes, run `oh-my-codex:ai-slop-cleaner` on **all files changed during the Ralph session**.
|
|
82
82
|
- Scope the cleaner to **changed files only**; do not widen the pass beyond Ralph-owned edits.
|
|
@@ -87,7 +87,7 @@ Complex tasks often fail silently: partial implementations get declared "done",
|
|
|
87
87
|
- If post-deslop regression fails, roll back cleaner changes or fix and retry. Then rerun Step 7.5 and Step 7.6 until the regression is green.
|
|
88
88
|
- Do not proceed to completion until post-deslop regression is green (unless `--no-deslop` explicitly skipped the deslop pass).
|
|
89
89
|
8. **On approval**: If Codex goal mode is active, call `update_goal({status: "complete"})` before `/cancel`; report final elapsed time and token-budget usage when the tool returns it. Then run `/cancel` to cleanly exit and clean up all state files.
|
|
90
|
-
9. **On rejection**: Fix the issues raised, then re-verify
|
|
90
|
+
9. **On rejection**: Fix the issues raised, then re-verify with the same `agent_type` and `reasoning_effort` profile
|
|
91
91
|
</Steps>
|
|
92
92
|
|
|
93
93
|
<Tool_Usage>
|
|
@@ -150,11 +150,11 @@ Use the CLI-first state surface for Ralph lifecycle state (`omx state write/read
|
|
|
150
150
|
<Good>
|
|
151
151
|
Correct parallel delegation:
|
|
152
152
|
```
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
153
|
+
task(agent_type="executor", reasoning_effort="low", prompt="Add type export for UserConfig")
|
|
154
|
+
task(agent_type="executor", reasoning_effort="medium", prompt="Implement the caching layer for API responses")
|
|
155
|
+
task(agent_type="executor", reasoning_effort="xhigh", prompt="Refactor auth module to support OAuth2 flow")
|
|
156
156
|
```
|
|
157
|
-
Why good: Three independent tasks fired simultaneously
|
|
157
|
+
Why good: Three independent tasks fired simultaneously while explicitly selecting the installed `executor` native role, so the UI/tracker does not show default subagents; legacy tier intent is preserved through native reasoning effort (`LOW` -> `low`, `STANDARD` -> `medium`, `THOROUGH` -> `xhigh`).
|
|
158
158
|
</Good>
|
|
159
159
|
|
|
160
160
|
<Good>
|
|
@@ -163,7 +163,7 @@ Correct verification before completion:
|
|
|
163
163
|
1. Run: npm test → Output: "42 passed, 0 failed"
|
|
164
164
|
2. Run: npm run build → Output: "Build succeeded"
|
|
165
165
|
3. Run: lsp_diagnostics → Output: 0 errors
|
|
166
|
-
4.
|
|
166
|
+
4. task(agent_type="architect", reasoning_effort="medium", prompt="verify completion") → Verdict: "APPROVED"
|
|
167
167
|
5. Run /cancel
|
|
168
168
|
```
|
|
169
169
|
Why good: Fresh evidence at each step, architect verification, then clean exit.
|
|
@@ -178,9 +178,9 @@ Why bad: Uses "should" and "look good" -- no fresh test/build output, no archite
|
|
|
178
178
|
<Bad>
|
|
179
179
|
Sequential execution of independent tasks:
|
|
180
180
|
```
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
181
|
+
task(agent_type="executor", reasoning_effort="low", prompt="Add type export") → wait →
|
|
182
|
+
task(agent_type="executor", reasoning_effort="medium", prompt="Implement caching") → wait →
|
|
183
|
+
task(agent_type="executor", reasoning_effort="xhigh", prompt="Refactor auth")
|
|
184
184
|
```
|
|
185
185
|
Why bad: These are independent tasks that should run in parallel, not sequentially.
|
|
186
186
|
</Bad>
|
|
@@ -200,7 +200,7 @@ Why bad: These are independent tasks that should run in parallel, not sequential
|
|
|
200
200
|
- [ ] Fresh test run output shows all tests pass
|
|
201
201
|
- [ ] Fresh build output shows success
|
|
202
202
|
- [ ] lsp_diagnostics shows 0 errors on affected files
|
|
203
|
-
- [ ] Architect verification passed (
|
|
203
|
+
- [ ] Architect verification passed through explicit `task(agent_type="architect", reasoning_effort="medium"...)` minimum
|
|
204
204
|
- [ ] Codex goal-mode completion audit passed, and `update_goal({status: "complete"})` was called when an active goal exists
|
|
205
205
|
- [ ] ai-slop-cleaner pass completed on changed files (or --no-deslop specified)
|
|
206
206
|
- [ ] Post-deslop regression tests pass
|
package/skills/ralplan/SKILL.md
CHANGED
|
@@ -64,6 +64,18 @@ The consensus workflow:
|
|
|
64
64
|
|
|
65
65
|
> **Important:** Steps 3 and 4 MUST run sequentially as role-specific subagents. Do NOT issue both agent calls in the same parallel batch. Always await the subsequent `Architect` result before invoking the subsequent `Critic`; only a completed, role-specific `Critic` approval can satisfy the durable gate.
|
|
66
66
|
|
|
67
|
+
## Planning/Execution Boundary
|
|
68
|
+
|
|
69
|
+
`$ralplan` is a planning mode. While ralplan is active and no explicit execution handoff is active, implementation-focused write tools are out of scope. Ralplan may inspect the repository and may write only planning artifacts such as `.omx/context/`, `.omx/plans/`, `.omx/specs/`, and required `.omx/state/` records.
|
|
70
|
+
|
|
71
|
+
The canonical flow is:
|
|
72
|
+
|
|
73
|
+
```
|
|
74
|
+
$ralplan -> durable consensus artifact -> explicit execution lane -> $ultragoal | $team | $ralph
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
Before any execution lane begins, ralplan must emit terminal planning state (complete, paused, failed, or waiting for input) and the durable handoff record below. Do not continue from consensus planning into direct code edits in the same ralplan session.
|
|
78
|
+
|
|
67
79
|
## Durable Consensus Handoff Contract
|
|
68
80
|
|
|
69
81
|
Ralplan is not complete, skippable, or ready for execution merely because `.omx/plans/prd-*.md` and `.omx/plans/test-spec-*.md` exist. Those files are planning artifacts, not consensus evidence.
|
package/skills/team/SKILL.md
CHANGED
|
@@ -57,6 +57,22 @@ requiring a separate linked Ralph launch up front.
|
|
|
57
57
|
- **Escalation:** start a separate `omx ralph ...` / `$ralph ...` only when a later manual follow-up still needs a persistent single-owner fix/verification loop.
|
|
58
58
|
- **Deprecation:** `omx team ralph ...` has been removed. Use plain `omx team ...` for team execution or run `omx ralph ...` separately when you explicitly want a later Ralph loop.
|
|
59
59
|
|
|
60
|
+
|
|
61
|
+
### Team Big Five / ATEM coordination gate
|
|
62
|
+
|
|
63
|
+
`$team` keeps simple independent fan-out lightweight. For isolated tasks (for example per-file sweeps, typo/copy edits, or explicitly independent lanes with no shared files/dependencies), workers use the normal concise protocol: startup ACK, claim-safe task lifecycle, status, verification, and completion evidence.
|
|
64
|
+
|
|
65
|
+
Activate the lightweight Team Big Five + ATEM-inspired coordination layer when the task or task graph has dependencies, shared files/surfaces/contracts, cross-boundary ownership, handoffs, integration/merge work, blocked lanes, or changed assumptions. The protocol is not a separate ceremony; it is a concise boundary checklist:
|
|
66
|
+
|
|
67
|
+
- **Shared mental model / single source of truth:** task JSON, inbox, mailbox, approved handoff, and leader updates are canonical.
|
|
68
|
+
- **Closed-loop communication / ACK-readback handoffs:** acknowledge handoffs with understood scope, affected artifact/path, owner, and next action.
|
|
69
|
+
- **Mutual performance monitoring at boundaries:** check upstream/downstream contracts, shared files, and verification evidence before completion.
|
|
70
|
+
- **Backup/reassignment behavior:** blocked workers report the smallest needed help/reassignment request and continue safe unblocked slices.
|
|
71
|
+
- **Adaptability checkpoints:** changed assumptions, dependencies, or verification results trigger a brief leader-facing update before widening scope.
|
|
72
|
+
- **Team orientation:** workers optimize for the integrated team outcome, not local-optimum-only task summaries; report integration risks, missing tests, and peer impacts.
|
|
73
|
+
|
|
74
|
+
ATEM fit: treat this as agile teamwork support for transition/action/interpersonal moments around boundaries, not as a heavyweight process model. Do not copy provider-specific plugin implementations; keep the protocol in OMX/Codex prompts, inboxes, state, and tests.
|
|
75
|
+
|
|
60
76
|
### Team + Ultragoal bridge
|
|
61
77
|
|
|
62
78
|
Use `$ultragoal` for durable leader-owned goal/ledger tracking and `$team` for parallel execution lanes. When Team is launched with an active `.omx/ultragoal/goals.json`, worker inboxes/status may include leader-owned Ultragoal context: `.omx/ultragoal/goals.json`, `.omx/ultragoal/ledger.jsonl`, the active goal id, Codex goal mode, and the `fresh_leader_get_goal_required` checkpoint policy.
|
package/skills/ultraqa/SKILL.md
CHANGED
|
@@ -58,6 +58,15 @@ The matrix must include normal-path coverage plus adversarial dynamic e2e scenar
|
|
|
58
58
|
- Validate exit codes and output semantics; do not trust success-looking text alone.
|
|
59
59
|
- Do not delete, rewrite, or mask unrelated user work. Capture dirty-worktree evidence before and after generated harness work.
|
|
60
60
|
|
|
61
|
+
### Temporary Harness Generation Guardrails
|
|
62
|
+
|
|
63
|
+
Generated harnesses are part of the QA evidence chain; until setup succeeds, they are evidence about the harness apparatus, not product behavior.
|
|
64
|
+
|
|
65
|
+
- **Use absolute repo imports for built artifacts.** When a harness runs from `/tmp` or another scratch directory but imports repository code, resolve the repository root explicitly from the verified repo cwd and import built modules with an absolute path or `pathToFileURL(join(repoRoot, "dist", ...)).href`. Never rely on `./dist/...` from the harness file's temporary directory.
|
|
66
|
+
- **Use a safe file writer for JS/TS harness bodies.** Prefer a small Node/Python writer or another non-interpolating file-write mechanism for harness source that contains backticks, `${...}`, shell metacharacters, or prompt-injection strings. If a shell heredoc is unavoidable, quote the delimiter and verify the written file before execution; do not use interpolating heredocs for JavaScript assertions.
|
|
67
|
+
- **Sanitize OMX runtime env for isolated probes.** When the scenario creates a temporary repo/state tree or intentionally checks local isolation, run the probe with `OMX_ROOT` and `OMX_STATE_ROOT` unset (for example `env -u OMX_ROOT -u OMX_STATE_ROOT ...`) so ambient boxed runtime state cannot redirect reads/writes away from the scenario fixture.
|
|
68
|
+
- **Classify harness setup failures separately.** If a generated harness fails before exercising product behavior because of import paths, shell interpolation, environment leakage, or fixture construction, record it as harness debris, fix the harness, and rerun the scenario before declaring a product defect.
|
|
69
|
+
|
|
61
70
|
## Cycle Workflow
|
|
62
71
|
|
|
63
72
|
### Cycle N (Max 5)
|
package/skills/worker/SKILL.md
CHANGED
|
@@ -101,6 +101,20 @@ Worker sessions should treat team state + CLI interop as the source of truth.
|
|
|
101
101
|
- Do **not** rely on ad-hoc tmux keystrokes as a primary delivery channel.
|
|
102
102
|
- If a manual trigger arrives (for example `tmux send-keys` nudge), treat it only as a prompt to re-check state and continue through the normal claim-safe lifecycle.
|
|
103
103
|
|
|
104
|
+
|
|
105
|
+
## Team Big Five / ATEM Coordination Gate
|
|
106
|
+
|
|
107
|
+
Keep independent fan-out lightweight: if your task is isolated with no shared files, dependencies, or handoffs, normal startup ACK, claim-safe lifecycle, status, verification, and completion evidence are sufficient.
|
|
108
|
+
|
|
109
|
+
When your inbox/task activates the Team Big Five / ATEM-inspired protocol (dependencies, shared files/surfaces/contracts, handoffs, integration, blocked lanes, or changed assumptions), use this concise boundary checklist:
|
|
110
|
+
|
|
111
|
+
- Shared mental model / single source of truth: treat task JSON, inbox, mailbox, approved handoff, and leader updates as canonical.
|
|
112
|
+
- Closed-loop communication / ACK-readback: acknowledge handoffs with what you understood, affected artifact/path, owner, and next action.
|
|
113
|
+
- Mutual performance monitoring: check boundary contracts, shared files, and verification evidence before completion.
|
|
114
|
+
- Backup/reassignment behavior: if blocked, write blocked status with the smallest needed help/reassignment request and continue any safe unblocked slice.
|
|
115
|
+
- Adaptability checkpoint: changed assumptions, dependencies, or verification results require a brief leader-facing update before widening scope.
|
|
116
|
+
- Team orientation: optimize for the integrated team result; report integration risks, missing tests, and peer impacts instead of local-only success.
|
|
117
|
+
|
|
104
118
|
## Shutdown
|
|
105
119
|
|
|
106
120
|
If the lead sends a shutdown request, follow the shutdown inbox instructions exactly, write your shutdown ack file, then exit the Codex session.
|