oh-my-codex 0.18.8 → 0.18.10
- package/Cargo.lock +12 -12
- package/Cargo.toml +1 -1
- package/README.md +4 -0
- package/dist/autopilot/__tests__/deep-interview-gate.test.d.ts +2 -0
- package/dist/autopilot/__tests__/deep-interview-gate.test.d.ts.map +1 -0
- package/dist/autopilot/__tests__/deep-interview-gate.test.js +215 -0
- package/dist/autopilot/__tests__/deep-interview-gate.test.js.map +1 -0
- package/dist/autopilot/__tests__/fsm.test.js +3 -0
- package/dist/autopilot/__tests__/fsm.test.js.map +1 -1
- package/dist/autopilot/__tests__/ralplan-gate.test.js +148 -0
- package/dist/autopilot/__tests__/ralplan-gate.test.js.map +1 -1
- package/dist/autopilot/deep-interview-gate.d.ts.map +1 -1
- package/dist/autopilot/deep-interview-gate.js +140 -0
- package/dist/autopilot/deep-interview-gate.js.map +1 -1
- package/dist/autopilot/fsm.js +2 -2
- package/dist/autopilot/fsm.js.map +1 -1
- package/dist/cli/__tests__/auth.test.js +37 -2
- package/dist/cli/__tests__/auth.test.js.map +1 -1
- package/dist/cli/__tests__/codex-feature-probe.test.d.ts +2 -0
- package/dist/cli/__tests__/codex-feature-probe.test.d.ts.map +1 -0
- package/dist/cli/__tests__/codex-feature-probe.test.js +46 -0
- package/dist/cli/__tests__/codex-feature-probe.test.js.map +1 -0
- package/dist/cli/__tests__/codex-plugin-layout.test.js +1 -1
- package/dist/cli/__tests__/codex-plugin-layout.test.js.map +1 -1
- package/dist/cli/__tests__/doctor-warning-copy.test.js +2 -0
- package/dist/cli/__tests__/doctor-warning-copy.test.js.map +1 -1
- package/dist/cli/__tests__/index.test.js +288 -6
- package/dist/cli/__tests__/index.test.js.map +1 -1
- package/dist/cli/__tests__/launch-fallback.test.js +19 -5
- package/dist/cli/__tests__/launch-fallback.test.js.map +1 -1
- package/dist/cli/__tests__/package-bin-contract.test.js +39 -10
- package/dist/cli/__tests__/package-bin-contract.test.js.map +1 -1
- package/dist/cli/__tests__/question.test.js +26 -9
- package/dist/cli/__tests__/question.test.js.map +1 -1
- package/dist/cli/__tests__/resume.test.js +50 -1
- package/dist/cli/__tests__/resume.test.js.map +1 -1
- package/dist/cli/__tests__/setup-refresh.test.js +6 -2
- package/dist/cli/__tests__/setup-refresh.test.js.map +1 -1
- package/dist/cli/__tests__/sparkshell-packaging.test.js +45 -2
- package/dist/cli/__tests__/sparkshell-packaging.test.js.map +1 -1
- package/dist/cli/__tests__/team-decompose.test.js +10 -5
- package/dist/cli/__tests__/team-decompose.test.js.map +1 -1
- package/dist/cli/__tests__/team.test.js +45 -1
- package/dist/cli/__tests__/team.test.js.map +1 -1
- package/dist/cli/__tests__/ultragoal.test.js +75 -0
- package/dist/cli/__tests__/ultragoal.test.js.map +1 -1
- package/dist/cli/__tests__/update.test.js +214 -17
- package/dist/cli/__tests__/update.test.js.map +1 -1
- package/dist/cli/__tests__/windows-popup-loop-contract.test.js +1 -1
- package/dist/cli/auth.d.ts.map +1 -1
- package/dist/cli/auth.js +25 -1
- package/dist/cli/auth.js.map +1 -1
- package/dist/cli/codex-feature-probe.d.ts +5 -2
- package/dist/cli/codex-feature-probe.d.ts.map +1 -1
- package/dist/cli/codex-feature-probe.js +25 -9
- package/dist/cli/codex-feature-probe.js.map +1 -1
- package/dist/cli/index.d.ts +39 -5
- package/dist/cli/index.d.ts.map +1 -1
- package/dist/cli/index.js +184 -101
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/setup.d.ts.map +1 -1
- package/dist/cli/setup.js +9 -1
- package/dist/cli/setup.js.map +1 -1
- package/dist/cli/team.d.ts +4 -0
- package/dist/cli/team.d.ts.map +1 -1
- package/dist/cli/team.js +43 -4
- package/dist/cli/team.js.map +1 -1
- package/dist/cli/ultragoal.d.ts.map +1 -1
- package/dist/cli/ultragoal.js +29 -0
- package/dist/cli/ultragoal.js.map +1 -1
- package/dist/cli/update.d.ts +20 -3
- package/dist/cli/update.d.ts.map +1 -1
- package/dist/cli/update.js +265 -23
- package/dist/cli/update.js.map +1 -1
- package/dist/cli/version.d.ts.map +1 -1
- package/dist/cli/version.js +5 -9
- package/dist/cli/version.js.map +1 -1
- package/dist/compat/__tests__/doctor-contract.test.js +12 -1
- package/dist/compat/__tests__/doctor-contract.test.js.map +1 -1
- package/dist/hooks/__tests__/agents-overlay.test.js +1 -0
- package/dist/hooks/__tests__/agents-overlay.test.js.map +1 -1
- package/dist/hooks/__tests__/autopilot-skill-contract.test.js +15 -0
- package/dist/hooks/__tests__/autopilot-skill-contract.test.js.map +1 -1
- package/dist/hooks/__tests__/code-review-skill-contract.test.js +7 -3
- package/dist/hooks/__tests__/code-review-skill-contract.test.js.map +1 -1
- package/dist/hooks/__tests__/deep-interview-contract.test.js +46 -1
- package/dist/hooks/__tests__/deep-interview-contract.test.js.map +1 -1
- package/dist/hooks/__tests__/skill-guidance-contract.test.js +14 -5
- package/dist/hooks/__tests__/skill-guidance-contract.test.js.map +1 -1
- package/dist/hooks/agents-overlay.d.ts.map +1 -1
- package/dist/hooks/agents-overlay.js +2 -1
- package/dist/hooks/agents-overlay.js.map +1 -1
- package/dist/hooks/extensibility/__tests__/plugin-runner.test.js +112 -1
- package/dist/hooks/extensibility/__tests__/plugin-runner.test.js.map +1 -1
- package/dist/hooks/extensibility/plugin-runner-stdin.d.ts +2 -0
- package/dist/hooks/extensibility/plugin-runner-stdin.d.ts.map +1 -0
- package/dist/hooks/extensibility/plugin-runner-stdin.js +16 -0
- package/dist/hooks/extensibility/plugin-runner-stdin.js.map +1 -0
- package/dist/hooks/extensibility/plugin-runner.js +2 -4
- package/dist/hooks/extensibility/plugin-runner.js.map +1 -1
- package/dist/hud/__tests__/index.test.js +23 -2
- package/dist/hud/__tests__/index.test.js.map +1 -1
- package/dist/hud/__tests__/reconcile.test.js +387 -0
- package/dist/hud/__tests__/reconcile.test.js.map +1 -1
- package/dist/hud/__tests__/state.test.js +28 -0
- package/dist/hud/__tests__/state.test.js.map +1 -1
- package/dist/hud/__tests__/tmux.test.js +118 -7
- package/dist/hud/__tests__/tmux.test.js.map +1 -1
- package/dist/hud/index.d.ts +6 -1
- package/dist/hud/index.d.ts.map +1 -1
- package/dist/hud/index.js +12 -3
- package/dist/hud/index.js.map +1 -1
- package/dist/hud/reconcile.d.ts +6 -2
- package/dist/hud/reconcile.d.ts.map +1 -1
- package/dist/hud/reconcile.js +58 -28
- package/dist/hud/reconcile.js.map +1 -1
- package/dist/hud/state.d.ts.map +1 -1
- package/dist/hud/state.js +4 -18
- package/dist/hud/state.js.map +1 -1
- package/dist/hud/tmux.d.ts +14 -1
- package/dist/hud/tmux.d.ts.map +1 -1
- package/dist/hud/tmux.js +129 -15
- package/dist/hud/tmux.js.map +1 -1
- package/dist/question/__tests__/renderer.test.js +566 -1
- package/dist/question/__tests__/renderer.test.js.map +1 -1
- package/dist/question/renderer.d.ts +9 -1
- package/dist/question/renderer.d.ts.map +1 -1
- package/dist/question/renderer.js +246 -70
- package/dist/question/renderer.js.map +1 -1
- package/dist/ralplan/consensus-gate.js +9 -1
- package/dist/ralplan/consensus-gate.js.map +1 -1
- package/dist/scripts/__tests__/codex-native-hook.test.js +322 -15
- package/dist/scripts/__tests__/codex-native-hook.test.js.map +1 -1
- package/dist/scripts/__tests__/run-test-files.test.js +115 -1
- package/dist/scripts/__tests__/run-test-files.test.js.map +1 -1
- package/dist/scripts/codex-native-hook.d.ts.map +1 -1
- package/dist/scripts/codex-native-hook.js +94 -20
- package/dist/scripts/codex-native-hook.js.map +1 -1
- package/dist/scripts/notify-hook/team-worker-stop.d.ts.map +1 -1
- package/dist/scripts/notify-hook/team-worker-stop.js +54 -21
- package/dist/scripts/notify-hook/team-worker-stop.js.map +1 -1
- package/dist/scripts/run-test-files.js +218 -160
- package/dist/scripts/run-test-files.js.map +1 -1
- package/dist/state/__tests__/operations.test.js +463 -0
- package/dist/state/__tests__/operations.test.js.map +1 -1
- package/dist/team/__tests__/delivery-log.test.js +18 -0
- package/dist/team/__tests__/delivery-log.test.js.map +1 -1
- package/dist/team/__tests__/runtime.test.js +48 -0
- package/dist/team/__tests__/runtime.test.js.map +1 -1
- package/dist/team/__tests__/tmux-session.test.js +107 -0
- package/dist/team/__tests__/tmux-session.test.js.map +1 -1
- package/dist/team/__tests__/tmux-test-fixture.d.ts.map +1 -1
- package/dist/team/__tests__/tmux-test-fixture.js +14 -2
- package/dist/team/__tests__/tmux-test-fixture.js.map +1 -1
- package/dist/team/__tests__/tmux-test-fixture.test.js +1 -0
- package/dist/team/__tests__/tmux-test-fixture.test.js.map +1 -1
- package/dist/team/__tests__/worker-bootstrap.test.js +54 -1
- package/dist/team/__tests__/worker-bootstrap.test.js.map +1 -1
- package/dist/team/delivery-log.d.ts +1 -1
- package/dist/team/delivery-log.d.ts.map +1 -1
- package/dist/team/delivery-log.js.map +1 -1
- package/dist/team/repo-aware-decomposition.d.ts +4 -0
- package/dist/team/repo-aware-decomposition.d.ts.map +1 -1
- package/dist/team/repo-aware-decomposition.js.map +1 -1
- package/dist/team/runtime.d.ts.map +1 -1
- package/dist/team/runtime.js +78 -9
- package/dist/team/runtime.js.map +1 -1
- package/dist/team/tmux-session.d.ts +1 -0
- package/dist/team/tmux-session.d.ts.map +1 -1
- package/dist/team/tmux-session.js +16 -5
- package/dist/team/tmux-session.js.map +1 -1
- package/dist/team/ultragoal-context.d.ts +12 -0
- package/dist/team/ultragoal-context.d.ts.map +1 -1
- package/dist/team/ultragoal-context.js +32 -8
- package/dist/team/ultragoal-context.js.map +1 -1
- package/dist/utils/__tests__/paths.test.js +23 -0
- package/dist/utils/__tests__/paths.test.js.map +1 -1
- package/dist/utils/__tests__/platform-command.test.js +16 -1
- package/dist/utils/__tests__/platform-command.test.js.map +1 -1
- package/dist/utils/__tests__/version.test.d.ts +2 -0
- package/dist/utils/__tests__/version.test.d.ts.map +1 -0
- package/dist/utils/__tests__/version.test.js +51 -0
- package/dist/utils/__tests__/version.test.js.map +1 -0
- package/dist/utils/paths.d.ts +8 -1
- package/dist/utils/paths.d.ts.map +1 -1
- package/dist/utils/paths.js +20 -6
- package/dist/utils/paths.js.map +1 -1
- package/dist/utils/platform-command.d.ts +9 -0
- package/dist/utils/platform-command.d.ts.map +1 -1
- package/dist/utils/platform-command.js +15 -0
- package/dist/utils/platform-command.js.map +1 -1
- package/dist/utils/toml.d.ts +4 -0
- package/dist/utils/toml.d.ts.map +1 -0
- package/dist/utils/toml.js +75 -0
- package/dist/utils/toml.js.map +1 -0
- package/dist/utils/version.d.ts +7 -0
- package/dist/utils/version.d.ts.map +1 -0
- package/dist/utils/version.js +67 -0
- package/dist/utils/version.js.map +1 -0
- package/dist/verification/__tests__/ci-rust-gates.test.js +8 -0
- package/dist/verification/__tests__/ci-rust-gates.test.js.map +1 -1
- package/dist/verification/__tests__/dev-merge-issue-close-workflow.test.js +16 -2
- package/dist/verification/__tests__/dev-merge-issue-close-workflow.test.js.map +1 -1
- package/package.json +4 -3
- package/plugins/oh-my-codex/.codex-plugin/plugin.json +1 -1
- package/plugins/oh-my-codex/skills/autopilot/SKILL.md +3 -0
- package/plugins/oh-my-codex/skills/code-review/SKILL.md +2 -2
- package/plugins/oh-my-codex/skills/deep-interview/SKILL.md +85 -11
- package/plugins/oh-my-codex/skills/ultrawork/SKILL.md +32 -17
- package/skills/autopilot/SKILL.md +3 -0
- package/skills/code-review/SKILL.md +2 -2
- package/skills/deep-interview/SKILL.md +85 -11
- package/skills/ultrawork/SKILL.md +32 -17
- package/src/scripts/__tests__/codex-native-hook.test.ts +391 -26
- package/src/scripts/__tests__/run-test-files.test.ts +138 -2
- package/src/scripts/codex-native-hook.ts +99 -17
- package/src/scripts/notify-hook/team-worker-stop.ts +58 -18
- package/src/scripts/prepare-build.js +83 -0
- package/src/scripts/run-test-files.ts +229 -150
- package/templates/AGENTS.md +40 -199
- package/src/scripts/postinstall-bootstrap.js +0 -23
@@ -51,6 +51,11 @@ If no flag is provided, use **Standard**.
 - Gather codebase facts via `explore` before asking user about internals
 - `omx explore` is deprecated. Use normal repository inspection tools/subagents for simple read-only brownfield fact gathering; use `omx sparkshell` only for explicit shell-native read-only evidence, and keep ambiguous or non-shell-only investigation on the richer normal path.
 - Always run a preflight context intake before the first interview question
+- For brownfield work, preflight must include doc/context grounding before user-facing questions: inspect applicable `AGENTS.md` files, README/getting-started docs, relevant `docs/` contracts/plans/ADRs, existing `.omx/context/` snapshots, and any project-local glossary/context files such as `CONTEXT.md` or `CONTEXT-MAP.md` when present.
+- Treat existing repo language as evidence, not authority: if the user uses a fuzzy, overloaded, or conflicting term, surface the specific doc/code wording and ask which meaning should govern before implementation.
+- Cross-check user claims about current behavior against code or documented contracts when discoverable. If docs and code disagree, ask a confirmation question that names both sources instead of silently choosing one.
+- Use scenario-based edge-case grilling when relationships, boundaries, or handoff behavior are unclear: invent one concrete scenario that stresses the ambiguous boundary, then ask one focused question about the expected outcome.
+- Durable docs, glossary, ADR, or memory updates are opt-in and public-safe only. Deep-interview may recommend such updates in the handoff summary, but must not automatically create or dump public docs from interview transcripts unless the user explicitly chooses that as in-scope.
 - If initial context is oversized or would exceed the prompt budget, do not paste or forward the raw payload into interview prompts; request and record a prompt-safe initial-context summary first
 - The oversized initial-context summary gate is blocking: wait for the concise summary before ambiguity scoring, crystallizing artifacts, or any downstream execution handoff
 - The summary must preserve goals, constraints, success criteria, non-goals, decision boundaries, and references to any full source documents so downstream consumers receive a prompt-safe but faithful context
@@ -97,8 +102,15 @@ If no flag is provided, use **Standard**.
 - Unknowns/open questions
 - Decision-boundary unknowns
 - Likely codebase touchpoints
+- Relevant repo docs/rules/context inspected
+- Terminology or doc/code conflicts found
 - Prompt-safe initial-context summary status (`not_needed`, `needed`, or `recorded`)
-5.
+5. For brownfield tasks, inspect the applicable documentation/rule surface before the first user-facing round. Prefer exact, nearby sources over broad scans:
+- governing `AGENTS.md` files and template/runtime instruction surfaces that apply to the touched paths
+- README/getting-started docs and relevant docs under `docs/`, especially contracts, plans, ADR-like records, and workflow docs
+- existing `.omx/context/` snapshots, `.omx/specs/`, and planning artifacts relevant to the slug
+- project-local glossary/context files such as `CONTEXT.md`, `CONTEXT-MAP.md`, or context-specific docs when they exist
+6. Save snapshot to `.omx/context/{slug}-{timestamp}.md` (UTC `YYYYMMDDTHHMMSSZ`) and reference it in mode state.
 
 ## Phase 1: Initialize
 
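The snapshot naming rule in the hunk above (`.omx/context/{slug}-{timestamp}.md`, with a UTC `YYYYMMDDTHHMMSSZ` stamp) can be sketched as follows. This is a minimal illustration rather than code from the package: the function names `formatUtcStamp` and `snapshotPath` are hypothetical.

```typescript
// Hypothetical helpers; neither name comes from oh-my-codex itself.

/** Format a Date as a compact UTC stamp: YYYYMMDDTHHMMSSZ. */
function formatUtcStamp(date: Date): string {
  const pad = (n: number, width: number = 2): string =>
    String(n).padStart(width, "0");
  return (
    pad(date.getUTCFullYear(), 4) +
    pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(date.getUTCDate()) +
    "T" +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds()) +
    "Z"
  );
}

/** Build the snapshot path `.omx/context/{slug}-{timestamp}.md`. */
function snapshotPath(slug: string, now: Date = new Date()): string {
  return `.omx/context/${slug}-${formatUtcStamp(now)}.md`;
}
```

Using the UTC getters keeps the stamp independent of the machine's local time zone, which matches the "UTC" requirement in the skill text.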
@@ -137,13 +149,14 @@ If no flag is provided, use **Standard**.
 Repeat until ambiguity `<= threshold`, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached. This is a stop condition: below threshold, do not open a new ordinary interview branch.
 
 ### 2a) Generate next question
-If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to `$ralplan`, `$autopilot`, `$ralph`, or `$team` until that summary answer is captured.
+If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to `$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, or `$team` until that summary answer is captured.
 
 Use:
 - Original idea
 - Prior Q&A rounds
 - Current dimension scores
 - Brownfield context (if any)
+- Doc/context grounding notes, including existing terminology, governing rules, and any doc/code mismatch
 - Activated challenge mode injection (Phase 3)
 
 Target the lowest-scoring dimension, but respect stage priority:
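The oversized-context rule in the hunk above acts as a blocking gate: while a summary is needed but not yet recorded, the only permitted question is the summary request. A minimal sketch of that decision is given here. The three status values come from the skill text (`not_needed`, `needed`, `recorded`), while the type and function names are hypothetical and not part of the package.

```typescript
// Status values are taken from the skill text; every name below is hypothetical.
type SummaryStatus = "not_needed" | "needed" | "recorded";

type NextStep = "request_summary" | "continue_interview";

/** Decide what the next interview round is allowed to do. */
function nextStepForSummaryGate(status: SummaryStatus): NextStep {
  // While a summary is needed but not recorded, the only allowed question
  // is the summary request: no scoring, no readiness gates, no handoff.
  return status === "needed" ? "request_summary" : "continue_interview";
}

/** Guard for call sites that score ambiguity or hand off downstream. */
function mayScoreOrHandOff(status: SummaryStatus): boolean {
  return nextStepForSummaryGate(status) === "continue_interview";
}
```

A caller would check `mayScoreOrHandOff` before ambiguity scoring or any handoff to `$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, or `$team`.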
@@ -155,12 +168,21 @@ Follow-up pressure ladder after each answer:
 1. Ask for a concrete example, counterexample, or evidence signal behind the latest claim
 2. Probe the hidden assumption, dependency, or belief that makes the claim true
 3. Force a boundary or tradeoff: what would you explicitly not do, defer, or reject?
-4.
+4. Challenge fuzzy or conflicting terms against the repo's documented language and current code behavior
+5. Stress-test the boundary with one concrete scenario or edge case when a relationship or handoff remains ambiguous
+6. If the answer still describes symptoms, reframe toward essence / root cause before moving on
 
 Prefer staying on the same thread for multiple rounds when it has the highest leverage. Breadth without pressure is not progress.
 
 Maintain a **Breadth Ledger** across independent ambiguity tracks: scope, constraints, outputs, verification, brownfield integration, and any user-mentioned deliverable tracks. The ledger is a guard, not a mandatory rotation rule: stay deep on the current thread until it has been pressure-tested, then zoom out only when another material track remains unresolved and would change execution.
 
+Maintain a **Docs/Terminology Ledger** for brownfield interviews:
+- repo docs/rules/context sources inspected, with path references
+- canonical terms already used by the repo and terms to avoid or disambiguate
+- user terms that conflict with docs or current code behavior
+- doc/code mismatches that require a human decision before implementation
+- optional durable-doc follow-ups that are safe to propose but not auto-apply
+
 Detailed dimensions:
 - Intent Clarity — why the user wants this
 - Outcome Clarity — what end state they want
@@ -306,6 +328,7 @@ Append round result and updated scores via `omx state write --input '<json>' --j
 Use each mode once when applicable. These are normal escalation tools, not rare rescue moves:
 
 - **Contrarian** (round 2+ or immediately when an answer rests on an untested assumption): challenge core assumptions
+- **Terminologist** (brownfield, whenever a key term is fuzzy, overloaded, or conflicts with repo docs/code): force a canonical meaning against existing project language before implementation
 - **Simplifier** (round 4+ or when scope expands faster than outcome clarity): probe minimal viable scope
 - **Ontologist** (round 5+ and ambiguity > 0.25, or when the user keeps describing symptoms): ask for essence-level reframing
 
@@ -336,6 +359,9 @@ Spec should include:
 - Assumptions exposed + resolutions
 - Pressure-pass findings (which answer was revisited, and what changed)
 - Brownfield evidence vs inference notes for any repository-grounded confirmation questions
+- Docs/Terminology Ledger with inspected repo docs/rules/context, term conflicts, and any doc/code mismatch decisions
+- Scenario/edge-case pressure findings that materially shaped scope or acceptance criteria
+- Optional durable documentation recommendations, explicitly marked opt-in and public-safe; do not include raw private transcript dumps
 - Technical context findings
 - Full or condensed transcript
 
@@ -365,11 +391,45 @@ When the clarified task is specifically about `$autoresearch`, or the skill is i
 
 ## Phase 5: Execution Bridge
 
-Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, and any residual-risk warnings across the handoff.
+Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, and any residual-risk warnings across the handoff.
+
+### Optional execution contract foundation
+
+When an Autopilot/deep-interview handoff explicitly requires a stride contract, emit it as structured data rather than prose. This is a validation foundation, not a broadness-inference feature: do not infer stride from task length, phase labels, snapshots, or freeform wording.
+
+Canonical location under Autopilot state:
+
+```json
+{
+  "handoff_artifacts": {
+    "deep_interview": {
+      "execution_contract_required": true,
+      "execution_contract": {
+        "version": 1,
+        "execution_stride": "task",
+        "source": "deep-interview",
+        "selected_by": "user",
+        "allow_task_shrink": true,
+        "completion_unit": "One focused task",
+        "stop_condition": "Stop after that task is implemented and verified",
+        "acceptance_coverage_scope": "task",
+        "shrink_policy": "allowed"
+      }
+    }
+  }
+}
+```
+
+Stride meanings:
+- `task`: conservative, small-step execution; `allow_task_shrink:true`, `acceptance_coverage_scope:"task"`, `shrink_policy:"allowed"`.
+- `deliverable`: finish the named deliverable before stopping; `allow_task_shrink:false`, `acceptance_coverage_scope:"deliverable"`, `shrink_policy:"ask_before_shrink"`.
+- `milestone`: finish the larger approved milestone unless blocked; `allow_task_shrink:false`, `acceptance_coverage_scope:"milestone"`, `shrink_policy:"deny_unless_blocked"`.
+
+Only set `execution_contract_required:true` when the selected downstream workflow needs this explicit stride/stop-condition guard. New artifacts must write the canonical snake_case schema shown above under `handoff_artifacts.deep_interview`; runtime readers may accept legacy camelCase field/marker aliases and direct/nested `execution_contract` locations only as compatibility input. If `execution_contract_required` is absent or false, downstream Autopilot compatibility behavior is unchanged.
 
 ### Goal-mode follow-ups
 
-Include these product-facing suggestions when they fit the clarified spec, without removing the existing `$ralplan`, `$autopilot`, `$ralph`, and `$team` handoff options:
+Include these product-facing suggestions when they fit the clarified spec, without removing the existing `$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, and `$team` handoff options:
 
 - **`$ultragoal`** — default goal-mode follow-up for implementation or general goal-oriented follow-up specs that should be converted into durable Codex/OMX goals with sequential completion tracking.
 - **`$autoresearch-goal`** — use when the clarified context is a research project: a research question, reference/literature gathering, evaluator-backed analysis, or professor/critic-style deliverable.
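The "Stride meanings" list in the hunk above pairs each `execution_stride` with fixed values for `allow_task_shrink`, `acceptance_coverage_scope`, and `shrink_policy`. The sketch below checks a contract against that pairing. The field names follow the canonical snake_case schema from the diff; the type names, the `STRIDE_RULES` table, and `checkExecutionContract` are hypothetical, and whether the package's own gate enforces exactly this consistency is an assumption.

```typescript
// Field names mirror the schema in the diff; all type and function names
// here are hypothetical and not taken from oh-my-codex.
type Stride = "task" | "deliverable" | "milestone";
type ShrinkPolicy = "allowed" | "ask_before_shrink" | "deny_unless_blocked";

interface ExecutionContract {
  version: number;
  execution_stride: Stride;
  source: string;
  selected_by: string;
  allow_task_shrink: boolean;
  completion_unit: string;
  stop_condition: string;
  acceptance_coverage_scope: Stride;
  shrink_policy: ShrinkPolicy;
}

// Expected values per stride, transcribed from the "Stride meanings" list.
const STRIDE_RULES: Record<Stride, { allow_task_shrink: boolean; shrink_policy: ShrinkPolicy }> = {
  task: { allow_task_shrink: true, shrink_policy: "allowed" },
  deliverable: { allow_task_shrink: false, shrink_policy: "ask_before_shrink" },
  milestone: { allow_task_shrink: false, shrink_policy: "deny_unless_blocked" },
};

/** Return the mismatches between a contract and its declared stride. */
function checkExecutionContract(contract: ExecutionContract): string[] {
  const rule = STRIDE_RULES[contract.execution_stride];
  const problems: string[] = [];
  if (contract.allow_task_shrink !== rule.allow_task_shrink) {
    problems.push("allow_task_shrink does not match execution_stride");
  }
  if (contract.acceptance_coverage_scope !== contract.execution_stride) {
    problems.push("acceptance_coverage_scope does not match execution_stride");
  }
  if (contract.shrink_policy !== rule.shrink_policy) {
    problems.push("shrink_policy does not match execution_stride");
  }
  return problems;
}

// The example contract from the diff, which should pass the check.
const taskContract: ExecutionContract = {
  version: 1,
  execution_stride: "task",
  source: "deep-interview",
  selected_by: "user",
  allow_task_shrink: true,
  completion_unit: "One focused task",
  stop_condition: "Stop after that task is implemented and verified",
  acceptance_coverage_scope: "task",
  shrink_policy: "allowed",
};
```

Returning a list of problems instead of throwing lets a caller report every mismatch at once; an empty list means the contract is internally consistent under the assumed rules.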
@@ -377,7 +437,16 @@ Include these product-facing suggestions when they fit the clarified spec, witho
 
 Recommend `$ultragoal` as the default durable goal-mode follow-up because it supersedes Ralph for goal tracking. Preserve `$team` for coordinated parallel implementation and keep `$ralph` only as an explicit fallback for persistent single-owner execution/verification when the user specifically selects it.
 
-### 1. **`$
+### 1. **`$ultragoal` (Default durable execution follow-up)**
+- **Input Artifact:** `.omx/specs/deep-interview-{slug}.md` (optionally accompanied by the transcript/context snapshot for traceability)
+- **Invocation:** `$ultragoal create-goals --brief-file <spec-path>` followed by `$ultragoal complete-goals` in the active execution lane
+- **Consumer Behavior:** Convert the clarified spec into durable goal-mode work. Preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, scenario-pressure findings, and residual-risk warnings as binding story constraints.
+- **Skipped / Already-Satisfied Stages:** Requirement interview, ambiguity clarification, doc/context preflight, and early intent-boundary elicitation
+- **Expected Output:** `.omx/ultragoal/brief.md`, `.omx/ultragoal/goals.json`, `.omx/ultragoal/ledger.jsonl`, implementation evidence, verification evidence, and final cleanup/review-gate evidence
+- **Best When:** The clarified spec is execution-ready or the user explicitly wants durable goal tracking as the next step
+- **Next Recommended Step:** Run the Ultragoal completion loop; launch `$team` only inside an active Ultragoal story when parallel lanes are warranted, and use `$ralph` only as an explicit fallback when the user asks for that legacy persistence mode
+
+### 2. **`$ralplan` (Recommended when architecture/test-shape review is still needed)**
 - **Input Artifact:** `.omx/specs/deep-interview-{slug}.md` (optionally accompanied by the transcript/context snapshot for traceability)
 - **Invocation:** `$plan --consensus --direct <spec-path>`
 - **Consumer Behavior:** Treat the deep-interview spec as the requirements source of truth. Do not repeat the interview by default; refine architecture/feasibility around the clarified intent and boundaries instead.
@@ -386,7 +455,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
 - **Best When:** Requirements are clear enough to stop interviewing, but architectural validation / consensus planning is still desirable
 - **Next Recommended Step:** Use the approved planning artifacts with `$ultragoal` as the default durable goal-mode follow-up (optionally with `$team` for parallel lanes); choose `$autoresearch-goal` for research validation or `$performance-goal` for measurable optimization, and use `$ralph` only as an explicit fallback when a narrow single-owner persistence loop is requested
 
-###
+### 3. **`$autopilot`**
 - **Input Artifact:** `.omx/specs/deep-interview-{slug}.md`
 - **Invocation:** `$autopilot <spec-path>`
 - **Consumer Behavior:** Use the deep-interview spec as the clarified execution brief. Preserve intent, non-goals, decision boundaries, and acceptance criteria as binding context for planning/execution.
@@ -395,7 +464,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
 - **Best When:** The clarified spec is already strong enough for direct planning + execution without an additional consensus gate
 - **Next Recommended Step:** Continue through autopilot's execution/QA/validation flow; if coordination-heavy execution emerges, prefer `$team` under a leader-owned `$ultragoal` ledger, using `$ralph` only as an explicit fallback when a narrow single-owner persistence loop is requested
 
-###
+### 4. **`$ralph` (Explicit fallback only)**
 - **Input Artifact:** `.omx/specs/deep-interview-{slug}.md`
 - **Invocation:** `$ralph <spec-path>`
 - **Consumer Behavior:** Use the spec's acceptance criteria and boundary constraints as the persistence target. Do not reopen requirements discovery unless the user explicitly asks to refine further.
@@ -404,7 +473,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
 - **Best When:** The user explicitly asks for Ralph's persistent sequential completion pressure; otherwise use `$ultragoal` for durable goal tracking and completion checkpoints
 - **Next Recommended Step:** If this explicit fallback is selected, continue Ralph's persistence loop; if work expands into coordination-heavy lanes, hand off to `$team` under `$ultragoal` checkpointing rather than promoting Ralph as the next default
 
-###
+### 5. **`$team`**
 - **Input Artifact:** `.omx/specs/deep-interview-{slug}.md`
 - **Invocation:** `$team <spec-path>`
 - **Consumer Behavior:** Treat the spec as shared execution context for coordinated parallel work. Preserve the clarified intent, non-goals, decision boundaries, and acceptance criteria as common lane constraints.
@@ -413,7 +482,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
 - **Best When:** The task is large, multi-lane, or blocker-sensitive enough to justify coordinated parallel execution instead of a single persistent loop
 - **Next Recommended Step:** Follow the team verification path when the coordinated execution phase finishes; checkpoint completion through `$ultragoal` by default, escalating to a separate Ralph loop only when the user explicitly asks for that persistent verification/fix owner
 
-###
+### 6. **Refine further**
 - **Input Artifact:** Existing transcript, context snapshot, and current spec draft
 - **Invocation:** Continue the interview loop
 - **Consumer Behavior:** Re-enter questioning to resolve the highest-leverage remaining uncertainty
@@ -437,6 +506,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
|
|
|
437
506
|
- Use `omx state write/read --input '<json>' --json` for resumable mode state; `state_write` / `state_read` are explicit MCP compatibility fallbacks only
|
|
438
507
|
- If the interview cannot ask a required `omx question` round, persist the blocker as terminal state with `active: false` and `current_phase: "blocked"`; do not write a terminal blocked phase with `active: true`
|
|
439
508
|
- Read/write context snapshots under `.omx/context/`
|
|
509
|
+
- Read applicable repo docs/rules/context during preflight; write durable docs, glossary, ADR, or memory updates only when the user explicitly opts in and the content is public-safe
|
|
440
510
|
- Record whether the oversized-context summary gate is not needed, pending, or satisfied before any scoring or handoff step
|
|
441
511
|
- Save transcript/spec artifacts under `.omx/interviews/` and `.omx/specs/`
|
|
442
512
|
</Tool_Usage>
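The blocked-state rule above can be sketched as a small guard. This is a minimal illustration, not the package's implementation: only `active` and `current_phase` come from the guidance, while the `ModeState` shape, the `mode` value, and the `blocked_reason` field are assumptions made for the example.

```typescript
// Hypothetical state shape: only `active` and `current_phase` are named in the guidance;
// `mode` and `blocked_reason` are illustrative assumptions.
interface ModeState {
  mode: string;
  active: boolean;
  current_phase: string;
  blocked_reason?: string;
}

// Build the terminal state for an interview that cannot ask a required question round.
function blockedState(mode: string, reason: string): ModeState {
  return { mode, active: false, current_phase: "blocked", blocked_reason: reason };
}

// Reject the combination the guidance forbids: a blocked phase that still claims to be active.
function assertConsistent(state: ModeState): void {
  if (state.current_phase === "blocked" && state.active) {
    throw new Error('current_phase "blocked" must be written with active: false');
  }
}

const state = blockedState("deep-interview", "omx question unavailable");
assertConsistent(state);

// The serialized payload is the kind of value passed to `omx state write --input '<json>' --json`.
const payload = JSON.stringify(state);
console.log(payload);
```

Running the guard before serializing keeps an inconsistent `active: true` blocked state from ever reaching the state store.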
|
|
@@ -460,7 +530,11 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
|
|
|
460
530
|
- [ ] Transcript written to `.omx/interviews/{slug}-{timestamp}.md`
|
|
461
531
|
- [ ] Spec written to `.omx/specs/deep-interview-{slug}.md`
|
|
462
532
|
- [ ] Brownfield questions use evidence-backed confirmation when applicable
|
|
463
|
-
- [ ]
|
|
533
|
+
- [ ] Brownfield preflight inspected applicable repo docs/rules/context before user-facing questions
|
|
534
|
+
- [ ] Fuzzy or conflicting terminology was challenged against repo language/current code behavior when applicable
|
|
535
|
+
- [ ] Scenario-based edge-case grilling was used when boundary ambiguity would materially affect implementation
|
|
536
|
+
- [ ] Durable docs/ADR/memory updates, if any, were explicitly opted into and public-safe
|
|
537
|
+
- [ ] Handoff options provided (`$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, `$team`) plus context-sensitive goal-mode suggestions (`$autoresearch-goal`, `$performance-goal`) when applicable
|
|
464
538
|
- [ ] No direct implementation performed in this mode
|
|
465
539
|
</Final_Checklist>
|
|
466
540
|
|
|
@@ -4,22 +4,23 @@ description: Parallel execution engine for high-throughput task completion
|
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
<Purpose>
|
|
7
|
-
Ultrawork is a parallel execution engine for high-throughput task completion. It is a component, not a standalone persistence mode: it provides parallelism, context discipline, and smart delegation guidance, but not Ralph's persistence loop, architect sign-off, or long-running completion guarantees.
|
|
7
|
+
Ultrawork is a parallel execution engine for high-throughput task completion. It is a component, not a standalone persistence or verification mode: it provides parallelism, context discipline, and smart delegation guidance, but not durable goal tracking, Team's tmux worker lifecycle, Ralph's legacy persistence loop, architect sign-off, or long-running completion guarantees.
|
|
8
8
|
</Purpose>
|
|
9
9
|
|
|
10
10
|
<Use_When>
|
|
11
11
|
- Multiple independent tasks can run simultaneously
|
|
12
12
|
- User says "ulw", "ultrawork", or explicitly wants parallel execution
|
|
13
13
|
- Task benefits from concurrent execution plus lightweight evidence before wrap-up
|
|
14
|
-
- You need a direct-tool lane plus optional background evidence lanes without entering
|
|
14
|
+
- You need a direct-tool lane plus optional background evidence lanes without entering Team or a durable goal workflow
|
|
15
15
|
</Use_When>
|
|
16
16
|
|
|
17
17
|
<Do_Not_Use_When>
|
|
18
|
-
- Task
|
|
19
|
-
- Task
|
|
20
|
-
-
|
|
18
|
+
- Task needs durable goal tracking, ledger checkpoints, or resume across stories -- use `ultragoal` instead
|
|
19
|
+
- Task needs coordinated tmux workers, shared task state, mailbox/dispatch coordination, or long-running parallel execution -- use `team` instead
|
|
20
|
+
- Task requires a full autonomous pipeline -- use `autopilot` instead (default loop: `deep-interview -> ralplan -> ultragoal`, with `team` only when needed)
|
|
21
|
+
- Task intentionally requires the legacy persistent single-owner completion/verification loop -- use `ralph` explicitly; do not present it as the default durable path
|
|
22
|
+
- There is only one sequential task with no parallelism opportunity -- execute directly, use `ultragoal` for durable tracking, or delegate to a single `executor`
|
|
21
23
|
- The request is still in plan-consensus mode -- keep planning artifacts in `ralplan` until execution is explicitly authorized
|
|
22
|
-
- User needs session persistence for resume -- use `ralph`, which adds persistence on top of ultrawork
|
|
23
24
|
</Do_Not_Use_When>
|
|
24
25
|
|
|
25
26
|
<Why_This_Exists>
|
|
@@ -138,8 +139,12 @@ Why bad: No verification output, no acceptance evidence, and no manual QA note w
|
|
|
138
139
|
</Examples>
|
|
139
140
|
|
|
140
141
|
<Escalation_And_Stop_Conditions>
|
|
141
|
-
- When ultrawork is invoked directly
|
|
142
|
-
-
|
|
142
|
+
- When ultrawork is invoked directly, apply lightweight verification only -- build/typecheck passes when relevant, affected tests pass, and manual QA notes are captured when needed.
|
|
143
|
+
- Ultrawork does not own persistence, durable ledgers, architect verification, deslop, full QA, or the full verified-completion promise. Do not claim those guarantees from direct ultrawork alone.
|
|
144
|
+
- Escalate to `ultragoal` when the work needs durable goal state, story checkpoints, or resume across implementation steps.
|
|
145
|
+
- Escalate to `team` when the work needs coordinated tmux workers, shared task state, or durable multi-worker lifecycle control.
|
|
146
|
+
- Escalate to explicitly requested `ralph` only for the supported legacy single-owner persistence/verification fallback.
|
|
147
|
+
- Ralph owns persistence, architect verification, deslop, and the full verified-completion promise only when explicitly selected as the supported legacy fallback; direct ultrawork does not own those guarantees.
|
|
143
148
|
- If a task fails repeatedly across retries, report the issue rather than retrying indefinitely.
|
|
144
149
|
- Escalate to the user when tasks have unclear dependencies, conflicting requirements, or a materially branching acceptance target.
|
|
145
150
|
</Escalation_And_Stop_Conditions>
|
|
@@ -159,17 +164,27 @@ Why bad: No verification output, no acceptance evidence, and no manual QA note w
|
|
|
159
164
|
## Relationship to Other Modes
|
|
160
165
|
|
|
161
166
|
```
|
|
162
|
-
|
|
163
|
-
\--
|
|
164
|
-
\-- provides: high-throughput execution + lightweight evidence
|
|
167
|
+
ultrawork (this skill)
|
|
168
|
+
\-- provides: in-session parallel execution discipline + lightweight evidence
|
|
165
169
|
|
|
166
|
-
|
|
167
|
-
\--
|
|
168
|
-
|
|
170
|
+
ultragoal (durable goal execution)
|
|
171
|
+
\-- owns: goal ledger, checkpoints, resume across stories, final gate discipline
|
|
172
|
+
\-- may use: team for parallel lanes when a story benefits from coordinated workers
|
|
169
173
|
|
|
170
|
-
|
|
171
|
-
\--
|
|
174
|
+
team (tmux coordinated execution)
|
|
175
|
+
\-- owns: worker panes, shared task state, mailbox/dispatch, lifecycle control
|
|
176
|
+
\-- can return: checkpoint-ready evidence to an Ultragoal leader
|
|
177
|
+
|
|
178
|
+
autopilot (strict autonomous delivery loop)
|
|
179
|
+
\-- default flow: deep-interview -> ralplan -> ultragoal -> code-review -> ultraqa
|
|
180
|
+
\-- may use: team only when an Ultragoal story needs parallel execution
|
|
181
|
+
|
|
182
|
+
ralph (supported legacy explicit fallback)
|
|
183
|
+
\-- owns: single-owner persistence loop + architect verification when intentionally selected
|
|
184
|
+
|
|
185
|
+
ecomode (deprecated compatibility-only)
|
|
186
|
+
\-- do not route users there from ultrawork; it is not the current model-selection path
|
|
172
187
|
```
|
|
173
188
|
|
|
174
|
-
Ultrawork is the parallelism and execution-discipline layer.
|
|
189
|
+
Ultrawork is the parallelism and execution-discipline layer. Ultragoal is the current default durable goal/ledger follow-up. Team is the coordinated tmux parallel runtime, often nested under an Ultragoal story when durable work needs multiple lanes. Autopilot orchestrates the full default lifecycle through deep-interview, ralplan, ultragoal, code-review, and ultraqa. Ralph remains active as an explicit legacy fallback for persistent single-owner verification, but it is not the recommended default durable path. Ecomode is deprecated compatibility-only and should not be advertised as the ultrawork model-selection route.
|
|
175
190
|
</Advanced>
|
|
@@ -133,6 +133,9 @@ Required fields:
|
|
|
133
133
|
|
|
134
134
|
- **On start**: `omx state write --input '{"mode":"autopilot","active":true,"current_phase":"deep-interview","iteration":1,"review_cycle":0,"state":{"phase_cycle":["deep-interview","ralplan","ultragoal","code-review","ultraqa"],"handoff_artifacts":{"context_snapshot_path":"<snapshot-path>","deep_interview":null,"ralplan":null,"ralplan_consensus_gate":{"required":true,"sequence":["architect-review","critic-review"],"planning_artifacts_are_not_consensus":true,"required_review_roles":["architect","critic"],"ralplan_architect_review":null,"ralplan_critic_review":null,"complete":false},"ultragoal":null,"code_review":null,"ultraqa":null},"review_verdict":null,"qa_verdict":null,"return_to_ralplan_reason":null}}' --json`
|
|
135
135
|
- **On deep-interview -> ralplan**: only after a separate gate proves the interview chain is explicitly complete or the user explicitly authorized a skip. For completion, persist `deep_interview_gate:{"status":"complete","rationale":"<why requirements are complete>","handoff_summary":"<summary>"}` (or equivalent non-empty rationale/summary) plus the clarified spec/requirements under `handoff_artifacts.deep_interview`; if a final `omx question` was involved, keep its same-session answered record linked by `question_id`/`satisfied_at`. For skip, persist `deep_interview_gate:{"status":"skipped","skip_authorized_by_user":true,"skip_reason":"<user-authorized reason>","skipped_at":"<timestamp>","source":"user","session_id":"<session>"}`. Do not leave deep-interview merely because the first `omx question` was answered or cleared.
|
|
136
|
+
- **Optional execution contract foundation**: when a downstream handoff explicitly sets `execution_contract_required:true`, persist a complete structured `execution_contract` under `handoff_artifacts.deep_interview` before leaving deep-interview. The canonical schema is `version:1`, `execution_stride:"task"|"deliverable"|"milestone"`, `source:"deep-interview"`, `selected_by:"user"|"default"`, `allow_task_shrink:<boolean>`, non-empty `completion_unit`, non-empty `stop_condition`, `acceptance_coverage_scope:"task"|"deliverable"|"milestone"`, and `shrink_policy:"allowed"|"ask_before_shrink"|"deny_unless_blocked"`.
|
|
137
|
+
- Stride semantics are binding only when `execution_contract_required:true`: `task` means `allow_task_shrink:true`, `acceptance_coverage_scope:"task"`, `shrink_policy:"allowed"`; `deliverable` means `allow_task_shrink:false`, `acceptance_coverage_scope:"deliverable"`, `shrink_policy:"ask_before_shrink"`; `milestone` means `allow_task_shrink:false`, `acceptance_coverage_scope:"milestone"`, `shrink_policy:"deny_unless_blocked"`.
|
|
138
|
+
- Preserve legacy behavior when `execution_contract_required` is absent or false. Do not infer stride from prose, broadness, phase names, snapshots, or task size; this foundation only validates an explicit structured contract and deliberately uses `milestone` rather than `phase`. New artifacts must write canonical snake_case keys under `handoff_artifacts.deep_interview`; the runtime may read legacy camelCase field/marker aliases and direct/nested `execution_contract` locations only as compatibility input.
|
|
136
139
|
- **On ralplan -> ultragoal**: only after `ralplan_consensus_gate.complete:true`, with tracker-backed native-subagent `ralplan_architect_review.agent_role:"architect"` and `ralplan_architect_review.verdict:"approve"` recorded before tracker-backed native-subagent `ralplan_critic_review.agent_role:"critic"` and `ralplan_critic_review.verdict:"approve"`; `codex_exec` or artifact-only approvals are trace evidence but not native lane proof. Set `current_phase:"ultragoal"` and persist the plan/test-spec paths under `handoff_artifacts.ralplan`.
|
|
137
140
|
- **On missing ralplan consensus evidence**: keep `current_phase:"ralplan"`, persist `ralplan_consensus_gate.complete:false` with `blocked_reason`, and report an explicit blocker or max-iteration outcome instead of handing off to execution.
|
|
138
141
|
- **On ultragoal -> code-review**: set `current_phase:"code-review"`, persist implementation/test/ledger evidence under `handoff_artifacts.ultragoal`.
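As a rough illustration of the optional execution contract check described above, the following sketch validates the field-level schema only. It is an assumption-laden example, not the gate shipped in the package: the function name `checkExecutionContract` and the error strings are hypothetical, and the stride-to-policy binding and the legacy camelCase aliases are deliberately not covered.

```typescript
const STRIDES: string[] = ["task", "deliverable", "milestone"];
const POLICIES: string[] = ["allowed", "ask_before_shrink", "deny_unless_blocked"];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function nonEmpty(v: unknown): boolean {
  return typeof v === "string" && v.trim().length > 0;
}

// Hypothetical helper: returns a list of problems; an empty list means nothing blocks the handoff.
function checkExecutionContract(deepInterview: unknown): string[] {
  // Legacy behavior: nothing to validate unless the flag is explicitly true.
  if (!isRecord(deepInterview) || deepInterview["execution_contract_required"] !== true) {
    return [];
  }
  const c = deepInterview["execution_contract"];
  if (!isRecord(c)) return ["execution_contract is required but missing"];

  const errors: string[] = [];
  if (c["version"] !== 1) errors.push("version must be 1");
  if (!STRIDES.includes(c["execution_stride"] as string)) errors.push("invalid execution_stride");
  if (c["source"] !== "deep-interview") errors.push('source must be "deep-interview"');
  if (c["selected_by"] !== "user" && c["selected_by"] !== "default") errors.push("invalid selected_by");
  if (typeof c["allow_task_shrink"] !== "boolean") errors.push("allow_task_shrink must be a boolean");
  if (!nonEmpty(c["completion_unit"])) errors.push("completion_unit must be non-empty");
  if (!nonEmpty(c["stop_condition"])) errors.push("stop_condition must be non-empty");
  if (!STRIDES.includes(c["acceptance_coverage_scope"] as string)) errors.push("invalid acceptance_coverage_scope");
  if (!POLICIES.includes(c["shrink_policy"] as string)) errors.push("invalid shrink_policy");
  return errors;
}

// A required contract with no body is reported as a single blocker.
console.log(checkExecutionContract({ execution_contract_required: true }));
```

The check never looks at prose, phase names, or task size, which mirrors the rule that stride must not be inferred.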
|
|
@@ -71,10 +71,11 @@ Delegates to the `code-reviewer` and `architect` agents in parallel for a two-la
|
|
|
71
71
|
|
|
72
72
|
Do not self-review as a fallback. If the `code-reviewer` or `architect` agent path is missing, unavailable, skipped, or fails, emit a clear unavailable-review result and block approval until the independent lane evidence exists.
|
|
73
73
|
|
|
74
|
+
Respect the user's current model and reasoning/effort selection when launching review lanes. Do not pass `model` or `reasoning_effort` overrides in the review-lane task calls unless the user explicitly asks for review-specific overrides; omitting them lets native subagents inherit the active session settings.
|
|
75
|
+
|
|
74
76
|
```
|
|
75
77
|
task(
|
|
76
78
|
agent_type="code-reviewer",
|
|
77
|
-
reasoning_effort="xhigh",
|
|
78
79
|
prompt="CODE REVIEW TASK
|
|
79
80
|
|
|
80
81
|
Review code changes for quality, security, and maintainability.
|
|
@@ -100,7 +101,6 @@ Output: Code review report with:
|
|
|
100
101
|
|
|
101
102
|
task(
|
|
102
103
|
agent_type="architect",
|
|
103
|
-
reasoning_effort="xhigh",
|
|
104
104
|
prompt="ARCHITECTURE / DEVIL'S-ADVOCATE REVIEW TASK
|
|
105
105
|
|
|
106
106
|
Review the same code changes from the architecture/tradeoff perspective.
|
|
@@ -51,6 +51,11 @@ If no flag is provided, use **Standard**.
|
|
|
51
51
|
- Gather codebase facts via `explore` before asking user about internals
|
|
52
52
|
- `omx explore` is deprecated. Use normal repository inspection tools/subagents for simple read-only brownfield fact gathering; use `omx sparkshell` only for explicit shell-native read-only evidence, and keep ambiguous or non-shell-only investigation on the richer normal path.
|
|
53
53
|
- Always run a preflight context intake before the first interview question
|
|
54
|
+
- For brownfield work, preflight must include doc/context grounding before user-facing questions: inspect applicable `AGENTS.md` files, README/getting-started docs, relevant `docs/` contracts/plans/ADRs, existing `.omx/context/` snapshots, and any project-local glossary/context files such as `CONTEXT.md` or `CONTEXT-MAP.md` when present.
|
|
55
|
+
- Treat existing repo language as evidence, not authority: if the user uses a fuzzy, overloaded, or conflicting term, surface the specific doc/code wording and ask which meaning should govern before implementation.
|
|
56
|
+
- Cross-check user claims about current behavior against code or documented contracts when discoverable. If docs and code disagree, ask a confirmation question that names both sources instead of silently choosing one.
|
|
57
|
+
- Use scenario-based edge-case grilling when relationships, boundaries, or handoff behavior are unclear: invent one concrete scenario that stresses the ambiguous boundary, then ask one focused question about the expected outcome.
|
|
58
|
+
- Durable docs, glossary, ADR, or memory updates are opt-in and public-safe only. Deep-interview may recommend such updates in the handoff summary, but must not automatically create or dump public docs from interview transcripts unless the user explicitly chooses that as in-scope.
|
|
54
59
|
- If initial context is oversized or would exceed the prompt budget, do not paste or forward the raw payload into interview prompts; request and record a prompt-safe initial-context summary first
|
|
55
60
|
- The oversized initial-context summary gate is blocking: wait for the concise summary before ambiguity scoring, crystallizing artifacts, or any downstream execution handoff
|
|
56
61
|
- The summary must preserve goals, constraints, success criteria, non-goals, decision boundaries, and references to any full source documents so downstream consumers receive a prompt-safe but faithful context
|
|
@@ -97,8 +102,15 @@ If no flag is provided, use **Standard**.
|
|
|
97
102
|
- Unknowns/open questions
|
|
98
103
|
- Decision-boundary unknowns
|
|
99
104
|
- Likely codebase touchpoints
|
|
105
|
+
- Relevant repo docs/rules/context inspected
|
|
106
|
+
- Terminology or doc/code conflicts found
|
|
100
107
|
- Prompt-safe initial-context summary status (`not_needed`, `needed`, or `recorded`)
|
|
101
|
-
5.
|
|
108
|
+
5. For brownfield tasks, inspect the applicable documentation/rule surface before the first user-facing round. Prefer exact, nearby sources over broad scans:
|
|
109
|
+
- governing `AGENTS.md` files and template/runtime instruction surfaces that apply to the touched paths
|
|
110
|
+
- README/getting-started docs and relevant docs under `docs/`, especially contracts, plans, ADR-like records, and workflow docs
|
|
111
|
+
- existing `.omx/context/` snapshots, `.omx/specs/`, and planning artifacts relevant to the slug
|
|
112
|
+
- project-local glossary/context files such as `CONTEXT.md`, `CONTEXT-MAP.md`, or context-specific docs when they exist
|
|
113
|
+
6. Save snapshot to `.omx/context/{slug}-{timestamp}.md` (UTC `YYYYMMDDTHHMMSSZ`) and reference it in mode state.
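The snapshot naming in step 6 can be sketched as follows. This is a minimal example under stated assumptions: the helper names `utcStamp` and `snapshotPath` and the `auth-refactor` slug are hypothetical; only the `.omx/context/{slug}-{timestamp}.md` pattern and the UTC `YYYYMMDDTHHMMSSZ` format come from the text.

```typescript
// Format a Date as the compact UTC stamp `YYYYMMDDTHHMMSSZ`.
function utcStamp(date: Date): string {
  // toISOString() is always UTC, e.g. 2025-01-02T03:04:05.000Z;
  // drop the dashes and colons, then the millisecond part.
  return date.toISOString().replace(/[-:]/g, "").replace(/\.\d{3}Z$/, "Z");
}

// Build the snapshot path for a slug (hypothetical helper).
function snapshotPath(slug: string, date: Date = new Date()): string {
  return `.omx/context/${slug}-${utcStamp(date)}.md`;
}

console.log(snapshotPath("auth-refactor", new Date(Date.UTC(2025, 0, 2, 3, 4, 5))));
// → .omx/context/auth-refactor-20250102T030405Z.md
```

Deriving the stamp from `toISOString()` avoids local-timezone drift between the machine that writes the snapshot and the one that resumes from it.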
|
|
102
114
|
|
|
103
115
|
## Phase 1: Initialize
|
|
104
116
|
|
|
@@ -137,13 +149,14 @@ If no flag is provided, use **Standard**.
|
|
|
137
149
|
Repeat until ambiguity `<= threshold`, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached. This is a stop condition: below threshold, do not open a new ordinary interview branch.
|
|
138
150
|
|
|
139
151
|
### 2a) Generate next question
|
|
140
|
-
If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to `$ralplan`, `$autopilot`, `$ralph`, or `$team` until that summary answer is captured.
|
|
152
|
+
If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to `$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, or `$team` until that summary answer is captured.
|
|
141
153
|
|
|
142
154
|
Use:
|
|
143
155
|
- Original idea
|
|
144
156
|
- Prior Q&A rounds
|
|
145
157
|
- Current dimension scores
|
|
146
158
|
- Brownfield context (if any)
|
|
159
|
+
- Doc/context grounding notes, including existing terminology, governing rules, and any doc/code mismatch
|
|
147
160
|
- Activated challenge mode injection (Phase 3)
|
|
148
161
|
|
|
149
162
|
Target the lowest-scoring dimension, but respect stage priority:
|
|
@@ -155,12 +168,21 @@ Follow-up pressure ladder after each answer:
|
|
|
155
168
|
1. Ask for a concrete example, counterexample, or evidence signal behind the latest claim
|
|
156
169
|
2. Probe the hidden assumption, dependency, or belief that makes the claim true
|
|
157
170
|
3. Force a boundary or tradeoff: what would you explicitly not do, defer, or reject?
|
|
158
|
-
4.
|
|
171
|
+
4. Challenge fuzzy or conflicting terms against the repo's documented language and current code behavior
|
|
172
|
+
5. Stress-test the boundary with one concrete scenario or edge case when a relationship or handoff remains ambiguous
|
|
173
|
+
6. If the answer still describes symptoms, reframe toward essence / root cause before moving on
|
|
159
174
|
|
|
160
175
|
Prefer staying on the same thread for multiple rounds when it has the highest leverage. Breadth without pressure is not progress.
|
|
161
176
|
|
|
162
177
|
Maintain a **Breadth Ledger** across independent ambiguity tracks: scope, constraints, outputs, verification, brownfield integration, and any user-mentioned deliverable tracks. The ledger is a guard, not a mandatory rotation rule: stay deep on the current thread until it has been pressure-tested, then zoom out only when another material track remains unresolved and would change execution.
|
|
163
178
|
|
|
179
|
+
Maintain a **Docs/Terminology Ledger** for brownfield interviews:
|
|
180
|
+
- repo docs/rules/context sources inspected, with path references
|
|
181
|
+
- canonical terms already used by the repo and terms to avoid or disambiguate
|
|
182
|
+
- user terms that conflict with docs or current code behavior
|
|
183
|
+
- doc/code mismatches that require a human decision before implementation
|
|
184
|
+
- optional durable-doc follow-ups that are safe to propose but not auto-apply
|
|
185
|
+
|
|
164
186
|
Detailed dimensions:
|
|
165
187
|
- Intent Clarity — why the user wants this
|
|
166
188
|
- Outcome Clarity — what end state they want
|
|
@@ -306,6 +328,7 @@ Append round result and updated scores via `omx state write --input '<json>' --j
|
|
|
306
328
|
Use each mode once when applicable. These are normal escalation tools, not rare rescue moves:
|
|
307
329
|
|
|
308
330
|
- **Contrarian** (round 2+ or immediately when an answer rests on an untested assumption): challenge core assumptions
|
|
331
|
+
- **Terminologist** (brownfield, whenever a key term is fuzzy, overloaded, or conflicts with repo docs/code): force a canonical meaning against existing project language before implementation
|
|
309
332
|
- **Simplifier** (round 4+ or when scope expands faster than outcome clarity): probe minimal viable scope
|
|
310
333
|
- **Ontologist** (round 5+ and ambiguity > 0.25, or when the user keeps describing symptoms): ask for essence-level reframing
|
|
311
334
|
|
|
@@ -336,6 +359,9 @@ Spec should include:
|
|
|
336
359
|
- Assumptions exposed + resolutions
|
|
337
360
|
- Pressure-pass findings (which answer was revisited, and what changed)
|
|
338
361
|
- Brownfield evidence vs inference notes for any repository-grounded confirmation questions
|
|
362
|
+
- Docs/Terminology Ledger with inspected repo docs/rules/context, term conflicts, and any doc/code mismatch decisions
|
|
363
|
+
- Scenario/edge-case pressure findings that materially shaped scope or acceptance criteria
|
|
364
|
+
- Optional durable documentation recommendations, explicitly marked opt-in and public-safe; do not include raw private transcript dumps
|
|
339
365
|
- Technical context findings
|
|
340
366
|
- Full or condensed transcript
|
|
341
367
|
|
|
@@ -365,11 +391,45 @@ When the clarified task is specifically about `$autoresearch`, or the skill is i
|
|
|
365
391
|
|
|
366
392
|
## Phase 5: Execution Bridge
|
|
367
393
|
|
|
368
|
-
Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, and any residual-risk warnings across the handoff.
|
|
394
|
+
Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, and any residual-risk warnings across the handoff.
|
|
395
|
+
|
|
396
|
+
### Optional execution contract foundation
|
|
397
|
+
|
|
398
|
+
When an Autopilot/deep-interview handoff explicitly requires a stride contract, emit it as structured data rather than prose. This is a validation foundation, not a broadness-inference feature: do not infer stride from task length, phase labels, snapshots, or freeform wording.
|
|
399
|
+
|
|
400
|
+
Canonical location under Autopilot state:
|
|
401
|
+
|
|
402
|
+
```json
|
|
403
|
+
{
|
|
404
|
+
"handoff_artifacts": {
|
|
405
|
+
"deep_interview": {
|
|
406
|
+
"execution_contract_required": true,
|
|
407
|
+
"execution_contract": {
|
|
408
|
+
"version": 1,
|
|
409
|
+
"execution_stride": "task",
|
|
410
|
+
"source": "deep-interview",
|
|
411
|
+
"selected_by": "user",
|
|
412
|
+
"allow_task_shrink": true,
|
|
413
|
+
"completion_unit": "One focused task",
|
|
414
|
+
"stop_condition": "Stop after that task is implemented and verified",
|
|
415
|
+
"acceptance_coverage_scope": "task",
|
|
416
|
+
"shrink_policy": "allowed"
|
|
417
|
+
}
|
|
418
|
+
}
|
|
419
|
+
}
|
|
420
|
+
}
|
|
421
|
+
```
|
|
422
|
+
|
|
423
|
+
Stride meanings:
|
|
424
|
+
- `task`: conservative, small-step execution; `allow_task_shrink:true`, `acceptance_coverage_scope:"task"`, `shrink_policy:"allowed"`.
|
|
425
|
+
- `deliverable`: finish the named deliverable before stopping; `allow_task_shrink:false`, `acceptance_coverage_scope:"deliverable"`, `shrink_policy:"ask_before_shrink"`.
|
|
426
|
+
- `milestone`: finish the larger approved milestone unless blocked; `allow_task_shrink:false`, `acceptance_coverage_scope:"milestone"`, `shrink_policy:"deny_unless_blocked"`.
|
|
427
|
+
|
|
428
|
+
Only set `execution_contract_required:true` when the selected downstream workflow needs this explicit stride/stop-condition guard. New artifacts must write the canonical snake_case schema shown above under `handoff_artifacts.deep_interview`; runtime readers may accept legacy camelCase field/marker aliases and direct/nested `execution_contract` locations only as compatibility input. If `execution_contract_required` is absent or false, downstream Autopilot compatibility behavior is unchanged.
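The stride meanings above amount to a fixed lookup from stride to its three bound fields. A minimal sketch of emitting a contract from that lookup follows; the names `STRIDE_BINDINGS`, `buildContract`, and `matchesStride` are hypothetical and this is not presented as the package's own code.

```typescript
type Stride = "task" | "deliverable" | "milestone";
type ShrinkPolicy = "allowed" | "ask_before_shrink" | "deny_unless_blocked";

interface StrideBinding {
  allow_task_shrink: boolean;
  acceptance_coverage_scope: Stride;
  shrink_policy: ShrinkPolicy;
}

// One row per stride, copied from the stride meanings listed above.
const STRIDE_BINDINGS: Record<Stride, StrideBinding> = {
  task: { allow_task_shrink: true, acceptance_coverage_scope: "task", shrink_policy: "allowed" },
  deliverable: { allow_task_shrink: false, acceptance_coverage_scope: "deliverable", shrink_policy: "ask_before_shrink" },
  milestone: { allow_task_shrink: false, acceptance_coverage_scope: "milestone", shrink_policy: "deny_unless_blocked" },
};

interface ExecutionContract extends StrideBinding {
  version: 1;
  execution_stride: Stride;
  source: "deep-interview";
  selected_by: "user" | "default";
  completion_unit: string;
  stop_condition: string;
}

// Emit a contract whose bound fields always agree with the chosen stride.
function buildContract(
  stride: Stride,
  selectedBy: "user" | "default",
  completionUnit: string,
  stopCondition: string,
): ExecutionContract {
  return {
    version: 1,
    execution_stride: stride,
    source: "deep-interview",
    selected_by: selectedBy,
    completion_unit: completionUnit,
    stop_condition: stopCondition,
    ...STRIDE_BINDINGS[stride],
  };
}

// Check that an existing contract's bound fields match its declared stride.
function matchesStride(c: ExecutionContract): boolean {
  const b = STRIDE_BINDINGS[c.execution_stride];
  return (
    c.allow_task_shrink === b.allow_task_shrink &&
    c.acceptance_coverage_scope === b.acceptance_coverage_scope &&
    c.shrink_policy === b.shrink_policy
  );
}

const contract = buildContract("deliverable", "user", "The named deliverable", "Stop when the deliverable is verified");
console.log(JSON.stringify({ handoff_artifacts: { deep_interview: { execution_contract_required: true, execution_contract: contract } } }, null, 2));
```

Taking the stride as an explicit argument, rather than deriving it from the spec text, keeps the sketch in line with the rule that stride is never inferred.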
|
|
369
429
|
|
|
370
430
|
### Goal-mode follow-ups
|
|
371
431
|
|
|
372
|
-
Include these product-facing suggestions when they fit the clarified spec, without removing the existing `$ralplan`, `$autopilot`, `$ralph`, and `$team` handoff options:
|
|
432
|
+
Include these product-facing suggestions when they fit the clarified spec, without removing the existing `$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, and `$team` handoff options:
|
|
373
433
|
|
|
374
434
|
- **`$ultragoal`** — default goal-mode follow-up for implementation or general goal-oriented follow-up specs that should be converted into durable Codex/OMX goals with sequential completion tracking.
|
|
375
435
|
- **`$autoresearch-goal`** — use when the clarified context is a research project: a research question, reference/literature gathering, evaluator-backed analysis, or professor/critic-style deliverable.
|
|
@@ -377,7 +437,16 @@ Include these product-facing suggestions when they fit the clarified spec, witho
|
|
|
377
437
|
|
|
378
438
|
Recommend `$ultragoal` as the default durable goal-mode follow-up because it supersedes Ralph for goal tracking. Preserve `$team` for coordinated parallel implementation and keep `$ralph` only as an explicit fallback for persistent single-owner execution/verification when the user specifically selects it.
|
|
379
439
|
|
|
380
|
-
### 1. **`$
|
|
440
|
+
### 1. **`$ultragoal` (Default durable execution follow-up)**
|
|
441
|
+
- **Input Artifact:** `.omx/specs/deep-interview-{slug}.md` (optionally accompanied by the transcript/context snapshot for traceability)
|
|
442
|
+
- **Invocation:** `$ultragoal create-goals --brief-file <spec-path>` followed by `$ultragoal complete-goals` in the active execution lane
|
|
443
|
+
- **Consumer Behavior:** Convert the clarified spec into durable goal-mode work. Preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, scenario-pressure findings, and residual-risk warnings as binding story constraints.
|
|
444
|
+
- **Skipped / Already-Satisfied Stages:** Requirement interview, ambiguity clarification, doc/context preflight, and early intent-boundary elicitation
|
|
445
|
+
- **Expected Output:** `.omx/ultragoal/brief.md`, `.omx/ultragoal/goals.json`, `.omx/ultragoal/ledger.jsonl`, implementation evidence, verification evidence, and final cleanup/review-gate evidence
|
|
446
|
+
- **Best When:** The clarified spec is execution-ready or the user explicitly wants durable goal tracking as the next step
|
|
447
|
+
- **Next Recommended Step:** Run the Ultragoal completion loop; launch `$team` only inside an active Ultragoal story when parallel lanes are warranted, and use `$ralph` only as an explicit fallback when the user asks for that legacy persistence mode
|
|
448
|
+
|
|
449
|
+
### 2. **`$ralplan` (Recommended when architecture/test-shape review is still needed)**
|
|
381
450
|
- **Input Artifact:** `.omx/specs/deep-interview-{slug}.md` (optionally accompanied by the transcript/context snapshot for traceability)
|
|
382
451
|
- **Invocation:** `$plan --consensus --direct <spec-path>`
|
|
383
452
|
- **Consumer Behavior:** Treat the deep-interview spec as the requirements source of truth. Do not repeat the interview by default; refine architecture/feasibility around the clarified intent and boundaries instead.
|
|
@@ -386,7 +455,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
|
|
|
386
455
|
- **Best When:** Requirements are clear enough to stop interviewing, but architectural validation / consensus planning is still desirable
|
|
387
456
|
- **Next Recommended Step:** Use the approved planning artifacts with `$ultragoal` as the default durable goal-mode follow-up (optionally with `$team` for parallel lanes); choose `$autoresearch-goal` for research validation or `$performance-goal` for measurable optimization, and use `$ralph` only as an explicit fallback when a narrow single-owner persistence loop is requested
|
|
388
457
|
|
|
389
|
-
###
|
|
458
|
+
### 3. **`$autopilot`**
|
|
390
459
|
- **Input Artifact:** `.omx/specs/deep-interview-{slug}.md`
|
|
391
460
|
- **Invocation:** `$autopilot <spec-path>`
|
|
392
461
|
- **Consumer Behavior:** Use the deep-interview spec as the clarified execution brief. Preserve intent, non-goals, decision boundaries, and acceptance criteria as binding context for planning/execution.
|
|
@@ -395,7 +464,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
|
|
|
395
464
|
- **Best When:** The clarified spec is already strong enough for direct planning + execution without an additional consensus gate
|
|
396
465
|
- **Next Recommended Step:** Continue through autopilot's execution/QA/validation flow; if coordination-heavy execution emerges, prefer `$team` under a leader-owned `$ultragoal` ledger, using `$ralph` only as an explicit fallback when a narrow single-owner persistence loop is requested
|
|
397
466
|
|
|
398
|
-
###
|
|
467
|
+
### 4. **`$ralph` (Explicit fallback only)**
|
|
399
468
|
- **Input Artifact:** `.omx/specs/deep-interview-{slug}.md`
|
|
400
469
|
- **Invocation:** `$ralph <spec-path>`
|
|
401
470
|
- **Consumer Behavior:** Use the spec's acceptance criteria and boundary constraints as the persistence target. Do not reopen requirements discovery unless the user explicitly asks to refine further.
|
|
@@ -404,7 +473,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
|
|
|
404
473
|
- **Best When:** The user explicitly asks for Ralph's persistent sequential completion pressure; otherwise use `$ultragoal` for durable goal tracking and completion checkpoints
|
|
405
474
|
- **Next Recommended Step:** If this explicit fallback is selected, continue Ralph's persistence loop; if work expands into coordination-heavy lanes, hand off to `$team` under `$ultragoal` checkpointing rather than promoting Ralph as the next default
|
|
406
475
|
|
|
407
|
-
###
|
|
476
|
+
### 5. **`$team`**
|
|
408
477
|
- **Input Artifact:** `.omx/specs/deep-interview-{slug}.md`
|
|
409
478
|
- **Invocation:** `$team <spec-path>`
|
|
410
479
|
- **Consumer Behavior:** Treat the spec as shared execution context for coordinated parallel work. Preserve the clarified intent, non-goals, decision boundaries, and acceptance criteria as common lane constraints.
|
|
@@ -413,7 +482,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
 - **Best When:** The task is large, multi-lane, or blocker-sensitive enough to justify coordinated parallel execution instead of a single persistent loop
 - **Next Recommended Step:** Follow the team verification path when the coordinated execution phase finishes; checkpoint completion through `$ultragoal` by default, escalating to a separate Ralph loop only when the user explicitly asks for that persistent verification/fix owner
 
-###
+### 6. **Refine further**
 - **Input Artifact:** Existing transcript, context snapshot, and current spec draft
 - **Invocation:** Continue the interview loop
 - **Consumer Behavior:** Re-enter questioning to resolve the highest-leverage remaining uncertainty
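The `$ralph` and `$team` entries in the hunks above both take the spec artifact as their only argument. A minimal TypeScript sketch of how a caller might assemble that invocation string follows. Only the path pattern `.omx/specs/deep-interview-{slug}.md` and the `$skill <spec-path>` shape come from the diff; the function names and the kebab-case slug check are hypothetical, added for illustration.

```typescript
// Hypothetical helpers: names and slug validation are assumptions,
// not part of oh-my-codex. Only the path pattern and the
// "$skill <spec-path>" invocation shape are taken from the skill text.
type HandoffSkill = "ralph" | "team";

function specPathFor(slug: string): string {
  // Assumed slug format: lowercase kebab-case.
  if (!/^[a-z0-9]+(?:-[a-z0-9]+)*$/.test(slug)) {
    throw new Error(`unexpected slug: ${slug}`);
  }
  return `.omx/specs/deep-interview-${slug}.md`;
}

function handoffInvocation(skill: HandoffSkill, slug: string): string {
  // A "$" not followed by "{" is a literal dollar sign in a template literal.
  return `$${skill} ${specPathFor(slug)}`;
}

console.log(handoffInvocation("team", "auth-flow"));
```

Keeping path construction in one place means every handoff option points at the same spec file, which matches the skill's intent that consumers reuse the clarified spec rather than reopen discovery.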
@@ -437,6 +506,7 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
 - Use `omx state write/read --input '<json>' --json` for resumable mode state; `state_write` / `state_read` are explicit MCP compatibility fallbacks only
 - If the interview cannot ask a required `omx question` round, persist the blocker as terminal state with `active: false` and `current_phase: "blocked"`; do not write a terminal blocked phase with `active: true`
 - Read/write context snapshots under `.omx/context/`
+- Read applicable repo docs/rules/context during preflight; write durable docs, glossary, ADR, or memory updates only when the user explicitly opts in and the content is public-safe
 - Record whether the oversized-context summary gate is not needed, pending, or satisfied before any scoring or handoff step
 - Save transcript/spec artifacts under `.omx/interviews/` and `.omx/specs/`
 </Tool_Usage>
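The blocked-state rule in the Tool_Usage list above (a terminal `blocked` phase must never be written with `active: true`) can be checked before the payload is handed to `omx state write`. The sketch below is illustrative only. The field names `active` and `current_phase` come from the skill text, while the `ModeState` type and both function names are hypothetical, and the real state payload may require additional fields not shown here.

```typescript
// Hypothetical shape: only `active` and `current_phase` are named in the
// skill text; any other fields the real payload needs are omitted.
interface ModeState {
  active: boolean;
  current_phase: string;
}

// Returns a list of rule violations; an empty list means the state is consistent.
function validateModeState(state: ModeState): string[] {
  const problems: string[] = [];
  if (state.current_phase === "blocked" && state.active) {
    problems.push('terminal "blocked" phase must be written with active: false');
  }
  return problems;
}

// Builds the terminal state to persist when a required question round cannot be asked.
function blockedState(): ModeState {
  return { active: false, current_phase: "blocked" };
}

// The serialized JSON is what would be supplied as the `--input` value.
console.log(JSON.stringify(blockedState()));
```

Validating before the write keeps a resumable session from appearing active while it is actually stuck on a blocker.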
@@ -460,7 +530,11 @@ Recommend `$ultragoal` as the default durable goal-mode follow-up because it sup
 - [ ] Transcript written to `.omx/interviews/{slug}-{timestamp}.md`
 - [ ] Spec written to `.omx/specs/deep-interview-{slug}.md`
 - [ ] Brownfield questions use evidence-backed confirmation when applicable
-- [ ]
+- [ ] Brownfield preflight inspected applicable repo docs/rules/context before user-facing questions
+- [ ] Fuzzy or conflicting terminology was challenged against repo language/current code behavior when applicable
+- [ ] Scenario-based edge-case grilling was used when boundary ambiguity would materially affect implementation
+- [ ] Durable docs/ADR/memory updates, if any, were explicitly opted into and public-safe
+- [ ] Handoff options provided (`$ultragoal`, `$ralplan`, `$autopilot`, `$ralph`, `$team`) plus context-sensitive goal-mode suggestions (`$autoresearch-goal`, `$performance-goal`) when applicable
 - [ ] No direct implementation performed in this mode
 </Final_Checklist>
 