devlyn-cli 1.12.2 → 1.12.4

package/CLAUDE.md CHANGED
@@ -72,7 +72,7 @@ Optional flags:
  - `--skip-review` — skip team-review phase
  - `--skip-clean` — skip clean phase
  - `--skip-docs` — skip update-docs phase
- - `--engine auto|codex|claude` — intelligent model routing. `auto` routes each phase and team role to the optimal model (Claude or Codex GPT-5.4) based on benchmark data. `codex` forces Codex for implementation, Claude for evaluation. `claude` (default) uses Claude for everything. Requires codex-mcp-server.
+ - `--engine auto|codex|claude` — intelligent model routing. `auto` (default) routes each phase and team role to the optimal model (Claude or Codex GPT-5.4) based on benchmark data. `codex` forces Codex for implementation, Claude for evaluation. `claude` uses Claude for everything. Requires codex-mcp-server for `auto` and `codex` modes.
  - `--with-codex [evaluate|review|both]` — (legacy, superseded by `--engine`) use OpenAI Codex as cross-model evaluator/reviewer (requires codex-mcp-server)
 
  ## Preflight Check (Post-Roadmap Verification)
package/README.md CHANGED
@@ -111,7 +111,7 @@ Install the Codex MCP server during setup, then:
 
  **`--engine auto`** routes each pipeline phase and team role to the optimal model (Claude Opus 4.6 or GPT-5.4) — validated through A/B testing, not just benchmarks.
 
- > `--engine auto` (recommended) · `--engine codex` (force Codex for build) · `--engine claude` (default, Claude only)
+ > `--engine auto` (default, recommended) · `--engine codex` (force Codex for build) · `--engine claude` (Claude only)
 
  Works across the full pipeline:
 
@@ -34,17 +34,21 @@ This pipeline runs hands-free. The user launches it to walk away and come back t
  - `--skip-build-gate` (false) — skip the deterministic build gate (Phase 1.4). Not recommended — the build gate is the primary defense against "tests pass locally, breaks in CI/Docker/production" class of bugs.
  - `--build-gate MODE` (auto) — controls build gate behavior. `auto`: detect project type and run appropriate build/typecheck/lint commands; if Dockerfile(s) are present, Docker builds are included automatically. `strict`: auto + treat warnings as errors. `no-docker`: auto but skip Docker builds even if Dockerfiles exist (for faster iteration). `skip`: same as --skip-build-gate.
  - `--with-codex` (false) — use OpenAI Codex as a cross-model evaluator/reviewer via `mcp__codex-cli__*` MCP tools. Accepts: `evaluate`, `review`, or `both` (default when flag is present without value). When enabled, Codex provides an independent second opinion from a different model family, creating a GAN-like dynamic where Claude builds and Codex critiques. **Ignored if `--engine` is set** (engine routing subsumes this).
- - `--engine MODE` (claude) — controls which model handles each pipeline phase and team role. Modes:
- - `claude` (default): all phases use Claude subagents. Current behavior, no Codex calls.
+ - `--engine MODE` (auto) — controls which model handles each pipeline phase and team role. Modes:
+ - `auto` (default): each phase and team role routes to the optimal model based on benchmark data. Requires Codex MCP server. Subsumes `--with-codex both`.
  - `codex`: Codex handles implementation/analysis phases, Claude handles orchestration, evaluation, and Chrome MCP.
- - `auto`: each phase and team role routes to the optimal model based on benchmark data. Recommended when Codex MCP server is available. Subsumes `--with-codex both`.
+ - `claude`: all phases use Claude subagents. No Codex calls.
 
  Flags can be passed naturally: `/devlyn:auto-resolve fix the auth bug --max-rounds 3 --skip-docs`
  Engine examples: `--engine auto`, `--engine codex`, `--engine claude`
  Codex examples (legacy): `--with-codex` (both), `--with-codex evaluate`, `--with-codex review`
- If no flags are present, use defaults.
+ If no flags are present, use defaults. **The default engine is `auto` — if the user does not pass `--engine`, treat it as `--engine auto`.**
 
- 3. **If `--engine` is `auto` or `codex`**: Read `references/engine-routing.md` for the full routing table. Then call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude` (fallback), [2] Abort. If `--engine` is not set but `--with-codex` is enabled, read `references/codex-integration.md` instead and run its pre-flight check.
+ 3. **Engine pre-flight** (runs unless `--engine claude` was explicitly passed):
+ - The default engine is `auto`. If the user did not pass `--engine`, the engine is `auto` — NOT `claude`.
+ - Read `references/engine-routing.md` for the full routing table.
+ - Call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude` (fallback), [2] Abort.
+ - Exception: if `--engine` is not set AND `--with-codex` is explicitly enabled (legacy), read `references/codex-integration.md` instead and run its pre-flight check.
 
  4. Announce the pipeline plan:
50
54
  ```
@@ -58,7 +62,7 @@ Cross-model evaluation (Codex): [evaluate / review / both / disabled / subsumed
 
  ## PHASE 1: BUILD
 
- **Engine routing**: If `--engine` is `auto` or `codex`, read `references/engine-routing.md` "How to Spawn a Codex BUILD/FIX Agent" section. Call `mcp__codex-cli__codex` with `model: "gpt-5.4"`, `reasoningEffort: "xhigh"`, `sandbox: "workspace-write"`, `fullAuto: true`, and the full agent prompt below as the `prompt` parameter. If `--engine` is `claude` (default), spawn a Claude subagent as described below.
+ **Engine routing**: If `--engine` is `auto` or `codex`, read `references/engine-routing.md` "How to Spawn a Codex BUILD/FIX Agent" section. Call `mcp__codex-cli__codex` with `model: "gpt-5.4"`, `reasoningEffort: "xhigh"`, `sandbox: "workspace-write"`, `fullAuto: true`, and the full agent prompt below as the `prompt` parameter. If `--engine` is `claude`, spawn a Claude subagent as described below.
 
  Spawn a subagent using the Agent tool with `mode: "bypassPermissions"` to investigate and implement the fix. The subagent does NOT have access to skills, so include all necessary instructions inline.
 
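The new default-engine rule and the engine pre-flight from step 3 in the earlier hunk can be sketched as a small decision function. This is only an illustrative model of the prose, not code from the package: `parse_flags`, `preflight_reference`, and `engine_after_ping` are hypothetical names, the flag parsing is deliberately simplistic, and the ping result is passed in as a boolean rather than obtained from `mcp__codex-cli__ping`.

```python
from __future__ import annotations

from typing import Optional

VALID_ENGINES = ("auto", "codex", "claude")


def parse_flags(tokens: list[str]) -> tuple[Optional[str], bool]:
    """Return (explicit --engine value or None, whether --with-codex was passed)."""
    engine: Optional[str] = None
    with_codex = False
    for i, tok in enumerate(tokens):
        if tok == "--engine" and i + 1 < len(tokens) and tokens[i + 1] in VALID_ENGINES:
            engine = tokens[i + 1]
        elif tok == "--with-codex":
            with_codex = True
    return engine, with_codex


def preflight_reference(tokens: list[str]) -> Optional[str]:
    """Pick the reference file that drives the pre-flight, or None to skip it."""
    engine, with_codex = parse_flags(tokens)
    if engine == "claude":
        return None  # explicit Claude-only run: pre-flight does not run
    if engine is None and with_codex:
        return "references/codex-integration.md"  # legacy exception
    return "references/engine-routing.md"  # auto (the default) or codex


def engine_after_ping(engine: Optional[str], ping_ok: bool, choice: int = 1) -> Optional[str]:
    """Resolve the engine once the ping result is known.

    `choice` models the user's answer when the ping fails:
    1 = continue with claude, 2 = abort (returned as None).
    """
    effective = engine or "auto"  # no --engine flag means auto, not claude
    if effective == "claude" or ping_ok:
        return effective
    return "claude" if choice == 1 else None
```

Under these assumptions, `preflight_reference(["fix", "the", "auth", "bug", "--skip-docs"])` selects the engine-routing reference, which is the behavioral change in this release: an invocation without `--engine` now triggers the Codex ping instead of silently running Claude-only.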
@@ -197,7 +197,7 @@ mcp__codex-cli__codex({
 
  ## Override Behavior
 
- - `--engine claude` → all roles and phases use Claude (current default behavior, no Codex calls)
+ - `--engine claude` → all roles and phases use Claude (no Codex calls)
  - `--engine codex` → all phases use Codex for implementation/analysis, Claude only for orchestration and Chrome MCP
  - `--engine auto` → each role and phase routes to the optimal model per this table
  - `--engine auto` is the recommended default when Codex MCP server is available
@@ -25,14 +25,17 @@ Concretely:
  Parse these from the user's invocation message:
 
  - `--with-codex` (default: off) — bare flag. When set, OpenAI Codex runs an independent rubric pass during Phase 3.5 CHALLENGE via `mcp__codex-cli__*` MCP tools, using the same rubric as the solo pass. Codex always runs at `reasoningEffort: "xhigh"` — the entire reason for the flag is maximum reasoning from a second model family. **Ignored if `--engine` is set** (engine routing subsumes this).
- - `--engine MODE` (claude) — controls which model handles each ideation phase. Modes:
- - `claude` (default): all phases use Claude. Current behavior.
+ - `--engine MODE` (auto) — controls which model handles each ideation phase. Modes:
+ - `auto` (default): Claude handles FRAME/EXPLORE/CONVERGE/DOCUMENT (ambiguous intent, writing quality), Codex runs the CHALLENGE rubric pass as critic (GAN dynamic). Subsumes `--with-codex`. Requires Codex MCP server.
  - `codex`: Codex handles FRAME/EXPLORE/CONVERGE/DOCUMENT, Claude runs CHALLENGE (role reversal — builder and critic are always different models).
- - `auto`: Claude handles FRAME/EXPLORE/CONVERGE/DOCUMENT (ambiguous intent, writing quality), Codex runs the CHALLENGE rubric pass as critic (GAN dynamic). Subsumes `--with-codex`. Recommended when Codex MCP is available.
+ - `claude`: all phases use Claude. No Codex calls.
 
- **If `--engine` is `auto` or `codex`**: call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude`, [2] Abort. Also read `references/challenge-rubric.md` up front. The engine routing table is defined in the auto-resolve skill's `references/engine-routing.md` under "Pipeline Phase Routing (ideate)".
+ **Engine pre-flight** (runs unless `--engine claude` was explicitly passed):
+ - The default engine is `auto`. If the user did not pass `--engine`, the engine is `auto` — NOT `claude`.
+ - Call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude`, [2] Abort.
+ - Also read `references/challenge-rubric.md` up front. The engine routing table is defined in the auto-resolve skill's `references/engine-routing.md` under "Pipeline Phase Routing (ideate)".
 
- **If `--engine` is not set and `--with-codex` is set** (legacy): read `references/challenge-rubric.md` and `references/codex-debate.md` up front, then run the pre-flight check described in `codex-debate.md` to verify the Codex MCP server is available before starting the pipeline. If the server is unavailable and the user opts to continue without Codex, the solo CHALLENGE pass still runs — only the cross-model rubric pass is disabled.
+ **If `--engine` is not set and `--with-codex` is explicitly set** (legacy): read `references/challenge-rubric.md` and `references/codex-debate.md` up front, then run the pre-flight check described in `codex-debate.md` to verify the Codex MCP server is available before starting the pipeline. If the server is unavailable and the user opts to continue without Codex, the solo CHALLENGE pass still runs — only the cross-model rubric pass is disabled.
 
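The three ideation modes described above amount to a builder/critic assignment per phase. The mapping below is a rough sketch under stated assumptions: `ideate_routing` is a hypothetical helper, the model labels are plain strings, and the CHALLENGE entry stands only for whoever runs the rubric pass as critic. The authoritative table is the one the text places in `references/engine-routing.md`.

```python
from __future__ import annotations

IDEATE_PHASES = ("FRAME", "EXPLORE", "CONVERGE", "DOCUMENT", "CHALLENGE")


def ideate_routing(engine: str = "auto") -> dict[str, str]:
    """Map each ideation phase to the model that handles it (illustrative only)."""
    if engine not in ("auto", "codex", "claude"):
        raise ValueError(f"unknown engine: {engine!r}")
    if engine == "claude":
        # Claude-only: no Codex calls in any phase.
        return {phase: "claude" for phase in IDEATE_PHASES}
    # auto: Claude builds, Codex critiques. codex: the roles are reversed,
    # so builder and critic are always different models.
    builder, critic = ("claude", "codex") if engine == "auto" else ("codex", "claude")
    return {phase: (critic if phase == "CHALLENGE" else builder) for phase in IDEATE_PHASES}
```

Calling `ideate_routing()` with no argument returns the `auto` assignment, mirroring the new default.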
  <why_this_matters>
  When ideas flow directly from conversation to `/devlyn:auto-resolve`, context degrades at each handoff:
@@ -44,15 +44,17 @@ Parse from `<preflight_config>`:
  - `--autofix` — auto-promote all findings to roadmap items and run auto-resolve on each
  - `--skip-browser` — skip browser validation
  - `--skip-docs` — skip documentation audit
- - `--engine MODE` (claude) — controls which model handles audit phases. Modes:
- - `claude` (default): all auditors use Claude subagents.
+ - `--engine MODE` (auto) — controls which model handles audit phases. Modes:
+ - `auto` (default): code-auditor uses Codex (SWE-bench Pro +11.7pp for code analysis), docs-auditor uses Claude (writing quality), browser-auditor uses Claude (Chrome MCP). Requires Codex MCP server.
  - `codex`: code-auditor uses Codex, docs-auditor and browser-auditor use Claude.
- - `auto`: code-auditor uses Codex (SWE-bench Pro +11.7pp for code analysis), docs-auditor uses Claude (writing quality), browser-auditor uses Claude (Chrome MCP). Recommended when Codex MCP is available.
+ - `claude`: all auditors use Claude subagents. No Codex calls.
 
  Example: `/devlyn:preflight --phase 2 --skip-browser`
  Example with engine: `/devlyn:preflight --engine auto`
 
- **If `--engine` is `auto` or `codex`**: call `mcp__codex-cli__ping` to verify Codex MCP availability. If ping fails, fall back to `--engine claude` with a warning.
+ **Engine pre-flight** (runs unless `--engine claude` was explicitly passed):
+ - The default engine is `auto`. If the user did not pass `--engine`, the engine is `auto` — NOT `claude`.
+ - Call `mcp__codex-cli__ping` to verify Codex MCP availability. If ping fails, fall back to `--engine claude` with a warning.
 
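Unlike auto-resolve and ideate, which ask the user whether to continue or abort, the preflight skill falls back to Claude automatically when the ping fails. A minimal sketch of that behavior, assuming a hypothetical `auditor_routing` helper and a ping result supplied by the caller (the warning text is invented for illustration):

```python
from __future__ import annotations

from typing import Optional

AUDITORS = ("code-auditor", "docs-auditor", "browser-auditor")


def auditor_routing(
    engine: Optional[str] = None, ping_ok: bool = True
) -> tuple[dict[str, str], Optional[str]]:
    """Return (auditor -> model, warning or None) for a preflight run."""
    effective = engine or "auto"  # no --engine flag means auto
    warning: Optional[str] = None
    if effective != "claude" and not ping_ok:
        # Preflight does not prompt: it degrades to Claude-only with a warning.
        warning = f"Codex MCP server unavailable; falling back from {effective} to claude"
        effective = "claude"
    routing = {name: "claude" for name in AUDITORS}
    if effective in ("auto", "codex"):
        # Per the text, both modes send only the code auditor to Codex.
        routing["code-auditor"] = "codex"
    return routing, warning
```

In this model `auto` and `codex` produce the same auditor assignment, which matches the flag descriptions above; the two modes differ only in the rationale the text gives for them.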
  ## PHASE 0: DISCOVER & SCOPE
 
@@ -147,7 +147,7 @@ Codex roles cannot use TeamCreate/SendMessage — the Team Lead (you) relays the
 
  **For Dual roles** (e.g., security-auditor): Run BOTH a Claude Agent teammate AND a `mcp__codex-cli__codex` call in parallel with the same prompt. Merge findings per `engine-routing.md` "How to Spawn a Dual Role" section.
 
- If `--engine claude` or no `--engine` flag: all roles use Claude Agent teammates (current default behavior).
+ If `--engine auto` or no `--engine` flag: routes each role to the optimal model based on benchmark data (see `engine-routing.md`). If `--engine claude`: all roles use Claude Agent teammates.
 
  ### Teammate Prompts
 
@@ -83,7 +83,7 @@ Codex reviewers cannot use TeamCreate/SendMessage — the Review Lead (you) coll
 
  **For Dual roles** (e.g., security-reviewer): Run BOTH a Claude Agent reviewer AND a `mcp__codex-cli__codex` call in parallel with the same prompt. Merge findings per `engine-routing.md` "How to Spawn a Dual Role" section.
 
- If `--engine claude` or no `--engine` flag: all roles use Claude Agent reviewers (current default behavior).
+ If `--engine auto` or no `--engine` flag: routes each reviewer role to the optimal model based on benchmark data (see `engine-routing.md`). If `--engine claude`: all roles use Claude Agent reviewers.
 
  ### Reviewer Prompts
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "devlyn-cli",
- "version": "1.12.2",
+ "version": "1.12.4",
  "description": "AI development toolkit for Claude Code — ideate, auto-resolve, and ship with context engineering and agent orchestration",
  "homepage": "https://github.com/fysoul17/devlyn-cli#readme",
  "bin": {