npm - devlyn-cli - Versions diffs - 1.12.2 → 1.12.4 - Mend

devlyn-cli 1.12.2 → 1.12.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9) hide show

package/CLAUDE.md +1 -1
package/README.md +1 -1
package/config/skills/devlyn:auto-resolve/SKILL.md +10 -6
package/config/skills/devlyn:auto-resolve/references/engine-routing.md +1 -1
package/config/skills/devlyn:ideate/SKILL.md +8 -5
package/config/skills/devlyn:preflight/SKILL.md +6 -4
package/config/skills/devlyn:team-resolve/SKILL.md +1 -1
package/config/skills/devlyn:team-review/SKILL.md +1 -1
package/package.json +1 -1

package/CLAUDE.md CHANGED Viewed

@@ -72,7 +72,7 @@ Optional flags:
 - `--skip-review` — skip team-review phase
 - `--skip-clean` — skip clean phase
 - `--skip-docs` — skip update-docs phase
-- `--engine auto|codex|claude` — intelligent model routing. `auto` routes each phase and team role to the optimal model (Claude or Codex GPT-5.4) based on benchmark data. `codex` forces Codex for implementation, Claude for evaluation. `claude` (default) uses Claude for everything. Requires codex-mcp-server.
+- `--engine auto|codex|claude` — intelligent model routing. `auto` (default) routes each phase and team role to the optimal model (Claude or Codex GPT-5.4) based on benchmark data. `codex` forces Codex for implementation, Claude for evaluation. `claude` uses Claude for everything. Requires codex-mcp-server for `auto` and `codex` modes.
 - `--with-codex [evaluate|review|both]` — (legacy, superseded by `--engine`) use OpenAI Codex as cross-model evaluator/reviewer (requires codex-mcp-server)
 ## Preflight Check (Post-Roadmap Verification)

package/README.md CHANGED Viewed

@@ -111,7 +111,7 @@ Install the Codex MCP server during setup, then:
 **`--engine auto`** routes each pipeline phase and team role to the optimal model (Claude Opus 4.6 or GPT-5.4) — validated through A/B testing, not just benchmarks.
-> `--engine auto` (recommended) · `--engine codex` (force Codex for build) · `--engine claude` (default, Claude only)
+> `--engine auto` (default, recommended) · `--engine codex` (force Codex for build) · `--engine claude` (Claude only)
 Works across the full pipeline:

package/config/skills/devlyn:auto-resolve/SKILL.md CHANGED Viewed

@@ -34,17 +34,21 @@ This pipeline runs hands-free. The user launches it to walk away and come back t
    - `--skip-build-gate` (false) — skip the deterministic build gate (Phase 1.4). Not recommended — the build gate is the primary defense against "tests pass locally, breaks in CI/Docker/production" class of bugs.
    - `--build-gate MODE` (auto) — controls build gate behavior. `auto`: detect project type and run appropriate build/typecheck/lint commands; if Dockerfile(s) are present, Docker builds are included automatically. `strict`: auto + treat warnings as errors. `no-docker`: auto but skip Docker builds even if Dockerfiles exist (for faster iteration). `skip`: same as --skip-build-gate.
    - `--with-codex` (false) — use OpenAI Codex as a cross-model evaluator/reviewer via `mcp__codex-cli__*` MCP tools. Accepts: `evaluate`, `review`, or `both` (default when flag is present without value). When enabled, Codex provides an independent second opinion from a different model family, creating a GAN-like dynamic where Claude builds and Codex critiques. **Ignored if `--engine` is set** (engine routing subsumes this).
-   - `--engine MODE` (claude) — controls which model handles each pipeline phase and team role. Modes:
-     - `claude` (default): all phases use Claude subagents. Current behavior, no Codex calls.
+   - `--engine MODE` (auto) — controls which model handles each pipeline phase and team role. Modes:
+     - `auto` (default): each phase and team role routes to the optimal model based on benchmark data. Requires Codex MCP server. Subsumes `--with-codex both`.
      - `codex`: Codex handles implementation/analysis phases, Claude handles orchestration, evaluation, and Chrome MCP.
-     - `auto`: each phase and team role routes to the optimal model based on benchmark data. Recommended when Codex MCP server is available. Subsumes `--with-codex both`.
+     - `claude`: all phases use Claude subagents. No Codex calls.
    Flags can be passed naturally: `/devlyn:auto-resolve fix the auth bug --max-rounds 3 --skip-docs`
    Engine examples: `--engine auto`, `--engine codex`, `--engine claude`
    Codex examples (legacy): `--with-codex` (both), `--with-codex evaluate`, `--with-codex review`
-   If no flags are present, use defaults.
+   If no flags are present, use defaults. **The default engine is `auto` — if the user does not pass `--engine`, treat it as `--engine auto`.**
-3. **If `--engine` is `auto` or `codex`**: Read `references/engine-routing.md` for the full routing table. Then call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude` (fallback), [2] Abort. If `--engine` is not set but `--with-codex` is enabled, read `references/codex-integration.md` instead and run its pre-flight check.
+3. **Engine pre-flight** (runs unless `--engine claude` was explicitly passed):
+   - The default engine is `auto`. If the user did not pass `--engine`, the engine is `auto` — NOT `claude`.
+   - Read `references/engine-routing.md` for the full routing table.
+   - Call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude` (fallback), [2] Abort.
+   - Exception: if `--engine` is not set AND `--with-codex` is explicitly enabled (legacy), read `references/codex-integration.md` instead and run its pre-flight check.
 4. Announce the pipeline plan:
 ```
@@ -58,7 +62,7 @@ Cross-model evaluation (Codex): [evaluate / review / both / disabled / subsumed
 ## PHASE 1: BUILD
-**Engine routing**: If `--engine` is `auto` or `codex`, read `references/engine-routing.md` "How to Spawn a Codex BUILD/FIX Agent" section. Call `mcp__codex-cli__codex` with `model: "gpt-5.4"`, `reasoningEffort: "xhigh"`, `sandbox: "workspace-write"`, `fullAuto: true`, and the full agent prompt below as the `prompt` parameter. If `--engine` is `claude` (default), spawn a Claude subagent as described below.
+**Engine routing**: If `--engine` is `auto` or `codex`, read `references/engine-routing.md` "How to Spawn a Codex BUILD/FIX Agent" section. Call `mcp__codex-cli__codex` with `model: "gpt-5.4"`, `reasoningEffort: "xhigh"`, `sandbox: "workspace-write"`, `fullAuto: true`, and the full agent prompt below as the `prompt` parameter. If `--engine` is `claude`, spawn a Claude subagent as described below.
 Spawn a subagent using the Agent tool with `mode: "bypassPermissions"` to investigate and implement the fix. The subagent does NOT have access to skills, so include all necessary instructions inline.

package/config/skills/devlyn:auto-resolve/references/engine-routing.md CHANGED Viewed

@@ -197,7 +197,7 @@ mcp__codex-cli__codex({
 ## Override Behavior
-- `--engine claude` → all roles and phases use Claude (current default behavior, no Codex calls)
+- `--engine claude` → all roles and phases use Claude (no Codex calls)
 - `--engine codex` → all phases use Codex for implementation/analysis, Claude only for orchestration and Chrome MCP
 - `--engine auto` → each role and phase routes to the optimal model per this table
 - `--engine auto` is the recommended default when Codex MCP server is available

package/config/skills/devlyn:ideate/SKILL.md CHANGED Viewed

@@ -25,14 +25,17 @@ Concretely:
 Parse these from the user's invocation message:
 - `--with-codex` (default: off) — bare flag. When set, OpenAI Codex runs an independent rubric pass during Phase 3.5 CHALLENGE via `mcp__codex-cli__*` MCP tools, using the same rubric as the solo pass. Codex always runs at `reasoningEffort: "xhigh"` — the entire reason for the flag is maximum reasoning from a second model family. **Ignored if `--engine` is set** (engine routing subsumes this).
-- `--engine MODE` (claude) — controls which model handles each ideation phase. Modes:
-  - `claude` (default): all phases use Claude. Current behavior.
+- `--engine MODE` (auto) — controls which model handles each ideation phase. Modes:
+  - `auto` (default): Claude handles FRAME/EXPLORE/CONVERGE/DOCUMENT (ambiguous intent, writing quality), Codex runs the CHALLENGE rubric pass as critic (GAN dynamic). Subsumes `--with-codex`. Requires Codex MCP server.
   - `codex`: Codex handles FRAME/EXPLORE/CONVERGE/DOCUMENT, Claude runs CHALLENGE (role reversal — builder and critic are always different models).
-  - `auto`: Claude handles FRAME/EXPLORE/CONVERGE/DOCUMENT (ambiguous intent, writing quality), Codex runs the CHALLENGE rubric pass as critic (GAN dynamic). Subsumes `--with-codex`. Recommended when Codex MCP is available.
+  - `claude`: all phases use Claude. No Codex calls.
-**If `--engine` is `auto` or `codex`**: call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude`, [2] Abort. Also read `references/challenge-rubric.md` up front. The engine routing table is defined in the auto-resolve skill's `references/engine-routing.md` under "Pipeline Phase Routing (ideate)".
+**Engine pre-flight** (runs unless `--engine claude` was explicitly passed):
+- The default engine is `auto`. If the user did not pass `--engine`, the engine is `auto` — NOT `claude`.
+- Call `mcp__codex-cli__ping` to verify the Codex MCP server is available. If ping fails, warn the user and offer: [1] Continue with `--engine claude`, [2] Abort.
+- Also read `references/challenge-rubric.md` up front. The engine routing table is defined in the auto-resolve skill's `references/engine-routing.md` under "Pipeline Phase Routing (ideate)".
-**If `--engine` is not set and `--with-codex` is set** (legacy): read `references/challenge-rubric.md` and `references/codex-debate.md` up front, then run the pre-flight check described in `codex-debate.md` to verify the Codex MCP server is available before starting the pipeline. If the server is unavailable and the user opts to continue without Codex, the solo CHALLENGE pass still runs — only the cross-model rubric pass is disabled.
+**If `--engine` is not set and `--with-codex` is explicitly set** (legacy): read `references/challenge-rubric.md` and `references/codex-debate.md` up front, then run the pre-flight check described in `codex-debate.md` to verify the Codex MCP server is available before starting the pipeline. If the server is unavailable and the user opts to continue without Codex, the solo CHALLENGE pass still runs — only the cross-model rubric pass is disabled.
 <why_this_matters>
 When ideas flow directly from conversation to `/devlyn:auto-resolve`, context degrades at each handoff:

package/config/skills/devlyn:preflight/SKILL.md CHANGED Viewed

@@ -44,15 +44,17 @@ Parse from `<preflight_config>`:
 - `--autofix` — auto-promote all findings to roadmap items and run auto-resolve on each
 - `--skip-browser` — skip browser validation
 - `--skip-docs` — skip documentation audit
-- `--engine MODE` (claude) — controls which model handles audit phases. Modes:
-  - `claude` (default): all auditors use Claude subagents.
+- `--engine MODE` (auto) — controls which model handles audit phases. Modes:
+  - `auto` (default): code-auditor uses Codex (SWE-bench Pro +11.7pp for code analysis), docs-auditor uses Claude (writing quality), browser-auditor uses Claude (Chrome MCP). Requires Codex MCP server.
   - `codex`: code-auditor uses Codex, docs-auditor and browser-auditor use Claude.
-  - `auto`: code-auditor uses Codex (SWE-bench Pro +11.7pp for code analysis), docs-auditor uses Claude (writing quality), browser-auditor uses Claude (Chrome MCP). Recommended when Codex MCP is available.
+  - `claude`: all auditors use Claude subagents. No Codex calls.
 Example: `/devlyn:preflight --phase 2 --skip-browser`
 Example with engine: `/devlyn:preflight --engine auto`
-**If `--engine` is `auto` or `codex`**: call `mcp__codex-cli__ping` to verify Codex MCP availability. If ping fails, fall back to `--engine claude` with a warning.
+**Engine pre-flight** (runs unless `--engine claude` was explicitly passed):
+- The default engine is `auto`. If the user did not pass `--engine`, the engine is `auto` — NOT `claude`.
+- Call `mcp__codex-cli__ping` to verify Codex MCP availability. If ping fails, fall back to `--engine claude` with a warning.
 ## PHASE 0: DISCOVER & SCOPE

package/config/skills/devlyn:team-resolve/SKILL.md CHANGED Viewed

@@ -147,7 +147,7 @@ Codex roles cannot use TeamCreate/SendMessage — the Team Lead (you) relays the
 **For Dual roles** (e.g., security-auditor): Run BOTH a Claude Agent teammate AND a `mcp__codex-cli__codex` call in parallel with the same prompt. Merge findings per `engine-routing.md` "How to Spawn a Dual Role" section.
-If `--engine claude` or no `--engine` flag: all roles use Claude Agent teammates (current default behavior).
+If `--engine auto` or no `--engine` flag: routes each role to the optimal model based on benchmark data (see `engine-routing.md`). If `--engine claude`: all roles use Claude Agent teammates.
 ### Teammate Prompts

package/config/skills/devlyn:team-review/SKILL.md CHANGED Viewed

@@ -83,7 +83,7 @@ Codex reviewers cannot use TeamCreate/SendMessage — the Review Lead (you) coll
 **For Dual roles** (e.g., security-reviewer): Run BOTH a Claude Agent reviewer AND a `mcp__codex-cli__codex` call in parallel with the same prompt. Merge findings per `engine-routing.md` "How to Spawn a Dual Role" section.
-If `--engine claude` or no `--engine` flag: all roles use Claude Agent reviewers (current default behavior).
+If `--engine auto` or no `--engine` flag: routes each reviewer role to the optimal model based on benchmark data (see `engine-routing.md`). If `--engine claude`: all roles use Claude Agent reviewers.
 ### Reviewer Prompts

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "devlyn-cli",
-  "version": "1.12.2",
+  "version": "1.12.4",
   "description": "AI development toolkit for Claude Code — ideate, auto-resolve, and ship with context engineering and agent orchestration",
   "homepage": "https://github.com/fysoul17/devlyn-cli#readme",
   "bin": {