npm - devlyn-cli - Versions diffs - 2.3.0 → 2.3.2 - Mend

devlyn-cli 2.3.0 → 2.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (219)

package/config/skills/devlyn:resolve/references/phases/verify.md CHANGED

@@ -10,7 +10,7 @@ Independent quality layer. You answer one question: did the diff deliver what th
 - `spec.md` (or `.devlyn/criteria.generated.md` for free-form mode) — the contract.
 - `spec.expected.json` — the mechanical acceptance contract per `_shared/expected.schema.json`.
 - The cumulative diff against `state.base_ref.sha`.
-- The spec hash (`state.source.spec_sha256`) — re-read the spec from disk and confirm the hash matches; if it does not, write `state.phases.verify.verdict: "BLOCKED"` with reason `spec_sha256_mismatch` and stop.
+- The source hash (`state.source.spec_sha256` for spec mode, `state.source.criteria_sha256` for generated free-form mode) — re-read the source contract from disk and confirm the hash matches; if it does not, write `state.phases.verify.verdict: "BLOCKED"` with reason `source_sha256_mismatch` and stop.
 You do NOT receive: PLAN, IMPLEMENT's reasoning, BUILD_GATE's findings, CLEANUP's allowlist negotiations. Reading those would compromise independence.
 </input>
@@ -21,10 +21,7 @@ You do NOT receive: PLAN, IMPLEMENT's reasoning, BUILD_GATE's findings, CLEANUP'
 Re-run the mechanical checks fresh, independent of BUILD_GATE's earlier run:
-1. `python3 .claude/skills/_shared/spec-verify-check.py --include-risk-probes` against the post-CLEANUP code.
-2. Re-scan `spec.expected.json.forbidden_patterns` against the diff (Python re.search; honor each pattern's `files` allowlist).
-3. Confirm `required_files` exist post-diff; confirm `forbidden_files` do not appear in the diff.
-4. Confirm `max_deps_added` is not exceeded (`git diff -- package.json` for Node; equivalent for other ecosystems).
+1. `SPEC_VERIFY_PHASE=verify_mechanical SPEC_VERIFY_FINDINGS_FILE=verify-mechanical.findings.jsonl SPEC_VERIFY_FINDING_PREFIX=VERIFY-MECH python3 .claude/skills/_shared/spec-verify-check.py --include-risk-probes` against the post-CLEANUP code. In spec mode, sibling `spec.expected.json` wins; a malformed sibling is CRITICAL, not a fallback. When `state.risk_profile.risk_probes_enabled == true`, missing `.devlyn/risk-probes.jsonl` is also CRITICAL. The script also checks `forbidden_patterns`, `required_files`, `forbidden_files`, and `max_deps_added`.
 Emit findings to `.devlyn/verify-mechanical.findings.jsonl`. Each match = one finding. Severity from the pattern's `severity` field (disqualifier → CRITICAL, warning → MEDIUM).
@@ -87,28 +84,40 @@ design/style concerns remain non-binding MEDIUM and produce `PASS_WITH_ISSUES`.
 ### Pair-mode (when triggered by orchestrator)
-Pair-mode is eligible only after MECHANICAL has no HIGH/CRITICAL findings.
-Deterministic blockers already decide the verdict and route to the fix loop; a
-second judge there duplicates evidence and wastes wall-time. If MECHANICAL has
-a HIGH/CRITICAL finding, record `pair_judge: null` and do not spawn the second
-VERIFY agent.
+Pair-mode is eligible only after MECHANICAL and the primary JUDGE have no
+verdict-binding findings. Deterministic blockers and primary JUDGE blockers
+already decide the verdict and route to the fix loop; a second judge there
+duplicates evidence and wastes wall-time. If MECHANICAL or the primary JUDGE
+has a verdict-binding finding, record `pair_judge: null` and do not spawn the
+second VERIFY agent.
 When eligible, trigger pair-mode if any of these are true:
-- `--pair-verify` was set.
+- `state.pair_verify == true` (`--pair-verify` was set).
 - `state.mode == "verify-only"`.
-- The spec frontmatter has `complexity: high`.
-- `state.complexity` is `"high"` or `"large"`.
+- The spec frontmatter has `complexity: high`; legacy/external spec
+  `complexity: large` is accepted for compatibility, but new specs use `high`.
+- Current free-form `state.complexity` is `"large"`; legacy `"high"` state remains accepted by the merge validator only for archived run compatibility.
 - `state.risk_profile.high_risk == true`.
 - `.devlyn/risk-probes.jsonl` exists or `state.risk_profile.risk_probes_enabled == true`.
-- MECHANICAL emitted warning-level findings but no HIGH/CRITICAL blockers.
+- The spec includes an actionable solo-headroom hypothesis.
+- MECHANICAL or the primary JUDGE emitted warning-level findings but no
+  verdict-binding blockers.
 - `state.verify.coverage_failed == true`.
+Malformed `state.risk_profile` is a VERIFY contract violation: it must be an
+object, `high_risk` / `risk_probes_enabled` / `pair_default_enabled` must be
+JSON booleans when present, and `reasons` must be a string array. Do not treat
+missing or malformed risk state as low-risk; `verify-merge-findings.py` blocks
+it because it can hide `risk.high` or `risk_probes.enabled` pair triggers.
 If `--no-pair` was set, do not spawn the OTHER-engine judge. Record
 `pair_trigger: { eligible: false, reasons: [], skipped_reason: "user_no_pair" }`
 and continue with solo VERIFY. This is an explicit user opt-out, not an engine
-availability fallback.
+availability fallback. `--pair-verify` and `--no-pair` are mutually exclusive;
+if both are present, stop with `BLOCKED:invalid-flags`.
-Before JUDGE spawn, compute and persist:
+After MECHANICAL and the primary JUDGE finish, compute and persist this before
+spawning the OTHER-engine pair judge:
 ```json
 "pair_trigger": {
@@ -118,9 +127,23 @@ Before JUDGE spawn, compute and persist:
 }
 ```
-If `eligible == true` and `reasons` is non-empty, the OTHER-engine judge is
-mandatory. Skipping it is a VERIFY contract violation. If ineligible, record the
-reason, e.g. `"mechanical_blocker"`.
+If `eligible == true`, `reasons` must be non-empty and include every applicable canonical reason; for example, a spec with an actionable solo-headroom
+hypothesis must include `spec.solo_headroom_hypothesis` even when another reason
+such as `risk.high` also applies. The OTHER-engine judge is mandatory. Skipping
+it is a VERIFY contract violation. If ineligible, record the
+reason, e.g. `"mechanical_blocker"` or `"primary_judge_blocker"`.
+`pair_trigger` is a strict contract, not advisory metadata. `eligible: true`
+requires a non-empty `reasons` list and `skipped_reason: null`; `eligible: false`
+requires an empty `reasons` list and a string/null `skipped_reason`. Do not emit
+contradictory states such as `eligible: true` with `skipped_reason`, or
+`eligible: false` with trigger reasons. `verify-merge-findings.py` blocks VERIFY
+on malformed trigger state. Eligible triggers must contain only canonical
+reasons and at least one reason: `mode.verify-only`, `complexity.high`, `complexity.large`,
+`mode.pair-verify`, `spec.complexity.high`, `spec.complexity.large`,
+`spec.solo_headroom_hypothesis`, `risk.high`, `risk_probes.enabled`,
+`risk_probes.present`, `coverage.failed`, `mechanical.warning`, or
+`judge.warning`.
 The `--engine` flag never disables this rule. Explicit `--engine claude` means
 Claude is the primary judge; if pair-mode triggers, Codex is still the mandatory
@@ -160,12 +183,22 @@ When eligible and the orchestrator spawns a second VERIFY agent with the OTHER e
   after the first verdict-binding finding and emit JSONL. If both probes pass
   and static scope/dependency checks show no blocker, emit PASS; do not continue
   exhaustive exploration.
+  If the spec includes a solo-headroom hypothesis, one of the two targeted
+  probes must exercise that hypothesis with the visible command/input shape and
+  compare the full externally visible result. The probe must use the
+  hypothesis's backticked observable command as its command anchor before adding
+  bounded input variations. Do not substitute a neighboring easier edge case;
+  the pair judge exists to test the stated expected solo miss.
   A targeted probe must compare the full externally visible result
   (stdout/stderr/exit and full parsed output object, including accepted/scheduled
   rows, rejected rows, and remaining state when present), not just a single
-  property. For priority/stateful specs, at least one probe must include an
-  earlier input entity that would succeed under input-order processing, a later
-  higher-priority entity that consumes or blocks the critical resource, and a
+  property. When the spec names exact keys, row shapes, JSON object shape, or an
+  exact error body, compare parsed key sets/deep equality so aliased keys,
+  missing keys, and extra keys are verdict-binding failures. Use the spec's
+  visible input key names literally when constructing the probe input. For
+  priority/stateful specs, at least one probe must include an earlier input
+  entity that would succeed under input-order processing, a later higher-priority
+  entity that consumes or blocks the critical resource, and a
   failure/blocked/rollback edge that determines a later entity's state. This is
   the minimum compound shape for priority + failure/state-mutation bugs.
   Scope qualifiers are binding for the pair judge too: do not reinterpret
@@ -181,12 +214,13 @@ When eligible and the orchestrator spawns a second VERIFY agent with the OTHER e
   (or scheduled) and rejected rows.
 Codex pair-JUDGE is read-only. Invoke `codex-monitored.sh` directly with
-`-c model_reasoning_effort=medium`; this phase is a bounded two-probe review,
-not an unbounded implementation task. Do not pipe it to `tail`, `head`, `grep`,
-`sed`, or `awk`. Capture stdout/stderr by direct tool capture or file
-redirection. The Codex judge must return JSONL
-findings on stdout; the orchestrator writes `.devlyn/verify.pair.findings.jsonl`
-and merges verdicts. Do not ask Codex to `apply_patch` or edit `.devlyn`.
+`CODEX_MONITORED_ISOLATED=1` and `-c model_reasoning_effort=medium`; this is a
+bounded two-probe review, not implementation. Isolation blocks user config,
+AGENTS.md, pyx-memory, hooks, and project rules from hidden context/tool
+side effects. Do not pipe it to `tail`, `head`, `grep`, `sed`, or `awk`.
+Capture stdout/stderr directly. The Codex judge must return JSONL findings on
+stdout; the orchestrator writes `.devlyn/verify.pair.findings.jsonl` and merges
+verdicts. Do not ask Codex to `apply_patch` or edit `.devlyn`.
 The Codex prompt must include a bounded-output contract: no harness-doc reads,
 maximum two targeted probes before first output, stop on the first
 verdict-binding finding, and emit PASS immediately after the bounded checks pass.
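
The strict `pair_trigger` shape described in this hunk can be sketched as a small validator. This is a minimal illustration, not the code in `verify-merge-findings.py`: the function name `check_pair_trigger` and the error wording are hypothetical, and the allowed skip causes are taken from the `state-schema.md` change in the same release. It checks only the shape rules. The state-dependent rules (every applicable reason present, a skip cause matching an actual blocker) need the full run state and are left out.

```python
from typing import Any

# Canonical eligible reasons, copied from the contract text in the diff.
CANONICAL_REASONS = frozenset({
    "mode.verify-only", "mode.pair-verify",
    "complexity.high", "complexity.large",
    "spec.complexity.high", "spec.complexity.large",
    "spec.solo_headroom_hypothesis",
    "risk.high", "risk_probes.enabled", "risk_probes.present",
    "coverage.failed", "mechanical.warning", "judge.warning",
})
# Skip causes permitted when eligible is false (null is also allowed).
SKIP_REASONS = frozenset({"user_no_pair", "mechanical_blocker", "primary_judge_blocker"})


def check_pair_trigger(trigger: Any) -> list[str]:
    """Return shape violations for a pair_trigger value; an empty list means well-formed.

    Hypothetical helper for illustration only.
    """
    if not isinstance(trigger, dict):
        return ["pair_trigger must be an object"]
    errors: list[str] = []
    eligible = trigger.get("eligible")
    reasons = trigger.get("reasons")
    skipped = trigger.get("skipped_reason")
    if not isinstance(eligible, bool):
        errors.append("eligible must be a JSON boolean")
    if not isinstance(reasons, list) or not all(isinstance(r, str) for r in reasons):
        errors.append("reasons must be a string array")
        return errors
    if eligible is True:
        if not reasons:
            errors.append("eligible: true requires a non-empty reasons list")
        unknown = sorted(set(reasons) - CANONICAL_REASONS)
        if unknown:
            errors.append("non-canonical reason(s): " + ", ".join(unknown))
        if skipped is not None:
            errors.append("eligible: true requires skipped_reason: null")
    elif eligible is False:
        if reasons:
            errors.append("eligible: false requires an empty reasons list")
        if skipped is not None and (not isinstance(skipped, str) or skipped not in SKIP_REASONS):
            errors.append(f"unsupported skipped_reason: {skipped!r}")
    return errors


if __name__ == "__main__":
    # A well-formed eligible trigger, then a contradictory one.
    print(check_pair_trigger({"eligible": True, "reasons": ["risk.high"], "skipped_reason": None}))
    print(check_pair_trigger({"eligible": True, "reasons": [], "skipped_reason": "mechanical_blocker"}))
```

Returning a list of violations rather than raising on the first one lets a caller report every contradiction in a malformed trigger at once.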

package/config/skills/devlyn:resolve/references/state-schema.md CHANGED

@@ -11,6 +11,7 @@ Single authoritative verdict source for `/devlyn:resolve`. The orchestrator bran
   "started_at": "2026-04-30T12:00:00Z",
   "engine": "claude",
   "mode": "spec",
+  "pair_verify": false,
   "complexity": null,
   "risk_profile": { "high_risk": false, "reasons": [], "risk_probes_enabled": false, "pair_default_enabled": true },
   "base_ref": { "branch": "main", "sha": "abc123..." },
@@ -44,16 +45,18 @@ Single authoritative verdict source for `/devlyn:resolve`. The orchestrator bran
 - **version** — string. Bump major on a breaking schema change.
 - **mode** — `"free-form" | "spec" | "verify-only"`.
+- **pair_verify** — boolean. Set true only when the user passed `--pair-verify`; otherwise false. This is the durable state evidence for the `mode.pair-verify` pair-trigger reason. It is mutually exclusive with `risk_profile.pair_default_enabled == false` from `--no-pair`; `verify-merge-findings.py` blocks the contradictory state.
 - **complexity** — `null | "trivial" | "medium" | "large"`. Free-form mode populates this; spec/verify-only mode leaves it null.
 - **engine** — `"claude" | "codex" | "auto"` initially; a required unavailable engine stops the run with `BLOCKED:<engine>-unavailable`.
-- **risk_profile** — PHASE 0 classification for conditional defaults. `high_risk` records durable-risk signals from the goal/spec; `risk_probes_enabled` is true for explicit `--risk-probes` or high-risk specs unless `--no-risk-probes`; `pair_default_enabled` is false only for explicit `--no-pair`.
+- **source** — provenance for the contract all downstream phases read. Spec and verify-only mode set `type: "spec"`, `spec_path`, and `spec_sha256`. Free-form mode sets `type: "generated"`, leaves `spec_path`/`spec_sha256` null, and must set `criteria_path: ".devlyn/criteria.generated.md"` plus `criteria_sha256` from the generated file's raw bytes. VERIFY re-checks the matching hash before judging.
+- **risk_profile** — PHASE 0 classification for conditional defaults. `high_risk` records durable-risk signals from the goal/spec; `risk_probes_enabled` is true for explicit `--risk-probes` or high-risk specs unless `--no-risk-probes`; `pair_default_enabled` is false only for explicit `--no-pair`. `risk_profile` must remain an object with boolean `high_risk`, `risk_probes_enabled`, and `pair_default_enabled` fields when present, plus `reasons` as a list of strings. Malformed `risk_profile` blocks VERIFY because pair-trigger reasons derive `risk.high` and `risk_probes.enabled` from this state.
 - **rounds.global** — incremented every fix-loop pass (BUILD_GATE → fix-loop OR VERIFY → fix-loop).
 - **phases.probe_derive** — optional PHASE 1.5 entry when `--risk-probes` is enabled. Artifacts include `.devlyn/risk-probes.jsonl`. Probe failures later surface through BUILD_GATE/VERIFY as `correctness.risk-probe-failed`.
 - **bypasses** — array of phase names from `--bypass`. Valid: `"build-gate" | "cleanup"`. PLAN, IMPLEMENT, VERIFY are non-bypassable (orchestrator rejects at parse time).
 - **implement_passed_sha** — captured at end of PHASE 2; null until then. Activates the post-implement invariant for CLEANUP and VERIFY.
 - **criteria** — generated from spec's `## Requirements` checklist (one per `- [ ]`). `status: pending → implemented` is the legal transition. `failed_by_finding_ids` populates when VERIFY surfaces a finding tied to a criterion.
-- **verify.coverage_failed** — set by VERIFY's JUDGE sub-phase when a spec axis could not be exercised against the diff. Triggers pair-mode escalation when set. Pair-mode also triggers for verify-only mode, high-risk specs, active risk probes, `complexity: high` specs, or `state.complexity` of `"high"`/`"large"` when MECHANICAL has no HIGH/CRITICAL blockers.
-- **verify.pair_trigger** — VERIFY's trigger decision: `{ "eligible": boolean, "reasons": string[], "skipped_reason": string|null }`. If eligible with any reason, `pair_judge` must be non-null.
+- **verify.coverage_failed** — set by VERIFY's JUDGE sub-phase when a spec axis could not be exercised against the diff. Triggers pair-mode escalation when set. Pair-mode also triggers for `state.pair_verify == true`, verify-only mode, high-risk specs, active risk probes, actionable solo-headroom hypotheses, `complexity: high` specs, or current free-form `state.complexity` of `"large"` when MECHANICAL and the primary JUDGE have no verdict-binding blockers. Legacy/external spec `complexity: large` remains accepted for compatibility; new specs use `high`. Legacy `"high"` state remains accepted by the merge validator only for archived run compatibility.
+- **verify.pair_trigger** — VERIFY's trigger decision: `{ "eligible": boolean, "reasons": string[], "skipped_reason": string|null }`. The shape is strict: `eligible: true` requires a non-empty reasons list containing every applicable canonical eligible reason and only canonical eligible reasons, plus `skipped_reason: null`; `eligible: false` requires an empty reasons list and may set only `user_no_pair`, `mechanical_blocker`, `primary_judge_blocker`, or null as the skip cause. Canonical eligible reasons are `mode.verify-only`, `mode.pair-verify`, `complexity.high`, `complexity.large`, `spec.complexity.high`, `spec.complexity.large`, `spec.solo_headroom_hypothesis`, `risk.high`, `risk_probes.enabled`, `risk_probes.present`, `coverage.failed`, `mechanical.warning`, and `judge.warning`. `user_no_pair` is valid only when `risk_profile.pair_default_enabled == false` from an explicit `--no-pair`; `mechanical_blocker` and `primary_judge_blocker` are valid only when the matching source has a verdict-binding finding. If state implies a pair decision is required but `pair_trigger` is missing, if it records `eligible:false` with no supported skip reason, if an eligible trigger omits an applicable reason such as `spec.solo_headroom_hypothesis`, or if any combination is malformed, `verify-merge-findings.py` blocks VERIFY.
 ## Per-phase shape
@@ -107,7 +110,7 @@ Per-phase summary table: `phase | verdict | duration_ms | round | triggered_by |
 Findings table (post-IMPLEMENT phases only — they are findings-only): each finding's `severity | rule_id | file:line | message | confidence`.
-Follow-up notes: any `--continue-on-large` assumptions, pair/risk-probe opt-out state, engine setup guidance for `BLOCKED:<engine>-unavailable`, any `state.verify.coverage_failed` axes.
+Follow-up notes: any `--continue-on-large` assumptions, pair/risk-probe opt-out state, engine setup guidance for `BLOCKED:<engine>-unavailable`, `/devlyn:ideate` guidance for `BLOCKED:solo-headroom-hypothesis-required` that asks for the visible behavior `solo_claude` is expected to miss, `/devlyn:ideate` guidance for `BLOCKED:solo-ceiling-avoidance-required` that asks for the concrete difference from rejected or solo-saturated controls such as `S2`-`S6`, and any `state.verify.coverage_failed` axes.
 ## Archive contract
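
The new `source` field in this file records which contract file VERIFY must re-hash before judging. A rough sketch of that check follows. Two things are assumptions: the digest is stored as a hex string, and `source_hash_matches` is a hypothetical name rather than a function in the package. Per the schema text, the hash is taken over the file's raw bytes.

```python
import hashlib
from pathlib import Path


def source_hash_matches(source: dict, root: Path) -> bool:
    """Re-hash the contract named in state.source and compare it with the recorded digest.

    Hypothetical helper; assumes the digest is recorded as a hex string.
    """
    if source.get("type") == "generated":
        # Free-form mode: the generated criteria file is the contract.
        path, recorded = source.get("criteria_path"), source.get("criteria_sha256")
    else:
        # Spec and verify-only mode: the spec file is the contract.
        path, recorded = source.get("spec_path"), source.get("spec_sha256")
    if not isinstance(path, str) or not isinstance(recorded, str):
        return False
    contract = root / path
    if not contract.is_file():
        return False
    actual = hashlib.sha256(contract.read_bytes()).hexdigest()
    return actual == recorded.lower()
```

A `False` result corresponds to the case where VERIFY writes verdict `BLOCKED` with reason `source_sha256_mismatch`. This sketch also returns `False` for missing or malformed `source` fields, which is a design choice of the example, not something the diff specifies.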

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "devlyn-cli",
-  "version": "2.3.0",
-  "description": "AI development toolkit for Claude Code — ideate, auto-resolve, and ship with context engineering and agent orchestration",
+  "version": "2.3.2",
+  "description": "AI development toolkit for Claude Code — ideate, resolve, and ship with context engineering and agent orchestration",
   "homepage": "https://github.com/fysoul17/devlyn-cli#readme",
   "bin": {
     "devlyn": "bin/devlyn.js"
@@ -20,13 +20,58 @@
     "agents-config",
     "optional-skills",
     "benchmark/auto-resolve/BENCHMARK-DESIGN.md",
+    "benchmark/auto-resolve/BENCHMARK-RESULTS.md",
     "benchmark/auto-resolve/README.md",
     "benchmark/auto-resolve/RUBRIC.md",
+    "benchmark/auto-resolve/run-real-benchmark.md",
     "benchmark/auto-resolve/fixtures/SCHEMA.md",
     "benchmark/auto-resolve/fixtures/F*/**",
+    "benchmark/auto-resolve/fixtures/retired/F*/**",
+    "benchmark/auto-resolve/shadow-fixtures/S*/**",
     "benchmark/auto-resolve/fixtures/test-repo/**",
     "!benchmark/auto-resolve/fixtures/test-repo/node_modules/**",
+    "benchmark/auto-resolve/results/20260510-f16-f23-f25-combined-proof/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260510-f16-f23-f25-combined-proof/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260510-f16-f23-f25-combined-proof/full-pipeline-pair-gate.md",
+    "benchmark/auto-resolve/results/20260510-f16-f23-f25-combined-proof/full-pipeline-pair-gate.json",
+    "benchmark/auto-resolve/results/20260511-f21-current-riskprobes-v1/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260511-f21-current-riskprobes-v1/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260511-f21-current-riskprobes-v1/full-pipeline-pair-gate.md",
+    "benchmark/auto-resolve/results/20260511-f21-current-riskprobes-v1/full-pipeline-pair-gate.json",
+    "benchmark/auto-resolve/results/20260507-f10-f11-tier1-full-pipeline/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260507-f10-f11-tier1-full-pipeline/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260508-f22-exact-error-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260508-f22-exact-error-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260508-f26-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260508-f26-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260511-f3-http-error-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260511-f3-http-error-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260511-f12-webhook-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260511-f12-webhook-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260511-f15-concurrency-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260511-f15-concurrency-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f2-medium-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f2-medium-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f4-web-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f4-web-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f5-fixloop-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f5-fixloop-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f6-checksum-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f6-checksum-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f7-scope-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f7-scope-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f9-e2e-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f9-e2e-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f31-seat-rebalance-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f31-seat-rebalance-headroom/headroom-gate.json",
+    "benchmark/auto-resolve/results/20260512-f32-subscription-renewal-headroom/headroom-gate.md",
+    "benchmark/auto-resolve/results/20260512-f32-subscription-renewal-headroom/headroom-gate.json",
     "benchmark/auto-resolve/scripts/**",
+    "!**/__pycache__",
+    "!**/__pycache__/**",
+    "!**/*.pyc",
+    "scripts/lint-fixtures.sh",
+    "scripts/lint-shadow-fixtures.sh",
     "scripts/lint-skills.sh",
     "CLAUDE.md",
     "AGENTS.md"

package/scripts/lint-fixtures.sh ADDED

@@ -0,0 +1,349 @@
+#!/usr/bin/env bash
+# lint-fixtures.sh — schema validity + structural check for golden fixtures/.
+set -euo pipefail
+REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+FIXTURES_DIR="${DEVLYN_FIXTURES_DIR:-$REPO_ROOT/benchmark/auto-resolve/fixtures}"
+FIXTURE_GLOB="${DEVLYN_FIXTURE_GLOB:-F*}"
+RETIRED_FIXTURE_GLOB="${DEVLYN_RETIRED_FIXTURE_GLOB:-F*}"
+REJECTED_REGISTRY="${DEVLYN_REJECTED_FIXTURE_REGISTRY:-$REPO_ROOT/benchmark/auto-resolve/scripts/pair-rejected-fixtures.sh}"
+SCHEMA="${DEVLYN_EXPECTED_SCHEMA:-$REPO_ROOT/config/skills/_shared/expected.schema.json}"
+SPEC_VERIFY_CHECK="$REPO_ROOT/config/skills/_shared/spec-verify-check.py"
+SOLO_HEADROOM_CHECK="$REPO_ROOT/benchmark/auto-resolve/scripts/solo-headroom-hypothesis.py"
+[ -d "$FIXTURES_DIR" ] || { echo "✗ $FIXTURES_DIR missing"; exit 1; }
+[ -f "$SCHEMA" ] || { echo "✗ $SCHEMA missing"; exit 1; }
+[ -f "$SPEC_VERIFY_CHECK" ] || { echo "✗ $SPEC_VERIFY_CHECK missing"; exit 1; }
+[ -f "$SOLO_HEADROOM_CHECK" ] || { echo "✗ solo-headroom checker missing: $SOLO_HEADROOM_CHECK"; exit 1; }
+[ -f "$REJECTED_REGISTRY" ] || { echo "✗ rejected fixture registry missing: $REJECTED_REGISTRY"; exit 1; }
+# shellcheck source=/dev/null
+source "$REJECTED_REGISTRY"
+if ! declare -F rejected_pair_fixture_reason >/dev/null; then
+  echo "✗ rejected fixture registry must define rejected_pair_fixture_reason: $REJECTED_REGISTRY"
+  exit 1
+fi
+REQUIRED_FILES=(metadata.json spec.md task.txt expected.json setup.sh NOTES.md)
+ERRORS=0
+COUNT=0
+RETIRED_COUNT=0
+for d in "$FIXTURES_DIR"/$FIXTURE_GLOB/; do
+  [ -d "$d" ] || continue
+  COUNT=$((COUNT + 1))
+  fid="$(basename "$d")"
+  for f in "${REQUIRED_FILES[@]}"; do
+    if [ ! -f "$d/$f" ]; then
+      echo "✗ $fid: missing $f"
+      ERRORS=$((ERRORS + 1))
+    fi
+  done
+  if [ -f "$d/metadata.json" ]; then
+    meta_id=$(python3 -c "import json,sys; print(json.load(open('$d/metadata.json'))['id'])" 2>/dev/null || echo "")
+    if [ "$meta_id" != "$fid" ]; then
+      echo "✗ $fid: metadata.json id='$meta_id' does not match dir name"
+      ERRORS=$((ERRORS + 1))
+    fi
+    python3 - "$d/metadata.json" "$d/spec.md" "$fid" <<'PY' || ERRORS=$((ERRORS + 1))
+import json
+import re
+import sys
+metadata_path, spec_path, fid = sys.argv[1], sys.argv[2], sys.argv[3]
+try:
+    metadata = json.load(open(metadata_path, encoding="utf-8"))
+except Exception:
+    sys.exit(0)
+if metadata.get("category") != "high-risk":
+    sys.exit(0)
+intent = str(metadata.get("intent") or "")
+try:
+    spec = open(spec_path, encoding="utf-8").read()
+except FileNotFoundError:
+    spec = ""
+text = f"{intent}\n{spec}".lower()
+risk_pattern = re.compile(
+    r"\b("
+    r"auth|authz|permissions?|security|tokens?|sessions?|"
+    r"payments?|money|billing|invoices?|pricing|tax|ledger|"
+    r"persistence|persist\w*|data mutation|delet\w*|migrations?|"
+    r"idempoten\w*|replay|duplicates?|api|webhook|raw-body|signatures?|"
+    r"allocation|scheduling|inventory|rollback|transaction|"
+    r"priority|error-priority|output-shape|output shape|response-shape|response shape"
+    r")\b"
+)
+if not risk_pattern.search(text):
+    print(
+        f"✗ {fid}: high-risk fixture must include a resolve risk-trigger term "
+        "in metadata intent or spec.md"
+    )
+    sys.exit(1)
+PY
+  fi
+  if [ -f "$d/spec.md" ]; then
+    spec_id=$(python3 - "$d/spec.md" <<'PY' 2>/dev/null || true
+import re, sys
+text = open(sys.argv[1], encoding="utf-8").read()
+m = re.search(r'^id:\s*"?([^"\n]+)"?\s*$', text, re.M)
+print(m.group(1) if m else "")
+PY
+)
+    if [ "$spec_id" != "$fid" ]; then
+      echo "✗ $fid: spec.md frontmatter id='$spec_id' does not match dir name"
+      ERRORS=$((ERRORS + 1))
+    fi
+  fi
+  if [ -f "$d/expected.json" ]; then
+    if ! python3 - "$d/expected.json" "$fid" <<'PY'
+import json
+import sys
+expected_path, fid = sys.argv[1], sys.argv[2]
+try:
+    data = json.load(open(expected_path, encoding="utf-8"))
+except json.JSONDecodeError:
+    print(f"✗ {fid}: expected.json is not valid JSON")
+    sys.exit(1)
+if not isinstance(data, dict):
+    print(f"✗ {fid}: expected.json must be an object")
+    sys.exit(1)
+PY
+    then
+      ERRORS=$((ERRORS + 1))
+      continue
+    fi
+    n_cmds=$(python3 - "$d/expected.json" <<'PY'
+import json
+import sys
+data = json.load(open(sys.argv[1], encoding="utf-8"))
+commands = data.get("verification_commands", [])
+print(len(commands) if isinstance(commands, list) else 0)
+PY
+)
+    if [ "$n_cmds" -lt 1 ]; then
+      echo "✗ $fid: expected.json has 0 verification_commands (need ≥1)"
+      ERRORS=$((ERRORS + 1))
+    fi
+    schema_ok=1
+    if ! python3 - "$SCHEMA" "$d/expected.json" "$fid" <<'PY'
+import json, os, sys
+schema_path, expected_path, fid = sys.argv[1], sys.argv[2], sys.argv[3]
+schema = json.load(open(schema_path))
+data = json.load(open(expected_path))
+def is_string_list(value):
+    return isinstance(value, list) and all(isinstance(item, str) and item for item in value)
+def fallback_validate():
+    allowed = set(schema["properties"])
+    errors = []
+    if not isinstance(data, dict):
+        return ["expected.json must be an object"]
+    unknown = sorted(set(data) - allowed)
+    if unknown:
+        errors.append(f"expected.json has unknown key(s): {', '.join(unknown)}")
+    commands = data.get("verification_commands", [])
+    if not isinstance(commands, list):
+        errors.append("verification_commands must be an array")
+    else:
+        for idx, command in enumerate(commands):
+            if not isinstance(command, dict):
+                errors.append(f"verification_commands[{idx}] must be an object")
+                continue
+            unknown_command = sorted(set(command) - {"cmd", "exit_code", "stdout_contains", "stdout_not_contains", "contract_refs"})
+            if unknown_command:
+                errors.append(f"verification_commands[{idx}] has unknown key(s): {', '.join(unknown_command)}")
+            if not isinstance(command.get("cmd"), str) or not command.get("cmd"):
+                errors.append(f"verification_commands[{idx}].cmd must be a non-empty string")
+            exit_code = command.get("exit_code", 0)
+            if isinstance(exit_code, bool) or not isinstance(exit_code, int):
+                errors.append(f"verification_commands[{idx}].exit_code must be an integer")
+            for key in ("stdout_contains", "stdout_not_contains", "contract_refs"):
+                if key in command and not is_string_list(command[key]):
+                    errors.append(f"verification_commands[{idx}].{key} must be an array of non-empty strings")
+    patterns = data.get("forbidden_patterns", [])
+    if not isinstance(patterns, list):
+        errors.append("forbidden_patterns must be an array")
+    else:
+        for idx, pattern in enumerate(patterns):
+            if not isinstance(pattern, dict):
+                errors.append(f"forbidden_patterns[{idx}] must be an object")
+                continue
+            unknown_pattern = sorted(set(pattern) - {"pattern", "description", "files", "severity"})
+            if unknown_pattern:
+                errors.append(f"forbidden_patterns[{idx}] has unknown key(s): {', '.join(unknown_pattern)}")
+            for key in ("pattern", "description"):
+                if not isinstance(pattern.get(key), str) or not pattern.get(key):
+                    errors.append(f"forbidden_patterns[{idx}].{key} must be a non-empty string")
+            if pattern.get("severity") not in {"disqualifier", "warning"}:
+                errors.append(f"forbidden_patterns[{idx}].severity must be disqualifier or warning")
+            if "files" in pattern and not is_string_list(pattern["files"]):
+                errors.append(f"forbidden_patterns[{idx}].files must be an array of non-empty strings")
+    for key in ("required_files", "forbidden_files", "tier_a_waivers", "spec_output_files"):
+        if key in data and not is_string_list(data[key]):
+            errors.append(f"{key} must be an array of non-empty strings")
+    max_deps_added = data.get("max_deps_added", 0)
+    if isinstance(max_deps_added, bool) or not isinstance(max_deps_added, int) or max_deps_added < 0:
+        errors.append("max_deps_added must be an integer >= 0")
+    return errors
+force_fallback = os.environ.get("DEVLYN_LINT_FIXTURES_NO_JSONSCHEMA") == "1"
+try:
+    if force_fallback:
+        raise ImportError
+    import jsonschema
+except ImportError:
+    fallback_errors = fallback_validate()
+    if fallback_errors:
+        for error in fallback_errors:
+            print(f"✗ {fid}: expected.json schema violation: {error}")
+        sys.exit(1)
+else:
+    try:
+        jsonschema.validate(data, schema)
+    except jsonschema.ValidationError as e:
+        print(f"✗ {fid}: expected.json schema violation: {e.message}")
+        sys.exit(1)
+PY
+    then
+      ERRORS=$((ERRORS + 1))
+      schema_ok=0
+    fi
+    if [ "$schema_ok" -eq 1 ]; then
+      if ! python3 "$SPEC_VERIFY_CHECK" --check "$d/spec.md"; then
+        echo "✗ $fid: spec-verify-check --check failed"
+        ERRORS=$((ERRORS + 1))
+      fi
+      if ! python3 "$SPEC_VERIFY_CHECK" --check-expected "$d/expected.json"; then
+        echo "✗ $fid: spec-verify-check --check-expected failed"
+        ERRORS=$((ERRORS + 1))
+      fi
+      python3 - "$d/spec.md" "$d/expected.json" "$fid" <<'PY' || ERRORS=$((ERRORS + 1))
+import json, pathlib, re, sys
+spec_path, expected_path, fid = sys.argv[1], sys.argv[2], sys.argv[3]
+spec = pathlib.Path(spec_path).read_text(encoding="utf-8")
+expected = json.loads(pathlib.Path(expected_path).read_text(encoding="utf-8"))
+fixture_dir = pathlib.Path(expected_path).parent
+fixture_root = fixture_dir.resolve()
+errors = []
+for idx, command in enumerate(expected.get("verification_commands", [])):
+    cmd = str(command.get("cmd", ""))
+    # Only commands that mention BENCH_FIXTURE_DIR are treated as hidden oracles; skip the rest.
+    if "BENCH_FIXTURE_DIR" not in cmd:
+        continue
+    fixture_refs = re.findall(r"(?:\$\{BENCH_FIXTURE_DIR\}|\$BENCH_FIXTURE_DIR)/([^\"'\s]+)", cmd)
+    if not fixture_refs:
+        errors.append(
+            f"verification_commands[{idx}] hidden oracle must reference an explicit $BENCH_FIXTURE_DIR/... file"
+        )
+    stdout_contains = command.get("stdout_contains", [])
+    if '"ok":true' not in stdout_contains:
+        errors.append(
+            f"verification_commands[{idx}] hidden oracle must assert stdout_contains includes '\"ok\":true'"
+        )
+    for fixture_ref in fixture_refs:
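+        # Resolve the reference and reject any path that lands outside the fixture directory (for example via "..").
+        target = (fixture_dir / fixture_ref).resolve(strict=False)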
+        try:
+            target.relative_to(fixture_root)
+        except ValueError:
+            errors.append(
+                f"verification_commands[{idx}] BENCH_FIXTURE_DIR file escapes fixture dir: {fixture_ref!r}"
+            )
+            continue
+        if not target.is_file():
+            errors.append(
+                f"verification_commands[{idx}] BENCH_FIXTURE_DIR file not found: {fixture_ref!r}"
+            )
+    refs = command.get("contract_refs", [])
+    if not refs:
+        errors.append(f"verification_commands[{idx}] hidden oracle missing contract_refs")
+        continue
+    for ref in refs:
+        if ref not in spec:
+            errors.append(
+                f"verification_commands[{idx}] contract_ref not found in spec.md: {ref!r}"
+            )
+if errors:
+    for err in errors:
+        print(f"✗ {fid}: {err}")
+    sys.exit(1)
+PY
+    fi
+  fi
+  if [ -f "$d/setup.sh" ] && [ ! -x "$d/setup.sh" ]; then
+    echo "✗ $fid: setup.sh not executable (run: chmod +x $d/setup.sh)"
+    ERRORS=$((ERRORS + 1))
+  fi
+  if [ -f "$d/NOTES.md" ] \
+     && { { grep -Fq 'headroom gate' "$d/NOTES.md" && grep -Eq '`?FAIL`?' "$d/NOTES.md"; } \
+       || { grep -Fq 'pair-lift evidence' "$d/NOTES.md" && grep -Fiq 'reject' "$d/NOTES.md"; }; } \
+     && ! rejected_pair_fixture_reason "$fid" >/dev/null 2>&1; then
+    echo "✗ $fid: NOTES.md records pair-candidate rejection but pair-rejected-fixtures.sh has no rejected reason"
+    ERRORS=$((ERRORS + 1))
+  fi
+  if [ -f "$d/NOTES.md" ] \
+     && grep -Fq 'pair_evidence_passed' "$d/NOTES.md" \
+     && ! python3 "$SOLO_HEADROOM_CHECK" --expected-json "$d/expected.json" "$d/spec.md"; then
+    echo "✗ $fid: pair_evidence_passed fixture spec.md must document an actionable solo-headroom hypothesis with solo_claude miss and observable command from expected.json"
+    ERRORS=$((ERRORS + 1))
+  fi
+done
+for d in "$FIXTURES_DIR"/retired/$RETIRED_FIXTURE_GLOB/; do
+  [ -d "$d" ] || continue
+  RETIRED_COUNT=$((RETIRED_COUNT + 1))
+  fid="$(basename "$d")"
+  if [ ! -f "$d/RETIRED.md" ]; then
+    echo "✗ retired/$fid: missing RETIRED.md"
+    ERRORS=$((ERRORS + 1))
+  fi
+  for f in "${REQUIRED_FILES[@]}"; do
+    if [ ! -f "$d/$f" ]; then
+      echo "✗ retired/$fid: missing preserved $f"
+      ERRORS=$((ERRORS + 1))
+    fi
+  done
+  if [ -f "$d/metadata.json" ]; then
+    meta_id=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1], encoding='utf-8'))['id'])" "$d/metadata.json" 2>/dev/null || echo "")
+    if [ "$meta_id" != "$fid" ]; then
+      echo "✗ retired/$fid: metadata.json id='$meta_id' does not match dir name"
+      ERRORS=$((ERRORS + 1))
+    fi
+  fi
+  if [ -f "$d/setup.sh" ] && [ ! -x "$d/setup.sh" ]; then
+    echo "✗ retired/$fid: setup.sh not executable (run: chmod +x $d/setup.sh)"
+    ERRORS=$((ERRORS + 1))
+  fi
+done
+if [ "$COUNT" -eq 0 ]; then
+  echo "✗ no fixtures found in $FIXTURES_DIR"
+  exit 1
+fi
+if [ "$ERRORS" -gt 0 ]; then
+  echo ""
+  echo "✗ lint-fixtures: $ERRORS error(s) across $COUNT active fixture(s) and $RETIRED_COUNT retired fixture(s)"
+  exit 1
+fi
+echo "✓ lint-fixtures: $COUNT active fixture(s) passed schema + structural checks; $RETIRED_COUNT retired fixture(s) preserved"