npm - @chrono-meta/fh-gate - Versions diffs - 1.4.20 → 1.4.22 - Mend

@chrono-meta/fh-gate 1.4.20 → 1.4.22

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11) hide show

package/AGENTS.md +10 -1
package/CATALOG.md +18 -6
package/CHEATSHEET.md +2 -1
package/CLAUDE.md +5 -2
package/package.json +1 -1
package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md +29 -3
package/plugins/fh-meta/skills/phantom-quench/SKILL.md +36 -6
package/plugins/fh-meta/skills/sim-conductor/SKILL.md +76 -2
package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +6 -1
package/plugins/fh-meta/skills/steel-quench/SKILL.md +23 -1
package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md +23 -0

package/AGENTS.md CHANGED Viewed

@@ -59,7 +59,16 @@ Agents in this registry belong to the **Automation layer**. Skills (in `plugins/
 > **Codex-compatible beta**: The Methodology layer (`tracks/`, `knowledge/`, skill documentation) is designated Codex-compatible beta. Gemini, Codex, and other AI users can apply FH methodology without the Automation layer — manual invocation replaces hook/agent dispatch.
-> **Multi-model sidecar (validated)**: Any FH user can delegate to other models via sidecar — Gemini CLI, OpenAI/Codex CLI, or Copilot CLI's model catalog — invoked with `Bash` from within the Claude Code session. FH is the orchestrating harness; the sidecar is a routing/access layer (not a second harness — different layer entirely). Validated empirically: `echo "prompt" | gemini` works inside a CC session and produces usable output. Sidecar calls are Bash invocations, not agent dispatches — they bypass this registry and are coordinated inline by the skill. See `knowledge/shared/harness-core/multi_model_sidecar_strategy.md` for the full pattern.
+> **Directory → destination routing (where your outputs belong)**: not everything in the methodology layer is public-shareable. `knowledge/` and `SKILL.md` docs are the **public, reusable** methodology. `tracks/` is **local / private by convention** — work history, session records, `fh_signal_*`, audit logs — and is gitignored on the public mirror. An AI working in a local workspace that pairs the public mirror with a private companion store (the `*-be` pattern) must **not** infer "same folder ⇒ same repository"; route by content type:
+>
+> | Content | Default destination |
+> |---|---|
+> | reusable methodology · docs · skills · public guidance · polished external-facing conclusions | public mirror (`knowledge/`, `plugins/`, `docs/`) |
+> | raw signal · operator observation · private validation · handoff · paper draft · PR-background reasoning log | private companion store (`*-be` pattern) — or keep local; do **not** commit to the public mirror |
+>
+> When unsure, treat raw / observational / operator-specific material as **private-first** and promote only the polished result to public. (Concrete per-operator bindings — exact companion-store path, sync mechanism — live in the operator's local config, not here.)
+> **Multi-model sidecar (validated)**: Any FH user can delegate to other models via sidecar — Gemini CLI, OpenAI/Codex CLI, or Copilot CLI's model catalog — invoked with `Bash` from within the Claude Code session. FH is the orchestrating harness; the sidecar is a routing/access layer (not a second harness — different layer entirely). Validated empirically: `echo "prompt" | gemini` works inside a CC session and produces usable output. Sidecar calls are Bash invocations, not agent dispatches — they bypass this registry and are coordinated inline by the skill. Capability routing matters too: Gemini/Antigravity is the natural multimodal sidecar, while a Codex app/runtime session with Browser/Chrome connectors is the preferred handoff for live web-flow automation. In a local FH workspace that pairs the public methodology mirror with a private companion store (the `*-be` pattern), route by workspace capability while preserving each repository's ownership boundary. See `knowledge/shared/harness-core/multi_model_sidecar_strategy.md` for the full pattern.
 ---

package/CATALOG.md CHANGED Viewed

@@ -8,6 +8,18 @@ AI reads this file first when searching past work. Open individual files for det
 <!-- Add entries in reverse date order (newest at top) -->
+### 2026-06-13 | forge-harness | #judge-robustness, #mechanical-anchor, #hardening-batch-2, #sycophancy-gate, #verification
+**File:** plugins/fh-meta/skills/{verify-bidirectional,steel-quench,asset-placement-gate}/SKILL.md (commit f80bc99)
+Batch-2 of the judge-robustness hardening (after #1-#2 in be2d5dc): three more judge-only verdict paths bound to anchors. #3 verify-bidirectional evidence gate — a persistent-baseline overwrite needs a supporting cited source (read, not existence) or a grep contradiction, else ESCALATE+block; closes the bare-pushback sycophancy vector without restoring AI stubbornness. #4 steel-quench Wave-P3 PASS-framing redaction (mktemp glyph+verdict-phrase strip — the challenger caught a naive bare-PASS global corrupting "status==PASS", an S fixed pre-commit). #5 asset-placement Step 0.5 mechanical pre-grep grounds criterion ④. challenger-verify round 2 was load-bearing again (FAIL→fixed: 1S+2A+4B); sonnet blind sim PASS (evidence gate ESCALATEs on bare overwrite).
+- Decision: #6 sim-conductor (A) deferred — its cross-model hard-gate needs graceful degradation for CC-only environments; not rushed into the batch.
+- Open: #6 sim-conductor staged; npm republish (1.4.21) bundling be2d5dc + f80bc99 pending operator.
+### 2026-06-13 | forge-harness | #judge-robustness, #mechanical-anchor, #verification-hardening, #self-audit, #scaled-dispatch
+**File:** plugins/fh-meta/skills/phantom-quench/SKILL.md + templates/.git-hooks/pre-commit + CLAUDE.md/CHEATSHEET (commit be2d5dc)
+Deep-research + a 6-agent parallel swarm audit turned FH's own adversarial method on FH: armed with arXiv 2507.08794 ("One Token to Fool LLM-as-a-Judge"), cold agents found 5/6 of FH's judged-check skills S-exploitable. Common root cause: every terminal verdict is judge-only with no mechanical checksum. Shipped the two highest-leverage fixes — (#1) the pre-commit gate marker now requires a non-vacuous, auditable axis2-evidence field, with the residual (provenance against a self-deceiving runner) documented honestly as the weekly-audit's+operator's, not pretended-closed by a false-security HMAC; (#2) phantom-quench GROUNDED is now gated on a typed mechanical anchor (proper-noun grep-in-asserting-slot / numeric value-after-normalization / branching decompose-or-declare-judged; universal rule: a hit counts only if the line expresses the claimed relation), closing out-of-context grounding. challenger-verify caught a real over-block regression (format-variant false-flag) before it shipped; sonnet blind sim PASS.
+- Decision: bind judged verdicts to mechanical anchors where one exists (verifiability constraint); ship #1-#2 first, stage #3-#6 (verify-bidirectional evidence gate, steel-quench redaction, asset-placement grep, sim-conductor cross-model) — rushing all six would be the over-verification the same research warned against.
+- Open: hardenings #3-#6 staged; scaled-dispatch meta-lesson = value is real-target coverage (~1 reviewer per asset), not agent count.
 ### 2026-06-13 | forge-harness | #deep-research, #capability-ladder, #no-reinvention, #routing, #goal-quench-max
 **File:** knowledge/shared/harness-core/deep_research_capability_ladder.md (+ CLAUDE.md initiative row, goal-quench/frontier-digest SKILL.md) (commit 55fa3da)
 Deep-research as an FH default — lifts the /deep-research engine ladder that was locked inside frontier-digest into a general routing default. 3 rungs: built-in /deep-research if present → Claude WebSearch+WebFetch synthesis (always available, tier-sensitive) → frontier-digest for AI/harness trend-scan only. No-reinvention: FH routes to the best capability present, builds no research engine. Wired as a CLAUDE.md Autonomous Initiative Layer row (default invocation) + goal-quench max-mode capability-gap fill (flexes in when budget RED), with rung-2 research run in an isolated sub-agent to preserve max's context budget. 4-axis: challenger PASS no-S (4B applied incl. the isolation invariant) + sonnet blind sim PASS (correct rung, trend-scan boundary held).
@@ -19,13 +31,13 @@ Completion sweep of all currently-unblocked carries. FP3: agy joins the sidecar
 - Decision: probe must exercise the same invocation form the dispatch uses (forms diverge on one binary — agy proved it); hook enforces ack form, weekly audit owns genuineness.
 ### 2026-06-11 | forge-harness | #readme-dedup, #commoditization-defense, #b1-boundary, #seed-vetting, #field-routing
-**File:** README.md (PR #93) + fh-be signals/handoffs (PR #24+) + field project v0.14 (private track)
-Cloud session #4 (Mode D, remote-doable batch): README stale duplicate "Measured, not asserted" block removed (pre-tier-floor copy contradicted default-Sonnet stance) + "Where this sits (2026)" positioning para (gate+loop are the asset — plumbing commoditizes). fh-be: B1 scope boundary (AW2 simple-vs-complex import), SC1–4 seed vetting (SC1 phantom arXiv fixed → 2509.19349; SC3 commoditization threat to bet ID), gstack positioning-triangle line. Field: private-track companion handoff verified+executed — DefectPatternMatcher+Bug Mode implemented UTF-8-clean in the field repo (25 tests PASS, 0 regressions); its OpenCode(sLLM) lane's downgrade triage hardened to 2-tier (keep/block) with free-tier 80/20 split — β/SC2 field case logged.
-- Decision: README action PARTIAL (merge done; About refresh + npm 1.4.14 laptop-bound); full-TC locked on qwen build per operator's CaseCraft limit measurement — structure-transform survives (P6/P6.5 deterministic), meaning-fill routes to frontier.
+**File:** README.md (PR #93) + companion-store signals/handoffs (PR #24+) + field project v0.14 (private track)
+Cloud session #4 (Mode D, remote-doable batch): README stale duplicate "Measured, not asserted" block removed (pre-tier-floor copy contradicted default-Sonnet stance) + "Where this sits (2026)" positioning para (gate+loop are the asset — plumbing commoditizes). Companion store: B1 scope boundary (AW2 simple-vs-complex import), SC1–4 seed vetting (SC1 phantom arXiv fixed → 2509.19349; SC3 commoditization threat to bet ID), gstack positioning-triangle line. Field: private-track companion handoff verified+executed — DefectPatternMatcher+Bug Mode implemented UTF-8-clean in the field repo (25 tests PASS, 0 regressions); its OpenCode(sLLM) lane's downgrade triage hardened to 2-tier (keep/block) with free-tier 80/20 split — β/SC2 field case logged.
+- Decision: README action PARTIAL (merge done; About refresh + npm 1.4.14 laptop-bound); full-TC locked on local-LLM build per operator's CaseCraft limit measurement — structure-transform survives (P6/P6.5 deterministic), meaning-fill routes to frontier.
 ### 2026-06-11 | forge-harness | #mcp-gating, #external-mcp, #name-keyed-policy, #measured-origin, #field-template
 **File:** templates/.claude/rules/mcp_tool_gating.md (+ auto_project_mapping.md §6 row 4, CLAUDE.md mount-intent trigger)
-Cloud session (Mode D, ext): new field template — external-MCP tool gating with three tiers (ask / ask-meta-write / allow-untrusted-read), name-keyed because server-supplied annotations are unreliable (measured same-day: live messaging-class MCP shipped all-None hints incl. irreversible send + approval-resolution tools — fh-be `signal_2026-06-11_hermes-mcp-cloud-boot.md`). Opus challenger caught the name-spoofing hole (server controls names too → behavior-confirmation required for non-ask tiers, fixed inline); sonnet blind sim PASS on unfilled-§3 scenario (per-item ask on send, batch approval-grant refused).
+Cloud session (Mode D, ext): new field template — external-MCP tool gating with three tiers (ask / ask-meta-write / allow-untrusted-read), name-keyed because server-supplied annotations are unreliable (measured same-day: live messaging-class MCP shipped all-None hints incl. irreversible send + approval-resolution tools — companion store `signal_2026-06-11_hermes-mcp-cloud-boot.md`). Opus challenger caught the name-spoofing hole (server controls names too → behavior-confirmation required for non-ask tiers, fixed inline); sonnet blind sim PASS on unfilled-§3 scenario (per-item ask on send, batch approval-grant refused).
 - Decision: prefer host-native per-tool permission config as enforcement; this template = what-to-gate + portable fallback. §6 install row is conditional (MCP present); the proactive mount-intent trigger is the load-bearing path.
 ### 2026-06-11 | forge-harness | #identity-marker, #door-skeleton, #target-tier-sim, #below-floor-consumer, #false-control-kill
@@ -72,7 +84,7 @@ Tier-floor resolution ships — the model dimension of the Sidecar Engine Resolu
 **File:** docs/OUTPUT_EVIDENCE.md (+ README.md §Model setup)
 Model-tier flattening measured and published: 30-point blind battery (rule-application + meta-dev fixtures, pre-registered rubric) on four Claude tiers — operation 100/100/97/94 (anchor/Opus 4.8/Sonnet 4.6/Haiku 4.5), tier separation only on above-rubric design increments (3/3·1/3·0.5/3·0/3). Public claim scoped honestly: single trial, self-graded, worked example not benchmark. README §Model setup gains the evidence note grounding the existing Opus recommendation.
 - Decision: operating FH ≈ model-flat (the harness is the score); developing FH is where tier matters — recommendation unchanged (opus for harness-editing/gates), now evidence-backed. **[superseded same-day by the tier-floor entry above: default flipped to sonnet + floored dispatch; opus pin now Mode-D-only]**
-- Open: real Qwen-class measurement on laptop (batteries are a portable fixture pack, fh-be record).
+- Open: real local-LLM-class measurement on laptop (batteries are a portable fixture pack, companion-store record).
 ### 2026-06-10 | forge-harness | #fc, #consent-lane, #federated-compounding, #starved-center, #v3
 **File:** tracks/_contrib/README.md (+ .gitignore, templates/contrib_session.md, docs/CONTRIBUTING.md, README.md)
@@ -158,7 +170,7 @@ Axis 5 check-class taxonomy added: every verify check classified as mandatory-pa
 **File:** knowledge/shared/harness-core/multi_model_sidecar_strategy.md (+ hybrid_orchestration_architecture_roadmap.md)
 Added canonical §Sidecar Engine Resolution Protocol — Tier1 subscription-CLI → Tier2 API-key → Tier3 Claude-subagent guaranteed fallback. Principle: discovery automatic/free, invocation value-gated (intelligent default multi-AI, no hard-fail for Mode C). Wired pointers into goal-quench Step D / steel-quench runtime-adapter / harvest-loop Step 3.5-X; sim-conductor/pipeline-conductor/agent-composer inherit by reference. Source hybrid-orchestration design archived as proposed roadmap (versions→placeholders, Python pseudo-code→illustrative, non-shipped tagged Proposed). PR #80.
 - Decision: single-source resolution protocol — skills cite it instead of re-inventing "if available" probes.
-- Open: npm republish (machine-bound) — 3 npm-shipped SKILL.md changed; handed off to laptop via fh-be.
+- Open: npm republish (machine-bound) — 3 npm-shipped SKILL.md changed; handed off to laptop via the companion store.
 ### 2026-06-09 | forge-harness | #onboarding, #greeting, #3-axis-scaffold, #returning-user
 **File:** knowledge/shared/harness-core/fh_detail_protocols.md

package/CHEATSHEET.md CHANGED Viewed

@@ -98,13 +98,14 @@ git config core.hooksPath templates/.git-hooks
 chmod +x templates/.git-hooks/pre-commit
 ```
-After running `/steel-quench` and `/phantom-quench` in your session, Claude creates the Axes 2+3 pass marker automatically. The marker must carry machine-readable floor fields — the hook validates them (a bare `touch` marker no longer passes; below-floor passes block unless an explicit `below-floor-ack:` line records operator acceptance, **quoting the operator's approval utterance verbatim** — an unquoted reason is rejected as agent-self-writable). If Claude doesn't create it (e.g., session interrupted), create it manually:
+After running `/steel-quench` and `/phantom-quench` in your session, Claude creates the Axes 2+3 pass marker automatically. The marker must carry machine-readable floor fields — the hook validates them (a bare `touch` marker no longer passes; below-floor passes block unless an explicit `below-floor-ack:` line records operator acceptance, **quoting the operator's approval utterance verbatim** — an unquoted reason is rejected as agent-self-writable). It also requires an **`axis2-evidence:`** line recording what the pass actually found (a finding count or verdict token — `PASS no-S` / `1S/4A fixed` / `clean — 0 findings`); a vacuous "it ran" line is rejected. *Honest scope: this enforces the marker is non-vacuous + auditable, not that the pass truly ran — a fabricated attestation is the weekly audit's + operator's residual, by design (judge-robustness swarm, 2026-06-13).* If Claude doesn't create it (e.g., session interrupted), create it manually:
 ```bash
 cat > "tracks/_meta/.axes_23_passed_$(git rev-parse --abbrev-ref HEAD | tr '/' '_')_$(date +%Y-%m-%d).marker" <<'EOF'
 axis2-engine: quench-challenger
 axis2-model: opus
 floor-status: at-floor
+axis2-evidence: PASS no-S
 <scope / findings prose>
 EOF
 ```

package/CLAUDE.md CHANGED Viewed

@@ -157,8 +157,11 @@ No user request is needed — this is a mandatory autonomous step, not a proposa
 FH asset modified → Axis 1 (regression_guard.sh --pr {BRANCH})
   → Axis 2 (/steel-quench) → Axis 3 (/phantom-quench)
   → marker: tracks/_meta/.axes_23_passed_{branch}_{date}.marker
-     (structured — required fields: axis2-engine / axis2-model / floor-status;
-      hook validates mechanically: below-floor blocks without below-floor-ack)
+     (structured — required fields: axis2-engine / axis2-model / floor-status / axis2-evidence;
+      hook validates mechanically: below-floor blocks without below-floor-ack, and
+      axis2-evidence must be non-vacuous — a recorded verdict/count, not "it ran". Honest
+      scope: form + non-vacuity + auditability, NOT provenance — a fabricated marker is the
+      weekly audit's + operator's residual by design, judge-robustness swarm 2026-06-13)
   → Axis 4 (/edit-manifest RECORD, today's entry in edit_manifest.yaml)
   → All 4 PASS → git commit allowed   |   Any FAIL → fix inline, re-run
 ```

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@chrono-meta/fh-gate",
-  "version": "1.4.20",
+  "version": "1.4.22",
   "description": "FH runtime adapters — run FH governance, skills, and agents via Claude or Codex with machine-parseable gates.",
   "license": "MIT",
   "keywords": [

package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md CHANGED Viewed

@@ -37,7 +37,8 @@ When unsure where to place a new asset or skill:
 1. Request full file path from user (or accept natural language description)
 2. Load asset content via `Read` (if path provided)
-3. Evaluate Step 1 4-criteria in order (LLM makes the judgment directly)
+2.5. Step 0.5 mechanical overlap pre-scan (grounds criterion ④ before the judged pass)
+3. Evaluate Step 1 4-criteria in order (LLM makes the judgment, ④ gated on the Step 0.5 scan)
 4. ① + ④ both pass + at least one of ②③ passes → output **"FH suitable"**
    Otherwise, proceed to Step 2 local assessment → if fails, output **"Project-local agent or no asset needed"**
@@ -64,7 +65,32 @@ Immediately after trigger, acquire asset content in the following order.
    > **Which asset should I evaluate?**
    > Enter a file path (e.g., `.claude/agents/jira-create.md`) or a description.
-After acquiring the asset content, **Claude directly** applies Step 1 4-criteria (no external calls).
+After acquiring the asset content, run Step 0.5 (mechanical overlap pre-scan) **before** the judged Step 1.
+## Step 0.5. Mechanical Overlap Pre-Scan (grounds criterion ④)
+Criterion ④ ("no overlap with existing FH skills") is otherwise an LLM **recall** judgment with no
+ground truth — a duplicate skill with a novel name passes because the judge has no enumerated list to
+check against (judge-robustness swarm, 2026-06-13). Ground it mechanically first:
+```bash
+# enumerate existing skill names + descriptions (grounds the judged comparison)
+grep -riE "name:|description:" plugins/fh-meta/skills/*/SKILL.md plugins/fh-commons/skills/*/SKILL.md
+# hard-collision check: WHOLE proposed name or a WHOLE trigger phrase reused verbatim.
+# grep -wF (whole-word, fixed-string) on the full strings — NOT -E on tokens (a shared common
+# word like "review" is not a collision). Exclude the asset's own file (self-match = false hit).
+SELF="plugins/fh-meta/skills/<proposed name>/SKILL.md"
+grep -rwF -e "<proposed full name>" -e "<full trigger phrase 1>" -e "<full trigger phrase 2>" \
+     plugins/*/skills/*/SKILL.md | grep -v "$SELF" | grep -c .
+```
+Surface **collision count + nearest existing skill(s)**. Criterion ④ then passes only if **0 whole-name/
+whole-trigger collision AND the judged ≤90%-overlap check agrees**. A verbatim whole-name or
+whole-trigger reuse is a hard ④ fail regardless of the LLM judgment. **Honest scope**: the grep grounds
+*literal* name/trigger reuse only — a post-cutoff duplicate with a *paraphrased* trigger is invisible to
+both the judge (cutoff) and the grep (literal); that residual leans on the judged half **fed the
+enumerated descriptions above** (grounded comparison, not pure memory), not on full mechanization. A
+shared common word is a judged-review flag, **not** a hard fail.
 ## Done When
@@ -83,7 +109,7 @@ All steps 0–3 completed
 | ① | Cross-project value | Is this asset equally useful in other projects without depending on a specific project? |
 | ② | Orchestration / judgment layer | Is it just a list of MCP/Bash calls, or a judgment layer that synthesizes multiple signals? |
 | ③ | Not replaceable by built-ins | Can this be equally achieved with direct MCP calls or basic bash? (If yes, fails this criterion) |
-| ④ | No overlap with existing FH skills | Does it not overlap 90%+ with existing FH skills? |
+| ④ | No overlap with existing FH skills | Step 0.5 mechanical scan = 0 name/trigger collision **AND** judged ≤90% overlap. Non-zero collision → hard fail. |
 **FH suitable** → ① + ④ both pass + at least one of ②③ passes.
 **Fail** → ① or ④ fails → immediate fail. Or both ②③ fail → proceed to Step 2.

package/plugins/fh-meta/skills/phantom-quench/SKILL.md CHANGED Viewed

@@ -123,14 +123,44 @@ Extract claims from the artifact that require source back-tracing. Claim types:
 Back-trace each claim to the declared source files using Read + Grep directly — no inference judgment. Partial match is not treated as match.
+**Mechanical anchor (GROUNDED is gated on a literal grep hit *in the right slot*, not a bare
+judgment)** — this closes *out-of-context grounding*: citing a real, readable file for a false claim,
+where the judge biases GROUNDED merely because the source *exists* and contains domain-adjacent text
+(measured 2026-06-13, judge-robustness swarm — the cheapest S-exploit of this skill). The anchor is
+**typed** — applying one byte-literal rule to every claim type both under-blocks (a token that hits an
+irrelevant line) and over-blocks (a correct value formatted differently); each type gets the right check:
+- **Proper noun / exact identifier** (skill name, file path, API name, flag): run
+  `grep -n "<exact token>" <cited file>` — must return a non-empty line, surfaced **literally**
+  (`file:line: matched text`), and that line must be where the identifier is *used as the claim
+  asserts* (a bare occurrence elsewhere is **not** grounding — e.g. claim "X is the default model" needs
+  the line that sets X as default, not any line mentioning X). No qualifying hit → Phantom.
+- **Numerical / range value**: grep the value, but **normalize format/unit before judging** —
+  `300s` ≡ `300 seconds`, `≥5` ≡ `>= 5` (ASCII/Unicode), `5 minutes` ≡ `300 seconds`. The *value* must
+  sit in the slot the claim asserts. A correct value in a different format = Grounded (note the
+  normalization in evidence); a **different** value in that slot = Partial (re-confirm) or Phantom.
+  Never auto-Phantom a format variant.
+- **Branching / multi-clause condition**: no single greppable token. Either **decompose to atomic
+  sub-conditions** and grep each identifier, or — if it stays a compound judgment — keep it
+  **judged-class with the adversarial pairing declared** (do not fake a single-token grep). State which.
+**Universal rule**: a grep hit counts only if the surfaced line *expresses the claimed relation*.
+"The token appears somewhere in a real file" is precisely the out-of-context trap, not evidence.
+A claim that is **cited to a specific source but the source cannot be Read** (path doesn't resolve,
+line beyond EOF) is **S-grade Phantom**, *not* the softer Source-Missing 🔴 — a citation that does not
+resolve means the citation was invented. Source-Missing 🔴 is reserved for *undeclared* sources only.
+Declared "sources" that are not resolvable file paths (e.g. `source: "the codebase"`) count as **0
+sources** for the Step 0.5 blocker.
 Back-tracing classification:
 | Classification | Criteria | Marking |
 |---|---|:---:|
-| **Grounded** | Claim directly confirmed in source | ✅ |
-| **Partial** | Similar content in source but not exact match — needs re-confirmation | ⚠️ |
-| **Phantom** | Cannot be found in source | ❌ |
-| **Source-Missing** | Source itself cannot be Read or was not declared | 🔴 |
+| **Grounded** | Typed anchor passes: identifier grep-hits *in the asserting slot* / value matches after normalization / branching sub-conditions trace (line surfaced) | ✅ |
+| **Partial** | A *different* value sits in the claimed slot, or a compound condition partially traces — needs re-confirmation | ⚠️ |
+| **Phantom** | Exact token not found in source, **or** cited to a named source that cannot be Read | ❌ |
+| **Source-Missing** | Source was **not declared** (undeclared only — a failed *cited* source is Phantom) | 🔴 |
 > **Detail**: See `SKILL_detail.md §Step2-Detail` — back-tracing execution procedure, classification decision rules, and Step 2 output format template — read when handling edge cases or formatting results.
@@ -232,7 +262,7 @@ This skill can be used independently without the full meta-harness structure.
 ```
 Step 1 claim extraction complete
-+ Step 2 all claims back-traced (using Read tool — no inference judgment)
++ Step 2 all claims back-traced (Read + Grep — highest-priority GROUNDED requires a typed literal grep hit in the asserting slot, not inference)
 + Step 3 Phantom severity classification + prescription output
 + Step 4 process pattern diagnosis complete (skip if 0 Phantoms)
 + "phantom-quench Complete" declaration output
@@ -244,7 +274,7 @@ Verdict: PASS (0 Phantom claims) | CONDITIONAL_PASS (LOW-severity Phantoms only,
 ## Operating Notes
-- **Never back-trace by inference**: Judging "this value is probably in the source" treats it as Partial not Phantom. Always directly confirm with Read + Grep.
+- **Never back-trace by inference**: Judging "this value is probably in the source" treats it as Partial not Phantom. Always directly confirm with Read + Grep. **GROUNDED on a highest-priority claim is gated on a literal grep hit of the exact token (Step 2 mechanical anchor) — "the file exists and looks right" is the out-of-context-grounding trap, not evidence.**
 - **Partial is not Grounded**: Processing similar-value-in-source as Grounded misses the reconstruction modification pattern.
 - **Source not declared itself is S-grade**: If source is not declared when making an artifact, no claim can subsequently be verified. Recommend mandating source declaration in the process design stage.
 - **Recommended to use with steel-quench**: steel-quench quenches structural flaws, phantom-quench ensures source consistency. The two skills are orthogonal and artifact quality assurance is strengthened when used together.

package/plugins/fh-meta/skills/sim-conductor/SKILL.md CHANGED Viewed

@@ -78,7 +78,7 @@ Read target artifact(s) → classify on 5 dimensions → output recommendation
 | `artifact_type` | SKILL.md / design-doc → Area B + D-skill↑ · README / CHEATSHEET → Area A↑ · code / config → Area D-code↑ |
 | `audience` | external installer / first-time user → beginner↑ · internal team only → challenger↑ |
 | `claim_density` | 3+ stated benefits or superlatives → challenger↑ |
-| `risk_level` | external publish / marketplace listing → steel-quench prerequisite triggered |
+| `risk_level` | external publish / marketplace listing → steel-quench prerequisite triggered. **Mechanical floor (not judge-only)**: any of — publish/marketplace target · public-surface or visibility change · auth/secret-handling or executable code · an FH asset under the 4-axis gate — **forces `risk_level ≥ medium`** regardless of profiler judgment. The floor closes the "fool the profiler into `low` to skip Step 0.6" seam; the judge may only raise above the floor, never below it. |
 | `novelty` | first-of-its-kind / no prior session evidence → phantom-quench recommended |
 ```
@@ -149,6 +149,48 @@ Concern format: `"One thing to check before [Area X]: [concern]. Proceed?"`
 ---
+## Step 0.6 — Cross-Model Coverage Gate (risk≥medium — hard)
+Closes the homogeneous-blind-spot + formatting-flattery vector (judge-robustness swarm, 2026-06-13):
+a panel of same-session Claude sub-agents shares one model's blind spots, so a clean verdict can be
+flattery the **whole panel** is blind to — and with no quotable-rule violation, no persona escalates.
+For `risk_level ≥ medium` targets (from Step 0.3), at least one persona MUST come from a source
+**outside the orchestrator's own session context**. This **promotes the former advisory "dual
+validation principle"** (detail §AreaB-Baseline #4) to a hard gate — the mechanical-anchor pattern of
+hardening #1–#5: a judged verdict binds to a fact the judge cannot fake.
+**Graceful-degradation ladder** — take the highest available rung. The gate **never breaks
+sim-conductor**; at the bottom it only withdraws the *unsafe autonomy* (self-certifying a blind verdict),
+not the run:
+| Rung | Source | `cross_model_coverage` | Closes | When |
+|---|---|:---:|---|---|
+| 1 | External CLI team (Multi-Team Mode — §MultiTeam) | `external` | model-level blind spot (genuine cross-model) | 1+ external CLI live + probe non-empty |
+| 2 | Cross-session Claude — `claude -p` headless, or an Agent with **zero inherited context** | `cross-session` | **session-contamination only** — a fresh Claude shares the same weights/RLHF gradient, so it does **not** close the model-level blind spot; it only removes the orchestrator's working-memory bias. Honest partial mitigation, labeled as such | no external CLI; dispatch probe returns non-empty |
+| 3 | Same-session sub-agents only | `NONE` | nothing — homogeneous panel | neither rung's probe succeeded |
+**Mechanical anchor** — `cross_model_coverage` is valid **only if backed by a quoted dispatch artifact**,
+not a self-assessment (the self-signing hole hardening #1 closed for the marker — same fix here). To
+record `external` or `cross-session`, the Step 3 report must **quote a non-empty excerpt of the actual
+dispatch output** (external CLI stdout, or the dispatched Agent's returned verdict text); a label with no
+quoted excerpt is invalid and falls to `NONE`. **Liveness, not mere availability**: probe the rung before
+claiming it — attempt the dispatch with a timeout; if it errors or returns empty (plan-gate closed,
+context saturated, CLI present-but-dead), record `NONE`, never assume the rung succeeded. This is the
+same honest scope as #1: the artifact makes the claim **auditable**, not cryptographically unforgeable —
+a fabricated excerpt remains the operator's + weekly-audit's residual by design. On rung 3 (`NONE`) for a
+risk≥medium target:
+- the report flags `⚠️ cross-model coverage: NONE — homogeneous same-session panel; verdict provisional`, **and**
+- **Step 4 auto-commit privilege is withdrawn** (see Step 4): M-tier fixes may be *prepared* but the
+  commit waits for the operator's explicit go. Auto-committing a structurally self-blind verdict is
+  exactly what the exploit targeted — so that single privilege is what degrades, not the simulation.
+`risk_level = low` targets are exempt (a homogeneous panel is acceptable); the gate fires only at
+medium+. The rung-2 fallback is what makes this CC-only-safe: a Claude-only environment still gets a
+real cross-context read (a fresh isolated dispatch shares no working memory with the orchestrator), so
+`NONE` is reached only when *both* external CLIs and a second Claude context are unavailable.
+---
 ## Step 1 — Area-Specific Simulation
 ### Area A — External User Perspective
@@ -315,6 +357,29 @@ positives erode reviewer trust.
 - No forced consensus or forced conflict — report Common opinions (2+ personas agree) and Conflicts
   (position A vs B, each with rationale) as-is. Feeds Step 2 M/S/R triage (M ← Critical or 2+ personas).
+**Zero-coverage map (mandatory synthesizer output)** — the synthesizer must emit, mechanically, the set
+of standpoints that produced **no** finding, not only the ones that did (judge-robustness swarm,
+2026-06-13). Enumerate the persona **standpoints** in play — those the Step 0.3 profile recommended,
+**plus the standpoints its dimensions imply** (risk_level=high → a security/publish standpoint ·
+audience=mixed → a non-native-reader standpoint · claim_density=high → a claim-evidence standpoint).
+List the *standpoints*, not the bare dimension names (a `risk_level (low) → ZERO` row is noise that
+trains operators to ignore ⚠️). Mark each `covered` (≥1 persona addressed it) or `ZERO` (no persona
+touched it):
+```
+Coverage map:
+  beginner (onboarding friction)      → covered (A-1: 2 findings)
+  challenger (claim-evidence)         → covered (A-3: 1 finding)
+  security surface (risk_level=high)  → ZERO ⚠️
+  non-native reader (audience=mixed)  → ZERO ⚠️
+```
+A clean report with `ZERO` standpoints is **not** a pass — it is an uncovered surface, reported as such.
+This converts the formatting-flattery failure (everything reads fine → nothing escalates) into a visible
+gap: **silence on a standpoint is reported as `ZERO`, never inferred as approval.** It is a checklist
+derived from the profile and the dispatch outputs — a mechanical anchor, not a judgment. Carry the map
+verbatim into the Step 3 report.
 The two severity vocabularies are layered, not redundant: a persona running **in isolation** assigns only
 its own Critical/Important/Suggestion — it cannot assign M/S/R, since `S = found by 3+ personas` depends on
 cross-persona agreement the isolated persona never sees. The synthesizer is the only context that can triage
@@ -365,6 +430,13 @@ File: `$REPORT_DIR/sim_YYYY_MM_DD_area_[X].md`
 1+ M-tier → fix immediately → commit. PR creation requires explicit user request.
 0 M-tier → commit report only + report S/R backlog.
+**Cross-model gate on auto-commit (risk≥medium)** — when Step 0.6 recorded `cross_model_coverage: NONE`
+on a risk≥medium target, the auto-commit privilege is **withdrawn**: prepare the M-tier fixes and write
+the report, but do **not** self-commit — surface *"cross-model coverage NONE on a risk≥medium target;
+the verdict is from a homogeneous same-session panel. Commit the fixes, or add a cross-context read
+first?"* and wait for the operator's go. `external`/`cross-session` coverage, or risk_level=low, commits
+as normal. (This withdraws one privilege, not the run — the report and fixes still exist.)
 > **Detail**: See `SKILL_detail.md §PR-Bash` — branch creation bash, commit + push, gh pr create template — read when creating a PR.
 ---
@@ -390,8 +462,10 @@ Convergence within an AI-AI loop is **provisional**. Elevated to final only afte
 | 1+ M-tier → fixed + committed (or "none") | ✅ Prescription complete |
 | Report `tracks/_meta/sim_YYYY_MM_DD_*.md` saved | ✅ Persistence complete |
 | 0 M-tier → report committed + S/R backlog reported | ✅ Health check complete |
+| risk≥medium → `cross_model_coverage` recorded in report (external/cross-session/NONE) | ✅ Coverage gate ran *(check class: measured — the recorded value reflects the dispatch path that ran, not a self-grade; pair: NONE withdraws auto-commit per Step 4)* |
+| Synthesizer emitted the zero-coverage map (every profile standpoint marked covered/ZERO) | ✅ Blind-spot surface reported *(check class: mechanical — a checklist over the profile, not a judgment)* |
-Verdicts: PASS · CONDITIONAL_PASS (S/R only, or Area B cadence skip) · FAIL (M-tier unresolved) · ESCALATE (persona conflict requiring human judgment).
+Verdicts: PASS · CONDITIONAL_PASS (S/R only, or Area B cadence skip) · FAIL (M-tier unresolved) · ESCALATE (persona conflict requiring human judgment, **or** `cross_model_coverage: NONE` on risk≥medium → auto-commit withdrawn pending operator go).
 **Mandatory for Area A (external publish)**: steel-quench must complete in same session before Area A is marked complete.

package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md CHANGED Viewed

@@ -214,7 +214,7 @@ Structural methods to reduce self-reference risk in Area B:
 1. **Regular adversarial attacks**: Area B once/month + `challenger` attack once/quarter. Route challenger → defense results directly into SKILL.md via steel-quench handoff after Area B ends.
 2. **Direct external user validation**: Non-owner attempts install + invocation → collect reactions. (cascade β validated: first autonomous external run confirmed.)
 3. **steel-quench integration**: After Area B ends, hand off challenger findings to `/steel-quench` for deeper adversarial review + SKILL.md inscription.
-4. **Dual validation principle**: Internal validation (Area B) alone is insufficient — minimized only when combined with external install reaction collection or cross-model validation.
+4. **Dual validation principle**: Internal validation (Area B) alone is insufficient — minimized only when combined with external install reaction collection or cross-model validation. **For risk≥medium targets this is no longer advisory: it is the hard Cross-Model Coverage Gate** (SKILL.md Step 0.6) — at least one persona from outside the orchestrator's session context (external CLI → cross-session Claude → else `cross_model_coverage: NONE` withdraws auto-commit). Promoted from advisory to gate by the judge-robustness swarm (2026-06-13): a homogeneous same-session panel shares blind spots, so its clean verdict cannot self-certify.
 **Dispatch template for Area B parallel**:
 ```
@@ -291,6 +291,7 @@ type: simulation-report
 date: YYYY-MM-DD
 areas: [A|B|C|D|E|all]
 target_profile: [artifact_type | audience | risk_level]
+cross_model_coverage: [external | cross-session | NONE | n/a-low-risk]   # Step 0.6 — recorded from dispatch path
 m_count: N
 s_count: N
 r_count: N
@@ -299,6 +300,10 @@ r_count: N
 ## Target Profile
 artifact_type: [type] · audience: [internal|external|mixed] · risk_level: [low|medium|high]
+## Coverage map (Step 1.5 — every profile standpoint marked covered/ZERO)
+[verbatim zero-coverage map; ⚠️ on each ZERO]
+cross_model_coverage: [external | cross-session | NONE]   # NONE on risk≥medium → auto-commit withdrawn (Step 4)
 ## M-tier ([N] items)
 | # | Issue | Location | Prescription |
 ...

package/plugins/fh-meta/skills/steel-quench/SKILL.md CHANGED Viewed

@@ -164,8 +164,30 @@ No gate-PASS in scope → skip Wave-P3 entirely.
 > dimensions the gate's own pass criteria structurally could not check. Only when all three Attack Failed can
 > a **"Real PASS"** be declared.
+**PASS-framing redaction (mandatory pre-step)** — the artifact reaches Wave-P3 *carrying its own
+PASS declaration* (a `✅`, a "Verified" header, the gate marker), and a re-judge that reads that
+framing is biased toward "Attack Failed" — the exact bias Wave-P3 exists to defeat (judge-robustness
+swarm, 2026-06-13). Before feeding the artifact to the dimensions/challenger, **strip the pass-framing
+at the bash layer, not the AI layer**:
+```bash
+# Strip framing glyphs + canonical FH verdict PHRASES only — never the bare word PASS
+# (a global s/PASS//g corrupts substance: "status==PASS" → "status==", manufacturing findings).
+P3R=$(mktemp)
+sed -E -e 's/[✅🟩✓]//g' \
+       -e 's/(ALL AXES PASSED|FH_GATE_VERDICT:[[:space:]]*PASS|Real PASS|VERIFIED|[Gg]ate[^.]{0,24}declared PASS)//g' \
+       "{ARTIFACT}" > "$P3R"
+# feed "$P3R" to Wave-P3; clean up after: rm -f "$P3R"
+```
+Feed `$P3R` to Wave-P3, then `rm -f "$P3R"`. The redaction is mechanical, so it cannot itself be
+placated. **Honest scope**: it strips framing glyphs and *canonical FH verdict phrases* — bare-word or
+lowercase prose self-congratulation ("this passed review", "looks green") is out of scope for `sed`
+and is covered by a prose backstop instead: the Wave-P3 persona is instructed to **disregard any
+residual self-pass language in the artifact** and attack as if no verdict were stated.
 **Agent utilization**:
-- `fh-commons:quench-challenger` (optional) — adds 6-axis structural attack to each dimension. If absent, run the 3 dimensions directly.
+- `fh-commons:quench-challenger` (optional) — adds 6-axis structural attack to each dimension, fed the **redacted** artifact. If absent, run the 3 dimensions directly on the redacted copy.
 - `fh-meta:persona-innovator` (after convergence) — error/gap patterns found during Wave-P3 → auto-propose new Cross-Project Pattern rows or skill-candidate signals.
 The three dimensions generalize the gate's three blind spots:

package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md CHANGED Viewed

@@ -10,6 +10,7 @@ complexity_routing:
   escalate_when:
     - full_revalidation
     - high_stakes
+    - fail_verdict   # AI recommendation was wrong → baseline overwrite is high-stakes, never stay at sonnet
 ---
 # verify-bidirectional — Bidirectional Self-Validation Automation
@@ -50,6 +51,28 @@ Treat user's statement as **external refinement material**. **Do NOT attempt to
 Core proposition: "refinement challenge ≠ fundamental negation". Priority is identifying where the initial recommendation is weakened.
+**Evidence gate (overwrite ≠ soften)** — closes the sycophancy/steering vector where a bare assertion
+("that's wrong, re-examine") flips a baseline with zero evidence (judge-robustness swarm, 2026-06-13).
+"Do NOT defend" still holds for *this conversation's proposition* (anti-stubbornness is the point), but a
+**persistent-baseline overwrite** (a rule, asset, memory, or `knowledge/` claim — anything that outlives
+this session) requires a **supporting basis**, not mere pushback:
+- **(a)** the user cited a file / line / commit / URL / past decision **and the cited content actually
+  supports the challenge** — verified by *reading it*, not by its mere existence (an irrelevant-but-real
+  citation is the out-of-context-grounding trap, the same vector phantom-quench guards), **or**
+- **(b)** the Step 2 grep returns an actual contradiction that *supports the challenge* (not just any
+  conflict with the original) — surfaced literally.
+If a baseline overwrite is implied but **neither holds**, do not silently rewrite: verdict is **ESCALATE**
+— surface *"this would overwrite baseline {X} with no cited evidence; confirm override, or provide a
+source?"* and block the Step 4 cascade until the operator answers. Softening a local in-conversation
+proposition (no persistent asset changed) proceeds as before — the gate fires only on persistent-baseline
+writes. **Sequencing**: this gate is written in Step 1, but Step 4 is what enumerates affected persistent
+assets — so if Step 4 later identifies *any* persistent asset to write, **re-apply this gate before Step
+4.5** even if the original challenge first looked like a mere soften. This is **not** restored AI defensiveness: the AI still does not argue the user is wrong; it only
+declines to *fabricate a baseline change* the evidence does not support (mirror of the steel-quench/phantom
+mechanical-anchor principle — judged verdicts bind to evidence).
 ### Step 2. Consistency Area Grep (3-step mandatory)
 Grep to find which rules, assets, or propositions conflict with the initial recommendation: