@chrono-meta/fh-gate 1.4.20 → 1.4.22
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +10 -1
- package/CATALOG.md +18 -6
- package/CHEATSHEET.md +2 -1
- package/CLAUDE.md +5 -2
- package/package.json +1 -1
- package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md +29 -3
- package/plugins/fh-meta/skills/phantom-quench/SKILL.md +36 -6
- package/plugins/fh-meta/skills/sim-conductor/SKILL.md +76 -2
- package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +6 -1
- package/plugins/fh-meta/skills/steel-quench/SKILL.md +23 -1
- package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md +23 -0
package/AGENTS.md
CHANGED
|
@@ -59,7 +59,16 @@ Agents in this registry belong to the **Automation layer**. Skills (in `plugins/
|
|
|
59
59
|
|
|
60
60
|
> **Codex-compatible beta**: The Methodology layer (`tracks/`, `knowledge/`, skill documentation) is designated Codex-compatible beta. Gemini, Codex, and other AI users can apply FH methodology without the Automation layer — manual invocation replaces hook/agent dispatch.
|
|
61
61
|
|
|
62
|
-
> **
|
|
62
|
+
> **Directory → destination routing (where your outputs belong)**: not everything in the methodology layer is public-shareable. `knowledge/` and `SKILL.md` docs are the **public, reusable** methodology. `tracks/` is **local / private by convention** — work history, session records, `fh_signal_*`, audit logs — and is gitignored on the public mirror. An AI working in a local workspace that pairs the public mirror with a private companion store (the `*-be` pattern) must **not** infer "same folder ⇒ same repository"; route by content type:
|
|
63
|
+
>
|
|
64
|
+
> | Content | Default destination |
|
|
65
|
+
> |---|---|
|
|
66
|
+
> | reusable methodology · docs · skills · public guidance · polished external-facing conclusions | public mirror (`knowledge/`, `plugins/`, `docs/`) |
|
|
67
|
+
> | raw signal · operator observation · private validation · handoff · paper draft · PR-background reasoning log | private companion store (`*-be` pattern) — or keep local; do **not** commit to the public mirror |
|
|
68
|
+
>
|
|
69
|
+
> When unsure, treat raw / observational / operator-specific material as **private-first** and promote only the polished result to public. (Concrete per-operator bindings — exact companion-store path, sync mechanism — live in the operator's local config, not here.)
|
|
70
|
+
|
|
71
|
+
> **Multi-model sidecar (validated)**: Any FH user can delegate to other models via sidecar — Gemini CLI, OpenAI/Codex CLI, or Copilot CLI's model catalog — invoked with `Bash` from within the Claude Code session. FH is the orchestrating harness; the sidecar is a routing/access layer (not a second harness — different layer entirely). Validated empirically: `echo "prompt" | gemini` works inside a CC session and produces usable output. Sidecar calls are Bash invocations, not agent dispatches — they bypass this registry and are coordinated inline by the skill. Capability routing matters too: Gemini/Antigravity is the natural multimodal sidecar, while a Codex app/runtime session with Browser/Chrome connectors is the preferred handoff for live web-flow automation. In a local FH workspace that pairs the public methodology mirror with a private companion store (the `*-be` pattern), route by workspace capability while preserving each repository's ownership boundary. See `knowledge/shared/harness-core/multi_model_sidecar_strategy.md` for the full pattern.
|
|
63
72
|
|
|
64
73
|
---
|
|
65
74
|
|
package/CATALOG.md
CHANGED
|
@@ -8,6 +8,18 @@ AI reads this file first when searching past work. Open individual files for det
|
|
|
8
8
|
|
|
9
9
|
<!-- Add entries in reverse date order (newest at top) -->
|
|
10
10
|
|
|
11
|
+
### 2026-06-13 | forge-harness | #judge-robustness, #mechanical-anchor, #hardening-batch-2, #sycophancy-gate, #verification
|
|
12
|
+
**File:** plugins/fh-meta/skills/{verify-bidirectional,steel-quench,asset-placement-gate}/SKILL.md (commit f80bc99)
|
|
13
|
+
Batch-2 of the judge-robustness hardening (after #1-#2 in be2d5dc): three more judge-only verdict paths bound to anchors. #3 verify-bidirectional evidence gate — a persistent-baseline overwrite needs a supporting cited source (read, not existence) or a grep contradiction, else ESCALATE+block; closes the bare-pushback sycophancy vector without restoring AI stubbornness. #4 steel-quench Wave-P3 PASS-framing redaction (mktemp glyph+verdict-phrase strip — the challenger caught a naive bare-PASS global corrupting "status==PASS", an S fixed pre-commit). #5 asset-placement Step 0.5 mechanical pre-grep grounds criterion ④. challenger-verify round 2 was load-bearing again (FAIL→fixed: 1S+2A+4B); sonnet blind sim PASS (evidence gate ESCALATEs on bare overwrite).
|
|
14
|
+
- Decision: #6 sim-conductor (A) deferred — its cross-model hard-gate needs graceful degradation for CC-only environments; not rushed into the batch.
|
|
15
|
+
- Open: #6 sim-conductor staged; npm republish (1.4.21) bundling be2d5dc + f80bc99 pending operator.
|
|
16
|
+
|
|
17
|
+
### 2026-06-13 | forge-harness | #judge-robustness, #mechanical-anchor, #verification-hardening, #self-audit, #scaled-dispatch
|
|
18
|
+
**File:** plugins/fh-meta/skills/phantom-quench/SKILL.md + templates/.git-hooks/pre-commit + CLAUDE.md/CHEATSHEET (commit be2d5dc)
|
|
19
|
+
Deep-research + a 6-agent parallel swarm audit turned FH's own adversarial method on FH: armed with arXiv 2507.08794 ("One Token to Fool LLM-as-a-Judge"), cold agents found 5/6 of FH's judged-check skills S-exploitable. Common root cause: every terminal verdict is judge-only with no mechanical checksum. Shipped the two highest-leverage fixes — (#1) the pre-commit gate marker now requires a non-vacuous, auditable axis2-evidence field, with the residual (provenance against a self-deceiving runner) documented honestly as the weekly-audit's+operator's, not pretended-closed by a false-security HMAC; (#2) phantom-quench GROUNDED is now gated on a typed mechanical anchor (proper-noun grep-in-asserting-slot / numeric value-after-normalization / branching decompose-or-declare-judged; universal rule: a hit counts only if the line expresses the claimed relation), closing out-of-context grounding. challenger-verify caught a real over-block regression (format-variant false-flag) before it shipped; sonnet blind sim PASS.
|
|
20
|
+
- Decision: bind judged verdicts to mechanical anchors where one exists (verifiability constraint); ship #1-#2 first, stage #3-#6 (verify-bidirectional evidence gate, steel-quench redaction, asset-placement grep, sim-conductor cross-model) — rushing all six would be the over-verification the same research warned against.
|
|
21
|
+
- Open: hardenings #3-#6 staged; scaled-dispatch meta-lesson = value is real-target coverage (~1 reviewer per asset), not agent count.
|
|
22
|
+
|
|
11
23
|
### 2026-06-13 | forge-harness | #deep-research, #capability-ladder, #no-reinvention, #routing, #goal-quench-max
|
|
12
24
|
**File:** knowledge/shared/harness-core/deep_research_capability_ladder.md (+ CLAUDE.md initiative row, goal-quench/frontier-digest SKILL.md) (commit 55fa3da)
|
|
13
25
|
Deep-research as an FH default — lifts the /deep-research engine ladder that was locked inside frontier-digest into a general routing default. 3 rungs: built-in /deep-research if present → Claude WebSearch+WebFetch synthesis (always available, tier-sensitive) → frontier-digest for AI/harness trend-scan only. No-reinvention: FH routes to the best capability present, builds no research engine. Wired as a CLAUDE.md Autonomous Initiative Layer row (default invocation) + goal-quench max-mode capability-gap fill (flexes in when budget RED), with rung-2 research run in an isolated sub-agent to preserve max's context budget. 4-axis: challenger PASS no-S (4B applied incl. the isolation invariant) + sonnet blind sim PASS (correct rung, trend-scan boundary held).
|
|
@@ -19,13 +31,13 @@ Completion sweep of all currently-unblocked carries. FP3: agy joins the sidecar
|
|
|
19
31
|
- Decision: probe must exercise the same invocation form the dispatch uses (forms diverge on one binary — agy proved it); hook enforces ack form, weekly audit owns genuineness.
|
|
20
32
|
|
|
21
33
|
### 2026-06-11 | forge-harness | #readme-dedup, #commoditization-defense, #b1-boundary, #seed-vetting, #field-routing
|
|
22
|
-
**File:** README.md (PR #93) +
|
|
23
|
-
Cloud session #4 (Mode D, remote-doable batch): README stale duplicate "Measured, not asserted" block removed (pre-tier-floor copy contradicted default-Sonnet stance) + "Where this sits (2026)" positioning para (gate+loop are the asset — plumbing commoditizes).
|
|
24
|
-
- Decision: README action PARTIAL (merge done; About refresh + npm 1.4.14 laptop-bound); full-TC locked on
|
|
34
|
+
**File:** README.md (PR #93) + companion-store signals/handoffs (PR #24+) + field project v0.14 (private track)
|
|
35
|
+
Cloud session #4 (Mode D, remote-doable batch): README stale duplicate "Measured, not asserted" block removed (pre-tier-floor copy contradicted default-Sonnet stance) + "Where this sits (2026)" positioning para (gate+loop are the asset — plumbing commoditizes). Companion store: B1 scope boundary (AW2 simple-vs-complex import), SC1–4 seed vetting (SC1 phantom arXiv fixed → 2509.19349; SC3 commoditization threat to bet ID), gstack positioning-triangle line. Field: private-track companion handoff verified+executed — DefectPatternMatcher+Bug Mode implemented UTF-8-clean in the field repo (25 tests PASS, 0 regressions); its OpenCode(sLLM) lane's downgrade triage hardened to 2-tier (keep/block) with free-tier 80/20 split — β/SC2 field case logged.
|
|
36
|
+
- Decision: README action PARTIAL (merge done; About refresh + npm 1.4.14 laptop-bound); full-TC locked on local-LLM build per operator's CaseCraft limit measurement — structure-transform survives (P6/P6.5 deterministic), meaning-fill routes to frontier.
|
|
25
37
|
|
|
26
38
|
### 2026-06-11 | forge-harness | #mcp-gating, #external-mcp, #name-keyed-policy, #measured-origin, #field-template
|
|
27
39
|
**File:** templates/.claude/rules/mcp_tool_gating.md (+ auto_project_mapping.md §6 row 4, CLAUDE.md mount-intent trigger)
|
|
28
|
-
Cloud session (Mode D, ext): new field template — external-MCP tool gating with three tiers (ask / ask-meta-write / allow-untrusted-read), name-keyed because server-supplied annotations are unreliable (measured same-day: live messaging-class MCP shipped all-None hints incl. irreversible send + approval-resolution tools —
|
|
40
|
+
Cloud session (Mode D, ext): new field template — external-MCP tool gating with three tiers (ask / ask-meta-write / allow-untrusted-read), name-keyed because server-supplied annotations are unreliable (measured same-day: live messaging-class MCP shipped all-None hints incl. irreversible send + approval-resolution tools — companion store `signal_2026-06-11_hermes-mcp-cloud-boot.md`). Opus challenger caught the name-spoofing hole (server controls names too → behavior-confirmation required for non-ask tiers, fixed inline); sonnet blind sim PASS on unfilled-§3 scenario (per-item ask on send, batch approval-grant refused).
|
|
29
41
|
- Decision: prefer host-native per-tool permission config as enforcement; this template = what-to-gate + portable fallback. §6 install row is conditional (MCP present); the proactive mount-intent trigger is the load-bearing path.
|
|
30
42
|
|
|
31
43
|
### 2026-06-11 | forge-harness | #identity-marker, #door-skeleton, #target-tier-sim, #below-floor-consumer, #false-control-kill
|
|
@@ -72,7 +84,7 @@ Tier-floor resolution ships — the model dimension of the Sidecar Engine Resolu
|
|
|
72
84
|
**File:** docs/OUTPUT_EVIDENCE.md (+ README.md §Model setup)
|
|
73
85
|
Model-tier flattening measured and published: 30-point blind battery (rule-application + meta-dev fixtures, pre-registered rubric) on four Claude tiers — operation 100/100/97/94 (anchor/Opus 4.8/Sonnet 4.6/Haiku 4.5), tier separation only on above-rubric design increments (3/3·1/3·0.5/3·0/3). Public claim scoped honestly: single trial, self-graded, worked example not benchmark. README §Model setup gains the evidence note grounding the existing Opus recommendation.
|
|
74
86
|
- Decision: operating FH ≈ model-flat (the harness is the score); developing FH is where tier matters — recommendation unchanged (opus for harness-editing/gates), now evidence-backed. **[superseded same-day by the tier-floor entry above: default flipped to sonnet + floored dispatch; opus pin now Mode-D-only]**
|
|
75
|
-
- Open: real
|
|
87
|
+
- Open: real local-LLM-class measurement on laptop (batteries are a portable fixture pack, companion-store record).
|
|
76
88
|
|
|
77
89
|
### 2026-06-10 | forge-harness | #fc, #consent-lane, #federated-compounding, #starved-center, #v3
|
|
78
90
|
**File:** tracks/_contrib/README.md (+ .gitignore, templates/contrib_session.md, docs/CONTRIBUTING.md, README.md)
|
|
@@ -158,7 +170,7 @@ Axis 5 check-class taxonomy added: every verify check classified as mandatory-pa
|
|
|
158
170
|
**File:** knowledge/shared/harness-core/multi_model_sidecar_strategy.md (+ hybrid_orchestration_architecture_roadmap.md)
|
|
159
171
|
Added canonical §Sidecar Engine Resolution Protocol — Tier1 subscription-CLI → Tier2 API-key → Tier3 Claude-subagent guaranteed fallback. Principle: discovery automatic/free, invocation value-gated (intelligent default multi-AI, no hard-fail for Mode C). Wired pointers into goal-quench Step D / steel-quench runtime-adapter / harvest-loop Step 3.5-X; sim-conductor/pipeline-conductor/agent-composer inherit by reference. Source hybrid-orchestration design archived as proposed roadmap (versions→placeholders, Python pseudo-code→illustrative, non-shipped tagged Proposed). PR #80.
|
|
160
172
|
- Decision: single-source resolution protocol — skills cite it instead of re-inventing "if available" probes.
|
|
161
|
-
- Open: npm republish (machine-bound) — 3 npm-shipped SKILL.md changed; handed off to laptop via
|
|
173
|
+
- Open: npm republish (machine-bound) — 3 npm-shipped SKILL.md changed; handed off to laptop via the companion store.
|
|
162
174
|
|
|
163
175
|
### 2026-06-09 | forge-harness | #onboarding, #greeting, #3-axis-scaffold, #returning-user
|
|
164
176
|
**File:** knowledge/shared/harness-core/fh_detail_protocols.md
|
package/CHEATSHEET.md
CHANGED
|
@@ -98,13 +98,14 @@ git config core.hooksPath templates/.git-hooks
|
|
|
98
98
|
chmod +x templates/.git-hooks/pre-commit
|
|
99
99
|
```
|
|
100
100
|
|
|
101
|
-
After running `/steel-quench` and `/phantom-quench` in your session, Claude creates the Axes 2+3 pass marker automatically. The marker must carry machine-readable floor fields — the hook validates them (a bare `touch` marker no longer passes; below-floor passes block unless an explicit `below-floor-ack:` line records operator acceptance, **quoting the operator's approval utterance verbatim** — an unquoted reason is rejected as agent-self-writable). If Claude doesn't create it (e.g., session interrupted), create it manually:
|
|
101
|
+
After running `/steel-quench` and `/phantom-quench` in your session, Claude creates the Axes 2+3 pass marker automatically. The marker must carry machine-readable floor fields — the hook validates them (a bare `touch` marker no longer passes; below-floor passes block unless an explicit `below-floor-ack:` line records operator acceptance, **quoting the operator's approval utterance verbatim** — an unquoted reason is rejected as agent-self-writable). It also requires an **`axis2-evidence:`** line recording what the pass actually found (a finding count or verdict token — `PASS no-S` / `1S/4A fixed` / `clean — 0 findings`); a vacuous "it ran" line is rejected. *Honest scope: this enforces the marker is non-vacuous + auditable, not that the pass truly ran — a fabricated attestation is the weekly audit's + operator's residual, by design (judge-robustness swarm, 2026-06-13).* If Claude doesn't create it (e.g., session interrupted), create it manually:
|
|
102
102
|
|
|
103
103
|
```bash
|
|
104
104
|
cat > "tracks/_meta/.axes_23_passed_$(git rev-parse --abbrev-ref HEAD | tr '/' '_')_$(date +%Y-%m-%d).marker" <<'EOF'
|
|
105
105
|
axis2-engine: quench-challenger
|
|
106
106
|
axis2-model: opus
|
|
107
107
|
floor-status: at-floor
|
|
108
|
+
axis2-evidence: PASS no-S
|
|
108
109
|
<scope / findings prose>
|
|
109
110
|
EOF
|
|
110
111
|
```
|
package/CLAUDE.md
CHANGED
|
@@ -157,8 +157,11 @@ No user request is needed — this is a mandatory autonomous step, not a proposa
|
|
|
157
157
|
FH asset modified → Axis 1 (regression_guard.sh --pr {BRANCH})
|
|
158
158
|
→ Axis 2 (/steel-quench) → Axis 3 (/phantom-quench)
|
|
159
159
|
→ marker: tracks/_meta/.axes_23_passed_{branch}_{date}.marker
|
|
160
|
-
(structured — required fields: axis2-engine / axis2-model / floor-status;
|
|
161
|
-
hook validates mechanically: below-floor blocks without below-floor-ack
|
|
160
|
+
(structured — required fields: axis2-engine / axis2-model / floor-status / axis2-evidence;
|
|
161
|
+
hook validates mechanically: below-floor blocks without below-floor-ack, and
|
|
162
|
+
axis2-evidence must be non-vacuous — a recorded verdict/count, not "it ran". Honest
|
|
163
|
+
scope: form + non-vacuity + auditability, NOT provenance — a fabricated marker is the
|
|
164
|
+
weekly audit's + operator's residual by design, judge-robustness swarm 2026-06-13)
|
|
162
165
|
→ Axis 4 (/edit-manifest RECORD, today's entry in edit_manifest.yaml)
|
|
163
166
|
→ All 4 PASS → git commit allowed | Any FAIL → fix inline, re-run
|
|
164
167
|
```
|
package/package.json
CHANGED
|
@@ -37,7 +37,8 @@ When unsure where to place a new asset or skill:
|
|
|
37
37
|
|
|
38
38
|
1. Request full file path from user (or accept natural language description)
|
|
39
39
|
2. Load asset content via `Read` (if path provided)
|
|
40
|
-
|
|
40
|
+
2.5. Step 0.5 mechanical overlap pre-scan (grounds criterion ④ before the judged pass)
|
|
41
|
+
3. Evaluate Step 1 4-criteria in order (LLM makes the judgment, ④ gated on the Step 0.5 scan)
|
|
41
42
|
4. ① + ④ both pass + at least one of ②③ passes → output **"FH suitable"**
|
|
42
43
|
Otherwise, proceed to Step 2 local assessment → if fails, output **"Project-local agent or no asset needed"**
|
|
43
44
|
|
|
@@ -64,7 +65,32 @@ Immediately after trigger, acquire asset content in the following order.
|
|
|
64
65
|
> **Which asset should I evaluate?**
|
|
65
66
|
> Enter a file path (e.g., `.claude/agents/jira-create.md`) or a description.
|
|
66
67
|
|
|
67
|
-
After acquiring the asset content,
|
|
68
|
+
After acquiring the asset content, run Step 0.5 (mechanical overlap pre-scan) **before** the judged Step 1.
|
|
69
|
+
|
|
70
|
+
## Step 0.5. Mechanical Overlap Pre-Scan (grounds criterion ④)
|
|
71
|
+
|
|
72
|
+
Criterion ④ ("no overlap with existing FH skills") is otherwise an LLM **recall** judgment with no
|
|
73
|
+
ground truth — a duplicate skill with a novel name passes because the judge has no enumerated list to
|
|
74
|
+
check against (judge-robustness swarm, 2026-06-13). Ground it mechanically first:
|
|
75
|
+
|
|
76
|
+
```bash
|
|
77
|
+
# enumerate existing skill names + descriptions (grounds the judged comparison)
|
|
78
|
+
grep -riE "name:|description:" plugins/fh-meta/skills/*/SKILL.md plugins/fh-commons/skills/*/SKILL.md
|
|
79
|
+
# hard-collision check: WHOLE proposed name or a WHOLE trigger phrase reused verbatim.
|
|
80
|
+
# grep -wF (whole-word, fixed-string) on the full strings — NOT -E on tokens (a shared common
|
|
81
|
+
# word like "review" is not a collision). Exclude the asset's own file (self-match = false hit).
|
|
82
|
+
SELF="plugins/fh-meta/skills/<proposed name>/SKILL.md"
|
|
83
|
+
grep -rwF -e "<proposed full name>" -e "<full trigger phrase 1>" -e "<full trigger phrase 2>" \
|
|
84
|
+
plugins/*/skills/*/SKILL.md | grep -v "$SELF" | grep -c .
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
Surface **collision count + nearest existing skill(s)**. Criterion ④ then passes only if **0 whole-name/
|
|
88
|
+
whole-trigger collision AND the judged ≤90%-overlap check agrees**. A verbatim whole-name or
|
|
89
|
+
whole-trigger reuse is a hard ④ fail regardless of the LLM judgment. **Honest scope**: the grep grounds
|
|
90
|
+
*literal* name/trigger reuse only — a post-cutoff duplicate with a *paraphrased* trigger is invisible to
|
|
91
|
+
both the judge (cutoff) and the grep (literal); that residual leans on the judged half **fed the
|
|
92
|
+
enumerated descriptions above** (grounded comparison, not pure memory), not on full mechanization. A
|
|
93
|
+
shared common word is a judged-review flag, **not** a hard fail.
|
|
68
94
|
|
|
69
95
|
## Done When
|
|
70
96
|
|
|
@@ -83,7 +109,7 @@ All steps 0–3 completed
|
|
|
83
109
|
| ① | Cross-project value | Is this asset equally useful in other projects without depending on a specific project? |
|
|
84
110
|
| ② | Orchestration / judgment layer | Is it just a list of MCP/Bash calls, or a judgment layer that synthesizes multiple signals? |
|
|
85
111
|
| ③ | Not replaceable by built-ins | Can this be equally achieved with direct MCP calls or basic bash? (If yes, fails this criterion) |
|
|
86
|
-
| ④ | No overlap with existing FH skills |
|
|
112
|
+
| ④ | No overlap with existing FH skills | Step 0.5 mechanical scan = 0 name/trigger collision **AND** judged ≤90% overlap. Non-zero collision → hard fail. |
|
|
87
113
|
|
|
88
114
|
**FH suitable** → ① + ④ both pass + at least one of ②③ passes.
|
|
89
115
|
**Fail** → ① or ④ fails → immediate fail. Or both ②③ fail → proceed to Step 2.
|
|
@@ -123,14 +123,44 @@ Extract claims from the artifact that require source back-tracing. Claim types:
|
|
|
123
123
|
|
|
124
124
|
Back-trace each claim to the declared source files using Read + Grep directly — no inference judgment. Partial match is not treated as match.
|
|
125
125
|
|
|
126
|
+
**Mechanical anchor (GROUNDED is gated on a literal grep hit *in the right slot*, not a bare
|
|
127
|
+
judgment)** — this closes *out-of-context grounding*: citing a real, readable file for a false claim,
|
|
128
|
+
where the judge biases GROUNDED merely because the source *exists* and contains domain-adjacent text
|
|
129
|
+
(measured 2026-06-13, judge-robustness swarm — the cheapest S-exploit of this skill). The anchor is
|
|
130
|
+
**typed** — applying one byte-literal rule to every claim type both under-blocks (a token that hits an
|
|
131
|
+
irrelevant line) and over-blocks (a correct value formatted differently); each type gets the right check:
|
|
132
|
+
|
|
133
|
+
- **Proper noun / exact identifier** (skill name, file path, API name, flag): run
|
|
134
|
+
`grep -n "<exact token>" <cited file>` — must return a non-empty line, surfaced **literally**
|
|
135
|
+
(`file:line: matched text`), and that line must be where the identifier is *used as the claim
|
|
136
|
+
asserts* (a bare occurrence elsewhere is **not** grounding — e.g. claim "X is the default model" needs
|
|
137
|
+
the line that sets X as default, not any line mentioning X). No qualifying hit → Phantom.
|
|
138
|
+
- **Numerical / range value**: grep the value, but **normalize format/unit before judging** —
|
|
139
|
+
`300s` ≡ `300 seconds`, `≥5` ≡ `>= 5` (ASCII/Unicode), `5 minutes` ≡ `300 seconds`. The *value* must
|
|
140
|
+
sit in the slot the claim asserts. A correct value in a different format = Grounded (note the
|
|
141
|
+
normalization in evidence); a **different** value in that slot = Partial (re-confirm) or Phantom.
|
|
142
|
+
Never auto-Phantom a format variant.
|
|
143
|
+
- **Branching / multi-clause condition**: no single greppable token. Either **decompose to atomic
|
|
144
|
+
sub-conditions** and grep each identifier, or — if it stays a compound judgment — keep it
|
|
145
|
+
**judged-class with the adversarial pairing declared** (do not fake a single-token grep). State which.
|
|
146
|
+
|
|
147
|
+
**Universal rule**: a grep hit counts only if the surfaced line *expresses the claimed relation*.
|
|
148
|
+
"The token appears somewhere in a real file" is precisely the out-of-context trap, not evidence.
|
|
149
|
+
|
|
150
|
+
A claim that is **cited to a specific source but the source cannot be Read** (path doesn't resolve,
|
|
151
|
+
line beyond EOF) is **S-grade Phantom**, *not* the softer Source-Missing 🔴 — a citation that does not
|
|
152
|
+
resolve means the citation was invented. Source-Missing 🔴 is reserved for *undeclared* sources only.
|
|
153
|
+
Declared "sources" that are not resolvable file paths (e.g. `source: "the codebase"`) count as **0
|
|
154
|
+
sources** for the Step 0.5 blocker.
|
|
155
|
+
|
|
126
156
|
Back-tracing classification:
|
|
127
157
|
|
|
128
158
|
| Classification | Criteria | Marking |
|
|
129
159
|
|---|---|:---:|
|
|
130
|
-
| **Grounded** |
|
|
131
|
-
| **Partial** |
|
|
132
|
-
| **Phantom** |
|
|
133
|
-
| **Source-Missing** | Source
|
|
160
|
+
| **Grounded** | Typed anchor passes: identifier grep-hits *in the asserting slot* / value matches after normalization / branching sub-conditions trace (line surfaced) | ✅ |
|
|
161
|
+
| **Partial** | A *different* value sits in the claimed slot, or a compound condition partially traces — needs re-confirmation | ⚠️ |
|
|
162
|
+
| **Phantom** | Exact token not found in source, **or** cited to a named source that cannot be Read | ❌ |
|
|
163
|
+
| **Source-Missing** | Source was **not declared** (undeclared only — a failed *cited* source is Phantom) | 🔴 |
|
|
134
164
|
|
|
135
165
|
> **Detail**: See `SKILL_detail.md §Step2-Detail` — back-tracing execution procedure, classification decision rules, and Step 2 output format template — read when handling edge cases or formatting results.
|
|
136
166
|
|
|
@@ -232,7 +262,7 @@ This skill can be used independently without the full meta-harness structure.
|
|
|
232
262
|
|
|
233
263
|
```
|
|
234
264
|
Step 1 claim extraction complete
|
|
235
|
-
+ Step 2 all claims back-traced (
|
|
265
|
+
+ Step 2 all claims back-traced (Read + Grep — highest-priority GROUNDED requires a typed literal grep hit in the asserting slot, not inference)
|
|
236
266
|
+ Step 3 Phantom severity classification + prescription output
|
|
237
267
|
+ Step 4 process pattern diagnosis complete (skip if 0 Phantoms)
|
|
238
268
|
+ "phantom-quench Complete" declaration output
|
|
@@ -244,7 +274,7 @@ Verdict: PASS (0 Phantom claims) | CONDITIONAL_PASS (LOW-severity Phantoms only,
|
|
|
244
274
|
|
|
245
275
|
## Operating Notes
|
|
246
276
|
|
|
247
|
-
- **Never back-trace by inference**: Judging "this value is probably in the source" treats it as Partial not Phantom. Always directly confirm with Read + Grep.
|
|
277
|
+
- **Never back-trace by inference**: Judging "this value is probably in the source" treats it as Partial not Phantom. Always directly confirm with Read + Grep. **GROUNDED on a highest-priority claim is gated on a literal grep hit of the exact token (Step 2 mechanical anchor) — "the file exists and looks right" is the out-of-context-grounding trap, not evidence.**
|
|
248
278
|
- **Partial is not Grounded**: Processing similar-value-in-source as Grounded misses the reconstruction modification pattern.
|
|
249
279
|
- **Source not declared itself is S-grade**: If source is not declared when making an artifact, no claim can subsequently be verified. Recommend mandating source declaration in the process design stage.
|
|
250
280
|
- **Recommended to use with steel-quench**: steel-quench quenches structural flaws, phantom-quench ensures source consistency. The two skills are orthogonal and artifact quality assurance is strengthened when used together.
|
|
@@ -78,7 +78,7 @@ Read target artifact(s) → classify on 5 dimensions → output recommendation
|
|
|
78
78
|
| `artifact_type` | SKILL.md / design-doc → Area B + D-skill↑ · README / CHEATSHEET → Area A↑ · code / config → Area D-code↑ |
|
|
79
79
|
| `audience` | external installer / first-time user → beginner↑ · internal team only → challenger↑ |
|
|
80
80
|
| `claim_density` | 3+ stated benefits or superlatives → challenger↑ |
|
|
81
|
-
| `risk_level` | external publish / marketplace listing → steel-quench prerequisite triggered |
|
|
81
|
+
| `risk_level` | external publish / marketplace listing → steel-quench prerequisite triggered. **Mechanical floor (not judge-only)**: any of — publish/marketplace target · public-surface or visibility change · auth/secret-handling or executable code · an FH asset under the 4-axis gate — **forces `risk_level ≥ medium`** regardless of profiler judgment. The floor closes the "fool the profiler into `low` to skip Step 0.6" seam; the judge may only raise above the floor, never below it. |
|
|
82
82
|
| `novelty` | first-of-its-kind / no prior session evidence → phantom-quench recommended |
|
|
83
83
|
|
|
84
84
|
```
|
|
@@ -149,6 +149,48 @@ Concern format: `"One thing to check before [Area X]: [concern]. Proceed?"`
|
|
|
149
149
|
|
|
150
150
|
---
|
|
151
151
|
|
|
152
|
+
## Step 0.6 — Cross-Model Coverage Gate (risk≥medium — hard)
|
|
153
|
+
|
|
154
|
+
Closes the homogeneous-blind-spot + formatting-flattery vector (judge-robustness swarm, 2026-06-13):
|
|
155
|
+
a panel of same-session Claude sub-agents shares one model's blind spots, so a clean verdict can be
|
|
156
|
+
flattery the **whole panel** is blind to — and with no quotable-rule violation, no persona escalates.
|
|
157
|
+
For `risk_level ≥ medium` targets (from Step 0.3), at least one persona MUST come from a source
|
|
158
|
+
**outside the orchestrator's own session context**. This **promotes the former advisory "dual
|
|
159
|
+
validation principle"** (detail §AreaB-Baseline #4) to a hard gate — the mechanical-anchor pattern of
|
|
160
|
+
hardening #1–#5: a judged verdict binds to a fact the judge cannot fake.
|
|
161
|
+
|
|
162
|
+
**Graceful-degradation ladder** — take the highest available rung. The gate **never breaks
|
|
163
|
+
sim-conductor**; at the bottom it only withdraws the *unsafe autonomy* (self-certifying a blind verdict),
|
|
164
|
+
not the run:
|
|
165
|
+
|
|
166
|
+
| Rung | Source | `cross_model_coverage` | Closes | When |
|
|
167
|
+
|---|---|:---:|---|---|
|
|
168
|
+
| 1 | External CLI team (Multi-Team Mode — §MultiTeam) | `external` | model-level blind spot (genuine cross-model) | 1+ external CLI live + probe non-empty |
|
|
169
|
+
| 2 | Cross-session Claude — `claude -p` headless, or an Agent with **zero inherited context** | `cross-session` | **session-contamination only** — a fresh Claude shares the same weights/RLHF gradient, so it does **not** close the model-level blind spot; it only removes the orchestrator's working-memory bias. Honest partial mitigation, labeled as such | no external CLI; dispatch probe returns non-empty |
|
|
170
|
+
| 3 | Same-session sub-agents only | `NONE` | nothing — homogeneous panel | neither rung's probe succeeded |
|
|
171
|
+
|
|
172
|
+
**Mechanical anchor** — `cross_model_coverage` is valid **only if backed by a quoted dispatch artifact**,
|
|
173
|
+
not a self-assessment (the self-signing hole hardening #1 closed for the marker — same fix here). To
|
|
174
|
+
record `external` or `cross-session`, the Step 3 report must **quote a non-empty excerpt of the actual
|
|
175
|
+
dispatch output** (external CLI stdout, or the dispatched Agent's returned verdict text); a label with no
|
|
176
|
+
quoted excerpt is invalid and falls to `NONE`. **Liveness, not mere availability**: probe the rung before
|
|
177
|
+
claiming it — attempt the dispatch with a timeout; if it errors or returns empty (plan-gate closed,
|
|
178
|
+
context saturated, CLI present-but-dead), record `NONE`, never assume the rung succeeded. This is the
|
|
179
|
+
same honest scope as #1: the artifact makes the claim **auditable**, not cryptographically unforgeable —
|
|
180
|
+
a fabricated excerpt remains the operator's + weekly-audit's residual by design. On rung 3 (`NONE`) for a
|
|
181
|
+
risk≥medium target:
|
|
182
|
+
- the report flags `⚠️ cross-model coverage: NONE — homogeneous same-session panel; verdict provisional`, **and**
|
|
183
|
+
- **Step 4 auto-commit privilege is withdrawn** (see Step 4): M-tier fixes may be *prepared* but the
|
|
184
|
+
commit waits for the operator's explicit go. Auto-committing a structurally self-blind verdict is
|
|
185
|
+
exactly what the exploit targeted — so that single privilege is what degrades, not the simulation.
|
|
186
|
+
|
|
187
|
+
`risk_level = low` targets are exempt (a homogeneous panel is acceptable); the gate fires only at
|
|
188
|
+
medium+. The rung-2 fallback is what makes this CC-only-safe: a Claude-only environment still gets a
|
|
189
|
+
real cross-context read (a fresh isolated dispatch shares no working memory with the orchestrator), so
|
|
190
|
+
`NONE` is reached only when *both* external CLIs and a second Claude context are unavailable.
|
|
191
|
+
|
|
192
|
+
---
|
|
193
|
+
|
|
152
194
|
## Step 1 — Area-Specific Simulation
|
|
153
195
|
|
|
154
196
|
### Area A — External User Perspective
|
|
@@ -315,6 +357,29 @@ positives erode reviewer trust.
|
|
|
315
357
|
- No forced consensus or forced conflict — report Common opinions (2+ personas agree) and Conflicts
|
|
316
358
|
(position A vs B, each with rationale) as-is. Feeds Step 2 M/S/R triage (M ← Critical or 2+ personas).
|
|
317
359
|
|
|
360
|
+
**Zero-coverage map (mandatory synthesizer output)** — the synthesizer must emit, mechanically, the set
|
|
361
|
+
of standpoints that produced **no** finding, not only the ones that did (judge-robustness swarm,
|
|
362
|
+
2026-06-13). Enumerate the persona **standpoints** in play — those the Step 0.3 profile recommended,
|
|
363
|
+
**plus the standpoints its dimensions imply** (risk_level=high → a security/publish standpoint ·
|
|
364
|
+
audience=mixed → a non-native-reader standpoint · claim_density=high → a claim-evidence standpoint).
|
|
365
|
+
List the *standpoints*, not the bare dimension names (a `risk_level (low) → ZERO` row is noise that
|
|
366
|
+
trains operators to ignore ⚠️). Mark each `covered` (≥1 persona addressed it) or `ZERO` (no persona
|
|
367
|
+
touched it):
|
|
368
|
+
|
|
369
|
+
```
|
|
370
|
+
Coverage map:
|
|
371
|
+
beginner (onboarding friction) → covered (A-1: 2 findings)
|
|
372
|
+
challenger (claim-evidence) → covered (A-3: 1 finding)
|
|
373
|
+
security surface (risk_level=high) → ZERO ⚠️
|
|
374
|
+
non-native reader (audience=mixed) → ZERO ⚠️
|
|
375
|
+
```
|
|
376
|
+
|
|
377
|
+
A clean report with `ZERO` standpoints is **not** a pass — it is an uncovered surface, reported as such.
|
|
378
|
+
This converts the formatting-flattery failure (everything reads fine → nothing escalates) into a visible
|
|
379
|
+
gap: **silence on a standpoint is reported as `ZERO`, never inferred as approval.** It is a checklist
|
|
380
|
+
derived from the profile and the dispatch outputs — a mechanical anchor, not a judgment. Carry the map
|
|
381
|
+
verbatim into the Step 3 report.
|
|
382
|
+
|
|
318
383
|
The two severity vocabularies are layered, not redundant: a persona running **in isolation** assigns only
|
|
319
384
|
its own Critical/Important/Suggestion — it cannot assign M/S/R, since `S = found by 3+ personas` depends on
|
|
320
385
|
cross-persona agreement the isolated persona never sees. The synthesizer is the only context that can triage
|
|
@@ -365,6 +430,13 @@ File: `$REPORT_DIR/sim_YYYY_MM_DD_area_[X].md`
|
|
|
365
430
|
1+ M-tier → fix immediately → commit. PR creation requires explicit user request.
|
|
366
431
|
0 M-tier → commit report only + report S/R backlog.
|
|
367
432
|
|
|
433
|
+
**Cross-model gate on auto-commit (risk≥medium)** — when Step 0.6 recorded `cross_model_coverage: NONE`
|
|
434
|
+
on a risk≥medium target, the auto-commit privilege is **withdrawn**: prepare the M-tier fixes and write
|
|
435
|
+
the report, but do **not** self-commit — surface *"cross-model coverage NONE on a risk≥medium target;
|
|
436
|
+
the verdict is from a homogeneous same-session panel. Commit the fixes, or add a cross-context read
|
|
437
|
+
first?"* and wait for the operator's go. `external`/`cross-session` coverage, or risk_level=low, commits
|
|
438
|
+
as normal. (This withdraws one privilege, not the run — the report and fixes still exist.)
|
|
439
|
+
|
|
368
440
|
> **Detail**: See `SKILL_detail.md §PR-Bash` — branch creation bash, commit + push, gh pr create template — read when creating a PR.
|
|
369
441
|
|
|
370
442
|
---
|
|
@@ -390,8 +462,10 @@ Convergence within an AI-AI loop is **provisional**. Elevated to final only afte
|
|
|
390
462
|
| 1+ M-tier → fixed + committed (or "none") | ✅ Prescription complete |
|
|
391
463
|
| Report `tracks/_meta/sim_YYYY_MM_DD_*.md` saved | ✅ Persistence complete |
|
|
392
464
|
| 0 M-tier → report committed + S/R backlog reported | ✅ Health check complete |
|
|
465
|
+
| risk≥medium → `cross_model_coverage` recorded in report (external/cross-session/NONE) | ✅ Coverage gate ran *(check class: measured — the recorded value reflects the dispatch path that ran, not a self-grade; pair: NONE withdraws auto-commit per Step 4)* |
|
|
466
|
+
| Synthesizer emitted the zero-coverage map (every profile standpoint marked covered/ZERO) | ✅ Blind-spot surface reported *(check class: mechanical — a checklist over the profile, not a judgment)* |
|
|
393
467
|
|
|
394
|
-
Verdicts: PASS · CONDITIONAL_PASS (S/R only, or Area B cadence skip) · FAIL (M-tier unresolved) · ESCALATE (persona conflict requiring human judgment).
|
|
468
|
+
Verdicts: PASS · CONDITIONAL_PASS (S/R only, or Area B cadence skip) · FAIL (M-tier unresolved) · ESCALATE (persona conflict requiring human judgment, **or** `cross_model_coverage: NONE` on risk≥medium → auto-commit withdrawn pending operator go).
|
|
395
469
|
|
|
396
470
|
**Mandatory for Area A (external publish)**: steel-quench must complete in same session before Area A is marked complete.
|
|
397
471
|
|
|
@@ -214,7 +214,7 @@ Structural methods to reduce self-reference risk in Area B:
|
|
|
214
214
|
1. **Regular adversarial attacks**: Area B once/month + `challenger` attack once/quarter. Route challenger → defense results directly into SKILL.md via steel-quench handoff after Area B ends.
|
|
215
215
|
2. **Direct external user validation**: Non-owner attempts install + invocation → collect reactions. (cascade β validated: first autonomous external run confirmed.)
|
|
216
216
|
3. **steel-quench integration**: After Area B ends, hand off challenger findings to `/steel-quench` for deeper adversarial review + SKILL.md inscription.
|
|
217
|
-
4. **Dual validation principle**: Internal validation (Area B) alone is insufficient — minimized only when combined with external install reaction collection or cross-model validation.
|
|
217
|
+
4. **Dual validation principle**: Internal validation (Area B) alone is insufficient — minimized only when combined with external install reaction collection or cross-model validation. **For risk≥medium targets this is no longer advisory: it is the hard Cross-Model Coverage Gate** (SKILL.md Step 0.6) — at least one persona from outside the orchestrator's session context (external CLI → cross-session Claude → else `cross_model_coverage: NONE` withdraws auto-commit). Promoted from advisory to gate by the judge-robustness swarm (2026-06-13): a homogeneous same-session panel shares blind spots, so its clean verdict cannot self-certify.
|
|
218
218
|
|
|
219
219
|
**Dispatch template for Area B parallel**:
|
|
220
220
|
```
|
|
@@ -291,6 +291,7 @@ type: simulation-report
|
|
|
291
291
|
date: YYYY-MM-DD
|
|
292
292
|
areas: [A|B|C|D|E|all]
|
|
293
293
|
target_profile: [artifact_type | audience | risk_level]
|
|
294
|
+
cross_model_coverage: [external | cross-session | NONE | n/a-low-risk] # Step 0.6 — recorded from dispatch path
|
|
294
295
|
m_count: N
|
|
295
296
|
s_count: N
|
|
296
297
|
r_count: N
|
|
@@ -299,6 +300,10 @@ r_count: N
|
|
|
299
300
|
## Target Profile
|
|
300
301
|
artifact_type: [type] · audience: [internal|external|mixed] · risk_level: [low|medium|high]
|
|
301
302
|
|
|
303
|
+
## Coverage map (Step 1.5 — every profile standpoint marked covered/ZERO)
|
|
304
|
+
[verbatim zero-coverage map; ⚠️ on each ZERO]
|
|
305
|
+
cross_model_coverage: [external | cross-session | NONE] # NONE on risk≥medium → auto-commit withdrawn (Step 4)
|
|
306
|
+
|
|
302
307
|
## M-tier ([N] items)
|
|
303
308
|
| # | Issue | Location | Prescription |
|
|
304
309
|
...
|
|
@@ -164,8 +164,30 @@ No gate-PASS in scope → skip Wave-P3 entirely.
|
|
|
164
164
|
> dimensions the gate's own pass criteria structurally could not check. Only when all three Attack Failed can
|
|
165
165
|
> a **"Real PASS"** be declared.
|
|
166
166
|
|
|
167
|
+
**PASS-framing redaction (mandatory pre-step)** — the artifact reaches Wave-P3 *carrying its own
|
|
168
|
+
PASS declaration* (a `✅`, a "Verified" header, the gate marker), and a re-judge that reads that
|
|
169
|
+
framing is biased toward "Attack Failed" — the exact bias Wave-P3 exists to defeat (judge-robustness
|
|
170
|
+
swarm, 2026-06-13). Before feeding the artifact to the dimensions/challenger, **strip the pass-framing
|
|
171
|
+
at the bash layer, not the AI layer**:
|
|
172
|
+
|
|
173
|
+
```bash
|
|
174
|
+
# Strip framing glyphs + canonical FH verdict PHRASES only — never the bare word PASS
|
|
175
|
+
# (a global s/PASS//g corrupts substance: "status==PASS" → "status==", manufacturing findings).
|
|
176
|
+
P3R=$(mktemp)
|
|
177
|
+
sed -E -e 's/[✅🟩✓]//g' \
|
|
178
|
+
-e 's/(ALL AXES PASSED|FH_GATE_VERDICT:[[:space:]]*PASS|Real PASS|VERIFIED|[Gg]ate[^.]{0,24}declared PASS)//g' \
|
|
179
|
+
"{ARTIFACT}" > "$P3R"
|
|
180
|
+
# feed "$P3R" to Wave-P3; clean up after: rm -f "$P3R"
|
|
181
|
+
```
|
|
182
|
+
|
|
183
|
+
Feed `$P3R` to Wave-P3, then `rm -f "$P3R"`. The redaction is mechanical, so it cannot itself be
|
|
184
|
+
placated. **Honest scope**: it strips framing glyphs and *canonical FH verdict phrases* — bare-word or
|
|
185
|
+
lowercase prose self-congratulation ("this passed review", "looks green") is out of scope for `sed`
|
|
186
|
+
and is covered by a prose backstop instead: the Wave-P3 persona is instructed to **disregard any
|
|
187
|
+
residual self-pass language in the artifact** and attack as if no verdict were stated.
|
|
188
|
+
|
|
167
189
|
**Agent utilization**:
|
|
168
|
-
- `fh-commons:quench-challenger` (optional) — adds 6-axis structural attack to each dimension. If absent, run the 3 dimensions directly.
|
|
190
|
+
- `fh-commons:quench-challenger` (optional) — adds 6-axis structural attack to each dimension, fed the **redacted** artifact. If absent, run the 3 dimensions directly on the redacted copy.
|
|
169
191
|
- `fh-meta:persona-innovator` (after convergence) — error/gap patterns found during Wave-P3 → auto-propose new Cross-Project Pattern rows or skill-candidate signals.
|
|
170
192
|
|
|
171
193
|
The three dimensions generalize the gate's three blind spots:
|
|
@@ -10,6 +10,7 @@ complexity_routing:
|
|
|
10
10
|
escalate_when:
|
|
11
11
|
- full_revalidation
|
|
12
12
|
- high_stakes
|
|
13
|
+
- fail_verdict # AI recommendation was wrong → baseline overwrite is high-stakes, never stay at sonnet
|
|
13
14
|
---
|
|
14
15
|
|
|
15
16
|
# verify-bidirectional — Bidirectional Self-Validation Automation
|
|
@@ -50,6 +51,28 @@ Treat user's statement as **external refinement material**. **Do NOT attempt to
|
|
|
50
51
|
|
|
51
52
|
Core proposition: "refinement challenge ≠ fundamental negation". Priority is identifying where the initial recommendation is weakened.
|
|
52
53
|
|
|
54
|
+
**Evidence gate (overwrite ≠ soften)** — closes the sycophancy/steering vector where a bare assertion
|
|
55
|
+
("that's wrong, re-examine") flips a baseline with zero evidence (judge-robustness swarm, 2026-06-13).
|
|
56
|
+
"Do NOT defend" still holds for *this conversation's proposition* (anti-stubbornness is the point), but a
|
|
57
|
+
**persistent-baseline overwrite** (a rule, asset, memory, or `knowledge/` claim — anything that outlives
|
|
58
|
+
this session) requires a **supporting basis**, not mere pushback:
|
|
59
|
+
|
|
60
|
+
- **(a)** the user cited a file / line / commit / URL / past decision **and the cited content actually
|
|
61
|
+
supports the challenge** — verified by *reading it*, not by its mere existence (an irrelevant-but-real
|
|
62
|
+
citation is the out-of-context-grounding trap, the same vector phantom-quench guards), **or**
|
|
63
|
+
- **(b)** the Step 2 grep returns an actual contradiction that *supports the challenge* (not just any
|
|
64
|
+
conflict with the original) — surfaced literally.
|
|
65
|
+
|
|
66
|
+
If a baseline overwrite is implied but **neither holds**, do not silently rewrite: verdict is **ESCALATE**
|
|
67
|
+
— surface *"this would overwrite baseline {X} with no cited evidence; confirm override, or provide a
|
|
68
|
+
source?"* and block the Step 4 cascade until the operator answers. Softening a local in-conversation
|
|
69
|
+
proposition (no persistent asset changed) proceeds as before — the gate fires only on persistent-baseline
|
|
70
|
+
writes. **Sequencing**: this gate is written in Step 1, but Step 4 is what enumerates affected persistent
|
|
71
|
+
assets — so if Step 4 later identifies *any* persistent asset to write, **re-apply this gate before Step
|
|
72
|
+
4.5** even if the original challenge first looked like a mere soften. This is **not** restored AI defensiveness: the AI still does not argue the user is wrong; it only
|
|
73
|
+
declines to *fabricate a baseline change* the evidence does not support (mirror of the steel-quench/phantom
|
|
74
|
+
mechanical-anchor principle — judged verdicts bind to evidence).
|
|
75
|
+
|
|
53
76
|
### Step 2. Consistency Area Grep (3-step mandatory)
|
|
54
77
|
|
|
55
78
|
Grep to find which rules, assets, or propositions conflict with the initial recommendation:
|