@chrono-meta/fh-gate 1.2.2 → 1.3.0
- package/AGENTS.md +2 -2
- package/CATALOG.md +6 -1
- package/CHEATSHEET.md +125 -1
- package/CLAUDE.md +40 -4
- package/README.md +68 -15
- package/docs/codex-compat.md +4 -4
- package/docs/pillars.svg +26 -29
- package/knowledge/shared/harness-core/fh_integration_contract.md +1 -1
- package/package.json +1 -1
- package/plugins/fh-meta/skills/agent-composer/SKILL.md +1 -1
- package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +2 -2
- package/plugins/fh-meta/skills/edit-manifest/SKILL.md +1 -1
- package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +1 -1
- package/plugins/fh-meta/skills/install-wizard/SKILL.md +8 -1
- package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +1 -1
- package/plugins/fh-meta/skills/phantom-quench/SKILL.md +248 -0
- package/plugins/fh-meta/skills/{source-grounding-audit → phantom-quench}/SKILL_detail.md +3 -3
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +10 -10
- package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +77 -1
- package/plugins/fh-meta/skills/return-path-gate/SKILL.md +2 -2
- package/plugins/fh-meta/skills/sim-conductor/SKILL.md +59 -2
- package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +2 -2
- package/plugins/fh-meta/skills/skill-splitter/SKILL.md +4 -4
- package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +2 -2
- package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +27 -215
- package/plugins/fh-meta/skills/steel-quench/SKILL.md +24 -2
- package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +2 -2
- package/scripts/fh-gate.sh +3 -9
- package/scripts/fh-run.sh +1 -1
@@ -56,7 +56,7 @@ Step 3 — Draft SKILL_detail.md
 Front-matter: name, description, load: on-demand
 
 Step 4 — Verify
-
+phantom-quench: every §pointer in SKILL.md resolves to ## §SectionName in SKILL_detail.md
 sim-conductor Area D-skill: consumer agent with SKILL.md only → must reach grade F
 → Any pointer mismatch or grade P/B → fix before commit
 ```
@@ -101,9 +101,9 @@ Run on a SKILL.md when **any one** of:
 | Situation | Skill |
 |---|---|
 | Diagnose which SKILL.md files are candidates | `/context-doctor` or `/harness-doctor` |
-| Verify §pointer grounding after split | `/
+| Verify §pointer grounding after split | `/phantom-quench` |
 | Verify cold-start still works after split | `/sim-conductor D skill <name>` |
-| Check new SKILL_detail.md for phantom claims | `/
+| Check new SKILL_detail.md for phantom claims | `/phantom-quench` |
 | Adversarial review of the split result | `/steel-quench` |
 
 ---
@@ -115,7 +115,7 @@ Step 1 classification table produced
 + SKILL.md trimmed: triggers · principles · step overview · decision tables · Done When retained
 + SKILL.md has imperative pointer for every removed section (> **Detail**: See SKILL_detail.md §X)
 + SKILL_detail.md created: ## §SectionName header for every pointer in SKILL.md
-+
++ phantom-quench: 0 phantoms (all §pointers resolve)
 → Fallback (skill unavailable): run §Verification-Checklist manually from SKILL_detail.md
 + sim-conductor Area D-skill: grade F (consumer completes core task from SKILL.md alone)
 → Fallback (skill unavailable): manually confirm "trigger → step overview → key decision → Done When" all present in SKILL.md
@@ -164,7 +164,7 @@ grep "^## §" plugins/{plugin}/skills/{name}/SKILL_detail.md
 ```
 
 Then run:
-- `/
+- `/phantom-quench` — artifact: SKILL.md, declared source: SKILL_detail.md
 - `/sim-conductor D skill {name}` — provide SKILL.md only, attempt core task from trigger phrase
 
 ---
@@ -177,7 +177,7 @@ Use before committing a completed split:
 |---|---|
 | Trigger phrases ≥ 3 | SKILL.md §Trigger Phrases has 3+ entries |
 | Done When defined | SKILL.md has Done When block with ≥1 measurable condition |
-| All §pointers resolve |
+| All §pointers resolve | phantom-quench: 0 phantoms |
 | Cold-start grade F | sim-conductor Area D-skill: consumer reaches core completion |
 | No behavioral rule in SKILL_detail only | Any rule governing "what counts as X" present in SKILL.md |
 | No orphan §sections | Every ## §SectionName in SKILL_detail.md has a pointer from SKILL.md |
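The two pointer checks named in these hunks (every §pointer in SKILL.md resolves to a `## §SectionName` header in SKILL_detail.md, and no header is left without a pointer) can be sketched in a few lines of Python. This only illustrates the rule; it is not how `/phantom-quench` is implemented. The function name `check_pointers`, both regular expressions, and the assumption that pointers always appear as `SKILL_detail.md §Name` are hypothetical.

```python
import re

# Assumed pointer form inside SKILL.md: "SKILL_detail.md §Name" (hypothetical pattern).
POINTER_RE = re.compile(r"SKILL_detail\.md\s+§([\w-]+)")
# Header form quoted in the diff: lines that start with "## §SectionName".
HEADER_RE = re.compile(r"^## §([\w-]+)", re.MULTILINE)


def check_pointers(skill_md: str, detail_md: str) -> dict:
    """Return pointers without a header (phantoms) and headers without a pointer (orphans)."""
    pointers = set(POINTER_RE.findall(skill_md))
    headers = set(HEADER_RE.findall(detail_md))
    return {
        "phantoms": sorted(pointers - headers),
        "orphans": sorted(headers - pointers),
    }


# Tiny made-up inputs: one resolving pointer, one phantom, one orphan header.
skill = (
    "> **Detail**: See `SKILL_detail.md §Step1-Detail`\n"
    "> **Detail**: See SKILL_detail.md §Missing\n"
)
detail = "## §Step1-Detail\nbody\n\n## §Unused\nbody\n"
result = check_pointers(skill, detail)
print(result)
```

An empty `phantoms` list corresponds to the "0 phantoms" row of the checklist, and an empty `orphans` list to the "No orphan §sections" row.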
@@ -1,230 +1,42 @@
 ---
 name: source-grounding-audit
-description:
-
-
+description: >-
+RENAMED to phantom-quench (2026-06-06, quench-series rebrand). Same skill, same ruleset — only the
+label changed to fit the quench family (steel-quench · phantom-quench · goal-quench). Use
+/phantom-quench. This alias is retained so old references and the v1 paper's name still resolve.
+user-invocable: false
+allowed-tools: []
 model: sonnet
+deprecated: true
+deprecated_reason: renamed to phantom-quench (label-only; not a merge — same skill)
+deprecated_date: 2026-06-06
+successor: phantom-quench
 ---
 
-# source-grounding-audit —
+# source-grounding-audit — RENAMED to `phantom-quench`
 
->
+> **Renamed to `phantom-quench` (2026-06-06).** This is a **label rename, not a deprecation-by-merge** —
+> the skill is unchanged and fully active under its new name. Invoke **`/phantom-quench`**.
 
-
+## Why the rename
 
-
+phantom-quench is the **grounding member of the quench series** (steel-quench attacks output patterns ·
+phantom-quench traces inputs for Phantom Claims · goal-quench gates autonomous runs). The old descriptive
+name did not signal that family membership; the function is identical.
 
-
-|---|---|---|
-| **Attack target** | Output patterns (self-declarations, cushion language, reason for existence) | Input tracing (is the claim in the source?) |
-| **Core question** | "Is this structure flawed?" | "Where did this content come from?" |
-| **Activation timing** | All-angle quench just before completion | Immediately after source-based artifact generation or at point of suspicion |
-| **Primary attack vector** | Bus factor, self-reference, platform obsolescence | Phantom Claim, source not read, fabricated branching conditions |
-| **Representative pattern** | "Declaration only, no evidence" | "Number in TC that doesn't exist in source" |
+## Where the skill lives now
 
-
+`plugins/fh-meta/skills/phantom-quench/SKILL.md` (+ `SKILL_detail.md`) — full ruleset preserved
+(S-grade blocker, Human Gate, Pattern Diagnosis, etc.).
 
-
-
-## Trigger Phrases
-
-| Phrase | Situation |
-|---|---|
-| "phantom detection", "phantom claim", "false claim detection" | Full artifact Phantom scan (primary trigger) |
-| "source back-trace", "source audit" | Analysis report, design document verification |
-| "verify source", "where did this come from" | Suspecting origin of a specific claim |
-| "TC evidence tracing", "TC source verification" | Post-TC-generation source consistency check |
-| "grounding audit", "source grounding audit" | Full artifact Phantom scan |
-| "verify evidence files" | Analysis report, design document verification |
-| `/source-grounding-audit` | Explicit call |
-
----
-
-## Core Concept — Phantom Claim
-
-**Phantom Claim**: A claim that appears in the artifact but cannot be found in the declared source files.
-
-3 paths through which Phantoms are produced:
-
-| Path | Description | Risk |
-|---|---|:---:|
-| **Source not read** | AI generates artifact using domain knowledge without Read-ing source | S |
-| **Partial reading** | Source partially read, rest filled in with inference | A |
-| **Reconstruction contamination** | Source was read but LLM modified values/conditions during paraphrase | A |
-
----
-
-## Execution Steps
-
-### Step 0. Confirm Audit Target
-
-If not provided by user, explicitly confirm: artifact file path, declared source files, and audit scope. Source not declared = S-grade blocker registered immediately.
-
-> **Detail**: See `SKILL_detail.md §Step0-Detail` — confirmation output format and simplification guard — read when audit target or source list is ambiguous.
-
----
-
-### Step 0.5. Claim Distribution Profile
-
-> **Schema**: `knowledge/shared/harness-core/tpa_schema.md` — `phantom_risk` derivation rule, gate trigger conditions, §Gate Routing Table.
-
-Runs after Step 0 (target + source confirmed). Skip if user specifies scope explicitly.
-
-Scan artifact quickly to classify claim distribution:
-
-| Dimension | Signal → Audit depth shift |
-|---|---|
-| `claim_density` | > 10 claims → full Step 1-4 audit; ≤ 3 claims → light (S+A only) |
-| `artifact_type` | SKILL.md/design-doc → prioritize Branch/State-transition claims; code → prioritize Proper-noun/API claims |
-| `risk_level` | external publish / arXiv citations → all claim types, max depth |
-| `source_count` | 0 declared sources → S-grade blocker immediately (skip to Step 3 prescription) |
-| `quantitative_density` | > 3 numerical claims → focus numerical+range types first |
-
-Scope recommendation output:
-```
-Claim types to prioritize: [list]
-Audit depth: [full | prioritized | light]
-Immediate blockers detected: [yes/no — 0 sources = immediate S-grade]
-```
-
-**0-source behavioral rule**: When artifact has 0 declared sources, skip Steps 1-2 entirely and go directly to Step 3 with S-grade blocker: "Source not declared — all claims unverifiable."
-
----
-
-### Step 1. Claim Extraction (Artifact Scan)
-
-Extract claims from the artifact that require source back-tracing. Claim types: Proper nouns (highest), Numerical/range values (highest), Branching conditions (highest), State transitions (high), Preconditions (high), Actors (medium). Exclude generic test methodology descriptions and generic UI patterns.
-
-> **Detail**: See `SKILL_detail.md §Step1-Detail` — full claim types table with examples, exclude list, and Step 1 output format template — read when deciding which claims to include or format the extraction results.
-
----
-
-### Step 2. Source Read + Back-Trace
-
-Back-trace each claim to the declared source files using Read + Grep directly — no inference judgment. Partial match is not treated as match.
-
-Back-tracing classification:
-
-| Classification | Criteria | Marking |
-|---|---|:---:|
-| **Grounded** | Claim directly confirmed in source | ✅ |
-| **Partial** | Similar content in source but not exact match — needs re-confirmation | ⚠️ |
-| **Phantom** | Cannot be found in source | ❌ |
-| **Source-Missing** | Source itself cannot be Read or was not declared | 🔴 |
-
-> **Detail**: See `SKILL_detail.md §Step2-Detail` — back-tracing execution procedure, classification decision rules, and Step 2 output format template — read when handling edge cases or formatting results.
-
----
-
-### Step 3. Phantom Classification + Prescription
-
-Classify Phantom and Partial claims by severity and provide prescriptions.
-
-**Severity classification criteria**:
+## Record note (do not "fix")
 
-
-
-
-| **A** (Must fix) | If this claim is wrong, TC cannot execute or runs wrong path | API endpoint names, field names, preconditions |
-| **B** (Improvement recommended) | If this claim is wrong, TC can execute but intent may differ | Descriptive text, non-critical names |
-
-Prescriptions: (1) Source Re-read — precisely re-read the relevant source section and fix; (2) Request source specification — when source doesn't exist or wasn't declared; (3) Delete/rewrite — delete claims without source grounding and rewrite from source.
-
-> **Detail**: See `SKILL_detail.md §Step3-Detail` — prescription procedures and Step 3 output format template — read when writing the classification table or applying a prescription.
-
-**S-grade Immediate Human Gate** — if 1+ S-grade Phantoms found, pause before Step 4/5 and surface:
-
-```
-⚠️ source-grounding-audit: N S-grade Phantom(s) found:
-- [claim 1 — one-line summary, location]
-- [claim 2 — one-line summary, location]
-
-Options:
-(a) Continue — AI proceeds to Step 4 pattern diagnosis + Step 5 re-audit
-(b) Human review first — inspect Phantoms directly, then proceed
-(c) Abort — fix sources manually and re-run audit
-
-Waiting for input. (Default: a)
-```
-
-Rationale: S-grade Phantoms that enter Step 5 re-audit without human review risk LLM reconstruction contamination — the same pattern that originally produced the Phantoms can "verify" its own fixes. Human review at this threshold breaks the loop.
-
----
-
-### Step 4. Source Not-Read Pattern Detection (Meta Diagnosis)
-
-Analyze Phantom distribution to diagnose structural problems in the artifact generation process. Reveal "why were these Phantoms produced", not just "this TC is wrong".
-
-**Pattern detection criteria**:
-
-| Pattern | Detection Condition | Meaning |
-|---|---|---|
-| **Source not read** | 3+ Phantoms and no or partial source Read history | AI generated using domain knowledge without reading source |
-| **Partial reading contamination** | Partial items exceed 30% of total | AI read source partially and filled rest with inference |
-| **Reconstruction modification** | Source value exists but unit/format/range modified in TC | LLM paraphrase process contamination |
-| **Source declaration absent** | Source file not specified when generating artifact | Process design stage problem |
-
-**Simplification guard**: If 0 Phantoms, skip Step 4 entirely. Replace with one line: "Source grounding adequate."
-
-> **Detail**: See `SKILL_detail.md §Step4-Detail` — Step 4 output format template — read when writing the pattern diagnosis section.
-
----
-
-### Step 5. Post-Fix Re-audit (Optional)
-
-Re-run back-trace for S-grade blocker claims after fixes are complete. Activate when 1+ S-grade blockers exist and fix is immediately possible.
-
-**Done When (re-audit)**: Back-trace results for fixed claims all show Grounded (✅) status.
-
----
-
-## Completion Declaration Format
-
-> **Template**: See `SKILL_detail.md §Report-Template` — full completion declaration format — read when producing the final audit summary.
-
----
-
-## Connected Skills
-
-| Situation | Connected Skill |
-|---|---|
-| Simultaneously verify output patterns (self-declarations, cushion language) | `/steel-quench` Wave 1 "real-use verification" angle |
-| Re-verify Phantom patterns from external user perspective | `/sim-conductor Area A` |
-| Source not-read is a harness structure problem | `/harness-doctor` |
-| Phantom pattern is a candidate for new rule items | `fh-meta:persona-innovator` |
-| Redesign the artifact generation prompt itself | `/meta-prompt-builder` |
-
----
-
-## External User Environment Adaptation
-
-This skill can be used independently without the full meta-harness structure.
-
-**How to declare source files**: When generating artifacts, specify "source: [file path list]", or provide source files when invoking this skill.
-
-**External environment fallback**:
-- If no `tracks/_meta/` → skip persistence step
-- If no project-specific rules (like PFD) → output Phantom pattern summary only
-
----
+The **v1 paper** (Zenodo 10.5281/zenodo.20397566; arXiv submission) cites `source-grounding-audit`.
+That is the **immutable historical name**, not a phantom — `paper/forge_harness_v1.0.html` is left
+unchanged by design. Future readers map: *source-grounding-audit (v1 paper) = phantom-quench (current)*.
 
 ## Done When
 
-
-
-
-+ Step 3 Phantom severity classification + prescription output
-+ Step 4 process pattern diagnosis complete (skip if 0 Phantoms)
-+ "source-grounding-audit Complete" declaration output
-```
-
-Verdict: PASS (0 Phantom claims) | CONDITIONAL_PASS (LOW-severity Phantoms only, prescriptions noted) | FAIL (1+ HIGH/MEDIUM Phantom — broken path, phantom file, or stale external link) | ESCALATE (scope unclear or claim extraction impossible)
-
----
-
-## Operating Notes
-
-- **Never back-trace by inference**: Judging "this value is probably in the source" treats it as Partial not Phantom. Always directly confirm with Read + Grep.
-- **Partial is not Grounded**: Processing similar-value-in-source as Grounded misses the reconstruction modification pattern.
-- **Source not declared itself is S-grade**: If source is not declared when making an artifact, no claim can subsequently be verified. Recommend mandating source declaration in the process design stage.
-- **Recommended to use with steel-quench**: steel-quench quenches structural flaws, source-grounding-audit ensures source consistency. The two skills are orthogonal and artifact quality assurance is strengthened when used together.
+Deprecated alias — no active execution path of its own. Done When: all invocation routes through
+`/phantom-quench` (the successor); this entry exists only so old names resolve. Satisfies the
+harness-doctor L2 M-tier Done-When requirement (CLAUDE.md §New Skill Creation Pre-Commit Gate).
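How a loader might follow the `deprecated` and `successor` front-matter fields added in this hunk is sketched below. The harness's real resolution logic is not part of this diff, so everything here is an assumption: `parse_front_matter`, `resolve_skill`, the in-memory `skills` mapping, and the sample file contents are hypothetical, and the parser handles only flat `key: value` lines, not full YAML.

```python
def parse_front_matter(text: str) -> dict:
    """Parse flat `key: value` pairs between the first two '---' lines (not full YAML)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        # Indented lines are treated as continuations of a folded scalar and skipped.
        if sep and key and key == key.strip():
            fields[key] = value.strip()
    return fields


def resolve_skill(name: str, skills: dict, max_hops: int = 5) -> str:
    """Follow `successor` links for as long as a skill is marked `deprecated: true`."""
    for _ in range(max_hops):
        meta = parse_front_matter(skills[name])
        if meta.get("deprecated") != "true" or "successor" not in meta:
            return name
        name = meta["successor"]
    raise ValueError("successor chain too long (possible cycle)")


# Hypothetical stand-ins for the two SKILL.md files; only the alias fields mirror the diff.
skills = {
    "source-grounding-audit": (
        "---\n"
        "name: source-grounding-audit\n"
        "description: >-\n"
        "  RENAMED to phantom-quench. Use: /phantom-quench\n"
        "deprecated: true\n"
        "successor: phantom-quench\n"
        "---\n"
        "# body\n"
    ),
    "phantom-quench": "---\nname: phantom-quench\n---\n",
}
resolved = resolve_skill("source-grounding-audit", skills)
print(resolved)
```

The hop limit is a design choice of this sketch: it keeps a mistaken circular `successor` chain from looping forever.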
@@ -148,6 +148,27 @@ Wave 4 convergence = Wave 3 criteria + 3 AI-specific vectors actually reviewed +
 
 ---
 
+## External-GT Adjudication (when the target has a public ground truth)
+
+When quenching a **public artifact that has its own ground truth** — a repo's open issues, test suite, or
+stated policy/threat-model (a frontier codebase, a sister project — *not* your own in-progress draft) — add
+an adjudication pass after the panel produces findings. The panel (Wave 5 cross-family) gives decorrelated
+detection; this pass adds the *external check* the panel cannot self-supply. For each finding, classify:
+
+| Class | Test | Meaning |
+|---|---|---|
+| **Corroborated** | matches an OPEN issue / a failing test | independent rediscovery — strongest |
+| **Novel** | no matching issue, but confirmed by logic or a written test | caught what the target missed |
+| **Reframe / reject** | the target's own docs/policy/threat-model marks it intentional or out-of-scope | NOT a confident catch — a false positive |
+
+The GT (not a cross-family vote) resolves contention objectively, and it catches the panel's own
+**shared training-prior** false positives. Report only Corroborated + Novel as confident catches; a null
+result on sound code is the correct answer, not a failure. **Basis**: 2026-06-06 frontier-quench sweep —
+a single-family pass repeated still misses what cross-family catches, and a target's `SECURITY.md` reframed
+"security" findings to "correctness" (its permission layer was UX, not a boundary).
+
+---
+
 ## Cross-Project Common Patterns (initial seed)
 
 | # | Pattern Name | Description | Response Direction |
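The External-GT Adjudication table added in this hunk reads naturally as a small decision function, sketched below. The diff does not show how the skill records findings, so the `Finding` fields, the `adjudicate` and `confident_catches` names, the sample findings, and the `UNCONFIRMED` bucket are all hypothetical. The order of the checks is also an assumption: the table does not say which class wins when a finding matches an open issue and is also marked out-of-scope by the target's own policy.

```python
from dataclasses import dataclass
from enum import Enum


class Adjudication(Enum):
    CORROBORATED = "corroborated"
    NOVEL = "novel"
    REFRAME_OR_REJECT = "reframe/reject"
    UNCONFIRMED = "unconfirmed"  # hypothetical: the table defines no class for this case


@dataclass
class Finding:
    title: str
    matches_open_issue_or_failing_test: bool = False
    confirmed_by_logic_or_written_test: bool = False
    marked_intentional_or_out_of_scope: bool = False


def adjudicate(f: Finding) -> Adjudication:
    # Assumed precedence: the target's own docs/policy/threat-model is consulted first.
    if f.marked_intentional_or_out_of_scope:
        return Adjudication.REFRAME_OR_REJECT
    if f.matches_open_issue_or_failing_test:
        return Adjudication.CORROBORATED
    if f.confirmed_by_logic_or_written_test:
        return Adjudication.NOVEL
    return Adjudication.UNCONFIRMED


def confident_catches(findings):
    """Report only Corroborated and Novel findings; an empty list is a valid result."""
    keep = {Adjudication.CORROBORATED, Adjudication.NOVEL}
    return [f.title for f in findings if adjudicate(f) in keep]


# Made-up findings for illustration only.
findings = [
    Finding("race in cache refresh", matches_open_issue_or_failing_test=True),
    Finding("permission prompt can be bypassed", marked_intentional_or_out_of_scope=True),
    Finding("off-by-one in pager", confirmed_by_logic_or_written_test=True),
]
catches = confident_catches(findings)
print(catches)
```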
@@ -194,7 +215,7 @@ Verdict: PASS (zero S-grade, convergence reached) | CONDITIONAL_PASS (A/B-grade
 | Attack angle is a harness structure problem | `/harness-doctor` | optional |
 | After Wave convergence, propose new pattern rules | `fh-meta:persona-innovator` | optional |
 | Wave 1 structure-specific attack (6-axis) | `fh-commons:quench-challenger` | priority |
-| Back-trace whether claims exist in source files | `/
+| Back-trace whether claims exist in source files | `/phantom-quench` | **mandatory** when `phantom_risk=true` OR `scope=external` (see tpa_schema.md §Gate Routing Table) |
 
 **steel-quench → sim-conductor gate**: After Wave convergence in external-publish context, `/sim-conductor Area A` is the mandatory next step.
 
@@ -219,7 +240,8 @@ sim-conductor Area A (external user perspective)
 - **Attacks without real code are invalid.** Abstract criticism is not included in Wave 1 results.
 - **quench-challenger first.** Call fh-commons:quench-challenger in isolation in Wave 1 if available.
 - **Always check self-referential pattern (P3).** Cross-validate Wave results with external criteria.
-- **
+- **Public target → adjudicate against external GT before claiming.** A finding the target's own docs/policy/threat-model marks intentional or out-of-scope is a false positive, not a catch. See §External-GT Adjudication.
+- **Attack surface limit**: steel-quench attacks output content patterns. Phantom Claim detection → `phantom-quench`.
 
 ## Failure Fallback
 
@@ -443,11 +443,11 @@ External CLIs available: check at runtime via Step 0-pre bash detection
 **Wave selection**:
 ```
 Run: Wave 1 (claim density), Wave 2 (structural defense, weight↑),
-Wave 3 (weight↑ — arXiv/DOI phantom risk; pair with /
+Wave 3 (weight↑ — arXiv/DOI phantom risk; pair with /phantom-quench),
 Wave 4 (novelty: new architecture)
 Wave 5 (cross-team scope — activate if risk_level=high or user requests)
 Skip: Phase 0 (unless user supplies an external bad-case doc)
 External CLIs available: check at runtime
 ```
 
-**Degraded coverage note**: Wave 3 without `/
+**Degraded coverage note**: Wave 3 without `/phantom-quench` available → flag as "Axis 3 skipped (skill unavailable)" and note in residual risk card.
package/scripts/fh-gate.sh
CHANGED
@@ -329,16 +329,10 @@ if [[ "$FIRST_OUTPUT_LINE" != "FH_STATUS: SUCCESS" ]]; then
 exit $EXIT_HARNESS_ERROR
 fi
 
-
+# Harness-failure guard is already enforced above: the first non-empty output line
+# must be "FH_STATUS: SUCCESS" (see check at top of this block) or we exit HARNESS_ERROR.
 VERDICT=$(grep -m 1 "^FH_GATE_VERDICT:" "$PARSE_FILE" 2>/dev/null | awk '{print $2}' | tr -d '[:space:]' || true)
 
-# Harness failure guard (fail-safe: missing status → BLOCKED)
-if [[ "$FH_STATUS" != "SUCCESS" ]]; then
-echo "ERROR: FH_STATUS=${FH_STATUS:-MISSING} — harness failure (fail-safe: BLOCKED)" >&2
-cat "$OUTPUT_FILE" >&2
-exit $EXIT_HARNESS_ERROR
-fi
-
 # Emit structured output to stdout
 cat "$PARSE_FILE"
 
@@ -368,7 +362,7 @@ case "$VERDICT" in
 BLOCKED) echo "→ verdict: BLOCKED" >&2; exit $EXIT_BLOCKED ;;
 ESCALATE) echo "→ verdict: ESCALATE" >&2; exit $EXIT_ESCALATE ;;
 *)
-echo "ERROR: unrecognized verdict '${VERDICT:-EMPTY}' —
+echo "ERROR: unrecognized verdict '${VERDICT:-EMPTY}' — harness error, failing safe (commit not allowed)" >&2
 exit $EXIT_HARNESS_ERROR
 ;;
 esac
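The control flow visible in the two fh-gate.sh hunks (a status guard on the first non-empty output line, then a verdict lookup that fails safe on anything unrecognized) can be restated as a short Python sketch. It is not a port of the script. The numeric exit codes are placeholders because the real values of `EXIT_BLOCKED`, `EXIT_ESCALATE`, and `EXIT_HARNESS_ERROR` are not shown; `gate_exit_code` is a hypothetical name; the sketch reads a single string where the script uses separate `$OUTPUT_FILE` and `$PARSE_FILE`; and only the two verdict arms visible in the hunk are modelled.

```python
# Placeholder exit codes: the script's real values are not visible in the diff.
EXIT_BLOCKED, EXIT_ESCALATE, EXIT_HARNESS_ERROR = 1, 2, 3

# Only the verdicts shown in the hunk; the script's earlier case arms are not shown.
VISIBLE_VERDICTS = {"BLOCKED": EXIT_BLOCKED, "ESCALATE": EXIT_ESCALATE}


def gate_exit_code(output: str, verdicts: dict = VISIBLE_VERDICTS) -> int:
    lines = output.splitlines()
    # Status guard: the first non-empty line must be exactly "FH_STATUS: SUCCESS".
    # (Stripping surrounding whitespace here is an assumption of this sketch.)
    first = next((ln.strip() for ln in lines if ln.strip()), "")
    if first != "FH_STATUS: SUCCESS":
        return EXIT_HARNESS_ERROR
    # Mirror `grep -m 1 "^FH_GATE_VERDICT:" | awk '{print $2}'`:
    # take the first matching line and its second whitespace-separated field.
    verdict = ""
    for ln in lines:
        if ln.startswith("FH_GATE_VERDICT:"):
            parts = ln.split()
            verdict = parts[1] if len(parts) > 1 else ""
            break
    # An unrecognized or empty verdict fails safe as a harness error.
    return verdicts.get(verdict, EXIT_HARNESS_ERROR)


print(gate_exit_code("FH_STATUS: SUCCESS\nFH_GATE_VERDICT: BLOCKED\n"))
```

With the status guard applied once up front, a second `FH_STATUS` check after verdict parsing adds nothing, which matches the removal in the first hunk.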
package/scripts/fh-run.sh
CHANGED
@@ -35,7 +35,7 @@ Environment:
 FH_DRY_RUN=1 Print assembled prompt only
 
 Examples:
-FH_BACKEND=codex fh-run --skill
+FH_BACKEND=codex fh-run --skill phantom-quench --file docs/foo.md
 FH_BACKEND=codex fh-run --agent fh-commons:quench-challenger --file plugins/fh-meta/skills/foo/SKILL.md
 USAGE
 }