@chrono-meta/fh-gate 1.1.0 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (69)
  1. package/.claude/agents/challenger.md +169 -0
  2. package/AGENTS.md +160 -0
  3. package/CATALOG.md +256 -0
  4. package/CHEATSHEET.md +367 -0
  5. package/CLAUDE.md +331 -0
  6. package/CONTRIBUTING.md +198 -0
  7. package/LICENSE +21 -0
  8. package/README.md +60 -7
  9. package/bin/fh-goal.js +9 -0
  10. package/bin/fh-run.js +9 -0
  11. package/docs/banner.png +0 -0
  12. package/docs/codex-compat.md +123 -0
  13. package/docs/pillars.svg +70 -0
  14. package/knowledge/shared/harness-core/fh_integration_contract.md +45 -28
  15. package/package.json +31 -6
  16. package/plugins/fh-commons/README.md +37 -0
  17. package/plugins/fh-commons/agents/quench-challenger.md +373 -0
  18. package/plugins/fh-commons/skills/convergence-loop/SKILL.md +155 -0
  19. package/plugins/fh-commons/skills/deliberation/SKILL.md +288 -0
  20. package/plugins/fh-commons/skills/mcp-circuit-breaker/SKILL.md +196 -0
  21. package/plugins/fh-commons/skills/token-budget-gate/SKILL.md +175 -0
  22. package/plugins/fh-meta/agents/fact-checker.md +121 -0
  23. package/plugins/fh-meta/agents/hub-persona-auditor.md +109 -0
  24. package/plugins/fh-meta/agents/persona-innovator.md +195 -0
  25. package/plugins/fh-meta/skills/agent-composer/SKILL.md +461 -0
  26. package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +464 -0
  27. package/plugins/fh-meta/skills/apex-review/SKILL.md +185 -0
  28. package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md +135 -0
  29. package/plugins/fh-meta/skills/contention-layer/SKILL.md +127 -0
  30. package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL.md +30 -0
  31. package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL_detail.md +144 -0
  32. package/plugins/fh-meta/skills/context-doctor/SKILL.md +341 -0
  33. package/plugins/fh-meta/skills/cross-ecosystem-synergy-detection/SKILL.md +202 -0
  34. package/plugins/fh-meta/skills/deep-clarify/SKILL.md +144 -0
  35. package/plugins/fh-meta/skills/edit-manifest/SKILL.md +210 -0
  36. package/plugins/fh-meta/skills/field-harvest/SKILL.md +384 -0
  37. package/plugins/fh-meta/skills/frontier-digest/SKILL.md +272 -0
  38. package/plugins/fh-meta/skills/goal-quench/SKILL.md +509 -0
  39. package/plugins/fh-meta/skills/harness-doctor/SKILL.md +277 -0
  40. package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +484 -0
  41. package/plugins/fh-meta/skills/harvest-loop/SKILL.md +231 -0
  42. package/plugins/fh-meta/skills/harvest-loop/SKILL_detail.md +201 -0
  43. package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL.md +129 -0
  44. package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL_detail.md +158 -0
  45. package/plugins/fh-meta/skills/install-doctor/SKILL.md +207 -0
  46. package/plugins/fh-meta/skills/install-wizard/SKILL.md +613 -0
  47. package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +193 -0
  48. package/plugins/fh-meta/skills/memory-hygiene/SKILL.md +143 -0
  49. package/plugins/fh-meta/skills/meta-prompt-builder/SKILL.md +167 -0
  50. package/plugins/fh-meta/skills/meta-prompt-builder/SKILL_detail.md +37 -0
  51. package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +430 -0
  52. package/plugins/fh-meta/skills/plugin-recommender/SKILL.md +221 -0
  53. package/plugins/fh-meta/skills/plugin-recommender/SKILL_detail.md +220 -0
  54. package/plugins/fh-meta/skills/prompt-regression/SKILL.md +178 -0
  55. package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +224 -0
  56. package/plugins/fh-meta/skills/return-path-gate/SKILL.md +257 -0
  57. package/plugins/fh-meta/skills/self-marketing-lint/SKILL.md +129 -0
  58. package/plugins/fh-meta/skills/sim-conductor/SKILL.md +364 -0
  59. package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +337 -0
  60. package/plugins/fh-meta/skills/skill-splitter/SKILL.md +126 -0
  61. package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +185 -0
  62. package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +230 -0
  63. package/plugins/fh-meta/skills/source-grounding-audit/SKILL_detail.md +182 -0
  64. package/plugins/fh-meta/skills/steel-quench/SKILL.md +226 -0
  65. package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +453 -0
  66. package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md +238 -0
  67. package/scripts/fh-gate.sh +175 -40
  68. package/scripts/fh-goal.sh +182 -0
  69. package/scripts/fh-run.sh +269 -0
@@ -0,0 +1,230 @@
1
+ ---
2
+ name: source-grounding-audit
3
+ description: Extracts proper nouns, numerical values, and branching conditions from artifacts (TCs, analysis reports, design documents), back-traces them to declared source files, and marks as Phantom (false) if not found in source. If steel-quench attacks output patterns (self-declarations, cushion language), source-grounding-audit attacks input tracing (where did this come from?). Triggered by "phantom detection", "source back-trace", "source audit", "verify source", "TC evidence tracing", "where did this come from", "grounding audit", "source grounding audit", "phantom claim", "false claim detection".
4
+ user-invocable: true
5
+ allowed-tools: ["Read", "Write", "Edit", "Bash", "Grep", "Glob"]
6
+ model: sonnet
7
+ ---
8
+
9
+ # source-grounding-audit — Input Tracing Grounding Audit
10
+
11
+ > An artifact that looks plausible is not necessarily grounded in its source: plausible ≠ grounded.
12
+
13
+ When an AI generates artifacts without reading the source, their contents look like domain knowledge but are actually **Phantom Claims** drawn from LLM weights. This skill back-traces each claim in the artifact to the declared source files and explicitly marks the Phantoms.
14
+
15
+ ## Role Separation from steel-quench
16
+
17
+ | Dimension | steel-quench | source-grounding-audit |
18
+ |---|---|---|
19
+ | **Attack target** | Output patterns (self-declarations, cushion language, reason for existence) | Input tracing (is the claim in the source?) |
20
+ | **Core question** | "Is this structure flawed?" | "Where did this content come from?" |
21
+ | **Activation timing** | All-angle quench just before completion | Immediately after source-based artifact generation or at point of suspicion |
22
+ | **Primary attack vector** | Bus factor, self-reference, platform obsolescence | Phantom Claim, source not read, fabricated branching conditions |
23
+ | **Representative pattern** | "Declaration only, no evidence" | "Number in TC that doesn't exist in source" |
24
+
25
+ **Can be used together**: steel-quench Wave 1 real-code-based attack + source-grounding-audit Phantom marking can be run sequentially in the same session. But do not mix the roles of the two skills.
26
+
27
+ ---
28
+
29
+ ## Trigger Phrases
30
+
31
+ | Phrase | Situation |
32
+ |---|---|
33
+ | "phantom detection", "phantom claim", "false claim detection" | Full artifact Phantom scan (primary trigger) |
34
+ | "source back-trace", "source audit" | Analysis report, design document verification |
35
+ | "verify source", "where did this come from" | Suspecting origin of a specific claim |
36
+ | "TC evidence tracing", "TC source verification" | Post-TC-generation source consistency check |
37
+ | "grounding audit", "source grounding audit" | Full artifact Phantom scan |
38
+ | "verify evidence files" | Analysis report, design document verification |
39
+ | `/source-grounding-audit` | Explicit call |
40
+
41
+ ---
42
+
43
+ ## Core Concept — Phantom Claim
44
+
45
+ **Phantom Claim**: A claim that appears in the artifact but cannot be found in the declared source files.
46
+
47
+ 3 paths through which Phantoms are produced:
48
+
49
+ | Path | Description | Risk |
50
+ |---|---|:---:|
51
+ | **Source not read** | AI generates artifact using domain knowledge without Read-ing source | S |
52
+ | **Partial reading** | Source partially read, rest filled in with inference | A |
53
+ | **Reconstruction contamination** | Source was read but LLM modified values/conditions during paraphrase | A |
54
+
55
+ ---
56
+
57
+ ## Execution Steps
58
+
59
+ ### Step 0. Confirm Audit Target
60
+
61
+ If not provided by user, explicitly confirm: artifact file path, declared source files, and audit scope. Source not declared = S-grade blocker registered immediately.
62
+
63
+ > **Detail**: See `SKILL_detail.md §Step0-Detail` — confirmation output format and simplification guard — read when audit target or source list is ambiguous.
64
+
65
+ ---
66
+
67
+ ### Step 0.5. Claim Distribution Profile
68
+
69
+ > **Schema**: `knowledge/shared/harness-core/tpa_schema.md` — `phantom_risk` derivation rule, gate trigger conditions, §Gate Routing Table.
70
+
71
+ Runs after Step 0 (target + source confirmed). Skip if user specifies scope explicitly.
72
+
73
+ Scan artifact quickly to classify claim distribution:
74
+
75
+ | Dimension | Signal → Audit depth shift |
76
+ |---|---|
77
+ | `claim_density` | > 10 claims → full Step 1-4 audit; ≤ 3 claims → light (S+A only) |
78
+ | `artifact_type` | SKILL.md/design-doc → prioritize Branch/State-transition claims; code → prioritize Proper-noun/API claims |
79
+ | `risk_level` | external publish / arXiv citations → all claim types, max depth |
80
+ | `source_count` | 0 declared sources → S-grade blocker immediately (skip to Step 3 prescription) |
81
+ | `quantitative_density` | > 3 numerical claims → focus numerical+range types first |
82
+
83
+ Scope recommendation output:
84
+ ```
85
+ Claim types to prioritize: [list]
86
+ Audit depth: [full | prioritized | light]
87
+ Immediate blockers detected: [yes/no — 0 sources = immediate S-grade]
88
+ ```
89
+
90
+ **0-source behavioral rule**: When artifact has 0 declared sources, skip Steps 1-2 entirely and go directly to Step 3 with S-grade blocker: "Source not declared — all claims unverifiable."
91
+
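The profile table above can be read as a small decision function. The sketch below is one possible reading, not the skill's actual implementation: `ClaimProfile` and `recommend_scope` are hypothetical names, and mapping 4 to 10 claims to the `prioritized` depth is an assumption, since the table only defines the `> 10` and `≤ 3` cases.

```python
from dataclasses import dataclass


@dataclass
class ClaimProfile:
    claim_count: int        # claim_density signal
    source_count: int       # number of declared source files
    numerical_claims: int   # quantitative_density signal
    external_publish: bool  # risk_level signal (external publish / citations)


def recommend_scope(profile: ClaimProfile) -> dict:
    """Map a claim distribution profile to an audit-depth recommendation."""
    # 0 declared sources: immediate S-grade blocker, Steps 1-2 are skipped.
    if profile.source_count == 0:
        return {"depth": "blocked", "priority": [], "s_grade_blocker": True}

    if profile.external_publish or profile.claim_count > 10:
        depth = "full"
    elif profile.claim_count <= 3:
        depth = "light"
    else:
        depth = "prioritized"  # assumption: the table leaves 4-10 claims undefined

    # More than 3 numerical claims: audit numerical and range types first.
    priority = ["numerical", "range"] if profile.numerical_claims > 3 else []
    return {"depth": depth, "priority": priority, "s_grade_blocker": False}
```

For example, `recommend_scope(ClaimProfile(12, 2, 5, False))` recommends a full audit with numerical and range claims first.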
92
+ ---
93
+
94
+ ### Step 1. Claim Extraction (Artifact Scan)
95
+
96
+ Extract claims from the artifact that require source back-tracing. Claim types: Proper nouns (highest), Numerical/range values (highest), Branching conditions (highest), State transitions (high), Preconditions (high), Actors (medium). Exclude generic test methodology descriptions and generic UI patterns.
97
+
98
+ > **Detail**: See `SKILL_detail.md §Step1-Detail` — full claim types table with examples, exclude list, and Step 1 output format template — read when deciding which claims to include or format the extraction results.
99
+
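As a rough illustration of Step 1 for one claim type, the sketch below pulls numerical values out of an artifact and records their location in the `filename:line N` form used by the Step 1 output. The regular expression and the `extract_numeric_claims` helper are assumptions for illustration only. A real extraction also covers proper nouns, branching conditions, and the other claim types, which a simple pattern cannot find.

```python
import re

# A number with an optional unit, e.g. "5 minutes", "10%", "3.5".
# This is deliberately loose: it also captures a non-unit word after a number.
NUMERIC_CLAIM = re.compile(r"\b\d+(?:\.\d+)?(?:\s*%|\s+[A-Za-z]+)?")


def extract_numeric_claims(artifact_text: str, filename: str) -> list:
    """List numerical claims with their artifact location (filename:line N)."""
    claims = []
    for line_no, line in enumerate(artifact_text.splitlines(), start=1):
        for match in NUMERIC_CLAIM.finditer(line):
            claims.append({
                "claim": match.group(0),
                "type": "Numerical",
                "location": f"{filename}:line {line_no}",
            })
    return claims
```

Each returned entry corresponds to one row of the Step 1 extraction table and still has to be back-traced in Step 2.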
100
+ ---
101
+
102
+ ### Step 2. Source Read + Back-Trace
103
+
104
+ Back-trace each claim to the declared source files using Read + Grep directly — no inference judgment. Partial match is not treated as match.
105
+
106
+ Back-tracing classification:
107
+
108
+ | Classification | Criteria | Marking |
109
+ |---|---|:---:|
110
+ | **Grounded** | Claim directly confirmed in source | ✅ |
111
+ | **Partial** | Similar content in source but not exact match — needs re-confirmation | ⚠️ |
112
+ | **Phantom** | Cannot be found in source | ❌ |
113
+ | **Source-Missing** | Source itself cannot be Read or was not declared | 🔴 |
114
+
115
+ > **Detail**: See `SKILL_detail.md §Step2-Detail` — back-tracing execution procedure, classification decision rules, and Step 2 output format template — read when handling edge cases or formatting results.
116
+
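The four classifications can be sketched as a function over already-read source text. This is a minimal sketch under stated assumptions: `classify_claim` is a hypothetical helper, and "similar content" is approximated by a match that only succeeds after lowercasing and collapsing whitespace. The skill itself does not define similarity this narrowly, and unit-equivalent values such as "5 minutes" vs. "300 seconds" are not detected here.

```python
import re
from typing import Dict, Optional


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace (a stand-in for 'similar content')."""
    return re.sub(r"\s+", " ", text).strip().lower()


def classify_claim(claim: str, sources: Dict[str, Optional[str]]) -> str:
    """Return Grounded, Partial, Phantom, or Source-Missing for one claim.

    `sources` maps each declared path to its text, or None if it could not be read.
    """
    readable = [text for text in sources.values() if text is not None]
    if not readable:
        return "Source-Missing"  # nothing declared, or nothing could be read
    if any(claim in text for text in readable):
        return "Grounded"        # exact match only
    if any(_normalize(claim) in _normalize(text) for text in readable):
        return "Partial"         # near match: needs re-confirmation
    return "Phantom"
```

Keeping the exact-match check separate from the normalized one is what prevents a Partial from being silently promoted to Grounded.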
117
+ ---
118
+
119
+ ### Step 3. Phantom Classification + Prescription
120
+
121
+ Classify Phantom and Partial claims by severity and provide prescriptions.
122
+
123
+ **Severity classification criteria**:
124
+
125
+ | Severity | Criteria | Examples |
126
+ |:---:|---|---|
127
+ | **S** (Immediate blocker) | If this claim is wrong, TC could Pass-judge incorrect behavior | Monetary boundary values, branching conditions, status values |
128
+ | **A** (Must fix) | If this claim is wrong, TC cannot execute or runs wrong path | API endpoint names, field names, preconditions |
129
+ | **B** (Improvement recommended) | If this claim is wrong, TC can execute but intent may differ | Descriptive text, non-critical names |
130
+
131
+ Prescriptions: (1) Source Re-read — precisely re-read the relevant source section and fix; (2) Request source specification — when source doesn't exist or wasn't declared; (3) Delete/rewrite — delete claims without source grounding and rewrite from source.
132
+
133
+ > **Detail**: See `SKILL_detail.md §Step3-Detail` — prescription procedures and Step 3 output format template — read when writing the classification table or applying a prescription.
134
+
135
+ **S-grade Immediate Human Gate** — if 1+ S-grade Phantoms found, pause before Step 4/5 and surface:
136
+
137
+ ```
138
+ ⚠️ source-grounding-audit: N S-grade Phantom(s) found:
139
+ - [claim 1 — one-line summary, location]
140
+ - [claim 2 — one-line summary, location]
141
+
142
+ Options:
143
+ (a) Continue — AI proceeds to Step 4 pattern diagnosis + Step 5 re-audit
144
+ (b) Human review first — inspect Phantoms directly, then proceed
145
+ (c) Abort — fix sources manually and re-run audit
146
+
147
+ Waiting for input. (Default: a)
148
+ ```
149
+
150
+ Rationale: S-grade Phantoms that enter Step 5 re-audit without human review risk LLM reconstruction contamination — the same pattern that originally produced the Phantoms can "verify" its own fixes. Human review at this threshold breaks the loop.
151
+
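The gate condition above amounts to "pause if at least one S-grade Phantom exists". A minimal sketch, assuming a hypothetical `Finding` record and an `s_grade_gate` helper that returns the prompt text when the gate triggers and `None` otherwise:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Finding:
    summary: str
    location: str
    severity: str  # "S", "A", or "B"


def s_grade_gate(findings: List[Finding]) -> Optional[str]:
    """Return the human-gate prompt if any S-grade Phantom exists, else None."""
    blockers = [f for f in findings if f.severity == "S"]
    if not blockers:
        return None  # no pause needed; continue to Step 4
    lines = [f"source-grounding-audit: {len(blockers)} S-grade Phantom(s) found:"]
    lines += [f"- {f.summary}, {f.location}" for f in blockers]
    lines += [
        "",
        "Options:",
        "(a) Continue",
        "(b) Human review first",
        "(c) Abort",
        "",
        "Waiting for input. (Default: a)",
    ]
    return "\n".join(lines)
```

The caller is expected to stop and wait for input whenever the return value is not `None`.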
152
+ ---
153
+
154
+ ### Step 4. Source Not-Read Pattern Detection (Meta Diagnosis)
155
+
156
+ Analyze Phantom distribution to diagnose structural problems in the artifact generation process. Reveal "why were these Phantoms produced", not just "this TC is wrong".
157
+
158
+ **Pattern detection criteria**:
159
+
160
+ | Pattern | Detection Condition | Meaning |
161
+ |---|---|---|
162
+ | **Source not read** | 3+ Phantoms and no (or only partial) source Read history | AI generated using domain knowledge without reading source |
163
+ | **Partial reading contamination** | Partial items exceed 30% of total | AI read source partially and filled rest with inference |
164
+ | **Reconstruction modification** | Source value exists but unit/format/range modified in TC | LLM paraphrase process contamination |
165
+ | **Source declaration absent** | Source file not specified when generating artifact | Process design stage problem |
166
+
167
+ **Simplification guard**: If 0 Phantoms, skip Step 4 entirely. Replace with one line: "Source grounding adequate."
168
+
169
+ > **Detail**: See `SKILL_detail.md §Step4-Detail` — Step 4 output format template — read when writing the pattern diagnosis section.
170
+
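Three of the four detection conditions are simple counts over the Step 2 results and can be sketched as below. `diagnose_patterns` and its `read_history` values (`"full"`, `"partial"`, `"none"`) are hypothetical names. The reconstruction modification pattern is left out because it needs a value-by-value comparison against the source rather than a count.

```python
from typing import List


def diagnose_patterns(results: List[str], read_history: str,
                      source_declared: bool = True) -> List[str]:
    """Return the process patterns suggested by Step 2 classification labels."""
    patterns = []
    if not source_declared:
        patterns.append("Source declaration absent")

    phantoms = results.count("Phantom")
    if phantoms == 0:
        # Simplification guard: with 0 Phantoms, Step 4 is skipped.
        return patterns

    # 3+ Phantoms with no or only partial Read history.
    if phantoms >= 3 and read_history in ("none", "partial"):
        patterns.append("Source not read")

    # Partial items exceed 30% of all audited claims.
    if results.count("Partial") / len(results) > 0.30:
        patterns.append("Partial reading contamination")
    return patterns
```

An empty result with 0 Phantoms corresponds to the one-line report "Source grounding adequate."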
171
+ ---
172
+
173
+ ### Step 5. Post-Fix Re-audit (Optional)
174
+
175
+ Re-run back-trace for S-grade blocker claims after fixes are complete. Activate when 1+ S-grade blockers exist and fix is immediately possible.
176
+
177
+ **Done When (re-audit)**: Back-trace results for fixed claims all show Grounded (✅) status.
178
+
179
+ ---
180
+
181
+ ## Completion Declaration Format
182
+
183
+ > **Template**: See `SKILL_detail.md §Report-Template` — full completion declaration format — read when producing the final audit summary.
184
+
185
+ ---
186
+
187
+ ## Connected Skills
188
+
189
+ | Situation | Connected Skill |
190
+ |---|---|
191
+ | Simultaneously verify output patterns (self-declarations, cushion language) | `/steel-quench` Wave 1 "real-use verification" angle |
192
+ | Re-verify Phantom patterns from external user perspective | `/sim-conductor Area A` |
193
+ | Source not-read is a harness structure problem | `/harness-doctor` |
194
+ | Phantom pattern is a candidate for new rule items | `fh-meta:persona-innovator` |
195
+ | Redesign the artifact generation prompt itself | `/meta-prompt-builder` |
196
+
197
+ ---
198
+
199
+ ## External User Environment Adaptation
200
+
201
+ This skill can be used independently without the full meta-harness structure.
202
+
203
+ **How to declare source files**: When generating artifacts, specify "source: [file path list]", or provide source files when invoking this skill.
204
+
205
+ **External environment fallback**:
206
+ - If no `tracks/_meta/` → skip persistence step
207
+ - If no project-specific rules (like PFD) → output Phantom pattern summary only
208
+
209
+ ---
210
+
211
+ ## Done When
212
+
213
+ ```
214
+ Step 1 claim extraction complete
215
+ + Step 2 all claims back-traced (using Read tool — no inference judgment)
216
+ + Step 3 Phantom severity classification + prescription output
217
+ + Step 4 process pattern diagnosis complete (skip if 0 Phantoms)
218
+ + "source-grounding-audit Complete" declaration output
219
+ ```
220
+
221
+ Verdict: PASS (0 Phantom claims) | CONDITIONAL_PASS (B-grade Phantoms only, prescriptions noted) | FAIL (1+ S- or A-grade Phantom, e.g. broken path, phantom file, or stale external link) | ESCALATE (scope unclear or claim extraction impossible)
222
+
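The verdict rule can be sketched as a small function. It uses the S/A/B grades from Step 3 and assumes that only B-grade counts as low severity, which the text does not state explicitly. `verdict` and its parameters are illustrative names.

```python
from typing import List


def verdict(phantom_severities: List[str], scope_clear: bool = True) -> str:
    """Derive the audit verdict from the severities of the Phantoms found."""
    if not scope_clear:
        return "ESCALATE"          # scope unclear or extraction impossible
    if not phantom_severities:
        return "PASS"              # 0 Phantom claims
    if all(s == "B" for s in phantom_severities):
        return "CONDITIONAL_PASS"  # low-severity Phantoms only (assumed: B-grade)
    return "FAIL"                  # at least one S- or A-grade Phantom
```

For instance, `verdict(["B", "A"])` yields `FAIL` because a single A-grade Phantom is enough to fail the audit under this reading.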
223
+ ---
224
+
225
+ ## Operating Notes
226
+
227
+ - **Never back-trace by inference**: A judgment such as "this value is probably in the source" mislabels a Phantom as Partial. Always confirm directly with Read + Grep.
228
+ - **Partial is not Grounded**: Treating a similar value in the source as Grounded misses the reconstruction modification pattern.
229
+ - **An undeclared source is itself S-grade**: If no source is declared when an artifact is made, none of its claims can be verified afterwards. Recommend mandating source declaration at the process design stage.
230
+ - **Recommended for use with steel-quench**: steel-quench quenches structural flaws; source-grounding-audit ensures source consistency. The two skills are orthogonal, and artifact quality assurance is stronger when they are used together.
@@ -0,0 +1,182 @@
1
+ # source-grounding-audit — Execution Detail
2
+
3
+ On-demand reference. Load the section indicated by the pointer in SKILL.md.
4
+
5
+ ---
6
+
7
+ ## §Step0-Detail
8
+
9
+ **Step 0 — Audit Target Confirmation**
10
+
11
+ If not provided by user, explicitly confirm all three items:
12
+
13
+ 1. **Artifact file**: Path to audit target file (TC file, analysis report, design document, etc.)
14
+ 2. **Declared source files**: List of source file paths that should be the basis for artifact generation. An undeclared source is itself registered as an S-grade blocker
15
+ 3. **Audit scope**: Full artifact, a specific section, or a specific claim type
16
+
17
+ Output format:
18
+
19
+ ```
20
+ Audit target:
21
+ Artifact: {file path}
22
+ Declared source: {file path list or "not declared"}
23
+ Audit scope: {full / specific section / specific claim type}
24
+ ```
25
+
26
+ **Simplification guard**: If source is clear and audit scope is single section, skip Step 0 output and go straight to Step 1.
27
+
28
+ ---
29
+
30
+ ## §Step1-Detail
31
+
32
+ **Step 1 — Claim Extraction Full Reference**
33
+
34
+ **Claim types to extract**:
35
+
36
+ | Type | Examples | Priority |
37
+ |---|---|:---:|
38
+ | **Proper nouns** | API endpoint names, field names, status values, screen names | Highest |
39
+ | **Numerical/range values** | Amounts, time, ratios, counts, thresholds | Highest |
40
+ | **Branching conditions** | if/else branches, exception cases, error codes | Highest |
41
+ | **State transitions** | Conditions for A state → B state, allowed/forbidden combinations | High |
42
+ | **Preconditions** | "only when ~", "when ~ is active" | High |
43
+ | **Actors** | System, user, external API role distinctions | Medium |
44
+
45
+ **Exclude from extraction** (no source back-tracing needed):
46
+ - General test methodology descriptions ("using boundary value analysis")
47
+ - Generic UI patterns ("click button then verify result")
48
+
49
+ **Step 1 output format**:
50
+
51
+ ```
52
+ ## Step 1 — Claim Extraction Results
53
+
54
+ | # | Claim | Type | Location (artifact file:line) |
55
+ |:---:|---|:---:|---|
56
+ | 1 | [claim content] | Proper noun/Numerical/Branch | [filename:line N] |
57
+ ...
58
+
59
+ Total {N} extracted (Proper nouns N / Numerical N / Branch N / State transition N / Precondition N / Actor N)
60
+ ```
61
+
62
+ ---
63
+
64
+ ## §Step2-Detail
65
+
66
+ **Step 2 — Source Read + Back-Trace Execution Detail**
67
+
68
+ **Back-tracing principles**:
69
+ - Read source files directly with the Read tool — do not judge from memory or inference
70
+ - Use Grep to confirm exact value, keyword, or pattern match
71
+ - **Partial match is not treated as match** — e.g., if source has "5 minutes" and TC has "300 seconds", treat as requiring separate confirmation
72
+
73
+ **Classification decision rules**:
74
+
75
+ | Classification | Criteria | Marking |
76
+ |---|---|:---:|
77
+ | **Grounded** | Claim directly confirmed in source | ✅ |
78
+ | **Partial** | Similar content in source but not exact match — needs re-confirmation | ⚠️ |
79
+ | **Phantom** | Cannot be found in source | ❌ |
80
+ | **Source-Missing** | Source itself cannot be Read or was not declared | 🔴 |
81
+
82
+ **Step 2 output format**:
83
+
84
+ ```
85
+ ## Step 2 — Source Back-Trace Results
86
+
87
+ | # | Claim | Back-Trace Result | Source Evidence (file:line) | Notes |
88
+ |:---:|---|:---:|---|---|
89
+ | 1 | [claim] | ✅/⚠️/❌/🔴 | [filename:line N or "none"] | [modifications, etc.] |
90
+ ...
91
+
92
+ Grounded: N / Partial: N / Phantom: N / Source-Missing: N
93
+ ```
94
+
95
+ ---
96
+
97
+ ## §Step3-Detail
98
+
99
+ **Step 3 — Prescription Procedures + Output Format**
100
+
101
+ **3 Prescriptions — detailed execution**:
102
+
103
+ 1. **Source Re-read**: Precisely Read the relevant section of that source file again → fix the claim. Use Read with line-range targeting if the source is large.
104
+ 2. **Request source specification**: When source doesn't exist or wasn't declared → ask user to specify source file. Do not proceed until source is provided for S-grade items.
105
+ 3. **Delete/rewrite**: TCs/claims without source grounding should be deleted and rewritten based on source. Rewrite must start from source Read, not from the existing artifact text.
106
+
107
+ **Step 3 output format**:
108
+
109
+ ```
110
+ ## Step 3 — Phantom Classification + Prescription
111
+
112
+ ### S-grade Immediate Blockers
113
+
114
+ | # | Claim (Phantom) | Prescription | Evidence |
115
+ |:---:|---|---|---|
116
+ | 1 | [claim] | Source Re-read / Request source specification / Delete rewrite | [source file specified or reason for absence] |
117
+
118
+ ### A-grade Must Fix
119
+
120
+ | # | Claim (Phantom/Partial) | Prescription | Notes |
121
+ |:---:|---|---|---|
122
+ ...
123
+
124
+ ### B-grade Improvement Recommended
125
+
126
+ | # | Claim (Partial) | Prescription | Notes |
127
+ |:---:|---|---|---|
128
+ ...
129
+
130
+ S-grade: N / A-grade: N / B-grade: N
131
+ ```
132
+
133
+ ---
134
+
135
+ ## §Step4-Detail
136
+
137
+ **Step 4 — Pattern Diagnosis Output Format**
138
+
139
+ ```
140
+ ## Step 4 — Source Not-Read Pattern Diagnosis
141
+
142
+ Detected pattern: {pattern name or "none"}
143
+ Evidence: {Phantom/Partial distribution analysis}
144
+
145
+ Process prescription:
146
+ - [specific process improvement suggestions]
147
+ ```
148
+
149
+ ---
150
+
151
+ ## §Report-Template
152
+
153
+ **Completion Declaration Format**
154
+
155
+ ```
156
+ ## source-grounding-audit Complete
157
+
158
+ Audit scope: {artifact file} / source {N files}
159
+ {N} total claims audited
160
+
161
+ Result summary:
162
+ ✅ Grounded: N
163
+ ⚠️ Partial: N (fix recommended)
164
+ ❌ Phantom: N (S: N / A: N / B: N)
165
+ 🔴 Source-Missing: N
166
+
167
+ Process pattern: {detected pattern or "none"}
168
+
169
+ Next actions:
170
+ - S-grade Phantom → immediately Source Re-read then fix
171
+ - Source not-read pattern detected → add source Read prerequisite to artifact generation process
172
+ - 3+ Phantoms → recommend using with steel-quench Wave 1 "real-use verification" angle
173
+ - Repeated pattern detected → persist to tracks/_meta/ + propose as rule candidate
174
+ ```
175
+
176
+ ---
177
+
178
+ ## §Evidence
179
+
180
+ **Evidence Record**
181
+
182
+ - **Verified in practice**: TCs were generated without reading the source files and passed steel-quench, but the source-grounding-audit back-trace then detected numerous Phantoms (notifications vs. push notifications, version names vs. non-enrolled, bottom sheet vs. screen navigation). **Procedure**: Read the sources in order, regenerate, and replace the originals with source-based TCs. **Recurrence prevention**: A source gate that raises FileNotFoundError if required source files are absent. **Why steel-quench misses this**: The outputs look logically sound, so pattern attacks cannot identify Phantoms; only source back-tracing can detect them.
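The source gate mentioned in the evidence record could look roughly like the following. This is a sketch under assumptions, not the package's actual gate: `require_sources` is a hypothetical helper, and treating an empty source list as an error is an interpretation of the "source not declared" rule.

```python
from pathlib import Path
from typing import List


def require_sources(paths: List[str]) -> List[Path]:
    """Raise FileNotFoundError unless every declared source file exists."""
    if not paths:
        # Assumption: an empty declaration is treated like a missing file.
        raise FileNotFoundError("no source files declared")
    resolved = [Path(p) for p in paths]
    missing = [str(p) for p in resolved if not p.is_file()]
    if missing:
        raise FileNotFoundError(
            "required source files absent: " + ", ".join(missing)
        )
    return resolved
```

Calling this before artifact generation makes a missing source fail loudly instead of letting generation proceed from model knowledge alone.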
@@ -0,0 +1,226 @@
1
+ ---
2
+ name: steel-quench
3
+ description: >-
4
+ A meta-skill that concretizes a designer's anxiety into AI-driven all-angle challenger attacks (via fh-commons:quench-challenger) and shakes off flaws through defensive rounds. Systematically surfaces root weaknesses of near-complete projects wave by wave, guaranteeing near-human-review quality without direct human deep inspection. Wave 4 (Meta-Aware Adversary) is an advanced mode where the challenger uses its own AI nature — hallucination, context collapse, prompt injection, tool lock-in — as attack vectors. Built-in fh-commons:quench-challenger agent outputs harness structure 6-axis attack+prescription pairs; after convergence, fh-meta:persona-innovator auto-extracts new patterns. Triggered by: "quench this", "devil's judgment", "all-angle review", "end-to-end verification", "steel quench", "deep pre-completion inspection", "shake out design anxiety", "attack from the root".
5
+ user-invocable: true
6
+ allowed-tools: ["Read", "Write", "Edit", "Bash", "Grep", "Glob", "WebSearch", "Agent"]
7
+ model: opus
8
+ ---
9
+
10
+ # steel-quench — All-Angle Verification Meta-Skill
11
+
12
+ > Heating steel and plunging it into water brings internal defects to the surface. quench-challenger attacks → defense → repeat = systematic surfacing and elimination of design flaws.
13
+
14
+ A designer's anxiety is most dangerous when vague. steel-quench breaks that anxiety into concrete attack angles, defends against them, and closes with residual risks explicitly stated.
15
+
16
+ > **Scope boundary**: steel-quench stress-tests a **near-complete artifact** (post-build). For pre-build design decisions → `deliberation`. For completed-asset validation → `sim-conductor`.
17
+
18
+ ## Trigger Phrases
19
+
20
+ | Phrase | Situation |
21
+ |---|---|
22
+ | "quench this", "run quench" | All-angle verification just before completion |
23
+ | "devil's judgment" | Focused challenger attack on specific design decision |
24
+ | "all-angle review", "end-to-end verification" | Full project scope verification |
25
+ | "shake out design anxiety", "deep pre-completion inspection" | Concretize vague anxiety |
26
+ | "attack from the root" | Re-verify from reason for existence |
27
+ | "diagnose with counterexample", "use this bad case as reference" | Phase 0 calibration |
28
+ | `/steel-quench` | Explicit call |
29
+
30
+ ---
31
+
32
+ ## Wave Structure
33
+
34
+ | Wave | Role | Termination |
35
+ |---|---|---|
36
+ | **Phase 0** (optional) | Counterexample calibration — extract patterns from external bad cases, merge into Wave 1 | No external case → skip |
37
+ | **Wave 1** | Challenger attack (quench-challenger) — surface critical flaws, no defense | — |
38
+ | **Wave 2** | Defense — defend or state as residual risk | — |
39
+ | **Wave 3+** | Convergence — repeat until zero new S-grade | Zero new S-grade |
40
+ | **Wave 4** (optional) | Meta-Aware Adversary — AI uses its own nature as attack vector | Zero new S-grade + AI-specific criteria |
41
+ | **Wave-P3** (reserved) | Domain gate integration slot | Future use |
42
+ | **Wave 5** (optional) | Multi-Team Adversarial Panel — external CLIs or cross-session Claude | Zero new S-grade cross-team |
43
+
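The Wave 3+ termination condition in the table above (repeat until zero new S-grade) can be sketched as a loop. `run_until_converged` and the `run_round` callback are hypothetical, and the `max_rounds` cap is an added safety assumption that the table does not mention.

```python
from typing import Callable, List, Set, Tuple


def run_until_converged(run_round: Callable[[int], List[str]],
                        max_rounds: int = 5) -> Tuple[int, Set[str]]:
    """Repeat attack/defense rounds until a round surfaces no new S-grade item.

    `run_round(n)` returns the S-grade findings reported in round n.
    Returns the round number at which convergence was reached and all
    S-grade findings seen so far.
    """
    seen: Set[str] = set()
    for round_no in range(1, max_rounds + 1):
        new = set(run_round(round_no)) - seen
        if not new:
            return round_no, seen  # zero new S-grade: converged
        seen |= new
    raise RuntimeError(f"no convergence after {max_rounds} rounds")
```

Findings that reappear in a later round do not count as new, so only genuinely novel S-grade items keep the loop running.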
44
+ ---
45
+
46
+ ## Step 0.3 — Artifact Vulnerability Profile
47
+
48
+ > **Schema**: `knowledge/shared/harness-core/tpa_schema.md` — canonical artifact_type/risk_level/phantom_risk derivation, gate routing, meta-harness broadcast multiplier.
49
+
50
+ Runs when steel-quench is invoked without a specific wave restriction.
51
+ Skip if user specifies exact waves (e.g. "run Wave 1 and Wave 4 only").
52
+
53
+ Read target artifact → classify vulnerability surface:
54
+
55
+ | Dimension | Signal → Wave weight shift |
56
+ |---|---|
57
+ | `artifact_type` | SKILL.md/design-doc → Wave 2 (structural defense) weight↑ · bash/code → Wave 1 (real-code) weight↑ · external publish imminent → Wave 5 (cross-team) weight↑ |
58
+ | `phantom_risk` | citations/arXiv/DOIs/http URLs present → Wave 3 (source-grounding) weight↑ |
59
+ | `claim_density` | 3+ benefit claims → Wave 1 U3 (evidence grounding) angle weight↑ |
60
+ | `novelty` | first-of-its-kind pattern → Wave 4 (convergence) weight↑ |
61
+ | `scope` | internal-only doc → Wave 5 (external CLI) weight=0 (skip) |
62
+
63
+ Wave selection output:
64
+ ```
65
+ Run: [list of selected waves with rationale]
66
+ Skip: [list of skipped waves with reason]
67
+ External CLIs available: [yes/no → Wave 5 available]
68
+ ```
69
+
70
+ **Degraded coverage rule**: if a high-weight wave or capability is skipped (user choice, unavailable tool, or scope=internal), flag explicitly in the output header — do not silently proceed.
71
+
72
+ ---
73
+
74
+ ## Step 0.4 — Specialized Reviewer Discovery
75
+
76
+ For the target artifact, scan installed agents for a domain-specific adversarial reviewer:
77
+
78
+ 1. Check `.claude/agents/` for a reviewer matching `artifact_type`
79
+ 2. Built-in fallback: `fh-commons:quench-challenger` (general-purpose adversarial review)
80
+ 3. GAP for high-risk artifact: query `/plugin-recommender "adversarial reviewer for [artifact_type]"` → user: install / skip / use fallback
81
+
82
+ **Runtime adapter note**: In Claude Code, invoke the fallback as an isolated `Agent(subagent_type="fh-commons:quench-challenger")`. In Codex-primary or other non-Claude runtimes, use the FH adapter instead:
83
+
84
+ ```bash
85
+ FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-run \
86
+ --agent fh-commons:quench-challenger \
87
+ --file {target-artifact}
88
+ ```
89
+
90
+ Treat the adapter output as the isolated challenger result for Wave 1. This preserves the same workflow without depending on Claude Code's Agent tool.
91
+
92
+ **Wave 5 activation rule**: Wave 5 (external CLI team) is only activated when `scope` is not internal-only AND external CLIs are available AND risk_level is high or user explicitly requests it.
93
+
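The Wave 5 activation rule can be written as a single boolean expression. The sketch below assumes the reading "not internal-only AND CLIs available AND (high risk OR explicit request)", since the sentence does not make the grouping of the last two conditions explicit. `wave5_active` and its parameter names are illustrative.

```python
def wave5_active(scope: str, external_clis_available: bool,
                 risk_level: str, user_requested: bool = False) -> bool:
    """Decide whether Wave 5 (external CLI team) should run."""
    return (
        scope != "internal-only"
        and external_clis_available
        # Assumed grouping: either high risk or an explicit user request.
        and (risk_level == "high" or user_requested)
    )
```

Under this reading an explicit user request cannot override an internal-only scope or missing external CLIs. If the intended grouping differs, only the parenthesization changes.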
94
+ > **Detail**: See `SKILL_detail.md §ArtifactProfile` — worked examples (SKILL.md, bash script, README, design doc with citations) showing wave selection and rationale — read when classifying an unfamiliar artifact type.
95
+
96
+ ---
97
+
98
+ ## Wave 1 — 5 Mandatory Attack Angles
99
+
100
+ **Execution principles**: Attacks must be based on real code/files/configs — abstract criticism prohibited.
101
+ Assign severity: **S** (immediate blocker) / **A** (required before deployment) / **B** (improvement recommended).
102
+ Call **fh-commons:quench-challenger** in isolation first (6-axis structural attack); apply 5 angles in parallel.
103
+
104
+ Isolation can be achieved by Claude Code `Agent(...)` or by `fh-run --agent fh-commons:quench-challenger` under Codex. Do not run the challenger inline in the same reasoning pass when the attack result gates the defense.
105
+
106
+ | # | Attack Angle | Core Question |
107
+ |:---:|---|---|
108
+ | 1 | **Reason for existence** | "Why this structure? Is there no simpler alternative?" |
109
+ | 2 | **Real-use verification** | "Does what's written in the docs actually match the real code?" |
110
+ | 3 | **Bus factor** | "Single-person dependency — can it operate if that person is absent?" |
111
+ | 4 | **Platform obsolescence** | "Does this structure survive when the external ecosystem expands or changes?" |
112
+ | 5 | **Self-referential structure** | "Is there a closed circuit that evaluates itself by its own criteria?" |
113
+
114
+ **S-grade Immediate Human Gate**: If Wave 1 contains 1+ S-grade blocker → pause, surface options (a) proceed to Wave 2 / (b) human review first / (c) abort. Do not silently enter Wave 2 with unreviewed S-grade items.
115
+
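The S-grade gate can be sketched as a check over the severities assigned in Wave 1. This is a minimal illustration, not the skill's actual implementation; `wave1_gate` and its return shape are hypothetical, while the three options are taken from the rule text.

```python
from collections import Counter
from typing import Dict, Iterable, List, Union


def wave1_gate(severities: Iterable[str]) -> Dict[str, Union[bool, List[str]]]:
    """Decide whether to pause for a human after Wave 1."""
    counts = Counter(severities)
    if counts["S"] >= 1:
        # One or more S-grade blockers: never enter Wave 2 silently.
        return {
            "pause": True,
            "options": ["proceed to Wave 2", "human review first", "abort"],
        }
    return {"pause": False, "options": []}


print(wave1_gate(["A", "S", "B"])["pause"])  # True
print(wave1_gate(["A", "B"])["pause"])       # False
```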
116
+ > **Detail**: See `SKILL_detail.md §Wave1` — Wave 1 output format, optional numeric score, quench-challenger invocation.
117
+
118
+ ---
119
+
120
+ ## Wave 2 — Defense Principles
121
+
122
+ **3 Defense Principles**: (1) Reinforce with external cases via WebSearch — "unique to us" or "structural pattern"?
123
+ (2) Cover with experience — other project cases defend bus factor. (3) Prioritize immediate implementation over logical construction.
124
+
125
+ **Classification**: Immediate implementation (this session) / Long-term improvement (residual risk card) / Structural acceptance (declare with rationale).
126
+
127
+ **"Brain in a Vat + Sandboxed Adversary"**: Challenger attacks only static code (isolated). Defender brings living system evidence. This asymmetry makes Wave 2 structurally superior to Wave 1.
128
+
129
+ > **Detail**: See `SKILL_detail.md §Wave2` — Wave 2 output format, full Brain-in-Vat principle.
130
+
131
+ ---
132
+
133
+ ## Wave 4 — Meta-Aware Adversary (5 Attack Angles)
134
+
135
+ The challenger (quench-challenger in Wave 4 mode) knows it's running in an isolated sub-agent sandbox and uses that knowledge as a weapon.
136
+
137
+ | # | Attack Angle | Core Question |
138
+ |:---:|---|---|
139
+ | W4-1 | **AI dependency single point of failure** | "If Claude API goes down, does core harness functionality drop to zero?" |
140
+ | W4-2 | **Context Collapse** | "When initial instructions are lost to context compression, does harness go silent?" |
141
+ | W4-3 | **Prompt Injection exposure** | "Can external data overwrite harness internal rules?" |
142
+ | W4-4 | **Hallucination cumulative contamination** | "Do Wave defense arguments rely on LLM inference rather than actual measurement?" |
143
+ | W4-5 | **Tool Dependency Lock-in** | "If a specific MCP/plugin/tool is removed, does harness functionality collapse?" |
144
+
145
+ Wave 4 convergence = Wave 3 criteria + 3 AI-specific vectors actually reviewed + hallucination defense based on original file references.
146
+
147
+ > **Detail**: See `SKILL_detail.md §Wave4` — Wave 4 output format, defense principles, convergence criteria, activation declaration format.
148
+
149
+ ---
150
+
151
+ ## Cross-Project Common Patterns (initial seed)
152
+
153
+ | # | Pattern Name | Description | Response Direction |
154
+ |:---:|---|---|---|
155
+ | P1 | **Single-person bus factor** | System paralysis when core operator absent | Document, automate, formalize delegation |
156
+ | P2 | **Doc-code mismatch** | Documented behavior differs from actual code | Re-sync to real code as ground truth |
157
+ | P3 | **Self-referential diagnosis** | Creator validates — internal viewpoint closed circuit | Connect external persona or sim-conductor |
158
+ | P4 | **No real-use verification** | Theoretically designed but never executed | Mandate 1 cold-start simulation |
159
+ | P5 | **Platform obsolescence unplanned** | No response to external ecosystem changes | Quarterly frontier diagnosis |
160
+ | P6 | **AI dependency single point of failure** | Claude API/MCP removal causes collapse | Document graceful degradation + fallback |
161
+ | P7 | **Hallucination-contaminated defense** | Defense relies on LLM inference, not measurement | Mandate citing original file/commit/value |
162
+ | P8 | **Context Collapse unguarded** | Key instructions lost to compression → harness silent | Review repeated insertion of compact instructions in CLAUDE.md |
163
+
164
+ Add new rows as new patterns are discovered.
165
+
166
+ ---
167
+
168
+ ## Done When
169
+
170
+ ```
171
+ Wave convergence criteria met: zero new S-grade blockers
172
+ + Residual risk card output (A-grade · B-grade items)
173
+ + "steel-quench Complete" declaration output
174
+ ```
175
+
176
+ Verdict: PASS (zero S-grade, convergence reached) | CONDITIONAL_PASS (A/B-grade items remain) | FAIL (S-grade blockers persist) | ESCALATE (structural ambiguity requiring human judgment)
177
+
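The four verdicts can be sketched as a small decision function. The names below are hypothetical, and the order of the checks (ESCALATE first, then FAIL, then CONDITIONAL_PASS) is an assumption; the source lists the verdicts but not their precedence. The function is assumed to run after the final Wave round.

```python
def verdict(s_open: int, a_open: int, b_open: int,
            needs_human_judgment: bool = False) -> str:
    """Map open findings after the final Wave round to a verdict label."""
    if needs_human_judgment:
        return "ESCALATE"          # structural ambiguity: a human decides
    if s_open > 0:
        return "FAIL"              # S-grade blockers persist
    if a_open > 0 or b_open > 0:
        return "CONDITIONAL_PASS"  # only A/B-grade items remain
    return "PASS"                  # nothing open, convergence reached


print(verdict(0, 2, 1))  # CONDITIONAL_PASS
print(verdict(1, 0, 0))  # FAIL
print(verdict(0, 0, 0))  # PASS
```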
178
+ ---
179
+
180
+ ## Convergence Criteria + Downstream Chaining
181
+
182
+ ### Convergence Criteria
183
+ 1. **Zero new S-grade blockers** → terminate
184
+ 2. A-grade or higher complex improvements → turn into a skill with `/meta-prompt-builder`
185
+ 3. Full Wave results → recommend persisting to `tracks/_meta/steel_quench_YYYY_MM_DD_{slug}.md`
186
+
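The recommended persistence path follows a fixed naming pattern, which might be built as shown below. The helper name and the example slug are hypothetical; only the `tracks/_meta/steel_quench_YYYY_MM_DD_{slug}.md` pattern comes from the list above.

```python
from datetime import date
from pathlib import PurePosixPath


def steel_quench_report_path(slug: str, on: date) -> PurePosixPath:
    """Build tracks/_meta/steel_quench_YYYY_MM_DD_{slug}.md for a given date."""
    # date supports strftime codes inside an f-string format spec.
    return PurePosixPath("tracks/_meta") / f"steel_quench_{on:%Y_%m_%d}_{slug}.md"


print(steel_quench_report_path("fh_gate_readme", date(2025, 1, 31)))
# tracks/_meta/steel_quench_2025_01_31_fh_gate_readme.md
```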
187
+ ### Connected Skills
188
+
189
+ | Situation | Connected Skill | Mandatory? |
190
+ |---|---|:---:|
191
+ | Delegate improvements as prompts | `/meta-prompt-builder` | optional |
192
+ | **External publish: re-validate from external user perspective** | **`/sim-conductor Area A`** | **mandatory** |
193
+ | Re-validate structural decision | `/verify-bidirectional` | optional |
194
+ | Attack angle is a harness structure problem | `/harness-doctor` | optional |
195
+ | After Wave convergence, propose new pattern rules | `fh-meta:persona-innovator` | optional |
196
+ | Wave 1 structure-specific attack (6-axis) | `fh-commons:quench-challenger` | priority |
197
+ | Back-trace whether claims exist in source files | `/source-grounding-audit` | **mandatory** when `phantom_risk=true` OR `scope=external` (see tpa_schema.md §Gate Routing Table) |
198
+
199
+ **steel-quench → sim-conductor gate**: After Wave convergence in external-publish context, `/sim-conductor Area A` is the mandatory next step.
200
+
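The two mandatory rows of the table can be expressed as a routing helper. This is a sketch under assumptions: the function and parameter names are hypothetical, and it treats "external publish" and `scope=external` as separate inputs because the source does not say whether they are the same flag.

```python
from typing import List


def mandatory_followups(external_publish: bool, phantom_risk: bool,
                        scope: str) -> List[str]:
    """List the downstream skills that are mandatory after Wave convergence."""
    followups: List[str] = []
    if external_publish:
        # Re-validate from the external user perspective.
        followups.append("/sim-conductor Area A")
    if phantom_risk or scope == "external":
        # Back-trace whether claims exist in source files.
        followups.append("/source-grounding-audit")
    return followups


print(mandatory_followups(True, False, "external"))
# ['/sim-conductor Area A', '/source-grounding-audit']
print(mandatory_followups(False, False, "internal-only"))
# []
```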
201
+ ### Required Pre-External-Deployment Sequence
202
+
203
+ ```
204
+ steel-quench convergence (zero new S-grade)
205
+ ↓ pass residual risk list
206
+ sim-conductor Area A (external user perspective)
207
+ ↓ new items found that steel-quench missed?
208
+ ├── yes → additional steel-quench Wave round
209
+ └── no → deployment approved
210
+ ```
211
+
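The sequence in the diagram above amounts to a loop that alternates steel-quench and sim-conductor until sim-conductor finds nothing new. The sketch below passes both steps in as callables, since their real invocation is not shown here; all names are hypothetical, and the `max_rounds` cap is an added assumption that the source does not specify.

```python
from typing import Callable, List


def pre_deployment_sequence(
    run_steel_quench: Callable[[List[str]], List[str]],
    run_sim_conductor: Callable[[List[str]], List[str]],
    max_rounds: int = 5,  # assumption: a safety cap, not part of the source
) -> bool:
    """Return True when deployment is approved, False if the cap is hit."""
    carry_over: List[str] = []
    for _ in range(max_rounds):
        # Run Waves to convergence (zero new S-grade); get the residual risk list.
        residual = run_steel_quench(carry_over)
        # External user perspective: items steel-quench missed.
        missed = run_sim_conductor(residual)
        if not missed:
            return True          # no new items: deployment approved
        carry_over = missed      # new items: additional Wave round
    return False


# Stub run: sim-conductor finds one item in round 1, none in round 2.
findings = iter([["unclear install step"], []])
approved = pre_deployment_sequence(
    lambda extra: ["A: doc drift"],
    lambda residual: next(findings),
)
print(approved)  # True
```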
212
+ > **Detail**: See `SKILL_detail.md §Wave5` — Wave 5 Multi-Team Panel (team formation bash, parallel dispatch, cross-team synthesis) — read when activating `--sidecar` flag. See `SKILL_detail.md §Structural-Defense` for meta-harness defense layering explanation.
213
+
214
+ ---
215
+
216
+ ## Operating Notes
217
+
218
+ - **Do not defend in Wave 1.** Mixing attack and defense modes dulls the attack's edge.
219
+ - **Attacks without real code are invalid.** Abstract criticism is not included in Wave 1 results.
220
+ - **quench-challenger first.** Call fh-commons:quench-challenger in isolation in Wave 1 if available.
221
+ - **Always check self-referential pattern (P3).** Cross-validate Wave results with external criteria.
222
+ - **Attack surface limit**: steel-quench attacks patterns in output content only. Delegate Phantom Claim detection to `source-grounding-audit`.
223
+
224
+ ## Failure Fallback
225
+
226
+ On Claude API / MCP failure → refer to [`references/fallback-guide.md`](../../references/fallback-guide.md).