@chrono-meta/fh-gate 1.2.2 → 1.4.0

Files changed (35)
  1. package/AGENTS.md +7 -4
  2. package/CATALOG.md +6 -1
  3. package/CHEATSHEET.md +125 -1
  4. package/CLAUDE.md +49 -6
  5. package/README.md +79 -20
  6. package/docs/codex-compat.md +4 -4
  7. package/docs/pillars.svg +26 -29
  8. package/knowledge/shared/harness-core/fh_integration_contract.md +1 -1
  9. package/package.json +1 -2
  10. package/plugins/fh-commons/skills/deliberation/SKILL.md +1 -1
  11. package/plugins/fh-meta/agents/beginner.md +104 -0
  12. package/{.claude → plugins/fh-meta}/agents/challenger.md +3 -1
  13. package/plugins/fh-meta/agents/expert.md +114 -0
  14. package/plugins/fh-meta/agents/main-player.md +106 -0
  15. package/plugins/fh-meta/skills/agent-composer/SKILL.md +2 -2
  16. package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +2 -2
  17. package/plugins/fh-meta/skills/apex-review/SKILL.md +1 -1
  18. package/plugins/fh-meta/skills/edit-manifest/SKILL.md +1 -1
  19. package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +1 -1
  20. package/plugins/fh-meta/skills/install-wizard/SKILL.md +54 -30
  21. package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +1 -1
  22. package/plugins/fh-meta/skills/phantom-quench/SKILL.md +248 -0
  23. package/plugins/fh-meta/skills/{source-grounding-audit → phantom-quench}/SKILL_detail.md +3 -3
  24. package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +10 -10
  25. package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +77 -1
  26. package/plugins/fh-meta/skills/return-path-gate/SKILL.md +2 -2
  27. package/plugins/fh-meta/skills/sim-conductor/SKILL.md +91 -24
  28. package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +18 -18
  29. package/plugins/fh-meta/skills/skill-splitter/SKILL.md +4 -4
  30. package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +2 -2
  31. package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +27 -215
  32. package/plugins/fh-meta/skills/steel-quench/SKILL.md +24 -2
  33. package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +8 -8
  34. package/scripts/fh-gate.sh +3 -9
  35. package/scripts/fh-run.sh +1 -1
@@ -26,15 +26,15 @@ category: Composability Gate
26
26
  > See `README.md > Advanced Settings > Plugin Install` for detailed guide.
27
27
 
28
28
  Run immediately after cloning forge-harness (FH), or when setting up a new project for the first time.
29
- Sets up periodic notification structure (zshrc hook) and weekly audit notifications within Claude Code (CC) sessions. The zshrc hook is permanently applied; CronCreate is valid only for the current session.
29
+ Sets up the periodic-audit notification structure: a permanent zshrc hook (`fh_audit_check.zsh`, runs on terminal start) plus FH's session-start mtime detection. Both surface a weekly-audit reminder when 7+ days have elapsed since the last `weekly_audit`; no persistent cron is used (a session-scoped scheduler cannot survive to fire on a later day).
30
30
 
31
31
  ## Key Terms
32
32
 
33
33
  | Term | Definition |
34
34
  |---|---|
35
35
  | **sentinel** | An empty file that records whether a specific event (audit complete, install complete, etc.) has occurred. Created in `~/.cc_sentinels/`. |
36
- | **CronCreate** | Claude Code built-in command — schedules periodic tasks valid for the current session. Disappears when session ends. |
37
36
  | **zshrc hook** | Shell function added to `~/.zshrc`. Automatically runs on terminal start and applies permanently. |
37
+ | **session-start detection** | FH's durable weekly-audit cadence — at session start the mtime of the latest `weekly_audit_*` is checked and `/harvest-loop` is proposed if 7+ days elapsed (see CLAUDE.md Cadence Rules). No persistent scheduler required. |
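
A minimal sketch of the session-start check described in the Key Terms table, assuming the `weekly_audit_*` files sit directly under `$CC_HUB_DIR/tracks/_audit` and that a missing audit file counts as overdue. The function name `weekly_audit_overdue` is hypothetical and is not part of FH.

```python
import time
from pathlib import Path
from typing import Optional

SEVEN_DAYS = 7 * 24 * 60 * 60  # threshold in seconds


def weekly_audit_overdue(audit_dir: Path, now: Optional[float] = None) -> bool:
    """Return True when the newest weekly_audit_* file is 7+ days old.

    Assumption (not stated in the source): no audit file at all is
    treated as overdue.
    """
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in audit_dir.glob("weekly_audit_*")]
    if not mtimes:
        return True
    return now - max(mtimes) >= SEVEN_DAYS
```

Because the check only compares file mtimes at session start, nothing has to stay running between sessions, which is the property the session-scoped scheduler lacked.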
38
38
 
39
39
  ## Execution Modes
40
40
 
@@ -51,7 +51,7 @@ Sets up periodic notification structure (zshrc hook) and weekly audit notificati
51
51
  - **Per-item approval**: Select each item individually (Y approve / N skip / L later)
52
52
  - **Double-confirm irreversible changes**: Preview before file writes and zshrc modifications
53
53
  - **User review before PR creation**: Output PR parameters (title, base branch, included files, body) and get approval before execution. No automatic submission.
54
- - **Periodic audit structure setup**: zshrc hook (permanently applied on terminal start) + sentinel initialization + CronCreate (valid for current CC session)
54
+ - **Periodic audit structure setup**: zshrc hook (permanently applied on terminal start) + sentinel initialization + session-start mtime detection (7-day threshold)
55
55
 
56
56
  ## Execution Steps
57
57
 
@@ -138,9 +138,13 @@ echo 'source ~/.cc_secrets/tokens.env' >> ~/.zshrc
138
138
  **The following are environment detection procedures that CC executes automatically. No need for users to run manually.**
139
139
 
140
140
  ```bash
141
- # Prompt injection pre-flight: check for AI instruction injection in external config files
142
- if grep -rE "^# CLAUDE:|^# AI:|<instructions>" ~/.zshrc .claude/settings.json 2>/dev/null | grep -q .; then
143
- echo "⚠️ AI instruction pattern detected in external config files: injection risk. Manual check recommended."; fi
141
+ # Prompt injection pre-flight: scan config AND the project's AI-instruction surfaces (CLAUDE.md,
142
+ # AGENTS.md, .claude/rules/*), which are the higher-risk vectors in an unknown repo (not just shell/settings).
143
+ # Injection-SPECIFIC patterns only (override/exfil), since instruction files legitimately carry directives;
144
+ # advisory (recommend manual review), never an auto-block.
145
+ if grep -rIE "ignore (all )?previous|disregard (the )?above|exfiltrat|^# CLAUDE:|^# AI:|<instructions>" \
146
+ ~/.zshrc .claude/settings.json CLAUDE.md AGENTS.md .claude/rules/ 2>/dev/null | grep -q .; then
147
+ echo "⚠️ AI-instruction / override pattern detected in config or instruction files — injection risk in an unknown repo. Review the listed files manually before proceeding."; fi
144
148
 
145
149
  # FH location
146
150
  echo "FH_DIR=${FH_DIR:-not set}"
@@ -164,13 +168,13 @@ python3 -c "import json,os; d=json.load(open(os.path.expanduser('~/.claude.json'
164
168
  # zshrc hook status
165
169
  grep -q "fh_audit_check.zsh" ~/.zshrc 2>/dev/null && echo "zshrc hook: present" || echo "zshrc hook: absent"
166
170
 
167
- # Framework detection (Streamlit) — must be specified in requirements.txt or pyproject.toml
168
- STREAMLIT_PROJECT=false
169
- if grep -q "streamlit" requirements.txt 2>/dev/null || \
170
- grep -q "streamlit" pyproject.toml 2>/dev/null; then
171
- STREAMLIT_PROJECT=true
172
- echo "Framework: Streamlit detected"
173
- fi
171
+ # Framework detection (optional) — only used to look for a matching OPTIONAL domain pattern pack.
172
+ # Generic: capture the framework name; the pattern-pack path is derived as {framework}_patterns.md.
173
+ # No pattern pack ships by default — this is a user-supplied extension point, absence is the normal state.
174
+ FRAMEWORK=""
175
+ for fw in streamlit django fastapi flask; do
176
+ if grep -qi "$fw" requirements.txt pyproject.toml 2>/dev/null; then FRAMEWORK="$fw"; echo "Framework: $fw detected"; break; fi
177
+ done
174
178
  ```
175
179
 
176
180
  **Bootstrap guidance when FH_DIR is not set (stop immediately in Step 0):**
@@ -180,8 +184,10 @@ fi
180
184
  1. Clone FH repo:
181
185
  git clone https://github.com/chrono-meta/forge-harness ~/forge-harness
182
186
 
183
- 2. Set environment variable:
187
+ 2. Set environment variables:
184
188
  export FH_DIR=~/forge-harness
189
+ export CC_HUB_DIR=$FH_DIR # FH hub dir (holds tracks/_audit for the weekly-audit mtime check);
190
+ # equals FH_DIR unless you run a separate hub clone
185
191
 
186
192
  3. Install FH plugin in CC:
187
193
  Settings → Plugins → Add → {FH_DIR}/plugins/fh-meta
@@ -194,11 +200,12 @@ fi
194
200
 
195
201
  *(Run after Step 0-A·B pre-checks. Output results as environment card, then continue to Step 0-C.)*
196
202
 
197
- Output detection results as **environment card**. Activate CC pattern reference on Streamlit detection:
203
+ Output detection results as **environment card**. If a framework was detected AND you maintain a matching
204
+ optional domain pattern pack, reference it (none ship by default — absence is normal, never a gap):
198
205
  ```
199
- 📌 Streamlit project detected → CC pattern reference activated
200
- {CC_HUB_DIR}/knowledge/shared/streamlit_patterns.md loaded (if present; optional Streamlit pattern pack, not shipped by default)
201
- Check: data_editor empty df / column nesting / async wrapper / CSS numeric variables
206
+ 📌 {FRAMEWORK} project detected → optional domain pattern pack check
207
+ {CC_HUB_DIR}/knowledge/shared/{FRAMEWORK}_patterns.md loaded (only if you supplied it; not shipped by default)
208
+ If absent: skip silently; no pack is the expected default state.
202
209
  ```
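
The detection loop and the `{FRAMEWORK}_patterns.md` lookup can be sketched in Python as follows. This is an illustrative equivalent of the shell snippet, not code shipped by FH; the function names `detect_framework` and `find_pattern_pack` are hypothetical, and the substring match mirrors `grep -qi` rather than parsing the manifests.

```python
from pathlib import Path
from typing import Optional

FRAMEWORKS = ("streamlit", "django", "fastapi", "flask")


def detect_framework(project_dir: Path) -> str:
    """Return the first listed framework named in either manifest, else ""."""
    text = ""
    for name in ("requirements.txt", "pyproject.toml"):
        manifest = project_dir / name
        if manifest.is_file():
            # Lower-case for a case-insensitive substring match (like grep -i).
            text += manifest.read_text(encoding="utf-8", errors="ignore").lower() + "\n"
    for fw in FRAMEWORKS:
        if fw in text:
            return fw
    return ""


def find_pattern_pack(hub_dir: Path, framework: str) -> Optional[Path]:
    """Return the optional pack path if the user supplied one, else None."""
    if not framework:
        return None
    pack = hub_dir / "knowledge" / "shared" / f"{framework}_patterns.md"
    return pack if pack.is_file() else None
```

A `None` result is the expected default, since no pack ships with the package; callers would simply skip the pattern check.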
203
210
 
204
211
  ```
@@ -219,7 +226,7 @@ install-wizard — Environment Detection
219
226
  > **Core message**: FH is not something placed on top of an existing harness.
220
227
  > It analyzes existing rules to remove duplicates — making things lighter.
221
228
  >
222
- > **Measured expectations** (--dry-run verified values):
229
+ > **Illustrative single-run measurements** (n=1 per project, `--dry-run` verified — not benchmarks; your numbers will differ):
223
230
  >
224
231
  > | Project type | Example | Total volume | Reduction | Main cause |
225
232
  > |---|---|---|---|---|
@@ -323,9 +330,10 @@ Auto-check the following items based on detected environment. Each item classifi
323
330
  | MCP plugin | ~/.claude.json mcpServers contains entry | `python3 -c "import json,os; d=json.load(open(os.path.expanduser('~/.claude.json'))); print(list(d.get('mcpServers',{}).keys()))"` |
324
331
  | `deep-insight plugin` | settings.json plugins contains deep-insight | `grep -r "deep-insight" .claude/settings.json 2>/dev/null` |
325
332
  | `fh_env_context.jsonc` | `.claude/rules/fh_env_context.jsonc` exists | `ls .claude/rules/fh_env_context.jsonc` |
326
- | `Streamlit pattern applied` | (Streamlit projects only, if the pattern pack is present) data_editor empty df branch/async wrapper/CSS numeric variables | CC `knowledge/shared/streamlit_patterns.md` Pattern 1-5 check (skip if file absent) |
333
+ | `phantom-gate` | **(Python + AI-output projects only)** `phantom-gate` present in `requirements.txt` / `pyproject.toml` | `grep "phantom.gate" requirements.txt pyproject.toml 2>/dev/null` |
334
+ | `Domain pattern pack applied` | (optional — only when a `{framework}_patterns.md` pack is present; none ship by default) framework-specific pattern checks | `knowledge/shared/{framework}_patterns.md` check (skip if file absent — the normal default) |
327
335
 
328
- **Score calculation**: PASS = 1 point / MISS = 0.5 points / FAIL = 0 points converted to 100-point scale.
336
+ **Score calculation**: PASS = 1 / MISS = 0.5 / FAIL = 0. Formula: `score = round( Σ(points) ÷ (applicable item count) × 100 )`. Conditional items (domain pattern pack / phantom-gate / MCP / deep-insight) are excluded from the denominator when not relevant, so always print the denominator next to the score (e.g. `{score}/100 over {n} applicable items`) — the percentage is reproducible only when the item count is shown.
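
A small worked sketch of the score formula, assuming each check result is recorded as `"PASS"`, `"MISS"`, `"FAIL"`, or `None` for a conditional item that does not apply. The names `wizard_score` and `format_score` are hypothetical.

```python
from typing import Dict, Optional, Tuple

POINTS = {"PASS": 1.0, "MISS": 0.5, "FAIL": 0.0}


def wizard_score(results: Dict[str, Optional[str]]) -> Tuple[int, int]:
    """Return (score out of 100, number of applicable items).

    Items whose result is None are excluded from the denominator.
    """
    applicable = [r for r in results.values() if r is not None]
    if not applicable:
        raise ValueError("no applicable items to score")
    total = sum(POINTS[r] for r in applicable)
    return round(total / len(applicable) * 100), len(applicable)


def format_score(results: Dict[str, Optional[str]]) -> str:
    """Print the denominator next to the score so it is reproducible."""
    score, n = wizard_score(results)
    return f"{score}/100 over {n} applicable items"
```

For example, one PASS, one MISS, and one FAIL with a fourth item not applicable gives (1 + 0.5 + 0) ÷ 3 × 100 = 50. Note that Python's built-in `round()` rounds exact halves to the nearest even integer; the source does not state a tie-breaking rule.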
329
337
 
330
338
  ### Step 2. Diagnosis Report + Proposal List
331
339
 
@@ -356,13 +364,21 @@ install-wizard — Diagnosis Results ({score}/100)
356
364
  [6] Add MCP plugin — activate integrations (if MCP plugin MISS)
357
365
  Run: claude mcp add <your-mcp-plugin> -- npx -y <your-mcp-plugin>
358
366
  CC restart required after completion
359
- [7] Install deep-insight plugin: activate sim-conductor multi-persona simulation (if deep-insight MISS)
360
- Settings → Plugins → Add → {deep-insight plugin path}
361
- Without install, /sim-conductor persona branching disabled (single-point simulation only)
367
+ [7] (Optional — field plugin, NOT required) Install deep-insight — adds the field's domain personas to sim-conductor
368
+ deep-insight is a private/field marketplace plugin. sim-conductor already ships the built-in
369
+ user-mastery spectrum (beginner · main-player · expert · challenger), so multi-persona simulation
370
+ works WITHOUT it. If you have access: Settings → Plugins → Add → <your deep-insight path>.
371
+ If not: skip — sim-conductor falls back to the built-in spectrum agents (no capability lost).
362
372
  [8] Create fh_env_context.jsonc — org/network/Git environment context file (if fh_env_context.jsonc MISS)
363
373
  Copy: {FH_DIR}/templates/fh_env_context.jsonc → .claude/rules/fh_env_context.jsonc
364
374
  Then manually update with actual values for org name, Jira URL, environment status, etc.
365
375
  Effect: Each skill references common environment context → eliminate individual setting duplication
376
+ [9] Install phantom-gate — AI output hallucination detection (Python + AI-output projects only, if MISS)
377
+ Run: pip install git+https://github.com/chrono-meta/phantom-gate.git
378
+ Usage: phantom-gate scan output.txt / phantom-gate scan . --project
379
+ Detectors: M1 (phantom claims) · M2 (self-reference loops) · M3 (unvalidated external-dep claims) · M4 (temporal) · M5 (cross-file version mismatch)
380
+ Skip condition: non-Python project OR no AI-generated output in pipeline
381
+
366
382
 
367
383
  Each item: Y (approve) / N (skip) / L (later) / A (approve all)
368
384
  ```
@@ -470,9 +486,16 @@ source "$FH_DIR/templates/fh_audit_check.zsh"
470
486
  EOF
471
487
  fi
472
488
 
473
- # 4-axis verification gate: install the FH pre-commit hook on the forge-harness clone (idempotent)
474
- # Git does NOT set core.hooksPath automatically on clone, so this one-time step is required for the gate to enforce (otherwise it stays advisory).
489
+ # 4-axis verification gate (Mode D / FH-self-development only; OPT-IN, double-confirm required)
490
+ # SCOPE (state this before asking): this gates commits IN YOUR FH CLONE ($FH_DIR); git commit there is
491
+ # blocked until the 4-axis markers pass. It is FH-internal infra (hardcodes hub paths/markers) and is
492
+ # NEVER installed into field projects (see auto_project_mapping.md §6). Skip unless you develop FH itself.
493
+ # Per Core Principles (Per-item approval + Double-confirm irreversible changes): this is NOT auto-run —
494
+ # it is a separate explicit Y/N, not folded into the baseline-setup batch.
475
495
  if [ -d "$FH_DIR/templates/.git-hooks" ]; then
496
+ echo "Enable the 4-axis pre-commit gate on your FH clone ($FH_DIR)? It will block commits there until"
497
+ echo "markers pass (Mode D / FH-development only). Skip if you are not developing FH itself. (Y/N)"
498
+ # → On explicit Y only:
476
499
  git -C "$FH_DIR" config core.hooksPath templates/.git-hooks
477
500
  chmod +x "$FH_DIR/templates/.git-hooks/pre-commit" 2>/dev/null
478
501
  echo "4-axis pre-commit gate: installed (core.hooksPath -> templates/.git-hooks)"
@@ -482,8 +505,9 @@ fi
482
505
  mkdir -p ~/.cc_sentinels
483
506
  touch ~/.cc_sentinels/$(basename "$(pwd)")_wizard_done
484
507
 
485
- # Weekly audit schedule in CC (CronCreate valid for this session)
486
- # auto-call /harvest-loop (lightweight mode) every Monday at 9:03
508
+ # Weekly audit cadence: NO cron needed (a session-scoped scheduler cannot fire on a later day).
509
+ # Durable mechanism = the zshrc hook above (fh_audit_check.zsh warns on terminal start when 7+ days
510
+ # since last weekly_audit) + FH session-start detection (proposes /harvest-loop lightweight when overdue).
487
511
  ```
488
512
 
489
513
  ### Step 5. Completion Report + Contribution Guidance
@@ -496,7 +520,7 @@ install-wizard — Complete
496
520
  From now on:
497
521
  · Periodic audit auto-check on terminal start
498
522
  · Yellow warning output when weekly_audit exceeds 7 days
499
- · Auto /harvest-loop (lightweight) at 9am Monday when CC is open
523
+ · /harvest-loop (lightweight) proposed at session start when 7+ days since last weekly_audit
500
524
 
501
525
  Next step skills:
502
526
  · Not sure which plugin you need → /plugin-recommender
@@ -559,7 +583,7 @@ ls ~/.cc_sentinels/${PROJECT_NAME}_wizard_done 2>/dev/null && echo "Inspection m
559
583
  |---|---|
560
584
  | Structural anomaly detected | `/harness-doctor` |
561
585
  | Token waste pattern detected | `/context-doctor` |
562
- | External user simulation needed | `/sim-conductor Area A` |
586
+ | External user simulation needed | `/sim-conductor` |
563
587
  | Install conflict suspected | `/install-doctor` |
564
588
 
565
589
  ## Per-Cluster Deferred Loading (Progressive Disclosure)
@@ -187,7 +187,7 @@ All steps 0–2 completed
187
187
  + Overall verdict output (🟢 Recommended / 🟡 Conditional / 🔴 On hold)
188
188
  ```
189
189
 
190
- **→ Mandatory before 🟢 Recommended verdict: `source-grounding-audit`** — forward axis check on all citations, external URLs, and file path references in the asset being reviewed. A 🟢 verdict without source-grounding-audit is incomplete. If source-grounding-audit finds phantom refs → verdict downgrades to 🟡 Conditional automatically.
190
+ **→ Mandatory before 🟢 Recommended verdict: `phantom-quench`** — forward axis check on all citations, external URLs, and file path references in the asset being reviewed. A 🟢 verdict without phantom-quench is incomplete. If phantom-quench finds phantom refs → verdict downgrades to 🟡 Conditional automatically.
191
191
 
192
192
  > When `agent-composer` receives a "comprehensive marketplace listing audit" request,
193
193
  > recommend: Wave 0 `fact-checker` → Wave 1 `marketplace-gate` + `hub-persona-auditor` in parallel.
@@ -0,0 +1,248 @@
1
+ ---
2
+ name: phantom-quench
3
+ description: The grounding member of the quench series — extracts proper nouns, numerical values, and branching conditions from artifacts (TCs, analysis reports, design documents), back-traces them to declared source files, and marks anything not found as a Phantom Claim (ungrounded — present in the artifact but not traceable to a declared source; not a claim that it is necessarily false). If steel-quench attacks output patterns (self-declarations, cushion language), phantom-quench attacks input tracing (where did this come from?). Renamed from source-grounding-audit (2026-06-06, quench-series); `/source-grounding-audit` still resolves as an alias. Triggered by "phantom detection", "phantom-quench", "phantom claim", "hallucinated claim detection", "source back-trace", "source audit", "verify source", "TC evidence tracing", "where did this come from", "grounding audit", "source grounding audit", "false claim detection".
4
+ user-invocable: true
5
+ allowed-tools: ["Read", "Write", "Edit", "Bash", "Grep", "Glob"]
6
+ model: sonnet
7
+ ---
8
+
9
+ # phantom-quench — Input Tracing Grounding Audit
10
+
11
+ > Just because an artifact looks plausible doesn't mean it's grounded in source. plausible ≠ grounded.
12
+
13
+ > **Renamed from `source-grounding-audit` (2026-06-06)** — the grounding member of the quench series
14
+ > (steel-quench · phantom-quench · goal-quench). Same skill, same ruleset; only the label changed to fit
15
+ > the family. The **v1 paper (Zenodo 10.5281/zenodo.20397566) cites the old name** — that is the
16
+ > historical record, not a phantom. `/source-grounding-audit` still resolves via the deprecated redirect
17
+ > stub at `plugins/fh-meta/skills/source-grounding-audit/SKILL.md` (`successor: phantom-quench`).
18
+ > This is a **label rename, not a capability change** — phantom-quench does not fuse steel-quench or
19
+ > inject faults; those remain separate (orthogonality is deliberate — see Role Separation below).
20
+ >
21
+ > **Quench-series semantics** (resolves the "quench *what*?" question): each member subjects a different
22
+ > thing to the forge — steel-quench hardens an **existing output**; phantom-quench hardens the system
23
+ > against **mistaking the absent for present** (the phantom illusion — *not* the phantom as a material to
24
+ > harden); goal-quench hardens the **goal itself** into an advanced version. Same verb, consistent grammar.
25
+ >
26
+ > **Not the same as `phantom-gate`.** `phantom-gate` is the *productized standalone* phantom detector — a
27
+ > PyPI package run against any repo from the shell. `phantom-quench` is the *in-harness skill* — the same
28
+ > detection lineage as a method invoked inside a Claude session against a declared source set. Tool vs
29
+ > skill; different delivery, shared idea.
30
+
31
+ When AI generates artifacts without reading the source, those artifacts look like domain knowledge but are actually **Phantom Claims** coming from LLM weights. This skill back-traces each claim in the artifact to the declared source to explicitly mark Phantoms.
32
+
33
+ ## Role Separation from steel-quench
34
+
35
+ | Dimension | steel-quench | phantom-quench |
36
+ |---|---|---|
37
+ | **Attack target** | Output patterns (self-declarations, cushion language, reason for existence) | Input tracing (is the claim in the source?) |
38
+ | **Core question** | "Is this structure flawed?" | "Where did this content come from?" |
39
+ | **Activation timing** | All-angle quench just before completion | Immediately after source-based artifact generation or at point of suspicion |
40
+ | **Primary attack vector** | Bus factor, self-reference, platform obsolescence | Phantom Claim, source not read, fabricated branching conditions |
41
+ | **Representative pattern** | "Declaration only, no evidence" | "Number in TC that doesn't exist in source" |
42
+
43
+ **Can be used together**: steel-quench Wave 1 real-code-based attack + phantom-quench Phantom marking can be run sequentially in the same session. But do not mix the roles of the two skills.
44
+
45
+ ---
46
+
47
+ ## Trigger Phrases
48
+
49
+ | Phrase | Situation |
50
+ |---|---|
51
+ | "phantom detection", "phantom claim", "false claim detection" | Full artifact Phantom scan (primary trigger) |
52
+ | "source back-trace", "source audit" | Analysis report, design document verification |
53
+ | "verify source", "where did this come from" | Suspecting origin of a specific claim |
54
+ | "TC evidence tracing", "TC source verification" | Post-TC-generation source consistency check |
55
+ | "grounding audit", "source grounding audit" | Full artifact Phantom scan |
56
+ | "verify evidence files" | Analysis report, design document verification |
57
+ | `/phantom-quench` | Explicit call |
58
+
59
+ ---
60
+
61
+ ## Core Concept — Phantom Claim
62
+
63
+ **Phantom Claim**: A claim that appears in the artifact but cannot be found in the declared source files.
64
+
65
+ 3 paths through which Phantoms are produced:
66
+
67
+ | Path | Description | Risk |
68
+ |---|---|:---:|
69
+ | **Source not read** | AI generates artifact using domain knowledge without Read-ing source | S |
70
+ | **Partial reading** | Source partially read, rest filled in with inference | A |
71
+ | **Reconstruction contamination** | Source was read but LLM modified values/conditions during paraphrase | A |
72
+
73
+ ---
74
+
75
+ ## Execution Steps
76
+
77
+ ### Step 0. Confirm Audit Target
78
+
79
+ If not provided by user, explicitly confirm: artifact file path, declared source files, and audit scope. Source not declared = S-grade blocker registered immediately.
80
+
81
+ > **Detail**: See `SKILL_detail.md §Step0-Detail` — confirmation output format and simplification guard — read when audit target or source list is ambiguous.
82
+
83
+ ---
84
+
85
+ ### Step 0.5. Claim Distribution Profile
86
+
87
+ > **Schema**: `knowledge/shared/harness-core/tpa_schema.md` — `phantom_risk` derivation rule, gate trigger conditions, §Gate Routing Table.
88
+
89
+ Runs after Step 0 (target + source confirmed). Skip if user specifies scope explicitly.
90
+
91
+ Scan artifact quickly to classify claim distribution:
92
+
93
+ | Dimension | Signal → Audit depth shift |
94
+ |---|---|
95
+ | `claim_density` | > 10 claims → full Step 1-4 audit; ≤ 3 claims → light (S+A only) |
96
+ | `artifact_type` | SKILL.md/design-doc → prioritize Branch/State-transition claims; code → prioritize Proper-noun/API claims |
97
+ | `risk_level` | external publish / arXiv citations → all claim types, max depth |
98
+ | `source_count` | 0 declared sources → S-grade blocker immediately (skip to Step 3 prescription) |
99
+ | `quantitative_density` | > 3 numerical claims → focus numerical+range types first |
100
+
101
+ Scope recommendation output:
102
+ ```
103
+ Claim types to prioritize: [list]
104
+ Audit depth: [full | prioritized | light]
105
+ Immediate blockers detected: [yes/no — 0 sources = immediate S-grade]
106
+ ```
107
+
108
+ **0-source behavioral rule**: When artifact has 0 declared sources, skip Steps 1-2 entirely and go directly to Step 3 with S-grade blocker: "Source not declared — all claims unverifiable."
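
The Step 0.5 routing can be sketched as a small decision function. This is a hedged illustration only: the `Profile` fields and `recommend_scope` are hypothetical names, and mapping the 4 to 10 claim range to `prioritized` is an assumption, since the table defines only the `> 10` and `≤ 3` cases.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Profile:
    claim_count: int
    source_count: int
    numerical_claims: int = 0
    external_publish: bool = False


def recommend_scope(p: Profile) -> Dict[str, Union[Optional[str], List[str]]]:
    """Map a claim distribution profile to an audit depth recommendation."""
    if p.source_count == 0:
        # 0 declared sources: S-grade blocker, skip Steps 1-2, go to Step 3.
        return {"depth": None, "blocker": "S", "prioritize": []}
    prioritize: List[str] = []
    if p.numerical_claims > 3:
        prioritize.append("numerical/range")
    if p.external_publish or p.claim_count > 10:
        depth = "full"
    elif p.claim_count <= 3:
        depth = "light"
    else:
        depth = "prioritized"  # assumption: mid-range is not defined in the table
    return {"depth": depth, "blocker": None, "prioritize": prioritize}
```

Checking `source_count` first reflects the rule that an undeclared source makes every other signal moot.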
109
+
110
+ ---
111
+
112
+ ### Step 1. Claim Extraction (Artifact Scan)
113
+
114
+ Extract claims from the artifact that require source back-tracing. Claim types: Proper nouns (highest), Numerical/range values (highest), Branching conditions (highest), State transitions (high), Preconditions (high), Actors (medium). Exclude generic test methodology descriptions and generic UI patterns.
115
+
116
+ > **Detail**: See `SKILL_detail.md §Step1-Detail` — full claim types table with examples, exclude list, and Step 1 output format template — read when deciding which claims to include or format the extraction results.
117
+
118
+ ---
119
+
120
+ ### Step 2. Source Read + Back-Trace
121
+
122
+ Back-trace each claim to the declared source files using Read + Grep directly — no inference judgment. Partial match is not treated as match.
123
+
124
+ Back-tracing classification:
125
+
126
+ | Classification | Criteria | Marking |
127
+ |---|---|:---:|
128
+ | **Grounded** | Claim directly confirmed in source | ✅ |
129
+ | **Partial** | Similar content in source but not exact match — needs re-confirmation | ⚠️ |
130
+ | **Phantom** | Cannot be found in source | ❌ |
131
+ | **Source-Missing** | Source itself cannot be Read or was not declared | 🔴 |
132
+
133
+ > **Detail**: See `SKILL_detail.md §Step2-Detail` — back-tracing execution procedure, classification decision rules, and Step 2 output format template — read when handling edge cases or formatting results.
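
As a rough mechanical illustration of the four classifications, the sketch below back-traces one claim string against declared source files. It is not the skill's actual procedure (that lives in `SKILL_detail.md`); `back_trace` is a hypothetical helper, and treating a match that differs only in case or whitespace as Partial is an assumption that is far narrower than the "similar content" judgment the table describes.

```python
from pathlib import Path
from typing import Iterable, List


def back_trace(claim: str, sources: Iterable[str]) -> str:
    """Classify a claim as Grounded, Partial, Phantom, or Source-Missing."""
    texts: List[str] = []
    for src in sources:
        try:
            texts.append(Path(src).read_text(encoding="utf-8"))
        except (OSError, UnicodeDecodeError):
            continue  # unreadable source: contributes nothing
    if not texts:
        return "Source-Missing"
    if any(claim in text for text in texts):
        return "Grounded"  # exact match only
    # Assumption: a case/whitespace-insensitive hit needs re-confirmation.
    needle = " ".join(claim.lower().split())
    if any(needle in " ".join(text.lower().split()) for text in texts):
        return "Partial"
    return "Phantom"
```

The exact-match test comes first so that a near match can never be promoted to Grounded, in line with the rule that a partial match is not treated as a match.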
134
+
135
+ ---
136
+
137
+ ### Step 3. Phantom Classification + Prescription
138
+
139
+ Classify Phantom and Partial claims by severity and provide prescriptions.
140
+
141
+ **Severity classification criteria**:
142
+
143
+ | Severity | Criteria | Examples |
144
+ |:---:|---|---|
145
+ | **S** (Immediate blocker) | If this claim is wrong, TC could Pass-judge incorrect behavior | Monetary boundary values, branching conditions, status values |
146
+ | **A** (Must fix) | If this claim is wrong, TC cannot execute or runs wrong path | API endpoint names, field names, preconditions |
147
+ | **B** (Improvement recommended) | If this claim is wrong, TC can execute but intent may differ | Descriptive text, non-critical names |
148
+
149
+ Prescriptions: (1) Source Re-read — precisely re-read the relevant source section and fix; (2) Request source specification — when source doesn't exist or wasn't declared; (3) Delete/rewrite — delete claims without source grounding and rewrite from source.
150
+
151
+ > **Detail**: See `SKILL_detail.md §Step3-Detail` — prescription procedures and Step 3 output format template — read when writing the classification table or applying a prescription.
152
+
153
+ **S-grade Immediate Human Gate** — if 1+ S-grade Phantoms found, pause before Step 4/5 and surface:
154
+
155
+ ```
156
+ ⚠️ phantom-quench: N S-grade Phantom(s) found:
157
+ - [claim 1 — one-line summary, location]
158
+ - [claim 2 — one-line summary, location]
159
+
160
+ Options:
161
+ (a) Continue — AI proceeds to Step 4 pattern diagnosis + Step 5 re-audit
162
+ (b) Human review first — inspect Phantoms directly, then proceed
163
+ (c) Abort — fix sources manually and re-run audit
164
+
165
+ Waiting for input. (Default: a)
166
+ ```
167
+
168
+ Rationale: S-grade Phantoms that enter Step 5 re-audit without human review risk LLM reconstruction contamination — the same pattern that originally produced the Phantoms can "verify" its own fixes. Human review at this threshold breaks the loop.
169
+
170
+ ---
171
+
172
+ ### Step 4. Source Not-Read Pattern Detection (Meta Diagnosis)
173
+
174
+ Analyze Phantom distribution to diagnose structural problems in the artifact generation process. Reveal "why were these Phantoms produced", not just "this TC is wrong".
175
+
176
+ **Pattern detection criteria**:
177
+
178
+ | Pattern | Detection Condition | Meaning |
179
+ |---|---|---|
180
+ | **Source not read** | 3+ Phantoms and no or partial source Read history | AI generated using domain knowledge without reading source |
181
+ | **Partial reading contamination** | Partial items exceed 30% of total | AI read source partially and filled rest with inference |
182
+ | **Reconstruction modification** | Source value exists but unit/format/range modified in TC | LLM paraphrase process contamination |
183
+ | **Source declaration absent** | Source file not specified when generating artifact | Process design stage problem |
184
+
185
+ **Simplification guard**: If 0 Phantoms, skip Step 4 entirely. Replace with one line: "Source grounding adequate."
186
+
187
+ > **Detail**: See `SKILL_detail.md §Step4-Detail` — Step 4 output format template — read when writing the pattern diagnosis section.
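
The count-based detection conditions in the Step 4 table can be sketched as follows, under the assumption that Step 2 results are available as a list of classification labels. `diagnose` and its parameters are hypothetical; the "Reconstruction modification" pattern is omitted because it requires comparing values against the source rather than counting labels.

```python
from collections import Counter
from typing import List


def diagnose(classifications: List[str],
             source_fully_read: bool,
             source_declared: bool = True) -> List[str]:
    """Return the Step 4 process patterns suggested by the Step 2 labels."""
    counts = Counter(classifications)
    if counts["Phantom"] == 0:
        # Simplification guard: skip Step 4 entirely.
        return ["Source grounding adequate."]
    patterns: List[str] = []
    if not source_declared:
        patterns.append("Source declaration absent")
    if counts["Phantom"] >= 3 and not source_fully_read:
        patterns.append("Source not read")
    if counts["Partial"] / len(classifications) > 0.30:
        patterns.append("Partial reading contamination")
    return patterns
```

An empty list means Phantoms exist but none of the counted patterns fired, so the diagnosis would have to rest on the per-claim review instead.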
188
+
189
+ ---
190
+
191
+ ### Step 5. Post-Fix Re-audit (Optional)
192
+
193
+ Re-run back-trace for S-grade blocker claims after fixes are complete. Activate when 1+ S-grade blockers exist and fix is immediately possible.
194
+
195
+ **Done When (re-audit)**: Back-trace results for fixed claims all show Grounded (✅) status.
196
+
197
+ ---
198
+
199
+ ## Completion Declaration Format
200
+
201
+ > **Template**: See `SKILL_detail.md §Report-Template` — full completion declaration format — read when producing the final audit summary.
202
+
203
+ ---
204
+
205
+ ## Connected Skills
206
+
207
+ | Situation | Connected Skill |
208
+ |---|---|
209
+ | Simultaneously verify output patterns (self-declarations, cushion language) | `/steel-quench` Wave 1 "real-use verification" angle |
210
+ | Re-verify Phantom patterns from external user perspective | `/sim-conductor Area A` |
211
+ | Source not-read is a harness structure problem | `/harness-doctor` |
212
+ | Phantom pattern is a candidate for new rule items | `fh-meta:persona-innovator` |
213
+ | Redesign the artifact generation prompt itself | `/meta-prompt-builder` |
214
+
215
+ ---
216
+
217
+ ## External User Environment Adaptation
218
+
219
+ This skill can be used independently without the full meta-harness structure.
220
+
221
+ **How to declare source files**: When generating artifacts, specify "source: [file path list]", or provide source files when invoking this skill.
222
+
223
+ **External environment fallback**:
224
+ - If no `tracks/_meta/` → skip persistence step
225
+ - If no project-specific rules (like PFD) → output Phantom pattern summary only
226
+
227
+ ---
228
+
229
+ ## Done When
230
+
231
+ ```
232
+ Step 1 claim extraction complete
233
+ + Step 2 all claims back-traced (using Read tool — no inference judgment)
234
+ + Step 3 Phantom severity classification + prescription output
235
+ + Step 4 process pattern diagnosis complete (skip if 0 Phantoms)
236
+ + "phantom-quench Complete" declaration output
237
+ ```
238
+
239
+ Verdict: PASS (0 Phantom claims) | CONDITIONAL_PASS (LOW-severity Phantoms only, prescriptions noted) | FAIL (1+ HIGH/MEDIUM Phantom — broken path, phantom file, or stale external link) | ESCALATE (scope unclear or claim extraction impossible)
240
+
241
+ ---
242
+
243
+ ## Operating Notes
244
+
245
+ - **Never back-trace by inference**: Judging "this value is probably in the source" misclassifies a Phantom as Partial. Always confirm directly with Read + Grep.
246
+ - **Partial is not Grounded**: Treating a claim as Grounded when the source holds only a similar value misses the pattern where values are altered during reconstruction.
247
+ - **An undeclared source is itself S-grade**: If no source is declared when an artifact is made, none of its claims can be verified afterwards. Mandate source declaration at the process design stage.
248
+ - **Use together with steel-quench**: steel-quench quenches structural flaws; phantom-quench ensures source consistency. The two skills are orthogonal, so using them together strengthens artifact quality assurance.
@@ -1,4 +1,4 @@
1
- # source-grounding-audit — Execution Detail
1
+ # phantom-quench — Execution Detail
2
2
 
3
3
  On-demand reference. Load the section indicated by the pointer in SKILL.md.
4
4
 
@@ -153,7 +153,7 @@ Process prescription:
153
153
  **Completion Declaration Format**
154
154
 
155
155
  ```
156
- ## source-grounding-audit Complete
156
+ ## phantom-quench Complete
157
157
 
158
158
  Audit scope: {artifact file} / source {N files}
159
159
  {N} total claims audited
@@ -179,4 +179,4 @@ Next actions:
179
179
 
180
180
  **Evidence Record**
181
181
 
182
- - **Verified in practice**: TC generation without reading source files → steel-quench passes → source-grounding-audit back-trace detects numerous Phantoms (notifications vs. push notifications, version names vs. non-enrolled, bottom sheet vs. screen navigation). **Procedure**: Read sources in order then regenerate → replace with source-based TCs. **Recurrence prevention**: Source gate implementation — FileNotFoundError if required source files absent. steel-quench misses this because: outputs look logically sound so pattern attacks cannot identify Phantoms — only source back-tracing can detect them.
182
+ - **Verified in practice**: TC generation without reading source files → steel-quench passes → phantom-quench back-trace detects numerous Phantoms (notifications vs. push notifications, version names vs. non-enrolled, bottom sheet vs. screen navigation). **Procedure**: Read sources in order then regenerate → replace with source-based TCs. **Recurrence prevention**: Source gate implementation — FileNotFoundError if required source files absent. steel-quench misses this because: outputs look logically sound so pattern attacks cannot identify Phantoms — only source back-tracing can detect them.
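The recurrence-prevention note mentions a source gate that raises `FileNotFoundError` when required source files are absent. A minimal sketch of such a gate in Python; the function name `require_sources` is hypothetical, and the package's actual gate may differ.

```python
from pathlib import Path


def require_sources(paths: list[str]) -> list[Path]:
    """Return the declared source paths, or raise if any required file is absent."""
    resolved = [Path(p) for p in paths]
    missing = [str(p) for p in resolved if not p.is_file()]
    if missing:
        # Fail before any artifact is generated, so no claim is produced unread
        raise FileNotFoundError(
            f"required source files absent: {', '.join(missing)}"
        )
    return resolved
```

Calling the gate before generation, rather than during the audit, turns an unread source into a hard stop instead of a Phantom found later.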
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: pipeline-conductor
3
- description: Chains the four core FH verification pipelines (harvest-loop → steel-quench → source-grounding-audit → sim-conductor) into a single gated sweep. Accepts a scope (single skill, specific asset, full harness) and aggregates results into one structured report. Supports --quick mode (steps 2+3 only) and --full mode (all four steps). Triggered by "run the full pipeline", "chain all verifications", "end-to-end sweep", "pipeline-conductor", or "verify everything".
3
+ description: Chains the four core FH verification pipelines (harvest-loop → steel-quench → phantom-quench → sim-conductor) into a single gated sweep. Accepts a scope (single skill, specific asset, full harness) and aggregates results into one structured report. Supports --quick mode (steps 2+3 only) and --full mode (all four steps). Triggered by "run the full pipeline", "chain all verifications", "end-to-end sweep", "pipeline-conductor", or "verify everything".
4
4
  user-invocable: true
5
5
  allowed-tools: ["Read", "Write", "Bash", "Grep", "Glob", "Agent"]
6
6
  model: sonnet
@@ -10,7 +10,7 @@ model: sonnet
10
10
 
11
11
  Chains the four standalone FH verification pipelines into a gated sequence. Each step receives the previous step's verdict before proceeding. Aggregates all findings into a single structured report at the end.
12
12
 
13
- The gap this closes: harvest-loop, steel-quench, source-grounding-audit, and sim-conductor are each invocable independently but have no automatic hand-off between them. Running them sequentially by hand loses inter-step signal — a FAIL in step 2 should block step 3 rather than silently continuing. pipeline-conductor enforces that ordering.
13
+ The gap this closes: harvest-loop, steel-quench, phantom-quench, and sim-conductor are each invocable independently but have no automatic hand-off between them. Running them sequentially by hand loses inter-step signal — a FAIL in step 2 should block step 3 rather than silently continuing. pipeline-conductor enforces that ordering.
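The ordering described above can be sketched as a loop in which a FAIL blocks every later step. This is a minimal Python illustration, assuming each step is a callable that returns a verdict string. `run_gated`, the stub steps, and the use of `SKIPPED` for blocked steps are assumptions of this sketch. User prompts and ESCALATE handling are not modelled.

```python
from typing import Callable

Step = tuple[str, Callable[[], str]]


def run_gated(steps: list[Step]) -> dict[str, str]:
    """Run steps in order; once a step returns FAIL, later steps do not run."""
    results: dict[str, str] = {}
    blocked = False
    for name, step in steps:
        if blocked:
            results[name] = "SKIPPED"
            continue
        results[name] = step()
        if results[name] == "FAIL":
            blocked = True
    return results


report = run_gated([
    ("steel-quench", lambda: "FAIL"),
    ("phantom-quench", lambda: "PASS"),
])
print(report)  # {'steel-quench': 'FAIL', 'phantom-quench': 'SKIPPED'}
```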
14
14
 
15
15
  ## Triggers
16
16
 
@@ -92,7 +92,7 @@ Do not infer scope — a wrong scope produces misleading verdicts.
92
92
 
93
93
  The four constituent skills use heterogeneous scope models. Translate the pipeline scope to each skill's invocation form before running any step:
94
94
 
95
- | Pipeline scope | harvest-loop (Step 1) | steel-quench (Step 2) | source-grounding-audit (Step 3) | sim-conductor (Step 4) |
95
+ | Pipeline scope | harvest-loop (Step 1) | steel-quench (Step 2) | phantom-quench (Step 3) | sim-conductor (Step 4) |
96
96
  |---|---|---|---|---|
97
97
  | Single SKILL.md | Check session findings relevant to this skill; propose mode only | Adversarial attack on this SKILL.md | Back-trace claims in this SKILL.md to declared sources | Area D (artifact review) on this SKILL.md |
98
98
  | Specific directory | Check session findings in this domain | Attack all SKILL.md files in directory | Back-trace all claims in directory | Area A + Area D on the domain |
@@ -219,13 +219,13 @@ Run steel-quench against the target scope.
219
219
 
220
220
  ---
221
221
 
222
- ## Step 3. source-grounding-audit — Phantom Claim Detection
222
+ ## Step 3. phantom-quench — Phantom Claim Detection
223
223
 
224
- Run source-grounding-audit against the target scope.
224
+ Run phantom-quench against the target scope.
225
225
 
226
226
  **What it checks**: Proper nouns, numerical values, file paths, and branching conditions back-traced to declared source files. Claims not found in source are marked Phantom.
227
227
 
228
- **Invocation**: Run source-grounding-audit scoped to the same target as Steps 1 and 2.
228
+ **Invocation**: Run phantom-quench scoped to the same target as Steps 1 and 2.
229
229
 
230
230
  **Load-bearing Phantom** (binary test — apply mechanically):
231
231
 
@@ -238,7 +238,7 @@ All other locations (§Triggers, advisory §Chains language, frontmatter descrip
238
238
 
239
239
  **Verdict criteria**:
240
240
 
241
- | source-grounding-audit result | pipeline-conductor verdict |
241
+ | phantom-quench result | pipeline-conductor verdict |
242
242
  |---|---|
243
243
  | 0 Phantoms, all claims grounded | `PASS` |
244
244
  | Phantom claims found, none load-bearing (binary test) | `CONDITIONAL_PASS` — list Phantoms |
@@ -246,12 +246,12 @@ All other locations (§Triggers, advisory §Chains language, frontmatter descrip
246
246
  | Grounding ambiguous (source file exists but content unclear) | `ESCALATE` |
247
247
 
248
248
  **On FAIL**: Output the load-bearing Phantom(s). Ask:
249
- > "source-grounding-audit found a load-bearing Phantom claim. Fix and re-run Step 3, or abort the sweep?"
249
+ > "phantom-quench found a load-bearing Phantom claim. Fix and re-run Step 3, or abort the sweep?"
250
250
 
251
251
  **On CONDITIONAL_PASS**: Capture non-load-bearing Phantoms. Continue to Step 4.
252
252
 
253
253
  ```
254
- [Step 3 — source-grounding-audit]
254
+ [Step 3 — phantom-quench]
255
255
  Verdict: {verdict}
256
256
  Basis: {one-line}
257
257
  Phantoms: {count} — {load-bearing: Y/N} — {top item or "none"}
@@ -320,7 +320,7 @@ pipeline-conductor — Sweep Report
320
320
  Step 0.5 — return-path-gate: {PASS / CONDITIONAL_PASS / FAIL / SKIPPED / degraded}
321
321
  Step 1 — harvest-loop: {PASS / CONDITIONAL_PASS / FAIL / ESCALATE / SKIPPED}
322
322
  Step 2 — steel-quench: {verdict}
323
- Step 3 — source-grounding-audit: {verdict}
323
+ Step 3 — phantom-quench: {verdict}
324
324
  Step 4 — sim-conductor: {verdict}
325
325
 
326
326
  Overall: {CLEAN (--full) / CLEAN (--quick) / CLEAN (--no-sim) / PENDING / BLOCKED}