npm - @chrono-meta/fh-gate - Versions diffs - 1.4.27 → 1.4.29 - Mend

@chrono-meta/fh-gate 1.4.27 → 1.4.29

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4) hide show

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@chrono-meta/fh-gate",
-  "version": "1.4.27",
+  "version": "1.4.29",
   "description": "FH runtime adapters — run FH governance, skills, and agents via Claude or Codex with machine-parseable gates.",
   "license": "MIT",
   "keywords": [

package/plugins/fh-meta/skills/agent-composer/SKILL.md CHANGED Viewed

@@ -85,7 +85,7 @@ Execution method · agent selection · recording location → infer and proceed
 | "Do X" — target and completion criteria are clear | Clear | Immediately enter Step 0-b |
 | "Improve X" — improvement dimension unclear | Direction unclear | Ask for clarification |
 | "Something feels off" — diagnosis target unclear | Execution unclear → infer | Auto-dispatch harness-doctor |
-| "Run the pipeline" — trigger is clear | Clear | Run harvest-loop based on recent work |
+| "Wrap up / harvest this session" — trigger is clear | Clear | Run harvest-loop based on recent work ("run the pipeline" itself is ceded to pipeline-conductor — route end-to-end sweeps there, not here) |
 | No request — session context exists | Execution unclear → infer | Auto-compose based on CATALOG/MEMORY, then confirm |
 > **Detail**: See `SKILL_detail.md §Clarification` — 2-question format, 4-question structured confirmation, meta-prompting intervention — read when direction ambiguity requires a clarification block.

package/plugins/fh-meta/skills/phantom-quench/SKILL_detail.md CHANGED Viewed

@@ -106,12 +106,42 @@ literal span — never on a model's agreement.
 | Citation form | Resolved URL |
 |---|---|
-| `arXiv:NNNN.NNNNN` / `arXiv:NNNN.NNNNNvK` | `https://arxiv.org/abs/NNNN.NNNNN` |
+| `arXiv:NNNN.NNNNN` / `arXiv:NNNN.NNNNNvK` | `https://arxiv.org/abs/NNNN.NNNNN` (abstract — the **canonical identifier** surface). For a section-/body-cited claim, prefer full text **first**, falling back in order `https://arxiv.org/html/NNNN.NNNNN` → `https://arxiv.org/pdf/NNNN.NNNNN` → `/abs/` (HTML is a *partial* backfill — see *Surface selection* below) |
 | bare DOI `10.xxxx/...` | `https://doi.org/10.xxxx/...` |
 | `http(s)://...` | use as-is (HTTPS-upgrade handled by WebFetch) |
 | Named source, no URL ("paper X shows Y") | **one** `WebSearch` to locate the canonical URL → then WebFetch it. Do **not** verify against the search snippet alone — the snippet is not the source. |
 | version token `pkg x.y.z` | the package registry/release page (npm/PyPI/GitHub releases) for that exact version |
+**Surface selection — the abstract is the wrong surface for a §-cited mechanism.** The `/abs/` page
+carries only the abstract; a claim that cites a section or attributes a *named body-level mechanism* to
+the paper (e.g. "§4.2 stale-but-confident", "the paper's Change Manifest") returns NONE / MENTIONS on
+`/abs/` even when fully supported in the body — an **artificial Unsupported** (measured 2026-06-17
+in-the-wild dogfood: a `/abs/`-only fetch falsely flagged a body-supported claim; it resolved only after
+escalating to full text). Two mechanical, judgment-free rules cover this:
+1. **Mechanical first-surface trigger** — if the claim text carries a section/figure marker (`§N`,
+   "section N", "Table N", "Figure N") → fetch full text **first**. Otherwise fetch `/abs/` first. (Do
+   *not* pre-decide "is this a body-level mechanism?" — that is a salience-dependent judgment; the
+   `§`-marker is the only mechanical signal, and the NONE-escalation below catches the rest.)
+2. **Abstract-NONE escalates** (the self-correcting backstop) — *any* `/abs/` fetch that returns
+   NONE/MENTIONS escalates once to full text before an Unsupported verdict (see Decision rules). A true
+   headline claim already returned ASSERTS on `/abs/`, so it never escalates; only a NONE does, and the
+   escalation is cheap insurance. This subsumes rule 1's misses — a body claim that *didn't* carry a `§`
+   marker still gets full text via its abstract-NONE.
+**HTML is a partial backfill — fall back, never Phantom.** arXiv HTML exists only for LaTeX-source
+papers (roughly post-2023) and some conversions fail, so `/html/{id}` can 404 for a perfectly real
+paper. A 404 on `/html/` (or `/pdf/`) is **surface-absence, not identifier-absence**: fall back
+`/html/` → `/pdf/` → `/abs/`. Only a 404 on **`/abs/`** (the canonical identifier surface) is Phantom.
+**Anchor the full-text fetch.** Full text is large; pass the cited section number / mechanism term *into*
+the WebFetch prompt (`"…the span in §4.2 that states…"`) so the span search is anchored, not whole-paper
+— this makes the `§`-citation the retrieval anchor, not merely a routing flag, and guards against a long
+paper's relevant span falling outside the fetch model's attention (a full-text false-NONE).
+Reserve `/abs/` for claims the abstract itself states (headline numbers, the paper's stated contribution
+— e.g. a top-line stat like "39–77% factual support" is abstract-level and `/abs/` is correct).
 **WebFetch prompt template** (ask for a *span*, not a *verdict* — this keeps the anchor non-model):
 ```
@@ -126,6 +156,12 @@ Polarity is load-bearing: a page saying "X does NOT hold" or "earlier work claim
 contains a span lexically matching the claim. Asking for the label keeps the *retriever* mechanical while
 surfacing the stance the *judge* needs — a NEGATES/MENTIONS span is **not** grounding.
+**Coinage at fetch time:** when the claim's key term is a known FH coinage (a name FH invented, not the
+paper's — e.g. "Change Manifest", "Validation Gate"), pass the *mechanism description* — not the coinage
+string — into the `<exact claim>` slot, so the retriever searches for the **relation**, not a label that
+isn't on the page. The Coinage-vs-source-term decision rule (below) then only adjudicates the returned
+span; it never has to overturn a NONE the prompt itself caused.
 **Span-evidence format** (a Grounded external verdict is invalid without this):
 ```
@@ -152,7 +188,16 @@ Grounded: N / Unsupported: N / Unreachable: N / Phantom: N
 - Span labelled **NEGATES or MENTIONS** → **Unsupported 🟠** (a negating or merely-mentioning span is not
   grounding — the false-Grounded trap).
 - Fetch succeeded, content readable, span = NONE or off-claim → **Unsupported 🟠** (cited-but-not-verified).
-- Identifier does not resolve at all (404 on a specific arXiv id, dead DOI, fabricated) → **Phantom ❌**.
+  **Surface-escalation first — abstract-NONE is the mechanical trigger (no judgment):** if the only
+  surface fetched was the abstract (`/abs/`), re-fetch full text (fall back `/html/{id}` → `/pdf/{id}`)
+  before classifying — *any* abstract-NONE escalates once, you do **not** first decide whether the claim
+  "looks body-level" (that judgment is the salience leak this rule removes). A body-supported claim must
+  not be failed on an abstract-only NONE. This is the second-surface escalation, **extended from
+  Unreachable to Unsupported-on-abstract** (2026-06-17 dogfood finding #2). Only after a full-text surface
+  *also* returns NONE/MENTIONS is the verdict Unsupported. (Single-pass: escalate once, then classify — no loop.)
+- Identifier does not resolve on the **canonical** surface (`/abs/` 404, dead DOI, fabricated id) →
+  **Phantom ❌**. A 404 on `/html/` or `/pdf/` while `/abs/` resolves is **surface-absence, not
+  identifier-absence** — fall back per *Surface selection*, never Phantom.
 - Identifier plausibly real but fetch blocked in this environment (paywall/403/timeout/cross-host) →
   **Unreachable ⏳** — provisional; note the limit, route to a second surface or the human gate; never auto-Phantom.
 - `WebFetch`/`WebSearch` **absent/disabled in the environment** (not one source — the capability) → Step 2-E
@@ -160,6 +205,40 @@ Grounded: N / Unsupported: N / Unreachable: N / Phantom: N
   (do not mark every claim Unreachable + CONDITIONAL_PASS — that falsely implies grounding was attempted).
 - **Never** upgrade NONE to Grounded because a second model "thinks it's probably right" (agreement bias).
+**Coinage-vs-source-term (the external analogue of Step 2's format normalization — but *judged*, not
+mechanical).** FH sometimes attributes a *coined* name to an external source ("the paper's Change
+Manifest", "its Validation Gate") when the paper uses different words for the same mechanism. A name
+mismatch alone is not automatically Unsupported — but the path to rescuing it is **deliberately narrow**,
+because this is one rationalization away from the agreement-bias trap above. The failure direction must
+stay conservative (over-flag, never false-Grounded):
+- **Possessive/attributive phrasing is presumed a terminology claim.** "the paper's X", "its X",
+  "X, per [paper]" → literal absence of the coinage on the source = **Unsupported 🟠**, *unless the FH
+  artifact separately states the underlying relation in non-coinage words* (so the mechanism is
+  independently legible without the label). A runner may **not** reclassify a bare "the paper's X" as a
+  mechanism claim on its own reading — the mechanism claim must already be spelled out **in the artifact**
+  before the span-judged path opens. (Closes the laundering move: relabeling a terminology claim as a
+  mechanism claim to save it.)
+- **When the mechanism IS spelled out, ground on the relation, not the string — mechanical de-label
+  test:** blank out *both* labels (FH's coinage and the source's term); the span must still literally
+  state the operative relation the claim asserts — the same inputs→outputs / the same conditional / the
+  same action. If, with both names blanked, the span no longer states the claim, the equivalence was
+  reader-supplied → **Unsupported 🟠**. (2026-06-17 dogfood: SkillOpt's "validation score" /
+  "rejected-edit buffer" survives the de-label test for FH's "Validation Gate" claim — the span states
+  the accept-only-on-improvement relation in its own words.)
+- **No self-granted Grounded on a contested match.** If the de-label test is contested rather than clean,
+  the runner may **not** write Grounded — the *only* path to Grounded for a disputed term-match is
+  **escalate to `/steel-quench` Wave 1** and let the isolated adversary surface the span or reject it.
+  Default otherwise = Unsupported. (Removes the runner's discretion to self-select the lenient branch.)
+- **Recommended conservative resolution (publish-facing).** For a publish-facing citation the safer fix
+  is to require the artifact to **quote the paper's own term** ("what SkillOpt calls a 'validation
+  score'") instead of asserting FH's coinage as the paper's — removing the judgment entirely. Reserve the
+  judged de-label path for low-stakes internal claims; match effort to stakes.
+Unlike Step 2's value-format normalization (mechanical: `300s` ≡ `300 seconds`), coinage-vs-term is
+**judged** — Grounded only via a clean de-label span or a passed steel-quench escalation, never a
+reader's semantic equivalence assertion.
 ---
 ## §Step3-Detail

package/plugins/fh-meta/skills/self-marketing-lint/SKILL.md CHANGED Viewed

@@ -13,117 +13,18 @@ deprecated_date: 2026-06-02
 successor: harness-doctor
 ---
-# self-marketing-lint — FH Self-Marketing Language Detection + Replacement Suggestions
+# self-marketing-lint — DEPRECATED
-> Detection criteria are defined inline in this skill (see "Inline Baseline" below).
-> The skill is self-contained and does not depend on external personal-memory documents.
+This skill was absorbed into **harness-doctor** on 2026-06-02. Its self-marketing /
+cushion-word / version-brag detection patterns now live in **harness-doctor Step 3-L
+(`--lint` mode)**, which carries the canonical baseline.
-## Triggers
-- `/self-marketing-lint`
-- "Check FH files", "remove marketing language", "description diet"
-- "Catch self-marketing language", "run the lint"
-- When surfaced as a HIGH item in harvest-loop (P10 series)
-## Detection Patterns (Inline Baseline)
-The following criteria are self-contained inside this skill. No external personal-memory file is required.
-### Self-Marketing Anti-Patterns
-1. **Self-declarations of quality**: "industry-leading", "world-class", "production-ready", "the world's first" without external evidence
-2. **Round-counters and version brags**: "after 26 iterations", "v0.5 maturity", "22nd naming round", "prototype 9th iteration" — irrelevant to function
-3. **Cushion language**: "perfectly", "easily", "seamlessly", "instantly", "without burden" — content-free intensifiers
-4. **Owner self-reference loops**: "validated by our team", "proven internally" — circular evidence with no external check
-5. **Marketing word stacking**: 3+ adjectives in one sentence
-6. **Spec-only artifacts**: skills/agents missing a Done When section — declaration without observable output
-| Pattern type | Examples (detection targets) | Replacement direction |
-|---|---|---|
-| Version labels | "v0.3", "latest version", "(added 2026-05-24)" | Remove (git history manages versioning) |
-| Emphasis words | "fully automated", "instantly", "automatically", "full coverage" | Replace with functional description |
-| Iteration counts | "22nd naming", "26th round", "prototype 9th iteration" | Remove or move to external document |
-| Self-declarations | "This skill is the world's first", "Core FH skill" | Delete |
-| Cushion language | "easily", "conveniently", "without burden" | Replace with behavior-based description |
-| Spec-only (no outputs) | Skills missing Done When | Require adding an outputs section |
-## Execution Steps
-### Step 1. Scan Target Files
-```bash
-# full list of skill files
-find {FH_ROOT}/plugins -name "SKILL.md" | sort
-# full list of agent files
-find {FH_ROOT}/.claude/agents -name "*.md" | sort
-# docs/ public documents
-find {FH_ROOT}/docs -name "*.md" | sort
-```
-Scope: specified files only if target is given / full scan if unspecified.
-### Step 2. Confirm Inline Baseline
-Re-read the "Detection Patterns (Inline Baseline)" section above. No external file load is required — the skill is self-contained.
-### Step 3. Pattern Detection
-Grep detection patterns in each file's description field + body.
-```bash
-# version / iteration count patterns
-grep -n "v[0-9]\+\.[0-9]\|[0-9]\+th\|[0-9]\+th iteration\|prototype [0-9]\+" {file}
-# emphasis word patterns
-grep -n "fully\|instantly\|full coverage\|automatically\|conveniently\|easily\|without burden" {file}
-# self-declaration patterns
-grep -n "core skill\|world's first\|best\|most\|top" {file}
-```
-### Step 4. Output Replacement Suggestions
-Organize and output detection results per file.
-```
-File: {path}
-  L{line}: "{original text}"
-  → Suggested: "{replacement text}" (reason: {pattern type})
-  → Or: recommend removal (reason: {pattern type})
-```
-**No automatic edits**: Output suggestions only. Actual edits proceed after user confirmation.
-### Step 5. Summary Report
+**Use instead:**
 ```
-## self-marketing-lint results
-Target: {N} files
-Detected: {N} items
-| File | Detection count | Most common pattern |
-|---|---|---|
-...
-Recommendation: prioritize files with highest detection counts for review
+/harness-doctor --lint
 ```
-## Done When
-```
-1. Full scan of target files complete
-2. Inline baseline criteria applied (no external file dependency)
-3. Detection results organized and output per file
-4. No automatic edits — user confirmation gate maintained
-```
-## Connections
-| Situation | Connection |
-|---|---|
-| harvest-loop HIGH P10 series triggered | auto-suggest `/self-marketing-lint` |
-| During cold audit | `/steel-quench` + self-marketing-lint in parallel |
-| Before external distribution | `fh-meta:hub-persona-auditor` followed by self-marketing-lint |
+Triggers that previously reached this skill (`check FH files`, `remove marketing
+language`, `description diet`) now route to harness-doctor Step 3-L. This stub remains
+only so old references resolve.