npm - @chrono-meta/fh-gate - Versions diffs - 1.4.13 → 1.4.14 - Mend

@chrono-meta/fh-gate 1.4.13 → 1.4.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/CATALOG.md +22 -0
package/CLAUDE.md +11 -3
package/README.md +6 -7
package/package.json +1 -1
package/plugins/fh-meta/skills/steel-quench/SKILL.md +1 -0

package/CATALOG.md CHANGED Viewed

@@ -8,6 +8,28 @@ AI reads this file first when searching past work. Open individual files for det
 <!-- Add entries in reverse date order (newest at top) -->
+### 2026-06-11 | forge-harness | #readme-dedup, #commoditization-defense, #b1-boundary, #seed-vetting, #field-routing
+**File:** README.md (PR #93) + fh-be signals/handoffs (PR #24+) + field project v0.14 (private track)
+Cloud session #4 (Mode D, remote-doable batch): README stale duplicate "Measured, not asserted" block removed (pre-tier-floor copy contradicted default-Sonnet stance) + "Where this sits (2026)" positioning para (gate+loop are the asset — plumbing commoditizes). fh-be: B1 scope boundary (AW2 simple-vs-complex import), SC1–4 seed vetting (SC1 phantom arXiv fixed → 2509.19349; SC3 commoditization threat to bet ID), gstack positioning-triangle line. Field: private-track companion handoff verified+executed — DefectPatternMatcher+Bug Mode implemented UTF-8-clean in the field repo (25 tests PASS, 0 regressions); its OpenCode(sLLM) lane's downgrade triage hardened to 2-tier (keep/block) with free-tier 80/20 split — β/SC2 field case logged.
+- Decision: README action PARTIAL (merge done; About refresh + npm 1.4.14 laptop-bound); full-TC locked on qwen build per operator's CaseCraft limit measurement — structure-transform survives (P6/P6.5 deterministic), meaning-fill routes to frontier.
+### 2026-06-11 | forge-harness | #mcp-gating, #external-mcp, #name-keyed-policy, #measured-origin, #field-template
+**File:** templates/.claude/rules/mcp_tool_gating.md (+ auto_project_mapping.md §6 row 4, CLAUDE.md mount-intent trigger)
+Cloud session (Mode D, ext): new field template — external-MCP tool gating with three tiers (ask / ask-meta-write / allow-untrusted-read), name-keyed because server-supplied annotations are unreliable (measured same-day: live messaging-class MCP shipped all-None hints incl. irreversible send + approval-resolution tools — fh-be `signal_2026-06-11_hermes-mcp-cloud-boot.md`). Opus challenger caught the name-spoofing hole (server controls names too → behavior-confirmation required for non-ask tiers, fixed inline); sonnet blind sim PASS on unfilled-§3 scenario (per-item ask on send, batch approval-grant refused).
+- Decision: prefer host-native per-tool permission config as enforcement; this template = what-to-gate + portable fallback. §6 install row is conditional (MCP present); the proactive mount-intent trigger is the load-bearing path.
+### 2026-06-11 | forge-harness | #identity-marker, #door-skeleton, #target-tier-sim, #below-floor-consumer, #false-control-kill
+**File:** CLAUDE.md §Active Onboarding (+ fh_detail_protocols.md, scripts/below_floor_scan.sh, .claude/rules/operations.md)
+Cloud session (Mode D): 🐿️ marker folded into the returning-user door skeleton — one salience unit with the menu, closing the sonnet-sim marker-drop residual (`fh_signal_2026-06-11`); verified by blind sonnet Agent sim PASS (verification tier = failure tier; in-session model-pinned dispatch available in cloud, no headless fallback needed). 6/15 billing one-line amendment rode the same CLAUDE.md edit (no churn commit). below_floor_scan.sh ships as the standing consumer the pre-commit hook promised ("weekly audit re-queues below-floor markers") but never had — resolution via `floor-rerun:`/`floor-writeoff:` marker appends, wired as weekly Phase 1.5 step 2.
+- Decision: ack rubber-stamp (card item 4) closed via route-around — build the re-run consumer, leave the ack untouched (per opus-challenger verdict on the reverted regex attempt).
+- Decision: opus challenger PASS 0S+3B — marker-append/hook-collision replicated CLEAN; P9 check: "builds the control, not paper-over".
+### 2026-06-11 | forge-harness | #p9, #harness-bulk, #check-class, #model-portability, #steel-quench
+**File:** plugins/fh-meta/skills/steel-quench/SKILL.md (+ knowledge/shared/harness-core/multi_model_sidecar_strategy.md)
+Two field-validated generalizations promoted to the public mirror (origin: 2026-06-08 field reversal — a weak open-weight model's domain ceiling proved iteration-proof while the pipeline thickened to compensate): (a) steel-quench Cross-Project Patterns gains **P9 harness-bulk-as-model-compensation** (route the task class to a stronger model; never paper over a capability ceiling with more harness — signals are measured: steps added for one model's weakness, quality flat while step count rises); (b) sidecar strategy gains the **check-class = model-portability map** principle (mandatory-pass + mechanical-measured port by construction; judged is where model choice matters, bounded by judged-pairing and §Floor governance re-runs).
+- Decision: 4-axis gate ran with opus challenger (CONDITIONAL_PASS → A-grade measured-class over-claim narrowed inline + 4 B fixes) + blind sonnet target-tier sim (P9 fixture correctly classified + routed) — the salience row survives the field tier.
+- Decision: P9 scope-tagged to the field axis ("simpler over time"); meta-harness complexity that earns its scope is explicitly distinguished.
 ### 2026-06-10 | forge-harness | #destructive-op-gate, #irreversibility, #silent-loss, #branch-cleanup-incident
 **File:** CLAUDE.md §Destructive-Op Gate (+ templates/predelete_check.sh, scripts/selfcheck.sh)
 Third irreversibility gate (sibling of Pre-Publish): enumerate → recover → destroy, never destroy-then-check. predelete_check.sh classifies branches SAFE/CHECK/REVIEW (CHECK = 0 unique paths but commits off base — shared files may hold newer content, the silent-loss class); REVIEW blocks scripted deletes (exit 1); the recovery step is judged/depth-sensitive (strongest-tier floor semantics). Signal-table row fires on destructive intents proactively. Origin: same-day incident — a parallel session's card (weekly-audit done + #88) lived only on an unmerged 0-unique-path branch; pre-deletion enumeration recovered it. Dogfood replay: the script lands that exact branch in CHECK.

package/CLAUDE.md CHANGED Viewed

@@ -103,9 +103,14 @@ Simplification guard: trivial denials with one obvious fix → state block + sin
 **4-step summary**: ① Auto-read CLAUDE.md + CATALOG + session card + registry scan → ② One-line proposal (new user / exploratory / returning branches) → ③ 5-skill cascade (plugin-recommender → synergy → .claudeignore → model → verify) → ④ Approval + setup
-**Returning-user door skeleton (summary-level — applies even if the detail file read is skipped; returning users only, new/exploratory branches stay in the detail file)**: open with the fixed 3-door menu — *"① Connect/map a project · ② Work on a mapped project — {field candidates} · ③ FH self-development — {FH worklist}"* — composing session-card candidates **into doors ②/③**, never as a raw priority dump that replaces the menu. An urgent open item (time-windowed handoff · blocking external deadline) outranks the menu; an explicit task utterance skips it entirely (see Guards below); cadence reminders (§Cadence Rules) ride below it, they don't displace it. Canonical source: `fh_detail_protocols.md` Step 2 §Returning user — keep door labels in sync.
+**Returning-user door skeleton (summary-level — applies even if the detail file read is skipped; returning users only, new/exploratory branches stay in the detail file)**: open with the fixed 3-door menu, **🐿️ on its own line as the skeleton's first line** —
-**Identity marker**: every greeting response (Step ②) opens with 🐿️ on its own line. FH's session-start signal — see `fh_detail_protocols.md` Step 2 for full greeting templates.
+> 🐿️
+> *"① Connect/map a project · ② Work on a mapped project — {field candidates} · ③ FH self-development — {FH worklist}"*
+The marker is part of this skeleton — one salience unit with the menu, not a separate rule (origin: a sonnet-tier sim emitted the menu but dropped the standalone marker rule, `fh_signal_2026-06-11`). Compose session-card candidates **into doors ②/③**, never as a raw priority dump that replaces the menu. An urgent open item (time-windowed handoff · blocking external deadline) outranks the menu; an explicit task utterance skips it entirely (see Guards below); cadence reminders (§Cadence Rules) ride below it, they don't displace it. Canonical source: `fh_detail_protocols.md` Step 2 §Returning user — keep door labels in sync.
+**Identity marker**: every greeting response (Step ②) opens with 🐿️ on its own line. For returning users it is embedded in the door skeleton above (do not strip it when composing doors); new/exploratory branch templates carry it in `fh_detail_protocols.md` Step 2.
 **Guards**: explicit task-entry utterance → skip onboarding · once per session · code/debug requests → start working directly · project routing is a suggestion, mention at most once
 **Metadata-is-not-intent guard**: the trigger is the user's **typed message only**. Session metadata — branch name (auto-derived from the first message, e.g. `claude/korean-greeting-*`), repo name, file paths — is **never** a task spec and never suppresses or redirects the greeting trigger. A bare greeting fires onboarding even when the branch name looks like a feature request; if the only "task" signal lives in metadata and not in what the user typed, treat the message as a greeting and run the 3-axis scaffold.
@@ -178,7 +183,9 @@ the operator (one line), mirroring §Floor governance.
 If `model:`-pinned dispatch is unavailable (plan/billing gate), fall back to a cross-session headless
 run (`claude -p "<trigger>" --model <tier>` in the target cwd) — stronger isolation, zero instruction
-contamination. Record sim results in the Axes 2–3 marker + sub-agent invocation log.
+contamination. 2026-06-15+: headless `claude -p` draws from the hard-capped credit pool, not the
+subscription — prefer in-session Agent dispatch when the plan gate allows; take the headless fallback
+knowingly. Record sim results in the Axes 2–3 marker + sub-agent invocation log.
 **Axis ownership** (each skill is already complete — orchestrator only coordinates):
@@ -299,6 +306,7 @@ Proposal format: `"I see [X]. Want me to run /[skill] to [one-line description]?
 | "delete the branch", "브랜치 삭제", "브랜치 정리", "clean up branches", "force-push", "rewrite history", "지워도 돼?" (destructive intent — **proactive**, fire *before* the action) | **Destructive-Op Gate** (see above → enumerate → recover → destroy; `templates/predelete_check.sh`) |
 | "look at this again", "is this right", "counterargument", "re-validate" | `/verify-bidirectional` |
 | "MCP failing", "tool keeps erroring", "circuit-breaker", "same error looping" | `/mcp-circuit-breaker` |
+| "add this MCP server", "mount this MCP", "mcp.json에 추가", "connect this tool server" (external-MCP mount intent — **proactive**, fire *before* first tool call; mount intent only — a failing/erroring mounted server is `/mcp-circuit-breaker`'s row above) | `templates/.claude/rules/mcp_tool_gating.md` (name-keyed ask/allow table — never trust server annotations or names; fill §3 at mount time) |
 | "token budget", "how expensive", "estimate tokens", "will this cost a lot" | `/token-budget-gate` |
 | "did my rule change break anything", "regression check", "test harness changes" | `/prompt-regression` |
 | "SKILL.md too large", "split this skill", "skill is bloated", "skill file too long" | `/skill-splitter` |

package/README.md CHANGED Viewed

@@ -99,6 +99,12 @@ forge-harness is structured as **two distinct layers**:
 The methodology layer is the portable core — persistent hub, accumulating learnings, curating cross-project knowledge. The automation layer makes it frictionless when running Claude Code.
+**Where this sits (2026):** "harness engineering" is now a public paradigm — and basic agent
+orchestration is rapidly commoditizing into standard infrastructure. FH deliberately stakes nothing on
+that plumbing. Its durable layer is what does *not* commoditize: the governance gates (adversarial ·
+phantom · regression), drift control, and the cross-project compounding loop. Routing and dispatch are
+means; **the gate and the loop are the asset.**
 ```
 forge-harness/   ← the hub (persistent brain)
 ├── knowledge/   → shared across all projects
@@ -273,13 +279,6 @@ above-rubric *design* increments (developing the harness, not running it) — wh
 Sonnet with **tier-floored dispatch** covering the depth-sensitive turns, and a pinned stronger model is
 recommended only for harness-editing sessions. Details: `docs/OUTPUT_EVIDENCE.md` §Validation signals.
-**Measured, not asserted** (2026-06-10, worked example): on a 30-point blind rule-application battery,
-*operating* FH was nearly model-flat — Opus 4.8 / Sonnet 4.6 / Haiku 4.5 scored **100 / 97 / 94** against
-a top-tier anchor at 100, with the rules in context doing most of the work. The tiers separated only on
-above-rubric *design* increments (developing the harness, not running it) — which is exactly why the
-recommendation stays Opus for harness-editing and gate turns, while field operation tolerates lower tiers.
-Details: `docs/OUTPUT_EVIDENCE.md` §Validation signals.
 If you use external CLIs (Gemini, Codex, `gh copilot`) as sidecars, their costs are billed to their own quota and not visible in CC's token display.
 ---

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@chrono-meta/fh-gate",
-  "version": "1.4.13",
+  "version": "1.4.14",
   "description": "FH runtime adapters — run FH governance, skills, and agents via Claude or Codex with machine-parseable gates.",
   "license": "MIT",
   "keywords": [

package/plugins/fh-meta/skills/steel-quench/SKILL.md CHANGED Viewed

@@ -259,6 +259,7 @@ a single-family pass repeated still misses what cross-family catches, and a targ
 | P6 | **AI dependency single point of failure** | Claude API/MCP removal causes collapse | Document graceful degradation + fallback |
 | P7 | **Hallucination-contaminated defense** | Defense relies on LLM inference, not measurement | Mandate citing original file/commit/value |
 | P8 | **Context Collapse unguarded** | Key instructions lost to compression → harness silent | Review CLAUDE.md compact repeated insertion |
+| P9 | **Harness-bulk as model compensation** | Pipeline thickened to substitute for a model capability ceiling (a gap no iteration count closes — e.g. domain understanding) — complexity replaces missing capability, violating the field axis "simpler over time" (meta-harness: distinguish from complexity that earns its scope) | Route the task class to a stronger model; never paper over the ceiling with more harness. Signals: steps added for one model's weakness; step count rising while class quality stays flat across iterations |
 Add new rows as new patterns are discovered.