@chrono-meta/fh-gate 1.4.17 → 1.4.19
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CATALOG.md
CHANGED
|
@@ -8,6 +8,11 @@ AI reads this file first when searching past work. Open individual files for det
|
|
|
8
8
|
|
|
9
9
|
<!-- Add entries in reverse date order (newest at top) -->
|
|
10
10
|
|
|
11
|
+
### 2026-06-13 | forge-harness | #sidecar-eol-proofing, #agy, #liveness-probe, #rubber-stamp-guard, #upstream-report
|
|
12
|
+
**File:** plugins/fh-meta/skills/{steel-quench,sim-conductor}/SKILL_detail.md + templates/.git-hooks/pre-commit (commit 4693d00)
|
|
13
|
+
Completion sweep of all currently-unblocked carries. FP3: agy joins the sidecar panel as T5 (argument form `agy -p` only — stdin pipe prints help, measured; 60s timebox+1 retry hard rule; trusted-artifacts caution since -p auto-approves tools) and gemini detection becomes a dispatch-form stdin liveness probe (EOL 2026-06-18 leaves the binary alive, backend dead — bare `command -v` goes silently stale). Ack hardening: `below-floor-ack:` now requires a verbatim-quoted operator utterance (unquoted reason = agent-self-writable = blocked; residuals — quote fabrication, out-of-context quoting — named as weekly-audit targets). Challenger round: 1S/4A fixed (cross-fence tb() self-containment, probe-form/dispatch-form mismatch, T1~T5, empty-team synthesis gate, comment overclaim); B1 curly-quote locale block refuted by live test. FP1 closed upstream: increment comment (auto-compact non-recovery + 76% non-repro control) posted to anthropics/claude-code#65359 with operator approval. Knowledge orphans resolved: 3 gitignored paper files de-orphaned (unique framework rescued to companion store, stale dupes deleted).
|
|
14
|
+
- Decision: probe must exercise the same invocation form the dispatch uses (forms diverge on one binary — agy proved it); hook enforces ack form, weekly audit owns genuineness.
|
|
15
|
+
|
|
11
16
|
### 2026-06-11 | forge-harness | #readme-dedup, #commoditization-defense, #b1-boundary, #seed-vetting, #field-routing
|
|
12
17
|
**File:** README.md (PR #93) + fh-be signals/handoffs (PR #24+) + field project v0.14 (private track)
|
|
13
18
|
Cloud session #4 (Mode D, remote-doable batch): README stale duplicate "Measured, not asserted" block removed (pre-tier-floor copy contradicted default-Sonnet stance) + "Where this sits (2026)" positioning para (gate+loop are the asset — plumbing commoditizes). fh-be: B1 scope boundary (AW2 simple-vs-complex import), SC1–4 seed vetting (SC1 phantom arXiv fixed → 2509.19349; SC3 commoditization threat to bet ID), gstack positioning-triangle line. Field: private-track companion handoff verified+executed — DefectPatternMatcher+Bug Mode implemented UTF-8-clean in the field repo (25 tests PASS, 0 regressions); its OpenCode(sLLM) lane's downgrade triage hardened to 2-tier (keep/block) with free-tier 80/20 split — β/SC2 field case logged.
|
package/CHEATSHEET.md
CHANGED
|
@@ -98,7 +98,7 @@ git config core.hooksPath templates/.git-hooks
|
|
|
98
98
|
chmod +x templates/.git-hooks/pre-commit
|
|
99
99
|
```
|
|
100
100
|
|
|
101
|
-
After running `/steel-quench` and `/phantom-quench` in your session, Claude creates the Axes 2+3 pass marker automatically. The marker must carry machine-readable floor fields — the hook validates them (a bare `touch` marker no longer passes; below-floor passes block unless an explicit `below-floor-ack:` line records operator acceptance). If Claude doesn't create it (e.g., session interrupted), create it manually:
|
|
101
|
+
After running `/steel-quench` and `/phantom-quench` in your session, Claude creates the Axes 2+3 pass marker automatically. The marker must carry machine-readable floor fields — the hook validates them (a bare `touch` marker no longer passes; below-floor passes block unless an explicit `below-floor-ack:` line records operator acceptance, **quoting the operator's approval utterance verbatim** — an unquoted reason is rejected as agent-self-writable). If Claude doesn't create it (e.g., session interrupted), create it manually:
|
|
102
102
|
|
|
103
103
|
```bash
|
|
104
104
|
cat > "tracks/_meta/.axes_23_passed_$(git rev-parse --abbrev-ref HEAD | tr '/' '_')_$(date +%Y-%m-%d).marker" <<'EOF'
|
package/README.md
CHANGED
|
@@ -157,6 +157,8 @@ FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-goal --prompt "Implement
|
|
|
157
157
|
|
|
158
158
|
The broader FH automation layer still depends on Claude Code for sub-agents, hooks, and slash commands. The portable path is shared documents plus runtime adapters, not separate Codex and Claude forks.
|
|
159
159
|
|
|
160
|
+
**Recommended posture — Claude Code as orchestrator, others as sidecars.** FH's automation layer (auto-firing hooks, sub-agent dispatch, onboarding, memory) is Claude-Code-native, so the fullest experience runs **Claude Code as the main orchestrator with Gemini, Codex, or Antigravity (`agy`) as actively-used sidecars**. You can also run a **non-CC runtime as your main agent** — you keep the full methodology layer and M1 skills through `fh-gate`/`fh-run`, but you do **not** get the autopilot layer: hooks don't auto-fire, M2 agent-dispatch steps need the adapter (or interactive approval), and M3 skills are reference-only. This is a deliberate two-layer boundary, not a gap to be closed. Per-runtime detail: [`docs/codex-compat.md`](docs/codex-compat.md) (tier-by-tier) and [`multi_model_sidecar_strategy.md`](knowledge/shared/harness-core/multi_model_sidecar_strategy.md) (sidecar engines, including the Gemini→`agy` succession at the 2026-06-18 EOL).
|
|
161
|
+
|
|
160
162
|
**Empirical result (2026-05-31)**: Applied to OpenCode's AI-generated `permission/arity.ts` (163 lines, CI green). Current gate semantics classify this as BLOCKED: 2 A-grade findings CI didn't catch (short-token overflow in allowlist, executor tools absent from arity table).
|
|
161
163
|
|
|
162
164
|
Full spec: [`fh_integration_contract.md`](knowledge/shared/harness-core/fh_integration_contract.md)
|
package/package.json
CHANGED
|
@@ -172,13 +172,19 @@ Run multi-team? (a) Full panel (b) Claude sub-agents only (c) Skip to Area B
|
|
|
172
172
|
| T2 Copilot | `gh copilot suggest` | challenger · expert | `gh copilot suggest -t shell` |
|
|
173
173
|
| T3 Ollama | `ollama run` | challenger | `ollama run llama3 PROMPT` |
|
|
174
174
|
| T4 Codex | `npx @openai/codex exec` | challenger · edge-case-hunter | `echo PROMPT \| npx @openai/codex exec -m gpt-5 -` |
|
|
175
|
+
| T5 agy | `agy -p` (gemini successor) | challenger · beginner | `agy -p "PROMPT"` — argument form only (stdin pipe prints help); timebox+retry hard rule (intermittent hang class); -p auto-approves tools → trusted artifacts only |
|
|
175
176
|
|
|
176
177
|
### CLI detection bash
|
|
177
178
|
|
|
178
179
|
```bash
|
|
179
180
|
# Detect available external CLIs
|
|
180
181
|
AVAILABLE_CLIS=""
|
|
181
|
-
|
|
182
|
+
tb() { perl -e 'alarm shift; exec @ARGV' "$@"; } # portable timebox — darwin ships no `timeout`
|
|
183
|
+
command -v agy &>/dev/null && AVAILABLE_CLIS="$AVAILABLE_CLIS agy"
|
|
184
|
+
# gemini backend EOL 2026-06-18 — binary outlives the service; liveness probe (timeboxed minimal
|
|
185
|
+
# call) replaces the bare `command -v`, which would pass while pipes silently return empty.
|
|
186
|
+
# Probe uses the same stdin-pipe form T1 dispatch uses; result valid per session (no re-probe).
|
|
187
|
+
command -v gemini &>/dev/null && [ -n "$(echo ping | tb 20 gemini 2>/dev/null)" ] && AVAILABLE_CLIS="$AVAILABLE_CLIS gemini"
|
|
182
188
|
command -v gh &>/dev/null && gh copilot --version &>/dev/null && AVAILABLE_CLIS="$AVAILABLE_CLIS copilot"
|
|
183
189
|
command -v ollama &>/dev/null && AVAILABLE_CLIS="$AVAILABLE_CLIS ollama"
|
|
184
190
|
command -v npx &>/dev/null && npx @openai/codex --version &>/dev/null 2>&1 && AVAILABLE_CLIS="$AVAILABLE_CLIS codex"
|
|
@@ -243,7 +243,7 @@ claude CLI also absent → Skip Wave 5, note in residual risk card
|
|
|
243
243
|
|
|
244
244
|
```
|
|
245
245
|
Wave 5 — Multi-Team Panel available.
|
|
246
|
-
Detected external CLIs: [gemini · gh-copilot · ollama | none]
|
|
246
|
+
Detected external CLIs: [agy · gemini · gh-copilot · ollama | none]
|
|
247
247
|
|
|
248
248
|
Estimated token cost:
|
|
249
249
|
External CLIs: each team ×2-3 personas → ~2K–5K tokens per team (billed to that CLI)
|
|
@@ -259,7 +259,15 @@ Run Wave 5?
|
|
|
259
259
|
|
|
260
260
|
```bash
|
|
261
261
|
TEAMS=()
|
|
262
|
-
|
|
262
|
+
# tb = portable timebox (darwin ships no `timeout`; perl alarm works everywhere)
|
|
263
|
+
tb() { perl -e 'alarm shift; exec @ARGV' "$@"; }
|
|
264
|
+
command -v agy &>/dev/null && TEAMS+=("agy")
|
|
265
|
+
# gemini backend EOL 2026-06-18 — the binary outlives the service, so a bare `command -v` goes
|
|
266
|
+
# silently stale (pipes return empty behind 2>/dev/null). Liveness probe (one minimal billed
|
|
267
|
+
# call, timeboxed) gates the slot instead. Probe uses the SAME stdin-pipe form the T1 dispatch
|
|
268
|
+
# uses (probing a different invocation path can pass while dispatch fails — agy proved the two
|
|
269
|
+
# forms diverge on one binary). Probe result is valid for the session — skip re-probe on re-runs.
|
|
270
|
+
command -v gemini &>/dev/null && [ -n "$(echo ping | tb 20 gemini 2>/dev/null)" ] && TEAMS+=("gemini")
|
|
263
271
|
command -v gh &>/dev/null && gh copilot --version &>/dev/null 2>&1 && TEAMS+=("gh-copilot")
|
|
264
272
|
command -v ollama &>/dev/null && TEAMS+=("ollama")
|
|
265
273
|
npx @openai/codex --version &>/dev/null 2>&1 && TEAMS+=("codex")
|
|
@@ -277,6 +285,7 @@ Default team-persona assignments:
|
|
|
277
285
|
| **T2 Copilot** | `gh copilot suggest` | devil · expert |
|
|
278
286
|
| **T3 Ollama** | `ollama run {model}` | devil |
|
|
279
287
|
| **T4 Codex** | `npx @openai/codex exec` | devil · edge-case-hunter |
|
|
288
|
+
| **T5 agy** | `agy -p "PROMPT"` (argument form only — stdin pipe prints help, measured 2026-06-13) | devil · beginner · alternatives (gemini successor) |
|
|
280
289
|
|
|
281
290
|
**Step 1 — Parallel Team Dispatch**:
|
|
282
291
|
|
|
@@ -332,6 +341,46 @@ $ARTIFACT_TAIL" 2>/dev/null) &
|
|
|
332
341
|
$C_EDGE"
|
|
333
342
|
fi
|
|
334
343
|
|
|
344
|
+
# ── T5: agy (Antigravity) — gemini successor ──────────────
|
|
345
|
+
# Constraints: argument form only (`agy -p "PROMPT"`); timebox+retry is a hard rule
|
|
346
|
+
# (intermittent hang class, ~50% observed 2026-06-11); -p auto-approves tool execution,
|
|
347
|
+
# so feed only trusted artifacts — never untrusted external content. Serial by design
|
|
348
|
+
# (worst case ~6 min if the hang class fires on every call) — do NOT copy the T1-T4
|
|
349
|
+
# `VAR=$(...) &` pattern: it assigns inside a background subshell.
|
|
350
|
+
if [[ " ${TEAMS[*]} " =~ " agy " ]]; then
|
|
351
|
+
# tb redefined here — each fenced block must be self-contained when copy-run.
|
|
352
|
+
# Limitation: alarm kills the exec'd process only; a child holding the stdout
|
|
353
|
+
# pipe can outlive it (same gap as `timeout` without process-group kill).
|
|
354
|
+
tb() { perl -e 'alarm shift; exec @ARGV' "$@"; }
|
|
355
|
+
agy_call() { # $1=prompt — 60s timebox, 1 retry on empty
|
|
356
|
+
local out; out=$(tb 60 agy -p "$1" 2>/dev/null)
|
|
357
|
+
[ -z "$out" ] && out=$(tb 60 agy -p "$1" 2>/dev/null)
|
|
358
|
+
printf '%s' "$out"
|
|
359
|
+
}
|
|
360
|
+
A_DEVIL=$(agy_call "[Devil] Adversarial reviewer, no prior context.
|
|
361
|
+
Find 3 critical structural flaws — especially whether Done When criteria are binary and achievable.
|
|
362
|
+
Format: [issue · location · severity S/A/B]
|
|
363
|
+
---
|
|
364
|
+
$ARTIFACT_TAIL")
|
|
365
|
+
A_NEW=$(agy_call "[Beginner] First-time user, zero background.
|
|
366
|
+
Find 3 unclear or jargon-heavy points.
|
|
367
|
+
Format: [issue · location · severity]
|
|
368
|
+
---
|
|
369
|
+
$ARTIFACT_TAIL")
|
|
370
|
+
A_SKEP=$(agy_call "[Alternatives — challenger U1 lens] Pragmatic outsider.
|
|
371
|
+
Find 3 \"why not just X?\" challenges.
|
|
372
|
+
Format: [issue · location · severity]
|
|
373
|
+
---
|
|
374
|
+
$ARTIFACT_TAIL")
|
|
375
|
+
if [ -n "$A_DEVIL$A_NEW$A_SKEP" ]; then
|
|
376
|
+
TEAM_RESULTS["agy"]="$A_DEVIL
|
|
377
|
+
$A_NEW
|
|
378
|
+
$A_SKEP"
|
|
379
|
+
else
|
|
380
|
+
echo "T5 agy: empty after timebox+retry — dropped (degraded coverage, do not count in synthesis)"
|
|
381
|
+
fi
|
|
382
|
+
fi
|
|
383
|
+
|
|
335
384
|
# ── Path B: Cross-session Claude fallback ─────────────────
|
|
336
385
|
if [ ${#TEAMS[@]} -eq 0 ]; then
|
|
337
386
|
TEAM_RESULTS["cross-session-claude"]=$(claude --print \
|
|
@@ -356,7 +405,7 @@ Confidence scoring:
|
|
|
356
405
|
1 team only → B-grade (single-team observation)
|
|
357
406
|
|
|
358
407
|
Claude blind spots (highest value):
|
|
359
|
-
Issues raised by T1~
|
|
408
|
+
Issues raised by T1~T5 but absent from Wave 1~4 (T0) results
|
|
360
409
|
→ flag explicitly as "cross-team delta"
|
|
361
410
|
```
|
|
362
411
|
|