@chrono-meta/fh-gate 1.4.16 → 1.4.18
- package/CATALOG.md +6 -1
- package/CHEATSHEET.md +1 -1
- package/CLAUDE.md +15 -2
- package/package.json +1 -1
- package/plugins/fh-meta/skills/context-doctor/SKILL.md +10 -96
- package/plugins/fh-meta/skills/context-doctor/SKILL_detail.md +114 -0
- package/plugins/fh-meta/skills/field-harvest/SKILL.md +23 -163
- package/plugins/fh-meta/skills/field-harvest/SKILL_detail.md +189 -0
- package/plugins/fh-meta/skills/frontier-digest/SKILL.md +23 -167
- package/plugins/fh-meta/skills/frontier-digest/SKILL_detail.md +177 -0
- package/plugins/fh-meta/skills/goal-quench/SKILL.md +42 -263
- package/plugins/fh-meta/skills/goal-quench/SKILL_detail.md +249 -0
- package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL.md +1 -1
- package/plugins/fh-meta/skills/install-wizard/SKILL.md +35 -420
- package/plugins/fh-meta/skills/install-wizard/SKILL_detail.md +432 -0
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +19 -179
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL_detail.md +149 -0
- package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +7 -1
- package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +52 -3
@@ -33,7 +33,7 @@ The evaluator principle: Haiku judges completion (every turn, cheap). pipeline-c
 
 ## Modes — core → pro → max (fluid)
 
-goal-quench is a ladder, not a fixed shape. The
+goal-quench is a ladder, not a fixed shape. The default (**core**) is the narrow safety belt — for users who only want /goal's two structural gaps closed. **pro** and **max** add optimization and orchestration on top, and are **auto-recommended by the Phase-1 budget verdict** so the orchestration cost is only paid when the task is large enough to justify it.
 
 | Mode | Adds over previous | Chained skills | Auto-recommended when |
 |---|---|---|---|
@@ -58,7 +58,7 @@ For **CC-only users**, pro/max sidecars run as isolated sub-agents (Sonnet for e
 
 Therefore:
 - **Never auto-escalate a GREEN/YELLOW task to pro/max.** For external CLI users, sidecar costs are untracked and invisible in CC's budget display. For CC sub-agent users, the overhead is estimable but still real — orchestrator Opus × 3 applies. Note the appropriate caveat when disclosing cost.
-- **Model escalation cost**: With `/model opusplan`, Opus activates on plan-mode turns
+- **Model escalation cost**: With `/model opusplan`, Opus activates on plan-mode turns and Sonnet handles execution turns — both CC-visible in session jsonl under `message.model`. External CLI sidecar fees are additional and invisible to CC; state both when present.
 - **RED is no longer a dead-end.** v1 hard-blocked RED ("split manually"). v2 turns RED into the **on-ramp to max** — agent-composer decomposes the over-budget goal into sequential sub-goals automatically.
 
 ---
@@ -78,46 +78,11 @@ Therefore:
|
|
|
78
78
|
|
|
79
79
|
## One-Time Setup (required)
|
|
80
80
|
|
|
81
|
-
|
|
81
|
+
**Claude Code runtime**: merge the Stop-hook snippet from `templates/goal-quench-hook-setup.md` into `.claude/settings.json`, and gitignore the two state files. Recovery flags exist for hook failures: `/goal-quench --verify` (Phase 3 only, manual trigger) and `/goal-quench --recover` (interrupted /goal — promotes `.active` → `.pending`).
|
|
82
82
|
|
|
83
|
-
|
|
83
|
+
**Codex runtime**: do not replace Codex's native goal/session capability with goal-quench state files — use `fh-gate`/`fh-goal` as post-run governance wrappers instead.
|
|
84
84
|
|
|
85
|
-
|
|
86
|
-
```
|
|
87
|
-
.claude/goal-quench.active
|
|
88
|
-
.claude/goal-quench.pending
|
|
89
|
-
```
|
|
90
|
-
|
|
91
|
-
**Hook failure fallback**: The Stop hook may fire silently without triggering verification (hook failure, session reset, or CC version differences). If Phase 3 verification has not run after /goal completes, manually trigger it:
|
|
92
|
-
> `/goal-quench --verify` — skips Phase 1+2, runs pipeline-conductor --quick using current session scope.
|
|
93
|
-
|
|
94
|
-
**Interrupted /goal recovery**: If /goal is interrupted (error, user abort, CC crash), the Stop hook may not fire — leaving `.claude/goal-quench.active` without a `.pending`. Detect and recover:
|
|
95
|
-
```bash
|
|
96
|
-
[ -f .claude/goal-quench.active ] && ! [ -f .claude/goal-quench.pending ] && echo "Interrupted — run /goal-quench --recover"
|
|
97
|
-
```
|
|
98
|
-
> `/goal-quench --recover` — promotes `.active` → `.pending` and runs Phase 3 verification on partially-completed work.
|
|
99
|
-
|
|
100
|
-
### Codex Runtime
|
|
101
|
-
|
|
102
|
-
Codex has its own goal/session capability. Do not replace it with goal-quench state files. In Codex-primary sessions:
|
|
103
|
-
|
|
104
|
-
1. Use Codex's native goal/session feature for goal control.
|
|
105
|
-
2. Use `fh-run` for any FH skill/agent sub-step that would otherwise require Claude Code `Agent(...)`.
|
|
106
|
-
3. After the Codex goal completes, run FH governance on changed files:
|
|
107
|
-
|
|
108
|
-
```bash
|
|
109
|
-
FH_BACKEND=codex npx @chrono-meta/fh-gate "{changed-files}" quick codex-goal
|
|
110
|
-
```
|
|
111
|
-
|
|
112
|
-
For non-interactive one-shot runs only, `fh-goal` can run a backend task and then invoke `fh-gate` automatically:
|
|
113
|
-
|
|
114
|
-
```bash
|
|
115
|
-
FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-goal \
|
|
116
|
-
--prompt "{task}" \
|
|
117
|
-
--gate quick
|
|
118
|
-
```
|
|
119
|
-
|
|
120
|
-
`fh-goal` is not a Codex goal replacement; it is a post-run governance wrapper.
|
|
85
|
+
> **Detail**: See `SKILL_detail.md §Setup` — CC hook snippet + gitignore lines, hook-failure/interrupted-run recovery commands, Codex runtime commands — read when performing one-time setup or recovery.
|
|
121
86
|
|
|
122
87
|
---
|
|
123
88
|
|
|
@@ -125,45 +90,26 @@ FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-goal \
|
|
|
125
90
|
|
|
126
91
|
### Step 1. Collect task description
|
|
127
92
|
|
|
128
|
-
Ask:
|
|
129
|
-
> "What will you run with /goal? Describe the task and expected scope."
|
|
93
|
+
Ask: *"What will you run with /goal? Describe the task and expected scope."* Collect: task description, target files or directories, expected output.
|
|
130
94
|
|
|
131
|
-
|
|
95
|
+
**Pre-flight rule (pro/max only)**: verify required chained skills (context-doctor, agent-composer) are installed before proposing pro/max. If any is missing, **fall back to core and warn**: "Pro/max mode requires `{skill}` — not installed. Running in core mode instead."
|
|
132
96
|
|
|
133
|
-
**
|
|
134
|
-
```bash
|
|
135
|
-
for skill in context-doctor agent-composer; do
|
|
136
|
-
[ -f ".claude/plugins/cache/forge-harness/fh-meta/"*"/skills/${skill}/SKILL.md" ] 2>/dev/null \
|
|
137
|
-
|| find ~/.claude/plugins -name "${skill}" -type d 2>/dev/null | grep -q . \
|
|
138
|
-
|| echo "WARNING: ${skill} not found — pro/max mode requires it. Falling back to core."
|
|
139
|
-
done
|
|
140
|
-
```
|
|
141
|
-
If any required skill is missing, **fall back to core and warn**: "Pro/max mode requires `{skill}` — not installed. Running in core mode instead."
|
|
97
|
+
> **Detail**: See `SKILL_detail.md §Phase1-Blocks` — pre-flight check bash, token-budget-gate invocation contract, state file format, threshold injection block, hand-off output — read when executing Phase 1 Steps 1–5.
|
|
142
98
|
|
|
143
99
|
### Step 2. token-budget-gate estimate
|
|
144
100
|
|
|
145
|
-
|
|
146
|
-
```
|
|
147
|
-
Input: task description (one paragraph), target file count (approximate)
|
|
148
|
-
Trigger phrase: "estimate token budget for: {task description}"
|
|
149
|
-
Expected output fields:
|
|
150
|
-
- estimated_tokens: N
|
|
151
|
-
- verdict: GREEN | YELLOW | ORANGE | RED
|
|
152
|
-
- reasoning: one-line basis for estimate
|
|
153
|
-
```
|
|
101
|
+
Run token-budget-gate against the task description (invocation contract in §Phase1-Blocks). If the skill is not installed, use this fallback estimator and note `budget_source: fallback-heuristic` in `.active`:
|
|
154
102
|
|
|
155
|
-
If `token-budget-gate` skill is not installed, use this fallback estimator:
|
|
156
103
|
```
|
|
157
104
|
< 5 files changed, no new architecture → GREEN (< 10K)
|
|
158
105
|
5–20 files or new module/feature → YELLOW (10K–30K)
|
|
159
106
|
20+ files or cross-system refactor → ORANGE (30K–60K)
|
|
160
107
|
Full-project migration or rewrite → RED (> 60K)
|
|
161
108
|
```
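The fallback tiers above can be sketched as a small shell function. This is only an illustration under stated assumptions: `fallback_verdict` is a hypothetical helper, not something shipped in the package; it looks at file count alone (it ignores the "new architecture" and "new module/feature" signals), and it treats exactly 20 files as YELLOW because the tiers overlap at that boundary.

```bash
# Hypothetical helper (not part of the package): map an approximate
# changed-file count to the fallback verdict tiers listed above.
# Pass "rewrite" as the second argument for a full-project migration or rewrite.
fallback_verdict() {
  files="$1"
  kind="${2:-normal}"
  if [ "$kind" = "rewrite" ]; then
    echo "RED"      # > 60K
  elif [ "$files" -gt 20 ]; then
    echo "ORANGE"   # 30K-60K
  elif [ "$files" -ge 5 ]; then
    echo "YELLOW"   # 10K-30K (assumption: exactly 20 files stays YELLOW)
  else
    echo "GREEN"    # < 10K
  fi
}

fallback_verdict 3
fallback_verdict 12
fallback_verdict 40
fallback_verdict 40 rewrite
```

A real estimate would also weigh the architectural signals in each tier, so treat the file count as a first approximation only.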
|
|
162
|
-
**Session overhead factor (empirical calibration, N=10, 2026-06-01–06-03)**: The tiers above estimate *task* tokens only. Actual full CC session tokens average 4.7× the task estimate (range: 1.3×–10.5×) for harness-heavy projects (dense CLAUDE.md + multi-rule auto-load). Multiply task estimate by 4× for FH-density projects (rough inference only — not measured for non-FH projects), to get expected session total. This multiplier informs the mode recommendation — it does not change the go/no-go gate thresholds.
|
|
163
109
|
|
|
164
|
-
|
|
110
|
+
**Session overhead factor (empirical, N=10, 2026-06-01–06-03)**: the tiers estimate *task* tokens only. Actual full CC session tokens average 4.7× the task estimate (range 1.3×–10.5×) for harness-heavy projects. Multiply task estimate by 4× for FH-density projects to get the expected session total — this informs the mode recommendation, not the go/no-go thresholds.
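As a worked example of the overhead factor, the arithmetic could look like the sketch below. `session_estimate` is a hypothetical name used for illustration; the default multiplier of 4 is the rounded figure from the text, and the 12000-token task estimate is an invented sample value.

```bash
# Hypothetical helper: expected full-session tokens for an FH-density project.
# $1 = task token estimate, $2 = optional multiplier (defaults to 4, per the text).
session_estimate() {
  task_tokens="$1"
  factor="${2:-4}"
  echo $(( task_tokens * factor ))
}

# A 12000-token task estimate (YELLOW tier) implies roughly 48000 session tokens.
session_estimate 12000
```

The result feeds the mode recommendation only; the go/no-go tiers are still evaluated against the unmultiplied task estimate.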
|
|
165
111
|
|
|
166
|
-
|
|
112
|
+
Map the verdict to a go/no-go decision:
|
|
167
113
|
|
|
168
114
|
| token-budget-gate verdict | goal-quench action |
|
|
169
115
|
|---|---|
|
|
@@ -172,60 +118,21 @@ Run token-budget-gate against the task description. Map the result to a go/no-go
|
|
|
172
118
|
| ORANGE (30K–60K) | Propose **pro mode** as the cheaper path: "This is expensive. Run as a single /goal, or let pro mode optimize (context-doctor) + decompose (agent-composer) it?" Proceed in core only if the user declines orchestration. |
|
|
173
119
|
| RED (> 60K) | Propose **max mode** instead of a hard block: "Too big for one /goal run. Route through max mode — agent-composer decomposes into sequential sub-goals, plugin-recommender fills any capability gaps." Hard-block only if the user declines orchestration. |
|
|
174
120
|
|
|
175
|
-
On RED,
|
|
176
|
-
- **Accept max mode** (recommended) → goal-quench enters Phase 1.5 and lets agent-composer decompose the over-budget goal into sequential sub-goals, each small enough to run under its own budget. This is the v2 recovery path that replaces v1's dead-end.
|
|
177
|
-
- **Decline orchestration** → goal-quench halts and does not write `.active` or inject thresholds. The user may still run `/goal` manually — but goal-quench will not gate or verify a session started without its active file.
|
|
121
|
+
On RED: **accept max mode** (recommended) → Phase 1.5 decomposition recovery path; **decline orchestration** → goal-quench halts, writes no `.active`, and will not gate or verify a session started without its active file.
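The verdict-to-mode proposal described in this step can be summarized in a short sketch. `recommend_mode` is a hypothetical function name; it only returns the mode that would be proposed, and the user may still decline orchestration as described above.

```bash
# Hypothetical helper: which mode goal-quench would propose for a budget verdict.
# GREEN/YELLOW are never auto-escalated; ORANGE proposes pro; RED proposes max.
recommend_mode() {
  case "$1" in
    GREEN|YELLOW) echo "core" ;;
    ORANGE)       echo "pro" ;;
    RED)          echo "max" ;;
    *)            echo "unknown verdict: $1" >&2; return 1 ;;
  esac
}

recommend_mode ORANGE
```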
|
|
178
122
|
|
|
179
123
|
### Step 3. Write state file
|
|
180
124
|
|
|
181
|
-
Create `.claude/goal-quench.active
|
|
182
|
-
|
|
183
|
-
```
|
|
184
|
-
scope: {task description — one line}
|
|
185
|
-
mode: core | pro | max
|
|
186
|
-
target_files: {comma-separated file paths or directory; "inferred from git diff" if not known}
|
|
187
|
-
budget_estimate: {N} tokens
|
|
188
|
-
budget_source: token-budget-gate | fallback-heuristic
|
|
189
|
-
budget_verdict: {GREEN/YELLOW/ORANGE/RED}
|
|
190
|
-
pipeline_mode: {--quick for core/pro · --full for max}
|
|
191
|
-
composed_plan: {sub-goal list from Phase 1.5 agent-composer; "n/a" in core mode}
|
|
192
|
-
timestamp: {YYYY-MM-DD HH:MM}
|
|
193
|
-
session_pid: {$$}
|
|
194
|
-
start_commit: {git rev-parse HEAD}
|
|
195
|
-
```
|
|
196
|
-
|
|
197
|
-
`target_files` is the **verification artifact** — what pipeline-conductor --quick will evaluate in Phase 3. Specify files/directories explicitly if known; otherwise write `"inferred from git diff"` and Phase 3 will resolve using `git diff {start_commit}..HEAD` to capture all changes including interim commits made during the /goal session.
|
|
125
|
+
Create `.claude/goal-quench.active` (format in §Phase1-Blocks). `target_files` is the **verification artifact** — what pipeline-conductor will evaluate in Phase 3. Specify explicitly if known; otherwise write `"inferred from git diff"` and Phase 3 resolves via `git diff {start_commit}..HEAD` (captures interim commits).
|
|
198
126
|
|
|
199
|
-
> **Control flow (core vs pro/max)**: in **core** mode, proceed directly Step 3 → Step 4. In **pro/max** mode, run **Phase 1.5 (orchestration) here — after Step 3, before Step 4** — then return to Step 4
|
|
127
|
+
> **Control flow (core vs pro/max)**: in **core** mode, proceed directly Step 3 → Step 4. In **pro/max** mode, run **Phase 1.5 (orchestration) here — after Step 3, before Step 4** — then return to Step 4. The `composed_plan` field in `.active` stays `n/a` until Phase 1.5 fills it.
|
|
200
128
|
|
|
201
129
|
### Step 4. Inject mid-run budget thresholds
|
|
202
130
|
|
|
203
|
-
Before the user runs /goal, output the
|
|
204
|
-
|
|
205
|
-
```
|
|
206
|
-
─── goal-quench budget thresholds (active for this /goal run) ───
|
|
207
|
-
50% of estimated budget consumed:
|
|
208
|
-
→ Output a one-line progress summary. No action required.
|
|
209
|
-
|
|
210
|
-
70% consumed:
|
|
211
|
-
→ YELLOW: re-prioritize remaining tasks. Drop lowest-priority items
|
|
212
|
-
if the goal can still be met without them.
|
|
213
|
-
|
|
214
|
-
85% consumed:
|
|
215
|
-
→ ORANGE: stop current task, commit completed work, surface to user:
|
|
216
|
-
"Budget at 85%. Completed: [X]. Remaining: [Y]. Continue / stop?"
|
|
217
|
-
|
|
218
|
-
95% consumed:
|
|
219
|
-
→ RED: stop immediately. Commit everything completed so far.
|
|
220
|
-
Output: "Budget exhausted. Completed tasks: [list]. Incomplete: [list]."
|
|
221
|
-
─────────────────────────────────────────────────────────────────
|
|
222
|
-
```
|
|
131
|
+
Before the user runs /goal, output the threshold instruction block (50% progress note · 70% re-prioritize · 85% stop-and-ask · 95% stop-and-commit — full block in §Phase1-Blocks) so Claude carries it into the /goal session.
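To make the four thresholds concrete, here is a minimal sketch of the percent-to-action mapping. `threshold_action` is a hypothetical helper, and the token counts passed to it are assumed inputs: as the document notes elsewhere, Claude cannot reliably read its real-time consumption, so in practice these figures are approximations.

```bash
# Hypothetical helper: map consumed vs. estimated tokens to a threshold action.
# $1 = tokens consumed so far (approximate), $2 = Phase 1 token estimate.
threshold_action() {
  consumed="$1"
  estimate="$2"
  if [ "$estimate" -le 0 ]; then
    echo "invalid estimate" >&2
    return 1
  fi
  pct=$(( consumed * 100 / estimate ))   # integer percent, rounded down
  if [ "$pct" -ge 95 ]; then
    echo "RED: stop immediately and commit completed work"
  elif [ "$pct" -ge 85 ]; then
    echo "ORANGE: stop current task, commit, ask continue or stop"
  elif [ "$pct" -ge 70 ]; then
    echo "YELLOW: re-prioritize remaining tasks"
  elif [ "$pct" -ge 50 ]; then
    echo "progress: output a one-line summary"
  else
    echo "none"
  fi
}

threshold_action 15000 20000
```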
|
|
223
132
|
|
|
224
133
|
### Step 5. Hand off to /goal
|
|
225
134
|
|
|
226
|
-
Output
|
|
227
|
-
> "goal-quench ready. Budget: {verdict} (~{N} tokens). Now run: `/goal {your condition}`
|
|
228
|
-
> When /goal finishes, the Stop hook will trigger pipeline-conductor --quick automatically."
|
|
135
|
+
Output the ready message (budget verdict + estimated tokens + "run `/goal {your condition}`" — template in §Phase1-Blocks).
|
|
229
136
|
|
|
230
137
|
---
|
|
231
138
|
|
|
@@ -237,30 +144,18 @@ The ordering is deliberate: optimize the context first (cheapest win), then deco
|
|
|
237
144
|
|
|
238
145
|
### Step A — context-doctor (token-reduction pre-pass) · pro + max
|
|
239
146
|
|
|
240
|
-
Invoke context-doctor on the target scope before /goal runs
|
|
241
|
-
|
|
242
|
-
**Re-estimate** the budget after the pre-pass. If the verdict drops (e.g., ORANGE → YELLOW), offer to step back down to core: "Context trimmed; estimate is now YELLOW. Continue in pro, or drop to core?" (See Simplification Guards — post-optimization step-down.)
|
|
147
|
+
Invoke context-doctor on the target scope before /goal runs (generates/updates `.claudeignore`, flags over-read files, recommends `/clear` timing). **Re-estimate** the budget after the pre-pass. If the verdict drops (e.g., ORANGE → YELLOW), offer to step back down to core (see Simplification Guards — post-optimization step-down).
|
|
243
148
|
|
|
244
149
|
### Step B — agent-composer (goal decomposition) · pro + max
|
|
245
150
|
|
|
246
|
-
Hand the task description to agent-composer in compose-only mode
|
|
151
|
+
Hand the task description to agent-composer in compose-only mode → Wave plan with capability-fit scoring (agent-composer Step 0.2). For RED-origin runs this decomposition is the recovery path v1 lacked.
|
|
247
152
|
|
|
248
|
-
**Sub-goal execution (user-driven, not automatic)**: `/goal` is
|
|
249
|
-
|
|
250
|
-
> "Sub-goal plan ready. Run each sub-goal in order by invoking `/goal-quench` again with the next sub-goal description. goal-quench will gate each sub-goal independently."
|
|
251
|
-
|
|
252
|
-
Sub-goal queue format (`.claude/goal-quench.queue`):
|
|
253
|
-
```
|
|
254
|
-
remaining:
|
|
255
|
-
- sub-goal-1: {description}
|
|
256
|
-
- sub-goal-2: {description}
|
|
257
|
-
completed: []
|
|
258
|
-
```
|
|
259
|
-
|
|
260
|
-
Each `/goal-quench` invocation pops the first `remaining` item, runs it as a core-mode run (Phase 1 GREEN/YELLOW expected → /goal → Phase 3 verify → commit), then moves it to `completed`. The outer pro/max run owns the decomposition; each inner invocation owns its sub-goal's gate. When `remaining` is empty, delete `.claude/goal-quench.queue`. (Parallel sub-goal execution is deferred — sequential is the v2 contract.)
|
|
153
|
+
**Sub-goal execution (user-driven, not automatic)**: `/goal` is user-invoked — goal-quench cannot programmatically loop over sub-goals. Write the plan to `.claude/goal-quench.queue` (format in §Queue-Format) and instruct the user to invoke `/goal-quench` once per sub-goal. Each invocation pops the first `remaining` item, runs it as an independent core-mode run, then moves it to `completed`. When `remaining` is empty, delete the queue file. (Parallel sub-goal execution is deferred — sequential is the v2 contract.)
|
|
261
154
|
|
|
262
155
|
goal-quench does **not** re-implement agent-composer's gates — its destructive-action gate (Step 2.7) and per-wave fan-out cap apply as-is.
|
|
263
156
|
|
|
157
|
+
> **Detail**: See `SKILL_detail.md §Queue-Format` — queue file format, plan-ready output text, sidecar `.active` fields — read when writing the queue (Step B) or the sidecar fields (Step D).
|
|
158
|
+
|
|
264
159
|
### Step C — plugin-recommender + synergy pre-validation · max only
|
|
265
160
|
|
|
266
161
|
Triggered only when agent-composer Step 0.2 reports a capability **GAP** (`fit_score < 0.5` on a required-weight sub-task):
|
|
@@ -272,64 +167,38 @@ max mode never installs anything silently — discovery and synergy-check are su
|
|
|
272
167
|
|
|
273
168
|
### Step D — scope-driven sidecar configuration · pro + max
|
|
274
169
|
|
|
275
|
-
Runs after Step C (or after Step B if no capability gap). Selects an adversarial sidecar based on the task's quality-risk profile — distinct from Step C's capability-gap sidecar
|
|
170
|
+
Runs after Step C (or after Step B if no capability gap). Selects an adversarial sidecar based on the task's quality-risk profile — distinct from Step C's capability-gap sidecar. Step D's sidecar addresses **blind-spot risk**: the generator and reviewer sharing the same reasoning distribution.
|
|
276
171
|
|
|
277
|
-
> **Availability resolution**: "
|
|
172
|
+
> **Availability resolution**: "if available" below is decided by the canonical Tier 1→2→3 recipe in `knowledge/shared/harness-core/multi_model_sidecar_strategy.md §Sidecar Engine Resolution Protocol` — discovery is automatic, invocation stays value-gated. Tier 3 (Claude sub-agent) guarantees this step never hard-fails even with zero external CLIs/keys.
|
|
278
173
|
|
|
279
174
|
**Scope → sidecar routing table**:
|
|
280
175
|
|
|
281
176
|
| Task scope signal | Sidecar | Invocation point |
|
|
282
177
|
|---|---|---|
|
|
283
|
-
| Code quality review / new SKILL.md / governance gate change | `steel-quench` cross-provider challenger (Gemini sidecar if available; Tier 3 Claude sub-agent fallback
|
|
178
|
+
| Code quality review / new SKILL.md / governance gate change | `steel-quench` cross-provider challenger (Gemini sidecar if available; Tier 3 Claude sub-agent fallback) | Post-/goal, before pipeline-conductor |
|
|
284
179
|
| Architecture design / cross-project dependency | `agent-composer` multi-model panel (if external CLIs available; otherwise single-Claude sub-agent) | Parallel to /goal as separate Agent |
|
|
285
180
|
| External publication / marketplace-gate / skill release | `sim-conductor` + `steel-quench` Wave 5 | Post-/goal quality gate |
|
|
286
181
|
| No signal match (default) | None — pipeline-conductor handles quality alone | — |
|
|
287
182
|
|
|
288
|
-
Write resolved sidecar to `.active
|
|
289
|
-
```
|
|
290
|
-
sidecar: none | steel-quench-crossprovider | agent-composer-panel | sim-conductor | {cli-name}
|
|
291
|
-
sidecar_rationale: {one-line reason — which scope signal triggered this}
|
|
292
|
-
```
|
|
293
|
-
|
|
294
|
-
Output to user (one line only):
|
|
295
|
-
> "Sidecar: {config} — {rationale}"
|
|
296
|
-
|
|
297
|
-
Do not ask for confirmation. The user may override by re-running `/goal-quench --sidecar none`.
|
|
183
|
+
Write the resolved `sidecar:` + `sidecar_rationale:` fields to `.active` (format in §Queue-Format). Output one line: *"Sidecar: {config} — {rationale}"*. Do not ask for confirmation; the user may override via `/goal-quench --sidecar none`.
|
|
298
184
|
|
|
299
|
-
**Dispatch-failure triage (saturation disguise)**: a sidecar dispatch
|
|
300
|
-
usage-credits error late in a long run may be reporting *session context saturation*, not a billing
|
|
301
|
-
gate (measured 2026-06-12: identical error pre-compaction; the same opus-pinned dispatch succeeded
|
|
302
|
-
post-compaction with the sub-agent completing normally). Triage order: compact — flushing handoff
|
|
303
|
-
state to disk first — → retry the dispatch once → only then take the headless `claude -p` fallback
|
|
304
|
-
(credit-pool cost) or an inline at-floor pass. Concluding "billing gate" from the first failure
|
|
305
|
-
skips the cheapest recovery.
|
|
185
|
+
**Dispatch-failure triage (saturation disguise)**: a sidecar dispatch failing with a 1M-context / usage-credits error late in a long run may be reporting *session context saturation*, not a billing gate (measured 2026-06-12). Triage order: compact — flushing handoff state to disk first — → retry the dispatch once → only then take the headless `claude -p` fallback (credit-pool cost) or an inline at-floor pass.
|
|
306
186
|
|
|
307
187
|
### Hand-off
|
|
308
188
|
|
|
309
|
-
After orchestration, **update** (not re-create) `.claude/goal-quench.active` — add `mode:` and `composed_plan:` lines alongside existing budget fields. Re-creating the file loses the `start_commit` field
|
|
189
|
+
After orchestration, **update** (not re-create) `.claude/goal-quench.active` — add `mode:` and `composed_plan:` lines alongside existing budget fields. Re-creating the file loses the `start_commit` field. Then proceed to threshold injection (Phase 1 Step 4) and hand off to /goal. Phase 3 verification runs as in core — `pipeline-conductor --quick` for pro, `--full` for max.
|
|
310
190
|
|
|
311
191
|
---
|
|
312
192
|
|
|
313
193
|
## Phase 2 — Mid-run (during /goal execution)
|
|
314
194
|
|
|
315
|
-
goal-quench does not directly control /goal execution. The budget thresholds injected in Phase 1 are instructions Claude follows during the /goal session.
|
|
316
|
-
|
|
317
|
-
**What Claude should do mid-run**:
|
|
318
|
-
|
|
319
|
-
- Track approximate token consumption against the Phase 1 estimate
|
|
320
|
-
- At each threshold (50/70/85/95%), execute the corresponding action
|
|
321
|
-
- Do not wait for goal-quench to intervene — the thresholds are self-enforced
|
|
322
|
-
|
|
323
|
-
**Threshold enforcement caveat**: Claude cannot reliably read its own real-time token consumption during a /goal session. The 50/70/85/95% thresholds are instructional — Claude approximates consumption based on task complexity and turn count. They are not mechanically enforced. For hard enforcement, see the Anthropic feature request (--budget flag).
|
|
195
|
+
goal-quench does not directly control /goal execution. The budget thresholds injected in Phase 1 are instructions Claude follows during the /goal session: track approximate consumption against the estimate, execute the corresponding action at each threshold (50/70/85/95%), self-enforced without waiting for goal-quench.
|
|
324
196
|
|
|
325
|
-
**
|
|
197
|
+
**Threshold enforcement caveat**: Claude cannot reliably read its own real-time token consumption — thresholds are instructional approximations, not mechanically enforced. For hard enforcement, see the Anthropic feature request (--budget flag).
|
|
326
198
|
|
|
327
|
-
**
|
|
328
|
-
- Hard-enforce token ceiling mid-run (--budget flag, not yet available)
|
|
329
|
-
- Auto-checkpoint commit on sub-goal completion (--checkpoint flag, not yet available)
|
|
330
|
-
- Mid-run Sonnet quality signal (requires structured evaluator hooks)
|
|
199
|
+
**Queue mode (per-sub-goal threshold scope)**: each `/goal-quench` invocation from the queue is an independent core-mode run — thresholds are re-injected fresh against that sub-goal's own Phase 1 estimate, not inherited from the outer pro/max run.
|
|
331
200
|
|
|
332
|
-
These gaps are the basis for the
|
|
201
|
+
**What goal-quench cannot do in v1** (requires native Anthropic support): hard token ceiling (--budget), auto-checkpoint commits (--checkpoint), mid-run Sonnet quality signal. These gaps are the basis for the feature request (`knowledge/shared/harness-core/goal_quench_anthropic_issue.md`).
|
|
333
202
|
|
|
334
203
|
---
|
|
335
204
|
|
|
@@ -337,67 +206,21 @@ These gaps are the basis for the Anthropic feature request (see `knowledge/share
|
|
|
337
206
|
|
|
338
207
|
### Claude Code path
|
|
339
208
|
|
|
340
|
-
When /goal stops, the Stop hook
|
|
341
|
-
1. Detects `.claude/goal-quench.active`
|
|
342
|
-
2. Copies it to `.claude/goal-quench.pending`
|
|
343
|
-
3. Removes `.active`
|
|
344
|
-
4. Prints: "[goal-quench] /goal finished. Verification pending — starting pipeline-conductor --quick."
|
|
345
|
-
|
|
346
|
-
On the next Claude response (after /goal), check for `.claude/goal-quench.pending` **with a freshness guard** — only act if the timestamp in the file is within the current session (< 4 hours old):
|
|
347
|
-
|
|
348
|
-
```bash
|
|
349
|
-
if [ -f .claude/goal-quench.pending ]; then
|
|
350
|
-
PENDING_TIME=$(grep "^timestamp:" .claude/goal-quench.pending | sed 's/^timestamp: //')
|
|
351
|
-
NOW=$(date +%s)
|
|
352
|
-
PENDING_EPOCH=$(date -d "$PENDING_TIME" +%s 2>/dev/null || date -j -f "%Y-%m-%d %H:%M" "$PENDING_TIME" +%s 2>/dev/null)
|
|
353
|
-
AGE_HOURS=$(( (NOW - PENDING_EPOCH) / 3600 ))
|
|
354
|
-
if [ "$AGE_HOURS" -lt 4 ]; then
|
|
355
|
-
echo "goal-quench verification pending (created: $PENDING_TIME — ${AGE_HOURS}h ago)"
|
|
356
|
-
else
|
|
357
|
-
echo "STALE: goal-quench.pending is ${AGE_HOURS}h old. Run /goal-quench --verify to re-evaluate or delete to clear."
|
|
358
|
-
exit 0
|
|
359
|
-
fi
|
|
360
|
-
fi
|
|
361
|
-
```
|
|
362
|
-
|
|
363
|
-
If found and fresh: **automatically run pipeline-conductor** before responding to any other request — `--quick` for core/pro, `--full` for max (the `mode:` field in `.pending` selects which). This is not optional — pending verification takes priority.
|
|
209
|
+
When /goal stops, the Stop hook: detects `.active` → copies to `.pending` → removes `.active` → prints the verification-pending notice.
|
|
364
210
|
|
|
365
|
-
|
|
211
|
+
On the next Claude response, check `.claude/goal-quench.pending` **with a freshness guard** (bash in §Phase3-Bash):
|
|
212
|
+
- **Fresh (< 4 hours)**: **automatically run pipeline-conductor** before responding to any other request — `--quick` for core/pro, `--full` for max (the `mode:` field in `.pending` selects). Not optional — pending verification takes priority.
|
|
213
|
+
- **Stale (> 4 hours)**: warn — "A stale goal-quench.pending exists from {timestamp}. Run `/goal-quench --verify` to re-evaluate, or delete it to clear." Do not auto-trigger verification on stale state.
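The freshness decision and the mode-based flag choice can be sketched as two small functions. Both names are hypothetical and this is not the script from `SKILL_detail.md`; it assumes the timestamps have already been converted to epoch seconds, and it compares whole hours, so an age of exactly four hours or more counts as stale.

```bash
# Hypothetical helper: succeed (exit 0) when the pending file is under 4 hours old.
# $1 = creation time in epoch seconds, $2 = current time in epoch seconds.
pending_is_fresh() {
  created="$1"
  now="$2"
  age_hours=$(( (now - created) / 3600 ))   # whole hours, rounded down
  [ "$age_hours" -lt 4 ]
}

# Hypothetical helper: pick the pipeline-conductor flag from the mode field.
pipeline_flag() {
  case "$1" in
    core|pro) echo "--quick" ;;
    max)      echo "--full" ;;
    *)        echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

if pending_is_fresh 0 3600; then
  echo "fresh: run pipeline-conductor $(pipeline_flag pro)"
else
  echo "stale: warn and wait for /goal-quench --verify"
fi
```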
|
|
366
214
|
|
|
367
215
|
### Codex path
|
|
368
216
|
|
|
369
|
-
|
|
370
|
-
|
|
371
|
-
```bash
|
|
372
|
-
FH_BACKEND=codex npx @chrono-meta/fh-gate "{changed-files}" quick codex-goal
|
|
373
|
-
```
|
|
217
|
+
After the Codex goal completes, run `fh-gate` on changed files (commands in §Phase3-Bash). Treat `BLOCKED`/`ESCALATE` the same as the Claude path.
|
|
374
218
|
|
|
375
|
-
|
|
219
|
+
> **Detail**: See `SKILL_detail.md §Phase3-Bash` — freshness guard bash, verification-artifact resolution bash, Codex gate commands — read when executing Phase 3.
|
|
376
220
|
|
|
377
221
|
### Verification flow
|
|
378
222
|
|
|
379
|
-
|
|
380
|
-
# Read scope and target from pending file
|
|
381
|
-
SCOPE=$(grep "^scope:" .claude/goal-quench.pending | sed 's/^scope: //')
|
|
382
|
-
TARGET=$(grep "^target_files:" .claude/goal-quench.pending | sed 's/^target_files: //')
|
|
383
|
-
|
|
384
|
-
# Resolve artifact: explicit target or git diff
|
|
385
|
-
# Use git diff against the commit recorded at Phase 1 start, NOT HEAD,
|
|
386
|
-
# to capture files changed during the entire /goal session including interim commits.
|
|
387
|
-
GOAL_START_COMMIT=$(grep "^start_commit:" .claude/goal-quench.pending | sed 's/^start_commit: //')
|
|
388
|
-
if [ "$TARGET" = "inferred from git diff" ] || [ -z "$TARGET" ]; then
|
|
389
|
-
if [ -n "$GOAL_START_COMMIT" ]; then
|
|
390
|
-
TARGET=$(git diff "$GOAL_START_COMMIT"..HEAD --name-only 2>/dev/null | tr '\n' ' ')
|
|
391
|
-
else
|
|
392
|
-
# Fallback: staged + unstaged changes (does not capture interim commits)
|
|
393
|
-
TARGET=$(git status --short 2>/dev/null | awk '{print $2}' | tr '\n' ' ')
|
|
394
|
-
fi
|
|
395
|
-
fi
|
|
396
|
-
# Guard: reject empty target
|
|
397
|
-
[ -z "$TARGET" ] && echo "ERROR: cannot resolve verification artifact — no files changed or start_commit missing" && exit 1
|
|
398
|
-
```
|
|
399
|
-
|
|
400
|
-
Run `pipeline-conductor --quick` on `$TARGET` (the artifact changed during this /goal session). Gate on result:
|
|
223
|
+
Resolve the verification artifact from `.pending` — explicit `target_files`, or `git diff {start_commit}..HEAD` when inferred (bash in §Phase3-Bash; empty target = hard error). Run pipeline-conductor on the artifact and gate on the result:
|
|
401
224
|
|
|
402
225
|
| pipeline-conductor verdict | goal-quench action | Delete `.pending`? |
|
|
403
226
|
|---|---|---|
|
|
@@ -406,9 +229,9 @@ Run `pipeline-conductor --quick` on `$TARGET` (the artifact changed during this
 | `BLOCKED` | Block commit. Output blocking items. Ask: "Fix and re-run /goal, or accept partial completion?" | **Only after user acknowledges** |
 | `ESCALATE` | Surface to user for decision. Do not auto-delete — preserve state until user explicitly decides. | **Only after user decision** |
 
-**Deletion rule**: On BLOCKED or ESCALATE, `.pending` must survive until the user makes an explicit decision
+**Deletion rule**: On BLOCKED or ESCALATE, `.pending` must survive until the user makes an explicit decision — deleting before that loses the recovery anchor.
 
-**Deadlock prevention**: While `.pending` exists (BLOCKED/ESCALATE), the next-turn verification check runs ONCE per turn only
+**Deadlock prevention**: While `.pending` exists (BLOCKED/ESCALATE), the next-turn verification check runs ONCE per turn only. After the first verification output, suppress further auto-trigger until the user explicitly acts. If the user starts a new `/goal-quench` run while `.pending` exists, warn and do not silently overwrite with a new `.active`.
 
 Record actual token usage in `tracks/_meta/goal_quench_{YYYY-MM-DD}.md` for calibration (regardless of verdict).
 
@@ -428,53 +251,9 @@ This gate is **blocking** — Done When cannot be reached until the sidecar verd
|
|
|
428
251
|
|
|
429
252
|
## Calibration Record
|
|
430
253
|
|
|
431
|
-
After each goal-quench run, append to `tracks/_meta/goal_quench_{YYYY-MM-DD}.md
|
|
432
|
-
|
|
433
|
-
```yaml
|
|
434
|
-
- run_id: GQ-{YYYYMMDD}-{N} # e.g. GQ-20260603-01 — unique per run, links to session JSONL
|
|
435
|
-
session_id: "{first-8-chars-of-jsonl-filename}" # for JSONL back-trace
|
|
436
|
-
date: YYYY-MM-DD
|
|
437
|
-
task: {one-line description}
|
|
438
|
-
scope_hint: "{N files affected, task type}" # e.g. "3 new files, signal doc"
|
|
439
|
-
mode: core | pro | max
|
|
440
|
-
session_type: minor | normal | heavy | continuation # for within-type comparison
|
|
441
|
-
estimated_tokens: N
|
|
442
|
-
budget_source: token-budget-gate | fallback-heuristic
|
|
443
|
-
actual_tokens: N # from ~/.claude/projects/*/conversation*.jsonl or user-reported; "unknown" if unavailable
|
|
444
|
-
estimation_error: over | under | accurate
|
|
445
|
-
estimation_error_pct: "N%" # e.g. "717%" — magnitude of under/over; omit if accurate
|
|
446
|
-
actual_vs_estimate_ratio: N.N # actual / estimated (e.g., 4.7 means actual was 4.7× estimate)
|
|
447
|
-
budget_verdict: GREEN/YELLOW/ORANGE/RED
|
|
448
|
-
pipeline_verdict: CLEAN/PENDING/BLOCKED/ESCALATE
|
|
449
|
-
sidecar: none | steel-quench-crossprovider | agent-composer-panel | sim-conductor | {cli-name}
|
|
450
|
-
sidecar_type: none | cc-subagent | external-cli # cc-subagent = isolated Sonnet context; external-cli = untracked
|
|
451
|
-
sidecar_model: none | sonnet | {model-name} # standard baseline: sonnet; external CLIs vary by tier
|
|
452
|
-
orchestrator_tokens: N | unknown # Opus orchestrator CC tokens (pro/max only; visible in CC)
|
|
453
|
-
sidecar_tokens: N | unknown # Sonnet sub-agent CC tokens if cc-subagent; "untracked" if external-cli
|
|
454
|
-
sidecar_findings_count: N # 0 if sidecar=none
|
|
455
|
-
threshold_triggered: none/50/70/85/95
|
|
456
|
-
notes: {optional — why estimate was off, what scope changed}
|
|
457
|
-
```
|
|
458
|
-
|
|
459
|
-
**`actual_tokens` collection**: Claude cannot read session token counts directly. Preferred method:
|
|
460
|
-
```bash
|
|
461
|
-
python3 -c "
|
|
462
|
-
import json, glob
|
|
463
|
-
f = sorted(glob.glob('~/.claude/projects/*/conversation*.jsonl'.replace('~', __import__('os').path.expanduser('~'))))[-1]
|
|
464
|
-
lines = [json.loads(l) for l in open(f)]
|
|
465
|
-
total = sum(m.get('message',{}).get('usage',{}).get('input_tokens',0) + m.get('message',{}).get('usage',{}).get('output_tokens',0) for m in lines)
|
|
466
|
-
print(f'Session tokens: {total:,}')
|
|
467
|
-
"
|
|
468
|
-
```
|
|
469
|
-
Fallback: turn count × estimated tokens/turn (~2K for short turns, ~8K for long file edits). Write `"unknown"` if no estimate is possible.
|
|
470
|
-
|
|
471
|
-
**Retrospective calibration baseline (N=10, 2026-06-01–06-03, Sonnet)**: mean actual/estimate ratio = 4.7× (range 1.3×–10.5×). Systematic underestimation due to session overhead. Full data held in the private companion store (`paper-signals/`).
|
|
472
|
-
|
|
473
|
-
After 10 additional prospective runs, compute mean estimation error per mode — calibrate the session_overhead_factor if systematic over/under persists.
|
|
474
|
-
|
|
475
|
-
`pipeline_verdict` enum includes `ESCALATE` — record it when Phase 3 required user decision before proceeding.
|
|
254
|
+
After each goal-quench run, append a calibration entry to `tracks/_meta/goal_quench_{YYYY-MM-DD}.md`. Baseline (N=10, 2026-06-01–06-03, Sonnet): mean actual/estimate ratio 4.7× — systematic underestimation from session overhead. Target: 10 prospective runs per mode before treating mode comparison as reliable.
|
|
476
255
|
|
|
477
|
-
|
|
256
|
+
> **Detail**: See `SKILL_detail.md §Calibration-Record` — full YAML entry format, actual_tokens collection script, calibration baseline notes — read when appending the post-run record.
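The derived fields of a calibration entry (`actual_vs_estimate_ratio`, `estimation_error`, `estimation_error_pct`) follow from the two token counts. A hedged sketch of that arithmetic is given here; the function name, the 10% tolerance for `accurate`, and the reading of `under` as "the estimate was below the actual" are assumptions, since the source defines the field names but not their thresholds.

```python
def derive_calibration_fields(
    estimated_tokens: int,
    actual_tokens: int,
    tolerance: float = 0.10,  # assumed band for "accurate"; not from the source
) -> dict:
    """Compute the derived calibration fields from estimated and actual tokens."""
    if estimated_tokens <= 0:
        raise ValueError("estimated_tokens must be positive")

    # actual / estimated, as described for actual_vs_estimate_ratio.
    ratio = actual_tokens / estimated_tokens
    fields = {"actual_vs_estimate_ratio": round(ratio, 1)}

    if abs(ratio - 1.0) <= tolerance:
        # estimation_error_pct is omitted when the estimate is accurate.
        fields["estimation_error"] = "accurate"
        return fields

    # Assumed direction: "under" means the estimate fell short of actual usage.
    fields["estimation_error"] = "under" if ratio > 1.0 else "over"
    pct = round(abs(actual_tokens - estimated_tokens) * 100 / estimated_tokens)
    fields["estimation_error_pct"] = f"{pct}%"
    return fields
```

Under these assumptions, a run at the baseline mean ratio of 4.7× (for example 1,000 estimated against 4,700 actual) is recorded as `under` with an error magnitude of 370%.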
|
|
478
257
|
|
|
479
258
|
---
|
|
480
259
|
|