@chrono-meta/fh-gate 1.1.0 → 1.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (68)
  1. package/.claude/agents/challenger.md +169 -0
  2. package/AGENTS.md +160 -0
  3. package/CATALOG.md +256 -0
  4. package/CHEATSHEET.md +367 -0
  5. package/CLAUDE.md +331 -0
  6. package/CONTRIBUTING.md +198 -0
  7. package/LICENSE +21 -0
  8. package/README.md +61 -8
  9. package/bin/fh-goal.js +9 -0
  10. package/bin/fh-run.js +9 -0
  11. package/docs/codex-compat.md +123 -0
  12. package/docs/pillars.svg +70 -0
  13. package/knowledge/shared/harness-core/fh_integration_contract.md +45 -28
  14. package/package.json +30 -6
  15. package/plugins/fh-commons/README.md +37 -0
  16. package/plugins/fh-commons/agents/quench-challenger.md +373 -0
  17. package/plugins/fh-commons/skills/convergence-loop/SKILL.md +155 -0
  18. package/plugins/fh-commons/skills/deliberation/SKILL.md +288 -0
  19. package/plugins/fh-commons/skills/mcp-circuit-breaker/SKILL.md +196 -0
  20. package/plugins/fh-commons/skills/token-budget-gate/SKILL.md +175 -0
  21. package/plugins/fh-meta/agents/fact-checker.md +121 -0
  22. package/plugins/fh-meta/agents/hub-persona-auditor.md +109 -0
  23. package/plugins/fh-meta/agents/persona-innovator.md +195 -0
  24. package/plugins/fh-meta/skills/agent-composer/SKILL.md +461 -0
  25. package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +464 -0
  26. package/plugins/fh-meta/skills/apex-review/SKILL.md +185 -0
  27. package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md +135 -0
  28. package/plugins/fh-meta/skills/contention-layer/SKILL.md +127 -0
  29. package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL.md +30 -0
  30. package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL_detail.md +144 -0
  31. package/plugins/fh-meta/skills/context-doctor/SKILL.md +341 -0
  32. package/plugins/fh-meta/skills/cross-ecosystem-synergy-detection/SKILL.md +202 -0
  33. package/plugins/fh-meta/skills/deep-clarify/SKILL.md +144 -0
  34. package/plugins/fh-meta/skills/edit-manifest/SKILL.md +210 -0
  35. package/plugins/fh-meta/skills/field-harvest/SKILL.md +384 -0
  36. package/plugins/fh-meta/skills/frontier-digest/SKILL.md +272 -0
  37. package/plugins/fh-meta/skills/goal-quench/SKILL.md +509 -0
  38. package/plugins/fh-meta/skills/harness-doctor/SKILL.md +277 -0
  39. package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +484 -0
  40. package/plugins/fh-meta/skills/harvest-loop/SKILL.md +231 -0
  41. package/plugins/fh-meta/skills/harvest-loop/SKILL_detail.md +201 -0
  42. package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL.md +129 -0
  43. package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL_detail.md +158 -0
  44. package/plugins/fh-meta/skills/install-doctor/SKILL.md +207 -0
  45. package/plugins/fh-meta/skills/install-wizard/SKILL.md +613 -0
  46. package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +193 -0
  47. package/plugins/fh-meta/skills/memory-hygiene/SKILL.md +143 -0
  48. package/plugins/fh-meta/skills/meta-prompt-builder/SKILL.md +167 -0
  49. package/plugins/fh-meta/skills/meta-prompt-builder/SKILL_detail.md +37 -0
  50. package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +430 -0
  51. package/plugins/fh-meta/skills/plugin-recommender/SKILL.md +221 -0
  52. package/plugins/fh-meta/skills/plugin-recommender/SKILL_detail.md +220 -0
  53. package/plugins/fh-meta/skills/prompt-regression/SKILL.md +178 -0
  54. package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +224 -0
  55. package/plugins/fh-meta/skills/return-path-gate/SKILL.md +257 -0
  56. package/plugins/fh-meta/skills/self-marketing-lint/SKILL.md +129 -0
  57. package/plugins/fh-meta/skills/sim-conductor/SKILL.md +364 -0
  58. package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +337 -0
  59. package/plugins/fh-meta/skills/skill-splitter/SKILL.md +126 -0
  60. package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +185 -0
  61. package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +230 -0
  62. package/plugins/fh-meta/skills/source-grounding-audit/SKILL_detail.md +182 -0
  63. package/plugins/fh-meta/skills/steel-quench/SKILL.md +226 -0
  64. package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +453 -0
  65. package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md +238 -0
  66. package/scripts/fh-gate.sh +175 -40
  67. package/scripts/fh-goal.sh +182 -0
  68. package/scripts/fh-run.sh +269 -0
@@ -0,0 +1,288 @@
1
+ ---
2
+ name: deliberation
3
+ description: Multi-perspective synthesis structure — Innovator (propose) → Devil-Advocate (challenge) → Mediator (synthesize) 3-layer execution. Outputs conditional verdicts without binary win/loss. Activates on "deliberation", "battle this out", "weigh the pros and cons", "review from multiple angles", "which side is right?". Optional deep-insight persona jurors for domain-specific views. Designed for design decisions, skill proposals, and architectural choices.
4
+ user-invocable: true
5
+ allowed-tools: ["Read", "Bash", "Agent"]
6
+ model: opus
7
+ origin: fh-meta
8
+ ---
9
+
10
+ # deliberation — The Forge Skill
11
+
12
+ Innovator (propose) → Devil (challenge) → Mediator (synthesize) 3-layer core structure.
13
+ The goal is not to pick a winner — it is to **extract salvageable fragments from the losing argument and produce a conditional verdict**.
14
+ Even those who struggle to challenge assumptions can use this structure as a rope to reach better decisions.
15
+
16
+ > **Role distinction from sim-conductor and steel-quench**
17
+ > - sim-conductor: **validates** quality and consistency of a completed asset
18
+ > - steel-quench: **adversarially stress-tests** a near-complete artifact (post-build defect surfacing)
19
+ > - deliberation: perspective clash during the **design decision process** → synthesis (upstream of both)
20
+
21
+ ---
22
+
23
+ ## Triggers
24
+
25
+ - `/deliberation`
26
+ - "battle this out", "make them argue", "clash and synthesize"
27
+ - "review this decision from multiple angles"
28
+ - When agent-composer Step 4-b proposes `Wave next-D` automatically
29
+
30
+ ### Natural Language Triggers (for general users — activates without internal vocabulary)
31
+
32
+ Also activates when design decisions or perspective clashes are expressed in natural language:
33
+
34
+ | Example phrase | Intent |
35
+ |---|---|
36
+ | "I'm not sure whether to do this or not" | Decision uncertainty → multi-perspective synthesis |
37
+ | "It seems like opinions are divided" | Perspective clash → synthesis layer needed |
38
+ | "Help me decide which side is right" | Conditional synthesis, not simple winner selection |
39
+ | "Someone will probably object to this" | Request for devil's advocate perspective |
40
+ | "Is it okay to keep going in this direction?" | Re-validation of design decision |
41
+ | "Review this from multiple angles" | Multi-perspective synthesis → 3-layer default |
42
+ | "Analyze this from all sides" | Multi-perspective synthesis → 3-layer default |
43
+ | "Weigh the pros and cons" | Perspective clash → devil + innovator engaged |
44
+ | "Analyze the strengths and weaknesses" | Pro/con structure → Innovator (pros) + Devil (cons) |
45
+ | "Help me make a decision" | Decision support → conditional verdict generation |
46
+ | "pros and cons", "pros cons" | Comparative analysis → synthesis layer needed |
47
+
48
+ ---
49
+
50
+ ## Step 0. Receive Topic + Select Layer
51
+
52
+ If no input is provided, ask:
53
+ ```
54
+ Please provide the deliberation topic.
55
+ - Topic: What are you trying to decide or design?
56
+ - Layer: [3-layer default (recommended)] / [5-layer — includes jury]
57
+ - Jury focus (if 5-layer selected): user experience / technical feasibility / business & policy
58
+ ```
59
+
60
+ **Default: 3-layer** (Innovator → Devil → Mediator). Use 5-layer only when a jury is needed.
61
+
62
+ **Execution log (workers_approved pattern)**:
63
+ Upon completing Step 0, include the following in the output:
64
+ ```
65
+ [DELIBERATION START] Topic: {topic} | Layer: {layer} | {timestamp}
66
+ → WORKER_CALL: Innovator (no isolation required)
66
+ → WORKER_CALL: Devil-Advocate (no isolation required)
68
+ → WORKER_CALL: Mediator (isolated instance — Cost of Consensus prevention)
69
+ ```
70
+
71
+ Jury auto-selection criteria:
72
+ | Topic nature | Recommended jury personas |
73
+ |---|---|
74
+ | New user experience related | `newcomer` + `power-user` |
75
+ | Technical implementation feasibility | `persona-be` + `persona-fe` |
76
+ | Business viability / policy / legal | `persona-pm` + `persona-business` |
77
+ | General design decisions | No jury (3-layer is sufficient) |
78
+
79
+ ---
80
+
81
+ ## Step 1. Innovator Layer — Propose
82
+
83
+ Invoke `deep-insight:persona-innovator`.
84
+
85
+ > **Fallback (if deep-insight is not installed)**: Claude Code performs the Innovator role inline. Same instruction template and output format apply. The deliberation pipeline is guaranteed to work without deep-insight installed.
86
+
87
+ > **No isolation (intentional)**: The Innovator is a proposal generator — it does not evaluate its own output, so Agent tool isolation is not needed. Cost of Consensus applies only to the Mediator, which **evaluates** its own generated content (arXiv 2605.00914). Only the Mediator (Step 3) is isolated via the Agent tool.
88
+
89
+ Instruction template (meta-prompt-builder structure):
90
+ ```
91
+ Goal: Generate the most creative and scalable proposals for {topic}
92
+ Context: Current harness state + list of relevant assets
93
+ Constraints: No duplication of existing assets / must not violate simplification guard
94
+ Done When: 1~3 concrete proposals + 1-line rationale per proposal
95
+ Brief limit: Total Context passed to Agent must be kept under 1200 characters
96
+ ```
97
+
98
+ Output format:
99
+ ```
100
+ [Innovator]
101
+ Proposal 1: {content} — Rationale: {1 line}
102
+ Proposal 2: {content} — Rationale: {1 line}
103
+ (optional) Proposal 3: {content} — Rationale: {1 line}
104
+ ```
105
+
106
+ ---
107
+
108
+ ## Step 2. Devil-Advocate Layer — Challenge
109
+
110
+ Invoke `deep-insight:persona-devil-advocate`. Takes Step 1 output as input.
111
+
112
+ > **Fallback (if deep-insight is not installed)**: `fh-commons:quench-challenger` (includes Devil DNA) or Claude Code performs the Devil-Advocate role inline. Same output format. Instance isolation is not required, same as Step 1.
113
+
114
+ Instruction template:
115
+ ```
116
+ Goal: Generate the sharpest single rebuttal for each of the {N} Innovator proposals
117
+ Context: Innovator output + harness simplification guard + known failure patterns
118
+ Constraints: No emotional rebuttals / no baseless negation / must include a hint toward improvement
119
+ Done When: 1 rebuttal per proposal + 1-line core risk + 1-line acknowledgment of valid parts
120
+ Brief limit: Total Context passed to Agent must be kept under 1200 characters
121
+ ```
122
+
123
+ Output format:
124
+ ```
125
+ [Devil-Advocate]
126
+ Proposal 1 rebuttal: {content}
127
+ Risk: {1 line}
128
+ Acknowledgment: {1 line} ← this line is the Mediator's raw material
129
+ Proposal 2 rebuttal: ...
130
+ ```
131
+
132
+ > **Acknowledgment line is mandatory**: The Devil must explicitly state "this part is valid" — synthesis is impossible without it.
133
+ > A rebuttal with no acknowledgment is automatically flagged as `[WARN: unsynthesizable rebuttal]`.
134
+
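The acknowledgment rule above could be checked mechanically. The following is a minimal sketch, assuming the Devil-Advocate output has been saved to a file in the output format shown in Step 2; the function name `check_acknowledgments` is hypothetical and not part of the package. It only compares counts, so it does not verify that each acknowledgment belongs to the right proposal.

```bash
# Hypothetical helper (not part of fh-gate): flag Devil-Advocate output
# that has fewer "Acknowledgment:" lines than "rebuttal:" lines.
# Usage: check_acknowledgments FILE
check_acknowledgments() {
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true"
  rebuttals=$(grep -c 'rebuttal:' "$1" || true)
  acks=$(grep -c 'Acknowledgment:' "$1" || true)
  if [ "$acks" -lt "$rebuttals" ]; then
    echo '[WARN: unsynthesizable rebuttal]'
    return 1
  fi
  echo OK
}
```

A caller would run `check_acknowledgments devil_output.txt` before handing the text to the Mediator and stop on a non-zero exit status.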
135
+ ---
136
+
137
+ ## Step 3. Mediator Layer — Synthesize (Core)
138
+
139
+ **[Isolation Principle — Cost of Consensus Response]**
140
+ The Mediator invokes a separate instance via the `Agent` tool.
141
+ When the same instance evaluates its own generated content, confirmation bias occurs (demonstrated in arXiv 2605.00914).
142
+ Physical separation from the Innovator/Devil generation context is required for unbiased synthesis.
143
+
144
+ > **What isolation means**: Blocks Self-Evaluation Bias — the tendency for an instance to favor its own output.
145
+ > The Mediator reads the Innovator and Devil outputs, but does not share the **reasoning process that generated** those outputs.
146
+ > Independence of the reasoning path — not mere information sharing — is the key to resolving Cost of Consensus.
147
+
148
+ Agent invocation instruction (includes Context Card):
149
+ ```
150
+ Goal: Synthesize Innovator and Devil-Advocate outputs into a conditional verdict
151
+ Context: {full Step 1 output} + {full Step 2 output}
152
+ Constraints: No simple selection of the winning argument / must extract fragments from the losing argument / no hedge language ("balance both sides") / output under 1200 characters
153
+ Done When: All 5 sections output — Adopt / Alert absorption / Verdict / Conditions / Discard
154
+ ```
155
+
156
+ **Synthesis formula**:
157
+ ```
158
+ Core value of the Innovator proposal
159
+ + Valid alerts from Devil's rebuttal (extracted from the acknowledgment line)
160
+ → Conditional verdict: "{proposal} OK, provided {condition} is mandatory"
161
+ ```
162
+
163
+ **What the Mediator must not do**:
164
+ - Simply select the winning argument as-is (simple verdict = deliberation failure)
165
+ - Discard the losing argument (fragment extraction is mandatory)
166
+ - Use hedge expressions like "we should consider both sides in a balanced way"
167
+
168
+ Output format:
169
+ ```
170
+ [Mediator — Synthesis Verdict]
171
+ Adopt: Core of {Innovator Proposal N} — {value, 1 line}
172
+ Alert absorption: "{acknowledgment line}" from {Devil rebuttal} → converted to condition {X}
173
+ ─────────────────────────────────────────
174
+ Verdict: Proceed with {proposal} — OK
175
+ Conditions: {1~3 required conditions}
176
+ Discard: {fully rejected parts — with rationale}
177
+ ```
178
+
179
+ ---
180
+
181
+ ## Step 4 (Optional). Jury Layer — Domain Perspectives
182
+
183
+ Runs only when 5-layer is selected. **Parallel** dispatch of 2~3 selected deep-insight personas via the `Agent` tool.
184
+
185
+ > **Fallback (if deep-insight is not installed)**: dispatch the jury from real fh-meta agents instead — `persona-innovator` and `hub-persona-auditor` (and `fh-commons:quench-challenger` for an adversarial juror). Same parallel-Agent dispatch, juror cap, and output format apply. The 5-layer jury is guaranteed to run without deep-insight installed — mirroring the Step 1/Step 2 fallbacks.
186
+
187
+ Juror count limit: **maximum 3**. If 4 or more are selected, output `[WARN: jury overload — reduce to 3 or fewer?]` and defer to the user.
188
+
189
+ Instruction per juror:
190
+ ```
191
+ Goal: Review the Mediator's synthesis verdict from the perspective of {persona}
192
+ Context: Full output from Steps 1~3
193
+ Constraints: Do not overturn the already-synthesized verdict / only propose additional conditions
194
+ Done When: Agree / partial agreement / disagree + 1 line of additional conditions or risk
195
+ ```
196
+
197
+ Output format:
198
+ ```
199
+ [Jury: {persona name}]
200
+ Verdict: Agree / Partial agreement / Disagree
201
+ Opinion: {1~2 lines}
202
+ Additional condition: {1 line if applicable}
203
+ ```
204
+
205
+ ---
206
+
207
+ ## Step 5 (Optional). Mediator Final Synthesis
208
+
209
+ Incorporates jury opinions to refine the Step 3 verdict.
210
+
211
+ ```
212
+ [Final Verdict]
213
+ Based on: Step 3 synthesis verdict
214
+ Jury input: {N agreed / N partial / N disagreed}
215
+ Added conditions: {additional conditions from jury}
216
+ Final conclusion: {1~2 lines}
217
+ ```
218
+
219
+ ---
220
+
221
+ ## Automatic WARN Detection Patterns
222
+
223
+ | Situation | WARN content |
224
+ |---|---|
225
+ | Devil rebuttal has no "acknowledgment" line | `[WARN: unsynthesizable rebuttal — Mediator lacks raw material]` |
226
+ | Mediator adopts only one side's argument | `[WARN: simple verdict, not synthesis — deliberation failure]` |
227
+ | Innovator and Devil share the same premise | `[WARN: no real clash — recommend redefining the topic]` |
228
+ | 4 or more jurors selected | `[WARN: jury overload — reduce to 3 or fewer?]` |
229
+ | Done When contains vague expressions | `[WARN: Done When is unmeasurable — share meta-prompt-builder WARN criteria]` |
230
+
231
+ ---
232
+
233
+ ## agent-composer Integration — Wave next-D
234
+
235
+ Add the following condition to the agent-composer Step 4-b state transition gate:
236
+
237
+ ```
238
+ | ⑤ Design decision clash or judgment that "a battle is needed" | Wave next-D | deliberation (S) |
239
+ ```
240
+
241
+ `Wave next-D` activation criteria:
242
+ - M/S/R results contain "2 or more mutually conflicting proposals"
243
+ - User utterance includes "which side is right?", "they conflict", "both seem valid"
244
+ - agent-composer cannot synthesize the fan-in results and is about to defer the decision to the user
245
+
246
+ ---
247
+
248
+ ## Design Principle — Why This Skill Exists
249
+
250
+ It is a **rope** for those who have not yet dared to challenge.
251
+
252
+ Thinking alone traps you in a single perspective. Only the courageous construct counterarguments themselves.
253
+ deliberation provides that counterargument as structure — even those afraid of conflict can start a battle, because the Mediator will synthesize it for them.
254
+
255
+ The Mediator's conditional verdict creates a safe entry point: "this is okay if you do it this way."
256
+ The jury fills the domain blind spots that no single person can see on their own.
257
+
258
+ **What the forge creates is not a winner — it is a new alloy.**
259
+
260
+ ---
261
+
262
+ ## Done When
263
+
264
+ ```
265
+ All Steps 0~3 completed (Steps 4~5 added if 5-layer selected)
266
+ + [Mediator — Synthesis Verdict] output present (Adopt / Alert absorption / Verdict / Conditions / Discard)
267
+ + User's final decision confirmed (deliberation output must never be auto-executed)
268
+ ```
269
+
270
+ **→ When invoked from agent-composer Wave next-D: synthesis verdict is the fan-in input for Wave continuation** — return the Mediator verdict + Conditions to agent-composer so the conflict is marked resolved in the fan-in result set. After this, agent-composer re-runs Step 4-b state transition evaluation with the conflict cleared; subsequent Waves (next-M / next-E / end) proceed based on the updated result.
271
+
272
+ ## Simplification Guard
273
+
274
+ - Simple information queries → deliberation is unnecessary. Respond directly.
275
+ - Cases where the answer is already clear → route to sim-conductor (validation) or harness-doctor (diagnosis).
276
+ - If resolvable without a jury, 3-layer default is sufficient.
277
+ - deliberation output always completes with the user's final decision. Auto-execution is prohibited.
278
+
279
+ ---
280
+
281
+ ## Related Skills
282
+
283
+ | Situation | Related skill |
284
+ |---|---|
285
+ | Auto-triggered on design decision clash (Wave next-D) | `fh-meta:agent-composer` Step 4-b |
286
+ | Validate quality and consistency of a completed asset | `fh-meta:sim-conductor` |
287
+ | Validate skill candidates after deliberation | `fh-meta:harness-doctor` |
288
+ | Implementation convergence loop after a decision | `fh-commons:convergence-loop` |
@@ -0,0 +1,196 @@
1
+ ---
2
+ name: mcp-circuit-breaker
3
+ description: Detects MCP tool failure patterns and trips a circuit breaker to stop cascading retries. Proposes fallback alternatives and resets when the tool recovers. Triggers on "MCP failing", "tool keeps erroring", "circuit-breaker", repeated tool call failures.
4
+ user-invocable: true
5
+ allowed-tools: ["Read", "Bash", "Write"]
6
+ model: sonnet
7
+ complexity_routing:
8
+ base: sonnet
9
+ high: opus
10
+ escalate_when:
11
+ - multi_server_failure
12
+ - unknown_mcp_server
13
+ ---
14
+
15
+ # mcp-circuit-breaker — MCP Tool Failure Guard
16
+
17
+ MCP tools can fail silently or return partial results, leading to cascading retry loops that waste tokens and degrade session quality. This skill detects failure patterns, trips a circuit breaker to halt retries, proposes alternatives, and resets when the tool recovers.
18
+
19
+ > **Scope distinction**
20
+ > - Claude Code native retry: low-level transport retries (transparent, fast)
21
+ > - mcp-circuit-breaker: **session-level guard** — detects repeated semantic failures, intervenes before token waste compounds
22
+
23
+ ---
24
+
25
+ ## Triggers
26
+
27
+ - `/mcp-circuit-breaker`
28
+ - "MCP failing", "MCP keeps erroring", "tool isn't working", "circuit-breaker"
29
+ - "same error keeps happening", "tool call looping", "MCP timeout"
30
+ - Automatic: when the same MCP tool name appears in 3+ consecutive failed calls within a session
31
+
32
+ ---
33
+
34
+ ## Circuit States
35
+
36
+ ```
37
+ CLOSED → Normal operation (tool calls pass through)
38
+ OPEN → Circuit tripped (calls blocked, alternatives proposed)
39
+ HALF-OPEN → Recovery probe (1 test call allowed, resets if success)
40
+ ```
41
+
42
+ Default thresholds:
43
+ - **Trip threshold**: 3 consecutive failures of the same tool
44
+ - **Half-open probe**: after 60s cooldown (or explicit user command)
45
+ - **Reset**: 1 successful call in HALF-OPEN state → back to CLOSED
46
+
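The three states and default thresholds above amount to a small state machine. Here is a minimal sketch of the transitions, assuming a hypothetical `next_state` function and hypothetical event names (`failure`, `success`, `cooldown_elapsed`); only the states and the trip threshold of 3 come from the skill text.

```bash
# Hypothetical sketch of the circuit transitions described above.
TRIP_THRESHOLD=3   # default from the skill text

# next_state STATE CONSECUTIVE_FAILURES EVENT  -> prints the new state
# EVENT is one of: failure | success | cooldown_elapsed (assumed names)
next_state() {
  case "$1:$3" in
    CLOSED:failure)
      # $2 counts consecutive failures, including the one just observed
      if [ "$2" -ge "$TRIP_THRESHOLD" ]; then echo OPEN; else echo CLOSED; fi ;;
    OPEN:cooldown_elapsed) echo HALF-OPEN ;;   # 60s cooldown or user command
    HALF-OPEN:success)     echo CLOSED ;;      # probe succeeded, reset
    HALF-OPEN:failure)     echo OPEN ;;        # probe failed, stay tripped
    *)                     echo "$1" ;;        # other events keep the state
  esac
}
```

For example, `next_state CLOSED 3 failure` prints `OPEN`, while the same call with a count of 2 leaves the circuit `CLOSED`.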
47
+ ---
48
+
49
+ ## Execution Steps
50
+
51
+ ### Step 1. Detect Failure Pattern
52
+
53
+ Identify the failing tool and failure mode:
54
+
55
+ ```bash
56
+ # Check MCP server config
57
+ grep -A5 '"mcpServers"' .claude/settings.json 2>/dev/null || echo "No MCP config found"
58
+ ```
59
+
60
+ Classify failure type:
61
+
62
+ | Type | Symptom | Likely Cause |
63
+ |---|---|---|
64
+ | `TIMEOUT` | Tool call hangs >30s | Server overload / network |
65
+ | `AUTH` | 401 / 403 response | Credentials expired or missing |
66
+ | `NOT_FOUND` | 404 / tool not available | Server down / tool removed |
67
+ | `MALFORMED` | Parse error on response | Schema mismatch / API change |
68
+ | `RATE_LIMIT` | 429 / quota exceeded | Too many calls |
69
+
70
+ If failure type cannot be determined: classify as `UNKNOWN`.
71
+
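The classification table above can be read as a lookup from an observed signal to a failure type. The sketch below assumes the caller already has either an HTTP-style status code or a short keyword; the function name `classify_failure` and the keywords `timeout` and `parse_error` are illustrative assumptions, not identifiers from the package.

```bash
# Hypothetical sketch: map an observed signal to a failure type from the table.
# Usage: classify_failure SIGNAL   (status code or assumed keyword)
classify_failure() {
  case "$1" in
    401|403)     echo AUTH ;;
    404)         echo NOT_FOUND ;;
    429)         echo RATE_LIMIT ;;
    timeout)     echo TIMEOUT ;;     # assumed keyword for a call hanging >30s
    parse_error) echo MALFORMED ;;   # assumed keyword for an unparseable response
    *)           echo UNKNOWN ;;     # fallback named in the skill text
  esac
}
```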
72
+ ---
73
+
74
+ ### Step 2. Trip Decision
75
+
76
+ Count consecutive failures of the identified tool in the current session context.
77
+
78
+ | Count | Action |
79
+ |---|---|
80
+ | 1 | Log warning. Continue — *"MCP tool {name} failed once. Monitoring."* |
81
+ | 2 | Escalate warning. Suggest checking server status. |
82
+ | 3+ | **TRIP CIRCUIT** → output circuit open notice, block further calls to this tool |
83
+
84
+ Circuit open notice format:
85
+ ```
86
+ ⚡ CIRCUIT OPEN — {tool-name}
87
+ Failure type: {TYPE} | Consecutive failures: {N}
88
+ Further calls to this tool are blocked until circuit resets.
89
+ ```
90
+
91
+ ---
92
+
93
+ ### Step 3. Log Circuit State
94
+
95
+ Write state to a session-local file. In-memory state is insufficient because it is lost on `/clear`, whereas a log file survives it:
96
+
97
+ ```bash
98
+ mkdir -p .claude/mcp_circuit/
99
+ # Append to circuit log
100
+ ```
101
+
102
+ Log entry format:
103
+ ```yaml
104
+ - tool: {tool-name}
105
+ state: OPEN
106
+ failure_type: {TYPE}
107
+ failure_count: {N}
108
+ tripped_at: {ISO-8601}
109
+ reset_at: null
110
+ ```
111
+
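The skill text leaves the append step as a comment. One possible way to write the entry is sketched below; the function name `log_circuit_open` and the log file name `circuit_log.yaml` are assumptions made for illustration, while the directory and the field names follow the log entry format above.

```bash
# Hypothetical sketch: append one OPEN entry in the log entry format above.
# Usage: log_circuit_open TOOL FAILURE_TYPE FAILURE_COUNT [LOG_FILE]
log_circuit_open() {
  # The file name is an assumption; the skill only names the directory.
  log_file="${4:-.claude/mcp_circuit/circuit_log.yaml}"
  mkdir -p "$(dirname "$log_file")"
  {
    printf '%s\n' "- tool: $1"
    printf '%s\n' "  state: OPEN"
    printf '%s\n' "  failure_type: $2"
    printf '%s\n' "  failure_count: $3"
    printf '%s\n' "  tripped_at: $(date -u +%Y-%m-%dT%H:%M:%SZ)"   # ISO-8601, UTC
    printf '%s\n' "  reset_at: null"
  } >> "$log_file"
}
```

Calling `log_circuit_open my-tool TIMEOUT 3` would append a six-line entry to the default file, creating the directory if needed.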
112
+ ---
113
+
114
+ ### Step 4. Propose Alternatives
115
+
116
+ Present 3 fallback options ranked by effort:
117
+
118
+ | Priority | Alternative | When to Use |
119
+ |---|---|---|
120
+ | **1 — Substitute tool** | Use a different MCP tool or built-in tool that covers the same task | Tool-specific failure (NOT_FOUND, AUTH) |
121
+ | **2 — Degrade gracefully** | Skip the MCP step, note the gap, continue with available information | TIMEOUT / RATE_LIMIT |
122
+ | **3 — Pause and retry** | Wait for server recovery (HALF-OPEN probe after cooldown) | Transient failure (TIMEOUT, RATE_LIMIT) |
123
+
124
+ Output format:
125
+ ```
126
+ ## Fallback Options for {tool-name}
127
+
128
+ Option 1 — Substitute: Use {alternative-tool} instead
129
+ → Command: [specific invocation]
130
+ → Gap: [what's different vs. original tool]
131
+
132
+ Option 2 — Degrade: Skip this step, continue without {capability}
133
+ → Impact: [what is missing from the output]
134
+
135
+ Option 3 — Retry after cooldown (60s)
136
+ → Run: /mcp-circuit-breaker reset {tool-name}
137
+ ```
138
+
139
+ ---
140
+
141
+ ### Step 5. Recovery Probe (HALF-OPEN)
142
+
143
+ When user requests reset or after cooldown:
144
+
145
+ ```
146
+ Sending HALF-OPEN probe to {tool-name}...
147
+ ```
148
+
149
+ - 1 minimal test call is allowed through
150
+ - If success: circuit → CLOSED, log updated
151
+ - If fail: circuit remains OPEN, cooldown resets
152
+
153
+ Reset log entry:
154
+ ```yaml
155
+ state: CLOSED
156
+ reset_at: {ISO-8601}
157
+ reset_method: probe_success | user_forced
158
+ ```
159
+
160
+ ---
161
+
162
+ ### Step 6. Report
163
+
164
+ At session end or on demand:
165
+
166
+ ```
167
+ ## MCP Circuit Breaker Report
168
+
169
+ | Tool | State | Failures | Tripped | Reset |
170
+ |---|---|---|---|---|
171
+ | {tool} | OPEN | 4 | 14:23 | — |
172
+ | {tool} | CLOSED | 1 | 14:10 | 14:15 (probe) |
173
+
174
+ Recommendations:
175
+ - {tool}: AUTH failure → refresh credentials in .claude/settings.json
176
+ ```
177
+
178
+ ---
179
+
180
+ ## Done When
181
+
182
+ - Failure pattern classified (type + count)
183
+ - Circuit state logged (OPEN / HALF-OPEN / CLOSED)
184
+ - At least 3 fallback alternatives proposed when circuit is OPEN
185
+ - Recovery probe offered with reset path
186
+
187
+ ---
188
+
189
+ ## Chains
190
+
191
+ **Upstream** (can trigger this skill):
192
+ - Automatically activates on 3+ consecutive MCP failures during any task
193
+
194
+ **Downstream** (after circuit open):
195
+ - No mandatory chain — fallback options are presented, user decides
196
+ - Optional: `context-doctor` if MCP failure is due to large context degrading tool calls
@@ -0,0 +1,175 @@
1
+ ---
2
+ name: token-budget-gate
3
+ description: Estimates token cost before a multi-step task and outputs a Green/Yellow/Orange/Red gate verdict. Tracks actual vs. estimated after completion for calibration. Triggers on "token budget", "how much will this cost", "will this be expensive", "estimate tokens", before long multi-agent tasks.
4
+ user-invocable: true
5
+ allowed-tools: ["Read", "Bash"]
6
+ model: sonnet
7
+ complexity_routing:
8
+ base: sonnet
9
+ high: opus
10
+ escalate_when:
11
+ - multi_project_scope
12
+ - unknown_task_type
13
+ ---
14
+
15
+ # token-budget-gate — Pre-Task Token Cost Gate
16
+
17
+ Multi-step and multi-agent tasks can silently consume large token budgets. This skill estimates cost before execution, outputs a gate verdict, and calibrates estimates against actual usage after completion — preventing surprise overruns without blocking legitimate work.
18
+
19
+ > **FH context**: FH default execution tier is `standard` (~15K tokens). This skill gates against accidental `full` (~30K) or `max` (~60K+) consumption on tasks that could be handled lighter.
20
+
21
+ ---
22
+
23
+ ## Triggers
24
+
25
+ - `/token-budget-gate`
26
+ - "token budget", "token cost", "how expensive", "will this use a lot of tokens"
27
+ - "estimate tokens", "token estimate before we start"
28
+ - Before invoking: `agent-composer`, `sim-conductor`, `steel-quench` (max-tier skills)
29
+ - Automatically proposed when task description contains: multi-agent, parallel dispatch, full suite, all files, entire codebase
30
+
31
+ ---
32
+
33
+ ## Gate Thresholds (defaults — user-configurable)
34
+
35
+ | Signal | Verdict | Action |
36
+ |---|---|---|
37
+ | Estimated < 10K tokens | 🟢 **GREEN** | Proceed without comment |
38
+ | 10K–30K tokens | 🟡 **YELLOW** | Proceed with notice — suggest lighter approach if one exists |
39
+ | 30K–60K tokens | 🟠 **ORANGE** | Confirm before proceeding — present scope reduction options |
40
+ | > 60K tokens | 🔴 **RED** | Block + require explicit approval — present mandatory reduction |
41
+
42
+ Custom threshold: user can set `TOKEN_BUDGET_MAX=N` in conversation or `.claude/settings.json`.
43
+
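The default thresholds can be expressed as a simple comparison chain. This is a sketch under stated assumptions: the function name `gate_verdict` is hypothetical, the input is a whole number of thousands of tokens, and the boundary values are resolved here by choice (exactly 10K counts as YELLOW, exactly 30K and exactly 60K count as ORANGE), since the table's ranges touch at 30K. How `TOKEN_BUDGET_MAX` shifts these bands is not specified in the skill, so it is not modeled.

```bash
# Hypothetical sketch of the default gate thresholds.
# Usage: gate_verdict ESTIMATED_K   (integer, thousands of tokens)
gate_verdict() {
  if [ "$1" -lt 10 ]; then
    echo GREEN      # < 10K
  elif [ "$1" -lt 30 ]; then
    echo YELLOW     # 10K up to, but not including, 30K (boundary is an assumption)
  elif [ "$1" -le 60 ]; then
    echo ORANGE     # 30K through 60K
  else
    echo RED        # > 60K
  fi
}
```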
44
+ ---
45
+
46
+ ## Execution Steps
47
+
48
+ ### Step 1. Parse Task Description
49
+
50
+ Extract task dimensions:
51
+
52
+ | Dimension | Low (×1) | Medium (×2) | High (×4) |
53
+ |---|---|---|---|
54
+ | **File scope** | 1–3 files | 4–10 files | 11+ files / whole codebase |
55
+ | **Agent count** | 0 (inline) | 1–2 agents | 3+ agents / parallel |
56
+ | **Step depth** | 1–3 steps | 4–8 steps | 9+ steps |
57
+ | **Iteration** | None | 1 round | 2+ rounds (wave/loop) |
58
+ | **Output size** | Short answer | Medium doc | Full report / deck |
59
+
60
+ ---
61
+
62
+ ### Step 2. Estimate Token Cost
63
+
64
+ Base estimates per task type:
65
+
66
+ | Task Type | Base Estimate | Notes |
67
+ |---|---|---|
68
+ | Single file edit | 2K | Read + edit + verify |
69
+ | Code review (1 PR) | 5K | Diff + analysis + comments |
70
+ | Skill creation (1 SKILL.md) | 8K | Design + write + CATALOG update |
71
+ | Agent dispatch (1 agent) | 10K | Context card + agent overhead |
72
+ | Parallel dispatch (3 agents) | 25K | 3× agent + orchestration |
73
+ | sim-conductor full run | 30K | All 5 simulation axes |
74
+ | steel-quench 4-wave | 50K | All waves + prescriptions |
75
+ | Full harvest-loop cycle | 40K | 8-step pipeline + PRs |
76
+
77
+ Apply dimension multipliers from Step 1 to the base estimate.
78
+
79
+ **Final formula:**
80
+ ```
81
+ Estimated = base × file_multiplier × agent_multiplier × iteration_multiplier
82
+ ```
83
+
84
+ Round to nearest 1K.
85
+
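As a worked illustration of the formula, the sketch below multiplies a base estimate by the Low/Medium/High multipliers (×1, ×2, ×4) from Step 1. The function names `multiplier` and `estimate_k` and the level keywords are assumptions. It uses only the three multipliers that appear in the formula, so step depth and output size are not included, and because inputs are whole thousands the rounding step is a no-op here.

```bash
# Hypothetical sketch of: Estimated = base x file x agent x iteration
# multiplier LEVEL -> 1 | 2 | 4   (LEVEL keywords are assumed names)
multiplier() {
  case "$1" in
    low)    echo 1 ;;
    medium) echo 2 ;;
    high)   echo 4 ;;
    *)      echo "unknown level: $1" >&2; return 1 ;;
  esac
}

# estimate_k BASE_K FILE_LEVEL AGENT_LEVEL ITERATION_LEVEL -> estimate in K tokens
estimate_k() {
  file_m=$(multiplier "$2") || return 1
  agent_m=$(multiplier "$3") || return 1
  iter_m=$(multiplier "$4") || return 1
  echo $(( $1 * file_m * agent_m * iter_m ))
}
```

With a base of 8 (skill creation), a low file scope, a medium agent count, and no iteration, `estimate_k 8 low medium low` prints 16, which corresponds to the ~16K total in the Step 3 breakdown.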
86
+ ---
87
+
88
+ ### Step 3. Output Gate Verdict
89
+
90
+ ```
91
+ ## Token Budget Gate
92
+
93
+ Task: {one-line task description}
94
+ Estimated cost: ~{N}K tokens
95
+ Threshold: {user max or default}
96
+
97
+ Verdict: 🟡 YELLOW — within budget but consider lighter approach
98
+
99
+ Breakdown:
100
+ Base (skill creation): 8K
101
+ × 2 agents: ×2 = 16K
102
+ × no iteration: ×1 = 16K
103
+ Total: ~16K
104
+
105
+ Lighter alternative:
106
+ → Inline (no agent dispatch): ~8K (-50%)
107
+ → Single agent, not parallel: ~12K (-25%)
108
+
109
+ Proceed? (y to continue / n to adjust scope)
110
+ ```
111
+
112
+ For 🟢 GREEN: output one line only — *"Token estimate: ~{N}K — GREEN, proceeding."*
113
+
114
+ ---
115
+
116
+ ### Step 4. Proceed / Adjust
117
+
118
+ - **GREEN, or YELLOW once the user confirms**: proceed, note start marker
119
+ - **ORANGE**: present scope reduction table, wait for user selection
120
+ - **RED**: present mandatory reduction — do not proceed until user explicitly approves
121
+
122
+ Scope reduction options table (ORANGE/RED):
123
+
124
+ | Option | Reduction | Trade-off |
125
+ |---|---|---|
126
+ | Drop parallel → sequential | -30% | Slower, same quality |
127
+ | Reduce agent count (3→1) | -50% | Less parallelism |
128
+ | Narrow file scope | -40% | Shallower coverage |
129
+ | Use lighter skill variant | -60% | Fewer waves/probes |
130
+ | Split into 2 sessions | -50%/session | No quality loss |
131
+
132
+ ---
133
+
134
+ ### Step 5. Post-Task Calibration (optional)
135
+
136
+ After task completion, if user says "how much did that cost" or "calibrate":
137
+
138
+ ```
139
+ ## Calibration
140
+
141
+ Estimated: ~16K tokens
142
+ Actual: ~{actual}K tokens
143
+ Error: {+/-N}%
144
+
145
+ Calibration note saved → improves next estimate for this task type.
146
+ ```
147
+
148
+ Write calibration data:
149
+ ```bash
150
+ mkdir -p .claude/token_calibration/
151
+ # Append: task_type, estimated, actual, date
152
+ ```
153
+
154
+ Calibration data improves future estimates for the same task type (no model training — local record only).
155
+
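The calibration step can be sketched as two small helpers: one computing the signed error and one appending a record. Several details are assumptions: the function names, the CSV layout, the file name `calibration.csv`, and the choice to express the error relative to the estimate. Only the directory and the four recorded fields (task_type, estimated, actual, date) come from the skill text.

```bash
# Hypothetical sketch of post-task calibration.
# calibration_error_pct ESTIMATED_K ACTUAL_K -> signed percent relative to the estimate
calibration_error_pct() {
  awk -v est="$1" -v act="$2" 'BEGIN { printf "%+.0f\n", (act - est) / est * 100 }'
}

# record_calibration TASK_TYPE ESTIMATED_K ACTUAL_K [FILE]
record_calibration() {
  # File name and CSV format are assumptions; the skill only names the directory.
  cal_file="${4:-.claude/token_calibration/calibration.csv}"
  mkdir -p "$(dirname "$cal_file")"
  printf '%s,%s,%s,%s\n' "$1" "$2" "$3" "$(date -u +%Y-%m-%d)" >> "$cal_file"
}
```

For an estimate of 16K and an actual cost of 20K, `calibration_error_pct 16 20` prints `+25`, meaning the task cost 25% more than estimated.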
156
+ ---
157
+
158
+ ## Done When
159
+
160
+ - Gate verdict output (GREEN/YELLOW/ORANGE/RED) with estimated cost breakdown
161
+ - For ORANGE/RED: scope reduction options presented and user decision recorded
162
+ - Calibration offered after task completion (optional, not mandatory)
163
+
164
+ ---
165
+
166
+ ## Chains
167
+
168
+ **Upstream** (proposed before these skills):
169
+ - → `agent-composer` (multi-agent orchestration)
170
+ - → `sim-conductor` (5-axis simulation)
171
+ - → `steel-quench` (4-wave adversarial review)
172
+ - → `harvest-loop` (8-step pipeline)
173
+
174
+ **Downstream**:
175
+ - No mandatory chain — gate verdict is the output; task execution follows user decision