@chrono-meta/fh-gate 1.0.3 → 1.2.0
- package/.claude/agents/challenger.md +169 -0
- package/AGENTS.md +160 -0
- package/CATALOG.md +256 -0
- package/CHEATSHEET.md +367 -0
- package/CLAUDE.md +331 -0
- package/CONTRIBUTING.md +198 -0
- package/LICENSE +21 -0
- package/README.md +131 -418
- package/bin/fh-goal.js +9 -0
- package/bin/fh-run.js +9 -0
- package/docs/banner.png +0 -0
- package/docs/codex-compat.md +123 -0
- package/docs/pillars.svg +70 -0
- package/knowledge/shared/harness-core/fh_integration_contract.md +48 -29
- package/package.json +31 -6
- package/plugins/fh-commons/README.md +37 -0
- package/plugins/fh-commons/agents/quench-challenger.md +373 -0
- package/plugins/fh-commons/skills/convergence-loop/SKILL.md +155 -0
- package/plugins/fh-commons/skills/deliberation/SKILL.md +288 -0
- package/plugins/fh-commons/skills/mcp-circuit-breaker/SKILL.md +196 -0
- package/plugins/fh-commons/skills/token-budget-gate/SKILL.md +175 -0
- package/plugins/fh-meta/agents/fact-checker.md +121 -0
- package/plugins/fh-meta/agents/hub-persona-auditor.md +109 -0
- package/plugins/fh-meta/agents/persona-innovator.md +195 -0
- package/plugins/fh-meta/skills/agent-composer/SKILL.md +461 -0
- package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +464 -0
- package/plugins/fh-meta/skills/apex-review/SKILL.md +185 -0
- package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md +135 -0
- package/plugins/fh-meta/skills/contention-layer/SKILL.md +127 -0
- package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL.md +30 -0
- package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL_detail.md +144 -0
- package/plugins/fh-meta/skills/context-doctor/SKILL.md +341 -0
- package/plugins/fh-meta/skills/cross-ecosystem-synergy-detection/SKILL.md +202 -0
- package/plugins/fh-meta/skills/deep-clarify/SKILL.md +144 -0
- package/plugins/fh-meta/skills/edit-manifest/SKILL.md +210 -0
- package/plugins/fh-meta/skills/field-harvest/SKILL.md +384 -0
- package/plugins/fh-meta/skills/frontier-digest/SKILL.md +272 -0
- package/plugins/fh-meta/skills/goal-quench/SKILL.md +509 -0
- package/plugins/fh-meta/skills/harness-doctor/SKILL.md +277 -0
- package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +484 -0
- package/plugins/fh-meta/skills/harvest-loop/SKILL.md +231 -0
- package/plugins/fh-meta/skills/harvest-loop/SKILL_detail.md +201 -0
- package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL.md +129 -0
- package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL_detail.md +158 -0
- package/plugins/fh-meta/skills/install-doctor/SKILL.md +207 -0
- package/plugins/fh-meta/skills/install-wizard/SKILL.md +613 -0
- package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +193 -0
- package/plugins/fh-meta/skills/memory-hygiene/SKILL.md +143 -0
- package/plugins/fh-meta/skills/meta-prompt-builder/SKILL.md +167 -0
- package/plugins/fh-meta/skills/meta-prompt-builder/SKILL_detail.md +37 -0
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +430 -0
- package/plugins/fh-meta/skills/plugin-recommender/SKILL.md +221 -0
- package/plugins/fh-meta/skills/plugin-recommender/SKILL_detail.md +220 -0
- package/plugins/fh-meta/skills/prompt-regression/SKILL.md +178 -0
- package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +224 -0
- package/plugins/fh-meta/skills/return-path-gate/SKILL.md +257 -0
- package/plugins/fh-meta/skills/self-marketing-lint/SKILL.md +129 -0
- package/plugins/fh-meta/skills/sim-conductor/SKILL.md +364 -0
- package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +337 -0
- package/plugins/fh-meta/skills/skill-splitter/SKILL.md +126 -0
- package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +185 -0
- package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +230 -0
- package/plugins/fh-meta/skills/source-grounding-audit/SKILL_detail.md +182 -0
- package/plugins/fh-meta/skills/steel-quench/SKILL.md +226 -0
- package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +453 -0
- package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md +238 -0
- package/scripts/fh-gate.sh +175 -40
- package/scripts/fh-goal.sh +182 -0
- package/scripts/fh-run.sh +269 -0
|
@@ -0,0 +1,288 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: deliberation
|
|
3
|
+
description: Multi-perspective synthesis structure — Innovator (propose) → Devil-Advocate (challenge) → Mediator (synthesize) 3-layer execution. Outputs conditional verdicts without binary win/loss. Activates on "deliberation", "battle this out", "weigh the pros and cons", "review from multiple angles", "which side is right?". Optional deep-insight persona jurors for domain-specific views. Designed for design decisions, skill proposals, and architectural choices.
|
|
4
|
+
user-invocable: true
|
|
5
|
+
allowed-tools: ["Read", "Bash", "Agent"]
|
|
6
|
+
model: opus
|
|
7
|
+
origin: fh-meta
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# deliberation — The Forge Skill
|
|
11
|
+
|
|
12
|
+
Innovator (propose) → Devil (challenge) → Mediator (synthesize) 3-layer core structure.
|
|
13
|
+
The goal is not to pick a winner — it is to **extract salvageable fragments from the losing argument and produce a conditional verdict**.
|
|
14
|
+
Even those who struggle to challenge assumptions can use this structure as a rope to reach better decisions.
|
|
15
|
+
|
|
16
|
+
> **Role distinction from sim-conductor and steel-quench**
|
|
17
|
+
> - sim-conductor: **validates** quality and consistency of a completed asset
|
|
18
|
+
> - steel-quench: **adversarially stress-tests** a near-complete artifact (post-build defect surfacing)
|
|
19
|
+
> - deliberation: perspective clash during the **design decision process** → synthesis (upstream of both)
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
## Triggers
|
|
24
|
+
|
|
25
|
+
- `/deliberation`
|
|
26
|
+
- "battle this out", "make them argue", "clash and synthesize"
|
|
27
|
+
- "review this decision from multiple angles"
|
|
28
|
+
- When agent-composer Step 4-b proposes `Wave next-D` automatically
|
|
29
|
+
|
|
30
|
+
### Natural Language Triggers (for general users — activates without internal vocabulary)
|
|
31
|
+
|
|
32
|
+
Also activates when design decisions or perspective clashes are expressed in natural language:
|
|
33
|
+
|
|
34
|
+
| Example phrase | Intent |
|
|
35
|
+
|---|---|
|
|
36
|
+
| "I'm not sure whether to do this or not" | Decision uncertainty → multi-perspective synthesis |
|
|
37
|
+
| "It seems like opinions are divided" | Perspective clash → synthesis layer needed |
|
|
38
|
+
| "Help me decide which side is right" | Conditional synthesis, not simple winner selection |
|
|
39
|
+
| "Someone will probably object to this" | Request for devil's advocate perspective |
|
|
40
|
+
| "Is it okay to keep going in this direction?" | Re-validation of design decision |
|
|
41
|
+
| "Review this from multiple angles" | Multi-perspective synthesis → 3-layer default |
|
|
42
|
+
| "Analyze this from all sides" | Multi-perspective synthesis → 3-layer default |
|
|
43
|
+
| "Weigh the pros and cons" | Perspective clash → devil + innovator engaged |
|
|
44
|
+
| "Analyze the strengths and weaknesses" | Pro/con structure → Innovator (pros) + Devil (cons) |
|
|
45
|
+
| "Help me make a decision" | Decision support → conditional verdict generation |
|
|
46
|
+
| "pros and cons", "pros cons" | Comparative analysis → synthesis layer needed |
|
|
47
|
+
|
|
48
|
+
---
|
|
49
|
+
|
|
50
|
+
## Step 0. Receive Topic + Select Layer
|
|
51
|
+
|
|
52
|
+
If no input is provided, ask:
|
|
53
|
+
```
|
|
54
|
+
Please provide the deliberation topic.
|
|
55
|
+
- Topic: What are you trying to decide or design?
|
|
56
|
+
- Layer: [3-layer default (recommended)] / [5-layer — includes jury]
|
|
57
|
+
- Jury focus (if 5-layer selected): user experience / technical feasibility / business & policy
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
**Default: 3-layer** (Innovator → Devil → Mediator). Use 5-layer only when a jury is needed.
|
|
61
|
+
|
|
62
|
+
**Execution log (workers_approved pattern)**:
|
|
63
|
+
Upon completing Step 0, include the following in the output:
|
|
64
|
+
```
|
|
65
|
+
[DELIBERATION START] Topic: {topic} | Layer: {layer} | {timestamp}
|
|
66
|
+
→ WORKER_CALL: Innovator (no isolation required)
|
|
67
|
+
→ WORKER_CALL: Devil-Advocate (no isolation required)
|
|
68
|
+
→ WORKER_CALL: Mediator (isolated instance — Cost of Consensus prevention)
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
Jury auto-selection criteria:
|
|
72
|
+
| Topic nature | Recommended jury personas |
|
|
73
|
+
|---|---|
|
|
74
|
+
| New user experience related | `newcomer` + `power-user` |
|
|
75
|
+
| Technical implementation feasibility | `persona-be` + `persona-fe` |
|
|
76
|
+
| Business viability / policy / legal | `persona-pm` + `persona-business` |
|
|
77
|
+
| General design decisions | No jury (3-layer is sufficient) |
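The jury auto-selection table can be read as a simple lookup. The following is a minimal sketch, not part of the skill itself: the topic keys (`new_user_experience` and so on) and the function names are hypothetical labels chosen for illustration, while the persona names come from the table.

```python
from typing import Dict, List

# Hypothetical topic keys; persona names are taken from the table above.
JURY_BY_TOPIC: Dict[str, List[str]] = {
    "new_user_experience": ["newcomer", "power-user"],
    "technical_feasibility": ["persona-be", "persona-fe"],
    "business_policy_legal": ["persona-pm", "persona-business"],
    "general_design": [],  # no jury: 3-layer is sufficient
}


def select_jury(topic_nature: str) -> List[str]:
    """Return the recommended jurors; unknown topics fall back to no jury."""
    return list(JURY_BY_TOPIC.get(topic_nature, []))


def layer_for(topic_nature: str) -> str:
    """Pick 5-layer only when a jury is recommended, otherwise the 3-layer default."""
    return "5-layer" if select_jury(topic_nature) else "3-layer"
```

Falling back to an empty jury for unknown topics keeps the 3-layer default, which matches the stated preference for using 5-layer only when a jury is needed.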
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## Step 1. Innovator Layer — Propose
|
|
82
|
+
|
|
83
|
+
Invoke `deep-insight:persona-innovator`.
|
|
84
|
+
|
|
85
|
+
> **Fallback (if deep-insight is not installed)**: Claude Code performs the Innovator role inline. Same instruction template and output format apply. The deliberation pipeline is guaranteed to work without deep-insight installed.
|
|
86
|
+
|
|
87
|
+
> **No isolation (intentional)**: The Innovator is a proposal generator — it does not evaluate its own output, so Agent tool isolation is not needed. Cost of Consensus applies only to the Mediator, which **evaluates** its own generated content (arXiv 2605.00914). Only the Mediator (Step 3) is isolated via the Agent tool.
|
|
88
|
+
|
|
89
|
+
Instruction template (meta-prompt-builder structure):
|
|
90
|
+
```
|
|
91
|
+
Goal: Generate the most creative and scalable proposals for {topic}
|
|
92
|
+
Context: Current harness state + list of relevant assets
|
|
93
|
+
Constraints: No duplication of existing assets / must not violate simplification guard
|
|
94
|
+
Done When: 1~3 concrete proposals + 1-line rationale per proposal
|
|
95
|
+
Brief limit: Total Context passed to Agent must be kept under 1200 characters
|
|
96
|
+
```
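As an illustration of the Brief limit, the template above could be assembled by a small helper that refuses an oversized Context. This is a sketch under assumptions: `build_brief` and `BRIEF_CONTEXT_LIMIT` are hypothetical names, and the limit is applied to the Context field only, which is one reading of "Total Context passed to Agent".

```python
BRIEF_CONTEXT_LIMIT = 1200  # characters, from the "Brief limit" line


def build_brief(goal: str, context: str, constraints: str, done_when: str) -> str:
    """Assemble a Goal / Context / Constraints / Done When brief.

    Raises ValueError when the Context is not under the 1200-character limit.
    """
    if len(context) >= BRIEF_CONTEXT_LIMIT:
        raise ValueError(
            f"Context is {len(context)} characters; must be under {BRIEF_CONTEXT_LIMIT}"
        )
    return "\n".join(
        [
            f"Goal: {goal}",
            f"Context: {context}",
            f"Constraints: {constraints}",
            f"Done When: {done_when}",
        ]
    )
```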
|
|
97
|
+
|
|
98
|
+
Output format:
|
|
99
|
+
```
|
|
100
|
+
[Innovator]
|
|
101
|
+
Proposal 1: {content} — Rationale: {1 line}
|
|
102
|
+
Proposal 2: {content} — Rationale: {1 line}
|
|
103
|
+
(optional) Proposal 3: {content} — Rationale: {1 line}
|
|
104
|
+
```
|
|
105
|
+
|
|
106
|
+
---
|
|
107
|
+
|
|
108
|
+
## Step 2. Devil-Advocate Layer — Challenge
|
|
109
|
+
|
|
110
|
+
Invoke `deep-insight:persona-devil-advocate`. Takes Step 1 output as input.
|
|
111
|
+
|
|
112
|
+
> **Fallback (if deep-insight is not installed)**: `fh-commons:quench-challenger` (includes Devil DNA) or Claude Code performs the Devil-Advocate role inline. Same output format. Instance isolation is not required, same as Step 1.
|
|
113
|
+
|
|
114
|
+
Instruction template:
|
|
115
|
+
```
|
|
116
|
+
Goal: Generate the sharpest single rebuttal for each of the {N} Innovator proposals
|
|
117
|
+
Context: Innovator output + harness simplification guard + known failure patterns
|
|
118
|
+
Constraints: No emotional rebuttals / no baseless negation / must include a hint toward improvement
|
|
119
|
+
Done When: 1 rebuttal per proposal + 1-line core risk + 1-line acknowledgment of valid parts
|
|
120
|
+
Brief limit: Total Context passed to Agent must be kept under 1200 characters
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
Output format:
|
|
124
|
+
```
|
|
125
|
+
[Devil-Advocate]
|
|
126
|
+
Proposal 1 rebuttal: {content}
|
|
127
|
+
Risk: {1 line}
|
|
128
|
+
Acknowledgment: {1 line} ← this line is the Mediator's raw material
|
|
129
|
+
Proposal 2 rebuttal: ...
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
> **Acknowledgment line is mandatory**: The Devil must explicitly state "this part is valid" — synthesis is impossible without it.
|
|
133
|
+
> A rebuttal with no acknowledgment is automatically flagged as `[WARN: unsynthesizable rebuttal]`.
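The acknowledgment rule can be checked mechanically. The sketch below assumes the Devil-Advocate output follows the format shown above (a `Proposal N rebuttal:` header followed by `Risk:` and `Acknowledgment:` lines); the function name is hypothetical, and appending the proposal number to the WARN tag is an illustrative choice.

```python
import re
from typing import List


def find_unsynthesizable(devil_output: str) -> List[str]:
    """Return one WARN line per rebuttal that lacks a non-empty Acknowledgment line."""
    warnings = []
    headers = list(re.finditer(r"^Proposal (\d+) rebuttal:", devil_output, flags=re.M))
    for i, header in enumerate(headers):
        # A rebuttal's chunk runs from its header to the next header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(devil_output)
        chunk = devil_output[header.start():end]
        has_ack = re.search(r"^[ \t]*Acknowledgment:[ \t]*\S", chunk, flags=re.M)
        if not has_ack:
            warnings.append(
                f"[WARN: unsynthesizable rebuttal] Proposal {header.group(1)}"
            )
    return warnings
```

An `Acknowledgment:` label with nothing after it is treated as missing, since an empty line gives the Mediator no raw material.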
|
|
134
|
+
|
|
135
|
+
---
|
|
136
|
+
|
|
137
|
+
## Step 3. Mediator Layer — Synthesize (Core)
|
|
138
|
+
|
|
139
|
+
**[Isolation Principle — Cost of Consensus Response]**
|
|
140
|
+
The Mediator invokes a separate instance via the `Agent` tool.
|
|
141
|
+
When the same instance evaluates its own generated content, confirmation bias occurs (demonstrated in arXiv 2605.00914).
|
|
142
|
+
Physical separation from the Innovator/Devil generation context is required for unbiased synthesis.
|
|
143
|
+
|
|
144
|
+
> **What isolation means**: Blocks Self-Evaluation Bias — the tendency for an instance to favor its own output.
|
|
145
|
+
> The Mediator reads the Innovator and Devil outputs, but does not share the **reasoning process that generated** those outputs.
|
|
146
|
+
> Independence of the reasoning path — not mere information sharing — is the key to resolving Cost of Consensus.
|
|
147
|
+
|
|
148
|
+
Agent invocation instruction (includes Context Card):
|
|
149
|
+
```
|
|
150
|
+
Goal: Synthesize Innovator and Devil-Advocate outputs into a conditional verdict
|
|
151
|
+
Context: {full Step 1 output} + {full Step 2 output}
|
|
152
|
+
Constraints: No simple selection of the winning argument / must extract fragments from the losing argument / no hedge language ("balance both sides") / output under 1200 characters
|
|
153
|
+
Done When: All 5 sections output — Adopt / Alert absorption / Verdict / Conditions / Discard
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
**Synthesis formula**:
|
|
157
|
+
```
|
|
158
|
+
Core value of the Innovator proposal
|
|
159
|
+
+ Valid alerts from Devil's rebuttal (extracted from the acknowledgment line)
|
|
160
|
+
→ Conditional verdict: "{proposal} is OK, provided {condition} is met"
|
|
161
|
+
```
|
|
162
|
+
|
|
163
|
+
**What the Mediator must not do**:
|
|
164
|
+
- Simply select the winning argument as-is (simple verdict = deliberation failure)
|
|
165
|
+
- Discard the losing argument (fragment extraction is mandatory)
|
|
166
|
+
- Use hedge expressions like "we should consider both sides in a balanced way"
|
|
167
|
+
|
|
168
|
+
Output format:
|
|
169
|
+
```
|
|
170
|
+
[Mediator — Synthesis Verdict]
|
|
171
|
+
Adopt: Core of {Innovator Proposal N} — {value, 1 line}
|
|
172
|
+
Alert absorption: "{acknowledgment line}" from {Devil rebuttal} → converted to condition {X}
|
|
173
|
+
─────────────────────────────────────────
|
|
174
|
+
Verdict: Proceed with {proposal} — OK
|
|
175
|
+
Conditions: {1~3 required conditions}
|
|
176
|
+
Discard: {fully rejected parts — with rationale}
|
|
177
|
+
```
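The "Done When" requirement of five sections lends itself to a quick completeness check. This is a minimal sketch assuming each section appears as a `Label: value` line, as in the output format above; `missing_sections` is a hypothetical helper name.

```python
from typing import List

REQUIRED_SECTIONS = ("Adopt", "Alert absorption", "Verdict", "Conditions", "Discard")


def missing_sections(mediator_output: str) -> List[str]:
    """List required sections that are absent or have an empty value."""
    present = set()
    for line in mediator_output.splitlines():
        label, sep, value = line.partition(":")
        if sep and value.strip():
            present.add(label.strip())
    return [name for name in REQUIRED_SECTIONS if name not in present]
```

An empty result means all five sections are present; a non-empty one indicates the verdict is incomplete and should not be treated as a finished synthesis.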
|
|
178
|
+
|
|
179
|
+
---
|
|
180
|
+
|
|
181
|
+
## Step 4 (Optional). Jury Layer — Domain Perspectives
|
|
182
|
+
|
|
183
|
+
Runs only when 5-layer is selected. **Parallel** dispatch of 2~3 selected deep-insight personas via the `Agent` tool.
|
|
184
|
+
|
|
185
|
+
> **Fallback (if deep-insight is not installed)**: dispatch the jury from real fh-meta agents instead — `persona-innovator` and `hub-persona-auditor` (and `fh-commons:quench-challenger` for an adversarial juror). Same parallel-Agent dispatch, juror cap, and output format apply. The 5-layer jury is guaranteed to run without deep-insight installed — mirroring the Step 1/Step 2 fallbacks.
|
|
186
|
+
|
|
187
|
+
Juror count limit: **maximum 3**. If 4 or more are selected, output `[WARN: jury overload — noise risk]` and defer to the user.
|
|
188
|
+
|
|
189
|
+
Instruction per juror:
|
|
190
|
+
```
|
|
191
|
+
Goal: Review the Mediator's synthesis verdict from the perspective of {persona}
|
|
192
|
+
Context: Full output from Steps 1~3
|
|
193
|
+
Constraints: Do not overturn the already-synthesized verdict / only propose additional conditions
|
|
194
|
+
Done When: Agree / partial agreement / disagree + 1 line of additional conditions or risk
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
Output format:
|
|
198
|
+
```
|
|
199
|
+
[Jury: {persona name}]
|
|
200
|
+
Verdict: Agree / Partial agreement / Disagree
|
|
201
|
+
Opinion: {1~2 lines}
|
|
202
|
+
Additional condition: {1 line if applicable}
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
---
|
|
206
|
+
|
|
207
|
+
## Step 5 (Optional). Mediator Final Synthesis
|
|
208
|
+
|
|
209
|
+
Incorporates jury opinions to refine the Step 3 verdict.
|
|
210
|
+
|
|
211
|
+
```
|
|
212
|
+
[Final Verdict]
|
|
213
|
+
Based on: Step 3 synthesis verdict
|
|
214
|
+
Jury input: {N agreed / N partial / N disagreed}
|
|
215
|
+
Added conditions: {additional conditions from jury}
|
|
216
|
+
Final conclusion: {1~2 lines}
|
|
217
|
+
```
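The `Jury input` line of the final verdict is a tally of the Step 4 juror verdicts. A minimal sketch follows, assuming verdicts are collected as a mapping from persona name to one of the three verdict strings; the function name and the choice to raise on more than three jurors are illustrative assumptions.

```python
from collections import Counter
from typing import Dict

MAX_JURORS = 3  # juror count limit from Step 4


def jury_input_line(verdicts: Dict[str, str]) -> str:
    """Build the 'Jury input' line from per-juror verdicts.

    Expected verdict values: "Agree", "Partial agreement", "Disagree".
    """
    if len(verdicts) > MAX_JURORS:
        raise ValueError(f"{len(verdicts)} jurors selected; the limit is {MAX_JURORS}")
    counts = Counter(v.strip().lower() for v in verdicts.values())
    return (
        f"Jury input: {counts['agree']} agreed / "
        f"{counts['partial agreement']} partial / "
        f"{counts['disagree']} disagreed"
    )
```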
|
|
218
|
+
|
|
219
|
+
---
|
|
220
|
+
|
|
221
|
+
## Automatic WARN Detection Patterns
|
|
222
|
+
|
|
223
|
+
| Situation | WARN content |
|
|
224
|
+
|---|---|
|
|
225
|
+
| Devil rebuttal has no "acknowledgment" line | `[WARN: unsynthesizable rebuttal — Mediator lacks raw material]` |
|
|
226
|
+
| Mediator adopts only one side's argument | `[WARN: simple verdict, not synthesis — deliberation failure]` |
|
|
227
|
+
| Innovator and Devil share the same premise | `[WARN: no real clash — recommend redefining the topic]` |
|
|
228
|
+
| 4 or more jurors selected | `[WARN: jury overload — reduce to 3 or fewer?]` |
|
|
229
|
+
| Done When contains vague expressions | `[WARN: Done When is unmeasurable — share meta-prompt-builder WARN criteria]` |
|
|
230
|
+
|
|
231
|
+
---
|
|
232
|
+
|
|
233
|
+
## agent-composer Integration — Wave next-D
|
|
234
|
+
|
|
235
|
+
Add the following condition to the agent-composer Step 4-b state transition gate:
|
|
236
|
+
|
|
237
|
+
```
|
|
238
|
+
| ⑤ Design decision clash or judgment that "a battle is needed" | Wave next-D | deliberation (S) |
|
|
239
|
+
```
|
|
240
|
+
|
|
241
|
+
`Wave next-D` activation criteria:
|
|
242
|
+
- M/S/R results contain "2 or more mutually conflicting proposals"
|
|
243
|
+
- User utterance includes "which side is right?", "they conflict", "both seem valid"
|
|
244
|
+
- agent-composer cannot synthesize the fan-in results and is about to defer the decision to the user
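The activation criteria above can be expressed as a single predicate. The sketch below assumes that any one criterion is enough to activate `Wave next-D` (the list does not say whether they combine with "or" or "and"), and the parameter names are hypothetical.

```python
# Phrases quoted from the second activation criterion.
CLASH_PHRASES = ("which side is right", "they conflict", "both seem valid")


def should_trigger_next_d(
    conflicting_proposals: int,
    user_utterance: str,
    fan_in_unresolved: bool,
) -> bool:
    """Return True when any Wave next-D activation criterion holds.

    conflicting_proposals: count of mutually conflicting proposals in M/S/R results.
    fan_in_unresolved: agent-composer is about to defer the decision to the user.
    """
    text = user_utterance.lower()
    return (
        conflicting_proposals >= 2
        or any(phrase in text for phrase in CLASH_PHRASES)
        or fan_in_unresolved
    )
```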
|
|
245
|
+
|
|
246
|
+
---
|
|
247
|
+
|
|
248
|
+
## Design Principle — Why This Skill Exists
|
|
249
|
+
|
|
250
|
+
It is a **rope** for those who have not yet dared to challenge.
|
|
251
|
+
|
|
252
|
+
Thinking alone traps you in a single perspective. Only the courageous construct counterarguments themselves.
|
|
253
|
+
deliberation provides that counterargument as structure — even those afraid of conflict can start a battle, because the Mediator will synthesize it for them.
|
|
254
|
+
|
|
255
|
+
The Mediator's conditional verdict creates a safe entry point: "this is okay if you do it this way."
|
|
256
|
+
The jury fills the domain blind spots that no single person can see on their own.
|
|
257
|
+
|
|
258
|
+
**What the forge creates is not a winner — it is a new alloy.**
|
|
259
|
+
|
|
260
|
+
---
|
|
261
|
+
|
|
262
|
+
## Done When
|
|
263
|
+
|
|
264
|
+
```
|
|
265
|
+
All Steps 0~3 completed (Steps 4~5 added if 5-layer selected)
|
|
266
|
+
+ [Mediator — Synthesis Verdict] output present (Adopt / Alert absorption / Verdict / Conditions / Discard)
|
|
267
|
+
+ User's final decision confirmed (deliberation output must never be auto-executed)
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
**→ When invoked from agent-composer Wave next-D: synthesis verdict is the fan-in input for Wave continuation** — return the Mediator verdict + Conditions to agent-composer so the conflict is marked resolved in the fan-in result set. After this, agent-composer re-runs Step 4-b state transition evaluation with the conflict cleared; subsequent Waves (next-M / next-E / end) proceed based on the updated result.
|
|
271
|
+
|
|
272
|
+
## Simplification Guard
|
|
273
|
+
|
|
274
|
+
- Simple information queries → deliberation is unnecessary. Respond directly.
|
|
275
|
+
- Cases where the answer is already clear → route to sim-conductor (validation) or harness-doctor (diagnosis).
|
|
276
|
+
- If resolvable without a jury, 3-layer default is sufficient.
|
|
277
|
+
- deliberation output always completes with the user's final decision. Auto-execution is prohibited.
|
|
278
|
+
|
|
279
|
+
---
|
|
280
|
+
|
|
281
|
+
## Related Skills
|
|
282
|
+
|
|
283
|
+
| Situation | Related skill |
|
|
284
|
+
|---|---|
|
|
285
|
+
| Auto-triggered on design decision clash (Wave next-D) | `fh-meta:agent-composer` Step 4-b |
|
|
286
|
+
| Validate quality and consistency of a completed asset | `fh-meta:sim-conductor` |
|
|
287
|
+
| Validate skill candidates after deliberation | `fh-meta:harness-doctor` |
|
|
288
|
+
| Implementation convergence loop after a decision | `fh-commons:convergence-loop` |
|
|
@@ -0,0 +1,196 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: mcp-circuit-breaker
|
|
3
|
+
description: Detects MCP tool failure patterns and trips a circuit breaker to stop cascading retries. Proposes fallback alternatives and resets when the tool recovers. Triggers on "MCP failing", "tool keeps erroring", "circuit-breaker", repeated tool call failures.
|
|
4
|
+
user-invocable: true
|
|
5
|
+
allowed-tools: ["Read", "Bash", "Write"]
|
|
6
|
+
model: sonnet
|
|
7
|
+
complexity_routing:
|
|
8
|
+
base: sonnet
|
|
9
|
+
high: opus
|
|
10
|
+
escalate_when:
|
|
11
|
+
- multi_server_failure
|
|
12
|
+
- unknown_mcp_server
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
# mcp-circuit-breaker — MCP Tool Failure Guard
|
|
16
|
+
|
|
17
|
+
MCP tools can fail silently or return partial results, leading to cascading retry loops that waste tokens and degrade session quality. This skill detects failure patterns, trips a circuit breaker to halt retries, proposes alternatives, and resets when the tool recovers.
|
|
18
|
+
|
|
19
|
+
> **Scope distinction**
|
|
20
|
+
> - Claude Code native retry: low-level transport retries (transparent, fast)
|
|
21
|
+
> - mcp-circuit-breaker: **session-level guard** — detects repeated semantic failures, intervenes before token waste compounds
|
|
22
|
+
|
|
23
|
+
---
|
|
24
|
+
|
|
25
|
+
## Triggers
|
|
26
|
+
|
|
27
|
+
- `/mcp-circuit-breaker`
|
|
28
|
+
- "MCP failing", "MCP keeps erroring", "tool isn't working", "circuit-breaker"
|
|
29
|
+
- "same error keeps happening", "tool call looping", "MCP timeout"
|
|
30
|
+
- Automatic: when the same MCP tool name appears in 3+ consecutive failed calls within a session
|
|
31
|
+
|
|
32
|
+
---
|
|
33
|
+
|
|
34
|
+
## Circuit States
|
|
35
|
+
|
|
36
|
+
```
|
|
37
|
+
CLOSED → Normal operation (tool calls pass through)
|
|
38
|
+
OPEN → Circuit tripped (calls blocked, alternatives proposed)
|
|
39
|
+
HALF-OPEN → Recovery probe (1 test call allowed, resets if success)
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
Default thresholds:
|
|
43
|
+
- **Trip threshold**: 3 consecutive failures of the same tool
|
|
44
|
+
- **Half-open probe**: after 60s cooldown (or explicit user command)
|
|
45
|
+
- **Reset**: 1 successful call in HALF-OPEN state → back to CLOSED
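The three states and default thresholds describe a conventional circuit-breaker state machine. The following is a minimal sketch of that behavior, not the skill's actual mechanism: the `Circuit` class, its method names, and passing the clock in as `now` (seconds) are assumptions made so the example is deterministic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    CLOSED = "CLOSED"
    OPEN = "OPEN"
    HALF_OPEN = "HALF-OPEN"


@dataclass
class Circuit:
    trip_threshold: int = 3    # consecutive failures before tripping
    cooldown_s: float = 60.0   # wait before a HALF-OPEN probe
    state: State = State.CLOSED
    failures: int = 0
    tripped_at: Optional[float] = None

    def allow_call(self, now: float) -> bool:
        """CLOSED passes calls; OPEN blocks until cooldown, then allows one probe."""
        if self.state is State.OPEN and now - self.tripped_at >= self.cooldown_s:
            self.state = State.HALF_OPEN
            return True  # the single probe call
        return self.state is State.CLOSED

    def record(self, success: bool, now: float) -> None:
        """Update the circuit with the outcome of a call that was allowed through."""
        if success:
            self.state, self.failures, self.tripped_at = State.CLOSED, 0, None
            return
        self.failures += 1
        if self.state is State.HALF_OPEN or self.failures >= self.trip_threshold:
            # Tripping, or a failed probe, (re)starts the cooldown.
            self.state, self.tripped_at = State.OPEN, now
```

While the circuit is HALF-OPEN, `allow_call` returns False until the probe result is recorded, so only one test call passes through.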
|
|
46
|
+
|
|
47
|
+
---
|
|
48
|
+
|
|
49
|
+
## Execution Steps
|
|
50
|
+
|
|
51
|
+
### Step 1. Detect Failure Pattern
|
|
52
|
+
|
|
53
|
+
Identify the failing tool and failure mode:
|
|
54
|
+
|
|
55
|
+
```bash
|
|
56
|
+
# Check MCP server config
|
|
57
|
+
grep -A5 '"mcpServers"' .claude/settings.json 2>/dev/null || echo "No MCP config found"
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
Classify failure type:
|
|
61
|
+
|
|
62
|
+
| Type | Symptom | Likely Cause |
|
|
63
|
+
|---|---|---|
|
|
64
|
+
| `TIMEOUT` | Tool call hangs >30s | Server overload / network |
|
|
65
|
+
| `AUTH` | 401 / 403 response | Credentials expired or missing |
|
|
66
|
+
| `NOT_FOUND` | 404 / tool not available | Server down / tool removed |
|
|
67
|
+
| `MALFORMED` | Parse error on response | Schema mismatch / API change |
|
|
68
|
+
| `RATE_LIMIT` | 429 / quota exceeded | Too many calls |
|
|
69
|
+
|
|
70
|
+
If failure type cannot be determined: classify as `UNKNOWN`.
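The classification table maps observable symptoms to a failure type. A minimal sketch follows; the input signature (an optional HTTP-style status code, elapsed seconds, and a parse-error flag) is a hypothetical way to represent a failed call, and the order of the checks is an illustrative choice.

```python
from typing import Optional


def classify_failure(
    status: Optional[int] = None,
    elapsed_s: float = 0.0,
    parse_error: bool = False,
) -> str:
    """Map symptoms of a failed MCP tool call to a failure type from the table."""
    if status in (401, 403):
        return "AUTH"
    if status == 404:
        return "NOT_FOUND"
    if status == 429:
        return "RATE_LIMIT"
    if elapsed_s > 30:
        return "TIMEOUT"
    if parse_error:
        return "MALFORMED"
    return "UNKNOWN"
```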
|
|
71
|
+
|
|
72
|
+
---
|
|
73
|
+
|
|
74
|
+
### Step 2. Trip Decision
|
|
75
|
+
|
|
76
|
+
Count consecutive failures of the identified tool in the current session context.
|
|
77
|
+
|
|
78
|
+
| Count | Action |
|
|
79
|
+
|---|---|
|
|
80
|
+
| 1 | Log warning. Continue — *"MCP tool {name} failed once. Monitoring."* |
|
|
81
|
+
| 2 | Escalate warning. Suggest checking server status. |
|
|
82
|
+
| 3+ | **TRIP CIRCUIT** → output circuit open notice, block further calls to this tool |
|
|
83
|
+
|
|
84
|
+
Circuit open notice format:
|
|
85
|
+
```
|
|
86
|
+
⚡ CIRCUIT OPEN — {tool-name}
|
|
87
|
+
Failure type: {TYPE} | Consecutive failures: {N}
|
|
88
|
+
Further calls to this tool are blocked until circuit resets.
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
---
|
|
92
|
+
|
|
93
|
+
### Step 3. Log Circuit State
|
|
94
|
+
|
|
95
|
+
Write state to a session-local file. In-memory state is insufficient because it is lost on /clear, whereas the log file survives:
|
|
96
|
+
|
|
97
|
+
```bash
|
|
98
|
+
mkdir -p .claude/mcp_circuit/
|
|
99
|
+
# Append to circuit log
|
|
100
|
+
```
|
|
101
|
+
|
|
102
|
+
Log entry format:
|
|
103
|
+
```yaml
|
|
104
|
+
- tool: {tool-name}
|
|
105
|
+
state: OPEN
|
|
106
|
+
failure_type: {TYPE}
|
|
107
|
+
failure_count: {N}
|
|
108
|
+
tripped_at: {ISO-8601}
|
|
109
|
+
reset_at: null
|
|
110
|
+
```
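Appending an entry in this format could look like the sketch below. The directory `.claude/mcp_circuit/` comes from the step above; the file name `circuit_log.yaml` and both function names are assumptions, since the log file is not named here.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def format_entry(
    tool: str,
    failure_type: str,
    failure_count: int,
    tripped_at: Optional[str] = None,
) -> str:
    """Render one OPEN-state log entry as a YAML list item."""
    stamp = tripped_at or datetime.now(timezone.utc).isoformat(timespec="seconds")
    return (
        f"- tool: {tool}\n"
        f"  state: OPEN\n"
        f"  failure_type: {failure_type}\n"
        f"  failure_count: {failure_count}\n"
        f"  tripped_at: {stamp}\n"
        f"  reset_at: null\n"
    )


def append_entry(log_dir: str, entry: str, filename: str = "circuit_log.yaml") -> Path:
    """Append an entry to the log, creating the directory if needed.

    `filename` is a hypothetical default; adjust to the real log name.
    """
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / filename
    with path.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return path
```

A call such as `append_entry(".claude/mcp_circuit", format_entry("search", "TIMEOUT", 3))` would add one entry; opening in append mode keeps earlier entries intact.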
|
|
111
|
+
|
|
112
|
+
---
|
|
113
|
+
|
|
114
|
+
### Step 4. Propose Alternatives
|
|
115
|
+
|
|
116
|
+
Present 3 fallback options ranked by effort:
|
|
117
|
+
|
|
118
|
+
| Priority | Alternative | When to Use |
|
|
119
|
+
|---|---|---|
|
|
120
|
+
| **1 — Substitute tool** | Use a different MCP tool or built-in tool that covers the same task | Tool-specific failure (NOT_FOUND, AUTH) |
|
|
121
|
+
| **2 — Degrade gracefully** | Skip the MCP step, note the gap, continue with available information | TIMEOUT / RATE_LIMIT |
|
|
122
|
+
| **3 — Pause and retry** | Wait for server recovery (HALF-OPEN probe after cooldown) | Transient failure (TIMEOUT, RATE_LIMIT) |
|
|
123
|
+
|
|
124
|
+
Output format:
|
|
125
|
+
```
|
|
126
|
+
## Fallback Options for {tool-name}
|
|
127
|
+
|
|
128
|
+
Option 1 — Substitute: Use {alternative-tool} instead
|
|
129
|
+
→ Command: [specific invocation]
|
|
130
|
+
→ Gap: [what's different vs. original tool]
|
|
131
|
+
|
|
132
|
+
Option 2 — Degrade: Skip this step, continue without {capability}
|
|
133
|
+
→ Impact: [what is missing from the output]
|
|
134
|
+
|
|
135
|
+
Option 3 — Retry after cooldown (60s)
|
|
136
|
+
→ Run: /mcp-circuit-breaker reset {tool-name}
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
---
|
|
140
|
+
|
|
141
|
+
### Step 5. Recovery Probe (HALF-OPEN)
|
|
142
|
+
|
|
143
|
+
When the user requests a reset, or after the cooldown elapses:
|
|
144
|
+
|
|
145
|
+
```
|
|
146
|
+
Sending HALF-OPEN probe to {tool-name}...
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
- 1 minimal test call is allowed through
|
|
150
|
+
- If success: circuit → CLOSED, log updated
|
|
151
|
+
- If fail: circuit remains OPEN, cooldown resets
|
|
152
|
+
|
|
153
|
+
Reset log entry:
|
|
154
|
+
```yaml
|
|
155
|
+
state: CLOSED
|
|
156
|
+
reset_at: {ISO-8601}
|
|
157
|
+
reset_method: probe_success | user_forced
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
---
|
|
161
|
+
|
|
162
|
+
### Step 6. Report
|
|
163
|
+
|
|
164
|
+
At session end or on demand:
|
|
165
|
+
|
|
166
|
+
```
|
|
167
|
+
## MCP Circuit Breaker Report
|
|
168
|
+
|
|
169
|
+
| Tool | State | Failures | Tripped | Reset |
|
|
170
|
+
|---|---|---|---|---|
|
|
171
|
+
| {tool} | OPEN | 4 | 14:23 | — |
|
|
172
|
+
| {tool} | CLOSED | 1 | 14:10 | 14:15 (probe) |
|
|
173
|
+
|
|
174
|
+
Recommendations:
|
|
175
|
+
- {tool}: AUTH failure → refresh credentials in .claude/settings.json
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
---
|
|
179
|
+
|
|
180
|
+
## Done When
|
|
181
|
+
|
|
182
|
+
- Failure pattern classified (type + count)
|
|
183
|
+
- Circuit state logged (OPEN / HALF-OPEN / CLOSED)
|
|
184
|
+
- At least 3 fallback alternatives proposed when circuit is OPEN
|
|
185
|
+
- Recovery probe offered with reset path
|
|
186
|
+
|
|
187
|
+
---
|
|
188
|
+
|
|
189
|
+
## Chains
|
|
190
|
+
|
|
191
|
+
**Upstream** (can trigger this skill):
|
|
192
|
+
- Automatically activates on 3+ consecutive MCP failures during any task
|
|
193
|
+
|
|
194
|
+
**Downstream** (after circuit open):
|
|
195
|
+
- No mandatory chain — fallback options are presented, user decides
|
|
196
|
+
- Optional: `context-doctor` if MCP failure is due to large context degrading tool calls
|
|
@@ -0,0 +1,175 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: token-budget-gate
|
|
3
|
+
description: Estimates token cost before a multi-step task and outputs a Green/Yellow/Orange/Red gate verdict. Tracks actual vs. estimated after completion for calibration. Triggers on "token budget", "how much will this cost", "will this be expensive", "estimate tokens", before long multi-agent tasks.
|
|
4
|
+
user-invocable: true
|
|
5
|
+
allowed-tools: ["Read", "Bash"]
|
|
6
|
+
model: sonnet
|
|
7
|
+
complexity_routing:
|
|
8
|
+
base: sonnet
|
|
9
|
+
high: opus
|
|
10
|
+
escalate_when:
|
|
11
|
+
- multi_project_scope
|
|
12
|
+
- unknown_task_type
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
# token-budget-gate — Pre-Task Token Cost Gate
|
|
16
|
+
|
|
17
|
+
Multi-step and multi-agent tasks can silently consume large token budgets. This skill estimates cost before execution, outputs a gate verdict, and calibrates estimates against actual usage after completion — preventing surprise overruns without blocking legitimate work.
|
|
18
|
+
|
|
19
|
+
> **FH context**: FH default execution tier is `standard` (~15K tokens). This skill gates against accidental `full` (~30K) or `max` (~60K+) consumption on tasks that a lighter tier could handle.
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
## Triggers
|
|
24
|
+
|
|
25
|
+
- `/token-budget-gate`
|
|
26
|
+
- "token budget", "token cost", "how expensive", "will this use a lot of tokens"
|
|
27
|
+
- "estimate tokens", "token estimate before we start"
|
|
28
|
+
- Before invoking: `agent-composer`, `sim-conductor`, `steel-quench` (max-tier skills)
|
|
29
|
+
- Automatically proposed when task description contains: multi-agent, parallel dispatch, full suite, all files, entire codebase
|
|
30
|
+
|
|
31
|
+
---
|
|
32
|
+
|
|
33
|
+
## Gate Thresholds (defaults — user-configurable)
|
|
34
|
+
|
|
35
|
+
| Signal | Verdict | Action |
|
|
36
|
+
|---|---|---|
|
|
37
|
+
| Estimated < 10K tokens | 🟢 **GREEN** | Proceed without comment |
|
|
38
|
+
| 10K–30K tokens | 🟡 **YELLOW** | Proceed with notice — suggest lighter approach if one exists |
|
|
39
|
+
| 30K–60K tokens | 🟠 **ORANGE** | Confirm before proceeding — present scope reduction options |
|
|
40
|
+
| > 60K tokens | 🔴 **RED** | Block + require explicit approval — present mandatory reduction |
|
|
41
|
+
|
|
42
|
+
Custom threshold: user can set `TOKEN_BUDGET_MAX=N` in conversation or `.claude/settings.json`.
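The default thresholds translate directly into a verdict function. This is a minimal sketch of the defaults only; it does not model `TOKEN_BUDGET_MAX`, and it assumes that exactly 30K falls in ORANGE and exactly 60K stays in ORANGE, since the table leaves those boundaries open.

```python
def gate_verdict(estimated_tokens: int) -> str:
    """Return the gate verdict for an estimate, using the default thresholds."""
    if estimated_tokens < 10_000:
        return "GREEN"
    if estimated_tokens < 30_000:   # assumption: 30K itself is ORANGE
        return "YELLOW"
    if estimated_tokens <= 60_000:  # assumption: 60K itself is still ORANGE
        return "ORANGE"
    return "RED"
```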
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## Execution Steps
|
|
47
|
+
|
|
48
|
+
### Step 1. Parse Task Description
|
|
49
|
+
|
|
50
|
+
Extract task dimensions:
|
|
51
|
+
|
|
52
|
+
| Dimension | Low (×1) | Medium (×2) | High (×4) |
|
|
53
|
+
|---|---|---|---|
|
|
54
|
+
| **File scope** | 1–3 files | 4–10 files | 11+ files / whole codebase |
|
|
55
|
+
| **Agent count** | 0 (inline) | 1–2 agents | 3+ agents / parallel |
|
|
56
|
+
| **Step depth** | 1–3 steps | 4–8 steps | 9+ steps |
|
|
57
|
+
| **Iteration** | None | 1 round | 2+ rounds (wave/loop) |
|
|
58
|
+
| **Output size** | Short answer | Medium doc | Full report / deck |
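The numeric rows of the dimension table can be bucketed with one small helper. A minimal sketch follows; the function names are hypothetical, and Output size is left out because its levels are qualitative rather than numeric.

```python
def _level(value: int, low_max: int, medium_max: int) -> str:
    """Bucket a count into low / medium / high using inclusive upper bounds."""
    if value <= low_max:
        return "low"
    if value <= medium_max:
        return "medium"
    return "high"


def file_scope_level(n_files: int) -> str:
    return _level(n_files, 3, 10)   # 1-3 / 4-10 / 11+


def agent_count_level(n_agents: int) -> str:
    return _level(n_agents, 0, 2)   # 0 inline / 1-2 / 3+


def step_depth_level(n_steps: int) -> str:
    return _level(n_steps, 3, 8)    # 1-3 / 4-8 / 9+


def iteration_level(n_rounds: int) -> str:
    return _level(n_rounds, 0, 1)   # none / 1 round / 2+
```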
|
|
59
|
+

---

### Step 2. Estimate Token Cost

Base estimates per task type:

| Task Type | Base Estimate | Notes |
|---|---|---|
| Single file edit | 2K | Read + edit + verify |
| Code review (1 PR) | 5K | Diff + analysis + comments |
| Skill creation (1 SKILL.md) | 8K | Design + write + CATALOG update |
| Agent dispatch (1 agent) | 10K | Context card + agent overhead |
| Parallel dispatch (3 agents) | 25K | 3× agent + orchestration |
| sim-conductor full run | 30K | All 5 simulation axes |
| steel-quench 4-wave | 50K | All waves + prescriptions |
| Full harvest-loop cycle | 40K | 8-step pipeline + PRs |

Apply the file-scope, agent-count, and iteration multipliers from Step 1 to the base estimate. Step depth and output size are extracted in Step 1 but do not appear in the formula.

**Final formula:**
```
Estimated = base × file_multiplier × agent_multiplier × iteration_multiplier
```

Round to nearest 1K.

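As a worked instance of the formula, a skill creation (base 8K) with 1–3 files (×1), 2 agents (×2), and no iteration (×1) gives 8 × 1 × 2 × 1 = 16K. A minimal sketch of the same arithmetic follows; the dictionary keys and the function name `estimate_k` are hypothetical, and rounding half up is an assumption, since the text only says "nearest 1K".

```python
import math

# Base estimates in thousands of tokens, copied from the table above.
# The key names are illustrative, not identifiers used by the skill.
BASE_ESTIMATES_K = {
    "single_file_edit": 2,
    "code_review": 5,
    "skill_creation": 8,
    "agent_dispatch": 10,
    "parallel_dispatch": 25,
    "sim_conductor_full_run": 30,
    "steel_quench_4_wave": 50,
    "harvest_loop_cycle": 40,
}


def estimate_k(task_type: str, file_mult: int = 1, agent_mult: int = 1,
               iteration_mult: int = 1) -> int:
    """Estimated = base x file x agent x iteration, rounded to the nearest 1K."""
    raw = BASE_ESTIMATES_K[task_type] * file_mult * agent_mult * iteration_mult
    return math.floor(raw + 0.5)  # round half up (assumption)


print(estimate_k("skill_creation", agent_mult=2))  # 16
```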

---

### Step 3. Output Gate Verdict

```
## Token Budget Gate

Task: {one-line task description}
Estimated cost: ~{N}K tokens
Threshold: {user max or default}

Verdict: 🟡 YELLOW — within budget but consider lighter approach

Breakdown:
Base (skill creation): 8K
× 1–3 files: ×1 = 8K
× 2 agents: ×2 = 16K
× no iteration: ×1 = 16K
Total: ~16K

Lighter alternative:
→ Inline (no agent dispatch): ~8K (-50%)
→ Single agent, not parallel: ~12K (-25%)

Proceed? (y to continue / n to adjust scope)
```

For 🟢 GREEN: output one line only — *"Token estimate: ~{N}K — GREEN, proceeding."*

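The "Breakdown" and "Lighter alternative" lines in the template are running products and percentage differences. A rough sketch of how they could be computed follows; `breakdown_lines` and `saving_percent` are hypothetical helper names, and the exact layout of the real output is not specified beyond the template.

```python
def breakdown_lines(base_label: str, base_k: int,
                    factors: list[tuple[str, int]]) -> list[str]:
    """Build breakdown lines with a running total after each multiplier."""
    lines = [f"Base ({base_label}): {base_k}K"]
    total = base_k
    for label, multiplier in factors:
        total *= multiplier
        lines.append(f"× {label}: ×{multiplier} = {total}K")
    lines.append(f"Total: ~{total}K")
    return lines


def saving_percent(alternative_k: float, total_k: float) -> int:
    """Signed percent change of an alternative relative to the full estimate."""
    return round((alternative_k - total_k) / total_k * 100)


for line in breakdown_lines("skill creation", 8,
                            [("2 agents", 2), ("no extra rounds", 1)]):
    print(line)
print(saving_percent(8, 16), saving_percent(12, 16))  # -50 -25
```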

---

### Step 4. Proceed / Adjust

- **GREEN**, or **YELLOW** once the user confirms: proceed and note the start marker
- **ORANGE**: present the scope reduction table and wait for the user's selection
- **RED**: present mandatory reduction — do not proceed until user explicitly approves

Scope reduction options table (ORANGE/RED):

| Option | Reduction | Trade-off |
|---|---|---|
| Drop parallel → sequential | -30% | Slower, same quality |
| Reduce agent count (3→1) | -50% | Less parallelism |
| Narrow file scope | -40% | Shallower coverage |
| Use lighter skill variant | -60% | Fewer waves/probes |
| Split into 2 sessions | -50%/session | No quality loss |

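To show what the reduction percentages mean for a concrete estimate, the sketch below applies each option to a 50K estimate (the steel-quench base). The option keys and the function name `reduced_estimate_k` are hypothetical, and applying one option at a time to the whole estimate is an assumption; the table does not say whether reductions combine.

```python
import math

# Reduction percentages copied from the table above; key names are illustrative.
REDUCTION_PERCENT = {
    "drop_parallel": 30,
    "reduce_agents_3_to_1": 50,
    "narrow_file_scope": 40,
    "lighter_skill_variant": 60,
    "split_two_sessions": 50,  # per session
}


def reduced_estimate_k(estimated_k: float, option: str) -> int:
    """Apply a single reduction option and round to the nearest 1K."""
    remaining = estimated_k * (100 - REDUCTION_PERCENT[option]) / 100
    return math.floor(remaining + 0.5)  # round half up (assumption)


for option in REDUCTION_PERCENT:
    print(f"{option}: ~{reduced_estimate_k(50, option)}K")
```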

---

### Step 5. Post-Task Calibration (optional)

After task completion, if the user says "how much did that cost" or "calibrate":

```
## Calibration

Estimated: ~{estimated}K tokens
Actual: ~{actual}K tokens
Error: {+/-N}%

Calibration note saved → improves next estimate for this task type.
```

Write calibration data:
```bash
mkdir -p .claude/token_calibration/
# Append: task_type, estimated, actual, date
# (calibration.csv and the shell variables are hypothetical placeholders)
printf '%s,%s,%s,%s\n' "$TASK_TYPE" "$ESTIMATED" "$ACTUAL" "$(date +%Y-%m-%d)" >> .claude/token_calibration/calibration.csv
```

Calibration data improves future estimates for the same task type (no model training — local record only).

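The calibration record and the error figure can be sketched as below. Several details are assumptions: the error is taken relative to the estimate, (actual − estimated) / estimated, the record is stored as CSV, and the file name `calibration.csv` and both function names are hypothetical. The skill itself only specifies the directory and the four fields.

```python
import csv
from datetime import date
from pathlib import Path


def error_percent(estimated_k: float, actual_k: float) -> int:
    """Signed estimation error relative to the estimate, in percent."""
    return round((actual_k - estimated_k) / estimated_k * 100)


def append_calibration(directory: Path, task_type: str,
                       estimated_k: float, actual_k: float) -> Path:
    """Append one record: task_type, estimated, actual, date."""
    directory.mkdir(parents=True, exist_ok=True)
    log_path = directory / "calibration.csv"  # hypothetical file name
    with log_path.open("a", newline="") as handle:
        csv.writer(handle).writerow(
            [task_type, estimated_k, actual_k, date.today().isoformat()]
        )
    return log_path


print(error_percent(16, 20))  # 25
```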

---

## Done When

- Gate verdict output (GREEN/YELLOW/ORANGE/RED) with estimated cost breakdown
- For ORANGE/RED: scope reduction options presented and user decision recorded
- Calibration offered after task completion (optional, not mandatory)

---

## Chains

**Upstream** (proposed before these skills):
- → `agent-composer` (multi-agent orchestration)
- → `sim-conductor` (5-axis simulation)
- → `steel-quench` (4-wave adversarial review)
- → `harvest-loop` (8-step pipeline)

**Downstream**:
- No mandatory chain — gate verdict is the output; task execution follows user decision