@chrono-meta/fh-gate 1.0.3 → 1.2.0
- package/.claude/agents/challenger.md +169 -0
- package/AGENTS.md +160 -0
- package/CATALOG.md +256 -0
- package/CHEATSHEET.md +367 -0
- package/CLAUDE.md +331 -0
- package/CONTRIBUTING.md +198 -0
- package/LICENSE +21 -0
- package/README.md +131 -418
- package/bin/fh-goal.js +9 -0
- package/bin/fh-run.js +9 -0
- package/docs/banner.png +0 -0
- package/docs/codex-compat.md +123 -0
- package/docs/pillars.svg +70 -0
- package/knowledge/shared/harness-core/fh_integration_contract.md +48 -29
- package/package.json +31 -6
- package/plugins/fh-commons/README.md +37 -0
- package/plugins/fh-commons/agents/quench-challenger.md +373 -0
- package/plugins/fh-commons/skills/convergence-loop/SKILL.md +155 -0
- package/plugins/fh-commons/skills/deliberation/SKILL.md +288 -0
- package/plugins/fh-commons/skills/mcp-circuit-breaker/SKILL.md +196 -0
- package/plugins/fh-commons/skills/token-budget-gate/SKILL.md +175 -0
- package/plugins/fh-meta/agents/fact-checker.md +121 -0
- package/plugins/fh-meta/agents/hub-persona-auditor.md +109 -0
- package/plugins/fh-meta/agents/persona-innovator.md +195 -0
- package/plugins/fh-meta/skills/agent-composer/SKILL.md +461 -0
- package/plugins/fh-meta/skills/agent-composer/SKILL_detail.md +464 -0
- package/plugins/fh-meta/skills/apex-review/SKILL.md +185 -0
- package/plugins/fh-meta/skills/asset-placement-gate/SKILL.md +135 -0
- package/plugins/fh-meta/skills/contention-layer/SKILL.md +127 -0
- package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL.md +30 -0
- package/plugins/fh-meta/skills/context-bridge-dispatch/SKILL_detail.md +144 -0
- package/plugins/fh-meta/skills/context-doctor/SKILL.md +341 -0
- package/plugins/fh-meta/skills/cross-ecosystem-synergy-detection/SKILL.md +202 -0
- package/plugins/fh-meta/skills/deep-clarify/SKILL.md +144 -0
- package/plugins/fh-meta/skills/edit-manifest/SKILL.md +210 -0
- package/plugins/fh-meta/skills/field-harvest/SKILL.md +384 -0
- package/plugins/fh-meta/skills/frontier-digest/SKILL.md +272 -0
- package/plugins/fh-meta/skills/goal-quench/SKILL.md +509 -0
- package/plugins/fh-meta/skills/harness-doctor/SKILL.md +277 -0
- package/plugins/fh-meta/skills/harness-doctor/SKILL_detail.md +484 -0
- package/plugins/fh-meta/skills/harvest-loop/SKILL.md +231 -0
- package/plugins/fh-meta/skills/harvest-loop/SKILL_detail.md +201 -0
- package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL.md +129 -0
- package/plugins/fh-meta/skills/hub-cc-pr-reviewer/SKILL_detail.md +158 -0
- package/plugins/fh-meta/skills/install-doctor/SKILL.md +207 -0
- package/plugins/fh-meta/skills/install-wizard/SKILL.md +613 -0
- package/plugins/fh-meta/skills/marketplace-gate/SKILL.md +193 -0
- package/plugins/fh-meta/skills/memory-hygiene/SKILL.md +143 -0
- package/plugins/fh-meta/skills/meta-prompt-builder/SKILL.md +167 -0
- package/plugins/fh-meta/skills/meta-prompt-builder/SKILL_detail.md +37 -0
- package/plugins/fh-meta/skills/pipeline-conductor/SKILL.md +430 -0
- package/plugins/fh-meta/skills/plugin-recommender/SKILL.md +221 -0
- package/plugins/fh-meta/skills/plugin-recommender/SKILL_detail.md +220 -0
- package/plugins/fh-meta/skills/prompt-regression/SKILL.md +178 -0
- package/plugins/fh-meta/skills/public-surface-audit/SKILL.md +224 -0
- package/plugins/fh-meta/skills/return-path-gate/SKILL.md +257 -0
- package/plugins/fh-meta/skills/self-marketing-lint/SKILL.md +129 -0
- package/plugins/fh-meta/skills/sim-conductor/SKILL.md +364 -0
- package/plugins/fh-meta/skills/sim-conductor/SKILL_detail.md +337 -0
- package/plugins/fh-meta/skills/skill-splitter/SKILL.md +126 -0
- package/plugins/fh-meta/skills/skill-splitter/SKILL_detail.md +185 -0
- package/plugins/fh-meta/skills/source-grounding-audit/SKILL.md +230 -0
- package/plugins/fh-meta/skills/source-grounding-audit/SKILL_detail.md +182 -0
- package/plugins/fh-meta/skills/steel-quench/SKILL.md +226 -0
- package/plugins/fh-meta/skills/steel-quench/SKILL_detail.md +453 -0
- package/plugins/fh-meta/skills/verify-bidirectional/SKILL.md +238 -0
- package/scripts/fh-gate.sh +175 -40
- package/scripts/fh-goal.sh +182 -0
- package/scripts/fh-run.sh +269 -0
package/README.md
CHANGED
|
@@ -1,18 +1,23 @@
|
|
|
1
1
|
<p align="center">
|
|
2
|
-
<img src="docs/banner.png" alt="forge-harness — A forkable Claude Code meta-harness for multi-project teams" width="
|
|
2
|
+
<img src="docs/banner.png" alt="forge-harness — A forkable Claude Code meta-harness for multi-project teams" width="680">
|
|
3
3
|
</p>
|
|
4
4
|
|
|
5
5
|
<p align="center">
|
|
6
6
|
<a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-22c55e.svg" alt="MIT License"></a>
|
|
7
|
-
<img src="https://img.shields.io/badge/
|
|
8
|
-
<a href="https://zenodo.org/records/20397566"><img src="https://img.shields.io/badge/DOI-10.5281%2Fzenodo.20397566-blue.svg" alt="DOI
|
|
9
|
-
<img src="https://img.shields.io/badge/Claude_Code-compatible-a855f7.svg" alt="Claude Code
|
|
7
|
+
<img src="https://img.shields.io/badge/fh--gate-v1.2.0-3b82f6.svg" alt="fh-gate v1.2.0">
|
|
8
|
+
<a href="https://zenodo.org/records/20397566"><img src="https://img.shields.io/badge/DOI-10.5281%2Fzenodo.20397566-blue.svg" alt="DOI"></a>
|
|
9
|
+
<img src="https://img.shields.io/badge/Claude_Code-compatible-a855f7.svg" alt="Claude Code">
|
|
10
10
|
<img src="https://img.shields.io/badge/Codex-beta-f59e0b.svg" alt="Codex-compatible beta">
|
|
11
|
+
<a href="https://www.npmjs.com/package/@chrono-meta/fh-gate"><img src="https://img.shields.io/npm/v/@chrono-meta/fh-gate.svg?color=cb3837" alt="npm"></a>
|
|
11
12
|
</p>
|
|
12
13
|
|
|
13
14
|
<p align="center">
|
|
14
15
|
<b>Fork it. Rename it. Make it yours.</b><br>
|
|
15
|
-
A persistent knowledge hub that connects all your Claude Code projects
|
|
16
|
+
A persistent knowledge hub that connects all your Claude Code projects —<br>shared skills, accumulated context, and a compounding improvement loop.
|
|
17
|
+
</p>
|
|
18
|
+
|
|
19
|
+
<p align="center">
|
|
20
|
+
<img src="docs/pillars.svg" alt="FORK · ADAPT · COLLABORATE · EMPOWER" width="680">
|
|
16
21
|
</p>
|
|
17
22
|
|
|
18
23
|
---
|
|
@@ -20,500 +25,208 @@
|
|
|
20
25
|
| If you're here because… | forge-harness solves it |
|
|
21
26
|
|---|---|
|
|
22
27
|
| Context disappears when a session ends | Persistent `tracks/` — resumable from anywhere |
|
|
23
|
-
| You repeat the same setup
|
|
24
|
-
|
|
|
28
|
+
| You repeat the same setup across projects | Connect once to the hub, share across all projects |
|
|
29
|
+
| Team AI know-how lives only in people's heads | Codify it so everyone shares it |
|
|
25
30
|
| You want AI to get *better* as work accumulates | Skills and patterns compound session over session |
|
|
26
|
-
| You
|
|
27
|
-
|
|
28
|
-
> **Worried about token costs?** New install footprint ≈ 14.5% of 200K context. `context-doctor` diagnoses and reduces this further. → [Token optimization](#token-cost-optimization)
|
|
29
|
-
|
|
30
|
-
| Where you are now | Jump to |
|
|
31
|
-
|---|---|
|
|
32
|
-
| Starting from scratch | [Get started in 2 minutes](#get-started-in-2-minutes) |
|
|
33
|
-
| Already using it, want more | [33 asset activation check](#already-using-it----33-asset-activation-check) |
|
|
34
|
-
| Wrapping an external coding agent | [Governance layer](#governance-layer-for-ai-generated-code) |
|
|
35
|
-
| Want to spread it to your team | [Operating model Phase 3](#operating-model----3-phase-essence) |
|
|
36
|
-
|
|
37
|
-
---
|
|
31
|
+
| You need a governance layer for AI-generated code | `fh-gate` wraps any coding agent as a post-generation gate |
|
|
38
32
|
|
|
39
33
|
> **This document is for humans.** AI operating rules → `CLAUDE.md` · Command reference → `CHEATSHEET.md`
|
|
40
34
|
|
|
41
35
|
---
|
|
42
36
|
|
|
43
|
-
## What is this?
|
|
44
|
-
|
|
45
|
-
An **acceleration hub** for teams already using Claude Code. Connect N projects to one hub — work, learnings, and patterns from each project **mutually reinforce** each other. Build skills and agents once in the hub; share them across every project.
|
|
46
|
-
|
|
47
|
-
> **Goal**: Get into orbit in the age of AI acceleration without burning out. Minimize setup friction, optimize context, distribute expertise by task complexity, and raise the success rate of every session.
|
|
48
|
-
|
|
49
|
-
forge-harness is structured as two distinct layers:
|
|
50
|
-
|
|
51
|
-
| Layer | Contents | AI compatibility |
|
|
52
|
-
|---|---|---|
|
|
53
|
-
| **Methodology layer** (model-agnostic) | `tracks/`, `knowledge/`, `SKILL.md` documents, session protocols | Any AI model |
|
|
54
|
-
| **Automation layer** (Claude-native) | `.claude/agents/`, hooks, slash commands, `CLAUDE.md` rules | Claude Code only |
|
|
55
|
-
|
|
56
|
-
The **methodology layer** is the portable core — connecting projects to a persistent hub, accumulating learnings in `tracks/`, curating cross-project knowledge in `knowledge/shared/`. Works regardless of which AI you use.
|
|
57
|
-
|
|
58
|
-
The **automation layer** is what makes the methodology frictionless when running Claude Code: agents dispatch automatically, hooks fire at session boundaries, and slash commands invoke skills without manual prompting.
|
|
59
|
-
|
|
60
|
-
> **Codex-compatible beta**: Gemini, Codex, and other AI users can apply the methodology layer manually. Automation layer features require Claude Code as host.
|
|
61
|
-
|
|
62
|
-
---
|
|
63
|
-
|
|
64
|
-
## Finding your entry path
|
|
65
|
-
|
|
66
|
-
Teams using AI collaboration tools systematically are already doing **harness engineering**: QA protocols, verification pipelines, structures that make AI behave more consistently. forge-harness is the **OS one layer above** — a system that measures, improves, and evolves the harness itself across multiple projects.
|
|
67
|
-
|
|
68
|
-
| Layer | What it does | Examples |
|
|
69
|
-
|---|---|---|
|
|
70
|
-
| Harness engineering | Per-project rules, gates, context management | QA protocols, CLAUDE.md rulesets, TC verification pipelines |
|
|
71
|
-
| **Meta harness engineering** | Cross-project system to measure, improve, evolve harnesses | FH skill bus, harness-doctor, steel-quench, field-harvest |
|
|
72
|
-
|
|
73
|
-
> **FH v1.0 paper** — published 2026-05-30 on [Zenodo](https://zenodo.org/records/20397566) (DOI: 10.5281/zenodo.20397566) · arXiv submission in review. Documents the 2-layer design, 6-axis framework, 4-agent orchestration, and compounding loop with controlled empirical evidence.
|
|
74
|
-
|
|
75
|
-
> **External validation (2026)** — three independent research findings converge:
|
|
76
|
-
> - VILA-Lab analysis of Claude Code v2.1.88 (512K lines): [98.4% is harness infrastructure, 1.6% AI logic](https://arxiv.org/abs/2604.14228)
|
|
77
|
-
> - "[Code as Agent Harness](https://arxiv.org/abs/2605.18747)" (arXiv, May 2026, 43 authors)
|
|
78
|
-
> - Stanford IRIS Lab: "[Meta-Harness](https://arxiv.org/abs/2603.28052)" (Lee et al., Mar 2026) — outer-loop harness optimization; +7.7pts at 4× fewer tokens
|
|
79
|
-
|
|
80
|
-
#### FH vs. automated harness tools
|
|
81
|
-
|
|
82
|
-
The Stanford paper inspired [harness-evolver](https://github.com/raphaelchristi/harness-evolver) — a fully automated 7-stage CODE optimizer. FH independently converged on the same loop architecture from the opposite direction:
|
|
83
|
-
|
|
84
|
-
| Axis | harness-evolver | forge-harness |
|
|
85
|
-
|---|---|---|
|
|
86
|
-
| **Optimization target** | Harness code (prompts, routing) | Harness knowledge (context, patterns, expertise) |
|
|
87
|
-
| **Evolution** | Auto-merge winners to git | Human-approved at every stage |
|
|
88
|
-
| **Infrastructure** | LangSmith + Python 3.10+ | CLAUDE.md + skills only, zero extra |
|
|
89
|
-
| **Scope** | Single-harness optimization | Multi-project federation, shared skill bus |
|
|
90
|
-
| **Knowledge layer** | No persistent curation | `tracks/` + `knowledge/` grow over time |
|
|
91
|
-
|
|
92
|
-
They're complementary — FH's approval gates and knowledge layer fill exactly the gaps automated CODE search leaves open.
|
|
93
|
-
|
|
94
|
-
Count how many apply to you:
|
|
95
|
-
|
|
96
|
-
- [ ] You have 2 or more Claude Code projects
|
|
97
|
-
- [ ] You lose context when a session ends
|
|
98
|
-
- [ ] You repeat the same patterns and rules across multiple projects
|
|
99
|
-
- [ ] You want to spread AI methodology to your team
|
|
100
|
-
- [ ] You want AI to improve as work accumulates
|
|
101
|
-
|
|
102
|
-
| Count | Recommended path |
|
|
103
|
-
|:---:|---|
|
|
104
|
-
| **3+** | Standard entry → [Get started in 2 minutes](#get-started-in-2-minutes) |
|
|
105
|
-
| **1–2** | Plugin first → `claude plugin install -s user fh-meta@forge-harness` |
|
|
106
|
-
| **0** | Single-project stage — check back when you reach 2+ projects. `context-doctor` is available standalone now |
|
|
107
|
-
|
|
108
|
-
---
|
|
109
|
-
|
|
110
37
|
## Get started in 2 minutes
|
|
111
38
|
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
### Step 0. Register the plugin
|
|
39
|
+
**Prerequisite**: Claude Code CLI — verify with `claude --version`
|
|
115
40
|
|
|
116
41
|
```bash
|
|
42
|
+
# 1. Install the plugin
|
|
117
43
|
claude plugin marketplace add https://github.com/chrono-meta/forge-harness.git
|
|
118
44
|
claude plugin install -s user fh-meta@forge-harness
|
|
119
|
-
```
|
|
120
|
-
|
|
121
|
-
> If Step 0 fails: run `claude plugin update fh-meta@forge-harness`, or check that your network can reach `github.com`.
|
|
122
45
|
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
### Step 1. Clone the hub
|
|
126
|
-
|
|
127
|
-
```bash
|
|
46
|
+
# 2. Clone the hub
|
|
128
47
|
git clone https://github.com/chrono-meta/forge-harness.git ~/forge-harness
|
|
129
48
|
cd ~/forge-harness
|
|
130
|
-
```
|
|
131
|
-
|
|
132
|
-
> **Standard path**: Fork on GitHub first → clone your fork → accumulate `tracks/` and `knowledge/` there → periodically pull upstream updates from forge-harness. Rename it to make it yours.
|
|
133
49
|
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
```bash
|
|
50
|
+
# 3. Start a session
|
|
137
51
|
claude
|
|
138
52
|
```
|
|
139
53
|
|
|
140
|
-
> ✅
|
|
141
|
-
>
|
|
142
|
-
> ❌ **Generic response?** → Run `pwd` to confirm you're in the `forge-harness` root. If not: `cd ~/forge-harness && claude`
|
|
143
|
-
|
|
144
|
-
From here:
|
|
145
|
-
- **"Connect a project"** → hub scans `../`, lists projects with `.git`, creates `tracks/{project}/` on confirmation
|
|
146
|
-
- **"My projects are in `~/work/`"** → specify a different root
|
|
147
|
-
|
|
148
|
-
---
|
|
149
|
-
|
|
150
|
-
## Governance layer for AI-generated code
|
|
151
|
-
|
|
152
|
-
FH wraps any coding agent (OpenCode, Codex, etc.) as a **post-generation governance layer** — no runtime adapter needed. FH reads files the agent writes; the protocol is the interface.
|
|
54
|
+
> ✅ Claude reads `CLAUDE.md` and asks what project to connect or what task to start.
|
|
55
|
+
> Say **"Connect a project"** → hub scans `../`, finds `.git` directories, creates `tracks/{project}/`.
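
The scan described above can be pictured with a small shell sketch. This is only an illustration under stated assumptions: `list_git_projects` and `create_track` are hypothetical helper names, and the hub itself performs this step through Claude reading `CLAUDE.md`, not through a script like this one.

```bash
# Hypothetical helper: list directories under a scan root that contain a .git entry.
# The scan root defaults to "..", matching the "../" scan described above.
list_git_projects() {
  local scan_root="${1:-..}"
  local dir
  for dir in "$scan_root"/*/; do
    # -e rather than -d: worktrees and submodules store .git as a file
    if [ -e "${dir}.git" ]; then
      basename "$dir"
    fi
  done
}

# Hypothetical helper: create tracks/<project>/ inside the hub for a confirmed project.
create_track() {
  local hub_root="$1" project="$2"
  mkdir -p "$hub_root/tracks/$project"
}
```

For example, `list_git_projects ~/work` would print one project name per line, and each confirmed name could then be passed to `create_track ~/forge-harness <name>`.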
|
|
153
56
|
|
|
57
|
+
**Plugin only (no clone):**
|
|
154
58
|
```bash
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
|
|
158
|
-
# → pipeline-conductor --quick # 4-axis: regression / adversarial / grounding / record
|
|
159
|
-
# → FH_GATE_VERDICT # PASS | PENDING | BLOCKED | ESCALATE
|
|
59
|
+
claude plugin marketplace add https://github.com/chrono-meta/forge-harness.git # once
|
|
60
|
+
claude plugin install -s user fh-meta@forge-harness
|
|
61
|
+
cd ~/projects/{your-project} && claude
|
|
160
62
|
```
|
|
161
63
|
|
|
162
|
-
**Empirical result (2026-05-31)**: Applied to OpenCode's own AI-generated `permission/arity.ts` (163 lines, 6 tests passing, CI green). Governance verdict: PENDING — 2 A-grade findings CI didn't cover (short-token overflow in allowlist, executor tools absent from arity table). Delta attributable to methodology layer, not the model.
|
|
163
|
-
|
|
164
|
-
Full spec: `knowledge/shared/harness-core/fh_integration_contract.md` · Usage: `knowledge/shared/harness-core/fh_opencode_governance_wrapper.md`
|
|
165
|
-
|
|
166
|
-
> **One-line governance gate**: `npx @chrono-meta/fh-gate "src/foo.ts" quick ci` — or `npm install -g @chrono-meta/fh-gate` for repeated use.
|
|
167
|
-
|
|
168
64
|
---
|
|
169
65
|
|
|
170
|
-
##
|
|
66
|
+
## What it is
|
|
171
67
|
|
|
172
|
-
|
|
68
|
+
forge-harness is structured as **two distinct layers**:
|
|
173
69
|
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
| Wave | What was attacked | Result |
|
|
70
|
+
| Layer | Contents | AI compatibility |
|
|
177
71
|
|---|---|---|
|
|
178
|
-
|
|
|
179
|
-
|
|
|
180
|
-
| W6 | Soft review → Hard gate ("no next step until fix complete") | Incomplete TC merge blocked |
|
|
181
|
-
| W7 | P0 ratio inflation → forced re-review above 30% | Priority inflation prevented |
|
|
182
|
-
| W8 | Phantom Claim Guard — unspecified values/button names banned | Fabricated expected results blocked |
|
|
72
|
+
| **Methodology layer** | `tracks/`, `knowledge/`, `SKILL.md` docs, session protocols | Any AI model |
|
|
73
|
+
| **Automation layer** | `.claude/agents/`, hooks, slash commands, `CLAUDE.md` rules | Claude Code only |
|
|
183
74
|
|
|
184
|
-
|
|
75
|
+
The methodology layer is the portable core — persistent hub, accumulating learnings, curating cross-project knowledge. The automation layer makes it frictionless when running Claude Code.
|
|
185
76
|
|
|
186
|
-
> The self-healing loop: steel-quench attacks the prompt → execution catches bugs the review missed → fixes are verified in the same pass.
|
|
187
|
-
|
|
188
|
-
---
|
|
189
|
-
|
|
190
|
-
## Already using it — 33 asset activation check
|
|
191
|
-
|
|
192
|
-
<details>
|
|
193
|
-
<summary>Expand full asset table (33 skills + 5 agents)</summary>
|
|
194
|
-
|
|
195
|
-
Check which of the following are **regularly activating** for you:
|
|
196
|
-
|
|
197
|
-
| Asset | Role | Natural language triggers | Active |
|
|
198
|
-
|---|---|---|:---:|
|
|
199
|
-
| `agent-composer` | Plans optimal agent dispatch | "How should I split this across agents?", "Run in parallel" | □ |
|
|
200
|
-
| `apex-review` | Final quality review from executive perspective | "Will this hold up with decision-makers?" | □ |
|
|
201
|
-
| `verify-bidirectional` | Reverse-verify decisions | "Is that right?", "Double-check this" | □ |
|
|
202
|
-
| `deliberation` *(fh-commons)* | Structured multi-angle argument | "Battle it out", "Review this from multiple angles" | □ |
|
|
203
|
-
| `cross-ecosystem-synergy-detection` | Detect cross-tool synergies | "Are my installed tools working together?" | □ |
|
|
204
|
-
| `plugin-recommender` | Plugin recommendations | "Is there a good tool for this?" | □ |
|
|
205
|
-
| `hub-cc-pr-reviewer` | Automated PR review | "Review this PR", "Is it okay to merge?" | □ |
|
|
206
|
-
| `context-doctor` | Token efficiency + `.claudeignore` | "Session is slow", "Clean up context" | □ |
|
|
207
|
-
| `sim-conductor` | Meta-simulation orchestrator | "External user perspective", "Internal audit" | □ |
|
|
208
|
-
| `steel-quench` | Full-spectrum adversarial verification — attacks output patterns (self-declarations, cushion language, structural flaws) | "Run the quench", "Attack from the root" | □ |
|
|
209
|
-
| `source-grounding-audit` | Source back-tracing — detects Phantom Claims (no source found). Attacks input tracing (where did this come from?) | "Verify the source", "Grounding audit" | □ |
|
|
210
|
-
| `harness-doctor` | Harness structure diagnosis | "Something seems wrong with my Claude setup" | □ |
|
|
211
|
-
| `deep-clarify` | Socratic requirements clarification | "I'm not sure what I need to build", "Clarify this" | □ |
|
|
212
|
-
| `meta-prompt-builder` | Meta prompt design | "Write a prompt for each Wave", "What should I tell the agent?" | □ |
|
|
213
|
-
| `install-doctor` | Diagnose conflicts before/after plugin install | "Is it okay to add this plugin?" | □ |
|
|
214
|
-
| `install-wizard` | Initial environment diagnosis + onboarding | "First-time setup", "Just installed this" | □ |
|
|
215
|
-
| `asset-placement-gate` | New asset belongs in FH or project? | "Should this be shared?", "Hub vs project" | □ |
|
|
216
|
-
| `marketplace-gate` | 5-point fitness gate before listing | "Is it okay to list this?" | □ |
|
|
217
|
-
| `field-harvest` | Back-propagate field patterns to hub | "I could reuse this in other projects" | □ |
|
|
218
|
-
| `hub-persona-auditor` | Pre-publish 4-axis audit | "How will this look to others?" | □ |
|
|
219
|
-
| `fact-checker` | Asset deduplication check | "Isn't there something similar already?" | □ |
|
|
220
|
-
| `persona-innovator` | Naming gap detection + ideation | "What would be a good name for this?" | □ |
|
|
221
|
-
| `contention-layer` | Treat skill conflicts as harvest signals | "These two skills conflict" | □ |
|
|
222
|
-
| `context-bridge-dispatch` | Inject session context cards before parallel dispatch | "Brief the agents first", "Parallel dispatch" | □ |
|
|
223
|
-
| `frontier-digest` | Frontier signals (HN, arXiv) → actionable insights | "AI trend digest", "What's new this week" | □ |
|
|
224
|
-
| `harvest-loop` | End-of-session learning → evolution pipeline | "Harvest the session", "Run the pipeline" | □ |
|
|
225
|
-
| `self-marketing-lint` | Remove self-marketing language from skill descriptions | "Description diet", "Strip the marketing tone" | □ |
|
|
226
|
-
| `pipeline-conductor` | 4-axis quality gate (backward/adversarial/forward/record) | "Run the quality gate", "4-axis check" | □ |
|
|
227
|
-
| `goal-quench` | `/goal` wrapper with token budget gate + pipeline-conductor verification | "Safe goal run", "Goal with budget control" | □ |
|
|
228
|
-
| `edit-manifest` | Predict-verify loop for harness edits | "Log this edit", "Predict what this changes" | □ |
|
|
229
|
-
| `memory-hygiene` | Detect stale memory entries + re-verify live | "Check stale memory", "Memory drift" | □ |
|
|
230
|
-
| `prompt-regression` | Detect behavioral regressions after rule edits | "Did my rule change break anything?" | □ |
|
|
231
|
-
| `convergence-loop` *(fh-commons)* | N-round convergence loops — only "truly passed" after convergence | "Suspicious of single-pass", "Convergence loop" | □ |
|
|
232
|
-
| `token-budget-gate` *(fh-commons)* | Pre-task token cost estimate (GREEN/YELLOW/ORANGE/RED) | "How expensive is this?", "Token budget estimate" | □ |
|
|
233
|
-
| `mcp-circuit-breaker` *(fh-commons)* | Detects MCP tool failure patterns, blocks further calls | "MCP keeps failing", "Tool error loop" | □ |
|
|
234
|
-
| `quench-challenger` *(fh-commons)* | Pressure-tests near-final artifacts from adversarial angles | "Challenge this with a devil", "Quench challenger" | □ |
|
|
235
|
-
|
|
236
|
-
| Count | Diagnosis |
|
|
237
|
-
|:---:|---|
|
|
238
|
-
| **28–36** | Advanced — focus on `agent-composer` + `sim-conductor` + `steel-quench` + `pipeline-conductor` chained |
|
|
239
|
-
| **10–27** | Activation stage — gradually activate unchecked assets |
|
|
240
|
-
| **0–9** | Early stage — go back to self-diagnosis above |
|
|
241
|
-
|
|
242
|
-
</details>
|
|
243
|
-
|
|
244
|
-
---
|
|
245
|
-
|
|
246
|
-
## How it works
|
|
247
|
-
|
|
248
|
-
```
|
|
249
|
-
forge-harness (the brain — persistent hub)
|
|
250
|
-
├── knowledge/ → referenced from all projects
|
|
251
|
-
└── tracks/ → work records per project
|
|
252
|
-
|
|
253
|
-
Project A (the execution site)
|
|
254
|
-
→ connect hub in CLAUDE.md → auto-referenced
|
|
255
|
-
|
|
256
|
-
Project B (the execution site)
|
|
257
|
-
→ connect hub in CLAUDE.md → auto-referenced
|
|
258
77
|
```
|
|
78
|
+
forge-harness/ ← the hub (persistent brain)
|
|
79
|
+
├── knowledge/ → shared across all projects
|
|
80
|
+
└── tracks/ → work records per project
|
|
259
81
|
|
|
260
|
-
|
|
261
|
-
|
|
262
|
-
- **"Hello"** → Claude automatically pulls recent context and today's tasks from the hub *(when running `claude` from the FH cwd)*
|
|
263
|
-
|
|
264
|
-
```
|
|
265
|
-
Search: CATALOG.md (tags + summary) → open that file directly
|
|
266
|
-
Store: End of session → save to tracks/{project}/ → update CATALOG.md
|
|
267
|
-
Return: New pattern found → save to tracks/{project}/learnings/
|
|
268
|
-
Share: Common to 2+ projects → write to knowledge/shared/
|
|
82
|
+
Project A ──→ connect hub in CLAUDE.md
|
|
83
|
+
Project B ──→ connect hub in CLAUDE.md
|
|
269
84
|
```
|
|
270
85
|
|
|
271
86
|
---
|
|
272
87
|
|
|
273
|
-
##
|
|
274
|
-
|
|
275
|
-
| What you want | What to say |
|
|
276
|
-
|---|---|
|
|
277
|
-
| Start a session | "Hello" → reads hub, guides today's tasks |
|
|
278
|
-
| Save session | "Sync this session to forge-harness" |
|
|
279
|
-
| Search past work | "What did I do around April 13th?" |
|
|
280
|
-
| Connect a new project | "Connect a project" |
|
|
281
|
-
| Run adversarial review | "Run the quench on this" |
|
|
282
|
-
| Run end-of-session harvest | "Harvest the session" |
|
|
283
|
-
|
|
284
|
-
---
|
|
285
|
-
|
|
286
|
-
## Agent dispatch
|
|
287
|
-
|
|
288
|
-
forge-harness includes specialized agents and `agent-composer` to plan their optimal combination.
|
|
289
|
-
|
|
290
|
-
```
|
|
291
|
-
/agent-composer
|
|
292
|
-
```
|
|
293
|
-
|
|
294
|
-
Analyzes the current task and proposes which agents to dispatch in what order.
|
|
295
|
-
|
|
296
|
-
### FH agents
|
|
297
|
-
|
|
298
|
-
| Agent | Role | Tool restrictions |
|
|
299
|
-
|---|---|---|
|
|
300
|
-
| `plan` | Read-only design agent — analyzes files, maps impact, plans before implementation | Read·Bash·Glob·Grep only |
|
|
301
|
-
| `fact-checker` | Asset deduplication and staleness check | Read·Grep·Glob |
|
|
302
|
-
| `hub-persona-auditor` | 3+ persona audit of externally published assets | Read·Grep·Glob |
|
|
303
|
-
| `persona-innovator` | Naming exploration + frame proposals | Read·Grep·Glob·WebSearch·WebFetch |
|
|
304
|
-
| `quench-challenger` | Steel-quench adversary — pressure-tests near-final artifacts | Read·Grep·Glob |
|
|
305
|
-
|
|
306
|
-
### Parallel dispatch
|
|
307
|
-
|
|
308
|
-
Request two agents in a single message to run in parallel:
|
|
309
|
-
|
|
310
|
-
```
|
|
311
|
-
"Run fact-checker and persona-innovator in parallel.
|
|
312
|
-
First: check [asset path] for duplicates
|
|
313
|
-
Second: scan current harness for naming gaps"
|
|
314
|
-
```
|
|
315
|
-
|
|
316
|
-
> **Validated**: 6 background agents dispatched in parallel from meta-harness cwd → completed in ~3 minutes (~5× faster than sequential).
|
|
317
|
-
|
|
318
|
-
---
|
|
319
|
-
|
|
320
|
-
## Multi-Model Sidecar (v1.3)
|
|
321
|
-
|
|
322
|
-
Each available AI CLI (Gemini, Codex, `gh copilot`) forms an independent review team alongside Claude. Cross-team synthesis surfaces Claude blind spots — issues external teams catch that single-model review misses. The sidecars act as **peer reviewers**, not primary orchestrators; skill invocation and harness automation remain Claude Code-native.
|
|
323
|
-
|
|
324
|
-
**Coverage tiers (measured on `source-grounding-audit/SKILL.md`):**
|
|
325
|
-
| Tier | Setup | Defects found |
|
|
326
|
-
|---|---|---|
|
|
327
|
-
| **C1** Single Claude persona | Default | 25% |
|
|
328
|
-
| **C2** 3 cross-session Claude personas | No extra tools | 75% |
|
|
329
|
-
| **C3** C2 + external CLI (Gemini/Codex/gh copilot) | External CLI installed | 100% — +3 Claude blind spots |
|
|
330
|
-
|
|
331
|
-
Claude-side token cost: **zero increase** C2→C3. External CLI billed to its own quota.
|
|
332
|
-
|
|
333
|
-
Decision rule: routine → C2, pre-publish → C3+.
|
|
334
|
-
|
|
335
|
-
> **Corporate path**: `gh copilot` as sidecar (GitHub Copilot CLI, separate enterprise license). Requires headless operability — use `gh copilot -- -p "..." --allow-all-tools`. Note: CLI presence ≠ headless capable; verify with `--allow-all-tools` before adding to CI.
|
|
336
|
-
|
|
337
|
-
---
|
|
338
|
-
|
|
339
|
-
## Runtime requirements
|
|
340
|
-
|
|
341
|
-
| Environment | Support | Notes |
|
|
342
|
-
|---|---|---|
|
|
343
|
-
| Claude Code + Anthropic API Key | ✅ Recommended | 200K context · officially supported |
|
|
344
|
-
| claude.ai Pro / Team Plan | ✅ Recommended | 200K context · officially supported |
|
|
345
|
-
| AWS Bedrock (direct API) | ⚠️ Conditional | Possible with sufficient account quota |
|
|
346
|
-
| Bedrock + LiteLLM proxy | ⚠️ Unofficial | Frequent `Input is too long` errors |
|
|
347
|
-
| Internal AI API proxy | ⚠️ Conditional | Depends on `max_input_tokens` config |
|
|
348
|
-
|
|
349
|
-
---
|
|
88
|
+
## Governance layer for AI-generated code
|
|
350
89
|
|
|
351
|
-
|
|
90
|
+
FH wraps any coding agent (OpenCode, Codex, etc.) as a **post-generation governance gate**.
|
|
352
91
|
|
|
353
92
|
```bash
|
|
354
|
-
|
|
355
|
-
|
|
93
|
+
npx --package @chrono-meta/fh-gate fh-gate # default: Claude backend
|
|
94
|
+
FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-gate # Codex backend
|
|
95
|
+
FH_BACKEND=auto npx --package @chrono-meta/fh-gate fh-gate "src/foo.ts" full
|
|
96
|
+
# → FH_GATE_VERDICT: PASS | PENDING | BLOCKED | ESCALATE
|
|
356
97
|
```
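
The verdict line shown in the block above could be consumed by a wrapper script roughly as follows. This is a sketch, not part of the package: `extract_verdict` and `verdict_allows_merge` are hypothetical names, the exact output format of `fh-gate` is assumed to be a line of the form `FH_GATE_VERDICT: <VERDICT>`, and the "only PASS continues" policy is an assumption made for illustration.

```bash
# Hypothetical helper: pull the verdict token out of captured fh-gate output.
# Assumes a line like "FH_GATE_VERDICT: BLOCKED"; the real format may differ.
extract_verdict() {
  printf '%s\n' "$1" \
    | sed -n 's/^.*FH_GATE_VERDICT:[[:space:]]*\([A-Z]*\).*$/\1/p' \
    | tail -n 1
}

# Hypothetical policy: only PASS lets a pipeline step continue.
verdict_allows_merge() {
  case "$1" in
    PASS) return 0 ;;
    PENDING|BLOCKED|ESCALATE) return 1 ;;
    *) return 2 ;;  # unknown or missing verdict
  esac
}
```

A caller might capture the gate output into a variable, run `extract_verdict` on it, and stop the pipeline when `verdict_allows_merge` returns non-zero.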
|
|
357
98
|
|
|
358
|
-
|
|
99
|
+
`fh-gate` uses the same FH governance prompt for both runtimes. `FH_BACKEND=claude` runs `claude --print`; `FH_BACKEND=codex` runs `codex exec`; `FH_BACKEND=auto` prefers Codex when both CLIs are present.
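
The backend rules in the paragraph above can be sketched in shell. This is a minimal illustration, not the code in `scripts/fh-gate.sh`: `have_cli`, `resolve_backend`, and `backend_command` are hypothetical names, and falling back to Claude when `auto` finds only the Claude CLI is an assumption the README does not state.

```bash
# Hypothetical stand-in for however the real script detects an installed CLI.
have_cli() { command -v "$1" >/dev/null 2>&1; }

# Resolve FH_BACKEND to a concrete backend name.
resolve_backend() {
  local requested="${FH_BACKEND:-claude}"   # README: the default is the Claude backend
  case "$requested" in
    claude|codex) echo "$requested" ;;
    auto)
      # README: auto prefers Codex when both CLIs are present.
      # Assumption: fall back to Claude when only the Claude CLI is found.
      if have_cli codex; then echo codex
      elif have_cli claude; then echo claude
      else echo "no backend CLI found" >&2; return 1
      fi ;;
    *) echo "unknown FH_BACKEND: $requested" >&2; return 1 ;;
  esac
}

# Map a backend name to the command the README says it runs.
backend_command() {
  case "$1" in
    claude) echo "claude --print" ;;
    codex)  echo "codex exec" ;;
    *) return 1 ;;
  esac
}
```

Keeping resolution separate from the command mapping means the same governance prompt can be passed to whichever command `backend_command` returns.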
|
|
359
100
|
|
|
360
|
-
|
|
361
|
-
|
|
362
|
-
| Plugin | Skills | Agents |
|
|
363
|
-
|---|---|---|
|
|
364
|
-
| **fh-meta** (v1.3) | 29 skills — agent-composer · apex-review · asset-placement-gate · contention-layer · context-bridge-dispatch · context-doctor · cross-ecosystem-synergy-detection · deep-clarify · edit-manifest · field-harvest · frontier-digest · goal-quench · harness-doctor · harvest-loop · hub-cc-pr-reviewer · install-doctor · install-wizard · marketplace-gate · memory-hygiene · meta-prompt-builder · pipeline-conductor · plugin-recommender · prompt-regression · self-marketing-lint · sim-conductor · source-grounding-audit · steel-quench · verify-bidirectional · and more | 3 (hub-persona-auditor · fact-checker · persona-innovator) |
|
|
365
|
-
| **fh-commons** (v0.2.0) | 4 skills — convergence-loop · deliberation · mcp-circuit-breaker · token-budget-gate | 1 (quench-challenger) |
|
|
366
|
-
|
|
367
|
-
#### Mode C (plugin only — no clone)
|
|
101
|
+
For direct skill or agent execution outside Claude Code, use `fh-run`:
|
|
368
102
|
|
|
369
103
|
```bash
|
|
370
|
-
|
|
371
|
-
|
|
372
|
-
cd ~/projects/{your-project} && claude
|
|
104
|
+
FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-run --skill source-grounding-audit --file docs/foo.md
|
|
105
|
+
FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-run --agent fh-commons:quench-challenger --file plugins/fh-meta/skills/foo/SKILL.md
|
|
373
106
|
```
|
|
374
107
|
|
|
375
|
-
|
|
376
|
-
|---|:---:|:---:|
|
|
377
|
-
| `verify-bidirectional` · `apex-review` | ✅ hub baseline | ⚠️ no `knowledge/` |
|
|
378
|
-
| `cross-ecosystem-synergy-detection` · `plugin-recommender` | ✅ hub cross-ref | ⚠️ your project only |
|
|
379
|
-
| Meta/hub seed accumulation | ✅ `knowledge/shared/` | ❌ |
|
|
380
|
-
|
|
381
|
-
#### Mode D — agent file copy only
|
|
382
|
-
|
|
383
|
-
The lightest entry. Copy a single agent file to use immediately:
|
|
108
|
+
For Codex-primary work, keep using Codex's native goal/session features when available. `fh-goal` is only a portable wrapper for one-off non-interactive runs that should be followed by FH governance:
|
|
384
109
|
|
|
385
110
|
```bash
|
|
386
|
-
|
|
387
|
-
cp <harness-root>/.claude/agents/fact-checker.md <your-project>/.claude/agents/
|
|
111
|
+
FH_BACKEND=codex npx --package @chrono-meta/fh-gate fh-goal --prompt "Implement X and update tests" --gate quick
|
|
388
112
|
```
|
|
389
113
|
|
|
390
|
-
|
|
114
|
+
The broader FH automation layer still depends on Claude Code for sub-agents, hooks, and slash commands. The portable path is shared documents plus runtime adapters, not separate Codex and Claude forks.
|
|
391
115
|
|
|
392
|
-
|
|
393
|
-
cp {FH_ROOT}/templates/local_fh_context.md .claude/rules/local_fh_context.md
|
|
394
|
-
echo ".claude/rules/local_fh_context.md" >> .git/info/exclude
|
|
395
|
-
```
|
|
116
|
+
**Empirical result (2026-05-31)**: The gate was applied to OpenCode's AI-generated `permission/arity.ts` (163 lines, CI green). Current gate semantics classify the file as BLOCKED because of 2 A-grade findings that CI did not catch: a short-token overflow in the allowlist, and executor tools absent from the arity table.
|
|
396
117
|
|
|
397
|
-
|
|
118
|
+
Full spec: [`fh_integration_contract.md`](knowledge/shared/harness-core/fh_integration_contract.md)
|
|
398
119
|
|
|
399
120
|
---
|
|
400
121
|
|
|
401
|
-
##
|
|
402
|
-
|
|
403
|
-
**Native overhead** — measured: new install standalone ≈ 29K tokens (14.5% of 200K). Top 2 heaviest files: `.claude/rules/*.md` (~20K) and `CLAUDE.md` (~8.7K). `context-doctor` diagnoses and recommends keyword-trigger deferral for infrequently-used rules (saves 5–8K).
|
|
122
|
+
## 35 skill files, 5 agents
|
|
404
123
|
|
|
405
|
-
|
|
124
|
+
<details>
|
|
125
|
+
<summary>Full asset activation check</summary>
|
|
406
126
|
|
|
407
|
-
|
|
127
|
+
| Asset | Role | Triggers |
|
|
128
|
+
|---|---|---|
|
|
129
|
+
| `steel-quench` | Full-spectrum adversarial verification | "Run the quench", "Attack from the root" |
|
|
130
|
+
| `source-grounding-audit` | Phantom claim detection + source back-tracing | "Verify the source", "Grounding audit" |
|
|
131
|
+
| `harvest-loop` | End-of-session learning → evolution pipeline | "Harvest the session" |
|
|
132
|
+
| `agent-composer` | Plans optimal agent dispatch | "Run in parallel", "Which agents?" |
|
|
133
|
+
| `sim-conductor` | Meta-simulation orchestrator | "External user perspective" |
|
|
134
|
+
| `context-doctor` | Token efficiency + `.claudeignore` | "Session is slow", "Clean up context" |
|
|
135
|
+
| `harness-doctor` | Harness structure diagnosis | "Check my Claude setup" |
|
|
136
|
+
| `pipeline-conductor` | 4-axis quality gate (backward/adversarial/forward/record) | "Run the quality gate" |
|
|
137
|
+
| `field-harvest` | Back-propagate field patterns to hub | "I could reuse this" |
|
|
138
|
+
| `frontier-digest` | HN + arXiv → actionable insights | "AI trend digest" |
|
|
139
|
+
| `hub-cc-pr-reviewer` | Automated PR review | "Review this PR" |
|
|
140
|
+
| `verify-bidirectional` | Reverse-verify decisions | "Is that right?", "Double-check" |
|
|
141
|
+
| `deep-clarify` | Socratic requirements clarification | "I'm not sure what to build" |
|
|
142
|
+
| `install-wizard` | Initial onboarding | "First-time setup" |
|
|
143
|
+
| `plugin-recommender` | Plugin recommendations | "Is there a good tool for this?" |
|
|
144
|
+
| `apex-review` | Executive-perspective quality review | "Will this hold up?" |
|
|
145
|
+
| `meta-prompt-builder` | Meta prompt design | "Write a prompt for the agent" |
|
|
146
|
+
| `asset-placement-gate` | Hub vs project asset routing | "Should this be shared?" |
|
|
147
|
+
| `cross-ecosystem-synergy-detection` | Cross-tool synergy finder | "Are my tools working together?" |
|
|
148
|
+
| `convergence-loop` *(fh-commons)* | N-round convergence loops | "Single-pass seems suspicious" |
|
|
149
|
+
| `token-budget-gate` *(fh-commons)* | Pre-task token cost estimate | "How expensive is this?" |
|
|
150
|
+
| `mcp-circuit-breaker` *(fh-commons)* | MCP tool failure pattern detection | "MCP keeps failing" |
|
|
151
|
+
| `quench-challenger` *(fh-commons)* | Adversarial pressure-test agent | "Challenge this with a devil" |
|
|
152
|
+
| *(+ additional assets)* | marketplace-gate · contention-layer · edit-manifest · fact-checker · goal-quench · hub-persona-auditor · install-doctor · memory-hygiene · persona-innovator · prompt-regression · public-surface-audit · skill-splitter | |
|
|
153
|
+
|
|
154
|
+
| Active count | Diagnosis |
|
|
155
|
+
|:---:|---|
|
|
156
|
+
| **28+** | Advanced — chain agent-composer + sim-conductor + steel-quench + pipeline-conductor |
|
|
157
|
+
| **10–27** | Activation stage — gradually enable unchecked assets |
|
|
158
|
+
| **0–9** | Early stage — start with `install-wizard` |
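
The thresholds in the "Active count" table can be expressed as a small lookup. This is an illustrative sketch only: `fh_activation_stage` is a hypothetical function name, and how the active count itself is obtained is left to the reader, since the README does not specify it.

```bash
# Hypothetical helper: maps an active-asset count to the diagnosis tiers in the table above.
fh_activation_stage() {
  count="$1"
  # Reject anything that is not a non-negative integer.
  case "$count" in
    ''|*[!0-9]*) echo "invalid"; return 1 ;;
  esac
  if [ "$count" -ge 28 ]; then
    echo "Advanced"
  elif [ "$count" -ge 10 ]; then
    echo "Activation stage"
  else
    echo "Early stage"
  fi
}

fh_activation_stage 12   # prints "Activation stage"
```

The boundaries follow the table directly: 28 and above is Advanced, 10 through 27 is the activation stage, and 0 through 9 is the early stage.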
|
|
408
159
|
|
|
409
|
-
**
|
|
160
|
+
**Find a skill by what you're trying to do:**
|
|
410
161
|
|
|
411
|
-
|
|
162
|
+
| Cluster | Skills |
|
|
163
|
+
|---|---|
|
|
164
|
+
| Verification | `steel-quench` · `source-grounding-audit` · `convergence-loop` · `prompt-regression` · `return-path-gate` |
|
|
165
|
+
| Orchestration | `agent-composer` · `pipeline-conductor` · `goal-quench` · `deliberation` |
|
|
166
|
+
| Diagnosis | `harness-doctor` · `context-doctor` · `install-doctor` · `mcp-circuit-breaker` |
|
|
167
|
+
| Harvesting / Learning | `harvest-loop` · `field-harvest` · `edit-manifest` · `memory-hygiene` |
|
|
168
|
+
| Gate / Guard | `token-budget-gate` · `asset-placement-gate` · `marketplace-gate` |
|
|
169
|
+
| Discovery | `plugin-recommender` · `cross-ecosystem-synergy-detection` · `frontier-digest` · `verify-bidirectional` |
|
|
170
|
+
| Content / Simulation | `sim-conductor` · `apex-review` · `meta-prompt-builder` · `deep-clarify` |
|
|
171
|
+
| Setup | `install-wizard` · `hub-cc-pr-reviewer` · `skill-splitter` |
|
|
412
172
|
|
|
413
|
-
|
|
414
|
-
export FH_DIR="$HOME/path/to/forge-harness"
|
|
415
|
-
source "$FH_DIR/templates/fh_audit_check.zsh"
|
|
416
|
-
```
|
|
173
|
+
</details>
|
|
417
174
|
|
|
418
175
|
---
|
|
419
176
|
|
|
420
|
-
##
|
|
177
|
+
## Model setup
|
|
421
178
|
|
|
422
|
-
|
|
179
|
+
Claude Code does not auto-select models by task complexity — you configure this once.
|
|
423
180
|
|
|
424
|
-
|
|
425
|
-
|
|
426
|
-
|
|
181
|
+
```bash
|
|
182
|
+
/model opusplan # recommended for forge-harness
|
|
183
|
+
```
|
|
427
184
|
|
|
428
|
-
|
|
185
|
+
| Command | Who runs what | Best for |
|
|
186
|
+
|---|---|---|
|
|
187
|
+
| `/model sonnet` | Sonnet handles everything | Fast coding, simple tasks |
|
|
188
|
+
| `/model opus` | Opus handles everything | Complex reasoning, architecture |
|
|
189
|
+
| `/model opusplan` | **Opus plans · Sonnet executes** | FH orchestration + sub-agents |
|
|
429
190
|
|
|
430
|
-
|
|
191
|
+
**Why `opusplan` for FH**: With `opusplan`, CC (Claude Code) switches models per turn by mode: Opus runs plan-mode turns (complex reasoning, decomposition decisions) and Sonnet runs execution turns (tool calls, file edits, bash). forge-harness orchestration leans on both: Opus for design decisions in agent-composer / goal-quench / steel-quench, Sonnet for the actual file edits and sub-agent execution contexts. Sub-agent token costs are CC-visible and appear in the session jsonl under `message.model`.
|
|
431
192
|
|
|
432
|
-
|
|
193
|
+
If you use external CLIs (Gemini, Codex, `gh copilot`) as sidecars, their costs are billed to their own quota and not visible in CC's token display.
|
|
433
194
|
|
|
434
|
-
|
|
435
|
-
|---|---|
|
|
436
|
-
| New generalizable pattern emerges | First discovery of a pattern worth promoting |
|
|
437
|
-
| 3+ accumulated upgrades | Stabilization signal from the same asset evolving |
|
|
438
|
-
| Sister asset absorption | External PR audit gate passed |
|
|
439
|
-
|
|
440
|
-
### Command tower pattern (advanced)
|
|
195
|
+
---
|
|
441
196
|
|
|
442
|
-
|
|
443
|
-
|---|---|
|
|
444
|
-
| Single project coding/debugging | That project's cwd |
|
|
445
|
-
| Meta/audit/simulation | **Meta-harness cwd + Agent** |
|
|
446
|
-
| 2+ projects simultaneously | **Meta-harness cwd + parallel Agent** |
|
|
447
|
-
| field-harvest · PR audit · CATALOG updates | **Meta-harness cwd + Agent** |
|
|
197
|
+
## Multi-Model Sidecar (v1.3)
|
|
448
198
|
|
|
449
|
-
|
|
199
|
+
Run Gemini, Codex, or `gh copilot` as independent peer reviewers alongside Claude.
|
|
450
200
|
|
|
451
|
-
|
|
201
|
+
| Tier | Setup | Defects found |
|
|
202
|
+
|---|---|---|
|
|
203
|
+
| **C1** Single Claude | Default | 25% |
|
|
204
|
+
| **C2** 3 cross-session Claude personas | No extra tools | 75% |
|
|
205
|
+
| **C3** C2 + external CLI | External CLI installed | 100% (+3 Claude blind spots) |
|
|
452
206
|
|
|
453
|
-
|
|
454
|
-
|:---:|---|
|
|
455
|
-
| **L1** | harness-doctor + context-doctor + sim-conductor Area B — isolated third-person evaluation |
|
|
456
|
-
| **L2** | Real user feedback + external PR review — evidence generated outside owner environment |
|
|
457
|
-
| **L3** | steel-quench pre-runs attack angles internally; flaws patched before external devils run |
|
|
458
|
-
| **L4** | Meta-aware adversary — remaining attack surface shrinks per wave |
|
|
207
|
+
Claude-side token cost: **zero increase** from C2 to C3.
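
Since the only setup difference between C2 and C3 in the table is an installed external CLI, a quick PATH check can show which tier a machine is ready for. This is a rough sketch under stated assumptions: `fh_sidecar_tier` is a hypothetical name, and the default binary names `gemini`, `codex`, and `gh` are assumptions about how those CLIs are installed, not settings read from FH.

```bash
# Hypothetical helper: reports the highest sidecar tier this machine appears ready for.
# Arguments are candidate CLI names; the defaults are assumed binary names, not FH config.
fh_sidecar_tier() {
  if [ "$#" -eq 0 ]; then
    set -- gemini codex gh
  fi
  for cli in "$@"; do
    # command -v succeeds when the name resolves to something runnable in this shell.
    if command -v "$cli" >/dev/null 2>&1; then
      echo "C3 (external CLI found: $cli)"
      return 0
    fi
  done
  echo "C2 (no external CLI found)"
}

fh_sidecar_tier
```

Finding `gh` on PATH does not by itself confirm that the `gh copilot` extension is installed, so treat a C3 result based on `gh` alone as a hint rather than a guarantee.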
|
|
459
208
|
|
|
460
209
|
---
|
|
461
210
|
|
|
462
|
-
## Research
|
|
463
|
-
|
|
464
|
-
> **FH v1.0 paper** — published 2026-05-30 on [Zenodo](https://zenodo.org/records/20397566) (DOI: 10.5281/zenodo.20397566) · arXiv submission in review. Documents the 2-layer design, 6-axis framework, 4-agent orchestration, and compounding loop with controlled empirical evidence.
|
|
211
|
+
## Research
|
|
465
212
|
|
|
466
|
-
|
|
467
|
-
|
|
468
|
-
- "[Code as Agent Harness](https://arxiv.org/abs/2605.18747)" (arXiv, May 2026, 43 authors)
|
|
469
|
-
- Stanford IRIS Lab: "[Meta-Harness](https://arxiv.org/abs/2603.28052)" (Lee et al., Mar 2026) — outer-loop harness optimization; +7.7pts at 4× fewer tokens
|
|
213
|
+
> **FH v1.0 paper** — [Zenodo](https://zenodo.org/records/20397566) (DOI: 10.5281/zenodo.20397566) · arXiv in review.
|
|
214
|
+
> Documents 2-layer design, 6-axis framework, 4-agent orchestration, and compounding loop with empirical evidence.
|
|
470
215
|
|
|
471
|
-
|
|
216
|
+
External convergence:
|
|
217
|
+
- ["Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems"](https://arxiv.org/abs/2604.14228) — arXiv April 2026
|
|
218
|
+
- ["Code as Agent Harness"](https://arxiv.org/abs/2605.18747) — arXiv May 2026
|
|
219
|
+
- Stanford IRIS Lab: ["Meta-Harness"](https://arxiv.org/abs/2603.28052) — +7.7pts at 4× fewer tokens
|
|
472
220
|
|
|
473
221
|
---
|
|
474
222
|
|
|
475
223
|
## Learn more
|
|
476
224
|
|
|
477
|
-
|
|
478
|
-
- `AGENTS.md` — Runtime agent specs
|
|
479
|
-
- `CATALOG.md` — Search index
|
|
480
|
-
- `CHEATSHEET.md` — Full command reference
|
|
481
|
-
- `CONTRIBUTING.md` — How to contribute skills and patterns
|
|
482
|
-
- `knowledge/shared/harness-core/fh_integration_contract.md` — Governance layer spec
|
|
483
|
-
|
|
484
|
-
---
|
|
485
|
-
|
|
486
|
-
## Appendix
|
|
487
|
-
|
|
488
|
-
### Directory structure
|
|
489
|
-
|
|
490
|
-
```
|
|
491
|
-
forge-harness/
|
|
492
|
-
├── knowledge/ # Pure knowledge — time-independent, for reference
|
|
493
|
-
│ ├── domain/ # Domain-specific knowledge
|
|
494
|
-
│ └── shared/ # Cross-project patterns
|
|
495
|
-
│
|
|
496
|
-
├── tracks/ # Work records per project — time-accumulated
|
|
497
|
-
│ └── {project_name}/
|
|
498
|
-
│ ├── session_*.md # Session history
|
|
499
|
-
│ └── learnings/ # Accumulated feedback
|
|
500
|
-
│
|
|
501
|
-
├── plugins/ # fh-meta + fh-commons plugins
|
|
502
|
-
├── templates/ # Skeletons to copy for new projects
|
|
503
|
-
├── scripts/ # fh-gate.sh and automation scripts
|
|
504
|
-
├── docs/ # Diagrams and reference assets
|
|
505
|
-
├── CATALOG.md # Full search index
|
|
506
|
-
├── CLAUDE.md # AI operating rules + Sync/Push protocol
|
|
507
|
-
└── CHEATSHEET.md # Command cheat sheet
|
|
508
|
-
```
|
|
509
|
-
|
|
510
|
-
### Key terms
|
|
511
|
-
|
|
512
|
-
| Term | Definition |
|
|
225
|
+
| Resource | Purpose |
|
|
513
226
|
|---|---|
|
|
514
|
-
|
|
|
515
|
-
|
|
|
516
|
-
|
|
|
517
|
-
|
|
|
518
|
-
|
|
|
519
|
-
|
|
|
227
|
+
| [`CLAUDE.md`](CLAUDE.md) | AI operating rules + sync/push protocol |
|
|
228
|
+
| [`CHEATSHEET.md`](CHEATSHEET.md) | Full command reference |
|
|
229
|
+
| [`AGENTS.md`](AGENTS.md) | Runtime agent specs |
|
|
230
|
+
| [`CATALOG.md`](CATALOG.md) | Past work search index |
|
|
231
|
+
| [`CONTRIBUTING.md`](CONTRIBUTING.md) | How to contribute skills and patterns |
|
|
232
|
+
| [`fh_integration_contract.md`](knowledge/shared/harness-core/fh_integration_contract.md) | Governance gate spec |
|