@really-knows-ai/foundry 2.3.0 → 2.3.2
- package/CHANGELOG.md +23 -0
- package/README.md +330 -139
- package/docs/concepts.md +89 -26
- package/docs/getting-started.md +149 -40
- package/docs/work-spec.md +38 -24
- package/package.json +1 -1
- package/skills/add-appraiser/SKILL.md +8 -2
- package/skills/add-artefact-type/SKILL.md +8 -2
- package/skills/add-cycle/SKILL.md +8 -2
- package/skills/add-flow/SKILL.md +8 -2
- package/skills/add-law/SKILL.md +8 -2
- package/skills/flow/SKILL.md +7 -5
- package/skills/forge/SKILL.md +15 -5
package/README.md
CHANGED
|
@@ -1,12 +1,51 @@
|
|
|
1
1
|
# Foundry
|
|
2
2
|
|
|
3
|
-
A skill-driven framework for governed artefact generation
|
|
3
|
+
> A skill-driven framework for governed artefact generation with AI coding tools. Define your own artefact types, laws, and flows — Foundry handles the forge → quench → appraise pipeline with deterministic routing, quality gates, and iterative refinement.
|
|
4
|
+
|
|
5
|
+
[npm package](https://www.npmjs.com/package/@really-knows-ai/foundry)
|
|
6
|
+
[License](LICENSE)
|
|
7
|
+
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
## Table of contents
|
|
11
|
+
|
|
12
|
+
- [Why Foundry?](#why-foundry)
|
|
13
|
+
- [Compatibility](#compatibility)
|
|
14
|
+
- [Installation](#installation)
|
|
15
|
+
- [Quick start](#quick-start)
|
|
16
|
+
- [How it works](#how-it-works)
|
|
17
|
+
- [Core concepts](#core-concepts)
|
|
18
|
+
- [The pipeline in depth](#the-pipeline-in-depth)
|
|
19
|
+
- [Feedback lifecycle](#feedback-lifecycle)
|
|
20
|
+
- [Enforcement model](#enforcement-model)
|
|
21
|
+
- [Multi-model routing](#multi-model-routing)
|
|
22
|
+
- [Skills](#skills)
|
|
23
|
+
- [Custom tools](#custom-tools)
|
|
24
|
+
- [Project layout](#project-layout)
|
|
25
|
+
- [Design decisions](#design-decisions)
|
|
26
|
+
- [Further reading](#further-reading)
|
|
27
|
+
- [License](#license)
|
|
28
|
+
|
|
29
|
+
---
|
|
30
|
+
|
|
31
|
+
## Why Foundry?
|
|
32
|
+
|
|
33
|
+
LLMs are excellent at producing artefacts — code, specs, docs, tests — but they are erratic about *governing* that production. They skip checks, silently ignore feedback, drift from constraints, and forget what stage they're in. Foundry is an opinionated framework that separates **creative work** (handled by LLMs via skills) from **process work** (handled by deterministic tools):
|
|
34
|
+
|
|
35
|
+
- **The pipeline is code, not prose.** Routing, state transitions, commit discipline, and write invariants live inside tested plugin tools. LLMs can't rationalise their way past them.
|
|
36
|
+
- **Every artefact is governed by laws.** Global and per-type pass/fail criteria are evaluated by a panel of independent appraisers before anything is considered done.
|
|
37
|
+
- **Nothing is silent.** Feedback has a full lifecycle (open → actioned/wont-fix → approved/rejected). Wont-fix requires appraiser approval. Validation is non-negotiable.
|
|
38
|
+
- **Writes are enforced.** Each stage is allowed to modify a specific, narrow set of files. Violations halt the cycle.
|
|
39
|
+
- **Humans can step in.** Human-in-the-loop gates can run every iteration or only when LLM appraisers deadlock.
|
|
40
|
+
|
|
41
|
+
---
|
|
4
42
|
|
|
5
43
|
## Compatibility
|
|
6
44
|
|
|
7
|
-
- **OpenCode** — full support
|
|
45
|
+
- **OpenCode** — full support. Multi-model routing via file-based `foundry-*` agents. This is the primary target.
|
|
46
|
+
- **Other skill-aware AI tools** — the skills and tools are portable. Multi-model stage routing is OpenCode-specific today because it relies on `.opencode/agents/` files generated by `refresh-agents`.
|
|
8
47
|
|
|
9
|
-
|
|
48
|
+
---
|
|
10
49
|
|
|
11
50
|
## Installation
|
|
12
51
|
|
|
@@ -15,211 +54,339 @@ Add `@really-knows-ai/foundry` to your OpenCode config:
|
|
|
15
54
|
```json
|
|
16
55
|
// opencode.json
|
|
17
56
|
{
|
|
18
|
-
"
|
|
19
|
-
|
|
20
|
-
}
|
|
57
|
+
"$schema": "https://opencode.ai/config.json",
|
|
58
|
+
"plugin": ["@really-knows-ai/foundry"]
|
|
21
59
|
}
|
|
22
60
|
```
|
|
23
61
|
|
|
62
|
+
---
|
|
63
|
+
|
|
24
64
|
## Quick start
|
|
25
65
|
|
|
26
|
-
1. **Install** the package
|
|
27
|
-
2. **Initialize** —
|
|
28
|
-
3. **Define artefact types** —
|
|
29
|
-
4. **Add laws** —
|
|
30
|
-
5. **Add appraisers** —
|
|
31
|
-
6. **Define cycles** —
|
|
32
|
-
7. **Define
|
|
33
|
-
8. **Run** —
|
|
66
|
+
1. **Install** the package (above).
|
|
67
|
+
2. **Initialize** — run the `init-foundry` skill to scaffold a `foundry/` directory and generate `foundry-*` agent files.
|
|
68
|
+
3. **Define artefact types** — `add-artefact-type` walks you through identity, file patterns, output directory, laws, and optional CLI validation.
|
|
69
|
+
4. **Add laws** — `add-law` creates subjective pass/fail criteria, globally or per-type.
|
|
70
|
+
5. **Add appraisers** — `add-appraiser` creates appraiser personalities with conflict detection.
|
|
71
|
+
6. **Define cycles** — `add-cycle` wires artefact types into a forge/quench/appraise loop with targets and input contracts.
|
|
72
|
+
7. **Define a flow** — `add-flow` groups cycles and declares entry points.
|
|
73
|
+
8. **Run** — invoke the `flow` skill with your goal. It creates a work branch, picks the right cycle, and hands off to `orchestrate`.
|
|
74
|
+
|
|
75
|
+
---
|
|
34
76
|
|
|
35
77
|
## How it works
|
|
36
78
|
|
|
37
79
|
```
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
80
|
+
┌─────────────────────────────┐
|
|
81
|
+
│ Flow (entry points + set) │
|
|
82
|
+
└──────────────┬──────────────┘
|
|
83
|
+
│ starting cycle picked
|
|
84
|
+
▼
|
|
85
|
+
┌────────────────────────────────────────────────────────────────┐
|
|
86
|
+
│ Cycle (outputs exactly one artefact type) │
|
|
87
|
+
│ │
|
|
88
|
+
│ ┌─────────┐ ┌─────────┐ ┌─────────────┐ │
|
|
89
|
+
│ │ forge │ → │ quench │ → │ appraise │ ──┐ │
|
|
90
|
+
│ └─────────┘ └─────────┘ └─────────────┘ │ loop │
|
|
91
|
+
│ ▲ │ until │
|
|
92
|
+
│ └───── unresolved feedback ─────────────────┘ clean │
|
|
93
|
+
│ │
|
|
94
|
+
│ [ optional: human-appraise — every iter or on deadlock ] │
|
|
95
|
+
└──────────────┬─────────────────────────────────────────────────┘
|
|
96
|
+
│ targets (may branch)
|
|
97
|
+
▼
|
|
98
|
+
next cycle → … → done
|
|
50
99
|
```
|
|
51
100
|
|
|
52
|
-
A **
|
|
101
|
+
- A **flow** defines the set of cycles and their entry points.
|
|
102
|
+
- A **cycle** produces exactly one artefact type and declares its own `targets` — Foundry follows a dependency graph, not a linear list.
|
|
103
|
+
- Each cycle loops through **forge → quench → appraise** until there is no unresolved feedback, or an iteration limit is hit.
|
|
104
|
+
- All inter-stage communication goes through **WORK.md** on a dedicated work branch; every stage ends with a micro-commit.
|
|
53
105
|
|
|
54
|
-
|
|
106
|
+
---
|
|
55
107
|
|
|
56
|
-
##
|
|
108
|
+
## Core concepts
|
|
57
109
|
|
|
58
|
-
|
|
110
|
+
### Flow
|
|
59
111
|
|
|
60
|
-
|
|
61
|
-
|----------|-------|
|
|
62
|
-
| **Workfile** | `foundry_workfile_create`, `foundry_workfile_get`, `foundry_workfile_set`, `foundry_workfile_delete` |
|
|
63
|
-
| **Artefacts** | `foundry_artefacts_add`, `foundry_artefacts_list`, `foundry_artefacts_set_status` |
|
|
64
|
-
| **Feedback** | `foundry_feedback_add`, `foundry_feedback_action`, `foundry_feedback_wontfix`, `foundry_feedback_resolve`, `foundry_feedback_list` |
|
|
65
|
-
| **History** | `foundry_history_append`, `foundry_history_list` |
|
|
66
|
-
| **Sort** | `foundry_sort` |
|
|
67
|
-
| **Config** | `foundry_config_cycle`, `foundry_config_artefact_type`, `foundry_config_laws`, `foundry_config_validation`, `foundry_config_appraisers`, `foundry_config_flow` |
|
|
68
|
-
| **Validation** | `foundry_validate_run`, `foundry_appraisers_select` |
|
|
69
|
-
| **Git** | `foundry_git_branch`, `foundry_git_commit` |
|
|
112
|
+
A flow lives in `foundry/flows/`. It declares:
|
|
70
113
|
|
|
71
|
-
|
|
114
|
+
- `starting-cycles` — hints about where the flow can be entered.
|
|
115
|
+
- The set of cycles it contains (routing between them is owned by cycles, not by the flow).
|
|
72
116
|
|
|
73
|
-
|
|
117
|
+
Starting a flow creates a work branch and a fresh `WORK.md`.
|
|
74
118
|
|
|
75
|
-
###
|
|
119
|
+
### Cycle
|
|
76
120
|
|
|
77
|
-
|
|
121
|
+
A cycle lives in `foundry/cycles/`. It declares:
|
|
78
122
|
|
|
79
|
-
|
|
123
|
+
- `output` — the artefact type the cycle produces (read-write).
|
|
124
|
+
- `inputs` — a contract (`any-of` or `all-of`) over artefact types from other cycles. Inputs are discovered on disk by filesystem scan against each input type's file-patterns; they are read-only.
|
|
125
|
+
- `targets` — which cycle(s) may run next after this one completes.
|
|
126
|
+
- `human-appraise` / `deadlock-appraise` / `deadlock-iterations` — human-in-the-loop configuration.
|
|
127
|
+
- `models` — optional per-stage model overrides for multi-model diversity.
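
As an illustration of the `inputs` contract, the sketch below checks an `any-of` or `all-of` contract against files already discovered on disk. The function name, the `{ mode, types }` shape, and the `found` map are assumptions made for this example and are not taken from Foundry's code.

```javascript
// Hypothetical helper: decide whether a cycle's input contract is met.
// `contract` is { mode: 'any-of' | 'all-of', types: string[] } (assumed shape).
// `found` maps each artefact type id to the files discovered for it on disk.
function inputsSatisfied(contract, found) {
  const present = (type) => (found[type] || []).length > 0;
  if (contract.types.length === 0) return true; // no inputs declared
  return contract.mode === 'all-of'
    ? contract.types.every(present)
    : contract.types.some(present);
}

const found = { spec: ['docs/spec.md'], haiku: [] };
console.log(inputsSatisfied({ mode: 'any-of', types: ['spec', 'haiku'] }, found)); // true
console.log(inputsSatisfied({ mode: 'all-of', types: ['spec', 'haiku'] }, found)); // false
```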
|
|
80
128
|
|
|
81
|
-
|
|
82
|
-
- `output` — the artefact type it produces (read-write)
|
|
83
|
-
- `inputs` — artefact types from previous cycles (read-only)
|
|
129
|
+
### Stage
|
|
84
130
|
|
|
85
|
-
|
|
131
|
+
A single step within a cycle. Stages are identified as `base:alias` (e.g. `forge:write-haiku`, `quench:check-syllables`). The base is one of:
|
|
86
132
|
|
|
87
|
-
|
|
88
|
-
- **
|
|
89
|
-
- **
|
|
90
|
-
- **
|
|
133
|
+
- **forge** — produce or revise the artefact.
|
|
134
|
+
- **quench** — run deterministic CLI checks (skipped if the artefact type has no `validation.md`).
|
|
135
|
+
- **appraise** — subjective evaluation by multiple independent appraiser sub-agents.
|
|
136
|
+
- **human-appraise** — human quality gate, either every iteration or only on deadlock.
|
|
91
137
|
|
|
92
|
-
### Artefact
|
|
138
|
+
### Artefact type
|
|
93
139
|
|
|
94
|
-
Defined in `foundry/artefacts/<type
|
|
95
|
-
|
|
96
|
-
- `
|
|
97
|
-
- `
|
|
140
|
+
Defined in `foundry/artefacts/<type>/`:
|
|
141
|
+
|
|
142
|
+
- `definition.md` — id, name, file patterns, output directory, appraiser configuration, prose description.
|
|
143
|
+
- `laws.md` *(optional)* — type-specific subjective criteria.
|
|
144
|
+
- `validation.md` *(optional)* — CLI commands with a `{file}` placeholder; non-zero exit = failure.
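
The `{file}` placeholder and the "non-zero exit = failure" rule can be pictured roughly as follows. This is a minimal sketch, not Foundry's implementation: `runValidation`, the `{ status, output }` result shape, and the injected `run` function are all hypothetical, chosen so the logic can be exercised without spawning real processes.

```javascript
// Hypothetical sketch: substitute {file} into each command template and
// collect every command whose exit status is non-zero.
// `run` is injected (assumed signature: command -> { status, output }).
function runValidation(commands, file, run) {
  const failures = [];
  for (const template of commands) {
    const command = template.split('{file}').join(file);
    const { status, output } = run(command);
    if (status !== 0) failures.push({ command, output });
  }
  return failures;
}

// A fake runner standing in for a real shell call.
const fakeRun = (command) =>
  command.startsWith('lint')
    ? { status: 1, output: 'line too long' }
    : { status: 0, output: '' };

console.log(runValidation(['lint {file}', 'wc -l {file}'], 'haiku/one.md', fakeRun));
```

Each returned failure would correspond to one `#validation` feedback item in the flow described above.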
|
|
98
145
|
|
|
99
146
|
### Laws
|
|
100
147
|
|
|
101
|
-
Subjective pass/fail criteria
|
|
102
|
-
|
|
103
|
-
- `foundry/
|
|
148
|
+
Subjective pass/fail criteria evaluated by appraisers.
|
|
149
|
+
|
|
150
|
+
- `foundry/laws/*.md` — global laws (all files concatenated, apply everywhere).
|
|
151
|
+
- `foundry/artefacts/<type>/laws.md` — type-specific laws.
|
|
104
152
|
|
|
105
|
-
Each law is a `## heading` (
|
|
153
|
+
Each law is a `## heading` (its identifier, referenced in feedback as `#law:<id>`) with a description, passing criteria, and failing criteria.
|
|
106
154
|
|
|
107
155
|
### Appraisers
|
|
108
156
|
|
|
109
|
-
Defined in `foundry/appraisers/`. Each appraiser
|
|
157
|
+
Defined in `foundry/appraisers/`. Each appraiser is a named personality with an optional `model` override. Artefact types pick which appraisers may evaluate them:
|
|
110
158
|
|
|
111
159
|
```yaml
|
|
112
160
|
appraisers:
|
|
113
|
-
count: 3
|
|
114
|
-
allowed: [pedantic, pragmatic]
|
|
161
|
+
count: 3 # how many appraisers (default: 3)
|
|
162
|
+
allowed: [pedantic, pragmatic] # which personalities (default: all)
|
|
115
163
|
```
|
|
116
164
|
|
|
117
|
-
Appraisers are distributed evenly across
|
|
165
|
+
Appraisers are distributed evenly across the allowed set for maximum diversity.
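
The README does not say how the even distribution is computed. One plausible reading is a simple round-robin over the allowed set, sketched below; `selectAppraisers` is a hypothetical name and the real `foundry_appraisers_select` tool may behave differently.

```javascript
// Hypothetical round-robin: fill `count` slots by cycling through `allowed`.
function selectAppraisers(count, allowed) {
  if (allowed.length === 0) return [];
  return Array.from({ length: count }, (_, i) => allowed[i % allowed.length]);
}

console.log(selectAppraisers(3, ['pedantic', 'pragmatic']));
// [ 'pedantic', 'pragmatic', 'pedantic' ]
```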
|
|
118
166
|
|
|
119
167
|
### WORK.md
|
|
120
168
|
|
|
121
|
-
Transient shared state on the work branch.
|
|
122
|
-
|
|
123
|
-
-
|
|
124
|
-
-
|
|
125
|
-
-
|
|
169
|
+
Transient shared state on the work branch. Created when the flow starts, deleted before the branch is squash-merged. It contains:
|
|
170
|
+
|
|
171
|
+
- **Frontmatter** — current position (`flow`, `cycle`, stage list, max iterations, model map, human-appraise config).
|
|
172
|
+
- **Goal** — the prose request that kicked off the flow.
|
|
173
|
+
- **Artefacts** — a table of every file produced by the flow and its status (`draft`, `done`, `blocked`).
|
|
174
|
+
- **Feedback** — grouped by artefact file, every feedback item with its full lifecycle.
|
|
175
|
+
|
|
176
|
+
A sibling file `WORK.history.yaml` is an append-only log of every stage execution. See [docs/work-spec.md](docs/work-spec.md).
|
|
177
|
+
|
|
178
|
+
---
|
|
179
|
+
|
|
180
|
+
## The pipeline in depth
|
|
181
|
+
|
|
182
|
+
### Stages run inside a token-gated lifecycle
|
|
183
|
+
|
|
184
|
+
Every dispatched stage (forge, quench, appraise, human-appraise) runs under a single-use HMAC token:
|
|
185
|
+
|
|
186
|
+
1. The `foundry_orchestrate` tool mints a token and hands it to the sub-agent in the dispatch prompt.
|
|
187
|
+
2. The sub-agent's **first** call must be `foundry_stage_begin({stage, cycle, token})`. The token is redeemed; mutation tools now check that the active stage matches.
|
|
188
|
+
3. The sub-agent does its work (reads WORK.md, writes artefact files / feedback, etc.).
|
|
189
|
+
4. The sub-agent's **last** call is `foundry_stage_end({summary})`.
|
|
190
|
+
5. The orchestrator then calls `foundry_stage_finalize`, which:
|
|
191
|
+
- Scans the git diff against the stage's allowed file-patterns.
|
|
192
|
+
- Registers any new files matching the output artefact type as `draft` artefacts.
|
|
193
|
+
- Returns `{error: 'unexpected_files'}` if the stage wrote anywhere it shouldn't have.
|
|
194
|
+
6. The stage's changes are committed (`foundry_git_commit` internally) and routing advances.
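
To make the token steps concrete, here is a minimal sketch of minting and redeeming a single-use HMAC token with Node's built-in `crypto` module. The token layout (base64url payload, a dot, then the signature), the `nonce` field, and the in-memory `redeemed` set are assumptions for illustration. The README does not describe Foundry's actual token format or storage.

```javascript
const crypto = require('node:crypto');

// Assumed token layout: base64url(JSON payload) + '.' + HMAC-SHA256 signature.
function mintToken(secret, stage, cycle) {
  const payload = Buffer.from(
    JSON.stringify({ stage, cycle, nonce: crypto.randomUUID() })
  ).toString('base64url');
  const sig = crypto.createHmac('sha256', secret).update(payload).digest('base64url');
  return `${payload}.${sig}`;
}

// Checks the signature, the stage/cycle binding, and single use.
// `redeemed` is a Set of nonces that have already been spent.
function redeemToken(secret, token, stage, cycle, redeemed) {
  const [payload, sig] = token.split('.');
  if (!payload || !sig) return false;
  const expected = crypto.createHmac('sha256', secret).update(payload).digest();
  const given = Buffer.from(sig, 'base64url');
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return false; // forged or signed with a different key
  }
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
  if (claims.stage !== stage || claims.cycle !== cycle) return false; // cross-stage reuse
  if (redeemed.has(claims.nonce)) return false; // replay
  redeemed.add(claims.nonce);
  return true;
}

const secret = crypto.randomBytes(32);
const redeemed = new Set();
const token = mintToken(secret, 'forge:write-haiku', 'haiku');
console.log(redeemToken(secret, token, 'forge:write-haiku', 'haiku', redeemed)); // true
console.log(redeemToken(secret, token, 'forge:write-haiku', 'haiku', redeemed)); // false
```

The second call fails because the nonce has already been spent, which is the "single-use" property described in step 2.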
|
|
195
|
+
|
|
196
|
+
Per-stage write rules:
|
|
197
|
+
|
|
198
|
+
| Stage | May write |
|
|
199
|
+
|-------|-----------|
|
|
200
|
+
| `forge` | Files matching the output artefact type's `file-patterns`, plus `WORK.md` / `WORK.history.yaml` |
|
|
201
|
+
| `quench` | `WORK.md` / `WORK.history.yaml` only (feedback) |
|
|
202
|
+
| `appraise` | `WORK.md` / `WORK.history.yaml` only (feedback) |
|
|
203
|
+
| `human-appraise` | `WORK.md` / `WORK.history.yaml` only (feedback) |
|
|
204
|
+
|
|
205
|
+
Input artefacts are read-only. Files outside any artefact type's patterns are read-only. Violations hard-stop the cycle.
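
The write rules in the table can be sketched as a check over the list of changed files. This is illustrative only: `checkWrites` and `globToRegExp` are hypothetical helpers, the glob support is deliberately minimal (`*` stays within a directory, `**` crosses directories), and the `files` and `ok` fields are assumptions. Only the `{error: 'unexpected_files'}` value comes from the text above.

```javascript
// Minimal glob-to-regex conversion for illustration; not a full glob engine.
function globToRegExp(glob) {
  const source = glob.replace(/\*\*|\*|[.+?^${}()|[\]\\]/g, (m) =>
    m === '**' ? '.*' : m === '*' ? '[^/]*' : `\\${m}`
  );
  return new RegExp(`^${source}$`);
}

// Every stage may touch these two files.
const ALWAYS_ALLOWED = ['WORK.md', 'WORK.history.yaml'];

// Only forge may additionally write files matching the output type's patterns.
function checkWrites(stageBase, changedFiles, outputPatterns) {
  const patterns = stageBase === 'forge' ? outputPatterns.map(globToRegExp) : [];
  const unexpected = changedFiles.filter(
    (file) => !ALWAYS_ALLOWED.includes(file) && !patterns.some((re) => re.test(file))
  );
  return unexpected.length > 0
    ? { error: 'unexpected_files', files: unexpected }
    : { ok: true };
}

console.log(checkWrites('forge', ['haiku/one.md', 'WORK.md'], ['haiku/*.md']));
console.log(checkWrites('quench', ['haiku/one.md', 'WORK.md'], ['haiku/*.md']));
```

In this sketch the first call passes and the second reports `haiku/one.md` as unexpected, because quench may only write the two work files.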
|
|
206
|
+
|
|
207
|
+
### Deterministic orchestration
|
|
208
|
+
|
|
209
|
+
The `orchestrate` skill is a thin driver around a short loop:
|
|
210
|
+
|
|
211
|
+
```text
|
|
212
|
+
call foundry_orchestrate({lastResult})
|
|
213
|
+
switch on action:
|
|
214
|
+
dispatch → task tool (subagent) → report back
|
|
215
|
+
human_appraise → run human-appraise inline → report back
|
|
216
|
+
done / blocked / violation → terminate the loop
|
|
217
|
+
```
|
|
218
|
+
|
|
219
|
+
`foundry_orchestrate` owns sort routing, history, commits, finalize, deadlock detection, and violation handling. Because the protocol lives in a plugin tool, the LLM can't skip steps, reorder them, or silently drop a commit.
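
The loop can be rendered in JavaScript roughly as below. The `tools` object and its three functions are stand-ins invented for this sketch. In practice the skill is prose executed by the LLM, and only the `action` values (`dispatch`, `human_appraise`, `done`, `blocked`, `violation`) and the `lastResult` argument come from the text above.

```javascript
// Hypothetical driver: `tools.orchestrate` stands in for foundry_orchestrate,
// `tools.dispatch` for the sub-agent task call, and `tools.humanAppraise`
// for running the human gate inline.
async function drive(tools) {
  let lastResult = null;
  for (;;) {
    const step = await tools.orchestrate({ lastResult });
    switch (step.action) {
      case 'dispatch':
        lastResult = await tools.dispatch(step);
        break;
      case 'human_appraise':
        lastResult = await tools.humanAppraise(step);
        break;
      default: // done / blocked / violation
        return step.action;
    }
  }
}

// Stubbed run: one dispatched stage, then a terminal state.
const script = [{ action: 'dispatch' }, { action: 'done' }];
drive({
  orchestrate: async () => script.shift(),
  dispatch: async () => ({ ok: true }),
  humanAppraise: async () => ({ ok: true }),
}).then((finalAction) => console.log(finalAction)); // done
```

The driver holds no routing logic of its own. Every decision about what runs next comes back from the tool call.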
|
|
220
|
+
|
|
221
|
+
---
|
|
126
222
|
|
|
127
|
-
|
|
223
|
+
## Feedback lifecycle
|
|
224
|
+
|
|
225
|
+
Feedback is recorded as markdown checklist items under each artefact in WORK.md, each tagged to indicate its source.
|
|
128
226
|
|
|
129
227
|
```
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
228
|
+
- [ ] issue #tag → open — needs forge action
|
|
229
|
+
- [x] issue #tag → actioned — needs appraise approval
|
|
230
|
+
- [~] issue #tag | wont-fix: <reason> → wont-fix — needs appraise approval
|
|
231
|
+
- [x] issue #tag | approved → resolved
|
|
232
|
+
- [~] issue #tag | wont-fix: <reason> | approved → resolved
|
|
233
|
+
- [x] issue #tag | rejected: <reason> → re-opened
|
|
234
|
+
- [~] issue #tag | wont-fix: <reason> | rejected → re-opened
|
|
137
235
|
```
|
|
138
236
|
|
|
139
|
-
|
|
237
|
+
Tags:
|
|
238
|
+
|
|
239
|
+
| Tag | Source | Notes |
|
|
240
|
+
|-----|--------|-------|
|
|
241
|
+
| `#validation` | quench (CLI command failed) | Cannot be wont-fixed. Deterministic rules are not negotiable. |
|
|
242
|
+
| `#law:<id>` | appraise (subjective law) | May be wont-fixed with justification; an appraiser must approve. |
|
|
243
|
+
| `#human` | human-appraise | Takes absolute priority. Forge MUST address it — cannot wont-fix. |
|
|
244
|
+
|
|
245
|
+
Feedback is append-only: items are never deleted, only resolved. Re-opened items show their full history.
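
Because the lifecycle is encoded inline, a checklist line can be classified with a few string operations. The parser below is a sketch under assumptions: it expects the exact `- [ ]` / `- [x]` / `- [~]` prefixes shown above, assumes the issue text contains no `|` character, and uses hypothetical function names. It is not the logic in `scripts/lib/feedback.js`.

```javascript
// Classify one feedback line into its lifecycle state.
function feedbackState(line) {
  const match = /^- \[([ x~])\] (.*)$/.exec(line.trim());
  if (!match) return null; // not a feedback item
  const [, box, rest] = match;
  const segments = rest.split('|').map((s) => s.trim());
  const last = segments[segments.length - 1];
  if (segments.length > 1 && last === 'approved') return 'resolved';
  if (segments.length > 1 && last.startsWith('rejected')) return 're-opened';
  if (box === ' ') return 'open';
  return box === 'x' ? 'actioned' : 'wont-fix';
}

// Extract the source tag: validation, human, or law:<id>.
function feedbackTag(line) {
  const match = /#(validation|human|law:[\w-]+)/.exec(line);
  return match ? match[1] : null;
}

// Per the tag table, only law feedback may be declined.
const canWontFix = (line) => (feedbackTag(line) || '').startsWith('law:');

console.log(feedbackState('- [~] too terse #law:clarity | wont-fix: intentional')); // wont-fix
console.log(feedbackState('- [x] too terse #law:clarity | rejected: still unclear')); // re-opened
console.log(canWontFix('- [ ] exit code 1 #validation')); // false
```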
|
|
246
|
+
|
|
247
|
+
### Deadlock handling
|
|
248
|
+
|
|
249
|
+
If forge and appraise ping-pong on the same items for `deadlock-iterations` (default 5) iterations, and the cycle has `deadlock-appraise: true` (default), the router inserts a `human-appraise` stage. If `deadlock-appraise: false`, the cycle is marked `blocked` and control returns to the human.
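
The deadlock rule reduces to a small decision function. The sketch below assumes the router already knows how many consecutive iterations have stalled on the same items, and that the threshold is inclusive. Both points, and the function name, are assumptions, while the option names and defaults come from the paragraph above.

```javascript
// Hypothetical decision: continue looping, escalate to a human, or block.
function onDeadlockCheck(stalledIterations, cycle) {
  const limit = cycle['deadlock-iterations'] ?? 5;     // default 5
  const escalate = cycle['deadlock-appraise'] ?? true; // default true
  if (stalledIterations < limit) return 'continue';
  return escalate ? 'human-appraise' : 'blocked';
}

console.log(onDeadlockCheck(4, {}));                             // continue
console.log(onDeadlockCheck(5, {}));                             // human-appraise
console.log(onDeadlockCheck(5, { 'deadlock-appraise': false })); // blocked
```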
|
|
250
|
+
|
|
251
|
+
---
|
|
252
|
+
|
|
253
|
+
## Enforcement model
|
|
254
|
+
|
|
255
|
+
Foundry is designed around "trust the tool, not the LLM". The following guarantees are enforced in plugin code, not prose:
|
|
256
|
+
|
|
257
|
+
- **Stage-locked mutations.** `foundry_feedback_*`, `foundry_artefacts_*`, and `foundry_workfile_*` tools require the caller's role to match the active stage. A forge sub-agent cannot add feedback; a quench sub-agent cannot register artefacts.
|
|
258
|
+
- **Single-use tokens.** `foundry_stage_begin` verifies an HMAC token minted at dispatch time. Replays, forgery, and cross-stage reuse all fail closed. Keys live in `.foundry/.secret` (mode 0600, gitignored, one per worktree).
|
|
259
|
+
- **Commit-per-stage contract.** `foundry_orchestrate` refuses to proceed if, at the start of a sort call, history is non-empty and there are uncommitted changes to `WORK.md`, `WORK.history.yaml`, or anything under `.foundry/`.
|
|
260
|
+
- **Write invariants.** `foundry_stage_finalize` scans the git diff and rejects stray writes with `{error: 'unexpected_files'}`.
|
|
261
|
+
- **Feedback state machine.** Only legal transitions are accepted: `approved` is terminal; quench cannot approve/reject a `wont-fix`; validation cannot be wont-fixed.
|
|
262
|
+
- **Artefact-type glob uniqueness.** `add-artefact-type` refuses to create a type whose file patterns overlap with an existing type; the enforcer can't determine file ownership otherwise.
|
|
263
|
+
|
|
264
|
+
---
|
|
140
265
|
|
|
141
|
-
|
|
266
|
+
## Multi-model routing
|
|
142
267
|
|
|
143
|
-
|
|
144
|
-
- After forge: only output artefact file patterns + WORK.md + WORK.history.yaml (input artefacts are read-only — violation if touched)
|
|
145
|
-
- After quench/appraise: only WORK.md + WORK.history.yaml
|
|
146
|
-
- Violations are hard stops
|
|
268
|
+
Different stages can run on different models for genuine cognitive diversity (mitigating shared blind spots):
|
|
147
269
|
|
|
148
|
-
|
|
270
|
+
- Cycle definitions can declare a `models` map, e.g. `models: { forge: anthropic/claude-opus-4.7, appraise: openai/gpt-5 }`.
|
|
271
|
+
- Individual appraisers can override the cycle-level appraise model via a `model` field in their personality definition.
|
|
272
|
+
- `refresh-agents` generates a `foundry-<provider>-<model>.md` agent file in `.opencode/agents/` for every model available in the session. `orchestrate` picks the matching agent when dispatching.
|
|
273
|
+
|
|
274
|
+
Resolution order for a given stage: **appraiser `model`** → **cycle `models.<stage>`** → **session default**.
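
The resolution order can be written as a short fallback chain. This is a sketch only: `resolveModel` and its argument shape are hypothetical, and the model strings are borrowed from the example above.

```javascript
// Appraiser `model` wins, then the cycle's `models.<stage>`, then the session default.
function resolveModel({ stage, cycle, appraiser, sessionDefault }) {
  if (stage === 'appraise' && appraiser && appraiser.model) return appraiser.model;
  const models = (cycle && cycle.models) || {};
  return models[stage] || sessionDefault;
}

const cycle = { models: { forge: 'anthropic/claude-opus-4.7', appraise: 'openai/gpt-5' } };
console.log(resolveModel({ stage: 'forge', cycle, sessionDefault: 'session/default' }));
// anthropic/claude-opus-4.7
console.log(resolveModel({ stage: 'quench', cycle, sessionDefault: 'session/default' }));
// session/default
```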
|
|
275
|
+
|
|
276
|
+
Run `list-agents` to see what's available.
|
|
277
|
+
|
|
278
|
+
---
|
|
149
279
|
|
|
150
280
|
## Skills
|
|
151
281
|
|
|
152
|
-
|
|
282
|
+
Foundry is a collection of skills. Skills are either **atomic** (do one thing) or **composite** (orchestrate other skills).
|
|
153
283
|
|
|
154
|
-
### Pipeline
|
|
284
|
+
### Pipeline
|
|
155
285
|
|
|
156
286
|
| Skill | Type | Purpose |
|
|
157
287
|
|-------|------|---------|
|
|
158
|
-
| `
|
|
159
|
-
| `
|
|
160
|
-
| `
|
|
161
|
-
| `
|
|
162
|
-
| `
|
|
288
|
+
| `flow` | composite | Entry point. Picks a starting cycle, creates the work branch, invokes `orchestrate`, follows `targets` between cycles. |
|
|
289
|
+
| `orchestrate` | atomic | Thin driver around `foundry_orchestrate`. Dispatches sub-agents, runs human-appraise inline, reports terminal states. |
|
|
290
|
+
| `forge` | atomic | Produce or revise the artefact. Discovers inputs by filesystem scan. |
|
|
291
|
+
| `quench` | atomic | Run the artefact type's CLI validation commands; write `#validation` feedback. |
|
|
292
|
+
| `appraise` | atomic | Dispatch the selected appraiser personalities as parallel sub-agents; consolidate `#law:<id>` feedback (union + dedup). |
|
|
293
|
+
| `human-appraise` | atomic | Human quality gate. Presents the artefact, collects `#human` feedback. |
|
|
163
294
|
|
|
164
|
-
###
|
|
295
|
+
### Authoring
|
|
165
296
|
|
|
166
297
|
| Skill | Purpose |
|
|
167
298
|
|-------|---------|
|
|
168
|
-
| `init-foundry` | Scaffold the `foundry/` directory
|
|
169
|
-
| `add-artefact-type` | Create a new artefact type with conflict and glob-overlap checks |
|
|
170
|
-
| `add-law` | Create a new law with conflict detection |
|
|
171
|
-
| `add-appraiser` | Create
|
|
172
|
-
| `add-cycle` | Create a
|
|
173
|
-
| `add-flow` | Create a
|
|
299
|
+
| `init-foundry` | Scaffold the `foundry/` directory and generate agent files. |
|
|
300
|
+
| `add-artefact-type` | Create a new artefact type, with conflict and glob-overlap checks. |
|
|
301
|
+
| `add-law` | Create a new law with conflict detection. |
|
|
302
|
+
| `add-appraiser` | Create an appraiser personality with semantic-overlap checks. |
|
|
303
|
+
| `add-cycle` | Create a cycle, validate its targets and input contract against the flow. |
|
|
304
|
+
| `add-flow` | Create a flow definition with cycle-graph reachability checks. |
|
|
174
305
|
|
|
175
|
-
### Utility
|
|
306
|
+
### Utility
|
|
176
307
|
|
|
177
308
|
| Skill | Purpose |
|
|
178
309
|
|-------|---------|
|
|
179
|
-
| `
|
|
180
|
-
| `
|
|
310
|
+
| `list-agents` | List available `foundry-*` sub-agents (for multi-model routing). |
|
|
311
|
+
| `refresh-agents` | Regenerate `foundry-*` agent files from the currently available models. |
|
|
312
|
+
| `upgrade-foundry` | Analyse and migrate `foundry/` config to the current version. |
|
|
181
313
|
|
|
182
|
-
All
|
|
314
|
+
All authoring skills are interactive and conflict-aware — they explain what they're about to write and ask before writing.
|
|
183
315
|
|
|
184
|
-
|
|
316
|
+
---
|
|
317
|
+
|
|
318
|
+
## Custom tools
|
|
319
|
+
|
|
320
|
+
The plugin registers **24 custom tools**. Skills call these rather than manipulating files directly, which keeps format-parsing and state transitions out of LLM hands.
|
|
321
|
+
|
|
322
|
+
| Category | Tools |
|
|
323
|
+
|----------|-------|
|
|
324
|
+
| **Orchestration** | `foundry_orchestrate` |
|
|
325
|
+
| **Stage lifecycle** | `foundry_stage_begin`, `foundry_stage_end` |
|
|
326
|
+
| **Workfile** | `foundry_workfile_create`, `foundry_workfile_get`, `foundry_workfile_delete` |
|
|
327
|
+
| **Artefacts** | `foundry_artefacts_set_status`, `foundry_artefacts_list` |
|
|
328
|
+
| **Feedback** | `foundry_feedback_add`, `foundry_feedback_action`, `foundry_feedback_wontfix`, `foundry_feedback_resolve`, `foundry_feedback_list` |
|
|
329
|
+
| **History** | `foundry_history_list` |
|
|
330
|
+
| **Config** | `foundry_config_cycle`, `foundry_config_artefact_type`, `foundry_config_laws`, `foundry_config_validation`, `foundry_config_appraisers`, `foundry_config_flow` |
|
|
331
|
+
| **Validation** | `foundry_validate_run`, `foundry_appraisers_select` |
|
|
332
|
+
| **Git** | `foundry_git_branch`, `foundry_git_finish` |
|
|
333
|
+
|
|
334
|
+
A handful of internal tools (`foundry_sort`, `foundry_history_append`, `foundry_stage_finalize`, `foundry_git_commit`, `foundry_workfile_set`, `foundry_workfile_configure_from_cycle`) are intentionally *not* registered — they exist only inside `foundry_orchestrate` so they cannot be called out of band.
|
|
335
|
+
|
|
336
|
+
Tools are backed by shared modules in `scripts/lib/` with injectable I/O for testability (see `tests/`).
|
|
337
|
+
|
|
338
|
+
---
|
|
339
|
+
|
|
340
|
+
## Project layout
|
|
341
|
+
|
|
342
|
+
### Package (this repo)
|
|
185
343
|
|
|
186
344
|
```
|
|
187
345
|
@really-knows-ai/foundry
|
|
188
346
|
├── .opencode/
|
|
189
347
|
│ └── plugins/
|
|
190
|
-
│ └── foundry.js #
|
|
191
|
-
├── skills/ # skill definitions
|
|
348
|
+
│ └── foundry.js # plugin: skills + 24 custom tools
|
|
349
|
+
├── skills/ # skill definitions
|
|
350
|
+
│ ├── flow/ # pipeline
|
|
351
|
+
│ ├── orchestrate/
|
|
192
352
|
│ ├── forge/
|
|
193
353
|
│ ├── quench/
|
|
194
354
|
│ ├── appraise/
|
|
195
|
-
│ ├──
|
|
196
|
-
│ ├──
|
|
197
|
-
│ ├── init-foundry/
|
|
355
|
+
│ ├── human-appraise/
|
|
356
|
+
│ ├── init-foundry/ # authoring
|
|
198
357
|
│ ├── add-artefact-type/
|
|
199
358
|
│ ├── add-law/
|
|
200
359
|
│ ├── add-appraiser/
|
|
201
360
|
│ ├── add-cycle/
|
|
202
361
|
│ ├── add-flow/
|
|
203
|
-
│ ├──
|
|
204
|
-
│
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
│
|
|
208
|
-
│ │ ├──
|
|
209
|
-
│ │ ├──
|
|
210
|
-
│ │ ├──
|
|
362
|
+
│ ├── list-agents/ # utility
|
|
363
|
+
│ ├── refresh-agents/
|
|
364
|
+
│ └── upgrade-foundry/
|
|
365
|
+
├── scripts/
|
|
366
|
+
│ ├── lib/ # shared libraries (injectable I/O)
|
|
367
|
+
│ │ ├── workfile.js # WORK.md frontmatter
|
|
368
|
+
│ │ ├── artefacts.js # artefact table ops
|
|
369
|
+
│ │ ├── history.js # WORK.history.yaml ops
|
|
370
|
+
│ │ ├── feedback.js # feedback lifecycle
|
|
371
|
+
│ │ ├── feedback-transitions.js
|
|
372
|
+
│ │ ├── finalize.js # stage_finalize implementation
|
|
373
|
+
│ │ ├── stage-guard.js # stage-lock preconditions
|
|
374
|
+
│ │ ├── token.js # HMAC token mint/verify
|
|
375
|
+
│ │ ├── secret.js # .foundry/.secret handling
|
|
376
|
+
│ │ ├── pending.js # active-stage state
|
|
377
|
+
│ │ ├── state.js # .foundry state dir
|
|
211
378
|
│ │ ├── config.js # foundry/ config readers
|
|
212
|
-
│ │
|
|
213
|
-
│ └──
|
|
214
|
-
├──
|
|
215
|
-
|
|
216
|
-
├──
|
|
379
|
+
│ │ ├── tags.js # feedback tag extraction
|
|
380
|
+
│ │ └── slug.js
|
|
381
|
+
│ ├── orchestrate.js # orchestration loop (exports runOrchestrate)
|
|
382
|
+
│ └── sort.js # routing engine (exports runSort)
|
|
383
|
+
├── tests/ # node:test suite
|
|
384
|
+
├── docs/ # concepts, getting-started, work-spec
|
|
385
|
+
├── CHANGELOG.md
|
|
217
386
|
└── README.md
|
|
218
387
|
```
|
|
219
388
|
|
|
220
|
-
|
|
221
|
-
|
|
222
|
-
After running `init-foundry`, your project gets a `foundry/` directory:
|
|
389
|
+
### User project (after `init-foundry`)
|
|
223
390
|
|
|
224
391
|
```
|
|
225
392
|
your-project/
|
|
@@ -229,47 +396,71 @@ your-project/
|
|
|
229
396
|
│ ├── artefacts/ # artefact type definitions
|
|
230
397
|
│ │ └── <type>/
|
|
231
398
|
│ │ ├── definition.md
|
|
232
|
-
│ │ ├── laws.md #
|
|
233
|
-
│ │ └── validation.md #
|
|
399
|
+
│ │ ├── laws.md # optional
|
|
400
|
+
│ │ └── validation.md # optional
|
|
234
401
|
│ ├── laws/ # global laws
|
|
235
402
|
│ └── appraisers/ # appraiser personalities
|
|
403
|
+
├── .foundry/ # runtime state (gitignored)
|
|
404
|
+
│ └── .secret # per-worktree HMAC key (mode 0600)
|
|
405
|
+
├── .opencode/
|
|
406
|
+
│ └── agents/
|
|
407
|
+
│ └── foundry-*.md # generated by refresh-agents
|
|
236
408
|
├── opencode.json
|
|
237
409
|
└── ...
|
|
238
410
|
```
|
|
239
411
|
|
|
412
|
+
During a flow, a work branch also contains `WORK.md` and `WORK.history.yaml` at the repo root. Both are ephemeral — delete them before squash-merging.
|
|
413
|
+
|
|
414
|
+
---
|
|
415
|
+
|
|
240
416
|
## Design decisions
|
|
241
417
|
|
|
242
418
|
### Everything is markdown
|
|
243
419
|
|
|
244
|
-
|
|
420
|
+
Flows, cycles, artefact types, laws, appraiser personalities, skills — all markdown with YAML frontmatter. Readable by humans, consumable by LLMs, diff-able in git. No bespoke formats, no databases.
|
|
245
421
|
|
|
246
422
|
### Skills are the pipeline, tools are the machinery
|
|
247
423
|
|
|
248
|
-
Composition happens
|
|
424
|
+
Composition happens at the skill layer. `flow` reads a definition and invokes `orchestrate`. `orchestrate` calls `foundry_orchestrate` in a loop. The hard guarantees — routing, commits, state transitions, enforcement — live inside the plugin's custom tools and the libraries under `scripts/lib/`. Skills handle creative and subjective work; tools handle everything else.
|
|
249
425
|
|
|
250
426
|
### WORK.md as shared state
|
|
251
427
|
|
|
252
|
-
All communication
|
|
428
|
+
All inter-stage communication goes through WORK.md via the `foundry_workfile_*`, `foundry_artefacts_*`, `foundry_feedback_*`, and `foundry_history_*` tools. No stage passes output directly to another. This gives a complete audit trail, makes flows resumable after a crash, and lets any stage be re-run independently.
|
|
429
|
+
|
|
430
|
+
### Cycles own their routing
|
|
431
|
+
|
|
432
|
+
A flow declares starting points; individual cycles declare `targets` and input contracts. The flow skill walks the resulting graph. This keeps cycles composable across flows and prevents the flow file from becoming a procedural monolith.
|
|
253
433
|
|
|
254
|
-
### Feedback as
|
|
434
|
+
### Feedback as checklists
|
|
255
435
|
|
|
256
|
-
|
|
436
|
+
Markdown checkboxes with `#validation`, `#law:<id>`, or `#human` tags. Human-readable, trivially parseable, lifecycle encoded inline. Feedback is append-only; history is part of the artefact's story.
|
|
257
437
|
|
|
258
|
-
### Wont-fix requires
|
|
438
|
+
### Wont-fix requires approval
|
|
259
439
|
|
|
260
|
-
|
|
440
|
+
A forge sub-agent can decline subjective feedback with a justification, but an appraiser must approve or reject that decision on the next iteration. Validation and human feedback cannot be wont-fixed.
|
|
261
441
|
|
|
262
|
-
### Multi-model
|
|
442
|
+
### Multi-model diversity
|
|
263
443
|
|
|
264
|
-
Cycle definitions specify
|
|
444
|
+
Cycle definitions specify per-stage models; individual appraisers may override. Different models catch different issues; consolidation is a union. One appraiser flagging an issue is enough to raise it.
|
|
265
445
|
|
|
266
446
|
### Input artefacts are read-only
|
|
267
447
|
|
|
268
|
-
When a cycle reads from
|
|
448
|
+
When a cycle reads from another cycle's output, those files cannot be modified. The rule is enforced by the diff checks in `stage_finalize` and `sort`, so downstream cycles cannot corrupt upstream work.
|
|
269
449
|
|
|
270
450
|
### Glob patterns must not overlap
|
|
271
451
|
|
|
272
|
-
Two artefact types cannot have file patterns that match the same files.
|
|
452
|
+
Two artefact types cannot have file patterns that match the same files. Overlaps are hard-blocked at creation time, because otherwise the enforcer could not determine which type owns a file.
|
|
453
|
+
|
|
454
|
+
---
|
|
455
|
+
|
|
456
|
+
## Further reading
|
|
457
|
+
|
|
458
|
+
- [docs/concepts.md](docs/concepts.md) — every concept defined concisely.
|
|
459
|
+
- [docs/getting-started.md](docs/getting-started.md) — end-to-end walkthrough.
|
|
460
|
+
- [docs/work-spec.md](docs/work-spec.md) — the full WORK.md + WORK.history.yaml spec.
|
|
461
|
+
- [CHANGELOG.md](CHANGELOG.md) — version history and migration notes.
|
|
462
|
+
|
|
463
|
+
---
|
|
273
464
|
|
|
274
465
|
## License
|
|
275
466
|
|