@really-knows-ai/foundry 2.3.1 → 2.3.2

package/CHANGELOG.md CHANGED
@@ -1,5 +1,15 @@
1
1
  # Changelog
2
2
 
3
+ ## 2.3.2 — 2026-04-21
4
+
5
+ ### Changed
6
+
7
+ - Config-modifying skills (`add-flow`, `add-cycle`, `add-law`, `add-appraiser`, `add-artefact-type`) now refuse to run on a work branch. They require the current branch to not start with `work/`, directing the user to complete or discard the in-flight flow before changing foundry configuration. Structural changes belong on the base branch, not alongside transient flow state.
8
+
9
+ ### Removed
10
+
11
+ - Historical planning docs (`docs/plans/`, `docs/specs/`, `docs/superpowers/`) and `HARDEN.md`. All of them described features that shipped in v2.2.0–v2.3.1; git history preserves the full record.
12
+
3
13
  ## 2.3.1 — 2026-04-20
4
14
 
5
15
  ### Changed
package/README.md CHANGED
@@ -1,12 +1,51 @@
1
1
  # Foundry
2
2
 
3
- A skill-driven framework for governed artefact generation and evaluation using AI coding tools. Install it as an npm package and define your own artefact types, laws, and flows — Foundry handles the forge-quench-appraise pipeline.
3
+ > A skill-driven framework for governed artefact generation with AI coding tools. Define your own artefact types, laws, and flows — Foundry handles the forge → quench → appraise pipeline with deterministic routing, quality gates, and iterative refinement.
4
+
5
+ [![npm version](https://img.shields.io/npm/v/@really-knows-ai/foundry.svg)](https://www.npmjs.com/package/@really-knows-ai/foundry)
6
+ [![license](https://img.shields.io/npm/l/@really-knows-ai/foundry.svg)](LICENSE)
7
+
8
+ ---
9
+
10
+ ## Table of contents
11
+
12
+ - [Why Foundry?](#why-foundry)
13
+ - [Compatibility](#compatibility)
14
+ - [Installation](#installation)
15
+ - [Quick start](#quick-start)
16
+ - [How it works](#how-it-works)
17
+ - [Core concepts](#core-concepts)
18
+ - [The pipeline in depth](#the-pipeline-in-depth)
19
+ - [Feedback lifecycle](#feedback-lifecycle)
20
+ - [Enforcement model](#enforcement-model)
21
+ - [Multi-model routing](#multi-model-routing)
22
+ - [Skills](#skills)
23
+ - [Custom tools](#custom-tools)
24
+ - [Project layout](#project-layout)
25
+ - [Design decisions](#design-decisions)
26
+ - [Further reading](#further-reading)
27
+ - [License](#license)
28
+
29
+ ---
30
+
31
+ ## Why Foundry?
32
+
33
+ LLMs are excellent at producing artefacts — code, specs, docs, tests — but they are erratic about *governing* that production. They skip checks, silently ignore feedback, drift from constraints, and forget what stage they're in. Foundry is an opinionated framework that separates **creative work** (handled by LLMs via skills) from **process work** (handled by deterministic tools):
34
+
35
+ - **The pipeline is code, not prose.** Routing, state transitions, commit discipline, and write invariants live inside tested plugin tools. LLMs can't rationalise their way past them.
36
+ - **Every artefact is governed by laws.** Global and per-type pass/fail criteria are evaluated by a panel of independent appraisers before anything is considered done.
37
+ - **Nothing is silent.** Feedback has a full lifecycle (open → actioned/wont-fix → approved/rejected). Wont-fix requires appraiser approval. Validation is non-negotiable.
38
+ - **Writes are enforced.** Each stage is allowed to modify a specific, narrow set of files. Violations halt the cycle.
39
+ - **Humans can step in.** Human-in-the-loop gates can run every iteration or only when LLM appraisers deadlock.
40
+
41
+ ---
4
42
 
5
43
  ## Compatibility
6
44
 
7
- - **OpenCode** — full support, multi-model routing via file-based agents
45
+ - **OpenCode** — full support. Multi-model routing via file-based `foundry-*` agents. This is the primary target.
46
+ - **Other skill-aware AI tools** — the skills and tools are portable. Multi-model stage routing is OpenCode-specific today because it relies on `.opencode/agents/` files generated by `refresh-agents`.
8
47
 
9
- Multi-model support enables model diversity across pipeline stages. Foundry agents are defined as `.opencode/agents/foundry-*.md` files, generated by the `refresh-agents` skill (also run during `init-foundry`). Cycle definitions specify which model each stage uses. Tools limited to a single model lose model-diversity but still get personality-based diversity.
48
+ ---
10
49
 
11
50
  ## Installation
12
51
 
@@ -15,211 +54,339 @@ Add `@really-knows-ai/foundry` to your OpenCode config:
15
54
  ```json
16
55
  // opencode.json
17
56
  {
18
- "packages": {
19
- "@really-knows-ai/foundry": "latest"
20
- }
57
+ "$schema": "https://opencode.ai/config.json",
58
+ "plugin": ["@really-knows-ai/foundry"]
21
59
  }
22
60
  ```
23
61
 
62
+ ---
63
+
24
64
  ## Quick start
25
65
 
26
- 1. **Install** the package as shown above
27
- 2. **Initialize** — use the `init-foundry` skill to scaffold a `foundry/` directory in your project
28
- 3. **Define artefact types** — use `add-artefact-type` to create types with file patterns, descriptions, and optional validation
29
- 4. **Add laws** — use `add-law` to define subjective pass/fail criteria (global or per-type)
30
- 5. **Add appraisers** — use `add-appraiser` to create appraiser personalities
31
- 6. **Define cycles** — use `add-cycle` to wire artefact types into forge/quench/appraise loops
32
- 7. **Define flows** — use `add-flow` to sequence cycles into end-to-end pipelines
33
- 8. **Run** — use the `flow` skill to execute a flow
66
+ 1. **Install** the package (above).
67
+ 2. **Initialize** — run the `init-foundry` skill to scaffold a `foundry/` directory and generate `foundry-*` agent files.
68
+ 3. **Define artefact types** — `add-artefact-type` walks you through identity, file patterns, output directory, laws, and optional CLI validation.
69
+ 4. **Add laws** — `add-law` creates subjective pass/fail criteria, globally or per-type.
70
+ 5. **Add appraisers** — `add-appraiser` creates appraiser personalities with conflict detection.
71
+ 6. **Define cycles** — `add-cycle` wires artefact types into a forge/quench/appraise loop with targets and input contracts.
72
+ 7. **Define a flow** — `add-flow` groups cycles and declares entry points.
73
+ 8. **Run** — invoke the `flow` skill with your goal. It creates a work branch, picks the right cycle, and hands off to `orchestrate`.
74
+
75
+ ---
34
76
 
35
77
  ## How it works
36
78
 
37
79
  ```
38
- Foundry Flow
39
- └─ Cycle 1 (e.g., ideation)
40
- │ ├─ Forge → produce the artefact
41
- │ ├─ Quench → deterministic CLI checks (if defined)
42
- │ ├─ Appraise → subjective evaluation by multiple appraisers
43
- │ └─ ↺ iterate until all feedback is resolved
44
- └─ Cycle 2 (e.g., creation)
45
- ├─ reads output from Cycle 1 (read-only)
46
- ├─ Forge → produce the artefact
47
- ├─ Quench → deterministic CLI checks
48
- ├─ Appraise → subjective evaluation
49
- └─ ↺ iterate until all feedback is resolved
80
+ ┌─────────────────────────────┐
81
+ │ Flow (entry points + set) │
82
+ └──────────────┬──────────────┘
83
+ starting cycle picked
84
+
85
+ ┌────────────────────────────────────────────────────────────────┐
86
+ │ Cycle (outputs exactly one artefact type) │
87
+ │ │
88
+ │ ┌─────────┐ ┌─────────┐ ┌─────────────┐ │
89
+ │ │ forge │ → │ quench │ → │ appraise │ ──┐ │
90
+ │ └─────────┘ └─────────┘ └─────────────┘ │ loop │
91
+ │ ▲ │ until │
92
+ │ └───── unresolved feedback ─────────────────┘ clean │
93
+ │ │
94
+ │ [ optional: human-appraise — every iter or on deadlock ] │
95
+ └──────────────┬─────────────────────────────────────────────────┘
96
+ │ targets (may branch)
97
+
98
+ next cycle → … → done
50
99
  ```
51
100
 
52
- A **foundry flow** runs one or more **foundry cycles** in sequence. Each cycle produces a single artefact type by looping through forge → quench → appraise until the artefact passes all criteria. The output of one cycle becomes read-only input for the next.
101
+ - A **flow** defines the set of cycles and their entry points.
102
+ - A **cycle** produces exactly one artefact type and declares its own `targets` — Foundry follows a dependency graph, not a linear list.
103
+ - Each cycle loops through **forge → quench → appraise** until there is no unresolved feedback, or an iteration limit is hit.
104
+ - All inter-stage communication goes through **WORK.md** on a dedicated work branch; every stage ends with a micro-commit.
53
105
 
54
- All state lives in `WORK.md` on a dedicated work branch. Every stage micro-commits, and file modification enforcement ensures stages only touch what they're allowed to.
106
+ ---
55
107
 
56
- ## Custom tools
108
+ ## Core concepts
57
109
 
58
- The Foundry plugin exposes 25 custom tools that handle all deterministic pipeline operations. Skills call these tools instead of manipulating files directly — this eliminates LLM interpretation of file formats and ensures consistent state management.
110
+ ### Flow
59
111
 
60
- | Category | Tools |
61
- |----------|-------|
62
- | **Workfile** | `foundry_workfile_create`, `foundry_workfile_get`, `foundry_workfile_set`, `foundry_workfile_delete` |
63
- | **Artefacts** | `foundry_artefacts_add`, `foundry_artefacts_list`, `foundry_artefacts_set_status` |
64
- | **Feedback** | `foundry_feedback_add`, `foundry_feedback_action`, `foundry_feedback_wontfix`, `foundry_feedback_resolve`, `foundry_feedback_list` |
65
- | **History** | `foundry_history_append`, `foundry_history_list` |
66
- | **Sort** | `foundry_sort` |
67
- | **Config** | `foundry_config_cycle`, `foundry_config_artefact_type`, `foundry_config_laws`, `foundry_config_validation`, `foundry_config_appraisers`, `foundry_config_flow` |
68
- | **Validation** | `foundry_validate_run`, `foundry_appraisers_select` |
69
- | **Git** | `foundry_git_branch`, `foundry_git_commit` |
112
+ A flow lives in `foundry/flows/`. It declares:
70
113
 
71
- Tools are backed by shared library modules in `scripts/lib/` that use injectable I/O for testability. The sort routing engine (`scripts/sort.js`) exports `runSort()` for the sort tool.
114
+ - `starting-cycles` hints about where the flow can be entered.
115
+ - The set of cycles it contains (routing between them is owned by cycles, not by the flow).
72
116
 
73
- ## Core concepts
117
+ Starting a flow creates a work branch and a fresh `WORK.md`.
74
118
 
75
- ### Foundry Flows
119
+ ### Cycle
76
120
 
77
- Defined in `foundry/flows/`. A flow lists cycles to execute in order. Starting a flow creates a work branch and a fresh `WORK.md`.
121
+ A cycle lives in `foundry/cycles/`. It declares:
78
122
 
79
- ### Foundry Cycles
123
+ - `output` — the artefact type the cycle produces (read-write).
124
+ - `inputs` — a contract (`any-of` or `all-of`) over artefact types from other cycles. Inputs are discovered on disk by filesystem scan against each input type's file-patterns; they are read-only.
125
+ - `targets` — which cycle(s) may run next after this one completes.
126
+ - `human-appraise` / `deadlock-appraise` / `deadlock-iterations` — human-in-the-loop configuration.
127
+ - `models` — optional per-stage model overrides for multi-model diversity.
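The `inputs` contract above can be illustrated with a small check. This is only a sketch: the object shape (`{ "any-of": [...] }` or `{ "all-of": [...] }`) and the helper name `inputsSatisfied` are assumptions made for illustration, not Foundry's actual schema or API.

```javascript
// Hypothetical helper: decides whether a cycle's input contract is met,
// given the artefact types that a filesystem scan found on disk.
// The contract shape used here is an assumption, not Foundry's real schema.
function inputsSatisfied(contract, availableTypes) {
  const have = new Set(availableTypes);
  if (Array.isArray(contract["all-of"])) {
    // Every listed artefact type must be present.
    return contract["all-of"].every((type) => have.has(type));
  }
  if (Array.isArray(contract["any-of"])) {
    // At least one listed artefact type must be present.
    return contract["any-of"].some((type) => have.has(type));
  }
  // No contract declared: nothing to satisfy.
  return true;
}
```

For example, a cycle that accepts either a `brief` or a `spec` could run as soon as one of them exists, while an `all-of` contract would wait for both.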
80
128
 
81
- Defined in `foundry/cycles/`. A cycle specifies:
82
- - `output` — the artefact type it produces (read-write)
83
- - `inputs` — artefact types from previous cycles (read-only)
129
+ ### Stage
84
130
 
85
- ### Stages
131
+ A single step within a cycle. Stages are identified as `base:alias` (e.g. `forge:write-haiku`, `quench:check-syllables`). The base is one of:
86
132
 
87
- The three steps within a cycle:
88
- - **Forge** — produce or revise the artefact
89
- - **Quench** — run deterministic CLI checks (skipped if artefact type has no `validation.md`)
90
- - **Appraise** — subjective evaluation by multiple independent appraisers
133
+ - **forge** — produce or revise the artefact.
134
+ - **quench** — run deterministic CLI checks (skipped if the artefact type has no `validation.md`).
135
+ - **appraise** — subjective evaluation by multiple independent appraiser sub-agents.
136
+ - **human-appraise** — human quality gate, either every iteration or only on deadlock.
91
137
 
92
- ### Artefact types
138
+ ### Artefact type
93
139
 
94
- Defined in `foundry/artefacts/<type>/`. Each type has:
95
- - `definition.md` — id, name, file patterns, output directory, appraiser config, prose description
96
- - `laws.md` (optional) — type-specific subjective criteria
97
- - `validation.md` (optional) — CLI commands with `{file}` placeholder; non-zero exit = failure
140
+ Defined in `foundry/artefacts/<type>/`:
141
+
142
+ - `definition.md` — id, name, file patterns, output directory, appraiser configuration, prose description.
143
+ - `laws.md` *(optional)* — type-specific subjective criteria.
144
+ - `validation.md` *(optional)* — CLI commands with a `{file}` placeholder; non-zero exit = failure.
98
145
 
99
146
  ### Laws
100
147
 
101
- Subjective pass/fail criteria. Two scopes:
102
- - `foundry/laws/*.md` — global laws, all files concatenated, apply to everything
103
- - `foundry/artefacts/<type>/laws.md` — type-specific laws
148
+ Subjective pass/fail criteria evaluated by appraisers.
149
+
150
+ - `foundry/laws/*.md` — global laws (all files concatenated, apply everywhere).
151
+ - `foundry/artefacts/<type>/laws.md` — type-specific laws.
104
152
 
105
- Each law is a `## heading` (the identifier, used in feedback tags as `#law:<id>`) with a description, passing criteria, and failing criteria.
153
+ Each law is a `## heading` (its identifier, referenced in feedback as `#law:<id>`) with a description, passing criteria, and failing criteria.
106
154
 
107
155
  ### Appraisers
108
156
 
109
- Defined in `foundry/appraisers/`. Each appraiser has a personality and an optional model override. Appraisers are assigned to artefact types via the `appraisers` section in the type's `definition.md`:
157
+ Defined in `foundry/appraisers/`. Each appraiser is a named personality with an optional `model` override. Artefact types pick which appraisers may evaluate them:
110
158
 
111
159
  ```yaml
112
160
  appraisers:
113
- count: 3 # how many appraisers (default: 3)
114
- allowed: [pedantic, pragmatic] # which personalities (default: all available)
161
+ count: 3 # how many appraisers (default: 3)
162
+ allowed: [pedantic, pragmatic] # which personalities (default: all)
115
163
  ```
116
164
 
117
- Appraisers are distributed evenly across available personalities for maximum diversity. If you request 6 appraisers with 3 personalities, you get 2 of each. Model diversity is configured at the cycle level (per-stage) and optionally per-appraiser — see [concepts](docs/concepts.md).
165
+ Appraisers are distributed evenly across the allowed set for maximum diversity.
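One plausible reading of "distributed evenly" is a round-robin over the allowed personalities. The sketch below assumes that reading; `distributeAppraisers` is a hypothetical name, and the real selection logic (exposed through `foundry_appraisers_select`) may differ.

```javascript
// Hypothetical helper: round-robin assignment is an assumption about how
// "distributed evenly" works, not a description of Foundry's implementation.
function distributeAppraisers(count, allowed) {
  if (allowed.length === 0) {
    throw new Error("no appraiser personalities available");
  }
  const picks = [];
  for (let i = 0; i < count; i++) {
    // Cycle through the allowed personalities so each is used as equally as possible.
    picks.push(allowed[i % allowed.length]);
  }
  return picks;
}
```

With `count: 3` and `allowed: [pedantic, pragmatic]`, this sketch would pick `pedantic` twice and `pragmatic` once.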
118
166
 
119
167
  ### WORK.md
120
168
 
121
- Transient shared state on the work branch. Tracks:
122
- - Current position (flow, cycle, stage) in frontmatter
123
- - Goal description
124
- - Artefact registry (what exists, its status)
125
- - All feedback with full lifecycle
169
+ Transient shared state on the work branch. Created when the flow starts, deleted before the branch is squash-merged. It contains:
170
+
171
+ - **Frontmatter** — current position (`flow`, `cycle`, stage list, max iterations, model map, human-appraise config).
172
+ - **Goal** — the prose request that kicked off the flow.
173
+ - **Artefacts** — a table of every file produced by the flow and its status (`draft`, `done`, `blocked`).
174
+ - **Feedback** — grouped by artefact file, every feedback item with its full lifecycle.
175
+
176
+ A sibling file `WORK.history.yaml` is an append-only log of every stage execution. See [docs/work-spec.md](docs/work-spec.md).
177
+
178
+ ---
179
+
180
+ ## The pipeline in depth
181
+
182
+ ### Stages run inside a token-gated lifecycle
183
+
184
+ Every dispatched stage (forge, quench, appraise, human-appraise) runs under a single-use HMAC token:
185
+
186
+ 1. The `orchestrate` tool mints a token and hands it to the sub-agent in the dispatch prompt.
187
+ 2. The sub-agent's **first** call must be `foundry_stage_begin({stage, cycle, token})`. The token is redeemed; mutation tools now check that the active stage matches.
188
+ 3. The sub-agent does its work (reads WORK.md, writes artefact files / feedback, etc.).
189
+ 4. The sub-agent's **last** call is `foundry_stage_end({summary})`.
190
+ 5. The orchestrator then calls `foundry_stage_finalize`, which:
191
+ - Scans the git diff against the stage's allowed file-patterns.
192
+ - Registers any new files matching the output artefact type as `draft` artefacts.
193
+ - Returns `{error: 'unexpected_files'}` if the stage wrote anywhere it shouldn't have.
194
+ 6. The cycle is committed (`foundry_git_commit` internally) and routing advances.
195
+
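The token steps above can be sketched with Node's built-in `crypto` module. This is a minimal illustration under stated assumptions: the payload layout (`stage|cycle|nonce|mac`), the in-memory replay set, and the names `mintToken` and `redeemToken` are all hypothetical, and Foundry's own `token.js` may work differently.

```javascript
const crypto = require("node:crypto");

// Hypothetical sketch: payload layout and helper names are assumptions.
// Assumes stage and cycle names never contain the "|" separator.
function mintToken(key, stage, cycle) {
  const nonce = crypto.randomBytes(16).toString("hex");
  const payload = `${stage}|${cycle}|${nonce}`;
  const mac = crypto.createHmac("sha256", key).update(payload).digest("hex");
  return `${payload}|${mac}`;
}

// Nonces that have already been redeemed (in memory only, for illustration).
const redeemed = new Set();

function redeemToken(key, token, stage, cycle) {
  const parts = token.split("|");
  if (parts.length !== 4) return false;
  const [tokenStage, tokenCycle, nonce, mac] = parts;
  const expected = crypto
    .createHmac("sha256", key)
    .update(`${tokenStage}|${tokenCycle}|${nonce}`)
    .digest("hex");
  const macBuf = Buffer.from(mac, "hex");
  const expectedBuf = Buffer.from(expected, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (macBuf.length !== expectedBuf.length) return false;
  if (!crypto.timingSafeEqual(macBuf, expectedBuf)) return false; // forgery
  if (tokenStage !== stage || tokenCycle !== cycle) return false; // cross-stage reuse
  if (redeemed.has(nonce)) return false; // replay
  redeemed.add(nonce);
  return true;
}
```

Every failure path returns `false` rather than throwing, which mirrors the "fail closed" behaviour the README describes for replays, forgery, and cross-stage reuse.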
196
+ Per-stage write rules:
197
+
198
+ | Stage | May write |
199
+ |-------|-----------|
200
+ | `forge` | Files matching the output artefact type's `file-patterns`, plus `WORK.md` / `WORK.history.yaml` |
201
+ | `quench` | `WORK.md` / `WORK.history.yaml` only (feedback) |
202
+ | `appraise` | `WORK.md` / `WORK.history.yaml` only (feedback) |
203
+ | `human-appraise` | `WORK.md` / `WORK.history.yaml` only (feedback) |
204
+
205
+ Input artefacts are read-only. Files outside any artefact type's patterns are read-only. Violations hard-stop the cycle.
206
+
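The write rules in the table can be approximated by filtering the changed files of a stage against its allowed patterns. The sketch below is an assumption-laden illustration: the tiny glob matcher supports only `*` and `**`, and `finalizeStage`, the `files` field, and the `{ ok: true }` success value are hypothetical. Only the `{error: 'unexpected_files'}` result is taken from the README.

```javascript
// Hypothetical sketch: a minimal glob matcher that supports only `*` and `**`.
function globToRegExp(glob) {
  const pattern = glob.replace(/\*\*\/|\*\*|\*|[.+?^${}()|[\]\\]/g, (m) => {
    if (m === "**/") return "(?:.*/)?"; // zero or more directories
    if (m === "**") return ".*";
    if (m === "*") return "[^/]*"; // anything except a path separator
    return "\\" + m; // escape regex metacharacters
  });
  return new RegExp(`^${pattern}$`);
}

// Files every stage may write, per the table above.
const ALWAYS_WRITABLE = ["WORK.md", "WORK.history.yaml"];

// stageBase is "forge", "quench", "appraise", or "human-appraise".
function finalizeStage(stageBase, changedFiles, outputPatterns) {
  // Only forge may touch files matching the output artefact type's patterns.
  const allowed = stageBase === "forge" ? outputPatterns.map(globToRegExp) : [];
  const unexpected = changedFiles.filter(
    (file) => !ALWAYS_WRITABLE.includes(file) && !allowed.some((re) => re.test(file))
  );
  return unexpected.length > 0
    ? { error: "unexpected_files", files: unexpected }
    : { ok: true };
}
```

In this sketch a `quench` stage that edits an artefact file is rejected, while the same change from `forge` passes.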
207
+ ### Deterministic orchestration
208
+
209
+ The `orchestrate` skill is thin — a 3-line loop:
210
+
211
+ ```text
212
+ call foundry_orchestrate({lastResult})
213
+ switch on action:
214
+ dispatch → task tool (subagent) → report back
215
+ human_appraise → run human-appraise inline → report back
216
+ done / blocked / violation → terminate the loop
217
+ ```
218
+
219
+ `foundry_orchestrate` owns sort routing, history, commits, finalize, deadlock detection, and violation handling. Because the protocol lives in a plugin tool, the LLM can't skip steps, reorder them, or silently drop a commit.
220
+
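The loop above could look roughly like this in JavaScript. It is a sketch only: `tools` is a hypothetical object standing in for the plugin's tool bindings, and the shape of the returned action object is assumed from the pseudocode rather than taken from Foundry's source.

```javascript
// Hypothetical sketch: `tools` stands in for the real tool bindings, and the
// `{ action }` result shape is an assumption based on the pseudocode above.
async function runLoop(tools) {
  let lastResult = null;
  for (;;) {
    const step = await tools.foundryOrchestrate({ lastResult });
    switch (step.action) {
      case "dispatch":
        // Hand the stage to a sub-agent and report its result back.
        lastResult = await tools.dispatchSubAgent(step);
        break;
      case "human_appraise":
        // Run the human gate inline and report its result back.
        lastResult = await tools.runHumanAppraise(step);
        break;
      case "done":
      case "blocked":
      case "violation":
        // Terminal states end the loop.
        return step.action;
      default:
        throw new Error(`unknown action: ${step.action}`);
    }
  }
}
```

The driver holds no routing logic of its own; it only relays results, which is the point of keeping the protocol inside `foundry_orchestrate`.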
221
+ ---
126
222
 
127
- ### Feedback lifecycle
223
+ ## Feedback lifecycle
224
+
225
+ Feedback is markdown checklists under each artefact in WORK.md, tagged to indicate source.
128
226
 
129
227
  ```
130
- open - [ ] issue #tag → needs generator action
131
- actioned - [x] issue #tag → needs approval
132
- wont-fix - [~] issue #tag | wont-fix: <reason> → needs approval
133
- approved - [x] issue #tag | approved → resolved
134
- approved - [~] issue #tag | wont-fix: <reason> | approved → resolved
135
- rejected - [x] issue #tag | rejected: <reason> → re-opened
136
- rejected - [~] issue #tag | wont-fix: <reason> | rejected → re-opened
228
+ - [ ] issue #tag → open — needs forge action
229
+ - [x] issue #tag → actioned — needs appraise approval
230
+ - [~] issue #tag | wont-fix: <reason> → wont-fix — needs appraise approval
231
+ - [x] issue #tag | approved → resolved
232
+ - [~] issue #tag | wont-fix: <reason> | approved → resolved
233
+ - [x] issue #tag | rejected: <reason> → re-opened
234
+ - [~] issue #tag | wont-fix: <reason> | rejected → re-opened
137
235
  ```
138
236
 
139
- Validation feedback (`#validation`) cannot be wont-fixed — deterministic rules are not negotiable.
237
+ Tags:
238
+
239
+ | Tag | Source | Notes |
240
+ |-----|--------|-------|
241
+ | `#validation` | quench (CLI command failed) | Cannot be wont-fixed. Deterministic rules are not negotiable. |
242
+ | `#law:<id>` | appraise (subjective law) | May be wont-fixed with justification; an appraiser must approve. |
243
+ | `#human` | human-appraise | Takes absolute priority. Forge MUST address it — cannot wont-fix. |
244
+
245
+ Feedback is append-only: items are never deleted, only resolved. Re-opened items show their full history.
246
+
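To make the checklist notation concrete, here is a small parser that maps one feedback line to its lifecycle state and tag. The regular expressions and the names `parseFeedback` and `canWontFix` are assumptions derived from the examples and the tag table above; the package's own `feedback.js` and `tags.js` may parse differently.

```javascript
// Hypothetical parser: patterns are inferred from the README examples only.
function parseFeedback(line) {
  const m = /^- \[( |x|~)\] (.*)$/.exec(line.trim());
  if (!m) return null; // not a feedback checklist item
  const [, box, rest] = m;
  const segments = rest.split("|").map((s) => s.trim());
  const tagMatch = /#(validation|human|law:[\w-]+)/.exec(segments[0]);
  const tag = tagMatch ? tagMatch[0] : null;
  const last = segments[segments.length - 1];

  let state;
  if (box === " ") state = "open";
  else if (segments.length > 1 && last === "approved") state = "resolved";
  else if (segments.length > 1 && last.startsWith("rejected")) state = "re-opened";
  else state = box === "x" ? "actioned" : "wont-fix";
  return { state, tag };
}

// Per the tag table: only subjective law feedback may be wont-fixed.
function canWontFix(tag) {
  return tag !== null && tag.startsWith("#law:");
}
```

Keeping the state inline in the checklist line means the same text stays readable to humans while remaining mechanically parseable.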
247
+ ### Deadlock handling
248
+
249
+ If forge and appraise ping-pong on the same items for `deadlock-iterations` (default 5) iterations, and the cycle has `deadlock-appraise: true` (default), the router inserts a `human-appraise` stage. If `deadlock-appraise: false`, the cycle is marked `blocked` and control returns to the human.
250
+
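The deadlock rule reduces to a small decision function. The sketch below assumes the router tracks how many consecutive iterations have been stuck on the same items; `stuckIterations`, `nextOnDeadlockCheck`, and the `"continue"` return value are hypothetical, while the defaults (5 and `true`) come from the paragraph above.

```javascript
// Hypothetical sketch: how the stuck-iteration count is measured is an assumption.
function nextOnDeadlockCheck(stuckIterations, cycleConfig) {
  const limit = cycleConfig["deadlock-iterations"] ?? 5; // default from the README
  const escalate = cycleConfig["deadlock-appraise"] ?? true; // default from the README
  if (stuckIterations < limit) return "continue";
  // Deadlocked: either ask a human or mark the cycle blocked.
  return escalate ? "human-appraise" : "blocked";
}
```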
251
+ ---
252
+
253
+ ## Enforcement model
254
+
255
+ Foundry is designed around "trust the tool, not the LLM". The following guarantees are enforced in plugin code, not prose:
256
+
257
+ - **Stage-locked mutations.** `foundry_feedback_*`, `foundry_artefacts_*`, and `foundry_workfile_*` tools require the caller's role to match the active stage. A forge sub-agent cannot add feedback; a quench sub-agent cannot register artefacts.
258
+ - **Single-use tokens.** `foundry_stage_begin` verifies an HMAC token minted at dispatch time. Replays, forgery, and cross-stage reuse all fail closed. Keys live in `.foundry/.secret` (mode 0600, gitignored, one per worktree).
259
+ - **Commit-per-stage contract.** `foundry_orchestrate` refuses to proceed if, at the start of a sort call, history is non-empty and there are uncommitted changes to `WORK.md`, `WORK.history.yaml`, or anything under `.foundry/`.
260
+ - **Write invariants.** `foundry_stage_finalize` scans the git diff and rejects stray writes with `{error: 'unexpected_files'}`.
261
+ - **Feedback state machine.** Only legal transitions are accepted: `approved` is terminal; quench cannot approve/reject a `wont-fix`; validation cannot be wont-fixed.
262
+ - **Artefact-type glob uniqueness.** `add-artefact-type` refuses to create a type whose file patterns overlap with an existing type; the enforcer can't determine file ownership otherwise.
263
+
264
+ ---
140
265
 
141
- ### File modification enforcement
266
+ ## Multi-model routing
142
267
 
143
- Every stage micro-commits. The cycle checks the git diff:
144
- - After forge: only output artefact file patterns + WORK.md + WORK.history.yaml (input artefacts are read-only — violation if touched)
145
- - After quench/appraise: only WORK.md + WORK.history.yaml
146
- - Violations are hard stops
268
+ Different stages can run on different models for genuine cognitive diversity (mitigating shared blind spots):
147
269
 
148
- > **Merge hygiene:** WORK.md and WORK.history.yaml are ephemeral working files. Delete them before squash-merging the branch back into main.
270
+ - Cycle definitions can declare a `models` map, e.g. `models: { forge: anthropic/claude-opus-4.7, appraise: openai/gpt-5 }`.
271
+ - Individual appraisers can override the cycle-level appraise model via a `model` field in their personality definition.
272
+ - `refresh-agents` generates a `foundry-<provider>-<model>.md` agent file in `.opencode/agents/` for every model available in the session. `orchestrate` picks the matching agent when dispatching.
273
+
274
+ Resolution order for a given stage: **appraiser `model`** → **cycle `models.<stage>`** → **session default**.
275
+
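The resolution order can be written as a short fallback chain. This is a sketch under assumptions: `resolveModel` and the argument shape are hypothetical, and it treats the appraiser-level `model` as relevant only to the `appraise` stage, which is how the bullets above read.

```javascript
// Hypothetical helper: argument names and shapes are assumptions.
function resolveModel(stageBase, { appraiser, cycle, sessionDefault }) {
  // 1. An appraiser's own `model` wins, but only for the appraise stage.
  if (stageBase === "appraise" && appraiser?.model) return appraiser.model;
  // 2. Otherwise the cycle's `models.<stage>` entry, 3. else the session default.
  return cycle?.models?.[stageBase] ?? sessionDefault;
}
```

Given the `models` map from the example above, `forge` would resolve to the cycle's forge model, and a stage with no entry, such as `quench`, would fall back to the session default.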
276
+ Run `list-agents` to see what's available.
277
+
278
+ ---
149
279
 
150
280
  ## Skills
151
281
 
152
- Everything is a skill. Skills are either atomic (do one thing) or composite (orchestrate other skills).
282
+ Foundry is a collection of skills. Skills are either **atomic** (do one thing) or **composite** (orchestrate other skills).
153
283
 
154
- ### Pipeline skills
284
+ ### Pipeline
155
285
 
156
286
  | Skill | Type | Purpose |
157
287
  |-------|------|---------|
158
- | `forge` | atomic | Produce or revise an artefact |
159
- | `quench` | atomic | Run deterministic CLI checks |
160
- | `appraise` | atomic | Dispatch multiple appraisers, consolidate feedback |
161
- | `cycle` | composite | forge → quench → appraise → iterate |
162
- | `flow` | composite | Orchestrate cycles on a work branch |
288
+ | `flow` | composite | Entry point. Picks a starting cycle, creates the work branch, invokes `orchestrate`, follows `targets` between cycles. |
289
+ | `orchestrate` | atomic | Thin driver around `foundry_orchestrate`. Dispatches sub-agents, runs human-appraise inline, reports terminal states. |
290
+ | `forge` | atomic | Produce or revise the artefact. Discovers inputs by filesystem scan. |
291
+ | `quench` | atomic | Run the artefact type's CLI validation commands; write `#validation` feedback. |
292
+ | `appraise` | atomic | Dispatch the selected appraiser personalities as parallel sub-agents; consolidate `#law:<id>` feedback (union + dedup). |
293
+ | `human-appraise` | atomic | Human quality gate. Presents the artefact, collects `#human` feedback. |
163
294
 
164
- ### Helper skills
295
+ ### Authoring
165
296
 
166
297
  | Skill | Purpose |
167
298
  |-------|---------|
168
- | `init-foundry` | Scaffold the `foundry/` directory in your project |
169
- | `add-artefact-type` | Create a new artefact type with conflict and glob-overlap checks |
170
- | `add-law` | Create a new law with conflict detection |
171
- | `add-appraiser` | Create a new appraiser personality with semantic overlap checks |
172
- | `add-cycle` | Create a new cycle within a flow with dependency validation |
173
- | `add-flow` | Create a new flow definition |
299
+ | `init-foundry` | Scaffold the `foundry/` directory and generate agent files. |
300
+ | `add-artefact-type` | Create a new artefact type, with conflict and glob-overlap checks. |
301
+ | `add-law` | Create a new law with conflict detection. |
302
+ | `add-appraiser` | Create an appraiser personality with semantic-overlap checks. |
303
+ | `add-cycle` | Create a cycle and validate its targets and input contract against the flow. |
304
+ | `add-flow` | Create a flow definition with cycle-graph reachability checks. |
174
305
 
175
- ### Utility skills
306
+ ### Utility
176
307
 
177
308
  | Skill | Purpose |
178
309
  |-------|---------|
179
- | `sort` | Deterministic cycle router — determines and dispatches the next stage |
180
- | `hitl` | Human-in-the-loop intervention points |
310
+ | `list-agents` | List available `foundry-*` sub-agents (for multi-model routing). |
311
+ | `refresh-agents` | Regenerate `foundry-*` agent files from the currently available models. |
312
+ | `upgrade-foundry` | Analyse and migrate `foundry/` config to the current version. |
181
313
 
182
- All helper skills are interactive — they walk you through the process, check for conflicts, and confirm before writing files.
314
+ All authoring skills are interactive and conflict-aware — they explain what they're about to write and ask before writing.
183
315
 
184
- ## Package structure
316
+ ---
317
+
318
+ ## Custom tools
319
+
320
+ The plugin registers **24 custom tools**. Skills call these rather than manipulating files directly, which keeps format-parsing and state transitions out of LLM hands.
321
+
322
+ | Category | Tools |
323
+ |----------|-------|
324
+ | **Orchestration** | `foundry_orchestrate` |
325
+ | **Stage lifecycle** | `foundry_stage_begin`, `foundry_stage_end` |
326
+ | **Workfile** | `foundry_workfile_create`, `foundry_workfile_get`, `foundry_workfile_delete` |
327
+ | **Artefacts** | `foundry_artefacts_set_status`, `foundry_artefacts_list` |
328
+ | **Feedback** | `foundry_feedback_add`, `foundry_feedback_action`, `foundry_feedback_wontfix`, `foundry_feedback_resolve`, `foundry_feedback_list` |
329
+ | **History** | `foundry_history_list` |
330
+ | **Config** | `foundry_config_cycle`, `foundry_config_artefact_type`, `foundry_config_laws`, `foundry_config_validation`, `foundry_config_appraisers`, `foundry_config_flow` |
331
+ | **Validation** | `foundry_validate_run`, `foundry_appraisers_select` |
332
+ | **Git** | `foundry_git_branch`, `foundry_git_finish` |
333
+
334
+ A handful of internal tools (`foundry_sort`, `foundry_history_append`, `foundry_stage_finalize`, `foundry_git_commit`, `foundry_workfile_set`, `foundry_workfile_configure_from_cycle`) are intentionally *not* registered — they exist only inside `foundry_orchestrate` so they cannot be called out of band.
335
+
336
+ Tools are backed by shared modules in `scripts/lib/` with injectable I/O for testability (see `tests/`).
337
+
338
+ ---
339
+
340
+ ## Project layout
341
+
342
+ ### Package (this repo)
185
343
 
186
344
  ```
187
345
  @really-knows-ai/foundry
188
346
  ├── .opencode/
189
347
  │ └── plugins/
190
- │ └── foundry.js # OpenCode plugin (skills + 25 custom tools)
191
- ├── skills/ # skill definitions (the pipeline)
348
+ │ └── foundry.js # plugin: skills + 24 custom tools
349
+ ├── skills/ # skill definitions
350
+ │ ├── flow/ # pipeline
351
+ │ ├── orchestrate/
192
352
  │ ├── forge/
193
353
  │ ├── quench/
194
354
  │ ├── appraise/
195
- │ ├── cycle/
196
- │ ├── flow/
197
- │ ├── init-foundry/
355
+ │ ├── human-appraise/
356
+ │ ├── init-foundry/ # authoring
198
357
  │ ├── add-artefact-type/
199
358
  │ ├── add-law/
200
359
  │ ├── add-appraiser/
201
360
  │ ├── add-cycle/
202
361
  │ ├── add-flow/
203
- │ ├── sort/
204
- │ └── hitl/
205
- ├── scripts/ # shared library and routing engine
206
- │ ├── lib/
207
- │ │ ├── workfile.js # WORK.md frontmatter parsing/writing
208
- │ │ ├── artefacts.js # artefacts table operations
209
- │ │ ├── history.js # WORK.history.yaml operations
210
- │ │ ├── feedback.js # feedback lifecycle operations
362
+ │ ├── list-agents/ # utility
363
+ │ ├── refresh-agents/
364
+ │ └── upgrade-foundry/
365
+ ├── scripts/
366
+ │ ├── lib/ # shared libraries (injectable I/O)
367
+ │ │ ├── workfile.js # WORK.md frontmatter
368
+ │ │ ├── artefacts.js # artefact table ops
369
+ │ │ ├── history.js # WORK.history.yaml ops
370
+ │ │ ├── feedback.js # feedback lifecycle
371
+ │ │ ├── feedback-transitions.js
372
+ │ │ ├── finalize.js # stage_finalize implementation
373
+ │ │ ├── stage-guard.js # stage-lock preconditions
374
+ │ │ ├── token.js # HMAC token mint/verify
375
+ │ │ ├── secret.js # .foundry/.secret handling
376
+ │ │ ├── pending.js # active-stage state
377
+ │ │ ├── state.js # .foundry state dir
211
378
  │ │ ├── config.js # foundry/ config readers
212
- │ │ └── tags.js # tag extraction
213
- │ └── sort.js # deterministic routing engine (exports runSort)
214
- ├── tests/ # test suite (node:test)
215
- ├── docs/ # concept docs and specs
216
- ├── package.json
379
+ │ │ ├── tags.js # feedback tag extraction
380
+ │ │ └── slug.js
381
+ │ ├── orchestrate.js # orchestration loop (exports runOrchestrate)
382
+ │ └── sort.js # routing engine (exports runSort)
383
+ ├── tests/ # node:test suite
384
+ ├── docs/ # concepts, getting-started, work-spec
385
+ ├── CHANGELOG.md
217
386
  └── README.md
218
387
  ```
219
388
 
220
- ## User project structure
221
-
222
- After running `init-foundry`, your project gets a `foundry/` directory:
389
+ ### User project (after `init-foundry`)
223
390
 
224
391
  ```
225
392
  your-project/
@@ -229,47 +396,71 @@ your-project/
229
396
  │ ├── artefacts/ # artefact type definitions
230
397
  │ │ └── <type>/
231
398
  │ │ ├── definition.md
232
- │ │ ├── laws.md # (optional) type-specific laws
233
- │ │ └── validation.md # (optional) CLI checks
399
+ │ │ ├── laws.md # optional
400
+ │ │ └── validation.md # optional
234
401
  │ ├── laws/ # global laws
235
402
  │ └── appraisers/ # appraiser personalities
403
+ ├── .foundry/ # runtime state (gitignored)
404
+ │ └── .secret # per-worktree HMAC key (mode 0600)
405
+ ├── .opencode/
406
+ │ └── agents/
407
+ │ └── foundry-*.md # generated by refresh-agents
236
408
  ├── opencode.json
237
409
  └── ...
238
410
  ```
239
411
 
412
+ During a flow, a work branch also contains `WORK.md` and `WORK.history.yaml` at the repo root. Both are ephemeral — delete them before squash-merging.
413
+
414
+ ---
415
+
240
416
  ## Design decisions
241
417
 
242
418
  ### Everything is markdown
243
419
 
244
- Flow definitions, cycle definitions, artefact types, laws, appraiser personalities, skills — all markdown. Readable by humans, consumable by LLMs, versionable in git. No config files, no databases, no custom formats.
420
+ Flows, cycles, artefact types, laws, appraiser personalities, skills — all markdown with YAML frontmatter. Readable by humans, consumable by LLMs, diff-able in git. No bespoke formats, no databases.
245
421
 
246
422
  ### Skills are the pipeline, tools are the machinery
247
423
 
248
- Composition happens via skills referencing other skills. The `flow` skill reads a flow definition and invokes the `cycle` skill. The `cycle` skill invokes `forge`, `quench`, and `appraise`. Skills handle creative and subjective work; deterministic operations (parsing, routing, state updates) are handled by custom tools backed by shared library code.
424
+ Composition happens at the skill layer. `flow` reads a definition and invokes `orchestrate`. `orchestrate` calls `foundry_orchestrate` in a loop. The hard guarantees (routing, commits, state transitions, enforcement) live inside the plugin's custom tools and the libraries under `scripts/lib/`. Skills handle creative and subjective work; tools handle everything else.
249
425
 
250
426
  ### WORK.md as shared state
251
427
 
252
- All communication between stages goes through WORK.md. No stage passes output directly to another — all reads and writes go through the `foundry_workfile_*`, `foundry_artefacts_*`, and `foundry_feedback_*` tools. This gives a complete audit trail, makes the process resumable, and means any stage can be re-run independently.
428
+ All inter-stage communication goes through WORK.md via the `foundry_workfile_*`, `foundry_artefacts_*`, `foundry_feedback_*`, and `foundry_history_*` tools. No stage passes output directly to another. This gives a complete audit trail, makes flows resumable after a crash, and lets any stage be re-run independently.
429
+
430
+ ### Cycles own their routing
431
+
432
+ A flow declares starting points; individual cycles declare `targets` and input contracts. The flow skill walks the resulting graph. This keeps cycles composable across flows and prevents the flow file from becoming a procedural monolith.
253
433
 
254
- ### Feedback as checklist items
434
+ ### Feedback as checklists
255
435
 
256
- Feedback uses markdown checklists with `#validation` or `#law:<id>` tags. Human-readable, trivially parseable by an LLM, with lifecycle states expressed inline.
436
+ Markdown checkboxes with `#validation`, `#law:<id>`, or `#human` tags. Human-readable, trivially parseable, lifecycle encoded inline. Feedback is append-only; history is part of the artefact's story.
257
437
 
258
- ### Wont-fix requires appraiser approval
438
+ ### Wont-fix requires approval
259
439
 
260
- The generator can decline subjective feedback with a justification, but an appraiser must approve or reject that decision. This prevents silently ignoring feedback while allowing legitimate pushback.
440
+ A forge sub-agent can decline subjective feedback with a justification, but an appraiser must approve or reject that decision on the next iteration. Validation and human feedback cannot be wont-fixed.
261
441
 
262
- ### Multi-model stage routing
442
+ ### Multi-model diversity
263
443
 
264
- Cycle definitions specify which model each stage uses via a `models` map. The `refresh-agents` skill generates `foundry-*` agent files in `.opencode/agents/` from available models. Individual appraisers can override the cycle-level model. Resolution order: appraiser `model` → cycle `models.<stage>` → session default. Multiple personalities catch different issues. Consolidation is union with dedup — one appraiser flagging an issue is enough.
444
+ Cycle definitions specify per-stage models; individual appraisers may override. Different models catch different issues; consolidation is a union. One appraiser flagging an issue is enough to raise it.
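The override chain can be sketched as a small lookup. This is only an illustration, assuming the resolution order appraiser `model`, then cycle `models.<stage>`, then the session default; the function name `resolveModel` and its argument shape are hypothetical and not part of Foundry's API.

```javascript
// Hypothetical sketch: pick the model for a stage.
// Assumed order: appraiser.model -> cycle models.<stage> -> session default.
function resolveModel({ stage, cycleModels = {}, appraiser = null, sessionDefault }) {
  // Only appraise stages have a per-appraiser override.
  if (stage === "appraise" && appraiser && appraiser.model) {
    return appraiser.model;
  }
  // Fall back to the cycle-level map, then to the session default.
  return cycleModels[stage] ?? sessionDefault;
}
```

An appraiser without a `model` field falls through to the cycle-level entry, and a stage missing from the cycle's `models` map falls through to the session default.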
265
445
 
266
446
  ### Input artefacts are read-only
267
447
 
268
- When a cycle reads from a previous cycle's output, those files cannot be modified. Enforced via git diff after every micro-commit. This prevents downstream cycles from corrupting upstream work.
448
+ When a cycle reads from another cycle's output, those files cannot be modified. Enforced via `stage_finalize` and `sort`'s diff check. Downstream cycles cannot corrupt upstream work.
269
449
 
270
450
  ### Glob patterns must not overlap
271
451
 
272
- Two artefact types cannot have file patterns that match the same files. This is checked when creating new types and is a hard block — file modification enforcement can't determine ownership if patterns overlap.
452
+ Two artefact types cannot have file patterns that match the same files. Hard-blocked at creation time; the file-ownership rule doesn't have a meaningful answer otherwise.
453
+
454
+ ---
455
+
456
+ ## Further reading
457
+
458
+ - [docs/concepts.md](docs/concepts.md) — every concept defined concisely.
459
+ - [docs/getting-started.md](docs/getting-started.md) — end-to-end walkthrough.
460
+ - [docs/work-spec.md](docs/work-spec.md) — the full WORK.md + WORK.history.yaml spec.
461
+ - [CHANGELOG.md](CHANGELOG.md) — version history and migration notes.
462
+
463
+ ---
273
464
 
274
465
  ## License
275
466
 
package/docs/concepts.md CHANGED
@@ -1,59 +1,122 @@
1
1
  # Concepts
2
2
 
3
- Core concepts and how they relate.
3
+ This is the glossary. Every term here has a single definition and links out to the spec document that elaborates it. Concepts are arranged roughly top-down: flows contain cycles, cycles contain stages, stages operate on artefacts, artefacts are governed by laws and evaluated by appraisers.
4
4
 
5
- ## Foundry Flow
5
+ ---
6
6
 
7
- A foundry flow is the top-level unit of work. It is defined in `foundry/flows/` and lists the foundry cycles to execute in order. Starting a foundry flow creates a work branch and a WORK.md file. A foundry flow is complete when all its foundry cycles are done.
7
+ ## Flow
8
8
 
9
- ## Foundry Cycle
9
+ The top-level unit of work. Defined in `foundry/flows/*.md`. A flow declares:
10
10
 
11
- A foundry cycle is an iterative loop that produces a single artefact type. It is defined in `foundry/cycles/` and specifies:
12
- - An output artefact type (read-write)
13
- - Zero or more input artefact types (read-only, from previous foundry cycles)
11
+ - A `starting-cycles` list: hints about which cycles can be entered first when the flow begins.
12
+ - A set of cycles (listed under `## Cycles`). Order is not implied — routing between cycles is owned by cycles themselves via their `targets` field.
14
13
 
15
- A foundry cycle runs: forge → quench → appraise, repeating until all feedback is resolved or the iteration limit is hit.
14
+ Running a flow creates a work branch and a `WORK.md`. The flow completes when no more reachable cycles remain to run, or when the user decides to stop.
15
+
16
+ ## Cycle
17
+
18
+ An iterative loop that produces a single artefact type. Defined in `foundry/cycles/*.md`. A cycle declares:
19
+
20
+ - `output` — the artefact type it produces (read-write).
21
+ - `inputs` — a contract (`any-of` / `all-of`) over other artefact types. Inputs are discovered on disk; they are read-only unless the output type's patterns happen to cover them.
22
+ - `targets` — the cycle(s) that may run after this one. May be empty (terminal cycle).
23
+ - `human-appraise` — whether a human quality gate runs every iteration (default: `false`).
24
+ - `deadlock-appraise` — whether a human is pulled in when LLM appraisers deadlock (default: `true`).
25
+ - `deadlock-iterations` — deadlock threshold (default: `5`).
26
+ - `models` — optional per-stage model overrides.
27
+
28
+ A cycle runs **forge → quench → appraise** (and optionally **human-appraise**), looping until all feedback is resolved or `max-iterations` is hit.
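As a rough sketch of the `inputs` contract, the check below assumes the contract is an object with a `type` of `any-of` or `all-of` and an `artefacts` list, and that an empty or missing contract is always satisfied (as for starting cycles). `contractSatisfied` and `typesOnDisk` are hypothetical names used for illustration only.

```javascript
// Hypothetical sketch: decide whether a cycle's input contract is met,
// given the set of artefact types that were discovered on disk.
function contractSatisfied(inputs, typesOnDisk) {
  // No declared inputs: nothing to satisfy (assumed for starting cycles).
  if (!inputs || !Array.isArray(inputs.artefacts) || inputs.artefacts.length === 0) {
    return true;
  }
  const present = (type) => typesOnDisk.has(type);
  // all-of requires every listed type; any-of requires at least one.
  return inputs.type === "all-of"
    ? inputs.artefacts.every(present)
    : inputs.artefacts.some(present);
}
```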
16
29
 
17
30
  ## Stage
18
31
 
19
- The steps within a foundry cycle. Each stage is referenced using a `base:alias` format (e.g. `forge:write-haiku`) where the base is the stage type and the alias describes its role in that cycle.
32
+ A single step within a cycle. Every stage is referenced as `base:alias` (e.g. `forge:write-haiku`, `quench:check-syllables`): the base is the stage type; the alias makes the stage's role self-documenting in WORK.md.
20
33
 
21
- - Forge — produce or revise the artefact
22
- - Quench — run deterministic CLI checks
23
- - Appraise: subjective evaluation by multiple appraisers
24
- - HITL: human-in-the-loop checkpoint (see below)
34
+ Stage bases:
35
+
36
+ - **forge**: produce or revise the artefact.
37
+ - **quench**: run deterministic CLI checks (skipped if the artefact type has no `validation.md`).
38
+ - **appraise** — subjective evaluation by multiple appraiser sub-agents.
39
+ - **human-appraise** — human quality gate. Can run every iteration, only on deadlock, or both.
40
+
41
+ Every stage runs inside a token-gated lifecycle (`foundry_stage_begin` / `foundry_stage_end` / `foundry_stage_finalize`). Mutation tools are stage-locked: a forge stage can't add feedback, a quench stage can't register artefacts. See the enforcement section of the [README](../README.md#enforcement-model).
25
42
 
26
43
  ## Artefact type
27
44
 
28
- A definition of what kind of thing is being produced. Lives in `foundry/artefacts/<type>/` with:
29
- - `definition.md` — identity, file patterns, output location, prose description
30
- - `laws.md` — type-specific subjective evaluation criteria
31
- - `validation.md` — CLI commands for deterministic quench checks
45
+ A definition of what is being produced. Lives in `foundry/artefacts/<type>/`:
46
+
47
+ - `definition.md` — identity, file patterns, output directory, appraiser config, prose description.
48
+ - `laws.md` *(optional)*: type-specific subjective criteria.
49
+ - `validation.md` *(optional)* — CLI commands for deterministic quench checks.
50
+
51
+ File patterns must not overlap with any other artefact type's patterns — the write-invariant enforcer needs to know which type owns a given file.
32
52
 
33
53
  ## Law
34
54
 
35
- A subjective pass/fail criterion. Global laws live in `foundry/laws/` (all files concatenated). Type-specific laws live in `foundry/artefacts/<type>/laws.md`. Each law has an identifier (its heading), used in feedback tags.
55
+ A subjective pass/fail criterion. Two scopes:
56
+
57
+ - **Global** — `foundry/laws/*.md`, all files concatenated, applies to every artefact.
58
+ - **Type-specific** — `foundry/artefacts/<type>/laws.md`.
59
+
60
+ Each law is a `## heading` (its identifier, used in feedback tags as `#law:<id>`) with a description, passing criteria, and failing criteria.
36
61
 
37
62
  ## Appraiser
38
63
 
39
- An independent evaluator with a defined personality. Lives in `foundry/appraisers/`. Each appraiser can optionally specify a `model` to override the cycle-level appraise model. Model diversity is configured at the cycle level (via the `models` frontmatter map) and optionally per-appraiser. They can be assigned to specific artefact types or appraise everything.
64
+ An independent evaluator with a defined personality. Lives in `foundry/appraisers/*.md`. Appraisers may specify a `model` field to override the cycle-level appraise model. Each artefact type picks which appraisers may evaluate it (`appraisers.allowed`) and how many run per iteration (`appraisers.count`). Selection distributes evenly across allowed personalities.
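One way to read "distributes evenly" is a round-robin over the allowed personalities. The sketch below is an assumption about that behaviour, not Foundry's actual selection code; `selectAppraisers` is a hypothetical name, and `allowed` and `count` stand in for `appraisers.allowed` and `appraisers.count`.

```javascript
// Hypothetical sketch: pick `count` appraisers by cycling through the
// allowed personalities so each is used as evenly as possible.
function selectAppraisers(allowed, count) {
  if (allowed.length === 0) return [];
  return Array.from({ length: count }, (_, i) => allowed[i % allowed.length]);
}
```

With two allowed personalities and a count of three, the first personality is used twice and the second once.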
40
65
 
41
66
  ## WORK.md
42
67
 
43
- The transient shared state for a foundry flow. Created on the work branch, it tracks: where the foundry flow is (frontmatter cursor), what artefacts exist, and all feedback with its full lifecycle. See [work-spec.md](work-spec.md) for the full spec.
68
+ The transient shared state for a flow. Created on the work branch by the flow skill, it tracks:
69
+
70
+ - Current position (flow, cycle, stage list, iteration limits) in frontmatter.
71
+ - The goal (prose — written once).
72
+ - An artefact registry (file, type, cycle, status).
73
+ - All feedback with its full lifecycle.
74
+
75
+ See [work-spec.md](work-spec.md) for the full spec.
76
+
77
+ ## WORK.history.yaml
78
+
79
+ Append-only log of every stage execution, sitting next to WORK.md. Used by sort to reconstruct what has happened in the current cycle. See [work-spec.md](work-spec.md).
44
80
 
45
81
  ## Feedback
46
82
 
47
- The communication mechanism between stages. Written as markdown checklist items in WORK.md with tags (`#validation` or `#law:<id>`). Follows a lifecycle: open → actioned/wont-fix → approved/rejected. See [work-spec.md](work-spec.md) for details.
83
+ The communication mechanism between stages. Written as markdown checklist items in WORK.md, grouped by artefact file, tagged by source:
84
+
85
+ - `#validation` — from a deterministic quench command. Cannot be wont-fixed.
86
+ - `#law:<law-id>` — from an appraiser, tied to a specific law. May be wont-fixed with justification.
87
+ - `#human` — from a human-appraise stage. Takes absolute priority; cannot be wont-fixed.
88
+
89
+ Lifecycle: `open` → `actioned` / `wont-fix` → `approved` / `rejected`. `approved` is terminal; `rejected` re-opens. Items are never deleted.
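The lifecycle and the wont-fix restrictions can be sketched as a transition table. This is illustrative only: `canTransition` and the item shape (`state`, `tag`) are hypothetical, and "`rejected` re-opens" is modelled here as a rejected item accepting the same transitions as an open one.

```javascript
// Hypothetical sketch of the feedback lifecycle as a transition table.
const TRANSITIONS = {
  open: ["actioned", "wont-fix"],
  actioned: ["approved", "rejected"],
  "wont-fix": ["approved", "rejected"],
  rejected: ["actioned", "wont-fix"], // assumed: rejected behaves like open again
  approved: [], // terminal
};

// Tags whose items may never be wont-fixed.
const NO_WONT_FIX = ["#validation", "#human"];

function canTransition(item, next) {
  if (next === "wont-fix" && NO_WONT_FIX.includes(item.tag)) return false;
  return (TRANSITIONS[item.state] || []).includes(next);
}
```

A `#law:<id>` item can move from `open` to `wont-fix`, while a `#validation` or `#human` item in the same state cannot.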
90
+
91
+ ## HITL / human-appraise
92
+
93
+ Human-in-the-loop checkpoint. A stage where Foundry pauses and asks a human for input. Two triggers:
94
+
95
+ 1. **Every-iteration** — the cycle declares `human-appraise: true`. The `human-appraise` stage runs after LLM appraise each iteration.
96
+ 2. **Deadlock** — the cycle declares `deadlock-appraise: true` (default). If forge and appraisers ping-pong on the same items for `deadlock-iterations` (default 5) iterations, sort inserts a `human-appraise` stage to break the tie.
48
97
 
49
- ## HITL
98
+ Human feedback is tagged `#human` and takes priority over LLM feedback on the same topic.
50
99
 
51
- Human-in-the-loop checkpoint. A stage type that pauses the foundry cycle and requests human input before continuing. Configured per cycle by including a `hitl:alias` entry in the `stages` list. When a hitl stage runs, it presents the current artefact state to the human and collects feedback tagged `#hitl`. Like other feedback, hitl feedback follows the standard lifecycle (open → actioned → approved/rejected).
100
+ ## Micro-commit
52
101
 
53
- ## Micro commit
102
+ Every stage ends with a commit made by the orchestrator. This enables two things: file-modification enforcement (the write-invariant check compares the stage's diff to its allowed patterns) and recoverability (a crash mid-flow leaves a clean commit boundary to resume from). Orchestration refuses to proceed if uncommitted work is lingering in `WORK.md`, `WORK.history.yaml`, or `.foundry/`.
54
103
 
55
- Every stage ends with a commit (via the `foundry_git_commit` tool). This enables file modification enforcement — the sort tool checks the git diff to ensure each stage only touched files it was allowed to.
104
+ ## Stage token
105
+
106
+ A single-use HMAC-signed string, minted by `foundry_orchestrate` when a stage is dispatched. The sub-agent must redeem the token via `foundry_stage_begin`; mutation tools then check the active stage matches their role. Keys live in `.foundry/.secret` (mode 0600, gitignored, one per worktree). This prevents out-of-band mutations, replayed stages, and sub-agents skipping the lifecycle.
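To make the idea concrete, here is a minimal sketch of a single-use HMAC-signed token using Node's built-in `node:crypto`. The token layout (`stage.nonce.signature`), the helper names `mintToken` and `redeemToken`, and the in-memory `usedNonces` set are all assumptions for illustration; the real token format is not described here. The sketch also assumes stage names contain no `.` character.

```javascript
const { createHmac, randomBytes, timingSafeEqual } = require("node:crypto");

// Hypothetical sketch: sign "<stage>.<nonce>" with the per-worktree secret.
function mintToken(secret, stage) {
  const nonce = randomBytes(16).toString("hex");
  const payload = `${stage}.${nonce}`;
  const sig = createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}.${sig}`;
}

// Returns the stage name if the token is authentic and unused, else null.
function redeemToken(secret, token, usedNonces) {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [stage, nonce, sig] = parts;
  const expected = createHmac("sha256", secret).update(`${stage}.${nonce}`).digest("hex");
  const given = Buffer.from(sig, "hex");
  const wanted = Buffer.from(expected, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== wanted.length || !timingSafeEqual(given, wanted)) return null;
  if (usedNonces.has(nonce)) return null; // replayed token
  usedNonces.add(nonce);
  return stage;
}
```

Redeeming the same token twice fails on the second attempt, and a token whose stage name was altered fails the signature check.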
107
+
108
+ ## `.foundry/` state directory
109
+
110
+ A gitignored directory created on first plugin boot, holding runtime state:
111
+
112
+ - `.secret` — the HMAC key.
113
+ - `active-stage.json` — present only during an active stage.
114
+ - `last-stage.json` — used by `foundry_stage_finalize` after `stage_end`.
56
115
 
57
116
  ## Custom tools
58
117
 
59
- All deterministic pipeline operations are exposed as custom tools via the Foundry plugin. Skills call tools instead of manipulating files directly. The tools are backed by shared library modules in `scripts/lib/` with injectable I/O for testability. This separation ensures that file format parsing, state transitions, and routing logic are handled by tested code rather than LLM interpretation.
118
+ All deterministic pipeline operations are exposed as custom tools by the Foundry plugin. Skills call these tools instead of manipulating files directly. Tools are backed by shared library modules in `scripts/lib/` with injectable I/O so they can be unit-tested. This separation ensures state transitions and routing logic are tested code, not LLM interpretation. See the [README](../README.md#custom-tools) for the full catalogue.
119
+
120
+ ## Skill
121
+
122
+ A self-contained workflow written as markdown with YAML frontmatter. Foundry ships pipeline skills (`flow`, `orchestrate`, `forge`, `quench`, `appraise`, `human-appraise`), authoring skills (`add-*`, `init-foundry`), and utility skills (`list-agents`, `refresh-agents`, `upgrade-foundry`). Skills are either **atomic** (do one thing) or **composite** (orchestrate other skills).
package/docs/getting-started.md CHANGED
@@ -1,78 +1,187 @@
1
1
  # Getting Started
2
2
 
3
- How to set up and run your first foundry flow.
3
+ End-to-end walkthrough for setting up Foundry and running your first flow.
4
+
5
+ ---
4
6
 
5
7
  ## Prerequisites
6
8
 
7
- - Git repository initialised
8
- - Node.js available (for validation scripts)
9
- - An AI coding tool that supports skills (OpenCode, Claude Code, Copilot CLI, etc.)
9
+ - A git repository initialised with a clean working tree.
10
+ - Node.js 18.3.0 or later (for the plugin and validation scripts).
11
+ - [OpenCode](https://opencode.ai) (primary target; multi-model routing relies on OpenCode's agent files).
12
+
13
+ ## Install
14
+
15
+ Add Foundry to `opencode.json`:
16
+
17
+ ```json
18
+ {
19
+ "packages": {
20
+ "@really-knows-ai/foundry": "latest"
21
+ }
22
+ }
23
+ ```
24
+
25
+ Restart OpenCode (or reload plugins) so the plugin registers its tools and skills.
26
+
27
+ ## Initialize
28
+
29
+ In your project, invoke the `init-foundry` skill. It:
30
+
31
+ 1. Creates the `foundry/` directory structure:
32
+ ```
33
+ foundry/
34
+ artefacts/.gitkeep
35
+ flows/.gitkeep
36
+ cycles/.gitkeep
37
+ laws/.gitkeep
38
+ appraisers/.gitkeep
39
+ ```
40
+ 2. Runs `refresh-agents` to generate `.opencode/agents/foundry-*.md` — one per available model — so cycles can dispatch to specific models later.
41
+ 3. Commits the scaffolding.
42
+
43
+ The `.foundry/` runtime directory (holding `.secret` for stage tokens) is created automatically on first plugin boot and added to `.gitignore`.
10
44
 
11
- ## Step by step
45
+ ---
46
+
47
+ ## Author the configuration
48
+
49
+ Foundry's configuration is five things: artefact types, laws, appraisers, cycles, and flows. You can write the files by hand, but the authoring skills do conflict checking, scaffolding, and validation — use them.
12
50
 
13
51
  ### 1. Define an artefact type
14
52
 
15
- Create a directory under `foundry/artefacts/` with three files:
53
+ Run `add-artefact-type`. It walks you through:
16
54
 
17
- ```
18
- foundry/artefacts/my-type/
19
- definition.md # what it is, file patterns, output location
20
- laws.md # subjective laws (optional)
21
- validation.md # CLI validation commands (optional)
22
- ```
55
+ - `id` (lowercase, hyphenated), `name`, prose description.
56
+ - `file-patterns` — glob patterns describing which files this type owns. The skill refuses patterns that overlap with existing types.
57
+ - `output-dir`: where forge should write new files.
58
+ - Appraiser config — how many appraisers evaluate this type and which personalities are allowed.
59
+ - Optional `laws.md`: type-specific criteria.
60
+ - Optional `validation.md` — CLI commands for quench (non-zero exit = failure).
23
61
 
24
- Use the `init-foundry` skill to scaffold the `foundry/` directory, then use `add-artefact-type` to create your first artefact type interactively — or create the directory structure above manually.
62
+ Produces `foundry/artefacts/<id>/definition.md` (+ optional `laws.md`, `validation.md`).
25
63
 
26
64
  ### 2. Write laws
27
65
 
28
- Add global laws to any `.md` file in `foundry/laws/`. Add type-specific laws to `foundry/artefacts/<type>/laws.md`.
66
+ Laws are subjective pass/fail criteria evaluated by appraisers. Two scopes:
67
+
68
+ - **Global** — `foundry/laws/*.md`. All files are concatenated and apply to every artefact.
69
+ - **Type-specific** — `foundry/artefacts/<type>/laws.md`.
29
70
 
30
- Each law is a `##` heading with: a description, what passing looks like, and what failing looks like.
71
+ Run `add-law` to create one with conflict detection. Each law is a `## heading` (its identifier, referenced as `#law:<id>` in feedback) with a description, passing criteria, and failing criteria.
31
72
 
32
- ### 3. Define a foundry cycle
73
+ ### 3. Create appraisers
33
74
 
34
- Create a file in `foundry/cycles/` that specifies what artefact type the foundry cycle produces and what inputs it reads:
75
+ Appraisers are independent evaluators with named personalities. Run `add-appraiser`. Each appraiser may override the cycle-level appraise model via a `model` field. Artefact types pick which appraisers may evaluate them (`appraisers.allowed`).
35
76
 
36
- ```yaml
77
+ ### 4. Define a cycle
78
+
79
+ Run `add-cycle`. A cycle produces one artefact type and declares:
80
+
81
+ - `output` — the artefact type (must already exist).
82
+ - `inputs` — a contract (`any-of` or `all-of`) over other types. Empty for starting cycles.
83
+ - `targets` — the cycle(s) that may run after this one. Empty for terminal cycles.
84
+ - `human-appraise` / `deadlock-appraise` / `deadlock-iterations` — human-gate config.
85
+ - `models` — optional per-stage model overrides.
86
+
87
+ Example:
88
+
89
+ ```markdown
37
90
  ---
38
- id: my-cycle
39
- name: My Cycle
40
- output: my-type
41
- inputs: []
91
+ id: haiku-creation
92
+ name: Haiku Creation
93
+ output: haiku
94
+ inputs:
95
+ type: any-of
96
+ artefacts:
97
+ - petition
98
+ targets: []
99
+ human-appraise: false
100
+ deadlock-appraise: true
101
+ deadlock-iterations: 5
102
+ models:
103
+ appraise: openai/gpt-5
42
104
  ---
105
+
106
+ # Haiku Creation
107
+
108
+ Writes a haiku satisfying the petition produced by haiku-ideation.
43
109
  ```
44
110
 
45
- Cycles list their stages using `base:alias` format, e.g. `forge:write-haiku`, `quench:check-syllables`. The alias makes each stage's purpose clear when reading WORK.md. You can also include `hitl:alias` stages for human-in-the-loop checkpoints.
111
+ The skill validates that every input type can be produced by some cycle in the flow and that targets are reachable.
46
112
 
47
- ### 4. Define a foundry flow
113
+ ### 5. Define a flow
48
114
 
49
- Create a file in `foundry/flows/` that lists foundry cycles in order:
115
+ Run `add-flow`. A flow groups cycles and declares starting points:
50
116
 
51
117
  ```markdown
52
118
  ---
53
- id: my-flow
54
- name: My Flow
119
+ id: make-haiku
120
+ name: Make a Haiku
121
+ starting-cycles:
122
+ - haiku-ideation
55
123
  ---
56
124
 
57
- # My Flow
125
+ # Make a Haiku
58
126
 
59
- Description of what this flow produces.
127
+ End-to-end flow: petition → haiku, with a human quality gate.
60
128
 
61
129
  ## Cycles
62
130
 
63
- 1. my-cycle
131
+ - haiku-ideation
132
+ - haiku-creation
64
133
  ```
65
134
 
66
- ### 5. Run the foundry flow
135
+ Routing between cycles is owned by individual cycles via their `targets`, not by the flow.
136
+
137
+ ---
138
+
139
+ ## Run the flow
140
+
141
+ Tell OpenCode something like:
142
+
143
+ > Run the `make-haiku` flow to write a haiku about autumn rain.
144
+
145
+ The `flow` skill will:
146
+
147
+ 1. Check prerequisites and pick a starting cycle — matching your prose to a cycle's output type. If the request is ambiguous, it prompts (defaulting to `starting-cycles`). If a cycle's input contract can't be satisfied from files on disk, it won't be chosen.
148
+ 2. Create a work branch and scaffold `WORK.md` with the goal.
149
+ 3. Hand off to `orchestrate`, which drives the cycle:
150
+ - **forge** writes the artefact.
151
+ - **quench** runs CLI validators (if configured).
152
+ - **appraise** dispatches parallel appraiser sub-agents and consolidates their `#law:<id>` feedback.
153
+ - **human-appraise** (if configured, or on deadlock) asks you for input.
154
+ - If any unresolved feedback remains, another forge iteration begins.
155
+ 4. When the cycle completes, the flow skill checks the cycle's `targets`. If a target's input contract is satisfied, it asks whether to proceed.
156
+ 5. When all desired cycles are done, the flow skill summarises the output and asks how to finish — squash-merge, PR, or leave the branch.
157
+
158
+ Every stage ends with a micro-commit. Violations of the write invariant (writing to disallowed files) hard-stop the cycle.
159
+
160
+ ---
161
+
162
+ ## Inspecting progress
163
+
164
+ While a flow is running, the state of the world is in three places:
67
165
 
68
- Tell your AI tool to start the foundry flow. It will create a work branch, initialise WORK.md, and begin executing foundry cycles.
166
+ - `WORK.md`: current cycle, goal, artefact table, all feedback with full lifecycle.
167
+ - `WORK.history.yaml` — append-only log of every stage execution.
168
+ - `git log` — one commit per stage.
169
+
170
+ You can pause and resume: if the flow skill sees an existing `WORK.md` when you start, it asks whether to resume, discard, or abort. Resume is only offered if the existing flow and cycle match the current request.
171
+
172
+ ---
173
+
174
+ ## Cleaning up
175
+
176
+ Before squash-merging the work branch back into main, **delete `WORK.md` and `WORK.history.yaml`** — they're ephemeral per-flow state, not artefacts. `.foundry/` is gitignored and doesn't need cleanup.
177
+
178
+ If you used `foundry_git_finish`, it handles this for you.
179
+
180
+ ---
69
181
 
70
- ## What happens during a foundry flow
182
+ ## Next steps
71
183
 
72
- 1. The foundry flow skill creates a branch and WORK.md
73
- 2. For each foundry cycle:
74
- - Forge produces the artefact
75
- - Quench runs CLI commands (if defined)
76
- - Appraise dispatches sub-agent appraisers against the laws
77
- - If feedback exists, forge revises and the foundry cycle repeats
78
- 3. When all foundry cycles complete, the human decides to merge, PR, or discard
184
+ - [docs/concepts.md](concepts.md): concise glossary.
185
+ - [docs/work-spec.md](work-spec.md): full WORK.md spec.
186
+ - [README.md](../README.md): architecture, enforcement, design decisions.
187
+ - [CHANGELOG.md](../CHANGELOG.md): version history.
package/docs/work-spec.md CHANGED
@@ -10,23 +10,31 @@ flow: <flow-id>
10
10
  cycle: <current-cycle-id>
11
11
  stages: [forge:write-haiku, quench:check-syllables, appraise:evaluate-quality]
12
12
  max-iterations: 3
13
+ human-appraise: false
14
+ deadlock-appraise: true
15
+ deadlock-iterations: 5
16
+ models:
17
+ forge: anthropic/claude-opus-4.7
18
+ appraise: openai/gpt-5
13
19
  ---
14
20
  ```
15
21
 
16
22
  Fields:
17
- - `flow` — the foundry flow being executed
18
- - `cycle` — the current foundry cycle id
19
- - `stages` — the ordered route for this foundry cycle, set when the foundry cycle starts. Each entry uses `base:alias` format where `base` is the stage type (`forge`, `quench`, `appraise`, or `hitl`) and `alias` is a human-readable name for what that stage does in this cycle. Determined from the artefact type: if `validation.md` exists, include `quench`; always include `forge` and `appraise`. A `hitl` stage can be included for human-in-the-loop checkpoints.
20
- - `max-iterations` — how many forge passes before the foundry cycle is blocked (default: 3)
23
+ - `flow` — the foundry flow being executed.
24
+ - `cycle` — the current cycle id.
25
+ - `stages` — the ordered route for this cycle. Each entry uses `base:alias` format where `base` is the stage type (`forge`, `quench`, `appraise`, or `human-appraise`) and `alias` is a human-readable name for what that stage does in this cycle. Derived from the cycle and artefact type: `forge` + `appraise` are always included, `quench` is included iff the artefact type has `validation.md`, `human-appraise` is included iff the cycle sets `human-appraise: true`.
26
+ - `max-iterations` — how many forge passes before the cycle is blocked (default: 3).
27
+ - `human-appraise` — run human-appraise every iteration (default: `false`).
28
+ - `deadlock-appraise` — route to human-appraise when LLM appraisers deadlock (default: `true`).
29
+ - `deadlock-iterations` — deadlock threshold (default: 5).
30
+ - `models` — optional per-stage model overrides; individual appraisers may further override via their own `model` field.
21
31
 
22
- The `stages` list is the happy path. Sort follows it but loops back to `forge` when unresolved feedback demands it.
32
+ The `stages` list is the happy path. Sort follows it but loops back to `forge` when unresolved feedback demands it, and inserts a `human-appraise` stage on deadlock.
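The derivation rule for `stages` can be sketched as follows. This is a hedged illustration, not the plugin's code: `deriveStages` is a hypothetical name, the `aliases` map is an assumed input (where aliases actually come from is not specified here), and the ordering with `human-appraise` last follows the statement that it runs after LLM appraise.

```javascript
// Hypothetical sketch: build the happy-path stage list for a cycle.
// forge and appraise are always present; quench depends on validation.md;
// human-appraise depends on the cycle's `human-appraise: true` setting.
function deriveStages({ hasValidation, humanAppraise, aliases }) {
  const stages = [`forge:${aliases.forge}`];
  if (hasValidation) stages.push(`quench:${aliases.quench}`);
  stages.push(`appraise:${aliases.appraise}`);
  if (humanAppraise) stages.push(`human-appraise:${aliases["human-appraise"]}`);
  return stages;
}
```

Called with the haiku aliases, validation present, and no human gate, this yields the three-entry list shown in the frontmatter example.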
23
33
 
24
34
  ### Who sets what
25
35
 
26
- - `flow` — set by the foundry flow skill at foundry flow start, never changes
27
- - `cycle` — set by the foundry flow skill when starting each foundry cycle
28
- - `stages` — set by the orchestrate skill when starting each foundry cycle (reads artefact type to determine if quench is needed)
29
- - `max-iterations` — set by the orchestrate skill (default 3, could be overridden in foundry cycle definition)
36
+ - `flow`, `cycle`, `goal` — set by the `flow` skill via `foundry_workfile_create` at flow/cycle boundaries.
37
+ - `stages`, `max-iterations`, `human-appraise`, `deadlock-appraise`, `deadlock-iterations`, `models` — set by `foundry_orchestrate` on the first call of each cycle (via internal `workfile_configure_from_cycle`, reading the cycle definition).
30
38
 
31
39
  ## Sections
32
40
 
@@ -70,7 +78,7 @@ Grouped by artefact file path. Each item is a checklist entry with a tag indicat
70
78
 
71
79
  - `#validation` — from a deterministic quench command
72
80
  - `#law:<law-id>` — from subjective appraise, tied to a specific law
73
- - `#hitl` — from human-provided feedback at a hitl checkpoint
81
+ - `#human` — from human-provided feedback at a human-appraise checkpoint
74
82
 
75
83
  #### Lifecycle states
76
84
 
@@ -86,20 +94,23 @@ Grouped by artefact file path. Each item is a checklist entry with a tag indicat
86
94
 
87
95
  #### Rules
88
96
 
89
- - Validation feedback (`#validation`) cannot be wont-fixed
90
- - Feedback is never deleted — it stays as a record of the iteration history
91
- - New feedback is appended, not inserted
92
- - Items are grouped under the artefact they relate to
97
+ - Validation feedback (`#validation`) cannot be wont-fixed — deterministic rules are not negotiable.
98
+ - Human feedback (`#human`) cannot be wont-fixed — it takes absolute priority over LLM feedback.
99
+ - Feedback is never deleted — it stays as a record of the iteration history.
100
+ - New feedback is appended, not inserted.
101
+ - Items are grouped under the artefact they relate to.
93
102
 
94
103
  ## Who writes what
95
104
 
96
105
  | Section | Written by | Updated by |
97
106
  |---------|-----------|------------|
98
- | Frontmatter (`flow`) | `foundry_workfile_create` (flow skill) | nobody |
99
- | Frontmatter (`cycle`, `stages`, `max-iterations`) | `foundry_workfile_set` (orchestrate skill) | `foundry_workfile_set` (reset on each new cycle) |
107
+ | Frontmatter (`flow`, `cycle`, `goal`) | `foundry_workfile_create` (flow skill) | `foundry_workfile_delete` + re-create between cycles |
108
+ | Frontmatter (`stages`, `max-iterations`, `human-appraise`, `deadlock-appraise`, `deadlock-iterations`, `models`) | `foundry_orchestrate` (first call of each cycle, internally) | reset on each new cycle |
100
109
  | Goal | `foundry_workfile_create` (flow skill) | nobody |
101
- | Artefacts | `foundry_artefacts_add` (forge skill) | `foundry_artefacts_set_status` (orchestrate skill) |
102
- | Feedback | `foundry_feedback_add` (quench/appraise/hitl) | `foundry_feedback_action`/`foundry_feedback_wontfix` (forge), `foundry_feedback_resolve` (quench/appraise/hitl) |
110
+ | Artefacts | `foundry_stage_finalize` (orchestrator, after forge closes) | `foundry_artefacts_set_status` (orchestrator → `done`/`blocked`) |
111
+ | Feedback | `foundry_feedback_add` (quench / appraise / human-appraise) | `foundry_feedback_action` / `foundry_feedback_wontfix` (forge), `foundry_feedback_resolve` (quench / appraise / human-appraise) |
112
+
113
+ Note: `foundry_artefacts_add` no longer exists as a public tool — artefact registration is automatic via `stage_finalize`, which scans the git diff and registers files matching the output type's `file-patterns` as `draft`.
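As a rough illustration of the automatic registration described in the note, the sketch below matches a list of changed paths against an output type's `file-patterns` and marks the matches as `draft`. Everything here is an assumption for illustration: `register_drafts` is a hypothetical helper, the changed paths are passed in rather than read from git, and Python's `fnmatch` semantics may differ from whatever glob matching Foundry uses.

```python
from fnmatch import fnmatchcase


def register_drafts(changed_files, file_patterns):
    """Map each changed file that matches any pattern to the status 'draft'.

    Hypothetical helper: the real stage_finalize logic is not shown in the diff.
    """
    return {
        path: "draft"
        for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in file_patterns)
    }
```

Files that match no pattern are simply left out of the result, which mirrors the idea that only outputs of the declared type are registered.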
103
114
 
104
115
  ## WORK.history.yaml
105
116
 
@@ -141,20 +152,20 @@ A separate file (`WORK.history.yaml`) alongside WORK.md. Append-only log of ever
141
152
 
142
153
  - `timestamp` — ISO 8601 UTC
143
154
  - `cycle` — which foundry cycle this entry belongs to
144
- - `stage` — which stage just completed, in `base:alias` format (e.g. `forge:draft-petition`, `quench:validate-petition`, `appraise:review-petition`, `hitl:human-review`)
155
+ - `stage` — which stage just completed, in `base:alias` format (e.g. `forge:draft-petition`, `quench:validate-petition`, `appraise:review-petition`, `human-appraise:human-review`)
145
156
  - `iteration` — the current iteration number (increments each time forge runs within a cycle)
146
157
  - `comment` — brief description of what happened
147
158
 
148
159
  ### Rules
149
160
 
150
- - Append-only — never edit or delete entries
151
- - Every stage skill appends an entry when it completes
152
- - The sort tool reads this to determine what has happened in the current foundry cycle
153
- - Iteration is derived from counting forge entries for the current foundry cycle
161
+ - Append-only — never edit or delete entries.
162
+ - Every stage produces an entry when it completes.
163
+ - Sort reads this to determine what has happened in the current cycle.
164
+ - Iteration is derived from counting forge entries for the current cycle.
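The last rule can be made concrete with a short sketch that splits the `base:alias` stage format and counts forge entries for one cycle. It assumes the history has already been loaded into a list of dictionaries with the fields listed above; `split_stage` and `current_iteration` are hypothetical names, not Foundry functions.

```python
def split_stage(stage: str) -> tuple[str, str]:
    """Split a 'base:alias' stage string into (base, alias)."""
    base, _, alias = stage.partition(":")
    return base, alias


def current_iteration(entries, cycle: str) -> int:
    """Count forge entries recorded for the given cycle."""
    return sum(
        1
        for entry in entries
        if entry["cycle"] == cycle and split_stage(entry["stage"])[0] == "forge"
    )
```

For example, a history holding two `forge:write-haiku` entries and one `quench:check-syllables` entry for `haiku-creation` gives an iteration of 2 under this reading.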
154
165
 
155
166
  ### Who writes
156
167
 
157
- Every stage skill (forge, quench, appraise, hitl) appends an entry when it finishes via the `foundry_history_append` tool.
168
+ History entries are written by `foundry_orchestrate` after each stage closes (via its internal `foundry_history_append` — the tool is not registered publicly). Sub-agents never append history directly.
158
169
 
159
170
  ## Example
160
171
 
@@ -166,6 +177,9 @@ flow: make-haiku
166
177
  cycle: haiku-creation
167
178
  stages: [forge:write-haiku, quench:check-syllables, appraise:evaluate-quality]
168
179
  max-iterations: 3
180
+ human-appraise: false
181
+ deadlock-appraise: true
182
+ deadlock-iterations: 5
169
183
  ---
170
184
 
171
185
  # Goal
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@really-knows-ai/foundry",
3
- "version": "2.3.1",
3
+ "version": "2.3.2",
4
4
  "description": "A structured framework for AI-driven artefact creation with deterministic routing, quality gates, and iterative refinement cycles.",
5
5
  "type": "module",
6
6
  "main": ".opencode/plugins/foundry.js",
@@ -10,9 +10,15 @@ You help the user create a new appraiser personality. You ensure it's genuinely
10
10
 
11
11
  ## Prerequisites
12
12
 
13
- Before running this skill, verify that the `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
13
+ Before running this skill, verify both of the following:
14
14
 
15
- > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
15
+ 1. The `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
16
+
17
+ > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
18
+
19
+ 2. The current git branch is not a work branch. Run `git rev-parse --abbrev-ref HEAD` — if it starts with `work/`, stop and tell the user:
20
+
21
+ > You're on a work branch (`<branch>`). Foundry configuration changes must be made on the base branch (usually `main`). Complete or discard the in-flight flow (`foundry_git_finish`, or switch branches and delete it), then re-run this skill from the base branch.
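The branch prerequisite above amounts to a prefix test on the output of `git rev-parse --abbrev-ref HEAD`. A minimal sketch, assuming git is on the PATH and the command runs inside the repository; `is_work_branch` and `current_branch` are hypothetical helper names, since the skill itself only describes the check in prose.

```python
import subprocess


def is_work_branch(branch: str) -> bool:
    """True when the branch name starts with the 'work/' prefix."""
    return branch.startswith("work/")


def current_branch() -> str:
    """Return the current branch name as reported by git."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

A caller would stop and show the message above when `is_work_branch(current_branch())` is true. Note that the prefix includes the slash, so a branch such as `workshop` is not treated as a work branch.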
16
22
 
17
23
  ## Protocol
18
24
 
@@ -10,9 +10,15 @@ You help the user create a new artefact type. You ensure it doesn't conflict wit
10
10
 
11
11
  ## Prerequisites
12
12
 
13
- Before running this skill, verify that the `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
13
+ Before running this skill, verify both of the following:
14
14
 
15
- > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
15
+ 1. The `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
16
+
17
+ > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
18
+
19
+ 2. The current git branch is not a work branch. Run `git rev-parse --abbrev-ref HEAD` — if it starts with `work/`, stop and tell the user:
20
+
21
+ > You're on a work branch (`<branch>`). Foundry configuration changes must be made on the base branch (usually `main`). Complete or discard the in-flight flow (`foundry_git_finish`, or switch branches and delete it), then re-run this skill from the base branch.
16
22
 
17
23
  ## Protocol
18
24
 
@@ -10,9 +10,15 @@ You help the user create a new foundry cycle and add it to an existing foundry f
10
10
 
11
11
  ## Prerequisites
12
12
 
13
- Before running this skill, verify that the `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
13
+ Before running this skill, verify both of the following:
14
14
 
15
- > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
15
+ 1. The `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
16
+
17
+ > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
18
+
19
+ 2. The current git branch is not a work branch. Run `git rev-parse --abbrev-ref HEAD` — if it starts with `work/`, stop and tell the user:
20
+
21
+ > You're on a work branch (`<branch>`). Foundry configuration changes must be made on the base branch (usually `main`). Complete or discard the in-flight flow (`foundry_git_finish`, or switch branches and delete it), then re-run this skill from the base branch.
16
22
 
17
23
  ## Protocol
18
24
 
@@ -10,9 +10,15 @@ You help the user create a new foundry flow. A foundry flow is a set of foundry
10
10
 
11
11
  ## Prerequisites
12
12
 
13
- Before running this skill, verify that the `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
13
+ Before running this skill, verify both of the following:
14
14
 
15
- > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
15
+ 1. The `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
16
+
17
+ > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
18
+
19
+ 2. The current git branch is not a work branch. Run `git rev-parse --abbrev-ref HEAD` — if it starts with `work/`, stop and tell the user:
20
+
21
+ > You're on a work branch (`<branch>`). Foundry configuration changes must be made on the base branch (usually `main`). Complete or discard the in-flight flow (`foundry_git_finish`, or switch branches and delete it), then re-run this skill from the base branch.
16
22
 
17
23
  ## Protocol
18
24
 
@@ -10,9 +10,15 @@ You help the user create a new law. You ensure it's well-scoped, doesn't conflic
10
10
 
11
11
  ## Prerequisites
12
12
 
13
- Before running this skill, verify that the `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
13
+ Before running this skill, verify both of the following:
14
14
 
15
- > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
15
+ 1. The `foundry/` directory exists in the project root. If it does not exist, stop and tell the user:
16
+
17
+ > Foundry is not initialized in this project. Run the `init-foundry` skill first to create the foundry/ directory structure.
18
+
19
+ 2. The current git branch is not a work branch. Run `git rev-parse --abbrev-ref HEAD` — if it starts with `work/`, stop and tell the user:
20
+
21
+ > You're on a work branch (`<branch>`). Foundry configuration changes must be made on the base branch (usually `main`). Complete or discard the in-flight flow (`foundry_git_finish`, or switch branches and delete it), then re-run this skill from the base branch.
16
22
 
17
23
  ## Protocol
18
24