@really-knows-ai/foundry 2.3.2 → 3.0.0

Files changed (170)
  1. package/README.md +180 -369
  2. package/dist/.opencode/plugins/foundry-tools/appraiser-tools.js +28 -0
  3. package/dist/.opencode/plugins/foundry-tools/artefact-tools.js +58 -0
  4. package/dist/.opencode/plugins/foundry-tools/assay-tools.js +92 -0
  5. package/dist/.opencode/plugins/foundry-tools/attestation-tools.js +191 -0
  6. package/dist/.opencode/plugins/foundry-tools/config-create-tools.js +128 -0
  7. package/dist/.opencode/plugins/foundry-tools/config-law-tools.js +380 -0
  8. package/dist/.opencode/plugins/foundry-tools/config-tools.js +43 -0
  9. package/dist/.opencode/plugins/foundry-tools/feedback-tools.js +234 -0
  10. package/dist/.opencode/plugins/foundry-tools/git-helpers.js +354 -0
  11. package/dist/.opencode/plugins/foundry-tools/git-tools.js +181 -0
  12. package/dist/.opencode/plugins/foundry-tools/helpers.js +340 -0
  13. package/dist/.opencode/plugins/foundry-tools/history-tools.js +20 -0
  14. package/dist/.opencode/plugins/foundry-tools/memory-admin-tools.js +296 -0
  15. package/dist/.opencode/plugins/foundry-tools/memory-helpers.js +104 -0
  16. package/dist/.opencode/plugins/foundry-tools/memory-tools.js +286 -0
  17. package/dist/.opencode/plugins/foundry-tools/orchestrate-tool.js +159 -0
  18. package/dist/.opencode/plugins/foundry-tools/snapshot-tools.js +104 -0
  19. package/dist/.opencode/plugins/foundry-tools/stage-tools.js +186 -0
  20. package/dist/.opencode/plugins/foundry-tools/validate-tools.js +263 -0
  21. package/dist/.opencode/plugins/foundry-tools/workfile-tools.js +102 -0
  22. package/dist/.opencode/plugins/foundry.js +105 -0
  23. package/dist/CHANGELOG.md +490 -0
  24. package/dist/LICENSE +21 -0
  25. package/dist/README.md +278 -0
  26. package/dist/docs/README.md +59 -0
  27. package/dist/docs/architecture.md +434 -0
  28. package/dist/docs/concepts.md +396 -0
  29. package/dist/docs/getting-started.md +345 -0
  30. package/dist/docs/memory-maintenance.md +176 -0
  31. package/dist/docs/tools.md +1411 -0
  32. package/dist/docs/work-spec.md +283 -0
  33. package/dist/scripts/lib/artefacts.js +151 -0
  34. package/dist/scripts/lib/assay/loader.js +151 -0
  35. package/dist/scripts/lib/assay/parse-jsonl.js +102 -0
  36. package/dist/scripts/lib/assay/permissions.js +52 -0
  37. package/dist/scripts/lib/assay/run.js +219 -0
  38. package/dist/scripts/lib/assay/spawn-with-timeout.js +138 -0
  39. package/dist/scripts/lib/attestation/attest.js +111 -0
  40. package/dist/scripts/lib/attestation/canonical-json.js +109 -0
  41. package/dist/scripts/lib/attestation/hash.js +17 -0
  42. package/dist/scripts/lib/attestation/parse.js +14 -0
  43. package/dist/scripts/lib/attestation/payload.js +106 -0
  44. package/dist/scripts/lib/attestation/render.js +16 -0
  45. package/dist/scripts/lib/attestation/verify.js +15 -0
  46. package/dist/scripts/lib/branch-guard.js +72 -0
  47. package/dist/scripts/lib/config-creators/appraiser.js +9 -0
  48. package/dist/scripts/lib/config-creators/artefact-type.js +9 -0
  49. package/dist/scripts/lib/config-creators/cycle.js +11 -0
  50. package/dist/scripts/lib/config-creators/factory.js +49 -0
  51. package/dist/scripts/lib/config-creators/flow.js +11 -0
  52. package/dist/scripts/lib/config-validators/appraiser.js +49 -0
  53. package/dist/scripts/lib/config-validators/artefact-type.js +38 -0
  54. package/dist/scripts/lib/config-validators/cycle.js +131 -0
  55. package/dist/scripts/lib/config-validators/flow.js +57 -0
  56. package/dist/scripts/lib/config-validators/helpers.js +96 -0
  57. package/dist/scripts/lib/config-validators/law.js +96 -0
  58. package/dist/scripts/lib/config.js +393 -0
  59. package/dist/scripts/lib/failed-flow.js +131 -0
  60. package/dist/scripts/lib/feedback-store.js +249 -0
  61. package/dist/scripts/lib/feedback-transitions.js +105 -0
  62. package/dist/scripts/lib/finalize.js +70 -0
  63. package/dist/scripts/lib/foundational-guards.js +13 -0
  64. package/dist/scripts/lib/git-bridge.js +77 -0
  65. package/dist/scripts/lib/git-finish/work-finish.js +233 -0
  66. package/dist/scripts/lib/git-policy.js +101 -0
  67. package/dist/scripts/lib/guards.js +125 -0
  68. package/dist/scripts/lib/history.js +132 -0
  69. package/dist/scripts/lib/memory/admin/create-edge-type.js +91 -0
  70. package/dist/scripts/lib/memory/admin/create-entity-type.js +43 -0
  71. package/dist/scripts/lib/memory/admin/create-extractor.js +67 -0
  72. package/dist/scripts/lib/memory/admin/drop-edge-type.js +40 -0
  73. package/dist/scripts/lib/memory/admin/drop-entity-type.js +172 -0
  74. package/dist/scripts/lib/memory/admin/dump.js +47 -0
  75. package/dist/scripts/lib/memory/admin/helpers.js +31 -0
  76. package/dist/scripts/lib/memory/admin/init.js +170 -0
  77. package/dist/scripts/lib/memory/admin/live-store.js +76 -0
  78. package/dist/scripts/lib/memory/admin/reembed.js +285 -0
  79. package/dist/scripts/lib/memory/admin/rename-edge-type.js +54 -0
  80. package/dist/scripts/lib/memory/admin/rename-entity-type.js +151 -0
  81. package/dist/scripts/lib/memory/admin/reset.js +24 -0
  82. package/dist/scripts/lib/memory/admin/vacuum.js +9 -0
  83. package/dist/scripts/lib/memory/admin/validate.js +19 -0
  84. package/dist/scripts/lib/memory/config.js +149 -0
  85. package/dist/scripts/lib/memory/cozo.js +136 -0
  86. package/dist/scripts/lib/memory/drift.js +71 -0
  87. package/dist/scripts/lib/memory/embeddings.js +128 -0
  88. package/dist/scripts/lib/memory/frontmatter.js +75 -0
  89. package/dist/scripts/lib/memory/ndjson.js +84 -0
  90. package/dist/scripts/lib/memory/paths.js +25 -0
  91. package/dist/scripts/lib/memory/permissions.js +41 -0
  92. package/dist/scripts/lib/memory/prompt.js +109 -0
  93. package/dist/scripts/lib/memory/query.js +56 -0
  94. package/dist/scripts/lib/memory/reads.js +109 -0
  95. package/dist/scripts/lib/memory/schema.js +64 -0
  96. package/dist/scripts/lib/memory/search.js +73 -0
  97. package/dist/scripts/lib/memory/singleton.js +49 -0
  98. package/dist/scripts/lib/memory/store.js +162 -0
  99. package/dist/scripts/lib/memory/types.js +93 -0
  100. package/dist/scripts/lib/memory/validate.js +58 -0
  101. package/dist/scripts/lib/memory/writes.js +40 -0
  102. package/{scripts → dist/scripts}/lib/pending.js +7 -2
  103. package/dist/scripts/lib/secret.js +59 -0
  104. package/{scripts → dist/scripts}/lib/slug.js +3 -2
  105. package/dist/scripts/lib/snapshot/finish.js +103 -0
  106. package/dist/scripts/lib/snapshot/inspect.js +253 -0
  107. package/dist/scripts/lib/snapshot/render.js +55 -0
  108. package/dist/scripts/lib/sort-fs-check.js +121 -0
  109. package/dist/scripts/lib/sort-routing.js +101 -0
  110. package/{scripts → dist/scripts}/lib/stage-guard.js +12 -6
  111. package/{scripts → dist/scripts}/lib/state.js +4 -0
  112. package/dist/scripts/lib/token.js +57 -0
  113. package/dist/scripts/lib/tracing.js +59 -0
  114. package/dist/scripts/lib/ulid.js +100 -0
  115. package/dist/scripts/lib/validator-jsonl.js +162 -0
  116. package/{scripts → dist/scripts}/lib/workfile.js +38 -20
  117. package/dist/scripts/orchestrate-cycle.js +215 -0
  118. package/dist/scripts/orchestrate-phases.js +314 -0
  119. package/dist/scripts/orchestrate.js +163 -0
  120. package/dist/scripts/sort.js +278 -0
  121. package/{skills → dist/skills}/add-appraiser/SKILL.md +39 -9
  122. package/{skills → dist/skills}/add-artefact-type/SKILL.md +46 -24
  123. package/{skills → dist/skills}/add-cycle/SKILL.md +57 -17
  124. package/dist/skills/add-extractor/SKILL.md +133 -0
  125. package/{skills → dist/skills}/add-flow/SKILL.md +36 -10
  126. package/dist/skills/add-law/SKILL.md +191 -0
  127. package/dist/skills/add-memory-edge-type/SKILL.md +52 -0
  128. package/dist/skills/add-memory-entity-type/SKILL.md +74 -0
  129. package/{skills → dist/skills}/appraise/SKILL.md +62 -13
  130. package/dist/skills/assay/SKILL.md +72 -0
  131. package/dist/skills/change-embedding-model/SKILL.md +58 -0
  132. package/dist/skills/drop-memory-edge-type/SKILL.md +54 -0
  133. package/dist/skills/drop-memory-entity-type/SKILL.md +57 -0
  134. package/dist/skills/dry-run/SKILL.md +116 -0
  135. package/{skills → dist/skills}/flow/SKILL.md +15 -2
  136. package/dist/skills/forge/SKILL.md +121 -0
  137. package/dist/skills/human-appraise/SKILL.md +153 -0
  138. package/{skills → dist/skills}/init-foundry/SKILL.md +23 -4
  139. package/dist/skills/init-memory/SKILL.md +92 -0
  140. package/{skills → dist/skills}/orchestrate/SKILL.md +30 -4
  141. package/dist/skills/quench/SKILL.md +99 -0
  142. package/{skills → dist/skills}/refresh-agents/SKILL.md +1 -1
  143. package/dist/skills/rename-memory-edge-type/SKILL.md +50 -0
  144. package/dist/skills/rename-memory-entity-type/SKILL.md +51 -0
  145. package/dist/skills/reset-memory/SKILL.md +54 -0
  146. package/dist/skills/upgrade-foundry/SKILL.md +192 -0
  147. package/package.json +34 -17
  148. package/.opencode/plugins/foundry.js +0 -761
  149. package/CHANGELOG.md +0 -100
  150. package/docs/concepts.md +0 -122
  151. package/docs/getting-started.md +0 -187
  152. package/docs/work-spec.md +0 -207
  153. package/scripts/lib/artefacts.js +0 -124
  154. package/scripts/lib/config.js +0 -175
  155. package/scripts/lib/feedback-transitions.js +0 -25
  156. package/scripts/lib/feedback.js +0 -440
  157. package/scripts/lib/finalize.js +0 -41
  158. package/scripts/lib/history.js +0 -59
  159. package/scripts/lib/secret.js +0 -23
  160. package/scripts/lib/tags.js +0 -108
  161. package/scripts/lib/token.js +0 -26
  162. package/scripts/orchestrate.js +0 -418
  163. package/scripts/sort.js +0 -370
  164. package/scripts/validate-tags.js +0 -54
  165. package/skills/add-law/SKILL.md +0 -111
  166. package/skills/forge/SKILL.md +0 -88
  167. package/skills/human-appraise/SKILL.md +0 -82
  168. package/skills/quench/SKILL.md +0 -62
  169. package/skills/upgrade-foundry/SKILL.md +0 -216
  170. package/{skills → dist/skills}/list-agents/SKILL.md +0 -0
package/README.md CHANGED
@@ -1,467 +1,278 @@
1
1
  # Foundry
2
2
 
3
- > A skill-driven framework for governed artefact generation with AI coding tools. Define your own artefact types, laws, and flows — Foundry handles the forge → quench → appraise pipeline with deterministic routing, quality gates, and iterative refinement.
3
+ > Engineered confidence for AI-generated work. Define what good looks like.
4
4
 
5
5
  [![npm version](https://img.shields.io/npm/v/@really-knows-ai/foundry.svg)](https://www.npmjs.com/package/@really-knows-ai/foundry)
6
+ [![Tests](https://github.com/really-knows-ai/foundry/actions/workflows/test.yml/badge.svg)](https://github.com/really-knows-ai/foundry/actions/workflows/test.yml)
6
7
  [![license](https://img.shields.io/npm/l/@really-knows-ai/foundry.svg)](LICENSE)
7
8
 
8
9
  ---
9
10
 
10
- ## Table of contents
11
-
12
- - [Why Foundry?](#why-foundry)
13
- - [Compatibility](#compatibility)
14
- - [Installation](#installation)
15
- - [Quick start](#quick-start)
16
- - [How it works](#how-it-works)
17
- - [Core concepts](#core-concepts)
18
- - [The pipeline in depth](#the-pipeline-in-depth)
19
- - [Feedback lifecycle](#feedback-lifecycle)
20
- - [Enforcement model](#enforcement-model)
21
- - [Multi-model routing](#multi-model-routing)
22
- - [Skills](#skills)
23
- - [Custom tools](#custom-tools)
24
- - [Project layout](#project-layout)
25
- - [Design decisions](#design-decisions)
26
- - [Further reading](#further-reading)
27
- - [License](#license)
11
+ ## Engineering confidence
28
12
 
29
- ---
13
+ ### Confidence is engineered
30
14
 
31
- ## Why Foundry?
15
+ Generation is cheap; trust is expensive. An agent can produce output quickly, skip
16
+ validation, or lose feedback between iterations. The work arrives fast, but the
17
+ evidence is incomplete and trust is fragile. Nobody can see the path from prompt to
18
+ finish. Nobody knows how many times the agent tried, what it fixed, or why it
19
+ stopped.
32
20
 
33
 - LLMs are excellent at producing artefacts — code, specs, docs, tests — but they are erratic about *governing* that production. They skip checks, silently ignore feedback, drift from constraints, and forget what stage they're in. Foundry is an opinionated framework that separates **creative work** (handled by LLMs via skills) from **process work** (handled by deterministic tools):
21
+ Foundry is the system around the prompt: explicit standards, repeatable checks, and
22
+ recorded sign-off applied to every artefact your AI produces. It transforms "ask an
23
+ agent and hope" into a staged system where the checks are structural and mandatory.
24
+ If an artefact should be validated, it is validated. If feedback must be resolved,
25
+ that state is recorded. If a stage writes outside its lane, the cycle stops. The
26
+ framework is deterministic; the LLM is not. Your laws are.
34
27
 
35
- - **The pipeline is code, not prose.** Routing, state transitions, commit discipline, and write invariants live inside tested plugin tools. LLMs can't rationalise their way past them.
36
- - **Every artefact is governed by laws.** Global and per-type pass/fail criteria are evaluated by a panel of independent appraisers before anything is considered done.
37
- - **Nothing is silent.** Feedback has a full lifecycle (open → actioned/wont-fix → approved/rejected). Wont-fix requires appraiser approval. Validation is non-negotiable.
38
- - **Writes are enforced.** Each stage is allowed to modify a specific, narrow set of files. Violations halt the cycle.
39
- - **Humans can step in.** Human-in-the-loop gates can run every iteration or only when LLM appraisers deadlock.
28
+ Variability helps where creativity matters; control enforces discipline where
29
+ reliability does. You choose what gates each stage passes through, what laws your
30
+ artefacts must satisfy, and which models you trust for each decision. Foundry runs
31
+ the loop and records every step in git, so the path from draft to approved artefact
32
+ is auditable, repeatable, and defensible to auditors and stakeholders. You can show
33
+ exactly how the output was made. Confidence is engineered; it is not hoped for.
40
34
 
41
- ---
35
+ ### The operating model: assay, then forge → quench → appraise
42
36
 
43
- ## Compatibility
37
+ A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
38
+ that runs project-authored extractor scripts, parses the strict JSONL facts they
39
+ emit, and writes typed facts into flow memory. In the foundry metaphor, an assay
40
+ establishes composition before work begins. In Foundry, assay gives forge a
41
+ measured map of the project before it creates an artefact. Cycles without memory
42
+ configuration skip this stage.
44
43
 
45
 - - **OpenCode** — full support. Multi-model routing via file-based `foundry-*` agents. This is the primary target.
46
- - **Other skill-aware AI tools** — the skills and tools are portable. Multi-model stage routing is OpenCode-specific today because it relies on `.opencode/agents/` files generated by `refresh-agents`.
44
+ After assay, one draft enters a short loop and leaves only when it passes quality
45
+ gates. Each loop has four distinct roles that turn a candidate into a verified output:
47
46
 
48
- ---
47
+ - **Forge** produces or revises the artefact. The stage that creates and reshapes
48
+ work, responding to feedback from appraisers or building on prior drafts.
49
49
 
50
- ## Installation
50
+ - **Quench** runs deterministic checks that harden or reject the work. Validation is
51
+ fast and non-negotiable, catching errors before they reach appraisers.
51
52
 
52
- Add `@really-knows-ai/foundry` to your OpenCode config:
53
+ - **Appraise** judges quality against written laws. Independent evaluators inspect
54
+ whether the work meets the subjective standards you define.
53
55
 
54
- ```json
55
- // opencode.json
56
- {
57
- "$schema": "https://opencode.ai/config.json",
58
- "plugin": ["@really-knows-ai/foundry"]
59
- }
60
- ```
56
+ - **Human-appraise** provides direct judgement when the stakes require it or the loop
57
+ deadlocks. Offers human oversight at critical decision points.
61
58
 
62
- ---
59
+ Every stage commits separately, so every step leaves a record. Every decision is
60
+ timestamped. A single loop produces an **output** — a verified draft. A flow
61
+ composes one or more such loops to produce an **outcome** — the final artefact that
62
+ reaches your codebase or customers.
63
63
 
64
- ## Quick start
64
+ ### What you describe, what Foundry enforces
65
65
 
66
- 1. **Install** the package (above).
67
 - 2. **Initialize** — run the `init-foundry` skill to scaffold a `foundry/` directory and generate `foundry-*` agent files.
68
 - 3. **Define artefact types** — `add-artefact-type` walks you through identity, file patterns, output directory, laws, and optional CLI validation.
69
 - 4. **Add laws** — `add-law` creates subjective pass/fail criteria, globally or per-type.
70
- 5. **Add appraisers** — `add-appraiser` creates appraiser personalities with conflict detection.
71
- 6. **Define cycles** — `add-cycle` wires artefact types into a forge/quench/appraise loop with targets and input contracts.
72
- 7. **Define a flow** — `add-flow` groups cycles and declares entry points.
73
- 8. **Run** — invoke the `flow` skill with your goal. It creates a work branch, picks the right cycle, and hands off to `orchestrate`.
66
+ You write the laws — the criteria that define acceptable. You describe the artefact
67
+ types you want produced and what files they generate. You choose which stages each
68
+ cycle passes through and what models to use at each step. You control the operating
69
+ model entirely. Your configuration is law.
74
70
 
75
- ---
76
-
77
- ## How it works
78
-
79
- ```
80
- ┌─────────────────────────────┐
81
- │ Flow (entry points + set)
82
- └──────────────┬──────────────┘
83
- │ starting cycle picked
84
-
85
- ┌────────────────────────────────────────────────────────────────┐
86
- │ Cycle (outputs exactly one artefact type) │
87
- │ │
88
- │ ┌─────────┐ ┌─────────┐ ┌─────────────┐ │
89
- │ │ forge │ → │ quench │ → │ appraise │ ──┐ │
90
- │ └─────────┘ └─────────┘ └─────────────┘ │ loop │
91
- │ ▲ │ until │
92
- │ └───── unresolved feedback ─────────────────┘ clean │
93
- │ │
94
- │ [ optional: human-appraise — every iter or on deadlock ] │
95
- └──────────────┬─────────────────────────────────────────────────┘
96
- │ targets (may branch)
97
-
98
- next cycle → … → done
99
- ```
100
-
101
- - A **flow** defines the set of cycles and their entry points.
102
- - A **cycle** produces exactly one artefact type and declares its own `targets` — Foundry follows a dependency graph, not a linear list.
103
- - Each cycle loops through **forge → quench → appraise** until there is no unresolved feedback, or an iteration limit is hit.
104
- - All inter-stage communication goes through **WORK.md** on a dedicated work branch; every stage ends with a micro-commit.
71
+ Foundry runs the loop, gates writes per stage so only the right mutation happens at
72
+ the right time, records every decision in git, and stops when there is nothing left
73
+ to fix. Each stage holds a token that authorises its mutations. Stages cannot write
74
+ outside their assigned lane. Feedback state moves through a state machine that
75
+ prevents invalid transitions. The framework owns the process and enforces the rules;
76
+ the LLM performs the creative and evaluative work inside each stage. You define the
77
+ machine; Foundry runs it. Confidence is the difference.
105
78
 
106
79
  ---
107
80
 
108
- ## Core concepts
109
-
110
- ### Flow
111
-
112
- A flow lives in `foundry/flows/`. It declares:
113
-
114
- - `starting-cycles` — hints about where the flow can be entered.
115
- - The set of cycles it contains (routing between them is owned by cycles, not by the flow).
116
-
117
- Starting a flow creates a work branch and a fresh `WORK.md`.
118
-
119
- ### Cycle
120
-
121
- A cycle lives in `foundry/cycles/`. It declares:
122
-
123
- - `output` — the artefact type the cycle produces (read-write).
124
- - `inputs` — a contract (`any-of` or `all-of`) over artefact types from other cycles. Inputs are discovered on disk by filesystem scan against each input type's file-patterns; they are read-only.
125
- - `targets` — which cycle(s) may run next after this one completes.
126
- - `human-appraise` / `deadlock-appraise` / `deadlock-iterations` — human-in-the-loop configuration.
127
- - `models` — optional per-stage model overrides for multi-model diversity.
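A minimal sketch of how an `any-of` / `all-of` input contract like the one described in the removed text could be evaluated. The `contract` and `found` shapes and the `inputsSatisfied` name are assumptions made for illustration; they are not taken from the package source.

```javascript
// Hypothetical shapes (not from the package):
//   contract = { mode: "any-of" | "all-of", types: ["spec", "haiku"] }
//   found    = { spec: ["docs/spec.md"], haiku: [] }  // files discovered on disk per type
function inputsSatisfied(contract, found) {
  // A type counts as present when the filesystem scan found at least one file for it.
  const present = (type) => (found[type] || []).length > 0;
  return contract.mode === "all-of"
    ? contract.types.every(present)
    : contract.types.some(present);
}

console.log(inputsSatisfied({ mode: "any-of", types: ["spec", "haiku"] }, { spec: ["docs/spec.md"] }));
```

With `all-of`, every listed type must have at least one file on disk; with `any-of`, one is enough.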
128
-
129
- ### Stage
130
-
131
- A single step within a cycle. Stages are identified as `base:alias` (e.g. `forge:write-haiku`, `quench:check-syllables`). The base is one of:
132
-
133
- - **forge** — produce or revise the artefact.
134
- - **quench** — run deterministic CLI checks (skipped if the artefact type has no `validation.md`).
135
- - **appraise** — subjective evaluation by multiple independent appraiser sub-agents.
136
- - **human-appraise** — human quality gate, either every iteration or only on deadlock.
137
-
138
- ### Artefact type
139
-
140
- Defined in `foundry/artefacts/<type>/`:
141
-
142
- - `definition.md` — id, name, file patterns, output directory, appraiser configuration, prose description.
143
- - `laws.md` *(optional)* — type-specific subjective criteria.
144
- - `validation.md` *(optional)* — CLI commands with a `{file}` placeholder; non-zero exit = failure.
81
+ ## Compatibility
145
82
 
146
- ### Laws
83
+ Foundry works primarily with OpenCode. The skills and tools are portable to other
84
+ skill-aware AI systems. Multi-model stage routing is OpenCode-specific today.
147
85
 
148
- Subjective pass/fail criteria evaluated by appraisers.
86
 + - **OpenCode** — full support. Multi-model routing via file-based `foundry-*` agents.
87
+ This is the primary target platform.
149
88
 
150
 - - `foundry/laws/*.md` — global laws (all files concatenated, apply everywhere).
151
 - - `foundry/artefacts/<type>/laws.md` — type-specific laws.
89
 + - **Other skill-aware AI tools** — the skills and tools are portable to any
90
+ skill-aware AI system. Multi-model stage routing is OpenCode-specific today
91
+ because it relies on `.opencode/agents/` files.
152
92
 
153
- Each law is a `## heading` (its identifier, referenced in feedback as `#law:<id>`) with a description, passing criteria, and failing criteria.
93
+ ---
154
94
 
155
- ### Appraisers
95
+ ## Install
156
96
 
157
- Defined in `foundry/appraisers/`. Each appraiser is a named personality with an optional `model` override. Artefact types pick which appraisers may evaluate them:
97
+ Add the plugin to `opencode.json`:
158
98
 
159
- ```yaml
160
- appraisers:
161
- count: 3 # how many appraisers (default: 3)
162
- allowed: [pedantic, pragmatic] # which personalities (default: all)
99
+ ```json
100
+ {
101
+ "$schema": "https://opencode.ai/config.json",
102
+ "plugin": ["@really-knows-ai/foundry"]
103
+ }
163
104
  ```
164
105
 
165
- Appraisers are distributed evenly across the allowed set for maximum diversity.
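One way to read "distributed evenly" is a round-robin assignment over the allowed personalities. The sketch below assumes that interpretation; `distributeAppraisers` is a hypothetical name, and the package may use a different strategy.

```javascript
// Hypothetical helper: fill `count` appraiser slots by cycling through the
// allowed personalities, so no personality gets more than one slot above any other.
function distributeAppraisers(count, allowed) {
  if (allowed.length === 0) throw new Error("no appraisers allowed");
  const slots = [];
  for (let i = 0; i < count; i++) {
    slots.push(allowed[i % allowed.length]);
  }
  return slots;
}

// With the YAML example above: count 3 over [pedantic, pragmatic].
console.log(distributeAppraisers(3, ["pedantic", "pragmatic"]));
```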
166
-
167
- ### WORK.md
168
-
169
- Transient shared state on the work branch. Created when the flow starts, deleted before the branch is squash-merged. It contains:
170
-
171
- - **Frontmatter** — current position (`flow`, `cycle`, stage list, max iterations, model map, human-appraise config).
172
- - **Goal** — the prose request that kicked off the flow.
173
- - **Artefacts** — a table of every file produced by the flow and its status (`draft`, `done`, `blocked`).
174
- - **Feedback** — grouped by artefact file, every feedback item with its full lifecycle.
175
-
176
- A sibling file `WORK.history.yaml` is an append-only log of every stage execution. See [docs/work-spec.md](docs/work-spec.md).
106
+ Restart OpenCode so the plugin registers its tools and skills. You will see new
107
+ tools and skills become available in OpenCode's command palette once the restart
108
+ completes. The `init-foundry` skill and flow-management tools are now ready to use.
177
109
 
178
110
  ---
179
111
 
180
- ## The pipeline in depth
181
-
182
- ### Stages run inside a token-gated lifecycle
183
-
184
- Every dispatched stage (forge, quench, appraise, human-appraise) runs under a single-use HMAC token:
112
+ ## Upgrade
185
113
 
186
- 1. The `orchestrate` tool mints a token and hands it to the sub-agent in the dispatch prompt.
187
- 2. The sub-agent's **first** call must be `foundry_stage_begin({stage, cycle, token})`. The token is redeemed; mutation tools now check that the active stage matches.
188
- 3. The sub-agent does its work (reads WORK.md, writes artefact files / feedback, etc.).
189
- 4. The sub-agent's **last** call is `foundry_stage_end({summary})`.
190
- 5. The orchestrator then calls `foundry_stage_finalize`, which:
191
- - Scans the git diff against the stage's allowed file-patterns.
192
- - Registers any new files matching the output artefact type as `draft` artefacts.
193
- - Returns `{error: 'unexpected_files'}` if the stage wrote anywhere it shouldn't have.
194
- 6. The cycle is committed (`foundry_git_commit` internally) and routing advances.
114
+ Run the `upgrade-foundry` skill from a clean project state when moving an existing project to the installed Foundry version. The skill preserves the existing `foundry/` directory, initialises a fresh current-version configuration, analyses the preserved configuration as source material, and recreates supported concepts through current tools.
195
115
 
196
- Per-stage write rules:
116
+ The upgrade process asks clarifying questions for ambiguous routing, input contracts, validation behaviour, memory settings, and deprecated concepts. It leaves the preserved source directory in place until you explicitly approve cleanup.
197
117
 
198
- | Stage | May write |
199
- |-------|-----------|
200
- | `forge` | Files matching the output artefact type's `file-patterns`, plus `WORK.md` / `WORK.history.yaml` |
201
- | `quench` | `WORK.md` / `WORK.history.yaml` only (feedback) |
202
- | `appraise` | `WORK.md` / `WORK.history.yaml` only (feedback) |
203
- | `human-appraise` | `WORK.md` / `WORK.history.yaml` only (feedback) |
118
+ ---
204
119
 
205
- Input artefacts are read-only. Files outside any artefact type's patterns are read-only. Violations hard-stop the cycle.
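The write rules in the table can be sketched as a filter over the files a stage changed. This is an illustration under stated assumptions: the glob handling is deliberately simplified, and `globToRegExp` and `unexpectedFiles` are hypothetical names rather than the logic inside `foundry_stage_finalize`.

```javascript
// Simplified glob support: "*" matches within one path segment, "**" across segments.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so "**" survives the single-star pass
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp("^" + pattern + "$");
}

// Every stage may touch these two files, per the table above.
const ALWAYS_ALLOWED = ["WORK.md", "WORK.history.yaml"];

// Returns the changed files the stage was not allowed to write.
function unexpectedFiles(stage, changedFiles, outputPatterns) {
  // Only forge may write files matching the output artefact type's patterns.
  const patterns = stage === "forge" ? outputPatterns.map(globToRegExp) : [];
  return changedFiles.filter(
    (file) => !ALWAYS_ALLOWED.includes(file) && !patterns.some((re) => re.test(file))
  );
}

console.log(unexpectedFiles("forge", ["haiku/a.md", "WORK.md", "src/x.js"], ["haiku/*.md"]));
```

A non-empty result corresponds to the `unexpected_files` violation that halts the cycle.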
120
+ ## Quick start
206
121
 
207
- ### Deterministic orchestration
122
+ ### Phase 1 — Install
208
123
 
209
 - The `orchestrate` skill is thin — a 3-line loop:
124
+ Add the plugin to `opencode.json` (see Install section above):
210
125
 
211
- ```text
212
- call foundry_orchestrate({lastResult})
213
- switch on action:
214
- dispatch → task tool (subagent) → report back
215
- human_appraise → run human-appraise inline → report back
216
- done / blocked / violation → terminate the loop
126
+ ```json
127
+ {
128
+ "$schema": "https://opencode.ai/config.json",
129
+ "plugin": ["@really-knows-ai/foundry"]
130
+ }
217
131
  ```
218
132
 
219
- `foundry_orchestrate` owns sort routing, history, commits, finalize, deadlock detection, and violation handling. Because the protocol lives in a plugin tool, the LLM can't skip steps, reorder them, or silently drop a commit.
220
-
221
- ---
133
+ Then restart OpenCode so the plugin registers its tools and skills. You will see new
134
+ tools and skills become available in OpenCode's command palette once the restart
135
+ completes. The `init-foundry` skill and flow-management tools are now ready to use.
222
136
 
223
- ## Feedback lifecycle
137
+ ### Phase 2 — Initialise
224
138
 
225
- Feedback is markdown checklists under each artefact in WORK.md, tagged to indicate source.
139
+ Open OpenCode in your project repo and say:
226
140
 
227
141
  ```
228
- - [ ] issue #tag → open — needs forge action
229
- - [x] issue #tag → actioned — needs appraise approval
230
- - [~] issue #tag | wont-fix: <reason> → wont-fix — needs appraise approval
231
- - [x] issue #tag | approved → resolved
232
- - [~] issue #tag | wont-fix: <reason> | approved → resolved
233
- - [x] issue #tag | rejected: <reason> → re-opened
234
- - [~] issue #tag | wont-fix: <reason> | rejected → re-opened
142
+ > run init-foundry
235
143
  ```
236
144
 
237
- Tags:
238
-
239
- | Tag | Source | Notes |
240
- |-----|--------|-------|
241
- | `#validation` | quench (CLI command failed) | Cannot be wont-fixed. Deterministic rules are not negotiable. |
242
- | `#law:<id>` | appraise (subjective law) | May be wont-fixed with justification; an appraiser must approve. |
243
- | `#human` | human-appraise | Takes absolute priority. Forge MUST address it — cannot wont-fix. |
244
-
245
- Feedback is append-only: items are never deleted, only resolved. Re-opened items show their full history.
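The lifecycle and tag rules above amount to a small state machine. The sketch below encodes them as a transition table; the state names follow the checklist legend, while `TRANSITIONS` and `canTransition` are hypothetical names, and treating `rejected` as a re-opened item that forge must act on again is an assumption drawn from that legend.

```javascript
// Allowed next states for each feedback state.
const TRANSITIONS = {
  open: ["actioned", "wont-fix"],
  actioned: ["approved", "rejected"],
  "wont-fix": ["approved", "rejected"],
  rejected: ["actioned", "wont-fix"], // re-opened: forge must act again
  approved: [], // terminal
};

function canTransition(from, to, tag) {
  // #validation and #human feedback can never be wont-fixed.
  if (to === "wont-fix" && (tag === "#validation" || tag === "#human")) return false;
  return (TRANSITIONS[from] || []).includes(to);
}

console.log(canTransition("open", "wont-fix", "#law:brevity"));
console.log(canTransition("open", "wont-fix", "#validation"));
```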
145
+ Foundry scaffolds a `foundry/` directory, generates one `foundry-<model>` agent file
146
+ per model available in your session, commits the structure, and then asks you to
147
+ restart. All the foundational configuration directories are created; you will
148
+ populate them next.
246
149
 
247
- ### Deadlock handling
248
-
249
- If forge and appraise ping-pong on the same items for `deadlock-iterations` (default 5) iterations, and the cycle has `deadlock-appraise: true` (default), the router inserts a `human-appraise` stage. If `deadlock-appraise: false`, the cycle is marked `blocked` and control returns to the human.
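The deadlock rule can be sketched as a small decision function. The config keys and defaults mirror the text; the function name, the `iterationsStuck` counter, and the choice of `>=` at the threshold are assumptions.

```javascript
// Hypothetical router step: decide what happens once forge and appraise have
// been stuck on the same feedback items for `iterationsStuck` iterations.
function onDeadlock(iterationsStuck, cycle = {}) {
  const limit = cycle["deadlock-iterations"] ?? 5; // default per the text
  const escalate = cycle["deadlock-appraise"] ?? true; // default per the text
  if (iterationsStuck < limit) return "continue";
  return escalate ? "human-appraise" : "blocked";
}

console.log(onDeadlock(5));
console.log(onDeadlock(5, { "deadlock-appraise": false }));
```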
250
-
251
- ---
150
+ Restart OpenCode so the new `foundry-<model>` agents register — multi-model dispatch cannot route to agents it cannot discover.
252
151
 
253
- ## Enforcement model
254
-
255
- Foundry is designed around "trust the tool, not the LLM". The following guarantees are enforced in plugin code, not prose:
256
-
257
- - **Stage-locked mutations.** `foundry_feedback_*`, `foundry_artefacts_*`, and `foundry_workfile_*` tools require the caller's role to match the active stage. A forge sub-agent cannot add feedback; a quench sub-agent cannot register artefacts.
258
- - **Single-use tokens.** `foundry_stage_begin` verifies an HMAC token minted at dispatch time. Replays, forgery, and cross-stage reuse all fail closed. Keys live in `.foundry/.secret` (mode 0600, gitignored, one per worktree).
259
- - **Commit-per-stage contract.** `foundry_orchestrate` refuses to proceed if there are uncommitted changes to `WORK.md`, `WORK.history.yaml`, or anything under `.foundry/` at the start of a sort call and history is non-empty.
260
- - **Write invariants.** `foundry_stage_finalize` scans the git diff and rejects stray writes with `{error: 'unexpected_files'}`.
261
- - **Feedback state machine.** Only legal transitions are accepted: `approved` is terminal; quench cannot approve/reject a `wont-fix`; validation cannot be wont-fixed.
262
- - **Artefact-type glob uniqueness.** `add-artefact-type` refuses to create a type whose file patterns overlap with an existing type; the enforcer can't determine file ownership otherwise.
263
-
264
- ---
265
-
266
- ## Multi-model routing
267
-
268
- Different stages can run on different models for genuine cognitive diversity (mitigating shared blind spots):
269
-
270
- - Cycle definitions can declare a `models` map, e.g. `models: { forge: anthropic/claude-opus-4.7, appraise: openai/gpt-5 }`.
271
- - Individual appraisers can override the cycle-level appraise model via a `model` field in their personality definition.
272
- - `refresh-agents` generates a `foundry-<provider>-<model>.md` agent file in `.opencode/agents/` for every model available in the session. `orchestrate` picks the matching agent when dispatching.
273
-
274
- Resolution order for a given stage: **appraiser `model`** → **cycle `models.<stage>`** → **session default**.
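The resolution order can be illustrated with a short fallback chain. `resolveModel` and the shape of its arguments are hypothetical, and the model ids are only examples; the precedence is the one stated above, with the appraiser override assumed to apply to the appraise stage only.

```javascript
// Precedence: appraiser `model` -> cycle `models.<stage>` -> session default.
function resolveModel(stage, { appraiser, cycle, sessionDefault }) {
  if (stage === "appraise" && appraiser?.model) return appraiser.model;
  return cycle?.models?.[stage] ?? sessionDefault;
}

const cycleExample = { models: { forge: "anthropic/claude-opus-4.7", appraise: "openai/gpt-5" } };
console.log(resolveModel("quench", { cycle: cycleExample, sessionDefault: "session-model" }));
```

In the usage line, `quench` has no entry in the cycle's `models` map, so the call falls through to the session default.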
275
-
276
- Run `list-agents` to see what's available.
277
-
278
- ---
152
+ ### Phase 3 — Build a flow without writing one
279
153
 
280
- ## Skills
154
+ Ask Foundry to set up a flow:
281
155
 
282
- Foundry is a collection of skills. Skills are either **atomic** (do one thing) or **composite** (orchestrate other skills).
283
-
284
- ### Pipeline
285
-
286
- | Skill | Type | Purpose |
287
- |-------|------|---------|
288
- | `flow` | composite | Entry point. Picks a starting cycle, creates the work branch, invokes `orchestrate`, follows `targets` between cycles. |
289
- | `orchestrate` | atomic | Thin driver around `foundry_orchestrate`. Dispatches sub-agents, runs human-appraise inline, reports terminal states. |
290
- | `forge` | atomic | Produce or revise the artefact. Discovers inputs by filesystem scan. |
291
- | `quench` | atomic | Run the artefact type's CLI validation commands; write `#validation` feedback. |
292
- | `appraise` | atomic | Dispatch the selected appraiser personalities as parallel sub-agents; consolidate `#law:<id>` feedback (union + dedup). |
293
- | `human-appraise` | atomic | Human quality gate. Presents the artefact, collects `#human` feedback. |
294
-
295
- ### Authoring
296
-
297
- | Skill | Purpose |
298
- |-------|---------|
299
- | `init-foundry` | Scaffold the `foundry/` directory and generate agent files. |
300
- | `add-artefact-type` | Create a new artefact type, with conflict and glob-overlap checks. |
301
- | `add-law` | Create a new law with conflict detection. |
302
- | `add-appraiser` | Create an appraiser personality with semantic-overlap checks. |
303
- | `add-cycle` | Create a cycle, validate its targets and input contract against the flow. |
304
- | `add-flow` | Create a flow definition with cycle-graph reachability checks. |
305
-
306
- ### Utility
307
-
308
- | Skill | Purpose |
309
- |-------|---------|
310
- | `list-agents` | List available `foundry-*` sub-agents (for multi-model routing). |
311
- | `refresh-agents` | Regenerate `foundry-*` agent files from the currently available models. |
312
- | `upgrade-foundry` | Analyse and migrate `foundry/` config to the current version. |
313
-
314
- All authoring skills are interactive and conflict-aware — they explain what they're about to write and ask before writing.
315
-
316
- ---
317
-
318
- ## Custom tools
319
-
320
- The plugin registers **24 custom tools**. Skills call these rather than manipulating files directly, which keeps format-parsing and state transitions out of LLM hands.
321
-
322
- | Category | Tools |
323
- |----------|-------|
324
- | **Orchestration** | `foundry_orchestrate` |
325
- | **Stage lifecycle** | `foundry_stage_begin`, `foundry_stage_end` |
326
- | **Workfile** | `foundry_workfile_create`, `foundry_workfile_get`, `foundry_workfile_delete` |
327
- | **Artefacts** | `foundry_artefacts_set_status`, `foundry_artefacts_list` |
328
- | **Feedback** | `foundry_feedback_add`, `foundry_feedback_action`, `foundry_feedback_wontfix`, `foundry_feedback_resolve`, `foundry_feedback_list` |
329
- | **History** | `foundry_history_list` |
330
- | **Config** | `foundry_config_cycle`, `foundry_config_artefact_type`, `foundry_config_laws`, `foundry_config_validation`, `foundry_config_appraisers`, `foundry_config_flow` |
331
- | **Validation** | `foundry_validate_run`, `foundry_appraisers_select` |
332
- | **Git** | `foundry_git_branch`, `foundry_git_finish` |
333
-
334
- A handful of internal tools (`foundry_sort`, `foundry_history_append`, `foundry_stage_finalize`, `foundry_git_commit`, `foundry_workfile_set`, `foundry_workfile_configure_from_cycle`) are intentionally *not* registered — they exist only inside `foundry_orchestrate` so they cannot be called out of band.
335
-
336
- Tools are backed by shared modules in `scripts/lib/` with injectable I/O for testability (see `tests/`).
337
-
338
- ---
156
+ ```
157
+ > set up a flow that writes haikus
158
+ ```
339
159
 
340
- ## Project layout
160
+ Foundry will ask clarifying questions about the flow's purpose, constraints, and
161
+ entry points. It will then scaffold a haiku artefact type with a syllable-count
162
+ validator, laws for form / imagery / mood, two appraisers with different
163
+ sensibilities and bias profiles, a cycle that connects them in sequence, and a flow
164
+ that ties it all together. Everything is scaffolded; you do not write any
165
+ configuration by hand. This demonstrates the full system in action.
341
166
 
342
- ### Package (this repo)
167
+ Now run it:
343
168
 
344
169
  ```
345
- @really-knows-ai/foundry
346
- ├── .opencode/
347
- │   └── plugins/
348
- │       └── foundry.js # plugin: skills + 24 custom tools
349
- ├── skills/ # skill definitions
350
- │   ├── flow/ # pipeline
351
- │   ├── orchestrate/
352
- │   ├── forge/
353
- │   ├── quench/
354
- │   ├── appraise/
355
- │   ├── human-appraise/
356
- │   ├── init-foundry/ # authoring
357
- │   ├── add-artefact-type/
358
- │   ├── add-law/
359
- │   ├── add-appraiser/
360
- │   ├── add-cycle/
361
- │   ├── add-flow/
362
- │   ├── list-agents/ # utility
363
- │   ├── refresh-agents/
364
- │   └── upgrade-foundry/
365
- ├── scripts/
366
- │   ├── lib/ # shared libraries (injectable I/O)
367
- │   │   ├── workfile.js # WORK.md frontmatter
368
- │   │   ├── artefacts.js # artefact table ops
369
- │   │   ├── history.js # WORK.history.yaml ops
370
- │   │   ├── feedback.js # feedback lifecycle
371
- │   │   ├── feedback-transitions.js
372
- │   │   ├── finalize.js # stage_finalize implementation
373
- │   │   ├── stage-guard.js # stage-lock preconditions
374
- │   │   ├── token.js # HMAC token mint/verify
375
- │   │   ├── secret.js # .foundry/.secret handling
376
- │   │   ├── pending.js # active-stage state
377
- │   │   ├── state.js # .foundry state dir
378
- │   │   ├── config.js # foundry/ config readers
379
- │   │   ├── tags.js # feedback tag extraction
380
- │   │   └── slug.js
381
- │   ├── orchestrate.js # orchestration loop (exports runOrchestrate)
382
- │   └── sort.js # routing engine (exports runSort)
383
- ├── tests/ # node:test suite
384
- ├── docs/ # concepts, getting-started, work-spec
385
- ├── CHANGELOG.md
386
- └── README.md
170
+ > write me a haiku about autumn
387
171
  ```
388
172
 
389
- ### User project (after `init-foundry`)
173
+ Here is what the loop produces:
390
174
 
391
175
  ```
392
- your-project/
393
- ├── foundry/
394
- │   ├── flows/ # flow definitions
395
- │   ├── cycles/ # cycle definitions
396
- │   ├── artefacts/ # artefact type definitions
397
- │   │   └── <type>/
398
- │   │       ├── definition.md
399
- │   │       ├── laws.md # optional
400
- │   │       └── validation.md # optional
401
- │   ├── laws/ # global laws
402
- │   └── appraisers/ # appraiser personalities
403
- ├── .foundry/ # runtime state (gitignored)
404
- │   └── .secret # per-worktree HMAC key (mode 0600)
405
- ├── .opencode/
406
- │   └── agents/
407
- │       └── foundry-*.md # generated by refresh-agents
408
- ├── opencode.json
409
- └── ...
176
+ forge → drafts a haiku [commit]
177
+ quench → 7/7/5 — fails syllable check [commit]
178
+ forge → revises [commit]
179
+ quench → 5/7/5 passes [commit]
180
+ appraise → 2 appraisers, one flags weak imagery [commit]
181
+ forge → revises [commit]
182
+ appraise → clean [commit]
183
+ done → squash-merged to main with attestation
410
184
  ```
411
185
 
412
- During a flow, a work branch also contains `WORK.md` and `WORK.history.yaml` at the repo root. Both are ephemeral; delete them before squash-merging.
186
+ Every stage commits. Every decision is recorded. Every piece of feedback and every
187
+ revision leaves a trace in the work branch. The final artefact on `main` carries a
188
+ signed attestation showing exactly how that output was produced, which models
189
+ contributed, and when each appraiser signed off.
190
+
191
+ This trace is the proof. You can play it back, audit it, replay it under a different
192
+ model, or use it to argue that the AI output is trustworthy. Every step is visible.
193
+ Nothing is hidden.
194
+
195
+ For codebase-aware flows, add flow memory after the first run: initialise memory,
196
+ declare the entity and edge vocabulary, add extractors, and opt a cycle into
197
+ `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
198
+ and [Assay](docs/concepts.md#assay) for the configuration path.
199
+
200
+ > **Note (3.0.0):** flow memory currently persists to `cozo-node`, which is
201
+ > unmaintained upstream. Installation produces six cosmetic deprecation warnings
202
+ > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
203
+ > a maintained backend in a future release; the public `foundry_memory_*` tools
204
+ > and on-disk vocabulary/NDJSON format are designed to survive that migration.
205
+ > See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status-as-of-300).
413
206
 
414
207
  ---
415
208
 
416
- ## Design decisions
209
+ ## What you can show your team
417
210
 
418
- ### Everything is markdown
211
+ After the quick start completes, you have five concrete artefacts to point at to
212
+ demonstrate engineered confidence:
419
213
 
420
- Flows, cycles, artefact types, laws, appraiser personalities, skills: all are markdown with YAML frontmatter. Readable by humans, consumable by LLMs, diff-able in git. No bespoke formats, no databases.
214
+ - **The artefact itself**: `haikus/autumn.md` on `main`. The final, approved output
215
+ ready for use or deployment.
421
216
 
422
- ### Skills are the pipeline, tools are the machinery
217
+ - **The laws it satisfied**: `foundry/artefacts/haiku/laws.md`. The criteria it was
218
+ measured against, written in markdown and version-controlled.
423
219
 
424
- Composition happens at the skill layer. `flow` reads a definition and invokes `orchestrate`. `orchestrate` calls `foundry_orchestrate` in a loop. The hard guarantees — routing, commits, state transitions, enforcement — live inside the plugin's custom tools and the libraries under `scripts/lib/`. Skills handle creative and subjective work; tools handle everything else.
220
+ - **The feedback ledger**: `WORK.feedback.yaml` on the archived work branch. Every
221
+ issue raised, by whom, and how it was resolved during the loop.
425
222
 
426
- ### WORK.md as shared state
223
+ - **The per-stage commit history** — the raw commits on `archive/work/<flow>-<...>`.
224
+ A micro-commit per stage showing exactly what changed and why at each step.
427
225
 
428
- All inter-stage communication goes through WORK.md via the `foundry_workfile_*`, `foundry_artefacts_*`, `foundry_feedback_*`, and `foundry_history_*` tools. No stage passes output directly to another. This gives a complete audit trail, makes flows resumable after a crash, and lets any stage be re-run independently.
226
+ - **The signed attestation on main**: the squash commit with the Foundry attestation
227
+ block embedded in its message. Proof of approval, signed and timestamped.
429
228
 
430
- ### Cycles own their routing
229
+ This is what makes "engineered confidence" concrete. You can show your team exactly
230
+ how that AI output was produced, what it passed through, why you trust it, and who
231
+ signed off. Every step is auditable. Every decision is recorded. The loop is
232
+ reproducible.
431
233
 
432
- A flow declares starting points; individual cycles declare `targets` and input contracts. The flow skill walks the resulting graph. This keeps cycles composable across flows and prevents the flow file from becoming a procedural monolith.
234
+ ---
433
235
 
434
- ### Feedback as checklists
236
+ ## What's in the box
435
237
 
436
- Markdown checkboxes with `#validation`, `#law:<id>`, or `#human` tags. Human-readable, trivially parseable, lifecycle encoded inline. Feedback is append-only; history is part of the artefact's story.
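To show why tagged checkboxes are easy to parse, here is a minimal sketch. The exact line layout (`- [ ] text #tag`) is an assumption, since the full format lives in the work spec rather than in this README; the function name `parseFeedbackLine` is hypothetical. Only the three tag forms (`#validation`, `#law:<id>`, `#human`) come from the text.

```javascript
// Hypothetical parser for one feedback line; the line layout is assumed.
function parseFeedbackLine(line) {
  // Match a markdown task-list item: "- [ ] ..." or "- [x] ...".
  const item = /^- \[( |x)\] (.*)$/.exec(line.trim());
  if (!item) return null; // not a checkbox line
  // Find the first recognised feedback tag in the item text.
  const tag = /#(validation|human|law:[\w-]+)/.exec(item[2]);
  return {
    checked: item[1] === "x",
    text: item[2],
    tag: tag ? tag[0] : null,
  };
}
```

Under these assumptions, a line such as `- [x] Weak imagery in line 2 #law:imagery` parses to a checked item carrying the tag `#law:imagery`.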
238
+ - **Deterministic governance**: routing, commits, write boundaries, and feedback
239
+ state live in tested plugin code, outside LLM control.
437
240
 
438
- ### Wont-fix requires approval
241
+ - **Written quality criteria** — laws are markdown files; an appraiser panel scores
242
+ each artefact against them, so quality is judged against explicit written criteria.
439
243
 
440
- A forge sub-agent can decline subjective feedback with a justification, but an appraiser must approve or reject that decision on the next iteration. Validation and human feedback cannot be wont-fixed.
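The wont-fix rule can be sketched as two guarded transitions. The function names, error codes, and the `rejected` state name are assumptions for illustration; the README only confirms that `#validation` and `#human` feedback cannot be wont-fixed, that a justification accompanies the decision, that an appraiser must rule on it, and that `approved` and `wont-fix` are state names.

```javascript
// Hypothetical sketch of the wont-fix rules; names and error codes are assumed.

// Only subjective (#law:<id>) feedback may be declined.
function requestWontFix(item, justification) {
  if (!item.tag.startsWith("#law:")) {
    return { error: "wontfix_not_allowed", tag: item.tag };
  }
  if (!justification) return { error: "justification_required" };
  return { ...item, state: "wont-fix", justification };
}

// Only the appraise role may approve or reject a wont-fix decision.
function ruleOnWontFix(item, role, verdict) {
  if (item.state !== "wont-fix") return { error: "not_wontfix" };
  if (role !== "appraise") return { error: "role_not_allowed", role };
  return { ...item, state: verdict === "approve" ? "approved" : "rejected" };
}
```

Returning an error object instead of throwing mirrors the `{error: '...'}` style that the removed README shows for `foundry_stage_finalize`; whether the feedback tools use the same shape is not stated here.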
244
+ - **Multi-model diversity**: forge on one model, appraise on another, every
245
+ appraiser on a different model if you want. Different models catch different
246
+ mistakes.
441
247
 
442
- ### Multi-model diversity
248
+ - **Full git audit trail** — one commit per stage with `WORK.md`,
249
+ `WORK.feedback.yaml`, and `WORK.history.yaml`. Every iteration is recorded.
443
250
 
444
- Cycle definitions specify per-stage models; individual appraisers may override. Different models catch different issues; consolidation is a union. One appraiser flagging an issue is enough to raise it.
251
+ - **Signed attestation on main**: every flow finishes with a squash commit carrying
252
+ a canonical Foundry attestation block that proves the artefact was processed.
445
253
 
446
- ### Input artefacts are read-only
254
+ - **Archived forensic branch** — the raw work branch is retained for auditors as
255
+ `archive/work/<flow>-<desc>-<hash>`. The full micro-history is never lost.
447
256
 
448
- When a cycle reads from another cycle's output, those files cannot be modified. Enforced via `stage_finalize` and `sort`'s diff check. Downstream cycles cannot corrupt upstream work.
257
+ - **Bring your own pipeline**: artefact types, laws, and stages are yours; works
258
+ for code, specs, docs, data, and anything else you can describe as files with
259
+ pass/fail criteria.
449
260
 
450
- ### Glob patterns must not overlap
261
+ - **Assay preflight**: a deterministic extractor stage that measures the project
262
+ before forge starts, so codebase-aware flows can begin from structured facts.
451
263
 
452
- Two artefact types cannot have file patterns that match the same files. Hard-blocked at creation time; the file-ownership rule doesn't have a meaningful answer otherwise.
264
+ - **Flow memory**: a typed graph store with scoped tools, semantic search when
265
+ enabled, and committed NDJSON rows for cross-cycle reuse.
453
266
 
454
267
  ---
455
268
 
456
269
  ## Further reading
457
270
 
458
- - [docs/concepts.md](docs/concepts.md) — every concept defined concisely.
459
- - [docs/getting-started.md](docs/getting-started.md): end-to-end walkthrough.
460
- - [docs/work-spec.md](docs/work-spec.md) — the full WORK.md + WORK.history.yaml spec.
461
- - [CHANGELOG.md](CHANGELOG.md) — version history and migration notes.
271
+ The full reference set lives in [docs/](docs/) — start at [docs/README.md](docs/README.md)
272
+ for a guided index of every document and when to read it.
462
273
 
463
274
  ---
464
275
 
465
276
  ## License
466
277
 
467
- [MIT](LICENSE)
278
+ MIT.