@really-knows-ai/foundry 3.5.8 → 3.6.0

Files changed (34)
  1. package/README.md +16 -10
  2. package/dist/.opencode/plugins/foundry-tools/config-create-tools.js +2 -3
  3. package/dist/.opencode/plugins/foundry-tools/feedback-tools.js +9 -5
  4. package/dist/.opencode/plugins/foundry-tools/orchestrate-tool.js +3 -1
  5. package/dist/CHANGELOG.md +38 -0
  6. package/dist/README.md +16 -10
  7. package/dist/docs/README.md +6 -6
  8. package/dist/docs/architecture.md +59 -19
  9. package/dist/docs/concepts.md +55 -19
  10. package/dist/docs/getting-started.md +37 -15
  11. package/dist/docs/memory-maintenance.md +3 -3
  12. package/dist/docs/tools.md +131 -70
  13. package/dist/docs/work-spec.md +38 -52
  14. package/dist/scripts/appraise-module.js +69 -7
  15. package/dist/scripts/lib/artefacts.js +43 -1
  16. package/dist/scripts/lib/config-creators/cycle.js +6 -10
  17. package/dist/scripts/lib/config-validators/cycle.js +1 -9
  18. package/dist/scripts/lib/feedback-store.js +26 -51
  19. package/dist/scripts/lib/finalize.js +10 -2
  20. package/dist/scripts/lib/forge-contract.js +93 -0
  21. package/dist/scripts/lib/history.js +2 -1
  22. package/dist/scripts/lib/sort-reason.js +11 -8
  23. package/dist/scripts/lib/sort-routing.js +185 -63
  24. package/dist/scripts/lib/workfile.js +28 -0
  25. package/dist/scripts/orchestrate-cycle.js +3 -13
  26. package/dist/scripts/orchestrate-phases.js +51 -45
  27. package/dist/scripts/orchestrate-terminals.js +37 -2
  28. package/dist/scripts/orchestrate.js +62 -5
  29. package/dist/scripts/quench-module.js +54 -12
  30. package/dist/scripts/sort.js +42 -62
  31. package/dist/skills/add-cycle/SKILL.md +4 -4
  32. package/dist/skills/add-flow/SKILL.md +1 -1
  33. package/dist/skills/human-appraise/SKILL.md +12 -40
  34. package/package.json +1 -1
package/README.md CHANGED
@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
32
32
  is auditable, repeatable, and defensible to auditors and stakeholders. You can show
33
33
  exactly how the output was made. Confidence is engineered; it is not hoped for.
34
34
 
35
- ### The operating model: assay, then forge → quench → appraise
35
+ ### The operating model: assay, then forge → quench → appraise → attest → finish
36
36
 
37
37
  A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
38
38
  that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
51
51
  fast and non-negotiable, catching errors before they reach appraisers.
52
52
 
53
53
  - **Appraise** judges quality against written laws. Independent evaluators inspect
54
- whether the work meets the subjective standards you define.
54
+ whether the work meets the rules or criteria you define.
55
55
 
56
- - **Human-appraise** provides direct judgement when the stakes require it or the loop
57
- deadlocks. Offers human oversight at critical decision points.
56
+ - **Human-appraise** provides direct judgement when the stakes require it or the
57
+ cycle reaches its iteration limit. Offers human oversight at critical
58
+ decision points.
58
59
 
59
60
  Every stage commits separately, so every step leaves a record. Every decision is
60
61
  timestamped. A single loop produces an **output** — a verified draft. A flow
61
- composes one or more such loops to produce an **outcome** — the final artefact that
62
- reaches your codebase or customers.
62
+ composes one or more such loops to produce an **outcome** — the final artefact.
63
+
64
+ When the loop clears, completing the work branch requires **attest** — a final
65
+ verification that writes and commits `ATTEST.md` — followed by **finish**, which
66
+ squash-merges the approved work to the base branch with a signed attestation block.
63
67
 
64
68
  ### What you describe, what Foundry enforces
65
69
 
@@ -174,7 +178,8 @@ quench → 5/7/5 — passes [commit]
174
178
  appraise → 2 appraisers, one flags weak imagery [commit]
175
179
  forge → revises [commit]
176
180
  appraise → clean [commit]
177
- done squash-merged to main with attestation
181
+ attest ATTEST.md committed [commit]
182
+ finish → squash-merged to main with attestation
178
183
  ```
179
184
 
180
185
  Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
191
196
  `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
192
197
  and [Assay](docs/concepts.md#assay) for the configuration path.
193
198
 
194
- > **Note (3.0.0):** flow memory currently persists to `cozo-node`, which is
199
+ > **Note:** flow memory currently persists to `cozo-node`, which is
195
200
  > unmaintained upstream. Installation produces six cosmetic deprecation warnings
196
201
  > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
197
202
  > a maintained backend in a future release; the public `foundry_memory_*` tools
198
203
  > and on-disk vocabulary/NDJSON format are designed to survive that migration.
199
- > See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status-as-of-300).
204
+ > See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).
200
205
 
201
206
  ---
202
207
 
@@ -233,7 +238,8 @@ reproducible.
233
238
  state live in tested plugin code, outside LLM control.
234
239
 
235
240
  - **Written quality criteria** — laws are markdown files; an appraiser panel scores
236
- each artefact against them, so quality is objective.
241
+ each artefact against them, providing structured quality assessment from
242
+ multiple perspectives.
237
243
 
238
244
  - **Multi-model diversity** — forge on one model, appraise on another, every
239
245
  appraiser on a different model if you want. Different models catch different
package/dist/.opencode/plugins/foundry-tools/config-create-tools.js CHANGED
@@ -162,9 +162,8 @@ function cycleArgs(s) { return {
162
162
  artefacts: s.array(s.string()).describe('Artefact type IDs this cycle reads'),
163
163
  }).optional().describe('Input contract for this cycle. Omit for source cycles that start from the user goal; empty artefacts arrays are invalid.'),
164
164
  targets: s.array(s.string()).optional().describe('Downstream cycle IDs this cycle can route to'),
165
- humanAppraise: s.boolean().optional().describe('Include human-appraise in every iteration'),
166
- deadlockAppraise: s.boolean().optional().describe('Route to human-appraise on LLM appraiser deadlock'),
167
- deadlockIterations: s.number().optional().describe('Iteration threshold for deadlock detection'),
165
+ alwaysHumanAppraise: s.boolean().optional().describe('Include human-appraise in every iteration'),
166
+ deadlockHumanAppraise: s.boolean().optional().describe('Route to human-appraise when max-iterations is reached'),
168
167
  maxIterations: s.number().optional().describe('Maximum forge iterations before cycle blocks'),
169
168
  assay: s.object({
170
169
  extractors: s.array(s.string()).describe('Extractor IDs for the assay stage'),
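Reviewer note: the hunk above renames two cycle options and drops a third. A minimal sketch of how a caller might translate an old-style options object follows. The helper `migrateCycleOptions` is hypothetical and not part of the package; only the key names are taken from the diff. Since `deadlockIterations` has no one-to-one replacement (deadlock routing now keys off `maxIterations`), the sketch simply drops it rather than guessing a mapping.

```javascript
// Hypothetical helper (not part of @really-knows-ai/foundry).
// Maps the pre-rename cycle option names to the names introduced in this diff.
function migrateCycleOptions(old) {
  // Pull the three retired keys out; everything else passes through untouched.
  const { humanAppraise, deadlockAppraise, deadlockIterations, ...rest } = old;
  const next = { ...rest };
  if (humanAppraise !== undefined) next.alwaysHumanAppraise = humanAppraise;
  if (deadlockAppraise !== undefined) next.deadlockHumanAppraise = deadlockAppraise;
  // deadlockIterations is intentionally dropped: the diff removes it without
  // a direct successor, so choosing a maxIterations value is left to the author.
  return next;
}
```

Dropping the retired key instead of copying it into `maxIterations` is a deliberate choice here: the two settings are described with different semantics, so an automatic carry-over could change when a cycle blocks.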
package/dist/.opencode/plugins/foundry-tools/feedback-tools.js CHANGED
@@ -70,13 +70,17 @@ async function executeFeedbackAdd(args, context) {
70
70
 
71
71
  try {
72
72
  const store = openFeedbackStore('WORK.feedback.yaml', io);
73
- const { id, deduped } = store.add({
73
+ const params = {
74
74
  file: args.file,
75
75
  tag: args.tag,
76
76
  text: args.text,
77
77
  source: activeStage,
78
78
  cycle,
79
- });
79
+ };
80
+ if (args.artefact_version !== undefined) {
81
+ params.artefact_version = args.artefact_version;
82
+ }
83
+ const { id, deduped } = store.add(params);
80
84
  return JSON.stringify({ ok: true, id, deduped });
81
85
  } catch (err) {
82
86
  return JSON.stringify({ error: `foundry_feedback_add: ${err.message}` });
@@ -201,6 +205,7 @@ export function createFeedbackTools({ tool }) {
201
205
  file: tool.schema.string().describe('Artefact file path'),
202
206
  text: tool.schema.string().describe('Feedback text'),
203
207
  tag: tool.schema.string().describe('Tag for the feedback item'),
208
+ artefact_version: tool.schema.string().optional().describe('SHA-256 hash for version-aware sorting'),
204
209
  },
205
210
  execute: guarded('foundry_feedback_add', [flowBranchGuard, gateNotFailed], executeFeedbackAdd, { branchIo: branchIoFactory, io: asyncIoFactory }),
206
211
  }),
@@ -218,7 +223,7 @@ export function createFeedbackTools({ tool }) {
218
223
  execute: guarded('foundry_feedback_wontfix', [flowBranchGuard, gateNotFailed], executeFeedbackWontfix, { branchIo: branchIoFactory, io: asyncIoFactory }),
219
224
  }),
220
225
  foundry_feedback_resolve: tool({
221
- description: 'Resolve a feedback item (approved or rejected). In human-appraise stages, this tool can override deadlocked items by providing a reason.',
226
+ description: 'Resolve a feedback item (approved or rejected). Human-appraise stages can override deadlock with a reason.',
222
227
  args: {
223
228
  id: tool.schema.string().describe('Feedback item id (ULID)'),
224
229
  resolution: tool.schema.enum(['approved', 'rejected']).describe('Resolution type'),
@@ -230,7 +235,6 @@ export function createFeedbackTools({ tool }) {
230
235
  args: {
231
236
  file: tool.schema.string().optional().describe('Filter by artefact file path'),
232
237
  },
233
- execute: executeFeedbackList,
234
- }),
238
+ execute: executeFeedbackList }),
235
239
  };
236
240
  }
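Reviewer note: the new `artefact_version` argument is described as a SHA-256 hash used for version-aware sorting. The sketch below shows one way such a version string could be derived from a set of files. It is an illustration under stated assumptions, not the package's `computeArtefactVersion`: the function name `hashArtefacts`, the `{ path, content }` input shape, and the path-sorted, NUL-separated encoding are all hypothetical.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical sketch: fold a list of { path, content } entries into one
// SHA-256 hex digest. Sorting by path makes the digest independent of the
// order in which files were discovered.
function hashArtefacts(files) {
  const hash = createHash('sha256');
  const sorted = [...files].sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  for (const { path, content } of sorted) {
    // NUL separators keep ("ab", "c") distinct from ("a", "bc").
    hash.update(path);
    hash.update('\0');
    hash.update(content);
    hash.update('\0');
  }
  return hash.digest('hex');
}
```

With a digest like this, two feedback items can be compared by version string alone: equal strings mean the artefact files were byte-identical when each item was recorded.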
package/dist/.opencode/plugins/foundry-tools/orchestrate-tool.js CHANGED
@@ -38,7 +38,7 @@ function createGitBridge(cwd) {
38
38
  }
39
39
 
40
40
  async function createFinalize(cwd, io) {
41
- return async ({ cycleId, stage, baseSha }) => {
41
+ return async ({ cycleId, stage, baseSha, artefact_version, contractPassed }) => {
42
42
  let cycleDoc;
43
43
  try {
44
44
  cycleDoc = await getCycleDefinition('foundry', cycleId, io);
@@ -66,6 +66,8 @@ async function createFinalize(cwd, io) {
66
66
  cycleDef,
67
67
  artefactTypes,
68
68
  io,
69
+ artefact_version,
70
+ contractPassed,
69
71
  });
70
72
  return result;
71
73
  };
package/dist/CHANGELOG.md CHANGED
@@ -1,5 +1,43 @@
1
1
  # Changelog
2
2
 
3
+ ## [3.6.0] - 2026-05-25
4
+
5
+ ### Added
6
+
7
+ - State-driven sort routing (R1–R4, R7): sort routes based on feedback item state (`unresolved` → forge, `addressed` → source stage) instead of stage position, with deadlock detection and iteration cap.
8
+ - Artefact version tracking: `computeArtefactVersion` hashes all artefact files after each forge run; the version is recorded on every feedback item and forge history entry.
9
+ - Forge contract enforcement (R8–R9): per-item response check and batch-level version consistency check. Contract violations revert items to `open` and post system feedback. Three consecutive contract failures block the cycle.
10
+ - Stale feedback detection (R6): when quench, appraise, or human-appraise re-enters after a forge run, feedback items with a mismatched artefact version are auto-resolved as superseded.
11
+ - Legacy feedback migration: items without a valid artefact version are auto-resolved on load and excluded from routing.
12
+ - Integration test suite: 40 integration tests covering the routing decision tree, forge contract enforcement, and the spec's 10-step worked example.
13
+
14
+ ### Fixed
15
+
16
+ - Forge contract enforcement wired into the orchestration post-dispatch path (previously it finalised without computing versions or checking the contract).
17
+ - New feedback items from quench, appraise, and the plugin feedback tool now carry the current artefact version, preventing false legacy detection.
18
+ - `computeArtefactVersion` scans the worktree root for file patterns (previously scanned the `foundry/` config directory).
19
+ - Version computation failures during stale resolution are surfaced to the orchestrator instead of being silently swallowed.
20
+ - Appraise stale feedback resolution runs during the gather phase (before appraiser dispatch), not only during consolidation.
21
+
22
+ ### Changed
23
+
24
+ - Orchestration: forge post-dispatch extracts a dedicated `runForgePostDispatch` path that reads the forge context, computes versions, enforces the contract, and passes `contractPassed`/`postVersion` to `finaliseStage`.
25
+ - Quench and appraise stale resolution helpers gracefully degrade when the artefact type is not configured (best-effort, not fatal).
26
+ - Added `worktree` parameter to `computeArtefactVersion` for explicit worktree-root specification.
27
+
28
+ ## [3.5.9] - 2026-05-24
29
+
30
+ ### Changed
31
+
32
+ - Cycle frontmatter keys renamed: `human-appraise` → `always-human-appraise`, `deadlock-appraise` → `deadlock-human-appraise`, `deadlock-iterations` replaced with `max-iterations`-based deadlock routing.
33
+ - Deadlock routing replaced per-item deadlocked state with iteration-limit detection using `max-iterations` and `deadlock-human-appraise`.
34
+ - Orchestration uses iteration cap routing instead of per-item deadlock history.
35
+ - Skills and public documentation (`README.md`, `docs/*`) refreshed against the current 65-tool implementation: updated tool reference with structured config-creation arguments, corrected quench/appraise execution models, documented attestation-before-finish workflow, added Validator and ATTEST.md concepts, and fixed memory paths. All stale v3.0.x terminology removed.
36
+
37
+ ### Fixed
38
+
39
+ - Guidance audit tests aligned with current user-facing documentation style (no direct `foundry_git_finish({)` call syntax in walkthrough).
40
+
3
41
  ## [3.5.8] - 2026-05-23
4
42
 
5
43
  ### Added
package/dist/README.md CHANGED
@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
32
32
  is auditable, repeatable, and defensible to auditors and stakeholders. You can show
33
33
  exactly how the output was made. Confidence is engineered; it is not hoped for.
34
34
 
35
- ### The operating model: assay, then forge → quench → appraise
35
+ ### The operating model: assay, then forge → quench → appraise → attest → finish
36
36
 
37
37
  A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
38
38
  that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
51
51
  fast and non-negotiable, catching errors before they reach appraisers.
52
52
 
53
53
  - **Appraise** judges quality against written laws. Independent evaluators inspect
54
- whether the work meets the subjective standards you define.
54
+ whether the work meets the rules or criteria you define.
55
55
 
56
- - **Human-appraise** provides direct judgement when the stakes require it or the loop
57
- deadlocks. Offers human oversight at critical decision points.
56
+ - **Human-appraise** provides direct judgement when the stakes require it or the
57
+ cycle reaches its iteration limit. Offers human oversight at critical
58
+ decision points.
58
59
 
59
60
  Every stage commits separately, so every step leaves a record. Every decision is
60
61
  timestamped. A single loop produces an **output** — a verified draft. A flow
61
- composes one or more such loops to produce an **outcome** — the final artefact that
62
- reaches your codebase or customers.
62
+ composes one or more such loops to produce an **outcome** — the final artefact.
63
+
64
+ When the loop clears, completing the work branch requires **attest** — a final
65
+ verification that writes and commits `ATTEST.md` — followed by **finish**, which
66
+ squash-merges the approved work to the base branch with a signed attestation block.
63
67
 
64
68
  ### What you describe, what Foundry enforces
65
69
 
@@ -174,7 +178,8 @@ quench → 5/7/5 — passes [commit]
174
178
  appraise → 2 appraisers, one flags weak imagery [commit]
175
179
  forge → revises [commit]
176
180
  appraise → clean [commit]
177
- done squash-merged to main with attestation
181
+ attest ATTEST.md committed [commit]
182
+ finish → squash-merged to main with attestation
178
183
  ```
179
184
 
180
185
  Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
191
196
  `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
192
197
  and [Assay](docs/concepts.md#assay) for the configuration path.
193
198
 
194
- > **Note (3.0.0):** flow memory currently persists to `cozo-node`, which is
199
+ > **Note:** flow memory currently persists to `cozo-node`, which is
195
200
  > unmaintained upstream. Installation produces six cosmetic deprecation warnings
196
201
  > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
197
202
  > a maintained backend in a future release; the public `foundry_memory_*` tools
198
203
  > and on-disk vocabulary/NDJSON format are designed to survive that migration.
199
- > See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status-as-of-300).
204
+ > See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).
200
205
 
201
206
  ---
202
207
 
@@ -233,7 +238,8 @@ reproducible.
233
238
  state live in tested plugin code, outside LLM control.
234
239
 
235
240
  - **Written quality criteria** — laws are markdown files; an appraiser panel scores
236
- each artefact against them, so quality is objective.
241
+ each artefact against them, providing structured quality assessment from
242
+ multiple perspectives.
237
243
 
238
244
  - **Multi-model diversity** — forge on one model, appraise on another, every
239
245
  appraiser on a different model if you want. Different models catch different
package/dist/docs/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Foundry docs
2
2
 
3
- This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need.
3
+ This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need. The `docs/superpowers/` directory is future-facing internal scaffolding and is not indexed as public documentation.
4
4
 
5
5
  **How to navigate:** Work through the sections in order: **Start here** establishes conceptual foundations, **Reference** provides detailed specifications for implementation, and **Contributors** covers subsystem maintenance and extensions.
6
6
 
@@ -10,9 +10,9 @@ Getting oriented with Foundry means understanding both the concepts it uses and
10
10
 
11
11
  **Reading order:** Work through them in order; [getting-started.md](getting-started.md) builds hands-on confidence, and [concepts.md](concepts.md) provides reference depth. Most implementers spend 1–2 hours on getting-started before moving to Reference materials.
12
12
 
13
- - **getting-started.md** — Complete end-to-end installation, auto-bootstrapping, and first flow walkthrough. Read this immediately after installing the plugin and before authoring any of your own configuration.
13
+ - **getting-started.md** — Complete end-to-end installation, bootstrap, Foundry agent wizard, and first flow walkthrough. Read this immediately after installing the plugin and before authoring any of your own configuration.
14
14
 
15
- It establishes the operating model, directory structure, and practical confidence in one pass. Includes hands-on guidance on authoring the five foundational concepts (artefact types, laws, appraisers, cycles, flows) with worked examples you can run against real code. Also covers the optional flow-memory path: initialise memory, declare vocabulary, add extractors, and opt a cycle into assay for codebase-aware flows.
15
+ It covers bootstrap states, the Understand–Plan–Confirm–Build wizard protocol, structured config authoring, current flow execution with quench and appraise, attestation and branch finishing, dry-run completion, and the optional flow-memory path (initialise memory, declare vocabulary, add extractors, opt a cycle into assay). Uses worked examples you can run against real code.
16
16
 
17
17
  Implementers must follow every step and complete the bootstrap; architects typically skim for structure before moving to [concepts.md](concepts.md) and [architecture.md](architecture.md) to reason about their designs.
18
18
 
@@ -28,19 +28,19 @@ These documents specify formats, tools, and design principles. Use them when imp
28
28
 
29
29
  **Key property:** These are sources of truth and normative references. Changes to Foundry flow formats or tool behaviour must be reflected here first. Use them together—cross-references appear throughout.
30
30
 
31
- - **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields, the artefact registry, and the full feedback state machine with all valid transitions and guards.
31
+ - **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields, feedback and history state, and the full feedback state machine with all valid transitions and guards. Artefacts are discovered from branch diffs, not stored as an artefact table in the workfile.
32
32
 
33
33
  Use this when implementing tooling around work files, validating state transitions, or understanding what metadata flows carry through an execution. It is the authoritative source of truth for all transient work-branch structures, format validation rules, and field semantics.
34
34
 
35
35
  Implementers and tool builders rely on this heavily; keep it updated immediately as formats evolve or new fields are added.
36
36
 
37
- - **tools.md** — Categorical index and reference documentation for all custom tools, organised by family (lifecycle, artefacts, feedback, config, memory, etc.) with complete signatures and permissions.
37
+ - **tools.md** — Complete 65-tool categorical index and reference, organised by family (lifecycle, artefacts, feedback, config, memory, etc.) with complete signatures and permissions.
38
38
 
39
39
  Consult this when you need to understand what a specific tool does, its branch requirements, what stage locks apply, what arguments it accepts, and how it integrates with the overall system. Covers calling conventions, enforcement invariants, and the permission model for memory access. References `foundry_assay_run`, `foundry_extractor_create`, and memory data and admin tools.
40
40
 
41
41
  Tool authors and system integrators use this constantly; it is the comprehensive reference for all custom tools in the Foundry ecosystem.
42
42
 
43
- - **architecture.md** — The design and enforcement model covering token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing, and core design principles.
43
+ - **architecture.md** — The design and enforcement model covering orchestration actions (`dispatch`, `dispatch_multi`, `human_appraise`, `done`, `blocked`, `violation`), internal quench and appraise execution, branch finish paths with attestation preconditions, token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing with `defaultModel`, forge required-tool verification, and core design principles.
44
44
 
45
45
  Read this when you need to understand how Foundry maintains safety (how tokens prevent replay, why stages lock mutations, how writes are validated), what guarantees it makes and where they live in the code, or why it is structured the way it is. Explains the memory layout, assay write boundary, and failed-flow behaviour that keep extractor-populated memory auditable.
46
46
 
package/dist/docs/architecture.md CHANGED
@@ -34,7 +34,7 @@ A forge sub-agent can decline subjective feedback with a justification, and an a
34
34
 
35
35
  ### Humans can step in at known points
36
36
 
37
- Human-in-the-loop gates are first-class stages. A cycle can declare `human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-appraise: true` (the default) to pull a human in only when LLM appraisers and forge ping-pong on the same items. Human feedback takes absolute priority and cannot be wont-fixed.
37
+ Human-in-the-loop gates are first-class stages. A cycle can declare `always-human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-human-appraise: true` (the default) to pull a human in when the iteration count reaches `max-iterations`. Human feedback takes absolute priority and cannot be wont-fixed.
38
38
 
39
39
  ### Multi-model diversity
40
40
 
@@ -75,9 +75,10 @@ The following guarantees live in plugin code and are outside LLM control:
75
75
  The `orchestrate` skill is a thin driver around `foundry_orchestrate`. Its entire loop is:
76
76
 
77
77
  ```text
78
- call foundry_orchestrate({lastResult})
78
+ call foundry_orchestrate({lastResult, lastResults, baseBranch, defaultModel})
79
79
  switch on action:
80
- dispatch → dispatch the requested subagent → report back
80
+ dispatch → dispatch single subagent → report back
81
+ dispatch_multi → dispatch parallel appraiser tasks → consolidate → report back
81
82
  human_appraise → run human-appraise inline → report back
82
83
  done / blocked / violation → terminate the loop
83
84
  ```
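Reviewer note: the pseudocode loop above can be sketched in JavaScript roughly as follows. This is a minimal illustration, not the `orchestrate` skill itself. The `orchestrate` parameter stands in for the `foundry_orchestrate` tool, the `handlers` callbacks are hypothetical, and the `tasks` field on a `dispatch_multi` step is an assumed name for the task list the documentation mentions.

```javascript
// Minimal sketch of a driver loop around an orchestrate function.
// orchestrate(input) is assumed to resolve to an object with an `action` field.
async function driveCycle(orchestrate, handlers) {
  let input = {};
  for (;;) {
    const step = await orchestrate(input);
    switch (step.action) {
      case 'dispatch':
        // Single sub-agent: report its result back as lastResult.
        input = { lastResult: await handlers.dispatch(step) };
        break;
      case 'dispatch_multi':
        // Parallel appraiser tasks: report all results back as lastResults.
        // `step.tasks` is a hypothetical field name.
        input = { lastResults: await Promise.all(step.tasks.map((task) => handlers.dispatch(task))) };
        break;
      case 'human_appraise':
        input = { lastResult: await handlers.humanAppraise(step) };
        break;
      case 'done':
      case 'blocked':
      case 'violation':
        // Terminal actions end the loop and are returned to the caller.
        return step;
      default:
        throw new Error(`unknown action: ${step.action}`);
    }
  }
}
```

The driver holds no routing logic of its own. It only relays results, which matches the description of the skill as a thin wrapper around the tool.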
@@ -92,6 +93,14 @@ switch on action:
92
93
 
93
94
  Because the protocol lives in a plugin tool, the LLM cannot skip steps, reorder them, or silently drop a commit.
94
95
 
96
+ ### Internal quench execution
97
+
98
+ Quench runs inside the orchestrator as `runQuench(ctx)` — a deterministic, non-LLM validation pass. It reads the active stage from `.foundry/active-stage.json`, discovers artefact changes via branch-based artefact discovery, runs validators for each applicable law, and posts feedback with tags in the format `law:<law-id>:<validator-id>`. It also resolves prior quench feedback: items whose issues remain are set to `rejected`; items no longer present are set to `approved` (transitioned to `resolved`). The quench module is available at `src/scripts/quench-module.js`.
99
+
100
+ ### Internal appraise execution
101
+
102
+ Appraise uses `gatherAppraiseContext()` to build parallel subagent tasks, one per (artefact, appraiser) pair. It returns a `dispatch_multi` action containing the task list. The orchestrator's loop dispatches each task independently. After all appraisers report back, `consolidateAppraise()` processes the `lastResults` array: it parses each successful output for structured issues, de-duplicates across appraisers, posts feedback with `law:<law-id>` tags, and resolves prior appraise feedback items (resolves stale items, rejects items still present). The appraise module is available at `src/scripts/appraise-module.js`.
103
+
95
104
  ---
96
105
 
97
106
  ## Token lifecycle
@@ -134,9 +143,9 @@ Foundry partitions mutation across three branch namespaces. The plugin enforces
134
143
 
135
144
  | Namespace | Pattern | Owns | Created from | Finished by |
136
145
  |-----------|---------|------|--------------|-------------|
137
- | **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` | PR or direct merge to `main` |
138
- | **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | `foundry_git_finish` (squash-merges to base, preserves forensic archive) |
139
- | **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot, force-deletes branch) |
146
+ | **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` | Squash-merge to base branch; no attestation required |
147
+ | **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | Requires `ATTEST.md` at HEAD (created by `foundry_attest({ confirm: true })`). `foundry_git_finish({ confirm: true })` verifies the attestation, preserves `archive/<work-branch>-<short-sha>`, squash-merges to base, creates a signed commit (`-S`), deletes the work branch |
148
+ | **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot to `.snapshots/<run-id>/`, force-deletes branch) |
140
149
 
141
150
  ### Guard implementation
142
151
 
@@ -150,7 +159,7 @@ Implementation: `src/scripts/lib/branch-guard.js`.
150
159
 
151
160
  ### Forensic branches and snapshots
152
161
 
153
- - **Work branches** are preserved as `archive/work/<flowId>-<description>-<hash>` when `foundry_git_finish` completes. The full stage micro-commit history, `WORK.*` files, and all intermediate artefact states remain intact. The signed squash commit on the base branch references the archive branch tip SHA in its attestation block.
162
+ - **Work branches** require `ATTEST.md` at HEAD, created by `foundry_attest({ confirm: true })` before `foundry_git_finish({ confirm: true })` runs. The finish tool verifies the attestation, checks the diff SHA matches, preserves the branch as `archive/work/<flowId>-<description>-<hash>`, squash-merges to the base branch with a signed commit (`-S`), and deletes the work branch. The full stage micro-commit history, `WORK.*` files, and all intermediate artefact states remain intact. The signed squash commit on the base branch references the archive branch tip SHA in its attestation block.
154
163
  - **Dry-run branches** are force-deleted after `foundry_git_finish` captures a snapshot to `.snapshots/<runId>/` on the parent `config/*` working tree. Each snapshot includes `README.md` (metadata), `work/WORK*` (workfile triple), `diff.patch` (full diff), and `trace.jsonl` (tool-call trace).
155
164
 
156
165
  ---
@@ -233,6 +242,20 @@ Every stage runs inside a token-gated lifecycle. The sub-agent must call `foundr
233
242
 
234
243
  Input artefacts (files matching an input type's `file-patterns`) are read-only. Files outside any artefact type's patterns are read-only. Violations hard-stop the cycle with `{error: 'unexpected_files'}`.
235
244
 
245
+ ### Forge required-tool verification
246
+
247
+ During `foundry_stage_end` for a forge stage, the plugin verifies that the forge sub-agent called five required context-reading tools:
248
+
249
+ 1. `foundry_config_cycle`
250
+ 2. `foundry_workfile_get`
251
+ 3. `foundry_config_artefact_type`
252
+ 4. `foundry_config_laws`
253
+ 5. `foundry_feedback_list`
254
+
255
+ Tool calls are logged to `.foundry/.forge-tool-calls.jsonl` during stage execution. When `foundry_stage_end` runs, it checks the log against the required set. Missing required calls generate system feedback with the tag `system:missing-tool-calls` and the forge stage completes normally — the missing-tool feedback acts as a signal to the sort router. When all required tools are present, any prior `system:missing-tool-calls` feedback is resolved.
256
+
257
+ Implementation: `src/plugin/tools/stage-tools.js` (`verifyAndManageForgeTools`) and `src/scripts/lib/stage-calls.js`.
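Reviewer note: the required-tool check described in this added section amounts to a set difference between the five listed tools and the tools recorded in the call log. A minimal sketch follows. It is not the code in `stage-tools.js`; the function name `missingForgeTools` is hypothetical, and the log format (one JSON object per line with a `tool` field) is an assumption, since the diff does not show the log schema.

```javascript
// The five required context-reading tools, as listed in the documentation.
const REQUIRED_FORGE_TOOLS = [
  'foundry_config_cycle',
  'foundry_workfile_get',
  'foundry_config_artefact_type',
  'foundry_config_laws',
  'foundry_feedback_list',
];

// Hypothetical sketch: return the required tools that never appear in a
// JSONL call log. Assumes each non-empty line parses to { tool: string, ... }.
function missingForgeTools(jsonlText) {
  const called = new Set();
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue; // tolerate blank lines and a trailing newline
    called.add(JSON.parse(line).tool);
  }
  return REQUIRED_FORGE_TOOLS.filter((name) => !called.has(name));
}
```

An empty result would correspond to the case where prior `system:missing-tool-calls` feedback is resolved; a non-empty one to the case where new system feedback is posted.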
258
+
236
259
  ### Failed flow state
237
260
 
238
261
  When an unrecoverable error occurs (e.g. assay extractor abort, invalid JSONL, or memory-sync failure), the orchestrator marks `WORK.md` frontmatter with `status: failed` and a `reason`. The flow is then locked:
@@ -258,10 +281,10 @@ All pipeline skills (`orchestrate`, `flow`, stage skills) check for this state a
 
  `src/scripts/sort.js` (exported as `runSort`) owns the routing engine. It reads `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml`, then decides which stage runs next based on:
 
- - **Unresolved feedback.** If `quench` or `appraise` feedback exists in a non-terminal state (`open`, `actioned`, `wont-fix`), the next stage is usually `forge` or the originating evaluation stage.
- - **Deadlock detection.** If the same feedback items ping-pong between forge and appraise for `deadlock-iterations` (default 5) iterations, sort marks them `deadlocked` and inserts a `human-appraise` stage (if `deadlock-appraise: true`, the default).
- - **Iteration limits.** If `max-iterations` is exceeded, the cycle is marked `blocked` and control returns to the user.
+ - **Unresolved feedback.** If feedback exists in a non-terminal state (`open`, `actioned`, `wont-fix`), the next stage is `forge` (for items needing action) or the originating evaluation stage (for items pending approval).
+ - **Iteration limits and deadlock routing.** When the forge iteration count reaches `max-iterations` with unresolved feedback, sort routes to `human-appraise` (if `deadlock-human-appraise: true`, the default) or marks the cycle `blocked` if human routing is disabled. Sort does not write per-item deadlocked state; deadlock is a routing decision, not a feedback item state.
  - **Clean state.** If all feedback is resolved and no new validation or appraisal failures exist, the cycle is `done`.
+ - **Blocked.** If `max-iterations` is exceeded and `deadlock-human-appraise` is `false`, the cycle is marked `blocked` and control returns to the user.
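The routing rules in the updated bullets can be sketched as a small decision function. This is an illustrative sketch only, not the `runSort` implementation: the input shape, the `nextRoute` name, and the reading that `open` items need forge action while `actioned` and `wont-fix` items await approval are assumptions, and the check for new validation or appraisal failures is omitted.

```javascript
// Hypothetical sketch of the routing decision; the input shape is assumed.
const NON_TERMINAL = new Set(['open', 'actioned', 'wont-fix']);

function nextRoute({ feedback, forgeIterations, maxIterations, deadlockHumanAppraise = true }) {
  const unresolved = feedback.filter((item) => NON_TERMINAL.has(item.state));

  // Clean state: nothing left to act on or approve.
  if (unresolved.length === 0) return 'done';

  // Iteration limit reached with unresolved feedback: deadlock is a routing
  // decision, so no per-item state is written here.
  if (forgeIterations >= maxIterations) {
    return deadlockHumanAppraise ? 'human-appraise' : 'blocked';
  }

  // Assumption: `open` items need forge action; everything else is pending
  // approval by the stage named before the colon in `source`.
  if (unresolved.some((item) => item.state === 'open')) return 'forge';
  return unresolved[0].source.split(':')[0];
}
```

For example, a single `actioned` item with source `appraise:pedantic` would route back to `appraise` under these assumptions.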
 
  ### Feedback state machine
 
@@ -269,15 +292,15 @@ Feedback items live in `WORK.feedback.yaml` with a full transition history. Each
 
  - `id` — a ULID.
  - `source` — the stage that created it (e.g. `quench:check-syllables`, `appraise:pedantic`, `human-appraise:hitl`).
- - `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected`, `deadlocked`).
+ - `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected`).
  - `history` — append-only log of state transitions with timestamps and metadata.
 
  Transitions are **source-based**:
 
  | Source stage | Forge can `wont-fix`? | Resolved by |
  |--------------|------------------------|-------------|
- | `quench` (CLI validation) | No — must `actioned` | the originating `quench` stage, or `human-appraise` override |
- | `appraise` (subjective law) | Yes (with reason) | the originating `appraise` stage, or `human-appraise` override |
+ | `quench` (deterministic validation) | No — must `actioned` | the originating `quench` stage, or `human-appraise` override |
+ | `appraise` (law evaluation) | Yes (with reason) | the originating `appraise` stage, or `human-appraise` override |
  | `human-appraise` (user instruction) | No — must `actioned` | the originating `human-appraise` stage |
 
  Implementation: `src/scripts/lib/feedback-transitions.js` and `src/scripts/lib/feedback-store.js`. See [work-spec.md](work-spec.md) for the full state machine table.
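The source-based rules in the table can be expressed as a small lookup. This is a simplified sketch, not the contents of `feedback-transitions.js`: the rule table, function names, and the reduction of "the originating stage" to the stage kind before the colon in `source` are assumptions for illustration.

```javascript
// Hypothetical encoding of the table above, keyed by source stage kind.
const SOURCE_RULES = {
  quench: { forgeCanWontFix: false, resolvedBy: ['quench', 'human-appraise'] },
  appraise: { forgeCanWontFix: true, resolvedBy: ['appraise', 'human-appraise'] },
  'human-appraise': { forgeCanWontFix: false, resolvedBy: ['human-appraise'] },
};

// 'quench:check-syllables' -> 'quench'
function stageOf(source) {
  return source.split(':')[0];
}

function ruleFor(source) {
  const rule = SOURCE_RULES[stageOf(source)];
  if (!rule) throw new Error(`unknown source stage: ${source}`);
  return rule;
}

// Forge may always mark an item `actioned`; `wont-fix` is allowed only for
// sources that permit it, and only when a reason is supplied.
function canForgeTransition(source, targetState, reason) {
  const rule = ruleFor(source);
  if (targetState === 'actioned') return true;
  if (targetState === 'wont-fix') return rule.forgeCanWontFix && Boolean(reason);
  return false;
}

// Simplification: compares stage kinds rather than exact originating stages.
function canResolve(source, resolvingStage) {
  return ruleFor(source).resolvedBy.includes(stageOf(resolvingStage));
}
```

Under these assumptions, `canForgeTransition('quench:check-syllables', 'wont-fix', 'too strict')` is rejected, while the same call with an `appraise:pedantic` source is accepted.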
@@ -290,24 +313,27 @@ Different stages can run on different models for cognitive diversity. Cycle defi
 
  ### Configuration
 
+ - **Orchestrator argument.** `defaultModel` (optional) can be passed as an orchestrator argument. When set, it serves as the fallback for any stage or appraiser that does not declare a model.
  - **Cycle-level.** Declare a `models` map in the cycle frontmatter:
  ```yaml
  models:
+   default: anthropic/claude-sonnet-4
    forge: anthropic/claude-opus-4.7
    appraise: openai/gpt-5
  ```
- - **Appraiser-level.** Individual appraisers can declare a `model` field in their personality definition; this overrides the cycle-level appraise model on a per-appraiser basis.
+ `models.default` provides a cycle-level fallback when no per-stage override exists and no `defaultModel` is passed to the orchestrator.
+ - **Appraiser-level.** Individual appraisers can declare a `model` field in their personality definition; this overrides the cycle-level appraise model and `models.default` on a per-appraiser basis.
 
  ### Agent files
 
  The user-facing `Foundry` agent is installed by the plugin's `config` hook as `.opencode/agents/foundry.md`. Users switch to this agent after restarting OpenCode. It guides authoring and flow execution while generated `foundry-*` stage agents remain hidden routing targets for specific models.
 
- `foundry_refresh_agents()` generates a `foundry-<slug>.md` agent file in `.opencode/agents/` for every model available in the session, where `<slug>` is the model ID with both `/` and `.` replaced by `-` (e.g. `anthropic-claude-opus-4-7.md`).
+ `foundry_refresh_agents` generates a `foundry-<slug>.md` agent file in `.opencode/agents/` for every model available in the session, where `<slug>` is the model ID with both `/` and `.` replaced by `-` (e.g. `anthropic-claude-opus-4-7.md`). In code examples, write the tool invocation as `foundry_refresh_agents()`.
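The slug rule can be illustrated with a short sketch. The helper names `modelSlug` and `agentFilePath` are hypothetical and are not taken from `slug.js`; the path follows the `foundry-<slug>.md` pattern stated above.

```javascript
// Hypothetical helpers illustrating the slug rule: replace every `/` and `.`
// in the model ID with `-`.
function modelSlug(modelId) {
  return modelId.replace(/[/.]/g, '-');
}

// Build the agent file path following the `foundry-<slug>.md` pattern.
function agentFilePath(modelId) {
  return `.opencode/agents/foundry-${modelSlug(modelId)}.md`;
}
```

With these helpers, `modelSlug('anthropic/claude-opus-4.7')` yields `anthropic-claude-opus-4-7`.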
 
  ### Dispatch behaviour
 
- - **Non-appraise stages** (forge, quench, assay): if the cycle declares `models.<stage>`, the orchestrator dispatches to `foundry-<slug>` and hard-fails if `.opencode/agents/foundry-<slug>.md` is missing. If `models.<stage>` is not set, the stage is dispatched with the `general` subagent (session default).
- - **Appraise stage**: each appraiser is dispatched independently by the `appraise` skill. If an appraiser has its own `model`, the skill dispatches to `foundry-<slug>` and hard-fails if that agent file is missing; otherwise the appraiser runs under the `general` subagent.
+ - **Non-appraise stages** (forge, quench, assay): the orchestrator resolves the model by checking `models.<stage>`, then `defaultModel`, then `models.default`, then falls back to `general` (session default). If a specific model is resolved, the orchestrator dispatches to `foundry-<slug>` and hard-fails if `.opencode/agents/foundry-<slug>.md` is missing.
+ - **Appraise stage**: each appraiser is dispatched independently by the appraise module. The model resolution order is: appraiser's own `model` field, then `defaultModel`, then `models.default`, then `general`. If a specific model is resolved, the task is dispatched to `foundry-<slug>` and the orchestrator hard-fails if that agent file is missing.
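The two resolution orders in the updated bullets can be sketched as fallback chains. This is an illustration under assumptions, not the code in `helpers.js`: the function names and argument shapes are hypothetical, and the appraiser chain follows the order exactly as listed in the bullet above.

```javascript
// Return the first non-empty string candidate, else the session default.
function firstDefined(...candidates) {
  const hit = candidates.find((c) => typeof c === 'string' && c !== '');
  return hit ?? 'general';
}

// Non-appraise stages: models.<stage>, then defaultModel, then models.default.
function resolveStageModel(stage, { models = {}, defaultModel } = {}) {
  return firstDefined(models[stage], defaultModel, models.default);
}

// Appraisers: own `model`, then defaultModel, then models.default.
function resolveAppraiserModel(appraiser, { models = {}, defaultModel } = {}) {
  return firstDefined(appraiser.model, defaultModel, models.default);
}

// Map a resolved model to its dispatch target (`general` stays as-is).
function dispatchTarget(model) {
  return model === 'general' ? 'general' : `foundry-${model.replace(/[/.]/g, '-')}`;
}
```

A caller would still need to confirm that `.opencode/agents/<target>.md` exists before dispatching, since a missing agent file is a hard failure.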
 
  Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `src/skills/appraise/SKILL.md`.
 
@@ -331,12 +357,13 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
  │ │ ├── appraise/
  │ │ ├── human-appraise/
  │ │ ├── add-artefact-type/ # authoring
- │ │ ├── add-artefact-type/
  │ │ ├── add-law/
  │ │ ├── add-appraiser/
  │ │ ├── add-cycle/
  │ │ ├── add-flow/
  │ │ ├── add-extractor/
+ │ │ ├── assay/ # deterministic extractor execution
+ │ │ ├── dry-run/ # dry-run execution and snapshots
  │ │ ├── list-agents/ # utility
  │ │ ├── refresh-agents/ # utility (now backed by foundry_refresh_agents tool)
  │ │ ├── upgrade-foundry/
@@ -352,7 +379,7 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
  │ └── scripts/
  │ ├── lib/ # shared libraries (injectable I/O)
  │ │ ├── workfile.js # WORK.md frontmatter
- │ ├── artefacts.js # artefact table operations
+ │ ├── artefacts.js # artefact discovery via branch diffs
  │ │ ├── history.js # WORK.history.yaml operations
  │ │ ├── feedback-store.js
  │ │ ├── feedback-transitions.js
@@ -367,10 +394,18 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
  │ │ ├── state.js
  │ │ ├── config.js # foundry/ config readers
  │ │ ├── slug.js
+ │ │ ├── tool-paths.js
+ │ │ ├── stage-calls.js # forge tool-call logging and verification
+ │ │ ├── sort-routing.js
+ │ │ ├── sort-reason.js
+ │ │ ├── sort-fs-check.js
+ │ │ ├── validation.js
  │ │ ├── ulid.js
  │ │ ├── tracing.js
  │ │ ├── failed-flow.js
  │ │ ├── git-bridge.js
+ │ │ ├── git-finish/ # branch finishing logic
+ │ │ ├── attestation/ # ATTEST.md generation and verification
  │ │ ├── git-policy.js
  │ │ ├── assay/
  │ │ ├── config-creators/
@@ -378,6 +413,11 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
  │ │ ├── snapshot/
  │ │ └── memory/ # flow memory (Cozo 0.7)
  │ ├── orchestrate.js # orchestration loop (exports runOrchestrate)
+ │ ├── orchestrate-cycle.js
+ │ ├── orchestrate-phases.js
+ │ ├── orchestrate-terminals.js
+ │ ├── quench-module.js # deterministic validation (runQuench)
+ │ ├── appraise-module.js # appraise gather and consolidate
  │ └── sort.js # routing engine (exports runSort)
  ├── scripts/
  │ └── build.js # builds src/ into dist/