@really-knows-ai/foundry 3.5.8 → 3.6.0
- package/README.md +16 -10
- package/dist/.opencode/plugins/foundry-tools/config-create-tools.js +2 -3
- package/dist/.opencode/plugins/foundry-tools/feedback-tools.js +9 -5
- package/dist/.opencode/plugins/foundry-tools/orchestrate-tool.js +3 -1
- package/dist/CHANGELOG.md +38 -0
- package/dist/README.md +16 -10
- package/dist/docs/README.md +6 -6
- package/dist/docs/architecture.md +59 -19
- package/dist/docs/concepts.md +55 -19
- package/dist/docs/getting-started.md +37 -15
- package/dist/docs/memory-maintenance.md +3 -3
- package/dist/docs/tools.md +131 -70
- package/dist/docs/work-spec.md +38 -52
- package/dist/scripts/appraise-module.js +69 -7
- package/dist/scripts/lib/artefacts.js +43 -1
- package/dist/scripts/lib/config-creators/cycle.js +6 -10
- package/dist/scripts/lib/config-validators/cycle.js +1 -9
- package/dist/scripts/lib/feedback-store.js +26 -51
- package/dist/scripts/lib/finalize.js +10 -2
- package/dist/scripts/lib/forge-contract.js +93 -0
- package/dist/scripts/lib/history.js +2 -1
- package/dist/scripts/lib/sort-reason.js +11 -8
- package/dist/scripts/lib/sort-routing.js +185 -63
- package/dist/scripts/lib/workfile.js +28 -0
- package/dist/scripts/orchestrate-cycle.js +3 -13
- package/dist/scripts/orchestrate-phases.js +51 -45
- package/dist/scripts/orchestrate-terminals.js +37 -2
- package/dist/scripts/orchestrate.js +62 -5
- package/dist/scripts/quench-module.js +54 -12
- package/dist/scripts/sort.js +42 -62
- package/dist/skills/add-cycle/SKILL.md +4 -4
- package/dist/skills/add-flow/SKILL.md +1 -1
- package/dist/skills/human-appraise/SKILL.md +12 -40
- package/package.json +1 -1
package/README.md
CHANGED
@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
 is auditable, repeatable, and defensible to auditors and stakeholders. You can show
 exactly how the output was made. Confidence is engineered; it is not hoped for.

-### The operating model: assay, then forge → quench → appraise
+### The operating model: assay, then forge → quench → appraise → attest → finish

 A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
 that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
 fast and non-negotiable, catching errors before they reach appraisers.

 - **Appraise** judges quality against written laws. Independent evaluators inspect
-whether the work meets the
+whether the work meets the rules or criteria you define.

-- **Human-appraise** provides direct judgement when the stakes require it or the
-
+- **Human-appraise** provides direct judgement when the stakes require it or the
+cycle reaches its iteration limit. Offers human oversight at critical
+decision points.

 Every stage commits separately, so every step leaves a record. Every decision is
 timestamped. A single loop produces an **output** — a verified draft. A flow
-composes one or more such loops to produce an **outcome** — the final artefact
-
+composes one or more such loops to produce an **outcome** — the final artefact.
+
+When the loop clears, completing the work branch requires **attest** — a final
+verification that writes and commits `ATTEST.md` — followed by **finish**, which
+squash-merges the approved work to the base branch with a signed attestation block.

 ### What you describe, what Foundry enforces

@@ -174,7 +178,8 @@ quench → 5/7/5 — passes [commit]
 appraise → 2 appraisers, one flags weak imagery [commit]
 forge → revises [commit]
 appraise → clean [commit]
-
+attest → ATTEST.md committed [commit]
+finish → squash-merged to main with attestation
 ```

 Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
 `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
 and [Assay](docs/concepts.md#assay) for the configuration path.

-> **Note
+> **Note:** flow memory currently persists to `cozo-node`, which is
 > unmaintained upstream. Installation produces six cosmetic deprecation warnings
 > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
 > a maintained backend in a future release; the public `foundry_memory_*` tools
 > and on-disk vocabulary/NDJSON format are designed to survive that migration.
-> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status
+> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).

 ---

@@ -233,7 +238,8 @@ reproducible.
 state live in tested plugin code, outside LLM control.

 - **Written quality criteria** — laws are markdown files; an appraiser panel scores
-each artefact against them,
+each artefact against them, providing structured quality assessment from
+multiple perspectives.

 - **Multi-model diversity** — forge on one model, appraise on another, every
 appraiser on a different model if you want. Different models catch different
package/dist/.opencode/plugins/foundry-tools/config-create-tools.js
CHANGED
@@ -162,9 +162,8 @@ function cycleArgs(s) { return {
 artefacts: s.array(s.string()).describe('Artefact type IDs this cycle reads'),
 }).optional().describe('Input contract for this cycle. Omit for source cycles that start from the user goal; empty artefacts arrays are invalid.'),
 targets: s.array(s.string()).optional().describe('Downstream cycle IDs this cycle can route to'),
-
-
-deadlockIterations: s.number().optional().describe('Iteration threshold for deadlock detection'),
+alwaysHumanAppraise: s.boolean().optional().describe('Include human-appraise in every iteration'),
+deadlockHumanAppraise: s.boolean().optional().describe('Route to human-appraise when max-iterations is reached'),
 maxIterations: s.number().optional().describe('Maximum forge iterations before cycle blocks'),
 assay: s.object({
 extractors: s.array(s.string()).describe('Extractor IDs for the assay stage'),
package/dist/.opencode/plugins/foundry-tools/feedback-tools.js
CHANGED
@@ -70,13 +70,17 @@ async function executeFeedbackAdd(args, context) {

 try {
 const store = openFeedbackStore('WORK.feedback.yaml', io);
-const
+const params = {
 file: args.file,
 tag: args.tag,
 text: args.text,
 source: activeStage,
 cycle,
-}
+};
+if (args.artefact_version !== undefined) {
+params.artefact_version = args.artefact_version;
+}
+const { id, deduped } = store.add(params);
 return JSON.stringify({ ok: true, id, deduped });
 } catch (err) {
 return JSON.stringify({ error: `foundry_feedback_add: ${err.message}` });
@@ -201,6 +205,7 @@ export function createFeedbackTools({ tool }) {
 file: tool.schema.string().describe('Artefact file path'),
 text: tool.schema.string().describe('Feedback text'),
 tag: tool.schema.string().describe('Tag for the feedback item'),
+artefact_version: tool.schema.string().optional().describe('SHA-256 hash for version-aware sorting'),
 },
 execute: guarded('foundry_feedback_add', [flowBranchGuard, gateNotFailed], executeFeedbackAdd, { branchIo: branchIoFactory, io: asyncIoFactory }),
 }),
@@ -218,7 +223,7 @@ export function createFeedbackTools({ tool }) {
 execute: guarded('foundry_feedback_wontfix', [flowBranchGuard, gateNotFailed], executeFeedbackWontfix, { branchIo: branchIoFactory, io: asyncIoFactory }),
 }),
 foundry_feedback_resolve: tool({
-description: 'Resolve a feedback item (approved or rejected).
+description: 'Resolve a feedback item (approved or rejected). Human-appraise stages can override deadlock with a reason.',
 args: {
 id: tool.schema.string().describe('Feedback item id (ULID)'),
 resolution: tool.schema.enum(['approved', 'rejected']).describe('Resolution type'),
@@ -230,7 +235,6 @@ export function createFeedbackTools({ tool }) {
 args: {
 file: tool.schema.string().optional().describe('Filter by artefact file path'),
 },
-execute: executeFeedbackList,
-}),
+execute: executeFeedbackList }),
 };
 }
package/dist/.opencode/plugins/foundry-tools/orchestrate-tool.js
CHANGED
@@ -38,7 +38,7 @@ function createGitBridge(cwd) {
 }

 async function createFinalize(cwd, io) {
-return async ({ cycleId, stage, baseSha }) => {
+return async ({ cycleId, stage, baseSha, artefact_version, contractPassed }) => {
 let cycleDoc;
 try {
 cycleDoc = await getCycleDefinition('foundry', cycleId, io);
@@ -66,6 +66,8 @@ async function createFinalize(cwd, io) {
 cycleDef,
 artefactTypes,
 io,
+artefact_version,
+contractPassed,
 });
 return result;
 };
package/dist/CHANGELOG.md
CHANGED
@@ -1,5 +1,43 @@
 # Changelog

+## [3.6.0] - 2026-05-25
+
+### Added
+
+- State-driven sort routing (R1–R4, R7): sort routes based on feedback item state (`unresolved` → forge, `addressed` → source stage) instead of stage position, with deadlock detection and iteration cap.
+- Artefact version tracking: `computeArtefactVersion` hashes all artefact files after each forge run; the version is recorded on every feedback item and forge history entry.
+- Forge contract enforcement (R8–R9): per-item response check and batch-level version consistency check. Contract violations revert items to `open` and post system feedback. Three consecutive contract failures block the cycle.
+- Stale feedback detection (R6): when quench, appraise, or human-appraise re-enters after a forge run, feedback items with a mismatched artefact version are auto-resolved as superseded.
+- Legacy feedback migration: items without a valid artefact version are auto-resolved on load and excluded from routing.
+- Integration test suite: 40 integration tests covering the routing decision tree, forge contract enforcement, and the spec's 10-step worked example.
+
+### Fixed
+
+- Forge contract enforcement wired into the orchestration post-dispatch path (previously it finalised without computing versions or checking the contract).
+- New feedback items from quench, appraise, and the plugin feedback tool now carry the current artefact version, preventing false legacy detection.
+- `computeArtefactVersion` scans the worktree root for file patterns (previously scanned the `foundry/` config directory).
+- Version computation failures during stale resolution are surfaced to the orchestrator instead of being silently swallowed.
+- Appraise stale feedback resolution runs during the gather phase (before appraiser dispatch), not only during consolidation.
+
+### Changed
+
+- Orchestration: forge post-dispatch extracts a dedicated `runForgePostDispatch` path that reads the forge context, computes versions, enforces the contract, and passes `contractPassed`/`postVersion` to `finaliseStage`.
+- Quench and appraise stale resolution helpers gracefully degrade when the artefact type is not configured (best-effort, not fatal).
+- Added `worktree` parameter to `computeArtefactVersion` for explicit worktree-root specification.
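The 3.6.0 entries above describe artefact version tracking and stale feedback detection only in prose. The following is a minimal sketch of the idea, not the package's implementation: `sketchArtefactVersion` and `isStale` are hypothetical names, the in-memory `files` map stands in for whatever `computeArtefactVersion` actually reads from the worktree, and the byte layout fed to the hash is an assumption. The only detail taken from the diff is that the version is a SHA-256 hash stored as `artefact_version`.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: derive one version string from a set of artefact files.
// `files` maps a relative path to its content. The real computeArtefactVersion
// may discover files and encode them differently.
function sketchArtefactVersion(files) {
  const hash = createHash('sha256');
  // Sort paths so the result does not depend on object key order.
  for (const path of Object.keys(files).sort()) {
    hash.update(path);
    hash.update('\0'); // separator so "ab" + "c" differs from "a" + "bc"
    hash.update(files[path]);
    hash.update('\0');
  }
  return hash.digest('hex');
}

// Hypothetical stale check: an item is treated as stale when it was recorded
// against a different artefact version than the current one. Items with no
// recorded version also compare as stale under this sketch.
function isStale(item, currentVersion) {
  return item.artefact_version !== currentVersion;
}
```

Under these assumptions, any edit to any artefact file changes the version, so feedback recorded before a forge run no longer matches the version computed after it.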
+
+## [3.5.9] - 2026-05-24
+
+### Changed
+
+- Cycle frontmatter keys renamed: `human-appraise` → `always-human-appraise`, `deadlock-appraise` → `deadlock-human-appraise`, `deadlock-iterations` replaced with `max-iterations`-based deadlock routing.
+- Deadlock routing replaced per-item deadlocked state with iteration-limit detection using `max-iterations` and `deadlock-human-appraise`.
+- Orchestration uses iteration cap routing instead of per-item deadlock history.
+- Skills and public documentation (`README.md`, `docs/*`) refreshed against the current 65-tool implementation: updated tool reference with structured config-creation arguments, corrected quench/appraise execution models, documented attestation-before-finish workflow, added Validator and ATTEST.md concepts, and fixed memory paths. All stale v3.0.x terminology removed.
+
+### Fixed
+
+- Guidance audit tests aligned with current user-facing documentation style (no direct `foundry_git_finish({)` call syntax in walkthrough).
+
 ## [3.5.8] - 2026-05-23

 ### Added
package/dist/README.md
CHANGED
@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
 is auditable, repeatable, and defensible to auditors and stakeholders. You can show
 exactly how the output was made. Confidence is engineered; it is not hoped for.

-### The operating model: assay, then forge → quench → appraise
+### The operating model: assay, then forge → quench → appraise → attest → finish

 A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
 that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
 fast and non-negotiable, catching errors before they reach appraisers.

 - **Appraise** judges quality against written laws. Independent evaluators inspect
-whether the work meets the
+whether the work meets the rules or criteria you define.

-- **Human-appraise** provides direct judgement when the stakes require it or the
-
+- **Human-appraise** provides direct judgement when the stakes require it or the
+cycle reaches its iteration limit. Offers human oversight at critical
+decision points.

 Every stage commits separately, so every step leaves a record. Every decision is
 timestamped. A single loop produces an **output** — a verified draft. A flow
-composes one or more such loops to produce an **outcome** — the final artefact
-
+composes one or more such loops to produce an **outcome** — the final artefact.
+
+When the loop clears, completing the work branch requires **attest** — a final
+verification that writes and commits `ATTEST.md` — followed by **finish**, which
+squash-merges the approved work to the base branch with a signed attestation block.

 ### What you describe, what Foundry enforces

@@ -174,7 +178,8 @@ quench → 5/7/5 — passes [commit]
 appraise → 2 appraisers, one flags weak imagery [commit]
 forge → revises [commit]
 appraise → clean [commit]
-
+attest → ATTEST.md committed [commit]
+finish → squash-merged to main with attestation
 ```

 Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
 `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
 and [Assay](docs/concepts.md#assay) for the configuration path.

-> **Note
+> **Note:** flow memory currently persists to `cozo-node`, which is
 > unmaintained upstream. Installation produces six cosmetic deprecation warnings
 > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
 > a maintained backend in a future release; the public `foundry_memory_*` tools
 > and on-disk vocabulary/NDJSON format are designed to survive that migration.
-> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status
+> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).

 ---

@@ -233,7 +238,8 @@ reproducible.
 state live in tested plugin code, outside LLM control.

 - **Written quality criteria** — laws are markdown files; an appraiser panel scores
-each artefact against them,
+each artefact against them, providing structured quality assessment from
+multiple perspectives.

 - **Multi-model diversity** — forge on one model, appraise on another, every
 appraiser on a different model if you want. Different models catch different
package/dist/docs/README.md
CHANGED
@@ -1,6 +1,6 @@
 # Foundry docs

-This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need.
+This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need. The `docs/superpowers/` directory is future-facing internal scaffolding and is not indexed as public documentation.

 **How to navigate:** Work through the sections in order: **Start here** establishes conceptual foundations, **Reference** provides detailed specifications for implementation, and **Contributors** covers subsystem maintenance and extensions.

@@ -10,9 +10,9 @@ Getting oriented with Foundry means understanding both the concepts it uses and

 **Reading order:** Work through them in order; [getting-started.md](getting-started.md) builds hands-on confidence, and [concepts.md](concepts.md) provides reference depth. Most implementers spend 1–2 hours on getting-started before moving to Reference materials.

-- **getting-started.md** — Complete end-to-end installation,
+- **getting-started.md** — Complete end-to-end installation, bootstrap, Foundry agent wizard, and first flow walkthrough. Read this immediately after installing the plugin and before authoring any of your own configuration.

-It
+It covers bootstrap states, the Understand–Plan–Confirm–Build wizard protocol, structured config authoring, current flow execution with quench and appraise, attestation and branch finishing, dry-run completion, and the optional flow-memory path (initialise memory, declare vocabulary, add extractors, opt a cycle into assay). Uses worked examples you can run against real code.

 Implementers must follow every step and complete the bootstrap; architects typically skim for structure before moving to [concepts.md](concepts.md) and [architecture.md](architecture.md) to reason about their designs.

@@ -28,19 +28,19 @@ These documents specify formats, tools, and design principles. Use them when imp

 **Key property:** These are sources of truth and normative references. Changes to Foundry flow formats or tool behaviour must be reflected here first. Use them together—cross-references appear throughout.

-- **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields,
+- **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields, feedback and history state, and the full feedback state machine with all valid transitions and guards. Artefacts are discovered from branch diffs, not stored as an artefact table in the workfile.

 Use this when implementing tooling around work files, validating state transitions, or understanding what metadata flows carry through an execution. It is the authoritative source of truth for all transient work-branch structures, format validation rules, and field semantics.

 Implementers and tool builders rely on this heavily; keep it updated immediately as formats evolve or new fields are added.

-- **tools.md** —
+- **tools.md** — Complete 65-tool categorical index and reference, organised by family (lifecycle, artefacts, feedback, config, memory, etc.) with complete signatures and permissions.

 Consult this when you need to understand what a specific tool does, its branch requirements, what stage locks apply, what arguments it accepts, and how it integrates with the overall system. Covers calling conventions, enforcement invariants, and the permission model for memory access. References `foundry_assay_run`, `foundry_extractor_create`, and memory data and admin tools.

 Tool authors and system integrators use this constantly; it is the comprehensive reference for all custom tools in the Foundry ecosystem.

-- **architecture.md** — The design and enforcement model covering token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing, and core design principles.
+- **architecture.md** — The design and enforcement model covering orchestration actions (`dispatch`, `dispatch_multi`, `human_appraise`, `done`, `blocked`, `violation`), internal quench and appraise execution, branch finish paths with attestation preconditions, token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing with `defaultModel`, forge required-tool verification, and core design principles.

 Read this when you need to understand how Foundry maintains safety (how tokens prevent replay, why stages lock mutations, how writes are validated), what guarantees it makes and where they live in the code, or why it is structured the way it is. Explains the memory layout, assay write boundary, and failed-flow behaviour that keep extractor-populated memory auditable.

package/dist/docs/architecture.md
CHANGED
@@ -34,7 +34,7 @@ A forge sub-agent can decline subjective feedback with a justification, and an a

 ### Humans can step in at known points

-Human-in-the-loop gates are first-class stages. A cycle can declare `human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-appraise: true` (the default) to pull a human in
+Human-in-the-loop gates are first-class stages. A cycle can declare `always-human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-human-appraise: true` (the default) to pull a human in when the iteration count reaches `max-iterations`. Human feedback takes absolute priority and cannot be wont-fixed.

 ### Multi-model diversity

@@ -75,9 +75,10 @@ The following guarantees live in plugin code and are outside LLM control:
 The `orchestrate` skill is a thin driver around `foundry_orchestrate`. Its entire loop is:

 ```text
-call foundry_orchestrate({lastResult})
+call foundry_orchestrate({lastResult, lastResults, baseBranch, defaultModel})
 switch on action:
-dispatch → dispatch
+dispatch → dispatch single subagent → report back
+dispatch_multi → dispatch parallel appraiser tasks → consolidate → report back
 human_appraise → run human-appraise inline → report back
 done / blocked / violation → terminate the loop
 ```
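The driver loop in the hunk above can be sketched in JavaScript as follows. This is an illustration under stated assumptions rather than the skill's actual code: `driveLoop` and its four injected callbacks are hypothetical stand-ins, and the `tasks` field on a `dispatch_multi` step is an assumed name. The six action names and the `lastResult` / `lastResults` arguments come from the diff.

```javascript
// Hypothetical driver. `orchestrate`, `dispatchOne`, `dispatchMany` and
// `humanAppraise` are injected stand-ins, not Foundry's real functions.
async function driveLoop({ orchestrate, dispatchOne, dispatchMany, humanAppraise }) {
  let input = {}; // first call carries no previous result
  for (;;) {
    const step = await orchestrate(input);
    switch (step.action) {
      case 'dispatch':
        // One sub-agent runs; its result is reported back on the next call.
        input = { lastResult: await dispatchOne(step) };
        break;
      case 'dispatch_multi':
        // Several appraiser tasks run; all results are reported back together.
        // `step.tasks` is an assumed field name.
        input = { lastResults: await dispatchMany(step.tasks) };
        break;
      case 'human_appraise':
        input = { lastResult: await humanAppraise(step) };
        break;
      case 'done':
      case 'blocked':
      case 'violation':
        return step.action; // terminal actions end the loop
      default:
        throw new Error(`unknown action: ${step.action}`);
    }
  }
}
```

The driver holds no routing logic of its own: every decision about what runs next comes back from the orchestrate call, which is the property the surrounding text relies on.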
@@ -92,6 +93,14 @@ switch on action:

 Because the protocol lives in a plugin tool, the LLM cannot skip steps, reorder them, or silently drop a commit.

+### Internal quench execution
+
+Quench runs inside the orchestrator as `runQuench(ctx)` — a deterministic, non-LLM validation pass. It reads the active stage from `.foundry/active-stage.json`, discovers artefact changes via branch-based artefact discovery, runs validators for each applicable law, and posts feedback with tags in the format `law:<law-id>:<validator-id>`. It also resolves prior quench feedback: items whose issues remain are set to `rejected`; items no longer present are set to `approved` (transitioned to `resolved`). The quench module is available at `src/scripts/quench-module.js`.
+
+### Internal appraise execution
+
+Appraise uses `gatherAppraiseContext()` to build parallel subagent tasks, one per (artefact, appraiser) pair. It returns a `dispatch_multi` action containing the task list. The orchestrator's loop dispatches each task independently. After all appraisers report back, `consolidateAppraise()` processes the `lastResults` array: it parses each successful output for structured issues, de-duplicates across appraisers, posts feedback with `law:<law-id>` tags, and resolves prior appraise feedback items (resolves stale items, rejects items still present). The appraise module is available at `src/scripts/appraise-module.js`.
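The two tag formats named above, `law:<law-id>:<validator-id>` for quench and `law:<law-id>` for appraise, can be told apart with a small parser. The sketch below is hypothetical: `parseLawTag` is not a Foundry function, and it assumes that law and validator IDs never contain a `:` character, which the diff does not state.

```javascript
// Hypothetical parser for feedback tags.
// Assumes law and validator IDs contain no ':' characters.
function parseLawTag(tag) {
  const parts = tag.split(':');
  const wellFormed =
    parts[0] === 'law' &&
    (parts.length === 2 || parts.length === 3) &&
    parts.slice(1).every((part) => part !== '');
  if (!wellFormed) {
    return null; // not a law tag, e.g. a system tag
  }
  // Quench tags carry a validator ID; appraise tags do not.
  return { lawId: parts[1], validatorId: parts[2] ?? null };
}
```

Here `parseLawTag('law:imagery')` yields a `null` validator ID, while a tag such as `system:missing-tool-calls` is rejected outright.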
+
 ---

 ## Token lifecycle
@@ -134,9 +143,9 @@ Foundry partitions mutation across three branch namespaces. The plugin enforces

 | Namespace | Pattern | Owns | Created from | Finished by |
 |-----------|---------|------|--------------|-------------|
-| **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` |
-| **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | `foundry_git_finish`
-| **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot
+| **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` | Squash-merge to base branch; no attestation required |
+| **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | Requires `ATTEST.md` at HEAD (created by `foundry_attest({ confirm: true })`). `foundry_git_finish({ confirm: true })` verifies the attestation, preserves `archive/<work-branch>-<short-sha>`, squash-merges to base, creates a signed commit (`-S`), deletes the work branch |
+| **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot to `.snapshots/<run-id>/`, force-deletes branch) |

 ### Guard implementation

@@ -150,7 +159,7 @@ Implementation: `src/scripts/lib/branch-guard.js`.

 ### Forensic branches and snapshots

-- **Work branches**
+- **Work branches** require `ATTEST.md` at HEAD, created by `foundry_attest({ confirm: true })` before `foundry_git_finish({ confirm: true })` runs. The finish tool verifies the attestation, checks the diff SHA matches, preserves the branch as `archive/work/<flowId>-<description>-<hash>`, squash-merges to the base branch with a signed commit (`-S`), and deletes the work branch. The full stage micro-commit history, `WORK.*` files, and all intermediate artefact states remain intact. The signed squash commit on the base branch references the archive branch tip SHA in its attestation block.
 - **Dry-run branches** are force-deleted after `foundry_git_finish` captures a snapshot to `.snapshots/<runId>/` on the parent `config/*` working tree. Each snapshot includes `README.md` (metadata), `work/WORK*` (workfile triple), `diff.patch` (full diff), and `trace.jsonl` (tool-call trace).

 ---
@@ -233,6 +242,20 @@ Every stage runs inside a token-gated lifecycle. The sub-agent must call `foundr

 Input artefacts (files matching an input type's `file-patterns`) are read-only. Files outside any artefact type's patterns are read-only. Violations hard-stop the cycle with `{error: 'unexpected_files'}`.

+### Forge required-tool verification
+
+During `foundry_stage_end` for a forge stage, the plugin verifies that the forge sub-agent called five required context-reading tools:
+
+1. `foundry_config_cycle`
+2. `foundry_workfile_get`
+3. `foundry_config_artefact_type`
+4. `foundry_config_laws`
+5. `foundry_feedback_list`
+
+Tool calls are logged to `.foundry/.forge-tool-calls.jsonl` during stage execution. When `foundry_stage_end` runs, it checks the log against the required set. Missing required calls generate system feedback with the tag `system:missing-tool-calls` and the forge stage completes normally — the missing-tool feedback acts as a signal to the sort router. When all required tools are present, any prior `system:missing-tool-calls` feedback is resolved.
+
+Implementation: `src/plugin/tools/stage-tools.js` (`verifyAndManageForgeTools`) and `src/scripts/lib/stage-calls.js`.
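The log check described above amounts to a set difference between the required tools and the tools that appear in the JSONL log. A minimal sketch follows. It is not the code in `stage-tools.js`: `missingForgeTools` is a hypothetical name, and the assumption that each log line is a JSON object with a `tool` field is an illustrative guess at the log schema, which the diff does not show.

```javascript
// The five required context-reading tools listed in the documentation.
const REQUIRED_FORGE_TOOLS = [
  'foundry_config_cycle',
  'foundry_workfile_get',
  'foundry_config_artefact_type',
  'foundry_config_laws',
  'foundry_feedback_list',
];

// Hypothetical check. Assumes each non-empty log line is a JSON object
// with a `tool` field naming the tool that was called.
function missingForgeTools(jsonlText) {
  const called = new Set();
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue; // tolerate blank lines and a trailing newline
    called.add(JSON.parse(line).tool);
  }
  // Preserve the documented order in the result.
  return REQUIRED_FORGE_TOOLS.filter((name) => !called.has(name));
}
```

A non-empty result would correspond to posting `system:missing-tool-calls` feedback, and an empty one to resolving any earlier feedback with that tag.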
+
 ### Failed flow state

 When an unrecoverable error occurs (e.g. assay extractor abort, invalid JSONL, or memory-sync failure), the orchestrator marks `WORK.md` frontmatter with `status: failed` and a `reason`. The flow is then locked:
@@ -258,10 +281,10 @@ All pipeline skills (`orchestrate`, `flow`, stage skills) check for this state a

 `src/scripts/sort.js` (exported as `runSort`) owns the routing engine. It reads `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml`, then decides which stage runs next based on:

-- **Unresolved feedback.** If
-- **
-- **Iteration limits.** If `max-iterations` is exceeded, the cycle is marked `blocked` and control returns to the user.
+- **Unresolved feedback.** If feedback exists in a non-terminal state (`open`, `actioned`, `wont-fix`), the next stage is `forge` (for items needing action) or the originating evaluation stage (for items pending approval).
+- **Iteration limits and deadlock routing.** When the forge iteration count reaches `max-iterations` with unresolved feedback, sort routes to `human-appraise` (if `deadlock-human-appraise: true`, the default) or marks the cycle `blocked` if human routing is disabled. Sort does not write per-item deadlocked state; deadlock is a routing decision, not a feedback item state.
 - **Clean state.** If all feedback is resolved and no new validation or appraisal failures exist, the cycle is `done`.
+- **Blocked.** If `max-iterations` is exceeded and `deadlock-human-appraise` is `false`, the cycle is marked `blocked` and control returns to the user.
265
288
|
|
|
266
289
|
### Feedback state machine
|
|
267
290
|
|
|
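The routing rules in the hunk above can be read as a small decision function. The sketch below is a simplified illustration, not the logic in `sort.js` or `sort-routing.js`: the function name `decideNext`, its argument shape, the order in which the checks run, and the reading of `open` as "needs action" versus `actioned`/`wont-fix` as "pending approval" are all assumptions. It also ignores new validation or appraisal failures, which the real router considers.

```javascript
// Hypothetical sketch of the sort routing decision; names and ordering are assumed.
const NON_TERMINAL_STATES = new Set(['open', 'actioned', 'wont-fix']);

function decideNext({ feedback, forgeIterations, maxIterations, deadlockHumanAppraise = true }) {
  const unresolved = feedback.filter((item) => NON_TERMINAL_STATES.has(item.state));

  // Clean state: nothing left to action or approve.
  if (unresolved.length === 0) return 'done';

  // Deadlock is a routing decision only; no feedback item is mutated here.
  if (forgeIterations >= maxIterations) {
    return deadlockHumanAppraise ? 'human-appraise' : 'blocked';
  }

  // Assumption: open items need forge; the rest await their originating stage.
  if (unresolved.some((item) => item.state === 'open')) return 'forge';
  return unresolved[0].source.split(':')[0];
}

const pendingApproval = [{ state: 'actioned', source: 'appraise:pedantic' }];
const nextStage = decideNext({ feedback: pendingApproval, forgeIterations: 1, maxIterations: 3 });
```

Returning a plain route string keeps the sketch free of side effects, which mirrors the statement in the diff that sort does not write per-item deadlocked state.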
@@ -269,15 +292,15 @@ Feedback items live in `WORK.feedback.yaml` with a full transition history. Each
 
 - `id` — a ULID.
 - `source` — the stage that created it (e.g. `quench:check-syllables`, `appraise:pedantic`, `human-appraise:hitl`).
-- `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected
+- `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected`).
 - `history` — append-only log of state transitions with timestamps and metadata.
 
 Transitions are **source-based**:
 
 | Source stage | Forge can `wont-fix`? | Resolved by |
 |--------------|------------------------|-------------|
-| `quench` (
-| `appraise` (
+| `quench` (deterministic validation) | No — must `actioned` | the originating `quench` stage, or `human-appraise` override |
+| `appraise` (law evaluation) | Yes (with reason) | the originating `appraise` stage, or `human-appraise` override |
 | `human-appraise` (user instruction) | No — must `actioned` | the originating `human-appraise` stage |
 
 Implementation: `src/scripts/lib/feedback-transitions.js` and `src/scripts/lib/feedback-store.js`. See [work-spec.md](work-spec.md) for the full state machine table.
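The "Forge can `wont-fix`?" column of the table above could be encoded as a lookup keyed by source stage. This is an illustrative sketch, not the contents of `feedback-transitions.js`: the names `SOURCE_RULES` and `forgeMayTransition` are hypothetical, and the sketch covers only forge acting on an item that is currently `open`.

```javascript
// Hypothetical encoding of the source-based rules from the table above.
const SOURCE_RULES = {
  quench: { forgeCanWontFix: false },
  appraise: { forgeCanWontFix: true },
  'human-appraise': { forgeCanWontFix: false },
};

// Returns true when forge may move an open item to `nextState`.
function forgeMayTransition(item, nextState, reason) {
  // `source` looks like "appraise:pedantic"; the stage is the part before ":".
  const stage = item.source.split(':')[0];
  const rule = SOURCE_RULES[stage];
  if (!rule) throw new Error(`unknown source stage: ${stage}`);

  if (nextState === 'actioned') return true;
  // Only appraise feedback may be declined, and only with a stated reason.
  if (nextState === 'wont-fix') return rule.forgeCanWontFix && Boolean(reason);
  return false;
}

const declined = forgeMayTransition(
  { source: 'appraise:pedantic', state: 'open' },
  'wont-fix',
  'conflicts with the house style law',
);
```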
@@ -290,24 +313,27 @@ Different stages can run on different models for cognitive diversity. Cycle defi
 
 ### Configuration
 
+- **Orchestrator argument.** `defaultModel` (optional) can be passed as an orchestrator argument. When set, it serves as the fallback for any stage or appraiser that does not declare a model.
 - **Cycle-level.** Declare a `models` map in the cycle frontmatter:
 ```yaml
 models:
+  default: anthropic/claude-sonnet-4
   forge: anthropic/claude-opus-4.7
   appraise: openai/gpt-5
 ```
-
+`models.default` provides a cycle-level fallback when no per-stage override exists and no `defaultModel` is passed to the orchestrator.
+- **Appraiser-level.** Individual appraisers can declare a `model` field in their personality definition; this overrides the cycle-level appraise model and `models.default` on a per-appraiser basis.
 
 ### Agent files
 
 The user-facing `Foundry` agent is installed by the plugin's `config` hook as `.opencode/agents/foundry.md`. Users switch to this agent after restarting OpenCode. It guides authoring and flow execution while generated `foundry-*` stage agents remain hidden routing targets for specific models.
 
-`foundry_refresh_agents
+`foundry_refresh_agents` generates a `foundry-<slug>.md` agent file in `.opencode/agents/` for every model available in the session, where `<slug>` is the model ID with both `/` and `.` replaced by `-` (e.g. `anthropic-claude-opus-4-7.md`). Call `foundry_refresh_agents()` in code examples when referring to the tool invocation.
 
 ### Dispatch behaviour
 
-- **Non-appraise stages** (forge, quench, assay):
-- **Appraise stage**: each appraiser is dispatched independently by the
+- **Non-appraise stages** (forge, quench, assay): the orchestrator resolves the model by checking `models.<stage>`, then `defaultModel`, then `models.default`, then falls back to `general` (session default). If a specific model is resolved, the orchestrator dispatches to `foundry-<slug>` and hard-fails if `.opencode/agents/foundry-<slug>.md` is missing.
+- **Appraise stage**: each appraiser is dispatched independently by the appraise module. The model resolution order is: appraiser's own `model` field, then `defaultModel`, then `models.default`, then `general`. If a specific model is resolved, the task is dispatched to `foundry-<slug>` and the orchestrator hard-fails if that agent file is missing.
 
 Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `src/skills/appraise/SKILL.md`.
 
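The resolution order for non-appraise stages and the slug rule described in the hunk above can be sketched in a few lines. The function names `modelSlug` and `resolveStageAgent` are hypothetical and do not come from `helpers.js`. The sketch returns only an agent name and leaves out the hard-fail check for a missing agent file.

```javascript
// Hypothetical sketch of model resolution for a non-appraise stage.

// Slug rule from the diff: replace every "/" and "." in the model ID with "-".
function modelSlug(modelId) {
  return modelId.replace(/[\/.]/g, '-');
}

// Order from the diff: models.<stage>, then defaultModel, then models.default,
// then the session-default agent `general`.
function resolveStageAgent(stage, { models = {}, defaultModel } = {}) {
  const model = models[stage] ?? defaultModel ?? models.default;
  return model ? `foundry-${modelSlug(model)}` : 'general';
}

// The cycle-level map shown in the YAML example of the diff.
const cycleModels = {
  default: 'anthropic/claude-sonnet-4',
  forge: 'anthropic/claude-opus-4.7',
  appraise: 'openai/gpt-5',
};

const forgeAgent = resolveStageAgent('forge', { models: cycleModels });
const quenchAgent = resolveStageAgent('quench', { models: cycleModels });
```

With this map, `forge` resolves through its per-stage entry, while `quench` has no entry and no `defaultModel` is passed, so it falls through to `models.default`.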
@@ -331,12 +357,13 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │ │ ├── appraise/
 │ │ ├── human-appraise/
 │ │ ├── add-artefact-type/ # authoring
-│ │ ├── add-artefact-type/
 │ │ ├── add-law/
 │ │ ├── add-appraiser/
 │ │ ├── add-cycle/
 │ │ ├── add-flow/
 │ │ ├── add-extractor/
+│ │ ├── assay/ # deterministic extractor execution
+│ │ ├── dry-run/ # dry-run execution and snapshots
 │ │ ├── list-agents/ # utility
 │ │ ├── refresh-agents/ # utility (now backed by foundry_refresh_agents tool)
 │ │ ├── upgrade-foundry/
@@ -352,7 +379,7 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │ └── scripts/
 │ ├── lib/ # shared libraries (injectable I/O)
 │ │ ├── workfile.js # WORK.md frontmatter
-│
+│ │ ├── artefacts.js # artefact discovery via branch diffs
 │ │ ├── history.js # WORK.history.yaml operations
 │ │ ├── feedback-store.js
 │ │ ├── feedback-transitions.js
@@ -367,10 +394,18 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │ │ ├── state.js
 │ │ ├── config.js # foundry/ config readers
 │ │ ├── slug.js
+│ │ ├── tool-paths.js
+│ │ ├── stage-calls.js # forge tool-call logging and verification
+│ │ ├── sort-routing.js
+│ │ ├── sort-reason.js
+│ │ ├── sort-fs-check.js
+│ │ ├── validation.js
 │ │ ├── ulid.js
 │ │ ├── tracing.js
 │ │ ├── failed-flow.js
 │ │ ├── git-bridge.js
+│ │ ├── git-finish/ # branch finishing logic
+│ │ ├── attestation/ # ATTEST.md generation and verification
 │ │ ├── git-policy.js
 │ │ ├── assay/
 │ │ ├── config-creators/
@@ -378,6 +413,11 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │ │ ├── snapshot/
 │ │ └── memory/ # flow memory (Cozo 0.7)
 │ ├── orchestrate.js # orchestration loop (exports runOrchestrate)
+│ ├── orchestrate-cycle.js
+│ ├── orchestrate-phases.js
+│ ├── orchestrate-terminals.js
+│ ├── quench-module.js # deterministic validation (runQuench)
+│ ├── appraise-module.js # appraise gather and consolidate
 │ └── sort.js # routing engine (exports runSort)
 ├── scripts/
 │ └── build.js # builds src/ into dist/