npm - @really-knows-ai/foundry - Versions diffs - 3.5.8 → 3.6.0 - Mend

@really-knows-ai/foundry 3.5.8 → 3.6.0


Files changed (34)

package/README.md +16 -10
package/dist/.opencode/plugins/foundry-tools/config-create-tools.js +2 -3
package/dist/.opencode/plugins/foundry-tools/feedback-tools.js +9 -5
package/dist/.opencode/plugins/foundry-tools/orchestrate-tool.js +3 -1
package/dist/CHANGELOG.md +38 -0
package/dist/README.md +16 -10
package/dist/docs/README.md +6 -6
package/dist/docs/architecture.md +59 -19
package/dist/docs/concepts.md +55 -19
package/dist/docs/getting-started.md +37 -15
package/dist/docs/memory-maintenance.md +3 -3
package/dist/docs/tools.md +131 -70
package/dist/docs/work-spec.md +38 -52
package/dist/scripts/appraise-module.js +69 -7
package/dist/scripts/lib/artefacts.js +43 -1
package/dist/scripts/lib/config-creators/cycle.js +6 -10
package/dist/scripts/lib/config-validators/cycle.js +1 -9
package/dist/scripts/lib/feedback-store.js +26 -51
package/dist/scripts/lib/finalize.js +10 -2
package/dist/scripts/lib/forge-contract.js +93 -0
package/dist/scripts/lib/history.js +2 -1
package/dist/scripts/lib/sort-reason.js +11 -8
package/dist/scripts/lib/sort-routing.js +185 -63
package/dist/scripts/lib/workfile.js +28 -0
package/dist/scripts/orchestrate-cycle.js +3 -13
package/dist/scripts/orchestrate-phases.js +51 -45
package/dist/scripts/orchestrate-terminals.js +37 -2
package/dist/scripts/orchestrate.js +62 -5
package/dist/scripts/quench-module.js +54 -12
package/dist/scripts/sort.js +42 -62
package/dist/skills/add-cycle/SKILL.md +4 -4
package/dist/skills/add-flow/SKILL.md +1 -1
package/dist/skills/human-appraise/SKILL.md +12 -40
package/package.json +1 -1

package/README.md CHANGED

@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
 is auditable, repeatable, and defensible to auditors and stakeholders. You can show
 exactly how the output was made. Confidence is engineered; it is not hoped for.
-### The operating model: assay, then forge → quench → appraise
+### The operating model: assay, then forge → quench → appraise → attest → finish
 A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
 that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
   fast and non-negotiable, catching errors before they reach appraisers.
 - **Appraise** judges quality against written laws. Independent evaluators inspect
-  whether the work meets the subjective standards you define.
+  whether the work meets the rules or criteria you define.
-- **Human-appraise** provides direct judgement when the stakes require it or the loop
-  deadlocks. Offers human oversight at critical decision points.
+- **Human-appraise** provides direct judgement when the stakes require it or the
+  cycle reaches its iteration limit. Offers human oversight at critical
+  decision points.
 Every stage commits separately, so every step leaves a record. Every decision is
 timestamped. A single loop produces an **output** — a verified draft. A flow
-composes one or more such loops to produce an **outcome** — the final artefact that
-reaches your codebase or customers.
+composes one or more such loops to produce an **outcome** — the final artefact.
+When the loop clears, completing the work branch requires **attest** — a final
+verification that writes and commits `ATTEST.md` — followed by **finish**, which
+squash-merges the approved work to the base branch with a signed attestation block.
 ### What you describe, what Foundry enforces
@@ -174,7 +178,8 @@ quench    → 5/7/5 — passes                          [commit]
 appraise  → 2 appraisers, one flags weak imagery    [commit]
 forge     → revises                                 [commit]
 appraise  → clean                                   [commit]
-done      → squash-merged to main with attestation
+attest    → ATTEST.md committed                     [commit]
+finish    → squash-merged to main with attestation
 ```
 Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
 `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
 and [Assay](docs/concepts.md#assay) for the configuration path.
-> **Note (3.0.0):** flow memory currently persists to `cozo-node`, which is
+> **Note:** flow memory currently persists to `cozo-node`, which is
 > unmaintained upstream. Installation produces six cosmetic deprecation warnings
 > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
 > a maintained backend in a future release; the public `foundry_memory_*` tools
 > and on-disk vocabulary/NDJSON format are designed to survive that migration.
-> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status-as-of-300).
+> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).
 ---
@@ -233,7 +238,8 @@ reproducible.
   state live in tested plugin code, outside LLM control.
 - **Written quality criteria** — laws are markdown files; an appraiser panel scores
-  each artefact against them, so quality is objective.
+  each artefact against them, providing structured quality assessment from
+  multiple perspectives.
 - **Multi-model diversity** — forge on one model, appraise on another, every
   appraiser on a different model if you want. Different models catch different

package/dist/.opencode/plugins/foundry-tools/config-create-tools.js CHANGED

@@ -162,9 +162,8 @@ function cycleArgs(s) { return {
     artefacts: s.array(s.string()).describe('Artefact type IDs this cycle reads'),
   }).optional().describe('Input contract for this cycle. Omit for source cycles that start from the user goal; empty artefacts arrays are invalid.'),
   targets: s.array(s.string()).optional().describe('Downstream cycle IDs this cycle can route to'),
-  humanAppraise: s.boolean().optional().describe('Include human-appraise in every iteration'),
-  deadlockAppraise: s.boolean().optional().describe('Route to human-appraise on LLM appraiser deadlock'),
-  deadlockIterations: s.number().optional().describe('Iteration threshold for deadlock detection'),
+  alwaysHumanAppraise: s.boolean().optional().describe('Include human-appraise in every iteration'),
+  deadlockHumanAppraise: s.boolean().optional().describe('Route to human-appraise when max-iterations is reached'),
   maxIterations: s.number().optional().describe('Maximum forge iterations before cycle blocks'),
   assay: s.object({
     extractors: s.array(s.string()).describe('Extractor IDs for the assay stage'),

package/dist/.opencode/plugins/foundry-tools/feedback-tools.js CHANGED

@@ -70,13 +70,17 @@ async function executeFeedbackAdd(args, context) {
   try {
     const store = openFeedbackStore('WORK.feedback.yaml', io);
-    const { id, deduped } = store.add({
+    const params = {
       file: args.file,
       tag: args.tag,
       text: args.text,
       source: activeStage,
       cycle,
-    });
+    };
+    if (args.artefact_version !== undefined) {
+      params.artefact_version = args.artefact_version;
+    }
+    const { id, deduped } = store.add(params);
     return JSON.stringify({ ok: true, id, deduped });
   } catch (err) {
     return JSON.stringify({ error: `foundry_feedback_add: ${err.message}` });
@@ -201,6 +205,7 @@ export function createFeedbackTools({ tool }) {
         file: tool.schema.string().describe('Artefact file path'),
         text: tool.schema.string().describe('Feedback text'),
         tag: tool.schema.string().describe('Tag for the feedback item'),
+        artefact_version: tool.schema.string().optional().describe('SHA-256 hash for version-aware sorting'),
       },
       execute: guarded('foundry_feedback_add', [flowBranchGuard, gateNotFailed], executeFeedbackAdd, { branchIo: branchIoFactory, io: asyncIoFactory }),
     }),
@@ -218,7 +223,7 @@ export function createFeedbackTools({ tool }) {
       execute: guarded('foundry_feedback_wontfix', [flowBranchGuard, gateNotFailed], executeFeedbackWontfix, { branchIo: branchIoFactory, io: asyncIoFactory }),
     }),
     foundry_feedback_resolve: tool({
-      description: 'Resolve a feedback item (approved or rejected). In human-appraise stages, this tool can override deadlocked items by providing a reason.',
+      description: 'Resolve a feedback item (approved or rejected). Human-appraise stages can override deadlock with a reason.',
       args: {
         id: tool.schema.string().describe('Feedback item id (ULID)'),
         resolution: tool.schema.enum(['approved', 'rejected']).describe('Resolution type'),
@@ -230,7 +235,6 @@ export function createFeedbackTools({ tool }) {
       args: {
         file: tool.schema.string().optional().describe('Filter by artefact file path'),
       },
-      execute: executeFeedbackList,
-    }),
+      execute: executeFeedbackList }),
   };
 }
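
The hunks above add an optional `artefact_version` argument described as a SHA-256 hash, and only forward it to `store.add` when it is defined. A minimal sketch of both ideas follows, using Node's built-in `crypto` module. `hashArtefacts` and `buildAddParams` are hypothetical names, and the hashing scheme (sorted paths, NUL separators) is an assumption for illustration; the package's own `computeArtefactVersion` may work differently.

```javascript
const crypto = require('node:crypto');

// Hypothetical: derive one hex digest from a map of path -> file content.
// Paths are sorted so the digest does not depend on insertion order.
function hashArtefacts(files) {
  const hash = crypto.createHash('sha256');
  for (const path of Object.keys(files).sort()) {
    hash.update(path);
    hash.update('\0'); // separator so "ab" + "c" differs from "a" + "bc"
    hash.update(files[path]);
    hash.update('\0');
  }
  return hash.digest('hex');
}

// Mirrors the pattern in the diff: the optional key is only set when the
// caller supplied it, so older callers produce the same params as before.
function buildAddParams(args, activeStage, cycle) {
  const params = {
    file: args.file,
    tag: args.tag,
    text: args.text,
    source: activeStage,
    cycle,
  };
  if (args.artefact_version !== undefined) {
    params.artefact_version = args.artefact_version;
  }
  return params;
}

const version = hashArtefacts({ 'haiku.md': 'old pond', 'notes.md': 'frog' });
console.log(buildAddParams({ file: 'haiku.md', tag: 'law:x', text: 'weak', artefact_version: version }, 'appraise', 'c1'));
```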

package/dist/.opencode/plugins/foundry-tools/orchestrate-tool.js CHANGED

@@ -38,7 +38,7 @@ function createGitBridge(cwd) {
 }
 async function createFinalize(cwd, io) {
-  return async ({ cycleId, stage, baseSha }) => {
+  return async ({ cycleId, stage, baseSha, artefact_version, contractPassed }) => {
     let cycleDoc;
     try {
       cycleDoc = await getCycleDefinition('foundry', cycleId, io);
@@ -66,6 +66,8 @@ async function createFinalize(cwd, io) {
       cycleDef,
       artefactTypes,
       io,
+      artefact_version,
+      contractPassed,
     });
     return result;
   };
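
The `contractPassed` flag threaded through finalize here relates to the 3.6.0 changelog entry stating that three consecutive contract failures block the cycle. A rough sketch of that check is shown below, assuming forge history entries carry a boolean `contractPassed` field. The entry shape and the function name `shouldBlockOnContract` are assumptions, not the package's implementation.

```javascript
// Threshold taken from the changelog wording ("three consecutive contract failures").
const MAX_CONSECUTIVE_CONTRACT_FAILURES = 3;

// Hypothetical: counts contract failures at the tail of the forge history
// and reports whether the streak has reached the threshold.
function shouldBlockOnContract(forgeHistory) {
  let streak = 0;
  for (let i = forgeHistory.length - 1; i >= 0; i--) {
    if (forgeHistory[i].contractPassed === false) {
      streak++;
    } else {
      break; // a passing run resets the streak
    }
  }
  return streak >= MAX_CONSECUTIVE_CONTRACT_FAILURES;
}

console.log(shouldBlockOnContract([
  { contractPassed: true },
  { contractPassed: false },
  { contractPassed: false },
  { contractPassed: false },
]));
```

Only the trailing streak matters in this sketch, so earlier failures separated by a passing run do not count towards the limit.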

package/dist/CHANGELOG.md CHANGED

@@ -1,5 +1,43 @@
 # Changelog
+## [3.6.0] - 2026-05-25
+### Added
+- State-driven sort routing (R1–R4, R7): sort routes based on feedback item state (`unresolved` → forge, `addressed` → source stage) instead of stage position, with deadlock detection and iteration cap.
+- Artefact version tracking: `computeArtefactVersion` hashes all artefact files after each forge run; the version is recorded on every feedback item and forge history entry.
+- Forge contract enforcement (R8–R9): per-item response check and batch-level version consistency check. Contract violations revert items to `open` and post system feedback. Three consecutive contract failures block the cycle.
+- Stale feedback detection (R6): when quench, appraise, or human-appraise re-enters after a forge run, feedback items with a mismatched artefact version are auto-resolved as superseded.
+- Legacy feedback migration: items without a valid artefact version are auto-resolved on load and excluded from routing.
+- Integration test suite: 40 integration tests covering the routing decision tree, forge contract enforcement, and the spec's 10-step worked example.
+### Fixed
+- Forge contract enforcement wired into the orchestration post-dispatch path (previously it finalised without computing versions or checking the contract).
+- New feedback items from quench, appraise, and the plugin feedback tool now carry the current artefact version, preventing false legacy detection.
+- `computeArtefactVersion` scans the worktree root for file patterns (previously scanned the `foundry/` config directory).
+- Version computation failures during stale resolution are surfaced to the orchestrator instead of being silently swallowed.
+- Appraise stale feedback resolution runs during the gather phase (before appraiser dispatch), not only during consolidation.
+### Changed
+- Orchestration: forge post-dispatch extracts a dedicated `runForgePostDispatch` path that reads the forge context, computes versions, enforces the contract, and passes `contractPassed`/`postVersion` to `finaliseStage`.
+- Quench and appraise stale resolution helpers gracefully degrade when the artefact type is not configured (best-effort, not fatal).
+- Added `worktree` parameter to `computeArtefactVersion` for explicit worktree-root specification.
+## [3.5.9] - 2026-05-24
+### Changed
+- Cycle frontmatter keys renamed: `human-appraise` → `always-human-appraise`, `deadlock-appraise` → `deadlock-human-appraise`, `deadlock-iterations` replaced with `max-iterations`-based deadlock routing.
+- Deadlock routing replaced per-item deadlocked state with iteration-limit detection using `max-iterations` and `deadlock-human-appraise`.
+- Orchestration uses iteration cap routing instead of per-item deadlock history.
+- Skills and public documentation (`README.md`, `docs/*`) refreshed against the current 65-tool implementation: updated tool reference with structured config-creation arguments, corrected quench/appraise execution models, documented attestation-before-finish workflow, added Validator and ATTEST.md concepts, and fixed memory paths. All stale v3.0.x terminology removed.
+### Fixed
+- Guidance audit tests aligned with current user-facing documentation style (no direct `foundry_git_finish({)` call syntax in walkthrough).
 ## [3.5.8] - 2026-05-23
 ### Added
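
The state-driven routing described in the 3.6.0 and 3.5.9 entries can be illustrated with a small decision function. This is a sketch under stated assumptions, not the logic in `sort-routing.js`: it assumes items carry a `state` and a `source` such as `appraise:pedantic`, treats `open` and `rejected` items as needing forge, treats `actioned` and `wont-fix` items as awaiting their source stage, and checks the iteration cap before either. The name `routeNext` and the returned shape are hypothetical.

```javascript
// Hypothetical routing sketch; field names and ordering are assumptions.
function routeNext({ items, forgeIterations, maxIterations, deadlockHumanAppraise = true }) {
  const pending = items.filter((item) => item.state !== 'resolved');

  // Clean state: nothing left to act on.
  if (pending.length === 0) return { action: 'done' };

  // Iteration cap reached with unresolved feedback: deadlock is a routing
  // decision here, not a per-item state.
  if (forgeIterations >= maxIterations) {
    return deadlockHumanAppraise ? { action: 'human_appraise' } : { action: 'blocked' };
  }

  // Items that still need work go back to forge.
  const needsForge = pending.some((item) => item.state === 'open' || item.state === 'rejected');
  if (needsForge) return { action: 'dispatch', stage: 'forge' };

  // Everything else has been addressed and waits on its originating stage.
  const sourceStage = pending[0].source.split(':')[0];
  return { action: 'dispatch', stage: sourceStage };
}

console.log(routeNext({
  items: [{ state: 'actioned', source: 'appraise:pedantic' }],
  forgeIterations: 1,
  maxIterations: 3,
}));
```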

package/dist/README.md CHANGED

@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
 is auditable, repeatable, and defensible to auditors and stakeholders. You can show
 exactly how the output was made. Confidence is engineered; it is not hoped for.
-### The operating model: assay, then forge → quench → appraise
+### The operating model: assay, then forge → quench → appraise → attest → finish
 A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
 that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
   fast and non-negotiable, catching errors before they reach appraisers.
 - **Appraise** judges quality against written laws. Independent evaluators inspect
-  whether the work meets the subjective standards you define.
+  whether the work meets the rules or criteria you define.
-- **Human-appraise** provides direct judgement when the stakes require it or the loop
-  deadlocks. Offers human oversight at critical decision points.
+- **Human-appraise** provides direct judgement when the stakes require it or the
+  cycle reaches its iteration limit. Offers human oversight at critical
+  decision points.
 Every stage commits separately, so every step leaves a record. Every decision is
 timestamped. A single loop produces an **output** — a verified draft. A flow
-composes one or more such loops to produce an **outcome** — the final artefact that
-reaches your codebase or customers.
+composes one or more such loops to produce an **outcome** — the final artefact.
+When the loop clears, completing the work branch requires **attest** — a final
+verification that writes and commits `ATTEST.md` — followed by **finish**, which
+squash-merges the approved work to the base branch with a signed attestation block.
 ### What you describe, what Foundry enforces
@@ -174,7 +178,8 @@ quench    → 5/7/5 — passes                          [commit]
 appraise  → 2 appraisers, one flags weak imagery    [commit]
 forge     → revises                                 [commit]
 appraise  → clean                                   [commit]
-done      → squash-merged to main with attestation
+attest    → ATTEST.md committed                     [commit]
+finish    → squash-merged to main with attestation
 ```
 Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
 `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
 and [Assay](docs/concepts.md#assay) for the configuration path.
-> **Note (3.0.0):** flow memory currently persists to `cozo-node`, which is
+> **Note:** flow memory currently persists to `cozo-node`, which is
 > unmaintained upstream. Installation produces six cosmetic deprecation warnings
 > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
 > a maintained backend in a future release; the public `foundry_memory_*` tools
 > and on-disk vocabulary/NDJSON format are designed to survive that migration.
-> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status-as-of-300).
+> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).
 ---
@@ -233,7 +238,8 @@ reproducible.
   state live in tested plugin code, outside LLM control.
 - **Written quality criteria** — laws are markdown files; an appraiser panel scores
-  each artefact against them, so quality is objective.
+  each artefact against them, providing structured quality assessment from
+  multiple perspectives.
 - **Multi-model diversity** — forge on one model, appraise on another, every
   appraiser on a different model if you want. Different models catch different

package/dist/docs/README.md CHANGED

@@ -1,6 +1,6 @@
 # Foundry docs
-This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need.
+This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need. The `docs/superpowers/` directory is future-facing internal scaffolding and is not indexed as public documentation.
 **How to navigate:** Work through the sections in order: **Start here** establishes conceptual foundations, **Reference** provides detailed specifications for implementation, and **Contributors** covers subsystem maintenance and extensions.
@@ -10,9 +10,9 @@ Getting oriented with Foundry means understanding both the concepts it uses and
 **Reading order:** Work through them in order; [getting-started.md](getting-started.md) builds hands-on confidence, and [concepts.md](concepts.md) provides reference depth. Most implementers spend 1–2 hours on getting-started before moving to Reference materials.
-- **getting-started.md** — Complete end-to-end installation, auto-bootstrapping, and first flow walkthrough. Read this immediately after installing the plugin and before authoring any of your own configuration.
+- **getting-started.md** — Complete end-to-end installation, bootstrap, Foundry agent wizard, and first flow walkthrough. Read this immediately after installing the plugin and before authoring any of your own configuration.
-  It establishes the operating model, directory structure, and practical confidence in one pass. Includes hands-on guidance on authoring the five foundational concepts (artefact types, laws, appraisers, cycles, flows) with worked examples you can run against real code. Also covers the optional flow-memory path: initialise memory, declare vocabulary, add extractors, and opt a cycle into assay for codebase-aware flows.
+  It covers bootstrap states, the Understand–Plan–Confirm–Build wizard protocol, structured config authoring, current flow execution with quench and appraise, attestation and branch finishing, dry-run completion, and the optional flow-memory path (initialise memory, declare vocabulary, add extractors, opt a cycle into assay). Uses worked examples you can run against real code.
   Implementers must follow every step and complete the bootstrap; architects typically skim for structure before moving to [concepts.md](concepts.md) and [architecture.md](architecture.md) to reason about their designs.
@@ -28,19 +28,19 @@ These documents specify formats, tools, and design principles. Use them when imp
 **Key property:** These are sources of truth and normative references. Changes to Foundry flow formats or tool behaviour must be reflected here first. Use them together—cross-references appear throughout.
-- **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields, the artefact registry, and the full feedback state machine with all valid transitions and guards.
+- **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields, feedback and history state, and the full feedback state machine with all valid transitions and guards. Artefacts are discovered from branch diffs, not stored as an artefact table in the workfile.
   Use this when implementing tooling around work files, validating state transitions, or understanding what metadata flows carry through an execution. It is the authoritative source of truth for all transient work-branch structures, format validation rules, and field semantics.
   Implementers and tool builders rely on this heavily; keep it updated immediately as formats evolve or new fields are added.
-- **tools.md** — Categorical index and reference documentation for all custom tools, organised by family (lifecycle, artefacts, feedback, config, memory, etc.) with complete signatures and permissions.
+- **tools.md** — Complete 65-tool categorical index and reference, organised by family (lifecycle, artefacts, feedback, config, memory, etc.) with complete signatures and permissions.
   Consult this when you need to understand what a specific tool does, its branch requirements, what stage locks apply, what arguments it accepts, and how it integrates with the overall system. Covers calling conventions, enforcement invariants, and the permission model for memory access. References `foundry_assay_run`, `foundry_extractor_create`, and memory data and admin tools.
   Tool authors and system integrators use this constantly; it is the comprehensive reference for all custom tools in the Foundry ecosystem.
-- **architecture.md** — The design and enforcement model covering token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing, and core design principles.
+- **architecture.md** — The design and enforcement model covering orchestration actions (`dispatch`, `dispatch_multi`, `human_appraise`, `done`, `blocked`, `violation`), internal quench and appraise execution, branch finish paths with attestation preconditions, token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing with `defaultModel`, forge required-tool verification, and core design principles.
   Read this when you need to understand how Foundry maintains safety (how tokens prevent replay, why stages lock mutations, how writes are validated), what guarantees it makes and where they live in the code, or why it is structured the way it is. Explains the memory layout, assay write boundary, and failed-flow behaviour that keep extractor-populated memory auditable.
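
The orchestration actions listed for architecture.md above (`dispatch`, `dispatch_multi`, `human_appraise`, `done`, `blocked`, `violation`) suggest a simple driver loop. The sketch below shows one way such a loop could be structured, assuming an injected `orchestrate` function and handler callbacks. `driveCycle`, `handlers`, and the `tasks` field are hypothetical names, and the real `foundry_orchestrate` tool is not called here.

```javascript
// Hypothetical driver: calls orchestrate, acts on the returned action,
// and feeds the result back in until a terminal action arrives.
async function driveCycle(orchestrate, handlers) {
  let input = {};
  for (;;) {
    const step = await orchestrate(input);
    switch (step.action) {
      case 'dispatch':
        input = { lastResult: await handlers.dispatch(step) };
        break;
      case 'dispatch_multi':
        // Run every task in parallel, then report all results together.
        input = { lastResults: await Promise.all(step.tasks.map((task) => handlers.dispatch(task))) };
        break;
      case 'human_appraise':
        input = { lastResult: await handlers.humanAppraise(step) };
        break;
      case 'done':
      case 'blocked':
      case 'violation':
        return step; // terminal actions end the loop
      default:
        throw new Error(`unknown action: ${step.action}`);
    }
  }
}

// Usage with a scripted stand-in for the orchestrator.
const script = [
  { action: 'dispatch', stage: 'forge' },
  { action: 'dispatch_multi', tasks: [{ stage: 'appraise', id: 'a' }, { stage: 'appraise', id: 'b' }] },
  { action: 'done' },
];
const fakeOrchestrate = async () => script.shift();
const fakeHandlers = {
  dispatch: async (task) => ({ ok: true, stage: task.stage }),
  humanAppraise: async () => ({ ok: true }),
};
driveCycle(fakeOrchestrate, fakeHandlers).then((final) => console.log(final.action));
```

Keeping the loop free of any decision logic matches the description of the skill as a thin driver: every routing choice comes from the value the orchestrator returns.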

package/dist/docs/architecture.md CHANGED

@@ -34,7 +34,7 @@ A forge sub-agent can decline subjective feedback with a justification, and an a
 ### Humans can step in at known points
-Human-in-the-loop gates are first-class stages. A cycle can declare `human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-appraise: true` (the default) to pull a human in only when LLM appraisers and forge ping-pong on the same items. Human feedback takes absolute priority and cannot be wont-fixed.
+Human-in-the-loop gates are first-class stages. A cycle can declare `always-human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-human-appraise: true` (the default) to pull a human in when the iteration count reaches `max-iterations`. Human feedback takes absolute priority and cannot be wont-fixed.
 ### Multi-model diversity
@@ -75,9 +75,10 @@ The following guarantees live in plugin code and are outside LLM control:
 The `orchestrate` skill is a thin driver around `foundry_orchestrate`. Its entire loop is:
 ```text
-call foundry_orchestrate({lastResult})
+call foundry_orchestrate({lastResult, lastResults, baseBranch, defaultModel})
 switch on action:
-  dispatch        → dispatch the requested subagent → report back
+  dispatch        → dispatch single subagent → report back
+  dispatch_multi  → dispatch parallel appraiser tasks → consolidate → report back
   human_appraise  → run human-appraise inline → report back
   done / blocked / violation → terminate the loop
 ```
@@ -92,6 +93,14 @@ switch on action:
 Because the protocol lives in a plugin tool, the LLM cannot skip steps, reorder them, or silently drop a commit.
+### Internal quench execution
+Quench runs inside the orchestrator as `runQuench(ctx)` — a deterministic, non-LLM validation pass. It reads the active stage from `.foundry/active-stage.json`, discovers artefact changes via branch-based artefact discovery, runs validators for each applicable law, and posts feedback with tags in the format `law:<law-id>:<validator-id>`. It also resolves prior quench feedback: items whose issues remain are set to `rejected`; items no longer present are set to `approved` (transitioned to `resolved`). The quench module is available at `src/scripts/quench-module.js`.
+### Internal appraise execution
+Appraise uses `gatherAppraiseContext()` to build parallel subagent tasks, one per (artefact, appraiser) pair. It returns a `dispatch_multi` action containing the task list. The orchestrator's loop dispatches each task independently. After all appraisers report back, `consolidateAppraise()` processes the `lastResults` array: it parses each successful output for structured issues, de-duplicates across appraisers, posts feedback with `law:<law-id>` tags, and resolves prior appraise feedback items (resolves stale items, rejects items still present). The appraise module is available at `src/scripts/appraise-module.js`.
 ---
 ## Token lifecycle
@@ -134,9 +143,9 @@ Foundry partitions mutation across three branch namespaces. The plugin enforces
 | Namespace | Pattern | Owns | Created from | Finished by |
 |-----------|---------|------|--------------|-------------|
-| **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` | PR or direct merge to `main` |
-| **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | `foundry_git_finish` (squash-merges to base, preserves forensic archive) |
-| **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot, force-deletes branch) |
+| **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` | Squash-merge to base branch; no attestation required |
+| **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | Requires `ATTEST.md` at HEAD (created by `foundry_attest({ confirm: true })`). `foundry_git_finish({ confirm: true })` verifies the attestation, preserves `archive/<work-branch>-<short-sha>`, squash-merges to base, creates a signed commit (`-S`), deletes the work branch |
+| **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot to `.snapshots/<run-id>/`, force-deletes branch) |
 ### Guard implementation
@@ -150,7 +159,7 @@ Implementation: `src/scripts/lib/branch-guard.js`.
 ### Forensic branches and snapshots
-- **Work branches** are preserved as `archive/work/<flowId>-<description>-<hash>` when `foundry_git_finish` completes. The full stage micro-commit history, `WORK.*` files, and all intermediate artefact states remain intact. The signed squash commit on the base branch references the archive branch tip SHA in its attestation block.
+- **Work branches** require `ATTEST.md` at HEAD, created by `foundry_attest({ confirm: true })` before `foundry_git_finish({ confirm: true })` runs. The finish tool verifies the attestation, checks the diff SHA matches, preserves the branch as `archive/work/<flowId>-<description>-<hash>`, squash-merges to the base branch with a signed commit (`-S`), and deletes the work branch. The full stage micro-commit history, `WORK.*` files, and all intermediate artefact states remain intact. The signed squash commit on the base branch references the archive branch tip SHA in its attestation block.
 - **Dry-run branches** are force-deleted after `foundry_git_finish` captures a snapshot to `.snapshots/<runId>/` on the parent `config/*` working tree. Each snapshot includes `README.md` (metadata), `work/WORK*` (workfile triple), `diff.patch` (full diff), and `trace.jsonl` (tool-call trace).
 ---
@@ -233,6 +242,20 @@ Every stage runs inside a token-gated lifecycle. The sub-agent must call `foundr
 Input artefacts (files matching an input type's `file-patterns`) are read-only. Files outside any artefact type's patterns are read-only. Violations hard-stop the cycle with `{error: 'unexpected_files'}`.
+### Forge required-tool verification
+During `foundry_stage_end` for a forge stage, the plugin verifies that the forge sub-agent called five required context-reading tools:
+1. `foundry_config_cycle`
+2. `foundry_workfile_get`
+3. `foundry_config_artefact_type`
+4. `foundry_config_laws`
+5. `foundry_feedback_list`
+Tool calls are logged to `.foundry/.forge-tool-calls.jsonl` during stage execution. When `foundry_stage_end` runs, it checks the log against the required set. Missing required calls generate system feedback with the tag `system:missing-tool-calls` and the forge stage completes normally — the missing-tool feedback acts as a signal to the sort router. When all required tools are present, any prior `system:missing-tool-calls` feedback is resolved.
+Implementation: `src/plugin/tools/stage-tools.js` (`verifyAndManageForgeTools`) and `src/scripts/lib/stage-calls.js`.
 ### Failed flow state
 When an unrecoverable error occurs (e.g. assay extractor abort, invalid JSONL, or memory-sync failure), the orchestrator marks `WORK.md` frontmatter with `status: failed` and a `reason`. The flow is then locked:
@@ -258,10 +281,10 @@ All pipeline skills (`orchestrate`, `flow`, stage skills) check for this state a
 `src/scripts/sort.js` (exported as `runSort`) owns the routing engine. It reads `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml`, then decides which stage runs next based on:
-- **Unresolved feedback.** If `quench` or `appraise` feedback exists in a non-terminal state (`open`, `actioned`, `wont-fix`), the next stage is usually `forge` or the originating evaluation stage.
-- **Deadlock detection.** If the same feedback items ping-pong between forge and appraise for `deadlock-iterations` (default 5) iterations, sort marks them `deadlocked` and inserts a `human-appraise` stage (if `deadlock-appraise: true`, the default).
-- **Iteration limits.** If `max-iterations` is exceeded, the cycle is marked `blocked` and control returns to the user.
+- **Unresolved feedback.** If feedback exists in a non-terminal state (`open`, `actioned`, `wont-fix`), the next stage is `forge` (for items needing action) or the originating evaluation stage (for items pending approval).
+- **Iteration limits and deadlock routing.** When the forge iteration count reaches `max-iterations` with unresolved feedback, sort routes to `human-appraise` (if `deadlock-human-appraise: true`, the default) or marks the cycle `blocked` if human routing is disabled. Sort does not write per-item deadlocked state; deadlock is a routing decision, not a feedback item state.
 - **Clean state.** If all feedback is resolved and no new validation or appraisal failures exist, the cycle is `done`.
+- **Blocked.** If the forge iteration count reaches `max-iterations` with unresolved feedback and `deadlock-human-appraise` is `false`, the cycle is marked `blocked` and control returns to the user.
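The routing rules in this list can be sketched as a single decision function. This is an illustrative approximation, not the logic in `sort-routing.js`: the function name, the input shape, the order in which the rules are checked, and the mapping of `open` to "needs action" and `actioned`/`wont-fix` to "pending approval" are assumptions. It also ignores new validation or appraisal failures.

```javascript
// Hypothetical sketch of the routing decision; not the real runSort.
const UNRESOLVED = new Set(["open", "actioned", "wont-fix"]);

function decideRoute({ feedback, forgeIterations, maxIterations, deadlockHumanAppraise = true }) {
  const unresolved = feedback.filter((item) => UNRESOLVED.has(item.state));

  // Clean state: nothing left to resolve.
  if (unresolved.length === 0) return "done";

  // Iteration limit reached with unresolved feedback: a routing decision,
  // no per-item state is written.
  if (forgeIterations >= maxIterations) {
    return deadlockHumanAppraise ? "human-appraise" : "blocked";
  }

  // Assumed mapping: `open` items need forge action.
  if (unresolved.some((item) => item.state === "open")) return "forge";

  // Remaining items await approval from the stage that raised them,
  // taken here from the prefix of `source` (e.g. "quench:check-syllables").
  return unresolved[0].source.split(":")[0];
}
```

For example, a single `actioned` item with source `appraise:pedantic` and iterations below the limit routes back to `appraise` under these assumptions.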
 ### Feedback state machine
@@ -269,15 +292,15 @@ Feedback items live in `WORK.feedback.yaml` with a full transition history. Each
 - `id` — a ULID.
 - `source` — the stage that created it (e.g. `quench:check-syllables`, `appraise:pedantic`, `human-appraise:hitl`).
-- `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected`, `deadlocked`).
+- `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected`).
 - `history` — append-only log of state transitions with timestamps and metadata.
 Transitions are **source-based**:
 | Source stage | Forge can `wont-fix`? | Resolved by |
 |--------------|------------------------|-------------|
-| `quench` (CLI validation) | No — must `actioned` | the originating `quench` stage, or `human-appraise` override |
-| `appraise` (subjective law) | Yes (with reason) | the originating `appraise` stage, or `human-appraise` override |
+| `quench` (deterministic validation) | No — must `actioned` | the originating `quench` stage, or `human-appraise` override |
+| `appraise` (law evaluation) | Yes (with reason) | the originating `appraise` stage, or `human-appraise` override |
 | `human-appraise` (user instruction) | No — must `actioned` | the originating `human-appraise` stage |
 Implementation: `src/scripts/lib/feedback-transitions.js` and `src/scripts/lib/feedback-store.js`. See [work-spec.md](work-spec.md) for the full state machine table.
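As a rough illustration of the source-based rules in the table, the sketch below checks whether forge may move an item to a given state. The rule table mirrors the "Forge can `wont-fix`?" column only; the function name, return shape, and error messages are assumptions for the example, and the full state machine in work-spec.md is not reproduced.

```javascript
// Hypothetical sketch; not the code in feedback-transitions.js.
const SOURCE_RULES = {
  quench: { forgeCanWontFix: false },
  appraise: { forgeCanWontFix: true },
  "human-appraise": { forgeCanWontFix: false },
};

function checkForgeTransition(item, nextState, reason) {
  // `source` looks like "appraise:pedantic"; the prefix is the source stage.
  const stage = item.source.split(":")[0];
  const rules = SOURCE_RULES[stage];
  if (!rules) return { ok: false, error: `unknown source stage: ${stage}` };

  if (nextState === "actioned") return { ok: true };

  if (nextState === "wont-fix") {
    if (!rules.forgeCanWontFix) {
      return { ok: false, error: `${stage} feedback must be actioned` };
    }
    if (!reason) return { ok: false, error: "wont-fix requires a reason" };
    return { ok: true };
  }

  return { ok: false, error: `forge cannot set state: ${nextState}` };
}
```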
@@ -290,24 +313,27 @@ Different stages can run on different models for cognitive diversity. Cycle defi
 ### Configuration
+- **Orchestrator argument.** `defaultModel` (optional) can be passed as an orchestrator argument. When set, it serves as the fallback for any stage or appraiser that does not declare a model.
 - **Cycle-level.** Declare a `models` map in the cycle frontmatter:
   ```yaml
   models:
+    default: anthropic/claude-sonnet-4
     forge: anthropic/claude-opus-4.7
     appraise: openai/gpt-5
   ```
-- **Appraiser-level.** Individual appraisers can declare a `model` field in their personality definition; this overrides the cycle-level appraise model on a per-appraiser basis.
+  `models.default` provides a cycle-level fallback when no per-stage override exists and no `defaultModel` is passed to the orchestrator.
+- **Appraiser-level.** Individual appraisers can declare a `model` field in their personality definition; this overrides the cycle-level appraise model and `models.default` on a per-appraiser basis.
 ### Agent files
 The user-facing `Foundry` agent is installed by the plugin's `config` hook as `.opencode/agents/foundry.md`. Users switch to this agent after restarting OpenCode. It guides authoring and flow execution while generated `foundry-*` stage agents remain hidden routing targets for specific models.
-`foundry_refresh_agents()` generates a `foundry-<slug>.md` agent file in `.opencode/agents/` for every model available in the session, where `<slug>` is the model ID with both `/` and `.` replaced by `-` (e.g. `anthropic-claude-opus-4-7.md`).
+`foundry_refresh_agents` generates a `foundry-<slug>.md` agent file in `.opencode/agents/` for every model available in the session, where `<slug>` is the model ID with both `/` and `.` replaced by `-` (e.g. `anthropic/claude-opus-4.7` yields `foundry-anthropic-claude-opus-4-7.md`). Code examples write the tool invocation as `foundry_refresh_agents()`.
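The slug rule is small enough to show directly. The helper names below are hypothetical and are not taken from `slug.js`; only the replacement rule and the `foundry-<slug>.md` pattern come from the text above.

```javascript
// Hypothetical helpers illustrating the documented slug rule.
function modelSlug(modelId) {
  // Replace every "/" and "." in the model ID with "-".
  return modelId.replace(/[\/.]/g, "-");
}

function agentFilePath(modelId) {
  return `.opencode/agents/foundry-${modelSlug(modelId)}.md`;
}

console.log(agentFilePath("anthropic/claude-opus-4.7"));
// prints ".opencode/agents/foundry-anthropic-claude-opus-4-7.md"
```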
 ### Dispatch behaviour
-- **Non-appraise stages** (forge, quench, assay): if the cycle declares `models.<stage>`, the orchestrator dispatches to `foundry-<slug>` and hard-fails if `.opencode/agents/foundry-<slug>.md` is missing. If `models.<stage>` is not set, the stage is dispatched with the `general` subagent (session default).
-- **Appraise stage**: each appraiser is dispatched independently by the `appraise` skill. If an appraiser has its own `model`, the skill dispatches to `foundry-<slug>` and hard-fails if that agent file is missing; otherwise the appraiser runs under the `general` subagent.
+- **Non-appraise stages** (forge, quench, assay): the orchestrator resolves the model by checking `models.<stage>`, then `defaultModel`, then `models.default`, then falls back to `general` (session default). If a specific model is resolved, the orchestrator dispatches to `foundry-<slug>` and hard-fails if `.opencode/agents/foundry-<slug>.md` is missing.
+- **Appraise stage**: each appraiser is dispatched independently by the appraise module. The model resolution order is: appraiser's own `model` field, then `defaultModel`, then `models.default`, then `general`. If a specific model is resolved, the task is dispatched to `foundry-<slug>` and the orchestrator hard-fails if that agent file is missing.
 Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `src/skills/appraise/SKILL.md`.
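The resolution order for both kinds of stage can be sketched as a chain of fallbacks. This follows the order exactly as written in the two bullets above; the function names and argument shapes are assumptions, and the missing-agent-file hard-fail is not modelled.

```javascript
// Hypothetical sketch of model resolution; not the code in helpers.js.
function agentFor(modelId) {
  // No model resolved: fall back to the `general` subagent (session default).
  if (!modelId) return "general";
  return `foundry-${modelId.replace(/[\/.]/g, "-")}`;
}

// Non-appraise stages: models.<stage>, then defaultModel, then models.default.
function resolveStageAgent({ stage, cycleModels = {}, defaultModel }) {
  return agentFor(cycleModels[stage] ?? defaultModel ?? cycleModels.default);
}

// Appraisers: own `model`, then defaultModel, then models.default.
function resolveAppraiserAgent({ appraiser, cycleModels = {}, defaultModel }) {
  return agentFor(appraiser.model ?? defaultModel ?? cycleModels.default);
}

const models = { default: "anthropic/claude-sonnet-4", forge: "anthropic/claude-opus-4.7" };
console.log(resolveStageAgent({ stage: "forge", cycleModels: models }));
// prints "foundry-anthropic-claude-opus-4-7"
console.log(resolveStageAgent({ stage: "quench", cycleModels: models }));
// prints "foundry-anthropic-claude-sonnet-4"
```

Note that an orchestrator-level `defaultModel` takes precedence over `models.default` in this order, so passing it changes the result for any stage without its own entry.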
@@ -331,12 +357,13 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │   │   ├── appraise/
 │   │   ├── human-appraise/
 │   │   ├── add-artefact-type/  # authoring
-│   │   ├── add-artefact-type/
 │   │   ├── add-law/
 │   │   ├── add-appraiser/
 │   │   ├── add-cycle/
 │   │   ├── add-flow/
 │   │   ├── add-extractor/
+│   │   ├── assay/              # deterministic extractor execution
+│   │   ├── dry-run/            # dry-run execution and snapshots
 │   │   ├── list-agents/        # utility
 │   │   ├── refresh-agents/     # utility (now backed by foundry_refresh_agents tool)
 │   │   ├── upgrade-foundry/
@@ -352,7 +379,7 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │   └── scripts/
 │       ├── lib/                # shared libraries (injectable I/O)
 │       │   ├── workfile.js     # WORK.md frontmatter
-│       │   ├── artefacts.js    # artefact table operations
+│       │   ├── artefacts.js    # artefact discovery via branch diffs
 │       │   ├── history.js      # WORK.history.yaml operations
 │       │   ├── feedback-store.js
 │       │   ├── feedback-transitions.js
@@ -367,10 +394,18 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │       │   ├── state.js
 │       │   ├── config.js       # foundry/ config readers
 │       │   ├── slug.js
+│       │   ├── tool-paths.js
+│       │   ├── stage-calls.js  # forge tool-call logging and verification
+│       │   ├── sort-routing.js
+│       │   ├── sort-reason.js
+│       │   ├── sort-fs-check.js
+│       │   ├── validation.js
 │       │   ├── ulid.js
 │       │   ├── tracing.js
 │       │   ├── failed-flow.js
 │       │   ├── git-bridge.js
+│       │   ├── git-finish/     # branch finishing logic
+│       │   ├── attestation/    # ATTEST.md generation and verification
 │       │   ├── git-policy.js
 │       │   ├── assay/
 │       │   ├── config-creators/
@@ -378,6 +413,11 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │       │   ├── snapshot/
 │       │   └── memory/         # flow memory (Cozo 0.7)
 │       ├── orchestrate.js      # orchestration loop (exports runOrchestrate)
+│       ├── orchestrate-cycle.js
+│       ├── orchestrate-phases.js
+│       ├── orchestrate-terminals.js
+│       ├── quench-module.js    # deterministic validation (runQuench)
+│       ├── appraise-module.js  # appraise gather and consolidate
 │       └── sort.js             # routing engine (exports runSort)
 ├── scripts/
 │   └── build.js                # builds src/ into dist/