npm - @really-knows-ai/foundry - Versions diffs - 3.5.7 → 3.5.9 - Mend

@really-knows-ai/foundry 3.5.7 → 3.5.9


Files changed (26)

package/README.md +16 -10
package/dist/.opencode/plugins/foundry-tools/config-create-tools.js +2 -3
package/dist/.opencode/plugins/foundry.js +11 -1
package/dist/CHANGELOG.md +23 -0
package/dist/README.md +16 -10
package/dist/docs/README.md +6 -6
package/dist/docs/architecture.md +59 -19
package/dist/docs/concepts.md +55 -19
package/dist/docs/getting-started.md +37 -15
package/dist/docs/memory-maintenance.md +3 -3
package/dist/docs/tools.md +131 -70
package/dist/docs/work-spec.md +38 -52
package/dist/scripts/lib/config-creators/cycle.js +6 -10
package/dist/scripts/lib/config-validators/cycle.js +1 -9
package/dist/scripts/lib/feedback-store.js +1 -52
package/dist/scripts/lib/sort-reason.js +8 -7
package/dist/scripts/lib/sort-routing.js +106 -28
package/dist/scripts/lib/tool-paths.js +5 -1
package/dist/scripts/orchestrate-cycle.js +3 -13
package/dist/scripts/orchestrate-phases.js +3 -7
package/dist/scripts/sort.js +16 -53
package/dist/skills/add-cycle/SKILL.md +4 -4
package/dist/skills/add-flow/SKILL.md +1 -1
package/dist/skills/add-law/SKILL.md +1 -1
package/dist/skills/human-appraise/SKILL.md +12 -40
package/package.json +1 -1

package/README.md

@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
 is auditable, repeatable, and defensible to auditors and stakeholders. You can show
 exactly how the output was made. Confidence is engineered; it is not hoped for.
-### The operating model: assay, then forge → quench → appraise
+### The operating model: assay, then forge → quench → appraise → attest → finish
 A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
 that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
   fast and non-negotiable, catching errors before they reach appraisers.
 - **Appraise** judges quality against written laws. Independent evaluators inspect
-  whether the work meets the subjective standards you define.
+  whether the work meets the rules or criteria you define.
-- **Human-appraise** provides direct judgement when the stakes require it or the loop
-  deadlocks. Offers human oversight at critical decision points.
+- **Human-appraise** provides direct judgement when the stakes require it or the
+  cycle reaches its iteration limit. Offers human oversight at critical
+  decision points.
 Every stage commits separately, so every step leaves a record. Every decision is
 timestamped. A single loop produces an **output** — a verified draft. A flow
-composes one or more such loops to produce an **outcome** — the final artefact that
-reaches your codebase or customers.
+composes one or more such loops to produce an **outcome** — the final artefact.
+When the loop clears, completing the work branch requires **attest** — a final
+verification that writes and commits `ATTEST.md` — followed by **finish**, which
+squash-merges the approved work to the base branch with a signed attestation block.
 ### What you describe, what Foundry enforces
@@ -174,7 +178,8 @@ quench    → 5/7/5 — passes                          [commit]
 appraise  → 2 appraisers, one flags weak imagery    [commit]
 forge     → revises                                 [commit]
 appraise  → clean                                   [commit]
-done      → squash-merged to main with attestation
+attest    → ATTEST.md committed                     [commit]
+finish    → squash-merged to main with attestation
 ```
 Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
 `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
 and [Assay](docs/concepts.md#assay) for the configuration path.
-> **Note (3.0.0):** flow memory currently persists to `cozo-node`, which is
+> **Note:** flow memory currently persists to `cozo-node`, which is
 > unmaintained upstream. Installation produces six cosmetic deprecation warnings
 > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
 > a maintained backend in a future release; the public `foundry_memory_*` tools
 > and on-disk vocabulary/NDJSON format are designed to survive that migration.
-> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status-as-of-300).
+> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).
 ---
@@ -233,7 +238,8 @@ reproducible.
   state live in tested plugin code, outside LLM control.
 - **Written quality criteria** — laws are markdown files; an appraiser panel scores
-  each artefact against them, so quality is objective.
+  each artefact against them, providing structured quality assessment from
+  multiple perspectives.
 - **Multi-model diversity** — forge on one model, appraise on another, every
   appraiser on a different model if you want. Different models catch different

package/dist/.opencode/plugins/foundry-tools/config-create-tools.js

@@ -162,9 +162,8 @@ function cycleArgs(s) { return {
     artefacts: s.array(s.string()).describe('Artefact type IDs this cycle reads'),
   }).optional().describe('Input contract for this cycle. Omit for source cycles that start from the user goal; empty artefacts arrays are invalid.'),
   targets: s.array(s.string()).optional().describe('Downstream cycle IDs this cycle can route to'),
-  humanAppraise: s.boolean().optional().describe('Include human-appraise in every iteration'),
-  deadlockAppraise: s.boolean().optional().describe('Route to human-appraise on LLM appraiser deadlock'),
-  deadlockIterations: s.number().optional().describe('Iteration threshold for deadlock detection'),
+  alwaysHumanAppraise: s.boolean().optional().describe('Include human-appraise in every iteration'),
+  deadlockHumanAppraise: s.boolean().optional().describe('Route to human-appraise when max-iterations is reached'),
   maxIterations: s.number().optional().describe('Maximum forge iterations before cycle blocks'),
   assay: s.object({
     extractors: s.array(s.string()).describe('Extractor IDs for the assay stage'),

package/dist/.opencode/plugins/foundry.js

@@ -36,7 +36,7 @@ import { createMemoryAdminTools } from './foundry-tools/memory-admin-tools.js';
 import { createSnapshotTools } from './foundry-tools/snapshot-tools.js';
 import { createAttestationTools } from './foundry-tools/attestation-tools.js';
 import { createRefreshAgentsTool } from './foundry-tools/refresh-agents-tool.js';
-import { resolveGit } from '../../scripts/lib/tool-paths.js';
+import { resolveGit, resolvePnpm } from '../../scripts/lib/tool-paths.js';
 function findPackageRoot(startDir) {
   let dir = startDir;
@@ -103,7 +103,17 @@ function initGitRepo(worktree) {
   }
 }
+function ensurePackageJson(worktree) {
+  if (existsSync(path.join(worktree, 'package.json'))) return;
+  try {
+    execFileSync(resolvePnpm(), ['init'], { cwd: worktree, stdio: 'pipe' });
+  } catch (err) {
+    console.error('Foundry pnpm init error:', err.message);
+  }
+}
 function runBootstrapSequence(worktree, pkgRoot) {
+  ensurePackageJson(worktree);
   bootstrapDirectories(worktree);
   bootstrapGitignore(worktree);
   refreshAgents(worktree);

package/dist/CHANGELOG.md

@@ -1,5 +1,28 @@
 # Changelog
+## [3.5.9] - 2026-05-24
+### Changed
+- Cycle frontmatter keys renamed: `human-appraise` → `always-human-appraise`, `deadlock-appraise` → `deadlock-human-appraise`, `deadlock-iterations` replaced with `max-iterations`-based deadlock routing.
+- Deadlock routing replaced per-item deadlocked state with iteration-limit detection using `max-iterations` and `deadlock-human-appraise`.
+- Orchestration uses iteration cap routing instead of per-item deadlock history.
+- Skills and public documentation (`README.md`, `docs/*`) refreshed against the current 65-tool implementation: updated tool reference with structured config-creation arguments, corrected quench/appraise execution models, documented attestation-before-finish workflow, added Validator and ATTEST.md concepts, and fixed memory paths. All stale v3.0.x terminology removed.
+### Fixed
+- Guidance audit tests aligned with current user-facing documentation style (no direct `foundry_git_finish({)` call syntax in walkthrough).
+## [3.5.8] - 2026-05-23
+### Added
+- Foundry bootstrap runs `pnpm init` on the project workspace when no `package.json` exists, so validators can install dependencies immediately.
+### Changed
+- Validator guidance in the add-law skill hardened from "prefer libraries" to "hand-rolled heuristics are a last resort."
 ## [3.5.7] - 2026-05-23
 ### Added

package/dist/README.md

@@ -32,7 +32,7 @@ the loop and records every step in git, so the path from draft to approved artef
 is auditable, repeatable, and defensible to auditors and stakeholders. You can show
 exactly how the output was made. Confidence is engineered; it is not hoped for.
-### The operating model: assay, then forge → quench → appraise
+### The operating model: assay, then forge → quench → appraise → attest → finish
 A codebase-aware cycle can begin with **assay**: a deterministic pre-forge stage
 that runs project-authored extractor scripts, parses the strict JSONL facts they
@@ -51,15 +51,19 @@ gates. Each loop has four distinct roles that turn a candidate into a verified o
   fast and non-negotiable, catching errors before they reach appraisers.
 - **Appraise** judges quality against written laws. Independent evaluators inspect
-  whether the work meets the subjective standards you define.
+  whether the work meets the rules or criteria you define.
-- **Human-appraise** provides direct judgement when the stakes require it or the loop
-  deadlocks. Offers human oversight at critical decision points.
+- **Human-appraise** provides direct judgement when the stakes require it or the
+  cycle reaches its iteration limit. Offers human oversight at critical
+  decision points.
 Every stage commits separately, so every step leaves a record. Every decision is
 timestamped. A single loop produces an **output** — a verified draft. A flow
-composes one or more such loops to produce an **outcome** — the final artefact that
-reaches your codebase or customers.
+composes one or more such loops to produce an **outcome** — the final artefact.
+When the loop clears, completing the work branch requires **attest** — a final
+verification that writes and commits `ATTEST.md` — followed by **finish**, which
+squash-merges the approved work to the base branch with a signed attestation block.
 ### What you describe, what Foundry enforces
@@ -174,7 +178,8 @@ quench    → 5/7/5 — passes                          [commit]
 appraise  → 2 appraisers, one flags weak imagery    [commit]
 forge     → revises                                 [commit]
 appraise  → clean                                   [commit]
-done      → squash-merged to main with attestation
+attest    → ATTEST.md committed                     [commit]
+finish    → squash-merged to main with attestation
 ```
 Every stage commits. Every decision is recorded. Every piece of feedback and every
@@ -191,12 +196,12 @@ declare the entity and edge vocabulary, add extractors, and opt a cycle into
 `assay.extractors`. See [Optional: flow memory](docs/getting-started.md#optional-flow-memory)
 and [Assay](docs/concepts.md#assay) for the configuration path.
-> **Note (3.0.0):** flow memory currently persists to `cozo-node`, which is
+> **Note:** flow memory currently persists to `cozo-node`, which is
 > unmaintained upstream. Installation produces six cosmetic deprecation warnings
 > from transitive dependencies (`pnpm audit` is clean). Foundry will migrate to
 > a maintained backend in a future release; the public `foundry_memory_*` tools
 > and on-disk vocabulary/NDJSON format are designed to survive that migration.
-> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status-as-of-300).
+> See `CHANGELOG.md` and [docs/memory-maintenance.md](docs/memory-maintenance.md#backend-status).
 ---
@@ -233,7 +238,8 @@ reproducible.
   state live in tested plugin code, outside LLM control.
 - **Written quality criteria** — laws are markdown files; an appraiser panel scores
-  each artefact against them, so quality is objective.
+  each artefact against them, providing structured quality assessment from
+  multiple perspectives.
 - **Multi-model diversity** — forge on one model, appraise on another, every
   appraiser on a different model if you want. Different models catch different

package/dist/docs/README.md

@@ -1,6 +1,6 @@
 # Foundry docs
-This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need.
+This directory contains the reference set behind the project README. Every document here serves a single purpose; use this index to find what you need. The `docs/superpowers/` directory is future-facing internal scaffolding and is not indexed as public documentation.
 **How to navigate:** Work through the sections in order: **Start here** establishes conceptual foundations, **Reference** provides detailed specifications for implementation, and **Contributors** covers subsystem maintenance and extensions.
@@ -10,9 +10,9 @@ Getting oriented with Foundry means understanding both the concepts it uses and
 **Reading order:** Work through them in order; [getting-started.md](getting-started.md) builds hands-on confidence, and [concepts.md](concepts.md) provides reference depth. Most implementers spend 1–2 hours on getting-started before moving to Reference materials.
-- **getting-started.md** — Complete end-to-end installation, auto-bootstrapping, and first flow walkthrough. Read this immediately after installing the plugin and before authoring any of your own configuration.
+- **getting-started.md** — Complete end-to-end installation, bootstrap, Foundry agent wizard, and first flow walkthrough. Read this immediately after installing the plugin and before authoring any of your own configuration.
-  It establishes the operating model, directory structure, and practical confidence in one pass. Includes hands-on guidance on authoring the five foundational concepts (artefact types, laws, appraisers, cycles, flows) with worked examples you can run against real code. Also covers the optional flow-memory path: initialise memory, declare vocabulary, add extractors, and opt a cycle into assay for codebase-aware flows.
+  It covers bootstrap states, the Understand–Plan–Confirm–Build wizard protocol, structured config authoring, current flow execution with quench and appraise, attestation and branch finishing, dry-run completion, and the optional flow-memory path (initialise memory, declare vocabulary, add extractors, opt a cycle into assay). Uses worked examples you can run against real code.
   Implementers must follow every step and complete the bootstrap; architects typically skim for structure before moving to [concepts.md](concepts.md) and [architecture.md](architecture.md) to reason about their designs.
@@ -28,19 +28,19 @@ These documents specify formats, tools, and design principles. Use them when imp
 **Key property:** These are sources of truth and normative references. Changes to Foundry flow formats or tool behaviour must be reflected here first. Use them together—cross-references appear throughout.
-- **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields, the artefact registry, and the full feedback state machine with all valid transitions and guards.
+- **work-spec.md** — Complete specification of the `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml` file formats, including frontmatter fields, feedback and history state, and the full feedback state machine with all valid transitions and guards. Artefacts are discovered from branch diffs, not stored as an artefact table in the workfile.
   Use this when implementing tooling around work files, validating state transitions, or understanding what metadata flows carry through an execution. It is the authoritative source of truth for all transient work-branch structures, format validation rules, and field semantics.
   Implementers and tool builders rely on this heavily; keep it updated immediately as formats evolve or new fields are added.
-- **tools.md** — Categorical index and reference documentation for all custom tools, organised by family (lifecycle, artefacts, feedback, config, memory, etc.) with complete signatures and permissions.
+- **tools.md** — Complete 65-tool categorical index and reference, organised by family (lifecycle, artefacts, feedback, config, memory, etc.) with complete signatures and permissions.
   Consult this when you need to understand what a specific tool does, its branch requirements, what stage locks apply, what arguments it accepts, and how it integrates with the overall system. Covers calling conventions, enforcement invariants, and the permission model for memory access. References `foundry_assay_run`, `foundry_extractor_create`, and memory data and admin tools.
   Tool authors and system integrators use this constantly; it is the comprehensive reference for all custom tools in the Foundry ecosystem.
-- **architecture.md** — The design and enforcement model covering token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing, and core design principles.
+- **architecture.md** — The design and enforcement model covering orchestration actions (`dispatch`, `dispatch_multi`, `human_appraise`, `done`, `blocked`, `violation`), internal quench and appraise execution, branch finish paths with attestation preconditions, token lifecycle, stage-locked mutations, write invariants, branch namespaces, multi-model routing with `defaultModel`, forge required-tool verification, and core design principles.
   Read this when you need to understand how Foundry maintains safety (how tokens prevent replay, why stages lock mutations, how writes are validated), what guarantees it makes and where they live in the code, or why it is structured the way it is. Explains the memory layout, assay write boundary, and failed-flow behaviour that keep extractor-populated memory auditable.

package/dist/docs/architecture.md

@@ -34,7 +34,7 @@ A forge sub-agent can decline subjective feedback with a justification, and an a
 ### Humans can step in at known points
-Human-in-the-loop gates are first-class stages. A cycle can declare `human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-appraise: true` (the default) to pull a human in only when LLM appraisers and forge ping-pong on the same items. Human feedback takes absolute priority and cannot be wont-fixed.
+Human-in-the-loop gates are first-class stages. A cycle can declare `always-human-appraise: true` to run a human quality gate every iteration, or rely on `deadlock-human-appraise: true` (the default) to pull a human in when the iteration count reaches `max-iterations`. Human feedback takes absolute priority and cannot be wont-fixed.
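
The key renames in this hunk (`human-appraise` to `always-human-appraise`, `deadlock-appraise` to `deadlock-human-appraise`, and the removal of `deadlock-iterations`) could be applied to existing cycle frontmatter with a small helper. The following is a sketch only: `migrateCycleFrontmatter` is a hypothetical name, not part of Foundry, and it assumes the frontmatter has already been parsed into a plain object.

```javascript
// Hypothetical helper, not part of Foundry: renames 3.5.7 cycle frontmatter
// keys to the 3.5.9 names shown in this diff.
const RENAMED_KEYS = new Map([
  ['human-appraise', 'always-human-appraise'],
  ['deadlock-appraise', 'deadlock-human-appraise'],
]);

function migrateCycleFrontmatter(frontmatter) {
  const migrated = {};
  const dropped = [];
  for (const [key, value] of Object.entries(frontmatter)) {
    if (key === 'deadlock-iterations') {
      // No one-to-one replacement: deadlock routing now keys off max-iterations.
      dropped.push(key);
    } else {
      migrated[RENAMED_KEYS.get(key) ?? key] = value;
    }
  }
  return { migrated, dropped };
}

const result = migrateCycleFrontmatter({
  'human-appraise': false,
  'deadlock-appraise': true,
  'deadlock-iterations': 5,
  'max-iterations': 8,
});
console.log(result.migrated, result.dropped);
```

The helper reports dropped keys rather than guessing a `max-iterations` value, since the diff does not state how an old `deadlock-iterations` setting should map onto the new cap.
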
 ### Multi-model diversity
@@ -75,9 +75,10 @@ The following guarantees live in plugin code and are outside LLM control:
 The `orchestrate` skill is a thin driver around `foundry_orchestrate`. Its entire loop is:
 ```text
-call foundry_orchestrate({lastResult})
+call foundry_orchestrate({lastResult, lastResults, baseBranch, defaultModel})
 switch on action:
-  dispatch        → dispatch the requested subagent → report back
+  dispatch        → dispatch single subagent → report back
+  dispatch_multi  → dispatch parallel appraiser tasks → consolidate → report back
   human_appraise  → run human-appraise inline → report back
   done / blocked / violation → terminate the loop
 ```
@@ -92,6 +93,14 @@ switch on action:
 Because the protocol lives in a plugin tool, the LLM cannot skip steps, reorder them, or silently drop a commit.
+### Internal quench execution
+Quench runs inside the orchestrator as `runQuench(ctx)` — a deterministic, non-LLM validation pass. It reads the active stage from `.foundry/active-stage.json`, discovers artefact changes via branch-based artefact discovery, runs validators for each applicable law, and posts feedback with tags in the format `law:<law-id>:<validator-id>`. It also resolves prior quench feedback: items whose issues remain are set to `rejected`; items no longer present are set to `approved` (transitioned to `resolved`). The quench module is available at `src/scripts/quench-module.js`.
+### Internal appraise execution
+Appraise uses `gatherAppraiseContext()` to build parallel subagent tasks, one per (artefact, appraiser) pair. It returns a `dispatch_multi` action containing the task list. The orchestrator's loop dispatches each task independently. After all appraisers report back, `consolidateAppraise()` processes the `lastResults` array: it parses each successful output for structured issues, de-duplicates across appraisers, posts feedback with `law:<law-id>` tags, and resolves prior appraise feedback items (resolves stale items, rejects items still present). The appraise module is available at `src/scripts/appraise-module.js`.
 ---
 ## Token lifecycle
@@ -134,9 +143,9 @@ Foundry partitions mutation across three branch namespaces. The plugin enforces
 | Namespace | Pattern | Owns | Created from | Finished by |
 |-----------|---------|------|--------------|-------------|
-| **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` | PR or direct merge to `main` |
-| **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | `foundry_git_finish` (squash-merges to base, preserves forensic archive) |
-| **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot, force-deletes branch) |
+| **config** | `config/<description>` | `foundry/` (schema and config) | `main` via `foundry_git_branch({ kind: "config", description })` | Squash-merge to base branch; no attestation required |
+| **work** | `work/<flowId>-<description>` | `WORK.md`, `WORK.feedback.yaml`, `WORK.history.yaml`, `foundry-memory/` (row data) | `main` via `foundry_git_branch({ kind: "work", flowId, description })` | Requires `ATTEST.md` at HEAD (created by `foundry_attest({ confirm: true })`). `foundry_git_finish({ confirm: true })` verifies the attestation, preserves `archive/<work-branch>-<short-sha>`, squash-merges to base, creates a signed commit (`-S`), deletes the work branch |
+| **dry-run** | `dry-run/<parentConfig>/<flowId>-<description>` | Same as `work/*` | `config/*` via `foundry_git_branch({ kind: "dry-run", flowId, description })` | `foundry_git_finish` (captures snapshot to `.snapshots/<run-id>/`, force-deletes branch) |
 ### Guard implementation
@@ -150,7 +159,7 @@ Implementation: `src/scripts/lib/branch-guard.js`.
 ### Forensic branches and snapshots
-- **Work branches** are preserved as `archive/work/<flowId>-<description>-<hash>` when `foundry_git_finish` completes. The full stage micro-commit history, `WORK.*` files, and all intermediate artefact states remain intact. The signed squash commit on the base branch references the archive branch tip SHA in its attestation block.
+- **Work branches** require `ATTEST.md` at HEAD, created by `foundry_attest({ confirm: true })` before `foundry_git_finish({ confirm: true })` runs. The finish tool verifies the attestation, checks the diff SHA matches, preserves the branch as `archive/work/<flowId>-<description>-<hash>`, squash-merges to the base branch with a signed commit (`-S`), and deletes the work branch. The full stage micro-commit history, `WORK.*` files, and all intermediate artefact states remain intact. The signed squash commit on the base branch references the archive branch tip SHA in its attestation block.
 - **Dry-run branches** are force-deleted after `foundry_git_finish` captures a snapshot to `.snapshots/<runId>/` on the parent `config/*` working tree. Each snapshot includes `README.md` (metadata), `work/WORK*` (workfile triple), `diff.patch` (full diff), and `trace.jsonl` (tool-call trace).
 ---
@@ -233,6 +242,20 @@ Every stage runs inside a token-gated lifecycle. The sub-agent must call `foundr
 Input artefacts (files matching an input type's `file-patterns`) are read-only. Files outside any artefact type's patterns are read-only. Violations hard-stop the cycle with `{error: 'unexpected_files'}`.
+### Forge required-tool verification
+During `foundry_stage_end` for a forge stage, the plugin verifies that the forge sub-agent called five required context-reading tools:
+1. `foundry_config_cycle`
+2. `foundry_workfile_get`
+3. `foundry_config_artefact_type`
+4. `foundry_config_laws`
+5. `foundry_feedback_list`
+Tool calls are logged to `.foundry/.forge-tool-calls.jsonl` during stage execution. When `foundry_stage_end` runs, it checks the log against the required set. Missing required calls generate system feedback with the tag `system:missing-tool-calls` and the forge stage completes normally — the missing-tool feedback acts as a signal to the sort router. When all required tools are present, any prior `system:missing-tool-calls` feedback is resolved.
+Implementation: `src/plugin/tools/stage-tools.js` (`verifyAndManageForgeTools`) and `src/scripts/lib/stage-calls.js`.
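
The required-tool check described above amounts to a set difference between the five required tool names and the names found in the call log. A minimal sketch follows, assuming each JSONL line is an object with a `tool` field; the real log schema is not shown in this diff, and `findMissingForgeTools` is a hypothetical name rather than the plugin's `verifyAndManageForgeTools`.

```javascript
// The five required context-reading tools, as listed in the documentation.
const REQUIRED_FORGE_TOOLS = [
  'foundry_config_cycle',
  'foundry_workfile_get',
  'foundry_config_artefact_type',
  'foundry_config_laws',
  'foundry_feedback_list',
];

// Assumption: every non-empty log line is JSON with a `tool` field.
function findMissingForgeTools(jsonlText) {
  const called = new Set(
    jsonlText
      .split('\n')
      .filter((line) => line.trim() !== '')
      .map((line) => JSON.parse(line).tool),
  );
  return REQUIRED_FORGE_TOOLS.filter((name) => !called.has(name));
}

const log = [
  '{"tool":"foundry_config_cycle"}',
  '{"tool":"foundry_workfile_get"}',
  '{"tool":"foundry_feedback_list"}',
].join('\n');

// A non-empty result corresponds to the `system:missing-tool-calls` case.
const missing = findMissingForgeTools(log);
console.log(missing);
```
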
 ### Failed flow state
 When an unrecoverable error occurs (e.g. assay extractor abort, invalid JSONL, or memory-sync failure), the orchestrator marks `WORK.md` frontmatter with `status: failed` and a `reason`. The flow is then locked:
@@ -258,10 +281,10 @@ All pipeline skills (`orchestrate`, `flow`, stage skills) check for this state a
 `src/scripts/sort.js` (exported as `runSort`) owns the routing engine. It reads `WORK.md`, `WORK.feedback.yaml`, and `WORK.history.yaml`, then decides which stage runs next based on:
-- **Unresolved feedback.** If `quench` or `appraise` feedback exists in a non-terminal state (`open`, `actioned`, `wont-fix`), the next stage is usually `forge` or the originating evaluation stage.
-- **Deadlock detection.** If the same feedback items ping-pong between forge and appraise for `deadlock-iterations` (default 5) iterations, sort marks them `deadlocked` and inserts a `human-appraise` stage (if `deadlock-appraise: true`, the default).
-- **Iteration limits.** If `max-iterations` is exceeded, the cycle is marked `blocked` and control returns to the user.
+- **Unresolved feedback.** If feedback exists in a non-terminal state (`open`, `actioned`, `wont-fix`), the next stage is `forge` (for items needing action) or the originating evaluation stage (for items pending approval).
+- **Iteration limits and deadlock routing.** When the forge iteration count reaches `max-iterations` with unresolved feedback, sort routes to `human-appraise` (if `deadlock-human-appraise: true`, the default) or marks the cycle `blocked` if human routing is disabled. Sort does not write per-item deadlocked state; deadlock is a routing decision, not a feedback item state.
 - **Clean state.** If all feedback is resolved and no new validation or appraisal failures exist, the cycle is `done`.
+- **Blocked.** If `max-iterations` is exceeded and `deadlock-human-appraise` is `false`, the cycle is marked `blocked` and control returns to the user.
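
The iteration-cap routing in the bullets above can be read as a small decision function. The sketch below is a simplification under stated assumptions: the input field names are hypothetical, the real `sort.js` reads the `WORK.*` files instead, and the case where items pending approval go back to their originating evaluation stage is collapsed into `forge`.

```javascript
// Simplified sketch of iteration-cap routing; not the actual sort.js logic.
function routeNext({
  unresolvedCount,
  forgeIterations,
  maxIterations,
  deadlockHumanAppraise = true, // documented default
}) {
  // Clean state: nothing left to act on.
  if (unresolvedCount === 0) return 'done';
  // Cap reached with unresolved feedback: route to a human or block.
  if (forgeIterations >= maxIterations) {
    return deadlockHumanAppraise ? 'human-appraise' : 'blocked';
  }
  // Otherwise another forge pass addresses the open items.
  return 'forge';
}

console.log(routeNext({ unresolvedCount: 2, forgeIterations: 5, maxIterations: 5 }));
```

Note that the function returns a routing decision only and never marks individual items, matching the statement that deadlock is no longer a feedback item state.
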
 ### Feedback state machine
@@ -269,15 +292,15 @@ Feedback items live in `WORK.feedback.yaml` with a full transition history. Each
 - `id` — a ULID.
 - `source` — the stage that created it (e.g. `quench:check-syllables`, `appraise:pedantic`, `human-appraise:hitl`).
-- `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected`, `deadlocked`).
+- `state` — current state (`open`, `actioned`, `wont-fix`, `resolved`, `rejected`).
 - `history` — append-only log of state transitions with timestamps and metadata.
 Transitions are **source-based**:
 | Source stage | Forge can `wont-fix`? | Resolved by |
 |--------------|------------------------|-------------|
-| `quench` (CLI validation) | No — must `actioned` | the originating `quench` stage, or `human-appraise` override |
-| `appraise` (subjective law) | Yes (with reason) | the originating `appraise` stage, or `human-appraise` override |
+| `quench` (deterministic validation) | No — must `actioned` | the originating `quench` stage, or `human-appraise` override |
+| `appraise` (law evaluation) | Yes (with reason) | the originating `appraise` stage, or `human-appraise` override |
 | `human-appraise` (user instruction) | No — must `actioned` | the originating `human-appraise` stage |
 Implementation: `src/scripts/lib/feedback-transitions.js` and `src/scripts/lib/feedback-store.js`. See [work-spec.md](work-spec.md) for the full state machine table.
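
The source-based rule in the table can be expressed as a guard on the `source` prefix. This is an illustrative sketch, not the code in `feedback-transitions.js`; the function names are hypothetical, and it assumes a non-empty string is what counts as a reason.

```javascript
// Source strings look like `quench:check-syllables` or `appraise:pedantic`;
// the stage is the part before the first colon.
function sourceStage(source) {
  return source.split(':')[0];
}

// Per the table: only appraise feedback can be declined, and only with a reason.
// Quench and human-appraise feedback must be actioned.
function canWontFix(source, reason) {
  return (
    sourceStage(source) === 'appraise' &&
    typeof reason === 'string' &&
    reason.trim() !== ''
  );
}

console.log(canWontFix('appraise:pedantic', 'Imagery is intentional'));
console.log(canWontFix('quench:check-syllables', 'Disagree'));
```
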
@@ -290,24 +313,27 @@ Different stages can run on different models for cognitive diversity. Cycle defi
 ### Configuration
+- **Orchestrator argument.** `defaultModel` (optional) can be passed as an orchestrator argument. When set, it serves as the fallback for any stage or appraiser that does not declare a model.
 - **Cycle-level.** Declare a `models` map in the cycle frontmatter:
   ```yaml
   models:
+    default: anthropic/claude-sonnet-4
     forge: anthropic/claude-opus-4.7
     appraise: openai/gpt-5
   ```
-- **Appraiser-level.** Individual appraisers can declare a `model` field in their personality definition; this overrides the cycle-level appraise model on a per-appraiser basis.
+  `models.default` provides a cycle-level fallback when no per-stage override exists and no `defaultModel` is passed to the orchestrator.
+- **Appraiser-level.** Individual appraisers can declare a `model` field in their personality definition; this overrides the cycle-level appraise model and `models.default` on a per-appraiser basis.
 ### Agent files
 The user-facing `Foundry` agent is installed by the plugin's `config` hook as `.opencode/agents/foundry.md`. Users switch to this agent after restarting OpenCode. It guides authoring and flow execution while generated `foundry-*` stage agents remain hidden routing targets for specific models.
-`foundry_refresh_agents()` generates a `foundry-<slug>.md` agent file in `.opencode/agents/` for every model available in the session, where `<slug>` is the model ID with both `/` and `.` replaced by `-` (e.g. `anthropic-claude-opus-4-7.md`).
+`foundry_refresh_agents` generates a `foundry-<slug>.md` agent file in `.opencode/agents/` for every model available in the session, where `<slug>` is the model ID with both `/` and `.` replaced by `-` (e.g. `anthropic-claude-opus-4-7.md`). Call `foundry_refresh_agents()` in code examples when referring to the tool invocation.
 ### Dispatch behaviour
-- **Non-appraise stages** (forge, quench, assay): if the cycle declares `models.<stage>`, the orchestrator dispatches to `foundry-<slug>` and hard-fails if `.opencode/agents/foundry-<slug>.md` is missing. If `models.<stage>` is not set, the stage is dispatched with the `general` subagent (session default).
-- **Appraise stage**: each appraiser is dispatched independently by the `appraise` skill. If an appraiser has its own `model`, the skill dispatches to `foundry-<slug>` and hard-fails if that agent file is missing; otherwise the appraiser runs under the `general` subagent.
+- **Non-appraise stages** (forge, quench, assay): the orchestrator resolves the model by checking `models.<stage>`, then `defaultModel`, then `models.default`, then falls back to `general` (session default). If a specific model is resolved, the orchestrator dispatches to `foundry-<slug>` and hard-fails if `.opencode/agents/foundry-<slug>.md` is missing.
+- **Appraise stage**: each appraiser is dispatched independently by the appraise module. The model resolution order is: appraiser's own `model` field, then `defaultModel`, then `models.default`, then `general`. If a specific model is resolved, the task is dispatched to `foundry-<slug>` and the orchestrator hard-fails if that agent file is missing.
 Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `src/skills/appraise/SKILL.md`.
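The resolution order and slug rule described in this hunk can be sketched as below. This is a minimal illustration, not the code in `helpers.js`: the function names (`modelSlug`, `resolveStageModel`, `resolveAppraiserModel`, `agentFor`) are hypothetical, and `null` is used here as a stand-in for "no model resolved, use `general`".

```javascript
// Hypothetical helper names; only the resolution order and the slug rule
// come from the documentation text in this diff.

// Model ID -> agent slug: replace every "/" and "." with "-".
function modelSlug(modelId) {
  return modelId.replace(/[\/.]/g, "-");
}

// Non-appraise stages (forge, quench, assay).
// Order: models.<stage>, then defaultModel, then models.default, then null.
function resolveStageModel(stage, cycleModels = {}, defaultModel) {
  return cycleModels[stage] ?? defaultModel ?? cycleModels.default ?? null;
}

// One appraiser.
// Order: the appraiser's own `model`, then defaultModel, then models.default, then null.
function resolveAppraiserModel(appraiser, cycleModels = {}, defaultModel) {
  return appraiser.model ?? defaultModel ?? cycleModels.default ?? null;
}

// Agent name to dispatch to; null means the `general` subagent.
function agentFor(modelId) {
  return modelId === null ? "general" : `foundry-${modelSlug(modelId)}`;
}

const models = {
  default: "anthropic/claude-sonnet-4",
  forge: "anthropic/claude-opus-4.7",
};

console.log(agentFor(resolveStageModel("forge", models)));  // foundry-anthropic-claude-opus-4-7
console.log(agentFor(resolveStageModel("quench", models))); // foundry-anthropic-claude-sonnet-4
console.log(agentFor(resolveStageModel("quench", {})));     // general
```

The hard-fail on a missing `.opencode/agents/foundry-<slug>.md` file is left out of the sketch; it would be a file-existence check on the name returned by `agentFor` whenever that name is not `general`.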
@@ -331,12 +357,13 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │   │   ├── appraise/
 │   │   ├── human-appraise/
 │   │   ├── add-artefact-type/  # authoring
-│   │   ├── add-artefact-type/
 │   │   ├── add-law/
 │   │   ├── add-appraiser/
 │   │   ├── add-cycle/
 │   │   ├── add-flow/
 │   │   ├── add-extractor/
+│   │   ├── assay/              # deterministic extractor execution
+│   │   ├── dry-run/            # dry-run execution and snapshots
 │   │   ├── list-agents/        # utility
 │   │   ├── refresh-agents/       # utility (now backed by foundry_refresh_agents tool)
 │   │   ├── upgrade-foundry/
@@ -352,7 +379,7 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │   └── scripts/
 │       ├── lib/                # shared libraries (injectable I/O)
 │       │   ├── workfile.js     # WORK.md frontmatter
-│       │   ├── artefacts.js    # artefact table operations
+│       │   ├── artefacts.js    # artefact discovery via branch diffs
 │       │   ├── history.js      # WORK.history.yaml operations
 │       │   ├── feedback-store.js
 │       │   ├── feedback-transitions.js
@@ -367,10 +394,18 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │       │   ├── state.js
 │       │   ├── config.js       # foundry/ config readers
 │       │   ├── slug.js
+│       │   ├── tool-paths.js
+│       │   ├── stage-calls.js  # forge tool-call logging and verification
+│       │   ├── sort-routing.js
+│       │   ├── sort-reason.js
+│       │   ├── sort-fs-check.js
+│       │   ├── validation.js
 │       │   ├── ulid.js
 │       │   ├── tracing.js
 │       │   ├── failed-flow.js
 │       │   ├── git-bridge.js
+│       │   ├── git-finish/     # branch finishing logic
+│       │   ├── attestation/    # ATTEST.md generation and verification
 │       │   ├── git-policy.js
 │       │   ├── assay/
 │       │   ├── config-creators/
@@ -378,6 +413,11 @@ Implementation: `src/plugin/tools/helpers.js` (`buildCyclePromptExtras`) and `sr
 │       │   ├── snapshot/
 │       │   └── memory/         # flow memory (Cozo 0.7)
 │       ├── orchestrate.js      # orchestration loop (exports runOrchestrate)
+│       ├── orchestrate-cycle.js
+│       ├── orchestrate-phases.js
+│       ├── orchestrate-terminals.js
+│       ├── quench-module.js    # deterministic validation (runQuench)
+│       ├── appraise-module.js  # appraise gather and consolidate
 │       └── sort.js             # routing engine (exports runSort)
 ├── scripts/
 │   └── build.js                # builds src/ into dist/

package/dist/docs/concepts.md CHANGED Viewed

@@ -20,9 +20,9 @@ An iterative unit that produces one artefact type and routes to later cycles thr
 - `output-type` — the artefact type it produces (read-write).
 - `inputs` — a contract (`any-of` / `all-of`) over other artefact types. Inputs are discovered on disk; they are read-only unless the output type's patterns happen to cover them.
 - `targets` — the cycle(s) that may run after this one. May be empty (terminal cycle).
-- `human-appraise` — whether a human quality gate runs every iteration (default: `false`).
-- `deadlock-appraise` — whether a human is pulled in when LLM appraisers deadlock (default: `true`).
-- `deadlock-iterations` — deadlock threshold (default: `5`).
+- `always-human-appraise` — whether a human quality gate runs every iteration (default: `false`).
+- `deadlock-human-appraise` — whether a human is pulled in when the deadlock threshold is reached (default: `true`).
+- `max-iterations` — maximum forge iterations (default: `3`).
 - `models` — optional per-stage model overrides.
 A cycle runs **assay** first when configured, then **forge → quench → appraise** (and optionally **human-appraise**), looping until all feedback is resolved or `max-iterations` is hit.
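As a rough illustration of the renamed fields and their defaults, the sketch below fills in missing cycle settings and expresses the loop condition from the sentence above. The names `CYCLE_DEFAULTS`, `withCycleDefaults`, and `cycleShouldLoop` are hypothetical, and counting open feedback as a plain number is an assumption made for brevity.

```javascript
// Defaults as listed in this hunk; the helper names are hypothetical.
const CYCLE_DEFAULTS = {
  "always-human-appraise": false,
  "deadlock-human-appraise": true,
  "max-iterations": 3,
};

// Merge declared frontmatter over the defaults.
function withCycleDefaults(frontmatter) {
  return { ...CYCLE_DEFAULTS, ...frontmatter };
}

// Loop "until all feedback is resolved or max-iterations is hit".
// `iteration` is the number of forge iterations completed so far (assumption).
function cycleShouldLoop(cycle, iteration, openFeedbackCount) {
  return openFeedbackCount > 0 && iteration < cycle["max-iterations"];
}

const cycle = withCycleDefaults({ "output-type": "report", "max-iterations": 2 });
console.log(cycle["deadlock-human-appraise"]); // true
console.log(cycleShouldLoop(cycle, 1, 4));     // true
console.log(cycleShouldLoop(cycle, 2, 4));     // false
```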
@@ -35,9 +35,9 @@ The stage names come from the foundry metaphor because the system treats AI outp
 - **assay** — opt-in pre-forge stage that populates flow memory by running project-authored extractor scripts (iteration 0 only). No artefact, no feedback, no output beyond memory writes. See the [Assay](#assay) and [Extractor](#extractor) entries below.
 - **forge** — produce or revise the artefact.
-- **quench** — run deterministic CLI checks declared in laws (via their optional `validators:` blocks).
-- **appraise** — subjective evaluation by multiple appraiser sub-agents.
-- **human-appraise** — human quality gate. Can run every iteration, only on deadlock, or both.
+- **quench** — deterministic validation run inside the orchestrator against laws that contain validators.
+- **appraise** — orchestrator-managed `dispatch_multi` fan-out to appraiser sub-agents, with internal consolidation.
+- **human-appraise** — human quality gate. Runs every iteration when `always-human-appraise` is true, or on deadlock when `deadlock-human-appraise` is true.
 Feedback is always *about an artefact* and flows backward to forge. Assay sits outside the artefact-feedback loop because it precedes the artefact and its only failure mode (a broken extractor under `foundry/memory/extractors/`) lives outside forge's `file-patterns`.
@@ -63,31 +63,35 @@ See also: [Extractor](#extractor).
 A definition of what is being produced. Lives in `foundry/artefacts/<type>/`:
 - `definition.md` — identity, file patterns, output directory, appraiser config, prose description.
-- `laws.md` *(optional)* — type-specific subjective criteria, with optional validators for deterministic checks.
+- `laws.md` *(optional)* — type-specific criteria, with optional validators for deterministic checks.
 File patterns must not overlap with any other artefact type's patterns — the write-invariant enforcer needs to know which type owns a given file.
 ## Law
-A subjective pass/fail criterion. Two scopes:
+A rule or criterion that defines expectations for artefacts. Two scopes:
 - **Global** — `foundry/laws/*.md`, all files concatenated, applies to every artefact.
 - **Type-specific** — `foundry/artefacts/<type>/laws.md`.
-Each law is a `## heading` (its identifier, used in feedback tags as `law:<id>`) with a description, passing criteria, and failing criteria.
+Each law is a `## heading` (its identifier, used in feedback tags as `law:<id>`) with a description, passing criteria, and failing criteria. Laws may declare optional validators, which are deterministic scripts that verify the artefact programmatically.
+## Validator
+An optional deterministic script attached to a law. Declared in a law's `validators` field and run during the quench stage. Each validator produces feedback tagged `law:<law-id>:<validator-id>` when its check fails. Validators are the mechanism for automated, deterministic enforcement of law requirements.
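The two tag shapes mentioned here (`law:<id>` for a law, `law:<law-id>:<validator-id>` for a validator failure) could be built and parsed along these lines. The helper names are hypothetical, and the sketch assumes identifiers never contain a colon.

```javascript
// Hypothetical helpers; only the tag shapes come from the text above.
function lawTag(lawId) {
  return `law:${lawId}`;
}

function validatorTag(lawId, validatorId) {
  return `law:${lawId}:${validatorId}`;
}

// Returns { lawId, validatorId } or null when the tag is not a law tag.
// Assumes ids contain no ":" characters.
function parseLawTag(tag) {
  const [prefix, lawId, validatorId] = tag.split(":");
  if (prefix !== "law" || !lawId) return null;
  return { lawId, validatorId: validatorId ?? null };
}

console.log(validatorTag("no-broken-links", "link-check")); // law:no-broken-links:link-check
console.log(parseLawTag("law:no-broken-links"));            // { lawId: 'no-broken-links', validatorId: null }
console.log(parseLawTag("human"));                          // null
```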
 ## Appraiser
-An independent evaluator with a defined personality. Lives in `foundry/appraisers/*.md`. Appraisers may specify a `model` field to override the cycle-level appraise model. Each artefact type picks which appraisers may evaluate it (`appraisers.allowed`) and how many run per iteration (`appraisers.count`). Selection distributes evenly across allowed personalities.
+An evaluator that judges artefacts against laws through its personality or perspective. Lives in `foundry/appraisers/*.md`. Appraisers may specify a `model` field to override the cycle-level appraise model. Each artefact type picks which appraisers may evaluate it (`appraisers.allowed`) and how many run per iteration (`appraisers.count`). Selection distributes evenly across allowed personalities. Appraisers receive the artefact content and applicable laws; they do not receive validator metadata.
 ## WORK.md
-The transient shared state for a flow. Created on the work branch by the flow skill, it tracks:
+The transient shared state for a flow. Created on the work branch, it tracks:
 - Current position (flow, cycle, stage list, iteration limits) in frontmatter.
 - The goal (prose — written once).
-- An artefact registry (file, type, cycle, status).
-- Feedback state lives alongside it in `WORK.feedback.yaml`.
+Artefacts are discovered from branch changes against the current cycle's output-type file patterns, not stored as an artefact table in `WORK.md`. Feedback state lives alongside `WORK.md` in `WORK.feedback.yaml`.
 See [work-spec.md](work-spec.md) for the full spec.
@@ -100,12 +104,12 @@ Append-only log of every stage execution, sitting next to WORK.md. Used by sort
 Feedback items live in `WORK.feedback.yaml` — a yaml file at the worktree
 root, alongside `WORK.md`. Every item has a ULID, a source stage, and a
 full history of state transitions (open → actioned → resolved, or variants
-including wont-fix / rejected / deadlocked).
+including wont-fix / rejected).
 Plugins read and write feedback through the `foundry_feedback_*` tools;
-skills never edit the yaml directly. Sort-side detection of deadlocked
-items (per-item history depth) replaces the earlier global-iteration
-counter.
+skills never edit the yaml directly. Sort routing uses the iteration count
+and `max-iterations` / `deadlock-human-appraise` settings to detect
+deadlock, not per-item deadlock history.
 See `docs/work-spec.md` for the full schema and state machine.
@@ -113,8 +117,8 @@ See `docs/work-spec.md` for the full schema and state machine.
 Human-in-the-loop checkpoint. A stage where Foundry pauses and asks a human for input. Two triggers:
-1. **Every-iteration** — the cycle declares `human-appraise: true`. The `human-appraise` stage runs after LLM appraise each iteration.
-2. **Deadlock** — the cycle declares `deadlock-appraise: true` (default). If forge and appraisers ping-pong on the same items for `deadlock-iterations` (default 5) iterations, sort inserts a `human-appraise` stage to break the tie.
+1. **Every-iteration** — the cycle declares `always-human-appraise: true`. The `human-appraise` stage runs after LLM appraise each iteration.
+2. **Deadlock** — the cycle declares `deadlock-human-appraise: true` (default). If the iteration count reaches `max-iterations`, sort inserts a `human-appraise` stage to break the tie.
 Human feedback is tagged `human` and takes priority over LLM feedback on the same topic.
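The two triggers can be read as a small decision function. This is a sketch under stated assumptions, not the logic in `sort-routing.js`: the function name is hypothetical, and `iteration` is taken to be the count of completed iterations.

```javascript
// Hypothetical function; field names and defaults come from the text above.
// Returns "every-iteration", "deadlock", or null (no human gate).
function humanAppraiseTrigger(cycle, iteration) {
  if (cycle["always-human-appraise"] === true) return "every-iteration";

  const deadlockEnabled = cycle["deadlock-human-appraise"] ?? true; // default: true
  const maxIterations = cycle["max-iterations"] ?? 3;               // default: 3
  if (deadlockEnabled && iteration >= maxIterations) return "deadlock";

  return null;
}

console.log(humanAppraiseTrigger({ "always-human-appraise": true }, 1));    // every-iteration
console.log(humanAppraiseTrigger({}, 3));                                   // deadlock
console.log(humanAppraiseTrigger({ "deadlock-human-appraise": false }, 3)); // null
```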
@@ -194,6 +198,28 @@ The signed squash commit on the base branch references the archive
 branch tip SHA in its attestation block. Archive branches accumulate
 indefinitely — periodic manual pruning is outside the tool's scope.
+## Attestation
+A cryptographic claim that a work branch completed all required stages
+and all feedback was resolved. Created by `foundry_attest({ confirm: true })`,
+which writes and commits `ATTEST.md` at `HEAD` containing the cycle
+goal, a diff SHA-256 of the branch changes, and a canonical JSON payload
+with the flow contract, governance hashes, output artefact list, process
+log, and archive branch reference. Work-branch `foundry_git_finish`
+verifies the attestation before allowing the squash merge. Config and
+dry-run branches do not require attestation.
+## ATTEST.md
+The attestation document written by `foundry_attest` as the work-branch
+`HEAD` commit. Contains the goal prose, `diff-sha256` of the branch
+diff, and a signed attestation block between `-----BEGIN FOUNDRY
+ATTESTATION-----` and `-----END FOUNDRY ATTESTATION-----` delimiters.
+`foundry_git_finish` verifies `ATTEST.md` exists at `HEAD`, checks the
+`diff-sha256` matches the recomputed branch diff, uses the attestation
+payload in the signed final commit message, and preserves the archive
+branch reference.
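The `diff-sha256` check can be pictured as recomputing a digest over the branch diff and comparing it with the recorded value. The sketch below uses Node's built-in `crypto` module. The `diff-sha256: <hex>` line format and all function names are assumptions made for illustration; the real `ATTEST.md` layout may differ.

```javascript
const crypto = require("node:crypto");

// SHA-256 of the diff text, as lowercase hex.
function diffSha256(diffText) {
  return crypto.createHash("sha256").update(diffText, "utf8").digest("hex");
}

// Extract the recorded digest. The `diff-sha256: <hex>` line shape is an
// assumption, not the documented file layout.
function recordedDigest(attestMd) {
  const match = attestMd.match(/^diff-sha256:\s*([0-9a-f]{64})\s*$/m);
  return match ? match[1] : null;
}

// True only when a digest is recorded and matches the recomputed one.
function verifyAttestation(attestMd, branchDiff) {
  const recorded = recordedDigest(attestMd);
  return recorded !== null && recorded === diffSha256(branchDiff);
}

const diff = "--- a/report.md\n+++ b/report.md\n+hello\n";
const attest = `# Goal\nWrite the report.\n\ndiff-sha256: ${diffSha256(diff)}\n`;
console.log(verifyAttestation(attest, diff));            // true
console.log(verifyAttestation(attest, diff + "+more\n")); // false
```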
 ## Stage token
 A single-use HMAC-signed string, minted by `foundry_orchestrate` when a stage is dispatched. The sub-agent must redeem the token via `foundry_stage_begin`; mutation tools then check the active stage matches their role. Keys live in `.foundry/.secret` (mode 0600, gitignored, one per worktree). This prevents out-of-band mutations, replayed stages, and sub-agents skipping the lifecycle.
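A single-use HMAC-signed token can be sketched with Node's `crypto` module as follows. Everything about the token layout here (a base64url JSON body, a `.` separator, a hex HMAC-SHA256 signature, an in-memory redeemed set) is an assumption for illustration and is not taken from the Foundry implementation.

```javascript
const crypto = require("node:crypto");

// Hex HMAC-SHA256 of the token body under the worktree secret.
function sign(secret, body) {
  return crypto.createHmac("sha256", secret).update(body).digest("hex");
}

// Mint a token for a stage. The random nonce makes each token unique.
function mintToken(secret, stage) {
  const payload = JSON.stringify({ stage, nonce: crypto.randomUUID() });
  const body = Buffer.from(payload, "utf8").toString("base64url");
  return `${body}.${sign(secret, body)}`;
}

// Tokens already redeemed in this process (hypothetical single-use store).
const redeemed = new Set();

// Returns the payload on first valid redemption, otherwise null.
function redeemToken(secret, token) {
  const [body, mac] = token.split(".");
  if (!body || !mac) return null;

  const expected = Buffer.from(sign(secret, body), "hex");
  const given = Buffer.from(mac, "hex");
  // timingSafeEqual requires equal lengths, so check that first.
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) return null;

  if (redeemed.has(token)) return null; // replay
  redeemed.add(token);
  return JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
}

const secret = crypto.randomBytes(32);
const token = mintToken(secret, "forge");
console.log(redeemToken(secret, token).stage); // forge
console.log(redeemToken(secret, token));       // null
```

The second redemption returns `null`, which is the property that blocks replayed stages; a token signed with a different secret fails the signature check before the replay check is reached.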
@@ -267,6 +293,16 @@ All deterministic pipeline operations are exposed as custom tools by the Foundry
 A self-contained workflow written as markdown with YAML frontmatter. Foundry ships a user-facing `Foundry` guide agent plus skills for pipeline execution, authoring, maintenance, and memory administration. The guide agent is the normal interface for users; skills and tools provide the internal workflows it uses to initialise projects, create artefact types, define laws, configure appraisers, build cycles and flows, and run governed artefact generation. Skills are either **atomic** (do one thing) or **composite** (orchestrate other skills).
+## Foundry agent wizard
+The interactive configuration process that runs when the user first
+interacts with the Foundry agent. The wizard walks through four phases
+— Understand, Plan, Confirm, Build — asking one question at a time.
+Configuration files are created only after the user confirms the plan.
+The wizard eliminates hand-authoring of normal setup by using the
+structured config creation tools (`foundry_config_create_*`) on a
+`config/*` branch.
 ---
 ## Extractor